
Deploying Stateful Applications

Objectives

  • Configure and manage stateful applications deployed on Red Hat OpenShift.

Deploying Stateful Applications

Stateful applications often require persisting data to permanent storage. This enables applications to recover from crashes and reboots by loading the necessary state from storage. Some stateful applications, such as distributed stateful applications, might have additional networking requirements. For example, the Etcd distributed key-value store, which Kubernetes uses as its backing store, uses the Raft algorithm to maintain consistent state among multiple Etcd replicas. This means that each replica pod must have a predictable network address to join the Etcd quorum.

In contrast, a stateless application or service does not maintain any state that cannot be computed at request time. For example, an application that provides an API for calculating taxes from a set of formulas can respond directly from the request parameters.

Generally speaking, stateless services are easier to maintain as they cannot run into state-based bugs and are simpler to scale. As demand on the stateless services changes, developers can add or remove replicas as necessary without dealing with data allocation or migration.
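For example, assuming a stateless deployment with the hypothetical name my-app, scaling it up requires only a replica count change, with no data allocation or migration:

[user@host ~]$ oc scale deployment/my-app --replicas=5
deployment.apps/my-app scaled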

By isolating or delegating the stateful pieces, many large systems can consist of mostly stateless services. However, most applications need to store some state. For example, many applications delegate state persistence to a database server.

To address the storage needs of stateful applications and services, Kubernetes and Red Hat OpenShift provide the persistent volume subsystem.

Beyond Persisting a Database

Databases are not the only example of stateful applications. Many distributed computing tasks share state between servers.

For example, suppose you are building an application that uses a distributed cache. As usage increases, the system must scale by adding cache replicas to handle the load. Because the replicas share state, each new replica must be able to access that shared state, even though the state is not committed to a traditional database.

Persistent Volumes and Persistent Volume Claims

Containers provide only ephemeral storage by default. When an application stores data in the container file system, that data is lost when the container is re-created.

To persist data, such as that used by a database server, Kubernetes and Red Hat OpenShift provide the persistent volume subsystem.

Persistent Volumes

Persistent volumes (PVs) have a separate lifecycle from the pods that attach to them. These volumes provide provisionable storage within the cluster.

The following manifest defines an example persistent volume:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: data-volume 1
spec:
  capacity:
    storage: 5Gi 2
  volumeMode: Filesystem 3
  accessModes:
    - ReadWriteOnce 4
  persistentVolumeReclaimPolicy: Recycle 5
  storageClassName: fast 6
  nfs: 7
    path: /exports-ocp4/example
    server: 192.168.50.10

1

A name for the PV

2

The capacity of the volume

3

A volume mode of either Filesystem or Block

4

The available access modes

5

The reclaim policy

6

Which storage class the PV belongs to

7

The configuration for the underlying NFS mount

Because OpenShift manages persistent volumes at the cluster scope, cluster administrators typically create and manage PVs. Developers can reserve these persistent volumes by creating persistent volume claims.

Persistent Volume Claims

Persistent volume claims (PVCs) represent a request for a persistent volume. These requests can include requirements for the PV, such as the following attributes:

  • Amount of storage

  • Label selector

  • Volume mode

  • Access mode

  • Storage class

Note

A matched PV can have a higher capacity than the PVC requested, but not lower.

The following manifest defines an example persistent volume claim:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-volume-claim 1
spec:
  accessModes:
  - ReadWriteOnce 2
  resources:
    requests:
      storage: 1Gi 3
  volumeMode: Filesystem 4
  ...output omitted...

1

A name for the PVC

2

The requested access mode

3

The requested capacity

4

The requested volume mode
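The earlier list of claim attributes also mentions a label selector. The following sketch, in which the claim name and the tier: gold label are illustrative assumptions, shows how a PVC could restrict matching to PVs that carry a specific label:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: labeled-volume-claim
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  selector:
    matchLabels:
      tier: gold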

After you create a PVC, the cluster attempts to find a PV that satisfies the specified requirements. If the cluster does not find a match, then the PVC remains unbound, in the Pending state, until an administrator or a storage class provisions a satisfactory PV. After a match is found, OpenShift binds the PV to the PVC, and an application can use the PVC for persistent storage.
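You can inspect the binding status of claims and volumes with standard commands. For example, assuming the data-volume-claim PVC from the earlier example exists in the current project:

[user@host ~]$ oc get pvc data-volume-claim
...output omitted...
[user@host ~]$ oc get pv
...output omitted...

The STATUS column of the oc get pvc output reports Pending until the claim is bound.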

Static and Dynamic Provisioning

The preceding examples assume a static provisioning technique. In other words, administrators manually create and maintain the PVs.

More commonly, teams adopt a dynamic provisioning practice. With dynamic provisioning, developers create a PVC and the cluster automatically creates a new PV specifically for that claim.

To use dynamic provisioning, the administrator must enable it for the cluster and create storage classes. Additionally, if a default storage class is not defined, then the PVC must specify one to be dynamically provisioned.
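You can list the storage classes that are available in the cluster with the following command. In the command output, the default storage class, if one is defined, is marked with (default) next to its name:

[user@host ~]$ oc get storageclass
...output omitted...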

Storage Classes For Dynamic Provisioning

Persistent volumes can belong to storage classes (SCs) set up by an administrator. Storage classes indicate the type and quality of underlying storage hardware. For example, an administrator might create storage classes for different storage speeds called fast and slow. What different SCs mean is up to the administrator.

For dynamic provisioning, the SC specified by a PVC must itself specify a provisioner plug-in, such as the nfs-subdir-external-provisioner plug-in. A provisioner plug-in is responsible for allocating storage and creating the PV to which the PVC can then bind.

For example, the following manifest outlines an SC that specifies a provisioner:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
  name: nfs-storage
parameters:
  archiveOnDelete: "false"
provisioner: k8s-sigs.io/nfs-subdir-external-provisioner
reclaimPolicy: Delete
volumeBindingMode: Immediate

By specifying the preceding SC, the following example PVC uses dynamic provisioning to create the underlying PV:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dynamic-volume-claim
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: nfs-storage
  volumeMode: Filesystem
  ...output omitted...

Because it is dynamically provisioned, an administrator does not need to manually create the PV.
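To verify the result, you can describe the claim. The Volume field in the output shows the name of the dynamically provisioned PV, which the provisioner generates (typically a pvc- prefix followed by the claim's UID):

[user@host ~]$ oc describe pvc dynamic-volume-claim
...output omitted...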

Mounting Claims Within Pods

You can mount a PVC that is bound to a PV to a pod as a volume.

For example, the following definition of a pod mounts a PVC called data-volume-claim inside of a pod called app-pod:

apiVersion: v1
kind: Pod
metadata:
  name: app-pod
spec:
  containers:
    - name: app-ui
      image: quay.io/example/nginx
      volumeMounts:
      - mountPath: "/var/www/html" 1
        name: ui-volume 2
  volumes: 3
    - name: ui-volume
      persistentVolumeClaim:
        claimName: data-volume-claim 4

1

The mount path within the container

2

The name of the volume within the pod's volumes section to mount

3

The pod's volumes section

4

The name of the PVC to mount

Note that the pod cannot start until all attached PVCs are bound.
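If any claim is still unbound, the pod remains in the Pending state. You can inspect the pod events to confirm that it is waiting on a claim:

[user@host ~]$ oc describe pod app-pod
...output omitted...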

Adding Storage to Deployments

Like bare pods, deployments can include definitions to mount PVCs within the resulting pods. Often, such a PVC is shared amongst all the replica pods. This is because each pod is identical to all other replica pods that the deployment manages.

Because most software is not built with shared storage in mind, deployments are usually reserved for stateless applications. In particular, even when database servers support replication across multiple instances, each database server instance typically requires its own storage; replication and sharding are handled by the application itself.

The following is an example command to create and attach a PVC to an existing deployment called my-deployment:

[user@host ~]$ oc set volumes deploy/my-deployment \
--add \ 1
--name nfs-volume-storage \
--type pvc \
--claim-mode rwo \ 2
--claim-size 1Gi \ 3
--mount-path /tmp/data \ 4
--claim-name my-data-claim 5
deployment.apps/my-deployment volume updated

1

Create an entry within the volumes section

2

Use the ReadWriteOnce claim mode

3

Use 1Gi as the size

4

Mount the volume to /tmp/data within the containers

5

The name of the PVC

The preceding command updates the deployment to add the volumes and volumeMounts sections, as shown in the following excerpt:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-deployment
  ...output omitted...
spec:
  ...output omitted...
  template:
    ...output omitted...
    spec:
      containers:
      - image: registry.ocp4.example.com:8443/rhel8/mysql-80
        ports:
        - containerPort: 3306
          protocol: TCP
        ...output omitted...
        volumeMounts:
        - mountPath: /tmp/data
          name: nfs-volume-storage
      ...output omitted...
      volumes:
      - name: nfs-volume-storage
        persistentVolumeClaim:
          claimName: my-data-claim
...output omitted...
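You can review the volumes that a deployment currently defines by running the same command without the --add option:

[user@host ~]$ oc set volumes deploy/my-deployment
...output omitted...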

Ephemeral Storage

One use for volumes is mounting predefined data into one or more pods without having to rebuild the container image. In such cases, the application might not need to store any new data.

The following list includes example uses for ephemeral volumes:

  • Database initialization scripts

  • Server configuration files

  • Tokens and key files

Configuration maps and secrets can store file contents and can be mounted as a volume into a pod. In addition, you can create both configuration maps and secrets from local files.

Refer to previous sections for details on commands to create and mount configuration maps.
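For example, the following command, a minimal sketch that assumes a local init.sql script, creates a configuration map that holds the script contents:

[user@host ~]$ oc create configmap init-db-cm --from-file=init.sql
configmap/init-db-cm created

The resulting configuration map can then be mounted as a volume, as the init-db-volume volume in the stateful set example later in this section demonstrates.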

Stateful Sets

As their name suggests, stateful sets are intended for stateful applications. Unlike deployments, pods within stateful sets are guaranteed a predictable identifier (ID). For example, three replicas for the redis stateful set might use the redis-0, redis-1, and redis-2 pod names. This is useful for routing requests to each pod, or for discovering new pods that must join an existing cluster. In addition, each replica pod created by a stateful set has a dedicated PVC.

Besides addressing storage, stateful sets also provide a means to address stateful replicas individually over the network. A stateful application might require that each replica is addressable separately from the others. For example, clustered database server replicas often designate one as the primary and the others as secondaries. Similarly, the replicas might need to communicate with each of the others to establish a quorum.

The following manifest outlines an example stateful set that includes the template for creating the PVC for each pod.

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: my-stateful-app 1
spec:
  ...output omitted...
  replicas: 3
  serviceName: my-stateful-app 2
  template:
    ...output omitted...
    spec:
      containers:
        - image: my-app-image:latest
          name: mysql-80
          ports:
            - containerPort: 80
              protocol: TCP
          volumeMounts: 3
            - name: app-pvc
              mountPath: /var/lib/mysql
              subPath: mysql-db
      volumes: 4
        - name: init-db-volume
          configMap:
            name: init-db-cm
  volumeClaimTemplates:
    - metadata: 5
        name: app-pvc
      spec:
        accessModes: [ "ReadWriteOnce" ]
        storageClassName: nfs-storage
        resources:
          requests:
            storage: 1Gi

1

The name of the stateful set, which prefixes the individual pod names

2

The name of the associated headless service

3

Specify which volumes are attached to the container

4

Define volumes that are available to be attached to containers

5

Define a template for creating PVCs called app-pvc
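Each replica pod also receives its own claim, created from the volumeClaimTemplates entry. The claim name combines the template name and the pod name, so the preceding example produces claims such as app-pvc-my-stateful-app-0, app-pvc-my-stateful-app-1, and app-pvc-my-stateful-app-2:

[user@host ~]$ oc get pvc
...output omitted...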

The following manifest outlines the headless service referenced in the preceding stateful set:

apiVersion: v1
kind: Service
metadata:
  name: my-stateful-app
spec:
  clusterIP: None
  selector:
  ...output omitted...

Because the service defines clusterIP as None, it is a headless service and it will not be load balanced. However, the individual replicas receive individual IP addresses at the pod level. Consequently, you can route requests to a specific pod by using the pod name, such as my-stateful-app-0.my-stateful-app in the preceding example. This is often necessary in a stateful application as the replica pods are not interchangeable.
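For example, from another pod in the same project, the first replica is reachable at the short DNS name shown in the first of the following lines. The fully qualified form appends the project name and the cluster DNS domain, which is typically svc.cluster.local:

my-stateful-app-0.my-stateful-app
my-stateful-app-0.my-stateful-app.<project>.svc.cluster.local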
