Lab: Backup, Restore, and Migration of Applications with OADP

Deploy the OADP operator and describe its features and dependencies.

Configure one-time and scheduled backups with OADP and restore from them.

Outcomes

  • Configure the OpenShift API for Data Protection operator to back up OpenShift applications.

  • Schedule a complete backup of an application.

  • Create and restore an application-consistent backup by using backup hooks.

  • Clean up previous backups.

As the student user on the workstation machine, use the lab command to prepare your environment for this exercise, and to ensure that all required resources are available.

[student@workstation ~]$ lab start backup-review

Instructions

Your team deployed a wiki application on OpenShift to host its operational documentation. You are asked to configure a scheduled daily backup by using OADP.

The application is deployed in the wiki project and is accessible at the following URL: https://mediawiki-wiki.apps.ocp4.example.com. The application uses the MediaWiki software and a PostgreSQL database with two volumes: uploaded images and documents are stored on a CephFS volume, and the database uses a Ceph RBD volume.

A colleague started installing OADP and performed the following tasks:

  • Install the required operators for OADP.

  • Create the S3 bucket and configure the s3cmd command with the S3 credentials.

  • Create partial OADP configuration files in the ~/DO380/labs/backup-review/ path with the S3 information and credentials.

Review and complete the configuration of OADP. Ensure that all storage classes that the application is using can be backed up with CSI snapshots. Configure OADP to store volume backups on the S3 object storage.

Configure a scheduled daily backup of the wiki project. The backup must start at 11 PM every day. To ensure backup consistency, you must lock the application to prevent any writes during the backup, and prepare the database for a snapshot before the backup by using backup hooks.

All Kubernetes resources required for the application have the app.kubernetes.io/part-of=mediawiki label. Resources specific to the MediaWiki component have the app.kubernetes.io/name=mediawiki label. Resources specific to the PostgreSQL database have the app.kubernetes.io/name=postgresql label. The backup must include only the resources with those labels.

To lock the MediaWiki application in read-only mode, create a /data/images/lock_yBgMBwiR file in the mediawiki container. This lock file must contain a single line of text with the reason for the lock, which is displayed to the users in the application. To unlock the application after the backup is complete, remove the lock file.

Note

The /data/images path is in the mediawiki volume. The lock file is therefore included in the backup.
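
The lock file protocol described above can be tried outside the cluster. The following is a minimal sketch that uses a temporary directory as a stand-in for the /data/images path. The lock_wiki and unlock_wiki function names are hypothetical and are not part of MediaWiki or OADP.

```shell
# Hypothetical helpers that mimic the lock file protocol.
# The first argument is a stand-in for the /data/images directory.
lock_wiki() {
  # Write the reason as a single line of text in the lock file.
  printf '%s\n' "$2" > "$1/lock_yBgMBwiR"
}

unlock_wiki() {
  # Removing the lock file unlocks the application.
  rm -f "$1/lock_yBgMBwiR"
}

images_dir=$(mktemp -d)          # assumption: local stand-in for /data/images
lock_wiki "$images_dir" "backup in progress"
cat "$images_dir/lock_yBgMBwiR"  # shows the reason stored in the lock file
unlock_wiki "$images_dir"
```

In the lab, the equivalent commands run inside the mediawiki container through backup and restore hooks rather than on the workstation.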

To prepare the PostgreSQL database for a snapshot and to flush all in-memory transactions to the disk, use the psql -c "CHECKPOINT;" command in the postgresql container.

To prepare for an upcoming update of the software stack, you are asked to restore an exact copy of the wiki application to a new wiki-staging project, so your team can test the update in a staging environment first.

Trigger an immediate backup from the scheduled backup and restore it to the wiki-staging project. Ensure that the application is unlocked after the restore by deleting the MediaWiki lock file with a restore hook.

To distinguish the production environment from the staging environment, change the MediaWiki site name to DO380 Team Wiki Staging. This setting is controlled by the MEDIAWIKI_SITE_NAME environment variable on the mediawiki deployment.

Configure the MediaWiki URL on the mediawiki deployment with the MEDIAWIKI_SITE_SERVER environment variable to match the route URL in the wiki-staging project.

Finally, remove all manual backup and restore resources from OpenShift and from the object storage.

Note

The S3 object storage in the lab environment uses a custom certificate that is signed with the OpenShift service CA. If you use the velero describe --details or velero logs commands, you must specify the CA certificate with the --cacert parameter.

The CA certificate for the OpenShift service CA is available from the velero pod in the /run/secrets/kubernetes.io/serviceaccount/service-ca.crt path.

  1. Complete the configuration files in the ~/DO380/labs/backup-review/ path and configure OADP with the following requirements:

    • Ensure that OADP can back up volumes with CSI snapshots.

    • Volume backups must be stored in the S3 bucket.

    • S3 credentials are in the credentials-velero configuration file.

    • Partial OADP configuration with S3 information is in the oadp-config.yml file.

    1. Connect to the OpenShift cluster as the admin user with redhatocp as the password.

      [student@workstation ~]$ oc login -u admin -p redhatocp \
        https://api.ocp4.example.com:6443
      Login successful.
      ...output omitted...
    2. Change to the ~/DO380/labs/backup-review directory.

      [student@workstation ~]$ cd ~/DO380/labs/backup-review
    3. Create the cloud-credentials secret in the openshift-adp namespace with the credentials-velero file content.

      [student@workstation backup-review]$ oc create secret generic \
        cloud-credentials -n openshift-adp --from-file cloud=credentials-velero
      secret/cloud-credentials created
    4. Create the Restic encryption key for OADP Data Mover and store it in the restic-enc-key secret in the openshift-adp namespace.

      Use the openssl command to generate a random password to use as the encryption key.

      [student@workstation backup-review]$ oc create secret generic \
        restic-enc-key \
        -n openshift-adp \
        --from-literal=RESTIC_PASSWORD=$(openssl rand -base64 24)
      secret/restic-enc-key created
    5. Complete the OADP configuration in the oadp-config.yml file to enable CSI snapshots and Data Mover.

      apiVersion: oadp.openshift.io/v1alpha1
      kind: DataProtectionApplication
      metadata:
        name: oadp-config
        namespace: openshift-adp
      spec:
      ...output omitted...
        configuration:
          velero:
            defaultPlugins:
            - openshift
            - aws
            - csi
            - vsm
        features:
          dataMover:
            enable: true
            credentialName: restic-enc-key
    6. Apply the OADP configuration by using the oadp-config.yml file.

      [student@workstation backup-review]$ oc apply -f oadp-config.yml
      dataprotectionapplication.oadp.openshift.io/oadp-config created
    7. Verify that the backupStorageLocation object is created and in the Available phase.

      [student@workstation backup-review]$ oc get -n openshift-adp \
        backupstoragelocations
      NAME            PHASE       LAST VALIDATED   AGE     DEFAULT
      oadp-config-1   Available   18s              2m51s   true
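
The phase can take a short time to change after the configuration is applied, so the previous check might need to be repeated. The following is a minimal polling sketch. The wait_for_value name is a hypothetical helper, and the commented usage assumes that the status.phase field of the oadp-config-1 object holds the value that the PHASE column shows.

```shell
# Hypothetical helper: run a command repeatedly until its output equals
# the expected value, or give up after a number of attempts.
# Arguments: expected value, attempts, delay in seconds, then the command.
wait_for_value() {
  expected=$1
  attempts=$2
  delay=$3
  shift 3
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if [ "$("$@" 2>/dev/null)" = "$expected" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Possible use in the lab (assumes an active oc login session):
# wait_for_value Available 30 10 \
#   oc -n openshift-adp get backupstoragelocations oadp-config-1 \
#   -o jsonpath='{.status.phase}'
```

The function returns a nonzero status on timeout, so a script can stop instead of continuing with an unavailable storage location.
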
  2. Identify the storage classes that the application uses. Ensure that the application volumes can be backed up by using CSI snapshots. Configure the matching volume snapshot classes for OADP.

    1. List the persistent volume claims in the wiki namespace and identify the storage classes.

      [student@workstation backup-review]$ oc -n wiki get pvc
      NAME                      ...   STORAGECLASS                           AGE
      mediawiki                 ...   ocs-external-storagecluster-cephfs     5m45s
      postgresql-postgresql-0   ...   ocs-external-storagecluster-ceph-rbd   5m43s
    2. Identify the provisioner for each storage class from the previous step.

      [student@workstation backup-review]$ oc get storageclasses \
        ocs-external-storagecluster-ceph-rbd ocs-external-storagecluster-cephfs
      NAME                                  PROVISIONER                           ...
      ocs-external-storagecluster-ceph-rbd  openshift-storage.rbd.csi.ceph.com    ...
      ocs-external-storagecluster-cephfs    openshift-storage.cephfs.csi.ceph.com ...
    3. List the available volume snapshot classes. Ensure that each storage class from the previous step has a matching volume snapshot class with the same driver.

      [student@workstation backup-review]$ oc get volumesnapshotclasses
      NAME                             DRIVER
      ocs-...-cephfsplugin-snapclass   openshift-storage.cephfs.csi.ceph.com
      ocs-...-rbdplugin-snapclass      openshift-storage.rbd.csi.ceph.com
    4. For each volume snapshot class, set the deletionPolicy attribute to Retain.

      [student@workstation backup-review]$ for class in \
        $(oc get volumesnapshotclass -oname); do
        oc patch $class --type=merge -p '{"deletionPolicy": "Retain"}'
        done
      .../ocs-external-storagecluster-cephfsplugin-snapclass patched
      .../ocs-external-storagecluster-rbdplugin-snapclass patched
    5. For each volume snapshot class, set the velero.io/csi-volumesnapshot-class label to true.

      [student@workstation backup-review]$ oc label volumesnapshotclass \
        velero.io/csi-volumesnapshot-class="true" --all
      .../ocs-external-storagecluster-cephfsplugin-snapclass labeled
      .../ocs-external-storagecluster-rbdplugin-snapclass labeled
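
The comparison between storage class provisioners and volume snapshot class drivers in this step can also be scripted. The following sketch is one possible approach. The unmatched_provisioners name is hypothetical, and the sample values are copied from the command output in this step.

```shell
# Hypothetical helper: print every provisioner from the first argument
# (one per line) that has no identical driver in the second argument.
unmatched_provisioners() {
  printf '%s\n' "$1" | while IFS= read -r prov; do
    [ -z "$prov" ] && continue
    printf '%s\n' "$2" | grep -qxF -- "$prov" || printf '%s\n' "$prov"
  done
}

provisioners='openshift-storage.rbd.csi.ceph.com
openshift-storage.cephfs.csi.ceph.com'
drivers='openshift-storage.cephfs.csi.ceph.com
openshift-storage.rbd.csi.ceph.com'

# Prints nothing, because every provisioner has a matching driver.
unmatched_provisioners "$provisioners" "$drivers"
```

On a live cluster, the two lists could come from oc get commands with a jsonpath output, for example the .provisioner field of the storage classes and the .driver field of the volume snapshot classes.
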
  3. Create a scheduled daily backup of the wiki project. The backup must start at 11 PM every day. Use backup hooks to prepare the application for backup by using the provided commands on both the MediaWiki and PostgreSQL pods. Ensure that both CephFS and RBD volumes are backed up.

    For this exercise, disable the schedule with the paused field to prevent unexpected backups from starting during the hands-on activity. You manually trigger a backup from this schedule in a later step.

    Only the resources that the application uses or that OADP requires should be in the backup. Use the application labels to filter the resources to back up. The application uses the following resource types:

    • Deployments

    • Stateful Sets

    • Secrets

    • Services

    • Routes

    • Persistent Volumes

    You can use the partial resource definition files in the ~/DO380/labs/backup-review path.

    1. Modify the partial resource definition in the ~/DO380/labs/backup-review/schedule.yml file as follows:

      apiVersion: velero.io/v1
      kind: Schedule
      metadata:
        name: wiki-backup
        namespace: openshift-adp
      spec:
        schedule: "0 23 * * *"
        paused: true
        template:
          includedNamespaces:
          - wiki
          orLabelSelectors:
          - matchLabels:
              app.kubernetes.io/part-of: mediawiki
          - matchLabels:
              kubernetes.io/metadata.name: wiki
          includedResources:
          - deployments
          - statefulsets
          - secrets
          - pvc
          - pv
          - services
          - routes
          - pods
          - namespace
          hooks:
            resources:
            - name: mediawiki-lock
              labelSelector:
                matchLabels:
                  app.kubernetes.io/name: mediawiki
              pre:
              - exec:
                  container: mediawiki
                  command:
                  - /bin/bash
                  - -c
                  - echo "backup in progress" > /data/images/lock_yBgMBwiR;
              post:
              - exec:
                  container: mediawiki
                  command:
                  - rm
                  - /data/images/lock_yBgMBwiR
            - name: postgresql-checkpoint
              labelSelector:
                matchLabels:
                  app.kubernetes.io/name: postgresql
              pre:
              - exec:
                  container: postgresql
                  command:
                  - psql
                  - -c
                  - "CHECKPOINT;"
    2. Apply the configuration for the schedule resource.

      [student@workstation backup-review]$ oc apply -f schedule.yml
      schedule.velero.io/wiki-backup created
    3. Create an alias to access the velero binary from the Velero deployment in the openshift-adp namespace.

      [student@workstation backup-review]$ alias velero='\
        oc -n openshift-adp exec deployment/velero -c velero -it -- ./velero'
    4. Verify the status of the schedule with the velero command.

      [student@workstation backup-review]$ velero get schedule
      NAME          STATUS    CREATED                         SCHEDULE     ...
      wiki-backup   New       2023-11-17 14:37:39 +0000 UTC   0 23 * * *   ...
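
The schedule field in this step uses the five-field cron format. The following sketch labels each field of the expression to show why it matches the 11 PM daily requirement. The describe_cron name is a hypothetical helper and is not a Velero command.

```shell
# Hypothetical helper: label the five fields of a cron expression.
describe_cron() {
  # read splits the expression on whitespace; the here-document keeps the
  # shell from expanding the * fields as file name patterns.
  read -r minute hour day_of_month month day_of_week <<EOF
$1
EOF
  printf 'minute=%s hour=%s day-of-month=%s month=%s day-of-week=%s\n' \
    "$minute" "$hour" "$day_of_month" "$month" "$day_of_week"
}

describe_cron "0 23 * * *"
# prints: minute=0 hour=23 day-of-month=* month=* day-of-week=*
```

Minute 0 of hour 23, with every other field left as a wildcard, means that the schedule fires once a day at 23:00.
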
  4. Trigger an immediate backup from the scheduled backup, and restore it to the wiki-staging project. Use restore hooks to unlock the application.

    The backup and restore should be completed without any errors or warnings.

    1. Use the velero command to start a backup by using the schedule definition from the previous step. Note the name of the backup that the command creates, to use in the next step.

      [student@workstation backup-review]$ velero backup create \
        --from-schedule wiki-backup
      ...output omitted...
      Creating backup from schedule, all other filters are ignored.
      Backup request "wiki-backup-20231115113447" submitted successfully.
      Run velero backup describe wiki-backup-20231115113447 or velero backup logs wiki-backup-20231115113447 for more details.

      Note

      The S3 object storage that is configured in the lab environment uses a custom certificate that is signed with the OpenShift service CA. You must add the CA certificate to the velero backup describe --details and velero backup logs commands as follows:

      [student@workstation backup-review]$ velero backup logs \
        --cacert=/run/secrets/kubernetes.io/serviceaccount/service-ca.crt \
        wiki-backup-20231115113447
    2. Monitor the status of the backup and verify that the backup ends with the Completed status. The backup process takes several minutes.

      [student@workstation backup-review]$ velero get backup
      NAME                         STATUS            ERRORS   WARNINGS   ...
      wiki-backup-20231115113447   Completed         0        0          ...
    3. Create the restore resource in the openshift-adp namespace to restore the wiki project to a new wiki-staging project. Use the backup name from the previous step. Configure a post-restore hook to remove the MediaWiki lock file from the mediawiki volume.

      You can use the incomplete restore.yml file as the starting point.

      apiVersion: velero.io/v1
      kind: Restore
      metadata:
        name: wiki-staging
        namespace: openshift-adp
      spec:
        backupName: wiki-backup-20231115113447
        namespaceMapping:
          wiki: wiki-staging
        hooks:
          resources:
          - name: mediawiki-unlock
            labelSelector:
              matchLabels:
                app.kubernetes.io/name: mediawiki
            postHooks:
            - exec:
                container: mediawiki
                command:
                - rm
                - /data/images/lock_yBgMBwiR
      [student@workstation backup-review]$ oc apply -f restore.yml
      restore.velero.io/wiki-staging created
    4. Use the velero command to get the status of the restore. Monitor the output to verify that the restore status is Completed. The restore process takes several minutes.

      [student@workstation backup-review]$ velero get restore
      NAME          BACKUP                       STATUS      ...  ERRORS   WARNINGS
      wiki-staging  wiki-backup-20231115113447   Completed   ...  0        0
    5. Review the restored resources in the wiki-staging project.

      [student@workstation backup-review]$ oc -n wiki-staging get all
      NAME                             READY   STATUS    RESTARTS   AGE
      pod/mediawiki-77dfffc8bd-nlc74   1/1     Running   0          67s
      pod/postgresql-0                 1/1     Running   0          66s
      
      NAME                 TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
      service/mediawiki    ClusterIP   172.30.86.62     <none>        8080/TCP   67s
      service/postgresql   ClusterIP   172.30.216.157   <none>        5432/TCP   67s
      
      NAME                        READY   UP-TO-DATE   AVAILABLE   AGE
      deployment.apps/mediawiki   1/1     1            1           66s
      
      NAME                                   DESIRED   CURRENT   READY   AGE
      replicaset.apps/mediawiki-77dfffc8bd   1         1         1       66s
      
      NAME                          READY   AGE
      statefulset.apps/postgresql   1/1     66s
      
      NAME                               HOST/PORT
      route.route.openshift.io/mediawiki mediawiki-wiki-staging.apps.ocp4.example.com
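
The restore in this step depends on the exact backup name that the backup creation command reports. The following sketch shows one way to extract that name from the command output. The backup_name_from_output name is a hypothetical helper, and the sample lines are copied from the output shown earlier in this step.

```shell
# Hypothetical helper: read the backup creation output on standard input
# and print only the backup name from the "Backup request" line.
backup_name_from_output() {
  sed -n 's/^Backup request "\([^"]*\)" submitted successfully.*/\1/p'
}

printf '%s\n' \
  'Creating backup from schedule, all other filters are ignored.' \
  'Backup request "wiki-backup-20231115113447" submitted successfully.' |
  backup_name_from_output
# prints: wiki-backup-20231115113447
```

The extracted value is the one to use in the backupName field of the restore resource.
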
  5. Change the MediaWiki site name and site URL in the wiki-staging project.

    1. Review the mediawiki route URL in the wiki-staging project.

      [student@workstation backup-review]$ oc -n wiki-staging get route
      NAME        HOST/PORT                                      PATH   SERVICES
      mediawiki   mediawiki-wiki-staging.apps.ocp4.example.com          mediawiki
    2. Set the MEDIAWIKI_SITE_SERVER environment variable on the mediawiki deployment to match the route URL from the previous step. Set the MEDIAWIKI_SITE_NAME environment variable to DO380 Team Wiki Staging.

      [student@workstation backup-review]$ oc -n wiki-staging \
        set env deployment/mediawiki \
        MEDIAWIKI_SITE_SERVER=https://mediawiki-wiki-staging.apps.ocp4.example.com \
        MEDIAWIKI_SITE_NAME="DO380 Team Wiki Staging"
      deployment.apps/mediawiki updated
    3. Verify that the mediawiki deployment is ready.

      [student@workstation backup-review]$ oc get deployment -n wiki-staging
      NAME        READY   UP-TO-DATE   AVAILABLE   AGE
      mediawiki   1/1     1            1           14m
    4. Open a web browser and navigate to https://mediawiki-wiki-staging.apps.ocp4.example.com. Verify the MediaWiki site name in the browser window and that you can edit any page.
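
The site URL in this step is the route host with an https:// prefix. The following sketch builds that value. The site_server_url name is a hypothetical helper, and the commented usage assumes that the route host is available in the .spec.host field of the mediawiki route.

```shell
# Hypothetical helper: build the MEDIAWIKI_SITE_SERVER value from a route host.
site_server_url() {
  # The route host has no scheme; the lab accesses the wiki over HTTPS.
  printf 'https://%s\n' "$1"
}

site_server_url mediawiki-wiki-staging.apps.ocp4.example.com
# prints: https://mediawiki-wiki-staging.apps.ocp4.example.com

# Possible use in the lab (assumes an active oc login session):
# host=$(oc -n wiki-staging get route mediawiki -o jsonpath='{.spec.host}')
# oc -n wiki-staging set env deployment/mediawiki \
#   MEDIAWIKI_SITE_SERVER="$(site_server_url "$host")"
```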

  6. Remove the manual backup and restore resources from the previous steps.

    1. Use the velero command to delete the backup. Use the backup name from the previous step.

      [student@workstation backup-review]$ velero delete \
        backup wiki-backup-20231115113447
      Are you sure you want to continue (Y/N)? y
      Request to delete backup "wiki-backup-20231115113447" submitted successfully.
      The backup will be fully deleted after all associated data (disk snapshots, backup files, restores) are removed.
    2. Delete the volume snapshots that are associated with the previous restore. Use the restore name from the previous step.

      [student@workstation backup-review]$ oc delete \
        VolumeSnapshotContent,VolumeSnapshot -A \
        -l velero.io/restore-name=wiki-staging
      volumesnapshotcontent.snapshot...  "velero-velero-mediawiki-..." deleted
      volumesnapshotcontent.snapshot...  "velero-velero-postgresql-..." deleted
      volumesnapshot.snapshot.storage... "velero-mediawiki-..." deleted
      volumesnapshot.snapshot.storage... "velero-postgresql-..." deleted
    3. Review the content of the S3 bucket and identify the volume snapshots from the previous backup.

      [student@workstation backup-review]$ s3cmd la -r
      ...output omitted...
      ... s3://backup-.../openshift-adp/wiki-backup-.../snapcontent-...-pvc/config
      ... s3://backup-.../openshift-adp/wiki-backup-.../snapcontent-...-pvc/data/...
      ...output omitted...
    4. Use the s3cmd command to delete the volume snapshots from the previous step.

      [student@workstation backup-review]$ s3cmd rm -r \
        s3://backup-add...3ef/openshift-adp/wiki-backup-20231115113447/
      delete: 's3://backup-.../openshift-adp/wiki-backup-.../snapcontent-.../config'
      delete: 's3://backup-.../openshift-adp/wiki-backup-.../snapcontent-.../data/...'
      ...output omitted...
    5. Change to the home directory.

      [student@workstation backup-review]$ cd
      [student@workstation ~]$
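
To confirm the cleanup in the last step, the bucket listing can be filtered for objects that still belong to the deleted backup. The following is a minimal sketch. The objects_for_backup name is a hypothetical helper, and the example-bucket name and snap path segment are placeholders that do not match the lab bucket.

```shell
# Hypothetical helper: keep only the listing lines whose object path
# contains the directory of the given backup.
objects_for_backup() {
  grep -F -- "/openshift-adp/$1/"
}

# Placeholder listing; the bucket name and paths are for illustration only.
printf '%s\n' \
  's3://example-bucket/openshift-adp/wiki-backup-20231115113447/snap/config' \
  's3://example-bucket/openshift-adp/other-backup/snap/config' |
  objects_for_backup wiki-backup-20231115113447
# prints only the first line
```

When the output of the s3cmd la -r command is piped through this filter after the cleanup, no output and a nonzero exit status indicate that no objects for that backup remain.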

Evaluation

As the student user on the workstation machine, use the lab command to grade your work. Correct any reported failures and rerun the command until successful.

[student@workstation ~]$ lab grade backup-review

Finish

As the student user on the workstation machine, use the lab command to complete this exercise. This step is important to ensure that resources from previous exercises do not impact upcoming exercises.

[student@workstation ~]$ lab finish backup-review

Revision: do380-4.14-397a507