Guided Exercise: Monitoring Application Health

Monitor the health of applications deployed on Red Hat OpenShift by using readiness and liveness probes.

Outcomes

  • Define your application's memory requests and limits to work within an OpenShift project's quotas and limits.

  • Troubleshoot common application resource planning errors in OpenShift.

  • Use the web console to create health checks for your application.

As the student user on the workstation machine, use the lab command to prepare your environment for this exercise, and to ensure that all required resources are available.

[student@workstation ~]$ lab start deployments-health

Instructions

You are asked to fix the application.yaml manifest to deploy the expense application to the cluster. The expense application is expected to run with the Guaranteed quality of service (QoS) class to preserve node stability when the node runs low on memory. To achieve this, the memory request and limit must have the same value.
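
For reference, a minimal sketch of a resources block that yields the Guaranteed QoS class follows. The 128Mi value is only illustrative; the exercise sets the actual sizes in later steps.

# Illustrative sketch: an equal request and limit place the container in the
# Guaranteed QoS class. The 128Mi value is an example, not an exercise value.
resources:
  requests:
    memory: "128Mi"
  limits:
    memory: "128Mi"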

  1. Log in to Red Hat OpenShift.

    1. Log in to OpenShift as the developer user.

      [student@workstation ~]$ oc login -u developer -p developer \
      https://api.ocp4.example.com:6443
      Login successful.
      ...output omitted...
    2. Ensure that you use the deployments-health project.

      [student@workstation ~]$ oc project deployments-health
      Already on project "deployments-health" on server "https://api.ocp4.example.com:6443".
  2. Deploy the application to the cluster.

    1. Change to the exercise directory.

      [student@workstation ~]$ cd ~/DO288/labs/deployments-health
    2. Deploy the application.

      [student@workstation deployments-health]$ oc apply -f application.yaml
      deployment.apps/expense created
      service/expense created
      route.route.openshift.io/expense created
    3. Verify that the application does not start. Use the oc get po command with the -w option, which watches the pods and prints status changes as they occur. Press Ctrl+C to stop the command.

      [student@workstation deployments-health]$ oc get po -w
      NAME                       READY   STATUS        RESTARTS     AGE
      expense-6cb66778c6-7s5bs   1/1     Running       1 (2s ago)   4s
      expense-84b5674967-6g5xz   1/1     Terminating   0            7m7s
      expense-84b5674967-6g5xz   0/1     Terminating   0            7m7s
      expense-6cb66778c6-7s5bs   0/1     OOMKilled     1 (2s ago)   4s
      expense-84b5674967-6g5xz   0/1     Terminating   0            7m7s
      expense-84b5674967-6g5xz   0/1     Terminating   0            7m7s
      expense-6cb66778c6-7s5bs   0/1     CrashLoopBackOff   1 (2s ago)   5s
      expense-6cb66778c6-7s5bs   1/1     Running            2 (17s ago)   20s
      expense-6cb66778c6-7s5bs   0/1     OOMKilled          2 (18s ago)   21s

      OpenShift terminates the application with the OOMKilled status because the application consumes more memory than the pod's memory limit allows.
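
      You can also confirm the termination reason from the pod status. The pod name comes from the preceding output and differs in your environment; the command typically prints OOMKilled after the container has restarted at least once.

      [student@workstation deployments-health]$ oc get pod expense-6cb66778c6-7s5bs \
      -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'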

  3. Edit the application.yaml file to increase the memory limit of the Deployment resource. Also, set the memory request to the same value to maintain the Guaranteed QoS class.

    1. Update the memory request and limit of the deployment to 400Mi to give the expense application the memory that it needs to run.

      ...output omitted...
              resources:
                requests:
                  memory: "400Mi"
                limits:
                  memory: "400Mi"
      ...output omitted...
    2. Apply the changes to the cluster.

      [student@workstation deployments-health]$ oc apply -f application.yaml
      deployment.apps/expense configured
      service/expense unchanged
      route.route.openshift.io/expense unchanged
    3. Verify that the new deployment does not create new pods.

      [student@workstation deployments-health]$ oc get po
      NAME                       READY   STATUS             RESTARTS        AGE
      expense-6cb66778c6-7s5bs   0/1     CrashLoopBackOff   6 (2m14s ago)   8m10s

      Note that the 6cb66778c6 identifier belongs to the previous deployment.

  4. Examine the root cause of the application failure and fix it.

    1. Get the project warning events sorted by creation timestamp.

      [student@workstation deployments-health]$ oc get events \
      --sort-by=metadata.creationTimestamp --field-selector type=Warning
      ...output omitted...
      99s  Warning   FailedCreate   replicaset/expense-54d47cbd7b   Error creating: pods "expense-54d47cbd7b-zwclt" is forbidden: maximum memory usage per Container is 360Mi, but limit is 400Mi
      ...output omitted...

      The deployment fails because the deployments-health project enforces a maximum memory limit of 360Mi per container, and you set the limit to 400Mi.
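
      The 360Mi ceiling comes from a LimitRange resource in the project. You can inspect it and look for the maximum memory per container, for example:

      [student@workstation deployments-health]$ oc describe limitrange -n deployments-health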

    2. Update the application.yaml file to a more moderate 200Mi memory request and limit.

      ...output omitted...
              resources:
                requests:
                  memory: "200Mi"
                limits:
                  memory: "200Mi"
      ...output omitted...
    3. Apply the changes to the cluster.

      [student@workstation deployments-health]$ oc apply -f application.yaml
      deployment.apps/expense configured
      service/expense unchanged
      route.route.openshift.io/expense unchanged
    4. Verify that the pod runs.

      [student@workstation deployments-health]$ oc get po
      NAME                      READY   STATUS    RESTARTS   AGE
      expense-7879cffc6-zttgt   1/1     Running   0          6s
  5. Increase the expense deployment to 3 replicas. Fix the issues that you find and ensure that the application runs.

    1. Scale the application to 3 replicas by using the oc scale command.

      [student@workstation deployments-health]$ oc scale --replicas=3 deployment/expense
      deployment.apps/expense scaled
    2. Verify that the deployment stops at two replicas and does not create the third one.

      [student@workstation deployments-health]$ oc get deployment expense
      NAME      READY   UP-TO-DATE   AVAILABLE   AGE
      expense   2/3     2            2           3m27s
    3. Get the project events to find the issue.

      [student@workstation deployments-health]$ oc get events \
      --sort-by=metadata.creationTimestamp --field-selector type=Warning
      ...output omitted...
      100s        Warning   FailedCreate   replicaset/expense-69dc94d576   Error creating: pods "expense-69dc94d576-74k9z" is forbidden: exceeded quota: deployments-health, requested: requests.memory=200Mi, used: requests.memory=400Mi, limited: requests.memory=570Mi

      OpenShift cannot allocate the third replica because doing so would require 600Mi of memory requests in total, and the deployments-health project has a 570Mi maximum.
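
      The 570Mi ceiling is enforced by a ResourceQuota resource in the project: three replicas at 200Mi request 600Mi in total, which exceeds the quota, whereas three replicas at 160Mi request only 480Mi. You can review the quota and its current usage, for example:

      [student@workstation deployments-health]$ oc describe quota -n deployments-health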

    4. Update the application.yaml manifest to define 3 replicas, and set the memory request and limit to 160Mi.

      ...output omitted...
        name: expense
      spec:
        replicas: 3
        selector:
          matchLabels:
            deployment: expense
      ...output omitted...
              resources:
                requests:
                  memory: "160Mi"
                limits:
                  memory: "160Mi"
      ...output omitted...
    5. Apply the changes to the application deployment.

      [student@workstation deployments-health]$ oc apply -f application.yaml
      deployment.apps/expense configured
      service/expense unchanged
      route.route.openshift.io/expense unchanged
    6. Verify that OpenShift deploys all the replicas.

      [student@workstation deployments-health]$ oc get deployment expense
      NAME      READY   UP-TO-DATE   AVAILABLE   AGE
      expense   3/3     1            3            5m
    7. Use the curl command to test the application.

      [student@workstation deployments-health]$ curl -s \
      expense-deployments-health.apps.ocp4.example.com/expenses | jq
      [
        {
          "uuid": "71301a52-696a-439a-8003-cfb6e1364b5e",
          "name": "OpenShift for Developers, Second Edition",
      ...output omitted...
  6. Add the health checks for the application by using the web console.

    The application exposes the /q/health/live and /q/health/ready endpoints for the liveness and readiness probes.

    1. Scale the expense application in the deployments-health project to 2 replicas.

      [student@workstation deployments-health]$ oc scale --replicas=2 deployment/expense
      deployment.apps/expense scaled

      For each redeployment, the rolling strategy first creates a pod with the new version before removing a pod that runs the previous version. Therefore, the cluster temporarily needs resources for one extra pod over the current replica count. With the 570Mi request quota and 160Mi per pod, two replicas plus one extra pod fit within the quota, whereas three replicas plus one extra pod would not.

    2. Open a web browser and navigate to https://console-openshift-console.apps.ocp4.example.com. Click htpasswd_provider and log in as the developer user with the developer password.

    3. Select the Developer perspective and click Topology to access the Topology view of the deployments-health project. Click the deployments-health project if it is not currently selected.

    4. Click the three dots button next to the expense deployment label and click the Add Health Checks entry.

    5. Click the Add Readiness probe button and fill in the form with the values in the following table:

      Field                 Value
      Path                  /q/health/ready
      Failure threshold     3

      Click the checkmark button at the end of the section.

    6. Click the Add Liveness probe button and fill in the form with the values in the following table:

      Field                 Value
      Path                  /q/health/live
      Failure threshold     2

      Click the checkmark button at the end of the section.

    7. Click the Add button to submit the health checks.

    8. Click the expense application label and then click the Details tab on the right panel to verify that OpenShift updates the deployment.
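
      The web console stores the probes in the pod template of the deployment. The updated deployment contains container settings roughly equivalent to the following sketch; the paths and failure thresholds come from the forms above, whereas the port and period values are assumptions that might differ from the generated ones. You can compare with the actual values by running oc get deployment expense -o yaml.

      # Sketch only: port 8080 and the periodSeconds values are assumptions.
      readinessProbe:
        httpGet:
          path: /q/health/ready
          port: 8080
        failureThreshold: 3
        periodSeconds: 10
      livenessProbe:
        httpGet:
          path: /q/health/live
          port: 8080
        failureThreshold: 2
        periodSeconds: 10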

  7. Verify that when the liveness probe fails in one of the pods, OpenShift restarts the pod.

    1. Keep the browser open on the Topology view. Open a terminal and run the following curl command to simulate a pod that stops responding.

      [student@workstation ~]$ curl -s \
      expense-deployments-health.apps.ocp4.example.com/crash
      Service not alive

      This request causes the pod that receives it to fail its health checks and be marked as not ready. After 2 failed liveness probe checks, which matches the failure threshold that you configured, OpenShift restarts the pod.

    2. In the web console, verify that OpenShift temporarily shows the deployment as unavailable.

      Then, OpenShift restarts the pod and the deployment becomes available.

      Optionally, get the deployment pods and verify that the RESTARTS column increments.

      [student@workstation ~]$ oc get po
      NAME                       READY   STATUS    RESTARTS        AGE
      expense-569cb4c454-gg2jt   1/1     Running   1 (5m47s ago)   18m
      expense-569cb4c454-kzbps   1/1     Running   0               18m
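
      To see the probe failures that triggered the restart, you can also list the project events filtered by reason; Kubernetes records failed probes with the Unhealthy reason.

      [student@workstation ~]$ oc get events \
      --sort-by=metadata.creationTimestamp --field-selector reason=Unhealthy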

Finish

On the workstation machine, use the lab command to complete this exercise. This step is important to ensure that resources from previous exercises do not impact upcoming exercises.

[student@workstation ~]$ lab finish deployments-health
