Deploy and troubleshoot a reliable application that defines health probes, compute resource requests, and compute resource limits so that it can run N instances per node; and configure a horizontal pod autoscaler that scales to a maximum of N instances.
Outcomes
You should be able to add resource requests to a Deployment object, configure probes, and create a horizontal pod autoscaler resource.
As the student user on the workstation machine, use the lab command to prepare your system for this exercise.
This command ensures that all resources are available for this exercise.
It also creates the reliability-review project and deploys the longload application in that project.
[student@workstation ~]$ lab start reliability-review
Instructions
The API URL of your OpenShift cluster is https://api.ocp4.example.com:6443, and the oc command is already installed on your workstation machine.
Log in to the OpenShift cluster as the developer user with the developer password.
Use the reliability-review project for your work.
The longload application in the reliability-review project fails to start.
Diagnose and then fix the issue.
The application needs 512 MiB of memory to work.
After you fix the issue, you can confirm that the application works by running the ~/DO180/labs/reliability-review/curl_loop.sh script that the lab command prepared.
The script sends requests to the application in a loop.
For each request, the script displays the pod name and the application status.
Press Ctrl+C to quit the script.
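The script is provided by the lab command; the following is only a plausible sketch of its logic, not the lab's actual code. The node port URL is taken from the script output shown later in this exercise, and the variable names are illustrative:
#!/bin/bash
# Hypothetical sketch of curl_loop.sh; the script that the lab command
# prepares is authoritative.
count=1
while true; do
  # Print a request counter, then the response (pod name and status).
  printf '%s ' "${count}"
  curl -sS http://master01.ocp4.example.com:30372
  count=$((count + 1))
  sleep 1
done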
Log in to the OpenShift cluster.
[student@workstation ~]$ oc login -u developer -p developer \
https://api.ocp4.example.com:6443
Login successful.
...output omitted...
Set the reliability-review project as the active project.
[student@workstation ~]$ oc project reliability-review
...output omitted...
List the pods in the project.
The pod is in the Pending status.
The name of the pod on your system probably differs.
[student@workstation ~]$ oc get pods
NAME                        READY   STATUS    RESTARTS   AGE
longload-64bf8dd776-b6rkz   0/1     Pending   0          8m1s
Retrieve the events for the pod. No compute node has enough memory to accommodate the pod.
[student@workstation ~]$ oc describe pod longload-64bf8dd776-b6rkz
Name:         longload-64bf8dd776-b6rkz
Namespace:    reliability-review
...output omitted...
Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  8m    default-scheduler  0/1 nodes are available: 1 Insufficient memory. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod.
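Alternatively, you can list the recent events for the whole project:
[student@workstation ~]$ oc get events --sort-by '.lastTimestamp'
...output omitted...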
Review the resource requests for memory.
The longload deployment requests 8 GiB of memory.
[student@workstation ~]$ oc get deployment longload -o \
jsonpath='{.spec.template.spec.containers[0].resources.requests.memory}{"\n"}'
8Gi
Set the memory requests to 512 MiB. Ignore the warning message.
[student@workstation ~]$ oc set resources deployment/longload \
--requests memory=512Mi
deployment.apps/longload resource requirements updated
Wait for the pod to start.
You might have to rerun the command several times for the pod to report a Running status.
The name of the pod on your system probably differs.
[student@workstation ~]$ oc get pods
NAME                        READY   STATUS    RESTARTS   AGE
longload-5897c9558f-cx4gt   1/1     Running   0          86s
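For reference, the oc set resources command updates the deployment's pod template. The resulting stanza should look similar to this sketch (the container name is an assumption):
spec:
  template:
    spec:
      containers:
      - name: longload  # assumed container name
        resources:
          requests:
            memory: 512Mi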
Run the ~/DO180/labs/reliability-review/curl_loop.sh script to confirm that the application works.
[student@workstation ~]$ ~/DO180/labs/reliability-review/curl_loop.sh
1 curl: (7) Failed to connect to master01.ocp4.example.com port 30372: Connection refused
2 longload-5897c9558f-cx4gt: app is still starting
3 longload-5897c9558f-cx4gt: app is still starting
4 longload-5897c9558f-cx4gt: app is still starting
5 longload-5897c9558f-cx4gt: Ok
6 longload-5897c9558f-cx4gt: Ok
7 longload-5897c9558f-cx4gt: Ok
8 longload-5897c9558f-cx4gt: Ok
...output omitted...
Press Ctrl+C to quit the script.
When the application scales up, your customers complain that some requests fail.
To replicate the issue, manually scale up the longload application to three replicas, and run the ~/DO180/labs/reliability-review/curl_loop.sh script at the same time.
The application takes seven seconds to initialize.
The application exposes the /health API endpoint on HTTP port 3000.
Configure the longload deployment to use this endpoint, to ensure that the application is ready before serving client requests.
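The oc set probe command that you use later in this exercise generates a readinessProbe stanza in the deployment's pod template similar to this sketch:
readinessProbe:
  httpGet:
    path: /health
    port: 3000
    scheme: HTTP
  initialDelaySeconds: 7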
Open a new terminal window and run the ~/DO180/labs/reliability-review/curl_loop.sh script.
[student@workstation ~]$ ~/DO180/labs/reliability-review/curl_loop.sh
1 longload-5897c9558f-cx4gt: Ok
2 longload-5897c9558f-cx4gt: Ok
3 longload-5897c9558f-cx4gt: Ok
4 longload-5897c9558f-cx4gt: Ok
...output omitted...
Leave the script running and do not interrupt it.
Scale up the application to three replicas.
[student@workstation ~]$ oc scale deployment/longload --replicas 3
deployment.apps/longload scaled
Watch the output of the curl_loop.sh script in the second terminal.
Some requests fail because OpenShift sends requests to the new pods before the application is ready.
...output omitted...
22 longload-5897c9558f-cx4gt: Ok
23 longload-5897c9558f-cx4gt: Ok
24 longload-5897c9558f-cx4gt: Ok
25 curl: (7) Failed to connect to master01.ocp4.example.com port 30372: Connection refused
26 curl: (7) Failed to connect to master01.ocp4.example.com port 30372: Connection refused
27 longload-5897c9558f-cx4gt: Ok
28 curl: (7) Failed to connect to master01.ocp4.example.com port 30372: Connection refused
29 longload-5897c9558f-cx4gt: Ok
30 curl: (7) Failed to connect to master01.ocp4.example.com port 30372: Connection refused
31 longload-5897c9558f-tpssf: app is still starting
32 longload-5897c9558f-kkvm5: app is still starting
33 longload-5897c9558f-cx4gt: Ok
34 longload-5897c9558f-tpssf: app is still starting
35 longload-5897c9558f-tpssf: app is still starting
36 longload-5897c9558f-tpssf: app is still starting
37 longload-5897c9558f-cx4gt: Ok
38 longload-5897c9558f-tpssf: app is still starting
39 longload-5897c9558f-cx4gt: Ok
40 longload-5897c9558f-cx4gt: Ok
...output omitted...
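These failures occur because, without a readiness probe, OpenShift marks each new pod as ready as soon as its container starts and immediately adds the pod to the service endpoints. Assuming that the application is exposed through a service named longload (the service name is an assumption), you can watch the endpoints change in a third terminal:
[student@workstation ~]$ oc get endpoints longload
...output omitted...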
Leave the script running and do not interrupt it.
Add a readiness probe to the longload deployment.
Ignore the warning message.
[student@workstation ~]$ oc set probe deployment/longload --readiness \
--initial-delay-seconds 7 \
--get-url http://:3000/health
deployment.apps/longload probes updated
Scale down the application back to one pod.
[student@workstation ~]$ oc scale deployment/longload --replicas 1
deployment.apps/longload scaled
If scaling down breaks the curl_loop.sh script, then press Ctrl+C to stop the script in the second terminal.
Then, restart the script.
To test your work, scale up the application to three replicas again.
[student@workstation ~]$ oc scale deployment/longload --replicas 3
deployment.apps/longload scaled
Watch the output of the curl_loop.sh script in the second terminal.
No request fails.
...output omitted...
92 longload-7ddcc9b7fd-72dtm: Ok
93 longload-7ddcc9b7fd-72dtm: Ok
94 longload-7ddcc9b7fd-72dtm: Ok
95 longload-7ddcc9b7fd-qln95: Ok
96 longload-7ddcc9b7fd-wrxrb: Ok
97 longload-7ddcc9b7fd-qln95: Ok
98 longload-7ddcc9b7fd-wrxrb: Ok
99 longload-7ddcc9b7fd-72dtm: Ok
...output omitted...
Press Ctrl+C to quit the script.
Configure the application so that it automatically scales up when the average memory usage is above 60% of the memory requests value, and scales down when the usage is below this percentage.
The minimum number of replicas must be one, and the maximum must be three.
The resource that you create for scaling the application must be named longload.
The lab command provides the ~/DO180/labs/reliability-review/hpa.yml resource file as an example.
Use the oc explain command to learn the valid parameters for the hpa.spec.metrics.resource.target attribute.
Because the file is incomplete, you must update it first if you choose to use it.
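Note that the oc autoscale command can create a horizontal pod autoscaler in a single step, but only with a CPU utilization target, for example:
[student@workstation ~]$ oc autoscale deployment/longload --min 1 --max 3 --cpu-percent 60
This CPU-based form is shown for comparison only; because this exercise targets memory utilization, you must create the resource from a YAML definition such as hpa.yml.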
To test your work, use the oc exec deploy/longload -- curl localhost:3000/leak command to send an HTTP request to the application's /leak API endpoint.
Each request consumes an additional 480 MiB of memory.
To free this memory, you can use the ~/DO180/labs/reliability-review/free.sh script.
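The free.sh script is provided by the lab command; the following is only a plausible sketch of its logic, and the /free endpoint name is a guess inferred from the script's output shown later in this exercise:
#!/bin/bash
# Hypothetical sketch of free.sh; the lab's script is authoritative, and the
# /free endpoint name is an assumption.
oc exec deploy/longload -- curl -s localhost:3000/free
With several replicas, oc exec deploy/longload runs the command in only one pod of the deployment, which is why a later step asks you to rerun the script until it reaches the pod that allocated the memory.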
Before you create the horizontal pod autoscaler resource, scale down the application to one pod.
[student@workstation ~]$ oc scale deployment/longload --replicas 1
deployment.apps/longload scaledEdit the ~/DO180/labs/reliability-review/hpa.yml resource file.
You can retrieve the parameters for the resource attribute by using the oc explain hpa.spec.metrics.resource and oc explain hpa.spec.metrics.resource.target commands.
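For example (output abridged; exact field descriptions vary by cluster version):
[student@workstation ~]$ oc explain hpa.spec.metrics.resource.target
...output omitted...
FIELDS:
  averageUtilization   <integer>
  averageValue         <Quantity>
  type                 <string> -required-
  value                <Quantity>
The completed hpa.yml file looks like the following: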
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: longload
  labels:
    app: longload
spec:
  maxReplicas: 3
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: longload
  metrics:
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 60
Use the oc apply command to deploy the horizontal pod autoscaler.
[student@workstation ~]$ oc apply -f ~/DO180/labs/reliability-review/hpa.yml
horizontalpodautoscaler.autoscaling/longload created
In the second terminal, run the watch command to monitor the oc get hpa longload command.
Wait for the longload horizontal pod autoscaler to report usage in the TARGETS column.
The percentage on your system probably differs.
[student@workstation ~]$ watch oc get hpa longload
Every 2.0s: oc get hpa longload          workstation: Fri Mar 10 05:15:34 2023

NAME       REFERENCE             TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
longload   Deployment/longload   13%/60%   1         3         1          75s
Leave the command running and do not interrupt it.
To test your work, run the oc exec deploy/longload -- curl localhost:3000/leak command in the first terminal for the application to allocate 480 MiB of memory.
[student@workstation ~]$ oc exec deploy/longload -- curl -s localhost:3000/leak
longload-7ddcc9b7fd-72dtm: consuming memory!
In the second terminal, after two minutes, the oc get hpa longload command shows the memory increase.
The horizontal pod autoscaler scales up the application to more than one replica.
The percentage on your system probably differs.
Every 2.0s: oc get hpa longload          workstation: Fri Mar 10 05:19:44 2023

NAME       REFERENCE             TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
longload   Deployment/longload   145%/60%   1         3         2          5m18s
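The autoscaler computes the desired replica count with the standard Kubernetes formula desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization), so a utilization well above the 60% target drives the deployment toward the maxReplicas limit of three. If the cluster metrics stack is available, you can also inspect per-pod usage directly:
[student@workstation ~]$ oc adm top pods
...output omitted...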
To test your work, run the ~/DO180/labs/reliability-review/free.sh script in the first terminal for the application to release the memory.
Ensure that the pod that frees the memory is the same pod that was consuming memory.
Execute the free.sh script several times if necessary.
[student@workstation ~]$ ~/DO180/labs/reliability-review/free.sh
longload-7ddcc9b7fd-72dtm: releasing memory!
In the second terminal, after ten minutes, the oc get hpa longload command shows the memory decrease.
The horizontal pod autoscaler scales down the application to one replica.
The percentage on your system probably differs.
Every 2.0s: oc get hpa longload          workstation: Fri Mar 10 05:19:44 2023

NAME       REFERENCE             TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
longload   Deployment/longload   12%/60%   1         3         1          15m28s
Press Ctrl+C to quit the watch command.
Close that second terminal when done.