Configure an autoscaler for an application and then load test that application to observe scaling up.
Outcomes
You should be able to manually scale up a deployment, configure a horizontal pod autoscaler resource, and monitor the autoscaler.
As the student user on the workstation machine, use the lab command to prepare your system for this exercise.
This command ensures that all resources are available for this exercise.
It also creates the reliability-autoscaling project.
[student@workstation ~]$ lab start reliability-autoscaling
Instructions
Log in to the OpenShift cluster as the developer user with the developer password.
Use the reliability-autoscaling project.
Log in to the OpenShift cluster.
[student@workstation ~]$oc login -u developer -p developer \https://api.ocp4.example.com:6443Login successful. ...output omitted...
Set the reliability-autoscaling project as the active project.
[student@workstation ~]$ oc project reliability-autoscaling
...output omitted...Create the loadtest deployment, service, and route.
The deployment uses the registry.ocp4.example.com:8443/redhattraining/loadtest:v1.0 container image that provides a web application.
The web application exposes an API endpoint that creates a CPU-intensive task when queried.
Review the ~/DO180/labs/reliability-autoscaling/loadtest.yml resource file that the lab command prepared.
The container specification does not include the resources section that you use to specify CPU requests and limits.
You configure that section in another step.
Do not change the file for now.
apiVersion: v1
kind: List
metadata: {}
items:
- apiVersion: apps/v1
kind: Deployment
...output omitted...
spec:
containers:
- image: registry.ocp4.example.com:8443/redhattraining/loadtest:v1.0
name: loadtest
readinessProbe:
failureThreshold: 3
httpGet:
path: /api/loadtest/v1/healthz
port: 8080
scheme: HTTP
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
- apiVersion: v1
kind: Service
...output omitted...
- apiVersion: route.openshift.io/v1
kind: Route
...output omitted...Use the oc apply command to create the application.
[student@workstation ~]$ oc apply -f \
~/DO180/labs/reliability-autoscaling/loadtest.yml
deployment.apps/loadtest created
service/loadtest created
route.route.openshift.io/loadtest createdWait for the pod to start.
You might have to rerun the command several times for the pod to report a Running status.
The name of the pod on your system probably differs.
[student@workstation ~]$oc get podsNAME READY STATUS RESTARTS AGE loadtest-65c55b7dc-r4s4s 1/1Running0 49s
Configure a horizontal pod autoscaler resource for the loadtest deployment.
Set the minimum number of replicas to 2 and the maximum to 20.
Set the average CPU usage to 50% of the CPU requests attribute.
The horizontal pod autoscaler does not work, because the loadtest deployment does not specify requests for CPU usage.
Use the oc autoscale command to create the horizontal pod autoscaler resource.
[student@workstation ~]$ oc autoscale deployment/loadtest --min 2 --max 20 \
--cpu-percent 50
horizontalpodautoscaler.autoscaling/loadtest autoscaledRetrieve the status of the loadtest horizontal pod autoscaler resource.
The unknown value in the TARGETS column indicates that OpenShift cannot compute the current CPU usage of the loadtest deployment.
The deployment must include the CPU requests attribute for OpenShift to be able to compute the CPU usage.
[student@workstation ~]$oc get hpa loadtestNAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE loadtest Deployment/loadtest<unknown>/50% 2 20 2 74s
Get more details about the resource status. You might have to rerun the command several times. Wait three minutes for the command to report the warning message.
[student@workstation ~]$oc describe hpa loadtestName: loadtest Namespace: reliability-autoscaling ...output omitted... Conditions: Type Status Reason Message ---- ------ ------ ------- AbleToScale True SucceededGetScale the HPA controller was able to get the target's current scale ScalingActive FalseFailedGetResourceMetricthe HPA was unable to compute the replica count: failed to get cpu utilization:missing request for cpuEvents: Type ... Message ---- ... ------- ...output omitted... Warning ...failed to get cpu utilization: missing request for cpu...output omitted...
Delete the horizontal pod autoscaler resource.
You re-create the resource in another step, after you fix the loadtest deployment.
[student@workstation ~]$ oc delete hpa loadtest
horizontalpodautoscaler.autoscaling "loadtest" deletedDelete the loadtest application.
[student@workstation ~]$ oc delete -f \
~/DO180/labs/reliability-autoscaling/loadtest.yml
deployment.apps "loadtest" deleted
service "loadtest" deleted
route.route.openshift.io "loadtest" deletedAdd a CPU resource section to the ~/DO180/labs/reliability-autoscaling/loadtest.yml file.
Redeploy the application from the file.
Edit the ~/DO180/labs/reliability-autoscaling/loadtest.yml file, and configure the CPU limits and requests for the loadtest deployment.
The pods need 25 millicores to operate, and must not consume more that 100 millicores.
You can compare your work with the completed ~/DO180/solutions/reliability-autoscaling/loadtest.yml file that the lab command prepared.
...output omitted...
spec:
containers:
- image: registry.ocp4.example.com:8443/redhattraining/loadtest:v1.0
name: loadtest
readinessProbe:
failureThreshold: 3
httpGet:
path: /api/loadtest/v1/healthz
port: 8080
scheme: HTTP
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
resources:
requests:
cpu: 25m
limits:
cpu: 100m
...output omitted...Use the oc apply command to deploy the application from the file.
[student@workstation ~]$ oc apply -f \
~/DO180/labs/reliability-autoscaling/loadtest.yml
deployment.apps/loadtest created
service/loadtest created
route.route.openshift.io/loadtest createdWait for the pod to start.
You might have to rerun the command several times for the pod to report a Running status.
The name of the pod on your system probably differs.
[student@workstation ~]$oc get podsNAME READY STATUS RESTARTS AGE loadtest-667bdcdc99-vhc9x 1/1Running0 36s
Manually scale the loadtest deployment by first increasing and then decreasing the number of running pods.
Scale up the loadtest deployment to five pods.
[student@workstation ~]$ oc scale deployment/loadtest --replicas 5
deployment.apps/loadtest scaledConfirm that all five application pods are running.
You might have to rerun the command several times for all the pods to report a Running status.
The name of the pods on your system probably differ.
[student@workstation ~]$oc get podsNAME READY STATUS RESTARTS AGE loadtest-667bdcdc99-5fcvh 1/1Running0 43s loadtest-667bdcdc99-dpspr 1/1Running0 42s loadtest-667bdcdc99-hkssk 1/1Running0 43s loadtest-667bdcdc99-vhc9x 1/1Running0 8m11s loadtest-667bdcdc99-z5n9q 1/1Running0 43s
Scale down the loadtest deployment back to one pod.
[student@workstation ~]$ oc scale deployment/loadtest --replicas 1
deployment.apps/loadtest scaledConfirm that only one application pod is running. You might have to rerun the command several times for the pods to terminate.
[student@workstation ~]$oc get podsNAME READY STATUS RESTARTS AGE loadtest-667bdcdc99-vhc9x 1/1Running0 11m
Configure a horizontal pod autoscaler resource for the loadtest deployment.
Set the minimum number of replicas to 2 and the maximum to 20.
Set the average CPU usage to 50% of the CPU request attribute.
Use the oc autoscale command to create the horizontal pod autoscaler resource.
[student@workstation ~]$ oc autoscale deployment/loadtest --min 2 --max 20 \
--cpu-percent 50
horizontalpodautoscaler.autoscaling/loadtest autoscaledOpen a new terminal window and run the watch command to monitor the oc get hpa loadtest command.
Wait five minutes for the loadtest horizontal pod autoscaler to report usage in the TARGETS column.
Notice that the horizontal pod autoscaler scales up the deployment to two replicas, to conform with the minimum number of pods that you configured.
[student@workstation ~]$watch oc get hpa loadtestEvery 2.0s: oc get hpa loadtest workstation: Fri Mar 3 06:26:24 2023 NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE loadtest Deployment/loadtest0%/50%2 20252s
Leave the command running, and do not interrupt it.
Increase the CPU usage by sending requests to the loadtest application API.
Use the oc get route command to retrieve the URL of the application.
[student@workstation ~]$oc get route loadtestNAME HOST/PORT ... loadtestloadtest-reliability-autoscaling.apps.ocp4.example.com...
Send a request to the application API to simulate additional CPU pressure on the container.
Do not wait for the curl command to complete, and continue with the exercise.
After a minute, the command reports a timeout error that you can ignore.
[student@workstation ~]$ curl \
loadtest-reliability-autoscaling.apps.ocp4.example.com/api/loadtest/v1/cpu/1
<html><body><h1>504 Gateway Time-out</h1>
The server didn't respond in time.
</body></html>Watch the output of the oc get hpa loadtest command in the second terminal.
After a minute, the horizontal pod autoscaler detects an increase in the CPU usage and deploys additional pods.
The increased activity of the application does not immediately trigger the autoscaler. Wait a few moments if you do not see any changes to the number of replicas.
You might need to run the curl command multiple times before the application uses enough CPU to trigger the autoscaler.
The CPU usage and the number of replicas on your system probably differ.
Every 2.0s: oc get hpa loadtest workstation: Fri Mar 3 07:20:19 2023 NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE loadtest Deployment/loadtest220%/50%2 20916m
Wait five minutes after the curl command completes.
The oc get hpa loadtest command shows that the CPU load decreases.
Although the horizontal pod autoscaler resource can be quick to scale up, it is slower to scale down.
Every 2.0s: oc get hpa loadtest workstation: Fri Mar 3 07:23:11 2023
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
loadtest Deployment/loadtest 0%/50% 2 20 9 18m
Optional: Wait for the loadtest application to scale down.
It takes five additional minutes for the horizontal pod autoscaler to scale down to two replicas.
Every 2.0s: oc get hpa loadtest workstation: Fri Mar 3 07:29:12 2023
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
loadtest Deployment/loadtest 0%/50% 2 20 2 24mPress Ctrl+C to quit the watch command.
Close that second terminal when done.