Configure Node Autoscaling

Objectives

  • Autoscale a node pool based on application load.

OpenShift provides autoscaling mechanisms to dynamically adapt the cluster infrastructure to the load.

OpenShift uses autoscaling methods for two distinct purposes:

  • The horizontal pod autoscaler (HPA) scales up or down Kubernetes workloads, such as Deployment resources, based on current load on the application pods. The HPA deploys additional application pods when the load increases, and deletes pods when the load decreases.

  • The cluster autoscaler scales up or down the cluster infrastructure. The cluster autoscaler provisions new compute nodes when some workloads cannot run because of insufficient cluster resources. The autoscaler deletes compute nodes as the load decreases.

Configuring the Horizontal Pod Autoscaler

Kubernetes can autoscale a deployment based on the current load on the application pods by means of a HorizontalPodAutoscaler (HPA) resource.

An HPA resource uses performance metrics that the OpenShift Metrics subsystem collects. To autoscale a deployment, you must specify resource requests for pods so that the HPA can calculate the usage percentage.
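For example, the following Deployment excerpt sets a CPU request for the application container, so that the HPA can compute the CPU utilization as a percentage of that request. The excerpt matches the hello deployment that the HPA example later in this section targets; the container name and the 250 millicore request are illustrative assumptions that you adjust to your application.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello
spec:
...output omitted...
    spec:
      containers:
      - name: hello
...output omitted...
        resources:
          requests:
            cpu: 250m  # illustrative value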

You can create an HPA resource from a file in the YAML format.

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: hello
spec:
  minReplicas: 1   1
  maxReplicas: 10  2
  metrics:
  - resource:
      name: cpu
      target:
        averageUtilization: 80  3
        type: Utilization
    type: Resource
  scaleTargetRef:  4
    apiVersion: apps/v1
    kind: Deployment
    name: hello

1. Minimum number of pods.

2. Maximum number of pods.

3. Ideal average CPU usage for each pod. If the average CPU usage across the pods is above that value, then the HPA starts new pods. If the average CPU usage across the pods is below that value, then the HPA deletes pods.

4. Reference to the name of the deployment resource.
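The HPA computes the desired number of pods from the ratio between the current average utilization and the target utilization. The following calculation is only an illustration, with assumed values for the number of pods, the CPU request, and the current usage: suppose that the hello deployment runs two pods, that each pod requests 200 millicores of CPU, and that the pods currently use 180 millicores on average, which is a 90% utilization. Because 90% is above the 80% target, the HPA scales up the deployment:

desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization)
                = ceil(2 * 90 / 80)
                = 3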

Use the oc apply -f hello-hpa.yaml command to create the resource from the file.
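After you create the HPA resource, you can check that it tracks the deployment and reports the current usage, for example with the following commands:

$ oc get hpa hello
$ oc describe hpa hello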

The preceding example creates an HPA resource that scales based on CPU usage. You can also configure an HPA resource to scale the workload based on memory usage by setting the resource name to memory, as in the following example:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: hello
spec:
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - resource:
      name: memory
      target:
        averageUtilization: 80
...output omitted...

The Red Hat OpenShift Administration I - Managing Containers and Kubernetes (DO180) course provides more information about configuring HPA resources.

Configuring the Cluster Autoscaler

In a Red Hat OpenShift Service on AWS (ROSA) cluster, you control the number of compute nodes through your ROSA machine pools.

You can manually scale a machine pool by using the --replicas option of the rosa edit machinepool command. ROSA uses the AWS API to provision or delete Amazon Elastic Compute Cloud (EC2) instances to adapt the machine pool to its new size.

The following example scales the with-gpu machine pool to six nodes:

$ rosa edit machinepool --cluster mycluster --replicas 6 with-gpu

Instead of manually scaling your machine pool, you can activate the cluster autoscaler for the pool. By using the cluster autoscaler, you ensure that your cluster adapts its size to the load of your applications.

When the load increases, the autoscaler provisions new nodes. When the load decreases, some nodes become underutilized. After some time, the cluster autoscaler relocates the remaining workloads from the underutilized nodes to other nodes, and then deletes the now idle nodes. By dynamically adapting the size of your cluster to your workloads, you optimize the cost of your infrastructure.

The autoscaler scales up when it detects that some pods are in the pending state because of insufficient resources in the machine pool. A typical chain of events is as follows:

  1. To adapt the number of application pods to your client requests, you configure the HPA for your workload.

  2. When the number of client requests rises, the application load increases. The HPA deploys new pods to adjust to that load.

  3. The compute nodes in the machine pool do not provide enough resources to accommodate the new pods. Some pods are in the pending state.

  4. The cluster autoscaler detects the pending state of these pods, and then triggers the deployment of additional compute nodes.

  5. ROSA uses the AWS API to provision EC2 instances for the additional nodes.

  6. The new nodes become available. OpenShift deploys the pending pods to this increased compute node capacity.
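While such a scale-up is in progress, you can observe the pending pods and the arrival of the new nodes. The following commands are a minimal sketch; listing nodes assumes that you have cluster administrator permissions:

$ oc get pods --field-selector=status.phase=Pending
$ oc get nodes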

Activating the Autoscaler for Machine Pools

The following command activates the autoscaler for the with-gpu machine pool. The mandatory --min-replicas and --max-replicas options set the lower and upper limits for the autoscaler.

$ rosa edit machinepool --cluster mycluster --enable-autoscaling \
  --min-replicas 3 --max-replicas 9 with-gpu

You can also activate the autoscaler when you create a machine pool. The rosa create machinepool command accepts the same --enable-autoscaling, --min-replicas, and --max-replicas options.
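For example, the following command creates a machine pool with autoscaling enabled between two and six nodes. The machine pool name and the instance type are illustrative values:

$ rosa create machinepool --cluster mycluster --name autoscaled-pool \
  --instance-type m5.xlarge --enable-autoscaling \
  --min-replicas 2 --max-replicas 6  # illustrative name and instance type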

To verify whether the autoscaler is activated, use the rosa list machinepools command:

$ rosa list machinepools --cluster mycluster
ID         AUTOSCALING   REPLICAS  INSTANCE TYPE   LABELS      ...
Default    No            2         m5.xlarge                   ...
with-gpu   Yes           3-9       p3.2xlarge      workload=IA ...

The ROSA machine pool autoscaler translates to OpenShift machine autoscaler resources. You can list these resources, but you cannot modify them. Instead, edit the corresponding ROSA machine pool when you need to change a pool parameter.

$ oc get machineautoscaler -n openshift-machine-api
NAME                         REF KIND     REF NAME                    MIN  MAX ...
mycluster-p5k3-with-gpu...   MachineSet   mycluster-p5k3-with-gpu...  3    9 ...

Only cluster administrators can view the machine autoscaler resources.

References

For more information about the HPA, refer to the Automatically Scaling Pods with the Horizontal Pod Autoscaler section in the Working with Pods chapter in the Red Hat OpenShift Container Platform 4.12 Nodes documentation at https://access.redhat.com/documentation/en-us/openshift_container_platform/4.12/html-single/nodes/index#nodes-pods-autoscaling

For more information about autoscaling ROSA machine pools, refer to the About Autoscaling Nodes on a Cluster section in the Nodes chapter in the Red Hat OpenShift Service on AWS 4 Cluster Administration documentation at https://access.redhat.com/documentation/en-us/red_hat_openshift_service_on_aws/4/html-single/cluster_administration/index#rosa-nodes-about-autoscaling-nodes

For more information about the cluster autoscaler, refer to the Applying Autoscaling to an OpenShift Container Platform Cluster chapter in the Red Hat OpenShift Container Platform 4.12 Machine Management documentation at https://access.redhat.com/documentation/en-us/openshift_container_platform/4.12/html-single/machine_management/index#applying-autoscaling
