Chapter 3.  Cluster Partitioning

Abstract

Goal

Configure a subset of cluster nodes to be dedicated to a type of workload.

Sections
  • Node Pools (and Quiz)

  • Node Configuration with the Machine Configuration Operator (and Guided Exercise)

  • Node Configuration with Special Purpose Operators (and Guided Exercise)

Lab
  • Cluster Partitioning

Node Pools

Objectives

  • Illustrate methods of adding and configuring OpenShift cluster nodes, by using node labels to configure nodes and to schedule pods to those nodes.

Introduction

The OpenShift cluster has control plane nodes and compute nodes. Compute nodes are also known as worker nodes. The control plane nodes manage the Red Hat OpenShift Container Platform cluster and orchestrate the distribution of the workloads on the compute nodes.

The applications that run in the OpenShift cluster have different specifications and requirements. Some applications might require compute nodes with graphics processing units (GPU) for graphical applications, artificial intelligence (AI) computations, or machine learning (ML) workloads. Some applications, such as simple websites, might not require specific hardware or specifications, and work smoothly on a small compute node.

For example, you run an application that is composed of many microservices that require high CPU capacity, and a database that does not have such a requirement.

You can run both the CPU-intensive application and the database instances on compute nodes with high CPU capacity. In this case, the CPU-intensive application runs smoothly, but the database does not use the high CPU node to full capacity, and the billing cost increases.

Alternatively, you can deploy both the CPU-intensive application and the database instances on compute nodes with low CPU capacity. In this case, the billing cost is lower, but the CPU-intensive application performs poorly because the nodes do not provide enough CPU capacity.

The best solution is to deploy the CPU-intensive application instances on high CPU compute nodes and to deploy the database application on the low CPU compute nodes.

You need compute nodes with different capacities and hardware to accommodate these applications, and a way to partition those nodes between the workloads. You can deploy these applications on specific nodes by using node pools.

In some cases, you might face a "noisy neighbor" issue, where an application performs poorly and experiences high latency because of other applications that run on the same nodes. To avoid the noisy neighbor issue, you can configure compute nodes to be part of node pools for critical applications.

Node Pools

A node pool is a logical group of OpenShift compute nodes. Compute nodes with similar hardware configurations can be organized into a node pool. Collecting these similar nodes into node pools provides a method of targeting placement for workload deployments.

You can configure multiple node pools for your workloads to ensure that your applications run on the required hardware.

An application often requires a specific compute node configuration to run smoothly.

In this scenario, you have an application that uses ML and that also relies on a separate database application. The ML application requires nodes with a GPU to run smoothly, whereas the database can run on a standard compute node.

In this case, provision compute nodes with GPU hardware and configure GPU compute nodes as part of a gpu node pool by using the node-pool=gpu label. You can schedule the ML application pods on the gpu node pool, whereas database pods deploy on other compute nodes.

You might have critical applications that do not require specific hardware and that can run on a standard compute node. These critical applications might still face the "noisy neighbor" issue.

For example, consider a messaging application. The application does not require specific hardware and can run on any standard node, but cannot afford high network latency. Other non-critical applications might run on the same cluster node and increase network traffic and latency for the messaging application.

Although the application would typically run adequately on that compute node, other non-critical applications can affect the network performance of the compute node. This is the "noisy neighbor" issue. To avoid this issue, you can use a node pool to organize the compute nodes that run your critical applications.

Defining node pools for your cluster hardware depends on your organization's approach to application deployments. In a scenario with multi-tiered infrastructure deployments, such as development, staging, and production tiers of hardware, administrators can organize node pools to correspond to these tiers. Older hardware can populate a "development" node pool, whereas newer infrastructure can be designated for production workloads through a "production" node pool. With this approach, developers can target these node pools to separate the development, staging, and production applications that run in the cluster.

Node Provisioning

Compute node provisioning depends on the method of cluster installation.

In a user-provisioned infrastructure (UPI), you must manually provision new instances. The strategy for provisioning nodes is specific to your data center and your IT processes. However, the base steps remain the same, as follows:

  1. Update the compute node Ignition file with an updated TLS certificate.

  2. Install Red Hat Enterprise Linux CoreOS (RHCOS) from an ISO image or by using a Preboot eXecution Environment (PXE) boot approach.

  3. Add the new instance to the ingress load balancer.

  4. Approve the Certificate Signing Requests (CSRs), as shown in the example after this list.
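
The following commands are a minimal sketch of the CSR approval step, assuming that a new compute node has just joined the cluster; the CSR name is illustrative. A new node first submits a client CSR and, after that request is approved, a serving CSR, so you typically repeat the approval.

[user@host ~]$ oc get csr
...output omitted...
[user@host ~]$ oc adm certificate approve csr-8b2mt
certificatesigningrequest.certificates.k8s.io/csr-8b2mt approved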

In an installer-provisioned infrastructure (IPI), the machine API automatically performs scaling operations for supported cloud providers. Thus, you can modify the specified number of replicas in a machine set resource, and OpenShift communicates with the cloud provider to provision or remove instances. You can also scale up and scale down automatically, based on workload requirements, by using the cluster autoscaler.
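
For example, the following commands sketch a manual scale-up on an IPI cluster; the machine set name is illustrative and depends on your cluster name and cloud provider.

[user@host ~]$ oc get machinesets -n openshift-machine-api
...output omitted...
[user@host ~]$ oc scale machineset/ocp4-abc12-worker-us-east-1a --replicas=3 -n openshift-machine-api
machineset.machine.openshift.io/ocp4-abc12-worker-us-east-1a scaled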

For more details about Red Hat OpenShift installation and node provisioning, refer to the DO322: Red Hat OpenShift Installation Lab training course.

Some managed clusters protect their machine sets against direct modification, and instead support node provisioning through machine pools. For example, Red Hat OpenShift Service on AWS (ROSA) clusters use machine pools. You create a machine pool and select an instance type that is specific to the workload. The underlying machine set is protected against direct modification. You can scale a machine pool manually or by using the cluster autoscaler.
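
For example, the following rosa CLI command is a sketch of creating a machine pool that provides GPU instances and labels them for a gpu node pool; the cluster name, machine pool name, and instance type are illustrative.

[user@host ~]$ rosa create machinepool --cluster=my-cluster --name=gpu \
  --replicas=2 --instance-type=g4dn.xlarge --labels=node-pool=gpu
...output omitted...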

For more details about ROSA and autoscaling, refer to the CS220: Creating and Configuring Production Red Hat OpenShift on AWS (ROSA) Clusters training course.

Node Labels

Use node labels to configure compute nodes to be a member of a specific node pool. Node labels are key-value pairs that are attached to the node.

In the user-provisioned infrastructure, you can add a label to multiple compute nodes to designate each as a member of one node pool.

You can list a node and its labels by using the oc get nodes command:

[user@host ~]$ oc get nodes worker01 --show-labels
NAME       STATUS   ROLES                  ...   LABELS
worker01   Ready    worker                 ...   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,cluster.ocs.openshift.io/openshift-storage=,kubernetes.io/arch=amd64,kubernetes.io/hostname=worker01,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.openshift.io/os_id=rhcos

Compute nodes can have many node labels. Add as many node labels to a compute node as needed for workload placement onto the intended hardware. You can add a custom node label to the compute nodes by using the oc label node command.

[user@host ~]$ oc label node/worker01 node-pool=gpu
node/worker01 labeled (1)
[user@host ~]$ oc get nodes worker01 --show-labels
NAME       STATUS   ROLES    ...   LABELS
worker01   Ready    worker   ...   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,cluster.ocs.openshift.io/openshift-storage=,kubernetes.io/arch=amd64,kubernetes.io/hostname=worker01,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.openshift.io/os_id=rhcos,node-pool=gpu (2)

(1) The oc label command adds the node-pool=gpu label to the worker01 compute node.

(2) The oc get nodes command displays the node-pool=gpu label for the worker01 compute node.

You can configure multiple GPU compute nodes to be part of a gpu node pool by adding the same node label to all GPU-enabled compute nodes.

[user@host ~]$ oc label node/worker02 node-pool=gpu
node/worker02 labeled
[user@host ~]$ oc get nodes --selector node-pool=gpu
NAME       STATUS   ROLES    AGE   VERSION
worker01   Ready    worker   13d   v1.25.7+eab9cc9
worker02   Ready    worker   13d   v1.25.7+eab9cc9

The oc get nodes command with the selector filter shows all nodes with the node-pool=gpu label. You can schedule the application to target a compute node in the gpu node pool by using node labels. The next chapter discusses pod scheduling in detail.
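
As a preview of pod scheduling, the following command is a minimal sketch that adds a node selector to a hypothetical ml-model deployment, so that the scheduler places its pods only on nodes with the node-pool=gpu label.

[user@host ~]$ oc patch deployment/ml-model --type merge \
  -p '{"spec":{"template":{"spec":{"nodeSelector":{"node-pool":"gpu"}}}}}'
deployment.apps/ml-model patched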

Node Configuration

The OpenShift cluster manages the configuration and updates of its nodes by using the Machine Config Operator (MCO). You can manage nodes by defining groups of nodes, which are called machine config pools (MCPs). By default, the MCO uses the master and worker MCPs. You can create custom MCPs for node pools. MCPs use labels to match one or more machine configs (MCs) to one or more nodes.

For example, you can create a gpu MCP. The gpu MCP selects nodes that have the gpu node role, which corresponds to the node-role.kubernetes.io/gpu label.
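
The following commands are a minimal sketch of such a custom MCP, assuming that the GPU nodes carry the node-role.kubernetes.io/gpu= label; the pool applies both the worker and gpu machine configs to the selected nodes.

[user@host ~]$ cat <<EOF | oc apply -f -
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  name: gpu
spec:
  machineConfigSelector:
    matchExpressions:
      - key: machineconfiguration.openshift.io/role
        operator: In
        values: [worker, gpu]
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/gpu: ""
EOF
machineconfigpool.machineconfiguration.openshift.io/gpu created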

Cloud providers also offer different compute and storage instance types to meet customer requirements. For example, etcd and other I/O-intensive workloads require a low-latency storage solution for optimal performance. AWS provides the io2 EBS volume type to support these workloads, whereas the gp2 EBS volume type suits workloads that do not require intensive I/O.
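
On an AWS IPI cluster, for example, the EBS volume type for newly provisioned nodes is defined in the machine set provider specification. The following command is a sketch that displays the configured root volume type; the machine set name is illustrative. A machine set that specifies io2 in this field provisions nodes whose root volumes use the io2 volume type.

[user@host ~]$ oc get machineset/ocp4-abc12-worker-us-east-1a -n openshift-machine-api \
  -o jsonpath='{.spec.template.spec.providerSpec.value.blockDevices[0].ebs.volumeType}'
...output omitted...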

References

For more information about the control plane architecture, refer to the Control Plane Architecture chapter in the Red Hat OpenShift Container Platform 4.14 Architecture documentation at https://access.redhat.com/documentation/en-us/openshift_container_platform/4.14/html-single/architecture/index#control-plane

Creating and Adding Additional Worker Nodes to the Cluster

Infrastructure Nodes in OpenShift 4

For more information about adding compute nodes to a cluster, refer to the About Adding RHEL Compute Nodes to a Cluster section in the Adding RHEL Compute Machines to an OpenShift Container Platform Cluster chapter in the Red Hat OpenShift Container Platform 4.14 Machine Management documentation at https://access.redhat.com/documentation/en-us/openshift_container_platform/4.14/html-single/machine_management/index#adding-rhel-compute

For more information about the installation prerequisites, refer to the Prerequisites section in the Deploying Installer-provisioned Clusters on Bare Metal chapter in the Red Hat OpenShift Container Platform 4.14 Installing documentation at https://access.redhat.com/documentation/en-us/openshift_container_platform/4.14/html-single/installing/index#ipi-install-prerequisites

For more details about ROSA and Microsoft Azure Red Hat OpenShift, refer to the DO120: Introduction to Red Hat OpenShift Service on AWS (ROSA) and DO121: Introduction to Microsoft Azure Red Hat OpenShift training courses respectively.

Assign Pods to Nodes
