Configure a workload to run on dedicated nodes, and prevent other workloads from running on those nodes.
Outcomes
Create deployments with node selectors to specify the node pool for a workload.
Use taints to prevent a node from running workloads, and tolerations to allow specific workloads onto tainted nodes.
As the student user on the workstation machine, use the lab command to prepare your environment for this exercise, and to ensure that all required resources are available.
[student@workstation ~]$ lab start scheduling-selector
Instructions
Your company has a cluster with three compute nodes in three different racks, as in the following diagram:
The nodes include a label with the name of the rack that they are part of.
The worker01 and worker02 nodes have a standard CPU and include the cpu=standard label.
The worker03 node has a fast CPU and includes the cpu=fast label.
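For reference, an administrator applies labels such as these with the oc label command. The following command is only an illustrative sketch; in this exercise, the labels are already set on the nodes:
[student@workstation ~]$ oc label node worker03 rack=3 cpu=fast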
This guided exercise includes the following scenarios:
In the first scenario, you create a deployment with two replicas to verify the default OpenShift scheduler behavior, which distributes the deployment pods between the three available compute nodes.
Then, your company requires you to create another deployment with two replicas that needs a fast CPU.
Thus, you must use a node selector to ensure that the OpenShift scheduler places the deployment on the fast CPU compute node.
Finally, you drain the worker03 node from the cluster to simulate a failure in the rack, and verify the status for the deployments.
In the second scenario, you create a project node selector for the standard CPU nodes. Then, you create a deployment with two replicas and verify that OpenShift applies the project node selector to the deployment.
In this scenario, you also create another deployment with two replicas and with a node selector for the compute node in the first rack. This deployment explores how project and pod node selectors interact with each other. Thus, you verify that pods that the deployment creates have both the project and pod node selectors.
In the third scenario, you taint the worker01 and worker03 nodes to reserve some workload capacity, because your cluster has limited capacity.
The taint is type=mission-critical with the NoSchedule option, which ensures that previous node workloads continue working but that new workloads need a toleration to be placed on those nodes.
In this scenario, you first create a deployment with a node selector for the cpu=fast label, to meet your application's requirements.
Because your deployment does not include a toleration for the taint, the deployment is not available.
Then, you create another deployment that includes both the node selector and the toleration, and verify that the deployment is available.
Create all the deployments as the developer user.
As the admin user, connect to the OpenShift cluster and verify the labels for the nodes in the cluster.
Connect to the OpenShift cluster as the admin user with redhatocp as the password.
[student@workstation ~]$ oc login -u admin -p redhatocp \
https://api.ocp4.example.com:6443
Login successful.
...output omitted...
List the labels for the compute nodes. Verify the following node labels:
The rack=1 and cpu=standard labels for the worker01 node
The rack=2 and cpu=standard labels for the worker02 node
The rack=3 and cpu=fast labels for the worker03 node
[student@workstation ~]$ oc get no -L rack,cpu
NAME       STATUS   ROLES                  AGE    VERSION           RACK   CPU
master01   Ready    control-plane,master   140d   v1.25.7+eab9cc9
master02   Ready    control-plane,master   140d   v1.25.7+eab9cc9
master03   Ready    control-plane,master   140d   v1.25.7+eab9cc9
worker01   Ready    worker                 37d    v1.25.7+eab9cc9   1      standard
worker02   Ready    worker                 37d    v1.25.7+eab9cc9   2      standard
worker03   Ready    worker                 37d    v1.25.7+eab9cc9   3      fast
Create a deployment called myapp with two replicas.
Verify the default OpenShift scheduler behavior, which distributes the deployment pods between the three available compute nodes.
Connect to the OpenShift cluster as the developer user with developer as the password, and verify that OpenShift uses the scheduling-selector project.
[student@workstation ~]$ oc login -u developer -p developer
...output omitted...
Using project "scheduling-selector".
Change to the ~/DO380/labs/scheduling-selector directory.
[student@workstation ~]$ cd ~/DO380/labs/scheduling-selector/
Create a deployment CR YAML file with two replicas.
You can find an incomplete example for the deployment CR in the ~/DO380/labs/scheduling-selector/deployment.yml file.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp
...output omitted...
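A complete deployment also needs a pod template with at least one container. The following is a minimal sketch; the container image is a placeholder, not necessarily the image that the course file provides:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: myapp
        # Placeholder image; use the image that your environment provides.
        image: registry.example.com/myapp:latest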
Apply the configuration for the deployment CR.
[student@workstation scheduling-selector]$ oc create -f deployment.yml
deployment.apps/myapp created
Verify that the deployment is correctly created with two replicas. Wait until all the pods are marked as ready and available. You might have to repeat this command many times.
[student@workstation scheduling-selector]$ oc get deployment
NAME READY UP-TO-DATE AVAILABLE AGE
myapp 2/2 2 2 43s
Verify that the OpenShift scheduler distributes the two pods between the available compute nodes. The pod names and the nodes that run your pods might differ in your system.
[student@workstation scheduling-selector]$ oc get pods -o wide
NAME                     READY   STATUS    ...   NODE       ...
myapp-847cfd74d9-84tqf   1/1     Running   ...   worker02   ...
myapp-847cfd74d9-npjrn   1/1     Running   ...   worker01   ...
Your company requires you to create a deployment with two replicas that needs a fast CPU.
Create a deployment called myapp-ns-fastcpu with two replicas.
The deployment must have a node selector for the cpu=fast label.
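Optionally, a user with permission to list nodes, such as the admin user, can first confirm which compute nodes carry the cpu=fast label. This check is not a required step:
[student@workstation ~]$ oc get nodes -l cpu=fast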
Create a deployment CR YAML file with two replicas and a node selector for the cpu=fast label.
You can find an incomplete example for the deployment in the ~/DO380/labs/scheduling-selector/deployment-ns-fastcpu.yml file.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-ns-fastcpu
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp-ns-fastcpu
  template:
    metadata:
      labels:
        app: myapp-ns-fastcpu
    spec:
      ...output omitted...
      nodeSelector:
        cpu: fast
Apply the configuration for the deployment CR.
[student@workstation scheduling-selector]$ oc create \
-f deployment-ns-fastcpu.yml
deployment.apps/myapp-ns-fastcpu created
Verify that the deployment is correctly created with two replicas. Wait until all the pods are marked as ready and available. You might have to repeat this command many times.
[student@workstation scheduling-selector]$ oc get deployment
NAME               READY   UP-TO-DATE   AVAILABLE   AGE
myapp              2/2     2            2           2m57s
myapp-ns-fastcpu   2/2     2            2           27s
Verify that the OpenShift scheduler places the two pods on the worker03 node, because that node has the cpu=fast label.
Pod names might differ in your system.
[student@workstation scheduling-selector]$ oc get pods -o wide \
-l app=myapp-ns-fastcpu
NAME                                READY   STATUS    ...   NODE       ...
myapp-ns-fastcpu-5577dd75d8-m6s9f   1/1     Running   ...   worker03   ...
myapp-ns-fastcpu-5577dd75d8-pcw7j   1/1     Running   ...   worker03   ...
Drain the worker03 node from the cluster to simulate a failure in the third rack.
Verify the status for the deployments.
The myapp deployment continues to be available, because the OpenShift scheduler can place its pods on the worker01 and worker02 nodes.
However, the node selector in the myapp-ns-fastcpu deployment prevents the scheduler from placing its pods on the remaining compute nodes.
Thus, the deployment is not available.
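For comparison only, and not as a step in this exercise: the oc adm cordon command marks a node as unschedulable without evicting its running pods, whereas oc adm drain both cordons the node and evicts the pods.
[student@workstation scheduling-selector]$ oc adm cordon worker03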
Connect to the OpenShift cluster as the admin user with redhatocp as the password.
[student@workstation scheduling-selector]$ oc login -u admin -p redhatocp
...output omitted...
Drain all the pods from the worker03 node.
This action might take a few minutes. The command uses the --ignore-daemonsets option, because DaemonSet pods cannot be evicted, and the --delete-emptydir-data option to confirm that data in emptyDir volumes can be discarded.
[student@workstation scheduling-selector]$ oc adm drain worker03 \
--ignore-daemonsets --delete-emptydir-data
...output omitted...
node/worker03 drained
Verify the status for the deployments.
Although the myapp deployment is available, the myapp-ns-fastcpu deployment is not available due to the node selector.
[student@workstation scheduling-selector]$ oc get deployment
NAME               READY   UP-TO-DATE   AVAILABLE   AGE
myapp              2/2     2            2           10m
myapp-ns-fastcpu   0/2     2            0           6m4s
Verify that the scheduler cannot use the worker01 and worker02 nodes for the myapp-ns-fastcpu deployment due to the node selector.
[student@workstation scheduling-selector]$ oc get pods -o wide
NAME                                READY   STATUS    ...   NODE       ...
myapp-847cfd74d9-84tqf              1/1     Running   ...   worker02   ...
myapp-847cfd74d9-npjrn              1/1     Running   ...   worker01   ...
myapp-ns-fastcpu-5577dd75d8-b422c   0/1     Pending   ...   <none>     ...
myapp-ns-fastcpu-5577dd75d8-vdtsx   0/1     Pending   ...   <none>     ...
Review the project events.
The OpenShift scheduler does not find a node for the myapp-ns-fastcpu deployment due to the node selector.
[student@workstation scheduling-selector]$ oc get events \
--sort-by='{.lastTimestamp}'
...output omitted...
4m11s   Normal    SuccessfulCreate   replicaset/myapp-ns-fastcpu-5577dd75d8   Created pod: myapp-ns-fastcpu-5577dd75d8-vdtsx
4m11s   Normal    SuccessfulCreate   replicaset/myapp-ns-fastcpu-5577dd75d8   Created pod: myapp-ns-fastcpu-5577dd75d8-b422c
4m11s   Warning   FailedScheduling   pod/myapp-ns-fastcpu-5577dd75d8-b422c    0/6 nodes are available: 1 node(s) were unschedulable, 2 node(s) didn't match Pod's node affinity/selector, 3 node(s) had untolerated taint {node-role.kubernetes.io/master: }. preemption: 0/6 nodes are available: 6 Preemption is not helpful for scheduling.
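If the event list is long, you can optionally narrow the output to scheduling failures with a field selector. This command is an extra sketch, not a required step:
[student@workstation scheduling-selector]$ oc get events --field-selector reason=FailedScheduling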
Review the pod information. Pod names might differ in your system.
[student@workstation scheduling-selector]$ oc describe pod \
myapp-ns-fastcpu-5577dd75d8-b422c
...output omitted...
Events:
  Type     Reason            Age                    From               Message
  ----     ------            ----                   ----               -------
  Warning  FailedScheduling  2m10s (x2 over 7m12s)  default-scheduler  0/6 nodes are available: 1 node(s) were unschedulable, 2 node(s) didn't match Pod's node affinity/selector, 3 node(s) had untolerated taint {node-role.kubernetes.io/master: }. preemption: 0/6 nodes are available: 6 Preemption is not helpful for scheduling.
Mark the worker03 node as schedulable.
[student@workstation scheduling-selector]$ oc adm uncordon worker03
node/worker03 uncordoned
Create the scheduling-ns project with a project node selector for the standard CPU nodes.
As the admin user, create the project CR YAML file with a node selector for the standard CPU nodes.
You can find an incomplete example for the project in the ~/DO380/labs/scheduling-selector/project-scheduling.yml file.
apiVersion: v1
kind: Namespace
metadata:
  name: scheduling-ns
  annotations:
    openshift.io/node-selector: cpu=standard
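As an alternative sketch, not used in this exercise, a cluster administrator can set the project node selector when creating the project:
[student@workstation scheduling-selector]$ oc adm new-project scheduling-ns \
--node-selector="cpu=standard"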
Apply the configuration for the project CR.
[student@workstation scheduling-selector]$ oc create \
-f project-scheduling.yml
namespace/scheduling-ns created
Give edit permission to the developer user in the scheduling-ns project.
[student@workstation scheduling-selector]$ oc adm policy add-role-to-user \
edit developer -n scheduling-ns
clusterrole.rbac.authorization.k8s.io/edit added: "developer"
Connect to the OpenShift cluster as the developer user with developer as the password.
[student@workstation scheduling-selector]$ oc login -u developer -p developer
...output omitted...
Change to the scheduling-ns project.
[student@workstation scheduling-selector]$ oc project scheduling-ns
...output omitted...
Verify that the project includes the node selector.
[student@workstation scheduling-selector]$ oc describe project scheduling-ns
Name:            scheduling-ns
Created:         4 minutes ago
...output omitted...
Annotations:     openshift.io/node-selector=cpu=standard
                 openshift.io/sa.scc.mcs=s0:c26,c25
                 openshift.io/sa.scc.supplemental-groups=1000700000/10000
                 openshift.io/sa.scc.uid-range=1000700000/10000
Display Name:    <none>
Description:     <none>
Status:          Active
Node Selector:   cpu=standard
Quota:           <none>
Resource limits: <none>
...output omitted...
Create a deployment CR YAML file with two replicas.
You can find an incomplete example for the deployment in the ~/DO380/labs/scheduling-selector/deployment-project-ns.yml file.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: project-ns
spec:
  replicas: 2
  selector:
    matchLabels:
      app: project-ns
...output omitted...
Apply the configuration for the deployment CR.
[student@workstation scheduling-selector]$ oc create -f deployment-project-ns.yml
deployment.apps/project-ns created
Verify the status for the deployment. Wait until all the pods are marked as ready and available. You might have to repeat this command many times.
[student@workstation scheduling-selector]$ oc get deployment
NAME READY UP-TO-DATE AVAILABLE AGE
project-ns 2/2 2 2 131s
Verify that the scheduler uses the worker01 and worker02 nodes due to the project node selector.
[student@workstation scheduling-selector]$ oc get pods -o wide
NAME                          READY   STATUS    ...   NODE       ...
project-ns-79c4798c49-k8c92   1/1     Running   ...   worker01   ...
project-ns-79c4798c49-r2dn7   1/1     Running   ...   worker02   ...
Verify that the pods in the deployment include the project node selector. Pod names might differ in your system.
[student@workstation scheduling-selector]$ oc describe pod \
project-ns-79c4798c49-k8c92
Name:             project-ns-79c4798c49-k8c92
Namespace:        scheduling-ns
...output omitted...
Node-Selectors:   cpu=standard
Tolerations:      node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                  node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
...output omitted...
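You can also read the node selector directly from the pod specification with a JSONPath query. The pod name comes from the sample output and differs on your system:
[student@workstation scheduling-selector]$ oc get pod project-ns-79c4798c49-k8c92 \
-o jsonpath='{.spec.nodeSelector}{"\n"}'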
Create a deployment with two replicas and a node selector for the first rack. Verify that the deployment includes both the project and the pod node selectors.
Create a deployment CR YAML file with two replicas and a node selector for the node in the first rack.
You can find an incomplete example for the deployment in the ~/DO380/labs/scheduling-selector/deployment-project-podsel.yml file.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: project-podsel
spec:
  replicas: 2
  selector:
    matchLabels:
      app: project-podsel
  template:
    metadata:
      labels:
        app: project-podsel
    spec:
      ...output omitted...
      nodeSelector:
        rack: "1"
Apply the configuration for the deployment CR.
[student@workstation scheduling-selector]$ oc create \
-f deployment-project-podsel.yml
deployment.apps/project-podsel created
Verify the status for the deployment. Wait until all the pods are marked as ready and available. You might have to repeat this command many times.
[student@workstation scheduling-selector]$ oc get deployment
NAME READY UP-TO-DATE AVAILABLE AGE
project-ns 2/2 2 2 12m
project-podsel 2/2 2 2 72s
Verify that the OpenShift scheduler places all the pods for the project-podsel deployment on the worker01 node, because only that node has both the cpu=standard and rack=1 labels.
[student@workstation scheduling-selector]$ oc get pods -o wide \
-l app=project-podsel
NAME READY STATUS ... NODE ...
project-podsel-649f68f65d-44xxk 1/1 Running ... worker01 ...
project-podsel-649f68f65d-qvvjz 1/1 Running ... worker01 ...
Verify that the pods in the deployment include both the project and the pod node selectors. Pod names might differ in your system.
[student@workstation scheduling-selector]$ oc describe pod \
project-podsel-649f68f65d-44xxk
Name:             project-podsel-649f68f65d-44xxk
Namespace:        scheduling-ns
...output omitted...
Node-Selectors:   cpu=standard
                  rack=1
Tolerations:      node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                  node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
...output omitted...
As an OpenShift administrator, apply the type=mission-critical taint to the worker01 and worker03 nodes to reserve workload capacity.
Use the NoSchedule option to ensure that previous node workloads continue working.
Connect to the OpenShift cluster as the admin user with redhatocp as the password.
[student@workstation scheduling-selector]$ oc login -u admin -p redhatocp
...output omitted...
Apply the type=mission-critical:NoSchedule taint to the worker01 and worker03 nodes.
[student@workstation scheduling-selector]$ oc adm taint nodes worker01 \
type=mission-critical:NoSchedule
node/worker01 tainted
[student@workstation scheduling-selector]$ oc adm taint nodes worker03 \
type=mission-critical:NoSchedule
node/worker03 tainted
Verify the taints for the worker01 and worker03 nodes.
[student@workstation scheduling-selector]$ oc describe node worker01 | grep Taints
Taints: type=mission-critical:NoSchedule
[student@workstation scheduling-selector]$ oc describe node worker03 | grep Taints
Taints: type=mission-critical:NoSchedule
Create a deployment with two replicas and a node selector for the cpu=fast label.
Verify that the deployment is not available, because the only node with the cpu=fast label has a taint, and the deployment does not have a toleration for the taint.
Connect to the OpenShift cluster as the developer user with developer as the password.
[student@workstation scheduling-selector]$ oc login -u developer -p developer
...output omitted...
Create the scheduling-taint project.
[student@workstation scheduling-selector]$ oc new-project scheduling-taint
...output omitted...
Create a deployment CR YAML file with two replicas and a node selector for the cpu=fast label.
You can find an incomplete example for the deployment in the /home/student/DO380/labs/scheduling-selector/deployment-taint-fastcpu.yml file.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-taint-fastcpu
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp-taint-fastcpu
  template:
    metadata:
      labels:
        app: myapp-taint-fastcpu
    spec:
      ...output omitted...
      nodeSelector:
        cpu: fast
Apply the configuration for the deployment CR.
[student@workstation scheduling-selector]$ oc create \
-f deployment-taint-fastcpu.yml
deployment.apps/myapp-taint-fastcpu created
Verify the status for the deployment.
The myapp-taint-fastcpu deployment is not available, because the target node is tainted and the deployment does not include a toleration.
[student@workstation scheduling-selector]$ oc get deployment
NAME                  READY   UP-TO-DATE   AVAILABLE   AGE
myapp-taint-fastcpu   0/2     2            0           104s
Verify that the OpenShift scheduler cannot place the myapp-taint-fastcpu pods on a compute node.
[student@workstation scheduling-selector]$ oc get pods -o wide
NAME                                   READY   STATUS    ...   NODE     ...
myapp-taint-fastcpu-5c646fbd6b-k5r9g   0/1     Pending   ...   <none>   ...
myapp-taint-fastcpu-5c646fbd6b-tz2nr   0/1     Pending   ...   <none>   ...
Review the project events.
The OpenShift scheduler does not find a node for the myapp-taint-fastcpu deployment due to the node selector and the node taint.
[student@workstation scheduling-selector]$ oc get events \
--sort-by='{.lastTimestamp}'
...output omitted...
2m11s   Warning   FailedScheduling    pod/myapp-taint-fastcpu-5c646fbd6b-tz2nr    0/6 nodes are available: 1 node(s) didn't match Pod's node affinity/selector, 2 node(s) had untolerated taint {type: mission-critical}, 3 node(s) had untolerated taint {node-role.kubernetes.io/master: }. preemption: 0/6 nodes are available: 6 Preemption is not helpful for scheduling.
2m11s   Normal    SuccessfulCreate    replicaset/myapp-taint-fastcpu-5c646fbd6b   Created pod: myapp-taint-fastcpu-5c646fbd6b-tz2nr
2m11s   Normal    SuccessfulCreate    replicaset/myapp-taint-fastcpu-5c646fbd6b   Created pod: myapp-taint-fastcpu-5c646fbd6b-k5r9g
2m11s   Normal    ScalingReplicaSet   deployment/myapp-taint-fastcpu              Scaled up replica set myapp-taint-fastcpu-5c646fbd6b to 2
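To see at a glance which nodes carry taints, an administrator can optionally run a custom-columns query. This sketch is not part of the exercise:
[student@workstation scheduling-selector]$ oc get nodes \
-o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints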
Remove the myapp-taint-fastcpu deployment.
[student@workstation scheduling-selector]$ oc delete deployment \
myapp-taint-fastcpu
deployment.apps "myapp-taint-fastcpu" deleted
Create a deployment with the node selector for the cpu=fast label and the toleration for the taint.
Verify that the deployment is available.
Modify the YAML file for the myapp-taint-fastcpu deployment CR by adding a toleration for the type=mission-critical:NoSchedule taint.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-taint-fastcpu
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp-taint-fastcpu
  template:
    metadata:
      labels:
        app: myapp-taint-fastcpu
    spec:
      ...output omitted...
      nodeSelector:
        cpu: fast
      tolerations:
      - key: "type"
        value: "mission-critical"
        operator: "Equal"
        effect: "NoSchedule"
Apply the configuration for the deployment CR.
[student@workstation scheduling-selector]$ oc create -f \
deployment-taint-fastcpu.yml
deployment.apps/myapp-taint-fastcpu created
Verify the status for the deployment.
The myapp-taint-fastcpu deployment is available because it includes a toleration for the node taint.
[student@workstation scheduling-selector]$ oc get deployment
NAME                  READY   UP-TO-DATE   AVAILABLE   AGE
myapp-taint-fastcpu   2/2     2            2           62s
Verify that the OpenShift scheduler places the two pods for the myapp-taint-fastcpu deployment on the worker03 node, because the deployment has a toleration for the taint, and because that node has the label that the node selector specifies.
[student@workstation scheduling-selector]$ oc get pods -o wide
NAME                                   READY   STATUS    ...   NODE       ...
myapp-taint-fastcpu-86cc9d87cc-7x4n8   1/1     Running   ...   worker03   ...
myapp-taint-fastcpu-86cc9d87cc-8p4c4   1/1     Running   ...   worker03   ...
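As a variation that this exercise does not require, a toleration can use the Exists operator to match the taint key regardless of its value:
tolerations:
- key: "type"
  operator: "Exists"
  effect: "NoSchedule"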
Change to the /home/student directory.
[student@workstation scheduling-selector]$ cd