Chapter 8. Configuring Kubernetes High Availability for Virtual Machines

Abstract

Goal

Configure Kubernetes resources to implement high availability for virtual machines.

Objectives
  • Demonstrate VM load balancing with Kubernetes networking resources.

  • Configure health probes and a Watchdog device to monitor the health and responsiveness of a VM and its services.

  • Configure Kubernetes resources to help VMs fail over to another cluster node when node failure is detected.

Sections
  • Virtual Machine Load Balancing with Kubernetes Networking Resources (and Guided Exercise)

  • Configuring Health Probes for Virtual Machines (and Guided Exercise)

  • Surviving Node Failure with Virtual Machines (and Guided Exercise)

Lab
  • Configuring Kubernetes High Availability for Virtual Machines

Virtual Machine Load Balancing with Kubernetes Networking Resources

Objectives

  • Demonstrate VM load balancing with Kubernetes networking resources.

Manage Virtual Machines with a Hypervisor

A hypervisor is software that helps you to provision virtual machines (VMs) on a host computer by sharing the compute resources of the host with the virtualized environments. Hypervisors come in two types:

Bare metal hypervisor

The bare metal hypervisor runs directly on the host's hardware as a lightweight OS layer. A host that runs a bare metal hypervisor is dedicated to virtualization.

Hosted hypervisor

The hosted hypervisor executes on the host's OS, like any other software, so you can use the host for purposes besides virtualization.

High Availability with Hypervisors

High Availability (HA) is the ability to maintain continuous operations and to recover from unexpected failures in the shortest possible time.

HA in traditional hypervisors often focuses on the ability to restart or migrate VM workloads to other hosts in the cluster. This approach requires frequent manual intervention to provide application monitoring, clustering, and load balancing for workloads in a hypervisor.

Load Balancing with Hypervisors

Load balancing helps avoid traffic bottlenecks by balancing the distribution of requests across the network. The traffic is distributed across available workloads that are ready for incoming requests.
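
The distribution described above can be sketched as a small model. This is a minimal illustration only: the backend names are hypothetical, and round-robin is just one possible policy, chosen here because it is easy to follow. Real load balancers might use other policies.

```python
from itertools import cycle


def distribute(requests, backends):
    """Assign each request to the next ready backend in round-robin order.

    `backends` maps a backend name to True when it is ready for requests.
    """
    ready = [name for name, is_ready in backends.items() if is_ready]
    if not ready:
        raise RuntimeError("no backend is ready for incoming requests")
    rotation = cycle(ready)
    return {request: next(rotation) for request in requests}


# Hypothetical workloads: vm-b is not ready, so it receives no traffic.
backends = {"vm-a": True, "vm-b": False, "vm-c": True}
assignments = distribute(["req-1", "req-2", "req-3", "req-4"], backends)
print(assignments)
```

Only the workloads that report ready take part in the rotation, which is the same idea that readiness checks apply to a load-balanced service.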

By default, a stand-alone hypervisor does not include the software technology to load balance VMs. However, depending on the hypervisor provider, you can implement a load balancer to work with the hypervisor. By implementing hypervisor agents, you can also add node management actions, such as fencing and restarting. You must configure the load balancer, readiness and liveness probes, and watchdog functions for health monitoring, in addition to configuring the hypervisor.

Manage Virtual Machines with Kubernetes

KubeVirt is a Kubernetes add-on that helps you create and manage VMs alongside your container workloads. KubeVirt delivers container-native virtualization by running each VM as a Kernel-based Virtual Machine (KVM) process inside a Kubernetes pod. KubeVirt also provides features that are associated with traditional hypervisors, such as live migration and active resource balancing.

Kubernetes Features for Virtual Machines

Kubernetes provides a set of features for VM management that traditional stand-alone hypervisors do not offer by default. These features include probes to monitor the health of your VMs and applications, load balancing of network traffic for high availability, and sticky sessions for stateful application traffic. Additional features, such as VM run strategies and watchdog devices, are available only when you configure the KubeVirt operator in a Kubernetes cluster.

You can implement KubeVirt in your Red Hat OpenShift Container Platform (RHOCP) cluster by installing the Red Hat OpenShift Virtualization operator.

The following list describes some Kubernetes and KubeVirt features:

Load balancing

You can configure a load balancer service to enable external access to an OpenShift cluster. The service is allocated a unique IP address from a configured pool. That address is a single edge IP, which can be a virtual IP (VIP), but a single machine still receives the traffic for the initial load balancing.

However, load balancer services require the use of network features that are not available in all environments. For example, cloud providers typically provide their own load balancer services. If you run a Kubernetes cluster on a cloud provider, then controllers in Kubernetes use the cloud provider's APIs to configure the required cloud provider resources for a load balancing service.

In environments where managed load balancer services are not available, such as bare metal clusters or clusters that run on hypervisors, you must configure a load balancer component according to the specifics of your network. For those environments, you can use the MetalLB operator, which provides a load balancing service.
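
As a rough sketch, a load balancer service for a VM is an ordinary Kubernetes Service of type LoadBalancer whose selector matches a label on the VM. The following example builds such a manifest as a Python dictionary and prints it as JSON. The service name, the label, and the port numbers are hypothetical. The sketch also assumes that the same label is set on the VM template so that it reaches the VM's pod.

```python
import json


def load_balancer_service(name, selector, port, target_port):
    """Build a Service manifest of type LoadBalancer as a plain dictionary."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": "LoadBalancer",
            # The selector must match a label that the VM's pod carries.
            "selector": selector,
            "ports": [
                {"protocol": "TCP", "port": port, "targetPort": target_port}
            ],
        },
    }


# Hypothetical values for illustration only.
service = load_balancer_service("web-lb", {"app": "web"}, 80, 8080)
print(json.dumps(service, indent=2))
```

You could save the printed JSON to a file and apply it with your usual tooling. The external IP address comes from the cloud provider or from a pool that MetalLB manages, not from the manifest itself.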

Readiness and Liveness probes

Developers can configure probes for monitoring their applications. A probe periodically monitors application health. You can use probes for applications that run in pods or VMs.

Readiness probes determine whether the application is ready to serve requests. If the readiness probe fails, then Kubernetes prevents client traffic from reaching the application by removing the VM's IP address from the endpoints of the service resource.

Liveness probes determine whether the application is in a healthy state. If the liveness probe detects an unhealthy state, then OpenShift Virtualization deletes the virtual machine instance (VMI) resource and redeploys a new instance.
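
The following sketch shows where readiness and liveness probes might sit in a VirtualMachine manifest, again as a Python dictionary. The port, the paths, and the timing values are assumptions for illustration; the field names follow the Kubernetes probe schema, which KubeVirt reuses under the VM template's spec.

```python
import json


def http_probe(port, path, initial_delay=120, period=10, failure_threshold=3):
    """Build an HTTP GET probe definition with common timing fields."""
    return {
        "httpGet": {"port": port, "path": path},
        # Give the guest OS time to boot before the first check.
        "initialDelaySeconds": initial_delay,
        "periodSeconds": period,
        # Consecutive failures that are tolerated before the probe fails.
        "failureThreshold": failure_threshold,
    }


# Hypothetical endpoints: /ready and /healthz on port 8080 inside the VM.
vm_fragment = {
    "spec": {
        "template": {
            "spec": {
                "readinessProbe": http_probe(8080, "/ready"),
                "livenessProbe": http_probe(8080, "/healthz", failure_threshold=5),
            }
        }
    }
}
print(json.dumps(vm_fragment, indent=2))
```

A long initial delay is a deliberate choice in this sketch, because a VM usually takes longer to boot than a container takes to start.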

Sticky sessions

Sticky sessions enable stateful application traffic by ensuring that all traffic from a given client reaches the same endpoint. However, if the endpoint pod terminates, then the session state can be lost.
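
In a Kubernetes Service, sticky sessions are enabled by setting the sessionAffinity field to ClientIP. The following toy model illustrates the behavior and its limitation. It is not how Kubernetes implements affinity, and the class, endpoint names, and client addresses are hypothetical.

```python
class StickyBalancer:
    """Toy model of client-IP session affinity."""

    def __init__(self, endpoints):
        self.endpoints = list(endpoints)
        self.affinity = {}  # client IP -> endpoint chosen for that client
        self._next = 0

    def route(self, client_ip):
        if not self.endpoints:
            raise RuntimeError("no endpoints available")
        endpoint = self.affinity.get(client_ip)
        if endpoint not in self.endpoints:
            # New client, or its previous endpoint is gone: pick another one.
            endpoint = self.endpoints[self._next % len(self.endpoints)]
            self._next += 1
            self.affinity[client_ip] = endpoint
        return endpoint

    def remove(self, endpoint):
        """Simulate an endpoint that terminates."""
        self.endpoints.remove(endpoint)


balancer = StickyBalancer(["vm-a", "vm-b"])
first = balancer.route("10.0.0.5")
second = balancer.route("10.0.0.5")   # same client, same endpoint
other = balancer.route("10.0.0.9")    # different client
balancer.remove(first)                # the sticky endpoint terminates
after_failure = balancer.route("10.0.0.5")
print(first, second, other, after_failure)
```

After the endpoint is removed, the client is routed to a different endpoint, so any state that was held only by the original endpoint is no longer available to that client.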

Watchdog devices

You can configure a watchdog device inside a VM to verify the state of the OS and act according to the run strategy. Watchdogs monitor the OS only, and do not detect application failures.
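
As a hedged sketch, a watchdog device is declared under the devices section of the VM template. The following helper adds an i6300esb watchdog to a VM manifest that is held as a Python dictionary. The helper name, the device name, and the sample VM are hypothetical; the example assumes the action values poweroff, reset, and shutdown.

```python
import copy

WATCHDOG_ACTIONS = {"poweroff", "reset", "shutdown"}


def add_watchdog(vm, name="watchdog0", action="poweroff"):
    """Return a copy of the VM manifest with a watchdog device added."""
    if action not in WATCHDOG_ACTIONS:
        raise ValueError(f"unsupported watchdog action: {action}")
    updated = copy.deepcopy(vm)
    devices = (
        updated.setdefault("spec", {})
        .setdefault("template", {})
        .setdefault("spec", {})
        .setdefault("domain", {})
        .setdefault("devices", {})
    )
    # The action runs when the guest OS stops responding to the device.
    devices["watchdog"] = {"name": name, "i6300esb": {"action": action}}
    return updated


# Hypothetical minimal VM manifest.
vm = {"metadata": {"name": "demo-vm"}, "spec": {"runStrategy": "Always"}}
patched = add_watchdog(vm)
print(patched["spec"]["template"]["spec"]["domain"]["devices"])
```

The device alone is not sufficient: the guest OS must also run a watchdog service that regularly notifies the device, so that a hung OS is detected when the notifications stop.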

Machine health checks

Machine health checks automatically remediate an unhealthy machine, which is the host for a node, if the machine exists in a particular machine pool. You can use machine health checks to monitor the health of a host by creating a resource that defines the condition to verify, the label for the set of hosts to monitor, and the remediation process to use.
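
The three parts of such a resource (the condition to verify, the label that selects the hosts, and the remediation limits) might look like the following sketch, written as a Python dictionary. The resource name, the timeout values, and the maxUnhealthy percentage are assumptions for illustration, and the selector assumes that worker machines carry the machine.openshift.io/cluster-api-machine-role label.

```python
import json


def machine_health_check(name, role, timeout="300s", max_unhealthy="40%"):
    """Build a MachineHealthCheck manifest as a plain dictionary."""
    return {
        "apiVersion": "machine.openshift.io/v1beta1",
        "kind": "MachineHealthCheck",
        "metadata": {"name": name, "namespace": "openshift-machine-api"},
        "spec": {
            # Label for the set of machines to monitor.
            "selector": {
                "matchLabels": {
                    "machine.openshift.io/cluster-api-machine-role": role
                }
            },
            # Conditions that mark a node as unhealthy after the timeout.
            "unhealthyConditions": [
                {"type": "Ready", "status": "False", "timeout": timeout},
                {"type": "Ready", "status": "Unknown", "timeout": timeout},
            ],
            # Remediation stops when more machines than this are unhealthy.
            "maxUnhealthy": max_unhealthy,
        },
    }


# Hypothetical resource name for illustration only.
check = machine_health_check("worker-health", "worker")
print(json.dumps(check, indent=2))
```

The maxUnhealthy limit is a safety measure: it prevents the controller from removing many machines at the same time when the real problem is cluster-wide, such as a network outage.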

Live migration

Live migration moves a running VM to another node, so that the VM is not interrupted if its node is placed into maintenance or drained.

VM run strategies

A VM's run strategy determines the behavior of a VMI based on a series of conditions, which are defined in the .spec.running or the .spec.runStrategy parameters. VM run strategies are explained later in this course.
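
As a preview, the following simplified model shows how a run strategy might decide whether a stopped VMI is started again. It assumes the four strategy names Always, RerunOnFailure, Manual, and Halted, and it reduces the conditions to the final phase of the VMI. The function name is hypothetical, and the model omits details that the later section covers.

```python
def restarts_after_stop(run_strategy, vmi_phase):
    """Return True if a new VMI is expected after the previous one stops.

    `vmi_phase` is the final phase of the stopped VMI: "Succeeded" for a
    clean shutdown from inside the guest, or "Failed" for an error.
    """
    if vmi_phase not in {"Succeeded", "Failed"}:
        raise ValueError(f"unexpected final phase: {vmi_phase}")
    if run_strategy == "Always":
        return True
    if run_strategy == "RerunOnFailure":
        return vmi_phase == "Failed"
    if run_strategy in {"Manual", "Halted"}:
        return False
    raise ValueError(f"unknown run strategy: {run_strategy}")


for strategy in ("Always", "RerunOnFailure", "Manual", "Halted"):
    print(strategy,
          restarts_after_stop(strategy, "Succeeded"),
          restarts_after_stop(strategy, "Failed"))
```

In this model, setting .spec.running to true behaves like the Always strategy, and setting it to false behaves like Halted.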

Fencing nodes

Fencing is a remediation method that reboots nodes and deletes their Machine custom resources to solve problems with automatically provisioned nodes.

When the MachineHealthCheck controller detects that a node is in the NotReady state, it removes the associated Machine resource, and the node is deleted from the host pool.

References

For more information about route configuration, refer to the Route Configuration section in the Red Hat OpenShift Container Platform 4.14 Networking documentation at https://access.redhat.com/documentation/en-us/openshift_container_platform/4.14/html-single/networking/index#route-configuration

For more information about live migration, refer to the Live Migration chapter in the Red Hat OpenShift Container Platform 4.14 Virtualization documentation at https://access.redhat.com/documentation/en-us/openshift_container_platform/4.14/html-single/virtualization/index#live-migration

For more information about the MetalLB operator, refer to the Load Balancing with MetalLB chapter in the Red Hat OpenShift Container Platform 4.14 Networking documentation at https://access.redhat.com/documentation/en-us/openshift_container_platform/4.14/html-single/networking/index#load-balancing-with-metallb

For more information about VM run strategies, refer to the Run Strategies section in the Red Hat OpenShift Container Platform 4.14 Virtualization documentation at https://access.redhat.com/documentation/en-us/openshift_container_platform/4.14/html-single/virtualization/index#run-strategies

Revision: do316-4.14-d8a6b80