DO316 - ch08s03

Configuring Health Probes for Virtual Machines

Objectives

Configure health probes and a Watchdog device to monitor the health and responsiveness of a VM and its services.

Configure the Run Strategy

Virtual machine instances (VMIs) have settings that control the running state of the virtual machine (VM). A VM's run strategy determines how the VMI behaves under a series of conditions. You define the run strategy with either the .spec.running or the .spec.runStrategy parameter.

The .spec.running Boolean specifies whether the VM is running. When the Boolean is set to true, Red Hat OpenShift Virtualization ensures that the VM is always running. OpenShift Virtualization restarts the VM if it fails. OpenShift Virtualization also restarts the VM when you stop it by gracefully shutting down the operating system or when you delete the VMI resource.

When you set the .spec.running Boolean to false, OpenShift Virtualization ensures that the VM is not running.

OpenShift Virtualization automatically updates the Boolean when you control the VM by using the virtctl command or by using the web console. If you run the virtctl stop command or click Actions → Stop in the OpenShift web console, then OpenShift Virtualization switches the .spec.running Boolean to false. If you run the virtctl start command or click Actions → Start in the OpenShift web console, then OpenShift Virtualization switches the .spec.running Boolean to true.

The .spec.runStrategy parameter provides more control over the VM status than the .spec.running Boolean. For example, when the .spec.running Boolean is true, you cannot stop a VM by gracefully shutting down its operating system, because OpenShift Virtualization detects that the VM is not running and then restarts it.

The .spec.runStrategy parameter accepts any of the following values:

- RerunOnFailure: Restarts the VM when it fails. If you gracefully shut down the operating system from inside the VM, then OpenShift Virtualization does not restart the VM.
- Always: Ensures that the VM is always running. Using this value has the same effect as setting the .spec.running Boolean to true.
- Halted: Ensures that the VM is not running. Using this value has the same effect as setting the .spec.running Boolean to false.
- Manual: Performs no automatic action. If the VM fails, then OpenShift Virtualization does not restart it. You must manage the VM by using the virtctl start or virtctl stop commands, or by using the OpenShift web console.

The .spec.running and .spec.runStrategy parameters are mutually exclusive. If you specify both parameters, then OpenShift Virtualization returns an error.
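
As an illustration of the equivalence described above, the following fragments sketch two alternative ways to request an always-running VM. The VM name is only illustrative, and most of the resource is left out. Use one form or the other, never both in the same resource.

```yaml
# Alternative 1: the run strategy form.
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: mariadb-server      # illustrative name
spec:
  runStrategy: Always       # same effect as "running: true"
  # remaining fields (template and so on) omitted
---
# Alternative 2: the Boolean form.
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: mariadb-server      # illustrative name
spec:
  running: true             # do not combine with runStrategy
  # remaining fields (template and so on) omitted
```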

Update the Run Strategy Parameters

To set the .spec.running or .spec.runStrategy parameters from the command line, edit the VM resource by using the oc edit command:

[user@host ~]$ oc edit vm mariadb-server
# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this
# file will be reopened with the relevant failures.
#
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
...output omitted...
spec:
  dataVolumeTemplates:
  ...output omitted...
  runStrategy: RerunOnFailure
  template:
    metadata:
...output omitted...

To set the .spec.running or .spec.runStrategy parameters from the Red Hat OpenShift web console, navigate to Virtualization → VirtualMachines.

Select the VM to edit, and navigate to the YAML tab.

Then, use the YAML editor to set the .spec.running or the .spec.runStrategy parameter, and click Save.

Configure Health Probes

Applications might become unreliable for different reasons:

Temporary connection loss
Configuration errors
Application errors

Developers can configure probes for monitoring their applications. A probe is a periodic check that monitors the health of an application. You can use probes for applications that run in pods or VMs.

For monitoring applications that run inside VMs, Kubernetes provides two types of probes:

Readiness probes

Readiness probes determine whether the application is ready to serve requests. If the readiness probe fails, then Kubernetes prevents client traffic from reaching the application by removing the VM's IP address from the service resource.

Readiness probes help detect temporary issues that might affect your applications. For example, the application might be temporarily unavailable when it starts because it must establish initial network connections, load files in a cache, or perform initial tasks that take time to complete. The application might also periodically have to run long batch jobs, and become temporarily unavailable to clients.

Kubernetes continues to run the probe even after the application fails. When the application is available again, Kubernetes adds back the VM's IP address to the service so that the application can receive client traffic.

Liveness probes

Liveness probes determine whether the application is in a healthy state. If the liveness probe detects an unhealthy state, then OpenShift Virtualization deletes the VMI resource and redeploys a new instance.

Liveness probes help detect unresponsive applications that require a complete restart of the VM to recover.
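
The readiness behavior described above depends on a service that selects the VM. The following fragment is a minimal sketch of such a service. The service name, the app label, and the port numbers are assumptions chosen for illustration. The selector must match a label that you set under .spec.template.metadata.labels in the VM resource.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: www1                # hypothetical service name
spec:
  selector:
    app: www1               # hypothetical label, must also be set in the VM template
  ports:
  - protocol: TCP
    port: 80                # port that clients connect to (assumed)
    targetPort: 8080        # port that the application listens on inside the VM (assumed)
```

While the readiness probe fails, Kubernetes keeps the service itself in place and only stops routing traffic to the VM.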

Kubernetes provides the following methods for monitoring your application with probes:

HTTP GET

Some web applications provide a dedicated HTTP API endpoint to expose their status. Use the HTTP GET method for these applications.

The HTTP GET method sends GET requests to check the health of an application. The check is successful if the HTTP response code is in the 200-399 range.

TCP socket

Some applications, such as database servers, file servers, or application servers, provide their service through TCP ports instead of using HTTP. When using a TCP socket check, Kubernetes attempts to establish a TCP connection to the application's TCP port. If the connection is successful, then Kubernetes considers that the application is working.

Guest agent ping

The QEMU guest agent is a daemon that runs on VMs and provides information about the virtual machine, users, file systems, and secondary networks. The guest agent ping probe uses the guest-ping command to determine whether the QEMU guest agent is running on the virtual machine. The guest agent ping probe is a Technology Preview feature only and is outside the scope of this course.

Add Probes to Virtual Machines

You can edit a VM resource to configure health probes. Use the oc edit command to add readinessProbe and livenessProbe sections to the VM resource from the command line. The following example configures a readiness probe for the www1 VM:

[user@host ~]$ oc edit vm www1
# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this
# file will be reopened with the relevant failures.
#
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
...output omitted...
spec:
  ...output omitted...
  template:
    metadata:
    ...output omitted...
    spec:
      domain:
        ...output omitted...
      readinessProbe:
        httpGet: 
          path: /health
          port: 8080
        initialDelaySeconds: 120 
        periodSeconds: 20  
        timeoutSeconds: 10 
        failureThreshold: 3 
        successThreshold: 3 
...output omitted...

- `httpGet`: Provides the parameters for the HTTP request. The `path` parameter provides the URL path to the probe endpoint.
- `initialDelaySeconds`: Specifies how long Kubernetes waits after the VM starts before probing the application. By default, Kubernetes immediately starts probing the VM.
- `periodSeconds`: Specifies the frequency of the check. By default, Kubernetes checks every 10 seconds.
- `timeoutSeconds`: Determines how long to wait for the probe to finish. If the probe exceeds this time, then Kubernetes assumes that it fails. By default, Kubernetes waits one second for the check to complete.
- `failureThreshold`: Specifies the number of consecutive failures for Kubernetes to consider that the probe fails. By default, Kubernetes tries three times.
- `successThreshold`: Specifies the minimum consecutive successes after a probe fails for Kubernetes to consider that the probe succeeds. By default, Kubernetes sets the probe as successful after one successful check.
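
To show how these parameters combine, the following fragment repeats the readiness probe from the preceding example with rough timing estimates in the comments. The estimates are approximations that ignore scheduling delays, not guaranteed values.

```yaml
readinessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 120  # first check about 120 s after the VM starts
  periodSeconds: 20         # then one check every 20 s
  timeoutSeconds: 10        # no response within 10 s counts as a failed check
  failureThreshold: 3       # 3 failed checks in a row: the probe fails roughly
                            # 40 to 70 s after the application stops responding
  successThreshold: 3       # 3 successful checks in a row: the probe succeeds
                            # roughly 40 to 60 s after the application recovers
```

Shorter periods and lower thresholds detect problems faster, at the cost of more probe traffic and a higher chance of reacting to a brief, harmless delay.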

The following example configures a liveness probe for the mariadb-server VM:

[user@host ~]$ oc edit vm mariadb-server
# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this
# file will be reopened with the relevant failures.
#
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
...output omitted...
spec:
  ...output omitted...
  template:
    metadata:
    ...output omitted...
    spec:
      domain:
        ...output omitted...
      livenessProbe:
        tcpSocket: 
          port: 3306
        initialDelaySeconds: 120
        periodSeconds: 20
...output omitted...

The tcpSocket section provides the parameters of the TCP socket. The port parameter provides the TCP socket port to test.

To add health probes from the Red Hat OpenShift web console, navigate to Virtualization → VirtualMachines. Select the VM to edit, and navigate to the YAML tab. Then, use the YAML editor to declare the probes, and click Save.

Configure a Watchdog Device

The watchdog feature detects and restarts unresponsive operating systems. The feature relies on two components:

- A hardware component, which by default sets a 30-second timer. You can adjust this timer if needed. When the timer expires, the component triggers a system restart. On bare-metal machines, a chipset provides the feature. For VMs, OpenShift Virtualization emulates the chipset.
- A software component, which the operating system runs to reset the hardware timer regularly and prevent it from expiring. If the operating system hangs, then the software component also hangs and cannot refresh the timer. Then, the timer expires and the system restarts.

Watchdogs monitor only operating systems, and do not detect application failures. For example, a watchdog does not trigger if the application hangs but the operating system is still responding.

Do not use watchdogs for applications that you can monitor by using liveness probes. In an unresponsive OS, all the running applications become unresponsive. Thus, a liveness probe indirectly detects an unresponsive OS when all the applications become unresponsive. Use watchdog monitoring for applications only when you have no other way to test your application.

Activate Watchdog Monitoring

To activate watchdog monitoring in your VM, add the emulated watchdog device to your VM resource. Use the oc edit command to add a watchdog entry to the devices section of the VM resource, as follows:

[user@host ~]$ oc edit vm mariadb-server
# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this
# file will be reopened with the relevant failures.
#
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
...output omitted...
spec:
  ...output omitted...
  template:
    metadata:
    ...output omitted...
    spec:
      domain:
        cpu:
          ...output omitted...
        devices:
          disks:
          - bootOrder: 1
            disk:
              bus: virtio
            name: mariadb-server
          interfaces:
          - macAddress: "02:48:09:00:00:00"
            masquerade: {}
            name: default
          networkInterfaceMultiqueue: true
          rng: {}
          watchdog:
            i6300esb: 
              action: poweroff 
            name: mywatchdog
...output omitted...

- `i6300esb`: The watchdog model. OpenShift Virtualization can emulate only the Intel 6300ESB chipset.
- `action`: The action to perform when the watchdog triggers. The `poweroff` action stops the VM, and then OpenShift Virtualization restarts the VM according to the `.spec.running` or the `.spec.runStrategy` parameter. Other options for this parameter include the `reset` or `shutdown` actions. The references section provides more details about the watchdog actions.
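
As a variation on the preceding example, the following fragment sketches a watchdog device that uses the reset action, which reboots the VM in place instead of powering it off. Only the watchdog-related fields are shown, and the device name is arbitrary.

```yaml
spec:
  template:
    spec:
      domain:
        devices:
          watchdog:
            name: mywatchdog    # arbitrary device name
            i6300esb:
              action: reset     # alternatives: poweroff, shutdown
```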

To activate watchdog monitoring from the Red Hat OpenShift web console, navigate to Virtualization → VirtualMachines. Select the VM to edit, and navigate to the YAML tab. Then, use the YAML editor to create the watchdog device, and click Save.

Watchdog monitoring works only for operating systems that support the Intel 6300ESB chipset. Red Hat Enterprise Linux automatically loads the kernel module that supports the device.

References

watchdog(8) man page

For more information about run strategies, refer to the About Run Strategies for Virtual Machines section in the Red Hat OpenShift Container Platform 4.14 Virtualization guide at https://access.redhat.com/documentation/en-us/openshift_container_platform/4.14/html-single/virtualization/index#virt-about-runstrategies-vms_virt-create-vms

For more information about OpenShift Virtualization probes, refer to the Monitoring Virtual Machine Health section in the Red Hat OpenShift Container Platform 4.14 Virtualization guide at https://access.redhat.com/documentation/en-us/openshift_container_platform/4.14/html-single/virtualization/index#virt-monitoring-vm-health

For more information about Kubernetes probes, refer to the Monitoring Application Health by Using Health Checks chapter in the Red Hat OpenShift Container Platform 4.14 Building Applications guide at https://access.redhat.com/documentation/en-us/openshift_container_platform/4.14/html-single/building_applications/index#application-health

For more information about configuring a watchdog device, refer to the Defining a Watchdog section in the Red Hat OpenShift Container Platform 4.14 Virtualization guide at https://access.redhat.com/documentation/en-us/openshift_container_platform/4.14/html-single/virtualization/index#watchdog_virt-monitoring-vm-health

Revision: do316-4.14-d8a6b80