Guided Exercise: Alerts and Notifications

Configure and silence alerts.

Outcomes

  • Configure OpenShift to send email notifications.

  • Review an alert notification email.

  • Silence a firing alert.

As the student user on the workstation machine, use the lab command to prepare your environment for this exercise.

[student@workstation ~]$ lab start monitoring-alerts

Instructions

In this guided exercise, you extract the Alertmanager secret and configure the values to send notifications about the PersistentVolumeUsageNearFull alerts to the ocp-admins@example.com email address.

Then, you deploy an application that triggers the PersistentVolumeUsageNearFull alert and wait for the notification email to be delivered to the ocp-admins@example.com address.

Finally, you silence the alert for a set period, so that Alertmanager stops sending new notifications to the email address while you troubleshoot the problem.

  1. List and extract the secret that contains the Alertmanager configuration.

    1. Log in to the cluster as the admin user.

      [student@workstation ~]$ oc login -u admin -p redhatocp \
        https://api.ocp4.example.com:6443
      Login successful.
      
      ...output omitted...
    2. Change to the ~/DO380/labs/monitoring-alerts directory.

      [student@workstation ~]$ cd ~/DO380/labs/monitoring-alerts
    3. List the alertmanager-main secret in the openshift-monitoring namespace.

      [student@workstation monitoring-alerts]$ oc get secret/alertmanager-main \
        -n openshift-monitoring
      NAME                TYPE     DATA   AGE
      alertmanager-main   Opaque   1      7d
    4. Extract the existing alertmanager-main secret from the openshift-monitoring namespace to the current directory.

      [student@workstation monitoring-alerts]$ oc extract secret/alertmanager-main \
        -n openshift-monitoring --to ./ --confirm
      alertmanager.yaml
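
The extracted alertmanager.yaml file is plain text because the oc extract command decodes the secret; inside the Secret resource, each value under data is stored base64-encoded. The following Python sketch illustrates that decoding step. The sample_secret document is a hypothetical, heavily shortened stand-in for the real secret, and decode_secret_data is an illustrative helper name, not part of any OpenShift tool.

```python
import base64
import json

# Hypothetical, shortened Secret in the JSON shape that the Kubernetes API
# uses. The real alertmanager-main secret holds a much longer configuration.
sample_config = "global:\n  resolve_timeout: 5m\n"
sample_secret = json.dumps({
    "kind": "Secret",
    "metadata": {"name": "alertmanager-main", "namespace": "openshift-monitoring"},
    "type": "Opaque",
    "data": {
        "alertmanager.yaml": base64.b64encode(sample_config.encode()).decode(),
    },
})


def decode_secret_data(secret_json: str) -> dict[str, str]:
    """Return a mapping of key name to decoded text for every data entry."""
    secret = json.loads(secret_json)
    return {
        key: base64.b64decode(value).decode("utf-8")
        for key, value in secret.get("data", {}).items()
    }


decoded = decode_secret_data(sample_secret)
for name, text in decoded.items():
    print(f"{name}:")
    print(text, end="")
```

Each key under data becomes one file name, which is why the exercise output lists a single alertmanager.yaml file for a secret whose DATA count is 1.
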
  2. Modify the Alertmanager configuration to send notifications about the persistent volume alerts to the ocp-admins@example.com email address.

    1. The default alertmanager-main secret contains many unnecessary quotation marks. Remove the quotation marks by using the sed command to improve readability.

      [student@workstation monitoring-alerts]$ sed -i -f script.sed alertmanager.yaml

      Important

      Removing the extraneous quotation marks is optional, but it improves readability. Most string values in a YAML file do not need quotation marks; one exception is the word null, which must be quoted to be treated as a string.

      The previous sed command removes all quotation marks, including any that might be required.

    2. Edit the alertmanager.yaml file and add the global SMTP settings.

      global:
        resolve_timeout: 5m
        smtp_from: alerts@ocp4.example.com
        smtp_smarthost: '192.168.50.254:25'
        smtp_hello: localhost
        smtp_auth_username: smtp_training
        smtp_auth_password: Red_H4T@!
        smtp_require_tls: false
      inhibit_rules:
        ...output omitted...
    3. Add the email receiver to the alertmanager.yaml file.

      ...output omitted...
      receivers:
      - name: Default
      - name: Watchdog
      - name: Critical
      - name: 'null'
      - name: email
        email_configs:
        - to: ocp-admins@example.com
      route:
        ...output omitted...
    4. Configure the group_interval and repeat_interval settings so that the alert email notifications are sent more frequently.

      ...output omitted...
      route:
        group_by:
        - namespace
        group_interval: 2m
        group_wait: 30s
        receiver: Default
        repeat_interval: 1m
      ...output omitted...
    5. Finally, add the match rule at the bottom of the file. Then, save and close the file.

      ...output omitted...
      route:
        ...output omitted...
        routes:
          ...output omitted...
        - matchers:
          - severity = critical
          receiver: Critical
        - receiver: email
          match:
            alertname: PersistentVolumeUsageNearFull

      Note

      The ~/DO380/solutions/monitoring-alerts/alertmanager.yaml file contains the correct configuration, and you can use it for comparison.
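
To see why the new route goes at the bottom of the routes list, it can help to model how Alertmanager selects a receiver: child routes are tried in order, the first one that matches wins (the continue option defaults to false), and an alert that matches no child route falls back to the receiver of the parent route. The following Python sketch models only the two child routes shown in this exercise; the routes hidden behind "...output omitted..." are not modeled, and the Route class and pick_receiver function are illustrative names rather than Alertmanager code.

```python
from dataclasses import dataclass, field


@dataclass
class Route:
    receiver: str
    # Label name -> required value. An empty dict matches every alert.
    equals: dict[str, str] = field(default_factory=dict)
    routes: list["Route"] = field(default_factory=list)

    def matches(self, labels: dict[str, str]) -> bool:
        return all(labels.get(k) == v for k, v in self.equals.items())


def pick_receiver(route: Route, labels: dict[str, str]) -> str:
    """Return the receiver of the first matching child, else the parent's."""
    for child in route.routes:
        if child.matches(labels):
            return pick_receiver(child, labels)
    return route.receiver


# Only the routes that the exercise shows; omitted routes are left out.
root = Route(
    receiver="Default",
    routes=[
        Route(receiver="Critical", equals={"severity": "critical"}),
        Route(receiver="email",
              equals={"alertname": "PersistentVolumeUsageNearFull"}),
    ],
)

near_full = {
    "alertname": "PersistentVolumeUsageNearFull",
    "severity": "warning",
    "namespace": "monitoring-alerts",
}
print(pick_receiver(root, near_full))                    # email
print(pick_receiver(root, {"severity": "critical"}))     # Critical
print(pick_receiver(root, {"alertname": "SomethingElse"}))  # Default
```

Because the PersistentVolumeUsageNearFull alert carries the warning severity, it passes over the Critical route and reaches the email route. An alert that matched an earlier route would never reach the email route, so the order of the entries matters.
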

  3. Apply the Alertmanager configuration changes.

    1. Update the existing alertmanager-main secret in the openshift-monitoring namespace with the content of the alertmanager.yaml file.

      [student@workstation monitoring-alerts]$ oc set data secret/alertmanager-main \
        -n openshift-monitoring --from-file alertmanager.yaml
      secret/alertmanager-main data updated
    2. Follow the alertmanager container logs and look for the message: "Completed loading of configuration file" file=/etc/alertmanager/config_out/alertmanager.env.yaml

      New log messages can take up to one minute to be displayed. Press Ctrl+C to exit the oc logs command.

      [student@workstation monitoring-alerts]$ oc logs -f \
        statefulset.apps/alertmanager-main -c alertmanager -n openshift-monitoring
      Found 2 pods, using pod/alertmanager-main-0
      ...output omitted...
      ts=2024-01-26T23:51:34.123Z caller=coordinator.go:256 level=info component=configuration msg="Loading configuration file" file=/etc/alertmanager/config_out/alertmanager.env.yaml
      ts=2024-01-26T23:51:34.456Z caller=coordinator.go:512 level=info component=configuration msg="Completed loading of configuration file" file=/etc/alertmanager/config_out/alertmanager.env.yaml
      ...output omitted...
      ^C

      Note

      If you see configuration errors in the logs, then modify the alertmanager.yaml file and reapply your changes.
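
The alertmanager log lines use a key=value layout, so the check for the reload message can also be scripted instead of read by eye. The following Python sketch parses the two sample lines from the preceding output and reports whether the expected message is present. The parse_logfmt and config_loaded helpers are illustrative names, and the parser is a simplification that assumes values are either bare words or double-quoted strings.

```python
import shlex

EXPECTED_MSG = "Completed loading of configuration file"


def parse_logfmt(line: str) -> dict[str, str]:
    """Split a key=value log line into a dict; quoted values keep spaces."""
    try:
        tokens = shlex.split(line)
    except ValueError:
        # Unbalanced quotes: treat the line as unparseable.
        return {}
    fields = {}
    for token in tokens:
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


def config_loaded(lines) -> bool:
    """True if any line reports that the configuration finished loading."""
    return any(parse_logfmt(line).get("msg") == EXPECTED_MSG for line in lines)


sample_lines = [
    'ts=2024-01-26T23:51:34.123Z caller=coordinator.go:256 level=info '
    'component=configuration msg="Loading configuration file" '
    'file=/etc/alertmanager/config_out/alertmanager.env.yaml',
    'ts=2024-01-26T23:51:34.456Z caller=coordinator.go:512 level=info '
    'component=configuration msg="Completed loading of configuration file" '
    'file=/etc/alertmanager/config_out/alertmanager.env.yaml',
]

print(config_loaded(sample_lines))  # True
```

To use the same check against live output, the sample_lines list could be replaced with lines read from standard input while the oc logs command is piped into the script.
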

  4. Create a workload that triggers persistent volume alerts and inspect the alerts.

    1. Switch to the monitoring-alerts project.

      [student@workstation monitoring-alerts]$ oc project monitoring-alerts
      Now using project "monitoring-alerts" on server ...
    2. Create the workload by using the YAML resource manifest.

      [student@workstation monitoring-alerts]$ oc apply -f workload.yaml
      configmap/mysql created
      secret/mysql created
      persistentvolumeclaim/mysql created
      deployment.apps/mysql created
      service/mysql created
    3. Wait until the mysql pod is running, and verify that the deployment is marked as ready and available.

      [student@workstation monitoring-alerts]$ watch oc get deployments,pods
      Every 2.0s: oc get deployments,pods        workstation: Fri Jan 26 17:07:10 2024
      
      NAME                    READY   UP-TO-DATE   AVAILABLE   AGE
      deployment.apps/mysql   1/1     1            1           90s
      
      NAME                         READY   STATUS    RESTARTS   AGE
      pod/mysql-69f4d555b4-hthp4   1/1     Running   0          90s
    4. Get the web console URL.

      [student@workstation monitoring-alerts]$ oc whoami --show-console
      https://console-openshift-console.apps.ocp4.example.com
    5. Open the web console URL in the browser. Click Red Hat Identity Management and log in with the following credentials:

      • Username: admin

      • Password: redhatocp

    6. Click Observe → Alerting, and then wait for the following alerts to fire:

      • PersistentVolumeUsageCritical

      • PersistentVolumeUsageNearFull

        Note

        The alerts might take a few minutes to be triggered.

    7. Click the PersistentVolumeUsageNearFull alert to view its properties. The graph displays the time when the alert was detected.

      Scroll down to view the alert details and labels.

  5. Verify that the Alertmanager sent an email notification for each firing alert.

    The lab user on the utility.lab.example.com host receives emails that are sent to the ocp-admins@example.com email address.

    1. Connect to the utility.lab.example.com host as the lab user.

      [student@workstation monitoring-alerts]$ ssh lab@utility.lab.example.com
      ...output omitted...
    2. Use the mutt command to access the mail messages.

      [lab@utility ~]$ mutt
    3. Review the message list. The emails from alerts@ocp4.example.com confirm that Alertmanager sent notifications for the persistent volume alerts.

      q:Quit    d:Del    u:Undel    s:Save    m:Mail    r:Reply    g:Group    ?:Help
      
         1 N   Jan 26 alerts@ocp4.exa… ( 251) [FIRING:1] monitoring-alerts (Persistent
      ...output omitted...
      
      ---Mutt: /var/spool/mail/lab [Msgs:1 New:1 123K]-----(date/date)--------(all)---
    4. Select the first email message and press Enter to open the email.

      i:Exit  -:PrevPg  <Space>:NextPg  v:View Attach…  d:Del  r:Reply  j:Next  ?:Help
      
      Date: Fri, 26 Jan 2024 23:36:49 +0000
      From: alerts@ocp4.example.com
      To: ocp-admins@example.com
      Subject: [FIRING:1] monitoring-alerts (PersistentVolumeUsageNearFull
              https-metrics 192.168.50.13:10250 kubelet /metrics worker01 platform
              mysql openshift-monitoring/k8s openshift-storage.cephfs.csi.ceph.com
              kubelet warning ocs-external-storagecluster-cephfs)
      
      <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
      ...output omitted...
      -N  - 1/1: alerts@ocp4.example. [FIRING:1] monitoring-alerts (Persistent -- (1%)

      Note

      Alertmanager sends notification emails in HTML format.

    5. Press q to close the message.
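
While the alert keeps firing, further emails arrive at a cadence that the group_wait, group_interval, and repeat_interval route settings control. The following Python sketch is a simplified model of that cadence, assuming a single alert group whose content does not change: the group is first flushed after group_wait, then again every group_interval, and a flush sends a notification only if at least repeat_interval has passed since the last one. The notification_times function is an illustrative name, and real delivery times also depend on evaluation and network delays.

```python
def notification_times(group_wait, group_interval, repeat_interval, horizon):
    """Return the seconds, counted from group creation, at which a
    notification is sent for an unchanged alert group (simplified model)."""
    sent = []
    flush = group_wait
    while flush <= horizon:
        # Send on the first flush, or when the repeat interval has elapsed.
        if not sent or flush - sent[-1] >= repeat_interval:
            sent.append(flush)
        flush += group_interval
    return sent


# Values from this exercise: group_wait 30s, group_interval 2m,
# repeat_interval 1m, observed over ten minutes.
print(notification_times(30, 120, 60, horizon=600))
# [30, 150, 270, 390, 510]

# Hypothetical comparison with a 5m repeat interval: the repeat is
# rounded up to the next group flush.
print(notification_times(30, 120, 300, horizon=600))
# [30, 390]
```

Under this model, a repeat_interval that is shorter than group_interval does not produce emails more often than once per group_interval, which matches the roughly two-minute spacing to expect in this exercise.
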

  6. Silence the PersistentVolumeUsageNearFull alert.

    1. Return to the web browser window and click Observe → Alerting to display a list of the alerts that are currently firing.

    2. Click the three dots icon at the right of the PersistentVolumeUsageNearFull alert, and then click Silence alert.

    3. Set the alert silence time to 30 minutes. Complete the form according to the following table, and then scroll down and click Silence.

      Field               Value
      For                 30m
      Start immediately   Checked
      Alert labels        Leave all the label names and values as they are presented
      Info
        Creator           admin
        Comment           Silenced during troubleshooting
    4. Click Observe → Alerting to return to the alerting main section. The PersistentVolumeUsageNearFull alert is not displayed in the list, because it is currently silenced.

      Clear the Alert State filter by clicking x to display all the alerts.

    5. Observe that the PersistentVolumeUsageNearFull alert is listed and marked as silenced.

    6. Return to the terminal window and observe that no new emails are received. Press q to exit mutt.

    7. Exit the SSH session.

      [lab@utility ~]$ exit
    8. Change to the student HOME directory.

      [student@workstation monitoring-alerts]$ cd
      [student@workstation ~]$
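
A silence does not stop the alert from firing; it mutes notifications for alerts whose labels match the silence while the silence is active. The following Python sketch models that rule with the 30-minute duration, creator, and comment from this exercise. The Silence class, the start time, and the reduced label set (only alertname and namespace) are illustrative assumptions; the web console form keeps all of the alert labels.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Silence:
    matchers: dict[str, str]
    starts_at: datetime
    ends_at: datetime
    created_by: str
    comment: str

    def suppresses(self, labels: dict[str, str], now: datetime) -> bool:
        """True if the silence is active and every matcher equals a label."""
        active = self.starts_at <= now < self.ends_at
        matched = all(labels.get(k) == v for k, v in self.matchers.items())
        return active and matched


# Hypothetical start time; the exercise starts the silence immediately.
start = datetime(2024, 1, 26, 23, 40, tzinfo=timezone.utc)
silence = Silence(
    matchers={
        "alertname": "PersistentVolumeUsageNearFull",
        "namespace": "monitoring-alerts",
    },
    starts_at=start,
    ends_at=start + timedelta(minutes=30),
    created_by="admin",
    comment="Silenced during troubleshooting",
)

alert = {
    "alertname": "PersistentVolumeUsageNearFull",
    "namespace": "monitoring-alerts",
    "severity": "warning",
}

print(silence.suppresses(alert, start + timedelta(minutes=10)))  # True
print(silence.suppresses(alert, start + timedelta(minutes=31)))  # False
```

After the silence expires, an alert that is still firing produces notifications again, so the silence duration should cover the expected troubleshooting time.
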

Finish

On the workstation machine, use the lab command to complete this exercise. This step is important to ensure that resources from previous exercises do not impact upcoming exercises.

[student@workstation ~]$ lab finish monitoring-alerts

Revision: do380-4.14-397a507