Chapter 7.  OpenShift Logging

Abstract

Goal

Deploy OpenShift Logging and query log entries from workloads and cluster nodes.

Sections
  • Log Forwarding (and Guided Exercise)

  • Centralized Logging (and Guided Exercise)

Lab
  • OpenShift Logging

Log Forwarding

Objectives

  • Deploy OpenShift Logging for forwarding logs to an external aggregator.

OpenShift Logging

OpenShift Logging collects and aggregates the log messages from all the pods and nodes in your cluster. Users and administrators can use the OpenShift web console to search and consult log entries.

Depending on the workload and the size of your cluster, processing many logs can require significant disk space and compute resources, which would then not be available for your application workloads. In such a scenario, you might need to deploy more compute nodes and increase your storage capacity to handle the extra load.

You can configure OpenShift Logging to forward the logs to a third-party log aggregator for long-term storage, or to an observability platform for further analysis, and thereby minimize the resource requirements on your cluster.

The following examples of third-party logging solutions can receive logs from OpenShift Logging:

  • Elasticsearch

  • Grafana Loki

  • Splunk

  • Amazon CloudWatch

  • Google Cloud Logging

OpenShift Logging Components

OpenShift Logging is based on these components: a collector, a log store, and a visualization console. You can deploy them together as a complete logging solution, or you can deploy the collector alone and store the logs in an external solution.

Figure 7.1: OpenShift Logging architecture

Log Collector

The collector is the main component of OpenShift Logging. OpenShift Logging uses Vector to collect logs from all running containers and cluster nodes. Vector replaces Fluentd, which was the collector in earlier versions of OpenShift Logging.

Vector collects various log types from your cluster that are then grouped into these categories:

Infrastructure

Infrastructure logs include container logs in the openshift-*, kube*, and default namespaces, and system logs from the cluster nodes.

Audit

Audit logs include both Kubernetes API and OpenShift API audit logs, as well as the Linux audit logs from the cluster nodes. These logs might contain sensitive security details, and OpenShift Logging does not store them by default.

Application

Application logs are all container logs from user projects.

To collect these logs, Vector runs as a daemon set, so a collector pod runs on every node in the cluster.

In addition to collecting logs, Vector adds metadata to describe where the logs come from, and then forwards the logs to the log store, which is either internal or external to the cluster.

Log Store

The log store uses Grafana Loki to aggregate logs from the entire cluster into a central place and provides access control to logs.

Loki replaces Elasticsearch, which was the log store in earlier versions of the logging subsystem.

The internal log store is an optional component of OpenShift Logging.

Visualization

OpenShift Logging provides a native OpenShift Console plug-in to view and query logs in the internal log store.

The OpenShift Logging UI component replaces Kibana, which was the web interface in earlier versions of OpenShift Logging.

Install and Configure OpenShift Logging

You can deploy OpenShift Logging by installing the OpenShift Logging operator with the Operator Lifecycle Manager (OLM) from OperatorHub on the web console or by using the oc command to create the OLM resources. The OpenShift Logging operator manages deploying and configuring the log collector and other resources to support the logging subsystem.

See the references section for instructions to install an operator by using OperatorHub or the oc command.
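If you install the operator with the oc command, you typically create a namespace, an operator group, and a subscription. The following is a minimal sketch of those OLM resources; the channel name is an assumption, so verify it against the OperatorHub catalog for your cluster version:

apiVersion: v1
kind: Namespace
metadata:
  name: openshift-logging
  labels:
    openshift.io/cluster-monitoring: "true"
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: cluster-logging
  namespace: openshift-logging
spec:
  targetNamespaces:
  - openshift-logging
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: cluster-logging
  namespace: openshift-logging
spec:
  channel: stable               # assumed channel name; check the catalog for your version
  name: cluster-logging
  source: redhat-operators
  sourceNamespace: openshift-marketplace

You can save these resources in a file, create them with the oc apply -f command, and then wait for the operator pod to start in the openshift-logging namespace.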

After installation, you configure the logging subsystem by using the ClusterLogging custom resource (CR), and configure the log collector by using the ClusterLogForwarder CR.

Configure OpenShift Logging Components

The ClusterLogging CR configures and manages the deployment of the OpenShift Logging components.

To configure OpenShift Logging, you must create the cluster logging resource in the openshift-logging namespace. The minimum configuration for OpenShift Logging is to enable the log collector, as follows:

apiVersion: logging.openshift.io/v1
kind: ClusterLogging
metadata:
  name: instance 1
  namespace: openshift-logging
spec:
  managementState: Managed 2
  collection:
    type: vector 3

1

The resource name must be instance.

2

The management state must be Managed for the components to receive updates from the OpenShift Logging operator.

3

The collector type to deploy. It can be either vector or the deprecated fluentd collector.

You can use the cluster logging resource to enable or disable logging components, to customize pod placement, and to define pod resource limits. Configuring the log store and logging UI components is discussed in detail in the next section.
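For example, the following snippet defines resource requests and limits for the collector pods. This is a sketch that assumes the resources field is available directly under the collection section of the ClusterLogging schema; verify the field names against your operator version:

apiVersion: logging.openshift.io/v1
kind: ClusterLogging
metadata:
  name: instance
  namespace: openshift-logging
spec:
  managementState: Managed
  collection:
    type: vector
    resources:          # assumed field name; verify against the ClusterLogging schema
      requests:
        cpu: 100m
        memory: 256Mi
      limits:
        memory: 1Gi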

In the following example, the log collector component is configured to run on all nodes, including dedicated infrastructure nodes with the node-role.kubernetes.io/infra=reserved:NoExecute and node-role.kubernetes.io/infra=reserved:NoSchedule taints:

apiVersion: logging.openshift.io/v1
kind: ClusterLogging
metadata:
  name: instance
  namespace: openshift-logging
spec:
  managementState: Managed
  collection:
    type: vector
    tolerations: 1
    - effect: NoSchedule
      key: node-role.kubernetes.io/infra
      value: reserved
    - effect: NoExecute
      key: node-role.kubernetes.io/infra
      value: reserved

1

List of tolerations to include in the collector daemon set.

Note

OpenShift Logging automatically adds the node-role.kubernetes.io/master:NoSchedule toleration to the collector daemon set so the collector can run on both control plane and compute nodes.

After the cluster logging resource is created, the OpenShift Logging operator starts scheduling pods for the components. The log collector pods are deployed only if the internal log store is enabled, or if log forwarding to an external log aggregator is configured.
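You can confirm that the collector pods are running on all nodes by inspecting the collector daemon set, for example:

[user@host ~]$ oc -n openshift-logging get daemonset collector
[user@host ~]$ oc -n openshift-logging get pods -o wide
...output omitted...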

Configure Log Collection and Forwarding

The log collector is configured by default to forward infrastructure and application logs to the internal log store that you define in the cluster logging resource. If the internal log store is not deployed, then you must configure the log collector to forward logs to a third-party logging system instead.

Vector can forward logs to various log stores, in addition to the internal Loki deployment, such as Splunk, Amazon CloudWatch, or any logging solution that uses the syslog protocol.

See the references section for a list of external log stores that OpenShift Logging can use.

The ClusterLogForwarder custom resource configures the log collector, and defines which logs to collect and where to send them. The following example is a cluster log forwarder resource that configures Vector to forward audit and infrastructure logs to an external Splunk instance:

apiVersion: logging.openshift.io/v1
kind: ClusterLogForwarder
metadata:
  name: instance 1
  namespace: openshift-logging
spec:
  outputs:
    - name: splunk-receiver 2
      secret: 3
        name: splunk-auth-token
      type: splunk 4
      url: https://mysplunkserver.example.com:8088/services/collector 5
  pipelines:
    - name: to-splunk 6
      inputRefs:
        - audit
        - infrastructure
      outputRefs:
        - splunk-receiver

1

The resource name must be instance, and the resource must be created in the openshift-logging namespace.

2

Name of the log output. You use that name in the log pipeline to refer to that log destination.

3

Secret to use for connecting to the log store, if required. In this example, the secret contains the authentication token for the Splunk instance; an example of creating this secret follows the callout descriptions.

4

Type of the external log store.

5

URL of the external log store.

6

Log pipeline configuration that forwards audit and infrastructure logs to the Splunk instance.
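The splunk-auth-token secret that the splunk-receiver output references must exist in the openshift-logging namespace. As a sketch, assuming that the Splunk output expects the HEC token under the hecToken key, you might create the secret as follows:

[user@host ~]$ oc -n openshift-logging create secret generic splunk-auth-token \
  --from-literal=hecToken=<splunk-hec-token>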

The cluster log forwarder resource is composed of inputs, outputs, and pipelines.

inputs

An input defines the log type to collect. OpenShift Logging provides a predefined input for each log category: infrastructure, audit, and application.

outputs

An output defines a destination for the logs. You can configure multiple log outputs that send logs to one or more external log stores.

If the internal log store is configured, then the default output becomes available to target the internal Loki instance.

pipelines

A pipeline defines a routing from one or more log inputs to one or more log outputs. You can create several log pipelines and use any combination of log inputs and outputs to forward logs according to your needs.

In the next configuration example, the collector forwards logs to multiple log aggregators:

  • The audit logs to a remote syslog server for long-term storage and security compliance

  • Both audit logs and infrastructure logs to the internal log store for cluster administrators

  • The application logs to an external Splunk instance for developers

apiVersion: logging.openshift.io/v1
kind: ClusterLogForwarder
metadata:
  name: instance
  namespace: openshift-logging
spec:
  outputs:
  - name: audit-rsyslog 1
    syslog: 2
      appName: ocp-prod
      facility: local0
      msgID: audit-msg
      procID: audit-proc
      severity: informational
      rfc: RFC5424
    type: syslog
    url: tcp://audit-store.example.com:5514
  - name: splunk-receiver 3
    secret:
      name: splunk-auth-token
    type: splunk
    url: https://mysplunkserver.example.com:8088/services/collector
  pipelines:
    - name: audit-to-syslog 4
      inputRefs:
        - audit
      outputRefs:
        - audit-rsyslog
    - name: admin-logs 5
      inputRefs:
        - audit
        - infrastructure
      outputRefs:
        - default
    - name: app-logs 6
      inputRefs:
        - application
      outputRefs:
        - splunk-receiver

1

Log output configuration for the remote syslog

2

Specific configuration for syslog to format the log messages

3

Log output configuration for the Splunk instance

4

Log pipeline that forwards audit logs to the remote syslog

5

Log pipeline that forwards audit and infrastructure logs to the internal log store

6

Log pipeline that forwards application logs to the Splunk instance

Collect Kubernetes Events

The Event Router is an optional OpenShift Logging component that you can deploy to log Kubernetes events. The Event Router monitors the OpenShift event API and sends the events to the container stdout. The log collector captures the Event Router container logs and forwards them to the log store through the infrastructure category.

Because the OpenShift Logging operator does not manage this component, it must be manually deployed and updated. You can deploy the Event Router by using the OpenShift template from the documentation.
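For example, assuming that you saved the template from the documentation as eventrouter.yaml, you can process the template and apply the resulting resources:

[user@host ~]$ oc process -f eventrouter.yaml | oc apply -n openshift-logging -f -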

See the references section for more information about the Event Router and its deployment.

Filter Logs

OpenShift Logging provides filtering capabilities to limit the number of logs to forward to the log store. For example, you can limit the log collection to a specific set of OpenShift projects or application pods by creating a custom log input in the cluster log forwarder resource, as follows:

apiVersion: logging.openshift.io/v1
kind: ClusterLogForwarder
metadata:
  name: instance
  namespace: openshift-logging
spec:
  inputs:
  - name: production-apps 1
    application:
      selector:
        matchLabels:
          environment: production
  - name: qa-chain 2
    application:
      selector:
        matchLabels:
          environment: development
      namespaces:
      - qa-testing
      - builders

...output omitted...

  pipelines:
  - name: to-splunk 3
    inputRefs:
      - qa-chain
      - production-apps
    outputRefs:
      - splunk-receiver

1

The production-apps log input collects application logs from pods with the environment: production label.

2

The qa-chain log input collects application logs from pods with the environment: development label in the qa-testing and builders projects.

3

The log pipeline forwards both qa-chain and production-apps log inputs to the Splunk instance.

In addition to application logs, you can filter Kubernetes API audit events.

Each call to the Kubernetes and OpenShift APIs generates an audit event that is forwarded to the log store. Although these events can be valuable for security audits, they represent a high volume of data, which can increase your storage requirement and network bandwidth usage, and therefore increase the cost of your logging solution.

To mitigate this effect, you can define audit filters that remove unwanted or low-value events from the audit logs, and reduce the data that is sent to the log store.

Audit filters are defined in the cluster log forwarder resource and list a set of rules that match the events to remove. You can then apply these filters to the selected log pipeline.

In the following example, an audit filter is created to remove all audit events from infrastructure service accounts, and to remove audit events from updates to leases resources by Kubernetes system users:

spec:
  filters:
  - name: unwanted-events 1
    type: kubeAPIAudit
    kubeAPIAudit:
      rules:
      - level: None 2
        namespaces:
        - openshift-*
        - kube*
        userGroups:
        - system:serviceaccounts:openshift-*
        - system:nodes
      - level: None 3
        resources:
        - group: coordination.k8s.io
          resources:
          - leases
        users:
        - system:kube*
        - system:apiserver
        verbs:
        - update

...output omitted...

  pipelines:
  - name: audit-to-syslog
    inputRefs:
    - audit
    filterRefs: 4
    - unwanted-events
    outputRefs:
    - audit-rsyslog

1

Name of the filter. You use that name in the log pipeline to refer to that filter configuration.

2

The first rule removes all audit events that were created by users in the system:serviceaccounts:openshift-* and system:nodes groups, for resources in the openshift-* and kube* namespaces.

3

The second rule removes audit events that the system:kube* and system:apiserver users generate when they update leases resources in the coordination.k8s.io API group.

4

The filterRefs field lists all filters to apply to the log pipeline.

The kubeAPIAudit field uses the same syntax as the Kubernetes audit policy resource. See the references section for more information about the Kubernetes audit policy and its syntax.

Troubleshoot Log Forwarding

You can verify the configuration of the cluster log forwarder by reviewing the status of the resource. The OpenShift Logging operator validates each log input, output, and pipeline for configuration errors:

[user@host ~]$ oc -n openshift-logging describe clusterlogforwarder/instance

...output omitted...

Status:
  Conditions:
    Message:               clusterlogforwarder is not ready
    Reason:                ValidationFailure
    Status:                True
    Type:                  Validation
  Inputs:
    Critical - Apps:
      Last Transition Time:  2024-01-18T14:24:20Z
      Status:                True
      Type:                  Ready
  Outputs:
    Audit - Syslog:
      Last Transition Time:  2024-01-18T14:24:20Z
      Status:                True
      Type:                  Ready
    Infra - Syslog:
      Last Transition Time:  2024-01-18T14:24:20Z
      Status:                True
      Type:                  Ready
  Pipelines:
    Audit - Syslog:
      Last Transition Time:  2024-01-18T14:24:20Z
      Status:                True
      Type:                  Ready
    Critical - Apps - Syslog:
      Last Transition Time:  2024-01-18T14:24:20Z
      Message:               invalid: unrecognized outputs: [app-syslog], no valid outputs
      Reason:                Invalid
      Status:                False
      Type:                  Ready
...output omitted...

In this example, one of the configured log pipelines refers to an unknown log output (app-syslog). After the configuration is fixed, the cluster log forwarder resource enters the ready state.

You can also verify the status of the logging configuration by reviewing Kubernetes events about the logging resources:

[user@host ~]$ oc get event -n openshift-logging \
  --field-selector=involvedObject.name=instance
LAST SEEN   TYPE      REASON                 OBJECT                         MESSAGE
2m25s       Normal    ReconcilingLoggingCR   clusterlogging/instance        Reconciling logging resource
78s         Normal    ReconcilingLoggingCR   clusterlogforwarder/instance   Reconciling logging resource
4m11s       Warning   Invalid                clusterlogforwarder/instance   clusterlogforwarder is not ready
78s         Normal    Ready                  clusterlogforwarder/instance   ClusterLogForwarder is valid

Each time that the log forwarder configuration is updated, the OpenShift Logging operator redeploys the collector pods.
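For example, you can watch the collector daemon set roll out after you update the configuration:

[user@host ~]$ oc -n openshift-logging rollout status daemonset/collector
...output omitted...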

If the cluster log forwarder configuration is correct, but logs are still not forwarded to the external log store, then you can review the collector logs for errors:

[user@host ~]$ oc -n openshift-logging logs daemonset/collector

...output omitted...

... ERROR vector::topology::builder: msg="Healthcheck failed." error=Connect error: Connection refused (os error 111) component_kind="sink" component_type="socket" component_id=infra_syslog component_name=infra_syslog

In this example, Vector cannot reach the external log store that is configured in the infra_syslog log output. Verify that the remote server is accessible from the OpenShift cluster and that the log forwarder configuration is correct.

If you enable cluster monitoring for the openshift-logging namespace, either during the operator installation or by adding the openshift.io/cluster-monitoring: true label to the namespace, then you can use the monitoring dashboards to inspect the collector status.
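For example, the following commands show the labels on the namespace and add the cluster monitoring label if it is missing:

[user@host ~]$ oc get namespace openshift-logging --show-labels
[user@host ~]$ oc label namespace openshift-logging openshift.io/cluster-monitoring=true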

From the Logging / Collection monitoring dashboard, you can see how many logs are forwarded to each configured log store and the log rate for each application that is running in your cluster.

Figure 7.2: OpenShift Logging monitoring dashboard

References

Kubernetes audit policy documentation: https://kubernetes.io/docs/tasks/debug/debug-cluster/audit/#audit-policy

ClusterLogForwarder API audit filter documentation: https://github.com/openshift/cluster-logging-operator/blob/master/docs/features/logforwarding/filters/api-audit-filter.adoc

For more information about compatible log destinations, refer to the Log Output Types chapter in the Red Hat OpenShift Container Platform 4.14 Logging documentation at https://access.redhat.com/documentation/en-us/openshift_container_platform/4.14/html-single/logging/index#logging-output-types

For more information about installing the OpenShift Logging operator, refer to the Installing Logging chapter in the Red Hat OpenShift Container Platform 4.14 Logging documentation at https://access.redhat.com/documentation/en-us/openshift_container_platform/4.14/html-single/logging/index#cluster-logging-deploying

For more information about installing the Event Router, refer to the Collecting and Storing Kubernetes Events chapter in the Red Hat OpenShift Container Platform 4.14 Logging documentation at https://access.redhat.com/documentation/en-us/openshift_container_platform/4.14/html-single/logging/index#cluster-logging-eventrouter

Revision: do380-4.14-397a507