
Chapter 13. Managing Cloud Platforms with Red Hat Ceph Storage

Abstract

Goal: Manage Red Hat cloud infrastructure to use Red Hat Ceph Storage to provide image, block, volume, object, and shared file storage.
Objectives
  • Describe Red Hat OpenStack Platform storage requirements, and compare the architecture choices for using Red Hat Ceph Storage as an RHOSP storage back end.

  • Describe how OpenStack implements Ceph storage for each storage-related OpenStack component.

  • Describe Red Hat OpenShift Container Platform storage requirements, and compare the architecture choices for using Red Hat Ceph Storage as an RHOCP storage back end.

  • Describe how OpenShift implements Ceph storage for each storage-related OpenShift feature.

Sections
  • Introducing OpenStack Storage Architecture (and Quiz)

  • Implementing Storage in OpenStack Components (and Quiz)

  • Introducing OpenShift Storage Architecture (and Quiz)

  • Implementing Storage in OpenShift Components (and Quiz)

Introducing OpenStack Storage Architecture

Objectives

After completing this section, you should be able to describe Red Hat OpenStack Platform storage requirements, and compare the architecture choices for using Red Hat Ceph Storage as an RHOSP storage back end.

Red Hat OpenStack Platform Overview

Red Hat OpenStack Platform (RHOSP) is implemented as a collection of interacting services that control compute, storage, and networking resources. Cloud users deploy virtual machines by using resources through a self-service interface. Cloud operators, in cooperation with storage operators, ensure that sufficient storage space is created, configured, and available for each of the OpenStack components that consume or provide storage to cloud users.

Figure 13.1: A simple set of OpenStack services presents a high-level overview of the core service relationships of a simple RHOSP installation. All services interact with the Identity service (Keystone) to authenticate users, services, and privileges before any operation is allowed. Cloud users can choose to use the command-line interface or the graphical Dashboard service to access existing resources and to create and deploy virtual machines.

The Orchestration service is the primary component for installing and modifying an RHOSP cloud. This section introduces the OpenStack services for integrating Ceph into an OpenStack infrastructure.

Figure 13.1: A simple set of OpenStack services

Introducing the Storage Services

These core OpenStack services provide storage resources in various formats and by multiple access methods. Cloud users deploy application VMs that consume these storage resources.

Compute Service (Nova)

The Compute service manages VM instances that run on hypervisor nodes. It uses storage to provide system disks, swap volumes, and other ephemeral disks for launching and running instances. This service interacts with the Identity service for authentication, the Image service to obtain images, and other storage services to access additional forms of persistent storage for running instances to use. The Compute service uses libvirtd, qemu, and kvm for the hypervisor.
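The ephemeral and swap disks that the Compute service allocates are defined by the flavor of the instance. The following sketch uses illustrative flavor, image, and network names that are not part of this course; it only shows how a flavor that requests ephemeral and swap space drives the storage that an instance receives:

# Define a flavor with a 20 GB system disk, a 10 GB ephemeral disk, and 1 GB of swap.
openstack flavor create example.web --vcpus 2 --ram 4096 \
  --disk 20 --ephemeral 10 --swap 1024
# Launch an instance; the Compute service allocates the disks that the flavor defines.
openstack server create demo-server --flavor example.web \
  --image rhel8 --network demo-net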

Block Storage Service (Cinder)

The Block Storage service manages storage volumes for virtual machines, including both ephemeral and persistent block storage for instances that the Compute service manages. The service implements snapshots for backing up and creating new block storage volumes.
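As a sketch of this workflow (the volume and server names are illustrative), a cloud user can create a persistent volume, attach it to an instance, and later snapshot it to seed a new volume:

# Create a 10 GB persistent volume and attach it to a running instance.
openstack volume create demo-vol --size 10
openstack server add volume demo-server demo-vol
# Snapshot the volume (--force permits snapshotting an in-use volume),
# then build a new volume from that snapshot.
openstack volume snapshot create demo-snap --volume demo-vol --force
openstack volume create demo-vol2 --snapshot demo-snap --size 10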

Image Service (Glance)

The Image service acts as a registry for the images that are used to build instance system disks at launch time. Live instances can be saved as images for later use in building new instances.
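A sketch of both directions of that workflow, with illustrative file and server names:

# Register a local disk image with the Image service.
openstack image create rhel8-custom --disk-format qcow2 \
  --container-format bare --file rhel8-custom.qcow2
# Save a running instance as a new image for later reuse.
openstack server image create demo-server --name demo-server-gold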

Shared File Systems Service (Manila)

The Shared File System service uses the network infrastructure to implement file sharing as a service. Because cloud users normally do not have connection privileges to the file share server, this service brokers connections to configured back ends. The service uses NFS and CIFS protocols to access file share servers. Administrators can configure this service to access multiple file share servers.
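For example, a cloud user can request an NFS share, grant access, and retrieve the export path without ever contacting the file share server directly. This is a sketch with illustrative names; the available share types and client subnets depend on your configuration:

# Request a 10 GB NFS share from the configured back end.
manila create NFS 10 --name demo-share
# Allow a client subnet to mount the share, then list its export locations.
manila access-allow demo-share ip 192.0.2.0/24
manila share-export-location-list demo-share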

Object Store Service (Swift)

The Object Store provides storage for users to upload and retrieve objects as files. The Object Store architecture is distributed across disk devices and servers for horizontal scaling and to provide redundancy. It is common practice to configure the Image service to use the Object Store service as its storage back end, to support image and snapshot replication across the Object Store infrastructure. This service also provides a backup solution for other services by storing backup results as retrievable objects.
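A minimal sketch of object storage usage with the unified CLI; the container and file names are illustrative:

# Create a container and upload a backup artifact as an object.
openstack container create demo-backups
openstack object create demo-backups db-dump.tar.gz
# List the stored objects.
openstack object list demo-backups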

Red Hat Ceph Storage (Ceph)

Red Hat Ceph Storage is a distributed data object store that is used as the back end for all the other storage services. Ceph is the most common back end that is used with OpenStack. Ceph integrates with OpenStack services such as Compute, Block Storage, Shared File Systems, Image, and Object Store to provide easier storage management and cloud scalability.
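As a sketch of that integration, the following commands show the kind of RBD pools that typically back these services. The pool names match the defaults that appear in the environment file later in this section; in a dedicated deployment, TripleO typically creates equivalent pools for you:

# Create one RBD pool per storage service and initialize each for RBD use.
ceph osd pool create vms
ceph osd pool create volumes
ceph osd pool create images
ceph osd pool create backups
for pool in vms volumes images backups; do rbd pool init $pool; done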

Introducing Services for Integrating Storage

These additional core services provide the overcloud installation, service container deployment, and authentication support that is necessary to implement storage integration.

Identity Service (Keystone)

The Identity service authenticates and authorizes all OpenStack services. This service creates and manages users and roles in domains and projects. This service provides a central catalog of services and their associated endpoints that are available in an OpenStack cloud. The Identity service acts as a single sign-on (SSO) authentication service for both users and service components.

Deployment Service (TripleO)

The Deployment service installs, upgrades, and operates OpenStack clouds by using the Director node, which is itself a single-node OpenStack installation (the undercloud).

Orchestration Service (Heat)

The Orchestration service can provision both infrastructure and application workloads by using resources that are defined in Heat Orchestration Template (HOT) files. HOT template and environment files are the primary configuration method for deploying overclouds. In later Red Hat OpenStack Platform versions, the Orchestration template and environment files define the services, resources, and architecture to deploy, while Ansible Playbooks implement the software provisioning.
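A minimal HOT sketch, only to show the general shape of a template; the resource, image, and flavor names are placeholders and are unrelated to the overcloud deployment templates:

heat_template_version: rocky

description: Minimal HOT example that defines a single compute resource.

parameters:
  image_name:
    type: string
    default: rhel8

resources:
  demo_server:
    type: OS::Nova::Server
    properties:
      image: { get_param: image_name }
      flavor: m1.small

outputs:
  server_ip:
    description: First IP address of the demo server.
    value: { get_attr: [demo_server, first_address] }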

Container Deployment Service (Kolla)

In later RHOSP versions, the OpenStack services are containerized. The Container Deployment service provides production-ready containers and configuration management for operation of OpenStack services.

Bare Metal Service (Ironic)

The Bare Metal provisioning service prepares and provisions both physical hardware and KVM virtual machines. The service works with standard and vendor-specific drivers, such as PXE and IPMI, to communicate with a wide range of hardware.

Selecting a Ceph Integration Architecture

A storage operator, who works closely with infrastructure architects and network engineers, chooses the Ceph integration architecture and server node roles that are needed to support the organization's application use cases and sizing forecasts. Ceph can be integrated into an OpenStack infrastructure by using either of two implementation designs. Both Ceph designs are implemented by TripleO, which uses Ansible Playbooks for the bulk of the software deployment and configuration.

Note

RHOSP 16.1 and 16.2 support RHCS 5 only as an external cluster. RHOSP 17 supports dedicated RHCS 5 deployment with cephadm to replace ceph-ansible.

Dedicated

An organization without an existing, stand-alone Ceph cluster installs a dedicated Ceph cluster that is composed of Ceph services and storage nodes during an RHOSP overcloud installation. Only services and workloads that are deployed for, or on, an OpenStack overcloud can use an OpenStack-dedicated Ceph implementation. External applications cannot access or use OpenStack-dedicated Ceph cluster storage.

External

An organization can use an existing, stand-alone Ceph cluster for storage when creating a new OpenStack overcloud. The TripleO deployment is configured to access that external cluster to create the necessary pools, accounts, and other resources during overcloud installation. Instead of creating internal Ceph services, the deployment configures the OpenStack overcloud to access the existing Ceph cluster as a Ceph client.

A dedicated Ceph cluster supports a maximum of 750 OSDs when running the Ceph control plane services on the RHOSP controllers. An external Ceph cluster can scale significantly larger, depending on the hardware configuration. Updates and general maintenance are easier on an external cluster because they can occur independently of RHOSP operations.

To maintain Red Hat support, RHOSP installations must be built and configured with the TripleO Orchestration service. For a dedicated storage configuration, RHOSP 16 TripleO uses the same RHCS 4 ceph-ansible playbooks that are used to install stand-alone Ceph clusters. However, because TripleO dynamically organizes the playbooks and environment files to include in the deployment, direct use of Ansible without TripleO is not supported.
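For a dedicated deployment, that orchestration is requested by passing the bundled ceph-ansible environment file to the deployment command. The following is a sketch only; the template path reflects the default location noted later in this section, and the custom overrides file name is illustrative:

# Include the dedicated Ceph environment file plus your own overrides;
# other required environment files are omitted from this sketch.
openstack overcloud deploy --templates \
  -e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml \
  -e /home/stack/templates/ceph-overrides.yaml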

Node Roles Available in a Dedicated Ceph Implementation

A dedicated Ceph implementation is the TripleO default, and is sufficient for most small, medium, and moderately large OpenStack installations. A storage operator has significant choices for service distribution across overcloud nodes by using composable node roles. Except where stated otherwise, these node roles are included by default in later RHOSP versions.
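Composable roles are selected by generating a custom roles data file that TripleO consumes during deployment. The following sketch assumes the default role definitions; the output file name is illustrative:

# Review the available built-in roles, then generate a roles data file
# that combines the roles chosen for this overcloud.
openstack overcloud roles list
openstack overcloud roles generate -o /home/stack/templates/roles_data.yaml \
  Controller Compute CephStorage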

Figure 13.2: An example overcloud with multiple node roles presents overcloud nodes that implement different service roles in a simple overcloud.

Figure 13.2: An example overcloud with multiple node roles

The following node roles determine the services that are placed on storage nodes, which handle data plane traffic and contain the physical storage devices.

The CephStorage role is the default, and control plane services are expected to be installed on controller nodes.

  • CephStorage - The most common dedicated Ceph storage node configuration. Contains OSDs only, without control plane services.

  • CephAll - A stand-alone full storage node with OSDs and all control plane services. This configuration might be used with the ControllerNoCeph node role.

  • CephFile - A node to scale out file sharing. Contains OSDs and MDS services.

  • CephObject - A node to scale out object gateway access. Contains OSDs and RGW services.

When storage management traffic increases, controller nodes can become overloaded. The following node roles support various configurations and distributions of Ceph control plane services across multiple nodes. Coordinate controller node role choices with the role choices for storage nodes to ensure that all of the intended control plane services are deployed.

  • Controller - The most common controller node configuration. Contains all normal control plane services, including Ceph MGR, MDS, MON, RBD, and RGW services.

  • ControllerStorageDashboard - A normal controller node plus a Grafana dashboard service. This node role adds a further network to isolate storage monitoring traffic from the storage back end.

  • ControllerStorageNFS - A normal controller node plus a Ganesha service as a CephFS to NFS gateway.

  • ControllerNoCeph - A normal controller, but without Ceph control plane services. This node role is selected when Ceph control plane services are moved to segregated nodes for increased performance and scaling.

The following node roles are not included by default in the RHOSP distribution, but are described in Red Hat online documentation. Use these roles to alleviate overloaded controller nodes by moving primary Ceph services to separate, dedicated nodes. These roles are commonly found in larger OpenStack installations with increased storage traffic requirements.

  • CephMon - A custom-created node role that moves only the MON service from the controllers to a separate node.

  • CephMDS - A custom-created node role that moves only the MDS service from the controllers to a separate node.

A Hyperconverged Infrastructure (HCI) node is a configuration with both compute and storage services and devices on the same node. This configuration can result in increased performance for heavy storage throughput applications. The default is the ComputeHCI role, which adds only OSDs to a compute node, effectively enlarging your dedicated Ceph cluster. Ceph control plane services remain on the controller nodes. The other node roles add various choices of control plane services to the hyperconverged node.

  • ComputeHCI - A compute node plus OSDs. These nodes have no Ceph control plane services.

  • HciCephAll - A compute node plus OSDs and all Ceph control plane services.

  • HciCephFile - A compute node plus OSDs and the MDS service. Used for scaling out file sharing storage capacity.

  • HciCephMon - A compute node plus OSDs and the MON and MGR services. Used for scaling out block storage capacity.

  • HciCephObject - A compute node plus OSDs and the RGW service. Used for scaling out object gateway access.

A Distributed Compute Node (DCN) is another form of hyperconverged node that is designed for use in remote data centers or branch offices that are part of the same OpenStack overcloud. For DCN, the overcloud deployment creates a dedicated Ceph cluster, with a minimum of three nodes, per remote site in addition to the dedicated Ceph cluster at the primary site. This architecture is not a stretch cluster configuration. Later DCN versions support installing the Image service (Glance) at the remote location for faster local image access.

  • DistributedComputeHCI - A DCN node with Ceph, Cinder, and Glance.

  • DistributedComputeHCIScaleOut - A DCN node with Ceph, Cinder, and HAProxy for Glance.

Implementing an External Red Hat Ceph Storage Cluster

RHOSP overcloud installations have an undercloud node, which is referred to as the Director node in Figure 13.1: A simple set of OpenStack services. TripleO installs the overcloud from the Director node. The default orchestration templates for TripleO services are in the /usr/share/openstack-tripleo-heat-templates directory on the undercloud. When deploying OpenStack integrated with Ceph, the undercloud node becomes the Ansible controller and cluster administration host.

Note

The following narrative provides a limited view of TripleO cloud deployment resources. Your organization's deployment will require further design effort, because every production overcloud has unique storage needs.

Because the default orchestration files are continuously being enhanced, you must not modify default template files in their original location. Instead, create a directory to store your custom environment files and parameter overrides. The following ceph-ansible-external.yaml environment file instructs TripleO to use the ceph-ansible client role to access a preexisting, external Ceph cluster. To override the default settings in this file, use a custom parameter file.

[stack@director ceph-ansible]$ cat ceph-ansible-external.yaml
resource_registry:
  OS::TripleO::Services::CephExternal: ../../deployment/ceph-ansible/ceph-external.yaml

parameter_defaults:
  # NOTE: These example parameters are required when using CephExternal
  #CephClusterFSID: '4b5c8c0a-ff60-454b-a1b4-9747aa737d19'
  #CephClientKey: 'AQDLOh1VgEp6FRAAFzT7Zw+Y9V6JJExQAsRnRQ=='
  #CephExternalMonHost: '172.16.1.7, 172.16.1.8'

  # the following parameters enable Ceph backends for Cinder, Glance, Gnocchi and Nova
  NovaEnableRbdBackend: true
  CinderEnableRbdBackend: true
  CinderBackupBackend: ceph
  GlanceBackend: rbd
  # Uncomment below if enabling legacy telemetry
  # GnocchiBackend: rbd
  # If the Ceph pools which host VMs, Volumes and Images do not match these
  # names OR the client keyring to use is not called 'openstack',  edit the
  # following as needed.
  NovaRbdPoolName: vms
  CinderRbdPoolName: volumes
  CinderBackupRbdPoolName: backups
  GlanceRbdPoolName: images
  # Uncomment below if enabling legacy telemetry
  # GnocchiRbdPoolName: metrics
  CephClientUserName: openstack

  # finally we disable the Cinder LVM backend
  CinderEnableIscsiBackend: false

A TripleO deployment specifies a list of environment files for all of the overcloud services to be deployed, with an openstack overcloud deploy command. Before deployment, the openstack tripleo container image prepare command is used to determine all of the services that are referenced in the configuration, and to prepare a list of the correct containers to download and provide for the overcloud deployment. During the installation, Kolla is used to configure and start each service container on the correct nodes, as defined by the node roles.

For this external Ceph cluster example, TripleO needs a parameter file that specifies the real cluster parameters, to override the parameter defaults in the ceph-ansible-external.yaml file. This example parameter-overrides.yaml file is placed in your custom deployment files directory. You can obtain the key from the result of an appropriate ceph auth add client.openstack command.
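One common alternative to the ceph auth add command mentioned above is ceph auth get-or-create, which creates the user if needed and also prints its key. The following is a sketch; the capabilities shown assume the default pool names used below:

# Run on the external Ceph cluster: create the CephX user and capture the
# values needed for CephClientKey and CephClusterFSID.
ceph auth get-or-create client.openstack \
  mon 'profile rbd' \
  osd 'profile rbd pool=vms, profile rbd pool=volumes, profile rbd pool=images, profile rbd pool=backups'
ceph fsid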

      parameter_defaults:
        # The cluster FSID
        CephClusterFSID: '4b5c8c0a-ff60-454b-a1b4-9747aa737d19'
        # The CephX user auth key
        CephClientKey: 'AQDLOh1VgEp6FRAAFzT7Zw+Y9V6JJExQAsRnRQ=='
        # The list of Ceph monitors
        CephExternalMonHost: '172.16.1.7, 172.16.1.8, 172.16.1.9'
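With both files in place, the deployment command references them alongside the other environment files for the overcloud. This is a sketch only; the default template path and the custom file location are assumptions:

# Pass the external-Ceph environment file and the parameter overrides;
# the remaining environment files for the overcloud are omitted here.
openstack overcloud deploy --templates \
  -e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible-external.yaml \
  -e /home/stack/templates/parameter-overrides.yaml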

TripleO relies on the Bare Metal service to prepare nodes before installing them as Ceph servers. Disk devices, both physical and virtual, must be cleaned of all partition tables and other artifacts. Otherwise, Ceph refuses to overwrite a device after determining that the device is in use. To delete all metadata from disks and create GPT labels, set the following parameter in the /home/stack/undercloud.conf file on the undercloud. The Bare Metal service boots the nodes and cleans the disks each time the node status is set to available for provisioning.

clean_nodes=true