After completing this section, you should be able to perform common cluster maintenance tasks, such as adding or removing MONs and OSDs, and recovering from various component failures.
Evaluate the potential performance impact before performing cluster maintenance activities. The following factors typically affect cluster performance when adding or removing OSD nodes:
Client load
If an OSD node has a pool that is experiencing high client loads, then performance and recovery time could be negatively affected. Because write operations require data replication for resiliency, write-intensive client loads increase cluster recovery time.
Node capacity
The capacity of the node being added or removed affects the cluster recovery time. The node's storage density also affects recovery times. For example, a node with 36 OSDs takes longer to recover than a node with 12 OSDs.
Spare cluster capacity
When removing nodes, verify that you have sufficient spare capacity to avoid reaching the full or near full ratios; the example commands after this list of factors show how to check current usage and the configured ratios. When a cluster reaches the full ratio, Ceph suspends write operations to prevent data loss.
CRUSH rules
A Ceph OSD node maps to at least one CRUSH hierarchy, and that hierarchy maps to at least one pool via a CRUSH rule. Each pool using a specific CRUSH hierarchy experiences a performance impact when adding and removing OSDs.
Pool types
Replication pools use more network bandwidth to replicate data copies, while erasure-coded pools use more CPU to calculate data and coding chunks.
The more data copies that exist, the longer it takes for the cluster to recover. For example, an erasure-coded pool with many chunks takes longer to recover than a replicated pool with fewer copies of the same data.
Node hardware
Nodes with higher throughput characteristics, such as 10 Gbps network interfaces and SSDs, recover more quickly than nodes with lower throughput characteristics, such as 1 Gbps network interfaces and SATA drives.
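Before starting maintenance, you can review several of these factors from the command line. The following checks are one possible pre-maintenance review; the exact output format depends on your release. Check overall usage against the full and near full ratios.
[ceph: root@node /]# ceph df
[ceph: root@node /]# ceph osd dump | grep ratio
Review per-node and per-OSD utilization across the CRUSH hierarchy.
[ceph: root@node /]# ceph osd df tree
List pool types, sizes, and the CRUSH rule that each pool uses, and view the current client and recovery I/O rates for each pool.
[ceph: root@node /]# ceph osd pool ls detail
[ceph: root@node /]# ceph osd pool stats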
Red Hat Ceph Storage is designed to be self-healing. When a storage device fails, extra data copies on other OSDs backfill automatically to recover the cluster to a healthy state.
When a storage device fails, the OSD status changes to down.
Other cluster issues, such as a network error, can also mark an OSD as down.
When an OSD is down, first verify if the physical device has failed.
To verify that the OSD has failed, perform the following steps.
View the cluster status and verify that an OSD has failed.
[ceph: root@node /]# ceph health detail
Identify the failed OSD.
[ceph: root@node /]# ceph osd tree | grep -i down
Locate the OSD node where the OSD is running.
[ceph: root@node /]# ceph osd find osd.OSD_ID
Attempt to start the failed OSD.
[ceph: root@node /]# ceph orch daemon start OSD_ID
If the OSD does not start, then the physical storage device might have failed.
Use the journalctl command to view the OSD logs or use the utilities available in your production environment to verify that the physical device has failed.
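For example, in a cephadm-based deployment, each OSD logs to the systemd journal on its host under a unit that combines the cluster FSID and the OSD ID. A possible invocation, run on the OSD host rather than inside the cephadm shell, where FSID is the value reported by the ceph fsid command:
[root@node ~]# journalctl -u ceph-FSID@osd.OSD_ID.service --since "1 hour ago"
Look for I/O errors or crash messages that indicate a failing device.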
If you have verified that the physical device needs replacement, perform the following steps.
Temporarily disable scrubbing.
[ceph: root@node /]# ceph osd set noscrub ; ceph osd set nodeep-scrub
Remove the OSD from the cluster.
[ceph: root@node /]# ceph osd out OSD_ID
Watch cluster events and verify that a backfill operation has started.
[ceph: root@node /]# ceph -w
Verify that the backfill process has moved all PGs off the OSD and it is now safe to remove.
[ceph: root@node /]# while ! ceph osd safe-to-destroy osd.OSD_ID ; \
do sleep 10 ; done
When the OSD is safe to remove, replace the physical storage device and destroy the OSD. Optionally, remove all data, file systems, and partitions from the device.
[ceph: root@node /]# ceph orch device zap HOST_NAME DEVICE_PATH --force
Find the current device path using the Dashboard GUI, or the ceph-volume lvm list or ceph osd metadata CLI commands.
Replace the OSD using the same ID as the one that failed. Verify that the operation has completed before continuing.
[ceph: root@node /]# ceph orch osd rm OSD_ID --replace
[ceph: root@node /]# ceph orch osd rm status
Replace the physical device and recreate the OSD. The new OSD uses the same OSD ID as the one that failed.
The device path of the new storage device might be different than the failed device.
Use the ceph orch device ls command to find the new device path.
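For example, list the devices on the OSD host and identify the replacement device that is reported as available; the --refresh option asks the orchestrator to rescan the host inventory:
[ceph: root@node /]# ceph orch device ls HOST_NAME --refresh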
[ceph: root@node /]# ceph orch daemon add osd HOST_NAME:DEVICE_PATH
Start the OSD and verify that the OSD is up.
[ceph: root@node /]# ceph orch daemon start OSD_ID
[ceph: root@node /]# ceph osd tree
Re-enable scrubbing.
[ceph: root@node /]# ceph osd unset noscrub ; ceph osd unset nodeep-scrub
Add a MON to your cluster by performing the following steps.
Verify the current MON count and placement.
[ceph: root@node /]# ceph orch ls --service_type=mon
Add a new host to the cluster.
[ceph: root@node /]# ceph cephadm get-pub-key > ~/ceph.pub
[ceph: root@node /]# ssh-copy-id -f -i ~/ceph.pub root@HOST_NAME
[ceph: root@node /]# ceph orch host add HOST_NAME
Specify the hosts where the MON nodes should run.
Specify all MON nodes when running this command. If you only specify the new MON node, then the command removes all other MONs, leaving the cluster with only one MON node.
[ceph: root@node /]# ceph orch apply mon --placement="NODE1 NODE2 NODE3 NODE4 ..."
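After the orchestrator applies the new placement, you can verify that the additional MON daemon is running and has joined the quorum, for example:
[ceph: root@node /]# ceph orch ls --service_type=mon
[ceph: root@node /]# ceph mon stat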
Use the ceph orch apply mon command to remove a MON from the cluster. Specify all MONs except the one that you want to remove.
[ceph: root@node /]# ceph orch apply mon --placement="NODE1 NODE2 NODE3 ..."
Use the ceph orch host maintenance command to place hosts in and out of maintenance mode.
Maintenance mode stops all Ceph daemons on the host.
Use the optional --force option to bypass warnings.
[ceph: root@node /]# ceph orch host maintenance enter HOST_NAME [--force]
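While the host is in maintenance mode, the orchestrator host listing reports its status, which is a quick way to confirm the state before starting work on the host:
[ceph: root@node /]# ceph orch host ls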
When finished with maintenance, exit maintenance mode.
[ceph: root@node /]# ceph orch host maintenance exit HOST_NAME
For more information, refer to the Red Hat Ceph Storage 5 Operations Guide at https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/5/html-single/operations_guide/index