
Managing the OSD Map

Objectives

After completing this section, you should be able to describe the purpose of the OSD map and how to view and modify it.

Describing the OSD Map

The cluster OSD map contains the address and status of each OSD, the list of pools and their details, and other information such as the OSD near-capacity limits. Ceph uses these capacity limits to send warnings and to stop accepting write requests when an OSD reaches full capacity.
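
The near-capacity thresholds appear in the OSD map as nearfull_ratio, backfillfull_ratio, and full_ratio, as shown in the dump later in this section. The following is a minimal sketch of viewing and adjusting them; the 0.80 and 0.85 values are only illustrative, and you should choose thresholds appropriate for your cluster:

[ceph: root@serverc /]# ceph osd dump | grep ratio           # display the current thresholds
[ceph: root@serverc /]# ceph osd set-nearfull-ratio 0.80     # warn earlier (illustrative value)
[ceph: root@serverc /]# ceph osd set-backfillfull-ratio 0.85 # illustrative value
[ceph: root@serverc /]# ceph osd set-full-ratio 0.95         # stop accepting writes at this usage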

When a change occurs in the cluster's infrastructure, such as OSDs joining or leaving the cluster, the MONs update the corresponding map accordingly. The MONs maintain a history of map revisions. Ceph identifies each version of each map with an ordered, incrementing integer called an epoch.
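
Because the MONs retain a history of map revisions, you can dump a previous OSD map revision by passing its epoch number to ceph osd dump. This is a short sketch; epoch 477 is an illustrative value and must fall within the history that the MONs still retain:

[ceph: root@serverc /]# ceph osd dump 477    # dump a specific, earlier epoch of the OSD map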

The ceph status -f json-pretty command displays the epoch of each map. Use the dump subcommand of each map type to display an individual map, for example ceph osd dump.

[ceph: root@serverc /]# ceph status -f json-pretty
...output omitted...
    "osdmap": {
        "epoch": 478,
        "num_osds": 15,
        "num_up_osds": 15,
        "osd_up_since": 1632743988,
        "num_in_osds": 15,
        "osd_in_since": 1631712883,
        "num_remapped_pgs": 0
    },
...output omitted...
[ceph: root@serverc /]# ceph osd dump
epoch 478
fsid 11839bde-156b-11ec-bb71-52540000fa0c
created 2021-09-14T14:50:39.401260+0000
modified 2021-09-27T12:04:26.832212+0000
flags sortbitwise,recovery_deletes,purged_snapdirs,pglog_hardlimit
crush_version 69
full_ratio 0.95
backfillfull_ratio 0.9
nearfull_ratio 0.85
require_min_compat_client luminous
min_compat_client luminous
require_osd_release pacific
stretch_mode_enabled false
pool 1 'device_health_metrics' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 475 flags hashpspool stripe_width 0 pg_num_min 1 application mgr_devicehealth
...output omitted...
osd.0 up   in  weight 1 up_from 471 up_thru 471 down_at 470 last_clean_interval [457,466) [v2:172.25.250.12:6801/1228351148,v1:172.25.250.12:6802/1228351148] [v2:172.25.249.12:6803/1228351148,v1:172.25.249.12:6804/1228351148] exists,up cfe311b0-dea9-4c0c-a1ea-42aaac4cb160
...output omitted...

Analyzing OSD Map Updates

Ceph updates the OSD map every time an OSD joins or leaves the cluster. An OSD can leave the cluster because its daemon fails or because of an underlying hardware failure.
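
You can watch the epoch increment when cluster membership changes. This sketch assumes a lab cluster and that osd.5 exists; marking an OSD out triggers data movement, so avoid running it on a production cluster:

[ceph: root@serverc /]# ceph osd dump | head -1    # note the current epoch
[ceph: root@serverc /]# ceph osd out 5             # osd.5 leaves the set of in OSDs
[ceph: root@serverc /]# ceph osd dump | head -1    # the epoch has increased
[ceph: root@serverc /]# ceph osd in 5              # bring osd.5 back in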

Even though the cluster map as a whole is maintained by the MONs, OSDs do not use a leader to manage the OSD map; they propagate the map among themselves. OSDs tag every message they exchange with the OSD map epoch. When an OSD detects that it is lagging behind, it performs a map update with its peer OSD.

In large clusters, where OSD map updates are frequent, it is not practical to always distribute the full map. Instead, receiving OSD nodes perform incremental map updates.

Ceph also tags the messages between OSDs and clients with the epoch. Whenever a client connects to an OSD, the OSD inspects the epoch. If the epochs do not match, then the OSD responds with the correct increment so that the client can update its OSD map. This removes the need for aggressive propagation, because clients learn about map updates the next time they contact the cluster.
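
You can inspect the range of map epochs that a given OSD daemon currently holds through its admin socket. This is a sketch that assumes you run the command on the node hosting osd.0; the daemon status output includes oldest_map and newest_map fields:

[ceph: root@serverc /]# ceph daemon osd.0 status    # reports oldest_map and newest_map epochs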

Updating Cluster Maps with Paxos

To access a Ceph cluster, a client first retrieves a copy of the cluster map from the MONs. All MONs must have the same cluster map for the cluster to function correctly.

MONs use the Paxos algorithm as a mechanism to ensure that they agree on the cluster state. Paxos is a distributed consensus algorithm. Every time a MON modifies a map, it sends the update to the other monitors through Paxos. Ceph only commits the new version of the map after a majority of monitors agree on the update.

The MON submits a map update to Paxos and only writes the new version to the local key-value store after Paxos acknowledges the update. The read operations directly access the key-value store.
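
To confirm that the monitors have formed a quorum, and to see which monitor currently acts as the leader, you can query the quorum status:

[ceph: root@serverc /]# ceph mon stat                        # compact summary of monitors and quorum
[ceph: root@serverc /]# ceph quorum_status -f json-pretty    # detailed quorum membership and leader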

Figure 5.2: Cluster map consistency using Paxos

Propagating the OSD Map

OSDs regularly report their status to the monitors. In addition, OSDs exchange heartbeats so that an OSD can detect the failure of a peer and report that event to the monitors.

When a leader monitor learns of an OSD failure, it updates the map, increments the epoch, and uses the Paxos update protocol to notify the other monitors, at the same time revoking their leases. After a majority of monitors acknowledge the update, and the cluster has a quorum, the leader monitor issues a new lease so that the monitors can distribute the updated OSD map. This method ensures that the map epoch never moves backward anywhere in the cluster and that no stale leases remain valid.
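
The reporting and failure-detection behavior is controlled by configuration options such as osd_heartbeat_interval and mon_osd_min_down_reporters. The following is a brief sketch of inspecting them with the centralized configuration database; verify the option names against your release:

[ceph: root@serverc /]# ceph config get osd osd_heartbeat_interval      # seconds between OSD heartbeats
[ceph: root@serverc /]# ceph config get mon mon_osd_min_down_reporters  # reporters required to mark an OSD down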

OSD Map Commands

Use the following commands to manage the OSD map as an administrator:

Command                                           Action
ceph osd dump                                     Dump the OSD map to standard output.
ceph osd getmap -o binfile                        Export a binary copy of the current map.
osdmaptool --print binfile                        Display a human-readable copy of the map to standard output.
osdmaptool --export-crush crushbinfile binfile    Extract the CRUSH map from the OSD map.
osdmaptool --import-crush crushbinfile binfile    Embed a new CRUSH map into the OSD map.
osdmaptool --test-map-pg pgid binfile             Verify the mapping of a given PG.
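
The following sketch chains several of these commands: it exports the current OSD map to a binary file, prints a human-readable copy, and tests the mapping of one PG. The file name /tmp/osdmap.bin and the PG ID 1.0 are illustrative:

[ceph: root@serverc /]# ceph osd getmap -o /tmp/osdmap.bin             # export the current map
[ceph: root@serverc /]# osdmaptool --print /tmp/osdmap.bin             # display it in human-readable form
[ceph: root@serverc /]# osdmaptool --test-map-pg 1.0 /tmp/osdmap.bin   # show which OSDs serve PG 1.0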

 

References

For more information, refer to the Red Hat Ceph Storage 5 Storage Strategies Guide at https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/5/html-single/storage_strategies_guide
