Guided Exercise: Performing Cluster Administration and Monitoring

In this exercise, you will perform common administration operations on a Red Hat Ceph Storage cluster.

Outcomes

You should be able to administer and monitor the cluster, including starting and stopping specific services, analyzing placement groups, setting OSD primary affinity, verifying daemon versions, and querying cluster health and utilization.

As the student user on the workstation machine, use the lab command to prepare your system for this exercise.

[student@workstation ~]$ lab start cluster-admin

This command confirms that the hosts required for this exercise are accessible.

Procedure 11.1. Instructions

  1. Log in to clienta as the admin user and use sudo to run the cephadm shell.

    [student@workstation ~]$ ssh admin@clienta
    [admin@clienta ~]$ sudo cephadm shell
    [ceph: root@clienta /]#
  2. View the enabled MGR modules. Verify that the dashboard module is enabled.

    [ceph: root@clienta /]# ceph mgr module ls | more
    {
        "always_on_modules": [
            "balancer",
            "crash",
            "devicehealth",
            "orchestrator",
            "pg_autoscaler",
            "progress",
            "rbd_support",
            "status",
            "telemetry",
            "volumes"
        ],
        "enabled_modules": [
            "cephadm",
            "dashboard",
            "insights",
            "iostat",
            "prometheus",
            "restful"
        ],
        "disabled_modules": [
            {
                "name": "alerts",
                "can_run": true,
                "error_string": "",
                "module_options": {
    ...output omitted...
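    Only modules outside the always_on list can be toggled. As a sketch (the alerts module is just an example), the enable and disable subcommands follow the pattern in the comments below; the runnable part lists enabled module names from an abbreviated copy of the JSON output above:

```shell
# Toggling an MGR module (run inside the cephadm shell):
#   ceph mgr module enable alerts
#   ceph mgr module disable alerts
# Listing just the enabled module names from saved output:
echo '{"enabled_modules": ["cephadm", "dashboard", "prometheus"]}' |
  python3 -c 'import json, sys; print("\n".join(json.load(sys.stdin)["enabled_modules"]))'
```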
  3. Obtain the dashboard URL for the active MGR node.

    [ceph: root@clienta /]# ceph mgr services
    {
        "dashboard": "https://172.25.250.12:8443/",
        "prometheus": "http://172.25.250.12:9283/"
    }
  4. View the status of the Monitors on the Ceph Dashboard page.

    1. Using a web browser, navigate to the dashboard URL obtained in the previous step. Log in as the admin user with the redhat password.

    2. On the Dashboard page, click Monitors to view the status of the Monitor nodes and quorum.

  5. View the status of all OSDs in the cluster.

    [ceph: root@clienta /]# ceph osd stat
    9 osds: 9 up (since 38m), 9 in (since 38m); epoch: e294
  6. Find the location of the osd.2 daemon, stop the OSD, and view the cluster OSD status.

    1. Find the location of the osd.2 daemon.

      [ceph: root@clienta /]# ceph osd find 2
      {
          "osd": 2,
          "addrs": {
              "addrvec": [
                  {
                      "type": "v2",
                      "addr": "172.25.250.12:6808",
                      "nonce": 2361545815
                  },
                  {
                      "type": "v1",
                      "addr": "172.25.250.12:6809",
                      "nonce": 2361545815
                  }
              ]
          },
          "osd_fsid": "1163a19e-e580-40e0-918f-25fd94e97b86",
          "host": "serverc.lab.example.com",
          "crush_location": {
              "host": "serverc",
              "root": "default"
          }
      }
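      The host field in this JSON is what the next sub-step relies on. A sketch of extracting it with python3 (the JSON literal is an abbreviated copy of the output above):

```shell
# Pull the host that runs osd.2 out of saved `ceph osd find 2` output:
echo '{"osd": 2, "host": "serverc.lab.example.com"}' |
  python3 -c 'import json, sys; print(json.load(sys.stdin)["host"])'
```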
    2. Log in to the serverc node. Stop the osd.2 daemon.

      [ceph: root@clienta /]# ssh admin@serverc
      admin@serverc's password: redhat
      [admin@serverc ~]$ sudo systemctl list-units "ceph*"
      UNIT                                                        LOAD   ACTIVE SUB     DESCRIPTION
      ...output omitted...
      ceph-ff97a876-1fd2-11ec-8258-52540000fa0c@osd.0.service     loaded active running Ceph osd.0 for ff97a876-1fd2-11ec-8258-52540000fa0c
      ceph-ff97a876-1fd2-11ec-8258-52540000fa0c@osd.1.service     loaded active running Ceph osd.1 for ff97a876-1fd2-11ec-8258-52540000fa0c
      ceph-ff97a876-1fd2-11ec-8258-52540000fa0c@osd.2.service     loaded active running Ceph osd.2 for ff97a876-1fd2-11ec-8258-52540000fa0c
      ...output omitted...
      [admin@serverc ~]$ sudo systemctl stop \
      ceph-ff97a876-1fd2-11ec-8258-52540000fa0c@osd.2.service
    3. Exit the serverc node. View the cluster OSD status.

      [admin@serverc ~]$ exit
      [ceph: root@clienta /]# ceph osd stat
      9 osds: 8 up (since 24s), 9 in (since 45m); epoch: e296
  7. Start osd.2 on the serverc node, and then view the cluster OSD status.

    [ceph: root@clienta /]# ssh admin@serverc sudo systemctl start \
      ceph-ff97a876-1fd2-11ec-8258-52540000fa0c@osd.2.service
    admin@serverc's password: redhat
    [ceph: root@clienta /]# ceph osd stat
    9 osds: 9 up (since 6s), 9 in (since 47m); epoch: e298
  8. View the log files for the osd.2 daemon. Filter the output to view only systemd events.

    [ceph: root@clienta /]# ssh admin@serverc sudo journalctl \
      -u ceph-ff97a876-1fd2-11ec-8258-52540000fa0c@osd.2.service | grep systemd
    admin@serverc's password: redhat
    ...output omitted...
    Sep 30 01:57:36 serverc.lab.example.com systemd[1]: Stopping Ceph osd.2 for ff97a876-1fd2-11ec-8258-52540000fa0c...
    Sep 30 01:57:37 serverc.lab.example.com systemd[1]: ceph-ff97a876-1fd2-11ec-8258-52540000fa0c@osd.2.service: Succeeded.
    Sep 30 01:57:37 serverc.lab.example.com systemd[1]: Stopped Ceph osd.2 for ff97a876-1fd2-11ec-8258-52540000fa0c.
    Sep 30 02:00:12 serverc.lab.example.com systemd[1]: Starting Ceph osd.2 for ff97a876-1fd2-11ec-8258-52540000fa0c...
    Sep 30 02:00:13 serverc.lab.example.com systemd[1]: Started Ceph osd.2 for ff97a876-1fd2-11ec-8258-52540000fa0c.
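    The same grep filter works on any saved log snippet; on the host, journalctl also accepts standard options such as --since and -f (follow) to narrow or stream the output. Here the filter is applied to a two-line sample, keeping only the systemd event:

```shell
# Keep only the systemd-generated lines from a saved log sample:
grep systemd <<'EOF'
Sep 30 01:57:36 serverc.lab.example.com systemd[1]: Stopping Ceph osd.2...
Sep 30 01:57:37 serverc.lab.example.com bash[4077]: osd.2 shutdown complete
EOF
```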
  9. Mark the osd.4 daemon as being out of the cluster and observe how it affects the cluster status. Then, mark the osd.4 daemon as being in the cluster again.

    1. Mark the osd.4 daemon as being out of the cluster. Verify that the osd.4 daemon is marked out of the cluster and notice that the OSD's weight is now 0.

      [ceph: root@clienta /]# ceph osd out 4
      marked out osd.4.
      [ceph: root@clienta /]# ceph osd stat
      9 osds: 9 up (since 2m), 8 in (since 3s); epoch: e312
      [ceph: root@clienta /]# ceph osd tree
      ID  CLASS  WEIGHT   TYPE NAME         STATUS  REWEIGHT  PRI-AFF
      -1         0.08817  root default
      -3         0.02939      host serverc
       0    hdd  0.00980          osd.0         up   1.00000  1.00000
       1    hdd  0.00980          osd.1         up   1.00000  1.00000
       2    hdd  0.00980          osd.2         up   1.00000  1.00000
      -7         0.02939      host serverd
       3    hdd  0.00980          osd.3         up   1.00000  1.00000
       5    hdd  0.00980          osd.5         up   1.00000  1.00000
       7    hdd  0.00980          osd.7         up   1.00000  1.00000
      -5         0.02939      host servere
       4    hdd  0.00980          osd.4         up         0  1.00000
       6    hdd  0.00980          osd.6         up   1.00000  1.00000
       8    hdd  0.00980          osd.8         up   1.00000  1.00000

      Note

      Ceph re-creates, on other OSDs, the replicas of the objects that were stored on the osd.4 daemon. You can trace the recovery of these objects with the ceph status or ceph -w commands.

    2. Mark the osd.4 daemon as being in again.

      [ceph: root@clienta /]# ceph osd in 4
      marked in osd.4.

      Note

      You can mark an OSD out even while it is still running (up). The in or out status is independent of the OSD's up or down running state.
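      The Outcomes also mention setting OSD primary affinity, which this exercise does not otherwise demonstrate. As a sketch, the ceph osd primary-affinity command takes an OSD id and a value between 0 and 1 (the 0.5 below is an arbitrary example), and the new value then shows in the PRI-AFF column of ceph osd tree; the runnable part only parses a saved tree line:

```shell
# Lower the probability that osd.4 is selected as a PG's primary OSD:
#   ceph osd primary-affinity 4 0.5
# The change appears in the PRI-AFF column of `ceph osd tree`; here awk
# pulls NAME, STATUS, REWEIGHT, and PRI-AFF from a saved tree line:
echo ' 4    hdd  0.00980          osd.4         up   1.00000  0.50000' |
  awk '{print $4, $5, $6, $7}'
```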

  10. Analyze the current utilization and the number of PGs on the osd.2 daemon.

    [ceph: root@clienta /]# ceph osd df tree
    ID  CLASS  WEIGHT   REWEIGHT  SIZE    RAW USE  DATA     OMAP    META     AVAIL   %USE  VAR   PGS  STATUS  TYPE NAME
    -1         0.08817         -  90 GiB  256 MiB   36 MiB  56 KiB  220 MiB  90 GiB  0.28  1.00    -          root default
    -3         0.02939         -  30 GiB   71 MiB   12 MiB  20 KiB   59 MiB  30 GiB  0.23  0.83    -              host serverc
     0    hdd  0.00980   1.00000  10 GiB   26 MiB  4.0 MiB  11 KiB   22 MiB  10 GiB  0.25  0.91   68      up          osd.0
     1    hdd  0.00980   1.00000  10 GiB   29 MiB  4.0 MiB   6 KiB   25 MiB  10 GiB  0.28  1.01   74      up          osd.1
     2    hdd  0.00980   1.00000  10 GiB   16 MiB  3.9 MiB   3 KiB   12 MiB  10 GiB  0.16  0.57   59      up          osd.2
    ...output omitted...
                           TOTAL  90 GiB  256 MiB   36 MiB  61 KiB  220 MiB  90 GiB  0.28
    MIN/MAX VAR: 0.57/1.48  STDDEV: 0.06
  11. View the placement group status for the cluster. Create a test pool and a test object. Find the placement group to which the test object belongs and analyze that placement group's status.

    1. View the placement group status for the cluster and examine the PG states. Your output might differ in your lab environment.

      [ceph: root@clienta /]# ceph pg stat
      201 pgs: 201 active+clean; 8.6 KiB data, 261 MiB used, 90 GiB / 90 GiB avail; 511 B/s rd, 0 op/s
    2. Create a pool called testpool and an object called testobject containing the /etc/ceph/ceph.conf file.

      [ceph: root@clienta /]# ceph osd pool create testpool 32 32
      pool 'testpool' created
      [ceph: root@clienta /]# rados -p testpool put testobject /etc/ceph/ceph.conf
    3. Find the placement group of the testobject object in the testpool pool and analyze its status. Use the placement group information from your lab environment in the query.

      [ceph: root@clienta /]# ceph osd map testpool testobject
      osdmap e332 pool 'testpool' (9) object 'testobject' -> pg 9.98824931 (9.11) -> up ([8,2,5], p8) acting ([8,2,5], p8)
      [ceph: root@clienta /]# ceph pg 9.11 query
      {
          "snap_trimq": "[]",
          "snap_trimq_len": 0,
          "state": "active+clean",
          "epoch": 334,
          "up": [
              8,
              2,
              5
          ],
          "acting": [
              8,
              2,
              5
          ],
          "acting_recovery_backfill": [
              "2",
              "5",
              "8"
          ],
          "info": {
              "pgid": "9.11",
      ...output omitted...
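      The pgid passed to ceph pg query is the parenthesized value after the raw hash in the ceph osd map output (9.11 above). A sketch of extracting it with sed from a saved copy of that line, so it can be fed into the query without retyping:

```shell
# Extract the PG id (the value in parentheses after "pg") from a saved
# `ceph osd map` line:
line="osdmap e332 pool 'testpool' (9) object 'testobject' -> pg 9.98824931 (9.11) -> up ([8,2,5], p8) acting ([8,2,5], p8)"
pgid=$(echo "$line" | sed -n 's/.*pg [0-9a-f.]* (\([0-9a-f.]*\)).*/\1/p')
echo "$pgid"
```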
  12. List the OSD and cluster daemon versions. These commands are useful after a cluster upgrade to verify that all daemons are running the expected versions.

    1. List all cluster daemon versions.

      [ceph: root@clienta /]# ceph versions
      {
          "mon": {
              "ceph version 16.2.0-117.el8cp (0e34bb74700060ebfaa22d99b7d2cdc037b28a57) pacific (stable)": 4
          },
          "mgr": {
              "ceph version 16.2.0-117.el8cp (0e34bb74700060ebfaa22d99b7d2cdc037b28a57) pacific (stable)": 4
          },
          "osd": {
              "ceph version 16.2.0-117.el8cp (0e34bb74700060ebfaa22d99b7d2cdc037b28a57) pacific (stable)": 9
          },
          "mds": {
              "ceph version 16.2.0-117.el8cp (0e34bb74700060ebfaa22d99b7d2cdc037b28a57) pacific (stable)": 3
          },
          "rgw": {
              "ceph version 16.2.0-117.el8cp (0e34bb74700060ebfaa22d99b7d2cdc037b28a57) pacific (stable)": 2
          },
          "overall": {
              "ceph version 16.2.0-117.el8cp (0e34bb74700060ebfaa22d99b7d2cdc037b28a57) pacific (stable)": 22
          }
      }
    2. List all OSD versions.

      [ceph: root@clienta /]# ceph tell osd.* version
      osd.0: {
          "version": "16.2.0-117.el8cp",
          "release": "pacific",
          "release_type": "stable"
      }
      osd.1: {
          "version": "16.2.0-117.el8cp",
          "release": "pacific",
          "release_type": "stable"
      }
      osd.2: {
          "version": "16.2.0-117.el8cp",
          "release": "pacific",
          "release_type": "stable"
      }
      ...output omitted...
  13. View the balancer status.

    [ceph: root@clienta /]# ceph balancer status
    {
        "active": true,
        "last_optimize_duration": "0:00:00.001072",
        "last_optimize_started": "Thu Sep 30 06:07:53 2021",
        "mode": "upmap",
        "optimize_result": "Unable to find further optimization, or pool(s) pg_num is decreasing, or distribution is already perfect",
        "plans": []
    }
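    The balancer can be switched off or moved between modes with the ceph balancer subcommands sketched in the comments below; the runnable part reads the active flag and mode from an abbreviated copy of the status output above:

```shell
# Toggling the balancer and changing its mode:
#   ceph balancer off
#   ceph balancer mode crush-compat
#   ceph balancer on
# Reading the active flag and mode from saved status JSON:
echo '{"active": true, "mode": "upmap"}' |
  python3 -c 'import json, sys; d = json.load(sys.stdin); print(d["mode"] if d["active"] else "inactive")'
```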
  14. Return to workstation as the student user.

    [ceph: root@clienta /]# exit
    [admin@clienta ~]$ exit
    [student@workstation ~]$

Finish

On the workstation machine, use the lab command to complete this exercise. This is important to ensure that resources from previous exercises do not impact upcoming exercises.

[student@workstation ~]$ lab finish cluster-admin

This concludes the guided exercise.

Revision: cl260-5.0-29d2128