Guided Exercise: Optimizing Red Hat Ceph Storage Performance

In this exercise, you will run performance analysis tools and configure the Red Hat Ceph Storage cluster using the results.

Outcomes

You should be able to run performance analysis tools and configure the Red Hat Ceph Storage cluster using the results.

Important

Do you need to reset your environment before performing this exercise?

If you performed the practice exercises in the Managing a Red Hat Ceph Storage Cluster chapter, but have not reset your environment to the default classroom cluster since that chapter, then you must reset your environment before executing the lab start command. All remaining chapters use the default Ceph cluster provided in the initial classroom environment.

As the student user on the workstation machine, use the lab command to prepare your system for this exercise.

This command ensures that the lab environment is available for the exercise.

[student@workstation ~]$ lab start tuning-optimize

Procedure 12.1. Instructions

  • Create a new pool called testpool and change the PG autoscale mode to off. Reduce the number of PGs, and then check the recommended number of PGs. Change the PG autoscale mode to warn and check the health warning message.

  • Modify the primary affinity settings on an OSD so that it is less likely to be selected as primary for placement groups.

  • Using rados bench, the benchmarking tool built into Ceph, measure the performance of the Ceph cluster at the pool level.

  • The clienta node is set up as your admin node server.

  • The admin user has SSH key-based access from the clienta node to the admin account on all cluster nodes, and has passwordless sudo access to the root and ceph accounts on all cluster nodes.

  • The serverc, serverd, and servere nodes comprise an operational 3-node Ceph cluster. All three nodes operate as a MON, a MGR, and an OSD host with three 10 GB collocated OSDs.

Warning

The parameters used in this exercise are appropriate for this lab environment. In production, these parameters should only be modified by qualified Ceph administrators, or as directed by Red Hat Support.

  1. Log in to clienta as the admin user. Create a new pool called testpool, set the PG autoscale mode to off, reduce the number of PGs, and then set the mode to warn and view the health warning messages. Set the PG autoscale mode to on again, and then verify that the number of PGs and the cluster health return to normal.

    1. Connect to clienta as the admin user and use sudo to run the cephadm shell.

      [student@workstation ~]$ ssh admin@clienta
      [admin@clienta ~]$ sudo cephadm shell
      [ceph: root@clienta /]#
    2. Create a new pool called testpool with the default number of PGs.

      [ceph: root@clienta /]# ceph osd pool create testpool
      pool 'testpool' created
    3. Verify the cluster health status and the information from the PG autoscaler. The autoscaler mode for the new testpool pool should be on, and the number of PGs should be 32.

      [ceph: root@clienta /]# ceph health detail
      HEALTH_OK
      [ceph: root@clienta /]# ceph osd pool autoscale-status
      POOL                     SIZE  TARGET SIZE  RATE  RAW CAPACITY   RATIO  TARGET RATIO  EFFECTIVE RATIO  BIAS  PG_NUM  NEW PG_NUM  AUTOSCALE
      device_health_metrics      0                 3.0        92124M  0.0000                                  1.0       1              on
      .rgw.root               1323                 3.0        92124M  0.0000                                  1.0      32              on
      default.rgw.log         3702                 3.0        92124M  0.0000                                  1.0      32              on
      default.rgw.control        0                 3.0        92124M  0.0000                                  1.0      32              on
      default.rgw.meta           0                 3.0        92124M  0.0000                                  4.0       8              on
      testpool                   0                 3.0        92124M  0.0000                                  1.0      32              on
    4. Set the PG autoscale option to off for the pool testpool. Reduce the number of PGs to 8. Verify the autoscale recommended number of PGs, which should be 32. Verify that the cluster health is OK.

      [ceph: root@clienta /]# ceph osd pool set testpool pg_autoscale_mode off
      set pool 6 pg_autoscale_mode to off
      [ceph: root@clienta /]# ceph osd pool set testpool pg_num 8
      set pool 6 pg_num to 8
      [ceph: root@clienta /]# ceph osd pool autoscale-status
      POOL                     SIZE  TARGET SIZE  RATE  RAW CAPACITY   RATIO  TARGET RATIO  EFFECTIVE RATIO  BIAS  PG_NUM  NEW PG_NUM  AUTOSCALE
      device_health_metrics      0                 3.0        92124M  0.0000                                  1.0       1              on
      .rgw.root               1323                 3.0        92124M  0.0000                                  1.0      32              on
      default.rgw.log         3702                 3.0        92124M  0.0000                                  1.0      32              on
      default.rgw.control        0                 3.0        92124M  0.0000                                  1.0      32              on
      default.rgw.meta           0                 3.0        92124M  0.0000                                  4.0       8              on
      testpool                   0                 3.0        92124M  0.0000                                  1.0       8          32  off
      [ceph: root@clienta /]# ceph health detail
      HEALTH_OK
    5. Set the PG autoscale option to warn for the pool testpool. Verify that cluster health status is now WARN, because the recommended number of PGs is higher than the current number of PGs. It might take several minutes before the cluster shows the health warning message.

      [ceph: root@clienta /]# ceph osd pool set testpool pg_autoscale_mode warn
      set pool 6 pg_autoscale_mode to warn
      [ceph: root@clienta /]# ceph health detail
      HEALTH_WARN 1 pools have too few placement groups
      [WRN] POOL_TOO_FEW_PGS: 1 pools have too few placement groups
          Pool testpool has 8 placement groups, should have 32
    6. Enable the PG autoscale option and verify that the number of PGs has been increased automatically to 32, the recommended value. This increase might take a few minutes to display.

      [ceph: root@clienta /]# ceph osd pool set testpool pg_autoscale_mode on
      set pool 6 pg_autoscale_mode to on
      [ceph: root@clienta /]# ceph osd pool autoscale-status
      POOL                     SIZE  TARGET SIZE  RATE  RAW CAPACITY   RATIO  TARGET RATIO  EFFECTIVE RATIO  BIAS  PG_NUM  NEW PG_NUM  AUTOSCALE
      device_health_metrics      0                 3.0        92124M  0.0000                                  1.0       1              on
      .rgw.root               1323                 3.0        92124M  0.0000                                  1.0      32              on
      default.rgw.log         3702                 3.0        92124M  0.0000                                  1.0      32              on
      default.rgw.control        0                 3.0        92124M  0.0000                                  1.0      32              on
      default.rgw.meta           0                 3.0        92124M  0.0000                                  4.0       8              on
      testpool                   0                 3.0        92124M  0.0000                                  1.0      32              on
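
    For reference only, and not part of this exercise: the TARGET SIZE and TARGET RATIO columns in the autoscale-status output come from optional per-pool hints. If you expect a pool to grow to a known size, or to consume a known fraction of the cluster, you can give the autoscaler that information so it chooses a PG count ahead of time. A minimal sketch, setting one hint or the other on testpool:

      [ceph: root@clienta /]# ceph osd pool set testpool target_size_bytes 10G
      [ceph: root@clienta /]# ceph osd pool set testpool target_size_ratio 0.2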
  2. Modify the primary affinity settings on an OSD so that it is less likely to be selected as primary for placement groups. Set the primary affinity for OSD 7 to 0.

    1. Modify the primary affinity settings for OSD 7.

      [ceph: root@clienta /]# ceph osd primary-affinity 7 0
      set osd.7 primary-affinity to 0 (802)
    2. Verify the primary affinity settings for each OSD.

      [ceph: root@clienta /]# ceph osd tree
      ID  CLASS  WEIGHT   TYPE NAME         STATUS  REWEIGHT  PRI-AFF
      -1         0.08817  root default
      -3         0.02939      host serverc
       0    hdd  0.00980          osd.0         up   1.00000  1.00000
       1    hdd  0.00980          osd.1         up   1.00000  1.00000
       2    hdd  0.00980          osd.2         up   1.00000  1.00000
      -5         0.02939      host serverd
       3    hdd  0.00980          osd.3         up   1.00000  1.00000
       5    hdd  0.00980          osd.5         up   1.00000  1.00000
       7    hdd  0.00980          osd.7         up   1.00000        0
      -7         0.02939      host servere
       4    hdd  0.00980          osd.4         up   1.00000  1.00000
       6    hdd  0.00980          osd.6         up   1.00000  1.00000
       8    hdd  0.00980          osd.8         up   1.00000  1.00000
    3. Verify the primary affinity settings for OSDs in the cluster.

      [ceph: root@clienta /]# ceph osd dump | grep affinity
      osd.7 up   in  weight 1 primary_affinity 0 up_from 45 up_thru 92 down_at 0 last_clean_interval [0,0) [v2:172.25.250.13:6816/3402621793,v1:172.25.250.13:6817/3402621793] [v2:172.25.249.13:6818/3402621793,v1:172.25.249.13:6819/3402621793] exists,up ebc2280d-1321-458d-a161-2250d2b4f32e
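
    For reference: primary affinity accepts values from 0.0 to 1.0. A value of 0 means the OSD is not chosen as primary when any other candidate is available, 1.0 (the default) applies no penalty, and intermediate values reduce, rather than remove, the chance of selection. A minimal sketch, not part of this exercise:

      [ceph: root@clienta /]# ceph osd primary-affinity 7 0.5   # less likely, but still possible
      [ceph: root@clienta /]# ceph osd primary-affinity 7 1.0   # restore the default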
  3. Create a pool called benchpool and initialize it as an RBD pool.

    1. Create an OSD pool called benchpool.

      [ceph: root@clienta /]# ceph osd pool create benchpool 100 100
      pool 'benchpool' created
    2. Use the rbd pool init command to initialize a custom pool to store RBD images. This step could take several minutes to complete.

      [ceph: root@clienta /]# rbd pool init benchpool
  4. Open a second terminal and log in to the clienta node as the admin user. Use the first terminal to generate a workload and use the second terminal to collect metrics. Run a write test to the RBD pool benchpool. This might take several minutes to complete.

    Note

    This step requires sufficient time to complete the write OPS for the test. Be prepared to run the ceph osd perf command in the second terminal immediately after starting the rados bench command in the first terminal.

    1. Open a second terminal. Log in to clienta as the admin user and use sudo to run the cephadm shell.

      [student@workstation ~]$ ssh admin@clienta
      [admin@clienta ~]$ sudo cephadm shell
      [ceph: root@clienta /]#
    2. In the first terminal, generate the workload.

      [ceph: root@clienta /]# rados -p benchpool bench 30 write
      hints = 1
      Maintaining 16 concurrent writes of 4194304 bytes to objects of size 4194304 for up to 30 seconds or 0 objects
      Object prefix: benchmark_data_clienta.lab.example.com_50
        sec Cur ops   started  finished  avg MB/s  cur MB/s last lat(s)  avg lat(s)
          0       0         0         0         0         0           -           0
          1      16        58        42   167.988       168    0.211943    0.322053
          2      16       112        96   191.982       216    0.122236    0.288171
          3      16       162       146   194.643       200    0.279456    0.300593
          4      16       217       201   200.975       220    0.385703    0.292009
      ...output omitted...
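
      For later reference, and not part of this exercise: rados bench removes the objects that it wrote when the write test finishes. To follow a write test with sequential or random read tests, which need those objects to still exist, you could keep them with --no-cleanup and remove them afterwards with the cleanup subcommand:

      [ceph: root@clienta /]# rados -p benchpool bench 30 write --no-cleanup
      [ceph: root@clienta /]# rados -p benchpool bench 30 seq
      [ceph: root@clienta /]# rados -p benchpool bench 30 rand
      [ceph: root@clienta /]# rados -p benchpool cleanup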
    3. In the second terminal, collect performance metrics. The commit_latency data is the time for the OSD to write and commit the operation to its journal. The apply_latency data is the time to apply the write operation to the OSD file system back end. Note the OSD ID where the heavy load is occurring. Your OSD output might be different in your lab environment.

      [ceph: root@clienta /]# ceph osd perf
      osd  commit_latency(ms)  apply_latency(ms)
        7                  94                 94
        8                 117                117
        6                 195                195
        1                  73                 73
        0                  72                 72
        2                  80                 80
        3                  72                 72
        4                 135                135
        5                  59                 59

      Note

      If no data displays, then use the first terminal to generate the workload again. The metric collection must run while the bench tool is generating workload.

    4. In the second terminal, use the OSD ID that you noted in the previous step to locate the node that hosts the OSD with high latency. Determine the name of that node.

      [ceph: root@clienta /]# ceph osd tree
      ID  CLASS  WEIGHT   TYPE NAME         STATUS  REWEIGHT  PRI-AFF
      -1         0.08817  root default
      -3         0.02939      host serverc
       0    hdd  0.00980          osd.0         up   1.00000  1.00000
       1    hdd  0.00980          osd.1         up   1.00000  1.00000
       2    hdd  0.00980          osd.2         up   1.00000  1.00000
      -5         0.02939      host serverd
       3    hdd  0.00980          osd.3         up   1.00000  1.00000
       5    hdd  0.00980          osd.5         up   1.00000  1.00000
       7    hdd  0.00980          osd.7         up   1.00000        0
      -7         0.02939      host servere
       4    hdd  0.00980          osd.4         up   1.00000  1.00000
       6    hdd  0.00980          osd.6         up   1.00000  1.00000
       8    hdd  0.00980          osd.8         up   1.00000  1.00000
  5. Evaluate the OSD performance counters.

    1. Dump the performance counters for the OSD that showed high latency (osd.6 in this example). Redirect the output of the command to a file called perfdump.txt.

      [ceph: root@clienta /]# ceph tell osd.6 perf dump > perfdump.txt
    2. In the perfdump.txt file, locate the section starting with osd:. Note the op_latency and subop_latency counters, which report the latency of read and write operations and of their suboperations. Also note the op_r_latency and op_w_latency counters, which report the read and write latencies separately.

      Each counter includes avgcount and sum fields that are required to calculate the exact counter value. Calculate the value of the op_latency and subop_latency counters by using the formula counter = counter.sum / counter.avgcount, as in the worked example after the output below.

      [ceph: root@clienta /]# cat perfdump.txt | grep -A88 '"osd"'
          "osd": {
              "op_wip": 0,
              "op": 3664,
              "op_in_bytes": 994050158,
              "op_out_bytes": 985,
              "op_latency": {
                  "avgcount": 3664,
                  "sum": 73.819483299,
                  "avgtime": 0.020147238
              },
      ...output omitted...
              "op_r_latency": {
                  "avgcount": 3059,
                  "sum": 1.395967825,
                  "avgtime": 0.000456347
              },
      ...output omitted...
              "op_w_latency": {
                  "avgcount": 480,
                  "sum": 71.668254827,
                  "avgtime": 0.149308864
              },
      ...output omitted...
              "op_rw_latency": {
                  "avgcount": 125,
                  "sum": 0.755260647,
                  "avgtime": 0.006042085
              },
      ...output omitted...
              "subop_latency": {
                  "avgcount": 1587,
                  "sum": 59.679174303,
                  "avgtime": 0.037605024
              },
      ...output omitted...
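
      For example, applying the formula to the op_latency counter shown above gives 73.819483299 / 3664 ≈ 0.0201 seconds, which matches the avgtime field that Ceph reports alongside each counter.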
    3. In the first terminal, generate the workload again by using the rados bench write command.

      [ceph: root@clienta /]# rados -p benchpool bench 30 write
      ...output omitted...
    4. In the second terminal, capture the counters again and compute the latency over the interval between the two captures by using the following formulas (see the sketch after the note below):

      • op_latency_sum_t2 - op_latency_sum_t1 = diff_sum

      • op_latency_avgcount_t2 - op_latency_avgcount_t1 = diff_avgcount

      • op_latency = diff_sum / diff_avgcount

      [ceph: root@clienta /]# ceph tell osd.6 perf dump > perfdump.txt
      [ceph: root@clienta /]# cat perfdump.txt | grep -A88 '"osd"'
      ...output omitted...

      Note

      The counters are cumulative totals, accumulated since the OSD daemon started, and each run of the command returns the totals at that moment.
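
      A minimal sketch of that calculation, assuming python3 is available inside the cephadm shell container and that the two captures are kept in separate files, here perfdump.txt and an illustrative second file named perfdump2.txt:

      [ceph: root@clienta /]# ceph tell osd.6 perf dump > perfdump2.txt
      [ceph: root@clienta /]# python3 -c 'import json; t1=json.load(open("perfdump.txt"))["osd"]["op_latency"]; t2=json.load(open("perfdump2.txt"))["osd"]["op_latency"]; d=t2["avgcount"]-t1["avgcount"]; print((t2["sum"]-t1["sum"])/d if d else 0.0)'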

  6. View information about the last operations processed by an OSD.

    1. In the second terminal, dump the information maintained in memory for the most recently processed operations. Redirect the dump to the historicdump.txt file. By default, each OSD records information on the last 20 operations over 600 seconds. View the historicdump.txt file contents.

      [ceph: root@clienta /]# ceph tell osd.6 dump_historic_ops > historicdump.txt
      [ceph: root@clienta /]# head historicdump.txt
      {
          "size": 20,
          "duration": 600,
          "ops": [
              {
                  "description": "osd_op(client.44472.0:479 7.14 7:2a671f00:::benchmark_data_clienta.lab.example.com_92_object478:head [set-alloc-hint object_size 4194304 write_size 4194304,write 0~4194304] snapc 0=[] ondisk+write+known_if_redirected e642)",
      ...output omitted...
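
      Each entry in the ops array also includes timing information, such as a duration field in recent Ceph releases. As a minimal sketch, assuming python3 is available inside the cephadm shell container, you could list the slowest recorded operations first:

      [ceph: root@clienta /]# python3 -c 'import json; ops=json.load(open("historicdump.txt"))["ops"]; [print(round(o.get("duration", 0), 4), o["description"][:60]) for o in sorted(ops, key=lambda o: o.get("duration", 0), reverse=True)[:5]]'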
    2. Update the osd_op_history_size and osd_op_history_duration parameters for osd.6. Set the size to 30 and the duration to 900. Verify that the change was successful.

      [ceph: root@clienta /]# ceph tell osd.6 config set osd_op_history_size 30
      {
           "success": "osd_op_history_size = '30' "
      }
      [ceph: root@clienta /]# ceph tell osd.6 config set osd_op_history_duration 900
      {
          "success": "osd_op_history_duration = '900' "
      }
      [ceph: root@clienta /]# ceph tell osd.6 dump_historic_ops > historicops.txt
      [ceph: root@clienta /]# head -n 3 historicops.txt
      {
          "size": 30,
          "duration": 900,
    3. Restore the default runtime values of the osd_op_history_size and osd_op_history_duration parameters on all OSDs. Verify that the change was successful.

      [ceph: root@clienta /]# ceph tell osd.* config set osd_op_history_size 20
      osd.0: {
          "success": "osd_op_history_size = '20' "
      }
      ...output omitted...
      osd.8: {
          "success": "osd_op_history_size = '20' "
      }
      [ceph: root@clienta /]# ceph tell osd.* config set osd_op_history_duration 600
      osd.0: {
          "success": "osd_op_history_duration = '600' "
      }
      ...output omitted...
      osd.8: {
          "success": "osd_op_history_duration = '600' "
      }
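
    For reference: settings applied with ceph tell ... config set are runtime-only and revert when the OSD daemons restart. To make such a change persistent, which is not part of this exercise, you could store it in the cluster's centralized configuration database instead:

      [ceph: root@clienta /]# ceph config set osd osd_op_history_size 30
      [ceph: root@clienta /]# ceph config set osd osd_op_history_duration 900
      [ceph: root@clienta /]# ceph config get osd osd_op_history_size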
  7. Exit the cephadm shell and log out of clienta in both terminals, closing the second terminal. Return to workstation as the student user.

    [ceph: root@clienta /]# exit
    [admin@clienta ~]$ exit
    [student@workstation ~]$ exit
    [ceph: root@clienta /]# exit
    [admin@clienta ~]$ exit
    [student@workstation ~]$

Finish

On the workstation machine, use the lab command to complete this exercise. This is important to ensure that resources from previous exercises do not impact upcoming exercises.

[student@workstation ~]$ lab finish tuning-optimize

This concludes the guided exercise.

Revision: cl260-5.0-29d2128