Red Hat Enterprise Linux Diagnostics and Troubleshooting
Compile and distribute a SystemTap module.
Outcomes
You should be able to compile a SystemTap program and execute it on another system.
As the student user on the workstation machine, use the lab command to prepare your system for this exercise.
[student@workstation ~]$ lab start kernel-debug
This command installs kernel debug packages and executes a script to increase disk I/O traffic on the /dev/vdb disk on the serverb system.
Instructions
A developer reports poor performance on the serverb system. An administrator adds that recent diagnostics point to high disk activity: the iostat utility reports heavy activity on the system's secondary disk, /dev/vdb.
You are asked to determine which process is responsible for the heavy disk activity. Create and run a SystemTap probe that locates the root cause. Use the servera system to build the SystemTap probe. Enable the student user to reuse this probe in the future.
Log in to the servera system and switch to the root user.
[student@workstation ~]$ ssh student@servera
...output omitted...
[student@servera ~]$ sudo -i
[sudo] password for student: student
[root@servera ~]#
Install the necessary SystemTap packages, compile tools, and debug dependencies on the servera system. Locate a suitable example script to use for this issue.
Install the systemtap package on the servera system. The C compiler, libraries, build tools, kernel source code, and Perl tools are also pulled in as dependencies and installed.
[root@servera ~]# yum install systemtap
...output omitted...
Complete!
Verify that the kernel-debuginfo and kernel-devel packages are installed.
[root@servera ~]# rpm -q kernel-debuginfo kernel-devel
kernel-debuginfo-4.18.0-305.el8.x86_64
kernel-devel-4.18.0-305.el8.x86_64
List the available I/O example scripts from the systemtap package.
[root@servera ~]# ls -la /usr/share/systemtap/examples/io/
...output omitted...
-rw-r--r--. 1 root root 410 Mar 11 2021 iotop.meta
-rwxr-xr-x. 1 root root 624 Mar 11 2021 iotop.stp
...output omitted...
View the contents of the iotop.stp script file. This script displays the top 10 I/O processes every 5 seconds.
[root@servera ~]# cat /usr/share/systemtap/examples/io/iotop.stp
#!/usr/bin/stap
global reads, writes, total_io
...output omitted...
# print top 10 IO processes every 5 seconds
probe timer.s(5) {
    printf ("%16s\t%10s\t%10s\n", "Process", "KB Read", "KB Written")
    foreach (name in total_io @sum- limit 10)
        printf("%16s\t%10d\t%10d\n", name,
               @sum(reads[name])/1024, @sum(writes[name])/1024)
    delete reads
    delete writes
    delete total_io
    print("\n")
}
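SystemTap needs kernel-debuginfo and kernel-devel packages that match the kernel the module will run on, so the rpm -q check above is worth comparing against the kernel release. The following is a minimal sketch of that comparison. The missing_kernel_pkgs function name is hypothetical, and the sketch assumes that rpm -q prints package names in the name-release form shown above.

```shell
# Hypothetical helper (not part of SystemTap): print each kernel support
# package whose installed version does not match the given kernel release.
#   $1: kernel release, for example the output of `uname -r`
#   $2: package list, one entry per line, for example the output of
#       `rpm -q kernel-debuginfo kernel-devel`
missing_kernel_pkgs() {
    mkp_release=$1
    mkp_pkgs=$2
    for mkp_name in kernel-debuginfo kernel-devel; do
        # An exact, whole-line match is required; a "package ... is not
        # installed" line from rpm never matches, so it is reported too.
        if ! printf '%s\n' "$mkp_pkgs" | grep -Fxq "${mkp_name}-${mkp_release}"; then
            printf '%s\n' "$mkp_name"
        fi
    done
}

# Example use on the build system (left as a comment so that nothing
# runs when this file is sourced):
#   missing_kernel_pkgs "$(uname -r)" "$(rpm -q kernel-debuginfo kernel-devel)"
```

Empty output means that both packages match the release that was passed in.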
Prepare the iotop.stp script to analyze the serverb system, and copy the kernel module to the serverb system.
Compile the iotop.stp script file to generate the iotop.ko portable kernel module. Ignore any warning messages after the iotop kernel module completes the Pass 1 step.
[root@servera ~]# stap -v -p 4 -m iotop /usr/share/systemtap/examples/io/iotop.stp
Pass 1: parsed user script and 482 library scripts using 456012virt/88412res/12504shr/75468data kb, in 200usr/40sys/237real ms.
...output omitted...
Pass 2: analyzed script: 5 probes, 7 functions, 7 embeds, 35 globals using 688652virt/322008res/13676shr/308108data kb, in 3420usr/980sys/5467real ms.
Pass 3: translated to C into "/tmp/stapXQNYv1/iotop_src.c" using 688652virt/322136res/13804shr/308108data kb, in 30usr/60sys/79real ms.
iotop.ko
Pass 4: compiled C into "iotop.ko" in 15500usr/3160sys/11595real ms.
Open a second terminal, log in to the serverb system, and switch to the root user.
[student@workstation ~]$ ssh student@serverb
...output omitted...
[student@serverb ~]$ sudo -i
[sudo] password for student: student
[root@serverb ~]#
On the serverb system, create the SystemTap kernel module directory if it does not exist. Use the running kernel's version when naming the module directory.
[root@serverb ~]# ls -la /lib/modules/$(uname -r)/systemtap
ls: cannot access '/lib/modules/4.18.0-305.el8.x86_64/systemtap': No such file or directory
[root@serverb ~]# mkdir -p /lib/modules/$(uname -r)/systemtap
Copy the iotop.ko kernel module from the servera system to the SystemTap kernel module directory on the serverb system.
[root@serverb ~]# rsync -avz -e ssh root@servera:/root/iotop.ko /lib/modules/$(uname -r)/systemtap/
...output omitted...
root@servera's password: redhat
receiving incremental file list
iotop.ko
sent 43 bytes received 610,648 bytes 135,709.11 bytes/sec
total size is 2,729,880 speedup is 4.47
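A module compiled on one system loads on another only when it was built for the kernel that the target is running. One way to confirm this before running the probe is to compare the module's vermagic string with the target's kernel release. The following is a minimal sketch, assuming that the first word of the modinfo -F vermagic output is the kernel release. The module_matches_kernel function name is hypothetical.

```shell
# Hypothetical helper: succeed when a module's vermagic string names the
# given kernel release.
#   $1: vermagic string, for example the output of
#       `modinfo -F vermagic iotop.ko`
#   $2: kernel release of the target system, for example `uname -r`
module_matches_kernel() {
    # Keep only the first space-separated word of the vermagic string.
    mmk_built_for=${1%% *}
    [ "$mmk_built_for" = "$2" ]
}

# Example use on the target system (left as a comment so that nothing
# runs when this file is sourced):
#   ko=/lib/modules/$(uname -r)/systemtap/iotop.ko
#   if module_matches_kernel "$(modinfo -F vermagic "$ko")" "$(uname -r)"; then
#       echo "module matches the running kernel"
#   fi
```

In this exercise both systems run the same kernel, so the check is expected to pass.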
Prepare the serverb system to use the SystemTap probe.
Install the systemtap-runtime package on the serverb system.
[root@serverb ~]# yum install systemtap-runtime
...output omitted...
Complete!
On the serverb system, configure the student user account to access the newly installed SystemTap kernel module. Add the student user to the stapusr group to run approved SystemTap kernel modules.
[root@serverb ~]# usermod -a -G stapusr student
On the serverb system, verify that the student user has the necessary memberships to run the SystemTap probe.
[root@serverb ~]# su - student
[student@serverb ~]$ id
uid=1000(student) gid=1000(student) groups=1000(student),10(wheel),156(stapusr) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
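The membership check above can also be scripted, which helps when the same probe is rolled out to several users. The following is a minimal sketch that tests a group list, such as the output of id -nG, for an exact group name. The has_group function name is hypothetical.

```shell
# Hypothetical helper: succeed when a space-separated group list contains
# the named group as a whole word.
#   $1: group names, for example the output of `id -nG student`
#   $2: group to look for, for example stapusr
has_group() {
    # Pad both sides with spaces so that only whole names match.
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Example use (left as a comment so that nothing runs when sourced):
#   has_group "$(id -nG student)" stapusr || echo "student is not in stapusr"
```

Group changes made with usermod apply to new login sessions, which is why the exercise starts a fresh session with su - student before checking.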
Troubleshoot the disk activity issue.
Run the iotop module. Interrupt the command after a few iterations. The output of the SystemTap program shows an increase in the I/O traffic by the dd command.
[student@serverb ~]$ staprun iotop
         Process     KB Read  KB Written
              dd       91060       91060
      irqbalance           3           0
   rs:main Q:Reg           0           0
    in:imjournal           0           0
  NetworkManager           0           0
          stapio           0           0
 systemd-journal           0           0
         Process     KB Read  KB Written
              dd       84680       84680
            sshd           0           0
          stapio           0           0
         systemd           0           0
  systemd-logind           0           0
    in:imjournal           0           0
         Process     KB Read  KB Written
              dd       86784       86784
      irqbalance           3           0
            sshd           0           0
          stapio           0           0
    in:imjournal           0           0
Ctrl+c
View additional details of the dd process to discover the command that spawned this process.
[student@serverb ~]$ ps -lef | grep -w dd
...output omitted...
0 D root 1787 1708 4 82 2 - 54281 - 08:56 ? 00:00:04 dd if=/dev/zero of=/dev/vdb bs=1024 count=1000000
0 D root 1788 1688 4 82 2 - 54281 - 08:56 ? 00:00:04 dd if=/dev/zero of=/dev/vdb bs=1024 count=1000000
0 S student 44848 25068 0 80 0 - 55482 - 08:58 pts/0 00:00:00 grep --color=auto dd
Note
The lab finish command removes the offending processes.
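When the probe runs for many iterations, the heaviest writer can be picked out by totaling the KB Written column per process. The following is a minimal sketch, assuming the three-column layout that iotop.stp prints and that the output was saved to a file. The top_writer function name and the iotop.log file name are hypothetical.

```shell
# Hypothetical helper: read iotop.stp output on standard input and print
# the process with the largest total KB Written, followed by that total.
top_writer() {
    awk '
        # Data rows end in two numeric fields; header rows do not.
        NF >= 3 && $NF ~ /^[0-9]+$/ && $(NF-1) ~ /^[0-9]+$/ {
            # Process names can contain spaces, so rebuild the name
            # from every field except the last two.
            name = $1
            for (i = 2; i <= NF - 2; i++) name = name " " $i
            written[name] += $NF
        }
        END {
            best = ""; max = -1
            for (n in written)
                if (written[n] > max) { max = written[n]; best = n }
            if (best != "") printf "%s %d\n", best, max
        }
    '
}

# Example use, with a hypothetical capture file (left as a comment so
# that nothing runs when this file is sourced):
#   top_writer < iotop.log
```

For the three iterations shown above, the totals point to dd, which agrees with the conclusion drawn from reading the output directly.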
Log out of the serverb system and close the second terminal. Return to the workstation system as the student user.
[student@serverb ~]$ exit
[root@serverb ~]# exit
[student@serverb ~]$ exit
[student@workstation ~]$ exit
[root@servera rpm]# exit
[student@servera ~]$ exit
[student@workstation ~]$