Guided Exercise: Kernel Debugging with SystemTap

Compile and distribute a SystemTap module.

Outcomes

You should be able to compile a SystemTap program and execute it on another system.

As the student user on the workstation machine, use the lab command to prepare your system for this exercise.

[student@workstation ~]$ lab start kernel-debug

This command installs kernel debug packages and executes a script to increase disk I/O traffic on the /dev/vdb disk on the serverb system.

Instructions

A developer reports poor performance on the serverb system. An administrator adds that recent diagnostics point to high disk activity. The iostat utility reports heavy activity on the system's secondary disk, /dev/vdb.

You are asked to determine which process is responsible for the heavy disk activity. Create and run a SystemTap probe that locates the root cause. Use the servera system to build the SystemTap probe. Enable the student user to reuse this probe in the future.

  1. Log in to the servera system and switch to the root user.

    [student@workstation ~]$ ssh student@servera
    ...output omitted...
    [student@servera ~]$ sudo -i
    [sudo] password for student: student
    [root@servera ~]#
  2. Install the necessary SystemTap packages, compile tools, and debug dependencies on the servera system. Locate a suitable example script to use for this issue.

    1. Install the systemtap package on the servera system.

      The C compiler, libraries, build tools, kernel source code, and Perl tools are also pulled in as dependencies and installed.

      [root@servera ~]# yum install systemtap
      ...output omitted...
      Complete!
    2. Verify that the kernel-debuginfo and kernel-devel packages are installed.

      [root@servera ~]# rpm -q kernel-debuginfo kernel-devel
      kernel-debuginfo-4.18.0-305.el8.x86_64
      kernel-devel-4.18.0-305.el8.x86_64
    3. List the available I/O example scripts from the systemtap package.

      [root@servera ~]# ls -la /usr/share/systemtap/examples/io/
      ...output omitted...
      -rw-r--r--.  1 root root  410 Mar 11  2021 iotop.meta
      -rwxr-xr-x.  1 root root  624 Mar 11  2021 iotop.stp
      ...output omitted...
    4. View the contents of the iotop.stp script file. This script displays the top 10 I/O processes every 5 seconds.

      [root@servera ~]# cat /usr/share/systemtap/examples/io/iotop.stp
      #!/usr/bin/stap
      
      global reads, writes, total_io
      
      ...output omitted...
      
      # print top 10 IO processes every 5 seconds
      probe timer.s(5) {
          printf ("%16s\t%10s\t%10s\n", "Process", "KB Read", "KB Written")
          foreach (name in total_io @sum- limit 10)
              printf("%16s\t%10d\t%10d\n", name,
                     @sum(reads[name])/1024, @sum(writes[name])/1024)
          delete reads
          delete writes
          delete total_io
          print("\n")
      }
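
SystemTap needs kernel-debuginfo and kernel-devel packages that match the kernel being probed, so it can help to compare the versions verified earlier in this step against the kernel release before compiling. The following is a minimal sketch, not part of the lab; pkg_matches_kernel is a hypothetical helper name, and the sample values are taken from the rpm -q output shown above.

```shell
# Hypothetical helper: succeed when a package name printed by rpm -q
# ends with the given kernel release.
pkg_matches_kernel() {
    # $1: package name, for example kernel-devel-4.18.0-305.el8.x86_64
    # $2: kernel release, for example the output of uname -r
    case "$1" in
        *-"$2") return 0 ;;
        *)      return 1 ;;
    esac
}

# Sample values from this exercise; on a live system use release=$(uname -r)
# and feed the loop from: rpm -q kernel-debuginfo kernel-devel
release="4.18.0-305.el8.x86_64"
for pkg in "kernel-debuginfo-$release" "kernel-devel-$release"; do
    if pkg_matches_kernel "$pkg" "$release"; then
        echo "OK: $pkg"
    else
        echo "MISMATCH: $pkg"
    fi
done
```

A line such as "package kernel-devel is not installed" would be reported as a mismatch, which is the desired outcome for this check.
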
  3. Prepare the iotop.stp script to analyze the serverb system, and copy the kernel module to the serverb system.

    1. Compile the iotop.stp script file to generate the iotop.ko portable kernel module.

      Ignore any warning messages that appear after the Pass 1 step completes.

      [root@servera ~]# stap -v -p 4 -m iotop /usr/share/systemtap/examples/io/iotop.stp
      Pass 1: parsed user script and 482 library scripts using 456012virt/88412res/12504shr/75468data kb, in 200usr/40sys/237real ms.
      ...output omitted...
      Pass 2: analyzed script: 5 probes, 7 functions, 7 embeds, 35 globals using 688652virt/322008res/13676shr/308108data kb, in 3420usr/980sys/5467real ms.
      Pass 3: translated to C into "/tmp/stapXQNYv1/iotop_src.c" using 688652virt/322136res/13804shr/308108data kb, in 30usr/60sys/79real ms.
      iotop.ko
      Pass 4: compiled C into "iotop.ko" in 15500usr/3160sys/11595real ms.
    2. Open a second terminal, log in to the serverb system, and switch to the root user.

      [student@workstation ~]$ ssh student@serverb
      ...output omitted...
      [student@serverb ~]$ sudo -i
      [sudo] password for student: student
      [root@serverb ~]#
    3. On the serverb system, create the SystemTap kernel module directory if it does not exist. Use the running kernel's version when naming the module directory.

      [root@serverb ~]# ls -la /lib/modules/$(uname -r)/systemtap
      ls: cannot access '/lib/modules/4.18.0-305.el8.x86_64/systemtap': No such file or directory
      [root@serverb ~]# mkdir -p /lib/modules/$(uname -r)/systemtap
    4. Copy the iotop.ko kernel module from the servera system to the SystemTap kernel module directory on the serverb system.

      [root@serverb ~]# rsync -avz -e ssh root@servera:/root/iotop.ko /lib/modules/$(uname -r)/systemtap/
      ...output omitted...
      root@servera's password: redhat
      receiving incremental file list
      iotop.ko
      
      sent 43 bytes  received 610,648 bytes  135,709.11 bytes/sec
      total size is 2,729,880  speedup is 4.47
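
A module compiled on one system loads only on a kernel that matches the one it was built for, which is why servera and serverb run the same kernel release in this exercise. As a hedged sketch, the copied module could be checked on the target by comparing its vermagic string with the running kernel. The vermagic_matches function is a hypothetical helper, and the flags after the release in the sample string are illustrative rather than taken from the lab.

```shell
# Hypothetical helper: succeed when the first field of a module's
# vermagic string equals the given kernel release.
vermagic_matches() {
    vermagic="$1"   # for example: modinfo -F vermagic iotop.ko
    release="$2"    # for example: uname -r on the target system
    # Strip everything from the first space onward to keep the release.
    [ "${vermagic%% *}" = "$release" ]
}

# Sample values; on serverb these would come from modinfo and uname.
if vermagic_matches "4.18.0-305.el8.x86_64 SMP mod_unload modversions" \
                    "4.18.0-305.el8.x86_64"; then
    echo "module matches target kernel"
else
    echo "module was built for a different kernel"
fi
```

On the target system, the two arguments would typically be "$(modinfo -F vermagic /lib/modules/$(uname -r)/systemtap/iotop.ko)" and "$(uname -r)".
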
  4. Prepare the serverb system to use the SystemTap probe.

    1. Install the systemtap-runtime package on the serverb system.

      [root@serverb ~]# yum install systemtap-runtime
      ...output omitted...
      Complete!
    2. On the serverb system, add the student user to the stapusr group. Members of this group can run the approved SystemTap kernel modules that are stored in the /lib/modules/$(uname -r)/systemtap directory.

      [root@serverb ~]# usermod -a -G stapusr student
    3. On the serverb system, verify that the student user has the necessary memberships to run the SystemTap probe.

      [root@serverb ~]# su - student
      [student@serverb ~]$ id
      uid=1000(student) gid=1000(student) groups=1000(student),10(wheel),156(stapusr) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
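
The membership check in this step can also be scripted, for example before handing the probe to other users. The sketch below is an illustration only; has_group is a hypothetical helper, and the sample group list mirrors the id output shown above. Group changes apply to new login sessions, which is why the step switches user with su - before running id.

```shell
# Hypothetical helper: succeed when a group name appears in a
# space-separated list of group names.
has_group() {
    group_list="$1"   # for example: id -nG student
    wanted="$2"       # group to look for
    # Pad with spaces so only whole names match.
    case " $group_list " in
        *" $wanted "*) return 0 ;;
        *)             return 1 ;;
    esac
}

# Sample list taken from the id output in this exercise.
if has_group "student wheel stapusr" "stapusr"; then
    echo "student can run modules from the systemtap directory"
else
    echo "student is not in the stapusr group"
fi
```

On a live system, the first argument would typically be "$(id -nG student)".
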
  5. Troubleshoot the disk activity issue.

    1. Run the iotop module. Interrupt the command after a few iterations.

      The output of the SystemTap program shows that the dd command generates almost all of the I/O traffic.

      [student@serverb ~]$ staprun iotop
               Process     KB Read  KB Written
                    dd       91060       91060
            irqbalance           3           0
         rs:main Q:Reg           0           0
          in:imjournal           0           0
        NetworkManager           0           0
                stapio           0           0
       systemd-journal           0           0
      
               Process     KB Read  KB Written
                    dd       84680       84680
                  sshd           0           0
                stapio           0           0
               systemd           0           0
        systemd-logind           0           0
          in:imjournal           0           0
      
               Process     KB Read  KB Written
                    dd       86784       86784
            irqbalance           3           0
                  sshd           0           0
                stapio           0           0
          in:imjournal           0           0
      Ctrl+c
    2. View additional details of the dd process to discover the command that spawned this process.

      [student@serverb ~]$ ps -lef | grep -w dd
      ...output omitted...
      0 D root        1787    1708  4  82   2 - 54281 -      08:56 ?        00:00:04 dd if=/dev/zero of=/dev/vdb bs=1024 count=1000000
      0 D root        1788    1688  4  82   2 - 54281 -      08:56 ?        00:00:04 dd if=/dev/zero of=/dev/vdb bs=1024 count=1000000
      0 S student    44848   25068  0  80   0 - 55482 -      08:58 pts/0    00:00:00 grep --color=auto -w dd

      Note

      The lab finish command removes the offending processes.
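
When the probe runs for many intervals, reading the tables by eye becomes tedious. As a rough sketch, saved iotop output could be summarized with awk to name the process that wrote the most data overall. The top_writer function and the iotop.log file name are hypothetical, and the parsing assumes the three-column layout shown in this step, with process names that may contain spaces.

```shell
# Hypothetical helper: read iotop-style tables on standard input and
# print the process with the largest total "KB Written", plus that total.
top_writer() {
    awk '
        # Data rows end in two integer columns; headers and blank lines do not.
        NF >= 3 && $NF ~ /^[0-9]+$/ && $(NF-1) ~ /^[0-9]+$/ {
            # Rebuild the process name, which may contain spaces.
            name = $1
            for (i = 2; i <= NF - 2; i++) name = name " " $i
            written[name] += $NF
        }
        END {
            best = ""
            for (name in written)
                if (best == "" || written[name] > written[best]) best = name
            if (best != "") print best, written[best]
        }
    '
}

# Sample input: the first rows of two intervals from this exercise.
top_writer <<'EOF'
         Process     KB Read  KB Written
              dd       91060       91060
      irqbalance           3           0
   rs:main Q:Reg           0           0

         Process     KB Read  KB Written
              dd       84680       84680
            sshd           0           0
EOF
```

With the staprun output saved to a file, the same function could be used as: top_writer < iotop.log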

  6. Log out of the serverb system and close the second terminal. In the first terminal, log out of the servera system to return to the workstation system as the student user.

    [student@serverb ~]$ exit
    [root@serverb ~]# exit
    [student@serverb ~]$ exit
    [student@workstation ~]$ exit
    [root@servera ~]# exit
    [student@servera ~]$ exit
    [student@workstation ~]$

Finish

On the workstation machine, use the lab command to complete this exercise. This is important to ensure that resources from previous exercises do not impact upcoming exercises.

[student@workstation ~]$ lab finish kernel-debug

Revision: rh342-8.4-6dd89bd