Red Hat Enterprise Linux Diagnostics and Troubleshooting
Application error messages are not always helpful for diagnosing their issues. Occasionally, applications silently hang if they are stuck in a loop, or are waiting for an external event to occur. To troubleshoot such applications, an administrator must obtain more application state information.
The strace and ltrace commands display runtime information while a program executes. These utilities intercept application system calls and signals, and also process shared library calls with ltrace. These utilities send the trace information to the STDERR channel.
The strace command, from the strace package, displays application-related system calls and signals. To use the strace command, provide an application as an argument.
In this example, the command displays the system calls from the pwd command.
[user@host ~]$strace pwdexecve("/usr/bin/pwd", ["pwd"], 0x7ffcd1c467b0 /* 26 vars */) = 0brk(NULL) = 0x56403534d000 arch_prctl(0x3001 /* ARCH_??? */, 0x7ffef5fd8a40) = -1 EINVAL (Invalid argument) mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5f1b970000 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3 fstat(3, {st_mode=S_IFREG|0644, st_size=20755, ...}) = 0 mmap(NULL, 20755, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f5f1b96a000 ...output omitted... openat(AT_FDCWD, "/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3 fstat(3, {st_mode=S_IFREG|0644, st_size=217796128, ...}) = 0 mmap(NULL, 217796128, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f5f0e3cc000 close(3) = 0 getcwd("/home/user", 4096) = 14
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(0x88, 0x1), ...}) = 0 write(1, "/home/user\n", 14/home/user) = 14
close(1) = 0 close(2) = 0 exit_group(0) = ? exited with 0
In the first line, | |
The subsequent | |
The | |
The |
Each system call is displayed with its arguments and subsequent return values. When a system call fails, the numeric and symbolic value of the errno variable is displayed.
Alternatively, use the -p option to trace a current, running process via its process ID (PID).PID
[user@host ~]$ strace -p 1552You might display only system call counts, instead of detailed function call information. Add the -c option to view how often a system call is executed, and the average time spent on it during program execution.
[user@host ~]$ strace -c pwd
/home/user
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
0.00 0.000000 0 2 read
0.00 0.000000 0 1 write
0.00 0.000000 0 5 close
0.00 0.000000 0 4 fstat
0.00 0.000000 0 1 lseek
0.00 0.000000 0 7 mmap
0.00 0.000000 0 4 mprotect
0.00 0.000000 0 1 munmap
0.00 0.000000 0 4 brk
0.00 0.000000 0 1 1 access
0.00 0.000000 0 1 execve
0.00 0.000000 0 1 getcwd
0.00 0.000000 0 2 1 arch_prctl
0.00 0.000000 0 3 openat
------ ----------- ----------- --------- --------- ----------------
100.00 0.000000 0 37 2 totalWhen used with the -f option, strace also follows and profiles the child processes from the fork() system call. By default, child processes are not followed. The following strace command displays system calls and follows process forks.
[user@host ~]$ strace -f elinks -dump http://classroom.example.comThe -o option saves output to a file, rather than displaying it on the terminal.FILENAME
The -e option traces only specific system calls. For example, the following command traces only the openat and fstat system calls, and sends the output to the /tmp/mytrace file.
[user@host ~]$ strace -o /tmp/mytrace -e openat,fstat mycommandThe ltrace command, from the ltrace package, is similar to the strace command. The ltrace command also traces calls to shared libraries, rather than tracing system calls. Add the -S option to the ltrace command to also trace system calls.
In this example, the command displays the library calls from the pwd command.
[user@host ~]$ ltrace pwd
getenv("POSIXLY_CORRECT") = nil
strrchr("pwd", '/') = nil
setlocale(LC_ALL, "") = "en_US.UTF-8"
textdomain("coreutils") = "coreutils"
...output omitted...
getcwd(nil, 0) = ""
puts("/home/user"/home/user) = 14
free(0x55af41767440) = <void>
__fpending(0x7f82f1e926e0, 0, 0x55af4038c9b0, 1) = 0
fileno(0x7f82f1e926e0) = 1
__freading(0x7f82f1e926e0, 0, 0x55af4038c9b0, 1) = 0
__freading(0x7f82f1e926e0, 0, 2052, 1) = 0
fflush(0x7f82f1e926e0) = 0
fclose(0x7f82f1e926e0) = 0
__fpending(0x7f82f1e92600, 0, 0x7f82f1e8d880, 2880) = 0
fileno(0x7f82f1e92600) = 2
__freading(0x7f82f1e92600, 0, 0x7f82f1e8d880, 2880) = 0
__freading(0x7f82f1e92600, 0, 4, 2880) = 0
fflush(0x7f82f1e92600) = 0
fclose(0x7f82f1e92600) = 0
__cxa_finalize(0x55af40591ae0, 0x55af40591ad8, 1, 2) = 0
exited (status 0)The ltrace command traces only binaries that are compiled with the position-independent executables (PIE) option disabled. To check whether a binary is compiled with this option, run the file command. The file command returns "ELF 64-bit LSB executable" as output for binaries that are compiled with the PIE option disabled.
[user@host ~]$ file /usr/bin/pwd
/usr/bin/pwd: ELF 64-bit LSB pie executable ...output omitted...Similar to the strace command, the ltrace command can trace a current, running process with the -p option.PID
[user@host ~]$ ltrace -p 1729The -o, -c, and -f options of the ltrace command perform the same function as the corresponding strace command options.
Some binary programs are only executable, but not readable, by unprivileged users. The ltrace command requires the user to have read access to the executable file to trace calls.
[user@host ~]#ls -l /usr/bin/testprog-rwx--x--x. 1 root root 57903 Feb 15 15:01 /usr/bin/testprog [user@host ~]#ltrace testprogCan't open /bin/testprog: Permission denied
The strace command can trace whether an executable file is executable but not readable, but the output from the command is limited.
In the following example, the testprog application tries to open and read a file. The file exists, but the program cannot read it. Because strace cannot read the executable file, it cannot display the name of the targeted file.
[root@host ~]# strace testprog
execve("/bin/testprog", ["testprog"], [/* 17 vars */]) = 0
brk(0) = 0xb51000
...output omitted...
mprotect(0x604000, 4096, PROT_READ) = 0
mprotect(0x7f0f60157000, 4096, PROT_READ) = 0
munmap(0x7f0f6014e000, 28826) = 0
open(0x404515, O_RDONLY) = -1 EACCES (Permission denied)
write(2, 0x404528, 35Unable to open configuration file.) = 35
exit_group(1) = ?
exited with 1The Linux audit daemon, auditd, provides useful information to troubleshoot application runtime issues. The auditd daemon stores logs in the /var/log/audit/audit.log file, based on rules from the auditctl command. These rules might be triggered when a system call fails, or when an application executes.
The following commands demonstrate the use of auditd to troubleshoot an application that fails at runtime.
[root@host ~]#exampleError: configuration file not found [root@host ~]#auditctl -a always,exit -F arch=b64 -S openat -F success=0[root@host ~]#exampleError: configuration file not found [root@host ~]#tail /var/log/audit/audit.log...output omitted... type=CONFIG_CHANGE msg=audit(1456381239.150:916): auid=0 ses=2 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 op="add_rule" key=(null) list=4 res=1 type=SYSCALL msg=audit(1456381246.155:917): arch=c000003e syscall=2 success=no exit=-2 a0=404515 a1=0 a2=d a3=7fff048aae20 items=1 ppid=1484 pid=29841 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0 ses=2 comm="example" exe="/usr/bin/example" subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 key=(null) type=CWD msg=audit(1456381246.155:917): cwd="/root" type=PATH msg=audit(1456381246.155:917): item=0 name="/etc/example.conf" objtype=UNKNOWN
The initial error message in the output suggests that the command fails to open a file. The auditctl command creates a rule that audits openat() system calls that fail (-F success=0). With the new rule, the application failure generates an error message in the audit.log file. The entry indicates that the application cannot open the /etc/example.conf file.
The auditctl -d command removes rules that are no longer needed.
[root@host ~]# auditctl -d always,exit -F arch=b64 -S openat -F success=0References
auditctl(8), auditd(8), ausearch(8), ltrace(1), and strace(1) man pages