Red Hat Enterprise Linux Diagnostics and Troubleshooting
System memory management is one of the most complex kernel tasks. Computer systems organize memory into fixed-size chunks called pages. The processor architecture defines the page size; for x86_64, the standard page size is 4 KiB. A system's physical RAM is divided into page frames; one page frame holds one page of data. The virtual address space size depends on the processor architecture, not on the amount of installed RAM. On a 64-bit x86_64 system, the addressable space is 2^64 bytes, or 16 EiB.
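As a rough illustration of the arithmetic above, the following shell sketch queries the page size and works out how many page frames a 128 MiB allocation would occupy. The variable names are illustrative, and the 4096-byte fallback is an assumption that applies the x86_64 standard page size when getconf is unavailable.

```shell
# Query the page size; fall back to 4096 bytes (assumed x86_64 default).
page_size=$(getconf PAGESIZE 2>/dev/null || echo 4096)
echo "Page size: ${page_size} bytes"

# 2^64 bytes expressed in EiB (1 EiB = 2^60 bytes), so 2^(64-60).
eib=$(( 1 << (64 - 60) ))
echo "64-bit address space: ${eib} EiB"

# Number of page frames needed to hold 128 MiB, rounded up.
bytes=$(( 128 * 1024 * 1024 ))
pages=$(( (bytes + page_size - 1) / page_size ))
echo "128 MiB = ${bytes} bytes = ${pages} pages"
```

With 4 KiB pages, 128 MiB corresponds to 32768 page frames.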
Processes do not access physical memory directly. Instead, each process has a virtual address space. When a process requests memory, the kernel maps the physical address of a page frame to a virtual address in the address space. The process views a private memory space and accesses only the physical page frames that the kernel mapped into its virtual space. This method enforces security restrictions and provides process isolation.
A single process generally does not use its entire addressable space. Much of the space remains unallocated and unmapped to any physical memory.
A memory leak occurs when a process allocates memory but does not free it when the memory is no longer needed. A memory leak is not a significant issue for a short-lived process, such as ls or netstat, because the kernel frees all of the memory of a process when the process exits. Memory leaks can become severe when a process runs for an extended time, because the leaked memory accumulates until the process exits.
Tools such as the ps, top, free, sar -r, and sar -R commands help to identify memory leaks. Dedicated tools, such as the memcheck tool from the valgrind framework, can identify application memory leaks.
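One simple way to spot a suspected leak with ps is to sample the resident size of a process at intervals and compare the first and last readings. The following sketch assumes the procps ps command; the sample_rss and rss_delta function names and the PID 1234 are hypothetical.

```shell
# Print the RSS (in KiB) of a PID a given number of times, one value per line.
sample_rss() {
    pid=$1
    count=$2
    interval=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        ps -o rss= -p "$pid"
        sleep "$interval"
        i=$(( i + 1 ))
    done
}

# Read one RSS value per line on stdin and print last minus first (KiB).
rss_delta() {
    awk 'NR == 1 { first = $1 } { last = $1 } END { if (NR > 0) print last - first }'
}

# Demonstration with made-up readings; prints 1100.
printf '1000\n1500\n2100\n' | rss_delta

# Live usage (hypothetical PID): 12 samples, 5 seconds apart.
# sample_rss 1234 12 5 | rss_delta
```

A delta that keeps growing across repeated runs suggests a leak, but it is not proof; a process can also grow because its workload legitimately grows.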
The valgrind framework helps to detect memory errors such as uninitialized memory use and improper memory allocation or deallocation. The valgrind framework provides various debugging and profiling tools.
- cachegrind
The cachegrind tool simulates application interaction with the system cache hierarchy and branch predictor. It gathers statistics for the duration of the application's execution and displays a summary as output to the console.
- massif
The massif tool measures the heap space that a specified application uses. It measures both usable space and any further allocated space for bookkeeping and alignment purposes.
- memcheck
The memcheck tool is the default tool in the valgrind framework. It detects and reports memory-related errors such as memory leaks, disallowed memory access, use of an undefined or uninitialized value, incorrect release of heap memory, and pointer overlaps.
A memory leak occurs in one of two situations. In the first case, a program requests memory with the malloc library call, but it does not use the requested memory. This requested memory causes the virtual size of the program to increase (VIRT in top). The Committed_AS line in /proc/meminfo increases, but actual physical memory use does not. The resident size (RES in top, RSS in ps) stays the same.
In the second case, the program also uses the allocated memory, so the resident size increases along with the virtual size, and a physical memory shortage can occur. Although leaking virtual memory is not good, leaking resident memory impacts the system more.
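To tell the two cases apart on a live system, compare the committed and virtual sizes with the resident size. The following sketch reads those values from procfs. The proc_field helper name is hypothetical, and the code assumes the usual "Name: value kB" line format of /proc/meminfo and /proc/PID/status.

```shell
# Print the numeric value of a "Name:   value kB" field.
# $1 = field name without the colon, $2 = file to read.
proc_field() {
    awk -v key="$1:" '$1 == key { print $2 }' "$2"
}

# System-wide committed address space, in KiB.
if [ -r /proc/meminfo ]; then
    echo "Committed_AS: $(proc_field Committed_AS /proc/meminfo) KiB"
fi

# Virtual and resident size of the current shell ($$), in KiB.
# VmSize corresponds to VIRT in top; VmRSS corresponds to RES.
if [ -r "/proc/$$/status" ]; then
    echo "VmSize: $(proc_field VmSize "/proc/$$/status") KiB"
    echo "VmRSS:  $(proc_field VmRSS "/proc/$$/status") KiB"
fi
```

If VmSize grows over time while VmRSS stays flat, the process matches the first case; if both grow together, it matches the second.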
Use the memcheck tool with the process that is leaking memory. Use it on processes that run for extended periods, such as daemons or desktop applications such as web browsers. Processes that leak memory show a gradual increase in memory or create out-of-memory errors that cause some processes to exit immediately. Some processes might generate a log event and continue to work. The memcheck tool reports these errors but cannot prevent them from happening. The memcheck tool logs an error message immediately before the error occurs.
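The memcheck report ends with a LEAK SUMMARY section. As a minimal sketch, the byte counts in that summary can be extracted from a saved report with awk. The leak_bytes function name is hypothetical, and the sample lines are copied from the bigmem report shown later in this section.

```shell
# Print the byte count for one leak kind from a memcheck report.
# $1 = "definitely", "indirectly", or "possibly"; $2 = report file.
leak_bytes() {
    awk -v kind="$1" '$2 == kind && $3 == "lost:" { gsub(",", "", $4); print $4 }' "$2"
}

# Sample LEAK SUMMARY lines in the format that memcheck prints.
log=$(mktemp)
cat > "$log" <<'EOF'
==26545== LEAK SUMMARY:
==26545==    definitely lost: 128,974,848 bytes in 123 blocks
==26545==    indirectly lost: 0 bytes in 0 blocks
==26545==      possibly lost: 5,242,880 bytes in 5 blocks
EOF

definitely=$(leak_bytes definitely "$log")
possibly=$(leak_bytes possibly "$log")
echo "definitely lost: ${definitely} bytes"
echo "definitely + possibly lost: $(( definitely + possibly )) bytes"
rm -f "$log"
```

For the sample lines, the two counts add up to 134,217,728 bytes, which is the 128 MiB that the report lists as in use at exit.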
The following example uses the bigmem application to help to diagnose memory leaks. The bigmem application was developed for use in Red Hat Training courses.
Install the valgrind package to use the available tools in the framework.
[root@host ~]# yum install valgrind
...output omitted...
Complete!
Run the bigmem application with the memcheck tool.
[root@host ~]# valgrind --tool=memcheck bigmem 128
==26545== Memcheck, a memory error detector
==26545== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==26545== Using Valgrind-3.16.0 and LibVEX; rerun with -h for copyright info
==26545== Command: bigmem 128
==26545==
Attempting to allocate 128 Mebibytes of resident memory...
Press <Enter> to exit Enter
==26545==
==26545== HEAP SUMMARY:
==26545== in use at exit: 134,217,728 bytes in 128 blocks
==26545== total heap usage: 130 allocs, 2 frees, 134,219,776 bytes allocated
==26545==
==26545== LEAK SUMMARY:
==26545== definitely lost: 128,974,848 bytes in 123 blocks
==26545== indirectly lost: 0 bytes in 0 blocks
==26545== possibly lost: 5,242,880 bytes in 5 blocks
==26545== still reachable: 0 bytes in 0 blocks
==26545== suppressed: 0 bytes in 0 blocks
==26545== Rerun with --leak-check=full to see details of leaked memory
==26545==
==26545== For lists of detected and suppressed errors, rerun with: -s
==26545== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Use the --leak-check=full option to obtain detailed information to diagnose which function is leaking in the bigmem application.
[root@host ~]# valgrind --tool=memcheck --leak-check=full bigmem 128
==26546== Memcheck, a memory error detector
==26546== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==26546== Using Valgrind-3.16.0 and LibVEX; rerun with -h for copyright info
==26546== Command: bigmem 128
==26546==
Attempting to allocate 128 Mebibytes of resident memory...
Press <Enter> to exit Enter
==26546==
==26546== HEAP SUMMARY:
==26546== in use at exit: 134,217,728 bytes in 128 blocks
==26546== total heap usage: 130 allocs, 2 frees, 134,219,776 bytes allocated
==26546==
==26546== 5,242,880 bytes in 5 blocks are possibly lost in loss record 1 of 2
==26546== at 0x4C34F0B: malloc (vg_replace_malloc.c:307)
==26546== by 0x400961: ??? (in /usr/local/bin/bigmem)
==26546== by 0x4011BB: ??? (in /usr/local/bin/bigmem)
==26546== by 0x4E65492: (below main) (in /usr/lib64/libc-2.28.so)
==26546==
==26546== 128,974,848 bytes in 123 blocks are definitely lost in loss record 2 of 2
==26546== at 0x4C34F0B: malloc (vg_replace_malloc.c:307)
==26546== by 0x400961: ??? (in /usr/local/bin/bigmem)
==26546== by 0x4011BB: ??? (in /usr/local/bin/bigmem)
==26546== by 0x4E65492: (below main) (in /usr/lib64/libc-2.28.so)
==26546==
==26546== LEAK SUMMARY:
==26546== definitely lost: 128,974,848 bytes in 123 blocks
==26546== indirectly lost: 0 bytes in 0 blocks
==26546== possibly lost: 5,242,880 bytes in 5 blocks
==26546== still reachable: 0 bytes in 0 blocks
==26546== suppressed: 0 bytes in 0 blocks
==26546==
==26546== For lists of detected and suppressed errors, rerun with: -s
==26546== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)
Run the bigmem application with the valgrind default tool, memcheck, and use the --log-file option to write the valgrind report to the bigmem.memchk file.
[root@host ~]# valgrind -v --leak-check=full --show-reachable=yes --log-file=bigmem.memchk bigmem 128
Attempting to allocate 128 Mebibytes of resident memory...
Press <Enter> to exit Enter
View the contents of the bigmem.memchk file to verify the leak information in the LEAK SUMMARY section.
[root@host ~]# cat bigmem.memchk
==26588== Memcheck, a memory error detector
==26588== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==26588== Using Valgrind-3.16.0-bf5e647edb-20200519X and LibVEX; rerun with -h for copyright info
==26588== Command: bigmem 128
==26588== Parent PID: 1426
...output omitted...
==26588== HEAP SUMMARY:
==26588== in use at exit: 134,217,728 bytes in 128 blocks
==26588== total heap usage: 130 allocs, 2 frees, 134,219,776 bytes allocated
==26588==
==26588== Searching for pointers to 128 not-freed blocks
==26588== Checked 5,311,904 bytes
==26588==
==26588== 5,242,880 bytes in 5 blocks are possibly lost in loss record 1 of 2
==26588== at 0x4C34F0B: malloc (vg_replace_malloc.c:307)
==26588== by 0x400961: ??? (in /usr/local/bin/bigmem)
==26588== by 0x4011BB: ??? (in /usr/local/bin/bigmem)
==26588== by 0x4E65492: (below main) (in /usr/lib64/libc-2.28.so)
==26588==
==26588== 128,974,848 bytes in 123 blocks are definitely lost in loss record 2 of 2
==26588== at 0x4C34F0B: malloc (vg_replace_malloc.c:307)
==26588== by 0x400961: ??? (in /usr/local/bin/bigmem)
==26588== by 0x4011BB: ??? (in /usr/local/bin/bigmem)
==26588== by 0x4E65492: (below main) (in /usr/lib64/libc-2.28.so)
==26588==
==26588== LEAK SUMMARY:
==26588== definitely lost: 128,974,848 bytes in 123 blocks
==26588== indirectly lost: 0 bytes in 0 blocks
==26588== possibly lost: 5,242,880 bytes in 5 blocks
==26588== still reachable: 0 bytes in 0 blocks
==26588== suppressed: 0 bytes in 0 blocks
==26588==
==26588== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)
References
valgrind(1) man pages
For further information, refer to Understanding Memory Leaks - Using Valgrind (Memcheck) at https://access.redhat.com/articles/17774