df8411da08
This patch, as the subject says, extends GDB so that it is able to use
the contents of the file /proc/PID/coredump_filter when generating a
corefile.  This file contains a bit mask that represents the different
types of memory mappings in the Linux kernel; the user can choose to
dump or not dump a certain type of memory mapping by enabling or
disabling the respective bit in the bit mask.  Currently, here is what
is supported:

  bit 0  Dump anonymous private mappings.
  bit 1  Dump anonymous shared mappings.
  bit 2  Dump file-backed private mappings.
  bit 3  Dump file-backed shared mappings.
  bit 4  (since Linux 2.6.24) Dump ELF headers.
  bit 5  (since Linux 2.6.28) Dump private huge pages.
  bit 6  (since Linux 2.6.28) Dump shared huge pages.

(This table has been taken from core(5), but you can also read about it
in Documentation/filesystems/proc.txt inside the Linux kernel source
tree.)

The default value for this file, used by the Linux kernel, is 0x33,
which means that bits 0, 1, 4 and 5 are enabled.  This is also the
default for GDB implemented in this patch, FWIW.

Well, reading the file is obviously trivial.  The hard part, mind you,
is how to determine the types of the memory mappings.  For that, I
extended the code of gdb/linux-tdep.c:linux_find_memory_regions_full
and made it rely *much more* on the information gathered from
/proc/<PID>/smaps.  This file contains a "verbose dump" of the
inferior's memory mappings, and we were not using as much information
as we could from it.  If you want to read more about this file, take a
look at the proc(5) manpage (I will also write a blog post soon about
everything I had to learn to get this patch done, and when it is ready
I will post it here).

With Oleg Nesterov's help, we could improve the current algorithm for
determining whether a memory mapping is anonymous/file-backed,
private/shared.
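
To make the bit mask described above concrete, here is a minimal Python
sketch (not the C code in this patch) of how the file's value could be
decoded.  The names CoredumpFilter, parse_coredump_filter and
read_coredump_filter are hypothetical, chosen only for illustration.
It assumes the kernel prints the mask as a hexadecimal number without a
"0x" prefix, and it falls back to 0x33 when the file cannot be read.

```python
from enum import IntFlag


class CoredumpFilter(IntFlag):
    """Bits of /proc/PID/coredump_filter, as listed in core(5)."""
    ANON_PRIVATE = 1 << 0
    ANON_SHARED = 1 << 1
    FILE_PRIVATE = 1 << 2
    FILE_SHARED = 1 << 3
    ELF_HEADERS = 1 << 4
    HUGETLB_PRIVATE = 1 << 5
    HUGETLB_SHARED = 1 << 6


# Kernel default mentioned in the text: bits 0, 1, 4 and 5.
DEFAULT_FILTER = CoredumpFilter(0x33)


def parse_coredump_filter(text):
    """Parse the file contents (assumed to be hexadecimal, no 0x prefix)."""
    return CoredumpFilter(int(text.strip(), 16))


def read_coredump_filter(pid, use_filter=True):
    """Return the mask for PID, or the default if disabled or unreadable.

    The use_filter argument is a hypothetical stand-in for the
    'set use-coredump-filter' setting.
    """
    if not use_filter:
        return DEFAULT_FILTER
    try:
        with open(f"/proc/{pid}/coredump_filter") as f:
            return parse_coredump_filter(f.read())
    except OSError:
        # The file does not exist (or cannot be read): use the default.
        return DEFAULT_FILTER
```

With this sketch, a test such as "CoredumpFilter.FILE_PRIVATE in mask"
tells whether file-backed private mappings were requested.
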
GDB now also respects the MADV_DONTDUMP flag and does not dump memory
mappings marked with it, and will always dump "[vsyscall]" or "[vdso]"
mappings (just like the Linux kernel).

In a nutshell, what the new code is doing is:

- If the mapping is associated with a file whose name ends with
  " (deleted)", or if the file is "/dev/zero", or if it is "/SYSV%08x"
  (shared memory), or if there is no file associated with it, or if
  the AnonHugePages: or the Anonymous: fields in /proc/PID/smaps
  have contents, then GDB considers this mapping to be anonymous.

  There is a special case in this, though: if the memory mapping is a
  file-backed one, but *also* contains "Anonymous:" or
  "AnonHugePages:" pages, then GDB considers this mapping to be *both*
  anonymous and file-backed, just like the Linux kernel does.  What
  that means is simple: this mapping will be dumped if the user
  requested anonymous mappings *or* if the user requested file-backed
  mappings to be present in the corefile.

  It is worth mentioning that, of all the checks described above, the
  most fragile is the one that tests whether the file name ends with
  " (deleted)".  This does not necessarily mean that the mapping is
  anonymous, because the deleted file associated with the mapping may
  have been a hard link to another file, for example.  The Linux
  kernel checks to see if "i_nlink == 0", but GDB cannot easily do
  this check (as has been discussed, GDB would need to run as root,
  and would need to check the contents of the /proc/PID/map_files/
  directory in order to determine whether the deleted file was a hard
  link or not).  Therefore, we made a compromise here, and we assume
  that if the file name ends with " (deleted)", then the mapping is
  indeed anonymous.  FWIW, this is something the Linux kernel could do
  better: expose this information in a more direct way.

- If we see the flag "sh" in the VmFlags: field (in /proc/PID/smaps),
  then certainly the memory mapping is shared (VM_SHARED).
  If we have access to the VmFlags, and we don't see the "sh" there,
  then certainly the mapping is private.  However, older Linux kernels
  (see the code for more details) do not have the VmFlags field; in
  that case, we use another heuristic: if we see 'p' in the permission
  flags, then we assume that the mapping is private.  This is not
  exact, because the 's' flag there only means VM_MAYSHARE, so a
  mapping showing 's' could still be private.  It should work well
  enough, however.

Finally, it is worth mentioning that I added a new command, 'set
use-coredump-filter on/off'.  When it is 'on', GDB will read the
'coredump_filter' file (if it exists) and use its value; otherwise, it
will use the default value mentioned above (0x33) to decide which
memory mappings to dump.

gdb/ChangeLog:
2015-03-31  Sergio Durigan Junior  <sergiodj@redhat.com>
            Jan Kratochvil  <jan.kratochvil@redhat.com>
            Oleg Nesterov  <oleg@redhat.com>

        PR corefiles/16092
        * linux-tdep.c: Include 'gdbcmd.h' and 'gdb_regex.h'.
        New enum identifying the various options of the coredump_filter
        file.
        (struct smaps_vmflags): New struct.
        (use_coredump_filter): New variable.
        (decode_vmflags): New function.
        (mapping_is_anonymous_p): Likewise.
        (dump_mapping_p): Likewise.
        (linux_find_memory_regions_full): New variables
        'coredumpfilter_name', 'coredumpfilterdata', 'pid', 'filterflags'.
        Removed variable 'modified'.  Read /proc/<PID>/smaps file; improve
        parsing of its information.  Implement memory mapping filtering
        based on its contents.
        (show_use_coredump_filter): New function.
        (_initialize_linux_tdep): New command 'set use-coredump-filter'.
        * NEWS: Mention the possibility of using the
        '/proc/PID/coredump_filter' file when generating a corefile.
        Mention new command 'set use-coredump-filter'.

gdb/doc/ChangeLog:
2015-03-31  Sergio Durigan Junior  <sergiodj@redhat.com>

        PR corefiles/16092
        * gdb.texinfo (gcore): Mention new command 'set
        use-coredump-filter'.
        (set use-coredump-filter): Document new command.

gdb/testsuite/ChangeLog:
2015-03-31  Sergio Durigan Junior  <sergiodj@redhat.com>

        PR corefiles/16092
        * gdb.base/coredump-filter.c: New file.
        * gdb.base/coredump-filter.exp: Likewise.
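
As a reading aid, the classification rules described in the bullets
above can be summarized in a short Python sketch.  This is only an
illustration under stated assumptions, not the C code added to
linux-tdep.c: the Mapping class and the functions name_is_anonymous,
is_private and should_dump are hypothetical names, the "/SYSV%08x"
check is approximated with a regular expression, and MADV_DONTDUMP is
assumed to show up as the "dd" entry in the VmFlags: field.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Filter bits for the four mapping kinds, as listed in core(5).
ANON_PRIVATE, ANON_SHARED, FILE_PRIVATE, FILE_SHARED = 0x1, 0x2, 0x4, 0x8


@dataclass
class Mapping:
    """Hypothetical summary of one /proc/PID/smaps entry."""
    filename: str = ""                    # "" when no file is associated
    perms: str = "rw-p"                   # permission column, e.g. "r-xp"
    vmflags: Optional[frozenset] = None   # None on kernels without VmFlags:
    anon_kb: int = 0                      # value of the Anonymous: field
    anon_huge_kb: int = 0                 # value of the AnonHugePages: field


def name_is_anonymous(filename):
    """Filename-only checks from the first bullet."""
    return (filename == ""
            or filename == "/dev/zero"
            or filename.endswith(" (deleted)")
            or re.fullmatch(r"/SYSV[0-9a-fA-F]{8}", filename) is not None)


def is_private(m):
    """Trust VmFlags when present; otherwise fall back to the 'p' flag."""
    if m.vmflags is not None:
        return "sh" not in m.vmflags
    return "p" in m.perms


def should_dump(m, filter_mask=0x33):
    """Decide whether mapping m belongs in the corefile."""
    if m.filename in ("[vdso]", "[vsyscall]"):
        return True                       # always dumped, like the kernel
    if m.vmflags is not None and "dd" in m.vmflags:
        return False                      # assumed marker for MADV_DONTDUMP
    file_backed = not name_is_anonymous(m.filename)
    # A file-backed mapping with anonymous pages counts as both kinds.
    anonymous = not file_backed or m.anon_kb > 0 or m.anon_huge_kb > 0
    if is_private(m):
        anon_bit, file_bit = ANON_PRIVATE, FILE_PRIVATE
    else:
        anon_bit, file_bit = ANON_SHARED, FILE_SHARED
    wanted = (anon_bit if anonymous else 0) | (file_bit if file_backed else 0)
    return bool(filter_mask & wanted)
```

For example, with the default mask 0x33 a private mapping of a shared
library with no anonymous pages is skipped, while the same mapping with
a nonzero Anonymous: field is dumped because it also counts as
anonymous.
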