From 904760745d22b6127ff3d1eb081f8920282e4073 Mon Sep 17 00:00:00 2001
From: Tom Tromey
Date: Wed, 20 Apr 2011 18:05:26 +0000
Subject: [PATCH] gdb

	* dwarf2read.c (save_gdb_index_command): Replace format
	documentation with a pointer to the manual.
gdb/doc
	* gdb.texinfo (Index Section Format): New node.
	(Top): Add new node to menu.
---
 gdb/ChangeLog       |   5 ++
 gdb/doc/ChangeLog   |   5 ++
 gdb/doc/gdb.texinfo | 124 ++++++++++++++++++++++++++++++++++++++++++++
 gdb/dwarf2read.c    |  73 ++------------------------
 4 files changed, 138 insertions(+), 69 deletions(-)

diff --git a/gdb/ChangeLog b/gdb/ChangeLog
index aac555332f..1da3f12d41 100644
--- a/gdb/ChangeLog
+++ b/gdb/ChangeLog
@@ -1,3 +1,8 @@
+2011-04-20 Tom Tromey
+
+	* dwarf2read.c (save_gdb_index_command): Replace format
+	documentation with a pointer to the manual.
+
 2011-04-20 Pedro Alves
 
 	* regcache.c: Include remote.h.
diff --git a/gdb/doc/ChangeLog b/gdb/doc/ChangeLog
index 88e5fffc64..20c5362216 100644
--- a/gdb/doc/ChangeLog
+++ b/gdb/doc/ChangeLog
@@ -1,3 +1,8 @@
+2011-04-20 Tom Tromey
+
+	* gdb.texinfo (Index Section Format): New node.
+	(Top): Add new node to menu.
+
 2011-04-20 Pedro Alves
 
 	* gdb.texinfo (Maintenance Commands): Document `maint print
diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
index a48dac0b37..edcf5c2865 100644
--- a/gdb/doc/gdb.texinfo
+++ b/gdb/doc/gdb.texinfo
@@ -181,6 +181,7 @@ software in general. We will miss him.
 * Operating System Information:: Getting additional information from
                                  the operating system
 * Trace File Format:: GDB trace file format
+* Index Section Format:: .gdb_index section format
 * Copying:: GNU General Public License says
                                  how you can copy and share GDB
 * GNU Free Documentation License:: The license for this documentation
@@ -36913,6 +36914,129 @@ Trace state variable block. This records the 8-byte signed value
 Future enhancements of the trace file format may include additional types
 of blocks.
 
+@node Index Section Format
+@appendix @code{.gdb_index} section format
+@cindex .gdb_index section format
+@cindex index section format
+
+This section documents the index section that is created by @code{save
+gdb-index} (@pxref{Index Files}). The index section is
+DWARF-specific; some knowledge of DWARF is assumed in this
+description.
+
+The mapped index file format is designed to be directly
+@code{mmap}able on any architecture. In most cases, a datum is
+represented using a little-endian 32-bit integer value, called an
+@code{offset_type}. Big endian machines must byte-swap the values
+before using them. Exceptions to this rule are noted. The data is
+laid out such that alignment is always respected.
+
+A mapped index consists of several areas, laid out in order.
+
+@enumerate
+@item
+The file header. This is a sequence of values, of @code{offset_type}
+unless otherwise noted:
+
+@enumerate
+@item
+The version number, currently 4. Versions 1, 2 and 3 are obsolete.
+
+@item
+The offset, from the start of the file, of the CU list.
+
+@item
+The offset, from the start of the file, of the types CU list. Note
+that this area can be empty, in which case this offset will be equal
+to the next offset.
+
+@item
+The offset, from the start of the file, of the address area.
+
+@item
+The offset, from the start of the file, of the symbol table.
+
+@item
+The offset, from the start of the file, of the constant pool.
+@end enumerate
+
+@item
+The CU list. This is a sequence of pairs of 64-bit little-endian
+values, sorted by the CU offset. The first element in each pair is
+the offset of a CU in the @code{.debug_info} section. The second
+element in each pair is the length of that CU. References to a CU
+elsewhere in the map are done using a CU index, which is just the
+0-based index into this table. Note that if there are type CUs, then
+conceptually CUs and type CUs form a single list for the purposes of
+CU indices.
+
+@item
+The types CU list. This is a sequence of triplets of 64-bit
+little-endian values. In a triplet, the first value is the CU offset,
+the second value is the type offset in the CU, and the third value is
+the type signature. The types CU list is not sorted.
+
+@item
+The address area. The address area consists of a sequence of address
+entries. Each address entry has three elements:
+
+@enumerate
+@item
+The low address. This is a 64-bit little-endian value.
+
+@item
+The high address. This is a 64-bit little-endian value. Like
+@code{DW_AT_high_pc}, the value is one byte beyond the end.
+
+@item
+The CU index. This is an @code{offset_type} value.
+@end enumerate
+
+@item
+The symbol table. This is an open-addressed hash table. The size of
+the hash table is always a power of 2.
+
+Each slot in the hash table consists of a pair of @code{offset_type}
+values. The first value is the offset of the symbol's name in the
+constant pool. The second value is the offset of the CU vector in the
+constant pool.
+
+If both values are 0, then this slot in the hash table is empty. This
+is ok because while 0 is a valid constant pool index, it cannot be a
+valid index for both a string and a CU vector.
+
+The hash value for a table entry is computed by applying an
+iterative hash function to the symbol's name. Starting with an
+initial value of @code{r = 0}, each (unsigned) character @samp{c} in
+the string is incorporated into the hash using the formula
+@code{r = r * 67 + c - 113}. The terminating @samp{\0} is not
+incorporated into the hash.
+
+The step size used in the hash table is computed via
+@code{((hash * 17) & (size - 1)) | 1}, where @samp{hash} is the hash
+value, and @samp{size} is the size of the hash table. The step size
+is used to find the next candidate slot when handling a hash
+collision.
+
+The names of C@t{++} symbols in the hash table are canonicalized. We
+don't currently have a simple description of the canonicalization
+algorithm; if you intend to create new index sections, you must read
+the code.
+
+@item
+The constant pool. This is simply a bunch of bytes. It is organized
+so that alignment is correct: CU vectors are stored first, followed by
+strings.
+
+A CU vector in the constant pool is a sequence of @code{offset_type}
+values. The first value is the number of CU indices in the vector.
+Each subsequent value is the index of a CU in the CU list. This
+element in the hash table is used to indicate which CUs define the
+symbol.
+
+A string in the constant pool is zero-terminated.
+@end enumerate
+
 @include gpl.texi
 
 @node GNU Free Documentation License
diff --git a/gdb/dwarf2read.c b/gdb/dwarf2read.c
index 032fbd5f12..a5889ed370 100644
--- a/gdb/dwarf2read.c
+++ b/gdb/dwarf2read.c
@@ -16005,75 +16005,10 @@ write_psymtabs_to_index (struct objfile *objfile, const char *dir)
   do_cleanups (cleanup);
 }
 
-/* The mapped index file format is designed to be directly mmap()able
-   on any architecture. In most cases, a datum is represented using a
-   little-endian 32-bit integer value, called an offset_type. Big
-   endian machines must byte-swap the values before using them.
-   Exceptions to this rule are noted. The data is laid out such that
-   alignment is always respected.
-
-   A mapped index consists of several sections.
-
-   1. The file header. This is a sequence of values, of offset_type
-   unless otherwise noted:
-
-   [0] The version number, currently 4. Versions 1, 2 and 3 are
-   obsolete.
-   [1] The offset, from the start of the file, of the CU list.
-   [2] The offset, from the start of the file, of the types CU list.
-   Note that this section can be empty, in which case this offset will
-   be equal to the next offset.
-   [3] The offset, from the start of the file, of the address section.
-   [4] The offset, from the start of the file, of the symbol table.
-   [5] The offset, from the start of the file, of the constant pool.
-
-   2. The CU list. This is a sequence of pairs of 64-bit
-   little-endian values, sorted by the CU offset. The first element
-   in each pair is the offset of a CU in the .debug_info section. The
-   second element in each pair is the length of that CU. References
-   to a CU elsewhere in the map are done using a CU index, which is
-   just the 0-based index into this table. Note that if there are
-   type CUs, then conceptually CUs and type CUs form a single list for
-   the purposes of CU indices.
-
-   3. The types CU list. This is a sequence of triplets of 64-bit
-   little-endian values. In a triplet, the first value is the CU
-   offset, the second value is the type offset in the CU, and the
-   third value is the type signature. The types CU list is not
-   sorted.
-
-   4. The address section. The address section consists of a sequence
-   of address entries. Each address entry has three elements.
-   [0] The low address. This is a 64-bit little-endian value.
-   [1] The high address. This is a 64-bit little-endian value.
-   Like DW_AT_high_pc, the value is one byte beyond the end.
-   [2] The CU index. This is an offset_type value.
-
-   5. The symbol table. This is a hash table. The size of the hash
-   table is always a power of 2. The initial hash and the step are
-   currently defined by the `find_slot' function.
-
-   Each slot in the hash table consists of a pair of offset_type
-   values. The first value is the offset of the symbol's name in the
-   constant pool. The second value is the offset of the CU vector in
-   the constant pool.
-
-   If both values are 0, then this slot in the hash table is empty.
-   This is ok because while 0 is a valid constant pool index, it
-   cannot be a valid index for both a string and a CU vector.
-
-   A string in the constant pool is stored as a \0-terminated string,
-   as you'd expect.
-
-   A CU vector in the constant pool is a sequence of offset_type
-   values. The first value is the number of CU indices in the vector.
-   Each subsequent value is the index of a CU in the CU list. This
-   element in the hash table is used to indicate which CUs define the
-   symbol.
-
-   6. The constant pool. This is simply a bunch of bytes. It is
-   organized so that alignment is correct: CU vectors are stored
-   first, followed by strings. */
+/* Implementation of the `save gdb-index' command.
+
+   Note that the file format used by this command is documented in the
+   GDB manual. Any changes here must be documented there. */
 
 static void
 save_gdb_index_command (char *arg, int from_tty)
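
Reviewer note: the symbol table description in the new manual node (hash formula, step size, empty-slot rule) can be illustrated with a short reader-side sketch. This is not code from dwarf2read.c. The function names index_string_hash, index_step and index_find_slot are hypothetical, the values are assumed to be already in host byte order, and the choice of `hash & (size - 1)` as the first probed slot is an assumption, since the manual text only specifies the step.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The 32-bit datum the format calls offset_type.  */
typedef uint32_t offset_type;

/* Hash described in the manual text: start at r = 0 and fold in each
   unsigned character c with r = r * 67 + c - 113.  The terminating
   '\0' is not hashed.  Arithmetic wraps modulo 2^32.  */
static offset_type
index_string_hash (const char *name)
{
  const unsigned char *p = (const unsigned char *) name;
  offset_type r = 0;

  while (*p != '\0')
    r = r * 67 + *p++ - 113;
  return r;
}

/* Step size described in the manual text.  It is always odd, so with
   a power-of-2 table size repeated stepping visits every slot.  */
static offset_type
index_step (offset_type hash, offset_type size)
{
  return ((hash * 17) & (size - 1)) | 1;
}

/* Hypothetical lookup.  TABLE holds SIZE slots, each a pair
   (name offset, CU vector offset) into POOL.  Returns the slot index
   holding NAME, or -1 if an empty slot (both values 0) is reached.
   ASSUMPTION: the first slot probed is hash & (size - 1).  */
static long
index_find_slot (const offset_type *table, offset_type size,
		 const char *pool, const char *name)
{
  offset_type hash = index_string_hash (name);
  offset_type slot = hash & (size - 1);
  offset_type step = index_step (hash, size);
  offset_type probes;

  for (probes = 0; probes < size; ++probes)
    {
      offset_type name_off = table[2 * slot];
      offset_type vec_off = table[2 * slot + 1];

      if (name_off == 0 && vec_off == 0)
	return -1;		/* Empty slot: NAME is not present.  */
      if (strcmp (pool + name_off, name) == 0)
	return (long) slot;
      slot = (slot + step) & (size - 1);
    }
  return -1;			/* Table full and NAME not found.  */
}
```

The sketch ignores C++ name canonicalization, which the manual text says is only defined by the code, so it is only meaningful for names that are already in canonical form.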