43f3e411c4
Currently "symtabs" in gdb are stored as a single linked list of
struct symtab that contains both symbol symtabs (the blockvectors)
and file symtabs (the linetables).

This has led to confusion, bugs, and performance issues.

This patch is conceptually very simple: split struct symtab into
two pieces: one part containing things common across the entire
compilation unit, and one part containing things specific to each
source file.

Example.
For the case of a program built out of these files:

    foo.c
      foo1.h
      foo2.h
    bar.c
      foo1.h
      bar.h

Today we have a single list of struct symtabs:

    objfile -> foo.c -> foo1.h -> foo2.h -> bar.c -> foo1.h -> bar.h -> NULL

where "->" means the "next" pointer in struct symtab.

With this patch, that turns into:

    objfile -> foo.c(cu) -> bar.c(cu) -> NULL
                |            |
                v            v
               foo.c        bar.c
                |            |
                v            v
               foo1.h       foo1.h
                |            |
                v            v
               foo2.h       bar.h
                |            |
                v            v
               NULL         NULL

where "foo.c(cu)" and "bar.c(cu)" are struct compunit_symtab objects,
and the files foo.c, etc. are struct symtab objects.

So now, for example, when we want to iterate over all blockvectors
we can now just iterate over the compunit_symtab list.

Plus a lot of the data that was either unused or replicated for each
symtab in a compilation unit now lives in struct compunit_symtab.
E.g., the objfile pointer, the producer string, etc.
I thought of moving "language" out of struct symtab but there is
logic to try to compute the language based on previously seen files,
and I think that's best left as is for now.

With my standard monster benchmark with -readnow (which I can't
actually do, but based on my calculations), whereas today the list
requires 77MB to store all the struct symtabs, it now only requires
37MB. A modest space savings given the gigabytes needed for all the
debug info, etc. Still, it's nice. Plus, whereas today we create a
copy of dirname for each source file symtab in a compilation unit,
we now only create one for the compunit.
So this patch is basically just a data structure reorg;
I don't expect significant performance improvements from it.

Notes:

1) A followup patch can do a similar split for struct partial_symtab.
   I have left that until after I get the changes I want in to
   better utilize .gdb_index (it may affect how we do partial syms).

2) Another followup patch *could* rename struct symtab.
   The term "symtab" is ambiguous and has been a source of confusion.
   In this patch I'm leaving it alone, calling it the "historical" name
   of "filetabs", which is what they are now: just the file-name +
   line-table.

gdb/ChangeLog:

    Split struct symtab into two: struct symtab and compunit_symtab.
    * amd64-tdep.c (amd64_skip_xmm_prologue): Fetch producer from compunit.
    * block.c (blockvector_for_pc_sect): Change "struct symtab *" argument to "struct compunit_symtab *". All callers updated.
    (set_block_compunit_symtab): Renamed from set_block_symtab. Change "struct symtab *" argument to "struct compunit_symtab *". All callers updated.
    (get_block_compunit_symtab): Renamed from get_block_symtab. Change result to "struct compunit_symtab *". All callers updated.
    (find_iterator_compunit_symtab): Renamed from find_iterator_symtab. Change result to "struct compunit_symtab *". All callers updated.
    * block.h (struct global_block) <compunit_symtab>: Renamed from symtab. Change type to "struct compunit_symtab *". All uses updated.
    (struct block_iterator) <d.compunit_symtab>: Renamed from "d.symtab". Change type to "struct compunit_symtab *". All uses updated.
    * buildsym.c (struct buildsym_compunit): New struct.
    (subfiles, buildsym_compdir, buildsym_objfile, main_subfile): Delete.
    (buildsym_compunit): New static global.
    (finish_block_internal): Update to fetch objfile from buildsym_compunit.
    (make_blockvector): Delete objfile argument.
    (start_subfile): Rewrite to use buildsym_compunit. Don't initialize debugformat, producer.
    (start_buildsym_compunit): New function.
    (free_buildsym_compunit): Renamed from free_subfiles_list. All callers updated.
    (patch_subfile_names): Rewrite to use buildsym_compunit.
    (get_compunit_symtab): New function.
    (get_macro_table): Delete argument comp_dir. All callers updated.
    (start_symtab): Change result to "struct compunit_symtab *". All callers updated. Create the subfile of the main source file.
    (watch_main_source_file_lossage): Rewrite to use buildsym_compunit.
    (reset_symtab_globals): Update.
    (end_symtab_get_static_block): Update to use buildsym_compunit.
    (end_symtab_without_blockvector): Rewrite.
    (end_symtab_with_blockvector): Change result to "struct compunit_symtab *". All callers updated. Update to use buildsym_compunit. Don't set symtab->dirname, instead set it in the compunit. Explicitly make sure main symtab is first in its list. Set debugformat, producer, blockvector, block_line_section, and macrotable in the compunit.
    (end_symtab_from_static_block): Change result to "struct compunit_symtab *". All callers updated.
    (end_symtab, end_expandable_symtab): Ditto.
    (set_missing_symtab): Change symtab argument to "struct compunit_symtab *". All callers updated.
    (augment_type_symtab): Ditto.
    (record_debugformat): Update to use buildsym_compunit.
    (record_producer): Update to use buildsym_compunit.
    * buildsym.h (struct subfile) <dirname>: Delete.
    <producer, debugformat>: Delete.
    <buildsym_compunit>: New member.
    (get_compunit_symtab): Declare.
    * dwarf2read.c (struct type_unit_group) <compunit_symtab>: Renamed from primary_symtab. Change type to "struct compunit_symtab *". All uses updated.
    (dwarf2_start_symtab): Change result to "struct compunit_symtab *". All callers updated.
    (dwarf_decode_macros): Delete comp_dir argument. All callers updated.
    (struct dwarf2_per_cu_quick_data) <compunit_symtab>: Renamed from symtab. Change type to "struct compunit_symtab *". All uses updated.
    (dw2_instantiate_symtab): Change result to "struct compunit_symtab *". All callers updated.
    (dw2_find_last_source_symtab): Ditto.
    (dw2_lookup_symbol): Ditto.
    (recursively_find_pc_sect_compunit_symtab): Renamed from recursively_find_pc_sect_symtab. Change result to "struct compunit_symtab *". All callers updated.
    (dw2_find_pc_sect_compunit_symtab): Renamed from dw2_find_pc_sect_symtab. Change result to "struct compunit_symtab *". All callers updated.
    (get_compunit_symtab): Renamed from get_symtab. Change result to "struct compunit_symtab *". All callers updated.
    (recursively_compute_inclusions): Change type of immediate_parent argument to "struct compunit_symtab *". All callers updated.
    (compute_compunit_symtab_includes): Renamed from compute_symtab_includes. All callers updated. Rewrite to compute includes of compunit_symtabs and not symtabs.
    (process_full_comp_unit): Update to work with struct compunit_symtab.
    (process_full_type_unit): Ditto.
    (dwarf_decode_lines_1): Delete argument comp_dir. All callers updated.
    (dwarf_decode_lines): Remove special case handling of main subfile.
    (macro_start_file): Delete argument comp_dir. All callers updated.
    (dwarf_decode_macro_bytes): Ditto.
    * guile/scm-block.c (bkscm_print_block_syms_progress_smob): Update to use struct compunit_symtab.
    * i386-tdep.c (i386_skip_prologue): Fetch producer from compunit.
    * jit.c (finalize_symtab): Build compunit_symtab.
    * jv-lang.c (get_java_class_symtab): Change result to "struct compunit_symtab *". All callers updated.
    * macroscope.c (sal_macro_scope): Fetch macro table from compunit.
    * macrotab.c (struct macro_table) <compunit_symtab>: Renamed from comp_dir. Change type to "struct compunit_symtab *". All uses updated.
    (new_macro_table): Change comp_dir argument to cust, "struct compunit_symtab *". All callers updated.
    * maint.c (struct cmd_stats) <nr_compunit_symtabs>: Renamed from nr_primary_symtabs. All uses updated.
    (count_symtabs_and_blocks): Update to handle compunits.
    (report_command_stats): Update output, "primary symtabs" renamed to "compunits".
    * mdebugread.c (new_symtab): Change result to "struct compunit_symtab *". All callers updated.
    (parse_procedure): Change type of search_symtab argument to "struct compunit_symtab *". All callers updated.
    * objfiles.c (objfile_relocate1): Loop over blockvectors in a separate loop.
    * objfiles.h (struct objfile) <compunit_symtabs>: Renamed from symtabs. Change type to "struct compunit_symtab *". All uses updated.
    (ALL_OBJFILE_FILETABS): Renamed from ALL_OBJFILE_SYMTABS. All uses updated.
    (ALL_OBJFILE_COMPUNITS): Renamed from ALL_OBJFILE_PRIMARY_SYMTABS. All uses updated.
    (ALL_FILETABS): Renamed from ALL_SYMTABS. All uses updated.
    (ALL_COMPUNITS): Renamed from ALL_PRIMARY_SYMTABS. All uses updated.
    * psympriv.h (struct partial_symtab) <compunit_symtab>: Renamed from symtab. Change type to "struct compunit_symtab *". All uses updated.
    * psymtab.c (psymtab_to_symtab): Change result type to "struct compunit_symtab *". All callers updated.
    (find_pc_sect_compunit_symtab_from_partial): Renamed from find_pc_sect_symtab_from_partial. Change result type to "struct compunit_symtab *". All callers updated.
    (lookup_symbol_aux_psymtabs): Change result type to "struct compunit_symtab *". All callers updated.
    (find_last_source_symtab_from_partial): Ditto.
    * python/py-symtab.c (stpy_get_producer): Fetch producer from compunit.
    * source.c (forget_cached_source_info_for_objfile): Fetch debugformat and macro_table from compunit.
    * symfile-debug.c (debug_qf_find_last_source_symtab): Change result type to "struct compunit_symtab *". All callers updated.
    (debug_qf_lookup_symbol): Ditto.
    (debug_qf_find_pc_sect_compunit_symtab): Renamed from debug_qf_find_pc_sect_symtab, change result type to "struct compunit_symtab *". All callers updated.
    * symfile.c (allocate_symtab): Delete objfile argument. New argument cust.
    (allocate_compunit_symtab): New function.
    (add_compunit_symtab_to_objfile): New function.
    * symfile.h (struct quick_symbol_functions) <lookup_symbol>: Change result type to "struct compunit_symtab *". All uses updated.
    <find_pc_sect_compunit_symtab>: Renamed from find_pc_sect_symtab. Change result type to "struct compunit_symtab *". All uses updated.
    * symmisc.c (print_objfile_statistics): Compute blockvector count in separate loop.
    (dump_symtab_1): Update test for primary source symtab.
    (maintenance_info_symtabs): Update to handle compunit symtabs.
    (maintenance_check_symtabs): Ditto.
    * symtab.c (set_primary_symtab): Delete.
    (compunit_primary_filetab): New function.
    (compunit_language): New function.
    (iterate_over_some_symtabs): Change type of arguments "first", "after_last" to "struct compunit_symtab *". All callers updated. Update to loop over symtabs in each compunit.
    (error_in_psymtab_expansion): Rename symtab argument to cust, and change type to "struct compunit_symtab *". All callers updated.
    (find_pc_sect_compunit_symtab): Renamed from find_pc_sect_symtab. Change result type to "struct compunit_symtab *". All callers updated.
    (find_pc_compunit_symtab): Renamed from find_pc_symtab. Change result type to "struct compunit_symtab *". All callers updated.
    (find_pc_sect_line): Only loop over symtabs within selected compunit instead of all symtabs in the objfile.
    * symtab.h (struct symtab) <blockvector>: Moved to compunit_symtab.
    <compunit_symtab>: New member.
    <block_line_section>: Moved to compunit_symtab.
    <locations_valid>: Ditto.
    <epilogue_unwind_valid>: Ditto.
    <macro_table>: Ditto.
    <dirname>: Ditto.
    <debugformat>: Ditto.
    <producer>: Ditto.
    <objfile>: Ditto.
    <call_site_htab>: Ditto.
    <includes>: Ditto.
    <user>: Ditto.
    <primary>: Delete.
    (SYMTAB_COMPUNIT): New macro.
    (SYMTAB_BLOCKVECTOR): Update definition.
    (SYMTAB_OBJFILE): Update definition.
    (SYMTAB_DIRNAME): Update definition.
    (struct compunit_symtab): New type. Common members among all source symtabs within a compilation unit moved here. All uses updated.
    (COMPUNIT_OBJFILE): New macro.
    (COMPUNIT_FILETABS): New macro.
    (COMPUNIT_DEBUGFORMAT): New macro.
    (COMPUNIT_PRODUCER): New macro.
    (COMPUNIT_DIRNAME): New macro.
    (COMPUNIT_BLOCKVECTOR): New macro.
    (COMPUNIT_BLOCK_LINE_SECTION): New macro.
    (COMPUNIT_LOCATIONS_VALID): New macro.
    (COMPUNIT_EPILOGUE_UNWIND_VALID): New macro.
    (COMPUNIT_CALL_SITE_HTAB): New macro.
    (COMPUNIT_MACRO_TABLE): New macro.
    (ALL_COMPUNIT_FILETABS): New macro.
    (compunit_symtab_ptr): New typedef.
    (DEF_VEC_P (compunit_symtab_ptr)): New vector type.

gdb/testsuite/ChangeLog:

    * gdb.base/maint.exp: Update expected output.
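The split described in the commit message can be pictured with a small stand-alone sketch. Everything here is hypothetical and heavily simplified: the toy_ types, fields, and functions are stand-ins invented for illustration, not the real definitions from symtab.h, and the directory name is made up. The sketch builds the foo.c/bar.c example and shows that walking compunits no longer visits every file.

```c
#include <assert.h>
#include <stddef.h>

struct toy_compunit_symtab;

/* Per-source-file data (hypothetical): a file name and, in real GDB,
   a line table.  */
struct toy_symtab
{
  struct toy_symtab *next;               /* Next file in the same compunit.  */
  const char *filename;
  struct toy_compunit_symtab *compunit;  /* Back pointer to the owner.  */
};

/* Data shared by every source file of one compilation unit.  */
struct toy_compunit_symtab
{
  struct toy_compunit_symtab *next;      /* Next compunit in the objfile.  */
  struct toy_symtab *filetabs;           /* Primary source file comes first.  */
  const char *dirname;                   /* Stored once per compunit.  */
};

/* Build the foo.c/bar.c example in static storage.  */
static struct toy_compunit_symtab *
toy_build_example (void)
{
  static struct toy_compunit_symtab cus[2];
  static struct toy_symtab tabs[6];
  static const char *const names[6]
    = { "foo.c", "foo1.h", "foo2.h", "bar.c", "foo1.h", "bar.h" };

  for (int i = 0; i < 6; i++)
    {
      tabs[i].filename = names[i];
      tabs[i].compunit = &cus[i / 3];
      tabs[i].next = (i % 3 == 2) ? NULL : &tabs[i + 1];
    }
  cus[0].filetabs = &tabs[0];
  cus[0].dirname = "/toy/src";           /* Made-up directory.  */
  cus[0].next = &cus[1];
  cus[1].filetabs = &tabs[3];
  cus[1].dirname = "/toy/src";
  cus[1].next = NULL;
  return &cus[0];
}

/* One step per compunit, e.g. when visiting every blockvector.  */
static int
toy_count_compunits (const struct toy_compunit_symtab *cu)
{
  int n = 0;
  for (; cu != NULL; cu = cu->next)
    n++;
  return n;
}

/* Visiting every file means walking each compunit's own file list.  */
static int
toy_count_filetabs (const struct toy_compunit_symtab *cu)
{
  int n = 0;
  for (; cu != NULL; cu = cu->next)
    for (const struct toy_symtab *s = cu->filetabs; s != NULL; s = s->next)
      n++;
  return n;
}

/* Shared data is reached through the back pointer.  */
static const char *
toy_symtab_dirname (const struct toy_symtab *s)
{
  assert (s->compunit != NULL);
  return s->compunit->dirname;
}
```

For the example program this gives two compunits and six file symtabs, with each directory name stored once per compunit and not once per file.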
Files in this directory:

lib
README
guile-internal.h
guile.c
guile.h
scm-arch.c
scm-auto-load.c
scm-block.c
scm-breakpoint.c
scm-cmd.c
scm-disasm.c
scm-exception.c
scm-frame.c
scm-gsmob.c
scm-iterator.c
scm-lazy-string.c
scm-math.c
scm-objfile.c
scm-param.c
scm-ports.c
scm-pretty-print.c
scm-progspace.c
scm-safe-call.c
scm-string.c
scm-symbol.c
scm-symtab.c
scm-type.c
scm-utils.c
scm-value.c

README
README for gdb/guile
====================
This file contains important notes for gdb/guile developers.
["gdb/guile" refers to the directory you found this file in]
Nomenclature:
In the implementation we use "Scheme" or "Guile" depending on context.
And sometimes it doesn't matter.
Guile is Scheme, and for the most part this is what we present to the user
as well. However, to highlight the fact that it is Guile, the GDB commands
that invoke Scheme functions are named "guile" and "guile-repl",
abbreviated "gu" and "gr" respectively.
Co-existence with Python:
Keep the user interfaces reasonably consistent, but don't shy away from
providing a clearer (or more Scheme-friendly/consistent) user interface
where appropriate.
Additions to Python support or Scheme support don't require corresponding
changes in the other scripting language.
Scheme-wrapped breakpoints are created lazily so that if the user
doesn't use Scheme s/he doesn't pay any cost.
Importing the gdb module into Scheme:
To import the gdb module:
(gdb) guile (use-modules (gdb))
If you want to add a prefix to gdb module symbols:
(gdb) guile (use-modules ((gdb) #:renamer (symbol-prefix-proc 'gdb:)))
This gives every symbol a "gdb:" prefix which is a common convention.
OTOH it's more to type.
Implementation/Hacking notes:
Don't use scm_is_false.
For this C function, () == #f (a la Lisp) and it's not clear how treating
them as equivalent for truth values will affect the GDB interface.
Until the effect is clear avoid them.
Instead use gdbscm_is_false, gdbscm_is_true, gdbscm_is_bool.
There are macros in guile-internal.h to enforce this.
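To make the concern concrete, here is a toy model, not libguile code. It assumes the behavior this README describes (a Lisp-style test that also treats the empty list as false) and contrasts it with strict tests in the spirit of gdbscm_is_false, gdbscm_is_true, and gdbscm_is_bool. The enum and every toy_ function are invented for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy value type standing in for libguile's SCM; purely illustrative.  */
enum toy_scm { TOY_BOOL_F, TOY_BOOL_T, TOY_EOL, TOY_OTHER };

/* Lisp-style test as described above: the empty list is false too.  */
static bool
toy_lispy_is_false (enum toy_scm v)
{
  return v == TOY_BOOL_F || v == TOY_EOL;
}

/* Strict tests: only #f is false, only #f and #t are booleans.  */
static bool
toy_is_false (enum toy_scm v)
{
  return v == TOY_BOOL_F;
}

static bool
toy_is_true (enum toy_scm v)
{
  return !toy_is_false (v);
}

static bool
toy_is_bool (enum toy_scm v)
{
  return v == TOY_BOOL_F || v == TOY_BOOL_T;
}

/* The two styles disagree exactly on the empty list.  */
static void
toy_check_truth_rules (void)
{
  assert (toy_lispy_is_false (TOY_EOL));
  assert (!toy_is_false (TOY_EOL));
}
```

With the strict versions an empty list passed by the user counts as a true value, which is what Scheme programmers expect.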
Use gdbscm_foo as the name of functions that implement Scheme procedures
to provide consistent naming in error messages. The user can see "gdbscm"
in the name and immediately know where the function came from.
All smobs contain gdb_smob or chained_gdb_smob as the first member.
This provides a mechanism for extending them in the Scheme side without
tying GDB to the details.
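A minimal sketch of the "common header as first member" layout follows. The struct and function names are hypothetical, and the int slot is only a placeholder for whatever the real gdb_smob carries. The point is that C lets a pointer to a struct be converted to a pointer to its first member, so generic code can work on the header alone.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical common header, standing in for gdb_smob.  */
struct toy_base_smob
{
  int aux;   /* Placeholder for data the Scheme side may attach.  */
};

/* Hypothetical concrete smob; the header must be the first member.  */
struct toy_frame_smob
{
  struct toy_base_smob base;
  int frame_level;
};

/* View a concrete smob through its header.  */
static struct toy_base_smob *
toy_frame_as_base (struct toy_frame_smob *f_smob)
{
  assert (offsetof (struct toy_frame_smob, base) == 0);
  return (struct toy_base_smob *) f_smob;
}

/* Generic code only knows about the header.  */
static void
toy_set_aux (struct toy_base_smob *base, int aux)
{
  base->aux = aux;
}

static int
toy_get_aux (const struct toy_base_smob *base)
{
  return base->aux;
}
```

New kinds of smobs can then be added without touching the generic helpers, as long as they keep the header first.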
The lifetime of a smob, AIUI, is decided by the containing SCM.
When there is no longer a reference to the containing SCM then the
smob can be GC'd. Objects that have references from outside of Scheme,
e.g., breakpoints, need to be protected from GC.
Don't do something that can cause a Scheme exception inside a TRY_CATCH,
and, in code that can be called from Scheme, don't do something that can
cause a GDB exception outside a TRY_CATCH.
This makes the code a little tricky to write sometimes, but it is a
rule imposed by the programming environment. Bugs often happen because
this rule is broken. Learn it, follow it.
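The shape of code that follows this rule can be sketched with a toy model. Both kinds of exception are modeled here with plain setjmp/longjmp purely for illustration; this is not TRY_CATCH or libguile, and every name is invented. The GDB error is caught and merely recorded inside the protected region, and the Scheme exception is thrown only after that region has been left.

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf toy_gdb_catcher;     /* Target for toy "GDB errors".  */
static jmp_buf toy_scheme_catcher;  /* Target for toy "Scheme exceptions".  */
static int toy_try_depth;           /* Number of active toy TRY blocks.  */

/* Toy GDB error: only legal while a toy TRY block is active.  */
static void
toy_gdb_error (void)
{
  assert (toy_try_depth > 0);
  longjmp (toy_gdb_catcher, 1);
}

/* Toy Scheme exception: only legal outside every toy TRY block.  */
static void
toy_scheme_throw (void)
{
  assert (toy_try_depth == 0);
  longjmp (toy_scheme_catcher, 1);
}

/* Body of a procedure callable from "Scheme".  */
static int
toy_procedure (int should_fail)
{
  volatile int failed = 0;

  toy_try_depth++;
  if (setjmp (toy_gdb_catcher) == 0)
    {
      if (should_fail)
        toy_gdb_error ();
    }
  else
    failed = 1;            /* Just record it; no Scheme throw in here.  */
  toy_try_depth--;

  if (failed)
    toy_scheme_throw ();   /* Converted after leaving the TRY block.  */
  return 42;
}

/* Toy "Scheme" caller: the result, or -1 if an exception arrived.  */
static int
toy_call_from_scheme (int should_fail)
{
  if (setjmp (toy_scheme_catcher) != 0)
    return -1;
  return toy_procedure (should_fail);
}
```

The two asserts encode the rule itself: throwing the Scheme-side exception from inside the protected region, or raising the GDB-side error outside one, would trip them.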
Coding style notes:
- If you find violations of these rules, let's fix the code.
Some attempt has been made to be consistent, but it's early.
Over time we want things to be more consistent, not less.
- None of this really needs to be read. Instead, do not be creative:
Monkey-See-Monkey-Do hacking should generally Just Work.
- Absence of the word "typically" means the rule is reasonably strict.
- The gdbscm_initialize_foo function (e.g., gdbscm_initialize_values)
is the last thing to appear in the file, immediately preceded by any
tables of exported variables and functions.
- In addition to these of course, follow GDB coding conventions.
General naming rules:
- The word "object" absent any modifier (like "GOOPS object") means a
Scheme object (of any type), and is never used otherwise.
If you want to refer to, e.g., a GOOPS object, say "GOOPS object".
- Do not begin any function, global variable, etc. name with scm_.
That's what the Guile implementation uses.
(kinda obvious, just being complete).
- The word "invalid" carries a specific connotation. Try not to use it
in a different way. It means the underlying GDB object has disappeared.
For example, a <gdb:objfile> smob becomes "invalid" when the underlying
objfile is removed from GDB.
- We typically use the word "exception" to mean Scheme exceptions,
and we typically use the word "error" to mean GDB errors.
Comments:
- function comments for functions implementing Scheme procedures begin with
a description of the Scheme usage. Example:
/* (gsmob-aux gsmob) -> object */
- the following comment appears after the copyright header:
/* See README file in this directory for implementation notes, coding
conventions, et.al. */
Smob naming:
- gdb smobs are named, internally, "gdb:foo"
- in Guile they become <gdb:foo>; that is the convention for naming classes,
and smobs have rudimentary GOOPS support (they can't be inherited from,
but generics can work with them)
- in comments use the Guile naming for smobs,
i.e., <gdb:foo> instead of gdb:foo.
Note: This only applies to smobs. Exceptions are also named gdb:foo,
but since they are not "classes" they are not wrapped in <>.
- smob names are stored in a global, and for simplicity we pass this
global as the "expected type" parameter to SCM_ASSERT_TYPE, thus in
this instance smob types are printed without the <>.
[Hmmm, this rule seems dated now. Plus I18N rules in GDB are not always
clear, sometimes we pass the smob name through _(), however it's not
clear that's actually a good idea.]
Type naming:
- smob structs are typedefs named foo_smob
Variable naming:
- "scm" by itself is reserved for arbitrary Scheme objects
- variables that are pointers to smob structs are named <char>_smob or
<char><char>_smob, e.g., f_smob for a pointer to a frame smob
- variables that are gdb smob objects are typically named <char>_scm or
<char><char>_scm, e.g., f_scm for a <gdb:frame> object
- the name of the first argument for method-like functions is "self"
Function naming:
General:
- all non-static functions have a prefix,
either gdbscm_ or <char><char>scm_ [or <char><char><char>scm_]
- all functions that implement Scheme procedures have a gdbscm_ prefix,
this is for consistency and readability of Scheme exception text
- static functions typically have a prefix
  - the prefix is typically <char><char>scm_ where the first two letters
    are unique to the file or class the function works with.
    E.g., the scm-arch.c prefix is arscm_.
    This follows something used in gdb/python in some places,
    we make it formal.
  - if the function is of a general nature, or no other prefix works,
    use gdbscm_
Conversion functions:
- the from/to in function names follows from libguile's existing style
- conversions from/to Scheme objects are named:
prefix_scm_from_foo: converts from foo to scm
prefix_scm_to_foo: converts from scm to foo
Exception handling:
- functions that may throw a Scheme exception have an _unsafe suffix
  - This does not apply to functions that implement Scheme procedures.
  - This does not apply to functions whose explicit job is to throw
    an exception. Adding _unsafe to gdbscm_throw is kinda superfluous. :-)
- functions that can throw a GDB error aren't adorned with _unsafe
- "_safe" in a function name means it will never throw an exception
  - Generally unnecessary, since the convention is to mark the ones that
    *can* throw an exception. But sometimes it's useful to highlight the
    fact that the function is safe to call without worrying about exception
    handling.
- except for functions that implement Scheme procedures, all functions
that can throw exceptions (GDB or Scheme) say so in their function comment
- functions that don't throw an exception, but still need to indicate to
the caller that one happened (i.e., "safe" functions), either return
a <gdb:exception> smob as a result or pass it back via a parameter.
For this reason don't pass back <gdb:exception> smobs for any other
reason. There are functions that explicitly construct <gdb:exception>
smobs. They're obviously the, umm, exception.
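The "pass the exception back as a value" convention can be sketched as below. This is a toy model under stated assumptions: struct toy_scm stands in for a Scheme value that may be a <gdb:exception> smob, and all names are hypothetical. The function name follows the prefix_scm_from_foo pattern with a _safe suffix.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a Scheme value that may be an exception object.  */
struct toy_scm
{
  int is_exception;     /* Nonzero when this value is an exception.  */
  const char *message;  /* Exception text, or NULL.  */
  long number;          /* Payload when not an exception.  */
};

static int
toy_is_exception (struct toy_scm v)
{
  return v.is_exception;
}

/* "Safe" conversion: never exits non-locally.  On failure it returns an
   exception object for the caller to deal with.  */
static struct toy_scm
toy_scm_from_digit_safe (char c)
{
  struct toy_scm result = { 0, NULL, 0 };

  if (c < '0' || c > '9')
    {
      result.is_exception = 1;
      result.message = "not a decimal digit";
    }
  else
    result.number = c - '0';
  return result;
}

/* Caller-side pattern: check for the exception object before using the
   value.  Returns -1 when an exception was passed back.  */
static long
toy_digit_or_minus_one (char c)
{
  struct toy_scm v = toy_scm_from_digit_safe (c);

  if (toy_is_exception (v))
    return -1;   /* Real code would throw here, at a safe point.  */
  assert (v.message == NULL);
  return v.number;
}
```

Because an exception object always means "something went wrong", the caller needs only one check, which is why such objects should not be passed back for any other reason.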
Internal functions:
- internal Scheme functions begin with "%" and are intentionally undocumented
in the manual
Standard Guile/Scheme conventions:
- predicates that return Scheme values have the suffix _p and have suffix "?"
in the Scheme procedure's name
- functions that implement Scheme procedures that modify state have the
suffix _x and have suffix "!" in the Scheme procedure's name
- object predicates that return a C truth value are named prefix_is_foo
- functions that set something have "set" at the front (except for a prefix)
write this: gdbscm_set_gsmob_aux_x implements (set-gsmob-aux! ...)
not this: gdbscm_gsmob_set_aux_x implements (gsmob-set-aux! ...)
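The suffix rules above are regular enough to write down as a small name-mapping helper. This is only an illustration of the convention, assuming a simplified reading of it: gdbscm_ prefix, "-" becomes "_", a trailing "?" becomes "_p", a trailing "!" becomes "_x". The helper itself is hypothetical and not part of GDB.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Write the conventional C function name for SCHEME_NAME into OUT.
   Returns 0 on success, -1 if OUT is too small.  */
static int
toy_c_name_for_procedure (const char *scheme_name, char *out, size_t out_size)
{
  static const char prefix[] = "gdbscm_";
  const char *suffix = "";
  size_t len, pos;

  assert (scheme_name != NULL && out != NULL);
  len = strlen (scheme_name);

  if (len > 0 && scheme_name[len - 1] == '?')
    {
      suffix = "_p";   /* Predicate.  */
      len--;
    }
  else if (len > 0 && scheme_name[len - 1] == '!')
    {
      suffix = "_x";   /* Modifies state.  */
      len--;
    }

  /* Room for prefix, body, suffix, and the terminating NUL.  */
  if (strlen (prefix) + len + strlen (suffix) + 1 > out_size)
    return -1;

  strcpy (out, prefix);
  pos = strlen (prefix);
  for (size_t i = 0; i < len; i++)
    out[pos++] = scheme_name[i] == '-' ? '_' : scheme_name[i];
  strcpy (out + pos, suffix);
  return 0;
}
```

Given "set-gsmob-aux!" the helper produces "gdbscm_set_gsmob_aux_x", matching the "write this" line above.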
Doc strings:
- there are lots of existing examples, they should be pretty consistent,
use them as boilerplate/examples
- begin with a one line summary (can be multiple lines if necessary)
- if the arguments need description:
  - blank line
  - " Arguments: arg1 arg2"
    " arg1: blah ..."
    " arg2: blah ..."
- if the result requires more description:
  - blank line
  - " Returns:"
    " Blah ..."
- if it's important to list exceptions that can be thrown:
  - blank line
  - " Throws:"
    " exception-name: blah ..."