binutils-gdb/gdb/guile
Doug Evans 43f3e411c4 Split struct symtab into two: struct symtab and compunit_symtab.
Currently "symtabs" in gdb are stored as a single linked list of
struct symtab that contains both symbol symtabs (the blockvectors)
and file symtabs (the linetables).

This has led to confusion, bugs, and performance issues.

This patch is conceptually very simple: split struct symtab into
two pieces: one part containing things common across the entire
compilation unit, and one part containing things specific to each
source file.

Example.
For the case of a program built out of these files:

foo.c
  foo1.h
  foo2.h
bar.c
  foo1.h
  bar.h

Today we have a single list of struct symtabs:

objfile -> foo.c -> foo1.h -> foo2.h -> bar.c -> foo1.h -> bar.h -> NULL

where "->" means the "next" pointer in struct symtab.

With this patch, that turns into:

objfile -> foo.c(cu) -> bar.c(cu) -> NULL
            |            |
            v            v
           foo.c        bar.c
            |            |
            v            v
           foo1.h       foo1.h
            |            |
            v            v
           foo2.h       bar.h
            |            |
            v            v
           NULL         NULL

where "foo.c(cu)" and "bar.c(cu)" are struct compunit_symtab objects,
and the files foo.c, etc. are struct symtab objects.

So now, for example, when we want to iterate over all blockvectors
we can now just iterate over the compunit_symtab list.

Plus a lot of the data that was either unused or replicated for each
symtab in a compilation unit now lives in struct compunit_symtab.
E.g., the objfile pointer, the producer string, etc.
I thought of moving "language" out of struct symtab but there is
logic to try to compute the language based on previously seen files,
and I think that's best left as is for now.
With my standard monster benchmark with -readnow (which I can't actually
do, but based on my calculations), whereas today the list requires
77MB to store all the struct symtabs, it now only requires 37MB.
A modest space savings given the gigabytes needed for all the debug info,
etc.  Still, it's nice.  Plus, whereas today we create a copy of dirname
for each source file symtab in a compilation unit, we now only create one
for the compunit.

So this patch is basically just a data structure reorg,
I don't expect significant performance improvements from it.

Notes:

1) A followup patch can do a similar split for struct partial_symtab.
I have left that until after I get the changes I want in to
better utilize .gdb_index (it may affect how we do partial syms).

2) Another followup patch *could* rename struct symtab.
The term "symtab" is ambiguous and has been a source of confusion.
In this patch I'm leaving it alone, calling it the "historical" name
of "filetabs", which is what they are now: just the file-name + line-table.

gdb/ChangeLog:

	Split struct symtab into two: struct symtab and compunit_symtab.
	* amd64-tdep.c (amd64_skip_xmm_prologue): Fetch producer from compunit.
	* block.c (blockvector_for_pc_sect): Change "struct symtab *" argument
	to "struct compunit_symtab *".  All callers updated.
	(set_block_compunit_symtab): Renamed from set_block_symtab.  Change
	"struct symtab *" argument to "struct compunit_symtab *".
	All callers updated.
	(get_block_compunit_symtab): Renamed from get_block_symtab.  Change
	result to "struct compunit_symtab *".  All callers updated.
	(find_iterator_compunit_symtab): Renamed from find_iterator_symtab.
	Change result to "struct compunit_symtab *".  All callers updated.
	* block.h (struct global_block) <compunit_symtab>: Renamed from symtab.
	hange type to "struct compunit_symtab *".  All uses updated.
	(struct block_iterator) <d.compunit_symtab>: Renamed from "d.symtab".
	Change type to "struct compunit_symtab *".  All uses updated.
	* buildsym.c (struct buildsym_compunit): New struct.
	(subfiles, buildsym_compdir, buildsym_objfile, main_subfile): Delete.
	(buildsym_compunit): New static global.
	(finish_block_internal): Update to fetch objfile from
	buildsym_compunit.
	(make_blockvector): Delete objfile argument.
	(start_subfile): Rewrite to use buildsym_compunit.  Don't initialize
	debugformat, producer.
	(start_buildsym_compunit): New function.
	(free_buildsym_compunit): Renamed from free_subfiles_list.
	All callers updated.
	(patch_subfile_names): Rewrite to use buildsym_compunit.
	(get_compunit_symtab): New function.
	(get_macro_table): Delete argument comp_dir.  All callers updated.
	(start_symtab): Change result to "struct compunit_symtab *".
	All callers updated.  Create the subfile of the main source file.
	(watch_main_source_file_lossage): Rewrite to use buildsym_compunit.
	(reset_symtab_globals): Update.
	(end_symtab_get_static_block): Update to use buildsym_compunit.
	(end_symtab_without_blockvector): Rewrite.
	(end_symtab_with_blockvector): Change result to
	"struct compunit_symtab *".  All callers updated.
	Update to use buildsym_compunit.  Don't set symtab->dirname,
	instead set it in the compunit.
	Explicitly make sure main symtab is first in its list.
	Set debugformat, producer, blockvector, block_line_section, and
	macrotable in the compunit.
	(end_symtab_from_static_block): Change result to
	"struct compunit_symtab *".  All callers updated.
	(end_symtab, end_expandable_symtab): Ditto.
	(set_missing_symtab): Change symtab argument to
	"struct compunit_symtab *".  All callers updated.
	(augment_type_symtab): Ditto.
	(record_debugformat): Update to use buildsym_compunit.
	(record_producer): Update to use buildsym_compunit.
	* buildsym.h (struct subfile) <dirname>: Delete.
	<producer, debugformat>: Delete.
	<buildsym_compunit>: New member.
	(get_compunit_symtab): Declare.
	* dwarf2read.c (struct type_unit_group) <compunit_symtab>: Renamed
	from primary_symtab.  Change type to "struct compunit_symtab *".
	All uses updated.
	(dwarf2_start_symtab): Change result to "struct compunit_symtab *".
	All callers updated.
	(dwarf_decode_macros): Delete comp_dir argument.  All callers updated.
	(struct dwarf2_per_cu_quick_data) <compunit_symtab>: Renamed from
	symtab.  Change type to "struct compunit_symtab *".  All uses updated.
	(dw2_instantiate_symtab): Change result to "struct compunit_symtab *".
	All callers updated.
	(dw2_find_last_source_symtab): Ditto.
	(dw2_lookup_symbol): Ditto.
	(recursively_find_pc_sect_compunit_symtab): Renamed from
	recursively_find_pc_sect_symtab.  Change result to
	"struct compunit_symtab *".  All callers updated.
	(dw2_find_pc_sect_compunit_symtab): Renamed from
	dw2_find_pc_sect_symtab.  Change result to
	"struct compunit_symtab *".  All callers updated.
	(get_compunit_symtab): Renamed from get_symtab.  Change result to
	"struct compunit_symtab *".  All callers updated.
	(recursively_compute_inclusions): Change type of immediate_parent
	argument to "struct compunit_symtab *".  All callers updated.
	(compute_compunit_symtab_includes): Renamed from
	compute_symtab_includes.  All callers updated.  Rewrite to compute
	includes of compunit_symtabs and not symtabs.
	(process_full_comp_unit): Update to work with struct compunit_symtab.
	(process_full_type_unit): Ditto.
	(dwarf_decode_lines_1): Delete argument comp_dir.  All callers updated.
	(dwarf_decode_lines): Remove special case handling of main subfile.
	(macro_start_file): Delete argument comp_dir.  All callers updated.
	(dwarf_decode_macro_bytes): Ditto.
	* guile/scm-block.c (bkscm_print_block_syms_progress_smob): Update to
	use struct compunit_symtab.
	* i386-tdep.c (i386_skip_prologue): Fetch producer from compunit.
	* jit.c (finalize_symtab): Build compunit_symtab.
	* jv-lang.c (get_java_class_symtab): Change result to
	"struct compunit_symtab *".  All callers updated.
	* macroscope.c (sal_macro_scope): Fetch macro table from compunit.
	* macrotab.c (struct macro_table) <compunit_symtab>: Renamed from
	comp_dir.  Change type to "struct compunit_symtab *".
	All uses updated.
	(new_macro_table): Change comp_dir argument to cust,
	"struct compunit_symtab *".  All callers updated.
	* maint.c (struct cmd_stats) <nr_compunit_symtabs>: Renamed from
	nr_primary_symtabs.  All uses updated.
	(count_symtabs_and_blocks): Update to handle compunits.
	(report_command_stats): Update output, "primary symtabs" renamed to
	"compunits".
	* mdebugread.c (new_symtab): Change result to
	"struct compunit_symtab *".  All callers updated.
	(parse_procedure): Change type of search_symtab argument to
	"struct compunit_symtab *".  All callers updated.
	* objfiles.c (objfile_relocate1): Loop over blockvectors in a
	separate loop.
	* objfiles.h (struct objfile) <compunit_symtabs>: Renamed from
	symtabs.  Change type to "struct compunit_symtab *".  All uses updated.
	(ALL_OBJFILE_FILETABS): Renamed from ALL_OBJFILE_SYMTABS.
	All uses updated.
	(ALL_OBJFILE_COMPUNITS): Renamed from ALL_OBJFILE_PRIMARY_SYMTABS.
	All uses updated.
	(ALL_FILETABS): Renamed from ALL_SYMTABS.  All uses updated.
	(ALL_COMPUNITS): Renamed from ALL_PRIMARY_SYMTABS.  All uses updated.
	* psympriv.h (struct partial_symtab) <compunit_symtab>: Renamed from
	symtab.  Change type to "struct compunit_symtab *".  All uses updated.
	* psymtab.c (psymtab_to_symtab): Change result type to
	"struct compunit_symtab *".  All callers updated.
	(find_pc_sect_compunit_symtab_from_partial): Renamed from
	find_pc_sect_symtab_from_partial.  Change result type to
	"struct compunit_symtab *".  All callers updated.
	(lookup_symbol_aux_psymtabs): Change result type to
	"struct compunit_symtab *".  All callers updated.
	(find_last_source_symtab_from_partial): Ditto.
	* python/py-symtab.c (stpy_get_producer): Fetch producer from compunit.
	* source.c (forget_cached_source_info_for_objfile): Fetch debugformat
	and macro_table from compunit.
	* symfile-debug.c (debug_qf_find_last_source_symtab): Change result
	type to "struct compunit_symtab *".  All callers updated.
	(debug_qf_lookup_symbol): Ditto.
	(debug_qf_find_pc_sect_compunit_symtab): Renamed from
	debug_qf_find_pc_sect_symtab, change result type to
	"struct compunit_symtab *".  All callers updated.
	* symfile.c (allocate_symtab): Delete objfile argument.
	New argument cust.
	(allocate_compunit_symtab): New function.
	(add_compunit_symtab_to_objfile): New function.
	* symfile.h (struct quick_symbol_functions) <lookup_symbol>:
	Change result type to "struct compunit_symtab *".  All uses updated.
	<find_pc_sect_compunit_symtab>: Renamed from find_pc_sect_symtab.
	Change result type to "struct compunit_symtab *".  All uses updated.
	* symmisc.c (print_objfile_statistics): Compute blockvector count in
	separate loop.
	(dump_symtab_1): Update test for primary source symtab.
	(maintenance_info_symtabs): Update to handle compunit symtabs.
	(maintenance_check_symtabs): Ditto.
	* symtab.c (set_primary_symtab): Delete.
	(compunit_primary_filetab): New function.
	(compunit_language): New function.
	(iterate_over_some_symtabs): Change type of arguments "first",
	"after_last" to "struct compunit_symtab *".  All callers updated.
	Update to loop over symtabs in each compunit.
	(error_in_psymtab_expansion): Rename symtab argument to cust,
	and change type to "struct compunit_symtab *".  All callers updated.
	(find_pc_sect_compunit_symtab): Renamed from find_pc_sect_symtab.
	Change result type to "struct compunit_symtab *".  All callers updated.
	(find_pc_compunit_symtab): Renamed from find_pc_symtab.
	Change result type to "struct compunit_symtab *".  All callers updated.
	(find_pc_sect_line): Only loop over symtabs within selected compunit
	instead of all symtabs in the objfile.
	* symtab.h (struct symtab) <blockvector>: Moved to compunit_symtab.
	<compunit_symtab> New member.
	<block_line_section>: Moved to compunit_symtab.
	<locations_valid>: Ditto.
	<epilogue_unwind_valid>: Ditto.
	<macro_table>: Ditto.
	<dirname>: Ditto.
	<debugformat>: Ditto.
	<producer>: Ditto.
	<objfile>: Ditto.
	<call_site_htab>: Ditto.
	<includes>: Ditto.
	<user>: Ditto.
	<primary>: Delete
	(SYMTAB_COMPUNIT): New macro.
	(SYMTAB_BLOCKVECTOR): Update definition.
	(SYMTAB_OBJFILE): Update definition.
	(SYMTAB_DIRNAME): Update definition.
	(struct compunit_symtab): New type.  Common members among all source
	symtabs within a compilation unit moved here.  All uses updated.
	(COMPUNIT_OBJFILE): New macro.
	(COMPUNIT_FILETABS): New macro.
	(COMPUNIT_DEBUGFORMAT): New macro.
	(COMPUNIT_PRODUCER): New macro.
	(COMPUNIT_DIRNAME): New macro.
	(COMPUNIT_BLOCKVECTOR): New macro.
	(COMPUNIT_BLOCK_LINE_SECTION): New macro.
	(COMPUNIT_LOCATIONS_VALID): New macro.
	(COMPUNIT_EPILOGUE_UNWIND_VALID): New macro.
	(COMPUNIT_CALL_SITE_HTAB): New macro.
	(COMPUNIT_MACRO_TABLE): New macro.
	(ALL_COMPUNIT_FILETABS): New macro.
	(compunit_symtab_ptr): New typedef.
	(DEF_VEC_P (compunit_symtab_ptr)): New vector type.

gdb/testsuite/ChangeLog:

	* gdb.base/maint.exp: Update expected output.
2014-11-20 07:47:44 -08:00
..
lib
README
guile-internal.h
guile.c
guile.h
scm-arch.c
scm-auto-load.c
scm-block.c Split struct symtab into two: struct symtab and compunit_symtab. 2014-11-20 07:47:44 -08:00
scm-breakpoint.c make "permanent breakpoints" per location and disableable 2014-11-12 10:37:57 +00:00
scm-cmd.c
scm-disasm.c
scm-exception.c
scm-frame.c SYMTAB_OBJFILE: New macro. 2014-11-18 09:19:11 -08:00
scm-gsmob.c
scm-iterator.c
scm-lazy-string.c
scm-math.c
scm-objfile.c
scm-param.c
scm-ports.c
scm-pretty-print.c
scm-progspace.c
scm-safe-call.c
scm-string.c
scm-symbol.c Use SYMBOL_OBJFILE more. 2014-11-18 08:54:06 -08:00
scm-symtab.c symtab.h (SYMTAB_BLOCKVECTOR): Renamed from BLOCKVECTOR. All uses updated. 2014-11-18 09:41:45 -08:00
scm-type.c Delete TYPE_CODE_CLASS, it's just an alias of TYPE_CODE_STRUCT. 2014-11-06 17:19:06 -08:00
scm-utils.c
scm-value.c Delete TYPE_CODE_CLASS, it's just an alias of TYPE_CODE_STRUCT. 2014-11-06 17:19:06 -08:00

README

This file contains invisible Unicode characters

This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

README for gdb/guile
====================

This file contains important notes for gdb/guile developers.
["gdb/guile" refers to the directory you found this file in]

Nomenclature:

  In the implementation we use "Scheme" or "Guile" depending on context.
  And sometimes it doesn't matter.
  Guile is Scheme, and for the most part this is what we present to the user
  as well.  However, to highlight the fact that it is Guile, the GDB commands
  that invoke Scheme functions are named "guile" and "guile-repl",
  abbreviated "gu" and "gr" respectively.

Co-existence with Python:

  Keep the user interfaces reasonably consistent, but don't shy away from
  providing a clearer (or more Scheme-friendly/consistent) user interface
  where appropriate.

  Additions to Python support or Scheme support don't require corresponding
  changes in the other scripting language.

  Scheme-wrapped breakpoints are created lazily so that if the user
  doesn't use Scheme s/he doesn't pay any cost.

Importing the gdb module into Scheme:

  To import the gdb module:
  (gdb) guile (use-modules (gdb))

  If you want to add a prefix to gdb module symbols:
  (gdb) guile (use-modules ((gdb) #:renamer (symbol-prefix-proc 'gdb:)))
  This gives every symbol a "gdb:" prefix which is a common convention.
  OTOH it's more to type.

Implementation/Hacking notes:

  Don't use scm_is_false.
  For this C function, () == #f (a la Lisp) and it's not clear how treating
  them as equivalent for truth values will affect the GDB interface.
  Until the effect is clear avoid them.
  Instead use gdbscm_is_false, gdbscm_is_true, gdbscm_is_bool.
  There are macros in guile-internal.h to enforce this.

  Use gdbscm_foo as the name of functions that implement Scheme procedures
  to provide consistent naming in error messages.  The user can see "gdbscm"
  in the name and immediately know where the function came from.

  All smobs contain gdb_smob or chained_gdb_smob as the first member.
  This provides a mechanism for extending them in the Scheme side without
  tying GDB to the details.

  The lifetime of a smob, AIUI, is decided by the containing SCM.
  When there is no longer a reference to the containing SCM then the
  smob can be GC'd.  Objects that have references from outside of Scheme,
  e.g., breakpoints, need to be protected from GC.

  Don't do something that can cause a Scheme exception inside a TRY_CATCH,
  and, in code that can be called from Scheme, don't do something that can
  cause a GDB exception outside a TRY_CATCH.
  This makes the code a little tricky to write sometimes, but it is a
  rule imposed by the programming environment.  Bugs often happen because
  this rule is broken.  Learn it, follow it.

Coding style notes:

  - If you find violations to these rules, let's fix the code.
    Some attempt has been made to be consistent, but it's early.
    Over time we want things to be more consistent, not less.

  - None of this really needs to be read.  Instead, do not be creative:
    Monkey-See-Monkey-Do hacking should generally Just Work.

  - Absence of the word "typically" means the rule is reasonably strict.

  - The gdbscm_initialize_foo function (e.g., gdbscm_initialize_values)
    is the last thing to appear in the file, immediately preceded by any
    tables of exported variables and functions.

  - In addition to these of course, follow GDB coding conventions.

General naming rules:

  - The word "object" absent any modifier (like "GOOPS object") means a
    Scheme object (of any type), and is never used otherwise.
    If you want to refer to, e.g., a GOOPS object, say "GOOPS object".

  - Do not begin any function, global variable, etc. name with scm_.
    That's what the Guile implementation uses.
    (kinda obvious, just being complete).

  - The word "invalid" carries a specific connotation.  Try not to use it
    in a different way.  It means the underlying GDB object has disappeared.
    For example, a <gdb:objfile> smob becomes "invalid" when the underlying
    objfile is removed from GDB.

  - We typically use the word "exception" to mean Scheme exceptions,
    and we typically use the word "error" to mean GDB errors.

Comments:

  - function comments for functions implementing Scheme procedures begin with
    a description of the Scheme usage.  Example:
    /* (gsmob-aux gsmob) -> object */

  - the following comment appears after the copyright header:
    /* See README file in this directory for implementation notes, coding
       conventions, et.al.  */

Smob naming:

  - gdb smobs are named, internally, "gdb:foo"
  - in Guile they become <gdb:foo>, that is the convention for naming classes
    and smobs have rudimentary GOOPS support (they can't be inherited from,
    but generics can work with them)
  - in comments use the Guile naming for smobs,
    i.e., <gdb:foo> instead of gdb:foo.
    Note: This only applies to smobs.  Exceptions are also named gdb:foo,
    but since they are not "classes" they are not wrapped in <>.
  - smob names are stored in a global, and for simplicity we pass this
    global as the "expected type" parameter to SCM_ASSERT_TYPE, thus in
    this instance smob types are printed without the <>.
    [Hmmm, this rule seems dated now.  Plus I18N rules in GDB are not always
    clear, sometimes we pass the smob name through _(), however it's not
    clear that's actually a good idea.]

Type naming:

  - smob structs are typedefs named foo_smob

Variable naming:

  - "scm" by itself is reserved for arbitrary Scheme objects

  - variables that are pointers to smob structs are named <char>_smob or
    <char><char>_smob, e.g., f_smob for a pointer to a frame smob

  - variables that are gdb smob objects are typically named <char>_scm or
    <char><char>_scm, e.g., f_scm for a <gdb:frame> object

  - the name of the first argument for method-like functions is "self"

Function naming:

  General:

  - all non-static functions have a prefix,
    either gdbscm_ or <char><char>scm_ [or <char><char><char>scm_]

  - all functions that implement Scheme procedures have a gdbscm_ prefix,
    this is for consistency and readability of Scheme exception text

  - static functions typically have a prefix
    - the prefix is typically <char><char>scm_ where the first two letters
      are unique to the file or class the function works with.
      E.g., the scm-arch.c prefix is arscm_.
      This follows something used in gdb/python in some places,
      we make it formal.

  - if the function is of a general nature, or no other prefix works,
    use gdbscm_

  Conversion functions:

  - the from/to in function names follows from libguile's existing style
  - conversions from/to Scheme objects are named:
      prefix_scm_from_foo: converts from foo to scm
      prefix_scm_to_foo: converts from scm to foo

  Exception handling:

  - functions that may throw a Scheme exception have an _unsafe suffix
    - This does not apply to functions that implement Scheme procedures.
    - This does not apply to functions whose explicit job is to throw
      an exception.  Adding _unsafe to gdbscm_throw is kinda superfluous. :-)
  - functions that can throw a GDB error aren't adorned with _unsafe

  - "_safe" in a function name means it will never throw an exception
    - Generally unnecessary, since the convention is to mark the ones that
      *can* throw an exception.  But sometimes it's useful to highlight the
      fact that the function is safe to call without worrying about exception
      handling.

  - except for functions that implement Scheme procedures, all functions
    that can throw exceptions (GDB or Scheme) say so in their function comment

  - functions that don't throw an exception, but still need to indicate to
    the caller that one happened (i.e., "safe" functions), either return
    a <gdb:exception> smob as a result or pass it back via a parameter.
    For this reason don't pass back <gdb:exception> smobs for any other
    reason.  There are functions that explicitly construct <gdb:exception>
    smobs.  They're obviously the, umm, exception.

  Internal functions:

  - internal Scheme functions begin with "%" and are intentionally undocumented
    in the manual

  Standard Guile/Scheme conventions:

  - predicates that return Scheme values have the suffix _p and have suffix "?"
    in the Scheme procedure's name
  - functions that implement Scheme procedures that modify state have the
    suffix _x and have suffix "!" in the Scheme procedure's name
  - object predicates that return a C truth value are named prefix_is_foo
  - functions that set something have "set" at the front (except for a prefix)
    write this: gdbscm_set_gsmob_aux_x implements (set-gsmob-aux! ...)
    not this: gdbscm_gsmob_set_aux_x implements (gsmob-set-aux! ...)

Doc strings:

  - there are lots of existing examples, they should be pretty consistent,
    use them as boilerplate/examples
  - begin with a one line summary (can be multiple lines if necessary)
  - if the arguments need description:
    - blank line
    - "  Arguments: arg1 arg2"
      "    arg1: blah ..."
      "    arg2: blah ..."
  - if the result requires more description:
    - blank line
    - "  Returns:"
      "    Blah ..."
  - if it's important to list exceptions that can be thrown:
    - blank line
    - "  Throws:"
      "    exception-name: blah ..."