binutils-gdb/gdb/guile
Pedro Alves 7022349d5c Stop assuming no-debug-info functions return int
The fact that GDB defaults to assuming that functions return int, when
it has no debug info for the function has been a recurring source of
user confusion.  Recently this came up on the errno pretty printer
discussions.  Shortly after, it came up again on IRC, with someone
wondering why does getenv() in GDB return a negative int:

  (gdb) p getenv("PATH")
  $1 = -6185

This question (with s/getenv/random-other-C-runtime-function) is a FAQ
on IRC.

The reason for the above is:

 (gdb) p getenv
 $2 = {<text variable, no debug info>} 0x7ffff7751d80 <getenv>
 (gdb) ptype getenv
 type = int ()

... which means that GDB truncated the 64-bit pointer that is actually
returned from getent to 32-bit, and then sign-extended it:

 (gdb) p /x -6185
 $6 = 0xffffe7d7

The workaround is to cast the function to the right type, like:

 (gdb) p ((char *(*) (const char *)) getenv) ("PATH")
 $3 = 0x7fffffffe7d7 "/usr/local/bin:/"...

IMO, we should do better than this.

I see the "assume-int" issue the same way I see printing bogus values
for optimized-out variables instead of "<optimized out>" -- I'd much
rather that the debugger tells me "I don't know" and tells me how to
fix it than showing me bogus misleading results, making me go around
tilting at windmills.

If GDB prints a signed integer when you're expecting a pointer or
aggregate, you at least have some sense that something is off, but
consider the case of the function actually returning a 64-bit integer.
For example, compile this without debug info:

 unsigned long long
 function ()
 {
   return 0x7fffffffffffffff;
 }

Currently, with pristine GDB, you get:

 (gdb) p function ()
 $1 = -1                      # incorrect
 (gdb) p /x function ()
 $2 = 0xffffffff              # incorrect

maybe after spending a few hours debugging you suspect something is
wrong with that -1, and do:

 (gdb) ptype function
 type = int ()

and maybe, just maybe, you realize that the function actually returns
unsigned long long.  And you try to fix it with:

(gdb) p /x (unsigned long long) function ()
 $3 = 0xffffffffffffffff      # incorrect

... which still produces the wrong result, because GDB simply applied
int to unsigned long long conversion.  Meaning, it sign-extended the
integer that it extracted from the return of the function, to 64-bits.

and then maybe, after asking around on IRC, you realize you have to
cast the function to a pointer of the right type, and call that.  It
won't be easy, but after a few missteps, you'll get to it:

.....  (gdb) p /x ((unsigned long long(*) ()) function) ()
 $666 = 0x7fffffffffffffff             # finally! :-)


So to improve on the user experience, this patch does the following
(interrelated) things:

 - makes no-debug-info functions no longer default to "int" as return
   type.  Instead, they're left with NULL/"<unknown return type>"
   return type.

    (gdb) ptype getenv
    type = <unknown return type> ()

 - makes calling a function with unknown return type an error.

    (gdb) p getenv ("PATH")
    'getenv' has unknown return type; cast the call to its declared return type

 - and then to make it easier to call the function, makes it possible
   to _only_ cast the return of the function to the right type,
   instead of having to cast the function to a function pointer:

    (gdb) p (char *) getenv ("PATH")                      # now Just Works
    $3 = 0x7fffffffe7d7 "/usr/local/bin:/"...

    (gdb) p ((char *(*) (const char *)) getenv) ("PATH")  # continues working
    $4 = 0x7fffffffe7d7 "/usr/local/bin:/"...

   I.e., it makes GDB default the function's return type to the type
   of the cast, and the function's parameters to the type of the
   arguments passed down.

After this patch, here's what you'll get for the "unsigned long long"
example above:

 (gdb) p function ()
 'function' has unknown return type; cast the call to its declared return type
 (gdb) p /x (unsigned long long) function ()
 $4 = 0x7fffffffffffffff     # correct!

Note that while with "print" GDB shows the name of the function that
has the problem:

  (gdb) p getenv ("PATH")
  'getenv' has unknown return type; cast the call to its declared return type

which can by handy in more complicated expressions, "ptype" does not:

  (gdb) ptype getenv ("PATH")
  function has unknown return type; cast the call to its declared return type

This will be fixed in the next patch.

gdb/ChangeLog:
2017-09-04  Pedro Alves  <palves@redhat.com>

	* ada-lang.c (ada_evaluate_subexp) <TYPE_CODE_FUNC>: Don't handle
	TYPE_GNU_IFUNC specially here.  Throw error if return type is
	unknown.
	* ada-typeprint.c (print_func_type): Handle functions with unknown
	return type.
	* c-typeprint.c (c_type_print_base): Handle functions and methods
	with unknown return type.
	* compile/compile-c-symbols.c (convert_symbol_bmsym)
	<mst_text_gnu_ifunc>: Use nodebug_text_gnu_ifunc_symbol.
	* compile/compile-c-types.c: Include "objfiles.h".
	(convert_func): For functions with unknown return type, warn and
	default to int.
	* compile/compile-object-run.c (compile_object_run): Adjust call
	to call_function_by_hand_dummy.
	* elfread.c (elf_gnu_ifunc_resolve_addr): Adjust call to
	call_function_by_hand.
	* eval.c (evaluate_subexp_standard): Adjust calls to
	call_function_by_hand.  Handle functions and methods with unknown
	return type.  Pass expect_type to call_function_by_hand.
	* f-typeprint.c (f_type_print_base): Handle functions with unknown
	return type.
	* gcore.c (call_target_sbrk): Adjust call to
	call_function_by_hand.
	* gdbtypes.c (objfile_type): Leave nodebug text symbol with NULL
	return type instead of int.  Make nodebug_text_gnu_ifunc_symbol be
	an integer address type instead of nodebug.
	* guile/scm-value.c (gdbscm_value_call): Adjust call to
	call_function_by_hand.
	* infcall.c (error_call_unknown_return_type): New function.
	(call_function_by_hand): New "default_return_type" parameter.
	Pass it down.
	(call_function_by_hand_dummy): New "default_return_type"
	parameter.  Use it instead of defaulting to int.  If there's no
	default and the return type is unknown, throw an error.  If
	there's a default return type, and the called function has no
	debug info, then assume the function is prototyped.
	* infcall.h (call_function_by_hand, call_function_by_hand_dummy):
	New "default_return_type" parameter.
	(error_call_unknown_return_type): New declaration.
	* linux-fork.c (call_lseek): Cast return type of lseek.
	(inferior_call_waitpid, checkpoint_command): Adjust calls to
	call_function_by_hand.
	* linux-tdep.c (linux_infcall_mmap, linux_infcall_munmap): Adjust
	calls to call_function_by_hand.
	* m2-typeprint.c (m2_procedure): Handle functions with unknown
	return type.
	* objc-lang.c (lookup_objc_class, lookup_child_selector)
	(value_nsstring, print_object_command): Adjust calls to
	call_function_by_hand.
	* p-typeprint.c (pascal_type_print_varspec_prefix): Handle
	functions with unknown return type.
	(pascal_type_print_func_varspec_suffix): New function.
	(pascal_type_print_varspec_suffix) <TYPE_CODE_FUNC,
	TYPE_CODE_METHOD>: Use it.
	* python/py-value.c (valpy_call): Adjust call to
	call_function_by_hand.
	* rust-lang.c (rust_evaluate_funcall): Adjust call to
	call_function_by_hand.
	* valarith.c (value_x_binop, value_x_unop): Adjust calls to
	call_function_by_hand.
	* valops.c (value_allocate_space_in_inferior): Adjust call to
	call_function_by_hand.
	* typeprint.c (type_print_unknown_return_type): New function.
	* typeprint.h (type_print_unknown_return_type): New declaration.

gdb/testsuite/ChangeLog:
2017-09-04  Pedro Alves  <palves@redhat.com>

	* gdb.base/break-main-file-remove-fail.exp (test_remove_bp): Cast
	return type of munmap in infcall.
	* gdb.base/break-probes.exp: Cast return type of foo in infcall.
	* gdb.base/checkpoint.exp: Simplify using for loop.  Cast return
	type of ftell in infcall.
	* gdb.base/dprintf-detach.exp (dprintf_detach_test): Cast return
	type of getpid in infcall.
	* gdb.base/infcall-exec.exp: Cast return type of execlp in
	infcall.
	* gdb.base/info-os.exp: Cast return type of getpid in infcall.
	Bail on failure to extract the pid.
	* gdb.base/nodebug.c: #include <stdint.h>.
	(multf, multf_noproto, mult, mult_noproto, add8, add8_noproto):
	New functions.
	* gdb.base/nodebug.exp (test_call_promotion): New procedure.
	Change expected output of print/whatis/ptype with functions with
	no debug info.  Test all supported languages.  Call
	test_call_promotion.
	* gdb.compile/compile.exp: Adjust expected output to expect
	warning.
	* gdb.threads/siginfo-threads.exp: Likewise.
2017-09-04 20:21:13 +01:00
..
lib
README
guile-internal.h
guile.c Use scoped_restore in more places 2017-04-12 11:16:19 -06:00
guile.h
scm-arch.c
scm-auto-load.c
scm-block.c
scm-breakpoint.c Change breakpoint event locations to event_location_up 2017-04-12 11:16:19 -06:00
scm-cmd.c Introduce class completion_tracker & rewrite completion<->readline interaction 2017-07-17 14:45:59 +01:00
scm-disasm.c
scm-exception.c
scm-frame.c Kill init_sal 2017-09-04 17:11:45 +01:00
scm-gsmob.c
scm-iterator.c
scm-lazy-string.c
scm-math.c
scm-objfile.c
scm-param.c
scm-ports.c Use scoped_restore in more places 2017-04-12 11:16:19 -06:00
scm-pretty-print.c
scm-progspace.c
scm-safe-call.c Change gdb_realpath to return a unique_xmalloc_ptr 2017-08-22 09:30:12 -06:00
scm-string.c Introduce gdb_argv, a class wrapper for buildargv 2017-08-03 07:59:08 -06:00
scm-symbol.c
scm-symtab.c Kill init_sal 2017-09-04 17:11:45 +01:00
scm-type.c
scm-utils.c
scm-value.c Stop assuming no-debug-info functions return int 2017-09-04 20:21:13 +01:00

README

This file contains invisible Unicode characters

This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

README for gdb/guile
====================

This file contains important notes for gdb/guile developers.
["gdb/guile" refers to the directory you found this file in]

Nomenclature:

  In the implementation we use "Scheme" or "Guile" depending on context.
  And sometimes it doesn't matter.
  Guile is Scheme, and for the most part this is what we present to the user
  as well.  However, to highlight the fact that it is Guile, the GDB commands
  that invoke Scheme functions are named "guile" and "guile-repl",
  abbreviated "gu" and "gr" respectively.

Co-existence with Python:

  Keep the user interfaces reasonably consistent, but don't shy away from
  providing a clearer (or more Scheme-friendly/consistent) user interface
  where appropriate.

  Additions to Python support or Scheme support don't require corresponding
  changes in the other scripting language.

  Scheme-wrapped breakpoints are created lazily so that if the user
  doesn't use Scheme s/he doesn't pay any cost.

Importing the gdb module into Scheme:

  To import the gdb module:
  (gdb) guile (use-modules (gdb))

  If you want to add a prefix to gdb module symbols:
  (gdb) guile (use-modules ((gdb) #:renamer (symbol-prefix-proc 'gdb:)))
  This gives every symbol a "gdb:" prefix which is a common convention.
  OTOH it's more to type.

Implementation/Hacking notes:

  Don't use scm_is_false.
  For this C function, () == #f (a la Lisp) and it's not clear how treating
  them as equivalent for truth values will affect the GDB interface.
  Until the effect is clear avoid them.
  Instead use gdbscm_is_false, gdbscm_is_true, gdbscm_is_bool.
  There are macros in guile-internal.h to enforce this.

  Use gdbscm_foo as the name of functions that implement Scheme procedures
  to provide consistent naming in error messages.  The user can see "gdbscm"
  in the name and immediately know where the function came from.

  All smobs contain gdb_smob or chained_gdb_smob as the first member.
  This provides a mechanism for extending them in the Scheme side without
  tying GDB to the details.

  The lifetime of a smob, AIUI, is decided by the containing SCM.
  When there is no longer a reference to the containing SCM then the
  smob can be GC'd.  Objects that have references from outside of Scheme,
  e.g., breakpoints, need to be protected from GC.

  Don't do something that can cause a Scheme exception inside a TRY_CATCH,
  and, in code that can be called from Scheme, don't do something that can
  cause a GDB exception outside a TRY_CATCH.
  This makes the code a little tricky to write sometimes, but it is a
  rule imposed by the programming environment.  Bugs often happen because
  this rule is broken.  Learn it, follow it.

Coding style notes:

  - If you find violations to these rules, let's fix the code.
    Some attempt has been made to be consistent, but it's early.
    Over time we want things to be more consistent, not less.

  - None of this really needs to be read.  Instead, do not be creative:
    Monkey-See-Monkey-Do hacking should generally Just Work.

  - Absence of the word "typically" means the rule is reasonably strict.

  - The gdbscm_initialize_foo function (e.g., gdbscm_initialize_values)
    is the last thing to appear in the file, immediately preceded by any
    tables of exported variables and functions.

  - In addition to these of course, follow GDB coding conventions.

General naming rules:

  - The word "object" absent any modifier (like "GOOPS object") means a
    Scheme object (of any type), and is never used otherwise.
    If you want to refer to, e.g., a GOOPS object, say "GOOPS object".

  - Do not begin any function, global variable, etc. name with scm_.
    That's what the Guile implementation uses.
    (kinda obvious, just being complete).

  - The word "invalid" carries a specific connotation.  Try not to use it
    in a different way.  It means the underlying GDB object has disappeared.
    For example, a <gdb:objfile> smob becomes "invalid" when the underlying
    objfile is removed from GDB.

  - We typically use the word "exception" to mean Scheme exceptions,
    and we typically use the word "error" to mean GDB errors.

Comments:

  - function comments for functions implementing Scheme procedures begin with
    a description of the Scheme usage.  Example:
    /* (gsmob-aux gsmob) -> object */

  - the following comment appears after the copyright header:
    /* See README file in this directory for implementation notes, coding
       conventions, et.al.  */

Smob naming:

  - gdb smobs are named, internally, "gdb:foo"
  - in Guile they become <gdb:foo>, that is the convention for naming classes
    and smobs have rudimentary GOOPS support (they can't be inherited from,
    but generics can work with them)
  - in comments use the Guile naming for smobs,
    i.e., <gdb:foo> instead of gdb:foo.
    Note: This only applies to smobs.  Exceptions are also named gdb:foo,
    but since they are not "classes" they are not wrapped in <>.
  - smob names are stored in a global, and for simplicity we pass this
    global as the "expected type" parameter to SCM_ASSERT_TYPE, thus in
    this instance smob types are printed without the <>.
    [Hmmm, this rule seems dated now.  Plus I18N rules in GDB are not always
    clear, sometimes we pass the smob name through _(), however it's not
    clear that's actually a good idea.]

Type naming:

  - smob structs are typedefs named foo_smob

Variable naming:

  - "scm" by itself is reserved for arbitrary Scheme objects

  - variables that are pointers to smob structs are named <char>_smob or
    <char><char>_smob, e.g., f_smob for a pointer to a frame smob

  - variables that are gdb smob objects are typically named <char>_scm or
    <char><char>_scm, e.g., f_scm for a <gdb:frame> object

  - the name of the first argument for method-like functions is "self"

Function naming:

  General:

  - all non-static functions have a prefix,
    either gdbscm_ or <char><char>scm_ [or <char><char><char>scm_]

  - all functions that implement Scheme procedures have a gdbscm_ prefix,
    this is for consistency and readability of Scheme exception text

  - static functions typically have a prefix
    - the prefix is typically <char><char>scm_ where the first two letters
      are unique to the file or class the function works with.
      E.g., the scm-arch.c prefix is arscm_.
      This follows something used in gdb/python in some places,
      we make it formal.

  - if the function is of a general nature, or no other prefix works,
    use gdbscm_

  Conversion functions:

  - the from/to in function names follows from libguile's existing style
  - conversions from/to Scheme objects are named:
      prefix_scm_from_foo: converts from foo to scm
      prefix_scm_to_foo: converts from scm to foo

  Exception handling:

  - functions that may throw a Scheme exception have an _unsafe suffix
    - This does not apply to functions that implement Scheme procedures.
    - This does not apply to functions whose explicit job is to throw
      an exception.  Adding _unsafe to gdbscm_throw is kinda superfluous. :-)
  - functions that can throw a GDB error aren't adorned with _unsafe

  - "_safe" in a function name means it will never throw an exception
    - Generally unnecessary, since the convention is to mark the ones that
      *can* throw an exception.  But sometimes it's useful to highlight the
      fact that the function is safe to call without worrying about exception
      handling.

  - except for functions that implement Scheme procedures, all functions
    that can throw exceptions (GDB or Scheme) say so in their function comment

  - functions that don't throw an exception, but still need to indicate to
    the caller that one happened (i.e., "safe" functions), either return
    a <gdb:exception> smob as a result or pass it back via a parameter.
    For this reason don't pass back <gdb:exception> smobs for any other
    reason.  There are functions that explicitly construct <gdb:exception>
    smobs.  They're obviously the, umm, exception.

  Internal functions:

  - internal Scheme functions begin with "%" and are intentionally undocumented
    in the manual

  Standard Guile/Scheme conventions:

  - predicates that return Scheme values have the suffix _p and have suffix "?"
    in the Scheme procedure's name
  - functions that implement Scheme procedures that modify state have the
    suffix _x and have suffix "!" in the Scheme procedure's name
  - object predicates that return a C truth value are named prefix_is_foo
  - functions that set something have "set" at the front (except for a prefix)
    write this: gdbscm_set_gsmob_aux_x implements (set-gsmob-aux! ...)
    not this: gdbscm_gsmob_set_aux_x implements (gsmob-set-aux! ...)

Doc strings:

  - there are lots of existing examples, they should be pretty consistent,
    use them as boilerplate/examples
  - begin with a one line summary (can be multiple lines if necessary)
  - if the arguments need description:
    - blank line
    - "  Arguments: arg1 arg2"
      "    arg1: blah ..."
      "    arg2: blah ..."
  - if the result requires more description:
    - blank line
    - "  Returns:"
      "    Blah ..."
  - if it's important to list exceptions that can be thrown:
    - blank line
    - "  Throws:"
      "    exception-name: blah ..."