Commit Graph

26 Commits

Author SHA1 Message Date
Alan Modra b3adc24a07 Update year range in copyright notice of binutils files 2020-01-01 18:42:54 +10:30
Nick Alcock 9323dd869d libctf: make ctf_dump not crash on OOM
ctf_dump calls ctf_str_append extensively but never checks to see if it
returns NULL (on OOM).  If it ever does, we truncate the string we are
appending to and leak it!

Instead, create a variant of ctf_str_append that returns the *original
string* on OOM, and use it in ctf-dump.  It is far better to omit a tiny
piece of a dump on OOM than to omit a bigger piece, and it is also
better to do this in what is after all purely debugging code than it is
to uglify ctf-dump.c with huge numbers of checks for the out-of-memory
case.  Slightly truncated debugging output is better than no debugging
output at all and an out-of-memory message.
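
The no-error variant can be sketched roughly as below.  This is a minimal
illustration, not the libctf code: str_append and str_append_noerr are
hypothetical stand-ins for ctf_str_append and ctf_str_append_noerr, and the
sketch assumes the failing variant leaves its input string allocated and
unchanged on OOM.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for ctf_str_append: append APPEND to the malloced
   string S (which may be NULL), returning the new string, or NULL on OOM.
   On failure S is left allocated and unchanged.  */
static char *
str_append (char *s, const char *append)
{
  size_t s_len = s ? strlen (s) : 0;
  size_t a_len;
  char *new_s;

  assert (append != NULL);
  a_len = strlen (append);

  new_s = realloc (s, s_len + a_len + 1);
  if (new_s == NULL)
    return NULL;

  /* Copy the terminating NUL along with the appended text.  */
  memcpy (new_s + s_len, append, a_len + 1);
  return new_s;
}

/* Hypothetical stand-in for ctf_str_append_noerr: as above, but on OOM
   hand back the original string, so callers can assign the result
   unconditionally without truncating or leaking what they had.  */
static char *
str_append_noerr (char *s, const char *append)
{
  char *new_s = str_append (s, append);
  return new_s ? new_s : s;
}
```

With this shape a caller can write s = str_append_noerr (s, "...") at every
step: the worst case on OOM is that one fragment is missing from the dump.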

New in v4.

libctf/
	* ctf-impl.h (ctf_str_append_noerr): Declare.
	* ctf-util.c (ctf_str_append_noerr): Define in terms of
	ctf_str_append.
	* ctf-dump.c (str_append): New, call it.
	(ctf_dump_format_type): Use str_append, not ctf_str_append.
	(ctf_dump_label): Likewise.
	(ctf_dump_objts): Likewise.
	(ctf_dump_funcs): Likewise.
	(ctf_dump_var): Likewise.
	(ctf_dump_member): Likewise.
	(ctf_dump_type): Likewise.
	(ctf_dump): Likewise.
2019-10-03 17:04:56 +01:00
Nick Alcock de07e349be libctf: remove ctf_malloc, ctf_free and ctf_strdup
These just get in the way of auditing for erroneous usage of strdup and
add a huge irregular surface of "ctf_malloc or malloc? ctf_free or free?
ctf_strdup or strdup?"

ctf_malloc and ctf_free usage has not reliably matched up for many
years, if ever, making the whole game pointless.

Go back to malloc, free, and strdup like everyone else: while we're at
it, fix a bunch of places where we weren't properly checking for OOM.
This changes the interface of ctf_cuname_set and ctf_parent_name_set,
which could strdup but could not return errors (like ENOMEM).
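
A setter that can report OOM might look like the sketch below.  It is an
assumption-laden illustration only: struct dict and dict_cuname_set are
hypothetical, and returning ENOMEM directly is a choice made for the sketch.
How the real ctf_cuname_set and ctf_parent_name_set report the error is not
shown in this log beyond "returning int".

```c
#define _POSIX_C_SOURCE 200809L	/* For strdup.  */
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cut-down dictionary: only the field this sketch needs.  */
struct dict
{
  char *cuname;
};

/* Replace the CU name with a copy of NAME.  Returns 0 on success, or
   ENOMEM if the copy could not be allocated, in which case the old name
   is kept.  (The return convention is an assumption of this sketch.)  */
static int
dict_cuname_set (struct dict *d, const char *name)
{
  char *copy;

  assert (d != NULL && name != NULL);

  if ((copy = strdup (name)) == NULL)
    return ENOMEM;

  free (d->cuname);
  d->cuname = copy;
  return 0;
}
```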

New in v4.

include/
	* ctf-api.h (ctf_cuname_set): Can now fail, returning int.
	(ctf_parent_name_set): Likewise.
libctf/
	* ctf-impl.h (ctf_alloc): Remove.
	(ctf_free): Likewise.
	(ctf_strdup): Likewise.
	* ctf-subr.c (ctf_alloc): Remove.
	(ctf_free): Likewise.
	* ctf-util.c (ctf_strdup): Remove.

	* ctf-create.c (ctf_serialize): Use malloc, not ctf_alloc; free, not
	ctf_free; strdup, not ctf_strdup.
	(ctf_dtd_delete): Likewise.
	(ctf_dvd_delete): Likewise.
	(ctf_add_generic): Likewise.
	(ctf_add_function): Likewise.
	(ctf_add_enumerator): Likewise.
	(ctf_add_member_offset): Likewise.
	(ctf_add_variable): Likewise.
	(membadd): Likewise.
	(ctf_compress_write): Likewise.
	(ctf_write_mem): Likewise.
	* ctf-decl.c (ctf_decl_push): Likewise.
	(ctf_decl_fini): Likewise.
	(ctf_decl_sprintf): Likewise.  Check for OOM.
	* ctf-dump.c (ctf_dump_append): Use malloc, not ctf_alloc; free, not
	ctf_free; strdup, not ctf_strdup.
	(ctf_dump_free): Likewise.
	(ctf_dump): Likewise.
	* ctf-open.c (upgrade_types_v1): Likewise.
	(init_types): Likewise.
	(ctf_file_close): Likewise.
	(ctf_bufopen_internal): Likewise.  Check for OOM.
	(ctf_parent_name_set): Likewise: report the OOM to the caller.
	(ctf_cuname_set): Likewise.
	(ctf_import): Likewise.
	* ctf-string.c (ctf_str_purge_atom_refs): Use malloc, not ctf_alloc;
	free, not ctf_free; strdup, not ctf_strdup.
	(ctf_str_free_atom): Likewise.
	(ctf_str_create_atoms): Likewise.
	(ctf_str_add_ref_internal): Likewise.
	(ctf_str_remove_ref): Likewise.
	(ctf_str_write_strtab): Likewise.
2019-10-03 17:04:56 +01:00
Nick Alcock 99dc3ebdff libctf: properly handle ctf_add_type of forwards and self-reffing structs
The code to handle structures (and unions) that refer to themselves in
ctf_add_type is extremely dodgy.  It works by looking through the list
of not-yet-committed types for a structure with the same name as the
structure in question and assuming, if it finds it, that this must be a
reference to the same type.  This is a linear search that gets ever
slower as the dictionary grows, requiring you to call ctf_update at
intervals to keep performance tolerable: but if you do that, you run
into the problem that if a forward declared before the ctf_update is
changed to a structure afterwards, ctf_update explodes.

The last commit fixed most of this: this commit can use it, adding a new
ctf_add_processing hash that tracks source type IDs that are currently
being processed and uses it to avoid infinite recursion rather than the
dynamic type list: we split ctf_add_type into a ctf_add_type_internal,
so that ctf_add_type itself can become a wrapper that empties out this
being-processed hash once the entire recursive type addition is over.
Structure additions themselves avoid adding their dependent types
quite so much by checking the type mapping and avoiding re-adding types
we already know we have added.
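
The wrapper-plus-being-processed-set arrangement can be sketched as below.
Everything here is hypothetical and much simpler than libctf: types are
array indexes, each type refers to at most one other type, and plain arrays
stand in for the ctf_add_processing hash and the type mapping.

```c
#include <assert.h>
#include <string.h>

#define MAX_TYPES 16

/* Hypothetical source "dictionary": each type refers to at most one other
   type by index (-1 for none).  A self-referential struct has
   ref[i] == i.  */
struct src
{
  int ntypes;
  int ref[MAX_TYPES];
};

/* Hypothetical destination state.  */
struct dst
{
  int next_id;			/* Next ID to hand out; IDs start at 1.  */
  int mapping[MAX_TYPES];	/* Source index -> ID of a finished add.  */
  int processing[MAX_TYPES];	/* Source index -> ID of an add in flight.  */
};

static int
add_type_internal (struct dst *d, const struct src *s, int idx)
{
  int id;

  assert (idx >= 0 && idx < s->ntypes);

  /* Already added by an earlier call: do not re-add.  */
  if (d->mapping[idx] != 0)
    return d->mapping[idx];

  /* In the middle of being added: we have recursed back to ourselves,
     so hand back the ID immediately instead of looping forever.  */
  if (d->processing[idx] != 0)
    return d->processing[idx];

  id = d->next_id++;
  d->processing[idx] = id;

  if (s->ref[idx] >= 0)
    add_type_internal (d, s, s->ref[idx]);

  d->mapping[idx] = id;
  return id;
}

/* Public wrapper: run the recursive add, then empty the being-processed
   set once the entire recursive addition is over.  */
static int
add_type (struct dst *d, const struct src *s, int idx)
{
  int id = add_type_internal (d, s, idx);

  memset (d->processing, 0, sizeof (d->processing));
  return id;
}
```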

We also add support for adding forwards to dictionaries that already
contain the thing they are a forward to: we just silently return the
original type.

v4: return existing struct/union/enum types properly, rather than using
    an uninitialized variable: shrinks sizes of CTF sections back down
    to roughly where they were in v1/v2 of this patch series.
v5: fix tabdamage.

libctf/
	* ctf-impl.h (ctf_file_t) <ctf_add_processing>: New.
	* ctf-open.c (ctf_file_close): Free it.
	* ctf-create.c (ctf_serialize): Adjust.
	(membcmp): When reporting a conflict due to an error, report the
	error.
	(ctf_add_type): Turn into a ctf_add_processing wrapper.  Rename to...
	(ctf_add_type_internal): ... this.  Hand back types we are already
	in the middle of adding immediately.  Hand back structs/unions with
	the same number of members immediately.  Do not walk the dynamic
	list.  Call ctf_add_type_internal, not ctf_add_type.  Handle
	forwards promoted to other types and the inverse case identically.
	Add structs to the mapping as soon as we intern them, before they
	gain any members.
2019-10-03 17:04:56 +01:00
Nick Alcock 676c3ecbad libctf: avoid the need to ever use ctf_update
The method of operation of libctf when the dictionary is writable has
before now been that types that are added land in the dynamic type
section, which is a linked list and hash of IDs -> dynamic type
definitions (and, recently, a hash of names): the DTDs are a bit of CTF
representing the ctf_type_t and ad hoc C structures representing the
vlen.  Historically, libctf was unable to do anything with these types,
not even look them up by ID, let alone by name: if you wanted to do that
(say, if you were adding a type that depended on one you just added) you
called ctf_update, which serializes all the DTDs into a CTF file and
reopens it, copying its guts over the fp it's called with.  The
ctf_updated types are then frozen in amber and unchangeable: all lookups
will return the types in the static portion in preference to the dynamic
portion, and we will refuse to re-add things that already exist in the
static portion (and, of late, in the dynamic portion too).  The libctf
machinery remembers the boundary between static and dynamic types and
looks in the right portion for each type.  Lots of things still don't
quite work with dynamic types (e.g. getting their size), but enough
works to do a bunch of additions and then a ctf_update, most of the
time.

Except it doesn't, because ctf_add_type finds it necessary to walk the
full dynamic type definition list looking for types with matching names,
so it gets slower and slower with every type you add: fixing this
requires calling ctf_update periodically for no other reason than to
avoid massively slowing things down.

This is all clunky and very slow but kind of works, until you consider
that it is in fact possible and indeed necessary to modify one sort of
type after it has been added: forwards.  These are necessarily promoted
to structs, unions or enums, and when they do so *their type ID does not
change*.  So all of a sudden we are changing types that already exist in
the static portion.  ctf_update gets massively confused by this and
allocates space enough for the forward (with no members), but then emits
the new dynamic type (with all the members) into it.  You get an
assertion failure after that, if you're lucky, or a coredump.

So this commit rejigs things a bit and arranges to exclusively use the
dynamic type definitions in writable dictionaries, and the static type
definitions in readable dictionaries: we don't at any time have a mixture
of static and dynamic types, and you don't need to call ctf_update to
make things "appear".  The ctf_dtbyname hash I introduced a few months
ago, which maps things like "struct foo" to DTDs, is removed, replaced
instead by a change of type of the four dictionaries which track names.
Rather than just being (unresizable) ctf_hash_t's populated only at
ctf_bufopen time, they are now a ctf_names_t structure, which is a pair
of ctf_hash_t and ctf_dynhash_t, with the ctf_hash_t portion being used
in readonly dictionaries, and the ctf_dynhash_t being used in writable
ones.  The decision as to which to use is centralized in the new
functions ctf_lookup_by_rawname (which takes a type kind) and
ctf_lookup_by_rawhash, which it calls (which takes a ctf_names_t *.)
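
The centralized choice between the two halves of the name table can be
sketched as follows.  This is only an illustration under stated
assumptions: struct names, struct table and lookup_by_rawhash are
hypothetical, and linear arrays stand in for ctf_hash_t and ctf_dynhash_t.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One name -> type ID pair.  */
struct entry
{
  const char *name;
  unsigned long type;
};

/* Hypothetical stand-in for either hash flavour.  */
struct table
{
  const struct entry *entries;
  size_t n;
};

/* Hypothetical stand-in for ctf_names_t: a fixed table used by read-only
   dictionaries, paired with a dynamic one used by writable ones.  */
struct names
{
  struct table fixed;
  struct table dyn;
};

struct dict
{
  int writable;
};

static unsigned long
scan (const struct table *t, const char *name)
{
  size_t i;

  for (i = 0; i < t->n; i++)
    if (strcmp (t->entries[i].name, name) == 0)
      return t->entries[i].type;
  return 0;			/* Not found.  */
}

/* The only place that decides which half of the name table to consult.  */
static unsigned long
lookup_by_rawhash (const struct dict *d, const struct names *np,
		   const char *name)
{
  assert (d != NULL && np != NULL && name != NULL);

  return scan (d->writable ? &np->dyn : &np->fixed, name);
}
```

Callers never look at the writable flag themselves, which is what lets the
rest of the library stay the same for both kinds of dictionary.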

This change lets us switch from using static to dynamic name hashes on
the fly across the entirety of libctf without complexifying anything: in
fact, because we now centralize the knowledge about how to map from type
kind to name hash, it actually simplifies things and lets us throw out
quite a lot of now-unnecessary complexity, from ctf_dtbyname (replaced
by the dynamic half of the name tables), through to ctf_dtnextid (now
that a dictionary's static portion is never referenced if the dictionary
is writable, we can just use ctf_typemax to indicate the maximum type:
dynamic or non-dynamic does not matter, and we no longer need to track
the boundary between the types).  You can now ctf_rollback() as far as
you like, even past a ctf_update or for that matter a full writeout; all
the iteration functions work just as well on writable as on read-only
dictionaries; ctf_add_type no longer needs expensive duplicated code to
run over the dynamic types hunting for ones it might be interested in;
and the linker no longer needs a hack to call ctf_update so that calling
ctf_add_type is not impossibly expensive.

There is still a bit more complexity: some new code paths in ctf-types.c
need to know how to extract information from dynamic types.  This
complexity will go away again in a few months when libctf acquires a
proper intermediate representation.

You can still call ctf_update if you like (it's public API, after all),
but its only effect now is to set the point to which ctf_discard rolls
back.

Obviously *something* still needs to serialize the CTF file before
writeout, and this job is done by ctf_serialize, which does everything
ctf_update used to except set the counter used by ctf_discard.  It is
automatically called by the various functions that do CTF writeout:
nobody else ever needs to call it.

With this in place, forwards that are promoted to non-forwards no longer
crash the link, even if it happens tens of thousands of types later.

v5: fix tabdamage.

libctf/
	* ctf-impl.h (ctf_names_t): New.
	(ctf_lookup_t) <ctf_hash>: Now a ctf_names_t, not a ctf_hash_t.
	(ctf_file_t) <ctf_structs>: Likewise.
	<ctf_unions>: Likewise.
	<ctf_enums>: Likewise.
	<ctf_names>: Likewise.
	<ctf_lookups>: Improve comment.
	<ctf_ptrtab_len>: New.
	<ctf_prov_strtab>: New.
	<ctf_str_prov_offset>: New.
	<ctf_dtbyname>: Remove, redundant to the names hashes.
	<ctf_dtnextid>: Remove, redundant to ctf_typemax.
	(ctf_dtdef_t) <dtd_name>: Remove.
	<dtd_data>: Note that the ctt_name is now populated.
	(ctf_str_atom_t) <csa_offset>: This is now the strtab
	offset for internal strings too.
	<csa_external_offset>: New, the external strtab offset.
	(CTF_INDEX_TO_TYPEPTR): Handle the LCTF_RDWR case.
	(ctf_name_table): New declaration.
	(ctf_lookup_by_rawname): Likewise.
	(ctf_lookup_by_rawhash): Likewise.
	(ctf_set_ctl_hashes): Likewise.
	(ctf_serialize): Likewise.
	(ctf_dtd_insert): Adjust.
	(ctf_simple_open_internal): Likewise.
	(ctf_bufopen_internal): Likewise.
	(ctf_list_empty_p): Likewise.
	(ctf_str_remove_ref): Likewise.
	(ctf_str_add): Returns uint32_t now.
	(ctf_str_add_ref): Likewise.
	(ctf_str_add_external): Now returns a boolean (int).
	* ctf-string.c (ctf_strraw_explicit): Check the ctf_prov_strtab
	for strings in the appropriate range.
	(ctf_str_create_atoms): Create the ctf_prov_strtab.  Detect OOM
	when adding the null string to the new strtab.
	(ctf_str_free_atoms): Destroy the ctf_prov_strtab.
	(ctf_str_add_ref_internal): Add make_provisional argument.  If
	make_provisional, populate the offset and fill in the
	ctf_prov_strtab accordingly.
	(ctf_str_add): Return the offset, not the string.
	(ctf_str_add_ref): Likewise.
	(ctf_str_add_external): Return a success integer.
	(ctf_str_remove_ref): New, remove a single ref.
	(ctf_str_count_strtab): Do not count the initial null string's
	length or the existence or length of any unreferenced internal
	atoms.
	(ctf_str_populate_sorttab): Skip atoms with no refs.
	(ctf_str_write_strtab): Populate the nullstr earlier.  Add one
	to the cts_len for the null string, since it is no longer done
	in ctf_str_count_strtab.  Adjust for csa_external_offset rename.
	Populate the csa_offset for both internal and external cases.
	Flush the ctf_prov_strtab afterwards, and reset the
	ctf_str_prov_offset.
	* ctf-create.c (ctf_grow_ptrtab): New.
	(ctf_create): Call it.  Initialize new fields rather than old
	ones.  Tell ctf_bufopen_internal that this is a writable dictionary.
	Set the ctl hashes and data model.
	(ctf_update): Rename to...
	(ctf_serialize): ... this.  Leave a compatibility function behind.
	Tell ctf_simple_open_internal that this is a writable dictionary.
	Pass the new fields along from the old dictionary.  Drop
	ctf_dtnextid and ctf_dtbyname.  Use ctf_strraw, not dtd_name.
	Do not zero out the DTD's ctt_name.
	(ctf_prefixed_name): Rename to...
	(ctf_name_table): ... this.  No longer return a prefixed name: return
	the applicable name table instead.
	(ctf_dtd_insert): Use it, and use the right name table.  Pass in the
	kind we're adding.  Migrate away from dtd_name.
	(ctf_dtd_delete): Adjust similarly.  Remove the ref to the
	deleted ctt_name.
	(ctf_dtd_lookup_type_by_name): Remove.
	(ctf_dynamic_type): Always return NULL on read-only dictionaries.
	No longer check ctf_dtnextid: check ctf_typemax instead.
	(ctf_snapshot): No longer use ctf_dtnextid: use ctf_typemax instead.
	(ctf_rollback): Likewise.  No longer fail with ECTF_OVERROLLBACK. Use
	ctf_name_table and the right name table, and migrate away from
	dtd_name as in ctf_dtd_delete.
	(ctf_add_generic): Pass in the kind explicitly and pass it to
	ctf_dtd_insert. Use ctf_typemax, not ctf_dtnextid.  Migrate away
	from dtd_name to using ctf_str_add_ref to populate the ctt_name.
	Grow the ptrtab if needed.
	(ctf_add_encoded): Pass in the kind.
	(ctf_add_slice): Likewise.
	(ctf_add_array): Likewise.
	(ctf_add_function): Likewise.
	(ctf_add_typedef): Likewise.
	(ctf_add_reftype): Likewise. Initialize the ctf_ptrtab, checking
	ctt_name rather than dtd_name.
	(ctf_add_struct_sized): Pass in the kind.  Use
	ctf_lookup_by_rawname, not ctf_hash_lookup_type /
	ctf_dtd_lookup_type_by_name.
	(ctf_add_union_sized): Likewise.
	(ctf_add_enum): Likewise.
	(ctf_add_enum_encoded): Likewise.
	(ctf_add_forward): Likewise.
	(ctf_add_type): Likewise.
	(ctf_compress_write): Call ctf_serialize: adjust for ctf_size not
	being initialized until after the call.
	(ctf_write_mem): Likewise.
	(ctf_write): Likewise.
	* ctf-archive.c (arc_write_one_ctf): Likewise.
	* ctf-lookup.c (ctf_lookup_by_name): Use ctf_lookup_by_rawhash, not
	ctf_hash_lookup_type.
	(ctf_lookup_by_id): No longer check the readonly types if the
	dictionary is writable.
	* ctf-open.c (init_types): Assert that this dictionary is not
	writable.  Adjust to use the new name hashes, ctf_name_table,
	and ctf_ptrtab_len.  GNU style fix for the final ptrtab scan.
	(ctf_bufopen_internal): New 'writable' parameter.  Flip on LCTF_RDWR
	if set.  Drop out early when dictionary is writable.  Split the
	ctf_lookups initialization into...
	(ctf_set_ctl_hashes): ... this new function.
	(ctf_simple_open_internal): Adjust.  New 'writable' parameter.
	(ctf_simple_open): Adjust accordingly.
	(ctf_bufopen): Likewise.
	(ctf_file_close): Destroy the appropriate name hashes.  No longer
	destroy ctf_dtbyname, which is gone.
	(ctf_getdatasect): Remove spurious "extern".
	* ctf-types.c (ctf_lookup_by_rawname): New, look up types in the
	specified name table, given a kind.
	(ctf_lookup_by_rawhash): Likewise, given a ctf_names_t *.
	(ctf_member_iter): Add support for iterating over the
	dynamic type list.
	(ctf_enum_iter): Likewise.
	(ctf_variable_iter): Likewise.
	(ctf_type_rvisit): Likewise.
	(ctf_member_info): Add support for types in the dynamic type list.
	(ctf_enum_name): Likewise.
	(ctf_enum_value): Likewise.
	(ctf_func_type_info): Likewise.
	(ctf_func_type_args): Likewise.
	* ctf-link.c (ctf_accumulate_archive_names): No longer call
	ctf_update.
	(ctf_link_write): Likewise.
	(ctf_link_intern_extern_string): Adjust for new
	ctf_str_add_external return value.
	(ctf_link_add_strtab): Likewise.
	* ctf-util.c (ctf_list_empty_p): New.
2019-10-03 17:04:56 +01:00
Nick Alcock 7e97445a5a libctf: get rid of a disruptive public include of <sys/param.h>
This hoary old header defines things like MAX that users of libctf might
perfectly reasonably define themselves.

The CTF headers do not need it: move it into libctf/ctf-impl.h instead.

include/
	* ctf-api.h (includes): No longer include <sys/param.h>.
libctf/
	* ctf-impl.h (includes): Include <sys/param.h> here.
2019-10-03 17:04:55 +01:00
Nick Alcock 49ea9b450b libctf: add CU-mapping machinery
Once the deduplicator is capable of actually detecting conflicting types
with the same name (i.e., not yet) we will place such conflicting types,
and types that depend on them, into CTF dictionaries that are the child
of the main dictionary we usually emit: currently, this will lead to the
.ctf section becoming a CTF archive rather than a single dictionary,
with the default-named archive member (_CTF_SECTION, or NULL) being the
main shared dictionary with most of the types in it.

By default, the sections are named after the compilation unit they come
from (complete path and all), with the cuname field in the CTF header
providing further evidence of the name without requiring the caller to
engage in tiresome parsing.  But some callers may not wish the mapping
from input CU to output sub-dictionary to be purely CU-based.

The machinery here allows this to be freely changed, in two ways:

 - callers can call ctf_link_add_cu_mapping to specify that a single
   input compilation unit should have its types placed in some other CU
   if they conflict: the CU will always be created, even if empty, so
   the consuming program can depend on its existence.  You can map
   multiple input CUs to one output CU to force all their types to be
   merged together: if some of *those* types conflict, the behaviour is
   currently unspecified (the new deduplicator will specify it).

 - callers can call ctf_link_set_memb_name_changer to provide a function
   which is passed every CTF sub-dictionary name in turn (including
   _CTF_SECTION) and can return a new name, or NULL if no change is
   desired.  The mapping from input to output names should not map two
   input names to the same output name: if this happens, the two are not
   merged but will result in an archive with two members with the same
   name (technically valid, but it's hard to access the second
   same-named member: you have to do an iteration over archive members).
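
A member-name changer in the spirit of the second item might look like the
sketch below.  The callback type here is a simplified, hypothetical one
(name plus opaque argument, returning a malloced replacement or NULL); the
real ctf_link_memb_name_changer_f signature is not given in this message,
and apply_changer is an invented helper showing how a caller of such a
callback could honour the "NULL means no change" rule.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical simplified callback type: return a malloced new name, or
   NULL if no change is desired.  */
typedef char *name_changer_f (const char *name, void *arg);

/* Example changer: rename per-CU members of the form ".ctf.CU" to
   "{module name}.ctf", taking the module name from ARG, and leave every
   other name alone.  In this sketch an allocation failure is
   indistinguishable from "no change".  */
static char *
module_name_changer (const char *name, void *arg)
{
  const char *module = arg;
  const char *prefix = ".ctf.";
  size_t len;
  char *new_name;

  assert (name != NULL && module != NULL);

  if (strncmp (name, prefix, strlen (prefix)) != 0)
    return NULL;

  len = strlen (module) + strlen (".ctf") + 1;
  if ((new_name = malloc (len)) == NULL)
    return NULL;

  snprintf (new_name, len, "%s.ctf", module);
  return new_name;
}

/* Invented helper: apply CHANGER to NAME and return the name to use.
   *ALLOCATED receives the string the caller must free, or NULL.  */
static const char *
apply_changer (name_changer_f *changer, void *arg, const char *name,
	       char **allocated)
{
  *allocated = changer ? changer (name, arg) : NULL;
  return *allocated ? *allocated : name;
}
```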

This is used by the kernel's ctfarchive machinery (not yet upstream) to
encode CTF under member names like {module name}.ctf rather than
.ctf.CU, but it is anticipated that other large projects may wish to
have their own storage for CTF outside of .ctf sections and may wish to
have new naming schemes that suit their special-purpose consumers.

New in v3.
v4: check for strdup failure.
v5: fix tabdamage.

include/
	* ctf-api.h (ctf_link_add_cu_mapping): New.
	(ctf_link_memb_name_changer_f): New.
	(ctf_link_set_memb_name_changer): New.

libctf/
	* ctf-impl.h (ctf_file_t) <ctf_link_cu_mapping>: New.
	<ctf_link_memb_name_changer>: Likewise.
	<ctf_link_memb_name_changer_arg>: Likewise.
	* ctf-create.c (ctf_update): Update accordingly.
	* ctf-open.c (ctf_file_close): Likewise.
	* ctf-link.c (ctf_create_per_cu): Apply the cu mapping.
	(ctf_link_add_cu_mapping): New.
	(ctf_link_set_memb_name_changer): Likewise.
	(ctf_change_parent_name): New.
	(ctf_name_list_accum_cb_arg_t) <dynames>: New, storage for names
	allocated by the caller's ctf_link_memb_name_changer.
	<ndynames>: Likewise.
	(ctf_accumulate_archive_names): Call the ctf_link_memb_name_changer.
	(ctf_link_write): Likewise (for _CTF_SECTION only): also call
	ctf_change_parent_name.  Free any resulting names.
2019-10-03 17:04:55 +01:00
Nick Alcock 886453cbbc libctf: map from old to corresponding newly-added types in ctf_add_type
This lets you call ctf_type_mapping (dest_fp, src_fp, src_type_id)
and get told what type ID the corresponding type has in the target
ctf_file_t.  This works even if it was added by a recursive call, and
because it is stored in the target ctf_file_t it works even if we
had to add one type to multiple ctf_file_t's as part of conflicting
type handling.

We empty out this mapping after every archive is linked: because it maps
input to output fps, and we only visit each input fp once, its contents
are rendered entirely useless every time the source fp changes.
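
The shape of the mapping can be sketched as below: a table stored in the
target, keyed on the source fp and source type ID, and emptied when the
input changes.  All names are hypothetical and a small array stands in for
the dynhash; this is not the libctf implementation.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_MAPPINGS 32

/* Hypothetical key: which input the type came from, and its ID there.  */
struct mapping_key
{
  const void *src_fp;
  unsigned long src_type;
};

/* Hypothetical per-target mapping table.  */
struct type_mapping
{
  struct
  {
    struct mapping_key key;
    unsigned long dst_type;
  } entries[MAX_MAPPINGS];
  size_t n;
};

/* Record that SRC_TYPE in SRC_FP was added to the target as DST_TYPE.
   Returns 0 on success, -1 if the table is full.  */
static int
add_type_mapping (struct type_mapping *m, const void *src_fp,
		  unsigned long src_type, unsigned long dst_type)
{
  assert (m != NULL && dst_type != 0);

  if (m->n == MAX_MAPPINGS)
    return -1;

  m->entries[m->n].key.src_fp = src_fp;
  m->entries[m->n].key.src_type = src_type;
  m->entries[m->n].dst_type = dst_type;
  m->n++;
  return 0;
}

/* Return the target type ID corresponding to SRC_TYPE in SRC_FP, or 0 if
   that type has not been added.  */
static unsigned long
lookup_type_mapping (const struct type_mapping *m, const void *src_fp,
		     unsigned long src_type)
{
  size_t i;

  for (i = 0; i < m->n; i++)
    if (m->entries[i].key.src_fp == src_fp
	&& m->entries[i].key.src_type == src_type)
      return m->entries[i].dst_type;
  return 0;
}

/* Forget everything: called once an input archive has been linked.  */
static void
empty_type_mapping (struct type_mapping *m)
{
  m->n = 0;
}
```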

v3: add several missing mapping additions.  Add ctf_dynhash_empty, and
    empty after every input archive.
v5: fix tabdamage.

libctf/
	* ctf-impl.h (ctf_file_t): New field ctf_link_type_mapping.
	(struct ctf_link_type_mapping_key): New.
	(ctf_hash_type_mapping_key): Likewise.
	(ctf_hash_eq_type_mapping_key): Likewise.
	(ctf_add_type_mapping): Likewise.
	(ctf_type_mapping): Likewise.
	(ctf_dynhash_empty): Likewise.
	* ctf-open.c (ctf_file_close): Update accordingly.
	* ctf-create.c (ctf_update): Likewise.
	(ctf_add_type): Populate the mapping.
	* ctf-hash.c (ctf_hash_type_mapping_key): Hash a type mapping key.
	(ctf_hash_eq_type_mapping_key): Check the key for equality.
	(ctf_dynhash_insert): Fix comment typo.
	(ctf_dynhash_empty): New.
	* ctf-link.c (ctf_add_type_mapping): New.
	(ctf_type_mapping): Likewise.
	(empty_link_type_mapping): New.
	(ctf_link_one_input_archive): Call it.
2019-10-03 17:04:55 +01:00
Nick Alcock 72c83edd92 libctf: add the ctf_link machinery
This is the start of work on the core of the linking mechanism for CTF
sections.  This commit handles the type and string sections.

The linker calls these functions in sequence:

ctf_link_add_ctf: to add each CTF section in the input in turn to a
  newly-created ctf_file_t (which will appear in the output, and which
  itself will become the shared parent that contains types that all
  TUs have in common (in all link modes) and all types that do not
  have conflicting definitions between types (by default)).  Input files
  that are themselves products of ld -r are supported, though this is
  not heavily tested yet.

ctf_link: called once all input files are added to merge the types in
  all the input containers into the output container, eliminating
  duplicates.

ctf_link_add_strtab: called once the ELF string table is finalized and
  all its offsets are known, this calls a callback provided by the
  linker which returns the string content and offset of every string in
  the ELF strtab in turn: all these strings which appear in the input
  CTF strtab are eliminated from it in favour of the ELF strtab:
  equally, any strings that only appear in the input strtab will
  reappear in the internal CTF strtab of the output.

ctf_link_shuffle_syms (not yet implemented): called once the ELF symtab
  is finalized, this calls a callback provided by the linker which
  returns information on every symbol in turn as a ctf_link_sym_t.  This
  is then used to shuffle the function info and data object sections in
  the CTF section into symbol table order, eliminating the index
  sections which map those sections to symbol names before that point.
  Currently just returns ECTF_NOTYET.

ctf_link_write: Returns a buffer containing either a serialized
  ctf_file_t (if there are no types with conflicting definitions in the
  object files in the link) or a ctf_archive_t containing a large
  ctf_file_t (the common types) and a bunch of small ones named after
  individual CUs in which conflicting types are found (containing the
  conflicting types, and all types that reference them).  A threshold
  size above which compression takes place is passed as one parameter.
  (Currently, only gzip compression is supported, but I hope to add lzma
  as well.)

Lifetime rules for this are simple: don't close the input CTF files
until you've called ctf_link for the last time.  We do not assume
that symbols or strings passed in by the callback outlast the
call to ctf_link_add_strtab or ctf_link_shuffle_syms.

Right now, the duplicate elimination mechanism is the one already
present as part of the ctf_add_type function, and is not particularly
good: it misses numerous actual duplicates, and the conflicting-types
detection hardly ever reports that types conflict, even when they do
(one of them just tends to get silently dropped): it is also very slow.
This will all be fixed in the next few weeks, but the fix hardly touches
any of this code, and the linker does work without it, just not as
well as it otherwise might.  (And when no CTF section is present,
there is no effect on performance, of course.  So only people using
a trunk GCC with not-yet-committed patches will even notice.  By the
time it gets upstream, things should be better.)

v3: Fix error handling.
v4: check for strdup failure.
v5: fix tabdamage.

include/
	* ctf-api.h (struct ctf_link_sym): New, a symbol in flight to the
	libctf linking machinery.
	(CTF_LINK_SHARE_UNCONFLICTED): New.
	(CTF_LINK_SHARE_DUPLICATED): New.
	(ECTF_LINKADDEDLATE): New, replacing ECTF_UNUSED.
	(ECTF_NOTYET): New, a 'not yet implemented' message.
	(ctf_link_add_ctf): New, add an input file's CTF to the link.
	(ctf_link): New, merge the type and string sections.
	(ctf_link_strtab_string_f): New, callback for feeding strtab info.
	(ctf_link_iter_symbol_f): New, callback for feeding symtab info.
	(ctf_link_add_strtab): New, tell the CTF linker about the ELF
	strtab's strings.
	(ctf_link_shuffle_syms): New, ask the CTF linker to shuffle its
	symbols into symtab order.
	(ctf_link_write): New, ask the CTF linker to write the CTF out.

libctf/
	* ctf-link.c: New file, linking of the string and type sections.
	* Makefile.am (libctf_a_SOURCES): Add it.
	* Makefile.in: Regenerate.

	* ctf-impl.h (ctf_file_t): New fields ctf_link_inputs,
	ctf_link_outputs.
	* ctf-create.c (ctf_update): Update accordingly.
	* ctf-open.c (ctf_file_close): Likewise.
	* ctf-error.c (_ctf_errlist): Updated with new errors.
2019-10-03 17:04:55 +01:00
Nick Alcock d851ecd373 libctf: support getting strings from the ELF strtab
The CTF file format has always supported "external strtabs", which
internally are strtab offsets with their MSB on: such refs
get their strings from the strtab passed in at CTF file open time:
this is usually intended to be the ELF strtab, and that's what this
implementation is meant to support, though in theory the external
strtab could come from anywhere.

This commit adds support for these external strings in the ctf-string.c
strtab tracking layer.  It's quite easy: we just add a field csa_offset
to the atoms table that tracks all strings: this field tracks the offset
of the string in the ELF strtab (with its MSB already on, courtesy of a
new macro CTF_SET_STID), and add a new function that sets the
csa_offset to the specified offset (plus MSB).  Then we just need to
avoid writing out strings to the internal strtab if they have csa_offset
set, and note that the internal strtab is shorter than it might
otherwise be.
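
The "offset with its MSB on" encoding can be illustrated with the sketch
below.  The macro and function names are hypothetical, chosen for the
example: they mirror the idea behind CTF_SET_STID on a 32-bit string
reference, not necessarily its definition in ctf.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical strtab IDs: 0 for the internal CTF strtab, 1 for the
   external (usually ELF) strtab.  */
#define STRTAB_INTERNAL 0u
#define STRTAB_EXTERNAL 1u

/* Pack a strtab ID into the top bit of a 32-bit string reference, and
   unpack it again.  */
#define SET_STID(offset, stid) \
  ((uint32_t) (offset) | ((uint32_t) (stid) << 31))
#define NAME_STID(name) ((uint32_t) (name) >> 31)
#define NAME_OFFSET(name) ((uint32_t) (name) & 0x7fffffffu)

/* Turn an offset into the ELF strtab into an external string reference.
   The offset must fit in the low 31 bits.  */
static uint32_t
external_ref (uint32_t elf_strtab_offset)
{
  assert (elf_strtab_offset <= 0x7fffffffu);
  return SET_STID (elf_strtab_offset, STRTAB_EXTERNAL);
}
```

A writer following this scheme would skip any string whose reference has
the external bit set when emitting the internal strtab, since the string
itself lives in the other table.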

(We could in theory save a little more time here by eschewing sorting
such strings, since we never actually write the strings out anywhere,
but that would mean storing them separately and it's just not worth the
complexity cost until profiling shows it's worth doing.)

We also have to go through a bit of extra effort at variable-sorting
time.  This was previously using direct references to the internal
strtab: it couldn't use ctf_strptr or ctf_strraw because the new strtab
is not yet ready to put in its usual field (in a ctf_file_t that hasn't
even been allocated yet at this stage): but now we're using the external
strtab, this will no longer do because it'll be looking things up in the
wrong strtab, with disastrous results.  Instead, pass the new internal
strtab in to a new ctf_strraw_explicit function which is just like
ctf_strraw except you can specify a new internal strtab to use.

But even now that it is using a new internal strtab, this is not quite
enough: it can't look up strings in the external strtab because ld
hasn't written it out yet, and when it does will write it straight to
disk.  Instead, when we write the internal strtab, note all the offset
-> string mappings that we have noted belong in the *external* strtab to
a new "synthetic external strtab" dynhash, ctf_syn_ext_strtab, and look
in there at ctf_strraw time if it is set.  This uses minimal extra
memory (because only strings in the external strtab that we actually use
are stored, and even those come straight out of the atoms table), but
lets both variable sorting and name interning when ctf_bufopen is next
called work fine.  (This also means that we don't need to filter out
spurious ECTF_STRTAB warnings from ctf_bufopen but can pass them back to
the caller, once we wrap ctf_bufopen so that we have a new internal
variant of ctf_bufopen etc that we can pass the synthetic external
strtab to. That error has been filtered out since the days of Solaris
libctf, which didn't try to handle the problem of getting external
strtabs right at construction time at all.)

v3: add the synthetic strtab and all associated machinery.
v5: fix tabdamage.

include/
	* ctf.h (CTF_SET_STID): New.

libctf/
	* ctf-impl.h (ctf_str_atom_t) <csa_offset>: New field.
	(ctf_file_t) <ctf_syn_ext_strtab>: Likewise.
	(ctf_str_add_ref): Name the last arg.
	(ctf_str_add_external): New.
	(ctf_strraw_explicit): Likewise.
	(ctf_simple_open_internal): Likewise.
	(ctf_bufopen_internal): Likewise.

	* ctf-string.c (ctf_strraw_explicit): Split from...
	(ctf_strraw): ... here, with new support for ctf_syn_ext_strtab.
	(ctf_str_add_ref_internal): Return the atom, not the
	string.
	(ctf_str_add): Adjust accordingly.
	(ctf_str_add_ref): Likewise.  Move up in the file.
	(ctf_str_add_external): New: update the csa_offset.
	(ctf_str_count_strtab): Only account for strings with no csa_offset
	in the internal strtab length.
	(ctf_str_write_strtab): If the csa_offset is set, update the
	string's refs without writing the string out, and update the
	ctf_syn_ext_strtab.  Make OOM handling less ugly.
	* ctf-create.c (struct ctf_sort_var_arg_cb): New.
	(ctf_update): Handle failure to populate the strtab.  Pass in the
	new ctf_sort_var arg.  Adjust for ctf_syn_ext_strtab addition.
	Call ctf_simple_open_internal, not ctf_simple_open.
	(ctf_sort_var): Call ctf_strraw_explicit rather than looking up
	strings by hand.
	* ctf-hash.c (ctf_hash_insert_type): Likewise (but using
	ctf_strraw).  Adjust to diagnose ECTF_STRTAB nonetheless.
	* ctf-open.c (init_types): No longer filter out ECTF_STRTAB.
	(ctf_file_close): Destroy the ctf_syn_ext_strtab.
	(ctf_simple_open): Rename to, and reimplement as a wrapper around...
	(ctf_simple_open_internal): ... this new function, which calls
	ctf_bufopen_internal.
	(ctf_bufopen): Rename to, and reimplement as a wrapper around...
	(ctf_bufopen_internal): ... this new function, which sets
	ctf_syn_ext_strtab.
2019-10-03 17:04:55 +01:00
Nick Alcock 9b32cba44d libctf, binutils: dump the CTF header
The CTF header has before now been thrown away too soon to be dumped
using the ctf_dump() machinery used by objdump and readelf: instead, a
kludge involving debugging-priority dumps of the header offsets on every
open was used.

Replace this with proper first-class dumping machinery just like
everything else in the CTF file, and have objdump and readelf use it.
(The dumper already had an enum value in ctf_sect_names_t for this
purpose, waiting to be used.)

v5: fix tabdamage.

libctf/
	* ctf-impl.h (ctf_file_t): New field ctf_openflags.
	* ctf-open.c (ctf_bufopen): Set it.  No longer dump header offsets.
	* ctf-dump.c (dump_header): New function, dump the CTF header.
	(ctf_dump): Call it.
	(ctf_dump_header_strfield): New function.
	(ctf_dump_header_sectfield): Likewise.

binutils/
	* objdump.c (dump_ctf_archive_member): Dump the CTF header.
	* readelf.c (dump_section_as_ctf): Likewise.
2019-10-03 17:04:55 +01:00
Nick Alcock fd55eae84d libctf: allow the header to change between versions
libctf supports dynamic upgrading of the type table as file format
versions change, but before now has not supported changes to the CTF
header.  Doing this is complicated by the baroque storage method used:
the CTF header is kept prepended to the rest of the CTF data, just as
when read from the file, and written out from there, and is
endian-flipped in place.

This makes accessing it needlessly hard and makes it almost impossible
to make the header larger if we add fields.  The general storage
machinery around the malloced ctf pointer (the 'ctf_base') is also
overcomplicated: the pointer is sometimes malloced locally and sometimes
assigned from a parameter, so freeing it requires checking to see if
that parameter was used, needlessly coupling ctf_bufopen and
ctf_file_close together.

So split the header out into a new ctf_file_t.ctf_header, which is
written out explicitly: squeeze it out of the CTF buffer whenever we
reallocate it, and use ctf_file_t.ctf_buf to skip past the header when
we do not need to reallocate (when no upgrading or endian-flipping is
required).  We now track whether the CTF base can be freed explicitly
via a new ctf_dynbase pointer which is non-NULL only when freeing is
possible.
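
The ownership rule above can be sketched as follows.  This is a minimal
illustration only, not the libctf code: struct mini_file and the
mini_* functions are hypothetical stand-ins for ctf_file_t and its
open/close paths, and only the role of the three pointers is taken from
the description above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical miniature of the fields described above.  */
struct mini_file
{
  unsigned char *base;		/* Start of the data, owned or borrowed.  */
  unsigned char *dynbase;	/* Non-NULL only if we malloced the data.  */
  unsigned char *buf;		/* First byte after the header.  */
};

/* Use the caller's buffer unmodified: skip the header, own nothing.  */
static void
mini_borrow (struct mini_file *fp, unsigned char *data, size_t hdrlen)
{
  fp->base = data;
  fp->dynbase = NULL;
  fp->buf = data + hdrlen;
}

/* Reallocate, squeezing the header out of the copy.  LEN must be at
   least HDRLEN.  Returns 0 on success, -1 on OOM.  */
static int
mini_own (struct mini_file *fp, const unsigned char *data, size_t hdrlen,
	  size_t len)
{
  size_t bodylen = len - hdrlen;
  unsigned char *copy = malloc (bodylen > 0 ? bodylen : 1);

  if (copy == NULL)
    return -1;
  memcpy (copy, data + hdrlen, bodylen);
  fp->base = fp->dynbase = fp->buf = copy;
  return 0;
}

/* Closing needs no knowledge of how the file was opened.  */
static void
mini_close (struct mini_file *fp)
{
  free (fp->dynbase);		/* free (NULL) is a no-op.  */
  memset (fp, 0, sizeof (*fp));
}
```

The point of the design is that the close path frees one pointer
unconditionally rather than testing which open path was taken.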

With all this done, we can upgrade the header on the fly and add new
fields as desired, via a new upgrade_header function in ctf-open.
As with other forms of upgrading, libctf upgrades older headers
automatically to the latest supported version at open time.

For a first use of this field, we add a new string field cth_cuname, and
a corresponding setter/getter pair ctf_cuname_set and ctf_cuname: this
is used by debuggers to determine whether a CTF section's types relate
to a single compilation unit, or to all compilation units in the
program.  (Types with ambiguous definitions in different CUs have only
one of these types placed in the top-level shared .ctf container: the
rest are placed in much smaller per-CU containers, which have the shared
container as their parent.  Since CTF must be useful in the absence of
DWARF, we store the names of the relevant CUs ourselves, so the debugger
can look them up.)

v5: fix tabdamage.

include/
	* ctf-api.h (ctf_cuname): New function.
	(ctf_cuname_set): Likewise.
	* ctf.h: Improve comment around upgrading, no longer
	implying that v2 is the target of upgrades (it is v3 now).
	(ctf_header_v2_t): New, old-format header for backward
	compatibility.
	(ctf_header_t): Add cth_cuname: this is the first of several
	header changes in format v3.
libctf/
	* ctf-impl.h (ctf_file_t): New fields ctf_header, ctf_dynbase,
	ctf_cuname, ctf_dyncuname: ctf_base and ctf_buf are no longer const.
	* ctf-open.c (ctf_set_base): Preserve the gap between ctf_buf and
	ctf_base: do not assume that it is always sizeof (ctf_header_t).
	Print out ctf_cuname: only print out ctf_parname if set.
	(ctf_free_base): Removed, ctf_base is no longer freed: free
	ctf_dynbase instead.
	(ctf_set_version): Fix spacing.
	(upgrade_header): New, in-place header upgrading.
	(upgrade_types): Rename to...
	(upgrade_types_v1): ... this.  Free ctf_dynbase, not ctf_base.  No
	longer track old and new headers separately.  No longer allow for
	header sizes explicitly: squeeze the headers out on upgrade (they
	are preserved in fp->ctf_header).  Set ctf_dynbase, ctf_base and
	ctf_buf explicitly.  Use ctf_free, not ctf_free_base.
	(upgrade_types): New, also handle ctf_parmax updating.
	(flip_header): Flip ctf_cuname.
	(flip_types): Flip BUF explicitly rather than deriving BUF from
	BASE.
	(ctf_bufopen): Store the header in fp->ctf_header.  Correct minimum
	required alignment of objtoff and funcoff.  No longer store it in
	the ctf_buf unless that buf is derived unmodified from the input.
	Set ctf_dynbase where ctf_base is dynamically allocated. Drop locals
	that duplicate fields in ctf_file: move allocation of ctf_file
	further up instead.  Call upgrade_header as needed.  Move
	version-specific ctf_parmax initialization into upgrade_types.  More
	concise error handling.
	(ctf_file_close): No longer test for null pointers before freeing.
	Free ctf_dyncuname, ctf_dynbase, and ctf_header.  Do not call
	ctf_free_base.
	(ctf_cuname): New.
	(ctf_cuname_set): New.
	* ctf-create.c (ctf_update): Populate ctf_cuname.
	(ctf_gzwrite): Write out the header explicitly.  Remove obsolescent
	comment.
	(ctf_write): Likewise.
	(ctf_compress_write): Get the header from ctf_header, not ctf_base.
	Fix the compression length: fp->ctf_size never counted the CTF
	header.  Simplify the compress call accordingly.
2019-10-03 17:04:55 +01:00
Nick Alcock f5e9c9bde0 libctf: deduplicate and sort the string table
ctf.h states:

> [...] the CTF string table does not contain any duplicated strings.

Unfortunately this is entirely untrue: libctf has before now made no
attempt whatsoever to deduplicate the string table. It computes the
string table's length on the fly as it adds new strings to the dynamic
CTF file, and ctf_update() just writes each string to the table and
notes the current write position as it traverses the dynamic CTF file's
data structures and builds the final CTF buffer.  There is no global
view of the strings and no deduplication.

Fix this by erasing the ctf_dtvstrlen dead-reckoning length, and adding
a new dynhash table ctf_str_atoms that maps unique strings to a list
of references to those strings: a reference is a simple uint32_t * to
some value somewhere in the under-construction CTF buffer that needs
updating to note the string offset when the strtab is laid out.

Adding a string is now a simple matter of calling ctf_str_add_ref(),
which adds a new atom to the atoms table, if one doesn't already exist,
and adding the location of the reference to this atom to the refs list
attached to the atom: this works reliably as long as one takes care to
only call ctf_str_add_ref() once the final location of the offset is
known (so you can't call it on a temporary structure and then memcpy()
that structure into place in the CTF buffer, because the ref will still
point to the old location: ctf_update() changes accordingly).

Generating the CTF string table is a matter of calling
ctf_str_write_strtab(), which counts the length and number of elements
in the atoms table using the ctf_dynhash_iter() function we just added,
populating an array of pointers into the atoms table and sorting it into
order (to help compressors), then traversing this table and emitting it,
updating the refs to each atom as we go.  The only complexity here is
arranging to keep the null string at offset zero, since a lot of code in
libctf depends on being able to leave strtab references at 0 to indicate
'no name'.  Once the table is constructed and the refs updated, we know
how long it is, so we can realloc() the partial CTF buffer we allocated
earlier and can copy the table on to the end of it (and purge the refs
because they're not needed any more and have been invalidated by the
realloc() call in any case).
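
A rough sketch of the atoms-and-refs scheme described above, under
simplifying assumptions: struct atom, struct atoms, atoms_add_ref and
atoms_write_strtab are hypothetical names, the atoms table is a
fixed-size array searched linearly rather than a dynhash, and the
strtab is returned in its own buffer rather than appended to a CTF
buffer.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ATOMS 64
#define MAX_REFS 8

struct atom
{
  const char *str;		/* The unique string.  */
  uint32_t *refs[MAX_REFS];	/* Places that need the final offset.  */
  size_t nrefs;
};

struct atoms
{
  struct atom atom[MAX_ATOMS];
  size_t natoms;
};

/* Record that *REF must receive the strtab offset of STR.  REF must
   already be at its final location.  Returns 0, or -1 if full.  */
static int
atoms_add_ref (struct atoms *t, const char *str, uint32_t *ref)
{
  size_t i;

  for (i = 0; i < t->natoms; i++)
    if (strcmp (t->atom[i].str, str) == 0)
      break;

  if (i == t->natoms)
    {
      if (t->natoms == MAX_ATOMS)
	return -1;
      t->atom[i].str = str;
      t->atom[i].nrefs = 0;
      t->natoms++;
    }

  if (t->atom[i].nrefs == MAX_REFS)
    return -1;
  t->atom[i].refs[t->atom[i].nrefs++] = ref;
  return 0;
}

static int
atom_cmp (const void *a, const void *b)
{
  struct atom *const *one = a;
  struct atom *const *two = b;
  return strcmp ((*one)->str, (*two)->str);
}

/* Lay out the sorted, deduplicated table and update every ref.
   Returns a malloced buffer (length in *LEN), or NULL on OOM.  */
static char *
atoms_write_strtab (struct atoms *t, size_t *len)
{
  struct atom *sorted[MAX_ATOMS];
  size_t i, j, n = 0, off = 1;	/* Offset 0 is the null string.  */
  char *buf;

  *len = 1;
  for (i = 0; i < t->natoms; i++)
    {
      struct atom *a = &t->atom[i];

      if (a->str[0] != '\0')
	{
	  sorted[n++] = a;
	  *len += strlen (a->str) + 1;
	}
      else
	for (j = 0; j < a->nrefs; j++)
	  *a->refs[j] = 0;	/* 'No name' stays at offset zero.  */
    }
  qsort (sorted, n, sizeof (sorted[0]), atom_cmp);

  if ((buf = malloc (*len)) == NULL)
    return NULL;
  buf[0] = '\0';

  for (i = 0; i < n; i++)
    {
      size_t slen = strlen (sorted[i]->str) + 1;

      memcpy (buf + off, sorted[i]->str, slen);
      for (j = 0; j < sorted[i]->nrefs; j++)
	*sorted[i]->refs[j] = (uint32_t) off;
      off += slen;
    }
  return buf;
}
```

Adding "int" twice and "char" once in this sketch yields a single copy
of each string, with both "int" refs updated to the same offset.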

The net effect of all this is a reduction in uncompressed strtab sizes
of about 30% (perhaps a quarter to a half of all strings across the
Linux kernel are eliminated as duplicates). Of course, duplicated
strings are highly redundant, so the space saving after compression is
only about 20%: when the other non-strtab sections are factored in, CTF
sizes shrink by about 10%.

No change in externally-visible API or file format (other than the
reduction in pointless redundancy).

libctf/
	* ctf-impl.h (struct ctf_strs_writable): New, non-const version of
	struct ctf_strs.
	(struct ctf_dtdef): Note that dtd_data.ctt_name is unpopulated.
	(struct ctf_str_atom): New, disambiguated single string.
	(struct ctf_str_atom_ref): New, points to some other location that
	references this string's offset.
	(struct ctf_file): New members ctf_str_atoms and ctf_str_num_refs.
	Remove member ctf_dtvstrlen: we no longer track the total strlen
	as we add strings.
	(ctf_str_create_atoms): Declare new function in ctf-string.c.
	(ctf_str_free_atoms): Likewise.
	(ctf_str_add): Likewise.
	(ctf_str_add_ref): Likewise.
	(ctf_str_purge_refs): Likewise.
	(ctf_str_write_strtab): Likewise.
	(ctf_realloc): Declare new function in ctf-util.c.

	* ctf-open.c (ctf_bufopen): Create the atoms table.
	(ctf_file_close): Destroy it.
	* ctf-create.c (ctf_update): Copy-and-free it on update.  No longer
	special-case the position of the parname string.  Construct the
	strtab by calling ctf_str_add_ref and ctf_str_write_strtab after the
	rest of each buffer element is constructed, not via open-coding:
	realloc the CTF buffer and append the strtab to it.  No longer
	maintain ctf_dtvstrlen.  Sort the variable entry table later, after
	strtab construction.
	(ctf_copy_membnames): Remove: integrated into ctf_copy_{s,l,e}members.
	(ctf_copy_smembers): Drop the string offset: call ctf_str_add_ref
	after buffer element construction instead.
	(ctf_copy_lmembers): Likewise.
	(ctf_copy_emembers): Likewise.
	(ctf_create): No longer maintain the ctf_dtvstrlen.
	(ctf_dtd_delete): Likewise.
	(ctf_dvd_delete): Likewise.
	(ctf_add_generic): Likewise.
	(ctf_add_enumerator): Likewise.
	(ctf_add_member_offset): Likewise.
	(ctf_add_variable): Likewise.
	(membadd): Likewise.
	* ctf-util.c (ctf_realloc): New, wrapper around realloc that aborts
	if there are active ctf_str_num_refs.
	(ctf_strraw): Move to ctf-string.c.
	(ctf_strptr): Likewise.
	* ctf-string.c: New file, strtab manipulation.

	* Makefile.am (libctf_a_SOURCES): Add it.
	* Makefile.in: Regenerate.
2019-07-01 11:05:59 +01:00
Nick Alcock 9658dc3963 libctf: add hash traversal helpers
There are two, ctf_dynhash_iter and ctf_dynhash_iter_remove: the latter
lets you return a nonzero value to remove the element being iterated
over.

Used in the next commit.
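
The remove-on-nonzero contract can be illustrated with a small
self-contained sketch.  It is not the libctf implementation: a singly
linked list stands in for the hash, and struct node, list_push,
list_iter_remove and below_threshold are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the elements of a hash.  */
struct node
{
  int key;
  struct node *next;
};

typedef int (*iter_remove_f) (int key, void *arg);

/* Push KEY on the front.  Returns 0 on success, -1 on OOM.  */
static int
list_push (struct node **head, int key)
{
  struct node *n = malloc (sizeof (*n));

  if (n == NULL)
    return -1;
  n->key = key;
  n->next = *head;
  *head = n;
  return 0;
}

/* Call FUN on every element, unlinking and freeing those for which
   it returns nonzero.  Returns the number of elements removed.  */
static size_t
list_iter_remove (struct node **head, iter_remove_f fun, void *arg)
{
  size_t removed = 0;
  struct node **link = head;

  while (*link != NULL)
    {
      struct node *n = *link;

      if (fun (n->key, arg))
	{
	  *link = n->next;	/* Unlink before freeing.  */
	  free (n);
	  removed++;
	}
      else
	link = &n->next;
    }
  return removed;
}

/* Example callback: remove keys below the threshold in *ARG.  */
static int
below_threshold (int key, void *arg)
{
  return key < *(int *) arg;
}
```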

libctf/
	* ctf-impl.h (ctf_hash_iter_f): New.
	(ctf_dynhash_iter): New declaration.
	(ctf_dynhash_iter_remove): New declaration.
	* ctf-hash.c (ctf_dynhash_iter): Define.
	(ctf_dynhash_iter_remove): Likewise.
	(ctf_hashtab_traverse): New.
	(ctf_hashtab_traverse_remove): Likewise.
	(struct ctf_traverse_cb_arg): Likewise.
	(struct ctf_traverse_remove_cb_arg): Likewise.
2019-07-01 11:05:59 +01:00
Nick Alcock 65365aa856 libctf: drop mmap()-based CTF data allocator
This allocator has the ostensible benefit that it lets us mprotect() the
memory used for CTF storage: but in exchange for this it adds
considerable complexity, since we have to track allocation sizes
ourselves for use at freeing time, note whether the data we are storing
was ctf_data_alloc()ed or not so we know if we can safely mprotect()
it... and while the mprotect()ing has found few bugs, it *has* been the
cause of more than one due to errors in all this tracking leading to us
mprotect()ing bits of the heap and stuff like that.

We are about to start composing CTF buffers from pieces so that we can
do usage-based optimizations on the strtab.  This means we need
realloc(), which needs nonportable mremap() and *more* tracking of the
*original* allocation size, and the complexity and bureaucracy of all of
this is just too high for its negligible benefits.

Drop the whole thing and just use malloc() like everyone else.  It knows
better than we do when it is safe to use mmap() under the covers,
anyway.

While we're at it, don't leak the entire buffer if ctf_compress_write()
fails to compress it.

libctf/
	* ctf-subr.c (_PAGESIZE): Remove.
	(ctf_data_alloc): Likewise.
	(ctf_data_free): Likewise.
	(ctf_data_protect): Likewise.
	* ctf-impl.h: Remove declarations.
	* ctf-create.c (ctf_update): No longer call ctf_data_protect: use
	ctf_free, not ctf_data_free.
	(ctf_compress_write): Use ctf_alloc, not ctf_data_alloc.  Free
	the buffer again on compression error.
	* ctf-open.c (ctf_set_base): No longer track the size: call
	ctf_free, not ctf_data_free.
	(upgrade_types): Likewise.  Call ctf_alloc, not ctf_data_alloc.
	(ctf_bufopen): Likewise.  No longer call ctf_data_protect.
2019-06-21 13:04:02 +01:00
Nick Alcock 2486542803 libctf: handle errors on dynhash insertion better
We were missing several cases where dynhash insertion might fail, likely
due to OOM but possibly for other reasons.  Pass the errors on.

libctf/
	* ctf-create.c (ctf_dtd_insert): Pass on error returns from
	ctf_dynhash_insert.
	(ctf_dvd_insert): Likewise.
	(ctf_add_generic): Likewise.
	(ctf_add_variable): Likewise.
	* ctf-impl.h: Adjust declarations.
2019-06-21 13:04:01 +01:00
Jose E. Marchesi a0486bac41 libctf: fix a number of build problems found on Solaris and NetBSD
- Use of nonportable <endian.h>
- Use of qsort_r
- Use of zlib without appropriate magic to pull in the binutils zlib
- Use of off64_t without checking (fixed by dropping the unused fields
  that need off64_t entirely)
- signedness problems due to long being too short a type on 32-bit
  platforms: ctf_id_t is now 'unsigned long', and CTF_ERR must be
  used only for functions that return ctf_id_t
- One lingering use of bzero() and of <sys/errno.h>

All fixed, using code from gnulib where possible.

Relatedly, set cts_size in a couple of places it was missed
(string table and symbol table loading upon ctf_bfdopen()).

binutils/
	* objdump.c (make_ctfsect): Drop cts_type, cts_flags, and
	cts_offset.
	* readelf.c (shdr_to_ctf_sect): Likewise.
include/
	* ctf-api.h (ctf_sect_t): Drop cts_type, cts_flags, and cts_offset.
	(ctf_id_t): This is now an unsigned type.
	(CTF_ERR): Cast it to ctf_id_t.  Note that it should only be used
	for ctf_id_t-returning functions.
libctf/
	* Makefile.am (ZLIB): New.
	(ZLIBINC): Likewise.
	(AM_CFLAGS): Use them.
	(libctf_a_LIBADD): New, for LIBOBJS.
	* configure.ac: Check for zlib, endian.h, and qsort_r.
	* ctf-endian.h: New, providing htole64 and le64toh.
	* swap.h: Code style fixes.
	(bswap_identity_64): New.
	* qsort_r.c: New, from gnulib (with one added #include).
	* ctf-decls.h: New, providing a conditional qsort_r declaration,
	and unconditional definitions of MIN and MAX.
	* ctf-impl.h: Use it.  Do not use <sys/errno.h>.
	(ctf_set_errno): Now returns unsigned long.
	* ctf-util.c (ctf_set_errno): Adjust here too.
	* ctf-archive.c: Use ctf-endian.h.
	(ctf_arc_open_by_offset): Use memset, not bzero.  Drop cts_type,
	cts_flags and cts_offset.
	(ctf_arc_write): Drop debugging dependent on the size of off_t.
	* ctf-create.c: Provide a definition of roundup if not defined.
	(ctf_create): Drop cts_type, cts_flags and cts_offset.
	(ctf_add_reftype): Do not check if type IDs are below zero.
	(ctf_add_slice): Likewise.
	(ctf_add_typedef): Likewise.
	(ctf_add_member_offset): Cast error-returning ssize_t's to size_t
	when known error-free.  Drop CTF_ERR usage for functions returning
	int.
	(ctf_add_member_encoded): Drop CTF_ERR usage for functions returning
	int.
	(ctf_add_variable): Likewise.
	(enumcmp): Likewise.
	(enumadd): Likewise.
	(membcmp): Likewise.
	(ctf_add_type): Likewise.  Cast error-returning ssize_t's to size_t
	when known error-free.
	* ctf-dump.c (ctf_is_slice): Drop CTF_ERR usage for functions
	returning int: use CTF_ERR for functions returning ctf_id_t.
	(ctf_dump_label): Likewise.
	(ctf_dump_objts): Likewise.
	* ctf-labels.c (ctf_label_topmost): Likewise.
	(ctf_label_iter): Likewise.
	(ctf_label_info): Likewise.
	* ctf-lookup.c (ctf_func_args): Likewise.
	* ctf-open.c (upgrade_types): Cast to size_t where appropriate.
	(ctf_bufopen): Likewise.  Use zlib types as needed.
	* ctf-types.c (ctf_member_iter): Drop CTF_ERR usage for functions
	returning int.
	(ctf_enum_iter): Likewise.
	(ctf_type_size): Likewise.
	(ctf_type_align): Likewise.  Cast to size_t where appropriate.
	(ctf_type_kind_unsliced): Likewise.
	(ctf_type_kind): Likewise.
	(ctf_type_encoding): Likewise.
	(ctf_member_info): Likewise.
	(ctf_array_info): Likewise.
	(ctf_enum_value): Likewise.
	(ctf_type_rvisit): Likewise.
	* ctf-open-bfd.c (ctf_bfdopen): Drop cts_type, cts_flags and
	cts_offset.
	(ctf_simple_open): Likewise.
	(ctf_bfdopen_ctfsect): Likewise.  Set cts_size properly.
	* Makefile.in: Regenerate.
	* aclocal.m4: Likewise.
	* config.h: Likewise.
	* configure: Likewise.
2019-05-31 11:10:51 +02:00
Nick Alcock 6c33b742ce libctf: library version enforcement
This old Solaris standard allows callers to specify that they are
expecting one particular API and/or CTF file format from the library.

libctf/
	* ctf-impl.h (_libctf_version): New declaration.
	* ctf-subr.c (_libctf_version): Define it.
	(ctf_version): New.

include/
	* ctf-api.h (ctf_version): New.
2019-05-28 17:08:29 +01:00
Nick Alcock b437bfe0f4 libctf: lookups by name and symbol
These functions allow you to look up types given a name in a simple
subset of C declarator syntax (no function pointers), to look up the
types of variables given a name, and to look up the types of data
objects and the type signatures of functions given symbol table offsets.

(Despite its name, one function in this commit, ctf_lookup_symbol_name(),
is for the internal use of libctf only, and does not appear in any
public header files.)

libctf/
	* ctf-lookup.c (isqualifier): New.
	(ctf_lookup_by_name): Likewise.
	(struct ctf_lookup_var_key): Likewise.
	(ctf_lookup_var): Likewise.
	(ctf_lookup_variable): Likewise.
	(ctf_lookup_symbol_name): Likewise.
	(ctf_lookup_by_symbol): Likewise.
	(ctf_func_info): Likewise.
	(ctf_func_args): Likewise.

include/
	* ctf-api.h (ctf_func_info): New.
	(ctf_func_args): Likewise.
	(ctf_lookup_by_symbol): Likewise.
	(ctf_lookup_by_name): Likewise.
	(ctf_lookup_variable): Likewise.
2019-05-28 17:08:19 +01:00
Nick Alcock 316afdb130 libctf: core type lookup
Finally we get to the functions used to actually look up and enumerate
properties of types in a container (names, sizes, members, what type a
pointer or cv-qual references, determination of whether two types are
assignment-compatible, etc).

With a very few exceptions these do not work for types newly added via
ctf_add_*(): they only work on types in read-only containers, or types
added before the most recent call to ctf_update().

This also adds support for lookup of "variables" (string -> type ID
mappings) and for generation of C type names corresponding to a type ID.

libctf/
	* ctf-decl.c: New file.
	* ctf-types.c: Likewise.
	* ctf-impl.h: New declarations.

include/
	* ctf-api.h (ctf_visit_f): New definition.
	(ctf_member_f): Likewise.
	(ctf_enum_f): Likewise.
	(ctf_variable_f): Likewise.
	(ctf_type_f): Likewise.
	(ctf_type_isparent): Likewise.
	(ctf_type_ischild): Likewise.
	(ctf_type_resolve): Likewise.
	(ctf_type_aname): Likewise.
	(ctf_type_lname): Likewise.
	(ctf_type_name): Likewise.
	(ctf_type_size): Likewise.
	(ctf_type_align): Likewise.
	(ctf_type_kind): Likewise.
	(ctf_type_reference): Likewise.
	(ctf_type_pointer): Likewise.
	(ctf_type_encoding): Likewise.
	(ctf_type_visit): Likewise.
	(ctf_type_cmp): Likewise.
	(ctf_type_compat): Likewise.
	(ctf_member_info): Likewise.
	(ctf_array_info): Likewise.
	(ctf_enum_name): Likewise.
	(ctf_enum_value): Likewise.
	(ctf_member_iter): Likewise.
	(ctf_enum_iter): Likewise.
	(ctf_type_iter): Likewise.
	(ctf_variable_iter): Likewise.
2019-05-28 17:08:14 +01:00
Nick Alcock 143dce8481 libctf: ELF file opening via BFD
These functions let you open an ELF file with a customarily-named CTF
section in it, automatically opening the CTF file or archive and
associating the symbol and string tables in the ELF file with the CTF
container, so that you can look up the types of symbols in the ELF file
via ctf_lookup_by_symbol(), and so that strings can be shared between
the ELF file and CTF container, to save space.

It uses BFD machinery to do so.  This has now been lightly tested and
seems to work.  In particular, if you already have a bfd you can pass
it in to ctf_bfdopen(), and if you want a bfd made for you you can
call ctf_open() or ctf_fdopen(), optionally specifying a target (or
try once without a target and then again with one if you get
ECTF_BFD_AMBIGUOUS back).

We use a forward declaration for the struct bfd in ctf-api.h, so that
ctf-api.h users are not required to pull in <bfd.h>.  (This is mostly
for the sake of readelf.)

libctf/
	* ctf-open-bfd.c: New file.
	* ctf-open.c (ctf_close): New.
	* ctf-impl.h: Include bfd.h.
	(ctf_file): New members ctf_data_mmapped, ctf_data_mmapped_len.
	(ctf_archive_internal): New members ctfi_abfd, ctfi_data,
	ctfi_bfd_close.
	(ctf_bfdopen_ctfsect): New declaration.
	(_CTF_SECTION): Likewise.

include/
	* ctf-api.h (struct bfd): New forward.
	(ctf_fdopen): New.
	(ctf_bfdopen): Likewise.
	(ctf_open): Likewise.
	(ctf_arc_open): Likewise.
2019-05-28 17:08:08 +01:00
Nick Alcock 9402cc593f libctf: mmappable archives
If you need to store a large number of CTF containers somewhere, this
provides a dedicated facility for doing so: an mmappable archive format
like a very simple tar or ar without all the system-dependent format
horrors or need for heavy file copying, with built-in compression of
files above a particular size threshold.

libctf automatically mmap()s uncompressed elements of these archives, or
uncompresses them, as needed.  (If the platform does not support mmap(),
copying into dynamically-allocated buffers is used.)

Archive iteration operations are partitioned into raw and non-raw
forms. Raw operations pass the raw archive contents to the callback:
non-raw forms open each member with ctf_bufopen() and pass the resulting
ctf_file_t to the iterator instead.  This lets you manipulate the raw
data in the archive, or the contents interpreted as a CTF file, as
needed.

It is not yet known whether we will store CTF archives in a linked ELF
object in one of these (akin to debugdata) or whether they'll get one
section per TU plus one parent container for types shared between them.
(In the case of ELF objects with very large numbers of TUs, an archive
of all of them would seem preferable, so we might just use an archive,
and add lzma support so you can assume that .gnu_debugdata and .ctf are
compressed using the same algorithm if both are present.)

To make usage easier, the ctf_archive_t is not the on-disk
representation but an abstraction over both ctf_file_t's and archives of
many ctf_file_t's: users see both CTF archives and raw CTF files as
ctf_archive_t's upon opening, the only difference being that a raw CTF
file has only a single "archive member", named ".ctf" (the default if a
null pointer is passed in as the name).  The next commit will make use
of this facility, in addition to providing the public interface to
actually open archives.  (In the future, it should be possible to have
all CTF sections in an ELF file appear as an "archive" in the same
fashion.)

This machinery is also used to allow library-internal creators of
ctf_archive_t's (such as the next commit) to stash away an ELF string
and symbol table, so that all opens of members in a given archive will
use them.  This lets CTF archives exploit the ELF string and symbol
table just like raw CTF files can.

(All this leads to somewhat confusing type naming.  The ctf_archive_t is
a typedef for the opaque internal type, struct ctf_archive_internal: the
non-internal "struct ctf_archive" is the on-disk structure meant for
other libraries manipulating CTF files.  It is probably clearest to use
the struct name for struct ctf_archive_internal inside the program, and
the typedef names outside.)

libctf/
	* ctf-archive.c: New.
	* ctf-impl.h (ctf_archive_internal): New type.
	(ctf_arc_open_internal): New declaration.
	(ctf_arc_bufopen): Likewise.
	(ctf_arc_close_internal): Likewise.
include/
	* ctf.h (CTFA_MAGIC): New.
	(struct ctf_archive): New.
	(struct ctf_archive_modent): Likewise.
	* ctf-api.h (ctf_archive_member_f): New.
	(ctf_archive_raw_member_f): Likewise.
	(ctf_arc_write): Likewise.
	(ctf_arc_close): Likewise.
	(ctf_arc_open_by_name): Likewise.
	(ctf_archive_iter): Likewise.
	(ctf_archive_raw_iter): Likewise.
	(ctf_get_arc): Likewise.
2019-05-28 17:07:55 +01:00
Nick Alcock a5be9bbe89 libctf: implementation definitions related to file creation
We now enter a series of commits that are sufficiently tangled that
avoiding forward definitions is almost impossible: no attempt is made to
make individual commits compilable (which is why the build system does
not reference any of them yet): the only important thing is that they
should form something like conceptual groups.

But first, some definitions, including the core ctf_file_t itself.  Uses
of these definitions will be introduced in later commits.

libctf/
	* ctf-impl.h: New definitions and declarations for type creation
	and lookup.
2019-05-28 17:07:33 +01:00
Nick Alcock c0754cdd9a libctf: hashing
libctf maintains two distinct hash ADTs, one (ctf_dynhash) for wrapping
dynamically-generated unknown-sized hashes during CTF file construction,
one (ctf_hash) for wrapping unchanging hashes whose size is known at
creation time for reading CTF files that were previously created.

In the binutils implementation, these are both fairly thin wrappers
around libiberty hashtab.

Unusually, this code is not kept synchronized with libdtrace-ctf,
due to its dependence on libiberty hashtab.

libctf/
	* ctf-hash.c: New file.
	* ctf-impl.h: New declarations.
2019-05-28 17:07:29 +01:00
Nick Alcock 94585e7f93 libctf: low-level list manipulation and helper utilities
These utilities are a bit of a ragbag of small things needed by more
than one TU: list manipulation, ELF32->64 translators, routines to look
up strings in string tables, dynamically-allocated string appenders, and
routines to set the specialized errno values previously committed in
<ctf-api.h>.

We do still need to dig around in raw ELF symbol tables in places,
because libctf allows the caller to pass in the contents of string and
symbol sections without telling it where they come from, so we cannot
use BFD to get the symbols (BFD reasonably demands the entire file).  So
extract minimal ELF definitions from glibc into a private header named
libctf/elf.h: later, we use those to get symbols.  (The start-of-
copyright range on elf.h reflects this glibc heritage.)

libctf/
	* ctf-util.c: New file.
	* elf.h: Likewise.
	* ctf-impl.h: Include it, and add declarations.
2019-05-28 17:07:19 +01:00
Nick Alcock 60da9d9559 libctf: lowest-level memory allocation and debug-dumping wrappers
The memory-allocation wrappers are simple things to allow malloc
interposition: they are only used inconsistently at present, usually
where malloc debugging was required in the past.

These provide a default implementation that is environment-variable
triggered (initialized on the first call to the libctf creation and
file-opening functions, the first functions people will use), and
a ctf_setdebug()/ctf_getdebug() pair that allows the caller to
explicitly turn debugging off and on.  If ctf_setdebug() is called,
the automatic setting from an environment variable is skipped.
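
The interaction between the environment-variable default and the
explicit setter might look roughly like this.  This is a sketch under
assumptions, not the code in ctf-subr.c: the dbg_* names, example_open
and the EXAMPLE_DEBUG variable are all hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical variable name, for illustration only.  */
#define DBG_ENV "EXAMPLE_DEBUG"

static int dbg_value;		/* Current debugging state.  */
static int dbg_initialized;	/* Nonzero once the state is decided.  */

/* Decide the state from the environment, at most once, and only if
   the caller has not already decided it explicitly.  */
static void
dbg_init (void)
{
  if (!dbg_initialized)
    {
      dbg_value = getenv (DBG_ENV) != NULL;
      dbg_initialized = 1;
    }
}

/* Explicit setter: also suppresses the automatic setting.  */
static void
dbg_set (int on)
{
  dbg_value = on;
  dbg_initialized = 1;
}

static int
dbg_get (void)
{
  return dbg_value;
}

/* Stand-in for the creation and opening entry points, which trigger
   initialization on first use.  */
static void
example_open (void)
{
  dbg_init ();
}
```

Because the setter marks the state as decided, a later call to an
opening function leaves the caller's choice alone.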

libctf/
	* ctf-impl.h: New file.
	* ctf-subr.c: New file.

include/
	* ctf-api.h (ctf_setdebug): New.
	(ctf_getdebug): Likewise.
2019-05-28 17:07:15 +01:00