Commit Graph

17 Commits

Author SHA1 Message Date
Nick Alcock 49ea9b450b libctf: add CU-mapping machinery
Once the deduplicator is capable of actually detecting conflicting types
with the same name (i.e., not yet) we will place such conflicting types,
and types that depend on them, into CTF dictionaries that are the child
of the main dictionary we usually emit: currently, this will lead to the
.ctf section becoming a CTF archive rather than a single dictionary,
with the default-named archive member (_CTF_SECTION, or NULL) being the
main shared dictionary with most of the types in it.

By default, the sections are named after the compilation unit they come
from (complete path and all), with the cuname field in the CTF header
providing further evidence of the name without requiring the caller to
engage in tiresome parsing.  But some callers may not wish the mapping
from input CU to output sub-dictionary to be purely CU-based.

The machinery here allows this to be freely changed, in two ways:

 - callers can call ctf_link_add_cu_mapping to specify that a single
   input compilation unit should have its types placed in some other CU
   if they conflict: the CU will always be created, even if empty, so
   the consuming program can depend on its existence.  You can map
   multiple input CUs to one output CU to force all their types to be
   merged together: if some of *those* types conflict, the behaviour is
   currently unspecified (the new deduplicator will specify it).

 - callers can call ctf_link_set_memb_name_changer to provide a function
   which is passed every CTF sub-dictionary name in turn (including
   _CTF_SECTION) and can return a new name, or NULL if no change is
   desired.  The mapping from input to output names should not map two
   input names to the same output name: if this happens, the two are not
   merged but will result in an archive with two members with the same
   name (technically valid, but it's hard to access the second
   same-named member: you have to do an iteration over archive members).

This is used by the kernel's ctfarchive machinery (not yet upstream) to
encode CTF under member names like {module name}.ctf rather than
.ctf.CU, but it is anticipated that other large projects may wish to
have their own storage for CTF outside of .ctf sections and may wish to
have new naming schemes that suit their special-purpose consumers.

New in v3.
v4: check for strdup failure.
v5: fix tabdamage.

include/
	* ctf-api.h (ctf_link_add_cu_mapping): New.
	(ctf_link_memb_name_changer_f): New.
	(ctf_link_set_memb_name_changer): New.

libctf/
	* ctf-impl.h (ctf_file_t) <ctf_link_cu_mappping>: New.
	<ctf_link_memb_name_changer>: Likewise.
	<ctf_link_memb_name_changer_arg>: Likewise.
	* ctf-create.c (ctf_update): Update accordingly.
	* ctf-open.c (ctf_file_close): Likewise.
	* ctf-link.c (ctf_create_per_cu): Apply the cu mapping.
	(ctf_link_add_cu_mapping): New.
	(ctf_link_set_memb_name_changer): Likewise.
	(ctf_change_parent_name): New.
	(ctf_name_list_accum_cb_arg_t) <dynames>: New, storage for names
	allocated by the caller's ctf_link_memb_name_changer.
	<ndynames>: Likewise.
	(ctf_accumulate_archive_names): Call the ctf_link_memb_name_changer.
	(ctf_link_write): Likewise (for _CTF_SECTION only): also call
	ctf_change_parent_name.  Free any resulting names.
2019-10-03 17:04:55 +01:00
Nick Alcock 886453cbbc libctf: map from old to corresponding newly-added types in ctf_add_type
This lets you call ctf_type_mapping (dest_fp, src_fp, src_type_id)
and get told what type ID the corresponding type has in the target
ctf_file_t.  This works even if it was added by a recursive call, and
because it is stored in the target ctf_file_t it works even if we
had to add one type to multiple ctf_file_t's as part of conflicting
type handling.

We empty out this mapping after every archive is linked: because it maps
input to output fps, and we only visit each input fp once, its contents
are rendered entirely useless every time the source fp changes.

v3: add several missing mapping additions.  Add ctf_dynhash_empty, and
    empty after every input archive.
v5: fix tabdamage.

libctf/
	* ctf-impl.h (ctf_file_t): New field ctf_link_type_mapping.
	(struct ctf_link_type_mapping_key): New.
	(ctf_hash_type_mapping_key): Likewise.
	(ctf_hash_eq_type_mapping_key): Likewise.
	(ctf_add_type_mapping): Likewise.
	(ctf_type_mapping): Likewise.
	(ctf_dynhash_empty): Likewise.
	* ctf-open.c (ctf_file_close): Update accordingly.
	* ctf-create.c (ctf_update): Likewise.
	(ctf_add_type): Populate the mapping.
	* ctf-hash.c (ctf_hash_type_mapping_key): Hash a type mapping key.
	(ctf_hash_eq_type_mapping_key): Check the key for equality.
	(ctf_dynhash_insert): Fix comment typo.
	(ctf_dynhash_empty): New.
	* ctf-link.c (ctf_add_type_mapping): New.
	(ctf_type_mapping): Likewise.
	(empty_link_type_mapping): New.
	(ctf_link_one_input_archive): Call it.
2019-10-03 17:04:55 +01:00
Nick Alcock 72c83edd92 libctf: add the ctf_link machinery
This is the start of work on the core of the linking mechanism for CTF
sections.  This commit handles the type and string sections.

The linker calls these functions in sequence:

ctf_link_add_ctf: to add each CTF section in the input in turn to a
  newly-created ctf_file_t (which will appear in the output, and which
  itself will become the shared parent that contains types that all
  TUs have in common (in all link modes) and all types that do not
  have conflicting definitions between types (by default).  Input files
  that are themselves products of ld -r are supported, though this is
  not heavily tested yet.

ctf_link: called once all input files are added to merge the types in
  all the input containers into the output container, eliminating
  duplicates.

ctf_link_add_strtab: called once the ELF string table is finalized and
  all its offsets are known, this calls a callback provided by the
  linker which returns the string content and offset of every string in
  the ELF strtab in turn: all these strings which appear in the input
  CTF strtab are eliminated from it in favour of the ELF strtab:
  equally, any strings that only appear in the input strtab will
  reappear in the internal CTF strtab of the output.

ctf_link_shuffle_syms (not yet implemented): called once the ELF symtab
  is finalized, this calls a callback provided by the linker which
  returns information on every symbol in turn as a ctf_link_sym_t.  This
  is then used to shuffle the function info and data object sections in
  the CTF section into symbol table order, eliminating the index
  sections which map those sections to symbol names before that point.
  Currently just returns ECTF_NOTYET.

ctf_link_write: Returns a buffer containing either a serialized
  ctf_file_t (if there are no types with conflicting definitions in the
  object files in the link) or a ctf_archive_t containing a large
  ctf_file_t (the common types) and a bunch of small ones named after
  individual CUs in which conflicting types are found (containing the
  conflicting types, and all types that reference them).  A threshold
  size above which compression takes place is passed as one parameter.
  (Currently, only gzip compression is supported, but I hope to add lzma
  as well.)

Lifetime rules for this are simple: don't close the input CTF files
until you've called ctf_link for the last time.  We do not assume
that symbols or strings passed in by the callback outlast the
call to ctf_link_add_strtab or ctf_link_shuffle_syms.

Right now, the duplicate elimination mechanism is the one already
present as part of the ctf_add_type function, and is not particularly
good: it misses numerous actual duplicates, and the conflicting-types
detection hardly ever reports that types conflict, even when they do
(one of them just tends to get silently dropped): it is also very slow.
This will all be fixed in the next few weeks, but the fix hardly touches
any of this code, and the linker does work without it, just not as
well as it otherwise might.  (And when no CTF section is present,
there is no effect on performance, of course.  So only people using
a trunk GCC with not-yet-committed patches will even notice.  By the
time it gets upstream, things should be better.)

v3: Fix error handling.
v4: check for strdup failure.
v5: fix tabdamage.

include/
	* ctf-api.h (struct ctf_link_sym): New, a symbol in flight to the
	libctf linking machinery.
	(CTF_LINK_SHARE_UNCONFLICTED): New.
	(CTF_LINK_SHARE_DUPLICATED): New.
	(ECTF_LINKADDEDLATE): New, replacing ECTF_UNUSED.
	(ECTF_NOTYET): New, a 'not yet implemented' message.
	(ctf_link_add_ctf): New, add an input file's CTF to the link.
	(ctf_link): New, merge the type and string sections.
	(ctf_link_strtab_string_f): New, callback for feeding strtab info.
	(ctf_link_iter_symbol_f): New, callback for feeding symtab info.
	(ctf_link_add_strtab): New, tell the CTF linker about the ELF
	strtab's strings.
	(ctf_link_shuffle_syms): New, ask the CTF linker to shuffle its
	symbols into symtab order.
	(ctf_link_write): New, ask the CTF linker to write the CTF out.

libctf/
	* ctf-link.c: New file, linking of the string and type sections.
	* Makefile.am (libctf_a_SOURCES): Add it.
	* Makefile.in: Regenerate.

	* ctf-impl.h (ctf_file_t): New fields ctf_link_inputs,
	ctf_link_outputs.
	* ctf-create.c (ctf_update): Update accordingly.
	* ctf-open.c (ctf_file_close): Likewise.
	* ctf-error.c (_ctf_errlist): Updated with new errors.
2019-10-03 17:04:55 +01:00
Nick Alcock 3dde2c915e libctf: fix double-free on ctf_compress_write error path
We were freeing the compressed data buffer twice if compression failed.

v4: Fix commit message.
v5: fix tabdamage.

libctf/
	* ctf-create.c (ctf_compress_write): Fix double-free.
2019-10-03 17:04:55 +01:00
Nick Alcock 5537f9b9a3 libctf: write CTF files to memory, and CTF archives to fds
Before now, we've been able to write CTF files to gzFile descriptors or
fds, and CTF archives to named files only.

Make this a bit less irregular by allowing CTF archives to be written
to fds with the new function ctf_arc_write_fd: also allow CTF
files to be written to a new memory buffer via ctf_write_mem.

(It would be nice to complete things by adding a new function to write
CTF archives to memory, but this is too difficult to do given the short
time the linker is expected to be writing them out: we will transition
to a better format in format v4, though we will always support reading
CTF archives that are stored in .ctf sections.)

include/
	* ctf-api.h (ctf_arc_write_fd): New.
	(ctf_write_mem): Likewise.
	(ctf_gzwrite): Spacing fix.

libctf/
	* ctf-archive.c (ctf_arc_write): Split off, and reimplement in terms
	of...
	(ctf_arc_write_fd): ... this new function.
	* ctf-create.c (ctf_write_mem): New.
2019-10-03 17:04:55 +01:00
Nick Alcock d851ecd373 libctf: support getting strings from the ELF strtab
The CTF file format has always supported "external strtabs", which
internally are strtab offsets with their MSB on: such refs
get their strings from the strtab passed in at CTF file open time:
this is usually intended to be the ELF strtab, and that's what this
implementation is meant to support, though in theory the external
strtab could come from anywhere.

This commit adds support for these external strings in the ctf-string.c
strtab tracking layer.  It's quite easy: we just add a field csa_offset
to the atoms table that tracks all strings: this field tracks the offset
of the string in the ELF strtab (with its MSB already on, courtesy of a
new macro CTF_SET_STID), and adds a new function that sets the
csa_offset to the specified offset (plus MSB).  Then we just need to
avoid writing out strings to the internal strtab if they have csa_offset
set, and note that the internal strtab is shorter than it might
otherwise be.

(We could in theory save a little more time here by eschewing sorting
such strings, since we never actually write the strings out anywhere,
but that would mean storing them separately and it's just not worth the
complexity cost until profiling shows it's worth doing.)

We also have to go through a bit of extra effort at variable-sorting
time.  This was previously using direct references to the internal
strtab: it couldn't use ctf_strptr or ctf_strraw because the new strtab
is not yet ready to put in its usual field (in a ctf_file_t that hasn't
even been allocated yet at this stage): but now we're using the external
strtab, this will no longer do because it'll be looking things up in the
wrong strtab, with disastrous results.  Instead, pass the new internal
strtab in to a new ctf_strraw_explicit function which is just like
ctf_strraw except you can specify a ne winternal strtab to use.

But even now that it is using a new internal strtab, this is not quite
enough: it can't look up strings in the external strtab because ld
hasn't written it out yet, and when it does will write it straight to
disk.  Instead, when we write the internal strtab, note all the offset
-> string mappings that we have noted belong in the *external* strtab to
a new "synthetic external strtab" dynhash, ctf_syn_ext_strtab, and look
in there at ctf_strraw time if it is set.  This uses minimal extra
memory (because only strings in the external strtab that we actually use
are stored, and even those come straight out of the atoms table), but
let both variable sorting and name interning when ctf_bufopen is next
called work fine.  (This also means that we don't need to filter out
spurious ECTF_STRTAB warnings from ctf_bufopen but can pass them back to
the caller, once we wrap ctf_bufopen so that we have a new internal
variant of ctf_bufopen etc that we can pass the synthetic external
strtab to. That error has been filtered out since the days of Solaris
libctf, which didn't try to handle the problem of getting external
strtabs right at construction time at all.)

v3: add the synthetic strtab and all associated machinery.
v5: fix tabdamage.

include/
	* ctf.h (CTF_SET_STID): New.

libctf/
	* ctf-impl.h (ctf_str_atom_t) <csa_offset>: New field.
	(ctf_file_t) <ctf_syn_ext_strtab>: Likewise.
	(ctf_str_add_ref): Name the last arg.
	(ctf_str_add_external) New.
	(ctf_str_add_strraw_explicit): Likewise.
	(ctf_simple_open_internal): Likewise.
	(ctf_bufopen_internal): Likewise.

	* ctf-string.c (ctf_strraw_explicit): Split from...
	(ctf_strraw): ... here, with new support for ctf_syn_ext_strtab.
	(ctf_str_add_ref_internal): Return the atom, not the
	string.
	(ctf_str_add): Adjust accordingly.
	(ctf_str_add_ref): Likewise.  Move up in the file.
	(ctf_str_add_external): New: update the csa_offset.
	(ctf_str_count_strtab): Only account for strings with no csa_offset
	in the internal strtab length.
	(ctf_str_write_strtab): If the csa_offset is set, update the
	string's refs without writing the string out, and update the
	ctf_syn_ext_strtab.  Make OOM handling less ugly.
	* ctf-create.c (struct ctf_sort_var_arg_cb): New.
	(ctf_update): Handle failure to populate the strtab.  Pass in the
	new ctf_sort_var arg.  Adjust for ctf_syn_ext_strtab addition.
	Call ctf_simple_open_internal, not ctf_simple_open.
	(ctf_sort_var): Call ctf_strraw_explicit rather than looking up
	strings by hand.
	* ctf-hash.c (ctf_hash_insert_type): Likewise (but using
	ctf_strraw).  Adjust to diagnose ECTF_STRTAB nonetheless.
	* ctf-open.c (init_types): No longer filter out ECTF_STRTAB.
	(ctf_file_close): Destroy the ctf_syn_ext_strtab.
	(ctf_simple_open): Rename to, and reimplement as a wrapper around...
	(ctf_simple_open_internal): ... this new function, which calls
	ctf_bufopen_internal.
	(ctf_bufopen): Rename to, and reimplement as a wrapper around...
	(ctf_bufopen_internal): ... this new function, which sets
	ctf_syn_ext_strtab.
2019-10-03 17:04:55 +01:00
Nick Alcock fd55eae84d libctf: allow the header to change between versions
libctf supports dynamic upgrading of the type table as file format
versions change, but before now has not supported changes to the CTF
header.  Doing this is complicated by the baroque storage method used:
the CTF header is kept prepended to the rest of the CTF data, just as
when read from the file, and written out from there, and is
endian-flipped in place.

This makes accessing it needlessly hard and makes it almost impossible
to make the header larger if we add fields.  The general storage
machinery around the malloced ctf pointer (the 'ctf_base') is also
overcomplicated: the pointer is sometimes malloced locally and sometimes
assigned from a parameter, so freeing it requires checking to see if
that parameter was used, needlessly coupling ctf_bufopen and
ctf_file_close together.

So split the header out into a new ctf_file_t.ctf_header, which is
written out explicitly: squeeze it out of the CTF buffer whenever we
reallocate it, and use ctf_file_t.ctf_buf to skip past the header when
we do not need to reallocate (when no upgrading or endian-flipping is
required).  We now track whether the CTF base can be freed explicitly
via a new ctf_dynbase pointer which is non-NULL only when freeing is
possible.

With all this done, we can upgrade the header on the fly and add new
fields as desired, via a new upgrade_header function in ctf-open.
As with other forms of upgrading, libctf upgrades older headers
automatically to the latest supported version at open time.

For a first use of this field, we add a new string field cth_cuname, and
a corresponding setter/getter pair ctf_cuname_set and ctf_cuname: this
is used by debuggers to determine whether a CTF section's types relate
to a single compilation unit, or to all compilation units in the
program.  (Types with ambiguous definitions in different CUs have only
one of these types placed in the top-level shared .ctf container: the
rest are placed in much smaller per-CU containers, which have the shared
container as their parent.  Since CTF must be useful in the absence of
DWARF, we store the names of the relevant CUs ourselves, so the debugger
can look them up.)

v5: fix tabdamage.

include/
	* ctf-api.h (ctf_cuname): New function.
	(ctf_cuname_set): Likewise.
	* ctf.h: Improve comment around upgrading, no longer
	implying that v2 is the target of upgrades (it is v3 now).
	(ctf_header_v2_t): New, old-format header for backward
	compatibility.
	(ctf_header_t): Add cth_cuname: this is the first of several
	header changes in format v3.
libctf/
	* ctf-impl.h (ctf_file_t): New fields ctf_header, ctf_dynbase,
	ctf_cuname, ctf_dyncuname: ctf_base and ctf_buf are no longer const.
	* ctf-open.c (ctf_set_base): Preserve the gap between ctf_buf and
	ctf_base: do not assume that it is always sizeof (ctf_header_t).
	Print out ctf_cuname: only print out ctf_parname if set.
	(ctf_free_base): Removed, ctf_base is no longer freed: free
	ctf_dynbase instead.
	(ctf_set_version): Fix spacing.
	(upgrade_header): New, in-place header upgrading.
	(upgrade_types): Rename to...
	(upgrade_types_v1): ... this.  Free ctf_dynbase, not ctf_base.  No
	longer track old and new headers separately.  No longer allow for
	header sizes explicitly: squeeze the headers out on upgrade (they
	are preserved in fp->ctf_header).  Set ctf_dynbase, ctf_base and
	ctf_buf explicitly.  Use ctf_free, not ctf_free_base.
	(upgrade_types): New, also handle ctf_parmax updating.
	(flip_header): Flip ctf_cuname.
	(flip_types): Flip BUF explicitly rather than deriving BUF from
	BASE.
	(ctf_bufopen): Store the header in fp->ctf_header.  Correct minimum
	required alignment of objtoff and funcoff.  No longer store it in
	the ctf_buf unless that buf is derived unmodified from the input.
	Set ctf_dynbase where ctf_base is dynamically allocated. Drop locals
	that duplicate fields in ctf_file: move allocation of ctf_file
	further up instead.  Call upgrade_header as needed.  Move
	version-specific ctf_parmax initialization into upgrade_types.  More
	concise error handling.
	(ctf_file_close): No longer test for null pointers before freeing.
	Free ctf_dyncuname, ctf_dynbase, and ctf_header.  Do not call
	ctf_free_base.
	(ctf_cuname): New.
	(ctf_cuname_set): New.
	* ctf-create.c (ctf_update): Populate ctf_cuname.
	(ctf_gzwrite): Write out the header explicitly.  Remove obsolescent
	comment.
	(ctf_write): Likewise.
	(ctf_compress_write): Get the header from ctf_header, not ctf_base.
	Fix the compression length: fp->ctf_size never counted the CTF
	header.  Simplify the compress call accordingly.
2019-10-03 17:04:55 +01:00
Nick Alcock f57cf0e3e3 libctf: fix spurious error when rolling back to the first snapshot
The first ctf_snapshot called after CTF file creation yields a snapshot
handle that always yields a spurious ECTF_OVERROLLBACK error ("Attempt
to roll back past a ctf_update") on ctf_rollback(), even if ctf_update
has never been called.

The fix is to start with a ctf_snapshot value higher than the zero value
that ctf_snapshot_lu ("last update CTF snapshot value") is initialized
to.

libctf/
	* ctf-create.c (ctf_create): Fix off-by-one error.
2019-07-01 11:05:59 +01:00
Nick Alcock f5e9c9bde0 libctf: deduplicate and sort the string table
ctf.h states:

> [...] the CTF string table does not contain any duplicated strings.

Unfortunately this is entirely untrue: libctf has before now made no
attempt whatsoever to deduplicate the string table. It computes the
string table's length on the fly as it adds new strings to the dynamic
CTF file, and ctf_update() just writes each string to the table and
notes the current write position as it traverses the dynamic CTF file's
data structures and builds the final CTF buffer.  There is no global
view of the strings and no deduplication.

Fix this by erasing the ctf_dtvstrlen dead-reckoning length, and adding
a new dynhash table ctf_str_atoms that maps unique strings to a list
of references to those strings: a reference is a simple uint32_t * to
some value somewhere in the under-construction CTF buffer that needs
updating to note the string offset when the strtab is laid out.

Adding a string is now a simple matter of calling ctf_str_add_ref(),
which adds a new atom to the atoms table, if one doesn't already exist,
and adding the location of the reference to this atom to the refs list
attached to the atom: this works reliably as long as one takes care to
only call ctf_str_add_ref() once the final location of the offset is
known (so you can't call it on a temporary structure and then memcpy()
that structure into place in the CTF buffer, because the ref will still
point to the old location: ctf_update() changes accordingly).

Generating the CTF string table is a matter of calling
ctf_str_write_strtab(), which counts the length and number of elements
in the atoms table using the ctf_dynhash_iter() function we just added,
populating an array of pointers into the atoms table and sorting it into
order (to help compressors), then traversing this table and emitting it,
updating the refs to each atom as we go.  The only complexity here is
arranging to keep the null string at offset zero, since a lot of code in
libctf depends on being able to leave strtab references at 0 to indicate
'no name'.  Once the table is constructed and the refs updated, we know
how long it is, so we can realloc() the partial CTF buffer we allocated
earlier and can copy the table on to the end of it (and purge the refs
because they're not needed any more and have been invalidated by the
realloc() call in any case).

The net effect of all this is a reduction in uncompressed strtab sizes
of about 30% (perhaps a quarter to a half of all strings across the
Linux kernel are eliminated as duplicates). Of course, duplicated
strings are highly redundant, so the space saving after compression is
only about 20%: when the other non-strtab sections are factored in, CTF
sizes shrink by about 10%.

No change in externally-visible API or file format (other than the
reduction in pointless redundancy).

libctf/
	* ctf-impl.h: (struct ctf_strs_writable): New, non-const version of
	struct ctf_strs.
	(struct ctf_dtdef): Note that dtd_data.ctt_name is unpopulated.
	(struct ctf_str_atom): New, disambiguated single string.
	(struct ctf_str_atom_ref): New, points to some other location that
	references this string's offset.
	(struct ctf_file): New members ctf_str_atoms and ctf_str_num_refs.
	Remove member ctf_dtvstrlen: we no longer track the total strlen
	as we add strings.
	(ctf_str_create_atoms): Declare new function in ctf-string.c.
	(ctf_str_free_atoms): Likewise.
	(ctf_str_add): Likewise.
	(ctf_str_add_ref): Likewise.
	(ctf_str_purge_refs): Likewise.
	(ctf_str_write_strtab): Likewise.
	(ctf_realloc): Declare new function in ctf-util.c.

	* ctf-open.c (ctf_bufopen): Create the atoms table.
	(ctf_file_close): Destroy it.
	* ctf-create.c (ctf_update): Copy-and-free it on update.  No longer
	special-case the position of the parname string.  Construct the
	strtab by calling ctf_str_add_ref and ctf_str_write_strtab after the
	rest of each buffer element is constructed, not via open-coding:
	realloc the CTF buffer and append the strtab to it.  No longer
	maintain ctf_dtvstrlen.  Sort the variable entry table later, after
	strtab construction.
	(ctf_copy_membnames): Remove: integrated into ctf_copy_{s,l,e}members.
	(ctf_copy_smembers): Drop the string offset: call ctf_str_add_ref
	after buffer element construction instead.
	(ctf_copy_lmembers): Likewise.
	(ctf_copy_emembers): Likewise.
	(ctf_create): No longer maintain the ctf_dtvstrlen.
	(ctf_dtd_delete): Likewise.
	(ctf_dvd_delete): Likewise.
	(ctf_add_generic): Likewise.
	(ctf_add_enumerator): Likewise.
	(ctf_add_member_offset): Likewise.
	(ctf_add_variable): Likewise.
	(membadd): Likewise.
	* ctf-util.c (ctf_realloc): New, wrapper around realloc that aborts
	if there are active ctf_str_num_refs.
	(ctf_strraw): Move to ctf-string.c.
	(ctf_strptr): Likewise.
	* ctf-string.c: New file, strtab manipulation.

	* Makefile.am (libctf_a_SOURCES): Add it.
	* Makefile.in: Regenerate.
2019-07-01 11:05:59 +01:00
Nick Alcock 65365aa856 libctf: drop mmap()-based CTF data allocator
This allocator has the ostensible benefit that it lets us mprotect() the
memory used for CTF storage: but in exchange for this it adds
considerable complexity, since we have to track allocation sizes
ourselves for use at freeing time, note whether the data we are storing
was ctf_data_alloc()ed or not so we know if we can safely mprotect()
it... and while the mprotect()ing has found few bugs, it *has* been the
cause of more than one due to errors in all this tracking leading to us
mprotect()ing bits of the heap and stuff like that.

We are about to start composing CTF buffers from pieces so that we can
do usage-based optimizations on the strtab.  This means we need
realloc(), which needs nonportable mremap() and *more* tracking of the
*original* allocation size, and the complexity and bureaucracy of all of
this is just too high for its negligible benefits.

Drop the whole thing and just use malloc() like everyone else.  It knows
better than we do when it is safe to use mmap() under the covers,
anyway.

While we're at it, don't leak the entire buffer if ctf_compress_write()
fails to compress it.

libctf/
	* ctf-subr.c (_PAGESIZE): Remove.
	(ctf_data_alloc): Likewise.
	(ctf_data_free): Likewise.
	(ctf_data_protect): Likewise.
	* ctf-impl.h: Remove declarations.
	* ctf-create.c (ctf_update): No longer call ctf_data_protect: use
	ctf_free, not ctf_data_free.
	(ctf_compress_write): Use ctf_data_alloc, not ctf_alloc.  Free
	the buffer again on compression error.
	* ctf-open.c (ctf_set_base): No longer track the size: call
	ctf_free, not ctf_data_free.
	(upgrade_types): Likewise.  Call ctf_alloc, not ctf_data_alloc.
	(ctf_bufopen): Likewise.  No longer call ctf_data_protect.
2019-06-21 13:04:02 +01:00
Nick Alcock 2486542803 libctf: handle errors on dynhash insertion better
We were missing several cases where dynhash insertion might fail, likely
due to OOM but possibly for other reasons.  Pass the errors on.

libctf/
	* ctf-create.c (ctf_dtd_insert): Pass on error returns from
	ctf_dynhash_insert.
	(ctf_dvd_insert): Likewise.
	(ctf_add_generic): Likewise.
	(ctf_add_variable): Likewise.
	* ctf-impl.h: Adjust declarations.
2019-06-21 13:04:01 +01:00
Nick Alcock 62d8e3b731 libctf: eschew %zi format specifier
Too many platforms don't support it, and we can always safely use %lu or
%li anyway, because the only uses are in debugging output.

libctf/
	* ctf-archive.c (ctf_arc_write): Eschew %zi format specifier.
	(ctf_arc_open_by_offset): Likewise.
	* ctf-create.c (ctf_add_type): Likewise.
2019-06-05 13:34:36 +01:00
Tom Tromey 76fad99963 Use CHAR_BIT instead of NBBY in libctf
On x86-64 Fedora 29, I tried to build a mingw-hosted gdb that targets
ppc-linux.  You can do this with:

    ../binutils-gdb/configure --host=i686-w64-mingw32 --target=ppc-linux \
        --disable-{binutils,gas,gold,gprof,ld}

The build failed with these errors in libctf:

In file included from ../../binutils-gdb/libctf/ctf-create.c:20:
../../binutils-gdb/libctf/ctf-create.c: In function 'ctf_add_encoded':
../../binutils-gdb/libctf/ctf-create.c:803:59: error: 'NBBY' undeclared (first use in this function)
   dtd->dtd_data.ctt_size = clp2 (P2ROUNDUP (ep->cte_bits, NBBY) / NBBY);
                                                           ^~~~
../../binutils-gdb/libctf/ctf-impl.h:254:42: note: in definition of macro 'P2ROUNDUP'
 #define P2ROUNDUP(x, align)  (-(-(x) & -(align)))
                                          ^~~~~
../../binutils-gdb/libctf/ctf-create.c:803:59: note: each undeclared identifier is reported only once for each function it appears in
   dtd->dtd_data.ctt_size = clp2 (P2ROUNDUP (ep->cte_bits, NBBY) / NBBY);
                                                           ^~~~
../../binutils-gdb/libctf/ctf-impl.h:254:42: note: in definition of macro 'P2ROUNDUP'
 #define P2ROUNDUP(x, align)  (-(-(x) & -(align)))
                                          ^~~~~
../../binutils-gdb/libctf/ctf-create.c: In function 'ctf_add_slice':
../../binutils-gdb/libctf/ctf-create.c:862:59: error: 'NBBY' undeclared (first use in this function)
   dtd->dtd_data.ctt_size = clp2 (P2ROUNDUP (ep->cte_bits, NBBY) / NBBY);
                                                           ^~~~
../../binutils-gdb/libctf/ctf-impl.h:254:42: note: in definition of macro 'P2ROUNDUP'
 #define P2ROUNDUP(x, align)  (-(-(x) & -(align)))
                                          ^~~~~
../../binutils-gdb/libctf/ctf-create.c: In function 'ctf_add_member_offset':
../../binutils-gdb/libctf/ctf-create.c:1341:21: error: 'NBBY' undeclared (first use in this function)
      off += lsize * NBBY;
                     ^~~~
../../binutils-gdb/libctf/ctf-create.c: In function 'ctf_add_type':
../../binutils-gdb/libctf/ctf-create.c:1822:16: warning: unknown conversion type character 'z' in format [-Wformat=]
   ctf_dprintf ("Conflict for type %s against ID %lx: "
                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../../binutils-gdb/libctf/ctf-create.c:1823:35: note: format string is defined here
         "union size differs, old %zi, new %zi\n",
                                   ^
../../binutils-gdb/libctf/ctf-create.c:1822:16: warning: unknown conversion type character 'z' in format [-Wformat=]
   ctf_dprintf ("Conflict for type %s against ID %lx: "
                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../../binutils-gdb/libctf/ctf-create.c:1823:44: note: format string is defined here
         "union size differs, old %zi, new %zi\n",
                                            ^
../../binutils-gdb/libctf/ctf-create.c:1822:16: warning: too many arguments for format [-Wformat-extra-args]
   ctf_dprintf ("Conflict for type %s against ID %lx: "
                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This patch fixes the actual errors in here.  I did not try to fix the
printf warnings, though I think someone ought to.

Ok?

libctf/ChangeLog
2019-06-04  Tom Tromey  <tromey@adacore.com>

	* ctf-create.c (ctf_add_encoded, ctf_add_slice)
	(ctf_add_member_offset): Use CHAR_BIT, not NBBY.
2019-06-04 14:11:29 -06:00
Nick Alcock 6b22174ff1 libctf: look for BSD versus GNU qsort_r signatures
We cannot just look for any declaration of qsort_r, because some
operating systems have a qsort_r that has a different prototype
but which still has a pair of pointers in the right places (the last two
args are interchanged): so use AC_LINK_IFELSE to check for both
known variants of qsort_r(), and swap their args into a consistent order
in a suitable inline function.  (The code for this is taken almost
unchanged from gnulib.)

(Now we are not using AC_LIBOBJ any more, we can use a better name for
the qsort_r replacement as well.)

libctf/
	* qsort_r.c: Rename to...
	* ctf-qsort_r.c: ... this.
	(_quicksort): Define to ctf_qsort_r.
	* ctf-decls.h (qsort_r): Remove.
	(ctf_qsort_r): Add.
	(struct ctf_qsort_arg): New, transport the real ARG and COMPAR.
	(ctf_qsort_compar_thunk): Rearrange the arguments to COMPAR.
	* Makefile.am (libctf_a_LIBADD): Remove.
	(libctf_a_SOURCES): New, add ctf-qsort_r.c.
	* ctf-archive.c (ctf_arc_write): Call ctf_qsort_r, not qsort_r.
	* ctf-create.c (ctf_update): Likewise.
	* configure.ac: Check for BSD versus GNU qsort_r signature.
	* Makefile.in: Regenerate.
	* config.h.in: Likewise.
	* configure: Likewise.
2019-06-04 17:05:08 +01:00
Jose E. Marchesi a0486bac41 libctf: fix a number of build problems found on Solaris and NetBSD
- Use of nonportable <endian.h>
- Use of qsort_r
- Use of zlib without appropriate magic to pull in the binutils zlib
- Use of off64_t without checking (fixed by dropping the unused fields
  that need off64_t entirely)
- signedness problems due to long being too short a type on 32-bit
  platforms: ctf_id_t is now 'unsigned long', and CTF_ERR must be
  used only for functions that return ctf_id_t
- One lingering use of bzero() and of <sys/errno.h>

All fixed, using code from gnulib where possible.

Relatedly, set cts_size in a couple of places it was missed
(string table and symbol table loading upon ctf_bfdopen()).

binutils/
	* objdump.c (make_ctfsect): Drop cts_type, cts_flags, and
	cts_offset.
	* readelf.c (shdr_to_ctf_sect): Likewise.
include/
	* ctf-api.h (ctf_sect_t): Drop cts_type, cts_flags, and cts_offset.
	(ctf_id_t): This is now an unsigned type.
	(CTF_ERR): Cast it to ctf_id_t.  Note that it should only be used
	for ctf_id_t-returning functions.
libctf/
	* Makefile.am (ZLIB): New.
	(ZLIBINC): Likewise.
	(AM_CFLAGS): Use them.
	(libctf_a_LIBADD): New, for LIBOBJS.
	* configure.ac: Check for zlib, endian.h, and qsort_r.
	* ctf-endian.h: New, providing htole64 and le64toh.
	* swap.h: Code style fixes.
	(bswap_identity_64): New.
	* qsort_r.c: New, from gnulib (with one added #include).
	* ctf-decls.h: New, providing a conditional qsort_r declaration,
	and unconditional definitions of MIN and MAX.
	* ctf-impl.h: Use it.  Do not use <sys/errno.h>.
	(ctf_set_errno): Now returns unsigned long.
	* ctf-util.c (ctf_set_errno): Adjust here too.
	* ctf-archive.c: Use ctf-endian.h.
	(ctf_arc_open_by_offset): Use memset, not bzero.  Drop cts_type,
	cts_flags and cts_offset.
	(ctf_arc_write): Drop debugging dependent on the size of off_t.
	* ctf-create.c: Provide a definition of roundup if not defined.
	(ctf_create): Drop cts_type, cts_flags and cts_offset.
	(ctf_add_reftype): Do not check if type IDs are below zero.
	(ctf_add_slice): Likewise.
	(ctf_add_typedef): Likewise.
	(ctf_add_member_offset): Cast error-returning ssize_t's to size_t
	when known error-free.  Drop CTF_ERR usage for functions returning
	int.
	(ctf_add_member_encoded): Drop CTF_ERR usage for functions returning
	int.
	(ctf_add_variable): Likewise.
	(enumcmp): Likewise.
	(enumadd): Likewise.
	(membcmp): Likewise.
	(ctf_add_type): Likewise.  Cast error-returning ssize_t's to size_t
	when known error-free.
	* ctf-dump.c (ctf_is_slice): Drop CTF_ERR usage for functions
	returning int: use CTF_ERR for functions returning ctf_type_id.
	(ctf_dump_label): Likewise.
	(ctf_dump_objts): Likewise.
	* ctf-labels.c (ctf_label_topmost): Likewise.
	(ctf_label_iter): Likewise.
	(ctf_label_info): Likewise.
	* ctf-lookup.c (ctf_func_args): Likewise.
	* ctf-open.c (upgrade_types): Cast to size_t where appropriate.
	(ctf_bufopen): Likewise.  Use zlib types as needed.
	* ctf-types.c (ctf_member_iter): Drop CTF_ERR usage for functions
	returning int.
	(ctf_enum_iter): Likewise.
	(ctf_type_size): Likewise.
	(ctf_type_align): Likewise.  Cast to size_t where appropriate.
	(ctf_type_kind_unsliced): Likewise.
	(ctf_type_kind): Likewise.
	(ctf_type_encoding): Likewise.
	(ctf_member_info): Likewise.
	(ctf_array_info): Likewise.
	(ctf_enum_value): Likewise.
	(ctf_type_rvisit): Likewise.
	* ctf-open-bfd.c (ctf_bfdopen): Drop cts_type, cts_flags and
	cts_offset.
	(ctf_simple_open): Likewise.
	(ctf_bfdopen_ctfsect): Likewise.  Set cts_size properly.
	* Makefile.in: Regenerate.
	* aclocal.m4: Likewise.
	* config.h: Likewise.
	* configure: Likewise.
2019-05-31 11:10:51 +02:00
Nick Alcock c499eb6896 libctf: type copying
ctf_add_type() allows you to copy types, and all the types they depend
on, from one container to another (writable) container. This lets a
program maintaining multiple distinct containers (not in a parent-child
relationship) introduce types that depend on types in one container in
another writable one, by copying the necessary types.

libctf/
	* ctf-create.c (enumcmp): New.
	(enumadd): Likewise.
	(membcmp): Likewise.
	(membadd): Likewise.
	(ctf_add_type): Likewise.
2019-05-28 17:08:26 +01:00
Nick Alcock 47d546f427 libctf: creation functions
The CTF creation process looks roughly like (error handling elided):

int err;
ctf_file_t *foo = ctf_create (&err);

ctf_id_t type = ctf_add_THING (foo, ...);
ctf_update (foo);
ctf_*write (...);

Some ctf_add_THING functions accept other type IDs as arguments,
depending on the type: cv-quals, pointers, and structure and union
members all take other types as arguments.  So do 'slices', which
let you take an existing integral type and recast it as a type
with a different bitness or offset within a byte, for bitfields.
One class of THING is not a type: "variables", which are mappings
of names (in the internal string table) to types.  These are mostly
useful when encoding variables that do not appear in a symbol table
but which some external user has some other way to figure out the
address of at runtime (dynamic symbol lookup or querying a VM
interpreter or something).

You can snapshot the creation process at any point: rolling back to a
snapshot deletes all types and variables added since that point.

You can make arbitrary type queries on the CTF container during the
creation process, but you must call ctf_update() first, which
translates the growing dynamic container into a static one (this uses
the CTF opening machinery, added in a later commit), which is quite
expensive.  This function must also be called after adding types
and before writing the container out.

Because addition of types involves looking up existing types, we add a
little of the type lookup machinery here, as well: only enough to
look up types in dynamic containers under construction.

libctf/
	* ctf-create.c: New file.
	* ctf-lookup.c: New file.

include/
	* ctf-api.h (zlib.h): New include.
	(ctf_sect_t): New.
	(ctf_sect_names_t): Likewise.
	(ctf_encoding_t): Likewise.
	(ctf_membinfo_t): Likewise.
	(ctf_arinfo_t): Likewise.
	(ctf_funcinfo_t): Likewise.
	(ctf_lblinfo_t): Likewise.
	(ctf_snapshot_id_t): Likewise.
	(CTF_FUNC_VARARG): Likewise.
	(ctf_simple_open): Likewise.
	(ctf_bufopen): Likewise.
	(ctf_create): Likewise.
	(ctf_add_array): Likewise.
	(ctf_add_const): Likewise.
	(ctf_add_enum_encoded): Likewise.
	(ctf_add_enum): Likewise.
	(ctf_add_float): Likewise.
	(ctf_add_forward): Likewise.
	(ctf_add_function): Likewise.
	(ctf_add_integer): Likewise.
	(ctf_add_slice): Likewise.
	(ctf_add_pointer): Likewise.
	(ctf_add_type): Likewise.
	(ctf_add_typedef): Likewise.
	(ctf_add_restrict): Likewise.
	(ctf_add_struct): Likewise.
	(ctf_add_union): Likewise.
	(ctf_add_struct_sized): Likewise.
	(ctf_add_union_sized): Likewise.
	(ctf_add_volatile): Likewise.
	(ctf_add_enumerator): Likewise.
	(ctf_add_member): Likewise.
	(ctf_add_member_offset): Likewise.
	(ctf_add_member_encoded): Likewise.
	(ctf_add_variable): Likewise.
	(ctf_set_array): Likewise.
	(ctf_update): Likewise.
	(ctf_snapshot): Likewise.
	(ctf_rollback): Likewise.
	(ctf_discard): Likewise.
	(ctf_write): Likewise.
	(ctf_gzwrite): Likewise.
	(ctf_compress_write): Likewise.
2019-05-28 17:07:40 +01:00