Binutils with MCST patches
Go to file
Nick Alcock f5e9c9bde0 libctf: deduplicate and sort the string table
ctf.h states:

> [...] the CTF string table does not contain any duplicated strings.

Unfortunately this is entirely untrue: libctf has before now made no
attempt whatsoever to deduplicate the string table. It computes the
string table's length on the fly as it adds new strings to the dynamic
CTF file, and ctf_update() just writes each string to the table and
notes the current write position as it traverses the dynamic CTF file's
data structures and builds the final CTF buffer.  There is no global
view of the strings and no deduplication.

Fix this by erasing the ctf_dtvstrlen dead-reckoning length, and adding
a new dynhash table ctf_str_atoms that maps unique strings to a list
of references to those strings: a reference is a simple uint32_t * to
some value somewhere in the under-construction CTF buffer that needs
updating to note the string offset when the strtab is laid out.

Adding a string is now a simple matter of calling ctf_str_add_ref(),
which adds a new atom to the atoms table, if one doesn't already exist,
and adding the location of the reference to this atom to the refs list
attached to the atom: this works reliably as long as one takes care to
only call ctf_str_add_ref() once the final location of the offset is
known (so you can't call it on a temporary structure and then memcpy()
that structure into place in the CTF buffer, because the ref will still
point to the old location: ctf_update() changes accordingly).

Generating the CTF string table is a matter of calling
ctf_str_write_strtab(), which counts the length and number of elements
in the atoms table using the ctf_dynhash_iter() function we just added,
populating an array of pointers into the atoms table and sorting it into
order (to help compressors), then traversing this table and emitting it,
updating the refs to each atom as we go.  The only complexity here is
arranging to keep the null string at offset zero, since a lot of code in
libctf depends on being able to leave strtab references at 0 to indicate
'no name'.  Once the table is constructed and the refs updated, we know
how long it is, so we can realloc() the partial CTF buffer we allocated
earlier and can copy the table on to the end of it (and purge the refs
because they're not needed any more and have been invalidated by the
realloc() call in any case).

The net effect of all this is a reduction in uncompressed strtab sizes
of about 30% (perhaps a quarter to a half of all strings across the
Linux kernel are eliminated as duplicates). Of course, duplicated
strings are highly redundant, so the space saving after compression is
only about 20%: when the other non-strtab sections are factored in, CTF
sizes shrink by about 10%.

No change in externally-visible API or file format (other than the
reduction in pointless redundancy).

libctf/
	* ctf-impl.h: (struct ctf_strs_writable): New, non-const version of
	struct ctf_strs.
	(struct ctf_dtdef): Note that dtd_data.ctt_name is unpopulated.
	(struct ctf_str_atom): New, disambiguated single string.
	(struct ctf_str_atom_ref): New, points to some other location that
	references this string's offset.
	(struct ctf_file): New members ctf_str_atoms and ctf_str_num_refs.
	Remove member ctf_dtvstrlen: we no longer track the total strlen
	as we add strings.
	(ctf_str_create_atoms): Declare new function in ctf-string.c.
	(ctf_str_free_atoms): Likewise.
	(ctf_str_add): Likewise.
	(ctf_str_add_ref): Likewise.
	(ctf_str_purge_refs): Likewise.
	(ctf_str_write_strtab): Likewise.
	(ctf_realloc): Declare new function in ctf-util.c.

	* ctf-open.c (ctf_bufopen): Create the atoms table.
	(ctf_file_close): Destroy it.
	* ctf-create.c (ctf_update): Copy-and-free it on update.  No longer
	special-case the position of the parname string.  Construct the
	strtab by calling ctf_str_add_ref and ctf_str_write_strtab after the
	rest of each buffer element is constructed, not via open-coding:
	realloc the CTF buffer and append the strtab to it.  No longer
	maintain ctf_dtvstrlen.  Sort the variable entry table later, after
	strtab construction.
	(ctf_copy_membnames): Remove: integrated into ctf_copy_{s,l,e}members.
	(ctf_copy_smembers): Drop the string offset: call ctf_str_add_ref
	after buffer element construction instead.
	(ctf_copy_lmembers): Likewise.
	(ctf_copy_emembers): Likewise.
	(ctf_create): No longer maintain the ctf_dtvstrlen.
	(ctf_dtd_delete): Likewise.
	(ctf_dvd_delete): Likewise.
	(ctf_add_generic): Likewise.
	(ctf_add_enumerator): Likewise.
	(ctf_add_member_offset): Likewise.
	(ctf_add_variable): Likewise.
	(membadd): Likewise.
	* ctf-util.c (ctf_realloc): New, wrapper around realloc that aborts
	if there are active ctf_str_num_refs.
	(ctf_strraw): Move to ctf-string.c.
	(ctf_strptr): Likewise.
	* ctf-string.c: New file, strtab manipulation.

	* Makefile.am (libctf_a_SOURCES): Add it.
	* Makefile.in: Regenerate.
2019-07-01 11:05:59 +01:00
bfd Automatic date update in version.in 2019-07-01 00:00:41 +00:00
binutils Prevent attempts to allocate excessive amounts of memory when parsing corrupt ELF files. 2019-06-28 15:30:43 +01:00
config
contrib
cpu cpu/or1k: Update fpu compare symbols to imply set flag 2019-06-13 06:16:19 +09:00
elfcpp [GOLD] R_PPC64_REL16_HIGH relocs 2019-06-28 10:17:08 +09:30
etc
gas Fix spelling error in assembler documentation. 2019-07-01 10:20:43 +01:00
gdb Adjust i386 registers on SystemTap probes' arguments (PR breakpoints/24541) 2019-06-28 16:28:21 -04:00
gnulib Fix gnulib/update-gnulib.sh 2019-06-21 13:23:59 +01:00
gold [GOLD] PowerPC tweak relnum tests 2019-06-28 10:18:03 +09:30
gprof Correct the alpha sorting of the short options in the usage description of the gprof program. 2019-05-20 17:17:24 +01:00
include libctf: endianness fixes 2019-06-21 13:04:02 +01:00
intl
ld PowerPC notoc call stub tests 2019-06-28 10:16:17 +09:30
libctf libctf: deduplicate and sort the string table 2019-07-01 11:05:59 +01:00
libdecnumber
libiberty Pull in patch for libiberty that fixes a stack exhaustion bug when demangling a pathalogically constructed mangled name. 2019-04-10 15:49:36 +01:00
opcodes x86: drop Vec_Imm4 2019-07-01 08:38:50 +02:00
readline config.guess,config.sub: synchronize with config project master sources 2019-05-23 18:19:56 +02:00
sim sim/testsuite/or1k: Add tests for unordered compares 2019-06-13 21:27:10 +09:00
texinfo
zlib
.cvsignore
.gitattributes
.gitignore
ar-lib
ChangeLog Add gnulib to gdb release tarball 2019-06-21 15:20:34 +02:00
compile
config-ml.in
config.guess config.guess,config.sub: synchronize with config project master sources 2019-05-23 18:19:56 +02:00
config.rpath
config.sub config.guess,config.sub: synchronize with config project master sources 2019-05-23 18:19:56 +02:00
configure Move gnulib to top level 2019-06-14 12:40:02 -06:00
configure.ac Move gnulib to top level 2019-06-14 12:40:02 -06:00
COPYING
COPYING3
COPYING3.LIB
COPYING.LIB
COPYING.LIBGLOSS
COPYING.NEWLIB
depcomp
djunpack.bat
install-sh
libtool.m4
lt~obsolete.m4
ltgcc.m4
ltmain.sh
ltoptions.m4
ltsugar.m4
ltversion.m4
MAINTAINERS Move gnulib to top level 2019-06-14 12:40:02 -06:00
Makefile.def Move gnulib to top level 2019-06-14 12:40:02 -06:00
Makefile.in Move gnulib to top level 2019-06-14 12:40:02 -06:00
Makefile.tpl Revert "Sync top level files with versions from gcc." 2019-05-30 11:17:19 +01:00
makefile.vms
missing
mkdep
mkinstalldirs
move-if-change
multilib.am
README
README-maintainer-mode
setup.com
src-release.sh Add gnulib to gdb release tarball 2019-06-21 15:20:34 +02:00
symlink-tree
test-driver
ylwrap

		   README for GNU development tools

This directory contains various GNU compilers, assemblers, linkers, 
debuggers, etc., plus their support routines, definitions, and documentation.

If you are receiving this as part of a GDB release, see the file gdb/README.
If with a binutils release, see binutils/README;  if with a libg++ release,
see libg++/README, etc.  That'll give you info about this
package -- supported targets, how to use it, how to report bugs, etc.

It is now possible to automatically configure and build a variety of
tools with one command.  To build all of the tools contained herein,
run the ``configure'' script here, e.g.:

	./configure 
	make

To install them (by default in /usr/local/bin, /usr/local/lib, etc),
then do:
	make install

(If the configure script can't determine your type of computer, give it
the name as an argument, for instance ``./configure sun4''.  You can
use the script ``config.sub'' to test whether a name is recognized; if
it is, config.sub translates it to a triplet specifying CPU, vendor,
and OS.)

If you have more than one compiler on your system, it is often best to
explicitly set CC in the environment before running configure, and to
also set CC when running make.  For example (assuming sh/bash/ksh):

	CC=gcc ./configure
	make

A similar example using csh:

	setenv CC gcc
	./configure
	make

Much of the code and documentation enclosed is copyright by
the Free Software Foundation, Inc.  See the file COPYING or
COPYING.LIB in the various directories, for a description of the
GNU General Public License terms under which you can copy the files.

REPORTING BUGS: Again, see gdb/README, binutils/README, etc., for info
on where and how to report problems.