Another simplification made possible by using a plain char string
instead of string_t, that was only needed in the core as prep work
for CTF encoding.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For the threaded code we want to access strings in tags at the same time
that the string table may grow in another thread making the previous
pointer invalid, so, to avoid excessive locking, use plain strings.
The way the tools work will either consume the just produced CU straight
away or keep just one copy of each data structure when we keep all CUs
in memory, so let's try to stop using strings_t for strings.
For the class_member->name case we get the bonus of removing another
user of dwarves__active_loader.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As they are temporarily disabled till we go thru them fixing up problems
with the BTF changes.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We need to fix some bugs introduced recently, till then, disable steps
that try to demote the base type of bitfields and those that
move/combine bitfields to save space.
We'll revisit those later, bringing them back to the reorg codebase.
Acked-by: Andrii Nakryiko <andriin@fb.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Wielaard <mark@klomp.org>
Cc: Yonghong Song <yhs@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Instead of relying on error-prone adjustment of bit/byte holes, use
class__find_holes() to re-calculate them after members are moved around.
As part of that change, fix a bug where bit_offset was not adjusted
when byte_offset changed.
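A minimal sketch of the idea, not the actual class__find_holes(): after
a member is moved, every hole is recomputed from the member offsets
instead of being patched incrementally. struct member,
member__set_offset() and members__find_holes() are hypothetical names,
bitfields are left out, and members are assumed to be sorted by offset
and not to overlap.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified member: no bitfields, sizes in bytes. */
struct member {
	uint32_t byte_offset;
	uint32_t byte_size;
	uint32_t bit_offset;	/* kept in sync: byte_offset * 8 */
	uint32_t hole;		/* unused bytes after this member */
};

/* Move a member and keep bit_offset consistent with byte_offset. */
void member__set_offset(struct member *m, uint32_t byte_offset)
{
	m->byte_offset = byte_offset;
	m->bit_offset  = byte_offset * 8;
}

/*
 * Recompute every hole from scratch. Returns the number of holes; the
 * gap after the last member is stored in *padding, not counted as a hole.
 */
unsigned int members__find_holes(struct member *m, size_t nr,
				 uint32_t struct_size, uint32_t *padding)
{
	unsigned int nr_holes = 0;
	size_t i;

	*padding = struct_size;
	for (i = 0; i < nr; ++i) {
		uint32_t end  = m[i].byte_offset + m[i].byte_size;
		uint32_t next = i + 1 < nr ? m[i + 1].byte_offset : struct_size;

		m[i].hole = next - end;
		if (i + 1 < nr) {
			if (m[i].hole != 0)
				++nr_holes;
		} else {
			*padding = m[i].hole;
			m[i].hole = 0;
		}
	}
	return nr_holes;
}
```

With bitfields the bit offset would also carry the offset inside the
storage unit, which this sketch ignores.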
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: dwarves@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
That conf_fprintf can be elided as it is always NULL for the root call,
i.e. the function is only called recursively when expanding types.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The changeset:
commit f043528
Author: Arnaldo Carvalho de Melo <acme@redhat.com>
Date: Sun Aug 16 12:26:33 2009 -0300
dwarves_reorganize: Fix class__demote_bitfields, we need power of
two bytes
It had a correct changeset description but an incorrect implementation:
it was not rounding up to the next power of two, but to the next
multiple of 2, i.e. when a bitfield has 2 bits, it was deciding it
needed 2 bytes, not 1.
Fix it by copying the roundup_power_of_two code from the Linux kernel,
mostly by David Howells <dhowells@redhat.com>.
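To illustrate the difference, here is a portable sketch. It uses a
simple loop and is not the kernel's roundup_pow_of_two implementation;
roundup_multiple_of_two() and bitfield_base_type_bytes() are
hypothetical names made up for this example.

```c
#include <stdint.h>

/* The buggy behaviour: round up to the next multiple of 2. */
uint32_t roundup_multiple_of_two(uint32_t n)
{
	return (n + 1) & ~UINT32_C(1);
}

/* Smallest power of two >= n. Only valid for 1 <= n <= 2^31. */
uint32_t roundup_pow_of_two(uint32_t n)
{
	uint32_t p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/* Bytes needed by a base type able to hold a bitfield of `bits` bits. */
uint32_t bitfield_base_type_bytes(uint32_t bits)
{
	return roundup_pow_of_two((bits + 7) / 8);
}
```

A 2-bit bitfield needs 1 byte here, where rounding to a multiple of 2
gives 2; a 33-bit bitfield needs 8 bytes, where a multiple of 2 gives 6.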
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As Thomas Gleixner wisely pointed out, using 'self' is stupid, it
doesn't convey useful information, so use sensible names.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We can't always pad using the modulo of addr_size, we need to find the
minimum alignment requirement as a power of two < addr_size.
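A rough sketch of one way to do this, under assumptions: the alignment
is taken as the largest power of two that fits both the biggest member
size and addr_size (treated here as <=), and min_alignment() and
tail_padding() are hypothetical helpers, not functions from
dwarves_reorganize.

```c
#include <stddef.h>

/*
 * Largest power of two that is <= both the biggest member size and
 * addr_size; used in this sketch as the alignment of the struct.
 */
size_t min_alignment(size_t biggest_member_size, size_t addr_size)
{
	size_t limit = biggest_member_size < addr_size ?
		       biggest_member_size : addr_size;
	size_t align = 1;

	while (align * 2 <= limit)
		align *= 2;
	return align;
}

/* Bytes of tail padding needed so that size is a multiple of align. */
size_t tail_padding(size_t size, size_t align)
{
	size_t rem = size % align;

	return rem ? align - rem : 0;
}
```

For a 6-byte struct made only of shorts with addr_size 8, padding by the
modulo of addr_size would add 2 bytes, while the alignment computed here
is 2 and no padding is needed.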
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When we move fields from the tail to eliminate holes look if the struct
still correctly fits into a multiple of cu->addr_size.
Unless the struct is explicitly marked __attribute__((packed)) (and we
can't know that for sure) we have to add padding in that case.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
I wasn't specifying the optimization level, and at the default level,
even with -Wall, this simple case was not warned about, so now I'm
using -O2.
Alexandre provided a patch initializing the variables to NULL, so that
when we called cus__delete it would bail out and not possibly act on
a random value, I preferred to add extra goto labels and do the exit
path only on the resources that were successfully allocated/initialized,
avoiding, for instance, calling dwarves_exit() if dwarves_init() wasn't
called, which wasn't a problem so far, but could be in the future.
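The goto label pattern looks roughly like the sketch below. lib_init(),
lib_exit() and run() are hypothetical stand-ins, not the dwarves API,
and the fail_at argument exists only to exercise each exit path.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for a library's init/exit pair. */
int lib_exit_calls;

int lib_init(void)
{
	return 0;
}

void lib_exit(void)
{
	++lib_exit_calls;
}

/*
 * Returns 0 on success, non-zero on failure. Each label undoes only
 * what was set up before the jump to it.
 */
int run(int fail_at)
{
	int rc = EXIT_FAILURE;
	char *buf, *table;

	if (fail_at == 1 || lib_init() != 0)
		goto out;		/* nothing to undo yet */

	buf = fail_at == 2 ? NULL : malloc(64);
	if (buf == NULL)
		goto out_lib_exit;

	table = fail_at == 3 ? NULL : malloc(128);
	if (table == NULL)
		goto out_free_buf;

	/* ... the real work would go here ... */
	rc = EXIT_SUCCESS;

	free(table);
out_free_buf:
	free(buf);
out_lib_exit:
	lib_exit();
out:
	return rc;
}
```

When the first step fails, lib_exit() is never reached, which mirrors
not calling dwarves_exit() when dwarves_init() wasn't called.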
Reported-by: Alexandre Vassalotti <alexandre@peadrop.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Instead pass thru cu__strings(cu, i) so that we can figure out if the
underlying debugging format handler can do that more efficiently, such as by
looking directly at the ELF section ".strtab".
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Such as signed, etc. This is in preparation for using ctf_strings directly
instead of duplicating it in the global strings table.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
That became cached when we moved to CTF like bitfields, i.e. each bit size
creates a new type so that the encoding for class members can have just the
name, type and bit_offset.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To shorten the name and to reflect the fact that we're no longer
"finding" a type, but merely accessing an array with a bounds check in
this function.
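As a sketch, such an accessor can be as small as the code below. The
struct ptr_table and ptr_table__entry names appear in the codiff output
elsewhere in this log, but the field layout here is an assumption made
for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: a flat array of pointers indexed by a small id. */
struct ptr_table {
	void	 **entries;
	uint32_t   nr_entries;
};

/* Plain array access behind a bounds check; NULL when out of range. */
void *ptr_table__entry(const struct ptr_table *pt, uint32_t id)
{
	return id < pt->nr_entries ? pt->entries[id] : NULL;
}
```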
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Because we will need the "bit_offset" and "bit_size" names when converting the
representation of offset and size everywhere to be in bits, not bytes.
At the same time we will keep bitfield_size and bitfield_offset when we convert
from DWARF to CTF and will calculate them when loading CTF, so that the
conversion of the algorithms in dwarves_reorganize, that have all sorts of
subtle issues, can be left for later.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Amazing how many crept up over time, should have set the
execute bit of .git/hooks/pre-commit already, duh.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
c'n'paste error for the two bytes case, the alternate name was also
being set to the type_name variable, duh.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Had to be a big sweeping change, but the regression tests show just
improvements :-)
Now we stop using an id in struct tag, only storing the type, that now
uses 16 bits only, as CTF does.
Each format loader has to go on adding the types to the core, that
figures out if it is a tag that can be on the tag->type field
(tag__is_tag_type).
Formats that already have the types separated and in sequence, such as
CTF, just ask the core to insert in the types_table directly with its
original ID.
For DWARF, we ask the core to put it on the table, in sequence, and return the
index, that is then stashed with the DWARF specific info (original id, type,
decl_line, etc) and hashed by the original id. Later we recode everything,
looking up via the original type, getting the small_id to put on the tag->type.
The underlying debugging info not needed by the core is stashed in tag->priv,
and the DWARF loader now just allocates sizeof(struct dwarf_tag) at the end of
the core tag and points it there, and makes that info available thru
cu->orig_info. In the future we can ask, when loading a cu, that this info be
thrown away, so that we reduce the memory footprint for big multi-cu files such
as the Linux kernel.
There is also a routine to ask for inserting a NULL, as we still have
bugs in the CTF decoding and thus some entries are being lost, to avoid
using an undefined pointer when traversing the types_table the ctf
loader puts a NULL there via cu__table_nullify_type_entry() and then
cu__for_each_type skips those.
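A simplified sketch of that idea follows. struct types_table,
types_table__nullify() and types_table__nr_live() are hypothetical names
for this example, not the real cu__table_nullify_type_entry() or
cu__for_each_type.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified types table. */
struct types_table {
	void	 **entries;
	uint32_t   nr_entries;
};

/* Mark a slot as lost, so that walkers see NULL instead of garbage. */
int types_table__nullify(struct types_table *t, uint32_t id)
{
	if (id >= t->nr_entries)
		return -1;
	t->entries[id] = NULL;
	return 0;
}

/* Walk the table the way an iterator would, skipping NULL slots. */
uint32_t types_table__nr_live(const struct types_table *t)
{
	uint32_t id, nr = 0;

	for (id = 0; id < t->nr_entries; ++id) {
		if (t->entries[id] == NULL)
			continue;	/* entry lost while decoding */
		++nr;
	}
	return nr;
}
```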
There are some more cleanups for leftovers that I avoided cleaning to
reduce this changeset.
And also while doing this I saw that enums can appear without any
enumerators and that an array with DW_TAG_GNU_vector is actually a
different tag, encoded this way till we get to DWARF4 ;-)
So now we don't have to look up DWARF offsets in a hash table, we can
do the more sensible thing of just indexing the types_tags array.
Now to do some cleanups and try to get the per cu encoder done. Then
order all the cus per number of type entries, pick the one with more,
then go on merging/recoding the types of the others and putting the
parent linkage in place.
Just to show the extent of the changes:
$ codiff /tmp/libdwarves.so.1.0.0 build/libdwarves.so.1.0.0
/home/acme/git/pahole/dwarves.c:
struct cu | -4048
struct tag | -32
struct ptr_to_member_type | -32
struct namespace | -32
struct type | -32
struct class | -32
struct base_type | -32
struct array_type | -32
struct class_member | -32
struct lexblock | -32
struct ftype | -32
struct function | -64
struct parameter | -32
struct variable | -32
struct inline_expansion | -32
struct label | -32
struct enumerator | -32
17 structs changed
tag__follow_typedef | +3
tag__fprintf_decl_info | +25
array_type__fprintf | +6
type__name | -126
type__find_first_biggest_size_base_type_member | -3
typedef__fprintf | +16
imported_declaration__fprintf | +6
imported_module__fprintf | +3
cu__new | +26
cu__delete | +26
hashtags__hash | -65
hash_64 | -124
hlist_add_head | -78
hashtags__find | -157
cu__hash | -80
cu__add_tag | +20
tag__prefix | -3
cu__find_tag_by_id | -2
cu__find_type_by_id | -3
cu__find_first_typedef_of_type | +38
cu__find_base_type_by_name | +68
cu__find_base_type_by_name_and_size | +72
cu__find_struct_by_name | +59
cus__find_struct_by_name | +8
cus__find_tag_by_id | +5
cus__find_cu_by_name | -6
lexblock__find_tag_by_id | -173
cu__find_variable_by_id | -197
list__find_tag_by_id | -308
cu__find_parameter_by_id | -60
tag__ptr_name | +6
tag__name | +15
variable__type | +13
variable__name | +7
class_member__size | +6
parameter__name | -119
tag__parameter | -14
parameter__type | -143
type__fprintf | -29
union__fprintf | +6
class__add_vtable_entry | -9
type__add_member | -6
type__clone_members | -3
enumeration__add | -6
function__name | -156
ftype__has_parm_of_type | -39
class__find_holes | -27
class__has_hole_ge | -3
type__nr_members_of_type | +3
lexblock__account_inline_expansions | +3
cu__account_inline_expansions | -18
ftype__fprintf_parms | +46
function__tag_fprintf | +24
lexblock__fprintf | -6
ftype__fprintf | +3
function__fprintf_stats | -18
function__size | -6
class__vtable_fprintf | -11
class__fprintf | -21
tag__fprintf | -35
60 functions changed, 513 bytes added, 2054 bytes removed, diff: -1541
/home/acme/git/pahole/ctf_loader.c:
struct ctf_short_type | +0
14 structs changed
type__init | -14
type__new | -9
class__new | -12
create_new_base_type | -7
create_new_base_type_float | -7
create_new_array | -8
create_new_subroutine_type | -9
create_full_members | -18
create_short_members | -18
create_new_class | +1
create_new_union | +1
create_new_enumeration | -19
create_new_forward_decl | -2
create_new_typedef | +3
create_new_tag | -5
load_types | +16
class__fixup_ctf_bitfields | -3
17 functions changed, 21 bytes added, 131 bytes removed, diff: -110
/home/acme/git/pahole/dwarf_loader.c:
17 structs changed
zalloc | -56
tag__init | +3
array_type__new | +20
type__init | -24
class_member__new | +46
inline_expansion__new | +12
class__new | +81
lexblock__init | +19
function__new | +43
die__create_new_array | +20
die__create_new_parameter | +4
die__create_new_label | +4
die__create_new_subroutine_type | +113
die__create_new_enumeration | -21
die__process_class | +79
die__process_namespace | +76
die__create_new_inline_expansion | +4
die__process_function | +147
__die__process_tag | +34
die__process_unit | +56
die__process | +90
21 functions changed, 851 bytes added, 101 bytes removed, diff: +750
/home/acme/git/pahole/dwarves.c:
struct ptr_table | +16
struct cu_orig_info | +32
2 structs changed
tag__decl_line | +68
tag__decl_file | +70
tag__orig_id | +71
ptr_table__init | +46
ptr_table__exit | +37
ptr_table__add | +183
ptr_table__add_with_id | +165
ptr_table__entry | +64
cu__table_add_tag | +171
cu__table_nullify_type_entry | +38
10 functions changed, 913 bytes added, diff: +913
/home/acme/git/pahole/ctf_loader.c:
2 structs changed
tag__alloc | +52
1 function changed, 52 bytes added, diff: +52
/home/acme/git/pahole/dwarf_loader.c:
struct dwarf_tag | +48
struct dwarf_cu | +4104
4 structs changed
dwarf_cu__init | +83
hashtags__hash | +61
hash_64 | +124
hlist_add_head | +78
hashtags__find | +161
cu__hash | +95
tag__is_tag_type | +171
tag__is_type | +85
tag__is_union | +28
tag__is_struct | +57
tag__is_typedef | +28
tag__is_enumeration | +28
dwarf_cu__find_tag_by_id | +56
dwarf_cu__find_type_by_id | +63
tag__alloc | +114
__tag__print_type_not_found | +108
namespace__recode_dwarf_types | +346
tag__namespace | +14
tag__has_namespace | +86
tag__is_namespace | +28
type__recode_dwarf_specification | +182
tag__type | +14
__tag__print_abstract_origin_not_found | +105
ftype__recode_dwarf_types | +322
tag__ftype | +14
tag__parameter | +14
lexblock__recode_dwarf_types | +736
tag__lexblock | +14
tag__label | +14
tag__recode_dwarf_type | +766
tag__ptr_to_member_type | +14
cu__recode_dwarf_types_table | +88
cu__recode_dwarf_types | +48
dwarf_tag__decl_file | +77
strings__ptr | +33
dwarf_tag__decl_line | +59
dwarf_tag__orig_id | +59
dwarf_tag__orig_type | +59
38 functions changed, 4432 bytes added, diff: +4432
build/libdwarves.so.1.0.0:
147 functions changed, 6782 bytes added, 2286 bytes removed, diff: +4496
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Unfortunately the most common DWARF and CTF encoders don't agree on how
the names of base types are formed, so look for an alternative name.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
And make the dwarves use it, so that we can remove duplicate strings in
a multi-CU file (vmlinux anyone?) and have it ready for insertion in a
compressed DWARF format with just the types, or better, CTF or some new
compressed debugging info format.
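A minimal sketch of a deduplicating string table, under assumptions:
struct strtab, strtab__add() and strtab__ptr() are hypothetical names,
and the linear search only keeps the example short. Strings are
identified by offset because the buffer may move when it grows.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical table: one growing buffer of NUL-terminated strings. */
struct strtab {
	char   *buf;
	size_t	len;
};

/*
 * Return the offset of s in the table, appending it only if absent.
 * Returns (size_t)-1 on allocation failure.
 */
size_t strtab__add(struct strtab *t, const char *s)
{
	size_t off, n = strlen(s) + 1;
	char *nbuf;

	for (off = 0; off < t->len; off += strlen(t->buf + off) + 1)
		if (strcmp(t->buf + off, s) == 0)
			return off;	/* already there: reuse it */

	nbuf = realloc(t->buf, t->len + n);	/* may move every string */
	if (nbuf == NULL)
		return (size_t)-1;
	memcpy(nbuf + t->len, s, n);
	t->buf = nbuf;
	off = t->len;
	t->len += n;
	return off;
}

/* Resolve an offset to a pointer, valid only until the next add. */
const char *strtab__ptr(const struct strtab *t, size_t off)
{
	return off < t->len ? t->buf + off : NULL;
}
```

Adding "int", "long" and then "int" again stores two strings and returns
the first offset for the repeated one.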
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For correctly created and completely parsed debugging information the type will
always be found, but as we still need to parse more tags and expecting
debugging information to be always correctly built is not sane... sprinkle some
asserts.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This is trying to get CTF friendly, where bitfields are not stored in the
equivalent to the DW_TAG_member dwarf TAG, but on "base types" with bit sizes
different than the real in the DWARF sense, base types (char, long, etc).
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
So that we traverse just the data members, mostly in the reorganize code, where
we couldn't care less where the compiler put the base classes in the layout,
since we can't influence how the compiler does this; it only has to respect the
layout we specify for the data members.
Well, it may well be the case that the order of the ancestor classes in the
class declaration can influence this, but I haven't checked.
Yes, another C++ism :-)
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
In the past it was always cloning and doing the reorganization steps on the
clone, now this is done by the caller, so no need to return self.
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
C++ uses this, and to cache the result of the lookup at type__name time we need
to pass the cu to class__name and type__name. Big fallout because of that :-\
But now the output is mucho embellished by the humongous strings representing
C++ templates.
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Will be useful to show that the intent is to traverse just the DW_TAG_member
entries in the type list. Right now there are both DW_TAG_inheritance and
DW_TAG_member entries in the ->members type list. But there will be many more
tags, like enumerations, classes, etc, that are defined inside classes, a C++
feature. This will also help with DW_TAG_namespace support.
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>