Suggested by: Jeff Muizelaar.
And it was wrong in the sense that the help was like:
--executable|-e FILE <SNIP lots of other options> FILE
So now it's a bit redundant, like:
--executable|-e FILE <SNIP lots of other options> -e FILE
But as this is the most common usage pattern, give it more visibility.
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
A more brute-force approach: create a clone, reorganize it, and if the
resulting size is less than that of the cloned class, it is packable.
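As a rough illustration of the check described above, here is a minimal sketch
in C. The struct member type, the helper names and the "sort members by
decreasing alignment" reorganization are all assumptions made for the example;
the reorganization done by pahole itself is more involved.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified member description: size and alignment in bytes. */
struct member {
	size_t size;
	size_t align;
};

/* Round 'off' up to the next multiple of 'align' (which must be non-zero). */
size_t align_up(size_t off, size_t align)
{
	assert(align != 0);
	return (off + align - 1) / align * align;
}

/* Size of a struct laid out in the given member order, tail padding included. */
size_t layout_size(const struct member *m, size_t n)
{
	size_t off = 0, max_align = 1;

	for (size_t i = 0; i < n; i++) {
		off = align_up(off, m[i].align) + m[i].size;
		if (m[i].align > max_align)
			max_align = m[i].align;
	}
	return align_up(off, max_align);
}

/* qsort comparator: larger alignment first. */
int by_align_desc(const void *a, const void *b)
{
	const struct member *ma = a, *mb = b;

	return (ma->align < mb->align) - (ma->align > mb->align);
}

/*
 * Brute force: clone the members, reorganize the clone and compare sizes.
 * Returns 1 if packable, 0 if not, -1 if the clone could not be allocated.
 */
int is_packable(const struct member *m, size_t n)
{
	struct member *clone;
	int packable;

	if (n == 0)
		return 0;
	clone = malloc(n * sizeof(*clone));
	if (clone == NULL)
		return -1;
	memcpy(clone, m, n * sizeof(*clone));
	qsort(clone, n, sizeof(*clone), by_align_desc);
	packable = layout_size(clone, n) < layout_size(m, n);
	free(clone);
	return packable;
}
```

For a char, an int and another char, in that order, the original layout takes
12 bytes and the reorganized clone 8, so is_packable() returns 1.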
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Now we have:
[acme@filo pahole]$ pahole --help
Usage: pahole [OPTION...] [FILE] {[CLASS]}

  -a, --anon_include         include anonymous classes
  -A, --nested_anon_include  include nested (inside other structs) anonymous
                             classes
  -B, --bit_holes=NR_HOLES   Show only structs at least NR_HOLES bit holes
  -c, --cacheline_size=SIZE  set cacheline size to SIZE
  -D, --decl_exclude=PREFIX  exclude classes declared in files with PREFIX
  -E, --expand_types         expand class members
  -H, --holes=NR_HOLES       show only structs at least NR_HOLES holes
  -m, --nr_methods           show number of methods
  -n, --nr_members           show number of members
  -N, --class_name_len       show size of classes
  -P, --packable             show only structs that has holes that can be
                             packed
  -R, --reorganize           reorg struct trying to kill holes
  -s, --sizes                show size of classes
  -S, --show_reorg_steps     show the struct layout at each reorganization step
  -t, --nr_definitions       show how many times struct was defined
  -V, --verbose              be verbose
  -x, --exclude=PREFIX       exclude PREFIXed classes
  -X, --cu_exclude=PREFIX    exclude PREFIXed compilation units

 Input Selection:
      --debuginfo-path=PATH  Search path for separate debuginfo files
  -e, --executable=FILE      Find addresses in FILE
  -k, --kernel               Find addresses in the running kernel
  -K, --offline-kernel[=RELEASE]  Kernel with all modules
  -M, --linux-process-map=FILE  Find addresses in files mapped as read from
                             FILE in Linux /proc/PID/maps format
  -p, --pid=PID              Find addresses in files mapped into process PID

  -?, --help                 Give this help list
      --usage                Give a short usage message

Mandatory or optional arguments to long options are also mandatory or optional
for any corresponding short options.
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
[acme@mica pahole]$ pahole lala
pahole: Permission denied
[acme@mica pahole]$ pahole foo
pahole: No such file or directory
[acme@mica pahole]$ pahole ctracer.c
pahole: couldn't load DWARF info from ctracer.c
[acme@mica pahole]$
Thanks to Matthew Wilcox for noticing how lame it was :-)
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
So that tools like ctracer can print to a file; most of the tools just pass
stdout, keeping the previous behaviour.
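A minimal sketch of the pattern, assuming a hypothetical member_fprintf()
helper (the name, parameters and output format are made up for this example and
are not taken from the pahole sources): the printing routine receives the
stream from the caller instead of always writing to stdout.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical printer: writes one member line to the stream chosen by the
 * caller and, like fprintf, returns the number of characters written or a
 * negative value on error.
 */
int member_fprintf(FILE *fp, const char *type, const char *name,
		   unsigned int offset, unsigned int size)
{
	assert(fp != NULL);
	return fprintf(fp, "\t%s %s; /* %u %u */\n", type, name, offset, size);
}
```

A tool that wants the old behaviour calls member_fprintf(stdout, ...), while a
tool generating source code passes the FILE pointer it got from fopen().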
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Will be used in ctracer to create a struct subset with just the types for which
we have "collectors", i.e. functions that reduce complex types to base types
that will be put in the mini-struct, that will be as tightly packed as it can
be.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This cset also does a fixup for cases where the compiler keeps the type
specified by the programmer for a bitfield but uses less space to combine with
the next, non-bitfield member. These cases can be caught using plain pahole and
will appear with this comment:
        /* --- cacheline 1 boundary (64 bytes) --- */
        int bitfield1:1;            /* 64  4 */
        int bitfield2:1;            /* 64  4 */
        /* XXX 14 bits hole, try to pack */
        /* Bitfield WARNING: DWARF size=4, real size=2 */
        short int d;                /* 66  2 */
The fixup is done prior to reorganizing the fields.
Now an example of this code in action:
[acme@filo examples]$ cat swiss_cheese.c
<SNIP>
struct cheese {
	char id;
	short number;
	char name[52];
	int a:1;
	int b;
	int bitfield1:1;
	int bitfield2:1;
	short d;
	short e;
	short last:5;
};
<SNIP>
[acme@filo examples]$
Let's look at the layout:
[acme@filo examples]$ pahole swiss_cheese cheese
/* <11b> /home/acme/git/pahole/examples/swiss_cheese.c:3 */
struct cheese {
        char id;                    /*  0  1 */
        /* XXX 1 byte hole, try to pack */
        short int number;           /*  2  2 */
        char name[52];              /*  4 52 */
        int a:1;                    /* 56  4 */
        /* XXX 31 bits hole, try to pack */
        int b;                      /* 60  4 */
        /* --- cacheline 1 boundary (64 bytes) --- */
        int bitfield1:1;            /* 64  4 */
        int bitfield2:1;            /* 64  4 */
        /* XXX 14 bits hole, try to pack */
        /* Bitfield WARNING: DWARF size=4, real size=2 */
        short int d;                /* 66  2 */
        short int e;                /* 68  2 */
        short int last:5;           /* 70  2 */
}; /* size: 72, cachelines: 2 */
   /* sum members: 71, holes: 1, sum holes: 1 */
   /* bit holes: 2, sum bit holes: 45 bits */
   /* bit_padding: 11 bits */
   /* last cacheline: 8 bytes */
[acme@filo examples]$
Full of holes, has bit padding and uses more than one 64-byte cacheline.
Now let's ask pahole to reorganize it:
[acme@filo examples]$ pahole --reorganize --verbose swiss_cheese cheese
/* Demoting bitfield ('a' ... 'a') from 'int' to 'unsigned char' */
/* Demoting bitfield ('bitfield1' ... 'bitfield2') from 'short unsigned int' to 'unsigned char' */
/* Demoting bitfield ('last') from 'short int' to 'unsigned char' */
/* Moving 'bitfield2:1' from after 'bitfield1' to after 'a:1' */
/* Moving 'bitfield1:1' from after 'b' to after 'bitfield2:1' */
/* Moving 'last:5' from after 'e' to after 'bitfield1:1' */
/* Moving bitfield('a' ... 'last') from after 'name' to after 'id' */
/* Moving 'e' from after 'd' to after 'b' */
/* <11b> /home/acme/git/pahole/examples/swiss_cheese.c:3 */
struct cheese {
        char id;                    /*  0  1 */
        unsigned char a:1;          /*  1  1 */
        unsigned char bitfield2:1;  /*  1  1 */
        unsigned char bitfield1:1;  /*  1  1 */
        unsigned char last:5;       /*  1  1 */
        short int number;           /*  2  2 */
        char name[52];              /*  4 52 */
        int b;                      /* 56  4 */
        short int e;                /* 60  2 */
        short int d;                /* 62  2 */
        /* --- cacheline 1 boundary (64 bytes) --- */
}; /* size: 64, cachelines: 1 */
   /* saved 8 bytes and 1 cacheline! */
[acme@filo examples]$
Instant karma, it gets completely packed, and look ma, no
__attribute__((packed)) :-)
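The result can be cross-checked with the compiler. The sketch below just puts
the two layouts shown above side by side; the cheese_reorganized name and the
cheese_bytes_saved() helper are made up for this example. Bitfields of type
short and unsigned char are accepted by gcc but are implementation-defined in
standard C, and the 72 and 64 byte sizes are what pahole reported above, so
other ABIs may give different numbers.

```c
#include <assert.h>
#include <stddef.h>

/* Original layout, as in swiss_cheese.c. */
struct cheese {
	char id;
	short number;
	char name[52];
	int a:1;
	int b;
	int bitfield1:1;
	int bitfield2:1;
	short d;
	short e;
	short last:5;
};

/* Layout printed by pahole --reorganize (the struct name is hypothetical). */
struct cheese_reorganized {
	char id;
	unsigned char a:1;
	unsigned char bitfield2:1;
	unsigned char bitfield1:1;
	unsigned char last:5;
	short number;
	char name[52];
	int b;
	short e;
	short d;
};

/* Reorganizing should never make the struct bigger. */
static_assert(sizeof(struct cheese_reorganized) <= sizeof(struct cheese),
	      "reorganized layout grew");

/* Bytes saved by the reorganization on the ABI being compiled for. */
size_t cheese_bytes_saved(void)
{
	return sizeof(struct cheese) - sizeof(struct cheese_reorganized);
}
```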
With this, struct task_struct in the Linux kernel shrinks by 12 bytes. There
are 4 more bytes to save with another technique that involves not combining
holes, but using the last single hole and filling it with members from the
tail of the struct.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This allows us to save 4 more bytes in struct task_struct, for instance. Now we
need to combine whole bitfields with other fields if some bitfield has a size
less than sizeof(void *) and there is a suitable hole.
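A small sketch of the idea of filling a hole with a member taken from the tail
of the struct. The two structs and the helper are hypothetical, not taken from
the kernel; on a typical LP64 ABI the first one takes 24 bytes and the second
16, while on ILP32 both have the same size.

```c
#include <assert.h>
#include <stddef.h>

struct before {
	char flag;	/* followed by a hole up to the alignment of 'ptr' */
	void *ptr;
	int tail;	/* last member, followed by tail padding on LP64 */
};

struct after {
	char flag;
	int tail;	/* moved from the tail into the hole after 'flag' */
	void *ptr;
};

/* Moving a tail member into a hole must not make the struct bigger. */
static_assert(sizeof(struct after) <= sizeof(struct before),
	      "moving the tail member grew the struct");

/* Bytes saved by the move on the ABI being compiled for (may be zero). */
size_t tail_move_bytes_saved(void)
{
	return sizeof(struct before) - sizeof(struct after);
}
```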
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Using export CFLAGS="-Wall -Wfatal-errors -Wformat=2 -Wsequence-point -Wextra
-Wno-parentheses -g", suggested by Davi Arnaut. Amazing how cruft piles up
when one is not looking ;)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Some are just typedefs, others are inside structs and in some cases it's
useful to see the statistics for them, so add two new cmd line options:
-a, --anon_include include anonymous classes
-A, --nested_anon_include include nested (inside other structs) anonymous classes
Committer note: I've reworked several aspects of the patch, but mostly to
give better names for the new find_first_typedef_of_type function, adding
a clarifying comment and introducing --nested_anon_include so that we
can select just the typedef'ed anonymous structs.
Damn, I had committed just dwarves.c, here are the dwarves.h and pahole.c bits.
Signed-off-by: Davi Arnaut <davi@haxent.com.br>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To show how many non-inline functions receive each of the structs in a project
as a parameter, for example:
[acme@newtoy ctracer_example]$ pahole --nr_methods vmlinux | sort -k2 -nr | head -5
file: 526
inode: 479
sk_buff: 386
sock: 383
dentry: 295
[acme@newtoy ctracer_example]$
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
For now it just affects showing differences in definitions of structs with the
same name found in different object files, which could be a real problem but
could as well be just a namespace collision not affecting the project's build
process, as they would be local to specific objects.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Out of struct typedef_tag, that now becomes the superclass of struct class, and
that also will be for struct enumeration, struct union_type and then finally
for struct struct_type, when struct class finally dies.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
By having its own class, struct typedef_tag.
As typedefs, like structs, unions and enums, have a common part (the node and
visited fields, required when emitting their definitions), there is an
opportunity for consolidation that will be explored when adding the specific
classes for DW_TAG_enumeration & DW_TAG_union.
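A sketch of how such a common part can be shared in C by embedding it as the
first member of each more specific struct. Only the struct typedef_tag, struct
class and struct tag names and the node and visited fields come from these
changelogs; every other field, the list_node type and the
typedef_tag__mark_visited() helper are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical doubly linked list node. */
struct list_node {
	struct list_node *next, *prev;
};

/* Top of the hierarchy; the 'name' field is an assumption. */
struct tag {
	const char *name;
};

/* Common part shared by the types whose definitions get emitted. */
struct typedef_tag {
	struct tag tag;		/* kept first, so the addresses coincide */
	struct list_node node;	/* linkage in a list of definitions */
	unsigned char visited;	/* set once the definition was emitted */
};

/* "Subclass": embeds the common part as its first member. */
struct class {
	struct typedef_tag type;
	size_t size;		/* hypothetical class-only field */
};

/*
 * Works on the common part, so it serves typedefs and classes alike.
 * Returns 1 if the type was not visited before, 0 if it already was.
 */
int typedef_tag__mark_visited(struct typedef_tag *self)
{
	assert(self != NULL);
	if (self->visited)
		return 0;
	self->visited = 1;
	return 1;
}
```

Code that emits definitions can then take a struct typedef_tag pointer and be
handed &some_class.type without caring about the more specific type.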
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Almost mirroring the DWARF on-disk linkage in memory, more to come before
getting over these simplification refactorings.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
So far struct class was being used as the main data structure. Switch to struct
tag, which was already the top of the tag hierarchy, being a struct class
ancestor, so reflect that and stop using struct class as the catch-all class.
As a start, DW_TAG_array_type tags are now represented by a new class, struct
array_type, reducing the size of struct class and reducing DW_TAG_array_type
instance memory usage.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
So that we can extract bits from one and combine them with bits from other instances,
like we'll do in ctracer, where we want to have a cus instance just to get the
kprobes definitions and forward declarations but not handle the methods in it.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
So that we can load many object files, that is what the next csets will
do, to recursively look for files with debug info in a build tree, such
as the kernel one.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
An example to illustrate the kind of checks done:
[acme@newtoy multi-cu]$ cat a.c
#include <stdio.h>

struct foo {
	int a;
	char b;
};

void a_foo_print(struct foo *f)
{
	printf("f.a=%d\n", f->a);
}
[acme@newtoy multi-cu]$ cat main.c
struct foo {
	int a;
	char b;
	char c;
};

extern void a_foo_print(struct foo *f);

int main(void)
{
	struct foo f = { .a = 10, };

	a_foo_print(&f);
	return 0;
}
[acme@newtoy multi-cu]$ cc -g -c a.c -o a.o
[acme@newtoy multi-cu]$ cc -g -c main.c -o main.o
[acme@newtoy multi-cu]$ cc a.o main.o -o m
[acme@newtoy multi-cu]$ pahole m
class: foo
first: a.c
current: main.c
nr_members: 2 != 3
padding: 3 != 2
[acme@newtoy multi-cu]$
Gotcha? In the above case this inconsistency wouldn't cause problems, as the
'c' member doesn't make the struct bigger (it uses the padding), but what if we
inverted the members 'a' and 'b'?
Upcoming csets will check if the type and order of the members are the same,
should help in some complex projects where people insist on using #ifdefs in
struct definitions.
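The kind of comparison behind the output above can be sketched as follows. The
class_summary struct and the class_summary_diff() function are hypothetical
names made up for this example; only the two checks visible in the output
(nr_members and padding) are covered.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical per-definition summary; field names follow the output above. */
struct class_summary {
	const char *name;
	const char *cu;		/* compilation unit where it was found */
	unsigned int nr_members;
	unsigned int padding;
};

/*
 * Print the differences between two definitions of the same struct to 'fp'.
 * Returns the number of fields that differ, 0 if the definitions match.
 */
int class_summary_diff(const struct class_summary *first,
		       const struct class_summary *current, FILE *fp)
{
	int nr_diffs;

	assert(first != NULL && current != NULL && fp != NULL);
	nr_diffs = (first->nr_members != current->nr_members) +
		   (first->padding != current->padding);
	if (nr_diffs == 0)
		return 0;

	fprintf(fp, "class: %s\nfirst: %s\ncurrent: %s\n",
		first->name, first->cu, current->cu);
	if (first->nr_members != current->nr_members)
		fprintf(fp, "nr_members: %u != %u\n",
			first->nr_members, current->nr_members);
	if (first->padding != current->padding)
		fprintf(fp, "padding: %u != %u\n",
			first->padding, current->padding);
	return nr_diffs;
}
```

Fed with the two definitions of struct foo from a.c (2 members, 3 bytes of
padding) and main.c (3 members, 2 bytes of padding), it reports two
differences.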
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Simplifying options processing by using just a pair of cu and class iterators
and using the list we were building just for --total_structure_stats for all
options. This way we don't print structures that are defined in more than one
object file multiple times when processing a multi-object file.
With this in place all the options will check if a struct definition in one
object file somehow doesn't match the same struct definition in some other
object file; more checks will be put in place in the upcoming csets.
And, to show that this, besides simplifying, reduces the code size, let's use
codiff:
[acme@newtoy pahole]$ codiff build/pahole.before build/pahole
/home/acme/pahole/pahole.c:
structures__add | -143
class__filter | +147
main | -263
3 functions changed, 147 bytes added, 406 bytes removed
[acme@newtoy pahole]$
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
That is to find structs that have combinable holes, trying to pack the struct
by suggesting a move, for now it just prints structs that have holes that can
be combined, but these hints are not guaranteed to generate struct size
reductions, more has to be done and that involves understanding the alignment
rules that depend on the arch being 32 or 64 bits, but it at least reduces the
number of packing candidates.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
So that we can see only the structs that have more than the specified number of
bit holes.
Can be combined with --holes to see structs that have bit and byte holes.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
These are currently only used by pahole and would live in classes otherwise.
Signed-off-by: Bernhard Fischer <rep.nop@aon.at>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
The minimum number of holes that a struct must have for it to be
reported, to help in combining holes.
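A minimal sketch of such a filter, assuming a hypothetical member_layout
description holding just byte offsets and sizes (bit holes are not handled, and
none of these names come from the pahole sources):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical byte-level view of a member: where it starts, how long it is. */
struct member_layout {
	size_t offset;
	size_t size;
};

/*
 * Count the byte holes between consecutive members, which must be sorted by
 * offset. If sum_holes is not NULL it receives the total hole size in bytes.
 */
size_t count_holes(const struct member_layout *m, size_t n, size_t *sum_holes)
{
	size_t nr_holes = 0, sum = 0;

	assert(m != NULL || n == 0);
	for (size_t i = 1; i < n; i++) {
		size_t prev_end = m[i - 1].offset + m[i - 1].size;

		if (m[i].offset > prev_end) {
			nr_holes++;
			sum += m[i].offset - prev_end;
		}
	}
	if (sum_holes != NULL)
		*sum_holes = sum;
	return nr_holes;
}

/* The filter itself: report the struct only if it has at least min_holes. */
int has_min_holes(const struct member_layout *m, size_t n, size_t min_holes)
{
	return count_holes(m, n, NULL) >= min_holes;
}
```

For a char at offset 0, a short at offset 2 and a 52-byte array at offset 4,
count_holes() finds one hole of one byte, so the struct passes a minimum of 1
but not a minimum of 2.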
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
pahole -D /pub/scm/linux/kernel/git/acme/net-2.6.20/include/net/ \
../OUTPUT/qemu/net-2.6.20/net/ipv4/tcp.o
Will exclude all the classes that were defined in files in the
/pub/scm/linux/kernel/git/acme/net-2.6.20/include/net/ directory. Note that
it's a prefix, not a directory, so one could as well pass
/pub/scm/linux/kernel/git/acme/net-2.6.20/include/net/tcp_ to exclude just the
files in the include/net directory that start with 'tcp_'.
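The prefix semantics boil down to a plain string prefix comparison, as in this
sketch. The decl_excluded() name is hypothetical and the paths used with it
below are shortened examples, not the ones pahole actually sees.

```c
#include <assert.h>
#include <string.h>

/*
 * Returns 1 if 'decl_file' starts with 'prefix', 0 otherwise. The prefix is
 * compared as a string, it does not have to end at a directory boundary.
 */
int decl_excluded(const char *decl_file, const char *prefix)
{
	assert(decl_file != NULL && prefix != NULL);
	return strncmp(decl_file, prefix, strlen(prefix)) == 0;
}
```

With the prefix "include/net/tcp_" a file such as "include/net/tcp_states.h"
is excluded, while "include/net/tcp.h" and "include/net/sock.h" are kept.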
Now I think I implemented what Bernhard wanted, and that is useful for me
as well, of course :-)
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>