I.e. the functions declared as inline but not inlined by the compiler
DW_AT_inline with DW_INL_declared_not_inlined value.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Following what is in the DWARF2 specs:
Name Meaning
-----------------------------------------------------------------------------
DW_INL_not_inlined Not declared inline nor inlined by the compiler
DW_INL_inlined Not declared inline but inlined by the compiler
DW_INL_declared_not_inlined Declared inline but not inlined by the compiler
DW_INL_declared_inlined Declared inline and inlined by the compiler
Take advantae of this and use it in a new pfunct option: --cc_inlined, to
show which functions were of the DW_INL_inlined type.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
And helper routines, so as to separate DW_TAG_subprogram from
the type tags (DW_TAG_structure_type, basic_type, etc).
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This should have been done from the start: all DW_TAG_s will be represented by
structs that has as its first member a struct tag, so that we can fully
represent the DWARF information, following csets will take continue the
restructuring.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
That uses the DW_AT_external attribute, that tells if the DW_TAG_subprogram
(a function) is visible externally.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
At least GCC only emits DW_TAG_subprogram for inline functions if they were
used somewhere in the CU, even if no DW_TAG_inlined_subroutine tag is emitted
due to optimizations reducing inline functions to nothing.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This fixes a problem with codiff usage of the ->class_to_diff member, as we
were looking at a different CU than the one intended, so we'd have to have a
pointer to the CU associated with ->class_to_diff, heck, its time to have this
backpointer :-)
Now to audit the rest of the code to look for simplifications since we now have
this backpointer and thus don't need to pass CU pointers around.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Well, needs to be a bit more terse, as prints the types per CU, i.e. for
types defined in multiple CUs we get them repeated, but using grep +
sort -u does the trick.
[acme@newtoy pahole]$ codiff -t /tmp/ipv6.ko.before /tmp/ipv6.ko.after | grep ^struct | sort -u
struct inet_connection_sock: size, offset
struct inet_sock: size, nr_members, offset
struct proto: nr_members, type
struct raw6_sock: size, offset
struct tcp6_sock: size
struct tcp_sock: size, offset
struct udp_sock: size, offset
[acme@newtoy pahole]$
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
[acme@newtoy examples]$ cat struct.c
static struct foo {
char a:2;
unsigned int b;
unsigned long c;
unsigned long d;
unsigned long e;
} bar;
int main(int argc, char *argv[])
{
printf("%d", bar.a);
}
[acme@newtoy examples]$
Then change "a:2" to "a:4":
[acme@newtoy examples]$ codiff -V old_struct new_struct
struct.c:
struct foo | +0
a:2;
from: char /* 0(6) 1(2) */
to: char /* 0(4) 1(4) */
1 struct changed
Now, on top of that move a after b:
[acme@newtoy examples]$ codiff -V old_struct new_struct
struct.c:
struct foo | +0
a:2;
from: char /* 0(6) 1(2) */
to: char /* 4(4) 1(4) */
b;
from: unsigned int /* 4(0) 4(0) */
to: unsigned int /* 0(0) 4(0) */
1 struct changed
[acme@newtoy examples]$
Move it back a to before b and change the type of e without changing its size,
i.e. from unsigned long to long:
[acme@newtoy examples]$ codiff -V old_struct new_struct
struct.c:
struct foo | +0
a:2;
from: char /* 0(6) 1(2) */
to: char /* 0(4) 1(4) */
e;
from: long unsigned int /* 16(0) 4(0) */
to: long int /* 16(0) 4(0) */
1 struct changed
[acme@newtoy examples]$
Now on top of this lets delete the c member:
[acme@newtoy examples]$ codiff -V old_struct new_struct
struct.c:
struct foo | -4
nr_members: -1
-long unsigned int c; /* 8 4 */
a:2;
from: char /* 0(6) 1(2) */
to: char /* 0(4) 1(4) */
d;
from: long unsigned int /* 12(0) 4(0) */
to: long unsigned int /* 8(0) 4(0) */
e;
from: long unsigned int /* 16(0) 4(0) */
to: long int /* 12(0) 4(0) */
1 struct changed
[acme@newtoy examples]$
WOW, many changes, what an ABI breakage, no? :-)
It started as:
[acme@newtoy examples]$ pahole old_struct foo
/* /home/acme/pahole/examples/struct.c:3 */
struct foo {
char a:2; /* 0 1 */
/* XXX 3 bytes hole, try to pack */
unsigned int b; /* 4 4 */
long unsigned int c; /* 8 4 */
long unsigned int d; /* 12 4 */
long unsigned int e; /* 16 4 */
}; /* size: 20, sum members: 17, holes: 1, sum holes: 3 */
And ended up as:
[acme@newtoy examples]$ pahole new_struct foo
/* /home/acme/pahole/examples/struct.c:3 */
struct foo {
char a:4; /* 0 1 */
/* XXX 3 bytes hole, try to pack */
unsigned int b; /* 4 4 */
long unsigned int d; /* 8 4 */
long int e; /* 12 4 */
}; /* size: 16, sum members: 13, holes: 1, sum holes: 3 */
[acme@newtoy examples]$
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
First step:
Show if struct members were removed or added:
[acme@newtoy net-2.6.20]$ codiff -sV /tmp/ipv6.ko.before /tmp/ipv6.ko.after
<SNIP>
/pub/scm/linux/kernel/git/acme/net-2.6.20/net/ipv6/tcp_ipv6.c:
struct inet_sock | -4
nr_members: -1
struct inet_connection_sock | -4
struct tcp_sock | -4
struct tcp6_sock | -4
4 structs changed
<SNIP>
Oh, so struct inet_sock must be one of the members of the other structs that
haven't had changes in its number of members? Yes, this is the case :-)
Now lets see _which_ members were removed, added or had its type changed
causing a reduction in the struct size.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
prefcnt is a new tool to do reference counting on all the TAGs, starting
from the list of DW_TAG_subroutine tags and going down thru the return type,
parameter list types, variables and inline expansions in the functions, to
help finding unused stuff, its not so effective because of bugs in gcc
DWARF emitting code for concrete inline instances, i.e. the inline expansions
are not all being emitted, see:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29792
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
To represent DW_TAG_variable, for now all the variables in all the lexical
blocks, in addition to the top level function variables are in this list, next
step is to add support for DW_TAG_lexical_block, with support for nesting, and
to associate variables to the right place, be it the function itself (first,
implicit lexical block) or to the lexical blocks they belong too, this will be
useful for calculating stack usage.
So, with what we have now pfunct can do this:
[acme@newtoy guinea_pig-2.6]$ pfunct --variables net/ipv4/built-in.o tcp_v4_remember_stamp
/* net/ipv4/tcp_ipv4.c:1197 */
int tcp_v4_remember_stamp(struct sock * sk);
{
/* variables in tcp_v4_remember_stamp: */
struct inet_sock * inet;
struct tcp_sock * tp;
struct rtable * rt;
struct inet_peer * peer;
int release_it;
}
[acme@newtoy guinea_pig-2.6]$
That is already useful when you don't have the sources, huh? :-)
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Now we're able to process a kernel built with make allyesconfig on a machine
with 1GB, of course there is still more things to optimize, but I'm lazy and
for now this gives the numbers I wanted to get.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Its just totsz / nrexp, i.e. total size of expansions divided by the number of
expansions found, but helps in analising the data.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Not safe to directly point to the string parameter, probably its a inline
string, not an indirect one, one more reason to create a string table...
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
[acme@newtoy guinea_pig-2.6]$ pahole -t ../../acme/OUTPUT/qemu/net-2.6/vmlinux | sort -k2 -nr | head -5
list_head 468
__wait_queue_head 466
timespec 466
rw_semaphore 466
plist_head 466
[acme@newtoy guinea_pig-2.6]$
Which leads to another, more non-trivia question, what if a struct
definition is included but there are no references to this function?
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Now it shows the number that each of the inline functions were expanded in an
object file:
Top 10 inline functions expanded more than once in kernel/sched.o, by total
size of inline expansions:
[acme@newtoy guinea_pig-2.6]$ pfunct --cu_inline_expansions_stats kernel/sched.o | sort -k3 -nr | grep -v ': 1 ' | head -11
kernel/sched.c: 318 10217
get_current: 38 325
finish_task_switch: 2 238
normal_prio: 2 167
__cpus_and: 14 164
find_process_by_pid: 6 152
current_thread_info: 21 149
sched_find_first_bit: 2 148
update_cpu_clock: 2 140
task_rq_unlock: 14 137
variable_test_bit: 14 121
[acme@newtoy guinea_pig-2.6]$
Now we have these options:
[acme@newtoy guinea_pig-2.6]$ pfunct --help
usage: pfunct [options] <file_name> {<function_name>}
where:
-c, --class=<class> functions that have <class> pointer parameters
-g, --goto_labels show number of goto labels
-i, --show_inline_expansions show inline expansions
-C, --cu_inline_expansions_stats show CU inline expansions stats
-s, --sizes show size of functions
-N, --function_name_len show size of functions
-p, --nr_parameters show number or parameters
-S, --variables show number of variables
-V, --verbose be verbose
[acme@newtoy guinea_pig-2.6]$
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Top five object files (CU, Compilation Unit) per number of inline expansions,
vmlinux being dissected is one built for QEMU, most things as modules, that
are not being taken into account as we're only looking at vmlinux:
[acme@newtoy guinea_pig-2.6]$ pfunct -C ../../acme/OUTPUT/qemu/net-2.6/vmlinux | sort -k2 -nr | head -5 | cut -c40-
net/ipv4/tcp_input.c: 274 20655
fs/buffer.c: 272 4597
kernel/sched.c: 214 3549
kernel/signal.c: 196 2730
fs/ext3/inode.c: 191 7961
[acme@newtoy guinea_pig-2.6]$
Top five object files (CU, Compilation Unit) per total size of inline expansions:
[acme@newtoy guinea_pig-2.6]$ pfunct -C ../../acme/OUTPUT/qemu/net-2.6/vmlinux | sort -k3 -nr | head -5 | cut -c40-
net/ipv4/tcp_input.c: 274 20655
net/xfrm/xfrm_policy.c: 173 11511
kernel/module.c: 95 10826
drivers/char/vt.c: 91 10050
net/xfrm/xfrm_user.c: 150 9682
[acme@newtoy guinea_pig-2.6]$
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
For now we just discard it so that all the variables in a function can be
accounted, but the right thing, as said in the comment added in this cset is to
have a list of lexical blocks, each with a list of its variables so that we can
find the biggest stack usage in functions.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>