79 Commits

Author SHA1 Message Date
Arnaldo Carvalho de Melo d2c9a9d726 [CLASSES]: Check if the last member had zero size before print cacheline boundary
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-11-05 18:30:10 -02:00
Arnaldo Carvalho de Melo 51c81fb099 [CLASSES]: namespace cleanups: just rename the classes__ with cu__
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-11-05 15:46:45 -02:00
Arnaldo Carvalho de Melo 633dd33a05 [PAHOLE]: Print cacheline boundaries
The cacheline size defaults to 32 bytes. Sample output with the size changed to 64 bytes:

pahole --cacheline=64 ../../acme/OUTPUT/qemu/net-2.6/net/ipv4/tcp.o inode

/* /pub/scm/linux/kernel/git/acme/net-2.6/include/linux/dcache.h:86 */
struct inode {
        struct hlist_node          i_hash;               /*     0     8 */
        struct list_head           i_list;               /*     8     8 */
        struct list_head           i_sb_list;            /*    16     8 */
        struct list_head           i_dentry;             /*    24     8 */
        long unsigned int          i_ino;                /*    32     4 */
        atomic_t                   i_count;              /*    36     4 */
        umode_t                    i_mode;               /*    40     2 */

        /* XXX 2 bytes hole, try to pack */

        unsigned int               i_nlink;              /*    44     4 */
        uid_t                      i_uid;                /*    48     4 */
        gid_t                      i_gid;                /*    52     4 */
        dev_t                      i_rdev;               /*    56     4 */
        loff_t                     i_size;               /*    60     8 */
        struct timespec            i_atime;              /*    68     8 */
        struct timespec            i_mtime;              /*    76     8 */
        struct timespec            i_ctime;              /*    84     8 */
        unsigned int               i_blkbits;            /*    92     4 */
        long unsigned int          i_version;            /*    96     4 */
        blkcnt_t                   i_blocks;             /*   100     4 */
        short unsigned int         i_bytes;              /*   104     2 */
        spinlock_t                 i_lock;               /*   106     0 */

        /* XXX 2 bytes hole, try to pack */

        struct mutex               i_mutex;              /*   108    24 */
        /* ---------- cacheline 2 boundary ---------- */
        struct rw_semaphore        i_alloc_sem;          /*   132    12 */
        struct inode_operations *  i_op;                 /*   144     4 */
        const struct file_operations  * i_fop;                /*   148     4 */
        struct super_block *       i_sb;                 /*   152     4 */
        struct file_lock *         i_flock;              /*   156     4 */
        struct address_space *     i_mapping;            /*   160     4 */
        struct address_space       i_data;               /*   164    72 */
        struct list_head           i_devices;            /*   236     8 */
        union                      ;                     /*   244     4 */
        int                        i_cindex;             /*   248     4 */
        __u32                      i_generation;         /*   252     4 */
        long unsigned int          i_dnotify_mask;       /*   256     4 */
        /* ---------- cacheline 4 boundary ---------- */
        struct dnotify_struct *    i_dnotify;            /*   260     4 */
        struct list_head           inotify_watches;      /*   264     8 */
        struct mutex               inotify_mutex;        /*   272    24 */
        long unsigned int          i_state;              /*   296     4 */
        long unsigned int          dirtied_when;         /*   300     4 */
        unsigned int               i_flags;              /*   304     4 */
        atomic_t                   i_writecount;         /*   308     4 */
        void *                     i_private;            /*   312     4 */
}; /* size: 316, sum members: 312, holes: 2, sum holes: 4 */

This still has to be improved to show the other cacheline boundaries, which may
be buried in an included struct or union.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-11-05 15:34:54 -02:00
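A minimal sketch of the arithmetic behind cacheline boundary reporting, assuming each member is described only by its byte offset and size. The names `member_layout` and `boundaries_reached` are hypothetical, and this is not the code pahole uses; it only shows how a boundary count could be derived from the offsets in a listing like the one above.

```c
#include <stddef.h>

/* Hypothetical description of one struct member (not a pahole type). */
struct member_layout {
    size_t offset; /* byte offset from the start of the struct */
    size_t size;   /* size in bytes, may be 0 */
};

/*
 * Count the cacheline boundaries b (multiples of cacheline_size) with
 * offset < b <= offset + size, i.e. the boundaries that the member
 * crosses or ends exactly on. A zero-sized member never reaches one.
 */
size_t boundaries_reached(const struct member_layout *m, size_t cacheline_size)
{
    if (cacheline_size == 0)
        return 0; /* avoid dividing by zero on a bogus setting */

    return (m->offset + m->size) / cacheline_size -
           m->offset / cacheline_size;
}
```

Under these assumptions and a 64-byte cacheline, a member at offset 108 with size 24 reaches one boundary (byte 128), while a zero-sized member at offset 106 reaches none.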
Arnaldo Carvalho de Melo 34b5f29576 [PAHOLE]: Add basic support for typedefs
[acme@newtoy guinea_pig-2.6]$ pahole mm/slab.o kmem_cache_t | head -6
/* include/linux/slab.h:12 */
struct kmem_cache {
        struct array_cache *array[8];    /*  0  32 */
        unsigned int        batchcount;  /* 32   4 */
        unsigned int        limit;       /* 36   4 */
        unsigned int        shared;      /* 40   4 */
[acme@newtoy guinea_pig-2.6]$ pahole --sizes fs/ext3/built-in.o | grep typedef | head -5
typedef pgd_t:struct(): 4 0
typedef pgprot_t:struct(): 4 0
typedef cpumask_t:struct(): 4 0
typedef mm_segment_t:struct(): 4 0
typedef raw_spinlock_t:struct(): 4 0
[acme@newtoy guinea_pig-2.6]$ pahole fs/ext3/built-in.o pgd_t
/* include/asm/page.h:57 */
struct  {
        long unsigned int          pgd;                  /*     0     4 */
}; /* size: 4 */

[acme@newtoy guinea_pig-2.6]$

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-11-05 02:17:19 -02:00
Arnaldo Carvalho de Melo 4f0c9ef164 [CLASSES]: Introduce struct variable
To represent DW_TAG_variable. For now all the variables in all the lexical
blocks, in addition to the top level function variables, are in this list. The
next step is to add support for DW_TAG_lexical_block, with support for nesting,
and to associate variables with the right place, be it the function itself (the
first, implicit lexical block) or the lexical blocks they belong to. This will
be useful for calculating stack usage.

So, with what we have now pfunct can do this:

[acme@newtoy guinea_pig-2.6]$ pfunct --variables net/ipv4/built-in.o tcp_v4_remember_stamp
/* net/ipv4/tcp_ipv4.c:1197 */
int tcp_v4_remember_stamp(struct sock * sk);

{
        /* variables in tcp_v4_remember_stamp: */
        struct inet_sock * inet;
        struct tcp_sock * tp;
        struct rtable * rt;
        struct inet_peer * peer;
        int release_it;
}
[acme@newtoy guinea_pig-2.6]$

That is already useful when you don't have the sources, huh? :-)

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-11-05 01:31:41 -02:00
Arnaldo Carvalho de Melo cfd870431f [CLASSES]: Upgrade all the types that are in uleb form to uint64_t
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-11-04 23:46:22 -03:00
Arnaldo Carvalho de Melo 2e195b9a4a [CLASSES]: Use tsearch to avoid duplicating strings
Now we're able to process a kernel built with make allyesconfig on a machine
with 1GB. Of course there are still more things to optimize, but I'm lazy and
for now this gives the numbers I wanted to get.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-11-04 17:37:23 -03:00
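A minimal sketch of string de-duplication with the POSIX `tsearch` family, in the spirit of the commit above. The names `strings_root`, `compare_strings` and `strings_intern` are hypothetical and the actual code in the repository may be organized differently.

```c
#define _XOPEN_SOURCE 700 /* for tsearch(), tfind() and strdup() */
#include <search.h>
#include <stdlib.h>
#include <string.h>

/* Root of the binary tree managed by tsearch(); NULL means empty. */
static void *strings_root;

/* Comparison callback: the tree keys are NUL-terminated strings. */
static int compare_strings(const void *a, const void *b)
{
    return strcmp(a, b);
}

/*
 * Return a shared copy of `s`. The first call for a given string
 * allocates a copy and stores it in the tree; later calls with equal
 * contents return the pointer to that same copy.
 * Returns NULL if memory allocation fails.
 */
const char *strings_intern(const char *s)
{
    /* Look first, so that no copy is made for a string already seen. */
    void *node = tfind(s, &strings_root, compare_strings);
    if (node != NULL)
        return *(const char **)node;

    char *copy = strdup(s);
    if (copy == NULL)
        return NULL;

    node = tsearch(copy, &strings_root, compare_strings);
    if (node == NULL) {
        free(copy);
        return NULL;
    }
    /* A tree node starts with the pointer to the key it stores. */
    return *(const char **)node;
}
```

Callers would then store the returned pointer in place of a private `strdup` copy, so that a name appearing in thousands of DWARF entries is kept in memory only once.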
Arnaldo Carvalho de Melo 8fcf984361 [CLASSES]: If a function size is 0, don't print the (useless) details
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-11-04 00:18:28 -03:00
Arnaldo Carvalho de Melo 31a5245c8c [CLASSES]: Use strdup for decl_file too
It is not safe to point directly to the string parameter: it is probably an
inline string, not an indirect one. One more reason to create a string table...

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-11-03 23:56:56 -03:00
Arnaldo Carvalho de Melo 2124d4f375 [PFUNCT]: Improve --cu_inline_expansions_stats
Now it shows the number of times each inline function was expanded in an
object file:

Top 10 inline functions expanded more than once in kernel/sched.o, by total
size of inline expansions:

[acme@newtoy guinea_pig-2.6]$ pfunct --cu_inline_expansions_stats kernel/sched.o | sort -k3 -nr | grep -v ': 1 ' | head -11
kernel/sched.c: 318 10217
get_current: 38 325
finish_task_switch: 2 238
normal_prio: 2 167
__cpus_and: 14 164
find_process_by_pid: 6 152
current_thread_info: 21 149
sched_find_first_bit: 2 148
update_cpu_clock: 2 140
task_rq_unlock: 14 137
variable_test_bit: 14 121
[acme@newtoy guinea_pig-2.6]$

Now we have these options:

[acme@newtoy guinea_pig-2.6]$ pfunct --help
usage: pfunct [options] <file_name> {<function_name>}
 where:
   -c, --class=<class>               functions that have <class> pointer parameters
   -g, --goto_labels                 show number of goto labels
   -i, --show_inline_expansions      show inline expansions
   -C, --cu_inline_expansions_stats  show CU inline expansions stats
   -s, --sizes                       show size of functions
   -N, --function_name_len           show size of functions
   -p, --nr_parameters               show number or parameters
   -S, --variables                   show number of variables
   -V, --verbose                     be verbose
[acme@newtoy guinea_pig-2.6]$

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-11-03 15:22:12 -03:00
Arnaldo Carvalho de Melo dcfe27a7ef [PFUNCT]: Do per CU inline statistics
Top five object files (CU, Compilation Unit) by number of inline expansions.
The vmlinux being dissected was built for QEMU with most things as modules,
which are not taken into account as we're only looking at vmlinux:

[acme@newtoy guinea_pig-2.6]$ pfunct -C ../../acme/OUTPUT/qemu/net-2.6/vmlinux | sort -k2 -nr | head -5 | cut -c40-
net/ipv4/tcp_input.c: 274 20655
fs/buffer.c: 272 4597
kernel/sched.c: 214 3549
kernel/signal.c: 196 2730
fs/ext3/inode.c: 191 7961
[acme@newtoy guinea_pig-2.6]$

Top five object files (CU, Compilation Unit) per total size of inline expansions:

[acme@newtoy guinea_pig-2.6]$ pfunct -C ../../acme/OUTPUT/qemu/net-2.6/vmlinux | sort -k3 -nr | head -5 | cut -c40-
net/ipv4/tcp_input.c: 274 20655
net/xfrm/xfrm_policy.c: 173 11511
kernel/module.c: 95 10826
drivers/char/vt.c: 91 10050
net/xfrm/xfrm_user.c: 150 9682
[acme@newtoy guinea_pig-2.6]$

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-11-03 14:38:43 -03:00
Johannes Berg 0330fb5d34 Corrects a few problems because dwarf libs use 64-bit types and we didn't.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-11-03 14:32:32 -03:00
Arnaldo Carvalho de Melo a42afe1acf [CLASSES]: Add support for DW_TAG_inlined_subroutine
Output of pfunct using this information (all for a make allyesconfig build):

Top 5 functions by size of inlined functions in net/ipv4:

[acme@newtoy guinea_pig-2.6]$ pfunct -I net/ipv4/built-in.o | sort -k3 -nr | head -5
ip_route_input: 19 7086
tcp_ack: 33 6415
do_ip_vs_set_ctl: 23 4193
q931_help: 8 3822
ip_defrag: 19 3318
[acme@newtoy guinea_pig-2.6]$

And by number of inline expansions:

[acme@newtoy guinea_pig-2.6]$ pfunct -I net/ipv4/built-in.o | sort -k2 -nr | head -5
dump_packet: 35 905
tcp_v4_rcv: 34 1773
tcp_recvmsg: 34 928
tcp_ack: 33 6415
tcp_rcv_established: 31 1195
[acme@newtoy guinea_pig-2.6]$

And the list of expansions on a specific function:

[acme@newtoy guinea_pig-2.6]$ pfunct -i net/ipv4/built-in.o tcp_v4_rcv
/* net/ipv4/tcp_ipv4.c:1054 */
int tcp_v4_rcv(struct sk_buff * skb);
/* size: 2189, variables: 8, goto labels: 6, inline expansions: 34 (1773 bytes) */

/* inline expansions in tcp_v4_rcv:
current_thread_info: 8
pskb_may_pull: 36
pskb_may_pull: 29
tcp_v4_checksum_init: 139
__fswab32: 2
__fswab32: 2
inet_iif: 12
__inet_lookup: 292
__fswab16: 20
inet_ehashfn: 25
inet_ehash_bucket: 18
prefetch: 4
prefetch: 4
prefetch: 4
sock_hold: 4
xfrm4_policy_check: 59
nf_reset: 66
sk_filter: 135
__skb_trim: 20
get_softnet_dma: 68
tcp_prequeue: 257
sk_add_backlog: 40
sock_put: 27
xfrm4_policy_check: 46
tcp_checksum_complete: 29
current_thread_info: 8
sock_put: 20
xfrm4_policy_check: 50
tcp_checksum_complete: 29
current_thread_info: 8
inet_iif: 9
inet_lookup_listener: 36
inet_twsk_put: 114
tcp_v4_timewait_ack: 153
*/
[acme@newtoy guinea_pig-2.6]$

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-11-03 12:41:19 -03:00
Arnaldo Carvalho de Melo 7896f67c95 [CLASSES]: Initial support for DW_TAG_lexical_block
For now we just discard it so that all the variables in a function can be
accounted for, but the right thing, as said in the comment added in this cset,
is to have a list of lexical blocks, each with a list of its variables, so that
we can find the biggest stack usage in functions.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-11-02 15:53:39 -03:00
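A minimal sketch of the "list of lexical blocks" idea described in the commit above, assuming that sibling blocks can reuse the same stack space while a nested block adds to its parent. The `lexblock` type and `lexblock_peak_stack` function are hypothetical; a real compiler may lay out the stack differently, so the result is only an estimate.

```c
#include <stddef.h>

/* Hypothetical model of a lexical block (not a type from the tool). */
struct lexblock {
    size_t vars_size;                /* bytes of variables declared directly here */
    const struct lexblock *children; /* array of nested blocks, may be NULL */
    size_t nr_children;
};

/*
 * Estimated peak stack usage of a block: its own variables plus the
 * most expensive nested block. Siblings are assumed not to be live at
 * the same time, so only the largest one counts.
 */
size_t lexblock_peak_stack(const struct lexblock *block)
{
    size_t worst_child = 0;

    for (size_t i = 0; i < block->nr_children; i++) {
        size_t child = lexblock_peak_stack(&block->children[i]);

        if (child > worst_child)
            worst_child = child;
    }
    return block->vars_size + worst_child;
}
```

The function itself would be the outermost (implicit) block, so calling `lexblock_peak_stack` on it gives the estimate for the whole function.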
Arnaldo Carvalho de Melo 97385598a6 [CLASSES]: Use strdup for the ->name members
This reduces the memory footprint, but more has to be done, such as taking
advantage of the string table when handling indirect strings.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-11-02 13:48:35 -03:00
Arnaldo Carvalho de Melo 09bb5df8aa [PFUNCT]: Implement --nr_parameters
[acme@newtoy net-2.6]$ pfunct --nr_parameters vmlinux | sort -k 2 -nr | head -5
__ide_add_setting: 13
ide_add_setting: 12
fib_dump_info: 12
__blockdev_direct_IO: 10
vma_merge: 9
[acme@newtoy net-2.6]$

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-11-01 10:34:42 -03:00
Arnaldo Carvalho de Melo eff5348081 Handle the DW_TAG_variable and DW_TAG_label tags, now we can do:
[acme@newtoy net-2.6]$ pfunct --variables vmlinux | sort -k 2 -nr | head -5
do_task_stat: 29
load_elf_binary: 28
elf_core_dump: 23
ext3_new_blocks: 21
sys_unshare: 19
[acme@newtoy net-2.6]$

And:

[acme@newtoy net-2.6]$ pfunct --goto_labels vmlinux | sort -k 2 -nr | head -5
copy_process: 16
sys_unshare: 10
device_add: 9
class_device_add: 8
tcp_sendmsg: 7
[acme@newtoy net-2.6]$

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-11-01 10:18:01 -03:00
Arnaldo Carvalho de Melo 5aaca80de6 [CLASSES]: Rework the find_by routines
So that we can find all the CUs for some specific class
(cus__find_class_by_name), or traverse all the CUs (cus__for_each_cu),
etc.

Now we don't look at just the first CU in multi-CU files (vmlinux, etc).

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-10-31 17:23:16 -03:00
Arnaldo Carvalho de Melo 0ca9826e36 Introduce struct cu, i.e. a per compilation unit struct that holds the list of
types for each CU. For now, when working on multi-CU files (vmlinux, any binary
with more than one object file linked), we look only at the first CU when
looking for a specific class or function name. This will be fixed in the
upcoming csets, and it doesn't affect the case where no class or function name
is specified, in which all the CUs are traversed.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-10-31 16:12:42 -03:00
Arnaldo Carvalho de Melo 19189046aa Use : as separator for decl_file:decl_line, suggested by martin
on irc 8)

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-10-30 15:24:32 -03:00
Arnaldo Carvalho de Melo 042c51dc1f Support DW_AT_low_pc and DW_AT_high_pc, now pfunct is able to do this:
[acme@newtoy net-2.6]$ pfunct kernel/sched.o schedule
/* /pub/scm/linux/kernel/git/acme/net-2.6/kernel/sched.c 3317 */
void schedule(void);
/* size: 1492 */

Cute, huh? :-)

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-10-30 14:22:39 -03:00
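A minimal sketch of how a function size can be derived from the two attributes named in the commit above, assuming DW_AT_high_pc holds an address (the first byte past the function), as in DWARF 2 and 3. The `function_range` type and `function_size` helper are hypothetical names used only for illustration.

```c
#include <stdint.h>

/* Hypothetical holder for the two attribute values of a function. */
struct function_range {
    uint64_t low_pc;  /* address of the first instruction */
    uint64_t high_pc; /* address just past the last instruction */
};

/*
 * Size in bytes of the code covered by the range. Returns 0 when the
 * attributes are missing or inconsistent (high_pc <= low_pc), which is
 * the case for functions that have no code of their own.
 */
uint64_t function_size(const struct function_range *f)
{
    if (f->high_pc <= f->low_pc)
        return 0;
    return f->high_pc - f->low_pc;
}
```

For example, a range starting at 0x1000 and ending at 0x15d4 gives 1492 bytes, the size reported for schedule() in the output above (the addresses themselves are made up).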
Arnaldo Carvalho de Melo f204c370ed Move the attr functions that are needed only by class__new or class_member__new
to where they are needed.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-10-30 13:37:04 -03:00
Arnaldo Carvalho de Melo 6b32c8362b Introduce classes__for_each, that receives an iterator function and a cookie,
so that one can traverse all the classes loaded by classes__load.

Also export classes__find_by_id().

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-10-28 23:55:56 -03:00
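A minimal sketch of the "iterator function plus cookie" pattern that the commit above describes. The real signature of classes__for_each is not shown in the log, so `class_entry`, `class_list__for_each` and `sum_sizes` are hypothetical stand-ins.

```c
#include <stddef.h>

/* Hypothetical, simplified class record kept in a singly linked list. */
struct class_entry {
    const char *name;
    size_t size;
    struct class_entry *next;
};

/* Iterator callback: return non-zero to stop the traversal early. */
typedef int (*class_iterator_fn)(struct class_entry *cls, void *cookie);

/*
 * Call `iterator` for each entry, handing it the caller's opaque
 * `cookie`. Returns 0 after a full traversal, or the first non-zero
 * value returned by the iterator.
 */
int class_list__for_each(struct class_entry *head,
                         class_iterator_fn iterator, void *cookie)
{
    for (struct class_entry *pos = head; pos != NULL; pos = pos->next) {
        int ret = iterator(pos, cookie);

        if (ret != 0)
            return ret;
    }
    return 0;
}

/* Example iterator: the cookie points to a size_t running total. */
int sum_sizes(struct class_entry *cls, void *cookie)
{
    *(size_t *)cookie += cls->size;
    return 0;
}
```

The cookie lets each tool keep its own state (a counter, a search key, an output stream) without the traversal code knowing about it.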
Arnaldo Carvalho de Melo 67b12e237c Support DW_AT_inline, that only makes sense on functions, where now we
see that the function was indeed inlined:

[acme@newtoy net-2.6]$ pfunct kernel/sched.o task_running
/* /pub/scm/linux/kernel/git/acme/net-2.6/kernel/sched.c 304 */
inline int task_running(struct rq * rq, struct task_struct * p);

[acme@newtoy net-2.6]$

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-10-28 21:40:35 -03:00
Arnaldo Carvalho de Melo d23a3a6e64 Print the decl file and line at class__print, not at class__print_struct, that
way we get it for free for the upcoming types, as class__print_function
now does.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-10-28 19:13:27 -03:00
Arnaldo Carvalho de Melo d0df41c935 Support DW_TAG_formal_parameter, that gets into the struct class members list,
i.e. the list of parameters of a function (DW_TAG_subprogram).

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-10-28 19:04:40 -03:00
Arnaldo Carvalho de Melo f9ed05bd42 Initial implementation of class__print_function, next step is to get the list
of DW_TAG_formal_parameter entries and then print it.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-10-28 18:49:27 -03:00
Arnaldo Carvalho de Melo e61005ee82 Only structs have holes, not, for instance, DW_TAG_subprogram entries,
aka functions.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-10-28 18:37:38 -03:00
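A minimal sketch of what a "hole" means for struct members, assuming members are sorted by offset and described only by offset and size. `member_span`, `hole_after` and `sum_holes` are hypothetical names, not functions from the tool.

```c
#include <stddef.h>

/* Hypothetical member description: byte offset and size in the struct. */
struct member_span {
    size_t offset;
    size_t size;
};

/*
 * Bytes of padding between members[i] and the member that follows it.
 * Returns 0 for the last member or when the two are contiguous.
 */
size_t hole_after(const struct member_span *members, size_t nr, size_t i)
{
    if (i + 1 >= nr)
        return 0;

    size_t end = members[i].offset + members[i].size;

    return members[i + 1].offset > end ? members[i + 1].offset - end : 0;
}

/* Total padding between consecutive members of the struct. */
size_t sum_holes(const struct member_span *members, size_t nr)
{
    size_t total = 0;

    for (size_t i = 0; i < nr; i++)
        total += hole_after(members, nr, i);
    return total;
}
```

A 2-byte member at offset 40 followed by a member at offset 44 leaves a 2-byte hole. Function parameters have no offsets of this kind, which is why the check only makes sense for structs.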
Arnaldo Carvalho de Melo 35e87417f9 Move the classes methods out of pahole.c and into classes.c,
that will be used by other new dwarves 8)

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-10-28 18:22:42 -03:00