Yong Wang reported the following compiler warning:
builtin-report.c: In function 'process_overflow_event':
builtin-report.c:984: error: cast to pointer from integer of different size
Which happens because we try to print ->ips[] out with a limited
format, losing the high 32 bits. Print it out using %016Lx instead.
Reported-by: Yong Wang <yong.y.wang@linux.intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Take advantage of call-graph percounter sampling/recording to
display a non-trivial histogram: the true, collapsed/summarized
cost measurement, on a per system call total overhead basis:
aldebaran:~/linux/linux/tools/perf> ./perf record -g -a -f ~/hackbench 10
aldebaran:~/linux/linux/tools/perf> ./perf report -s symbol --syscalls | head -10
#
# (3536 samples)
#
# Overhead Symbol
# ........ ......
#
40.75% [k] sys_write
40.21% [k] sys_read
4.44% [k] do_nmi
...
This is done by accounting each (reliable) call-chain that chains back
to a given system call to that system call function.
[ So in the above example we can see that hackbench spends about 40% of
its total time somewhere in sys_write() and 40% somewhere in
sys_read(), the rest of the time is spent in user-space. The time
is not spent in sys_write() _itself_ but in one of its many child
functions. ]
Or, a recording of a (source files are already in the page-cache) kernel build:
$ perf record -g -m 512 -f -- make -j32 kernel
$ perf report -s s --syscalls | grep '\[k\]' | grep -v nmi
4.14% [k] do_page_fault
1.20% [k] sys_write
1.10% [k] sys_open
0.63% [k] sys_exit_group
0.48% [k] smp_apic_timer_interrupt
0.37% [k] sys_read
0.37% [k] sys_execve
0.20% [k] sys_mmap
0.18% [k] sys_close
0.14% [k] sys_munmap
0.13% [k] sys_poll
0.09% [k] sys_newstat
0.07% [k] sys_clone
0.06% [k] sys_newfstat
0.05% [k] sys_access
0.05% [k] schedule
Shows the true total cost of each syscall variant that gets used
during a kernel build. This profile reveals it that pagefaults are
the costliest, followed by read()/write().
An interesting detail: timer interrupts cost 0.5% - or 0.5 seconds
per 100 seconds of kernel build-time. (this was done with HZ=1000)
The summary is done in 'perf report', i.e. in the post-processing
stage - so once we have a good call-graph recording, this type of
non-trivial high-level analysis becomes possible.
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Recording with -a (or with -p) can race with tasks going away:
couldn't open /proc/8440/maps
Causing an early exit() and no recording done.
Do not abort the recording session - instead just skip that task.
Also, only print the warnings under -v.
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add the --repeat <n> feature to perf stat, which repeats a given
command up to a 100 times, collects the stats and calculates an
average and a stddev.
For example, the following oneliner 'perf stat' command runs hackbench
5 times and prints a tabulated result of all metrics, with averages
and noise levels (in percentage) printed:
aldebaran:~/linux/linux/tools/perf> ./perf stat --repeat 5 ~/hackbench 10
Time: 0.117
Time: 0.108
Time: 0.089
Time: 0.088
Time: 0.100
Performance counter stats for '/home/mingo/hackbench 10' (5 runs):
1243.989586 task-clock-msecs # 10.460 CPUs ( +- 4.720% )
47706 context-switches # 0.038 M/sec ( +- 19.706% )
387 CPU-migrations # 0.000 M/sec ( +- 3.608% )
17793 page-faults # 0.014 M/sec ( +- 0.354% )
3770941606 cycles # 3031.329 M/sec ( +- 4.621% )
1566372416 instructions # 0.415 IPC ( +- 2.703% )
16783421 cache-references # 13.492 M/sec ( +- 5.202% )
7128590 cache-misses # 5.730 M/sec ( +- 7.420% )
0.118924455 seconds time elapsed.
The goal of this feature is to allow the reliance on these accurate
statistics and to know how many times a command has to be repeated
for the noise to go down to an acceptable level.
(The -v option can be used to see a line printed out as each run progresses.)
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
- use IPC for the instruction normalization output
- CPUs for the CPU utilization factor value.
- print out time elapsed like the other rows
- tidy up the task-clocks/cpu-clocks printout
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
It's can be very annoying to scroll down perf annotated output
until we find relevant overhead.
Using the -l option, you can now have a small summary sorted per
overhead in the beginning of the output.
Example:
./perf annotate -l -k ../../vmlinux -s __lock_acquire
Sorted summary for file ../../vmlinux
----------------------------------------------
12.04 /home/fweisbec/linux/linux-2.6-tip/kernel/lockdep.c:1653
4.61 /home/fweisbec/linux/linux-2.6-tip/kernel/lockdep.c:1740
3.77 /home/fweisbec/linux/linux-2.6-tip/kernel/lockdep.c:1775
3.56 /home/fweisbec/linux/linux-2.6-tip/kernel/lockdep.c:1653
2.93 /home/fweisbec/linux/linux-2.6-tip/arch/x86/include/asm/irqflags.h:15
2.83 /home/fweisbec/linux/linux-2.6-tip/kernel/lockdep.c:2545
2.30 /home/fweisbec/linux/linux-2.6-tip/kernel/lockdep.c:2594
2.20 /home/fweisbec/linux/linux-2.6-tip/kernel/lockdep.c:2388
2.20 /home/fweisbec/linux/linux-2.6-tip/kernel/lockdep.c:730
2.09 /home/fweisbec/linux/linux-2.6-tip/kernel/lockdep.c:730
2.09 /home/fweisbec/linux/linux-2.6-tip/kernel/lockdep.c:138
1.88 /home/fweisbec/linux/linux-2.6-tip/kernel/lockdep.c:2548
1.47 /home/fweisbec/linux/linux-2.6-tip/arch/x86/include/asm/irqflags.h:15
1.36 /home/fweisbec/linux/linux-2.6-tip/kernel/lockdep.c:2594
1.36 /home/fweisbec/linux/linux-2.6-tip/kernel/lockdep.c:730
1.26 /home/fweisbec/linux/linux-2.6-tip/kernel/lockdep.c:1654
1.26 /home/fweisbec/linux/linux-2.6-tip/kernel/lockdep.c:1653
1.15 /home/fweisbec/linux/linux-2.6-tip/kernel/lockdep.c:2592
1.15 /home/fweisbec/linux/linux-2.6-tip/kernel/lockdep.c:1740
1.15 /home/fweisbec/linux/linux-2.6-tip/kernel/lockdep.c:1740
[...]
Only overhead over 0.5% are summarized.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1244844682-12928-2-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
When we have a colored line in perf annotate, ie a middle/high
overhead one, it's sometimes useful to get the matching line
and filename from the source file, especially this path prepares
to another subsequent one which will print a sorted summary of
midle/high overhead lines in the beginning of the output.
Filename:Lines have the same color than the concerned ip lines.
It can be slow because it relies on addr2line. We could also
use objdump with -l but that implies we would have to bufferize
objdump output and parse it to filter the relevant lines since
we want to print a sorted summary in the beginning.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1244844682-12928-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Help out arch porters who want to support perf counters by listing some
basic requirements.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1244827063-24046-1-git-send-email-vapier@gentoo.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Provide for means of extending the perf_counter_attr in a 'natural' way.
We allow growing the structure by appending fields at the end by specifying
the full structure size inside it.
When a new kernel sees a smaller (old) structure, it will 0 pad the tail.
When an old kernel sees a larger (new) structure, it will verify the tail
consists of 0s, otherwise fail.
If we fail due to a size-mismatch, we return -E2BIG and write the kernel's
native attribe size back into the provided structure.
Furthermore, add some attribute verification, so that we'll fail counter
creation when unknown bits are present (PERF_SAMPLE, PERF_FORMAT, or in
the __reserved fields).
(This ABI detail is introduced while keeping the existing syscall ABI.)
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Up until now record has worked on the assumption that type=0, config=0
was a suitable configuration - which it is. Lets make this a little more
explicit and more readable via the use of proper symbols.
[ Impact: cleanup ]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Otherwise all L1-instruction aliases will be recognized as
L1-data by strcasestr() when calling function parse_aliases.
Signed-off-by: Yong Wang <yong.y.wang@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <20090612031706.GA22126@ywang-moblin2.bj.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Pure renames only, to PERF_COUNT_HW_* and PERF_COUNT_SW_*.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
A build error slipped in:
builtin-report.c: In function ‘hist_entry__fprintf’:
builtin-report.c:711: error: format ‘%12d’ expects type ‘int’, but argument 3 has type ‘uint64_t’
Because we got a bit sloppy with those types. uint64_t really sucks,
because there's no printf format for it. So standardize on __u64
instead - for all types that go to or come from the ABI (which is __u64),
or for values that need to be large enough even on 32-bit.
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
When we use variable period sampling, add the period to the sample
data and use that to normalize the samples.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Currently report and stat catch SIGINT (and others) without altering
their exit state. This means that things like:
while :; do perf stat ./foo ; done
Loops become hard-to-interrupt, because bash never sees perf terminate
due to interruption. Fix this.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Create the counter in a disabled state and only enable it after we
mmap() the buffer, this allows us to see the first few samples (and
observe the frequency ramp).
Furthermore, print the period in the verbose report.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The rule is:
- high overhead: red
- mid overhead: green
- low overhead: normal (white/black)
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch adds support for profiling JIT generated code to 'perf
report'. A JIT compiler is required to generate a "/tmp/perf-$PID.map"
symbols map that is parsed when looking and displaying symbols.
Thanks to Peter Zijlstra for his help with this patch!
Example "perf report" output with the Jato JIT:
#
# (40311 samples)
#
# Overhead Command Shared Object Symbol
# ........ ................ ......................... ......
#
97.80% jato /tmp/perf-11915.map [.] Fibonacci.fib(I)I
0.56% jato 00000000b7fa023b 0x000000b7fa023b
0.45% jato /tmp/perf-11915.map [.] Fibonacci.main([Ljava/lang/String;)V
0.38% jato [kernel] [k] get_page_from_freelist
0.06% jato [kernel] [k] kunmap_atomic
0.05% jato ./jato [.] utf8Hash
0.04% jato ./jato [.] executeJava
0.04% jato ./jato [.] defineClass
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: a.p.zijlstra@chello.nl
Cc: acme@redhat.com
LKML-Reference: <Pine.LNX.4.64.0906082111590.12407@melkki.cs.Helsinki.FI>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Before:
$ perf report
failed to open file: No such file or directory
After:
$ perf report
failed to open file: perf.data (try 'perf record' first)
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
If perf is run on a !CONFIG_PERF_COUNTER kernel right now it
bails out with no messages or with confusing messages.
Standardize this case some more and explain the situation.
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
On architectures/CPUs without PMU support but with perfcounters
enabled 'perf record' currently fails because it cannot create a
cycle based hw-perfcounter.
Fall back to the cpu-clock-tick sw-perfcounter in this case, which
is hrtimer based and will always work (as long as perfcounters
are enabled).
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
On architectures/CPUs without PMU support but with perfcounters
enabled 'perf top' currently fails because it cannot create a
cycle based hw-perfcounter.
Fall back to the cpu-clock-tick sw-perfcounter in this case, which
is hrtimer based and will always work (as long as perfcounters
is enabled).
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Before:
$ perf stat ~/hackbench 5
error: syscall returned with -1 (No such device)
After:
$ perf stat ~/hackbench 5
Time: 1.640
Performance counter stats for '/home/mingo/hackbench 5':
6524.570382 task-clock-ticks # 3.838 CPU utilization factor
35704 context-switches # 0.005 M/sec
191 CPU-migrations # 0.000 M/sec
8958 page-faults # 0.001 M/sec
<not counted> cycles
<not counted> instructions
<not counted> cache-references
<not counted> cache-misses
Wall-clock time elapsed: 1699.999995 msecs
Also add -v (--verbose) option to allow the printing of failed
counter opens.
Plus dont print 'inf' if wall-time is zero (due to jiffies granularity),
instead skip the printing of the CPU utilization factor.
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The first snapshot reading often occur before any events have
been read in the mapped perfcounter files.
Just wait until we have at least one event before starting the
snapshot, or the delay before the first set of entries to be
displayed may be long in case of low refresh rate.
Note: we could also use a semaphore to wait before
"print_entries" number of eveents is reached, but again this
value is tunable and we can't ensure we will even reach it.
Also we could base on a default mimimum set of entries for the
first refresh, say 15, but again, the minimal sample is
tunable, and we could end up displaying nothing until we have a
minimal default set of events, which can take some time in case
of high samples filters.
Hence this simple solution which partially covers the default
case.
[ Impact: fix display artifacts in perf top ]
Signed-off-by: Frederic Weisbecker <fweisbeec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1244322643-6447-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Arjan noticed this bug in the perf annotate help output:
-s, --symbol <file> symbol to annotate
that should be <symbol> instead.
Reported-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
the "perf report" utility crashed in some circumstances
because the "sym" stack variable was not initialized before used
(as also proven by valgrind).
With this fix both the crash goes away and valgrind no longer complains.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Right now kernel debug info does not get resolved by default, because
we dont know where to look for the vmlinux.
The -k option can be used for that - but if no option is given, pick
up vmlinux files in the current directory - in case a kernel hacker
runs profiling from the source directory that the kernel was built in.
The real solution would be to embedd the location (and perhaps the
date/timestamp) of the vmlinux file in /proc/kallsyms, so that
tools can pick it up automatically.
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
gcc warned about this bug:
util/parse-events.c: In function ‘parse_generic_hw_symbols’:
util/parse-events.c:175: warning: comparison is always false due to limited range of data type
util/parse-events.c:182: warning: comparison is always false due to limited range of data type
util/parse-events.c:190: warning: comparison is always false due to limited range of data type
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Several people have suggested that 'perf' has become a full-fledged
tool that should be moved out of Documentation/. Move it to the
(new) tools/ directory.
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>