util/PERF-VERSION-GEN is currently executed on every build attempt,
and this script can take a lot of time on trees that are at a
significant git-distance from Linus's tree:
$ time util/PERF-VERSION-GEN
real 0m4.343s
user 0m4.176s
sys 0m0.140s
It also takes a lot of time if the Git repository is network attached, etc.,
because the commands it uses:
TAG=$(git describe --abbrev=0 --match "v[0-9].[0-9]*" 2>/dev/null )
has to count commits from the nearest tag and thus has to access (and
decompress) every git commit blob on the relevant version path.
Even on Linus's tree it takes 0.28 seconds on a fast box to count all the
commits and get the git version string:
$ time util/PERF-VERSION-GEN
real 0m0.279s
user 0m0.247s
sys 0m0.025s
But the version string only has to be regenerated if the git repository's
head commit changes. So add a dependency of ../../.git/HEAD and touch
the file every time it's regenerated, so that Make's build rules can
pick it up and cache the result:
make: `PERF-VERSION-FILE' is up to date.
real 0m0.184s
user 0m0.117s
sys 0m0.026s
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/n/tip-wvmlrurufuk6mo1ovtNigguT@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Concatenate all feature checks into test-all.c.
This can be built and checked faster than all the individual tests.
If test-all fails then we still check all the individual features, so
this is a pure speedup, it should have no effects on functionality.
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/n/tip-5hlcb2qorzwfwrWTjiygjjih@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The strlcpy() feature check slows every build unnecessarily - so make it
a __weak function so it does not have to be auto-detected.
If the libc (or any other library) has an strlcpy() implementation it will
be used - otherwise our fallback is active.
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/n/tip-zjbrcupapu08ePsyYhhhxiwk@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Use the standard CPP style we use in the kernel:
#ifndef foo
# define foo bar
#endif
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/n/tip-iqyVrrHqpn0eiwenvgwrh8lf@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Nest the rules properly. No change in functionality.
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/n/tip-jjlmizjmhockUs04wqnScnkl@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Nest the rules properly. No change in functionality.
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/n/tip-wwktuHl4Ra5lyrrretkxmxqf@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Nest the rules properly. No change in functionality.
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/n/tip-dDgivr9xtjrof2vmoyOfwxkj@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Use GCC's -MD feature to generate a dependency file for each feature test .c file,
and include that .d file in the config/feature-checks/Makefile.
This allows us to do two things:
- speed up feature tests
- detect removal or changes in build dependencies - including system libraries/headers
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/n/tip-Jfma8pmPnnqzpxjbs3hpgmsj@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Start the split-out of the feature check code by adding a list of features to be
tested, and rules to process that list by building its matching feature-check
file in config/feature-checks/test-<feature>.c.
Add 'hello' as the initial feature.
This structure will allow us to build split-out feature checks in parallel and
thus speed up feature detection dramatically.
No change in functionality: no feature check is used by the build rules yet.
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/n/tip-pixkihgscFaohfFigq5yt9gs@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
. The libaudit test was failing in some systems due to a unescaped newline, fix
it so that the 'trace' tool can be built in such systems.
. Fix installation of libexec components.
. Add default handler for mmap2 events so that tools that don't explicitely
define an MMAP2 handler don't crash, fix from David Ahern.
. Fix to find line information for probe list, from Masami Hiramatsu.
. Set child_pid after perf_evlist__prepare_workload(), fix from Namhyung Kim.
. Fix infinite loop on invalid perf.data file, from Namhyung Kim.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
iQIcBAABAgAGBQJSUwS+AAoJENZQFvNTUqpAib8P/0p5AXYrO+hBp9OOeFWleJYY
Nj0o6BOOJU3CvJJP140S4/fEHZ42lSbzivR6ounF4+DCCYmX/R24So3iyFxE7EQ5
i9WXpsIv7kOHUkigSMPocBa7+bLnR7OFhDQ1KKiCpQKq6LArRdioc4GJ9NQ/ERev
n26ArP1hzAM64vXI4xyT22AyuPmZiRFnmhUMwVIXiSLGezIWyPIzV6gdrEjwmPeA
2YFFTigFyfDC+w4/pLN/nOGKHY1kHiZo3LaFpTiqitgXSEqZJTugZJh1CPtZPMXU
pASxjvs/WnA8l5hpC7xX1pCMAjOLqvQ84BprglfuA2KVJ2CzacrJSMWEmJ5rnDCm
48YdAf5n+90Hjtg5Nue3GIjJErAhRLl32bAy2Oegrmp9eQF+dNQH/oTTGKqoFg8Q
QbMhqEUx4L6Dohw+8AfgM/NEmkL5V4va/vs0x3EDWJIdQOCl5woXqnvqvZ+eretb
rYdpTn53oLiW/WlESmcU3XkUbVJ9hvjdYPG9z9PtmpPrmOH1XQo+5QVnyrhaAl8x
7e9IPAZDXm4WnY3zUJHjmz36kBm9Aqq0hotiNUzSMX1zkOf39P3SAKWGgPUDGVrh
2Sz4oZVwzzogDlaRUE2XXa24bNt04jgDI/7KFVuhCwvn9WhHUPrllFXpTy+xQHpn
KzlbEvULc1pEFjFyBbiM
=OgDk
-----END PGP SIGNATURE-----
Merge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
* The libaudit test was failing in some systems due to a unescaped newline, fix
it so that the 'trace' tool can be built in such systems.
* Fix installation of libexec components.
* Add default handler for mmap2 events so that tools that don't explicitely
define an MMAP2 handler don't crash, fix from David Ahern.
* Fix to find line information for probe list, from Masami Hiramatsu.
* Set child_pid after perf_evlist__prepare_workload(), fix from Namhyung Kim.
* Fix infinite loop on invalid perf.data file, from Namhyung Kim.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
perf-record updates the header in the perf.data file at termination.
Without this update perf-report (and other processing built-ins) it
caused an infinite loop when perf report (or something like) called.
This is because the algorithm in __perf_session__process_events()
depends on the data_size which is read from file header. Use file size
directly instead in this case to do the best-effort processing.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: David Ahern <dsahern@gmail.com>
Tested-by: Sonny Rao <sonnyrao@chromium.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Sonny Rao <sonnyrao@chromium.org>
Link: http://lkml.kernel.org/r/1380529188-27193-1-git-send-email-namhyung@kernel.org
Signed-off-by: David Ahern <dsahern@gmail.com>
[ Reworded warning as per Ingo Molnar suggestion, replaces 'perf.data'
with session->filename, to precisely identify the data file involved ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Doing a fresh install on a user home directory needs to first make sure
that the ~/libexec/perf-core/ directory is present so that
'perf-archive' like scripts, 'perf test' attr config files and 'perf
script' scripts can be installed.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-z7ryi3r1b9dn9smbfnab0fdc@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix to find the correct (as much as possible) line information for
listing probes. Without this fix, perf probe --list action will show
incorrect line information as below;
probe:getname_flags (on getname_flags@ksrc/linux-3/fs/namei.c)
probe:getname_flags_1 (on getname:-89@x86/include/asm/current.h)
probe:getname_flags_2 (on user_path_at_empty:-2054@x86/include/asm/current.h)
The minus line number is obviously wrong, and current.h is not related
to the probe point. Deeper investigation discovered that there were 2
issues related to this bug, and minor typos too.
The 1st issue is the rack of considering about nested inlined functions,
which causes the wrong (relative) line number.
The 2nd issue is that the dwarf line info is not correct at those
points. It points 14th line of current.h.
Since it seems that the line info includes somewhat unreliable
information, this fixes perf to try to find correct line information
from both of debuginfo and line info as below.
1) Probe address is the entry of a function instance
In this case, the line is set as the function declared line.
2) Probe address is the entry of an expanded inline function block
In this case, the line is set as the function call-site line.
This means that the line number is relative from the entry line
of caller function (which can be an inlined function if nested)
3) Probe address is inside a function instance or an expanded
inline function block
In this case, perf probe queries the line number from lineinfo
and verify the function declared file is same as the file name
queried from lineinfo.
If the file name is different, it is a failure case. The probe
address is shown as symbol+offset.
4) Probe address is not in the any function instance
This is a failure case, the probe address is shown as
symbol+offset.
With this fix, perf probe -l shows correct probe lines as below;
probe:getname_flags (on getname_flags@ksrc/linux-3/fs/namei.c)
probe:getname_flags_1 (on getname:2@ksrc/linux-3/fs/namei.c)
probe:getname_flags_2 (on user_path_at_empty:4@ksrc/linux-3/fs/namei.c)
Changes at v2:
- Fix typos in the function comments. (Thanks to Namhyung Kim)
- Use die_find_top_inlinefunc instead of die_find_inlinefunc_next.
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20130930092144.1693.11058.stgit@udc4-manage.rcp.hitachi.co.jp
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In ubuntu systems the libaudit test was always failing due to the
newline in the printf call not being escaped, which somehow didn't
prevented the test from working as expected on other systems, such
as fedora18.
Fix it by removing the newline, as this is just a test, that program is
just a compile test.
The error messages, obtained using 'make V=1':
CHK libaudit
<stdin>: In function ‘main’:
<stdin>:5:9: error: missing terminating " character [-Werror]
<stdin>:5:2: error: missing terminating " character
<stdin>:6:1: error: missing terminating " character [-Werror]
<stdin>:6:1: error: missing terminating " character
<stdin>:7:2: error: expected expression before ‘return’
<stdin>:8:1: error: expected ‘;’ before ‘}’ token
cc1: all warnings being treated as errors
config/Makefile:241: No libaudit.h found, disables 'trace' tool, please install audit-libs-devel or libaudit-dev
After this change the test works as expected in all systems tested and the
'trace' tool is built when the needed devel packages are installed.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-0trw8qs9hafeopc0vj1sicay@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The commit acf2892270 ("perf stat: Use perf_evlist__prepare/
start_workload()") converted to use the function but forgot to update
child_pid. Fix it.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1380531671-28076-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Commands that do not implement an mmap2 handler should at least not die
with a segfault when processing files with MMAP2 events.
Signed-off-by: David Ahern <dsahern@gmail.com>
Link: http://lkml.kernel.org/r/1379900700-5186-5-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Haswell always give an extra LBR record after every TSX abort.
Suppress the extra record.
This only works when the abort is visible in the LBR
If the original abort has already left the 16 LBR entries
the extra entry will will stay.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1379688044-14173-7-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Add support for recording and displaying the transaction flags.
They are essentially a new sort key. Also display them
in a nice way to the user.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1379688044-14173-6-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Make perf record -j aware of the new in_tx,no_tx,abort_tx branch qualifiers.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1379688044-14173-5-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Extend the perf branch sorting code to support sorting by in_tx
or abort_tx qualifiers. Also print out those qualifiers.
This also fixes up some of the existing sort key documentation.
We do not support no_tx here, because it's simply not showing
the in_tx flag.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1379688044-14173-4-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
In the PEBS handler report the transaction flags using the new
generic transaction flags facility. Most of them come from
the "tsx_tuning" field in PEBSv2, but the abort code is derived
from the RAX register reported in the PEBS record.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1379688044-14173-3-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Add a generic qualifier for transaction events, as a new sample
type that returns a flag word. This is particularly useful
for qualifying aborts: to distinguish aborts which happen
due to asynchronous events (like conflicts caused by another
CPU) versus instructions that lead to an abort.
The tuning strategies are very different for those cases,
so it's important to distinguish them easily and early.
Since it's inconvenient and inflexible to filter for this
in the kernel we report all the events out and allow
some post processing in user space.
The flags are based on the Intel TSX events, but should be fairly
generic and mostly applicable to other HTM architectures too. In addition
to various flag words there's also reserved space to report an
program supplied abort code. For TSX this is used to distinguish specific
classes of aborts, like a lock busy abort when doing lock elision.
Flags:
Elision and generic transactions (ELISION vs TRANSACTION)
(HLE vs RTM on TSX; IBM etc. would likely only use TRANSACTION)
Aborts caused by current thread vs aborts caused by others (SYNC vs ASYNC)
Retryable transaction (RETRY)
Conflicts with other threads (CONFLICT)
Transaction write capacity overflow (CAPACITY WRITE)
Transaction read capacity overflow (CAPACITY READ)
Transactions implicitely aborted can also return an abort code.
This can be used to signal specific events to the profiler. A common
case is abort on lock busy in a RTM eliding library (code 0xff)
To handle this case we include the TSX abort code
Common example aborts in TSX would be:
- Data conflict with another thread on memory read.
Flags: TRANSACTION|ASYNC|CONFLICT
- executing a WRMSR in a transaction. Flags: TRANSACTION|SYNC
- HLE transaction in user space is too large
Flags: ELISION|SYNC|CAPACITY-WRITE
The only flag that is somewhat TSX specific is ELISION.
This adds the perf core glue needed for reporting the new flag word out.
v2: Add MEM/MISC
v3: Move transaction to the end
v4: Separate capacity-read/write and remove misc
v5: Remove _SAMPLE. Move abort flags to 32bit. Rename
transaction to txn
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1379688044-14173-2-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Add support to perf stat to print the basic transactional execution statistics:
Total cycles, Cycles in Transaction, Cycles in aborted transsactions
using the in_tx and in_tx_checkpoint qualifiers.
Transaction Starts and Elision Starts, to compute the average transaction
length.
This is a reasonable overview over the success of the transactions.
Also support architectures that have a transaction aborted cycles
counter like POWER8. Since that is awkward to handle in the kernel
abstract handle both cases here.
Enable with a new --transaction / -T option.
This requires measuring these events in a group, since they depend on each
other.
This is implemented by using TM sysfs events exported by the kernel
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Arnaldo Carvalho de Melo <acme@infradead.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1377128846-977-5-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
/proc/sys/kernel/perf_event_max_sample_rate will accept
negative values as well as 0.
Negative values are unreasonable, and 0 causes a
divide by zero exception in perf_proc_update_handler.
This patch enforces a lower limit of 1.
Signed-off-by: Knut Petersen <Knut_Petersen@t-online.de>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/5242DB0C.4070005@t-online.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Some of the node comparisons in hist.c dropped the upper
32bit by using an int variable to store the compare
result. This broke various 64bit fields, causing
incorrect collapsing (found for the TSX transaction field)
Just use int64_t always.
Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1380637335-30110-1-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>