[ Upstream commit 1beaef29c3 ]
For memcpy, the source pages are memset to zero only when --cycles is
used. This leads to wildly different results with or without --cycles,
since all sources pages are likely to be mapped to the same zero page
without explicit writes.
Before this fix:
$ export cmd="./perf stat -e LLC-loads -- ./perf bench \
mem memcpy -s 1024MB -l 100 -f default"
$ $cmd
2,935,826 LLC-loads
3.821677452 seconds time elapsed
$ $cmd --cycles
217,533,436 LLC-loads
8.616725985 seconds time elapsed
After this fix:
$ $cmd
214,459,686 LLC-loads
8.674301124 seconds time elapsed
$ $cmd --cycles
214,758,651 LLC-loads
8.644480006 seconds time elapsed
Fixes: 47b5757bac ("perf bench mem: Move boilerplate memory allocation to the infrastructure")
Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel@axis.com
Link: http://lore.kernel.org/lkml/20200810133404.30829-1-vincent.whitchurch@axis.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 401136bb08 upstream.
While walking code towards a FUP ip, the packet state is
INTEL_PT_STATE_FUP or INTEL_PT_STATE_FUP_NO_TIP. That was mishandled
resulting in the state becoming INTEL_PT_STATE_IN_SYNC prematurely. The
result was an occasional lost EXSTOP event.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200710151104.15137-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e4d9b04b97 upstream.
Noticed with gcc 10 (fedora rawhide) that those variables were not being
declared as static, so end up with:
ld: /tmp/build/perf/bench/epoll-wait.o:/git/perf/tools/perf/bench/epoll-wait.c:93: multiple definition of `end'; /tmp/build/perf/bench/futex-hash.o:/git/perf/tools/perf/bench/futex-hash.c:40: first defined here
ld: /tmp/build/perf/bench/epoll-wait.o:/git/perf/tools/perf/bench/epoll-wait.c:93: multiple definition of `start'; /tmp/build/perf/bench/futex-hash.o:/git/perf/tools/perf/bench/futex-hash.c:40: first defined here
ld: /tmp/build/perf/bench/epoll-wait.o:/git/perf/tools/perf/bench/epoll-wait.c:93: multiple definition of `runtime'; /tmp/build/perf/bench/futex-hash.o:/git/perf/tools/perf/bench/futex-hash.c:40: first defined here
ld: /tmp/build/perf/bench/epoll-ctl.o:/git/perf/tools/perf/bench/epoll-ctl.c:38: multiple definition of `end'; /tmp/build/perf/bench/futex-hash.o:/git/perf/tools/perf/bench/futex-hash.c:40: first defined here
ld: /tmp/build/perf/bench/epoll-ctl.o:/git/perf/tools/perf/bench/epoll-ctl.c:38: multiple definition of `start'; /tmp/build/perf/bench/futex-hash.o:/git/perf/tools/perf/bench/futex-hash.c:40: first defined here
ld: /tmp/build/perf/bench/epoll-ctl.o:/git/perf/tools/perf/bench/epoll-ctl.c:38: multiple definition of `runtime'; /tmp/build/perf/bench/futex-hash.o:/git/perf/tools/perf/bench/futex-hash.c:40: first defined here
make[4]: *** [/git/perf/tools/build/Makefile.build:145: /tmp/build/perf/bench/perf-in.o] Error 1
Prefix those with bench__ and add them to bench/bench.h, so that we can
share those on the tools needing to access those variables from signal
handlers.
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/20200303155811.GD13702@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ebcb9464a2 upstream.
It is possible to return a pointer to a local variable when looking up
the architecture name for the running system and no normalization is
done on that value, i.e. we may end up returning the uts.machine local
variable.
While this doesn't happen on most arches, as normalization takes place,
lets fix this by making that a static variable and optimize it a bit by
not always running uname(), only the first time.
Noticed in fedora rawhide running with:
[perfbuilder@a5ff49d6e6e4 ~]$ gcc --version
gcc (GCC) 10.0.1 20200216 (Red Hat 10.0.1-0.8)
Reported-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cff20b3151 upstream.
To fix the build with newer gccs, that without this patch exit with:
LD /tmp/build/perf/tests/perf-in.o
ld: /tmp/build/perf/tests/bp_account.o:/git/perf/tools/perf/tests/bp_account.c:22: multiple definition of `the_var'; /tmp/build/perf/tests/bp_signal.o:/git/perf/tools/perf/tests/bp_signal.c:38: first defined here
make[4]: *** [/git/perf/tools/build/Makefile.build:145: /tmp/build/perf/tests/perf-in.o] Error 1
First noticed in fedora:rawhide/32 with:
[perfbuilder@a5ff49d6e6e4 ~]$ gcc --version
gcc (GCC) 10.0.1 20200216 (Red Hat 10.0.1-0.8)
Reported-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit bd3c628f8f ]
When recording with cache-misses and arm_spe_x event, I found that it
will just fail without showing any error info if i put cache-misses
after 'arm_spe_x' event.
[root@localhost 0620]# perf record -e cache-misses \
-e arm_spe_0/ts_enable=1,pct_enable=1,pa_enable=1,load_filter=1,jitter=1,store_filter=1,min_latency=0/ sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.067 MB perf.data ]
[root@localhost 0620]#
[root@localhost 0620]# perf record -e arm_spe_0/ts_enable=1,pct_enable=1,pa_enable=1,load_filter=1,jitter=1,store_filter=1,min_latency=0/ \
-e cache-misses sleep 1
[root@localhost 0620]#
The current code can only work if the only event to be traced is an
'arm_spe_x', or if it is the last event to be specified. Otherwise the
last event type will be checked against all the arm_spe_pmus[i]->types,
none will match and an out of bound 'i' index will be used in
arm_spe_recording_init().
We don't support concurrent multiple arm_spe_x events currently, that
is checked in arm_spe_recording_options(), and it will show the relevant
info. So add the check and record of the first found 'arm_spe_pmu' to
fix this issue here.
Fixes: ffd3d18c20 ("perf tools: Add ARM Statistical Profiling Extensions (SPE) support")
Signed-off-by: Wei Li <liwei391@huawei.com>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Tested-by-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20200724071111.35593-2-liwei391@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 0e0bf1ea11 upstream.
As the code comments in perf_stat_process_counter() say, we calculate
counter's data every interval, and the display code shows ps->res_stats
avg value. We need to zero the stats for interval mode.
But the current code only zeros the res_stats[0], it doesn't zero the
res_stats[1] and res_stats[2], which are for ena and run of counter.
This patch zeros the whole res_stats[] for interval mode.
Fixes: 51fd2df1e8 ("perf stat: Fix interval output values")
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200409070755.17261-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3a3cf7c570 upstream.
Using Python version 3.8.2 and PySide2 version 5.14.0, ctrl-F ('Find')
would not expand the tree to the result. Fix by using setExpanded().
Example:
$ perf record -e intel_pt//u uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.034 MB perf.data ]
$ perf script --itrace=bep -s ~/libexec/perf-core/scripts/python/export-to-sqlite.py perf.data.db branches calls
2020-06-26 15:32:14.928997 Creating database ...
2020-06-26 15:32:14.933971 Writing records...
2020-06-26 15:32:15.535251 Adding indexes
2020-06-26 15:32:15.542993 Dropping unused tables
2020-06-26 15:32:15.549716 Done
$ python3 ~/libexec/perf-core/scripts/python/exported-sql-viewer.py perf.data.db
Select: Reports -> Context-Sensitive Call Graph or Reports -> Call Tree
Press: Ctrl-F
Enter: main
Press: Enter
Before: line showing 'main' does not display
After: tree is expanded to line showing 'main'
Fixes: ebd70c7dc2 ("perf scripts python: exported-sql-viewer.py: Add ability to find symbols in the call-graph")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200629091955.17090-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 031c8d5edb upstream.
Using ctrl-F ('Find') would not find 'unknown' because it matches id
zero. Fix by excluding id zero from selection.
Example:
$ perf record -e intel_pt//u uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.034 MB perf.data ]
$ perf script --itrace=bep -s ~/libexec/perf-core/scripts/python/export-to-sqlite.py perf.data.db branches calls
2020-06-26 15:32:14.928997 Creating database ...
2020-06-26 15:32:14.933971 Writing records...
2020-06-26 15:32:15.535251 Adding indexes
2020-06-26 15:32:15.542993 Dropping unused tables
2020-06-26 15:32:15.549716 Done
$ python3 ~/libexec/perf-core/scripts/python/exported-sql-viewer.py perf.data.db
Select: Reports -> Call Tree
Press: Ctrl-F
Enter: unknown
Press: Enter
Before: displays 'unknown' not found
After: tree is expanded to line showing 'unknown'
Fixes: ae8b887c00 ("perf scripts python: exported-sql-viewer.py: Add call tree")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200629091955.17090-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4c95ad261c ]
The condition to add XMM registers was missing, the regs array needed to
be in the outer scope, and the size of the regs array was too small.
Fixes: 143d34a6b3 ("perf intel-pt: Add XMM registers to synthesized PEBS sample")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Luwei Kang <luwei.kang@intel.com>
Link: http://lore.kernel.org/lkml/20200630133935.11150-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 75bcb8776d ]
When recording PEBS-via-PT, the kernel will not accept the intel_pt
event with register sampling e.g.
# perf record --kcore -c 10000 -e '{intel_pt/branch=0/,branch-loads/aux-output/ppp}' -I -- ls -l
Error:
intel_pt/branch=0/: PMU Hardware doesn't support sampling/overflow-interrupts. Try 'perf stat'
Fix by suppressing register sampling on the intel_pt evsel.
Committer notes:
Adrian informed that this is only available from Tremont onwards, so on
older processors the error continues the same as before.
Fixes: 9e64cefe43 ("perf intel-pt: Process options for PEBS event synthesis")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Luwei Kang <luwei.kang@intel.com>
Link: http://lore.kernel.org/lkml/20200630133935.11150-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d61cbb859b ]
The segmentation fault can be reproduced as following steps:
1) Executing perf report in tui.
2) Typing '/xxxxx' to filter the symbol to get nothing matched.
3) Pressing enter with no entry selected.
Then it will report a segmentation fault.
It is caused by the lack of check of browser->he_selection when
accessing it's member res_samples in perf_evsel__hists_browse().
These processes are meaningful for specified samples, so we can skip
these when nothing is selected.
Fixes: 4968ac8fb7 ("perf report: Implement browsing of individual samples")
Signed-off-by: Wei Li <liwei391@huawei.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Link: http://lore.kernel.org/lkml/20200612094322.39565-1-liwei391@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c0c652fc70 ]
If config->aggr_map is NULL and config->aggr_get_id is not NULL,
the function print_aggr() will still calling arrg_update_shadow(),
which can result in accessing the invalid pointer.
Fixes: 088519f318 ("perf stat: Move the display functions to stat-display.c")
Signed-off-by: Hongbo Yao <yaohongbo@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wei Li <liwei391@huawei.com>
Link: https://lore.kernel.org/lkml/20200608163625.GC3073@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 11b6e5482e ]
The 'evname' variable can be NULL, as it is checked a few lines back,
check it before using.
Fixes: 9e207ddfa2 ("perf report: Show call graph from reference events")
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/
Signed-off-by: Gaurav Singh <gaurav1086@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 2ae5d0d7d8 upstream.
Since commit 03db8b583d ("perf tools: Fix
maps__find_symbol_by_name()") introduced map address range check in
maps__find_symbol_by_name(), we can not get "_etext" from kernel map
because _etext is placed on the edge of the kernel .text section (=
kernel map in perf.)
To fix this issue, this checks the address correctness by map address
range information (map->start and map->end) instead of using _etext
address.
This can cause an error if the target inlined function is embedded in
both __init function and normal function.
For exaample, request_resource() is a normal function but also embedded
in __init reserve_setup(). In this case, the probe point in
reserve_setup() must be skipped.
However, without this fix, it failes to setup all probe points:
# ./perf probe -v request_resource
probe-definition(0): request_resource
symbol:request_resource file:(null) line:0 offset:0 return:0 lazy:(null)
0 arguments
Looking at the vmlinux_path (8 entries long)
Using /usr/lib/debug/lib/modules/5.5.17-200.fc31.x86_64/vmlinux for symbols
Open Debuginfo file: /usr/lib/debug/lib/modules/5.5.17-200.fc31.x86_64/vmlinux
Try to find probe point from debuginfo.
Matched function: request_resource [15e29ad]
found inline addr: 0xffffffff82fbf892
Probe point found: reserve_setup+204
found inline addr: 0xffffffff810e9790
Probe point found: request_resource+0
Found 2 probe_trace_events.
Opening /sys/kernel/debug/tracing//kprobe_events write=1
Opening /sys/kernel/debug/tracing//README write=0
Writing event: p:probe/request_resource _text+33290386
Failed to write event: Invalid argument
Error: Failed to add events. Reason: Invalid argument (Code: -22)
#
With this fix,
# ./perf probe request_resource
reserve_setup is out of .text, skip it.
Added new events:
(null):(null) (on request_resource)
probe:request_resource (on request_resource)
You can now use it in all perf tools, such as:
perf record -e probe:request_resource -aR sleep 1
#
Fixes: 03db8b583d ("perf tools: Fix maps__find_symbol_by_name()")
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/158763967332.30755.4922496724365529088.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 80526491c2 upstream.
Fix to check kprobe blacklist address correctly with relocated address
by adjusting debuginfo address.
Since the address in the debuginfo is same as objdump, it is different
from relocated kernel address with KASLR. Thus, 'perf probe' always
misses to catch the blacklisted addresses.
Without this patch, 'perf probe' can not detect the blacklist addresses
on a KASLR enabled kernel.
# perf probe kprobe_dispatcher
Failed to write event: Invalid argument
Error: Failed to add events.
#
With this patch, it correctly shows the error message.
# perf probe kprobe_dispatcher
kprobe_dispatcher is blacklisted function, skip it.
Probe point 'kprobe_dispatcher' not found.
Error: Failed to add events.
#
Fixes: 9aaf5a5f47 ("perf probe: Check kprobes blacklist when adding new events")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/158763966411.30755.5882376357738273695.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f41ebe9def upstream.
When a probe point is expanded to several places (like inlined) and if
some of them are skipped because of blacklisted or __init function,
those trace_events has no event name. It must be skipped while showing
results.
Without this fix, you can see "(null):(null)" on the list,
# ./perf probe request_resource
reserve_setup is out of .text, skip it.
Added new events:
(null):(null) (on request_resource)
probe:request_resource (on request_resource)
You can now use it in all perf tools, such as:
perf record -e probe:request_resource -aR sleep 1
#
With this fix, it is ignored:
# ./perf probe request_resource
reserve_setup is out of .text, skip it.
Added new events:
probe:request_resource (on request_resource)
You can now use it in all perf tools, such as:
perf record -e probe:request_resource -aR sleep 1
#
Fixes: 5a51fcd1f3 ("perf probe: Skip kernel symbols which is out of .text")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/158763968263.30755.12800484151476026340.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c6aab66a72 ]
Since the commit 6a13a0d7b4 ("ftrace/kprobe: Show the maxactive number
on kprobe_events") introduced to show the instance number of kretprobe
events, the length of the 1st format of the kprobe event will not 1, but
it can be longer. This caused a parser error in perf-probe.
Skip the length check the 1st format of the kprobe event to accept this
instance number.
Without this fix:
# perf probe -a vfs_read%return
Added new event:
probe:vfs_read__return (on vfs_read%return)
You can now use it in all perf tools, such as:
perf record -e probe:vfs_read__return -aR sleep 1
# perf probe -l
Semantic error :Failed to parse event name: r16:probe/vfs_read__return
Error: Failed to show event list.
And with this fixes:
# perf probe -a vfs_read%return
...
# perf probe -l
probe:vfs_read__return (on vfs_read%return)
Fixes: 6a13a0d7b4 ("ftrace/kprobe: Show the maxactive number on kprobe_events")
Reported-by: Yuxuan Shui <yshuiv7@gmail.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Yuxuan Shui <yshuiv7@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: stable@vger.kernel.org
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=207587
Link: http://lore.kernel.org/lkml/158877535215.26469.1113127926699134067.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit c3b10649a8 upstream.
Previously we could get the report of branch type statistics.
For example:
# perf record -j any,save_type ...
# t perf report --stdio
#
# Branch Statistics:
#
COND_FWD: 40.6%
COND_BWD: 4.1%
CROSS_4K: 24.7%
CROSS_2M: 12.3%
COND: 44.7%
UNCOND: 0.0%
IND: 6.1%
CALL: 24.5%
RET: 24.7%
But now for the recent perf, it can't report the branch type statistics.
It's a regression issue caused by commit 40c39e3046 ("perf report: Fix
a no annotate browser displayed issue"), which only counts the branch
type statistics for browser mode.
This patch moves the branch_type_count() outside of ui__has_annotation()
checking, then branch type statistics can work for stdio mode.
Fixes: 40c39e3046 ("perf report: Fix a no annotate browser displayed issue")
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200313134607.12873-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b9c9ce4e59 upstream.
Python 3.8 changed the output of 'python-config --ldflags' to no longer
include the '-lpythonX.Y' flag (this apparently fixed an issue loading
modules with a statically linked Python executable). The libpython
feature check in linux/build/feature fails if the Python library is not
included in FEATURE_CHECK_LDFLAGS-libpython variable.
This adds a check in the Makefile to determine if PYTHON_CONFIG accepts
the '--embed' flag and passes that flag alongside '--ldflags' if so.
tools/perf is the only place the libpython feature check is used.
Signed-off-by: Sam Lunt <samuel.j.lunt@gmail.com>
Tested-by: He Zhe <zhe.he@windriver.com>
Link: http://lore.kernel.org/lkml/c56be2e1-8111-9dfe-8298-f7d0f9ab7431@windriver.com
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: trivial@kernel.org
Cc: stable@kernel.org
Link: http://lore.kernel.org/lkml/20200131181123.tmamivhq4b7uqasr@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit db2c549407 upstream.
This patch fixes an off-by-one error in strncpy size argument in
tools/perf/util/map.c. The issue is that in:
strncmp(filename, "/system/lib/", 11)
the passed string literal: "/system/lib/" has 12 bytes (without the NULL
byte) and the passed size argument is 11. As a result, the logic won't
match the ending "/" byte and will pass filepaths that are stored in
other directories e.g. "/system/libmalicious/bin" or just
"/system/libmalicious".
This functionality seems to be present only on Android. I assume the
/system/ directory is only writable by the root user, so I don't think
this bug has much (or any) security impact.
Fixes: eca8183699 ("perf tools: Add automatic remapping of Android libraries")
Signed-off-by: disconnect3d <dominik.b.czarnota@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Keeping <john@metanate.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Lentine <mlentine@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200309104855.3775-1-dominik.b.czarnota@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit be40920fbf upstream.
When I tried to compile tools/perf from the top directory with the -C
option, the O= option didn't work correctly if I passed a relative path:
$ make O=BUILD -C tools/perf/
make: Entering directory '/home/mhiramat/ksrc/linux/tools/perf'
BUILD: Doing 'make -j8' parallel build
../scripts/Makefile.include:4: *** O=/home/mhiramat/ksrc/linux/tools/perf/BUILD does not exist. Stop.
make: *** [Makefile:70: all] Error 2
make: Leaving directory '/home/mhiramat/ksrc/linux/tools/perf'
The O= directory existence check failed because the check script ran in
the build target directory instead of the directory where I ran the make
command.
To fix that, once change directory to $(PWD) and check O= directory,
since the PWD is set to where the make command runs.
Fixes: c883122acc ("perf tools: Let O= makes handle relative paths")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Michal Marek <michal.lkml@markovi.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sasha Levin <sashal@kernel.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/158351957799.3363.15269768530697526765.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1efde27542 upstream.
Do not depend on dwfl_module_addrsym() because it can fail on user-space
shared libraries.
Actually, same bug was fixed by commit 664fee3dc3 ("perf probe: Do not
use dwfl_module_addrsym if dwarf_diename finds symbol name"), but commit
07d3698578 ("perf probe: Fix wrong address verification) reverted to
get actual symbol address from symtab.
This fixes it again by getting symbol address from DIE, and only if the
DIE has only address range, it uses dwfl_module_addrsym().
Fixes: 07d3698578 ("perf probe: Fix wrong address verification)
Reported-by: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Alexandre Ghiti <alex@ghiti.fr>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sasha Levin <sashal@kernel.org>
Link: http://lore.kernel.org/lkml/158281812176.476.14164573830975116234.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6b8d68f1ce upstream.
When we put an event with multiple probes, perf-probe fails to delete
with filters. This comes from a failure to list up the event name
because of overwrapping its name.
To fix this issue, skip to list up the event which has same name.
Without this patch:
# perf probe -l \*
probe_perf:map__map_ip (on perf_sample__fprintf_brstackoff:21@
probe_perf:map__map_ip (on perf_sample__fprintf_brstackoff:25@
probe_perf:map__map_ip (on append_inlines:12@util/machine.c in
probe_perf:map__map_ip (on unwind_entry:19@util/machine.c in /
probe_perf:map__map_ip (on map__map_ip@util/map.h in /home/mhi
probe_perf:map__map_ip (on map__map_ip@util/map.h in /home/mhi
# perf probe -d \*
"*" does not hit any event.
Error: Failed to delete events. Reason: No such file or directory (Code: -2)
With it:
# perf probe -d \*
Removed event: probe_perf:map__map_ip
#
Fixes: 72363540c0 ("perf probe: Support multiprobe event")
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Reported-by: He Zhe <zhe.he@windriver.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/158287666197.16697.7514373548551863562.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f649bd9dd5 upstream.
Since commit 3b2323c2c1 ("perf bench futex: Use cpumaps") the default
number of threads the benchmark uses got changed from number of online
CPUs to zero:
$ perf bench futex wake
# Running 'futex/wake' benchmark:
Run summary [PID 15930]: blocking on 0 threads (at [private] futex 0x558b8ee4bfac), waking up 1 at a time.
[Run 1]: Wokeup 0 of 0 threads in 0.0000 ms
[...]
[Run 10]: Wokeup 0 of 0 threads in 0.0000 ms
Wokeup 0 of 0 threads in 0.0004 ms (+-40.82%)
Restore the old behavior by grabbing the number of online CPUs via
cpu->nr:
$ perf bench futex wake
# Running 'futex/wake' benchmark:
Run summary [PID 18356]: blocking on 8 threads (at [private] futex 0xb3e62c), waking up 1 at a time.
[Run 1]: Wokeup 8 of 8 threads in 0.0260 ms
[...]
[Run 10]: Wokeup 8 of 8 threads in 0.0270 ms
Wokeup 8 of 8 threads in 0.0419 ms (+-24.35%)
Fixes: 3b2323c2c1 ("perf bench futex: Use cpumaps")
Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lore.kernel.org/lkml/20200305083714.9381-3-tommi.t.rantala@nokia.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d6bc34c5ec upstream.
In __cmd_record(), when receiving SIGINT(ctrl + c), a 'done' flag will
be set and the event list will be disabled by evlist__disable() once.
While in auxtrace_record.read_finish(), the related events will be
enabled again, if they are continuous, the recording seems to be
endless.
If the event is disabled, don't enable it again here.
Based-on-patch-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Tan Xiaojun <tanxiaojun@huawei.com>
Cc: stable@vger.kernel.org # 5.4+
Link: http://lore.kernel.org/lkml/20200214132654.20395-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c9f2833cb4 upstream.
In __cmd_record(), when receiving SIGINT(ctrl + c), a 'done' flag will
be set and the event list will be disabled by evlist__disable() once.
While in auxtrace_record.read_finish(), the related events will be
enabled again, if they are continuous, the recording seems to be
endless.
If the cs_etm event is disabled, we don't enable it again here.
Note: This patch is NOT tested since i don't have such a machine with
coresight feature, but the code seems buggy same as arm-spe and
intel-pt.
Tester notes:
Thanks for looping, Adrian. Applied this patch and tested with
CoreSight on juno board, it works well.
Signed-off-by: Wei Li <liwei391@huawei.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Tan Xiaojun <tanxiaojun@huawei.com>
Cc: stable@vger.kernel.org # 5.4+
Link: http://lore.kernel.org/lkml/20200214132654.20395-4-adrian.hunter@intel.com
[ahunter: removed redundant 'else' after 'return']
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 783fed2f35 upstream.
In __cmd_record(), when receiving SIGINT(ctrl + c), a 'done' flag will
be set and the event list will be disabled by evlist__disable() once.
While in auxtrace_record.read_finish(), the related events will be
enabled again, if they are continuous, the recording seems to be
endless.
If the intel_bts event is disabled, we don't enable it again here.
Note: This patch is NOT tested since i don't have such a machine with
intel_bts feature, but the code seems buggy same as arm-spe and
intel-pt.
Signed-off-by: Wei Li <liwei391@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Tan Xiaojun <tanxiaojun@huawei.com>
Cc: stable@vger.kernel.org # 5.4+
Link: http://lore.kernel.org/lkml/20200214132654.20395-3-adrian.hunter@intel.com
[ahunter: removed redundant 'else' after 'return']
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2da4dd3d69 upstream.
In __cmd_record(), when receiving SIGINT(ctrl + c), a 'done' flag will
be set and the event list will be disabled by evlist__disable() once.
While in auxtrace_record.read_finish(), the related events will be
enabled again, if they are continuous, the recording seems to be endless.
If the intel_pt event is disabled, we don't enable it again here.
Before the patch:
huawei@huawei-2288H-V5:~/linux-5.5-rc4/tools/perf$ ./perf record -e \
intel_pt//u -p 46803
^C^C^C^C^C^C
After the patch:
huawei@huawei-2288H-V5:~/linux-5.5-rc4/tools/perf$ ./perf record -e \
intel_pt//u -p 48591
^C[ perf record: Woken up 0 times to write data ]
Warning:
AUX data lost 504 times out of 4816!
[ perf record: Captured and wrote 2024.405 MB perf.data ]
Signed-off-by: Wei Li <liwei391@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Tan Xiaojun <tanxiaojun@huawei.com>
Cc: stable@vger.kernel.org # 5.4+
Link: http://lore.kernel.org/lkml/20200214132654.20395-2-adrian.hunter@intel.com
[ ahunter: removed redundant 'else' after 'return' ]
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 604e2139a1 upstream.
When we moved zalloc.o to the library we missed gtk library which needs
it compiled in, otherwise the missing __zfree symbol will cause the
library to fail to load.
Adding the zalloc object to the gtk library build.
Fixes: 7f7c536f23 ("tools lib: Adopt zalloc()/zfree() from tools/perf")
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jelle van der Waa <jelle@vdwaa.nl>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200113104358.123511-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3f7774033e upstream.
We need to set actions->ms.map since 599a2f38a9 ("perf hists browser:
Check sort keys before hot key actions"), as in that patch we bail out
if map is NULL.
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: 599a2f38a9 ("perf hists browser: Check sort keys before hot key actions")
Link: https://lkml.kernel.org/n/tip-wp1ssoewy6zihwwexqpohv0j@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 80cc7bb6c1 upstream.
For data collected on machines with front end stalled cycles supported,
such as found on modern AMD CPU families, commit 146540fb54 ("perf
stat: Always separate stalled cycles per insn") introduces a new line in
CSV output with a leading comma that upsets some automated scripts.
Scripts have to use "-e ex_ret_instr" to work around this issue, after
upgrading to a version of perf with that commit.
We could add "if (have_frontend_stalled && !config->csv_sep)" to the not
(total && avg) else clause, to emphasize that CSV users are usually
scripts, and are written to do only what is needed, i.e., they wouldn't
typically invoke "perf stat" without specifying an explicit event list.
But - let alone CSV output - why should users now tolerate a constant
0-reporting extra line in regular terminal output?:
BEFORE:
$ sudo perf stat --all-cpus -einstructions,cycles -- sleep 1
Performance counter stats for 'system wide':
181,110,981 instructions # 0.58 insn per cycle
# 0.00 stalled cycles per insn
309,876,469 cycles
1.002202582 seconds time elapsed
The user would not like to see the now permanent:
"0.00 stalled cycles per insn"
line fixture, as it gives no useful information.
So this patch removes the printing of the zeroed stalled cycles line
altogether, almost reverting the very original commit fb4605ba47
("perf stat: Check for frontend stalled for metrics"), which seems like
it was written to normalize --metric-only column output of common Intel
machines at the time: modern Intel machines have ceased to support the
genericised frontend stalled metrics AFAICT.
AFTER:
$ sudo perf stat --all-cpus -einstructions,cycles -- sleep 1
Performance counter stats for 'system wide':
244,071,432 instructions # 0.69 insn per cycle
355,353,490 cycles
1.001862516 seconds time elapsed
Output behaviour when stalled cycles is indeed measured is not affected
(BEFORE == AFTER):
$ sudo perf stat --all-cpus -einstructions,cycles,stalled-cycles-frontend -- sleep 1
Performance counter stats for 'system wide':
247,227,799 instructions # 0.63 insn per cycle
# 0.26 stalled cycles per insn
394,745,636 cycles
63,194,485 stalled-cycles-frontend # 16.01% frontend cycles idle
1.002079770 seconds time elapsed
Fixes: 146540fb54 ("perf stat: Always separate stalled cycles per insn")
Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Song Liu <songliubraving@fb.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200207230613.26709-1-kim.phillips@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c3314a74f8 ]
Commit 800d3f5616 ("perf report: Add warning when libunwind not
compiled in") breaks the s390 platform. S390 uses libdw-dwarf-unwind for
call chain unwinding and had no support for libunwind.
So the warning "Please install libunwind development packages during the
perf build." caused the confusion even if the call-graph is displayed
correctly.
This patch adds checking for HAVE_DWARF_SUPPORT, which is set when
libdw-dwarf-unwind is compiled in.
Fixes: 800d3f5616 ("perf report: Add warning when libunwind not compiled in")
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Thomas Richter <tmricht@linux.ibm.com>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200107191745.18415-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit f068435d9b upstream.
At some point in the past we needed to make sure we would get the long
name of modules and not just what we get from /proc/modules, but that
need, as described in the cset that introduced the adjustment function:
Fixes: c03d5184f0 ("perf machine: Adjust dso->long_name for offline module")
Without using the buildid-cache:
# lsmod | grep trusted
# insmod trusted.ko
# lsmod | grep trusted
trusted 24576 0
# strace -e open,openat perf probe -m ./trusted.ko key_seal |& grep trusted
openat(AT_FDCWD, "/sys/module/trusted/notes/.note.gnu.build-id", O_RDONLY) = 4
openat(AT_FDCWD, "/sys/module/trusted/notes/.note.gnu.build-id", O_RDONLY) = 7
openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 3
openat(AT_FDCWD, "/root/.debug/root/trusted.ko/dd3d355d567394d540f527e093e0f64b95879584/probes", O_RDWR|O_CREAT, 0644) = 3
openat(AT_FDCWD, "/usr/lib/debug/root/trusted.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/debug/root/trusted.ko", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/root/.debug/trusted.ko", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 3
openat(AT_FDCWD, "trusted.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, ".debug/trusted.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "trusted.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 3
openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 3
openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 4
openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 3
probe:key_seal (on key_seal in trusted)
# perf probe -l
probe:key_seal (on key_seal in trusted)
#
No attempt at opening '[trusted]'.
Now using the build-id cache:
# rmmod trusted
# perf buildid-cache --add ./trusted.ko
# insmod trusted.ko
# strace -e open,openat perf probe -m ./trusted.ko key_seal |& grep trusted
openat(AT_FDCWD, "/sys/module/trusted/notes/.note.gnu.build-id", O_RDONLY) = 4
openat(AT_FDCWD, "/sys/module/trusted/notes/.note.gnu.build-id", O_RDONLY) = 7
openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 3
openat(AT_FDCWD, "/root/.debug/root/trusted.ko/dd3d355d567394d540f527e093e0f64b95879584/probes", O_RDWR|O_CREAT, 0644) = 3
openat(AT_FDCWD, "/usr/lib/debug/root/trusted.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/debug/root/trusted.ko", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/root/.debug/trusted.ko", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 3
openat(AT_FDCWD, "trusted.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, ".debug/trusted.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "trusted.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 3
openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 3
openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 4
openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 3
#
Again, no attempt at reading '[trusted]'.
Finally, adding a probe to that function and then using:
[root@quaco ~]# perf trace -e probe_perf:*/max-stack=16/ --max-events=2
0.000 perf/13456 probe_perf:dso__adjust_kmod_long_name(__probe_ip: 5492263)
dso__adjust_kmod_long_name (/home/acme/bin/perf)
machine__process_kernel_mmap_event (/home/acme/bin/perf)
machine__process_mmap_event (/home/acme/bin/perf)
perf_event__process_mmap (/home/acme/bin/perf)
machines__deliver_event (/home/acme/bin/perf)
perf_session__deliver_event (/home/acme/bin/perf)
perf_session__process_event (/home/acme/bin/perf)
process_simple (/home/acme/bin/perf)
reader__process_events (/home/acme/bin/perf)
__perf_session__process_events (/home/acme/bin/perf)
perf_session__process_events (/home/acme/bin/perf)
process_buildids (/home/acme/bin/perf)
record__finish_output (/home/acme/bin/perf)
__cmd_record (/home/acme/bin/perf)
cmd_record (/home/acme/bin/perf)
run_builtin (/home/acme/bin/perf)
0.055 perf/13456 probe_perf:dso__adjust_kmod_long_name(__probe_ip: 5492263)
dso__adjust_kmod_long_name (/home/acme/bin/perf)
machine__process_kernel_mmap_event (/home/acme/bin/perf)
machine__process_mmap_event (/home/acme/bin/perf)
perf_event__process_mmap (/home/acme/bin/perf)
machines__deliver_event (/home/acme/bin/perf)
perf_session__deliver_event (/home/acme/bin/perf)
perf_session__process_event (/home/acme/bin/perf)
process_simple (/home/acme/bin/perf)
reader__process_events (/home/acme/bin/perf)
__perf_session__process_events (/home/acme/bin/perf)
perf_session__process_events (/home/acme/bin/perf)
process_buildids (/home/acme/bin/perf)
record__finish_output (/home/acme/bin/perf)
__cmd_record (/home/acme/bin/perf)
cmd_record (/home/acme/bin/perf)
run_builtin (/home/acme/bin/perf)
#
This was the only path I could find using the perf tools that reach at this
function, then as of november/2019, if we put a probe in the line where the
actuall setting of the dso->long_name is done:
# perf trace -e probe_perf:*
^C[root@quaco ~]
# perf stat -e probe_perf:* -I 2000
2.000404265 0 probe_perf:dso__adjust_kmod_long_name
4.001142200 0 probe_perf:dso__adjust_kmod_long_name
6.001704120 0 probe_perf:dso__adjust_kmod_long_name
8.002398316 0 probe_perf:dso__adjust_kmod_long_name
10.002984010 0 probe_perf:dso__adjust_kmod_long_name
12.003597851 0 probe_perf:dso__adjust_kmod_long_name
14.004113303 0 probe_perf:dso__adjust_kmod_long_name
16.004582773 0 probe_perf:dso__adjust_kmod_long_name
18.005176373 0 probe_perf:dso__adjust_kmod_long_name
20.005801605 0 probe_perf:dso__adjust_kmod_long_name
22.006467540 0 probe_perf:dso__adjust_kmod_long_name
^C 23.683261941 0 probe_perf:dso__adjust_kmod_long_name
#
Its not being used at all.
To further test this I used kvm.ko as the offline module, i.e. removed
if from the buildid-cache by nuking it completely (rm -rf ~/.debug) and
moved it from the normal kernel distro path, removed the modules, stoped
the kvm guest, and then installed it manually, etc.
# rmmod kvm-intel
# rmmod kvm
# lsmod | grep kvm
# modprobe kvm-intel
modprobe: ERROR: ctx=0x55d3b1722260 path=/lib/modules/5.3.8-200.fc30.x86_64/kernel/arch/x86/kvm/kvm.ko.xz error=No such file or directory
modprobe: ERROR: ctx=0x55d3b1722260 path=/lib/modules/5.3.8-200.fc30.x86_64/kernel/arch/x86/kvm/kvm.ko.xz error=No such file or directory
modprobe: ERROR: could not insert 'kvm_intel': Unknown symbol in module, or unknown parameter (see dmesg)
# insmod ./kvm.ko
# modprobe kvm-intel
modprobe: ERROR: ctx=0x562f34026260 path=/lib/modules/5.3.8-200.fc30.x86_64/kernel/arch/x86/kvm/kvm.ko.xz error=No such file or directory
modprobe: ERROR: ctx=0x562f34026260 path=/lib/modules/5.3.8-200.fc30.x86_64/kernel/arch/x86/kvm/kvm.ko.xz error=No such file or directory
# lsmod | grep kvm
kvm_intel 299008 0
kvm 765952 1 kvm_intel
irqbypass 16384 1 kvm
#
# perf probe -x ~/bin/perf machine__findnew_module_map:12 mname=m.name:string filename=filename:string 'dso_long_name=map->dso->long_name:string' 'dso_name=map->dso->name:string'
# perf probe -l
probe_perf:machine__findnew_module_map (on machine__findnew_module_map:12@util/machine.c in /home/acme/bin/perf with mname filename dso_long_name dso_name)
# perf record
^C[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 3.416 MB perf.data (33956 samples) ]
# perf trace -e probe_perf:machine*
<SNIP>
6.322 perf/23099 probe_perf:machine__findnew_module_map(__probe_ip: 5492493, mname: "[salsa20_generic]", filename: "/lib/modules/5.3.8-200.fc30.x86_64/kernel/crypto/salsa20_generic.ko.xz", dso_long_name: "/lib/modules/5.3.8-200.fc30.x86_64/kernel/crypto/salsa20_generic.ko.xz", dso_name: "[salsa20_generic]")
6.375 perf/23099 probe_perf:machine__findnew_module_map(__probe_ip: 5492493, mname: "[kvm]", filename: "[kvm]", dso_long_name: "[kvm]", dso_name: "[kvm]")
<SNIP>
The filename doesn't come with the path, no point in trying to set the dso->long_name.
[root@quaco ~]# strace -e open,openat perf probe -m ./kvm.ko kvm_apic_local_deliver |& egrep 'open.*kvm'
openat(AT_FDCWD, "/sys/module/kvm_intel/notes/.note.gnu.build-id", O_RDONLY) = 4
openat(AT_FDCWD, "/sys/module/kvm/notes/.note.gnu.build-id", O_RDONLY) = 4
openat(AT_FDCWD, "/lib/modules/5.3.8-200.fc30.x86_64/kernel/arch/x86/kvm", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 7
openat(AT_FDCWD, "/sys/module/kvm_intel/notes/.note.gnu.build-id", O_RDONLY) = 8
openat(AT_FDCWD, "/root/kvm.ko", O_RDONLY) = 3
openat(AT_FDCWD, "/root/.debug/root/kvm.ko/5955f426cb93f03f30f3e876814be2db80ab0b55/probes", O_RDWR|O_CREAT, 0644) = 3
openat(AT_FDCWD, "/usr/lib/debug/root/kvm.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/debug/root/kvm.ko", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/root/.debug/kvm.ko", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/root/kvm.ko", O_RDONLY) = 3
openat(AT_FDCWD, "kvm.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, ".debug/kvm.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "kvm.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/root/kvm.ko", O_RDONLY) = 3
openat(AT_FDCWD, "/root/kvm.ko", O_RDONLY) = 3
openat(AT_FDCWD, "/root/kvm.ko", O_RDONLY) = 4
openat(AT_FDCWD, "/root/kvm.ko", O_RDONLY) = 3
[root@quaco ~]#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-jlfew3lyb24d58egrp0o72o2@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b3509b6ed7 ]
My earlier patch to just enable --reltime with --time was a little too
optimistic. The --time parsing would accept absolute time, which is
very confusing to the user.
Support relative time in --time parsing too. This only works with recent
perf record that records the first sample time. Otherwise we error out.
Fixes: 3714437d3f ("perf script: Allow --time with --reltime")
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lore.kernel.org/lkml/20191011182140.8353-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 3714437d3f upstream.
The original --reltime patch forbid --time with --reltime.
But it turns out --time doesn't really care about --reltime, because the
relative time is only used at final output, while the time filtering
always works earlier on absolute time.
So just remove the check and allow combining the two options.
Fixes: 90b10f47c0 ("perf script: Support relative time")
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lore.kernel.org/lkml/20191002164642.1719-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 07d3698578 upstream.
Since there are some DIE which has only ranges instead of the
combination of entrypc/highpc, address verification must use
dwarf_haspc() instead of dwarf_entrypc/dwarf_highpc.
Also, the ranges only DIE will have a partial code in different section
(e.g. unlikely code will be in text.unlikely as "FUNC.cold" symbol). In
that case, we can not use dwarf_entrypc() or die_entrypc(), because the
offset from original DIE can be a minus value.
Instead, this simply gets the symbol and offset from symtab.
Without this patch;
# perf probe -D clear_tasks_mm_cpumask:1
Failed to get entry address of clear_tasks_mm_cpumask
Error: Failed to add events.
And with this patch:
# perf probe -D clear_tasks_mm_cpumask:1
p:probe/clear_tasks_mm_cpumask clear_tasks_mm_cpumask+0
p:probe/clear_tasks_mm_cpumask_1 clear_tasks_mm_cpumask+5
p:probe/clear_tasks_mm_cpumask_2 clear_tasks_mm_cpumask+8
p:probe/clear_tasks_mm_cpumask_3 clear_tasks_mm_cpumask+16
p:probe/clear_tasks_mm_cpumask_4 clear_tasks_mm_cpumask+82
Committer testing:
I managed to reproduce the above:
[root@quaco ~]# perf probe -D clear_tasks_mm_cpumask:1
p:probe/clear_tasks_mm_cpumask _text+919968
p:probe/clear_tasks_mm_cpumask_1 _text+919973
p:probe/clear_tasks_mm_cpumask_2 _text+919976
[root@quaco ~]#
But then when trying to actually put the probe in place, it fails if I
use :0 as the offset:
[root@quaco ~]# perf probe -L clear_tasks_mm_cpumask | head -5
<clear_tasks_mm_cpumask@/usr/src/debug/kernel-5.2.fc30/linux-5.2.18-200.fc30.x86_64/kernel/cpu.c:0>
0 void clear_tasks_mm_cpumask(int cpu)
1 {
2 struct task_struct *p;
[root@quaco ~]# perf probe clear_tasks_mm_cpumask:0
Probe point 'clear_tasks_mm_cpumask' not found.
Error: Failed to add events.
[root@quaco
The next patch is needed to fix this case.
Fixes: 576b523721 ("perf probe: Fix probing symbols with optimization suffix")
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/157199318513.8075.10463906803299647907.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0feba17bd7 upstream.
We observed an issue that was some extra columns displayed after switching
perf data file in browser. The steps to reproduce:
1. perf record -a -e cycles,instructions -- sleep 3
2. perf report --group
3. In browser, we use hotkey 's' to switch to another perf.data
4. Now in browser, the extra columns 'Self' and 'Children' are displayed.
The issue is setup_sorting() executed again after repeat path, so dimensions
are added again.
This patch checks the last key returned from __cmd_report(). If it's
K_SWITCH_INPUT_DATA, skips the setup_sorting().
Fixes: ad0de0971b ("perf report: Enable the runtime switching of perf data file")
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Feng Tang <feng.tang@intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191220013722.20592-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 55347ec340 upstream.
Variable names are inconsistent in hists__for_each macro().
Due to this inconsistency, the macro replaces its second argument with
"fmt" regardless of its original name.
So far it works because only "fmt" is passed to the second argument.
However, this behavior is not expected and should be fixed.
Fixes: f0786af536 ("perf hists: Introduce hists__for_each_format macro")
Fixes: aa6f50af82 ("perf hists: Introduce hists__for_each_sort_list macro")
Signed-off-by: Yuya Fujita <fujita.yuya@fujitsu.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/OSAPR01MB1588E1C47AC22043175DE1B2E8520@OSAPR01MB1588.jpnprd01.prod.outlook.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 58b3bafff8 upstream.
In 7fcfa9a2d9 an unintended prefix "Counter:18 Name:" was removed from
the description for L1D_RO_EXCL_WRITES, but the extra name remained in
the description. Remove it too.
Fixes: 7fcfa9a2d9 ("perf list: Fix s390 counter long description for L1D_RO_EXCL_WRITES")
Signed-off-by: Ed Maste <emaste@freebsd.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Link: http://lore.kernel.org/lkml/20191212145346.5026-1-emaste@freefall.freebsd.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2870782687 ]
Before this patch, perf expected that there might be NPROC*4 unique
cache entries at max, however, it also expected that some of them would
be shared and/or of the same size, thus the final number of entries
would be reduced to be lower than NPROC*4. In case the number of entries
hadn't been reduced (was NPROC*4), the warning was printed.
However, some systems might have unusual cache topology, such as the
following two-processor KVM guest:
cpu level shared_cpu_list size
0 1 0 32K
0 1 0 64K
0 2 0 512K
0 3 0 8192K
1 1 1 32K
1 1 1 64K
1 2 1 512K
1 3 1 8192K
This KVM guest has 8 (NPROC*4) unique cache entries, which used to make
perf printing the message, although there actually aren't "way too many
cpu caches".
v2: Removing unused argument.
v3: Unifying the way we obtain number of cpus.
v4: Removed '& UINT_MAX' construct which is redundant.
Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
LPU-Reference: 20191208162056.20772-1-mpetlan@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit eb573e746b ]
Commit f01642e491 ("perf metricgroup: Support multiple events for
metricgroup") introduced support for multiple events in a metric group.
But with the current upstream, metric events names are not printed
properly
In power9 platform:
command:# ./perf stat --metric-only -M translation -C 0 -I 1000 sleep 2
1.000208486
2.000368863
2.001400558
Similarly in skylake platform:
command:./perf stat --metric-only -M Power -I 1000
1.000579994
2.002189493
With current upstream version, issue is with event name comparison logic
in find_evsel_group(). Current logic is to compare events belonging to a
metric group to the events in perf_evlist. Since the break statement is
missing in the loop used for comparison between metric group and
perf_evlist events, the loop continues to execute even after getting a
pattern match, and end up in discarding the matches.
Incase of single metric event belongs to metric group, its working fine,
because in case of single event once it compare all events it reaches to
end of perf_evlist.
Example for single metric event in power9 platform:
command:# ./perf stat --metric-only -M branches_per_inst -I 1000 sleep 1
1.000094653 0.2
1.001337059 0.0
This patch fixes the issue by making sure once we found all events
belongs to that metric event matched in find_evsel_group(), we
successfully break from that loop by adding corresponding condition.
With this patch:
In power9 platform:
command:# ./perf stat --metric-only -M translation -C 0 -I 1000 sleep 2
result:#
time derat_4k_miss_rate_percent derat_4k_miss_ratio derat_miss_ratio derat_64k_miss_rate_percent derat_64k_miss_ratio dslb_miss_rate_percent islb_miss_rate_percent
1.000135672 0.0 0.3 1.0 0.0 0.2 0.0 0.0
2.000380617 0.0 0.0 0.0 0.0 0.0 0.0 0.0
command:# ./perf stat --metric-only -M Power -I 1000
Similarly in skylake platform:
result:#
time Turbo_Utilization C3_Core_Residency C6_Core_Residency C7_Core_Residency C2_Pkg_Residency C3_Pkg_Residency C6_Pkg_Residency C7_Pkg_Residency
1.000563580 0.3 0.0 2.6 44.2 21.9 0.0 0.0 0.0
2.002235027 0.4 0.0 2.7 43.0 20.7 0.0 0.0 0.0
Committer testing:
Before:
[root@seventh ~]# perf stat --metric-only -M Power -I 1000
# time
1.000383223
2.001168182
3.001968545
4.002741200
5.003442022
^C 5.777687244
[root@seventh ~]#
After the patch:
[root@seventh ~]# perf stat --metric-only -M Power -I 1000
# time Turbo_Utilization C3_Core_Residency C6_Core_Residency C7_Core_Residency C2_Pkg_Residency C3_Pkg_Residency C6_Pkg_Residency C7_Pkg_Residency
1.000406577 0.4 0.1 1.4 97.0 0.0 0.0 0.0 0.0
2.001481572 0.3 0.0 0.6 97.9 0.0 0.0 0.0 0.0
3.002332585 0.2 0.0 1.0 97.5 0.0 0.0 0.0 0.0
4.003196624 0.2 0.0 0.3 98.6 0.0 0.0 0.0 0.0
5.004063851 0.3 0.0 0.7 97.7 0.0 0.0 0.0 0.0
^C 5.471260276 0.2 0.0 0.5 49.3 0.0 0.0 0.0 0.0
[root@seventh ~]#
[root@seventh ~]# dmesg | grep -i skylake
[ 0.187807] Performance Events: PEBS fmt3+, Skylake events, 32-deep LBR, full-width counters, Intel PMU driver.
[root@seventh ~]#
Fixes: f01642e491 ("perf metricgroup: Support multiple events for metricgroup")
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Reviewed-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191120084059.24458-1-kjain@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5b596e0ff0 ]
To avoid breaking the build on arches where this is not wired up, at
least all the other features should be made available and when using
this specific routine, the "unknown" should point the user/developer to
the need to wire this up on this particular hardware architecture.
Detected in a container mipsel debian cross build environment, where it
shows up as:
In file included from /usr/mipsel-linux-gnu/include/stdio.h:867,
from /git/linux/tools/perf/lib/include/perf/cpumap.h:6,
from util/session.c:13:
In function 'printf',
inlined from 'regs_dump__printf' at util/session.c:1103:3,
inlined from 'regs__printf' at util/session.c:1131:2:
/usr/mipsel-linux-gnu/include/bits/stdio2.h:107:10: error: '%-5s' directive argument is null [-Werror=format-overflow=]
107 | return __printf_chk (__USE_FORTIFY_LEVEL - 1, __fmt, __va_arg_pack ());
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cross compiler details:
mipsel-linux-gnu-gcc (Debian 9.2.1-8) 9.2.1 20190909
Also on mips64:
In file included from /usr/mips64-linux-gnuabi64/include/stdio.h:867,
from /git/linux/tools/perf/lib/include/perf/cpumap.h:6,
from util/session.c:13:
In function 'printf',
inlined from 'regs_dump__printf' at util/session.c:1103:3,
inlined from 'regs__printf' at util/session.c:1131:2,
inlined from 'regs_user__printf' at util/session.c:1139:3,
inlined from 'dump_sample' at util/session.c:1246:3,
inlined from 'machines__deliver_event' at util/session.c:1421:3:
/usr/mips64-linux-gnuabi64/include/bits/stdio2.h:107:10: error: '%-5s' directive argument is null [-Werror=format-overflow=]
107 | return __printf_chk (__USE_FORTIFY_LEVEL - 1, __fmt, __va_arg_pack ());
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In function 'printf',
inlined from 'regs_dump__printf' at util/session.c:1103:3,
inlined from 'regs__printf' at util/session.c:1131:2,
inlined from 'regs_intr__printf' at util/session.c:1147:3,
inlined from 'dump_sample' at util/session.c:1249:3,
inlined from 'machines__deliver_event' at util/session.c:1421:3:
/usr/mips64-linux-gnuabi64/include/bits/stdio2.h:107:10: error: '%-5s' directive argument is null [-Werror=format-overflow=]
107 | return __printf_chk (__USE_FORTIFY_LEVEL - 1, __fmt, __va_arg_pack ());
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cross compiler details:
mips64-linux-gnuabi64-gcc (Debian 9.2.1-8) 9.2.1 20190909
Fixes: 2bcd355b71 ("perf tools: Add interface to arch registers sets")
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-95wjyv4o65nuaeweq31t7l1s@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0cd032d3b5 ]
brstackinsn must be allowed to be set by the user when AUX area data has
been captured because, in that case, the branch stack might be
synthesized on the fly. This fixes the following error:
Before:
$ perf record -e '{intel_pt//,cpu/mem_inst_retired.all_loads,aux-sample-size=8192/pp}:u' grep -rqs jhgjhg /boot
[ perf record: Woken up 19 times to write data ]
[ perf record: Captured and wrote 2.274 MB perf.data ]
$ perf script -F +brstackinsn --xed --itrace=i1usl100 | head
Display of branch stack assembler requested, but non all-branch filter set
Hint: run 'perf record -b ...'
After:
$ perf record -e '{intel_pt//,cpu/mem_inst_retired.all_loads,aux-sample-size=8192/pp}:u' grep -rqs jhgjhg /boot
[ perf record: Woken up 19 times to write data ]
[ perf record: Captured and wrote 2.274 MB perf.data ]
$ perf script -F +brstackinsn --xed --itrace=i1usl100 | head
grep 13759 [002] 8091.310257: 1862 instructions:uH: 5641d58069eb bmexec+0x86b (/bin/grep)
bmexec+2485:
00005641d5806b35 jnz 0x5641d5806bd0 # MISPRED
00005641d5806bd0 movzxb (%r13,%rdx,1), %eax
00005641d5806bd6 add %rdi, %rax
00005641d5806bd9 movzxb -0x1(%rax), %edx
00005641d5806bdd cmp %rax, %r14
00005641d5806be0 jnb 0x5641d58069c0 # MISPRED
mismatch of LBR data and executable
00005641d58069c0 movzxb (%r13,%rdx,1), %edi
Fixes: 48d02a1d5c ("perf script: Add 'brstackinsn' for branch stacks")
Reported-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20191127095322.15417-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 98e9324511 ]
To fix these build errors on a debian mipsel cross build environment:
builtin-diff.c: In function 'block_cycles_diff_cmp':
builtin-diff.c:550:6: error: absolute value function 'labs' given an argument of type 's64' {aka 'long long int'} but has parameter of type 'long int' which may cause truncation of value [-Werror=absolute-value]
550 | l = labs(left->diff.cycles);
| ^~~~
builtin-diff.c:551:6: error: absolute value function 'labs' given an argument of type 's64' {aka 'long long int'} but has parameter of type 'long int' which may cause truncation of value [-Werror=absolute-value]
551 | r = labs(right->diff.cycles);
| ^~~~
Fixes: 99150a1faa ("perf diff: Use hists to manage basic blocks per symbol")
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-pn7szy5uw384ntjgk6zckh6a@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 91e2f539ee upstream.
Fix die_walk_lines() to list the function entry line correctly. Since
the dwarf_entrypc() does not return the entry pc if the DIE has only
range attribute, __die_walk_funclines() fails to list the declaration
line (entry line) in that case.
To solve this issue, this introduces die_entrypc() which correctly
returns the entry PC (the first address range) even if the DIE has only
range attribute. With this fix die_walk_lines() shows the function entry
line is able to probe correctly.
Fixes: 4cc9cec636 ("perf probe: Introduce lines walker interface")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/157190837419.1859.4619125803596816752.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Thomas Backlund <tmb@mageia.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit bb1835a3b8 ]
Avoid termination of trace loading in case the last record in the
decompressed buffer partly resides in the following mmaped
PERF_RECORD_COMPRESSED record.
In this case NULL value returned by fetch_mmaped_event() means to
proceed to the next mmaped record then decompress it and load compressed
events.
The issue can be reproduced like this:
$ perf record -z -- some_long_running_workload
$ perf report --stdio -vv
decomp (B): 44519 to 163000
decomp (B): 48119 to 174800
decomp (B): 65527 to 131072
fetch_mmaped_event: head=0x1ffe0 event->header_size=0x28, mmap_size=0x20000: fuzzed perf.data?
Error:
failed to process sample
...
Testing:
71: Zstd perf.data compression/decompression : Ok
$ tools/perf/perf report -vv --stdio
decomp (B): 59593 to 262160
decomp (B): 4438 to 16512
decomp (B): 285 to 880
Looking at the vmlinux_path (8 entries long)
Using vmlinux for symbols
decomp (B): 57474 to 261248
prefetch_event: head=0x3fc78 event->header_size=0x28, mmap_size=0x3fc80: fuzzed or compressed perf.data?
decomp (B): 25 to 32
decomp (B): 52 to 120
...
Fixes: 57fc032ad6 ("perf session: Avoid infinite loop when seeing invalid header.size")
Link: https://marc.info/?l=linux-kernel&m=156580812427554&w=2
Co-developed-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/cf782c34-f3f8-2f9f-d6ab-145cee0d5322@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit da6cb952a8 ]
Filter out instances except for inlined_subroutine and subprogram DIE in
die_walk_instances() and die_is_func_instance().
This fixes an issue that perf probe sets some probes on calling address
instead of a target function itself.
When perf probe walks on instances of an abstruct origin (a kind of
function prototype of inlined function), die_walk_instances() can also
pass a GNU_call_site (a GNU extension for call site) to callback. Since
it is not an inlined instance of target function, we have to filter out
when searching a probe point.
Without this patch, perf probe sets probes on call site address too.This
can happen on some function which is marked "inlined", but has actual
symbol. (I'm not sure why GCC mark it "inlined"):
# perf probe -D vfs_read
p:probe/vfs_read _text+2500017
p:probe/vfs_read_1 _text+2499468
p:probe/vfs_read_2 _text+2499563
p:probe/vfs_read_3 _text+2498876
p:probe/vfs_read_4 _text+2498512
p:probe/vfs_read_5 _text+2498627
With this patch:
Slightly different results, similar tho:
# perf probe -D vfs_read
p:probe/vfs_read _text+2498512
Committer testing:
# uname -a
Linux quaco 5.3.8-200.fc30.x86_64 #1 SMP Tue Oct 29 14:46:22 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Before:
# perf probe -D vfs_read
p:probe/vfs_read _text+3131557
p:probe/vfs_read_1 _text+3130975
p:probe/vfs_read_2 _text+3131047
p:probe/vfs_read_3 _text+3130380
p:probe/vfs_read_4 _text+3130000
# uname -a
Linux quaco 5.3.8-200.fc30.x86_64 #1 SMP Tue Oct 29 14:46:22 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
#
After:
# perf probe -D vfs_read
p:probe/vfs_read _text+3130000
#
Fixes: db0d2c6420 ("perf probe: Search concrete out-of-line instances")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/157241937063.32002.11024544873990816590.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f4d99bdfd1 ]
Skip end-of-sequence and non-statement lines while walking through lines
list.
The "end-of-sequence" line information means:
"the current address is that of the first byte after the
end of a sequence of target machine instructions."
(DWARF version 4 spec 6.2.2)
This actually means out of scope and we can not probe on it.
On the other hand, the statement lines (is_stmt) means:
"the current instruction is a recommended breakpoint location.
A recommended breakpoint location is intended to “represent”
a line, a statement and/or a semantically distinct subpart
of a statement."
(DWARF version 4 spec 6.2.2)
So, non-statement line info also should be skipped.
These can reduce unneeded probe points and also avoid an error.
E.g. without this patch:
# perf probe -a "clear_tasks_mm_cpumask:1"
Added new events:
probe:clear_tasks_mm_cpumask (on clear_tasks_mm_cpumask:1)
probe:clear_tasks_mm_cpumask_1 (on clear_tasks_mm_cpumask:1)
probe:clear_tasks_mm_cpumask_2 (on clear_tasks_mm_cpumask:1)
probe:clear_tasks_mm_cpumask_3 (on clear_tasks_mm_cpumask:1)
probe:clear_tasks_mm_cpumask_4 (on clear_tasks_mm_cpumask:1)
You can now use it in all perf tools, such as:
perf record -e probe:clear_tasks_mm_cpumask_4 -aR sleep 1
#
This puts 5 probes on one line, but acutally it's not inlined function.
This is because there are many non statement instructions at the
function prologue.
With this patch:
# perf probe -a "clear_tasks_mm_cpumask:1"
Added new event:
probe:clear_tasks_mm_cpumask (on clear_tasks_mm_cpumask:1)
You can now use it in all perf tools, such as:
perf record -e probe:clear_tasks_mm_cpumask -aR sleep 1
#
Now perf-probe skips unneeded addresses.
Committer testing:
Slightly different results, but similar:
Before:
# uname -a
Linux quaco 5.3.8-200.fc30.x86_64 #1 SMP Tue Oct 29 14:46:22 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
#
# perf probe -a "clear_tasks_mm_cpumask:1"
Added new events:
probe:clear_tasks_mm_cpumask (on clear_tasks_mm_cpumask:1)
probe:clear_tasks_mm_cpumask_1 (on clear_tasks_mm_cpumask:1)
probe:clear_tasks_mm_cpumask_2 (on clear_tasks_mm_cpumask:1)
You can now use it in all perf tools, such as:
perf record -e probe:clear_tasks_mm_cpumask_2 -aR sleep 1
#
After:
# perf probe -a "clear_tasks_mm_cpumask:1"
Added new event:
probe:clear_tasks_mm_cpumask (on clear_tasks_mm_cpumask:1)
You can now use it in all perf tools, such as:
perf record -e probe:clear_tasks_mm_cpumask -aR sleep 1
# perf probe -l
probe:clear_tasks_mm_cpumask (on clear_tasks_mm_cpumask@kernel/cpu.c)
#
Fixes: 4cc9cec636 ("perf probe: Introduce lines walker interface")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/157241936090.32002.12156347518596111660.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 86c0bf8539 ]
Fix to show calling lines of inlined functions (where an inline function
is called).
die_walk_lines() filtered out the lines inside inlined functions based
on the address. However this also filtered out the lines which call
those inlined functions from the target function.
To solve this issue, check the call_file and call_line attributes and do
not filter out if it matches to the line information.
Without this fix, perf probe -L doesn't show some lines correctly.
(don't see the lines after 17)
# perf probe -L vfs_read
<vfs_read@/home/mhiramat/ksrc/linux/fs/read_write.c:0>
0 ssize_t vfs_read(struct file *file, char __user *buf, size_t count, loff_t *pos)
1 {
2 ssize_t ret;
4 if (!(file->f_mode & FMODE_READ))
return -EBADF;
6 if (!(file->f_mode & FMODE_CAN_READ))
return -EINVAL;
8 if (unlikely(!access_ok(buf, count)))
return -EFAULT;
11 ret = rw_verify_area(READ, file, pos, count);
12 if (!ret) {
13 if (count > MAX_RW_COUNT)
count = MAX_RW_COUNT;
15 ret = __vfs_read(file, buf, count, pos);
16 if (ret > 0) {
fsnotify_access(file);
add_rchar(current, ret);
}
With this fix:
# perf probe -L vfs_read
<vfs_read@/home/mhiramat/ksrc/linux/fs/read_write.c:0>
0 ssize_t vfs_read(struct file *file, char __user *buf, size_t count, loff_t *pos)
1 {
2 ssize_t ret;
4 if (!(file->f_mode & FMODE_READ))
return -EBADF;
6 if (!(file->f_mode & FMODE_CAN_READ))
return -EINVAL;
8 if (unlikely(!access_ok(buf, count)))
return -EFAULT;
11 ret = rw_verify_area(READ, file, pos, count);
12 if (!ret) {
13 if (count > MAX_RW_COUNT)
count = MAX_RW_COUNT;
15 ret = __vfs_read(file, buf, count, pos);
16 if (ret > 0) {
17 fsnotify_access(file);
18 add_rchar(current, ret);
}
20 inc_syscr(current);
}
Fixes: 4cc9cec636 ("perf probe: Introduce lines walker interface")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/157241937995.32002.17899884017011512577.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c701636aee ]
Make find_best_scope() returns innermost DIE at given address if there
is no best matched scope DIE. Since Gcc sometimes generates intuitively
strange line info which is out of inlined function address range, we
need this fixup.
Without this, sometimes perf probe failed to probe on a line inside an
inlined function:
# perf probe -D ksys_open:3
Failed to find scope of probe point.
Error: Failed to add events.
With this fix, 'perf probe' can probe it:
# perf probe -D ksys_open:3
p:probe/ksys_open _text+25707308
p:probe/ksys_open_1 _text+25710596
p:probe/ksys_open_2 _text+25711114
p:probe/ksys_open_3 _text+25711343
p:probe/ksys_open_4 _text+25714058
p:probe/ksys_open_5 _text+2819653
p:probe/ksys_open_6 _text+2819701
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
Link: http://lore.kernel.org/lkml/157291300887.19771.14936015360963292236.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit dee36a2abb ]
Since debuginfo__find_probes() callback function can be called with the
location which already passed, the callback function must filter out
such overlapped locations.
add_probe_trace_event() has already done it by commit 1a375ae765
("perf probe: Skip same probe address for a given line"), but
add_available_vars() doesn't. Thus perf probe -v shows same address
repeatedly as below:
# perf probe -V vfs_read:18
Available variables at vfs_read:18
@<vfs_read+217>
char* buf
loff_t* pos
ssize_t ret
struct file* file
@<vfs_read+217>
char* buf
loff_t* pos
ssize_t ret
struct file* file
@<vfs_read+226>
char* buf
loff_t* pos
ssize_t ret
struct file* file
With this fix, perf probe -V shows it correctly:
# perf probe -V vfs_read:18
Available variables at vfs_read:18
@<vfs_read+217>
char* buf
loff_t* pos
ssize_t ret
struct file* file
@<vfs_read+226>
char* buf
loff_t* pos
ssize_t ret
struct file* file
Fixes: cf6eb489e5 ("perf probe: Show accessible local variables")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/157241938927.32002.4026859017790562751.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 38f2c4226e ]
Avoid a memory leak when the configuration fails.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: clang-built-linux@googlegroups.com
Cc: netdev@vger.kernel.org
Link: http://lore.kernel.org/lkml/20191030223448.12930-9-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8e8714c3d1 ]
If event parsing fails the event list is leaked, instead splice the list
onto the out result and let the caller cleanup.
An example input for parse_events found by libFuzzer that reproduces
this memory leak is 'm{'.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: clang-built-linux@googlegroups.com
Cc: netdev@vger.kernel.org
Link: http://lore.kernel.org/lkml/20191025180827.191916-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 71f699078b ]
Currently when cross compiling perf tool for ARM64 on my x86 machine I
get this error:
arch/arm64/util/sym-handling.c:9:10: fatal error: gelf.h: No such file or directory
#include <gelf.h>
For the build, libelf is reported off:
Auto-detecting system features:
...
... libelf: [ OFF ]
Indeed, test-libelf is not built successfully:
more ./build/feature/test-libelf.make.output
test-libelf.c:2:10: fatal error: libelf.h: No such file or directory
#include <libelf.h>
^~~~~~~~~~
compilation terminated.
I have no such problems natively compiling on ARM64, and I did not
previously have this issue for cross compiling. Fix by relocating the
gelf.h include.
Signed-off-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/1573045254-39833-1-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5d16dbcc31 ]
Fix 'perf probe' to probe a function which has no entry pc or low pc but
only has ranges attribute.
probe_point_search_cb() uses dwarf_entrypc() to get the probe address,
but that doesn't work for the function DIE which has only ranges
attribute. Use die_entrypc() instead.
Without this fix:
# perf probe -k ../build-x86_64/vmlinux -D clear_tasks_mm_cpumask:0
Probe point 'clear_tasks_mm_cpumask' not found.
Error: Failed to add events.
With this:
# perf probe -k ../build-x86_64/vmlinux -D clear_tasks_mm_cpumask:0
p:probe/clear_tasks_mm_cpumask clear_tasks_mm_cpumask+0
Committer testing:
Before:
[root@quaco ~]# perf probe clear_tasks_mm_cpumask:0
Probe point 'clear_tasks_mm_cpumask' not found.
Error: Failed to add events.
[root@quaco ~]#
After:
[root@quaco ~]# perf probe clear_tasks_mm_cpumask:0
Added new event:
probe:clear_tasks_mm_cpumask (on clear_tasks_mm_cpumask)
You can now use it in all perf tools, such as:
perf record -e probe:clear_tasks_mm_cpumask -aR sleep 1
[root@quaco ~]#
Using it with 'perf trace':
[root@quaco ~]# perf trace -e probe:clear_tasks_mm_cpumask
Doesn't seem to be used in x86_64:
$ find . -name "*.c" | xargs grep clear_tasks_mm_cpumask
./kernel/cpu.c: * clear_tasks_mm_cpumask - Safely clear tasks' mm_cpumask for a CPU
./kernel/cpu.c:void clear_tasks_mm_cpumask(int cpu)
./arch/xtensa/kernel/smp.c: clear_tasks_mm_cpumask(cpu);
./arch/csky/kernel/smp.c: clear_tasks_mm_cpumask(cpu);
./arch/sh/kernel/smp.c: clear_tasks_mm_cpumask(cpu);
./arch/arm/kernel/smp.c: clear_tasks_mm_cpumask(cpu);
./arch/powerpc/mm/nohash/mmu_context.c: clear_tasks_mm_cpumask(cpu);
$ find . -name "*.h" | xargs grep clear_tasks_mm_cpumask
./include/linux/cpu.h:void clear_tasks_mm_cpumask(int cpu);
$ find . -name "*.S" | xargs grep clear_tasks_mm_cpumask
$
Fixes: e1ecbbc3fa ("perf probe: Fix to handle optimized not-inlined functions")
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/157199319438.8075.4695576954550638618.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit af04dd2f8e ]
Fix to show ranges of variables (--range and --vars option) in functions
which DIE has only ranges but no entry_pc attribute.
Without this fix:
# perf probe --range -V clear_tasks_mm_cpumask
Available variables at clear_tasks_mm_cpumask
@<clear_tasks_mm_cpumask+0>
(No matched variables)
With this fix:
# perf probe --range -V clear_tasks_mm_cpumask
Available variables at clear_tasks_mm_cpumask
@<clear_tasks_mm_cpumask+0>
[VAL] int cpu @<clear_tasks_mm_cpumask+[0-35,317-317,2052-2059]>
Committer testing:
Before:
[root@quaco ~]# perf probe --range -V clear_tasks_mm_cpumask
Available variables at clear_tasks_mm_cpumask
@<clear_tasks_mm_cpumask+0>
(No matched variables)
[root@quaco ~]#
After:
[root@quaco ~]# perf probe --range -V clear_tasks_mm_cpumask
Available variables at clear_tasks_mm_cpumask
@<clear_tasks_mm_cpumask+0>
[VAL] int cpu @<clear_tasks_mm_cpumask+[0-23,23-105,105-106,106-106,1843-1850,1850-1862]>
[root@quaco ~]#
Using it:
[root@quaco ~]# perf probe clear_tasks_mm_cpumask cpu
Added new event:
probe:clear_tasks_mm_cpumask (on clear_tasks_mm_cpumask with cpu)
You can now use it in all perf tools, such as:
perf record -e probe:clear_tasks_mm_cpumask -aR sleep 1
[root@quaco ~]# perf probe -l
probe:clear_tasks_mm_cpumask (on clear_tasks_mm_cpumask@kernel/cpu.c with cpu)
[root@quaco ~]#
[root@quaco ~]# perf trace -e probe:*cpumask
^C[root@quaco ~]#
Fixes: 349e8d2611 ("perf probe: Add --range option to show a variable's location range")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/157199323018.8075.8179744380479673672.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit eb6933b29d ]
Fix perf probe to probe an inlne function which has no entry pc
or low pc but only has ranges attribute.
This seems very rare case, but I could find a few examples, as
same as probe_point_search_cb(), use die_entrypc() to get the
entry address in probe_point_inline_cb() too.
Without this patch:
# perf probe -D __amd_put_nb_event_constraints
Failed to get entry address of __amd_put_nb_event_constraints.
Probe point '__amd_put_nb_event_constraints' not found.
Error: Failed to add events.
With this patch:
# perf probe -D __amd_put_nb_event_constraints
p:probe/__amd_put_nb_event_constraints amd_put_event_constraints+43
Committer testing:
Before:
[root@quaco ~]# perf probe -D __amd_put_nb_event_constraints
Failed to get entry address of __amd_put_nb_event_constraints.
Probe point '__amd_put_nb_event_constraints' not found.
Error: Failed to add events.
[root@quaco ~]#
After:
[root@quaco ~]# perf probe -D __amd_put_nb_event_constraints
p:probe/__amd_put_nb_event_constraints _text+33789
[root@quaco ~]#
Fixes: 4ea42b1814 ("perf: Add perf probe subcommand, a kprobe-event setup helper")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/157199320336.8075.16189530425277588587.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit acb6a7047a ]
Since some inlined functions are in lexical blocks of given function, we
have to recursively walk through the DIE tree. Without this fix,
perf-probe -L can miss the inlined functions which is in a lexical block
(like if (..) { func() } case.)
However, even though, to walk the lines in a given function, we don't
need to follow the children DIE of inlined functions because those do
not have any lines in the specified function.
We need to walk though whole trees only if we walk all lines in a given
file, because an inlined function can include another inlined function
in the same file.
Fixes: b0e9cb2802 ("perf probe: Fix to search nested inlined functions in CU")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/157190836514.1859.15996864849678136353.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1785fbb738 ]
There are memory leaks and file descriptor resource leaks in
process_mapfile() and main().
Fix this by adding free(), fclose() and free_arch_std_events() on the
error paths.
Fixes: 80eeb67fe5 ("perf jevents: Program to convert JSON file")
Fixes: 3f056b6664 ("perf jevents: Make build fail on JSON parse error")
Fixes: e9d32c1bf0 ("perf vendor events: Add support for arch standard events")
Signed-off-by: Yunfeng Ye <yeyunfeng@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Feilong Lin <linfeilong@huawei.com>
Cc: Hu Shiyuan <hushiyuan@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Luke Mujica <lukemujica@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zenghui Yu <yuzenghui@huawei.com>
Link: http://lore.kernel.org/lkml/d7907042-ec9c-2bef-25b4-810e14602f89@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3895534dd7 ]
Since debuginfo__find_probe_point() uses dwarf_entrypc() for finding the
entry address of the function on which a probe is, it will fail when the
function DIE has only ranges attribute.
To fix this issue, use die_entrypc() instead of dwarf_entrypc().
Without this fix, perf probe -l shows incorrect offset:
# perf probe -l
probe:clear_tasks_mm_cpumask (on clear_tasks_mm_cpumask+18446744071579263632@work/linux/linux/kernel/cpu.c)
probe:clear_tasks_mm_cpumask_1 (on clear_tasks_mm_cpumask+18446744071579263752@work/linux/linux/kernel/cpu.c)
With this:
# perf probe -l
probe:clear_tasks_mm_cpumask (on clear_tasks_mm_cpumask@work/linux/linux/kernel/cpu.c)
probe:clear_tasks_mm_cpumask_1 (on clear_tasks_mm_cpumask:21@work/linux/linux/kernel/cpu.c)
Committer testing:
Before:
[root@quaco ~]# perf probe -l
probe:clear_tasks_mm_cpumask (on clear_tasks_mm_cpumask+18446744071579765152@kernel/cpu.c)
[root@quaco ~]#
After:
[root@quaco ~]# perf probe -l
probe:clear_tasks_mm_cpumask (on clear_tasks_mm_cpumask@kernel/cpu.c)
[root@quaco ~]#
Fixes: 1d46ea2a6a ("perf probe: Fix listing incorrect line number with inline function")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/157199321227.8075.14655572419136993015.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9d604aad4b ]
Macro TO_CS_QUEUE_NR definition has a typo, which uses 'trace_id_chan'
as its parameter, this doesn't match with its definition body which uses
'trace_chan_id'. So renames the parameter to 'trace_chan_id'.
It's luck to have a local variable 'trace_chan_id' in the function
cs_etm__setup_queue(), even we wrongly define the macro TO_CS_QUEUE_NR,
the local variable 'trace_chan_id' is used rather than the macro's
parameter 'trace_id_chan'; so the compiler doesn't complain for this
before.
After renaming the parameter, it leads to a compiling error due
cs_etm__setup_queue() has no variable 'trace_id_chan'. This patch uses
the variable 'trace_chan_id' for the macro so that fixes the compiling
error.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: coresight ml <coresight@lists.linaro.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20191021074808.25795-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b77afa1f81 ]
Fix die_is_func_instance() to find range-only function instance.
In some case, a function instance can be made without any low PC or
entry PC, but only with address ranges by optimization. (e.g. cold text
partially in "text.unlikely" section) To find such function instance, we
have to check the range attribute too.
Fixes: e1ecbbc3fa ("perf probe: Fix to handle optimized not-inlined functions")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/157190835669.1859.8368628035930950596.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6a5f3d94cb ]
As there are several discussions for enabling perf breakpoint signal
testing on arm64 platform: arm64 needs to rely on single-step to execute
the breakpointed instruction and then reinstall the breakpoint exception
handler. But if we hook the breakpoint with a signal, the signal
handler will do the stepping rather than the breakpointed instruction,
this causes infinite loops as below:
Kernel space | Userspace
---------------------------------|--------------------------------
| __test_function() -> hit
| breakpoint
breakpoint_handler() |
`-> user_enable_single_step() |
do_signal() |
| sig_handler() -> Step one
| instruction and
| trap to kernel
single_step_handler() |
`-> reinstall_suspended_bps() |
| __test_function() -> hit
| breakpoint again and
| repeat up flow infinitely
As Will Deacon mentioned [1]: "that we require the overflow handler to
do the stepping on arm/arm64, which is relied upon by GDB/ptrace. The
hw_breakpoint code is a complete disaster so my preference would be to
rip out the perf part and just implement something directly in ptrace,
but it's a pretty horrible job". Though Will commented this on arm
architecture, but the comment also can apply on arm64 architecture.
For complete information, I searched online and found a few years back,
Wang Nan sent one patch 'arm64: Store breakpoint single step state into
pstate' [2]; the patch tried to resolve this issue by avoiding single
stepping in signal handler and defer to enable the signal stepping when
return to __test_function(). The fixing was not merged due to the
concern for missing to handle different usage cases.
Based on the info, the most feasible way is to skip Perf breakpoint
signal testing for arm64 and this could avoid the duplicate
investigation efforts when people see the failure. This patch skips
this case on arm64 platform, which is same with arm architecture.
[1] https://lkml.org/lkml/2018/11/15/205
[2] https://lkml.org/lkml/2015/12/23/477
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Brajeswar Ghosh <brajeswar.linux@gmail.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Souptick Joarder <jrdr.linux@gmail.com>
Cc: Will Deacon <will@kernel.org>
Link: http://lore.kernel.org/lkml/20191018085531.6348-3-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 84b0975f48 ]
The "EventName" for the DDRC precharge command event is incorrect, so
fix it.
Fixes: 57cc732479 ("perf jevents: Add support for Hisi hip08 DDRC PMU aliasing")
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linuxarm@huawei.com
Link: http://lore.kernel.org/lkml/1567612484-195727-2-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 791ce9c48c ]
When executing the task exit testing case, perf gets stuck in an endless
loop this case and doesn't return back on Arm64 Juno board.
After digging into this issue, since Juno board has Arm's big.LITTLE
CPUs, thus the PMUs are not compatible between the big CPUs and little
CPUs. This leads to a PMU event that cannot be enabled properly when
the traced task is migrated from one variant's CPU to another variant.
Finally, the test case runs into infinite loop for cannot read out any
event data after return from polling.
Eventually, we need to work out formal solution to allow PMU events can
be freely migrated from one CPU variant to another, but this is a
difficult task and a different topic. This patch tries to fix the Perf
test case to avoid infinite loop, when the testing detects 1000 times
retrying for reading empty events, it will directly bail out and return
failure. This allows the Perf tool can continue its other test cases.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/20191011091942.29841-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 800d3f5616 ]
We received a user report that call-graph DWARF mode was enabled in
'perf record' but 'perf report' didn't unwind the callstack correctly.
The reason was, libunwind was not compiled in.
We can use 'perf -vv' to check the compiled libraries but it would be
valuable to report a warning to user directly (especially valuable for
a perf newbie).
The warning is:
Warning:
Please install libunwind development packages during the perf build.
Both TUI and stdio are supported.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191011022122.26369-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6add129c5d ]
When fail to mmap events in task exit case, it misses to set 'err' to
-1; thus the testing will not report failure for it.
This patch sets 'err' to -1 when fails to mmap events, thus Perf tool
can report correct result.
Fixes: d723a55096 ("perf test: Add test case for checking number of EXIT events")
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/20191011091942.29841-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit af8490eb2b upstream.
The test case 'Read backward ring buffer' failed on 32-bit architectures
which were found by LKFT perf testing. The test failed on arm32 x15
device, qemu_arm32, qemu_i386, and found intermittent failure on i386;
the failure log is as below:
50: Read backward ring buffer :
--- start ---
test child forked, pid 510
Using CPUID GenuineIntel-6-9E-9
mmap size 1052672B
mmap size 8192B
Finished reading overwrite ring buffer: rewind
free(): invalid next size (fast)
test child interrupted
---- end ----
Read backward ring buffer: FAILED!
The log hints there have issue for memory usage, thus free() reports
error 'invalid next size' and directly exit for the case. Finally, this
issue is root caused as out of bounds memory access for the data array
'evsel->id'.
The backward ring buffer test invokes do_test() twice. 'evsel->id' is
allocated at the first call with the flow:
test__backward_ring_buffer()
`-> do_test()
`-> evlist__mmap()
`-> evlist__mmap_ex()
`-> perf_evsel__alloc_id()
So 'evsel->id' is allocated with one item, and it will be used in
function perf_evlist__id_add():
evsel->id[0] = id
evsel->ids = 1
At the second call for do_test(), it skips to initialize 'evsel->id'
and reuses the array which is allocated in the first call. But
'evsel->ids' contains the stale value. Thus:
evsel->id[1] = id -> out of bound access
evsel->ids = 2
To fix this issue, we will use evlist__open() and evlist__close() pair
functions to prepare and cleanup context for evlist; so 'evsel->id' and
'evsel->ids' can be initialized properly when invoke do_test() and avoid
the out of bounds memory access.
Fixes: ee74701ed8 ("perf tests: Add test to check backward ring buffer")
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naresh Kamboju <naresh.kamboju@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: stable@vger.kernel.org # v4.10+
Link: http://lore.kernel.org/lkml/20191107020244.2427-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit af833988c0 upstream.
Prior to version 3.23 SQLite does not support TRUE or FALSE, so always
use 1 and 0 for SQLite.
Fixes: 26c11206f4 ("perf scripts python: exported-sql-viewer.py: Use new 'has_calls' column")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org # v5.3+
Link: http://lore.kernel.org/lkml/20191113120206.26957-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
[Adrian: backported to v5.3, v5.4]
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull perf tooling fixes from Thomas Gleixner:
- Fix the time sorting algorithm which was broken due to truncation of
big numbers
- Fix the python script generator fail caused by a broken tracepoint
array iterator
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf tools: Fix time sorting
perf tools: Remove unused trace_find_next_event()
perf scripting engines: Iterate on tep event arrays directly
Daniel Borkmann says:
====================
pull-request: bpf 2019-11-02
The following pull-request contains BPF updates for your *net* tree.
We've added 6 non-merge commits during the last 6 day(s) which contain
a total of 8 files changed, 35 insertions(+), 9 deletions(-).
The main changes are:
1) Fix ppc BPF JIT's tail call implementation by performing a second pass
to gather a stable JIT context before opcode emission, from Eric Dumazet.
2) Fix build of BPF samples sys_perf_event_open() usage to compiled out
unavailable test_attr__{enabled,open} checks. Also fix potential overflows
in bpf_map_{area_alloc,charge_init} on 32 bit archs, from Björn Töpel.
3) Fix narrow loads of bpf_sysctl context fields with offset > 0 on big endian
archs like s390x and also improve the test coverage, from Ilya Leoshkevich.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The final sort might get confused when the comparison is done over
bigger numbers than int like for -s time.
Check the following report for longer workloads:
$ perf report -s time -F time,overhead --stdio
Fix hist_entry__sort() to properly return int64_t and not possible cut
int.
Fixes: 043ca389a3 ("perf tools: Use hpp formats to sort final output")
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org # v3.16+
Link: http://lore.kernel.org/lkml/20191104232711.16055-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
trace_find_next_event() was buggy and pretty much a useless helper. As
there are no more users, just remove it.
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Tzvetomir Stoyanov <tstoyanov@vmware.com>
Cc: linux-trace-devel@vger.kernel.org
Link: http://lore.kernel.org/lkml/20191017210636.224045576@goodmis.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Instead of calling a useless (and broken) helper function to get the
next event of a tep event array, just get the array directly and iterate
over it.
Note, the broken part was from trace_find_next_event() which after this
will no longer be used, and can be removed.
Committer notes:
This fixes a segfault when generating python scripts from perf.data
files with multiple tracepoint events, i.e. the following use case is
fixed by this patch:
# perf record -e sched:* sleep 1
[ perf record: Woken up 31 times to write data ]
[ perf record: Captured and wrote 0.031 MB perf.data (9 samples) ]
# perf script -g python
Segmentation fault (core dumped)
#
Reported-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Tzvetomir Stoyanov <tstoyanov@vmware.com>
Cc: linux-trace-devel@vger.kernel.org
Link: http://lkml.kernel.org/r/20191017153733.630cd5eb@gandalf.local.home
Link: http://lore.kernel.org/lkml/20191017210636.061448713@goodmis.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For users of perf-sys.h outside perf, e.g. samples/bpf/bpf_load.c, it's
convenient not to depend on test_attr__*.
After commit 91854f9a07 ("perf tools: Move everything related to
sys_perf_event_open() to perf-sys.h"), all users of perf-sys.h will
depend on test_attr__enabled and test_attr__open.
This commit enables a user to define HAVE_ATTR_TEST to zero in order
to omit the test dependency.
Fixes: 91854f9a07 ("perf tools: Move everything related to sys_perf_event_open() to perf-sys.h")
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Acked-by: Song Liu <songliubraving@fb.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: http://lore.kernel.org/bpf/20191001113307.27796-2-bjorn.topel@gmail.com
The memory @orig_flags is allocated by strdup(), it is freed on the
normal path, but leak to free on the error path.
Fix this by adding free(orig_flags) on the error path.
Fixes: 0e11115644 ("perf kmem: Print gfp flags in human readable string")
Signed-off-by: Yunfeng Ye <yeyunfeng@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Feilong Lin <linfeilong@huawei.com>
Cc: Hu Shiyuan <hushiyuan@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/f9e9f458-96f3-4a97-a1d5-9feec2420e07@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There is a memory leak problem in the failure paths of
build_cl_output(), so fix it.
Signed-off-by: Yunfeng Ye <yeyunfeng@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Feilong Lin <linfeilong@huawei.com>
Cc: Hu Shiyuan <hushiyuan@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/4d3c0178-5482-c313-98e1-f82090d2d456@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Store SYMBOL_ANNOTATE_ERRNO__BPF_MISSING_BTF in variable *ret*, instead
of returning in the middle of the function and leaking multiple
resources: prog_linfo, btf, s and bfdf.
Addresses-Coverity-ID: 1454832 ("Structurally dead code")
Fixes: 11aad897f6 ("perf annotate: Don't return -1 for error when doing BPF disassembly")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191014171047.GA30850@embeddedor
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Both build_mem_topology() and rm_rf_depth_pat() have resource leaks of
closedir() on the error paths.
Fix this by calling closedir() before function returns.
Fixes: e2091cedd5 ("perf tools: Add MEM_TOPOLOGY feature to perf data file")
Fixes: cdb6b0235f ("perf tools: Add pattern name checking to rm_rf")
Signed-off-by: Yunfeng Ye <yeyunfeng@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Feilong Lin <linfeilong@huawei.com>
Cc: Hu Shiyuan <hushiyuan@huawei.com>
Cc: Igor Lubashev <ilubashe@akamai.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Link: http://lore.kernel.org/lkml/cd5f7cd2-b80d-6add-20a1-32f4f43e0744@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In the earlier fix for the memory overrun of id arrays I managed to typo
the wrong event in the fix.
Of course we need to close the current event in the loop, not the
original failing event.
The same test case as in the original patch still passes.
Fixes: 7834fa948b ("perf evlist: Fix access of freed id arrays")
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lore.kernel.org/lkml/20191011182140.8353-2-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The build of file libperf-jvmti.so succeeds but the resulting
object fails to load:
# ~/linux/tools/perf/perf record -k mono -- java \
-XX:+PreserveFramePointer \
-agentpath:/root/linux/tools/perf/libperf-jvmti.so \
hog 100000 123450
Error occurred during initialization of VM
Could not find agent library /root/linux/tools/perf/libperf-jvmti.so
in absolute path, with error:
/root/linux/tools/perf/libperf-jvmti.so: undefined symbol: _ctype
Add the missing _ctype symbol into the build script.
Fixes: 79743bc927 ("perf jvmti: Link against tools/lib/string.o to have weak strlcpy()")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20191008093841.59387-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Return errno when open_memstream() fails and add two new speciall error
codes for when an invalid, non BPF file or one without BTF is passed to
symbol__disassemble_bpf(), so that its callers can rely on
symbol__strerror_disassemble() to convert that to a human readable error
message that can help figure out what is wrong, with hints even.
Cc: Russell King - ARM Linux admin <linux@armlinux.org.uk>
Cc: Song Liu <songliubraving@fb.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
Cc: Will Deacon <will@kernel.org>
Link: https://lkml.kernel.org/n/tip-usevw9r2gcipfcrbpaueurw0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We should return errno or the annotation extra range understood by
symbol__strerror_disassemble() instead of -1, fix it, returning ENOMEM
instead.
Reported-by: Russell King - ARM Linux admin <linux@armlinux.org.uk>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
Cc: Will Deacon <will@kernel.org>
Link: https://lkml.kernel.org/n/tip-8of1cmj3rz0mppfcshc9bbqq@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
They are called from symbol__annotate() and to propagate errors that can
help understand the problem make them return what
symbol__strerror_disassemble() known, i.e. errno codes and other
annotation specific errors in a special, out of errnos, range.
Reported-by: Russell King - ARM Linux admin <linux@armlinux.org.uk>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
Cc: Will Deacon <will@kernel.org>
Link: https://lkml.kernel.org/n/tip-pqx7srcv7tixgid251aeboj6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We were just returning -1 in symbol__annotate() when symbol__annotate()
failed, propagate its error as it is used later to pass to
symbol__strerror_disassemble() to present a error message to the user,
that in some cases were getting:
"Invalid -1 error code"
Fix it to propagate the error.
Reported-by: Russell King - ARM Linux admin <linux@armlinux.org.uk>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
Cc: Will Deacon <will@kernel.org>
Link: https://lkml.kernel.org/n/tip-0tj89rs9g7nbcyd5skadlvuu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Callers of symbol__annotate() expect a errno value or some other
extended error value range in symbol__strerror_disassemble() to
convert to a proper error string, fix it when propagating a failure to
find the arch specific annotation routines via arch__find(arch_name).
Reported-by: Russell King - ARM Linux admin <linux@armlinux.org.uk>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
Cc: Will Deacon <will@kernel.org>
Link: https://lkml.kernel.org/n/tip-o0k6dw7cas0vvmjjvgsyvu1i@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The callers of symbol__annotate2() use symbol__strerror_disassemble() to
convert its failure returns into a human readable string, so
propagate error values from functions it calls, starting with
perf_env__arch() that when fails the right thing to do is to look at
'errno' to see why its possible call to uname() failed.
Reported-by: Russell King - ARM Linux admin <linux@armlinux.org.uk>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
Cc: Will Deacon <will@kernel.org>
Link: https://lkml.kernel.org/n/tip-it5d83kyusfhb1q1b0l4pxzs@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
I.e. if evsel->evlist or evsel->evlist->env isn't set, return the
environment for the running machine, as that would be set if reading
from a perf.data file.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-uqq4grmhbi12rwb0lfpo6lfu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For consistency, propagate the exact cause for get_cpuid() to have
failed.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-9ig269f7ktnhh99g4l15vpu2@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The Intel fixed counters use a special table to override the JSON
information.
During this override the period information from the JSON file got
dropped, which results in inst_retired.any and similar running with
frequency mode instead of a period.
Just specify the expected period in the table.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lore.kernel.org/lkml/20190927233546.11533-2-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When the LBR data and the instructions in a binary do not match the loop
printing instructions could get confused and print a long stream of
bogus <bad> instructions.
The problem was that if the instruction decoder cannot decode an
instruction it ilen wasn't initialized, so the loop going through the
basic block would continue with the previous value.
Harden the code to avoid such problems:
- Make sure ilen is always freshly initialized and is 0 for bad
instructions.
- Do not overrun the code buffer while printing instructions
- Print a warning message if the final jump is not on an instruction
boundary.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lore.kernel.org/lkml/20190927233546.11533-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Specification claims latest version of jitdump file format is 2. Current
jit dump reading code treats 1 as the latest version.
Correct spec to match code.
The original language made it unclear the value to be written in the
magic field.
Revise language that the writer always writes the same value. Specify
that the reader uses the value to detect endian mismatches.
Signed-off-by: Steve MacLean <Steve.MacLean@Microsoft.com>
Acked-by: Stephane Eranian <eranian@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Brian Robbins <brianrob@microsoft.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Eric Saint-Etienne <eric.saint.etienne@oracle.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Keeping <john@metanate.com>
Cc: John Salem <josalem@microsoft.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Tom McDonald <thomas.mcdonald@microsoft.com>
Link: http://lore.kernel.org/lkml/BN8PR21MB1362F63CDE7AC69736FC7F9EF7800@BN8PR21MB1362.namprd21.prod.outlook.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
During perf inject --jit, JIT_CODE_MOVE records were injecting MMAP records
with an incorrect filename. Specifically it was missing the ".so" suffix.
Further the JIT_CODE_LOAD record were silently truncating the
jr->load.code_index field to 32 bits before generating the filename.
Make both records emit the same filename based on the full 64 bit
code_index field.
Fixes: 9b07e27f88 ("perf inject: Add jitdump mmap injection support")
Cc: stable@vger.kernel.org # v4.6+
Signed-off-by: Steve MacLean <Steve.MacLean@Microsoft.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Brian Robbins <brianrob@microsoft.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Eric Saint-Etienne <eric.saint.etienne@oracle.com>
Cc: John Keeping <john@metanate.com>
Cc: John Salem <josalem@microsoft.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom McDonald <thomas.mcdonald@microsoft.com>
Link: http://lore.kernel.org/lkml/BN8PR21MB1362FF8F127B31DBF4121528F7800@BN8PR21MB1362.namprd21.prod.outlook.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Whenever an mmap/mmap2 event occurs, the map tree must be updated to add a new
entry. If a new map overlaps a previous map, the overlapped section of the
previous map is effectively unmapped, but the non-overlapping sections are
still valid.
maps__fixup_overlappings() is responsible for creating any new map entries from
the previously overlapped map. It optionally creates a before and an after map.
When creating the after map the existing code failed to adjust the map.pgoff.
This meant the new after map would incorrectly calculate the file offset
for the ip. This results in incorrect symbol name resolution for any ip in the
after region.
Make maps__fixup_overlappings() correctly populate map.pgoff.
Add an assert that new mapping matches old mapping at the beginning of
the after map.
Committer-testing:
Validated correct parsing of libcoreclr.so symbols from .NET Core 3.0 preview9
(which didn't strip symbols).
Preparation:
~/dotnet3.0-preview9/dotnet new webapi -o perfSymbol
cd perfSymbol
~/dotnet3.0-preview9/dotnet publish
perf record ~/dotnet3.0-preview9/dotnet \
bin/Debug/netcoreapp3.0/publish/perfSymbol.dll
^C
Before:
perf script --show-mmap-events 2>&1 | grep -e MMAP -e unknown |\
grep libcoreclr.so | head -n 4
dotnet 1907 373352.698780: PERF_RECORD_MMAP2 1907/1907: \
[0x7fe615726000(0x768000) @ 0 08:02 5510620 765057155]: \
r-xp .../3.0.0-preview9-19423-09/libcoreclr.so
dotnet 1907 373352.701091: PERF_RECORD_MMAP2 1907/1907: \
[0x7fe615974000(0x1000) @ 0x24e000 08:02 5510620 765057155]: \
rwxp .../3.0.0-preview9-19423-09/libcoreclr.so
dotnet 1907 373352.701241: PERF_RECORD_MMAP2 1907/1907: \
[0x7fe615c42000(0x1000) @ 0x51c000 08:02 5510620 765057155]: \
rwxp .../3.0.0-preview9-19423-09/libcoreclr.so
dotnet 1907 373352.705249: 250000 cpu-clock: \
7fe6159a1f99 [unknown] \
(.../3.0.0-preview9-19423-09/libcoreclr.so)
After:
perf script --show-mmap-events 2>&1 | grep -e MMAP -e unknown |\
grep libcoreclr.so | head -n 4
dotnet 1907 373352.698780: PERF_RECORD_MMAP2 1907/1907: \
[0x7fe615726000(0x768000) @ 0 08:02 5510620 765057155]: \
r-xp .../3.0.0-preview9-19423-09/libcoreclr.so
dotnet 1907 373352.701091: PERF_RECORD_MMAP2 1907/1907: \
[0x7fe615974000(0x1000) @ 0x24e000 08:02 5510620 765057155]: \
rwxp .../3.0.0-preview9-19423-09/libcoreclr.so
dotnet 1907 373352.701241: PERF_RECORD_MMAP2 1907/1907: \
[0x7fe615c42000(0x1000) @ 0x51c000 08:02 5510620 765057155]: \
rwxp .../3.0.0-preview9-19423-09/libcoreclr.so
All the [unknown] symbols were resolved.
Signed-off-by: Steve MacLean <Steve.MacLean@Microsoft.com>
Tested-by: Brian Robbins <brianrob@microsoft.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Eric Saint-Etienne <eric.saint.etienne@oracle.com>
Cc: John Keeping <john@metanate.com>
Cc: John Salem <josalem@microsoft.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom McDonald <thomas.mcdonald@microsoft.com>
Link: http://lore.kernel.org/lkml/BN8PR21MB136270949F22A6A02335C238F7800@BN8PR21MB1362.namprd21.prod.outlook.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In the pmu-events directory for JSON file definitions use the
official machine name IBM z15 instead of machine type number
8561. This is consistent with previous machines.
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20190927081147.18345-2-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add s390 transaction counter definition for machine 8561. This is the
same file as for the predecessor machine.
Fixes: 6e67d77d67 ("perf vendor events s390: Add JSON files for machine type 8561")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20190927081147.18345-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The 'test_dir' variable is assigned to the 'release' array which is
out-of-scope 3 lines later.
Extend the scope of the 'release' array so that an out-of-scope array
isn't accessed.
Bug detected by clang's address sanitizer.
Fixes: 07bc5c699a ("perf tools: Make fetch_kernel_version() publicly available")
Cc: stable@vger.kernel.org # v4.4+
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lore.kernel.org/lkml/20190926220018.25402-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick the changes from:
78a1b96bcf ("fscrypt: add FS_IOC_REMOVE_ENCRYPTION_KEY_ALL_USERS ioctl")
23c688b540 ("fscrypt: allow unprivileged users to add/remove keys for v2 policies")
5dae460c22 ("fscrypt: v2 encryption policy support")
5a7e29924d ("fscrypt: add FS_IOC_GET_ENCRYPTION_KEY_STATUS ioctl")
b1c0ec3599 ("fscrypt: add FS_IOC_REMOVE_ENCRYPTION_KEY ioctl")
22d94f493b ("fscrypt: add FS_IOC_ADD_ENCRYPTION_KEY ioctl")
3b6df59bc4 ("fscrypt: use FSCRYPT_* definitions, not FS_*")
2336d0deb2 ("fscrypt: use FSCRYPT_ prefix for uapi constants")
7af0ab0d3a ("fs, fscrypt: move uapi definitions to new header <linux/fscrypt.h>")
That don't trigger any changes in tooling, as it so far is used only
for:
$ grep -l 'fs\.h' tools/perf/trace/beauty/*.sh | xargs grep regex=
tools/perf/trace/beauty/rename_flags.sh:regex='^[[:space:]]*#[[:space:]]*define[[:space:]]+RENAME_([[:alnum:]_]+)[[:space:]]+\(1[[:space:]]*<<[[:space:]]*([[:xdigit:]]+)[[:space:]]*\)[[:space:]]*.*'
tools/perf/trace/beauty/sync_file_range.sh:regex='^[[:space:]]*#[[:space:]]*define[[:space:]]+SYNC_FILE_RANGE_([[:alnum:]_]+)[[:space:]]+([[:xdigit:]]+)[[:space:]]*.*'
tools/perf/trace/beauty/usbdevfs_ioctl.sh:regex="^#[[:space:]]*define[[:space:]]+USBDEVFS_(\w+)(\(\w+\))?[[:space:]]+_IO[CWR]{0,2}\([[:space:]]*(_IOC_\w+,[[:space:]]*)?'U'[[:space:]]*,[[:space:]]*([[:digit:]]+).*"
tools/perf/trace/beauty/usbdevfs_ioctl.sh:regex="^#[[:space:]]*define[[:space:]]+USBDEVFS_(\w+)[[:space:]]+_IO[WR]{0,2}\([[:space:]]*'U'[[:space:]]*,[[:space:]]*([[:digit:]]+).*"
$
This silences this perf build warning:
Warning: Kernel ABI header at 'tools/include/uapi/linux/fs.h' differs from latest version at 'include/uapi/linux/fs.h'
diff -u tools/include/uapi/linux/fs.h include/uapi/linux/fs.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Eric Biggers <ebiggers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-44g48exl9br9ba0t64chqb4i@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
With this change if a perf_date parameter is provided to asciidoc then
it will override the default date written to the man page metadata.
Without this change, or if the perf_date isn't specified, then the
current date is written to the metadata.
Having this parameter allows the metadata to be constant if builds
happen on different dates.
The name of the parameter is intended to be consistent with the existing
perf_version parameter.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20190921041327.155054-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
An optimized build such as:
make -C tools/perf CLANG=1 CC=clang EXTRA_CFLAGS="-O3
will turn the dereference operation into a ud2 instruction, raising a
SIGILL rather than a SIGSEGV. Use raise(..) for correctness and clarity.
Similar issues were addressed in Numfor Mbiziwo-Tiapo's patch:
https://lkml.org/lkml/2019/7/8/1234
Committer testing:
Before:
[root@quaco ~]# perf test hooks
55: perf hooks : Ok
[root@quaco ~]# perf test -v hooks
55: perf hooks :
--- start ---
test child forked, pid 17092
SIGSEGV is observed as expected, try to recover.
Fatal error (SEGFAULT) in perf hook 'test'
test child finished with 0
---- end ----
perf hooks: Ok
[root@quaco ~]#
After:
[root@quaco ~]# perf test hooks
55: perf hooks : Ok
[root@quaco ~]# perf test -v hooks
55: perf hooks :
--- start ---
test child forked, pid 17909
SIGSEGV is observed as expected, try to recover.
Fatal error (SEGFAULT) in perf hook 'test'
test child finished with 0
---- end ----
perf hooks: Ok
[root@quaco ~]#
Fixes: a074865e60 ("perf tools: Introduce perf hooks")
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lore.kernel.org/lkml/20190925195924.152834-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Naresh Kamboju reported, that on the i386 build pr_err()
doesn't get defined properly due to header ordering:
perf-in.o: In function `libunwind__x86_reg_id':
tools/perf/util/libunwind/../../arch/x86/util/unwind-libunwind.c:109:
undefined reference to `pr_err'
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
They go on accumulating there like the debug.h one, that was introduced
here:
f23610245c ("perf list: Add debug support for outputing alias string")
But then, when that need is removed via:
2073ad3326 ("perf tools: Factor out PMU matching in parser")
The thing stays there, so continue the house cleaning spree...
list.h not needed, no macros from there are used, and 'struct
list_head' is in linux/types.h, ditto for util.h, no need for that as
well.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-zkxr3mf6inun8m5mbnil4u0d@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
With Java 11 there is no seperate JRE anymore.
Details:
https://coderanch.com/t/701603/java/JRE-JDK
Therefore the detection of the JRE needs to be adapted.
This change works for s390 and x86. I have not tested other platforms.
Committer testing:
Continues to work with the OpenJDK 8:
$ rm -f ~acme/lib64/libperf-jvmti.so
$ rpm -qa | grep jdk-devel
java-1.8.0-openjdk-devel-1.8.0.222.b10-0.fc30.x86_64
$ git log --oneline -1
a51937170f33 (HEAD -> perf/core) perf build: Add detection of java-11-openjdk-devel package
$ rm -rf /tmp/build/perf ; mkdir -p /tmp/build/perf ; make -C tools/perf O=/tmp/build/perf install > /dev/null 2>1
$ ls -la ~acme/lib64/libperf-jvmti.so
-rwxr-xr-x. 1 acme acme 230744 Sep 24 16:46 /home/acme/lib64/libperf-jvmti.so
$
Suggested-by: Andreas Krebbel <krebbel@linux.ibm.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20190909114116.50469-4-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch is to remove following hardware events
from JSON file which are not supported on POWER8.
pm_l3_p0_grp_pump
pm_l3_p0_lco_data
pm_l3_p0_lco_no_data
pm_l3_p0_lco_rty
Note: Unfortunately power8 event list is not publicly available.
Fixes: c3b4d5c4af ("perf vendor events: Remove P8 HW events which are not supported")
Signed-off-by: Mamatha Inamdar <mamatha4@linux.vnet.ibm.com>
Acked-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20190909065624.11956.3992.stgit@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
I'm not fully sure if this is the correct fix, but without this I get
crashes on more complex perf stat metric usages. The problem is that
part of the state gets freed when a weak group fails, but then is later
still used. Just don't free the ids, we're going to reuse them anyways
on the weak group retry.
For example:
% perf stat -M IpB,IpCall,IpTB,IPC,Retiring_SMT,Frontend_Bound_SMT,Kernel_Utilization,CPU_Utilization --metric-only -a -I 1000 sleep 2
crashes and gives in valgrind:
=21527== Invalid write of size 8
==21527== at 0x4EE582: hlist_add_head (list.h:644)
==21527== by 0x4EFD3C: perf_evlist__id_hash (evlist.c:477)
==21527== by 0x4EFD99: perf_evlist__id_add (evlist.c:483)
==21527== by 0x4EFF15: perf_evlist__id_add_fd (evlist.c:524)
==21527== by 0x4FC693: store_evsel_ids (evsel.c:2969)
==21527== by 0x4FC76C: perf_evsel__store_ids (evsel.c:2986)
==21527== by 0x450DA7: __run_perf_stat (builtin-stat.c:519)
==21527== by 0x451285: run_perf_stat (builtin-stat.c:636)
==21527== by 0x454619: cmd_stat (builtin-stat.c:1966)
==21527== by 0x4D557D: run_builtin (perf.c:310)
==21527== by 0x4D57EA: handle_internal_command (perf.c:362)
==21527== by 0x4D5931: run_argv (perf.c:406)
==21527== Address 0x12e3f008 is 104 bytes inside a block of size 2,056 free'd
==21527== at 0x4839A0C: free (vg_replace_malloc.c:540)
==21527== by 0x627139: xyarray__delete (xyarray.c:32)
==21527== by 0x4F6BE4: perf_evsel__free_id (evsel.c:1253)
==21527== by 0x4FA11F: evsel__close (evsel.c:1994)
==21527== by 0x4F30A3: perf_evlist__reset_weak_group (evlist.c:1783)
==21527== by 0x450B47: __run_perf_stat (builtin-stat.c:466)
==21527== by 0x451285: run_perf_stat (builtin-stat.c:636)
==21527== by 0x454619: cmd_stat (builtin-stat.c:1966)
==21527== by 0x4D557D: run_builtin (perf.c:310)
==21527== by 0x4D57EA: handle_internal_command (perf.c:362)
==21527== by 0x4D5931: run_argv (perf.c:406)
==21527== by 0x4D5CAE: main (perf.c:531)
==21527== Block was alloc'd at
==21527== at 0x483AB1A: calloc (vg_replace_malloc.c:762)
==21527== by 0x627024: zalloc (zalloc.c:8)
==21527== by 0x627088: xyarray__new (xyarray.c:10)
==21527== by 0x4F6B20: perf_evsel__alloc_id (evsel.c:1237)
==21527== by 0x4FC74E: perf_evsel__store_ids (evsel.c:2983)
==21527== by 0x450DA7: __run_perf_stat (builtin-stat.c:519)
==21527== by 0x451285: run_perf_stat (builtin-stat.c:636)
==21527== by 0x454619: cmd_stat (builtin-stat.c:1966)
==21527== by 0x4D557D: run_builtin (perf.c:310)
==21527== by 0x4D57EA: handle_internal_command (perf.c:362)
==21527== by 0x4D5931: run_argv (perf.c:406)
==21527== by 0x4D5CAE: main (perf.c:531)
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lore.kernel.org/lkml/20190923233339.25326-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Make sure to not free the name passed in by the caller, but free all the
allocated ids when parsing expressions.
The loop at the end knows that the first entry shouldn't be freed, so
make sure the caller name is the first entry.
Fixes
% perf stat -M IpB,IpCall,IpTB,IPC,Retiring_SMT,Frontend_Bound_SMT,Kernel_Utilization,CPU_Utilization --metric-only -a -I 1000 sleep 2
valgrind:
1.009943231 ==21527== Invalid read of size 1
==21527== at 0x483CB74: strcmp (vg_replace_strmem.c:849)
==21527== by 0x582CF8: collect_all_aliases (stat-display.c:554)
==21527== by 0x582EB3: collect_data (stat-display.c:577)
==21527== by 0x583A32: print_counter_aggr (stat-display.c:806)
==21527== by 0x584FAD: perf_evlist__print_counters (stat-display.c:1200)
==21527== by 0x45133A: print_counters (builtin-stat.c:655)
==21527== by 0x450629: process_interval (builtin-stat.c:353)
==21527== by 0x450FBD: __run_perf_stat (builtin-stat.c:564)
==21527== by 0x451285: run_perf_stat (builtin-stat.c:636)
==21527== by 0x454619: cmd_stat (builtin-stat.c:1966)
==21527== by 0x4D557D: run_builtin (perf.c:310)
==21527== by 0x4D57EA: handle_internal_command (perf.c:362)
==21527== Address 0x12826cd0 is 0 bytes inside a block of size 25 free'd
==21527== at 0x4839A0C: free (vg_replace_malloc.c:540)
==21527== by 0x627041: __zfree (zalloc.c:13)
==21527== by 0x57F66A: generic_metric (stat-shadow.c:814)
==21527== by 0x580B21: perf_stat__print_shadow_stats (stat-shadow.c:1057)
==21527== by 0x58418E: print_metric_headers (stat-display.c:943)
==21527== by 0x5844BC: print_interval (stat-display.c:1004)
==21527== by 0x584DEB: perf_evlist__print_counters (stat-display.c:1172)
==21527== by 0x45133A: print_counters (builtin-stat.c:655)
==21527== by 0x450629: process_interval (builtin-stat.c:353)
==21527== by 0x450FBD: __run_perf_stat (builtin-stat.c:564)
==21527== by 0x451285: run_perf_stat (builtin-stat.c:636)
==21527== by 0x454619: cmd_stat (builtin-stat.c:1966)
==21527== Block was alloc'd at
==21527== at 0x483880B: malloc (vg_replace_malloc.c:309)
==21527== by 0x51677DE: strdup (in /usr/lib64/libc-2.29.so)
==21527== by 0x506457: parse_events_name (parse-events.c:1754)
==21527== by 0x5550BB: parse_events_parse (parse-events.y:214)
==21527== by 0x50694D: parse_events__scanner (parse-events.c:1887)
==21527== by 0x506AEF: parse_events (parse-events.c:1927)
==21527== by 0x521D8B: metricgroup__parse_groups (metricgroup.c:527)
==21527== by 0x45156F: parse_metric_groups (builtin-stat.c:721)
==21527== by 0x6228A9: get_value (parse-options.c:243)
==21527== by 0x62363F: parse_short_opt (parse-options.c:348)
==21527== by 0x62363F: parse_options_step (parse-options.c:536)
==21527== by 0x62363F: parse_options_subcommand (parse-options.c:651)
==21527== by 0x453C1D: cmd_stat (builtin-stat.c:1718)
==21527== by 0x4D557D: run_builtin (perf.c:310)
and also a leak report.
Committer testing:
Before:
# perf stat -M IpB,IpCall,IpTB,IPC,Retiring_SMT,Frontend_Bound_SMT,Kernel_Utilization,CPU_Utilization --metric-only -a -I 1000 sleep 2
# time CPU_Utilization
1.000470810 free(): double free detected in tcache 2
Aborted (core dumped)
#
After:
# perf stat -M IpB,IpCall,IpTB,IPC,Retiring_SMT,Frontend_Bound_SMT,Kernel_Utilization,CPU_Utilization --metric-only -a -I 1000 sleep 2
# time CPU_Utilization
1.000494752 0.1
2.001105112 0.1
#
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lore.kernel.org/lkml/20190923233339.25326-3-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The perf_sample struct definition and the event_attr_init() are in
util/event.h, but some places were getting it thru an otherwise needless
util/mmap.h header, fix it by including util/event.h directly.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-p1anwyjdbbvghrkl9dlxv7h5@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Further reducing the size of util/evsel.h.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-20zr7di9eynm0272mtjfdhfc@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ditch it, noone is using it, one more stdio.h include in a hot header.
Fix the fallout in parse-events.y, where we end up using a FILE pointer,
I think due to YYDEBUG being set and in some places, like Amazon Linux 1
we don't get stdio.h included by luck, like in most other places, add a
explicit stdio.h include directive.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-37k5q0lhdbo2hvvfbnnzn7og@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We already had evsel_fprintf.c, add its counterpart, so that we can
reduce evsel.h a bit more.
We needed a new perf_event_attr_fprintf.c file so as to have a separate
object to link with the python binding in tools/perf/util/python-ext-sources
and not drag symbol_conf, etc into the python binding.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-06bdmt1062d9unzgqmxwlv88@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
So that we an later link it to the python binding without having to
drag the symbol object files.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-8823tveyasocnuoelq4qopwf@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Further reducing the util.c hodgepodge files.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-0i62zh7ok25znibyebgq0qs4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move perf_evlist__poll() from tools/perf to libperf, it will be used in
the following patches.
And rename the existing perf's function to evlist__poll().
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-39-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move perf_evlist__add_pollfd() from tools/perf to libperf, it will be
used in the following patches.
Also rename perf's perf_evlist__add_pollfd()/perf_evlist__filter_pollfd()
to evlist__add_pollfd()/evlist__filter_pollfd().
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-38-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move perf_evlist__alloc_pollfd() from tools/perf to libperf, it will be
used in the following patches.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-37-jolsa@kernel.org
[ Added api/fd/array.h include to the lib/evlist.c file ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add libperf_init() call to the automated tests.
Committer notes:
Added missing stdarg.h and/or stdio.h to places using vfprintf.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-34-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The libperf_set_print() function needs to be called in any case so let's
merge it with libperf_init(), so we have just one init function.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-34-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add libperf dependency for tests targets.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-36-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The sys/types.h header looks more sensible, from its name we can gather
it should be there because of some needed typedef, and it is much
smaller than unistd.h, so use it and fix up the fallout in places where
it was being used for something else entirely but being obtained by
sheer luck, indirectly.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-49bn251httu22ymwgipeavmy@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
That was done just to have users of writen() and readn(), that before
had their prototypes in util/util.h to get it without having to add an
include for internal/lib.h, but the right way is to add it and by now
all places already do it.
Fix a fallout were readlink() was used but unistd.h was being obtained
by luck thru util.h -> internal/lib.h, now to check why unistd.h is
being included there...
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-lcnytgrtafey3kwlfog2rzzj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We need the 'page_size' variable in libperf, so move it there.
Add a libperf_init() as a global libperf init function to obtain this
value via sysconf() at tool start.
Committer notes:
Add internal/lib.h to tools/perf/ files using 'page_size', sometimes
replacing util.h with it if that was the only reason for having util.h
included.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-33-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add the perf_evlist__id_add_fd() function to libperf as an internal
function.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-32-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add the perf_evlist__id_add() function to libperf as an internal
function. We already have the 'heads' member in 'struct perf_evlist'.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-31-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add the perf_evlist__read_format() function to libperf as internal
function.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-30-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add perf_evlist__first()/last() functions to libperf, as internal
functions and rename perf's origins to evlist__first/last.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-29-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add perf_evsel__alloc_id()/perf_evsel__free_id() functions to libperf as
internal functions.
Move 'struct perf_sample_id' to internal/evsel.h header and change
'struct perf_sample_id::evsel' to 'struct perf_evsel' and the related
code that touches it.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-28-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move 'heads' hash table from 'struct evlist' to 'struct perf_evlist'.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-27-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move 'ids' from 'struct evsel' to libperf's 'struct perf_evsel'.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-26-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move the 'id' array from 'struct evsel' to libperf's 'struct perf_evsel'.
Committer note:
Fix the tools/perf/util/cs-etm.c build, i.e. aarch64's CoreSight.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-25-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move 'sample_id' array from 'struct evsel' to libperf's 'struct perf_evsel'.
Committer notes:
Removed the 'struct xyarray' from util/evsel.h, not needed anymore
there.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-24-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We were getting it by luck, from files included before internal/evsel.h
where it is being included.
Fixes: 9dfcb75990 ("libperf: Move fd array from perf's evsel to lobperf's perf_evsel class")
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: https://lkml.kernel.org/n/tip-r8ukhxprpkflbd2k9vcc42v1@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Moving 'pollfd' from 'struct evlist' to 'struct perf_evlist' it will be
used in following patches.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-23-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Moving 'mmap_len' from 'struct evlist' to 'struct perf_evlist' it will
be used in following patches.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-22-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Moving 'nr_mmaps' from 'struct evlist' to 'struct perf_evlist', it will
be used in following patches.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-21-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move the 'system_wide 'member from perf's evsel to libperf's perf_evsel.
Committer notes:
Added stdbool.h as we now use bool here.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-20-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move 'flush' from tools/perf's mmap to libperf's perf_mmap struct.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-19-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move 'event_copy' from tools/perf's mmap to libperf's perf_mmap struct.
Committer notes:
Add linux/compiler.h as we need it for '__aligned'.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-18-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>