linux/include
Kan Liang f2fb6bef92 perf/core: Optimize side-band event delivery
The perf_event_aux() function iterates all PMUs and all events in
their respective per-CPU contexts to find the events to deliver
side-band records to.

For example, the brk test case in lkp triggers many mmap() operations,
which, if we're also running perf, results in many perf_event_aux()
invocations.

If we enable uncore PMU support (even when uncore events are not used),
dozens of uncore PMUs will be iterated, which can significantly
decrease brk_test's throughput.

For example, the brk throughput:

  without uncore PMUs: 2647573 ops_per_sec
  with    uncore PMUs: 1768444 ops_per_sec

... a 33% reduction.

To get at the per-CPU events that need side-band records, this patch
puts these events on a per-CPU list. This avoids iterating the PMUs
and any events that do not need side-band records.
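
The idea can be sketched in plain userspace C. This is only an
illustration, not the kernel code: struct sb_event, sb_list,
sb_event_attach() and sb_deliver() are hypothetical names, NR_CPUS is
fixed at 4 for the sketch, and a static array stands in for a real
per-CPU variable. Locking and RCU are left out entirely.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4	/* hypothetical CPU count, for this sketch only */

/* Hypothetical, heavily simplified stand-in for a perf event. */
struct sb_event {
	int needs_sideband;	/* event wants side-band records */
	int records_seen;	/* side-band records delivered so far */
	struct sb_event *next;	/* link on the per-CPU side-band list */
};

/* One list head per CPU; stands in for a real per-CPU variable. */
static struct sb_event *sb_list[NR_CPUS];

/*
 * Attach an event to a CPU. Only events that need side-band records
 * are put on that CPU's list, so delivery never has to look at the
 * others.
 */
static void sb_event_attach(struct sb_event *ev, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);

	ev->records_seen = 0;
	ev->next = NULL;
	if (!ev->needs_sideband)
		return;

	ev->next = sb_list[cpu];
	sb_list[cpu] = ev;
}

/*
 * Deliver one side-band record on the given CPU by walking only that
 * CPU's list. Returns the number of events visited.
 */
static int sb_deliver(int cpu)
{
	struct sb_event *ev;
	int visited = 0;

	assert(cpu >= 0 && cpu < NR_CPUS);

	for (ev = sb_list[cpu]; ev; ev = ev->next) {
		ev->records_seen++;
		visited++;
	}
	return visited;
}
```

In this sketch the cost of a delivery depends on the number of events
on the local CPU's list, not on how many PMUs or unrelated events
exist.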

Per-task events are unchanged to avoid extra overhead on the context
switch paths.

Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reported-by: Huang, Ying <ying.huang@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1458757477-3781-1-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-03 09:40:15 +02:00
acpi device property: Avoid potential dereferences of invalid pointers 2016-04-27 23:40:02 +02:00
asm-generic Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-05-16 15:15:17 -07:00
clocksource
crypto Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2016-03-17 11:33:45 -07:00
drm drm: Loongson-3 doesn't fully support wc memory 2016-04-22 10:24:11 +10:00
dt-bindings The clk changes for this release cycle are mostly dominated by 2016-03-23 06:06:45 -07:00
keys tpm: fix checks for policy digest existence in tpm2_seal_trusted() 2016-02-10 04:10:55 +02:00
kvm arm64: KVM: vgic-v3: Avoid accessing ICH registers 2016-03-09 04:24:04 +00:00
linux perf/core: Optimize side-band event delivery 2016-06-03 09:40:15 +02:00
math-emu
media [media] media: vb2: Fix regression on poll() for RW mode 2016-04-25 10:21:23 -03:00
memory
misc cxl: Remove cxl_get_phys_dev() kernel API 2016-03-09 23:40:02 +11:00
net udp_tunnel: Remove redundant udp_tunnel_gro_complete(). 2016-05-06 18:25:26 -04:00
pcmcia
ras
rdma IB/security: Restrict use of the write() interface 2016-04-28 12:03:16 -04:00
rxrpc rxrpc: Be more selective about the types of received packets we accept 2016-03-04 15:56:06 +00:00
scsi Merge branch 'fixes-base' into fixes 2016-04-05 06:56:47 -04:00
soc IOMMU Updates for Linux v4.6 2016-03-22 11:57:43 -07:00
sound ALSA: hda - Update BCLK also at hotplug for i915 HSW/BDW 2016-04-26 10:11:11 +02:00
target target: add a new add_wwn_groups fabrics method 2016-03-30 20:06:44 -07:00
trace Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu 2016-04-27 16:57:36 +02:00
uapi perf core: Per event callchain limit 2016-05-30 12:41:44 -03:00
video gpu: ipu-v3: ipu-dmfc: Rename ipu_dmfc_init_channel to ipu_dmfc_config_wait4eot 2016-03-31 11:24:33 +02:00
xen xen: Fix page <-> pfn conversion on 32 bit systems 2016-04-06 11:18:17 +01:00
Kbuild