linux/arch/x86
Peter Zijlstra 9536c8d2da perf/x86: Optimize intel_pmu_pebs_fixup_ip()
There's been reports of high NMI handler overhead, highlighted by
such kernel messages:

  [ 3697.380195] perf samples too long (10009 > 10000), lowering kernel.perf_event_max_sample_rate to 13000
  [ 3697.389509] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 9.331 msecs

Don Zickus analyzed the source of the overhead and reported:

 > While there are a few places that are causing latencies, for now I focused on
 > the longest one first.  It seems to be 'copy_user_from_nmi'
 >
 > intel_pmu_handle_irq ->
 >	intel_pmu_drain_pebs_nhm ->
 >		__intel_pmu_drain_pebs_nhm ->
 >			__intel_pmu_pebs_event ->
 >				intel_pmu_pebs_fixup_ip ->
 >					copy_from_user_nmi
 >
 > In intel_pmu_pebs_fixup_ip(), if the while-loop goes over 50, the sum of
 > all the copy_from_user_nmi latencies seems to go over 1,000,000 cycles
 > (there are some cases where only 10 iterations are needed to go that high
 > too, but in generall over 50 or so).  At this point copy_user_from_nmi
 > seems to account for over 90% of the nmi latency.

The solution to that is to avoid having to call copy_from_user_nmi() for
every instruction.

Since we already limit the max basic block size, we can easily
pre-allocate a piece of memory to copy the entire thing into in one
go.

Don reported this test result:

 > Your patch made a huge difference in improvement.  The
 > copy_from_user_nmi() no longer hits the million of cycles.  I still
 > have a batch of 100,000-300,000 cycles.  My longest NMI paths used
 > to be dominated by copy_from_user_nmi, now it is not (I have to dig
 > up the new hot path).

Reported-and-tested-by: Don Zickus <dzickus@redhat.com>
Cc: jmario@redhat.com
Cc: acme@infradead.org
Cc: dave.hansen@linux.intel.com
Cc: eranian@google.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20131016105755.GX10651@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-16 15:44:00 +02:00
..
boot
configs
crypto
ia32
include compiler/gcc4: Add quirk for 'asm goto' miscompilation bug 2013-10-11 07:39:14 +02:00
kernel perf/x86: Optimize intel_pmu_pebs_fixup_ip() 2013-10-16 15:44:00 +02:00
kvm KVM: nVMX: fix shadow on EPT 2013-10-10 11:39:57 +02:00
lguest
lib
math-emu
mm x86: finish user fault error path with fatal signal 2013-09-12 15:38:01 -07:00
net
oprofile
pci Revert "x86/PCI: MMCONFIG: Check earlier for MMCONFIG region at address zero" 2013-10-04 16:15:29 -06:00
platform x86, efi: Don't map Boot Services on i386 2013-09-18 14:42:33 +01:00
power
realmode
syscalls
tools
um
vdso
video
xen Bug-fixes: 2013-09-25 15:50:53 -07:00
.gitignore
Kbuild
Kconfig x86, build, pci: Fix PCI_MSI build on !SMP 2013-10-04 10:43:34 -07:00
Kconfig.cpu
Kconfig.debug
Makefile
Makefile_32.cpu
Makefile.um