linux/tools/perf
Ingo Molnar 1c13f3c904 perf: Add 'perf bench numa mem' NUMA performance measurement suite
Add a suite of NUMA performance benchmarks.

The goal was to simulate the behavior and access patterns of real NUMA
workloads, via a wide range of parameters, so this tool goes well
beyond the simple bzero() measurements that most NUMA micro-benchmarks use:

 - It processes the data and creates a chain of data dependencies,
   like a real workload would. Neither the compiler, nor the
   kernel (via KSM and other optimizations) nor the CPU can
   eliminate parts of the workload.

 - It randomizes the initial state and also randomizes the target
   addresses of the processing - it's not a simple forward scan
   of addresses.

 - It provides flexible options to set process, thread and memory
   relationship information: -G sets "global" memory shared between
   all test processes, -P sets "process" memory shared by all
   threads of a process and -T sets "thread" private memory.

 - There's a NUMA convergence monitoring and convergence latency
   measurement option via -c and -m.

 - Micro-sleeps and synchronization can be injected to provoke lock
   contention and scheduling, via the -u and -S options. This simulates
   IO and contention.

 - The -x option instructs the workload to 'perturb' itself artificially
   every N seconds, by moving to the first and last CPU of the system
   periodically. This way the stability of convergence equilibrium and
   the number of steps taken for the scheduler to reach equilibrium again
   can be measured.

 - The amount of work can be specified via the -l loop count, and/or
   via a -s seconds-timeout value.

 - CPU and node memory binding options, to test hard binding scenarios.
   THP can be turned on and off via madvise() calls.

 - Live reporting of convergence progress in an 'at a glance' output format.
   Printing of convergence and deconvergence events.
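
The first two points in the list above (a chain of data dependencies, plus
randomized initial state and target addresses) can be illustrated with a
small sketch. This is not the code added by this commit: the names
fill_random() and process_chain() and the xorshift64 generator are
assumptions chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One xorshift64 step; the state must be non-zero. */
static uint64_t xorshift64(uint64_t *state)
{
	uint64_t x = *state;

	x ^= x << 13;
	x ^= x >> 7;
	x ^= x << 17;
	*state = x;
	return x;
}

/* Randomize the initial contents so that no two pages look alike. */
void fill_random(uint64_t *buf, size_t nr_words, uint64_t seed)
{
	uint64_t state = seed ? seed : 1;	/* xorshift must not start at 0 */
	size_t i;

	for (i = 0; i < nr_words; i++)
		buf[i] = xorshift64(&state);
}

/*
 * Process the buffer so that every access depends on the data seen so
 * far: the next index is derived from the running value, so this is
 * not a forward scan, and the result is returned so the work cannot be
 * discarded as dead code.
 */
uint64_t process_chain(uint64_t *buf, size_t nr_words, size_t nr_steps)
{
	uint64_t val = 0;
	size_t idx = 0, i;

	assert(nr_words > 0);

	for (i = 0; i < nr_steps; i++) {
		val += buf[idx];		/* load feeds the running value */
		buf[idx] = val;			/* store modifies the touched word */
		idx = (size_t)(val % nr_words);	/* next address depends on the data */
	}
	return val;
}
```

Because each index is computed from previously loaded data, the loads in
this sketch cannot be reordered into a simple linear sweep, which is the
property the list above describes.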

The 'perf bench numa mem -a' option starts an array of about 30
individual tests, each of which prints measurements like these:

 # Running  5x5-bw-thread, "perf bench numa mem -p 5 -t 5 -P 512 -s 20 -zZ0q --thp  1"
  5x5-bw-thread,                         20.276, secs,           runtime-max/thread
  5x5-bw-thread,                         20.004, secs,           runtime-min/thread
  5x5-bw-thread,                         20.155, secs,           runtime-avg/thread
  5x5-bw-thread,                          0.671, %,              spread-runtime/thread
  5x5-bw-thread,                         21.153, GB,             data/thread
  5x5-bw-thread,                        528.818, GB,             data-total
  5x5-bw-thread,                          0.959, nsecs,          runtime/byte/thread
  5x5-bw-thread,                          1.043, GB/sec,         thread-speed
  5x5-bw-thread,                         26.081, GB/sec,         total-speed
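
The derived figures in this sample are consistent with simple ratios of
the measured ones. The sketch below recomputes them; the formulas are
inferred from the numbers shown above rather than taken from the tool's
source, and the struct and function names are hypothetical. It also
assumes 1 GB = 10^9 bytes, so that secs/GB equals nsecs/byte.

```c
#include <assert.h>

/* Measured inputs, as printed in the sample output. */
struct numa_sample {
	double runtime_max;	/* secs, runtime-max/thread */
	double runtime_min;	/* secs, runtime-min/thread */
	double data_per_thread;	/* GB,   data/thread */
	double data_total;	/* GB,   data-total */
};

/* GB/sec per thread: data/thread over the slowest thread's runtime. */
double thread_speed(const struct numa_sample *s)
{
	assert(s->runtime_max > 0.0);
	return s->data_per_thread / s->runtime_max;
}

/* GB/sec overall: data-total over the slowest thread's runtime. */
double total_speed(const struct numa_sample *s)
{
	assert(s->runtime_max > 0.0);
	return s->data_total / s->runtime_max;
}

/* nsecs per byte per thread (secs/GB == nsecs/byte for GB = 1e9 bytes). */
double nsecs_per_byte(const struct numa_sample *s)
{
	assert(s->data_per_thread > 0.0);
	return s->runtime_max / s->data_per_thread;
}

/* Half of the max-min runtime gap, as a percentage of the max runtime. */
double spread_percent(const struct numa_sample *s)
{
	assert(s->runtime_max > 0.0);
	return (s->runtime_max - s->runtime_min) / 2.0 / s->runtime_max * 100.0;
}

/* Number of threads implied by the two data figures. */
double implied_threads(const struct numa_sample *s)
{
	assert(s->data_per_thread > 0.0);
	return s->data_total / s->data_per_thread;
}
```

Fed the 5x5-bw-thread values (20.276, 20.004, 21.153, 528.818), these
helpers reproduce the printed figures to within rounding: 21.153 / 20.276
is about 1.043 GB/sec per thread, 528.818 / 20.276 is about 26.081 GB/sec
in total, and 528.818 / 21.153 is about 25, matching -p 5 -t 5.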

See the help text and the code for more details.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-01-30 10:35:36 -03:00
Documentation perf test: Allow skipping tests 2013-01-24 16:40:53 -03:00
arch Merge branch 'linus' into perf/core 2012-12-08 15:25:06 +01:00
bench perf: Add 'perf bench numa mem' NUMA performance measurement suite 2013-01-30 10:35:36 -03:00
config perf tools: Fix GNU make v3.80 compatibility issue 2013-01-24 16:40:17 -03:00
python perf python: Use attr.watermark in twatch.py 2012-01-30 18:38:23 -02:00
scripts perf script: Remove workqueue-stats script 2013-01-24 16:40:53 -03:00
tests perf tests: Fix leaks on PERF_RECORD_* test 2013-01-30 10:35:02 -03:00
ui perf ui browser: Free browser->helpline() on ui_browser__hide() 2013-01-25 12:49:29 -03:00
util perf header: Stop using die() calls when processing tracing data 2013-01-25 12:49:29 -03:00
.gitignore perf tools: Ignore compiled python binaries 2012-09-07 12:10:58 -03:00
CREDITS
MANIFEST perf tools: Fix building from 'make perf-*-src-pkg' tarballs 2013-01-10 16:03:26 -03:00
Makefile perf: Add 'perf bench numa mem' NUMA performance measurement suite 2013-01-30 10:35:36 -03:00
bash_completion perf tools: Complete tracepoint event names 2012-10-04 12:44:52 -03:00
builtin-annotate.c perf tools: Introduce struct hist_browser_timer 2012-11-05 14:03:58 -03:00
builtin-bench.c perf: Add 'perf bench numa mem' NUMA performance measurement suite 2013-01-30 10:35:36 -03:00
builtin-buildid-cache.c perf buildid-cache: Add option to show build ids that are missing in the cache 2012-12-09 08:46:08 -03:00
builtin-buildid-list.c perf symbols: Generalize filter in __fprintf_buildid methods 2012-12-09 08:46:07 -03:00
builtin-diff.c perf diff: Use internal rb tree for compute resort 2013-01-24 16:40:06 -03:00
builtin-evlist.c perf evsel: Adopt fprintf routine from 'perf evlist' 2012-12-11 17:19:53 -03:00
builtin-help.c perf help: Fix --help for builtins 2012-10-22 12:35:49 -02:00
builtin-inject.c perf inject: Mark a dso if it's used 2012-10-26 11:22:25 -02:00
builtin-kmem.c perf kmem: Use memdup() 2013-01-25 12:49:28 -03:00
builtin-kvm.c perf kvm: Initialize file_name var to fix segfault 2013-01-24 16:40:13 -03:00
builtin-list.c perf tools: Use __maybe_used for unused variables 2012-09-11 12:19:15 -03:00
builtin-lock.c perf tools: Add a global variable "const char *input_name" 2012-10-29 11:45:34 -02:00
builtin-probe.c perf probe: Don't use globals where not needed to 2012-10-02 18:36:37 -03:00
builtin-record.c perf machine: Simplify accessing the host machine 2013-01-24 16:40:13 -03:00
builtin-report.c perf report: Update documentation for sort keys 2013-01-24 16:40:28 -03:00
builtin-sched.c perf session: There is no need for a per session hists instance 2013-01-24 16:40:12 -03:00
builtin-script.c perf script: Don't display trace info when invoking scripts 2013-01-24 16:40:52 -03:00
builtin-stat.c perf evsel: Introduce perf_evsel__open_strerror method 2013-01-24 16:40:09 -03:00
builtin-timechart.c perf tools: Add a global variable "const char *input_name" 2012-10-29 11:45:34 -02:00
builtin-top.c perf tools: Allow passing a list to intlist__new 2013-01-24 16:40:53 -03:00
builtin-trace.c perf evlist: Set the leader in the perf_evlist__config method 2012-12-11 17:19:01 -03:00
builtin.h perf trace: New tool 2012-09-26 20:42:23 -03:00
command-list.txt perf trace: New tool 2012-09-26 20:42:23 -03:00
design.txt perf tools: Update ioctl documentation for PERF_IOC_FLAG_GROUP 2012-05-31 11:38:42 -03:00
perf-archive.sh perf archive: Make 'f' the last parameter for tar 2012-09-17 13:10:42 -03:00
perf.c perf tools: Remove some needless die() calls from the main routine 2013-01-24 16:40:52 -03:00
perf.h perf tools: Move get_term_dimensions from top to util.c 2013-01-24 16:40:34 -03:00