linux/kernel
Linus Torvalds 654443e20d Merge branch 'perf-uprobes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull user-space probe instrumentation from Ingo Molnar:
 "The uprobes code originates from SystemTap and has been used for years
  in Fedora and RHEL kernels.  This version is much rewritten, reviews
  from PeterZ, Oleg and myself shaped the end result.

  This tree includes uprobes support in 'perf probe' - but SystemTap
  (and other tools) can take advantage of user probe points as well.

  Sample usage of uprobes via perf, for example to profile malloc()
  calls without modifying user-space binaries.

  First boot a new kernel with CONFIG_UPROBE_EVENT=y enabled.

  If you don't know which function you want to probe you can pick one
  from 'perf top' or can get a list all functions that can be probed
  within libc (binaries can be specified as well):

	$ perf probe -F -x /lib/libc.so.6

  To probe libc's malloc():

	$ perf probe -x /lib64/libc.so.6 malloc
	Added new event:
	probe_libc:malloc    (on 0x7eac0)

  You can now use it in all perf tools, such as:

	perf record -e probe_libc:malloc -aR sleep 1

  Make use of it to create a call graph (as the flat profile is going to
  look very boring):

	$ perf record -e probe_libc:malloc -gR make
	[ perf record: Woken up 173 times to write data ]
	[ perf record: Captured and wrote 44.190 MB perf.data (~1930712

	$ perf report | less

	  32.03%            git  libc-2.15.so   [.] malloc
	                    |
	                    --- malloc

	  29.49%            cc1  libc-2.15.so   [.] malloc
	                    |
	                    --- malloc
	                       |
	                       |--0.95%-- 0x208eb1000000000
	                       |
	                       |--0.63%-- htab_traverse_noresize

	  11.04%             as  libc-2.15.so   [.] malloc
	                     |
	                     --- malloc
	                        |

	   7.15%             ld  libc-2.15.so   [.] malloc
	                     |
	                     --- malloc
	                        |

	   5.07%             sh  libc-2.15.so   [.] malloc
	                     |
	                     --- malloc
	                        |
	   4.99%  python-config  libc-2.15.so   [.] malloc
	          |
	          --- malloc
	             |
	   4.54%           make  libc-2.15.so   [.] malloc
	                   |
	                   --- malloc
	                      |
	                      |--7.34%-- glob
	                      |          |
	                      |          |--93.18%-- 0x41588f
	                      |          |
	                      |           --6.82%-- glob
	                      |                     0x41588f

	   ...

  Or:

	$ perf report -g flat | less

	# Overhead        Command  Shared Object      Symbol
	# ........  .............  .............  ..........
	#
	  32.03%            git  libc-2.15.so   [.] malloc
	          27.19%
	              malloc

	  29.49%            cc1  libc-2.15.so   [.] malloc
	          24.77%
	              malloc

	  11.04%             as  libc-2.15.so   [.] malloc
	          11.02%
	              malloc

	   7.15%             ld  libc-2.15.so   [.] malloc
	           6.57%
	              malloc

	 ...

  The core uprobes design is fairly straightforward: uprobes probe
  points register themselves at (inode:offset) addresses of
  libraries/binaries, after which all existing (or new) vmas that map
  that address will have a software breakpoint injected at that address.
  vmas are COW-ed to preserve original content.  The probe points are
  kept in an rbtree.

  If user-space executes the probed inode:offset instruction address
  then an event is generated which can be recovered from the regular
  perf event channels and mmap-ed ring-buffer.

  Multiple probes at the same address are supported, they create a
  dynamic callback list of event consumers.

  The basic model is further complicated by the XOL speedup: the
  original instruction that is probed is copied (in an architecture
  specific fashion) and executed out of line when the probe triggers.
  The XOL area is a single vma per process, with a fixed number of
  entries (which limits probe execution parallelism).

  The API: uprobes are installed/removed via
  /sys/kernel/debug/tracing/uprobe_events, the API is integrated to
  align with the kprobes interface as much as possible, but is separate
  to it.

  Injecting a probe point is privileged operation, which can be relaxed
  by setting perf_paranoid to -1.

  You can use multiple probes as well and mix them with kprobes and
  regular PMU events or tracepoints, when instrumenting a task."

Fix up trivial conflicts in mm/memory.c due to previous cleanup of
unmap_single_vma().

* 'perf-uprobes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
  perf probe: Detect probe target when m/x options are absent
  perf probe: Provide perf interface for uprobes
  tracing: Fix kconfig warning due to a typo
  tracing: Provide trace events interface for uprobes
  tracing: Extract out common code for kprobes/uprobes trace events
  tracing: Modify is_delete, is_return from int to bool
  uprobes/core: Decrement uprobe count before the pages are unmapped
  uprobes/core: Make background page replacement logic account for rss_stat counters
  uprobes/core: Optimize probe hits with the help of a counter
  uprobes/core: Allocate XOL slots for uprobes use
  uprobes/core: Handle breakpoint and singlestep exceptions
  uprobes/core: Rename bkpt to swbp
  uprobes/core: Make order of function parameters consistent across functions
  uprobes/core: Make macro names consistent
  uprobes: Update copyright notices
  uprobes/core: Move insn to arch specific structure
  uprobes/core: Remove uprobe_opcode_sz
  uprobes/core: Make instruction tables volatile
  uprobes: Move to kernel/events/
  uprobes/core: Clean up, refactor and improve the code
  ...
2012-05-24 11:39:34 -07:00
..
debug KGDB/KDB regression fixes 2012-04-04 17:26:08 -07:00
events Merge branch 'perf-uprobes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-05-24 11:39:34 -07:00
gcov
irq Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml 2012-05-23 09:01:41 -07:00
power PM / Hibernate: Use get_gendisk to verify partition if resume_file is integer format 2012-05-18 20:44:59 +02:00
sched Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2012-05-23 17:42:39 -07:00
time Merge 3.4-rc5 into staging-next 2012-05-02 11:48:07 -07:00
trace Merge branch 'perf-uprobes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-05-24 11:39:34 -07:00
.gitignore
acct.c
async.c
audit_tree.c
audit_watch.c
audit.c
audit.h
auditfilter.c
auditsc.c seccomp: remove duplicated failure logging 2012-04-14 11:13:20 +10:00
backtracetest.c
bounds.c
capability.c userns: Teach inode_capable to understand inodes whose uids map to other namespaces. 2012-05-15 14:59:24 -07:00
cgroup_freezer.c cgroup: convert all non-memcg controllers to the new cftype interface 2012-04-01 12:09:55 -07:00
cgroup.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2012-05-23 17:42:39 -07:00
compat.c new helper: sigsuspend() 2012-05-21 23:52:30 -04:00
configs.c
cpu_pm.c
cpu.c smp, idle: Allocate idle thread for each possible cpu during boot 2012-05-03 19:32:34 +02:00
cpuset.c Merge branch 'for-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2012-05-22 17:40:19 -07:00
crash_dump.c
cred.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2012-05-23 17:42:39 -07:00
delayacct.c
dma.c
elfcore.c
exec_domain.c
exit.c cred: use correct cred accessor with regards to rcu read lock 2012-05-17 16:50:06 -06:00
extable.c extable: Skip sorting if sorted at build time. 2012-04-19 15:06:55 -07:00
fork.c Merge branch 'perf-uprobes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-05-24 11:39:34 -07:00
freezer.c
futex_compat.c futex: Mark get_robust_list as deprecated 2012-03-29 11:37:17 +02:00
futex.c futex: Mark get_robust_list as deprecated 2012-03-29 11:37:17 +02:00
groups.c userns: Convert in_group_p and in_egroup_p to use kgid_t 2012-05-03 03:29:33 -07:00
hrtimer.c
hung_task.c hung task debugging: Inject NMI when hung and going to panic 2012-04-25 12:39:25 +02:00
irq_work.c irq_work: fix compile failure on tile from missing include 2012-04-13 13:15:16 -04:00
itimer.c itimer: Use printk_once instead of WARN_ONCE 2012-04-10 11:00:30 +02:00
jump_label.c
kallsyms.c
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt
kexec.c Merge branch 'akpm' (Andrew's patch-bomb) 2012-03-28 17:19:28 -07:00
kfifo.c [media] kernel:kfifo: export __kfifo_max_r symbol 2012-04-11 18:24:37 -03:00
kmod.c
kprobes.c
ksysfs.c
kthread.c
latencytop.c
lockdep_internals.h
lockdep_proc.c
lockdep_states.h
lockdep.c
Makefile smp: Add generic smpboot facility 2012-04-26 12:06:09 +02:00
module.c Guard check in module loader against integer overflow 2012-05-23 22:28:53 +09:30
mutex-debug.c
mutex-debug.h
mutex.c
mutex.h
notifier.c
nsproxy.c
padata.c padata: Fix cpu hotplug 2012-03-29 19:52:46 +08:00
panic.c panic: fix stack dump print on direct call to panic() 2012-04-12 13:12:12 -07:00
params.c params: replace printk(KERN_<LVL>...) with pr_<lvl>(...) 2012-05-04 17:28:18 -07:00
pid_namespace.c pidns: add reboot_pid_ns() to handle the reboot syscall 2012-03-28 17:14:36 -07:00
pid.c
posix-cpu-timers.c
posix-timers.c
printk.c printk() - isolate KERN_CONT users from ordinary complete lines 2012-05-14 12:36:45 -07:00
profile.c
ptrace.c userns: Convert ptrace, kill, set_priority permission checks to work with kuids and kgids 2012-05-03 03:28:51 -07:00
range.c
rcu.h
rcupdate.c rcu: Make exit_rcu() more precise and consolidate 2012-05-02 14:48:27 -07:00
rcutiny_plugin.h rcu: Make exit_rcu() more precise and consolidate 2012-05-02 14:48:27 -07:00
rcutiny.c
rcutorture.c rcu: Add rcutorture test for call_srcu() 2012-04-30 10:48:26 -07:00
rcutree_plugin.h Merge branches 'barrier.2012.05.09a', 'fixes.2012.04.26a', 'inline.2012.05.02b' and 'srcu.2012.05.07b' into HEAD 2012-05-11 10:14:21 -07:00
rcutree_trace.c rcu: Make rcu_barrier() less disruptive 2012-05-09 14:27:54 -07:00
rcutree.c Merge branch 'rcu/next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu 2012-05-14 08:41:46 +02:00
rcutree.h Merge branches 'barrier.2012.05.09a', 'fixes.2012.04.26a', 'inline.2012.05.02b' and 'srcu.2012.05.07b' into HEAD 2012-05-11 10:14:21 -07:00
relay.c
res_counter.c res_counter: Account max_usage when calling res_counter_charge_nofail() 2012-04-27 14:37:09 -07:00
resource.c
rtmutex_common.h
rtmutex-debug.c
rtmutex-debug.h
rtmutex-tester.c
rtmutex.c
rtmutex.h
rwsem.c
seccomp.c seccomp: fix build warnings when there is no CONFIG_SECCOMP_FILTER 2012-04-18 12:24:52 +10:00
semaphore.c semaphore: fix improper comment reference to mutex 2012-04-05 17:15:55 -07:00
signal.c Merge branch 'perf-uprobes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-05-24 11:39:34 -07:00
smp.c smp: Implement kick_all_cpus_sync() 2012-05-08 12:35:06 +02:00
smpboot.c smp, idle: Allocate idle thread for each possible cpu during boot 2012-05-03 19:32:34 +02:00
smpboot.h smp: Fix idle_thread_init() inline stub 2012-05-04 12:52:25 +02:00
softirq.c
spinlock.c
srcu.c rcu: Implement per-domain single-threaded call_srcu() state machine 2012-04-30 10:48:25 -07:00
stacktrace.c
stop_machine.c
sys_ni.c
sys.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2012-05-23 17:42:39 -07:00
sysctl_binary.c
sysctl.c sysctl: fix write access to dmesg_restrict/kptr_restrict 2012-04-05 14:51:43 +10:00
taskstats.c
test_kprobes.c
time.c
timeconst.pl
timer.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2012-05-23 17:42:39 -07:00
tracepoint.c
tsacct.c
uid16.c userns: Convert setting and getting uid and gid system calls to use kuid and kgid 2012-05-03 03:28:41 -07:00
up.c
user_namespace.c userns: Store uid and gid values in struct cred with kuid_t and kgid_t types 2012-05-03 03:28:38 -07:00
user-return-notifier.c
user.c userns: Silence silly gcc warning. 2012-05-19 15:44:40 -06:00
utsname_sysctl.c
utsname.c userns: Use cred->user_ns instead of cred->user->user_ns 2012-04-07 16:55:51 -07:00
wait.c
watchdog.c
workqueue_sched.h
workqueue.c lockdep: fix oops in processing workqueue 2012-05-15 08:08:31 -07:00