linux/kernel
Mathieu Desnoyers 6496968e6c markers: use synchronize_sched()
Markers do not mix well with CONFIG_PREEMPT_RCU because it uses
preempt_disable/enable() and not rcu_read_lock/unlock for minimal
intrusiveness.  We would need call_sched and sched_barrier primitives.

Currently, the modification (connection and disconnection) of probes
from markers requires changes to the data structure done in RCU-style :
a new data structure is created, the pointer is changed atomically, a
quiescent state is reached and then the old data structure is freed.

The quiescent state is reached once all the currently running
preempt_disable regions are done running.  We use the call_rcu mechanism
to execute kfree() after such quiescent state has been reached.
However, the new CONFIG_PREEMPT_RCU version of call_rcu and rcu_barrier
does not guarantee that all preempt_disable code regions have finished,
hence the race.

The "proper" way to do this is to use rcu_read_lock/unlock, but we don't
want to use it to minimize intrusiveness on the traced system.  (we do
not want the marker code to call into much of the OS code, because it
would quickly restrict what can and cannot be instrumented, such as the
scheduler).

The temporary fix, until we get call_rcu_sched and rcu_barrier_sched in
mainline, is to use synchronize_sched before each call_rcu calls, so we
wait for the quiescent state in the system call code path.  It will slow
down batch marker enable/disable, but will make sure the race is gone.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-02 15:28:19 -07:00
..
irq genirq: do not leave interupts enabled on free_irq 2008-02-19 10:43:58 +01:00
power Merge branches 'release' and 'doc' into release 2008-03-13 01:59:53 -04:00
time clocksource: revert: use init_timer_deferrable for clocksource_watchdog 2008-03-25 20:13:25 +01:00
.gitignore Update kernel/.gitignore with new auto-generated files 2008-02-09 23:27:01 -08:00
Kconfig.hz sched: high-res preemption tick 2008-01-25 21:08:29 +01:00
Kconfig.preempt rcu: move PREEMPT_RCU config option back under PREEMPT 2008-03-10 18:01:20 -07:00
Makefile avoid overflows in kernel/time.c 2008-02-08 09:22:39 -08:00
acct.c bsd_acct: using task_struct->tgid is not right in pid-namespaces 2008-03-24 19:22:20 -07:00
audit.c audit: silence two kerneldoc warnings in kernel/audit.c 2008-03-28 14:45:21 -07:00
audit.h [PATCH] audit: watching subtrees 2007-10-21 02:37:45 -04:00
audit_tree.c Introduce path_put() 2008-02-14 21:13:33 -08:00
auditfilter.c Introduce path_put() 2008-02-14 21:13:33 -08:00
auditsc.c [PATCH] Audit: Fix the format type for size_t variables 2008-03-01 07:16:06 -05:00
backtracetest.c x86: add a simple backtrace test module 2008-01-30 13:33:08 +01:00
capability.c Add 64-bit capability support to the kernel 2008-02-05 09:44:20 -08:00
cgroup.c NULL noise: fs/*, mm/*, kernel/* 2008-03-30 14:18:41 -07:00
cgroup_debug.c Task Control Groups: simple task cgroup debug info subsystem 2007-10-19 11:53:36 -07:00
compat.c hrtimer: don't modify restart_block->fn in restart functions 2008-02-10 10:48:03 +01:00
configs.c
cpu.c cpu: fix section mismatch warnings for enable_nonboot_cpus 2008-02-08 09:22:41 -08:00
cpuset.c cpusets: fix obsolete comment 2008-03-05 17:53:33 -08:00
delayacct.c Add scaled time to taskstats based process accounting 2007-10-18 14:37:28 -07:00
dma.c whitespace fixes: DMA channel allocator 2007-10-18 14:37:24 -07:00
exec_domain.c whitespace fixes: execution domains 2007-10-18 14:37:26 -07:00
exit.c Fix waitid si_code regression 2008-03-08 11:54:00 -08:00
extable.c module: Don't report discarded init pages as kernel text. 2008-01-29 17:13:18 +11:00
fork.c memcgroup: fix spurious EBUSY on memory cgroup removal 2008-03-28 14:45:21 -07:00
futex.c NULL noise: fs/*, mm/*, kernel/* 2008-03-30 14:18:41 -07:00
futex_compat.c futex_compat __user annotation 2008-03-30 14:18:41 -07:00
hrtimer.c hrtimer: catch expired CLOCK_REALTIME timers early 2008-02-14 22:08:30 +01:00
itimer.c ITIMER_REAL: convert to use struct pid 2008-02-08 09:22:29 -08:00
kallsyms.c remove support for un-needed _extratext section 2008-02-06 10:41:01 -08:00
kexec.c vmcoreinfo: add "VMCOREINFO_" to all the call for vmcoreinfo_append_str() 2008-02-07 08:42:25 -08:00
kfifo.c
kmod.c Dont touch fs_struct in usermodehelper 2008-02-14 21:13:32 -08:00
kprobes.c kprobes: fix a null pointer bug in register_kretprobe() 2008-03-04 16:35:19 -08:00
ksysfs.c Kobject: convert remaining kobject_unregister() to kobject_put() 2008-01-24 20:40:40 -08:00
kthread.c sched: fix, always create kernel threads with normal priority 2008-01-25 21:08:33 +01:00
latencytop.c sched: latencytop support 2008-01-25 21:08:34 +01:00
lockdep.c Subject: lockdep: include all lock classes in all_lock_classes 2008-02-25 23:03:02 +01:00
lockdep_internals.h
lockdep_proc.c lockdep: Avoid /proc/lockdep & lock_stat infinite output 2007-10-11 22:11:11 +02:00
marker.c markers: use synchronize_sched() 2008-04-02 15:28:19 -07:00
module.c modules: warn about suspicious return values from module's ->init() hook 2008-03-10 18:01:20 -07:00
mutex-debug.c kernel: remove fastcall in kernel/* 2008-02-08 09:22:31 -08:00
mutex-debug.h
mutex.c kernel: remove fastcall in kernel/* 2008-02-08 09:22:31 -08:00
mutex.h
notifier.c kernel/notifier.c should #include <linux/reboot.h> 2008-02-06 10:41:02 -08:00
ns_cgroup.c cgroups: implement namespace tracking subsystem 2007-10-19 11:53:37 -07:00
nsproxy.c namespaces: move the IPC namespace under IPC_NS option 2008-02-08 09:22:23 -08:00
panic.c ACPI: Taint kernel on ACPI table override (format corrected) 2008-02-06 22:07:51 -05:00
params.c Add new string functions strict_strto* and convert kernel params to use them 2008-02-08 09:22:41 -08:00
pid.c kernel: remove fastcall in kernel/* 2008-02-08 09:22:31 -08:00
pid_namespace.c namespaces: cleanup the code managed with PID_NS option 2008-02-08 09:22:23 -08:00
pm_qos_params.c pm qos infrastructure and interface 2008-02-05 09:44:22 -08:00
posix-cpu-timers.c Use find_task_by_vpid in posix timers 2008-02-08 09:22:41 -08:00
posix-timers.c hrtimer: check relative timeouts for overflow 2008-02-14 22:08:30 +01:00
printk.c Make printk() console semaphore accesses sensible 2008-03-24 19:25:08 -07:00
profile.c Nuke a duplicate include from profile.c 2008-02-08 09:22:34 -08:00
ptrace.c ptrace_check_attach: remove unneeded ->signal != NULL check 2008-02-08 09:22:26 -08:00
rcuclassic.c Preempt-RCU: implementation 2008-01-25 21:08:24 +01:00
rcupdate.c rcupdate: fix comment 2008-02-13 16:21:18 -08:00
rcupreempt.c rcupreempt: remove never-migrates assumption from rcu_process_callbacks() 2008-02-29 20:21:13 +01:00
rcupreempt_trace.c Preempt-RCU: implementation 2008-01-25 21:08:24 +01:00
rcutorture.c cpu-hotplug: replace lock_cpu_hotplug() with get_online_cpus() 2008-01-25 21:08:02 +01:00
relay.c relay: set an spd_release() hook for splice 2008-03-26 12:04:09 +01:00
res_counter.c Memory Resource Controller use strstrip while parsing arguments 2008-03-04 16:35:09 -08:00
resource.c [POWERPC] Add arch-specific walk_memory_remove() for 64-bit powerpc 2008-02-08 19:52:48 +11:00
rtmutex-debug.c Don't operate with pid_t in rtmutex tester 2008-02-08 09:22:41 -08:00
rtmutex-debug.h
rtmutex-tester.c Driver core: change sysdev classes to use dynamic kobject names 2008-01-24 20:40:40 -08:00
rtmutex.c hrtimer: more hrtimer_init_sleeper() fallout. 2008-02-13 15:45:36 +01:00
rtmutex.h
rtmutex_common.h Don't operate with pid_t in rtmutex tester 2008-02-08 09:22:41 -08:00
rwsem.c sched: mark rwsem functions as __sched for wchan/profiling 2007-12-18 15:21:13 +01:00
sched.c NOHZ: reevaluate idle sleep length after add_timer_on() 2008-03-26 08:28:55 +01:00
sched_debug.c sched: improve affine wakeups 2008-03-19 04:27:53 +01:00
sched_fair.c sched: cleanup old and rarely used 'debug' features. 2008-03-21 16:43:47 +01:00
sched_idletask.c sched: high-res preemption tick 2008-01-25 21:08:29 +01:00
sched_rt.c sched: balance RT task resched only on runqueue 2008-03-07 16:43:00 +01:00
sched_stats.h sched: clean up kernel/sched_stat.h 2007-11-28 15:52:56 +01:00
seccomp.c
signal.c freezer vs stopped or traced 2008-03-04 07:59:54 -08:00
softirq.c rcu: add support for dynamic ticks and preempt rcu 2008-02-29 18:46:50 +01:00
softlockup.c softlockup: fix task state setting 2008-02-29 18:46:53 +01:00
spinlock.c spinlock: lockbreak cleanup 2008-01-30 13:31:20 +01:00
srcu.c make srcu_readers_active() static 2008-02-06 10:41:02 -08:00
stacktrace.c
stop_machine.c stopmachine: semaphore to mutex 2008-02-06 10:41:08 -08:00
sys.c Pidns: make full use of xxx_vnr() calls 2008-02-08 09:22:29 -08:00
sys_ni.c timerfd: new timerfd API 2008-02-05 09:44:07 -08:00
sysctl.c sched: revert load_balance_monitor() changes 2008-03-04 17:54:06 +01:00
sysctl_check.c constify tables in kernel/sysctl_check.c 2008-02-08 09:22:31 -08:00
taskstats.c kernel/taskstats.c: fix bogus nlmsg_free() 2007-11-14 18:45:44 -08:00
test_kprobes.c kprobes: kretprobe user entry-handler 2008-02-06 10:41:11 -08:00
time.c avoid overflows in kernel/time.c 2008-02-08 09:22:39 -08:00
timeconst.pl timeconst.pl: correct reversal of USEC_TO_HZ and HZ_TO_USEC 2008-02-12 14:29:26 -08:00
timer.c NOHZ: reevaluate idle sleep length after add_timer_on() 2008-03-26 08:28:55 +01:00
tsacct.c Add scaled time to taskstats based process accounting 2007-10-18 14:37:28 -07:00
uid16.c
user.c sched: rt-group: make rt groups scheduling configurable 2008-02-13 15:45:40 +01:00
user_namespace.c namespaces: cleanup the code managed with the USER_NS option 2008-02-08 09:22:23 -08:00
utsname.c Fix UTS corruption during clone(CLONE_NEWUTS) 2007-09-19 11:24:17 -07:00
utsname_sysctl.c Isolate the UTS namespace's domainname and hostname back 2007-11-29 09:24:53 -08:00
wait.c kernel: remove fastcall in kernel/* 2008-02-08 09:22:31 -08:00
workqueue.c workqueue: make delayed_work_timer_fn() static 2008-02-08 09:22:37 -08:00