linux/kernel
Jason Wessel dfee3a7b92 debug_core: refactor locking for master/slave cpus
For quite some time there have been problems with memory barriers and
various races with NMI on multi processor systems using the kernel
debugger.  The algorithm for entering the kernel debug core and
resuming kernel execution was racy and had several known edge case
problems with attempting to debug something on a heavily loaded system
using breakpoints that are hit repeatedly and quickly.

The prior "locking" design entry worked as follows:

  * The atomic counter kgdb_active was used with atomic exchange in
    order to elect a master cpu out of all the cpus that may have
    taken a debug exception.
  * The master cpu increments all elements of passive_cpu_wait[].
  * The master cpu issues the round up cpus message.
  * Each "slave cpu" that enters the debug core increments its own
    element in cpu_in_kgdb[].
  * Each "slave cpu" spins on passive_cpu_wait[] until it becomes 0.
  * The master cpu debugs the system.

The new scheme removes the two arrays of atomic counters and replaces
them with 2 single counters.  One counter is used to count the number
of cpus waiting to become a master cpu (because one or more hit an
exception). The second counter is use to indicate how many cpus have
entered as slave cpus.

The new entry logic works as follows:

  * One or more cpus enters via kgdb_handle_exception() and increments
    the masters_in_kgdb. Each cpu attempts to get the spin lock called
    dbg_master_lock.
  * The master cpu sets kgdb_active to the current cpu.
  * The master cpu takes the spinlock dbg_slave_lock.
  * The master cpu asks to round up all the other cpus.
  * Each slave cpu that is not already in kgdb_handle_exception()
    will enter and increment slaves_in_kgdb.  Each slave will now spin
    try_locking on dbg_slave_lock.
  * The master cpu waits for the sum of masters_in_kgdb and slaves_in_kgdb
    to be equal to the sum of the online cpus.
  * The master cpu debugs the system.

In the new design the kgdb_active can only be changed while holding
dbg_master_lock.  Stress testing has not turned up any further
entry/exit races that existed in the prior locking design.  The prior
locking design suffered from atomic variables not being truly atomic
(in the capacity as used by kgdb) along with memory barrier races.

Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Acked-by: Dongdong Deng <dongdong.deng@windriver.com>
2010-10-22 15:34:13 -05:00
..
debug debug_core: refactor locking for master/slave cpus 2010-10-22 15:34:13 -05:00
gcov llseek: automatically add .llseek fop 2010-10-15 15:53:27 +02:00
irq genirq: Fix CONFIG_GENIRQ_NO_DEPRECATED=y build 2010-10-12 21:59:55 +02:00
power PM: Introduce library for device-specific OPPs (v7) 2010-10-17 01:57:50 +02:00
time ntp: Clamp PLL update interval 2010-09-09 20:48:37 +02:00
trace kdb,ftdump: Remove reference to internal kdb include 2010-10-22 15:34:11 -05:00
.gitignore
Kconfig.freezer
Kconfig.hz
Kconfig.locks mutex: Better control mutex adaptive spinning config 2009-12-03 11:50:11 +01:00
Kconfig.preempt
Makefile Merge branch 'core-memblock-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2010-10-21 18:52:11 -07:00
acct.c pass a struct path to vfs_statfs 2010-08-09 16:48:42 -04:00
async.c async: use workqueue for worker pool 2010-07-14 11:29:46 +02:00
audit.c Merge branch 'for-linus' of git://git.infradead.org/users/eparis/notify 2010-08-10 11:39:13 -07:00
audit.h Audit: split audit watch Kconfig 2010-07-28 09:58:19 -04:00
audit_tree.c fanotify: use both marks when possible 2010-07-28 10:18:55 -04:00
audit_watch.c Revert "fsnotify: store struct file not struct path" 2010-08-12 14:23:04 -07:00
auditfilter.c audit: do not get and put just to free a watch 2010-07-28 09:58:17 -04:00
auditsc.c vfs: add helpers to get root and pwd 2010-08-11 00:28:20 -04:00
backtracetest.c
bounds.c kbuild: move bounds.h to include/generated 2009-12-12 13:08:14 +01:00
capability.c sched: Remove remaining USER_SCHED code 2010-04-02 20:12:00 +02:00
cgroup.c Merge branch 'vfs' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl 2010-10-22 10:52:01 -07:00
cgroup_freezer.c Freezer / cgroup freezer: Update stale locking comments 2010-05-10 23:18:47 +02:00
compat.c compat: Make compat_alloc_user_space() incorporate the access_ok() 2010-09-14 16:08:45 -07:00
configs.c llseek: automatically add .llseek fop 2010-10-15 15:53:27 +02:00
cpu.c sched: adjust when cpu_active and cpuset configurations are updated during cpu on/offlining 2010-06-08 21:40:36 +02:00
cpuset.c security: remove unused parameter from security_task_setscheduler() 2010-10-21 10:12:44 +11:00
cred.c Add a dummy printk function for the maintenance of unused printks 2010-08-12 09:51:35 -07:00
delayacct.c headers: taskstats_kern.h trim 2009-09-18 09:48:52 -07:00
dma.c
elfcore.c elf coredump: add extended numbering support 2010-03-06 11:26:46 -08:00
exec_domain.c sys_personality: remove the bogus checks in sys_personality()->__set_personality() path 2010-08-09 20:45:05 -07:00
exit.c perf: Fix up delayed_put_task_struct() 2010-09-09 21:07:09 +02:00
extable.c
fork.c rmap: fix walk during fork 2010-09-22 17:22:39 -07:00
freezer.c
futex.c Merge branch 'futexes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2010-10-21 14:06:17 -07:00
futex_compat.c futex: Change 3rd arg of fetch_robust_entry() to unsigned int* 2010-09-18 12:19:21 +02:00
groups.c kernel/groups.c: fix integer overflow in groups_search 2010-09-09 18:57:24 -07:00
hrtimer.c hrtimer: Preserve timer state in remove_hrtimer() 2010-10-14 13:29:59 +02:00
hung_task.c lockup detector: Fix grammar by adding a missing "to" in the comments 2010-08-17 09:11:52 +02:00
hw_breakpoint.c perf, hw_breakpoint: Fix crash in hw_breakpoint creation 2010-10-18 19:58:55 +02:00
irq_work.c irq_work: Add generic hardirq context callbacks 2010-10-18 19:58:50 +02:00
itimer.c itimers: Fix racy writes to cpu_itimer fields 2009-11-18 16:32:12 +01:00
jump_label.c jump label: Add jump_label_text_reserved() to reserve jump points 2010-09-22 16:30:46 -04:00
kallsyms.c kdb: core for kgdb back end (2 of 2) 2010-05-20 21:04:21 -05:00
kexec.c kexec: return -EFAULT on copy_to_user() failures 2010-08-11 08:59:22 -07:00
kfifo.c kfifo: fix scatterlist usage 2010-10-01 10:50:58 -07:00
kmod.c Make do_execve() take a const filename pointer 2010-08-17 18:07:43 -07:00
kprobes.c Merge branch 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl 2010-10-22 10:52:56 -07:00
ksysfs.c sysfs: add struct file* to bin_attr callbacks 2010-05-21 09:37:31 -07:00
kthread.c kthread: implement kthread_data() 2010-06-29 10:07:09 +02:00
latencytop.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
lockdep.c lockdep: Check the depth of subclass 2010-10-18 18:44:26 +02:00
lockdep_internals.h lockdep: No need to disable preemption in debug atomic ops 2010-05-04 05:38:16 +02:00
lockdep_proc.c lockstat: Make lockstat counting per cpu 2010-04-06 00:15:37 +02:00
lockdep_states.h
module.c Merge commit 'v2.6.36-rc7' into perf/core 2010-10-08 10:46:27 +02:00
mutex-debug.c headers: remove sched.h from interrupt.h 2009-10-11 11:20:58 -07:00
mutex-debug.h locking: Implement new raw_spinlock 2009-12-14 23:55:32 +01:00
mutex.c mutex: Fix annotations to include it in kernel-locking docbook 2010-09-03 08:19:51 +02:00
mutex.h
notifier.c sched: Use lockdep-based checking on rcu_dereference() 2010-02-25 10:34:26 +01:00
ns_cgroup.c cgroups: let ss->can_attach and ss->attach do whole threadgroups at a time 2009-09-24 07:20:58 -07:00
nsproxy.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
padata.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2010-08-04 15:23:14 -07:00
panic.c lib/bug.c: add oops end marker to WARN implementation 2010-08-11 08:59:22 -07:00
params.c param: locking for kernel parameters 2010-08-11 23:04:20 +09:30
perf_event.c Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2010-10-21 12:54:49 -07:00
pid.c Add RCU check for find_task_by_vpid(). 2010-08-19 17:18:02 -07:00
pid_namespace.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
pm_qos_params.c llseek: automatically add .llseek fop 2010-10-15 15:53:27 +02:00
posix-cpu-timers.c Merge branch 'writable_limits' of git://decibel.fi.muni.cz/~xslaby/linux 2010-08-10 12:07:51 -07:00
posix-timers.c posix_timer: Move copy_to_user(created_timer_id) down in timer_create() 2010-07-23 15:08:12 +02:00
printk.c printk: Make console_sem a semaphore not a pseudo mutex 2010-10-12 17:36:10 +02:00
profile.c llseek: automatically add .llseek fop 2010-10-15 15:53:27 +02:00
ptrace.c ptrace: optimize exit_ptrace() for the likely case 2010-08-11 08:59:19 -07:00
range.c kernel/range: remove unused definition of ARRAY_SIZE() 2010-08-09 20:45:06 -07:00
rcupdate.c Merge branch 'rcu/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu into core/rcu 2010-10-07 09:43:11 +02:00
rcutiny.c rcu: Add a TINY_PREEMPT_RCU 2010-08-20 08:55:00 -07:00
rcutiny_plugin.h rcu: performance fixes to TINY_PREEMPT_RCU callback checking 2010-08-27 10:51:17 -07:00
rcutorture.c rcu: fix sparse errors in rcutorture.c 2010-09-23 09:16:42 -07:00
rcutree.c rcu: using ACCESS_ONCE() to observe the jiffies_stall/rnp->qsmask value 2010-10-07 10:41:06 -07:00
rcutree.h rcu: Add tracing data to support queueing models 2010-09-23 09:16:53 -07:00
rcutree_plugin.h rcu: fix _oddness handling of verbose stall warnings 2010-09-02 16:15:30 -07:00
rcutree_trace.c rcu: Add tracing data to support queueing models 2010-09-23 09:16:53 -07:00
relay.c kernel/: convert cpu notifier to return encapsulate errno value 2010-05-27 09:12:48 -07:00
res_counter.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
resource.c resource: shared I/O region support 2010-05-11 12:01:10 -07:00
rtmutex-debug.c sched: Convert pi_lock to raw_spinlock 2009-12-14 23:55:33 +01:00
rtmutex-debug.h
rtmutex-tester.c rtmutex-tester: make it build without BKL 2010-10-19 11:29:56 +02:00
rtmutex.c rtmutes: Convert rtmutex.lock to raw_spinlock 2009-12-14 23:55:33 +01:00
rtmutex.h
rtmutex_common.h
rwsem.c
sched.c Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2010-10-21 12:55:43 -07:00
sched_clock.c sched_clock: Add local_clock() API and improve documentation 2010-06-09 10:34:49 +02:00
sched_cpupri.c sched: No need for bootmem special cases 2010-07-17 12:06:22 +02:00
sched_cpupri.h sched: No need for bootmem special cases 2010-07-17 12:06:22 +02:00
sched_debug.c sched: Use correct macro to display sched_child_runs_first in /proc/sched_debug 2010-07-21 21:46:12 +02:00
sched_fair.c Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2010-10-21 12:55:43 -07:00
sched_features.h sched: Remove irq time from available CPU power 2010-10-18 20:52:27 +02:00
sched_idletask.c sched: Cure load average vs NO_HZ woes 2010-04-23 11:02:02 +02:00
sched_rt.c sched: Do not account irq time to current task 2010-10-18 20:52:26 +02:00
sched_stats.h sched: Remove the obsolete exit_state/signal hacks 2010-06-18 10:46:56 +02:00
sched_stoptask.c sched: Create special class for stop/migrate work 2010-10-18 18:41:58 +02:00
seccomp.c
semaphore.c
signal.c HWPOISON: Copy si_addr_lsb to user 2010-10-07 09:41:25 +02:00
smp.c generic-ipi: Fix deadlock in __smp_call_function_single 2010-09-10 16:48:40 +02:00
softirq.c Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2010-10-21 14:11:46 -07:00
spinlock.c locking: Cleanup the name space completely 2009-12-14 23:55:33 +01:00
srcu.c kernel: Remove undead ifdef CONFIG_DEBUG_LOCK_ALLOC 2010-09-23 09:14:51 -07:00
stacktrace.c
stop_machine.c sched: Create special class for stop/migrate work 2010-10-18 18:41:58 +02:00
sys.c pid: make setpgid() system call use RCU read-side critical section 2010-08-31 17:00:18 -07:00
sys_ni.c powerpc: define a compat_sys_recv cond_syscall 2010-09-23 17:03:55 +10:00
sysctl.c sysctl: fix min/max handling in __do_proc_doulongvec_minmax() 2010-10-07 13:31:21 -07:00
sysctl_binary.c sysctl: don't use own implementation of hex_to_bin() 2010-05-25 08:07:05 -07:00
sysctl_check.c sysctl: min/max bounds are optional 2010-10-15 14:42:24 -07:00
taskstats.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
test_kprobes.c kprobes: Fix selftest to clear flags field for reusing probes 2010-10-14 08:55:27 +02:00
time.c time: Kill off CONFIG_GENERIC_TIME 2010-07-27 12:40:54 +02:00
timeconst.pl
timer.c irq_work: Add generic hardirq context callbacks 2010-10-18 19:58:50 +02:00
tracepoint.c jump_label: Use more consistent naming 2010-10-18 19:58:56 +02:00
tsacct.c mm: clean up mm_counter 2010-03-06 11:26:23 -08:00
uid16.c headers: utsname.h redux 2009-09-23 18:13:10 -07:00
up.c
user-return-notifier.c core: Clean up user return notifers use of per_cpu 2009-12-02 10:22:59 +01:00
user.c sched: Remove a stale comment 2010-05-10 08:48:39 +02:00
user_namespace.c user_ns: Introduce user_nsmap_uid and user_ns_map_gid. 2010-06-16 14:55:34 -07:00
utsname.c
utsname_sysctl.c sysctl kernel: Remove binary sysctl logic 2009-11-12 02:04:55 -08:00
wait.c
watchdog.c Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2010-10-21 12:54:49 -07:00
workqueue.c workqueue: add documentation 2010-09-13 10:26:52 +02:00
workqueue_sched.h workqueue: implement concurrency managed dynamic worker pool 2010-06-29 10:07:14 +02:00