linux/kernel
Peter Zijlstra 5fbd036b55 sched: Cleanup cpu_active madness
Stepan found:

CPU0		CPUn

_cpu_up()
  __cpu_up()

		boostrap()
		  notify_cpu_starting()
		  set_cpu_online()
		  while (!cpu_active())
		    cpu_relax()

<PREEMPT-out>

smp_call_function(.wait=1)
  /* we find cpu_online() is true */
  arch_send_call_function_ipi_mask()

  /* wait-forever-more */

<PREEMPT-in>
		  local_irq_enable()

  cpu_notify(CPU_ONLINE)
    sched_cpu_active()
      set_cpu_active()

Now the purpose of cpu_active is mostly with bringing down a cpu, where
we mark it !active to avoid the load-balancer from moving tasks to it
while we tear down the cpu. This is required because we only update the
sched_domain tree after we brought the cpu-down. And this is needed so
that some tasks can still run while we bring it down, we just don't want
new tasks to appear.

On cpu-up however the sched_domain tree doesn't yet include the new cpu,
so its invisible to the load-balancer, regardless of the active state.
So instead of setting the active state after we boot the new cpu (and
consequently having to wait for it before enabling interrupts) set the
cpu active before we set it online and avoid the whole mess.

Reported-by: Stepan Moskovchenko <stepanm@codeaurora.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1323965362.18942.71.camel@twins
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-12 20:43:15 +01:00
..
debug module: struct module_ref should contains long fields 2012-01-13 09:32:14 +10:30
events perf: Fix double start/stop in x86_pmu_start() 2012-02-07 16:58:56 +01:00
gcov gcov: disable CONSTRUCTORS for UML 2011-07-26 16:49:45 -07:00
irq genirq: Handle pending irqs in irq_startup() 2012-02-15 11:56:59 +01:00
power PM / Freezer: Thaw only kernel threads if freezing of kernel threads fails 2012-02-04 22:23:05 +01:00
sched sched: Cleanup cpu_active madness 2012-03-12 20:43:15 +01:00
time Merge branch 'rcu/fixes-for-v3.2' into rcu/urgent 2012-01-16 09:41:18 -08:00
trace Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-01-15 11:26:35 -08:00
.gitignore
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt sched: Isolate preempt counting in its own config option 2011-06-10 15:15:40 +02:00
Makefile PM: Make sysrq-o be available for CONFIG_PM unset 2012-01-14 00:33:03 +01:00
acct.c Merge branch 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-01-08 12:19:57 -08:00
async.c kernel/async: remove redundant declaration. 2012-01-13 09:32:18 +10:30
audit.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit 2012-01-17 16:41:31 -08:00
audit.h audit: remove AUDIT_SETUP_CONTEXT as it isn't used 2012-01-17 16:16:57 -05:00
audit_tree.c audit_tree,rcu: Convert call_rcu(__put_tree) to kfree_rcu() 2011-07-20 14:10:11 -07:00
audit_watch.c
auditfilter.c audit: allow interfield comparison in audit rules 2012-01-17 16:17:01 -05:00
auditsc.c kernel-doc: fix new warnings in auditsc.c 2012-01-23 08:44:53 -08:00
backtracetest.c
bounds.c
capability.c Revert "capabitlies: ns_capable can use the cap helpers rather than lsm call" 2012-01-17 10:19:41 -08:00
cgroup.c Merge branch 'for-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2012-01-09 12:59:24 -08:00
cgroup_freezer.c Merge branch 'for-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2012-01-09 12:59:24 -08:00
compat.c kernel: Fix files explicitly needing EXPORT_SYMBOL infrastructure 2011-10-31 19:30:05 -04:00
configs.c kernel/configs.c: include MODULE_*() when CONFIG_IKCONFIG_PROC=n 2011-07-25 20:57:15 -07:00
cpu.c Merge branch 'pm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm 2012-01-08 13:10:57 -08:00
cpu_pm.c cpu_pm: call notifiers during suspend 2011-09-23 12:05:29 +05:30
cpuset.c Merge branch 'for-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2012-01-09 12:59:24 -08:00
crash_dump.c Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux 2011-11-06 19:44:47 -08:00
cred.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
delayacct.c KVM: Steal time implementation 2011-07-14 12:59:14 +03:00
dma.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
elfcore.c
exec_domain.c
exit.c sched: Fix ancient race in do_exit() 2012-01-27 11:55:36 +01:00
extable.c extable, core_kernel_data(): Make sure all archs define _sdata 2011-05-20 08:56:56 +02:00
fork.c epoll: introduce POLLFREE to flush ->signalfd_wqh before kfree() 2012-02-24 11:42:50 -08:00
freezer.c freezer: kill unused set_freezable_with_signal() 2011-11-23 09:28:17 -08:00
futex.c futex: Fix uninterruptible loop due to gate_area 2011-12-31 11:48:28 -08:00
futex_compat.c
groups.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
hrtimer.c Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2011-11-28 08:43:52 -08:00
hung_task.c hung_task: fix false positive during vfork 2012-01-03 16:14:32 -08:00
irq_work.c kernel: fix two implicit header assumptions in irq_work.c 2011-10-31 09:20:12 -04:00
itimer.c [S390] cputime: add sparse checking and cleanup 2011-12-15 14:56:19 +01:00
jump_label.c Merge remote-tracking branch 'tip/perf/core' into kvm-updates/3.3 2011-12-27 11:22:24 +02:00
kallsyms.c
kexec.c kdump: crashk_res init check for /sys/kernel/kexec_crash_size 2012-01-12 20:13:11 -08:00
kfifo.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
kmod.c Merge branch 'pm-sleep' into pm-for-linus 2011-12-25 23:42:20 +01:00
kprobes.c kprobes: fix a memory leak in function pre_handler_kretprobe() 2012-02-03 16:16:41 -08:00
ksysfs.c kernel: ksysfs.c is implicitly using stat.h 2011-10-31 09:20:13 -04:00
kthread.c freezer: kill unused set_freezable_with_signal() 2011-11-23 09:28:17 -08:00
latencytop.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
lockdep.c Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-01-06 08:02:58 -08:00
lockdep_internals.h
lockdep_proc.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
lockdep_states.h
module.c error: implicit declaration of function 'module_flags_taint' 2012-01-15 16:21:07 -08:00
mutex-debug.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
mutex-debug.h
mutex.c sched/rt: Use schedule_preempt_disabled() 2012-03-01 10:28:03 +01:00
mutex.h
notifier.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
nsproxy.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
padata.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
panic.c panic: don't print redundant backtraces on oops 2012-01-12 20:13:11 -08:00
params.c module: make module param bint handle nul value 2012-02-14 11:02:15 +10:30
pid.c vfs: fix panic in __d_lookup() with high dentry hashtable counts 2012-02-13 20:45:38 -05:00
pid_namespace.c sysctl: add the kernel.ns_last_pid control 2012-01-12 20:13:11 -08:00
posix-cpu-timers.c [S390] cputime: add sparse checking and cleanup 2011-12-15 14:56:19 +01:00
posix-timers.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
printk.c module_param: make bool parameters really bool (core code) 2012-01-13 09:32:18 +10:30
profile.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
ptrace.c Merge branch 'for-linus' of git://selinuxproject.org/~jmorris/linux-security 2012-01-14 18:36:33 -08:00
range.c range: fix bogus misuse of module.h to get printk() 2011-10-31 09:20:11 -04:00
rcu.h rcu: Deconfuse dynticks entry-exit tracing 2011-12-11 10:31:42 -08:00
rcupdate.c rcu: Detect illegal rcu dereference in extended quiescent state 2011-12-11 10:31:30 -08:00
rcutiny.c rcu: Augment rcu_batch_end tracing for idle and callback state 2011-12-11 10:32:22 -08:00
rcutiny_plugin.h rcu: Apply ACCESS_ONCE() to rcu_boost() return value 2011-12-11 10:33:19 -08:00
rcutorture.c rcu: Add missing __cpuinit annotation in rcutorture code 2012-01-16 09:44:05 -08:00
rcutree.c rcu: Augment rcu_batch_end tracing for idle and callback state 2011-12-11 10:32:22 -08:00
rcutree.h rcu: Keep invoking callbacks if CPU otherwise idle 2011-12-11 10:32:09 -08:00
rcutree_plugin.h rcu: Apply ACCESS_ONCE() to rcu_boost() return value 2011-12-11 10:33:19 -08:00
rcutree_trace.c rcu: Track idleness independent of idle tasks 2011-12-11 10:31:24 -08:00
relay.c relay: prevent integer overflow in relay_open() 2012-02-10 09:04:49 +01:00
res_counter.c net: introduce res_counter_charge_nofail() for socket allocations 2012-01-22 15:08:46 -05:00
resource.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
rtmutex-debug.c lockdep, rtmutex, bug: Show taint flags on error 2011-12-06 08:16:49 +01:00
rtmutex-debug.h
rtmutex-tester.c rtmutex-tester: convert sysdev_class to a regular subsystem 2011-12-14 14:54:22 -08:00
rtmutex.c Revert "rcu: Permit rt_mutex_unlock() with irqs disabled" 2011-12-11 10:33:18 -08:00
rtmutex.h
rtmutex_common.h
rwsem.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
seccomp.c seccomp: audit abnormal end to a process due to seccomp 2012-01-17 16:16:55 -05:00
semaphore.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
signal.c user namespace: make signal.c respect user namespaces 2012-01-10 16:30:54 -08:00
smp.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
softirq.c sched/rt: Document scheduler related skip-resched-check sites 2012-03-01 10:28:04 +01:00
spinlock.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
srcu.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
stacktrace.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
stop_machine.c Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux 2011-11-06 19:44:47 -08:00
sys.c c/r: prctl: add PR_SET_MM codes to set up mm_struct entries 2012-01-12 20:13:13 -08:00
sys_ni.c Cross Memory Attach 2011-10-31 17:30:44 -07:00
sysctl.c x86: Panic on detection of stack overflow 2011-12-05 11:37:47 +01:00
sysctl_binary.c binary_sysctl(): fix memory leak 2011-12-20 10:25:04 -08:00
sysctl_check.c xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
taskstats.c Make TASKSTATS require root access 2011-09-19 17:04:37 -07:00
test_kprobes.c
time.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
timeconst.pl
timer.c Merge branch 'core-debugobjects-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-01-06 07:53:34 -08:00
tracepoint.c tracepoints/module: Fix disabling tracepoints with taint CRAP or OOT 2012-01-16 11:35:57 -05:00
tsacct.c [S390] cputime: add sparse checking and cleanup 2011-12-15 14:56:19 +01:00
uid16.c
up.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
user-return-notifier.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
user.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
user_namespace.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
utsname.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
utsname_sysctl.c Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux 2011-11-06 19:44:47 -08:00
wait.c lockdep/waitqueues: Add better annotation 2011-12-21 10:07:39 +01:00
watchdog.c bugs, x86: Fix printk levels for panic, softlockups and stack dumps 2012-01-26 21:28:45 +01:00
workqueue.c workqueue: make alloc_workqueue() take printf fmt and args for name 2012-01-10 16:30:54 -08:00
workqueue_sched.h