linux/kernel
David Rientjes b246272ecc cpusets: stall when updating mems_allowed for mempolicy or disjoint nodemask
Kernels where MAX_NUMNODES > BITS_PER_LONG may temporarily see an empty
nodemask in a tsk's mempolicy if its previous nodemask is remapped onto a
new set of allowed cpuset nodes where the two nodemasks, as a result of
the remap, are now disjoint.

c0ff7453bb ("cpuset,mm: fix no node to alloc memory when changing
cpuset's mems") adds get_mems_allowed() to prevent the set of allowed
nodes from changing for a thread.  This causes any update to a set of
allowed nodes to stall until put_mems_allowed() is called.

This stall is unncessary, however, if at least one node remains unchanged
in the update to the set of allowed nodes.  This was addressed by
89e8a244b9 ("cpusets: avoid looping when storing to mems_allowed if one
node remains set"), but it's still possible that an empty nodemask may be
read from a mempolicy because the old nodemask may be remapped to the new
nodemask during rebind.  To prevent this, only avoid the stall if there is
no mempolicy for the thread being changed.

This is a temporary solution until all reads from mempolicy nodemasks can
be guaranteed to not be empty without the get_mems_allowed()
synchronization.

Also moves the check for nodemask intersection inside task_lock() so that
tsk->mems_allowed cannot change.  This ensures that nothing can set this
tsk's mems_allowed out from under us and also protects tsk->mempolicy.

Reported-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Paul Menage <paul@paulmenage.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-12-20 10:25:04 -08:00
..
debug Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux 2011-11-06 19:44:47 -08:00
events perf events: Fix ring_buffer_wakeup() brown paperbag bug 2011-12-14 08:44:53 +01:00
gcov gcov: disable CONSTRUCTORS for UML 2011-07-26 16:49:45 -07:00
irq genirq: Fix race condition when stopping the irq thread 2011-12-02 11:54:24 +01:00
power PM / Hibernate: Do not leak memory in error/test code paths 2011-11-23 21:03:38 +01:00
time alarmtimers: Fix time comparison 2011-12-06 11:38:32 +01:00
trace ftrace: Fix hash record accounting bug 2011-12-05 13:28:47 -05:00
.gitignore
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt sched: Isolate preempt counting in its own config option 2011-06-10 15:15:40 +02:00
Makefile Merge branch 'devel-stable' of http://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/linux-2.6-arm 2011-10-28 12:02:27 -07:00
acct.c
async.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
audit.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
audit.h
audit_tree.c audit_tree,rcu: Convert call_rcu(__put_tree) to kfree_rcu() 2011-07-20 14:10:11 -07:00
audit_watch.c
auditfilter.c
auditsc.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
backtracetest.c
bounds.c
capability.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
cgroup.c memcg: replace ss->id_lock with a rwlock 2011-11-02 16:07:03 -07:00
cgroup_freezer.c cgroup_freezer: fix freezing groups with stopped tasks 2011-11-24 11:58:22 -08:00
compat.c kernel: Fix files explicitly needing EXPORT_SYMBOL infrastructure 2011-10-31 19:30:05 -04:00
configs.c kernel/configs.c: include MODULE_*() when CONFIG_IKCONFIG_PROC=n 2011-07-25 20:57:15 -07:00
cpu.c Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux 2011-11-06 19:44:47 -08:00
cpu_pm.c cpu_pm: call notifiers during suspend 2011-09-23 12:05:29 +05:30
cpuset.c cpusets: stall when updating mems_allowed for mempolicy or disjoint nodemask 2011-12-20 10:25:04 -08:00
crash_dump.c Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux 2011-11-06 19:44:47 -08:00
cred.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
delayacct.c KVM: Steal time implementation 2011-07-14 12:59:14 +03:00
dma.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
elfcore.c
exec_domain.c
exit.c oom: remove oom_disable_count 2011-10-31 17:30:45 -07:00
extable.c
fork.c writeback: remove vm_dirties and task->dirties 2011-11-17 20:49:06 +08:00
freezer.c Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux 2011-11-06 19:44:47 -08:00
futex.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
futex_compat.c
groups.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
hrtimer.c Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2011-11-28 08:43:52 -08:00
hung_task.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
irq_work.c kernel: fix two implicit header assumptions in irq_work.c 2011-10-31 09:20:12 -04:00
itimer.c
jump_label.c jump_label: jump_label_inc may return before the code is patched 2011-12-05 13:28:46 -05:00
kallsyms.c
kexec.c [S390] kdump: Add infrastructure for unmapping crashkernel memory 2011-10-30 15:16:42 +01:00
kfifo.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
kmod.c kmod: prevent kmod_loop_msg overflow in __request_module() 2011-10-26 13:10:39 +10:30
kprobes.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
ksysfs.c kernel: ksysfs.c is implicitly using stat.h 2011-10-31 09:20:13 -04:00
kthread.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
latencytop.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
lockdep.c lockdep, kmemcheck: Annotate ->lock in lockdep_init_map() 2011-12-06 18:18:13 +01:00
lockdep_internals.h
lockdep_proc.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
lockdep_states.h
module.c Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux 2011-11-06 19:44:47 -08:00
mutex-debug.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
mutex-debug.h
mutex.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
mutex.h
notifier.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
nsproxy.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
padata.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
panic.c module,bug: Add TAINT_OOT_MODULE flag for modules not built in-tree 2011-11-07 07:54:42 +10:30
params.c kernel: params.c needs module.h not moduleparam.h 2011-10-31 09:20:13 -04:00
pid.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
pid_namespace.c
posix-cpu-timers.c Merge branch 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2011-10-26 16:17:32 +02:00
posix-timers.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
printk.c printk: avoid double lock acquire 2011-12-09 07:50:28 -08:00
profile.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
ptrace.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
range.c range: fix bogus misuse of module.h to get printk() 2011-10-31 09:20:11 -04:00
rcu.h rcu: Add grace-period, quiescent-state, and call_rcu trace events 2011-09-28 21:38:21 -07:00
rcupdate.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
rcutiny.c kernel: fix up module header handling in rcutiny files 2011-10-31 09:20:13 -04:00
rcutiny_plugin.h kernel: fix up module header handling in rcutiny files 2011-10-31 09:20:13 -04:00
rcutorture.c rcu: Make rcu_torture_boost() exit loops at end of test 2011-09-28 21:38:46 -07:00
rcutree.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
rcutree.h rcu: Remove rcu_needs_cpu_flush() to avoid false quiescent states 2011-09-28 21:38:48 -07:00
rcutree_plugin.h rcu: Remove rcu_needs_cpu_flush() to avoid false quiescent states 2011-09-28 21:38:48 -07:00
rcutree_trace.c rcu: Simplify quiescent-state accounting 2011-09-28 21:38:22 -07:00
relay.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
res_counter.c
resource.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
rtmutex-debug.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
rtmutex-debug.h
rtmutex-tester.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
rtmutex.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
rtmutex.h
rtmutex_common.h
rwsem.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
sched.c sched: Set the command name of the idle tasks in SMP kernels 2011-11-14 12:50:43 +01:00
sched_autogroup.c
sched_autogroup.h sched: Skip autogroup when looking for all rt sched groups 2011-07-01 10:39:08 +02:00
sched_clock.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
sched_cpupri.c sched/cpupri: Remove cpupri->pri_active 2011-08-14 12:01:11 +02:00
sched_cpupri.h sched/cpupri: Remove cpupri->pri_active 2011-08-14 12:01:11 +02:00
sched_debug.c
sched_fair.c sched: Fix select_idle_sibling() regression in selecting an idle SMT sibling 2011-12-16 09:44:58 +01:00
sched_features.h sched, rt: Provide means of disabling cross-cpu bandwidth sharing 2011-11-14 12:50:40 +01:00
sched_idletask.c
sched_rt.c sched, rt: Provide means of disabling cross-cpu bandwidth sharing 2011-11-14 12:50:40 +01:00
sched_stats.h locking, sched: Annotate thread_group_cputimer as raw 2011-09-13 11:11:55 +02:00
sched_stoptask.c sched: Implement hierarchical task accounting for SCHED_OTHER 2011-08-14 12:01:13 +02:00
seccomp.c
semaphore.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
signal.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
smp.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
softirq.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
spinlock.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
srcu.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
stacktrace.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
stop_machine.c Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux 2011-11-06 19:44:47 -08:00
sys.c Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux 2011-11-06 19:44:47 -08:00
sys_ni.c Cross Memory Attach 2011-10-31 17:30:44 -07:00
sysctl.c Merge branch 'akpm' (Andrew's incoming) 2011-10-31 17:46:07 -07:00
sysctl_binary.c ipv4: NET_IPV4_ROUTE_GC_INTERVAL removal 2011-10-03 14:13:01 -04:00
sysctl_check.c xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
taskstats.c Make TASKSTATS require root access 2011-09-19 17:04:37 -07:00
test_kprobes.c
time.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
timeconst.pl
timer.c sys_getppid: add missing rcu_dereference 2011-12-09 07:50:29 -08:00
tracepoint.c Tracepoint: Dissociate from module mutex 2011-08-10 20:38:14 -04:00
tsacct.c Make taskstats round statistics down to nearest 1k bytes/events 2011-09-19 17:10:57 -07:00
uid16.c
up.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
user-return-notifier.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
user.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
user_namespace.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
utsname.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
utsname_sysctl.c Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux 2011-11-06 19:44:47 -08:00
wait.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
watchdog.c watchdog: move watchdog_*_all_cpus under CONFIG_SYSCTL 2011-10-31 17:30:53 -07:00
workqueue.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
workqueue_sched.h