linux/mm
Balbir Singh 31a78f23ba mm owner: fix race between swapoff and exit
There's a race between mm->owner assignment and swapoff, more easily
seen when task slab poisoning is turned on.  The condition occurs when
try_to_unuse() runs in parallel with an exiting task.  A similar race
can occur with callers of get_task_mm(), such as /proc/<pid>/<mmstats>
or ptrace or page migration.

CPU0                                    CPU1
                                        try_to_unuse
                                        looks at mm = task0->mm
                                        increments mm->mm_users
task 0 exits
mm->owner needs to be updated, but no
new owner is found (mm_users > 1, but
no other task has task->mm = task0->mm)
mm_update_next_owner() leaves
                                        mmput(mm) decrements mm->mm_users
task0 freed
                                        dereferencing mm->owner fails

The fix is to notify the subsystem via mm_owner_changed callback(),
if no new owner is found, by specifying the new task as NULL.

Jiri Slaby:
mm->owner was set to NULL prior to calling cgroup_mm_owner_callbacks(), but
must be set after that, so as not to pass NULL as old owner causing oops.

Daisuke Nishimura:
mm_update_next_owner() may set mm->owner to NULL, but mem_cgroup_from_task()
and its callers need to take account of this situation to avoid oops.

Hugh Dickins:
Lockdep warning and hang below exec_mmap() when testing these patches.
exit_mm() up_reads mmap_sem before calling mm_update_next_owner(),
so exec_mmap() now needs to do the same.  And with that repositioning,
there's now no point in mm_need_new_owner() allowing for NULL mm.

Reported-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-09-29 08:41:47 -07:00
..
allocpercpu.c
backing-dev.c
bootmem.c bootmem: fix aligning of node-relative indexes and offsets 2008-08-20 15:40:31 -07:00
bounce.c
dmapool.c
fadvise.c
filemap_xip.c mm: xip/ext2 fix block allocation race 2008-08-20 15:40:32 -07:00
filemap.c VFS: fix dio write returning EIO when try_to_release_page fails 2008-09-02 19:21:37 -07:00
fremap.c mmu-notifiers: core 2008-07-28 16:30:21 -07:00
highmem.c
hugetlb.c allocate structures for reservation tracking in hugetlbfs outside of spinlocks v2 2008-08-12 16:07:28 -07:00
internal.h
Kconfig mm: Make generic weak get_user_pages_fast and EXPORT_GPL it 2008-08-12 17:52:53 +10:00
maccess.c
madvise.c madvise: update function comment of madvise_dontneed 2008-07-30 09:41:45 -07:00
Makefile mmu-notifiers: core 2008-07-28 16:30:21 -07:00
memcontrol.c mm owner: fix race between swapoff and exit 2008-09-29 08:41:47 -07:00
memory_hotplug.c
memory.c mm: rename page trylock 2008-08-04 21:31:34 -07:00
mempolicy.c do_migrate_pages(): remove unused variable 2008-08-12 16:07:29 -07:00
mempool.c
migrate.c mm: rename page trylock 2008-08-04 21:31:34 -07:00
mincore.c
mlock.c mlock() fix return values 2008-08-04 16:58:45 -07:00
mm_init.c mm: mminit_loglevel cannot be __meminitdata anymore 2008-08-20 15:40:30 -07:00
mmap.c mmap: fix petty bug in anonymous shared mmap offset handling 2008-09-03 19:58:53 -07:00
mmu_notifier.c mmu-notifiers: core 2008-07-28 16:30:21 -07:00
mmzone.c mm: mark the correct zone as full when scanning zonelists 2008-09-13 14:41:52 -07:00
mprotect.c mmu-notifiers: core 2008-07-28 16:30:21 -07:00
mremap.c mmu-notifiers: core 2008-07-28 16:30:21 -07:00
msync.c
nommu.c nommu: Provide vmalloc_exec(). 2008-08-04 16:01:47 +09:00
oom_kill.c security: Fix setting of PF_SUPERPRIV by __capable() 2008-08-14 22:59:43 +10:00
page_alloc.c mm/bootmem: silence section mismatch warning - contig_page_data/bootmem_node_data 2008-09-02 19:21:37 -07:00
page_io.c
page_isolation.c Remove '#include <stddef.h>' from mm/page_isolation.c 2008-09-02 09:29:01 +01:00
page-writeback.c
pagewalk.c
pdflush.c
prio_tree.c
quicklist.c mm: size of quicklists shouldn't be proportional to the number of CPUs 2008-09-02 19:21:38 -07:00
readahead.c
rmap.c mm: dirty page tracking race fix 2008-08-20 15:40:32 -07:00
shmem_acl.c [PATCH] sanitize ->permission() prototype 2008-07-26 20:53:14 -04:00
shmem.c mm: rename page trylock 2008-08-04 21:31:34 -07:00
slab.c mm: unexport ksize 2008-07-29 23:44:26 +03:00
slob.c mm: unexport ksize 2008-07-29 23:44:26 +03:00
slub.c slub: fixed uninitialized counter in struct kmem_cache_node 2008-09-15 09:49:05 +03:00
sparse-vmemmap.c
sparse.c mm/sparse.c: removed duplicated include 2008-08-12 16:07:30 -07:00
swap_state.c mm: show free swap as signed 2008-08-20 15:40:30 -07:00
swap.c mm: rename page trylock 2008-08-04 21:31:34 -07:00
swapfile.c mm: rename page trylock 2008-08-04 21:31:34 -07:00
thrash.c
tiny-shmem.c mm: tiny-shmem fix lock ordering: mmap_sem vs i_mutex 2008-09-23 08:09:14 -07:00
truncate.c VFS: fix dio write returning EIO when try_to_release_page fails 2008-09-02 19:21:37 -07:00
util.c mm: Make generic weak get_user_pages_fast and EXPORT_GPL it 2008-08-12 17:52:53 +10:00
vmalloc.c
vmscan.c mm: rename page trylock 2008-08-04 21:31:34 -07:00
vmstat.c [ARM] Skip memory holes in FLATMEM when reading /proc/pagetypeinfo 2008-08-27 20:09:28 +01:00