linux/mm
Linus Torvalds 27421e211a Manually revert "mlock: downgrade mmap sem while populating mlocked regions"
This essentially reverts commit 8edb08caf6.

It downgraded our mmap semaphore to a read-lock while mlocking pages, in
order to allow other threads (and external accesses like "ps" et al) to
walk the vma lists and take page faults etc.  Which is a nice idea, but
the implementation does not work.

Because we cannot upgrade the lock back to a write lock without
releasing the mmap semaphore, the code had to release the lock entirely
and then re-take it as a writelock.  However, that meant that the caller
possibly lost the vma chain that it was following, since now another
thread could come in and mmap/munmap the range.

The code tried to work around that by just looking up the vma again and
erroring out if that happened, but quite frankly, that was just a buggy
hack that doesn't actually protect against anything (the other thread
could just have replaced the vma with another one instead of totally
unmapping it).

The only way to downgrade to a read map _reliably_ is to do it at the
end, which is likely the right thing to do: do all the 'vma' operations
with the write-lock held, then downgrade to a read after completing them
all, and then do the "populate the newly mlocked regions" while holding
just the read lock.  And then just drop the read-lock and return to user
space.

The (perhaps somewhat simpler) alternative is to just make all the
callers of mlock_vma_pages_range() know that the mmap lock got dropped,
and just re-grab the mmap semaphore if it needs to mlock more than one
vma region.

So we can do this "downgrade mmap sem while populating mlocked regions"
thing right, but the way it was done here was absolutely not correct.
Thus the revert, in the expectation that we will do it all correctly
some day.

Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-01 11:00:16 -08:00
..
allocpercpu.c
backing-dev.c Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-01-06 17:10:04 -08:00
bootmem.c bootmem: print request details before BUG_ON(them) 2009-01-06 15:59:10 -08:00
bounce.c
dmapool.c
fadvise.c [CVE-2009-0029] System call wrapper special cases 2009-01-14 14:15:18 +01:00
failslab.c
filemap_xip.c badpage: remove vma from page_remove_rmap 2009-01-06 15:59:07 -08:00
filemap.c [CVE-2009-0029] System call wrapper special cases 2009-01-14 14:15:18 +01:00
fremap.c [CVE-2009-0029] System call wrappers part 13 2009-01-14 14:15:23 +01:00
highmem.c
hugetlb.c mm: hugetlb: remove redundant `if' operation 2009-01-06 15:59:10 -08:00
internal.h mm: make get_user_pages() interruptible 2009-01-06 15:59:08 -08:00
Kconfig Remove obsolete CONFIG_RESOURCES_64BIT 2009-01-06 15:59:14 -08:00
maccess.c
madvise.c [CVE-2009-0029] System call wrappers part 14 2009-01-14 14:15:24 +01:00
Makefile shmem: unify regular and tiny shmem 2009-01-06 15:59:08 -08:00
memcontrol.c memcg: NULL pointer dereference at rmdir on some NUMA systems 2009-01-29 18:04:44 -08:00
memory_hotplug.c mm: remove GFP_HIGHUSER_PAGECACHE 2009-01-06 15:59:01 -08:00
memory.c x86 PAT: change track_pfn_vma_new to take pgprot_t pointer param 2009-01-13 19:13:01 +01:00
mempolicy.c [CVE-2009-0029] System call wrappers part 28 2009-01-14 14:15:30 +01:00
mempool.c
migrate.c [CVE-2009-0029] System call wrappers part 28 2009-01-14 14:15:30 +01:00
mincore.c [CVE-2009-0029] System call wrappers part 14 2009-01-14 14:15:24 +01:00
mlock.c Manually revert "mlock: downgrade mmap sem while populating mlocked regions" 2009-02-01 11:00:16 -08:00
mm_init.c
mmap.c Stop playing silly games with the VM_ACCOUNT flag 2009-01-31 15:08:56 -08:00
mmu_notifier.c
mmzone.c
mprotect.c [CVE-2009-0029] System call wrappers part 13 2009-01-14 14:15:23 +01:00
mremap.c [CVE-2009-0029] System call wrappers part 13 2009-01-14 14:15:23 +01:00
msync.c [CVE-2009-0029] System call wrappers part 13 2009-01-14 14:15:23 +01:00
nommu.c uclinux: add process name to allocation error message 2009-01-27 16:42:03 +10:00
oom_kill.c memcg: avoid deadlock caused by race between oom and cpuset_attach 2009-01-08 08:31:09 -08:00
page_alloc.c mm: introduce zone_reclaim struct 2009-01-08 08:31:07 -08:00
page_cgroup.c memcg: add mem_cgroup_disabled() 2009-01-08 08:31:05 -08:00
page_io.c mm: try_to_free_swap replaces remove_exclusive_swap_page 2009-01-06 15:59:03 -08:00
page_isolation.c
page-writeback.c mm: add dirty_background_bytes and dirty_bytes sysctls 2009-01-06 15:59:03 -08:00
pagewalk.c
pdflush.c
prio_tree.c
quicklist.c
readahead.c
rmap.c badpage: remove vma from page_remove_rmap 2009-01-06 15:59:07 -08:00
shmem_acl.c
shmem.c Stop playing silly games with the VM_ACCOUNT flag 2009-01-31 15:08:56 -08:00
slab.c
slob.c
slub.c
sparse-vmemmap.c
sparse.c
swap_state.c memcg: mem+swap controller core 2009-01-08 08:31:05 -08:00
swap.c memcg: add zone_reclaim_stat 2009-01-08 08:31:08 -08:00
swapfile.c memcg: fix refcnt handling at swapoff 2009-01-29 18:04:43 -08:00
thrash.c
truncate.c
util.c
vmalloc.c revert "mm: vmalloc use mutex for purge" 2009-01-15 16:39:40 -08:00
vmscan.c memcg: fix calculation of active_ratio 2009-01-08 08:31:09 -08:00
vmstat.c