linux/mm
Ken Chen ace4bd29c2 fix leaky resv_huge_pages when cpuset is in use
The internal hugetlb resv_huge_pages counter can permanently leak a nonzero
value in the error path of the hugetlb page fault handler when hugetlb pages
are used in combination with cpusets.  The leaked count can permanently trap
that many hugetlb pages in an unusable "reserved" state.

Steps to reproduce the bug:

  (1) create two cpusets, user1 and user2
  (2) reserve 50 htlb pages in cpuset user1
  (3) attempt to shmget/shmat 50 htlb pages inside cpuset user2
  (4) the kernel OOM-kills the user process from step 3
  (5) ipcrm the shm segment

At this point resv_huge_pages will have a count of 49, even though
there are no active hugetlbfs files or hugetlb shared memory segments
in the system.  The leak is permanent and there is no recovery method
other than a system reboot.  The leaked count will hold up all future use
of that many htlb pages in all cpusets.

The culprit is that the error path of alloc_huge_page() did not
properly undo the change it made to resv_huge_pages, leaving the
counter in an inconsistent state.
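
The class of bug described above can be illustrated with a small userspace
model.  This is only a sketch, not the kernel code: struct pool,
alloc_model(), resv_after_failed_alloc(), and the cpuset_allows and
undo_on_error flags are hypothetical names made up for the illustration,
and the model makes no attempt to reproduce the full hugetlb reservation
accounting.  It only shows that a counter adjusted before an allocation
attempt must be restored when that attempt fails.

```c
#include <stdbool.h>

/*
 * Toy userspace model (assumption: NOT the real mm/hugetlb.c logic).
 * Only the two counter names are taken from the commit message.
 */
struct pool {
	long free_huge_pages;
	long resv_huge_pages;
};

/*
 * Hypothetical allocator: consumes one reservation up front, then
 * tries to take a page.  cpuset_allows stands in for "a page is
 * available on a node this task may use".
 */
static bool alloc_model(struct pool *p, bool cpuset_allows, bool undo_on_error)
{
	p->resv_huge_pages--;		/* consume a reservation first */

	if (!cpuset_allows || p->free_huge_pages == 0) {
		if (undo_on_error)
			p->resv_huge_pages++;	/* error path: give it back */
		return false;
	}

	p->free_huge_pages--;
	return true;
}

/* Run one failing allocation and report the resulting reserve count. */
static long resv_after_failed_alloc(bool undo_on_error)
{
	struct pool p = { .free_huge_pages = 50, .resv_huge_pages = 50 };

	alloc_model(&p, false, undo_on_error);
	return p.resv_huge_pages;
}
```

In this model, a failed allocation without the undo step leaves the
reserve count changed even though no page was handed out, while the
variant with the undo step leaves it at its starting value.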

Signed-off-by: Ken Chen <kenchen@google.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Adam Litke <agl@us.ibm.com>
Cc: Martin Bligh <mbligh@google.com>
Acked-by: David Gibson <dwg@au1.ibm.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-09 12:30:48 -07:00
Kconfig Quicklists for page table pages 2007-05-07 12:12:54 -07:00
Makefile Quicklists for page table pages 2007-05-07 12:12:54 -07:00
allocpercpu.c [PATCH] Allow NULL pointers in percpu_free 2006-12-07 08:39:22 -08:00
backing-dev.c [PATCH] nfs: fix congestion control 2007-03-16 19:25:05 -07:00
bootmem.c [PATCH] remove EXPORT_UNUSED_SYMBOL'ed symbols 2006-12-07 08:39:44 -08:00
bounce.c block: blk_max_pfn is somtimes wrong 2007-03-27 08:52:47 +02:00
fadvise.c [PATCH] mm: change uses of f_{dentry,vfsmnt} to use f_path 2006-12-08 08:28:43 -08:00
filemap.c Remove do_sync_file_range() 2007-05-08 11:15:04 -07:00
filemap.h Remove all inclusions of <linux/config.h> 2006-10-04 03:38:54 -04:00
filemap_xip.c [PATCH] mm: fix xip issue with /dev/zero 2007-03-29 08:22:26 -07:00
fremap.c [PATCH] mm: more rmap debugging 2006-12-22 08:55:49 -08:00
highmem.c [PATCH] i386: PARAVIRT: add kmap_atomic_pte for mapping highpte pages 2007-05-02 19:27:15 +02:00
hugetlb.c fix leaky resv_huge_pages when cpuset is in use 2007-05-09 12:30:48 -07:00
internal.h Make page->private usable in compound pages 2007-05-07 12:12:53 -07:00
madvise.c mm: madvise avoid exclusive mmap_sem 2007-05-07 12:12:54 -07:00
memory.c Add unitialized_var() macro for suppressing gcc warnings 2007-05-07 12:12:52 -07:00
memory_hotplug.c [PATCH] Fix sparsemem on Cell 2007-01-11 18:18:20 -08:00
mempolicy.c [PATCH] Page migration: Fix vma flag checking 2007-03-05 07:57:51 -08:00
mempool.c [PATCH] Numerous fixes to kernel-doc info in source files. 2007-02-11 10:51:32 -08:00
migrate.c page migration: fix NR_FILE_PAGES accounting 2007-04-24 08:23:08 -07:00
mincore.c [PATCH] mincore: vma crossing fix 2007-02-15 09:57:03 -08:00
mlock.c [PATCH] mlock cleanup 2006-12-07 08:39:22 -08:00
mmap.c Remove unused variable in get_unmapped_area 2007-05-08 11:35:28 -07:00
mmzone.c [PATCH] remove EXPORT_UNUSED_SYMBOL'ed symbols 2006-12-07 08:39:44 -08:00
mprotect.c [PATCH] paravirt: lazy mmu mode hooks.patch 2006-10-01 00:39:33 -07:00
mremap.c [PATCH] mm: mremap correct rmap accounting 2007-01-30 08:33:32 -08:00
msync.c [PATCH] mm: msync() cleanup 2006-09-26 08:48:45 -07:00
nommu.c move die notifier handling to common code 2007-05-08 11:15:04 -07:00
oom_kill.c oom: fix constraint deadlock 2007-05-07 12:12:55 -07:00
page-writeback.c Factor outstanding I/O error handling 2007-05-08 11:14:57 -07:00
page_alloc.c Add white list into modpost.c for memory hotplug code and ia64's machvec section 2007-05-08 11:14:57 -07:00
page_io.c [PATCH] swsusp: use block device offsets to identify swap locations 2006-12-07 08:39:27 -08:00
pdflush.c [PATCH] Add include/linux/freezer.h and move definitions from sched.h 2006-12-07 08:39:27 -08:00
prio_tree.c Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
quicklist.c Quicklists for page table pages 2007-05-07 12:12:54 -07:00
readahead.c readahead: code cleanup 2007-05-07 12:12:52 -07:00
rmap.c fbdev: mm: Deferred IO support 2007-05-08 11:15:26 -07:00
shmem.c slab allocators: Remove SLAB_DEBUG_INITIAL flag 2007-05-07 12:12:57 -07:00
shmem_acl.c [PATCH] Fix typos in mm/shmem_acl.c 2006-10-11 11:14:23 -07:00
slab.c krealloc: fix kerneldoc comments 2007-05-09 12:30:46 -07:00
slob.c slob: fix page order calculation on not 4KB page 2007-05-07 12:12:57 -07:00
slub.c krealloc: fix kerneldoc comments 2007-05-09 12:30:46 -07:00
sparse.c Add white list into modpost.c for memory hotplug code and ia64's machvec section 2007-05-08 11:14:57 -07:00
swap.c Make page->private usable in compound pages 2007-05-07 12:12:53 -07:00
swap_state.c [PATCH] lockdep: locking init debugging improvement 2006-07-03 15:27:02 -07:00
swapfile.c mm: make read_cache_page synchronous 2007-05-07 12:12:51 -07:00
thrash.c [PATCH] make mm/thrash.c:global_faults static 2006-12-07 08:39:22 -08:00
tiny-shmem.c [PATCH] mm/{,tiny-}shmem.c cleanups 2007-03-01 14:53:35 -08:00
truncate.c [PATCH] VM: invalidate_inode_pages2_range() should not exit early 2007-03-01 14:53:39 -08:00
util.c [PATCH] slab: clean up leak tracking ifdefs a little bit 2006-10-04 07:55:13 -07:00
vmalloc.c move die notifier handling to common code 2007-05-08 11:15:04 -07:00
vmscan.c Factor outstanding I/O error handling 2007-05-08 11:14:57 -07:00
vmstat.c [PATCH] optional ZONE_DMA: optional ZONE_DMA in the VM 2007-02-11 10:51:18 -08:00