linux/mm
Andi Kleen 4fd466eb46 HWPOISON: add memory cgroup filter
The hwpoison test suite needs to inject hwpoison into a collection of
selected task pages, and must not touch pages not owned by those tasks,
since that could kill important system processes such as init. (But it's
OK to mis-hwpoison free/unowned pages as well as shared clean pages.
Mis-hwpoison of shared dirty pages will kill all tasks, so the test
suite will target all or none of such tasks in the first place.)

The memory cgroup serves this purpose well. We can put the target
processes under the control of a memory cgroup, and tell the hwpoison
injection code to only poison pages associated with that active memory
cgroup.
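
The filtering decision described above can be sketched as a small
userspace model. This is not the kernel code: the names model_memcg,
model_page, filter_memcg_ino and model_filter_task are hypothetical
stand-ins, and the convention that a filter value of 0 disables
filtering is an assumption of this sketch. It only illustrates the idea
of comparing the inode number of a page's owning cgroup against the
configured one, and of rejecting a cgroup with no usable inode (compare
the NULL dentry fix noted further down).

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a memory cgroup; not the kernel structure. */
struct model_memcg {
	unsigned long ino;	/* inode number of the cgroup directory;
				 * 0 models a cgroup with no dentry */
};

/* Hypothetical stand-in for a page and its owner. */
struct model_page {
	const struct model_memcg *memcg;	/* NULL: free or unowned page */
};

/* Inode number to filter on. Assumption: 0 means "no filtering". */
unsigned long filter_memcg_ino;

/*
 * Return 0 if the page may be poisoned, -EINVAL if it must be skipped.
 */
int model_filter_task(const struct model_page *p)
{
	if (filter_memcg_ino == 0)
		return 0;		/* filter disabled: accept every page */

	if (p->memcg == NULL)
		return -EINVAL;		/* page is not charged to any cgroup */

	if (p->memcg->ino == 0)
		return -EINVAL;		/* cgroup has no inode to compare */

	if (p->memcg->ino != filter_memcg_ino)
		return -EINVAL;		/* page belongs to a different cgroup */

	return 0;
}
```

In this model, a page owned by the selected cgroup passes, while pages
owned by any other cgroup, or by none, are skipped once a non-zero
filter value is set.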

The prerequisite for doing hwpoison stress tests with mem_cgroup is
that the mem_cgroup code tracks task pages _accurately_ (unless the
page is locked), which we believe is/should be true.

The benefit is a simpler hwpoison injector. As a side effect, the
mem_cgroup code is automatically exercised by the hwpoison test cases.

The alternative interfaces pin-pfn/unpin-pfn can also delegate the
(process and page flags) filtering functions reliably to user space.
However, a prototype implementation showed that this scheme adds more
complexity than we wanted.

Example test case:

	mkdir /cgroup/hwpoison

	usemem -m 100 -s 1000 &
	echo `jobs -p` > /cgroup/hwpoison/tasks

	memcg_ino=$(ls -id /cgroup/hwpoison | cut -f1 -d' ')
	echo $memcg_ino > /debug/hwpoison/corrupt-filter-memcg

	page-types -p `pidof init`   --hwpoison  # shall do nothing
	page-types -p `pidof usemem` --hwpoison  # poison its pages

[AK: Fix documentation]
[Add fix for problem noticed by Li Zefan <lizf@cn.fujitsu.com>;
dentry in the css could be NULL]

CC: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
CC: Hugh Dickins <hugh.dickins@tiscali.co.uk>
CC: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
CC: Balbir Singh <balbir@linux.vnet.ibm.com>
CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
CC: Li Zefan <lizf@cn.fujitsu.com>
CC: Paul Menage <menage@google.com>
CC: Nick Piggin <npiggin@suse.de>
CC: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-16 12:19:59 +01:00
Kconfig HWPOISON: add page flags filter 2009-12-16 12:19:59 +01:00
Kconfig.debug trivial: improve help text for mm debug config options 2009-09-21 15:14:57 +02:00
Makefile percpu: kill legacy percpu allocator 2009-10-02 13:29:29 +09:00
backing-dev.c flusher: Fix PF_FROZEN race 2009-12-03 13:49:43 +01:00
bootmem.c mm/bootmem.c: properly __init-annotate helper functions 2009-12-15 08:53:20 -08:00
bounce.c block: remove some includings of blktrace_api.h 2009-06-16 11:19:36 +02:00
debug-pagealloc.c generic debug pagealloc 2009-04-01 08:59:13 -07:00
dmapool.c dmapools: protect page_list walk in show_pools() 2009-06-30 18:56:00 -07:00
fadvise.c readahead: move max_sane_readahead() calls into force_page_cache_readahead() 2009-06-16 19:47:28 -07:00
failslab.c kmemtrace, mm: fix slab.h dependency problem in mm/failslab.c 2009-04-03 12:23:01 +02:00
filemap.c kill wait_on_page_writeback_range 2009-12-10 15:02:50 +01:00
filemap_xip.c const: mark struct vm_struct_operations 2009-09-27 11:39:25 -07:00
fremap.c Do not account for the address space used by hugetlbfs using VM_ACCOUNT 2009-02-10 10:48:42 -08:00
highmem.c highmem: Fix debug_kmap_atomic() to also handle KM_IRQ_PTE, KM_NMI, and KM_NMI_PTE 2009-11-10 04:15:47 +01:00
hugetlb.c hugetlb: abort a hugepage pool resize if a signal is pending 2009-12-15 08:53:24 -08:00
hwpoison-inject.c HWPOISON: add memory cgroup filter 2009-12-16 12:19:59 +01:00
init-mm.c mm: consolidate init_mm definition 2009-06-16 19:47:28 -07:00
internal.h HWPOISON: add memory cgroup filter 2009-12-16 12:19:59 +01:00
kmemcheck.c kmemcheck: add hooks for the page allocator 2009-06-15 15:48:33 +02:00
kmemleak-test.c percpu: clean up percpu variable definitions 2009-06-24 15:13:48 +09:00
kmemleak.c tree-wide: fix typos "aquire" -> "acquire", "cumsumed" -> "consumed" 2009-11-09 09:40:57 +01:00
ksm.c ksm: remove unswappable max_kernel_pages 2009-12-15 08:53:20 -08:00
maccess.c [S390] maccess: add weak attribute to probe_kernel_write 2009-06-12 10:27:37 +02:00
madvise.c HWPOISON: Turn ref argument into flags argument 2009-12-16 12:19:57 +01:00
memcontrol.c memcg: add accessor to mem_cgroup.css 2009-12-16 12:19:59 +01:00
memory-failure.c HWPOISON: add memory cgroup filter 2009-12-16 12:19:59 +01:00
memory.c HWPOISON: comment dirty swapcache pages 2009-12-16 12:19:58 +01:00
memory_hotplug.c mm: fix section mismatch in memory_hotplug.c 2009-12-15 08:53:20 -08:00
mempolicy.c ksm: memory hotremove migration only 2009-12-15 08:53:20 -08:00
mempool.c mm: remove broken 'kzalloc' mempool 2009-09-22 07:17:35 -07:00
migrate.c mm: remove unevictable_migrate_page function 2009-12-15 08:53:23 -08:00
mincore.c mm: hugetlb: fix hugepage memory leak in mincore() 2009-12-15 08:53:24 -08:00
mlock.c mlock: replace stale comments in munlock_vma_page() 2009-12-15 08:53:23 -08:00
mm_init.c mm: mminit_loglevel cannot be __meminitdata anymore 2008-08-20 15:40:30 -07:00
mmap.c mm: uncached vma support with writenotify 2009-12-15 08:53:21 -08:00
mmu_context.c mm: reduce atomic use on use_mm fast path 2009-09-22 07:17:42 -07:00
mmu_notifier.c ksm: add mmu_notifier set_pte_at_notify() 2009-09-22 07:17:31 -07:00
mmzone.c [ARM] Double check memmap is actually valid with a memmap has unexpected holes V2 2009-05-18 11:22:24 +01:00
mprotect.c perf: Do the big rename: Performance Counters -> Performance Events 2009-09-21 14:28:04 +02:00
mremap.c Take arch_mmap_check() into get_unmapped_area() 2009-12-11 06:44:58 -05:00
msync.c [CVE-2009-0029] System call wrappers part 13 2009-01-14 14:15:23 +01:00
nommu.c nommu: fix malloc performance by adding uninitialized flag 2009-12-15 08:53:24 -08:00
oom_kill.c oom: dump stack and VM state when oom killer panics 2009-12-15 08:53:10 -08:00
page-writeback.c writeback: remove unused nonblocking and congestion checks 2009-12-03 13:54:25 +01:00
page_alloc.c HWPOISON: detect free buddy pages explicitly 2009-12-16 12:19:58 +01:00
page_cgroup.c memory hotplug: alloc page from other node in memory online 2009-09-22 07:17:26 -07:00
page_io.c swap: rework map_swap_page() again 2009-12-15 08:53:16 -08:00
page_isolation.c memory hotplug: fix page_zone() calculation in test_pages_isolated() 2008-11-06 15:41:19 -08:00
pagewalk.c mm hugetlb: add hugepage support to pagemap 2009-12-15 08:53:24 -08:00
percpu.c Merge branch 'for-linus' into for-next 2009-12-08 10:02:12 +09:00
prio_tree.c spelling fixes: mm/ 2007-10-20 01:27:18 +02:00
quicklist.c cpumask: use new-style cpumask ops in mm/quicklist. 2009-09-24 09:34:52 +09:30
readahead.c readahead: introduce context readahead algorithm 2009-06-16 19:47:30 -07:00
rmap.c mm: simplify try_to_unmap_one() 2009-12-15 08:53:20 -08:00
shmem.c swap_info: note SWAP_MAP_SHMEM 2009-12-15 08:53:16 -08:00
shmem_acl.c shmfs: use 'check_acl' instead of 'permission' 2009-09-08 11:08:46 -07:00
slab.c Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-12-14 10:13:22 -08:00
slob.c slab: remove duplicate kmem_cache_init_late() declarations 2009-08-06 11:36:25 +03:00
slub.c Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-12-14 10:13:22 -08:00
sparse-vmemmap.c memory hotplug: alloc page from other node in memory online 2009-09-22 07:17:26 -07:00
sparse.c memory hotplug: alloc page from other node in memory online 2009-09-22 07:17:26 -07:00
swap.c mm: replace various uses of num_physpages by totalram_pages 2009-09-22 07:17:38 -07:00
swap_state.c mm: add_to_swap_cache() does not return -EEXIST 2009-09-22 07:17:35 -07:00
swapfile.c ksm: let shared pages be swappable 2009-12-15 08:53:19 -08:00
thrash.c mm: pass mm to grab_swap_token 2009-06-23 12:50:05 -07:00
truncate.c mm: fix comments for invalidate_inode_pages2() 2009-12-04 15:39:48 +01:00
util.c fix a struct file leak in do_mmap_pgoff() 2009-12-11 06:44:57 -05:00
vmalloc.c vmalloc(): adjust gfp mask passed on nested vmalloc() invocation 2009-12-15 08:53:13 -08:00
vmscan.c vmscan: simplify code 2009-12-15 08:53:21 -08:00
vmstat.c vmscan: stop kswapd waiting on congestion when the min watermark is not being met 2009-12-15 08:53:16 -08:00