linux/mm
Cyrill Gorcunov 34228d473e mm: ignore VM_SOFTDIRTY on VMA merging
The VM_SOFTDIRTY bit affects vma merge routine: if two VMAs has all bits
in vm_flags matched except dirty bit the kernel can't longer merge them
and this forces the kernel to generate new VMAs instead.

It finally may lead to the situation when userspace application reaches
vm.max_map_count limit and get crashed in worse case

 | (gimp:11768): GLib-ERROR **: gmem.c:110: failed to allocate 4096 bytes
 |
 | (file-tiff-load:12038): LibGimpBase-WARNING **: file-tiff-load: gimp_wire_read(): error
 | xinit: connection to X server lost
 |
 | waiting for X server to shut down
 | /usr/lib64/gimp/2.0/plug-ins/file-tiff-load terminated: Hangup
 | /usr/lib64/gimp/2.0/plug-ins/script-fu terminated: Hangup
 | /usr/lib64/gimp/2.0/plug-ins/script-fu terminated: Hangup

  https://bugzilla.kernel.org/show_bug.cgi?id=67651
  https://bugzilla.gnome.org/show_bug.cgi?id=719619#c0

Initial problem came from missed VM_SOFTDIRTY in do_brk() routine but
even if we would set up VM_SOFTDIRTY here, there is still a way to
prevent VMAs from merging: one can call

 | echo 4 > /proc/$PID/clear_refs

and clear all VM_SOFTDIRTY over all VMAs presented in memory map, then
new do_brk() will try to extend old VMA and finds that dirty bit doesn't
match thus new VMA will be generated.

As discussed with Pavel, the right approach should be to ignore
VM_SOFTDIRTY bit when we're trying to merge VMAs and if merge successed
we mark extended VMA with dirty bit where needed.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Reported-by: Bastian Hougaard <gnome@rvzt.net>
Reported-by: Mel Gorman <mgorman@suse.de>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-23 16:36:53 -08:00
..
Kconfig mm: add missing dependency in Kconfig 2013-12-18 19:04:52 -08:00
Kconfig.debug mm: more intensive memory corruption debugging 2012-01-10 16:30:42 -08:00
Makefile list: add a new LRU list type 2013-09-10 18:56:30 -04:00
backing-dev.c mm/backing-dev.c: check user buffer length before copying data to the related user buffer 2013-09-11 15:58:03 -07:00
balloon_compaction.c mm: print more details for bad_page() 2014-01-23 16:36:50 -08:00
bootmem.c mm/bootmem.c: remove unused local `map' 2013-11-13 12:09:09 +09:00
bounce.c mm/bounce.c: fix a regression where MS_SNAP_STABLE (stable pages snapshotting) was ignored 2013-09-30 14:31:02 -07:00
cleancache.c mm: dump page when hitting a VM_BUG_ON using VM_BUG_ON_PAGE 2014-01-23 16:36:50 -08:00
compaction.c mm: improve documentation of page_order 2014-01-23 16:36:53 -08:00
debug-pagealloc.c mm, x86: Remove debug_pagealloc_enabled 2011-12-06 09:24:07 +01:00
dmapool.c dmapool: make DMAPOOL_DEBUG detect corruption of free marker 2012-12-11 17:22:24 -08:00
fadvise.c teach SYSCALL_DEFINE<n> how to deal with long long/unsigned long long 2013-03-03 22:46:22 -05:00
failslab.c switch debugfs to umode_t 2012-01-03 22:54:56 -05:00
filemap.c mm: dump page when hitting a VM_BUG_ON using VM_BUG_ON_PAGE 2014-01-23 16:36:50 -08:00
filemap_xip.c seqcount: Add lockdep functionality to seqcount/seqlock structures 2013-11-06 12:40:26 +01:00
fremap.c mm: fix use-after-free in sys_remap_file_pages 2014-01-02 14:40:30 -08:00
frontswap.c frontswap: fix incorrect zeroing and allocation size for frontswap_map 2013-06-12 16:29:46 -07:00
highmem.c Some nice cleanups, and even a patch my wife did as a "live" demo for 2012-12-20 08:37:05 -08:00
huge_memory.c mm: audit/fix non-modular users of module_init in core code 2014-01-23 16:36:52 -08:00
hugetlb.c mm: dump page when hitting a VM_BUG_ON using VM_BUG_ON_PAGE 2014-01-23 16:36:50 -08:00
hugetlb_cgroup.c mm: dump page when hitting a VM_BUG_ON using VM_BUG_ON_PAGE 2014-01-23 16:36:50 -08:00
hwpoison-inject.c mm/hwpoison: add '#' to hwpoison_inject 2014-01-21 16:19:48 -08:00
init-mm.c atomic: use <linux/atomic.h> 2011-07-26 16:49:47 -07:00
internal.h mm: improve documentation of page_order 2014-01-23 16:36:53 -08:00
interval_tree.c mm: add CONFIG_DEBUG_VM_RB build option 2012-10-09 16:22:42 +09:00
kmemcheck.c kmemcheck: Fix build errors due to missing slab.h 2010-03-30 22:02:32 +09:00
kmemleak-test.c kmemleak: remove memset by using kzalloc 2011-01-27 18:31:51 +00:00
kmemleak.c mm: kmemleak: avoid false negatives on vmalloc'ed objects 2013-11-13 12:09:07 +09:00
ksm.c mm: audit/fix non-modular users of module_init in core code 2014-01-23 16:36:52 -08:00
list_lru.c mm: list_lru: fix almost infinite loop causing effective livelock 2013-10-30 12:57:46 -07:00
maccess.c mm: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
madvise.c mm/hwpoison: fix traversal of hugetlbfs pages to avoid printk flood 2013-09-30 14:31:02 -07:00
memblock.c mm/nobootmem: free_all_bootmem again 2014-01-23 16:36:52 -08:00
memcontrol.c memcg: remove unused code from kmem_cache_destroy_work_func 2014-01-23 16:36:53 -08:00
memory-failure.c mm/memory-failure.c: shift page lock from head page to tail page after thp split 2014-01-23 16:36:52 -08:00
memory.c mm: dump page when hitting a VM_BUG_ON using VM_BUG_ON_PAGE 2014-01-23 16:36:50 -08:00
memory_hotplug.c mm/memory_hotplug.c: move register_memory_resource out of the lock_memory_hotplug 2014-01-23 16:36:52 -08:00
mempolicy.c mm: new_vma_page() cannot see NULL vma for hugetlb pages 2014-01-23 16:36:52 -08:00
mempool.c mm/mempool.c: convert kmalloc_node(...GFP_ZERO...) to kzalloc_node(...) 2013-09-11 15:58:14 -07:00
migrate.c sched/numa: fix setting of cpupid on page migration twice 2014-01-23 16:36:52 -08:00
mincore.c mm: do_mincore() cleanup 2014-01-23 16:36:52 -08:00
mlock.c mm: dump page when hitting a VM_BUG_ON using VM_BUG_ON_PAGE 2014-01-23 16:36:50 -08:00
mm_init.c mm/mm_init.c: make creation of the mm_kobj happen earlier than device_initcall 2014-01-23 16:36:52 -08:00
mmap.c mm: ignore VM_SOFTDIRTY on VMA merging 2014-01-23 16:36:53 -08:00
mmu_context.c mm: remove old aio use_mm() comment 2013-05-07 18:38:27 -07:00
mmu_notifier.c mm: audit/fix non-modular users of module_init in core code 2014-01-23 16:36:52 -08:00
mmzone.c mm: numa: Change page last {nid,pid} into {cpu,pid} 2013-10-09 14:47:45 +02:00
mprotect.c mm: numa: do not automatically migrate KSM pages 2014-01-21 16:19:48 -08:00
mremap.c mm: revert mremap pud_free anti-fix 2013-10-16 21:35:53 -07:00
msync.c sanitize vfs_fsync calling conventions 2010-05-21 18:31:21 -04:00
nobootmem.c mm/nobootmem: free_all_bootmem again 2014-01-23 16:36:52 -08:00
nommu.c mm: add overcommit_kbytes sysctl variable 2014-01-21 16:19:44 -08:00
oom_kill.c mm, oom: prefer thread group leaders for display purposes 2014-01-23 16:36:53 -08:00
page-writeback.c writeback: fix negative bdi max pause 2013-10-16 21:35:53 -07:00
page_alloc.c mm: show message when updating min_free_kbytes in thp 2014-01-23 16:36:52 -08:00
page_cgroup.c Merge branch 'akpm' (incoming from Andrew) 2014-01-21 19:05:45 -08:00
page_io.c mm: dump page when hitting a VM_BUG_ON using VM_BUG_ON_PAGE 2014-01-23 16:36:50 -08:00
page_isolation.c mm: memory-hotplug: enable memory hotplug to handle hugepage 2013-09-11 15:57:48 -07:00
pagewalk.c mm/pagewalk.c: fix walk_page_range() access of wrong PTEs 2013-10-30 14:27:03 -07:00
percpu-km.c percpu: clear memory allocated with the km allocator 2010-10-02 10:28:42 +03:00
percpu-vm.c mm: fix kernel-doc warnings 2012-06-20 14:39:36 -07:00
percpu.c Merge branch 'akpm' (incoming from Andrew) 2014-01-21 19:05:45 -08:00
pgtable-generic.c mm: fix TLB flush race between migration, and change_protection_range 2013-12-18 19:04:51 -08:00
process_vm_access.c Fix: compat_rw_copy_check_uvector() misuse in aio, readv, writev, and security keys 2013-03-12 11:05:45 -07:00
quicklist.c mm: delete various needless include <linux/module.h> 2011-10-31 09:20:11 -04:00
readahead.c readahead: fix sequential read cache miss detection 2013-11-13 12:09:09 +09:00
rmap.c mm/rmap: fix coccinelle warnings 2014-01-23 16:36:53 -08:00
shmem.c mm: dump page when hitting a VM_BUG_ON using VM_BUG_ON_PAGE 2014-01-23 16:36:50 -08:00
slab.c Merge branch 'slab/next' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux 2013-11-22 08:10:34 -08:00
slab.h memcg, slab: RCU protect memcg_params for root caches 2014-01-23 16:36:51 -08:00
slab_common.c slab: do not panic if we fail to create memcg cache 2014-01-23 16:36:51 -08:00
slob.c mm/sl[aou]b: Move kmallocXXX functions to common code 2013-09-04 20:51:33 +03:00
slub.c mm: dump page when hitting a VM_BUG_ON using VM_BUG_ON_PAGE 2014-01-23 16:36:50 -08:00
sparse-vmemmap.c mm/sparse: use memblock apis for early memory allocations 2014-01-21 16:19:47 -08:00
sparse.c mm/sparse: use memblock apis for early memory allocations 2014-01-21 16:19:47 -08:00
swap.c mm: dump page when hitting a VM_BUG_ON using VM_BUG_ON_PAGE 2014-01-23 16:36:50 -08:00
swap_state.c mm: dump page when hitting a VM_BUG_ON using VM_BUG_ON_PAGE 2014-01-23 16:36:50 -08:00
swapfile.c mm/swapfile.c: do not skip lowest_bit in scan_swap_map() scan loop 2014-01-23 16:36:53 -08:00
truncate.c truncate: drop 'oldsize' truncate_pagecache() parameter 2013-09-12 15:38:02 -07:00
util.c mm: add overcommit_kbytes sysctl variable 2014-01-21 16:19:44 -08:00
vmalloc.c mm/vmalloc: interchage the implementation of vmalloc_to_{pfn,page} 2014-01-21 16:19:44 -08:00
vmpressure.c memcg: make cgroup_event deal with mem_cgroup instead of cgroup_subsys_state 2013-11-22 18:20:43 -05:00
vmscan.c mm: vmscan: call NUMA-unaware shrinkers irrespective of nodemask 2014-01-23 16:36:52 -08:00
vmstat.c mm: numa: return the number of base pages altered by protection changes 2013-11-13 12:09:11 +09:00
zbud.c mm/zbud: fix some trivial typos in comments 2013-09-11 15:57:35 -07:00
zswap.c mm/zswap.c: change params from hidden to ro 2014-01-23 16:36:50 -08:00