linux/mm
Vlastimil Babka ee78ce5d44 mm/page_alloc: prevent MIGRATE_RESERVE pages from being misplaced
commit 5bcc9f86ef upstream.

MIGRATE_RESERVE pages should not get misplaced on the free_list of
another migratetype, otherwise they might get allocated prematurely
and, e.g., fragment the MIGRATE_RESERVE pageblocks.
While this cannot be avoided completely when allocating new
MIGRATE_RESERVE pageblocks in min_free_kbytes sysctl handler, we should
prevent the misplacement where possible.

Currently, it is possible for the misplacement to happen when a
MIGRATE_RESERVE page is moved onto a pcplist through rmqueue_bulk() as a
fallback for another desired migratetype, and then later freed back
through free_pcppages_bulk() without actually being used.  This happens
because free_pcppages_bulk() uses get_freepage_migratetype() to choose
the free_list, and rmqueue_bulk() calls set_freepage_migratetype() with
the *desired* migratetype and not the page's original MIGRATE_RESERVE
migratetype.
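
The misplacement described above can be illustrated with a small
user-space model.  This is only a sketch, not kernel code: the toy_*
names, the three-entry enum and the per-list counters are assumptions
made for the illustration, and freepage stealing is left out.  It shows
a page taken from the reserve list, tagged with the *desired*
migratetype, and then freed onto the wrong list.

```c
#include <assert.h>

/* Toy stand-ins for the kernel's migratetypes (hypothetical, not the real enum). */
enum toy_mt { TOY_UNMOVABLE, TOY_MOVABLE, TOY_RESERVE, TOY_NR_TYPES };

/* Only the cached tag matters here: what get_freepage_migratetype() would return. */
struct toy_page { enum toy_mt freepage_mt; };

/* One counter per free_list instead of real linked lists. */
struct toy_zone { int nr_free[TOY_NR_TYPES]; };

/* Remove one page from the given list; returns 0 on success, -1 if the list is empty. */
int toy_take(struct toy_zone *z, enum toy_mt list)
{
	assert(list < TOY_NR_TYPES);
	if (z->nr_free[list] == 0)
		return -1;
	z->nr_free[list]--;
	return 0;
}

/*
 * Models the pre-patch rmqueue_bulk(): try the desired list, fall back to
 * the reserve, then tag the page with the *desired* type regardless of
 * which list it actually came from.
 */
int toy_rmqueue_bulk_old(struct toy_zone *z, struct toy_page *p, enum toy_mt desired)
{
	if (toy_take(z, desired) != 0 && toy_take(z, TOY_RESERVE) != 0)
		return -1;
	p->freepage_mt = desired;	/* the original list is forgotten here */
	return 0;
}

/* Models free_pcppages_bulk(): the cached tag alone chooses the free_list. */
enum toy_mt toy_free_pcppages_bulk(struct toy_zone *z, const struct toy_page *p)
{
	z->nr_free[p->freepage_mt]++;
	return p->freepage_mt;
}
```

With only one page on the reserve list, requesting a TOY_MOVABLE page
and freeing it again leaves the reserve list empty and the page on the
movable list.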

This patch fixes the problem by moving the call to
set_freepage_migratetype() from rmqueue_bulk() down to
__rmqueue_smallest() and __rmqueue_fallback(), where the page's actual
migratetype (i.e. the free_list the page is taken from) is used.
Note that this migratetype might be different from the pageblock's
migratetype due to freepage stealing decisions.  This is OK, as page
stealing never uses MIGRATE_RESERVE as a fallback, and also takes care
to leave all MIGRATE_CMA pages on the correct freelist.
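
The effect of tagging the page at the point where it leaves a free_list
can be sketched in the same spirit.  Again this is a user-space model
under stated assumptions, not the patch itself: the model_* names and
counters are hypothetical, and the stealing path of
__rmqueue_fallback() is omitted, so only the "desired list, then
reserve" order is modelled.

```c
#include <assert.h>

/* Toy stand-ins for the kernel's migratetypes (hypothetical, not the real enum). */
enum model_mt { MODEL_UNMOVABLE, MODEL_MOVABLE, MODEL_RESERVE, MODEL_NR_TYPES };

/* Cached tag, i.e. what get_freepage_migratetype() would return. */
struct model_page { enum model_mt freepage_mt; };

/* One counter per free_list instead of real linked lists. */
struct model_zone { int nr_free[MODEL_NR_TYPES]; };

/*
 * Models the patched __rmqueue_smallest(): the tag is set right where the
 * page is removed, using the list it was actually taken from.
 */
int model_rmqueue_smallest(struct model_zone *z, struct model_page *p,
			   enum model_mt list)
{
	assert(list < MODEL_NR_TYPES);
	if (z->nr_free[list] == 0)
		return -1;
	z->nr_free[list]--;
	p->freepage_mt = list;
	return 0;
}

/* Try the desired list first, then the reserve as a last resort. */
int model_rmqueue(struct model_zone *z, struct model_page *p,
		  enum model_mt desired)
{
	if (model_rmqueue_smallest(z, p, desired) == 0)
		return 0;
	return model_rmqueue_smallest(z, p, MODEL_RESERVE);
}

/* Models free_pcppages_bulk(): the cached tag alone chooses the free_list. */
enum model_mt model_free_pcppages_bulk(struct model_zone *z,
				       const struct model_page *p)
{
	z->nr_free[p->freepage_mt]++;
	return p->freepage_mt;
}
```

In this model a page that came from the reserve list as a fallback for
a MODEL_MOVABLE request keeps the MODEL_RESERVE tag, so freeing it
unused returns it to the reserve list.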

Therefore, as an additional benefit, the call to
get_pageblock_migratetype() from rmqueue_bulk() when CMA is enabled can
be removed completely.  This relies on the fact that MIGRATE_CMA
pageblocks are created only during system init, and on the stealing
behaviour described above.  The related is_migrate_isolate() check is
also unnecessary, as memory isolation has other ways to move pages
between freelists and to drain pcp lists containing pages that should
be isolated.  buffered_rmqueue() can also benefit from calling
get_freepage_migratetype() instead of get_pageblock_migratetype().

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Yong-Taek Lee <ytk.lee@samsung.com>
Reported-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Suggested-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Suggested-by: Mel Gorman <mgorman@suse.de>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: "Wang, Yalin" <Yalin.Wang@sonymobile.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-21 09:23:07 -08:00
Kconfig parisc,metag: Do not hardcode maximum userspace stack size 2014-07-17 16:21:03 -07:00
Kconfig.debug mm: more intensive memory corruption debugging 2012-01-10 16:30:42 -08:00
Makefile mm: per-thread vma caching 2014-10-09 12:21:29 -07:00
backing-dev.c bdi: avoid oops on device removal 2014-04-26 17:19:05 -07:00
balloon_compaction.c mm: print more details for bad_page() 2014-01-23 16:36:50 -08:00
bootmem.c mm/bootmem.c: remove unused local `map' 2013-11-13 12:09:09 +09:00
bounce.c block: Convert bio_for_each_segment() to bvec_iter 2013-11-23 22:33:49 -08:00
cleancache.c mm: dump page when hitting a VM_BUG_ON using VM_BUG_ON_PAGE 2014-01-23 16:36:50 -08:00
compaction.c mm, compaction: properly signal and act upon lock and need_sched() contention 2014-11-21 09:23:07 -08:00
debug-pagealloc.c mm, x86: Remove debug_pagealloc_enabled 2011-12-06 09:24:07 +01:00
dmapool.c dmapool: make DMAPOOL_DEBUG detect corruption of free marker 2012-12-11 17:22:24 -08:00
fadvise.c teach SYSCALL_DEFINE<n> how to deal with long long/unsigned long long 2013-03-03 22:46:22 -05:00
failslab.c switch debugfs to umode_t 2012-01-03 22:54:56 -05:00
filemap.c callers of iov_copy_from_user_atomic() don't need pagecache_disable() 2014-11-21 09:23:06 -08:00
filemap_xip.c seqcount: Add lockdep functionality to seqcount/seqlock structures 2013-11-06 12:40:26 +01:00
fremap.c mm: fix bad rss-counter if remap_file_pages raced migration 2014-03-19 16:21:49 -07:00
frontswap.c swap: change swap_list_head to plist, add swap_avail_head 2014-10-09 12:21:28 -07:00
highmem.c Some nice cleanups, and even a patch my wife did as a "live" demo for 2012-12-20 08:37:05 -08:00
huge_memory.c mm: free compound page with correct order 2014-11-14 09:00:07 -08:00
hugetlb.c mm: optimize put_mems_allowed() usage 2014-10-09 12:21:28 -07:00
hugetlb_cgroup.c mm: dump page when hitting a VM_BUG_ON using VM_BUG_ON_PAGE 2014-01-23 16:36:50 -08:00
hwpoison-inject.c mm/hwpoison: add '#' to hwpoison_inject 2014-01-21 16:19:48 -08:00
init-mm.c atomic: use <linux/atomic.h> 2011-07-26 16:49:47 -07:00
internal.h mm, compaction: properly signal and act upon lock and need_sched() contention 2014-11-21 09:23:07 -08:00
interval_tree.c mm: add CONFIG_DEBUG_VM_RB build option 2012-10-09 16:22:42 +09:00
kmemcheck.c kmemcheck: Fix build errors due to missing slab.h 2010-03-30 22:02:32 +09:00
kmemleak-test.c kmemleak: remove memset by using kzalloc 2011-01-27 18:31:51 +00:00
kmemleak.c mm: kmemleak: avoid false negatives on vmalloc'ed objects 2013-11-13 12:09:07 +09:00
ksm.c mm: close PageTail race 2014-03-04 07:55:47 -08:00
list_lru.c mm: list_lru: fix almost infinite loop causing effective livelock 2013-10-30 12:57:46 -07:00
maccess.c mm: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
madvise.c mm: madvise: fix MADV_WILLNEED on shmem swapouts 2014-11-21 09:23:06 -08:00
memblock.c memblock, memhotplug: fix wrong type in memblock_find_in_range_node(). 2014-10-05 14:52:17 -07:00
memcontrol.c mm: memcontrol: do not iterate uninitialized memcgs 2014-11-14 09:00:08 -08:00
memory-failure.c mm, migration: add destination page freeing callback 2014-11-21 09:23:06 -08:00
memory.c mm: softdirty: keep bit when zapping file pte 2014-10-05 14:52:21 -07:00
memory_hotplug.c mm, migration: add destination page freeing callback 2014-11-21 09:23:06 -08:00
mempolicy.c mm, migration: add destination page freeing callback 2014-11-21 09:23:06 -08:00
mempool.c mm/mempool.c: convert kmalloc_node(...GFP_ZERO...) to kzalloc_node(...) 2013-09-11 15:58:14 -07:00
migrate.c mm: fix direct reclaim writeback regression 2014-11-21 09:23:07 -08:00
mincore.c mm + fs: prepare for non-page entries in page cache radix trees 2014-11-21 09:23:06 -08:00
mlock.c mm: try_to_unmap_cluster() should lock_page() before mlocking 2014-05-06 07:59:35 -07:00
mm_init.c mm: bring back /sys/kernel/mm 2014-01-27 21:02:39 -08:00
mmap.c mm: per-thread vma caching 2014-10-09 12:21:29 -07:00
mmu_context.c mm: remove old aio use_mm() comment 2013-05-07 18:38:27 -07:00
mmu_notifier.c mm: audit/fix non-modular users of module_init in core code 2014-01-23 16:36:52 -08:00
mmzone.c mm: numa: Change page last {nid,pid} into {cpu,pid} 2013-10-09 14:47:45 +02:00
mprotect.c mm: Use ptep/pmdp_set_numa() for updating _PAGE_NUMA bit 2014-02-17 11:19:36 +11:00
mremap.c mm, thp: close race between mremap() and split_huge_page() 2014-06-07 10:28:10 -07:00
msync.c sanitize vfs_fsync calling conventions 2010-05-21 18:31:21 -04:00
nobootmem.c mm/nobootmem: free_all_bootmem again 2014-01-23 16:36:52 -08:00
nommu.c mm: per-thread vma caching 2014-10-09 12:21:29 -07:00
oom_kill.c OOM, PM: OOM killed task shouldn't escape PM suspend 2014-11-14 09:00:01 -08:00
page-writeback.c mm/page-writeback.c: fix divide by zero in bdi_dirty_limits() 2014-08-07 14:52:37 -07:00
page_alloc.c mm/page_alloc: prevent MIGRATE_RESERVE pages from being misplaced 2014-11-21 09:23:07 -08:00
page_cgroup.c cgroup/kmemleak: add kmemleak_free() for cgroup deallocations. 2014-11-14 09:00:07 -08:00
page_io.c Merge branch 'for-3.14/core' of git://git.kernel.dk/linux-block 2014-01-30 11:19:05 -08:00
page_isolation.c mm: memory-hotplug: enable memory hotplug to handle hugepage 2013-09-11 15:57:48 -07:00
pagewalk.c mm/pagewalk.c: fix walk_page_range() access of wrong PTEs 2013-10-30 14:27:03 -07:00
percpu-km.c percpu: clear memory allocated with the km allocator 2010-10-02 10:28:42 +03:00
percpu-vm.c percpu: perform tlb flush after pcpu_map_pages() failure 2014-10-05 14:52:20 -07:00
percpu.c Revert "percpu: free percpu allocation info for uniprocessor system" 2014-11-14 08:59:45 -08:00
pgtable-generic.c mm: fix TLB flush race between migration, and change_protection_range 2013-12-18 19:04:51 -08:00
process_vm_access.c Fix: compat_rw_copy_check_uvector() misuse in aio, readv, writev, and security keys 2013-03-12 11:05:45 -07:00
quicklist.c mm: delete various needless include <linux/module.h> 2011-10-31 09:20:11 -04:00
readahead.c mm/readahead.c: inline ra_submit 2014-11-21 09:23:06 -08:00
rmap.c mm: fix sleeping function warning from __put_anon_vma 2014-06-30 20:11:53 -07:00
shmem.c mm + fs: prepare for non-page entries in page cache radix trees 2014-11-21 09:23:06 -08:00
slab.c mm: optimize put_mems_allowed() usage 2014-10-09 12:21:28 -07:00
slab.h memcg, slab: RCU protect memcg_params for root caches 2014-01-23 16:36:51 -08:00
slab_common.c slab_common: fix the check for duplicate slab names 2014-07-31 12:52:55 -07:00
slob.c mm/sl[aou]b: Move kmallocXXX functions to common code 2013-09-04 20:51:33 +03:00
slub.c mm: optimize put_mems_allowed() usage 2014-10-09 12:21:28 -07:00
sparse-vmemmap.c mm/sparse: use memblock apis for early memory allocations 2014-01-21 16:19:47 -08:00
sparse.c mm/sparse: use memblock apis for early memory allocations 2014-01-21 16:19:47 -08:00
swap.c mm + fs: prepare for non-page entries in page cache radix trees 2014-11-21 09:23:06 -08:00
swap_state.c swap: add a simple detector for inappropriate swapin readahead 2014-02-06 13:48:51 -08:00
swapfile.c swap: change swap_list_head to plist, add swap_avail_head 2014-10-09 12:21:28 -07:00
truncate.c mm + fs: prepare for non-page entries in page cache radix trees 2014-11-21 09:23:06 -08:00
util.c vm_is_stack: use for_each_thread() rather then buggy while_each_thread() 2014-09-05 16:34:18 -07:00
vmacache.c mm: don't pointlessly use BUG_ON() for sanity check 2014-10-09 12:21:29 -07:00
vmalloc.c Revert "mm/vmalloc: interchage the implementation of vmalloc_to_{pfn,page}" 2014-01-27 21:02:39 -08:00
vmpressure.c arm, pm, vmpressure: add missing slab.h includes 2014-02-03 13:24:01 -05:00
vmscan.c mm: vmscan: use proportional scanning during direct reclaim and full scan at DEF_PRIORITY 2014-11-21 09:23:07 -08:00
vmstat.c mm, x86: Account for TLB flushes only when debugging 2014-01-25 09:10:41 +01:00
zbud.c mm/zbud: fix some trivial typos in comments 2013-09-11 15:57:35 -07:00
zsmalloc.c zsmalloc: add copyright 2014-01-30 16:56:55 -08:00
zswap.c mm/zswap.c: change params from hidden to ro 2014-01-23 16:36:50 -08:00