linux/mm
Vlastimil Babka 4201cb7e87 mm, compaction: properly signal and act upon lock and need_sched() contention
commit be9765722e upstream.

Compaction uses compact_checklock_irqsave() function to periodically check
for lock contention and need_resched() to either abort async compaction,
or to free the lock, schedule and retake the lock.  When aborting,
cc->contended is set to signal the contended state to the caller.  Two
problems have been identified in this mechanism.

First, compaction also calls directly cond_resched() in both scanners when
no lock is yet taken.  This call either does not abort async compaction,
or set cc->contended appropriately.  This patch introduces a new
compact_should_abort() function to achieve both.  In isolate_freepages(),
the check frequency is reduced to once by SWAP_CLUSTER_MAX pageblocks to
match what the migration scanner does in the preliminary page checks.  In
case a pageblock is found suitable for calling isolate_freepages_block(),
the checks within there are done on higher frequency.

Second, isolate_freepages() does not check if isolate_freepages_block()
aborted due to contention, and advances to the next pageblock.  This
violates the principle of aborting on contention, and might result in
pageblocks not being scanned completely, since the scanning cursor is
advanced.  This problem has been noticed in the code by Joonsoo Kim when
reviewing related patches.  This patch makes isolate_freepages_block()
check the cc->contended flag and abort.

In case isolate_freepages() has already isolated some pages before
aborting due to contention, page migration will proceed, which is OK since
we do not want to waste the work that has been done, and page migration
has own checks for contention.  However, we do not want another isolation
attempt by either of the scanners, so cc->contended flag check is added
also to compaction_alloc() and compact_finished() to make sure compaction
is aborted right after the migration.

The outcome of the patch should be reduced lock contention by async
compaction and lower latencies for higher-order allocations where direct
compaction is involved.

[akpm@linux-foundation.org: fix typo in comment]
Reported-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Rik van Riel <riel@redhat.com>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Tested-by: Shawn Guo <shawn.guo@linaro.org>
Tested-by: Kevin Hilman <khilman@linaro.org>
Tested-by: Stephen Warren <swarren@nvidia.com>
Tested-by: Fabio Estevam <fabio.estevam@freescale.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-21 09:23:07 -08:00
..
Kconfig parisc,metag: Do not hardcode maximum userspace stack size 2014-07-17 16:21:03 -07:00
Kconfig.debug mm: more intensive memory corruption debugging 2012-01-10 16:30:42 -08:00
Makefile mm: per-thread vma caching 2014-10-09 12:21:29 -07:00
backing-dev.c bdi: avoid oops on device removal 2014-04-26 17:19:05 -07:00
balloon_compaction.c mm: print more details for bad_page() 2014-01-23 16:36:50 -08:00
bootmem.c mm/bootmem.c: remove unused local `map' 2013-11-13 12:09:09 +09:00
bounce.c block: Convert bio_for_each_segment() to bvec_iter 2013-11-23 22:33:49 -08:00
cleancache.c mm: dump page when hitting a VM_BUG_ON using VM_BUG_ON_PAGE 2014-01-23 16:36:50 -08:00
compaction.c mm, compaction: properly signal and act upon lock and need_sched() contention 2014-11-21 09:23:07 -08:00
debug-pagealloc.c mm, x86: Remove debug_pagealloc_enabled 2011-12-06 09:24:07 +01:00
dmapool.c dmapool: make DMAPOOL_DEBUG detect corruption of free marker 2012-12-11 17:22:24 -08:00
fadvise.c teach SYSCALL_DEFINE<n> how to deal with long long/unsigned long long 2013-03-03 22:46:22 -05:00
failslab.c switch debugfs to umode_t 2012-01-03 22:54:56 -05:00
filemap.c callers of iov_copy_from_user_atomic() don't need pagecache_disable() 2014-11-21 09:23:06 -08:00
filemap_xip.c seqcount: Add lockdep functionality to seqcount/seqlock structures 2013-11-06 12:40:26 +01:00
fremap.c mm: fix bad rss-counter if remap_file_pages raced migration 2014-03-19 16:21:49 -07:00
frontswap.c swap: change swap_list_head to plist, add swap_avail_head 2014-10-09 12:21:28 -07:00
highmem.c Some nice cleanups, and even a patch my wife did as a "live" demo for 2012-12-20 08:37:05 -08:00
huge_memory.c mm: free compound page with correct order 2014-11-14 09:00:07 -08:00
hugetlb.c mm: optimize put_mems_allowed() usage 2014-10-09 12:21:28 -07:00
hugetlb_cgroup.c mm: dump page when hitting a VM_BUG_ON using VM_BUG_ON_PAGE 2014-01-23 16:36:50 -08:00
hwpoison-inject.c mm/hwpoison: add '#' to hwpoison_inject 2014-01-21 16:19:48 -08:00
init-mm.c atomic: use <linux/atomic.h> 2011-07-26 16:49:47 -07:00
internal.h mm, compaction: properly signal and act upon lock and need_sched() contention 2014-11-21 09:23:07 -08:00
interval_tree.c mm: add CONFIG_DEBUG_VM_RB build option 2012-10-09 16:22:42 +09:00
kmemcheck.c kmemcheck: Fix build errors due to missing slab.h 2010-03-30 22:02:32 +09:00
kmemleak-test.c kmemleak: remove memset by using kzalloc 2011-01-27 18:31:51 +00:00
kmemleak.c mm: kmemleak: avoid false negatives on vmalloc'ed objects 2013-11-13 12:09:07 +09:00
ksm.c mm: close PageTail race 2014-03-04 07:55:47 -08:00
list_lru.c mm: list_lru: fix almost infinite loop causing effective livelock 2013-10-30 12:57:46 -07:00
maccess.c mm: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
madvise.c mm: madvise: fix MADV_WILLNEED on shmem swapouts 2014-11-21 09:23:06 -08:00
memblock.c memblock, memhotplug: fix wrong type in memblock_find_in_range_node(). 2014-10-05 14:52:17 -07:00
memcontrol.c mm: memcontrol: do not iterate uninitialized memcgs 2014-11-14 09:00:08 -08:00
memory-failure.c mm, migration: add destination page freeing callback 2014-11-21 09:23:06 -08:00
memory.c mm: softdirty: keep bit when zapping file pte 2014-10-05 14:52:21 -07:00
memory_hotplug.c mm, migration: add destination page freeing callback 2014-11-21 09:23:06 -08:00
mempolicy.c mm, migration: add destination page freeing callback 2014-11-21 09:23:06 -08:00
mempool.c mm/mempool.c: convert kmalloc_node(...GFP_ZERO...) to kzalloc_node(...) 2013-09-11 15:58:14 -07:00
migrate.c mm, migration: add destination page freeing callback 2014-11-21 09:23:06 -08:00
mincore.c mm + fs: prepare for non-page entries in page cache radix trees 2014-11-21 09:23:06 -08:00
mlock.c mm: try_to_unmap_cluster() should lock_page() before mlocking 2014-05-06 07:59:35 -07:00
mm_init.c mm: bring back /sys/kernel/mm 2014-01-27 21:02:39 -08:00
mmap.c mm: per-thread vma caching 2014-10-09 12:21:29 -07:00
mmu_context.c mm: remove old aio use_mm() comment 2013-05-07 18:38:27 -07:00
mmu_notifier.c mm: audit/fix non-modular users of module_init in core code 2014-01-23 16:36:52 -08:00
mmzone.c mm: numa: Change page last {nid,pid} into {cpu,pid} 2013-10-09 14:47:45 +02:00
mprotect.c mm: Use ptep/pmdp_set_numa() for updating _PAGE_NUMA bit 2014-02-17 11:19:36 +11:00
mremap.c mm, thp: close race between mremap() and split_huge_page() 2014-06-07 10:28:10 -07:00
msync.c sanitize vfs_fsync calling conventions 2010-05-21 18:31:21 -04:00
nobootmem.c mm/nobootmem: free_all_bootmem again 2014-01-23 16:36:52 -08:00
nommu.c mm: per-thread vma caching 2014-10-09 12:21:29 -07:00
oom_kill.c OOM, PM: OOM killed task shouldn't escape PM suspend 2014-11-14 09:00:01 -08:00
page-writeback.c mm/page-writeback.c: fix divide by zero in bdi_dirty_limits() 2014-08-07 14:52:37 -07:00
page_alloc.c mm, compaction: embed migration mode in compact_control 2014-11-21 09:23:07 -08:00
page_cgroup.c cgroup/kmemleak: add kmemleak_free() for cgroup deallocations. 2014-11-14 09:00:07 -08:00
page_io.c Merge branch 'for-3.14/core' of git://git.kernel.dk/linux-block 2014-01-30 11:19:05 -08:00
page_isolation.c mm: memory-hotplug: enable memory hotplug to handle hugepage 2013-09-11 15:57:48 -07:00
pagewalk.c mm/pagewalk.c: fix walk_page_range() access of wrong PTEs 2013-10-30 14:27:03 -07:00
percpu-km.c percpu: clear memory allocated with the km allocator 2010-10-02 10:28:42 +03:00
percpu-vm.c percpu: perform tlb flush after pcpu_map_pages() failure 2014-10-05 14:52:20 -07:00
percpu.c Revert "percpu: free percpu allocation info for uniprocessor system" 2014-11-14 08:59:45 -08:00
pgtable-generic.c mm: fix TLB flush race between migration, and change_protection_range 2013-12-18 19:04:51 -08:00
process_vm_access.c Fix: compat_rw_copy_check_uvector() misuse in aio, readv, writev, and security keys 2013-03-12 11:05:45 -07:00
quicklist.c mm: delete various needless include <linux/module.h> 2011-10-31 09:20:11 -04:00
readahead.c mm/readahead.c: inline ra_submit 2014-11-21 09:23:06 -08:00
rmap.c mm: fix sleeping function warning from __put_anon_vma 2014-06-30 20:11:53 -07:00
shmem.c mm + fs: prepare for non-page entries in page cache radix trees 2014-11-21 09:23:06 -08:00
slab.c mm: optimize put_mems_allowed() usage 2014-10-09 12:21:28 -07:00
slab.h memcg, slab: RCU protect memcg_params for root caches 2014-01-23 16:36:51 -08:00
slab_common.c slab_common: fix the check for duplicate slab names 2014-07-31 12:52:55 -07:00
slob.c mm/sl[aou]b: Move kmallocXXX functions to common code 2013-09-04 20:51:33 +03:00
slub.c mm: optimize put_mems_allowed() usage 2014-10-09 12:21:28 -07:00
sparse-vmemmap.c mm/sparse: use memblock apis for early memory allocations 2014-01-21 16:19:47 -08:00
sparse.c mm/sparse: use memblock apis for early memory allocations 2014-01-21 16:19:47 -08:00
swap.c mm + fs: prepare for non-page entries in page cache radix trees 2014-11-21 09:23:06 -08:00
swap_state.c swap: add a simple detector for inappropriate swapin readahead 2014-02-06 13:48:51 -08:00
swapfile.c swap: change swap_list_head to plist, add swap_avail_head 2014-10-09 12:21:28 -07:00
truncate.c mm + fs: prepare for non-page entries in page cache radix trees 2014-11-21 09:23:06 -08:00
util.c vm_is_stack: use for_each_thread() rather then buggy while_each_thread() 2014-09-05 16:34:18 -07:00
vmacache.c mm: don't pointlessly use BUG_ON() for sanity check 2014-10-09 12:21:29 -07:00
vmalloc.c Revert "mm/vmalloc: interchage the implementation of vmalloc_to_{pfn,page}" 2014-01-27 21:02:39 -08:00
vmpressure.c arm, pm, vmpressure: add missing slab.h includes 2014-02-03 13:24:01 -05:00
vmscan.c vmscan: reclaim_clean_pages_from_list() must use mod_zone_page_state() 2014-10-09 12:21:29 -07:00
vmstat.c mm, x86: Account for TLB flushes only when debugging 2014-01-25 09:10:41 +01:00
zbud.c mm/zbud: fix some trivial typos in comments 2013-09-11 15:57:35 -07:00
zsmalloc.c zsmalloc: add copyright 2014-01-30 16:56:55 -08:00
zswap.c mm/zswap.c: change params from hidden to ro 2014-01-23 16:36:50 -08:00