linux/mm
David Rientjes 14a4e2141e mm, thp: only collapse hugepages to nodes with affinity for zone_reclaim_mode
Commit 9f1b868a13 ("mm: thp: khugepaged: add policy for finding target
node") improved the previous khugepaged logic which allocated a
transparent hugepages from the node of the first page being collapsed.

However, it is still possible to collapse pages to remote memory which
may suffer from additional access latency.  With the current policy, it
is possible that 255 pages (with PAGE_SHIFT == 12) will be collapsed
remotely if the majority are allocated from that node.

When zone_reclaim_mode is enabled, it means the VM should make every
attempt to allocate locally to prevent NUMA performance degradation.  In
this case, we do not want to collapse hugepages to remote nodes that
would suffer from increased access latency.  Thus, when
zone_reclaim_mode is enabled, only allow collapsing to nodes with
RECLAIM_DISTANCE or less.

There is no functional change for systems that disable
zone_reclaim_mode.

Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Bob Liu <bob.liu@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-06 18:01:20 -07:00
..
backing-dev.c arch: Mass conversion of smp_mb__*() 2014-04-18 14:20:48 +02:00
balloon_compaction.c mm: print more details for bad_page() 2014-01-23 16:36:50 -08:00
bootmem.c mm/bootmem.c: remove unused local `map' 2013-11-13 12:09:09 +09:00
cleancache.c mm: dump page when hitting a VM_BUG_ON using VM_BUG_ON_PAGE 2014-01-23 16:36:50 -08:00
cma.c mm, CMA: clean-up log message 2014-08-06 18:01:16 -07:00
compaction.c mm, compaction: properly signal and act upon lock and need_sched() contention 2014-06-04 16:54:11 -07:00
debug-pagealloc.c
dmapool.c mm/dmapool.c: reuse devres_release() to free resources 2014-06-04 16:54:08 -07:00
early_ioremap.c mm: create generic early_ioremap() support 2014-04-07 16:36:15 -07:00
fadvise.c
failslab.c
filemap_xip.c seqcount: Add lockdep functionality to seqcount/seqlock structures 2013-11-06 12:40:26 +01:00
filemap.c mm: describe mmap_sem rules for __lock_page_or_retry() and callers 2014-08-06 18:01:20 -07:00
fremap.c mm: mark remap_file_pages() syscall as deprecated 2014-06-06 16:08:17 -07:00
frontswap.c swap: change swap_list_head to plist, add swap_avail_head 2014-06-04 16:54:07 -07:00
gup.c mm: describe mmap_sem rules for __lock_page_or_retry() and callers 2014-08-06 18:01:20 -07:00
highmem.c
huge_memory.c mm, thp: only collapse hugepages to nodes with affinity for zone_reclaim_mode 2014-08-06 18:01:20 -07:00
hugetlb_cgroup.c cgroup: replace cgroup_add_cftypes() with cgroup_add_legacy_cftypes() 2014-07-15 11:05:09 -04:00
hugetlb.c mm, hugetlb: remove hugetlb_zero and hugetlb_infinity 2014-08-06 18:01:19 -07:00
hwpoison-inject.c mm/hwpoison-inject.c: remove unnecessary null test before debugfs_remove_recursive 2014-08-06 18:01:19 -07:00
init-mm.c
internal.h mm/internal.h: use nth_page 2014-08-06 18:01:16 -07:00
interval_tree.c
iov_iter.c bio_vec-backed iov_iter 2014-05-06 17:39:45 -04:00
Kconfig CMA: generalize CMA reserved area management functionality 2014-08-06 18:01:16 -07:00
Kconfig.debug
kmemcheck.c
kmemleak-test.c mm/kmemleak-test.c: use pr_fmt for logging 2014-06-06 16:08:18 -07:00
kmemleak.c mm: introduce kmemleak_update_trace() 2014-06-06 16:08:17 -07:00
ksm.c sched: Remove proliferation of wait_on_bit() action functions 2014-07-16 15:10:39 +02:00
list_lru.c mm: keep page cache radix tree nodes in check 2014-04-03 16:21:01 -07:00
maccess.c
madvise.c mm: update the description for madvise_remove 2014-08-06 18:01:18 -07:00
Makefile CMA: generalize CMA reserved area management functionality 2014-08-06 18:01:16 -07:00
memblock.c mm/memblock.c: call kmemleak directly from memblock_(alloc|free) 2014-06-06 16:08:17 -07:00
memcontrol.c mm: memcontrol: do not acquire page_cgroup lock for kmem pages 2014-08-06 18:01:17 -07:00
memory_hotplug.c mem-hotplug: introduce MMOP_OFFLINE to replace the hard coding -1 2014-08-06 18:01:16 -07:00
memory-failure.c hwpoison: fix race with changing page during offlining 2014-08-06 18:01:19 -07:00
memory.c mm: describe mmap_sem rules for __lock_page_or_retry() and callers 2014-08-06 18:01:20 -07:00
mempolicy.c Merge branch 'for-3.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2014-07-10 11:38:23 -07:00
mempool.c mm/mempool.c: update the kmemleak stack trace for mempool allocations 2014-06-06 16:08:17 -07:00
migrate.c mm: fix direct reclaim writeback regression 2014-07-26 14:38:50 -07:00
mincore.c mm + fs: prepare for non-page entries in page cache radix trees 2014-04-03 16:21:00 -07:00
mlock.c mm: describe mmap_sem rules for __lock_page_or_retry() and callers 2014-08-06 18:01:20 -07:00
mm_init.c mm: bring back /sys/kernel/mm 2014-01-27 21:02:39 -08:00
mmap.c mm: catch memory commitment underflow 2014-08-06 18:01:19 -07:00
mmu_context.c sched/mm: call finish_arch_post_lock_switch in idle_task_exit and use_mm 2014-02-21 08:50:17 +01:00
mmu_notifier.c mm: audit/fix non-modular users of module_init in core code 2014-01-23 16:36:52 -08:00
mmzone.c mm: numa: Change page last {nid,pid} into {cpu,pid} 2013-10-09 14:47:45 +02:00
mprotect.c mm: move mmu notifier call from change_protection to change_pmd_range 2014-04-07 16:35:50 -07:00
mremap.c mm, thp: close race between mremap() and split_huge_page() 2014-05-11 17:55:48 +09:00
msync.c msync: fix incorrect fstart calculation 2014-07-03 09:21:53 -07:00
nobootmem.c mm/memblock.c: call kmemleak directly from memblock_(alloc|free) 2014-06-06 16:08:17 -07:00
nommu.c mm: nommu: per-thread vma cache fix 2014-06-23 16:47:43 -07:00
oom_kill.c mm, oom: base root bonus on current usage 2014-01-30 16:56:56 -08:00
page_alloc.c mm: page_alloc: reduce cost of the fair zone allocation policy 2014-08-06 18:01:20 -07:00
page_cgroup.c mm/page_cgroup.c: mark functions as static 2014-04-03 16:21:02 -07:00
page_io.c fix __swap_writepage() compile failure on old gcc versions 2014-06-14 19:30:48 -05:00
page_isolation.c
page-writeback.c mm/page-writeback.c: fix divide by zero in bdi_dirty_limits() 2014-07-30 17:16:13 -07:00
pagewalk.c mm/pagewalk.c: fix walk_page_range() access of wrong PTEs 2013-10-30 14:27:03 -07:00
percpu-km.c
percpu-vm.c
percpu.c percpu: Use ALIGN macro instead of hand coding alignment calculation 2014-06-19 11:00:27 -04:00
pgtable-generic.c mm: fix TLB flush race between migration, and change_protection_range 2013-12-18 19:04:51 -08:00
process_vm_access.c start adding the tag to iov_iter 2014-05-06 17:32:49 -04:00
quicklist.c
readahead.c mm/readahead.c: remove unused file_ra_state from count_history_pages 2014-08-06 18:01:15 -07:00
rmap.c mm/rmap.c: fix pgoff calculation to handle hugepage correctly 2014-07-23 15:10:54 -07:00
shmem.c mm/shmem.c: remove the unused gfp arg to shmem_add_to_page_cache() 2014-08-06 18:01:20 -07:00
slab_common.c mm: move slab related stuff from util.c to slab_common.c 2014-08-06 18:01:15 -07:00
slab.c mm/slab.c: fix comments 2014-08-06 18:01:15 -07:00
slab.h slab: convert last use of __FUNCTION__ to __func__ 2014-08-06 18:01:15 -07:00
slob.c slab: get_online_mems for kmem_cache_{create,destroy,shrink} 2014-06-04 16:53:59 -07:00
slub.c slab: fix the alias count (via sysfs) of slab cache 2014-08-06 18:01:15 -07:00
sparse-vmemmap.c mm/sparse: use memblock apis for early memory allocations 2014-01-21 16:19:47 -08:00
sparse.c mm: use macros from compiler.h instead of __attribute__((...)) 2014-04-07 16:35:54 -07:00
swap_state.c mm: page_alloc: convert hot/cold parameter and immediate callers to bool 2014-06-04 16:54:09 -07:00
swap.c mm: pagemap: avoid unnecessary overhead when tracepoints are deactivated 2014-08-06 18:01:20 -07:00
swapfile.c mm/swapfile.c: delete the "last_in_cluster < scan_base" loop in the body of scan_swap_map() 2014-06-04 16:54:12 -07:00
truncate.c mm/fs: fix pessimization in hole-punching pagecache 2014-07-23 15:10:55 -07:00
util.c mm: move slab related stuff from util.c to slab_common.c 2014-08-06 18:01:15 -07:00
vmacache.c mm,vmacache: optimize overflow system-wide flushing 2014-06-04 16:53:57 -07:00
vmalloc.c mm/vmalloc.c: clean up map_vm_area third argument 2014-08-06 18:01:19 -07:00
vmpressure.c arm, pm, vmpressure: add missing slab.h includes 2014-02-03 13:24:01 -05:00
vmscan.c mm: move zone->pages_scanned into a vmstat counter 2014-08-06 18:01:20 -07:00
vmstat.c mm: vmscan: only update per-cpu thresholds for online CPU 2014-08-06 18:01:20 -07:00
workingset.c mm: keep page cache radix tree nodes in check 2014-04-03 16:21:01 -07:00
zbud.c mm/zbud.c: make size unsigned like unique callsite 2014-06-04 16:54:13 -07:00
zsmalloc.c mm/vmalloc.c: clean up map_vm_area third argument 2014-08-06 18:01:19 -07:00
zswap.c mm/zswap: NUMA aware allocation for zswap_dstmem 2014-06-04 16:54:14 -07:00