linux/mm
Mel Gorman 37a4094e82 mremap: remove LATENCY_LIMIT from mremap to reduce the number of TLB shootdowns
Commit 5d1904204c ("mremap: fix race between mremap() and page
cleanning") fixed races between mremap and other operations for both
file-backed and anonymous mappings.  The file-backed was the most
critical as it allowed the possibility that data could be changed on a
physical page after page_mkclean returned which could trigger data loss
or data integrity issues.

A customer reported that the cost of the TLBs for anonymous regressions
was excessive and resulting in a 30-50% drop in performance overall
since this commit on a microbenchmark.  Unfortunately I neither have
access to the test-case nor can I describe what it does other than
saying that mremap operations dominate heavily.

This patch removes the LATENCY_LIMIT to handle TLB flushes on a PMD
boundary instead of every 64 pages to reduce the number of TLB
shootdowns by a factor of 8 in the ideal case.  LATENCY_LIMIT was almost
certainly used originally to limit the PTL hold times but the latency
savings are likely offset by the cost of IPIs in many cases.  This patch
is not reported to completely restore performance but gets it within an
acceptable percentage.  The given metric here is simply described as
"higher is better".

Baseline that was known good
002:  Metric:       91.05
004:  Metric:      109.45
008:  Metric:       73.08
016:  Metric:       58.14
032:  Metric:       61.09
064:  Metric:       57.76
128:  Metric:       55.43

Current
001:  Metric:       54.98
002:  Metric:       56.56
004:  Metric:       41.22
008:  Metric:       35.96
016:  Metric:       36.45
032:  Metric:       35.71
064:  Metric:       35.73
128:  Metric:       34.96

With patch
001:  Metric:       61.43
002:  Metric:       81.64
004:  Metric:       67.92
008:  Metric:       51.67
016:  Metric:       50.47
032:  Metric:       52.29
064:  Metric:       50.01
128:  Metric:       49.04

So for low threads, it's not restored but for larger number of threads,
it's closer to the "known good" baseline.

Using a different mremap-intensive workload that is not representative
of the real workload there is little difference observed outside of
noise in the headline metrics However, the TLB shootdowns are reduced by
11% on average and at the peak, TLB shootdowns were reduced by 21%.
Interrupts were sampled every second while the workload ran to get those
figures.  It's known that the figures will vary as the
non-representative load is non-deterministic.

An alternative patch was posted that should have significantly reduced
the TLB flushes but unfortunately it does not perform as well as this
version on the customer test case.  If revisited, the two patches can
stack on top of each other.

Link: http://lkml.kernel.org/r/20180606183803.k7qaw2xnbvzshv34@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Aaron Lu <aaron.lu@intel.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-06-15 07:55:24 +09:00
..
kasan kasan: fix memory hotplug during boot 2018-05-25 18:12:11 -07:00
Kconfig libnvdimm for 4.18 2018-06-08 17:21:52 -07:00
Kconfig.debug kmemcheck: rip it out 2017-11-15 18:21:05 -08:00
Makefile mm: restructure memfd code 2018-06-07 17:34:35 -07:00
backing-dev.c memcg: writeback: use memcg->cgwb_list directly 2018-06-07 17:34:36 -07:00
balloon_compaction.c virtio_balloon: fix deadlock on OOM 2017-11-14 23:57:38 +02:00
bootmem.c mm: docs: fix parameter names mismatch 2018-02-06 18:32:48 -08:00
cleancache.c docs/vm: rename documentation files to .rst 2018-04-16 14:18:15 -06:00
cma.c Revert "mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE" 2018-05-24 10:07:50 -07:00
cma.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
cma_debug.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
compaction.c Revert "mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE" 2018-05-24 10:07:50 -07:00
debug.c mm/debug.c: provide useful debugging information for VM_BUG 2018-01-04 16:45:09 -08:00
debug_page_ref.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
dmapool.c lib/vsprintf.c: remove %Z support 2017-02-27 18:43:47 -08:00
early_ioremap.c mm/early_ioremap: Fix boot hang with earlyprintk=efi,keep 2017-12-11 14:54:44 +01:00
fadvise.c mm: add ksys_fadvise64_64() helper; remove in-kernel call to sys_fadvise64_64() 2018-04-02 20:16:10 +02:00
failslab.c mm: make should_failslab always available for fault injection 2018-04-05 21:36:26 -07:00
filemap.c mm: use new return type vm_fault_t 2018-06-07 17:34:36 -07:00
frame_vector.c mm/frame_vector.c: release a semaphore in 'get_vaddr_frames()' 2017-12-14 16:00:48 -08:00
frontswap.c docs/vm: rename documentation files to .rst 2018-04-16 14:18:15 -06:00
gup.c libnvdimm for 4.18 2018-06-08 17:21:52 -07:00
gup_benchmark.c treewide: kvzalloc() -> kvcalloc() 2018-06-12 16:19:22 -07:00
highmem.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
hmm.c mm: introduce MEMORY_DEVICE_FS_DAX and CONFIG_DEV_PAGEMAP_OPS 2018-05-22 06:59:39 -07:00
huge_memory.c treewide: kmalloc() -> kmalloc_array() 2018-06-12 16:19:22 -07:00
hugetlb.c treewide: kmalloc() -> kmalloc_array() 2018-06-12 16:19:22 -07:00
hugetlb_cgroup.c mm: rename page_counter's count/limit into usage/max 2018-06-07 17:34:35 -07:00
hwpoison-inject.c mm/memory_failure: Remove unused trapno from memory_failure 2018-01-23 12:17:42 -06:00
init-mm.c mm: introduce arg_lock to protect arg_start|end and env_start|end in mm_struct 2018-06-07 17:34:34 -07:00
internal.h Changes for 4.18: 2018-06-05 13:24:20 -07:00
interval_tree.c mm/interval_tree.c: use vma_pages() helper 2018-01-31 17:18:37 -08:00
khugepaged.c page cache: use xa_lock 2018-04-11 10:28:39 -07:00
kmemleak-test.c mm: convert printk(KERN_<LEVEL> to pr_<level> 2016-03-17 15:09:34 -07:00
kmemleak.c mm: kernel-doc: add missing parameter descriptions 2018-04-05 21:36:27 -07:00
ksm.c mm/ksm.c: ignore STABLE_FLAG of rmap_item->address in rmap_walk_ksm() 2018-06-15 07:55:23 +09:00
list_lru.c mm: make counting of list_lru_one::nr_items lockless 2018-04-05 21:36:27 -07:00
maccess.c mm: docs: fix parameter names mismatch 2018-02-06 18:32:48 -08:00
madvise.c mm/memory_failure: Remove unused trapno from memory_failure 2018-01-23 12:17:42 -06:00
memblock.c mm/memblock: add missing include <linux/bootmem.h> 2018-06-15 07:55:24 +09:00
memcontrol.c mm: fix null pointer dereference in mem_cgroup_protected 2018-06-15 07:55:23 +09:00
memfd.c mm: restructure memfd code 2018-06-07 17:34:35 -07:00
memory-failure.c mm, migrate: remove reason argument from new_page_t 2018-04-11 10:28:32 -07:00
memory.c Merge branch 'akpm' (patches from Andrew) 2018-06-07 18:39:37 -07:00
memory_hotplug.c mm: move is_pageblock_removable_nolock() to mm/memory_hotplug.c 2018-06-07 17:34:36 -07:00
mempolicy.c mm: unclutter THP migration 2018-04-11 10:28:32 -07:00
mempool.c mempool: Add mempool_init()/mempool_exit() 2018-05-14 13:14:23 -06:00
memtest.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
migrate.c mm: migrate: fix double call of radix_tree_replace_slot() 2018-05-11 17:28:45 -07:00
mincore.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
mlock.c mm, mlock, vmscan: no more skipping pagevecs 2018-02-21 15:35:42 -08:00
mm_init.c mm: convert printk(KERN_<LEVEL> to pr_<level> 2016-03-17 15:09:34 -07:00
mmap.c mm: change return type to vm_fault_t 2018-06-07 17:34:36 -07:00
mmu_context.c sched/headers: Prepare to move the task_lock()/unlock() APIs to <linux/sched/task.h> 2017-03-02 08:42:38 +01:00
mmu_notifier.c mm, mmu_notifier: annotate mmu notifiers with blockable invalidate callbacks 2018-01-31 17:18:38 -08:00
mmzone.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
mprotect.c sched/numa: avoid trapping faults and attempting migration of file-backed dirty pages 2018-04-11 10:28:31 -07:00
mremap.c mremap: remove LATENCY_LIMIT from mremap to reduce the number of TLB shootdowns 2018-06-15 07:55:24 +09:00
msync.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
nobootmem.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
nommu.c mm: use new return type vm_fault_t 2018-06-07 17:34:36 -07:00
oom_kill.c mm: rename page_counter's count/limit into usage/max 2018-06-07 17:34:35 -07:00
page-writeback.c writeback: safer lock nesting 2018-04-20 17:18:35 -07:00
page_alloc.c mm, page_alloc: do not break __GFP_THISNODE by zonelist reset 2018-06-07 17:34:38 -07:00
page_counter.c memcg: introduce memory.min 2018-06-07 17:34:36 -07:00
page_ext.c mm/page_ext.c: make page_ext_init a noop when CONFIG_PAGE_EXTENSION but nothing uses it 2018-01-31 17:18:39 -08:00
page_idle.c mm: thp: fix potential clearing to referenced flag in page_idle_clear_pte_refs_one() 2018-04-05 21:36:25 -07:00
page_io.c block: convert to bio_first_bvec_all & bio_first_page_all 2018-01-06 09:18:00 -07:00
page_isolation.c mm, migrate: remove reason argument from new_page_t 2018-04-11 10:28:32 -07:00
page_owner.c mm/page_owner.c: make early_page_owner_param() __init 2018-04-05 21:36:26 -07:00
page_poison.c mm/page_poison.c: make early_page_poison_param() __init 2018-04-05 21:36:26 -07:00
page_vma_mapped.c mm, page_vma_mapped: Introduce pfn_in_hpage() 2018-01-22 12:15:57 -08:00
pagewalk.c mm: kernel-doc: add missing parameter descriptions 2018-04-05 21:36:27 -07:00
percpu-internal.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
percpu-km.c percpu: allow select gfp to be passed to underlying allocators 2018-02-18 05:33:01 -08:00
percpu-stats.c treewide: Use array_size() in vmalloc() 2018-06-12 16:19:22 -07:00
percpu-vm.c percpu: allow select gfp to be passed to underlying allocators 2018-02-18 05:33:01 -08:00
percpu.c arch: remove obsolete architecture ports 2018-04-02 20:20:12 -07:00
pgtable-generic.c mm: do not lose dirty and accessed bits in pmdp_invalidate() 2018-01-31 17:18:38 -08:00
process_vm_access.c mm: docs: add blank lines to silence sphinx "Unexpected indentation" errors 2018-02-06 18:32:48 -08:00
quicklist.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
readahead.c mm: split ->readpages calls to avoid non-contiguous pages lists 2018-06-01 18:37:32 -07:00
rmap.c Linux 4.17-rc2 2018-04-27 17:13:20 -06:00
rodata_test.c mm: fix RODATA_TEST failure "rodata_test: test data was not read only" 2017-10-03 17:54:24 -07:00
shmem.c mm/shmem.c: zero out unused vma fields in shmem_pseudo_vma_init() 2018-06-07 17:34:38 -07:00
slab.c treewide: kzalloc() -> kcalloc() 2018-06-12 16:19:22 -07:00
slab.h slab, slub: skip unnecessary kasan_cache_shutdown() 2018-04-05 21:36:24 -07:00
slab_common.c mm: fix race between kmem_cache destroy, create and deactivate 2018-06-15 07:55:23 +09:00
slob.c slab: __GFP_ZERO is incompatible with a constructor 2018-06-07 17:34:34 -07:00
slub.c treewide: kzalloc() -> kcalloc() 2018-06-12 16:19:22 -07:00
sparse-vmemmap.c mm: merge vmem_altmap_alloc into altmap_alloc_block_buf 2018-01-08 11:46:23 -08:00
sparse.c mm/sparse.c: pass the __highest_present_section_nr + 1 to alloc_func() 2018-06-07 17:34:35 -07:00
swap.c mm: introduce MEMORY_DEVICE_FS_DAX and CONFIG_DEV_PAGEMAP_OPS 2018-05-22 06:59:39 -07:00
swap_cgroup.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
swap_slots.c treewide: kvzalloc() -> kvcalloc() 2018-06-12 16:19:22 -07:00
swap_state.c treewide: kvzalloc() -> kvcalloc() 2018-06-12 16:19:22 -07:00
swapfile.c mm/swapfile.c: fix swap_count comment about nonexistent SWAP_HAS_CONT 2018-06-15 07:55:23 +09:00
truncate.c page cache: use xa_lock 2018-04-11 10:28:39 -07:00
usercopy.c usercopy: WARN() on slab cache usercopy region violations 2018-01-15 12:07:48 -08:00
userfaultfd.c userfaultfd: prevent non-cooperative events vs mcopy_atomic races 2018-06-07 17:34:38 -07:00
util.c mm: kvmalloc does not fallback to vmalloc for incompatible gfp flags 2018-06-07 17:34:38 -07:00
vmacache.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
vmalloc.c mm: vmalloc: pass proper vm_start into debugobjects 2018-06-07 17:34:35 -07:00
vmpressure.c mm/vmpressure.c: convert to use match_string() helper 2018-06-07 17:34:36 -07:00
vmscan.c memcg: introduce memory.min 2018-06-07 17:34:36 -07:00
vmstat.c proc: introduce proc_create_seq{,_data} 2018-05-16 07:23:35 +02:00
workingset.c page cache: use xa_lock 2018-04-11 10:28:39 -07:00
z3fold.c z3fold: fix reclaim lock-ups 2018-05-11 17:28:45 -07:00
zbud.c mm: docs: fix parameter names mismatch 2018-02-06 18:32:48 -08:00
zpool.c mm/zpool.c: zpool_evictable: fix mismatch in parameter name and kernel-doc 2018-02-21 15:35:43 -08:00
zsmalloc.c mm: kernel-doc: add missing parameter descriptions 2018-04-05 21:36:27 -07:00
zswap.c mm, swap, frontswap: fix THP swap if frontswap enabled 2018-02-21 15:35:43 -08:00