linux/include
David Rientjes 2f0799a0ff mm, thp: restore node-local hugepage allocations
This is a full revert of ac5b2c1891 ("mm: thp: relax __GFP_THISNODE for
MADV_HUGEPAGE mappings") and a partial revert of 89c83fb539 ("mm, thp:
consolidate THP gfp handling into alloc_hugepage_direct_gfpmask").

By not setting __GFP_THISNODE, applications can allocate remote hugepages
when the local node is fragmented or low on memory when either the thp
defrag setting is "always" or the vma has been madvised with
MADV_HUGEPAGE.

Remote access to hugepages often has much higher latency than local pages
of the native page size.  On Haswell, ac5b2c1891 was shown to have a
13.9% access regression after this commit for binaries that remap their
text segment to be backed by transparent hugepages.

The intent of ac5b2c1891 is to address an issue where a local node is
low on memory or fragmented such that a hugepage cannot be allocated.  In
every scenario where this was described as a fix, there is abundant and
unfragmented remote memory available to allocate from, even with a greater
access latency.

If remote memory is also low or fragmented, not setting __GFP_THISNODE was
also measured on Haswell to have a 40% regression in allocation latency.

Restore __GFP_THISNODE for thp allocations.

Fixes: ac5b2c1891 ("mm: thp: relax __GFP_THISNODE for MADV_HUGEPAGE mappings")
Fixes: 89c83fb539 ("mm, thp: consolidate THP gfp handling into alloc_hugepage_direct_gfpmask")
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-12-05 15:45:54 -08:00
..
acpi pci-v4.20-changes 2018-10-25 06:50:48 -07:00
asm-generic s390 updates for 4.20-rc2 2018-11-09 06:30:44 -06:00
clocksource
crypto KEYS: asym_tpm: extract key size & public key [ver #2] 2018-10-26 09:30:46 +01:00
drm drm, i915, amdgpu, bridge + core quirk 2018-11-02 10:58:20 -07:00
dt-bindings This time it looks like a quieter release cycle in the clk tree. I guess that's 2018-10-31 11:08:30 -07:00
keys KEYS: Move trusted.h to include/keys [ver #2] 2018-10-26 09:30:47 +01:00
kvm
linux mm, thp: restore node-local hugepage allocations 2018-12-05 15:45:54 -08:00
math-emu
media media: Use wait_queue_head_t for media_request 2018-11-20 12:53:23 -05:00
memory
misc
net Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf 2018-11-28 11:02:45 -08:00
pcmcia
ras
rdma First merge window pull request 2018-10-26 07:38:19 -07:00
scsi
soc ARM: SoC driver updates for 4.17 2018-10-29 15:16:01 -07:00
sound ASoC: Fixes for v4.20 2018-11-27 16:06:42 +01:00
target scsi: target/core: Remove the SCF_COMPARE_AND_WRITE_POST flag 2018-10-16 01:13:35 -04:00
trace While rewriting the function graph tracer, I discovered a design flaw that 2018-11-30 09:32:34 -08:00
uapi x86/speculation: Add prctl() control for indirect branch speculation 2018-11-28 11:57:13 +01:00
video udlfb: handle unplug properly 2018-10-08 12:57:34 +02:00
xen Revert "xen/balloon: Mark unallocated host memory as UNUSABLE" 2018-11-29 17:53:31 +01:00