linux/include
Johannes Weiner b70a2a21dc mm: memcontrol: fix transparent huge page allocations under pressure
In a memcg with even just moderate cache pressure, success rates for
transparent huge page allocations drop to zero, wasting a lot of effort
that the allocator puts into assembling these pages.

The reason for this is that the memcg reclaim code was never designed for
higher-order charges.  It reclaims in small batches until there is room
for at least one page.  Huge page charges only succeed when these batches
add up over a series of huge faults, which is unlikely under any
significant load involving order-0 allocations in the group.

Remove that loop on the memcg side in favor of passing the actual reclaim
goal to direct reclaim, which is already set up and optimized to meet
higher-order goals efficiently.
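
The difference between the two strategies can be sketched with a toy model. This is not the kernel code: struct toy_memcg, the toy_* helpers, the 32-page batch and the 4096-page limit are all assumptions made for illustration, and reclaim is assumed to always free what it is asked for. The model only shows why reclaiming until one page fits rarely leaves room for a 512-page charge, whereas passing the charge size as the reclaim goal does.

```c
#include <assert.h>
#include <stdbool.h>

#define BATCH_PAGES 32UL   /* assumed small reclaim batch, in base pages */
#define HPAGE_PAGES 512UL  /* one 2 MiB huge page in 4 KiB base pages */

/* Hypothetical toy group: only a limit and a usage counter, in pages. */
struct toy_memcg {
	unsigned long limit;
	unsigned long usage;
};

static unsigned long toy_margin(const struct toy_memcg *cg)
{
	return cg->limit - cg->usage;
}

/* Assumption: reclaim always frees what is asked, up to current usage. */
static void toy_reclaim(struct toy_memcg *cg, unsigned long nr_to_reclaim)
{
	cg->usage -= nr_to_reclaim < cg->usage ? nr_to_reclaim : cg->usage;
}

/*
 * pass_goal == false: reclaim in small batches until one page fits.
 * pass_goal == true:  hand the size of the charge to reclaim as its goal.
 */
static bool toy_charge(struct toy_memcg *cg, unsigned long nr_pages,
		       bool pass_goal)
{
	if (pass_goal) {
		if (toy_margin(cg) < nr_pages)
			toy_reclaim(cg, nr_pages > BATCH_PAGES ?
					nr_pages : BATCH_PAGES);
	} else {
		/* Bounded retries so the toy loop always terminates. */
		for (int i = 0; i < 5 && toy_margin(cg) < 1; i++)
			toy_reclaim(cg, BATCH_PAGES);
	}

	if (toy_margin(cg) < nr_pages)
		return false;

	cg->usage += nr_pages;
	assert(cg->usage <= cg->limit);
	return true;
}

/* Count huge page charges that succeed in a group that starts out full. */
static unsigned int toy_simulate(bool pass_goal, unsigned int nr_faults)
{
	struct toy_memcg cg = { .limit = 4096, .usage = 4096 };
	unsigned int successes = 0;

	for (unsigned int i = 0; i < nr_faults; i++) {
		if (toy_charge(&cg, HPAGE_PAGES, pass_goal))
			successes++;

		/* Competing order-0 charges take the leftover room, up to a batch. */
		unsigned long room = toy_margin(&cg);
		cg.usage += room < BATCH_PAGES ? room : BATCH_PAGES;
	}
	return successes;
}
```

Calling toy_simulate(false, 10) models the batch-until-one-page behavior and toy_simulate(true, 10) models passing the reclaim goal through. In this model the competing order-0 charges consume each freed batch before the next huge fault arrives, so only the second variant ever finds room for the full huge page.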

This brings memcg's THP policy in line with the system policy: if the
allocator painstakingly assembles a hugepage, memcg will at least make an
honest effort to charge it.  As a result, transparent hugepage allocation
rates amid cache activity are drastically improved:

                                      vanilla                 patched
pgalloc                 4717530.80 (  +0.00%)   4451376.40 (  -5.64%)
pgfault                  491370.60 (  +0.00%)    225477.40 ( -54.11%)
pgmajfault                    2.00 (  +0.00%)         1.80 (  -6.67%)
thp_fault_alloc               0.00 (  +0.00%)       531.60 (+100.00%)
thp_fault_fallback          749.00 (  +0.00%)       217.40 ( -70.88%)

[ Note: this may in turn increase memory consumption from internal
  fragmentation, which is an inherent risk of transparent hugepages.
  Some setups may have to adjust the memcg limits accordingly to
  accommodate this - or, if the machine is already packed to capacity,
  disable the transparent huge page feature. ]

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Vladimir Davydov <vdavydov@parallels.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Dave Hansen <dave@sr71.net>
Cc: Greg Thelen <gthelen@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:25:59 -04:00
acpi Merge branches 'acpi-hotplug', 'acpi-scan', 'acpi-lpss', 'acpi-gpio' and 'acpi-video' 2014-09-25 22:59:30 +02:00
asm-generic common: dma-mapping: introduce common remapping functions 2014-10-09 22:25:52 -04:00
clocksource ARM: pxa: Add non device-tree timer link to clocksource 2014-07-23 12:02:39 +02:00
crypto crypto: drbg - backport "fix maximum value checks on 32 bit systems" 2014-09-05 15:52:28 +08:00
drm drm/radeon: add additional SI pci ids 2014-08-22 10:47:59 -04:00
dt-bindings ARM: SoC DT updates for 3.18 2014-10-08 17:22:23 -04:00
keys Merge remote-tracking branch 'integrity/next-with-keys' into keys-next 2014-07-22 21:54:43 +01:00
kvm arm/arm64: KVM: vgic: delay vgic allocation until init time 2014-09-18 18:48:58 -07:00
linux mm: memcontrol: fix transparent huge page allocations under pressure 2014-10-09 22:25:59 -04:00
math-emu
media [media] vb2: fix VBI/poll regression 2014-09-21 20:57:30 -03:00
memory
misc
net Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2014-10-08 21:40:54 -04:00
pcmcia
ras
rdma IB: ib_umem_release() should decrement mm->pinned_vm from ib_umem_get 2014-09-19 09:55:42 -07:00
rxrpc include/rxrpc/types.h: Remove unused header 2014-08-29 20:33:39 -07:00
scsi Merge remote-tracking branch 'scsi-queue/drivers-for-3.18' into for-linus 2014-10-07 13:48:12 -07:00
soc/tegra
sound ASoC: core: fix .info for SND_SOC_BYTES_TLV 2014-08-18 08:59:12 -05:00
target
trace Merge tag 'f2fs-for-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs 2014-10-08 12:53:15 -04:00
uapi prctl: PR_SET_MM -- introduce PR_SET_MM_MAP operation 2014-10-09 22:25:55 -04:00
video gpu: ipu-v3: Add ipu-cpmem unit 2014-08-18 14:17:41 +02:00
xen xen/arm: introduce XENFEAT_grant_map_identity 2014-09-11 18:11:52 +00:00
Kbuild