Go to file
Mel Gorman 09a913a7a9 sched/numa: avoid trapping faults and attempting migration of file-backed dirty pages
change_pte_range is called from task work context to mark PTEs for
receiving NUMA faulting hints.  If the marked pages are dirty then
migration may fail.  Some filesystems cannot migrate dirty pages without
blocking so are skipped in MIGRATE_ASYNC mode which just wastes CPU.
Even when they can, it can be a waste of cycles when the pages are
shared forcing higher scan rates.  This patch avoids marking shared
dirty pages for hinting faults but also will skip a migration if the
page was dirtied after the scanner updated a clean page.

This is most noticeable running the NASA Parallel Benchmark when backed
by btrfs, the default root filesystem for some distributions, but also
noticeable when using XFS.

The following are results from a 4-socket machine running a 4.16-rc4
kernel with some scheduler patches that are pending for the next merge
window.

                        4.16.0-rc4             4.16.0-rc4
                 schedtip-20180309          nodirty-v1
  Time cg.D      459.07 (   0.00%)      444.21 (   3.24%)
  Time ep.D       76.96 (   0.00%)       77.69 (  -0.95%)
  Time is.D       25.55 (   0.00%)       27.85 (  -9.00%)
  Time lu.D      601.58 (   0.00%)      596.87 (   0.78%)
  Time mg.D      107.73 (   0.00%)      108.22 (  -0.45%)

is.D regresses slightly in terms of absolute time but note that that
particular load varies quite a bit from run to run.  The more relevant
observation is the total system CPU usage.

            4.16.0-rc4  4.16.0-rc4
          schedtip-20180309 nodirty-v1
  User        71471.91    70627.04
  System      11078.96     8256.13
  Elapsed       661.66      632.74

That is a substantial drop in system CPU usage and overall the workload
completes faster.  The NUMA balancing statistics are also interesting

  NUMA base PTE updates        111407972   139848884
  NUMA huge PMD updates           206506      264869
  NUMA page range updates      217139044   275461812
  NUMA hint faults               4300924     3719784
  NUMA hint local faults         3012539     3416618
  NUMA hint local percent             70          91
  NUMA pages migrated            1517487     1358420

While more PTEs are scanned due to changes in what faults are gathered,
it's clear that a far higher percentage of faults are local as the bulk
of the remote hits were dirty pages that, in this case with btrfs, had
no chance of migrating.

The following is a comparison when using XFS as that is a more realistic
filesystem choice for a data partition

                        4.16.0-rc4             4.16.0-rc4
                 schedtip-20180309          nodirty-v1r47
  Time cg.D      485.28 (   0.00%)      442.62 (   8.79%)
  Time ep.D       77.68 (   0.00%)       77.54 (   0.18%)
  Time is.D       26.44 (   0.00%)       24.79 (   6.24%)
  Time lu.D      597.46 (   0.00%)      597.11 (   0.06%)
  Time mg.D      142.65 (   0.00%)      105.83 (  25.81%)

That is a reasonable gain on two relatively long-lived workloads.  While
not presented, there is also a substantial drop in system CPu usage and
the NUMA balancing stats show similar improvements in locality as btrfs
did.

Link: http://lkml.kernel.org/r/20180326094334.zserdec62gwmmfqf@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Reviewed-by: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-11 10:28:31 -07:00
Documentation Documentation/vm/hmm.txt: typos and syntaxes fixes 2018-04-11 10:28:31 -07:00
LICENSES LICENSES: Add MPL-1.1 license 2018-01-06 10:59:44 -07:00
arch c6x changes 4.17 2018-04-10 11:50:14 -07:00
block for-4.17/block-20180402 2018-04-05 14:27:02 -07:00
certs certs/blacklist_nohashes.c: fix const confusion in certs blacklist 2018-02-21 15:35:43 -08:00
crypto MIPS changes for 4.17 2018-04-10 11:39:22 -07:00
drivers MIPS changes for 4.17 2018-04-10 11:39:22 -07:00
firmware kbuild: remove all dummy assignments to obj- 2017-11-18 11:46:06 +09:00
fs dcache: account external names as indirectly reclaimable memory 2018-04-11 10:28:29 -07:00
include mm/hmm: fix header file if/else/endif maze, again 2018-04-11 10:28:31 -07:00
init New features: 2018-04-10 11:27:30 -07:00
ipc Merge branch 'userns-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2018-04-03 19:15:32 -07:00
kernel New features: 2018-04-10 11:27:30 -07:00
lib New features: 2018-04-10 11:27:30 -07:00
mm sched/numa: avoid trapping faults and attempting migration of file-backed dirty pages 2018-04-11 10:28:31 -07:00
net Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-04-09 17:04:10 -07:00
samples VFIO updates for v4.17-rc1 2018-04-06 19:44:27 -07:00
scripts Leaking-addresses patches for 4.17-rc1 2018-04-07 11:56:33 -07:00
security New features: 2018-04-10 11:27:30 -07:00
sound sound fixes for 4.17-rc1 2018-04-10 10:16:04 -07:00
tools New features: 2018-04-10 11:27:30 -07:00
usr kbuild: rename built-in.o to built-in.a 2018-03-26 02:01:19 +09:00
virt KVM/ARM updates for v4.17 2018-03-28 16:09:09 +02:00
.cocciconfig scripts: add Linux .cocciconfig for coccinelle 2016-07-22 12:13:39 +02:00
.get_maintainer.ignore Add hch to .get_maintainer.ignore 2015-08-21 14:30:10 -07:00
.gitattributes .gitattributes: set git diff driver for C source code files 2016-10-07 18:46:30 -07:00
.gitignore kbuild: move include/config/ksym/* to include/ksym/* 2018-03-26 02:01:23 +09:00
.mailmap Merge candidates for 4.17 merge window 2018-04-06 17:35:43 -07:00
COPYING COPYING: use the new text with points to the license files 2018-03-23 12:41:45 -06:00
CREDITS MAINTAINERS/CREDITS: Drop METAG ARCHITECTURE 2018-03-05 16:34:24 +00:00
Kbuild Kbuild updates for v4.15 2017-11-17 17:45:29 -08:00
Kconfig License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
MAINTAINERS mm/hmm: documentation editorial update to HMM documentation 2018-04-11 10:28:30 -07:00
Makefile Kconfig updates for v4.17 2018-04-03 16:28:01 -07:00
README Docs: Added a pointer to the formatted docs to README 2018-03-21 09:02:53 -06:00

README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.
See Documentation/00-INDEX for a list of what is contained in each file.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.