linux/include
Rik van Riel 2084140594 mm: fix TLB flush race between migration, and change_protection_range
There are a few subtle races between change_protection_range (used by
mprotect and change_prot_numa) on one side, and NUMA page migration and
compaction on the other side.

The basic race is that there is a time window between when the PTE gets
made non-present (PROT_NONE or NUMA) and when the TLB is flushed.

During that time, a CPU may continue writing to the page.

This is fine most of the time. However, compaction or the NUMA migration
code may come in and migrate the page away.

When that happens, the CPU may continue writing, through the cached
translation, to what is no longer the current memory location of the
process.

This only affects x86, which has a somewhat optimistic pte_accessible.
All other architectures appear to be safe, and will either always flush,
or flush whenever there is a valid mapping, even with no permissions
(SPARC).

The basic race looks like this:

CPU A			CPU B			CPU C

						load TLB entry
make entry PTE/PMD_NUMA
			fault on entry
						read/write old page
			start migrating page
			change PTE/PMD to new page
						read/write old page [*]
flush TLB
						reload TLB from new entry
						read/write new page
						lose data

[*] the old page may belong to a new user at this point!
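
The sequence above can be replayed deterministically in user space. The
following is only a sketch under loud assumptions: each page is modelled
as a single int, and struct race_model, cpu_c_translate and replay_race
are hypothetical names that do not exist in the kernel. It shows a write
landing in the old page through a stale cached translation and then
being lost once the TLB is flushed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: a "page" is one int, a "TLB entry" is a cached pointer. */
struct race_model {
	int old_page;   /* page contents before migration */
	int new_page;   /* page contents after migration */
	int *pte;       /* what the page table currently maps */
	int *tlb_c;     /* CPU C's cached translation, NULL when flushed */
};

/* CPU C translates through its TLB and reloads from the PTE only on a miss. */
static int *cpu_c_translate(struct race_model *m)
{
	assert(m != NULL);
	if (m->tlb_c == NULL)
		m->tlb_c = m->pte;
	return m->tlb_c;
}

/* Replays the diagram step by step; returns what CPU C reads at the end. */
static int replay_race(void)
{
	struct race_model m = { .old_page = 1, .new_page = 0 };

	m.pte = &m.old_page;
	(void)cpu_c_translate(&m);      /* CPU C: load TLB entry */
	/* CPU A: make entry PTE/PMD_NUMA; the TLB flush is still pending */
	m.new_page = m.old_page;        /* CPU B: start migrating page (copy) */
	m.pte = &m.new_page;            /* CPU B: change PTE/PMD to new page */
	*cpu_c_translate(&m) = 2;       /* CPU C: write old page via stale TLB [*] */
	m.tlb_c = NULL;                 /* CPU A: flush TLB */
	return *cpu_c_translate(&m);    /* CPU C: reload TLB, read new page */
}
```

In this model replay_race() returns the value copied during migration,
not the value CPU C wrote afterwards, which is the "lose data" step.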

The obvious fix is to flush remote TLB entries, by making sure that
pte_accessible is aware of the fact that PROT_NONE and PROT_NUMA memory
may still be accessible if there is a TLB flush pending for the mm.
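
The idea can be sketched as a user-space model. This is not the kernel's
code: the flag values, struct model_mm and the model_* function names
are assumptions made for illustration. The sketch contrasts an
"optimistic" check that looks only at the present bit with one that also
treats PROT_NONE and NUMA entries as accessible while a TLB flush is
pending for the mm.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits for the model; not the real x86 PTE layout. */
#define MODEL_PAGE_PRESENT  (1u << 0)
#define MODEL_PAGE_PROTNONE (1u << 1)
#define MODEL_PAGE_NUMA     (1u << 2)

/* Hypothetical stand-in for per-mm state: is a TLB flush still pending? */
struct model_mm {
	bool tlb_flush_pending;
};

/* Optimistic check: only a present PTE counts as accessible. */
static bool model_pte_accessible_old(unsigned int pte_flags)
{
	return (pte_flags & MODEL_PAGE_PRESENT) != 0;
}

/*
 * Flush-aware check: a PROT_NONE or NUMA entry may still be reachable
 * through a stale TLB entry until the pending flush has completed.
 */
static bool model_pte_accessible(const struct model_mm *mm,
				 unsigned int pte_flags)
{
	assert(mm != NULL);

	if (pte_flags & MODEL_PAGE_PRESENT)
		return true;

	if ((pte_flags & (MODEL_PAGE_PROTNONE | MODEL_PAGE_NUMA)) &&
	    mm->tlb_flush_pending)
		return true;

	return false;
}
```

With the flush-aware check, a caller that flushes remote TLBs only for
accessible entries would also flush for a NUMA or PROT_NONE entry while
change_protection_range still has a flush outstanding.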

This should fix both NUMA migration and compaction.

[mgorman@suse.de: fix build]
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-12-18 19:04:51 -08:00
..
acpi Merge branch 'acpica' 2013-11-27 01:03:27 +01:00
asm-generic mm: fix TLB flush race between migration, and change_protection_range 2013-12-18 19:04:51 -08:00
clocksource drivers: clocksource: add support for ARM architected timer event stream 2013-09-26 09:48:00 +01:00
crypto crypto: scatterwalk - Use sg_chain_ptr on chain entries 2013-12-09 19:58:52 +08:00
drm Merge branch 'ttm-fixes-3.13' of git://people.freedesktop.org/~thomash/linux into drm-fixes 2013-11-21 18:46:56 +10:00
dt-bindings For the 3.13 merge window we have a couple of new drivers for the AMS 2013-11-15 16:37:40 -08:00
keys KEYS: Separate the kernel signature checking keyring from module signing 2013-09-25 17:17:01 +01:00
kvm ARM: KVM: vgic: Bump VGIC_NR_IRQS to 256 2013-08-30 16:12:39 +03:00
linux mm: fix TLB flush race between migration, and change_protection_range 2013-12-18 19:04:51 -08:00
math-emu
media [media] videobuf2: Add support for file access mode flags for DMABUF exporting 2013-12-09 14:50:50 -02:00
memory
misc
net sctp: properly latch and use autoclose value from sock to association 2013-12-10 22:41:26 -05:00
pcmcia
ras
rdma Merge branches 'cma', 'cxgb4', 'flowsteer', 'ipoib', 'misc', 'mlx4', 'mlx5', 'nes', 'ocrdma', 'qib' and 'srp' into for-next 2013-11-17 08:22:19 -08:00
rxrpc
scsi [SCSI] Disable WRITE SAME for RAID and virtual host adapter drivers 2013-11-29 08:48:39 +04:00
sound ALSA: memalloc.h - fix wrong truncation of dma_addr_t 2013-12-10 15:30:46 +01:00
target target_core_alua: Store supported ALUA states 2013-11-20 11:26:37 -08:00
trace Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2013-12-02 10:13:09 -08:00
uapi sound fixes for 3.13-rc4 2013-12-12 13:14:25 -08:00
video fbdev changes for 3.13 2013-11-14 14:44:20 +09:00
xen Features: 2013-11-15 13:34:37 +09:00
Kbuild