linux

Andrea Arcangeli 26c191788f mm: pmd_read_atomic: fix 32bit PAE pmd walk vs pmd_populate SMP race condition

When holding the mmap_sem for reading, pmd_offset_map_lock should only run on a pmd_t that has been read atomically from the pmdp pointer, otherwise we may read only half of it leading to this crash.

    PID: 11679  TASK: f06e8000  CPU: 3  COMMAND: "do_race_2_panic"
     #0 [f06a9dd8] crash_kexec at c049b5ec
     #1 [f06a9e2c] oops_end at c083d1c2
     #2 [f06a9e40] no_context at c0433ded
     #3 [f06a9e64] bad_area_nosemaphore at c043401a
     #4 [f06a9e6c] __do_page_fault at c0434493
     #5 [f06a9eec] do_page_fault at c083eb45
     #6 [f06a9f04] error_code (via page_fault) at c083c5d5
        EAX: 01fb470c EBX: fff35000 ECX: 00000003 EDX: 00000100 EBP: 00000000
        DS: 007b ESI: 9e201000 ES: 007b EDI: 01fb4700 GS: 00e0
        CS: 0060 EIP: c083bc14 ERR: ffffffff EFLAGS: 00010246
     #7 [f06a9f38] _spin_lock at c083bc14
     #8 [f06a9f44] sys_mincore at c0507b7d
     #9 [f06a9fb0] system_call at c083becd
                             start len
        EAX: ffffffda EBX: 9e200000 ECX: 00001000 EDX: 6228537f
        DS: 007b ESI: 00000000 ES: 007b EDI: 003d0f00
        SS: 007b ESP: 62285354 EBP: 62285388 GS: 0033
        CS: 0073 EIP: 00291416 ERR: 000000da EFLAGS: 00000286

This should be a longstanding bug affecting x86 32bit PAE without THP. Only archs with 64bit large pmd_t and 32bit unsigned long should be affected.

With THP enabled the barrier() in pmd_none_or_trans_huge_or_clear_bad() would partly hide the bug when the pmd transition from none to stable, by forcing a re-read of the pmd in pmd_offset_map_lock, but when THP is enabled a new set of problem arises by the fact could then transition freely in any of the none, pmd_trans_huge or pmd_trans_stable states. So making the barrier in pmd_none_or_trans_huge_or_clear_bad() unconditional isn't good idea and it would be a flakey solution.

This should be fully fixed by introducing a pmd_read_atomic that reads the pmd in order with THP disabled, or by reading the pmd atomically with cmpxchg8b with THP enabled.

Luckily this new race condition only triggers in the places that must already be covered by pmd_none_or_trans_huge_or_clear_bad() so the fix is localized there but this bug is not related to THP.

NOTE: this can trigger on x86 32bit systems with PAE enabled with more than 4G of ram, otherwise the high part of the pmd will never risk to be truncated because it would be zero at all times, in turn so hiding the SMP race.

This bug was discovered and fully debugged by Ulrich, quote:

----
[..]
pmd_none_or_trans_huge_or_clear_bad() loads the content of edx and eax.

    496 static inline int pmd_none_or_trans_huge_or_clear_bad(pmd_t *pmd)
    497 {
    498         /* depend on compiler for an atomic pmd read */
    499         pmd_t pmdval = *pmd;

                                    // edi = pmd pointer
    0xc0507a74 <sys_mincore+548>:   mov    0x8(%esp),%edi
    ...
                                    // edx = PTE page table high address
    0xc0507a84 <sys_mincore+564>:   mov    0x4(%edi),%edx
    ...
                                    // eax = PTE page table low address
    0xc0507a8e <sys_mincore+574>:   mov    (%edi),%eax

[..]

Please note that the PMD is not read atomically. These are two "mov" instructions where the high order bits of the PMD entry are fetched first. Hence, the above machine code is prone to the following race.

- The PMD entry {high|low} is 0x0000000000000000.
  The "mov" at 0xc0507a84 loads 0x00000000 into edx.

- A page fault (on another CPU) sneaks in between the two "mov" instructions and instantiates the PMD.

- The PMD entry {high|low} is now 0x00000003fda38067.
  The "mov" at 0xc0507a8e loads 0xfda38067 into eax.
----

Reported-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Larry Woodman <lwoodman@redhat.com>
Cc: Petr Matousek <pmatouse@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

2012-05-29 16:22:24 -07:00
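The race described in the quote can be replayed deterministically in user space. The following is only a minimal sketch under stated assumptions, not the kernel's pmd_read_atomic: `struct entry64`, `populate`, `read_high_then_low`, `read_low_then_high` and the `between` hook are hypothetical names invented for this illustration, and the hook stands in for "a page fault on another CPU sneaks in between the two mov instructions". Kernel code would additionally need a read barrier between the two loads, which a single-threaded simulation does not require.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit PAE pmd entry held as two 32-bit words. */
struct entry64 {
    uint32_t low;
    uint32_t high;
};

/* Hook that simulates what another CPU does between our two 32-bit loads. */
typedef void (*interleave_fn)(struct entry64 *e);

/* Simulated populate on another CPU: none -> 0x00000003fda38067. */
static void populate(struct entry64 *e)
{
    e->high = 0x00000003u;
    e->low  = 0xfda38067u;
}

/* Racy order from the disassembly: high half first, then low half. */
static uint64_t read_high_then_low(struct entry64 *e, interleave_fn between)
{
    uint32_t high = e->high;
    if (between)
        between(e);            /* the other CPU runs here */
    uint32_t low = e->low;
    return ((uint64_t)high << 32) | low;
}

/*
 * Ordered read: low half first, and the high half is only read when the
 * low half is non-zero. Assumes the entry only moves from none to populated.
 */
static uint64_t read_low_then_high(struct entry64 *e, interleave_fn between)
{
    uint64_t val = e->low;
    if (between)
        between(e);            /* the other CPU runs here */
    if (val != 0)
        val |= (uint64_t)e->high << 32;
    return val;
}
```

Calling `read_high_then_low` on a zeroed entry with `populate` as the hook yields the truncated value with a zero high half, matching the quoted scenario. `read_low_then_high` under the same interleaving reports the entry as still empty, and returns the full 64-bit value once the entry is populated before the read starts.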
bitops	Add #includes needed to permit the removal of asm/system.h	2012-03-28 18:30:03 +01:00
4level-fixup.h	mm: Pass virtual address to [__]p{te,ud,md}_free_tlb()	2009-07-27 12:10:38 -07:00
Kbuild	include: replace unifdef-y with header-y	2010-08-14 22:26:51 +02:00
Kbuild.asm	include: replace unifdef-y with header-y	2010-08-14 22:26:51 +02:00
atomic-long.h	asm-generic: merge branch 'master' of torvalds/linux-2.6	2009-06-12 11:32:58 +02:00
atomic.h	Remove all #inclusions of asm/system.h	2012-03-28 18:30:03 +01:00
atomic64.h	lib: Provide generic atomic64_t implementation	2009-06-15 13:27:38 +10:00
audit_change_attr.h	audit: support the "standard" <asm-generic/unistd.h>	2011-05-04 14:41:28 -04:00
audit_dir_write.h	audit: support the "standard" <asm-generic/unistd.h>	2011-05-04 14:41:28 -04:00
audit_read.h	audit: support the "standard" <asm-generic/unistd.h>	2011-05-04 14:41:28 -04:00
audit_signal.h	…
audit_write.h	audit: support the "standard" <asm-generic/unistd.h>	2011-05-04 14:41:28 -04:00
auxvec.h	…
barrier.h	Create asm-generic/barrier.h	2012-03-28 18:30:03 +01:00
bitops.h	bitops: remove minix bitops from asm/bitops.h	2011-03-23 19:46:22 -07:00
bitsperlong.h	…
bug.h	consolidate WARN_...ONCE() static variables	2012-03-23 16:58:31 -07:00
bugs.h	…
cache.h	asm-generic: add generic NOMMU versions of some headers	2009-06-11 21:02:50 +02:00
cacheflush.h	asm-generic/cacheflush.h: flush icache when copying to user pages	2011-05-25 08:39:37 -07:00
checksum.h	Add extra arch overrides to asm-generic/checksum.h	2011-11-01 07:34:21 -07:00
cmpxchg-local.h	Fix IRQ flag handling naming	2010-10-07 14:08:55 +01:00
cmpxchg.h	asm-generic: add linux/types.h to cmpxchg.h	2012-04-02 14:41:27 -07:00
cputime.h	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2012-01-06 08:44:54 -08:00
current.h	…
delay.h	asm-generic: delay.h fix udelay and ndelay for 8 bit args	2011-07-22 18:45:33 +02:00
device.h	Driver Core: Add platform device arch data V3	2009-07-22 00:28:38 +02:00
div64.h	…
dma-coherent.h	common: add dma_mmap_from_coherent() function	2012-05-21 15:06:09 +02:00
dma-contiguous.h	drivers: add Contiguous Memory Allocator	2012-05-21 15:09:37 +02:00
dma-mapping-broken.h	dma-mapping: remove dma_is_consistent API	2010-08-11 08:59:21 -07:00
dma-mapping-common.h	BUG: headers with BUG/BUG_ON etc. need linux/bug.h	2012-03-04 17:54:34 -05:00
dma.h	asm-generic: add legacy I/O header files	2009-06-11 21:02:42 +02:00
emergency-restart.h	…
errno-base.h	…
errno.h	mm: make __get_user_pages return -EHWPOISON for HWPOISON page optionally	2011-03-17 13:08:27 -03:00
exec.h	Split arch_align_stack() out from asm-generic/system.h	2012-03-28 18:30:03 +01:00
fb.h	…
fcntl.h	locks: move F_INPROGRESS from fl_type to fl_flags field	2011-08-19 13:25:34 -04:00
ftrace.h	asm-generic headers: add ftrace.h	2011-03-17 09:19:04 +08:00
futex.h	futex: Sanitize futex ops argument types	2011-03-11 12:23:31 +01:00
getorder.h	bitops: Add missing parentheses to new get_order macro	2012-02-24 10:39:27 -08:00
gpio.h	gpiolib: Remove 'const' from data argument of gpiochip_find()	2012-05-18 23:01:05 -06:00
hardirq.h	Fix IRQ flag handling naming	2010-10-07 14:08:55 +01:00
hw_irq.h	asm-generic: add legacy I/O header files	2009-06-11 21:02:42 +02:00
ide_iops.h	…
int-l64.h	…
int-ll64.h	…
io-64-nonatomic-hi-lo.h	asm-generic: architecture independent readq/writeq for 32bit environment	2012-02-21 16:47:28 -08:00
io-64-nonatomic-lo-hi.h	asm-generic: architecture independent readq/writeq for 32bit environment	2012-02-21 16:47:28 -08:00
io.h	lib: use generic pci_iomap on all architectures	2012-01-10 18:04:27 -08:00
ioctl.h	…
ioctls.h	tty: add TIOCVHANGUP to allow clean tty shutdown of all ttys	2011-02-17 14:16:30 -08:00
iomap.h	[PARISC] fix compile break caused by iomap: make IOPORT/PCI mapping functions conditional	2012-02-27 09:43:30 -06:00
ipcbuf.h	…
irq.h	…
irq_regs.h	core: Replace __get_cpu_var with __this_cpu_read if not used for an address.	2010-12-17 15:07:19 +01:00
irqflags.h	Fix IRQ flag handling naming	2010-10-07 14:08:55 +01:00
kdebug.h	asm-generic: kdebug.h: Checkpatch cleanup	2010-10-09 21:51:44 +02:00
kmap_types.h	include/asm-generic/kmap_types.h: add helpful reminder	2010-05-25 08:07:03 -07:00
kvm_para.h	KVM: make asm-generic/kvm_para.h have an ifdef __KERNEL__ block	2012-05-21 17:47:52 +03:00
libata-portmap.h	…
linkage.h	…
local.h	atomic: use <linux/atomic.h>	2011-07-26 16:49:47 -07:00
local64.h	atomic: use <linux/atomic.h>	2011-07-26 16:49:47 -07:00
memory_model.h	mm: fix __page_to_pfn for a const struct page argument	2011-08-17 13:00:20 -07:00
mm_hooks.h	…
mman-common.h	coredump: add VM_NODUMP, MADV_NODUMP, MADV_CLEAR_NODUMP	2012-03-23 16:58:42 -07:00
mman.h	mm: add MAP_HUGETLB for mmaping pseudo-anonymous huge page regions	2009-09-22 07:17:41 -07:00
mmu.h	asm-generic: add generic NOMMU versions of some headers	2009-06-11 21:02:50 +02:00
mmu_context.h	asm-generic: add generic NOMMU versions of some headers	2009-06-11 21:02:50 +02:00
module.h	…
msgbuf.h	…
mutex-dec.h	…
mutex-null.h	…
mutex-xchg.h	…
mutex.h	…
page.h	The following changes since commit `3ee72ca992`	2012-01-10 17:39:40 -08:00
param.h	UAPI: Rearrange definition of HZ in asm-generic/param.h	2011-12-13 09:26:45 +00:00
parport.h	asm-generic: add legacy I/O header files	2009-06-11 21:02:42 +02:00
pci-bridge.h	PCI: work around Stratus ftServer broken PCIe hierarchy	2012-04-30 15:21:02 -06:00
pci-dma-compat.h	dma-mapping: pci: move pci_set_dma_mask and pci_set_consistent_dma_mask to pci-dma-compat.h	2010-03-12 15:52:42 -08:00
pci.h	PCI: collapse pcibios_resource_to_bus	2012-02-23 20:19:04 -07:00
pci_iomap.h	[PARISC] fix compile break caused by iomap: make IOPORT/PCI mapping functions conditional	2012-02-27 09:43:30 -06:00
percpu.h	percpu: Optimize __get_cpu_var()	2010-09-10 10:56:51 +02:00
pgalloc.h	asm-generic: add generic NOMMU versions of some headers	2009-06-11 21:02:50 +02:00
pgtable-nopmd.h	mm: Pass virtual address to [__]p{te,ud,md}_free_tlb()	2009-07-27 12:10:38 -07:00
pgtable-nopud.h	mm: Pass virtual address to [__]p{te,ud,md}_free_tlb()	2009-07-27 12:10:38 -07:00
pgtable.h	mm: pmd_read_atomic: fix 32bit PAE pmd walk vs pmd_populate SMP race condition	2012-05-29 16:22:24 -07:00
poll.h	epoll: introduce POLLFREE to flush ->signalfd_wqh before kfree()	2012-02-24 11:42:50 -08:00
posix_types.h	posix_types: Introduce __kernel_[u]long_t	2012-02-20 12:48:47 -08:00
ptrace.h	asm-generic/ptrace.h: start a common low level ptrace helper	2011-05-26 17:12:36 -07:00
resource.h	ulimit: raise default hard ulimit on number of files to 4096	2011-05-25 08:39:43 -07:00
rtc.h	…
rwsem.h	Hexagon: Add locking types and functions	2011-11-01 07:34:20 -07:00
scatterlist.h	asm-generic: remove ARCH_HAS_SG_CHAIN in scatterlist.h	2010-05-27 09:12:54 -07:00
sections.h	x86: Separate out entry text section	2011-03-08 17:22:11 +01:00
segment.h	asm-generic: add generic NOMMU versions of some headers	2009-06-11 21:02:50 +02:00
sembuf.h	…
serial.h	asm-generic: add legacy I/O header files	2009-06-11 21:02:42 +02:00
setup.h	…
shmbuf.h	…
shmparam.h	…
siginfo.h	Linux 3.4-rc5	2012-05-04 12:46:40 +10:00
signal-defs.h	…
signal.h	…
sizes.h	asm-generic headers: add sizes.h	2011-03-17 09:19:04 +08:00
socket.h	net: Add framework to allow sending packets with customized CRC.	2012-02-24 01:37:35 -08:00
sockios.h	…
spinlock.h	…
stat.h	asm-generic/stat.h: support 64-bit file time_t for stat()	2010-11-01 15:31:29 -04:00
statfs.h	asm-generic: Use __BITS_PER_LONG in statfs.h	2012-04-30 12:55:15 -07:00
string.h	…
swab.h	…
switch_to.h	Split the switch_to() wrapper out of asm-generic/system.h	2012-03-28 18:30:03 +01:00
syscall.h	asm/syscall.h: add syscall_get_arch	2012-04-14 11:13:19 +10:00
syscalls.h	Fix the declaration of sys_execve() in asm-generic/syscalls.h	2010-08-18 12:12:38 -07:00
termbits.h	tty: Add EXTPROC support for LINEMODE	2010-08-10 13:47:39 -07:00
termios-base.h	…
termios.h	…
timex.h	asm-generic: add legacy I/O header files	2009-06-11 21:02:42 +02:00
tlb.h	thp: add tlb_remove_pmd_tlb_entry	2012-01-12 20:13:08 -08:00
tlbflush.h	BUG: headers with BUG/BUG_ON etc. need linux/bug.h	2012-03-04 17:54:34 -05:00
topology.h	topology: alternate fix for ia64 tiger_defconfig build breakage	2010-08-09 20:44:57 -07:00
types.h	consolidate umode_t declarations	2012-01-03 22:55:17 -05:00
uaccess-unaligned.h	…
uaccess.h	fix default __strnlen_user macro	2011-10-06 19:47:12 -04:00
ucontext.h	…
unaligned.h	…
unistd.h	compat: use sys_sendfile64() implementation for sendfile syscall	2012-03-27 13:36:57 -04:00
user.h	asm-generic/user.h: Fix spelling in comment	2011-03-01 15:49:39 +01:00
vga.h	asm-generic: add legacy I/O header files	2009-06-11 21:02:42 +02:00
vmlinux.lds.h	ftrace: Sort all function addresses, not just per page	2012-05-16 19:58:44 -04:00
word-at-a-time.h	word-at-a-time: make the interfaces truly generic	2012-05-26 11:33:40 -07:00
xor.h	sanitize <linux/prefetch.h> usage	2011-05-20 12:50:29 -07:00