linux/arch/x86/kernel
Hugh Dickins 1be7107fbe mm: larger stack guard gap, between vmas
Stack guard page is a useful feature to reduce a risk of stack smashing
into a different mapping. We have been using a single page gap which
is sufficient to prevent having stack adjacent to a different mapping.
But this seems to be insufficient in the light of the stack usage in
userspace. E.g. glibc uses as large as 64kB alloca() in many commonly
used functions. Others use constructs like gid_t buffer[NGROUPS_MAX],
which is 256kB, or stack strings with MAX_ARG_STRLEN.

This becomes especially dangerous for suid binaries run with an
unlimited stack size limit, because those applications can be tricked
into consuming a large portion of the stack, and a single glibc call
could then jump over the guard page. These attacks are not theoretical,
unfortunately.

Make those attacks less probable by increasing the stack guard gap
to 1MB (on systems with 4k pages; but make it depend on the page size
because systems with larger base pages might cap stack allocations in
the PAGE_SIZE units) which should cover larger alloca() and VLA stack
allocations. It is obviously not a full fix because the problem is
somehow inherent, but it should reduce attack space a lot.
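The sizing rule above can be illustrated with a small userspace sketch. It assumes the default gap is a fixed count of 256 pages, which gives 1MB with 4k pages and scales with the base page size; the helper name guard_gap_bytes() is hypothetical and is not taken from the patch.

```c
#include <stdio.h>

/* Assumption: the default gap is 256 pages, whatever the page size is. */
#define MODEL_GAP_PAGES 256UL

/*
 * Hypothetical helper: size of the guard gap in bytes for a given
 * page shift (12 for 4k pages, 16 for 64k pages).
 */
static unsigned long guard_gap_bytes(unsigned int page_shift)
{
	return MODEL_GAP_PAGES << page_shift;
}

/* Print the gap for one page size; used only for illustration. */
static void print_gap(const char *label, unsigned int page_shift)
{
	printf("%s: %lu bytes\n", label, guard_gap_bytes(page_shift));
}
```

Calling print_gap("4k pages", 12) reports 1048576 bytes; with 64k pages the same 256-page gap grows to 16MB.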

One could argue that the gap size should be configurable from userspace,
but that can be done later when somebody finds that the new 1MB is wrong
for some special case applications.  For now, add a kernel command line
option (stack_guard_gap) to specify the stack gap size (in page units).
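As a rough illustration of how a page-unit value such as stack_guard_gap=512 could be turned into a byte count, here is a userspace sketch. It is not the kernel's parser: the function name parse_stack_guard_gap(), the fallback argument and the fixed 4k page shift are assumptions made for the example.

```c
#include <stdlib.h>

#define MODEL_PAGE_SHIFT 12	/* assumption: 4k base pages */

/*
 * Hypothetical helper: parse the value part of "stack_guard_gap=<pages>".
 * Returns the gap in bytes, or fallback_bytes when the value is empty
 * or is not a clean decimal number.
 */
static unsigned long parse_stack_guard_gap(const char *value,
					   unsigned long fallback_bytes)
{
	char *end;
	unsigned long pages;

	if (*value == '\0')
		return fallback_bytes;

	pages = strtoul(value, &end, 10);
	if (*end != '\0')	/* trailing garbage: keep the fallback */
		return fallback_bytes;

	return pages << MODEL_PAGE_SHIFT;	/* pages -> bytes */
}
```

Under these assumptions, "256" yields the 1MB default and "512" doubles it to 2MB.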

Implementation wise, first delete all the old code for stack guard page:
because although we could get away with accounting one extra page in a
stack vma, accounting a larger gap can break userspace - case in point,
a program run with "ulimit -S -v 20000" failed when the 1MB gap was
counted for RLIMIT_AS; similar problems could come with RLIMIT_MLOCK
and strict non-overcommit mode.

Instead of keeping gap inside the stack vma, maintain the stack guard
gap as a gap between vmas: using vm_start_gap() in place of vm_start
(or vm_end_gap() in place of vm_end if VM_GROWSUP) in just those few
places which need to respect the gap - mainly arch_get_unmapped_area(),
and the vma tree's subtree_gap support for that.
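The idea of treating the gap as space between vmas can be sketched in userspace as follows. This is a simplified model, not the kernel code: struct vma_model, the MODEL_* flag values and the explicit gap parameter are assumptions for illustration. Only the names vm_start_gap() and vm_end_gap() and the VM_GROWSUP distinction come from the text above.

```c
/* Simplified stand-in for a vma; not the kernel's struct vm_area_struct. */
struct vma_model {
	unsigned long vm_start;
	unsigned long vm_end;
	unsigned long vm_flags;
};

/* Hypothetical flag values and page size for this model only. */
#define MODEL_VM_GROWSDOWN 0x1UL
#define MODEL_VM_GROWSUP   0x2UL
#define MODEL_PAGE_SIZE    4096UL

/* Lowest address other mappings must stay above, gap included. */
static unsigned long model_vm_start_gap(const struct vma_model *vma,
					unsigned long gap)
{
	unsigned long start = vma->vm_start;

	if (vma->vm_flags & MODEL_VM_GROWSDOWN) {
		start -= gap;
		if (start > vma->vm_start)	/* wrapped below zero */
			start = 0;
	}
	return start;
}

/* Highest address other mappings must stay below, gap included. */
static unsigned long model_vm_end_gap(const struct vma_model *vma,
				      unsigned long gap)
{
	unsigned long end = vma->vm_end;

	if (vma->vm_flags & MODEL_VM_GROWSUP) {
		end += gap;
		if (end < vma->vm_end)		/* wrapped past the top */
			end = -MODEL_PAGE_SIZE;	/* clamp to the last page boundary */
	}
	return end;
}
```

Because the gap is applied only when an address-space search asks for the effective start or end, the vma itself keeps its real size, so limits such as RLIMIT_AS are not charged for the gap.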

Original-patch-by: Oleg Nesterov <oleg@redhat.com>
Original-patch-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-19 21:50:20 +08:00
..
acpi Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-05-01 23:54:56 -07:00
apic Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-05-01 21:41:07 -07:00
cpu x86/microcode/intel: Clear patch pointer before jettisoning the initrd 2017-06-08 10:03:05 +02:00
fpu KVM: x86: Fix load damaged SSEx MXCSR register 2017-05-15 16:08:56 +02:00
kprobes kprobes/x86: Fix to set RWX bits correctly before releasing trampoline 2017-05-26 22:37:00 -04:00
.gitignore
Makefile x86/ftrace: Use Makefile logic instead of #ifdef for compiling ftrace_*.o 2017-03-24 10:14:08 +01:00
alternative.c x86/alternatives: Prevent uninitialized stack byte read in apply_alternatives() 2017-05-24 16:18:12 +02:00
amd_gart_64.c x86: use set_memory.h header 2017-05-08 17:15:13 -07:00
amd_nb.c x86/amd_nb: Add SMN and Indirect Data Fabric access for AMD Fam17h 2016-11-16 20:46:38 +01:00
apb_timer.c Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-12-25 14:30:04 -08:00
aperture_64.c x86/boot/e820: Prefix the E820_* type names with "E820_TYPE_" 2017-01-28 22:55:22 +01:00
apm_32.c x86: Remap GDT tables in the fixmap section 2017-03-16 09:06:35 +01:00
asm-offsets.c efi: Get and store the secure boot status 2017-02-07 10:42:10 +01:00
asm-offsets_32.c sched/x86: Rewrite the switch_to() code 2016-08-24 12:31:41 +02:00
asm-offsets_64.c x86/kvm: Provide optimized version of vcpu_is_preempted() for x86-64 2017-02-21 12:48:35 +01:00
audit_64.c
bootflag.c
check.c
cpuid.c Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-12-12 19:25:04 -08:00
crash.c x86/boot/e820: Clean up the E820 table size define names 2017-01-28 22:55:23 +01:00
crash_dump_32.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
crash_dump_64.c
devicetree.c
doublefault.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/debug.h> 2017-03-02 08:42:34 +01:00
dumpstack.c x86/unwind: Ensure stack pointer is aligned 2017-04-18 10:30:23 +02:00
dumpstack_32.c x86/debug: Implement __WARN() using UD0 2017-03-27 10:20:28 +02:00
dumpstack_64.c x86/debug: Implement __WARN() using UD0 2017-03-27 10:20:28 +02:00
e820.c x86/boot/e820: Remove a redundant self assignment 2017-04-14 11:43:21 +02:00
early-quirks.c main drm pull request for 4.12 kernel 2017-05-03 11:44:24 -07:00
early_printk.c x86/earlyprintk: Add support for earlyprintk via USB3 debug port 2017-03-21 12:30:16 +01:00
ebda.c
espfix_64.c x86/espfix: Add support for 5-level paging 2017-04-04 08:22:34 +02:00
ftrace.c x86/ftrace: Make sure that ftrace trampolines are not RWX 2017-05-26 22:37:02 -04:00
ftrace_32.S x86/ftrace: Fix ebp in ftrace_regs_caller that screws up unwinder 2017-04-21 09:48:16 +02:00
ftrace_64.S x86/ftrace: Use Makefile logic instead of #ifdef for compiling ftrace_*.o 2017-03-24 10:14:08 +01:00
head32.c x86/boot/e820: Move asm/e820.h to asm/e820/api.h 2017-01-28 09:31:13 +01:00
head64.c Merge branch 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-05-01 20:51:12 -07:00
head_32.S x86/boot/32: Convert the 32-bit pgtable setup code from assembly to C 2017-01-06 08:39:26 +01:00
head_64.S x86/boot/64: Rename start_cpu() 2017-03-07 13:57:25 +01:00
hpet.c x86/hpet: Prevent might sleep splat on resume 2017-03-02 09:33:47 +01:00
hw_breakpoint.c
i8237.c
i8253.c
i8259.c x86: i8259: export legacy_pic symbol 2017-04-14 12:08:51 +02:00
io_delay.c
ioport.c Second batch of KVM changes for 4.11 merge window 2017-03-04 11:36:19 -08:00
irq.c x86/irq: Optimize free vector check in the CPU offline path 2017-04-20 15:25:09 +02:00
irq_32.c
irq_64.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task_stack.h> 2017-03-02 08:42:36 +01:00
irq_work.c x86/irq, trace: Add __irq_entry annotation to x86's platform IRQ handlers 2017-01-05 08:58:49 +01:00
irqinit.c x86/irq: Remove a redundant #ifdef directive 2017-04-14 22:43:01 +02:00
itmt.c sched/x86: Remove unnecessary TBM3 check to update topology 2017-01-19 08:42:37 +01:00
jump_label.c locking/jump_labels: Update bug_at() boot message 2017-01-12 09:43:07 +01:00
kdebugfs.c x86/kdebugfs: Move boot params hierarchy under (debugfs)/x86/ 2017-03-01 09:57:02 +01:00
kexec-bzimage64.c x86/boot/e820: Clean up the E820 table size define names 2017-01-28 22:55:23 +01:00
kgdb.c sched/x86: Add 'struct inactive_task_frame' to better document the sleeping task stack frame 2016-08-24 12:27:41 +02:00
ksysfs.c
kvm.c kvm: async_pf: fix rcu_irq_enter() with irqs enabled 2017-06-06 14:43:16 +02:00
kvmclock.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/clock.h> 2017-03-02 08:42:27 +01:00
ldt.c Merge branch 'akpm' (patches from Andrew) 2016-12-12 20:50:02 -08:00
livepatch.c
machine_kexec_32.c x86: use set_memory.h header 2017-05-08 17:15:13 -07:00
machine_kexec_64.c Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-05-12 10:11:50 -07:00
mmconf-fam10h_64.c
module.c mm, vmalloc: use __GFP_HIGHMEM implicitly 2017-05-08 17:15:13 -07:00
mpparse.c x86/boot/e820: Rename early_reserve_e820() to e820__memblock_alloc() and document it 2017-01-28 14:42:30 +01:00
msr.c x86/msr: Remove bogus cleanup from the error path 2016-12-25 10:47:41 +01:00
nmi.c * An EDAC driver for Cavium ThunderX RAS IP (Sergey Temerkhanov) 2017-05-01 11:36:00 -07:00
nmi_selftest.c
paravirt-spinlocks.c 4.11 is going to be a relatively large release for KVM, with a little over 2017-02-22 18:22:53 -08:00
paravirt.c x86/paravirt: Add 5-level support to the paravirt code 2017-04-04 08:22:34 +02:00
paravirt_patch_32.c x86/paravirt: Mark unused patch_default label 2016-12-22 17:43:35 +01:00
paravirt_patch_64.c x86/paravirt: Mark unused patch_default label 2016-12-22 17:43:35 +01:00
pci-calgary_64.c x86/pci-calgary: Use setup_timer() instead of open coding it. 2017-03-31 10:21:04 +02:00
pci-dma.c This is a tree wide change and has been kept separate for that reason. 2017-02-25 13:45:43 -08:00
pci-iommu_table.c
pci-nommu.c treewide: Constify most dma_map_ops structures 2017-01-24 12:23:35 -05:00
pci-swiotlb.c treewide: Constify most dma_map_ops structures 2017-01-24 12:23:35 -05:00
pcspeaker.c
perf_regs.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task_stack.h> 2017-03-02 08:42:36 +01:00
platform-quirks.c x86/init: Add i8042 state to the platform data 2016-12-19 11:34:15 +01:00
pmem.c
probe_roms.c x86/boot/e820: Move asm/e820.h to asm/e820/api.h 2017-01-28 09:31:13 +01:00
process.c x86/arch_prctl: Add ARCH_[GET|SET]_CPUID 2017-03-20 16:10:34 +01:00
process_32.c x86/debug/32: Convert a smp_processor_id() call to raw to avoid DEBUG_PREEMPT warning 2017-05-29 08:22:49 +02:00
process_64.c x86/xen: add CONFIG_XEN_PV to Kconfig 2017-05-02 10:50:19 +02:00
ptrace.c x86/arch_prctl/64: Rename do_arch_prctl() to do_arch_prctl_64() 2017-03-20 16:10:32 +01:00
pvclock.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/nmi.h> 2017-03-02 08:42:30 +01:00
quirks.c x86/quirks: Hide maybe-uninitialized warning 2016-10-25 11:45:13 +02:00
reboot.c x86/mce: Handle broadcasted MCE gracefully with kexec 2017-03-13 20:18:07 +01:00
reboot_fixups_32.c
relocate_kernel_32.S
relocate_kernel_64.S
resource.c x86/boot/e820: Harmonize the 'struct e820_table' fields 2017-01-28 09:33:16 +01:00
rtc.c timekeeping: Ignore the bogus sleep time if pm_trace is enabled 2016-11-29 18:02:58 +01:00
setup.c x86/timers: Move simple_udelay_calibration past init_hypervisor_platform 2017-05-26 13:04:09 +02:00
setup_percpu.c x86/boot/32: Fix UP boot on Quark and possibly other platforms 2017-05-09 08:14:24 +02:00
signal.c x86/debug: Fix the printk() debug output of signal_fault(), do_trap() and do_general_protection() 2017-04-11 09:11:13 +02:00
signal_compat.c x86/signals: Fix lower/upper bound reporting in compat siginfo 2017-04-05 10:16:43 +02:00
smp.c Merge branch 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-05-01 20:51:12 -07:00
smpboot.c x86: Remap GDT tables in the fixmap section 2017-03-16 09:06:35 +01:00
stacktrace.c stacktrace/x86: add function for detecting reliable stack traces 2017-03-08 09:18:02 +01:00
step.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task_stack.h> 2017-03-02 08:42:36 +01:00
sys_x86_64.c mm: larger stack guard gap, between vmas 2017-06-19 21:50:20 +08:00
sysfb.c
sysfb_efi.c
sysfb_simplefb.c x86/sysfb: Fix lfb_size calculation 2016-11-16 09:38:23 +01:00
tboot.c IOMMU Updates for Linux v4.12 2017-05-09 15:15:47 -07:00
tce_64.c
time.c
tls.c x86/tls: Forcibly set the accessed bit in TLS segments 2017-03-19 12:14:35 +01:00
tls.h
topology.c
trace_clock.c
tracepoint.c tracing: Have the reg function allow to fail 2016-12-09 09:13:30 -05:00
traps.c x86/debug: Handle early WARN_ONs proper 2017-06-12 21:17:48 +02:00
tsc.c sched/clock, x86/perf: Fix "perf test tsc" 2017-03-23 07:31:49 +01:00
tsc_msr.c x86/tsc: Set TSC_KNOWN_FREQ and TSC_RELIABLE flags on Intel Atom SoCs 2016-11-18 10:58:31 +01:00
tsc_sync.c x86/tsc: Make the TSC ADJUST sanitizing work for tsc_reliable 2017-02-10 09:47:17 +01:00
unwind_frame.c x86/unwind: Add end-of-stack check for ftrace handlers 2017-05-24 09:05:16 +02:00
unwind_guess.c x86/unwind: Ensure stack pointer is aligned 2017-04-18 10:30:23 +02:00
uprobes.c
verify_cpu.S
vm86_32.c x86/vm86/32: Switch to flush_tlb_mm_range() in mark_screen_rdonly() 2017-04-26 10:02:06 +02:00
vmlinux.lds.S debug: Fix __bug_table[] in arch linker scripts 2017-04-03 10:22:40 +02:00
vsmp_64.c
x86_init.c x86/boot/e820: Rename default_machine_specific_memory_setup() to e820__memory_setup_default() 2017-01-28 14:42:26 +01:00