Go to file
Nicholas Piggin 82e93f94ac mm: fix exec activate_mm vs TLB shootdown and lazy tlb switching race
[ Upstream commit d53c3dfb23 ]

Reading and modifying current->mm and current->active_mm and switching
mm should be done with irqs off, to prevent races seeing an intermediate
state.

This is similar to commit 38cf307c1f ("mm: fix kthread_use_mm() vs TLB
invalidate"). At exec-time when the new mm is activated, the old one
should usually be single-threaded and no longer used, unless something
else is holding an mm_users reference (which may be possible).

Absent other mm_users, there is also a race with preemption and lazy tlb
switching. Consider the kernel_execve case where the current thread is
using a lazy tlb active mm:

  call_usermodehelper()
    kernel_execve()
      old_mm = current->mm;
      active_mm = current->active_mm;
      *** preempt *** -------------------->  schedule()
                                               prev->active_mm = NULL;
                                               mmdrop(prev active_mm);
                                             ...
                      <--------------------  schedule()
      current->mm = mm;
      current->active_mm = mm;
      if (!old_mm)
          mmdrop(active_mm);

If we switch back to the kernel thread from a different mm, there is a
double free of the old active_mm, and a missing free of the new one.

Closing this race only requires interrupts to be disabled while ->mm
and ->active_mm are being switched, but the TLB problem requires also
holding interrupts off over activate_mm. Unfortunately not all archs
can do that yet, e.g., arm defers the switch if irqs are disabled and
expects finish_arch_post_lock_switch() to be called to complete the
flush; um takes a blocking lock in activate_mm().

So as a first step, disable interrupts across the mm/active_mm updates
to close the lazy tlb preempt race, and provide an arch option to
extend that to activate_mm which allows architectures doing IPI based
TLB shootdowns to close the second race.

This is a bit ugly, but in the interest of fixing the bug and backporting
before all architectures are converted this is a compromise.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200914045219.3736466-2-npiggin@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-11-05 11:43:13 +01:00
Documentation xen/events: defer eoi in case of excessive number of events 2020-11-05 11:43:12 +01:00
LICENSES LICENSES: Rename other to deprecated 2019-05-03 06:34:32 -06:00
arch mm: fix exec activate_mm vs TLB shootdown and lazy tlb switching race 2020-11-05 11:43:13 +01:00
block block: ratelimit handle_bad_sector() message 2020-10-29 09:58:01 +01:00
certs PKCS#7: Refactor verify_pkcs7_signature() 2019-08-05 18:40:18 -04:00
crypto crypto: algif_skcipher - EBUSY on aio should be an error 2020-10-29 09:57:30 +01:00
drivers ata: sata_nv: Fix retrieving of active qcs 2020-11-05 11:43:12 +01:00
fs mm: fix exec activate_mm vs TLB shootdown and lazy tlb switching race 2020-11-05 11:43:13 +01:00
include xen/events: add a new "late EOI" evtchn framework 2020-11-05 11:43:10 +01:00
init kprobes: tracing/kprobes: Fix to kill kprobes on initmem after boot 2020-10-01 13:18:23 +02:00
ipc ipc/util.c: sysvipc_find_ipc() incorrectly updates position index 2020-05-20 08:20:16 +02:00
kernel futex: Fix incorrect should_fail_futex() handling 2020-11-05 11:43:13 +01:00
lib lib/crc32.c: fix trivial typo in preprocessor condition 2020-10-29 09:57:52 +01:00
mm mm/page_owner: change split_page_owner to take a count 2020-10-29 09:57:52 +01:00
net tipc: fix memory leak caused by tipc_buf_append() 2020-11-01 12:01:04 +01:00
samples misc: vop: add round_up(x,4) for vring_size to avoid kernel panic 2020-10-29 09:58:04 +01:00
scripts scripts/setlocalversion: make git describe output more reliable 2020-11-01 12:01:01 +01:00
security evm: Check size of security.evm before using it 2020-11-01 12:01:05 +01:00
sound ALSA: hda/ca0132 - Add new quirk ID for SoundBlaster AE-7. 2020-10-29 09:58:09 +01:00
tools bpf: Fix comment for helper bpf_current_task_under_cgroup() 2020-11-01 12:01:05 +01:00
usr initramfs: restore default compression behavior 2020-04-08 09:08:38 +02:00
virt KVM: arm64: Assume write fault on S1PTW permission fault on instruction fetch 2020-10-01 13:18:25 +02:00
.clang-format clang-format: Update with the latest for_each macro list 2019-08-31 10:00:51 +02:00
.cocciconfig
.get_maintainer.ignore
.gitattributes
.gitignore Modules updates for v5.4 2019-09-22 10:34:46 -07:00
.mailmap ARM: SoC fixes 2019-11-10 13:41:59 -08:00
COPYING
CREDITS MAINTAINERS: Remove Simon as Renesas SoC Co-Maintainer 2019-10-10 08:12:51 -07:00
Kbuild kbuild: do not descend to ./Kbuild when cleaning 2019-08-21 21:03:58 +09:00
Kconfig docs: kbuild: convert docs to ReST and rename to *.rst 2019-06-14 14:21:21 -06:00
MAINTAINERS Documentation/llvm: add documentation on building w/ Clang/LLVM 2020-08-26 10:40:46 +02:00
Makefile Linux 5.4.74 2020-11-01 12:01:07 +01:00
README

README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.