linux/drivers
Michal Hocko 93065ac753 mm, oom: distinguish blockable mode for mmu notifiers
There are several blockable mmu notifiers which might sleep in
mmu_notifier_invalidate_range_start, and that is a problem for the
oom_reaper because it needs to guarantee forward progress, so it cannot
depend on any sleepable locks.

Currently we simply back off and mark an oom victim with blockable mmu
notifiers as done after a short sleep.  That can result in selecting a new
oom victim prematurely because the previous one has not torn its memory
down yet.

We can do much better though.  Even if mmu notifiers use sleepable locks
there is no reason to automatically assume those locks are held.  Moreover,
the majority of notifiers only care about a portion of the address space
and there is absolutely zero reason to fail when we are unmapping an
unrelated range.  Many notifiers do really block and wait for HW, though,
which is harder to handle, and in that case we have to bail out.

This patch handles the low hanging fruit.
__mmu_notifier_invalidate_range_start gets a blockable flag and callbacks
are not allowed to sleep if the flag is set to false.  This is achieved by
using trylock instead of the sleepable lock for most callbacks and
continuing as long as we do not block down the call chain.
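
As a rough illustration of the trylock approach described above, here is a
minimal userspace sketch.  It is not the kernel code: struct demo_notifier,
the demo_* helpers and the -EAGAIN return value are assumptions made for
this example, and a C11 atomic_flag stands in for the driver's sleepable
lock.

```c
#include <errno.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a driver's per-notifier state. */
struct demo_notifier {
	atomic_flag lock;		/* models the driver's sleepable lock */
	unsigned long invalidated;	/* number of ranges processed */
};

/* Try to take the lock without waiting; true on success. */
static bool demo_trylock(atomic_flag *lock)
{
	return !atomic_flag_test_and_set(lock);
}

/* Take the lock, waiting for as long as it takes (the blockable path). */
static void demo_lock(atomic_flag *lock)
{
	while (atomic_flag_test_and_set(lock))
		sched_yield();
}

static void demo_unlock(atomic_flag *lock)
{
	atomic_flag_clear(lock);
}

/*
 * Models an invalidate_range_start callback.  When blockable is false
 * the callback must not wait: it uses trylock and reports -EAGAIN
 * (this sketch's choice of error code) if the lock is contended.
 */
static int demo_invalidate_range_start(struct demo_notifier *n,
				       unsigned long start,
				       unsigned long end,
				       bool blockable)
{
	if (blockable)
		demo_lock(&n->lock);
	else if (!demo_trylock(&n->lock))
		return -EAGAIN;

	/* A real driver would tear down its mappings of [start, end) here. */
	(void)start;
	(void)end;
	n->invalidated++;

	demo_unlock(&n->lock);
	return 0;
}
```

A caller that passes blockable == false only ever sees an immediate success
or an immediate -EAGAIN, which is the property the oom_reaper depends on.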

I think we can improve that even further because there is a common pattern
to do a range lookup first and then do something about that.  The first
part can be done without a sleeping lock in most cases AFAICS.
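
The lookup-first pattern mentioned above could look roughly like the
following userspace sketch.  Everything here is hypothetical: struct
demo_region, its busy flag (modelling state that could only be handled by
sleeping) and the -EAGAIN return value are assumptions for illustration,
not code from any driver.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical region tracked by a notifier; [start, end) is half-open. */
struct demo_region {
	unsigned long start;
	unsigned long end;
	bool busy;		/* handling it right now would need to sleep */
	unsigned long hits;	/* number of invalidations applied */
};

/* Step one: a pure range lookup that needs no sleeping lock. */
static bool demo_overlaps(const struct demo_region *r,
			  unsigned long start, unsigned long end)
{
	return start < r->end && r->start < end;
}

/*
 * Step two: act only on regions that overlap the invalidated range.
 * Unrelated regions are skipped even if they are busy, so unmapping
 * an unrelated range never fails in non-blockable mode.
 */
static int demo_invalidate(struct demo_region *regions, size_t count,
			   unsigned long start, unsigned long end,
			   bool blockable)
{
	for (size_t i = 0; i < count; i++) {
		struct demo_region *r = &regions[i];

		if (!demo_overlaps(r, start, end))
			continue;
		if (r->busy && !blockable)
			return -EAGAIN;
		r->hits++;
	}
	return 0;
}
```

Only a busy region that actually overlaps the range makes the
non-blockable call fail; every other combination proceeds.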

The oom_reaper end then simply retries if there is at least one notifier
which couldn't make any progress in !blockable mode.  A retry loop is
already implemented to wait for the mmap_sem and this is basically the
same thing.
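
The retry behaviour described above can be sketched in userspace as
follows.  This is only a model under stated assumptions: struct demo_hook,
DEMO_MAX_RETRIES (10 here), the -EAGAIN convention and the sample callback
are all invented for the example and do not reproduce the oom_reaper code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_RETRIES 10	/* assumption made for this sketch */

/* Hypothetical notifier hook: returns 0 or -EAGAIN. */
typedef int (*demo_invalidate_fn)(void *ctx, bool blockable);

struct demo_hook {
	demo_invalidate_fn invalidate;
	void *ctx;
};

/* One pass in !blockable mode; true if every hook made progress. */
static bool demo_reap_once(const struct demo_hook *hooks, size_t count)
{
	for (size_t i = 0; i < count; i++)
		if (hooks[i].invalidate(hooks[i].ctx, false) == -EAGAIN)
			return false;
	return true;
}

/*
 * Retry until a pass succeeds or the budget is used up.  Returns the
 * number of the successful attempt (starting at 1), or -EAGAIN if no
 * attempt succeeded.
 */
static int demo_reap(const struct demo_hook *hooks, size_t count)
{
	for (int attempt = 1; attempt <= DEMO_MAX_RETRIES; attempt++) {
		if (demo_reap_once(hooks, count))
			return attempt;
		/* A real reaper would sleep briefly before retrying. */
	}
	return -EAGAIN;
}

/* Sample hook: reports contention for the first *ctx non-blockable calls. */
static int demo_busy_n_times(void *ctx, bool blockable)
{
	int *remaining = ctx;

	if (!blockable && *remaining > 0) {
		(*remaining)--;
		return -EAGAIN;
	}
	return 0;
}
```

With a hook that is contended three times, demo_reap() succeeds on the
fourth attempt; with a hook that stays contended it gives up after
DEMO_MAX_RETRIES attempts.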

The simplest way for driver developers to test this code path is to wrap
userspace code which uses these notifiers into a memcg and set the hard
limit to hit the oom.  This can be done e.g. after the test faults in all
the mmu notifier managed memory, by setting the hard limit to something
really small.  Then we are looking for a proper process tear down.

[akpm@linux-foundation.org: coding style fixes]
[akpm@linux-foundation.org: minor code simplification]
Link: http://lkml.kernel.org/r/20180716115058.5559-1-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Christian König <christian.koenig@amd.com> # AMD notifiers
Acked-by: Leon Romanovsky <leonro@mellanox.com> # mlx and umem_odp
Reported-by: David Rientjes <rientjes@google.com>
Cc: "David (ChunMing) Zhou" <David1.Zhou@amd.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Mike Marciniszyn <mike.marciniszyn@intel.com>
Cc: Dennis Dalessandro <dennis.dalessandro@intel.com>
Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
Cc: Dimitri Sivanich <sivanich@sgi.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22 10:52:44 -07:00
accessibility
acpi arm64 updates for 4.19 2018-08-14 16:39:13 -07:00
amba
android android: binder: Rate-limit debug and userspace triggered err msgs 2018-08-08 11:05:47 +02:00
ata Char/Misc driver patches for 4.19-rc1 2018-08-18 11:04:51 -07:00
atm
auxdisplay Char/Misc driver patches for 4.19-rc1 2018-08-18 11:04:51 -07:00
base Driver core patches for 4.19-rc1 2018-08-18 11:44:53 -07:00
bcma
block The main things are support for cephx v2 authentication protocol and 2018-08-20 18:26:55 -07:00
bluetooth Bluetooth: mediatek: pass correct size to h4_recv_buf() 2018-08-13 15:59:39 +02:00
bus VLA leftovers pull summary: 2018-08-17 10:40:09 -07:00
cdrom cdrom: Use struct scsi_sense_hdr internally 2018-08-02 15:22:39 -06:00
char RTC for 4.19 2018-08-20 16:30:27 -07:00
clk The new and exciting feature this time around is in the clk core. 2018-08-15 21:41:21 -07:00
clocksource RISC-V Updates for the 4.19 Merge Window 2018-08-19 09:56:38 -07:00
connector
cpufreq powerpc updates for 4.19 2018-08-17 11:32:50 -07:00
cpuidle powerpc updates for 4.19 2018-08-17 11:32:50 -07:00
crypto Char/Misc driver patches for 4.19-rc1 2018-08-18 11:04:51 -07:00
dax dax: remove VM_MIXEDMAP for fsdax and device dax 2018-08-17 16:20:27 -07:00
dca
devfreq Char/Misc driver patches for 4.19-rc1 2018-08-18 11:04:51 -07:00
dio
dma DMAengine updates for v4.19-rc1 2018-08-18 15:55:59 -07:00
dma-buf
edac EDAC: Add missing MEM_LRDDR4 entry in edac_mem_types[] 2018-08-17 15:13:34 +02:00
eisa
extcon
firewire firewire: use 64-bit time_t based interfaces 2018-08-17 16:20:27 -07:00
firmware Char/Misc driver patches for 4.19-rc1 2018-08-18 11:04:51 -07:00
fmc
fpga
fsi fsi: sbefifo: Bump max command length 2018-08-08 15:44:47 +10:00
gnss
gpio - New Drivers 2018-08-20 15:38:44 -07:00
gpu mm, oom: distinguish blockable mode for mmu notifiers 2018-08-22 10:52:44 -07:00
hid Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid 2018-08-20 15:59:01 -07:00
hsi
hv Drivers: hv: vmbus: Cleanup synic memory free path 2018-08-02 10:20:59 +02:00
hwmon Char/Misc driver patches for 4.19-rc1 2018-08-18 11:04:51 -07:00
hwspinlock
hwtracing
i2c regmap: Changes for v4.19 2018-08-14 11:51:03 -07:00
ide ide-cd: Remove redundant sense buffer 2018-08-02 15:22:37 -06:00
idle
iio
infiniband mm, oom: distinguish blockable mode for mmu notifiers 2018-08-22 10:52:44 -07:00
input - New Drivers 2018-08-20 15:38:44 -07:00
iommu Driver core patches for 4.19-rc1 2018-08-18 11:44:53 -07:00
ipack
irqchip RISC-V Updates for the 4.19 Merge Window 2018-08-19 09:56:38 -07:00
isdn isdn: Disable IIOCDBGVAR 2018-08-16 12:26:24 -07:00
leds leds: ns2: Change unsigned to unsigned int 2018-08-06 23:03:12 +02:00
lightnvm
macintosh
mailbox mailbox: Add support for i.MX messaging unit 2018-08-15 09:53:07 +05:30
mcb
md Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input 2018-08-18 16:48:07 -07:00
media - New Drivers 2018-08-20 15:38:44 -07:00
memory Char/Misc driver patches for 4.19-rc1 2018-08-18 11:04:51 -07:00
memstick
message
mfd - New Drivers 2018-08-20 15:38:44 -07:00
misc mm, oom: distinguish blockable mode for mmu notifiers 2018-08-22 10:52:44 -07:00
mmc Merge branch 'asoc-4.19' into asoc-next 2018-08-09 14:47:05 +01:00
mtd Char/Misc driver patches for 4.19-rc1 2018-08-18 11:04:51 -07:00
mux mux: adgs1408: new driver for Analog Devices ADGS1408/1409 mux 2018-08-02 10:23:02 +02:00
net Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-08-19 11:51:45 -07:00
nfc
ntb
nubus
nvdimm Linux 4.18-rc6 2018-08-05 19:32:09 -06:00
nvme Merge branch 'linus/master' into rdma.git for-next 2018-08-16 14:21:29 -06:00
nvmem
of Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2018-08-15 15:04:25 -07:00
opp
oprofile
parisc
parport Char/Misc driver patches for 4.19-rc1 2018-08-18 11:04:51 -07:00
pci pci-v4.19-changes 2018-08-16 09:21:54 -07:00
pcmcia pcmcia: remove long deprecated pcmcia_request_exclusive_irq() function 2018-08-18 12:30:42 -07:00
perf Char/Misc driver patches for 4.19-rc1 2018-08-18 11:04:51 -07:00
phy
pinctrl - New Drivers 2018-08-20 15:38:44 -07:00
platform - New Drivers 2018-08-20 15:38:44 -07:00
pnp
power
powercap
pps
ps3
ptp Char/Misc driver patches for 4.19-rc1 2018-08-18 11:04:51 -07:00
pwm
rapidio
ras
regulator - New Drivers 2018-08-20 15:38:44 -07:00
remoteproc remoteproc/davinci: use the reset framework 2018-08-16 17:39:55 -07:00
reset
rpmsg
rtc RTC for 4.19 2018-08-20 16:30:27 -07:00
s390 TTY/Serial driver patches for 4.19-rc1 2018-08-18 10:50:41 -07:00
sbus
scsi Char/Misc driver patches for 4.19-rc1 2018-08-18 11:04:51 -07:00
sfi
sh sh: introduce a sh_cacheop_vaddr helper 2018-08-02 13:54:06 +02:00
siox
slimbus
sn
soc remoteproc updates for v4.19 2018-08-18 16:42:04 -07:00
soundwire
spi hwspinlock updates for v4.19 2018-08-18 16:45:27 -07:00
spmi
ssb ssb: Remove SSB_WARN_ON, SSB_BUG_ON and SSB_DEBUG 2018-08-09 18:47:47 +03:00
staging Staging/IIO patches for 4.19-rc1 2018-08-18 11:00:00 -07:00
target SCSI misc on 20180815 2018-08-15 22:06:26 -07:00
tc
tee
thermal Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal 2018-08-16 10:21:18 -07:00
thunderbolt
tty Char/Misc driver patches for 4.19-rc1 2018-08-18 11:04:51 -07:00
uio Char/Misc fix for 4.19-rc1 2018-08-19 09:30:44 -07:00
usb Char/Misc driver patches for 4.19-rc1 2018-08-18 11:04:51 -07:00
uwb
vfio powerpc updates for 4.19 2018-08-17 11:32:50 -07:00
vhost SCSI misc on 20180815 2018-08-15 22:06:26 -07:00
video - Core Frameworks 2018-08-20 15:41:37 -07:00
virt
virtio virtio: Make vp_set_vq_affinity() take a mask. 2018-08-11 12:02:18 -07:00
visorbus
vlynq
vme
w1 Char/Misc driver patches for 4.19-rc1 2018-08-18 11:04:51 -07:00
watchdog linux-watchdog 4.19-rc1 tag 2018-08-18 16:16:57 -07:00
xen mm, oom: distinguish blockable mode for mmu notifiers 2018-08-22 10:52:44 -07:00
zorro
Kconfig
Makefile Char/Misc driver patches for 4.19-rc1 2018-08-18 11:04:51 -07:00