linux/fs
Michal Hocko d1908f5255 fs: break out of iomap_file_buffered_write on fatal signals
Tetsuo has noticed that an OOM stress test which performs large write
requests can cause the full memory reserves depletion.  He has tracked
this down to the following path

	__alloc_pages_nodemask+0x436/0x4d0
	alloc_pages_current+0x97/0x1b0
	__page_cache_alloc+0x15d/0x1a0          mm/filemap.c:728
	pagecache_get_page+0x5a/0x2b0           mm/filemap.c:1331
	grab_cache_page_write_begin+0x23/0x40   mm/filemap.c:2773
	iomap_write_begin+0x50/0xd0             fs/iomap.c:118
	iomap_write_actor+0xb5/0x1a0            fs/iomap.c:190
	? iomap_write_end+0x80/0x80             fs/iomap.c:150
	iomap_apply+0xb3/0x130                  fs/iomap.c:79
	iomap_file_buffered_write+0x68/0xa0     fs/iomap.c:243
	? iomap_write_end+0x80/0x80
	xfs_file_buffered_aio_write+0x132/0x390 [xfs]
	? remove_wait_queue+0x59/0x60
	xfs_file_write_iter+0x90/0x130 [xfs]
	__vfs_write+0xe5/0x140
	vfs_write+0xc7/0x1f0
	? syscall_trace_enter+0x1d0/0x380
	SyS_write+0x58/0xc0
	do_syscall_64+0x6c/0x200
	entry_SYSCALL64_slow_path+0x25/0x25

the oom victim has access to all memory reserves to make a forward
progress to exit easier.  But iomap_file_buffered_write and other
callers of iomap_apply loop to complete the full request.  We need to
check for fatal signals and back off with a short write instead.

As the iomap_apply delegates all the work down to the actor we have to
hook into those.  All callers that work with the page cache are calling
iomap_write_begin so we will check for signals there.  dax_iomap_actor
has to handle the situation explicitly because it copies data to the
userspace directly.  Other callers like iomap_page_mkwrite work on a
single page or iomap_fiemap_actor do not allocate memory based on the
given len.

Fixes: 68a9f5e700 ("xfs: implement iomap based buffered write path")
Link: http://lkml.kernel.org/r/20170201092706.9966-2-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>	[4.8+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-03 14:13:19 -08:00
..
9p Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
adfs
affs vfs: remove ".readlink = generic_readlink" assignments 2016-12-09 16:45:04 +01:00
afs Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
autofs4 Merge uncontroversial parts of branch 'readlink' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs 2016-12-17 19:16:12 -08:00
befs befs: add NFS export support 2016-12-22 11:25:24 +00:00
bfs Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
btrfs Merge branch 'for-linus-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs 2017-01-27 12:41:46 -08:00
cachefiles
ceph ceph: fix bad endianness handling in parse_reply_info_extra 2017-01-18 17:58:45 +01:00
cifs cifs: initialize file_info_lock 2017-01-14 14:58:29 -06:00
coda vfs: remove ".readlink = generic_readlink" assignments 2016-12-09 16:45:04 +01:00
configfs Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
cramfs
crypto fscrypt: fix renaming and linking special files 2016-12-31 00:47:05 -05:00
debugfs
devpts
dlm ktime: Get rid of ktime_equal() 2016-12-25 17:21:23 +01:00
ecryptfs vfs: remove ".readlink = generic_readlink" assignments 2016-12-09 16:45:04 +01:00
efivarfs
efs Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
exofs exofs: don't mess with simple_write_{begin,end} 2016-12-10 14:25:19 -05:00
exportfs
ext2 dax: fix build warnings with FS_DAX and !FS_IOMAP 2017-01-24 16:26:14 -08:00
ext4 dax: fix build warnings with FS_DAX and !FS_IOMAP 2017-01-24 16:26:14 -08:00
f2fs block: Rename blk_queue_zone_size and bdev_zone_size 2017-01-12 07:58:32 -07:00
fat
freevxfs
fscache fscache: Fix dead object requeue 2017-01-31 13:23:09 -05:00
fuse fuse: fix time_to_jiffies nsec sanity check 2017-01-13 17:20:47 +01:00
gfs2 ktime: Cleanup ktime_set() usage 2016-12-25 17:21:22 +01:00
hfs Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
hfsplus Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
hostfs vfs: remove ".readlink = generic_readlink" assignments 2016-12-09 16:45:04 +01:00
hpfs
hugetlbfs Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
isofs Merge branch 'for-4.10/block' of git://git.kernel.dk/linux-block 2016-12-13 10:19:16 -08:00
jbd2 Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
jffs2 vfs: remove ".readlink = generic_readlink" assignments 2016-12-09 16:45:04 +01:00
jfs Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
kernfs Merge uncontroversial parts of branch 'readlink' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs 2016-12-17 19:16:12 -08:00
lockd
minix vfs: remove ".readlink = generic_readlink" assignments 2016-12-09 16:45:04 +01:00
ncpfs Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
nfs pNFS: Fix a reference leak in _pnfs_return_layout 2017-01-26 15:50:41 -05:00
nfs_common
nfsd nfsd: special case truncates some more 2017-01-31 12:29:24 -05:00
nilfs2 Merge uncontroversial parts of branch 'readlink' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs 2016-12-17 19:16:12 -08:00
nls
notify Merge branch 'stable-4.10' of git://git.infradead.org/users/pcmoore/audit 2017-01-05 23:06:06 -08:00
ntfs Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
ocfs2 ocfs2: fix crash caused by stale lvb with fsdlm plugin 2017-01-10 18:31:54 -08:00
omfs
openpromfs Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
orangefs Merge uncontroversial parts of branch 'readlink' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs 2016-12-17 19:16:12 -08:00
overlayfs ovl: fix possible use after free on redirect dir lookup 2017-01-18 15:19:54 +01:00
proc proc: add a schedule point in proc_pid_readdir() 2017-01-24 16:26:14 -08:00
pstore Improvements and fixes to pstore subsystem: 2016-12-13 09:16:11 -08:00
qnx4
qnx6
quota Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs 2016-12-19 08:23:53 -08:00
ramfs Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
reiserfs Merge uncontroversial parts of branch 'readlink' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs 2016-12-17 19:16:12 -08:00
romfs romfs: use different way to generate fsid for BLOCK or MTD 2017-01-24 16:26:14 -08:00
squashfs Merge uncontroversial parts of branch 'readlink' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs 2016-12-17 19:16:12 -08:00
sysfs
sysv vfs: remove ".readlink = generic_readlink" assignments 2016-12-09 16:45:04 +01:00
tracefs
ubifs ubifs: Fix journal replay wrt. xattr nodes 2017-01-17 14:35:58 +01:00
udf
ufs Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
xfs xfs: prevent quotacheck from overloading inode lru 2017-01-27 09:32:30 -08:00
Kconfig dax: fix build warnings with FS_DAX and !FS_IOMAP 2017-01-24 16:26:14 -08:00
Kconfig.binfmt
Makefile logfs: remove from tree 2016-12-14 23:48:11 -05:00
aio.c aio: fix lock dep warning 2017-01-14 19:31:40 -05:00
anon_inodes.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
attr.c
bad_inode.c bad_inode: add missing i_op initializers 2016-12-09 11:57:43 +01:00
binfmt_aout.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
binfmt_elf.c coredump: Ensure proper size of sparse core files 2017-01-14 19:32:40 -05:00
binfmt_elf_fdpic.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
binfmt_em86.c
binfmt_flat.c
binfmt_misc.c
binfmt_script.c
block_dev.c block: fix use after free in __blkdev_direct_IO 2017-01-24 07:55:53 -07:00
buffer.c clean_bdev_aliases: Prevent cleaning blocks that are not in block range 2017-01-02 09:35:14 -07:00
char_dev.c
compat.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
compat_binfmt_elf.c
compat_ioctl.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
coredump.c coredump: Ensure proper size of sparse core files 2017-01-14 19:32:40 -05:00
dax.c fs: break out of iomap_file_buffered_write on fatal signals 2017-02-03 14:13:19 -08:00
dcache.c mnt: Protect the mountpoint hashtable with mount_lock 2017-01-10 13:34:43 +13:00
dcookies.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
direct-io.c do_direct_IO: Use inode->i_blkbits to compute block count to be cleaned 2017-01-10 13:29:54 -07:00
drop_caches.c
eventfd.c
eventpoll.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
exec.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
fcntl.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
fhandle.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
file.c
file_table.c constify alloc_file() 2016-12-05 19:01:16 -05:00
filesystems.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
fs-writeback.c fs/fs-writeback.c: remove redundant if check 2016-12-12 18:55:08 -08:00
fs_pin.c
fs_struct.c
inode.c
internal.h Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-12-17 18:44:00 -08:00
ioctl.c vfs: call vfs_clone_file_range() under freeze protection 2016-12-16 11:02:54 +01:00
iomap.c fs: break out of iomap_file_buffered_write on fatal signals 2017-02-03 14:13:19 -08:00
libfs.c libfs: Modify mount_pseudo_xattr to be clear it is not a userspace mount 2017-01-10 13:34:55 +13:00
locks.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
mbcache.c mbcache: document that "find" functions only return reusable entries 2016-12-03 15:55:01 -05:00
mount.h vfs: add path_is_mountpoint() helper 2016-12-03 20:51:35 -05:00
mpage.c
namei.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
namespace.c mnt: Protect the mountpoint hashtable with mount_lock 2017-01-10 13:34:43 +13:00
no-block.c
nsfs.c
open.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
pipe.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
pnode.c reorganize do_make_slave() 2016-12-16 16:30:49 -05:00
pnode.h
posix_acl.c tmpfs: clear S_ISGID when setting posix ACLs 2017-01-10 01:29:48 -05:00
proc_namespace.c
read_write.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
readdir.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
select.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
seq_file.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
signalfd.c
splice.c splice: reinstate SIGPIPE/EPIPE handling 2016-12-21 10:59:34 -08:00
stack.c
stat.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
statfs.c vfs: misc struct path constification 2016-12-05 19:03:49 -05:00
super.c
sync.c
timerfd.c ktime: Cleanup ktime_set() usage 2016-12-25 17:21:22 +01:00
userfaultfd.c userfaultfd: fix SIGBUS resulting from false rwsem wakeups 2017-01-24 16:26:14 -08:00
utimes.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
xattr.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00