linux/fs
Ye Bin 1b6ad24210 jbd2: fix a potential race while discarding reserved buffers after an abort
commit 23e3d7f7061f8682c751c46512718f47580ad8f0 upstream.

we got issue as follows:
[   72.796117] EXT4-fs error (device sda): ext4_journal_check_start:83: comm fallocate: Detected aborted journal
[   72.826847] EXT4-fs (sda): Remounting filesystem read-only
fallocate: fallocate failed: Read-only file system
[   74.791830] jbd2_journal_commit_transaction: jh=0xffff9cfefe725d90 bh=0x0000000000000000 end delay
[   74.793597] ------------[ cut here ]------------
[   74.794203] kernel BUG at fs/jbd2/transaction.c:2063!
[   74.794886] invalid opcode: 0000 [#1] PREEMPT SMP PTI
[   74.795533] CPU: 4 PID: 2260 Comm: jbd2/sda-8 Not tainted 5.17.0-rc8-next-20220315-dirty #150
[   74.798327] RIP: 0010:__jbd2_journal_unfile_buffer+0x3e/0x60
[   74.801971] RSP: 0018:ffffa828c24a3cb8 EFLAGS: 00010202
[   74.802694] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[   74.803601] RDX: 0000000000000001 RSI: ffff9cfefe725d90 RDI: ffff9cfefe725d90
[   74.804554] RBP: ffff9cfefe725d90 R08: 0000000000000000 R09: ffffa828c24a3b20
[   74.805471] R10: 0000000000000001 R11: 0000000000000001 R12: ffff9cfefe725d90
[   74.806385] R13: ffff9cfefe725d98 R14: 0000000000000000 R15: ffff9cfe833a4d00
[   74.807301] FS:  0000000000000000(0000) GS:ffff9d01afb00000(0000) knlGS:0000000000000000
[   74.808338] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   74.809084] CR2: 00007f2b81bf4000 CR3: 0000000100056000 CR4: 00000000000006e0
[   74.810047] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   74.810981] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   74.811897] Call Trace:
[   74.812241]  <TASK>
[   74.812566]  __jbd2_journal_refile_buffer+0x12f/0x180
[   74.813246]  jbd2_journal_refile_buffer+0x4c/0xa0
[   74.813869]  jbd2_journal_commit_transaction.cold+0xa1/0x148
[   74.817550]  kjournald2+0xf8/0x3e0
[   74.819056]  kthread+0x153/0x1c0
[   74.819963]  ret_from_fork+0x22/0x30

Above issue may happen as follows:
        write                   truncate                   kjournald2
generic_perform_write
 ext4_write_begin
  ext4_walk_page_buffers
   do_journal_get_write_access ->add BJ_Reserved list
 ext4_journalled_write_end
  ext4_walk_page_buffers
   write_end_fn
    ext4_handle_dirty_metadata
                ***************JBD2 ABORT**************
     jbd2_journal_dirty_metadata
 -> return -EROFS, jh in reserved_list
                                                   jbd2_journal_commit_transaction
                                                    while (commit_transaction->t_reserved_list)
                                                      jh = commit_transaction->t_reserved_list;
                        truncate_pagecache_range
                         do_invalidatepage
			  ext4_journalled_invalidatepage
			   jbd2_journal_invalidatepage
			    journal_unmap_buffer
			     __dispose_buffer
			      __jbd2_journal_unfile_buffer
			       jbd2_journal_put_journal_head ->put last ref_count
			        __journal_remove_journal_head
				 bh->b_private = NULL;
				 jh->b_bh = NULL;
				                      jbd2_journal_refile_buffer(journal, jh);
							bh = jh2bh(jh);
							->bh is NULL, later will trigger null-ptr-deref
				 journal_free_journal_head(jh);

After commit 96f1e09745, we no longer hold the j_state_lock while
iterating over the list of reserved handles in
jbd2_journal_commit_transaction().  This potentially allows the
journal_head to be freed by journal_unmap_buffer while the commit
codepath is also trying to free the BJ_Reserved buffers.  Keeping
j_state_lock held while trying extends hold time of the lock
minimally, and solves this issue.

Fixes: 96f1e0974575("jbd2: avoid long hold times of j_state_lock while committing a transaction")
Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220317142137.1821590-1-yebin10@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-27 13:50:50 +02:00
..
9p
adfs
affs
afs afs: Fix incorrect triggering of sillyrename on 3rd-party invalidation 2021-09-30 10:09:22 +02:00
autofs
befs
bfs
btrfs btrfs: mark resumed async balance as writing 2022-04-20 09:19:38 +02:00
cachefiles
ceph ceph: fix handling of "meta" errors 2021-10-27 09:54:27 +02:00
cifs cifs: Check the IOCB_DIRECT flag, not O_DIRECT 2022-04-27 13:50:47 +02:00
coda
configfs configfs: fix a race in configfs_{,un}register_subsystem() 2022-03-02 11:41:10 +01:00
cramfs
crypto fscrypt: add fscrypt_symlink_getattr() for computing st_size 2021-09-12 08:56:38 +02:00
debugfs debugfs: lockdown: Allow reading debugfs files that are not world readable 2022-01-27 09:19:36 +01:00
devpts fsnotify: fix fsnotify hooks in pseudo filesystems 2022-02-01 17:24:34 +01:00
dlm fs: dlm: filter user dlm messages for kernel locks 2022-01-27 09:19:40 +01:00
ecryptfs
efivarfs
efs
erofs erofs: fix unsafe pagevec reuse of hooked pclusters 2021-11-21 13:38:51 +01:00
exportfs
ext2 ext2: correct max file size computing 2022-04-15 14:18:13 +02:00
ext4 ext4: force overhead calculation if the s_overhead_cluster makes no sense 2022-04-27 13:50:50 +02:00
f2fs f2fs: fix to avoid potential deadlock 2022-04-15 14:18:06 +02:00
fat
freevxfs
fscache fscache: Fix cookie key hashing 2021-09-22 12:26:25 +02:00
fuse fuse: fix pipe buffer lifetime for direct_io 2022-03-16 13:21:47 +01:00
gfs2 gfs2: assign rgrp glock before compute_bitstructs 2022-04-27 13:50:45 +02:00
hfs hfs: add lock nesting notation to hfs_find_init 2021-07-31 08:19:38 +02:00
hfsplus
hostfs
hpfs
hugetlbfs hugetlbfs: fix mount mode command line processing 2021-07-28 13:31:01 +02:00
iomap mm/swap: consider max pages in iomap_swapfile_add_extent 2021-09-15 09:47:35 +02:00
isofs isofs: Fix out of bound access for corrupted isofs image 2021-11-12 14:43:03 +01:00
jbd2 jbd2: fix a potential race while discarding reserved buffers after an abort 2022-04-27 13:50:50 +02:00
jffs2 jffs2: fix memory leak in jffs2_scan_medium 2022-04-15 14:17:59 +02:00
jfs jfs: prevent NULL deref in diFree 2022-04-15 14:18:36 +02:00
kernfs
lockd lockd: lockd server-side shouldn't set fl_ops 2021-09-22 12:26:34 +02:00
minix minix: fix bug when opening a file with O_DIRECT 2022-04-15 14:18:35 +02:00
nfs NFS: swap-out must always use STABLE writes. 2022-04-15 14:18:36 +02:00
nfs_common
nfsd NFSD: prevent underflow in nfssvc_decode_writeargs() 2022-04-15 14:17:59 +02:00
nilfs2 nilfs2: fix memory leak in nilfs_sysfs_delete_snapshot_group 2021-09-26 14:07:13 +02:00
nls
notify
ntfs ntfs: add sanity check on allocation size 2022-04-15 14:18:23 +02:00
ocfs2 ocfs2: fix crash when initialize filecheck kobj fails 2022-03-23 09:12:06 +01:00
omfs
openpromfs
orangefs orangefs: Fix the size of a memory allocation in orangefs_bufmap_alloc() 2022-01-20 09:19:17 +01:00
overlayfs ovl: fix warning in ovl_create_real() 2021-12-22 09:29:40 +01:00
proc proc/vmcore: fix clearing user buffer by properly using clear_user() 2021-12-01 09:23:31 +01:00
pstore
qnx4 qnx4: work around gcc false positive warning bug 2021-09-30 10:09:26 +02:00
qnx6
quota quota: make dquot_quota_sync return errors from ->sync_fs 2022-02-23 11:59:55 +01:00
ramfs
reiserfs reiserfs: check directory items on read from disk 2021-08-12 13:21:05 +02:00
romfs
squashfs
sysfs
sysv
tracefs tracefs: Set the group ownership in apply_options() not parse_options() 2022-03-02 11:41:13 +01:00
ubifs ubifs: Rectify space amount budget for mkdir/tmpfile operations 2022-04-15 14:18:31 +02:00
udf udf: Fix NULL ptr deref when converting from inline format 2022-02-01 17:24:34 +01:00
ufs
unicode
verity fs-verity: fix signed integer overflow with i_size near S64_MAX 2021-10-06 15:42:30 +02:00
xfs xfs: map unwritten blocks in XFS_IOC_{ALLOC,FREE}SP just like fallocate 2022-01-11 15:23:32 +01:00
Kconfig
Kconfig.binfmt
Makefile
aio.c aio: fix use-after-free due to missing POLLFREE handling 2021-12-14 14:49:02 +01:00
anon_inodes.c
attr.c
bad_inode.c
binfmt_aout.c
binfmt_elf.c elf: don't use MAP_FIXED_NOREPLACE for elf interpreter mappings 2021-10-06 15:42:35 +02:00
binfmt_elf_fdpic.c
binfmt_em86.c
binfmt_flat.c
binfmt_misc.c
binfmt_script.c
block_dev.c
buffer.c
char_dev.c
compat.c
compat_binfmt_elf.c
compat_ioctl.c
coredump.c
d_path.c
dax.c
dcache.c
dcookies.c
direct-io.c
drop_caches.c
eventfd.c
eventpoll.c
exec.c vfs: check fd has read access in kernel_read_file_from_fd() 2021-10-27 09:54:27 +02:00
fcntl.c fcntl: fix potential deadlock for &fasync_struct.fa_lock 2021-09-15 09:47:28 +02:00
fhandle.c
file.c fget: clarify and improve __fget_files() implementation 2022-03-02 11:41:18 +01:00
file_table.c
filesystems.c
fs-writeback.c
fs_context.c memcg: charge fs_context and legacy_fs_context 2022-02-08 18:24:29 +01:00
fs_parser.c
fs_pin.c
fs_struct.c
fs_types.c
fsopen.c
inode.c
internal.h cgroup1: fix leaked context root causing sporadic NULL deref in LTP 2021-07-31 08:19:37 +02:00
io_uring.c io_uring: fix fs->users overflow 2022-04-15 14:18:41 +02:00
ioctl.c
libfs.c
locks.c
mbcache.c
mount.h
mpage.c
namei.c fsnotify: invalidate dcache before IN_DELETE event 2022-02-01 17:24:39 +01:00
namespace.c fs: warn about impending deprecation of mandatory locks 2021-08-26 08:36:22 -04:00
no-block.c
nsfs.c
open.c
pipe.c pipe: increase minimum default pipe size to 2 pages 2021-08-12 13:21:02 +02:00
pnode.c
pnode.h
posix_acl.c
proc_namespace.c
read_write.c
readdir.c
select.c select: Fix indefinitely sleeping task in poll_schedule_timeout() 2022-01-29 10:25:11 +01:00
seq_file.c seq_file: disallow extremely large seq buffer allocations 2021-07-20 16:10:54 +02:00
signalfd.c signalfd: use wake_up_pollfree() 2021-12-14 14:49:02 +01:00
splice.c
stack.c
stat.c stat: fix inconsistency between struct stat and struct compat_stat 2022-04-27 13:50:48 +02:00
statfs.c
super.c vfs: make freeze_super abort when sync_filesystem returns error 2022-02-23 11:59:55 +01:00
sync.c
timerfd.c
userfaultfd.c userfaultfd: prevent concurrent API initialization 2021-09-22 12:26:26 +02:00
utimes.c
xattr.c