linux/fs/nfs
Jeff Layton feaff8e5b2 nfs: take extra reference to fl->fl_file when running a setlk
We had a report of a crash while stress testing the NFS client:

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000150
    IP: [<ffffffff8127b698>] locks_get_lock_context+0x8/0x90
    PGD 0
    Oops: 0000 [#1] SMP
    Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_filter ebtable_broute bridge stp llc ebtables ip6table_security ip6table_mangle ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_raw ip6table_filter ip6_tables iptable_security iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_raw coretemp crct10dif_pclmul ppdev crc32_pclmul crc32c_intel ghash_clmulni_intel vmw_balloon serio_raw vmw_vmci i2c_piix4 shpchp parport_pc acpi_cpufreq parport nfsd auth_rpcgss nfs_acl lockd grace sunrpc vmwgfx drm_kms_helper ttm drm mptspi scsi_transport_spi mptscsih mptbase e1000 ata_generic pata_acpi
    CPU: 1 PID: 399 Comm: kworker/1:1H Not tainted 4.1.0-0.rc1.git0.1.fc23.x86_64 #1
    Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/30/2013
    Workqueue: rpciod rpc_async_schedule [sunrpc]
    task: ffff880036aea7c0 ti: ffff8800791f4000 task.ti: ffff8800791f4000
    RIP: 0010:[<ffffffff8127b698>]  [<ffffffff8127b698>] locks_get_lock_context+0x8/0x90
    RSP: 0018:ffff8800791f7c00  EFLAGS: 00010293
    RAX: ffff8800791f7c40 RBX: ffff88001f2ad8c0 RCX: ffffe8ffffc80305
    RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
    RBP: ffff8800791f7c88 R08: ffff88007fc971d8 R09: 279656d600000000
    R10: 0000034a01000000 R11: 279656d600000000 R12: ffff88001f2ad918
    R13: ffff88001f2ad8c0 R14: 0000000000000000 R15: 0000000100e73040
    FS:  0000000000000000(0000) GS:ffff88007fc80000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000150 CR3: 0000000001c0b000 CR4: 00000000000407e0
    Stack:
     ffffffff8127c5b0 ffff8800791f7c18 ffffffffa0171e29 ffff8800791f7c58
     ffffffffa0171ef8 ffff8800791f7c78 0000000000000246 ffff88001ea0ba00
     ffff8800791f7c40 ffff8800791f7c40 00000000ff5d86a3 ffff8800791f7ca8
    Call Trace:
     [<ffffffff8127c5b0>] ? __posix_lock_file+0x40/0x760
     [<ffffffffa0171e29>] ? rpc_make_runnable+0x99/0xa0 [sunrpc]
     [<ffffffffa0171ef8>] ? rpc_wake_up_task_queue_locked.part.35+0xc8/0x250 [sunrpc]
     [<ffffffff8127cd3a>] posix_lock_file_wait+0x4a/0x120
     [<ffffffffa03e4f12>] ? nfs41_wake_and_assign_slot+0x32/0x40 [nfsv4]
     [<ffffffffa03bf108>] ? nfs41_sequence_done+0xd8/0x2d0 [nfsv4]
     [<ffffffffa03c116d>] do_vfs_lock+0x2d/0x30 [nfsv4]
     [<ffffffffa03c251d>] nfs4_lock_done+0x1ad/0x210 [nfsv4]
     [<ffffffffa0171a30>] ? __rpc_sleep_on_priority+0x390/0x390 [sunrpc]
     [<ffffffffa0171a30>] ? __rpc_sleep_on_priority+0x390/0x390 [sunrpc]
     [<ffffffffa0171a5c>] rpc_exit_task+0x2c/0xa0 [sunrpc]
     [<ffffffffa0167450>] ? call_refreshresult+0x150/0x150 [sunrpc]
     [<ffffffffa0172640>] __rpc_execute+0x90/0x460 [sunrpc]
     [<ffffffffa0172a25>] rpc_async_schedule+0x15/0x20 [sunrpc]
     [<ffffffff810baa1b>] process_one_work+0x1bb/0x410
     [<ffffffff810bacc3>] worker_thread+0x53/0x480
     [<ffffffff810bac70>] ? process_one_work+0x410/0x410
     [<ffffffff810bac70>] ? process_one_work+0x410/0x410
     [<ffffffff810c0b38>] kthread+0xd8/0xf0
     [<ffffffff810c0a60>] ? kthread_worker_fn+0x180/0x180
     [<ffffffff817a1aa2>] ret_from_fork+0x42/0x70
     [<ffffffff810c0a60>] ? kthread_worker_fn+0x180/0x180

Jean says:

"Running locktests with a large number of iterations resulted in a
 client crash.  The test run took a while and hasn't finished after close
 to 2 hours. The crash happened right after I gave up and killed the test
 (after 107m) with Ctrl+C."

The crash happened because a NULL inode pointer got passed into
locks_get_lock_context. The call chain indicates that file_inode(filp)
returned NULL, which means that f_inode was NULL. Since that's zeroed
out in __fput, that suggests that this filp pointer outlived the last
reference.

Looking at the code, that seems possible. We copy the struct file_lock
that's passed in, but if the task is signalled at an inopportune time we
can end up trying to use that file_lock in rpciod context after the process
that requested it has already returned (and possibly put its filp
reference).

Fix this by taking an extra reference to the filp when we allocate the
lock info, and put it in nfs4_lock_release.

Reported-by: Jean Spector <jean@primarydata.com>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-05-13 14:56:06 -04:00
..
blocklayout NFSv4.1/pnfs: Separate out metadata and data consistency for pNFS 2015-03-27 12:39:38 -04:00
filelayout NFSv4.1/pnfs: Separate out metadata and data consistency for pNFS 2015-03-27 12:39:38 -04:00
flexfilelayout NFS: Move nfs_idmap.h into fs/nfs/ 2015-04-23 15:16:14 -04:00
objlayout NFSv4.1/pnfs: Separate out metadata and data consistency for pNFS 2015-03-27 12:39:38 -04:00
Kconfig kernel: conditionally support non-root users, groups and capabilities 2015-04-15 16:35:22 -07:00
Makefile NFS: Rename idmap.c to nfs4idmap.c 2015-04-23 15:16:14 -04:00
cache_lib.c NFS: simplify and clean cache library 2013-02-15 10:43:36 -05:00
cache_lib.h NFS: simplify and clean cache library 2013-02-15 10:43:36 -05:00
callback.c nfs: fix high load average due to callback thread sleeping 2015-04-23 14:38:07 -04:00
callback.h NFS: Add in v4.2 callback operation 2013-06-08 16:20:18 -04:00
callback_proc.c NFSv4.1: Don't set up a backchannel if the server didn't agree to do so 2015-02-18 12:30:47 -08:00
callback_xdr.c NFSv4.1: Convert open-coded array allocation calls to kmalloc_array() 2015-02-11 19:02:52 -05:00
client.c NFS: Remove CONFIG_NFS_V4 checks from nfs_idmap.h 2015-04-23 15:16:13 -04:00
delegation.c Merge branch 'bugfixes' 2015-04-23 15:16:27 -04:00
delegation.h NFSv4: Fix races between nfs_remove_bad_delegation() and delegation return 2014-11-12 17:19:04 -05:00
dir.c NFS client updates for Linux 4.1 2015-04-26 17:33:59 -07:00
direct.c NFS client updates for Linux 4.1 2015-04-26 17:33:59 -07:00
dns_resolve.c NFS: Enabling v4.2 should not recompile nfsd and lockd 2013-11-19 16:20:40 -05:00
dns_resolve.h NFS: DNS resolver cache per network namespace context introduced 2012-01-31 18:20:26 -05:00
file.c NFS client updates for Linux 4.1 2015-04-26 17:33:59 -07:00
fscache-index.c NFS: Fabricate fscache server index key correctly 2014-09-25 21:25:18 -04:00
fscache.c nfs: define nfs_inc_fscache_stats and using it as possible 2014-11-24 20:08:47 -05:00
fscache.h NFS: Use i_writecount to control whether to get an fscache cookie in nfs_open() 2013-09-27 18:40:25 +01:00
getroot.c VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
inode.c NFS client updates for Linux 4.1 2015-04-26 17:33:59 -07:00
internal.h NFS: Add attribute update barriers to NFS writebacks 2015-03-01 23:23:06 -05:00
iostat.h nfs: define nfs_inc_fscache_stats and using it as possible 2014-11-24 20:08:47 -05:00
mount_clnt.c nfs: have nfs_mount fake up a auth_flavs list when the server didn't provide it 2013-06-28 15:51:51 -04:00
namespace.c VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
netns.h pnfs/blocklayout: serialize GETDEVICEINFO calls 2014-11-12 14:22:52 -05:00
nfs.h NFS: Convert v4 into a module 2012-07-30 19:06:52 -04:00
nfs2super.c NFS: Convert v2 into a module 2012-07-30 19:06:41 -04:00
nfs2xdr.c nfs: save server READ/WRITE/COMMIT status 2015-02-03 11:06:40 -08:00
nfs3_fs.h nfsv3: introduce nfs3_set_ds_client 2015-02-03 11:06:34 -08:00
nfs3acl.c VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
nfs3client.c nfs: set hostname when creating nfsv3 ds connection 2015-02-03 11:06:38 -08:00
nfs3proc.c VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
nfs3super.c nfsv3: introduce nfs3_set_ds_client 2015-02-03 11:06:34 -08:00
nfs3xdr.c NFSv3: Use the readdir fileid as the mounted-on-fileid 2015-03-01 23:23:07 -05:00
nfs4_fs.h Merge branch 'flexfiles' 2015-02-03 16:01:27 -05:00
nfs4client.c NFS client updates for Linux 4.1 2015-04-26 17:33:59 -07:00
nfs4file.c NFS client updates for Linux 4.1 2015-04-26 17:33:59 -07:00
nfs4getroot.c NFSv4: Fix security auto-negotiation 2013-09-07 16:18:30 -04:00
nfs4idmap.c NFS: Rename idmap.c to nfs4idmap.c 2015-04-23 15:16:14 -04:00
nfs4idmap.h NFS: Move nfs_idmap.h into fs/nfs/ 2015-04-23 15:16:14 -04:00
nfs4namespace.c VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
nfs4proc.c nfs: take extra reference to fl->fl_file when running a setlk 2015-05-13 14:56:06 -04:00
nfs4renewd.c NFSv4.1: Fix an NFSv4.1 state renewal regression 2014-09-30 17:18:42 -04:00
nfs4session.c NFSv4.1: Don't set up a backchannel if the server didn't agree to do so 2015-02-18 12:30:47 -08:00
nfs4session.h NFSv4.1: Clear the old state by our client id before establishing a new lease 2015-03-03 21:52:30 -05:00
nfs4state.c NFS client updates for Linux 4.1 2015-04-26 17:33:59 -07:00
nfs4super.c NFS: Move nfs_idmap.h into fs/nfs/ 2015-04-23 15:16:14 -04:00
nfs4sysctl.c NFS: Move nfs_idmap.h into fs/nfs/ 2015-04-23 15:16:14 -04:00
nfs4trace.c NFSv4.1: Add tracepoints for debugging slot table operations 2013-08-22 08:58:27 -04:00
nfs4trace.h VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
nfs4xdr.c NFS: Move nfs_idmap.h into fs/nfs/ 2015-04-23 15:16:14 -04:00
nfs42.h nfs: Add DEALLOCATE support 2014-11-25 16:38:32 -05:00
nfs42proc.c NFS: Reduce time spent holding the i_mutex during fallocate() 2015-04-23 14:36:28 -04:00
nfs42xdr.c NFS: Don't zap caches on fallocate() 2015-04-23 14:36:28 -04:00
nfsroot.c NFS: a couple off by ones 2015-01-30 20:43:30 -05:00
nfstrace.c NFSv4: Allow tracing of NFSv4 fsync calls 2015-03-27 12:39:34 -04:00
nfstrace.h NFS: fix the handling of NFS_INO_INVALID_DATA flag in nfs_revalidate_mapping 2014-01-27 15:35:56 -05:00
pagelist.c VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
pnfs.c Merge branch 'bugfixes' 2015-04-23 15:16:27 -04:00
pnfs.h NFS client updates for Linux 4.1 2015-04-26 17:33:59 -07:00
pnfs_dev.c NFSv4.1: Don't cache deviceids that have no notifications 2015-03-27 12:32:24 -04:00
pnfs_nfs.c Merge branch 'bugfixes' 2015-04-23 15:16:27 -04:00
proc.c VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
read.c NFS client updates for Linux 4.1 2015-04-26 17:33:59 -07:00
super.c NFS client updates for Linux 4.1 2015-04-26 17:33:59 -07:00
symlink.c VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
sysctl.c nfs: convert use of typedef ctl_table to struct ctl_table 2014-06-06 16:08:16 -07:00
unlink.c VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
write.c nfs: stat(2) fails during cthon04 basic test5 on NFSv4.0 2015-05-13 14:56:03 -04:00