linux/fs/nfs
Frank Filz 137d6acaa6 NFSv4: Make sure unlock is really an unlock when cancelling a lock
I ran into a curious issue when a lock is being canceled. The
cancellation results in a lock request to the vfs layer instead of an
unlock request. This is particularly insidious when the process that
owns the lock is exiting. In that case, sometimes the erroneous lock is
applied AFTER the process has entered zombie state, preventing the lock
from ever being released. Eventually other processes block on the lock
causing a slow degradation of the system. In the 2.6.16 kernel on which
this was investigated, the problem is compounded by the fact that the
cl_sem is held while blocking on the vfs lock, which results in most
processes accessing the nfs file system in question hanging.

In more detail, here is how the situation occurs:

first _nfs4_do_setlk():

static int _nfs4_do_setlk(struct nfs4_state *state, int cmd, struct file_lock *fl, int reclaim)
...
        ret = nfs4_wait_for_completion_rpc_task(task);
        if (ret == 0) {
...
        } else
                data->cancelled = 1;

then nfs4_lock_release():

static void nfs4_lock_release(void *calldata)
...
        if (data->cancelled != 0) {
                struct rpc_task *task;
                task = nfs4_do_unlck(&data->fl, data->ctx, data->lsp,
                                data->arg.lock_seqid);

The problem is that the same file_lock that was passed in to
_nfs4_do_setlk() gets passed to nfs4_do_unlck() from nfs4_lock_release().
So the type is still F_RDLCK or F_WRLCK, not F_UNLCK. At some point, when
cancelling the lock, the type needs to be changed to F_UNLCK. It seemed easiest to do
lock, the type needs to be changed to F_UNLCK. It seemed easiest to do
that in nfs4_do_unlck(), but it could be done in nfs4_lock_release().
The concern I had with doing it there was if something still needed the
original file_lock, though it turns out the original file_lock still
needs to be modified by nfs4_do_unlck() because nfs4_do_unlck() uses the
original file_lock to pass to the vfs layer, and a copy of the original
file_lock for the RPC request.

It seems like the simplest solution is to force all situations where
nfs4_do_unlck() is being used to result in an unlock, so with that in
mind, I made the following change:
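
The diff itself is not reproduced in this listing. As a rough illustration
only, the effect described above can be modelled in userspace C. This is a
sketch under stated assumptions, not the kernel patch: struct model_lock,
model_vfs_lock() and both model_do_unlck_*() functions are hypothetical
stand-ins for struct file_lock, the vfs lock call and nfs4_do_unlck().

```c
#include <assert.h>
#include <fcntl.h>   /* F_RDLCK, F_WRLCK, F_UNLCK */
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct file_lock. */
struct model_lock {
    short type;      /* F_RDLCK, F_WRLCK or F_UNLCK */
};

/* Hypothetical stand-in for the vfs layer: remembers the last request type. */
static short vfs_last_request = -1;

static void model_vfs_lock(const struct model_lock *fl)
{
    vfs_last_request = fl->type;
}

/*
 * Unlock path as described in the problem statement: the lock passed in
 * by the cancellation path is forwarded unchanged, so a cancelled
 * F_WRLCK request reaches the vfs layer as a lock, not an unlock.
 */
static void model_do_unlck_unfixed(struct model_lock *fl)
{
    assert(fl != NULL);
    model_vfs_lock(fl);
}

/*
 * Unlock path with the type forced: whatever the caller passed in,
 * the vfs layer always sees F_UNLCK.
 */
static void model_do_unlck_fixed(struct model_lock *fl)
{
    assert(fl != NULL);
    fl->type = F_UNLCK;
    model_vfs_lock(fl);
}
```

Forcing the type inside the unlock function, rather than in each caller,
means every path that reaches it produces an unlock, which is the design
choice argued for above.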

Signed-off-by: Frank Filz <ffilzlnx@us.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:49 -04:00
Makefile NFS: Remake nfsroot_mount as a permanent part of NFS client 2007-07-10 23:40:46 -04:00
callback.c [PATCH] knfsd: SUNRPC: Provide room in svc_rqst for larger addresses 2007-02-12 09:48:36 -08:00
callback.h NFS: Fix more sparse warnings 2007-05-14 19:33:46 -04:00
callback_proc.c [PATCH] fs/nfs/callback* passes error values big-endian 2006-10-20 10:26:40 -07:00
callback_xdr.c [PATCH] knfsd: SUNRPC: Provide room in svc_rqst for larger addresses 2007-02-12 09:48:36 -08:00
client.c NFSv4: Reduce the chances of an open_owner identifier collision 2007-07-10 23:40:39 -04:00
delegation.c NFSv4: Defer inode revalidation when setting up a delegation 2007-07-10 23:40:41 -04:00
delegation.h NFSv4: Use RCU to protect delegations 2007-07-10 23:40:41 -04:00
dir.c NFS: Fix an Oops in the nfs_access_cache_shrinker() 2007-07-10 23:40:25 -04:00
direct.c NFS: Replace vfsmount and dentry in nfs_open_context with struct path 2007-07-10 23:40:23 -04:00
file.c sendfile: convert nfs to using splice_read() 2007-07-10 08:04:14 +02:00
getroot.c NFS: Kill the obsolete NFS_PARANOIA 2007-05-09 17:58:01 -04:00
idmap.c NFS: use __set_current_state() 2007-05-09 17:58:01 -04:00
inode.c NFSv4: Defer inode revalidation when setting up a delegation 2007-07-10 23:40:41 -04:00
internal.h NFS: Clean-up: use correct type when converting NFS blocks to local blocks 2007-07-10 23:40:44 -04:00
iostat.h
mount_clnt.c NFS: Improve debugging output in NFS in-kernel mount client 2007-07-10 23:40:47 -04:00
namespace.c [PATCH] mark struct inode_operations const 2 2007-02-12 09:48:46 -08:00
nfs2xdr.c SUNRPC: Remove the tk_auth macro... 2007-07-10 23:40:37 -04:00
nfs3acl.c NFSv3: Client-side nfsacl caching fix 2006-06-09 09:34:11 -04:00
nfs3proc.c NFS: nfs3_proc_create() should use nfs_post_op_update_inode() 2007-07-10 23:40:25 -04:00
nfs3xdr.c SUNRPC: Remove the tk_auth macro... 2007-07-10 23:40:37 -04:00
nfs4_fs.h NFSv4: Make the NFS state model work with the nosharedcache mount option 2007-07-10 23:40:48 -04:00
nfs4namespace.c NFSv4: /proc/mounts displays the wrong server name for referrals 2007-02-03 15:35:10 -08:00
nfs4proc.c NFSv4: Make sure unlock is really an unlock when cancelling a lock 2007-07-10 23:40:49 -04:00
nfs4renewd.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
nfs4state.c NFSv4: Make the NFS state model work with the nosharedcache mount option 2007-07-10 23:40:48 -04:00
nfs4xdr.c NFSv4: Reduce the chances of an open_owner identifier collision 2007-07-10 23:40:39 -04:00
nfsroot.c NFS: Remake nfsroot_mount as a permanent part of NFS client 2007-07-10 23:40:46 -04:00
pagelist.c NFS: Replace NFS_I(inode)->req_lock with inode->i_lock 2007-07-10 23:40:38 -04:00
proc.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
read.c NFS: Replace vfsmount and dentry in nfs_open_context with struct path 2007-07-10 23:40:23 -04:00
super.c NFS: Error when mounting the same filesystem with different options 2007-07-10 23:40:48 -04:00
symlink.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
sysctl.c [PATCH] nfs: fix congestion control 2007-03-16 19:25:05 -07:00
unlink.c
write.c NFS: Replace NFS_I(inode)->req_lock with inode->i_lock 2007-07-10 23:40:38 -04:00