Commit Graph

682 Commits

Author SHA1 Message Date
Jeff Layton 2a4317c554 nfsd: add nfsd4_client_tracking_ops struct and a way to set it
Abstract out the mechanism that we use to track clients into a set of
client name tracking functions.

This gives us a mechanism to plug in a new set of client tracking
functions without disturbing the callers. It also gives us a way to
decide on what tracking scheme to use at runtime.

For now, this just looks like pointless abstraction, but later we'll
add a new alternate scheme for tracking clients on stable storage.

Note too that this patch anticipates the eventual containerization
of this code by passing in struct net pointers in places. No attempt
is made to containerize the legacy client tracker however.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-03-26 11:49:47 -04:00
Jeff Layton a52d726bbd nfsd: convert nfs4_client->cl_cb_flags to a generic flags field
We'll need a way to flag the nfs4_client as already being recorded on
stable storage so that we don't continually upcall. Currently, that's
recorded in the cl_firststate field of the client struct. Using an
entire u32 to store a flag is rather wasteful though.

The cl_cb_flags field is only using 2 bits right now, so repurpose that
to a generic flags field. Rename NFSD4_CLIENT_KILL to
NFSD4_CLIENT_CB_KILL to make it evident that it's part of the callback
flags. Add a mask that we can use for existing checks that look to see
whether any flags are set, so that the new flags don't interfere.

Convert all references to cl_firstate to the NFSD4_CLIENT_STABLE flag,
and add a new NFSD4_CLIENT_RECLAIM_COMPLETE flag.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-03-26 11:49:47 -04:00
J. Bruce Fields 1df00640c9 Merge nfs containerization work from Trond's tree
The nfs containerization work is a prerequisite for Jeff Layton's reboot
recovery rework.
2012-03-26 11:48:54 -04:00
Chuck Lever ab4684d156 NFSD: Fix nfs4_verifier memory alignment
Clean up due to code review.

The nfs4_verifier's data field is not guaranteed to be u32-aligned.
Casting an array of chars to a u32 * is considered generally
hazardous.

We can fix most of this by using a __be32 array to generate the
verifier's contents and then byte-copying it into the verifier field.

However, there is one spot where there is a backwards compatibility
constraint: the do_nfsd_create() call expects a verifier which is
32-bit aligned.  Fix this spot by forcing the alignment of the create
verifier in the nfsd4_open args structure.

Also, sizeof(nfs4_verifer) is the size of the in-core verifier data
structure, but NFS4_VERIFIER_SIZE is the number of octets in an XDR'd
verifier.  The two are not interchangeable, even if they happen to
have the same value.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-03-20 15:36:15 -04:00
Trond Myklebust 8f199b8262 NFSD: Fix warnings when NFSD_DEBUG is not defined
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-03-20 15:34:19 -04:00
Benny Halevy 508dc6e110 nfsd41: free_session/free_client must be called under the client_lock
The session client is manipulated under the client_lock hence
both free_session and nfsd4_del_conns must be called under this lock.

This patch adds a BUG_ON that checks this condition in the
respective functions and implements the missing locks.

nfsd4_{get,put}_session helpers were moved to the C file that uses them
so to prevent use from external files and an unlocked version of
nfsd4_put_session is provided for external use from nfs4xdr.c

Signed-off-by: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-03-06 18:13:35 -05:00
Benny Halevy e27f49c33b nfsd41: refactor nfsd4_deleg_xgrade_none_ext logic out of nfsd4_process_open2
Handle the case where the nfsv4.1 client asked to uprade or downgrade
its delegations and server returns no delegation.

In this case, op_delegate_type is set to NFS4_OPEN_DELEGATE_NONE_EXT
and op_why_no_deleg is set respectively to WND4_NOT_SUPP_{UP,DOWN}GRADE

Signed-off-by: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-03-06 18:13:35 -05:00
Benny Halevy 4aa8913cb0 nfsd41: refactor nfs4_open_deleg_none_ext logic out of nfs4_open_delegation
When a 4.1 client asks for a delegation and the server returns none
op_delegate_type is set to NFS4_OPEN_DELEGATE_NONE_EXT
and op_why_no_deleg is set to either WND4_CONTENTION or WND4_RESOURCE.
Or, if the client sent a NFS4_SHARE_WANT_CANCEL (which it is not supposed
to ever do until our server supports delegations signaling),
op_why_no_deleg is set to WND4_CANCELLED.

Note that for WND4_CONTENTION and WND4_RESOURCE, the xdr layer is hard coded
at this time to encode boolean FALSE for ond_server_will_push_deleg /
ond_server_will_signal_avail.

Signed-off-by: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-03-06 18:13:34 -05:00
J. Bruce Fields a8ae08ebf1 nfsd4: fix recovery-entry leak nfsd startup failure
Another leak on error

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-03-06 18:13:32 -05:00
Jeff Layton a6d6b7811c nfsd4: fix recovery-dir leak on nfsd startup failure
The current code never calls nfsd4_shutdown_recdir if nfs4_state_start
returns an error. Also, it's better to go ahead and consolidate these
functions since one is just a trivial wrapper around the other.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-03-06 18:13:25 -05:00
J. Bruce Fields 393d8ed80f nfsd4: purge stable client records with insufficient state
To escape having your stable storage record purged at the end of the
grace period, it's not sufficient to simply have performed a
setclientid_confirm; you also need to meet the same requirements as
someone creating a new record: either you should have done an open or
open reclaim (in the 4.0 case) or a reclaim_complete (in the 4.1 case).

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-03-06 18:13:24 -05:00
J. Bruce Fields 1255a8f36c nfsd4: don't set cl_firststate on first reclaim in 4.1 case
We set cl_firststate when we first decide that a client will be
permitted to reclaim state on next boot.  This happens:

	- for new 4.0 clients, when they confirm their first open
	- for returning 4.0 clients, when they reclaim their first open
	- for 4.1+ clients, when they perform reclaim_complete

We also use cl_firststate to decide whether a reclaim_complete has
already been performed, in the 4.1+ case.

We were setting it on 4.1 open reclaims, which caused spurious
COMPLETE_ALREADY errors on RECLAIM_COMPLETE from an nfs4.1 client with
anything to reclaim.

Reported-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-03-06 18:13:23 -05:00
Benny Halevy d24433cdc9 nfsd41: implement NFS4_SHARE_WANT_NO_DELEG, NFS4_OPEN_DELEGATE_NONE_EXT, why_no_deleg
Respect client request for not getting a delegation in NFSv4.1
Appropriately return delegation "type" NFS4_OPEN_DELEGATE_NONE_EXT
and WND4_NOT_WANTED reason.

[nfsd41: add missing break when encoding op_why_no_deleg]
Signed-off-by: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-02-17 18:38:53 -05:00
Bryan Schumaker 03cfb42025 NFSD: Clean up the test_stateid function
When I initially wrote it, I didn't understand how lists worked so I
wrote something that didn't use them.  I think making a list of stateids
to test is a more straightforward implementation, especially compared to
especially compared to decoding stateids while simultaneously encoding
a reply to the client.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-02-17 18:38:52 -05:00
Benny Halevy 2c8bd7e0d1 nfsd41: split out share_access want and signal flags while decoding
Signed-off-by: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-02-17 18:38:42 -05:00
Benny Halevy 00b5f95a26 nfsd41: share_access_to_flags should consider only nfs4.x share_access flags
Currently, it will not correctly ignore any nfsv4.1 signal flags
if the client sends them.

Signed-off-by: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-02-17 11:50:36 -05:00
Tigran Mkrtchyan 37c593c573 nfsd41: use current stateid by value
Signed-off-by: Tigran Mkrtchyan <kofemann@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-02-15 11:20:45 -05:00
Tigran Mkrtchyan 9428fe1abb nfsd41: consume current stateid on DELEGRETURN and OPENDOWNGRADE
Signed-off-by: Tigran Mkrtchyan <kofemann@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-02-15 11:20:44 -05:00
Tigran Mkrtchyan 1e97b5190d nfsd41: handle current stateid in SETATTR and FREE_STATEID
Signed-off-by: Tigran Mkrtchyan <kofemann@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-02-15 11:20:43 -05:00
Tigran Mkrtchyan 30813e2773 nfsd41: consume current stateid on read and write
Signed-off-by: Tigran Mkrtchyan <kofemann@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-02-15 11:20:40 -05:00
Tigran Mkrtchyan 62cd4a591c nfsd41: handle current stateid on lock and locku
Signed-off-by: Tigran Mkrtchyan <kofemann@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-02-15 11:20:39 -05:00
Tigran Mkrtchyan 8b70484c67 nfsd41: handle current stateid in open and close
Signed-off-by: Tigran Mkrtchyan <kofemann@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-02-15 11:20:38 -05:00
Tigran Mkrtchyan 19ff0f288c nfsd4: initialize current stateid at compile time
Signed-off-by: Tigran Mkrtchyan <kofemann@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-02-15 11:20:29 -05:00
J. Bruce Fields bf5c43c8f1 nfsd4: check for uninitialized slot
This fixes an oops when a buggy client tries to use an initial seqid of
0 on a new slot, which we may misinterpret as a replay.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-02-14 17:01:58 -05:00
J. Bruce Fields 73e79482b4 nfsd4: rearrange struct nfsd4_slot
Combine two booleans into a single flag field, move the smaller fields
to the end.

(In practice this doesn't make the struct any smaller.  But we'll be
adding another flag here soon.)

Remove some debugging code that doesn't look useful, while we're in the
neighborhood.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-02-14 17:01:29 -05:00
J. Bruce Fields f6d82485e9 nfsd4: fix sessions slotid wraparound logic
From RFC 5661 2.10.6.1: "If the previous sequence ID was 0xFFFFFFFF,
then the next request for the slot MUST have the sequence ID set to
zero."

While we're there, delete some redundant comments.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-02-13 16:15:18 -05:00
Stanislav Kinsbursky f2ac4dc911 SUNRPC: parametrize rpc_uaddr2sockaddr() by network context
Parametrize rpc_uaddr2sockaddr() by network context and thus force it's callers to pass
in network context instead of using hard-coded "init_net".

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-01-31 19:28:12 -05:00
Linus Torvalds 0b48d42235 Merge branch 'for-3.3' of git://linux-nfs.org/~bfields/linux
* 'for-3.3' of git://linux-nfs.org/~bfields/linux: (31 commits)
  nfsd4: nfsd4_create_clid_dir return value is unused
  NFSD: Change name of extended attribute containing junction
  svcrpc: don't revert to SVC_POOL_DEFAULT on nfsd shutdown
  svcrpc: fix double-free on shutdown of nfsd after changing pool mode
  nfsd4: be forgiving in the absence of the recovery directory
  nfsd4: fix spurious 4.1 post-reboot failures
  NFSD: forget_delegations should use list_for_each_entry_safe
  NFSD: Only reinitilize the recall_lru list under the recall lock
  nfsd4: initialize special stateid's at compile time
  NFSd: use network-namespace-aware cache registering routines
  SUNRPC: create svc_xprt in proper network namespace
  svcrpc: update outdated BKL comment
  nfsd41: allow non-reclaim open-by-fh's in 4.1
  svcrpc: avoid memory-corruption on pool shutdown
  svcrpc: destroy server sockets all at once
  svcrpc: make svc_delete_xprt static
  nfsd: Fix oops when parsing a 0 length export
  nfsd4: Use kmemdup rather than duplicating its implementation
  nfsd4: add a separate (lockowner, inode) lookup
  nfsd4: fix CONFIG_NFSD_FAULT_INJECTION compile error
  ...
2012-01-14 12:26:41 -08:00
Bryan Schumaker 2d3475c0ad NFSD: forget_delegations should use list_for_each_entry_safe
Otherwise the for loop could try to use a file recently removed from the
file_hashtbl list and oops.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Tested-by: Casey Bodley <cbodley@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-12-14 17:38:00 -05:00
Bryan Schumaker 39c4cc0fcc NFSD: Only reinitilize the recall_lru list under the recall lock
unhash_delegation() will grab the recall lock before calling
list_del_init() in each of these places.  This patch removes the
redundant calls.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-12-13 17:11:45 -05:00
J. Bruce Fields f32f3c2d3f nfsd4: initialize special stateid's at compile time
Stateid's with "other" ("opaque") field all zeros or all ones are
reserved.  We define all_ones separately on the off chance there will be
more such some day, though currently all the other special stateid's
have zero other field.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-12-12 15:27:00 -05:00
Justin P. Mattock 42b2aa86c6 treewide: Fix typos in various parts of the kernel, and fix some comments.
The below patch fixes some typos in various parts of the kernel, as well as fixes some comments.
Please let me know if I missed anything, and I will try to get it changed and resent.

Signed-off-by: Justin P. Mattock <justinmattock@gmail.com>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-12-02 14:57:31 +01:00
Thomas Meyer 67114fe610 nfsd4: Use kmemdup rather than duplicating its implementation
The semantic patch that makes this change is available
in scripts/coccinelle/api/memdup.cocci.

Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-11-25 18:44:22 -05:00
J. Bruce Fields 009673b439 nfsd4: add a separate (lockowner, inode) lookup
Address the possible performance regression mentioned in "nfsd4: hash
lockowners to simplify RELEASE_LOCKOWNER" by providing a separate
(lockowner, inode) hash.

Really, I doubt this matters much, but I think it's likely we'll change
these data structures here and I'd rather that the need for (owner,
inode) lookups be well-documented.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-11-15 19:26:08 -05:00
J. Bruce Fields 353de31b86 nfsd4: fix CONFIG_NFSD_FAULT_INJECTION compile error
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-11-15 19:26:07 -05:00
J. Bruce Fields 16bfdaafa2 nfsd4: share open and lock owner hash tables
Now that they're used in the same way, it's a little simpler to put open
and lock owners in the same hash table, and I can't see a reason not to.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-11-08 11:28:45 -05:00
J. Bruce Fields 06f1f864d4 nfsd4: hash lockowners to simplify RELEASE_LOCKOWNER
Hash lockowners on just the owner string rather than on (owner, inode).
This makes the owner-string lookup needed for RELEASE_LOCKOWNER simpler
(currently it's doing at a linear search through the entire hash
table!).  That may come at the expense of making (owner, inode) lookups
more expensive if a client reuses the same lockowner across multiple
files.  We might add a separate lookup for that.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-11-07 21:10:48 -05:00
Bryan Schumaker 7208339607 NFSD: Call nfsd4_init_slabs() from init_nfsd()
init_nfsd() was calling free_slabs() during cleanup code, but the call
to init_slabs() was hidden in nfsd4_state_init().  This could be
confusing to people unfamiliar with the code.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-11-07 21:10:47 -05:00
Bryan Schumaker 65178db42a NFSD: Added fault injection
Fault injection on the NFS server makes it easier to test the client's
state manager and recovery threads.  Simulating errors on the server is
easier than finding the right conditions that cause them naturally.

This patch uses debugfs to add a simple framework for fault injection to
the server.  This framework is a config option, and can be enabled
through CONFIG_NFSD_FAULT_INJECTION.  Assuming you have debugfs mounted
to /sys/debug, a set of files will be created in /sys/debug/nfsd/.
Writing to any of these files will cause the corresponding action and
write a log entry to dmesg.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-11-07 21:10:47 -05:00
J. Bruce Fields 64a284d07c nfsd4: maintain one seqid stream per (lockowner, file)
Instead of creating a new lockowner and stateid for every
open_to_lockowner call, reuse the existing lockowner if it exists.

Reported-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-11-07 21:10:47 -05:00
J. Bruce Fields 684e563858 nfsd4: cleanup lock clientid handling in sessions case
I'd rather the "ignore clientid in sessions case" rule be enforced in
just one place.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-11-07 21:10:47 -05:00
J. Bruce Fields b93d87c198 nfsd4: fix lockowner matching
Lockowners are looked up by file as well as by owner, but we were
forgetting to do a comparison on the file.  This could cause an
incorrect result from lockt.

(Note looking up the inode from the lockowner is pretty awkward here.
The data structures need fixing.)

Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-11-07 21:10:47 -05:00
Mi Jinlong 345c284290 nfs41: implement DESTROY_CLIENTID operation
According to rfc5661 18.50, implement DESTROY_CLIENTID operation.

Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-10-24 04:24:30 -04:00
Benny Halevy fc0c3dd13b nfsd4: seq->status_flags may be used unitialized
Reported-by: Gopala Suryanarayana <gsuryanarayana@vmware.com>
Signed-off-by: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-10-24 04:24:28 -04:00
Benny Halevy 5423732a71 nfsd41: use SEQ4_STATUS_BACKCHANNEL_FAULT when cb_sequence is invalid
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-10-24 04:24:27 -04:00
J. Bruce Fields 8b289b2c23 nfsd4: implement new 4.1 open reclaim types
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-10-19 11:52:12 -04:00
J. Bruce Fields a8d86cd75b nfsd4: remove unneeded CLAIM_DELEGATE_CUR workaround
0c12eaffdf "nfsd: don't break lease on
CLAIM_DELEGATE_CUR" was a temporary workaround for a problem fixed
properly in the vfs layer by 778fc546f7
"locks: fix tracking of inprogress lease breaks", so we can revert that
change (but keeping some minor cleanup from that commit).

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-10-19 11:42:03 -04:00
J. Bruce Fields 4cdc951b86 nfsd4: preallocate open stateid in process_open1()
As with the nfs4_file, we'd prefer to find out about any failure before
creating a new file rather than after.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-10-17 17:50:07 -04:00
J. Bruce Fields 996e09385c nfsd4: do idr preallocation with stateid allocation
Move idr preallocation out of stateid initialization, into stateid
allocation, so that we no longer have to handle any errors from the
former.

This is a little subtle due to the way the idr code manages these
preallocated items--document that in comments.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-10-17 17:50:07 -04:00
J. Bruce Fields 32513b40ef nfsd4: preallocate nfs4_file in process_open1()
Creating a new file is an irrevocable step--once it's visible in the
filesystem, other processes may have seen it and done something with it,
and unlinking it wouldn't simply undo the effects of the create.

Therefore, in the case where OPEN creates a new file, we shouldn't do
the create until we know that the rest of the OPEN processing will
succeed.

For example, we should preallocate a struct file in case we need it
until waiting to allocate it till process_open2(), which is already too
late.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-10-17 17:50:00 -04:00
J. Bruce Fields d29b20cd58 nfsd4: clean up open owners on OPEN failure
If process_open1() creates a new open owner, but the open later fails,
the current code will leave the open owner around.  It won't be on the
close_lru list, and the client isn't expected to send a CLOSE, so it
will hang around as long as the client does.

Similarly, if process_open1() removes an existing open owner from the
close lru, anticipating that an open owner that previously had no
associated stateid's now will, but the open subsequently fails, then
we'll again be left with the same leak.

Fix both problems.

Reported-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-10-17 17:33:57 -04:00
J. Bruce Fields bcf130f9df nfsd4: simplify process_open1 logic
No change in behavior.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-10-17 17:33:51 -04:00
J. Bruce Fields a50d2ad172 nfsd4: centralize renew_client() calls
There doesn't seem to be any harm to renewing the client a bit earlier,
when it is looked up.  That saves us from having to sprinkle
renew_client calls over quite so many places.

Also remove a redundant comment and do a little cleanup.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-10-17 17:09:37 -04:00
J. Bruce Fields b6d2f1ca3c nfsd4: more robust ignoring of WANT bits in OPEN
Mask out the WANT bits right at the start instead of on each use.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-10-11 12:15:15 -04:00
J. Bruce Fields a084daf512 nfsd4: move name-length checks to xdr
Again, these checks are better in the xdr code.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-10-11 12:15:01 -04:00
J. Bruce Fields 04f9e664b2 nfsd4: move access/deny validity checks to xdr code
I'd rather put more of these sorts of checks into standardized xdr
decoders for the various types rather than have them cluttering up the
core logic in nfs4proc.c and nfs4state.c.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-10-11 08:53:12 -04:00
J. Bruce Fields c30e92df30 nfsd4: ignore WANT bits in open downgrade
We don't use WANT bits yet--and sending them can probably trigger a
BUG() further down.

Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-10-10 18:05:20 -04:00
J. Bruce Fields 6409a5a65d nfsd4: clean up downgrading code
In response to some review comments, get rid of the somewhat obscure
for-loop with bitops, and improve a comment.

Reported-by: Steve Dickson <steved@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-10-10 18:04:45 -04:00
J. Bruce Fields 71c3bcd713 nfsd4: fix state lock usage in LOCKU
In commit 5ec094c109 "nfsd4: extend state
lock over seqid replay logic" I modified the exit logic of all the
seqid-based procedures except nfsd4_locku().  Fix the oversight.

The result of the bug was a double-unlock while handling the LOCKU
procedure, and a warning like:

[  142.150014] WARNING: at kernel/mutex-debug.c:78 debug_mutex_unlock+0xda/0xe0()
...
[  142.152927] Pid: 742, comm: nfsd Not tainted 3.1.0-rc1-SLIM+ #9
[  142.152927] Call Trace:
[  142.152927]  [<ffffffff8105fa4f>] warn_slowpath_common+0x7f/0xc0
[  142.152927]  [<ffffffff8105faaa>] warn_slowpath_null+0x1a/0x20
[  142.152927]  [<ffffffff810960ca>] debug_mutex_unlock+0xda/0xe0
[  142.152927]  [<ffffffff813e4200>] __mutex_unlock_slowpath+0x80/0x140
[  142.152927]  [<ffffffff813e42ce>] mutex_unlock+0xe/0x10
[  142.152927]  [<ffffffffa03bd3f5>] nfs4_lock_state+0x35/0x40 [nfsd]
[  142.152927]  [<ffffffffa03b0b71>] nfsd4_proc_compound+0x2a1/0x690
[nfsd]
[  142.152927]  [<ffffffffa039f9fb>] nfsd_dispatch+0xeb/0x230 [nfsd]
[  142.152927]  [<ffffffffa02b1055>] svc_process_common+0x345/0x690
[sunrpc]
[  142.152927]  [<ffffffff81058d10>] ? try_to_wake_up+0x280/0x280
[  142.152927]  [<ffffffffa02b16e2>] svc_process+0x102/0x150 [sunrpc]
[  142.152927]  [<ffffffffa039f0bd>] nfsd+0xbd/0x160 [nfsd]
[  142.152927]  [<ffffffffa039f000>] ? 0xffffffffa039efff
[  142.152927]  [<ffffffff8108230c>] kthread+0x8c/0xa0
[  142.152927]  [<ffffffff813e8694>] kernel_thread_helper+0x4/0x10
[  142.152927]  [<ffffffff81082280>] ? kthread_worker_fn+0x190/0x190
[  142.152927]  [<ffffffff813e8690>] ? gs_change+0x13/0x13

Reported-by: Bryan Schumaker <bjschuma@netapp.com>
Tested-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-10-10 18:04:45 -04:00
J. Bruce Fields 38c2f4b12a nfsd4: look up stateid's per clientid
Use a separate stateid idr per client, and lookup a stateid by first
finding the client, then looking up the stateid relative to that client.

Also some minor refactoring.

This allows us to improve error returns: we can return expired when the
clientid is not found and bad_stateid when the clientid is found but not
the stateid, as opposed to returning expired for both cases.

I hope this will also help to replace the state lock mostly by a
per-client lock, but that hasn't been done yet.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-09-26 17:35:28 -04:00
J. Bruce Fields 36279ac10c nfsd4: assume test_stateid always has session
Test_stateid is 4.1-only and only allowed after a sequence operation, so
this check is unnecessary.

Cc: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-09-26 17:35:27 -04:00
J. Bruce Fields 6136d2b409 nfsd4: use idr for stateid's
The idr system is designed exactly for generating id and looking up
integer id's.  Thanks to Trond for pointing it out.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-09-26 17:35:26 -04:00
J. Bruce Fields 2a74aba799 nfsd4: move client * to nfs4_stateid, add init_stid helper
This will be convenient.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-09-26 17:35:25 -04:00
J. Bruce Fields 3d02fa29de nfsd4: fix open downgrade, again
Yet another open-management regression:

	- nfs4_file_downgrade() doesn't remove the BOTH access bit on
	  downgrade, so the server's idea of the stateid's access gets
	  out of sync with the client's.  If we want to keep an O_RDWR
	  open in this case, we should do that in the file_put_access
	  logic rather than here.
	- We forgot to convert v4 access to an open mode here.

This logic has proven too hard to get right.  In the future we may
consider:
	- reexamining the lock/openowner relationship (locks probably
	  don't really need to take their own references here).
	- adding open upgrade/downgrade support to the vfs.
	- removing the atomic operations.  They're redundant as long as
	  this is all under some other lock.

Also, maybe some kind of additional static checking would help catch
O_/NFS4_SHARE_ACCESS confusion.

Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-09-20 14:43:39 -04:00
J. Bruce Fields f7a4d87207 nfsd4: hash closed stateid's like any other
Look up closed stateid's in the stateid hash like any other stateid
rather than searching the close lru.

This is simpler, and fixes a bug: currently we handle only the case of a
close that is the last close for a given stateowner, but not the case of
a close for a stateowner that still has active opens on other files.
Thus in a case like:

	open(owner, file1)
	open(owner, file2)
	close(owner, file2)
	close(owner, file2)

the final close won't be recognized as a retransmission.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-09-19 08:39:34 -04:00
J. Bruce Fields d3b313a463 nfsd4: construct stateid from clientid and counter
Including the full clientid in the on-the-wire stateid allows more
reliable detection of bad vs. expired stateid's, simplifies code, and
ensures we won't reuse the opaque part of the stateid (as we currently
do when the same openowner closes and reopens the same file).

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-09-19 06:33:57 -04:00
J. Bruce Fields 2da1cec713 nfsd4: simplify free_stateid
We no longer need is_deleg_stateid, for example.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-09-17 10:31:16 -04:00
J. Bruce Fields 38c387b52d nfsd4: match close replays on stateid, not open owner id
Keep around an unhashed copy of the final stateid after the last close
using an openowner, and when identifying a replay, match against that
stateid instead of just against the open owner id.  Free it the next
time the seqid is bumped or the stateowner is destroyed.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-09-17 10:01:54 -04:00
J. Bruce Fields dad1c067eb nfsd4: replace oo_confirmed by flag bit
I want at least one more bit here.  So, let's haul out the caps lock key
and add a flags field.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-09-16 17:44:16 -04:00
Mi Jinlong 849a1cf13d SUNRPC: Replace svc_addr_u by sockaddr_storage
For IPv6 local address, lockd can not callback to client for
missing scope id when binding address at inet6_bind:

 324       if (addr_type & IPV6_ADDR_LINKLOCAL) {
 325               if (addr_len >= sizeof(struct sockaddr_in6) &&
 326                   addr->sin6_scope_id) {
 327                       /* Override any existing binding, if another one
 328                        * is supplied by user.
 329                        */
 330                       sk->sk_bound_dev_if = addr->sin6_scope_id;
 331               }
 332
 333               /* Binding to link-local address requires an interface */
 334               if (!sk->sk_bound_dev_if) {
 335                       err = -EINVAL;
 336                       goto out_unlock;
 337               }

Replacing svc_addr_u by sockaddr_storage, let rqstp->rq_daddr contains more info
besides address.

Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-09-14 08:21:48 -04:00
J. Bruce Fields ee626a77d3 nfsd4: better stateid hashing
First, we shouldn't care here about the structure of the opaque part of
the stateid.  Second, this hash is really dumb.  (I'm not sure the
replacement is much better, though--to look at it another patch.)

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-09-13 18:30:36 -04:00
J. Bruce Fields 69064a2764 nfsd4: use deleg changes to cleanup preprocess_stateid_op
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-09-13 18:30:36 -04:00
J. Bruce Fields 97b7e3b6d4 nfsd4: fix test_stateid for delegation stateid's
Test_stateid should handle delegation stateid's as well.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-09-13 18:30:35 -04:00
J. Bruce Fields f459e45359 nfsd4: hash deleg stateid's like any other
It's simpler to look up delegation stateid's in the same hash table as
any other stateid.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-09-13 18:30:34 -04:00
J. Bruce Fields 36d44c6038 nfsd4: share common stid-hashing helper function
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-09-13 18:30:33 -04:00
J. Bruce Fields d5477a8db8 nfsd4: add common dl_stid field to delegation
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-09-13 18:30:32 -04:00
J. Bruce Fields dcef0413da nfsd4: move some of nfs4_stateid into a separate structure
We want delegations to share more with open/lock stateid's, so first
we'll pull out some of the common stuff we want to share.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-09-13 18:29:58 -04:00
J. Bruce Fields 91a8c04031 nfsd4: remove redundant stateid initialization
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-09-13 18:29:04 -04:00
J. Bruce Fields 881ea2b11e nfsd4: rename init_stateid
Note this is actually open-stateid specific.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-09-13 18:29:03 -04:00
J. Bruce Fields 2288d0e395 nfsd4: pass around typemask instead of flags
We're only using those flags to choose lock or open stateid's at this
point.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-09-13 18:29:00 -04:00
J. Bruce Fields c0a5d93efb nfsd4: split preprocess_seqid, cleanup
Move most of this into helper functions.  Also move the non-CONFIRM case
into caller, providing a helper function for that purpose.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-09-13 18:27:35 -04:00
J. Bruce Fields 4d71ab8751 nfsd4: split up find_stateid
Minor cleanup.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-09-13 18:27:31 -04:00
J. Bruce Fields 4581d14099 nfsd4: rearrange to avoid a forward reference
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-09-13 18:25:39 -04:00
J. Bruce Fields 4665e2bac5 nfsd4: split out some free_generic_stateid code
We'll use this elsewhere.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-09-07 09:47:23 -04:00
J. Bruce Fields fe0750e5c4 nfsd4: split stateowners into open and lockowners
The stateowner has some fields that only make sense for openowners, and
some that only make sense for lockowners, and I find it a lot clearer if
those are separated out.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-09-07 09:45:49 -04:00
J. Bruce Fields f4dee24cca nfsd4: move CLOSE_STATE special case to caller
Move the CLOSE_STATE case into the unique caller that cares about it
rather than putting it in preprocess_seqid_op.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-09-03 23:15:28 -04:00
J. Bruce Fields 68b66e8270 nfsd4: move double-confirm test to open_confirm
I don't see the point of having this check in nfs4_preprocess_seqid_op()
when it's only needed by the one caller.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-09-03 05:01:52 -04:00
J. Bruce Fields 77eaae8d44 nfsd4: simplify check_open logic
Sometimes the single-exit style is good, sometimes it's unnecessarily
convoluted....

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-09-02 19:59:29 -04:00
J. Bruce Fields 7a8711c9a6 nfsd4: share common seqid checks
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-09-02 19:59:24 -04:00
J. Bruce Fields 16d259418b nfsd4: eliminate unused lt_stateowner
This is used only as a local variable.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-09-01 11:35:30 -04:00
J. Bruce Fields 7c13f344cf nfsd4: drop most stateowner refcounting
Maybe we'll bring it back some day, but we don't have much real use for
it now.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-09-01 11:12:47 -04:00
J. Bruce Fields fff6ca9cc4 nfsd4: eliminate impossible open replay case
If open fails with any error other than nfserr_replay_me, then the main
nfsd4_proc_compound() loop continues unconditionally to
nfsd4_encode_operation(), which will always call encode_seqid_op_tail.
Thus the condition we check for here does not occur.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-09-01 07:29:01 -04:00
J. Bruce Fields 5ec094c109 nfsd4: extend state lock over seqid replay logic
There are currently a couple races in the seqid replay code: a
retransmission could come while we're still encoding the original reply,
or a new seqid-mutating call could come as we're encoding a replay.

So, extend the state lock over the encoding (both encoding of a replayed
reply and caching of the original encoded reply).

I really hate doing this, and previously added the stateowner
reference-counting code to avoid it (which was insufficient)--but I
don't see a less complicated alternative at the moment.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-09-01 07:07:59 -04:00
J. Bruce Fields 9072d5c66b nfsd4: cleanup seqid op stateowner usage
Now that the replay owner is in the cstate we can remove it from a lot
of other individual operations and further simplify
nfs4_preprocess_seqid_op().

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-08-31 17:56:03 -04:00
J. Bruce Fields f3e4223751 nfsd4: centralize handling of replay owners
Set the stateowner associated with a replay in one spot in
nfs4_preprocess_seqid_op() and keep it in cstate.  This allows removing
a few lines of boilerplate from all the nfs4_preprocess_seqid_op()
callers.

Also turn ENCODE_SEQID_OP_TAIL into a function while we're here.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-08-31 17:56:02 -04:00
J. Bruce Fields 73997dc418 nfsd4: make delegation stateid's seqid start at 1
Thanks to Casey for reminding me that 5661 gives a special meaning to a
value of 0 in the stateid's seqid field, so all stateid's should start
out with si_generation 1.  We were doing that in the open and lock
cases for minorversion 1, but not for the delegation stateid, and not
for openstateid's with v4.0.

It doesn't *really* matter much for v4.0 or for delegation stateid's
(which never get the seqid field incremented), but we may as well do the
same for all of them.

Reported-by: Casey Bodley <cbodley@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-08-31 17:56:01 -04:00
J. Bruce Fields 81b829655d nfsd4: simplify stateid generation code, fix wraparound
Follow the recommendation from rfc3530bis for stateid generation number
wraparound, simplify some code, and fix or remove incorrect comments.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-08-31 17:56:00 -04:00
J. Bruce Fields b79abaddfe nfsd4: consolidate lock & open stateid tables
There's no reason to have two separate hash tables for open and lock
stateid's.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-08-31 17:56:00 -04:00
J. Bruce Fields 5fa0bbb4ee nfsd4: simplify distinguishing lock & open stateid's
The trick free_stateid is using is a little cheesy, and we'll have more
uses for this field later.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-08-31 17:55:59 -04:00
J. Bruce Fields 3cc9fda40a nfsd4: remove redundant is_open_owner check
When called with OPEN_STATE, preprocess_seqid_op only returns an open
stateid, hence only an open owner.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-08-27 14:21:29 -04:00
J. Bruce Fields b34f27aa5d nfsd4: get lock checks out of preprocess_seqid_op
We've got some lock-specific code here in nfs4_preprocess_seqid_op which
is only used by nfsd4_lock().  Move it to the caller.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-08-27 14:21:28 -04:00
J. Bruce Fields 9afb978400 nfsd4: simplify lock openmode check
Note that the special handling for the lock stateid case is already done
by nfs4_check_openmode() (as of 0292191417
"nfsd4: fix openmode checking on IO using lock stateid") so we no longer
need these two cases in the caller.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-08-27 14:21:27 -04:00
J. Bruce Fields 28dde241cc nfsd4: remove HAS_SESSION
This flag doesn't really buy us anything.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-08-27 14:21:25 -04:00
J. Bruce Fields ff194bd959 nfsd4: cleanup lock/stateowner initialization
Share some common code, stop doing silly things like initializing a list
head immediately before adding it to a list, etc.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-08-27 14:21:24 -04:00
J. Bruce Fields 506f275fff nfsd4: name openowner data structures more clearly
These appear to be generic (for both open and lock owners), but they're
actually just for open owners.  This has confused me more than once.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-08-27 14:21:23 -04:00
J. Bruce Fields ddc04c4163 nfsd4: replace some macros by functions
For all the usual reasons.  (Type safety, readability.)

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-08-27 14:21:22 -04:00
J. Bruce Fields 3e77246393 nfsd4: stop using nfserr_resource for transitory errors
The server is returning nfserr_resource for both permanent errors and
for errors (like allocation failures) that might be resolved by retrying
later.  Save nfserr_resource for the former and use delay/jukebox for
the latter.

Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-08-27 14:21:21 -04:00
J. Bruce Fields 48483bf23a nfsd4: simplify recovery dir setting
Move around some of this code, simplify a bit.

Reviewed-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-08-27 14:21:18 -04:00
J. Bruce Fields 75c096f753 nfsd4: it's OK to return nfserr_symlink
The nfsd4 code has a bunch of special exceptions for error returns which
map nfserr_symlink to other errors.

In fact, the spec makes it clear that nfserr_symlink is to be preferred
over less specific errors where possible.

The patch that introduced it back in 2.6.4 is "kNFSd: correct symlink
related error returns.", which claims that these special exceptions are
represent an NFSv4 break from v2/v3 tradition--when in fact the symlink
error was introduced with v4.

I suspect what happened was pynfs tests were written that were overly
faithful to the (known-incomplete) rfc3530 error return lists, and then
code was fixed up mindlessly to make the tests pass.

Delete these unnecessary exceptions.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-08-26 18:22:50 -04:00
Casey Bodley 0c12eaffdf nfsd: don't break lease on CLAIM_DELEGATE_CUR
CLAIM_DELEGATE_CUR is used in response to a broken lease; allowing it
to break the lease and return EAGAIN leaves the client unable to make
progress in returning the delegation

nfs4_get_vfs_file() now takes struct nfsd4_open for access to the
claim type, and calls nfsd_open() with NFSD_MAY_NOT_BREAK_LEASE when
claim type is CLAIM_DELEGATE_CUR

Signed-off-by: Casey Bodley <cbodley@citi.umich.edu>
Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-07-23 14:58:17 -04:00
J. Bruce Fields 8fb47a4fbf locks: rename lock-manager ops
Both the filesystem and the lock manager can associate operations with a
lock.  Confusingly, one of them (fl_release_private) actually has the
same name in both operation structures.

It would save some confusion to give the lock-manager ops different
names.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-07-20 20:23:19 -04:00
Mi Jinlong ae82a8d06f nfsd41: check the size of request
Check in SEQUENCE that the request doesn't exceed maxreq_sz for the
given session.

Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-07-15 19:00:00 -04:00
Mi Jinlong 1b74c25bc1 nfsd41: error out when client sets maxreq_sz or maxresp_sz too small
According to RFC5661, 18.36.3,

 "if the client selects a value for ca_maxresponsesize such that
  a replier on a channel could never send a response,the server
  SHOULD return NFS4ERR_TOOSMALL in the CREATE_SESSION reply."

So, error out when the client sets a maxreq_sz less than the minimum
possible SEQUENCE request size, or sets a maxresp_sz less than the
minimum possible SEQUENCE reply size.

Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-07-15 18:58:51 -04:00
J. Bruce Fields f197c27196 nfsd4: fix file leak on open_downgrade
Stateid's hold a read reference for a read open, a write reference for a
write open, and an additional one of each for each read+write open.  The
latter wasn't getting put on a downgrade, so something like:

	open RW
	open R
	downgrade to R

was resulting in a file leak.

Also fix an imbalance in an error path.

Regression from 7d94784293 "nfsd4: fix
downgrade/lock logic".

Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-07-15 18:58:49 -04:00
J. Bruce Fields 499f3edc23 nfsd4: remember to put RW access on stateid destruction
Without this, for example,

	open read
	open read+write
	close

will result in a struct file leak.

Regression from 7d94784293 "nfsd4: fix
downgrade/lock logic".

Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-07-15 18:58:49 -04:00
Bryan Schumaker 1745680454 NFSD: Added TEST_STATEID operation
This operation is used by the client to check the validity of a list of
stateids.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-07-15 18:58:48 -04:00
Bryan Schumaker e1ca12dfb1 NFSD: added FREE_STATEID operation
This operation is used by the client to tell the server to free a
stateid.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-07-15 18:58:47 -04:00
Linus Torvalds a74d70b63f Merge branch 'for-2.6.40' of git://linux-nfs.org/~bfields/linux
* 'for-2.6.40' of git://linux-nfs.org/~bfields/linux: (22 commits)
  nfsd: make local functions static
  NFSD: Remove unused variable from nfsd4_decode_bind_conn_to_session()
  NFSD: Check status from nfsd4_map_bcts_dir()
  NFSD: Remove setting unused variable in nfsd_vfs_read()
  nfsd41: error out on repeated RECLAIM_COMPLETE
  nfsd41: compare request's opcnt with session's maxops at nfsd4_sequence
  nfsd v4.1 lOCKT clientid field must be ignored
  nfsd41: add flag checking for create_session
  nfsd41: make sure nfs server process OPEN with EXCLUSIVE4_1 correctly
  nfsd4: fix wrongsec handling for PUTFH + op cases
  nfsd4: make fh_verify responsibility of nfsd_lookup_dentry caller
  nfsd4: introduce OPDESC helper
  nfsd4: allow fh_verify caller to skip pseudoflavor checks
  nfsd: distinguish functions of NFSD_MAY_* flags
  svcrpc: complete svsk processing on cb receive failure
  svcrpc: take advantage of tcp autotuning
  SUNRPC: Don't wait for full record to receive tcp data
  svcrpc: copy cb reply instead of pages
  svcrpc: close connection if client sends short packet
  svcrpc: note network-order types in svc_process_calldir
  ...
2011-05-29 11:21:12 -07:00
Daniel Mack c47d832bc0 nfsd: make local functions static
This also fixes a number of sparse warnings.

Signed-off-by: Daniel Mack <zonque@gmail.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: J. Bruce Fields <bfields@fieldses.org>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-05-18 15:28:31 -04:00
Bryan Schumaker 1db2b9dde3 NFSD: Check status from nfsd4_map_bcts_dir()
Compiling gave me this warning:
fs/nfsd/nfs4state.c: In function ‘nfsd4_bind_conn_to_session’:
fs/nfsd/nfs4state.c:1623:9: warning: variable ‘status’ set but not used
[-Wunused-but-set-variable]

The local variable "status" was being set by nfsd4_map_bcts_dir() and
then ignored before calling nfsd4_new_conn().

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-04-29 20:47:58 -04:00
Mi Jinlong bcecf1ccc3 nfsd41: error out on repeated RECLAIM_COMPLETE
Servers are supposed to return nfserr_complete_already to clients that
attempt to send multiple RECLAIM_COMPLETEs.

Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-04-29 20:47:56 -04:00
Mi Jinlong 868b89c3dc nfsd41: compare request's opcnt with session's maxops at nfsd4_sequence
Make sure nfs server errors out if request contains more ops
than channel allows.

Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com>
[bfields@redhat.com: use helper function]
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-04-29 20:47:55 -04:00
Mi Jinlong a62573dc35 nfsd41: add flag checking for create_session
Teach the NFS server to reject invalid create_session flags.

Also do some minor formatting adjustments.

Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-04-29 20:47:53 -04:00
OGAWA Hirofumi a96e5b9080 nfsd4: Fix filp leak
23fcf2ec93 (nfsd4: fix oops on lock failure)

The above patch breaks free path for stp->st_file. If stp was inserted
into sop->so_stateids, we have to free stp->st_file refcount. Because
stp->st_file refcount itself is taken whether or not any refcounts are
taken on the stp->st_file->fi_fds[].

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-04-19 17:31:13 -04:00
J. Bruce Fields 4ee63624fd nfsd4: fix struct file leak on delegation
Introduced by acfdf5c383.

Cc: stable@kernel.org
Reported-by: Gerhard Heift <ml-nfs-linux-20110412-ef47@gheift.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-04-18 13:30:56 -04:00
Linus Torvalds 18770c7c3a Merge branch 'for-2.6.39' of git://linux-nfs.org/~bfields/linux
* 'for-2.6.39' of git://linux-nfs.org/~bfields/linux:
  nfsd4: fix oops on lock failure
  nfsd: fix auth_domain reference leak on nlm operations
2011-04-11 15:45:17 -07:00
J. Bruce Fields 23fcf2ec93 nfsd4: fix oops on lock failure
Lock stateid's can have access_bmap 0 if they were only partially
initialized (due to a failed lock request); handle that case in
free_generic_stateid.

------------[ cut here ]------------
kernel BUG at fs/nfsd/nfs4state.c:380!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/kernel/mm/ksm/run
Modules linked in: nfs fscache md4 nls_utf8 cifs ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat bridge stp llc nfsd lockd nfs_acl auth_rpcgss sunrpc ipv6 ppdev parport_pc parport pcnet32 mii pcspkr microcode i2c_piix4 BusLogic floppy [last unloaded: mperf]

Pid: 1468, comm: nfsd Not tainted 2.6.38+ #120 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform
EIP: 0060:[<e24f180d>] EFLAGS: 00010297 CPU: 0
EIP is at nfs4_access_to_omode+0x1c/0x29 [nfsd]
EAX: ffffffff EBX: dd758120 ECX: 00000000 EDX: 00000004
ESI: dd758120 EDI: ddfe657c EBP: dd54dde0 ESP: dd54dde0
 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Process nfsd (pid: 1468, ti=dd54c000 task=ddc92580 task.ti=dd54c000)
Stack:
 dd54ddf0 e24f19ca 00000000 ddfe6560 dd54de08 e24f1a5d dd758130 deee3a20
 ddfe6560 31270000 dd54df1c e24f52fd 0000000f dd758090 e2505dd0 0be304cf
 dbb51d68 0000000e ddfe657c ddcd8020 dd758130 dd758128 dd7580d8 dd54de68
Call Trace:
 [<e24f19ca>] free_generic_stateid+0x1c/0x3e [nfsd]
 [<e24f1a5d>] release_lockowner+0x71/0x8a [nfsd]
 [<e24f52fd>] nfsd4_lock+0x617/0x66c [nfsd]
 [<e24e57b6>] ? nfsd_setuser+0x199/0x1bb [nfsd]
 [<e24e056c>] ? nfsd_setuser_and_check_port+0x65/0x81 [nfsd]
 [<c07a0052>] ? _cond_resched+0x8/0x1c
 [<c04ca61f>] ? slab_pre_alloc_hook.clone.33+0x23/0x27
 [<c04cac01>] ? kmem_cache_alloc+0x1a/0xd2
 [<c04835a0>] ? __call_rcu+0xd7/0xdd
 [<e24e0dfb>] ? fh_verify+0x401/0x452 [nfsd]
 [<e24f0b61>] ? nfsd4_encode_operation+0x52/0x117 [nfsd]
 [<e24ea0d7>] ? nfsd4_putfh+0x33/0x3b [nfsd]
 [<e24f4ce6>] ? nfsd4_delegreturn+0xd4/0xd4 [nfsd]
 [<e24ea2c9>] nfsd4_proc_compound+0x1ea/0x33e [nfsd]
 [<e24de6ee>] nfsd_dispatch+0xd1/0x1a5 [nfsd]
 [<e1d6e1c7>] svc_process_common+0x282/0x46f [sunrpc]
 [<e1d6e578>] svc_process+0xdc/0xfa [sunrpc]
 [<e24de0fa>] nfsd+0xd6/0x115 [nfsd]
 [<e24de024>] ? nfsd_shutdown+0x24/0x24 [nfsd]
 [<c0454322>] kthread+0x62/0x67
 [<c04542c0>] ? kthread_worker_fn+0x114/0x114
 [<c07a6ebe>] kernel_thread_helper+0x6/0x10
Code: eb 05 b8 00 00 27 4f 8d 65 f4 5b 5e 5f 5d c3 83 e0 03 55 83 f8 02 89 e5 74 17 83 f8 03 74 05 48 75 09 eb 09 b8 02 00 00 00 eb 0b <0f> 0b 31 c0 eb 05 b8 01 00 00 00 5d c3 55 89 e5 57 56 89 d6 8d
EIP: [<e24f180d>] nfs4_access_to_omode+0x1c/0x29 [nfsd] SS:ESP 0068:dd54dde0
---[ end trace 2b0bf6c6557cb284 ]---

The trace route is:

 -> nfsd4_lock()
   -> if (lock->lk_is_new) {
     -> alloc_init_lock_stateid()

        3739: stp->st_access_bmap = 0;

   ->if (status && lock->lk_is_new && lock_sop)
     -> release_lockowner()
      -> free_generic_stateid()
       -> nfs4_access_bmap_to_omode()
          -> nfs4_access_to_omode()

        380: BUG();   *****

This problem was introduced by 0997b17360.

Reported-by: Mi Jinlong <mijinlong@cn.fujitsu.com>
Tested-by: Mi Jinlong <mijinlong@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-04-10 12:21:27 -04:00
Lucas De Marchi 25985edced Fix common misspellings
Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2011-03-31 11:26:23 -03:00
J. Bruce Fields cf507b6f8e Merge create_session decoding fix into for-2.6.39
This needs a further fixup!
2011-03-17 13:07:25 -04:00
Mi Jinlong d2b217439f nfs41: make sure nfs server return right ca_maxresponsesize_cached
According to rfc5661,

  ca_maxresponsesize_cached:

     Like ca_maxresponsesize, but the maximum size of a reply that
     will be stored in the reply cache (Section 2.10.6.1).  For each
     channel, the server MAY decrease this value, but MUST NOT
     increase it.

the latest kernel(2.6.38-rc8) may increase the value for ignoring
request's ca_maxresponsesize_cached value. We should not ignore it.

Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-03-16 11:10:22 -04:00
J. Bruce Fields 0997b17360 nfsd4: fix struct file leak
Make sure we properly reference count the struct files that a lock
depends on, and release them when the lock stateid is released.

This fixes a major leak of struct files when using locking over nfsv4.

Cc: stable@kernel.org
Reported-by: Rick Koshi <nfs-bug-report@more-right-rudder.com>
Tested-by: Ivo Přikryl <prikryl@eurosat.cz>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-03-08 19:38:27 -05:00
J. Bruce Fields 529d7b2a7f nfsd4: minor nfs4state.c reshuffling
Minor cleanup in preparation for a bugfix--moving some code to avoid
forward references, etc.  No change in functionality.

Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-03-08 19:38:15 -05:00
Shan Wei 35079582e7 nfsd: kill unused macro definition
These macros had never been used for several years.
So, remove them.

Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-03-07 12:05:09 -05:00
J. Bruce Fields 32b007b4e1 nfsd4: fix bad pointer on failure to find delegation
In case of a nonempty list, the return on error here is obviously bogus;
it ends up being a pointer to the list head instead of to any valid
delegation on the list.

In particular, if nfsd4_delegreturn() hits this case, and you're quite unlucky,
then renew_client may oops, and it may take an embarassingly long time to
figure out why.  Facepalm.

BUG: unable to handle kernel NULL pointer dereference at 0000000000000090
IP: [<ffffffff81292965>] nfsd4_delegreturn+0x125/0x200
...

Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-03-07 11:44:53 -05:00
J. Bruce Fields acfdf5c383 nfsd4: acquire only one lease per file
Instead of acquiring one lease each time another client opens a file,
nfsd can acquire just one lease to represent all of them, and reference
count it to determine when to release it.

This fixes a regression introduced by
c45821d263 "locks: eliminate fl_mylease
callback": after that patch, only the struct file * is used to determine
who owns a given lease.  But since we recently converted the server to
share a single struct file per open, if we acquire multiple leases on
the same file from nfsd, it then becomes impossible on unlocking a lease
to determine which of those leases (all of whom share the same struct
file *) we meant to remove.

Thanks to Takashi Iwai <tiwai@suse.de> for catching a bug in a previous
version of this patch.

Tested-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-02-14 10:35:19 -05:00
J. Bruce Fields 5d926e8c2f nfsd4: modify fi_delegations under recall_lock
Modify fi_delegations only under the recall_lock, allowing us to use
that list on lease breaks.

Also some trivial cleanup to simplify later changes.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-02-14 10:35:19 -05:00
J. Bruce Fields 65bc58f518 nfsd4: remove unused deleg dprintk's.
These aren't all that useful, and get in the way of the next steps.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-02-14 10:35:19 -05:00
J. Bruce Fields edab9782b5 nfsd4: split lease setting into separate function
Splitting some code into a separate function which we'll be adding some
more to.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-02-14 10:35:18 -05:00
J. Bruce Fields dd239cc05f nfsd4: fix leak on allocation error
Also share some common exit code.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-02-14 10:35:18 -05:00
J. Bruce Fields 22d38c4c10 nfsd4: add helper function for lease setup
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-02-14 10:35:18 -05:00
J. Bruce Fields 6b57d9c86d nfsd4: split up nfsd_break_deleg_cb
We'll be adding some more code here soon.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-02-14 10:35:18 -05:00
Linus Torvalds 18bce371ae Merge branch 'for-2.6.38' of git://linux-nfs.org/~bfields/linux
* 'for-2.6.38' of git://linux-nfs.org/~bfields/linux: (62 commits)
  nfsd4: fix callback restarting
  nfsd: break lease on unlink, link, and rename
  nfsd4: break lease on nfsd setattr
  nfsd: don't support msnfs export option
  nfsd4: initialize cb_per_client
  nfsd4: allow restarting callbacks
  nfsd4: simplify nfsd4_cb_prepare
  nfsd4: give out delegations more quickly in 4.1 case
  nfsd4: add helper function to run callbacks
  nfsd4: make sure sequence flags are set after destroy_session
  nfsd4: re-probe callback on connection loss
  nfsd4: set sequence flag when backchannel is down
  nfsd4: keep finer-grained callback status
  rpc: allow xprt_class->setup to return a preexisting xprt
  rpc: keep backchannel xprt as long as server connection
  rpc: move sk_bc_xprt to svc_xprt
  nfsd4: allow backchannel recovery
  nfsd4: support BIND_CONN_TO_SESSION
  nfsd4: modify session list under cl_lock
  Documentation: fl_mylease no longer exists
  ...

Fix up conflicts in fs/nfsd/vfs.c with the vfs-scale work.  The
vfs-scale work touched some msnfs cases, and this merge removes support
for that entirely, so the conflict was trivial to resolve.
2011-01-14 13:17:26 -08:00
J. Bruce Fields 5ce8ba25d6 nfsd4: allow restarting callbacks
If we lose the backchannel and then the client repairs the problem,
resend any callbacks.

We use a new cb_done flag to track whether there is still work to be
done for the callback or whether it can be destroyed with the rpc.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-01-11 15:04:11 -05:00
J. Bruce Fields 14a24e99f4 nfsd4: give out delegations more quickly in 4.1 case
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-01-11 15:04:11 -05:00
J. Bruce Fields 84f5f7ccc5 nfsd4: make sure sequence flags are set after destroy_session
If this loses any backchannel, make sure we have a chance to notice that
and set the sequence flags.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-01-11 15:04:11 -05:00
J. Bruce Fields eea4980660 nfsd4: re-probe callback on connection loss
This makes sure we set the sequence flag when necessary.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-01-11 15:04:10 -05:00
J. Bruce Fields 0d7bb71907 nfsd4: set sequence flag when backchannel is down
Implement the SEQ4_STATUS_CB_PATH_DOWN flag.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-01-11 15:04:10 -05:00
J. Bruce Fields 77a3569d6c nfsd4: keep finer-grained callback status
Distinguish between when the callback channel is known to be down, and
when it is not yet confirmed.  This will be useful in the 4.1 case.

Also, we don't seem to be using the fact that this field is atomic.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2011-01-11 15:04:10 -05:00
J. Bruce Fields dcbeaa68db nfsd4: allow backchannel recovery
Now that we have a list of connections to choose from, we can teach the
callback code to just pick a suitable connection and use that, instead
of insisting on forever using the connection that the first
create_session was sent with.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2011-01-11 15:04:10 -05:00
J. Bruce Fields 1d1bc8f207 nfsd4: support BIND_CONN_TO_SESSION
Basic xdr and processing for BIND_CONN_TO_SESSION.  This adds a
connection to the list of connections associated with a session.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-01-11 15:04:09 -05:00
J. Bruce Fields 4c6493785a nfsd4: modify session list under cl_lock
We want to traverse this from the callback code.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2011-01-11 15:04:09 -05:00
Takuma Umeya 6f3d772fb8 nfs4: set source address when callback is generated
when callback is generated in NFSv4 server, it doesn't set the source
address. When an alias IP is utilized on NFSv4 server and suppose the
client is accessing via that alias IP (e.g. eth0:0), the client invokes
the callback to the IP address that is set on the original device (e.g.
eth0). This behavior results in timeout of xprt.
The patch sets the IP address that the client should invoke callback to.

Signed-off-by: Takuma Umeya <tumeya@redhat.com>
[bfields@redhat.com: Simplify gen_callback arguments, use helper function]
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-01-04 19:43:01 -05:00
J. Bruce Fields c45821d263 locks: eliminate fl_mylease callback
The nfs server only supports read delegations for now, so we don't care
how conflicts are determined.  All we care is that unlocks are
recognized as matching the leases they are meant to remove.  After the
last patch, a comparison of struct files will work for that purpose.  So
we no longer need this callback.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-01-04 16:49:28 -05:00
J. Bruce Fields c84d500bc4 nfsd4: use a single struct file for delegations
When we converted to sharing struct filess between nfs4 opens I went too
far and also used the same mechanism for delegations.  But keeping
a reference to the struct file ensures it will outlast the lease, and
allows us to remove the lease with the same file as we added it.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-01-04 16:49:27 -05:00
J. Bruce Fields e63eb93750 nfsd4: eliminate lease delete callback
nfsd controls the lifetime of the lease, not the lock code, so there's
no need for this callback on lease destruction.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-01-04 16:49:26 -05:00
J. Bruce Fields da165dd60e nfsd: remove some unnecessary dropit handling
We no longer need a few of these special cases.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-01-04 16:49:23 -05:00
J. Bruce Fields e203d506bd nfsd4: fix mixed 4.0/4.1 handling, 4.1 reboot
Instead of failing to find client entries which don't match the
minorversion, we should be finding them, then either erroring out or
expiring them as appropriate.

This also fixes a problem which would cause the 4.1 server to fail to
recognize clients after a second reboot.

Reported-by: Casey Bodley <cbodley@citi.umich.edu>
Reviewed-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-12-17 15:48:01 -05:00
J. Bruce Fields 6e5f15c93d nfsd4: replace unintuitive match_clientid_establishment
Reviewed-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-12-17 15:47:41 -05:00
J. Bruce Fields ec66ee3797 Merge commit 'v2.6.37-rc6' into for-2.6.38 2010-12-17 13:29:07 -05:00
Tejun Heo afe2c511fb workqueue: convert cancel_rearming_delayed_work[queue]() users to cancel_delayed_work_sync()
cancel_rearming_delayed_work[queue]() has been superceded by
cancel_delayed_work_sync() quite some time ago.  Convert all the
in-kernel users.  The conversions are completely equivalent and
trivial.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: "David S. Miller" <davem@davemloft.net>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Acked-by: Evgeniy Polyakov <zbr@ioremap.net>
Cc: Jeff Garzik <jgarzik@pobox.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
Cc: netdev@vger.kernel.org
Cc: Anton Vorontsov <cbou@mail.ru>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Neil Brown <neilb@suse.de>
Cc: Alex Elder <aelder@sgi.com>
Cc: xfs-masters@oss.sgi.com
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: netfilter-devel@vger.kernel.org
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: linux-nfs@vger.kernel.org
2010-12-15 10:56:11 +01:00
Mi Jinlong 1205065764 NFS4.1: Fix bug server don't reply the right fore_channel to client at create_session
At the latest kernel(2.6.37-rc1), server just initialize the forechannel
at init_forechannel_attrs, but don't reflect it to reply.

After initialize the session success, we should copy the forechannel info
to nfsd4_create_session struct.

Reviewed-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-11-19 18:35:12 -05:00
Mi Jinlong ced6dfe9fc NFS4.1: server gets drc mem fail should reply error at create_session
When server gets drc mem fail, it should reply error to client.

Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-11-19 18:35:12 -05:00
J. Bruce Fields 044bc1d432 nfsd4: return serverfault on request for ssv
We're refusing to support a mandatory features of 4.1, so serverfault
seems the better error; see e.g.:

	http://www.ietf.org/mail-archive/web/nfsv4/current/msg07638.html

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-11-19 18:35:12 -05:00
Dan Carpenter 43b0178eda nfsd: fix NULL dereference in setattr()
The original code would oops if this were called from nfsd4_setattr()
because "filpp" is NULL.

(Note this case is currently impossible, as long as we only give out
read delegations.)

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-11-19 18:35:11 -05:00
Arnd Bergmann 460781b542 BKL: remove references to lock_kernel from comments
Lock_kernel is gone from the code, so the comments should be updated,
too.  nfsd now uses lock_flocks instead of lock_kernel to protect
against posix file locks.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: J. Bruce Fields <bfields@redhat.com>
Cc: linux-nfs@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-11-17 08:59:32 -08:00
J. Bruce Fields 21b75b0199 nfsd4: fix 4.1 connection registration race
If a connection is closed just after a sequence or create_session
is sent over it, we could end up trying to register a callback that will
never get called since the xprt is already marked dead.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-11-02 17:13:52 -04:00
Christoph Hellwig 51ee4b84f5 locks: let the caller free file_lock on ->setlease failure
The caller allocated it, the caller should free it.

The only issue so far is that we could change the flp pointer even on an
error return if the fl_change callback failed.  But we can simply move
the flp assignment after the fl_change invocation, as the callers don't
care about the flp return value if the setlease call failed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-31 06:35:15 -07:00
J. Bruce Fields fcf744a96c nfsd4: initialize delegation pointer to lease
The NFSv4 server was initializing the dp->dl_flock pointer by the
somewhat ridiculous method of a locks_copy_lock callback.

Now that setlease uses the passed-in lock instead of doing a copy,
dl_flock no longer gets set, resulting in the lock leaking on delegation
release, and later possible hangs (among other problems).

So, initialize dl_flock and get rid of the callback.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-30 18:08:15 -07:00
Arnd Bergmann c5b1f0d92c locks/nfsd: allocate file lock outside of spinlock
As suggested by Christoph Hellwig, this moves allocation
of new file locks out of generic_setlease into the
callers, nfs4_open_delegation and fcntl_setlease in order
to allow GFP_KERNEL allocations when lock_flocks has
become a spinlock.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: J. Bruce Fields <bfields@redhat.com>
2010-10-27 21:41:50 +02:00
Linus Torvalds 4390110fef Merge branch 'for-2.6.37' of git://linux-nfs.org/~bfields/linux
* 'for-2.6.37' of git://linux-nfs.org/~bfields/linux: (99 commits)
  svcrpc: svc_tcp_sendto XPT_DEAD check is redundant
  svcrpc: no need for XPT_DEAD check in svc_xprt_enqueue
  svcrpc: assume svc_delete_xprt() called only once
  svcrpc: never clear XPT_BUSY on dead xprt
  nfsd4: fix connection allocation in sequence()
  nfsd4: only require krb5 principal for NFSv4.0 callbacks
  nfsd4: move minorversion to client
  nfsd4: delay session removal till free_client
  nfsd4: separate callback change and callback probe
  nfsd4: callback program number is per-session
  nfsd4: track backchannel connections
  nfsd4: confirm only on succesful create_session
  nfsd4: make backchannel sequence number per-session
  nfsd4: use client pointer to backchannel session
  nfsd4: move callback setup into session init code
  nfsd4: don't cache seq_misordered replies
  SUNRPC: Properly initialize sock_xprt.srcaddr in all cases
  SUNRPC: Use conventional switch statement when reclassifying sockets
  sunrpc/xprtrdma: clean up workqueue usage
  sunrpc: Turn list_for_each-s into the ..._entry-s
  ...

Fix up trivial conflicts (two different deprecation notices added in
separate branches) in Documentation/feature-removal-schedule.txt
2010-10-26 09:55:25 -07:00
J. Bruce Fields a663bdd8c5 nfsd4: fix connection allocation in sequence()
We're doing an allocation under a spinlock, and ignoring the
possibility of allocation failure.

A better fix wouldn't require an unnecessary allocation in the common
case, but we'll leave that for later.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-24 21:07:07 -04:00
J. Bruce Fields 8323c3b2a6 nfsd4: move minorversion to client
The minorversion seems more a property of the client than the callback
channel.

Some time we should probably also enforce consistent minorversion usage
from the client; for now, this is just a cosmetic change.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-21 10:12:02 -04:00
J. Bruce Fields 792c95dd51 nfsd4: delay session removal till free_client
Have unhash_client_locked() remove client and associated sessions from
global hashes, but delay further dismantling till free_client().

(After unhash_client_locked(), the only remaining references outside the
destroying thread are from any connections which have xpt_user callbacks
registered.)

This will simplify locking on session destruction.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-21 10:11:56 -04:00
J. Bruce Fields 5a3c9d7134 nfsd4: separate callback change and callback probe
Only one of the nfsd4_callback_probe callers actually cares about
changing the callback information.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-21 10:11:55 -04:00
J. Bruce Fields 8b5ce5cd44 nfsd4: callback program number is per-session
The callback program is allowed to depend on the session which the
callback is going over.

No change in behavior yet, while we still only do callbacks over a
single session for the lifetime of the client.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-21 10:11:54 -04:00
J. Bruce Fields d29c374cd2 nfsd4: track backchannel connections
We need to keep track of which connections are available for use with
the backchannel, which for the forechannel, and which for both.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-10-21 10:11:53 -04:00
J. Bruce Fields 86c3e16cc7 nfsd4: confirm only on succesful create_session
Following rfc 5661, section 18.36.4: "If the session is not successfully
created, then no changes are made to any client records on the server."
We shouldn't be confirming or incrementing the sequence id in this case.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-21 10:11:52 -04:00
J. Bruce Fields ac7c46f29a nfsd4: make backchannel sequence number per-session
Currently we don't deal well with a client that has multiple sessions
associated with it (even simultaneously, or serially over the lifetime
of the client).

In particular, we don't attempt to keep the backchannel running after
the original session diseappears.

We will fix that soon.

Once we do that, we need the slot sequence number to be per-session;
otherwise, for example, we cannot correctly handle a case like this:

	- All session 1 connections are lost.
	- The client creates session 2.  We use it for the backchannel
	  (since it's the only working choice).
	- The client gives us a new connection to use with session 1.
	- The client destroys session 2.

At this point our only choice is to go back to using session 1.  When we
do so we must use the sequence number that is next for session 1.  We
therefore need to maintain multiple sequence number streams.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-10-21 10:11:51 -04:00
J. Bruce Fields 90c8145bb6 nfsd4: use client pointer to backchannel session
Instead of copying the sessionid, use the new cl_cb_session pointer,
which indicates which session we're using for the backchannel.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-10-21 10:11:50 -04:00
J. Bruce Fields edd7678663 nfsd4: move callback setup into session init code
The backchannel should  be associated with a session, it isn't really
global to the client.

We do, however, want a pointer global to the client which tracks which
session we're currently using for client-based callbacks.

This is a first step in that direction; for now, just reshuffling of
code with no significant change in behavior.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-10-21 10:11:49 -04:00
J. Bruce Fields cd5b814458 nfsd4: don't cache seq_misordered replies
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-21 10:11:48 -04:00
Arnd Bergmann b89f432133 fs/locks.c: prepare for BKL removal
This prepares the removal of the big kernel lock from the
file locking code. We still use the BKL as long as fs/lockd
uses it and ceph might sleep, but we can flip the definition
to a private spinlock as soon as that's done.
All users outside of fs/lockd get converted to use
lock_flocks() instead of lock_kernel() where appropriate.

Based on an earlier patch to use a spinlock from Matthew
Wilcox, who has attempted this a few times before, the
earliest patch from over 10 years ago turned it into
a semaphore, which ended up being slower than the BKL
and was subsequently reverted.

Someone should do some serious performance testing when
this becomes a spinlock, since this has caused problems
before. Using a spinlock should be at least as good
as the BKL in theory, but who knows...

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Miklos Szeredi <mszeredi@suse.cz>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
Cc: Sage Weil <sage@newdream.net>
Cc: linux-kernel@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org
2010-10-05 11:02:04 +02:00
J. Bruce Fields 3351514215 nfsd4: return expired on unfound stateid's
Commit 78155ed75f "nfsd4: distinguish
expired from stale stateids" attempted to distinguish expired and stale
stateid's using time information that may not have been completely
reliable, so I reverted it.

That was throwing out the baby with the bathwater; we still do want to
return expired, but let's do that using the simpler approach of just
assuming any stateid is expired if it looks like it was given out by the
current server instance, but we can't find it any more.

This may help clients that are recovering from network partitions.

Reported-by: Bian Naimeng <biannm@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-02 18:49:33 -04:00
J. Bruce Fields 328ead2872 nfsd4: add new connections to session
As long as we're not implementing any session security, we should just
automatically add any new connections that come along to the list of
sessions associated with the session.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01 19:29:45 -04:00
J. Bruce Fields db90681d6e nfsd4: refactor connection allocation
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01 19:29:45 -04:00
J. Bruce Fields 19cf5c026f nfsd4: use callbacks on svc_xprt_deletion
Remove connections from the list when they go down.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-10-01 19:29:44 -04:00
J. Bruce Fields c7662518c7 nfsd4: keep per-session list of connections
The spec requires us in various places to keep track of the connections
associated with each session.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-10-01 19:29:44 -04:00
J. Bruce Fields 5b6feee960 nfsd4: clean up session allocation
Changes:
	- make sure session memory reservation is released on failure
	  path.
	- use min_t()/min() for more compact code in several places.
	- break alloc_init_session into smaller pieces.
	- miscellaneous other cleanup.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-10-01 19:29:44 -04:00
J. Bruce Fields dd93842457 nfsd4: fix alloc_init_session return type
This returns an nfs error, not -ERRNO.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01 19:29:44 -04:00
J. Bruce Fields c23753dac1 nfsd4: fix alloc_init_session BUILD_BUG_ON()
Note we're allocating an array of nfsd4_slot *'s, not nfsd4_slot's.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01 19:29:44 -04:00
J. Bruce Fields 6ff8da0887 nfsd4: Move callback setup to callback queue
Instead of creating the new rpc client from a regular server thread,
set a flag, kick off a null call, and allow the null call to do the work
of setting up the client on the callback workqueue.

Use a spinlock to ensure the callback work gets a consistent view of the
callback parameters.

This allows, for example, changing the callback from contexts where
sleeping is not allowed.  I hope it will also keep the locking simple as
we add more session and trunking features, by serializing most of the
callback-specific work.

This also closes a small race where the the new cb_ident could be used
with an old connection (or vice-versa).

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-10-01 19:29:44 -04:00
J. Bruce Fields cee277d924 nfsd4: use generic callback code in null case
This will eventually allow us, for example, to kick off null callback
from contexts where we can't sleep.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-10-01 19:29:43 -04:00
J. Bruce Fields 07263f1efe nfsd4: minor variable renaming (cb -> conn)
Now that we have both nfsd4_callback and nfsd4_cb_conn structures, I get
confused if variables of both types are always named cb....

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-10-01 19:29:43 -04:00
J. Bruce Fields 8f34a430ac nfsd4: mask out non-access bits in nfs4_access_to_omode
This fixes an unnecessary BUG().

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-09-02 15:25:09 -04:00
J. Bruce Fields f632265d0f Merge commit 'v2.6.36-rc1' into HEAD 2010-08-26 13:22:27 -04:00
J. Bruce Fields 7d94784293 nfsd4: fix downgrade/lock logic
If we already had a RW open for a file, and get a readonly open, we were
piggybacking on the existing RW open.  That's inconsistent with the
downgrade logic which blows away the RW open assuming you'll still have
a readonly open.

Also, make sure there is a readonly or writeonly open available for
locking, again to prevent bad behavior in downgrade cases when any RW
open may be lost.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-08-26 13:22:02 -04:00
J. Bruce Fields 30c0e1ef0a nfsd4: bad BUG() in preprocess_stateid_op
It's OK for this function to return without setting filp--we do it in
the special-stateid case.

And there's a legitimate case where we can hit this, since we do permit
reads on write-only stateid's.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-08-26 13:20:51 -04:00
Linus Torvalds 0d9f9e122c Merge branch 'for-2.6.36' of git://linux-nfs.org/~bfields/linux
* 'for-2.6.36' of git://linux-nfs.org/~bfields/linux: (34 commits)
  nfsd4: fix file open accounting for RDWR opens
  nfsd: don't allow setting maxblksize after svc created
  nfsd: initialize nfsd versions before creating svc
  net: sunrpc: removed duplicated #include
  nfsd41: Fix a crash when a callback is retried
  nfsd: fix startup/shutdown order bug
  nfsd: minor nfsd read api cleanup
  gcc-4.6: nfsd: fix initialized but not read warnings
  nfsd4: share file descriptors between stateid's
  nfsd4: fix openmode checking on IO using lock stateid
  nfsd4: miscellaneous process_open2 cleanup
  nfsd4: don't pretend to support write delegations
  nfsd: bypass readahead cache when have struct file
  nfsd: minor nfsd_svc() cleanup
  nfsd: move more into nfsd_startup()
  nfsd: just keep single lockd reference for nfsd
  nfsd: clean up nfsd_create_serv error handling
  nfsd: fix error handling in __write_ports_addxprt
  nfsd: fix error handling when starting nfsd with rpcbind down
  nfsd4: fix v4 state shutdown error paths
  ...
2010-08-07 14:24:41 -07:00
J. Bruce Fields 998db52c03 nfsd4: fix file open accounting for RDWR opens
Commit f9d7562fdb "nfsd4: share file
descriptors between stateid's" didn't correctly account for O_RDWR opens.
Symptoms include leaked files, resulting in failures to unmount and/or
warnings about orphaned inodes on reboot.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-08-07 09:50:05 -04:00
Andi Kleen 6904996101 gcc-4.6: nfsd: fix initialized but not read warnings
Fixes at least one real minor bug: the nfs4 recovery dir sysctl
would not return its status properly.

Also I finished Al's 1e41568d73 ("Take ima_path_check() in nfsd
past dentry_open() in nfsd_open()") commit, it moved the IMA
code, but left the old path initializer in there.

The rest is just dead code removed I think, although I was not
fully sure about the "is_borc" stuff. Some more review
would be still good.

Found by gcc 4.6's new warnings.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-07-29 19:32:17 -04:00
J. Bruce Fields f9d7562fdb nfsd4: share file descriptors between stateid's
The vfs doesn't really allow us to "upgrade" a file descriptor from
read-only to read-write, and our attempt to do so in nfs4_upgrade_open
is ugly and incomplete.

Move to a different scheme where we keep multiple opens, shared between
open stateid's, in the nfs4_file struct.  Each file will be opened at
most 3 times (for read, write, and read-write), and those opens will be
shared between all clients and openers.  On upgrade we will do another
open if necessary instead of attempting to upgrade an existing open.
We keep count of the number of readers and writers so we know when to
close the shared files.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-07-29 18:19:23 -04:00
J. Bruce Fields 0292191417 nfsd4: fix openmode checking on IO using lock stateid
It is legal to perform a write using the lock stateid that was
originally associated with a read lock, or with a file that was
originally opened for read, but has since been upgraded.

So, when checking the openmode, check the mode associated with the
open stateid from which the lock was derived.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-07-29 16:37:12 -04:00
J. Bruce Fields 21fb4016bd nfsd4: miscellaneous process_open2 cleanup
Move more work into helper functions.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-07-29 16:34:29 -04:00
J. Bruce Fields c3e4808086 nfsd4: don't pretend to support write delegations
The delegation code mostly pretends to support either read or write
delegations.  However, correct support for write delegations would
require, for example, breaking of delegations (and/or implementation of
cb_getattr) on stat.  Currently all that stops us from handing out
delegations is a subtle reference-counting issue.

Avoid confusion by adding an earlier check that explicitly refuses write
delegations.

For now, though, I'm not going so far as to rip out existing
half-support for write delegations, in case we get around to using that
soon.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-07-29 16:05:51 -04:00
Jeff Layton 4ad9a344be nfsd4: fix v4 state shutdown error paths
If someone tries to shut down the laundry_wq while it isn't up it'll
cause an oops.

This can happen because write_ports can create a nfsd_svc before we
really start the nfs server, and we may fail before the server is ever
started.

Also make sure state is shutdown on error paths in nfsd_svc().

Use a common global nfsd_up flag instead of nfs4_init, and create common
helper functions for nfsd start/shutdown, as there will be other work
that we want done only when we the number of nfsd threads transitions
between zero and nonzero.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-07-23 08:51:22 -04:00
J. Bruce Fields ec8acac84a nfsd4: remove some debugging code
This is overkill.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-06-22 22:29:03 -04:00
J. Bruce Fields 4731030d58 nfsd4: translate memory errors to delay, not serverfault
If the server is out of memory is better for clients to back off and
retry than to just error out.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-06-22 17:19:36 -04:00
J. Bruce Fields 76407f76e0 nfsd4; fix session reference count leak
Note the session has to be put() here regardless of what happens to the
client.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-06-22 17:19:28 -04:00
J. Bruce Fields c3935e3049 nfsd4: shut down callback queue outside state lock
This reportedly causes a lockdep warning on nfsd shutdown.  That looks
like a false positive to me, but there's no reason why this needs the
state lock anyway.

Reported-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-06-08 19:33:52 -04:00
J. Bruce Fields 24a0111e40 nfsd4: fix use of op_share_access
NFSv4.1 adds additional flags to the share_access argument of the open
call.  These flags need to be masked out in some of the existing code,
but current code does that inconsistently.

Tested-by: Michael Groshans <groshans@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-05-31 12:43:55 -04:00
J. Bruce Fields e4e83ea47b Revert "nfsd4: distinguish expired from stale stateids"
This reverts commit 78155ed75f.

We're depending here on the boot time that we use to generate the
stateid being monotonic, but get_seconds() is not necessarily.

We still depend at least on boot_time being different every time, but
that is a safer bet.

We have a few reports of errors that might be explained by this problem,
though we haven't been able to confirm any of them.

But the minor gain of distinguishing expired from stale errors seems not
worth the risk.

Conflicts:

	fs/nfsd/nfs4state.c

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-05-18 19:03:50 -04:00
Pavel Emelyanov 47cee541a4 nfsd: safer initialization order in find_file()
The alloc_init_file() first adds a file to the hash and then
initializes its fi_inode, fi_id and fi_had_conflict.

The uninitialized fi_inode could thus be erroneously checked by
the find_file(), so move the hash insertion lower.

The client_mutex should prevent this race in practice; however, we
eventually hope to make less use of the client_mutex, so the ordering
here is an accident waiting to happen.

I didn't find whether the same can be true for two other fields,
but the common sense tells me it's better to initialize an object
before putting it into a global hash table :)

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-05-18 12:05:20 -04:00
J. Bruce Fields 4dc6ec00f6 nfsd4: implement reclaim_complete
This is a mandatory operation.  Also, here (not in open) is where we
should be committing the reboot recovery information.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-05-13 12:03:11 -04:00
Benny Halevy ab707e1565 nfsd4: nfsd4_destroy_session must set callback client under the state lock
nfsd4_set_callback_client must be called under the state lock to atomically
set or unset the callback client and shutting down the previous one.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-05-13 11:59:11 -04:00
Benny Halevy d76829889a nfsd4: keep a reference count on client while in use
Get a refcount on the client on SEQUENCE,
Release the refcount and renew the client when all respective compounds completed.
Do not expire the client by the laundromat while in use.
If the client was expired via another path, free it when the compounds
complete and the refcount reaches 0.

Note that unhash_client_locked must call list_del_init on cl_lru as
it may be called twice for the same client (once from nfs4_laundromat
and then from expire_client)

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-05-13 11:58:54 -04:00
Benny Halevy 07cd4909a6 nfsd4: mark_client_expired
Mark the client as expired under the client_lock so it won't be renewed
when an nfsv4.1 session is done, after it was explicitly expired
during processing of the compound.

Do not renew a client mark as expired (in particular, it is not
on the lru list anymore)

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-05-13 11:47:22 -04:00
Benny Halevy 46583e2597 nfsd4: introduce nfs4_client.cl_refcount
Currently just initialize the cl_refcount to 1
and decrement in expire_client(), conditionally freeing the
client when the refcount reaches 0.

To be used later by nfsv4.1 compounds to keep the client from
timing out while in use.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-05-13 11:47:03 -04:00
Benny Halevy 84d38ac9ab nfsd4: refactor expire_client
Separate out unhashing of the client and session.
To be used later by the laundromat.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-05-11 21:02:02 -04:00
Benny Halevy 36acb66bda nfsd4: extend the client_lock to cover cl_lru
To be used later on to hold a reference count on the client while in use by a
nfsv4.1 compound.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-05-11 21:02:02 -04:00
Benny Halevy 328efbab0f nfsd4: use list_move in move_to_confirmed
rather than list_del_init, list_add

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-05-11 21:02:01 -04:00
Benny Halevy be1fdf6c43 nfsd4: fold release_session into expire_client
and grab the client lock once for all the client's sessions.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-05-11 21:02:01 -04:00
Benny Halevy 9089f1b478 nfsd4: rename sessionid_lock to client_lock
In preparation to share the lock's scope to both client
and session hash tables.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-05-11 21:02:01 -04:00
J. Bruce Fields 5d4cec2f2f nfsd4: fix bare destroy_session null dereference
It's legal to send a DESTROY_SESSION outside any session (as the only
operation in a compound), in which case cstate->session will be NULL;
check for that case.

While we're at it, move these checks into a separate helper function.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-05-07 19:08:47 -04:00
J. Bruce Fields 5306293c9c Merge commit 'v2.6.34-rc6'
Conflicts:
	fs/nfsd/nfs4callback.c
2010-05-04 11:29:05 -04:00
J. Bruce Fields 26c0c75e69 nfsd4: fix unlikely race in session replay case
In the replay case, the

	renew_client(session->se_client);

happens after we've droppped the sessionid_lock, and without holding a
reference on the session; so there's nothing preventing the session
being freed before we get here.

Thanks to Benny Halevy for catching a bug in an earlier version of this
patch.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Acked-by: Benny Halevy <bhalevy@panasas.com>
2010-05-03 08:32:31 -04:00
J. Bruce Fields 5771635592 nfsd4: complete enforcement of 4.1 op ordering
Enforce the rules about compound op ordering.

Motivated by implementing RECLAIM_COMPLETE, for which the client is
implicit in the current session, so it is important to ensure a
succesful SEQUENCE proceeds the RECLAIM_COMPLETE.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-04-22 11:35:14 -04:00
J. Bruce Fields 4b21d0defc nfsd4: allow 4.0 clients to change callback path
The rfc allows a client to change the callback parameters, but we didn't
previously implement it.

Teach the callbacks to rerun themselves (by placing themselves on a
workqueue) when they recognize that their rpc task has been killed and
that the callback connection has changed.

Then we can change the callback connection by setting up a new rpc
client, modifying the nfs4 client to point at it, waiting for any work
in progress to complete, and then shutting down the old client.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-04-22 11:34:02 -04:00
J. Bruce Fields 2bf23875f5 nfsd4: rearrange cb data structures
Mainly I just want to separate the arguments used for setting up the tcp
client from the rest.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-04-22 11:34:02 -04:00
J. Bruce Fields b12a05cbdf nfsd4: cl_count is unused
Now that the shutdown sequence guarantees callbacks are shut down before
the client is destroyed, we no longer have a use for cl_count.

We'll probably reinstate a reference count on the client some day, but
it will be held by users other than callbacks.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-04-22 11:34:02 -04:00
J. Bruce Fields b5a1a81e5c nfsd4: don't sleep in lease-break callback
The NFSv4 server's fl_break callback can sleep (dropping the BKL), in
order to allocate a new rpc task to send a recall to the client.

As far as I can tell this doesn't cause any races in the current code,
but the analysis is difficult.  Also, the sleep here may complicate the
move away from the BKL.

So, just schedule some work to do the job for us instead.  The work will
later also prove useful for restarting a call after the callback
information is changed.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-04-22 11:34:01 -04:00
J. Bruce Fields 408b79bcc3 nfsd4: consistent session flag setting
We should clear these flags on any new create_session, not just on the
first one.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-04-16 21:47:37 -04:00
J. Bruce Fields 3df796dbe9 nfsd4: remove dprintk
I haven't found this useful.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-04-02 17:04:31 -04:00
J. Bruce Fields 147efd0dd7 nfsd4: shutdown callbacks on expiry
Once we've expired the client, there's no further purpose to the
callbacks; go ahead and shut down the callback client rather than
waiting for the last reference to go.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-04-02 16:36:30 -04:00
Tejun Heo 5a0e3ad6af include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability.  As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and try to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   wildly available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-30 22:02:32 +09:00
J. Bruce Fields e739cf1da4 Merge commit 'v2.6.34-rc1' into for-2.6.35-incoming 2010-03-09 17:22:08 -05:00
J. Bruce Fields efc4bb4fdd nfsd4: allow setting grace period time
Allow explicit configuration of the grace period time as well as the
lease period time.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-03-06 15:02:08 -05:00
J. Bruce Fields f958a1320f nfsd4: remove unnecessary lease-setting function
This is another layer of indirection that doesn't really buy us
anything.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-03-06 15:02:03 -05:00
J. Bruce Fields e46b498c84 nfsd4: simplify lease/grace interaction
The original code here assumed we'd allow the user to change the lease
any time, but only allow the change to take effect on restart.  Since
then we modified the code to allow setting the lease on when the server
is down.  Update the rest of the code to reflect that fact, clarify
variable names, and add document.

Also, the code insisted that the grace period always be the longer of
the old and new lease periods, but that's overly conservative--as long
as it lasts at least the old lease period, old clients should still know
to recover in time.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-03-06 15:02:02 -05:00
J. Bruce Fields cf07d2ea43 nfsd4: simplify references to nfsd4 lease time
Instead of accessing the lease time directly, some users call
nfs4_lease_time(), and some a macro, NFSD_LEASE_TIME, defined as
nfs4_lease_time().  Neither layer of indirection serves any purpose.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-03-06 15:02:01 -05:00
Linus Torvalds 05c5cb31ec Merge branch 'for-2.6.34' of git://linux-nfs.org/~bfields/linux
* 'for-2.6.34' of git://linux-nfs.org/~bfields/linux: (22 commits)
  nfsd4: fix minor memory leak
  svcrpc: treat uid's as unsigned
  nfsd: ensure sockets are closed on error
  Revert "sunrpc: move the close processing after do recvfrom method"
  Revert "sunrpc: fix peername failed on closed listener"
  sunrpc: remove unnecessary svc_xprt_put
  NFSD: NFSv4 callback client should use RPC_TASK_SOFTCONN
  xfs_export_operations.commit_metadata
  commit_metadata export operation replacing nfsd_sync_dir
  lockd: don't clear sm_monitored on nsm_reboot_lookup
  lockd: release reference to nsm_handle in nlm_host_rebooted
  nfsd: Use vfs_fsync_range() in nfsd_commit
  NFSD: Create PF_INET6 listener in write_ports
  SUNRPC: NFS kernel APIs shouldn't return ENOENT for "transport not found"
  SUNRPC: Bury "#ifdef IPV6" in svc_create_xprt()
  NFSD: Support AF_INET6 in svc_addsock() function
  SUNRPC: Use rpc_pton() in ip_map_parse()
  nfsd: 4.1 has an rfc number
  nfsd41: Create the recovery entry for the NFSv4.1 client
  nfsd: use vfs_fsync for non-directories
  ...
2010-03-06 11:31:38 -08:00
Wu Fengguang 42e4960868 vfs: take f_lock on modifying f_mode after open time
We'll introduce FMODE_RANDOM which will be runtime modified.  So protect
all runtime modification to f_mode with f_lock to avoid races.

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: <stable@kernel.org>			[2.6.33.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-06 11:26:25 -08:00
Ricardo Labiaga 8b8aae4009 nfsd41: Create the recovery entry for the NFSv4.1 client
Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-01-14 12:24:46 -05:00
J. Bruce Fields 7663dacd92 nfsd: remove pointless paths in file headers
The new .h files have paths at the top that are now out of date.  While
we're here, just remove all of those from fs/nfsd; they never served any
purpose.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-12-15 15:01:47 -05:00
Boaz Harrosh 9a74af2133 nfsd: Move private headers to source directory
Lots of include/linux/nfsd/* headers are only used by
nfsd module. Move them to the source directory

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-12-14 18:12:12 -05:00
Boaz Harrosh 341eb18446 nfsd: Source files #include cleanups
Now that the headers are fixed and carry their own wait, all fs/nfsd/
source files can include a minimal set of headers. and still compile just
fine.

This patch should improve the compilation speed of the nfsd module.

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-12-14 18:12:09 -05:00
J. Bruce Fields 0a3adadee4 nfsd: make fs/nfsd/vfs.h for common includes
None of this stuff is used outside nfsd, so move it out of the common
linux include directory.

Actually, probably none of the stuff in include/linux/nfsd/nfsd.h really
belongs there, so later we may remove that file entirely.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-11-13 13:23:02 -05:00
Benny Halevy 8c10cbdb4a nfsd: use STATEID_FMT and STATEID_VAL for printing stateids
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-11-05 12:06:29 -05:00
J. Bruce Fields efe0cb6d5a nfsd4.1: common slot allocation size calculation
We do the same calculation in a couple places; use a helper function,
and add a little documentation, in the hopes of preventing bugs like
that fixed in the last patch.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-10-27 19:34:43 -04:00
J. Bruce Fields dd829c4564 nfsd4.1: fix session memory use calculation
Unbalanced calculations on creation and destruction of sessions could
cause our estimate of cache memory used to become negative, sometimes
resulting in spurious SERVERFAULT returns to client CREATE_SESSION
requests.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-10-27 19:34:43 -04:00
Andy Adamson ddc04fd4d5 nfsd41: use sv_max_mesg for forechannel max sizes
ca_maxresponsesize and ca_maxrequest size include the RPC header.

sv_max_mesg is sv_max_payolad plus a page for overhead and is used in
svc_init_buffer to allocate server buffer space for both the request and reply.
Note that this means we can service an RPC compound that requires
ca_maxrequestsize (MAXWRITE) or ca_max_responsesize (MAXREAD) but that we do
not support an RPC compound that requires both ca_maxrequestsize and
ca_maxresponsesize.

Signed-off-by: Andy Adamson <andros@netapp.com>
[bfields@citi.umich.edu: more documentation updates]
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-28 12:40:15 -04:00