Commit Graph

815 Commits

Author SHA1 Message Date
Benny Halevy 2af73580b7 nfsd41: Backchannel: cb_sequence callback
Implement the cb_sequence callback conforming to draft-ietf-nfsv4-minorversion1

Note: highest slot id and target highest slot id do not have to be 0
as was previously implemented.  They can be greater than what the
nfs server sent if the client supports a larger slot table on the
backchannel.  At this point we just ignore that.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
[Rework the back channel xdr using the shared v4.0 and v4.1 framework.]
Signed-off-by: Andy Adamson <andros@netapp.com>
[fixed indentation]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: use nfsd4_cb_sequence for callback minorversion]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: fix verification of CB_SEQUENCE highest slot id[
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: Backchannel: Remove old backchannel serialization]
[nfsd41: Backchannel: First callback sequence ID should be 1]
Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: decode_cb_sequence does not need to actually decode ignored fields]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-15 20:49:56 -04:00
Ricardo Labiaga 2a1d1b5938 nfsd41: Backchannel: Setup sequence information
Follows the model used by the NFS client.  Setup the RPC prepare and done
function pointers so that we can populate the sequence information if
minorversion == 1.  rpc_run_task() is then invoked directly just like
existing NFS client operations do.

nfsd4_cb_prepare() determines if the sequence information needs to be setup.
If the slot is in use, it adds itself to the wait queue.

nfsd4_cb_done() wakes anyone sleeping on the callback channel wait queue
after our RPC reply has been received.  It also sets the task message
result pointer to NULL to clearly indicate we're done using it.

Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
[define and initialize cl_cb_seq_nr here]
[pulled out unused defintion of nfsd4_cb_done]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-15 20:49:56 -04:00
Ricardo Labiaga 199ff35e1c nfsd41: Backchannel: Server backchannel RPC wait queue
RPC callback requests will wait on this wait queue if the backchannel
is out of slots.

Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-15 20:49:55 -04:00
Ricardo Labiaga 132f97715c nfsd41: Backchannel: Add sequence arguments to callback RPC arguments
Follow the model we use in the client. Make the sequence arguments
part of the regular RPC arguments.  None of the callbacks that are
soon to be implemented expect results that need to be passed back
to the caller, so we don't define a separate RPC results structure.
For session validation, the cb_sequence decoding will use a pointer
to the sequence arguments that are part of the RPC argument.

Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
[define struct nfsd4_cb_sequence here]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-15 20:49:55 -04:00
Andy Adamson 38524ab38f nfsd41: Backchannel: callback infrastructure
Keep the xprt used for create_session in cl_cb_xprt.
Mark cl_callback.cb_minorversion = 1 and remember
the client provided cl_callback.cb_prog rpc program number.
Use it to probe the callback path.

Use the client's network address to initialize as the
callback's address as expected by the xprt creation
routines.

Define xdr sizes and code nfs4_cb_compound header to be able
to send a null callback rpc.

Signed-off-by: Andy Adamson<andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
[get callback minorversion from fore channel's]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: change bc_sock to bc_xprt]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[pulled definition for cl_cb_xprt]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: set up backchannel's cb_addr]
[moved rpc_create_args init to "nfsd: modify nfsd4.1 backchannel to use new xprt class"]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-15 20:49:55 -04:00
J. Bruce Fields 80fc015bdf nfsd4: use common rpc_cred for all callbacks
Callbacks are always made using the machine's identity, so we can use a
single auth_generic credential shared among callbacks to all clients and
let the rpc code take care of the rest.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-15 20:49:34 -04:00
J. Bruce Fields 29ab23cc5d nfsd4: allow nfs4 state startup to fail
The failure here is pretty unlikely, but we should handle it anyway.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-15 20:49:33 -04:00
J. Bruce Fields 886e3b7fe6 nfsd4: fix null dereference creating nfsv4 callback client
On setting up the callback to the client, we attempt to use the same
authentication flavor the client did.  We find an rpc cred to use by
calling rpcauth_lookup_credcache(), which assumes that the given
authentication flavor has a credentials cache.  However, this is not
required to be true--in particular, auth_null does not use one.
Instead, we should call the auth's lookup_cred() method.

Without this, a client attempting to mount using nfsv4 and auth_null
triggers a null dereference.

Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-15 20:49:33 -04:00
Benny Halevy 4be36ca0ce nfsd4: fix whitespace in NFSPROC4_CLNT_CB_NULL definition
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-13 15:57:39 -04:00
Trond Myklebust ab3bbaa8b2 Merge branch 'nfs-for-2.6.32' 2009-09-11 14:59:37 -04:00
J. Bruce Fields aed100fafb nfsd: fix leak on error in nfsv3 readdir
Note the !dchild->d_inode case can leak the filehandle.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-04 15:48:00 -04:00
J. Bruce Fields 8177e6d6df nfsd: clean up readdirplus encoding
Make the return from compose_entry_fh() zero or an error, even though
the returned error isn't used, just to make the meaning of the return
immediately obvious.

Move some repeated code out of main function into helper.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-04 15:47:40 -04:00
J. Bruce Fields 1be10a88ca nfsd4: filehandle leak or error exit from fh_compose()
A number of callers (nfsd4_encode_fattr(), at least) don't bother to
release the filehandle returned to fh_compose() if fh_compose() returns
an error.  So, modify fh_compose() to release the filehandle before
returning an error.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-04 11:59:32 -04:00
Trond Myklebust 2671a4bf35 NFSd: Fix filehandle leak in exp_pseudoroot() and nfsd4_path()
nfsd4_path() allocates a temporary filehandle and then fails to free it
before the function exits, leaking reference counts to the dentry and
export that it refers to.

Also, nfsd4_lookupp() puts the result of exp_pseudoroot() in a temporary
filehandle which it releases on success of exp_pseudoroot() but not on
failure; fix exp_pseudoroot to ensure that on failure it releases the
filehandle before returning.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-03 16:57:57 -04:00
J. Bruce Fields bc6c53d5a1 nfsd: move fsid_type choice out of fh_compose
More trivial cleanup.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-02 23:54:48 -04:00
J. Bruce Fields 8e498751f2 nfsd: move some of fh_compose into helper functions
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-02 23:53:51 -04:00
David Howells e0e817392b CRED: Add some configurable debugging [try #6]
Add a config option (CONFIG_DEBUG_CREDENTIALS) to turn on some debug checking
for credential management.  The additional code keeps track of the number of
pointers from task_structs to any given cred struct, and checks to see that
this number never exceeds the usage count of the cred struct (which includes
all references, not just those from task_structs).

Furthermore, if SELinux is enabled, the code also checks that the security
pointer in the cred struct is never seen to be invalid.

This attempts to catch the bug whereby inode_has_perm() faults in an nfsd
kernel thread on seeing cred->security be a NULL pointer (it appears that the
credential struct has been previously released):

	http://www.kerneloops.org/oops.php?number=252883

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-09-02 21:29:01 +10:00
Andy Adamson 557ce2646e nfsd41: replace page based DRC with buffer based DRC
Use NFSD_SLOT_CACHE_SIZE size buffers for sessions DRC instead of holding nfsd
pages in cache.

Connectathon testing has shown that 1024 bytes for encoded compound operation
responses past the sequence operation is sufficient, 512 bytes is a little too
small. Set NFSD_SLOT_CACHE_SIZE to 1024.

Allocate memory for the session DRC in the CREATE_SESSION operation
to guarantee that the memory resource is available for caching responses.
Allocate each slot individually in preparation for slot table size negotiation.

Remove struct nfsd4_cache_entry and helper functions for the old page-based
DRC.

The iov_len calculation in nfs4svc_encode_compoundres is now always
correct.  Replay is now done in nfsd4_sequence under the state lock, so
the session ref count is only bumped on non-replay. Clean up the
nfs4svc_encode_compoundres session logic.

The nfsd4_compound_state statp pointer is also not used.
Remove nfsd4_set_statp().

Move useful nfsd4_cache_entry fields into nfsd4_slot.

Signed-off-by: Andy Adamson <andros@netapp.com
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-01 22:24:06 -04:00
Andy Adamson bdac86e215 nfsd41: replace nfserr_resource in pure nfs41 responses
nfserr_resource is not a legal error for NFSv4.1. Replace it with
nfserr_serverfault for EXCHANGE_ID and CREATE_SESSION processing.

We will also need to map nfserr_resource to other errors in routines shared
by NFSv4.0 and NFSv4.1

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-01 22:24:05 -04:00
Andy Adamson a8dfdaeb7a nfsd41: use session maxreqs for sequence target and highest slotid
This fixes a bug in the sequence operation reply.

The sequence operation returns the highest slotid it will accept in the future
in sr_highest_slotid, and the highest slotid it prefers the client to use.
Since we do not re-negotiate the session slot table yet, these should both
always be set to the session ca_maxrequests.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-01 22:24:05 -04:00
Andy Adamson a649637c73 nfsd41: bound forechannel drc size by memory usage
By using the requested ca_maxresponsesize_cached * ca_maxresponses to bound
a forechannel drc request size, clients can tailor a session to usage.

For example, an I/O session (READ/WRITE only) can have a much smaller
ca_maxresponsesize_cached (for only WRITE compound responses) and a lot larger
ca_maxresponses to service a large in-flight data window.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-01 22:24:05 -04:00
Trond Myklebust a06b1261bd NFSD: Fix a bug in the NFSv4 'supported attrs' mandatory attribute
The fact that the filesystem doesn't currently list any alternate
locations does _not_ imply that the fs_locations attribute should be
marked as "unsupported".

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-01 20:00:17 -04:00
Andy Adamson 468de9e54a nfsd41: expand solo sequence check
Compounds consisting of only a sequence operation don't need any
additional caching beyond the sequence information we store in the slot
entry.  Fix nfsd4_is_solo_sequence to identify this case correctly.

The additional check for a failed sequence in nfsd4_store_cache_entry()
is redundant, since the nfsd4_is_solo_sequence call lower down catches
this case.

The final ce_cachethis set in nfsd4_sequence is also redundant.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-08-28 12:20:15 -04:00
Frank Filz d8d0b85b11 nfsd4: remove ACE4_IDENTIFIER_GROUP flag from GROUP@ entry
RFC 3530 says "ACE4_IDENTIFIER_GROUP flag MUST be ignored on entries
with these special identifiers.  When encoding entries with these
special identifiers, the ACE4_IDENTIFIER_GROUP flag SHOULD be set to
zero."  It really shouldn't matter either way, but the point is that
this flag is used to distinguish named users from named groups (since
unix allows a group to have the same name as a user), so it doesn't
really make sense to use it on a special identifier such as this.)

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-08-27 17:35:41 -04:00
Benny Halevy aaf84eb95a nfsd41: renew_client must be called under the state lock
Until we work out the state locking so we can use a spin lock to protect
the cl_lru, we need to take the state_lock to renew the client.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: Do not renew state on error]
Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: Simplify exit code]
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-08-27 17:17:40 -04:00
Ryusei Yamaguchi ed2d8aed52 knfsd: Replace lock_kernel with a mutex in nfsd pool stats.
lock_kernel() in knfsd was replaced with a mutex. The later
commit 03cf6c9f49 ("knfsd:
add file to export stats about nfsd pools") did not follow
that change. This patch fixes the issue.

Also move the get and put of nfsd_serv to the open and close methods
(instead of start and stop methods) to allow atomic check and increment
of reference count in the open method (where we can still return an
error).

Signed-off-by: Ryusei Yamaguchi <mandel59@gmail.com>
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Cc: Greg Banks <gnb@fmeh.org>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-08-25 12:39:37 -04:00
Frank Filz 55bb55dca0 nfsd: Fix unnecessary deny bits in NFSv4 ACL
The group deny entries end up denying tcy even though tcy was just
allowed by the allow entry. This appears to be due to:
	ace->access_mask = mask_from_posix(deny, flags);
instead of:
	ace->access_mask = deny_mask_from_posix(deny, flags);

Denying a previously allowed bit has no effect, so this shouldn't affect
behavior, but it's ugly.

Signed-off-by: Frank Filz <ffilzlnx@us.ibm.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-08-24 20:01:22 -04:00
Jeff Layton fbf4665f41 nfsd: populate sin6_scope_id on callback address with scopeid from rq_addr on SETCLIENTID call
When a SETCLIENTID call comes in, one of the args given is the svc_rqst.
This struct contains an rq_addr field which holds the address that sent
the call. If this is an IPv6 address, then we can use the sin6_scope_id
field in this address to populate the sin6_scope_id field in the
callback address.

AFAICT, the rq_addr.sin6_scope_id is non-zero if and only if the client
mounted the server's link-local address.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-08-21 11:27:44 -04:00
Jeff Layton 7077ecbabd nfsd: add support for NFSv4 callbacks over IPv6
The framework to add this is all in place. Now, add the code to allow
support for establishing a callback channel on an IPv6 socket.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-08-21 11:27:44 -04:00
Jeff Layton aa9a4ec770 nfsd: convert nfs4_cb_conn struct to hold address in sockaddr_storage
...rather than as a separate address and port fields. This will be
necessary for implementing callbacks over IPv6. Also, convert
gen_callback to use the standard rpcuaddr2sockaddr routine rather than
its own private one.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-08-21 11:27:43 -04:00
Jeff Layton 363168b4ea nfsd: make nfs4_client->cl_addr a struct sockaddr_storage
It's currently a __be32, which isn't big enough to hold an IPv6 address.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-08-21 11:27:43 -04:00
J. Bruce Fields e9dc122166 Merge branch 'nfs-for-2.6.32' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6 into for-2.6.32-incoming
Conflicts:
	net/sunrpc/cache.c
2009-08-21 11:27:29 -04:00
Trond Myklebust f884dcaead Merge branch 'sunrpc_cache-for-2.6.32' into nfs-for-2.6.32 2009-08-10 17:45:58 -04:00
Trond Myklebust bc74b4f5e6 SUNRPC: Allow the cache_detail to specify alternative upcall mechanisms
For events that are rare, such as referral DNS lookups, it makes limited
sense to have a daemon constantly listening for upcalls on a channel. An
alternative in those cases might simply be to run the app that fills the
cache using call_usermodehelper_exec() and friends.

The following patch allows the cache_detail to specify alternative upcall
mechanisms for these particular cases.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-09 15:14:29 -04:00
Trond Myklebust 2da8ca26c6 NFSD: Clean up the idmapper warning...
What part of 'internal use' is so hard to understand?

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-09 15:14:26 -04:00
Chuck Lever 4116092b92 NFSD: Support IPv6 addresses in write_failover_ip()
In write_failover_ip(), replace the sscanf() with a call to the common
sunrpc.ko presentation address parser.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-09 15:09:40 -04:00
Andy Adamson abfabf8caf nfsd41: encode replay sequence from the slot values
The sequence operation is not cached; always encode the sequence operation on
a replay from the slot table and session values. This simplifies the sessions
replay logic in nfsd4_proc_compound.

If this is a replay of a compound that was specified not to be cached, return
NFS4ERR_RETRY_UNCACHED_REP.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-07-28 16:12:34 -04:00
Andy Adamson c8647947f8 nfsd41: rename nfsd4_enc_uncached_replay
This function is only used for SEQUENCE replay.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-07-28 14:30:36 -04:00
Andy Adamson 49557cc74c nfsd41: Use separate DRC for setclientid
Instead of trying to share the generic 4.1 reply cache code for the
CREATE_SESSION reply cache, it's simpler to handle CREATE_SESSION
separately.

The nfs41 single slot clientid DRC holds the results of create session
processing.  CREATE_SESSION can be preceeded by a SEQUENCE operation
(an embedded CREATE_SESSION) and the create session single slot cache must be
maintained.  nfsd4_replay_cache_entry() and nfsd4_store_cache_entry() do not
implement the replay of an embedded CREATE_SESSION.

The clientid DRC slot does not need the inuse, cachethis or other fields that
the multiple slot session cache uses.  Replace the clientid DRC cache struct
nfs4_slot cache with a new nfsd4_clid_slot cache.  Save the xdr struct
nfsd4_create_session into the cache at the end of processing, and on a replay,
replace the struct for the replay request with the cached version all while
under the state lock.

nfsd4_proc_compound will handle both the solo and embedded CREATE_SESSION case
via the normal use of encode_operation.

Errors that do not change the create session cache:
A create session NFS4ERR_STALE_CLIENTID error means that a client record
(and associated create session slot) could not be found and therefore can't
be changed.  NFSERR_SEQ_MISORDERED errors do not change the slot cache.

All other errors get cached.

Remove the clientid DRC specific check in nfs4svc_encode_compoundres to
put the session only if cstate.session is set which will now always be true.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-07-28 14:30:29 -04:00
Andy Adamson 88e588d56a nfsd41: change check_slot_seqid parameters
For separation of session slot and clientid slot processing.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-07-28 14:30:23 -04:00
Andy Adamson 5261dcf8eb nfsd41: remove redundant forechannel max requests check
This check is done in set_forechannel_maxreqs.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-07-28 14:30:15 -04:00
Andy Adamson 0c193054a4 nfsd41: hange from page to memory based drc limits
NFSD_SLOT_CACHE_SIZE is the size of all encoded operation responses
(excluding the sequence operation) that we want to cache.

For now, keep NFSD_SLOT_CACHE_SIZE at PAGE_SIZE. It will be reduced
when the DRC is changed from page based to memory based.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-07-28 14:30:05 -04:00
Andy Adamson 6a14dd1a4f nfsd41: reserve less memory for DRC
Also remove a slightly misleading comment.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-07-28 14:29:59 -04:00
Andy Adamson b101ebbc39 nfsd41: minor set_forechannel_maxreqs cleanup
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-07-28 14:29:54 -04:00
Andy Adamson be98d1bbd1 nfsd41: reclaim DRC memory on session free
This fixes a leak which would eventually lock out new clients.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-07-28 14:29:48 -04:00
J. Bruce Fields 413d63d710 nfsd: minor write_pool_threads exit cleanup
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-07-28 14:29:41 -04:00
Eric Sesterhenn 2522a776c1 Fix memory leak in write_pool_threads
kmemleak produces the following warning

unreferenced object 0xc9ec02a0 (size 8):
  comm "cat", pid 19048, jiffies 730243
  backtrace:
    [<c01bf970>] create_object+0x100/0x240
    [<c01bfadb>] kmemleak_alloc+0x2b/0x60
    [<c01bcd4b>] __kmalloc+0x14b/0x270
    [<c02fd027>] write_pool_threads+0x87/0x1d0
    [<c02fcc08>] nfsctl_transaction_write+0x58/0x70
    [<c02fcc6f>] nfsctl_transaction_read+0x4f/0x60
    [<c01c2574>] vfs_read+0x94/0x150
    [<c01c297d>] sys_read+0x3d/0x70
    [<c0102d6b>] sysenter_do_call+0x12/0x32
    [<ffffffff>] 0xffffffff

write_pool_threads() only frees nthreads on error paths, in the success case
we leak it.

Signed-off-by: Eric Sesterhenn <eric.sesterhenn@lsexperts.de>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-07-28 14:29:34 -04:00
Andy Adamson 4bd9b0f4af nfsd41: use globals for DRC limits
The version 4.1 DRC memory limit and tracking variables are server wide and
session specific. Replace struct svc_serv fields with globals.
Stop using the svc_serv sv_lock.

Add a spinlock to serialize access to the DRC limit management variables which
change on session creation and deletion (usage counter) or (future)
administrative action to adjust the total DRC memory limit.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
2009-07-14 17:52:40 -04:00
Yu Zhiguo 9208faf297 NFSv4: ACL in operations 'open' and 'create' should be used
ACL in operations 'open' and 'create' is decoded but never be used.
It should be set as the initial ACL for the object according to RFC3530.
If error occurs when setting the ACL, just clear the ACL bit in the
returned attr bitmap.

Signed-off-by: Yu Zhiguo <yuzg@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-07-14 12:16:47 -04:00
Alexey Dobriyan 405f55712d headers: smp_lock.h redux
* Remove smp_lock.h from files which don't need it (including some headers!)
* Add smp_lock.h to files which do need it
* Make smp_lock.h include conditional in hardirq.h
  It's needed only for one kernel_locked() usage which is under CONFIG_PREEMPT

  This will make hardirq.h inclusion cheaper for every PREEMPT=n config
  (which includes allmodconfig/allyesconfig, BTW)

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-12 12:22:34 -07:00
David Howells 033a666ccb NFSD: Don't hold unrefcounted creds over call to nfsd_setuser()
nfsd_open() gets an unrefcounted pointer to the current process's effective
credentials at the top of the function, then calls nfsd_setuser() via
fh_verify() - which may replace and destroy the current process's effective
credentials - and then passes the unrefcounted pointer to dentry_open() - but
the credentials may have been destroyed by this point.

Instead, the value from current_cred() should be passed directly to
dentry_open() as one of its arguments, rather than being cached in a variable.

Possibly fh_verify() should return the creds to use.

This is a regression introduced by
745ca2475a "CRED: Pass credentials through
dentry_open()".

Signed-off-by: David Howells <dhowells@redhat.com>
Tested-and-Verified-By: Steve Dickson <steved@redhat.com>
Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-07-03 10:21:10 -04:00
Linus Torvalds 7e0338c0de Merge branch 'for-2.6.31' of git://fieldses.org/git/linux-nfsd
* 'for-2.6.31' of git://fieldses.org/git/linux-nfsd: (60 commits)
  SUNRPC: Fix the TCP server's send buffer accounting
  nfsd41: Backchannel: minorversion support for the back channel
  nfsd41: Backchannel: cleanup nfs4.0 callback encode routines
  nfsd41: Remove ip address collision detection case
  nfsd: optimise the starting of zero threads when none are running.
  nfsd: don't take nfsd_mutex twice when setting number of threads.
  nfsd41: sanity check client drc maxreqs
  nfsd41: move channel attributes from nfsd4_session to a nfsd4_channel_attr struct
  NFS: kill off complicated macro 'PROC'
  sunrpc: potential memory leak in function rdma_read_xdr
  nfsd: minor nfsd_vfs_write cleanup
  nfsd: Pull write-gathering code out of nfsd_vfs_write
  nfsd: track last inode only in use_wgather case
  sunrpc: align cache_clean work's timer
  nfsd: Use write gathering only with NFSv2
  NFSv4: kill off complicated macro 'PROC'
  NFSv4: do exact check about attribute specified
  knfsd: remove unreported filehandle stats counters
  knfsd: fix reply cache memory corruption
  knfsd: reply cache cleanups
  ...
2009-06-22 12:55:50 -07:00
Andy Adamson ab52ae6db0 nfsd41: Backchannel: minorversion support for the back channel
Prepare to share backchannel code with NFSv4.1.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
[nfsd41: use nfsd4_cb_sequence for callback minorversion]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-06-18 18:33:57 -07:00
Andy Adamson ef52bff840 nfsd41: Backchannel: cleanup nfs4.0 callback encode routines
Mimic the client and prepare to share the back channel xdr with NFSv4.1.
Bump the number of operations in each encode routine, then backfill the
number of operations.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-06-18 18:33:57 -07:00
Mike Sager 6ddbbbfe52 nfsd41: Remove ip address collision detection case
Verified that cthon and pynfs exchange id tests pass (except for the
two expected fails: EID8 and EID50)

Signed-off-by: Mike Sager <sager@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-06-18 17:43:53 -07:00
NeilBrown 671e1fcf63 nfsd: optimise the starting of zero threads when none are running.
Currently, if we ask to set then number of nfsd threads to zero when
there are none running, we set up all the sockets and register the
service, and then tear it all down again.
This is pointless.

So detect that case and exit promptly.
(also remove an assignment to 'error' which was never used.

Signed-off-by: NeilBrown <neilb@suse.de>
Acked-by: Jeff Layton <jlayton@redhat.com>
2009-06-18 09:42:41 -07:00
NeilBrown 82e12fe924 nfsd: don't take nfsd_mutex twice when setting number of threads.
Currently when we write a number to 'threads' in nfsdfs,
we take the nfsd_mutex, update the number of threads, then take the
mutex again to read the number of threads.

Mostly this isn't a big deal.  However if we are write '0', and
portmap happens to be dead, then we can get unpredictable behaviour.
If the nfsd threads all got killed quickly and the last thread is
waiting for portmap to respond, then the second time we take the mutex
we will block waiting for the last thread.
However if the nfsd threads didn't die quite that fast, then there
will be no contention when we try to take the mutex again.

Unpredictability isn't fun, and waiting for the last thread to exit is
pointless, so avoid taking the lock twice.
To achieve this, get nfsd_svc return a non-negative number of active
threads when not returning a negative error.

Signed-off-by: NeilBrown <neilb@suse.de>
2009-06-18 09:40:31 -07:00
Andy Adamson 5d77ddfbcb nfsd41: sanity check client drc maxreqs
Ensure the client requested maximum requests are between 1 and
NFSD_MAX_SLOTS_PER_SESSION

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-06-16 17:13:16 -07:00
Alexandros Batsakis 6c18ba9f5e nfsd41: move channel attributes from nfsd4_session to a nfsd4_channel_attr struct
the change is valid for both the forechannel and the backchannel (currently dummy)

Signed-off-by: Alexandros Batsakis <Alexandros.Batsakis@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-06-16 10:13:45 -07:00
Yu Zhiguo b9081d90f5 NFS: kill off complicated macro 'PROC'
kill off obscure macro 'PROC' of NFSv2&3 in order to make the code more clear.

Among other things, this makes it simpler to grep for callers of these
functions--something which has frequently caused confusion among nfs
developers.

Signed-off-by: Yu Zhiguo <yuzg@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-06-15 19:34:32 -07:00
J. Bruce Fields e4636d535e nfsd: minor nfsd_vfs_write cleanup
There's no need to check host_err >= 0 every time here when we could
check host_err < 0 once, following the usual kernel style.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-06-15 19:18:34 -07:00
J. Bruce Fields d911df7b8d nfsd: Pull write-gathering code out of nfsd_vfs_write
This is a relatively self-contained piece of code that handles a special
case--move it to its own function.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-06-15 18:54:05 -07:00
J. Bruce Fields 9d2a3f31d6 nfsd: track last inode only in use_wgather case
Updating last_ino and last_dev probably isn't useful in the !use_wgather
case.

Also remove some pointless ifdef'd-out code.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-06-15 18:52:47 -07:00
Trond Myklebust 48e03bc515 nfsd: Use write gathering only with NFSv2
NFSv3 and above can use unstable writes whenever they are sending more
than one write, rather than relying on the flaky write gathering
heuristics. More often than not, write gathering is currently getting it
wrong when the NFSv3 clients are sending a single write with FILE_SYNC
for efficiency reasons.

This patch turns off write gathering for NFSv3/v4, and ensures that
it only applies to the one case that can actually benefit: namely NFSv2.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-06-15 18:14:57 -07:00
J. Bruce Fields 7eef4091a6 Merge commit 'v2.6.30' into for-2.6.31 2009-06-15 18:08:07 -07:00
Al Viro 9393bd07cf switch follow_down()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-06-11 21:36:01 -04:00
Al Viro bab77ebf51 switch follow_up() to struct path
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-06-11 21:36:00 -04:00
Al Viro e64c390ca0 switch rqst_exp_parent()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-06-11 21:36:00 -04:00
Al Viro 91c9fa8f75 switch rqst_exp_get_by_name()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-06-11 21:36:00 -04:00
Al Viro 5bf3bd2b5c switch exp_parent() to struct path
... and lose the always-NULL last argument (non-NULL case had been
split off a while ago).

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-06-11 21:36:00 -04:00
Al Viro 55430e2ece nfsd struct path use: exp_get_by_name()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-06-11 21:35:59 -04:00
James Morris 0b4ec6e4e0 Merge branch 'master' into next 2009-06-09 09:27:53 +10:00
Yu Zhiguo 0a93a47f04 NFSv4: kill off complicated macro 'PROC'
J. Bruce Fields wrote:
...
> (This is extremely confusing code to track down: note that
> proc->pc_decode is set to nfs4svc_decode_compoundargs() by the PROC()
> macro at the end of fs/nfsd/nfs4proc.c.  Which means, for example, that
> grepping for nfs4svc_decode_compoundargs() gets you nowhere.  Patches to
> kill off that macro would be welcomed....)

the macro 'PROC' is complicated and obscure, it had better
be killed off in order to make the code more clear.

Signed-off-by: Yu Zhiguo <yuzg@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-06-01 18:09:20 -04:00
Yu Zhiguo 3c8e03166a NFSv4: do exact check about attribute specified
Server should return NFS4ERR_ATTRNOTSUPP if an attribute specified is
not supported in current environment.
Operations CREATE, NVERIFY, OPEN, SETATTR and VERIFY should do this check.

This bug is found when do newpynfs tests. The names of the tests that failed
are following:
  CR12 NVF7a NVF7b NVF7c NVF7d NVF7f NVF7r NVF7s
  OPEN15 VF7a VF7b VF7c VF7d VF7f VF7r VF7s

Add function do_check_fattr() to do exact check:
1, Check attribute specified is supported by the NFSv4 server or not.
2, Check FATTR4_WORD0_ACL & FATTR4_WORD0_FS_LOCATIONS are supported
   in current environment or not.
3, Check attribute specified is writable or not.

step 1 and 3 are done in function nfsd4_decode_fattr() but removed
to this function now.

Signed-off-by: Yu Zhiguo <yuzg@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-06-01 18:01:54 -04:00
Mimi Zohar 14dba5331b integrity: nfsd imbalance bug fix
An nfsd exported file is opened/closed by the kernel causing the
integrity imbalance message.

Before a file is opened, there normally is permission checking, which
is done in inode_permission().  However, as integrity checking requires
a dentry and mount point, which is not available in inode_permission(),
the integrity (permission) checking must be called separately.

In order to detect any missing integrity checking calls, we keep track
of file open/closes.  ima_path_check() increments these counts and
does the integrity (permission) checking. As a result, the number of
calls to ima_path_check()/ima_file_free() should be balanced.  An extra
call to fput(), indicates the file could have been accessed without first
calling ima_path_check().

In nfsv3 permission checking is done once, followed by multiple reads,
which do an open/close for each read.  The integrity (permission) checking
call should be in nfsd_permission() after the inode_permission() call, but
as there is no correlation between the number of permission checking and
open calls, the integrity checking call should not increment the counters,
but defer it to when the file is actually opened.

This patch adds:
- integrity (permission) checking for nfsd exported files in nfsd_permission().
- a call to increment counts for files opened by nfsd.

This patch has been updated to return the nfs error types.

Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-05-28 09:32:43 +10:00
Wei Yongjun a0d24b295a nfsd: fix hung up of nfs client while sync write data to nfs server
Commit 'Short write in nfsd becomes a full write to the client'
(31dec2538e) broken the sync write.
With the following commands to reproduce:

  $ mount -t nfs -o sync 192.168.0.21:/nfsroot /mnt
  $ cd /mnt
  $ echo aaaa > temp.txt

Then nfs client is hung up.

In SYNC mode the server alaways return the write count 0 to the
client. This is because the value of host_err in nfsd_vfs_write()
will be overwrite in SYNC mode by 'host_err=nfsd_sync(file);',
and then we return host_err(which is now 0) as write count.

This patch fixed the problem.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-05-27 17:40:06 -04:00
Greg Banks 1dbd0d53f3 knfsd: remove unreported filehandle stats counters
The file nfsfh.c contains two static variables nfsd_nr_verified and
nfsd_nr_put.  These are counters which are incremented as a side
effect of the fh_verify() fh_compose() and fh_put() operations,
i.e. at least twice per NFS call for any non-trivial workload.
Needless to say this makes the cacheline that contains them (and any
other innocent victims) a very hot contention point indeed under high
call-rate workloads on multiprocessor NFS server.  It also turns out
that these counters are not used anywhere.  They're not reported to
userspace, they're not used in logic, they're not even exported from
the object file (let alone the module).  All they do is waste CPU time.

So this patch removes them.

Tests on a 16 CPU Altix A4700 with 2 10gige Myricom cards, configured
separately (no bonding).  Workload is 640 client threads doing directory
traverals with random small reads, from server RAM.

Before
======

Kernel profile:

  %   cumulative   self              self     total
 time   samples   samples    calls   1/call   1/call  name
  6.05   2716.00  2716.00    30406     0.09     1.02  svc_process
  4.44   4706.00  1990.00     1975     1.01     1.01  spin_unlock_irqrestore
  3.72   6376.00  1670.00     1666     1.00     1.00  svc_export_put
  3.41   7907.00  1531.00     1786     0.86     1.02  nfsd_ofcache_lookup
  3.25   9363.00  1456.00    10965     0.13     1.01  nfsd_dispatch
  3.10  10752.00  1389.00     1376     1.01     1.01  nfsd_cache_lookup
  2.57  11907.00  1155.00     4517     0.26     1.03  svc_tcp_recvfrom
  ...
  2.21  15352.00  1003.00     1081     0.93     1.00  nfsd_choose_ofc  <----
  ^^^^

Here the function nfsd_choose_ofc() reads a global variable
which by accident happened to be located in the same cacheline as
nfsd_nr_verified.

Call rate:

nullarbor:~ # pmdumptext nfs3.server.calls
...
Thu Dec 13 00:15:27     184780.663
Thu Dec 13 00:15:28     184885.881
Thu Dec 13 00:15:29     184449.215
Thu Dec 13 00:15:30     184971.058
Thu Dec 13 00:15:31     185036.052
Thu Dec 13 00:15:32     185250.475
Thu Dec 13 00:15:33     184481.319
Thu Dec 13 00:15:34     185225.737
Thu Dec 13 00:15:35     185408.018
Thu Dec 13 00:15:36     185335.764

After
=====

kernel profile:

  %   cumulative   self              self     total
 time   samples   samples    calls   1/call   1/call  name
  6.33   2813.00  2813.00    29979     0.09     1.01  svc_process
  4.66   4883.00  2070.00     2065     1.00     1.00  spin_unlock_irqrestore
  4.06   6687.00  1804.00     2182     0.83     1.00  nfsd_ofcache_lookup
  3.20   8110.00  1423.00    10932     0.13     1.00  nfsd_dispatch
  3.03   9456.00  1346.00     1343     1.00     1.00  nfsd_cache_lookup
  2.62  10622.00  1166.00     4645     0.25     1.01  svc_tcp_recvfrom
[...]
  0.10  42586.00    44.00       74     0.59     1.00  nfsd_choose_ofc  <--- HA!!
  ^^^^

Call rate:

nullarbor:~ # pmdumptext nfs3.server.calls
...
Thu Dec 13 01:45:28     194677.118
Thu Dec 13 01:45:29     193932.692
Thu Dec 13 01:45:30     194294.364
Thu Dec 13 01:45:31     194971.276
Thu Dec 13 01:45:32     194111.207
Thu Dec 13 01:45:33     194999.635
Thu Dec 13 01:45:34     195312.594
Thu Dec 13 01:45:35     195707.293
Thu Dec 13 01:45:36     194610.353
Thu Dec 13 01:45:37     195913.662
Thu Dec 13 01:45:38     194808.675

i.e. about a 5.3% improvement in call rate.

Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Reviewed-by: David Chinner <dgc@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-05-27 14:14:03 -04:00
Greg Banks cf0a586cf4 knfsd: fix reply cache memory corruption
Fix a regression in the reply cache introduced when the code was
converted to use proper Linux lists.  When a new entry needs to be
inserted, the case where all the entries are currently being used
by threads is not correctly detected.  This can result in memory
corruption and a crash.  In the current code this is an extremely
unlikely corner case; it would require the machine to have 1024
nfsd threads and all of them to be busy at the same time.  However,
upcoming reply cache changes make this more likely; a crash due to
this problem was actually observed in field.

Signed-off-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-05-27 14:14:02 -04:00
Greg Banks fca4217c5b knfsd: reply cache cleanups
Make REQHASH() an inline function.  Rename hash_list to cache_hash.
Fix an obsolete comment.

Signed-off-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-05-27 14:14:02 -04:00
J. Bruce Fields 8daed1e549 nfsd: silence lockdep warning
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-05-11 17:23:14 -04:00
Wang Chen 02cb2858db nfsd: nfs4_stat_init cleanup
Save some loop time.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-05-06 16:22:41 -04:00
J. Bruce Fields b2c0cea6b1 nfsd4: check for negative dentry before use in nfsv4 readdir
After 2f9092e102 "Fix i_mutex vs.  readdir
handling in nfsd" (and 14f7dd63 "Copy XFS readdir hack into nfsd code"),
an entry may be removed between the first mutex_unlock and the second
mutex_lock. In this case, lookup_one_len() will return a negative
dentry.  Check for this case to avoid a NULL dereference.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Reviewed-by: J. R. Okajima <hooanon05@yahoo.co.jp>
Cc: stable@kernel.org
2009-05-06 16:16:36 -04:00
Randy Dunlap 9064caae8f nfsd: use C99 struct initializers
Eliminate 56 sparse warnings like this one:

fs/nfsd/nfs4xdr.c:1331:15: warning: obsolete array initializer, use C99 syntax

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-05-03 15:09:12 -04:00
J. Bruce Fields 63e4863fab nfsd4: make recall callback an asynchronous rpc
As with the probe, this removes the need for another kthread.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-05-03 15:08:56 -04:00
Andy Adamson ccecee1e5e nfsd41: slots are freed with session
The session and slots are allocated all in one piece.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-05-03 14:45:02 -04:00
J. Bruce Fields 3aea09dc91 nfsd4: track recall retries in nfs4_delegation
Move this out of a local variable into the nfs4_delegation object in
preparation for making this an async rpc call (at which point we'll need
any state like this in a common object that's preserved across function
calls).

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-05-01 20:11:12 -04:00
J. Bruce Fields 6707bd3d42 nfsd4: remove unused dl_trunc
There's no point in keeping this field around--it's always zero.

(Background: the protocol allows you to tell the client that the file is
about to be truncated, as an optimization to save the client from
writing back dirty pages that will just be discarded.  We don't
implement this hint.  If we do some day, adding this field back in will
be the least of the work involved.)

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-05-01 19:57:46 -04:00
J. Bruce Fields b53d40c507 nfsd4: eliminate struct nfs4_cb_recall
The nfs4_cb_recall struct is used only in nfs4_delegation, so its
pointer to the containing delegation is unnecessary--we could just use
container_of().

But there's no real reason to have this a separate struct at all--just
move these fields to nfs4_delegation.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-05-01 19:50:00 -04:00
J. Bruce Fields c237dc0303 nfsd4: rename callback struct to cb_conn
I want to use the name for a struct that actually does represent a
single callback.

(Actually, I've never been sure it helps to a separate struct for the
callback information.  Some day maybe those fields could just be dumped
into struct nfs4_client.  I don't know.)

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-05-01 17:31:44 -04:00
J. Bruce Fields e300a63ce4 nfsd4: replace callback thread by asynchronous rpc
We don't really need a synchronous rpc, and moving to an asynchronous
rpc allows us to do without this extra kthread.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-04-29 17:10:53 -04:00
J. Bruce Fields 3cef9ab266 nfsd4: lookup up callback cred only once
Lookup the callback cred once and then use it for all subsequent
callbacks.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-04-29 16:45:03 -04:00
J. Bruce Fields ecdd03b791 nfsd4: create rpc callback client from server thread
The code is a little simpler, and it should be easier to avoid races, if
we just do all rpc client creation/destruction from nfsd or laundromat
threads and do only the rpc calls themselves asynchronously.  The rpc
creation doesn't involve any significant waiting (it doesn't call the
client, for example), so there's no reason not to do this.

Also don't bother destroying the client on failure of the rpc null
probe.  We may want to retry the probe later anyway.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-04-29 16:44:53 -04:00
J. Bruce Fields e1cab5a589 nfsd4: set cb_client inside setup_callback_client
This is just a minor code simplification.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-04-29 16:44:47 -04:00
J. Bruce Fields 595947acaa nfsd4: set shorter timeout
We tried to do something overly complicated with the callback rpc
timeouts here.  And they're wrong--the result is that by the time a
single callback times out, it's already too late to tell the client
(using the cb_path_down return to RENEW) that the callback is down.

Use a much shorter, simpler timeout.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-04-29 16:44:40 -04:00
J. Bruce Fields f64f79ea5f nfsd4: setclientid_confirm callback-change fixes
This setclientid_confirm case should allow the client to change
callbacks, but it currently has a dummy implementation that just turns
off callbacks completely.  That dummy implementation isn't completely
correct either, though:

	- There's no need to remove any client recovery directory in
	  this case.
	- New clientid confirm verifiers should be generated (and
	  returned) in setclientid; there's no need to generate a new
	  one here.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-04-29 16:44:34 -04:00
J. Bruce Fields b8fd47aefa nfsd: quiet compile warning
Stephen Rothwell said:
"Today's linux-next build (powerpc ppc64_defconfig) produced this new
warning:

fs/nfsd/nfs4state.c: In function 'EXPIRED_STATEID':
fs/nfsd/nfs4state.c:2757: warning: comparison of distinct pointer types lacks a cast

Caused by commit 78155ed75f ("nfsd4:
distinguish expired from stale stateids")."

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
2009-04-29 11:36:17 -04:00
J. Bruce Fields c654b8a9cb nfsd: support ext4 i_version
ext4 supports a real NFSv4 change attribute, which is bumped whenever
the ctime would be updated, including times when two updates arrive
within a jiffy of each other.  (Note that although ext4 has space for
nanosecond-precision ctime, the real resolution is lower: it actually
uses jiffies as the time-source.)  This ensures clients will invalidate
their caches when they need to.

There is some fear that keeping the i_version up-to-date could have
performance drawbacks, so for now it's turned on only by a mount option.
We hope to do something better eventually.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Theodore Tso <tytso@mit.edu>
2009-04-29 11:35:49 -04:00
J. Bruce Fields 3352d2c2d0 nfsd4: delete obsolete xdr comments
We don't need comments to tell us these macros are ugly.  And we're long
past trying to share any of this code with the BSD's.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-04-29 11:35:49 -04:00
J. Bruce Fields bc749ca4c4 nfsd: eliminate ENCODE_HEAD macro
This macro doesn't serve any useful purpose.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-04-29 11:35:49 -04:00
Chuck Lever e06b64050e NFSD: Stricter buffer size checking in fs/nfsd/nfsctl.c
Clean up: For consistency, handle output buffer size checking in a
other nfsctl functions the same way it's done for write_versions().

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-04-28 13:54:30 -04:00