Using NIPQUAD() with NIPQUAD_FMT, %d.%d.%d.%d or %u.%u.%u.%u
can be replaced with %pI4
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
since commit ff7d9756b5
"nfsd: use static memory for callback program and stats"
do_probe_callback uses a static callback program
(NFS4_CALLBACK) rather than the one set in clp->cl_callback.cb_prog
as passed in by the client in setclientid (4.0)
or create_session (4.1).
This patches introduces rpc_create_args.prognumber that allows
overriding program->number when creating rpc_clnt.
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
If another task is busy in rpcb_getport_async number, it is more efficient
to have it wake us up when it has finished instead of arbitrarily sleeping
for 5 seconds.
Also ensure that rpcb_wake_rpcbind_waiters() is called regardless of
whether or not rpcb_getport_done() gets called.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The cl_chatty flag alows us to control whether a given rpc client leaves
"server X not responding, timed out"
messages in the syslog. Such messages make sense for ordinary nfs
clients (where an unresponsive server means applications on the
mountpoint are probably hanging), but not for the callback client (which
can fail more commonly, with the only result just of disabling some
optimizations).
Previously cl_chatty was removed, do to lack of users; reinstate it, and
use it for the nfsd's callback client.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Clean up: move the logic that displays each task to its own function.
This removes indentation and makes future changes easier.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Clean up: don't display the rpc_show_tasks column header unless there is at
least one task to display. As far as I can tell, it is safe to let the
list_for_each_entry macro decide that each list is empty.
scripts/checkpatch.pl also wants a KERN_FOO at the start of any newly added
printk() calls, so this and subsequent patches will also add KERN_INFO.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The RPC client uses a finite state machine to move RPC tasks through each
step of an RPC request. Each state is contained in a function in
net/sunrpc/clnt.c, and named call_foo.
Some of the functions named call_foo have changed over the past few years and
are no longer states in the FSM. These include: call_encode, call_header,
and call_verify. As a clean up, rename the functions that have changed.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Improve debugging messages in call_start() and call_verify() by having
them show the RPC procedure name instead of the procedure number.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The special 'ENOMEM' case that was previously flagged as non-fatal is
bogus: auth_gss always returns EAGAIN for non-fatal errors, and may in fact
return ENOMEM in the special case where xdr_buf_read_netobj runs out of
preallocated buffer space (invariably a _fatal_ error, since there is no
provision for preallocating larger buffers).
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
All errors from call_encode(), with exception of EAGAIN are fatal, so we
should immediately return instead of proceeding to xprt_transmit().
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
RFC 2203 requires the server to drop the request if it believes the
RPCSEC_GSS context is out of sequence. The problem is that we have no way
on the client to know why the server dropped the request. In order to avoid
spinning forever trying to resend the request, the safe approach is
therefore to always invalidate the RPCSEC_GSS context on every major
timeout.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
NFSv4 requires us to ensure that we break the TCP connection before we're
allowed to retransmit a request. However in the case where we're
retransmitting several requests that have been sent on the same
connection, we need to ensure that we don't interfere with the attempt to
reconnect and/or break the connection again once it has been established.
We therefore introduce a 'connection' cookie that is bumped every time a
connection is broken. This allows requests to track if they need to force a
disconnection.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
We want to ensure that req->rq_private_buf.len is updated before
req->rq_received, so that call_decode() doesn't use an old value for
req->rq_rcv_buf.len.
In 'call_decode()' itself, instead of using task->tk_status (which is set
using req->rq_received) must use the actual value of
req->rq_private_buf.len when deciding whether or not the received RPC reply
is too short.
Finally ensure that we set req->rq_rcv_buf.len to zero when retrying a
request. A typo meant that we were resetting req->rq_private_buf.len in
call_decode(), and then clobbering that value with the old rq_rcv_buf.len
again in xprt_transmit().
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
call_verify() can, under certain circumstances, free the RPC slot. In that
case, our cached pointer 'req = task->tk_rqstp' is invalid. Bug was
introduced in commit 220bcc2afd (SUNRPC:
Don't call xprt_release in call refresh).
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Commit 510deb0d was supposed to move the xprt_create_transport() call in
rpc_create(), but neglected to remove the old call site. This resulted in
a transport leak after every rpc_create() call.
This leak is present in 2.6.24 and 2.6.25.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
__FUNCTION__ is gcc-specific, use __func__
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In all cases where we currently use rpc_wake_up_task(), we almost always
know on which waitqueue the rpc_task is actually sleeping. This will allows
us to simplify the queue locking in a future patch.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Use updated file list for docbook files and
fix kernel-doc warnings in sunrpc:
Warning(linux-2.6.24-git12//net/sunrpc/rpc_pipe.c:689): No description found for parameter 'rpc_client'
Warning(linux-2.6.24-git12//net/sunrpc/rpc_pipe.c:765): No description found for parameter 'flags'
Warning(linux-2.6.24-git12//net/sunrpc/clnt.c:584): No description found for parameter 'tk_ops'
Warning(linux-2.6.24-git12//net/sunrpc/clnt.c:618): No description found for parameter 'bufsize'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Neil Brown <neilb@suse.de>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'task_killable' of git://git.kernel.org/pub/scm/linux/kernel/git/willy/misc: (22 commits)
Remove commented-out code copied from NFS
NFS: Switch from intr mount option to TASK_KILLABLE
Add wait_for_completion_killable
Add wait_event_killable
Add schedule_timeout_killable
Use mutex_lock_killable in vfs_readdir
Add mutex_lock_killable
Use lock_page_killable
Add lock_page_killable
Add fatal_signal_pending
Add TASK_WAKEKILL
exit: Use task_is_*
signal: Use task_is_*
sched: Use task_contributes_to_load, TASK_ALL and TASK_NORMAL
ptrace: Use task_is_*
power: Use task_is_*
wait: Use TASK_NORMAL
proc/base.c: Use task_is_*
proc/array.c: Use TASK_REPORT
perfmon: Use task_is_*
...
Fixed up conflicts in NFS/sunrpc manually..
Clean up: have the set up routines explicitly pass the strings to be used
for the transport name and NETID. This removes a number of conditionals
and dependencies on rpc_xprt.prot, which is overloaded.
Tighten up type checking on the address_strings array while we're at it.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
In order to be able to support setting the timeo and retrans parameters on
a per-mountpoint basis, we move the rpc_timeout structure into the
rpc_clnt.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
If the ULP doesn't pass a hostname string to rpc_create(), it manufactures
one based on the passed-in address. Be smart enough to handle an AF_INET6
address properly in this case.
Move the default servername logic before the xprt_create_transport() call
to simplify error handling in rpc_create().
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The sunrpc client exports are not meant to be part of any official kernel
API: they can change at the drop of a hat. Mark them as internal functions
using EXPORT_SYMBOL_GPL.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Move the calls to xprt_disconnect() over to xprt_force_disconnect() in
order to enable the transport layer to manage the state of the
XPRT_CONNECTED flag.
Ditto in xs_tcp_read_fraghdr().
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
By using the TASK_KILLABLE infrastructure, we can get rid of the 'intr'
mount option. We have to use _killable everywhere instead of _interruptible
as we get rid of rpc_clnt_sigmask/sigunmask.
Signed-off-by: Liam R. Howlett <howlett@gmail.com>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Since 43780b87fa7..., rpc_create() fills in a default hostname based on
the ip address if the servername passed in is null. A small typo made
that default incorrect. (But this information appears to be used only
for debugging right now, so I don't believe the typo causes any bugs in
the current kernel.)
Thanks to Olga Kornievskaia for bug report and testing.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Olga Kornievskaia <aglo@citi.umich.edu>
Cc: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
To prepare for including non-sockets-based RPC transports, select
RPC transports by an identifier (to be used in following patches).
Signed-off-by: Tom Talpey <tmt@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
To prepare for including non-sockets-based RPC transports, change the
overly suggestive name of the transport creation arguments struct.
Signed-off-by: Tom Talpey <tmt@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Adds a flag word to the xdrbuf struct which indicates any bulk
disposition of the data. This enables RPC transport providers to
marshal it efficiently/appropriately, and may enable other
optimizations.
Signed-off-by: Tom Talpey <tmt@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The purpose of an RPC ping (a NULL request) is to determine whether the
remote end is operating and supports the RPC program and version of the
request.
If we do an RPC bind and the remote's rpcbind service says "this
program or service isn't supported" then we have our answer already,
and we should give up immediately.
This is good for the kernel mount client, as it will cause the request
to fail, and then allow an immediate retry with different options.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Add more new error code processing to the kernel's rpcbind client
and to call_bind_status() to distinguish two cases:
Case 1: the remote has replied that the program/version tuple is not
registered (returns EACCES)
Case 2: retry with a lesser rpcbind version (rpcb now returns EPFNOSUPPORT)
This change allows more specific error processing for each of these two
cases. We now fail case 2 instead of retrying... it's a server
configuration error not to support even rpcbind version 2. And don't
expose this new error code to user land -- convert it to EIO before
failing the RPC.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Add new error code processing to the kernel's rpcbind client and to
call_bind_status() to distinguish two cases:
Case 1: the remote has replied that the program/version tuple is not
registered (returns -EACCES)
Case 2: another process is already in the middle of binding on this
transport (now returns -EAGAIN)
This change allows more specific retry processing for each of these two
cases.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
/home/cel/linux/net/sunrpc/clnt.c: In function ‘rpc_show_tasks’:
/home/cel/linux/net/sunrpc/clnt.c:1538: warning:
signed and unsigned type in conditional expression
This points out another case where a conditional expression returns a
signed value in one arm and an unsigned value in the other.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Check the length of the passed-in server name before trying to print it in
the log.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
/home/cel/linux/net/sunrpc/clnt.c: In function ‘rpc_bind_new_program’:
/home/cel/linux/net/sunrpc/clnt.c:445: warning:
comparison between signed and unsigned
RPC version numbers are u32, not int.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Remove one of two identical dprintk's that occur when an RPC client is
created.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
We don't need the BKL when wrapping and unwrapping; and experiments by Avishay
Traeger have found that permitting multiple encryption and decryption
operations to proceed in parallel can provide significant performance
improvements.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Avishay Traeger <atraeger@cs.sunysb.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
In addition to binding to a local privileged port the NFS client should
allow binding to a specific local address. This is used by the server
for callbacks. The patch adds the necessary interface.
Signed-off-by: Frank van Maarseveen <frankvm@frankvm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cleanup argument passing to functions for creating an RPC transport.
Signed-off-by: Frank van Maarseveen <frankvm@frankvm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
A couple of callers just use a stringified IP address for the rpc client's
hostname. Move the logic for constructing this into rpc_create(), so it can
be shared.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
We should almost always be deferencing the rpc_auth struct by means of the
credential's cr_auth field instead of the rpc_clnt->cl_auth anyway. Fix up
that historical mistake, and remove the macro that propagated it.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
RPCSEC_GSS needs to be able to send NULL RPC calls to the server in order
to free up any remaining GSS contexts.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Does a NULL RPC call and returns a pointer to the resulting rpc_task. The
call may be either synchronous or asynchronous.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The kref now does most of what cl_count + cl_user used to do. The only
remaining role for cl_count is to tell us if we are in a 'shutdown'
phase. We can provide that information using a single bit field instead
of a full atomic counter.
Also rename rpc_destroy_client() to rpc_close_client(), which reflects
better what its role is these days.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Replace it with explicit calls to rpc_shutdown_client() or
rpc_destroy_client() (for the case of asynchronous calls).
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Its use is at best racy, and there is only one user (lockd), which has
additional locking that makes the whole thing redundant.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Currently rpc_malloc sets req->rq_buffer internally. Make this a more
generic interface: return a pointer to the new buffer (or NULL) and
make the caller set req->rq_buffer and req->rq_bufsize. This looks much
more like kmalloc and eliminates the side effects.
To fix a potential deadlock, this patch also replaces GFP_NOFS with
GFP_NOWAIT in rpc_malloc. This prevents async RPCs from sleeping outside
the RPC's task scheduler while allocating their buffer.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The RPC buffer size estimation logic in net/sunrpc/clnt.c always
significantly overestimates the requirements for the buffer size.
A little instrumentation demonstrated that in fact rpc_malloc was never
allocating the buffer from the mempool, but almost always called kmalloc.
To compute the size of the RPC buffer more precisely, split p_bufsiz into
two fields; one for the argument size, and one for the result size.
Then, compute the sum of the exact call and reply header sizes, and split
the RPC buffer precisely between the two. That should keep almost all RPC
buffers within the 2KiB buffer mempool limit.
And, we can finally be rid of RPC_SLACK_SPACE!
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Fix a regression due to the patch "NFS: disconnect before retrying NFSv4
requests over TCP"
The assumption made in xprt_transmit() that the condition
"req->rq_bytes_sent == 0 and request is on the receive list"
should imply that we're dealing with a retransmission is false.
Firstly, it may simply happen that the socket send queue was full
at the time the request was initially sent through xprt_transmit().
Secondly, doing this for each request that was retransmitted implies
that we disconnect and reconnect for _every_ request that happened to
be retransmitted irrespective of whether or not a disconnection has
already occurred.
Fix is to move this logic into the call_status request timeout handler.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
RFC3530 section 3.1.1 states an NFSv4 client MUST NOT send a request
twice on the same connection unless it is the NULL procedure. Section
3.1.1 suggests that the client should disconnect and reconnect if it
wants to retry a request.
Implement this by adding an rpc_clnt flag that an ULP can use to
specify that the underlying transport should be disconnected on a
major timeout. The NFSv4 client asserts this new flag, and requests
no retries after a minor retransmit timeout.
Note that disconnecting on a retransmit is in general not safe to do
if the RPC client does not reuse the TCP port number when reconnecting.
See http://bugzilla.linux-nfs.org/show_bug.cgi?id=6
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The tk_pid field is an unsigned short. The proper print format specifier for
that type is %5u, not %4d.
Also clean up some miscellaneous print formatting nits.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The error values are already propagated through task->tk_status, and
none of the callers check one without checking the other, so we can
drop the return value.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Fix the Oops in http://bugzilla.linux-nfs.org/show_bug.cgi?id=138
We shouldn't be calling rpc_release_task() for tasks that are not active.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
For now we will assume that all transports will use the address format
buffers in the rpc_xprt struct to store their addresses. Change
rpc_peer2str() to be a generic routine to handle this, and get rid of the
print_address() op in the rpc_xprt_ops vector.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
All internal RPC client operations should no longer depend on the BKL,
however lockd and NFS callbacks may still require it.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Replace references to system_utsname to the per-process uts namespace
where appropriate. This includes things like uname.
Changes: Per Eric Biederman's comments, use the per-process uts namespace
for ELF_PLATFORM, sunrpc, and parts of net/ipv4/ipconfig.c
[jdike@addtoit.com: UML fix]
[clg@fr.ibm.com: cleanup]
[akpm@osdl.org: build fix]
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Kirill Korotaev <dev@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Herbert Poetzl <herbert@13thfloor.at>
Cc: Andrey Savochkin <saw@sw.ru>
Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
pure s/u32/__be32/
[AV: large part based on Alexey's patches]
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
In a subsequent patch, this will allow the portmapper to take a reference
to the rpc_xprt for which it is updating the port number, fixing an Oops.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
- Ensure that the task aborts the RPC call only when it has actually timed out.
- Ensure that req->rq_majortimeo is initialised correctly.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
In case of any of the above errors occuring, delay for 3 seconds, then
handle as if it were a timeout error.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>