linux

Commit Graph

Author	SHA1	Message	Date
Tom Tucker	2779e3ae39	svc: Move kfree of deferral record to common code The rqstp structure has a pointer to a svc_deferred_req record that is allocated when requests are deferred. This record is common to all transports and can be freed in common code. Move the kfree of the rq_deferred to the common svc_xprt_release function. This also fixes a memory leak in the RDMA transport which does not kfree the dr structure in it's version of the xpo_release_rqst callback. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-07 15:40:45 -05:00
Trond Myklebust	69b6ba3712	SUNRPC: Ensure the server closes sockets in a timely fashion We want to ensure that connected sockets close down the connection when we set XPT_CLOSE, so that we don't keep it hanging while cleaning up all the stuff that is keeping a reference to the socket. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:57 -05:00
Jeff Layton	c9233eb7b0	sunrpc: add sv_maxconn field to svc_serv (try #3 ) svc_check_conn_limits() attempts to prevent denial of service attacks by having the service close old connections once it reaches a threshold. This threshold is based on the number of threads in the service: (serv->sv_nrthreads + 3) * 20 Once we reach this, we drop the oldest connections and a printk pops to warn the admin that they should increase the number of threads. Increasing the number of threads isn't an option however for services like lockd. We don't want to eliminate this check entirely for such services but we need some way to increase this limit. This patch adds a sv_maxconn field to the svc_serv struct. When it's set to 0, we use the current method to calculate the max number of connections. RPC services can then set this on an as-needed basis. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Neil Brown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:47 -05:00
Al Viro	56ff5efad9	zero i_uid/i_gid on inode allocation ... and don't bother in callers. Don't bother with zeroing i_blocks, while we are at it - it's already been zeroed. i_mode is not worth the effort; it has no common default value. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-01-05 11:54:28 -05:00
Trond Myklebust	08cc36cbd1	Merge branch 'devel' into next	2008-12-30 16:51:43 -05:00
Linus Torvalds	0191b625ca	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1429 commits) net: Allow dependancies of FDDI & Tokenring to be modular. igb: Fix build warning when DCA is disabled. net: Fix warning fallout from recent NAPI interface changes. gro: Fix potential use after free sfc: If AN is enabled, always read speed/duplex from the AN advertising bits sfc: When disabling the NIC, close the device rather than unregistering it sfc: SFT9001: Add cable diagnostics sfc: Add support for multiple PHY self-tests sfc: Merge top-level functions for self-tests sfc: Clean up PHY mode management in loopback self-test sfc: Fix unreliable link detection in some loopback modes sfc: Generate unique names for per-NIC workqueues 802.3ad: use standard ethhdr instead of ad_header 802.3ad: generalize out mac address initializer 802.3ad: initialize ports LACPDU from const initializer 802.3ad: remove typedef around ad_system 802.3ad: turn ports is_individual into a bool 802.3ad: turn ports is_enabled into a bool 802.3ad: make ntt bool ixgbe: Fix set_ringparam in ixgbe to use the same memory pools. ... Fixed trivial IPv4/6 address printing conflicts in fs/cifs/connect.c due to the conversion to %pI (in this networking merge) and the addition of doing IPv6 addresses (from the earlier merge of CIFS).	2008-12-28 12:49:40 -08:00
Olga Kornievskaia	2efef7080f	rpc: add service field to new upcall This patch extends the new upcall with a "service" field that currently can have 2 values: "" or "nfs". These values specify matching rules for principals in the keytab file. The "" means that gssd is allowed to use "root", "nfs", or "host" keytab entries while the other option requires "nfs". Restricting gssd to use the "nfs" principal is needed for when the server performs a callback to the client. The server in this case has to authenticate itself as an "nfs" principal. We also need "service" field to distiguish between two client-side cases both currently using a uid of 0: the case of regular file access by the root user, and the case of state-management calls (such as setclientid) which should use a keytab for authentication. (And the upcall should fail if an appropriate principal can't be found.) Signed-off: Olga Kornievskaia <aglo@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:19:56 -05:00
Olga Kornievskaia	8b1c7bf5b6	rpc: add target field to new upcall This patch extends the new upcall by adding a "target" field communicating who we want to authenticate to (equivalently, the service principal that we want to acquire a ticket for). Signed-off: Olga Kornievskaia <aglo@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:19:26 -05:00
Olga Kornievskaia	61054b14d5	nfsd: support callbacks with gss flavors This patch adds server-side support for callbacks other than AUTH_SYS. Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:19:00 -05:00
Olga Kornievskaia	945b34a772	rpc: allow gss callbacks to client This patch adds client-side support to allow for callbacks other than AUTH_SYS. Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:18:34 -05:00
Olga Kornievskaia	608207e888	rpc: pass target name down to rpc level on callbacks The rpc client needs to know the principal that the setclientid was done as, so it can tell gssd who to authenticate to. Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:17:40 -05:00
Olga Kornievskaia	68e76ad0ba	nfsd: pass client principal name in rsc downcall Two principals are involved in krb5 authentication: the target, who we authenticate to (normally the name of the server, like nfs/server.citi.umich.edu@CITI.UMICH.EDU), and the source, we we authenticate as (normally a user, like bfields@UMICH.EDU) In the case of NFSv4 callbacks, the target of the callback should be the source of the client's setclientid call, and the source should be the nfs server's own principal. Therefore we allow svcgssd to pass down the name of the principal that just authenticated, so that on setclientid we can store that principal name with the new client, to be used later on callbacks. Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:17:15 -05:00
$\"J. Bruce Fields\$ \"J. Bruce Fields\	34769fc488	rpc: implement new upcall Implement the new upcall. We decide which version of the upcall gssd will use (new or old), by creating both pipes (the new one named "gssd", the old one named after the mechanism (e.g., "krb5")), and then waiting to see which version gssd actually opens. We don't permit pipes of the two different types to be opened at once. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:16:37 -05:00
$\"J. Bruce Fields\$ \"J. Bruce Fields\	5b7ddd4a7b	rpc: store pointer to pipe inode in gss upcall message Keep a pointer to the inode that the message is queued on in the struct gss_upcall_msg. This will be convenient, especially after we have a choice of two pipes that an upcall could be queued on. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:15:44 -05:00
$\"J. Bruce Fields\$ \"J. Bruce Fields\	79a3f20b64	rpc: use count of pipe openers to wait for first open Introduce a global variable pipe_version which will eventually be used to keep track of which version of the upcall gssd is using. For now, though, it only keeps track of whether any pipe is open or not; it is negative if not, zero if one is opened. We use this to wait for the first gssd to open a pipe. (Minor digression: note this waits only for the very first open of any pipe, not for the first open of a pipe for a given auth; thus we still need the RPC_PIPE_WAIT_FOR_OPEN behavior to wait for gssd to open new pipes that pop up on subsequent mounts.) Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:10:52 -05:00
$\"J. Bruce Fields\$ \"J. Bruce Fields\	cf81939d6f	rpc: track number of users of the gss upcall pipe Keep a count of the number of pipes open plus the number of messages on a pipe. This count isn't used yet. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:10:19 -05:00
$\"J. Bruce Fields\$ \"J. Bruce Fields\	e712804ae4	rpc: call release_pipe only on last close I can't see any reason we need to call this until either the kernel or the last gssd closes the pipe. Also, this allows to guarantee that open_pipe and release_pipe are called strictly in pairs; open_pipe on gssd's first open, release_pipe on gssd's last close (or on the close of the kernel side of the pipe, if that comes first). That will make it very easy for the gss code to keep track of which pipes gssd is using. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:09:47 -05:00
$\"J. Bruce Fields\$ \"J. Bruce Fields\	c381060869	rpc: add an rpc_pipe_open method We want to transition to a new gssd upcall which is text-based and more easily extensible. To simplify upgrades, as well as testing and debugging, it will help if we can upgrade gssd (to a version which understands the new upcall) without having to choose at boot (or module-load) time whether we want the new or the old upcall. We will do this by providing two different pipes: one named, as currently, after the mechanism (normally "krb5"), and supporting the old upcall. One named "gssd" and supporting the new upcall version. We allow gssd to indicate which version it supports by its choice of which pipe to open. As we have no interest in supporting simultaneous use of both versions, we'll forbid opening both pipes at the same time. So, add a new pipe_open callback to the rpc_pipefs api, which the gss code can use to track which pipes have been open, and to refuse opens of incompatible pipes. We only need this to be called on the first open of a given pipe. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:08:32 -05:00
$\"J. Bruce Fields\$ \"J. Bruce Fields\	db75b3d6b5	rpc: minor gss_alloc_msg cleanup I want to add a little more code here, so it'll be convenient to have this flatter. Also, I'll want to add another error condition, so it'll be more convenient to return -ENOMEM than NULL in the error case. The only caller is already converting NULL to -ENOMEM anyway. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:07:13 -05:00
$\"J. Bruce Fields\$ \"J. Bruce Fields\	b03568c322	rpc: factor out warning code from gss_pipe_destroy_msg We'll want to call this from elsewhere soon. And this is a bit nicer anyway. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:06:55 -05:00
$\"J. Bruce Fields\$ \"J. Bruce Fields\	99db356368	rpc: remove unnecessary assignment We're just about to kfree() gss_auth, so there's no point to setting any of its fields. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:06:33 -05:00
Jeff Layton	6dcd3926b2	sunrpc: fix code that makes auth_gss send destroy_cred message (try #2 ) There's a bit of a chicken and egg problem when it comes to destroying auth_gss credentials. When we destroy the last instance of a GSSAPI RPC credential, we should send a NULL RPC call with a GSS procedure of RPCSEC_GSS_DESTROY to hint to the server that it can destroy those creds. This isn't happening because we're setting clearing the uptodate bit on the credentials and then setting the operations to the gss_nullops. When we go to do the RPC call, we try to refresh the creds. That fails with -EACCES and the call fails. Fix this by not clearing the UPTODATE bit for the credentials and adding a new crdestroy op for gss_nullops that just tears down the cred without trying to destroy the context. The only difference between this patch and the first one is the removal of some minor formatting deltas. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:57 -05:00
Peter Staubach	64672d55d9	optimize attribute timeouts for "noac" and "actimeo=0" Hi. I've been looking at a bugzilla which describes a problem where a customer was advised to use either the "noac" or "actimeo=0" mount options to solve a consistency problem that they were seeing in the file attributes. It turned out that this solution did not work reliably for them because sometimes, the local attribute cache was believed to be valid and not timed out. (With an attribute cache timeout of 0, the cache should always appear to be timed out.) In looking at this situation, it appears to me that the problem is that the attribute cache timeout code has an off-by-one error in it. It is assuming that the cache is valid in the region, [read_cache_jiffies, read_cache_jiffies + attrtimeo]. The cache should be considered valid only in the region, [read_cache_jiffies, read_cache_jiffies + attrtimeo). With this change, the options, "noac" and "actimeo=0", work as originally expected. This problem was previously addressed by special casing the attrtimeo == 0 case. However, since the problem is only an off- by-one error, the cleaner solution is address the off-by-one error and thus, not require the special case. Thanx... ps Signed-off-by: Peter Staubach <staubach@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:56 -05:00
Trond Myklebust	7bd8826915	SUNRPC: rpcsec_gss modules should not be used by out-of-tree code Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:32 -05:00
Trond Myklebust	468039ee46	SUNRPC: Convert the xdr helpers and rpc_pipefs to EXPORT_SYMBOL_GPL We've never considered the sunrpc code as part of any ABI to be used by out-of-tree modules. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:31 -05:00
Trond Myklebust	88a9fe8cae	SUNRPC: Remove the last remnant of the BKL... Somehow, this escaped the previous purge. There should be no need to keep any extra locks in the XDR callbacks. The NFS client XDR code only writes into private objects, whereas all reads of shared objects are confined to fields that do not change, such as filehandles... Ditto for lockd, the NFSv2/v3 client mount code, and rpcbind. The nfsd XDR code may require the BKL, but since it does a synchronous RPC call from a thread that already holds the lock, that issue is moot. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:31 -05:00
David S. Miller	eb14f01959	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/e1000e/ich8lan.c	2008-12-15 20:03:50 -08:00
Ilpo Järvinen	b1721d2bb9	rpc/rdma: goto instead of copypaste Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-14 23:19:48 -08:00
Roel Kluin	5eaa65b240	net: Make static Sparse asked whether these could be static. Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-10 15:18:31 -08:00
James Morris	ec98ce480a	Merge branch 'master' into next Conflicts: fs/nfsd/nfs4recover.c Manually fixed above to use new creds API functions, e.g. nfs4_save_creds(). Signed-off-by: James Morris <jmorris@namei.org>	2008-12-04 17:16:36 +11:00
Linus Torvalds	2433c41789	Merge branch 'for-2.6.28' of git://linux-nfs.org/~bfields/linux * 'for-2.6.28' of git://linux-nfs.org/~bfields/linux: NLM: client-side nlm_lookup_host() should avoid matching on srcaddr nfsd: use of unitialized list head on error exit in nfs4recover.c Add a reference to sunrpc in svc_addsock nfsd: clean up grace period on early exit	2008-12-03 16:40:37 -08:00
Ingo Molnar	ff0db0490a	sunrpc: fix warning in net/sunrpc/xprtrdma/verbs.c fix this warning: net/sunrpc/xprtrdma/verbs.c: In function ‘rpcrdma_conn_upcall’: net/sunrpc/xprtrdma/verbs.c:279: warning: unused variable ‘addr’ Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-25 16:58:42 -08:00
Ingo Molnar	ed72b9c6e0	sunrpc: fix warning in net/sunrpc/xprtrdma/svc_rdma_transport.c this warning: net/sunrpc/xprtrdma/svc_rdma_transport.c: In function ‘svc_rdma_accept’: net/sunrpc/xprtrdma/svc_rdma_transport.c:830: warning: ‘dma_mr_acc’ may be used uninitialized in this function triggers because GCC does not recognize the (correct) flow connection between need_dma_mr and dma_mr_acc. Annotate it. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-25 16:49:37 -08:00
Tom Tucker	2da2c21d75	Add a reference to sunrpc in svc_addsock The svc_addsock function adds transport instances without taking a reference on the sunrpc.ko module, however, the generic transport destruction code drops a reference when a transport instance is destroyed. Add a try_module_get call to the svc_addsock function for transport instances added by this function. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Tested-by: Jeff Moyer <jmoyer@redhat.com>	2008-11-24 10:15:01 -06:00
David S. Miller	6ab33d5171	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/ixgbe/ixgbe_main.c include/net/mac80211.h net/phonet/af_phonet.c	2008-11-20 16:44:00 -08:00
Trond Myklebust	23918b0306	SUNRPC: Fix a performance regression in the RPC authentication code Fix a regression reported by Max Kellermann whereby kernel profiling showed that his clients were spending 45% of their time in rpcauth_lookup_credcache. It turns out that although his processes had identical uid/gid/groups, generic_match() was failing to detect this, because the task->group_info pointers were not shared. This again lead to the creation of a huge number of identical credentials at the RPC layer. The regression is fixed by comparing the contents of task->group_info if the actual pointers are not identical. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-11-20 13:17:40 -08:00
David Howells	86a264abe5	CRED: Wrap current->cred and a few other accessors Wrap current->cred and a few other accessors to hide their actual implementation. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>	2008-11-14 10:39:18 +11:00
David Howells	b6dff3ec5e	CRED: Separate task security context from task_struct Separate the task security context from task_struct. At this point, the security data is temporarily embedded in the task_struct with two pointers pointing to it. Note that the Alpha arch is altered as it refers to (E)UID and (E)GID in entry.S via asm-offsets. With comment fixes Signed-off-by: Marc Dionne <marc.c.dionne@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>	2008-11-14 10:39:16 +11:00
David Howells	8f4194026b	CRED: Wrap task credential accesses in the SunRPC protocol Wrap access to task credentials so that they can be separated more easily from the task_struct during the introduction of COW creds. Change most current->(\|e\|s\|fs)[ug]id to current_(\|e\|s\|fs)[ug]id(). Change some task->e?[ug]id to task_e?[ug]id(). In some places it makes more sense to use RCU directly rather than a convenient wrapper; these will be addressed by later patches. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: J. Bruce Fields <bfields@fieldses.org> Cc: Neil Brown <neilb@suse.de> Cc: linux-nfs@vger.kernel.org Signed-off-by: James Morris <jmorris@namei.org>	2008-11-14 10:39:09 +11:00
David S. Miller	e0db4a786b	sunrpc: Fix build warning due to typo in %pI4 format changes. Noticed by Stephen Hemminger. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-02 23:57:06 -08:00
Harvey Harrison	21454aaad3	net: replace NIPQUAD() in net/*/ Using NIPQUAD() with NIPQUAD_FMT, %d.%d.%d.%d or %u.%u.%u.%u can be replaced with %pI4 Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-31 00:54:56 -07:00
David S. Miller	a1744d3bee	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/wireless/p54/p54common.c	2008-10-31 00:17:34 -07:00
Harvey Harrison	5b095d9892	net: replace %p6 with %pI6 Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-29 12:52:50 -07:00
Harvey Harrison	4b7a4274ca	net: replace %#p6 format specifier with %pi6 gcc warns when using the # modifier with the %p format specifier, so we can't use this to omit the colons when needed, introduces %pi6 instead. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-29 12:50:24 -07:00
Harvey Harrison	b189db5d29	net: remove NIP6(), NIP6_FMT, NIP6_SEQFMT and final users Open code NIP6_FMT in the one call inside sscanf and one user of NIP6() that could use %p6 in the netfilter code. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-28 23:02:38 -07:00
Harvey Harrison	fdb46ee752	net, misc: replace uses of NIP6_FMT with %p6 Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-28 23:02:32 -07:00
Harvey Harrison	b071195deb	net: replace all current users of NIP6_SEQFMT with %#p6 The define in kernel.h can be done away with at a later time. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-28 16:05:40 -07:00
Trond Myklebust	5f707eb429	SUNRPC: Fix potential race in put_rpccred() We have to be careful when we try to unhash the credential in put_rpccred(), because we're not holding the credcache lock, so the call to rpcauth_unhash_cred() may fail if someone else has looked the cred up, and obtained a reference to it. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-28 15:21:42 -04:00
Trond Myklebust	eac0d18d44	SUNRPC: Fix rpcauth_prune_expired We need to make sure that we don't remove creds from the cred_unused list if they are still under the moratorium, or else they will never get garbage collected. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-28 15:21:41 -04:00
Trond Myklebust	2a9e1cfa23	SUNRPC: Respond promptly to server TCP resets If the server sends us an RST error while we're in the TCP_ESTABLISHED state, then that will not result in a state change, and so the RPC client ends up hanging forever (see http://bugzilla.kernel.org/show_bug.cgi?id=11154) We can intercept the reset by setting up an sk->sk_error_report callback, which will then allow us to initiate a proper shutdown and retry... We also make sure that if the send request receives an ECONNRESET, then we shutdown too... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-28 15:21:39 -04:00
Linus Torvalds	b225ee5bed	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: net: Remove CONFIG_KMOD from net/ (towards removing CONFIG_KMOD entirely) ipv4: Add a missing rcu_assign_pointer() in routing cache. [netdrvr] ibmtr: PCMCIA IBMTR is ok on 64bit xen-netfront: Avoid unaligned accesses to IP header lmc: copy_*_user under spinlock [netdrvr] myri10ge, ixgbe: remove broken select INTEL_IOATDMA	2008-10-17 08:58:52 -07:00
Johannes Berg	95a5afca4a	net: Remove CONFIG_KMOD from net/ (towards removing CONFIG_KMOD entirely) Some code here depends on CONFIG_KMOD to not try to load protocol modules or similar, replace by CONFIG_MODULES where more than just request_module depends on CONFIG_KMOD and and also use try_then_request_module in ebtables. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-16 15:24:51 -07:00
Trond Myklebust	6925bac120	Merge branch 'next'	2008-10-15 15:54:56 -04:00
Linus Torvalds	8acd3a60bc	Merge branch 'for-2.6.28' of git://linux-nfs.org/~bfields/linux * 'for-2.6.28' of git://linux-nfs.org/~bfields/linux: (59 commits) svcrdma: Fix IRD/ORD polarity svcrdma: Update svc_rdma_send_error to use DMA LKEY svcrdma: Modify the RPC reply path to use FRMR when available svcrdma: Modify the RPC recv path to use FRMR when available svcrdma: Add support to svc_rdma_send to handle chained WR svcrdma: Modify post recv path to use local dma key svcrdma: Add a service to register a Fast Reg MR with the device svcrdma: Query device for Fast Reg support during connection setup svcrdma: Add FRMR get/put services NLM: Remove unused argument from svc_addsock() function NLM: Remove "proto" argument from lockd_up() NLM: Always start both UDP and TCP listeners lockd: Remove unused fields in the nlm_reboot structure lockd: Add helper to sanity check incoming NOTIFY requests lockd: change nlmclnt_grant() to take a "struct sockaddr *" lockd: Adjust nlmsvc_lookup_host() to accomodate AF_INET6 addresses lockd: Adjust nlmclnt_lookup_host() signature to accomodate non-AF_INET lockd: Support non-AF_INET addresses in nlm_lookup_host() NLM: Convert nlm_lookup_host() to use a single argument svcrdma: Add Fast Reg MR Data Types ...	2008-10-14 12:31:14 -07:00
Alan Cox	113aa838ec	net: Rationalise email address: Network Specific Parts Clean up the various different email addresses of mine listed in the code to a single current and valid address. As Dave says his network merges for 2.6.28 are now done this seems a good point to send them in where they won't risk disrupting real changes. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-13 19:01:08 -07:00
Tom Talpey	c055551e97	RPC/RDMA: ensure connection attempt is complete before signalling. The RPC/RDMA connection logic could return early from reconnection attempts, leading to additional spurious retries. Signed-off-by: Tom Talpey <talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-10 15:15:06 -04:00
Tom Talpey	08ca0dce1e	RPC/RDMA: correct the reconnect timer backoff The RPC/RDMA code had a constant 5-second reconnect backoff, and always performed it, even when re-establishing a connection to a server after the RPC layer closed it due to being idle. Make it an geometric backoff (up to 30 seconds), and don't delay idle reconnect. Signed-off-by: Tom Talpey <talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-10 15:15:02 -04:00
Tom Talpey	b3cd8d45a7	RPC/RDMA: optionally emit useful transport info upon connect/disconnect. Signed-off-by: Tom Talpey <talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-10 15:13:59 -04:00
Tom Talpey	5f37d561e0	RPC/RDMA: reformat a debug printk to keep lines together. The send marshaling code split a particular dprintk across two lines, which makes it hard to extract from logfiles. Signed-off-by: Tom Talpey <talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-10 15:13:42 -04:00
Tom Talpey	5675add36e	RPC/RDMA: harden connection logic against missing/late rdma_cm upcalls. Add defensive timeouts to wait_for_completion() calls in RDMA address resolution, and make them interruptible. Fix the timeout units to milliseconds (formerly jiffies) and move to private header. Signed-off-by: Tom Talpey <talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-10 15:13:31 -04:00
Tom Talpey	1a954051b0	RPC/RDMA: fix connect/reconnect resource leak. The RPC/RDMA code can leak RDMA connection manager endpoints in certain error cases on connect. Don't signal unwanted events, and be certain to destroy any allocated qp. Signed-off-by: Tom Talpey <talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-10 15:13:02 -04:00
Tom Talpey	926449ba66	RPC/RDMA: return a consistent error, when connect fails. The xprt_connect call path does not expect such errors as ECONNREFUSED to be returned from failed transport connection attempts, otherwise it translates them to EIO and signals fatal errors. For example, mount.nfs prints simply "internal error". Translate all such errors to ENOTCONN from RPC/RDMA to match sockets behavior. Signed-off-by: Tom Talpey <talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-10 15:12:44 -04:00
Tom Talpey	9191ca3b38	RPC/RDMA: adhere to protocol for unpadded client trailing write chunks. The RPC/RDMA protocol allows clients and servers to avoid RDMA operations for data which is purely the result of XDR padding. On the client, automatically insert the necessary padding for such server replies, and optionally don't marshal such chunks. Signed-off-by: Tom Talpey <talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-10 15:12:33 -04:00
Tom Talpey	fee08caf94	RPC/RDMA: avoid an oops due to disconnect racing with async upcalls. RDMA disconnects yield an upcall from the RDMA connection manager, which can race with rpc transport close, e.g. on ^C of a mount. Ensure any rdma cm_id and qp are fully destroyed before continuing. Signed-off-by: Tom Talpey <talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-10 15:11:40 -04:00
Tom Talpey	ad0e9e01da	RPC/RDMA: maintain the RPC task bytes-sent statistic. Signed-off-by: Tom Talpey <talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-10 15:10:40 -04:00
Tom Talpey	575448bd36	RPC/RDMA: suppress retransmit on RPC/RDMA clients. An RPC/RDMA client cannot retransmit on an unbroken connection, doing so violates its flow control with the server. Signed-off-by: Tom Talpey <talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-10 15:10:36 -04:00
Tom Tucker	b334eaabf4	RPC/RDMA: fix connection IRD/ORD setting This logic sets the connection parameter that configures the local device and informs the remote peer how many concurrent incoming RDMA_READ requests are supported. The original logic didn't really do what was intended for two reasons: - The max number supported by the device is typically smaller than any one factor in the calculation used, and - The field in the connection parameter structure where the value is stored is a u8 and always overflows for the default settings. So what really happens is the value requested for responder resources is the left over 8 bits from the "desired value". If the desired value happened to be a multiple of 256, the result was zero and it wouldn't connect at all. Given the above and the fact that max_requests is almost always larger than the max responder resources supported by the adapter, this patch simplifies this logic and simply requests the max supported by the device, subject to a reasonable limit. This bug was found by Jim Schutt at Sandia. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Tom Talpey <talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-10 15:09:56 -04:00
Tom Talpey	3197d309f5	RPC/RDMA: support FRMR client memory registration. Configure, detect and use "fastreg" support from IB/iWARP verbs layer to perform RPC/RDMA memory registration. Make FRMR the default memreg mode (will fall back if not supported by the selected RDMA adapter). This allows full and optimal operation over the cxgb3 adapter, and others. Signed-off-by: Tom Talpey <talpey@netapp.com> Acked-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-10 15:09:34 -04:00
Tom Talpey	bd7ed1d133	RPC/RDMA: check selected memory registration mode at runtime. At transport creation, check for, and use, any local dma lkey. Then, check that the selected memory registration mode is in fact supported by the RDMA adapter selected for the mount. Fall back to best alternative if not. Signed-off-by: Tom Talpey <talpey@netapp.com> Acked-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-10 15:09:26 -04:00
Tom Talpey	fe9053b30b	RPC/RDMA: add data types and new FRMR memory registration enum. Internal RPC/RDMA structure updates in preparation for FRMR support. Signed-off-by: Tom Talpey <talpey@netapp.com> Acked-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-10 15:09:24 -04:00
Tom Talpey	8d4ba0347c	RPC/RDMA: refactor the inline memory registration code. Refactor the memory registration and deregistration routines. This saves stack space, makes the code more readable and prepares to add the new FRMR registration methods. Signed-off-by: Tom Talpey <talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-10 15:09:16 -04:00
J. Bruce Fields	107e0008df	Merge branch 'from-tomtucker' into for-2.6.28	2008-10-08 18:22:18 -04:00
Cedric Le Goater	63ffc23d30	sunrpc: fix oops in rpc_create when the mount namespace is unshared On a system with nfs mounts, if a task unshares its mount namespace, a oops can occur when the system is rebooted if the task is the last to unreference the nfs mount. It will try to create a rpc request using utsname() which has been invalidated by free_nsproxy(). The patch fixes the issue by using the global init_utsname() which is always valid. the capability of identifying rpc clients per uts namespace stills needs some extra work so this should not be a problem. BUG: unable to handle kernel NULL pointer dereference at 00000004 IP: [<c024c9ab>] rpc_create+0x332/0x42f Oops: 0000 [#1] DEBUG_PAGEALLOC Pid: 1857, comm: uts-oops Not tainted (2.6.27-rc5-00319-g7686ad5 #4) EIP: 0060:[<c024c9ab>] EFLAGS: 00210287 CPU: 0 EIP is at rpc_create+0x332/0x42f EAX: 00000000 EBX: df26adf0 ECX: c0251887 EDX: 00000001 ESI: df26ae58 EDI: c02f293c EBP: dda0fc9c ESP: dda0fc2c DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 Process uts-oops (pid: 1857, ti=dda0e000 task=dd9a0778 task.ti=dda0e000) Stack: c0104532 dda0fffc dda0fcac dda0e000 dda0e000 dd93b7f0 00000009 c02f2880 df26aefc dda0fc68 c01096b7 00000000 c0266ee0 c039a070 c039a070 dda0fc74 c012ca67 c039a064 dda0fc8c c012cb20 c03daf74 00000011 00000000 c0275c90 Call Trace: [<c0104532>] ? dump_trace+0xc2/0xe2 [<c01096b7>] ? save_stack_trace+0x1c/0x3a [<c012ca67>] ? save_trace+0x37/0x8c [<c012cb20>] ? add_lock_to_list+0x64/0x96 [<c0256fc4>] ? rpcb_register_call+0x62/0xbb [<c02570c8>] ? rpcb_register+0xab/0xb3 [<c0252f4d>] ? svc_register+0xb4/0x128 [<c0253114>] ? svc_destroy+0xec/0x103 [<c02531b2>] ? svc_exit_thread+0x87/0x8d [<c01a75cd>] ? lockd_down+0x61/0x81 [<c01a577b>] ? nlmclnt_done+0xd/0xf [<c01941fe>] ? nfs_destroy_server+0x14/0x16 [<c0194328>] ? nfs_free_server+0x4c/0xaa [<c019a066>] ? nfs_kill_super+0x23/0x27 [<c0158585>] ? deactivate_super+0x3f/0x51 [<c01695d1>] ? mntput_no_expire+0x95/0xb4 [<c016965b>] ? release_mounts+0x6b/0x7a [<c01696cc>] ? __put_mnt_ns+0x62/0x70 [<c0127501>] ? free_nsproxy+0x25/0x80 [<c012759a>] ? switch_task_namespaces+0x3e/0x43 [<c01275a9>] ? exit_task_namespaces+0xa/0xc [<c0117fed>] ? do_exit+0x4fd/0x666 [<c01181b3>] ? do_group_exit+0x5d/0x83 [<c011fa8c>] ? get_signal_to_deliver+0x2c8/0x2e0 [<c0102630>] ? do_notify_resume+0x69/0x700 [<c011d85a>] ? do_sigaction+0x134/0x145 [<c0127205>] ? hrtimer_nanosleep+0x8f/0xce [<c0126d1a>] ? hrtimer_wakeup+0x0/0x1c [<c0103488>] ? work_notifysig+0x13/0x1b ======================= Code: 70 20 68 cb c1 2c c0 e8 75 4e 01 00 8b 83 ac 00 00 00 59 3d 00 f0 ff ff 5f 77 63 eb 57 a1 00 80 2d c0 8b 80 a8 02 00 00 8d 73 68 <8b> 40 04 83 c0 45 e8 41 46 f7 ff ba 20 00 00 00 83 f8 21 0f 4c EIP: [<c024c9ab>] rpc_create+0x332/0x42f SS:ESP 0068:dda0fc2c Signed-off-by: Cedric Le Goater <clg@fr.ibm.com> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: "Serge E. Hallyn" <serue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-07 18:19:10 -04:00
Trond Myklebust	96165e2b7c	SUNRPC: Fix a memory leak in rpcb_getport_async Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-07 18:18:57 -04:00
Trond Myklebust	9a4bd29fe8	SUNRPC: Fix autobind on cloned rpc clients Despite the fact that cloned rpc clients won't have the cl_autobind flag set, they may still find themselves calling rpcb_getport_async(). For this to happen, it suffices for a _parent_ rpc_clnt to use autobinding, in which case any clone may find itself triggering the !xprt_bound() case in call_bind(). The correct fix for this is to walk back up the tree of cloned rpc clients, in order to find the parent that 'owns' the transport, either because it has clnt->cl_autobind set, or because it originally created the transport... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-07 18:18:53 -04:00
Denis V. Lunev	c9f6cde6e2	sunrpc: do not pin sunrpc module in the memory Basically, try_module_get here are pretty useless. Any other module using this API will pin sunrpc in memory due using exported symbols. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-07 18:14:54 -04:00
Tom Tucker	67080c8236	svcrdma: Fix IRD/ORD polarity The inititator/responder resources in the event have been swapped. They no represent what the local peer would set their values to in order to match the peer. Note that iWARP does not exchange these on the wire and the provider is simply putting in the local device max. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-10-06 14:46:13 -05:00
Tom Tucker	04911b539c	svcrdma: Update svc_rdma_send_error to use DMA LKEY Update the svc_rdma_send_error code to use the DMA LKEY which is valid regardless of the memory registration strategy in use. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-10-06 14:46:08 -05:00
Tom Tucker	afd566ea08	svcrdma: Modify the RPC reply path to use FRMR when available Use FRMR to map local RPC reply data. This allows RDMA_WRITE to send reply data using a single WR. The FRMR is invalidated by linking the LOCAL_INV WR to the RDMA_SEND message used to complete the reply. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-10-06 14:46:05 -05:00
Tom Tucker	146b6df6a5	svcrdma: Modify the RPC recv path to use FRMR when available RPCRDMA requests that specify a read-list are fetched with RDMA_READ. Using an FRMR to map the data sink improves NFSRDMA security on transports that place the RDMA_READ data sink LKEY on the wire because the valid lifetime of the MR is only the duration of the RDMA_READ. The LKEY is invalidated when the last RDMA_READ WR completes. Mapping the data sink also allows for very large amounts to data to be fetched with a single WR, so if the client is also using FRMR, the entire RPC read-list can be fetched with a single WR. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-10-06 14:46:01 -05:00
Tom Tucker	5b180a9a64	svcrdma: Add support to svc_rdma_send to handle chained WR WR can be submitted as linked lists of WR. Update the svc_rdma_send routine to handle WR chains. This will be used to submit a WR that uses an FRMR with another WR that invalidates the FRMR. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-10-06 14:45:56 -05:00
Tom Tucker	a5abf4e815	svcrdma: Modify post recv path to use local dma key Update the svc_rdma_post_recv routine to use the adapter's global LKEY instead of sc_phys_mr which is only valid when using a DMA MR. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-10-06 14:45:52 -05:00
Tom Tucker	e118321062	svcrdma: Add a service to register a Fast Reg MR with the device Fast Reg MR introduces a new WR type. Add a service to register the region with the adapter and update the completion handling to support completions with a NULL WR context. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-10-06 14:45:49 -05:00
Tom Tucker	3a5c63803d	svcrdma: Query device for Fast Reg support during connection setup Query the device capabilities in the svc_rdma_accept function to determine what advanced memory management capabilities are supported by the device. Based on the query, select the most secure model available given the requirements of the transport and capabilities of the adapter. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-10-06 14:45:45 -05:00
Tom Tucker	64be8608c1	svcrdma: Add FRMR get/put services Add services for the allocating, freeing, and unmapping Fast Reg MR. These services will be used by the transport connection setup, send and receive routines. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-10-06 14:45:18 -05:00
Chuck Lever	2937391385	NLM: Remove unused argument from svc_addsock() function Clean up: The svc_addsock() function no longer uses its "proto" argument, so remove it. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: Neil Brown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-10-04 17:12:27 -04:00
Benny Halevy	d5b337b487	nfsd: use nfs client rpc callback program since commit `ff7d9756b5` "nfsd: use static memory for callback program and stats" do_probe_callback uses a static callback program (NFS4_CALLBACK) rather than the one set in clp->cl_callback.cb_prog as passed in by the client in setclientid (4.0) or create_session (4.1). This patches introduces rpc_create_args.prognumber that allows overriding program->number when creating rpc_clnt. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-09-29 18:13:40 -04:00
Chuck Lever	db820d6376	SUNRPC: Clean up debug messages in rpcb_clnt.c The RPCB XDR functions are used for multiple procedures. For instance, rpcb_encode_getaddr() is used for RPCB_GETADDR, RPCB_SET, and RPCB_UNSET. Make the XDR debug messages more generic so they are less confusing. And, unlike in other RPC consumers in the kernel, a single debug flag enables all levels of debug messages in the RPC bind client, including XDR debug messages. Since the XDR decoders already report success or failure in this case, remove redundant debug messages in the mid-level rpcb_register_call() function. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-09-29 18:13:40 -04:00
Chuck Lever	f6fb3f6f59	SUNRPC: Fix up svc_unregister() With the new rpcbind code, a PMAP_UNSET will not have any effect on services registered via rpcbind v3 or v4. Implement a version of svc_unregister() that uses an RPCB_UNSET with an empty netid string to make sure we have cleared all entries for a kernel RPC service when shutting down, or before starting a fresh instance of the service. Use the new version only when CONFIG_SUNRPC_REGISTER_V4 is enabled; otherwise, the legacy PMAP version is used to ensure complete backwards-compatibility with the Linux portmapper daemon. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-09-29 18:13:40 -04:00
Chuck Lever	9d548b9c95	SUNRPC: Use short-hand IPv6 ANYADDR for RPCB_SET Clean up: When doing an RPCB_SET, make the kernel's rpcb client use the shorthand "::" for the universal form of the IPv6 ANY address. Without this patch, rpcbind will advertise: 0000:0000:0000:0000:0000:0000:0000:0000.x.y This is cosmetic only. It cleans up the display of information from /sbin/rpcinfo. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-09-29 18:13:40 -04:00
Chuck Lever	2c7eb0b206	SUNRPC: Register both netids for AF_INET6 servers TI-RPC is a user-space library of RPC functions that replaces ONC RPC and allows RPC to operate in the new world of IPv6. TI-RPC combines the concept of a transport protocol (UDP and TCP) and a protocol family (PF_INET and PF_INET6) into a single identifier called a "netid." For example, "udp" means UDP over IPv4, and "udp6" means UDP over IPv6. For rpcbind, then, the RPC service tuple that is registered and advertised is: [RPC program, RPC version, service address and port, netid] instead of [RPC program, RPC version, port, protocol] Service address is typically ANYADDR, but can be a specific address of one of the interfaces on a multi-homed host. The third item in the new tuple is expressed as a universal address. The current Linux rpcbind implementation registers a netid for both protocol families when RPCB_SET is done for just the PF_INET6 version of the netid (ie udp6 or tcp6). So registering "udp6" causes a registration for "udp" to appear automatically as well. We've recently determined that this is incorrect behavior. In the TI-RPC world, "udp6" is not meant to imply that the registered RPC service handles requests from AF_INET as well, even if the listener socket does address mapping. "udp" and "udp6" are entirely separate capabilities, and must be registered separately. The Linux kernel, unlike TI-RPC, leverages address mapping to allow a single listener socket to handle requests for both AF_INET and AF_INET6. This is still OK, but the kernel currently assumes registering "udp6" will cover "udp" as well. It registers only "udp6" for it's AF_INET6 services, even though they handle both AF_INET and AF_INET6 on the same port. So svc_register() actually needs to register both "udp" and "udp6" explicitly (and likewise for TCP). Until rpcbind is fixed, the kernel can ignore the return code for the second RPCB_SET call. Please merge this with commit 15231312: SUNRPC: Support IPv6 when registering kernel RPC services Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: Olaf Kirch <okir@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-09-29 18:13:39 -04:00
Chuck Lever	a26cfad6e0	SUNRPC: Support IPv6 when registering kernel RPC services In order to advertise NFS-related services on IPv6 interfaces via rpcbind, the kernel RPC server implementation must use rpcb_v4_register() instead of rpcb_register(). A new kernel build option allows distributions to use the legacy v2 call until they integrate an appropriate user-space rpcbind daemon that can support IPv6 RPC services. I tried adding some automatic logic to fall back if registering with a v4 protocol request failed, but there are too many corner cases. So I just made it a compile-time switch that distributions can throw when they've replaced portmapper with rpcbind. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-09-29 18:13:38 -04:00
Chuck Lever	7252d575ab	SUNRPC: Split portmap unregister API into separate function Create a separate server-level interface for unregistering RPC services. The mechanics of, and the API for, registering and unregistering RPC services will diverge further as support for IPv6 is added. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-09-29 18:13:38 -04:00
Chuck Lever	14aeb2118d	SUNRPC: Simplify rpcb_register() API Bruce suggested there's no need to expose the difference between an error sending the PMAP_SET request and an error reply from the portmapper to rpcb_register's callers. The user space equivalent of rpcb_register() is pmap_set(3), which returns a bool_t : either the PMAP set worked, or it didn't. Simple. So let's remove the "*okay" argument from rpcb_register() and rpcb_v4_register(), and simply return an error if any part of the call didn't work. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-09-29 18:13:37 -04:00
Chuck Lever	b6632339e3	SUNRPC: Set V6ONLY socket option for RPC listener sockets My plan is to use an AF_INET listener on systems that support only IPv4, and an AF_INET6 listener on systems that can support IPv6. Incoming IPv4 packets will be posted to an AF_INET6 listener with a mapped IPv4 address. Max Matveev <makc@sgi.com> says: Creating a single listener can be dangerous - if net.ipv6.bindv6only is enabled then it's possible to create another listener in v4 namespace on the same port and steal the traffic from the "unifed" listener. You need to disable V6ONLY explicitly via a sockopt to stop that. Set appropriate socket option on RPC server listener sockets to prevent this. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-09-29 18:13:37 -04:00
Chuck Lever	5dd248f6f1	SUNRPC: Use proper INADDR_ANY when setting up RPC services on IPv6 Teach svc_create_xprt() to use the correct ANY address for AF_INET6 based RPC services. No caller uses AF_INET6 yet. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-09-29 17:56:56 -04:00
Chuck Lever	e851db5b05	SUNRPC: Add address family field to svc_serv data structure Introduce and initialize an address family field in the svc_serv structure. This field will determine what family to use for the service's listener sockets and what families are advertised via the local rpcbind daemon. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-09-29 17:56:56 -04:00
Arnaldo Carvalho de Melo	6067804047	net: Use hton[sl]() instead of __constant_hton[sl]() where applicable Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-09-20 22:20:49 -07:00
Cyrill Gorcunov	27df6f25ff	sunrpc: fix possible overrun on read of /proc/sys/sunrpc/transports Vegard Nossum reported ---------------------- > I noticed that something weird is going on with /proc/sys/sunrpc/transports. > This file is generated in net/sunrpc/sysctl.c, function proc_do_xprt(). When > I "cat" this file, I get the expected output: > $ cat /proc/sys/sunrpc/transports > tcp 1048576 > udp 32768 > But I think that it does not check the length of the buffer supplied by > userspace to read(). With my original program, I found that the stack was > being overwritten by the characters above, even when the length given to > read() was just 1. David Wagner added (among other things) that copy_to_user could be probably used here. Ingo Oeser suggested to use simple_read_from_buffer() here. The conclusion is that proc_do_xprt doesn't check for userside buffer size indeed so fix this by using Ingo's suggestion. Reported-by: Vegard Nossum <vegard.nossum@gmail.com> Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> CC: Ingo Oeser <ioe-lkml@rameria.de> Cc: Neil Brown <neilb@suse.de> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: Greg Banks <gnb@sgi.com> Cc: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-09-01 14:24:24 -04:00
Tom Tucker	24b8b44780	svcrdma: Fix race between svc_rdma_recvfrom thread and the dto_tasklet RDMA_READ completions are kept on a separate queue from the general I/O request queue. Since a separate lock is used to protect the RDMA_READ completion queue, a race exists between the dto_tasklet and the svc_rdma_recvfrom thread where the dto_tasklet sets the XPT_DATA bit and adds I/O to the read-completion queue. Concurrently, the recvfrom thread checks the generic queue, finds it empty and resets the XPT_DATA bit. A subsequent svc_xprt_enqueue will fail to enqueue the transport for I/O and cause the transport to "stall". The fix is to protect both lists with the same lock and set the XPT_DATA bit with this lock held. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-08-13 16:57:31 -04:00

1 2 3 4 5 ...

1002 Commits