Commit Graph

151 Commits

Author SHA1 Message Date
Linus Torvalds 7c049d0869 Main batch of InfiniBand/RDMA changes for 3.12 merge window:
  - Large ocrdma HW driver update: add "fast register" work requests,
    fixes, cleanups
  - Add receive flow steering support for raw QPs
  - Fix IPoIB neighbour race that leads to crash
  - iSER updates including support for using "fast register" memory
    registration
  - IPv6 support for iWARP
  - XRC transport fixes

Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

Pull main batch of InfiniBand/RDMA changes from Roland Dreier:
 - Large ocrdma HW driver update: add "fast register" work requests,
   fixes, cleanups
 - Add receive flow steering support for raw QPs
 - Fix IPoIB neighbour race that leads to crash
 - iSER updates including support for using "fast register" memory
   registration
 - IPv6 support for iWARP
 - XRC transport fixes

* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (54 commits)
  RDMA/ocrdma: Fix compiler warning about int/pointer size mismatch
  IB/iser: Fix redundant pointer check in dealloc flow
  IB/iser: Fix possible memory leak in iser_create_frwr_pool()
  IB/qib: Move COUNTER_MASK definition within qib_mad.h header guards
  RDMA/ocrdma: Fix passing wrong opcode to modify_srq
  RDMA/ocrdma: Fill PVID in UMC case
  RDMA/ocrdma: Add ABI versioning support
  RDMA/ocrdma: Consider multiple SGES in case of DPP
  RDMA/ocrdma: Fix for displaying proper link speed
  RDMA/ocrdma: Increase STAG array size
  RDMA/ocrdma: Dont use PD 0 for userpace CQ DB
  RDMA/ocrdma: FRMA code cleanup
  RDMA/ocrdma: For ERX2 irrespective of Qid, num_posted offset is 24
  RDMA/ocrdma: Fix to work with even a single MSI-X vector
  RDMA/ocrdma: Remove the MTU check based on Ethernet MTU
  RDMA/ocrdma: Add support for fast register work requests (FRWR)
  RDMA/ocrdma: Create IRD queue fix
  IB/core: Better checking of userspace values for receive flow steering
  IB/mlx4: Add receive flow steering support
  IB/core: Export ib_create/destroy_flow through uverbs
  ...
2013-09-05 09:39:27 -07:00
Linus Torvalds 27703bb4a6 PTR_RET() is a weird name, and led to some confusing usage. We ended
up with PTR_ERR_OR_ZERO(), and replacing or fixing all the usages.
 
 This has been sitting in linux-next for a whole cycle.
 
 Thanks,
 Rusty.

Merge tag 'PTR_RET-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux

Pull PTR_RET() removal patches from Rusty Russell:
 "PTR_RET() is a weird name, and led to some confusing usage.  We ended
  up with PTR_ERR_OR_ZERO(), and replacing or fixing all the usages.

  This has been sitting in linux-next for a whole cycle"

[ There are still some PTR_RET users scattered about, with some of them
  possibly being new, but most of them existing in Rusty's tree too.  We
  have that

      #define PTR_RET(p) PTR_ERR_OR_ZERO(p)

  thing in <linux/err.h>, so they continue to work for now  - Linus ]
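
As a rough illustration of what that compatibility define does, the
following userspace sketch imitates the error-pointer helpers.  The
sketch_ names are stand-ins invented for this example, and the 4095
limit is assumed to mirror the kernel's MAX_ERRNO; the real definitions
live in <linux/err.h> and may differ in detail.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed to mirror MAX_ERRNO: the top 4095 addresses encode errnos. */
#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno in a pointer value (imitates ERR_PTR()). */
static inline void *sketch_err_ptr(long error)
{
	assert(error < 0 && error >= -SKETCH_MAX_ERRNO);
	return (void *)(intptr_t)error;
}

/* True if the pointer lies in the reserved error range (imitates IS_ERR()). */
static inline int sketch_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Recover the errno from an error pointer (imitates PTR_ERR()). */
static inline long sketch_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Negative errno for an error pointer, 0 for anything else. */
static inline int sketch_ptr_err_or_zero(const void *ptr)
{
	return sketch_is_err(ptr) ? (int)sketch_ptr_err(ptr) : 0;
}

/* The old name stays usable as a thin alias, as in the note above. */
#define SKETCH_PTR_RET(p) sketch_ptr_err_or_zero(p)
```

The new name says what the helper returns: an errno or zero, never a
pointer.  The alias gives remaining callers time to convert.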

* tag 'PTR_RET-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  GFS2: Replace PTR_RET with PTR_ERR_OR_ZERO
  Btrfs: volume: Replace PTR_RET with PTR_ERR_OR_ZERO
  drm/cma: Replace PTR_RET with PTR_ERR_OR_ZERO
  sh_veu: Replace PTR_RET with PTR_ERR_OR_ZERO
  dma-buf: Replace PTR_RET with PTR_ERR_OR_ZERO
  drivers/rtc: Replace PTR_RET with PTR_ERR_OR_ZERO
  mm/oom_kill: remove weird use of ERR_PTR()/PTR_ERR().
  staging/zcache: don't use PTR_RET().
  remoteproc: don't use PTR_RET().
  pinctrl: don't use PTR_RET().
  acpi: Replace weird use of PTR_RET.
  s390: Replace weird use of PTR_RET.
  PTR_RET is now PTR_ERR_OR_ZERO(): Replace most.
  PTR_RET is now PTR_ERR_OR_ZERO
2013-09-04 17:31:11 -07:00
Steve Wise 24d44a391f RDMA/cma: Add IPv6 support for iWARP
Modify the type of local_addr and remote_addr fields in struct
iw_cm_id from struct sockaddr_in to struct sockaddr_storage to hold
IPv6 and IPv4 addresses uniformly.
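
A minimal userspace sketch of why sockaddr_storage suits this purpose
follows.  The struct sketch_cm_id type and the helper names are
hypothetical and only stand in for the address fields of struct
iw_cm_id; they are not the kernel's definitions.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical stand-in for the two address fields of struct iw_cm_id. */
struct sketch_cm_id {
	struct sockaddr_storage local_addr;
	struct sockaddr_storage remote_addr;
};

/* Fill one slot with an IPv4 or IPv6 address; 0 on success, -1 on failure. */
static int sketch_set_addr(struct sockaddr_storage *ss, int family,
			   const char *ip, unsigned short port)
{
	assert(ss && ip);
	memset(ss, 0, sizeof(*ss));
	ss->ss_family = (sa_family_t)family;

	if (family == AF_INET) {
		struct sockaddr_in *sin = (struct sockaddr_in *)ss;

		sin->sin_port = htons(port);
		return inet_pton(AF_INET, ip, &sin->sin_addr) == 1 ? 0 : -1;
	}
	if (family == AF_INET6) {
		struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)ss;

		sin6->sin6_port = htons(port);
		return inet_pton(AF_INET6, ip, &sin6->sin6_addr) == 1 ? 0 : -1;
	}
	return -1;
}

/* Read the port back by dispatching on the stored family. */
static int sketch_get_port(const struct sockaddr_storage *ss)
{
	switch (ss->ss_family) {
	case AF_INET:
		return ntohs(((const struct sockaddr_in *)ss)->sin_port);
	case AF_INET6:
		return ntohs(((const struct sockaddr_in6 *)ss)->sin6_port);
	default:
		return -1;
	}
}

/* Store an address in local_addr and return the port read back, or -1. */
static int sketch_roundtrip(int family, const char *ip, unsigned short port)
{
	struct sketch_cm_id id;

	if (sketch_set_addr(&id.local_addr, family, ip, port))
		return -1;
	return sketch_get_port(&id.local_addr);
}
```

Code that previously assumed struct sockaddr_in has to check the family
before casting, which is the kind of change the drivers need.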

Change the references of local_addr and remote_addr in cxgb4, cxgb3,
nes and amso drivers to match this.  However, to be able to actually run
traffic over IPv6, low-level drivers have to add code to support this.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>

[ Fix unused variable warnings when INFINIBAND_NES_DEBUG not set.
  - Roland ]

Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-12 12:32:31 -07:00
Sean Hefty 5eb695c177 RDMA/cma: Only call cma_save_ib_info() for CM REQs
Calling cma_save_ib_info() for CM SIDR REQs results in a crash
accessing an invalid path record pointer.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-07-31 00:50:44 -07:00
Sean Hefty e511d1ae16 RDMA/cma: Fix accessing invalid private data for UD
If an application is using AF_IB with a UD QP, but does not provide any
private data, we will end up accessing invalid memory.  Check for this
case and handle it appropriately.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-07-31 00:50:40 -07:00
Paul Bolle 8fb488d740 RDMA/cma: Fix gcc warning
Building cma.o triggers this gcc warning:

    drivers/infiniband/core/cma.c: In function ‘rdma_resolve_addr’:
    drivers/infiniband/core/cma.c:465:23: warning: ‘port’ may be used uninitialized in this function [-Wmaybe-uninitialized]
    drivers/infiniband/core/cma.c:426:5: note: ‘port’ was declared here

This is a false positive, as "port" will always be initialized if we're
at "found". But if we assign to "id_priv->id.port_num" directly, we can
drop "port". That will, obviously, silence gcc.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-07-30 16:11:22 -07:00
Rusty Russell 8c6ffba0ed PTR_RET is now PTR_ERR_OR_ZERO(): Replace most.
Sweep of the simple cases.

Cc: netdev@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-15 11:25:01 +09:30
Sean Hefty ce117ffac2 RDMA/cma: Export AF_IB statistics
Report AF_IB source and destination addresses through netlink
interface.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-20 23:35:45 -07:00
Sean Hefty 5bc2b7b397 RDMA/ucma: Allow user space to specify AF_IB when joining multicast
Allow user space applications to join multicast groups using MGIDs
directly.  MGIDs may be passed using AF_IB addresses.  Since the
current multicast join command only supports addresses as large as
sockaddr_in6, define a new structure for joining addresses specified
using sockaddr_ib.

Since AF_IB allows the user to specify the qkey when resolving a
remote UD QP address, when joining the multicast group use the qkey
value, if one has been assigned.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-20 23:35:45 -07:00
Sean Hefty cf53936f22 RDMA/cma: Export cma_get_service_id()
Allow the rdma_ucm to query the IB service ID formed or allocated by
the rdma_cm by exporting the cma_get_service_id() functionality.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-20 23:35:41 -07:00
Sean Hefty 94d0c93941 RDMA/cma: Only listen on IB devices when using AF_IB
If an rdma_cm_id is bound to AF_IB, with a wild card address, only
listen on IB devices.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-20 23:35:38 -07:00
Sean Hefty 5c438135ad RDMA/cma: Set qkey for AF_IB
Allow the user to specify the qkey when using AF_IB.  The qkey is
added to struct rdma_ucm_conn_param in place of a reserved field, but
for backwards compatibility, is only accessed if the associated
rdma_cm_id is using AF_IB.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-20 23:35:37 -07:00
Sean Hefty e8160e1593 RDMA/cma: Expose private data when using AF_IB
If the source or destination address is AF_IB, then do not reserve a
portion of the private data in the IB CM REQ or SIDR REQ messages for
the cma header.  Instead, all private data should be exported to the
user.  When AF_IB is used, the rdma cm does not have sufficient
information to fill in the cma header.  Additionally, this will be
necessary to support any IB connection through the rdma cm interface.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-20 23:35:36 -07:00
Sean Hefty fbaa1a6d85 RDMA/cma: Merge cma_get/save_net_info
With the removal of SDP related code, we can merge cma_get_net_info()
with cma_save_net_info(), since we're only ever dealing with a single
header format.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-20 23:35:29 -07:00
Sean Hefty 01602f113f RDMA/cma: Remove unused SDP related code
The SDP protocol was never merged upstream.  Remove unused SDP related
code from the RDMA CM.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-20 13:08:05 -07:00
Sean Hefty 496ce3ce17 RDMA/cma: Add support for AF_IB to cma_get_service_id()
cma_get_service_id() forms the service ID based on the port space and
port number of the rdma_cm_id.  Extend the call to support AF_IB,
which contains the service ID directly.  This will be needed to
support any arbitrary SID.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-20 13:08:05 -07:00
Sean Hefty f68194ca88 RDMA/cma: Add support for AF_IB to rdma_resolve_route()
Allow rdma_resolve_route() to handle the case where the user specified
the source and destination addresses using AF_IB.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-20 13:08:04 -07:00
Sean Hefty f17df3b0de RDMA/cma: Add support for AF_IB to rdma_resolve_addr()
Allow the user to specify the remote address using AF_IB format.  When
AF_IB is used, the remote address simply needs to be recorded, and no
resolution using ARP is done.  The local address may still need to be
matched with a local IB device.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-20 13:08:04 -07:00
Sean Hefty 4ae7152e0b RDMA/cma: Verify that source and dest sa_family are the same
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-20 13:08:04 -07:00
Sean Hefty b0569e4075 RDMA/cma: Restrict AF_IB loopback to binding to IB devices only
If a user specifies AF_IB as the source address for a loopback
connection, limit the resolution to IB devices only.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-20 13:08:04 -07:00
Sean Hefty f4753834b5 RDMA/cma: Add helper functions to return id address information
Provide inline helpers to extract source and destination address data
from the rdma_cm_id.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-20 13:08:04 -07:00
Sean Hefty 6a3e362d3c RDMA/cma: Do not modify sa_family when setting loopback address
cma_resolve_loopback is called after an rdma_cm_id has been
bound to a specific sa_family and port.  Once the
source sa_family for the id has been set, do not modify it.
Only the actual IP address portion of the source address
needs to be set.

As part of this fix, we can simplify setting the source address
by moving the loopback address assignment from cma_resolve_loopback
to cma_bind_loopback.  cma_bind_loopback is only invoked when
the source address is the loopback address.

Finally, add loopback support for AF_IB as part of the change.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-20 13:08:03 -07:00
Sean Hefty 680f920a2e RDMA/cma: Allow user to specify AF_IB when binding
Modify rdma_bind_addr to allow the user to specify AF_IB when binding
to a device.  AF_IB indicates that the user is not mapping an IP
address to the native IB addressing.  (The mapping may have already
been done, or is not needed)

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-20 13:08:03 -07:00
Sean Hefty 58afdcb738 RDMA/cma: Update port reservation to support AF_IB
The AF_IB uses a 64-bit service id (SID), which the user can control
through the use of a mask.  The rdma_cm will assign values to the
unmasked portions of the SID based on the selected port space and port
number.

Because the IB spec divides the SID range into several regions, a
SID/mask combination may fall into one of the existing port space
ranges as defined by the RDMA CM IP Annex.  Map the AF_IB SID to the
correct RDMA port space.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-20 13:08:03 -07:00
Sean Hefty ef560861c0 IB/addr: Add AF_IB support to ip_addr_size
Add support for AF_IB to ip_addr_size, and rename the function to
account for the change.  Give the compiler more control over whether
the call should be inline or not by moving the definition into the .c
file, removing the static inline, and exporting it.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-20 13:08:02 -07:00
Sean Hefty 2e2d190c5e RDMA/cma: Include AF_IB in loopback and any address checks
Enhance checks for loopback and any address to support AF_IB in
addition to AF_INET and AF_INET6.  This will allow future patches to
use AF_IB when binding and resolving addresses.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-20 13:08:02 -07:00
Sean Hefty c8dea2f9f0 RDMA/cma: Allow enabling reuseaddr in any state
The rdma_cm only allows setting reuseaddr if the corresponding
rdma_cm_id is in the idle state.  Allow setting this value in other
states.  This brings the behavior more in line with sockets.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-20 13:08:01 -07:00
Jiri Pirko 351638e7de net: pass info struct via netdevice notifier
So far, only a net_device * could be passed along with a netdevice
notifier event. This patch makes it possible to pass a custom structure
carrying the information that an event listener needs.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>

v2->v3: fix typo on simeth
	shortened dev_getter
	shortened notifier_info struct name
v1->v2: fix notifier_call parameter in call_netdevice_notifier()
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-28 13:11:01 -07:00
Sasha Levin b67bfe0d42 hlist: drop the node parameter from iterators
I'm not sure why, but the hlist for-each-entry iterators were conceived
differently from the list ones. The list iterators look like this:

        list_for_each_entry(pos, head, member)

The hlist ones were greedy and wanted an extra parameter:

        hlist_for_each_entry(tpos, pos, head, member)

Why did they need an extra pos parameter? I'm not quite sure. Not only
do they not really need it, it also prevents the iterator from looking
exactly like the list iterator, which is unfortunate.
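
A simplified userspace sketch of an hlist iterator without the node
parameter is shown here.  The sketch_ types and macro are invented for
this example; the kernel's hlist_node also carries a pprev field and the
real macro in linux/list.h is written differently.  __typeof__ is a
GCC/Clang extension.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins; the kernel's hlist_node also has a pprev field. */
struct sketch_hlist_node { struct sketch_hlist_node *next; };
struct sketch_hlist_head { struct sketch_hlist_node *first; };

/* Map a node back to its containing object, or NULL at the end of the list. */
static inline void *sketch_entry_or_null(struct sketch_hlist_node *node,
					 size_t offset)
{
	return node ? (void *)((char *)node - offset) : NULL;
}

/* Takes only the entry cursor, the head and the member name. */
#define sketch_hlist_for_each_entry(pos, head, member)                   \
	for (pos = sketch_entry_or_null((head)->first,                    \
			offsetof(__typeof__(*(pos)), member));            \
	     pos;                                                         \
	     pos = sketch_entry_or_null((pos)->member.next,               \
			offsetof(__typeof__(*(pos)), member)))

struct sketch_item {
	int value;
	struct sketch_hlist_node link;
};

static void sketch_add_head(struct sketch_hlist_node *node,
			    struct sketch_hlist_head *head)
{
	assert(node && head);
	node->next = head->first;
	head->first = node;
}

static int sketch_sum(struct sketch_hlist_head *head)
{
	struct sketch_item *item;	/* no separate node cursor needed */
	int sum = 0;

	sketch_hlist_for_each_entry(item, head, link)
		sum += item->value;
	return sum;
}

/* Build a three-element list and sum it. */
static int sketch_demo_sum(void)
{
	struct sketch_hlist_head head = { NULL };
	struct sketch_item a = { 1, { NULL } };
	struct sketch_item b = { 2, { NULL } };
	struct sketch_item c = { 3, { NULL } };

	sketch_add_head(&a.link, &head);
	sketch_add_head(&b.link, &head);
	sketch_add_head(&c.link, &head);
	return sketch_sum(&head);
}
```

The caller declares one cursor of the entry type, exactly as with
list_for_each_entry().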

Besides the semantic patch, there was some manual work required:

 - Fix up the actual hlist iterators in linux/list.h
 - Fix up the declaration of other iterators based on the hlist ones.
 - A very small number of places were using the 'node' parameter; these
   were modified to use 'obj->member' instead.
 - Coccinelle didn't handle the hlist_for_each_entry_safe iterator
   properly, so those had to be fixed up manually.

The semantic patch, which is mostly the work of Peter Senna Tschudin, is here:

@@
iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;

type T;
expression a,c,d,e;
identifier b;
statement S;
@@

-T b;
    <+... when != b
(
hlist_for_each_entry(a,
- b,
c, d) S
|
hlist_for_each_entry_continue(a,
- b,
c) S
|
hlist_for_each_entry_from(a,
- b,
c) S
|
hlist_for_each_entry_rcu(a,
- b,
c, d) S
|
hlist_for_each_entry_rcu_bh(a,
- b,
c, d) S
|
hlist_for_each_entry_continue_rcu_bh(a,
- b,
c) S
|
for_each_busy_worker(a, c,
- b,
d) S
|
ax25_uid_for_each(a,
- b,
c) S
|
ax25_for_each(a,
- b,
c) S
|
inet_bind_bucket_for_each(a,
- b,
c) S
|
sctp_for_each_hentry(a,
- b,
c) S
|
sk_for_each(a,
- b,
c) S
|
sk_for_each_rcu(a,
- b,
c) S
|
sk_for_each_from
-(a, b)
+(a)
S
+ sk_for_each_from(a) S
|
sk_for_each_safe(a,
- b,
c, d) S
|
sk_for_each_bound(a,
- b,
c) S
|
hlist_for_each_entry_safe(a,
- b,
c, d, e) S
|
hlist_for_each_entry_continue_rcu(a,
- b,
c) S
|
nr_neigh_for_each(a,
- b,
c) S
|
nr_neigh_for_each_safe(a,
- b,
c, d) S
|
nr_node_for_each(a,
- b,
c) S
|
nr_node_for_each_safe(a,
- b,
c, d) S
|
- for_each_gfn_sp(a, c, d, b) S
+ for_each_gfn_sp(a, c, d) S
|
- for_each_gfn_indirect_valid_sp(a, c, d, b) S
+ for_each_gfn_indirect_valid_sp(a, c, d) S
|
for_each_host(a,
- b,
c) S
|
for_each_host_safe(a,
- b,
c, d) S
|
for_each_mesh_entry(a,
- b,
c, d) S
)
    ...+>

[akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
[akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
[akpm@linux-foundation.org: checkpatch fixes]
[akpm@linux-foundation.org: fix warnings]
[akpm@linux-foundation.org: redo intrusive kvm changes]
Tested-by: Peter Senna Tschudin <peter.senna@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:24 -08:00
Tejun Heo 3b069c5d85 IB/core: convert to idr_alloc()
Convert to the much saner new idr interface.

v2: Mike triggered WARN_ON() in idr_preload() because send_mad(),
    which may be used from non-process context, was calling
    idr_preload() unconditionally.  Preload iff @gfp_mask has
    __GFP_WAIT.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Reported-by: "Marciniszyn, Mike" <mike.marciniszyn@intel.com>
Cc: Roland Dreier <roland@kernel.org>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:16 -08:00
shefty 63f05be2c0 RDMA/cm: Change return value from find_gid_port()
Problem reported by Dan Carpenter <dan.carpenter@oracle.com>:

The patch 3c86aa70bf67: "RDMA/cm: Add RDMA CM support for IBoE
devices" from Oct 13, 2010, leads to the following warning:
net/sunrpc/xprtrdma/svc_rdma_transport.c:722 svc_rdma_create()
	 error: passing non neg 1 to ERR_PTR

This bug would result in a NULL dereference.  svc_rdma_create() is
supposed to return ERR_PTRs or valid pointers, but instead it returns
ERR_PTRs, valid pointers and 1.

The call tree is:

svc_rdma_create()
   => rdma_bind_addr()
      => cma_acquire_dev()
         => find_gid_port()

rdma_bind_addr() should return a valid errno.  Fix this by having
find_gid_port() also return a valid errno.  If we can't find the
specified GID on a given port, return -EADDRNOTAVAIL, rather than
-EAGAIN, to better indicate the error.  We also drop using the
special return value of '1' and instead pass through the error
returned by the underlying verbs call.  On such errors, rather
than aborting the search,  we simply continue to check the next
device/port.
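
The error handling described above can be sketched in userspace as
follows.  Everything here is hypothetical: struct sketch_port, the
64-bit GID stand-in, the initial -ENODEV and the function bodies are
assumptions made for illustration, not the code in cma.c.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical port: a simulated query result plus a simplified GID. */
struct sketch_port {
	int query_err;	/* nonzero: pretend the verbs query failed */
	uint64_t gid;	/* 64-bit stand-in for a 128-bit GID */
};

/* 0 if the port holds gid, otherwise a negative errno (never 1). */
static int sketch_find_gid_port(const struct sketch_port *port, uint64_t gid)
{
	if (port->query_err)
		return port->query_err;	/* pass the query error through */
	return port->gid == gid ? 0 : -EADDRNOTAVAIL;
}

/* Search all ports; on any per-port error, move on to the next one. */
static int sketch_acquire_dev(const struct sketch_port *ports, size_t nports,
			      uint64_t gid, size_t *found)
{
	int ret = -ENODEV;	/* assumed result when there are no ports */
	size_t i;

	assert(found && (ports || nports == 0));
	for (i = 0; i < nports; i++) {
		ret = sketch_find_gid_port(&ports[i], gid);
		if (!ret) {
			*found = i;
			return 0;
		}
	}
	return ret;	/* always 0 or a negative errno */
}

/* Index of the matching port, or the negative errno from the search. */
static int sketch_demo(uint64_t gid)
{
	const struct sketch_port ports[] = {
		{ -EIO, 0 },	/* query fails: skipped, search continues */
		{ 0, 0x10 },
		{ 0, 0x20 },
	};
	size_t found = 0;
	int ret = sketch_acquire_dev(ports, 3, gid, &found);

	return ret ? ret : (int)found;
}
```

Because every failure path yields a negative errno, a caller that wraps
the result in ERR_PTR() can no longer end up with a bogus pointer.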

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-11-29 12:16:29 -08:00
David S. Miller 8dd9117cc7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux
Pulled mainline in order to get the UAPI infrastructure already
merged before I pull in David Howells's UAPI trees for networking.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-10-09 13:14:32 -04:00
Gao feng 809d5fc9bf infiniband: pass rdma_cm module to netlink_dump_start
Set netlink_dump_control.module to avoid panic.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Cc: Roland Dreier <roland@kernel.org>
Cc: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-10-07 00:30:56 -04:00
Sean Hefty 4ede178a5e RDMA/cma: Check that retry count values are in range
The retry_count and rnr_retry_count connection parameters are both
3-bit values.  Check that the values are in range and reduce if
they're not.
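
A small sketch of the range check, assuming plain C in place of the
kernel's min_t(); the helper name and constant are invented for this
example.

```c
#include <assert.h>
#include <stdint.h>

/* Largest value that fits in a 3-bit field. */
#define SKETCH_RETRY_MAX 7u

/* Reduce an out-of-range retry count to the maximum instead of masking it. */
static uint8_t sketch_clamp_retry(uint8_t requested)
{
	uint8_t clamped = requested < SKETCH_RETRY_MAX ? requested
						       : SKETCH_RETRY_MAX;

	assert(clamped <= SKETCH_RETRY_MAX);
	return clamped;
}
```

Clamping keeps the caller's intent: a request for 8 retries becomes 7,
whereas masking with 0x7 would silently turn it into 0.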

This fixes a problem reported by Doug Ledford <dledford@redhat.com>
that resulted in the userspace rping test (part of the librdmacm
samples) failing to run over Intel IB HCAs.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>

[ Use min_t() to avoid warnings about type mismatch.  - Roland ]

Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-10-04 19:11:54 -07:00
Dotan Barak 2a22fb8c69 RDMA/cma: Use consistent component mask for IPoIB port space multicast joins
CMA multicast joins for the IPoIB port space need to use the same
component mask used by the ipoib driver.  Otherwise, it's possible for
the CMA to create a group to which a join made by ipoib will fail, or
vice versa.  Some of the component mask fields set by ipoib weren't
set by the CMA; fix that.

Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il>
Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Acked-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-09-30 20:31:47 -07:00
Fengguang Wu 4e28904528 RDMA/cma: Use PTR_RET rather than if (IS_ERR(...)) + PTR_ERR
Suggested by scripts/coccinelle/api/ptr_ret.cocci.

Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-07-27 13:05:18 -07:00
Roland Dreier 089117e1ad Merge branches 'cma', 'cxgb4', 'misc', 'mlx4-sriov', 'mlx-cleanups', 'ocrdma' and 'qib' into for-linus
2012-07-22 23:26:17 -07:00
Roland Dreier d90f9b3591 IB: Use IS_ENABLED(CONFIG_IPV6)
Instead of testing defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
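
For readers unfamiliar with the macro, here is a self-contained sketch
of how a single test can cover both the built-in and the modular case.
It is modeled on the placeholder trick used in <linux/kconfig.h>, but
the SKETCH_ names and the CONFIG_SKETCH_* symbols are invented for this
example and the kernel's actual definitions may differ.

```c
#include <assert.h>

/* Expands to "0," only when the pasted suffix is the token 1. */
#define SKETCH_PLACEHOLDER_1 0,
#define SKETCH_TAKE_SECOND(ignored, val, ...) val

/* 1 if x is defined to 1, 0 if it is not defined at all. */
#define SKETCH_IS_DEFINED(x) SKETCH_IS_DEFINED_2(x)
#define SKETCH_IS_DEFINED_2(val) SKETCH_IS_DEFINED_3(SKETCH_PLACEHOLDER_##val)
#define SKETCH_IS_DEFINED_3(arg_or_junk) \
	SKETCH_TAKE_SECOND(arg_or_junk 1, 0, 0)

/* True for a built-in option (FOO) or a modular one (FOO_MODULE). */
#define SKETCH_IS_ENABLED(option) \
	(SKETCH_IS_DEFINED(option) || SKETCH_IS_DEFINED(option##_MODULE))

/* Hypothetical configuration: one built-in, one modular, one absent. */
#define CONFIG_SKETCH_BUILTIN 1
#define CONFIG_SKETCH_MODULAR_MODULE 1
/* CONFIG_SKETCH_ABSENT is deliberately left undefined. */

/* Count how many of the three hypothetical options are enabled. */
static int sketch_enabled_count(void)
{
	int n = SKETCH_IS_ENABLED(CONFIG_SKETCH_BUILTIN) +
		SKETCH_IS_ENABLED(CONFIG_SKETCH_MODULAR) +
		SKETCH_IS_ENABLED(CONFIG_SKETCH_ABSENT);

	assert(n >= 0 && n <= 3);
	return n;
}
```

Unlike the defined() pair, the result is an ordinary integer expression,
so it can be used in C conditions as well as in #if.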

Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-07-08 18:04:32 -07:00
Sean Hefty 68602120e4 RDMA/cma: Allow user to restrict listens to bound address family
Provide an option for the user to specify that listens should only
accept connections where the incoming address family matches that of
the locally bound address.  This is used to support the equivalent of
IPV6_V6ONLY socket option, which allows an app to only accept
connection requests directed to IPv6 addresses.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-07-08 18:02:24 -07:00
Sean Hefty 406b6a25f8 RDMA/cma: Listen on specific address family
The rdma_cm maps IPv4 and IPv6 addresses to the same service ID.  This
prevents apps from listening only for IPv4 or IPv6 addresses.  It also
results in an app binding to an IPv4 address receiving connection
requests for an IPv6 address.

Change this to match socket behavior: restrict listens on IPv4
addresses to only IPv4 addresses, and if a listen is on an IPv6
address, allow it to receive either IPv4 or IPv6 addresses, based on
its address family binding.
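
The matching rule can be sketched as a small predicate.  The function
name and the afonly flag are assumptions made for this illustration
(afonly stands in for "its address family binding"); this is not the
rdma_cm code.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/socket.h>

/*
 * Decide whether a listener bound to listen_family should see a request
 * that arrived for req_family.  afonly restricts an IPv6 listener to
 * IPv6 requests only.
 */
static bool sketch_listener_accepts(int listen_family, bool afonly,
				    int req_family)
{
	assert(listen_family == AF_INET || listen_family == AF_INET6);

	if (listen_family == AF_INET)
		return req_family == AF_INET;	/* IPv4 listens: IPv4 only */

	if (req_family == AF_INET6)
		return true;			/* IPv6 always matches IPv6 */

	return req_family == AF_INET && !afonly;
}
```

This mirrors socket behavior: an IPv4 listener never sees IPv6 requests,
while an IPv6 listener sees IPv4 requests unless it is restricted.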

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-07-08 18:02:24 -07:00
Sean Hefty 5b0ec991c0 RDMA/cma: Bind to a specific address family
The RDMA CM uses a single port space for all associated (tcp, udp,
etc.) port bindings, regardless of the address family that the user
binds to.  The result is that if a user binds to AF_INET, but does not
specify an IP address, the bind will occur for AF_INET6.  This causes
an attempt to bind to the same port using AF_INET6 to fail, and
connection requests to AF_INET6 will match with the AF_INET listener.
Align the behavior with sockets and restrict the bind to AF_INET only.

If a user binds to AF_INET6, we bind the port to AF_INET6 and
AF_INET depending on the value of bindv6only.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-07-08 18:02:23 -07:00
Sean Hefty 4dd81e8956 RDMA/cma: QP type check on received REQs should be AND not OR
Change || check to the intended && when checking the QP type in a
received connection request against the listening endpoint.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-06-19 20:04:04 -07:00
Linus Torvalds c23ddf7857 InfiniBand/RDMA changes for the 3.5 merge window:
  - Add ocrdma hardware driver for Emulex IB-over-Ethernet adapters
  - Add generic and mlx4 support for "raw" QPs: allow suitably privileged
    applications to send and receive arbitrary packets directly to/from
    the hardware
  - Add "doorbell drop" handling to the cxgb4 driver
  - A fairly large batch of qib hardware driver changes
  - A few fixes for lockdep-detected issues
  - A few other miscellaneous fixes and cleanups

Merge tag 'rdma-for-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

Pull InfiniBand/RDMA changes from Roland Dreier:
 - Add ocrdma hardware driver for Emulex IB-over-Ethernet adapters
 - Add generic and mlx4 support for "raw" QPs: allow suitably privileged
   applications to send and receive arbitrary packets directly to/from
   the hardware
 - Add "doorbell drop" handling to the cxgb4 driver
 - A fairly large batch of qib hardware driver changes
 - A few fixes for lockdep-detected issues
 - A few other miscellaneous fixes and cleanups

Fix up trivial conflict in drivers/net/ethernet/emulex/benet/be.h.

* tag 'rdma-for-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (53 commits)
  RDMA/cxgb4: Include vmalloc.h for vmalloc and vfree
  IB/mlx4: Fix mlx4_ib_add() error flow
  IB/core: Fix IB_SA_COMP_MASK macro
  IB/iser: Fix error flow in iser ep connection establishment
  IB/mlx4: Increase the number of vectors (EQs) available for ULPs
  RDMA/cxgb4: Add query_qp support
  RDMA/cxgb4: Remove kfifo usage
  RDMA/cxgb4: Use vmalloc() for debugfs QP dump
  RDMA/cxgb4: DB Drop Recovery for RDMA and LLD queues
  RDMA/cxgb4: Disable interrupts in c4iw_ev_dispatch()
  RDMA/cxgb4: Add DB Overflow Avoidance
  RDMA/cxgb4: Add debugfs RDMA memory stats
  cxgb4: DB Drop Recovery for RDMA and LLD queues
  cxgb4: Common platform specific changes for DB Drop Recovery
  cxgb4: Detect DB FULL events and notify RDMA ULD
  RDMA/cxgb4: Drop peer_abort when no endpoint found
  RDMA/cxgb4: Always wake up waiters in c4iw_peer_abort_intr()
  mlx4_core: Change bitmap allocator to work in round-robin fashion
  RDMA/nes: Don't call event handler if pointer is NULL
  RDMA/nes: Fix for the ORD value of the connecting peer
  ...
2012-05-21 17:54:55 -07:00
Sean Hefty b6cec8aa4a RDMA/cma: Fix lockdep false positive recursive locking
The following lockdep problem was reported by Or Gerlitz <ogerlitz@mellanox.com>:

    [ INFO: possible recursive locking detected ]
    3.3.0-32035-g1b2649e-dirty #4 Not tainted
    ---------------------------------------------
    kworker/5:1/418 is trying to acquire lock:
     (&id_priv->handler_mutex){+.+.+.}, at: [<ffffffffa0138a41>] rdma_destroy_id+0x33/0x1f0 [rdma_cm]

    but task is already holding lock:
     (&id_priv->handler_mutex){+.+.+.}, at: [<ffffffffa0135130>] cma_disable_callback+0x24/0x45 [rdma_cm]

    other info that might help us debug this:
     Possible unsafe locking scenario:

           CPU0
           ----
      lock(&id_priv->handler_mutex);
      lock(&id_priv->handler_mutex);

     *** DEADLOCK ***

     May be due to missing lock nesting notation

    3 locks held by kworker/5:1/418:
     #0:  (ib_cm){.+.+.+}, at: [<ffffffff81042ac1>] process_one_work+0x210/0x4a6
     #1:  ((&(&work->work)->work)){+.+.+.}, at: [<ffffffff81042ac1>] process_one_work+0x210/0x4a6
     #2:  (&id_priv->handler_mutex){+.+.+.}, at: [<ffffffffa0135130>] cma_disable_callback+0x24/0x45 [rdma_cm]

    stack backtrace:
    Pid: 418, comm: kworker/5:1 Not tainted 3.3.0-32035-g1b2649e-dirty #4
    Call Trace:
     [<ffffffff8102b0fb>] ? console_unlock+0x1f4/0x204
     [<ffffffff81068771>] __lock_acquire+0x16b5/0x174e
     [<ffffffff8106461f>] ? save_trace+0x3f/0xb3
     [<ffffffff810688fa>] lock_acquire+0xf0/0x116
     [<ffffffffa0138a41>] ? rdma_destroy_id+0x33/0x1f0 [rdma_cm]
     [<ffffffff81364351>] mutex_lock_nested+0x64/0x2ce
     [<ffffffffa0138a41>] ? rdma_destroy_id+0x33/0x1f0 [rdma_cm]
     [<ffffffff81065a78>] ? trace_hardirqs_on_caller+0x11e/0x155
     [<ffffffff81065abc>] ? trace_hardirqs_on+0xd/0xf
     [<ffffffffa0138a41>] rdma_destroy_id+0x33/0x1f0 [rdma_cm]
     [<ffffffffa0139c02>] cma_req_handler+0x418/0x644 [rdma_cm]
     [<ffffffffa012ee88>] cm_process_work+0x32/0x119 [ib_cm]
     [<ffffffffa0130299>] cm_req_handler+0x928/0x982 [ib_cm]
     [<ffffffffa01302f3>] ? cm_req_handler+0x982/0x982 [ib_cm]
     [<ffffffffa0130326>] cm_work_handler+0x33/0xfe5 [ib_cm]
     [<ffffffff81065a78>] ? trace_hardirqs_on_caller+0x11e/0x155
     [<ffffffffa01302f3>] ? cm_req_handler+0x982/0x982 [ib_cm]
     [<ffffffff81042b6e>] process_one_work+0x2bd/0x4a6
     [<ffffffff81042ac1>] ? process_one_work+0x210/0x4a6
     [<ffffffff813669f3>] ? _raw_spin_unlock_irq+0x2b/0x40
     [<ffffffff8104316e>] worker_thread+0x1d6/0x350
     [<ffffffff81042f98>] ? rescuer_thread+0x241/0x241
     [<ffffffff81046a32>] kthread+0x84/0x8c
     [<ffffffff8136e854>] kernel_thread_helper+0x4/0x10
     [<ffffffff81366d59>] ? retint_restore_args+0xe/0xe
     [<ffffffff810469ae>] ? __init_kthread_worker+0x56/0x56
     [<ffffffff8136e850>] ? gs_change+0xb/0xb

The actual locking is fine, since we're dealing with different locks,
but from the same lock class.  cma_disable_callback() acquires the
listening id mutex, whereas rdma_destroy_id() acquires the mutex for
the new connection id.  To fix this, delay the call to
rdma_destroy_id() until we've released the listening id mutex.
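The deferred-destroy pattern described above can be sketched in userspace C with pthread mutexes. This is only an analogue, not the kernel patch: `struct sketch_id`, `sketch_destroy_id()` and `sketch_req_handler()` are hypothetical names, and pthreads has no lockdep, so the sketch shows the ordering of the calls rather than the warning itself.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an rdma_cm id; only the mutex matters here. */
struct sketch_id {
    pthread_mutex_t handler_mutex;
    bool destroyed;
};

/* Like rdma_destroy_id() in the trace: takes the id's own handler_mutex. */
static void sketch_destroy_id(struct sketch_id *id)
{
    pthread_mutex_lock(&id->handler_mutex);
    id->destroyed = true;
    pthread_mutex_unlock(&id->handler_mutex);
}

/*
 * Request handler sketch. On failure, remember the new connection id and
 * destroy it only after the listening id's mutex has been released, so two
 * mutexes of the same class are never held at the same time.
 */
static int sketch_req_handler(struct sketch_id *listen_id,
                              struct sketch_id *conn_id, bool setup_fails)
{
    struct sketch_id *to_destroy = NULL;
    int ret = 0;

    /* Different locks, same "class": the two ids must be distinct objects. */
    assert(listen_id != conn_id);

    pthread_mutex_lock(&listen_id->handler_mutex);
    if (setup_fails) {
        to_destroy = conn_id;   /* defer; do not destroy under the lock */
        ret = -1;
    }
    pthread_mutex_unlock(&listen_id->handler_mutex);

    if (to_destroy)
        sketch_destroy_id(to_destroy);
    return ret;
}
```

Calling `sketch_req_handler(&listen, &conn, true)` destroys `conn` only after `listen.handler_mutex` has been dropped, which is the ordering the commit message asks for.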

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-05-08 11:17:34 -07:00
Amir Vadai 366cddb402 IB/rdma_cm: TOS <=> UP mapping for IBoE
Both tagged traffic and untagged traffic use tc tool mapping.
Treat RDMA TOS the same as IP TOS when mapping to SL.

Signed-off-by: Amir Vadai <amirv@mellanox.com>
CC: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-05 05:08:04 -04:00
Linus Torvalds 48fa57ac2c infiniband changes for 3.3 merge window
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABCAAGBQJPBQQDAAoJEENa44ZhAt0hy1EP/A/dz741mEd2QZxYHgloK8XU
 sSoHvq+vDxHnOvBrDuaHXT47FoY+OSVE+ESeJJQJ9L+B6g3yacP3hNSIcguXFs8Z
 v011AZAeRvQx3bLu5R9+eDL+YTyotkR0sl/huoLkSwlqrEqGA85eLqf5RSQdxYZf
 iC1ZXfg0KTrtb6rBvohNcijpmIVEe83SWfnD/ZuCGuWq++DyVJxzECnR7p5D8a9q
 eMJEnIKVEIpqkqXrPQr/blVSfGQL54QuUdYtoKAS8ZW6BzjIwCGUmKdoT1vaNqX5
 sIntxXMcgIgE2r0y/nDK+QIFS4U784eUevIC/LeunbhWUEQX05f3l6+V566/T9hX
 lvp5M6aonsSSvtqrVi6SF5rvSHFlwPpvAY3+jhjXKLpZ5OxMqf/ZlTN1xN4bin1A
 whGnznU+51Tjzph6Or8iXo5yExDUQhowX1Z3CYDmh/UqzKHRqaFAuiC071r8GZW3
 BEOV9yf/+qPsgtXAiO4jSKlLrOJbMgEI4BoITXTO9HvZH9dHGXDYLvULdHDmFaBi
 XLg5zcAjou24855miv/gnBQzDc0NWW184BGS9hPE9zmbQlJr7gA4zI0Eggtj+3MO
 7z/SLTrxKSfjZJR8Z3cGsnBjCs1VFqV+YQnTkyZYLORLf4F3RbDLe6aJQ+9WBA1g
 86J11MjrG30erg3gbXun
 =8ZmW
 -----END PGP SIGNATURE-----

Merge tag 'infiniband-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

infiniband changes for 3.3 merge window

* tag 'infiniband-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  rdma/core: Fix sparse warnings
  RDMA/cma: Fix endianness bugs
  RDMA/nes: Fix terminate during AE
  RDMA/nes: Make unnecessarily global nes_set_pau() static
  RDMA/nes: Change MDIO bus clock to 2.5MHz
  IB/cm: Fix layout of APR message
  IB/mlx4: Fix SL to 802.1Q priority-bits mapping for IBoE
  IB/qib: Default some module parameters optimally
  IB/qib: Optimize locking for get_txreq()
  IB/qib: Fix a possible data corruption when receiving packets
  IB/qib: Eliminate 64-bit jiffies use
  IB/qib: Fix style issues
  IB/uverbs: Protect QP multicast list
2012-01-08 14:05:48 -08:00
Sean Hefty 46ea5061c7 RDMA/cma: Fix endianness bugs
Fix endianness bugs reported by sparse in the RDMA core stack.  Note
that these are real bugs, but don't affect any existing code to the
best of my knowledge.  The mlid issue would only affect kernel users
of rdma_join_multicast which have the rdma_cm attach/detach its QP.
There are no current in-tree users that do this. (rdma_join_multicast
may be called by user space applications, which do not have
this issue.)  And the pkey setting is simply returned as
informational.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-01-04 09:13:52 -08:00
David S. Miller abb434cb05 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	net/bluetooth/l2cap_core.c

Just two overlapping changes, one added an initialization of
a local variable, and another change added a new local variable.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-23 17:13:56 -05:00
Sean Hefty 04ded16724 RDMA/cma: Verify private data length
private_data_len is defined as a u8.  If the user specifies a large
private_data size (> 220 bytes), we will calculate a total length that
exceeds 255, resulting in private_data_len wrapping back to 0.  This
can lead to overwriting random kernel memory.  Avoid this by verifying
that the resulting size fits into a u8.
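The wraparound and the check can be illustrated with a small C sketch. `SKETCH_HDR_LEN` is an assumed header size chosen for illustration, and both function names are hypothetical; the actual patch is not shown in this log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed size of the header prepended to the user's private data. */
#define SKETCH_HDR_LEN 36u

/* Unchecked version: the sum is truncated to 8 bits and wraps modulo 256. */
static uint8_t sketch_total_len_unchecked(size_t user_len)
{
    return (uint8_t)(user_len + SKETCH_HDR_LEN);
}

/*
 * Checked version: returns 0 and stores the total on success, or -1 if
 * header plus user data would not fit in a u8.
 */
static int sketch_total_len(size_t user_len, uint8_t *total)
{
    assert(total != NULL);
    if (user_len > UINT8_MAX - SKETCH_HDR_LEN)
        return -1;
    *total = (uint8_t)(user_len + SKETCH_HDR_LEN);
    return 0;
}
```

Under the assumed 36-byte header, a 220-byte payload sums to 256, which the unchecked version truncates to 0, while the checked version rejects it before any buffer is sized from the wrapped value.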

Reported-by: B. Thery <benjamin.thery@bull.net>
Addresses: <http://bugs.openfabrics.org/bugzilla/show_bug.cgi?id=2335>
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-12-19 09:15:33 -08:00
Alexey Dobriyan 4e3fd7a06d net: remove ipv6_addr_copy()
C assignment can handle struct in6_addr copying.
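As a minimal userspace illustration of that point, plain struct assignment copies all 16 bytes of a `struct in6_addr`, so no dedicated copy helper is needed. `sketch_copy_addr()` is a hypothetical wrapper used only to make the assignment testable.

```c
#include <assert.h>
#include <netinet/in.h>
#include <string.h>

/* Copy an IPv6 address with ordinary struct assignment. */
static void sketch_copy_addr(struct in6_addr *dst, const struct in6_addr *src)
{
    assert(dst != NULL && src != NULL);
    *dst = *src;    /* same effect as memcpy(dst, src, sizeof(*dst)) */
}
```

After `sketch_copy_addr(&b, &a)`, `memcmp(&a, &b, sizeof(a))` returns 0.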

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-22 16:43:32 -05:00