Commit Graph

1327 Commits

Author SHA1 Message Date
Roland Dreier 79e2595940 IB/addr: Fix setting source address in addr6_resolve()
Commit eea40b8f62 ("infiniband: call ipv6 route lookup via the stub
interface") introduced a regression in address resolution when connecting
to IPv6 destination addresses.  The old code called ip6_route_output(),
while the new code calls ipv6_stub->ipv6_dst_lookup().  The two are almost
the same, except that ipv6_dst_lookup() also calls ip6_route_get_saddr()
if the source address is in6addr_any.

This means that the test of ipv6_addr_any(&fl6.saddr) now never succeeds,
and so we never copy the source address out.  This ends up causing
rdma_resolve_addr() to fail, because without a resolved source address,
cma_acquire_dev() will fail to find an RDMA device to use.  For me, this
causes connecting to an NVMe over Fabrics target via RoCE / IPv6 to fail.

Fix this by copying out fl6.saddr if ipv6_addr_any() is true for the original
source address passed into addr6_resolve().  We can drop our call to
ipv6_dev_get_saddr() because ipv6_dst_lookup() already does that work.

Fixes: eea40b8f62 ("infiniband: call ipv6 route lookup via the stub interface")
Cc: <stable@vger.kernel.org> # 3.12+
Signed-off-by: Roland Dreier <roland@purestorage.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-06-07 14:34:19 -04:00
Majd Dibbiny d3957b86a4 RDMA/SA: Fix kernel panic in CMA request handler flow
Commit 9fdca4da4d (IB/SA: Split struct sa_path_rec based on IB and
ROCE specific fields) moved the service_id to be specific attribute
for IB and OPA SA Path Record, and thus wasn't assigned for RoCE.

This caused to the following kernel panic in the CMA request handler flow:

[   27.074594] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[   27.074731] IP: __radix_tree_lookup+0x1d/0xe0
...
[   27.075356] Workqueue: ib_cm cm_work_handler [ib_cm]
[   27.075401] task: ffff88022e3b8000 task.stack: ffffc90001298000
[   27.075449] RIP: 0010:__radix_tree_lookup+0x1d/0xe0
...
[   27.075979] Call Trace:
[   27.076015]  radix_tree_lookup+0xd/0x10
[   27.076055]  cma_ps_find+0x59/0x70 [rdma_cm]
[   27.076097]  cma_id_from_event+0xd2/0x470 [rdma_cm]
[   27.076144]  ? ib_init_ah_from_path+0x39a/0x590 [ib_core]
[   27.076193]  cma_req_handler+0x25/0x480 [rdma_cm]
[   27.076237]  cm_process_work+0x25/0x120 [ib_cm]
[   27.076280]  ? cm_get_bth_pkey.isra.62+0x3c/0xa0 [ib_cm]
[   27.076350]  cm_req_handler+0xb03/0xd40 [ib_cm]
[   27.076430]  ? sched_clock_cpu+0x11/0xb0
[   27.076478]  cm_work_handler+0x194/0x1588 [ib_cm]
[   27.076525]  process_one_work+0x160/0x410
[   27.076565]  worker_thread+0x137/0x4a0
[   27.076614]  kthread+0x112/0x150
[   27.076684]  ? max_active_store+0x60/0x60
[   27.077642]  ? kthread_park+0x90/0x90
[   27.078530]  ret_from_fork+0x2c/0x40

This patch moves it back to the common SA Path Record structure
and removes the redundant setter and getter.

Tested on Connect-IB and Connect-X4 in Infiniband and RoCE respectively.

Fixes: 9fdca4da4d (IB/SA: Split struct sa_path_rec based on IB ands
	ROCE specific fields)
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-06-01 17:20:14 -04:00
Leon Romanovsky 79bb5b7ee1 RDMA/umem: Fix missing mmap_sem in get umem ODP call
Add mmap_sem lock around VMA inspection in ib_umem_odp_get().

Fixes: 0008b84ea9 ('IB/umem: Add support to huge ODP')
Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com>
Reviewed-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-06-01 17:20:13 -04:00
Qing Huang 53376fedb9 RDMA/core: not to set page dirty bit if it's already set.
This change will optimize kernel memory deregistration operations.
__ib_umem_release() used to call set_page_dirty_lock() against every
writable page in its memory region. Its purpose is to keep data
synced between CPU and DMA device when swapping happens after mem
deregistration ops. Now we choose not to set page dirty bit if it's
already set by kernel prior to calling __ib_umem_release(). This
reduces memory deregistration time by half or even more when we ran
application simulation test program.

Signed-off-by: Qing Huang <qing.huang@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-06-01 17:20:12 -04:00
Leon Romanovsky f937d93a91 RDMA/uverbs: Declare local function static and add brackets to sizeof
Commit 5752075144 ("IB/SA: Add OPA path record type") introduced
new local function __ib_copy_path_rec_to_user, but didn't limit its
scope. This produces the following sparse warning:

	drivers/infiniband/core/uverbs_marshall.c:99:6: warning:
	symbol '__ib_copy_path_rec_to_user' was not declared. Should it be
	static?

In addition, it used sizeof ... notations instead of sizeof(...), which
is correct in C, but a little bit misleading. Let's change it too.

Fixes: 5752075144 ("IB/SA: Add OPA path record type")
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-06-01 17:20:12 -04:00
Leon Romanovsky 233c195583 RDMA/netlink: Reduce exposure of RDMA netlink functions
RDMA netlink is part of ib_core, hence ibnl_chk_listeners(),
ibnl_init() and ibnl_cleanup() don't need to be published
in public header file.

Let's remove EXPORT_SYMBOL from ibnl_chk_listeners() and move all these
functions to private header file.

CC: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-06-01 17:20:11 -04:00
Linus Torvalds af82455f7d char/misc patches for 4.12-rc1
Here is the big set of new char/misc driver drivers and features for
 4.12-rc1.
 
 There's lots of new drivers added this time around, new firmware drivers
 from Google, more auxdisplay drivers, extcon drivers, fpga drivers, and
 a bunch of other driver updates.  Nothing major, except if you happen to
 have the hardware for these drivers, and then you will be happy :)
 
 All of these have been in linux-next for a while with no reported
 issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWQvAgg8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+yknsACgzkAeyz16Z97J3UTaeejbR7nKUCAAoKY4WEHY
 8O9f9pr9gj8GMBwxeZQa
 =OIfB
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver updates from Greg KH:
 "Here is the big set of new char/misc driver drivers and features for
  4.12-rc1.

  There's lots of new drivers added this time around, new firmware
  drivers from Google, more auxdisplay drivers, extcon drivers, fpga
  drivers, and a bunch of other driver updates. Nothing major, except if
  you happen to have the hardware for these drivers, and then you will
  be happy :)

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'char-misc-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (136 commits)
  firmware: google memconsole: Fix return value check in platform_memconsole_init()
  firmware: Google VPD: Fix return value check in vpd_platform_init()
  goldfish_pipe: fix build warning about using too much stack.
  goldfish_pipe: An implementation of more parallel pipe
  fpga fr br: update supported version numbers
  fpga: region: release FPGA region reference in error path
  fpga altera-hps2fpga: disable/unprepare clock on error in alt_fpga_bridge_probe()
  mei: drop the TODO from samples
  firmware: Google VPD sysfs driver
  firmware: Google VPD: import lib_vpd source files
  misc: lkdtm: Add volatile to intentional NULL pointer reference
  eeprom: idt_89hpesx: Add OF device ID table
  misc: ds1682: Add OF device ID table
  misc: tsl2550: Add OF device ID table
  w1: Remove unneeded use of assert() and remove w1_log.h
  w1: Use kernel common min() implementation
  uio_mf624: Align memory regions to page size and set correct offsets
  uio_mf624: Refactor memory info initialization
  uio: Allow handling of non page-aligned memory regions
  hangcheck-timer: Fix typo in comment
  ...
2017-05-04 19:15:35 -07:00
Paolo Abeni 24b43c9964 infiniband: avoid dereferencing uninitialized dst on error path
With commit eea40b8f62 ("infiniband: call ipv6 route lookup
via the stub interface"), if the route lookup fails due to
ipv6 being disabled, the dst variable is left untouched, and
the following dst_release() may access uninitialized memory.

Since ipv6_dst_lookup() always sets dst to NULL in case of
lookup failure with ipv6 enabled, fix the above just
returning the error code if the lookup fails.

Fixes: eea40b8f62 ("infiniband: call ipv6 route lookup via the stub interface")
Reported-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-02 10:45:45 -04:00
Dasaratharaman Chandramouli 4c33bd1926 IB/SA: Add support to query OPA path records
When the bit 26 of capmask2 field in OPA classport info
query is set, SA will query for OPA path records instead
of querying for IB path records. Note that OPA
path records can only be queried by kernel ULPs.
Userspace clients continue to query IB path records.

Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01 14:39:02 -04:00
Dasaratharaman Chandramouli 5752075144 IB/SA: Add OPA path record type
Add opa_sa_path_rec to sa_path_rec data structure.
The 'type' field in sa_path_rec identifies the
type of the path record.

Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01 14:39:02 -04:00
Dasaratharaman Chandramouli 9fdca4da4d IB/SA: Split struct sa_path_rec based on IB and ROCE specific fields
sa_path_rec now contains a union of sa_path_rec_ib and sa_path_rec_roce
based on the type of the path record. Note that fields applicable to
path record type ROCE v1 and ROCE v2 fall under sa_path_rec_roce.
Accessor functions are added to these fields so the caller doesn't have
to know the type.

Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01 14:38:19 -04:00
Dasaratharaman Chandramouli dfa834e1d9 IB/SA: Introduce path record specific types
struct sa_path_rec has a gid_type field. This patch introduces a more
generic path record specific type 'rec_type' which is either IB, ROCE v1
or ROCE v2. The patch also provides conversion functions to get
a gid type from a path record type and vice versa

Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01 14:37:28 -04:00
Dasaratharaman Chandramouli c2f8fc4ec4 IB/SA: Rename ib_sa_path_rec to sa_path_rec
Rename ib_sa_path_rec to a more generic sa_path_rec.
This is part of extending ib_sa to also support OPA
path records in addition to the IB defined path records.

Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01 14:37:28 -04:00
Dasaratharaman Chandramouli 82ffc22648 IB/CM: Add braces when using sizeof
This patch adds braces around parameters to sizeof
as called out by checkpatch

Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01 14:36:39 -04:00
Dasaratharaman Chandramouli 44c58487d5 IB/core: Define 'ib' and 'roce' rdma_ah_attr types
rdma_ah_attr can now be either ib or roce allowing
core components to use one type or the other and also
to define attributes unique to a specific type. struct
ib_ah is also initialized with the type when its first
created. This ensures that calls such as modify_ah
dont modify the type of the address handle attribute.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01 14:32:43 -04:00
Dasaratharaman Chandramouli d8966fcd4c IB/core: Use rdma_ah_attr accessor functions
Modify core and driver components to use accessor functions
introduced to access individual fields of rdma_ah_attr

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01 14:32:43 -04:00
Dasaratharaman Chandramouli 3652315934 IB/core: Rename ib_destroy_ah to rdma_destroy_ah
Rename ib_destroy_ah to rdma_destroy_ah so its in sync with the
rename of the ib address handle attribute

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01 14:32:43 -04:00
Dasaratharaman Chandramouli bfbfd661c9 IB/core: Rename ib_query_ah to rdma_query_ah
Rename ib_query_ah to rdma_query_ah so its in sync with the
rename of the ib address handle attribute

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01 14:32:43 -04:00
Dasaratharaman Chandramouli 67b985b6c7 IB/core: Rename ib_modify_ah to rdma_modify_ah
Rename ib_modify_ah to rdma_modify_ah so its in sync with the
rename of the ib address handle attribute

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01 14:32:43 -04:00
Dasaratharaman Chandramouli 0a18cfe4f6 IB/core: Rename ib_create_ah to rdma_create_ah
Rename ib_create_ah to rdma_create_ah so its in sync with the
rename of the ib address handle attribute

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01 14:32:43 -04:00
Dasaratharaman Chandramouli 90898850ec IB/core: Rename struct ib_ah_attr to rdma_ah_attr
This patch simply renames struct ib_ah_attr to
rdma_ah_attr as these fields specify attributes that are
not necessarily specific to IB.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01 14:32:43 -04:00
Dasaratharaman Chandramouli 4ba66093bd IB/core: Check for global flag when using ah_attr
Read/write grh fields of the ah_attr only if the
ah_flags field has the IB_AH_GRH bit enabled

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01 14:32:43 -04:00
Dasaratharaman Chandramouli cf0b9395d0 IB/core: Add braces when using sizeof
This patch adds braces around parameters to sizeof
as called out by checkpatch

Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01 14:32:43 -04:00
Dasaratharaman Chandramouli 2196f27162 IB/SA: Add support to query opa classport info.
For OPA devices, SA will query the OPA classport info
instead of the IB defined classport info.
opa classport info exposes additional information and
capabilities that are specific to OPA devices.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-28 19:29:42 -04:00
Dasaratharaman Chandramouli ee1c60b1bf IB/SA: Modify SA to implicitly cache Class Port info
SA will query and cache class port info as part of
its initialization. SA will also invalidate and
refresh the cache based on specific events. Callers such
as IPoIB and CM can query the SA to get the classportinfo
information. Apart from making the caller code much simpler,
this change puts the onus on the SA to query and maintain
classportinfo much like how it maitains the address handle to the SM.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-28 14:00:17 -04:00
Dasaratharaman Chandramouli cb8637660a IB/SA: Move functions update_sm_ah() and ib_sa_event()
Moving these will facilitate changes to these in the
next patchs. This is strictly a move and there are no
changes to the functions in any way.

Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-28 13:58:08 -04:00
Dasaratharaman Chandramouli 680562b569 IB/SA: Remove unwanted braces
This fixes a checkpatch issue. The fix is needed
so that some of these functions can be moved around
in the forthcoming patches

Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-28 13:58:08 -04:00
Dasaratharaman Chandramouli dbb6c91fd8 IB/SA: Add braces when using sizeof
This fixes a checkpatch issue. The fix is needed
so that some of these functions can be moved around
in the forthcoming patches

Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-28 13:58:08 -04:00
Dasaratharaman Chandramouli f96a318714 IB/SA: Fix lines longer than 80 columns
This fixes a checkpatch issue. The fix is needed
so that some of these functions can be moved around
in the forthcoming patches

Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-28 13:58:08 -04:00
Michael J. Ruhl 8561eae60f IB/core: For multicast functions, verify that LIDs are multicast LIDs
The Infiniband spec defines "A multicast address is defined by a
MGID and a MLID" (section 10.5).  Currently the MLID value is not
validated.

Add check to verify that the MLID value is in the correct address
range.

Fixes: 0c33aeedb2 ("[IB] Add checks to multicast attach and detach")
Cc: stable@vger.kernel.org
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-28 13:48:01 -04:00
Michael J. Ruhl 20c7840a77 IB/core: If the MGID/MLID pair is not on the list return an error
A list of MGID/MLID pairs is built when doing a multicast attach.  When
the multicast detach is called, the list is searched, and regardless of
the search outcome, the driver detach is called.

If an MGID/MLID pair is not on the list, driver detach should not be
called, and an error should be returned.  Calling the driver without
removing an MGID/MLID pair from the list can leave the core and driver
out of sync.

Fixes: f4e401562c ("IB/uverbs: track multicast group membership for userspace QPs")
Cc: stable@vger.kernel.org
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-28 13:45:44 -04:00
Leon Romanovsky 218271adca Ib/core: Mark local uverbs_std_types functions to be static
Functions declared in uverbs_std_types.c are local to that file, but
they lack static declarations. This produces a lot of sparse warnings,
like the one below:

drivers/infiniband/core/uverbs_std_types.c:41:5: warning: symbol
				'uverbs_free_ah' was not declared.
				Should it be static?

So mark them as static.

CC: Matan Barak <matanb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-28 13:11:43 -04:00
Paolo Abeni eea40b8f62 infiniband: call ipv6 route lookup via the stub interface
The infiniband address handle can be triggered to resolve an ipv6
address in response to MAD packets, regardless of the ipv6
module being disabled via the kernel command line argument.

That will cause a call into the ipv6 routing code, which is not
initialized, and a conseguent oops.

This commit addresses the above issue replacing the direct lookup
call with an indirect one via the ipv6 stub, which is properly
initialized according to the ipv6 status (e.g. if ipv6 is
disabled, the routing lookup fails gracefully)

Cc: stable@vger.kernel.org # 3.12+
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-28 12:55:17 -04:00
Artemy Kovalyov 0008b84ea9 IB/umem: Add support to huge ODP
Add IB_ACCESS_HUGETLB ib_reg_mr flag.
Hugetlb region registered with this flag
will use single translation entry per huge page.

Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-25 15:40:28 -04:00
Artemy Kovalyov 403cd12e2c IB/umem: Add contiguous ODP support
Currenlty ODP supports only regular MMU pages.
Add ODP support for regions consisting of physically contiguous chunks
of arbitrary order (huge pages for instance) to improve performance.

Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-25 15:40:28 -04:00
Artemy Kovalyov 3e7e1193e2 IB: Replace ib_umem page_size by page_shift
Size of pages are held by struct ib_umem in page_size field.

It is better to store it as an exponent, because page size by nature
is always power-of-two and used as a factor, divisor or ilog2's argument.

The conversion of page_size to be page_shift allows to have portable
code and avoid following error while compiling on ARM:

  ERROR: "__aeabi_uldivmod" [drivers/infiniband/core/ib_core.ko] undefined!

CC: Selvin Xavier <selvin.xavier@broadcom.com>
CC: Steve Wise <swise@chelsio.com>
CC: Lijun Ou <oulijun@huawei.com>
CC: Shiraz Saleem <shiraz.saleem@intel.com>
CC: Adit Ranadive <aditr@vmware.com>
CC: Dennis Dalessandro <dennis.dalessandro@intel.com>
CC: Ram Amrani <Ram.Amrani@Cavium.com>
Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Acked-by: Ram Amrani <Ram.Amrani@cavium.com>
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
Acked-by: Adit Ranadive <aditr@vmware.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-25 15:40:28 -04:00
Zhu Yanjun 8d2216be28 IB/core: change the return type to void
The function ib_unregister_mad_agent always returns zero. And
this returned value is not checked. As such, chane the return
type to void.

CC: Joe Jin <joe.jin@oracle.com>
CC: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Hal Rosenstock <hal@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-25 15:30:26 -04:00
Vlad Tsyrklevich 4f7f4dcfff infiniband/uverbs: Fix integer overflows
The 'num_sge' variable is verfied to be smaller than the 'sge_count'
variable; however, since both are user-controlled it's possible to cause
an integer overflow for the kmalloc multiply on 32-bit platforms
(num_sge and sge_count are both defined u32). By crafting an input that
causes a smaller-than-expected allocation it's possible to write
controlled data out-of-bounds.

Signed-off-by: Vlad Tsyrklevich <vlad@tsyrklevich.net>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-25 15:18:02 -04:00
Petr Mladek 50b6778c44 IB/fmr_pool: Convert the cleanup thread into kthread worker API
Kthreads are currently implemented as an infinite loop. Each
has its own variant of checks for terminating, freezing,
awakening. In many cases it is unclear to say in which state
it is and sometimes it is done a wrong way.

The plan is to convert kthreads into kthread_worker or workqueues
API. It allows to split the functionality into separate operations.
It helps to make a better structure. Also it defines a clean state
where no locks are taken, IRQs blocked, the kthread might sleep
or even be safely migrated.

The kthread worker API is useful when we want to have a dedicated
single thread for the work. It helps to make sure that it is
available when needed. Also it allows a better control, e.g.
define a scheduling priority.

This patch converts the frm_pool kthread into the kthread worker
API because I am not sure how busy the thread is. It is well
possible that it does not need a dedicated kthread and workqueues
would be perfectly fine. Well, the conversion between kthread
worker API and workqueues is pretty trivial.

The patch moves one iteration from the kthread into the work function.
It is queued only when there is a pending work. Therefore we do not
need to compare flush_ser and req_ser at the beginning. On the contrary,
the same work could be queued only once at a time. Therefore it has to
re-queue itself if some requests are pending.

Otherwise, wake_up_process() is replaced by queuing the work.

Important: The change is only compile tested. I did not find an easy
way how to check it in a real life.

Signed-off-by: Petr Mladek <pmladek@suse.com>
TO: Doug Ledford <dledford@redhat.com>
CC: Sean Hefty <sean.hefty@intel.com>
CC: Hal Rosenstock <hal.rosenstock@gmail.com>
CC: linux-rdma@vger.kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-25 14:24:17 -04:00
Noa Osherovich 12113a35ad IB/core: Add HDR speed enum
Add high data rate speed to the ib_port_speed enumeration.

Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-21 12:29:31 -04:00
Moni Shoua 61c0ddbe97 IB/cma: Send MRA for reply messages
Current implementation of RDMA_CM sends MRA (Message Receipt
Acknowledgment) only for request messages but not for response messages.

As a result, a slow active side of the connection may send a ready-to-use
message to the passive side in a delay that is too long for the passive
side to wait for.

This patch adds a call to ib_send_cm_mra() upon receiving a response
message and by this tells the other side to modify the service timeout
to a bigger value, 16 times than before. As in the request case, MRA
for reply will be sent only if a duplicate response has arrived.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Matan Barak <matan@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-21 12:29:31 -04:00
Slava Shwartsman 483a3966b5 IB/core: Introduce drop flow specification
This flow steering specification identifies flow for drop by the HW.
If user create a flow only with the drop specification,
then all the packets that hit this flow will be dropped, otherwise the HW
will drop only the packets that match the other L2/L3/L4 specifications.

Signed-off-by: Slava Shwartsman <slavash@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-21 12:26:05 -04:00
Jack Morgenstein b312be3d87 IB/core: Fix sysfs registration error flow
The kernel commit cited below restructured ib device management
so that the device kobject is initialized in ib_alloc_device.

As part of the restructuring, the kobject is now initialized in
procedure ib_alloc_device, and is later added to the device hierarchy
in the ib_register_device call stack, in procedure
ib_device_register_sysfs (which calls device_add).

However, in the ib_device_register_sysfs error flow, if an error
occurs following the call to device_add, the cleanup procedure
device_unregister is called. This call results in the device object
being deleted -- which results in various use-after-free crashes.

The correct cleanup call is device_del -- which undoes device_add
without deleting the device object.

The device object will then (correctly) be deleted in the
ib_register_device caller's error cleanup flow, when the caller invokes
ib_dealloc_device.

Fixes: 55aeed0654 ("IB/core: Make ib_alloc_device init the kobject")
Cc: <stable@vger.kernel.org> # v4.2+
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-21 12:26:05 -04:00
Parav Pandit 4be3a4fa51 IB/core: Fix kernel crash during fail to initialize device
This patch fixes the kernel crash that occurs during ib_dealloc_device()
called due to provider driver fails with an error after
ib_alloc_device() and before it can register using ib_register_device().

This crashed seen in tha lab as below which can occur with any IB device
which fails to perform its device initialization before invoking
ib_register_device().

This patch avoids touching cache and port immutable structures if device
is not yet initialized.
It also releases related memory when cache and port immutable data
structure initialization fails during register_device() state.

[81416.561946] BUG: unable to handle kernel NULL pointer dereference at (null)
[81416.570340] IP: ib_cache_release_one+0x29/0x80 [ib_core]
[81416.576222] PGD 78da66067
[81416.576223] PUD 7f2d7c067
[81416.579484] PMD 0
[81416.582720]
[81416.587242] Oops: 0000 [#1] SMP
[81416.722395] task: ffff8807887515c0 task.stack: ffffc900062c0000
[81416.729148] RIP: 0010:ib_cache_release_one+0x29/0x80 [ib_core]
[81416.735793] RSP: 0018:ffffc900062c3a90 EFLAGS: 00010202
[81416.741823] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
[81416.749785] RDX: 0000000000000000 RSI: 0000000000000282 RDI: ffff880859fec000
[81416.757757] RBP: ffffc900062c3aa0 R08: ffff8808536e5ac0 R09: ffff880859fec5b0
[81416.765708] R10: 00000000536e5c01 R11: ffff8808536e5ac0 R12: ffff880859fec000
[81416.773672] R13: 0000000000000000 R14: ffff8808536e5ac0 R15: ffff88084ebc0060
[81416.781621] FS:  00007fd879fab740(0000) GS:ffff88085fac0000(0000) knlGS:0000000000000000
[81416.790522] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[81416.797094] CR2: 0000000000000000 CR3: 00000007eb215000 CR4: 00000000003406e0
[81416.805051] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[81416.812997] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[81416.820950] Call Trace:
[81416.824226]  ib_device_release+0x1e/0x40 [ib_core]
[81416.829858]  device_release+0x32/0xa0
[81416.834370]  kobject_cleanup+0x63/0x170
[81416.839058]  kobject_put+0x25/0x50
[81416.843319]  ib_dealloc_device+0x25/0x40 [ib_core]
[81416.848986]  mlx5_ib_add+0x163/0x1990 [mlx5_ib]
[81416.854414]  mlx5_add_device+0x5a/0x160 [mlx5_core]
[81416.860191]  mlx5_register_interface+0x8d/0xc0 [mlx5_core]
[81416.866587]  ? 0xffffffffa09e9000
[81416.870816]  mlx5_ib_init+0x15/0x17 [mlx5_ib]
[81416.876094]  do_one_initcall+0x51/0x1b0
[81416.880861]  ? __vunmap+0x85/0xd0
[81416.885113]  ? kmem_cache_alloc_trace+0x14b/0x1b0
[81416.890768]  ? vfree+0x2e/0x70
[81416.894762]  do_init_module+0x60/0x1fa
[81416.899441]  load_module+0x15f6/0x1af0
[81416.904114]  ? __symbol_put+0x60/0x60
[81416.908709]  ? ima_post_read_file+0x3d/0x80
[81416.913828]  ? security_kernel_post_read_file+0x6b/0x80
[81416.920006]  SYSC_finit_module+0xa6/0xf0
[81416.924888]  SyS_finit_module+0xe/0x10
[81416.929568]  entry_SYSCALL_64_fastpath+0x1a/0xa9
[81416.935089] RIP: 0033:0x7fd879494949
[81416.939543] RSP: 002b:00007ffdbc1b4e58 EFLAGS: 00000202 ORIG_RAX: 0000000000000139
[81416.947982] RAX: ffffffffffffffda RBX: 0000000001b66f00 RCX: 00007fd879494949
[81416.955965] RDX: 0000000000000000 RSI: 000000000041a13c RDI: 0000000000000003
[81416.963926] RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000001b652a0
[81416.971861] R10: 0000000000000003 R11: 0000000000000202 R12: 00007ffdbc1b3e70
[81416.979763] R13: 00007ffdbc1b3e50 R14: 0000000000000005 R15: 0000000000000000
[81417.008005] RIP: ib_cache_release_one+0x29/0x80 [ib_core] RSP: ffffc900062c3a90
[81417.016045] CR2: 0000000000000000

Fixes: 55aeed0654 ("IB/core: Make ib_alloc_device init the kobject")
Fixes: 7738613e7c ("IB/core: Add per port immutable struct to ib_device")
Cc: <stable@vger.kernel.org> # v4.2+
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-21 12:26:05 -04:00
Doug Ledford 23790ba2d7 Merge branch 'k.o/for-4.12' into k.o/for-4.12-rdma-netdevice 2017-04-20 12:00:41 -04:00
Matan Barak db1b5ddd53 IB/core: Rename uverbs event file structure
Previously, ib_uverbs_event_file was suffixed by _file as it contained
the actual file information. Since it's now only used as base struct
for ib_uverbs_async_event_file and ib_uverbs_completion_event_file,
we change its name to ib_uverbs_event_queue. This represents its
logical role better.

Fixes: 1e7710f3f6 ('IB/core: Change completion channel to use the reworked objects schema')
Signed-off-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20 11:44:07 -04:00
Matan Barak e0fcc61113 IB/core: Don't use is_async in event files to infer events size
Previously, we inferred the events size in ib_uverbs_event_read by
using the is_async flag. Instead of that, we pass the event size
directly.

Fixes: 1e7710f3f6 ('IB/core: Change completion channel to use the reworked objects schema')
Signed-off-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20 11:44:07 -04:00
Matan Barak c52d8114d1 IB/core: A small refactor in destroy WQ handler
Instead of having uverbs_uobject_put both in the error flow and the
good flow, we unite them.

Fixes: fd3c7904db ('IB/core: Change idr objects to use the new schema')
Signed-off-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20 11:44:07 -04:00
Matan Barak d9edfc5a4f IB/core: Nullify ib_uobject during allocation
Currently, we initialize all fields of ib_uobject straight after
allocation. Therefore, a kmalloc was sufficient. Since ib_uobject
could be embedded in a type specific structure, we nullify it to
spare programmer errors.

Fixes: 3832125624 ('IB/core: Add support for idr types')
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20 11:44:07 -04:00
Matan Barak f025c48958 IB/core: Don't pass the lock state to _rdma_remove_commit_uobject
The only scenario where this function was called while the lock is
already taken is in the context cleanup scenario. Thus, in order not
to pass the lock state to this function, we just call the remove logic
straight from the cleanup context function.

Fixes: 3832125624 ('IB/core: Add support for idr types')
Signed-off-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20 11:44:07 -04:00