Add a helper to zero fill fields before copying data to
UVERBS_ATTR_STRUCT.
As UVERBS_ATTR_STRUCT can be used as an extensible struct, we want to make
sure that if the user supplies us with a struct that has new fields that
we are not aware of, we return them zeroed to the user.
This helper should be used when using UVERBS_ATTR_STRUCT for an extendable
data structure and there is a need to make sure that extended members of
the struct, that the kernel doesn't handle, are returned zeroed to the
user. This is needed due to the fact that UVERBS_ATTR_STRUCT allows
non-zero values for members after 'last' member.
Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Same code is executed in both rxe_param_set_add and rxe_notify functions.
Make one function and call it from both places.
Since both callers already have a rxe object use it directly instead of
deriving it from the net device.
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Current rxe device counters are not thread safe.
When multiple QPs are used, they can be racy.
Make them thread safe by making it atomic64.
Fixes: 0b1e5b99a4 ("IB/rxe: Add port protocol stats")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
During testing the command format was changed to close a security
hole. Revise the driver to use the command format that will actually be
supported in GA firmware.
Both the UMEM and UCTX are intended only for use by the kernel and cannot
be executed using a general command.
Since the UMEM and CTX are not part of the general object the caps bits
were moved to be some log_xxx location in the general HCA caps.
The firmware code was adapted as well to match the above.
Fixes: a8b92ca1b0 ("IB/mlx5: Introduce DEVX")
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Reviewed-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Use uid as part of alloc/dealloc transport domain to let firmware manages
the resources correctly.
Fixes: d2d19121ae ("IB/mlx5: Set uid as part of TD commands")
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
From git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
mlx5 updates taken for dependencies on following patches.
* branche 'mlx5-next': (23 commits)
IB/mlx5: Introduce uid as part of alloc/dealloc transport domain
net/mlx5: Add shared Q counter bits
net/mlx5: Continue driver initialization despite debugfs failure
net/mlx5: Fold the modify lag code into function
net/mlx5: Add lag affinity info to log
net/mlx5: Split the activate lag function into two routines
net/mlx5: E-Switch, Introduce flow counter affinity
IB/mlx5: Unify e-switch representors load approach between uplink and VFs
net/mlx5: Use lowercase 'X' for hex values
net/mlx5: Remove duplicated include from eswitch.c
net/mlx5: Remove the get protocol device interface entry
net/mlx5: Support extended destination format in flow steering command
net/mlx5: E-Switch, Change vhca id valid bool field to bit flag
net/mlx5: Introduce extended destination fields
net/mlx5: Revise gre and nvgre key formats
net/mlx5: Add monitor commands layout and event data
net/mlx5: Add support for plugged-disabled cable status in PME
net/mlx5: Add support for PCIe power slot exceeded error in PME
net/mlx5: Rework handling of port module events
net/mlx5: Move flow counters data structures from flow steering header
...
Lots of conflicts, by happily all cases of overlapping
changes, parallel adds, things of that nature.
Thanks to Stephen Rothwell, Saeed Mahameed, and others
for their guidance in these resolutions.
Signed-off-by: David S. Miller <davem@davemloft.net>
Increasing the depth of control path command queue to 8K entries to handle
burst of commands. This feature needs support from FW and the driver/fw
compatibility is checked from the interface version number.
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Get HWRM interface major, minor, build and patch version from FW for
checking the FW/Driver compatibility.
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Acquiring the rtnl lock while holding usdev_lock could result in a
deadlock.
For example:
usnic_ib_query_port()
| mutex_lock(&us_ibdev->usdev_lock)
| ib_get_eth_speed()
| rtnl_lock()
rtnl_lock()
| usnic_ib_netdevice_event()
| mutex_lock(&us_ibdev->usdev_lock)
This commit moves the usdev_lock acquisition after the rtnl lock has been
released.
This is safe to do because usdev_lock is not protecting anything being
accessed in ib_get_eth_speed(). Hence, the correct order of holding locks
(rtnl -> usdev_lock) is not violated.
Signed-off-by: Parvi Kaustubhi <pkaustub@cisco.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
When in a sleepable (non-atomic) context, wait for firmware completion
instead of polling for it.
Signed-off-by: Gal Pressman <galpress@amazon.com>
Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
When in a sleepable (non-atomic) context, wait for firmware completion
instead of polling for it.
Signed-off-by: Gal Pressman <galpress@amazon.com>
Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Introduce a 'flags' field to destroy address handle callback and add a
flag that marks whether the callback is executed in an atomic context or
not.
This will allow drivers to wait for completion instead of polling for it
when it is allowed.
Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Introduce a 'flags' field to create address handle callback and add a flag
that marks whether the callback is executed in an atomic context or not.
This will allow drivers to wait for completion instead of polling for it
when it is allowed.
Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Modify allocation of the non-SRQ receive queues such that immediate
data is aligned on a 512 byte boundary. That alignment is necessary
to pass the immediate data without copying to the block layer. When
receiving an SRP_CMD with immediate data, postpone the ib_post_recv()
call until target_execute_cmd() has finished. See also
srpt_release_cmd().
Cc: Sergey Gorenko <sergeygo@mellanox.com>
Cc: Max Gurtovoy <maxg@mellanox.com>
Cc: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
This patch does not change any functionality but makes the next patch
easier to read.
Cc: Sergey Gorenko <sergeygo@mellanox.com>
Cc: Max Gurtovoy <maxg@mellanox.com>
Cc: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Neither a driver version number nor a release data is useful in
an upstream driver. Remove the word "InfiniBand" from the driver
description because recently RoCE support has been added to this
driver.
Cc: Sergey Gorenko <sergeygo@mellanox.com>
Cc: Max Gurtovoy <maxg@mellanox.com>
Cc: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Add documentation for those structure members for which it is missing.
Cc: Sergey Gorenko <sergeygo@mellanox.com>
Cc: Max Gurtovoy <maxg@mellanox.com>
Cc: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Make sure that long strings occur on a single line as required by the
coding standard.
Cc: Sergey Gorenko <sergeygo@mellanox.com>
Cc: Max Gurtovoy <maxg@mellanox.com>
Cc: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Use tabs instead of spaces for indentation. Make sure that multi-line
expressions have the operator at the end of a line instead of the start.
Avoid a complaint about a missing space in a ternary expression by
changing '(boolean) ? 1: 0' into 'boolean'.
Cc: Sergey Gorenko <sergeygo@mellanox.com>
Cc: Max Gurtovoy <maxg@mellanox.com>
Cc: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Request permission to send immediate data during login. If the SRP
target grants this request, send the payload of write requests <= 8 KB
as immediate data.
Cc: Sergey Gorenko <sergeygo@mellanox.com>
Cc: Max Gurtovoy <maxg@mellanox.com>
Cc: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Move the maximum initiator to target information unit length parameter
from struct srp_target_port into struct srp_rdma_ch. This patch does
not change any functionality but makes the next patch easier to read.
Cc: Sergey Gorenko <sergeygo@mellanox.com>
Cc: Max Gurtovoy <maxg@mellanox.com>
Cc: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Since srp_rdma_ch.max_ti_iu_len is used in the hot path, move it to the
section with data structure members used in the hot path.
Cc: Sergey Gorenko <sergeygo@mellanox.com>
Cc: Max Gurtovoy <maxg@mellanox.com>
Cc: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
This patch avoids that the SCSI mid-layer keeps retrying forever if
ib_post_send() fails. This was discovered while testing immediate
data support and passing a too large num_sge value to ib_post_send().
Cc: Sergey Gorenko <sergeygo@mellanox.com>
Cc: Max Gurtovoy <maxg@mellanox.com>
Cc: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Reserve additional space for CDBs that contain more than sixteen bytes
and set the add_cdb_len field for such CDBs as required. From the SRP
standard: "The ADDITIONAL CDB LENGTH field contains the length in
dwords of the ADDITIONAL CDB field."
Cc: Sergey Gorenko <sergeygo@mellanox.com>
Cc: Max Gurtovoy <maxg@mellanox.com>
Cc: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
This patch avoids that a warning is reported when building with W=1.
Cc: Sergey Gorenko <sergeygo@mellanox.com>
Cc: Max Gurtovoy <maxg@mellanox.com>
Cc: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Add constants and data structures to support immediate data. These
changes conform to SRP2r04.
Cc: Sergey Gorenko <sergeygo@mellanox.com>
Cc: Max Gurtovoy <maxg@mellanox.com>
Cc: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
This patch moves all constants that come from the SRP standard into the
include/scsi/srp.h header file.
Cc: Sergey Gorenko <sergeygo@mellanox.com>
Cc: Max Gurtovoy <maxg@mellanox.com>
Cc: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
When CONFIG_INFINIBAND_ON_DEMAND_PAGING is not enabled, we were getting
build failures for defined but not used code. Fix that.
Fixes: 813e90b1ae ("IB/mlx5: Add advise_mr() support")
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Most SCSI drivers want to enable "clustering", that is merging of
segments so that they might span more than a single page. Remove the
ENABLE_CLUSTERING define, and require drivers to explicitly set
DISABLE_CLUSTERING to disable this feature.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Drivers should be using udata to determine if a method is invoked from
user space or kernel space. A pd does not necessarily say a different
objects is kernel or user.
Transforming the tests to use udata eliminates a large number of uobject
references from the drivers.
Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Having uobject pointer embedded in ib core objects is not aligned with a
future shared ib_x model. The resource tracker only does this to keep
track of user/kernel objects - track this directly instead.
Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The verb advise_mr() is used to give advice to the kernel about an address
range that belongs to a MR. Implement the verb and register it on the
device. The current implementation supports the only known advice to date,
prefetch.
Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Guy Levi <guyle@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Add new ioctl method for the MR object - ADVISE_MR.
This command can be used by users to give an advice or directions to the
kernel about an address range that belongs to memory regions.
A new ib_device callback, advise_mr(), is introduced here to suupport the
new command. This command takes the following arguments:
- pd: The protection domain to which all memory regions belong
- advice: The type of the advice
* IB_UVERBS_ADVISE_MR_ADVICE_PREFETCH - Pre-fetch a range of
an on-demand paging MR
* IB_UVERBS_ADVISE_MR_ADVICE_PREFETCH_WRITE - Pre-fetch a range
of an on-demand paging MR with write intention
- flags: The properties of the advice
* IB_UVERBS_ADVISE_MR_FLAG_FLUSH - Operation must end before
return to the caller
- sg_list: The list of memory ranges
- num_sge: The number of memory ranges in the list
- attrs: More attributes to be parsed by the provider
Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Guy Levi <guyle@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
When the parser of an ioctl command has the knowledge that a ptr attribute
in a bundle represents an array of structures, it is useful for it to know
the number of elements in the array. This is done by dividing the
attribute length with the element size.
Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Guy Levi <guyle@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Add an ioctl method to destroy the PD, MR, MW, AH, flow, RWQ indirection
table and XRCD objects by handle which doesn't require any output response
during destruction.
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Introduce a helper function gather_objects_handle() to copy object handles
under a spin lock.
Expose these objects handles via the uverbs ioctl interface.
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
The macro MLX4_IB_SQ_HEADROOM calculates the spare room needed to be
left. Use it instead of hard-coding the HW prefetch size.
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The function accidentally returns success on this error path.
Fixes: c7bcb13442 ("RDMA/hns: Add SRQ support for hip08 kernel mode")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
We accidentally return success on this error path.
Fixes: f931551baf ("IB/qib: Add new qib driver for QLogic PCIe InfiniBand adapters")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Currently a RoCE GID entry is removed from the hardware when all
references to the GID entry drop to zero. This is a change in behavior
from before the fixed patch. The GID entry should be removed from the
hardware when GID entry deletion is requested. This allows the driver
terminate ongoing traffic through the RoCE GID.
While a GID is deleted from the hardware, GID slot in the software GID
cache is not freed. GID slot is freed once all references of such GID are
dropped. This continue to ensure that such GID slot of hardware is not
allocated to new GID entry allocation request. It is allocated once all
references to GID entry drop.
This approach allows drivers to put a tombestone of some kind on the HW
GID index to block the traffic.
Fixes: b150c3862d ("IB/core: Introduce GID entry reference counts")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The initialization of the ib_device_ops was dropped by mistake when
rebasing the ib_device_ops series, this patch fixes that.
Fixes: 15644f57cb ("RDMA/i40iw: Initialize ib_device_ops struct")
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Handle atomic was left as unimplemented from 2013, remove the code till
this part will be developed.
Remove the dead code by simplifying SW completion logic which is supposed
to be the same for send and receive paths.
Fixes: e126ba97db ("mlx5: Add driver for Mellanox Connect-IB adapters")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Tested-by: Stephen Rothwell <sfr@canb.auug.org.au> # compile tested
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Now that the handlers do not process their own udata we can make a
sensible ioctl that wrappers them. The ioctl follows the same format as
the write_ex() and has the user explicitly specify the core and driver
in/out opaque structures and a command number.
This works for all forms of write commands.
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
With the introduction of SR-IOV LAG, checking whether LAG is active
is no longer good enough, since RoCE and SR-IOV LAG each entails
different behavior by both the core and infiniband drivers.
This patch introduces facilities to discern LAG type, in addition to
mlx5_lag_is_active(). These are implemented in such a way as to allow
more complex mode combinations in the future.
Signed-off-by: Aviv Heller <avivh@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
mlx5-next shared branch with rdma subtree to avoid mlx5 rdma v.s. netdev
conflicts.
Highlights:
1) Lag refactroing and flow counter affinity bits.
2) mlx5 core cleanups
By Roi Dayan (2) and others
* 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux:
net/mlx5: Fold the modify lag code into function
net/mlx5: Add lag affinity info to log
net/mlx5: Split the activate lag function into two routines
net/mlx5: E-Switch, Introduce flow counter affinity
IB/mlx5: Unify e-switch representors load approach between uplink and VFs
net/mlx5: Use lowercase 'X' for hex values
net/mlx5: Remove duplicated include from eswitch.c
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
When in switchdev mode and the add function is called by the core
level driver, make sure we only register the callbacks, but don't
create the mlx5 IB device or initialize anything. With this change
all the IB devices in switchdev mode are created only once the load
callback is invoked by the e-switch core sub-module. This follows
the design paradigm under which the all the Eth representors must
be loaded before any of IB reprs is loaded.
Signed-off-by: Mark Bloch <markb@mellanox.com>
Acked-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
When support for bonding of RoCE devices was added, there was
necessarily a link between the RoCE device and the paired netdevice that
was part of the bond. If you remove the mlx4_en module, that paired
association is broken (the RoCE device is still present but the paired
netdevice has been released). We need to account for this in
is_upper_ndev_bond_master_filter() and filter out those links with a
broken pairing or else we later oops in netdev_next_upper_dev_rcu().
Fixes: 408f1242d9 ("IB/core: Delete lower netdevice default GID entries in bonding scenario")
Signed-off-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Make all the required change to start use the ib_device_ops structure.
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Initialize ib_device_ops with the supported operations using
ib_set_device_ops() and remove the use of check_driver_override().
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Initialize ib_device_ops with the supported operations using
ib_set_device_ops().
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Initialize ib_device_ops with the supported operations using
ib_set_device_ops().
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Initialize ib_device_ops with the supported operations using
ib_set_device_ops().
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Initialize ib_device_ops with the supported operations using
ib_set_device_ops().
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Initialize ib_device_ops with the supported operations using
ib_set_device_ops().
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Initialize ib_device_ops with the supported operations using
ib_set_device_ops().
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Initialize ib_device_ops with the supported operations using
ib_set_device_ops().
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Initialize ib_device_ops with the supported operations using
ib_set_device_ops().
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Initialize ib_device_ops with the supported operations using
ib_set_device_ops().
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Initialize ib_device_ops with the supported operations using
ib_set_device_ops().
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Initialize ib_device_ops with the supported operations using
ib_set_device_ops().
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Initialize ib_device_ops with the supported operations using
ib_set_device_ops().
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Initialize ib_device_ops with the supported operations using
ib_set_device_ops().
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Initialize ib_device_ops with the supported operations using
ib_set_device_ops().
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Initialize ib_device_ops with the supported operations using
ib_set_device_ops().
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Initialize ib_device_ops with the supported operations using
ib_set_device_ops().
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
This change introduces the ib_device_ops structure that defines all the
InfiniBand device operations in one place, so the code will be more
readable and clean, unlike today when the ops are mixed with ib_device
data members.
The providers will need to define the supported operations and assign them
using ib_set_device_ops(), that will also make the providers code more
readable and clean.
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
NULL check for kfree is unnecessary, remove it.
Fixes: b42dde478b ("IB/mlx4: Rework special QP creation error path")
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
In opposite to current implementation of identification based on device
name, PCI-ID identification provides stable names suitable for device
rename. Change ocrdma debugfs representation to use PCI-ID.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Clear extra bytes in response in batch manner instead
of doing it per-byte.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
RDMA core layer already make sure port is valid, no need to check it here
again.
For the pkey validation this depends on commit b3ac5742fead ("RDMA/core:
Validate port number in query_pkey verb")
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
IB representors don't support creation of RAW ethernet QP flows. Disable
them by reusing existing RDMA/core support macros. We do it for both
creation and matcher because latter is not usable if no flow creation is
available.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Report to the user 2x width over MAD interface.
Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Report HDR speed when HDR is supported in CapabilityMask2 and the actual
speed is HDR.
Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
CapabilityMask2 exists when IB_PORT_CAP_MASK2_SUP is set in the original
capability mask. In such cases, query its value and report it in query
port.
Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Add the new rates that were added to Infiniband spec as part of HDR and 2x
support.
Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Array iterator stays at the same slot, fix it.
Fixes: 8700e3e7c4 ("Soft RoCE driver")
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
This patch implements a cmdq to enable the loopback of ssu module
according to the modified hardware desgin.
The ssu consists of ingress unit, packet buffer and programmable packet
process unit. if the loopback bit of ssu is not enabled, the roce packet
with loopback bit will fail.
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
This patch updates the implementation of the mailbox command interface by
using command queue instead of operating registers. With this update, the
software can be well decoupled with the hardware.
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
It will prevent multiply overflow when defines the pbl for u64 type.
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
This patch move the codes of qp state transition into the new function as
well as simplify the logic for other qp states transition.
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
It needs to clear qp context previous when init qp context. Otherwise,
the newly created qp context residue has the contents of the qp context
before the uninstall, and the qp context content is disordered, causing
the task to fail.
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
mlx5-next shared branch with rdma subtree to avoid mlx5 rdma v.s. netdev
conflicts.
Highlights:
1) RDMA ODP (On Demand Paging) improvements and moving ODP logic to
mlx5 RDMA driver
2) Improved mlx5 core driver and device events handling and provided API
for upper layers to subscribe to device events.
3) RDMA only code cleanup from mlx5 core
4) Add helper to get CQE opcode
5) Rework handling of port module events
6) shared mlx5_ifc.h updates to avoid conflicts
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
GRE RFC defines a 32 bit key field. NVGRE RFC splits the 32 bit
key field to 24 bit VSID (gre_key_h) and 8 bit flow entropy (gre_key_l).
Define the two key parsing alternatives in a union, thus enabling both
access methods.
Signed-off-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Eli Britstein <elibr@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Several conflicts, seemingly all over the place.
I used Stephen Rothwell's sample resolutions for many of these, if not
just to double check my own work, so definitely the credit largely
goes to him.
The NFP conflict consisted of a bug fix (moving operations
past the rhashtable operation) while chaning the initial
argument in the function call in the moved code.
The net/dsa/master.c conflict had to do with a bug fix intermixing of
making dsa_master_set_mtu() static with the fixing of the tagging
attribute location.
cls_flower had a conflict because the dup reject fix from Or
overlapped with the addition of port range classifiction.
__set_phy_supported()'s conflict was relatively easy to resolve
because Andrew fixed it in both trees, so it was just a matter
of taking the net-next copy. Or at least I think it was :-)
Joe Stringer's fix to the handling of netns id 0 in bpf_sk_lookup()
intermixed with changes on how the sdif and caller_net are calculated
in these code paths in net-next.
The remaining BPF conflicts were largely about the addition of the
__bpf_md_ptr stuff in 'net' overlapping with adjustments and additions
to the relevant data structure where the MD pointer macros are used.
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the new helper that extracts the opcode
from a CQE (completion queue entry) structure.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
Before calling the driver's function let's make sure port is valid.
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Danit Goldberg says:
Packet based credit mode
Packet based credit mode is an alternative end-to-end credit mode for QPs
set during their creation. Credits are transported from the responder to
the requester to optimize the use of its receive resources. In
packet-based credit mode, credits are issued on a per packet basis.
The advantage of this feature comes while sending large RDMA messages
through switches that are short in memory.
The first commit exposes QP creation flag and the HCA capability. The
second commit adds support for a new DV QP creation flag. The last commit
report packet based credit mode capability via the MLX5DV device
capabilities.
* branch 'mlx5-packet-credit-fc':
IB/mlx5: Report packet based credit mode device capability
IB/mlx5: Add packet based credit mode support
net/mlx5: Expose packet based credit mode
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Report packet based credit mode capability via the mlx5 DV interface.
Signed-off-by: Danit Goldberg <danitg@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The device can support two credit modes, message based (default) and
packet based. In order to enable packet based mode, the QP should be
created with special flag that indicates this.
This patch adds support for the new DV QP creation flag that can be used
for RC QPs in order to change the credit mode.
Signed-off-by: Danit Goldberg <danitg@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Utilize rdma_is_port_valid to validate the given port.
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Since the function always returns 0 make it void.
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Flow table can be passed as a DEVX object which is a valid destination in
an EGRESS flow. Fix the original code to allow that.
Fixes: a7ee18bdee ("RDMA/mlx5: Allow creating a matcher for a NIC TX flow table")
Signed-off-by: Alex Vesker <valex@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
This fixes a compilation warning in sysfs.c
drivers/infiniband/hw/mlx4/sysfs.c:360:2: warning: 'strncpy' output may be
truncated copying 8 bytes from a string of length 31
[-Wstringop-truncation]
By eliminating the temporary stack buffer.
Signed-off-by: Qian Cai <cai@gmx.us>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Commit 4e045572e2 ("IB/hfi1: Add unique txwait_lock for txreq events")
laid the ground work to support per resource waiting locking.
This patch adds that with a lock unique to each sdma engine and pio
sendcontext and makes necessary changes for verbs, PSM, and vnic to use
the new locks.
This is particularly beneficial for smaller messages that will exhaust
resources at a faster rate.
Fixes: 7724105686 ("IB/hfi1: add driver files")
Reviewed-by: Gary Leshner <Gary.S.Leshner@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>