Commit Graph

15 Commits

Author SHA1 Message Date
Jack Morgenstein 6634961c14 mlx4: Put physical GID and P_Key table sizes in mlx4_phys_caps struct and paravirtualize them
To allow easy paravirtualization of P_Key and GID table sizes, keep
paravirtualized sizes in mlx4_dev->caps, but save the actual physical
sizes from FW in struct: mlx4_dev->phys_cap.

In addition, in SR-IOV mode, do the following:

1. Reduce reported P_Key table size by 1.
   This is done to reserve the highest P_Key index for internal use,
   for declaring an invalid P_Key in P_Key paravirtualization.
   We require a P_Key index which always contain an invalid P_Key
   value for this purpose (i.e., one which cannot be modified by
   the subnet manager).  The way to do this is to reduce the
   P_Key table size reported to the subnet manager by 1, so that
   it will not attempt to access the P_Key at index #127.

2. Paravirtualize the GID table size to 1. Thus, each guest sees
   only a single GID (at its paravirtualized index 0).

In addition, since we are paravirtualizing the GID table size to 1, we
add paravirtualization of the master GID event here (i.e., we do not
do ib_dispatch_event() for the GUID change event on the master, since
its (only) GUID never changes).

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-07-11 11:52:23 -07:00
Jack Morgenstein 00f5ce99dc mlx4: Use port management change event instead of smp_snoop
The port management change event can replace smp_snoop.  If the
capability bit for this event is set in dev-caps, the event is used
(by the driver setting the PORT_MNG_CHG_EVENT bit in the async event
mask in the MAP_EQ fw command).  In this case, when the driver passes
incoming SMP PORT_INFO SET mads to the FW, the FW generates port
management change events to signal any changes to the driver.

If the FW generates these events, smp_snoop shouldn't be invoked in
ib_process_mad(), or duplicate events will occur (once from the
FW-generated event, and once from smp_snoop).

In the case where the FW does not generate port management change
events smp_snoop needs to be invoked to create these events.  The flow
in smp_snoop has been modified to make use of the same procedures as
in the fw-generated-event event case to generate the port management
events (LID change, Client-rereg, Pkey change, and/or GID change).

Port management change event handling required changing the
mlx4_ib_event and mlx4_dispatch_event prototypes; the "param" argument
(last argument) had to be changed to unsigned long in order to
accomodate passing the EQE pointer.

We also needed to move the definition of struct mlx4_eqe from
net/mlx4.h to file device.h -- to make it available to the IB driver,
to handle port management change events.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-07-10 09:47:10 -07:00
Jack Morgenstein b1d8eb5a21 IB/mlx4: Add debug prints
Define pr_fmt and add some pr_debug prints.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-07-08 18:05:06 -07:00
Jack Morgenstein a6f7feae6d IB/mlx4: pass SMP vendor-specific attribute MADs to firmware
In the current code, vendor-specific MADs (e.g with the FDR-10
attribute) are silently dropped by the driver, resulting in timeouts
at the sending side and inability to query/configure the relevant
feature.  However, the ConnectX firmware is able to handle such MADs.
For unsupported attributes, the firmware returns a GET_RESPONSE MAD
containing an error status.

For example, for a FDR-10 node with LID 11:

    # ibstat mlx4_0 1

    CA: 'mlx4_0'
    Port 1:
    State: Active
    Physical state: LinkUp
    Rate: 40 (FDR10)
    Base lid: 11
    LMC: 0
    SM lid: 24
    Capability mask: 0x02514868
    Port GUID: 0x0002c903002e65d1
    Link layer: InfiniBand

Extended Port Query (EPI) vendor mad timeouts before the patch:

    # smpquery MEPI 11 -d

    ibwarn: [4196] smp_query_via: attr 0xff90 mod 0x0 route Lid 11
    ibwarn: [4196] _do_madrpc: retry 1 (timeout 1000 ms)
    ibwarn: [4196] _do_madrpc: retry 2 (timeout 1000 ms)
    ibwarn: [4196] _do_madrpc: timeout after 3 retries, 3000 ms
    ibwarn: [4196] mad_rpc: _do_madrpc failed; dport (Lid 11)
    smpquery: iberror: [pid 4196] main: failed: operation EPI: ext port info query failed

EPI query works OK with the patch:

    # smpquery MEPI 11 -d

    ibwarn: [6548] smp_query_via: attr 0xff90 mod 0x0 route Lid 11
    ibwarn: [6548] mad_rpc: data offs 64 sz 64
    mad data
    0000 0000 0000 0001 0000 0001 0000 0001
    0000 0000 0000 0000 0000 0000 0000 0000
    0000 0000 0000 0000 0000 0000 0000 0000
    0000 0000 0000 0000 0000 0000 0000 0000
    # Ext Port info: Lid 11 port 0
    StateChangeEnable:...............0x00
    LinkSpeedSupported:..............0x01
    LinkSpeedEnabled:................0x01
    LinkSpeedActive:.................0x01

Signed-off-by: Jack Morgenstein <jackm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Acked-by: Ira Weiny <weiny2@llnl.gov>
Cc: <stable@vger.kernel.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-01-30 16:15:17 -08:00
Jack Morgenstein f9baff509f mlx4_core: Add "native" argument to mlx4_cmd and its callers (where needed)
For SRIOV, some Hypervisor commands can be executed directly (native = 1).
Others should go through the command wrapper flow (for tracking resource
usage, for example, or for changing some HCA configurations that slaves
need to be notified of).

This patch sets the groundwork for this capability -- adding the correct
value of "native" in each case.

Note that if SRIOV is not activated, this parameter has no effect.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-13 13:56:05 -05:00
Or Gerlitz c37791349c IB/mlx4: Support PMA counters for IBoE
Use the per port counter attached to all QPs created on that port to
implement port level packets/bytes performance counters a la IB.
Derived from a patch by Eli Cohen <eli@mellanox.co.il>

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.co.il>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-07-18 21:04:36 -07:00
Dan Carpenter 1397490938 IB/mlx4: Handle -ENOMEM in forward_trap()
ib_create_send_mad() can return ERR_PTR(-ENOMEM) here.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2011-01-10 17:42:06 -08:00
Eli Cohen fa417f7b52 IB/mlx4: Add support for IBoE
Add support for IBoE to mlx4_ib.  The bulk of the code is handling the
new address vector fields; mlx4 needs the MAC address of a remote node
to include it in a WQE (for datagrams) or in the QP context (for
connected QPs).  Address resolution is done by assuming all unicast
GIDs are either link-local IPv6 addresses.

Multicast group attach/detach needs to update the NIC's multicast
filters; but since attaching a QP to a multicast group can be done
before the QP is bound to a port, for IBoE we need to keep track of
all multicast groups that a QP is attached too before it transitions
from INIT to RTR (since it does not have a port in the INIT state).

Signed-off-by: Eli Cohen <eli@mellanox.co.il>

[ Many things cleaned up and otherwise monkeyed with; hope I didn't
  introduce too many bugs.  - Roland ]

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-10-25 10:20:39 -07:00
Tejun Heo 5a0e3ad6af include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability.  As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and try to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   wildly available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-30 22:02:32 +09:00
Moni Shoua f0f6f346a1 IB/mlx4: Fix dispatch of IB_EVENT_LID_CHANGE event
When snooping a PortInfo MAD, its client_reregister bit is checked.
If the bit is ON then a CLIENT_REREGISTER event is dispatched,
otherwise a LID_CHANGE event is dispatched.  This way of decision
ignores the cases where the MAD changes the LID along with an
instruction to reregister (so a necessary LID_CHANGE event won't be
dispatched) or the MAD is neither of these (and an unnecessary
LID_CHANGE event will be dispatched).

This causes problems at least with IPoIB, which will do a "light"
flush on reregister, rather than the "heavy" flush required due to a
LID change.

Fix this by dispatching a CLIENT_REREGISTER event if the
client_reregister bit is set, but also compare the LID in the MAD to
the current LID.  If and only if they are not identical then a
LID_CHANGE event is dispatched.

Signed-off-by: Moni Shoua <monis@voltaire.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Yossi Etigin <yosefe@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-01-28 14:54:35 -08:00
Yevgeny Petrilin 7ff93f8b7e mlx4_core: Multiple port type support
Multi-protocol adapters support different port types.  Each consumer
of mlx4_core queries for supported port types; in particular mlx4_ib
can no longer assume that all physical ports belong to it.  Port type
is configured through a sysfs interface.  When the type of a port is
changed, all mlx4 interfaces are unregistered, and then registered
again with the new port types.

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-10-22 15:38:42 -07:00
Eli Cohen 6578cf3398 IB/mlx4: Pass congestion management class MADs to the HCA
ConnectX HCAs support the IB_MGMT_CLASS_CONG_MGMT management class, so
process MADs of this class through the MAD_IFC firmware command.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-14 23:48:45 -07:00
Roland Dreier 5d5e815db9 IB/mlx4: Convert "if(foo)" to "if (foo)"
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16 21:01:04 -07:00
Ilpo Järvinen fe11cb6ba4 IB/mlx4: Incorrect semicolon after if statement
A stray semicolon makes us inadvertently ignore the value of err.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-08-15 20:24:06 -07:00
Roland Dreier 225c7b1fee IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters
Add an InfiniBand driver for Mellanox ConnectX adapters.  Because
these adapters can also be used as ethernet NICs and Fibre Channel 
HBAs, the driver is split into two modules: 
 
  mlx4_core: Handles low-level things like device initialization and 
    processing firmware commands.  Also controls resource allocation 
    so that the InfiniBand, ethernet and FC functions can share a 
    device without stepping on each other. 
 
  mlx4_ib: Handles InfiniBand-specific things; plugs into the 
    InfiniBand midlayer. 

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-08 18:00:38 -07:00