Commit Graph

740548 Commits

Author SHA1 Message Date
Wei Yongjun c76f2481b6 tipc: fix error handling in tipc_udp_enable()
Release alloced resource before return from the error handling
case in tipc_udp_enable(), otherwise will cause memory leak.

Fixes: 52dfae5c85 ("tipc: obtain node identity from interface by default")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27 10:50:02 -04:00
Wei Yongjun 6a91ded32d net: aquantia: Make function hw_atl_utils_mpi_set_speed() static
Fixes the following sparse warning:

drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_utils.c:508:5: warning:
 symbol 'hw_atl_utils_mpi_set_speed' was not declared. Should it be static?

Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27 10:49:36 -04:00
David S. Miller 776d7c5f69 Merge branch 'net-mvpp2-Remove-unnecessary-dynamic-allocs'
Maxime Chevallier says:

====================
net: mvpp2: Remove unnecessary dynamic allocs

Some utility functions in mvpp2 make use of dynamic alloc to exchange temporary
objects representing Parser Entries (which are generic filtering entries in the
PPv2 controller).

These objects are small (44 bytes each), we can use the stack to exchange them.

Some previous discussion on this topic showed that the mvpp2_prs_hw_read, which
initializes a struct mvpp2_prs_entry based on one of its fields, can easily lead
to erroneous code if we don't zero-out the struct beforehand :

https://lkml.org/lkml/2018/3/21/739

To fix this, I propose to rename mvpp2_prs_hw_read into mvpp2_prs_init_from_hw,
make it zero-out the struct and take the index as a parameter. That's what's
done in the first patch of the series.

The second patch is the V3 of
("net: mvpp2: Don't use dynamic allocs for local variables"), making use of
mvpp2_prs_init_from_hw and taking previous comments into account.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27 10:47:23 -04:00
Maxime Chevallier 0c6d9b4414 net: mvpp2: Don't use dynamic allocs for local variables
Some helper functions that search for given entries in the TCAM filter
on PPv2 controller make use of dynamically alloced temporary variables,
allocated with GFP_KERNEL. These functions can be called in atomic
context, and dynamic alloc is not really needed in these cases anyways.

This commit gets rid of dynamic allocs and use stack allocation in the
following functions, and where they're used :
 - mvpp2_prs_flow_find
 - mvpp2_prs_vlan_find
 - mvpp2_prs_double_vlan_find
 - mvpp2_prs_mac_da_range_find

For all these functions, instead of returning an temporary object
representing the TCAM entry, we simply return the TCAM id that matches
the requested entry.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27 10:47:23 -04:00
Maxime Chevallier 47e0e14eb1 net: mvpp2: Make mvpp2_prs_hw_read a parser entry init function
The mvpp2_prs_hw_read function uses the 'index' field of the struct
mvpp2_prs_entry to initialize the rest of the fields. This makes it
unclear from a caller's perspective, who needs to manipulate a struct
that is not entirely initialized.

This commit makes it an init function for prs_entry, by passing it the
index as a parameter. The function now zeroes the entry, and sets the
index field before doing all other init from HW.

The function is renamed 'mvpp2_prs_init_from_hw' to make that clear.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27 10:47:23 -04:00
Colin Ian King 8daf1a2d7e net/ncsi: check for null return from call to nla_nest_start
The call to nla_nest_start calls nla_put which can lead to a NULL
return so it's possible for attr to become NULL and we can potentially
get a NULL pointer dereference on attr.  Fix this by checking for
a NULL return.

Detected by CoverityScan, CID#1466125 ("Dereference null return")

Fixes: 955dc68cb9 ("net/ncsi: Add generic netlink family")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27 10:38:26 -04:00
Xin Long 5306653850 sctp: remove unnecessary asoc in sctp_has_association
After Commit dae399d7fd ("sctp: hold transport instead of assoc
when lookup assoc in rx path"), it put transport instead of asoc
in sctp_has_association. Variable 'asoc' is not used any more.

So this patch is to remove it, while at it,  it also changes the
return type of sctp_has_association to bool, and does the same
for it's caller sctp_endpoint_is_peeled_off.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27 10:22:11 -04:00
David S. Miller 13d5a30a1e Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
40GbE Intel Wired LAN Driver Updates 2018-03-26

This series contains updates to i40e only.

Jake provides several patches which remove the need for cmpxchg64(),
starting with moving I40E_FLAG_[UDP]_FILTER_SYNC from pf->flags to pf->state
since they are modified during run time possibly when the RTNL lock is not
held so they should be a state bits and not flags.  Moved additional
"flags" which should be state fields, into pf->state.  Ensure we hold
the RTNL lock for the entire sequence of preparing for reset and when
resuming, which will protect the flags related to interrupt scheme under
RTNL lock so that their modification is properly threaded.  Finally,
cleanup the use of cmpxchg64() since it is no longer needed.  Cleaned up
the holes in the feature flags created my moving some flags to the state
field.

Björn Töpel adds XDP_REDIRECT support as well as tweaking the page
counting for XDP_REDIRECT so that it will function properly.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27 09:56:13 -04:00
David S. Miller 34fd03b9e6 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
100GbE Intel Wired LAN Driver Updates 2018-03-26

This patch series adds the ice driver, which will support the Intel(R)
E800 Series of network devices.

This is the first phase in the release of this driver where we implement
basic transmit and receive. The idea behind the multi-phase release is to
aid in code review as well as testing. Subsequent phases will implement
advanced features (like SR-IOV, tunnelling, flow director, QoS, etc.) that
build upon the previous phase(s). Each phase will be submitted as a patch
series.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-26 18:20:02 -04:00
Björn Töpel d9314c474d i40e: add support for XDP_REDIRECT
The driver now acts upon the XDP_REDIRECT return action. Two new ndos
are implemented, ndo_xdp_xmit and ndo_xdp_flush.

XDP_REDIRECT action enables XDP program to redirect frames to other
netdevs.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-03-26 14:17:10 -07:00
Björn Töpel 8ce29c679a i40e: tweak page counting for XDP_REDIRECT
This commit tweaks the page counting for XDP_REDIRECT to function
properly. XDP_REDIRECT support will be added in a future commit.

The current page counting scheme assumes that the reference count
cannot decrease until the received frame is sent to the upper layers
of the networking stack. This assumption does not hold for the
XDP_REDIRECT action, since a page (pointed out by xdp_buff) can have
its reference count decreased via the xdp_do_redirect call.

To work around that, we now start off by a large page count and then
don't allow a refcount less than two.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-03-26 14:10:04 -07:00
Jacob Keller 8f769dd14a i40e: re-number feature flags to remove gaps
Remove the gaps created by the recent refactor of various feature flags
that have moved to the state field. Use only a u32 now that we have
fewer than 32 flags in the field.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-03-26 14:00:12 -07:00
Jacob Keller 886ff146a7 i40e: stop using cmpxchg flow in i40e_set_priv_flags()
Now that the only places which modify flags are either (a) during
initialization prior to creating a netdevice, or (b) while holding the
rtnl lock, we no longer need the cmpxchg64 call in i40e_set_priv_flags.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-03-26 13:56:45 -07:00
Jacob Keller f0ee70a042 i40e: hold the RTNL lock while changing interrupt schemes
When we suspend and resume, we need to clear and re-enable the interrupt
scheme. This was previously not done while holding the RTNL lock, which
could be problematic, because we are actually destroying and re-creating
queues.

Hold the RTNL lock for the entire sequence of preparing for reset, and
when resuming. This additionally protects the flags related to interrupt
scheme under RTNL lock so that their modification is properly threaded.

This is part of a larger effort to remove the need for cmpxchg64 in
i40e_set_priv_flags().

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-03-26 13:52:53 -07:00
Jacob Keller 5f76a704b8 i40e: move client flags into state bits
The iWarp client flags are all potentially changed when the RTNL lock is
not held, so they should not be part of the pf->flags variable. Instead,
move them into the state field so that we can use atomic bit operations.

This is part of a larger effort to remove cmpxchg64 in
i40e_set_priv_flags()

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-03-26 13:50:56 -07:00
Jacob Keller 0605c45ce5 i40e: move I40E_FLAG_TEMP_LINK_POLLING to state field
This flag is modified outside of the RTNL lock and thus should not be
part of the pf->flags variable.

Use a state bit instead, so that we can use atomic bit operations.

This is part of a larger effort to remove cmpxchg64 in
i40e_set_priv_flags()

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-03-26 13:48:10 -07:00
Aviv Heller 71186172b7 net/mlx5e: Add VLAN offload features to hw_enc_features
We support outer VLAN offload in driver and HW regardless of whether
an encapsulation is present in the next headers.

Exposing this in hw_enc_features will allow us to offload outer VLANs
in cases where encapsulation protocols like VXLAN and IPsec are used.

Signed-off-by: Aviv Heller <avivh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-26 13:47:16 -07:00
Gal Pressman be0f780bc5 net/mlx5e: Add a helper macro in set features ndo
Add a new macro to prevent copy-pasting the same code for each new
feature.

Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-26 13:47:16 -07:00
Gal Pressman 707129dcea net/mlx5e: Make choose LRO timeout function static
The function is used in en_main.c only, we can make it static and remove
its declaration from en.h

Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-26 13:47:15 -07:00
Gal Pressman acac5ec0c2 net/mlx5e: Remove redundant check in get ethtool stats
ethtool core code makes sure data isn't NULL before calling
get_ethtool_stats, testing it again in the driver is redundant.

Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-26 13:47:15 -07:00
Leon Romanovsky 957f6ba8ad net/mlx5: Protect from command bit overflow
The system with CONFIG_UBSAN enabled on produces the following error
during driver initialization. The reason to it that max_reg_cmds can be
larger enough to cause to "1 << max_reg_cmds" overflow the unsigned long.

================================================================================
UBSAN: Undefined behaviour in drivers/net/ethernet/mellanox/mlx5/core/cmd.c:1805:42
signed integer overflow:
-2147483648 - 1 cannot be represented in type 'int'
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.15.0-rc2-00032-g06cda2358d9b-dirty #724
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
Call Trace:
 dump_stack+0xe9/0x18f
 ? dma_virt_alloc+0x81/0x81
 ubsan_epilogue+0xe/0x4e
 handle_overflow+0x187/0x20c
 mlx5_cmd_init+0x73a/0x12b0
 mlx5_load_one+0x1c3d/0x1d30
 init_one+0xd02/0xf10
 pci_device_probe+0x26c/0x3b0
 driver_probe_device+0x622/0xb40
 __driver_attach+0x175/0x1b0
 bus_for_each_dev+0xef/0x190
 bus_add_driver+0x2db/0x490
 driver_register+0x16b/0x1e0
 __pci_register_driver+0x177/0x1b0
 init+0x6d/0x92
 do_one_initcall+0x15b/0x270
 kernel_init_freeable+0x2d8/0x3d0
 kernel_init+0x14/0x190
 ret_from_fork+0x24/0x30
================================================================================

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-26 13:47:14 -07:00
Or Gerlitz 6acfbf38a9 net/mlx5e: Offload tc vlan push/pop using HW action
Currently, we are emulating the offload of vlan push/pop actions using
global setup as done by commit f5f8247609 ("net/mlx5: E-Switch, Support
VLAN actions in the offloads mode"). With newer NICs, we can apply a flow
action for that matter, do that while keeping the emulated path for the
older HW brands.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-26 13:47:13 -07:00
Or Gerlitz 0c06897a9a net/mlx5: Add core support for vlan push/pop steering action
Newer NICs (ConnectX-5 and onward) can apply vlan pop or push as an
action taking place during flow steering. Add the core bits for that.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-26 13:47:13 -07:00
Or Gerlitz aa24670ef6 net/mlx5: E-Switch, Use same source for offloaded actions check
Align the checks for modify header and encap actions with the
rest of the code.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-26 13:47:12 -07:00
Moshe Shemesh 7cbaf9a3ea net/mlx5e: Add interface down dropped packets statistics
Added the following packets drop counter:
Rx interface down dropped packets - counts packets which were received
while the ETH interface was down.
This counter will be shown on ethtool as a new counter called
rx_if_down_packets.

The implementation allocates a q_counter for drop rq which gets all the
received traffic while the interface is down.

Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-26 13:47:12 -07:00
Moshe Shemesh aaabd0783b net/mlx5: Add packet dropped while vport down statistics
Added the following packets dropped while vport down statistics:

Rx dropped while vport down - counts packets which were steered by
e-switch to a vport, but dropped since the vport was down. This counter
will be shown on ip link tool as part of the vport rx_dropped counter.

Tx dropped while vport down - counts packets which were transmitted by
a vport, but dropped due to vport logical link down. This counter
will be shown on ip link tool as part of the vport tx_dropped counter.

The counters are read from FW by command QUERY_VNIC_ENV.

Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-26 13:47:11 -07:00
Moshe Shemesh 5c298143be net/mlx5e: Add vnic steering drop statistics
Added the following packets drop counter:
Rx steering missed dropped packets - counts packets which were dropped
due to miss on NIC rx steering rules.
This counter will be shown on ethtool as a new counter called
rx_steer_missed_packets.

Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-26 13:47:10 -07:00
Moshe Shemesh 61c5b5c917 net/mlx5: Add support for QUERY_VNIC_ENV command
Add support for new FW command QUERY_VNIC_ENV.
The command is used by the driver to query vnic diagnostic statistics
from FW.

Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-26 13:47:10 -07:00
Inbar Karmy 2afa609f5c net/mlx5e: PFC stall prevention support
Implement set/get functions to configure PFC stall prevention
timeout by tunables api through ethtool.
By default the stall prevention timeout is configured to 8 sec.
Timeout range is: 80-8000 msec.

Enabling stall prevention with the auto timeout will set
the timeout to 100 msec.

Signed-off-by: Inbar Karmy <inbark@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-26 13:46:46 -07:00
Inbar Karmy e1577c1c88 ethtool: Add support for configuring PFC stall prevention in ethtool
In the event where the device unexpectedly becomes unresponsive
for a long period of time, flow control mechanism may propagate
pause frames which will cause congestion spreading to the entire
network.
To prevent this scenario, when the device is stalled for a period
longer than a pre-configured timeout, flow control mechanisms are
automatically disabled.

This patch adds support for the ETHTOOL_PFC_STALL_PREVENTION
as a tunable.
This API provides support for configuring flow control storm prevention
timeout (msec).

Signed-off-by: Inbar Karmy <inbark@mellanox.com>
Cc: Michal Kubecek <mkubecek@suse.cz>
Cc: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-26 13:46:46 -07:00
Jacob Keller 134201aead i40e: move AUTO_DISABLED flags into the state field
The two Flow Directory auto disable flags are used at run time to mark
when the flow director features needed to be disabled. Thus the flags
could change even when the RTNL lock is not held.

They also have some code constructions which really should be
test_and_set or test_and_clear using atomic bit operations.

Create new state fields to mark this, and stop including them in
pf->flags.

This is part of a larger effort to remove the need for cmpxchg64 in
i40e_set_priv_flags().

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-03-26 13:46:30 -07:00
Jacob Keller 41898c66ef i40e: move I40E_FLAG_UDP_FILTER_SYNC to the state field
This flag is modified during run time, possibly even when the RTNL lock
is not held. Additionally it has a few places which should be using
test_and_set or test_and_clear atomic bit operations.

Create a new state bit __I40E_UDP_SYNC_PENDING and use it instead of the
ole I40E_FLAG_UDP_FILTER_SYNC flag.

This is part of a larger effort to remove the need for using cmpxchg64
in i40e_set_priv_flags.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-03-26 13:44:02 -07:00
Inbar Karmy 2fcb12df7d net/mlx5e: Expose PFC stall prevention counters
Add the needed capability bit and counters to device spec description.
Expose the following two counters in ethtool:

tx_pause_storm_warning_events: when the device is stalled for a period
longer than a pre-configured watermark, the counter increase, allowing
the debug utility an insight into current device status.

tx_pause_storm_error_events: when the device is stalled for a period
longer than a pre-configured timeout, the pause transmission is disabled,
and the counter increase.

Signed-off-by: Inbar Karmy <inbark@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-26 13:42:19 -07:00
Jacob Keller bfe040c385 i40e: move I40E_FLAG_FILTER_SYNC to a state bit
The I40E_FLAG_FILTER_SYNC flag is modified during run time possibly when
the RTNL lock is not held. Thus, it should not be part of pf->flags, and
instead should be using atomic bit operations in the pf->state field.

Create a __I40E_MACVLAN_SYNC_PENDING state bit, and use it instead of
the old I40E_FLAG_FILTER_SYNC flag.

This is part of a larger effort to remove the need for cmpxchg64 in
i40e_set_priv_flags().

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-03-26 13:09:44 -07:00
Anirudh Venkataramanan e94d447866 ice: Implement filter sync, NDO operations and bump version
This patch implements multiple pieces of functionality:

1. Added ice_vsi_sync_filters, which is called through the service task
   to push filter updates to the hardware.

2. Add support to enable/disable promiscuous mode on an interface.
   Enabling/disabling promiscuous mode on an interface results in
   addition/removal of a promisc filter rule through ice_vsi_sync_filters.

3. Implement handlers for ndo_set_mac_address, ndo_change_mtu,
   ndo_poll_controller and ndo_set_rx_mode.

This patch also marks the end of the driver addition by bumping up the
driver version.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-03-26 12:41:38 -07:00
Anirudh Venkataramanan 0b28b702e7 ice: Support link events, reset and rebuild
Link events are posted to a PF's admin receive queue (ARQ). This patch
adds the ability to detect and process link events.

This patch also adds the ability to process resets.

The driver can process the following resets:
    1) EMP Reset (EMPR)
    2) Global Reset (GLOBR)
    3) Core Reset (CORER)
    4) Physical Function Reset (PFR)

EMPR is the largest level of reset that the driver can handle. An EMPR
resets the manageability block and also the data path, including PHY and
link for all the PFs. The affected PFs are notified of this event through
a miscellaneous interrupt.

GLOBR is a subset of EMPR. It does everything EMPR does except that it
doesn't reset the manageability block.

CORER is a subset of GLOBR. It does everything GLOBR does but doesn't
reset PHY and link.

PFR is a subset of CORER and affects only the given physical function.
In other words, PFR can be thought of as a CORER for a single PF. Since
only the issuing PF is affected, a PFR doesn't result in the miscellaneous
interrupt being triggered.

All the resets have the following in common:
1) Tx/Rx is halted and all queues are stopped.
2) All the VSIs and filters programmed for the PF are lost and have to be
   reprogrammed.
3) Control queue interfaces are reset and have to be reprogrammed.

In the rebuild flow, control queues are reinitialized, VSIs are reallocated
and filters are restored.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-03-26 12:32:17 -07:00
Anirudh Venkataramanan 5513b920a4 ice: Update Tx scheduler tree for VSI multi-Tx queue support
This patch adds the ability for a VSI to use multiple Tx queues. More
specifically, the patch
    1) Provides the ability to update the Tx scheduler tree in the
       firmware. The driver can configure the Tx scheduler tree by
       adding/removing multiple Tx queues per TC per VSI.

    2) Allows a VSI to reconfigure its Tx queues during runtime.

    3) Synchronizes the Tx scheduler update operations using locks.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-03-26 12:21:42 -07:00
Anirudh Venkataramanan fcea6f3da5 ice: Add stats and ethtool support
This patch implements a watchdog task to get packet statistics from
the device.

This patch also adds support for the following ethtool operations:

ethtool devname
ethtool -s devname [msglvl N] [msglevel type on|off]
ethtool -g|--show-ring devname
ethtool -G|--set-ring devname [rx N] [tx N]
ethtool -i|--driver devname
ethtool -d|--register-dump devname [raw on|off] [hex on|off] [file name]
ethtool -k|--show-features|--show-offload devname
ethtool -K|--features|--offload devname feature on|off
ethtool -P|--show-permaddr devname
ethtool -S|--statistics devname
ethtool -a|--show-pause devname
ethtool -A|--pause devname [autoneg on|off] [rx on|off] [tx on|off]
ethtool -r|--negotiate devname

CC: Andrew Lunn <andrew@lunn.ch>
CC: Jakub Kicinski <kubakici@wp.pl>
CC: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-03-26 12:11:19 -07:00
Anirudh Venkataramanan d76a60ba7a ice: Add support for VLANs and offloads
This patch adds support for VLANs. When a VLAN is created a switch filter
is added to direct the VLAN traffic to the corresponding VSI. When a VLAN
is deleted, the filter is deleted as well.

This patch also adds support for the following hardware offloads.
    1) VLAN tag insertion/stripping
    2) Receive Side Scaling (RSS)
    3) Tx checksum and TCP segmentation
    4) Rx checksum

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-03-26 11:54:49 -07:00
Anirudh Venkataramanan 2b245cb294 ice: Implement transmit and NAPI support
This patch implements ice_start_xmit (the handler for ndo_start_xmit) and
related functions. ice_start_xmit ultimately calls ice_tx_map, where the
Tx descriptor is built and posted to the hardware by bumping the ring tail.

This patch also implements ice_napi_poll, which is invoked when there's an
interrupt on the VSI's queues. The interrupt can be due to either a
completed Tx or an Rx event. In case of a completed Tx/Rx event, resources
are reclaimed. Additionally, in case of an Rx event, the skb is fetched
and passed up to the network stack.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-03-26 11:27:05 -07:00
Anirudh Venkataramanan cdedef59de ice: Configure VSIs for Tx/Rx
This patch configures the VSIs to be able to send and receive
packets by doing the following:

1) Initialize flexible parser to extract and include certain
   fields in the Rx descriptor.

2) Add Tx queues by programming the Tx queue context (implemented in
   ice_vsi_cfg_txqs). Note that adding the queues also enables (starts)
   the queues.

3) Add Rx queues by programming Rx queue context (implemented in
   ice_vsi_cfg_rxqs). Note that this only adds queues but doesn't start
   them. The rings will be started by calling ice_vsi_start_rx_rings on
   interface up.

4) Configure interrupts for VSI queues.

5) Implement ice_open and ice_stop.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-03-26 11:18:36 -07:00
Anirudh Venkataramanan 9daf8208dd ice: Add support for switch filter programming
A VSI needs traffic directed towards it. This is done by programming
filter rules on the switch (embedded vSwitch) element in the hardware,
which connects the VSI to the ingress/egress port.

This patch introduces data structures and functions necessary to add
remove or update switch rules on the switch element. This is a pretty low
level function that is generic enough to add a whole range of filters.

This patch also introduces two top level functions ice_add_mac and
ice_remove mac which through a series of intermediate helper functions
eventually call ice_aq_sw_rules to add/delete simple MAC based filters.
It's worth noting that one invocation of ice_add_mac/ice_remove_mac
is capable of adding/deleting multiple MAC filters.

Also worth noting is the fact that the driver maintains a list of currently
active filters, so every filter addition/removal causes an update to this
list. This is done for a couple of reasons:

1) If two VSIs try to add the same filters, we need to detect it and do
   things a little differently (i.e. use VSI lists, described below) as
   the same filter can't be added more than once.

2) In the event of a hardware reset we can simply walk through this list
   and restore the filters.

VSI Lists:
In a multi-VSI situation, it's possible that multiple VSIs want to add the
same filter rule. For example, two VSIs that want to receive broadcast
traffic would both add a filter for destination MAC ff:ff:ff:ff:ff:ff.
This can become cumbersome to maintain and so this is handled using a
VSI list.

A VSI list is resource that can be allocated in the hardware using the
ice_aq_alloc_free_res admin queue command. Simply put, a VSI list can
be thought of as a subscription list containing a set of VSIs to which
the packet should be forwarded, should the filter match.

For example, if VSI-0 has already added a broadcast filter, and VSI-1
wants to do the same thing, the filter creation flow will detect this,
allocate a VSI list and update the switch rule so that broadcast traffic
will now be forwarded to the VSI list which contains VSI-0 and VSI-1.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-03-26 11:00:08 -07:00
Anirudh Venkataramanan 3a858ba392 ice: Add support for VSI allocation and deallocation
This patch introduces data structures and functions to alloc/free
VSIs. The driver represents a VSI using the ice_vsi structure.

Some noteworthy points about VSI allocation:

1) A VSI is allocated in the firmware using the "add VSI" admin queue
   command (implemented as ice_aq_add_vsi). The firmware returns an
   identifier for the allocated VSI. The VSI context is used to program
   certain aspects (loopback, queue map, etc.) of the VSI's configuration.

2) A VSI is deleted using the "free VSI" admin queue command (implemented
   as ice_aq_free_vsi).

3) The driver represents a VSI using struct ice_vsi. This is allocated
   and initialized as part of the ice_vsi_alloc flow, and deallocated
   as part of the ice_vsi_delete flow.

4) Once the VSI is created, a netdev is allocated and associated with it.
   The VSI's ring and vector related data structures are also allocated
   and initialized.

5) A VSI's queues can either be contiguous or scattered. To do this, the
   driver maintains a bitmap (vsi->avail_txqs) which is kept in sync with
   the firmware's VSI queue allocation imap. If the VSI can't get a
   contiguous queue allocation, it will fallback to scatter. This is
   implemented in ice_vsi_get_qs which is called as part of the VSI setup
   flow. In the release flow, the VSI's queues are released and the bitmap
   is updated to reflect this by ice_vsi_put_qs.

CC: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-03-26 10:44:27 -07:00
Anirudh Venkataramanan 940b61af02 ice: Initialize PF and setup miscellaneous interrupt
This patch continues the initialization flow as follows:

1) Allocate and initialize necessary fields (like vsi, num_alloc_vsi,
   irq_tracker, etc) in the ice_pf instance.

2) Setup the miscellaneous interrupt handler. This also known as the
   "other interrupt causes" (OIC) handler and is used to handle non
   hotpath interrupts (like control queue events, link events,
   exceptions, etc.

3) Implement a background task to process admin queue receive (ARQ)
   events received by the driver.

CC: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-03-26 10:34:49 -07:00
Anirudh Venkataramanan dc49c77236 ice: Get MAC/PHY/link info and scheduler topology
This patch adds code to continue the initialization flow as follows:

1) Get PHY/link information and store it
2) Get default scheduler tree topology and store it
3) Get the MAC address associated with the port and store it

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-03-26 10:24:54 -07:00
Anirudh Venkataramanan 9c20346b63 ice: Get switch config, scheduler config and device capabilities
This patch adds to the initialization flow by getting switch
configuration, scheduler configuration and device capabilities.

Switch configuration:
On boot, an L2 switch element is created in the firmware per physical
function. Each physical function is also mapped to a port, to which its
switch element is connected. In other words, this switch can be visualized
as an embedded vSwitch that can connect a physical function's virtual
station interfaces (VSIs) to the egress/ingress port. Egress/ingress
filters will be eventually created and applied on this switch element.
As part of the initialization flow, the driver gets configuration data
from this switch element and stores it.

Scheduler configuration:
The Tx scheduler is a subsystem responsible for setting and enforcing QoS.
As part of the initialization flow, the driver queries and stores the
default scheduler configuration for the given physical function.

Device capabilities:
As part of initialization, the driver has to determine what the device is
capable of (ex. max queues, VSIs, etc). This information is obtained from
the firmware and stored by the driver.

CC: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-03-26 10:14:57 -07:00
David S. Miller 336f2c038d Merge branch 'mlxsw-Offload-IPv6-multicast-routes'
Ido Schimmel says:

====================
mlxsw: Offload IPv6 multicast routes

Yuval says:

The series is intended to allow offloading IPv6 multicast routes
and is split into two parts:

  - First half of the patches continue extending ip6mr [& refactor ipmr]
    with missing bits necessary for the offloading - fib-notifications,
    mfc refcounting and default rule identification.

  - Second half of the patches extend functionality inside mlxsw,
    beginning with extending lower-parts to support IPv6 mroutes
    to host and later extending the router/mr internal APIs within
    the driver to accommodate support in ipv6 configurations.
    Lastly it adds support in the RTNL_FAMILY_IP6MR notifications,
    allowing driver to react and offload related routes.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-26 13:14:45 -04:00
Yuval Mintz 6a170d326f mlxsw: spectrum: Add multicast router trap for PIMv6
Add a new trap for PIMv6 packets. As PIM already has a designated trap
group [ & rate limiter], simply use the same for PIMv6 as well.

Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-26 13:14:45 -04:00
Yuval Mintz 64ed1b9e8f mlxsw: spectrum_router: Process IP6MR fib notification
Following previous patches driver is ready to handle notifications
arriving from ip6mr - start processing those when they arrive following
the same manner ipmr currently goes through.

This should enable driver to start offloading ipv6 multicast routes.

Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-26 13:14:44 -04:00
Yuval Mintz 6981e104a8 mlxsw: spectrum_mr: Add ipv6 specific operations
Populate the various operation structures meant for IPv6 with logic
unique to that protocol suite.

Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-26 13:14:44 -04:00