Commit Graph

726218 Commits

Author SHA1 Message Date
Corinna Vinschen 177132df5e igb: Allow to remove administratively set MAC on VFs
Before libvirt modifies the MAC address and vlan tag for an SRIOV VF
for use by a virtual machine (either using vfio device assignment or
macvtap passthru mode), it saves the current MAC address and vlan tag
so that it can reset them to their original value when the guest is
done.  Libvirt can't leave the VF MAC set to the value used by the
now-defunct guest since it may be started again later using a
different VF, but it certainly shouldn't just pick any random value,
either. So it saves the state of everything prior to using the VF, and
resets it to that.

The igb driver initializes the MAC addresses of all VFs to
00:00:00:00:00:00, and reports that when asked (via an RTM_GETLINK
netlink message, also visible in the list of VFs in the output of "ip
link show"). But when libvirt attempts to restore the MAC address back
to 00:00:00:00:00:00 (using an RTM_SETLINK netlink message) the kernel
responds with "Invalid argument".

Forbidding a reset back to the original value leaves the VF MAC at the
value set for the now-defunct virtual machine. Especially on a system
with NetworkManager enabled, this has very bad consequences, since
NetworkManager forces all interfacess to be IFF_UP all the time - if
the same virtual machine is restarted using a different VF (or even on
a different host), there will be multiple interfaces watching for
traffic with the same MAC address.

To allow libvirt to revert to the original state, we need a way to
remove the administrative set MAC on a VF, to allow normal host
operation again, and to reset/overwrite the VF MAC via VF netdev.

This patch implements the outlined scenario by allowing to set the
VF MAC to 00:00:00:00:00:00 via RTM_SETLINK on the PF.
igb_ndo_set_vf_mac resets the IGB_VF_FLAG_PF_SET_MAC flag to 0,
so it's possible to reset the VF MAC back to the original value via
the VF netdev.

Note: Recent patches to libvirt allow for a workaround if the NIC
isn't capable of resetting the administrative MAC back to all 0, but
in theory the NIC should allow resetting the MAC in the first place.

Signed-off-by: Corinna Vinschen <vinschen@redhat.com>
Tested-by: Aaron Brown <arron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-01-24 12:27:48 -08:00
David S. Miller 46410c2efa Merge branch 'pktgen-Behavior-flags-fixes'
Dmitry Safonov says:

====================
pktgen: Behavior flags fixes

v2:
  o fixed a nitpick from David Miller

There are a bunch of fixes/cleanups/Documentations.
Diffstat says for itself, regardless added docs and missed flag
parameters.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24 15:03:37 -05:00
Dmitry Safonov 52e12d5dae pktgen: Clean read user supplied flag mess
Don't use error-prone-brute-force way.

Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24 15:03:36 -05:00
Dmitry Safonov 99c6d3d20d pktgen: Remove brute-force printing of flags
Add macro generated pkt_flag_names array, with a little help of which
the flags can be printed by using an index.

Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24 15:03:36 -05:00
Dmitry Safonov 6f107c7412 pktgen: Add behaviour flags macro to generate flags/names
PKT_FALGS macro will be used to add package behavior names definitions
to simplify the code that prints/reads pkg flags.
Sorted the array in order of printing the flags in pktgen_if_show()
Note: Renamed IPSEC_ON => IPSEC for simplicity.

No visible behavior change expected.

Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24 15:03:36 -05:00
Dmitry Safonov 57a5749b0f pktgen: Add missing !flag parameters
o FLOW_SEQ now can be disabled with pgset "flag !FLOW_SEQ"
o FLOW_SEQ and FLOW_RND are antonyms, as it's shown by pktgen_if_show()
o IPSEC now may be disabled

Note, that IPV6 is enabled with dst6/src6 parameters, not with
a flag parameter.

Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24 15:03:36 -05:00
Dmitry Safonov d2ee7973c3 Documentation/pktgen: Clearify how-to use pktgen samples
o Change process name in ps output: looks like, these days the process
  is named kpktgend_<cpu>, rather than pktgen/<cpu>.
o Use pg_ctrl for start/stop as it can work well with pgset without
  changes to $(PGDEV) variable.
o Clarify a bit needed $(PGDEV) definition for sample scripts and that
  one needs to `source functions.sh`.
o Document how-to unset a behaviour flag, note about history expansion.
o Fix pgset spi parameter value.

Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-doc@vger.kernel.org
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24 15:03:36 -05:00
David S. Miller 969ade4086 Merge branch 'cxgb4-fix-build-error'
Rahul Lakkireddy says:

====================
cxgb4: fix build error

Patch 1 fixes build error with compiling cudbg_zlib.c when
CONFIG_ZLIB_DEFLATE macro is not defined.

Patch 2 fixes following sparse warning:
"Using plain integer as NULL pointer"
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24 10:56:59 -05:00
Rahul Lakkireddy 325694e6c3 cxgb4: properly initialize variables
memset variables to 0 to fix sparse warnings:

drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.c:409:42: sparse: Using
plain integer as NULL pointer

drivers/net/ethernet/chelsio/cxgb4/cudbg_zlib.c:43:47: sparse: Using
plain integer as NULL pointer

Fixes: ad75b7d32f ("cxgb4: implement ethtool dump data operations")
Fixes: 91c1953de3 ("cxgb4: use zlib deflate to compress firmware dump")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24 10:56:59 -05:00
Rahul Lakkireddy a1cf9c9ffe cxgb4: enable ZLIB_DEFLATE when building cxgb4
Fixes:
drivers/net/ethernet/chelsio/cxgb4/cudbg_zlib.c:39:5: error:
redefinition of 'cudbg_compress_buff'
    int cudbg_compress_buff(struct cudbg_init *pdbg_init,
        ^~~~~~~~~~~~~~~~~~~
   In file included from
drivers/net/ethernet/chelsio/cxgb4/cudbg_zlib.c:23:0:
   drivers/net/ethernet/chelsio/cxgb4/cudbg_zlib.h:45:19: note: previous
definition of 'cudbg_compress_buff' was here
    static inline int cudbg_compress_buff(struct cudbg_init *pdbg_init,
                      ^~~~~~~~~~~~~~~~~~~

Fixes: 91c1953de3 ("cxgb4: use zlib deflate to compress firmware dump")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24 10:56:59 -05:00
David S. Miller 9e1a27cdab Merge branch 'net-smc-socket-closing-improvements'
Ursula Braun says:

====================
net/smc: socket closing improvements

while the first 2 patches are just small cleanups, the remaing
patches affect socket closing.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24 10:52:58 -05:00
Ursula Braun aa377e682d net/smc: continue waiting if peer signals write_shutdown
If the peer sends a shutdown WRITE, this should not affect sending
in general, and waiting for send buffer space in particular.
Stop waiting of the local socket for send buffer space only, if peer
signals closing, but not if peer signals just shutdown WRITE.

Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24 10:52:57 -05:00
Ursula Braun bbb96bf236 net/smc: improve state change handling after close wait
When a socket is closed or shutdown, smc waits for data being transmitted
in certain states. If the state changes during this wait, the close
switch depending on state should be reentered.
In addition, state change is avoided if sending of close or shutdown fails.

Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24 10:52:57 -05:00
Ursula Braun 86e780d3a3 net/smc: make wait for work request uninterruptible
Work requests are needed for every ib_post_send(), among them the
ib_post_send() to signal closing. If an smc socket program is cancelled,
the smc connections should be cleaned up, and require sending of closing
signals to the peer. This may fail, if a wait for
a free work request is needed, but is cancelled immediately due to the
cancel interrupt. To guarantee notification of the peer, the wait for
a work request is changed to uninterruptible.

And the area to receive work request completion info with
ib_poll_cq() is cleared first.
And _tx_ variable names are used in the _tx_routines for the
demultiplexing common type in the header.

Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24 10:52:57 -05:00
Ursula Braun 8429c13435 net/smc: get rid of tx_pend waits in socket closing
There is no need to wait for confirmation of pending tx requests
for a closing connection, since pending tx slots are dismissed
when finishing a connection.

Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24 10:52:57 -05:00
Ursula Braun 35a6b17847 net/smc: simplify function smc_clcsock_accept()
Cleanup to avoid duplicate code in smc_clcsock_accept().
No functional change.

Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24 10:52:57 -05:00
Ursula Braun 3163c5071f net/smc: use local struct sock variables consistently
Cleanup to consistently exploit the local struct sock definitions.
No functional change.

Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24 10:52:57 -05:00
Ganesh Goudar 9d5fd927d2 cxgb4/cxgb4vf: add support for ndo_set_vf_vlan
implement ndo_set_vf_vlan for mgmt netdevice to configure
the PCIe VF.

Original work by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24 10:48:25 -05:00
David S. Miller 43df215d99 Merge branch 'bpf-and-netdevsim-test-updates'
Jakub Kicinski says:

====================
bpf and netdevsim test updates

A number of test improvements (delayed by merges).  Quentin provides
patches for checking printing to the verifier log from the drivers
and validating extack messages are propagated.  There is also a test
for replacing TC filters to avoid adding back the bug Daniel recently
fixed in net and stable.
====================

Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23 20:25:28 -05:00
Jakub Kicinski 6d2d58f1b7 selftests/bpf: validate replace of TC filters is working
Daniel discovered recently I broke TC filter replace (and fixed
it in commit ad9294dbc2 ("bpf: fix cls_bpf on filter replace")).
Add a test to make sure it never happens again.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23 20:24:32 -05:00
Quentin Monnet 9045bdc8ed selftests/bpf: check bpf verifier log buffer usage works for HW offload
Make netdevsim print a message to the BPF verifier log buffer when a
program is offloaded.

Then use this message in hardware offload selftests to make sure that
using this buffer actually prints the message to the console for
eBPF hardware offload.

The message is appended after the last instruction is processed with the
verifying function from netdevsim. Output looks like the following:

    $ tc filter add dev foo ingress bpf obj sample_ret0.o \
        sec .text verbose skip_sw

    Prog section '.text' loaded (5)!
     - Type:         3
     - Instructions: 2 (0 over limit)
     - License:

    Verifier analysis:

    0: (b7) r0 = 0
    1: (95) exit
    [netdevsim] Hello from netdevsim!
    processed 2 insns, stack depth 0

"verbose" flag is required to see it in the console since netdevsim does
not throw an error after printing the message.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23 20:24:31 -05:00
Jakub Kicinski 7c5db7e729 netdevsim: don't compile BPF code if syscall not enabled
We should not compile netdevsim/bpf.c if BPF syscall is not
enabled.  Otherwise bpf core would have to provide wrappers
for all functions offload drivers may call, even though
system will never see a BPF object.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23 20:24:31 -05:00
Quentin Monnet caf952288d selftests/bpf: add checks on extack messages for eBPF hw offload tests
Add checks to test that netlink extack messages are correctly displayed
in some expected error cases for eBPF offload to netdevsim with TC and
XDP.

iproute2 may be built without libmnl support, in which case the extack
messages will not be reported.  Try to detect this condition, and when
enountered print a mild warning to the user and skip the extack validation.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23 20:24:31 -05:00
Quentin Monnet 728461f202 netdevsim: add extack support for TC eBPF offload
Use the recently added extack support for TC eBPF filters in netdevsim.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23 20:24:31 -05:00
David S. Miller 521504640f Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
40GbE Intel Wired LAN Driver Updates 2018-01-23

This series contains updates to i40e and i40evf only.

Pawel enables FlatNVM support on x722 devices by allowing nvmupdate tool
to configure the preservation flags in the AdminQ command.

Mitch fixes a potential divide by zero error when DCB is enabled and
the firmware fails to configure the VSI, so check for this state.
Fixed a bug where the driver could fail to adhere to ETS bandwidth
allocations if 8 traffic classes were configured on the switch.

Sudheer fixes a potential deadlock by avoiding to call
flush_schedule_work() in i40evf_remove(), since cancel_work_sync()
and cancel_delayed_work_sync() already cleans up necessary work items.
Fixed an issue with the problematic detection and recovery from
hung queues in the PF which was causing lost interrupts.  This is done
by triggering a software interrupt so that interrupts are forced on
and if we are already in napi_poll and an interrupt fires, napi_poll
will not be rescheduled and the interrupt is lost.

Avinash fixes an issue in the VF where is was possible to issue a
reset_task while the device is currently being removed.

Michal fixes an issue occurring while calling i40e_led_set() with
the blink parameter set to true, which was causing the activity LED
instead of the link LED to blink for port identification.

Shiraz changes the client interface to not call client close/open on
netdev down/up events, since this causes a lot of thrash that is
not needed.  Instead, disable the PE TCP-ENA flag during a netdev
down event and re-enable on a netdev up event, since this blocks all
TCP traffic to the RDMA protocol engine.

Alan fixes an issue which was causing a potential transmit hang by
ignoring the PF link up message if the VF state is not yet in the
RUNNING state.

Amritha fixes the channel VSI recreation during the reset flow to
reconfigure the transmit rings and the queue context associated with
the channel VSI.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23 20:22:57 -05:00
David S. Miller 6b44d0f9c9 Merge branch 'act_csum-spinlock-remove'
Davide Caratti says:

====================
net/sched: remove spinlock from 'csum' action

Similarly to what has been done earlier with other actions [1][2], this
series tries to improve the performance of 'csum' tc action, removing a
spinlock in the data path. Patch 1 lets act_csum use per-CPU counters;
patch 2 removes spin_{,un}lock_bh() calls from the act() method.

test procedure (using pktgen from https://github.com/netoptimizer):

 # ip link add name eth1 type dummy
 # ip link set dev eth1 up
 # tc qdisc add dev eth1 root handle 1: prio
 # for a in pass drop; do
 > tc filter del dev eth1 parent 1: pref 10 matchall action csum udp
 > tc filter add dev eth1 parent 1: pref 10 matchall action csum udp $a
 > for n in 2 4; do
 > ./pktgen_bench_xmit_mode_queue_xmit.sh -v -s 64 -t $n -n 1000000 -i eth1
 > done
 > done

test results:

      |    |  before patch   |   after patch
  $a  | $n | avg. pps/thread | avg. pps/thread
 -----+----+-----------------+----------------
 pass |  2 |    1671463 ± 4% |    1920789 ± 3%
 pass |  4 |     648797 ± 1% |     738190 ± 1%
 drop |  2 |    3212692 ± 2% |    3719811 ± 2%
 drop |  4 |    1078824 ± 1% |    1328099 ± 1%

references:

[1] https://www.spinics.net/lists/netdev/msg334760.html
[2] https://www.spinics.net/lists/netdev/msg465862.html

v3 changes:
 - use rtnl_dereference() in place of rcu_dereference() in tcf_csum_dump()

v2 changes:
 - add 'drop' test, it produces more contentions
 - use RCU-protected struct to store 'action' and 'update_flags', to avoid
   reading the values from subsequent configurations
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23 19:51:46 -05:00
Davide Caratti 9c5f69bbd7 net/sched: act_csum: don't use spinlock in the fast path
use RCU instead of spin_{,unlock}_bh() to protect concurrent read/write on
act_csum configuration, to reduce the effects of contention in the data
path when multiple readers are present.

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23 19:51:46 -05:00
Davide Caratti f6052cf2fc net/sched: act_csum: use per-core statistics
use per-CPU counters, like other TC actions do, instead of maintaining one
set of stats across all cores. This allows updating act_csum stats without
the need of protecting them using spin_{,un}lock_bh() invocations.

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23 19:51:46 -05:00
Roopa Prabhu b76f4189df net: link_watch: mark bonding link events urgent
It takes 1sec for bond link down notification to hit user-space
when all slaves of the bond go down. 1sec is too long for
protocol daemons in user-space relying on bond notification
to recover (eg: multichassis lag implementations in user-space).
Since the link event code already marks team device port link events
 as urgent, this patch moves the code to cover all lag ports and master.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23 19:43:30 -05:00
David S. Miller 0542e13b5f Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
10GbE Intel Wired LAN Driver Updates 2018-01-23

This series contains updates to ixgbe only.

Shannon Nelson provides an implementation of the ipsec hardware offload
feature for the ixgbe driver for these devices: x540, x550, 82599.

The ixgbe NICs support ipsec offload for 1024 Rx and 1024 Tx Security
Associations (SAs), using up to 128 inbound IP addresses, and using the
rfc4106(gcm(aes)) encryption.  This code does not yet support checksum
offload, or TSO in conjunction with the ipsec offload - those will be
added in the future.

This code shows improvements in both packet throughput and CPU utilization.
For example, here are some quicky numbers that show the magnitude of the
performance gain on a single run of "iperf -c <dest>" with the ipsec
offload on both ends of a point-to-point connection:

	9.4 Gbps - normal case
	7.6 Gbps - ipsec with offload
	343 Mbps - ipsec no offload
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23 19:29:26 -05:00
David S. Miller c89b517da1 Merge branch 'GEHC-Bx50-Switch-Support'
Sebastian Reichel says:

====================
GEHC Bx50 Switch Support

This adds support for the internal switch found in GE Healthcare
B450v3, B650v3 and B850v3. All devices use a GPIO bitbanged MDIO
bus to communicate with the switch and a PCIe based network card
for exchanging network data. The cpu network data link requires,
that the switch's internal phy interface is enabled, so support
for that is added by the first patch in this series.

The patch series is based on v4.15-rc8.

Changes since PATCHv4:
 * Introduce dsa_port_link_(un)register_of and mark the fixed
   variant static.
 * Update patch description to describe the phy<->phy connection
   from i210 to the Marvell switch
Changes since PATCHv3:
 * Enable the phy in dsa_port_setup() instead of abusing the
   fixed link setup function
Changes since PATCHv2:
 * Add phy nodes to switch in bx50.dtsi and reference them
   from switch ports
 * Enable cpu-port's phy based on 'phy-handle' instead of 'phy-mode'
Changes since PATCHv1:
 * Use 'marvell,mv88e6085' instead of introducing compatible
   string for mv88e6240.
 * Fix indention of DT nodes
 * Only enable 'cpu' phy, if explicitly set to "internal".
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23 19:22:39 -05:00
Sebastian Reichel 658d063d9d ARM: dts: imx6q-b450v3: Add switch port configuration
This adds support for the Marvell switch and names the network
ports according to the labels, that can be found next to the
connectors. The switch is connected to the host system using a
PCI based network card.

The PCI bus configuration has been written using the following
information:

root@b450v3# lspci -tv
-[0000:00]---00.0-[01]----00.0  Intel Corporation I210 Gigabit Network Connection
root@b450v3# lspci -nn
00:00.0 PCI bridge [0604]: Synopsys, Inc. Device [16c3:abcd] (rev 01)
01:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)

Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23 19:22:38 -05:00
Sebastian Reichel b2ea7f8332 ARM: dts: imx6q-b650v3: Add switch port configuration
This adds support for the Marvell switch and names the network
ports according to the labels, that can be found next to the
connectors. The switch is connected to the host system using a
PCI based network card.

The PCI bus configuration has been written using the following
information:

root@b650v3# lspci -tv
-[0000:00]---00.0-[01]----00.0  Intel Corporation I210 Gigabit Network Connection
root@b650v3# lspci -nn
00:00.0 PCI bridge [0604]: Synopsys, Inc. Device [16c3:abcd] (rev 01)
01:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)

Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23 19:22:38 -05:00
Sebastian Reichel e6b22e4169 ARM: dts: imx6q-b850v3: Add switch port configuration
This adds support for the Marvell switch and names the network
ports according to the labels, that can be found next to the
connectors ("ID", "IX", "ePort 1", "ePort 2"). The switch is
connected to the host system using a PCI based network card.

The PCI bus configuration has been written using the following
information:

root@b850v3# lspci -tv
-[0000:00]---00.0-[01]----00.0-[02-05]--+-01.0-[03]----00.0  Intel Corporation I210 Gigabit Network Connection
                                        +-02.0-[04]----00.0  Intel Corporation I210 Gigabit Network Connection
                                        \-03.0-[05]--
root@b850v3# lspci -nn
00:00.0 PCI bridge [0604]: Synopsys, Inc. Device [16c3:abcd] (rev 01)
01:00.0 PCI bridge [0604]: PLX Technology, Inc. PEX 8605 PCI Express 4-port Gen2 Switch [10b5:8605] (rev ab)
02:01.0 PCI bridge [0604]: PLX Technology, Inc. PEX 8605 PCI Express 4-port Gen2 Switch [10b5:8605] (rev ab)
02:02.0 PCI bridge [0604]: PLX Technology, Inc. PEX 8605 PCI Express 4-port Gen2 Switch [10b5:8605] (rev ab)
02:03.0 PCI bridge [0604]: PLX Technology, Inc. PEX 8605 PCI Express 4-port Gen2 Switch [10b5:8605] (rev ab)
03:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)
04:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)

Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23 19:22:38 -05:00
Sebastian Reichel e26dead442 ARM: dts: imx6q-bx50v3: Add internal switch
B850v3, B650v3 and B450v3 all have a GPIO bit banged MDIO bus to
communicate with a Marvell switch. On all devices the switch is
connected to a PCI based network card, which needs to be referenced
by DT, so this also adds the common PCI root node.

Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23 19:22:38 -05:00
Sebastian Reichel 33615367f3 net: dsa: Support internal phy on 'cpu' port
This adds support for enabling the internal PHY for a 'cpu' port.
It has been tested on GE B850v3,  B650v3 and B450v3, which have a
built-in MV88E6240 switch hardwired to a PCIe based network card.
On these machines the internal PHY of the i210 network card and
the Marvell switch are connected to each other and must be enabled
for properly using the switch. While the i210 PHY will be enabled
when the network interface is enabled, the switch's port is not
exposed as network interface. Additionally the mv88e6xxx driver
resets the chip during probe, so the PHY is disabled without this
patch.

Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23 19:22:38 -05:00
Amritha Nambiar bbf0bdd41f i40e: Fix channel addition in reset flow
Fix recreating the channel VSIs during the reset flow to reconfigure
the Tx rings and the queue context associated with the channel VSI.
Also update the next_base_queue for the VSI while rebuilding the
channel VSIs after a reset.

Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-01-23 11:29:19 -08:00
Alan Brady e0346f9fcb i40evf: ignore link up if not running
If we receive the link status message from PF with link up before queues
are actually enabled, it will trigger a TX hang.  This fixes the issue
by ignoring a link up message if the VF state is not yet in RUNNING
state.

Signed-off-by: Alan Brady <alan.brady@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-01-23 11:29:19 -08:00
Markus Elfring 557450c301 i40e: Delete an error message for a failed memory allocation in i40e_init_interrupt_scheme()
Omit an extra message for a memory allocation failure in this function.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-01-23 11:29:19 -08:00
Shiraz Saleem 7b0b1a6d0a i40e: Disable iWARP VSI PETCP_ENA flag on netdev down events
Client close is overloaded to handle both un-registration and
netdev down event. On netdev down, i40iw client close is called
which unregisters the RDMA dev and this is too destructive
since the netdev is still registered.

Do not call client close/open on netdev down/up events. Instead
disable the PE TCP_ENA flag during a netdev down event. This
blocks all TCP traffic to the RDMA Protocol Engine. On netdev up,
re-enable the flag.

Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-01-23 11:29:19 -08:00
Mitch Williams bc73234bcd i40e: simplify pointer dereferences
Now that i40e_vsi_config_tc() has the pf and hw variable defined, use
them, instead of dereferencing vsi->back. Much easier to read.

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-01-23 11:29:19 -08:00
Mitch Williams d8a8785660 i40e: check for invalid DCB config
The driver (and the entire netdev layer for that matter) assumes
that TC0 will always be present in our DCB configuration.
Unfortunately, this isn't always the case. Rather than fail to
configure the VSI, let's go ahead and try to make it work, even
though DCB will end up being disabled by the kernel.

If the driver fails to configure DCB, the driver queries what's
valid, then writes that back to the hardware, always forcing TC0.

This fixes a bug where the driver could fail to adhere to ETS BW
allocations if 8 TCs were configured on the switch.

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-01-23 11:29:19 -08:00
Sudheer Mogilappagari 07d44190a3 i40e/i40evf: Detect and recover hung queue scenario
In VFs, there is a known issue which can cause writebacks
to not occur when interrupts are disabled and there are
less than 4 descriptors resulting in TX timeout. Timeout
can also occur due to lost interrupt.

The current implementation for detecting and recovering
from hung queues in the PF is problematic because it actually
actively encourages lost interrupts.  By triggering a SW
interrupt, interrupts are forced on.  If we are already in
napi_poll and an interrupt fires, napi_poll will not be
rescheduled and the interrupt is effectively lost; thereby
potentially *causing* hung queues.

This patch checks whether packets are being processed between
every watchdog cycle and determine potential hung queue and
fires triggers SW interrupt only for that particular queue.

Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-01-23 11:29:19 -08:00
Michal Kuchta d95cd48060 i40e: Fix for blinking activity instead of link LEDs
This fix solves an issue occurring while calling i40e_led_set function
from the driver with "blink" parameter set as TRUE. This call resulted
in Activity LED blinking instead of Link LED, which may lead to errors
in physically identifying the port, since Activity LED may be blinking
for different reasons as well.

Signed-off-by: Michal Kuchta <michal.kuchta@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-01-23 11:29:19 -08:00
Avinash Dayanand 06aa040f03 i40evf: Don't schedule reset_task when device is being removed
When a host disables and enables a PF device, all the associated
VFs are removed and added back in. It also generates a PFR which in turn
resets all the connected VFs. This behaviour is different from that of
Linux guest on Linux host. Hence we end up in a situation where there's
a PFR and device removal at the same time. And watchdog doesn't have a
clue about this and schedules a reset_task. This patch adds code to send
signal to reset_task that the device is currently being removed.

Signed-off-by: Avinash Dayanand <avinash.dayanand@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-01-23 11:29:19 -08:00
Sudheer Mogilappagari a558566bef i40evf: remove flush_scheduled_work call in i40evf_remove
flush_schedule_work blocks until completion of all scheduled
work items in global work-queue. This can cause deadlock in some
cases. i40evf_remove() cleans up necessary work items with
cancel_delayed_work_sync and cancel_work_sync. This fix removes
flush_schedule_work call inside i40evf_remove().

Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-01-23 11:29:19 -08:00
Mitch Williams b356dac8ab i40e: avoid divide by zero
In some weird circumstances with DCB enabled, the firmware can fail to
configure the VSI, leaving us with zero traffic classes. Check for this
state when we configure RSS to avoid a panic.

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-01-23 11:29:19 -08:00
Pawel Jablonski e3a5d6e6fa i40e/i40evf: Enable NVMUpdate to retrieve AdminQ and add preservation flags for NVM update
This patch adds new I40E_NVMUPD_GET_AQ_EVENT state to allow
retrieval of AdminQ events as a result of AdminQ commands sent
to firmware.

Add preservation flags support on X722 devices for NVM update
AdminQ function wrapper. Add new parameter and handling to
nvmupdate admin queue function intended to allow nvmupdate tool
to configure the preservation flags in the AdminQ command.

This is required to implement FlatNVM on X722 devices.

Signed-off-by: Pawel Jablonski <pawel.jablonski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-01-23 11:29:19 -08:00
David S. Miller 5ca114400d Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
en_rx_am.c was deleted in 'net-next' but had a bug fixed in it in
'net'.

The esp{4,6}_offload.c conflicts were overlapping changes.
The 'out' label is removed so we just return ERR_PTR(-EINVAL)
directly.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23 13:51:56 -05:00
Shannon Nelson 85bc2663a5 ixgbe: register ipsec offload with the xfrm subsystem
With all the support code in place we can now link in the ipsec
offload operations and set the ESP feature flag for the XFRM
subsystem to see.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-01-23 10:09:12 -08:00