Commit Graph

602583 Commits

Author SHA1 Message Date
Shrikrishna Khare 3c8b3efc06 vmxnet3: allow variable length transmit data ring buffer
vmxnet3 driver supports transmit data ring viz. a set of fixed size
buffers used by the driver to copy packet headers. Small packets that
fit these buffers are copied into these buffers entirely.

Currently this buffer size of fixed at 128 bytes. This patch extends
transmit data ring implementation to allow variable length transmit
data ring buffers. The length of the buffer is read from the emulation
during initialization.

Signed-off-by: Sriram Rangarajan <rangarajans@vmware.com>
Signed-off-by: Shrikrishna Khare <skhare@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-16 22:37:04 -07:00
Shrikrishna Khare f35c7480f8 vmxnet3: introduce generalized command interface to configure the device
Shared memory is used to exchange information between the vmxnet3 driver
and the emulation. In order to request emulation to perform a task, the
driver first populates specific fields in this shared memory and then
issues corresponding command by writing to the command register(CMD). The
layout of the shared memory was defined by vmxnet3 version 1 and cannot
be extended for every new command without breaking backward compatibility.

To address this problem, in vmxnet3 version 3, the emulation repurposed
a reserved field in the shared memory to represent command information
instead. For new commands, the driver first populates the command
information field in the shared memory and then issues the command. The
emulation interprets the data written to the command information depending
on the type of the command. This patch exposes this capability to the driver.

Signed-off-by: Guolin Yang <gyang@vmware.com>
Signed-off-by: Shrikrishna Khare <skhare@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-16 22:37:04 -07:00
Shrikrishna Khare 190af10f0b vmxnet3: prepare for version 3 changes
vmxnet3 is currently at version 2, but some command definitions from
previous vmxnet3 versions are missing. Add those definitions before
moving to version 3.

Also, introduce utility macros for vmxnet3 version comparison and update
Copyright information and Maintained by.

Signed-off-by: Shrikrishna Khare <skhare@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-16 22:37:04 -07:00
David S. Miller a264d830ab Merge branch 'qeth-next'
Ursula Braun says:

====================
s390: qeth patches

here are patches for the s390 qeth driver for net-next:

Patch 01 is a minor improvement for the bridgeport code in qeth.

Patches 02-07 take care about scatter/gather handling in qeth.

Patch 08 improves handling of multicast IP addresses in the qeth layer3
discipline.

Patches 09-11 improve netdev features related functions in qeth.

Patch 12 implements an outbound queue restriction for HiperSockets.

Patch 13 fixes a wrong indentation in qeth_l3_main.c causing a warning
with gcc-6.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-16 22:16:13 -07:00
Sebastian Ott 77a83ed10b s390/qeth: fix indentation in qeth_l3_arp_query
gcc-6 warns about obviously wrong indentation:
drivers/s390/net/qeth_l3_main.c: In function 'qeth_l3_arp_query':
drivers/s390/net/qeth_l3_main.c:2315:3: warning: this 'if' clause does not
guard... [-Wmisleading-indentation]
   if (copy_to_user(udata, qinfo.udata, 4))
   ^~
drivers/s390/net/qeth_l3_main.c:2317:4: note: ...this statement, but the
latter is misleadingly indented as if it is guarded by the 'if'
    goto free_and_out;
    ^~~~

Although this particular case is harmless, fix the indentation to get rid
of that warning and improve readability.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-16 22:16:13 -07:00
Hans Wippel 70deb01662 qeth: omit outbound queue 3 for unicast packets in Priority Queuing on HiperSockets
On HiperSockets only outbound queues 0 to 2 are available for unicast
packets. Current Priority Queuing implementation in the qeth driver puts
outgoing packets in outbound queues 0 to 3.

This puts outgoing unicast packets into outbound queue 2 instead of
outbound queue 3 when using Priority Queuing on a HiperSocket.
Additionally, the default outbound queue cannot be set to outbound queue 3
on HiperSockets.

Signed-off-by: Hans Wippel <hwippel@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-16 22:16:13 -07:00
Hans Wippel 6c7cd71244 qeth: improve set_features error handling
The function set_features is called to configure network device features
on the hardware. If errors occur, the network device features should
reflect the changed hardware state and the function should return an
error in order to notify the user.

In case of an error, the current implementation does not necessarily
save the changed hardware state in the network device features before an
error is returned.

This patch improves error handling by saving features, that could be
changed, to the network device features before returning an error. If
the device is not running, an additional check in fix_features removes
features, that require hardware changes, before they are passed to
set_features. Thus, the corresponding check was removed in set_features.

Signed-off-by: Hans Wippel <hwippel@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-16 22:16:13 -07:00
Hans Wippel 9bdc441102 qeth: add network device features for VLAN devices
Network device features indicate the capabilities of network devices (e.g.,
TX checksum offloading and TSO) and their configuration state. Additional
network device features (vlan_features) indicate for each network device,
which capabilities can be used on VLAN devices, that are configured on the
respective base network device.

In the current qeth implementation, network device features are only set
for the base network devices and not for the VLAN devices. Thus, features
like TX checksum offloading cannot be used on VLAN devices.

This patch adds network device features to vlan_features, so they can be
used by VLAN devices.

Signed-off-by: Hans Wippel <hwippel@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-16 22:16:13 -07:00
Thomas Richter 8f43fb00a1 qeth layer 2 and layer 3 common feature handling
This patch introduces a common set of fix_features and set_features
functions for layer 2 and layer 3. The RX, TX and TSO offload
functionality on the OSA card is enabled using ethtool at user's
request and not at device initialization as done before.

For layer 3 the RX checksum offloading is disabled at device
initialization time.

Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-16 22:16:12 -07:00
Lakhvich Dmitriy 5f78e29cee qeth: optimize IP handling in rx_mode callback
In layer3 mode of the qeth driver, multicast IP addresses
from struct net_device and other type of IP addresses
from other sources require mapping to the OSA-card.
This patch simplifies the IP address mapping logic, and changes imple-
mentation of ndo_set_rx_mode callback and ip notifier events.
Addresses are stored in private hashtables instead of lists now.
It allows hardware registration/removal for new/deleted multicast
addresses only.

Signed-off-by: Lakhvich Dmitriy <ldmitriy@ru.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Reviewed-by: Evgeny Cherkashin <Eugene.Crosser@ru.ibm.com>
Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-16 22:16:12 -07:00
Eugene Crosser 6059c90537 qeth: introduce linearization fail count to stats
When skb data touches too many pages, skb_linearize() is called
opportunistically in the hope that less pages will be required
for a big linear buffer than for multiple fragments. This patch
intoduces a separate counter in ethtool statistics structure
representing _failed_ linearization attempts.

Signed-off-by: Eugene Crosser <Eugene.Crosser@ru.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Reviewed-by: Lakhvich Dmitriy <ldmitriy@ru.ibm.com>
Reviewed-by: Thomas Richter <tmricht@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-16 22:16:12 -07:00
Eugene Crosser 2601e4ed3f qeth: enable scatter/gather by default
Set scatter/gather ON by default on OSA, for both layer 2 and
layer 3 modes. We always use fragmentation over QDIO anyway,
so let the upper layers of the stack take advantage of that.

Signed-off-by: Eugene Crosser <Eugene.Crosser@ru.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Reviewed-by: Lakhvich Dmitriy <ldmitriy@ru.ibm.com>
Reviewed-by: Thomas Richter <tmricht@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-16 22:16:12 -07:00
Eugene Crosser d52aec97e5 qeth: enable scatter/gather in layer 2 mode
The patch enables NETIF_F_SG flag for OSA in layer 2 mode.
It also adds performance accounting for fragmented sends,
adds a conditional skb_linearize() attempt if the skb had
too many fragments for QDIO SBAL, and fills netdevice->gso_*
attributes.

Signed-off-by: Eugene Crosser <Eugene.Crosser@ru.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Reviewed-by: Lakhvich Dmitriy <ldmitriy@ru.ibm.com>
Reviewed-by: Thomas Richter <tmricht@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-16 22:16:12 -07:00
Eugene Crosser 8bad55f8ce qeth: fill netdevice->gso_* attributes accurately
Use QETH_MAX_BUFFER_ELEMENTS(card) instead of constant 16.
Also fill gso_max_segs and gso_min_segs.

Signed-off-by: Eugene Crosser <Eugene.Crosser@ru.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-16 22:16:11 -07:00
Eugene Crosser 41aeed58af qeth: clean up condition when tso is used
Make conditions under which TSO is activated more stringent.
Make calculation of SBALEs required for the skb more accurate.

Signed-off-by: Eugene Crosser <Eugene.Crosser@ru.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-16 22:16:11 -07:00
Eugene Crosser 2863c61334 qeth: refactor calculation of SBALE count
Rewrite the functions that calculate the required number of buffer
elements needed to represent SKB data, to make them hopefully more
comprehensible. Plus a few cleanups.

Signed-off-by: Eugene Crosser <Eugene.Crosser@ru.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-16 22:16:11 -07:00
Eugene Crosser 1b05cf6285 qeth: Include error message for "OS Mismatch"
Having understood the semantics of BRIDGEPORT error code 0x0010,
we can introduce a meaningful error message.

Signed-off-by: Eugene Crosser <Eugene.Crosser@ru.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-16 22:16:11 -07:00
Arnd Bergmann 318d3cc04e net: xfrm: fix old-style declaration
Modern C standards expect the '__inline__' keyword to come before the return
type in a declaration, and we get a couple of warnings for this with "make W=1"
in the xfrm{4,6}_policy.c files:

net/ipv6/xfrm6_policy.c:369:1: error: 'inline' is not at beginning of declaration [-Werror=old-style-declaration]
 static int inline xfrm6_net_sysctl_init(struct net *net)
net/ipv6/xfrm6_policy.c:374:1: error: 'inline' is not at beginning of declaration [-Werror=old-style-declaration]
 static void inline xfrm6_net_sysctl_exit(struct net *net)
net/ipv4/xfrm4_policy.c:339:1: error: 'inline' is not at beginning of declaration [-Werror=old-style-declaration]
 static int inline xfrm4_net_sysctl_init(struct net *net)
net/ipv4/xfrm4_policy.c:344:1: error: 'inline' is not at beginning of declaration [-Werror=old-style-declaration]
 static void inline xfrm4_net_sysctl_exit(struct net *net)

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-16 22:06:30 -07:00
Arnd Bergmann 278af574db net: gianfar: fix old-style declaration
Modern C standards expect the '__inline__' keyword to come before the return
type in a declaration, and we get a warning for this with "make W=1":

drivers/net/ethernet/freescale/gianfar.c:2278:1: error: 'inline' is not at beginning of declaration [-Werror=old-style-declaration]

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-16 22:06:30 -07:00
Arnd Bergmann ac565065a6 isdn: eicon: fix old-style declarations
Modern C standards expect the '__inline__' keyword to come before the return
type in a declaration, and we get many warnings for this with "make W=1"
because the eicon driver has this in a header file:

eicon/divasmain.c:448:1: error: '__inline__' is not at beginning of declaration [-Werror=old-style-declaration]
eicon/divasmain.c:453:1: error: '__inline__' is not at beginning of declaration [-Werror=old-style-declaration]
eicon/divasmain.c:458:1: error: '__inline__' is not at beginning of declaration [-Werror=old-style-declaration]
eicon/divasmain.c:463:1: error: '__inline__' is not at beginning of declaration [-Werror=old-style-declaration]
eicon/divasmain.c:468:1: error: '__inline__' is not at beginning of declaration [-Werror=old-style-declaration]
eicon/divasmain.c:473:1: error: '__inline__' is not at beginning of declaration [-Werror=old-style-declaration]
eicon/platform.h:274:1: error: '__inline__' is not at beginning of declaration [-Werror=old-style-declaration]
eicon/platform.h:280:1: error: '__inline__' is not at beginning of declaration [-Werror=old-style-declaration]

A similar warning gets printed for the diva_os_register_io_port()
declaration, because 'register' is interpreted as a keyword instead
of a variable name:

In file included from eicon/diva_didd.c:21:0:
eicon/platform.h:206:1: error: 'register' is not at beginning of declaration [-Werror=old-style-declaration]

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-16 22:06:30 -07:00
Arnd Bergmann 60a25d00db hamradio: baycom: fix old-style declaration
Modern C standards expect the '__inline__' keyword to come before the return
type in a declaration, and we get a warning for this with "make W=1":

drivers/net/hamradio/baycom_par.c:159:1: error: '__inline__' is not at beginning of declaration [-Werror=old-style-declaration]

For consistency with other drivers, I'm changing '__inline__' to 'inline'
at the same time.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-16 22:06:30 -07:00
Wei Tang be4da0e340 net: the space is required after ','
The space is missing after ',', and this will introduce much more
noise when checking patch around.

Signed-off-by: Wei Tang <tangwei@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-16 17:41:23 -07:00
Wei Tang 84d15ae57d net: do not initialise statics to 0
This patch fixes the checkpatch.pl error to dev.c:

ERROR: do not initialise statics to 0

Signed-off-by: Wei Tang <tangwei@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-16 17:41:22 -07:00
Arnd Bergmann 40309d2654 net: tlan: don't set unused function argument
We get a warning for tlan_handle_tx_eoc when building with "make W=1"

drivers/net/ethernet/ti/tlan.c: In function 'tlan_handle_tx_eoc':
drivers/net/ethernet/ti/tlan.c:1647:59: error: parameter 'host_int' set but not used [-Werror=unused-but-set-parameter]
 static u32 tlan_handle_tx_eoc(struct net_device *dev, u16 host_int)

This is harmless, but removing the unused assignment lets us avoid
the warning with no downside.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-16 17:33:12 -07:00
Arnd Bergmann 287debd6aa net: qlcnic: don't set unused function argument
We get a warning for qlcnic_83xx_get_mac_address when building with
"make W=1":

drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c: In function 'qlcnic_83xx_get_mac_address':
drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c:2156:8: error: parameter 'function' set but not used [-Werror=unused-but-set-parameter]

Clearly this is harmless, but there is also no point for setting
the variable, so we can simply remove the assignment.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-16 17:33:11 -07:00
Arnd Bergmann 55e7f6abe1 dsa: b53: fix big-endian register access
The b53 dsa register access confusingly uses __raw register accessors
when both the CPU and the device are big-endian, but it uses little-
endian accessors when the same device is used from a little-endian
CPU, which makes no sense.

This uses normal accessors in device-endianess all the time, which
will work in all four combinations of register and CPU endianess,
and it will have the same barrier semantics in all cases.

This also seems to take care of a (false positive) warning I'm getting:

drivers/net/dsa/b53/b53_mmap.c: In function 'b53_mmap_read64':
drivers/net/dsa/b53/b53_mmap.c:109:10: error: 'hi' may be used uninitialized in this function [-Werror=maybe-uninitialized]
  *val = ((u64)hi << 32) | lo;

I originally planned to submit another patch for that warning
and did this one as a preparation cleanup, but it does seem to be
sufficient by itself.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-16 17:15:28 -07:00
Simon Horman 0d227a8672 mpls: allow routes on ipgre devices
This appears to be necessary and sufficient to provide
MPLS in GRE (RFC4023) support.

This can be used by establishing an ipgre tunnel device
and then routing MPLS over it.

The following example will forward MPLS frames received with an outermost
MPLS label 100 over tun1, a GRE tunnel. The forwarded packet will have the
outermost MPLS LSE removed and two new LSEs added with labels 200
(outermost) and 300 (next).

ip link add name tun1 type gre remote 10.0.99.193 local 10.0.99.192 ttl 225
ip link set up dev tun1
ip addr add 10.0.98.192/24 dev tun1
ip route sh

echo 1 > /proc/sys/net/mpls/conf/eth0/input
echo 101 > /proc/sys/net/mpls/platform_labels
ip -f mpls route add 100 as 200/300 via inet 10.0.98.193
ip -f mpls route sh

Also remove unnecessary braces.

Reviewed-by: Dinan Gunawardena <dinan.gunawardena@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Acked-by: Robert Shearman <rshearma@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-16 17:12:07 -07:00
hayeswang fae5617877 r8152: modify the check of the flag of PHY_RESET in set_speed function
In set_speed(), BMCR_RESET would be set when the flag of PHY_RESET
is set. Use BMCR_RESET to replace testing the flag of PHY_RESET.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-16 17:08:33 -07:00
Philippe Reynes d21cfb375e net: ethernet: ax88796: use phy_ethtool_{get|set}_link_ksettings
There are two generics functions phy_ethtool_{get|set}_link_ksettings,
so we can use them instead of defining the same code in the driver.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-16 17:07:05 -07:00
Philippe Reynes de0eabf86d net: ethernet: ax88796: use phydev from struct net_device
The private structure contain a pointer to phydev, but the structure
net_device already contain such pointer. So we can remove the pointer
phydev in the private structure, and update the driver to use the
one contained in struct net_device.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-16 17:07:05 -07:00
David S. Miller b4eccef8f3 Merge branch 'stmmac-wol'
Vincent Palatin says:

====================
net: stmmac: dwmac-rk: fixes for Wake-on-Lan on RK3288

In order to support Wake-On-Lan when using the RK3288 integrated MAC
(with an external RGMII PHY), we need to avoid shutting down the regulator
of the external PHY when the MAC is suspended as it's currently done in the MAC
platform code.
As a first step, create independant callbacks for suspend/resume rather than
re-using exit/init callbacks. So the dwmac platform driver can behave differently
on suspend where it might skip shutting the PHY and at module unloading.
Then update the dwmac-rk driver to switch off the PHY regulator only if we are
not planning to wake up from the LAN.
Finally add the PMT interrupt to the MAC device tree configuration, so we can
wake up the core from it when the PHY has received the magic packet.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-16 14:14:58 -07:00
Vincent Palatin d5bfbeb809 ARM: dts: rockchip: add interrupt for Wake-on-Lan on RK3288
In order to use Wake-on-Lan on RK3288 integrated MAC, we need to wake-up
the CPU on the PMT interrupt when the MAC and the PHY are in low power mode.
Adding the interrupt declaration.

Signed-off-by: Vincent Palatin <vpalatin@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-16 14:14:58 -07:00
Vincent Palatin 229666c14c net: stmmac: dwmac-rk: keep the PHY up for WoL
When suspending the machine, do not shutdown the external PHY by cutting
its regulator in the mac platform driver suspend code if Wake-on-Lan is enabled,
else it cannot wake us up.
In order to do this, split the suspend/resume callbacks from the
init/exit callbacks, so we can condition the power-down on the lack of
need to wake-up from the LAN but do it unconditionally when unloading the
module.

Signed-off-by: Vincent Palatin <vpalatin@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-16 14:14:58 -07:00
Vincent Palatin cecbc5563a net: stmmac: allow to split suspend/resume from init/exit callbacks
Let the stmmac platform drivers provide dedicated suspend and resume
callbacks rather than always re-using the init and exits callbacks.
If the driver does not provide the suspend or resume callback, we fall
back to the old behavior trying to use exit or init.

This allows a specific platform to perform only a partial power-down on
suspend if Wake-on-Lan is enabled but always perform the full shutdown
sequence if the module is unloaded.

Signed-off-by: Vincent Palatin <vpalatin@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-16 14:14:58 -07:00
Xin Long 141ddefce7 sctp: change sk state to CLOSED instead of CLOSING in sctp_sock_migrate
Commit d46e416c11 ("sctp: sctp should change socket state when
shutdown is received") may set sk_state CLOSING in sctp_sock_migrate,
but inet_accept doesn't allow the sk_state other than ESTABLISHED/
CLOSED for sctp. So we will change sk_state to CLOSED, instead of
CLOSING, as actually sk is closed already there.

Fixes: d46e416c11 ("sctp: sctp should change socket state when shutdown is received")
Reported-by: Ye Xiaolong <xiaolong.ye@intel.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-16 14:10:44 -07:00
Fabien Siron 5fb384b066 netlink: Add comment to warn about deprecated netlink rings attribute request
Signed-off-by: Fabien Siron <fabien.siron@epita.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-16 14:00:50 -07:00
David S. Miller f0362eab22 Merge branch 'bpf-fd-array-release'
Daniel Borkmann says:

====================
bpf: improve fd array release

This set improves BPF perf fd array map release wrt to purging
entries, first two extend the API as needed. Please see individual
patches for more details.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-15 23:42:58 -07:00
Daniel Borkmann 3b1efb196e bpf, maps: flush own entries on perf map release
The behavior of perf event arrays are quite different from all
others as they are tightly coupled to perf event fds, f.e. shown
recently by commit e03e7ee34f ("perf/bpf: Convert perf_event_array
to use struct file") to make refcounting on perf event more robust.
A remaining issue that the current code still has is that since
additions to the perf event array take a reference on the struct
file via perf_event_get() and are only released via fput() (that
cleans up the perf event eventually via perf_event_release_kernel())
when the element is either manually removed from the map from user
space or automatically when the last reference on the perf event
map is dropped. However, this leads us to dangling struct file's
when the map gets pinned after the application owning the perf
event descriptor exits, and since the struct file reference will
in such case only be manually dropped or via pinned file removal,
it leads to the perf event living longer than necessary, consuming
needlessly resources for that time.

Relations between perf event fds and bpf perf event map fds can be
rather complex. F.e. maps can act as demuxers among different perf
event fds that can possibly be owned by different threads and based
on the index selection from the program, events get dispatched to
one of the per-cpu fd endpoints. One perf event fd (or, rather a
per-cpu set of them) can also live in multiple perf event maps at
the same time, listening for events. Also, another requirement is
that perf event fds can get closed from application side after they
have been attached to the perf event map, so that on exit perf event
map will take care of dropping their references eventually. Likewise,
when such maps are pinned, the intended behavior is that a user
application does bpf_obj_get(), puts its fds in there and on exit
when fd is released, they are dropped from the map again, so the map
acts rather as connector endpoint. This also makes perf event maps
inherently different from program arrays as described in more detail
in commit c9da161c65 ("bpf: fix clearing on persistent program
array maps").

To tackle this, map entries are marked by the map struct file that
added the element to the map. And when the last reference to that map
struct file is released from user space, then the tracked entries
are purged from the map. This is okay, because new map struct files
instances resp. frontends to the anon inode are provided via
bpf_map_new_fd() that is called when we invoke bpf_obj_get_user()
for retrieving a pinned map, but also when an initial instance is
created via map_create(). The rest is resolved by the vfs layer
automatically for us by keeping reference count on the map's struct
file. Any concurrent updates on the map slot are fine as well, it
just means that perf_event_fd_array_release() needs to delete less
of its own entires.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-15 23:42:57 -07:00
Daniel Borkmann d056a78876 bpf, maps: extend map_fd_get_ptr arguments
This patch extends map_fd_get_ptr() callback that is used by fd array
maps, so that struct file pointer from the related map can be passed
in. It's safe to remove map_update_elem() callback for the two maps since
this is only allowed from syscall side, but not from eBPF programs for these
two map types. Like in per-cpu map case, bpf_fd_array_map_update_elem()
needs to be called directly here due to the extra argument.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-15 23:42:57 -07:00
Daniel Borkmann 61d1b6a42f bpf, maps: add release callback
Add a release callback for maps that is invoked when the last
reference to its struct file is gone and the struct file about
to be released by vfs. The handler will be used by fd array maps.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-15 23:42:57 -07:00
David S. Miller b478af0cd7 Merge branch 'sfc-rx-vlan-filtering'
Edward Cree says:

====================
sfc: RX VLAN filtering

Adds support for VLAN-qualified receive filters on EF10 hardware.
This is needed when running as a guest if the hypervisor has enabled
vfs-vlan-restrict, in which case the firmware rejects filters not qualified
with VLAN 0.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-15 22:26:27 -07:00
Andrew Rybchenko 38d27f389c sfc: Fix VLAN filtering feature if vPort has VLAN_RESTRICT flag
If vPort has VLAN_RESTRICT flag, VLAN tagged traffic will not be
delivered without corresponding Rx filters which may be proxied to and
moderated by hypervisor.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-15 22:26:27 -07:00
Edward Cree 23e202b419 sfc: Update MCDI protocol definitions
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-15 22:26:27 -07:00
Andrew Rybchenko eb7cfd8c9b sfc: Disable VLAN filtering by default if not strictly required
If should be done after net_dev->hw_features initialization, to keep the
feature there to be able to enable it later using ethtool.

VLAN filtering is enforced and fixed if vPort requires usage of VLAN
filters to receive tagged traffic.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-15 22:26:27 -07:00
Martin Habets e4478ad14f sfc: VLAN filters must only be created if the firmware supports this.
If it is not supported we simply disable the feature.

For the feature to work we need firmware filter support for
OUTER_VID + LOC_MAC and for OUTER_VID + LOC_MAC_IG.
The low-latency firmware can match on OUTER_VID + LOC_MAC but not on
OUTER_VID + LOC_MAC_IG.
For the capture packet firmware it is the other way around.
Only the full-feature variant can match on both combinations.

Incorporates a fix by Andrew Rybchenko <Andrew.Rybchenko@oktetlabs.ru>
in the net_dev->[hw_]features handling.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-15 22:26:26 -07:00
Andrew Rybchenko 7ac0dd9de6 sfc: Fix dup unknown multicast/unicast filters after datapath reset
Filter match flags are not unique criteria to be mapped to priority
because of both unknown unicast and unknown multicast are mapped to
LOC_MAC_IG. So, local MAC is required to map filter to priority.
MCDI filter flags is unique criteria to find filter priority.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-15 22:26:26 -07:00
Edward Cree 8c91562075 sfc: Refactor checks for invalid filter ID
Nearly every time we call efx_ef10_filter_remove_unsafe, we first check
for EFX_EF10_FILTER_ID_INVALID, in which case we do nothing.  So move
that check into the function, simplifying all the call sites.

Also, change the return type to void, since none of the callers check it.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-15 22:26:26 -07:00
Martin Habets d248953a3c sfc: Take mac_lock before calling efx_ef10_filter_table_probe
When trying to enslave an SFC interface to a bond the following BUG_ON was
hit:

 kernel BUG [in ef10.c]!
 CPU: 0 PID: 4383 Comm: ifenslave Tainted: G
...
 Call Trace:
  efx_ef10_filter_add_vlan+0x121/0x180 [sfc]
  efx_ef10_filter_table_probe+0x2a2/0x4f0 [sfc]
  efx_ef10_set_mac_address+0x370/0x6d0 [sfc]
  efx_set_mac_address+0x7d/0x120 [sfc]
  dev_set_mac_address+0x43/0xa0
  bond_enslave+0x337/0xea0 [bonding]
This comes from function efx_ef10_filter_vlan_sync_rx_mode.

To solve the bug we ensure the mac_lock is taken before calling
efx_ef10_filter_add_vlan. But to avoid a priority inversion mac_lock must
be taken before filter_sem.
To satisfy these requirements we end up taking mac_lock in
efx_ef10_vport_set_mac_address, efx_ef10_set_mac_address,
efx_ef10_sriov_set_vf_vlan and efx_probe_filters.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-15 22:26:26 -07:00
Andrew Rybchenko 4a53ea8a74 sfc: Implement ndo_vlan_rx_{add, kill}_vid() callbacks
Supports HW VLAN filtering, en/disabled using ethtool.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-15 22:26:26 -07:00
Andrew Rybchenko 34813fe26e sfc: Implement list of VLANs added over interface
Right now it contains dummy VLAN entry with unspecified VID only.
The entry is used for the case when HW VLAN filtering is not used.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-15 22:26:26 -07:00