Commit Graph

444367 Commits

Author SHA1 Message Date
Yoshihiro Shimoda 4f809cea61 net: sh_eth: Fix receive packet "exceeded" condition in sh_eth_rx()
This patch fixes the packet "exceeded" condition in sh_eth_rx() when
RACT in an RX descriptor is not set and the "quota" is 0.
Otherwise, kernel panic happens because the "&n->poll_list" is deleted
twice in sh_eth_poll() which calls napi_complete() and net_rx_action().

Signed-off-by: Kouei Abe <kouei.abe.cp@renesas.com>
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 15:15:31 -07:00
Alexei Starovoitov 61f83d0d57 net: filter: fix warning on 32-bit arch
fix compiler warning on 32-bit architectures:

net/core/filter.c: In function '__sk_run_filter':
net/core/filter.c:540:22: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
net/core/filter.c:550:22: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
net/core/filter.c:560:22: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 15:12:27 -07:00
Jon Paul Maloy 02c00c2ab0 tipc: fix potential bug in function tipc_backlog_rcv
In commit 4f4482dcd9 ("tipc: compensate
for double accounting in socket rcv buffer") we access 'truesize' of
a received buffer after it might have been released by the function
filter_rcv().

In this commit we correct this by reading the value of 'truesize' to
the stack before delivering the buffer to filter_rcv().

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 15:01:30 -07:00
Dan Carpenter cf97b8ff22 net: sxgbe: remove duplicate SXGBE_CORE_L34_ADDCTL_REG define
The SXGBE_CORE_L34_ADDCTL_REG define is cut and pasted twice so we can
delete the second instance.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 15:01:30 -07:00
Dan Carpenter 5e3ec11b64 qlcnic: remove duplicate QLC_83XX_GET_LSO_CAPABILITY define
The QLC_83XX_GET_LSO_CAPABILITY define is cut and pasted twice so we can
delete the second instance.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Sony Chacko <sony.chacko@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 15:01:30 -07:00
David S. Miller 9b07d735c0 Merge branch 'mlx4'
Amir Vadai says:

====================
cpumask,net: affinity hint helper function

This patchset will set affinity hint to influence IRQs to be allocated on the
same NUMA node as the one where the card resides. As discussed in
http://www.spinics.net/lists/netdev/msg271497.html

If number of IRQs allocated is greater than the number of local NUMA cores, all
local cores will be used first, and the rest of the IRQs will be on a remote
NUMA node.
If no NUMA support - IRQ's and cores will be mapped 1:1

Since the utility function to calculate the mapping could be useful in other mq
drivers in the kernel, it was added to cpumask.[ch]

This patchset was tested and applied on top of net-next since the first
consumer is a network device (mlx4_en).  Over commit fff1f59 "mac802154:
llsec: add forgotten list_del_rcu in key removal"
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 14:59:21 -07:00
Yuval Atias 9e311e77a8 net/mlx4_en: Use affinity hint
The “affinity hint” mechanism is used by the user space
daemon, irqbalancer, to indicate a preferred CPU mask for irqs.
Irqbalancer can use this hint to balance the irqs between the
cpus indicated by the mask.

We wish the HCA to preferentially map the IRQs it uses to numa cores
close to it.  To accomplish this, we use cpumask_set_cpu_local_first(), that
sets the affinity hint according the following policy:
First it maps IRQs to “close” numa cores.  If these are exhausted, the
remaining IRQs are mapped to “far” numa cores.

Signed-off-by: Yuval Atias <yuvala@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 14:58:16 -07:00
Amir Vadai da91309e0a cpumask: Utility function to set n'th cpu - local cpu first
This function sets the n'th cpu - local cpu's first.
For example: in a 16 cores server with even cpu's local, will get the
following values:
cpumask_set_cpu_local_first(0, numa, cpumask) => cpu 0 is set
cpumask_set_cpu_local_first(1, numa, cpumask) => cpu 2 is set
...
cpumask_set_cpu_local_first(7, numa, cpumask) => cpu 14 is set
cpumask_set_cpu_local_first(8, numa, cpumask) => cpu 1 is set
cpumask_set_cpu_local_first(9, numa, cpumask) => cpu 3 is set
...
cpumask_set_cpu_local_first(15, numa, cpumask) => cpu 15 is set

Curently this function will be used by multi queue networking devices to
calculate the irq affinity mask, such that as many local cpu's as
possible will be utilized to handle the mq device irq's.

Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 14:58:16 -07:00
David S. Miller d4f3862017 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next
Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2014-06-11

This series contains updates to igb, i40e and i40evf.

Todd makes a change to igb to un-hide invariant returns by getting rid of
the E1000_SUCCESS define and converting those returns to return 0.

Jacob separates the hardware logic from the set function, so that we can
re-use it during a ptp_reset in igb.  This enables the reset to return
functionality to the last know timestamp mode, rather than resetting the
value.

Ashish implements context flags for headwb and headwb_addr so that we
do not have to keep them always enabled.

Shannon updates the admin queue API for the new firmware, which adds
set_pf_content, nvm_config_read/write, replaces set_phy_reset with
set_phy_debug and removes nvm_read/write_reg_se.  Cleans up the driver
to use the stored base_queue value since there is no need to read the
PCI register for the PF's base queue on every single transmit queue
enable and disable as we already have the value stored from reading
the capability features at startup.

Anjali changes the notion of source and destination for FD_SB in ethtool
to align i40e with other drivers.  Adds flow director statistics to
the PF stats.  Fixes a bug in ethtool for flow director drop packet
filter where the drop action comes down as a ring_cookie value, so allow
it as a special value that can be used to configure destination control.

Mitch fixes the i40evf to keep the driver from going down when it is
already in a down state.  This prevents a CPU soft lock in napi_disable().
Also change the i40evf to check the admin queue error bits since the
firmware can indicate any admin queue error states to the driver via
some bits in the length registers.

Neerav separates out the DCB capability and enabled flags because currently
if the firmware reports DCB capability the driver enables
I40E_FLAG_DCB_ENABLED flag.  When this flag is enabled the driver inserts
a tag when transmitting a packet from the port even if there are no DCB
traffic classes configured at the port.  So by adding the additional flag,
I40E_FLAG_DCB_CAPABLE, that will be set when the DCB capability is present
and the existing enabled flag will only be set if there are more than one
traffic classes configured at the port.

Greg fixes the i40e driver to not automatically accept tagged packets by
default so that the system must request a VLAN tag packet filter to get
packets with that tag.  Greg also converts i40e to use the in-kernel
ether_addr_copy() instead of mempcy().

Jesse removes the FTYPE field from the receive descriptor to match the
hardware implementation.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 12:25:12 -07:00
David S. Miller 813ebbbf8e Merge branch 'sctp-next'
Daniel Borkmann says:

====================
SCTP update

This set contains transport path selection improvements in
SCTP. Please see individual patches for details.
====================

Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 12:23:30 -07:00
Daniel Borkmann 9b87d46510 net: sctp: fix incorrect type in gfp initializer
This fixes the following sparse warning:

  net/sctp/associola.c:1556:29: warning: incorrect type in initializer (different base types)
  net/sctp/associola.c:1556:29:    expected bool [unsigned] [usertype] preload
  net/sctp/associola.c:1556:29:    got restricted gfp_t

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 12:23:17 -07:00
Daniel Borkmann a7288c4dd5 net: sctp: improve sctp_select_active_and_retran_path selection
In function sctp_select_active_and_retran_path(), we walk the
transport list in order to look for the two most recently used
ACTIVE transports (trans_pri, trans_sec). In case we didn't find
anything ACTIVE, we currently just camp on a possibly PF or
INACTIVE transport that is primary path; this behavior actually
dates back to linux-history tree of the very early days of
lksctp, and can yield a behavior that chooses suboptimal
transport paths.

Instead, be a bit more clever by reusing and extending the
recently introduced sctp_trans_elect_best() handler. In case
both transports are evaluated to have the same score resulting
from their states, break the tie by looking at: 1) transport
patch error count 2) last_time_heard value from each transport.

This is analogous to Nishida's Quick Failover draft [1],
section 5.1, 3:

  The sender SHOULD avoid data transmission to PF destinations.
  When all destinations are in either PF or Inactive state,
  the sender MAY either move the destination from PF to active
  state (and transmit data to the active destination) or the
  sender MAY transmit data to a PF destination. In the former
  scenario, (i) the sender MUST NOT notify the ULP about the
  state transition, and (ii) MUST NOT clear the destination's
  error counter. It is recommended that the sender picks the
  PF destination with least error count (fewest consecutive
  timeouts) for data transmission. In case of a tie (multiple PF
  destinations with same error count), the sender MAY choose the
  last active destination.

Thus for sctp_select_active_and_retran_path(), we keep track of
the best, if any, transport that is in PF state and in case no
ACTIVE transport has been found (hence trans_{pri,sec} is NULL),
we select the best out of the three: current primary_path and
retran_path as well as a possible PF transport.

The secondary may still camp on the original primary_path as
before. The change in sctp_trans_elect_best() with a more fine
grained tie selection also improves at the same time path selection
for sctp_assoc_update_retran_path() in case of non-ACTIVE states.

  [1] http://tools.ietf.org/html/draft-nishida-tsvwg-sctp-failover-05

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 12:23:17 -07:00
Daniel Borkmann e575235fc6 net: sctp: migrate most recently used transport to ktime
Be more precise in transport path selection and use ktime
helpers instead of jiffies to compare and pick the better
primary and secondary recently used transports. This also
avoids any side-effects during a possible roll-over, and
could lead to better path decision-making.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 12:23:17 -07:00
Daniel Borkmann b82e8f31ac net: sctp: refactor active path selection
This patch just refactors and moves the code for the active
path selection into its own helper function outside of
sctp_assoc_control_transport() which is already big enough.
No functional changes here.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 12:23:17 -07:00
Daniel Borkmann 67cb9366ff ktime: add ktime_after and ktime_before helper
Add two minimal helper functions analogous to time_before() and
time_after() that will later on both be needed by SCTP code.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 12:23:17 -07:00
David S. Miller 9181a6bd18 Merge branch 'mac802154'
Phoebe Buckheister says:

====================
Recent llsec code introduced a memory leak on decryption failures during rx.
This fixes said leak, and optimizes the receive loops for monitor and wpan
devices to only deliver skbs to devices that are actually up. Also changes a
dev_kfree_skb to kfree_skb when an invalid packet is dropped before being
pushed into the stack.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 12:11:25 -07:00
Phoebe Buckheister 2d3b5b0a90 mac802154: don't deliver packets to devices that are down
Only one WPAN devices can be active at any given time, so only deliver
packets to that one interface that is actually up. Multiple monitors may
be up at any given time, but we don't have to deliver to monitors that
are down either.

Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 12:10:19 -07:00
Phoebe Buckheister a374eeb5e5 mac802154: properly free incoming skbs on decryption failure
mac802154 RX did not free skbs on decryption failure, assuming that the
caller would when the local rx handler returned _DROP. This was false.

Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 12:10:18 -07:00
Catherine Sullivan f832090249 i40e/i40evf: Bump i40e to version 0.4.10 and i40evf to 0.9.34
Bump versions.

Change-ID: Ic4a84354955061ca18321b1e97c9c30fe1563b5c
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-06-11 08:48:43 -07:00
Shannon Nelson dfb699f970 i40e: use stored base_queue value
No need to read the PCI register for the PF's base queue on every single Tx
queue enable and disable as we already have the value stored from reading
the capability features at startup.

Change-ID: Ic02fb622757742f43cb8269369c3d972d4f66555
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-06-11 08:48:40 -07:00
Anjali Singhai Jain 387ce1a97d i40e: Fix a bug in ethtool for FD drop packet filter action
A drop action comes down as a ring_cookie value, so allow it as
a special value that can be used to configure destination control.

Also fix the output to filter read command accordingly.

Change-ID: I9956723cee42f3194885403317dd21ed4a151144
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-06-11 08:48:36 -07:00
Anjali Singhai Jain 433c47de13 i40e/i40evf: Add Flow director stats to PF stats
Add members to stat struct to keep track of Flow director ATR and
SideBand filter packet matches.

Change-ID: Ibbb31a53c7adcc2bb96991dd80565442a2f2513c
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-06-11 08:48:33 -07:00
Jesse Brandeburg 2c50ef8047 i40e/i40evf: remove FTYPE
This change drops the FTYPE field from the Rx descriptor, to
match the hardware implementation.

Change-ID: I66d31d2b43861da45e8ace4fb03df033abe88bab
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-06-11 08:48:29 -07:00
Mitch Williams 912257e540 i40evf: check admin queue error bits
FW can indicate any admin queue error states to the driver via some bits
in the length registers. Each time we process an admin queue message,
check these bits and log any errors we find. Since the VF really can't
do much, we just print the message and depend on the PF driver to clear
things up on our behalf.

Change-ID: I92bc6c53ce3b4400544e0ca19c5de2d27490bd0d
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-06-11 08:48:26 -07:00
Greg Rose 9a173901d9 i40e/i40evf: User ether_addr_copy instead of memcpy
Linux gives us a function to copy Ethernet MAC addresses, let's use it.

Change-ID: I0c861900029ca5ea65a53ca39565852fb633f6fd
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-06-11 08:48:22 -07:00
Greg Rose 8c27d42ec6 i40e: Do not accept tagged packets by default
Remove the filter created by the firmware with the default MAC address it
reads out of the NVM storage and a promiscuous VLAN tag and replace it
with a filter that will not accept tagged packets by default.  The system
must request a VLAN tag packet filter to get packets with that tag.

Change-ID: I119e6c3603a039bd68282ba31bf26f33a575490a
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-06-11 08:48:18 -07:00
Neerav Parikh 4d9b604353 i40e: Separate out DCB capability and enabled flags
Currently if the firmware reports DCB capability the driver enables
I40E_FLAG_DCB_ENABLED flag. When this flag is enabled the driver
inserts a tag when transmitting a packet from the port even if there
are no DCB traffic classes configured at the port.

This patch adds a new flag I40E_FLAG_DCB_CAPABLE that will be set
when the DCB capability is present and the existing flag
I40E_FLAG_DCB_ENABLED will be set only if there are more than one
traffic classes configured at the port.

Change-ID: I24ccbf53ef293db2eba80c8a9772acf729795bd5
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-06-11 08:48:15 -07:00
Mitch Williams ddf0b3a63e i40evf: don't go further down
If the device is down, there's no place to go but up, so don't try to go
down even more. This prevents a CPU soft lock in napi_disable().

Change-ID: I8b058b9ee974dfa01c212fae2597f4f54b333314
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-06-11 08:48:12 -07:00
Anjali Singhai Jain 04b73bd7a4 i40e: Change the notion of src and dst for FD_SB in ethtool
In XL710 devices we program FD filter's fields from Tx perspective of the flow.
However the user interface exposed in ethtool should be compliant with the
previous generation of drivers where a filter src and dst field are from
the RX perspective. This patch changes the ethtool interface in this regard
to match the other drivers.

Change-ID: Iec6ccddd87357c4fb53ccf33aa0fae699faf70cf
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-06-11 08:48:07 -07:00
Shannon Nelson f94234ee6d i40e/i40evf: AdminQ API update for new FW
Add set_pf_context, replace set_phy_reset with set_phy_debug, add
nvm_config_read/write, remove nvm_read/write_reg_se and add some
PHY types.

With these changes we bump the API version to 1.2.

Change-ID: I4dc3aec175c2316f66fc9b726b3f7d594699d84e
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-06-11 08:47:35 -07:00
Ashish Shah 5d29896a81 i40e/i40evf: set headwb Tx context flags and use them
Set appropriate fields in Tx queue configuration virtchnl message
to pf to enable headwb and setup headwb addr.
Then use that info from the VF to set headwb and headwb_addr instead of
always enabling them.

Change-ID: I7d393d1b2b07f0f3355b3a4f7c2d3c6ee3b0d622
Signed-off-by: Ashish Shah <ashish.n.shah@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-06-11 08:46:19 -07:00
Jacob Keller 9f62ecf425 igb: separate hardware setting from the set_ts_config ioctl
This patch separates the hardware logic from the set function, so that
we can re-use it during a ptp_reset. This enables the reset to return
functionality to the last known timestamp mode, rather than resetting
the value. We initialize the mode to off during the ptp_init cycle.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-06-11 08:45:55 -07:00
Todd Fujinaka 23d87824de igb: unhide invariant returns
Return a 0 directly rather than a constant.

Reported-by: Peter Senna Tschudin <peter.senna@gmail.com>
Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-06-11 08:45:48 -07:00
Lendacky, Thomas d5c4858237 amd-xgbe: Rename MAX_DMA_CHANNELS to avoid powerpc conflict
MAX_DMA_CHANNELS is defined in asm/scatterlist.h of the powerpc
architecture.  Rename this #define in xgbe.h to avoid the
redefined warning issued during compilation.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 00:56:41 -07:00
Ben Hutchings 581d9baa21 farsync: Fix confusion about DMA address and buffer offset types
Use dma_addr_t for DMA address parameters and u32 for shared memory
offset parameters.

Do not assume that dma_addr_t is the same as unsigned long; it will
not be in PAE configurations.  Truncate DMA addresses to 32 bits when
printing them.  This is OK because the DMA mask for this device is
32-bit (per default).

Also rename the DMA address parameters from 'skb' to 'dma'.

Compile-tested only.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 00:45:29 -07:00
David S. Miller 7e1a61b6c1 Merge branch 'mlx4'
Or Gerlitz says:

====================
mlx4 SRIOV fixes

The patch from Wei Yang is a designed fix to a regression introduced by earlier commit
of him. Jack added a fix to the resource management which we got from IBM.

Let's get that into 3.16-rc1 1st and later see to what stable version/s this should go.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 00:32:53 -07:00
Wei Yang da1de8dfff net/mlx4_core: Keep only one driver entry release mlx4_priv
Following commit befdf89 "net/mlx4_core: Preserve pci_dev_data after
__mlx4_remove_one()", there are two mlx4 pci callbacks which will
attempt to release the mlx4_priv object -- .shutdown and .remove.

This leads to a use-after-free access to the already freed mlx4_priv
instance and trigger a "Kernel access of bad area" crash when both
.shutdown and .remove are called.

During reboot or kexec, .shutdown is called, with the VFs probed to
the host going through shutdown first and then the PF. Later, the PF
will trigger VFs' .remove since VFs still have driver attached.

Fix that by keeping only one driver entry which releases mlx4_priv.

Fixes: befdf89 ('net/mlx4_core: Preserve pci_dev_data after __mlx4_remove_one()')
CC: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 00:32:46 -07:00
Jack Morgenstein 95646373c9 net/mlx4_core: Fix SRIOV free-pool management when enforcing resource quotas
The Hypervisor driver tracks free slots and reserved slots at the global level
and tracks allocated slots and guaranteed slots per VF.

Guaranteed slots are treated as reserved by the driver, so the total
reserved slots is the sum of all guaranteed slots over all the VFs.

As VFs allocate resources, free (global) is decremented and allocated (per VF)
is incremented for those resources. However, reserved (global) is never changed.

This means that effectively, when a VF allocates a resource from its
guaranteed pool, it is actually reducing that resource's free pool (since
the global reserved count was not also reduced).

The fix for this problem is the following: For each resource, as long as a
VF's allocated count is <= its guaranteed number, when allocating for that
VF, the reserved count (global) should be reduced by the allocation as well.

When the global reserved count reaches zero, the remaining global free count
is still accessible as the free pool for that resource.

When the VF frees resources, the reverse happens: the global reserved count
for a resource is incremented only once the VFs allocated number falls below
its guaranteed number.

This fix was developed by Rick Kready <kready@us.ibm.com>

Reported-by: Rick Kready <kready@us.ibm.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 00:32:46 -07:00
Rickard Strandqvist f05d435bde net: wimax: i2400m: control.c: Cleaning up conjunction always evaluates to false
Logical conjunction always evaluates to false:  minor < 2 && minor > 1
I guess what you wanted is rather: minor > 2 || minor < 1

This was partly found using a static code analysis program called cppcheck.

Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 00:13:16 -07:00
Rickard Strandqvist 655aa39306 net: ethernet: toshiba: ps3_gelic_net.c: Cleaning up a check on a memory allocation
A check on a memory allocation is checked incorrectly.

This was partly found using a static code analysis program called cppcheck.

Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se>
Acked-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 00:13:16 -07:00
françois romieu a25aafaa5b amd-xgbe: fix unused variable compilation warning in phylib driver
Fix following compilation warning:
[...]
  CC      drivers/net/phy/amd-xgbe-phy.o
drivers/net/phy/amd-xgbe-phy.c:1353:30: warning:
‘amd_xgbe_phy_ids’ defined but not used [-Wunused-variable]
 static struct mdio_device_id amd_xgbe_phy_ids[] = {
                              ^
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 00:13:16 -07:00
Alexei Starovoitov df6d0f983a net: filter: fix nlattr and nlattr_nest BPF tests
- 'struct nlattr' must be 2 byte aligned
- provide big-endian input data for nlattr/nlattr_nest tests

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 00:13:16 -07:00
Alexei Starovoitov e430f34ee5 net: filter: cleanup A/X name usage
The macro 'A' used in internal BPF interpreter:
 #define A regs[insn->a_reg]
was easily confused with the name of classic BPF register 'A', since
'A' would mean two different things depending on context.

This patch is trying to clean up the naming and clarify its usage in the
following way:

- A and X are names of two classic BPF registers

- BPF_REG_A denotes internal BPF register R0 used to map classic register A
  in internal BPF programs generated from classic

- BPF_REG_X denotes internal BPF register R7 used to map classic register X
  in internal BPF programs generated from classic

- internal BPF instruction format:
struct sock_filter_int {
        __u8    code;           /* opcode */
        __u8    dst_reg:4;      /* dest register */
        __u8    src_reg:4;      /* source register */
        __s16   off;            /* signed offset */
        __s32   imm;            /* signed immediate constant */
};

- BPF_X/BPF_K is 1 bit used to encode source operand of instruction
In classic:
  BPF_X - means use register X as source operand
  BPF_K - means use 32-bit immediate as source operand
In internal:
  BPF_X - means use 'src_reg' register as source operand
  BPF_K - means use 32-bit immediate as source operand

Suggested-by: Chema Gonzalez <chema@google.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Chema Gonzalez <chema@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 00:13:16 -07:00
David S. Miller 7b0dcbd879 Merge branch 'bridge_multicast_exports'
Linus Lüssing says:

====================
bridge: multicast snooping patches / exports

The first patch is simply a cosmetic patch. So far I (and maybe others
too?) have been regularly confusing these two structs, therefore I'd
suggest renaming them and therefore making the follow-up patches easier
to understand and nicer to fit in.

The second patch fixes a minor issue, but probably not worth for stable.

On the other hand the first two patches are also preparations for the
third and fourth patch:

These two patches are exporting functionality needed to marry the bridge
multicast snooping with the batman-adv multicast optimizations recently
added for the 3.15 kernel, allowing to use these optimzations in common
setups having a bridge on top of e.g. bat0, too. So far these bridged
setups would fall back to simple flooding through the batman-adv mesh
network for any multicast packet entering bat0.

More information about the batman-adv multicast optimizations currently
implemented can be found here:

http://www.open-mesh.org/projects/batman-adv/wiki/Basic-multicast-optimizations

The integration on the batman-adv side could afterwards look like this,
for instance:

http://git.open-mesh.org/batman-adv.git/commitdiff/576b59dd3e34737c702e548b21fa72059262f796?hp=f95ce7131746c65fbcdffcf2089cab59e2c2f7ac
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-10 23:51:00 -07:00
Linus Lüssing 2cd4143192 bridge: memorize and export selected IGMP/MLD querier port
Adding bridge support to the batman-adv multicast optimization requires
batman-adv knowing about the existence of bridged-in IGMP/MLD queriers
to be able to reliably serve any multicast listener behind this same
bridge.

Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-10 23:50:47 -07:00
Linus Lüssing 07f8ac4a1e bridge: add export of multicast database adjacent to net_dev
With this new, exported function br_multicast_list_adjacent(net_dev) a
list of IPv4/6 addresses is returned. This list contains all multicast
addresses sensed by the bridge multicast snooping feature on all bridge
ports of the bridge interface of net_dev, excluding addresses from the
specified net_device itself.

Adding bridge support to the batman-adv multicast optimization requires
batman-adv knowing about the existence of bridged-in multicast
listeners to be able to reliably serve them with multicast packets.

Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-10 23:50:47 -07:00
Linus Lüssing dc4eb53a99 bridge: adhere to querier election mechanism specified by RFCs
MLDv1 (RFC2710 section 6), MLDv2 (RFC3810 section 7.6.2), IGMPv2
(RFC2236 section 3) and IGMPv3 (RFC3376 section 6.6.2) specify that the
querier with lowest source address shall become the selected
querier.

So far the bridge stopped its querier as soon as it heard another
querier regardless of its source address. This results in the "wrong"
querier potentially becoming the active querier or a potential,
unnecessary querying delay.

With this patch the bridge memorizes the source address of the currently
selected querier and ignores queries from queriers with a higher source
address than the currently selected one. This slight optimization is
supposed to make it more RFC compliant (but is rather uncritical and
therefore probably not necessary to be queued for stable kernels).

Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-10 23:50:47 -07:00
Linus Lüssing 90010b36eb bridge: rename struct bridge_mcast_query/querier
The current naming of these two structs is very random, in that
reversing their naming would not make any semantical difference.

This patch tries to make the naming less confusing by giving them a more
specific, distinguishable naming.

This is also useful for the upcoming patches reintroducing the
"struct bridge_mcast_querier" but for storing information about the
selected querier (no matter if our own or a foreign querier).

Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-10 23:50:46 -07:00
David S. Miller c4d4c255d8 Merge branch 'cxgb4'
Hariprasad Shenai says:

====================
Adds support for CIQ and other misc. fixes for rdma/cxgb4

This patch series adds support to allocate and use IQs specifically for
indirect interrupts, adds fixes to align ISS for iWARP connections & fixes
related to tcp snd/rvd window for Chelsio T4/T5 adapters on iw_cxgb4.
Also changes Interrupt Holdoff Packet Count threshold of response queues for
cxgb4 driver.

The patches series is created against 'net-next' tree.
And includes patches on cxgb4 and iw_cxgb4 driver.

Since this patch-series contains cxgb4 and iw_cxgb4 patches, we would like to
request this patch series to get merged via David Miller's 'net-next' tree.

We have included all the maintainers of respective drivers. Kindly review the
change and let us know in case of any review comments.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-10 22:49:59 -07:00
Hariprasad Shenai c887ad0e22 cxgb4: Change default Interrupt Holdoff Packet Count Threshold
Based on original work by Casey Leedom <leedom@chelsio.com>

Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-10 22:49:55 -07:00