Commit Graph

603203 Commits

Author SHA1 Message Date
Nikolay Aleksandrov 565ce8f32a net: bridge: fix vlan stats continue counter
I made a dumb off-by-one mistake when I added the vlan stats counter
dumping code. The increment should happen before the check, not after
otherwise we miss one entry when we continue dumping.

Fixes: a60c090361 ("bridge: netlink: export per-vlan stats")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 05:33:35 -04:00
Eric Dumazet a3d2e9f8eb tcp: do not send too big packets at retransmit time
Arjun reported a bug in TCP stack and bisected it to a recent commit.

In case where we process SACK, we can coalesce multiple skbs
into fat ones (tcp_shift_skb_data()), to lower write queue
overhead, because we do not expect to retransmit these packets.

However, SACK reneging can happen, forcing the sender to retransmit
all these packets. If skb->len is above 64KB, we then send buggy
IP packets that could hang TSO engine on cxgb4.

Neal suggested to use tcp_tso_autosize() instead of tp->gso_segs
so that we cook packets of optimal size vs TCP/pacing.

Thanks to Arjun for reporting the bug and running the tests !

Fixes: 10d3be5692 ("tcp-tso: do not split TSO packets at retransmit time")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Arjun V <arjun@chelsio.com>
Tested-by: Arjun V <arjun@chelsio.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 05:25:11 -04:00
Wei Yongjun 96183182ad ibmvnic: fix to use list_for_each_safe() when delete items
Since we will remove items off the list using list_del() we need
to use a safe version of the list_for_each() macro aptly named
list_for_each_safe().

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 05:23:42 -04:00
Rodrigo Vivi 6b9dc6a474 drm/i915: Removing PCI IDs that are no longer listed as Kabylake.
This is unusual. Usually IDs listed on early stages of platform
definition are kept there as reserved for later use.

However these IDs here are not listed anymore in any of steppings
and devices IDs tables for Kabylake on configurations overview
section of BSpec.

So it is better removing them before they become used in any
other future platform.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1466718636-19675-2-git-send-email-rodrigo.vivi@intel.com
(cherry picked from commit a922eb8d45)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2016-06-29 12:19:13 +03:00
Rodrigo Vivi 42482b4546 drm/i915: Add more Kabylake PCI IDs.
The spec has been updated adding new PCI IDs.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1466718636-19675-1-git-send-email-rodrigo.vivi@intel.com
(cherry picked from commit 33d9391d30)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2016-06-29 12:18:49 +03:00
David S. Miller b2c1b30e3e Merge branch 'thunderx-fixes'
Sunil Goutham says:

====================
net: thunderx: Miscellaneous fixes

This 2 patch series fixes issues w.r.t physical link status
reporting and transmit datapath configuration for
secondary qsets.

Changes from v1:
Fixed lmac disable sequence for interfaces of type SGMII.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 05:14:19 -04:00
Sunil Goutham 3e29adba56 net: thunderx: Fix TL4 configuration for secondary Qsets
TL4 calculation for a given SQ of secondary Qsets is incorrect
and goes out of bounds and also for some SQ's TL4 chosen will
transmit data via a different BGX interface and not same as
primary Qset's interface.

This patch fixes this issue.

Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 05:14:13 -04:00
Sunil Goutham 3f4c68cfde net: thunderx: Fix link status reporting
Check for SMU RX local/remote faults along with SPU LINK
status. Otherwise at times link is UP at our end but DOWN
at link partner's side. Also due to an issue in BGX it's
rarely seen that initialization doesn't happen properly
and SMU RX reports faults with everything fine at SPU.
This patch tries to reinitialize LMAC to fix it.

Also fixed LMAC disable sequence to properly bring down link.

Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: Tao Wang <tao.wang@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 05:14:13 -04:00
David S. Miller f5074d0ce2 Merge branch 'mlx5-100G-fixes'
Saeed Mahameed says:

====================
Mellanox 100G mlx5 fixes#2 for 4.7-rc

The following series provides one-liners fixes for mlx5 driver plus one
medium patch to reorganize ethtool counters reporting.

Highlights:
	- Added MODIFY_FLOW_TABLE to command strings table
	- Add ConnectX-5 PCIe 4.0 to list of supported devices
	- Rename ASYNC_EVENTS enum
	- Enable BlueFlame only when supported by device
	- Avoid adding same vxlan port twice
	- Report the correct number of PFC counters
	- Reorganize ethtool reported counters and remove duplications
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 04:28:59 -04:00
Gal Pressman bfe6d8d1d4 net/mlx5e: Reorganize ethtool statistics
Categorize and reorganize ethtool statistics counters by renaming to
"rx_*" and "tx_*" and removing redundant and duplicated counters, this
way they are easier to grasp and more user friendly.

Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 04:28:47 -04:00
Gal Pressman ed80ec4c17 net/mlx5e: Fix number of PFC counters reported to ethtool
Number of PFC counters used to count only number of priorities with PFC
enabled, but each priority has more than one counter, hence the need to
multiply it by the number of PFC counters per priority.

Fixes: cf678570d5 ('net/mlx5e: Add per priority group to PPort counters')
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 04:28:46 -04:00
Matthew Finlay 9ceec359e4 net/mlx5e: Prevent adding the same vxlan port
Do not allow the same vxlan udp port to be added to the device more than
once.

Fixes: b3f63c3d5e ("net/mlx5e: Add netdev support for VXLAN tunneling")
Signed-off-by: Matthew Finlay <matt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 04:28:46 -04:00
Gal Pressman fd4782c213 net/mlx5e: Check for BlueFlame capability before allocating SQ uar
Previous to this patch mapping was always set to write combining without
checking whether BlueFlame is supported in the device.

Fixes: 0ba422410b ('net/mlx5: Fix global UAR mapping')
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 04:28:46 -04:00
Eli Cohen e0f46eb9f6 net/mlx5e: Change enum to better reflect usage
Change MLX5E_STATE_ASYNC_EVENTS_ENABLE to
MLX5E_STATE_ASYNC_EVENTS_ENABLED since it represent a state and not an
operation.

Fixes: acff797cd1 ('net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet functionality')
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 04:28:46 -04:00
Majd Dibbiny 7092fe8669 net/mlx5: Add ConnectX-5 PCIe 4.0 to list of supported devices
Add the upcoming ConnectX-5 PCIe 4.0 device to the list of
supported devices by the mlx5 driver.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 04:28:46 -04:00
Eli Cohen 5be1ea899d net/mlx5: Update command strings
Add command string for MODIFY_FLOW_TABLE which is used by the driver.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 04:28:46 -04:00
Imre Deak ba34a65324 drm/i915: Avoid early timeout during AUX transfers
Since wait_for_atomic doesn't re-check the wait-for condition after
expiry of the timeout it can fail when called from non-atomic context
even if the condition is set correctly before the expiry. Fix this by
using the non-atomic wait_for instead.

Due to the relatively long 10ms timeout, probably this didn't cause any
real problems, but fix it in any case for consistency.

Fixes: 0351b93992 ("drm/i915: Do not lie about atomic timeout granularity")
CC: Chris Wilson <chris@chris-wilson.co.uk>
CC: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
CC: drm-intel-fixes@lists.freedesktop.org
Link: http://patchwork.freedesktop.org/patch/msgid/1467110253-16046-5-git-send-email-imre.deak@intel.com
(cherry picked from commit 713a6b6689)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2016-06-29 11:18:16 +03:00
Imre Deak fa7c13e5b1 drm/i915/hsw: Avoid early timeout during LCPLL disable/restore
Since wait_for_atomic doesn't re-check the wait-for condition after
expiry of the timeout it can fail when called from non-atomic context
even if the condition is set correctly before the expiry. Fix this by
using the non-atomic wait_for instead.

Fixes: 0351b93992 ("drm/i915: Do not lie about atomic timeout granularity")
CC: Chris Wilson <chris@chris-wilson.co.uk>
CC: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
CC: drm-intel-fixes@lists.freedesktop.org
Link: http://patchwork.freedesktop.org/patch/msgid/1467110253-16046-4-git-send-email-imre.deak@intel.com
(cherry picked from commit f53dd63f11)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2016-06-29 11:18:04 +03:00
Imre Deak 11f6145791 drm/i915/lpt: Avoid early timeout during FDI PHY reset
Since wait_for_atomic doesn't re-check the wait-for condition after
expiry of the timeout it can fail when called from non-atomic context
even if the condition is set correctly before the expiry. Fix this by
using the non-atomic wait_for instead.

Fixes: 0351b93992 ("drm/i915: Do not lie about atomic timeout granularity")
CC: Chris Wilson <chris@chris-wilson.co.uk>
CC: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
CC: drm-intel-fixes@lists.freedesktop.org
Link: http://patchwork.freedesktop.org/patch/msgid/1467110253-16046-3-git-send-email-imre.deak@intel.com
(cherry picked from commit cf3598c23c)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2016-06-29 11:17:50 +03:00
Imre Deak ea12e2a0ca drm/i915/bxt: Avoid early timeout during PLL enable
Since wait_for_atomic doesn't re-check the wait-for condition after
expiry of the timeout it can fail when called from non-atomic context
even if the condition is set correctly before the expiry. Fix this by
using the non-atomic wait_for instead.

I noticed this via the PLL locking timing out incorrectly, with this fix
I couldn't reproduce the problem.

Fixes: 0351b93992 ("drm/i915: Do not lie about atomic timeout granularity")
CC: Chris Wilson <chris@chris-wilson.co.uk>
CC: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
CC: drm-intel-fixes@lists.freedesktop.org
Link: http://patchwork.freedesktop.org/patch/msgid/1467110253-16046-2-git-send-email-imre.deak@intel.com
(cherry picked from commit 0b786e41c7)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2016-06-29 11:17:31 +03:00
Ville Syrjälä 664a84d2c7 drm/i915: Refresh cached DP port register value on resume
During hibernation the cached DP port register value will be left with
whatever value we have there when we create the hibernation image.
Currently that means the port (and eDP PLL) will be off in the cached
value. However when we resume there is no guarantee that the value
in the actual register will match the cached value. If i915 isn't
loaded in the kernel that loads the hibernation image, the port may
well be on (eg. left on by the BIOS). The encoder state readout
does the right thing in this case and updates our encoder state
to reflect the actual hardware state. However the post-resume modeset
will then use the stale cached port register value in
intel_dp_link_down() and potentially confuse the hardware.

This was caught by the following assert
 WARNING: CPU: 3 PID: 5288 at ../drivers/gpu/drm/i915/intel_dp.c:2184 assert_edp_pll+0x99/0xa0 [i915]
 eDP PLL state assertion failure (expected on, current off)
on account of the eDP PLL getting prematurely turned off when
shutting down the port, since the DP_PLL_ENABLE bit wasn't set
in the cached register value.

Presumably I introduced this problem in
commit 6fec766283 ("drm/i915: Use intel_dp->DP in eDP PLL setup")
as before that we didn't update the cached value after shuttting the
port down. That's assuming the port got enabled at least once prior
to hibernating. If that didn't happen then the cached value would
still have been totally out of sync with reality (eg. first boot w/o
eDP on, then hibernate, and then resume with eDP on).

So, let's fix this properly and refresh the cached register value from
the hardware register during resume.

DDI platforms shouldn't use the cached value during port disable at
least, so shouldn't have this particular issue. They might still have
issues if we skip the initial modeset and then try to retrain the link
or something. But untangling this DP vs. DDI mess is a bigger topic,
so let's jut punt on DDI for now.

Cc: Jani Nikula <jani.nikula@intel.com>
Cc: stable@vger.kernel.org
Fixes: 6fec766283 ("drm/i915: Use intel_dp->DP in eDP PLL setup")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463162036-27931-1-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
(cherry picked from commit 64989ca4b2)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2016-06-29 11:16:17 +03:00
Harini Katakam 3ec0a0f10c net: marvell: Add separate config ANEG function for Marvell 88E1111
Marvell 88E1111 currently uses the generic marvell config ANEG function.
This function has a sequence accessing Page 5 and Register 31,
both of which are not defined or reserved for this PHY.
Hence this patch adds a new config ANEG function for Marvell 88E1111
without these erroneous accesses.

Signed-off-by: Harini Katakam <harinik@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 04:07:36 -04:00
David S. Miller d10f0b312a Merge branch 'batman-adv-fixes'
Sven Eckelmann says:

====================
batman-adv: Fixes for Linux 4.7

Antonio currently seems to be occupied. This is currently rather unfortunate
because there are patches waiting in the batman-adv development repository
maint(enance) branch [1] since up to 6 weeks. I am now getting asked when
these patches will hit the distribution kernels and therefore decided to
submit these patches directly to netdev.

The patch from Simon works around the problem that warnings could be triggered
in the translation table code via packets using a VLAN not configured on the
target host. This warning was replaced with a rate limited info message.

Ben Hutchings found an superfluous batadv_softif_vlan_put in the error
handling code of the translation table while he backported the "batman-adv:
Fix reference counting of vlan object for tt_local_entry" patch to the stable
kernels. He noticed correctly that this batadv_softif_vlan_put should also
have been removed by the said patch.

The most requested fix at the moment is related to a double free in the
translation table code. It is a race condition which mostly happens on systems
with multiple cores and multiple network interface attached to batman-adv. Two
Freifunk communities which were haunted by weird crashes (with backtraces
reporting problems in other parts of the kernel) were kind enough to test this
patch. They reported that there systems are now running stable after applying
this patch.

An invalid memory access was detected in the batadv_icmp_packet_rr handling
code when receiving a skbuff with fragments. The last patch is fixing a memory
leak when the interface is removed via .dellink. The code to fix it was copied
from the code handling the legacy sysfs interface to remove netdevices from a
batman-adv netdevice.

There are still 28 patches in the development tree for v4.8 but I will leave
them to Antonio because these are cleanups and features and therefore for net-
next.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 04:01:53 -04:00
Sven Eckelmann 420cb1b764 batman-adv: Clean up untagged vlan when destroying via rtnl-link
The untagged vlan object is only destroyed when the interface is removed
via the legacy sysfs interface. But it also has to be destroyed when the
standard rtnl-link interface is used.

Fixes: 5d2c05b213 ("batman-adv: add per VLAN interface attribute framework")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Acked-by: Antonio Quartulli <a@unstable.cc>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 04:01:48 -04:00
Sven Eckelmann 3b55e44220 batman-adv: Fix ICMP RR ethernet access after skb_linearize
The skb_linearize may reallocate the skb. This makes the calculated pointer
for ethhdr invalid. But it the pointer is used later to fill in the RR
field of the batadv_icmp_packet_rr packet.

Instead re-evaluate eth_hdr after the skb_linearize+skb_cow to fix the
pointer and avoid the invalid read.

Fixes: da6b8c20a5 ("batman-adv: generalize batman-adv icmp packet handling")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 04:01:48 -04:00
Ben Hutchings baceced932 batman-adv: Fix double-put of vlan object
Each batadv_tt_local_entry hold a single reference to a
batadv_softif_vlan.  In case a new entry cannot be added to the hash
table, the error path puts the reference, but the reference will also
now be dropped by batadv_tt_local_entry_release().

Fixes: a33d970d0b ("batman-adv: Fix reference counting of vlan object for tt_local_entry")
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 04:01:47 -04:00
Sven Eckelmann 9c4604a298 batman-adv: Fix use-after-free/double-free of tt_req_node
The tt_req_node is added and removed from a list inside a spinlock. But the
locking is sometimes removed even when the object is still referenced and
will be used later via this reference. For example batadv_send_tt_request
can create a new tt_req_node (including add to a list) and later
re-acquires the lock to remove it from the list and to free it. But at this
time another context could have already removed this tt_req_node from the
list and freed it.

CPU#0

    batadv_batman_skb_recv from net_device 0
    -> batadv_iv_ogm_receive
      -> batadv_iv_ogm_process
        -> batadv_iv_ogm_process_per_outif
          -> batadv_tvlv_ogm_receive
            -> batadv_tvlv_ogm_receive
              -> batadv_tvlv_containers_process
                -> batadv_tvlv_call_handler
                  -> batadv_tt_tvlv_ogm_handler_v1
                    -> batadv_tt_update_orig
                      -> batadv_send_tt_request
                        -> batadv_tt_req_node_new
                           spin_lock(...)
                           allocates new tt_req_node and adds it to list
                           spin_unlock(...)
                           return tt_req_node

CPU#1

    batadv_batman_skb_recv from net_device 1
    -> batadv_recv_unicast_tvlv
      -> batadv_tvlv_containers_process
        -> batadv_tvlv_call_handler
          -> batadv_tt_tvlv_unicast_handler_v1
            -> batadv_handle_tt_response
               spin_lock(...)
               tt_req_node gets removed from list and is freed
               spin_unlock(...)

CPU#0

                      <- returned to batadv_send_tt_request
                         spin_lock(...)
                         tt_req_node gets removed from list and is freed
                         MEMORY CORRUPTION/SEGFAULT/...
                         spin_unlock(...)

This can only be solved via reference counting to allow multiple contexts
to handle the list manipulation while making sure that only the last
context holding a reference will free the object.

Fixes: a73105b8d4 ("batman-adv: improved client announcement mechanism")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Tested-by: Martin Weinelt <martin@darmstadt.freifunk.net>
Tested-by: Amadeus Alfa <amadeus@chemnitz.freifunk.net>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 04:01:47 -04:00
Simon Wunderlich 0b3dd7dfb8 batman-adv: replace WARN with rate limited output on non-existing VLAN
If a VLAN tagged frame is received and the corresponding VLAN is not
configured on the soft interface, it will splat a WARN on every packet
received. This is a quite annoying behaviour for some scenarios, e.g. if
bat0 is bridged with eth0, and there are arbitrary VLAN tagged frames
from Ethernet coming in without having any VLAN configuration on bat0.

The code should probably create vlan objects on the fly and
transparently transport these VLAN-tagged Ethernet frames, but until
this is done, at least the WARN splat should be replaced by a rate
limited output.

Fixes: 354136bcc3 ("batman-adv: fix kernel crash due to missing NULL checks")
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 04:01:47 -04:00
Florian Fainelli 69fc58a57e net: phy: Manage fixed PHY address space using IDA
If we have a system which uses fixed PHY devices and calls
fixed_phy_register() then fixed_phy_unregister() we can exhaust the
number of fixed PHYs available after a while, since we keep incrementing
the variable phy_fixed_addr, but we never decrement it.

This patch fixes that by converting the fixed PHY allocation to using
IDA, which takes care of the allocation/dealloaction of the PHY
addresses for us.

Fixes: a759512174 ("net: phy: extend fixed driver with fixed_phy_register()")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 03:51:40 -04:00
Miklos Szeredi a4859d7594 ovl: fix dentry leak for default_permissions
When using the 'default_permissions' mount option, ovl_permission() on
non-directories was missing a dput(alias), resulting in "BUG Dentry still
in use".

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 8d3095f4ad ("ovl: default permissions")
Cc: <stable@vger.kernel.org> # v4.5+
2016-06-29 08:26:59 +02:00
Michael Neuling 190ce8693c powerpc/tm: Avoid SLB faults in treclaim/trecheckpoint when RI=0
Currently we have 2 segments that are bolted for the kernel linear
mapping (ie 0xc000... addresses). This is 0 to 1TB and also the kernel
stacks. Anything accessed outside of these regions may need to be
faulted in. (In practice machines with TM always have 1T segments)

If a machine has < 2TB of memory we never fault on the kernel linear
mapping as these two segments cover all physical memory. If a machine
has > 2TB of memory, there may be structures outside of these two
segments that need to be faulted in. This faulting can occur when
running as a guest as the hypervisor may remove any SLB that's not
bolted.

When we treclaim and trecheckpoint we have a window where we need to
run with the userspace GPRs. This means that we no longer have a valid
stack pointer in r1. For this window we therefore clear MSR RI to
indicate that any exceptions taken at this point won't be able to be
handled. This means that we can't take segment misses in this RI=0
window.

In this RI=0 region, we currently access the thread_struct for the
process being context switched to or from. This thread_struct access
may cause a segment fault since it's not guaranteed to be covered by
the two bolted segment entries described above.

We've seen this with a crash when running as a guest with > 2TB of
memory on PowerVM:

  Unrecoverable exception 4100 at c00000000004f138
  Oops: Unrecoverable exception, sig: 6 [#1]
  SMP NR_CPUS=2048 NUMA pSeries
  CPU: 1280 PID: 7755 Comm: kworker/1280:1 Tainted: G                 X 4.4.13-46-default #1
  task: c000189001df4210 ti: c000189001d5c000 task.ti: c000189001d5c000
  NIP: c00000000004f138 LR: 0000000010003a24 CTR: 0000000010001b20
  REGS: c000189001d5f730 TRAP: 4100   Tainted: G                 X  (4.4.13-46-default)
  MSR: 8000000100001031 <SF,ME,IR,DR,LE>  CR: 24000048  XER: 00000000
  CFAR: c00000000004ed18 SOFTE: 0
  GPR00: ffffffffc58d7b60 c000189001d5f9b0 00000000100d7d00 000000003a738288
  GPR04: 0000000000002781 0000000000000006 0000000000000000 c0000d1f4d889620
  GPR08: 000000000000c350 00000000000008ab 00000000000008ab 00000000100d7af0
  GPR12: 00000000100d7ae8 00003ffe787e67a0 0000000000000000 0000000000000211
  GPR16: 0000000010001b20 0000000000000000 0000000000800000 00003ffe787df110
  GPR20: 0000000000000001 00000000100d1e10 0000000000000000 00003ffe787df050
  GPR24: 0000000000000003 0000000000010000 0000000000000000 00003fffe79e2e30
  GPR28: 00003fffe79e2e68 00000000003d0f00 00003ffe787e67a0 00003ffe787de680
  NIP [c00000000004f138] restore_gprs+0xd0/0x16c
  LR [0000000010003a24] 0x10003a24
  Call Trace:
  [c000189001d5f9b0] [c000189001d5f9f0] 0xc000189001d5f9f0 (unreliable)
  [c000189001d5fb90] [c00000000001583c] tm_recheckpoint+0x6c/0xa0
  [c000189001d5fbd0] [c000000000015c40] __switch_to+0x2c0/0x350
  [c000189001d5fc30] [c0000000007e647c] __schedule+0x32c/0x9c0
  [c000189001d5fcb0] [c0000000007e6b58] schedule+0x48/0xc0
  [c000189001d5fce0] [c0000000000deabc] worker_thread+0x22c/0x5b0
  [c000189001d5fd80] [c0000000000e7000] kthread+0x110/0x130
  [c000189001d5fe30] [c000000000009538] ret_from_kernel_thread+0x5c/0xa4
  Instruction dump:
  7cb103a6 7cc0e3a6 7ca222a6 78a58402 38c00800 7cc62838 08860000 7cc000a6
  38a00006 78c60022 7cc62838 0b060000 <e8c701a0> 7ccff120 e8270078 e8a70098
  ---[ end trace 602126d0a1dedd54 ]---

This fixes this by copying the required data from the thread_struct to
the stack before we clear MSR RI. Then once we clear RI, we only access
the stack, guaranteeing there's no segment miss.

We also tighten the region over which we set RI=0 on the treclaim()
path. This may have a slight performance impact since we're adding an
mtmsr instruction.

Fixes: 090b9284d7 ("powerpc/tm: Clear MSR RI in non-recoverable TM code")
Signed-off-by: Michael Neuling <mikey@neuling.org>
Reviewed-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-29 16:19:25 +10:00
Trond Myklebust e547f26283 NFS: Fix another OPEN_DOWNGRADE bug
Olga Kornievskaia reports that the following test fails to trigger
an OPEN_DOWNGRADE on the wire, and only triggers the final CLOSE.

	fd0 = open(foo, RDRW)   -- should be open on the wire for "both"
	fd1 = open(foo, RDONLY)  -- should be open on the wire for "read"
	close(fd0) -- should trigger an open_downgrade
	read(fd1)
	close(fd1)

The issue is that we're missing a check for whether or not the current
state transitioned from an O_RDWR state as opposed to having transitioned
from a combination of O_RDONLY and O_WRONLY.

Reported-by: Olga Kornievskaia <aglo@umich.edu>
Fixes: cd9288ffae ("NFSv4: Fix another bug in the close/open_downgrade code")
Cc: stable@vger.kernel.org # 2.6.33+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-06-28 16:55:34 -04:00
Richard Guy Briggs 3f5be2da85 audit: move audit_get_tty to reduce scope and kabi changes
The only users of audit_get_tty and audit_put_tty are internal to
audit, so move it out of include/linux/audit.h to kernel.h and create
a proper function rather than inlining it.  This also reduces kABI
changes.

Suggested-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
[PM: line wrapped description]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-06-28 15:48:48 -04:00
Richard Guy Briggs 76a658c20e audit: move calcs after alloc and check when logging set loginuid
Move the calculations of values after the allocation in case the
allocation fails.  This avoids wasting effort in the rare case that it
fails, but more importantly saves us extra logic to release the tty
ref.

Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-06-28 15:40:17 -04:00
Linus Torvalds de4921ce9b Merge branch 'for-4.7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata
Pull libata fixes from Tejun Heo:
 "Two trivial fixes - one for a bug in the allocation failure path and
  the other a compiler warning fix"

* 'for-4.7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
  ata: sata_mv: fix mis-conversion in mv_write_cached_reg()
  ata: fix return value check in ahci_seattle_get_port_info()
2016-06-28 12:11:31 -07:00
Linus Torvalds 595d9e34ee Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
Pull HID fix from Jiri Kosina:
 "Regression fix for multitouch palm rejection from Allen Hung"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
  HID: multitouch: enable palm rejection for Windows Precision Touchpad
  Revert "HID: multitouch: enable palm rejection if device implements confidence usage"
2016-06-28 12:01:14 -07:00
Willem de Bruijn 9a0fee2b55 sock_diag: do not broadcast raw socket destruction
Diag intends to broadcast tcp_sk and udp_sk socket destruction.
Testing sk->sk_protocol for IPPROTO_TCP/IPPROTO_UDP alone is not
sufficient for this. Raw sockets can have the same type.

Add a test for sk->sk_type.

Fixes: eb4cb00852 ("sock_diag: define destruction multicast groups")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-28 09:08:51 -04:00
Aaron Campbell ab8ed95108 connector: fix out-of-order cn_proc netlink message delivery
The proc connector messages include a sequence number, allowing userspace
programs to detect lost messages.  However, performing this detection is
currently more difficult than necessary, since netlink messages can be
delivered to the application out-of-order.  To fix this, leave pre-emption
disabled during cn_netlink_send(), and use GFP_NOWAIT.

The following was written as a test case.  Building the kernel w/ make -j32
proved a reliable way to generate out-of-order cn_proc messages.

int
main(int argc, char *argv[])
{
	static uint32_t last_seq[CPU_SETSIZE], seq;
	int cpu, fd;
	struct sockaddr_nl sa;
	struct __attribute__((aligned(NLMSG_ALIGNTO))) {
		struct nlmsghdr nl_hdr;
		struct __attribute__((__packed__)) {
			struct cn_msg cn_msg;
			struct proc_event cn_proc;
		};
	} rmsg;
	struct __attribute__((aligned(NLMSG_ALIGNTO))) {
		struct nlmsghdr nl_hdr;
		struct __attribute__((__packed__)) {
			struct cn_msg cn_msg;
			enum proc_cn_mcast_op cn_mcast;
		};
	} smsg;

	fd = socket(PF_NETLINK, SOCK_DGRAM, NETLINK_CONNECTOR);
	if (fd < 0) {
		perror("socket");
	}

	sa.nl_family = AF_NETLINK;
	sa.nl_groups = CN_IDX_PROC;
	sa.nl_pid = getpid();
	if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
		perror("bind");
	}

	memset(&smsg, 0, sizeof(smsg));
	smsg.nl_hdr.nlmsg_len = sizeof(smsg);
	smsg.nl_hdr.nlmsg_pid = getpid();
	smsg.nl_hdr.nlmsg_type = NLMSG_DONE;
	smsg.cn_msg.id.idx = CN_IDX_PROC;
	smsg.cn_msg.id.val = CN_VAL_PROC;
	smsg.cn_msg.len = sizeof(enum proc_cn_mcast_op);
	smsg.cn_mcast = PROC_CN_MCAST_LISTEN;
	if (send(fd, &smsg, sizeof(smsg), 0) != sizeof(smsg)) {
		perror("send");
	}

	while (recv(fd, &rmsg, sizeof(rmsg), 0) == sizeof(rmsg)) {
		cpu = rmsg.cn_proc.cpu;
		if (cpu < 0) {
			continue;
		}
		seq = rmsg.cn_msg.seq;
		if ((last_seq[cpu] != 0) && (seq != last_seq[cpu] + 1)) {
			printf("out-of-order seq=%d on cpu=%d\n", seq, cpu);
		}
		last_seq[cpu] = seq;
	}

	/* NOTREACHED */

	perror("recv");

	return -1;
}

Signed-off-by: Aaron Campbell <aaron@monkey.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-28 08:48:33 -04:00
daniel 0888d5f3c0 Bridge: Fix ipv6 mc snooping if bridge has no ipv6 address
The bridge is falsly dropping ipv6 mulitcast packets if there is:
 1. No ipv6 address assigned on the brigde.
 2. No external mld querier present.
 3. The internal querier enabled.

When the bridge fails to build mld queries, because it has no
ipv6 address, it slilently returns, but keeps the local querier enabled.
This specific case causes confusing packet loss.

Ipv6 multicast snooping can only work if:
 a) An external querier is present
 OR
 b) The bridge has an ipv6 address an is capable of sending own queries

Otherwise it has to forward/flood the ipv6 multicast traffic,
because snooping cannot work.

This patch fixes the issue by adding a flag to the bridge struct that
indicates that there is currently no ipv6 address assinged to the bridge
and returns a false state for the local querier in
__br_multicast_querier_exists().

Special thanks to Linus Lüssing.

Fixes: d1d81d4c3d ("bridge: check return value of ipv6_dev_get_saddr()")
Signed-off-by: Daniel Danzberger <daniel@dd-wrt.com>
Acked-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-28 08:03:04 -04:00
Allen Hung 6dd2e27a10 HID: multitouch: enable palm rejection for Windows Precision Touchpad
The usage Confidence is mandary to Windows Precision Touchpad devices. If
it is examined in input_mapping on a WIndows Precision Touchpad, a new add
quirk MT_QUIRK_CONFIDENCE desgned for such devices will be applied to the
device. A touch with the confidence bit is not set is determined as
invalid.

Tested on Dell XPS13 9343

Cc: stable@vger.kernel.org # v4.5+
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Tested-by: Andy Lutomirski <luto@kernel.org> # XPS 13 9350, BIOS 1.4.3
Signed-off-by: Allen Hung <allen_hung@dell.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2016-06-28 13:24:14 +02:00
Allen Hung 62630ea768 Revert "HID: multitouch: enable palm rejection if device implements confidence usage"
This reverts commit 25a84db15b ("HID: multitouch: enable palm rejection
if device implements confidence usage")

The commit enables palm rejection for Win8 Precision Touchpad devices but
the quirk MT_QUIRK_VALID_IS_CONFIDENCE it is using is not working very
properly. This quirk is originally designed for some WIn7 touchscreens. Use
of this for a Win8 Precision Touchpad will cause unexpected pointer jumping
problem.

Cc: stable@vger.kernel.org # v4.5+
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Tested-by: Andy Lutomirski <luto@kernel.org> # XPS 13 9350, BIOS 1.4.3
Signed-off-by: Allen Hung <allen_hung@dell.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2016-06-28 13:24:14 +02:00
Gavin Shan cca0e542e0 powerpc/eeh: Fix wrong argument passed to eeh_rmv_device()
When calling eeh_rmv_device() in eeh_reset_device() for partial hotplug
case, @rmv_data instead of its address is the proper argument.
Otherwise, the stack frame is corrupted when writing to
@rmv_data (actually its address) in eeh_rmv_device(). It results in
kernel crash as observed.

This fixes the issue by passing @rmv_data, not its address to
eeh_rmv_device() in eeh_reset_device().

Fixes: 67086e32b5 ("powerpc/eeh: powerpc/eeh: Support error recovery for VF PE")
Reported-by: Pridhiviraj Paidipeddi <ppaidipe@in.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-28 20:47:49 +10:00
Jouni Malinen 126e755732 mac80211: Fix mesh estab_plinks counting in STA removal case
If a user space program (e.g., wpa_supplicant) deletes a STA entry that
is currently in NL80211_PLINK_ESTAB state, the number of established
plinks counter was not decremented and this could result in rejecting
new plink establishment before really hitting the real maximum plink
limit. For !user_mpm case, this decrementation is handled by
mesh_plink_deactive().

Fix this by decrementing estab_plinks on STA deletion
(mesh_sta_cleanup() gets called from there) so that the counter has a
correct value and the Beacon frame advertisement in Mesh Configuration
element shows the proper value for capability to accept additional
peers.

Cc: stable@vger.kernel.org
Signed-off-by: Jouni Malinen <j@w1.fi>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
2016-06-28 12:39:50 +02:00
Wang Sheng-Hui f299a02d5f net/mlx5: use mlx5_buf_alloc_node instead of mlx5_buf_alloc in mlx5_wq_ll_create
Commit 311c7c71c9 ("net/mlx5e: Allocate DMA coherent memory on
reader NUMA node") introduced mlx5_*_alloc_node() but missed changing
some calling and warn messages. This patch introduces 2 changes:
	* Use mlx5_buf_alloc_node() instead of mlx5_buf_alloc() in
	  mlx5_wq_ll_create()
	* Update the failure warn messages with _node postfix for
	  mlx5_*_alloc function names

Fixes: 311c7c71c9 ("net/mlx5e: Allocate DMA coherent memory on reader NUMA node")
Signed-off-by: Wang Sheng-Hui <shhuiw@foxmail.com>
Acked-By: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-28 05:17:38 -04:00
David S. Miller d1b5a8da29 Merge branch 'bgmac-fixes'
Florian Fainelli says:

====================
net: bgmac: Random fixes

This patch series fixes a few issues spotted by code inspection and
actual testing.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-28 04:22:25 -04:00
Florian Fainelli 3894396e64 net: bgmac: Remove superflous netif_carrier_on()
bgmac_open() calls phy_start() to initialize the PHY state machine,
which will set the interface's carrier state accordingly, no need to
force that as this could be conflicting with the PHY state determined by
PHYLIB.

Fixes: dd4544f054 ("bgmac: driver for GBit MAC core on BCMA bus")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-28 04:22:17 -04:00
Florian Fainelli c3897f2a69 net: bgmac: Start transmit queue in bgmac_open
The driver does not start the transmit queue in bgmac_open(). If the
queue was stopped prior to closing then re-opening the interface, we
would never be able to wake-up again.

Fixes: dd4544f054 ("bgmac: driver for GBit MAC core on BCMA bus")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-28 04:22:17 -04:00
Florian Fainelli d2b1323387 net: bgmac: Fix SOF bit checking
We are checking for the Start of Frame bit in the ctl1 word, while this
bit is set in the ctl0 word instead. Read the ctl0 word and update the
check to verify that.

Fixes: 9cde94506e ("bgmac: implement scatter/gather support")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-28 04:22:17 -04:00
Jay Vosburgh 0622cab034 bonding: fix 802.3ad aggregator reselection
Since commit 7bb11dc9f5 ("bonding: unify all places where
actor-oper key needs to be updated."), the logic in bonding to handle
selection between multiple aggregators has not functioned.

	This affects only configurations wherein the bonding slaves
connect to two discrete aggregators (e.g., two independent switches, each
with LACP enabled), thus creating two separate aggregation groups within a
single bond.

	The cause is a change in 7bb11dc9f5 to no longer set
AD_PORT_BEGIN on a port after a link state change, which would cause the
port to be reselected for attachment to an aggregator as if were newly
added to the bond.  We cannot restore the prior behavior, as it
contradicts IEEE 802.1AX 5.4.12, which requires ports that "become
inoperable" (lose carrier, setting port_enabled=false as per 802.1AX
5.4.7) to remain selected (i.e., assigned to the aggregator).  As the port
now remains selected, the aggregator selection logic is not invoked.

	A side effect of this change is that aggregators in bonding will
now contain ports that are link down.  The aggregator selection logic
does not currently handle this situation correctly, causing incorrect
aggregator selection.

	This patch makes two changes to repair the aggregator selection
logic in bonding to function as documented and within the confines of the
standard:

	First, the aggregator selection and related logic now utilizes the
number of active ports per aggregator, not the number of selected ports
(as some selected ports may be down).  The ad_select "bandwidth" and
"count" options only consider ports that are link up.

	Second, on any carrier state change of any slave, the aggregator
selection logic is explicitly called to insure the correct aggregator is
active.

Reported-by: Veli-Matti Lintu <veli-matti.lintu@opinsys.fi>
Fixes: 7bb11dc9f5 ("bonding: unify all places where actor-oper key needs to be updated.")
Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-28 04:19:18 -04:00
Tom Goff 70a0dec451 ipmr/ip6mr: Initialize the last assert time of mfc entries.
This fixes wrong-interface signaling on 32-bit platforms for entries
created when jiffies > 2^31 + MFC_ASSERT_THRESH.

Signed-off-by: Tom Goff <thomas.goff@ll.mit.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-28 04:14:09 -04:00