linux

Commit Graph

Author	SHA1	Message	Date
Florian Fainelli	f0903ea371	r8169: Add support for restarting auto-negotiation Implement ethtool::nway_restart by utilizing mii_nway_restart. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-05 13:38:43 -05:00
David S. Miller	c3543688ab	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Johan Hedberg says: ==================== pull request: bluetooth-next 2016-12-03 Here's a set of Bluetooth & 802.15.4 patches for net-next (i.e. 4.10 kernel): - Fix for a potential NULL deref in the ieee802154 netlink code - Fix for the ED values of the at86rf2xx driver - Documentation updates to ieee802154 - Cleanups to u8 vs __u8 usage - Timer API usage cleanups in HCI drivers Please let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-05 13:37:28 -05:00
Kees Cook	0eab121ef8	net: ping: check minimum size on ICMP header length Prior to commit `c0371da604` ("put iov_iter into msghdr") in v3.19, there was no check that the iovec contained enough bytes for an ICMP header, and the read loop would walk across neighboring stack contents. Since the iov_iter conversion, bad arguments are noticed, but the returned error is EFAULT. Returning EINVAL is a clearer error and also solves the problem prior to v3.19. This was found using trinity with KASAN on v3.18: BUG: KASAN: stack-out-of-bounds in memcpy_fromiovec+0x60/0x114 at addr ffffffc071077da0 Read of size 8 by task trinity-c2/9623 page:ffffffbe034b9a08 count:0 mapcount:0 mapping: (null) index:0x0 flags: 0x0() page dumped because: kasan: bad access detected CPU: 0 PID: 9623 Comm: trinity-c2 Tainted: G BU 3.18.0-dirty #15 Hardware name: Google Tegra210 Smaug Rev 1,3+ (DT) Call trace: [<ffffffc000209c98>] dump_backtrace+0x0/0x1ac arch/arm64/kernel/traps.c:90 [<ffffffc000209e54>] show_stack+0x10/0x1c arch/arm64/kernel/traps.c:171 [< inline >] __dump_stack lib/dump_stack.c:15 [<ffffffc000f18dc4>] dump_stack+0x7c/0xd0 lib/dump_stack.c:50 [< inline >] print_address_description mm/kasan/report.c:147 [< inline >] kasan_report_error mm/kasan/report.c:236 [<ffffffc000373dcc>] kasan_report+0x380/0x4b8 mm/kasan/report.c:259 [< inline >] check_memory_region mm/kasan/kasan.c:264 [<ffffffc00037352c>] __asan_load8+0x20/0x70 mm/kasan/kasan.c:507 [<ffffffc0005b9624>] memcpy_fromiovec+0x5c/0x114 lib/iovec.c:15 [< inline >] memcpy_from_msg include/linux/skbuff.h:2667 [<ffffffc000ddeba0>] ping_common_sendmsg+0x50/0x108 net/ipv4/ping.c:674 [<ffffffc000dded30>] ping_v4_sendmsg+0xd8/0x698 net/ipv4/ping.c:714 [<ffffffc000dc91dc>] inet_sendmsg+0xe0/0x12c net/ipv4/af_inet.c:749 [< inline >] __sock_sendmsg_nosec net/socket.c:624 [< inline >] __sock_sendmsg net/socket.c:632 [<ffffffc000cab61c>] sock_sendmsg+0x124/0x164 net/socket.c:643 [< inline >] SYSC_sendto net/socket.c:1797 [<ffffffc000cad270>] SyS_sendto+0x178/0x1d8 net/socket.c:1761 CVE-2016-8399 Reported-by: Qidan He <i@flanker017.me> Fixes: `c319b4d76b` ("net: ipv4: add IPPROTO_ICMP socket kind") Cc: stable@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-05 13:35:38 -05:00
David S. Miller	3f4888adae	Merge branch 'tcp-tsq-perf' Eric Dumazet says: ==================== tcp: tsq: performance series Under very high TX stress, CPU handling NIC TX completions can spend considerable amount of cycles handling TSQ (TCP Small Queues) logic. This patch series avoids some atomic operations, but most notable patch is the 3rd one, allowing other cpus processing ACK packets and calling tcp_write_xmit() to grab TCP_TSQ_DEFERRED so that tcp_tasklet_func() can skip already processed sockets. This avoid lots of lock acquisitions and cache lines accesses, particularly under load. In v2, I added : - tcp_small_queue_check() change to allow 1st and 2nd packets in write queue to be sent, even in the case TX completion of already acknowledged packets did not happen yet. This helps when TX completion coalescing parameters are set even to insane values, and/or busy polling is used. - A reorganization of struct sock fields to lower false sharing and increase data locality. - Then I moved tsq_flags from tcp_sock to struct sock also to reduce cache line misses during TX completions. I measured an overall throughput gain of 22 % for heavy TCP use over a single TX queue. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-05 13:32:25 -05:00
Eric Dumazet	7aa5470c2c	tcp: tsq: move tsq_flags close to sk_wmem_alloc tsq_flags being in the same cache line than sk_wmem_alloc makes a lot of sense. Both fields are changed from tcp_wfree() and more generally by various TSQ related functions. Prior patch made room in struct sock and added sk_tsq_flags, this patch deletes tsq_flags from struct tcp_sock. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-05 13:32:24 -05:00
Eric Dumazet	9115e8cd2a	net: reorganize struct sock for better data locality Group fields used in TX path, and keep some cache lines mostly read to permit sharing among cpus. Gained two 4 bytes holes on 64bit arches. Added a place holder for tcp tsq_flags, next to sk_wmem_alloc to speed up tcp_wfree() in the following patch. I have not added ____cacheline_aligned_in_smp, this might be done later. I prefer doing this once inet and tcp/udp sockets reorg is also done. Tested with both TCP and UDP. UDP receiver performance under flood increased by ~20 % : Accessing sk_filter/sk_wq/sk_napi_id no longer stalls because sk_drops was moved away from a critical cache line, now mostly read and shared. /* --- cacheline 4 boundary (256 bytes) --- */ unsigned int sk_napi_id; /* 0x100 0x4 */ int sk_rcvbuf; /* 0x104 0x4 */ struct sk_filter * sk_filter; /* 0x108 0x8 */ union { struct socket_wq * sk_wq; /* 0x8 */ struct socket_wq * sk_wq_raw; /* 0x8 */ }; /* 0x110 0x8 */ struct xfrm_policy * sk_policy[2]; /* 0x118 0x10 */ struct dst_entry * sk_rx_dst; /* 0x128 0x8 */ struct dst_entry * sk_dst_cache; /* 0x130 0x8 */ atomic_t sk_omem_alloc; /* 0x138 0x4 */ int sk_sndbuf; /* 0x13c 0x4 */ /* --- cacheline 5 boundary (320 bytes) --- */ int sk_wmem_queued; /* 0x140 0x4 */ atomic_t sk_wmem_alloc; /* 0x144 0x4 */ long unsigned int sk_tsq_flags; /* 0x148 0x8 */ struct sk_buff * sk_send_head; /* 0x150 0x8 */ struct sk_buff_head sk_write_queue; /* 0x158 0x18 */ __s32 sk_peek_off; /* 0x170 0x4 */ int sk_write_pending; /* 0x174 0x4 */ long int sk_sndtimeo; /* 0x178 0x8 */ Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-05 13:32:24 -05:00
Eric Dumazet	12a59abc22	tcp: tcp_mtu_probe() is likely to exit early Adding a likely() in tcp_mtu_probe() moves its code which used to be inlined in front of tcp_write_xmit() We still have a cache line miss to access icsk->icsk_mtup.enabled, we will probably have to reorganize fields to help data locality. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-05 13:32:23 -05:00
Eric Dumazet	75eefc6c59	tcp: tsq: add a shortcut in tcp_small_queue_check() Always allow the two first skbs in write queue to be sent, regardless of sk_wmem_alloc/sk_pacing_rate values. This helps a lot in situations where TX completions are delayed either because of driver latencies or softirq latencies. Test is done with no cache line misses. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-05 13:32:23 -05:00
Eric Dumazet	a9b204d156	tcp: tsq: avoid one atomic in tcp_wfree() Under high load, tcp_wfree() has an atomic operation trying to schedule a tasklet over and over. We can schedule it only if our per cpu list was empty. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-05 13:32:23 -05:00
Eric Dumazet	b223feb9de	tcp: tsq: add shortcut in tcp_tasklet_func() Under high stress, I've seen tcp_tasklet_func() consuming ~700 usec, handling ~150 tcp sockets. By setting TCP_TSQ_DEFERRED in tcp_wfree(), we give a chance for other cpus/threads entering tcp_write_xmit() to grab it, allowing tcp_tasklet_func() to skip sockets that already did an xmit cycle. In the future, we might give to ACK processing an increased budget to reduce even more tcp_tasklet_func() amount of work. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-05 13:32:22 -05:00
Eric Dumazet	408f0a6c21	tcp: tsq: remove one locked operation in tcp_wfree() Instead of atomically clear TSQ_THROTTLED and atomically set TSQ_QUEUED bits, use one cmpxchg() to perform a single locked operation. Since the following patch will also set TCP_TSQ_DEFERRED here, this cmpxchg() will make this addition free. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-05 13:32:22 -05:00
Eric Dumazet	40fc3423b9	tcp: tsq: add tsq_flags / tsq_enum This is a cleanup, to ease code review of following patches. Old 'enum tsq_flags' is renamed, and a new enumeration is added with the flags used in cmpxchg() operations as opposed to single bit operations. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-05 13:32:22 -05:00
Linus Torvalds	d9d04527c7	Merge tag 'powerpc-4.9-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "Four fixes, the first for code we merged this cycle and three that are also going to stable: - On 64-bit Book3E we were not placing the .text section where we said we would in the asm. - We broke building the boot wrapper on some 32-bit toolchains. - Lazy icache flushing was broken on pre-POWER5 machines. - One of the error paths in our EEH code would lead to a deadlock. Thanks to: Andrew Donnellan, Ben Hutchings, Benjamin Herrenschmidt, Nicholas Piggin" * tag 'powerpc-4.9-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/64: Fix placement of .text to be immediately following .head.text powerpc/eeh: Fix deadlock when PE frozen state can't be cleared powerpc/mm: Fix lazy icache flush on pre-POWER5 powerpc/boot: Fix build failure in 32-bit boot wrapper	2016-12-05 10:30:12 -08:00
Pan Bian	4606c9e8c5	atm: lanai: set error code when ioremap fails In function lanai_dev_open(), when the call to ioremap() fails, the value of return variable result is 0. 0 means no error in this context. This patch fixes the bug, assigning "-ENOMEM" to result when ioremap() returns a NULL pointer. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188791 Signed-off-by: Pan Bian <bianpan2016@163.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-05 13:27:33 -05:00
Pan Bian	51920830d9	net: usb: set error code when usb_alloc_urb fails In function lan78xx_probe(), variable ret takes the errno code on failures. However, when the call to usb_alloc_urb() fails, its value remains 0. 0 indicates success in this context, which is inconsistent with the execution result. This patch fixes the bug, assigning "-ENOMEM" to ret when usb_alloc_urb() returns a NULL pointer. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188771 Signed-off-by: Pan Bian <bianpan2016@163.com> Acked-by: Woojung Huh <woojung.huh@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-05 13:27:15 -05:00
Pan Bian	b59589635f	net: bridge: set error code on failure Function br_sysfs_addbr() does not set error code when the call kobject_create_and_add() returns a NULL pointer. It may be better to return "-ENOMEM" when kobject_create_and_add() fails. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188781 Signed-off-by: Pan Bian <bianpan2016@163.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-05 13:26:22 -05:00
Suraj Deshmukh	14dd3e1b97	net: af_mpls.c add space before open parenthesis Adding space after switch keyword before open parenthesis for readability purpose. This patch fixes the checkpatch.pl warning: space required before the open parenthesis '(' Signed-off-by: Suraj Deshmukh <surajssd009005@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-05 13:25:55 -05:00
Pan Bian	89aa8445cd	netdev: broadcom: propagate error code Function bnxt_hwrm_stat_ctx_alloc() always returns 0, even if the call to _hwrm_send_message() fails. It may be better to propagate the errors to the caller of bnxt_hwrm_stat_ctx_alloc(). Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188661 Signed-off-by: Pan Bian <bianpan2016@163.com> Acked-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-05 13:25:38 -05:00
David S. Miller	f83e83037c	Merge branch 'bnxt_en-dcbnl' Michael Chan says: ==================== bnxt_en: Add DCBNL support. This series adds DCBNL operations to support host-based IEEE DCBX. v2: Updated to the latest firmware interface spec. David, please consider this series for net-next. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-05 13:21:41 -05:00
Michael Chan	c77192f204	bnxt_en: Add PFC statistics. Report PFC statistics to ethtool -S and DCBNL. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-05 13:21:40 -05:00
Michael Chan	7df4ae9fe8	bnxt_en: Implement DCBNL to support host-based DCBX. Support only IEEE DCBX initially. Add IEEE DCBNL ops and functions to get and set the hardware DCBX parameters. The DCB code is conditional on Kconfig CONFIG_BNXT_DCB. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-05 13:21:40 -05:00
Michael Chan	87c374ded0	bnxt_en: Update firmware header file to latest 1.6.0. Latest interface has the latest DCB command structs. Get and store the max number of lossless TCs the hardware can support. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-05 13:21:40 -05:00
Michael Chan	c5e3deb8a3	bnxt_en: Re-factor bnxt_setup_tc(). Add a new function bnxt_setup_mq_tc() to handle MQPRIO. This new function will be called during ETS setup when we add DCBNL in the next patch. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-05 13:21:39 -05:00
David S. Miller	2279b752ac	Merge branch 'fib-suffix-length-fixes' Alexander Duyck says: ==================== IPv4 FIB suffix length fixes In reviewing the patch from Robert Shearman and looking over the code I realized there were a few different bugs we were still carrying in the IPv4 FIB lookup code. These two patches are based off of Robert's original patch, but take things one step further by splitting them up to address two additional issues I found. So first have Robert's original patch which was addressing the fact that us calling update_suffix in resize is expensive when it is called per add. To address that I incorporated the core bit of the patch which was us dropping the update_suffix call from resize. The first patch in the series does a rename and fix on the push_suffix and pull_suffix code. Specifically we drop the need to pass a leaf and secondly we fix things so we pull the suffix as long as the value of the suffix in the node is dropping. The second patch addresses the original issue reported as well as optimizing the code for the fact that update_suffix is only really meant to go through and clean things up when we are decreasing a suffix. I had originally added code for it to somehow cause an increase, but if we push the suffix when a new leaf is added we only ever have to handle pulling down the suffix with update_suffix so I updated the code to reflect that. As far as side effects the only ones I think that will be obvious should be the fact that some routes may be able to be found earlier since before we relied on resize to update the suffix lengths, and now we are updating them before we add or remove the leaf. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-05 13:15:59 -05:00
Alexander Duyck	a52ca62c4a	ipv4: Drop suffix update from resize code It has been reported that update_suffix can be expensive when it is called on a large node in which most of the suffix lengths are the same. The time required to add 200K entries had increased from around 3 seconds to almost 49 seconds. In order to address this we need to move the code for updating the suffix out of resize and instead just have it handled in the cases where we are pushing a node that increases the suffix length, or will decrease the suffix length. Fixes: `5405afd1a3` ("fib_trie: Add tracking value for suffix length") Reported-by: Robert Shearman <rshearma@brocade.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Reviewed-by: Robert Shearman <rshearma@brocade.com> Tested-by: Robert Shearman <rshearma@brocade.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-05 13:15:58 -05:00
Alexander Duyck	1a239173cc	ipv4: Drop leaf from suffix pull/push functions It wasn't necessary to pass a leaf in when doing the suffix updates so just drop it. Instead just pass the suffix and work with that. Since we dropped the leaf there is no need to include that in the name so the names are updated to node_push_suffix and node_pull_suffix. Finally I noticed that the logic for pulling the suffix length back actually had some issues. Specifically it would stop prematurely if there was a longer suffix, but it was not as long as the original suffix. I updated the code to address that in node_pull_suffix. Fixes: `5405afd1a3` ("fib_trie: Add tracking value for suffix length") Suggested-by: Robert Shearman <rshearma@brocade.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Reviewed-by: Robert Shearman <rshearma@brocade.com> Tested-by: Robert Shearman <rshearma@brocade.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-05 13:15:58 -05:00
Jesper Nilsson	c7a61319ad	net: phy: dp83848: Support ethernet pause frames According to the documentation, the PHYs supported by this driver can also support pause frames. Announce this to be so. Tested with a TI83822I. Acked-by: Andrew F. Davis <afd@ti.com> Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-05 13:13:32 -05:00
Linus Torvalds	ef3263e35e	Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fixes from Herbert Xu: "This fixes the following issues: - Intermittent build failure in RSA - Memory corruption in chelsio crypto driver - Regression in DRBG due to vmalloced stack" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: rsa - Add Makefile dependencies to fix parallel builds crypto: chcr - Fix memory corruption crypto: drbg - prevent invalid SG mappings	2016-12-05 09:16:10 -08:00
Linus Torvalds	3e5de27e94	Linux 4.9-rc8	2016-12-04 12:50:51 -08:00
Pan Bian	c66ebf2db5	net: dcb: set error code on failures In function dcbnl_cee_fill(), returns the value of variable err on errors. However, on some error paths (e.g. nla put fails), its value may be 0. It may be better to explicitly set a negative errno to variable err before returning. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188881 Signed-off-by: Pan Bian <bianpan2016@163.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-03 23:54:25 -05:00
Erik Nordmark	adc176c547	ipv6 addrconf: Implemented enhanced DAD (RFC7527) Implemented RFC7527 Enhanced DAD. IPv6 duplicate address detection can fail if there is some temporary loopback of Ethernet frames. RFC7527 solves this by including a random nonce in the NS messages used for DAD, and if an NS is received with the same nonce it is assumed to be a looped back DAD probe and is ignored. RFC7527 is enabled by default. Can be disabled by setting both of conf/{all,interface}/enhanced_dad to zero. Signed-off-by: Erik Nordmark <nordmark@arista.com> Signed-off-by: Bob Gilligan <gilligan@arista.com> Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-03 23:21:37 -05:00
David S. Miller	ce84c7c663	Merge branch 'mv88e6390-batch-three' Andrew Lunn says: ==================== mv88e6390 batch 3 More patches to support the MV88e6390. This is mostly refactoring existing code and adding implementations for the mv88e6390. This patchset set which reserved frames are sent to the cpu, the size of jumbo frames that will be accepted, turn off egress rate limiting, and configuration of pause frames. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-03 23:18:39 -05:00
Andrew Lunn	3ce0e65eb6	net: dsa: mv88e6xxx: Implement mv88e6390 pause control The mv88e6390 has a number of flow control registers accessed via the Flow Control register. Use these to set the pause control. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-03 23:18:39 -05:00
Andrew Lunn	b35d322a1d	net: dsa: mv88e6xxx: Refactor pause configuration The mv88e6390 has a different mechanism for configuring pause. Refactor the code into an ops function, and for the moment, don't add any mv88e6390 code yet. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-03 23:18:39 -05:00
Andrew Lunn	ef70b1119e	net: dsa: mv88e6xxx: Refactor egress rate limiting There are two different rate limiting configurations, depending on the switch generation. Refactor this into ops. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-03 23:18:38 -05:00
Andrew Lunn	5f4366660d	net: dsa: mv88e6xxx: Refactor setting of jumbo frames Some switches support jumbo frames. Refactor this code into operations in the ops structure. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-03 23:18:38 -05:00
Andrew Lunn	6e55f69846	net: dsa: mv88e6xxx: Reserved Management frames to CPU Older devices have a couple of registers in global2. The mv88e6390 family has a single register in global1 behind which hides similar configuration. Implement an op for this. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-03 23:18:38 -05:00
David S. Miller	7a6c5cb960	Merge branch 'mv88e6390-batch-two' Andrew Lunn says: ==================== MV88E6390 batch two This is the second batch of patches adding support for the MV88e6390. They are not sufficient to make it work properly. The mv88e6390 has a much expanded set of priority maps. Refactor the existing code, and implement basic support for the new device. Similarly, the monitor control register has been reworked. The mv88e6390 has something odd in its EDSA tagging implementation, which means it is not possible to use it. So we need to use DSA tagging. This is the first device with EDSA support where we need to use DSA, and the code does not support this. So two patches refactor the existing code. The two different register definitions are separated out, and using DSA on an EDSA capable device is added. v2: Add port prefix Add helper function for 6390 Add _IEEE_ into #defines Split monitor_ctrl into a number of separate ops. Remove 6390 code which is management, used in a later patch s/EGREES/EGRESS/. Broke up setup_port_dsa() and set_port_dsa() into a number of ops v3: Verify mandatory ops for port setup Don't set ether type for DSA port. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-03 23:15:01 -05:00
Andrew Lunn	56995cbc35	net: dsa: mv88e6xxx: Refactor CPU and DSA port setup Older chips only support DSA tagging. Newer chips have both DSA and EDSA tagging. Refactor the code by adding port functions for setting the frame mode, egress mode, and if to forward unknown frames. This results in the helper mv88e6xxx_6065_family() becoming unused, so remove it. Signed-off-by: Andrew Lunn <andrew@lunn.ch> v3: Verify mandatory ops for port setup Don't set ether type for DSA port. Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-03 23:15:00 -05:00
Andrew Lunn	443d5a1b7d	net: dsa: mv88e6xxx: Move the tagging protocol into info Older chips support a single tagging protocol, DSA. New chips support both DSA and EDSA, an enhanced version. Having both as an option changes the register layouts. Up until now, it has been assumed that if EDSA is supported, it will be used. Hence the register layout has been determined by which protocol should be used. However, mv88e6390 has a different implementation of EDSA, which requires we need to use the DSA tagging. Hence separate the selection of the protocol from the register layout. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-03 23:15:00 -05:00
Andrew Lunn	33641994a6	net: dsa: mv88e6xxx: Monitor and Management tables The mv88e6390 changes the monitor control register into the Monitor and Management control, which is an indirection register to various registers. Add ops to set the CPU port and the ingress/egress port for both register layouts, to global1 Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-03 23:15:00 -05:00
Andrew Lunn	ef0a731882	net: dsa: mv88e6xxx: Implement mv88e6390 tag remap The mv88e6390 does not have the two registers to set the frame priority map. Instead it has an indirection register for setting a number of different priority maps. Refactor the old code into a function, implement the mv88e6390 version, and use an op to call the right one. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-03 23:15:00 -05:00
Linus Torvalds	0cb65c8330	Merge tag 'drm-fixes-for-v4.9-rc8' of git://people.freedesktop.org/~airlied/linux Pull drm fixes from Dave Airlie: "A pretty small pull request: a couple of AMD powerxpress regression fixes and a power management fix, a couple of i915 fixes and one hdlcd fix, along with one core don't oops because of incorrect API usage fix" * tag 'drm-fixes-for-v4.9-rc8' of git://people.freedesktop.org/~airlied/linux: drm/i915: drop the struct_mutex when wedged or trying to reset drm/i915: Don't touch NULL sg on i915_gem_object_get_pages_gtt() error drm: Don't call drm_for_each_crtc with a non-KMS driver drm/radeon: fix check for port PM availability drm/amdgpu: fix check for port PM availability drm/amd/powerplay: initialize the soft_regs offset in struct smu7_hwmgr drm: hdlcd: Fix cleanup order	2016-12-03 16:40:21 -08:00
David S. Miller	69248719d0	Merge branch 'fib-notifier-event-replay' Jiri Pirko says: ==================== ipv4: fib: Replay events when registering FIB notifier Ido says: In kernel 4.9 the switchdev-specific FIB offload mechanism was replaced by a new FIB notification chain to which modules could register in order to be notified about the addition and deletion of FIB entries. The motivation for this change was that switchdev drivers need to be able to reflect the entire FIB table and not only FIBs configured on top of the port netdevs themselves. This is useful in case of in-band management. The fundamental problem with this approach is that upon registration listeners lose all the information previously sent in the chain and thus have an incomplete view of the FIB tables, which can result in packet loss. This patchset fixes that by dumping the FIB tables and replaying notifications previously sent in the chain for the registered notification block. The entire dump process is done under RCU and thus the FIB notification chain is converted to be atomic. The listeners are modified accordingly. This is done in the first eight patches. The ninth patch adds a change sequence counter to ensure the integrity of the FIB dump. The last patch adds the dump itself to the FIB chain registration function and modifies existing listeners to pass a callback to be executed in case dump was inconsistent. --- v3->v4: - Register the notification block after the dump and protect it using the change sequence counter (Hannes Frederic Sowa). - Since we now integrate the dump into the registration function, drop the sysctl to set maximum number of retries and instead set it to a fixed number. Let's see if it's really a problem before adding something we can never remove. - For the same reason, dump FIB tables for all net namespaces. - Add a comment regarding guarantees provided by mutex semantics. v2->v3: - Add sysctl to set the number of FIB dump retries (Hannes Frederic Sowa). - Read the sequence counter under RTNL to ensure synchronization between the dump process and other processes changing the routing tables (Hannes Frederic Sowa). - Pass a callback to the dump function to be executed prior to a retry. - Limit the dump to a single net namespace. v1->v2: - Add a sequence counter to ensure the integrity of the FIB dump (David S. Miller, Hannes Frederic Sowa). - Protect notifications from re-ordering in listeners by using an ordered workqueue (Hannes Frederic Sowa). - Introduce fib_info_hold() (Jiri Pirko). - Relieve rocker from the need to invoke the FIB dump by registering to the FIB notification chain prior to ports creation. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-03 19:29:37 -05:00
Ido Schimmel	c3852ef7f2	ipv4: fib: Replay events when registering FIB notifier Commit `b90eb75494` ("fib: introduce FIB notification infrastructure") introduced a new notification chain to notify listeners (f.e., switchdev drivers) about addition and deletion of routes. However, upon registration to the chain the FIB tables can already be populated, which means potential listeners will have an incomplete view of the tables. Solve that by dumping the FIB tables and replaying the events to the passed notification block. The dump itself is done using RCU in order not to starve consumers that need RTNL to make progress. The integrity of the dump is ensured by reading the FIB change sequence counter before and after the dump under RTNL. This allows us to avoid the problematic situation in which the dumping process sends a ENTRY_ADD notification following ENTRY_DEL generated by another process holding RTNL. Callers of the registration function may pass a callback that is executed in case the dump was inconsistent with current FIB tables. The number of retries until a consistent dump is achieved is set to a fixed number to prevent callers from looping for long periods of time. In case current limit proves to be problematic in the future, it can be easily converted to be configurable using a sysctl. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-03 19:29:35 -05:00
Ido Schimmel	cacaad11f4	ipv4: fib: Allow for consistent FIB dumping The next patch will enable listeners of the FIB notification chain to request a dump of the FIB tables. However, since RTNL isn't taken during the dump, it's possible for the FIB tables to change mid-dump, which will result in inconsistency between the listener's table and the kernel's. Allow listeners to know about changes that occurred mid-dump, by adding a change sequence counter to each net namespace. The counter is incremented just before a notification is sent in the FIB chain. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-03 19:29:35 -05:00
Ido Schimmel	d3f706f68e	ipv4: fib: Convert FIB notification chain to be atomic In order not to hold RTNL for long periods of time we're going to dump the FIB tables using RCU. Convert the FIB notification chain to be atomic, as we can't block in RCU critical sections. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-03 19:29:35 -05:00
Ido Schimmel	17f8be7daf	rocker: Register FIB notifier before creating ports We can miss FIB notifications sent between the time the ports were created and the FIB notification block registered. Instead of receiving these notifications only when they are replayed for the FIB notification block during registration, just register the notification block before the ports are created. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-03 19:29:35 -05:00
Ido Schimmel	db7019557c	rocker: Implement FIB offload in deferred work Convert rocker to offload FIBs in deferred work in a similar fashion to mlxsw, which was converted in the previous commits. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-03 19:29:35 -05:00
Ido Schimmel	c1bb279cfa	rocker: Create an ordered workqueue for FIB offload As explained in the previous commits, we need to process FIB entries addition / deletion events in FIFO order or otherwise we can have a mismatch between the kernel's FIB table and the device's. Create an ordered workqueue for rocker to which these work items will be submitted to. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-03 19:29:35 -05:00
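
Commit 408f0a6c21 above describes replacing two separate atomic bit operations (clear TSQ_THROTTLED, set TSQ_QUEUED) with a single cmpxchg(). The following is only a userspace sketch of that idea using C11 atomics, not the kernel code: the `DEMO_*` flag values and the function `demo_queue_if_throttled()` are hypothetical names chosen for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag bits, loosely named after the flags in the commit message. */
enum {
    DEMO_TSQ_THROTTLED = 1UL << 0,
    DEMO_TSQ_QUEUED    = 1UL << 1,
};

/*
 * Clear THROTTLED and set QUEUED with one compare-and-swap instead of
 * two independent atomic bit operations.
 *
 * Returns true if this caller performed the transition, false if the
 * flags were not throttled or were already queued.
 */
static bool demo_queue_if_throttled(_Atomic unsigned long *flags)
{
    unsigned long oval = atomic_load(flags);

    for (;;) {
        unsigned long nval;

        if (!(oval & DEMO_TSQ_THROTTLED) || (oval & DEMO_TSQ_QUEUED))
            return false;

        nval = (oval & ~DEMO_TSQ_THROTTLED) | DEMO_TSQ_QUEUED;

        /* On failure, oval is refreshed with the current value and we retry. */
        if (atomic_compare_exchange_weak(flags, &oval, nval))
            return true;
    }
}
```

Because the new value is computed in one place, setting a further bit in the same step (as the commit message anticipates for TCP_TSQ_DEFERRED) would not add another locked operation in a sketch like this.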
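
Commits cacaad11f4 and c3852ef7f2 above describe validating a FIB dump by reading a change sequence counter before and after the dump, and retrying a fixed number of times if the counter moved. A minimal single-threaded sketch of that pattern might look as follows. Everything here is an assumption made for illustration: the `demo_*` names, the retry limit of 5, and the simulated writer are not taken from the kernel, and the real code additionally relies on RCU and RTNL, which this sketch does not model.

```c
#include <stdbool.h>

/* Hypothetical fixed retry limit; the commit message only says "a fixed number". */
#define DEMO_DUMP_MAX_RETRIES 5

/* Stand-in for per-namespace state; only the change counter matters here. */
struct demo_table {
    unsigned int change_seq; /* incremented before every change notification */
};

/* Bookkeeping for the simulated listener and the simulated concurrent writer. */
struct demo_ctx {
    int interfering_writes; /* table changes still to be injected mid-dump */
    int dumps;              /* number of dump attempts seen by the listener */
    int discards;           /* number of times the listener was told to start over */
};

typedef void (*demo_dump_fn)(struct demo_table *t, struct demo_ctx *ctx);
typedef void (*demo_discard_fn)(struct demo_ctx *ctx);

/* Simulated dump: replays the table and may be raced by a writer. */
static void demo_racy_dump(struct demo_table *t, struct demo_ctx *ctx)
{
    ctx->dumps++;
    if (ctx->interfering_writes > 0) {
        ctx->interfering_writes--;
        t->change_seq++; /* a writer changed the table during the dump */
    }
}

/* Simulated listener callback: drop whatever the inconsistent dump delivered. */
static void demo_discard(struct demo_ctx *ctx)
{
    ctx->discards++;
}

/*
 * Dump the table and accept the result only if the change counter is the
 * same before and after. Otherwise invoke the discard callback and retry,
 * giving up after DEMO_DUMP_MAX_RETRIES attempts.
 */
static bool demo_dump_consistent(struct demo_table *t, demo_dump_fn dump,
                                 demo_discard_fn discard, struct demo_ctx *ctx)
{
    for (int attempt = 0; attempt < DEMO_DUMP_MAX_RETRIES; attempt++) {
        unsigned int before = t->change_seq;

        dump(t, ctx);
        if (t->change_seq == before)
            return true;

        if (discard)
            discard(ctx);
    }
    return false;
}
```

Bounding the number of attempts keeps a caller from looping indefinitely when the table changes faster than it can be dumped, which is the trade-off the commit message describes.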


636968 Commits
