linux

Commit Graph

Author	SHA1	Message	Date
David S. Miller	216fe8f021	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Just some simple overlapping changes in marvell PHY driver and the DSA core code. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-06 22:20:08 -04:00
Linus Torvalds	b29794ec95	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Made TCP congestion control documentation match current reality, from Anmol Sarma. 2) Various build warning and failure fixes from Arnd Bergmann. 3) Fix SKB list leak in ipv6_gso_segment(). 4) Use after free in ravb driver, from Eugeniu Rosca. 5) Don't use udp_poll() in ping protocol driver, from Eric Dumazet. 6) Don't crash in PCI error recovery of cxgb4 driver, from Guilherme Piccoli. 7) _SRC_NAT_DONE_BIT needs to be cleared using atomics, from Liping Zhang. 8) Use after free in vxlan deletion, from Mark Bloch. 9) Fix ordering of NAPI poll enabled in ethoc driver, from Max Filippov. 10) Fix stmmac hangs with TSO, from Niklas Cassel. 11) Fix crash in CALIPSO ipv6, from Richard Haines. 12) Clear nh_flags properly on mpls link up. From Roopa Prabhu. 13) Fix regression in sk_err socket error queue handling, noticed by ping applications. From Soheil Hassas Yeganeh. 14) Update mlx4/mlx5 MAINTAINERS information. 
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (78 commits) net: stmmac: fix a broken u32 less than zero check net: stmmac: fix completely hung TX when using TSO net: ethoc: enable NAPI before poll may be scheduled net: bridge: fix a null pointer dereference in br_afspec ravb: Fix use-after-free on `ifconfig eth0 down` net/ipv6: Fix CALIPSO causing GPF with datagram support net: stmmac: ensure jumbo_frm error return is correctly checked for -ve value Revert "sit: reload iphdr in ipip6_rcv" i40e/i40evf: proper update of the page_offset field i40e: Fix state flags for bit set and clean operations of PF iwlwifi: fix host command memory leaks iwlwifi: fix min API version for 7265D, 3168, 8000 and 8265 iwlwifi: mvm: clear new beacon command template struct iwlwifi: mvm: don't fail when removing a key from an inexisting sta iwlwifi: pcie: only use d0i3 in suspend/resume if system_pm is set to d0i3 iwlwifi: mvm: fix firmware debug restart recording iwlwifi: tt: move ucode_loaded check under mutex iwlwifi: mvm: support ibss in dqa mode iwlwifi: mvm: Fix command queue number on d0i3 flow iwlwifi: mvm: rs: start using LQ command color ...	2017-06-06 14:30:17 -07:00
Nikolay Aleksandrov	1020ce3108	net: bridge: fix a null pointer dereference in br_afspec We might call br_afspec() with p == NULL which is a valid use case if the action is on the bridge device itself, but the bridge tunnel code dereferences the p pointer without checking, so check if p is null first. Reported-by: Gustavo A. R. Silva <garsilva@embeddedor.com> Fixes: `efa5356b0d` ("bridge: per vlan dst_metadata netlink support") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-06 16:05:31 -04:00
Richard Haines	e3ebdb20fd	net/ipv6: Fix CALIPSO causing GPF with datagram support When using CALIPSO with IPPROTO_UDP it is possible to trigger a GPF as the IP header may have moved. Also update the payload length after adding the CALIPSO option. Signed-off-by: Richard Haines <richard_c_haines@btinternet.com> Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Huw Davies <huw@codeweavers.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-06 15:18:20 -04:00
Jiri Pirko	e25ea21ffa	net: sched: introduce a TRAP control action There is need to instruct the HW offloaded path to push certain matched packets to cpu/kernel for further analysis. So this patch introduces a new TRAP control action to TC. For kernel datapath, this action does not make much sense. So with the same logic as in HW, new TRAP behaves similar to STOLEN. The skb is just dropped in the datapath (and virtually ejected to an upper level, which does not exist in case of kernel). Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Yotam Gigi <yotamg@mellanox.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-06 12:45:23 -04:00
David S. Miller	bb36314054	RxRPC rewrite -----BEGIN PGP SIGNATURE----- iQIVAwUAWTZjxvSw1s6N8H32AQKwUQ/8CPF6CFwn+oS7cTkkI27sKaW43tTWyxxl 1qAXjeI5dqrnmR+xW5Xu06HwO8TQKfum5dvXJLse5y15ttbK9/fevRW1IzcYxeHQ YcR414c0akIZ72hJ93LZypmLwlhEicZs4dZXrUs6f6WuqFLwYrt4K1MyrY8Bt+bM +a2yLVToF4L0nI/aAhoU0Hh0sNv4AP/PKrLWEzhLDq1Q6xBiQHSsrHLPOPkJ9QqA KOZWSjZJj8j7gSBoXtMwQiBxV76KptbksYQFLpy3EwL/r7z1qBPI0TOAKnLDLs5Q cDHf2uSUTrgfO7TIg02/SJcHm+8s0p3K585E9iK5JZ6BMjdSRfKR14nJdlWyXdZ5 EvvEA7AlUpukHVv+CP+03sdBfkZ3PSb4sAQ+CbwY30SKwL1fRE26NW0fZa5lSmUt E1ixCxHPJXPnSZJAa5kePdWDgQjn2qJI+3Zh+jw0yaQ+rAgpP4M95xckeWdU9PKg 8uFMM7Z1h70PnmVV3nX603MqyVivpKEZKHKTQgqGz4BvB1ZEu9noLTfwQCodXtns /8/8sVD65L4/SpHr1AM3Y+v7483bHth8edAI0k/QZerdKGImR+enrYBoSZ53QkEf TG8pvK74Tdpw2LQJsUIDvL5+oBO4FtPNOmT4UHbotenrVkF/4laIFcCVPW58scG1 mB8kAUS+bzs= =M7Pr -----END PGP SIGNATURE----- Merge tag 'rxrpc-rewrite-20170606' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs David Howells says: ==================== rxrpc: Support service upgrade Here's a set of patches that allow AF_RXRPC to support the AuriStor service upgrade facility. This allows the server to change the service ID requested to an upgraded service if the client requests it upon the initiation of a connection. This is used by the AuriStor AFS-compatible servers to implement IPv6 handling and improved facilities by providing improved volume location, volume, protection, file and cache management services. Note that certain parts of the AFS protocol carry hard-coded IPv4 addresses. The reason AuriStor does it this way is that probing the improved service ID first will not incur an ABORT or any other response on some servers if the server is not listening on it - and so one have to employ a timeout. 
This is implemented in the server by allowing an AF_RXRPC server to call bind() twice on a socket to allow it to listen on two service IDs and then call setsockopt() to instruct the server to upgrade one into the other if the client requests it (by setting userStatus to 1 on the first DATA packet on a connection). If the upgrade occurs, all further operations on that connection are done with the new service ID. AF_RXRPC has to handle this automatically as connections are not exposed to userspace. Clients can request this facility by setting an RXRPC_UPGRADE_SERVICE command in the sendmsg() control buffer and then observing the resultant service ID in the msg_addr returned by recvmsg(). This should only be used to probe the service. Clients should then use the returned service ID in all subsequent communications with that server. Note that the kernel will not retain this information should the connection expire from its cache. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-06 12:05:57 -04:00
David S. Miller	f4eb17e1ef	Revert "sit: reload iphdr in ipip6_rcv" This reverts commit `b699d00358`. As per Eric Dumazet, the pskb_may_pull() is a NOP in this particular case, so the 'iph' reload is unnecessary. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-06 11:34:06 -04:00
Haishuang Yan	6044bd4a7d	devlink: fix potential memort leak We must free allocated skb when genlmsg_put() return fails. Fixes: `1555d204e7` ("devlink: Support for pipeline debug (dpipe)") Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-05 11:24:28 -04:00
Jiri Pirko	8ec1507dc9	net: sched: select cls when cls_act is enabled It really makes no sense to have cls_act enabled without cls. In that case, the cls_act code is dead. So select it. This also fixes an issue recently reported by kbuild robot: [linux-next:master 1326/4151] net/sched/act_api.c:37:18: error: implicit declaration of function 'tcf_chain_get' Reported-by: kbuild test robot <fengguang.wu@intel.com> Fixes: `db50514f9a` ("net: sched: add termination action to allow goto chain") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-05 10:56:36 -04:00
David Howells	4e255721d1	rxrpc: Add service upgrade support for client connections Make it possible for a client to use AuriStor's service upgrade facility. The client does this by adding an RXRPC_UPGRADE_SERVICE control message to the first sendmsg() of a call. This takes no parameters. When recvmsg() starts returning data from the call, the service ID field in the returned msg_name will reflect the result of the upgrade attempt. If the upgrade was ignored, srx_service will match what was set in the sendmsg(); if the upgrade happened the srx_service will be altered to indicate the service the server upgraded to. Note that: (1) The choice of upgrade service is up to the server (2) Further client calls to the same server that would share a connection are blocked if an upgrade probe is in progress. (3) This should only be used to probe the service. Clients should then use the returned service ID in all subsequent communications with that server (and not set the upgrade). Note that the kernel will not retain this information should the connection expire from its cache. (4) If a server that supports upgrading is replaced by one that doesn't, whilst a connection is live, and if the replacement is running, say, OpenAFS 1.6.4 or older or an older IBM AFS, then the replacement server will not respond to packets sent to the upgraded connection. At this point, calls will time out and the server must be reprobed. Signed-off-by: David Howells <dhowells@redhat.com>	2017-06-05 14:30:49 +01:00
David Howells	4722974d90	rxrpc: Implement service upgrade Implement AuriStor's service upgrade facility. There are three problems that this is meant to deal with: (1) Various of the standard AFS RPC calls have IPv4 addresses in their requests and/or replies - but there's no room for including IPv6 addresses. (2) Definition of IPv6-specific RPC operations in the standard operation sets has not yet been achieved. (3) One could envision the creation a new service on the same port that as the original service. The new service could implement improved operations - and the client could try this first, falling back to the original service if it's not there. Unfortunately, certain servers ignore packets addressed to a service they don't implement and don't respond in any way - not even with an ABORT. This means that the client must then wait for the call timeout to occur. What service upgrade does is to see if the connection is marked as being 'upgradeable' and if so, change the service ID in the server and thus the request and reply formats. Note that the upgrade isn't mandatory - a server that supports only the original call set will ignore the upgrade request. In the protocol, the procedure is then as follows: (1) To request an upgrade, the first DATA packet in a new connection must have the userStatus set to 1 (this is normally 0). The userStatus value is normally ignored by the server. (2) If the server doesn't support upgrading, the reply packets will contain the same service ID as for the first request packet. (3) If the server does support upgrading, all future reply packets on that connection will contain the new service ID and the new service ID will be applied to all further calls on that connection as well. (4) The RPC op used to probe the upgrade must take the same request data as the shadow call in the upgrade set (but may return a different reply). GetCapability RPC ops were added to all standard sets for just this purpose. 
Ops where the request formats differ cannot be used for probing. (5) The client must wait for completion of the probe before sending any further RPC ops to the same destination. It should then use the service ID that recvmsg() reported back in all future calls. (6) The shadow service must have call definitions for all the operation IDs defined by the original service. To support service upgrading, a server should: (1) Call bind() twice on its AF_RXRPC socket before calling listen(). Each bind() should supply a different service ID, but the transport addresses must be the same. This allows the server to receive requests with either service ID. (2) Enable automatic upgrading by calling setsockopt(), specifying RXRPC_UPGRADEABLE_SERVICE and passing in a two-member array of unsigned shorts as the argument: unsigned short optval[2]; This specifies a pair of service IDs. They must be different and must match the service IDs bound to the socket. Member 0 is the service ID to upgrade from and member 1 is the service ID to upgrade to. Signed-off-by: David Howells <dhowells@redhat.com>	2017-06-05 14:30:49 +01:00
David Howells	28036f4485	rxrpc: Permit multiple service binding Permit bind() to be called on an AF_RXRPC socket more than once (currently maximum twice) to bind multiple listening services to it. There are some restrictions: (1) All bind() calls involved must have a non-zero service ID. (2) The service IDs must all be different. (3) The rest of the address (notably the transport part) must be the same in all (a single UDP socket is shared). (4) This must be done before listen() or sendmsg() is called. This allows someone to connect to the service socket with different service IDs and lays the foundation for service upgrading. The service ID used by an incoming call can be extracted from the msg_name returned by recvmsg(). Signed-off-by: David Howells <dhowells@redhat.com>	2017-06-05 14:30:49 +01:00
David Howells	68d6d1ae5c	rxrpc: Separate the connection's protocol service ID from the lookup ID Keep the rxrpc_connection struct's idea of the service ID that is exposed in the protocol separate from the service ID that's used as a lookup key. This allows the protocol service ID on a client connection to get upgraded without making the connection unfindable for other client calls that also would like to use the upgraded connection. The connection's actual service ID is then returned through recvmsg() by way of msg_name. Whilst we're at it, we get rid of the last_service_id field from each channel. The service ID is per-connection, not per-call and an entire connection is upgraded in one go. Signed-off-by: David Howells <dhowells@redhat.com>	2017-06-05 14:30:49 +01:00
Haishuang Yan	b699d00358	sit: reload iphdr in ipip6_rcv Since iptunnel_pull_header() can call pskb_may_pull(), we must reload any pointer that was related to skb->head. Fixes: `a09a4c8dd1` ("tunnels: Remove encapsulation offloads on decap") Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 23:04:31 -04:00
Jason A. Donenfeld	89a5ea9966	rxrpc: check return value of skb_to_sgvec always Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 23:01:47 -04:00
Jason A. Donenfeld	3f29770723	ipsec: check return value of skb_to_sgvec always Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 23:01:47 -04:00
Jason A. Donenfeld	48a1df6533	skbuff: return -EMSGSIZE in skb_to_sgvec to prevent overflow This is a defense-in-depth measure in response to bugs like `4d6fa57b4d` ("macsec: avoid heap overflow in skb_to_sgvec"). There's not only a potential overflow of sglist items, but also a stack overflow potential, so we fix this by limiting the amount of recursion this function is allowed to do. Not actually providing a bounded base case is a future disaster that we can easily avoid here. As a small matter of house keeping, we take this opportunity to move the documentation comment over the actual function the documentation is for. While this could be implemented by using an explicit stack of skbuffs, when implementing this, the function complexity increased considerably, and I don't think such complexity and bloat is actually worth it. So, instead I built this and tested it on x86, x86_64, ARM, ARM64, and MIPS, and measured the stack usage there. I also reverted the recent MIPS changes that give it a separate IRQ stack, so that I could experience some worst-case situations. I found that limiting it to 24 layers deep yielded a good stack usage with room for safety, as well as being much deeper than any driver actually ever creates. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "David S. Miller" <davem@davemloft.net> Cc: David Howells <dhowells@redhat.com> Cc: Sabrina Dubroca <sd@queasysnail.net> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 23:01:47 -04:00
Eric Dumazet	77d4b1d369	net: ping: do not abuse udp_poll() Alexander reported various KASAN messages triggered in recent kernels The problem is that ping sockets should not use udp_poll() in the first place, and recent changes in UDP stack finally exposed this old bug. Fixes: `c319b4d76b` ("net: ipv4: add IPPROTO_ICMP socket kind") Fixes: `6d0bfe2261` ("net: ipv6: Add IPv6 support to the ping socket.") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Sasha Levin <alexander.levin@verizon.com> Cc: Solar Designer <solar@openwall.com> Cc: Vasiliy Kulikov <segoon@openwall.com> Cc: Lorenzo Colitti <lorenzo@google.com> Acked-By: Lorenzo Colitti <lorenzo@google.com> Tested-By: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 22:56:55 -04:00
Florian Fainelli	b07ac98946	net: dsa: Fix stale cpu_switch reference after unbind then bind Commit `9520ed8fb8` ("net: dsa: use cpu_switch instead of ds[0]") replaced the use of dst->ds[0] with dst->cpu_switch since that is functionally equivalent, however, we can now run into an use after free scenario after unbinding then rebinding the switch driver. The use after free happens because we do correctly initialize dst->cpu_switch the first time we probe in dsa_cpu_parse(), then we unbind the driver: dsa_dst_unapply() is called, and we rebind again. dst->cpu_switch now points to a freed "ds" structure, and so when we finally dereference it in dsa_cpu_port_ethtool_setup(), we oops. To fix this, simply set dst->cpu_switch to NULL in dsa_dst_unapply() which guarantees that we always correctly re-assign dst->cpu_switch in dsa_cpu_parse(). Fixes: `9520ed8fb8` ("net: dsa: use cpu_switch instead of ds[0]") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 22:55:17 -04:00
David S. Miller	e3e86b5119	ipv6: Fix leak in ipv6_gso_segment(). If ip6_find_1stfragopt() fails and we return an error we have to free up 'segs' because nobody else is going to. Fixes: `2423496af3` ("ipv6: Prevent overrun when parsing v6 header options") Reported-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 21:41:10 -04:00
Sowmini Varadhan	5071034e4a	neigh: Really delete an arp/neigh entry on "ip neigh delete" or "arp -d" The command # arp -s 62.2.0.1 a:b:c:d:e:f dev eth2 adds an entry like the following (listed by "arp -an") ? (62.2.0.1) at 0a:0b:0c:0d:0e:0f [ether] PERM on eth2 but the symmetric deletion command # arp -i eth2 -d 62.2.0.1 does not remove the PERM entry from the table, and instead leaves behind ? (62.2.0.1) at <incomplete> on eth2 The reason is that there is a refcnt of 1 for the arp_tbl itself (neigh_alloc starts off the entry with a refcnt of 1), thus the neigh_release() call from arp_invalidate() will (at best) just decrement the ref to 1, but will never actually free it from the table. To fix this, we need to do something like neigh_forced_gc: if the refcnt is 1 (i.e., on the table's ref), remove the entry from the table and free it. This patch refactors and shares common code between neigh_forced_gc and the newly added neigh_remove_one. A similar issue exists for IPv6 Neighbor Cache entries, and is fixed in a similar manner by this patch. Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Reviewed-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 21:37:18 -04:00
Florian Fainelli	14be36c2c9	net: dsa: Initialize all CPU and enabled ports masks in dsa_ds_parse() There was no reason for duplicating the code that initializes ds->enabled_port_mask in both dsa_parse_ports_dn() and dsa_parse_ports(), instead move this to dsa_ds_parse() which is early enough before ops->setup() has run. While at it, we can now make dsa_is_cpu_port() check ds->cpu_port_mask which is a step towards being multi-CPU port capable. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 20:05:15 -04:00
Florian Fainelli	e41c1b5030	net: dsa: Consistently use dsa_port for dsa_*_port_{apply, unapply} We have all the information we need in dsa_port, so use it instead of repeating the same arguments over and over again. Suggested-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 20:05:15 -04:00
Florian Fainelli	937c7df85c	net: dsa: Pass dsa_port reference to ethtool setup/restore We do not need to have a reference to a dsa_switch, instead we should pass a reference to a CPU dsa_port, change that. This is a preliminary change to better support multiple CPU ports. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 20:05:15 -04:00
Soheil Hassas Yeganeh	38b257938a	sock: reset sk_err when the error queue is empty Prior to `f5f99309fa` (sock: do not set sk_err in sock_dequeue_err_skb), sk_err was reset to the error of the skb on the head of the error queue. Applications, most notably ping, are relying on this behavior to reset sk_err for ICMP packets. Set sk_err to the ICMP error when there is an ICMP packet at the head of the error queue. Fixes: `f5f99309fa` (sock: do not set sk_err in sock_dequeue_err_skb) Reported-by: Cyril Hrubis <chrubis@suse.cz> Tested-by: Cyril Hrubis <chrubis@suse.cz> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 20:01:53 -04:00
Colin Ian King	1820dd0633	rxrpc: remove redundant proc_remove call The proc_remove call is dead code as it occurs after a return and hence can never be called. Remove it. Detected by CoverityScan, CID#1437743 ("Logically dead code") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 19:59:11 -04:00
Eric Dumazet	8e2f6dd298	dccp: consistently use dccp_write_space() DCCP uses dccp_write_space() for sk->sk_write_space method. Unfortunately a passive connection (as provided by accept()) is using the generic sk_stream_write_space() function. Lets simply inherit sk->sk_write_space from the parent instead of forcing the generic one. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 19:58:35 -04:00
Joe Perches	fbd0ac6042	net-procfs: Use vsnprintf extension %phN Save a bit of code by using the kernel extension. $ size net/core/net-procfs.o* text data bss dec hex filename 3701 120 0 3821 eed net/core/net-procfs.o.new 3764 120 0 3884 f2c net/core/net-procfs.o.old Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 19:52:58 -04:00
Liam McBirnie	5f733ee68f	ip6_tunnel: fix traffic class routing for tunnels ip6_route_output() requires that the flowlabel contains the traffic class for policy routing. Commit `0e9a709560` ("ip6_tunnel, ip6_gre: fix setting of DSCP on encapsulated packets") removed the code which previously added the traffic class to the flowlabel. The traffic class is added here because only route lookup needs the flowlabel to contain the traffic class. Fixes: `0e9a709560` ("ip6_tunnel, ip6_gre: fix setting of DSCP on encapsulated packets") Signed-off-by: Liam McBirnie <liam.mcbirnie@boeing.com> Acked-by: Peter Dawson <peter.a.dawson@boeing.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 19:49:33 -04:00
Or Gerlitz	4d80cc0aaa	net/sched: cls_flower: add support for matching on ip tos and ttl Benefit from the support of ip header fields dissection and allow users to set rules matching on ipv4 tos and ttl or ipv6 traffic-class and hoplimit. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 18:12:24 -04:00
Or Gerlitz	518d8a2e9b	net/flow_dissector: add support for dissection of misc ip header fields Add support for dissection of ip tos and ttl and ipv6 traffic-class and hoplimit. Both are dissected into the same struct. Uses similar call to ip dissection function as with tcp, arp and others. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 18:12:23 -04:00
Linus Torvalds	125f42b0e2	NFS client bugfixes for Linux 4.12 Bugfixes include: - Fix a typo in commit `e092693443` that breaks copy offload - Fix the connect error propagation in xs_tcp_setup_socket() - Fix a lock leak in nfs40_walk_client_list - Verify that pNFS requests lie within the offset range of the layout segment. -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJZM0YBAAoJEGcL54qWCgDyLUwQALaPEVp00UMdDR0in7MIFKsO 2mgi7pOyn6po3EjxKbGtjAbL4nSlVxdaFpCIGg47YXrl9/95Zjjmyke+iwRdnMsa ZPyXwfhVRa80fxbOogAverNCnCptoHoG7EzdWuCTcOOxMxR3Ixs7wVJXrs+7ig+r IdvIAyTsiDYuP5yVp5KkmJCtLGc0Ze20rb7VgdQJfdiLibWvfYCLZ9CgfAQkdAMU RIlbT0/BG13XDqwh/C2V1vLge0VfpT5p8qbIb/kFyQ0ZJUUiicGGGjp3u/yj0aG9 ljldI34WmQpsy+nCNN4dEgsF461ECvWLwRZnnpN9nv7VurUBpJNUqHLnubvDbzhh w8QX54ceEWuQAjg96keNuYOhoG53Omle2/Cm+nmiJOmShJbJ0yh4OcB9DYe0gdYa 5YXbKRjPvf/HfdE7PPpvbPG2E211zfvkLdHnFxswggWyGrh23kqlWrpcHpZomGNW GbJLfIfhyEfBjCPdNJT3Tzvewo2LkcTNLb+3mJhkxOegkdops8vGYA9G2mba3Daj 1HWl1yFAdzlEf2H1Cb8Y2ZrJKHAmaYBKBkKZYUeAcr6EtoxNqnNMP+PEDcVIzPKg 6Jq7DiYwYksK+XDWK9G4QBguKKGLvYtv0MIA3QDX+bBGLo+eFYxc2iaaJefYNdkK +vdLHclg/YpepLg+Ui21 =P2bm -----END PGP SIGNATURE----- Merge tag 'nfs-for-4.12-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull NFS client bugfixes from Trond Myklebust: "Bugfixes include: - Fix a typo in commit `e092693443` ("NFS append COMMIT after synchronous COPY") that breaks copy offload - Fix the connect error propagation in xs_tcp_setup_socket() - Fix a lock leak in nfs40_walk_client_list - Verify that pNFS requests lie within the offset range of the layout segment" * tag 'nfs-for-4.12-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: nfs: Mark unnecessarily extern functions as static SUNRPC: ensure correct error is reported by xs_tcp_setup_socket() NFSv4.0: Fix a lock leak in nfs40_walk_client_list pnfs: Fix the check for requests in range of layout segment xprtrdma: Delete an error message for a failed memory allocation in xprt_rdma_bc_setup() pNFS/flexfiles: missing error code in 
ff_layout_alloc_lseg() NFS fix COMMIT after COPY	2017-06-04 11:56:53 -07:00
Eric Dumazet	f4d0166661	tcp: remove unnecessary skb_reset_tail_pointer() __pskb_trim_head() does not need to reset skb tail pointer. Also change the comments, __pskb_pull_head() does not exist. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-02 14:28:25 -04:00
Yuchung Cheng	775e68a93f	tcp: use TS opt on RTTs for congestion control Currently when a data packet is retransmitted, we do not compute an RTT sample for congestion control due to Kern's check. Therefore the congestion control that uses RTT signals may not receive any update during loss recovery which could last many round trips. For example, BBR and Vegas may not be able to update its min RTT estimation if the network path has shortened until it recovers from losses. This patch mitigates that by using TCP timestamp options for RTT measurement for congestion control. Note that we already use timestamps for RTT estimation. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-02 14:19:23 -04:00
Yuchung Cheng	44abafc4cc	tcp: disallow cwnd undo when switching congestion control When the sender switches its congestion control during loss recovery, if the recovery is spurious then it may incorrectly revert cwnd and ssthresh to the older values set by a previous congestion control. Consider a congestion control (like BBR) that does not use ssthresh and keeps it infinite: the connection may incorrectly revert cwnd to an infinite value when switching from BBR to another congestion control. This patch fixes it by disallowing such cwnd undo operation upon switching congestion control. Note that undo_marker is not reset s.t. the packets that were incorrectly marked lost would be corrected. We only avoid undoing the cwnd in tcp_undo_cwnd_reduction(). Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-02 14:18:13 -04:00
Ben Hutchings	6e80ac5cc9	ipv6: xfrm: Handle errors reported by xfrm6_find_1stfragopt() xfrm6_find_1stfragopt() may now return an error code and we must not treat it as a length. Fixes: `2423496af3` ("ipv6: Prevent overrun when parsing v6 header options") Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Acked-by: Craig Gallek <kraig@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-02 13:57:27 -04:00
Xin Long	ff356414dc	sctp: merge sctp_stream_new and sctp_stream_init Since last patch, sctp doesn't need to alloc memory for asoc->stream any more. sctp_stream_new and sctp_stream_init both are used to alloc memory for stream.in or stream.out, and their names are also confusing. This patch is to merge them into sctp_stream_init, and only pass stream and streamcnt parameters into it, instead of the whole asoc. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-02 13:56:26 -04:00
Xin Long	cee360ab4d	sctp: define the member stream as an object instead of pointer in asoc As Marcelo's suggestion, stream is a fixed size member of asoc and would not grow with more streams. To avoid an allocation for it, this patch is to define it as an object instead of pointer and update the places using it, also create sctp_stream_update() called in sctp_assoc_update() to migrate the stream info from one stream to another. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-02 13:56:26 -04:00
David S. Miller	13fb6c2c7f	Just two fixes: * fix the per-CPU drop counters to not be added to the rx_packets counter, but really the drop counter * fix TX aggregation start/stop callback races by setting bits instead of allocating and queueing an skb Merge tag 'mac80211-for-davem-2017-06-02' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== Just two fixes: * fix the per-CPU drop counters to not be added to the rx_packets counter, but really the drop counter * fix TX aggregation start/stop callback races by setting bits instead of allocating and queueing an skb ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-02 10:37:11 -04:00
Florian Fainelli	ac2629a479	net: dsa: Move dsa_switch_{suspend,resume} out of legacy.c dsa_switch_suspend() and dsa_switch_resume() are functions that belong in net/dsa/dsa.c and are not part of the legacy platform support code. Fixes: `a6a71f19fe` ("net: dsa: isolate legacy code") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-02 10:31:16 -04:00
Vivien Didelot	fe47d56306	net: dsa: factor skb freeing on xmit As of `a86d8becc3` ("net: dsa: Factor bottom tag receive functions"), the rcv caller frees the original SKB in case of error. Be symmetric with that and make the xmit caller do the same. At the same time, fix the checkpatch NULL comparison check: CHECK: Comparison to NULL could be written "!nskb" #208: FILE: net/dsa/tag_trailer.c:35: + if (nskb == NULL) Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-01 17:34:56 -04:00
Vivien Didelot	5470979585	net: dsa: remove out_drop label in taggers rcv Many rcv functions from net/dsa/tag_*.c have a useless out_drop goto label which simply returns NULL. Kill it in favor of the obvious. Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-01 17:34:56 -04:00
Vivien Didelot	02f840cbc9	net: dsa: do not cast dst dsa_ptr is not a void pointer anymore since Nov 2011, as of `cf50dcc24f` ("dsa: Change dsa_uses_{dsa, trailer}_tags() into inline functions"), but an explicit dsa_switch_tree pointer, thus remove the (void *) cast. Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-01 17:34:56 -04:00
Vivien Didelot	73a7ece8f7	net: dsa: comment hot path requirements The DSA layer uses inline helpers and copy of the tagging functions for faster access in hot path. Add comments to detail that. Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-01 17:34:56 -04:00
Johannes Berg	e165bc02a0	mac80211: fix dropped counter in multiqueue RX In the commit enabling per-CPU station statistics, I inadvertently copy-pasted some code to update rx_packets and forgot to change it to update rx_dropped_misc. Fix that. This addresses https://bugzilla.kernel.org/show_bug.cgi?id=195953. Fixes: `c9c5962b56` ("mac80211: enable collecting station statistics per-CPU") Reported-by: Petru-Florin Mihancea <petrum@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2017-06-01 21:26:03 +02:00
Nikolay Aleksandrov	aeb073241f	net: bridge: start hello timer only if device is up When the transition of NO_STP -> KERNEL_STP was fixed by always calling mod_timer in br_stp_start, it introduced a new regression which causes the timer to be armed even when the bridge is down, and since we stop the timers in its ndo_stop() function, they never get disabled if the device is destroyed before it's upped. To reproduce: $ while :; do ip l add br0 type bridge hello_time 100; brctl stp br0 on; ip l del br0; done; CC: Xin Long <lucien.xin@gmail.com> CC: Ivan Vecera <cera@cera.cz> CC: Sebastian Ott <sebott@linux.vnet.ibm.com> Reported-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Fixes: `6d18c732b9` ("bridge: start hello_timer when enabling KERNEL_STP in br_stp_start") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-01 12:28:31 -04:00
Colin Ian King	7b954ed752	net: dsa: make function ksz_rcv static function ksz_rcv can be made static as it does not need to be in global scope. Reformat arguments to make it checkpatch warning free too. Cleans up sparse warning: "symbol 'ksz_rcv' was not declared. Should it be static?" Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Woojung Huh <Woojung.Huh@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-01 12:12:40 -04:00
Nicolas Dichtel	7212462fa6	netlink: don't send unknown nsid The NETLINK_F_LISTEN_ALL_NSID option enables a socket to listen to all netns that have a nsid assigned in the netns where the netlink socket is opened. The nsid is sent as metadata to userland, but the existence of this nsid is checked only for netns that are different from the socket netns. Thus, if no nsid is assigned to the socket netns, NETNSA_NSID_NOT_ASSIGNED is reported to userland. This value is confusing and useless. After this patch, only valid nsids are sent to userland. Reported-by: Flavio Leitner <fbl@sysclose.org> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-01 11:49:39 -04:00
Roopa Prabhu	ba52d61e0f	ipv4: route: restore skb_dst_set in inet_rtm_getroute recent updates to inet_rtm_getroute dropped skb_dst_set in inet_rtm_getroute. This patch restores it because it is needed to release the dst correctly. Fixes: `3765d35ed8` ("net: ipv4: Convert inet_rtm_getroute to rcu versions of route lookup") Reported-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-01 11:30:41 -04:00
Woojung Huh	8b8010fb78	dsa: add support for Microchip KSZ tail tagging Adding support for the Microchip KSZ switch family tail tagging. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Woojung Huh <Woojung.Huh@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-31 20:56:31 -04:00
Alexei Starovoitov	50bbfed967	bpf: track stack depth of classic bpf programs To track stack depth of classic bpf programs we only need to analyze ST|STX instructions, since check_load_and_stores() verifies that programs can load from stack only after write. We also need to change the way cBPF stack slots map to eBPF stack, since typical classic programs are using slots 0 and 1, so they need to map to stack offsets -4 and -8 respectively in order to take advantage of small stack interpreter and JITs. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-31 19:29:47 -04:00
Roopa Prabhu	c2e8471d98	mpls: fix clearing of dead nh_flags on link up Recent fixes to use WRITE_ONCE for nh_flags on link up accidentally ended up leaving the dead flags on a nh. This patch fixes the WRITE_ONCE to use freshly evaluated nh_flags. Fixes: `39eb8cd175` ("net: mpls: rt_nhn_alive and nh_flags should be accessed using READ_ONCE") Reported-by: Satish Ashok <sashok@cumulusnetworks.com> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-31 14:48:24 -04:00
Vlad Yasevich	8c6c918da1	rtnetlink: use the new rtnl_get_event() interface Small clean-up to rtmsg_ifinfo() to use the rtnl_get_event() interface instead of using 'internal' values directly. Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-31 13:08:36 -04:00
Vivien Didelot	23c9ee4934	net: dsa: remove dev arg of dsa_register_switch The current dsa_register_switch function takes a useless struct device pointer argument, which always equals ds->dev. Drivers either call it with ds->dev, or with the same device pointer passed to dsa_switch_alloc, which ends up being assigned to ds->dev. This patch removes the second argument of the dsa_register_switch and _dsa_register_switch functions. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-31 12:35:43 -04:00
Douglas Caetano dos Santos	15e5651525	tcp: reinitialize MTU probing when setting MSS in a TCP repair MTU probing initialization occurred only at connect() and at SYN or SYN-ACK reception, but the former sets MSS to either the default or the user set value (through TCP_MAXSEG sockopt) and the latter never happens with repaired sockets. The result was that, with MTU probing enabled and unless TCP_MAXSEG sockopt was used before connect(), probing would be stuck at tcp_base_mss value until tcp_probe_interval seconds have passed. Signed-off-by: Douglas Caetano dos Santos <douglascs@taghos.com.br> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-31 12:28:59 -04:00
NeilBrown	6ea44adce9	SUNRPC: ensure correct error is reported by xs_tcp_setup_socket() If you attempt a TCP mount from a host that is unreachable in a way that triggers an immediate error from kernel_connect(), that error does not propagate up; instead EAGAIN is reported. This results in call_connect_status receiving the wrong error. A case that is easy to demonstrate is to attempt to mount from an address that results in ENETUNREACH, after first deleting any default route. Without this patch, the mount.nfs process is persistently runnable and is hard to kill. With this patch it exits as it should. The problem is caused by the fact that xs_tcp_force_close() eventually calls xprt_wake_pending_tasks(xprt, -EAGAIN); which causes an error return of -EAGAIN. So when xs_tcp_setup_socket() calls xprt_wake_pending_tasks(xprt, status); the status is ignored. Fixes: `4efdd92c92` ("SUNRPC: Remove TCP client connection reset hack") Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2017-05-31 12:26:44 -04:00
Vivien Didelot	f3d736c478	net: dsa: remove dsa_port_is_bridged The helper is only used once and makes the code more complicated than it should be. Remove it and reorganize the variables so that it fits on 80 columns. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-30 18:13:41 -04:00
David Ahern	e1af005b1c	net: mpls: remove unnecessary initialization of err err is initialized to EINVAL and not used before it is set again. Remove the unnecessary initialization. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-30 11:55:33 -04:00
David Ahern	d4e7256007	net: mpls: Make nla_get_via in af_mpls.c nla_get_via is only used in af_mpls.c. Remove declaration from internal.h and move up in af_mpls.c before first use. Code move only; no functional change intended. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-30 11:55:33 -04:00
David Ahern	074350e2eb	net: mpls: Add extack messages for route add and delete failures Add error messages for failures in adding and deleting mpls routes. This covers most of the annoying EINVAL errors. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-30 11:55:33 -04:00
David Ahern	b7b386f42f	net: mpls: Pull common label check into helper mpls_route_add and mpls_route_del have the same checks on the label. Move to a helper. Avoid duplicate extack messages in the next patch. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-30 11:55:32 -04:00
David Ahern	a1f10abe12	net: Fill in extack for mpls lwt encap Fill in extack for errors in build_state for mpls lwt encap including passing extack to nla_get_labels and adding error messages for failures in it. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-30 11:55:32 -04:00
David Ahern	9ae2872748	net: add extack arg to lwtunnel build state Pass extack arg down to lwtunnel_build_state and the build_state callbacks. Add messages for failures in lwtunnel_build_state, and add the extack arg to nla_parse where possible in the build_state callbacks. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-30 11:55:32 -04:00
David Ahern	c255bd681d	net: lwtunnel: Add extack to encap attr validation Pass extack down to lwtunnel_valid_encap_type and lwtunnel_valid_encap_type_attr. Add messages for unknown or unsupported encap types. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-30 11:55:31 -04:00
David Ahern	7805599895	net: ipv4: Add extack message for invalid prefix or length Add extack error message for invalid prefix length and invalid prefix. Example of the latter is a route spec containing 172.16.100.1/24, where the /24 mask means the lower 8-bits should be 0. Amazing how easy that one is to overlook when an EINVAL is returned. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-30 11:55:31 -04:00
David Ahern	ba277e8e05	net: ipv4: refactor key and length checks fib_table_insert and fib_table_delete have the same checks on the prefix and length. Refactor into a helper. Avoids duplicate extack messages in the next patch. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-30 11:55:31 -04:00
Bjorn Andersson	5d473fedd1	mac80211: Invoke TX LED in more code paths ieee80211_tx_status() is only one of the possible ways a driver can report a handled packet; some drivers call this for every packet while others call it rarely or never. In order to invoke the TX LED in the non-status reporting cases this patch pushes the call to ieee80211_led_tx() into ieee80211_report_used_skb(), which is shared between the various code paths. Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2017-05-30 09:18:13 +02:00
Johannes Berg	e45a79da86	skbuff/mac80211: introduce and use skb_put_zero() This pattern was introduced a number of times in mac80211 just now, and since it's present in a number of other places it makes sense to add a little helper for it. This just adds the helper and transforms the mac80211 code, a later patch will transform other places. Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2017-05-30 09:14:30 +02:00
Johannes Berg	7a7c0a6438	mac80211: fix TX aggregation start/stop callback race When starting or stopping an aggregation session, one of the steps is that the driver calls back to mac80211 that the start/stop can proceed. This is handled by queueing up a fake SKB and processing it from the normal iface/sdata work. Since this isn't flushed when disassociating, the following race is possible: * associate * start aggregation session * driver callback * disassociate * associate again to the same AP * callback processing runs, leading to a WARN_ON() that the TID hadn't requested aggregation If the second association isn't to the same AP, there would only be a message printed ("Could not find station: <addr>"), but the same race could happen. Fix this by not going the whole detour with a fake SKB etc. but simply looking up the aggregation session in the driver callback, marking it with a START_CB/STOP_CB bit and then scheduling the regular aggregation work that will now process these bits as well. This also simplifies the code and gets rid of the whole problem with allocation failures of said skb, which could have left the session in limbo. Reported-by: Jouni Malinen <j@w1.fi> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2017-05-30 09:08:40 +02:00
David S. Miller	468b0df61a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for your net tree, they are: 1) Conntrack SCTP CRC32c checksum mangling may operate on non-linear skbuff, patch from Davide Caratti. 2) nf_tables rb-tree set backend does not handle element re-addition after deletion in the same transaction, leading to infinite loop. 3) Atomically clear the IPS_SRC_NAT_DONE_BIT on nat module removal, from Liping Zhang. 4) Conntrack hashtable resizing while ctnetlink dump is in progress leads to a dead reference to released objects in the lists, also from Liping. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-29 23:16:54 -04:00
Vlad Yasevich	3d3ea5af5c	rtnl: Add support for netdev event to link messages When netdev events happen, a rtnetlink_event() handler will send messages for every event in its white list. These messages contain current information about a particular device, but they do not include the information about which event just happened. So, it is impossible to tell what just happened for these events. This patch adds a new extension to RTM_NEWLINK message called IFLA_EVENT that would have an encoding of event that triggered this message. This would allow the message consumer to easily determine if it needs to perform certain actions. Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-27 18:51:41 -04:00
David S. Miller	34aa83c2fc	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Overlapping changes in drivers/net/phy/marvell.c, bug fix in 'net' restricting a HW workaround alongside cleanups in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-26 20:46:35 -04:00
Linus Torvalds	6741d51699	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Fix state pruning in bpf verifier wrt. alignment, from Daniel Borkmann. 2) Handle non-linear SKBs properly in SCTP ICMP parsing, from Davide Caratti. 3) Fix bit field definitions for rss_hash_type of descriptors in mlx5 driver, from Jesper Brouer. 4) Defer slave->link updates until bonding is ready to do a full commit to the new settings, from Nithin Sujir. 5) Properly reference count ipv4 FIB metrics to avoid use after free situations, from Eric Dumazet and several others including Cong Wang and Julian Anastasov. 6) Fix races in llc_ui_bind(), from Lin Zhang. 7) Fix regression of ESP UDP encapsulation for TCP packets, from Steffen Klassert. 8) Fix mdio-octeon driver Kconfig deps, from Randy Dunlap. 9) Fix regression in setting DSCP on ipv6/GRE encapsulation, from Peter Dawson. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (43 commits) ipv4: add reference counting to metrics net: ethernet: ax88796: don't call free_irq without request_irq first ip6_tunnel, ip6_gre: fix setting of DSCP on encapsulated packets sctp: fix ICMP processing if skb is non-linear net: llc: add lock_sock in llc_ui_bind to avoid a race condition bonding: Don't update slave->link until ready to commit test_bpf: Add a couple of tests for BPF_JSGE. 
bpf: add various verifier test cases bpf: fix wrong exposure of map_flags into fdinfo for lpm bpf: add bpf_clone_redirect to bpf_helper_changes_pkt_data bpf: properly reset caller saved regs after helper call and ld_abs/ind bpf: fix incorrect pruning decision when alignment must be tracked arp: fixed -Wuninitialized compiler warning tcp: avoid fastopen API to be used on AF_UNSPEC net: move somaxconn init from sysctl code net: fix potential null pointer dereference geneve: fix fill_info when using collect_metadata virtio-net: enable TSO/checksum offloads for Q-in-Q vlans be2net: Fix offload features for Q-in-Q packets vlan: Fix tcp checksum offloads in Q-in-Q vlans ...	2017-05-26 13:51:01 -07:00
Ido Schimmel	9341b988e6	bridge: Export multicast enabled state During enslavement to a bridge, after the CHANGEUPPER is sent, the multicast enabled state of the bridge isn't propagated down to the offloading driver unless it's changed. This patch allows such drivers to query the multicast enabled state from the bridge, so that they'll be able to correctly configure their flood tables during port enslavement. In case multicast is disabled, unregistered multicast packets can be treated as broadcast and be flooded through all the bridge ports. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-26 15:18:44 -04:00
Ido Schimmel	1f51445af3	bridge: Export VLAN filtering state It's useful for drivers supporting bridge offload to be able to query the bridge's VLAN filtering state. Currently, upon enslavement to a bridge master, the offloading driver will only learn about the bridge's VLAN filtering state after the bridge device was already linked with its slave. Being able to query the bridge's VLAN filtering state allows such drivers to forbid enslavement in case resource couldn't be allocated for a VLAN-aware bridge and also choose the correct initialization routine for the enslaved port, which is dependent on the bridge type. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-26 15:18:44 -04:00
Eric Dumazet	3fb07daff8	ipv4: add reference counting to metrics Andrey Konovalov reported crashes in ipv4_mtu(). I could reproduce the issue with KASAN kernels, between 10.246.7.151 and 10.246.7.152: 1) 20 concurrent netperf -t TCP_RR -H 10.246.7.152 -l 1000 & 2) At the same time run the following loop: while :; do ip ro add 10.246.7.152 dev eth0 src 10.246.7.151 mtu 1500; ip ro del 10.246.7.152 dev eth0 src 10.246.7.151 mtu 1500; done Cong Wang attempted to add back rt->fi in commit `82486aa6f1` ("ipv4: restore rt->fi for reference counting") but this proved to add some issues that were complex to solve. Instead, I suggested to add a refcount to the metrics themselves, being a standalone object (in particular, no reference to other objects). I tried to make this patch as small as possible to ease its backport, instead of being super clean. Note that we believe that only ipv4 dst need to take care of the metric refcount. But if this is wrong, this patch adds the basic infrastructure to extend this to other families. Many thanks to Julian Anastasov for reviewing this patch, and Cong Wang for his efforts on this problem. Fixes: `2860583fe8` ("ipv4: Kill rt->fi") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Julian Anastasov <ja@ssi.bg> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-26 14:57:07 -04:00
Peter Dawson	0e9a709560	ip6_tunnel, ip6_gre: fix setting of DSCP on encapsulated packets This fix addresses two problems in the way the DSCP field is formulated on the encapsulating header of IPv6 tunnels. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=195661 1) The IPv6 tunneling code was manipulating the DSCP field of the encapsulating packet using the 32b flowlabel. Since the flowlabel is only the lower 20b it was incorrect to assume that the upper 12b containing the DSCP and ECN fields would remain intact when formulating the encapsulating header. This fix handles the 'inherit' and 'fixed-value' DSCP cases explicitly using the extant dsfield u8 variable. 2) The use of INET_ECN_encapsulate(0, dsfield) in ip6_tnl_xmit was incorrect and resulted in the DSCP value always being set to 0. Commit `90427ef5d2` ("ipv6: fix flow labels when the traffic class is non-0") caused the regression by masking out the flowlabel which exposed the incorrect handling of the DSCP portion of the flowlabel in ip6_tunnel and ip6_gre. Fixes: `90427ef5d2` ("ipv6: fix flow labels when the traffic class is non-0") Signed-off-by: Peter Dawson <peter.a.dawson@boeing.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-26 14:54:39 -04:00
Davide Caratti	804ec7ebe8	sctp: fix ICMP processing if skb is non-linear sometimes ICMP replies to INIT chunks are ignored by the client, even if the encapsulated SCTP headers match an open socket. This happens when the ICMP packet is carried by a paged skb: use skb_header_pointer() to read packet contents beyond the SCTP header, so that chunk header and initiate tag are validated correctly. v2: - don't use skb_header_pointer() to read the transport header, since icmp_socket_deliver() already puts these 8 bytes in the linear area. - change commit message to make specific reference to INIT chunks. Signed-off-by: Davide Caratti <dcaratti@redhat.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Reviewed-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-26 14:40:46 -04:00
linzhang	0908cf4dfe	net: llc: add lock_sock in llc_ui_bind to avoid a race condition There is a race condition in llc_ui_bind if two or more processes/threads try to bind the same socket. If more than one process/thread binds the same socket successfully, that leads to two problems: first, this is not the expected behavior; second, it can leave the kernel in an unstable state or cause an oops (in my simple test case, llc2.ko could not be unloaded). The current code tests the SOCK_ZAPPED bit to prevent a process from binding the same socket twice, but that cannot prevent multiple processes/threads from binding the same socket at the same time. So, add lock_sock in llc_ui_bind like the others, such as llc_ui_connect. Signed-off-by: Lin Zhang <xiaolou4617@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-26 14:20:29 -04:00
Roopa Prabhu	18c3a61c42	net: ipv6: RTM_GETROUTE: return matched fib result when requested This patch adds support to return matched fib result when RTM_F_FIB_MATCH flag is specified in RTM_GETROUTE request. This is useful for user-space applications/controllers wanting to query a matching route. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-26 14:12:51 -04:00
Roopa Prabhu	b61798130f	net: ipv4: RTM_GETROUTE: return matched fib result when requested This patch adds support to return matched fib result when RTM_F_FIB_MATCH flag is specified in RTM_GETROUTE request. This is useful for user-space applications/controllers wanting to query a matching route. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-26 14:12:51 -04:00
David Ahern	6ffd903415	net: ipv4: Save trie prefix to fib lookup result Prefix is needed for returning matching route spec on get route request. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-26 14:12:50 -04:00
David Ahern	3765d35ed8	net: ipv4: Convert inet_rtm_getroute to rcu versions of route lookup Convert inet_rtm_getroute to use ip_route_input_rcu and ip_route_output_key_hash_rcu passing the fib_result arg to both. The rcu lock is held through the creation of the response, so the rtable/dst does not need to be attached to the skb and is passed to rt_fill_info directly. In converting from ip_route_output_key to ip_route_output_key_hash_rcu the xfrm_lookup_route in ip_route_output_flow is dropped since flowi4_proto is not set for a route get request. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-26 14:12:50 -04:00
David Ahern	d3166e0c95	net: ipv4: Remove event arg to rt_fill_info rt_fill_info has 1 caller with the event set to RTM_NEWROUTE. Given that, remove the arg and use RTM_NEWROUTE directly in rt_fill_info. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-26 14:12:49 -04:00
David Ahern	5510cdf7be	net: ipv4: refactor ip_route_input_noref A later patch wants access to the fib result on an input route lookup with the rcu lock held. Refactor ip_route_input_noref pushing the logic between rcu_read_lock ... rcu_read_unlock into a new helper that takes the fib_result as an input arg. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-26 14:12:49 -04:00
David Ahern	3abd1ade67	net: ipv4: refactor __ip_route_output_key_hash A later patch wants access to the fib result on an output route lookup with the rcu lock held. Refactor __ip_route_output_key_hash, pushing the logic between rcu_read_lock ... rcu_read_unlock into a new helper with the fib_result as an input arg. To keep the name length under control remove the leading underscores from the name and add _rcu to the name of the new helper indicating it is called with the rcu read lock held. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-26 14:12:49 -04:00
Linus Torvalds	80941b2aeb	A bunch of make W=1 and static checker fixups, a RECONNECT_SEQ messenger patch from Zheng and Luis' fallocate fix. Merge tag 'ceph-for-4.12-rc3' of git://github.com/ceph/ceph-client Pull ceph fixes from Ilya Dryomov: "A bunch of make W=1 and static checker fixups, a RECONNECT_SEQ messenger patch from Zheng and Luis' fallocate fix" * tag 'ceph-for-4.12-rc3' of git://github.com/ceph/ceph-client: ceph: check that the new inode size is within limits in ceph_fallocate() libceph: cleanup old messages according to reconnect seq libceph: NULL deref on crush_decode() error path libceph: fix error handling in process_one_ticket() libceph: validate blob_struct_v in process_one_ticket() libceph: drop version variable from ceph_monmap_decode() libceph: make ceph_msg_data_advance() return void libceph: use kbasename() and kill ceph_file_part()	2017-05-26 09:35:22 -07:00
Daniel Borkmann	41703a7310	bpf: add bpf_clone_redirect to bpf_helper_changes_pkt_data The bpf_clone_redirect() still needs to be listed in bpf_helper_changes_pkt_data() since we call into bpf_try_make_head_writable() from there, thus we need to invalidate prior pkt regs as well. Fixes: `36bbef52c7` ("bpf: direct packet write and access for helpers for clsact progs") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-25 13:44:28 -04:00
Ihar Hrachyshka	5990baaa6d	arp: fixed -Wuninitialized compiler warning Commit `7d472a59c0` ("arp: always override existing neigh entries with gratuitous ARP") introduced a compiler warning: net/ipv4/arp.c:880:35: warning: 'addr_type' may be used uninitialized in this function [-Wmaybe-uninitialized] While the code logic seems to be correct and doesn't allow the variable to be used uninitialized, and the warning is not consistently reproducible, it's still worth fixing so that other people don't waste time looking at the warning in case it pops up in their build environment. Yes, the compiler is probably at fault, but we will need to accommodate it. Fixes: `7d472a59c0` ("arp: always override existing neigh entries with gratuitous ARP") Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-25 13:38:20 -04:00
Wei Wang	ba615f6752	tcp: avoid fastopen API to be used on AF_UNSPEC The fastopen API should be used to perform fastopen operations on the TCP socket. It does not make sense to use the fastopen API to perform a disconnect by calling it with AF_UNSPEC. The fastopen data path is also prone to race conditions and bugs when used with AF_UNSPEC. One issue reported and analyzed by Vegard Nossum is as follows: +++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Thread A: Thread B: ------------------------------------------------------------------------ sendto() - tcp_sendmsg() - sk_stream_memory_free() = 0 - goto wait_for_sndbuf - sk_stream_wait_memory() - sk_wait_event() // sleep \| sendto(flags=MSG_FASTOPEN, dest_addr=AF_UNSPEC) \| - tcp_sendmsg() \| - tcp_sendmsg_fastopen() \| - __inet_stream_connect() \| - tcp_disconnect() //because of AF_UNSPEC \| - tcp_transmit_skb()// send RST \| - return 0; // no reconnect! \| - sk_stream_wait_connect() \| - sock_error() \| - xchg(&sk->sk_err, 0) \| - return -ECONNRESET - ... // wake up, see sk->sk_err == 0 - skb_entail() on TCP_CLOSE socket If the connection is reopened then we will send a brand new SYN packet after thread A has already queued a buffer. At this point I think the socket internal state (sequence numbers etc.) becomes messed up. When the new connection is closed, the FIN-ACK is rejected because the sequence number is outside the window. The other side tries to retransmit, but __tcp_retransmit_skb() calls tcp_trim_head() on an empty skb which corrupts the skb data length and hits a BUG() in copy_and_csum_bits(). +++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Hence, this patch adds a check for AF_UNSPEC in the fastopen data path and returns EOPNOTSUPP to the user in that case.
Fixes: `cf60af03ca` ("tcp: Fast Open client - sendmsg(MSG_FASTOPEN)") Reported-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Wei Wang <weiwan@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-25 13:30:34 -04:00
David Howells	2baec2c3f8	rxrpc: Support network namespacing Support network namespacing in AF_RXRPC with the following changes: (1) All the local endpoint, peer and call lists, locks, counters, etc. are moved into the per-namespace record. (2) All the connection tracking is moved into the per-namespace record with the exception of the client connection ID tree, which is kept global so that connection IDs are kept unique per-machine. (3) Each namespace gets its own epoch. This allows each network namespace to pretend to be a separate client machine. (4) The /proc/net/rxrpc_xxx files are now called /proc/net/rxrpc/xxx and the contents reflect the namespace. fs/afs/ should be okay with this patch as it explicitly requires the current net namespace to be init_net to permit a mount to proceed at the moment. It will, however, need updating so that cells, IP addresses and DNS records are per-namespace also. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-25 13:15:11 -04:00
Rosen, Rami	878cd3ba37	net/packet: remove unused parameter in prb_curr_blk_in_use(). This patch removes unused parameter from prb_curr_blk_in_use() method in net/packet/af_packet.c. Signed-off-by: Rami Rosen <rami.rosen@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-25 13:15:11 -04:00
Roman Kapl	7c3f1875c6	net: move somaxconn init from sysctl code The default value for somaxconn is set in sysctl_core_net_init(), but this function is not called when kernel is configured without CONFIG_SYSCTL. This results in the kernel not being able to accept TCP connections, because the backlog has zero size. Usually, the user ends up with: "TCP: request_sock_TCP: Possible SYN flooding on port 7. Dropping request. Check SNMP counters." If SYN cookies are not enabled the connection is rejected. Before `ef547f2ac1` (tcp: remove max_qlen_log), the effects were less severe, because the backlog was always at least eight slots long. Signed-off-by: Roman Kapl <roman.kapl@sysgo.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-25 13:12:17 -04:00
David S. Miller	52c05fc744	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Johan Hedberg says: ==================== pull request: bluetooth-next 2017-05-23 Here's the first Bluetooth & 802.15.4 pull request targeting the 4.13 kernel release. - Bluetooth 5.0 improvements (Data Length Extensions and alternate PHY) - Support for new Intel Bluetooth adapter [8087:0aaa] - Various fixes to ieee802154 code - Various fixes to HCI UART code ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-25 12:54:49 -04:00
Eric Dumazet	d0e1a1b5a8	tcp: better validation of received ack sequences Paul Fiterau Brostean reported : <quote> Linux TCP stack we analyze exhibits behavior that seems odd to me. The scenario is as follows (all packets have empty payloads, no window scaling, rcv/snd window size should not be a factor): TEST HARNESS (CLIENT) LINUX SERVER 1. - LISTEN (server listen, then accepts) 2. - --> <SEQ=100><CTL=SYN> --> SYN-RECEIVED 3. - <-- <SEQ=300><ACK=101><CTL=SYN,ACK> <-- SYN-RECEIVED 4. - --> <SEQ=101><ACK=301><CTL=ACK> --> ESTABLISHED 5. - <-- <SEQ=301><ACK=101><CTL=FIN,ACK> <-- FIN WAIT-1 (server opts to close the data connection calling "close" on the connection socket) 6. - --> <SEQ=101><ACK=99999><CTL=FIN,ACK> --> CLOSING (client sends FIN,ACK with not yet sent acknowledgement number) 7. - <-- <SEQ=302><ACK=102><CTL=ACK> <-- CLOSING (ACK is 102 instead of 101, why?) ... (silence from CLIENT) 8. - <-- <SEQ=301><ACK=102><CTL=FIN,ACK> <-- CLOSING (retransmission, again ACK is 102) Now, note that packet 6 while having the expected sequence number, acknowledges something that wasn't sent by the server. So I would expect the packet to maybe prompt an ACK response from the server, and then be ignored. Yet it is not ignored and actually leads to an increase of the acknowledgement number in the server's retransmission of the FIN,ACK packet. The explanation I found is that the FIN in packet 6 was processed, despite the acknowledgement number being unacceptable. Further experiments indeed show that the server processes this FIN, transitioning to CLOSING, then on receiving an ACK for the FIN it had send in packet 5, the server (or better said connection) transitions from CLOSING to TIME_WAIT (as signaled by netstat). </quote> Indeed, tcp_rcv_state_process() calls tcp_ack() but does not exploit the @acceptable status but for TCP_SYN_RECV state. What we want here is to send a challenge ACK, if not in TCP_SYN_RECV state. TCP_FIN_WAIT1 state is not the only state we should fix. 
Add a FLAG_NO_CHALLENGE_ACK so that tcp_rcv_state_process() can choose to send a challenge ACK and discard the packet instead of wrongly changing the socket state. With help from Neal Cardwell. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Paul Fiterau Brostean <p.fiterau-brostean@science.ru.nl> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-25 12:46:55 -04:00
WANG Cong	367a8ce896	net_sched: only create filter chains for new filters/actions tcf_chain_get() always creates a new filter chain if it is not found among the existing ones. This is totally unnecessary when we get or delete filters; a new chain should only be created for new filters (or new actions). Fixes: `5bc1701881` ("net: sched: introduce multichain support for filters") Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-25 12:15:05 -04:00
Jiri Pirko	ee538dcea2	net: sched: cls_api: make reclassify return all the way back to the original tp With the introduction of chain goto action, the reclassification would cause the re-iteration of the actual chain. It makes more sense to restart the whole thing and re-iterate starting from the original tp - start of chain 0. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-25 12:11:49 -04:00
Eric Dumazet	ce682ef6e3	tcp: fix TCP_SYNCNT flakes After the mentioned commit, some of our packetdrill tests became flaky. The TCP_SYNCNT socket option can limit the number of SYN retransmits. retransmits_timed_out() has to compare time computations based on local_clock() while timers are based on jiffies. With NTP adjustments and rounding we can observe a 999 ms delay for 1000 ms timers, so we end up sending one extra SYN packet. The gimmick added in commit `6fa12c8503` ("Revert Backoff [v3]: Calculate TCP's connection close threshold as a time value") makes no real sense for TCP_SYN_SENT sockets, where no RTO backoff can happen at all. Let's use simpler logic for TCP_SYN_SENT sockets and remove the @syn_set parameter from retransmits_timed_out(). Fixes: `9a568de481` ("tcp: switch TCP TS option (RFC 7323) to 1ms clock") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-24 16:29:57 -04:00
Vivien Didelot	64dba236a1	net: dsa: support cross-chip ageing time Now that the switchdev bridge ageing time attribute is propagated to all switch chips of the fabric, each switch can check if the requested value is valid and program itself, so that the whole fabric shares a common ageing time setting. This is especially needed for switch chips in between others, containing no bridge port members but evidently used in the data path. To achieve that, remove the condition which skips the other switches. We also don't need to identify the target switch anymore, thus remove the sw_index member of the dsa_notifier_ageing_time_info notifier structure. On ZII Dev Rev B (with two 88E6352 and one 88E6185) and ZII Dev Rev C (with two 88E6390X), we have the following hardware configuration: # ip link add name br0 type bridge # ip link set master br0 dev lan6 br0: port 1(lan6) entered blocking state br0: port 1(lan6) entered disabled state # echo 2000 > /sys/class/net/br0/bridge/ageing_time Before this patch: zii-rev-b# cat /sys/kernel/debug/mv88e6xxx/sw/age_time 300000 300000 15000 zii-rev-c# cat /sys/kernel/debug/mv88e6xxx/sw/age_time 300000 18750 After this patch: zii-rev-b# cat /sys/kernel/debug/mv88e6xxx/sw/age_time 15000 15000 15000 zii-rev-c# cat /sys/kernel/debug/mv88e6xxx/sw/age_time 18750 18750 Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-24 16:27:47 -04:00
Jiri Pirko	fdfc7dd6ca	net/sched: flower: add support for matching on tcp flags Benefit from the support of tcp flags dissection and allow user to insert rules matching on tcp flags. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-24 16:22:11 -04:00
Jiri Pirko	ac4bb5de27	net: flow_dissector: add support for dissection of tcp flags Add support for dissection of tcp flags. Uses similar function call to tcp dissection function as arp, mpls and others. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-24 16:22:11 -04:00
David S. Miller	029c58178b	Merge tag 'mac80211-for-davem-2017-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== Just two fixes this time: * fix the scheduled scan "BUG: scheduling while atomic" * check mesh address extension flags more strictly ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-24 15:31:39 -04:00
Alexander Potapenko	0ff50e83b5	net: rtnetlink: bail out from rtnl_fdb_dump() on parse error rtnl_fdb_dump() failed to check the result of nlmsg_parse(), which led to contents of \|ifm\| being uninitialized because nlh->nlmsglen was too small to accommodate \|ifm\|. The uninitialized data may affect some branches and result in unwanted effects, although kernel data doesn't seem to leak to the userspace directly. The bug has been detected with KMSAN and syzkaller. For the record, here is the KMSAN report: ================================================================== BUG: KMSAN: use of unitialized memory in rtnl_fdb_dump+0x5dc/0x1000 CPU: 0 PID: 1039 Comm: probe Not tainted 4.11.0-rc5+ #2727 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:16 dump_stack+0x143/0x1b0 lib/dump_stack.c:52 kmsan_report+0x12a/0x180 mm/kmsan/kmsan.c:1007 __kmsan_warning_32+0x66/0xb0 mm/kmsan/kmsan_instr.c:491 rtnl_fdb_dump+0x5dc/0x1000 net/core/rtnetlink.c:3230 netlink_dump+0x84f/0x1190 net/netlink/af_netlink.c:2168 __netlink_dump_start+0xc97/0xe50 net/netlink/af_netlink.c:2258 netlink_dump_start ./include/linux/netlink.h:165 rtnetlink_rcv_msg+0xae9/0xb40 net/core/rtnetlink.c:4094 netlink_rcv_skb+0x339/0x5a0 net/netlink/af_netlink.c:2339 rtnetlink_rcv+0x83/0xa0 net/core/rtnetlink.c:4110 netlink_unicast_kernel net/netlink/af_netlink.c:1272 netlink_unicast+0x13b7/0x1480 net/netlink/af_netlink.c:1298 netlink_sendmsg+0x10b8/0x10f0 net/netlink/af_netlink.c:1844 sock_sendmsg_nosec net/socket.c:633 sock_sendmsg net/socket.c:643 ___sys_sendmsg+0xd4b/0x10f0 net/socket.c:1997 __sys_sendmsg net/socket.c:2031 SYSC_sendmsg+0x2c6/0x3f0 net/socket.c:2042 SyS_sendmsg+0x87/0xb0 net/socket.c:2038 do_syscall_64+0x102/0x150 arch/x86/entry/common.c:285 entry_SYSCALL64_slow_path+0x25/0x25 arch/x86/entry/entry_64.S:246 RIP: 0033:0x401300 RSP: 002b:00007ffc3b0e6d58 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 
00000000004002b0 RCX: 0000000000401300 RDX: 0000000000000000 RSI: 00007ffc3b0e6d80 RDI: 0000000000000003 RBP: 00007ffc3b0e6e00 R08: 000000000000000b R09: 0000000000000004 R10: 000000000000000d R11: 0000000000000246 R12: 0000000000000000 R13: 00000000004065a0 R14: 0000000000406630 R15: 0000000000000000 origin: 000000008fe00056 save_stack_trace+0x59/0x60 arch/x86/kernel/stacktrace.c:59 kmsan_save_stack_with_flags mm/kmsan/kmsan.c:352 kmsan_internal_poison_shadow+0xb1/0x1a0 mm/kmsan/kmsan.c:247 kmsan_poison_shadow+0x6d/0xc0 mm/kmsan/kmsan.c:260 slab_alloc_node mm/slub.c:2743 __kmalloc_node_track_caller+0x1f4/0x390 mm/slub.c:4349 __kmalloc_reserve net/core/skbuff.c:138 __alloc_skb+0x2cd/0x740 net/core/skbuff.c:231 alloc_skb ./include/linux/skbuff.h:933 netlink_alloc_large_skb net/netlink/af_netlink.c:1144 netlink_sendmsg+0x934/0x10f0 net/netlink/af_netlink.c:1819 sock_sendmsg_nosec net/socket.c:633 sock_sendmsg net/socket.c:643 ___sys_sendmsg+0xd4b/0x10f0 net/socket.c:1997 __sys_sendmsg net/socket.c:2031 SYSC_sendmsg+0x2c6/0x3f0 net/socket.c:2042 SyS_sendmsg+0x87/0xb0 net/socket.c:2038 do_syscall_64+0x102/0x150 arch/x86/entry/common.c:285 return_from_SYSCALL_64+0x0/0x6a arch/x86/entry/entry_64.S:246 ================================================================== and the reproducer: ================================================================== #include <sys/socket.h> #include <net/if_arp.h> #include <linux/netlink.h> #include <stdint.h> int main() { int sock = socket(PF_NETLINK, SOCK_DGRAM \| SOCK_NONBLOCK, 0); struct msghdr msg; memset(&msg, 0, sizeof(msg)); char nlmsg_buf[32]; memset(nlmsg_buf, 0, sizeof(nlmsg_buf)); struct nlmsghdr *nlmsg = nlmsg_buf; nlmsg->nlmsg_len = 0x11; nlmsg->nlmsg_type = 0x1e; // RTM_NEWROUTE = RTM_BASE + 0x0e // type = 0x0e = 1110b // kind = 2 nlmsg->nlmsg_flags = 0x101; // NLM_F_ROOT \| NLM_F_REQUEST nlmsg->nlmsg_seq = 0; nlmsg->nlmsg_pid = 0; nlmsg_buf[16] = (char)7; struct iovec iov; iov.iov_base = nlmsg_buf; iov.iov_len = 17; 
msg.msg_iov = &iov; msg.msg_iovlen = 1; sendmsg(sock, &msg, 0); return 0; } ================================================================== Signed-off-by: Alexander Potapenko <glider@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-24 15:27:16 -04:00
Xin Long	7e06297768	sctp: set new_asoc temp when processing dupcookie After sctp changed to use the transport hashtable, a transport is added into the global hashtable when adding the peer to an asoc, and the asoc can then be found by searching for the transport in the hashtable. The problem is that when processing a dupcookie in sctp_sf_do_5_2_4_dupcook, a new asoc is created. A peer with the same addr and port as the one in the old asoc might be added into the new asoc, but fail to be added into the hashtable, as they also belong to the same sk. As a result, sctp's dupcookie processing cannot really work. Since the new asoc will be freed after copying its information to the old asoc, it's more like a temp asoc. So this patch fixes it by setting it as a temp asoc, to avoid adding any of its transports into the hashtable and also to avoid allocating an assoc_id. An extra thing it has to do is to also allocate stream info for any temp asoc, as sctp dupcookie processing needs it to update the old asoc. But I don't think it would hurt anything, as a temp asoc is always freed after the cookie echo packet has been processed. Reported-by: Jianwen Ji <jiji@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-24 15:21:04 -04:00
Xin Long	3ab2137915	sctp: fix stream update when processing dupcookie Since commit `3dbcc105d5` ("sctp: alloc stream info when initializing asoc"), stream and stream.out info are always allocated when creating an asoc. So checking !asoc->stream before updating stream info when processing dupcookie is no longer correct; it is better to check the asoc state instead. Fixes: `3dbcc105d5` ("sctp: alloc stream info when initializing asoc") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-24 15:21:04 -04:00
Yan, Zheng	0a2ad54107	libceph: cleanup old messages according to reconnect seq When reopening a connection, use 'reconnect seq' to clean up messages that have already been received by the peer. Link: http://tracker.ceph.com/issues/18690 Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2017-05-24 18:10:51 +02:00
Markus Elfring	d2c23c0075	xprtrdma: Delete an error message for a failed memory allocation in xprt_rdma_bc_setup() Omit an extra message for a memory allocation failure in this function. This issue was detected by using the Coccinelle software. Link: http://events.linuxfoundation.org/sites/events/files/slides/LCJ16-Refactor_Strings-WSang_0.pdf Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2017-05-24 07:53:37 -04:00
Liping Zhang	fefa92679d	netfilter: ctnetlink: fix incorrect nf_ct_put during hash resize If nf_conntrack_htable_size was adjusted by the user during the ct dump operation, we may invoke nf_ct_put twice for the same ct, i.e. the "last" ct. This causes the ct to be freed while still linked in hash buckets. It's very easy to reproduce the problem with the following commands: # while : ; do echo $RANDOM > /proc/sys/net/netfilter/nf_conntrack_buckets done # while : ; do conntrack -L done # iperf -s 127.0.0.1 & # iperf -c 127.0.0.1 -P 60 -t 36000 After a while, the system will hang like this: NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [bash:20184] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [iperf:20382] ... So if we finally find that cb->args[1] is equal to "last", a hash resize happened, and we can set cb->args[1] to 0 to fix the above issue. Fixes: `d205dc4079` ("[NETFILTER]: ctnetlink: fix deadlock in table dumping") Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2017-05-24 11:26:01 +02:00
Simon Wunderlich	71ec289e62	mac80211: enable VHT for mesh channel processing Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2017-05-24 08:58:55 +02:00
Simon Wunderlich	75d627d53e	mac80211: mesh: support sending wide bandwidth CSA To support HT and VHT CSA, beacons and action frames must include the corresponding IEs. Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> [make ieee80211_ie_build_wide_bw_cs() return void] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2017-05-24 08:58:54 +02:00
Liping Zhang	124dffea9e	netfilter: nat: use atomic bit op to clear the _SRC_NAT_DONE_BIT We need to clear the IPS_SRC_NAT_DONE_BIT to indicate that the ct has been removed from the nat_bysource table. But unfortunately, we use the non-atomic bit operation: "ct->status &= ~IPS_NAT_DONE_MASK". So there's a race condition in which we may unexpectedly clear the _DYING_BIT set by another CPU. Since we don't care about the IPS_DST_NAT_DONE_BIT, using clear_bit to clear the IPS_SRC_NAT_DONE_BIT is enough. Also note that this is the last user of a non-atomic bit operation to update the confirmed ct->status. Reported-by: Florian Westphal <fw@strlen.de> Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2017-05-23 22:54:51 +02:00
Pablo Neira Ayuso	d2df92e98a	netfilter: nft_set_rbtree: handle element re-addition after deletion The existing code selects no next branch to be inspected when re-inserting an inactive element into the rb-tree, looping endlessly. This patch restricts the check for active elements to the EEXIST case only. Fixes: `e701001e7c` ("netfilter: nft_rbtree: allow adjacent intervals with dynamic updates") Reported-by: Wolfgang Bumiller <w.bumiller@proxmox.com> Tested-by: Wolfgang Bumiller <w.bumiller@proxmox.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2017-05-23 22:54:14 +02:00
Davide Caratti	f3c0eb05e2	netfilter: conntrack: fix false CRC32c mismatch using paged skb sctp_compute_cksum() implementation assumes that at least the SCTP header is in the linear part of skb: modify conntrack error callback to avoid false CRC32c mismatch, if the transport header is partially/entirely paged. Fixes: `cf6e007eef` ("netfilter: conntrack: validate SCTP crc32c in PREROUTING") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2017-05-23 22:54:14 +02:00
Dan Carpenter	293dffaad8	libceph: NULL deref on crush_decode() error path If there is not enough space then ceph_decode_32_safe() does a goto bad. We need to return an error code in that situation. The current code returns ERR_PTR(0) which is NULL. The callers are not expecting that and it results in a NULL dereference. Fixes: `f24e9980eb` ("ceph: OSD client") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2017-05-23 20:32:32 +02:00
Ilya Dryomov	b51456a609	libceph: fix error handling in process_one_ticket() Don't leak key internals after new_session_key is populated. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Alex Elder <elder@linaro.org>	2017-05-23 20:32:28 +02:00
Ilya Dryomov	d18a1247c4	libceph: validate blob_struct_v in process_one_ticket() None of these are validated in userspace, but since we do validate reply_struct_v in ceph_x_proc_ticket_reply(), tkt_struct_v (first) and CephXServiceTicket struct_v (second) in process_one_ticket(), validate CephXTicketBlob struct_v as well. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Alex Elder <elder@linaro.org>	2017-05-23 20:32:25 +02:00
Ilya Dryomov	f3b4e55ded	libceph: drop version variable from ceph_monmap_decode() It's set but not used: CEPH_FEATURE_MONNAMES feature bit isn't advertised, which guarantees a v1 MonMap. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Alex Elder <elder@linaro.org>	2017-05-23 20:32:22 +02:00
Ilya Dryomov	1759f7b0e3	libceph: make ceph_msg_data_advance() return void Both callers ignore the returned bool. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Alex Elder <elder@linaro.org>	2017-05-23 20:32:20 +02:00
Ilya Dryomov	6f4dbd149d	libceph: use kbasename() and kill ceph_file_part() Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Alex Elder <elder@linaro.org>	2017-05-23 20:32:10 +02:00
Lin Zhang	a611c58b3d	net: ieee802154: fix net_device reference release too early This patch fixes the kernel oops that occurs when the net_device reference is released too early. In raw_sendmsg() (I think dgram_sendmsg() has the same problem), there is a race between dev_put() and dev_queue_xmit(): if the device goes away in between, dev_queue_xmit() may see an invalid net_device pointer. My test kernel is 3.13.0-32 and, because I do not have a real 802154 device, I changed the lowpan_newlink function to this: /* find and hold real wpan device */ real_dev = dev_get_by_index(src_net, nla_get_u32(tb[IFLA_LINK])); if (!real_dev) return -ENODEV; // if (real_dev->type != ARPHRD_IEEE802154) { // dev_put(real_dev); // return -EINVAL; // } lowpan_dev_info(dev)->real_dev = real_dev; lowpan_dev_info(dev)->fragment_tag = 0; mutex_init(&lowpan_dev_info(dev)->dev_list_mtx); Also, in order to simulate preemption, I changed the raw_sendmsg function to this: skb->dev = dev; skb->sk = sk; skb->protocol = htons(ETH_P_IEEE802154); dev_put(dev); //simulate preempt schedule_timeout_uninterruptible(30 * HZ); err = dev_queue_xmit(skb); if (err > 0) err = net_xmit_errno(err); and this is my userspace test code named test_send_data: int main(int argc, char **argv) { char buf[127]; int sockfd; sockfd = socket(AF_IEEE802154, SOCK_RAW, 0); if (sockfd < 0) { printf("create sockfd error: %s\n", strerror(errno)); return -1; } send(sockfd, buf, sizeof(buf), 0); return 0; } This is my test case: root@zhanglin-x-computer:~/develop/802154# uname -a Linux zhanglin-x-computer 3.13.0-32-generic #57-Ubuntu SMP Tue Jul 15 03:51:08 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux root@zhanglin-x-computer:~/develop/802154# ip link add link eth0 name lowpan0 type lowpan root@zhanglin-x-computer:~/develop/802154# //keep the lowpan0 device down root@zhanglin-x-computer:~/develop/802154# ./test_send_data & //wait a while root@zhanglin-x-computer:~/develop/802154# ip link del link dev lowpan0 //the device is gone //oops
[381.303307] general protection fault: 0000 [#1]SMP [381.303407] Modules linked in: af_802154 6lowpan bnep rfcomm bluetooth nls_iso8859_1 snd_hda_codec_hdmi snd_hda_codec_realtek rts5139(C) snd_hda_intel snd_had_codec snd_hwdep snd_pcm snd_page_alloc snd_seq_midi snd_seq_midi_event snd_rawmidi snd_req intel_rapl snd_seq_device coretemp i915 kvm_intel kvm snd_timer snd crct10dif_pclmul crc32_pclmul ghash_clmulni_intel cypted drm_kms_helper drm i2c_algo_bit soundcore video mac_hid parport_pc ppdev ip parport hid_generic usbhid hid ahci r8169 mii libahdi [381.304286] CPU:1 PID: 2524 Commm: 1 Tainted: G C 0 3.13.0-32-generic [381.304409] Hardware name: Haier Haier DT Computer/Haier DT Codputer, BIOS FIBT19H02_X64 06/09/2014 [381.304546] tasks: ffff000096965fc0 ti: ffffB0013779c000 task.ti: ffffB8013779c000 [381.304659] RIP: 0010:[<ffffffff01621fe1>] [<ffffffff81621fe1>] __dev_queue_ximt+0x61/0x500 [381.304798] RSP: 0018:ffffB8013779dca0 EFLAGS: 00010202 [381.304880] RAX: 272b031d57565351 RBX: 0000000000000000 RCX: ffff8800968f1a00 [381.304987] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8800968f1a00 [381.305095] RBP: ffff8e013773dce0 R08: 0000000000000266 R09: 0000000000000004 [381.305202] R10: 0000000000000004 R11: 0000000000000005 R12: ffff88013902e000 [381.305310] R13: 000000000000007f R14: 000000000000007f R15: ffff8800968f1a00 [381.305418] FS: 00007fc57f50f740(0000) GS: ffff88013fc80000(0000) knlGS: 0000000000000000 [381.305540] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [381.305627] CR2: 00007fad0841c000 CR3: 00000001368dd000 CR4: 00000000001007e0 [361.905734] Stack: [381.305768] 00000000002052d0 000000003facb30a ffff88013779dcc0 ffff880137764000 [381.305898] ffff88013779de70 000000000000007f 000000000000007f ffff88013902e000 [381.306026] ffff88013779dcf0 ffffffff81622490 ffff88013779dd39 ffffffffa03af9f1 [381.306155] Call Trace: [381.306202] [<ffffffff81622490>] dev_queue_xmit+0x10/0x20 [381.306294] [<ffffffffa03af9f1>] raw_sendmsg+0x1b1/0x270 
[af_802154] [381.306396] [<ffffffffa03af054>] ieee802154_sock_sendmsg+0x14/0x20 [af_802154] [381.306512] [<ffffffff816079eb>] sock_sendmsg+0x8b/0xc0 [381.306600] [<ffffffff811d52a5>] ? __d_alloc+0x25/0x180 [381.306687] [<ffffffff811a1f56>] ? kmem_cache_alloc_trace+0x1c6/0x1f0 [381.306791] [<ffffffff81607b91>] SYSC_sendto+0x121/0x1c0 [381.306878] [<ffffffff8109ddf4>] ? vtime_account_user+x54/0x60 [381.306975] [<ffffffff81020d45>] ? syscall_trace_enter+0x145/0x250 [381.307073] [<ffffffff816086ae>] SyS_sendto+0xe/0x10 [381.307156] [<ffffffff8172c87f>] tracesys+0xe1/0xe6 [381.307233] Code: c6 a1 a4 ff 41 8b 57 78 49 8b 47 20 85 d2 48 8b 80 78 07 00 00 75 21 49 8b 57 18 48 85 d2 74 18 48 85 c0 74 13 8b 92 ac 01 00 00 <3b> 50 10 73 08 8b 44 90 14 41 89 47 78 41 f6 84 24 d5 00 00 00 [381.307801] RIP [<ffffffff81621fe1>] _dev_queue_xmit+0x61/0x500 [381.307901] RSP <ffff88013779dca0> [381.347512] Kernel panic - not syncing: Fatal exception in interrupt [381.347747] drm_kms_helper: panic occurred, switching back to text console In my opinion, there is always a chance that the device is gone before dev_queue_xmit is called. I think the latest kernel has the same problem and that dev_put should come after dev_queue_xmit. Signed-off-by: Lin Zhang <xiaolou4617@gmail.com> Acked-by: Stefan Schmidt <stefan@osg.samsung.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2017-05-23 20:05:15 +02:00
Lin Zhang	8fafda7776	net: ieee802154: remove explicit set skb->sk Explicitly setting skb->sk is needless; sock_alloc_send_skb already sets it. Signed-off-by: Lin Zhang <xiaolou4617@gmail.com> Acked-by: Stefan Schmidt <stefan@osg.samsung.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2017-05-23 20:05:15 +02:00
Jiri Pirko	f93e1cdcf4	net/sched: fix filter flushing When the user instructs to remove all filters from a chain, we cannot destroy the chain as other actions may hold a reference. Also the put in errout would try to destroy it again. So instead, just walk the chain and remove all existing filters. Fixes: `5bc1701881` ("net: sched: introduce multichain support for filters") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-23 11:00:07 -04:00
Jiri Pirko	31efcc250a	net/sched: properly assign RCU pointer in tcf_chain_tp_insert/remove *p_filter_chain is rcu-dereferenced on the reader path. So here in the writer, properly assign the pointer. Fixes: `2190d1d094` ("net: sched: introduce helpers to work with filter chains") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-23 11:00:06 -04:00
David S. Miller	2f9bfd3399	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2017-05-23 1) Fix wrong header offset for esp4 udpencap packets. 2) Fix a stack access out of bounds when creating a bundle with sub policies. From Sabrina Dubroca. 3) Fix slab-out-of-bounds in pfkey due to an incorrect sadb_x_sec_len calculation. 4) We checked the wrong feature flags when taking down an interface with IPsec offload enabled. Fix from Ilan Tayari. 5) Copy the anti replay sequence numbers when doing a state migration, otherwise we get out of sync with the sequence numbers. Fix from Antony Antony. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-23 10:51:32 -04:00
Arend Van Spriel	1b57b6210f	cfg80211: make cfg80211_sched_scan_results() work from atomic context Drivers should be able to call cfg80211_sched_scan_results() from atomic context. However, with the introduction of multiple scheduled scan feature this requirement was not taken into account resulting in regression shown below. [ 119.021594] BUG: scheduling while atomic: irq/47-iwlwifi/517/0x00000200 [ 119.021604] Modules linked in: [...] [ 119.021759] CPU: 1 PID: 517 Comm: irq/47-iwlwifi Not tainted 4.12.0-rc2-t440s-20170522+ #1 [ 119.021763] Hardware name: LENOVO 20AQS03H00/20AQS03H00, BIOS GJET91WW (2.41 ) 09/21/2016 [ 119.021766] Call Trace: [ 119.021778] ? dump_stack+0x5c/0x84 [ 119.021784] ? __schedule_bug+0x4c/0x70 [ 119.021792] ? __schedule+0x496/0x5c0 [ 119.021798] ? schedule+0x2d/0x80 [ 119.021804] ? schedule_preempt_disabled+0x5/0x10 [ 119.021810] ? __mutex_lock.isra.0+0x18e/0x4c0 [ 119.021817] ? __wake_up+0x2f/0x50 [ 119.021833] ? cfg80211_sched_scan_results+0x19/0x60 [cfg80211] [ 119.021844] ? cfg80211_sched_scan_results+0x19/0x60 [cfg80211] [ 119.021859] ? iwl_mvm_rx_lmac_scan_iter_complete_notif+0x17/0x30 [iwlmvm] [ 119.021869] ? iwl_pcie_rx_handle+0x2a9/0x7e0 [iwlwifi] [ 119.021878] ? iwl_pcie_irq_handler+0x17c/0x730 [iwlwifi] [ 119.021884] ? irq_forced_thread_fn+0x60/0x60 [ 119.021887] ? irq_thread_fn+0x16/0x40 [ 119.021892] ? irq_thread+0x109/0x180 [ 119.021896] ? wake_threads_waitq+0x30/0x30 [ 119.021901] ? kthread+0xf2/0x130 [ 119.021905] ? irq_thread_dtor+0x90/0x90 [ 119.021910] ? kthread_create_on_node+0x40/0x40 [ 119.021915] ? ret_from_fork+0x26/0x40 Fixes: `b34939b983` ("cfg80211: add request id to cfg80211_sched_scan_*() api") Reported-by: Sander Eikelenboom <linux@eikelenboom.it> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2017-05-23 14:36:46 +02:00
Sven Eckelmann	22f0502ed9	batman-adv: Print correct function names in dbg messages The function names in batman-adv changed slightly in the past. But some of the debug messages were not updated correctly and therefore some messages were incorrect. To avoid this in the future, these kinds of messages should use __func__ to automatically print the correct function name. Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>	2017-05-23 14:34:31 +02:00
Markus Elfring	912eeed9f5	batman-adv: Combine two seq_puts() calls into one call in batadv_nc_nodes_seq_print_text() A bit of text was put into a sequence by two separate function calls. Print the same data by a single function call instead. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>	2017-05-23 12:09:15 +02:00
Markus Elfring	626caae9f2	batman-adv: Replace a seq_puts() call by seq_putc() in two functions Two single characters (line breaks) should be put into a sequence. Thus use the corresponding function "seq_putc". This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>	2017-05-23 12:09:14 +02:00
Matthias Schiffer	8ea026b160	batman-adv: decrease maximum fragment size With this patch the maximum fragment size is reduced from 1400 to 1280 bytes. Fragmentation v2 correctly uses the smaller of 1400 and the interface MTU, thus generally supporting interfaces with an MTU < 1400 bytes, too. However, currently "Fragmentation v2" does not support re-fragmentation. Which means that once a packet is split into two packets of 1400 + x bytes for instance and the next hop provides an interface with an even smaller MTU of 1280 bytes, then the larger fragment is lost. A maximum fragment size of 1280 bytes is a safer option as this is the minimum MTU required by IPv6, making interfaces with an MTU < 1280 rather exotic. Regarding performance, this should have no negative impact on unicast traffic: Having some more bytes in the smaller and some less in the larger does not change the sum of both fragments. Concerning TT, choosing 1280 bytes fragments might result in more TT messages than necessary when a large network is bridged into batman-adv. However, the TT overhead in general is marginal due to its reactive nature, therefore such a performance impact on TT should not be noticeable for a user. Cc: Matthias Schiffer <mschiffer@universe-factory.net> [linus.luessing@c0d3.blue: Added commit message] Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>	2017-05-23 12:09:13 +02:00
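The size selection described in the commit above (use the smaller of the maximum fragment size and the interface MTU) can be sketched in a few lines of C. This is only an illustration: `MAX_FRAG_SIZE` and `frag_size()` are hypothetical names chosen here, not identifiers taken from the batman-adv sources.

```c
#include <assert.h>

/* Hypothetical constant: the new maximum fragment size from the commit
 * message, which equals the minimum MTU required by IPv6. */
#define MAX_FRAG_SIZE 1280u

/* Hypothetical helper: pick the fragment size for an interface by taking
 * the smaller of the maximum fragment size and the interface MTU. */
unsigned int frag_size(unsigned int mtu)
{
    assert(mtu > 0); /* an interface MTU of zero makes no sense here */
    return mtu < MAX_FRAG_SIZE ? mtu : MAX_FRAG_SIZE;
}
```

With a typical Ethernet MTU of 1500 the helper yields 1280, so a fragment still fits if a later hop only offers the IPv6 minimum MTU; with an MTU below 1280 it falls back to the MTU itself.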
Simon Wunderlich	b1d2cf3de3	batman-adv: Start new development cycle Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>	2017-05-23 12:09:13 +02:00
David S. Miller	218b6a5b23	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2017-05-22 23:32:48 -04:00
Vivien Didelot	d0c627b874	net: dsa: add VLAN notifier Add two new DSA_NOTIFIER_VLAN_ADD and DSA_NOTIFIER_VLAN_DEL events to notify not only a single switch, but all switches of the fabric when a VLAN entry is added or removed. For the moment, keep the current behavior and ignore other switches. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-22 19:37:32 -04:00
Vivien Didelot	8ae5bcdc5d	net: dsa: add MDB notifier Add two new DSA_NOTIFIER_MDB_ADD and DSA_NOTIFIER_MDB_DEL events to notify not only a single switch, but all switches of the fabric when an MDB entry is added or removed. For the moment, keep the current behavior and ignore other switches. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-22 19:37:32 -04:00
Vivien Didelot	685fb6a40d	net: dsa: add FDB notifier Add two new DSA_NOTIFIER_FDB_ADD and DSA_NOTIFIER_FDB_DEL events to notify not only a single switch, but all switches of the fabric when an FDB entry is added or removed. For the moment, keep the current behavior and ignore other switches. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-22 19:37:32 -04:00
Vivien Didelot	1faabf7440	net: dsa: add notifier for ageing time This patch keeps the port-wide ageing time handling code in dsa_port_ageing_time, pushes the requested ageing time value in a new switch fabric notification, and moves the switch-wide ageing time handling code in dsa_switch_ageing_time. This has the effect that now not only the switch that the target port belongs to can be programmed, but all switches composing the switch fabric. For the moment, keep the current behavior and ignore other switches. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-22 19:37:32 -04:00
Vivien Didelot	52c96f9d70	net: dsa: move notifier info to private header The DSA notifier events and info structure definitions are not meant for DSA drivers and users, but only used internally by the DSA core files. Move them from the public net/dsa.h file to the private dsa_priv.h file. Also use this opportunity to turn the events into an anonymous enum, because we don't care about the values, and this will prevent future conflicts when adding (and sorting) new events. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-22 19:37:32 -04:00
Vivien Didelot	076e713365	net: dsa: move VLAN handlers Move the DSA port code which handles VLAN objects in port.c, where it belongs. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-22 19:37:32 -04:00
Vivien Didelot	3a9afea37e	net: dsa: move MDB handlers Move the DSA port code which handles MDB objects in port.c, where it belongs. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-22 19:37:32 -04:00
Vivien Didelot	d1cffff008	net: dsa: move FDB handlers Move the DSA port code which handles FDB objects in port.c, where it belongs. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-22 19:37:32 -04:00
Vivien Didelot	d87bd94e1c	net: dsa: move ageing time setter Move the DSA port code which sets a port ageing time in port.c, where it belongs. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-22 19:37:32 -04:00
Vivien Didelot	4d61d3043b	net: dsa: move VLAN filtering setter Move the DSA port code which sets VLAN filtering on a port in port.c, where it belongs. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-22 19:37:32 -04:00
Vivien Didelot	cfbed329be	net: dsa: move bridging routines Move the DSA port code which bridges a port in port.c, where it belongs. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-22 19:37:32 -04:00
Vivien Didelot	a40c175b4a	net: dsa: move port state setters Add a new port.c file to hold all DSA port-wide logic. This patch moves in the code which sets a port state. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-22 19:37:32 -04:00
Vivien Didelot	072bb1903a	net: dsa: change scope of ageing time setter Change the scope of the switchdev bridge ageing time attribute setter from the DSA slave device to the generic DSA port, so that the future port-wide API can also be used for other port types, such as CPU and DSA links. Also ds->ports is now a contiguous array of dsa_port structures, thus their addresses cannot be NULL. Remove the useless check in dsa_fastest_ageing_time. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-22 19:37:32 -04:00
Vivien Didelot	c02c4175cb	net: dsa: change scope of VLAN filtering setter Change the scope of the switchdev VLAN filtering attribute setter from the DSA slave device to the generic DSA port, so that the future port-wide API can also be used for other port types, such as CPU and DSA links. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-22 19:37:32 -04:00
Vivien Didelot	01676d129c	net: dsa: change scope of VLAN handlers Change the scope of the switchdev VLAN object handlers from the DSA slave device to the generic DSA port, so that the future port-wide API can also be used for other port types, such as CPU and DSA links. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-22 19:37:32 -04:00
Vivien Didelot	bcebb976ec	net: dsa: change scope of MDB handlers Change the scope of the switchdev MDB object handlers from the DSA slave device to the generic DSA port, so that the future port-wide API can also be used for other port types, such as CPU and DSA links. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-22 19:37:32 -04:00
Vivien Didelot	3fdb023b5e	net: dsa: change scope of FDB handlers Change the scope of the switchdev FDB object handlers from the DSA slave device to the generic DSA port, so that the future port-wide API can also be used for other port types, such as CPU and DSA links. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-22 19:37:32 -04:00
Vivien Didelot	17d7802b77	net: dsa: change scope of bridging code Now that the bridge join and leave functions only deal with a DSA port, change their scope from the DSA slave net_device to the DSA generic dsa_port. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-22 19:37:32 -04:00
Vivien Didelot	a93ecdd948	net: dsa: change scope of notifier call chain Change the scope of the fabric notification helper from the DSA slave to the DSA port, since this is a DSA layer specific notion, that can be used by non-slave ports (CPU and DSA). Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-22 19:37:32 -04:00
Vivien Didelot	fd36454131	net: dsa: change scope of STP state setter Instead of having multiple STP state helpers scoping a slave device supporting both the DSA logic and the switchdev binding, provide a single dsa_port_set_state helper scoping a DSA port, as well as its dsa_port_set_state_now wrapper which skips the prepare phase. This allows us to better separate the DSA logic from the slave device handling. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-22 19:37:32 -04:00
Linus Torvalds	86ca984cef	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: "Mostly netfilter bug fixes in here, but we have some bits elsewhere as well. 1) Don't do SNAT replies for non-NATed connections in IPVS, from Julian Anastasov. 2) Don't delete conntrack helpers while they are still in use, from Liping Zhang. 3) Fix zero padding in xtables's xt_data_to_user(), from Willem de Bruijn. 4) Add proper RCU protection to nf_tables_dump_set() because we cannot guarantee that we hold the NFNL_SUBSYS_NFTABLES lock. From Liping Zhang. 5) Initialize rcv_mss in tcp_disconnect(), from Wei Wang. 6) smsc95xx devices can't handle IPV6 checksums fully, so don't advertise support for offloading them. From Nisar Sayed. 7) Fix out-of-bounds access in __ip6_append_data(), from Eric Dumazet. 8) Make atl2_probe() propagate the error code properly on failures, from Alexey Khoroshilov. 9) arp_target[] in bond_check_params() is used uninitialized. This got changed from a global static to a local variable, which is how this mistake happened. Fix from Jarod Wilson. 10) Fix fallout from unnecessary NULL check removal in cls_matchall, from Jiri Pirko. This is definitely brown paper bag territory..." * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (26 commits) net: sched: cls_matchall: fix null pointer dereference vsock: use new wait API for vsock_stream_sendmsg() bonding: fix randomly populated arp target array net: Make IP alignment calulations clearer. 
bonding: fix accounting of active ports in 3ad net: atheros: atl2: don't return zero on failure path in atl2_probe() ipv6: fix out of bound writes in __ip6_append_data() bridge: start hello_timer when enabling KERNEL_STP in br_stp_start smsc95xx: Support only IPv4 TCP/UDP csum offload arp: always override existing neigh entries with gratuitous ARP arp: postpone addr_type calculation to as late as possible arp: decompose is_garp logic into a separate function arp: fixed error in a comment tcp: initialize rcv_mss to TCP_MIN_MSS instead of 0 netfilter: xtables: fix build failure from COMPAT_XT_ALIGN outside CONFIG_COMPAT ebtables: arpreply: Add the standard target sanity check netfilter: nf_tables: revisit chain/object refcounting from elements netfilter: nf_tables: missing sanitization in data from userspace netfilter: nf_tables: can't assume lock is acquired when dumping set elems netfilter: synproxy: fix conntrackd interaction ...	2017-05-22 12:42:02 -07:00
Jiri Pirko	2d76b2f8b5	net: sched: cls_matchall: fix null pointer dereference Since the head is guaranteed by the check above to be null, the call_rcu would explode. Remove the previously logically dead code that was made logically very much alive and kicking. Fixes: `985538eee0` ("net/sched: remove redundant null check on head") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-22 14:54:16 -04:00
Ivan Vecera	bd080488a6	bridge: fix hello and hold timers starting/stopping Current bridge code incorrectly handles starting/stopping of hello and hold timers during STP enable/disable. 1. Timers are stopped in br_stp_start() during NO_STP->USER_STP transition. The timers are already stopped in NO_STP state so this is a confusing no-op. 2. During USER_STP->NO_STP transition the timers are started. This does not make sense and is confusing because the timers should not be active in NO_STP state. Cc: davem@davemloft.net Cc: sashok@cumulusnetworks.com Cc: stephen@networkplumber.org Cc: bridge@lists.linux-foundation.org Cc: lucien.xin@gmail.com Cc: nikolay@cumulusnetworks.com Signed-off-by: Ivan Vecera <cera@cera.cz> Reviewed-by: Xin Long <lucien.xin@gmail.com> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-22 14:40:22 -04:00
WANG Cong	499fde662f	vsock: use new wait API for vsock_stream_sendmsg() As reported by Michal, vsock_stream_sendmsg() could still sleep at vsock_stream_has_space() after prepare_to_wait(): vsock_stream_has_space -> vmci_transport_stream_has_space -> vmci_qpair_produce_free_space -> qp_lock -> qp_acquire_queue_mutex -> mutex_lock. Just switch to the new wait API like we did for commit `d9dc8b0f8b` ("net: fix sleeping for sk_wait_event()"). Reported-by: Michal Kubecek <mkubecek@suse.cz> Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Jorgen Hansen <jhansen@vmware.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-22 14:39:36 -04:00
Rohit Chavan	a777f715ca	net: ipv4: tcp: fixed comment coding style issue Fixed a coding style issue Signed-off-by: Rohit Chavan <roheetchavan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-22 12:14:51 -04:00
Rosen, Rami	241c4667fc	net: socket: fix a typo in sockfd_lookup(). This patch fixes a typo in sockfd_lookup() in net/socket.c. Signed-off-by: Rami Rosen <rami.rosen@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-22 12:14:04 -04:00
David Ahern	d5d531cb50	net: ipv6: Add extack messages for route add failures Add messages for non-obvious errors (e.g., no need to add text for malloc failures or ENODEV failures). This mostly covers the annoying EINVAL errors. Some message strings violate the 80-column limit, but searchable strings need to trump that rule. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-22 12:12:20 -04:00
David Ahern	333c430167	net: ipv6: Plumb extack through route add functions Plumb extack argument down to route add functions. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-22 12:12:20 -04:00
David Ahern	c3ab2b4ec8	net: ipv4: Add extack messages for route add failures Add messages for non-obvious errors (e.g., no need to add text for malloc failures or ENODEV failures). This mostly covers the annoying EINVAL errors. Some message strings violate the 80-column limit, but searchable strings need to trump that rule. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-22 12:12:20 -04:00
David Ahern	6d8422a175	net: ipv4: Plumb extack through route add functions Plumb extack argument down to route add functions. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-22 12:12:19 -04:00
Eric Dumazet	232cd35d08	ipv6: fix out of bound writes in __ip6_append_data() Andrey Konovalov and idaifish@gmail.com reported crashes caused by one skb shared_info being overwritten from __ip6_append_data(). Andrey's program led to the following state: copy -4200 datalen 2000 fraglen 2040 maxfraglen 2040 alloclen 2048 transhdrlen 0 offset 0 fraggap 6200. The skb_copy_and_csum_bits(skb_prev, maxfraglen, data + transhdrlen, fraggap, 0); is overwriting skb->head and skb_shared_info. Since we apparently detect this rare condition too late, move the code earlier to even avoid allocating skb and risking crashes. Once again, many thanks to Andrey and syzkaller team. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Andrey Konovalov <andreyknvl@google.com> Tested-by: Andrey Konovalov <andreyknvl@google.com> Reported-by: <idaifish@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-22 11:47:44 -04:00
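The state reported in the commit above is consistent with the relation copy = datalen - transhdrlen - fraggap, since 2000 - 0 - 6200 = -4200. The following C sketch shows the idea of rejecting a negative copy length before any buffer is allocated. It is a userspace illustration under that assumption: `frag_copy_len()` is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical helper, not kernel code: compute how many payload bytes
 * would be copied into a new fragment, and fail early when the result
 * would be negative, i.e. before any buffer is allocated or written. */
int frag_copy_len(int datalen, int transhdrlen, int fraggap, int *copy)
{
    int len = datalen - transhdrlen - fraggap;

    assert(copy != NULL);
    if (len < 0)
        return -EINVAL; /* detected early: nothing to clean up yet */

    *copy = len;
    return 0;
}
```

Feeding in the reported values (datalen 2000, transhdrlen 0, fraggap 6200) makes the helper return -EINVAL instead of continuing with a copy length of -4200.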
Markus Elfring	5d4acfc141	Bluetooth: Delete error messages for failed memory allocations in two functions Omit two extra messages for memory allocation failures in these functions. This issue was detected by using the Coccinelle software. Link: http://events.linuxfoundation.org/sites/events/files/slides/LCJ16-Refactor_Strings-WSang_0.pdf Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2017-05-22 10:23:41 +02:00
Stephen Hemminger	d49c9dc1c8	ipv6: remove unused variables in esp6 Resolves warnings: net/ipv6/esp6.c: In function ‘esp_ssg_unref’: net/ipv6/esp6.c:121:10: warning: variable ‘seqhi’ set but not used [-Wunused-but-set-variable] net/ipv6/esp6.c: In function ‘esp6_output_head’: net/ipv6/esp6.c:227:21: warning: variable ‘esph’ set but not used [-Wunused-but-set-variable] Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2017-05-22 08:37:18 +02:00
Eric Dumazet	4ab688793e	tcp: fix tcp_probe_timer() for TCP_USER_TIMEOUT TCP_USER_TIMEOUT is still converted to a jiffies value in icsk_user_timeout. So we need to make a conversion for the cases where HZ != 1000. Fixes: `9a568de481` ("tcp: switch TCP TS option (RFC 7323) to 1ms clock") Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-21 13:50:34 -04:00
stephen hemminger	0a9fc39e41	ipv6: drop unused variables in seg6_genl_dumphac The seg6_pernet_data variable was set but never used. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-21 13:42:36 -04:00
stephen hemminger	9dc621afa8	fou: make local function static The build header functions are not used by any other code. net/ipv6/fou6.c:36:5: warning: no previous prototype for ‘fou6_build_header’ [-Wmissing-prototypes] net/ipv6/fou6.c:54:5: warning: no previous prototype for ‘gue6_build_header’ [-Wmissing-prototypes] Need to do some code rearranging to satisfy different Kconfig possibilities. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-21 13:42:36 -04:00
stephen hemminger	c718c6d66b	tcpnv: do not export local function The TCP New Vegas congestion control was exporting an internal function tcpnv_get_info which is not used by any other in-tree kernel code. Make it static. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-21 13:42:36 -04:00
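The unit mismatch behind the commit above can be illustrated in userspace C: a timeout stored in timer ticks has to be converted to milliseconds before it is compared with a millisecond clock, and the two units only coincide when the tick rate is 1000 per second. Both functions below are hypothetical stand-ins written for this sketch, not the kernel's helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a ticks-to-milliseconds conversion;
 * hz is the number of timer ticks per second. */
uint64_t ticks_to_ms(uint64_t ticks, unsigned int hz)
{
    assert(hz > 0);
    return ticks * 1000u / hz;
}

/* Hypothetical check: has more time elapsed (in ms) than a timeout that
 * is stored in ticks? The timeout is converted before the comparison. */
int user_timeout_expired(uint64_t elapsed_ms, uint64_t timeout_ticks,
                         unsigned int hz)
{
    return elapsed_ms > ticks_to_ms(timeout_ticks, hz);
}
```

With 250 ticks per second, a one-second timeout is stored as 250 ticks. Comparing 500 ms of elapsed time against the raw value 250 would wrongly report expiry, whereas the converted comparison (500 ms against 1000 ms) does not.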
stephen hemminger	9691724e56	inet: fix warning about missing prototype The prototype for inet_rcv_saddr_equal was not being included. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-21 13:42:36 -04:00
stephen hemminger	9e7b19c516	ila: propagate error code in ila_output Fixes this warning: net/ipv6/ila/ila_lwt.c: In function ‘ila_output’: net/ipv6/ila/ila_lwt.c:42:6: warning: variable ‘err’ set but not used [-Wunused-but-set-variable] It looks like the code attempts to propagate different error values, but always returned -EINVAL. Compile tested only. Needs review by original author. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-21 13:42:33 -04:00
stephen hemminger	332b4fc886	dcb: enforce minimum length on IEEE_APPS attribute Found by reviewing the warning about unused policy table. The code implies that it meant to check for size, but since it unrolled the loop for attribute validation that is never used. Instead, do an explicit check for the attribute. Compile tested only. Needs review by original author. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-21 13:42:33 -04:00
Miroslav Lichvar	b50a5c70ff	net: allow simultaneous SW and HW transmit timestamping Add SOF_TIMESTAMPING_OPT_TX_SWHW option to allow an outgoing packet to be looped to the socket's error queue with a software timestamp even when a hardware transmit timestamp is expected to be provided by the driver. Applications using this option will receive two separate messages from the error queue, one with a software timestamp and the other with a hardware timestamp. As the hardware timestamp is saved to the shared skb info, which may happen before the first message with software timestamp is received by the application, the hardware timestamp is copied to the SCM_TIMESTAMPING control message only when the skb has no software timestamp or it is an incoming packet. While changing sw_tx_timestamp(), inline it in skb_tx_timestamp() as there are no other users. CC: Richard Cochran <richardcochran@gmail.com> CC: Willem de Bruijn <willemb@google.com> Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-21 13:37:32 -04:00
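From an application's point of view, the option described in the commit above is requested through the `SO_TIMESTAMPING` socket option. The C sketch below assumes Linux UAPI headers recent enough to define `SOF_TIMESTAMPING_OPT_TX_SWHW`; `tx_swhw_flags()` and `enable_tx_swhw()` are hypothetical helper names, and the particular combination of reporting flags is one plausible choice rather than the only valid one.

```c
#include <assert.h>
#include <sys/socket.h>
#include <linux/net_tstamp.h>

/* Hypothetical helper: flag set asking for both software and hardware
 * transmit timestamps, delivered as two separate error-queue messages. */
int tx_swhw_flags(void)
{
    return SOF_TIMESTAMPING_TX_SOFTWARE    /* generate software TX timestamps */
         | SOF_TIMESTAMPING_TX_HARDWARE    /* generate hardware TX timestamps */
         | SOF_TIMESTAMPING_SOFTWARE       /* report software timestamps */
         | SOF_TIMESTAMPING_RAW_HARDWARE   /* report raw hardware timestamps */
         | SOF_TIMESTAMPING_OPT_TX_SWHW;   /* keep both when HW is expected */
}

/* Hypothetical helper: apply the flags to an already-open socket.
 * Returns 0 on success, -1 on failure with errno set by setsockopt(). */
int enable_tx_swhw(int fd)
{
    int flags = tx_swhw_flags();

    assert(fd >= 0);
    return setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, &flags, sizeof(flags));
}
```

After a send, the application would read the looped packets from the socket's error queue with `recvmsg()` and `MSG_ERRQUEUE`; per the commit message it should expect two messages, one carrying the software timestamp and one the hardware timestamp.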
Miroslav Lichvar	aad9c8c470	net: add new control message for incoming HW-timestamped packets Add SOF_TIMESTAMPING_OPT_PKTINFO option to request a new control message for incoming packets with hardware timestamps. It contains the index of the real interface which received the packet and the length of the packet at layer 2. The index is useful with bonding, bridges and other interfaces, where IP_PKTINFO doesn't allow applications to determine which PHC made the timestamp. With the L2 length (and link speed) it is possible to transpose preamble timestamps to trailer timestamps, which are used in the NTP protocol. While this information could be provided by two new socket options independently from timestamping, it doesn't look like they would be very useful. With this option any performance impact is limited to hardware timestamping. Use dev_get_by_napi_id() to get the device and its index. On kernels with disabled CONFIG_NET_RX_BUSY_POLL or drivers not using NAPI, a zero index will be returned in the control message. CC: Richard Cochran <richardcochran@gmail.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-21 13:37:32 -04:00
Miroslav Lichvar	90b602f803	net: add function to retrieve original skb device using NAPI ID Since commit `b68581778c` ("net: Make skb->skb_iif always track skb->dev") skbs don't have the original index of the interface which received the packet. This information is now needed for a new control message related to hardware timestamping. Instead of adding a new field to skb, we can find the device by the NAPI ID if it is available, i.e. CONFIG_NET_RX_BUSY_POLL is enabled and the driver is using NAPI. Add dev_get_by_napi_id() and also skb_napi_id() to hide the CONFIG_NET_RX_BUSY_POLL ifdef. CC: Richard Cochran <richardcochran@gmail.com> Suggested-by: Willem de Bruijn <willemb@google.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-21 13:37:32 -04:00
Miroslav Lichvar	e341257548	net: ethernet: update drivers to handle HWTSTAMP_FILTER_NTP_ALL Include HWTSTAMP_FILTER_NTP_ALL in net_hwtstamp_validate() as a valid filter and update drivers which can timestamp all packets, or which explicitly list unsupported filters instead of using a default case, to handle the filter. CC: Richard Cochran <richardcochran@gmail.com> CC: Willem de Bruijn <willemb@google.com> Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-21 13:37:32 -04:00
Miroslav Lichvar	b8210a9e4b	net: define receive timestamp filter for NTP Add HWTSTAMP_FILTER_NTP_ALL to the hwtstamp_rx_filters enum for timestamping of NTP packets. There is currently only one driver (phyter) that could support it directly. CC: Richard Cochran <richardcochran@gmail.com> CC: Willem de Bruijn <willemb@google.com> Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-21 13:37:32 -04:00
Xin Long	6d18c732b9	bridge: start hello_timer when enabling KERNEL_STP in br_stp_start Since commit `76b91c32dd` ("bridge: stp: when using userspace stp stop kernel hello and hold timers"), bridge would not start hello_timer if stp_enabled is not KERNEL_STP when br_dev_open. The problem is even if users set stp_enabled with KERNEL_STP later, the timer will still not be started. It causes that KERNEL_STP can not really work. Users have to re-ifup the bridge to avoid this. This patch is to fix it by starting br->hello_timer when enabling KERNEL_STP in br_stp_start. As an improvement, it's also to start hello_timer again only when br->stp_enabled is KERNEL_STP in br_hello_timer_expired, there is no reason to start the timer again when it's NO_STP. Fixes: `76b91c32dd` ("bridge: stp: when using userspace stp stop kernel hello and hold timers") Reported-by: Haidong Li <haili@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: Ivan Vecera <cera@cera.cz> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-21 13:33:28 -04:00
Ihar Hrachyshka	7d472a59c0	arp: always override existing neigh entries with gratuitous ARP Currently, when arp_accept is 1, we always override existing neigh entries with incoming gratuitous ARP replies. Otherwise, we override them only if new replies satisfy _locktime_ conditional (packets arrive not earlier than _locktime_ seconds since the last update to the neigh entry). The idea behind locktime is to pick the very first (=> close) reply received in a unicast burst when ARP proxies are used. This helps to avoid ARP thrashing where Linux would switch back and forth from one proxy to another. This logic has nothing to do with gratuitous ARP replies that are generally not aligned in time when multiple IP address carriers send them into network. This patch enforces overriding of existing neigh entries by all incoming gratuitous ARP packets, irrespective of their time of arrival. This will make the kernel honour all incoming gratuitous ARP packets. Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-21 13:26:45 -04:00
Ihar Hrachyshka	d9ef2e7bf9	arp: postpone addr_type calculation to as late as possible The addr_type retrieval can be costly, so it's worth trying to avoid its calculation as much as possible. This patch makes it calculated only for gratuitous ARP packets. This is especially important since later we may want to move is_garp calculation outside of arp_accept block, at which point the costly operation will be executed for all setups. The patch is the result of a discussion in net-dev: http://marc.info/?l=linux-netdev&m=149506354216994 Suggested-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-21 13:26:45 -04:00
Ihar Hrachyshka	6fd05633bd	arp: decompose is_garp logic into a separate function The code is already involved enough to earn a separate function for itself. If anything, it helps arp_process readability. Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-21 13:26:45 -04:00
Ihar Hrachyshka	34eb5fe078	arp: fixed error in a comment the is_garp code deals just with gratuitous ARP packets, not every unsolicited packet. This patch is a result of a discussion in netdev: http://marc.info/?l=linux-netdev&m=149506354216994 Suggested-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-21 13:26:45 -04:00
Wei Wang	499350a5a6	tcp: initialize rcv_mss to TCP_MIN_MSS instead of 0 When tcp_disconnect() is called, inet_csk_delack_init() sets icsk->icsk_ack.rcv_mss to 0. This could potentially cause tcp_recvmsg() => tcp_cleanup_rbuf() => __tcp_select_window() call path to have division by 0 issue. So this patch initializes rcv_mss to TCP_MIN_MSS instead of 0. Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Wei Wang <weiwan@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-21 13:24:47 -04:00
David S. Miller	23416e2304	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf Pablo Neira Ayuso says: ==================== Netfilter/IPVS fixes for net The following patchset contains Netfilter/IPVS fixes for your net tree, they are: 1) When using IPVS in direct-routing mode, normal traffic from the LVS host to a back-end server is sometimes incorrectly NATed on the way back into the LVS host. Patch to fix this from Julian Anastasov. 2) Calm down clang compilation warning in ctnetlink due to type mismatch, from Matthias Kaehlcke. 3) Do not re-setup NAT for conntracks that are already confirmed, this is fixing a problem that was introduced in the previous nf-next batch. Patch from Liping Zhang. 4) Do not allow conntrack helper removal from userspace cthelper infrastructure if already in use. This comes with an initial patch to introduce nf_conntrack_helper_put() that is required by this fix. From Liping Zhang. 5) Zero the pad when copying data to userspace, otherwise iptables fails to remove rules. This is a follow up on the patchset that sorts out the internal match/target structure pointer leak to userspace. Patch from the same author, Willem de Bruijn. This also comes with a build failure when CONFIG_COMPAT is not on, coming in the last patch of this series. 6) SYNPROXY crashes with conntrack entries that are created via ctnetlink, more specifically via conntrackd state sync. Patch from Eric Leblond. 7) RCU safe iteration on set element dumping in nf_tables, from Liping Zhang. 8) Missing sanitization of immediate data for the bitwise and cmp expressions in nf_tables. 9) Refcounting logic for chain and objects from set elements does not integrate into the nf_tables 2-phase commit protocol. 10) Missing sanitization of target verdict in ebtables arpreply target, from Gao Feng. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-21 13:00:02 -04:00
Davide Caratti	7529390d08	openvswitch: more accurate checksumming in queue_userspace_packet() if skb carries an SCTP packet and ip_summed is CHECKSUM_PARTIAL, it needs CRC32c in place of Internet Checksum: use skb_csum_hwoffload_help to avoid corrupting such packets while queueing them towards userspace. Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-19 19:21:29 -04:00
Davide Caratti	43c26a1a45	net: more accurate checksumming in validate_xmit_skb() skb_csum_hwoffload_help() uses netdev features and skb->csum_not_inet to determine if skb needs software computation of Internet Checksum or crc32c (or nothing, if this computation can be done by the hardware). Use it in place of skb_checksum_help() in validate_xmit_skb() to avoid corruption of non-GSO SCTP packets having skb->ip_summed equal to CHECKSUM_PARTIAL. While at it, remove references to skb_csum_off_chk* functions, since they are not present anymore in Linux _ see commit `cf53b1da73` ("Revert "net: Add driver helper functions to determine checksum offloadability""). Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-19 19:21:29 -04:00
Davide Caratti	dba003067a	net: use skb->csum_not_inet to identify packets needing crc32c skb->csum_not_inet carries the indication on which algorithm is needed to compute checksum on skb in the transmit path, when skb->ip_summed is equal to CHECKSUM_PARTIAL. If skb carries a SCTP packet and crc32c hasn't been yet written in L4 header, skb->csum_not_inet is assigned to 1; otherwise, assume Internet Checksum is needed and thus set skb->csum_not_inet to 0. Suggested-by: Tom Herbert <tom@herbertland.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Acked-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-19 19:21:29 -04:00
Davide Caratti	219f1d7987	sk_buff: remove support for csum_bad in sk_buff This bit was introduced with commit `5a21232983` ("net: Support for csum_bad in skbuff") to reduce the stack workload when processing RX packets carrying a wrong Internet Checksum. Up to now, only one driver and GRO core are setting it. Suggested-by: Tom Herbert <tom@herbertland.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-19 19:21:29 -04:00
Davide Caratti	b72b5bf6a8	net: introduce skb_crc32c_csum_help skb_crc32c_csum_help is like skb_checksum_help, but it is designed for checksumming SCTP packets using crc32c (see RFC3309), provided that libcrc32c.ko has been loaded before. In case libcrc32c is not loaded, invoking skb_crc32c_csum_help on a skb results in one of the following printouts: warn_crc32c_csum_update: attempt to compute crc32c without libcrc32c.ko warn_crc32c_csum_combine: attempt to compute crc32c without libcrc32c.ko Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-19 19:21:29 -04:00
Davide Caratti	9617813dba	skbuff: add stub to help computing crc32c on SCTP packets sctp_compute_checksum requires crc32c symbol (provided by libcrc32c), so it can't be used in net core. Like it has been done previously with other symbols (e.g. ipv6_dst_lookup), introduce a stub struct skb_checksum_ops to allow computation of crc32c checksum in net core after sctp.ko (and thus libcrc32c) has been loaded. Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-19 19:21:29 -04:00
Linus Torvalds	9e856e4b47	xen: fixes for 4.12 rc2 Merge tag 'for-linus-4.12b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: "Some fixes for the new Xen 9pfs frontend and some minor cleanups" * tag 'for-linus-4.12b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen: make xen_flush_tlb_all() static xen: cleanup pvh leftovers from pv-only sources xen/9pfs: p9_trans_xen_init and p9_trans_xen_exit can be static xen/9pfs: fix return value check in xen_9pfs_front_probe()	2017-05-19 15:06:48 -07:00
Soheil Hassas Yeganeh	6f5b24eed0	tcp: warn on negative reordering values Commit `bafbb9c732` ("tcp: eliminate negative reordering in tcp_clean_rtx_queue") fixes an issue for negative reordering metrics. To be resilient to such errors, warn and return when a negative metric is passed to tcp_update_reordering(). Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-19 16:55:46 -04:00
Sabrina Dubroca	67df58a3e5	ah: use crypto_memneq to check the ICV Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2017-05-19 14:30:50 +02:00
Simon Wunderlich	3b23782f7d	mac80211: mark as action frame when parsing IEs of CSA action frames Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2017-05-19 13:34:26 +02:00
Benjamin Berg	0ab2e55d33	mac80211: mesh: Allow following CSA to DFS channels if userspace handles it If userspace has flagged support for DFS earlier, then we can follow CSA to DFS channels. So instead of rejecting the switch, allow it to happen if the flag has been set during mesh setup. Signed-off-by: Benjamin Berg <benjamin@sipsolutions.net> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2017-05-19 13:26:05 +02:00
Benjamin Berg	8d9de16f80	wireless: Require HANDLE_DFS flag to switch channel for non-AP mode In the case the channel should be switched to one requiring DFS we need to make sure that userspace will handle radar events when they happen. For AP mode this is assumed to be the case, as a manager like hostapd is required. However IBSS and MESH modes can work without further userspace assistance, so refuse to use DFS channels unless userspace vouches that it handles DFS. NOTE: Userspace should have already flagged support earlier during mesh or IBSS setup. However, this information is not readily accessible currently. Signed-off-by: Benjamin Berg <benjamin@sipsolutions.net> [sw: style cleanups] Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2017-05-19 13:25:58 +02:00
Benjamin Berg	d37d49c2f1	wireless: Only join DFS channels in mesh mode if userspace flags support When joining a mesh network it is not guaranteed that userspace has a daemon listening for radar events. This is however required for channels requiring DFS. To flag that userspace will handle radar events, it needs to set NL80211_ATTR_HANDLE_DFS. This matches the current mechanism used for IBSS mode. Signed-off-by: Benjamin Berg <benjamin@sipsolutions.net> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2017-05-19 13:25:58 +02:00
Johannes Berg	61b81b4010	mac80211: move clearing result into ieee80211_parse_ch_switch_ie() Clear the csa_ie in ieee80211_parse_ch_switch_ie() where the data is filled in, rather than in each caller. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2017-05-19 13:25:57 +02:00
Benjamin Berg	5d55371b21	mac80211: mesh: mark channel as unusable if a regulatory MESH CSA is received In the Mesh Channel Switch Parameters (8.4.2.105) the reason is specified to WLAN_REASON_MESH_CHAN_REGULATORY in the case that a regulatory limitation was the cause for the switch. This means another station detected a radar event. Mark the channel as unusable if this happens. Signed-off-by: Benjamin Berg <benjamin@sipsolutions.net> [sw: style cleanup, rebase] Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2017-05-19 13:25:52 +02:00
Antony Antony	a486cd2366	xfrm: fix state migration copy replay sequence numbers During xfrm migration copy replay and preplay sequence numbers from the previous state. Here is a tcpdump output showing the problem. 10.0.10.46 is running vanilla kernel, is the IKE/IPsec responder. After the migration it sent wrong sequence number, reset to 1. The migration is from 10.0.0.52 to 10.0.0.53. IP 10.0.0.52.4500 > 10.0.10.46.4500: UDP-encap: ESP(spi=0x43ef462d,seq=0x7cf), length 136 IP 10.0.10.46.4500 > 10.0.0.52.4500: UDP-encap: ESP(spi=0xca1c282d,seq=0x7cf), length 136 IP 10.0.0.52.4500 > 10.0.10.46.4500: UDP-encap: ESP(spi=0x43ef462d,seq=0x7d0), length 136 IP 10.0.10.46.4500 > 10.0.0.52.4500: UDP-encap: ESP(spi=0xca1c282d,seq=0x7d0), length 136 IP 10.0.0.53.4500 > 10.0.10.46.4500: NONESP-encap: isakmp: child_sa inf2[I] IP 10.0.10.46.4500 > 10.0.0.53.4500: NONESP-encap: isakmp: child_sa inf2[R] IP 10.0.0.53.4500 > 10.0.10.46.4500: NONESP-encap: isakmp: child_sa inf2[I] IP 10.0.10.46.4500 > 10.0.0.53.4500: NONESP-encap: isakmp: child_sa inf2[R] IP 10.0.0.53.4500 > 10.0.10.46.4500: UDP-encap: ESP(spi=0x43ef462d,seq=0x7d1), length 136 NOTE: next sequence is wrong 0x1 IP 10.0.10.46.4500 > 10.0.0.53.4500: UDP-encap: ESP(spi=0xca1c282d,seq=0x1), length 136 IP 10.0.0.53.4500 > 10.0.10.46.4500: UDP-encap: ESP(spi=0x43ef462d,seq=0x7d2), length 136 IP 10.0.10.46.4500 > 10.0.0.53.4500: UDP-encap: ESP(spi=0xca1c282d,seq=0x2), length 136 Signed-off-by: Antony Antony <antony@phenome.org> Reviewed-by: Richard Guy Briggs <rgb@tricolour.ca> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2017-05-19 12:49:13 +02:00
Andreas Pape	a1a745ef98	batman-adv: fix memory leak when dropping packet from other gateway The skb must be released in the receive handler since `b91a2543b4` ("batman-adv: Consume skb in receive handlers"). Just returning NET_RX_DROP will no longer automatically free the memory. This results in memory leaks when unicast packets from other backbones must be dropped because they share a common backbone. Fixes: `9e794b6bf4` ("batman-adv: drop unicast packets from other backbone gw") Signed-off-by: Andreas Pape <apape@phoenixcontact.com> [sven@narfation.org: adjust commit message] Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>	2017-05-19 12:20:28 +02:00
Sven Eckelmann	36d4d68cd6	batman-adv: Fix rx packet/bytes stats on local ARP reply The stats are generated by batadv_interface_stats and must not be stored directly in the net_device stats member variable. The batadv_priv bat_counters information is assembled when ndo_get_stats is called. The stats previously stored in net_device::stats is then overwritten. The batman-adv counters must therefore be increased when an ARP packet is answered locally via the distributed arp table. Fixes: `c384ea3ec9` ("batman-adv: Distributed ARP Table - add snooping functions for ARP messages") Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>	2017-05-19 12:18:52 +02:00
Wei Yongjun	24d472e4e4	xfrm: Make function xfrm_dev_register static Fixes the following sparse warning: net/xfrm/xfrm_device.c:141:5: warning: symbol 'xfrm_dev_register' was not declared. Should it be static? Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2017-05-19 11:42:39 +02:00
David S. Miller	c6cd850d65	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2017-05-18 16:11:32 -04:00
Wei Yongjun	aaf0475a0b	xen/9pfs: p9_trans_xen_init and p9_trans_xen_exit can be static Fixes the following sparse warnings: net/9p/trans_xen.c:528:5: warning: symbol 'p9_trans_xen_init' was not declared. Should it be static? net/9p/trans_xen.c:540:6: warning: symbol 'p9_trans_xen_exit' was not declared. Should it be static? Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>	2017-05-18 11:42:58 -07:00
Wei Yongjun	14e3995e63	xen/9pfs: fix return value check in xen_9pfs_front_probe() In case of error, the function xenbus_read() returns ERR_PTR() and never returns NULL. The NULL test in the return value check should be replaced with IS_ERR(). Fixes: `71ebd71921` ("xen/9pfs: connect to the backend") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>	2017-05-18 11:42:32 -07:00
Eric Dumazet	b17b8a20c5	tcp: fix tcp_rearm_rto() skbs in (re)transmit queue no longer have a copy of jiffies at the time of the transmit : skb->skb_mstamp is now in usec unit, with no correlation to tcp_jiffies32. We have to convert rto from jiffies to usec, compute a time difference in usec, then convert the delta to HZ units. Fixes: `9a568de481` ("tcp: switch TCP TS option (RFC 7323) to 1ms clock") Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-18 13:20:31 -04:00
Vivien Didelot	f0c24ccf49	net: dsa: include switchdev.h only once DSA drivers and core use switchdev. Include switchdev.h only once, in the dsa.h public header, so that inclusion in DSA drivers or forward declarations of switchdev structures is not necessary anymore. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-18 10:40:12 -04:00
Vivien Didelot	ea5dd34be1	net: dsa: include dsa.h only once The public include/net/dsa.h file is meant for DSA drivers, while all DSA core files share a common private header net/dsa/dsa_priv.h file. Ensure that dsa_priv.h is the only DSA core file to include net/dsa.h, and add a new line to separate absolute and relative headers at the same time. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-18 10:40:12 -04:00
Andrey Vagin	de321ed384	net: fix __skb_try_recv_from_queue to return the old behavior This function has to return NULL on an error case, because there is a separate error variable. The offset has to be changed only if skb is returned. v2: fix udp code to not use an extra variable Cc: Paolo Abeni <pabeni@redhat.com> Cc: Eric Dumazet <edumazet@google.com> Cc: David S. Miller <davem@davemloft.net> Fixes: `65101aeca5` ("net/sock: factor out dequeue/peek with offset cod") Signed-off-by: Andrei Vagin <avagin@openvz.org> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-18 10:32:58 -04:00
Eric Dumazet	fdcee2cbb8	sctp: do not inherit ipv6_{mc\|ac\|fl}_list from parent SCTP needs fixes similar to `83eaddab43` ("ipv6/dccp: do not inherit ipv6_mc_list from parent"), otherwise bad things can happen. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Andrey Konovalov <andreyknvl@google.com> Tested-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-18 10:24:08 -04:00
Paolo Abeni	a3f96c47c8	udp: make udp_queue_rcv_skb() functions static Since the udp memory accounting refactor, we don't need any more to export the udp_queue_rcv_skb(). Make them static and fix a couple of sparse warnings: net/ipv4/udp.c:1615:5: warning: symbol 'udp_queue_rcv_skb' was not declared. Should it be static? net/ipv6/udp.c:572:5: warning: symbol 'udpv6_queue_rcv_skb' was not declared. Should it be static? Fixes: `850cbaddb5` ("udp: use it's own memory accounting schema") Fixes: `c915fe13cb` ("udplite: fix NULL pointer dereference") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-18 10:23:33 -04:00
Alexey Dobriyan	0cd2950357	net: make struct net_device::tx_queue_len unsigned int 4 billion packet queue is something unthinkable so use 32-bit value for now. Space savings on x86_64: add/remove: 0/0 grow/shrink: 3/70 up/down: 16/-131 (-115) function old new delta change_tx_queue_len 94 108 +14 qdisc_create 1176 1177 +1 alloc_netdev_mqs 1124 1125 +1 xenvif_alloc 533 532 -1 x25_asy_setup 167 166 -1 ... tun_queue_resize 945 940 -5 pfifo_fast_enqueue 167 162 -5 qfq_init_qdisc 168 158 -10 tap_queue_resize 810 799 -11 transmit 719 698 -21 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-18 10:19:30 -04:00
Tobias Jungel	a285860211	bridge: netlink: check vlan_default_pvid range Currently it is allowed to set the default pvid of a bridge to a value above VLAN_VID_MASK (0xfff). This patch adds a check to br_validate and returns -EINVAL in case the pvid is out of bounds. Reproduce by calling: [root@test ~]# ip l a type bridge [root@test ~]# ip l a type dummy [root@test ~]# ip l s bridge0 type bridge vlan_filtering 1 [root@test ~]# ip l s bridge0 type bridge vlan_default_pvid 9999 [root@test ~]# ip l s dummy0 master bridge0 [root@test ~]# bridge vlan port vlan ids bridge0 9999 PVID Egress Untagged dummy0 9999 PVID Egress Untagged Fixes: `0f963b7592` ("bridge: netlink: add support for default_pvid") Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Tobias Jungel <tobias.jungel@bisdn.de> Acked-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-18 10:15:00 -04:00
Colin Ian King	64f5102dcb	udp: make function udp_skb_dtor_locked static Function udp_skb_dtor_locked does not need to be in global scope so make it static to fix sparse warning: net/ipv4/udp.c: warning: symbol 'udp_skb_dtor_locked' was not declared. Should it be static? Fixes: `6dfb4367cd` ("udp: keep the sk_receive_queue held when splicing") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-18 10:12:40 -04:00
linzhang	64df6d525f	net: x25: fix one potential use-after-free issue The function x25_init does not properly unregister related resources in its error handler. This will result in a kernel oops if x25_init fails, so add the proper unregister calls to the error handler. Also, adjust the coding style and make x25_register_sysctl properly return failure. Signed-off-by: linzhang <xiaolou4617@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-18 10:05:40 -04:00
Marcel Holtmann	b56c7b2548	Bluetooth: Skip vendor diagnostic configuration for HCI User Channel When the HCI User Channel access is requested, then do not try to undermine it with vendor diagnostic configuration. The exclusive user is required to configure its own vendor diagnostic in that case and can not rely on the host stack support. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2017-05-18 13:52:49 +02:00
Marcel Holtmann	de2ba3039c	Bluetooth: Set LE Default PHY preferences If the LE Set Default PHY command is supported, then indicate to the controller that the host has no preferences for transmitter PHY or receiver PHY selection. Issuing this command gives the controller a clear indication that other PHY can be selected if available. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2017-05-18 13:52:49 +02:00
Marcel Holtmann	27bbca4402	Bluetooth: Enable LE PHY Update Complete event If either the LE Set Default PHY command or the LE Set PHY command is supported, then enable the LE PHY Update Complete event. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2017-05-18 13:52:49 +02:00
Marcel Holtmann	9756d33b85	Bluetooth: Enable LE Channel Selection Algorithm event If the Channel Selection Algorithm #2 feature is supported, then enable the new LE Channel Selection Algorithm event. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2017-05-18 13:52:49 +02:00
Marcel Holtmann	122048752e	Bluetooth: Set LE Suggested Default Data Length to maximum When LE Data Packet Length Extension is supported, then actually increase the suggested default data length to the maximum to enable higher throughput. < HCI Command: LE Read Maximum Data Length (0x08\|0x002f) plen 0 > HCI Event: Command Complete (0x0e) plen 12 LE Read Maximum Data Length (0x08\|0x002f) ncmd 1 Status: Success (0x00) Max TX octets: 251 Max TX time: 2120 Max RX octets: 251 Max RX time: 2120 < HCI Command: LE Read Suggested Default Data Length (0x08\|0x0023) plen 0 > HCI Event: Command Complete (0x0e) plen 8 LE Read Suggested Default Data Length (0x08\|0x0023) ncmd 1 Status: Success (0x00) TX octets: 27 TX time: 328 < HCI Command: LE Write Suggested Default Data Length (0x08\|0x0024) plen 4 TX octets: 251 TX time: 2120 > HCI Event: Command Complete (0x0e) plen 4 LE Write Suggested Default Data Length (0x08\|0x0024) ncmd 1 Status: Success (0x00) Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2017-05-18 13:52:49 +02:00
Willem de Bruijn	751a9c7638	netfilter: xtables: fix build failure from COMPAT_XT_ALIGN outside CONFIG_COMPAT The patch in the Fixes references COMPAT_XT_ALIGN in the definition of XT_DATA_TO_USER, outside an #ifdef CONFIG_COMPAT block. Split XT_DATA_TO_USER into separate compat and non compat variants and define the first inside a CONFIG_COMPAT block. This simplifies both variants by removing branches inside the macro. Fixes: `324318f024` ("netfilter: xtables: zero padding in data_to_user") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2017-05-18 13:10:03 +02:00
David S. Miller	7dd7eb9513	ipv6: Check ip6_find_1stfragopt() return value properly. Do not use unsigned variables to see if it returns a negative error or not. Fixes: `2423496af3` ("ipv6: Prevent overrun when parsing v6 header options") Reported-by: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-17 22:54:11 -04:00
Eric Dumazet	9a568de481	tcp: switch TCP TS option (RFC 7323) to 1ms clock TCP Timestamps option is defined in RFC 7323. Traditionally on Linux, it has been tied to the internal 'jiffies' variable, because it had been a cheap and good enough generator. For TCP flows on the Internet, 1 ms resolution would be much better than 4ms or 10ms (HZ=250 or HZ=100 respectively). For TCP flows in the DC, Google has used usec resolution for more than two years with great success [1]. Receive size autotuning (DRS) is indeed more precise and converges faster to optimal window size. This patch converts tp->tcp_mstamp to a plain u64 value storing a 1 usec TCP clock. This choice will allow us to upstream the 1 usec TS option as discussed in IETF 97. [1] https://www.ietf.org/proceedings/97/slides/slides-97-tcpm-tcp-options-for-low-latency-00.pdf Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-17 16:06:01 -04:00
Eric Dumazet	ac9517fcf3	tcp: replace misc tcp_time_stamp to tcp_jiffies32 After this patch, all uses of tcp_time_stamp will require a change when we introduce 1 ms and/or 1 us TCP TS option. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-17 16:06:01 -04:00
Eric Dumazet	46bf466f08	tcp_lp: cache tcp_time_stamp tcp_time_stamp will become slightly more expensive soon, cache its value. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-17 16:06:01 -04:00
Eric Dumazet	ad5ad69e6b	tcp_westwood: use tcp_jiffies32 instead of tcp_time_stamp This CC does not need 1 ms tcp_time_stamp and can use the jiffy based 'timestamp'. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-17 16:06:01 -04:00
Eric Dumazet	594208afe4	tcp: use tcp_jiffies32 in __tcp_oow_rate_limited() This place wants to use tcp_jiffies32, this is good enough. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-17 16:06:01 -04:00
Eric Dumazet	628174ccc4	tcp: uses jiffies_32 to feed tp->chrono_start tcp_time_stamp will no longer be tied to jiffies. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-17 16:06:01 -04:00
Eric Dumazet	c74df29a8d	tcp: use tcp_jiffies32 to feed probe_timestamp Use tcp_jiffies32 instead of tcp_time_stamp, since tcp_time_stamp will soon be only used for TCP TS option. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-17 16:06:01 -04:00
Eric Dumazet	70eabf0e1b	tcp: use tcp_jiffies32 for rcv_tstamp and lrcvtime Use tcp_jiffies32 instead of tcp_time_stamp, since tcp_time_stamp will soon be only used for TCP TS option. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-17 16:06:01 -04:00
Eric Dumazet	ac35f56220	tcp: bic, cubic: use tcp_jiffies32 instead of tcp_time_stamp Use tcp_jiffies32 instead of tcp_time_stamp, since tcp_time_stamp will soon be only used for TCP TS option. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-17 16:06:01 -04:00
Eric Dumazet	2660bfa84e	tcp_bbr: use tcp_jiffies32 instead of tcp_time_stamp Use tcp_jiffies32 instead of tcp_time_stamp, since tcp_time_stamp will soon be only used for TCP TS option. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-17 16:06:01 -04:00
Eric Dumazet	c2203cf75e	tcp: use tcp_jiffies32 to feed tp->snd_cwnd_stamp Use tcp_jiffies32 instead of tcp_time_stamp to feed tp->snd_cwnd_stamp. tcp_time_stamp will soon be a little bit more expensive than simply reading 'jiffies'. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-17 16:06:01 -04:00
Eric Dumazet	d635fbe27e	tcp: use tcp_jiffies32 to feed tp->lsndtime Use tcp_jiffies32 instead of tcp_time_stamp to feed tp->lsndtime. tcp_time_stamp will soon be a little bit more expensive than simply reading 'jiffies'. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-17 16:06:01 -04:00
Eric Dumazet	d011b9a448	dccp: do not use tcp_time_stamp Use our own macro instead of abusing tcp_time_stamp Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-17 16:06:01 -04:00
Eric Dumazet	385e20706f	tcp: use tp->tcp_mstamp in output path Idea is to later convert tp->tcp_mstamp to a full u64 counter using usec resolution, so that we can later have fine grained TCP TS clock (RFC 7323), regardless of HZ value. We try to refresh tp->tcp_mstamp only when necessary. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-17 16:06:01 -04:00
David S. Miller	9d4f97f97b	sch_dsmark: Fix uninitialized variable warning. We still need to initialize err to -EINVAL for the case where 'opt' is NULL in dsmark_init(). Fixes: `6529eaba33` ("net: sched: introduce tcf block infractructure") Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-17 16:04:38 -04:00
Jiri Pirko	db50514f9a	net: sched: add termination action to allow goto chain Introduce new type of termination action called "goto_chain". This allows user to specify a chain to be processed. This action type is then processed as a return value in tcf_classify loop in similar way as "reclassify" is, only it does not reset to the first filter in chain but rather reset to the first filter of the desired chain. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-17 15:22:13 -04:00
Jiri Pirko	9fb9f251d2	net: sched: push tp down to action init Tp pointer will be needed by the next patch in order to get the chain. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-17 15:22:13 -04:00
Jiri Pirko	5bc1701881	net: sched: introduce multichain support for filters Instead of having only one filter per block, introduce a list of chains for every block. Create chain 0 by default. UAPI is extended so the user can specify which chain to change. If the new attribute is not specified, chain 0 is used, which maintains backward compatibility. If the chain does not exist and the user wants to manipulate it, a new chain is created with the specified index. Also, when the last filter is removed from the chain, the chain is destroyed. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-17 15:22:13 -04:00
Jiri Pirko	acb31fae3b	net: sched: push chain dump to a separate function Since there will be multiple chains to dump, push chain dumping code to a separate function. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-17 15:22:13 -04:00
Jiri Pirko	2190d1d094	net: sched: introduce helpers to work with filter chains Introduce struct tcf_chain object and set of helpers around it. Wraps up insertion, deletion and search in the filter chain. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-17 15:22:13 -04:00
Jiri Pirko	7961973a00	net: sched: move TC_H_MAJ macro call into tcf_auto_prio Call the helper from the function rather than to always adjust the return value of the function. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-17 15:22:13 -04:00
Jiri Pirko	9d36d9e545	net: sched: replace nprio by a bool to make the function more readable The use of "nprio" variable in tc_ctl_tfilter is a bit cryptic and makes a reader wonder what is going on for a while. So help him to understand this priority allocation dance a little bit better. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-17 15:22:13 -04:00
Jiri Pirko	fbe9c5b01f	net: sched: rename tcf_destroy_chain helper Make the name consistent with the rest of the helpers around. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-17 15:22:13 -04:00
Jiri Pirko	6529eaba33	net: sched: introduce tcf block infractructure Currently, the filter chains are directly put into the private structures of qdiscs. In order to be able to have multiple chains per qdisc and to allow filter chains sharing among qdiscs, there is a need for a common object that would hold the chains. This introduces such an object and calls it "tcf_block". Helpers to get and put the blocks are provided to be called from individual qdisc code. Also, the original filter_list pointers are left in qdisc privs to allow the entry into tcf_block processing without any added overhead of possible multiple pointer dereference on fast path. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-17 15:22:13 -04:00
Jiri Pirko	87d83093bf	net: sched: move tc_classify function to cls_api.c Move tc_classify function to cls_api.c where it belongs, rename it to fit the namespace. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-17 15:22:13 -04:00
Andrew Lunn	eb7b721129	net: dsa: Sort DSA tagging protocol drivers With more tag protocols being added, regain some order by sorting the entries in various places. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-17 15:19:40 -04:00
Eric Dumazet	9142e9007f	net: fix compile error in skb_orphan_partial() If CONFIG_INET is not set, net/core/sock.c can not compile : net/core/sock.c: In function ‘skb_orphan_partial’: net/core/sock.c:1810:2: error: implicit declaration of function ‘skb_is_tcp_pure_ack’ [-Werror=implicit-function-declaration] if (skb_is_tcp_pure_ack(skb)) ^ Fix this by always including <net/tcp.h> Fixes: `f6ba8d33cf` ("netem: fix skb_orphan_partial()") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com> Reported-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-17 15:10:13 -04:00
Craig Gallek	2423496af3	ipv6: Prevent overrun when parsing v6 header options The KASAN warning reported below was discovered with a syzkaller program. The reproducer is basically: int s = socket(AF_INET6, SOCK_RAW, NEXTHDR_HOP); send(s, &one_byte_of_data, 1, MSG_MORE); send(s, &more_than_mtu_bytes_data, 2000, 0); The socket() call sets the nexthdr field of the v6 header to NEXTHDR_HOP, the first send call primes the payload with a non zero byte of data, and the second send call triggers the fragmentation path. The fragmentation code tries to parse the header options in order to figure out where to insert the fragment option. Since nexthdr points to an invalid option, the calculation of the size of the network header can be made to be much larger than the linear section of the skb and data is read outside of it. This fix makes ip6_find_1stfrag return an error if it detects running out-of-bounds. [ 42.361487] ================================================================== [ 42.364412] BUG: KASAN: slab-out-of-bounds in ip6_fragment+0x11c8/0x3730 [ 42.365471] Read of size 840 at addr ffff88000969e798 by task ip6_fragment-oo/3789 [ 42.366469] [ 42.366696] CPU: 1 PID: 3789 Comm: ip6_fragment-oo Not tainted 4.11.0+ #41 [ 42.367628] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-1ubuntu1 04/01/2014 [ 42.368824] Call Trace: [ 42.369183] dump_stack+0xb3/0x10b [ 42.369664] print_address_description+0x73/0x290 [ 42.370325] kasan_report+0x252/0x370 [ 42.370839] ? ip6_fragment+0x11c8/0x3730 [ 42.371396] check_memory_region+0x13c/0x1a0 [ 42.371978] memcpy+0x23/0x50 [ 42.372395] ip6_fragment+0x11c8/0x3730 [ 42.372920] ? nf_ct_expect_unregister_notifier+0x110/0x110 [ 42.373681] ? ip6_copy_metadata+0x7f0/0x7f0 [ 42.374263] ? ip6_forward+0x2e30/0x2e30 [ 42.374803] ip6_finish_output+0x584/0x990 [ 42.375350] ip6_output+0x1b7/0x690 [ 42.375836] ? ip6_finish_output+0x990/0x990 [ 42.376411] ? 
ip6_fragment+0x3730/0x3730 [ 42.376968] ip6_local_out+0x95/0x160 [ 42.377471] ip6_send_skb+0xa1/0x330 [ 42.377969] ip6_push_pending_frames+0xb3/0xe0 [ 42.378589] rawv6_sendmsg+0x2051/0x2db0 [ 42.379129] ? rawv6_bind+0x8b0/0x8b0 [ 42.379633] ? _copy_from_user+0x84/0xe0 [ 42.380193] ? debug_check_no_locks_freed+0x290/0x290 [ 42.380878] ? ___sys_sendmsg+0x162/0x930 [ 42.381427] ? rcu_read_lock_sched_held+0xa3/0x120 [ 42.382074] ? sock_has_perm+0x1f6/0x290 [ 42.382614] ? ___sys_sendmsg+0x167/0x930 [ 42.383173] ? lock_downgrade+0x660/0x660 [ 42.383727] inet_sendmsg+0x123/0x500 [ 42.384226] ? inet_sendmsg+0x123/0x500 [ 42.384748] ? inet_recvmsg+0x540/0x540 [ 42.385263] sock_sendmsg+0xca/0x110 [ 42.385758] SYSC_sendto+0x217/0x380 [ 42.386249] ? SYSC_connect+0x310/0x310 [ 42.386783] ? __might_fault+0x110/0x1d0 [ 42.387324] ? lock_downgrade+0x660/0x660 [ 42.387880] ? __fget_light+0xa1/0x1f0 [ 42.388403] ? __fdget+0x18/0x20 [ 42.388851] ? sock_common_setsockopt+0x95/0xd0 [ 42.389472] ? SyS_setsockopt+0x17f/0x260 [ 42.390021] ? 
entry_SYSCALL_64_fastpath+0x5/0xbe [ 42.390650] SyS_sendto+0x40/0x50 [ 42.391103] entry_SYSCALL_64_fastpath+0x1f/0xbe [ 42.391731] RIP: 0033:0x7fbbb711e383 [ 42.392217] RSP: 002b:00007ffff4d34f28 EFLAGS: 00000246 ORIG_RAX: 000000000000002c [ 42.393235] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fbbb711e383 [ 42.394195] RDX: 0000000000001000 RSI: 00007ffff4d34f60 RDI: 0000000000000003 [ 42.395145] RBP: 0000000000000046 R08: 00007ffff4d34f40 R09: 0000000000000018 [ 42.396056] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000400aad [ 42.396598] R13: 0000000000000066 R14: 00007ffff4d34ee0 R15: 00007fbbb717af00 [ 42.397257] [ 42.397411] Allocated by task 3789: [ 42.397702] save_stack_trace+0x16/0x20 [ 42.398005] save_stack+0x46/0xd0 [ 42.398267] kasan_kmalloc+0xad/0xe0 [ 42.398548] kasan_slab_alloc+0x12/0x20 [ 42.398848] __kmalloc_node_track_caller+0xcb/0x380 [ 42.399224] __kmalloc_reserve.isra.32+0x41/0xe0 [ 42.399654] __alloc_skb+0xf8/0x580 [ 42.400003] sock_wmalloc+0xab/0xf0 [ 42.400346] __ip6_append_data.isra.41+0x2472/0x33d0 [ 42.400813] ip6_append_data+0x1a8/0x2f0 [ 42.401122] rawv6_sendmsg+0x11ee/0x2db0 [ 42.401505] inet_sendmsg+0x123/0x500 [ 42.401860] sock_sendmsg+0xca/0x110 [ 42.402209] ___sys_sendmsg+0x7cb/0x930 [ 42.402582] __sys_sendmsg+0xd9/0x190 [ 42.402941] SyS_sendmsg+0x2d/0x50 [ 42.403273] entry_SYSCALL_64_fastpath+0x1f/0xbe [ 42.403718] [ 42.403871] Freed by task 1794: [ 42.404146] save_stack_trace+0x16/0x20 [ 42.404515] save_stack+0x46/0xd0 [ 42.404827] kasan_slab_free+0x72/0xc0 [ 42.405167] kfree+0xe8/0x2b0 [ 42.405462] skb_free_head+0x74/0xb0 [ 42.405806] skb_release_data+0x30e/0x3a0 [ 42.406198] skb_release_all+0x4a/0x60 [ 42.406563] consume_skb+0x113/0x2e0 [ 42.406910] skb_free_datagram+0x1a/0xe0 [ 42.407288] netlink_recvmsg+0x60d/0xe40 [ 42.407667] sock_recvmsg+0xd7/0x110 [ 42.408022] ___sys_recvmsg+0x25c/0x580 [ 42.408395] __sys_recvmsg+0xd6/0x190 [ 42.408753] SyS_recvmsg+0x2d/0x50 [ 42.409086] 
entry_SYSCALL_64_fastpath+0x1f/0xbe [ 42.409513] [ 42.409665] The buggy address belongs to the object at ffff88000969e780 [ 42.409665] which belongs to the cache kmalloc-512 of size 512 [ 42.410846] The buggy address is located 24 bytes inside of [ 42.410846] 512-byte region [ffff88000969e780, ffff88000969e980) [ 42.411941] The buggy address belongs to the page: [ 42.412405] page:ffffea000025a780 count:1 mapcount:0 mapping: (null) index:0x0 compound_mapcount: 0 [ 42.413298] flags: 0x100000000008100(slab\|head) [ 42.413729] raw: 0100000000008100 0000000000000000 0000000000000000 00000001800c000c [ 42.414387] raw: ffffea00002a9500 0000000900000007 ffff88000c401280 0000000000000000 [ 42.415074] page dumped because: kasan: bad access detected [ 42.415604] [ 42.415757] Memory state around the buggy address: [ 42.416222] ffff88000969e880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 42.416904] ffff88000969e900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 42.417591] >ffff88000969e980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 42.418273] ^ [ 42.418588] ffff88000969ea00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 42.419273] ffff88000969ea80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 42.419882] ================================================================== Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Craig Gallek <kraig@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-17 14:55:59 -04:00
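
The bounds check described in 2423496af3 can be illustrated with a minimal userspace sketch. This is not the kernel's ip6_find_1stfrag; the function name, constants, and flat-buffer layout below are assumptions made for illustration only. It walks the extension headers that follow the 40-byte fixed IPv6 header and returns an error as soon as a header's declared length would run past the end of the buffer, instead of trusting the length byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for this sketch; the values are the IANA protocol numbers. */
enum { HDR_HOP = 0, HDR_ROUTING = 43, HDR_DEST = 60 };
#define FIXED_HDR_LEN 40

/*
 * Walk the extension headers in buf[0..len), starting right after the
 * fixed IPv6 header. Returns the offset of the first header that is not a
 * hop-by-hop, routing, or destination-options header, or -1 if any header
 * would extend beyond len.
 */
static int find_first_non_ext(const uint8_t *buf, size_t len, uint8_t nexthdr)
{
    size_t off = FIXED_HDR_LEN;

    assert(buf != NULL);
    if (len < FIXED_HDR_LEN)
        return -1;

    while (nexthdr == HDR_HOP || nexthdr == HDR_ROUTING || nexthdr == HDR_DEST) {
        size_t hdrlen;

        /* The next-header and length bytes must be inside the buffer. */
        if (len - off < 2)
            return -1;

        /* Length field counts 8-octet units, excluding the first 8 octets. */
        hdrlen = ((size_t)buf[off + 1] + 1) * 8;

        /* Reject a header that claims more bytes than remain. */
        if (hdrlen > len - off)
            return -1;

        nexthdr = buf[off];
        off += hdrlen;
    }
    return (int)off;
}
```

With a one-byte payload and nexthdr set to hop-by-hop, as in the reproducer quoted above, this sketch returns -1 rather than computing an offset outside the buffer.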


46801 Commits