linux

Author	SHA1	Message	Date
Arnd Bergmann	b61a602ee6	ipvs: initialize returned data in do_ip_vs_get_ctl As reported by a gcc warning, the do_ip_vs_get_ctl does not initalize all the members of the ip_vs_timeout_user structure it returns if at least one of the TCP or UDP protocols is disabled for ipvs. This makes sure that the data is always initialized, before it is returned as a response to IPVS_CMD_GET_CONFIG or printed as a debug message in IPVS_CMD_SET_CONFIG. Without this patch, building ARM ixp4xx_defconfig results in: net/netfilter/ipvs/ip_vs_ctl.c: In function 'ip_vs_genl_set_cmd': net/netfilter/ipvs/ip_vs_ctl.c:2238:47: warning: 't.udp_timeout' may be used uninitialized in this function [-Wuninitialized] net/netfilter/ipvs/ip_vs_ctl.c:3322:28: note: 't.udp_timeout' was declared here net/netfilter/ipvs/ip_vs_ctl.c:2238:47: warning: 't.tcp_fin_timeout' may be used uninitialized in this function [-Wuninitialized] net/netfilter/ipvs/ip_vs_ctl.c:3322:28: note: 't.tcp_fin_timeout' was declared here net/netfilter/ipvs/ip_vs_ctl.c:2238:47: warning: 't.tcp_timeout' may be used uninitialized in this function [-Wuninitialized] net/netfilter/ipvs/ip_vs_ctl.c:3322:28: note: 't.tcp_timeout' was declared here Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2012-10-09 13:04:34 +09:00
Julian Anastasov	ad4d3ef8b7	ipvs: fix ARP resolving for direct routing mode After the change "Make neigh lookups directly in output packet path" (commit `a263b30936`) IPVS can not reach the real server for DR mode because we resolve the destination address from IP header, not from route neighbour. Use the new FLOWI_FLAG_KNOWN_NH flag to request output routes with known nexthop, so that it has preference on resolving. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-08 17:42:36 -04:00
Julian Anastasov	c92b96553a	ipv4: Add FLOWI_FLAG_KNOWN_NH Add flag to request that output route should be returned with known rt_gateway, in case we want to use it as nexthop for neighbour resolving. The returned route can be cached as follows: - in NH exception: because the cached routes are not shared with other destinations - in FIB NH: when using gateway because all destinations for NH share same gateway As last option, to return rt_gateway!=0 we have to set DST_NOCACHE. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-08 17:42:36 -04:00
Julian Anastasov	155e8336c3	ipv4: introduce rt_uses_gateway Add new flag to remember when route is via gateway. We will use it to allow rt_gateway to contain address of directly connected host for the cases when DST_NOCACHE is used or when the NH exception caches per-destination route without DST_NOCACHE flag, i.e. when routes are not used for other destinations. By this way we force the neighbour resolving to work with the routed destination but we can use different address in the packet, feature needed for IPVS-DR where original packet for virtual IP is routed via route to real IP. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-08 17:42:36 -04:00
Julian Anastasov	f8a17175c6	ipv4: make sure nh_pcpu_rth_output is always allocated Avoid checking nh_pcpu_rth_output in fast path, abort fib_info creation on alloc_percpu failure. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-08 17:42:35 -04:00
Julian Anastasov	e0adef0f74	ipv4: fix forwarding for strict source routes After the change "Adjust semantics of rt->rt_gateway" (commit `f8126f1d51`) rt_gateway can be 0 but ip_forward() compares it directly with nexthop. What we want here is to check if traffic is to directly connected nexthop and to fail if using gateway. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-08 17:42:35 -04:00
Julian Anastasov	e81da0e113	ipv4: fix sending of redirects After "Cache input routes in fib_info nexthops" (commit `d2d68ba9fe`) and "Elide fib_validate_source() completely when possible" (commit `7a9bc9b81a`) we can not send ICMP redirects. It seems we should not cache the RTCF_DOREDIRECT flag in nh_rth_input because the same fib_info can be used for traffic that is not redirected, eg. from other input devices or from sources that are not in same subnet. As result, we have to disable the caching of RTCF_DOREDIRECT flag and to force source validation for the case when forwarding traffic to the input device. If traffic comes from directly connected source we allow redirection as it was done before both changes. Avoid setting RTCF_DOREDIRECT if IN_DEV_TX_REDIRECTS is disabled, this can avoid source address validation and to help caching the routes. After the change "Adjust semantics of rt->rt_gateway" (commit `f8126f1d51`) we should make sure our ICMP_REDIR_HOST messages contain daddr instead of 0.0.0.0 when target is directly connected. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-08 17:42:35 -04:00
Eric Dumazet	863472454c	ipv6: gro: fix PV6_GRO_CB(skb)->proto problem It seems IPV6_GRO_CB(skb)->proto can be destroyed in skb_gro_receive() if a new skb is allocated (to serve as an anchor for frag_list) We copy NAPI_GRO_CB() only (not the IPV6 specific part) in : NAPI_GRO_CB(nskb) = NAPI_GRO_CB(p); So we leave IPV6_GRO_CB(nskb)->proto to 0 (fresh skb allocation) instead of IPPROTO_TCP (6) ipv6_gro_complete() isnt able to call ops->gro_complete() [ tcp6_gro_complete() ] Fix this by moving proto in NAPI_GRO_CB() and getting rid of IPV6_GRO_CB Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-08 15:40:43 -04:00
Florian Zumbiehl	48cc32d38a	vlan: don't deliver frames for unknown vlans to protocols `6a32e4f9dd` made the vlan code skip marking vlan-tagged frames for not locally configured vlans as PACKET_OTHERHOST if there was an rx_handler, as the rx_handler could cause the frame to be received on a different (virtual) vlan-capable interface where that vlan might be configured. As rx_handlers do not necessarily return RX_HANDLER_ANOTHER, this could cause frames for unknown vlans to be delivered to the protocol stack as if they had been received untagged. For example, if an ipv6 router advertisement that's tagged for a locally not configured vlan is received on an interface with macvlan interfaces attached, macvlan's rx_handler returns RX_HANDLER_PASS after delivering the frame to the macvlan interfaces, which caused it to be passed to the protocol stack, leading to ipv6 addresses for the announced prefix being configured even though those are completely unusable on the underlying interface. The fix moves marking as PACKET_OTHERHOST after the rx_handler so the rx_handler, if there is one, sees the frame unchanged, but afterwards, before the frame is delivered to the protocol stack, it gets marked whether there is an rx_handler or not. Signed-off-by: Florian Zumbiehl <florz@florz.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-08 15:21:55 -04:00
Felix Fietkau	c3e7724b6b	mac80211: use ieee80211_free_txskb to fix possible skb leaks A few places free skbs using dev_kfree_skb even though they're called after ieee80211_subif_start_xmit might have cloned it for tracking tx status. Use ieee80211_free_txskb here to prevent skb leaks. Signed-off-by: Felix Fietkau <nbd@openwrt.org> Cc: stable@vger.kernel.org Signed-off-by: John W. Linville <linville@tuxdriver.com>	2012-10-08 15:06:05 -04:00
Thomas Pedersen	55fabefe36	mac80211: call drv_get_tsf() in sleepable context The call to drv_get/set_tsf() was put on the workqueue to perform tsf adjustments since that function might sleep. However it ended up inside a spinlock, whose critical section must be atomic. Do tsf adjustment outside the spinlock instead, and get rid of a warning. Signed-off-by: Thomas Pedersen <thomas@cozybit.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2012-10-08 15:06:02 -04:00
Eric Dumazet	2e71a6f808	net: gro: selective flush of packets Current GRO can hold packets in gro_list for almost unlimited time, in case napi->poll() handler consumes its budget over and over. In this case, napi_complete()/napi_gro_flush() are not called. Another problem is that gro_list is flushed in non friendly way : We scan the list and complete packets in the reverse order. (youngest packets first, oldest packets last) This defeats priorities that sender could have cooked. Since GRO currently only store TCP packets, we dont really notice the bug because of retransmits, but this behavior can add unexpected latencies, particularly on mice flows clamped by elephant flows. This patch makes sure no packet can stay more than 1 ms in queue, and only in stress situations. It also complete packets in the right order to minimize latencies. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Jesse Gross <jesse@nicira.com> Cc: Tom Herbert <therbert@google.com> Cc: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-08 14:51:51 -04:00
Steffen Klassert	ee9a8f7ab2	ipv4: Don't report stale pmtu values to userspace We report cached pmtu values even if they are already expired. Change this to not report these values after they are expired and fix a race in the expire time calculation, as suggested by Eric Dumazet. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-08 14:46:35 -04:00
Steffen Klassert	7f92d334ba	ipv4: Don't create nh exeption when the device mtu is smaller than the reported pmtu When a local tool like tracepath tries to send packets bigger than the device mtu, we create a nh exeption and set the pmtu to device mtu. The device mtu does not expire, so check if the device mtu is smaller than the reported pmtu and don't crerate a nh exeption in that case. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-08 14:46:35 -04:00
Steffen Klassert	d851c12b60	ipv4: Always invalidate or update the route on pmtu events Some protocols, like IPsec still cache routes. So we need to invalidate the old route on pmtu events to avoid the reuse of stale routes. We also need to update the mtu and expire time of the route if we already use a nh exception route, otherwise we ignore newly learned pmtu values after the first expiration. With this patch we always invalidate or update the route on pmtu events. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-08 14:46:34 -04:00
David Howells	cf7f601c06	KEYS: Add payload preparsing opportunity prior to key instantiate or update Give the key type the opportunity to preparse the payload prior to the instantiation and update routines being called. This is done with the provision of two new key type operations: int (preparse)(struct key_preparsed_payload prep); void (free_preparse)(struct key_preparsed_payload prep); If the first operation is present, then it is called before key creation (in the add/update case) or before the key semaphore is taken (in the update and instantiate cases). The second operation is called to clean up if the first was called. preparse() is given the opportunity to fill in the following structure: struct key_preparsed_payload { char description; void type_data[2]; void payload; const void data; size_t datalen; size_t quotalen; }; Before the preparser is called, the first three fields will have been cleared, the payload pointer and size will be stored in data and datalen and the default quota size from the key_type struct will be stored into quotalen. The preparser may parse the payload in any way it likes and may store data in the type_data[] and payload fields for use by the instantiate() and update() ops. The preparser may also propose a description for the key by attaching it as a string to the description field. This can be used by passing a NULL or "" description to the add_key() system call or the key_create_or_update() function. This cannot work with request_key() as that required the description to tell the upcall about the key to be created. This, for example permits keys that store PGP public keys to generate their own name from the user ID and public key fingerprint in the key. The instantiate() and update() operations are then modified to look like this: int (instantiate)(struct key key, struct key_preparsed_payload prep); int (update)(struct key key, struct key_preparsed_payload prep); and the new payload data is passed in *prep, whether or not it was preparsed. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2012-10-08 13:49:48 +10:30
Linus Torvalds	7035cdf36d	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull ceph updates from Sage Weil: "The bulk of this pull is a series from Alex that refactors and cleans up the RBD code to lay the groundwork for supporting the new image format and evolving feature set. There are also some cleanups in libceph, and for ceph there's fixed validation of file striping layouts and a bugfix in the code handling a shrinking MDS cluster." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (71 commits) ceph: avoid 32-bit page index overflow ceph: return EIO on invalid layout on GET_DATALOC ioctl rbd: BUG on invalid layout ceph: propagate layout error on osd request creation libceph: check for invalid mapping ceph: convert to use le32_add_cpu() ceph: Fix oops when handling mdsmap that decreases max_mds rbd: update remaining header fields for v2 rbd: get snapshot name for a v2 image rbd: get the snapshot context for a v2 image rbd: get image features for a v2 image rbd: get the object prefix for a v2 rbd image rbd: add code to get the size of a v2 rbd image rbd: lay out header probe infrastructure rbd: encapsulate code that gets snapshot info rbd: add an rbd features field rbd: don't use index in __rbd_add_snap_dev() rbd: kill create_snap sysfs entry rbd: define rbd_dev_image_id() rbd: define some new format constants ...	2012-10-08 06:38:18 +09:00
Eric Dumazet	ca07e43e28	net: gro: fix a potential crash in skb_gro_reset_offset Before accessing skb first fragment, better make sure there is one. This is probably not needed for old kernels, since an ethernet frame cannot contain only an ethernet header, but the recent GRO addition to tunnels makes this patch needed. Also skb_gro_reset_offset() can be static, it actually allows compiler to inline it. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-07 14:49:17 -04:00
Eric Dumazet	51ec04038c	ipv6: GRO should be ECN friendly IPv4 side of the problem was addressed in commit `a9e050f4e7` (net: tcp: GRO should be ECN friendly) This patch does the same, but for IPv6 : A Traffic Class mismatch doesnt mean flows are different, but instead should force a flush of previous packets. This patch removes artificial packet reordering problem. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-07 14:44:36 -04:00
ramesh.nagappa@gmail.com	e1f165032c	net: Fix skb_under_panic oops in neigh_resolve_output The retry loop in neigh_resolve_output() and neigh_connected_output() call dev_hard_header() with out reseting the skb to network_header. This causes the retry to fail with skb_under_panic. The fix is to reset the network_header within the retry loop. Signed-off-by: Ramesh Nagappa <ramesh.nagappa@ericsson.com> Reviewed-by: Shawn Lu <shawn.lu@ericsson.com> Reviewed-by: Robert Coulson <robert.coulson@ericsson.com> Reviewed-by: Billie Alsup <billie.alsup@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-07 14:42:39 -04:00
Eric Dumazet	acb600def2	net: remove skb recycling Over time, skb recycling infrastructure got litle interest and many bugs. Generic rx path skb allocation is now using page fragments for efficient GRO / TCP coalescing, and recyling a tx skb for rx path is not worth the pain. Last identified bug is that fat skbs can be recycled and it can endup using high order pages after few iterations. With help from Maxime Bizon, who pointed out that commit `87151b8689` (net: allow pskb_expand_head() to get maximum tailroom) introduced this regression for recycled skbs. Instead of fixing this bug, lets remove skb recycling. Drivers wanting really hot skbs should use build_skb() anyway, to allocate/populate sk_buff right before netif_receive_skb() Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Maxime Bizon <mbizon@freebox.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-07 00:40:54 -04:00
Gao feng	6dc878a8ca	netlink: add reference of module in netlink_dump_start I get a panic when I use ss -a and rmmod inet_diag at the same time. It's because netlink_dump uses inet_diag_dump which belongs to module inet_diag. I search the codes and find many modules have the same problem. We need to add a reference to the module which the cb->dump belongs to. Thanks for all help from Stephen,Jan,Eric,Steffen and Pablo. Change From v3: change netlink_dump_start to inline,suggestion from Pablo and Eric. Change From v2: delete netlink_dump_done,and call module_put in netlink_dump and netlink_sock_destruct. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-07 00:30:56 -04:00
Linus Torvalds	283dbd8205	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking changes from David Miller: "The most important bit in here is the fix for input route caching from Eric Dumazet, it's a shame we couldn't fully analyze this in time for 3.6 as it's a 3.6 regression introduced by the routing cache removal. Anyways, will send quickly to -stable after you pull this in. Other changes of note: 1) Fix lockdep splats in team and bonding, from Eric Dumazet. 2) IPV6 adds link local route even when there is no link local address, from Nicolas Dichtel. 3) Fix ixgbe PTP implementation, from Jacob Keller. 4) Fix excessive stack usage in cxgb4 driver, from Vipul Pandya. 5) MAC length computed improperly in VLAN demux, from Antonio Quartulli." * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (26 commits) ipv6: release reference of ip6_null_entry's dst entry in __ip6_del_rt Remove noisy printks from llcp_sock_connect tipc: prevent dropped connections due to rcvbuf overflow silence some noisy printks in irda team: set qdisc_tx_busylock to avoid LOCKDEP splat bonding: set qdisc_tx_busylock to avoid LOCKDEP splat sctp: check src addr when processing SACK to update transport state sctp: fix a typo in prototype of __sctp_rcv_lookup() ipv4: add a fib_type to fib_info can: mpc5xxx_can: fix section type conflict can: peak_pcmcia: fix error return code can: peak_pci: fix error return code cxgb4: Fix build error due to missing linux/vmalloc.h include. bnx2x: fix ring size for 10G functions cxgb4: Dynamically allocate memory in t4_memory_rw() and get_vpd_params() ixgbe: add support for X540-AT1 ixgbe: fix poll loop for FDIRCTRL.INIT_DONE bit ixgbe: fix PTP ethtool timestamping function ixgbe: (PTP) Fix PPS interrupt code ixgbe: Fix PTP X540 SDP alignment code for PPS signal ...	2012-10-06 03:11:59 +09:00
Andi Kleen	04a6f82cf0	sections: fix section conflicts in net Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-10-06 03:04:45 +09:00
Andi Kleen	6299b669b1	sections: fix section conflicts in net/can Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Oliver Hartkopp <socketcan@hartkopp.net> Cc: David Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-10-06 03:04:45 +09:00
Sasha Levin	6f5601251d	net, TTY: initialize tty->driver_data before usage Commit `9c650ffc` ("TTY: ircomm_tty, add tty install") split _open() to _install() and _open(). It also moved the initialization of driver_data out of open(), but never added it to install() - causing a NULL ptr deref whenever the driver was used. Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Acked-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-10-05 09:26:01 -07:00
Gao feng	6825a26c2d	ipv6: release reference of ip6_null_entry's dst entry in __ip6_del_rt as we hold dst_entry before we call __ip6_del_rt, so we should alse call dst_release not only return -ENOENT when the rt6_info is ip6_null_entry. and we already hold the dst entry, so I think it's safe to call dst_release out of the write-read lock. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-04 16:00:07 -04:00
Dave Jones	32418cfe49	Remove noisy printks from llcp_sock_connect Validation of userspace input shouldn't trigger dmesg spamming. Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-04 15:58:47 -04:00
Erik Hugne	e57edf6b6d	tipc: prevent dropped connections due to rcvbuf overflow When large buffers are sent over connected TIPC sockets, it is likely that the sk_backlog will be filled up on the receiver side, but the TIPC flow control mechanism is happily unaware of this since that is based on message count. The sender will receive a TIPC_ERR_OVERLOAD message when this occurs and drop it's side of the connection, leaving it stale on the receiver end. By increasing the sk_rcvbuf to a 'worst case' value, we avoid the overload caused by a full backlog queue and the flow control will work properly. This worst case value is the max TIPC message size times the flow control window, multiplied by two because a sender will transmit up to double the window size before a port is marked congested. We multiply this by 2 to account for the sk_buff and other overheads. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-04 15:53:48 -04:00
Dave Jones	096895818c	silence some noisy printks in irda Fuzzing causes these printks to spew constantly. Changing them to DEBUG statements is consistent with other usage in the file, and makes them disappear when CONFIG_IRDA_DEBUG is disabled. Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-04 15:53:48 -04:00
Nicolas Dichtel	edfee0339e	sctp: check src addr when processing SACK to update transport state Suppose we have an SCTP connection with two paths. After connection is established, path1 is not available, thus this path is marked as inactive. Then traffic goes through path2, but for some reasons packets are delayed (after rto.max). Because packets are delayed, the retransmit mechanism will switch again to path1. At this time, we receive a delayed SACK from path2. When we update the state of the path in sctp_check_transmitted(), we do not take into account the source address of the SACK, hence we update the wrong path. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-04 15:53:48 -04:00
Nicolas Dichtel	575659936f	sctp: fix a typo in prototype of __sctp_rcv_lookup() Just to avoid confusion when people only reads this prototype. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-04 15:53:48 -04:00
Eric Dumazet	f4ef85bbda	ipv4: add a fib_type to fib_info commit `d2d68ba9fe` (ipv4: Cache input routes in fib_info nexthops.) introduced a regression for forwarding. This was hard to reproduce but the symptom was that packets were delivered to local host instead of being forwarded. David suggested to add fib_type to fib_info so that we dont inadvertently share same fib_info for different purposes. With help from Julian Anastasov who provided very helpful hints, reproduced here : <quote> Can it be a problem related to fib_info reuse from different routes. For example, when local IP address is created for subnet we have: broadcast 192.168.0.255 dev DEV proto kernel scope link src 192.168.0.1 192.168.0.0/24 dev DEV proto kernel scope link src 192.168.0.1 local 192.168.0.1 dev DEV proto kernel scope host src 192.168.0.1 The "dev DEV proto kernel scope link src 192.168.0.1" is a reused fib_info structure where we put cached routes. The result can be same fib_info for 192.168.0.255 and 192.168.0.0/24. RTN_BROADCAST is cached only for input routes. Incoming broadcast to 192.168.0.255 can be cached and can cause problems for traffic forwarded to 192.168.0.0/24. So, this patch should solve the problem because it separates the broadcast from unicast traffic. And the ip_route_input_slow caching will work for local and broadcast input routes (above routes 1 and 3) just because they differ in scope and use different fib_info. </quote> Many thanks to Chris Clayton for his patience and help. Reported-by: Chris Clayton <chris2553@googlemail.com> Bisected-by: Chris Clayton <chris2553@googlemail.com> Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Julian Anastasov <ja@ssi.bg> Tested-by: Chris Clayton <chris2553@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-04 13:58:26 -04:00
Linus Torvalds	aab174f0df	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs update from Al Viro: - big one - consolidation of descriptor-related logics; almost all of that is moved to fs/file.c (BTW, I'm seriously tempted to rename the result to fd.c. As it is, we have a situation when file_table.c is about handling of struct file and file.c is about handling of descriptor tables; the reasons are historical - file_table.c used to be about a static array of struct file we used to have way back). A lot of stray ends got cleaned up and converted to saner primitives, disgusting mess in android/binder.c is still disgusting, but at least doesn't poke so much in descriptor table guts anymore. A bunch of relatively minor races got fixed in process, plus an ext4 struct file leak. - related thing - fget_light() partially unuglified; see fdget() in there (and yes, it generates the code as good as we used to have). - also related - bits of Cyrill's procfs stuff that got entangled into that work; _not_ all of it, just the initial move to fs/proc/fd.c and switch of fdinfo to seq_file. - Alex's fs/coredump.c spiltoff - the same story, had been easier to take that commit than mess with conflicts. The rest is a separate pile, this was just a mechanical code movement. - a few misc patches all over the place. Not all for this cycle, there'll be more (and quite a few currently sit in akpm's tree)." Fix up trivial conflicts in the android binder driver, and some fairly simple conflicts due to two different changes to the sock_alloc_file() interface ("take descriptor handling from sock_alloc_file() to callers" vs "net: Providing protocol type via system.sockprotoname xattr of /proc/PID/fd entries" adding a dentry name to the socket) * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (72 commits) MAX_LFS_FILESIZE should be a loff_t compat: fs: Generic compat_sys_sendfile implementation fs: push rcu_barrier() from deactivate_locked_super() to filesystems btrfs: reada_extent doesn't need kref for refcount coredump: move core dump functionality into its own file coredump: prevent double-free on an error path in core dumper usb/gadget: fix misannotations fcntl: fix misannotations ceph: don't abuse d_delete() on failure exits hypfs: ->d_parent is never NULL or negative vfs: delete surplus inode NULL check switch simple cases of fget_light to fdget new helpers: fdget()/fdput() switch o2hb_region_dev_write() to fget_light() proc_map_files_readdir(): don't bother with grabbing files make get_file() return its argument vhost_set_vring(): turn pollstart/pollstop into bool switch prctl_set_mm_exe_file() to fget_light() switch xfs_find_handle() to fget_light() switch xfs_swapext() to fget_light() ...	2012-10-02 20:25:04 -07:00
Antonio Quartulli	5316cf9a51	8021q: fix mac_len recomputation in vlan_untag() skb_reset_mac_len() relies on the value of the skb->network_header pointer, therefore we must wait for such pointer to be recalculated before computing the new mac_len value. Signed-off-by: Antonio Quartulli <ordex@autistici.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-02 22:45:57 -04:00
Nicolas Dichtel	62b54dd915	ipv6: don't add link local route when there is no link local address When an address is added on loopback (ip -6 a a 2002::1/128 dev lo), a route to fe80::/64 is added in the main table: unreachable fe80::/64 dev lo proto kernel metric 256 error -101 This route does not match any prefix (no fe80:: address on lo). In fact, addrconf_dev_config() will not add link local address because this function filters interfaces by type. If the link local address is added manually, the route to the link local prefix will be automatically added by addrconf_add_linklocal(). Note also, that this route is not deleted when the address is removed. After looking at the code, it seems that addrconf_add_lroute() is redundant with addrconf_add_linklocal(), because this function will add the link local route when the link local address is configured. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-02 22:36:23 -04:00
Linus Torvalds	916082b073	workqueue: avoid using deprecated functions The network merge brought in a few users of functions that got deprecated by the workqueue cleanups: the 'system_nrt_wq' is now the same as the regular system_wq, since all workqueues are now non- reentrant. Similarly, remove one use of flush_work_sync() - the regular flush_work() has become synchronous, and the "_sync()" version is thus deprecated as being superfluous. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-10-02 16:01:31 -07:00
Linus Torvalds	aecdc33e11	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking changes from David Miller: 1) GRE now works over ipv6, from Dmitry Kozlov. 2) Make SCTP more network namespace aware, from Eric Biederman. 3) TEAM driver now works with non-ethernet devices, from Jiri Pirko. 4) Make openvswitch network namespace aware, from Pravin B Shelar. 5) IPV6 NAT implementation, from Patrick McHardy. 6) Server side support for TCP Fast Open, from Jerry Chu and others. 7) Packet BPF filter supports MOD and XOR, from Eric Dumazet and Daniel Borkmann. 8) Increate the loopback default MTU to 64K, from Eric Dumazet. 9) Use a per-task rather than per-socket page fragment allocator for outgoing networking traffic. This benefits processes that have very many mostly idle sockets, which is quite common. From Eric Dumazet. 10) Use up to 32K for page fragment allocations, with fallbacks to smaller sizes when higher order page allocations fail. Benefits are a) less segments for driver to process b) less calls to page allocator c) less waste of space. From Eric Dumazet. 11) Allow GRO to be used on GRE tunnels, from Eric Dumazet. 12) VXLAN device driver, one way to handle VLAN issues such as the limitation of 4096 VLAN IDs yet still have some level of isolation. From Stephen Hemminger. 13) As usual there is a large boatload of driver changes, with the scale perhaps tilted towards the wireless side this time around. Fix up various fairly trivial conflicts, mostly caused by the user namespace changes. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1012 commits) hyperv: Add buffer for extended info after the RNDIS response message. hyperv: Report actual status in receive completion packet hyperv: Remove extra allocated space for recv_pkt_list elements hyperv: Fix page buffer handling in rndis_filter_send_request() hyperv: Fix the missing return value in rndis_filter_set_packet_filter() hyperv: Fix the max_xfer_size in RNDIS initialization vxlan: put UDP socket in correct namespace vxlan: Depend on CONFIG_INET sfc: Fix the reported priorities of different filter types sfc: Remove EFX_FILTER_FLAG_RX_OVERRIDE_IP sfc: Fix loopback self-test with separate_tx_channels=1 sfc: Fix MCDI structure field lookup sfc: Add parentheses around use of bitfield macro arguments sfc: Fix null function pointer in efx_sriov_channel_type vxlan: virtual extensible lan igmp: export symbol ip_mc_leave_group netlink: add attributes to fdb interface tg3: unconditionally select HWMON support when tg3 is enabled. Revert "net: ti cpsw ethernet: allow reading phy interface mode from DT" gre: fix sparse warning ...	2012-10-02 13:38:27 -07:00
Linus Torvalds	437589a74b	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull user namespace changes from Eric Biederman: "This is a mostly modest set of changes to enable basic user namespace support. This allows the code to code to compile with user namespaces enabled and removes the assumption there is only the initial user namespace. Everything is converted except for the most complex of the filesystems: autofs4, 9p, afs, ceph, cifs, coda, fuse, gfs2, ncpfs, nfs, ocfs2 and xfs as those patches need a bit more review. The strategy is to push kuid_t and kgid_t values are far down into subsystems and filesystems as reasonable. Leaving the make_kuid and from_kuid operations to happen at the edge of userspace, as the values come off the disk, and as the values come in from the network. Letting compile type incompatible compile errors (present when user namespaces are enabled) guide me to find the issues. The most tricky areas have been the places where we had an implicit union of uid and gid values and were storing them in an unsigned int. Those places were converted into explicit unions. I made certain to handle those places with simple trivial patches. Out of that work I discovered we have generic interfaces for storing quota by projid. I had never heard of the project identifiers before. Adding full user namespace support for project identifiers accounts for most of the code size growth in my git tree. Ultimately there will be work to relax privlige checks from "capable(FOO)" to "ns_capable(user_ns, FOO)" where it is safe allowing root in a user names to do those things that today we only forbid to non-root users because it will confuse suid root applications. While I was pushing kuid_t and kgid_t changes deep into the audit code I made a few other cleanups. I capitalized on the fact we process netlink messages in the context of the message sender. I removed usage of NETLINK_CRED, and started directly using current->tty. Some of these patches have also made it into maintainer trees, with no problems from identical code from different trees showing up in linux-next. After reading through all of this code I feel like I might be able to win a game of kernel trivial pursuit." Fix up some fairly trivial conflicts in netfilter uid/git logging code. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (107 commits) userns: Convert the ufs filesystem to use kuid/kgid where appropriate userns: Convert the udf filesystem to use kuid/kgid where appropriate userns: Convert ubifs to use kuid/kgid userns: Convert squashfs to use kuid/kgid where appropriate userns: Convert reiserfs to use kuid and kgid where appropriate userns: Convert jfs to use kuid/kgid where appropriate userns: Convert jffs2 to use kuid and kgid where appropriate userns: Convert hpfs to use kuid and kgid where appropriate userns: Convert btrfs to use kuid/kgid where appropriate userns: Convert bfs to use kuid/kgid where appropriate userns: Convert affs to use kuid/kgid wherwe appropriate userns: On alpha modify linux_to_osf_stat to use convert from kuids and kgids userns: On ia64 deal with current_uid and current_gid being kuid and kgid userns: On ppc convert current_uid from a kuid before printing. userns: Convert s390 getting uid and gid system calls to use kuid and kgid userns: Convert s390 hypfs to use kuid and kgid where appropriate userns: Convert binder ipc to use kuids userns: Teach security_path_chown to take kuids and kgids userns: Add user namespace support to IMA userns: Convert EVM to deal with kuids and kgids in it's hmac computation ...	2012-10-02 11:11:09 -07:00
Linus Torvalds	68d47a137c	Merge branch 'for-3.7-hierarchy' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup hierarchy update from Tejun Heo: "Currently, different cgroup subsystems handle nested cgroups completely differently. There's no consistency among subsystems and the behaviors often are outright broken. People at least seem to agree that the broken hierarhcy behaviors need to be weeded out if any progress is gonna be made on this front and that the fallouts from deprecating the broken behaviors should be acceptable especially given that the current behaviors don't make much sense when nested. This patch makes cgroup emit warning messages if cgroups for subsystems with broken hierarchy behavior are nested to prepare for fixing them in the future. This was put in a separate branch because more related changes were expected (didn't make it this round) and the memory cgroup wanted to pull in this and make changes on top." * 'for-3.7-hierarchy' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: mark subsystems with broken hierarchy support and whine if cgroups are nested for them	2012-10-02 10:52:28 -07:00
Linus Torvalds	c0e8a139a5	Merge branch 'for-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: - xattr support added. The implementation is shared with tmpfs. The usage is restricted and intended to be used to manage per-cgroup metadata by system software. tmpfs changes are routed through this branch with Hugh's permission. - cgroup subsystem ID handling simplified. * 'for-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: Define CGROUP_SUBSYS_COUNT according the configuration cgroup: Assign subsystem IDs during compile time cgroup: Do not depend on a given order when populating the subsys array cgroup: Wrap subsystem selection macro cgroup: Remove CGROUP_BUILTIN_SUBSYS_COUNT cgroup: net_prio: Do not define task_netpioidx() when not selected cgroup: net_cls: Do not define task_cls_classid() when not selected cgroup: net_cls: Move sock_update_classid() declaration to cls_cgroup.h cgroup: trivial fixes for Documentation/cgroups/cgroups.txt xattr: mark variable as uninitialized to make both gcc and smatch happy fs: add missing documentation to simple_xattr functions cgroup: add documentation on extended attributes usage cgroup: rename subsys_bits to subsys_mask cgroup: add xattr support cgroup: revise how we re-populate root directory xattr: extract simple_xattr code from tmpfs	2012-10-02 10:50:47 -07:00
Linus Torvalds	033d9959ed	Merge branch 'for-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq Pull workqueue changes from Tejun Heo: "This is workqueue updates for v3.7-rc1. A lot of activities this round including considerable API and behavior cleanups. * delayed_work combines a timer and a work item. The handling of the timer part has always been a bit clunky leading to confusing cancelation API with weird corner-case behaviors. delayed_work is updated to use new IRQ safe timer and cancelation now works as expected. * Another deficiency of delayed_work was lack of the counterpart of mod_timer() which led to cancel+queue combinations or open-coded timer+work usages. mod_delayed_work[_on]() are added. These two delayed_work changes make delayed_work provide interface and behave like timer which is executed with process context. * A work item could be executed concurrently on multiple CPUs, which is rather unintuitive and made flush_work() behavior confusing and half-broken under certain circumstances. This problem doesn't exist for non-reentrant workqueues. While non-reentrancy check isn't free, the overhead is incurred only when a work item bounces across different CPUs and even in simulated pathological scenario the overhead isn't too high. All workqueues are made non-reentrant. This removes the distinction between flush_[delayed_]work() and flush_[delayed_]_work_sync(). The former is now as strong as the latter and the specified work item is guaranteed to have finished execution of any previous queueing on return. * In addition to the various bug fixes, Lai redid and simplified CPU hotplug handling significantly. * Joonsoo introduced system_highpri_wq and used it during CPU hotplug. There are two merge commits - one to pull in IRQ safe timer from tip/timers/core and the other to pull in CPU hotplug fixes from wq/for-3.6-fixes as Lai's hotplug restructuring depended on them." Fixed a number of trivial conflicts, but the more interesting conflicts were silent ones where the deprecated interfaces had been used by new code in the merge window, and thus didn't cause any real data conflicts. Tejun pointed out a few of them, I fixed a couple more. * 'for-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (46 commits) workqueue: remove spurious WARN_ON_ONCE(in_irq()) from try_to_grab_pending() workqueue: use cwq_set_max_active() helper for workqueue_set_max_active() workqueue: introduce cwq_set_max_active() helper for thaw_workqueues() workqueue: remove @delayed from cwq_dec_nr_in_flight() workqueue: fix possible stall on try_to_grab_pending() of a delayed work item workqueue: use hotcpu_notifier() for workqueue_cpu_down_callback() workqueue: use __cpuinit instead of __devinit for cpu callbacks workqueue: rename manager_mutex to assoc_mutex workqueue: WORKER_REBIND is no longer necessary for idle rebinding workqueue: WORKER_REBIND is no longer necessary for busy rebinding workqueue: reimplement idle worker rebinding workqueue: deprecate __cancel_delayed_work() workqueue: reimplement cancel_delayed_work() using try_to_grab_pending() workqueue: use mod_delayed_work() instead of __cancel + queue workqueue: use irqsafe timer for delayed_work workqueue: clean up delayed_work initializers and add missing one workqueue: make deferrable delayed_work initializer names consistent workqueue: cosmetic whitespace updates for macro definitions workqueue: deprecate system_nrt[_freezable]_wq workqueue: deprecate flush[_delayed]_work_sync() ...	2012-10-02 09:54:49 -07:00
stephen hemminger	193ba92452	igmp: export symbol ip_mc_leave_group Needed for VXLAN. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-01 18:39:44 -04:00
stephen hemminger	edc7d57327	netlink: add attributes to fdb interface Later changes need to be able to refer to neighbour attributes when doing fdb_add. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-01 18:39:44 -04:00
Chuck Lever	ba9b584c1d	SUNRPC: Introduce rpc_clone_client_set_auth() An ULP is supposed to be able to replace a GSS rpc_auth object with another GSS rpc_auth object using rpcauth_create(). However, rpcauth_create() in 3.5 reliably fails with -EEXIST in this case. This is because when gss_create() attempts to create the upcall pipes, sometimes they are already there. For example if a pipe FS mount event occurs, or a previous GSS flavor was in use for this rpc_clnt. It turns out that's not the only problem here. While working on a fix for the above problem, we noticed that replacing an rpc_clnt's rpc_auth is not safe, since dereferencing the cl_auth field is not protected in any way. So we're deprecating the ability of rpcauth_create() to switch an rpc_clnt's security flavor during normal operation. Instead, let's add a fresh API that clones an rpc_clnt and gives the clone a new flavor before it's used. This makes immediate use of the new __rpc_clone_client() helper. This can be used in a similar fashion to rpcauth_create() when a client is hunting for the correct security flavor. Instead of replacing an rpc_clnt's security flavor in a loop, the ULP replaces the whole rpc_clnt. To fix the -EEXIST problem, any ULP logic that relies on replacing an rpc_clnt's rpc_auth with rpcauth_create() must be changed to use this API instead. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-10-01 15:33:33 -07:00
Chuck Lever	1b63a75180	SUNRPC: Refactor rpc_clone_client() rpc_clone_client() does most of the same tasks as rpc_new_client(), so there is an opportunity for code re-use. Create a generic helper that makes it easy to clone an RPC client while replacing any of the clnt's parameters. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-10-01 15:32:07 -07:00
Chuck Lever	632f0d0503	SUNRPC: Use __func__ in dprintk() in auth_gss.c Clean up: Some function names have changed, but debugging messages were never updated. Automate the construction of the function name in debugging messages. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-10-01 15:32:02 -07:00
Chuck Lever	d8af9bc16c	SUNRPC: Clean up dprintk messages in rpc_pipe.c Clean up: The blank space in front of the message must be spaces. Tabs show up on the console as a graphical character. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-10-01 15:31:57 -07:00
Sage Weil	6816282dab	ceph: propagate layout error on osd request creation If we are creating an osd request and get an invalid layout, return an EINVAL to the caller. We switch up the return to have an error code instead of NULL implying -ENOMEM. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com>	2012-10-01 17:20:00 -05:00
Sage Weil	d63b77f4c5	libceph: check for invalid mapping If we encounter an invalid (e.g., zeroed) mapping, return an error and avoid a divide by zero. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com>	2012-10-01 17:20:00 -05:00
stephen hemminger	9fbef059d6	gre: fix sparse warning Use be16 consistently when looking at flags. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-01 17:35:31 -04:00
Dan Carpenter	f674e72ff1	net/key/af_key.c: add range checks on ->sadb_x_policy_len Because sizeof() is size_t then if "len" is negative, it counts as a large positive value. The call tree looks like: pfkey_sendmsg() -> pfkey_process() -> pfkey_spdadd() -> parse_ipsecrequests() Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-01 17:15:06 -04:00
Eric Dumazet	60769a5dcd	ipv4: gre: add GRO capability Add GRO capability to IPv4 GRE tunnels, using the gro_cells infrastructure. Tested using IPv4 and IPv6 TCP traffic inside this tunnel, and checking GRO is building large packets. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-01 17:01:57 -04:00
Eric Dumazet	c9e6bc644e	net: add gro_cells infrastructure This adds a new include file (include/net/gro_cells.h), to bring GRO (Generic Receive Offload) capability to tunnels, in a modular way. Because tunnels receive path is lockless, and GRO adds a serialization using a napi_struct, I chose to add an array of up to DEFAULT_MAX_NUM_RSS_QUEUES cells, so that multi queue devices wont be slowed down because of GRO layer. skb_get_rx_queue() is used as selector. In the future, we might add optional fanout capabilities, using rxhash for example. With help from Ben Hutchings who reminded me netif_get_num_default_rss_queues() function. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-01 17:01:46 -04:00
Eric Dumazet	861b650101	tcp: gro: add checksuming helpers skb with CHECKSUM_NONE cant currently be handled by GRO, and we notice this deep in GRO stack in tcp[46]_gro_receive() But there are cases where GRO can be a benefit, even with a lack of checksums. This preliminary work is needed to add GRO support to tunnels. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-01 17:00:27 -04:00
Nicolas Dichtel	64c6d08e64	ipv6: del unreachable route when an addr is deleted on lo When an address is added on loopback (ip -6 a a 2002::1/128 dev lo), two routes are added: - one in the local table: local 2002::1 via :: dev lo proto none metric 0 - one the in main table (for the prefix): unreachable 2002::1 dev lo proto kernel metric 256 error -101 When the address is deleted, the route inserted in the main table remains because we use rt6_lookup(), which returns NULL when dst->error is set, which is the case here! Thus, it is better to use ip6_route_lookup() to avoid this kind of filter. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-01 16:49:23 -04:00
Weiping Pan	f4b549a5ac	use skb_end_offset() in skb_try_coalesce() Commit ec47ea824774(skb: Add inline helper for getting the skb end offset from head) introduces this helper function, skb_end_offset(), we should make use of it. Signed-off-by: Weiping Pan <wpan@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-01 16:43:17 -04:00
Wei Yongjun	cc4829e596	ceph: use list_move_tail instead of list_del/list_add_tail Using list_move_tail() instead of list_del() + list_add_tail(). Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Sage Weil <sage@inktank.com>	2012-10-01 14:30:49 -05:00
Iulius Curt	7698f2f5e0	libceph: Fix sparse warning Make ceph_monc_do_poolop() static to remove the following sparse warning: * net/ceph/mon_client.c:616:5: warning: symbol 'ceph_monc_do_poolop' was not declared. Should it be static? Also drops the 'ceph_monc_' prefix, now being a private function. Signed-off-by: Iulius Curt <icurt@ixiacom.com> Signed-off-by: Sage Weil <sage@inktank.com>	2012-10-01 14:30:49 -05:00
Sage Weil	290e33593d	libceph: remove unused monc->have_fsid This is unused; use monc->client->have_fsid. Signed-off-by: Sage Weil <sage@inktank.com>	2012-10-01 14:30:49 -05:00
Linus Torvalds	3498d13b80	TTY merge for 3.7-rc1 As we skipped the merge window for 3.6-rc1 for the tty tree, everything is now settled down and working properly, so we are ready for 3.7-rc1. Here's the patchset, it's big, but the large changes are removing a firmware file and adding a staging tty driver (it depended on the tty core changes, so it's going through this tree instead of the staging tree.) All of these patches have been in the linux-next tree for a while. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (GNU/Linux) iEYEABECAAYFAlBp36oACgkQMUfUDdst+yk4WgCdEy13hot8fI2Lqnc7W0LKu7GX 4p8AoLTjzrXhLosxdijskDQ9X1OtjrxU =S5Ng -----END PGP SIGNATURE----- Merge tag 'tty-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull TTY changes from Greg Kroah-Hartman: "As we skipped the merge window for 3.6-rc1 for the tty tree, everything is now settled down and working properly, so we are ready for 3.7-rc1. Here's the patchset, it's big, but the large changes are removing a firmware file and adding a staging tty driver (it depended on the tty core changes, so it's going through this tree instead of the staging tree.) All of these patches have been in the linux-next tree for a while. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>" Fix up more-or-less trivial conflicts in - drivers/char/pcmcia/synclink_cs.c: tty NULL dereference fix vs tty_port_cts_enabled() helper function - drivers/staging/{Kconfig,Makefile}: add-add conflict (dgrp driver added close to other staging drivers) - drivers/staging/ipack/devices/ipoctal.c: "split ipoctal_channel from iopctal" vs "TTY: use tty_port_register_device" * tag 'tty-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (235 commits) tty/serial: Add kgdb_nmi driver tty/serial/amba-pl011: Quiesce interrupts in poll_get_char tty/serial/amba-pl011: Implement poll_init callback tty/serial/core: Introduce poll_init callback kdb: Turn KGDB_KDB=n stubs into static inlines kdb: Implement disable_nmi command kernel/debug: Mask KGDB NMI upon entry serial: pl011: handle corruption at high clock speeds serial: sccnxp: Make 'default' choice in switch last serial: sccnxp: Remove mask termios caps for SW flow control serial: sccnxp: Report actual baudrate back to core serial: samsung: Add poll_get_char & poll_put_char Powerpc 8xx CPM_UART setting MAXIDL register proportionaly to baud rate Powerpc 8xx CPM_UART maxidl should not depend on fifo size Powerpc 8xx CPM_UART too many interrupts Powerpc 8xx CPM_UART desynchronisation serial: set correct baud_base for EXSYS EX-41092 Dual 16950 serial: omap: fix the reciever line error case 8250: blacklist Winbond CIR port 8250_pnp: do pnp probe before legacy probe ...	2012-10-01 12:26:52 -07:00
Linus Torvalds	06d2fe153b	Driver core merge for 3.7-rc1 Here is the big driver core update for 3.7-rc1. A number of firmware_class.c updates (as you saw a month or so ago), and some hyper-v updates and some printk fixes as well. All patches that are outside of the drivers/base area have been acked by the respective maintainers, and have all been in the linux-next tree for a while. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (GNU/Linux) iEYEABECAAYFAlBp3vkACgkQMUfUDdst+ylQoACgldktGFgkCLzH+rGYthrXOC5P 9hUAnjmOhdoHlMTL81vWTlH+BrGernym =khrr -----END PGP SIGNATURE----- Merge tag 'driver-core-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core merge from Greg Kroah-Hartman: "Here is the big driver core update for 3.7-rc1. A number of firmware_class.c updates (as you saw a month or so ago), and some hyper-v updates and some printk fixes as well. All patches that are outside of the drivers/base area have been acked by the respective maintainers, and have all been in the linux-next tree for a while. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>" * tag 'driver-core-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (95 commits) memory: tegra{20,30}-mc: Fix reading incorrect register in mc_readl() device.h: Add missing inline to #ifndef CONFIG_PRINTK dev_vprintk_emit memory: emif: Add ifdef CONFIG_DEBUG_FS guard for emif_debugfs_[init\|exit] Documentation: Fixes some translation error in Documentation/zh_CN/gpio.txt Documentation: Remove 3 byte redundant code at the head of the Documentation/zh_CN/arm/booting Documentation: Chinese translation of Documentation/video4linux/omap3isp.txt device and dynamic_debug: Use dev_vprintk_emit and dev_printk_emit dev: Add dev_vprintk_emit and dev_printk_emit netdev_printk/netif_printk: Remove a superfluous logging colon netdev_printk/dynamic_netdev_dbg: Directly call printk_emit dev_dbg/dynamic_debug: Update to use printk_emit, optimize stack driver-core: Shut up dev_dbg_reatelimited() without DEBUG tools/hv: Parse /etc/os-release tools/hv: Check for read/write errors tools/hv: Fix exit() error code tools/hv: Fix file handle leak Tools: hv: Implement the KVP verb - KVP_OP_GET_IP_INFO Tools: hv: Rename the function kvp_get_ip_address() Tools: hv: Implement the KVP verb - KVP_OP_SET_IP_INFO Tools: hv: Add an example script to configure an interface ...	2012-10-01 12:10:44 -07:00
Linus Torvalds	99dbb1632f	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull the trivial tree from Jiri Kosina: "Tiny usual fixes all over the place" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (34 commits) doc: fix old config name of kprobetrace fs/fs-writeback.c: cleanup riteback_sb_inodes kerneldoc btrfs: fix the commment for the action flags in delayed-ref.h btrfs: fix trivial typo for the comment of BTRFS_FREE_INO_OBJECTID vfs: fix kerneldoc for generic_fh_to_parent() treewide: fix comment/printk/variable typos ipr: fix small coding style issues doc: fix broken utf8 encoding nfs: comment fix platform/x86: fix asus_laptop.wled_type module parameter mfd: printk/comment fixes doc: getdelays.c: remember to close() socket on error in create_nl_socket() doc: aliasing-test: close fd on write error mmc: fix comment typos dma: fix comments spi: fix comment/printk typos in spi Coccinelle: fix typo in memdup_user.cocci tmiofb: missing NULL pointer checks tools: perf: Fix typo in tools/perf tools/testing: fix comment / output typos ...	2012-10-01 09:06:36 -07:00
Jouni Malinen	33766368f6	mac80211: Fix FC masking in BIP AAD generation The bits used in the mask were off-by-one and ended up masking PwrMgt, MoreData, Protected fields instead of Retry, PwrMgt, MoreData. Fix this and to mask the correct fields. While doing so, convert the code to mask the full FC using IEEE80211_FCTL_* defines similarly to how CCMP AAD is built. Since BIP is used only with broadcast/multicast management frames, the Retry field is always 0 in these frames. The Protected field is also zero to maintain backwards compatibility. As such, the incorrect mask here does not really cause any problems for valid frames. In theory, an invalid BIP frame with Retry or Protected field set to 1 could be rejected because of BIP validation. However, no such frame should show up with standard compliant implementations, so this does not cause problems in normal BIP use. Signed-off-by: Jouni Malinen <j@w1.fi> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-10-01 09:23:15 +02:00
David S. Miller	a248afdc1b	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next John W. Linville says: ==================== Here is another batch of updates intended for 3.7... Highlights include an hci_connect re-write in Bluetooth, HCI/LLC layer separation in NFC, removal of the raw pn544 NFC driver, NFC LLCP raw sockets support, improved IBSS auth frame handling in mac80211, full-MAC AP mode notification support in mac80211, a lot of attention paid to brcmfmac, and the usual level of updates to iwlwifi, ath9k, mwifiex, and rt2x00, and various other updates. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-30 02:30:16 -04:00
Trond Myklebust	9b96ce7197	SUNRPC: Limit the rpciod workqueue concurrency We shouldn't need more than 1 worker thread per cpu, since rpciod is designed to run without sleeping in most cases. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-09-28 20:24:16 -04:00
Lin Ming	188c517a05	ipv6: return errno pointers consistently for fib6_add_1() fib6_add_1() should consistently return errno pointers, rather than a mixture of NULL and errno pointers. Signed-off-by: Lin Ming <mlin@ss.pku.edu.cn> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-28 18:48:28 -04:00
Trond Myklebust	d19751e7b9	SUNRPC: Get rid of the redundant xprt->shutdown bit field It is only set after everyone has dereferenced the transport, and serves no useful purpose: setting it is racy, so all the socket code, etc still needs to be able to cope with the cases where they miss reading it. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-09-28 16:03:05 -04:00
Trond Myklebust	a11a2bf4de	SUNRPC: Optimise away unnecessary data moves in xdr_align_pages We only have to call xdr_shrink_pagelen() if the remaining RPC message does not fit in the page buffer length that we supplied to xdr_align_pages(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-09-28 15:58:42 -04:00
David S. Miller	6a06e5e1bb	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/team/team.c drivers/net/usb/qmi_wwan.c net/batman-adv/bat_iv_ogm.c net/ipv4/fib_frontend.c net/ipv4/route.c net/l2tp/l2tp_netlink.c The team, fib_frontend, route, and l2tp_netlink conflicts were simply overlapping changes. qmi_wwan and bat_iv_ogm were of the "use HEAD" variety. With help from Antonio Quartulli. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-28 14:40:49 -04:00
John W. Linville	c487606f83	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem Conflicts: net/nfc/netlink.c Signed-off-by: John W. Linville <linville@tuxdriver.com>	2012-09-28 11:11:16 -04:00
Jesper Dangaard Brouer	92eec78d25	ipvs: SIP fragment handling Use the nfct_reasm SKB if available. Based on part of a patch from: Hans Schillstrom I have left Hans'es comment in the patch (marked /HS) Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Julian Anastasov <ja@ssi.bg> [ horms@verge.net.au: Fix comment style ] Signed-off-by: Simon Horman <horms@verge.net.au>	2012-09-28 11:37:16 +09:00
Jesper Dangaard Brouer	d4383f04d1	ipvs: API change to avoid rescan of IPv6 exthdr Reduce the number of times we scan/skip the IPv6 exthdrs. This patch contains a lot of API changes. This is done, to avoid repeating the scan of finding the IPv6 headers, via ipv6_find_hdr(), which is called by ip_vs_fill_iph_skb(). Finding the IPv6 headers is done as early as possible, and passed on as a pointer "struct ip_vs_iphdr " to the affected functions. This patch reduce/removes 19 calls to ip_vs_fill_iph_skb(). Notice, I have choosen, not to change the API of function pointer "(schedule)" (in struct ip_vs_scheduler) as it can be used by external schedulers, via {un,}register_ip_vs_scheduler. Only 4 out of 10 schedulers use info from ip_vs_iphdr*, and when they do, they are only interested in iph->{s,d}addr. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2012-09-28 11:34:33 +09:00
Jesper Dangaard Brouer	2f74713d14	ipvs: Complete IPv6 fragment handling for IPVS IPVS now supports fragmented packets, with support from nf_conntrack_reasm.c Based on patch from: Hans Schillstrom. IPVS do like conntrack i.e. use the skb->nfct_reasm (i.e. when all fragments is collected, nf_ct_frag6_output() starts a "re-play" of all fragments into the interrupted PREROUTING chain at prio -399 (NF_IP6_PRI_CONNTRACK_DEFRAG+1) with nfct_reasm pointing to the assembled packet.) Notice, module nf_defrag_ipv6 must be loaded for this to work. Report unhandled fragments, and recommend user to load nf_defrag_ipv6. To handle fw-mark for fragments. Add a new IPVS hook into prerouting chain at prio -99 (NF_IP6_PRI_NAT_DST+1) to catch fragments, and copy fw-mark info from the first packet with an upper layer header. IPv6 fragment handling should be the last thing on the IPVS IPv6 missing support list. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Hans Schillstrom <hans@schillstrom.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2012-09-28 11:34:24 +09:00
Jesper Dangaard Brouer	63dca2c0b0	ipvs: Fix faulty IPv6 extension header handling in IPVS IPv6 packets can contain extension headers, thus its wrong to assume that the transport/upper-layer header, starts right after (struct ipv6hdr) the IPv6 header. IPVS uses this false assumption, and will write SNAT & DNAT modifications at a fixed pos which will corrupt the message. To fix this, proper header position must be found before modifying packets. Introducing ip_vs_fill_iph_skb(), which uses ipv6_find_hdr() to skip the exthdrs. It finds (1) the transport header offset, (2) the protocol, and (3) detects if the packet is a fragment. Note, that fragments in IPv6 is represented via an exthdr. Thus, this is detected while skipping through the exthdrs. This patch depends on commit `84018f55a`: "netfilter: ip6_tables: add flags parameter to ipv6_find_hdr()" This also adds a dependency to ip6_tables. Originally based on patch from: Hans Schillstrom kABI notes: Changing struct ip_vs_iphdr is a potential minor kABI breaker, because external modules can be compiled with another version of this struct. This should not matter, as they would most-likely be using a compiled-in version of ip_vs_fill_iphdr(). When recompiled, they will notice ip_vs_fill_iphdr() no longer exists, and they have to used ip_vs_fill_iph_skb() instead. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2012-09-28 11:34:15 +09:00
Jesper Dangaard Brouer	2fab8917f4	ipvs: IPv6 extend ICMPv6 handling for future types Extend handling of ICMPv6, to all none Informational Messages (via ICMPV6_INFOMSG_MASK). This actually only extend our handling to type ICMPV6_PARAMPROB (Parameter Problem), and future types. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2012-09-28 11:34:00 +09:00
Jesper Dangaard Brouer	120b9c14f4	ipvs: Trivial changes, use compressed IPv6 address in output Have not converted the proc file output to compressed IPv6 addresses. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2012-09-28 11:33:52 +09:00
Eric Dumazet	69b08f62e1	net: use bigger pages in __netdev_alloc_frag We currently use percpu order-0 pages in __netdev_alloc_frag to deliver fragments used by __netdev_alloc_skb() Depending on NIC driver and arch being 32 or 64 bit, it allows a page to be split in several fragments (between 1 and 8), assuming PAGE_SIZE=4096 Switching to bigger pages (32768 bytes for PAGE_SIZE=4096 case) allows : - Better filling of space (the ending hole overhead is less an issue) - Less calls to page allocator or accesses to page->_count - Could allow struct skb_shared_info futures changes without major performance impact. This patch implements a transparent fallback to smaller pages in case of memory pressure. It also uses a standard "struct page_frag" instead of a custom one. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Alexander Duyck <alexander.h.duyck@intel.com> Cc: Benjamin LaHaise <bcrl@kvack.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-27 19:29:35 -04:00
Nicolas Dichtel	bc9259a8ba	inetpeer: fix token initialization When jiffies wraps around (for example, 5 minutes after the boot, see INITIAL_JIFFIES) and peer has just been created, now - peer->rate_last can be < XRLIM_BURST_FACTOR * timeout, so token is not set to the maximum value, thus some icmp packets can be unexpectedly dropped. Fix this case by initializing last_rate to 60 seconds in the past. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-27 19:27:39 -04:00
Christoph Paasch	5dff747b70	tcp: Remove unused parameter from tcp_v4_save_options struct sock *sk is not used inside tcp_v4_save_options. Thus it can be removed. Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-27 19:20:26 -04:00
Eric Dumazet	bcc452935d	ipv6: gre: remove ip6gre_header_parse() dev_parse_header() callers provide 8 bytes of storage, so it's not possible to store an IPv6 address. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-27 18:49:22 -04:00
Eric Dumazet	e2bcabec6e	net: remove sk_init() helper It seems sk_init() has no value today and even does strange things : # grep . /proc/sys/net/core/?mem_* /proc/sys/net/core/rmem_default:212992 /proc/sys/net/core/rmem_max:131071 /proc/sys/net/core/wmem_default:212992 /proc/sys/net/core/wmem_max:131071 We can remove it completely. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Shan Wei <davidshan@tencent.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-27 18:42:00 -04:00
David S. Miller	f54ba77988	pkt_sched: Fix warning false positives. GCC refuses to recognize that all error control flows do in fact set err to something. Add an explicit initialization to shut it up. net/sched/sch_drr.c: In function ‘drr_enqueue’: net/sched/sch_drr.c:359:11: warning: ‘err’ may be used uninitialized in this function [-Wmaybe-uninitialized] net/sched/sch_qfq.c: In function ‘qfq_enqueue’: net/sched/sch_qfq.c:885:11: warning: ‘err’ may be used uninitialized in this function [-Wmaybe-uninitialized] Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-27 18:35:47 -04:00
Konstantin Khlebnikov	4b7cc7fc26	nf_defrag_ipv6: fix oops on module unloading fix copy-paste error introduced in linux-next commit "ipv6: add a new namespace for nf_conntrack_reasm" Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: Amerigo Wang <amwang@redhat.com> Cc: David S. Miller <davem@davemloft.net> Acked-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-27 18:14:55 -04:00
stephen hemminger	eccc1bb8d4	tunnel: drop packet if ECN present with not-ECT Linux tunnels were written before RFC6040 and therefore never implemented the corner case of ECN getting set in the outer header and the inner header not being ready for it. Section 4.2. Default Tunnel Egress Behaviour. o If the inner ECN field is Not-ECT, the decapsulator MUST NOT propagate any other ECN codepoint onwards. This is because the inner Not-ECT marking is set by transports that rely on dropped packets as an indication of congestion and would not understand or respond to any other ECN codepoint [RFC4774]. Specifically: * If the inner ECN field is Not-ECT and the outer ECN field is CE, the decapsulator MUST drop the packet. * If the inner ECN field is Not-ECT and the outer ECN field is Not-ECT, ECT(0), or ECT(1), the decapsulator MUST forward the outgoing packet with the ECN field cleared to Not-ECT. This patch moves the ECN decap logic out of the individual tunnels into a common place. It also adds logging to allow detecting broken systems that set ECN bits incorrectly when tunneling (or an intermediate router might be changing the header). Overloads rx_frame_error to keep track of ECN related error. Thanks to Chris Wright who caught this while reviewing the new VXLAN tunnel. This code was tested by injecting faulty logic in other end GRE to send incorrectly encapsulated packets. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-27 18:12:37 -04:00
stephen hemminger	b0558ef24a	xfrm: remove extranous rcu_read_lock The handlers for xfrm_tunnel are always invoked with rcu read lock already. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-27 18:12:37 -04:00
stephen hemminger	0c5794a66c	gre: remove unnecessary rcu_read_lock/unlock The gre function pointers for receive and error handling are always called (from gre.c) with rcu_read_lock already held. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-27 18:12:37 -04:00
stephen hemminger	d208328765	gre: fix handling of key 0 GRE driver incorrectly uses zero as a flag value. Zero is a perfectly valid value for key, and the tunnel should match packets with no key only with tunnels created without key, and vice versa. This is a slightly visible change since previously it might be possible to construct a working tunnel that sent key 0 and received only because of the key wildcard of zero. I.e the sender sent key of zero, but tunnel was defined without key. Note: using gre key 0 requires iproute2 utilities v3.2 or later. The original utility code was broken as well. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-27 18:12:37 -04:00
Wei Yongjun	7f8436a126	l2tp: fix return value check In case of error, the function genlmsg_put() returns NULL pointer not ERR_PTR(). The IS_ERR() test in the return value check should be replaced with NULL test. dpatch engine is used to auto generate this patch. (https://github.com/weiyj/dpatch) Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-27 13:18:19 -04:00
David S. Miller	392b408782	Merge branch 'master' of git://1984.lsi.us.es/nf Pablo Neira Ayuso says: ==================== If time allows, I'd appreciate if you can take the following fix for the xt_limit match. As Jan indicates, random things may occur while using the xt_limit match due to use of uninitialized memory. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-27 13:16:14 -04:00
Szymon Janc	50b78b2a65	NFC: Fix sleeping in atomic when releasing socket nfc_llcp_socket_release is calling lock_sock/release_sock while holding write lock for rwlock. Use bh_lock/unlock_sock instead. BUG: sleeping function called from invalid context at net/core/sock.c:2138 in_atomic(): 1, irqs_disabled(): 0, pid: 56, name: kworker/1:1 4 locks held by kworker/1:1/56: Pid: 56, comm: kworker/1:1 Not tainted 3.5.0-999-nfc+ #7 Call Trace: [<ffffffff810952c5>] __might_sleep+0x145/0x200 [<ffffffff815d7686>] lock_sock_nested+0x36/0xa0 [<ffffffff81731569>] ? _raw_write_lock+0x49/0x50 [<ffffffffa04aa100>] ? nfc_llcp_socket_release+0x30/0x200 [nfc] [<ffffffffa04aa122>] nfc_llcp_socket_release+0x52/0x200 [nfc] [<ffffffffa04ab9f0>] nfc_llcp_mac_is_down+0x20/0x30 [nfc] [<ffffffffa04a6fea>] nfc_dep_link_down+0xaa/0xf0 [nfc] [<ffffffffa04a9bb5>] nfc_llcp_timeout_work+0x15/0x20 [nfc] [<ffffffff810825f7>] process_one_work+0x197/0x7c0 [<ffffffff81082596>] ? process_one_work+0x136/0x7c0 [<ffffffff8172fbc9>] ? __schedule+0x419/0x9c0 [<ffffffffa04a9ba0>] ? nfc_llcp_build_gb+0x1b0/0x1b0 [nfc] [<ffffffff81083090>] worker_thread+0x190/0x4c0 [<ffffffff81082f00>] ? rescuer_thread+0x2a0/0x2a0 [<ffffffff81088d1e>] kthread+0xae/0xc0 [<ffffffff810caafd>] ? trace_hardirqs_on+0xd/0x10 [<ffffffff8173acc4>] kernel_thread_helper+0x4/0x10 [<ffffffff81732174>] ? retint_restore_args+0x13/0x13 [<ffffffff81088c70>] ? flush_kthread_worker+0x150/0x150 [<ffffffff8173acc0>] ? gs_change+0x13/0x13 Signed-off-by: Szymon Janc <szymon.janc@tieto.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2012-09-27 10:52:22 +02:00
Szymon Janc	3c0cc8aa23	NFC: Fix sleeping in invalid context when netlink socket is closed netlink_register_notifier requires notify functions to not sleep. nfc_stop_poll locks device mutex and must not be called from notifier. Create workqueue that will handle this for all devices. BUG: sleeping function called from invalid context at kernel/mutex.c:269 in_atomic(): 0, irqs_disabled(): 0, pid: 4497, name: neard 1 lock held by neard/4497: Pid: 4497, comm: neard Not tainted 3.5.0-999-nfc+ #5 Call Trace: [<ffffffff810952c5>] __might_sleep+0x145/0x200 [<ffffffff81743dde>] mutex_lock_nested+0x2e/0x50 [<ffffffff816ffd19>] nfc_stop_poll+0x39/0xb0 [<ffffffff81700a17>] nfc_genl_rcv_nl_event+0x77/0xc0 [<ffffffff8174aa8c>] notifier_call_chain+0x5c/0x120 [<ffffffff8174abd6>] __atomic_notifier_call_chain+0x86/0x140 [<ffffffff8174ab50>] ? notifier_call_chain+0x120/0x120 [<ffffffff815e1347>] ? skb_dequeue+0x67/0x90 [<ffffffff8174aca6>] atomic_notifier_call_chain+0x16/0x20 [<ffffffff8162119a>] netlink_release+0x24a/0x280 [<ffffffff815d7aa8>] sock_release+0x28/0xa0 [<ffffffff815d7be7>] sock_close+0x17/0x30 [<ffffffff811b2a7c>] __fput+0xcc/0x250 [<ffffffff811b2c0e>] ____fput+0xe/0x10 [<ffffffff81085009>] task_work_run+0x69/0x90 [<ffffffff8101b951>] do_notify_resume+0x81/0xd0 [<ffffffff8174ef22>] int_signal+0x12/0x17 Signed-off-by: Szymon Janc <szymon.janc@tieto.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2012-09-27 10:52:17 +02:00
John W. Linville	7d777c3d95	NFC: Add dummy nfc_llc_shdlc_register definition This is used when CONFIG_NFC_SHDLC is disabled. Reported-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2012-09-27 10:48:08 +02:00
Thierry Escande	4463523bef	NFC: LLCP raw socket support This adds support for socket of type SOCK_RAW to LLCP. sk_buff are copied and sent to raw sockets with a 2 bytes extra header: The first byte header contains the nfc adapter index. The second one contains flags: - 0x01 - Direction (0=RX, 1=TX) - 0x02-0x80 - Reserved A raw socket has to be explicitly bound to a nfc adapter. This is achieved by specifying the adapter index to be bound to in the dev_idx field of the sockaddr_nfc_llcp struct passed to bind(). Signed-off-by: Thierry Escande <thierry.escande@linux.intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2012-09-27 10:47:59 +02:00
Szymon Janc	fe235b58d5	NFC: Use dynamic initialization for rwlocks If rwlock is dynamically allocated but statically initialized it is missing proper lockdep annotation. INFO: trying to register non-static key. the code is fine but needs lockdep annotation. turning off the locking correctness validator. Pid: 3352, comm: neard Not tainted 3.5.0-999-nfc+ #2 Call Trace: [<ffffffff810c8526>] __lock_acquire+0x8f6/0x1bf0 [<ffffffff81739045>] ? printk+0x4d/0x4f [<ffffffff810c9eed>] lock_acquire+0x9d/0x220 [<ffffffff81702bfe>] ? nfc_llcp_sock_from_sn+0x4e/0x160 [<ffffffff81746724>] _raw_read_lock+0x44/0x60 [<ffffffff81702bfe>] ? nfc_llcp_sock_from_sn+0x4e/0x160 [<ffffffff81702bfe>] nfc_llcp_sock_from_sn+0x4e/0x160 [<ffffffff817034a7>] nfc_llcp_get_sdp_ssap+0xa7/0x1b0 [<ffffffff81706353>] llcp_sock_bind+0x173/0x210 [<ffffffff815d9c94>] sys_bind+0xe4/0x100 [<ffffffff8139209e>] ? trace_hardirqs_on_thunk+0x3a/0x3f [<ffffffff8174ea69>] system_call_fastpath+0x16/0x1b Signed-off-by: Szymon Janc <szymon.janc@tieto.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2012-09-27 10:47:03 +02:00
Al Viro	cb0942b812	make get_file() return its argument simplifies a bunch of callers... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-09-26 21:10:25 -04:00
Al Viro	c3c073f808	new helper: iterate_fd() iterates through the opened files in given descriptor table, calling a supplied function; we stop once non-zero is returned. Callback gets struct file , descriptor number and const void argument passed to iterator. It is called with files->file_lock held, so it is not allowed to block. tty_io, netprio_cgroup and selinux flush_unauthorized_files() converted to its use. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-09-26 21:09:59 -04:00
Al Viro	56b31d1c9f	unexport sock_map_fd(), switch to sock_alloc_file() Both modular callers of sock_map_fd() had been buggy; sctp one leaks descriptor and file if copy_to_user() fails, 9p one shouldn't be exposing file in the descriptor table at all. Switch both to sock_alloc_file(), export it, unexport sock_map_fd() and make it static. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-09-26 21:08:50 -04:00
Al Viro	2840763051	take descriptor handling from sock_alloc_file() to callers Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-09-26 21:08:49 -04:00
Trond Myklebust	8a9a8b8332	SUNRPC: Fix the return value of xdr_align_pages() The callers of xdr_align_pages() expect it to return the number of bytes of actual XDR data remaining in the pages. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-09-26 12:43:10 -04:00
Jan Engelhardt	82e6bfe2fb	netfilter: xt_limit: have r->cost != 0 case work Commit v2.6.19-rc1~1272^2~41 tells us that r->cost != 0 can happen when a running state is saved to userspace and then reinstated from there. Make sure that private xt_limit area is initialized with correct values. Otherwise, random matchings due to use of uninitialized memory. Signed-off-by: Jan Engelhardt <jengelh@inai.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-09-26 01:33:16 +02:00
Linus Torvalds	6f0f9b6b3f	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull more networking fixes from David Miller: 1) Eric Dumazet discovered and fixed what turned out to be a family of bugs. These functions were using pskb_may_pull() which might need to reallocate the linear SKB data buffer, but the callers were not expecting this possibility. The callers have cached pointers to the packet header areas, and would need to reload them if we were to continue using pskb_may_pull(). So they could end up reading garbage. It's easier to just change these RAW4/RAW6/MIP6 routines to use skb_header_pointer() instead of pskb_may_pull(), which won't modify the linear SKB data area. 2) Dave Jone's syscall spammer caught a case where a non-TCP socket can call down into the TCP keepalive code. The case basically involves creating a raw socket with sk_protocol == IPPROTO_TCP, then calling setsockopt(sock_fd, SO_KEEPALIVE, ...) Fixed by Eric Dumazet. 3) Bluetooth devices do not get configured properly while being powered on, resulting in always using legacy pairing instead of SSP. Fix from Andrzej Kaczmarek. 4) Bluetooth cancels delayed work erroneously, put stricter checks in place. From Andrei Emeltchenko. 5) Fix deadlock between cfg80211_mutex and reg_regdb_search_mutex in cfg80211, from Luis R. Rodriguez. 6) Fix interrupt double release in iwlwifi, from Emmanuel Grumbach. 7) Missing module license in bcm87xx driver, from Peter Huewe. 8) Team driver can lose port changed events when adding devices to a team, fix from Jiri Pirko. 9) Fix endless loop when trying ot unregister PPPOE device in zombie state, from Xiaodong Xu. 10) batman-adv layer needs to set MAC address of software device earlier, otherwise we call tt_local_add with it uninitialized. 11) Fix handling of KSZ8021 PHYs, it's matched currently by KS8051 but that doesn't program the device properly. From Marek Vasut. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: ipv6: mip6: fix mip6_mh_filter() ipv6: raw: fix icmpv6_filter() net: guard tcp_set_keepalive() to tcp sockets phy/micrel: Add missing header to micrel_phy.h phy/micrel: Rename KS80xx to KSZ80xx phy/micrel: Implement support for KSZ8021 batman-adv: Fix symmetry check / route flapping in multi interface setups batman-adv: Fix change mac address of soft iface. pppoe: drop PPPOX_ZOMBIEs in pppoe_release team: send port changed when added ipv4: raw: fix icmp_filter() net/phy/bcm87xx: Add MODULE_LICENSE("GPL") to GPL driver iwlwifi: don't double free the interrupt in failure path cfg80211: fix possible circular lock on reg_regdb_search() Bluetooth: Fix not removing power_off delayed work Bluetooth: Fix freeing uninitialized delayed works Bluetooth: mgmt: Fix enabling LE while powered off Bluetooth: mgmt: Fix enabling SSP while powered off	2012-09-25 14:20:29 -07:00
Eric Dumazet	96af69ea2a	ipv6: mip6: fix mip6_mh_filter() mip6_mh_filter() should not modify its input, or else its caller would need to recompute ipv6_hdr() if skb->head is reallocated. Use skb_header_pointer() instead of pskb_may_pull() Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-25 16:04:44 -04:00
John W. Linville	5419575e83	Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next	2012-09-25 15:54:32 -04:00
David S. Miller	78cc88c408	Included fixes: - fix the behaviour of batman-adv in case of virtual interface MAC change event - fix symmetric link check in neighbour selection -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (GNU/Linux) iEYEABECAAYFAlBffHkACgkQpGgxIkP9cweh4gCfRow8tAL8CnrzFV7cAyTXrZ3K sGkAoIOVe1hbuv4kfAh3eLz1kbd28y5n =1xhN -----END PGP SIGNATURE----- Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge Included fixes: - fix the behaviour of batman-adv in case of virtual interface MAC change event - fix symmetric link check in neighbour selection Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-25 13:24:02 -04:00
Andy Shevchenko	842b08bbee	ipconfig: fix trivial build error The commit `5e953778a2` ("ipconfig: add nameserver IPs to kernel-parameter ip=") introduces ic_nameservers_predef() that defined only for BOOTP. However it is used by ip_auto_config_setup() as well. This patch moves it outside of #ifdef BOOTP. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Christoph Fritz <chf.fritz@googlemail.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-25 13:22:30 -04:00
Eric Dumazet	1b05c4b50e	ipv6: raw: fix icmpv6_filter() icmpv6_filter() should not modify its input, or else its caller would need to recompute ipv6_hdr() if skb->head is reallocated. Use skb_header_pointer() instead of pskb_may_pull() and change the prototype to make clear both sk and skb are const. Also, if icmpv6 header cannot be found, do not deliver the packet, as we do in IPv4. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-25 13:21:49 -04:00
Bryan Schumaker	84e28a307e	SUNRPC: Set alloc_slot for backchannel tcp ops `f39c1bfb5a` (SUNRPC: Fix a UDP transport regression) introduced the "alloc_slot" function for xprt operations, but never created one for the backchannel operations. This patch fixes a null pointer dereference when mounting NFS over v4.1. Call Trace: [<ffffffffa0207957>] ? xprt_reserve+0x47/0x50 [sunrpc] [<ffffffffa02023a4>] call_reserve+0x34/0x60 [sunrpc] [<ffffffffa020e280>] __rpc_execute+0x90/0x400 [sunrpc] [<ffffffffa020e61a>] rpc_async_schedule+0x2a/0x40 [sunrpc] [<ffffffff81073589>] process_one_work+0x139/0x500 [<ffffffff81070e70>] ? alloc_worker+0x70/0x70 [<ffffffffa020e5f0>] ? __rpc_execute+0x400/0x400 [sunrpc] [<ffffffff81073d1e>] worker_thread+0x15e/0x460 [<ffffffff8145c839>] ? preempt_schedule+0x49/0x70 [<ffffffff81073bc0>] ? rescuer_thread+0x230/0x230 [<ffffffff81079603>] kthread+0x93/0xa0 [<ffffffff81465d04>] kernel_thread_helper+0x4/0x10 [<ffffffff81079570>] ? kthread_freezable_should_stop+0x70/0x70 [<ffffffff81465d00>] ? gs_change+0x13/0x13 Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-09-25 10:33:59 -04:00
Vladimir Kondratiev	64629b9d41	cfg80211: Fix regulatory check for 60GHz band frequencies The current regulatory code on cfg80211 performs a check to see if a regulatory rule belongs to an IEEE band so that if a Country IE is received and no rules are specified for a band (which is allowed by IEEE) those bands are left intact. The current band check assumes a rule is bound to a band if the rule's start or end frequency is less than 2 GHz apart from the center of frequency being inspected. In order to support 60 GHz for 802.11ad we need to increase this to account for the channel spacing of 2160 MHz whereby a channel somewhere in the middle of a regulatory rule may be more than 2 GHz apart from either the beginning or end of the frequency rule. Without a fix for this even though channels 1-3 are allowed world wide on the rule (57240 - 63720 @ 2160), channel 2 at 60480 MHz will end up getting disabled given that it is 3240 MHz from both the frequency rule start and end frequency. Fix this by using 2 GHz separation assumption for the 2.4 and 5 GHz bands but for 60 GHz use a 10 GHz separation before assuming a rule is not part of the band. Since we have no 802.11ad drivers yet merged this change has no impact to existing Linux upstream device drivers. Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com> Acked-by: Luis R. Rodriguez <mcgrof@do-not-panic.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-25 09:41:14 +02:00
Eric Dumazet	8489c1d9a8	net: raw: revert unrelated change Commit `5640f76858` ("net: use a per task frag allocator") accidentally contained an unrelated change to net/ipv4/raw.c, later committed (without the pr_err() debugging bits) in net tree as commit `ab43ed8b74` (ipv4: raw: fix icmp_filter()) This patch reverts this glitch, noticed by Stephen Rothwell. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-25 03:11:13 -04:00
David S. Miller	41e268565a	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless John W. Linville says: ==================== Please pull this last(?) batch of fixes intended for 3.6... For the Bluetooth bits, Gustavo says this: "Here goes probably my last update to 3.6. It includes the two patches you were ok last week(from Andrzej Kaczmarek), those are critical ones, and two other fixes one for a system crash and the other for a missing lockdep annotation." The referenced fixes from Andrzej prevent attempts to configure devices that are powered-off. Along with the Bluetooth fixes, there are a couple of 802.11 fixes. Emmanuel Grumbach gives us an iwlwifi fix to prevent releasing an interrupt twice. Luis R. Rodriguez provides a fix for a possible circular lock dependency in the cfg80211 regulatory enforcement code. All of these have been in linux-next for a few days. I hope they are not too late to make the 3.6 release! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-24 22:00:00 -04:00
Linus Torvalds	bee2d97b2c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull two ceph fixes from Sage Weil: "The first fixes a leak in the rbd setup error path, and the second fixes a more serious problem with mismatched kmap/kunmap that surfaced after the recent refactoring work." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: libceph: only kunmap kmapped pages rbd: drop dev reference on error in rbd_open()	2012-09-24 16:13:49 -07:00
Waldemar Rymarkiewicz	4c0ba9ac4b	NFC: Fix typo negociating -> negotiating Signed-off-by: Waldemar Rymarkiewicz <waldemar.rymarkiewicz@tieto.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2012-09-25 00:17:28 +02:00
Waldemar Rymarkiewicz	12bfd1e890	NFC: Don't handle consequent RSET frames after UA During processing incoming RSET frame chip, possibly due to its internal timout, can retrnasmit an another RSET which is next queued for processing in shdlc layer. In case when we accept processed RSET skip those remaining on the rcv queue until chip will send it's first S or I frame. This will mean the chip completed connection as well. Signed-off-by: Waldemar Rymarkiewicz <waldemar.rymarkiewicz@tieto.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2012-09-25 00:17:27 +02:00
Waldemar Rymarkiewicz	9010e39f50	NFC: Handle RSET in SHDLC_CONNECTING state As queue_work() does not guarantee immediate execution of sm_work it can happen in crossover RSET usecase that connect timer will constantly change the shdlc state from NEGOTIATING to CONNECTING before shdlc has chance to handle incoming frame. Signed-off-by: Waldemar Rymarkiewicz <waldemar.rymarkiewicz@tieto.com> Acked-by: Eric Lapuyade <eric.lapuyade@intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2012-09-25 00:17:27 +02:00
Eric Lapuyade	80faa59847	NFC: Add HCI module description Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2012-09-25 00:17:27 +02:00
Eric Lapuyade	a7d0281bbf	NFC: Fix LLC registration definitions for ANSI compliance Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2012-09-25 00:17:26 +02:00
Samuel Ortiz	f4f20d0650	NFC: Remove unneeded LLC symbols export After fixing the LLC Makefile, we no longer need those exports. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2012-09-25 00:17:26 +02:00
Eric Lapuyade	412fda538f	NFC: Changed HCI and PN544 HCI driver to use the new HCI LLC Core The previous shdlc HCI driver and its header are removed from the tree. PN544 now registers directly with HCI and passes the name of the llc it requires (shdlc). HCI instantiation now allocates the required llc instance. The llc is started when the HCI device is brought up. Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2012-09-25 00:17:26 +02:00
Eric Lapuyade	4a61cd6687	NFC: Add an shdlc llc module to llc core This is used by HCI drivers such as the one for the pn544 which require communications between HCI and the chip to use shdlc. Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2012-09-25 00:17:25 +02:00
Eric Lapuyade	8af00d48dc	NFC: Add a nop (passthrough) llc module to llc core This is a passthrough llc. It can be used by HCI drivers that don't need link layer control. HCI will then write directly to the driver, and driver will deliver incoming frames directly to HCI without any processing. Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2012-09-25 00:17:25 +02:00
Eric Lapuyade	67cccfe17d	NFC: Add an LLC Core layer to HCI The LLC layer manages modules that control the link layer protocol (such as shdlc) between HCI and an HCI driver. The driver must simply specify the required llc when it registers with HCI. Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2012-09-25 00:17:25 +02:00
Eric Lapuyade	f3e8fb5527	NFC: Modified hci_transceive to become an asynchronous operation This enables the completion callback to be called from a different context, preventing a possible deadlock if the callback resulted in the invocation of a nested call to the currently locked nfc_dev. This is also more in line with the im_transceive nfc_ops for NFC Core or NCI drivers which already behave asynchronously. Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2012-09-25 00:17:25 +02:00
Eric Lapuyade	e4c4789e55	NFC: Add a public nfc_hci_send_cmd_async method This method initiates execution of an HCI cmd. Result will be delivered through an asynchronous callback. Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2012-09-25 00:17:25 +02:00
Eric Lapuyade	b5faa648fa	NFC: Changed the HCI cmd execution callback prototype Make it match the data_exchange_cb_t so that it can be used directly in the implementation of an asynchronous hci_transceive Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2012-09-25 00:17:25 +02:00
Waldemar Rymarkiewicz	c1be211727	NFC: Correct outgoing frame before requeueing Driver must handle its data added to the frame, so at this point removeing control field of shdlc frame is enough. Signed-off-by: Waldemar Rymarkiewicz <waldemar.rymarkiewicz@tieto.com> Acked-by: Eric Lapuyade <eric.lapuyade@intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2012-09-25 00:17:24 +02:00
Waldemar Rymarkiewicz	ade672082d	NFC: Remove crc generation from shdlc layer Checksum is specific for a chip spcification and it varies (in size and type) between different hardware. It should be handled in the driver then. Moreover, shdlc spec doesn't mention crc as a part of the frame. Update pn544_hci driver as well. Signed-off-by: Waldemar Rymarkiewicz <waldemar.rymarkiewicz@tieto.com> Acked-by: Eric Lapuyade <eric.lapuyade@intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2012-09-25 00:17:24 +02:00
Wei Yongjun	52da2449e1	NFC: Fix possible LLCP memory leak nfc_llcp_build_tlv() malloced the memory and should be free in nfc_llcp_build_gb() after used, and the same in the error handling case, otherwise it will cause memory leak. spatch with a semantic match is used to found this problem. (http://coccinelle.lip6.fr/) Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2012-09-25 00:17:24 +02:00
Wei Yongjun	33e5971358	NFC: Remove pointless conditional before HCI kfree_skb() Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2012-09-25 00:17:24 +02:00
Tejun Heo	474fee3db1	NFC: Use system_nrt_wq instead of custom ones NFC is using a number of custom ordered workqueues w/ WQ_MEM_RECLAIM. WQ_MEM_RECLAIM is unnecessary unless NFC is gonna be used as transport for storage device, and all use cases match one work item to one ordered workqueue - IOW, there's no actual ordering going on at all and using system_nrt_wq gives the same behavior. There's nothing to be gained by using custom workqueues. Use system_nrt_wq instead and drop all the custom ones. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2012-09-25 00:17:23 +02:00
Syam Sidhardhan	5db327f96d	NFC: Remove repeated code for NULL check This patch remove the repeated code for checking llcp_sock & llcp_sock->dev against NULL. Signed-off-by: Syam Sidhardhan <s.syam@samsung.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2012-09-25 00:17:23 +02:00
Ilan Elias	767f19ae69	NFC: Implement NCI dep_link_up and dep_link_down During NFC-DEP target activation, store the remote general bytes to be used later in dep_link_up. When dep_link_up is called, activate the NFC-DEP target, and forward the remote general bytes. When dep_link_down is called, deactivate the target. Signed-off-by: Ilan Elias <ilane@ti.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2012-09-25 00:17:23 +02:00
Ilan Elias	ac20683840	NFC: Parse NCI NFC-DEP activation params Signed-off-by: Ilan Elias <ilane@ti.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2012-09-25 00:17:23 +02:00
Ilan Elias	7e0352306f	NFC: Set local general bytes in nci_start_poll If initiator protocol is NFC-DEP, set the local general bytes in nci_start_poll. Signed-off-by: Ilan Elias <ilane@ti.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2012-09-25 00:17:23 +02:00
Eric Dumazet	3e10986d1d	net: guard tcp_set_keepalive() to tcp sockets Its possible to use RAW sockets to get a crash in tcp_set_keepalive() / sk_reset_timer() Fix is to make sure socket is a SOCK_STREAM one. Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-24 16:51:53 -04:00
Daniel Borkmann	9e49e88958	filter: add XOR instruction for use with X/K SKF_AD_ALU_XOR_X has been added a while ago, but as an 'ancillary' operation that is invoked through a negative offset in K within BPF load operations. Since BPF_MOD has recently been added, BPF_XOR should also be part of the common ALU operations. Removing SKF_AD_ALU_XOR_X might not be an option since this is exposed to user space. Signed-off-by: Daniel Borkmann <daniel.borkmann@tik.ee.ethz.ch> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-24 16:49:21 -04:00
Eric Dumazet	5640f76858	net: use a per task frag allocator We currently use a per socket order-0 page cache for tcp_sendmsg() operations. This page is used to build fragments for skbs. Its done to increase probability of coalescing small write() into single segments in skbs still in write queue (not yet sent) But it wastes a lot of memory for applications handling many mostly idle sockets, since each socket holds one page in sk->sk_sndmsg_page Its also quite inefficient to build TSO 64KB packets, because we need about 16 pages per skb on arches where PAGE_SIZE = 4096, so we hit page allocator more than wanted. This patch adds a per task frag allocator and uses bigger pages, if available. An automatic fallback is done in case of memory pressure. (up to 32768 bytes per frag, thats order-3 pages on x86) This increases TCP stream performance by 20% on loopback device, but also benefits on other network devices, since 8x less frags are mapped on transmit and unmapped on tx completion. Alexander Duyck mentioned a probable performance win on systems with IOMMU enabled. Its possible some SG enabled hardware cant cope with bigger fragments, but their ndo_start_xmit() should already handle this, splitting a fragment in sub fragments, since some arches have PAGE_SIZE=65536 Successfully tested on various ethernet devices. (ixgbe, igb, bnx2x, tg3, mellanox mlx4) Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ben Hutchings <bhutchings@solarflare.com> Cc: Vijay Subramanian <subramanian.vijay@gmail.com> Cc: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Vijay Subramanian <subramanian.vijay@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-24 16:31:37 -04:00
David S. Miller	ae4735166e	Merge branch 'master' of git://1984.lsi.us.es/nf-next Pablo Neira Ayuso says: ==================== This patchset contains updates for your net-next tree, they are: * Mostly fixes for the recently pushed IPv6 NAT support: - Fix crash while removing nf_nat modules from Patrick McHardy. - Fix unbalanced rcu_read_unlock from Ulrich Weber. - Merge NETMAP and REDIRECT into one single xt_target module, from Jan Engelhardt. - Fix Kconfig for IPv6 NAT, which allows inconsistent configurations, from myself. * Updates for ipset, all of the from Jozsef Kadlecsik: - Add the new "nomatch" option to obtain reverse set matching. - Support for /0 CIDR in hash:net,iface set type. - One non-critical fix for a rare crash due to pass really wrong configuration parameters. - Coding style cleanups. - Sparse fixes. - Add set revision supported via modinfo.i * One extension for the xt_time match, to support matching during the transition between two days with one single rule, from Florian Westphal. * Fix maximum packet length supported by nfnetlink_queue and add NFQA_CAP_LEN attribute, from myself. You can notice that this batch contains a couple of fixes that may go to 3.6-rc but I don't consider them critical to push them: * The ipset fix for the /0 cidr case, which is triggered with one inconsistent command line invocation of ipset. * The nfnetlink_queue maximum packet length supported since it requires the new NFQA_CAP_LEN attribute to provide a full workaround for the described problem. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-24 15:42:04 -04:00
John W. Linville	791ef39cd1	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next	2012-09-24 14:39:16 -04:00
John W. Linville	9b4e9e7565	Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next	2012-09-24 14:34:40 -04:00
Pablo Neira Ayuso	6ee584be3e	netfilter: nfnetlink_queue: add NFQA_CAP_LEN attribute This patch adds the NFQA_CAP_LEN attribute that allows us to know what is the real packet size from user-space (even if we decided to retrieve just a few bytes from the packet instead of all of it). Security software that inspects packets should always check for this new attribute to make sure that it is inspecting the entire packet. This also helps to provide a workaround for the problem described in: http://marc.info/?l=netfilter-devel&m=134519473212536&w=2 Original idea from Florian Westphal. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-09-24 15:10:29 +02:00
Pablo Neira Ayuso	ba8d3b0bf5	netfilter: nfnetlink_queue: fix maximum packet length to userspace The packets that we send via NFQUEUE are encapsulated in the NFQA_PAYLOAD attribute. The length of the packet in userspace is obtained via attr->nla_len field. This field contains the size of the Netlink attribute header plus the packet length. If the maximum packet length is specified, ie. 65535 bytes, and packets in the range of (65531,65535] are sent to userspace, the attr->nla_len overflows and it reports bogus lengths to the application. To fix this, this patch limits the maximum packet length to 65531 bytes. If larger packet length is specified, the packet that we send to user-space is truncated to 65531 bytes. To support 65535 bytes packets, we have to revisit the idea of the 32-bits Netlink attribute length. Reported-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-09-24 14:47:40 +02:00
Pablo Neira Ayuso	7be54ca476	netfilter: nf_ct_ftp: add sequence tracking pickup facility for injected entries This patch allows the FTP helper to pickup the sequence tracking from the first packet seen. This is useful to fix the breakage of the first FTP command after the failover while using conntrackd to synchronize states. The seq_aft_nl_num field in struct nf_ct_ftp_info has been shrinked to 16-bits (enough for what it does), so we can use the remaining 16-bits to store the flags while using the same size for the private FTP helper data. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-09-24 14:29:40 +02:00
Florian Westphal	54eb3df3a7	netfilter: xt_time: add support to ignore day transition Currently, if you want to do something like: "match Monday, starting 23:00, for two hours" You need two rules, one for Mon 23:00 to 0:00 and one for Tue 0:00-1:00. The rule: --weekdays Mo --timestart 23:00 --timestop 01:00 looks correct, but it will first match on monday from midnight to 1 a.m. and then again for another hour from 23:00 onwards. This permits userspace to explicitly ignore the day transition and match for a single, continuous time period instead. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-09-24 14:29:01 +02:00
Vitaly Wool	eab48345c2	rfkill: prevent unnecessary event generation Prevent unnecessary rfkill event generation when the state has not actually changed. These events have to be delivered to relevant userspace processes, causing these processes to wake up and do something while they could as well have slept. This obviously results in more CPU usage, longer time-to-sleep-again and therefore higher power consumption. Signed-off-by: Vitaly Wool <vitalywool@gmail.com> Signed-off-by: Mykyta Iziumtsev <nikita.izyumtsev@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-24 10:35:54 +02:00
Linus Lüssing	7caf69fb9c	batman-adv: Fix symmetry check / route flapping in multi interface setups If receiving an OGM from a neighbor other than the currently selected and if it has the same TQ then we are supposed to switch if this neighbor provides a more symmetric link than the currently selected one. However this symmetry check currently is broken if the interface of the neighbor we received the OGM from and the one of the currently selected neighbor differ: We are currently trying to determine the symmetry of the link towards the selected router via the link we received the OGM from instead of just checking via the link towards the currently selected router. This leads to way more route switches than necessary and can lead to permanent route flapping in many common multi interface setups. This patch fixes this issue by using the right interface for this symmetry check. Signed-off-by: Linus Lüssing <linus.luessing@web.de>	2012-09-23 23:12:49 +02:00
Def	40a3eb33e3	batman-adv: Fix change mac address of soft iface. Into function interface_set_mac_addr, the function tt_local_add was invoked before updating dev->dev_addr. The new MAC address was not tagged as NoPurge. Signed-off-by: Def <def@laposte.net>	2012-09-23 23:12:48 +02:00
Neal Cardwell	30099b2e9b	tcp: TCP Fast Open Server - record retransmits after 3WHS When recording the number of SYNACK retransmits for servers using TCP Fast Open, fix the code to ensure that we copy over the retransmit count from the request_sock after we receive the ACK that completes the 3-way handshake. The story here is similar to that of SYNACK RTT measurements. Previously we were always doing this in tcp_v4_syn_recv_sock(). However, for TCP Fast Open connections tcp_v4_conn_req_fastopen() calls tcp_v4_syn_recv_sock() at the time we receive the SYN. So for TFO we must copy the final SYNACK retransmit count in tcp_rcv_state_process(). Note that copying over the SYNACK retransmit count will give us the correct count since, as is mentioned in a comment in tcp_retransmit_timer(), before we receive an ACK for our SYN-ACK a TFO passive connection does not retransmit anything else (e.g., data or FIN segments). Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-22 23:15:25 -04:00
Jozsef Kadlecsik	3e0304a583	netfilter: ipset: Support to match elements marked with "nomatch" Exceptions can now be matched and we can branch according to the possible cases: a. match in the set if the element is not flagged as "nomatch" b. match in the set if the element is flagged with "nomatch" c. no match i.e. iptables ... -m set --match-set ... -j ... iptables ... -m set --match-set ... --nomatch-entries -j ... ... Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>	2012-09-22 22:44:34 +02:00
Jozsef Kadlecsik	3ace95c0ac	netfilter: ipset: Coding style fixes Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>	2012-09-22 22:44:29 +02:00
Jozsef Kadlecsik	10111a6ef3	netfilter: ipset: Include supported revisions in module description Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>	2012-09-22 22:44:24 +02:00
Jozsef Kadlecsik	bd9087e040	netfilter: ipset: Add /0 network support to hash:net,iface type Now it is possible to setup a single hash:net,iface type of set and a single ip6?tables match which covers all egress/ingress filtering. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>	2012-09-22 22:44:15 +02:00
Neal Cardwell	e69bebde46	tcp: TCP Fast Open Server - call tcp_validate_incoming() for all packets A TCP Fast Open (TFO) passive connection must call both tcp_check_req() and tcp_validate_incoming() for all incoming ACKs that are attempting to complete the 3WHS. This is needed to parallel all the action that happens for a non-TFO connection, where for an ACK that is attempting to complete the 3WHS we call both tcp_check_req() and tcp_validate_incoming(). For example, upon receiving the ACK that completes the 3WHS, we need to call tcp_fast_parse_options() and update ts_recent based on the incoming timestamp value in the ACK. One symptom of the problem with the previous code was that for passive TFO connections using TCP timestamps, the outgoing TS ecr values ignored the incoming TS val value on the ACK that completed the 3WHS. Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-22 15:47:10 -04:00
Neal Cardwell	0725398801	tcp: TCP Fast Open Server - note timestamps and retransmits for SYNACK RTT Previously, when using TCP Fast Open a server would return from tcp_check_req() before updating snt_synack based on TCP timestamp echo replies and whether or not we've retransmitted the SYNACK. The result was that (a) for TFO connections using timestamps we used an incorrect baseline SYNACK send time (tcp_time_stamp of SYNACK send instead of rcv_tsecr), and (b) for TFO connections that do not have TCP timestamps but retransmit the SYNACK we took a SYNACK RTT sample when we should not take a sample. This fix merely moves the snt_synack update logic a bit earlier in the function, so that connections using TCP Fast Open will properly do these updates when the ACK for the SYNACK arrives. Moving this snt_synack update logic means that with TCP_DEFER_ACCEPT enabled we do a few instructions of wasted work on each bare ACK, but that seems OK. Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-22 15:47:10 -04:00
Neal Cardwell	016818d076	tcp: TCP Fast Open Server - take SYNACK RTT after completing 3WHS When taking SYNACK RTT samples for servers using TCP Fast Open, fix the code to ensure that we only call tcp_valid_rtt_meas() after we receive the ACK that completes the 3-way handshake. Previously we were always taking an RTT sample in tcp_v4_syn_recv_sock(). However, for TCP Fast Open connections tcp_v4_conn_req_fastopen() calls tcp_v4_syn_recv_sock() at the time we receive the SYN. So for TFO we must wait until tcp_rcv_state_process() to take the RTT sample. To fix this, we wait until after TFO calls tcp_v4_syn_recv_sock() before we set the snt_synack timestamp, since tcp_synack_rtt_meas() already ensures that we only take a SYNACK RTT sample if snt_synack is non-zero. To be careful, we only take a snt_synack timestamp when a SYNACK transmit or retransmit succeeds. Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-22 15:47:10 -04:00
Neal Cardwell	623df484a7	tcp: extract code to compute SYNACK RTT In preparation for adding another spot where we compute the SYNACK RTT, extract this code so that it can be shared. Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-22 15:47:10 -04:00
Eric Dumazet	ab43ed8b74	ipv4: raw: fix icmp_filter() icmp_filter() should not modify its input, or else its caller would need to recompute ip_hdr() if skb->head is reallocated. Use skb_header_pointer() instead of pskb_may_pull() and change the prototype to make clear both sk and skb are const. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-22 15:35:05 -04:00
John W. Linville	1199992df2	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem	2012-09-22 12:19:22 -04:00
Alex Elder	5ce765a540	libceph: only kunmap kmapped pages In write_partial_msg_pages(), pages need to be kmapped in order to perform a CRC-32c calculation on them. As an artifact of the way this code used to be structured, the kunmap() call was separated from the kmap() call and both were done conditionally. But the conditions under which the kmap() and kunmap() calls were made differed, so there was a chance a kunmap() call would be done on a page that had not been mapped. The symptom of this was tripping a BUG() in kunmap_high() when pkmap_count[nr] became 0. Reported-by: Bryan K. Wright <bryan@virginia.edu> Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2012-09-21 20:49:26 -07:00
Jozsef Kadlecsik	b9fed74818	netfilter: ipset: Check and reject crazy /0 input parameters bitmap:ip and bitmap:ip,mac type did not reject such a crazy range when created and using such a set results in a kernel crash. The hash types just silently ignored such parameters. Reject invalid /0 input parameters explicitely. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>	2012-09-21 21:51:34 +02:00
Jozsef Kadlecsik	6e27c9b4ee	netfilter: ipset: Fix sparse warnings "incorrect type in assignment" Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>	2012-09-21 21:51:22 +02:00
Christoph Fritz	5e953778a2	ipconfig: add nameserver IPs to kernel-parameter ip= On small systems (e.g. embedded ones) IP addresses are often configured by bootloaders and get assigned to kernel via parameter "ip=". If set to "ip=dhcp", even nameserver entries from DHCP daemons are handled. These entries exported in /proc/net/pnp are commonly linked by /etc/resolv.conf. To configure nameservers for networks without DHCP, this patch adds option <dns0-ip> and <dns1-ip> to kernel-parameter 'ip='. Signed-off-by: Christoph Fritz <chf.fritz@googlemail.com> Tested-by: Jan Weitzel <j.weitzel@phytec.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-21 14:51:21 -04:00
Zhao Hongjiang	bf5b30b8a4	net: change return values from -EACCES to -EPERM Change return value from -EACCES to -EPERM when the permission check fails. Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-21 13:58:08 -04:00
Wei Yongjun	f950c0ecc7	ipv6: fix return value check in fib6_add() In case of error, the function fib6_add_1() returns ERR_PTR() or NULL pointer. The ERR_PTR() case check is missing in fib6_add(). dpatch engine is used to generated this patch. (https://github.com/weiyj/dpatch) Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-21 13:43:52 -04:00
Amerigo Wang	fc18162520	l2tp: fix compile error when CONFIG_IPV6=m and CONFIG_L2TP=y When CONFIG_IPV6=m and CONFIG_L2TP=y, I got the following compile error: LD init/built-in.o net/built-in.o: In function `l2tp_xmit_core': l2tp_core.c:(.text+0x147781): undefined reference to `inet6_csk_xmit' net/built-in.o: In function `l2tp_tunnel_create': (.text+0x149067): undefined reference to `udpv6_encap_enable' net/built-in.o: In function `l2tp_ip6_recvmsg': l2tp_ip6.c:(.text+0x14e991): undefined reference to `ipv6_recv_error' net/built-in.o: In function `l2tp_ip6_sendmsg': l2tp_ip6.c:(.text+0x14ec64): undefined reference to `fl6_sock_lookup' l2tp_ip6.c:(.text+0x14ed6b): undefined reference to `datagram_send_ctl' l2tp_ip6.c:(.text+0x14eda0): undefined reference to `fl6_sock_lookup' l2tp_ip6.c:(.text+0x14ede5): undefined reference to `fl6_merge_options' l2tp_ip6.c:(.text+0x14edf4): undefined reference to `ipv6_fixup_options' l2tp_ip6.c:(.text+0x14ee5d): undefined reference to `fl6_update_dst' l2tp_ip6.c:(.text+0x14eea3): undefined reference to `ip6_dst_lookup_flow' l2tp_ip6.c:(.text+0x14eee7): undefined reference to `ip6_dst_hoplimit' l2tp_ip6.c:(.text+0x14ef8b): undefined reference to `ip6_append_data' l2tp_ip6.c:(.text+0x14ef9d): undefined reference to `ip6_flush_pending_frames' l2tp_ip6.c:(.text+0x14efe2): undefined reference to `ip6_push_pending_frames' net/built-in.o: In function `l2tp_ip6_destroy_sock': l2tp_ip6.c:(.text+0x14f090): undefined reference to `ip6_flush_pending_frames' l2tp_ip6.c:(.text+0x14f0a0): undefined reference to `inet6_destroy_sock' net/built-in.o: In function `l2tp_ip6_connect': l2tp_ip6.c:(.text+0x14f14d): undefined reference to `ip6_datagram_connect' net/built-in.o: In function `l2tp_ip6_bind': l2tp_ip6.c:(.text+0x14f4fe): undefined reference to `ipv6_chk_addr' net/built-in.o: In function `l2tp_ip6_init': l2tp_ip6.c:(.init.text+0x73fa): undefined reference to `inet6_add_protocol' l2tp_ip6.c:(.init.text+0x740c): undefined reference to `inet6_register_protosw' net/built-in.o: In function `l2tp_ip6_exit': l2tp_ip6.c:(.exit.text+0x1954): undefined reference to `inet6_unregister_protosw' l2tp_ip6.c:(.exit.text+0x1965): undefined reference to `inet6_del_protocol' net/built-in.o:(.rodata+0xf2d0): undefined reference to `inet6_release' net/built-in.o:(.rodata+0xf2d8): undefined reference to `inet6_bind' net/built-in.o:(.rodata+0xf308): undefined reference to `inet6_ioctl' net/built-in.o:(.data+0x1af40): undefined reference to `ipv6_setsockopt' net/built-in.o:(.data+0x1af48): undefined reference to `ipv6_getsockopt' net/built-in.o:(.data+0x1af50): undefined reference to `compat_ipv6_setsockopt' net/built-in.o:(.data+0x1af58): undefined reference to `compat_ipv6_getsockopt' make: *** [vmlinux] Error 1 This is due to l2tp uses symbols from IPV6, so when IPV6 is a module, l2tp is not allowed to be builtin. Cc: David Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-21 12:06:46 -04:00
Johannes Berg	c6f219dc83	mac80211: don't send delBA on addBA failure There's no reason to send a delBA when the peer refused our addBA, so change that. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-21 16:14:14 +02:00
Johannes Berg	582bb505b6	mac80211: don't send delBA when removing stations When a station is removed and we stop the aggregation sessions, it's not useful to send delBA since this is due to us or the station disassociating or dropping the connection in some other way, so change that. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-21 16:14:14 +02:00
Johannes Berg	7f1611469b	mac80211: don't send delBA before disassoc When we disassociate, it's not really useful to send delBA action frames since we're going to send disassoc/deauth anyway, so change that. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-21 16:14:13 +02:00
Jan Engelhardt	2cbc78a29e	netfilter: combine ipt_REDIRECT and ip6t_REDIRECT Combine more modules since the actual code is so small anyway that the kmod metadata and the module in its loaded state totally outweighs the combined actual code size. IP_NF_TARGET_REDIRECT becomes a compat option; IP6_NF_TARGET_REDIRECT is completely eliminated since it has not see a release yet. Signed-off-by: Jan Engelhardt <jengelh@inai.de> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-09-21 12:12:05 +02:00
Jan Engelhardt	b3d54b3e40	netfilter: combine ipt_NETMAP and ip6t_NETMAP Combine more modules since the actual code is so small anyway that the kmod metadata and the module in its loaded state totally outweighs the combined actual code size. IP_NF_TARGET_NETMAP becomes a compat option; IP6_NF_TARGET_NETMAP is completely eliminated since it has not see a release yet. Signed-off-by: Jan Engelhardt <jengelh@inai.de> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-09-21 12:11:08 +02:00
Ulrich Weber	136251d02f	netfilter: nf_nat: remove obsolete rcu_read_unlock call hlist walk in find_appropriate_src() is not protected anymore by rcu_read_lock(), so rcu_read_unlock() is unnecessary if in_range() matches. This bug was added in (`c7232c9` netfilter: add protocol independent NAT core). Signed-off-by: Ulrich Weber <ulrich.weber@sophos.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-09-21 12:09:25 +02:00
Patrick McHardy	b0cdb1d9a9	netfilter: nf_nat: fix oops when unloading protocol modules When unloading a protocol module nf_ct_iterate_cleanup() is used to remove all conntracks using the protocol from the bysource hash and clean their NAT sections. Since the conntrack isn't actually killed, the NAT callback is invoked twice, once for each direction, which causes an oops when trying to delete it from the bysource hash for the second time. The same oops can also happen when removing both an L3 and L4 protocol since the cleanup function doesn't check whether the conntrack has already been cleaned up. Pid: 4052, comm: modprobe Not tainted 3.6.0-rc3-test-nat-unload-fix+ #32 Red Hat KVM RIP: 0010:[<ffffffffa002c303>] [<ffffffffa002c303>] nf_nat_proto_clean+0x73/0xd0 [nf_nat] RSP: 0018:ffff88007808fe18 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff8800728550c0 RCX: ffff8800756288b0 RDX: dead000000200200 RSI: ffff88007808fe88 RDI: ffffffffa002f208 RBP: ffff88007808fe28 R08: ffff88007808e000 R09: 0000000000000000 R10: dead000000200200 R11: dead000000100100 R12: ffffffff81c6dc00 R13: ffff8800787582b8 R14: ffff880078758278 R15: ffff88007808fe88 FS: 00007f515985d700(0000) GS:ffff88007cd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00007f515986a000 CR3: 000000007867a000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process modprobe (pid: 4052, threadinfo ffff88007808e000, task ffff8800756288b0) Stack: ffff88007808fe68 ffffffffa002c290 ffff88007808fe78 ffffffff815614e3 ffffffff00000000 00000aeb00000246 ffff88007808fe68 ffffffff81c6dc00 ffff88007808fe88 ffffffffa00358a0 0000000000000000 000000000040f5b0 Call Trace: [<ffffffffa002c290>] ? nf_nat_net_exit+0x50/0x50 [nf_nat] [<ffffffff815614e3>] nf_ct_iterate_cleanup+0xc3/0x170 [<ffffffffa002c55a>] nf_nat_l3proto_unregister+0x8a/0x100 [nf_nat] [<ffffffff812a0303>] ? compat_prepare_timeout+0x13/0xb0 [<ffffffffa0035848>] nf_nat_l3proto_ipv4_exit+0x10/0x23 [nf_nat_ipv4] ... To fix this, - check whether the conntrack has already been cleaned up in nf_nat_proto_clean - change nf_ct_iterate_cleanup() to only invoke the callback function once for each conntrack (IP_CT_DIR_ORIGINAL). The second change doesn't affect other callers since when conntracks are actually killed, both directions are removed from the hash immediately and the callback is already only invoked once. If it is not killed, the second callback invocation will always return the same decision not to kill it. Reported-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-09-21 11:35:18 +02:00
Pablo Neira Ayuso	b0041d1b8e	netfilter: fix IPv6 NAT dependencies in Kconfig * NF_NAT_IPV6 requires IP6_NF_IPTABLES * IP6_NF_TARGET_MASQUERADE, IP6_NF_TARGET_NETMAP, IP6_NF_TARGET_REDIRECT and IP6_NF_TARGET_NPT require NF_NAT_IPV6. This change just mirrors what IPv4 does in Kconfig, for consistency. Reported-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-09-21 11:33:19 +02:00
Ed Cashin	c0d680e577	net: do not disable sg for packets requiring no checksum A change in a series of VLAN-related changes appears to have inadvertently disabled the use of the scatter gather feature of network cards for transmission of non-IP ethernet protocols like ATA over Ethernet (AoE). Below is a reference to the commit that introduces a "harmonize_features" function that turns off scatter gather when the NIC does not support hardware checksumming for the ethernet protocol of an sk buff. commit `f01a5236bd` Author: Jesse Gross <jesse@nicira.com> Date: Sun Jan 9 06:23:31 2011 +0000 net offloading: Generalize netif_get_vlan_features(). The can_checksum_protocol function is not equipped to consider a protocol that does not require checksumming. Calling it for a protocol that requires no checksum is inappropriate. The patch below has harmonize_features call can_checksum_protocol when the protocol needs a checksum, so that the network layer is not forced to perform unnecessary skb linearization on the transmission of AoE packets. Unnecessary linearization results in decreased performance and increased memory pressure, as reported here: http://www.spinics.net/lists/linux-mm/msg15184.html The problem has probably not been widely experienced yet, because only recently has the kernel.org-distributed aoe driver acquired the ability to use payloads of over a page in size, with the patchset recently included in the mm tree: https://lkml.org/lkml/2012/8/28/140 The coraid.com-distributed aoe driver already could use payloads of greater than a page in size, but its users generally do not use the newest kernels. Signed-off-by: Ed Cashin <ecashin@coraid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-20 22:23:40 -04:00
Mathias Krause	e3ac104d41	xfrm_user: don't copy esn replay window twice for new states The ESN replay window was already fully initialized in xfrm_alloc_replay_state_esn(). No need to copy it again. Cc: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Mathias Krause <minipli@googlemail.com> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-20 18:08:40 -04:00
Mathias Krause	ecd7918745	xfrm_user: ensure user supplied esn replay window is valid The current code fails to ensure that the netlink message actually contains as many bytes as the header indicates. If a user creates a new state or updates an existing one but does not supply the bytes for the whole ESN replay window, the kernel copies random heap bytes into the replay bitmap, the ones happen to follow the XFRMA_REPLAY_ESN_VAL netlink attribute. This leads to following issues: 1. The replay window has random bits set confusing the replay handling code later on. 2. A malicious user could use this flaw to leak up to ~3.5kB of heap memory when she has access to the XFRM netlink interface (requires CAP_NET_ADMIN). Known users of the ESN replay window are strongSwan and Steffen's iproute2 patch (<http://patchwork.ozlabs.org/patch/85962/>). The latter uses the interface with a bitmap supplied while the former does not. strongSwan is therefore prone to run into issue 1. To fix both issues without breaking existing userland allow using the XFRMA_REPLAY_ESN_VAL netlink attribute with either an empty bitmap or a fully specified one. For the former case we initialize the in-kernel bitmap with zero, for the latter we copy the user supplied bitmap. For state updates the full bitmap must be supplied. To prevent overflows in the bitmap length calculation the maximum size of bmp_len is limited to 128 by this patch -- resulting in a maximum replay window of 4096 packets. This should be sufficient for all real life scenarios (RFC 4303 recommends a default replay window size of 64). Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Martin Willi <martin@revosec.ch> Cc: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-20 18:08:40 -04:00
Mathias Krause	1f86840f89	xfrm_user: fix info leak in copy_to_user_tmpl() The memory used for the template copy is a local stack variable. As struct xfrm_user_tmpl contains multiple holes added by the compiler for alignment, not initializing the memory will lead to leaking stack bytes to userland. Add an explicit memset(0) to avoid the info leak. Initial version of the patch by Brad Spengler. Cc: Brad Spengler <spender@grsecurity.net> Signed-off-by: Mathias Krause <minipli@googlemail.com> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-20 18:08:40 -04:00
Mathias Krause	7b789836f4	xfrm_user: fix info leak in copy_to_user_policy() The memory reserved to dump the xfrm policy includes multiple padding bytes added by the compiler for alignment (padding bytes in struct xfrm_selector and struct xfrm_userpolicy_info). Add an explicit memset(0) before filling the buffer to avoid the heap info leak. Signed-off-by: Mathias Krause <minipli@googlemail.com> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-20 18:08:39 -04:00
Mathias Krause	f778a63671	xfrm_user: fix info leak in copy_to_user_state() The memory reserved to dump the xfrm state includes the padding bytes of struct xfrm_usersa_info added by the compiler for alignment (7 for amd64, 3 for i386). Add an explicit memset(0) before filling the buffer to avoid the info leak. Signed-off-by: Mathias Krause <minipli@googlemail.com> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-20 18:08:39 -04:00
Mathias Krause	4c87308bde	xfrm_user: fix info leak in copy_to_user_auth() copy_to_user_auth() fails to initialize the remainder of alg_name and therefore discloses up to 54 bytes of heap memory via netlink to userland. Use strncpy() instead of strcpy() to fill the trailing bytes of alg_name with null bytes. Signed-off-by: Mathias Krause <minipli@googlemail.com> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-20 18:08:39 -04:00
Andrey Vagin	bc26ccd8fc	tcp: restore rcv_wscale in a repair mode (v2) rcv_wscale is a symetric parameter with snd_wscale. Both this parameters are set on a connection handshake. Without this value a remote window size can not be interpreted correctly, because a value from a packet should be shifted on rcv_wscale. And one more thing is that wscale_ok should be set too. This patch doesn't break a backward compatibility. If someone uses it in a old scheme, a rcv window will be restored with the same bug (rcv_wscale = 0). v2: Save backward compatibility on big-endian system. Before the first two bytes were snd_wscale and the second two bytes were rcv_wscale. Now snd_wscale is opt_val & 0xFFFF and rcv_wscale >> 16. This approach is independent on byte ordering. Cc: David S. Miller <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: James Morris <jmorris@namei.org> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Patrick McHardy <kaber@trash.net> CC: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: Andrew Vagin <avagin@openvz.org> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-20 17:49:58 -04:00
Alan Cox	4308fc58dc	tcp: Document use of undefined variable. Both tcp_timewait_state_process and tcp_check_req use the same basic construct of struct tcp_options received tmp_opt; tmp_opt.saw_tstamp = 0; then call tcp_parse_options However if they are fed a frame containing a TCP_SACK then tbe code behaviour is undefined because opt_rx->sack_ok is undefined data. This ought to be documented if it is intentional. Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-20 17:29:36 -04:00
Christoph Paasch	bb68b64724	ipv4: Don't add TCP-code in inet_sock_destruct Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be> Acked-by: H.K. Jerry Chu <hkchu@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-20 17:12:27 -04:00
Sylvain Roger Rieunier	2514ec8653	mac80211: fix IBSS auth TX debug message In the IBSS auth TX debug message the BSSID and DA address are reversed, fix that. Signed-off-by: Sylvain Roger Rieunier <sylvain.roger.rieunier@gmail.com> [reword commit message and make it fit 72 cols] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-20 10:31:34 +02:00
Trond Myklebust	a519fc7a70	SUNRPC: Ensure that the TCP socket is closed when in CLOSE_WAIT Instead of doing a shutdown() call, we need to do an actual close(). Ditto if/when the server is sending us junk RPC headers. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Tested-by: Simon Kirby <sim@hostway.ca> Cc: stable@vger.kernel.org	2012-09-19 18:16:10 -04:00
Li RongQing	8ea853fd0b	net/core: fix comment in skb_try_coalesce It should be the skb which is not cloned Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-19 17:29:13 -04:00
Amerigo Wang	6b102865e7	ipv6: unify fragment thresh handling code Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Michal Kubeček <mkubecek@suse.cz> Cc: David Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-19 17:23:28 -04:00
Amerigo Wang	d4915c087f	ipv6: make ip6_frag_nqueues() and ip6_frag_mem() static inline Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Michal Kubeček <mkubecek@suse.cz> Cc: David Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-19 17:23:28 -04:00
Amerigo Wang	b836c99fd6	ipv6: unify conntrack reassembly expire code with standard one Two years ago, Shan Wei tried to fix this: http://patchwork.ozlabs.org/patch/43905/ The problem is that RFC2460 requires an ICMP Time Exceeded -- Fragment Reassembly Time Exceeded message should be sent to the source of that fragment, if the defragmentation times out. " If insufficient fragments are received to complete reassembly of a packet within 60 seconds of the reception of the first-arriving fragment of that packet, reassembly of that packet must be abandoned and all the fragments that have been received for that packet must be discarded. If the first fragment (i.e., the one with a Fragment Offset of zero) has been received, an ICMP Time Exceeded -- Fragment Reassembly Time Exceeded message should be sent to the source of that fragment. " As Herbert suggested, we could actually use the standard IPv6 reassembly code which follows RFC2460. With this patch applied, I can see ICMP Time Exceeded sent from the receiver when the sender sent out 3/4 fragmented IPv6 UDP packet. Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Michal Kubeček <mkubecek@suse.cz> Cc: David Miller <davem@davemloft.net> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: netfilter-devel@vger.kernel.org Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-19 17:23:28 -04:00
Amerigo Wang	c038a767cd	ipv6: add a new namespace for nf_conntrack_reasm As pointed by Michal, it is necessary to add a new namespace for nf_conntrack_reasm code, this prepares for the second patch. Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Michal Kubeček <mkubecek@suse.cz> Cc: David Miller <davem@davemloft.net> Cc: Patrick McHardy <kaber@trash.net> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: netfilter-devel@vger.kernel.org Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-19 17:23:28 -04:00
Amerigo Wang	8c4c49df5c	netpoll: call ->ndo_select_queue() in tx path In netpoll tx path, we miss the chance of calling ->ndo_select_queue(), thus could cause problems when bonding is involved. This patch makes dev_pick_tx() extern (and rename it to netdev_pick_tx()) to let netpoll call it in netpoll_send_skb_on_dev(). Reported-by: Sylvain Munaut <s.munaut@whatever-company.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Cong Wang <amwang@redhat.com> Tested-by: Sylvain Munaut <s.munaut@whatever-company.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-19 17:19:09 -04:00
stephen hemminger	6b6e27255f	netdev: make address const in device address management The internal functions for add/deleting addresses don't change their argument. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-19 16:35:22 -04:00
Paolo Valente	7126195697	pkt_sched: fix virtual-start-time update in QFQ If the old timestamps of a class, say cl, are stale when the class becomes active, then QFQ may assign to cl a much higher start time than the maximum value allowed. This may happen when QFQ assigns to the start time of cl the finish time of a group whose classes are characterized by a higher value of the ratio max_class_pkt/weight_of_the_class with respect to that of cl. Inserting a class with a too high start time into the bucket list corrupts the data structure and may eventually lead to crashes. This patch limits the maximum start time assigned to a class. Signed-off-by: Paolo Valente <paolo.valente@unimore.it> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-19 16:23:53 -04:00
Michal Kubeček	15c041759b	tcp: flush DMA queue before sk_wait_data if rcv_wnd is zero If recv() syscall is called for a TCP socket so that - IOAT DMA is used - MSG_WAITALL flag is used - requested length is bigger than sk_rcvbuf - enough data has already arrived to bring rcv_wnd to zero then when tcp_recvmsg() gets to calling sk_wait_data(), receive window can be still zero while sk_async_wait_queue exhausts enough space to keep it zero. As this queue isn't cleaned until the tcp_service_net_dma() call, sk_wait_data() cannot receive any data and blocks forever. If zero receive window and non-empty sk_async_wait_queue is detected before calling sk_wait_data(), process the queue first. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-19 16:07:58 -04:00
Linus Lüssing	dbd6b11e15	batman-adv: make batadv_test_bit() return 0 or 1 only On some architectures test_bit() can return other values than 0 or 1: With a generic x86 OpenWrt image in a kvm setup (batadv_)test_bit() frequently returns -1 for me, leading to batadv_iv_ogm_update_seqnos() wrongly signaling a protected seqno window. This patch tries to fix this issue by making batadv_test_bit() return 0 or 1 only. Signed-off-by: Linus Lüssing <linus.luessing@web.de> Acked-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Antonio Quartulli <ordex@autistici.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-19 15:49:53 -04:00
Eric Dumazet	6b78f16e4b	gre: add GSO support Add GSO support to GRE tunnels. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Maciej Żenczykowski <maze@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-19 15:40:15 -04:00
Eric Dumazet	2c60db0370	net: provide a default dev->ethtool_ops Instead of forcing device drivers to provide empty ethtool_ops or tweak net/core/ethtool.c again, we could provide a generic ethtool_ops. This occurred to me when I wanted to add GSO support to GRE tunnels. ethtool -k support should be generic for all drivers. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ben Hutchings <bhutchings@solarflare.com> Cc: Maciej Żenczykowski <maze@google.com> Reviewed-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-19 15:40:15 -04:00
Gao feng	828de4f6bf	net: dev: fix incorrect getting net device's name When moving a nic from net namespace A to net namespace B, in dev_change_net_namesapce,we call __dev_get_by_name to decide if the netns B has the device has the same name. if the netns B already has the same named device,we call dev_get_valid_name to try to get a valid name for this nic in the netns B,but net_device->nd_net still point to netns A now. this patch fix it. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-19 15:37:01 -04:00
Li RongQing	3fd91fb358	ipv6: recursive check rt->dst.from when call rt6_check_expired If dst cache dst_a copies from dst_b, and dst_b copies from dst_c, check if dst_a is expired or not, we should not end with dst_a->dst.from, dst_b, we should check dst_c. CC: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-19 15:35:33 -04:00
Eric Dumazet	b40863c667	net: more accurate network taps in transmit path dev_queue_xmit_nit() should be called right before ndo_start_xmit() calls or we might give wrong packet contents to taps users : Packet checksum can be changed, or packet can be linearized or segmented, and segments partially sent for the later case. Also a memory allocation can fail and packet never really hit the driver entry point. Reported-by: Jamie Gloudon <jamie.gloudon@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-19 15:32:42 -04:00
Johannes Berg	552bff0c2f	cfg80211: constify name parameter to add_virtual_intf The name can't be modified by the driver, make it const. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-19 09:32:59 +02:00
Johannes Berg	2ad4814fb6	mac80211: make reset debugfs depend on CONFIG_PM The suspend/resume code depends on CONFIG_PM, so the reset debugfs file can only be made available if that is enabled. Fengguang Wu's zero-day build testing found this. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-19 08:20:24 +02:00
Johan Hedberg	23b3b1330a	Bluetooth: Update management interface revision For each kernel release where commands or events are added to the management interface, the revision field should be increment by one. The increment should only happen once per kernel release and not for every command/event that gets added. The revision value is for informational purposes only, but this simple policy would make any future debugging a lot simple. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-09-18 22:27:30 -03:00
Johan Hedberg	92a25256f1	Bluetooth: mgmt: Implement support for passkey notification This patch adds support for Secure Simple Pairing with devices that have KeyboardOnly as their IO capability. Such devices will cause a passkey notification on our side and optionally also keypress notifications. Without this patch some keyboards cannot be paired using the mgmt interface. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Cc: stable@vger.kernel.org Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-09-18 22:27:29 -03:00
Luis R. Rodriguez	a85d0d7f34	cfg80211: fix possible circular lock on reg_regdb_search() When call_crda() is called we kick off a witch hunt search for the same regulatory domain on our internal regulatory database and that work gets kicked off on a workqueue, this is done while the cfg80211_mutex is held. If that workqueue kicks off it will first lock reg_regdb_search_mutex and later cfg80211_mutex but to ensure two CPUs will not contend against cfg80211_mutex the right thing to do is to have the reg_regdb_search() wait until the cfg80211_mutex is let go. The lockdep report is pasted below. cfg80211: Calling CRDA to update world regulatory domain ====================================================== [ INFO: possible circular locking dependency detected ] 3.3.8 #3 Tainted: G O ------------------------------------------------------- kworker/0:1/235 is trying to acquire lock: (cfg80211_mutex){+.+...}, at: [<816468a4>] set_regdom+0x78c/0x808 [cfg80211] but task is already holding lock: (reg_regdb_search_mutex){+.+...}, at: [<81646828>] set_regdom+0x710/0x808 [cfg80211] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (reg_regdb_search_mutex){+.+...}: [<800a8384>] lock_acquire+0x60/0x88 [<802950a8>] mutex_lock_nested+0x54/0x31c [<81645778>] is_world_regdom+0x9f8/0xc74 [cfg80211] -> #1 (reg_mutex#2){+.+...}: [<800a8384>] lock_acquire+0x60/0x88 [<802950a8>] mutex_lock_nested+0x54/0x31c [<8164539c>] is_world_regdom+0x61c/0xc74 [cfg80211] -> #0 (cfg80211_mutex){+.+...}: [<800a77b8>] __lock_acquire+0x10d4/0x17bc [<800a8384>] lock_acquire+0x60/0x88 [<802950a8>] mutex_lock_nested+0x54/0x31c [<816468a4>] set_regdom+0x78c/0x808 [cfg80211] other info that might help us debug this: Chain exists of: cfg80211_mutex --> reg_mutex#2 --> reg_regdb_search_mutex Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(reg_regdb_search_mutex); lock(reg_mutex#2); lock(reg_regdb_search_mutex); lock(cfg80211_mutex); * DEADLOCK * 3 locks held by kworker/0:1/235: #0: (events){.+.+..}, at: [<80089a00>] process_one_work+0x230/0x460 #1: (reg_regdb_work){+.+...}, at: [<80089a00>] process_one_work+0x230/0x460 #2: (reg_regdb_search_mutex){+.+...}, at: [<81646828>] set_regdom+0x710/0x808 [cfg80211] stack backtrace: Call Trace: [<80290fd4>] dump_stack+0x8/0x34 [<80291bc4>] print_circular_bug+0x2ac/0x2d8 [<800a77b8>] __lock_acquire+0x10d4/0x17bc [<800a8384>] lock_acquire+0x60/0x88 [<802950a8>] mutex_lock_nested+0x54/0x31c [<816468a4>] set_regdom+0x78c/0x808 [cfg80211] Reported-by: Felix Fietkau <nbd@openwrt.org> Tested-by: Felix Fietkau <nbd@openwrt.org> Cc: stable@vger.kernel.org Signed-off-by: Luis R. Rodriguez <mcgrof@do-not-panic.com> Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2012-09-18 20:43:23 -04:00
Vinicius Costa Gomes	78c04c0bf5	Bluetooth: Fix not removing power_off delayed work For example, when a usb reset is received (I could reproduce it running something very similar to this[1] in a loop) it could be that the device is unregistered while the power_off delayed work is still scheduled to run. Backtrace: WARNING: at lib/debugobjects.c:261 debug_print_object+0x7c/0x8d() Hardware name: To Be Filled By O.E.M. ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x26 Modules linked in: nouveau mxm_wmi btusb wmi bluetooth ttm coretemp drm_kms_helper Pid: 2114, comm: usb-reset Not tainted 3.5.0bt-next #2 Call Trace: [<ffffffff8124cc00>] ? free_obj_work+0x57/0x91 [<ffffffff81058f88>] warn_slowpath_common+0x7e/0x97 [<ffffffff81059035>] warn_slowpath_fmt+0x41/0x43 [<ffffffff8124ccb6>] debug_print_object+0x7c/0x8d [<ffffffff8106e3ec>] ? __queue_work+0x259/0x259 [<ffffffff8124d63e>] ? debug_check_no_obj_freed+0x6f/0x1b5 [<ffffffff8124d667>] debug_check_no_obj_freed+0x98/0x1b5 [<ffffffffa00aa031>] ? bt_host_release+0x10/0x1e [bluetooth] [<ffffffff810fc035>] kfree+0x90/0xe6 [<ffffffffa00aa031>] bt_host_release+0x10/0x1e [bluetooth] [<ffffffff812ec2f9>] device_release+0x4a/0x7e [<ffffffff8123ef57>] kobject_release+0x11d/0x154 [<ffffffff8123ed98>] kobject_put+0x4a/0x4f [<ffffffff812ec0d9>] put_device+0x12/0x14 [<ffffffffa009472b>] hci_free_dev+0x22/0x26 [bluetooth] [<ffffffffa0280dd0>] btusb_disconnect+0x96/0x9f [btusb] [<ffffffff813581b4>] usb_unbind_interface+0x57/0x106 [<ffffffff812ef988>] __device_release_driver+0x83/0xd6 [<ffffffff812ef9fb>] device_release_driver+0x20/0x2d [<ffffffff813582a7>] usb_driver_release_interface+0x44/0x7b [<ffffffff81358795>] usb_forced_unbind_intf+0x45/0x4e [<ffffffff8134f959>] usb_reset_device+0xa6/0x12e [<ffffffff8135df86>] usbdev_do_ioctl+0x319/0xe20 [<ffffffff81203244>] ? avc_has_perm_flags+0xc9/0x12e [<ffffffff812031a0>] ? avc_has_perm_flags+0x25/0x12e [<ffffffff81050101>] ? do_page_fault+0x31e/0x3a1 [<ffffffff8135eaa6>] usbdev_ioctl+0x9/0xd [<ffffffff811126b1>] vfs_ioctl+0x21/0x34 [<ffffffff81112f7b>] do_vfs_ioctl+0x408/0x44b [<ffffffff81208d45>] ? file_has_perm+0x76/0x81 [<ffffffff8111300f>] sys_ioctl+0x51/0x76 [<ffffffff8158db22>] system_call_fastpath+0x16/0x1b [1] http://cpansearch.perl.org/src/DPAVLIN/Biblio-RFID-0.03/examples/usbreset.c Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org> Cc: stable@vger.kernel.org Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-09-18 20:13:02 -03:00
Andrei Emeltchenko	aad3d0e343	Bluetooth: Fix freeing uninitialized delayed works When releasing L2CAP socket which is in BT_CONFIG state l2cap_chan_close invokes l2cap_send_disconn_req which cancel delayed works which are only set in BT_CONNECTED state with l2cap_ertm_init. Add state check before cancelling those works. ... [ 9668.574372] [21085] l2cap_sock_release: sock cd065200, sk f073e800 [ 9668.574399] [21085] l2cap_sock_shutdown: sock cd065200, sk f073e800 [ 9668.574411] [21085] l2cap_chan_close: chan f073ec00 state BT_CONFIG sk f073e800 [ 9668.574421] [21085] l2cap_send_disconn_req: chan f073ec00 conn ecc16600 [ 9668.574441] INFO: trying to register non-static key. [ 9668.574443] the code is fine but needs lockdep annotation. [ 9668.574446] turning off the locking correctness validator. [ 9668.574450] Pid: 21085, comm: obex-client Tainted: G O 3.5.0+ #57 [ 9668.574452] Call Trace: [ 9668.574463] [<c10a64b3>] __lock_acquire+0x12e3/0x1700 [ 9668.574468] [<c10a44fb>] ? trace_hardirqs_on+0xb/0x10 [ 9668.574476] [<c15e4f60>] ? printk+0x4d/0x4f [ 9668.574479] [<c10a6e38>] lock_acquire+0x88/0x130 [ 9668.574487] [<c1059740>] ? try_to_del_timer_sync+0x60/0x60 [ 9668.574491] [<c1059790>] del_timer_sync+0x50/0xc0 [ 9668.574495] [<c1059740>] ? try_to_del_timer_sync+0x60/0x60 [ 9668.574515] [<f8aa1c23>] l2cap_send_disconn_req+0xe3/0x160 [bluetooth] ... Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-09-18 20:07:04 -03:00
Andrzej Kaczmarek	562fcc246e	Bluetooth: mgmt: Fix enabling LE while powered off When new BT USB adapter is plugged in it's configured while still being powered off (HCI_AUTO_OFF flag is set), thus Set LE will only set dev_flags but won't write changes to controller. As a result it's not possible to start device discovery session on LE controller as it uses interleaved discovery which requires LE Supported Host flag in extended features. This patch ensures HCI Write LE Host Supported is sent when Set Powered is called to power on controller and clear HCI_AUTO_OFF flag. Signed-off-by: Andrzej Kaczmarek <andrzej.kaczmarek@tieto.com> Cc: stable@vger.kernel.org Acked-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-09-18 20:07:03 -03:00
Andrzej Kaczmarek	3d1cbdd6ae	Bluetooth: mgmt: Fix enabling SSP while powered off When new BT USB adapter is plugged in it's configured while still being powered off (HCI_AUTO_OFF flag is set), thus Set SSP will only set dev_flags but won't write changes to controller. As a result remote devices won't use Secure Simple Pairing with our device due to SSP Host Support flag disabled in extended features and may also reject SSP attempt from our side (with possible fallback to legacy pairing). This patch ensures HCI Write Simple Pairing Mode is sent when Set Powered is called to power on controller and clear HCI_AUTO_OFF flag. Signed-off-by: Andrzej Kaczmarek <andrzej.kaczmarek@tieto.com> Cc: stable@vger.kernel.org Acked-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-09-18 20:07:03 -03:00
Li RongQing	433a195480	xfrm: fix a read lock imbalance in make_blackhole if xfrm_policy_get_afinfo returns 0, it has already released the read lock, xfrm_policy_put_afinfo should not be called again. Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-18 16:30:15 -04:00
Eric Dumazet	1d57f19539	tcp: fix regression in urgent data handling Stephan Springl found that commit `1402d36601` "tcp: introduce tcp_try_coalesce" introduced a regression for rlogin It turns out problem comes from TCP urgent data handling and a change in behavior in input path. rlogin sends two one-byte packets with URG ptr set, and when next data frame is coalesced, we lack sk_data_ready() calls to wakeup consumer. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Stephan Springl <springl-k@lar.bfw.de> Cc: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-18 16:26:27 -04:00
Michael S. Tsirkin	0e698bf662	net: fix memory leak on oom with zerocopy If orphan flags fails, we don't free the skb on receive, which leaks the skb memory. Return value was also wrong: netif_receive_skb is supposed to return NET_RX_DROP, not ENOMEM. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-18 16:24:00 -04:00
Mathias Krause	c254637225	xfrm_user: return error pointer instead of NULL #2 When dump_one_policy() returns an error, e.g. because of a too small buffer to dump the whole xfrm policy, xfrm_policy_netlink() returns NULL instead of an error pointer. But its caller expects an error pointer and therefore continues to operate on a NULL skbuff. Signed-off-by: Mathias Krause <minipli@googlemail.com> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-18 16:13:46 -04:00
Mathias Krause	864745d291	xfrm_user: return error pointer instead of NULL When dump_one_state() returns an error, e.g. because of a too small buffer to dump the whole xfrm state, xfrm_state_netlink() returns NULL instead of an error pointer. But its callers expect an error pointer and therefore continue to operate on a NULL skbuff. This could lead to a privilege escalation (execution of user code in kernel context) if the attacker has CAP_NET_ADMIN and is able to map address 0. Signed-off-by: Mathias Krause <minipli@googlemail.com> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-18 16:13:45 -04:00
Peter Senna Tschudin	adccff34de	net/tipc/name_table.c: Remove unecessary semicolon Found by http://coccinelle.lip6.fr/ Signed-off-by: Peter Senna Tschudin <peter.senna@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-18 16:08:19 -04:00
Peter Senna Tschudin	a2bf91b5b8	net/openvswitch/vport.c: Remove unecessary semicolon Found by http://coccinelle.lip6.fr/ Signed-off-by: Peter Senna Tschudin <peter.senna@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-18 16:08:19 -04:00
Peter Senna Tschudin	4c835019a6	net/ieee802154/6lowpan.c: Remove unecessary semicolon Found by http://coccinelle.lip6.fr/ Signed-off-by: Peter Senna Tschudin <peter.senna@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-18 16:08:19 -04:00
Nicolas Dichtel	2c20cbd7e3	ipv6: use DST_* macro to set obselete field Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-18 15:57:04 -04:00
Nicolas Dichtel	6f3118b571	ipv6: use net->rt_genid to check dst validity IPv6 dst should take care of rt_genid too. When a xfrm policy is inserted or deleted, all dst should be invalidated. To force the validation, dst entries should be created with ->obsolete set to DST_OBSOLETE_FORCE_CHK. This was already the case for all functions calling ip6_dst_alloc(), except for ip6_rt_copy(). As a consequence, we can remove the specific code in inet6_connection_sock. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-18 15:57:03 -04:00
Nicolas Dichtel	ee8372dd19	xfrm: invalidate dst on policy insertion/deletion When a policy is inserted or deleted, all dst should be recalculated. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-18 15:57:03 -04:00
Nicolas Dichtel	b42664f898	netns: move net->ipv4.rt_genid to net->rt_genid This commit prepares the use of rt_genid by both IPv4 and IPv6. Initialization is left in IPv4 part. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-18 15:57:03 -04:00
Eric Dumazet	2885da7296	net: rt_cache_flush() cleanup We dont use jhash anymore since route cache removal, so we can get rid of get_random_bytes() calls for rt_genid changes. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-18 15:54:19 -04:00
Nicolas Dichtel	bafa6d9d89	ipv4/route: arg delay is useless in rt_cache_flush() Since route cache deletion (`89aef8921b`), delay is no more used. Remove it. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-18 15:44:34 -04:00
Pandiyarajan Pitchaimuthu	ed44a951c7	cfg80211/nl80211: Notify connection request failure in AP mode In AP mode, when a station requests connection to an AP and if the request is failed for particular reason, userspace is notified about the failure through NL80211_CMD_CONN_FAILED command. Reason for the failure is sent through the attribute NL80211_ATTR_CONN_FAILED_REASON. Signed-off-by: Pandiyarajan Pitchaimuthu <c_ppitch@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-18 19:54:06 +02:00
Alan Cox	f3baed51f4	wireless: remove unreachable code The only case where intersected_rd can become non NULL is within an if. All paths from that if return, so the end chunk has therefore squawked its last and is no more. Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-18 19:54:05 +02:00
Eric W. Biederman	e1760bd5ff	userns: Convert the audit loginuid to be a kuid Always store audit loginuids in type kuid_t. Print loginuids by converting them into uids in the appropriate user namespace, and then printing the resulting uid. Modify audit_get_loginuid to return a kuid_t. Modify audit_set_loginuid to take a kuid_t. Modify /proc/<pid>/loginuid on read to convert the loginuid into the user namespace of the opener of the file. Modify /proc/<pid>/loginud on write to convert the loginuid rom the user namespace of the opener of the file. Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Eric Paris <eparis@redhat.com> Cc: Paul Moore <paul@paul-moore.com> ? Cc: David Miller <davem@davemloft.net> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-17 18:08:54 -07:00
Simon Derr	584a8c13d5	9P: Fix race in p9_write_work() See previous commit about p9_read_work() for details. This fixes a similar race between p9_write_work() and p9_poll_mux() Signed-off-by: Simon Derr <simon.derr@bull.net> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2012-09-17 14:54:11 -05:00
Simon Derr	1957b3a86f	9P: fix test at the end of p9_write_work() At the end of p9_write_work() we want to test if there is still data to send. This means: - either the current request still has data to send (wsize != 0) - or there are requests in the unsent queue Signed-off-by: Simon Derr <simon.derr@bull.net> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2012-09-17 14:54:11 -05:00
Simon Derr	0462194d35	9P: Fix race in p9_read_work() Race scenario between p9_read_work() and p9_poll_mux() Data arrive, Rworksched is set, p9_read_work() is called. thread A thread B p9_read_work() . reads data . checks if new data ready. No. . gets preempted . More data arrive, p9_poll_mux() is called. . . . p9_poll_mux() . . if (!test_and_set_bit(Rworksched, . &m->wsched)) { . schedule_work(&m->rq); . } . . -> does not schedule work because . Rworksched is set . . clear_bit(Rworksched, &m->wsched); return; No work has been scheduled, and yet data are waiting. Currently p9_read_work() checks if there is data to read, and if not, it clears Rworksched. I think it should clear Rworksched first, and then check if there is data to read. Signed-off-by: Simon Derr <simon.derr@bull.net> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2012-09-17 14:54:11 -05:00
David S. Miller	b4516a288e	llc: Remove stray reference to sysctl_llc_station_ack_timeout. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-17 13:13:24 -04:00
Ben Hutchings	12ebc8b9af	llc2: Collapse remainder of state machine into simple if-else if-statement Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-17 13:04:19 -04:00
Ben Hutchings	da31888018	llc2: Remove explicit indexing of state action arrays These arrays are accessed by iteration in llc_exec_station_trans_actions(). There must not be any zero-filled gaps in them, so the explicit indices are pointless. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-17 13:04:19 -04:00
Ben Hutchings	5ecf9eea26	llc2: Remove the station send queue We only ever put one skb on the send queue, and then immediately send it. Remove the queue and call dev_queue_xmit() directly. This leaves struct llc_station empty, so remove that as well. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-17 13:04:18 -04:00
Ben Hutchings	04d191c259	llc2: Collapse the station event receive path We only ever put one skb on the event queue, and then immediately process it. Remove the queue and fold together the related functions, removing several blatantly false comments. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-17 13:04:18 -04:00
Ben Hutchings	025e363325	llc2: Remove dead code for state machine The initial state is UP and there is no way to enter the other states as the required event type is never generated. Delete all states, event types, and other dead code. The only thing left is handling of the XID and TEST commands. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-17 13:04:18 -04:00
Ben Hutchings	cc6328dfe4	llc2: Remove pointless indirection through llc_stat_state_trans_end Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-17 13:04:18 -04:00
Alan Cox	e04dae8408	af_unix: old_cred is surplus Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-17 13:00:13 -04:00
Joe Perches	666f355f38	device and dynamic_debug: Use dev_vprintk_emit and dev_printk_emit Convert direct calls of vprintk_emit and printk_emit to the dev_ equivalents. Make create_syslog_header static. Signed-off-by: Joe Perches <joe@perches.com> Acked-by: David S. Miller <davem@davemloft.net> Tested-by: Jim Cromie <jim.cromie@gmail.com> Acked-by: Jason Baron <jbaron@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-09-17 06:10:05 -07:00
Joe Perches	c2c5a7051c	netdev_printk/netif_printk: Remove a superfluous logging colon netdev_printk originally called dev_printk with %pV. This style emitted the complete dev_printk header with a colon followed by the netdev_name prefix followed by a colon. Now that netdev_printk does not call dev_printk, the extra colon is superfluous. Remove it. Example: old: sky2 0000:02:00.0: eth0: Link is up at 100 Mbps, full duplex, flow control both new: sky2 0000:02:00.0 eth0: Link is up at 100 Mbps, full duplex, flow control both Signed-off-by: Joe Perches <joe@perches.com> Acked-by: David S. Miller <davem@davemloft.net> Tested-by: Jim Cromie <jim.cromie@gmail.com> Acked-by: Jason Baron <jbaron@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-09-17 06:10:05 -07:00
Joe Perches	b004ff4972	netdev_printk/dynamic_netdev_dbg: Directly call printk_emit A lot of stack is used in recursive printks with %pV. Using multiple levels of %pV (a logging function with %pV that calls another logging function with %pV) can consume more stack than necessary. Avoid excessive stack use by not calling dev_printk from netdev_printk and dynamic_netdev_dbg. Duplicate the logic and form of dev_printk instead. Make __netdev_printk static. Remove EXPORT_SYMBOL(__netdev_printk) Whitespace and brace style neatening. Signed-off-by: Joe Perches <joe@perches.com> Acked-by: David S. Miller <davem@davemloft.net> Tested-by: Jim Cromie <jim.cromie@gmail.com> Acked-by: Jason Baron <jbaron@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-09-17 06:08:30 -07:00
David S. Miller	ba01dfe182	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next John W. Linville says: ==================== This is another batch of updates intended for the 3.7 stream. There are not a lot of large items, but iwlwifi, mwifiex, rt2x00, ath9k, and brcmfmac all get some attention. Wei Yongjun also provides a series of small maintenance fixes. This also includes a pull of the wireless tree in order to satisfy some prerequisites for later patches. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-17 00:57:32 -04:00
Greg Kroah-Hartman	7ac3c93e5d	Merge 3.6-rc6 into tty-next This pulls in the fixes in 3.6-rc6 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-09-16 17:31:36 -07:00
David S. Miller	b48b63a1f6	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: net/netfilter/nfnetlink_log.c net/netfilter/xt_LOG.c Rather easy conflict resolution, the 'net' tree had bug fixes to make sure we checked if a socket is a time-wait one or not and elide the logging code if so. Whereas on the 'net-next' side we are calculating the UID and GID from the creds using different interfaces due to the user namespace changes from Eric Biederman. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-15 11:43:53 -04:00
Linus Torvalds	a1362d504e	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Use after free and new device IDs in bluetooth from Andre Guedes, Yevgeniy Melnichuk, Gustavo Padovan, and Henrik Rydberg. 2) Fix crashes with short packet lengths and VLAN in pktgen, from Nishank Trivedi. 3) mISDN calls flush_work_sync() with locks held, fix from Karsten Keil. 4) Packet scheduler gred parameters are reported to userspace improperly scaled, and WRED idling is not performed correctly. All from David Ward. 5) Fix TCP socket refcount problem in ipv6, from Julian Anastasov. 6) ibmveth device has RX queue alignment requirements which are not being explicitly met resulting in sporadic failures, fix from Santiago Leon. 7) Netfilter needs to take care when interpreting sockets attached to socket buffers, they could be time-wait minisockets. Fix from Eric Dumazet. 8) sock_edemux() has the same issue as netfilter did in #7 above, fix from Eric Dumazet. 9) Avoid infinite loops in CBQ scheduler with some configurations, from Eric Dumazet. 10) Deal with "Reflection scan: an Off-Path Attack on TCP", from Jozsef Kadlecsik. 11) SCTP overcharges socket for TX packets, fix from Thomas Graf. 12) CODEL packet scheduler should not reset it's state every time it builds a new flow, fix from Eric Dumazet. 13) Fix memory leak in nl80211, from Wei Yongjun. 14) NETROM doesn't check skb_copy_datagram_iovec() return values, from Alan Cox. 15) l2tp ethernet was using sizeof(ETH_HLEN) instead of plain ETH_HLEN, oops. From Eric Dumazet. 16) Fix selection of ath9k chips on which PA linearization and AM2PM predistoration are used, from Felix Fietkau. 17) Flow steering settings in mlx4 driver need to be validated properly, from Hadar Hen Zion. 18) bnx2x doesn't show the correct link duplex setting, from Yaniv Rosner. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (75 commits) pktgen: fix crash with vlan and packet size less than 46 bnx2x: Add missing afex code bnx2x: fix registers dumped bnx2x: correct advertisement of pause capabilities bnx2x: display the correct duplex value bnx2x: prevent timeouts when using PFC bnx2x: fix stats copying logic bnx2x: Avoid sending multiple statistics queries net: qmi_wwan: call subdriver with control intf only net_sched: gred: actually perform idling in WRED mode net_sched: gred: fix qave reporting via netlink net_sched: gred: eliminate redundant DP prio comparisons net_sched: gred: correct comment about qavg calculation in RIO mode mISDN: Fix wrong usage of flush_work_sync while holding locks netfilter: log: Fix log-level processing net-sched: sch_cbq: avoid infinite loop net: qmi_wwan: fix Gobi device probing for un2430 net: fix net/core/sock.c build error ixp4xx_hss: fix build failure due to missing linux/module.h inclusion caif: move the dereference below the NULL test ...	2012-09-14 15:34:07 -07:00
Tejun Heo	8c7f6edbda	cgroup: mark subsystems with broken hierarchy support and whine if cgroups are nested for them Currently, cgroup hierarchy support is a mess. cpu related subsystems behave correctly - configuration, accounting and control on a parent properly cover its children. blkio and freezer completely ignore hierarchy and treat all cgroups as if they're directly under the root cgroup. Others show yet different behaviors. These differing interpretations of cgroup hierarchy make using cgroup confusing and it impossible to co-mount controllers into the same hierarchy and obtain sane behavior. Eventually, we want full hierarchy support from all subsystems and probably a unified hierarchy. Users using separate hierarchies expecting completely different behaviors depending on the mounted subsystem is deterimental to making any progress on this front. This patch adds cgroup_subsys.broken_hierarchy and sets it to %true for controllers which are lacking in hierarchy support. The goal of this patch is two-fold. * Move users away from using hierarchy on currently non-hierarchical subsystems, so that implementing proper hierarchy support on those doesn't surprise them. * Keep track of which controllers are broken how and nudge the subsystems to implement proper hierarchy support. For now, start with a single warning message. We can whine louder later on. v2: Fixed a typo spotted by Michal. Warning message updated. v3: Updated memcg part so that it doesn't generate warning in the cases where .use_hierarchy=false doesn't make the behavior different from root.use_hierarchy=true. Fixed a typo spotted by Glauber. v4: Check ->broken_hierarchy after cgroup creation is complete so that ->create() can affect the result per Michal. Dropped unnecessary memcg root handling per Michal. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Michal Hocko <mhocko@suse.cz> Acked-by: Li Zefan <lizefan@huawei.com> Acked-by: Serge E. Hallyn <serue@us.ibm.com> Cc: Glauber Costa <glommer@parallels.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Turner <pjt@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Thomas Graf <tgraf@suug.ch> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>	2012-09-14 12:01:16 -07:00
John W. Linville	9316f0e3c6	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem	2012-09-14 13:53:49 -04:00
Daniel Wagner	8a8e04df47	cgroup: Assign subsystem IDs during compile time WARNING: With this change it is impossible to load external built controllers anymore. In case where CONFIG_NETPRIO_CGROUP=m and CONFIG_NET_CLS_CGROUP=m is set, corresponding subsys_id should also be a constant. Up to now, net_prio_subsys_id and net_cls_subsys_id would be of the type int and the value would be assigned during runtime. By switching the macro definition IS_SUBSYS_ENABLED from IS_BUILTIN to IS_ENABLED, all _subsys_id will have constant value. That means we need to remove all the code which assumes a value can be assigned to net_prio_subsys_id and net_cls_subsys_id. A close look is necessary on the RCU part which was introduces by following patch: commit `f845172531` Author: Herbert Xu <herbert@gondor.apana.org.au> Mon May 24 09:12:34 2010 Committer: David S. Miller <davem@davemloft.net> Mon May 24 09:12:34 2010 cls_cgroup: Store classid in struct sock Tis code was added to init_cgroup_cls() / We can't use rcu_assign_pointer because this is an int. */ smp_wmb(); net_cls_subsys_id = net_cls_subsys.subsys_id; respectively to exit_cgroup_cls() net_cls_subsys_id = -1; synchronize_rcu(); and in module version of task_cls_classid() rcu_read_lock(); id = rcu_dereference(net_cls_subsys_id); if (id >= 0) classid = container_of(task_subsys_state(p, id), struct cgroup_cls_state, css)->classid; rcu_read_unlock(); Without an explicit explaination why the RCU part is needed. (The rcu_deference was fixed by exchanging it to rcu_derefence_index_check() in a later commit, but that is a minor detail.) So here is my pondering why it was introduced and why it safe to remove it now. Note that this code was copied over to net_prio the reasoning holds for that subsystem too. The idea behind the RCU use for net_cls_subsys_id is to make sure we get a valid pointer back from task_subsys_state(). task_subsys_state() is just blindly accessing the subsys array and returning the pointer. Obviously, passing in -1 as id into task_subsys_state() returns an invalid value (out of lower bound). So this code makes sure that only after module is loaded and the subsystem registered, the id is assigned. Before unregistering the module all old readers must have left the critical section. This is done by assigning -1 to the id and issuing a synchronized_rcu(). Any new readers wont call task_subsys_state() anymore and therefore it is safe to unregister the subsystem. The new code relies on the same trick, but it looks at the subsys pointer return by task_subsys_state() (remember the id is constant and therefore we allways have a valid index into the subsys array). No precautions need to be taken during module loading module. Eventually, all CPUs will get a valid pointer back from task_subsys_state() because rebind_subsystem() which is called after the module init() function will assigned subsys[net_cls_subsys_id] the newly loaded module subsystem pointer. When the subsystem is about to be removed, rebind_subsystem() will called before the module exit() function. In this case, rebind_subsys() will assign subsys[net_cls_subsys_id] a NULL pointer and then it calls synchronize_rcu(). All old readers have left by then the critical section. Any new reader wont access the subsystem anymore. At this point we are safe to unregister the subsystem. No synchronize_rcu() call is needed. Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizefan@huawei.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Eric Dumazet <edumazet@google.com> Cc: Gao feng <gaofeng@cn.fujitsu.com> Cc: Glauber Costa <glommer@parallels.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: John Fastabend <john.r.fastabend@intel.com> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: netdev@vger.kernel.org Cc: cgroups@vger.kernel.org	2012-09-14 09:57:43 -07:00
Daniel Wagner	51e4e7faba	cgroup: net_prio: Do not define task_netpioidx() when not selected task_netprioidx() should not be defined in case the configuration is CONFIG_NETPRIO_CGROUP=n. The reason is that in a following patch the net_prio_subsys_id will only be defined if CONFIG_NETPRIO_CGROUP!=n. When net_prio is not built at all any callee should only get an empty task_netprioidx() without any references to net_prio_subsys_id. Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizefan@huawei.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Cc: Gao feng <gaofeng@cn.fujitsu.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: John Fastabend <john.r.fastabend@intel.com> Cc: netdev@vger.kernel.org Cc: cgroups@vger.kernel.org	2012-09-14 09:57:28 -07:00
Daniel Wagner	8fb974c937	cgroup: net_cls: Do not define task_cls_classid() when not selected task_cls_classid() should not be defined in case the configuration is CONFIG_NET_CLS_CGROUP=n. The reason is that in a following patch the net_cls_subsys_id will only be defined if CONFIG_NET_CLS_CGROUP!=n. When net_cls is not built at all a callee should only get an empty task_cls_classid() without any references to net_cls_subsys_id. Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizefan@huawei.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Cc: Gao feng <gaofeng@cn.fujitsu.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: John Fastabend <john.r.fastabend@intel.com> Cc: netdev@vger.kernel.org Cc: cgroups@vger.kernel.org	2012-09-14 09:57:25 -07:00
Chun-Yeow Yeoh	9385d04f28	mac80211: allow re-open the blocked peer link in mesh Peer link which is blocked using the "iw mesh0 station set <MAC addr> plink_action block" is previously not able to re-open using "iw mesh0 station set <MAC addr> plink_action open". This patch is intended to solve this. If the station plink state remains at OPN_SNT once open, try block and open again should solve this problem. Signed-off-by: Chun-Yeow Yeoh <yeohchunyeow@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-14 14:25:16 +02:00

... 3 4 5 6 7 ...

25426 Commits