linux

Commit Graph

Author	SHA1	Message	Date
Sasha Levin	84768edbb2	net: l2tp: unlock socket lock before returning from l2tp_ip_sendmsg l2tp_ip_sendmsg could return without releasing socket lock, making it all the way to userspace, and generating the following warning: [ 130.891594] ================================================ [ 130.894569] [ BUG: lock held when returning to user space! ] [ 130.897257] 3.4.0-rc5-next-20120501-sasha #104 Tainted: G W [ 130.900336] ------------------------------------------------ [ 130.902996] trinity/8384 is leaving the kernel with locks still held! [ 130.906106] 1 lock held by trinity/8384: [ 130.907924] #0: (sk_lock-AF_INET){+.+.+.}, at: [<ffffffff82b9503f>] l2tp_ip_sendmsg+0x2f/0x550 Introduced by commit `2f16270` ("l2tp: Fix locking in l2tp_ip.c"). Signed-off-by: Sasha Levin <levinsasha928@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-02 21:04:33 -04:00
Neil Horman	4fdcfa1284	drop_monitor: prevent init path from scheduling on the wrong cpu I just noticed after some recent updates, that the init path for the drop monitor protocol has a minor error. drop monitor maintains a per cpu structure, that gets initialized from a single cpu. Normally this is fine, as the protocol isn't in use yet, but I recently made a change that causes a failed skb allocation to reschedule itself. Given the current code, the implication is that this workqueue reschedule will take place on the wrong cpu. If drop monitor is used early during the boot process, it's possible that two cpus will access a single per-cpu structure in parallel, possibly leading to data corruption. This patch fixes the situation, by storing the cpu number that a given instance of this per-cpu data should be accessed from. In the case of a need for a reschedule, the cpu stored in the struct is assigned the reschedule, rather than the currently executing cpu. Tested successfully by myself. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: David Miller <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-02 21:02:48 -04:00
Yuchung Cheng	750ea2bafa	tcp: early retransmit: delayed fast retransmit Implementing the advanced early retransmit (sysctl_tcp_early_retrans==2). Delays the fast retransmit by an interval of RTT/4. We borrow the RTO timer to implement the delay. If we receive another ACK or send a new packet, the timer is cancelled and restored to original RTO value offset by time elapsed. When the delayed-ER timer fires, we enter fast recovery and perform fast retransmit. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-02 20:56:10 -04:00
Yuchung Cheng	eed530b6c6	tcp: early retransmit This patch implements RFC 5827 early retransmit (ER) for TCP. It reduces DUPACK threshold (dupthresh) if outstanding packets are less than 4 to recover losses by fast recovery instead of timeout. While the algorithm is simple, small but frequent network reordering makes this feature dangerous: the connection repeatedly enters false recovery and degrades performance. Therefore we implement a mitigation suggested in the appendix of the RFC that delays entering fast recovery by a small interval, i.e., RTT/4. Currently ER is conservative and is disabled for the rest of the connection after the first reordering event. A large scale web server experiment on the performance impact of ER is summarized in section 6 of the paper "Proportional Rate Reduction for TCP", IMC 2011. http://conferences.sigcomm.org/imc/2011/docs/p155.pdf Note that Linux has a similar feature called THIN_DUPACK. The differences are that THIN_DUPACK does not mitigate reorderings and is only used after slow start. Currently ER is disabled if THIN_DUPACK is enabled. I would be happy to merge THIN_DUPACK feature with ER if people think it's a good idea. ER is enabled by sysctl_tcp_early_retrans: 0: Disables ER 1: Reduce dupthresh to packets_out - 1 when outstanding packets < 4. 2: (Default) reduce dupthresh like mode 1. In addition, delay entering fast recovery by RTT/4. Note: mode 2 is implemented in the third part of this patch series. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-02 20:56:10 -04:00
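The sysctl modes listed in the early retransmit entry above can be sketched in a few lines of C. This is an illustrative userspace sketch under stated assumptions, not the kernel implementation: the helper names `er_dupthresh` and `er_delay_us`, the default threshold of 3 duplicate ACKs, and the lower bound of two outstanding packets are all assumptions made for the example.

```c
#include <assert.h>

/* Assumption: the conventional fast-retransmit threshold of 3 DUPACKs. */
#define DEFAULT_DUPTHRESH 3u

/*
 * Hypothetical helper: DUPACK threshold for a given ER mode
 * (0 = ER off, 1 = reduced threshold, 2 = reduced threshold plus delay).
 */
static unsigned int er_dupthresh(unsigned int mode, unsigned int packets_out)
{
    assert(mode <= 2);
    /*
     * ER only applies when fewer than 4 packets are outstanding.
     * Assumption: at least 2 are required so the threshold stays >= 1.
     */
    if (mode == 0 || packets_out >= 4 || packets_out < 2)
        return DEFAULT_DUPTHRESH;
    return packets_out - 1;
}

/* Hypothetical helper: extra delay before entering fast recovery (mode 2 only). */
static unsigned int er_delay_us(unsigned int mode, unsigned int rtt_us)
{
    return mode == 2 ? rtt_us / 4 : 0;
}
```

With three packets outstanding, modes 1 and 2 both lower the threshold to 2; only mode 2 additionally waits RTT/4 before entering fast recovery.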
Yuchung Cheng	1fbc340514	tcp: early retransmit: tcp_enter_recovery() This is a preparation patch that refactors the code to enter recovery into a new function tcp_enter_recovery(). It's needed to implement the delayed fast retransmit in ER. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-02 20:56:09 -04:00
John W. Linville	076e7779c0	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem	2012-05-01 14:14:05 -04:00
Eric Dumazet	116a0fc31c	netem: fix possible skb leak skb_checksum_help(skb) can return an error, we must free skb in this case. qdisc_drop(skb, sch) can also be fed a NULL skb (if skb_unshare() failed), so let's use this generic helper. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-01 13:40:48 -04:00
Eric Dumazet	e4ae004b84	netem: add ECN capability Add ECN (Explicit Congestion Notification) marking capability to netem tc qdisc add dev eth0 root netem drop 0.5 ecn Instead of dropping packets, try to ECN mark them. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Hagen Paul Pfeifer <hagen@jauu.net> Cc: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-01 09:39:48 -04:00
Eric Dumazet	e4cbb02a10	net: add a prefetch in socket backlog processing TCP or UDP stacks have big enough latencies that prefetching next pointer is worth it. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-01 09:39:48 -04:00
James Chapman	5dac94e109	l2tp: let iproute2 create L2TPv3 IP tunnels using IPv6 The netlink API lets users create unmanaged L2TPv3 tunnels using iproute2. Until now, a request to create an unmanaged L2TPv3 IP encapsulation tunnel over IPv6 would be rejected with EPROTONOSUPPORT. Now that l2tp_ip6 implements sockets for L2TP IP encapsulation over IPv6, we can add support for that tunnel type. Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-01 09:30:55 -04:00
Chris Elston	a32e0eec70	l2tp: introduce L2TPv3 IP encapsulation support for IPv6 L2TPv3 defines an IP encapsulation packet format where data is carried directly over IP (no UDP). The kernel already has support for L2TP IP encapsulation over IPv4 (l2tp_ip). This patch introduces support for L2TP IP encapsulation over IPv6. The implementation is derived from ipv6/raw and ipv4/l2tp_ip. Signed-off-by: Chris Elston <celston@katalix.com> Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-01 09:30:55 -04:00
Chris Elston	a495f8364e	ipv6: Export ipv6 functions for use by other protocols For implementing other protocols on top of IPv6, such as L2TPv3's IP encapsulation over ipv6, we'd like to call some IPv6 functions which are not currently exported. This patch exports them. Signed-off-by: Chris Elston <celston@katalix.com> Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-01 09:30:55 -04:00
Chris Elston	f9bac8df90	l2tp: netlink api for l2tpv3 ipv6 unmanaged tunnels This patch adds support for unmanaged L2TPv3 tunnels over IPv6 using the netlink API. We already support unmanaged L2TPv3 tunnels over IPv4. A patch to iproute2 to make use of this feature will be submitted separately. Signed-off-by: Chris Elston <celston@katalix.com> Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-01 09:30:55 -04:00
Chris Elston	2121c3f571	l2tp: show IPv6 addresses in l2tp debugfs file If an L2TP tunnel uses IPv6, make sure the l2tp debugfs file shows the IPv6 address correctly. Signed-off-by: Chris Elston <celston@katalix.com> Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-01 09:30:55 -04:00
James Chapman	b79585f537	l2tp: pppol2tp_connect() handles ipv6 sockaddr variants Userspace uses connect() to associate a pppol2tp socket with a tunnel socket. This needs to allow the caller to supply the new IPv6 sockaddr_pppol2tp structures if IPv6 is used. Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-01 09:30:55 -04:00
James Chapman	c8657fd50a	l2tp: remove unused stats from l2tp_ip socket The l2tp_ip socket currently maintains packet/byte stats in its private socket structure. But these counters aren't exposed to userspace and so serve no purpose. The counters were also smp-unsafe. So this patch just gets rid of the stats. While here, change a couple of internal __u32 variables to u32. Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-01 09:30:54 -04:00
James Chapman	de3c7a1827	l2tp: Use ip4_datagram_connect() in l2tp_ip_connect() Cleanup the l2tp_ip code to make use of an existing ipv4 support function. Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-01 09:30:54 -04:00
James Chapman	5de7aee541	l2tp: fix locking of 64-bit counters for smp L2TP uses 64-bit counters but since these are not updated atomically, we need to make them safe for smp. This patch addresses that. Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-01 09:30:54 -04:00
David S. Miller	b6d151bb82	Merge branch 'tipc_net-next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux	2012-04-30 21:42:30 -04:00
Eric Dumazet	1d0c0b328a	net: makes skb_splice_bits() aware of skb->head_frag __skb_splice_bits() can check if skb to be spliced has its skb->head mapped to a page fragment, instead of a kmalloc() area. If so we can avoid a copy of the skb head and get a reference on underlying page. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Maciej Żenczykowski <maze@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: Ben Hutchings <bhutchings@solarflare.com> Cc: Matt Carlson <mcarlson@broadcom.com> Cc: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-30 21:35:50 -04:00
Eric Dumazet	329033f645	tcp: makes tcp_try_coalesce aware of skb->head_frag TCP coalesce can check if skb to be merged has its skb->head mapped to a page fragment, instead of a kmalloc() area. We had to disable coalescing in this case, for performance reasons. We 'upgrade' skb->head as a fragment in itself. This reduces number of cache misses when user makes its copies, since a less sk_buff are fetched. This makes receive and ofo queues shorter and thus reduce cache line misses in TCP stack. This is a followup of patch "net: allow skb->head to be a page fragment" Tested with tg3 nic, with GRO on or off. We can see "TCPRcvCoalesce" counter being incremented. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Maciej Żenczykowski <maze@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: Ben Hutchings <bhutchings@solarflare.com> Cc: Matt Carlson <mcarlson@broadcom.com> Cc: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-30 21:35:49 -04:00
Eric Dumazet	d7e8883cfc	net: make GRO aware of skb->head_frag GRO can check if skb to be merged has its skb->head mapped to a page fragment, instead of a kmalloc() area. We 'upgrade' skb->head as a fragment in itself This avoids the frag_list fallback, and permits to build true GRO skb (one sk_buff and up to 16 fragments), using less memory. This reduces number of cache misses when user makes its copy, since a single sk_buff is fetched. This is a followup of patch "net: allow skb->head to be a page fragment" Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Maciej Żenczykowski <maze@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: Ben Hutchings <bhutchings@solarflare.com> Cc: Matt Carlson <mcarlson@broadcom.com> Cc: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-30 21:35:49 -04:00
Eric Dumazet	d3836f21b0	net: allow skb->head to be a page fragment skb->head is currently allocated from kmalloc(). This is convenient but has the drawback the data cannot be converted to a page fragment if needed. We have three spots were it hurts : 1) GRO aggregation When a linear skb must be appended to another skb, GRO uses the frag_list fallback, very inefficient since we keep all struct sk_buff around. So drivers enabling GRO but delivering linear skbs to network stack aren't enabling full GRO power. 2) splice(socket -> pipe). We must copy the linear part to a page fragment. This kind of defeats splice() purpose (zero copy claim) 3) TCP coalescing. Recently introduced, this permits to group several contiguous segments into a single skb. This shortens queue lengths and save kernel memory, and greatly reduce probabilities of TCP collapses. This coalescing doesnt work on linear skbs (or we would need to copy data, this would be too slow) Given all these issues, the following patch introduces the possibility of having skb->head be a fragment in itself. We use a new skb flag, skb->head_frag to carry this information. build_skb() is changed to accept a frag_size argument. Drivers willing to provide a page fragment instead of kmalloc() data will set a non zero value, set to the fragment size. Then, on situations we need to convert the skb head to a frag in itself, we can check if skb->head_frag is set and avoid the copies or various fallbacks we have. This means drivers currently using frags could be updated to avoid the current skb->head allocation and reduce their memory footprint (aka skb truesize). (thats 512 or 1024 bytes saved per skb). This also makes bpf/netfilter faster since the 'first frag' will be part of skb linear part, no need to copy data. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Maciej Żenczykowski <maze@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: Ben Hutchings <bhutchings@solarflare.com> Cc: Matt Carlson <mcarlson@broadcom.com> Cc: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-30 21:35:11 -04:00
Paul Gortmaker	617d3c7a50	tipc: compress out gratuitous extra carriage returns Some of the comment blocks are floating in limbo between two functions, or between blocks of code. Delete the extra line feeds between any comment and its associated following block of code, to be consistent with the majority of the rest of the kernel. Also delete trailing newlines at EOF and fix a couple trivial typos in existing comments. This is a 100% cosmetic change with no runtime impact. We get rid of over 500 lines of non-code, and being blank line deletes, they won't even show up as noise in git blame. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2012-04-30 15:53:56 -04:00
Felix Fietkau	66f2c99af3	mac80211: fix AP mode EAP tx for VLAN stations EAP frames for stations in an AP VLAN are sent on the main AP interface to avoid race conditions wrt. moving stations. For that to work properly, sta_info_get_bss must be used instead of sta_info_get when sending EAP packets. Previously this was only done for cooked monitor injected packets, so this patch adds a check for tx->skb->protocol to the same place. Signed-off-by: Felix Fietkau <nbd@openwrt.org> Cc: stable@vger.kernel.org Signed-off-by: John W. Linville <linville@tuxdriver.com>	2012-04-30 14:40:05 -04:00
Yuchung Cheng	1cebce36d6	tcp: fix infinite cwnd in tcp_complete_cwr() When the cwnd reduction is done, ssthresh may be infinite if TCP enters CWR via ECN or F-RTO. If cwnd is not undone, i.e., undo_marker is set, tcp_complete_cwr() falsely set cwnd to the infinite ssthresh value. The correct operation is to keep cwnd intact because it has been updated in ECN or F-RTO. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-30 13:44:39 -04:00
Herbert Xu	bb63f1f8a0	bridge: Fix fatal typo in setup of multicast_querier_expired Unfortunately it seems that I didn't properly test the case of an expired external querier in the recent multicast bridge series. The setup of the timer in that case is completely broken and leads to a NULL-pointer dereference. This patch fixes it. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-30 13:30:56 -04:00
David S. Miller	5414fc12e3	Merge branch 'master' of git://1984.lsi.us.es/net	2012-04-30 13:23:22 -04:00
David S. Miller	d499bd2ee9	l2tp: Add missing net/net/ip6_checksum.h include. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-30 13:21:28 -04:00
Trond Myklebust	cbbb34498f	SUNRPC: RPC client must use the current utsname hostname string Now that the rpc client is namespace aware, it needs to use the utsname of the process that created it instead of using the init_utsname. Both rpc_new_client and rpc_clone_client need to be fixed. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Stanislav Kinsbursky <skinsbursky@parallels.com>	2012-04-30 11:58:51 -04:00
Pablo Neira Ayuso	6cf5185248	netfilter: xt_CT: fix wrong checking in the timeout assignment path The current checking always succeeded. We have to check the first character of the string to check that it's empty, thus, skipping the timeout path. This fixes the use of the CT target without the timeout option. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-04-30 10:40:36 +02:00
Hans Schillstrom	8537de8a7a	ipvs: kernel oops - do_ip_vs_get_ctl Change order of init so netns init is ready when register ioctl and netlink. Ver2 Whitespace fixes and __init added. Reported-by: "Ryan O'Hara" <rohara@redhat.com> Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>	2012-04-30 10:40:35 +02:00
Hans Schillstrom	582b8e3ead	ipvs: take care of return value from protocol init_netns ip_vs_create_timeout_table() can return NULL All functions protocol init_netns is affected of this patch. Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2012-04-30 10:40:35 +02:00
Hans Schillstrom	4b984cd50b	ipvs: null check of net->ipvs in lblc(r) schedulers Avoid crash when registering schedulers after the IPVS core initialization for netns fails. Do this by checking for present core (net->ipvs). Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2012-04-30 10:40:14 +02:00
Benjamin LaHaise	d2cf336167	net/l2tp: add support for L2TP over IPv6 UDP Now that encap_rcv() works on IPv6 UDP sockets, wire L2TP up to IPv6. Support has been tested with and without hardware offloading. This version fixes the L2TP over localhost issue with incorrect checksums being reported. Signed-off-by: Benjamin LaHaise <bcrl@kvack.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-28 22:21:51 -04:00
Benjamin LaHaise	d7f3f62167	net/ipv6/udp: UDP encapsulation: introduce encap_rcv hook into IPv6 Now that the semantics of udpv6_queue_rcv_skb() match IPv4's udp_queue_rcv_skb(), introduce the UDP encap_rcv() hook for IPv6. Signed-off-by: Benjamin LaHaise <bcrl@kvack.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-28 22:21:51 -04:00
Benjamin LaHaise	cb80ef463d	net/ipv6/udp: UDP encapsulation: move socket locking into udpv6_queue_rcv_skb() In order to make sure that when the encap_rcv() hook is introduced it is not called with the socket lock held, move socket locking from callers into udpv6_queue_rcv_skb(), matching what happens in IPv4. Signed-off-by: Benjamin LaHaise <bcrl@kvack.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-28 22:21:51 -04:00
Benjamin LaHaise	f7ad74fef3	net/ipv6/udp: UDP encapsulation: break backlog_rcv into __udpv6_queue_rcv_skb This is the first step in reworking the IPv6 UDP code to be structured more like the IPv4 UDP code. This patch creates __udpv6_queue_rcv_skb() with the equivalent semantics to __udp_queue_rcv_skb(), and wires it up to the backlog_rcv method. Signed-off-by: Benjamin LaHaise <bcrl@kvack.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-28 22:21:50 -04:00
Jeffrin Jose	cb75a36c8a	net: Fixed a coding style issue related to spaces. Fixed a coding style issue relating to spaces in net/core/sock.c Signed-off-by: Jeffrin Jose <ahiliation@yahoo.co.in> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-28 21:45:00 -04:00
Neil Horman	3885ca785a	drop_monitor: Make updating data->skb smp safe Eric Dumazet pointed out to me that the drop_monitor protocol has some holes in its smp protections. Specifically, its possible to replace data->skb while its being written. This patch corrects that by making data->skb an rcu protected variable. That will prevent it from being overwritten while a tracepoint is modifying it. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Reported-by: Eric Dumazet <eric.dumazet@gmail.com> CC: David Miller <davem@davemloft.net> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-28 02:18:48 -04:00
Neil Horman	cde2e9a651	drop_monitor: fix sleeping in invalid context warning Eric Dumazet pointed out this warning in the drop_monitor protocol to me: [ 38.352571] BUG: sleeping function called from invalid context at kernel/mutex.c:85 [ 38.352576] in_atomic(): 1, irqs_disabled(): 0, pid: 4415, name: dropwatch [ 38.352580] Pid: 4415, comm: dropwatch Not tainted 3.4.0-rc2+ #71 [ 38.352582] Call Trace: [ 38.352592] [<ffffffff8153aaf0>] ? trace_napi_poll_hit+0xd0/0xd0 [ 38.352599] [<ffffffff81063f2a>] __might_sleep+0xca/0xf0 [ 38.352606] [<ffffffff81655b16>] mutex_lock+0x26/0x50 [ 38.352610] [<ffffffff8153aaf0>] ? trace_napi_poll_hit+0xd0/0xd0 [ 38.352616] [<ffffffff810b72d9>] tracepoint_probe_register+0x29/0x90 [ 38.352621] [<ffffffff8153a585>] set_all_monitor_traces+0x105/0x170 [ 38.352625] [<ffffffff8153a8ca>] net_dm_cmd_trace+0x2a/0x40 [ 38.352630] [<ffffffff8154a81a>] genl_rcv_msg+0x21a/0x2b0 [ 38.352636] [<ffffffff810f8029>] ? zone_statistics+0x99/0xc0 [ 38.352640] [<ffffffff8154a600>] ? genl_rcv+0x30/0x30 [ 38.352645] [<ffffffff8154a059>] netlink_rcv_skb+0xa9/0xd0 [ 38.352649] [<ffffffff8154a5f0>] genl_rcv+0x20/0x30 [ 38.352653] [<ffffffff81549a7e>] netlink_unicast+0x1ae/0x1f0 [ 38.352658] [<ffffffff81549d76>] netlink_sendmsg+0x2b6/0x310 [ 38.352663] [<ffffffff8150824f>] sock_sendmsg+0x10f/0x130 [ 38.352668] [<ffffffff8150abe0>] ? move_addr_to_kernel+0x60/0xb0 [ 38.352673] [<ffffffff81515f04>] ? verify_iovec+0x64/0xe0 [ 38.352677] [<ffffffff81509c46>] __sys_sendmsg+0x386/0x390 [ 38.352682] [<ffffffff810ffaf9>] ? handle_mm_fault+0x139/0x210 [ 38.352687] [<ffffffff8165b5bc>] ? do_page_fault+0x1ec/0x4f0 [ 38.352693] [<ffffffff8106ba4d>] ? set_next_entity+0x9d/0xb0 [ 38.352699] [<ffffffff81310b49>] ? tty_ldisc_deref+0x9/0x10 [ 38.352703] [<ffffffff8106d363>] ? pick_next_task_fair+0x63/0x140 [ 38.352708] [<ffffffff8150b8d4>] sys_sendmsg+0x44/0x80 [ 38.352713] [<ffffffff8165f8e2>] system_call_fastpath+0x16/0x1b It stems from holding a spinlock (trace_state_lock) while attempting to register or unregister tracepoint hooks, making in_atomic() true in this context, leading to the warning when the tracepoint calls might_sleep() while its taking a mutex. Since we only use the trace_state_lock to prevent trace protocol state races, as well as hardware stat list updates on an rcu write side, we can just convert the spinlock to a mutex to avoid this problem. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Reported-by: Eric Dumazet <eric.dumazet@gmail.com> CC: David Miller <davem@davemloft.net> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-28 02:18:48 -04:00
John W. Linville	4dcc0637fc	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth	2012-04-27 15:16:43 -04:00
Stanislav Kinsbursky	ea8cfa0679	SUNRPC: traverse clients tree on PipeFS event v2: recursion was replaced by loop If client is a clone, then it's parent can not be in the list. But parent's Pipefs dentries have to be created and destroyed. Note: event skip helper for clients introduced Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-04-27 14:10:00 -04:00
Stanislav Kinsbursky	37629b572c	SUNRPC: set per-net PipeFS superblock before notification There can be a case, when on MOUNT event RPC client (after its dentries were created) is no longer held by anyone except notification callback. I.e. on release this client will be destroyed. And its dentries have to be destroyed as well. Which in turn requires per-net PipeFS superblock to be set. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-04-27 14:10:00 -04:00
Stanislav Kinsbursky	7aab449e5a	SUNRPC: skip clients with program without PipeFS entries 1) This is sane. 2) Otherwise there will be soft lockup: do { rpc_get_client_for_event (clnt->cl_dentry == NULL ==> choose) __rpc_pipefs_event (clnt->cl_program->pipe_dir_name == NULL ==> return) } while (1) Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-04-27 14:09:59 -04:00
Stanislav Kinsbursky	a4dff1bc49	SUNRPC: skip dead but not buried clients on PipeFS events These clients can't be safely dereferenced if their counter in 0. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-04-27 14:09:59 -04:00
Neal Cardwell	651913ce9d	tcp: clean up use of jiffies in tcp_rcv_rtt_measure() Clean up a reference to jiffies in tcp_rcv_rtt_measure() that should instead reference tcp_time_stamp. Since the result of the subtraction is passed into a function taking u32, this should not change any behavior (and indeed the generated assembly does not change on x86_64). However, it seems worth cleaning this up for consistency and clarity (and perhaps to avoid bugs if this is copied and pasted somewhere else). Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-27 12:34:39 -04:00
Allan Stephens	aad585473f	tipc: Reject payload messages with invalid message type Adds check to ensure TIPC sockets reject incoming payload messages that have an unrecognized message type. Remove the old open question about whether TIPC_ERR_NO_PORT is the proper return value. It is appropriate here since there are valid instances where another node can make use of the reply, and at this point in time the host is already broadcasting TIPC data, so there are no real security concerns. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2012-04-27 10:08:00 -04:00
Eric Dumazet	8298193012	net: cleanups in sock_setsockopt() Use min_t()/max_t() macros, reformat two comments, use !!test_bit() to match !!sock_flag() Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-27 02:14:21 -04:00
hartleys	feb50ac19e	crush: include header for global symbols Include the header to pickup the definitions of the global symbols. Quiets the following sparse warnings: warning: symbol 'crush_find_rule' was not declared. Should it be static? warning: symbol 'crush_do_rule' was not declared. Should it be static? Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Cc: Sage Weil <sage@newdream.net> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-27 00:03:34 -04:00
Eric Dumazet	6746960140	ipv6: RTAX_FEATURE_ALLFRAG causes inefficient TCP segment sizing Quoting Tore Anderson from : https://bugzilla.kernel.org/show_bug.cgi?id=42572 When RTAX_FEATURE_ALLFRAG is set on a route, the effective TCP segment size does not take into account the size of the IPv6 Fragmentation header that needs to be included in outbound packets, causing every transmitted TCP segment to be fragmented across two IPv6 packets, the latter of which will only contain 8 bytes of actual payload. RTAX_FEATURE_ALLFRAG is typically set on a route in response to receving a ICMPv6 Packet Too Big message indicating a Path MTU of less than 1280 bytes. 1280 bytes is the minimum IPv6 MTU, however ICMPv6 PTBs with MTU < 1280 are still valid, in particular when an IPv6 packet is sent to an IPv4 destination through a stateless translator. Any ICMPv4 Need To Fragment packets originated from the IPv4 part of the path will be translated to ICMPv6 PTB which may then indicate an MTU of less than 1280. The Linux kernel refuses to reduce the effective MTU to anything below 1280 bytes, instead it sets it to exactly 1280 bytes, and RTAX_FEATURE_ALLFRAG is also set. However, the TCP segment size appears to be set to 1240 bytes (1280 Path MTU - 40 bytes of IPv6 header), instead of 1232 (additionally taking into account the 8 bytes required by the IPv6 Fragmentation extension header). This in turn results in rather inefficient transmission, as every transmitted TCP segment now is split in two fragments containing 1232+8 bytes of payload. After this patch, all the outgoing packets that includes a Fragmentation header all are "atomic" or "non-fragmented" fragments, i.e., they both have Offset=0 and More Fragments=0. With help from David S. Miller Reported-by: Tore Anderson <tore@fud.no> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Maciej Żenczykowski <maze@google.com> Cc: Tom Herbert <therbert@google.com> Tested-by: Tore Anderson <tore@fud.no> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-27 00:03:34 -04:00
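The segment-size arithmetic quoted in the RTAX_FEATURE_ALLFRAG entry above can be checked with a short C sketch. It only reproduces the numbers from the commit message (1280, 40 and 8 bytes); the helper name `segment_size` and the constant names are hypothetical and do not correspond to kernel symbols.

```c
#include <assert.h>

/* Sizes taken from the commit message; the names are illustrative only. */
#define IPV6_MIN_MTU_BYTES   1280u
#define IPV6_HDR_BYTES         40u
#define IPV6_FRAG_HDR_BYTES     8u

/*
 * Hypothetical helper: bytes left for the TCP segment once the IPv6 header
 * and, if the route has ALLFRAG set, the Fragmentation header are deducted.
 */
static unsigned int segment_size(unsigned int path_mtu, int allfrag)
{
    assert(path_mtu >= IPV6_MIN_MTU_BYTES);
    unsigned int size = path_mtu - IPV6_HDR_BYTES;
    if (allfrag)
        size -= IPV6_FRAG_HDR_BYTES;
    return size;
}
```

For a path MTU of 1280 bytes this gives 1240 without the Fragmentation header and 1232 with it, the value the message says should be used so that each segment fits in one packet.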
Allan Stephens	8f17789693	tipc: Enhance error checking of published names Consolidates validation of scope and name sequence range values into a single routine where it applies both to local name publications and to name publications issued by other nodes in the network. This change means that the scope value for non-local publications is now validated and the name sequence range for local publications is now validated only once. Additionally, a publication attempt that fails validation now creates an entry in the system log file only if debugging capabilities have been enabled; this prevents the system log from being cluttered up with messages caused by a defective application or network node. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2012-04-26 18:15:48 -04:00
Allan Stephens	f7fb9d20ad	tipc: Create helper routine to delete unused name sequence structure Replaces two identical chunks of code that delete an unused name sequence structure from TIPC's name table with calls to a new routine that performs this operation. This change is cosmetic and doesn't impact the operation of TIPC. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2012-04-26 18:15:47 -04:00
Allan Stephens	bbe6a295d0	tipc: remove redundant memset and stale comment from subscr.c Eliminate code to zero-out the main topology service structure, which is already zeroed-out. Get rid of a comment documenting a field of the main topology service structure that no longer exists. Both are cosmetic changes with no impact on runtime behaviour. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2012-04-26 18:15:46 -04:00
Allan Stephens	2d98abb9fe	tipc: Optimize initialization of network topology service Initialization now occurs in the calling thread of control, rather than being deferred to the TIPC tasklet. With the current codebase, the deferral is no longer necessary. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2012-04-26 18:15:45 -04:00
Allan Stephens	eb3865a99d	tipc: Enhance re-initialization of network topology service Streamlines the job of re-initializing TIPC's network topology service when a node's network address is first assigned. Rather than destroying the topology server port and breaking its connections to existing subscribers, TIPC now simply lets the service continue running (since the change to the port identifier of each port used by the topology service no longer impacts the flow of messages between the service and its subscribers). This enhancement means that applications that utilize the topology service prior to the assignment of TIPC's network address no longer need to re-establish their subscriptions when the address is finally assigned. However, it is worth noting that any subsequent events for existing subscriptions report the new port identifier of the publishing port, rather than the original port identifier. (For example, a name that was previously reported as being published by <0.0.0:ref> may be subsequently withdrawn by <Z.C.N:ref>.) This doesn't impact any of the existing known userspace in tipc-utils, since (a) TIPC continues to treat references to the original port ID correctly and (b) normal use cases assign an address before active use. However if there does happen to be some rare/custom application out there that was relying on this, they can simply bypass the enhancement by issuing a subscription to {0,0} and break its connection to the topology service, if an associated withdrawal event occurs. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2012-04-26 18:15:40 -04:00
Allan Stephens	eb323b075a	tipc: Optimize termination of configuration service Termination no longer tests to see if the configuration service port was successfully created or not. In the unlikely event that the port was not created, attempting to delete the non-existent port is detected gracefully and causes no harm. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2012-04-26 17:19:43 -04:00
Allan Stephens	861d3a0e5b	tipc: Optimize initialization of configuration service Initialization now occurs in the calling thread of control, rather than being deferred to the TIPC tasklet. With the current codebase, the deferral is no longer necessary. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2012-04-26 17:19:42 -04:00
Allan Stephens	a2cfd45b52	tipc: Optimize re-initialization of configuration service Streamlines the job of re-initializing TIPC's configuration service when a node's network address is first assigned. Rather than destroying the configuration server port and then recreating it, TIPC now simply withdraws the existing {0,<0.0.0>} name publication and creates a new {0,<Z.C.N>} name publication that identifies the node's network address to interested subscribers. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2012-04-26 17:19:07 -04:00
David S. Miller	a85c9bb895	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next	2012-04-26 15:54:45 -04:00
John W. Linville	d9b8ae6bd8	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem Conflicts: drivers/net/wireless/iwlwifi/iwl-testmode.c	2012-04-26 15:03:48 -04:00
Pavel Emelyanov	de248a75c3	tcp repair: Fix unaligned access when repairing options (v2) Don't pick __u8/__u16 values directly from raw pointers, but instead use an array of structures of code:value pairs. This is OK, since the buffer we take options from is not an skb memory, but a user-to-kernel one. For those options which don't require any value now, require this to be zero (for potential future extension of this API). v2: Changed tcp_repair_opt to use two __u32-s as spotted by David Laight. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-26 06:13:51 -04:00
alex.bluesman.smirnov@gmail.com	2d319508a3	6lowpan: duplicate definition of IEEE802154_ALEN The same macro is already defined in 'include/net/af_ieee802154.h' under the name IEEE802154_ADDR_LEN. There is no need for a second definition, so remove it. Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-26 06:01:09 -04:00
alex.bluesman.smirnov@gmail.com	c2e94d73ea	6lowpan: move frame allocation code to a separate function Separate the frame allocation routine from the data processing function. This makes the code more readable and easier to understand. Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-26 06:01:09 -04:00
alex.bluesman.smirnov@gmail.com	768f7c7c12	6lowpan: add missing spin_lock_init() Add missing spin_lock_init() for frames list lock. Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-26 05:32:55 -04:00
alex.bluesman.smirnov@gmail.com	8deff4af87	6lowpan: clean up fragments list if module unloaded Clean up all pending fragments and their related timers when the 6lowpan link is about to be deleted. Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-26 05:32:55 -04:00
alex.bluesman.smirnov@gmail.com	0848e40430	6lowpan: fix segmentation fault caused by mlme request Add the necessary mlme callbacks to satisfy the "iz list" request from user space. Because a 6lowpan device doesn't have its own phy, mlme is implemented as a pipe to the real phy to which the 6lowpan device is attached. Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-26 05:32:55 -04:00
Julian Anastasov	39f618b4fd	ipvs: reset ipvs pointer in netns Make sure net->ipvs is reset on netns cleanup or failed initialization. It is needed for IPVS applications to know that IPVS core is not loaded in netns. Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Signed-off-by: Simon Horman <horms@verge.net.au>	2012-04-26 15:26:35 +09:00
Julian Anastasov	8d08d71ce5	ipvs: add check in ftp for initialized core Avoid crash when registering ip_vs_ftp after the IPVS core initialization for netns fails. Do this by checking for present core (net->ipvs). Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Signed-off-by: Simon Horman <horms@verge.net.au>	2012-04-26 15:26:35 +09:00
Linus Torvalds	2300fd67b4	NFS client bugfixes for Linux 3.4 Highlights include: - Fix NFSv4 infinite loops on open(O_TRUNC) - Fix an Oops and an infinite loop in the NFSv4 flock code - Don't register the PipeFS filesystem until it has been set up - Fix an Oops in nfs_try_to_update_request - Don't reuse NFSv4 open owners: fixes a bad sequence id storm. Merge tag 'nfs-for-3.4-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull NFS client bugfixes from Trond Myklebust. * tag 'nfs-for-3.4-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFSv4: Keep dropped state owners on the LRU list for a while NFSv4: Ensure that we don't drop a state owner more than once NFSv4: Ensure we do not reuse open owner names nfs: Enclose hostname in brackets when needed in nfs_do_root_mount NFS: put open context on error in nfs_flush_multi NFS: put open context on error in nfs_pagein_multi NFSv4: Fix open(O_TRUNC) and ftruncate() error handling NFSv4: Ensure that we check lock exclusive/shared type against open modes NFSv4: Ensure that the LOCK code sets exception->inode NFS: check for req==NULL in nfs_try_to_update_request cleanup SUNRPC: register PipeFS file system after pernet sybsystem	2012-04-25 21:38:44 -07:00
Shan Wei	8dcf01fc00	net: sock_diag_handler structs can be const read only, so change it to const. Signed-off-by: Shan Wei <davidshan@tencent.com> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-25 20:46:59 -04:00
Shan Wei	62ad6fcd74	udp_diag: implement idiag_get_info for udp/udplite to get queue information When we use netlink to monitor queue information for udp socket, idiag_rqueue and idiag_wqueue of inet_diag_msg are returned with 0. Keep consistent with netstat, just return back allocated rmem/wmem size. Signed-off-by: Shan Wei <davidshan@tencent.com> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-25 20:43:01 -04:00
Eric Dumazet	808db80a7e	ipv6: call consume_skb() in frag/reassembly Some kfree_skb() calls should be replaced by consume_skb() to avoid drop_monitor/dropwatch false positives. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-25 20:39:46 -04:00
John Fastabend	081579840b	net: dcb: add CEE notify calls This adds code to trigger CEE events when an APP change or setall command is made from user space. This simplifies user space code significantly by creating a single interface to listen on that works with both firmware and userland agents. And if we end up with multiple agents this keeps every thing in sync userland agents, firmware agents, and kernel notifier consumers. For an example agent that listens for these events see: https://github.com/jrfastab/cgdcbxd cgdcbxd is a daemon used to monitor DCB netlink events and manage the net_prio control group sub-system. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Shmulik Ravid <shmulikr@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-25 19:47:17 -04:00
John W. Linville	395836282f	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem	2012-04-25 13:41:25 -04:00
Julian Anastasov	8f9b9a2fad	ipvs: fix crash in ip_vs_control_net_cleanup on unload commit `14e405461e` (2.6.39) ("Add __ip_vs_control_{init,cleanup}_sysctl()") introduced regression due to wrong __net_init for __ip_vs_control_cleanup_sysctl. This leads to crash when the ip_vs module is unloaded. Fix it by changing __net_init to __net_exit for the function that is already renamed to ip_vs_control_net_cleanup_sysctl. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Hans Schillstrom <hans@schillstrom.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-04-25 11:16:30 +02:00
Sasha Levin	7118c07a84	ipvs: Verify that IP_VS protocol has been registered The registration of a protocol might fail, there were no checks and all registrations were assumed to be correct. This lead to NULL ptr dereferences when apps tried registering. For example: [ 1293.226051] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 [ 1293.227038] IP: [<ffffffff822aacb0>] tcp_register_app+0x60/0xb0 [ 1293.227038] PGD 391de067 PUD 6c20b067 PMD 0 [ 1293.227038] Oops: 0000 [#1] PREEMPT SMP [ 1293.227038] CPU 1 [ 1293.227038] Pid: 19609, comm: trinity Tainted: G W 3.4.0-rc1-next-20120405-sasha-dirty #57 [ 1293.227038] RIP: 0010:[<ffffffff822aacb0>] [<ffffffff822aacb0>] tcp_register_app+0x60/0xb0 [ 1293.227038] RSP: 0018:ffff880038c1dd18 EFLAGS: 00010286 [ 1293.227038] RAX: ffffffffffffffc0 RBX: 0000000000001500 RCX: 0000000000010000 [ 1293.227038] RDX: 0000000000000000 RSI: ffff88003a2d5888 RDI: 0000000000000282 [ 1293.227038] RBP: ffff880038c1dd48 R08: 0000000000000000 R09: 0000000000000000 [ 1293.227038] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88003a2d5668 [ 1293.227038] R13: ffff88003a2d5988 R14: ffff8800696a8ff8 R15: 0000000000000000 [ 1293.227038] FS: 00007f01930d9700(0000) GS:ffff88007ce00000(0000) knlGS:0000000000000000 [ 1293.227038] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 1293.227038] CR2: 0000000000000018 CR3: 0000000065dfc000 CR4: 00000000000406e0 [ 1293.227038] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1293.227038] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 1293.227038] Process trinity (pid: 19609, threadinfo ffff880038c1c000, task ffff88002dc73000) [ 1293.227038] Stack: [ 1293.227038] ffff880038c1dd48 00000000fffffff4 ffff8800696aada0 ffff8800694f5580 [ 1293.227038] ffffffff8369f1e0 0000000000001500 ffff880038c1dd98 ffffffff822a716b [ 1293.227038] 0000000000000000 ffff8800696a8ff8 0000000000000015 ffff8800694f5580 [ 1293.227038] Call Trace: [ 1293.227038] 
[<ffffffff822a716b>] ip_vs_app_inc_new+0xdb/0x180 [ 1293.227038] [<ffffffff822a7258>] register_ip_vs_app_inc+0x48/0x70 [ 1293.227038] [<ffffffff822b2fea>] __ip_vs_ftp_init+0xba/0x140 [ 1293.227038] [<ffffffff821c9060>] ops_init+0x80/0x90 [ 1293.227038] [<ffffffff821c90cb>] setup_net+0x5b/0xe0 [ 1293.227038] [<ffffffff821c9416>] copy_net_ns+0x76/0x100 [ 1293.227038] [<ffffffff810dc92b>] create_new_namespaces+0xfb/0x190 [ 1293.227038] [<ffffffff810dca21>] unshare_nsproxy_namespaces+0x61/0x80 [ 1293.227038] [<ffffffff810afd1f>] sys_unshare+0xff/0x290 [ 1293.227038] [<ffffffff8187622e>] ? trace_hardirqs_on_thunk+0x3a/0x3f [ 1293.227038] [<ffffffff82665539>] system_call_fastpath+0x16/0x1b [ 1293.227038] Code: 89 c7 e8 34 91 3b 00 89 de 66 c1 ee 04 31 de 83 e6 0f 48 83 c6 22 48 c1 e6 04 4a 8b 14 26 49 8d 34 34 48 8d 42 c0 48 39 d6 74 13 <66> 39 58 58 74 22 48 8b 48 40 48 8d 41 c0 48 39 ce 75 ed 49 8d [ 1293.227038] RIP [<ffffffff822aacb0>] tcp_register_app+0x60/0xb0 [ 1293.227038] RSP <ffff880038c1dd18> [ 1293.227038] CR2: 0000000000000018 [ 1293.379284] ---[ end trace 364ab40c7011a009 ]--- [ 1293.381182] Kernel panic - not syncing: Fatal exception in interrupt Signed-off-by: Sasha Levin <levinsasha928@gmail.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-04-25 11:16:12 +02:00
Andrei Emeltchenko	94c514fe24	mac80211: Adds clean sdata helper Adds a helper, ieee80211_clean_sdata, to clean sdata, implemented in a similar way to ieee80211_setup_sdata. The function will be used by other interfaces later. Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2012-04-24 14:56:10 -04:00
Felix Fietkau	7e3ed02c6e	mac80211: fix num_mcast_sta counting issues Moving a STA to an AP VLAN prevents num_mcast_sta from being decremented once the STA leaves, because sta->sdata changes. Fix this by checking for AP VLANs as well. Also exclude 4-addr VLAN stations from num_mcast_sta - remote 4-addr stations ignore 3-address multicast frames anyway. In a typical bridge configuration they receive the same packets as 4-address unicast. This patch also fixes clearing the sdata->u.vlan.sta pointer when the STA is removed from a 4-addr VLAN. Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2012-04-24 14:54:28 -04:00
Felix Fietkau	030ef8f8a5	mac80211: rename AP variable num_sta_authorized to num_mcast_sta It is only used to test for BSS multicast receivers. Signed-off-by: Felix Fietkau <nbd@openwrt.org> Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2012-04-24 14:54:28 -04:00
Wey-Yi Guy	be6bcabc79	mac80211: check for non-managed interface The average beacon signal is only tracked by managed interfaces; give a warning and return 0 for the others. Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2012-04-24 14:54:27 -04:00
Eliad Peller	afa762f687	mac80211: call ieee80211_mgd_stop() on interface stop ieee80211_mgd_teardown() is called on netdev removal, which occurs after the vif was already removed from the low-level driver, resulting in the following warning: [ 4809.014734] ------------[ cut here ]------------ [ 4809.019861] WARNING: at net/mac80211/driver-ops.h:12 ieee80211_bss_info_change_notify+0x200/0x2c8 [mac80211]() [ 4809.030388] wlan0: Failed check-sdata-in-driver check, flags: 0x4 [ 4809.036862] Modules linked in: wlcore_sdio(-) wl12xx wlcore mac80211 cfg80211 [last unloaded: cfg80211] [ 4809.046849] [<c001bd4c>] (unwind_backtrace+0x0/0x12c) [ 4809.055937] [<c047cf1c>] (dump_stack+0x20/0x24) [ 4809.065385] [<c003e334>] (warn_slowpath_common+0x5c/0x74) [ 4809.075589] [<c003e408>] (warn_slowpath_fmt+0x40/0x48) [ 4809.088291] [<bf033630>] (ieee80211_bss_info_change_notify+0x200/0x2c8 [mac80211]) [ 4809.102844] [<bf067f84>] (ieee80211_destroy_auth_data+0x80/0xa4 [mac80211]) [ 4809.116276] [<bf068004>] (ieee80211_mgd_teardown+0x5c/0x74 [mac80211]) [ 4809.129331] [<bf043f18>] (ieee80211_teardown_sdata+0xb0/0xd8 [mac80211]) [ 4809.141595] [<c03b5e58>] (rollback_registered_many+0x228/0x2f0) [ 4809.153056] [<c03b5f48>] (unregister_netdevice_many+0x28/0x50) [ 4809.165696] [<bf041ea8>] (ieee80211_remove_interfaces+0xb4/0xdc [mac80211]) [ 4809.179151] [<bf032174>] (ieee80211_unregister_hw+0x50/0xf0 [mac80211]) [ 4809.191043] [<bf0bebb4>] (wlcore_remove+0x5c/0x7c [wlcore]) [ 4809.201491] [<c02c6918>] (platform_drv_remove+0x24/0x28) [ 4809.212029] [<c02c4d50>] (__device_release_driver+0x8c/0xcc) [ 4809.222738] [<c02c4e84>] (device_release_driver+0x30/0x3c) [ 4809.233099] [<c02c4258>] (bus_remove_device+0x10c/0x128) [ 4809.242620] [<c02c26f8>] (device_del+0x11c/0x17c) [ 4809.252150] [<c02c6de0>] (platform_device_del+0x28/0x68) [ 4809.263051] [<bf0df49c>] (wl1271_remove+0x3c/0x50 [wlcore_sdio]) [ 4809.273590] [<c03806b0>] (sdio_bus_remove+0x48/0xf8) [ 4809.283754] [<c02c4d50>] 
(__device_release_driver+0x8c/0xcc) [ 4809.293729] [<c02c4e2c>] (driver_detach+0x9c/0xc4) [ 4809.303163] [<c02c3d7c>] (bus_remove_driver+0xc4/0xf4) [ 4809.312973] [<c02c5a98>] (driver_unregister+0x70/0x7c) [ 4809.323220] [<c03809c4>] (sdio_unregister_driver+0x24/0x2c) [ 4809.334213] [<bf0df458>] (wl1271_exit+0x14/0x1c [wlcore_sdio]) [ 4809.344930] [<c009b1a4>] (sys_delete_module+0x228/0x2a8) [ 4809.354734] ---[ end trace 515290ccf5feb522 ]--- Rename ieee80211_mgd_teardown() to ieee80211_mgd_stop(), and call it on ieee80211_do_stop(). Signed-off-by: Eliad Peller <eliad@wizery.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2012-04-24 14:42:42 -04:00
Szymon Janc	16cde9931b	Bluetooth: Fix missing break in hci_cmd_complete_evt Command complete event for HCI_OP_USER_PASSKEY_NEG_REPLY would result in calling handler function also for HCI_OP_LE_SET_SCAN_PARAM. This could result in undefined behaviour. Signed-off-by: Szymon Janc <szymon.janc@tieto.com> Signed-off-by: Gustavo Padovan <gustavo@padovan.org>	2012-04-24 11:38:41 -03:00
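The class of bug fixed by the Bluetooth hci_cmd_complete_evt commit above, a `case` without `break` falling through into the next handler, can be shown with a minimal sketch. The opcodes, counters, and dispatch functions below are hypothetical stand-ins and are not taken from the Bluetooth code.

```c
/* Hypothetical opcodes standing in for the two HCI commands named in the commit. */
enum opcode { OP_PASSKEY_NEG_REPLY, OP_LE_SET_SCAN_PARAM, OP_OTHER };

/* Counts how often each (hypothetical) handler ran. */
struct handler_calls {
    int passkey_neg_reply;
    int le_set_scan_param;
};

/* Buggy dispatch: the first case has no break, so the second handler also runs. */
static void dispatch_buggy(enum opcode op, struct handler_calls *calls)
{
    switch (op) {
    case OP_PASSKEY_NEG_REPLY:
        calls->passkey_neg_reply++;
        /* missing break: execution continues into the next case */
    case OP_LE_SET_SCAN_PARAM:
        calls->le_set_scan_param++;
        break;
    default:
        break;
    }
}

/* Fixed dispatch: each case ends with its own break. */
static void dispatch_fixed(enum opcode op, struct handler_calls *calls)
{
    switch (op) {
    case OP_PASSKEY_NEG_REPLY:
        calls->passkey_neg_reply++;
        break;
    case OP_LE_SET_SCAN_PARAM:
        calls->le_set_scan_param++;
        break;
    default:
        break;
    }
}
```

Dispatching OP_PASSKEY_NEG_REPLY through dispatch_buggy() bumps both counters, while dispatch_fixed() bumps only the first. Compilers with fallthrough warnings enabled may flag the buggy variant, which is the point of the illustration.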
Paul Gortmaker	872f24dbc6	tipc: remove inline instances from C source files. Untie gcc's hands and let it do what it wants within the individual source files. There are two files, node.c and port.c -- only the latter effectively changes (gcc-4.5.2). Objdump shows gcc deciding to not inline port_peernode(). Suggested-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-24 00:41:03 -04:00
Eric Dumazet	bfb253c9b2	af_netlink: drop_monitor/dropwatch friendly Need to consume_skb() instead of kfree_skb() in netlink_dump() and netlink_unicast_kernel() to avoid false dropwatch positives. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-24 00:35:14 -04:00
Eric Dumazet	658cb354ed	af_netlink: cleanups netlink_destroy_callback() move to avoid forward reference CodingStyle cleanups Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-24 00:35:14 -04:00
Eric Dumazet	38ba0a65fa	net: skb_can_coalesce returns a boolean Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-24 00:18:02 -04:00
Peter Huang (Peng)	a881e963c7	set fake_rtable's dst to NULL to avoid kernel Oops bridge: set fake_rtable's dst to NULL to avoid kernel Oops when bridge is deleted before tap/vif device's delete, kernel may encounter an oops because of NULL reference to fake_rtable's dst. Set fake_rtable's dst to NULL before sending packets out can solve this problem. v4 reformat, change br_drop_fake_rtable(skb) to {} v3 enrich commit header v2 introducing new flag DST_FAKE_RTABLE to dst_entry struct. [ Use "do { } while (0)" for nop br_drop_fake_rtable() implementation -DaveM ] Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Peter Huang <peter.huangpeng@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-24 00:16:24 -04:00
Eric Dumazet	783c175f90	tcp: tcp_try_coalesce returns a boolean This clarifies code intention, as suggested by David. Suggested-by: David Miller <davem@davemloft.net> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-23 23:36:58 -04:00
Eric Dumazet	d7ccf7c0a0	net: make spd_fill_page() linear argument a bool Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-23 23:35:04 -04:00
David S. Miller	f24001941c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Fix merge between commit `3adadc08cc` ("net ax25: Reorder ax25_exit to remove races") and commit `0ca7a4c87d` ("net ax25: Simplify and cleanup the ax25 sysctl handling") The former moved around the sysctl register/unregister calls, the latter simply removed them. With help from Stephen Rothwell. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-23 23:15:17 -04:00
David S. Miller	a108d5f35a	net: Use bool and remove inline in skb_splice_bits() code. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-23 23:06:11 -04:00
Eric Dumazet	41c73a0d44	net: speedup skb_splice_bits() Commit `35f3d14db` (pipe: add support for shrinking and growing pipes) added a slowdown for splice(socket -> pipe), as we might grow the spd used in skb_splice_bits() for each skb we process in splice() syscall. It's not needed since skb lengths are capped. The default on-stack arrays are more than enough. Use MAX_SKB_FRAGS instead of PIPE_DEF_BUFFERS to describe the reasonable limit per skb. Add coalescing support to help splicing of GRO skbs built from linear skbs (linked into frag_list). Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-23 23:01:35 -04:00
Eric Dumazet	1402d36601	tcp: introduce tcp_try_coalesce commit `c8628155ec` (tcp: reduce out_of_order memory use) took care of coalescing tcp segments provided by legacy devices (linear skbs) We extend this idea to fragged skbs, as their truesize can be heavy. ixgbe for example uses 256+1024+PAGE_SIZE/2 = 3328 bytes per segment. Use this coalescing strategy for receive queue too. This contributes to reduce number of tcp collapses, at minimal cost, and reduces memory overhead and packets drops. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Maciej Żenczykowski <maze@google.com> Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-23 22:42:49 -04:00
Eric Dumazet	da882c1f2e	tcp: sk_add_backlog() is too agressive for TCP While investigating TCP performance problems on 10Gb+ links, we found a tcp sender was dropping a lot of incoming ACKs because of the sk_rcvbuf limit in sk_add_backlog(), especially if the receiver doesn't use GRO/LRO and sends one ACK every two MSS segments. A sender usually tweaks sk_sndbuf, but sk_rcvbuf stays at its default value (87380), allowing a too small backlog. A TCP ACK, even being small, can consume nearly the same truesize space as outgoing packets. Using sk_rcvbuf + sk_sndbuf as a limit makes sense and is fast to compute. Performance results on netperf, single flow, receiver with disabled GRO/LRO : 7500 Mbits instead of 6050 Mbits, no more TCPBacklogDrop increments at sender. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Maciej Żenczykowski <maze@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Cc: Rick Jones <rick.jones2@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-23 22:28:28 -04:00
Eric Dumazet	f545a38f74	net: add a limit parameter to sk_add_backlog() sk_add_backlog() & sk_rcvqueues_full() hard coded sk_rcvbuf as the memory limit. We need to make this limit a parameter for TCP use. No functional change expected in this patch, all callers still using the old sk_rcvbuf limit. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Maciej Żenczykowski <maze@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Cc: Rick Jones <rick.jones2@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-23 22:28:28 -04:00
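The idea behind the two sk_add_backlog() commits above, passing the memory limit in as a parameter so that TCP can use sk_rcvbuf + sk_sndbuf instead of sk_rcvbuf alone, can be modelled in user space. This is a simplified sketch under stated assumptions: struct sock_model and both functions are hypothetical stand-ins, the fullness check is an approximation of the kernel's, and -1 is used in place of a kernel error code.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the fields of struct sock that matter here. */
struct sock_model {
    unsigned int rcvbuf;      /* models sk_rcvbuf */
    unsigned int sndbuf;      /* models sk_sndbuf */
    unsigned int backlog_len; /* truesize bytes already queued on the backlog */
    unsigned int rmem_alloc;  /* truesize bytes charged to the receive queue */
};

/* True when the queued memory already exceeds the caller-supplied limit. */
static bool rcvqueues_full(const struct sock_model *sk, unsigned int limit)
{
    return sk->backlog_len + sk->rmem_alloc > limit;
}

/*
 * Queue a packet of the given truesize on the backlog unless the limit
 * is exceeded. Returns 0 on success and -1 when the packet is dropped.
 */
static int add_backlog(struct sock_model *sk, unsigned int truesize,
                       unsigned int limit)
{
    if (rcvqueues_full(sk, limit))
        return -1;
    sk->backlog_len += truesize;
    return 0;
}
```

A caller that keeps the old behaviour passes sk->rcvbuf as the limit; a TCP-like caller passes sk->rcvbuf + sk->sndbuf, which lets a sender with a large send buffer absorb bursts of ACKs that the default receive buffer of 87380 bytes would reject.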
Bala Shanmugam	218d2e26dc	cfg80211: Validate legacy rateset. Legacy rates are not validated while configuring tx rateset using iw. So below cmd is accepted by nl80211. sudo iw wlan2 set bitrates legacy-2.4 1 2 3 Validate legacy rates and return error if any rate in the rateset is not valid. Signed-off-by: Bala Shanmugam <bkamatch@qca.qualcomm.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2012-04-23 15:37:41 -04:00
Wey-Yi Guy	0d8a0a1728	mac80211: declare ieee80211_ave_rssi as EXPORT ieee80211_ave_rssi needs to be declared as exported for drivers to use it. Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2012-04-23 15:37:41 -04:00
Javier Cardona	6ac95b5765	mac80211: fixup for mesh TSF adjustment latency in Toffset setpoint The original patch defined the correction margin but did not apply it. Signed-off-by: Shinichi Hotori <hotorinn@gmail.com> Signed-off-by: Yu Niiro <yu.niiro@gmail.com> Signed-off-by: Javier Cardona <javier@cozybit.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2012-04-23 15:37:41 -04:00
Thomas Pedersen	aee286c2cf	mac80211: fix STA channel width field According to IEEE 802.11 8.4.2.59, set the "STA channel width" bit to 0 if the transmitting STA is using a 20 MHz channel. Signed-off-by: Thomas Pedersen <thomas@cozybit.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2012-04-23 15:34:07 -04:00
Thomas Pedersen	e76781e48f	mac80211: don't set mesh peer ht caps if ht disabled Blindly setting ht caps on a mesh peer's station entry would result in MCS rates being used by the rate control algorithm even if no ht had been configured. Fix this by checking the channel type before assigning ht capabilities. Signed-off-by: Thomas Pedersen <thomas@cozybit.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2012-04-23 15:34:07 -04:00
Thomas Pedersen	f743ff4907	mac80211: refactor mesh peer rate handling To avoid passing supp_rates and basic_rates around all the time, just derive these when needed in mesh_matches_local() and mesh_peer_init(). Signed-off-by: Thomas Pedersen <thomas@cozybit.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2012-04-23 15:34:07 -04:00
Thomas Pedersen	54ab1ffb6c	mac80211: refactor mesh peer initialization This patch unifies the previous two paths toward mesh peer creation a bit. It also fixes a bug where a peer's changing rates or HT mode wouldn't register on leaving and then returning to the mesh with a sta entry still present. Also clean up locking and clear possibly stale ht cap. Signed-off-by: Thomas Pedersen <thomas@cozybit.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2012-04-23 15:34:07 -04:00
Ben Greear	8a690674e0	mac80211: Support on-channel scan option. This is based on an idea posted by Stanislaw Gruszka, though I accept full blame for the implementation! This has been tested with ath9k. The idea is to let users scan on the current operating channel without interrupting normal traffic more than absolutely necessary (changing power level might reset some hardware, for instance). Signed-off-by: Ben Greear <greearb@candelatech.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2012-04-23 15:28:33 -04:00
David S. Miller	ac807fa8e6	tcp: Fix build warning after tcp_{v4,v6}_init_sock consolidation. net/ipv4/tcp_ipv4.c: In function 'tcp_v4_init_sock': net/ipv4/tcp_ipv4.c:1891:19: warning: unused variable 'tp' [-Wunused-variable] net/ipv6/tcp_ipv6.c: In function 'tcp_v6_init_sock': net/ipv6/tcp_ipv6.c:1836:19: warning: unused variable 'tp' [-Wunused-variable] Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-23 03:21:58 -04:00
Neal Cardwell	d135c522f1	tcp: fix TCP_MAXSEG for established IPv6 passive sockets Commit `f5fff5d` forgot to fix TCP_MAXSEG behavior for IPv6 sockets, so IPv6 TCP server sockets that used TCP_MAXSEG would find that the advmss of child sockets would be incorrect. This commit mirrors the advmss logic from tcp_v4_syn_recv_sock in tcp_v6_syn_recv_sock. Eventually this logic should probably be shared between IPv4 and IPv6, but this at least fixes this issue. Signed-off-by: Neal Cardwell <ncardwell@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-22 17:09:35 -04:00
Eric Dumazet	c06fff6e17	af_packet: packet_getsockopt() cleanup Factorize code, since most fetched values are int type. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-21 16:36:42 -04:00
Neal Cardwell	900f65d361	tcp: move duplicate code from tcp_v4_init_sock()/tcp_v6_init_sock() This commit moves the (substantial) common code shared between tcp_v4_init_sock() and tcp_v6_init_sock() to a new address-family independent function, tcp_init_sock(). Centralizing this functionality should help avoid drift issues, e.g. where the IPv4 side is updated without a corresponding update to IPv6. There was already some drift: IPv4 initialized snd_cwnd to TCP_INIT_CWND, while the IPv6 side was still initializing snd_cwnd to 2 (in this case it should not matter, since snd_cwnd is also initialized in tcp_init_metrics(), but the general risks and maintenance overhead remain). When diffing the old and new code, note that new tcp_init_sock() function uses the order of steps from the tcp_v4_init_sock() implementation (the order is slightly different in tcp_v6_init_sock()). Signed-off-by: Neal Cardwell <ncardwell@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-21 16:36:42 -04:00
Eric Dumazet	e66e9a3147	net: allow better page reuse in splice(sock -> pipe) splice() from socket to pipe needs linear_to_page() helper to transfer skb header to part of page. We can reset the offset in the current sk->sk_sndmsg_page if we are the last user of the page. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-21 16:36:42 -04:00
Eric Dumazet	bbe362be53	drop_monitor: allow more events per second It seems there is a logic error in trace_drop_common(), since we store only 64 drops, even if they are from same location. This fix is a one liner, but we probably need more work to avoid useless atomic dec/inc Now I can watch 1 Mpps drops through dropwatch... Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neil Horman <nhorman@tuxdriver.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-21 16:28:38 -04:00
Eric Dumazet	a74e910618	net: change big iov allocations iov of more than 8 entries are allocated in sendmsg()/recvmsg() through sock_kmalloc() As these allocations are temporary only and small enough, it makes sense to use plain kmalloc() and avoid sk_omem_alloc atomic overhead. Slightly changed fast path to be even faster. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Mike Waychison <mikew@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-21 16:24:20 -04:00
Pavel Emelyanov	b139ba4e90	tcp: Repair connection-time negotiated parameters There are options, which are set up on a socket while performing TCP handshake. Need to resurrect them on a socket while repairing. A new sockoption accepts a buffer and parses it. The buffer should be CODE:VALUE sequence of bytes, where CODE is standard option code and VALUE is the respective value. Only 4 options should be handled on repaired socket. To read 3 out of 4 of these options the TCP_INFO sockoption can be used. An ability to get the last one (the mss_clamp) was added by the previous patch. Now the restore. Three of these options -- timestamp_ok, mss_clamp and snd_wscale -- are just restored on a socket. The sack_ok flag has 2 issues. First, whether or not to do sacks at all. This flag is just read and set back. No other sack info is saved or restored, since according to the standard and the code dropping all sack-ed segments is OK, the sender will resubmit them again, so after the repair we will probably experience a pause in connection. Next, the fack bit. It's just set back on a socket if the respective sysctl is set. No collected stats about packets flow is preserved. As far as I see (plz, correct me if I'm wrong) the fack-based congestion algorithm survives dropping all of the stats and repairs itself eventually, probably losing the performance for that period. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-21 15:52:25 -04:00
Pavel Emelyanov	5e6a3ce657	tcp: Report mss_clamp with TCP_MAXSEG option in repair mode The mss_clamp is the only connection-time negotiated option which cannot be obtained from the user space. Make the TCP_MAXSEG sockopt report one in the repair mode. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-21 15:52:25 -04:00
Pavel Emelyanov	c0e88ff0f2	tcp: Repair socket queues Reading queues under repair mode is done with recvmsg call. The queue-under-repair set by TCP_REPAIR_QUEUE option is used to determine which queue should be read. Thus both send and receive queue can be read with this. Caller must pass the MSG_PEEK flag. Writing to queues is done with sendmsg call and yet again -- the repair-queue option can be used to push data into the receive queue. When putting an skb into receive queue a zero tcp header is appended to its head to address the tcp_hdr(skb)->syn and the ->fin checks by the (after repair) tcp_recvmsg. These flags are both set to zero and that's why. The fin cannot be met in the queue while reading the source socket, since the repair only works for closed/established sockets and queueing fin packet always changes its state. The syn in the queue denotes that the respective skb's seq is "off-by-one" as compared to the actual payload length. Thus, at the rcv queue refill we can just drop this flag and set the skb's sequences to precise values. When the repair mode is turned off, the write queue seqs are updated so that the whole queue is considered to be 'already sent, waiting for ACKs' (write_seq = snd_nxt <= snd_una). From the protocol POV the send queue looks like it was sent, but the data between the write_seq and snd_nxt is lost in the network. This helps to avoid another sockoption for setting the snd_nxt sequence. Leaving the whole queue in a 'not yet sent' state (as it will be after sendmsg-s) will not allow to receive any acks from the peer since the ack_seq will be after the snd_nxt. Thus even the ack for the window probe will be dropped and the connection will be 'locked' with the zero peer window. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-21 15:52:25 -04:00
Pavel Emelyanov	ee9952831c	tcp: Initial repair mode This includes (according to the previous description): * TCP_REPAIR sockoption This one just puts the socket in/out of the repair mode. Allowed for CAP_NET_ADMIN and for closed/established sockets only. When repair mode is turned off and the socket happens to be in the established state the window probe is sent to the peer to 'unlock' the connection. * TCP_REPAIR_QUEUE sockoption This one sets the queue which we're about to repair. The 'no-queue' is set by default. * TCP_QUEUE_SEQ sockoption Sets the write_seq/rcv_nxt of a selected repaired queue. Allowed for TCP_CLOSE-d sockets only. When the socket changes its state the other seq-s are changed by the kernel according to the protocol rules (most of the existing code is actually reused). * Ability to forcibly bind a socket to a port The sk->sk_reuse is set to SK_FORCE_REUSE. * Immediate connect modification The connect syscall initializes the connection, then directly jumps to the code which finalizes it. * Silent close modification The close just aborts the connection (similar to SO_LINGER with 0 time) but without sending any FIN/RST-s to peer. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-21 15:52:25 -04:00
Pavel Emelyanov	370816aef0	tcp: Move code around This is just the preparation patch, which makes the needed for TCP repair code ready for use. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-21 15:52:25 -04:00
Pavel Emelyanov	4a17fd5229	sock: Introduce named constants for sk_reuse Name them in a "backward compatible" manner, i.e. reuse or not are still 1 and 0 respectively. The reuse value of 2 means that the socket with it will forcibly reuse everyone else's port. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-21 15:52:25 -04:00
Eric W. Biederman	5f568e5afe	net: Remove register_net_sysctl_table All of the users have been converted to use register_net_sysctl so we no longer need register_net_sysctl_table. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-20 21:22:30 -04:00
Eric W. Biederman	a5347fe36b	net: Delete all remaining instances of ctl_path We don't use struct ctl_path anymore so delete the exported constants. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-20 21:22:30 -04:00
Eric W. Biederman	ec8f23ce0f	net: Convert all sysctl registrations to register_net_sysctl This results in code with less boiler plate that is a bit easier to read. Additionally stops us from using compatibility code in the sysctl core, hastening the day when the compatibility code can be removed. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-20 21:22:30 -04:00
Eric W. Biederman	f99e8f715a	net: Convert nf_conntrack_proto to use register_net_sysctl There isn't much advantage here except that strings paths are a bit easier to read, and converting everything to them allows me to kill off ctl_path. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-20 21:22:30 -04:00
Eric W. Biederman	8607ddb867	net ipv4: Convert devinet to use register_net_sysctl Using an ascii path to register_net_sysctl as opposed to the slightly awkward ctl_path allows for much simpler code. We no longer need to malloc dev_name to keep it alive the length of our sysctl register instead we can use a small temporary buffer on the stack. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-20 21:22:30 -04:00
Eric W. Biederman	6105e29320	net ipv6: Convert addrconf to use register_net_sysctl Using an ascii path to register_net_sysctl as opposed to the slightly awkward ctl_path allows for much simpler code. We no longer need to malloc dev_name to keep it alive the length of our sysctl register instead we can use a small temporary buffer on the stack. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-20 21:22:29 -04:00
Eric W. Biederman	9bdcc88fa0	net decnet: Convert to use register_net_sysctl Using an ascii path to register_net_sysctl as opposed to the slightly awkward ctl_path allows for much simpler code. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-20 21:22:29 -04:00
Eric W. Biederman	8f40a1f982	net neighbour: Convert to use register_net_sysctl Using an ascii path to register_net_sysctl as opposed to the slightly awkward ctl_path allows for much simpler code. We no longer need to malloc dev_name to keep it alive the length of our sysctl register instead we can use a small temporary buffer on the stack. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-20 21:22:29 -04:00
Eric W. Biederman	6dceb03687	net ipv6: Don't use sysctl tables with .child entries. The sysctl core no longer natively understands sysctl tables with .child entries. Split the ipv6_table to remove the .child entries. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-20 21:22:29 -04:00
Eric W. Biederman	64fb301040	net llc: Don't use sysctl tables with .child entries. The sysctl core no longer natively understands sysctl tables with .child entries. Kill the intermediate tables and use register_net_sysctl directly to remove the need for compatibility code. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-20 21:22:29 -04:00
Eric W. Biederman	0ca7a4c87d	net ax25: Simplify and cleanup the ax25 sysctl handling. Don't register/unregister every ax25 table in a batch. Instead register and unregister per device ax25 sysctls as ax25 devices come and go. This moves ax25 to be a completely modern sysctl user. Registering the sysctls in just the initial network namespace, removing the use of .child entries that are no longer natively supported by the sysctl core and taking advantage of the fact that there are no longer any ordering constraints between registering and unregistering different sysctl tables. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-20 21:22:28 -04:00
Eric W. Biederman	4e5ca78541	net ipv4: Remove the unneeded registration of an empty net/ipv4/neigh sysctl no longer requires explicit creation of directories. The neigh directory is always populated with at least a default entry so this won't cause any user visible changes. Delete the ipv4_path and the ipv4_skeleton these are no longer needed. Directly register the ipv4_route_table. And since I am an idiot remove the header definitions that I should have removed in the previous patch. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-20 21:21:18 -04:00
Eric W. Biederman	a5287acc6c	net ipv6: Remove unneeded registration of an empty net/ipv6/neigh sysctl no longer requires explicit creation of directories. The neigh directory is always populated with at least a default entry so this should cause no user visible changes. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-20 21:21:18 -04:00
Eric W. Biederman	45bad91498	net core: Remove unneeded creation of an empty net/core sysctl directory On the next line we register the net_core_table in net/core which creates the directory and ensures it exists. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-20 21:21:18 -04:00
Eric W. Biederman	5dd3df105b	net: Move all of the network sysctls without a namespace into init_net. This makes it clearer which sysctls are relative to your current network namespace. This makes it a little less error prone by not exposing sysctls for the initial network namespace in other namespaces. This is the same way we handle all of our other network interfaces to userspace and I can't honestly remember why we didn't do this for sysctls right from the start. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-20 21:21:17 -04:00
Eric W. Biederman	4344475797	net: Kill register_sysctl_rotable register_sysctl_rotable never caught on as an interesting way to register sysctls. My take on the situation is that what we want are sysctls that we can only see in the initial network namespace. What we have implemented with register_sysctl_rotable are sysctls that we can see in all of the network namespaces and can only change in the initial network namespace. That is a very silly way to go. Just register the network sysctls in the initial network namespace and we don't have any weird special cases to deal with. The sysctls affected are: /proc/sys/net/ipv4/ipfrag_secret_interval /proc/sys/net/ipv4/ipfrag_max_dist /proc/sys/net/ipv6/ip6frag_secret_interval /proc/sys/net/ipv6/mld_max_msf I really don't expect anyone will miss them if they can't read them in a child user namespace. CC: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-20 21:21:17 -04:00
Eric W. Biederman	2ca794e5e8	net sysctl: Initialize the network sysctls sooner to avoid problems. If the netfilter code is modified to use register_net_sysctl_table the kernel fails to boot because the per net sysctl infrastructure is not set up soon enough. So to avoid races call net_sysctl_init from sock_init(). Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-20 21:21:16 -04:00
Eric W. Biederman	bc8a36942a	net sysctl: Register an empty /proc/sys/net Implementation limitations of the sysctl core won't let /proc/sys/net reside in a network namespace. /proc/sys/net at least must be registered as a normal sysctl. So register /proc/sys/net early as an empty directory to guarantee we don't violate this constraint and hit bugs in the sysctl implementation. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-20 21:21:16 -04:00
Eric W. Biederman	ab41a2ca50	net: Implement register_net_sysctl. Right now all of the networking sysctl registrations are running in a compatibility mode. The native sysctl registration api takes a cstring for a path and a simple ctl_table. Implement register_net_sysctl so that we can register network sysctls without needing to use compatibility code in the sysctl core. Switching from a ctl_path to a cstring results in less boiler plate and denser code that is a little easier to read. I would simply have changed the arguments to register_net_sysctl_table instead of keeping two functions in parallel but gcc will allow a ctl_path pointer to be passed to a char * pointer while issuing only a warning, so completely incorrect code can be built. Since I have to change the function name I am taking advantage of the situation to let both register_net_sysctl and register_net_sysctl_table live for a short time in parallel which makes clean conversion patches a bit easier to read and write. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-20 21:21:15 -04:00
David S. Miller	167de77fd4	Merge branch 'tipc_net-next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux	2012-04-20 20:40:31 -04:00
Allan Stephens	9d52ce4bd3	tipc: Ensure network address change doesn't impact configuration service Enhances command validation done by TIPC's configuration service so that it works properly even if the node's network address is changed in mid-operation. The default node address of <0.0.0> is now recognized as an alias for "this node" even after a new network address has been assigned. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2012-04-19 15:46:50 -04:00
Allan Stephens	630d920dca	tipc: Ensure network address change doesn't impact rejected message Revises handling of a rejected message to ensure that a locally originated message is returned properly even if the node's network address is changed in mid-operation. The routine now treats the default node address of <0.0.0> as an alias for "this node" when determining where to send a returned message. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2012-04-19 15:46:49 -04:00
Allan Stephens	8a55fe74b1	tipc: handle <0.0.0> as an alias for this node on outgoing msgs Revises handling of send routines for payload messages to ensure that they are processed properly even if the node's network address is changed in mid-operation. The routines now treat the default node address of <0.0.0> as an alias for "this node" when determining where to send an outgoing message. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2012-04-19 15:46:48 -04:00
Allan Stephens	b8f683d126	tipc: properly handle off-node send requests with invalid addr There are two send routines that might conceivably be asked by an application to send a message off-node when the node is still using the default network address. These now have an added check that detects this and rejects the message gracefully. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2012-04-19 15:46:47 -04:00
Allan Stephens	974a5a864b	tipc: take lock while updating node network address The routine that changes the node's network address now takes TIPC's network lock in write mode while the main address variable and associated data structures are being changed; this is needed to ensure that the link subsystem won't attempt to send a message off-node until the sending port's message header template has been updated with the node's new network address. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2012-04-19 15:46:46 -04:00
Allan Stephens	f0712e86b7	tipc: Ensure network address change doesn't impact local connections Revises routines that deal with connections between two ports on the same node to ensure the connection is not impacted if the node's network address is changed in mid-operation. The routines now treat the default node address of <0.0.0> as an alias for "this node" in the following situations: 1) Incoming messages destined to a connected port now handle the alias properly when validating that the message was sent by the expected peer port, ensuring that the message will be accepted regardless of whether it specifies the node's old network address or its current one. 2) The code which completes connection establishment now handles the alias properly when determining if the peer port is on the same node as the connected port. An added benefit of addressing issue 1) is that some peer port validation code has been relocated to TIPC's socket subsystem, which means that validation is no longer done twice when a message is sent to a non-socket port (such as TIPC's configuration service or network topology service). Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2012-04-19 15:46:45 -04:00
Allan Stephens	d0e17fedc2	tipc: delete duplicate peerport/peernode helper functions Prior to commit `23dd4cce38` "tipc: Combine port structure with tipc_port structure" there was a need for the two sets of helper functions. But now they are just duplicates. Remove the globally visible ones, and mark the remaining ones as inline. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2012-04-19 15:46:43 -04:00
Allan Stephens	f21536d1e7	tipc: Ensure network address change doesn't impact new port Re-orders port creation logic so that the initialization of a new port's message header template occurs while the port list lock is held. This ensures that a change to the node's network address that occurs at the same time as the port is being created does not result in the template identifying the sender using the former network address. The new approach guarantees that the new port's template is using the current network address or that it will be updated when the address changes. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2012-04-19 15:46:42 -04:00
Allan Stephens	5eb0a291fb	tipc: Optimize re-initialization of port message header templates Removes an unnecessary check in the logic that updates the message header template for existing ports when a node's network address is first assigned. There is no longer any need to check to see if the node's network address has actually changed since the calling routine has already verified that this is so. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2012-04-19 15:46:41 -04:00
Allan Stephens	d4f5c12cdf	tipc: Ensure network address change doesn't impact name table updates Revises routines that add and remove an entry from a node's name table so that the publication scope lists are updated properly even if the node's network address is changed in mid-operation. The routines now recognize the default node address of <0.0.0> as an alias for "this node" even after a new network address has been assigned. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2012-04-19 15:46:40 -04:00
Allan Stephens	336ebf5bf5	tipc: Add routines for safe checking of node's network address Introduces routines that test whether a given network address is equal to a node's own network address or if it lies within the node's own network cluster, and which work properly regardless of whether the node is using the default network address <0.0.0> or a non-zero network address that is assigned later on. In essence, these routines ensure that address <0.0.0> is treated as an alias for "this node", regardless of which network address the node is actually using. Old users of the pre-existing more strict match in_own_cluster() have been accordingly redirected to what is now called in_own_cluster_exact() --- which does not extend matching to <0,0,0>. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2012-04-19 15:46:39 -04:00
Allan Stephens	fd6eced8a4	tipc: Don't record failed publication attempt as a success No longer increments counter of number of publications by a node if an attempt to add a new publication fails. This prevents TIPC from incorrectly blocking future publications because the configured maximum number of publications has been reached. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2012-04-19 15:46:37 -04:00
Allan Stephens	1110b8d33a	tipc: Update node-scope publications when network address is assigned Ensures that node-scope name publications that exist prior to the configuration of a node's network address are properly re-initialized with that address when it is assigned. TIPC's node-scope publications are now tracked using a publications list like the lists used for cluster-scope and zone-scope publications so they can be easily updated when required. The inclusion of node scope name publications in a conventional publication list means that they must now also be withdrawn, just like cluster and zone scope publications are currently withdrawn. So some conditional tests on scope ==/!= TIPC_NODE_SCOPE are inserted/removed accordingly. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2012-04-19 15:46:36 -04:00


23034 Commits