linux

Commit Graph

Author	SHA1	Message	Date
Andy Grover	ff87e97a9d	RDS: make m_rdma_op a member of rds_message This eliminates a separate memory alloc, although it is now necessary to add an "r_active" flag, since it is no longer to use the m_rdma_op pointer as an indicator of if an rdma op is present. rdma SGs allocated from rm sg pool. rds_rm_size also gets bigger. It's a little inefficient to run through CMSGs twice, but it makes later steps a lot smoother. Signed-off-by: Andy Grover <andy.grover@oracle.com>	2010-09-08 18:11:38 -07:00
Andy Grover	21f79afa5f	RDS: fold rdma.h into rds.h RDMA is now an intrinsic part of RDS, so it's easier to just have a single header. Signed-off-by: Andy Grover <andy.grover@oracle.com>	2010-09-08 18:11:37 -07:00
Andy Grover	fc445084f1	RDS: Explicitly allocate rm in sendmsg() r_m_copy_from_user used to allocate the rm as well as kernel buffers for the data, and then copy the data in. Now, sendmsg() allocates the rm, although the data buffer alloc still happens in r_m_copy_from_user. SGs are still allocated with rm, but now r_m_alloc_sgs() is used to reserve them. This allows multiple SG lists to be allocated from the one rm -- this is important once we also want to alloc our rdma sgl from this pool. Signed-off-by: Andy Grover <andy.grover@oracle.com>	2010-09-08 18:11:36 -07:00
Andy Grover	3ef13f3c22	RDS: cleanup/fix rds_rdma_unuse First, it looks to me like the atomic_inc is wrong. We should be decrementing refcount only once here, no? It's already being done by the mr_put() at the end. Second, simplify the logic a bit by bailing early (with a warning) if !mr. Signed-off-by: Andy Grover <andy.grover@oracle.com>	2010-09-08 18:11:35 -07:00
Andy Grover	e779137aa7	RDS: break out rdma and data ops into nested structs in rds_message Clearly separate rdma-related variables in rm from data-related ones. This is in anticipation of adding atomic support. Signed-off-by: Andy Grover <andy.grover@oracle.com>	2010-09-08 18:11:33 -07:00
Andy Grover	8690bfa17a	RDS: cleanup: remove "== NULL"s and "!= NULL"s in ptr comparisons Favor "if (foo)" style over "if (foo != NULL)". Signed-off-by: Andy Grover <andy.grover@oracle.com>	2010-09-08 18:11:32 -07:00
Andy Grover	2dc3935734	RDS: move rds_shutdown_worker impl. to rds_conn_shutdown This fits better in connection.c, rather than threads.c. Signed-off-by: Andy Grover <andy.grover@oracle.com>	2010-09-08 18:10:13 -07:00
Andy Grover	9de0864cf5	RDS: Fix locking in send on m_rs_lock Do not nest m_rs_lock under c_lock Disable interrupts in {rdma,atomic}_send_complete Signed-off-by: Andy Grover <andy.grover@oracle.com>	2010-09-08 18:07:32 -07:00
Andy Grover	7c82eaf00e	RDS: Rewrite rds_send_drop_to() for clarity This function has been the source of numerous bugs; it's just too complicated. Simplified to nest spinlocks cleanly within the second loop body, and kick out early if there are no rms to drop. This will be a little slower because conn lock is grabbed for each entry instead of "caching" the lock across rms, but this should be entirely irrelevant to fastpath performance. Signed-off-by: Andy Grover <andy.grover@oracle.com>	2010-09-08 18:07:32 -07:00
Tina Yang	35b52c7053	RDS: Fix corrupted rds_mrs On second look at this bug (OFED #2002), it seems that the collision is not with the retransmission queue (packet acked by the peer), but with the local send completion. A theoretical sequence of events (from time t0 to t3) is thought to be as follows, Thread #1 t0: sock_release rds_release rds_send_drop_to /* wait on send completion / t2: rds_rdma_drop_keys() / destroy & free all mrs / Thread #2 t1: rds_ib_send_cq_comp_handler rds_ib_send_unmap_rm rds_message_unmapped / wake up #1 @ t0 / t3: rds_message_put rds_message_purge rds_mr_put / memory corruption detected / The problem with the rds_rdma_drop_keys() is it could remove a mr's refcount more than its due (i.e. repeatedly as long as it still remains in the tree (mr->r_refcount > 0)). Theoretically it should remove only one reference - reference by the tree. / Release any MRs associated with this socket */ while ((node = rb_first(&rs->rs_rdma_keys))) { mr = container_of(node, struct rds_mr, r_rb_node); if (mr->r_trans == rs->rs_transport) mr->r_invalidate = 0; rds_mr_put(mr); } I think the correct way of doing it is to remove the mr from the tree and rds_destroy_mr it first, then a rds_mr_put() to decrement its reference count by one. Whichever thread holds the last reference will free the mr via rds_mr_put(). Signed-off-by: Tina Yang <tina.yang@oracle.com> Signed-off-by: Andy Grover <andy.grover@oracle.com>	2010-09-08 18:07:31 -07:00
Andy Grover	9e2effba2c	RDS: Fix BUG_ONs to not fire when in a tasklet in_interrupt() is true in softirqs. The BUG_ONs are supposed to check for if irqs are disabled, so we should use BUG_ON(irqs_disabled()) instead, duh. Signed-off-by: Andy Grover <andy.grover@oracle.com>	2010-09-08 18:07:31 -07:00
Eric Dumazet	db40980fcd	net: poll() optimizations No need to test twice sk->sk_shutdown & RCV_SHUTDOWN Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-06 18:48:45 -07:00
Joe Perches	f6c9322c3a	net/caifcaif_dev.c: Use netdev_<level> Convert pr_<level>("%s" ..., (struct netdev )->name ...) to netdev_<level>((struct netdev ), ...) Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-06 18:48:43 -07:00
Joe Perches	b31fa5bad5	net/caif: Use pr_fmt This patch standardizes caif message logging prefixes. Add #define pr_fmt(fmt) KBUILD_MODNAME ":%s(): " fmt, __func__ Add missing "\n"s to some logging messages Convert pr_warning to pr_warn This changes the logging message prefix from CAIF: to caif: for all uses but caif_socket.c and chnl_net.c. Those now use their filename without extension. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-06 18:48:43 -07:00
Julia Lawall	29af9309db	net/9p/trans_fd.c: Fix unsigned return type The function has an unsigned return type, but returns a negative constant to indicate an error condition. The result of calling the function is always stored in a variable of type (signed) int, and thus unsigned can be dropped from the return type. A sematic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @exists@ identifier f; constant C; @@ unsigned f(...) { <+... * return -C; ...+> } // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-06 18:48:42 -07:00
Eric Dumazet	1fd63041c4	net: pskb_expand_head() optimization pskb_expand_head() blindly takes references on fragments before calling skb_release_data(), potentially releasing these references. We can add a fast path, avoiding these atomic operations, if we own the last reference on skb->head. Based on a previous patch from David Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-06 18:24:59 -07:00
Allan Stephens	d1fb62796c	tipc: Fix misleading error code when enabling Ethernet bearers Cause TIPC to return EAGAIN if it is unable to enable a new Ethernet bearer because one or more recently disabled Ethernet bearers are temporarily consuming resources during shut down. (The previous error code, EDQUOT, is now returned only if all available Ethernet bearer data structures are fully enabled at the time the request to enable an additional bearer is received.) Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-06 18:12:57 -07:00
Allan Stephens	9fbfca0131	tipc: Ensure outgoing messages on Ethernet have sufficient headroom Add code to expand the headroom of an outgoing TIPC message if the sk_buff has insufficient room to hold the header for the associated Ethernet device. This change is necessary to ensure that messages TIPC does not create itself (eg. incoming messages that are being routed to another node) do not cause problems, since TIPC has no control over the amount of headroom available in such messages. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-06 18:12:56 -07:00
Allan Stephens	5d9c54c1e9	tipc: Minor optimizations to name table translation code Optimizes TIPC's name table translation code to avoid unnecessary manipulation of the node address field of the resulting port id when name translation fails. This change is possible because a valid port id cannot have a reference field of zero, so examining the reference only is sufficient to determine if the translation was successful. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-06 18:12:56 -07:00
Eric Dumazet	52ee7a04a0	net: remove two kmemcheck annotations __alloc_skb() uses a memset() to clear all the beginning of skb, including bitfields contained in 'flags1' & 'flags2'. We dont need any more to use kmemcheck_annotate_bitfield() on these fields. However, we still need it for the clone part, which is not cleared. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-03 09:44:51 -07:00
Thomas Graf	c3bccac2fa	ipv6: add special mode forwarding=2 to send RS while configured as router Similar to accepting router advertisement, the IPv6 stack does not send router solicitations if forwarding is enabled. This patch enables this behavior to be overruled by setting forwarding to the special value 2. Signed-off-by: Thomas Graf <tgraf@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-03 09:43:14 -07:00
Thomas Graf	65e9b62d45	ipv6: add special mode accept_ra=2 to accept RA while configured as router The current IPv6 behavior is to not accept router advertisements while forwarding, i.e. configured as router. This does make sense, a router is typically not supposed to be auto configured. However there are exceptions and we should allow the current behavior to be overwritten. Therefore this patch enables the user to overrule the "if forwarding enabled then don't listen to RAs" rule by setting accept_ra to the special value of 2. An alternative would be to ignore the forwarding switch alltogether and solely accept RAs based on the value of accept_ra. However, I found that if not intended, accepting RAs as a router can lead to strange unwanted behavior therefore we it seems wise to only do so if the user explicitely asks for this behavior. Signed-off-by: Thomas Graf <tgraf@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-03 09:43:13 -07:00
David S. Miller	7162f6691e	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6	2010-09-02 12:45:44 -07:00
John W. Linville	78ab952717	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem	2010-09-02 13:30:07 -04:00
Changli Gao	deffd77759	net: arp: code cleanup Clean the code up according to Documentation/CodingStyle. Don't initialize the variable dont_send in arp_process(). Remove the temporary varialbe flags in arp_state_to_flags(). Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-02 10:12:06 -07:00
Eric Dumazet	c07b68e841	net: dev_add_pack() & __dev_remove_pack() changes Add a small helper ptype_head() to get the head to manipulate dev_add_pack() & __dev_remove_pack() can use a spinlock without blocking BH, since softirq use RCU, and these functions are run from process context only. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-02 10:12:06 -07:00
Julian Anastasov	8ed2163ff3	ipvs: use pkts for SCTP too Use correctly the in_pkts packet counter also for SCTP Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-02 10:04:18 -07:00
Eric Dumazet	95f4b45bc6	net: another last_rx round Kill last_rx use in l2tp and two net drivers Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-02 09:19:32 -07:00
Gerrit Renker	0705c6f0e2	tcp: update also tcp_output with regard to RFC 5681 Thanks to Ilpo Jarvinen, this updates also the initial window setting for tcp_output with regard to RFC 5681. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-01 18:15:00 -07:00
stephen hemminger	fa50d64576	net: make rx_queue sysfs_ops const Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-01 18:12:20 -07:00
John W. Linville	85f72bc839	mac80211: only cancel software-based scans on suspend Otherwise the hardware scan handler could access an invalid scan request structure. The driver should cancel any pending hardware scans during the suspend process anyway, so also add a warning if the hardware scan is still pending when the device resumes. Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-09-01 16:12:28 -04:00
Eric Dumazet	6602cebb5b	net: skbuff.c cleanup (skb->data - skb->head) can be changed by skb_headroom(skb) Remove some uses of NET_SKBUFF_DATA_USES_OFFSET, using (skb_end_pointer(skb) - skb->head) or (skb_tail_pointer(skb) - skb->head) : compiler does the right thing, and this is more readable for us ;) (struct skb_shared_info *) casts in pskb_expand_head() to help memcpy() to use aligned moves. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-01 10:57:55 -07:00
Eric Dumazet	86cac58b71	skge: add GRO support - napi_gro_flush() is exported from net/core/dev.c, to avoid an irq_save/irq_restore in the packet receive path. - use napi_gro_receive() instead of netif_receive_skb() - use napi_gro_flush() before calling __napi_complete() - turn on NETIF_F_GRO by default - Tested on a Marvell 88E8001 Gigabit NIC Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-01 10:57:55 -07:00
Eric Dumazet	875168a933	net: tunnels should use rcu_dereference tunnel4_handlers, tunnel64_handlers, tunnel6_handlers and tunnel46_handlers are protected by RCU, but we dont use appropriate rcu primitives to scan them. rcu_lock() is already held by caller. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-01 10:57:53 -07:00
Eric Dumazet	1639ab6f78	gro: unexport tcp4_gro_receive and tcp4_gro_complete tcp4_gro_receive() and tcp4_gro_complete() dont need to be exported. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-31 13:37:07 -07:00
Eric Dumazet	ba4fd9d828	pktgen: remove non used variable remove non used variable "queue" in pg_cleanup Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-31 13:37:06 -07:00
Jiri Pirko	72ed62f7c9	vlan: Use vlan_dev_real_dev in vlan_hwaccel_do_receive [patch net-next-2.6] vlan: Use vlan_dev_real_dev in vlan_hwaccel_do_receive Use helper as in other places. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-31 13:37:05 -07:00
Rémi Denis-Courmont	01b38606bd	Phonet: do not set POLLOUT in case of send buffer overflow Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-31 13:04:33 -07:00
Rémi Denis-Courmont	02ac3268a5	Phonet: correct sendmsg() error code from sock_alloc_send_skb() Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-31 13:04:32 -07:00
Rémi Denis-Courmont	1a98214fee	Phonet: restore flow control credits when sending fails Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-31 13:04:32 -07:00
John W. Linville	18145c6934	mac80211: cancel scan in ieee80211_restart_hw if software scan pending This function exists to clean-up after a hardware error or something similar. The restart is accomplished using the same infrastructure used to resume after a suspend. The suspend path cancels running scans, so it seems appropriate to do that here as well for software-based scans. If a hardware-based scan is pending, issue a warning message since this indicates that the drivers has failed to clean-up after itself. Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-08-31 15:20:45 -04:00
Julia Lawall	3653910714	net/wireless: Remove double test The same expression is tested twice and the result is the same each time. The sematic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @expression@ expression E; @@ ( * E \|\| ... \|\| E \| * E && ... && E ) // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-08-31 14:20:40 -04:00
Jouni Malinen	391a200a89	mac80211: Do not generate CQM events based on first Beacon frames The signal strength value in a single RX frame is not that reliable, so it is better to delay start of CQM events until there is a real average signal strength from more than a single Beacon frame available. Signed-off-by: Jouni Malinen <j@w1.fi> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-08-31 14:20:40 -04:00
Jouni Malinen	3ba06c6fbd	mac80211: Fix signal strength average initialization for CQM events The ave_beacon_signal value uses 1/16 dB unit and as such, must be initialized with the signal level of the first Beacon frame multiplied by 16. This fixes an issue where the initial CQM events are reported incorrectly with a burst of events while the running average approaches the correct value after the incorrect initialization. This could cause user space -based roaming decision process to get quite confused at the moment when we would like to go through authentication and DHCP. Cc: stable@kernel.org Signed-off-by: Jouni Malinen <j@w1.fi> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-08-31 14:20:40 -04:00
Eric Dumazet	3ff2cfa55f	ipv6: struct xfrm6_tunnel in read_mostly section tunnel6_handlers chain being scanned for each incoming packet, make sure it doesnt share an often dirtied cache line. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-30 13:50:46 -07:00
Eric Dumazet	6dcd814bd0	net: struct xfrm_tunnel in read_mostly section tunnel4_handlers chain being scanned for each incoming packet, make sure it doesnt share an often dirtied cache line. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-30 13:50:45 -07:00
Gerrit Renker	89858ad143	dccp ccid-3: use per-route RTO or TCP RTO as fallback This makes RTAX_RTO_MIN also available to CCID-3, replacing the compile-time RTO lower bound with a per-route tunable value. The original Kconfig option solved the problem that a very low RTT (in the order of HZ) can trigger too frequent and unnecessary reductions of the sending rate. This tunable does not affect the initial RTO value of 2 seconds specified in RFC 5348, section 4.2 and Appendix B. But like the hardcoded Kconfig value, it allows to adapt to network conditions. The same effect as the original Kconfig option of 100ms is now achieved by > ip route replace to unicast 192.168.0.0/24 rto_min 100j dev eth0 (assuming HZ=1000). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-30 13:45:28 -07:00
Gerrit Renker	4886fcad6e	dccp ccid-2: Share TCP's minimum RTO code Using a fixed RTO_MIN of 0.2 seconds was found to cause problems for CCID-2 over 802.11g: at least once per session there was a spurious timeout. It helped to then increase the the value of RTO_MIN over this link. Since the problem is the same as in TCP, this patch makes the solution from commit "05bb1fad1cde025a864a90cfeb98dcbefe78a44a" "[TCP]: Allow minimum RTO to be configurable via routing metrics." available to DCCP. This avoids reinventing the wheel, so that e.g. the following works in the expected way now also for CCID-2: > ip route change 10.0.0.2 rto_min 800 dev ath0 Luckily this useful rto_min function was recently moved to net/tcp.h, which simplifies sharing code originating from TCP. Documentation also updated (plus minor whitespace fixes). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-30 13:45:27 -07:00
Gerrit Renker	22b71c8f4f	tcp/dccp: Consolidate common code for RFC 3390 conversion This patch consolidates initial-window code common to TCP and CCID-2: * TCP uses RFC 3390 in a packet-oriented manner (tcp_input.c) and * CCID-2 uses RFC 3390 in packet-oriented manner (RFC 4341). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-30 13:45:26 -07:00
Gerrit Renker	d26eeb07fd	dccp ccid-2: Remove wrappers around sk_{reset,stop}_timer() This removes the wrappers around the sk timer functions, since not much is gained from using them: the BUG_ON in start_rto_timer will never trigger since that function is called only if: * the RTO timer expires (rto_expire, and then timer_pending() is false); * in tx_packet_sent only if !timer_pending() (BUG_ON is redundant here); * previously in new_ack, after stopping the timer (timer_pending() false). Removing the wrappers also clears the way for eventually replacing the RTO timer with the icsk-retransmission-timer, as it is already part of the DCCP socket. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-30 13:45:26 -07:00
Gerrit Renker	d82b6f85c1	dccp ccid-2: Use u32 timestamps uniformly Since CCID-2 is de facto a mini implementation of TCP, it makes sense to share as much code as possible. Hence this patch aligns CCID-2 timestamping with TCP timestamping. This also halves the space consumption (on 64-bit systems). The necessary include file <net/tcp.h> is already included by way of net/dccp.h. Redundant includes have been removed. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-30 13:45:25 -07:00
Jerry Chu	dca43c75e7	tcp: Add TCP_USER_TIMEOUT socket option. This patch provides a "user timeout" support as described in RFC793. The socket option is also needed for the the local half of RFC5482 "TCP User Timeout Option". TCP_USER_TIMEOUT is a TCP level socket option that takes an unsigned int, when > 0, to specify the maximum amount of time in ms that transmitted data may remain unacknowledged before TCP will forcefully close the corresponding connection and return ETIMEDOUT to the application. If 0 is given, TCP will continue to use the system default. Increasing the user timeouts allows a TCP connection to survive extended periods without end-to-end connectivity. Decreasing the user timeouts allows applications to "fail fast" if so desired. Otherwise it may take upto 20 minutes with the current system defaults in a normal WAN environment. The socket option can be made during any state of a TCP connection, but is only effective during the synchronized states of a connection (ESTABLISHED, FIN-WAIT-1, FIN-WAIT-2, CLOSE-WAIT, CLOSING, or LAST-ACK). Moreover, when used with the TCP keepalive (SO_KEEPALIVE) option, TCP_USER_TIMEOUT will overtake keepalive to determine when to close a connection due to keepalive failure. The option does not change in anyway when TCP retransmits a packet, nor when a keepalive probe will be sent. This option, like many others, will be inherited by an acceptor from its listener. Signed-off-by: H.K. Jerry Chu <hkchu@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-30 13:23:33 -07:00
Stephen Rothwell	2c70b51962	IPVS: include net/ip6_checksum.h for csum_ipv6_magic Fixes this build error: net/netfilter/ipvs/ip_vs_core.c: In function 'ip_vs_nat_icmp_v6': net/netfilter/ipvs/ip_vs_core.c:640: error: implicit declaration of function 'csum_ipv6_magic' Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-29 21:15:26 -07:00
Akinobu Mita	6a499b242f	phonet: use for_each_set_bit Replace open-coded loop with for_each_set_bit(). Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-28 15:37:04 -07:00
Akinobu Mita	762c29164e	econet: kill unnecessary spin_lock_init() The spinlock aun_queue_lock is initialized statically. It is unnecessary to initialize by spin_lock_init() at module load time. This is detected by the semantic patch. // <smpl> @def@ declarer name DEFINE_SPINLOCK; identifier spinlock; @@ DEFINE_SPINLOCK(spinlock); @@ identifier def.spinlock; @@ - spin_lock_init(&spinlock); // </smpl> Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Julia Lawall <julia@diku.dk> Cc: "David S. Miller" <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-28 15:37:03 -07:00
Johannes Berg	5b714c6a37	mac80211: fix offchannel queue stop Somebody noticed this problem, and I outlined to them how to fix it, but haven't heard back from them. So while I was adding the state field I figured I could use it to fix it. The problem, as I understand it, is that when we go offchannel while the driver has a queue stopped, the driver will likely start draining the queue and then enable it while offchannel. This in turn will enable the interface queue, and that leads to transmitting data frames on the wrong channel. Fix this by keeping track of offchannel status per interface, and not enabling the interface queues on interfaces that are offchannel when the driver enables a queue. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-08-27 13:53:31 -04:00
Johannes Berg	34d4bc4d41	mac80211: support runtime interface type changes Add support to mac80211 for changing the interface type even when the interface is UP, if the driver supports it. To achieve this * add a new driver callback for switching, * split some of the interface up/down code out into new functions (do_open/do_stop), and * maintain an own __SDATA_RUNNING bit that will not be set during interface type, so that any other code doesn't use the interface. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-08-27 13:53:31 -04:00
Johannes Berg	87490f6db3	mac80211: split out concurrent vif checks Split the concurrent virtual interface checks into a new function that can be used to check for any given new interface type. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-08-27 13:53:31 -04:00
Johannes Berg	bf533e0bfd	mac80211: simplify zero address checks The libertas_tf special code for zero addresses is a bit too complex, it compares against a stack value instead of using is_zero_ether_addr() and tries to update all interfaces even if just the one that's being brought up needs to be changed. Additionally, the repeated check for a valid MAC address need only be done if we actually changed it on the fly. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-08-27 13:53:30 -04:00
Johannes Berg	26a58456be	mac80211: switch to ieee80211_sdata_running Since the introduction of ieee80211_sdata_running(), some new code was introduced that uses netif_running() instead. Switch all these instances over. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-08-27 13:53:30 -04:00
Johannes Berg	b9dcf712d1	mac80211: clean up ifdown/cleanup paths There's a lot of redundant code in mac80211's interface cleanup/down, for example freeing AP beacons is done both when the interface is set DOWN as well as when it is torn down, of which only the former has any effect. Also, a bunch of things should be closer to where they matter, like the MLME timers that we should cancel when disassociating, rather than only when the interface is set DOWN. Clean up all this code. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-08-27 13:53:30 -04:00
Johannes Berg	2337db8db8	mac80211: use subqueue helpers There are subqueue helpers so that we don't need to get the TX queue and then wake/stop it, use those helpers. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-08-27 13:27:08 -04:00
Johannes Berg	a621fa4d6a	mac80211: allow changing port control protocol Some vendor specified mechanisms for 802.1X-style functionality use a different protocol than EAP (even if EAP is vendor-extensible). Support this in mac80211 via the cfg80211 API for it. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Juuso Oikarinen <juuso.oikarinen@nokia.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-08-27 13:27:07 -04:00
Johannes Berg	c0692b8fe2	cfg80211: allow changing port control protocol Some vendor specified mechanisms for 802.1X-style functionality use a different protocol than EAP (even if EAP is vendor-extensible). Allow setting the ethertype for the protocol when a driver has support for this. The default if unspecified is EAP, of course. Note: This is suitable only for station mode, not for AP implementation. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Juuso Oikarinen <juuso.oikarinen@nokia.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-08-27 13:27:07 -04:00
Johannes Berg	3ffc2a905b	mac80211: allow vendor specific cipher suites Allow drivers to specify their own set of cipher suites to advertise vendor-specific ciphers. The driver is then required to implement hardware crypto offload for it. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Juuso Oikarinen <juuso.oikarinen@nokia.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-08-27 13:27:07 -04:00
Johannes Berg	7d64b7cc1f	cfg80211: allow vendor specific cipher suites cfg80211 currently rejects all cipher suites it doesn't know about for key length checking purposes. This can lead to inconsistencies when a driver advertises an algorithm that cfg80211 doesn't know about. Remove this rejection so drivers can specify any algorithm they like. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Juuso Oikarinen <juuso.oikarinen@nokia.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-08-27 13:27:07 -04:00
Johannes Berg	8789d459bc	mac80211: allow scan to complete from any context The ieee80211_scan_completed() function was a frequent source of potential deadlocks, since it is called by drivers but may call back into drivers, so drivers had to make sure to call it without any locks held, which frequently lead to more complex code in drivers. Avoid that problem by allowing the function to be called in any context, and queueing the actual work it does. Also update the documentation for it to indicate this. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-08-27 13:27:06 -04:00
Johannes Berg	5f33c92d18	mac80211: remove unused scan expire define Since cfg80211 manages the BSS list completely, this define hasn't been used for a long time and will never be used again. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-08-27 13:27:06 -04:00
Eric Dumazet	40d0802b3e	gro: __napi_gro_receive() optimizations compare_ether_header() can have a special implementation on 64 bit arches if CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is defined. __napi_gro_receive() and vlan_gro_common() can avoid a conditional branch to perform device match. On x86_64, __napi_gro_receive() has now 38 instructions instead of 53 As gcc-4.4.3 still choose to not inline it, add inline keyword to this performance critical function. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-26 22:03:08 -07:00
Changli Gao	53f91dc1f7	net: use scnprintf() to avoid potential buffer overflow strlcpy() returns the total length of the string they tried to create, so we should not use its return value without any check. scnprintf() returns the number of characters written into @buf not including the trailing '\0', so use it instead here. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-26 14:11:49 -07:00
Joe Perches	145ce502e4	net/sctp: Use pr_fmt and pr_<level> Change SCTP_DEBUG_PRINTK and SCTP_DEBUG_PRINTK_IPADDR to use do { print } while (0) guards. Add SCTP_DEBUG_PRINTK_CONT to fix errors in log when lines were continued. Add #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt Add a missing newline in "Failed bind hash alloc" Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-26 14:11:48 -07:00
Simon Horman	dee06e4702	ipvs: switch to GFP_KERNEL allocations Switch from GFP_ATOMIC allocations to GFP_KERNEL ones in ip_vs_add_service() and ip_vs_new_dest(), as we hold a mutex and are allowed to sleep in this context. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-26 13:21:29 -07:00
Simon Horman	4f72816ef0	IPVS: convert __ip_vs_securetcp_lock to a spinlock Also rename __ip_vs_securetcp_lock to ip_vs_securetcp_lock. Spinlock conversion was suggested by Eric Dumazet. Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-26 13:21:29 -07:00
Simon Horman	bd14455048	IPVS: convert __ip_vs_sched_lock to a spinlock Also rename __ip_vs_sched_lock to ip_vs_sched_lock. Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-26 13:21:28 -07:00
Simon Horman	8870f8427b	IPVS: ICMPv6 checksum calculation Cc: Xiaoyu Du <tingsrain@gmail.com> Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-26 13:21:26 -07:00
stephen hemminger	aa7c6e5fa0	bridge: avoid ethtool on non running interface If bridge port is offline, don't call ethtool to query speed. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-25 16:36:51 -07:00
Stephen Hemminger	944c794d64	bridge: fix locking comment The carrier check is not called from work queue in current code. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-25 16:36:50 -07:00
Julia Lawall	b2aff96327	net/netfilter/ipvs: Eliminate memory leak __ip_vs_service_get and __ip_vs_svc_fwm_get increment a reference count, so that reference count should be decremented before leaving the function in an error case. A simplified version of the semantic match that finds this problem is: (http://coccinelle.lip6.fr/) // <smpl> @r exists@ local idexpression x; expression E; identifier f1; iterator I; @@ x = __ip_vs_service_get(...); <... when != x when != true (x == NULL \|\| ...) when != if (...) { <+...x...+> } when != I (...) { <+...x...+> } ( x == NULL \| x == E \| x->f1 ) ...> * return ...; // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-25 16:36:50 -07:00
John W. Linville	e569aa78ba	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem Conflicts: drivers/net/wireless/libertas/if_sdio.c	2010-08-25 14:51:42 -04:00
Johannes Berg	5eb5a52da6	mac80211: fix mesh advertisement When a mac80211-based driver advertises mesh mode support, this will be advertised to userspace. However, if mac80211 was compiled without mesh support, then that won't actually be true. Fix this by removing the bit for mesh if mesh isn't compiled in. Since this synchronizes what we advertise to cfg80211 and actually support, it means we can now rely on cfg80211's interface type checks and need not check again in mac80211. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-08-25 14:34:56 -04:00
Christian Lamparter	2c15a0cf27	mac80211: fix rcu-unsafe pointer dereference This patch fixes a potential crash (null-pointer de- reference) which was introduced in my previous patch: "mac80211: AMPDU rx reorder timeout timer" During a BA teardown, the pointer to the soon-to-be-gone tid_ampdu_rx element will be nullified. Therefore the release timer mechanism has to be careful not to accidentally access the item without any RCU protection. Signed-off-by: Christian Lamparter <chunkeey@googlemail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-08-25 14:34:56 -04:00
Johannes Berg	74b70a4e38	nl80211: fix missing nesting commit 95a6ccbb46c70cff376684c752831c014c87029d Author: Johannes Berg <johannes.berg@intel.com> Date: Thu Aug 12 15:38:38 2010 +0200 cfg80211/mac80211: extensible frame processing introduced a netlink bug that caused parsing errors in userspace because it forgot to close a nesting, which would advertise a nesting length of zero to userspace, which then completely threw off parsing and led to Illegal nla->nla_type == 0 being printed by libnl. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-08-25 14:34:56 -04:00
Christian Lamparter	258086a48b	mac80211: cancel restart_work in ieee80211_unregister_hw Unlike most other workqueue-tasks, the restart_work is not scheduled onto mac80211's private per-interface workqueue, but onto one of the system-wide workqueues. Therefore the mac80211-stack has to cancel any pending restarts, before destroying the shared device context and handing back the memory. Otherwise - under very unlucky circumstances - there could be a stale work- item left, because some other kernel component might have delayed the execution of ieee80211_restart_work for too long. Signed-off-by: Christian Lamparter <chunkeey@googlemail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-08-25 14:33:20 -04:00
Wey-Yi Guy	ff67bb86d4	mac80211: fix warning for un-used parameter mesh_hdr only used when CONFIG_MAC80211_MESH is defined Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-08-25 14:33:17 -04:00
Joe Perches	0fb9a9ec27	net/mac80211: Use wiphy_<level> Standardize logging messages from printk(KERN_<level> "%s: " fmt , wiphy_name(foo), args); to wiphy_<level>(foo, fmt, args); Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-08-25 14:33:17 -04:00
Stephen Hemminger	c2e3143e3c	tc: add meta match on receive hash Trivial extension to existing meta data match rules to allow matching on skb receive hash value. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-24 14:48:10 -07:00
Eric Dumazet	ec550d246e	net: ip_append_data() optim Compiler is not smart enough to avoid a conditional branch. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-24 14:45:09 -07:00
Johannes Berg	2e161f78e5	cfg80211/mac80211: extensible frame processing Allow userspace to register for more than just action frames by giving the frame subtype, and make it possible to use this in various modes as well. With some tweaks and some added functionality this will, in the future, also be usable in AP mode and be able to replace the cooked monitor interface currently used in that case. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-08-24 16:27:56 -04:00
Johannes Berg	ac4c977d16	mac80211: remove unused don't-encrypt flag When MFP is disabled, action frames will not be encrypted since they are management frames and the only management frames that can then be encrypted are authentication frames. Therefore, setting the don't-encrypt flag on action frames is unnecessary. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-08-24 16:27:55 -04:00
Johannes Berg	633adf1ad1	cfg80211: mark ieee80211_hdrlen const This function analyses only its single, value-passed argument, and has no side effects. Thus it can be const, which makes mac80211 smaller, for example: text data bss dec hex filename 362518 16720 884 380122 5ccda mac80211.ko (before) 362358 16720 884 379962 5cc3a mac80211.ko (after) a 160 byte saving in text size, and an optimisation because the function won't be called as often. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-08-24 16:27:54 -04:00
stephen hemminger	0fdc100bdc	ethtool: allow non-netadmin to query settings The SNMP daemon uses ethtool to determine the speed of network interfaces. This fails on Debian (and probably elsewhere) because for security SNMP daemon runs as non-root user (snmp). Note: A similar patch was rejected previously because of a concern about the possibility that on some hardware querying the ethtool settings requires access to the PHY and could slow the machine down. But the security risk of requiring SNMP daemon (and related services) to run as root far out weighs the risk of denial-of-service. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-23 20:43:16 -07:00
Eric Dumazet	afdcba371f	net: copy_rtnl_link_stats64() simplification No need to use a temporary struct rtnl_link_stats64 variable, just copy the source to skb buffer. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Reviewed-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-23 20:43:16 -07:00
Changli Gao	0eec32ff35	net_sched: act_csum: coding style cleanup Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-23 20:43:15 -07:00
David S. Miller	7abac68602	pkt_sched: Make act_csum depend upon INET. It uses ip_send_check() and stuff like that. Reported-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-23 20:42:11 -07:00
Gerrit Renker	231cc2aaf1	dccp ccid-2: Replace broken RTT estimator with better algorithm The current CCID-2 RTT estimator code is in parts broken and lags behind the suggestions in RFC2988 of using scaled variants for SRTT/RTTVAR. That code is replaced by the present patch, which reuses the Linux TCP RTT estimator code. Further details: ---------------- 1. The minimum RTO of previously one second has been replaced with TCP's, since RFC4341, sec. 5 says that the minimum of 1 sec. (suggested in RFC2988, 2.4) is not necessary. Instead, the TCP_RTO_MIN is used, which agrees with DCCP's concept of a default RTT (RFC 4340, 3.4). 2. The maximum RTO has been set to DCCP_RTO_MAX (64 sec), which agrees with RFC2988, (2.5). 3. De-inlined the function ccid2_new_ack(). 4. Added a FIXME: the RTT is sampled several times per Ack Vector, which will give the wrong estimate. It should be replaced with one sample per Ack. However, at the moment this can not be resolved easily, since - it depends on TX history code (which also needs some work), - the cleanest solution is not to use the `sent' time at all (saves 4 bytes per entry) and use DCCP timestamps / elapsed time to estimated the RTT, which however is non-trivial to get right (but needs to be done). Reasons for reusing the Linux TCP estimator algorithm: ------------------------------------------------------ Some time was spent to find a better alternative, using basic RFC2988 as a first step. Further analysis and experimentation showed that the Linux TCP RTO estimator is superior to a basic RFC2988 implementation. A summary is on http://www.erg.abdn.ac.uk/users/gerrit/dccp/notes/ccid2/rto_estimator/ In addition, this estimator fared well in a recent empirical evaluation: Rewaskar, Sushant, Jasleen Kaur and F. Donelson Smith. A Performance Study of Loss Detection/Recovery in Real-world TCP Implementations. Proceedings of 15th IEEE International Conference on Network Protocols (ICNP-07), 2007. Thus there is significant benefit in reusing the existing TCP code. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-23 20:13:31 -07:00
Gerrit Renker	c38c92a84a	dccp ccid-2: Simplify dec_pipe and rearming of RTO timer This removes the dec_pipe function and improves the way the RTO timer is rearmed when a new acknowledgment comes in. Details and justification for removal: -------------------------------------- 1) The BUG_ON in dec_pipe is never triggered: pipe is only decremented for TX history entries between tail and head, for which it had previously been incremented in tx_packet_sent; and it is not decremented twice for the same entry, since it is - either decremented when a corresponding Ack Vector cell in state 0 or 1 was received (and then ccid2s_acked==1), - or it is decremented when ccid2s_acked==0, as part of the loss detection in tx_packet_recv (and hence it can not have been decremented earlier). 2) Restarting the RTO timer happens for every single entry in each Ack Vector parsed by tx_packet_recv (according to RFC 4340, 11.4 this can happen up to 16192 times per Ack Vector). 3) The RTO timer should not be restarted when all outstanding data has been acknowledged. This is currently done similar to (2), in dec_pipe, when pipe has reached 0. The patch onsolidates the code which rearms the RTO timer, combining the segments from new_ack and dec_pipe. As a result, the code becomes clearer (compare with tcp_rearm_rto()). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-23 20:13:31 -07:00
Gerrit Renker	30564e3555	dccp ccid-2: Remove redundant sanity tests This removes the ccid2_hc_tx_check_sanity function: it is redundant. Details: The tx_check_sanity function performs three tests: 1) it checks that the circular TX list is sorted - in ascending order of sequence number (ccid2s_seq) - and time (ccid2s_sent), - in the direction from `tail' (hctx_seqt) to `head' (hctx_seqh); 2) it ensures that the entire list has the length seqbufc * CCID2_SEQBUF_LEN; 3) it ensures that pipe equals the number of packets that were not marked `acked' (ccid2s_acked) between `tail' and `head'. The following argues that each of these tests is redundant, this can be verified by going through the code. (1) is not necessary, since both time and GSS increase from one packet to the next, so that subsequent insertions in tx_packet_sent (which advance the `head' pointer) will be in ascending order of time and sequence number. In (2), the length of the list is always equal to seqbufc times CCID2_SEQBUF_LEN (set to 1024) unless allocation caused an earlier failure, because: * at initialisation (tx_init), there is one chunk of size 1024 and seqbufc=1; * subsequent calls to tx_alloc_seq take place whenever head->next == tail in tx_packet_sent; then a new chunk of size 1024 is inserted between head and tail, and seqbufc is incremented by one. To show that (3) is redundant requires looking at two cases. The `pipe' variable of the TX socket is incremented only in tx_packet_sent, and decremented in tx_packet_recv. When head == tail (TX history empty) then pipe should be 0, which is the case directly after initialisation and after a retransmission timeout has occurred (ccid2_hc_tx_rto_expire). The first case involves parsing Ack Vectors for packets recorded in the live portion of the buffer, between tail and head. For each packet marked by the receiver as received (state 0) or ECN-marked (state 1), pipe is decremented by one, so for all such packets the BUG_ON in tx_check_sanity will not trigger. The second case is the loss detection in the second half of tx_packet_recv, below the comment "Check for NUMDUPACK". The first while-loop here ensures that the sequence number of `seqp' is either above or equal to `high_ack', or otherwise equal to the highest sequence number sent so far (of the entry head->prev, as head points to the next unsent entry). The next while-loop ("while (1)") counts the number of acked packets starting from that position of seqp, going backwards in the direction from head->prev to tail. If NUMDUPACK=3 such packets were counted within this loop, `seqp' points to the last acknowledged packet of these, and the "if (done == NUMDUPACK)" block is entered next. The while-loop contained within that block in turn traverses the list backwards, from head to tail; the position of `seqp' is saved in the variable `last_acked'. For each packet not marked as `acked', a congestion event is triggered within the loop, and pipe is decremented. The loop terminates when `seqp' has reached `tail', whereupon tail is set to the position previously stored in `last_acked'. Thus, between `last_acked' and the previous position of `tail', - pipe has been decremented earlier if the packet was marked as state 0 or 1; - pipe was decremented if the packet was not marked as acked. That is, pipe has been decremented by the number of packets between `last_acked' and the previous position of `tail'. As a consequence, pipe now again reflects the number of packets which have not (yet) been acked between the new position of tail (at `last_acked') and head->prev, or 0 if head==tail. The result is that the BUG_ON condition in check_sanity will also not be triggered, hence the test (3) is also redundant. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-23 20:13:30 -07:00
Gerrit Renker	51c22bb510	dccp ccid-3: No more CCID control blocks in LISTEN state The CCIDs are activated as last of the features, at the end of the handshake, were the LISTEN state of the master socket is inherited into the server state of the child socket. Thus, the only states visible to CCIDs now are OPEN/PARTOPEN, and the closing states. This allows to remove tests which were previously necessary to protect against referencing a socket in the listening state (in CCID-3), but which now have become redundant. As a further byproduct of enabling the CCIDs only after the connection has been fully established, several typecast-initialisations of ccid3_hc_{rx,tx}_sock can now be eliminated: * the CCID is loaded, so it is not necessary to test if it is NULL, * if it is possible to load a CCID and leave the private area NULL, then this is a bug, which should crash loudly - and earlier, * the test for state==OPEN \|\| state==PARTOPEN now reduces only to the closing phase (e.g. when the node has received an unexpected Reset). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-23 20:13:30 -07:00
Gerrit Renker	67b67e365f	ccid: ccid-2/3 code cosmetics This patch collects cosmetics-only changes to separate these from code changes: * update with regard to CodingStyle and whitespace changes, * documentation: - adding/revising comments, - remove CCID-3 RX socket documentation which is either duplicate or refers to fields that no longer exist, * expand embedded tfrc_tx_info struct inline for consistency, removing indirections via #define. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-23 20:13:29 -07:00
David S. Miller	21dc330157	net: Rename skb_has_frags to skb_has_frag_list SKBs can be "fragmented" in two ways, via a page array (called skb_shinfo(skb)->frags[]) and via a list of SKBs (called skb_shinfo(skb)->frag_list). Since skb_has_frags() tests the latter, it's name is confusing since it sounds more like it's testing the former. Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-23 00:13:46 -07:00

1 2 3 4 5 ...

16383 Commits