linux/net
Eric Dumazet c9eeec26e3 tcp: TSQ can use a dynamic limit
When TCP Small Queues was added, we used a sysctl to limit amount of
packets queues on Qdisc/device queues for a given TCP flow.

Problem is this limit is either too big for low rates, or too small
for high rates.

Now TCP stack has rate estimation in sk->sk_pacing_rate, and TSO
auto sizing, it can better control number of packets in Qdisc/device
queues.

New limit is two packets or at least 1 to 2 ms worth of packets.

Low rates flows benefit from this patch by having even smaller
number of packets in queues, allowing for faster recovery,
better RTT estimations.

High rates flows benefit from this patch by allowing more than 2 packets
in flight as we had reports this was a limiting factor to reach line
rate. [ In particular if TX completion is delayed because of coalescing
parameters ]

Example for a single flow on 10Gbp link controlled by FQ/pacing

14 packets in flight instead of 2

$ tc -s -d qd
qdisc fq 8001: dev eth0 root refcnt 32 limit 10000p flow_limit 100p
buckets 1024 quantum 3028 initial_quantum 15140
 Sent 1168459366606 bytes 771822841 pkt (dropped 0, overlimits 0
requeues 6822476)
 rate 9346Mbit 771713pps backlog 953820b 14p requeues 6822476
  2047 flow, 2046 inactive, 1 throttled, delay 15673 ns
  2372 gc, 0 highprio, 0 retrans, 9739249 throttled, 0 flows_plimit

Note that sk_pacing_rate is currently set to twice the actual rate, but
this might be refined in the future when a flow is in congestion
avoidance.

Additional change : skb->destructor should be set to tcp_wfree().

A future patch (for linux 3.13+) might remove tcp_limit_output_bytes

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-30 20:41:57 -07:00
..
9p for-linus-3.12-merge minor 9p fixes and tweaks for 3.12 merge window 2013-09-11 12:34:13 -07:00
802 mrp: add periodictimer to allow retries when packets get lost 2013-09-23 16:53:52 -04:00
8021q net: vlan: inherit addr_assign_type along with dev_addr 2013-09-03 20:57:49 -04:00
appletalk
atm
ax25
batman-adv batman-adv: set the TAG flag for the vid passed to BLA 2013-09-17 21:15:16 +02:00
bluetooth Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem 2013-09-27 13:11:17 -04:00
bridge bridge: fix NULL pointer deref of br_port_get_rcu 2013-09-15 22:03:33 -04:00
caif caif: Add missing braces to multiline if in cfctrl_linkup_request 2013-09-05 14:31:02 -04:00
can can: gw: add a per rule limitation of frame hops 2013-08-29 22:58:24 +02:00
ceph Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client 2013-09-19 12:50:37 -05:00
core net: flow_dissector: fix thoff for IPPROTO_AH 2013-09-30 15:32:05 -04:00
dcb
dccp net:dccp: do not report ICMP redirects to user space 2013-09-18 12:33:44 -04:00
decnet
dns_resolver
dsa net: dsa: inherit addr_assign_type along with dev_addr 2013-09-03 20:57:49 -04:00
ethernet
ieee802154 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2013-09-05 14:54:29 -07:00
ipv4 tcp: TSQ can use a dynamic limit 2013-09-30 20:41:57 -07:00
ipv6 ipv6: Fix preferred_lft not updating in some cases 2013-09-30 15:06:19 -04:00
ipx
irda
iucv
key
l2tp
lapb net/lapb: re-send packets on timeout 2013-09-23 16:52:45 -04:00
llc llc: Use normal etherdevice.h tests 2013-09-03 22:34:47 -04:00
mac80211
mac802154
mpls
netfilter ip: generate unique IP identificator if local fragmentation is allowed 2013-09-19 14:11:15 -04:00
netlabel
netlink net: netlink: filter particular protocols from analyzers 2013-09-06 14:43:48 -04:00
netrom
nfc
openvswitch net: ovs: flow: fix potential illegal memory access in __parse_flow_nlattrs 2013-09-11 16:09:58 -04:00
packet net: packet: use reciprocal_divide in fanout_demux_hash 2013-08-29 16:43:29 -04:00
phonet
rds
rfkill Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2013-09-05 14:54:29 -07:00
rose
rxrpc
sched pkt_sched: fq: qdisc dismantle fixes 2013-09-30 15:51:23 -04:00
sctp net: sctp: rfc4443: do not report ICMP redirects to user space 2013-09-16 21:40:15 -04:00
sunrpc Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-09-12 15:01:38 -07:00
tipc tipc: set sk_err correctly when connection fails 2013-08-30 16:06:57 -04:00
unix
vmw_vsock
wimax
wireless Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2013-09-05 14:54:29 -07:00
x25 x25: add a sanity check parsing X.25 facilities 2013-09-04 00:27:27 -04:00
xfrm Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2013-09-05 14:58:52 -04:00
Kconfig Remove GENERIC_HARDIRQ config option 2013-09-13 15:09:52 +02:00
Makefile
compat.c
nonet.c
socket.c Merge git://git.kvack.org/~bcrl/aio-next 2013-09-13 10:55:58 -07:00
sysctl_net.c