linux/net/core
Johannes Weiner e805605c72 net: tcp_memcontrol: sanitize tcp memory accounting callbacks
There won't be a tcp control soft limit, so integrating the memcg code
into the global skmem limiting scheme complicates things unnecessarily.
Replace this with simple and clear charge and uncharge calls--hidden
behind a jump label--to account skb memory.

Note that this is not purely aesthetic: as a result of shoehorning the
per-memcg code into the same memory accounting functions that handle the
global level, the old code would compare the per-memcg consumption
against the smaller of the per-memcg limit and the global limit.  This
allowed the total consumption of multiple sockets to exceed the global
limit, as long as the individual sockets stayed within bounds.  After
this change, the code will always compare the per-memcg consumption to
the per-memcg limit, and the global consumption to the global limit, and
thus close this loophole.

Without a soft limit, the per-memcg memory pressure state in sockets is
generally questionable.  However, we did it until now, so we continue to
enter it when the hard limit is hit, and packets are dropped, to let
other sockets in the cgroup know that they shouldn't grow their transmit
windows, either.  However, keep it simple in the new callback model and
leave memory pressure lazily when the next packet is accepted (as
opposed to doing it synchroneously when packets are processed).  When
packets are dropped, network performance will already be in the toilet,
so that should be a reasonable trade-off.

As described above, consumption is now checked on the per-memcg level
and the global level separately.  Likewise, memory pressure states are
maintained on both the per-memcg level and the global level, and a
socket is considered under pressure when either level asserts as much.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-14 16:00:49 -08:00
..
Makefile soreuseport: define reuseport groups 2016-01-04 22:49:58 -05:00
datagram.c net: Fix inverted test in __skb_recv_datagram 2015-12-08 11:30:17 -05:00
dev.c net, sched: add clsact qdisc 2016-01-10 22:13:15 -05:00
dev_addr_lists.c net: fix spelling for synchronized 2014-11-18 15:26:32 -05:00
dev_ioctl.c dev_ioctl: use sizeof(x) instead of sizeof x 2014-11-18 15:27:32 -05:00
drop_monitor.c net: Replace get_cpu_var through this_cpu_ptr 2014-08-26 13:45:47 -04:00
dst.c net: possible use after free in dst_release 2016-01-06 15:00:27 -05:00
ethtool.c ethtool: Add phy statistics 2015-12-31 00:53:10 -05:00
fib_rules.c fib_rules: fix fib rule dumps across multiple skbs 2015-09-24 15:21:54 -07:00
filter.c net: bpf: reject invalid shifts 2016-01-12 17:06:53 -05:00
flow.c flow: Move __get_hash_from_flowi{4,6} into flow_dissector.c 2015-09-01 17:00:24 -07:00
flow_dissector.c flow_dissector: Use 'const' where possible. 2015-09-01 21:19:17 -07:00
gen_estimator.c net_sched: gen_estimator: extend pps limit 2015-07-08 13:59:20 -07:00
gen_stats.c gen_stats.c: Duplicate xstats buffer for later use 2015-02-19 15:45:53 -05:00
link_watch.c dev: introduce dev_get_iflink() 2015-04-02 14:04:59 -04:00
lwtunnel.c dst: Pass net into dst->output 2015-10-08 04:27:03 -07:00
neighbour.c net/neighbour: fix crash at dumping device-agnostic proxy entries 2015-12-03 00:07:51 -05:00
net-procfs.c rps: selective flow shedding during softnet overflow 2013-05-20 13:48:04 -07:00
net-sysfs.c net-sysfs: use to_net_dev in net_namespace() 2015-12-22 15:04:09 -05:00
net-sysfs.h net: netdev_kobject_init: annotate with __init 2014-01-05 20:27:54 -05:00
net-traces.c net: IPv6 fib lookup tracepoint 2015-11-22 11:54:10 -05:00
net_namespace.c netns: make nsid_lock per net 2015-05-17 23:41:11 -04:00
netclassid_cgroup.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-12-17 22:08:28 -05:00
netevent.c netevent: remove automatic variable in register_netevent_notifier() 2015-05-31 00:03:21 -07:00
netpoll.c netpoll: Drop budget parameter from NAPI polling call hierarchy 2015-09-29 14:57:16 -07:00
netprio_cgroup.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-12-17 22:08:28 -05:00
pktgen.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-01-11 23:55:43 -05:00
ptp_classifier.c ptp: Change ptp_class to a proper bitmask 2015-11-03 11:08:22 -05:00
request_sock.c tcp: restore fastopen operations 2015-10-05 03:19:06 -07:00
rtnetlink.c net/rtnetlink: remove unused sz_idx variable 2016-01-11 00:22:20 -05:00
scm.c net: wrap sock->sk_cgrp_prioidx and ->sk_classid inside a struct 2015-12-08 22:02:33 -05:00
secure_seq.c net: remove a sparse error in secure_dccpv6_sequence_number() 2015-05-25 22:55:37 -04:00
skbuff.c net: check both type and procotol for tcp sockets 2015-12-17 15:46:32 -05:00
sock.c net: tcp_memcontrol: sanitize tcp memory accounting callbacks 2016-01-14 16:00:49 -08:00
sock_diag.c net: diag: Add the ability to destroy a socket. 2015-12-15 23:26:51 -05:00
sock_reuseport.c soreuseport: change consume_skb to kfree_skb in error case 2016-01-06 01:30:27 -05:00
stream.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-12-03 21:09:12 -05:00
sysctl_net_core.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-03-20 18:51:09 -04:00
timestamping.c net: skb_defer_rx_timestamp should check for phydev before setting up classify 2015-07-09 14:17:15 -07:00
tso.c net: tso: add support for IPv6 2015-10-26 22:24:22 -07:00
utils.c net: move net_get_random_once to lib 2015-10-08 05:26:35 -07:00