linux/net
Eric Dumazet 4eaf3b84f2 net_sched: fix qdisc_tree_decrease_qlen() races
qdisc_tree_decrease_qlen() suffers from two problems on multiqueue
devices.

One problem is that it updates sch->q.qlen and sch->qstats.drops
on the mq/mqprio root qdisc, while it should not : Daniele
reported underflows errors :
[  681.774821] PAX: sch->q.qlen: 0 n: 1
[  681.774825] PAX: size overflow detected in function qdisc_tree_decrease_qlen net/sched/sch_api.c:769 cicus.693_49 min, count: 72, decl: qlen; num: 0; context: sk_buff_head;
[  681.774954] CPU: 2 PID: 19 Comm: ksoftirqd/2 Tainted: G           O    4.2.6.201511282239-1-grsec #1
[  681.774955] Hardware name: ASUSTeK COMPUTER INC. X302LJ/X302LJ, BIOS X302LJ.202 03/05/2015
[  681.774956]  ffffffffa9a04863 0000000000000000 0000000000000000 ffffffffa990ff7c
[  681.774959]  ffffc90000d3bc38 ffffffffa95d2810 0000000000000007 ffffffffa991002b
[  681.774960]  ffffc90000d3bc68 ffffffffa91a44f4 0000000000000001 0000000000000001
[  681.774962] Call Trace:
[  681.774967]  [<ffffffffa95d2810>] dump_stack+0x4c/0x7f
[  681.774970]  [<ffffffffa91a44f4>] report_size_overflow+0x34/0x50
[  681.774972]  [<ffffffffa94d17e2>] qdisc_tree_decrease_qlen+0x152/0x160
[  681.774976]  [<ffffffffc02694b1>] fq_codel_dequeue+0x7b1/0x820 [sch_fq_codel]
[  681.774978]  [<ffffffffc02680a0>] ? qdisc_peek_dequeued+0xa0/0xa0 [sch_fq_codel]
[  681.774980]  [<ffffffffa94cd92d>] __qdisc_run+0x4d/0x1d0
[  681.774983]  [<ffffffffa949b2b2>] net_tx_action+0xc2/0x160
[  681.774985]  [<ffffffffa90664c1>] __do_softirq+0xf1/0x200
[  681.774987]  [<ffffffffa90665ee>] run_ksoftirqd+0x1e/0x30
[  681.774989]  [<ffffffffa90896b0>] smpboot_thread_fn+0x150/0x260
[  681.774991]  [<ffffffffa9089560>] ? sort_range+0x40/0x40
[  681.774992]  [<ffffffffa9085fe4>] kthread+0xe4/0x100
[  681.774994]  [<ffffffffa9085f00>] ? kthread_worker_fn+0x170/0x170
[  681.774995]  [<ffffffffa95d8d1e>] ret_from_fork+0x3e/0x70

mq/mqprio have their own ways to report qlen/drops by folding stats on
all their queues, with appropriate locking.

A second problem is that qdisc_tree_decrease_qlen() calls qdisc_lookup()
without proper locking : concurrent qdisc updates could corrupt the list
that qdisc_match_from_root() parses to find a qdisc given its handle.

Fix first problem adding a TCQ_F_NOPARENT qdisc flag that
qdisc_tree_decrease_qlen() can use to abort its tree traversal,
as soon as it meets a mq/mqprio qdisc children.

Second problem can be fixed by RCU protection.
Qdisc are already freed after RCU grace period, so qdisc_list_add() and
qdisc_list_del() simply have to use appropriate rcu list variants.

A future patch will add a per struct netdev_queue list anchor, so that
qdisc_tree_decrease_qlen() can have more efficient lookups.

Reported-by: Daniele Fucini <dfucini@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Cong Wang <cwang@twopensource.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 14:59:05 -05:00
..
6lowpan
9p
802
8021q vlan: Do not put vlan headers back on bridge and macvlan ports 2015-11-17 14:38:35 -05:00
appletalk
atm
ax25
batman-adv
bluetooth Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth 2015-12-03 12:04:05 -05:00
bridge switchdev: bridge: Check return code is not EOPNOTSUPP 2015-11-16 14:56:03 -05:00
caif net: rename SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA 2015-12-01 15:45:05 -05:00
can
ceph Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client 2015-11-13 09:24:40 -08:00
core ipv6: kill sk_dst_lock 2015-12-03 11:32:06 -05:00
dcb
dccp ipv6: kill sk_dst_lock 2015-12-03 11:32:06 -05:00
decnet net: rename SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA 2015-12-01 15:45:05 -05:00
dns_resolver net: dns_resolver: convert time_t to time64_t 2015-11-18 16:27:46 -05:00
dsa
ethernet
hsr net/hsr: fix a warning message 2015-11-23 14:56:15 -05:00
ieee802154
ipv4 ipv4: igmp: Allow removing groups from a removed interface 2015-12-03 12:07:05 -05:00
ipv6 ipv6: kill sk_dst_lock 2015-12-03 11:32:06 -05:00
ipx
irda
iucv net: rename SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA 2015-12-01 15:45:05 -05:00
key
l2tp ipv6: add complete rcu protection around np->opt 2015-12-02 23:37:16 -05:00
l3mdev
lapb
llc
mac80211
mac802154
mpls
netfilter ipvs: use skb_to_full_sk() helper 2015-11-15 18:39:48 -05:00
netlabel
netlink
netrom
nfc net: rename SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA 2015-12-01 15:45:05 -05:00
openvswitch openvswitch: fix hangup on vxlan/gre/geneve device deletion 2015-12-03 14:29:25 -05:00
packet packet: Allow packets with only a header (but no payload) 2015-11-29 22:17:17 -05:00
phonet
rds RDS: fix race condition when sending a message on unbound socket 2015-11-24 17:20:09 -05:00
rfkill
rose
rxrpc net: rename SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA 2015-12-01 15:45:05 -05:00
sched net_sched: fix qdisc_tree_decrease_qlen() races 2015-12-03 14:59:05 -05:00
sctp ipv6: sctp: implement sctp_v6_destroy_sock() 2015-12-03 12:05:57 -05:00
sunrpc net: rename SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA 2015-12-01 15:45:05 -05:00
switchdev
tipc tipc: fix error handling of expanding buffer headroom 2015-11-24 11:26:19 -05:00
unix net: rename SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA 2015-12-01 15:45:05 -05:00
vmw_vsock
wimax
wireless
x25
xfrm
Kconfig
Makefile
compat.c
socket.c net: fix sock_wake_async() rcu protection 2015-12-01 15:45:05 -05:00
sysctl_net.c