linux/net/core
Jiri Olsa a57de0b433 net: adding memory barrier to the poll and receive callbacks
Adding memory barrier after the poll_wait function, paired with
receive callbacks. Adding fuctions sock_poll_wait and sk_has_sleeper
to wrap the memory barrier.

Without the memory barrier, following race can happen.
The race fires, when following code paths meet, and the tp->rcv_nxt
and __add_wait_queue updates stay in CPU caches.

CPU1                         CPU2

sys_select                   receive packet
  ...                        ...
  __add_wait_queue           update tp->rcv_nxt
  ...                        ...
  tp->rcv_nxt check          sock_def_readable
  ...                        {
  schedule                      ...
                                if (sk->sk_sleep && waitqueue_active(sk->sk_sleep))
                                        wake_up_interruptible(sk->sk_sleep)
                                ...
                             }

If there was no cache the code would work ok, since the wait_queue and
rcv_nxt are opposit to each other.

Meaning that once tp->rcv_nxt is updated by CPU2, the CPU1 either already
passed the tp->rcv_nxt check and sleeps, or will get the new value for
tp->rcv_nxt and will return with new data mask.
In both cases the process (CPU1) is being added to the wait queue, so the
waitqueue_active (CPU2) call cannot miss and will wake up CPU1.

The bad case is when the __add_wait_queue changes done by CPU1 stay in its
cache, and so does the tp->rcv_nxt update on CPU2 side.  The CPU1 will then
endup calling schedule and sleep forever if there are no more data on the
socket.

Calls to poll_wait in following modules were ommited:
	net/bluetooth/af_bluetooth.c
	net/irda/af_irda.c
	net/irda/irnet/irnet_ppp.c
	net/mac80211/rc80211_pid_debugfs.c
	net/phonet/socket.c
	net/rds/af_rds.c
	net/rfkill/core.c
	net/sunrpc/cache.c
	net/sunrpc/rpc_pipe.c
	net/tipc/socket.c

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-09 17:06:57 -07:00
..
Makefile Network Drop Monitor: Adding Build changes to enable drop monitor 2009-03-13 12:09:29 -07:00
datagram.c net: adding memory barrier to the poll and receive callbacks 2009-07-09 17:06:57 -07:00
dev.c gro: Flush GRO packets in napi_disable_pending path 2009-06-26 19:27:04 -07:00
dev_mcast.c net: Rationalise email address: Network Specific Parts 2008-10-13 19:01:08 -07:00
drop_monitor.c Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6 2009-06-15 03:02:23 -07:00
dst.c net: speedup dst_release() 2008-11-14 00:53:54 -08:00
ethtool.c core: remove pointless conditional before kfree() 2009-03-31 15:06:26 -07:00
fib_rules.c net: Remove unused parameter from fill method in fib_rules_ops. 2009-05-20 17:26:23 -07:00
filter.c filter: add SKF_AD_NLATTR_NEST to look for nested attributes 2008-11-20 00:49:27 -08:00
flow.c netns xfrm: lookup in netns 2008-11-25 17:35:18 -08:00
gen_estimator.c pkt_sched: gen_estimator: Fix signed integers right-shifts. 2009-05-25 22:47:01 -07:00
gen_stats.c [NET_SCHED]: Convert packet schedulers from rtnetlink to new netlink API 2008-01-28 15:11:10 -08:00
iovec.c net: Fix memcpy_toiovecend() to use the right offset 2009-06-08 00:25:39 -07:00
kmap_skb.h [PATCH] severing skbuff.h -> highmem.h 2006-12-04 02:00:29 -05:00
link_watch.c Revert "net: Fix for initial link state in 2.6.28" 2009-01-05 16:01:51 -08:00
neighbour.c neigh: fix state transition INCOMPLETE->FAILED via Netlink request 2009-06-11 04:16:28 -07:00
net-sysfs.c net: Remove bogus reference to BUS_ID_SIZE in sysfs code. 2009-05-26 21:05:19 -07:00
net-sysfs.h netns: Fix device renaming for sysfs 2008-05-02 17:00:58 -07:00
net-traces.c Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6 2009-06-15 03:02:23 -07:00
net_namespace.c netns: simplify net_ns_init 2009-05-21 15:10:31 -07:00
netevent.c [NET]: net/core/netevent.c should #include <net/netevent.h> 2007-07-05 17:40:27 -07:00
netpoll.c netpoll: Fix carrier detection for drivers that are using phylib 2009-07-08 20:09:44 -07:00
pktgen.c net pkgen.c:fix no need for check 2009-06-08 00:40:35 -07:00
request_sock.c net: convert BUG_TRAP to generic WARN_ON 2008-07-25 21:43:18 -07:00
rtnetlink.c netlink: change nlmsg_notify() return value logic 2009-02-24 23:18:28 -08:00
scm.c Merge branch 'master' into next 2008-11-18 18:52:37 +11:00
skb_dma_map.c net: skb_shared_info optimization 2009-06-08 00:21:48 -07:00
skbuff.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6 2009-06-18 14:07:15 -07:00
sock.c net: adding memory barrier to the poll and receive callbacks 2009-07-09 17:06:57 -07:00
stream.c tcp: tcp_prequeue() can use keyed wakeups 2009-05-17 20:44:43 -07:00
sysctl_net_core.c sysctl: fix sparse warning: Should it be static? 2009-02-26 23:13:34 -08:00
user_dma.c net/core/user_dma.c: Use frag list abstraction interfaces. 2009-06-09 00:19:10 -07:00
utils.c net: core: remove unneeded include in net/core/utils.c. 2009-03-26 01:11:48 -07:00