linux/include/net
David S. Miller f11e6659ce [IPV6]: Fix routing round-robin locking.
As per RFC2461, section 6.3.6, item #2, when no routers on the
matching list are known to be reachable or probably reachable we
do round robin on those available routes so that we make sure
to probe as many of them as possible to detect when one becomes
reachable faster.

Each routing table has a rwlock protecting the tree and the linked
list of routes at each leaf.  The round robin code executes during
lookup and thus with the rwlock taken as a reader.  A small local
spinlock tries to provide protection but this does not work at all
for two reasons:

1) The round-robin list manipulation, as coded, goes like this (with
   read lock held):

	walk routes finding head and tail

	spin_lock();
	rotate list using head and tail
	spin_unlock();

   While one thread is rotating the list, another thread can
   end up with stale values of head and tail and then proceed
   to corrupt the list when it gets the lock.  This ends up causing
   the OOPS in fib6_add() later onthat many people have been hitting.

2) All the other code paths that run with the rwlock held as
   a reader do not expect the list to change on them, they
   expect it to remain completely fixed while they hold the
   lock in that way.

So, simply stated, it is impossible to implement this correctly using
a manipulation of the list without violating the rwlock locking
semantics.

Reimplement using a per-fib6_node round-robin pointer.  This way we
don't need to manipulate the list at all, and since the round-robin
pointer can only ever point to real existing entries we don't need
to perform any locking on the changing of the round-robin pointer
itself.  We only need to reset the round-robin pointer to NULL when
the entry it is pointing to is removed.

The idea is from Thomas Graf and it is very similar to how this
was implemented before the advanced router selection code when in.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-03-25 18:48:05 -07:00
..
bluetooth [PATCH] hci endianness annotations 2006-12-13 09:05:52 -08:00
irda [IRDA] net/irda/: proper prototypes 2007-02-26 11:42:43 -08:00
iucv [S390]: Add AF_IUCV socket support 2007-02-08 13:51:54 -08:00
netfilter [NETFILTER]: conntrack: fix {nf,ip}_ct_iterate_cleanup endless loops 2007-03-05 13:25:18 -08:00
sctp [SCTP]: Reset some transport and association variables on restart 2007-03-20 00:09:45 -07:00
tc_act
tipc [TIPC]: endianness annotations 2006-12-02 21:21:08 -08:00
act_api.h
addrconf.h [IPV6]: Misc endianness annotations. 2006-12-02 21:22:52 -08:00
af_unix.h
ah.h
arp.h [IPV6]: Assorted trivial endianness annotations. 2006-12-02 21:22:50 -08:00
atmclip.h [ATM]: Annotations. 2006-12-02 21:22:55 -08:00
ax25.h [PATCH] mark struct file_operations const 1 2007-02-12 09:48:44 -08:00
checksum.h [NET]: Make mangling a checksum (0 -> 0xffff on the wire) explicit. 2006-12-02 21:23:39 -08:00
cipso_ipv4.h NetLabel: use the correct CIPSOv4 MLS label limits 2006-12-02 21:24:12 -08:00
compat.h
datalink.h
dn.h [NET]: Reduce sizeof(struct flowi) by 20 bytes. 2006-10-21 20:24:01 -07:00
dn_dev.h
dn_fib.h
dn_neigh.h
dn_nsp.h
dn_route.h [DECNET]: Convert decnet route to use the new dst_entry 'next' pointer 2007-02-10 23:20:43 -08:00
dsfield.h [NET]: IP header modifier helpers annotations. 2006-12-02 21:23:40 -08:00
dst.h [NET]: Reorder fields of struct dst_entry 2007-02-10 23:20:45 -08:00
esp.h
fib_rules.h [NET]: Fix fib_rules compatibility breakage 2007-03-25 18:48:00 -07:00
flow.h [NET]: Rethink mark field in struct flowi 2006-12-02 21:21:39 -08:00
gen_stats.h
genetlink.h [GENETLINK]: Add cmd dump completion. 2006-12-02 21:32:09 -08:00
icmp.h
ieee80211.h [PATCH] ieee80211: WLAN_GET_SEQ_SEQ fix (select correct region) 2007-01-02 20:56:26 -05:00
ieee80211_crypt.h
ieee80211_radiotap.h
ieee80211softmac.h WorkStruct: make allyesconfig 2006-11-22 14:57:56 +00:00
ieee80211softmac_wx.h
if_inet6.h [IPV6]: Per-interface statistics support. 2006-12-02 21:22:08 -08:00
inet6_connection_sock.h [TCP]: Restore SKB socket owner setting in tcp_transmit_skb(). 2007-01-26 01:04:55 -08:00
inet6_hashtables.h [IPV6]: annotate inet6_hashtables 2006-12-02 21:21:10 -08:00
inet_common.h
inet_connection_sock.h [TCP]: Restore SKB socket owner setting in tcp_transmit_skb(). 2007-01-26 01:04:55 -08:00
inet_ecn.h [NET]: IP header modifier helpers annotations. 2006-12-02 21:23:40 -08:00
inet_hashtables.h [NET]: change layout of ehash table 2007-02-08 14:16:46 -08:00
inet_sock.h
inet_timewait_sock.h [INET]: twcal_jiffie should be unsigned long, not int 2007-03-05 13:32:48 -08:00
inetpeer.h [IPV4] inet_peer: Group together avl_left, avl_right, v4daddr to speedup lookups on some CPUS 2006-10-20 00:28:35 -07:00
ip.h [TCP]: Restore SKB socket owner setting in tcp_transmit_skb(). 2007-01-26 01:04:55 -08:00
ip6_checksum.h [IPV6]: Dumb typo in generic csum_ipv6_magic() 2006-12-22 11:12:07 -08:00
ip6_fib.h [IPV6]: Fix routing round-robin locking. 2007-03-25 18:48:05 -07:00
ip6_route.h [IPV6]: Misc endianness annotations. 2006-12-02 21:22:52 -08:00
ip6_tunnel.h
ip_fib.h [IPV4] nl_fib_lookup: Rename fl_fwmark to fl_mark 2006-12-02 21:21:40 -08:00
ip_mp_alg.h [NET]: Rethink mark field in struct flowi 2006-12-02 21:21:39 -08:00
ip_vs.h [NET]: ipvs checksum annotations. 2006-12-02 21:23:41 -08:00
ipcomp.h
ipconfig.h [NET]: ipconfig and nfsroot annotations 2006-12-02 21:21:09 -08:00
ipip.h [IPV6] net/ipv6/sit.c: make 2 functions static 2006-12-02 21:26:15 -08:00
ipv6.h [IPV6]: __ipv6_addr_diff() annotations and cleanup. 2006-12-02 21:22:53 -08:00
ipx.h [IPX]: Annotate and fix IPX checksum 2006-11-05 14:11:25 -08:00
iw_handler.h
lapb.h
llc.h
llc_c_ac.h
llc_c_ev.h
llc_c_st.h
llc_conn.h
llc_if.h
llc_pdu.h [LLC]: anotations 2006-12-02 21:21:23 -08:00
llc_s_ac.h
llc_s_ev.h
llc_s_st.h
llc_sap.h
mip6.h
ndisc.h [IPV6]: Misc endianness annotations. 2006-12-02 21:22:52 -08:00
neighbour.h [NET]: Fix neighbour destructor handling. 2007-03-25 18:48:01 -07:00
netdma.h
netevent.h
netlabel.h NetLabel: convert to an extensibile/sparse category bitmap 2006-12-02 21:31:36 -08:00
netlink.h [PATCH] severing skbuff.h -> mm.h 2006-12-04 02:00:34 -05:00
netrom.h [PATCH] mark struct file_operations const 1 2007-02-12 09:48:44 -08:00
nexthop.h
p8022.h
pkt_cls.h
pkt_sched.h
protocol.h [INET]: Change protocol field in struct inet_protosw to u16 2006-12-02 21:30:55 -08:00
psnap.h
raw.h
rawv6.h [IPV6]: 'info' argument of ipv6 ->err_handler() is net-endian 2006-12-02 21:21:12 -08:00
red.h
request_sock.h [PATCH] slab: remove kmem_cache_t 2006-12-07 08:39:25 -08:00
rose.h [PATCH] mark struct file_operations const 1 2007-02-12 09:48:44 -08:00
route.h [IPV4]: Convert ipv4 route to use the new dst_entry 'next' pointer 2007-02-10 23:20:38 -08:00
sch_generic.h [NET_SCHED]: Fix endless loops caused by inaccurate qlen counters (part 1) 2006-12-02 21:31:42 -08:00
scm.h
slhc_vj.h
snmp.h
sock.h [NET]: Revert incorrect accept queue backlog changes. 2007-03-06 11:21:05 -08:00
syncppp.h
tcp.h [TCP]: remove tcp header from tcp_v4_check (take #2) 2007-02-08 12:38:44 -08:00
tcp_ecn.h
tcp_states.h
timewait_sock.h [PATCH] slab: remove kmem_cache_t 2006-12-07 08:39:25 -08:00
transp_v6.h [NET]: Supporting UDP-Lite (RFC 3828) in Linux 2006-12-02 21:22:46 -08:00
udp.h [PATCH] severing skbuff.h -> poll.h 2006-12-04 02:00:31 -05:00
udplite.h [NET]: Fix assorted misannotations (from md5 and udplite merges). 2006-12-02 21:27:16 -08:00
x25.h [X.25]: Adds /proc/sys/net/x25/x25_forward to control forwarding. 2007-02-08 13:34:36 -08:00
x25device.h
xfrm.h [IPSEC]: xfrm_policy delete security check misplaced 2007-03-07 16:08:09 -08:00