As per RFC2461, section 6.3.6, item #2, when no routers on the
matching list are known to be reachable or probably reachable we
do round robin on those available routes so that we make sure
to probe as many of them as possible to detect when one becomes
reachable faster.
Each routing table has a rwlock protecting the tree and the linked
list of routes at each leaf. The round robin code executes during
lookup and thus with the rwlock taken as a reader. A small local
spinlock tries to provide protection but this does not work at all
for two reasons:
1) The round-robin list manipulation, as coded, goes like this (with
read lock held):
walk routes finding head and tail
spin_lock();
rotate list using head and tail
spin_unlock();
While one thread is rotating the list, another thread can
end up with stale values of head and tail and then proceed
to corrupt the list when it gets the lock. This ends up causing
the OOPS in fib6_add() later onthat many people have been hitting.
2) All the other code paths that run with the rwlock held as
a reader do not expect the list to change on them, they
expect it to remain completely fixed while they hold the
lock in that way.
So, simply stated, it is impossible to implement this correctly using
a manipulation of the list without violating the rwlock locking
semantics.
Reimplement using a per-fib6_node round-robin pointer. This way we
don't need to manipulate the list at all, and since the round-robin
pointer can only ever point to real existing entries we don't need
to perform any locking on the changing of the round-robin pointer
itself. We only need to reset the round-robin pointer to NULL when
the entry it is pointing to is removed.
The idea is from Thomas Graf and it is very similar to how this
was implemented before the advanced router selection code when in.
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes the next pointer from 'struct rt6_info.u' union,
and renames u.next to u.dst.rt6_next.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make fib6_node 'subtree' depend on IPV6_SUBTREES.
Signed-off-by: Kim Nordlund <kim.nordlund@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Unify RT6_F_xxx and RT6_SELECT_F_xxx flags into
RT6_LOOKUP_F_xxx flags, and put them into ip6_route.h
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Acked-by: Ville Nuorvala <vnuorval@tcs.hut.fi
Signed-off-by: David S. Miller <davem@davemloft.net>
Based on MIPL2 kernel patch.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replaces the struct in6_rtmsg based interface orignating from
the ioctl interface with a struct fib6_config based on. Allows
changing the interface without breaking the ioctl interface
and avoids passing on tons of parameters.
The recently introduced struct nl_info is used to pass on
netlink authorship information for notifications.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make the following needlessly global code static:
- fib6_walker_lock
- struct fib6_walker_list
- fib6_walk_continue()
- fib6_walk()
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adds support for policy routing rules including a new
local table for routes with a local destination.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adds the framework to support multiple IPv6 routing tables.
Currently all automatically generated routes are put into the
same table. This could be changed at a later point after
considering the produced locking overhead.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Essentially netlink at the moment always reports a pid and sequence of 0
always for v6 route activities.
To understand the repurcassions of this look at:
http://lists.quagga.net/pipermail/quagga-dev/2005-June/003507.html
While fixing this, i took the liberty to resolve the outstanding issue
of IPV6 routes inserted via ioctls to have the correct pids as well.
This patch tries to behave as close as possible to the v4 routes i.e
maintains whatever PID the socket issuing the command owns as opposed to
the process. That made the patch a little bulky.
I have tested against both netlink derived utility to add/del routes as
well as ioctl derived one. The Quagga folks have tested against quagga.
This fixes the problem and so far hasnt been detected to introduce any
new issues.
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.
Let it rip!