Based on the previous patch, make bridge support netpoll by:
1) implement the 2 methods to support netpoll for bridge;
2) modify netpoll during forwarding packets via bridge;
3) disable netpoll support of bridge when a netpoll-unabled device
is added to bridge;
4) enable netpoll support when all underlying devices support netpoll.
Cc: David Miller <davem@davemloft.net>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Stephen Hemminger <shemminger@linux-foundation.org>
Cc: Matt Mackall <mpm@selenic.com>
Signed-off-by: WANG Cong <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move some declarations around to make it clearer which variables
are being used inside loop.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The recently introduced bridge mulitcast port group list was only
partially using RCU correctly. It was missing rcu_dereference()
and missing the necessary barrier on deletion.
The code should have used one of the standard list methods (list or hlist)
instead of open coding a RCU based link list.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix unsafe usage of RCU. Would never work on Alpha SMP because
of lack of rcu_dereference()
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
By coding slightly differently, there are only two cases
to deal with: add at head and add after previous entry.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit ff65e8275f6c96a5eda57493bd84c4555decf7b3.
As explained by Stephen Hemminger, the traversal doesn't require
RCU handling as we hold a lock.
The list addition et al. calls, on the other hand, do.
Signed-off-by: David S. Miller <davem@davemloft.net>
I prefer that the hlist be only accessed through the hlist macro
objects. Explicit twiddling of links (especially with RCU) exposes
the code to future bugs.
Compile tested only.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Based upon a report from Stephen Rothwell:
--------------------
net/bridge/br_multicast.c: In function 'br_ip6_multicast_alloc_query':
net/bridge/br_multicast.c:469: error: implicit declaration of function 'csum_ipv6_magic'
Introduced by commit 08b202b6726459626c73ecfa08fcdc8c3efc76c2 ("bridge
br_multicast: IPv6 MLD support") from the net tree.
csum_ipv6_magic is declared in net/ip6_checksum.h ...
--------------------
Signed-off-by: David S. Miller <davem@davemloft.net>
Even with commit 32dec5dd0233ebffa9cae25ce7ba6daeb7df4467 ("bridge
br_multicast: Don't refer to BR_INPUT_SKB_CB(skb)->mrouters_only
without IGMP snooping."), BR_INPUT_SKB_CB(skb)->mrouters_only is
not appropriately initialized if IGMP/MLD snooping support is
compiled and disabled, so we can see garbage.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce struct br_ip{} to store ip address and protocol
and make functions more generic so that we can support
both IPv4 and IPv6 with less pain.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Sparse can help us find endianness bugs, but we need to make some
cleanups to be able to more easily spot real bugs.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
grec_nsrcs is in network order, we should convert to host horder in
br_multicast_igmp3_report()
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The MTU for IP traffic encapsulated inside PPPoE traffic is smaller
than the MTU of the Ethernet device (1500). Connection tracking
gathers all IP packets and sometimes will refragment them in
ip_fragment(). We then need to subtract the length of the
encapsulating header from the mtu used in ip_fragment(). The check in
br_nf_dev_queue_xmit() which determines if ip_fragment() has to be
called is also updated for the PPPoE-encapsulated packets.
nf_bridge_copy_header() is also updated to make sure the PPPoE data
length field has the correct value.
Signed-off-by: Bart De Schuymer <bdschuym@pandora.be>
Signed-off-by: Patrick McHardy <kaber@trash.net>
- fix IP DNAT on vlan- or pppoe-encapsulated traffic: The functions
neigh_hh_output() or dst->neighbour->output() overwrite the complete
Ethernet header, although we only need the destination MAC address.
For encapsulated packets, they ended up overwriting the encapsulating
header. The new code copies the Ethernet source MAC address and
protocol number before calling dst->neighbour->output(). The Ethernet
source MAC and protocol number are copied back in place in
br_nf_pre_routing_finish_bridge_slow(). This also makes the IP DNAT
more transparent because in the old scheme the source MAC of the
bridge was copied into the source address in the Ethernet header. We
also let skb->protocol equal ETH_P_IP resp. ETH_P_IPV6 during the
execution of the PF_INET resp. PF_INET6 hooks.
- Speed up IP DNAT by calling neigh_hh_bridge() instead of
neigh_hh_output(): if dst->hh is available, we already know the MAC
address so we can just copy it.
Signed-off-by: Bart De Schuymer <bdschuym@pandora.be>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Remove br_netfilter.c::br_nf_local_out(). The function
br_nf_local_out() was needed because the PF_BRIDGE::LOCAL_OUT hook
could be called when IP DNAT happens on to-be-bridged traffic. The
new scheme eliminates this mess.
Signed-off-by: Bart De Schuymer <bdschuym@pandora.be>
Signed-off-by: Patrick McHardy <kaber@trash.net>
ip_refrag isn't used anymore in the bridge-netfilter code
Signed-off-by: Bart De Schuymer <bdschuym@pandora.be>
Signed-off-by: Patrick McHardy <kaber@trash.net>
bridge-netfilter: cleanup br_netfilter.c
- remove some of the graffiti at the head of br_netfilter.c
- remove __br_dnat_complain()
- remove KERN_INFO messages when CONFIG_NETFILTER_DEBUG is defined
Signed-off-by: Bart De Schuymer <bdschuym@pandora.be>
Signed-off-by: Patrick McHardy <kaber@trash.net>
The IGMP3 report parsing is looking at the wrong address for
group records. This patch fixes it.
Reported-by: Banyeer <banyeer@yahoo.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the followings.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Restore function signatures from bool to int so that we can report
memory allocation failures or similar using -ENOMEM rather than
always having to pass -EINVAL back.
// <smpl>
@@
type bool;
identifier check, par;
@@
-bool check
+int check
(struct xt_tgchk_param *par) { ... }
// </smpl>
Minus the change it does to xt_ct_find_proto.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Restore function signatures from bool to int so that we can report
memory allocation failures or similar using -ENOMEM rather than
always having to pass -EINVAL back.
This semantic patch may not be too precise (checking for functions
that use xt_mtchk_param rather than functions referenced by
xt_match.checkentry), but reviewed, it produced the intended result.
// <smpl>
@@
type bool;
identifier check, par;
@@
-bool check
+int check
(struct xt_mtchk_param *par) { ... }
// </smpl>
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
The first argument to NF_HOOK* is an nfproto since quite some time.
Commit v2.6.27-2457-gfdc9314 was the first to practically start using
the new names. Do that now for the remaining NF_HOOK calls.
The semantic patch used was:
// <smpl>
@@
@@
(NF_HOOK
|NF_HOOK_THRESH
)(
-PF_BRIDGE,
+NFPROTO_BRIDGE,
...)
@@
@@
NF_HOOK(
-PF_INET6,
+NFPROTO_IPV6,
...)
@@
@@
NF_HOOK(
-PF_INET,
+NFPROTO_IPV4,
...)
// </smpl>
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Supplement to 1159683ef48469de71dc26f0ee1a9c30d131cf89.
Downgrade the log level to INFO for most checkentry messages as they
are, IMO, just an extra information to the -EINVAL code that is
returned as part of a parameter "constraint violation". Leave errors
to real errors, such as being unable to create a LED trigger.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
We never actually use iph again so this assignment can be removed.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
It's not desired for underlaying devices to change type. At the time,
there is for example possible to have bond with changed type from
Ethernet to Infiniband as a port of a bridge. This patch fixes this.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ENOMEM is a very obvious error code (cf. EINVAL), so I think we do not
really need a warning message. Not to mention that if the allocation
fails, the user is most likely going to get a stack trace from slab
already.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
The shared packet statistics are a potential source of slow down
on bridged traffic. Convert to per-cpu array, but only keep those
statistics which change per-packet.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Without CONFIG_BRIDGE_IGMP_SNOOPING,
BR_INPUT_SKB_CB(skb)->mrouters_only is not appropriately
initialized, so we can see garbage.
A clear option to fix this is to set it even without that
config, but we cannot optimize out the branch.
Let's introduce a macro that returns value of mrouters_only
and let it return 0 without CONFIG_BRIDGE_IGMP_SNOOPING.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
From: Michael Braun <michael-dev@fami-braun.de>
bridge: Fix br_forward crash in promiscuous mode
It's a linux-next kernel from 2010-03-12 on an x86 system and it
OOPs in the bridge module in br_pass_frame_up (called by
br_handle_frame_finish) because brdev cannot be dereferenced (its set to
a non-null value).
Adding some BUG_ON statements revealed that
BR_INPUT_SKB_CB(skb)->brdev == br-dev
(as set in br_handle_frame_finish first)
only holds until br_forward is called.
The next call to br_pass_frame_up then fails.
Digging deeper it seems that br_forward either frees the skb or passes
it to NF_HOOK which will in turn take care of freeing the skb. The
same is holds for br_pass_frame_ip. So it seems as if two independent
skb allocations are required. As far as I can see, commit
b33084be192ee1e347d98bb5c9e38a53d98d35e2 ("bridge: Avoid unnecessary
clone on forward path") removed skb duplication and so likely causes
this crash. This crash does not happen on 2.6.33.
I've therefore modified br_forward the same way br_flood has been
modified so that the skb is not freed if skb0 is going to be used
and I can confirm that the attached patch resolves the issue for me.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since all callers of br_mdb_ip_get need to check whether the
hash table is NULL, this patch moves the check into the function.
This fixes the two callers (query/leave handler) that didn't
check it.
Reported-by: Michael Braun <michael-dev@fami-braun.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (108 commits)
bridge: ensure to unlock in error path in br_multicast_query().
drivers/net/tulip/eeprom.c: fix bogus "(null)" in tulip init messages
sky2: Avoid rtnl_unlock without rtnl_lock
ipv6: Send netlink notification when DAD fails
drivers/net/tg3.c: change the field used with the TG3_FLAG_10_100_ONLY constant
ipconfig: Handle devices which take some time to come up.
mac80211: Fix memory leak in ieee80211_if_write()
mac80211: Fix (dynamic) power save entry
ipw2200: use kmalloc for large local variables
ath5k: read eeprom IQ calibration values correctly for G mode
ath5k: fix I/Q calibration (for real)
ath5k: fix TSF reset
ath5k: use fixed antenna for tx descriptors
libipw: split ieee->networks into small pieces
mac80211: Fix sta_mtx unlocking on insert STA failure path
rt2x00: remove KSEG1ADDR define from rt2x00soc.h
net: add ColdFire support to the smc91x driver
asix: fix setting mac address for AX88772
ipv6 ip6_tunnel: eliminate unused recursion field from ip6_tnl{}.
net: Fix dev_mc_add()
...