Commit Graph

163 Commits

Author SHA1 Message Date
Veaceslav Falico 87a7b84b58 bonding: add helper function bond_get_targets_ip(targets, ip)
Add function bond_get_targets_ip(targets, ip) which searches through
targets array of ips (arp_targets) and returns the position of first
match. If ip == 0, returns the first free slot. On failure to find the
ip or free slot, return -1.

Use it to verify if the arp we've received is valid and in sysfs.

v1->v2:
Fix "[2/6] bonding: add helper function bond_get_targets_ip(targets, ip)",
per Nikolay's advice, to verify if source ip != 0.0.0.0, otherwise we might
update 'null' arp_ip_targets' last_rx. Also, address style.

Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-25 16:58:37 -07:00
Veaceslav Falico d6641ccff9 bonding: trivial: update the comments to reflect the reality
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-28 23:57:23 -07:00
Nikolay Aleksandrov 53edee2cfb bonding: allow xmit hash policy change while bond dev is up
Since the xmit_hash_policy pointer is always valid and not dependent on
anything, we can change it while the bond device is up and running. The
only downside would be the out of order packets but that is a small price
to pay.

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-27 23:27:14 -07:00
nikolay@redhat.com 318debd897 bonding: fix multiple 3ad mode sysfs race conditions
When bond_3ad_get_active_agg_info() is used in all show_ad_ functions
it is not protected against slave manipulation and since it walks over
the slaves and uses them, this can easily result in NULL pointer
dereference or use of freed memory. Both the new wrapper and the
internal function are exported to the bonding as they're needed in
different places.

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-19 23:25:49 -07:00
nikolay@redhat.com ea6836dd7e bonding: fix set mode race conditions
Changing the mode without any locking can result in multiple races (e.g.
upping a bond, enslaving/releasing). Depending on which race is hit the
impact can vary from incosistent bond state to kernel crash.
Use RTNL to synchronize the mode setting with the dangerous races.

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-19 23:25:49 -07:00
nikolay@redhat.com 1bc7db1678 bonding: fix disabling of arp_interval and miimon
Currently if either arp_interval or miimon is disabled, they both get
disabled, and upon disabling they get executed once more which is not
the proper behaviour. Also when doing a no-op and disabling an already
disabled one, the other again gets disabled.
Also fix the error messages with the proper valid ranges, and a small
typo fix in the up delay error message (outputting "down delay", instead
of "up delay").

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-29 15:02:49 -04:00
Veaceslav Falico 9fe16b78ee bonding: remove already created master sysfs link on failure
If slave sysfs symlink failes to be created - we end up without removing
the master sysfs symlink. Remove it in case of failure.

Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-26 13:00:02 -04:00
Milos Vyletel eb492f7443 bonding: unset primary slave via sysfs
When bonding module is loaded with primary parameter and one decides to unset
primary slave using sysfs these settings are not preserved during bond device
restart. Primary slave is only unset once and it's not remembered in
bond->params structure. Below is example of recreation.

 grep OPTS /etc/sysconfig/network-scripts/ifcfg-bond0
BONDING_OPTS="mode=active-backup miimon=100 primary=eth01"
 grep "Primary Slave" /proc/net/bonding/bond0
Primary Slave: eth01 (primary_reselect always)

 echo "" > /sys/class/net/bond0/bonding/primary
 grep "Primary Slave" /proc/net/bonding/bond0
Primary Slave: None

 sed -i -e 's/primary=eth01//' /etc/sysconfig/network-scripts/ifcfg-bond0
 grep OPTS /etc/sysconfig/network-scripts/ifcfg-bond
BONDING_OPTS="mode=active-backup miimon=100 "
 ifdown bond0 && ifup bond0

without patch:
 grep "Primary Slave" /proc/net/bonding/bond0
Primary Slave: eth01 (primary_reselect always)

with patch:
 grep "Primary Slave" /proc/net/bonding/bond0
Primary Slave: None

Reviewed-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Milos Vyletel <milos.vyletel@sde.cz>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29 15:43:35 -05:00
nikolay@redhat.com e196c0e579 bonding: fix race condition in bonding_store_slaves_active
Race between bonding_store_slaves_active() and slave manipulation
 functions. The bond_for_each_slave use in bonding_store_slaves_active()
 is not protected by any synchronization mechanism.
 NULL pointer dereference is easy to reach.
 Fixed by acquiring the bond->lock for the slave walk.

 v2: Make description text < 75 columns

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-29 13:13:15 -05:00
nikolay@redhat.com fbb0c41b81 bonding: fix miimon and arp_interval delayed work race conditions
First I would give three observations which will be used later.
Observation 1: if (delayed_work_pending(wq)) cancel_delayed_work(wq)
 This usage is wrong because the pending bit is cleared just before the
 work's fn is executed and if the function re-arms itself we might end up
 with the work still running. It's safe to call cancel_delayed_work_sync()
 even if the work is not queued at all.
Observation 2: Use of INIT_DELAYED_WORK()
 Work needs to be initialized only once prior to (de/en)queueing.
Observation 3: IFF_UP is set only after ndo_open is called

Related race conditions:
1. Race between bonding_store_miimon() and bonding_store_arp_interval()
 Because of Obs.1 we can end up having both works enqueued.
2. Multiple races with INIT_DELAYED_WORK()
 Since the works are not protected by anything between INIT_DELAYED_WORK()
 and calls to (en/de)queue it is possible for races between the following
 functions:
 (races are also possible between the calls to INIT_DELAYED_WORK()
  and workqueue code)
 bonding_store_miimon() - bonding_store_arp_interval(), bond_close(),
			  bond_open(), enqueued functions
 bonding_store_arp_interval() - bonding_store_miimon(), bond_close(),
				bond_open(), enqueued functions
3. By Obs.1 we need to change bond_cancel_all()

Bugs 1 and 2 are fixed by moving all work initializations in bond_open
which by Obs. 2 and Obs. 3 and the fact that we make sure that all works
are cancelled in bond_close(), is guaranteed not to have any work
enqueued.
Also RTNL lock is now acquired in bonding_store_miimon/arp_interval so
they can't race with bond_close and bond_open. The opposing work is
cancelled only if the IFF_UP flag is set and it is cancelled
unconditionally. The opposing work is already cancelled if the interface
is down so no need to cancel it again. This way we don't need new
synchronizations for the bonding workqueue. These bugs (and fixes) are
tied together and belong in the same patch.
Note: I have left 1 line intentionally over 80 characters (84) because I
      didn't like how it looks broken down. If you'd prefer it otherwise,
      then simply break it.

 v2: Make description text < 75 columns

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-29 13:13:15 -05:00
nikolay@redhat.com c84e1590d1 bonding: fix second off-by-one error
Fix off-by-one error because IFNAMSIZ == 16 and when this
code gets executed we stick a NULL byte where we should not.

How to reproduce:
 with CONFIG_CC_STACKPROTECTOR=y (otherwise it may pass by silently)
 modprobe bonding; echo 1 > /sys/class/net/bond0/bonding/mode;
 echo "AAAAAAAAAAAAAAAA" > /sys/class/net/bond0/bonding/active_slave;

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>

Note: Sorry for the second patch but I missed this one while checking
      the file. You can squash them into one patch.
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-01 11:53:44 -04:00
nikolay@redhat.com eb6e98a1b2 bonding: fix off-by-one error
Fix off-by-one error because IFNAMSIZ == 16 and when this
code gets executed we stick a NULL byte where we should not.

How to reproduce:
 with CONFIG_CC_STACKPROTECTOR=y (otherwise it may pass by silently)
 modprobe bonding; echo 1 > /sys/class/net/bond0/bonding/mode;
 echo "AAAAAAAAAAAAAAAA" > /sys/class/net/bond0/bonding/primary;

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-01 11:53:43 -04:00
Jiri Pirko 8a540ff9e1 bond_sysfs: use real_num_tx_queues rather than params.tx_queue
Since now number of tx queues can be specified during bond instance
creation and therefore it may differ from params.tx_queues, use rather
real_num_tx_queues for boundary check.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-20 11:07:00 -07:00
Weiping Pan 8a93664df9 bonding:record primary when modify it via sysfs
If we modify primary via sysfs and it is not a valid slave,
we should record it for future use, and this behavior is the same with
bond_check_params().

Signed-off-by: Weiping Pan <wpan@redhat.com>
Acked-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-12 15:23:11 -07:00
Greg Kroah-Hartman ff4b8a57f0 Merge branch 'driver-core-next' into Linux 3.2
This resolves the conflict in the arch/arm/mach-s3c64xx/s3c6400.c file,
and it fixes the build error in the arch/x86/kernel/microcode_core.c
file, that the merge did not catch.

The microcode_core.c patch was provided by Stephen Rothwell
<sfr@canb.auug.org.au> who was invaluable in the merge issues involved
with the large sysdev removal process in the driver-core tree.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2012-01-06 11:42:52 -08:00
Kay Sievers edbaa603eb driver-core: remove sysdev.h usage.
The sysdev.h file should not be needed by any in-kernel code, so remove
the .h file from these random files that seem to still want to include
it.

The sysdev code will be going away soon, so this include needs to be
removed no matter what.

Cc: Jiandong Zheng <jdzheng@broadcom.com>
Cc: Scott Branden <sbranden@broadcom.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Kukjin Kim <kgene.kim@samsung.com>
Cc: David Brown <davidb@codeaurora.org>
Cc: Daniel Walker <dwalker@fifo99.com>
Cc: Bryan Huntsman <bryanh@codeaurora.org>
Cc: Ben Dooks <ben-linux@fluff.org>
Cc: Wan ZongShun <mcuos.com@gmail.com>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: "Venkatesh Pallipadi
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Grant Likely <grant.likely@secretlab.ca>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
2011-12-21 16:26:03 -08:00
Veaceslav Falico 4a8bb7e27f bonding: Don't allow mode change via sysfs with slaves present
When changing mode via bonding's sysfs, the slaves are not initialized
correctly. Forbid to change modes with slaves present to ensure that every
slave is initialized correctly via bond_enslave().

Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Acked-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-17 19:31:54 -05:00
Eric W. Biederman 01718e36df bonding: Add a forgetten sysfs_attr_init on class_attr_bonding_masters
When I made class_attr_bonding_matters per network namespace and dynamically
allocated I overlooked the need for calling sysfs_attr_init.  Oops.

This fixes the following lockdep splat:

[    5.749651] bonding: Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
[    5.749655] bonding: MII link monitoring set to 100 ms
[    5.749676] BUG: key f49a831c not in .data!
[    5.749677] ------------[ cut here ]------------
[    5.749752] WARNING: at kernel/lockdep.c:2897 lockdep_init_map+0x1c3/0x460()
[    5.749809] Hardware name: ProLiant BL460c G1
[    5.749862] Modules linked in: bonding(+)
[    5.749978] Pid: 3177, comm: modprobe Not tainted 3.1.0-rc9-02177-gf2d1a4e-dirty #1157
[    5.750066] Call Trace:
[    5.750120]  [<c1352c2f>] ? printk+0x18/0x21
[    5.750176]  [<c103112d>] warn_slowpath_common+0x6d/0xa0
[    5.750231]  [<c1060133>] ? lockdep_init_map+0x1c3/0x460
[    5.750287]  [<c1060133>] ? lockdep_init_map+0x1c3/0x460
[    5.750342]  [<c103117d>] warn_slowpath_null+0x1d/0x20
[    5.750398]  [<c1060133>] lockdep_init_map+0x1c3/0x460
[    5.750453]  [<c1355ddd>] ? _raw_spin_unlock+0x1d/0x20
[    5.750510]  [<c11255c8>] ? sysfs_new_dirent+0x68/0x110
[    5.750565]  [<c1124d4b>] sysfs_add_file_mode+0x8b/0xe0
[    5.750621]  [<c1124db3>] sysfs_add_file+0x13/0x20
[    5.750675]  [<c1124e7c>] sysfs_create_file+0x1c/0x20
[    5.750737]  [<c1208f09>] class_create_file+0x19/0x20
[    5.750794]  [<c12c186f>] netdev_class_create_file+0xf/0x20
[    5.750853]  [<f85deaf4>] bond_create_sysfs+0x44/0x90 [bonding]
[    5.750911]  [<f8410947>] ? bond_create_proc_dir+0x1e/0x3e [bonding]
[    5.750970]  [<f841007e>] bond_net_init+0x7e/0x87 [bonding]
[    5.751026]  [<f8410000>] ? 0xf840ffff
[    5.751080]  [<c12abc7a>] ops_init.clone.4+0xba/0x100
[    5.751135]  [<c12abdb2>] ? register_pernet_subsys+0x12/0x30
[    5.751191]  [<c12abd03>] register_pernet_operations.clone.3+0x43/0x80
[    5.751249]  [<c12abdb9>] register_pernet_subsys+0x19/0x30
[    5.751306]  [<f84108b9>] bonding_init+0x832/0x8a2 [bonding]
[    5.751363]  [<c10011f0>] do_one_initcall+0x30/0x160
[    5.751420]  [<f8410087>] ? bond_net_init+0x87/0x87 [bonding]
[    5.751477]  [<c106d5cf>] sys_init_module+0xef/0x1890
[    5.751533]  [<c1356490>] sysenter_do_call+0x12/0x36
[    5.751588] ---[ end trace 89f492d83a7f5006 ]---

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Tested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-22 05:08:44 -04:00
Eric W. Biederman 4c22400ab6 bonding: Use a per netns implementation of /sys/class/net/bonding_masters.
This fixes a network namespace misfeature that bonding_masters looked at
current instead of the remembering the context where in which
/sys/class/net/bonding_masters was opened in to see which network
namespace to act upon.

This removes the need for sysfs to handle tagged directories with
untagged members allowing for a conceptually simpler sysfs
implementation.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-19 19:24:15 -04:00
Andy Gospodarek f4bb2e9c4f bonding: fix string comparison errors
When a bond contains a device where one name is the subset of another
(eth1 and eth10, for example), one cannot properly set the primary
device or the currently active device.

This was reported and based on work by Takuma Umeya.  I also verified
the problem and tested that this fix resolves it.

V2: A few did not like the the current code or my changes, so I
refactored bonding_store_primary and bonding_store_active_slave to be a
bit cleaner, dropped the use of strnicmp since we did not really need
the comparison to be case insensitive, and formatted the input string
from sysfs so a comparison to IFNAMSIZ could be used.

I also discovered an error in bonding_store_active_slave that would
modify bond->primary_slave rather than bond->curr_active_slave before
forcing the bonding driver to choose a new active slave.

V3: Actually sending the proper patch....

Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Reported-by: Takuma Umeya <tumeya@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-27 22:39:30 -07:00
stephen hemminger 655f8919d5 bonding: add min links parameter to 802.3ad
This adds support for a configuring the minimum number of links that
must be active before asserting carrier. It is similar to the Cisco
EtherChannel min-links feature. This allows setting the minimum number
of member ports that must be up (link-up state) before marking the
bond device as up (carrier on). This is useful for situations where
higher level services such as clustering want to ensure a minimum
number of low bandwidth links are active before switchover.

See:
   http://bugzilla.vyatta.com/show_bug.cgi?id=7196

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-23 02:12:55 -07:00
Peter Pan(潘卫平) ba824a8b2d bonding: make 802.3ad use latest lacp_rate
There is bug that when you modify lacp_rate via sysfs,
802.3ad won't use the new value of lacp_rate to transmit packets.
This is because port->actor_oper_port_state isn't changed.

Signed-off-by: Weiping Pan <panweiping3@gmail.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-09 15:02:18 -07:00
Flavio Leitner 94265cf5f7 bonding: documentation and code cleanup for resend_igmp
Improves the documentation about how IGMP resend parameter
works, fix two missing checks and coding style issues.

Signed-off-by: Flavio Leitner <fbl@redhat.com>
Acked-by: Rick Jones <rick.jones2@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-25 17:55:33 -04:00
Neil Horman 9fe0617d9b bonding: prevent deadlock on slave store with alb mode (v3)
This soft lockup was recently reported:

[root@dell-per715-01 ~]# echo +bond5 > /sys/class/net/bonding_masters
[root@dell-per715-01 ~]# echo +eth1 > /sys/class/net/bond5/bonding/slaves
bonding: bond5: doing slave updates when interface is down.
bonding bond5: master_dev is not up in bond_enslave
[root@dell-per715-01 ~]# echo -eth1 > /sys/class/net/bond5/bonding/slaves
bonding: bond5: doing slave updates when interface is down.

BUG: soft lockup - CPU#12 stuck for 60s! [bash:6444]
CPU 12:
Modules linked in: bonding autofs4 hidp rfcomm l2cap bluetooth lockd sunrpc
be2d
Pid: 6444, comm: bash Not tainted 2.6.18-262.el5 #1
RIP: 0010:[<ffffffff80064bf0>]  [<ffffffff80064bf0>]
.text.lock.spinlock+0x26/00
RSP: 0018:ffff810113167da8  EFLAGS: 00000286
RAX: ffff810113167fd8 RBX: ffff810123a47800 RCX: 0000000000ff1025
RDX: 0000000000000000 RSI: ffff810123a47800 RDI: ffff81021b57f6f8
RBP: ffff81021b57f500 R08: 0000000000000000 R09: 000000000000000c
R10: 00000000ffffffff R11: ffff81011d41c000 R12: ffff81021b57f000
R13: 0000000000000000 R14: 0000000000000282 R15: 0000000000000282
FS:  00002b3b41ef3f50(0000) GS:ffff810123b27940(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002b3b456dd000 CR3: 000000031fc60000 CR4: 00000000000006e0

Call Trace:
 [<ffffffff80064af9>] _spin_lock_bh+0x9/0x14
 [<ffffffff886937d7>] :bonding:tlb_clear_slave+0x22/0xa1
 [<ffffffff8869423c>] :bonding:bond_alb_deinit_slave+0xba/0xf0
 [<ffffffff8868dda6>] :bonding:bond_release+0x1b4/0x450
 [<ffffffff8006457b>] __down_write_nested+0x12/0x92
 [<ffffffff88696ae4>] :bonding:bonding_store_slaves+0x25c/0x2f7
 [<ffffffff801106f7>] sysfs_write_file+0xb9/0xe8
 [<ffffffff80016b87>] vfs_write+0xce/0x174
 [<ffffffff80017450>] sys_write+0x45/0x6e
 [<ffffffff8005d28d>] tracesys+0xd5/0xe0

It occurs because we are able to change the slave configuarion of a bond while
the bond interface is down.  The bonding driver initializes some data structures
only after its ndo_open routine is called.  Among them is the initalization of
the alb tx and rx hash locks.  So if we add or remove a slave without first
opening the bond master device, we run the risk of trying to lock/unlock a
spinlock that has garbage for data in it, which results in our above softlock.

Note that sometimes this works, because in many cases an unlocked spinlock has
the raw_lock parameter initialized to zero (meaning that the kzalloc of the
net_device private data is equivalent to calling spin_lock_init), but thats not
true in all cases, and we aren't guaranteed that condition, so we need to pass
the relevant spinlocks through the spin_lock_init function.

Fix it by moving the spin_lock_init calls for the tx and rx hashtable locks to
the ndo_init path, so they are ready for use by the bond_store_slaves path.

Change notes:
v2) Based on conversation with Jay and Nicolas it seems that the ability to
enslave devices while the bond master is down should be safe to do.  As such
this is an outlier bug, and so instead we'll just initalize the errant spinlocks
in the init path rather than the open path, solving the problem.  We'll also
remove the warnings about the bond being down during enslave operations, since
it should be safe

v3) Fix spelling error

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reported-by: jtluka@redhat.com
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: nicolas.2p.debian@gmail.com
CC: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-25 17:55:33 -04:00
Ben Hutchings ad246c992b ipv4, ipv6, bonding: Restore control over number of peer notifications
For backward compatibility, we should retain the module parameters and
sysfs attributes to control the number of peer notifications
(gratuitous ARPs and unsolicited NAs) sent after bonding failover.
Also, it is possible for failover to take place even though the new
active slave does not have link up, and in that case the peer
notification should be deferred until it does.

Change ipv4 and ipv6 so they do not automatically send peer
notifications on bonding failover.

Change the bonding driver to send separate NETDEV_NOTIFY_PEERS
notifications when the link is up, as many times as requested.  Since
it does not directly control which protocols send notifications, make
num_grat_arp and num_unsol_na aliases for a single parameter.  Bump
the bonding version number and update its documentation.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Acked-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-29 12:44:11 -07:00
Jiri Pirko 3aba891dde bonding: move processing of recv handlers into handle_frame()
Since now when bonding uses rx_handler, all traffic going into bond
device goes thru bond_handle_frame. So there's no need to go back into
bonding code later via ptype handlers. This patch converts
original ptype handlers into "bonding receive probes". These functions
are called from bond_handle_frame and they are registered per-mode.

Note that vlan packets are also handled because they are always untagged
thanks to vlan_untag()

Note that this also allows arpmon for eth-bond-bridge-vlan topology.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-25 12:00:30 -07:00
Ben Hutchings 7c89943236 bonding, ipv4, ipv6, vlan: Handle NETDEV_BONDING_FAILOVER like NETDEV_NOTIFY_PEERS
It is undesirable for the bonding driver to be poking into higher
level protocols, and notifiers provide a way to avoid that.  This does
mean removing the ability to configure reptitition of gratuitous ARPs
and unsolicited NAs.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-17 23:36:03 -07:00
Jiri Pirko 2d7011ca79 bonding: get rid of IFF_SLAVE_INACTIVE netdev->priv_flag
Since bond-related code was moved from net/core/dev.c into bonding,
IFF_SLAVE_INACTIVE is no longer needed. Replace is with flag "inactive"
stored in slave structure

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Reviewed-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-16 12:51:20 -07:00
Jiri Pirko e30bc066ab bonding: wrap slave state work
transfers slave->state into slave->backup (that it's going to transfer
into bitfield. Introduce wrapper inlines to do the work with it.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Reviewed-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-16 12:51:20 -07:00
Jiri Pirko 0bd80dad57 net: get rid of multiple bond-related netdevice->priv_flags
Now when bond-related code is moved from net/core/dev.c into bonding
code, multiple priv_flags are not needed anymore. So let them rot.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Reviewed-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-16 12:51:19 -07:00
Phil Oester 5f86cad1e8 bonding: Improve syslog message at device creation time
When the bonding module is loaded, it creates bond0 by default.
Then, when attempting to create bond0, the following messages
are printed to syslog:

    kernel: bonding: bond0 is being created...
    kernel: bonding: Bond creation failed.

Which seems to indicate a problem, when in reality there is no
problem.  Since the actual error code is passed down from bond_create,
make use of it to print a bit less ominous message:

    kernel: bonding: bond0 is being created...
    kernel: bond0 already exists.

Signed-off-by: Phil Oester <kernel@linuxace.com>
Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-15 19:29:39 -07:00
Jiri Pirko 672bda3370 bonding: fix return value of couple of store functions
count is incorrectly returned even in case of fail. Return ret instead.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-25 13:13:16 -08:00
Neil Horman e843fa5088 bonding: Fix deadlock in bonding driver resulting from internal locking when using netpoll
The monitoring paths in the bonding driver take write locks that are shared by
the tx path.  If netconsole is in use, these paths can call printk which puts us
in the netpoll tx path, which, if netconsole is attached to the bonding driver,
result in deadlock (the xmit_lock guards are useless in netpoll_send_skb, as the
monitor paths in the bonding driver don't claim the xmit_lock, nor should they).
The solution is to use a per cpu flag internal to the driver to indicate when a
cpu is holding the lock in a path that might recusrse into the tx path for the
driver via netconsole.  By checking this flag on transmit, we can defer the
sending of the netconsole frames until a later time using the retransmit feature
of netpoll_send_skb that is triggered on the return code NETDEV_TX_BUSY.  I've
tested this and am able to transmit via netconsole while causing failover
conditions on the bond slave links.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-18 08:32:07 -07:00
Flavio Leitner c2952c314b bonding: add retransmit membership reports tunable
Allow sysadmins to configure the number of multicast
membership report sent on a link failure event.

Signed-off-by: Flavio Leitner <fleitner@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-05 20:26:58 -07:00
Andy Gospodarek c5cb002fb0 bonding: prevent sysfs from allowing arp monitoring with alb/tlb
When using module options arp monitoring and balance-alb/balance-tlb
are mutually exclusive options.  Anytime balance-alb/balance-tlb are
enabled mii monitoring is forced to 100ms if not set.  When configuring
via sysfs no checking is currently done.

Handling these cases with sysfs has to be done a bit differently because
we do not have all configuration information available at once.  This
patch will not allow a mode change to balance-alb/balance-tlb if
arp_interval is already non-zero.  It will also not allow the user to
set a non-zero arp_interval value if the mode is already set to
balance-alb/balance-tlb.  They are still mutually exclusive on a
first-come, first serve basis.

Tested with initscripts on Fedora and manual setting via sysfs.

Signed-off-by: Andy Gospodarek <gospo@redhat.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-30 23:27:57 -07:00
Nicolas de Pesloüan 79236680bd bonding: fix a buffer overflow in bonding_show_queue_id.
The test for buffer overflow ensures we have room for 6 more bytes.
sprintf, called with %s:%d, slave->dev->name, slave->queue_id may yield
far more than 6 bytes.

The correct test is res > (PAGE_SIZE - IFNAMSIZ - 6) .

Signed-off-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-14 18:24:54 -07:00
Andy Gospodarek bb1d912323 bonding: allow user-controlled output slave selection
v2: changed bonding module version, modified to apply on top of changes
from previous patch in series, and updated documentation to elaborate on
multiqueue awareness that now exists in bonding driver.

This patch give the user the ability to control the output slave for
round-robin and active-backup bonding.  Similar functionality was
discussed in the past, but Jay Vosburgh indicated he would rather see a
feature like this added to existing modes rather than creating a
completely new mode.  Jay's thoughts as well as Neil's input surrounding
some of the issues with the first implementation pushed us toward a
design that relied on the queue_mapping rather than skb marks.
Round-robin and active-backup modes were chosen as the first users of
this slave selection as they seemed like the most logical choices when
considering a multi-switch environment.

Round-robin mode works without any modification, but active-backup does
require inclusion of the first patch in this series and setting
the 'all_slaves_active' flag.  This will allow reception of unicast traffic on
any of the backup interfaces.

This was tested with IPv4-based filters as well as VLAN-based filters
with good results.

More information as well as a configuration example is available in the
patch to Documentation/networking/bonding.txt.

Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-05 02:23:17 -07:00
Andy Gospodarek ebd8e4977a bonding: add all_slaves_active parameter
v2: changed parameter name from 'keep_all' to 'all_slaves_active' and
skipped setting slaves to inactive rather than creating a new flag at
Jay's suggestion.

In an effort to suppress duplicate frames on certain bonding modes
(specifically the modes that do not require additional configuration on
the switch or switches connected to the host), code was added in the
generic receive patch in 2.6.16.  The current behavior works quite well
for most users, but there are some times it would be nice to restore old
functionality and allow all frames to make their way up the stack.

This patch adds support for a new module option and sysfs file called
'all_slaves_active' that will restore pre-2.6.16 functionality if the
user desires.  The default value is '0' and retains existing behavior,
but the user can set it to '1' and allow all frames up if desired.

Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-05 02:23:17 -07:00
Jiri Pirko c20811a79e bonding: move dev_addr cpy to bond_enslave
Move the code that copies slave's mac address in case that's the first slave into
bond_enslave. Ifenslave app does this also but that's not a problem. This is
something that should be done in bond_enslave, and it shound not matter from
where is it called.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-02 04:16:23 -07:00
Jiri Pirko f9f3545e1e bonding: make bonding_store_slaves simpler
This patch makes bonding_store_slaves function nicer and easier to understand.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-02 03:39:42 -07:00
Jiri Pirko 3dd90905e0 bonding: remove redundant checks from bonding_store_slaves V2
(it's actually the same as v1)

Remove checks that duplicates similar checks in bond_enslave.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-02 03:39:42 -07:00
Jiri Pirko b15ba0fbdc bonding: move slave MTU handling from sysfs V2
V1->V2: corrected res/ret use

For some reason, MTU handling (storing, and restoring) is taking  place in
bond_sysfs. The correct place for this code is in bond_enslave, bond_release.
So move it there.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-02 03:39:41 -07:00
Jiri Pirko 6458590999 bonding: remove unused variable "found"
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-02 03:39:40 -07:00
Andi Kleen 28812fe11a driver-core: Add attribute argument to class_attribute show/store
Passing the attribute to the low level IO functions allows all kinds
of cleanups, by sharing low level IO code without requiring
an own function for every piece of data.

Also drivers can extend the attributes with own data fields
and use that in the low level function.

This makes the class attributes the same as sysdev_class attributes
and plain attributes.

This will allow further cleanups in drivers.

Full tree sweep converting all users.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-07 17:04:48 -08:00
Joe Perches a4aee5c808 drivers/net/bonding/: : use pr_fmt
Add #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
Remove DRV_NAME from pr_<level>s
Consolidate long format strings
Remove some extra tab indents
Remove some unnecessary ()s from pr_<level>s arguments
Align pr_<level> arguments

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-13 20:06:07 -08:00
David S. Miller 3505d1a9fd Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/sfc/sfe4001.c
	drivers/net/wireless/libertas/cmd.c
	drivers/staging/Kconfig
	drivers/staging/Makefile
	drivers/staging/rtl8187se/Kconfig
	drivers/staging/rtl8192e/Kconfig
2009-11-18 22:19:03 -08:00
Eric W. Biederman ec87fd3b4e bond: Add support for multiple network namespaces
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-30 12:41:21 -07:00
Eric W. Biederman 6151b3d435 bond: Simply bond sysfs group creation
This patch delegates the work of creating the sysfs groups
to the netdev layer and ultimately to the device layer.  This
closes races between uevents.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-30 12:41:19 -07:00
Alexey Dobriyan d43c36dc6b headers: remove sched.h from interrupt.h
After m68k's task_thread_info() doesn't refer to current,
it's possible to remove sched.h from interrupt.h and not break m68k!
Many thanks to Heiko Carstens for allowing this.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2009-10-11 11:20:58 -07:00
Jiri Pirko a549952ad3 bonding: introduce primary_reselect option
In some cases there is not desirable to switch back to primary interface when
it's link recovers and rather stay with currently active one. We need to avoid
packetloss as much as we can in some cases. This is solved by introducing
primary_reselect option. Note that enslaved primary slave is set as current
active no matter what.

Patch modified by Jay Vosburgh as follows: fixed bug in action
after change of option setting via sysfs, revised the documentation
update, and bumped the bonding version number.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-07 01:07:39 -07:00
Jiri Pirko ce501caf16 bonding: set primary param via sysfs
Primary module parameter passed to bonding is pernament. That means if you
release the primary slave and enslave it again, it becomes the primary slave
again. But if you set primary slave via sysfs, the primary slave is only set
once and it's not remembered in bond->params structure. Therefore the setting is
lost after releasing the primary slave. This simple one-liner fixes this.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-01 14:34:29 -07:00
Jiri Pirko e5e2a8fd83 bonding: wipe out printk's
I did not introduce new lines over 80 chars. I even eliminated some of
them.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-13 16:43:32 -07:00
Stephen Hemminger 5c5129b54f bonding: use is_zero_ether_addr
Remove bogus non-portable possibly unaligned way of testing
for zero addres..

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-13 23:29:03 -07:00
Stephen Hemminger 373500db92 bonding: network device names are case sensative
The bonding device acts unlike all other Linux network device functions
in that it ignores case of device names. The developer must have come
from windows!

Cleanup the management of names and use standard routines where possible.
Flag places where bonding device still doesn't work right with network
namespaces.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-13 23:29:01 -07:00
Stephen Hemminger 6d7ab43ccc bonding: elminate bad refcount code
The "expected_refcount" stuff in bonding sysfs module is a mistake.
Sysfs does proper refcounting, and it is okay to remove a bond device
that has some user process holding the file open.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-13 23:29:00 -07:00
Stephen Hemminger 3d632c3f28 bonding: fix style issues
Resolve some of the complaints from checkpatch, and remove "magic emacs format"
comments, and useless MODULE_SUPPORTED_DEVICE(). But should not
change actual code.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-13 23:28:57 -07:00
Stephen Hemminger 9e71626c1c bonding: fix destructor
It is not safe to use a network device destructor that is a function in
the module, since it can be called after module is unloaded if sysfs
handle is open.

When eventually using netlink, the device cleanup code needs to be done
via uninit function.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-13 23:28:56 -07:00
Stephen Hemminger 7e08384045 bonding: remove bonding read/write semaphore
The whole read/write semaphore locking can be removed. It doesn't add any
protection that isn't already done by using the RTNL mutex properly.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-13 23:28:54 -07:00
Stephen Hemminger d2991f7535 bonding: bond_create always called with default parameters
bond_create() is always called with same parameters so move the argument
down.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-13 23:28:51 -07:00
Stephen Hemminger 130aa61a77 bonding: fix multiple module load problem
Some users still load bond module multiple times to create bonding
devices.  This accidentally was broken by a later patch about
the time sysfs was fixed.  According to Jay, it was broken
by:
   commit b8a9787edd
   Author: Jay Vosburgh <fubar@us.ibm.com>
   Date:   Fri Jun 13 18:12:04 2008 -0700

     bonding: Allow setting max_bonds to zero

Note: sysfs and procfs still produce WARN() messages when this is done
so the sysfs method is the recommended API.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-11 05:46:04 -07:00
Eric W. Biederman 496a60cdcd net: FIX bonding sysfs rtnl_lock deadlock
Sysfs files for a network device can not unconditionally take the
rtnl_lock as the bonding sysfs files do.  If someone accesses those
sysfs files while the network device is being unregistered with the
rtnl_lock held we will deadlock.

So use trylock and restart_syscall to avoid this problem.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-18 22:16:00 -07:00
Brian Haley 5a31bec014 Bonding: fix zero address hole bug in arp_ip_target list
Fix a zero address hole bug in the bonding arp_ip_target list
that was causing the bond to ignore ARP replies (bugz 13006).
Instead of just setting the array entry to zero, we now
copy any additional entries down one slot, putting the
zero entry at the end.  With this change we can now have
all the loops that walk the array stop when they hit a zero
since there will be no addresses after it.

Changes are based in part on code fragment provided in kernel:
bugzilla 13006:

	http://bugzilla.kernel.org/show_bug.cgi?id=13006

by Steve Howard <steve@astutenetworks.com>

Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-13 00:12:41 -07:00
Hannes Eder b06715b7a3 drivers/net/bonding: fix sparse warnings: move decls to header file
Fix this sparse warnings:

  drivers/net/bonding/bond_main.c:104:20: warning: symbol 'bonding_defaults' was not declared. Should it be static?
  drivers/net/bonding/bond_main.c:204:22: warning: symbol 'ad_select_tbl' was not declared. Should it be static?
  drivers/net/bonding/bond_sysfs.c:60:21: warning: symbol 'bonding_rwsem' was not declared. Should it be static?

Signed-off-by: Hannes Eder <hannes@hanneseder.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-12-25 23:58:57 -08:00
Holger Eitzenberger d78755237f bonding: remove duplicate declarations
Remove some declarations from bonding.c as they are declared in bonding.h
already.

Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-12-09 23:09:49 -08:00
Holger Eitzenberger 5a03cdb7f2 bonding: use pr_debug instead of own macros
Use pr_debug() instead of own macros.

Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-12-09 23:09:22 -08:00
Stephen Hemminger eb7cc59a03 bonding: convert to net_device_ops
Convert to net_device_ops table.
Note: for some operations move error checking into generic networking
layer (rather than looking at pointers in bonding).

A couple of gratituous style cleanups to get rid of extra {}

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-19 22:42:42 -08:00
Wang Chen 454d7c9b14 netdevice: safe convert to netdev_priv() #part-1
We have some reasons to kill netdev->priv:
1. netdev->priv is equal to netdev_priv().
2. netdev_priv() wraps the calculation of netdev->priv's offset, obviously
   netdev_priv() is more flexible than netdev->priv.
But we cann't kill netdev->priv, because so many drivers reference to it
directly.

This patch is a safe convert for netdev->priv to netdev_priv(netdev).
Since all of the netdev->priv is only for read.
But it is too big to be sent in one mail.
I split it to 4 parts and make every part smaller than 100,000 bytes,
which is max size allowed by vger.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-12 23:37:49 -08:00
Jay Vosburgh fd989c8332 bonding: alternate agg selection policies for 802.3ad
This patch implements alternative aggregator selection policies
for 802.3ad.  The existing policy, now termed "stable," selects the active
aggregator by greatest bandwidth, and only reselects a new aggregator
if the active aggregator is entirely disabled (no more ports or all ports
down).

	This patch adds two new policies: bandwidth and count, selecting
the active aggregator by total bandwidth (like the stable policy) or by
the number of ports in the aggregator, respectively.  These two policies
also differ from the stable policy in that they will reselect the active
aggregator when availability-related changes occur in the bond (e.g.,
link state change).

	This permits "gang failover" within 802.3ad, allowing redundant
aggregators along parallel paths to always maintain the "best" aggregator
as the active aggregator (rather than having to wait for the active to
entirely fail).

	This patch also updates the driver version to 3.5.0.

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-11-06 00:49:47 -05:00
Brian Haley 305d552acc bonding: send IPv6 neighbor advertisement on failover
This patch adds better IPv6 failover support for bonding devices,
especially when in active-backup mode and there are only IPv6 addresses
configured, as reported by Alex Sidorenko.

- Creates a new file, net/drivers/bonding/bond_ipv6.c, for the
   IPv6-specific routines.  Both regular bonds and VLANs over bonds
   are supported.

- Adds a new tunable, num_unsol_na, to limit the number of unsolicited
   IPv6 Neighbor Advertisements that are sent on a failover event.
   Default is 1.

- Creates two new IPv6 neighbor discovery functions:

   ndisc_build_skb()
   ndisc_send_skb()

   These were required to support VLANs since we have to be able to
   add the VLAN id to the skb since ndisc_send_na() and friends
   shouldn't be asked to do this.  These two routines are basically
   __ndisc_send() split into two pieces, in a slightly different order.

- Updates Documentation/networking/bonding.txt and bumps the rev of bond
   support to 3.4.0.

On failover, this new code will generate one packet:

- An unsolicited IPv6 Neighbor Advertisement, which helps the switch
   learn that the address has moved to the new slave.

Testing has shown that sending just the NA results in pretty good
behavior when in active-back mode, I saw no lost ping packets for example.

Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-11-06 00:49:37 -05:00
Jay Vosburgh 6cf3f41e6c bonding, net: Move last_rx update into bonding recv logic
The only user of the net_device->last_rx field is bonding.
This patch adds a conditional update of last_rx to the bonding special
logic in skb_bond_should_drop, causing last_rx to only be updated when
the ARP monitor is running.

	This frees network device drivers from the necessity of
updating last_rx, which can have cache line thrash issues.

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-03 18:16:50 -08:00
Harvey Harrison 63779436ab drivers: replace NIPQUAD()
Using NIPQUAD() with NIPQUAD_FMT, %d.%d.%d.%d or %u.%u.%u.%u
can be replaced with %pI4

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-31 00:56:00 -07:00
Johannes Berg e174961ca1 net: convert print_mac to %pM
This converts pretty much everything to print_mac. There were
a few things that had conflicts which I have just dropped for
now, no harm done.

I've built an allyesconfig with this and looked at the files
that weren't built very carefully, but it's a huge patch.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-27 17:06:18 -07:00
Moni Shoua db018a5f49 bonding: Don't destroy bonding master when removing slave via sysfs
It is wrong to destroy a bonding master from a context that uses the sysfs
of that bond. When last IPoIB slave is unenslaved from by writing to a
sysfs file (for bond0 this would be /sys/class/net/bond0/bonding/slaves)
the driver tries to destroy the bond. This is wrong and can lead to a
lockup or a crash.  This fix lets the bonding master stay and relies on
the user to destroy the bonding master if necessary (i.e. before module
ib_ipoib is unloaded)

This patch affects only bonds of IPoIB slaves. Ethernet slaves stay
unaffected.

Signed-off-by: Moni Shoua <monis@voltaire.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-08-07 03:59:56 -04:00
Jay Vosburgh b8a9787edd bonding: Allow setting max_bonds to zero
Permit bonding to function rationally if max_bonds is set to
zero.  This will load the module, but create no master devices (which can
be created via sysfs).

	Requires some change to bond_create_sysfs; currently, the
netdev sysfs directory is determined from the first bonding device created,
but this is no longer possible.  Instead, an interface from net/core is
created to create and destroy files in net_class.

	Based on a patch submitted by Phil Oester <kernel@linuxaces.com>.
Modified by Jay Vosburgh to fix the sysfs issue mentioned above and to
update the documentation.

Signed-off-by: Phil Oester <kernel@linuxace.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-06-18 00:00:04 -04:00
Jay Vosburgh 3915c1e863 bonding: Add "follow" option to fail_over_mac
Add a "follow" selection for fail_over_mac.  This option
causes the MAC address to move from slave to slave as the active
slave changes.  This is in addition to the existing fail_over_mac option
that causes the bond's MAC address to change during failover.

	This new option is useful for devices that cannot tolerate
multiple ports using the same MAC address simultaneously, either
because it confuses them or incurs a performance penalty (as is the
case with some LPAR-aware multiport devices).  Because the MAC of the
bond itself does not change, the "follow" option is slightly more
reliable during failover and doesn't change the MAC of the bond during
operation.

	This patch requires a previous ARP monitor change to properly
handle RTNL during failovers.

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-05-22 06:34:29 -04:00
Moni Shoua 7893b2491a bonding: Send more than one gratuitous ARP when slave takes over
With IPoIB, reception of gratuitous ARP by neighboring hosts
is essential for a successful change of slaves in case of failure.
Otherwise, they won't learn about the HW address change and need
to wait a long time until the neighboring system gives up and sends
an ARP request to learn the new HW address.  This patch decreases
the chance for a lost of a gratuitous ARP packet by sending it more
than once. The number retries is configurable and can be set with a
module param.

Signed-off-by: Moni Shoua <monis@voltaire.com>
Acked-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-05-22 06:34:26 -04:00
Pavel Emelyanov 0883beca7f bonding: Relax unneeded _safe lists iterations.
Many places either do not modify the list under the list_for_each_xxx,
or break out of the loop as soon as the first element is removed.

Thus, this _safe iteration just occupies some unneeded .text space
and requires an additional variable.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-05-22 06:34:22 -04:00
Pavel Emelyanov 0dd646fe05 bonding: Remove redundant argument from bond_create.
While we're fixing the bond_create, I hope it's OK to polish it
a bit after the fixes.

The third argument is NULL at the first caller and is ignored by
the second one, so remove it.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-05-22 06:34:21 -04:00
Stephen Hemminger 38d2f38be9 bonding: handle case of device named bonding_master
If device already exists named bonding_masters, then fail. This is a wierd
corner case only a QA group could love.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-14 22:35:04 -07:00
Jay Vosburgh c4ebc66a1a bonding: fix error unwind in bonding_store_bonds
Fixed an error unwind in bonding_store_bonds that didn't release
the locks it held, and consolidated unwinds into a common block at the
end of the function.  Bug reported by Pavel Emelyanov <xemul@openvz.org>,
who provided a different fix.

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-05-06 12:01:29 -04:00
David S. Miller 6952d8923b [BOND]: Fix warning in bond_sysfs.c
original_mtu is only used if we end up with a non-NULL
dev, and it is assigned in all such cases, but GCC can't
see that.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 16:15:38 -07:00
Jay Vosburgh 027ea0416c bonding: fix lock ordering for rtnl and bonding_rwsem
Fix the handling of rtnl and the bonding_rwsem to always be acquired
in a consistent order (rtnl, then bonding_rwsem).

The existing code sometimes acquired them in this order, and sometimes
in the opposite order, which opens a window for deadlock between ifenslave
and sysfs.

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-18 14:38:39 -05:00
Jay Vosburgh ece95f7fef bonding: Fix up parameter parsing
A recent change to add an additional hash policy modified
bond_parse_parm, but it now does not correctly match parameters passed in
via sysfs.

	Rewrote bond_parse_parm to handle (a) parameter matches that
are substrings of one another and (b) user input with whitespace (e.g.,
sysfs input often has a trailing newline).

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-18 14:38:38 -05:00
Jay Vosburgh e934dd7862 bonding: fix locking in sysfs primary/active selection
Fix the functions that store the primary and active slave
options via sysfs to hold the correct locks in the correct order.

	The bond_change_active_slave and bond_select_active_slave
functions both require rtnl, bond->lock for read and curr_slave_lock for
write_bh, and no other locks.  This is so that the lower level
mode-specific functions (notably for balance-alb mode) can release locks
down to just rtnl in order to call, e.g., dev_set_mac_address with the
locks it expects (rtnl only).

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-18 14:38:38 -05:00
Wagner Ferenc 8e4b932908 bonding: Allow setting and querying xmit policy regardless of mode
From: Wagner Ferenc <wferi@niif.hu>

For consistency with the behaviour of the arp_ip_target option,
let /sys/class/net/bond0/bonding/xmit_hash_policy accept and report
current policy even if the bonding mode in effect does not use it.

Signed-off-by: Ferenc Wagner <wferi@niif.hu>
Acked-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-12-07 15:00:28 -05:00
Wagner Ferenc 1dcdcd6954 bonding: Coding style: break line after the if condition
From: Wagner Ferenc <wferi@niif.hu>

Adhere to coding style: break line after the if condition

Signed-off-by: Ferenc Wagner <wferi@niif.hu>
Acked-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-12-07 15:00:27 -05:00
Wagner Ferenc b88436651b bonding: Purely cosmetic: rename a local variable
From: Wagner Ferenc <wferi@niif.hu>

Code for rendering multivalue sysfs files occurs three times
in this module.  Rename 'buffer' to 'buf' in the first, for
the sake of consistency.

Signed-off-by: Ferenc Wagner <wferi@niif.hu>
Acked-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-12-07 15:00:26 -05:00
Wagner Ferenc 16cd0160d5 bonding: Return nothing for not applicable values
From: Wagner Ferenc <wferi@niif.hu>

The previous code returned '\n' (that is, a single empty line)
from most files, with one exception (xmit_hash_policy), where
it returned 'NA\n'.  This patch consolidates each file to return
nothing at all if not applicable, not even a '\n'.

I find this behaviour more usual, more useful, more efficient
and shorter to code from both sides.

Signed-off-by: Ferenc Wagner <wferi@niif.hu>
Acked-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-12-07 15:00:25 -05:00
Wagner Ferenc 7bd4650895 bonding: Remove trailing NULs from sysfs interface.
From: Wagner Ferenc <wferi@niif.hu>

Also remove trailing spaces from multivalued files.

This fixes output like for example:

$ od -c /sys/class/net/bond0/bonding/slaves
0000000   e   t   h   -   l   e   f   t       e   t   h   -   r   i   g
0000020   h   t      \n  \0
0000025

It mostly entails deleting '+1'-s after sprintf() calls: the return value
of sprintf is the number of characters printed, without the closing NUL,
ie. exactly what the sysfs interface requires.  The three multivalue
cases are different, because they also have to swallow back a trailing
space.

Signed-off-by: Ferenc Wagner <wferi@niif.hu>
Acked-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-12-07 15:00:18 -05:00
Jay Vosburgh 1466a21997 bonding: fix rtnl locking merge error
Looks like I incorrectly merged one of the rtnl lock changes,
so that one function, bonding_show_active_slave, held rtnl but didn't
release it, and another, bonding_store_active_slave, never held rtnl but
did release it.

	Fixed so the first function doesn't mess with rtnl, and the
second correctly acquires and releases rtnl.

	Bug reported by Moni Shoua <monis@voltaire.com>

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-11-10 04:25:14 -05:00
Jay Vosburgh 6603a6f25e bonding: Convert more locks to _bh, acquire rtnl, for new locking
Convert more lock acquisitions to _bh flavor to avoid deadlock
with workqueue activity and add acquisition of RTNL in appropriate places.
Affects ALB mode, as well as core bonding functions and sysfs.

Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-23 20:32:00 -04:00
Jay Vosburgh 1b76b31693 Convert bonding timers to workqueues
Convert bonding timers to workqueues.  This converts the various
monitor functions to run in periodic work queues instead of timers.  This
patch introduces the framework and convers the calls, but does not resolve
various locking issues, and does not stand alone.

Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-23 20:32:00 -04:00
Robert P. J. Day 3a4fa0a25d Fix misspellings of "system", "controller", "interrupt" and "necessary".
Fix the various misspellings of "system", controller", "interrupt" and
"[un]necessary".

Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
2007-10-19 23:10:43 +02:00
Jay Vosburgh dd957c57c5 net/bonding: Optionally allow ethernet slaves to keep own MAC
Update the "don't change MAC of slaves" functionality added in
previous changes to be a generic option, rather than something tied to
IB devices, as it's occasionally useful for regular ethernet devices as
well.

	Adds "fail_over_mac" option (which is automatically enabled for IB
slaves), applicable only to active-backup mode.

	Includes documentation update.

	Updates bonding driver version to 3.2.0.

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-15 14:20:46 -04:00
Moni Shoua d90a162a4e net/bonding: Destroy bonding master when last slave is gone
When bonding enslaves non Ethernet devices it takes pointers to functions
in the module that owns the slaves. In this case it becomes unsafe
to keep the bonding master registered after last slave was unenslaved
because we don't know if the pointers are still valid.  Destroying the bond when slave_cnt is zero
ensures that these functions be used anymore.

Signed-off-by: Moni Shoua <monis at voltaire.com>
Acked-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-15 14:20:46 -04:00
Moni Shoua 3158bf7d41 net/bonding: Handlle wrong assumptions that slave is always an Ethernet device
bonding sometimes uses Ethernet constants (such as MTU and address length) which
are not good when it enslaves non Ethernet devices (such as InfiniBand).

Signed-off-by: Moni Shoua <monis at voltaire.com>
Acked-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-15 14:20:46 -04:00
Moni Shoua 6b1bf09650 net/bonding: Enable IP multicast for bonding IPoIB devices
Allow to enslave devices when the bonding device is not up. Over the discussion
held at the previous post this seemed to be the most clean way to go, where it
is not expected to cause instabilities.

Normally, the bonding driver is UP before any enslavement takes place.
Once a netdevice is UP, the network stack acts to have it join some multicast groups
(eg the all-hosts 224.0.0.1). Now, since ether_setup() have set the bonding device
type to be ARPHRD_ETHER and address len to be ETHER_ALEN, the net core code
computes a wrong multicast link address. This is b/c ip_eth_mc_map() is called
where for multicast joins taking place after the enslavement another ip_xxx_mc_map()
is called (eg ip_ib_mc_map() when the bond type is ARPHRD_INFINIBAND)

Signed-off-by: Moni Shoua <monis at voltaire.com>
Signed-off-by: Or Gerlitz <ogerlitz at voltaire.com>
Acked-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-15 14:20:46 -04:00
Al Viro d3bb52b094 endianness annotations drivers/net/bonding/
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-10 16:51:56 -07:00
Joe Perches 0795af5729 [NET]: Introduce and use print_mac() and DECLARE_MAC_BUF()
This is nicer than the MAC_FMT stuff.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:51:42 -07:00
Jesper Juhl bf1e9a080d Clean up duplicate includes in drivers/net/
This patch cleans up duplicate includes in
	 drivers/net/

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Acked-by: "John W. Linville" <linville@tuxdriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-10 16:50:26 -07:00