Commit Graph

164 Commits

Author SHA1 Message Date
Herbert Xu a030847e9f [NET]: Avoid copying TCP packets unnecessarily
TCP packets all have writable heads, that is, even though it's cloned, it is
writable up to the end of the TCP header.  This patch makes skb_checksum_help
aware of this fact by using skb_clone_writable and avoiding a copy for TCP.

I've also modified the BUG_ON tests to be unsigned.  The only case where this
makes a difference is if csum_start points to a location before skb->data.
Since skb->data should always include the header where the checksum field
is (and all currently callers adhere to that), this change is safe and may
uncover bugs later.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-15 12:26:34 -07:00
Herbert Xu f697c3e8b3 [NET]: Avoid unnecessary cloning for ingress filtering
As it is we always invoke pt_prev before ing_filter, even if there are no
ingress filters attached.  This can cause unnecessary cloning in pt_prev.

This patch changes it so that we only invoke pt_prev if there are ingress
filters attached.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-15 12:26:26 -07:00
Randy Dunlap c4ea43c552 net core: fix kernel-doc for new function parameters
Fix networking code kernel-doc for newly added parameters.

Warning(linux-2.6.23-git2//net/core/sock.c:879): No description found for parameter 'net'
Warning(linux-2.6.23-git2//net/core/dev.c:570): No description found for parameter 'net'
Warning(linux-2.6.23-git2//net/core/dev.c:594): No description found for parameter 'net'
Warning(linux-2.6.23-git2//net/core/dev.c:617): No description found for parameter 'net'
Warning(linux-2.6.23-git2//net/core/dev.c:641): No description found for parameter 'net'
Warning(linux-2.6.23-git2//net/core/dev.c:667): No description found for parameter 'net'
Warning(linux-2.6.23-git2//net/core/dev.c:722): No description found for parameter 'net'
Warning(linux-2.6.23-git2//net/core/dev.c:959): No description found for parameter 'net'
Warning(linux-2.6.23-git2//net/core/dev.c:1195): No description found for parameter 'dev'
Warning(linux-2.6.23-git2//net/core/dev.c:2105): No description found for parameter 'n'
Warning(linux-2.6.23-git2//net/core/dev.c:3272): No description found for parameter 'net'
Warning(linux-2.6.23-git2//net/core/dev.c:3445): No description found for parameter 'net'
Warning(linux-2.6.23-git2//include/linux/netdevice.h:1301): No description found for parameter 'cpu'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-13 09:52:26 -07:00
Pavel Emelyanov 9b77265235 [NET]: Remove double dev->flags checking when calling dev_close()
The unregister_netdevice() and dev_change_net_namespace()
both check for dev->flags to be IFF_UP before calling the
dev_close(), but the dev_close() checks for IFF_UP itself,
so remove those unneeded checks.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:55:52 -07:00
Pavel Emelyanov 4665079cbb [NETNS]: Move some code into __init section when CONFIG_NET_NS=n
With the net namespaces many code leaved the __init section,
thus making the kernel occupy more memory than it did before.
Since we have a config option that prohibits the namespace
creation, the functions that initialize/finalize some netns
stuff are simply not needed and can be freed after the boot.

Currently, this is almost not noticeable, since few calls
are no longer in __init, but when the namespaces will be
merged it will be possible to free more code. I propose to
use the __net_init, __net_exit and __net_initdata "attributes"
for functions/variables that are not used if the CONFIG_NET_NS
is not set to save more space in memory.

The exiting functions cannot just reside in the __exit section,
as noticed by David, since the init section will have
references on it and the compilation will fail due to modpost
checks. These references can exist, since the init namespace
never dies and the exit callbacks are never called. So I
introduce the __exit_refok attribute just like it is already
done with the __init_refok.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:54:58 -07:00
Jeff Garzik 14e3e07979 [NET]: split dev_ifsioc() according to locking
This always bugged me: dev_ioctl() called dev_ifsioc() either inside
read_lock(dev_base_lock) or rtnl_lock(), depending on the ioctl being
executed.

This change moves the ioctls executed inside dev_base_lock to a new
function, dev_ifsioc_locked().  Now the locking context is completely
clear to the reader.

Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:54:49 -07:00
Stephen Hemminger cfcabdcc2d [NET]: sparse warning fixes
Fix a bunch of sparse warnings. Mostly about 0 used as
NULL pointer, and shadowed variable declarations.
One notable case was that hash size should have been unsigned.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:54:48 -07:00
Stephen Hemminger 3b04ddde02 [NET]: Move hardware header operations out of netdevice.
Since hardware header operations are part of the protocol class
not the device instance, make them into a separate object and
save memory.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:52:52 -07:00
Eric W. Biederman 8b41d1887d [NET]: Fix running without sysfs
When sysfs support is compiled out the kernel still keeps and maintains
the kobject tree.  So it is not safe to skip our kobject reference counting or
to avoid becoming members of the kobject tree.  It is safe to not add
the networking specific sysfs attributes.

This patch removes the sysfs special cases from net/core/dev.c
renames functions from netdev_sysfs_xxxx to netdev_kobject_xxxx
and always compiles in net-sysfs.c

net-sysfs.c is modified with a CONFIG_SYSFS guard around the parts
that are actually sysfs specific.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:52:46 -07:00
Pavel Emelyanov 056925ab31 [NET]: Cleanup calling netdev notifiers.
The call_netdev_notifiers routine can successfully be used in
the net/core_dev.c itself.

This will save 6 lines of code and 62 ;) bytes of .text section.

62 is rather small, but I have one more patch saving ~30 bytes
from netns code (sent to Eric), so altogether they can save
some more noticeable amount.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:51:21 -07:00
Pavel Emelyanov 30d97d3585 [NETNS]: Consolidate hashes creation in netdev_init()
The dev_name_hash and the dev_index_hash are now booth kmalloc-ed
(and each element is properly initialized as usually) so I think
it's worth consolidating this code making it look nicer (and
saving 28 bytes of .text section ;) )

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:51:21 -07:00
Eric W. Biederman ad7379d494 [NET]: Fix the prototype of call_netdevice_notifiers.
This replaces the void * parameter with a struct net_device * which
is what is actually required.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:51:20 -07:00
Jamal Hadi Salim 22dd749501 [NET]: migrate HARD_TX_LOCK to header file
HARD_TX_LOCK micro is a nice aggregation that could be used
in other spots. move it to netdevice.h
Also makes sure the previously superflous cpu arguement is used.
Thanks to DaveM for the suggestions.

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:51:20 -07:00
Eric W. Biederman 077130c0cf [NET]: Fix race when opening a proc file while a network namespace is exiting.
The problem:  proc_net files remember which network namespace the are
against but do not remember hold a reference count (as that would pin
the network namespace).   So we currently have a small window where
the reference count on a network namespace may be incremented when opening
a /proc file when it has already gone to zero.

To fix this introduce maybe_get_net and get_proc_net.

maybe_get_net increments the network namespace reference count only if it is
greater then zero, ensuring we don't increment a reference count after it
has gone to zero.

get_proc_net handles all of the magic to go from a proc inode to the network
namespace instance and call maybe_get_net on it.

PROC_NET the old accessor is removed so that we don't get confused and use
the wrong helper function.

Then I fix up the callers to use get_proc_net and handle the case case
where get_proc_net returns NULL.  In that case I return -ENXIO because
effectively the network namespace has already gone away so the files
we are trying to access don't exist anymore.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Paul E. McKenney <paulmck@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:49:22 -07:00
David S. Miller 9d5010db7e [NET]: Add a might_sleep() to dev_close().
Requested by Johannes Berg.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:49:15 -07:00
Eric W. Biederman ce286d3273 [NET]: Implement network device movement between namespaces
This patch introduces NETIF_F_NETNS_LOCAL a flag to indicate
a network device is local to a single network namespace and
should never be moved.  Useful for pseudo devices that we
need an instance in each network namespace (like the loopback
device) and for any device we find that cannot handle multiple
network namespaces so we may trap them in the initial network
namespace.

This patch introduces the function dev_change_net_namespace
a function used to move a network device from one network
namespace to another.  To the network device nothing
special appears to happen, to the components of the network
stack it appears as if the network device was unregistered
in the network namespace it is in, and a new device
was registered in the network namespace the device
was moved to.

This patch sets up a namespace device destructor that
upon the exit of a network namespace moves all of the
movable network devices  to the initial network namespace
so they are not lost.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:49:12 -07:00
Eric W. Biederman b267b17964 [NET]: Factor out __dev_alloc_name from dev_alloc_name
When forcibly changing the network namespace of a device
I need something that can generate a name for the device
in the new namespace without overwriting the old name.

__dev_alloc_name provides me that functionality.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:49:11 -07:00
Eric W. Biederman 881d966b48 [NET]: Make the device list and device lookups per namespace.
This patch makes most of the generic device layer network
namespace safe.  This patch makes dev_base_head a
network namespace variable, and then it picks up
a few associated variables.  The functions:
dev_getbyhwaddr
dev_getfirsthwbytype
dev_get_by_flags
dev_get_by_name
__dev_get_by_name
dev_get_by_index
__dev_get_by_index
dev_ioctl
dev_ethtool
dev_load
wireless_process_ioctl

were modified to take a network namespace argument, and
deal with it.

vlan_ioctl_set and brioctl_set were modified so their
hooks will receive a network namespace argument.

So basically anthing in the core of the network stack that was
affected to by the change of dev_base was modified to handle
multiple network namespaces.  The rest of the network stack was
simply modified to explicitly use &init_net the initial network
namespace.  This can be fixed when those components of the network
stack are modified to handle multiple network namespaces.

For now the ifindex generator is left global.

Fundametally ifindex numbers are per namespace, or else
we will have corner case problems with migration when
we get that far.

At the same time there are assumptions in the network stack
that the ifindex of a network device won't change.  Making
the ifindex number global seems a good compromise until
the network stack can cope with ifindex changes when
you change namespaces, and the like.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:49:10 -07:00
Eric W. Biederman 6d34b1c27a [NET]: Initialize the network namespace of network devices.
Except for carefully selected pseudo devices all network
interfaces should start out in the initial network namespace.
Ultimately it will be register_netdev that examines what
dev->nd_net is set to and places a device in a network namespace.

This patch modifies alloc_netdev to initialize the network
namespace a device is in with the initial network namespace.
This gets it right for the vast majority of devices so their
drivers need not be modified and for those few pseudo devices
that need something different they can change this parameter
before calling register_netdevice.

The network namespace parameter on a network device is not
reference counted as the devices are inside of a network namespace
and cannot remain in that namespace past the lifetime of the
network namespace.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:49:07 -07:00
Eric W. Biederman 457c4cbc5a [NET]: Make /proc/net per network namespace
This patch makes /proc/net per network namespace.  It modifies the global
variables proc_net and proc_net_stat to be per network namespace.
The proc_net file helpers are modified to take a network namespace argument,
and all of their callers are fixed to pass &init_net for that argument.
This ensures that all of the /proc/net files are only visible and
usable in the initial network namespace until the code behind them
has been updated to be handle multiple network namespaces.

Making /proc/net per namespace is necessary as at least some files
in /proc/net depend upon the set of network devices which is per
network namespace, and even more files in /proc/net have contents
that are relevant to a single network namespace.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:49:06 -07:00
Stephen Hemminger bea3348eef [NET]: Make NAPI polling independent of struct net_device objects.
Several devices have multiple independant RX queues per net
device, and some have a single interrupt doorbell for several
queues.

In either case, it's easier to support layouts like that if the
structure representing the poll is independant from the net
device itself.

The signature of the ->poll() call back goes from:

	int foo_poll(struct net_device *dev, int *budget)

to

	int foo_poll(struct napi_struct *napi, int budget)

The caller is returned the number of RX packets processed (or
the number of "NAPI credits" consumed if you want to get
abstract).  The callee no longer messes around bumping
dev->quota, *budget, etc. because that is all handled in the
caller upon return.

The napi_struct is to be embedded in the device driver private data
structures.

Furthermore, it is the driver's responsibility to disable all NAPI
instances in it's ->stop() device close handler.  Since the
napi_struct is privatized into the driver's private data structures,
only the driver knows how to get at all of the napi_struct instances
it may have per-device.

With lots of help and suggestions from Rusty Russell, Roland Dreier,
Michael Chan, Jeff Garzik, and Jamal Hadi Salim.

Bug fixes from Thomas Graf, Roland Dreier, Peter Zijlstra,
Joseph Fannin, Scott Wood, Hans J. Koch, and Michael Chan.

[ Ported to current tree and all drivers converted.  Integrated
  Stephen's follow-on kerneldoc additions, and restored poll_list
  handling to the old style to fix mutual exclusion issues.  -DaveM ]

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:47:45 -07:00
Herbert Xu 7f353bf29e [NET]: Share correct feature code between bridging and bonding
http://bugzilla.kernel.org/show_bug.cgi?id=8797 shows that the
bonding driver may produce bogus combinations of the checksum
flags and SG/TSO.

For example, if you bond devices with NETIF_F_HW_CSUM and
NETIF_F_IP_CSUM you'll end up with a bonding device that
has neither flag set.  If both have TSO then this produces
an illegal combination.

The bridge device on the other hand has the correct code to
deal with this.

In fact, the same code can be used for both.  So this patch
moves that logic into net/core/dev.c and uses it for both
bonding and bridging.

In the process I've made small adjustments such as only
setting GSO_ROBUST if at least one constituent device
supports it.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-08-13 22:52:14 -07:00
Herbert Xu fcc5a03ac4 [NET]: Allow netdev REGISTER/CHANGENAME events to fail
This patch adds code to allow errors to be passed up from event
handlers of NETDEV_REGISTER and NETDEV_CHANGENAME.  It also adds
the notifier_from_errno/notifier_to_errnor helpers to pass the
errno value up to the notifier caller.

If an error is detected when a device is registered, it causes
that operation to fail.  A NETDEV_UNREGISTER will be sent to
all event handlers.

Similarly if NETDEV_CHANGENAME fails the original name is restored
and a new NETDEV_CHANGENAME event is sent.

As such all event handlers must be idempotent with respect to
these events.

When an event handler is registered NETDEV_REGISTER events are
sent for all devices currently registered.  Should any of them
fail, we will send NETDEV_GOING_DOWN/NETDEV_DOWN/NETDEV_UNREGISTER
events to that handler for the devices which have already been
registered with it.  The handler registration itself will fail.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-31 02:28:15 -07:00
Herbert Xu 7f988eab57 [NET]: Take dev_base_lock when moving device name hash list entry
When we added name-based hashing the dev_base_lock was designated as the
lock to take when changing the name hash list.  Unfortunately, because
it was a preexisting lock that just happened to be taken in the right
spots we neglected to take it in dev_change_name.

The race can affect calles of __dev_get_by_name that do so without taking
the RTNL.  They may end up walking down the wrong hash chain and end up
missing the device that they're looking for.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-31 02:28:13 -07:00
Herbert Xu 7ce1b0edcb [NET]: Call uninit if necessary in register_netdevice
This patch makes register_netdevice call dev->uninit if the regsitration
fails after dev->init has completed successfully.  Very few drivers use
the init/uninit calls but at least one (drivers/net/wan/sealevel.c) may
leak without this change.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-31 02:28:12 -07:00
Randy Dunlap 0ed72ec4af [NET]: kernel-doc fixes
Fix kernel-doc omissions in net/:

Warning(linux-2.6.23-rc1//net/core/dev.c:2728): No description found for parameter 'addr'
Warning(linux-2.6.23-rc1//net/core/dev.c:2752): No description found for parameter 'addr'
Warning(linux-2.6.23-rc1//net/core/dev.c:3839): No description found for parameter 'net_dma'
Warning(linux-2.6.23-rc1//net/core/dev.c:3877): No description found for parameter 'state'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-31 02:28:00 -07:00
Patrick McHardy 31ce72a6b1 [NET]: Fix loopback crashes when multiqueue is enabled.
From: Patrick McHardy <kaber@trash.net>

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-20 19:45:45 -07:00
YOSHIFUJI Hideaki 40b77c9434 [NET] CORE: Fix whitespace errors.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2007-07-19 10:43:23 +09:00
Denis Cheng 12972621c8 [NET]: move __dev_addr_discard adjacent to dev_addr_discard for readability
Signed-off-by: Denis Cheng <crquan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-18 02:12:56 -07:00
Denis Cheng 26cc2522cb [NET]: merge dev_unicast_discard and dev_mc_discard into one
this two functions could share the dev->_xmit_lock acquired context.

Signed-off-by: Denis Cheng <crquan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-18 02:12:03 -07:00
Denis Cheng 456ad75c89 [NET]: move dev_mc_discard from dev_mcast.c to dev.c
Because this function is only called by unregister_netdevice,
this moving could make this non-global function static,
and also remove its declaration in netdevice.h;

Any further, function __dev_addr_discard is also just called by
dev_mc_discard and dev_unicast_discard, keeping this two functions
both in one c file could make __dev_addr_discard also static
and remove its declaration in netdevice.h;

Futhermore, the sequential call to dev_unicast_discard and then
dev_mc_discard in unregister_netdevice have a similar mechanism that:
(netif_tx_lock_bh / __dev_addr_discard / netif_tx_unlock_bh),
they should merged into one to eliminate duplicates in acquiring and
releasing the dev->_xmit_lock, this would be done in my following patch.

Signed-off-by: Denis Cheng <crquan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-18 02:10:54 -07:00
Patrick McHardy b863ceb7dd [NET]: Add macvlan driver
Add macvlan driver, which allows to create virtual ethernet devices
based on MAC address.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 18:55:06 -07:00
Patrick McHardy 24023451c8 [NET]: Add net_device change_rx_mode callback
Currently the set_multicast_list (and set_rx_mode) callbacks are
responsible for configuring the device according to the IFF_PROMISC,
IFF_MULTICAST and IFF_ALLMULTI flags and the mc_list (and uc_list in
case of set_rx_mode).

These callbacks can be invoked from BH context without the rtnl_mutex
by dev_mc_add/dev_mc_delete, which makes reading the device flags and
promiscous/allmulti count racy. For real hardware drivers that just
commit all changes to the hardware this is not a real problem since
the stack guarantees to call them for every change, so at least the
final call will not race and commit the correct configuration to the
hardware.

For software devices that want to synchronize promiscous and multicast
state to an underlying device however this can cause corruption of the
underlying device's flags or promisc/allmulti counts.

When the software device is concurrently put in promiscous or allmulti
mode while set_multicast_list is invoked from bottem half context, the
device might synchronize the change to the underlying device without
holding the rtnl_mutex, which races with concurrent changes to the
underlying device.

Add a dev->change_rx_flags hook that is invoked when any of the flags
that affect rx filtering change (under the rtnl_mutex), which allows
drivers to perform synchronization immediately and only synchronize
the address lists in set_multicast_list/set_rx_mode.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 18:51:31 -07:00
Linus Torvalds e030dbf91a Merge branch 'ioat-md-accel-for-linus' of git://lost.foo-projects.org/~dwillia2/git/iop
* 'ioat-md-accel-for-linus' of git://lost.foo-projects.org/~dwillia2/git/iop: (28 commits)
  ioatdma: add the unisys "i/oat" pci vendor/device id
  ARM: Add drivers/dma to arch/arm/Kconfig
  iop3xx: surface the iop3xx DMA and AAU units to the iop-adma driver
  iop13xx: surface the iop13xx adma units to the iop-adma driver
  dmaengine: driver for the iop32x, iop33x, and iop13xx raid engines
  md: remove raid5 compute_block and compute_parity5
  md: handle_stripe5 - request io processing in raid5_run_ops
  md: handle_stripe5 - add request/completion logic for async expand ops
  md: handle_stripe5 - add request/completion logic for async read ops
  md: handle_stripe5 - add request/completion logic for async check ops
  md: handle_stripe5 - add request/completion logic for async compute ops
  md: handle_stripe5 - add request/completion logic for async write ops
  md: common infrastructure for running operations with raid5_run_ops
  md: raid5_run_ops - run stripe operations outside sh->lock
  raid5: replace custom debug PRINTKs with standard pr_debug
  raid5: refactor handle_stripe5 and handle_stripe6 (v3)
  async_tx: add the async_tx api
  xor: make 'xor_blocks' a library routine for use with async_tx
  dmaengine: make clients responsible for managing channels
  dmaengine: refactor dmaengine around dma_async_tx_descriptor
  ...
2007-07-13 10:52:27 -07:00
Dan Williams d379b01e90 dmaengine: make clients responsible for managing channels
The current implementation assumes that a channel will only be used by one
client at a time.  In order to enable channel sharing the dmaengine core is
changed to a model where clients subscribe to channel-available-events.
Instead of tracking how many channels a client wants and how many it has
received the core just broadcasts the available channels and lets the
clients optionally take a reference.  The core learns about the clients'
needs at dma_event_callback time.

In support of multiple operation types, clients can specify a capability
mask to only be notified of channels that satisfy a certain set of
capabilities.

Changelog:
* removed DMA_TX_ARRAY_INIT, no longer needed
* dma_client_chan_free -> dma_chan_release: switch to global reference
  counting only at device unregistration time, before it was also happening
  at client unregistration time
* clients now return dma_state_client to dmaengine (ack, dup, nak)
* checkpatch.pl fixes
* fixup merge with git-ioat

Cc: Chris Leech <christopher.leech@intel.com>
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: David S. Miller <davem@davemloft.net>
2007-07-13 08:06:13 -07:00
Patrick McHardy 61cbc2fca6 [NET]: Fix secondary unicast/multicast address count maintenance
When a reference to an existing address is increased or decreased without
hitting zero, the address count is incorrectly adjusted.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:16:23 -07:00
Peter P Waskiewicz Jr f25f4e4480 [CORE] Stack changes to add multiqueue hardware support API
Add the multiqueue hardware device support API to the core network
stack.  Allow drivers to allocate multiple queues and manage them at
the netdev level if they choose to do so.

Added a new field to sk_buff, namely queue_mapping, for drivers to
know which tx_ring to select based on OS classification of the flow.

Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:16:21 -07:00
Herbert Xu a298830cd0 [NET]: Fix TX checksum feature check
This patch fixes a boolean error in the new TX checksum check
that causes bogus TSO packets to be generated.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:16:19 -07:00
Patrick McHardy 4417da668c [NET]: dev: secondary unicast address support
Add support for configuring secondary unicast addresses on network
devices. To support this devices capable of filtering multiple
unicast addresses need to change their set_multicast_list function
to configure unicast filters as well and assign it to dev->set_rx_mode
instead of dev->set_multicast_list. Other devices are put into promiscous
mode when secondary unicast addresses are present.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:15:56 -07:00
Patrick McHardy bf742482d7 [NET]: dev: introduce generic net_device address lists
Introduce struct dev_addr_list and list maintenance functions
based on dev_mc_list and the related functions. This will be
used by follow-up patches for both multicast and secondary
unicast addresses.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:15:54 -07:00
Stephen Hemminger d212f87b06 [NET]: IPV6 checksum offloading in network devices
The existing model for checksum offload does not correctly handle
devices that can offload IPV4 and IPV6 only. The NETIF_F_HW_CSUM flag
implies device can do any arbitrary protocol.

This patch:
 * adds NETIF_F_IPV6_CSUM for those devices
 * fixes bnx2 and tg3 devices that need it
 * add NETIF_F_IPV6_CSUM to ipv6 output (incl GSO)
 * fixes assumptions about NETIF_F_ALL_CSUM in nat
 * adjusts bridge union of checksumming computation

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:15:52 -07:00
Shannon Nelson 515e06c455 [NET]: Re-enable irqs before pushing pending DMA requests
This moves the local_irq_enable() call in net_rx_action() to before
calling the CONFIG_NET_DMA's dma_async_memcpy_issue_pending() rather
than after.  This shortens the irq disabled window and allows for DMA
drivers that need to do their own irq hold.

Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-23 23:09:23 -07:00
Thomas Graf 7c355f532d [NET]: Avoid duplicate netlink notification when changing link state
When changing the link state from userspace not affecting any other
flags. Two duplicate notification are being sent, once as action
in the NETDEV_UP/NETDEV_DOWN notification chain and a second time
when comparing old and new device flags after the change has been
completed. Although harmless, the duplicates should be avoided.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-07 13:40:56 -07:00
Stephen Hemminger 9093bbb2d9 [NET]: Fix race condition about network device name allocation.
Kenji Kaneshige found this race between device removal and
registration.  On unregister it is possible for the old device to
exist, because sysfs file is still open.  A new device with 'eth%d'
will select the same name, but sysfs kobject register will fial.

The following changes the shutdown order slightly. It hold a removes
the sysfs entries earlier (on unregister_netdevice), but holds a
kobject reference.  Then when todo runs the actual last put free
happens.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-19 15:39:25 -07:00
Jarek Poplawski 723e98b79c [NET]: lockdep classes in register_netdevice
After initializing dev->_xmit_lock register_netdevice()
sets lockdep class according to dev->type.

Idea of this patch - by David Miller.

Reported & tested by: "Yuriy N. Shkandybin" <jura@netams.com>
Signed-off-by: Jarek Poplawski <jarkao2@o2.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-17 14:20:28 -07:00
Rafael J. Wysocki 8bb7844286 Add suspend-related notifications for CPU hotplug
Since nonboot CPUs are now disabled after tasks and devices have been
frozen and the CPU hotplug infrastructure is used for this purpose, we need
special CPU hotplug notifications that will help the CPU-hotplug-aware
subsystems distinguish normal CPU hotplug events from CPU hotplug events
related to a system-wide suspend or resume operation in progress.  This
patch introduces such notifications and causes them to be used during
suspend and resume transitions.  It also changes all of the
CPU-hotplug-aware subsystems to take these notifications into consideration
(for now they are handled in the same way as the corresponding "normal"
ones).

[oleg@tv-sign.ru: cleanups]
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Gautham R Shenoy <ego@in.ibm.com>
Cc: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-09 12:30:56 -07:00
Josef 'Jeff' Sipek 2396a22e09 [NET] net/core: Fix error handling
Upon failure to register "ptype" procfs entry, "softnet_stat" was not
removed, and an incorrect attempt was made to remove the "ptype" entry.

Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-07 00:33:18 -07:00
Pavel Emelianov 7562f876cd [NET]: Rework dev_base via list_head (v3)
Cleanup of dev_base list use, with the aim to simplify making device
list per-namespace. In almost every occasion, use of dev_base variable
and dev->next pointer could be easily replaced by for_each_netdev
loop. A few most complicated places were converted to using
first_netdev()/next_netdev().

Signed-off-by: Pavel Emelianov <xemul@openvz.org>
Acked-by: Kirill Korotaev <dev@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-03 15:13:45 -07:00
Patrick McHardy 4e9cac2ba4 [NET]: Add __dev_getfirstbyhwtype
Add __dev_getfirstbyhwtype for callers that don't want a reference but
some data from the device and thus need to take the rtnl anyway.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-03 03:28:13 -07:00
Rusty Russell 5a1b5898ee [NET]: Remove NETIF_F_INTERNAL_STATS, default to internal stats.
Herbert Xu conviced me that a new flag was overkill; every driver
currently overrides get_stats, so we might as well make the internal
one the default.  If someone did fail to set get_stats, they would now
get all 0 stats instead of "No statistics available".

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-28 21:04:03 -07:00