Commit Graph

244 Commits

Author SHA1 Message Date
Adrian Bunk d86b5e0e6b [PATCH] net/: fix the WIRELESS_EXT abuse
This patch contains the following changes:
- add a CONFIG_WIRELESS_EXT select'ed by NET_RADIO for conditional
  code
- remove the now no longer required #ifdef CONFIG_NET_RADIO from some
  #include's

Based on a patch by Jean Tourrilhes <jt@hpl.hp.com>.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-01-30 20:35:30 -05:00
Thomas Graf cabcac0b29 [BONDING]: Remove CAP_NET_ADMIN requirement for INFOQUERY ioctl
This information is already available via /proc/net/bonding/*
therefore it doesn't make sense to require CAP_NET_ADMIN
privileges.

Original patch by Laurent Deniel <laurent.deniel@free.fr>

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-24 12:46:33 -08:00
Herbert Xu 8798b3fb71 [NET]: Fix skb fclone error path handling.
On the error path if we allocated an fclone then we will free it in
the wrong pool.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-23 16:32:45 -08:00
Kris Katterjohn 2966b66c25 [NET]: more whitespace issues in net/core/filter.c
This fixes some whitespace issues in net/core/filter.c

Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-23 16:26:16 -08:00
David S. Miller 7ac5459ec0 [PKTGEN]: Respect hard_header_len of device.
Don't assume 16.

Found by Ben Greear.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-18 14:19:10 -08:00
Kris Katterjohn 3860288ee8 [NET]: Use is_zero_ether_addr() in net/core/netpoll.c
This replaces a memcmp() with is_zero_ether_addr().

Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-17 15:15:38 -08:00
Kris Katterjohn f404e9a67f [PKTGEN]: Replacing with (compare|is_zero)_ether_addr() and ETH_ALEN
This replaces some tests with is_zero_ether_addr(), memcmp(one, two,
6) with compare_ether_addr(one, two), and 6 with ETH_ALEN where
appropriate.

Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-17 13:04:57 -08:00
Kris Katterjohn e35bedf369 [NET]: Fix whitespace issues in net/core/filter.c
This fixes some whitespace issues in net/core/filter.c

Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-17 02:25:52 -08:00
Kris Katterjohn 7b11f69fb5 [NET]: Clean up comments for sk_chk_filter()
This removes redundant comments, and moves one comment to a better
location.

Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-13 14:33:06 -08:00
Randy Dunlap 4fc268d24c [PATCH] capable/capability.h (net/)
net: Use <linux/capability.h> where capable() is used.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11 18:42:14 -08:00
Evgeniy Polyakov c3f343e4d7 [NET]: Fix diverter build.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-11 16:32:15 -08:00
Kris Katterjohn 8b3a70058b [NET]: Remove more unneeded typecasts on *malloc()
This removes more unneeded casts on the return value for kmalloc(),
sock_kmalloc(), and vmalloc().

Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-11 16:32:14 -08:00
Linus Torvalds 9819d85c21 Fix net/core/wireless.c link failure
It needs <linux/etherdevice.h> for compare_ether_addr()
2006-01-10 19:35:19 -08:00
Kris Katterjohn d3f4a687f6 [NET]: Change memcmp(,,ETH_ALEN) to compare_ether_addr()
This changes some memcmp(one,two,ETH_ALEN) to compare_ether_addr(one,two).

Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-10 12:54:28 -08:00
Andrey Borzenkov 6dd214b554 [PATCH] fix /sys/class/net/<if>/wireless without dev->get_wireless_stats
dev->get_wireless_stats is deprecated but removing it also removes wireless
subdirectory in sysfs. This patch puts it back.

akpm: I don't know what's happening here.  This might be appropriate as a
2.6.15.x compatibility backport.  Waiting to hear from Jeff.

Signed-off-by: Andrey Borzenkov <arvidjaar@mail.ru>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Garzik <jgarzik@pobox.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-10 08:01:24 -08:00
Kris Katterjohn 09a626600b [NET]: Change some "if (x) BUG();" to "BUG_ON(x);"
This changes some simple "if (x) BUG();" statements to "BUG_ON(x);"

Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-09 14:16:18 -08:00
Alexey Dobriyan a2167dc62e [NET]: Endian-annotate in_aton()
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-06 13:24:54 -08:00
Luiz Capitulino 69549ddd2f [PKTGEN]: Adds missing __init.
pktgen_find_thread() and pktgen_create_thread() are only called at
initialization time.

Signed-off-by: Luiz Capitulino <lcapitulino@mandriva.com.br>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-06 13:19:31 -08:00
Kris Katterjohn 4bad4dc919 [NET]: Change sk_run_filter()'s return type in net/core/filter.c
It should return an unsigned value, and fix sk_filter() as well.

Signed-off-by: Kris Katterjohn <kjak@ispwest.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-06 13:08:20 -08:00
Linus Torvalds db9edfd7e3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-2.6
Trivial manual merge fixup for usb_find_interface clashes.
2006-01-04 18:44:12 -08:00
Linus Torvalds d779188d2b Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6 2006-01-04 16:31:56 -08:00
Kay Sievers fd586bacf4 [PATCH] net: swich device attribute creation to default attrs
Recent udev versions don't longer cover bad sysfs timing with built-in
logic. Explicit rules are required to do that. For net devices, the
following is needed:
  ACTION=="add", SUBSYSTEM=="net", WAIT_FOR_SYSFS="address"
to handle access to net device properties from an event handler without
races.

This patch changes the main net attributes to be created by the driver
core, which is done _before_ the event is sent out and will not require
the stat() loop of the WAIT_FOR_SYSFS key.

Signed-off-by: Kay Sievers <kay.sievers@suse.de>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-01-04 16:18:10 -08:00
Kay Sievers 312c004d36 [PATCH] driver core: replace "hotplug" by "uevent"
Leave the overloaded "hotplug" word to susbsystems which are handling
real devices. The driver core does not "plug" anything, it just exports
the state to userspace and generates events.

Signed-off-by: Kay Sievers <kay.sievers@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-01-04 16:18:08 -08:00
Kris Katterjohn 9369986306 [NET]: More instruction checks fornet/core/filter.c
Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-04 13:58:36 -08:00
Christoph Hellwig b5e5fa5e09 [NET]: Add a dev_ioctl() fallback to sock_ioctl()
Currently all network protocols need to call dev_ioctl as the default
fallback in their ioctl implementations.  This patch adds a fallback
to dev_ioctl to sock_ioctl if the protocol returned -ENOIOCTLCMD.
This way all the procotol ioctl handlers can be simplified and we don't
need to export dev_ioctl.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 14:18:33 -08:00
Benjamin LaHaise 4947d3ef8d [NET]: Speed up __alloc_skb()
From: Benjamin LaHaise <bcrl@kvack.org>

In __alloc_skb(), the use of skb_shinfo() which casts a u8 * to the 
shared info structure results in gcc being forced to do a reload of the 
pointer since it has no information on possible aliasing.  Fix this by 
using a pointer to refer to skb_shared_info.

By initializing skb_shared_info sequentially, the write combining buffers 
can reduce the number of memory transactions to a single write.  Reorder 
the initialization in __alloc_skb() to match the structure definition.  
There is also an alignment issue on 64 bit systems with skb_shared_info 
by converting nr_frags to a short everything packs up nicely.

Also, pass the slab cache pointer according to the fclone flag instead 
of using two almost identical function calls.

This raises bw_unix performance up to a peak of 707KB/s when combined 
with the spinlock patch.  It should help other networking protocols, too.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 14:06:50 -08:00
Arnaldo Carvalho de Melo 14c850212e [INET_SOCK]: Move struct inet_sock & helper functions to net/inet_sock.h
To help in reducing the number of include dependencies, several files were
touched as they were getting needed headers indirectly for stuff they use.

Thanks also to Alan Menegotto for pointing out that net/dccp/proto.c had
linux/dccp.h include twice.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:21 -08:00
Jaco Kroon f34fbb9713 [PKTGEN]: Deinitialise static variables.
static variables should not be explicitly initialised to 0.  This causes
them to be placed in .data instead of .bss.  This patch de-initialises 3
static variables in net/core/pktgen.c.

There are approximately 800 more such variables in the source tree
(2.6.15rc5).  If there is more interrest I'd be willing to track down the
rest of these as well and de-initialise them as well.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:16 -08:00
Arnaldo Carvalho de Melo 6d6ee43e0b [TWSK]: Introduce struct timewait_sock_ops
So that we can share several timewait sockets related functions and
make the timewait mini sockets infrastructure closer to the request
mini sockets one.

Next changesets will take advantage of this, moving more code out of
TCP and DCCP v4 and v6 to common infrastructure.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:54 -08:00
Benjamin LaHaise c1cbe4b7ad [NET]: Avoid atomic xchg() for non-error case
It also looks like there were 2 places where the test on sk_err was
missing from the event wait logic (in sk_stream_wait_connect and
sk_stream_wait_memory), while the rest of the sock_error() users look
to be doing the right thing.  This version of the patch fixes those,
and cleans up a few places that were testing ->sk_err directly.

Signed-off-by: Benjamin LaHaise <benjamin.c.lahaise@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:44 -08:00
Herbert Xu 3305b80c21 [IP]: Simplify and consolidate MSG_PEEK error handling
When a packet is obtained from skb_recv_datagram with MSG_PEEK enabled
it is left on the socket receive queue.  This means that when we detect
a checksum error we have to be careful when trying to free the packet
as someone could have dequeued it in the time being.

Currently this delicate logic is duplicated three times between UDPv4,
UDPv6 and RAWv6.  This patch moves them into a one place and simplifies
the code somewhat.

This is based on a suggestion by Eric Dumazet.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:41 -08:00
Trent Jaeger df71837d50 [LSM-IPSec]: Security association restriction.
This patch series implements per packet access control via the
extension of the Linux Security Modules (LSM) interface by hooks in
the XFRM and pfkey subsystems that leverage IPSec security
associations to label packets.  Extensions to the SELinux LSM are
included that leverage the patch for this purpose.

This patch implements the changes necessary to the XFRM subsystem,
pfkey interface, ipv4/ipv6, and xfrm_user interface to restrict a
socket to use only authorized security associations (or no security
association) to send/receive network packets.

Patch purpose:

The patch is designed to enable access control per packets based on
the strongly authenticated IPSec security association.  Such access
controls augment the existing ones based on network interface and IP
address.  The former are very coarse-grained, and the latter can be
spoofed.  By using IPSec, the system can control access to remote
hosts based on cryptographic keys generated using the IPSec mechanism.
This enables access control on a per-machine basis or per-application
if the remote machine is running the same mechanism and trusted to
enforce the access control policy.

Patch design approach:

The overall approach is that policy (xfrm_policy) entries set by
user-level programs (e.g., setkey for ipsec-tools) are extended with a
security context that is used at policy selection time in the XFRM
subsystem to restrict the sockets that can send/receive packets via
security associations (xfrm_states) that are built from those
policies.

A presentation available at
www.selinux-symposium.org/2005/presentations/session2/2-3-jaeger.pdf
from the SELinux symposium describes the overall approach.

Patch implementation details:

On output, the policy retrieved (via xfrm_policy_lookup or
xfrm_sk_policy_lookup) must be authorized for the security context of
the socket and the same security context is required for resultant
security association (retrieved or negotiated via racoon in
ipsec-tools).  This is enforced in xfrm_state_find.

On input, the policy retrieved must also be authorized for the socket
(at __xfrm_policy_check), and the security context of the policy must
also match the security association being used.

The patch has virtually no impact on packets that do not use IPSec.
The existing Netfilter (outgoing) and LSM rcv_skb hooks are used as
before.

Also, if IPSec is used without security contexts, the impact is
minimal.  The LSM must allow such policies to be selected for the
combination of socket and remote machine, but subsequent IPSec
processing proceeds as in the original case.

Testing:

The pfkey interface is tested using the ipsec-tools.  ipsec-tools have
been modified (a separate ipsec-tools patch is available for version
0.5) that supports assignment of xfrm_policy entries and security
associations with security contexts via setkey and the negotiation
using the security contexts via racoon.

The xfrm_user interface is tested via ad hoc programs that set
security contexts.  These programs are also available from me, and
contain programs for setting, getting, and deleting policy for testing
this interface.  Testing of sa functions was done by tracing kernel
behavior.

Signed-off-by: Trent Jaeger <tjaeger@cse.psu.edu>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:24 -08:00
Jeff Garzik ac67c62473 Merge branch 'master' 2006-01-03 10:49:18 -05:00
David S. Miller 1b93ae64ca [NET]: Validate socket filters against BPF_MAXINSNS in one spot.
Currently the checks are scattered all over and this leads
to inconsistencies and even cases where the check is not made.

Based upon a patch from Kris Katterjohn.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-27 13:57:59 -08:00
Jeff Garzik b1086eef81 Merge branch 'master' 2005-12-12 15:24:45 -05:00
Stephen Hemminger 246a421207 [NET]: Fix NULL pointer deref in checksum debugging.
The problem I was seeing turned out to be that skb->dev is NULL when
the checksum is being completed in user context. This happens because
the reference to the device is dropped (to allow it to be released
when packets are in the queue).

Because skb->dev was NULL, the netdev_rx_csum_fault was panicing on
deref of dev->name. How about this?

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-08 15:21:39 -08:00
Martin Waitz dab9630fb3 [NET]: make function pointer argument parseable by kernel-doc
When a function takes a function pointer as argument it should use the 'return
(*pointer)(params...)' syntax used everywhere else in the kernel as this is
recognized by kernel-doc.

Signed-off-by: Martin Waitz <tali@admingilde.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-05 13:40:12 -08:00
Jeff Garzik 2226340eb8 Merge branch 'master' 2005-11-29 03:50:33 -05:00
Kris Katterjohn fb0d366b08 [NET]: Reject socket filter if division by constant zero is attempted.
This way we don't have to check it in sk_run_filter().

Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-20 13:41:34 -08:00
Mitch Williams c2373ee989 [PATCH] net: make dev_valid_name public
dev_valid_name() is a useful function.  Make it public.

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Acked-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2005-11-13 14:48:18 -05:00
Mitch Williams 1e2e565965 [PATCH] net: allow newline terminated IP addresses in in_aton
in_aton() gives weird results if it sees a newline at the end of the
input. This patch makes it able to handle such input correctly.

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Acked-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2005-11-13 14:48:17 -05:00
Herbert Xu fb286bb299 [NET]: Detect hardware rx checksum faults correctly
Here is the patch that introduces the generic skb_checksum_complete
which also checks for hardware RX checksum faults.  If that happens,
it'll call netdev_rx_csum_fault which currently prints out a stack
trace with the device name.  In future it can turn off RX checksum.

I've converted every spot under net/ that does RX checksum checks to
use skb_checksum_complete or __skb_checksum_complete with the
exceptions of:

* Those places where checksums are done bit by bit.  These will call
netdev_rx_csum_fault directly.

* The following have not been completely checked/converted:

ipmr
ip_vs
netfilter
dccp

This patch is based on patches and suggestions from Stephen Hemminger
and David S. Miller.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-10 13:01:24 -08:00
Thomas Graf 9ac4a16983 [RTNETLINK]: Use generic netlink receive queue processor
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-10 02:26:40 +01:00
Thomas Graf a8f74b2288 [NETLINK]: Make netlink_callback->done() optional
Most netlink families make no use of the done() callback, making
it optional gets rid of all unnecessary dummy implementations.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-10 02:26:40 +01:00
Yasuyuki Kozakai 9fb9cbb108 [NETFILTER]: Add nf_conntrack subsystem.
The existing connection tracking subsystem in netfilter can only
handle ipv4.  There were basically two choices present to add
connection tracking support for ipv6.  We could either duplicate all
of the ipv4 connection tracking code into an ipv6 counterpart, or (the
choice taken by these patches) we could design a generic layer that
could handle both ipv4 and ipv6 and thus requiring only one sub-protocol
(TCP, UDP, etc.) connection tracking helper module to be written.

In fact nf_conntrack is capable of working with any layer 3
protocol.

The existing ipv4 specific conntrack code could also not deal
with the pecularities of doing connection tracking on ipv6,
which is also cured here.  For example, these issues include:

1) ICMPv6 handling, which is used for neighbour discovery in
   ipv6 thus some messages such as these should not participate
   in connection tracking since effectively they are like ARP
   messages

2) fragmentation must be handled differently in ipv6, because
   the simplistic "defrag, connection track and NAT, refrag"
   (which the existing ipv4 connection tracking does) approach simply
   isn't feasible in ipv6

3) ipv6 extension header parsing must occur at the correct spots
   before and after connection tracking decisions, and there were
   no provisions for this in the existing connection tracking
   design

4) ipv6 has no need for stateful NAT

The ipv4 specific conntrack layer is kept around, until all of
the ipv4 specific conntrack helpers are ported over to nf_conntrack
and it is feature complete.  Once that occurs, the old conntrack
stuff will get placed into the feature-removal-schedule and we will
fully kill it off 6 months later.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-09 16:38:16 -08:00
Jesper Juhl a51482bde2 [NET]: kfree cleanup
From: Jesper Juhl <jesper.juhl@gmail.com>

This is the net/ part of the big kfree cleanup patch.

Remove pointless checks for NULL prior to calling kfree() in net/.

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Arnaldo Carvalho de Melo <acme@conectiva.com.br>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2005-11-08 09:41:34 -08:00
Herbert Xu 6151b31c96 [NET]: Fix race condition in sk_stream_wait_connect
When sk_stream_wait_connect detects a state transition to ESTABLISHED
or CLOSE_WAIT prior to it going to sleep, it will return without
calling finish_wait and decrementing sk_write_pending.

This may result in crashes and other unintended behaviour.

The fix is to always call finish_wait and update sk_write_pending since
it is safe to do so even if the wait entry is no longer on the queue.

This bug was tracked down with the help of Alex Sidorenko and the
fix is also based on his suggestion.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-05 21:05:20 -02:00
Herbert Xu c75d721c76 [NET]: Fix zero-size datagram reception
The recent rewrite of skb_copy_datagram_iovec broke the reception of
zero-size datagrams.  This patch fixes it.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-02 22:25:04 -02:00
Ananda Raju e89e9cf539 [IPv4/IPv6]: UFO Scatter-gather approach
Attached is kernel patch for UDP Fragmentation Offload (UFO) feature.

1. This patch incorporate the review comments by Jeff Garzik.
2. Renamed USO as UFO (UDP Fragmentation Offload)
3. udp sendfile support with UFO

This patches uses scatter-gather feature of skb to generate large UDP
datagram. Below is a "how-to" on changes required in network device
driver to use the UFO interface.

UDP Fragmentation Offload (UFO) Interface:
-------------------------------------------
UFO is a feature wherein the Linux kernel network stack will offload the
IP fragmentation functionality of large UDP datagram to hardware. This
will reduce the overhead of stack in fragmenting the large UDP datagram to
MTU sized packets

1) Drivers indicate their capability of UFO using
dev->features |= NETIF_F_UFO | NETIF_F_HW_CSUM | NETIF_F_SG

NETIF_F_HW_CSUM is required for UFO over ipv6.

2) UFO packet will be submitted for transmission using driver xmit routine.
UFO packet will have a non-zero value for

"skb_shinfo(skb)->ufo_size"

skb_shinfo(skb)->ufo_size will indicate the length of data part in each IP
fragment going out of the adapter after IP fragmentation by hardware.

skb->data will contain MAC/IP/UDP header and skb_shinfo(skb)->frags[]
contains the data payload. The skb->ip_summed will be set to CHECKSUM_HW
indicating that hardware has to do checksum calculation. Hardware should
compute the UDP checksum of complete datagram and also ip header checksum of
each fragmented IP packet.

For IPV6 the UFO provides the fragment identification-id in
skb_shinfo(skb)->ip6_frag_id. The adapter should use this ID for generating
IPv6 fragments.

Signed-off-by: Ananda Raju <ananda.raju@neterion.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (forwarded)
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-28 16:30:00 -02:00
Linus Torvalds 236fa08168 Merge master.kernel.org:/pub/scm/linux/kernel/git/acme/net-2.6.15 2005-10-28 08:50:37 -07:00
Al Viro 7d877f3bda [PATCH] gfp_t: net/*
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-28 08:16:47 -07:00
Jeff Garzik 35848e048f [PATCH] kill massive wireless-related log spam
Although this message is having the intended effect of causing wireless
driver maintainers to upgrade their code, I never should have merged this
patch in its present form.  Leading to tons of bug reports and unhappy
users.

Some wireless apps poll for statistics regularly, which leads to a printk()
every single time they ask for stats.  That's a little bit _too_ much of a
reminder that the driver is using an old API.

Change this to printing out the message once, per kernel boot.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-26 10:39:43 -07:00
Randy Dunlap c83c248618 [SK_BUFF] kernel-doc: fix skbuff warnings
Add kernel-doc to skbuff.h, skbuff.c to eliminate kernel-doc warnings.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-26 01:10:18 -02:00
Stephen Hemminger d50a6b56f0 [PKTGEN]: proc interface revision
The code to handle the /proc interface can be cleaned up in several places:
* use seq_file for read
* don't need to remember all the filenames separately
* use for_online_cpu's
* don't vmalloc a buffer for small command from user.

Committer note:
This patch clashed with John Hawkes's "[NET]: Wider use of for_each_*cpu()",
so I fixed it up manually.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-26 00:12:18 -02:00
Stephen Hemminger b4099fab75 [PKTGEN]: Spelling and white space
Fix some cosmetic issues. Indentation, spelling errors, and some whitespace.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-26 00:08:10 -02:00
Stephen Hemminger 2845b63b50 [PKTGEN]: Use kzalloc
These are cleanup patches for pktgen that can go in 2.6.15
Can use kzalloc in a couple of places.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-26 00:05:32 -02:00
Stephen Hemminger b7c8921bf1 [PKTGEN]: Sleeping function called under lock
pktgen is calling kmalloc GFP_KERNEL and vmalloc with lock held.
The simplest fix is to turn the lock into a semaphore, since the
thread lock is only used for admin control from user context.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-26 00:03:12 -02:00
John Hawkes 670c02c2bf [NET]: Wider use of for_each_*cpu()
In 'net' change the explicit use of for-loops and NR_CPUS into the
general for_each_cpu() or for_each_online_cpu() constructs, as
appropriate.  This widens the scope of potential future optimizations
of the general constructs, as well as takes advantage of the existing
optimizations of first_cpu() and next_cpu(), which is advantageous
when the true CPU count is much smaller than NR_CPUS.

Signed-off-by: John Hawkes <hawkes@sgi.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-25 23:54:01 -02:00
Herbert Xu 49636bb128 [NEIGH] Fix timer leak in neigh_changeaddr
neigh_changeaddr attempts to delete neighbour timers without setting
nud_state.  This doesn't work because the timer may have already fired
when we acquire the write lock in neigh_changeaddr.  The result is that
the timer may keep firing for quite a while until the entry reaches
NEIGH_FAILED.

It should be setting the nud_state straight away so that if the timer
has already fired it can simply exit once we relinquish the lock.

In fact, this whole function is simply duplicating the logic in
neigh_ifdown which in turn is already doing the right thing when
it comes to deleting timers and setting nud_state.

So all we have to do is take that code out and put it into a common
function and make both neigh_changeaddr and neigh_ifdown call it.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2005-10-23 17:18:00 +10:00
Herbert Xu 6fb9974f49 [NEIGH] Fix add_timer race in neigh_add_timer
neigh_add_timer cannot use add_timer unconditionally.  The reason is that
by the time it has obtained the write lock someone else (e.g., neigh_update)
could have already added a new timer.

So it should only use mod_timer and deal with its return value accordingly.

This bug would have led to rare neighbour cache entry leaks.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2005-10-23 16:37:48 +10:00
Herbert Xu 203755029e [NEIGH] Print stack trace in neigh_add_timer
Stack traces are very helpful in determining the exact nature of a bug.
So let's print a stack trace when the timer is added twice.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2005-10-23 16:11:39 +10:00
Julian Anastasov c98d80edc8 [SK_BUFF]: ipvs_property field must be copied
IPVS used flag NFC_IPVS_PROPERTY in nfcache but as now nfcache was removed the
new flag 'ipvs_property' still needs to be copied. This patch should be
included in 2.6.14.

Further comments from Harald Welte:

Sorry, seems like the bug was introduced by me.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-22 17:06:01 -02:00
Al Viro dd0fc66fb3 [PATCH] gfp flags annotations - part 1
- added typedef unsigned int __nocast gfp_t;

 - replaced __nocast uses for gfp flags with gfp_t - it gives exactly
   the same warnings as far as sparse is concerned, doesn't change
   generated code (from gcc point of view we replaced unsigned int with
   typedef) and documents what's going on far better.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-08 15:00:57 -07:00
Herbert Xu 3e56a40bb3 [IPV4]: Get rid of bogus __in_put_dev in pktgen
This patch gets rid of a bogus __in_dev_put() in pktgen.c.  This was
spotted by Suzanne Wood.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-03 14:36:32 -07:00
Herbert Xu e5ed639913 [IPV4]: Replace __in_dev_get with __in_dev_get_rcu/rtnl
The following patch renames __in_dev_get() to __in_dev_get_rtnl() and
introduces __in_dev_get_rcu() to cover the second case.

1) RCU with refcnt should use in_dev_get().
2) RCU without refcnt should use __in_dev_get_rcu().
3) All others must hold RTNL and use __in_dev_get_rtnl().

There is one exception in net/ipv4/route.c which is in fact a pre-existing
race condition.  I've marked it as such so that we remember to fix it.

This patch is based on suggestions and prior work by Suzanne Wood and
Paul McKenney.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-03 14:35:55 -07:00
Herbert Xu 325ed82393 [NET]: Fix packet timestamping.
I've found the problem in general.  It affects any 64-bit
architecture.  The problem occurs when you change the system time.

Suppose that when you boot your system clock is forward by a day.
This gets recorded down in skb_tv_base.  You then wind the clock back
by a day.  From that point onwards the offset will be negative which
essentially overflows the 32-bit variables they're stored in.

In fact, why don't we just store the real time stamp in those 32-bit
variables? After all, we're not going to overflow for quite a while
yet.

When we do overflow, we'll need a better solution of course.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-03 13:57:23 -07:00
Linus Torvalds eb693d2994 Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2005-09-29 08:56:47 -07:00
Al Viro 666002218d [PATCH] proc_mkdir() should be used to create procfs directories
A bunch of create_proc_dir_entry() calls creating directories had crept
in since the last sweep; converted to proc_mkdir().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-29 08:46:26 -07:00
Frank Filz a79af59efd [NET]: Fix module reference counts for loadable protocol modules
I have been experimenting with loadable protocol modules, and ran into
several issues with module reference counting.

The first issue was that __module_get failed at the BUG_ON check at
the top of the routine (checking that my module reference count was
not zero) when I created the first socket. When sk_alloc() is called,
my module reference count was still 0. When I looked at why sctp
didn't have this problem, I discovered that sctp creates a control
socket during module init (when the module ref count is not 0), which
keeps the reference count non-zero. This section has been updated to
address the point Stephen raised about checking the return value of
try_module_get().

The next problem arose when my socket init routine returned an error.
This resulted in my module reference count being decremented below 0.
My socket ops->release routine was also being called. The issue here
is that sock_release() calls the ops->release routine and decrements
the ref count if sock->ops is not NULL. Since the socket probably
didn't get correctly initialized, this should not be done, so we will
set sock->ops to NULL because we will not call try_module_get().

While searching for another bug, I also noticed that sys_accept() has
a possibility of doing a module_put() when it did not do an
__module_get so I re-ordered the call to security_socket_accept().

Signed-off-by: Frank Filz <ffilzlnx@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-27 15:23:38 -07:00
Eric Dumazet 2d7ceece08 [NET]: Prefetch dev->qdisc_lock in dev_queue_xmit()
We know the lock is going to be taken.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-27 15:22:58 -07:00
Daniel Phillips bc8dfcb939 [NET]: Use non-recursive algorithm in skb_copy_datagram_iovec()
Use iteration instead of recursion.  Fraglists within fraglists
should never occur, so we BUG check this.

Signed-off-by: Daniel Phillips <phillips@istop.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-27 15:22:35 -07:00
David S. Miller 667347f1ca [NEIGH]: Add debugging check when adding timers.
If we double-add a neighbour entry timer, which should be
impossible but has been reported, dump the current state of
the entry so that we can debug this.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-27 12:07:44 -07:00
David S. Miller 56e9b26324 Merge master.kernel.org:/pub/scm/linux/kernel/git/acme/llc-2.6 2005-09-26 15:29:31 -07:00
Amos Waterland 45fc3b11f1 [NET]: Protect neigh_stat_seq_fops by CONFIG_PROC_FS
From: Amos Waterland <apw@us.ibm.com>

If CONFIG_PROC_FS is not selected, the compiler emits this warning:

 net/core/neighbour.c:64: warning: `neigh_stat_seq_fops' defined but not used

Which is correct, because neigh_stat_seq_fops is in fact only
initialized and used by code that is protected by CONFIG_PROC_FS.  So
this patch fixes that up.

Signed-off-by: Amos Waterland <apw@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-24 16:53:16 -07:00
Jochen Friedrich cf309e3fb8 [LLC]: Fix for Bugzilla ticket #5156
Signed-off-by: Jochen Friedrich <jochen@scram.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-22 04:44:55 -03:00
Nishanth Aravamudan 121caf577d [NET]: fix-up schedule_timeout() usage
Use schedule_timeout_{,un}interruptible() instead of
set_current_state()/schedule_timeout() to reduce kernel size.  Also use
human-time conversion functions instead of hard-coded division to avoid
rounding issues.

Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-12 14:15:34 -07:00
Ingo Molnar a9f6a0dd54 [PATCH] more SPIN_LOCK_UNLOCKED -> DEFINE_SPINLOCK conversions
This converts the final 20 DEFINE_SPINLOCK holdouts.  (another 580 places
are already using DEFINE_SPINLOCK).  Build tested on x86.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 14:03:48 -07:00
Ingo Molnar 8d06afab73 [PATCH] timer initialization cleanup: DEFINE_TIMER
Clean up timer initialization by introducing DEFINE_TIMER a'la
DEFINE_SPINLOCK.  Build and boot-tested on x86.  A similar patch has been
been in the -RT tree for some time.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 14:03:48 -07:00
Linus Torvalds 55faed1e60 Merge branch 'upstream' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6 2005-09-07 17:22:43 -07:00
Patrick McHardy 0a3f4358ac [NET]: proto_unregister: fix sleeping while atomic
proto_unregister holds a lock while calling kmem_cache_destroy, which
can sleep.

Noticed by Daniele Orlandi <daniele@orlandi.com>.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-06 19:47:50 -07:00
Jean Tourrilhes 6582c164f2 [PATCH] WE-19 for kernel 2.6.13
Hi Jeff,

	This is version 19 of the Wireless Extensions. It was supposed
to be the fallback of the WPA API changes, but people seem quite happy
about it (especially Jouni), so the patch is rather small.
	The patch has been fully tested with 2.6.13 and various
wireless drivers, and is in its final version. Would you mind pushing
that into Linus's kernel so that the driver and the apps can take
advantage ot it ?

	It includes :
	o iwstat improvement (explicit dBm). This is the result of
long discussions with Dan Williams, the authors of
NetworkManager. Thanks to him for all the fruitful feedback.
	o remove pointer from event stream. I was not totally sure if
this pointer was 32-64 bits clean, so I'd rather remove it and be at
peace with it.
	o remove linux header from wireless.h. This has long been
requested by people writting user space apps, now it's done, and it
was not even painful.
	o final deprecation of spy_offset. You did not like it, it's
now gone for good.
	o Start deprecating dev->get_wireless_stats -> debloat netdev
	o Add "check" version of event macros for ieee802.11
stack. Jiri Benc doesn't like the current macros, we aim to please ;-)
	All those changes, except the last one, have been bit-roting on
my web pages for a while...

	Patches for most kernel drivers will follow. Patches for the
Orinoco and the HostAP drivers have been sent to their respective
maintainers.

	Have fun...

	Jean
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-09-06 22:40:24 -04:00
Eric Dumazet 9261c9b042 [NET]: Make sure l_linger is unsigned to avoid negative timeouts
One of my x86_64 (linux 2.6.13) server log is filled with :

schedule_timeout: wrong timeout value ffffffffffffff06 from ffffffff802e63ca
schedule_timeout: wrong timeout value ffffffffffffff06 from ffffffff802e63ca
schedule_timeout: wrong timeout value ffffffffffffff06 from ffffffff802e63ca
schedule_timeout: wrong timeout value ffffffffffffff06 from ffffffff802e63ca
schedule_timeout: wrong timeout value ffffffffffffff06 from ffffffff802e63ca

This is because some application does a

struct linger li;
li.l_onoff = 1;
li.l_linger = -1;
setsockopt(sock, SOL_SOCKET, SO_LINGER, &li, sizeof(li));

And unfortunatly l_linger is defined as a 'signed int' in
include/linux/socket.h:

struct linger {
         int             l_onoff;        /* Linger active                */
         int             l_linger;       /* How long to linger for       */
};

I dont know if it's safe to change l_linger to 'unsigned int' in the
include file (It might be defined as int in ABI specs)

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-06 14:51:39 -07:00
Linus Torvalds 5bcaa15579 Merge branch 'upstream' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6 2005-09-06 00:47:18 -07:00
Herbert Xu 1198ad002a [NET]: 2.6.13 breaks libpcap (and tcpdump)
Patrick McHardy says:

  Never mind, I got it, we never fall through to the second switch
  statement anymore. I think we could simply break when load_pointer
  returns NULL. The switch statement will fall through to the default
  case and return 0 for all cases but 0 > k >= SKF_AD_OFF.

Here's a patch to do just that.

I left BPF_MSH alone because it's really a hack to calculate the IP
header length, which makes no sense when applied to the special data.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-05 18:44:37 -07:00
David S. Miller 6baf1f417d [NET]: Do not protect sysctl_optmem_max with CONFIG_SYSCTL
The ipv4 and ipv6 protocols need to access it unconditionally.
SYSCTL=n build failure reported by Russell King.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-05 18:14:11 -07:00
viro@ftp.linux.org.uk 0bf0519d2b [PATCH] (7/7) __user annotations (ethtool)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-09-05 17:57:23 -04:00
Eric Dumazet ba89966c19 [NET]: use __read_mostly on kmem_cache_t , DEFINE_SNMP_STAT pointers
This patch puts mostly read only data in the right section
(read_mostly), to help sharing of these data between CPUS without
memory ping pongs.

On one of my production machine, tcp_statistics was sitting in a
heavily modified cache line, so *every* SNMP update had to force a
reload.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:11:18 -07:00
Jon Wetzel a6f9a70578 [NET]: Add support for getting the permanent hardware address.
This patch adds a new field to net device to hold the permanent
hardware address, and adds a new generic ethtool_op function to
get that address.

Signed-off-by: Jon Wetzel <jon_wetzel@dell.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:02:44 -07:00
David S. Miller d179cd1292 [NET]: Implement SKB fast cloning.
Protocols that make extensive use of SKB cloning,
for example TCP, eat at least 2 allocations per
packet sent as a result.

To cut the kmalloc() count in half, we implement
a pre-allocation scheme wherein we allocate
2 sk_buff objects in advance, then use a simple
reference count to free up the memory at the
correct time.

Based upon an initial patch by Thomas Graf and
suggestions from Herbert Xu.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:01:54 -07:00
Arnaldo Carvalho de Melo 20380731bc [NET]: Fix sparse warnings
Of this type, mostly:

CHECK   net/ipv6/netfilter.c
net/ipv6/netfilter.c:96:12: warning: symbol 'ipv6_netfilter_init' was not declared. Should it be static?
net/ipv6/netfilter.c:101:6: warning: symbol 'ipv6_netfilter_fini' was not declared. Should it be static?

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:01:32 -07:00
Patrick McHardy 066286071d [NETLINK]: Add "groups" argument to netlink_kernel_create
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:01:11 -07:00
Patrick McHardy ac6d439d20 [NETLINK]: Convert netlink users to use group numbers instead of bitmasks
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:00:54 -07:00
Patrick McHardy a61bbcf28a [NET]: Store skb->timestamp as offset to a base timestamp
Reduces skb size by 8 bytes on 64-bit.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:58:24 -07:00
Harald Welte f6ebe77f95 [NETFILTER]: split net/core/netfilter.c into net/netfilter/*.c
This patch doesn't introduce any code changes, but merely splits the
core netfilter code into four separate files.  It also moves it from
it's old location in net/core/ to the recently-created net/netfilter/
directory.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:51:11 -07:00
Arnaldo Carvalho de Melo 295f7324ff [ICSK]: Introduce reqsk_queue_prune from code in tcp_synack_timer
With this we're very close to getting all of the current TCP
refactorings in my dccp-2.6 tree merged, next changeset will export
some functions needed by the current DCCP code and then dccp-2.6.git
will be born!

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:49:29 -07:00
Arnaldo Carvalho de Melo 87d11ceb9d [SOCK]: Introduce sk_clone
Out of tcp_create_openreq_child, will be used in
dccp_create_openreq_child, and is a nice sock function anyway.

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:42:36 -07:00
Arnaldo Carvalho de Melo 8feaf0c0a5 [INET]: Generalise tcp_tw_bucket, aka TIME_WAIT sockets
This paves the way to generalise the rest of the sock ID lookup
routines and saves some bytes in TCPv4 TIME_WAIT sockets on distro
kernels (where IPv6 is always built as a module):

[root@qemu ~]# grep tw_sock /proc/slabinfo
tw_sock_TCPv6  0  0  128  31  1
tw_sock_TCP    0  0   96  41  1
[root@qemu ~]#

Now if a protocol wants to use the TIME_WAIT generic infrastructure it
only has to set the sk_prot->twsk_obj_size field with the size of its
inet_timewait_sock derived sock and proto_register will create
sk_prot->twsk_slab, for now its only for INET sockets, but we can
introduce timewait_sock later if some non INET transport protocolo
wants to use this stuff.

Next changesets will take advantage of this new infrastructure to
generalise even more TCP code.

[acme@toy net-2.6.14]$ grep built-in /tmp/before.size /tmp/after.size
/tmp/before.size: 188646   11764    5068  205478   322a6 net/ipv4/built-in.o
/tmp/after.size:  188144   11764    5068  204976   320b0 net/ipv4/built-in.o
[acme@toy net-2.6.14]$

Tested with both IPv4 & IPv6 (::1 (localhost) & ::ffff:172.20.0.1
(qemu host)).

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:42:13 -07:00
Arnaldo Carvalho de Melo c752f0739f [TCP]: Move the tcp sock states to net/tcp_states.h
Lots of places just needs the states, not even linux/tcp.h, where this
enum was, needs it.

This speeds up development of the refactorings as less sources are
rebuilt when things get moved from net/tcp.h.

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:41:54 -07:00
Harald Welte 608c8e4f7b [NETFILTER]: Extend netfilter logging API
This patch is in preparation to nfnetlink_log:
- loggers now have to register struct nf_logger instead of nf_logfn
- nf_log_unregister() replaced by nf_log_unregister_pf() and
  nf_log_unregister_logger()
- add comment to ip[6]t_LOG.h to assure nobody redefines flags
- add /proc/net/netfilter/nf_log to tell user which logger is currently
  registered for which address family
- if user has configured logging, but no logging backend (logger) is
  available, always spit a message to syslog, not just the first time.
- split ip[6]t_LOG.c into two parts:
  Backend: Always try to register as logger for the respective address family
  Frontend: Always log via nf_log_packet() API
- modify all users of nf_log_packet() to accomodate additional argument

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:38:07 -07:00
Arnaldo Carvalho de Melo e6848976b7 [NET]: Cleanup INET_REFCNT_DEBUG code
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:37:29 -07:00