Commit Graph

1544 Commits

Author SHA1 Message Date
Thomas Graf 1797754ea7 [NETLINK]: Introduce NLMSG_NEW macro to better handle netlink flags
Introduces a new macro NLMSG_NEW which extends NLMSG_PUT but takes
a flags argument. NLMSG_PUT stays there for compatibility but now
calls NLMSG_NEW with flags == 0. NLMSG_PUT_ANSWER is renamed to
NLMSG_NEW_ANSWER which now also takes a flags argument.

Also converts the users of NLMSG_PUT_ANSWER to use NLMSG_NEW_ANSWER
and fixes the two direct users of __nlmsg_put to either provide
the flags or use NLMSG_NEW(_ANSWER).

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-18 22:53:48 -07:00
Thomas Graf e386c6eb43 [NEIGH]: Fix use of uninitialized variable when trimming in neightbl_fill_parms
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-18 22:52:09 -07:00
Thomas Graf 4b6ea82dd1 [NETLINK]: Kill bogus NLMSG_SET_MULTIPART uses.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-18 22:51:43 -07:00
Thomas Graf c7fb64db00 [NETLINK]: Neighbour table configuration and statistics via rtnetlink
To retrieve the neighbour tables send RTM_GETNEIGHTBL with the
NLM_F_DUMP flag set. Every neighbour table configuration is
spread over multiple messages to avoid running into message
size limits on systems with many interfaces. The first message
in the sequence transports all not device specific data such as
statistics, configuration, and the default parameter set.
This message is followed by 0..n messages carrying device
specific parameter sets.

Although the ordering should be sufficient, NDTA_NAME can be
used to identify sequences. The initial message can be identified
by checking for NDTA_CONFIG. The device specific messages do
not contain this TLV but have NDTPA_IFINDEX set to the
corresponding interface index.

To change neighbour table attributes, send RTM_SETNEIGHTBL
with NDTA_NAME set. Changeable attribute include NDTA_THRESH[1-3],
NDTA_GC_INTERVAL, and all TLVs in NDTA_PARMS unless marked
otherwise. Device specific parameter sets can be changed by
setting NDTPA_IFINDEX to the interface index of the corresponding
device.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-18 22:50:55 -07:00
David S. Miller e52c1f17e4 [NET]: Move sysctl_max_syn_backlog into request_sock.c
This fixes the CONFIG_INET=n build failure noticed
by Andrew Morton.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-18 22:49:40 -07:00
Arnaldo Carvalho de Melo 2ad69c55a2 [NET] rename struct tcp_listen_opt to struct listen_sock
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-18 22:48:55 -07:00
Arnaldo Carvalho de Melo 0e87506fcc [NET] Generalise tcp_listen_opt
This chunks out the accept_queue and tcp_listen_opt code and moves
them to net/core/request_sock.c and include/net/request_sock.h, to
make it useful for other transport protocols, DCCP being the first one
to use it.

Next patches will rename tcp_listen_opt to accept_sock and remove the
inline tcp functions that just call a reqsk_queue_ function.

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-18 22:47:59 -07:00
Arnaldo Carvalho de Melo 2e6599cb89 [NET] Generalise TCP's struct open_request minisock infrastructure
Kept this first changeset minimal, without changing existing names to
ease peer review.

Basicaly tcp_openreq_alloc now receives the or_calltable, that in turn
has two new members:

->slab, that replaces tcp_openreq_cachep
->obj_size, to inform the size of the openreq descendant for
  a specific protocol

The protocol specific fields in struct open_request were moved to a
class hierarchy, with the things that are common to all connection
oriented PF_INET protocols in struct inet_request_sock, the TCP ones
in tcp_request_sock, that is an inet_request_sock, that is an
open_request.

I.e. this uses the same approach used for the struct sock class
hierarchy, with sk_prot indicating if the protocol wants to use the
open_request infrastructure by filling in sk_prot->rsk_prot with an
or_calltable.

Results? Performance is improved and TCP v4 now uses only 64 bytes per
open request minisock, down from 96 without this patch :-)

Next changeset will rename some of the structs, fields and functions
mentioned above, struct or_calltable is way unclear, better name it
struct request_sock_ops, s/struct open_request/struct request_sock/g,
etc.

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-18 22:46:52 -07:00
Linus Torvalds 0e396ee43e Manual merge of rsync://rsync.kernel.org/pub/scm/linux/kernel/git/jgarzik/netdev-2.6.git
This is a fixed-up version of the broken "upstream-2.6.13" branch, where
I re-did the manual merge of drivers/net/r8169.c by hand, and made sure
the history is all good.
2005-06-18 11:42:35 -07:00
Stephen Hemminger e387660545 [NET]: Fix sysctl net.core.dev_weight
Changing the sysctl net.core.dev_weight has no effect because the weight
of the backlog devices is set during initialization and never changed.

This patch propagates any changes to the global value affected by sysctl
to the per-cpu devices. It is done every time the packet handler
function is run.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-08 14:56:01 -07:00
Stephen Hemminger 699a411451 [NET]: Allow controlling NAPI device weight with sysfs
Simple interface to allow changing network device scheduling weight
with sysfs. Please consider this for 2.6.12, since risk/impact is small.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-08 14:55:42 -07:00
David S. Miller fa04ae5c09 [ETHTOOL]: Check correct pointer in ethtool_set_coalesce().
It was checking the "GET" function pointer instead of
the "SET" one.  Looks like a cut&paste error :-)

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-06 15:07:19 -07:00
91bcc018f9 Automatic merge of /spare/repo/netdev-2.6 branch we18 2005-06-04 17:08:24 -04:00
David S. Miller d1102b59ca [NET]: Use %lx for netdev->features sysfs formatting.
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-05-29 20:28:25 -07:00
Jon Mason 69f6a0fafc [NET]: Add ethtool support for NETIF_F_HW_CSUM.
Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-05-29 20:27:24 -07:00
Stephen Hemminger 81e8157583 [BRIDGE]: make dev->features unsigned
The features field in netdevice is really a bitmask, and bitmask's should
be unsigned.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-05-29 14:14:35 -07:00
Stephen Hemminger d8a33ac435 [BRIDGE]: features change notification
Resend of earlier patch (no changes) from Catalin used to provide
device feature change notification.

Signed-off-by: Catalin BOIE <catab at umbrella.ro>
Acked-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-05-29 14:13:47 -07:00
fff9cfd99c [PATCH] Wireless Extensions 18 (aka WPA)
This is version 18 of the Wireless Extensions. The main change
  is that it adds all the necessary APIs for WPA and WPA2 support. This
  work was entirely done by Jouni Malinen, so let's thank him for both
  his hard work and deep expertise on the subject ;-)
        This APIs obviously doesn't do much by itself and works in
  concert with driver support (Jouni already sent you the HostAP
  changes) and userspace (Jouni is updating wpa_supplicant). This is
  also orthogonal with the ongoing work on in-kernel IEEE support (but
  potentially useful).
        The patch is attached, tested with 2.6.11. Normally, I would
  ask you to push that directly in the kernel (99% of the patch has been
  on my web page for ages and it does not affect non-WPA stuff), but
  Jouni convinced me that it should bake a few weeks in wireless-2.6
  first, so that other driver maintainers can get up to speed with it.
  
  Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-05-12 20:24:19 -04:00
Jesper Juhl 02c30a84e6 [PATCH] update Ross Biro bouncing email address
Ross moved.  Remove the bad email address so people will find the correct
one in ./CREDITS.

Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-05 16:36:49 -07:00
Arnaldo Carvalho de Melo 476e19cfa1 [IPV6]: Fix OOPS when using IPV6_ADDRFORM
This causes sk->sk_prot to change, which makes the socket
release free the sock into the wrong SLAB cache.  Fix this
by introducing sk_prot_creator so that we always remember
where the sock came from.

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-05-05 13:35:15 -07:00
Patrick McHardy bd96535b81 [NETFILTER]: Drop conntrack reference in ip_dev_loopback_xmit()
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-05-03 16:21:37 -07:00
Patrick McHardy e4f8ab00cf [NETFILTER]: Fix nf_debug_ip_local_deliver()
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-05-03 16:20:39 -07:00
Tommy S. Christensen cacaddf57e [NET]: Disable queueing when carrier is lost.
Some network drivers call netif_stop_queue() when detecting loss of
carrier. This leads to packets being queued up at the qdisc level for
an unbound period of time. In order to prevent this effect, the core
networking stack will now cease to queue packets for any device, that
is operationally down (i.e. the queue is flushed and disabled).

Signed-off-by: Tommy S. Christensen <tommy.christensen@tpack.net>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-05-03 16:18:52 -07:00
David S. Miller 0f4821e7b9 [XFRM/RTNETLINK]: Decrement qlen properly in {xfrm_,rt}netlink_rcv().
If we free up a partially processed packet because it's
skb->len dropped to zero, we need to decrement qlen because
we are dropping out of the top-level loop so it will do
the decrement for us.

Spotted by Herbert Xu.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-05-03 16:15:59 -07:00
David S. Miller 09e1430598 [NETLINK]: Fix infinite loops in synchronous netlink changes.
The qlen should continue to decrement, even if we
pop partially processed SKBs back onto the receive queue.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-05-03 15:30:05 -07:00
Herbert Xu 2a0a6ebee1 [NETLINK]: Synchronous message processing.
Let's recap the problem.  The current asynchronous netlink kernel
message processing is vulnerable to these attacks:

1) Hit and run: Attacker sends one or more messages and then exits
before they're processed.  This may confuse/disable the next netlink
user that gets the netlink address of the attacker since it may
receive the responses to the attacker's messages.

Proposed solutions:

a) Synchronous processing.
b) Stream mode socket.
c) Restrict/prohibit binding.

2) Starvation: Because various netlink rcv functions were written
to not return until all messages have been processed on a socket,
it is possible for these functions to execute for an arbitrarily
long period of time.  If this is successfully exploited it could
also be used to hold rtnl forever.

Proposed solutions:

a) Synchronous processing.
b) Stream mode socket.

Firstly let's cross off solution c).  It only solves the first
problem and it has user-visible impacts.  In particular, it'll
break user space applications that expect to bind or communicate
with specific netlink addresses (pid's).

So we're left with a choice of synchronous processing versus
SOCK_STREAM for netlink.

For the moment I'm sticking with the synchronous approach as
suggested by Alexey since it's simpler and I'd rather spend
my time working on other things.

However, it does have a number of deficiencies compared to the
stream mode solution:

1) User-space to user-space netlink communication is still vulnerable.

2) Inefficient use of resources.  This is especially true for rtnetlink
since the lock is shared with other users such as networking drivers.
The latter could hold the rtnl while communicating with hardware which
causes the rtnetlink user to wait when it could be doing other things.

3) It is still possible to DoS all netlink users by flooding the kernel
netlink receive queue.  The attacker simply fills the receive socket
with a single netlink message that fills up the entire queue.  The
attacker then continues to call sendmsg with the same message in a loop.

Point 3) can be countered by retransmissions in user-space code, however
it is pretty messy.

In light of these problems (in particular, point 3), we should implement
stream mode netlink at some point.  In the mean time, here is a patch
that implements synchronous processing.  

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-05-03 14:55:09 -07:00
Thomas Graf db46edc6d3 [RTNETLINK] Cleanup rtnetlink_link tables
Converts remaining rtnetlink_link tables to use c99 designated
initializers to make greping a little bit easier.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-05-03 14:29:39 -07:00
Thomas Graf f90a0a74b8 [RTNETLINK] Fix & cleanup rtm_min/rtm_max
Converts rtm_min and rtm_max arrays to use c99 designated
initializers for easier insertion of new message families.
RTM_GETMULTICAST and RTM_GETANYCAST did not have the minimal
message size specified which means that the netlink message
was parsed for routing attributes starting from the header.
Adds the proper minimal message sizes for these messages
(netlink header + common rtnetlink header) to fix this issue.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-05-03 14:29:00 -07:00
Martin Waitz 67be2dd1ba [PATCH] DocBook: fix some descriptions
Some KernelDoc descriptions are updated to match the current code.
No code changes.

Signed-off-by: Martin Waitz <tali@admingilde.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-01 08:59:26 -07:00
Pavel Pisa 4dc3b16ba1 [PATCH] DocBook: changes and extensions to the kernel documentation
I have recompiled Linux kernel 2.6.11.5 documentation for me and our
university students again.  The documentation could be extended for more
sources which are equipped by structured comments for recent 2.6 kernels.  I
have tried to proceed with that task.  I have done that more times from 2.6.0
time and it gets boring to do same changes again and again.  Linux kernel
compiles after changes for i386 and ARM targets.  I have added references to
some more files into kernel-api book, I have added some section names as well.
 So please, check that changes do not break something and that categories are
not too much skewed.

I have changed kernel-doc to accept "fastcall" and "asmlinkage" words reserved
by kernel convention.  Most of the other changes are modifications in the
comments to make kernel-doc happy, accept some parameters description and do
not bail out on errors.  Changed <pid> to @pid in the description, moved some
#ifdef before comments to correct function to comments bindings, etc.

You can see result of the modified documentation build at
  http://cmp.felk.cvut.cz/~pisa/linux/lkdb-2.6.11.tar.gz

Some more sources are ready to be included into kernel-doc generated
documentation.  Sources has been added into kernel-api for now.  Some more
section names added and probably some more chaos introduced as result of quick
cleanup work.

Signed-off-by: Pavel Pisa <pisa@cmp.felk.cvut.cz>
Signed-off-by: Martin Waitz <tali@admingilde.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-01 08:59:25 -07:00
Jesper Juhl e49332bd12 [PATCH] misc verify_area cleanups
There were still a few comments left refering to verify_area, and two
functions, verify_area_skas & verify_area_tt that just wrap corresponding
access_ok_skas & access_ok_tt functions, just like verify_area does for
access_ok - deprecate those.

There was also a few places that still used verify_area in commented-out
code, fix those up to use access_ok.

After applying this one there should not be anything left but finally
removing verify_area completely, which will happen after a kernel release
or two.

Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-01 08:59:08 -07:00
Paul E. McKenney fbd568a3e6 [PATCH] Change synchronize_kernel to _rcu and _sched
This patch changes calls to synchronize_kernel(), deprecated in the earlier
"Deprecate synchronize_kernel, GPL replacement" patch to instead call the new
synchronize_rcu() and synchronize_sched() APIs.

Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-01 08:59:04 -07:00
Olaf Rempel 5bec0039f4 [NET]: /proc/net/stat/* header cleanup
Signed-off-by: Olaf Rempel <razzor@kopf-tisch.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-04-28 12:16:08 -07:00
Al Viro b453257f05 [PATCH] kill gratitious includes of major.h under net/*
A lot of places in there are including major.h for no reason whatsoever.
Removed.  And yes, it still builds. 

The history of that stuff is often amusing.  E.g.  for net/core/sock.c
the story looks so, as far as I've been able to reconstruct it: we used
to need major.h in net/socket.c circa 1.1.early.  In 1.1.13 that need
had disappeared, along with register_chrdev(SOCKET_MAJOR, "socket",
&net_fops) in sock_init().  Include had not.  When 1.2 -> 1.3 reorg of
net/* had moved a lot of stuff from net/socket.c to net/core/sock.c,
this crap had followed... 

Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-04-25 18:32:13 -07:00
Ben Greear af191367a7 [NET]: Document ->hard_start_xmit() locking in comments.
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-04-24 20:12:36 -07:00
Patrick McHardy 26095455ac [NET]: Add missing newline for skb_*_panic
While we're at it, lets also replace KERN_INFO by KERN_EMERG to
make sure the user gets to see it.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-04-21 16:43:02 -07:00
Arnaldo Carvalho de Melo 88a6685825 [SOCK]: on failure free the sock from the right place
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-04-19 22:41:54 -07:00
Stephen Hemminger 9c2b3328f7 [NET]: skbuff: remove old NET_CALLER macro
Here is a revised alternative that uses BUG_ON/WARN_ON
(as suggested by Herbert Xu) to eliminate NET_CALLER.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-04-19 22:39:42 -07:00
David S. Miller 98f245e797 [RTNETLINK]: Add comma to final entry in link_rtnetlink_table
Noticed by Herbert Xu.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-04-19 22:37:04 -07:00
Thomas Graf 240eed95eb [RTNETLINK]: Protocol family wildcard dumping for routing rules
Be kind to userspace and don't force them to hardcode protocol
families just to have it changed again once we support routing
rules for more than one protocol family.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-04-19 22:35:07 -07:00
Herbert Xu 357b40a18b [IPV6]: IPV6_CHECKSUM socket option can corrupt kernel memory
So here is a patch that introduces skb_store_bits -- the opposite of
skb_copy_bits, and uses them to read/write the csum field in rawv6.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-04-19 22:30:14 -07:00
Herbert Xu 6775cab98b [PATCH] Fix dst_destroy() race
When we are not the real parent of the dst (e.g., when we're xfrm_dst and
the child is an rtentry), it may already be on the GC list.

In fact the current code is buggy to, we need to check dst->flags before
the dec as dst may no longer be valid afterwards.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-04-16 15:24:10 -07:00
Arnaldo Carvalho de Melo 2a27805127 [PATCH] net: don't call kmem_cache_create with a spinlock held
This fixes the warning reported by Marcel Holtmann (Thanks!).
  
Signed-off-by: Arnaldo Carvalho de Melo <acme@conectiva.com.br>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-04-16 15:24:09 -07:00
Linus Torvalds 1da177e4c3 Linux-2.6.12-rc2
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!
2005-04-16 15:20:36 -07:00