linux

Commit Graph

Author	SHA1	Message	Date
Patrick McHardy	7d6dfe1f5b	[IPSEC] Fix xfrm_state leaks in error path Herbert Xu wrote: > @@ -1254,6 +1326,7 @@ static int pfkey_add(struct sock *sk, st > if (IS_ERR(x)) > return PTR_ERR(x); > > + xfrm_state_hold(x); This introduces a leak when xfrm_state_add()/xfrm_state_update() fail. We hold two references (one from xfrm_state_alloc(), one from xfrm_state_hold()), but only drop one. We need to take the reference because the reference from xfrm_state_alloc() can be dropped by __xfrm_state_delete(), so the fix is to drop both references on error. Same problem in xfrm_user.c. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-18 22:45:31 -07:00
Herbert Xu	f60f6b8f70	[IPSEC] Use XFRM_MSG_* instead of XFRM_SAP_* This patch removes XFRM_SAP_* and converts them over to XFRM_MSG_*. The netlink interface is meant to map directly onto the underlying xfrm subsystem. Therefore rather than using a new independent representation for the events we can simply use the existing ones from xfrm_user. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2005-06-18 22:44:37 -07:00
Herbert Xu	e7443892f6	[IPSEC] Set byid for km_event in xfrm_get_policy This patch fixes policy deletion in xfrm_user so that it sets km_event.data.byid. This puts xfrm_user on par with what af_key does in this case. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2005-06-18 22:44:18 -07:00
Herbert Xu	bf08867f91	[IPSEC] Turn km_event.data into a union This patch turns km_event.data into a union. This makes code that uses it clearer. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2005-06-18 22:44:00 -07:00
Herbert Xu	4f09f0bbc1	[IPSEC] Fix xfrm to pfkey SA state conversion This patch adjusts the SA state conversion in af_key such that XFRM_STATE_ERROR/XFRM_STATE_DEAD will be converted to SADB_STATE_DEAD instead of SADB_STATE_DYING. According to RFC 2367, SADB_STATE_DYING SAs can be turned into mature ones through updating their lifetime settings. Since SAs which are in the states XFRM_STATE_ERROR/XFRM_STATE_DEAD cannot be resurrected, this value is unsuitable. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2005-06-18 22:43:43 -07:00
Herbert Xu	4666faab09	[IPSEC] Kill spurious hard expire messages This patch ensures that the hard state/policy expire notifications are only sent when the state/policy is successfully removed from their respective tables. As it is, it's possible for a state/policy to both expire through reaching a hard limit, as well as being deleted by the user. Note that this behaviour isn't actually forbidden by RFC 2367. However, it is a quality of implementation issue. As an added bonus, the restructuring in this patch will help eventually in moving the expire notifications from softirq context into process context, thus improving their reliability. One important side-effect from this change is that SAs reaching their hard byte/packet limits are now deleted immediately, just like SAs that have reached their hard time limits. Previously they were announced immediately but only deleted after 30 seconds. This is bad because it prevents the system from issuing an ACQUIRE command until the existing state was deleted by the user or expires after the time is up. In the scenario where the expire notification was lost this introduces a 30 second delay into the system for no good reason. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2005-06-18 22:43:22 -07:00
Jamal Hadi Salim	26b15dad9f	[IPSEC] Add complete xfrm event notification Heres the final patch. What this patch provides - netlink xfrm events - ability to have events generated by netlink propagated to pfkey and vice versa. - fixes the acquire lets-be-happy-with-one-success issue Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2005-06-18 22:42:13 -07:00
Linus Torvalds	19fa95e9e9	Merge master.kernel.org:/pub/scm/linux/kernel/git/dwmw2/audit-2.6	2005-06-18 13:54:12 -07:00
Linus Torvalds	0e396ee43e	Manual merge of rsync://rsync.kernel.org/pub/scm/linux/kernel/git/jgarzik/netdev-2.6.git This is a fixed-up version of the broken "upstream-2.6.13" branch, where I re-did the manual merge of drivers/net/r8169.c by hand, and made sure the history is all good.	2005-06-18 11:42:35 -07:00
David Woodhouse	0107b3cf32	Merge with master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6.git	2005-06-18 08:36:46 +01:00
David S. Miller	bcfff0b471	[NETFILTER]: ipt_recent: last_pkts is an array of "unsigned long" not "u_int32_t" This fixes various crashes on 64-bit when using this module. Based upon a patch by Juergen Kreileder <jk@blackdown.de>. Signed-off-by: David S. Miller <davem@davemloft.net> ACKed-by: Patrick McHardy <kaber@trash.net>	2005-06-15 20:51:14 -07:00
Patrick McHardy	a96aca88ac	[NETFILTER]: Advance seq-file position in exp_next_seq() Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-13 18:27:13 -07:00
J. Simonetti	1c2fb7f93c	[IPV4]: Sysctl configurable icmp error source address. This patch allows you to change the source address of icmp error messages. It applies cleanly to 2.6.11.11 and retains the default behaviour. In the old (default) behaviour icmp error messages are sent with the ip of the exiting interface. In the new behaviour (when the sysctl variable is toggled on), the message is sent with the ip of the interface that received the packet that caused the icmp error. This is the behaviour network administrators will expect from a router. It makes debugging complicated network layouts much easier. Also, all 'vendor routers' I know of have the latter behaviour. Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-13 15:19:03 -07:00
Sridhar Samudrala	6a6ddb2a9c	[SCTP] Fix incorrect setting of sk_bound_dev_if when binding/sending to a ipv6 link local address. Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-13 15:13:05 -07:00
Neil Horman	cdac4e0774	[SCTP] Add support for ip_nonlocal_bind sysctl & IP_FREEBIND socket option Signed-off-by: Neil Horman <nhorman@redhat.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-13 15:12:33 -07:00
Vladislav Yasevich	bca735bd0d	[SCTP] Extend the info exported via /proc/net/sctp to support netstat for SCTP. Signed-off-by: Vladislav Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-13 15:11:57 -07:00
Neil Horman	0fd9a65a76	[SCTP] Support SO_BINDTODEVICE socket option on incoming packets. Signed-off-by: Neil Horman <nhorman@redhat.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-13 15:11:24 -07:00
Vladislav Yasevich	4243cac1e7	[SCTP]: Fix bug in restart of peeled-off associations. Signed-off-by: Vladislav Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-13 15:10:49 -07:00
Rémi Denis-Courmont	77bd91967a	[IPv6] Don't generate temporary for TUN devices Userland layer-2 tunneling devices allocated through the TUNTAP driver (drivers/net/tun.c) have a type of ARPHRD_NONE, and have no link-layer address. The kernel complains at regular interval when IPv6 Privacy extension are enabled because it can't find an hardware address : Dec 29 11:02:04 auguste kernel: __ipv6_regen_rndid(idev=cb3e0c00): cannot get EUI64 identifier; use random bytes. IPv6 Privacy extensions should probably be disabled on that sort of device. They won't work anyway. If userland wants a more usual Ethernet-ish interface with usual IPv6 autoconfiguration, it will use a TAP device with an emulated link-layer and a random hardware address rather than a TUN device. As far as I could find, TUN virtual device from TUNTAP is the very only sort of device using ARPHRD_NONE as kernel device type. Signed-off-by: Rémi Denis-Courmont <rdenis@simphalempin.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-13 15:01:34 -07:00
YOSHIFUJI Hideaki	84427d5330	[IPV6]: Ensure to use icmpv6_socket in non-preemptive context. We saw following trace several times: \|BUG: using smp_processor_id() in preemptible [00000001] code: httpd/30137 \|caller is icmpv6_send+0x23/0x540 \| [<c01ad63b>] smp_processor_id+0x9b/0xb8 \| [<c02993e7>] icmpv6_send+0x23/0x540 This is because of icmpv6_socket, which is the only one user of smp_processor_id() in icmpv6_send(), AFAIK. Since it should be used in non-preemptive context, let's defer the dereference after disabling preemption (by icmpv6_xmit_lock()). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-13 14:59:44 -07:00
Ralf Baechle	979b6c135f	[NET]: Move the netdev list to vger.kernel.org. From: Ralf Baechle <ralf@linux-mips.org> There are archives of the old list at http://oss.sgi.com/archives/netdev Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-13 14:30:40 -07:00
Randy Dunlap	6efd8455cf	[IPV4]: Multipath modules need a license to prevent kernel tainting. Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-13 14:29:06 -07:00
Andi Kleen	e7626486c3	[TCP]: Adjust TCP mem order check to new alloc_large_system_hash Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-13 14:24:52 -07:00
Thomas Graf	98e5640552	[PKT_SCHED]: Fix numeric comparison in meta ematch This patch is brought to you by the department of applied stupidity. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-08 15:11:19 -07:00
Thomas Graf	e1e284a4bd	[PKT_SCHED]: Dump classification result for basic classifier Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-08 15:11:02 -07:00
Thomas Graf	4890062960	[PKT_SCHED]: Allow socket attributes to be matched on via meta ematch Adds meta collectors for all socket attributes that make sense to be filtered upon. Some of them are only useful for debugging but having them doesn't hurt. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-08 15:10:48 -07:00
Thomas Graf	b824979aec	[PKT_SCHED]: Fix typo in NET_EMATCH_STACK help text Spotted by Geert Uytterhoeven <geert@linux-m68k.org>. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-08 15:10:22 -07:00
Stephen Hemminger	e387660545	[NET]: Fix sysctl net.core.dev_weight Changing the sysctl net.core.dev_weight has no effect because the weight of the backlog devices is set during initialization and never changed. This patch propagates any changes to the global value affected by sysctl to the per-cpu devices. It is done every time the packet handler function is run. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-08 14:56:01 -07:00
Stephen Hemminger	699a411451	[NET]: Allow controlling NAPI device weight with sysfs Simple interface to allow changing network device scheduling weight with sysfs. Please consider this for 2.6.12, since risk/impact is small. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-08 14:55:42 -07:00
Gabor Fekete	8181b8c1f3	[IPV6]: Update parm.link in ip6ip6_tnl_change() Signed-off-by: Gabor Fekete <gfekete@cc.jyu.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-08 14:54:38 -07:00
David S. Miller	fa04ae5c09	[ETHTOOL]: Check correct pointer in ethtool_set_coalesce(). It was checking the "GET" function pointer instead of the "SET" one. Looks like a cut&paste error :-) Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-06 15:07:19 -07:00
	91bcc018f9	Automatic merge of /spare/repo/netdev-2.6 branch we18	2005-06-04 17:08:24 -04:00
Adrian Bunk	4fef0304ee	[IPV6]: Kill export of fl6_sock_lookup. There is no usage of this EXPORT_SYMBOL in the kernel. Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-02 13:06:36 -07:00
Adrian Bunk	64a6c7aa38	[IPVS]: remove net/ipv4/ipvs/ip_vs_proto_icmp.c ip_vs_proto_icmp.c was never finished. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-02 13:02:25 -07:00
David Woodhouse	1c3f45ab2f	Merge with master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6.git	2005-06-02 16:39:11 +01:00
David Woodhouse	4bcff1b37e	AUDIT: Fix user pointer deref thinko in sys_socketcall(). I cunningly put the audit call immediately after the copy_from_user().... but used the _userspace_ copy of the args still. Let's not do that. Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2005-06-02 12:13:21 +01:00
Edgar E Iglesias	36839836e8	[IPSEC]: Fix esp_decap_data size verification in esp4. Signed-off-by: Edgar E Iglesias <edgar@axis.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-31 17:08:05 -07:00
Thomas Graf	08e9cd1fc5	[PKT_SCHED]: Disable dsmark debugging messages by default Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-31 15:17:28 -07:00
Thomas Graf	486b53e59c	[PKT_SCHED]: make dsmark try using pfifo instead of noop while grafting Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-31 15:16:52 -07:00
Thomas Graf	0451eb074e	[PKT_SCHED]: Fix dsmark to count ignored indices while walking Unused indices which are ignored while walking must still be counted to avoid dumping the same index twice. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-31 15:15:58 -07:00
Herbert Xu	208d89843b	[IPV4]: Fix BUG() in 2.6.x, udp_poll(), fragments + CONFIG_HIGHMEM Steven Hand <Steven.Hand@cl.cam.ac.uk> wrote: > > Reconstructed forward trace: > > net/ipv4/udp.c:1334 spin_lock_irq() > net/ipv4/udp.c:1336 udp_checksum_complete() > net/core/skbuff.c:1069 skb_shinfo(skb)->nr_frags > 1 > net/core/skbuff.c:1086 kunmap_skb_frag() > net/core/skbuff.h:1087 local_bh_enable() > kernel/softirq.c:0140 WARN_ON(irqs_disabled()); The receive queue lock is never taken in IRQs (and should never be) so we can simply substitute bh for irq. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-30 15:50:15 -07:00
Harald Welte	9bb7bc942d	[NETFILTER]: Fix deadlock with ip_queue and tcp local input path. When we have ip_queue being used from LOCAL_IN, then we end up with a situation where the verdicts coming back from userspace traverse the TCP input path from syscall context. While this seems to work most of the time, there's an ugly deadlock: syscall context is interrupted by the timer interrupt. When the timer interrupt leaves, the timer softirq gets scheduled and calls tcp_delack_timer() and alike. They themselves do bh_lock_sock(sk), which is already held from somewhere else -> boom. I've now tested the suggested solution by Patrick McHardy and Herbert Xu to simply use local_bh_{en,dis}able(). Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-30 15:35:26 -07:00
David S. Miller	d1102b59ca	[NET]: Use %lx for netdev->features sysfs formatting. Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-29 20:28:25 -07:00
David S. Miller	6c94d3611b	[IPV6]: Clear up user copy warning in flowlabel code. We are intentionally ignoring the copy_to_user() value, make it clear to the compiler too. Noted by Jeff Garzik. Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-29 20:28:01 -07:00
Jon Mason	69f6a0fafc	[NET]: Add ethtool support for NETIF_F_HW_CSUM. Signed-off-by: Jon Mason <jdmason@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-29 20:27:24 -07:00
Pravin B. Shelar	37e20a66db	[IPV4]: Kill MULTIPATHHOLDROUTE flag. It cannot work properly, so just ignore it in drr and rr multipath algorithms just like the random multipath algorithm does. Suggested by Herbert Xu. Signed-off by: Pravin B. Shelar <pravins@calsoftinc.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-29 20:26:44 -07:00
Harald Welte	8f937c6099	[IPV4]: Primary and secondary addresses Add an option to make secondary IP addresses get promoted when primary IP addresses are removed from the device. It defaults to off to preserve existing behavior. Signed-off-by: Harald Welte <laforge@gnumonks.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-29 20:23:46 -07:00
Stephen Hemminger	7ce54e3f42	[BRIDGE]: receive path optimization This improves the bridge local receive path by avoiding going through another softirq. The bridge receive path is already being called from a netif_receive_skb() there is no point in going through another receiveq round trip. Recursion is limited because bridge can never be a port of a bridge so handle_bridge() always returns. Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-29 14:16:48 -07:00
Stephen Hemminger	85967bb46d	[BRIDGE]: prevent bad forwarding table updates Avoid poisoning of the bridge forwarding table by frames that have been dropped by filtering. This prevents spoofed source addresses on hostile side of bridge from causing packet leakage, a small but possible security risk. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-29 14:15:55 -07:00
Stephen Hemminger	81d35307dd	[BRIDGE]: set features based on enslaved devices Make features of the bridge pseudo-device be a subset of the underlying devices. Motivated by Xen and others who use bridging to do failover. Signed-off-by: Catalin BOIE <catab at umrella.ro> Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-29 14:15:17 -07:00
Stephen Hemminger	81e8157583	[BRIDGE]: make dev->features unsigned The features field in netdevice is really a bitmask, and bitmask's should be unsigned. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-29 14:14:35 -07:00
Stephen Hemminger	d8a33ac435	[BRIDGE]: features change notification Resend of earlier patch (no changes) from Catalin used to provide device feature change notification. Signed-off-by: Catalin BOIE <catab at umbrella.ro> Acked-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-29 14:13:47 -07:00
	1f15d69452	Automatic merge of /spare/repo/netdev-2.6 branch master	2005-05-27 22:07:02 -04:00
Alexey Dobriyan	c8b35d2a29	[TOKENRING]: net/802/tr.c: s/struct rif_cache_s/struct rif_cache/ "_s" suffix is certainly of hungarian origin. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-26 12:59:42 -07:00
Alexey Dobriyan	c6b3365391	[TOKENRING]: be'ify trh_hdr, trllc, rif_cache_s Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-26 12:59:05 -07:00
Hideaki YOSHIFUJI	92d63decc0	From: Kazunori Miyazawa <kazunori@miyazawa.org> [XFRM] Call dst_check() with appropriate cookie This fixes infinite loop issue with IPv6 tunnel mode. Signed-off-by: Kazunori Miyazawa <kazunori@miyazawa.org> Signed-off-by: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-26 12:58:04 -07:00
Stephen Hemminger	0dca51d362	[PKT_SCHED] netem: allow random reordering (with fix) Here is a fixed up version of the reorder feature of netem. It is the same as the earlier patch plus with the bugfix from Julio merged in. Has expected backwards compatibility behaviour. Go ahead and merge this one, the TCP strangeness I was seeing was due to the reordering bug, and previous version of TSO patch. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-26 12:55:48 -07:00
Stephen Hemminger	0f9f32ac65	[PKT_SCHED] netem: use only inner qdisc -- no private skbuff queue Netem works better if packets are just queued in the inner discipline rather than having a separate delayed queue. Change to use the dequeue/requeue to peek like TBF does. By doing this potential qlen problems with the old method are avoided. The problems happened when the netem_run that moved packets from the inner discipline to the nested discipline failed (because inner queue was full). This happened in dequeue, so the effective qlen of the netem would be decreased (because of the drop), but there was no way to keep the outer qdisc (caller of netem dequeue) in sync. The problem window is still there since this patch doesn't address the issue of requeue failing in netem_dequeue, but that shouldn't happen since the sequence dequeue/requeue should always work. Long term correct fix is to implement qdisc->peek in all the qdisc's to allow for this (needed by several other qdisc's as well). Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-26 12:55:01 -07:00
Stephen Hemminger	0afb51e728	[PKT_SCHED]: netem: reinsert for duplication Handle duplication of packets in netem by re-inserting at top of qdisc tree. This avoid problems with qlen accounting with nested qdisc. This recursion requires no additional locking but will potentially increase stack depth. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-26 12:53:49 -07:00
Herbert Xu	180e425033	[IPV6]: Fix xfrm tunnel oops with large packets Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-23 13:11:07 -07:00
David S. Miller	314324121f	[TCP]: Fix stretch ACK performance killer when doing ucopy. When we are doing ucopy, we try to defer the ACK generation to cleanup_rbuf(). This works most of the time very well, but if the ucopy prequeue is large, this ACKing behavior kills performance. With TSO, it is possible to fill the prequeue so large that by the time the ACK is sent and gets back to the sender, most of the window has emptied of data and performance suffers significantly. This behavior does help in some cases, so we should think about re-enabling this trick in the future, using some kind of limit in order to avoid the bug case. Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-23 12:03:06 -07:00
Tommy S. Christensen	aa1c6a6f7f	[NETLINK]: Defer socket destruction a bit In netlink_broadcast() we're sending shared skb's to netlink listeners when possible (saves some copying). This is OK, since we hold the only other reference to the skb. However, this implies that we must drop our reference on the skb, before allowing a receiving socket to disappear. Otherwise, the socket buffer accounting is disrupted. Signed-off-by: Tommy S. Christensen <tommy.christensen@tpack.net> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-19 13:07:32 -07:00
Tommy S. Christensen	68acc024ea	[NETLINK]: Move broadcast skb_orphan to the skb_get path. Cloned packets don't need the orphan call. Signed-off-by: Tommy S. Christensen <tommy.christensen@tpack.net> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-19 13:06:35 -07:00
Tommy S. Christensen	db61ecc335	[NETLINK]: Fix race with recvmsg(). This bug causes: assertion (!atomic_read(&sk->sk_rmem_alloc)) failed at net/netlink/af_netlink.c (122) What's happening is that: 1) The skb is sent to socket 1. 2) Someone does a recvmsg on socket 1 and drops the ref on the skb. Note that the rmalloc is not returned at this point since the skb is still referenced. 3) The same skb is now sent to socket 2. This version of the fix resurrects the skb_orphan call that was moved out, last time we had 'shared-skb troubles'. It is practically a no-op in the common case, but still prevents the possible race with recvmsg. Signed-off-by: Tommy S. Christensen <tommy.christensen@tpack.net> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-19 12:46:59 -07:00
Herbert Xu	31c26852cb	[IPSEC]: Verify key payload in verify_one_algo We need to verify that the payload contains enough data so that attach_one_algo can copy alg_key_len bits from the payload. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-19 12:39:49 -07:00
Herbert Xu	b9e9dead05	[IPSEC]: Fixed alg_key_len usage in attach_one_algo The variable alg_key_len is in bits and not bytes. The function attach_one_algo is currently using it as if it were in bytes. This causes it to read memory which may not be there. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-19 12:39:04 -07:00
David S. Miller	8be58932ca	[NETFILTER]: Do not be clever about SKB ownership in ip_ct_gather_frags(). Just do an skb_orphan() and be done with it. Based upon discussions with Herbert Xu on netdev. Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-19 12:36:33 -07:00
Julian Anastasov	d9fa0f392b	[IP_VS]: Remove extra __ip_vs_conn_put() for incoming ICMP. Remove extra __ip_vs_conn_put for incoming ICMP in direct routing mode. Mark de Vries reports that IPVS connections are not leaked anymore. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-19 12:29:59 -07:00
Christoph Hellwig	f81a0bffa1	[AF_UNIX]: Use lookup_create(). currently it opencodes it, but that's in the way of changing the lookup_hash interface. I'd prefer to disallow modular af_unix over exporting lookup_create, but I'll leave that to you. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-19 12:26:43 -07:00
Herbert Xu	2fdba6b085	[IPV4/IPV6] Ensure all frag_list members have NULL sk Having frag_list members which holds wmem of an sk leads to nightmares with partially cloned frag skb's. The reason is that once you unleash a skb with a frag_list that has individual sk ownerships into the stack you can never undo those ownerships safely as they may have been cloned by things like netfilter. Since we have to undo them in order to make skb_linearize happy this approach leads to a dead-end. So let's go the other way and make this an invariant: For any skb on a frag_list, skb->sk must be NULL. That is, the socket ownership always belongs to the head skb. It turns out that the implementation is actually pretty simple. The above invariant is actually violated in the following patch for a short duration inside ip_fragment. This is OK because the offending frag_list member is either destroyed at the end of the slow path without being sent anywhere, or it is detached from the frag_list before being sent. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-18 22:52:33 -07:00
Evgeniy Polyakov	d48102007d	[XFRM]: skb_cow_data() does not set proper owner for new skbs. It looks like skb_cow_data() does not set proper owner for newly created skb. If we have several fragments for skb and some of them are shared(?) or cloned (like in async IPsec) there might be a situation when we require recreating skb and thus using skb_copy() for it. Newly created skb has neither a destructor nor a socket associated with it, which must be copied from the old skb. As far as I can see, current code sets destructor and socket for the first one skb only and uses truesize of the first skb only to increment sk_wmem_alloc value. If above "analysis" is correct then attached patch fixes that. Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-18 22:51:45 -07:00
David Woodhouse	3ec3b2fba5	AUDIT: Capture sys_socketcall arguments and sockaddrs Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2005-05-17 12:08:48 +01:00
	fff9cfd99c	[PATCH] Wireless Extensions 18 (aka WPA) This is version 18 of the Wireless Extensions. The main change is that it adds all the necessary APIs for WPA and WPA2 support. This work was entirely done by Jouni Malinen, so let's thank him for both his hard work and deep expertise on the subject ;-) This APIs obviously doesn't do much by itself and works in concert with driver support (Jouni already sent you the HostAP changes) and userspace (Jouni is updating wpa_supplicant). This is also orthogonal with the ongoing work on in-kernel IEEE support (but potentially useful). The patch is attached, tested with 2.6.11. Normally, I would ask you to push that directly in the kernel (99% of the patch has been on my web page for ages and it does not affect non-WPA stuff), but Jouni convinced me that it should bake a few weeks in wireless-2.6 first, so that other driver maintainers can get up to speed with it. Signed-off-by: Jeff Garzik <jgarzik@pobox.com>	2005-05-12 20:24:19 -04:00
Jesper Juhl	02c30a84e6	[PATCH] update Ross Biro bouncing email address Ross moved. Remove the bad email address so people will find the correct one in ./CREDITS. Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-05 16:36:49 -07:00
Patrick McHardy	60d5306553	[IPV4]: multipath_wrandom.c GPF fixes multipath_wrandom needs to use GFP_ATOMIC. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-05 14:30:15 -07:00
Christoph Hellwig	3ef4e9a8db	[ATALK]: Add alloc_ltalkdev(). this matches the API used by other link layer like ethernet or token ring. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-05 14:25:59 -07:00
Arnaldo Carvalho de Melo	476e19cfa1	[IPV6]: Fix OOPS when using IPV6_ADDRFORM This causes sk->sk_prot to change, which makes the socket release free the sock into the wrong SLAB cache. Fix this by introducing sk_prot_creator so that we always remember where the sock came from. Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-05 13:35:15 -07:00
Rafael J. Wysocki	25ae3f59b1	[DECNET]: Fix build after C99 netlink initializer change. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-05 13:13:29 -07:00
David Woodhouse	bfd4bda097	Merge with master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6.git	2005-05-05 13:59:37 +01:00
Al Viro	56c3b7d788	[PATCH] ISA DMA Kconfig fixes - part 4 (irda) * net/irda/irda_device.c::irda_setup_dma() made conditional on ISA_DMA_API (it uses helpers in question and irda is usable on platforms that don't have them at all - think of USB IRDA, for example). * irda drivers that depend on ISA DMA marked as dependent on ISA_DMA_API Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-04 07:33:14 -07:00
J Hadi Salim	14d50e78f9	[PKT_SCHED]: Action repeat Long standing bug. Policy to repeat an action never worked. Signed-off-by: J Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-03 16:29:13 -07:00
Herbert Xu	aabc9761b6	[IPSEC]: Store idev entries I found a bug that stopped IPsec/IPv6 from working. About a month ago IPv6 started using rt6i_idev->dev on the cached socket dst entries. If the cached socket dst entry is IPsec, then rt6i_idev will be NULL. Since we want to look at the rt6i_idev of the original route in this case, the easiest fix is to store rt6i_idev in the IPsec dst entry just as we do for a number of other IPv6 route attributes. Unfortunately this means that we need some new code to handle the references to rt6i_idev. That's why this patch is bigger than it would otherwise be. I've also done the same thing for IPv4 since it is conceivable that once these idev attributes start getting used for accounting, we probably need to dereference them for IPv4 IPsec entries too. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-03 16:27:10 -07:00
Stephen Hemminger	d5d75cd6b1	[PKT_SCHED]: netem: adjust parent qlen when duplicating Fix qlen underrun when doing duplication with netem. If netem is used as leaf discipline, then the parent needs to be tweaked when packets are duplicated. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-03 16:24:57 -07:00
Stephen Hemminger	771018e76a	[PKT_SCHED]: netem: make qdisc friendly to outer disciplines Netem currently dumps packets into the queue when timer expires. This patch makes it work by self-clocking (more like TBF). It fixes a bug when 0 delay is requested (only doing loss or duplication). Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-03 16:24:32 -07:00
Stephen Hemminger	8cbe1d46d6	[PKT_SCHED]: netem: trap infinite loop hang on qlen underflow Due to bugs in netem (fixed by later patches), it is possible to get qdisc qlen to go negative. If this happens the CPU ends up spinning forever in qdisc_run(). So add a BUG_ON() to trap it. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-03 16:24:03 -07:00
Patrick McHardy	bd96535b81	[NETFILTER]: Drop conntrack reference in ip_dev_loopback_xmit() Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-03 16:21:37 -07:00
Patrick McHardy	e4f8ab00cf	[NETFILTER]: Fix nf_debug_ip_local_deliver() Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-03 16:20:39 -07:00
Tommy S. Christensen	cacaddf57e	[NET]: Disable queueing when carrier is lost. Some network drivers call netif_stop_queue() when detecting loss of carrier. This leads to packets being queued up at the qdisc level for an unbounded period of time. In order to prevent this effect, the core networking stack will now cease to queue packets for any device that is operationally down (i.e. the queue is flushed and disabled). Signed-off-by: Tommy S. Christensen <tommy.christensen@tpack.net> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-03 16:18:52 -07:00
David S. Miller	0f4821e7b9	[XFRM/RTNETLINK]: Decrement qlen properly in {xfrm_,rt}netlink_rcv(). If we free up a partially processed packet because its skb->len dropped to zero, we need to decrement qlen because we are dropping out of the top-level loop so it will do the decrement for us. Spotted by Herbert Xu. Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-03 16:15:59 -07:00
David S. Miller	09e1430598	[NETLINK]: Fix infinite loops in synchronous netlink changes. The qlen should continue to decrement, even if we pop partially processed SKBs back onto the receive queue. Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-03 15:30:05 -07:00
Herbert Xu	2a0a6ebee1	[NETLINK]: Synchronous message processing. Let's recap the problem. The current asynchronous netlink kernel message processing is vulnerable to these attacks: 1) Hit and run: Attacker sends one or more messages and then exits before they're processed. This may confuse/disable the next netlink user that gets the netlink address of the attacker since it may receive the responses to the attacker's messages. Proposed solutions: a) Synchronous processing. b) Stream mode socket. c) Restrict/prohibit binding. 2) Starvation: Because various netlink rcv functions were written to not return until all messages have been processed on a socket, it is possible for these functions to execute for an arbitrarily long period of time. If this is successfully exploited it could also be used to hold rtnl forever. Proposed solutions: a) Synchronous processing. b) Stream mode socket. Firstly let's cross off solution c). It only solves the first problem and it has user-visible impacts. In particular, it'll break user space applications that expect to bind or communicate with specific netlink addresses (pid's). So we're left with a choice of synchronous processing versus SOCK_STREAM for netlink. For the moment I'm sticking with the synchronous approach as suggested by Alexey since it's simpler and I'd rather spend my time working on other things. However, it does have a number of deficiencies compared to the stream mode solution: 1) User-space to user-space netlink communication is still vulnerable. 2) Inefficient use of resources. This is especially true for rtnetlink since the lock is shared with other users such as networking drivers. The latter could hold the rtnl while communicating with hardware which causes the rtnetlink user to wait when it could be doing other things. 3) It is still possible to DoS all netlink users by flooding the kernel netlink receive queue. The attacker simply fills the receive socket with a single netlink message that fills up the entire queue. The attacker then continues to call sendmsg with the same message in a loop. Point 3) can be countered by retransmissions in user-space code, however it is pretty messy. In light of these problems (in particular, point 3), we should implement stream mode netlink at some point. In the meantime, here is a patch that implements synchronous processing. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-03 14:55:09 -07:00
Herbert Xu	96c3602343	[NETLINK]: cb_lock does not need ref count on sk Here is a little optimisation for the cb_lock used by netlink_dump. While fixing that race earlier, I noticed that the reference count held by cb_lock is completely useless. The reason is that in order to obtain the protection of the reference count, you have to take the cb_lock. But the only way to take the cb_lock is through dereferencing the socket. That is, you must already possess a reference count on the socket before you can take advantage of the reference count held by cb_lock. As a corollary, we can remove the reference count held by the cb_lock. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-03 14:43:27 -07:00
Asim Shankar	033d899904	[PKT_SCHED]: HTB: Drop packet when direct queue is full htb_enqueue(): Free skb and return NET_XMIT_DROP if a packet is destined for the direct_queue but the direct_queue is full. (Before this: erroneously returned NET_XMIT_SUCCESS even though the packet was not enqueued) Signed-off-by: Asim Shankar <asimshankar@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-03 14:39:33 -07:00
Folkert van Heusden	c3924c70dd	[TCP]: Optimize check in port-allocation code, v6 version. Signed-off-by: Folkert van Heusden <folkert@vanheusden.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-03 14:36:45 -07:00
Folkert van Heusden	0b2531bdc5	[TCP]: Optimize check in port-allocation code. Signed-off-by: Folkert van Heusden <folkert@vanheusden.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-03 14:36:08 -07:00
Lucas Correia Villa Real	20cc6befa2	[PKT_SCHED]: fix typo on Kconfig This is a trivial fix for a typo on Kconfig, where the Generic Random Early Detection algorithm is abbreviated as RED instead of GRED. Signed-off-by: Lucas Correia Villa Real <lucasvr@gobolinux.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-03 14:34:20 -07:00
Thomas Graf	db46edc6d3	[RTNETLINK] Cleanup rtnetlink_link tables Converts remaining rtnetlink_link tables to use c99 designated initializers to make grepping a little bit easier. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-03 14:29:39 -07:00
Thomas Graf	f90a0a74b8	[RTNETLINK] Fix & cleanup rtm_min/rtm_max Converts rtm_min and rtm_max arrays to use c99 designated initializers for easier insertion of new message families. RTM_GETMULTICAST and RTM_GETANYCAST did not have the minimal message size specified which means that the netlink message was parsed for routing attributes starting from the header. Adds the proper minimal message sizes for these messages (netlink header + common rtnetlink header) to fix this issue. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-03 14:29:00 -07:00
Thomas Graf	492b558b31	[XFRM]: Cleanup xfrm_msg_min and xfrm_dispatch Converts xfrm_msg_min and xfrm_dispatch to use c99 designated initializers to make grepping a little bit easier. Also replaces two hardcoded message types with meaningful names. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-03 14:26:40 -07:00
Herbert Xu	679a873824	[IPV6]: Fix raw socket checksums with IPsec I made a mistake in my last patch to the raw socket checksum code. I used the value of inet->cork.length as the length of the payload. While this works with normal packets, it breaks down when IPsec is present since the cork length includes the extension header length. So here is a patch to fix the length calculations. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-03 14:24:36 -07:00

202 Commits