linux

Author	SHA1	Message	Date
David S. Miller	e308a5d806	netdev: Add netdev->addr_list_lock protection. Add netif_addr_{lock,unlock}{,_bh}() helpers. Use them to protect operations that operate on or read the network device unicast and multicast address lists. Also use them in cases where the code simply wants to block calls into the driver's ->set_rx_mode() and ->set_multicast_list() methods. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-15 00:13:44 -07:00
David S. Miller	f1f28aa351	netdev: Add addr_list_lock to struct net_device. This will be used to protect the per-device unicast and multicast address lists, as well as the callbacks into the drivers which configure such state such as ->set_rx_mode() and ->set_multicast_list(). Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-15 00:08:33 -07:00
Patrick McHardy	bc1d0411b8	vlan: deliver packets received with VLAN acceleration to network taps When VLAN header stripping is used, packets currently bypass packet sockets (and other network taps) completely. For locally existing VLANs, they appear directly on the VLAN device, for unknown VLANs they are silently dropped. Add a new function netif_nit_deliver() to deliver incoming packets to all network interface taps and use it in __vlan_hwaccel_rx() to make VLAN packets visible on the underlying device. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-14 22:49:30 -07:00
Patrick McHardy	6aa895b047	vlan: Don't store VLAN tag in cb Use a real skb member to store the skb to avoid clashes with qdiscs, which are allowed to use the cb area themselves. As currently only real devices that consume the skb set the NETIF_F_HW_VLAN_TX flag, no explicit invalidation is neccessary. The new member fills a hole on 64 bit, the skb layout changes from: __u32 mark; /* 172 4 / sk_buff_data_t transport_header; / 176 4 / sk_buff_data_t network_header; / 180 4 / sk_buff_data_t mac_header; / 184 4 / sk_buff_data_t tail; / 188 4 / / --- cacheline 3 boundary (192 bytes) --- / sk_buff_data_t end; / 192 4 / / XXX 4 bytes hole, try to pack / to __u32 mark; / 172 4 / __u16 vlan_tci; / 176 2 / / XXX 2 bytes hole, try to pack / sk_buff_data_t transport_header; / 180 4 / sk_buff_data_t network_header; / 184 4 */ Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-14 22:49:06 -07:00
Linus Torvalds	666484f025	Merge branch 'core/softirq' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core/softirq' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: softirq: remove irqs_disabled warning from local_bh_enable softirq: remove initialization of static per-cpu variable Remove argument from open_softirq which is always NULL	2008-07-14 15:28:42 -07:00
Johannes Berg	22bb1be4d2	wext: make sysfs bits optional and deprecate them The /sys/class/net/*/wireless/ direcory is, as far as I know, not used by anyone. Additionally, the same data is available via wext ioctls. Hence the sysfs files are pretty much useless. This patch makes them optional and schedules them for removal. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Cc: Jean Tourrilhes <jt@hpl.hp.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-07-14 14:52:57 -04:00
David S. Miller	c773e847ea	netdev: Move _xmit_lock and xmit_lock_owner into netdev_queue. Accesses are mostly structured such that when there are multiple TX queues the code transformations will be a little bit simpler. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-08 23:13:53 -07:00
David S. Miller	eb6aafe3f8	pkt_sched: Make qdisc_run take a netdev_queue. This allows us to use this calling convention all the way down into qdisc_restart(). Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-08 23:12:38 -07:00
David S. Miller	86d804e10a	netdev: Make netif_schedule() routines work with netdev_queue objects. Only plain netif_schedule() remains taking a net_device, mostly as a compatability item while we transition the rest of these interfaces. Everything else calls netif_schedule_queue() or __netif_schedule(), both of which take a netdev_queue pointer. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-08 23:11:25 -07:00
David S. Miller	6fa9864b53	net: Clean up explicit ->tx_queue references in link watch. First, we add a qdisc_tx_changing() helper which returns true if the qdisc attachment is in transition. Second, we remove an assertion warning which is of limited value and is hard to express precisely in a multiqueue environment. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-08 23:01:06 -07:00
David S. Miller	ee609cb362	netdev: Move next_sched into struct netdev_queue. We schedule queues, not the device, for output queue processing in BH. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-08 22:58:37 -07:00
David S. Miller	816f3258e7	netdev: Kill qdisc_ingress, use netdev->rx_queue.qdisc instead. Now that our qdisc management is bi-directional, per-queue, and fully orthogonal, there is no reason to have a special ingress qdisc pointer in struct net_device. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-08 22:49:00 -07:00
David S. Miller	b0e1e6462d	netdev: Move rest of qdisc state into struct netdev_queue Now qdisc, qdisc_sleeping, and qdisc_list also live there. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-08 17:42:10 -07:00
David S. Miller	555353cfa1	netdev: The ingress_lock member is no longer needed. Every qdisc is assosciated with a queue, and in the case of ingress qdiscs that will now be netdev->rx_queue so using that queue's lock is the thing to do. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-08 17:33:13 -07:00
David S. Miller	dc2b48475a	netdev: Move queue_lock into struct netdev_queue. The lock is now an attribute of the device queue. One thing to notice is that "suspicious" places emerge which will need specific training about multiple queue handling. They are so marked with explicit "netdev->rx_queue" and "netdev->tx_queue" references. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-08 17:18:23 -07:00
David S. Miller	bb949fbd18	netdev: Create netdev_queue abstraction. A netdev_queue is an entity managed by a qdisc. Currently there is one RX and one TX queue, and a netdev_queue merely contains a backpointer to the net_device. The Qdisc struct is augmented with a netdev_queue pointer as well. Eventually the 'dev' Qdisc member will go away and we will have the resulting hierarchy: net_device --> netdev_queue --> Qdisc Also, qdisc_alloc() and qdisc_create_dflt() now take a netdev_queue pointer argument. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-08 16:55:56 -07:00
Patrick McHardy	4b5a698ef4	net: fix dev_set_promiscuity() breakage Commit `dad9b335` (netdevice: Fix promiscuity and allmulti overflow) broke dev_set_promiscuity() by returning on success without reprogramming the device. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-06 15:49:08 -07:00
Ingo Molnar	68083e05d7	Merge commit 'v2.6.26-rc9' into cpus4096	2008-07-06 14:23:39 +02:00
David S. Miller	ea2aca084b	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: Documentation/feature-removal-schedule.txt drivers/net/wan/hdlc_fr.c drivers/net/wireless/iwlwifi/iwl-4965.c drivers/net/wireless/iwlwifi/iwl3945-base.c	2008-07-05 23:08:07 -07:00
Denis V. Lunev	ae299fc051	net: add fib_rules_ops to flush_cache method This is required to pass namespace context into rt_cache_flush called from ->flush_cache. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-05 19:01:28 -07:00
Santwona Behera	0853ad66b1	netdev: Add support for rx flow hash configuration, using ethtool. Added new interfaces to ethtool to configure receive network flow distribution across multiple rx rings using hashing. Signed-off-by: Santwona Behera <santwona.behera@sun.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-02 03:47:41 -07:00
Patrick McHardy	2fe195cfe3	net: fib_rules: fix error code for unsupported families The errno code returned must be negative. Fixes "RTNETLINK answers: Unknown error 18446744073709551519". Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-01 19:59:37 -07:00
Wang Chen	93b3cff991	netdevice: Fix wrong string handle in kernel command line parsing v1->v2: Use strlcpy() to ensure s[i].name be null-termination. 1. In netdev_boot_setup_add(), a long name will leak. ex. : dev=21,0x1234,0x1234,0x2345,eth123456789verylongname......... 2. In netdev_boot_setup_check(), mismatch will happen if s[i].name is a substring of dev->name. ex. : dev=...eth1 dev=...eth11 [ With feedback from Ben Hutchings. ] Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-01 19:57:19 -07:00
Wang Chen	8fde8a0769	net: Tyop of sk_filter() comment Parameter "needlock" no long exists. Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-01 19:55:40 -07:00
David S. Miller	1b63ba8a86	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/wireless/iwlwifi/iwl4965-base.c	2008-06-28 01:19:40 -07:00
Wang Chen	5dbaec5dc6	netdevice: Fix typo of dev_unicast_add() comment Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-06-27 19:35:16 -07:00
Octavian Purdila	db43a282d3	tcp: fix for splice receive when used with software LRO If an skb has nr_frags set to zero but its frag_list is not empty (as it can happen if software LRO is enabled), and a previous tcp_read_sock has consumed the linear part of the skb, then __skb_splice_bits: (a) incorrectly reports an error and (b) forgets to update the offset to account for the linear part Any of the two problems will cause the subsequent __skb_splice_bits call (the one that handles the frag_list skbs) to either skip data, or, if the unadjusted offset is greater then the size of the next skb in the frag_list, make tcp_splice_read loop forever. Signed-off-by: Octavian Purdila <opurdila@ixiacom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-06-27 17:27:21 -07:00
Jens Axboe	8691e5a8f6	smp_call_function: get rid of the unused nonatomic/retry argument It's never used and the comments refer to nonatomic and retry interchangably. So get rid of it. Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-06-26 11:24:35 +02:00
Ingo Molnar	a60b33cf59	Merge branch 'linus' into core/softirq	2008-06-23 10:52:59 +02:00
Eric W. Biederman	b9f75f45a6	netns: Don't receive new packets in a dead network namespace. Alexey Dobriyan <adobriyan@gmail.com> writes: > Subject: ICMP sockets destruction vs ICMP packets oops > After icmp_sk_exit() nuked ICMP sockets, we get an interrupt. > icmp_reply() wants ICMP socket. > > Steps to reproduce: > > launch shell in new netns > move real NIC to netns > setup routing > ping -i 0 > exit from shell > > BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 > IP: [<ffffffff803fce17>] icmp_sk+0x17/0x30 > PGD 17f3cd067 PUD 17f3ce067 PMD 0 > Oops: 0000 [1] PREEMPT SMP DEBUG_PAGEALLOC > CPU 0 > Modules linked in: usblp usbcore > Pid: 0, comm: swapper Not tainted 2.6.26-rc6-netns-ct #4 > RIP: 0010:[<ffffffff803fce17>] [<ffffffff803fce17>] icmp_sk+0x17/0x30 > RSP: 0018:ffffffff8057fc30 EFLAGS: 00010286 > RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff81017c7db900 > RDX: 0000000000000034 RSI: ffff81017c7db900 RDI: ffff81017dc41800 > RBP: ffffffff8057fc40 R08: 0000000000000001 R09: 000000000000a815 > R10: 0000000000000000 R11: 0000000000000001 R12: ffffffff8057fd28 > R13: ffffffff8057fd00 R14: ffff81017c7db938 R15: ffff81017dc41800 > FS: 0000000000000000(0000) GS:ffffffff80525000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b > CR2: 0000000000000000 CR3: 000000017fcda000 CR4: 00000000000006e0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > Process swapper (pid: 0, threadinfo ffffffff8053a000, task ffffffff804fa4a0) > Stack: 0000000000000000 ffff81017c7db900 ffffffff8057fcf0 ffffffff803fcfe4 > ffffffff804faa38 0000000000000246 0000000000005a40 0000000000000246 > 000000000001ffff ffff81017dd68dc0 0000000000005a40 0000000055342436 > Call Trace: > <IRQ> [<ffffffff803fcfe4>] icmp_reply+0x44/0x1e0 > [<ffffffff803d3a0a>] ? ip_route_input+0x23a/0x1360 > [<ffffffff803fd645>] icmp_echo+0x65/0x70 > [<ffffffff803fd300>] icmp_rcv+0x180/0x1b0 > [<ffffffff803d6d84>] ip_local_deliver+0xf4/0x1f0 > [<ffffffff803d71bb>] ip_rcv+0x33b/0x650 > [<ffffffff803bb16a>] netif_receive_skb+0x27a/0x340 > [<ffffffff803be57d>] process_backlog+0x9d/0x100 > [<ffffffff803bdd4d>] net_rx_action+0x18d/0x250 > [<ffffffff80237be5>] __do_softirq+0x75/0x100 > [<ffffffff8020c97c>] call_softirq+0x1c/0x30 > [<ffffffff8020f085>] do_softirq+0x65/0xa0 > [<ffffffff80237af7>] irq_exit+0x97/0xa0 > [<ffffffff8020f198>] do_IRQ+0xa8/0x130 > [<ffffffff80212ee0>] ? mwait_idle+0x0/0x60 > [<ffffffff8020bc46>] ret_from_intr+0x0/0xf > <EOI> [<ffffffff80212f2c>] ? mwait_idle+0x4c/0x60 > [<ffffffff80212f23>] ? mwait_idle+0x43/0x60 > [<ffffffff8020a217>] ? cpu_idle+0x57/0xa0 > [<ffffffff8040f380>] ? rest_init+0x70/0x80 > Code: 10 5b 41 5c 41 5d 41 5e c9 c3 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 53 > 48 83 ec 08 48 8b 9f 78 01 00 00 e8 2b c7 f1 ff 89 c0 <48> 8b 04 c3 48 83 c4 08 > 5b c9 c3 66 66 66 66 66 2e 0f 1f 84 00 > RIP [<ffffffff803fce17>] icmp_sk+0x17/0x30 > RSP <ffffffff8057fc30> > CR2: 0000000000000000 > ---[ end trace ea161157b76b33e8 ]--- > Kernel panic - not syncing: Aiee, killing interrupt handler! Receiving packets while we are cleaning up a network namespace is a racy proposition. It is possible when the packet arrives that we have removed some but not all of the state we need to fully process it. We have the choice of either playing wack-a-mole with the cleanup routines or simply dropping packets when we don't have a network namespace to handle them. Since the check looks inexpensive in netif_receive_skb let's just drop the incoming packets. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-06-20 22:16:51 -07:00
Ben Hutchings	4497b0763c	net: Discard and warn about LRO'd skbs received for forwarding Add skb_warn_if_lro() to test whether an skb was received with LRO and warn if so. Change br_forward(), ip_forward() and ip6_forward() to call it) and discard the skb if it returns true. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-06-19 16:22:28 -07:00
Ben Hutchings	0187bdfb05	net: Disable LRO on devices that are forwarding Large Receive Offload (LRO) is only appropriate for packets that are destined for the host, and should be disabled if received packets may be forwarded. It can also confuse the GSO on output. Add dev_disable_lro() function which uses the appropriate ethtool ops to disable LRO if enabled. Add calls to dev_disable_lro() in br_add_if() and functions that enable IPv4 and IPv6 forwarding. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-06-19 16:15:47 -07:00
Wang Chen	dad9b335c6	netdevice: Fix promiscuity and allmulti overflow Max of promiscuity and allmulti plus positive @inc can cause overflow. Fox example: when allmulti=0xFFFFFFFF, any caller give dev_set_allmulti() a positive @inc will cause allmulti be off. This is not what we want, though it's rare case. The fix is that only negative @inc will cause allmulti or promiscuity be off and when any caller makes the counters touch the roof, we return error. Change of v2: Change void function dev_set_promiscuity/allmulti to return int. So callers can get the overflow error. Caller's fix will be done later. Change of v3: 1. Since we return error to caller, we don't need to print KERN_ERROR, KERN_WARNING is enough. 2. In dev_set_promiscuity(), if __dev_set_promiscuity() failed, we return at once. Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-06-18 01:48:28 -07:00
David S. Miller	972692e0db	net: Add sk_set_socket() helper. In order to more easily grep for all things that set sk->sk_socket, add sk_set_socket() helper inline function. Suggested (although only half-seriously) by Evgeniy Polyakov. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-06-17 22:41:38 -07:00
Jay Vosburgh	b8a9787edd	bonding: Allow setting max_bonds to zero Permit bonding to function rationally if max_bonds is set to zero. This will load the module, but create no master devices (which can be created via sysfs). Requires some change to bond_create_sysfs; currently, the netdev sysfs directory is determined from the first bonding device created, but this is no longer possible. Instead, an interface from net/core is created to create and destroy files in net_class. Based on a patch submitted by Phil Oester <kernel@linuxaces.com>. Modified by Jay Vosburgh to fix the sysfs issue mentioned above and to update the documentation. Signed-off-by: Phil Oester <kernel@linuxace.com> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2008-06-18 00:00:04 -04:00
Or Gerlitz	c1da4ac752	net/core: add NETDEV_BONDING_FAILOVER event Add NETDEV_BONDING_FAILOVER event to be used in a successive patch by bonding to announce fail-over for the active-backup mode through the netdev events notifier chain mechanism. Such an event can be of use for the RDMA CM (communication manager) to let native RDMA ULPs (eg NFS-RDMA, iSER) always be aligned with the IP stack, in the sense that they use the same ports/links as the stack does. More usages can be done to allow monitoring tools based on netlink events being aware to bonding fail-over. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2008-06-17 23:59:41 -04:00
David S. Miller	caea902f72	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/wireless/rt2x00/Kconfig drivers/net/wireless/rt2x00/rt2x00usb.c net/sctp/protocol.c	2008-06-16 18:25:48 -07:00
Ben Hutchings	6de329e26c	net: Fix test for VLAN TX checksum offload capability Selected device feature bits can be propagated to VLAN devices, so we can make use of TX checksum offload and TSO on VLAN-tagged packets. However, if the physical device does not do VLAN tag insertion or generic checksum offload then the test for TX checksum offload in dev_queue_xmit() will see a protocol of htons(ETH_P_8021Q) and yield false. This splits the checksum offload test into two functions: - can_checksum_protocol() tests a given protocol against a feature bitmask - dev_can_checksum() first tests the skb protocol against the device features; if that fails and the protocol is htons(ETH_P_8021Q) then it tests the encapsulated protocol against the effective device features for VLANs Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-06-16 17:02:28 -07:00
Ingo Molnar	9583f3d9c0	Merge branch 'linus' into core/softirq	2008-06-16 11:24:17 +02:00
Adrian Bunk	0b04082995	net: remove CVS keywords This patch removes CVS keywords that weren't updated for a long time from comments. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-06-11 21:00:38 -07:00
David S. Miller	65b53e4cc9	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/tg3.c drivers/net/wireless/rt2x00/rt2x00dev.c net/mac80211/ieee80211_i.h	2008-06-10 02:22:26 -07:00
Octavian Purdila	293ad60401	tcp: Fix for race due to temporary drop of the socket lock in skb_splice_bits. skb_splice_bits temporary drops the socket lock while iterating over the socket queue in order to break a reverse locking condition which happens with sendfile. This, however, opens a window of opportunity for tcp_collapse() to aggregate skbs and thus potentially free the current skb used in skb_splice_bits and tcp_read_sock. This patch fixes the problem by (re-)getting the same "logical skb" after the lock has been temporary dropped. Based on idea and initial patch from Evgeniy Polyakov. Signed-off-by: Octavian Purdila <opurdila@ixiacom.com> Acked-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-06-04 15:45:58 -07:00
Thomas Graf	bc3ed28caa	netlink: Improve returned error codes Make nlmsg_trim(), nlmsg_cancel(), genlmsg_cancel(), and nla_nest_cancel() void functions. Return -EMSGSIZE instead of -1 if the provided message buffer is not big enough. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-06-03 16:36:54 -07:00
Brice Goglin	7557af2515	net_dma: remove duplicate assignment in dma_skb_copy_datagram_iovec No need to compute copy twice in the frags loop in dma_skb_copy_datagram_iovec(). Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr> Acked-by: Shannon Nelson <shannon.nelson@intel.com> Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-06-03 16:07:45 -07:00
Stephen Hemminger	b9f5f52cca	net: neighbour table ABI problem The neighbor table time of last use information is returned in the incorrect unit. Kernel to user space ABI's need to use USER_HZ (or milliseconds), otherwise the application has to try and discover the real system HZ value which is problematic. Linux has standardized on keeping USER_HZ consistent (100hz) even when kernel is running internally at some other value. This change is small, but it breaks the ABI for older version of iproute2 utilities. But these utilities are already broken since they are looking at the psched_hz values which are completely different. So let's just go ahead and fix both kernel and user space. Older utilities will just print wrong values. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-06-03 16:03:15 -07:00
David S. Miller	43154d08d6	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/cpmac.c net/mac80211/mlme.c	2008-05-25 23:26:10 -07:00
Carlos R. Mafra	962cf36c5b	Remove argument from open_softirq which is always NULL As git-grep shows, open_softirq() is always called with the last argument being NULL block/blk-core.c: open_softirq(BLOCK_SOFTIRQ, blk_done_softirq, NULL); kernel/hrtimer.c: open_softirq(HRTIMER_SOFTIRQ, run_hrtimer_softirq, NULL); kernel/rcuclassic.c: open_softirq(RCU_SOFTIRQ, rcu_process_callbacks, NULL); kernel/rcupreempt.c: open_softirq(RCU_SOFTIRQ, rcu_process_callbacks, NULL); kernel/sched.c: open_softirq(SCHED_SOFTIRQ, run_rebalance_domains, NULL); kernel/softirq.c: open_softirq(TASKLET_SOFTIRQ, tasklet_action, NULL); kernel/softirq.c: open_softirq(HI_SOFTIRQ, tasklet_hi_action, NULL); kernel/timer.c: open_softirq(TIMER_SOFTIRQ, run_timer_softirq, NULL); net/core/dev.c: open_softirq(NET_TX_SOFTIRQ, net_tx_action, NULL); net/core/dev.c: open_softirq(NET_RX_SOFTIRQ, net_rx_action, NULL); This observation has already been made by Matthew Wilcox in June 2002 (http://www.cs.helsinki.fi/linux/linux-kernel/2002-25/0687.html) "I notice that none of the current softirq routines use the data element passed to them." and the situation hasn't changed since them. So it appears we can safely remove that extra argument to save 128 (54) bytes of kernel data (text). Signed-off-by: Carlos R. Mafra <crmafra@ift.unesp.br> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-05-25 07:43:15 +02:00
Mike Travis	0e12f848b3	net: use performance variant for_each_cpu_mask_nr Change references from for_each_cpu_mask to for_each_cpu_mask_nr where appropriate Reviewed-by: Paul Jackson <pj@sgi.com> Reviewed-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-05-23 18:35:12 +02:00
Pavel Emelyanov	96e74088f1	net: The dev->get_stats pointer is not NULL nowadays. And so does the pointer is returns, but sysfs and netlinks still check for both cases. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-05-21 14:12:46 -07:00
Denis V. Lunev	d3ede327e8	pktgen: make sure that pktgen_thread_worker has been executed The following courruption can happen during pktgen stop: list_del corruption. prev->next should be ffff81007e8a5e70, but was 6b6b6b6b6b6b6b6b kernel BUG at lib/list_debug.c:67! :pktgen:pktgen_thread_worker+0x374/0x10b0 ? autoremove_wake_function+0x0/0x40 ? _spin_unlock_irqrestore+0x42/0x80 ? :pktgen:pktgen_thread_worker+0x0/0x10b0 kthread+0x4d/0x80 child_rip+0xa/0x12 ? restore_args+0x0/0x30 ? kthread+0x0/0x80 ? child_rip+0x0/0x12 RIP list_del+0x48/0x70 The problem is that pktgen_thread_worker can not be executed if kthread_stop has been called too early. Insert a completion on the normal initialization path to make sure that pktgen_thread_worker will gain the control for sure. Signed-off-by: Denis V. Lunev <den@openvz.org> Acked-by: Alexey Dobriyan <adobriyan@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-05-20 15:12:44 -07:00
David Woodhouse	0e91796eb4	net: Fix call to ->change_rx_flags(dev, IFF_MULTICAST) in dev_change_flags() Am I just being particularly dim today, or can the call to dev->change_rx_flags(dev, IFF_MULTICAST) in dev_change_flags() never happen? We've just set dev->flags = flags & IFF_MULTICAST, effectively. So the condition '(dev->flags ^ flags) & IFF_MULTICAST' is _never_ going to be true. Signed-off-by: David Woodhouse <dwmw2@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-05-20 14:36:14 -07:00
Pavel Emelyanov	d5a4502e9e	netns: Register net/core/ sysctls at read-only root. Most of the net/core/xxx sysctls are read-only now, but this goal is achieved with excessive memory consumption in each namespace - the whole table is cloned and most of the entries in it are ~= 0222. Split it into two parts and register (the largest) one at the read-only root. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-05-19 13:49:52 -07:00
Stephen Hemminger	dcc997738e	net: handle errors from device_rename device_rename can fail with -EEXIST or -ENOMEM, so handle any problems. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-05-14 22:33:38 -07:00
Rami Rosen	9ee6b7f155	net: Fix typo in net/core/sock.c. In sock_queue_rcv_skb() (net/core/sock.c) it should be: "Cast sk->rcvbuf ..." instead of: "Cast skb->rcvbuf ..." Signed-off-by: Rami Rosen <ramirose@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-05-14 03:50:03 -07:00
Johannes Berg	f5184d267c	net: Allow netdevices to specify needed head/tailroom This patch adds needed_headroom/needed_tailroom members to struct net_device and updates many places that allocate sbks to use them. Not all of them can be converted though, and I'm sure I missed some (I mostly grepped for LL_RESERVED_SPACE) Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-05-12 20:48:31 -07:00
Ben Hutchings	e46b66bc42	net: Added ASSERT_RTNL() to dev_open() and dev_close(). dev_open() and dev_close() must be called holding the RTNL, since they call device functions and netdevice notifiers that are promised the RTNL. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-05-08 02:53:17 -07:00
Pavel Emelyanov	aca51397d0	netns: Fix arbitrary net_device-s corruptions on net_ns stop. When a net namespace is destroyed, some devices (those, not killed on ns stop explicitly) are moved back to init_net. The problem, is that this net_ns change has one point of failure - the __dev_alloc_name() may be called if a name collision occurs (and this is easy to trigger). This allocator performs a likely-to-fail GFP_ATOMIC allocation to find a suitable number. Other possible conditions that may cause error (for device being ns local or not registered) are always false in this case. So, when this call fails, the device is unregistered. But this is not the right thing to do, since after this the device may be released (and kfree-ed) improperly. E. g. bridges require more actions (sysfs update, timer disarming, etc.), some other devices want to remove their private areas from lists, etc. I. e. arbitrary use-after-free cases may occur. The proposed fix is the following: since the only reason for the dev_change_net_namespace to fail is the name generation, we may give it a unique fall-back name w/o %d-s in it - the dev<ifindex> one, since ifindexes are still unique. So make this change, raise the failure-case printk loglevel to EMERG and replace the unregister_netdevice call with BUG(). [ Use snprintf() -DaveM ] Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-05-08 01:24:25 -07:00
Johannes Berg	c800578510	net: Fix useless comment reference loop. include/linux/skbuff.h says: /* These elements must be at the end, see alloc_skb() for details. / net/core/skbuff.c says: See comment in sk_buff definition, just before the 'tail' member This patch contains my guess as to the actual reason rather than a dead comment reference loop. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-05-03 20:56:42 -07:00
Daniel Lezcano	aaf8cdc34d	netns: Fix device renaming for sysfs When a netdev is moved across namespaces with the 'dev_change_net_namespace' function, the 'device_rename' function is used to fixup kobject and refresh the sysfs tree. The device_rename function will call kobject_rename and this one will check if there is an object with the same name and this is the case because we are renaming the object with the same name. The use of 'device_rename' seems for me wrong because we usually don't rename it but just move it across namespaces. As we just want to do a mini "netdev_[un]register", IMO the functions 'netdev_[un]register_kobject' should be used instead, like an usual network device [un]registering. This patch replace device_rename by netdev_unregister_kobject, followed by netdev_register_kobject. The netdev_register_kobject will call device_initialize and will raise a warning indicating the device was already initialized. In order to fix that, I split the device initialization into a separate function and use it together with 'netdev_register_kobject' into register_netdevice. So we can safely call 'netdev_register_kobject' in 'dev_change_net_namespace'. This fix will allow to properly use the sysfs per namespace which is coming from -mm tree. Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com> Acked-by: Benjamin Thery <benjamin.thery@bull.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-05-02 17:00:58 -07:00
Mike Travis	0c0b0aca66	net: remove NR_CPUS arrays in net/core/dev.c Remove the fixed size channels[NR_CPUS] array in net/core/dev.c and dynamically allocate array based on nr_cpu_ids. Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-05-02 16:43:08 -07:00
Harvey Harrison	d3e2ce3bcd	net: use get/put_unaligned_* helpers Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-05-02 16:26:16 -07:00
Ilpo Järvinen	50aab54f30	net: Add missing braces to multi-statement if()s One finds all kinds of crazy things with some shell pipelining. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-05-02 16:20:10 -07:00
Denis V. Lunev	5efdccbcda	net: assign PDE->data before gluing PDE into /proc tree Simply replace proc_create and further data assigned with proc_create_data. Additionally, there is no need to assign NULL to PDE->data after creation, /proc generic has already done this for us. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-05-02 02:46:22 -07:00
Hirofumi Nakagawa	801678c5a3	Remove duplicated unlikely() in IS_ERR() Some drivers have duplicated unlikely() macros. IS_ERR() already has unlikely() in itself. This patch cleans up such pointless code. Signed-off-by: Hirofumi Nakagawa <hnakagawa@miraclelinux.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Jeff Garzik <jeff@garzik.org> Cc: Paul Clements <paul.clements@steeleye.com> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: David Brownell <david-b@pacbell.net> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Cc: Anton Altaparmakov <aia21@cantab.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Carsten Otte <cotte@de.ibm.com> Cc: Patrick McHardy <kaber@trash.net> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Jaroslav Kysela <perex@perex.cz> Cc: Takashi Iwai <tiwai@suse.de> Acked-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:25 -07:00
Linus Torvalds	2e561c7b7e	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (48 commits) net: Fix wrong interpretation of some copy_to_user() results. xfrm: alg_key_len & alg_icv_len should be unsigned [netdrvr] tehuti: move ioctl perm check closer to function start ipv6: Fix typo in net/ipv6/Kconfig via-velocity: fix vlan receipt tg3: sparse cleanup forcedeth: realtek phy crossover detection ibm_newemac: Increase MDIO timeouts gianfar: Fix skb allocation strategy netxen: reduce stack usage of netxen_nic_flash_print smc911x: test after postfix decrement fails in smc911x_{reset,drop_pkt} net drivers: fix platform driver hotplug/coldplug forcedeth: new backoff implementation ehea: make things static phylib: Add support for board-level PHY fixups [netdrvr] atlx: code movement: move atl1 parameter parsing atlx: remove flash vendor parameter korina: misc cleanup korina: fix misplaced return statement WAN: Fix confusing insmod error code for C101 too. ...	2008-04-25 12:28:28 -07:00
Mandeep Singh Baines	c5835df971	ethtool: EEPROM dump no longer works for tg3 and natsemi In the ethtool user-space application, tg3 and natsemi over-ride the default implementation of dump_eeprom(). In both tg3_dump_eeprom() and natsemi_dump_eeprom(), there is a magic number check which is not present in the default implementation. Commit `b131dd5d` ("[ETHTOOL]: Add support for large eeproms") snipped the code which copied the ethtool_eeprom structure back to user-space. tg3 and natsemi are over-writing the magic number field and then checking it in user-space. With the ethtool_eeprom copy removed, the check is failing. The fix is simple. Add the ethtool_eeprom copy back. Signed-off-by: Mandeep Singh Baines <msb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-24 20:55:56 -07:00
Linus Torvalds	d02aacff44	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (22 commits) tun: Multicast handling in tun_chr_ioctl() needs proper locking. [NET]: Fix heavy stack usage in seq_file output routines. [AF_UNIX] Initialise UNIX sockets before general device initcalls [RTNETLINK]: Fix bogus ASSERT_RTNL warning iwlwifi: Fix built-in compilation of iwlcore (part 2) tun: Fix minor race in TUNSETLINK ioctl handling. ppp_generic: use stats from net_device structure iwlwifi: Don't unlock priv->mutex if it isn't locked wireless: rndis_wlan: modparam_workaround_interval is never below 0. prism54: prism54_get_encode() test below 0 on unsigned index mac80211: update mesh EID values b43: Workaround DMA quirks mac80211: fix use before check of Qdisc length net/mac80211/rx.c: fix off-by-one mac80211: Fix race between ieee80211_rx_bss_put and lookup routines. ath5k: Fix radio identification on AR5424/2424 ssb: Fix all-ones boardflags b43: Add more btcoexist workarounds b43: Fix HostFlags data types b43: Workaround invalid bluetooth settings ...	2008-04-24 08:40:34 -07:00
Patrick McHardy	c9c1014b2b	[RTNETLINK]: Fix bogus ASSERT_RTNL warning ASSERT_RTNL uses mutex_trylock to test whether the rtnl_mutex is held. This bogus warnings when running in atomic context, which f.e. happens when adding secondary unicast addresses through macvlan or vlan or when synchronizing multicast addresses from wireless devices. Mid-term we might want to consider moving all address updates to process context since the locking seems overly complicated, for now just fix the bogus warning by changing ASSERT_RTNL to use mutex_is_locked(). Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-23 22:10:48 -07:00
Linus Torvalds	b0d19a378a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: iwlwifi: Fix built-in compilation of iwlcore net: Unexport move_addr_to_{kernel,user} rt2x00: Select LEDS_CLASS. iwlwifi: Select LEDS_CLASS. leds: Do not guard NEW_LEDS with HAS_IOMEM [IPSEC]: Fix catch-22 with algorithm IDs above 31 time: Export set_normalized_timespec. tcp: Make use of before macro in tcp_input.c hamradio: Remove unneeded and deprecated cli()/sti() calls in dmascc.c [NETNS]: Remove empty ->init callback. [DCCP]: Convert do_gettimeofday() to getnstimeofday(). [NETNS]: Don't initialize err variable twice. [NETNS]: The ip6_fib_timer can work with garbage on net namespace stop. [IPV4]: Convert do_gettimeofday() to getnstimeofday(). [IPV4]: Make icmp_sk_init() static. [IPV6]: Make struct ip6_prohibit_entry_template static. tcp: Trivial fix to correct function name in a comment in net/ipv4/tcp.c [NET]: Expose netdevice dev_id through sysfs skbuff: fix missing kernel-doc notation [ROSE]: Fix soft lockup wrt. rose_node_list_lock	2008-04-23 12:23:45 -07:00
Linus Torvalds	8a32272688	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6: [SPARC]: Remove SunOS and Solaris binary support.	2008-04-21 17:20:53 -07:00
Linus Torvalds	e9b62693ae	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/juhl/trivial * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/juhl/trivial: (24 commits) DOC: A couple corrections and clarifications in USB doc. Generate a slightly more informative error msg for bad HZ fix typo "is" -> "if" in Makefile ext*: spelling fix prefered -> preferred DOCUMENTATION: Use newer DEFINE_SPINLOCK macro in docs. KEYS: Fix the comment to match the file name in rxrpc-type.h. RAID: remove trailing space from printk line DMA engine: typo fixes Remove unused MAX_NODES_SHIFT MAINTAINERS: Clarify access to OCFS2 development mailing list. V4L: Storage class should be before const qualifier (sn9c102) V4L: Storage class should be before const qualifier sonypi: Storage class should be before const qualifier intel_menlow: Storage class should be before const qualifier DVB: Storage class should be before const qualifier arm: Storage class should be before const qualifier ALSA: Storage class should be before const qualifier acpi: Storage class should be before const qualifier firmware_sample_driver.c: fix coding style MAINTAINERS: Add ati_remote2 driver ... Fixed up trivial conflicts in firmware_sample_driver.c	2008-04-21 16:36:46 -07:00
Rusty Russell	5309fbcc47	Remove documentation of non-existent sk_alloc arg As you can see, there's no zero_it arg (in fact code always uses __GFP_ZERO). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>	2008-04-21 22:17:12 +00:00
David S. Miller	ec98c6b9b4	[SPARC]: Remove SunOS and Solaris binary support. As per Documentation/feature-removal-schedule.txt Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-21 15:10:15 -07:00
David Woodhouse	9d29672c64	[NET]: Expose netdevice dev_id through sysfs Expose dev_id to userspace, because it helps to disambiguate between interfaces where the MAC address is unique. This should allow us to simplify the handling of persistent naming for S390 network devices in udev -- because it can depend on a simple attribute of the device like the other match criteria, rather than having a special case for SUBSYSTEMS=="ccwgroup". Signed-off-by: David Woodhouse <dwmw2@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-20 16:07:43 -07:00
Matthew Wilcox	5f090dcb4d	net: Remove unnecessary inclusions of asm/semaphore.h None of these files use any of the functionality promised by asm/semaphore.h. It's possible that they rely on it dragging in some unrelated header file, but I can't build all these files, so we'll have fix any build failures as they come up. Signed-off-by: Matthew Wilcox <willy@linux.intel.com>	2008-04-18 22:15:50 -04:00
Alexey Dobriyan	d1643d24c6	[NET]: Fix and allocate less memory for ->priv'less netdevices This patch effectively reverts commit `d0498d9ae1` aka "[NET]: Do not allocate unneeded memory for dev->priv alignment." It was found to be buggy because of final unconditional += NETDEV_ALIGN_CONST removal. For example, for sizeof(struct net_device) being 2048 bytes, "alloc_size" was also 2048 bytes, but allocator with debugging options turned on started giving out !32-byte aligned memory resulting in redzones overwrites. Patch does small optimization in ->priv'less case: bumping size to next 32-byte boundary was always done to ensure ->priv will also be aligned. But, no ->priv, no need to do that. Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-18 15:43:32 -07:00
Pavel Emelyanov	d0498d9ae1	[NET]: Do not allocate unneeded memory for dev->priv alignment. The alloc_netdev_mq() tries to produce 32-bytes alignment for both the net_device itself and its private data. The second alignment is achieved by adding the NETDEV_ALIGN_CONST to the whole size of the memory to be allocated. However, for those devices that do not need the private area, this addition just makes the net_device weight 1024 + 32 = 1068 bytes, i.e. consume twice as much memory. Since loopback device is such (sizeof_priv == 0 for it), and each net namespace creates one, this can save a noticeable amount of memory for kernel with net namespaces turned on. After this set the lo device is actually allocated from a size-1024 kmem cache on i386 box even with NETPOLL and WIRELESS_EXT turned on. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-16 02:17:42 -07:00
Denis V. Lunev	f3005d7f4a	[NETNS]: Add netns refcnt debug for network devices. dev_set_net is called for - just allocated devices - devices moving from one namespace to another release_net has proper check inside to distinguish these cases. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-16 02:02:18 -07:00
Denis V. Lunev	3661a91083	[NETNS]: Add netns refcnt debug to fib rules. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-16 02:01:56 -07:00
Denis V. Lunev	65a18ec58e	[NETNS]: Add netns refcnt debug for kernel sockets. Protocol control sockets and netlink kernel sockets should not prevent the namespace stop request. They are initialized and disposed in a special way by sk_change_net/sk_release_kernel. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-16 01:59:46 -07:00
Denis V. Lunev	5d1e4468a7	[NETNS]: Make netns refconting debug like a socket one. Make release_net/hold_net noop for performance-hungry people. This is a debug staff and should be used in the debug mode only. Add check for net != NULL in hold/release calls. This will be required later on. [ Added minor simplifications suggested by Brian Haley. -DaveM ] Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-16 01:58:04 -07:00
Pavel Emelyanov	669f87baab	[RTNL]: Introduce the rtnl_kill_links helper. This one is responsible for calling ->dellink on each net device found in net to help with vlan net_exit hook in the nearest future. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-16 00:46:52 -07:00
Pavel Emelyanov	3a931a80cb	[RTNL]: Relax for_each_netdev_safe in __rtnl_link_unregister. Each potential list_del (happening from inside a ->dellink call) is followed by goto restart, so there's no need in _safe iteration. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-16 00:45:56 -07:00
Mandeep Singh Baines	b131dd5d65	[ETHTOOL]: Add support for large eeproms Currently, it is not possible to read/write to an eeprom larger than 128k in size because the buffer used for temporarily storing the eeprom contents is allocated using kmalloc. kmalloc can only allocate a maximum of 128k depending on architecture. Modified ethtool_get/set_eeprom to only allocate a page of memory and then copy the eeprom a page at a time. Updated original patch as per suggestions from Joe Perches. Signed-off-by: Mandeep Singh Baines <msb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-15 19:29:17 -07:00
Pavel Emelyanov	dec827d174	[NETNS]: The generic per-net pointers. Add the elastic array of void * pointer to the struct net. The access rules are simple: 1. register the ops with register_pernet_gen_device to get the id of your private pointer 2. call net_assign_generic() to put the private data on the struct net (most preferably this should be done in the ->init callback of the ops registered) 3. do not store any private reference on the net_generic array; 4. do not change this pointer while the net is alive; 5. use the net_generic() to get the pointer. When adding a new pointer, I copy the old array, replace it with a new one and schedule the old for kfree after an RCU grace period. Since the net_generic explores the net->gen array inside rcu read section and once set the net->gen->ptr[x] pointer never changes, this grants us a safe access to generic pointers. Quoting Paul: "... RCU is protecting -only- the net_generic structure that net_generic() is traversing, and the [pointer] returned by net_generic() is protected by a reference counter in the upper-level struct net." Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-15 00:36:08 -07:00
Pavel Emelyanov	c93cf61fd1	[NETNS]: The net-subsys IDs generator. To make some per-net generic pointers, we need some way to address them, i.e. - IDs. This is simple IDA-based IDs generator for pernet subsystems. Addressing questions about potential checkpoint/restart problems: these IDs are "lite-offsets" within the net structure and are by no means supposed to be exported to the userspace. Since it will be used in the nearest future by devices only (tun, vlan, tunnels, bridge, etc), I make it resemble the functionality of register_pernet_device(). The new ids is stored in the *id pointer _before_ calling the init callback to make this id available in this callback. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-15 00:35:23 -07:00
David S. Miller	df39e8ba56	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/ehea/ehea_main.c drivers/net/wireless/iwlwifi/Kconfig drivers/net/wireless/rt2x00/rt61pci.c net/ipv4/inet_timewait_sock.c net/ipv6/raw.c net/mac80211/ieee80211_sta.c	2008-04-14 02:30:23 -07:00
Gerrit Renker	7de6c03336	[SKB]: __skb_append = __skb_queue_after This expresses __skb_append in terms of __skb_queue_after, exploiting that __skb_append(old, new, list) = __skb_queue_after(list, old, new). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-14 00:05:09 -07:00
Ben Hutchings	4c821d753d	[NET]: Fix kernel-doc for skb_segment The kernel-doc comment for skb_segment is clearly wrong. This states what it actually does. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-13 21:52:48 -07:00
Eric Dumazet	f37f0afb29	[SOCK] sk_stamp: should be initialized to ktime_set(-1L, 0) Problem spotted by Andrew Brampton Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-13 21:39:26 -07:00
Patrick McHardy	4738c1db15	[SKFILTER]: Add SKF_ADF_NLATTR instruction SKF_ADF_NLATTR searches for a netlink attribute, which avoids manually parsing and walking attributes. It takes the offset at which to start searching in the 'A' register and the attribute type in the 'X' register and returns the offset in the 'A' register. When the attribute is not found it returns zero. A top-level attribute can be located using a filter like this (example for nfnetlink, using struct nfgenmsg): ... { /* A = offset of first attribute / .code = BPF_LD \| BPF_IMM, .k = sizeof(struct nlmsghdr) + sizeof(struct nfgenmsg) }, { / X = CTA_PROTOINFO / .code = BPF_LDX \| BPF_IMM, .k = CTA_PROTOINFO, }, { / A = netlink attribute offset / .code = BPF_LD \| BPF_B \| BPF_ABS, .k = SKF_AD_OFF + SKF_AD_NLATTR }, { / Exit if not found / .code = BPF_JMP \| BPF_JEQ \| BPF_K, .k = 0, .jt = <error> }, ... A nested attribute below the CTA_PROTOINFO attribute would then be parsed like this: ... { / A += sizeof(struct nlattr) / .code = BPF_ALU \| BPF_ADD \| BPF_K, .k = sizeof(struct nlattr), }, { / X = CTA_PROTOINFO_TCP / .code = BPF_LDX \| BPF_IMM, .k = CTA_PROTOINFO_TCP, }, { / A = netlink attribute offset / .code = BPF_LD \| BPF_B \| BPF_ABS, .k = SKF_AD_OFF + SKF_AD_NLATTR }, ... The data of an attribute can be loaded into 'A' like this: ... { / X = A (attribute offset) / .code = BPF_MISC \| BPF_TAX, }, { / A = skb->data[X + k] */ .code = BPF_LD \| BPF_B \| BPF_IND, .k = sizeof(struct nlattr), }, ... Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-10 02:02:28 -07:00
Stephen Hemminger	43db6d65e0	socket: sk_filter deinline The sk_filter function is too big to be inlined. This saves 2296 bytes of text on allyesconfig. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-10 01:43:09 -07:00
Stephen Hemminger	b715631fad	socket: sk_filter minor cleanups Some minor style cleanups: * Move __KERNEL__ definitions to one place in filter.h * Use const for sk_filter_len * Line wrapping * Put EXPORT_SYMBOL next to function definition Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-10 01:33:47 -07:00
David S. Miller	3bb5da3837	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6	2008-04-03 14:33:42 -07:00
Pavel Emelyanov	70ee115942	[SOCK][NETNS]: Add the percpu prot_inuse counter in the struct net. Such an accounting would cost us two more dereferences to get the percpu variable from the struct net, so I make sock_prot_inuse_get and _add calls work differently depending on CONFIG_NET_NS - without it old optimized routines are used. The per-cpu counter for init_net is prepared in core_initcall, so that even af_inet, that starts as fs_initcall, will already have the init_net prepared. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-03-31 19:42:16 -07:00
Pavel Emelyanov	c29a0bc4df	[SOCK][NETNS]: Add a struct net argument to sock_prot_inuse_add and _get. This counter is about to become per-proto-and-per-net, so we'll need two arguments to determine which cell in this "table" to work with. All the places, but proc already pass proper net to it - proc will be tuned a bit later. Some indentation with spaces in proc files is done to keep the file coding style consistent. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-03-31 19:41:46 -07:00
Pavel Emelyanov	8efa6e93cb	[NETNS]: Introduce a netns_core structure. There's already some stuff on the struct net, that should better be folded into netns_core structure. I'm making the per-proto inuse counter be per-net also, which is also a candidate for this, so introduce this structure and populate it a bit. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-03-31 19:41:14 -07:00
David S. Miller	a0f55e0e83	[NET]: Fix dev_alloc_skb() typo. Noticed by Joe Perches. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-03-28 18:22:32 -07:00
Pavel Emelyanov	60e7663d46	[SOCK]: Drop per-proto inuse init and fre functions (v2). Constructive part of the set is finished here. We have to remove the pcounter, so start with its init and free functions. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-03-28 16:39:10 -07:00
Pavel Emelyanov	1338d466d9	[SOCK]: Introduce a percpu inuse counters array (v2). And redirect sock_prot_inuse_add and _get to use one. As far as the dereferences are concerned. Before the patch we made 1 dereference to proto->inuse.add call, the call itself and then called the __get_cpu_var() on a static variable. After the patch we make a direct call, then one dereference to proto->inuse_idx and then the same __get_cpu_var() on a still static variable. So this patch doesn't seem to produce performance penalty on SMP. This is not per-net yet, but I will deliberately make NET_NS=y case separated from NET_NS=n one, since it'll cost us one-or-two more dereferences to get the struct net and the inuse counter. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-03-28 16:38:43 -07:00
Pavel Emelyanov	13ff3d6fa4	[SOCK]: Enumerate struct proto-s to facilitate percpu inuse accounting (v2). The inuse counters are going to become a per-cpu array. Introduce an index for this array on the struct proto. To handle the case of proto register-unregister-register loop the bitmap is used. All its bits manipulations are protected with proto_list_lock and a sanity check for the bitmap being exhausted is also added. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-03-28 16:38:17 -07:00
Denys Vlasenko	1483b8744e	[NET]: Add inline intent commentary to dev_alloc_skb(). Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-03-28 15:57:39 -07:00
YOSHIFUJI Hideaki	be01d655d9	[NET] NEIGHBOUR: Extract hash/lookup functions for pneigh entries. Extract hash function for pneigh entries from pneigh_lookup(), __pneigh_lookup() and pneigh_delete() as pneigh_hash(). Extract core of pneigh_lookup() and __pneigh_lookup() as __pneigh_lookup_1(). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2008-03-28 13:43:16 +09:00
YOSHIFUJI Hideaki	0a204500f9	[NET] NEIGHBOUR: Make each EXPORT_SYMBOL{,_GPL}() immediately follow its function/variable. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2008-03-28 13:42:45 +09:00
David S. Miller	8e8e43843b	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/usb/rndis_host.c drivers/net/wireless/b43/dma.c net/ipv6/ndisc.c	2008-03-27 18:48:56 -07:00
Ilpo Järvinen	419ae74ecc	[NET]: uninline skb_trim, de-bloats Allyesconfig (v2.6.24-mm1): -10976 209 funcs, 123 +, 11099 -, diff: -10976 --- skb_trim Without number of debug related CONFIGs (v2.6.25-rc2-mm1): -7360 192 funcs, 131 +, 7491 -, diff: -7360 --- skb_trim skb_trim \| +42 Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-03-27 17:54:01 -07:00
Ilpo Järvinen	8d3308687f	[NET]: uninline dst_release Codiff stats (allyesconfig, v2.6.24-mm1): -16420 187 funcs, 103 +, 16523 -, diff: -16420 --- dst_release Without number of debug related CONFIGs (v2.6.25-rc2-mm1): -7257 186 funcs, 70 +, 7327 -, diff: -7257 --- dst_release dst_release \| +40 Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-03-27 17:53:31 -07:00
Ilpo Järvinen	c2aa270ad7	[NET]: uninline skb_push, de-bloats a lot Allyesconfig (v2.6.24-mm1): -21593 356 funcs, 2418 +, 24011 -, diff: -21593 --- skb_push Without many debug related CONFIGs (v2.6.25-rc2-mm1): -13890 341 funcs, 189 +, 14079 -, diff: -13890 --- skb_push skb_push \| +46 Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-03-27 17:52:40 -07:00
Ilpo Järvinen	f58518e678	[NET]: uninline dev_alloc_skb, de-bloats a lot Allyesconfig (v2.6.24-mm1): -23668 392 funcs, 104 +, 23772 -, diff: -23668 --- dev_alloc_skb Without many debug CONFIGs (v2.6.25-rc2-mm1): -12178 382 funcs, 157 +, 12335 -, diff: -12178 --- dev_alloc_skb dev_alloc_skb \| +37 Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-03-27 17:51:31 -07:00
Ilpo Järvinen	6be8ac2fdc	[NET]: uninline skb_pull, de-bloats a lot Allyesconfig (v2.6.24-mm1): -28162 354 funcs, 3005 +, 31167 -, diff: -28162 --- skb_pull Without number of debug related CONFIGs (v2.6.25-rc2-mm1): -9697 338 funcs, 221 +, 9918 -, diff: -9697 --- skb_pull skb_pull \| +44 Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-03-27 17:47:24 -07:00
Ilpo Järvinen	0dde3e1648	[NET]: uninline skb_put, de-bloats a lot Allyesconfig (v2.6.24-mm1): ~500 files changed ... 869 funcs, 198 +, 111003 -, diff: -110805 --- skb_put skb_put \| +104 Without number of debug related CONFIGs (v2.6.25-rc2-mm1): -60744 855 funcs, 861 +, 61605 -, diff: -60744 --- skb_put skb_put \| +57 Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-03-27 17:43:41 -07:00
Linus Torvalds	ee20a0dd54	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (43 commits) [IPSEC]: Fix BEET output [ICMP]: Dst entry leak in icmp_send host re-lookup code (v2). [AX25]: Remove obsolete references to BKL from TODO file. [NET]: Fix multicast device ioctl checks [IRDA]: Store irnet_socket termios properly. [UML]: uml-net: don't set IFF_ALLMULTI in set_multicast_list [VLAN]: Don't copy ALLMULTI/PROMISC flags from underlying device netxen, phy/marvell, skge: minor checkpatch fixes S2io: Handle TX completions on the same CPU as the sender for MIS-X interrupts b44: Truncate PHY address skge napi->poll() locking bug rndis_host: fix oops when query for OID_GEN_PHYSICAL_MEDIUM fails cxgb3: Fix lockdep problems with sge.reg_lock ehea: Fix IPv6 support dm9000: Support promisc and all-multi modes dm9601: configure MAC to drop invalid (crc/length) packets dm9601: add Hirose USB-100 device ID Marvell PHY m88e1111 driver fix netxen: fix rx dropped stats netxen: remove low level tx lock ...	2008-03-26 18:35:50 -07:00
Patrick McHardy	61ee6bd487	[NET]: Fix multicast device ioctl checks SIOCADDMULTI/SIOCDELMULTI check whether the driver has a set_multicast_list method to determine whether it supports multicast. Drivers implementing secondary unicast support use set_rx_mode however. Check for both dev->set_multicast_mode and dev->set_rx_mode to determine multicast capabilities. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-03-26 02:12:11 -07:00
YOSHIFUJI Hideaki	878628fbf2	[NET] NETNS: Omit namespace comparision without CONFIG_NET_NS. Introduce an inline net_eq() to compare two namespaces. Without CONFIG_NET_NS, since no namespace other than &init_net exists, it is always 1. We do not need to convert 1) inline vs inline and 2) inline vs &init_net comparisons. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2008-03-26 04:40:00 +09:00
YOSHIFUJI Hideaki	57da52c1e6	[NET] NETNS: Omit neigh_parms->net and pneigh_entry->net without CONFIG_NET_NS. Introduce neigh_parms/pneigh_entry inlines: neigh_parms_net(), pneigh_net(). Without CONFIG_NET_NS, no namespace other than &init_net exists. Let's explicitly define them to help compiler optimizations. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2008-03-26 04:39:58 +09:00
YOSHIFUJI Hideaki	1218854afa	[NET] NETNS: Omit seq_net_private->net without CONFIG_NET_NS. Without CONFIG_NET_NS, no namespace other than &init_net exists, no need to store net in seq_net_private. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2008-03-26 04:39:56 +09:00
YOSHIFUJI Hideaki	3b1e0a655f	[NET] NETNS: Omit sock->sk_net without CONFIG_NET_NS. Introduce per-sock inlines: sock_net(), sock_net_set() and per-inet_timewait_sock inlines: twsk_net(), twsk_net_set(). Without CONFIG_NET_NS, no namespace other than &init_net exists. Let's explicitly define them to help compiler optimizations. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2008-03-26 04:39:55 +09:00
YOSHIFUJI Hideaki	c346dca108	[NET] NETNS: Omit net_device->nd_net without CONFIG_NET_NS. Introduce per-net_device inlines: dev_net(), dev_net_set(). Without CONFIG_NET_NS, no namespace other than &init_net exists. Let's explicitly define them to help compiler optimizations. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2008-03-26 04:39:53 +09:00
Pavel Emelyanov	2feb27dbe0	[NETNS]: Minor information leak via /proc/net/ptype file. This file displays the registered packet types, but some of them (packet sockets creates such) can be bound to a net device and showing them in a wrong namespace is not correct. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-03-24 14:57:45 -07:00
Pavel Emelyanov	fa86d322d8	[NEIGH]: Fix race between pneigh deletion and ipv6's ndisc_recv_ns (v3). Proxy neighbors do not have any reference counting, so any caller of pneigh_lookup (unless it's a netlink triggered add/del routine) should _not_ perform any actions on the found proxy entry. There's one exception from this rule - the ipv6's ndisc_recv_ns() uses found entry to check the flags for NTF_ROUTER. This creates a race between the ndisc and pneigh_delete - after the pneigh is returned to the caller, the nd_tbl.lock is dropped and the deleting procedure may proceed. One of the fixes would be to add a reference counting, but this problem exists for ndisc only. Besides such a patch would be too big for -rc4. So I propose to introduce a __pneigh_lookup() which is supposed to be called with the lock held and use it in ndisc code to check the flags on alive pneigh entry. Changes from v2: As David noticed, Exported the __pneigh_lookup() to ipv6 module. The checkpatch generates a warning on it, since the EXPORT_SYMBOL does not follow the symbol itself, but in this file all the exports come at the end, so I decided no to break this harmony. Changes from v1: Fixed comments from YOSHIFUJI - indentation of prototype in header and the pndisc_check_router() name - and a compilation fix, pointed by Daniel - the is_routed was (falsely) considered as uninitialized by gcc. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-03-24 14:48:59 -07:00
Linus Torvalds	7d3628b230	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (46 commits) [NET] ifb: set separate lockdep classes for queue locks [IPV6] KCONFIG: Fix description about IPV6_TUNNEL. [TCP]: Fix shrinking windows with window scaling netpoll: zap_completion_queue: adjust skb->users counter bridge: use time_before() in br_fdb_cleanup() [TG3]: Fix build warning on sparc32. MAINTAINERS: bluez-devel is subscribers-only audit: netlink socket can be auto-bound to pid other than current->pid (v2) [NET]: Fix permissions of /proc/net [SCTP]: Fix a race between module load and protosw access [NETFILTER]: ipt_recent: sanity check hit count [NETFILTER]: nf_conntrack_h323: logical-bitwise & confusion in process_setup() [RT2X00] drivers/net/wireless/rt2x00/rt2x00dev.c: remove dead code, fix warning [IPV4]: esp_output() misannotations [8021Q]: vlan_dev misannotations xfrm: ->eth_proto is __be16 [IPV4]: ipv4_is_lbcast() misannotations [SUNRPC]: net/* NULL noise [SCTP]: fix misannotated __sctp_rcv_asconf_lookup() [PKT_SCHED]: annotate cls_u32 ...	2008-03-21 07:57:45 -07:00
Peter P Waskiewicz Jr	82cc1a7a56	[NET]: Add per-connection option to set max TSO frame size Update: My mailer ate one of Jarek's feedback mails... Fixed the parameter in netif_set_gso_max_size() to be u32, not u16. Fixed the whitespace issue due to a patch import botch. Changed the types from u32 to unsigned int to be more consistent with other variables in the area. Also brought the patch up to the latest net-2.6.26 tree. Update: Made gso_max_size container 32 bits, not 16. Moved the location of gso_max_size within netdev to be less hotpath. Made more consistent names between the sock and netdev layers, and added a define for the max GSO size. Update: Respun for net-2.6.26 tree. Update: changed max_gso_frame_size and sk_gso_max_size from signed to unsigned - thanks Stephen! This patch adds the ability for device drivers to control the size of the TSO frames being sent to them, per TCP connection. By setting the netdevice's gso_max_size value, the socket layer will set the GSO frame size based on that value. This will propogate into the TCP layer, and send TSO's of that size to the hardware. This can be desirable to help tune the bursty nature of TSO on a per-adapter basis, where one may have 1 GbE and 10 GbE devices coexisting in a system, one running multiqueue and the other not, etc. This can also be desirable for devices that cannot support full 64 KB TSO's, but still want to benefit from some level of segmentation offloading. Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-03-21 03:43:19 -07:00
David S. Miller	a25606c845	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6	2008-03-21 03:42:24 -07:00
Jarek Poplawski	8a455b087c	netpoll: zap_completion_queue: adjust skb->users counter zap_completion_queue() retrieves skbs from completion_queue where they have zero skb->users counter. Before dev_kfree_skb_any() it should be non-zero yet, so it's increased now. Reported-and-tested-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-03-20 16:07:27 -07:00
Ingo Molnar	6f3d09291b	sched, net: socket wakeups are sync 'sync' wakeups are a hint towards the scheduler that (certain) networking related wakeups likely create coupling between tasks. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-03-19 04:27:53 +01:00
Harvey Harrison	0dc47877a3	net: replace remaining __FUNCTION__ occurrences __FUNCTION__ is gcc-specific, use __func__ Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-03-05 20:47:47 -08:00
David S. Miller	255333c1db	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: net/mac80211/rc80211_pid_algo.c	2008-03-05 12:26:41 -08:00
David S. Miller	d9452e9f81	[NETPOLL]: Revert two bogus cleanups that broke netconsole. Based upon a report by Andrew Morton and code analysis done by Jarek Poplawski. This reverts `33f807ba0d` ("[NETPOLL]: Kill NETPOLL_RX_DROP, set but never tested.") and `c7b6ea24b4` ("[NETPOLL]: Don't need rx_flags."). The rx_flags did get tested for zero vs. non-zero and therefore we do need those tests and that code which sets NETPOLL_RX_DROP et al. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-03-04 12:28:49 -08:00
Pavel Emelyanov	988b705077	[ARP]: Introduce the arp_hdr_len helper. There are some place, that calculate the ARP header length. These calculations are correct, but a) some operate with "magic" constants, b) enlarge the code length (sometimes at the cost of coding style), c) are not informative from the first glance. The proposal is to introduce a helper, that includes all the good sides of these calculations. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-03-03 12:20:57 -08:00
Frank Blaschka	7e36763b2c	[NET]: Fix race in generic address resolution. neigh_update sends skb from neigh->arp_queue while neigh_timer_handler has increased skbs refcount and calls solicit with the skb. neigh_timer_handler should not increase skbs refcount but make a copy of the skb and do solicit with the copy. Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-03-03 12:16:04 -08:00
Pavel Emelyanov	95a363582b	[NET]: Use existing device list walker for /proc/dev_mcast. The seq_file_operations' dev_mc_seq_xxx callbacks do the same thing as the dev_seq_xxx ones do, but skip the SEQ_START_TOKEN. So use the existing exported dev_seq_xxx calls and handle the SEQ_START_TOKEN in the dev_mc_seq_show(). Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-02-29 11:44:14 -08:00
David S. Miller	45af1754bc	[NET]: sk_release_kernel needs to be exported to modules Fixes: ERROR: "sk_release_kernel" [net/ipv6/ipv6.ko] undefined! Signed-off-by: David S. Miller <davem@davemloft.net>	2008-02-29 11:33:19 -08:00
Denis V. Lunev	edf0208702	[NET]: Make netlink_kernel_release publically available as sk_release_kernel. This staff will be needed for non-netlink kernel sockets, which should also not pin a namespace like tcp_socket and icmp_socket. Signed-off-by: Denis V. Lunev <den@openvz.org> Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-02-29 11:18:32 -08:00
Denis V. Lunev	9de8f76d20	[NETNS]: DST cleanup routines should be called inside namespace. Device inside the namespace can be started and downed. So, active routing cache should be cleaned up on device stop. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-02-28 20:49:44 -08:00
Denis V. Lunev	0c65babd6c	[NETNS]: Default arp parameters lookup. Default ARP parameters should be findable regardless of the context. Required to make inetdev_event working. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-02-28 20:48:25 -08:00
Denis V. Lunev	4ab438fcd7	[NETNS]: Register neighbour table parameters in the correct namespace. neigh_sysctl_register should register sysctl entries inside correct namespace to avoid naming conflict. Typical example is a loopback. Entries for it present in all namespaces. Required to make inetdev_event working. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-02-28 20:48:01 -08:00
Wang Chen	25296d599c	[PKTGEN]: Use proc_create() to setup ->proc_fops first Use proc_create() to make sure that ->proc_fops be setup before gluing PDE to main tree. Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-02-28 14:11:49 -08:00
Wang Chen	46ecf0b994	[NEIGHBOUR]: Use proc_create() to setup ->proc_fops first Use proc_create() to make sure that ->proc_fops be setup before gluing PDE to main tree. Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-02-28 14:10:51 -08:00
Linus Torvalds	bdc0894289	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (37 commits) [NETFILTER]: fix ebtable targets return [IP_TUNNEL]: Don't limit the number of tunnels with generic name explicitly. [NET]: Restore sanity wrt. print_mac(). [NEIGH]: Fix race between neighbor lookup and table's hash_rnd update. [RTNL]: Validate hardware and broadcast address attribute for RTM_NEWLINK tg3: ethtool phys_id default [BNX2]: Update version to 1.7.4. [BNX2]: Disable parallel detect on an HP blade. [BNX2]: More 5706S link down workaround. ssb: Fix support for PCI devices behind a SSB->PCI bridge zd1211rw: fix sparse warnings rtl818x: fix sparse warnings ssb: Fix pcicore cardbus mode ssb: Make the GPIO API reentrancy safe ssb: Fix the GPIO API ssb: Fix watchdog access for devices without a chipcommon ssb: Fix serial console on new bcm47xx devices ath5k: Fix build warnings on some 64-bit platforms. WDEV, ath5k, don't return int from bool function WDEV: ath5k, fix lock imbalance ...	2008-02-23 21:07:10 -08:00
Pavel Emelyanov	bc4bf5f38c	[NEIGH]: Fix race between neighbor lookup and table's hash_rnd update. The neigh_hash_grow() may update the tbl->hash_rnd value, which is used in all tbl->hash callbacks to calculate the hashval. Two lookup routines may race with this, since they call the ->hash callback without the tbl->lock held. Since the hash_rnd is changed with this lock write-locked moving the calls to ->hash under this lock read-locked closes this gap. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-02-23 19:57:02 -08:00
Thomas Graf	1840bb13c2	[RTNL]: Validate hardware and broadcast address attribute for RTM_NEWLINK RTM_NEWLINK allows for already existing links to be modified. For this purpose do_setlink() is called which expects address attributes with a payload length of at least dev->addr_len. This patch adds the necessary validation for the RTM_NEWLINK case. The address length for links to be created is not checked for now as the actual attribute length is used when copying the address to the netdevice structure. It might make sense to report an error if less than addr_len bytes are provided but enforcing this might break drivers trying to be smart with not transmitting all zero addresses. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-02-23 19:54:36 -08:00
Denis V. Lunev	da12f7356d	[NETNS]: Namespace leak in pneigh_lookup. release_net is missed on the error path in pneigh_lookup. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-02-20 00:26:16 -08:00
Thomas Graf	76e87306c2	[RTNL]: Add missing link netlink attribute policy definitions IFLA_LINK is no longer a write-only attribute on the kernel side and must thus be validated. Same goes for the newly introduced IFLA_LINKINFO. Fixes undefined behaviour if either of the attributes are not well formed. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-02-19 16:12:08 -08:00
Jorge Boncompte [DTI2]	12aa343add	[NET]: Messed multicast lists after dev_mc_sync/unsync Commit `a0a400d79e` ("[NET]: dev_mcast: add multicast list synchronization helpers") from you introduced a new field "da_synced" to struct dev_addr_list that is not properly initialized to 0. So when any of the current users (8021q, macvlan, mac80211) calls dev_mc_sync/unsync they mess the address list for both devices. The attached patch fixed it for me and avoid future problems. Signed-off-by: Jorge Boncompte [DTI2] <jorge@dti2.net> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-02-19 14:17:04 -08:00
Linus Torvalds	07ce198a1e	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (60 commits) [NIU]: Bump driver version and release date. [NIU]: Fix BMAC alternate MAC address indexing. net: fix kernel-doc warnings in header files [IPV6]: Use BUG_ON instead of if + BUG in fib6_del_route. [IPV6]: dst_entry leak in ip4ip6_err. (resend) bluetooth: do not move child device other than rfcomm bluetooth: put hci dev after del conn [NET]: Elminate spurious print_mac() calls. [BLUETOOTH] hci_sysfs.c: Kill build warning. [NET]: Remove MAC_FMT net/8021q/vlan_dev.c: Use print_mac. [XFRM]: Fix ordering issue in xfrm_dst_hash_transfer(). [BLUETOOTH] net/bluetooth/hci_core.c: Use time_* macros [IPV6]: Fix hardcoded removing of old module code [NETLABEL]: Move some initialization code into __init section. [NETLABEL]: Shrink the genl-ops registration code. [AX25] ax25_out: check skb for NULL in ax25_kick() [TCP]: Fix tcp_v4_send_synack() comment [IPV4]: fix alignment of IP-Config output Documentation: fix tcp.txt ...	2008-02-19 07:52:45 -08:00
David S. Miller	9ff5660746	Revert "[NDISC]: Fix race in generic address resolution" This reverts commit `69cc64d8d9`. It causes recursive locking in IPV6 because unlike other neighbour layer clients, it even needs neighbour cache entries to send neighbour soliciation messages :-( We'll have to find another way to fix this race. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-02-17 18:39:54 -08:00
David S. Miller	93b2d4a208	Revert "[RTNETLINK]: Send a single notification on device state changes." This reverts commit `45b5035482`. It break locking around dev->link_mode as well as cause other bootup problems. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-02-17 18:35:07 -08:00
Linus Torvalds	f6866fecd6	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (82 commits) [NET]: Make sure sockets implement splice_read netconsole: avoid null pointer dereference at show_local_mac() [IPV6]: Fix reversed local_df test in ip6_fragment [XFRM]: Avoid bogus BUG() when throwing new policy away. [AF_KEY]: Fix bug in spdadd [NETFILTER] nf_conntrack_proto_tcp.c: Mistyped state corrected. net: xfrm statistics depend on INET [NETFILTER]: make secmark_tg_destroy() static [INET]: Unexport inet_listen_wlock [INET]: Unexport __inet_hash_connect [NET]: Improve cache line coherency of ingress qdisc [NET]: Fix race in dev_close(). (Bug 9750) [IPSEC]: Fix bogus usage of u64 on input sequence number [RTNETLINK]: Send a single notification on device state changes. [NETLABLE]: Hide netlbl_unlabel_audit_addr6 under ifdef CONFIG_IPV6. [NETLABEL]: Don't produce unused variables when IPv6 is off. [NETLABEL]: Compilation for CONFIG_AUDIT=n case. [GENETLINK]: Relax dances with genl_lock. [NETLABEL]: Fix lookup logic of netlbl_domhsh_search_def. [IPV6]: remove unused method declaration (net/ndisc.h). ...	2008-02-15 07:33:07 -08:00
Randy Dunlap	bc2cda1ebd	docbook: make a networking book and fix a few errors Move networking (core and drivers) docbook to its own networking book. Fix a few kernel-doc errors in header and source files. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Neil Brown <neilb@suse.de> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-13 16:21:19 -08:00
Harvey Harrison	b5606c2d44	remove final fastcall users fastcall always expands to empty, remove it. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-13 16:21:18 -08:00

1 2 3 4 5 ...

1116 Commits