9616 Commits

Author SHA1 Message Date
Tomas Winkler ebd74487d4 mac80211: fix warning: unused variable ifsta
This patch fixes warning unused variable ifsta
when compiling without CONFIG_MAC80211_VERBOSE_DEBUG

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-07-08 10:21:35 -04:00
Tomas Winkler d96a7bc049 mac80211: remove useless tid assignment for management and control frames
This patch removes useless tid assignment for management and control frames

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-07-08 10:21:34 -04:00
Ron Rindjunsky 429a380571 mac80211: add block ack request capability
This patch adds block ack request capability

Signed-off-by: Ester Kummer <ester.kummer@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-07-08 10:21:34 -04:00
Ivo van Doorn b2898a2780 mac80211: Don't request encryption for probe response
Probe responses shouldn't be encrypted, and mac80211 doesn't
set the crypto key accordingly. However, it didn't set the
IEEE80211_TX_CTL_DO_NOT_ENCRYPT flag, which means drivers
could attempt to encrypt the frame, causing a NULL
pointer dereference when accessing the provided hw_key field.

Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-07-08 10:21:34 -04:00
Patrick McHardy 9bb8582efb vlan: TCI related type and naming cleanups
The VLAN code contains multiple spots that use tag, id and tci as
identifiers for arguments and variables incorrectly and they actually
contain or are expected to contain something different. Additionally
types are used inconsistently (unsigned short vs u16) and identifiers
are sometimes capitalized.

- consistently use u16 for storing TCI, ID or QoS values
- consistently use vlan_id and vlan_tci for storing the respective values
- remove capitalization
- add kdoc comment to netif_hwaccel_{rx,receive_skb}

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-08 03:24:44 -07:00
Patrick McHardy 22d1ba74bb vlan: move struct vlan_dev_info to private header
Hide struct vlan_dev_info from drivers to prevent them from growing
more creative ways to use it. Provide accessors for the two drivers
that currently use it.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-08 03:23:57 -07:00
Patrick McHardy 7750f403cb vlan: uninline __vlan_hwaccel_rx
The function is huge and included at least once in every VLAN acceleration
capable driver. Uninline it; to avoid having drivers depend on the VLAN
module, the function is always built in statically when VLAN is enabled.

With all VLAN acceleration capable drivers that build on x86_64 enabled,
this results in:

   text    data     bss     dec     hex filename
6515227  854044  343968 7713239  75b1d7 vmlinux.inlined
6505637  854044  343968 7703649  758c61 vmlinux.uninlined
----------------------------------------------------------
  -9590

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-08 03:23:36 -07:00
Patrick McHardy 75b8846acd vlan: Add ethtool support
Add ethtool support for querying the device for offload settings.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-08 03:22:42 -07:00
Joonwoo Park 26a25239d7 vlan: Use is_vlan_dev()
Use simplified is_vlan_dev function.

Signed-off-by: Joonwoo Park <joonwpark81@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-08 03:22:16 -07:00
Patrick McHardy acc81e1465 vlan: fix network_header/mac_header adjustments
Lennert Buytenhek points out that the VLAN code incorrectly adjusts
skb->network_header to point in the middle of the VLAN header and
additionally tries to adjust skb->mac_header without checking for
validity.

The network_header should not be touched at all since we're only
adding headers in front of it, mac_header adjustments are not
necessary at all.

Based on patch by Lennert Buytenhek <buytenh@wantstofly.org>.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-08 03:21:27 -07:00
Julius Volz 07035fc1bb irda: Fix netlink error path return value
Fix an incorrect return value check of genlmsg_put() in irda_nl_get_mode().
genlmsg_put() does not use ERR_PTR() to encode return values, it just
returns NULL on error.

Signed-off-by: Julius Volz <juliusv@google.com>
Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-08 03:07:43 -07:00
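
Note on the irda fix above: a function that reports failure by returning NULL slips past an IS_ERR()-style check, because NULL is not in the error-pointer range. The following is a minimal userspace sketch, not kernel code. The names is_err(), put_header(), check_with_is_err() and check_with_null() are hypothetical stand-ins that only imitate the ERR_PTR convention, and the exact form of the original buggy check is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: imitate the convention that the top 4095 addresses encode errnos. */
#define MAX_ERRNO 4095

/* Userspace imitation of an IS_ERR()-style test. */
static int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical helper that, like genlmsg_put(), returns NULL on failure. */
void *put_header(int fail)
{
	static int header;

	return fail ? NULL : &header;
}

/* Buggy check: NULL is not an error pointer, so the failure goes unnoticed. */
int check_with_is_err(void *hdr)
{
	return is_err(hdr) ? -1 : 0;
}

/* Correct check for a function that returns NULL on error. */
int check_with_null(void *hdr)
{
	return hdr == NULL ? -1 : 0;
}
```

With a failing put_header(), check_with_is_err() reports success while check_with_null() reports the error, which is the mismatch the commit message describes.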
Denis V. Lunev 81c684d12d ipv4: remove flush_mutex from ipv4_sysctl_rtcache_flush
It is possible to avoid locking at all in ipv4_sysctl_rtcache_flush by
defining local ctl_table on the stack.

The patch is based on the suggestion from Eric W. Biederman.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-08 03:05:28 -07:00
Joonwoo Park 4ad3f26162 netfilter: fix string extension for case insensitive pattern matching
The flag XT_STRING_FLAG_IGNORECASE indicates case insensitive string
matching, so netfilter can easily find cmd.exe, Cmd.exe, cMd.exe, etc.

A new revision 1 was added; in the meantime, the invert field of
xt_string_info was moved into flags as a flag. If the revision is 1, the
flag XT_STRING_FLAG_INVERT indicates inverted matching.

Signed-off-by: Joonwoo Park <joonwpark81@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-08 02:38:56 -07:00
Patrick McHardy 58de7862e6 netfilter: ebt_nflog: fix Kconfig typo
The help text should refer to nflog instead of ulog. Noticed by
Krzysztof Halasa <khc@pm.waw.pl>.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-08 02:37:07 -07:00
Alexey Dobriyan 43de9dfeaa netfilter: ip6table_filter in netns for real
One still needs to remove checks in nf_hook_slow() and nf_sockopt_find()
to test this, though.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-08 02:36:18 -07:00
Pablo Neira Ayuso b891c5a831 netfilter: nf_conntrack: add allocation flag to nf_conntrack_alloc
ctnetlink does not need to allocate the conntrack entries with GFP_ATOMIC
as its code is executed in user context.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-08 02:35:55 -07:00
Russ Dill b11c16beb9 netfilter: Get rid of references to no longer existent Fast NAT.
Get rid of references to no longer existent Fast NAT.

IP_ROUTE_NAT support was removed in August of 2004, but references to Fast
NAT were left in a couple of config options.

Signed-off-by: Russ Dill <Russ.Dill@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-08 02:35:27 -07:00
Alexey Dobriyan d2789312cc netfilter: use correct namespace in ip6table_security
Signed-off-by: Alexey Dobriyan <adobriyan@parallels.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-08 02:34:52 -07:00
Vlad Yasevich 3888e9efc9 sctp: Mark the tsn as received after all allocations finish
If we don't have the buffer space or memory allocations fail,
the data chunk is dropped, but TSN is still reported as received.
This introduced a data loss that can't be recovered.  We should
only mark TSNs as received after memory allocations finish.
The one exception is the invalid stream identifier, but that's
due to user error and is reported back to the user.

This was noticed by Michael Tuexen.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-08 02:28:39 -07:00
Vladimir Koutny 6e43829bb6 mac80211: don't report selected IBSS when not found
Don't report a 'selected' IBSS in sta_find_ibss when none was found.

Signed-off-by: Vladimir Koutny <vlado@ksp.sk>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-07-07 15:31:40 -04:00
Ivo van Doorn ea0c925370 mac80211: Only flush workqueue when last interface was removed
Currently the ieee80211_hw->workqueue is flushed each time
an interface is being removed. However most scheduled work
is not interface specific but device specific, for example things like
periodic work for link tuners.

This patch will move the flush_workqueue() call to directly behind
the call to ops->stop() to make sure the workqueue is only flushed
when all interfaces are gone and there really shouldn't be any scheduled
work in the drivers left.

Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-07-07 15:31:39 -04:00
Guy Cohen 8db9369ff9 mac80211: move netif_carrier_on to after ieee80211_bss_info_change_notify
Putting netif_carrier_on before configuring the driver/device with the
new association state may cause a race (tx frames may be sent before
configuration is done)

Signed-off-by: Guy Cohen <guy.cohen@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-07-07 15:31:39 -04:00
Linus Torvalds b2798bf0ec Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  can: add sanity checks
  fs_enet: restore promiscuous and multicast settings in restart()
  ibm_newemac: Fixes entry of short packets
  ibm_newemac: Fixes kernel crashes when speed of cable connected changes
  pasemi_mac: Access iph->tot_len with correct endianness
  ehea: Access iph->tot_len with correct endianness
  ehea: fix race condition
  ehea: add MODULE_DEVICE_TABLE
  ehea: fix might sleep problem
  forcedeth: fix lockdep warning on ethtool -s
  Add missing skb->dev assignment in Frame Relay RX code
  bridge: fix use-after-free in br_cleanup_bridges()
  tcp: fix a size_t < 0 comparison in tcp_read_sock
  tcp: net/ipv4/tcp.c needs linux/scatterlist.h
  libertas: support USB persistence on suspend/resume (resend)
  iwlwifi: drop skb silently for Tx request in monitor mode
  iwlwifi: fix incorrect 5GHz rates reported in monitor mode
2008-07-07 09:24:28 -07:00
Patrick McHardy 4b5a698ef4 net: fix dev_set_promiscuity() breakage
Commit dad9b335 (netdevice: Fix promiscuity and allmulti overflow) broke
dev_set_promiscuity() by returning on success without reprogramming the
device.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-06 15:49:08 -07:00
Ingo Molnar 68083e05d7 Merge commit 'v2.6.26-rc9' into cpus4096
2008-07-06 14:23:39 +02:00
Patrick McHardy fb0305ce1b net-sched: consolidate default fifo qdisc setup
Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-05 23:40:21 -07:00
Oliver Hartkopp 7f2d38eb7a can: add sanity checks
Even though the CAN netlayer only deals with CAN netdevices, the 
netlayer interface to the userspace and to the device layer should 
perform some sanity checks.

This patch adds several sanity checks that mainly prevent userspace apps
from sending broken content into the system that may be misinterpreted by
some other userspace application.

Signed-off-by: Oliver Hartkopp <oliver.hartkopp@volkswagen.de>
Signed-off-by: Urs Thuermann <urs.thuermann@volkswagen.de>
Acked-by: Andre Naujoks <nautsch@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-05 23:38:43 -07:00
Patrick McHardy aee18a8cf2 net-sched: sch_htb: remove write-only qdisc filter_cnt
The filter_cnt is supposed to count filter references to a class.
Since the qdisc can't be the target of a filter, it doesn't need
a filter_cnt. In fact the counter is never decreased since cls_api
considers a return value of zero a failure and doesn't unbind again.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-05 23:23:27 -07:00
Patrick McHardy 4207759939 net-sched: sch_htb: remove child and sibling lists
Now that the qdisc isn't destroyed in hierarchical order anymore,
the only user of the child lists left is htb_parent_last_child().
This can be easily changed to use a counter of children to save
a few bytes.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-05 23:22:53 -07:00
Patrick McHardy f4c1f3e0c5 net-sched: sch_htb: use dynamic class hash helpers
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-05 23:22:35 -07:00
Patrick McHardy fbd8f1379a net-sched: sch_htb: move hash and sibling list removal to htb_delete
Hash list removal currently happens twice (once in htb_delete, once
in htb_destroy_class), which makes it harder to use the dynamically
sized class hash without adding special cases for HTB. The reason is
that qdisc destruction destroys classes in hierarchical order, which
is not necessary if filters are destroyed in a separate iteration
during qdisc destruction.

Adjust qdisc destruction to follow the same scheme as other hierarchical
qdiscs by first performing a filter destruction pass, then destroying
all classes in hash order.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-05 23:22:19 -07:00
Patrick McHardy d77fea2eb9 net-sched: sch_cbq: use dynamic class hash helpers
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-05 23:22:05 -07:00
Patrick McHardy be0d39d52c net-sched: sch_hfsc: use dynamic class hash helpers
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-05 23:21:47 -07:00
Patrick McHardy 6fe1c7a555 net-sched: add dynamically sized qdisc class hash helpers
Currently all qdiscs that allow creating classes use a fixed-size hash
table of size 16 to hash the classes. This causes a large bottleneck
when using thousands of classes and unbound filters.

Add helpers for dynamically sized class hashes to fix this. The following
patches will convert the qdiscs to use them.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-05 23:21:31 -07:00
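
Note on the class hash commit above: the idea of a dynamically sized hash can be sketched in userspace C as below. This is only an illustration under stated assumptions, not the kernel helpers themselves. All names (struct cls, struct cls_hash, cls_hash_init(), cls_hash_insert(), cls_hash_find()), the initial size of 4, the power-of-two masking and the "grow when more than 3/4 full" threshold are chosen for the example.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical class entry, chained per bucket. */
struct cls {
	uint32_t classid;
	struct cls *next;
};

/* Hypothetical hash: size is always a power of two. */
struct cls_hash {
	struct cls **buckets;
	unsigned int size;
	unsigned int elems;
};

static unsigned int bucket_of(uint32_t id, unsigned int size)
{
	return id & (size - 1);
}

int cls_hash_init(struct cls_hash *h)
{
	h->size = 4;
	h->elems = 0;
	h->buckets = calloc(h->size, sizeof(*h->buckets));
	return h->buckets ? 0 : -1;
}

/* Double the table and rehash once it is more than 3/4 full (assumed policy). */
static void cls_hash_grow(struct cls_hash *h)
{
	unsigned int nsize, i;
	struct cls **nb;

	if (h->elems * 4 <= h->size * 3)
		return;
	nsize = h->size * 2;
	nb = calloc(nsize, sizeof(*nb));
	if (!nb)
		return;		/* keep using the old, smaller table */
	for (i = 0; i < h->size; i++) {
		struct cls *c = h->buckets[i];

		while (c) {
			struct cls *next = c->next;
			unsigned int b = bucket_of(c->classid, nsize);

			c->next = nb[b];
			nb[b] = c;
			c = next;
		}
	}
	free(h->buckets);
	h->buckets = nb;
	h->size = nsize;
}

void cls_hash_insert(struct cls_hash *h, struct cls *c)
{
	unsigned int b = bucket_of(c->classid, h->size);

	c->next = h->buckets[b];
	h->buckets[b] = c;
	h->elems++;
	cls_hash_grow(h);
}

struct cls *cls_hash_find(const struct cls_hash *h, uint32_t id)
{
	struct cls *c;

	for (c = h->buckets[bucket_of(id, h->size)]; c; c = c->next)
		if (c->classid == id)
			return c;
	return NULL;
}
```

Because the table grows with the number of classes, bucket chains stay short even with thousands of entries, whereas a fixed table of 16 buckets would degrade to long linear scans.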
David S. Miller ea2aca084b Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	Documentation/feature-removal-schedule.txt
	drivers/net/wan/hdlc_fr.c
	drivers/net/wireless/iwlwifi/iwl-4965.c
	drivers/net/wireless/iwlwifi/iwl3945-base.c
2008-07-05 23:08:07 -07:00
David S. Miller f3032be921 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-next-2.6
2008-07-05 21:41:53 -07:00
Patrick McHardy 70c03b49b8 vlan: Add GVRP support
Add GVRP support for dynamically registering VLANs with switches.

By default GVRP is disabled because we only support the applicant-only
participant model, which means it should not be enabled on vlans that
are members of a bridge. Since there is currently no way to cleanly
determine that, the user is responsible for enabling it.

The code is pretty small and low impact; it's wrapped in a config
option though because it depends on the GARP implementation and
the STP core.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-05 21:26:57 -07:00
Patrick McHardy ce305002e1 vlan: Move device unregistration before lower dev cleanup
Move the unregister_netdevice() call for the VLAN device before cleanup
for the lower device. This is needed by GVRP so it can send a leave
message before the applicant on the lower device is cleaned up.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-05 21:26:41 -07:00
Patrick McHardy b3ce0325f2 vlan: Change vlan_dev_set_vlan_flag() to handle multiple flags at once
Change vlan_dev_set_vlan_flag() to handle multiple flags at once and
rename it to vlan_dev_change_flags(). This allows it to be used from the
netlink interface, which in turn allows the necessary adjustments
when changing flags to be handled centrally.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-05 21:26:27 -07:00
Patrick McHardy eca9ebac65 net: Add GARP applicant-only participant
Add an implementation of the GARP (Generic Attribute Registration Protocol)
applicant-only participant. This will be used by the following patch to
add GVRP support to the VLAN code.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-05 21:26:13 -07:00
Patrick McHardy 7c85fbf065 bridge: Use STP demux
Use the STP demux layer for receiving STP PDUs instead of directly
registering with LLC.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-05 21:25:56 -07:00
Patrick McHardy a19800d704 net: Add STP demux layer
Add small STP demux layer for demuxing STP PDUs based on MAC address.
This is needed to run both GARP and STP in parallel (or even load the
modules) since both use LLC_SAP_BSPAN.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-05 21:25:39 -07:00
Pavel Emelyanov ef28d1a20f MIB: add struct net to UDP6_INC_STATS_BH
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-05 21:19:40 -07:00
Pavel Emelyanov 235b9f7ac5 MIB: add struct net to UDP6_INC_STATS_USER
As simple as the patch #1 in this set.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-05 21:19:20 -07:00
Pavel Emelyanov 0283328e23 MIB: add struct net to UDP_INC_STATS_BH
Two special cases here - one is rxrpc - I put init_net there
explicitly, since we haven't touched this part yet. The second
place is in __udp4_lib_rcv - we already have a struct net there,
but I have to move its initialization above to make it ready
at the "drop" label.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-05 21:18:48 -07:00
Pavel Emelyanov 629ca23c33 MIB: add struct net to UDP_INC_STATS_USER
Nothing special - all the places already have a struct sock
at hands, so use the sock_net() net.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-05 21:18:07 -07:00
Denis V. Lunev 32cb5b4e03 netns: selective flush of rt_cache
The dst cache is marked as expired on a per-namespace basis by the previous
patch. Now we have to implement selective cache shrinking. This
procedure has been ported from an older OpenVZ codebase.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-05 19:06:12 -07:00
Denis V. Lunev e84f84f276 netns: place rt_genid into struct net
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-05 19:04:32 -07:00
Denis V. Lunev b00180defd ipv4: pass current value of rt_genid into rt_hash
Basically, it makes no difference whether rt_hash calls atomic_read
internally or receives the value as a parameter, since rt_hash is inline.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-05 19:04:09 -07:00
Denis V. Lunev 86c657f6b5 netns: add struct net parameter to rt_cache_invalidate
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-05 19:03:31 -07:00
Denis V. Lunev 9f5e97e536 netns: make rt_secret_rebuild timer per namespace
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-05 19:02:59 -07:00
Denis V. Lunev 39a23e7508 netns: register net.ipv4.route.flush in each namespace
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-05 19:02:33 -07:00
Denis V. Lunev 639e104fac ipv4: remove static flush_delay variable
flush delay is used as an external storage for net.ipv4.route.flush sysctl
entry. It is write-only.

The ctl_table->data for this entry is used once. Fix this case to point
to the stack to remove global variable. Do this to avoid additional
variable on struct net in the next patch.

Possible race (as it was before) accessing this local variable is removed
using flush_mutex.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-05 19:02:06 -07:00
Denis V. Lunev ae299fc051 net: add fib_rules_ops to flush_cache method
This is required to pass namespace context into rt_cache_flush called from
->flush_cache.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-05 19:01:28 -07:00
Denis V. Lunev 76e6ebfb40 netns: add namespace parameter to rt_cache_flush
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-05 19:00:44 -07:00
J. Bruce Fields e86322f611 Merge branch 'for-bfields' of git://linux-nfs.org/~tomtucker/xprt-switch-2.6 into for-2.6.27
2008-07-03 16:24:06 -04:00
J. Bruce Fields b620754bfe svcrpc: fix handling of garbage args
To return garbage_args, the accept_stat must be 0, and we must have a
verifier.  So we shouldn't be resetting the write pointer as we reject
the call.

Also, we must add the two placeholder words here regardless of success
of the unwrap, to ensure the output buffer is left in a consistent state
for svcauth_gss_release().

This fixes a BUG() in svcauth_gss.c:svcauth_gss_release().

Thanks to Aime Le Rouzic for bug report, debugging help, and testing.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Tested-by: Aime Le Rouzic <aime.le-rouzic@bull.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-03 12:46:56 -07:00
Patrick McHardy ab1b20467c bridge: fix use-after-free in br_cleanup_bridges()
Unregistering a bridge device may cause virtual devices stacked on the
bridge, like vlan or macvlan devices, to be unregistered as well.
br_cleanup_bridges() uses for_each_netdev_safe() to iterate over all
devices during cleanup. This is not enough however, if one of the
additionally unregistered devices is next in the list to the bridge
device, it will get freed as well and the iteration continues on
the freed element.

Restart iteration after each bridge device removal from the beginning to
fix this, similar to what rtnl_link_unregister() does.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-03 03:53:42 -07:00
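
Note on the bridge fix above: the "restart iteration after each removal" approach can be sketched in userspace C as follows. This is a simplified illustration, not the kernel code. struct dev, add_dev(), unregister_dev(), cleanup_bridges() and count_devs() are hypothetical, and only one level of stacking is modelled.

```c
#include <stdlib.h>

/* Hypothetical device: singly linked, optionally stacked on a lower device. */
struct dev {
	struct dev *next;
	int is_bridge;
	struct dev *stacked_on;
};

static struct dev *head;

/* Append a device to the global list. */
struct dev *add_dev(int is_bridge, struct dev *lower)
{
	struct dev *d = malloc(sizeof(*d));
	struct dev **pp = &head;

	if (!d)
		abort();
	d->next = NULL;
	d->is_bridge = is_bridge;
	d->stacked_on = lower;
	while (*pp)
		pp = &(*pp)->next;
	*pp = d;
	return d;
}

/* Remove d and every device stacked directly on it. */
static void unregister_dev(struct dev *d)
{
	struct dev **pp = &head;

	while (*pp) {
		struct dev *cur = *pp;

		if (cur == d) {
			*pp = cur->next;	/* unlink now, free at the end */
		} else if (cur->stacked_on == d) {
			*pp = cur->next;
			free(cur);
		} else {
			pp = &cur->next;
		}
	}
	free(d);
}

/*
 * Saving only the next pointer is not enough here, because the next
 * element may itself be freed by unregister_dev(). Restarting from the
 * head after each removal never touches a freed element.
 */
int cleanup_bridges(void)
{
	struct dev *d;
	int removed = 0;

restart:
	for (d = head; d; d = d->next) {
		if (d->is_bridge) {
			unregister_dev(d);
			removed++;
			goto restart;
		}
	}
	return removed;
}

int count_devs(void)
{
	struct dev *d;
	int n = 0;

	for (d = head; d; d = d->next)
		n++;
	return n;
}
```

The restart costs extra passes over the list, but the list is short and cleanup is rare, so correctness is the better trade here.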
Octavian Purdila 374e7b5949 tcp: fix a size_t < 0 comparison in tcp_read_sock
<used> should be of type int (not size_t) since recv_actor can return
negative values and it is also used in a < 0 comparison.

Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-03 03:31:21 -07:00
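
Note on the tcp_read_sock fix above: storing a possibly negative return value in a size_t makes a later "< 0" test always false. A minimal userspace sketch follows; actor() is a hypothetical stand-in for recv_actor, and the two wrapper functions exist only for this illustration.

```c
#include <stddef.h>

/* Hypothetical stand-in for recv_actor: bytes consumed, or a negative error. */
static int actor(int fail)
{
	return fail ? -1 : 10;
}

/* Before: the negative return is converted to a huge unsigned value. */
int sees_error_with_size_t(int fail)
{
	size_t used = actor(fail);

	return used < 0;	/* never true for an unsigned type */
}

/* After: a signed type keeps the error visible. */
int sees_error_with_int(int fail)
{
	int used = actor(fail);

	return used < 0;
}
```

Some compilers can warn about the always-false comparison when extra warnings are enabled, which is one way such bugs get noticed.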
Andrew Morton 81b23b4a7a tcp: net/ipv4/tcp.c needs linux/scatterlist.h
alpha:

net/ipv4/tcp.c: In function 'tcp_calc_md5_hash':
net/ipv4/tcp.c:2479: error: implicit declaration of function 'sg_init_table'
net/ipv4/tcp.c:2482: error: implicit declaration of function 'sg_set_buf'
net/ipv4/tcp.c:2507: error: implicit declaration of function 'sg_mark_end'

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-03 03:22:02 -07:00
David S. Miller 44d28ab19c Merge branch 'net-next-2.6-v6ready-20080703' of git://git.linux-ipv6.org/gitroot/yoshfuji/linux-2.6-next
2008-07-03 03:07:58 -07:00
YOSHIFUJI Hideaki e0835f8fa5 ipv4,ipv6 mroute: Add some helper inline functions to remove ugly ifdefs.
ip{,v6}_mroute_{set,get}sockopt() should not matter by optimization but
it would be better not to depend on optimization semantically.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-07-03 17:51:57 +09:00
Wang Chen 03d2f897e9 ipv4: Do cleanup for ip_mr_init
Same as ip6_mr_init(), make ip_mr_init() return an errno if it fails.
But do not do error handling in inet_init(), just print a message.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-07-03 17:51:57 +09:00
Wang Chen 623d1a1af7 ipv6: Do cleanup for ip6_mr_init.
Without this cleanup, we get the following issues:
1. Junk is left behind after inet6_init fails halfway.
2. proc and notifier junk is left behind after the ipv6 module is unloaded.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-07-03 17:51:56 +09:00
YOSHIFUJI Hideaki dd3abc4ef5 ipv6 route: Prefer outgoing interface with source address assigned.
Outgoing interface is selected by the route decision if unspecified.
Let's prefer routes via interface(s) with the address assigned if we
have multiple routes with same cost.
With help from Naohiro Ooiwa <nooiwa@miraclelinux.com>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-07-03 17:51:56 +09:00
YOSHIFUJI Hideaki 1b34be74cb ipv6 addrconf: add accept_dad sysctl to control DAD operation.
- If 0, disable DAD.
- If 1, perform DAD (default).
- If >1, perform DAD and disable IPv6 operation if DAD for the MAC-based
  link-local address has failed (RFC4862 5.4.5).

We do not follow RFC4862 by default.  Refer to the netdev thread entitled
"Linux IPv6 DAD not full conform to RFC 4862 ?"
	http://www.spinics.net/lists/netdev/msg52027.html

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-07-03 17:51:56 +09:00
YOSHIFUJI Hideaki 778d80be52 ipv6: Add disable_ipv6 sysctl to disable IPv6 operation on specific interface.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-07-03 17:51:55 +09:00
YOSHIFUJI Hideaki 5ce83afaac ipv6: Assume the loopback address in link-local scope.
Handle interface property strictly when looking up a route
for the loopback address (RFC4291 2.5.3).

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-07-03 17:51:55 +09:00
YOSHIFUJI Hideaki f81b2e7d8c ipv6: Do not forward packets with the unspecified source address.
RFC4291 2.5.2.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-07-03 17:51:55 +09:00
YOSHIFUJI Hideaki d68b82705a ipv6: Do not assign non-valid address on interface.
Check the type of the address when adding a new one on interface.
- the unspecified address (::) is always disallowed (RFC4291 2.5.2)
- the loopback address is disallowed unless the interface is (one of)
  loopback (RFC4291 2.5.3).
- multicast addresses are disallowed.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-07-03 17:51:55 +09:00
Pavel Emelyanov 40b215e594 tcp: de-bloat a bit with factoring NET_INC_STATS_BH out
There are some places in TCP that select one MIB index to
bump snmp statistics like this:

	if (<something>)
		NET_INC_STATS_BH(<some_id>);
	else if (<something_else>)
		NET_INC_STATS_BH(<some_other_id>);
	...
	else
		NET_INC_STATS_BH(<default_id>);

or in a more tricky but still similar way.

On the other hand, this NET_INC_STATS_BH is a camouflaged
increment of percpu variable, which is not that small.

Factoring those cases out de-bloats 235 bytes on non-preemptible
i386 config and drives parts of the code into 80 columns.

add/remove: 0/0 grow/shrink: 0/7 up/down: 0/-235 (-235)
function                                     old     new   delta
tcp_fastretrans_alert                       1437    1424     -13
tcp_dsack_set                                137     124     -13
tcp_xmit_retransmit_queue                    690     676     -14
tcp_try_undo_recovery                        283     265     -18
tcp_sacktag_write_queue                     1550    1515     -35
tcp_update_reordering                        162     106     -56
tcp_retransmit_timer                         990     904     -86

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-03 01:05:41 -07:00
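
Note on the NET_INC_STATS_BH commit above: the factoring it describes selects the index first and performs the increment once. A minimal userspace sketch follows. The enum values, the mib_stats array and the INC_STATS macro are hypothetical stand-ins for the real MIB indices and the per-CPU increment.

```c
/* Hypothetical MIB indices, for illustration only. */
enum mib_index { MIB_FAST, MIB_SLOW, MIB_DEFAULT, MIB_MAX };

/* Stand-in for the per-CPU counters. */
unsigned long mib_stats[MIB_MAX];

/* Stand-in for NET_INC_STATS_BH; the real macro expands to more code. */
#define INC_STATS(idx) (mib_stats[(idx)]++)

/* Before: the increment is expanded once per branch. */
void account_before(int fast, int slow)
{
	if (fast)
		INC_STATS(MIB_FAST);
	else if (slow)
		INC_STATS(MIB_SLOW);
	else
		INC_STATS(MIB_DEFAULT);
}

/* After: only the index differs per branch; the increment is expanded once. */
void account_after(int fast, int slow)
{
	int idx;

	if (fast)
		idx = MIB_FAST;
	else if (slow)
		idx = MIB_SLOW;
	else
		idx = MIB_DEFAULT;

	INC_STATS(idx);
}
```

Both versions update the same counter for the same inputs. The saving comes from emitting the increment code once instead of once per branch.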
Tom Tucker 8948896c9e svcrdma: Change WR context get/put to use the kmem cache
Change the WR context pool to be shared across mount points. This
reduces the RDMA transport memory footprint significantly since
idle mounts don't consume WR context memory.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-07-02 15:02:02 -05:00
Tom Tucker bf5927d84e svcrdma: Create a kmem cache for the WR contexts
Create a kmem cache to hold WR contexts. Next we will convert
the WR context get and put services to use this kmem cache.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-07-02 15:02:01 -05:00
Tom Tucker 902a94e088 svcrdma: Add flush_scheduled_work to module exit function
Make certain all transports pending free are flushed from the wq
before unloading the module.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-07-02 15:02:00 -05:00
Tom Tucker 36ef25e464 svcrdma: Limit ORD based on client's advertised IRD
When adapters have differing IRD limits, the RDMA transport will fail to
connect properly. The RDMA transport should use the client's advertised
inbound read limit when computing its outbound read limit. For iWARP
transports, there is currently no standard for exchanging IRD/ORD
during connection establishment so the 'responder_resources' field in the
connect event is the local device's limit. The RDMA transport can be
configured to use a smaller ORD by writing the desired number to the
/proc/sys/sunrpc/svc_rdma/max_outbound_read_requests file.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-07-02 15:01:59 -05:00
Tom Tucker 94dba4918d svcrdma: Remove unneeded spin locks from __svc_rdma_free
At the time __svc_rdma_free is called, we are guaranteed that all references
to this transport are gone. There is, therefore, no need to protect the
resource lists with a spin lock.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-07-02 15:01:57 -05:00
Tom Tucker 87295b6c5c svcrdma: Add dma map count and WARN_ON
Add a dma map count in order to verify that all DMA mapping resources
have been freed when the transport is closed.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-07-02 15:01:56 -05:00
Tom Tucker e6ab914371 svcrdma: Move the DMA unmap logic to the CQ handler
Separate DMA unmap from context destruction and perform DMA unmapping
in the SQ/RQ CQ reap functions. This is necessary to support software
based RDMA implementations that actually copy the data in their
ib_dma_unmap callback functions and architectures that don't have
cache coherent I/O busses.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-07-02 15:01:55 -05:00
Tom Tucker f820c57ebf svcrdma: Use reply and chunk map for RDMA_READ processing
Modify the RDMA_READ processing to use the reply and chunk list mapping data
types. Also add a special purpose 'hdr_count' field in the context to hold
the header page count instead of overloading the SGE length field and
corrupting the DMA map length.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-07-02 15:01:55 -05:00
Tom Tucker 34d16e42a6 svcrdma: Use RPC reply map for RDMA_WRITE processing
Use the new svc_rdma_req_map data type for mapping the client side memory
to the server side memory. Move the DMA mapping to the context pointed to
by each WR individually so that it is unmapped after the WR completes.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-07-02 15:01:54 -05:00
Tom Tucker ab96dddbed svcrdma: Add a type for keeping NFS RPC mapping
Create a new data structure to hold the remote client address space
to local server address space mapping.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-07-02 15:01:53 -05:00
Johannes Berg f4ea83dd74 mac80211: rework debug settings and make debugging safer
This patch reworks the mac80211 debug settings making them more focused
and adding help text for those that didn't have one. It also removes a
number of printks that can be triggered remotely and add no value, e.g.
"too short deauthentication frame received - ignoring".

If somebody really needs to debug that they should just add a monitor
interface and look at the frames in wireshark.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-07-02 15:48:33 -04:00
Johannes Berg 49461622ed mac80211: get rid of function pointers in RX path
This changes the RX path to no longer use function pointers for
RX handlers but rather invoke them directly. If debugging is
enabled, mark the RX handlers noinline because otherwise they
all get inlined into ieee80211_invoke_rx_handlers() which makes
it harder to see where a bug is.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-07-02 15:48:33 -04:00
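A rough illustration of the direct-call style described in 49461622ed. Everything here is a stand-in: the handler names, the `rx_result` values and the `DEBUG_NOINLINE` macro are hypothetical, and `__attribute__((noinline))` is a GCC/Clang extension. The point is only that each handler is called by name and keeps its own stack frame when inlining is suppressed.

```c
/* Hypothetical result codes and frame descriptor for this sketch. */
enum rx_result { RX_CONTINUE, RX_DROP, RX_QUEUED };

struct rx_data {
    int len;
    int is_duplicate;
};

/* With debugging enabled, keep each handler out of line so that it shows
 * up as its own frame in a backtrace (GCC/Clang-specific attribute). */
#define DEBUG_NOINLINE __attribute__((noinline))

static DEBUG_NOINLINE enum rx_result rx_h_check_len(struct rx_data *rx)
{
    return rx->len < 10 ? RX_DROP : RX_CONTINUE;
}

static DEBUG_NOINLINE enum rx_result rx_h_check_dup(struct rx_data *rx)
{
    return rx->is_duplicate ? RX_DROP : RX_CONTINUE;
}

static DEBUG_NOINLINE enum rx_result rx_h_deliver(struct rx_data *rx)
{
    (void)rx;
    return RX_QUEUED;
}

/* Handlers are invoked directly, in order, instead of through a table of
 * function pointers; the first non-CONTINUE result ends processing. */
static enum rx_result invoke_rx_handlers(struct rx_data *rx)
{
    enum rx_result res;

#define CALL_RXH(rxh)               \
    do {                            \
        res = rxh(rx);              \
        if (res != RX_CONTINUE)     \
            return res;             \
    } while (0)

    CALL_RXH(rx_h_check_len);
    CALL_RXH(rx_h_check_dup);
    CALL_RXH(rx_h_deliver);
#undef CALL_RXH

    return res;
}
```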
Johannes Berg d9e8a70fa2 mac80211: get rid of function pointers in TX path
This changes the TX path to no longer use function pointers for
TX handlers but rather invoke them directly. If debugging is
enabled, mark the TX handlers noinline because otherwise they
all get inlined into invoke_tx_handlers() which makes it harder
to see where a bug is.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-07-02 15:48:33 -04:00
Santwona Behera 0853ad66b1 netdev: Add support for rx flow hash configuration, using ethtool.
Added new interfaces to ethtool to configure receive network flow
distribution across multiple rx rings using hashing.

Signed-off-by: Santwona Behera <santwona.behera@sun.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-02 03:47:41 -07:00
Vlad Yasevich ecbed6a419 sctp: Mark GET_PEER|LOCAL_ADDR_OLD deprecated.
Socket options SCTP_GET_PEER_ADDR_OLD, SCTP_GET_PEER_ADDR_NUM_OLD,
SCTP_GET_LOCAL_ADDR_OLD, and SCTP_GET_PEER_LOCAL_ADDR_NUM_OLD
have been replaced by newer versions since 2005.  It's time
to officially deprecate them and schedule them for removal.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-01 20:06:22 -07:00
Patrick McHardy 2fe195cfe3 net: fib_rules: fix error code for unsupported families
The errno code returned must be negative.

Fixes "RTNETLINK answers: Unknown error 18446744073709551519".

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-01 19:59:37 -07:00
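The odd number in the message of 2fe195cfe3 is what -97 looks like when printed as an unsigned 64-bit value, and 97 is EAFNOSUPPORT on Linux. The sketch below only demonstrates that arithmetic; the function names are hypothetical, the constant is hard-coded so the example does not depend on the host's errno values, and how the value travelled from the kernel to the tool is an assumption.

```c
#include <stdint.h>

#define SKETCH_EAFNOSUPPORT 97   /* Linux value of EAFNOSUPPORT, hard-coded */

/* Buggy variant: returns a positive errno. */
static int lookup_family_buggy(int supported)
{
    return supported ? 0 : SKETCH_EAFNOSUPPORT;
}

/* Fixed variant: follows the "negative errno on failure" convention. */
static int lookup_family_fixed(int supported)
{
    return supported ? 0 : -SKETCH_EAFNOSUPPORT;
}

/* A caller that assumes a negative return and negates it to recover the
 * error code, then treats the result as an unsigned 64-bit number. */
static uint64_t reported_code(int ret)
{
    return (uint64_t)(int64_t)(-ret);
}
```

Feeding the buggy return value through `reported_code` gives 18446744073709551519, the fixed one gives 97.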
Wang Chen 93b3cff991 netdevice: Fix wrong string handle in kernel command line parsing
v1->v2: Use strlcpy() to ensure s[i].name is null-terminated.

1. In netdev_boot_setup_add(), a long name will leak.
   ex. : dev=21,0x1234,0x1234,0x2345,eth123456789verylongname.........
2. In netdev_boot_setup_check(), mismatch will happen if s[i].name
   is a substring of dev->name.
   ex. : dev=...eth1 dev=...eth11

[ With feedback from Ben Hutchings. ]

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-01 19:57:19 -07:00
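Both problems listed in 93b3cff991 can be reproduced in a few lines of user-space C. This is a sketch under assumptions: the helper names and `NAME_SIZE_SKETCH` are made up, `snprintf()` stands in for the kernel's `strlcpy()`, and the prefix comparison is a guess at the kind of test that lets "eth1" match "eth11".

```c
#include <stdio.h>
#include <string.h>

#define NAME_SIZE_SKETCH 16   /* illustrative buffer size */

/* Bounded copy that always NUL-terminates, in the spirit of strlcpy(). */
static void copy_name(char *dst, size_t size, const char *src)
{
    snprintf(dst, size, "%s", src);
}

/* Prefix comparison: only the first strlen(setup) characters are checked,
 * so a setup entry "eth1" also matches the device "eth11". */
static int name_matches_prefix(const char *setup, const char *dev)
{
    return strncmp(dev, setup, strlen(setup)) == 0;
}

/* Whole-string comparison: "eth1" matches only "eth1". */
static int name_matches_exact(const char *setup, const char *dev)
{
    return strcmp(dev, setup) == 0;
}
```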
Wang Chen 8fde8a0769 net: Typo of sk_filter() comment
Parameter "needlock" no longer exists.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-01 19:55:40 -07:00
Wang Chen 8487460720 netlink: Unneeded local variable
We already have a variable, which has the same capability.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-01 19:55:09 -07:00
Patrick McHardy a4aebb83cf net-sched: fix filter destruction in atm/hfsc qdisc destruction
Filters need to be destroyed before beginning to destroy classes
since the destination class needs to still be alive to unbind the
filter.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-01 19:53:09 -07:00
Patrick McHardy ff31ab56c0 net-sched: change tcf_destroy_chain() to clear start of filter list
Pass double tcf_proto pointers to tcf_destroy_chain() to make it
clear the start of the filter list for more consistency.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-01 19:52:38 -07:00
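The double-pointer idea in ff31ab56c0, sketched on a plain singly linked list. `struct filter`, `push_filter` and `destroy_chain` are hypothetical user-space stand-ins for the tcf_proto chain; the only point shown is that passing the address of the head lets the destroy routine leave the caller with a NULL list instead of a dangling pointer.

```c
#include <stdlib.h>

struct filter {
    struct filter *next;
};

/* Prepend a new element; returns the new head (or the old one on failure). */
static struct filter *push_filter(struct filter *head)
{
    struct filter *f = malloc(sizeof(*f));

    if (!f)
        return head;
    f->next = head;
    return f;
}

/* Takes the address of the list head, frees every element and leaves
 * *head == NULL when it returns. */
static void destroy_chain(struct filter **head)
{
    struct filter *f;

    while ((f = *head) != NULL) {
        *head = f->next;
        free(f);
    }
}
```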
Stephen Hemminger 6dbf4bcac9 icmp: fix units for ratelimit
Convert the sysctl values for icmp ratelimit to use milliseconds instead
of jiffies which is based on kernel configured HZ.
Internal kernel jiffies are not a proper unit for any userspace API.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-01 19:29:07 -07:00
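Why jiffies make a poor userspace unit (6dbf4bcac9): the same tick count means a different duration on every HZ setting. A small sketch with HZ passed in as a parameter; the function names are hypothetical and the rounding-up choice in `ms_to_ticks` is an assumption.

```c
/* Convert a tick count to milliseconds for a given tick rate (hz). */
static unsigned long ticks_to_ms(unsigned long ticks, unsigned long hz)
{
    return ticks * 1000 / hz;
}

/* Convert milliseconds to ticks, rounding up so that a short non-zero
 * interval does not collapse to zero ticks. */
static unsigned long ms_to_ticks(unsigned long ms, unsigned long hz)
{
    return (ms * hz + 999) / 1000;
}
```

A stored value of 100 ticks is one second with HZ=100 but only 100 ms with HZ=1000, whereas a value kept in milliseconds converts to the right tick count on either configuration.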
Assaf Krauss 4faeb86070 mac80211: add beacon timestamp to beacon template in IBSS
This patch adds a beacon timestamp to the beacon template used in IBSS
mode. This way the underlying driver can update its TSF accordingly.
According to the spec, a station should adopt the highest TSF from incoming
beacons in the cell.

Signed-off-by: Assaf Krauss <assaf.krauss@intel.com>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-30 17:37:42 -04:00
Tomas Winkler 5479d0e739 mac80211: fix warning: unused variable invoke_tx_handlers
This patch fixes warning: unused variable in invoke_tx_handlers
when compiling without MAC80211_DEBUG option

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-30 17:37:36 -04:00
Ester Kummer ae6a44e3af mac80211: removing duplicated parsing of information elements
This patch removes the duplicated parsing of information elements
in ieee80211_rx_bss_info and in ieee_rx_mgmt_beacon

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Ester Kummer <ester.kummer@intel.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-30 17:37:35 -04:00
Adrian Bunk e5f5e7339c build algorithms into the mac80211 module
The old infrastructure was:
- the default algorithm is built into mac80211
- other algorithms get into their own modules

The implementation of this complicated scheme was horrible
(just look at net/mac80211/Makefile), and anyone adding a new
algorithm would most likely not get it right at his first attempt.

This patch therefore builds all enabled algorithms into the mac80211
module.

The user interface for the rate control algorithms changes as follows:
- first the user can choose which algorithms to enable (currently only
  MAC80211_RC_PID is available)
- if more than one algorithm is enabled (currently not possible since
  only one algorithm is present) the user then chooses the default one

Note:
- MAC80211_RC_PID is always enabled for CONFIG_EMBEDDED=n

Technical changes:
- all selected algorithms get into the mac80211 module
- net/mac80211/Makefile can now become much less complicated
- support for rc80211_pid_algo.c being modular is no longer required
- this includes unexporting mesh_plink_broken

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-30 17:37:34 -04:00
Emmanuel Grumbach bf998f6864 mac80211: add last beacon time in scan list
This patch adds the interval between the scan results and the last time a
beacon was received in the result of the scan.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-30 17:37:34 -04:00
Tomas Winkler 06ff47bc95 mac80211: add spectrum capabilities
This patch adds spectrum capability and required information
elements to association request providing AP has requested it and
it is supported by the driver

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Assaf Krauss <assaf.krauss@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-30 17:37:34 -04:00
Yi Zhu 7b1e78d505 mac80211: add MAC80211_VERBOSE_SPECT_MGMT_DEBUG Kconfig option
The patch introduces MAC80211_VERBOSE_SPECT_MGMT_DEBUG Kconfig option to
suppress Spectrum Management 802.11h related debug logs.

Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-30 17:37:33 -04:00
David S. Miller 2a64cc4b79 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-2.6 2008-06-30 13:18:53 -07:00
Emmanuel Grumbach 23976efedd mac80211: don't accept WEP keys other than WEP40 and WEP104
This patch makes mac80211 refuse a WEP key whose length is neither WEP40 nor
WEP104.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-30 15:43:53 -04:00
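The check in 23976efedd amounts to a length test: a WEP-40 key is 5 bytes (40 bits) and a WEP-104 key is 13 bytes (104 bits). A minimal sketch follows; the macro and function names are hypothetical, and returning -EINVAL is an assumption about how the refusal is reported.

```c
#include <errno.h>
#include <stddef.h>

#define SKETCH_KEY_LEN_WEP40   5   /* 40-bit key */
#define SKETCH_KEY_LEN_WEP104 13   /* 104-bit key */

/* Returns 0 for an acceptable WEP key length, -EINVAL otherwise. */
static int check_wep_key_len(size_t len)
{
    if (len == SKETCH_KEY_LEN_WEP40 || len == SKETCH_KEY_LEN_WEP104)
        return 0;
    return -EINVAL;
}
```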
Jozsef Kadlecsik 84ebe1cdae netfilter: nf_conntrack_tcp: fixing to check the lower bound of valid ACK
Lost connections were reported by Thomas Bätzler (running 2.6.25 kernel) on
the netfilter mailing list (see the thread "Weird nat/conntrack Problem
with PASV FTP upload"). He provided tcpdump recordings which helped to
find a long lingering bug in conntrack.

In TCP connection tracking, checking the lower bound of valid ACK could
lead to valid packets being marked as INVALID because:

 - We have got a "higher or equal" inequality, but the test checked
   the "higher" condition only; fixed.
 - If the packet contains a SACK option, it could occur that the ACK
   value was before the left edge of our (S)ACK "window": if a previous
   packet from the other party intersected the right edge of the window
   of the receiver, we could move forward the window parameters beyond
   accepting a valid ack. Therefore in this patch we check the rightmost
   SACK edge instead of the ACK value in the lower bound of valid (S)ACK
   test.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-30 12:41:30 -07:00
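The two points in 84ebe1cdae, reduced to wrap-safe sequence arithmetic. This is a sketch, not the conntrack code: all function names are hypothetical, the lower bound is passed in as a plain number, and the signed cast relies on the usual two's-complement behaviour of common compilers.

```c
#include <stdint.h>

/* Wrap-safe comparisons on 32-bit sequence numbers. */
static int seq_before(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) < 0;
}

static int seq_after(uint32_t a, uint32_t b)
{
    return seq_before(b, a);
}

/* Strict test: rejects a value that sits exactly on the lower bound. */
static int ack_ok_strict(uint32_t ack, uint32_t lower)
{
    return seq_after(ack, lower);
}

/* Inclusive test ("higher or equal"): accepts the bound itself. */
static int ack_ok_inclusive(uint32_t ack, uint32_t lower)
{
    return !seq_before(ack, lower);
}

/* Use the rightmost SACK edge instead of the bare ACK value when the
 * packet carries SACK blocks. */
static uint32_t rightmost_edge(uint32_t ack, const uint32_t *sack_right, int n)
{
    uint32_t best = ack;

    for (int i = 0; i < n; i++)
        if (seq_after(sack_right[i], best))
            best = sack_right[i];
    return best;
}
```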
David S. Miller 28f49d8fec Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-next-2.6 2008-06-28 22:57:58 -07:00
David S. Miller 1b63ba8a86 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/wireless/iwlwifi/iwl4965-base.c
2008-06-28 01:19:40 -07:00
YOSHIFUJI Hideaki d420895efb ipv6 route: Convert rt6_device_match() to use RT6_LOOKUP_F_xxx flags.
The commit 77d16f450a ("[IPV6] ROUTE:
Unify RT6_F_xxx and RT6_SELECT_F_xxx flags") intended to pass various
routing lookup hints around RT6_LOOKUP_F_xxx flags, but conversion was
missing for rt6_device_match().

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-27 20:14:54 -07:00
Paul Moore 59d88c00ca netlabel: Fix a problem when dumping the default IPv6 static labels
There is a missing "!" in a conditional statement which is causing entries to
be skipped when dumping the default IPv6 static label entries.  This can be
demonstrated by running the following:

 # netlabelctl unlbl add default address:::1 \
                                 label:system_u:object_r:unlabeled_t:s0
 # netlabelctl -p unlbl list

... you will notice that the entry for the IPv6 localhost address is not
displayed but does exist (works correctly, causes collisions when attempting
to add duplicate entries, etc.).

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-27 20:12:32 -07:00
Eli Cohen 251a4b320f net/inet_lro: remove setting skb->ip_summed when not LRO-able
When an SKB cannot be chained to a session, the current code attempts
to "restore" its ip_summed field from lro_mgr->ip_summed. However,
lro_mgr->ip_summed does not hold the original value; in fact, we'd
better not touch skb->ip_summed since it is not modified by the code
in the path leading to a failure to chain it.  Also use a clearer
comment to describe the ip_summed field of struct net_lro_mgr.

Issue raised by Or Gerlitz <ogerlitz@voltaire.com>

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-27 20:09:00 -07:00
Pavel Emelyanov 9a375803fe inet fragments: fix race between inet_frag_find and inet_frag_secret_rebuild
The problem is that while we work w/o the inet_frags.lock even
read-locked the secret rebuild timer may occur (on another CPU, since
BHs are still disabled in the inet_frag_find) and change the rnd seed
for ipv4/6 fragments.

It was caused by my patch fd9e63544c
([INET]: Omit double hash calculations in xxx_frag_intern) late 
in the 2.6.24 kernel, so this should probably be queued to -stable.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-27 20:06:08 -07:00
Julius Volz 10b595aff1 netlink: Fix some doc comments in net/netlink/attr.c
Fix some doc comments to match function and attribute names in
net/netlink/attr.c.

Signed-off-by: Julius Volz <juliusv@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-27 20:02:14 -07:00
Stephen Hemminger 7be87351a1 tcp: /proc/net/tcp rto,ato values not scaled properly (v2)
I found another case where we are sending information to userspace
in the wrong HZ scale.  This should have been fixed back in 2.5 :-(

This means an ABI change but as it stands there is no way for an application
like ss to get the right value.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-27 20:00:19 -07:00
Adrian Bunk ede16af4cd pkt_sched: Remove CONFIG_NET_SCH_RR
Commit d62733c8e4
([SCHED]: Qdisc changes and sch_rr added for multiqueue)
added a NET_SCH_RR option that was unused since the code
went unconditionally into sch_prio.

Reported-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-27 19:54:05 -07:00
WANG Cong 01e123d79a pkt_sched: ERR_PTR() usually encodes a negative errno, not positive.
Note, in the following patch, 'err' is initialized as:

int err = -ENOBUFS;

Signed-off-by: WANG Cong <wcong@critical-links.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-27 19:51:35 -07:00
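Why the sign matters for 01e123d79a: an error pointer is recognised by falling into the top few thousand addresses, which only happens for a negative errno. The helpers below are a user-space imitation written for this note, with hypothetical names and an assumed limit of 4095, not the kernel's own definitions.

```c
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO_SKETCH 4095   /* assumed upper bound on errno values */

/* Encode an errno value in a pointer. */
static void *err_ptr(long error)
{
    return (void *)(intptr_t)error;
}

/* Decode the errno value from an error pointer. */
static long ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* True only for pointers in the last MAX_ERRNO_SKETCH addresses, i.e.
 * for values that were encoded from a negative errno. */
static int is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO_SKETCH;
}
```

With these definitions `err_ptr(-ENOBUFS)` is detected as an error, while `err_ptr(ENOBUFS)` looks like an ordinary small pointer and slips through the check.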
Wang Chen 5dbaec5dc6 netdevice: Fix typo of dev_unicast_add() comment
Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-27 19:35:16 -07:00
Rainer Weikusat ec0d215f94 af_unix: fix 'poll for write'/connected DGRAM sockets
For n:1 'datagram connections' (eg /dev/log), the unix_dgram_sendmsg
routine implements a form of receiver-imposed flow control by
comparing the length of the receive queue of the 'peer socket' with
the max_ack_backlog value stored in the corresponding sock structure,
either blocking the thread which caused the send-routine to be called
or returning EAGAIN. This routine is used by both SOCK_DGRAM and
SOCK_SEQPACKET sockets. The poll-implementation for these socket types
is datagram_poll from core/datagram.c. A socket is deemed to be
writeable by this routine when the memory presently consumed by
datagrams owned by it is less than the configured socket send buffer
size. This is always wrong for PF_UNIX non-stream sockets connected to
server sockets dealing with (potentially) multiple clients if the
abovementioned receive queue is currently considered to be full.
'poll' will then return, indicating that the socket is writeable, but
a subsequent write results in EAGAIN, effectively causing a (usual)
application to 'poll for writeability by repeated send request with
O_NONBLOCK set' until it has consumed its time quantum.

The change below uses a suitably modified variant of the datagram_poll
routines for both types of PF_UNIX sockets, which tests if the
recv-queue of the peer a socket is connected to is presently
considered to be 'full' as part of the 'is this socket
writeable'-checking code. The socket being polled is additionally
put onto the peer_wait wait queue associated with its peer, because the
unix_dgram_recvmsg routine does a wake up on this queue after a
datagram was received and the 'other wakeup call' is done implicitly
as part of skb destruction, meaning, a process blocked in poll
because of a full peer receive queue could otherwise sleep forever
if no datagram owned by its socket was already sitting on this queue.
Among this change is a small (inline) helper routine named
'unix_recvq_full', which consolidates the actual testing code (in three
different places) into a single location.

Signed-off-by: Rainer Weikusat <rweikusat@mssgmbh.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-27 19:34:18 -07:00
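The writeability rule from ec0d215f94, modelled with plain counters. This is a sketch under assumptions: `struct dgram_sock` and its fields are invented for illustration, `recvq_full` only imitates the idea behind 'unix_recvq_full', and the wait-queue handling described in the message is left out entirely.

```c
#include <stddef.h>

/* Hypothetical stand-in for a connected PF_UNIX datagram socket. */
struct dgram_sock {
    size_t sndbuf;              /* configured send buffer size */
    size_t wmem_queued;         /* memory held by datagrams this socket sent */
    unsigned int recvq_len;     /* datagrams waiting in the receive queue */
    unsigned int max_backlog;   /* receiver-imposed limit */
    struct dgram_sock *peer;    /* NULL when not connected */
};

/* One place for the "receive queue is full" test. */
static int recvq_full(const struct dgram_sock *sk)
{
    return sk->recvq_len > sk->max_backlog;
}

/* Old rule: only the sender's own buffer usage is considered. */
static int writable_old(const struct dgram_sock *sk)
{
    return sk->wmem_queued < sk->sndbuf;
}

/* New rule: additionally require that the peer's receive queue has room. */
static int writable_new(const struct dgram_sock *sk)
{
    if (!writable_old(sk))
        return 0;
    if (sk->peer && recvq_full(sk->peer))
        return 0;
    return 1;
}
```

With a full peer queue the old rule still reports the socket as writeable, which is the situation that makes an application spin on EAGAIN.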
Octavian Purdila db43a282d3 tcp: fix for splice receive when used with software LRO
If an skb has nr_frags set to zero but its frag_list is not empty (as
it can happen if software LRO is enabled), and a previous
tcp_read_sock has consumed the linear part of the skb, then
__skb_splice_bits:

(a) incorrectly reports an error and

(b) forgets to update the offset to account for the linear part

Either of the two problems will cause the subsequent __skb_splice_bits
call (the one that handles the frag_list skbs) to either skip data,
or, if the unadjusted offset is greater than the size of the next skb
in the frag_list, make tcp_splice_read loop forever.

Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-27 17:27:21 -07:00
Miquel van Smoorenburg 57413ebc4e tcp: calculate tcp_mem based on low memory instead of all memory
The tcp_mem array which contains limits on the total amount of memory
used by TCP sockets is calculated based on nr_all_pages.  On a 32 bits
x86 system, we should base this on the number of lowmem pages.

Signed-off-by: Miquel van Smoorenburg <miquels@cistron.nl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-27 17:23:57 -07:00
Emmanuel Grumbach 00eb7fe77e mac80211: fix an oops in several failure paths in key allocation
This patch fixes an oops in several failure paths in key allocation. This
Oops occurs when freeing a key that has not been linked yet, so the
key->sdata is not set.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-27 14:49:52 -04:00
Johannes Berg 03f93c3d4c mac80211: fix tx fragmentation
This patch fixes TX fragmentation caused by
tx handlers reordering and 'tx info to cb' patches

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-27 09:09:21 -04:00
Johannes Berg 59959a6150 mac80211: make workqueue freezable
This patch makes the mac80211 workqueue freezable making it
interact a bit better with system suspend and not try to ping
the AP while the hardware is down.

This doesn't really help with implementing proper suspend in
any way but makes some bad things trigger less.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-27 09:09:21 -04:00
Tomas Winkler f37d08bddc mac80211: add phy information to giwname
This patch adds phy information to giwname.

Quoting:
It's not useless, it's supposed to tell you about the protocol
capability of the device, like "IEEE 802.11b" or "IEEE 802.11abg"

Jean

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-27 09:09:20 -04:00
Emmanuel Grumbach b9fcc4f298 mac80211: update the authentication method
This patch updates the authentication method upon giwencode ioctl.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-27 09:09:19 -04:00
Emmanuel Grumbach fa6adfe9e6 mac80211: don't return -EINVAL upon iwconfig wlan0 rts auto
This patch avoids returning -EINVAL upon iwconfig wlan0 rts auto. If
rts->fixed is 0, then we should choose a default value instead of failing.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-27 09:09:19 -04:00
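One way to picture the behaviour described in fa6adfe9e6. The request structure, the function name and the constant are hypothetical, and using the maximum threshold as the default for "auto" is an assumption of this sketch; the commit message only says that a default is chosen.

```c
#include <errno.h>

#define SKETCH_MAX_RTS_THRESHOLD 2347   /* illustrative upper bound */

/* Hypothetical stand-in for the wireless-extensions request. */
struct rts_request {
    int value;
    int fixed;      /* 0 means "auto" */
    int disabled;
};

/* Returns 0 and stores the threshold, or -EINVAL for an out-of-range
 * fixed value. "auto" no longer falls into the error path. */
static int set_rts(const struct rts_request *rts, int *threshold)
{
    if (rts->disabled || !rts->fixed) {
        *threshold = SKETCH_MAX_RTS_THRESHOLD;   /* assumed default */
        return 0;
    }
    if (rts->value < 0 || rts->value > SKETCH_MAX_RTS_THRESHOLD)
        return -EINVAL;
    *threshold = rts->value;
    return 0;
}
```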
Harvey Harrison 4e3996fe89 mac80211: mlme.c use new frame control helpers
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-27 09:09:18 -04:00
Harvey Harrison 182503abf4 mac80211: rx.c use new frame control helpers
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-27 09:09:18 -04:00
Harvey Harrison 065e9605f9 mac80211: tx.c use new frame control helpers
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-27 09:09:18 -04:00
Harvey Harrison 70217d7f83 mac80211: wep.c use new frame control helpers
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-27 09:09:17 -04:00
Luis R. Rodriguez ffd7891dc9 mac80211: Let drivers have access to TKIP key offsets for TX and RX MIC
Some drivers may want to use the TKIP key offsets for TX and RX
MIC so let's move this out. Let's also clear up a bit how this is used
internally in mac80211.

Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-27 09:09:17 -04:00
Johannes Berg 9ae705cfd3 mac80211: rename TKIP debugging Kconfig symbol
... to MAC80211_TKIP_DEBUG rather than TKIP_DEBUG.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-27 09:09:15 -04:00
Johannes Berg 97b045d62b mac80211: add single function calling tx handlers
This modifies mac80211 to only have a single function calling the
TX handlers rather than them being invoked in multiple places.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-26 16:50:02 -04:00
Johannes Berg 5a9f7b047e mac80211: use separate spinlock for sta flags
David Ellingsworth posted a bug that was only noticeable on UP/NO-PREEMPT
and Michael correctly analysed it to be a spin_lock_bh() section within
a spin_lock_irqsave() section. This adds a separate spinlock for the
sta_info flags to fix that issue and avoid having to take much care
about where the sta flag manipulation functions are called.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reported-By: David Ellingsworth <david@identd.dyndns.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-26 16:49:17 -04:00
Johannes Berg 135a2110c5 mac80211: remove shared key todo
Adding shared key authentication is not going to happen anyway.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-26 16:49:17 -04:00
Assaf Krauss b662348662 mac80211: 11h - Handling measurement request
This patch handles the 11h measurement request information element.
This is minimal requested implementation - refuse measurement.

Signed-off-by: Assaf Krauss <assaf.krauss@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-26 16:49:14 -04:00
Assaf Krauss f2df38596a mac80211: 11h Infrastructure - Parsing
This patch introduces parsing of 11h and 11d related elements from incoming
management frames.

Signed-off-by: Assaf Krauss <assaf.krauss@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-26 16:49:14 -04:00
Henrique de Moraes Holschuh 5005657cbd rfkill: rename the rfkill_state states and add block-locked state
The current naming of rfkill_state causes a lot of confusion: not only the
"kill" in rfkill suggests negative logic, but also the fact that rfkill cannot
turn anything on (it can just force something off or stop forcing something
off) is often forgotten.

Rename RFKILL_STATE_OFF to RFKILL_STATE_SOFT_BLOCKED (transmitter is blocked
and will not operate; state can be changed by a toggle_radio request), and
RFKILL_STATE_ON to RFKILL_STATE_UNBLOCKED (transmitter is not blocked, and may
operate).

Also, add a new third state, RFKILL_STATE_HARD_BLOCKED (transmitter is blocked
and will not operate; state cannot be changed through a toggle_radio request),
which is used by drivers to indicate a wireless transmitter was blocked by a
hardware rfkill line that accepts no overrides.

Keep the old names as #defines, but document them as deprecated.  This way,
drivers can be converted to the new names *and* verified to actually use rfkill
correctly one by one.

Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Acked-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-26 14:21:22 -04:00
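The three states from 5005657cbd and the rule that a hard block cannot be changed by a toggle request, as a small model. The identifiers carry a `SKETCH_` prefix because they are not the kernel's; the numeric values, the error codes and the `toggle_radio` signature are assumptions made for illustration.

```c
#include <errno.h>

enum sketch_rfkill_state {
    SKETCH_STATE_SOFT_BLOCKED = 0,   /* blocked, can be changed by a request */
    SKETCH_STATE_UNBLOCKED    = 1,   /* not blocked, transmitter may operate */
    SKETCH_STATE_HARD_BLOCKED = 2    /* blocked by hardware, no override */
};

/* Old names kept as aliases so that users can be converted one by one. */
#define SKETCH_STATE_OFF SKETCH_STATE_SOFT_BLOCKED   /* deprecated */
#define SKETCH_STATE_ON  SKETCH_STATE_UNBLOCKED      /* deprecated */

/* A toggle request may move between soft-blocked and unblocked only. */
static int toggle_radio(enum sketch_rfkill_state *current,
                        enum sketch_rfkill_state wanted)
{
    if (*current == SKETCH_STATE_HARD_BLOCKED)
        return -EPERM;    /* hardware line wins */
    if (wanted == SKETCH_STATE_HARD_BLOCKED)
        return -EINVAL;   /* only the hardware can report this state */
    *current = wanted;
    return 0;
}
```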
Henrique de Moraes Holschuh 4081f00dc4 rfkill: do not allow userspace to override ALL RADIOS OFF
SW_RFKILL_ALL is the "emergency power-off all radios" input event.  It must
be handled, and must always do the same thing as far as the rfkill system
is concerned: all transmitters are to go *immediately* offline.

For safety, do NOT allow userspace to override EV_SW SW_RFKILL_ALL OFF.  As
long as rfkill-input is loaded, that event will *always* be processed, and
it will *always* force all rfkill switches to disable all wireless
transmitters, regardless of user_claim attribute or anything else.

Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Acked-by: Ivo van Doorn <IvDoorn@gmail.com>
Cc: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-26 14:21:22 -04:00
Fabien Crespel fbc6af2f3c rfkill: drop current_state from tasks in rfkill-input
The whole current_state thing seems completely useless and a source of
problems in rfkill-input, since state comparison is already done in rfkill,
and rfkill-input is more than likely to become out of sync with the real
state.

Signed-off-by: Fabien Crespel <fabien@crespel.net>
Acked-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Acked-by: Ivo van Doorn <IvDoorn@gmail.com>
Cc: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-26 14:21:21 -04:00
Henrique de Moraes Holschuh ffb67c34e4 rfkill: add uevent notifications
Use the notification chains to also send uevents, so that userspace can be
notified of state changes of every rfkill switch.

Userspace should use these events for OSD/status report applications and
rfkill GUI frontends.  HAL might want to broadcast them over DBUS, for
example.  It might be also useful for userspace implementations of
rfkill-input, or to use HAL as the platform driver which promotes rfkill
switch change events into input events (to synchronize all other switches)
when necessary for platforms that lack a convenient platform-specific
kernel module to do it.

Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Acked-by: Ivo van Doorn <IvDoorn@gmail.com>
Cc: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-26 14:21:21 -04:00
Henrique de Moraes Holschuh 99c632e5a3 rfkill: add type string helper
We will need access to the rfkill switch type in string format for more
than just sysfs.  Therefore, move it to a generic helper.

Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Acked-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-26 14:21:21 -04:00
Henrique de Moraes Holschuh 79399a8d19 rfkill: add notifier chains support
Add a notifier chain for use by the rfkill class.  This notifier chain
signals the following events (more to be added when needed):

  1. rfkill: rfkill device state has changed

A pointer to the rfkill struct will be passed as a parameter.

The notifier message types have been added to include/linux/rfkill.h
instead of to include/linux/notifier.h in order to avoid the madness of
modifying a header used globally (and that triggers an almost full tree
rebuild every time it is touched) with information that is of interest only
to code that includes the rfkill.h header.

Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Acked-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-26 14:21:21 -04:00
Henrique de Moraes Holschuh 526324b61a rfkill: rework suspend and resume handlers
The resume handler should reset the wireless transmitter rfkill
state to exactly what it was when the system was suspended.  Do it,
and do it using the normal routines for state change while at it.

The suspend handler should force-switch the transmitter to blocked
state, ignoring caches.  Do it.

Also take an opportunity shot to rfkill_remove_switch() and also
force the transmitter to blocked state there, bypassing caches.

Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Acked-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-26 14:21:20 -04:00
Henrique de Moraes Holschuh 477576a073 rfkill: add the WWAN radio type
Unfortunately, instead of adding a generic Wireless WAN type, a technology-
specific type (WiMAX) was added.  That's useless for other WWAN devices,
such as EDGE, UMTS, X-RTT and other such radios.

Add a WWAN rfkill type for generic wireless WAN devices.  No keys are added
as most devices really want to use KEY_WLAN for WWAN control (in a cycle of
none, WLAN, WWAN, WLAN+WWAN) and need no specific keycode added.

Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Acked-by: Ivo van Doorn <IvDoorn@gmail.com>
Cc: Iñaky Pérez-González <inaky.perez-gonzalez@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-26 14:21:20 -04:00
Henrique de Moraes Holschuh 801e49af4c rfkill: add read-write rfkill switch support
Currently, rfkill support for read/write rfkill switches is hacked through
a round-trip over the input layer and rfkill-input to let a driver sync
rfkill->state to hardware changes.

This is buggy and sub-optimal.  It causes real problems.  It is best to
think of the rfkill class as supporting only write-only switches at the
moment.

In order to implement the read/write functionality properly:

Add a get_state() hook that is called by the class every time it needs to
fetch the current state of the switch.  Add a call to this hook every time
the *current* state of the radio plays a role in a decision.

Also add a force_state() method that can be used to forcefully synchronize
the class' idea of the current state of the switch.  This allows for a
faster implementation of the read/write functionality, as a driver which
gets events on switch changes can avoid the need for a get_state() hook.

If the get_state() hook is left as NULL, current behaviour is maintained,
so this change is fully backwards compatible with the current rfkill
drivers.

For hardware that issues events when the rfkill state changes, leave
get_state() NULL in the rfkill struct, set the initial state properly
before registering with the rfkill class, and use the force_state() method
in the driver to keep the rfkill interface up-to-date.

get_state() can be called by the class from atomic context. It must not
sleep.
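
The two paths described above can be pictured with a small user-space
model. This is only a sketch: struct sketch_rfkill, sketch_current_state()
and sketch_force_state() are hypothetical names chosen for illustration,
not the rfkill class API, and the real hooks take kernel types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model; names follow the text, not the kernel. */
enum sketch_state { SKETCH_OFF = 0, SKETCH_ON = 1 };

struct sketch_rfkill {
    enum sketch_state state;   /* cached state held by the class */
    void *data;                /* driver private data */
    /* Optional hook: returns 0 and fills *state, nonzero on failure. */
    int (*get_state)(void *data, enum sketch_state *state);
};

/* Used by the class whenever the current state matters for a decision. */
static enum sketch_state sketch_current_state(struct sketch_rfkill *rf)
{
    enum sketch_state hw;

    assert(rf != NULL);
    /* With a NULL hook the cached value is used, i.e. the old behaviour. */
    if (rf->get_state != NULL && rf->get_state(rf->data, &hw) == 0)
        rf->state = hw;
    return rf->state;
}

/* Used by an event-driven driver to push a hardware change to the class. */
static void sketch_force_state(struct sketch_rfkill *rf,
                               enum sketch_state state)
{
    assert(rf != NULL);
    rf->state = state;
}

/* Example hook: reads the "hardware" state through the private pointer. */
static int sketch_read_hw(void *data, enum sketch_state *state)
{
    *state = *(enum sketch_state *)data;
    return 0;
}
```

A polling driver would fill in get_state (here sketch_read_hw), while a
driver that receives events would leave it NULL and call
sketch_force_state() from its event handler.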

Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Acked-by: Ivo van Doorn <IvDoorn@gmail.com>
Cc: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-26 14:21:20 -04:00
Henrique de Moraes Holschuh e954b0b85b rfkill: add parameter to disable radios by default
Currently, radios are always enabled when their rfkill interface is
registered.  This is not optimal, the safest state for a radio is to be
offline unless the user turns it on.

Add a module parameter that causes all radios to be disabled when their
rfkill interface is registered.  The module default is not changed so
unless the parameter is used, radios will still be forced to their enabled
state when they are registered.

The new rfkill module parameter is called "default_state".

Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Acked-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-26 14:21:20 -04:00
Henrique de Moraes Holschuh 28f089c184 rfkill: handle SW_RFKILL_ALL events
Teach rfkill-input how to handle SW_RFKILL_ALL events (new name for the
SW_RADIO event).

SW_RFKILL_ALL is an absolute enable-or-disable command that is tied to all
radios in a system.

Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Acked-by: Ivo van Doorn <IvDoorn@gmail.com>
Cc: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-26 14:21:19 -04:00
Henrique de Moraes Holschuh c8fcd905a5 rfkill: fix minor typo in kernel doc
Fix a minor typo in an exported function documentation

Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Acked-by: Ivo van Doorn <IvDoorn@gmail.com>
Cc: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-26 14:21:19 -04:00
Jens Axboe 15c8b6c1aa on_each_cpu(): kill unused 'retry' parameter
It's not even passed on to smp_call_function() anymore, since that
was removed. So kill it.

Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-06-26 11:24:38 +02:00
Jens Axboe 8691e5a8f6 smp_call_function: get rid of the unused nonatomic/retry argument
It's never used and the comments refer to nonatomic and retry
interchangeably. So get rid of it.

Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-06-26 11:24:35 +02:00
John W. Linville 1839cea91e Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/wireless-2.6 2008-06-25 15:17:58 -04:00
Tony Vroon 59d393ad92 mac80211: implement EU regulatory domain
Implement missing EU regulatory domain for mac80211. Based on the
information in IEEE 802.11-2007 (specifically pages 1142, 1143 & 1148)
and ETSI 301 893 (V1.4.1).
With thanks to Johannes Berg.

Signed-off-by: Tony Vroon <tony@linx.net>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-25 10:31:29 -04:00
Patrick McHardy 88a6f4ad76 netfilter: ip6table_mangle: don't reroute in LOCAL_IN
Rerouting should only happen in LOCAL_OUT, in INPUT it's useless
since the packet has already chosen its final destination.

Noticed by Alexey Dobriyan <adobriyan@gmail.com>.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-24 13:30:45 -07:00
Kevin Coffman 863a24882e gss_krb5: Use random value to initialize confounder
Initialize the value used for the confounder to a random value
rather than starting from zero.
Allow for confounders of length 8 or 16 (which will be needed for AES).

Signed-off-by: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-06-23 13:47:38 -04:00
Kevin Coffman db8add5789 gss_krb5: move gss_krb5_crypto into the krb5 module
The gss_krb5_crypto.o object belongs in the rpcsec_gss_krb5 module.
Also, there is no need to export symbols from gss_krb5_crypto.c

Signed-off-by: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-06-23 13:47:32 -04:00
Kevin Coffman d00953a53e gss_krb5: create a define for token header size and clean up ptr location
cleanup:
Document token header size with a #define instead of open-coding it.

Don't needlessly increment "ptr" past the beginning of the header
which makes the values passed to functions more understandable and
eliminates the need for extra "krb5_hdr" pointer.

Clean up some intersecting  white-space issues flagged by checkpatch.pl.

This leaves the checksum length hard-coded at 8 for DES.  A later patch
cleans that up.

Signed-off-by: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-06-23 13:47:25 -04:00
Jeff Layton a75c5d01e4 sunrpc: remove sv_kill_signal field from svc_serv struct
Since we no longer make any distinction between shutdown signals with
nfsd, then it becomes easier to just standardize on a particular signal
to use to bring it down (SIGINT, in this case).

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-06-23 13:02:49 -04:00
Jeff Layton 9867d76ca1 knfsd: convert knfsd to kthread API
This patch is rather large, but I couldn't figure out a way to break it
up that would remain bisectable. It does several things:

- change svc_thread_fn typedef to better match what kthread_create expects
- change svc_pool_map_set_cpumask to be more kthread friendly. Make it
  take a task arg and get rid of the "oldmask"
- have svc_set_num_threads call kthread_create directly
- eliminate __svc_create_thread

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-06-23 13:02:49 -04:00
Neil Brown bedbdd8bad knfsd: Replace lock_kernel with a mutex for nfsd thread startup/shutdown locking.
This removes the BKL from the RPC service creation codepath. The BKL
really isn't adequate for this job since some of this info needs
protection across sleeps.

Also, add some comments to try and clarify how the locking should work
and to make it clear that the BKL isn't necessary as long as there is
adequate locking between tasks when touching the svc_serv fields.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-06-23 13:02:49 -04:00
Ingo Molnar 1e74f9cbbb Merge branch 'linus' into core/rcu 2008-06-23 11:29:11 +02:00
Ingo Molnar a60b33cf59 Merge branch 'linus' into core/softirq 2008-06-23 10:52:59 +02:00
Eric W. Biederman b9f75f45a6 netns: Don't receive new packets in a dead network namespace.
Alexey Dobriyan <adobriyan@gmail.com> writes:
> Subject: ICMP sockets destruction vs ICMP packets oops

> After icmp_sk_exit() nuked ICMP sockets, we get an interrupt.
> icmp_reply() wants ICMP socket.
>
> Steps to reproduce:
>
> 	launch shell in new netns
> 	move real NIC to netns
> 	setup routing
> 	ping -i 0
> 	exit from shell
>
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
> IP: [<ffffffff803fce17>] icmp_sk+0x17/0x30
> PGD 17f3cd067 PUD 17f3ce067 PMD 0 
> Oops: 0000 [1] PREEMPT SMP DEBUG_PAGEALLOC
> CPU 0 
> Modules linked in: usblp usbcore
> Pid: 0, comm: swapper Not tainted 2.6.26-rc6-netns-ct #4
> RIP: 0010:[<ffffffff803fce17>]  [<ffffffff803fce17>] icmp_sk+0x17/0x30
> RSP: 0018:ffffffff8057fc30  EFLAGS: 00010286
> RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff81017c7db900
> RDX: 0000000000000034 RSI: ffff81017c7db900 RDI: ffff81017dc41800
> RBP: ffffffff8057fc40 R08: 0000000000000001 R09: 000000000000a815
> R10: 0000000000000000 R11: 0000000000000001 R12: ffffffff8057fd28
> R13: ffffffff8057fd00 R14: ffff81017c7db938 R15: ffff81017dc41800
> FS:  0000000000000000(0000) GS:ffffffff80525000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: 0000000000000000 CR3: 000000017fcda000 CR4: 00000000000006e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process swapper (pid: 0, threadinfo ffffffff8053a000, task ffffffff804fa4a0)
> Stack:  0000000000000000 ffff81017c7db900 ffffffff8057fcf0 ffffffff803fcfe4
>  ffffffff804faa38 0000000000000246 0000000000005a40 0000000000000246
>  000000000001ffff ffff81017dd68dc0 0000000000005a40 0000000055342436
> Call Trace:
>  <IRQ>  [<ffffffff803fcfe4>] icmp_reply+0x44/0x1e0
>  [<ffffffff803d3a0a>] ? ip_route_input+0x23a/0x1360
>  [<ffffffff803fd645>] icmp_echo+0x65/0x70
>  [<ffffffff803fd300>] icmp_rcv+0x180/0x1b0
>  [<ffffffff803d6d84>] ip_local_deliver+0xf4/0x1f0
>  [<ffffffff803d71bb>] ip_rcv+0x33b/0x650
>  [<ffffffff803bb16a>] netif_receive_skb+0x27a/0x340
>  [<ffffffff803be57d>] process_backlog+0x9d/0x100
>  [<ffffffff803bdd4d>] net_rx_action+0x18d/0x250
>  [<ffffffff80237be5>] __do_softirq+0x75/0x100
>  [<ffffffff8020c97c>] call_softirq+0x1c/0x30
>  [<ffffffff8020f085>] do_softirq+0x65/0xa0
>  [<ffffffff80237af7>] irq_exit+0x97/0xa0
>  [<ffffffff8020f198>] do_IRQ+0xa8/0x130
>  [<ffffffff80212ee0>] ? mwait_idle+0x0/0x60
>  [<ffffffff8020bc46>] ret_from_intr+0x0/0xf
>  <EOI>  [<ffffffff80212f2c>] ? mwait_idle+0x4c/0x60
>  [<ffffffff80212f23>] ? mwait_idle+0x43/0x60
>  [<ffffffff8020a217>] ? cpu_idle+0x57/0xa0
>  [<ffffffff8040f380>] ? rest_init+0x70/0x80
> Code: 10 5b 41 5c 41 5d 41 5e c9 c3 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 53
> 48 83 ec 08 48 8b 9f 78 01 00 00 e8 2b c7 f1 ff 89 c0 <48> 8b 04 c3 48 83 c4 08
> 5b c9 c3 66 66 66 66 66 2e 0f 1f 84 00
> RIP  [<ffffffff803fce17>] icmp_sk+0x17/0x30
>  RSP <ffffffff8057fc30>
> CR2: 0000000000000000
> ---[ end trace ea161157b76b33e8 ]---
> Kernel panic - not syncing: Aiee, killing interrupt handler!

Receiving packets while we are cleaning up a network namespace is a
racy proposition. It is possible when the packet arrives that we have
removed some but not all of the state we need to fully process it.  We
have the choice of either playing whack-a-mole with the cleanup routines
or simply dropping packets when we don't have a network namespace to
handle them.

Since the check looks inexpensive in netif_receive_skb let's just
drop the incoming packets.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-20 22:16:51 -07:00
David S. Miller 735ce972fb sctp: Make sure N * sizeof(union sctp_addr) does not overflow.
As noticed by Gabriel Campana, the kmalloc() length arg
passed in by sctp_getsockopt_local_addrs_old() can overflow
if ->addr_num is large enough.

Therefore, enforce an appropriate limit.
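
As an illustration of the class of bug, a bounded size computation can
look like the sketch below. It is written for user space; union
sketch_addr and sketch_addrs_bytes() are hypothetical stand-ins, and the
limit actually chosen by the patch may differ from the SIZE_MAX bound
used here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for union sctp_addr; only its size matters. */
union sketch_addr {
    unsigned char v4[16];
    unsigned char v6[28];
};

/*
 * Compute the number of bytes needed for addr_num addresses.
 * Returns 0 and stores the size in *bytes, or -1 if the multiplication
 * would not fit in size_t.
 */
static int sketch_addrs_bytes(size_t addr_num, size_t *bytes)
{
    assert(bytes != NULL);
    /* Divide instead of multiplying so the check itself cannot wrap. */
    if (addr_num > SIZE_MAX / sizeof(union sketch_addr))
        return -1;
    *bytes = addr_num * sizeof(union sketch_addr);
    return 0;
}
```

The caller would reject the request when -1 is returned instead of
passing a wrapped length to the allocator.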

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-20 22:04:34 -07:00
Arnd Bergmann cddf63d99d irnet_ppp: BKL pushdown
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2008-06-20 14:05:58 -06:00
Vlad Yasevich 0f474d9bc5 sctp: Kill unused variable in sctp_assoc_bh_rcv()
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-20 10:34:47 -07:00
YOSHIFUJI Hideaki f630e43a21 ipv6: Drop packets for loopback address from outside of the box.
[ Based upon original report and patch by Karsten Keil.  Karsten
  has verified that this fixes the TAHI test case "ICMPv6 test
  v6LC.5.1.2 Part F". -DaveM ]

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-19 16:33:57 -07:00
Shan Wei aea7427f70 ipv6: Remove options header when setsockopt's optlen is 0
Remove the sticky Hop-by-Hop options header by calling setsockopt()
for IPV6_HOPOPTS with a zero option length, per RFC3542.

The Routing header and Destination options header are handled the same
way as the Hop-by-Hop options header.

Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-19 16:29:39 -07:00
Ben Hutchings 4497b0763c net: Discard and warn about LRO'd skbs received for forwarding
Add skb_warn_if_lro() to test whether an skb was received with LRO and
warn if so.

Change br_forward(), ip_forward() and ip6_forward() to call it and
discard the skb if it returns true.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-19 16:22:28 -07:00
Ben Hutchings 0187bdfb05 net: Disable LRO on devices that are forwarding
Large Receive Offload (LRO) is only appropriate for packets that are
destined for the host, and should be disabled if received packets may be
forwarded.  It can also confuse the GSO on output.

Add dev_disable_lro() function which uses the appropriate ethtool ops to
disable LRO if enabled.

Add calls to dev_disable_lro() in br_add_if() and functions that enable
IPv4 and IPv6 forwarding.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-19 16:15:47 -07:00
Vlad Yasevich 2e3216cd54 sctp: Follow security requirement of responding with 1 packet
RFC 4960, Section 11.4. Protection of Non-SCTP-Capable Hosts

When an SCTP stack receives a packet containing multiple control or
DATA chunks and the processing of the packet requires the sending of
multiple chunks in response, the sender of the response chunk(s) MUST
NOT send more than one packet.  If bundling is supported, multiple
response chunks that fit into a single packet MAY be bundled together
into one single response packet.  If bundling is not supported, then
the sender MUST NOT send more than one response chunk and MUST
discard all other responses.  Note that this rule does NOT apply to a
SACK chunk, since a SACK chunk is, in itself, a response to DATA and
a SACK does not require a response of more DATA.

We implement this by not servicing our outqueue until we reach the end
of the packet.  This enables maximum bundling.  We also identify
'response' chunks and make sure that we only send 1 packet when sending
such chunks.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-19 16:08:18 -07:00
Wei Yongjun 7115e632f9 sctp: Validate Initiate Tag when handling ICMP message
This patch adds validation of the Initiate Tag and chunk type when the
Verification Tag is 0 while handling an ICMP message.

RFC 4960, Appendix C. ICMP Handling

ICMP6) An implementation MUST validate that the Verification Tag
contained in the ICMP message matches the Verification Tag of the peer.
If the Verification Tag is not 0 and does NOT match, discard the ICMP
message.  If it is 0 and the ICMP message contains enough bytes to
verify that the chunk type is an INIT chunk and that the Initiate Tag
matches the tag of the peer, continue with ICMP7.  If the ICMP message
is too short or the chunk type or the Initiate Tag does not match,
silently discard the packet.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-19 16:07:48 -07:00
David S. Miller 0344f1c66b Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	net/mac80211/tx.c
2008-06-19 16:00:04 -07:00
Johannes Berg ef3a62d272 mac80211: detect driver tx bugs
When a driver rejects a frame in its ->tx() callback, it must also
stop queues, otherwise mac80211 can go into a loop here. Detect this
situation and abort the loop after five retries, warning about the
driver bug.
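
The detection can be modelled in user space roughly as follows. This is
a sketch under assumptions: struct sketch_hw, sketch_xmit() and the
result codes are hypothetical, and only the "five retries" bound comes
from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_RETRIES 5

/* Hypothetical, reduced model of a device and its tx callback. */
struct sketch_hw {
    bool queue_stopped;
    unsigned int tx_calls;
    int (*tx)(struct sketch_hw *hw);   /* 0 = accepted, nonzero = rejected */
};

enum sketch_result { SKETCH_SENT, SKETCH_REQUEUE, SKETCH_DRIVER_BUG };

static enum sketch_result sketch_xmit(struct sketch_hw *hw)
{
    int retries = 0;

    assert(hw != NULL && hw->tx != NULL);
    for (;;) {
        hw->tx_calls++;
        if (hw->tx(hw) == 0)
            return SKETCH_SENT;
        /* A well-behaved driver stops its queue when it rejects a frame. */
        if (hw->queue_stopped)
            return SKETCH_REQUEUE;
        /* Rejected with the queue still running: retry, but not forever. */
        if (++retries > SKETCH_MAX_RETRIES)
            return SKETCH_DRIVER_BUG;
    }
}

/* Rejects every frame and never stops the queue (the bug being detected). */
static int sketch_buggy_tx(struct sketch_hw *hw)
{
    (void)hw;
    return 1;
}

/* Rejects the frame but stops the queue first, as required. */
static int sketch_polite_tx(struct sketch_hw *hw)
{
    hw->queue_stopped = true;
    return 1;
}
```

With sketch_buggy_tx the loop gives up after the initial attempt plus
five retries; with sketch_polite_tx it returns after a single call.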

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-18 15:39:48 -07:00
Patrick McHardy 6d1a3fb567 netlink: genl: fix circular locking
genetlink has a circular locking dependency when dumping the registered
families:

- dump start:
genl_rcv()            : take genl_mutex
genl_rcv_msg()        : call netlink_dump_start() while holding genl_mutex
netlink_dump_start(),
netlink_dump()        : take nlk->cb_mutex
ctrl_dumpfamily()     : try to detect this case and not take genl_mutex a
                        second time

- dump continuance:
netlink_rcv()         : call netlink_dump
netlink_dump          : take nlk->cb_mutex
ctrl_dumpfamily()     : take genl_mutex

Register genl_lock as callback mutex with netlink to fix this. This slightly
widens an already existing module unload race, the genl ops used during the
dump might go away when the module is unloaded. Thomas Graf is working on a
separate fix for this.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-18 02:07:07 -07:00
Wang Chen dad9b335c6 netdevice: Fix promiscuity and allmulti overflow
Adding a positive @inc to a promiscuity or allmulti counter that is
already at its maximum can cause an overflow.
For example: when allmulti=0xFFFFFFFF, any caller giving dev_set_allmulti() a
positive @inc will wrap the counter and switch allmulti off.
This is not what we want, though it's a rare case.
The fix is that only a negative @inc can switch allmulti or promiscuity off,
and when any caller makes the counters touch the roof, we return an error.

Change of v2:
Change void function dev_set_promiscuity/allmulti to return int.
So callers can get the overflow error.
Caller's fix will be done later.

Change of v3:
1. Since we return error to caller, we don't need to print KERN_ERROR,
KERN_WARNING is enough.
2. In dev_set_promiscuity(), if __dev_set_promiscuity() failed, we
return at once.
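
The intended counter behaviour can be sketched in user space as below.
struct sketch_dev and sketch_set_allmulti() are hypothetical names, and
clamping a too-large negative @inc to zero is an assumption made for the
sketch rather than something stated above.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical device model holding only the counter. */
struct sketch_dev {
    unsigned int allmulti;
};

/*
 * Adjust the counter by inc. A positive inc that would wrap the counter
 * is rejected with -EOVERFLOW and leaves the counter untouched. A
 * negative inc larger than the counter clamps it to zero (assumption).
 */
static int sketch_set_allmulti(struct sketch_dev *dev, int inc)
{
    assert(dev != NULL);
    if (inc >= 0) {
        if ((unsigned int)inc > UINT_MAX - dev->allmulti)
            return -EOVERFLOW;
        dev->allmulti += (unsigned int)inc;
    } else {
        /* Magnitude of inc; the unsigned negation is safe for INT_MIN. */
        unsigned int dec = 0u - (unsigned int)inc;

        dev->allmulti = dec > dev->allmulti ? 0 : dev->allmulti - dec;
    }
    return 0;
}
```

Only the negative branch can bring the counter to zero, so a positive
@inc can never switch the mode off by accident.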

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-18 01:48:28 -07:00
David S. Miller 3a5be7d4b0 Revert "mac80211: Use skb_header_cloned() on TX path."
This reverts commit 608961a5ec.

The problem is that the mac80211 stack not only needs to be able to
muck with the link-level headers, it also might need to mangle all of
the packet data if doing sw wireless encryption.

This fixes kernel bugzilla #10903.  Thanks to Didier Raboud (for the
bugzilla report), Andrew Prince (for bisecting), Johannes Berg (for
bringing this bisection analysis to my attention), and Ilpo (for
trying to analyze this purely from the TCP side).

In 2.6.27 we can take another stab at this, by using something like
skb_cow_data() when the TX path of mac80211 ends up with a non-NULL
tx->key.  The ESP protocol code in the IPSEC stack can be used as a
model for implementation.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-18 01:19:51 -07:00
Rami Rosen dd574dbfcc ipv6: minor cleanup in net/ipv6/tcp_ipv6.c [RESEND ].
In net/ipv6/tcp_ipv6.c:

  - Remove unneeded tcp_v6_send_check() declaration.

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-18 00:51:09 -07:00
David S. Miller 972692e0db net: Add sk_set_socket() helper.
In order to more easily grep for all things that set
sk->sk_socket, add sk_set_socket() helper inline function.

Suggested (although only half-seriously) by Evgeniy Polyakov.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-17 22:41:38 -07:00
Rainer Weikusat 3c73419c09 af_unix: fix 'poll for write'/ connected DGRAM sockets
The unix_dgram_sendmsg routine implements a (somewhat crude)
form of receiver-imposed flow control by comparing the length of the
receive queue of the 'peer socket' with the max_ack_backlog value
stored in the corresponding sock structure, either blocking
the thread which caused the send-routine to be called or returning
EAGAIN. This routine is used by both SOCK_DGRAM and SOCK_SEQPACKET
sockets. The poll-implementation for these socket types is
datagram_poll from core/datagram.c. A socket is deemed to be writeable
by this routine when the memory presently consumed by datagrams
owned by it is less than the configured socket send buffer size. This
is always wrong for connected PF_UNIX non-stream sockets when the
abovementioned receive queue is currently considered to be full.
'poll' will then return, indicating that the socket is writeable, but
a subsequent write results in EAGAIN, effectively causing an
(usual) application to 'poll for writeability by repeated send request
with O_NONBLOCK set' until it has consumed its time quantum.

The change below uses a suitably modified variant of the datagram_poll
routines for both types of PF_UNIX sockets, which tests if the
recv-queue of the peer a socket is connected to is presently
considered to be 'full' as part of the 'is this socket
writeable'-checking code. The socket being polled is additionally
put onto the peer_wait wait queue associated with its peer, because the
unix_dgram_sendmsg routine does a wake up on this queue after a
datagram was received and the 'other wakeup call' is done implicitly
as part of skb destruction, meaning, a process blocked in poll
because of a full peer receive queue could otherwise sleep forever
if no datagram owned by its socket was already sitting on this queue.
Among this change is a small (inline) helper routine named
'unix_recvq_full', which consolidates the actual testing code (in three
different places) into a single location.
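
A reduced user-space model of the writeability test might look like
this. The struct layout and the sketch_ names are hypothetical; only the
idea of comparing the peer's receive queue length with max_ack_backlog
is taken from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced socket model. */
struct sketch_sock {
    unsigned int recv_queue_len;    /* datagrams waiting to be received */
    unsigned int max_ack_backlog;   /* receiver-imposed limit */
    struct sketch_sock *peer;       /* connected peer, or NULL */
    bool has_send_space;            /* result of the send-buffer check */
};

/* The single place where "receive queue is full" is decided. */
static bool sketch_recvq_full(const struct sketch_sock *sk)
{
    assert(sk != NULL);
    return sk->recv_queue_len > sk->max_ack_backlog;
}

/* Writable only if there is send space and a connected peer has room. */
static bool sketch_dgram_writable(const struct sketch_sock *sk)
{
    assert(sk != NULL);
    if (!sk->has_send_space)
        return false;
    if (sk->peer != NULL && sketch_recvq_full(sk->peer))
        return false;
    return true;
}
```

An unconnected socket (peer == NULL) falls back to the send-buffer check
alone, which matches the generic datagram behaviour described earlier.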

Signed-off-by: Rainer Weikusat <rweikusat@mssgmbh.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-17 22:28:05 -07:00
David S. Miller 5bbc1722d5 Merge branch 'davem-next' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6 2008-06-17 21:37:14 -07:00
David S. Miller 30902dc3cb ax25: Fix std timer socket destroy handling.
Tihomir Heidelberg - 9a4gl, reports:

--------------------
I would like to direct you attention to one problem existing in ax.25
kernel since 2.4. If listening socket is closed and its SKB queue is
released but those sockets get weird. Those "unAccepted()" sockets
should be destroyed in ax25_std_heartbeat_expiry, but it will not
happen. And there is also a note about that in ax25_std_timer.c:
/* Magic here: If we listen() and a new link dies before it
is accepted() it isn't 'dead' so doesn't get removed. */

This issue cause ax25d to stop accepting new connections and I had to
restarted ax25d approximately each day and my services were unavailable.
Also netstat -n -l shows invalid source and device for those listening
sockets. It is strange why ax25d's listening socket get weird because of
this issue, but definitely when I solved this bug I do not have problems
with ax25d anymore and my ax25d can run for months without problems.
--------------------

Actually as far as I can see, this problem is even in releases
as far back as 2.2.x as well.

It seems senseless to special case this test on TCP_LISTEN state.
Anything still stuck in state 0 has no external references and
we can just simply kill it off directly.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-17 21:26:37 -07:00
Eric Dumazet cb61cb9b8b udp: sk_drops handling
In commits 33c732c361 ([IPV4]: Add raw
drops counter) and a92aa318b4 ([IPV6]:
Add raw drops counter), Wang Chen added raw drops counter for
/proc/net/raw & /proc/net/raw6

This patch adds this capability to UDP sockets too (/proc/net/udp &
/proc/net/udp6).

This means that 'RcvbufErrors' errors found in /proc/net/snmp can also
be examined for each udp socket.

# grep Udp: /proc/net/snmp
Udp: InDatagrams NoPorts InErrors OutDatagrams RcvbufErrors SndbufErrors
Udp: 23971006 75 899420 16390693 146348 0

# cat /proc/net/udp
 sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt  ---
uid  timeout inode ref pointer drops
 75: 00000000:02CB 00000000:0000 07 00000000:00000000 00:00000000 00000000  ---
  0        0 2358 2 ffff81082a538c80 0
111: 00000000:006F 00000000:0000 07 00000000:00000000 00:00000000 00000000  ---
  0        0 2286 2 ffff81042dd35c80 146348

In this example, only port 111 (0x006F) was flooded by messages that
user program could not read fast enough. 146348 messages were lost.
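
For reading the new column from a program, a parser along these lines
could be used. It is a sketch that assumes each entry is a single line
(the "---" wrapping above is only for the commit message) and that the
columns appear in the order shown; sketch_parse_udp_line() is a
hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Extract the local port and the trailing drops column from one data
 * line of /proc/net/udp. Returns 0 on success, -1 if the line does not
 * match (for example the header line).
 */
static int sketch_parse_udp_line(const char *line, unsigned int *port,
                                 unsigned int *drops)
{
    int n;

    assert(line != NULL && port != NULL && drops != NULL);
    /* Every field except the local port and the drops column is skipped. */
    n = sscanf(line,
               "%*d: %*x:%x %*x:%*x %*x %*x:%*x %*x:%*x %*x "
               "%*u %*u %*u %*d %*s %u",
               port, drops);
    return n == 2 ? 0 : -1;
}
```

The port is stored in hexadecimal in the file, so the 006F entry from the
example is reported as port 111.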

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-17 21:04:56 -07:00
Jay Vosburgh b8a9787edd bonding: Allow setting max_bonds to zero
Permit bonding to function rationally if max_bonds is set to
zero.  This will load the module, but create no master devices (which can
be created via sysfs).

	Requires some change to bond_create_sysfs; currently, the
netdev sysfs directory is determined from the first bonding device created,
but this is no longer possible.  Instead, an interface from net/core is
created to create and destroy files in net_class.

	Based on a patch submitted by Phil Oester <kernel@linuxaces.com>.
Modified by Jay Vosburgh to fix the sysfs issue mentioned above and to
update the documentation.

Signed-off-by: Phil Oester <kernel@linuxace.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-06-18 00:00:04 -04:00
Or Gerlitz c1da4ac752 net/core: add NETDEV_BONDING_FAILOVER event
Add NETDEV_BONDING_FAILOVER event to be used in a successive patch
by bonding to announce fail-over for the active-backup mode through the
netdev events notifier chain mechanism. Such an event can be of use for the
RDMA CM (communication manager) to let native RDMA ULPs (eg NFS-RDMA, iSER)
always be aligned with the IP stack, in the sense that they use the same
ports/links as the stack does. The event could also make monitoring
tools based on netlink events aware of bonding fail-over.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-06-17 23:59:41 -04:00
Bernard Pidoux fe2c802ab6 rose: improving AX25 routing frames via ROSE network
ROSE network is organized through nodes connected via hamradio or Internet.
AX25 packet radio frames sent to a remote ROSE address destination are routed
through these nodes.

Without the present patch, automatic routing mechanism did not work optimally
due to an improper parameter checking.

rose_get_neigh() function is called either by rose_connect() or by
rose_route_frame().

In the case of a call from rose_connect(), f0 timer is checked to find if a connection
is already pending. In that case it returns the address of the neighbour, or returns a NULL otherwise.

When called by rose_route_frame() the purpose was to route a packet AX25 frame
through an adjacent node given a destination rose address.
However, in that case, t0 timer checked does not indicate if the adjacent node
is actually connected even if the timer is not null. Thus, for each frame sent, the
function often tried to start a new connection even if the adjacent node was already connected.

The patch adds a "new" parameter that is true when the function is called by
rose_route_frame().
This instructs rose_get_neigh() to check node parameter "restarted". 
If restarted is true it means that the route to the destination address is opened via a neighbour
node already connected.
If "restarted" is false the function returns a NULL.
In that case the calling function will initiate a new connection as before.

This results in fast routing of frames, from node to node, until the
destination is reached, as originally specified by the ROSE protocol.

Signed-off-by: Bernard Pidoux <f6bvp@amsat.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-17 17:08:32 -07:00
Steffen Klassert fe833fca2e xfrm: fix fragmentation for ipv4 xfrm tunnel
When generating the ip header for the transformed packet we just copy
the frag_off field of the ip header from the original packet to the ip
header of the new generated packet. If we receive a packet as a chain
of fragments, all but the last of the new generated packets have the
IP_MF flag set. We have to mask the frag_off field to only keep the
IP_DF flag from the original packet. This got lost with git commit
36cf9acf93 ("[IPSEC]: Separate
inner/outer mode processing on output")
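
The masking itself is a one-liner. The sketch below shows it in user
space with a hypothetical reduced header struct; the SKETCH_IP_DF and
SKETCH_IP_MF values are the standard IPv4 flag bits in host byte order.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_IP_DF 0x4000   /* Don't Fragment, host byte order */
#define SKETCH_IP_MF 0x2000   /* More Fragments, host byte order */

/* Hypothetical reduced IPv4 header: only the field of interest. */
struct sketch_iphdr {
    uint16_t frag_off;        /* network byte order, flags plus offset */
};

/* Build the outer frag_off: keep DF from the inner header, drop the rest. */
static void sketch_fill_outer(const struct sketch_iphdr *inner,
                              struct sketch_iphdr *outer)
{
    assert(inner != NULL && outer != NULL);
    outer->frag_off = inner->frag_off & htons(SKETCH_IP_DF);
}
```

Copying frag_off unmasked would carry IP_MF and the fragment offset of
the inner packet into the new header, which is the bug described above.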

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-17 16:38:23 -07:00
Mitchell Blank Jr 61c33e0129 atm: use const where reasonable
From: Mitchell Blank Jr <mitch@sfgoth.com>

Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-17 16:20:06 -07:00
Randy Dunlap f586287e0f bridge: fix IPV6=n build
Fix bridge netfilter code so that it uses CONFIG_IPV6 as needed:

net/built-in.o: In function `ebt_filter_ip6':
ebt_ip6.c:(.text+0x87c37): undefined reference to `ipv6_skip_exthdr'
net/built-in.o: In function `ebt_log_packet':
ebt_log.c:(.text+0x88dee): undefined reference to `ipv6_skip_exthdr'
make[1]: *** [.tmp_vmlinux1] Error 1

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-17 16:16:13 -07:00
Stephen Hemminger 92c0574f11 bridge: make bridge address settings sticky
Normally, the bridge just chooses the smallest mac address as the
bridge id and mac address of bridge device. But if the administrator
has explicitly set the interface address then don't change it.
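
The selection rule can be sketched in user space as follows. struct
sketch_bridge, its addr_set_by_admin flag and sketch_recalc_addr() are
hypothetical names for illustration; how the bridge code records an
administrator-set address is not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_ETH_ALEN 6

/* Hypothetical bridge model: current address plus a "sticky" flag. */
struct sketch_bridge {
    unsigned char addr[SKETCH_ETH_ALEN];
    bool addr_set_by_admin;
};

/* Recompute the bridge address from its ports; true if it changed. */
static bool sketch_recalc_addr(struct sketch_bridge *br,
                               const unsigned char (*ports)[SKETCH_ETH_ALEN],
                               size_t nports)
{
    const unsigned char *best;
    size_t i;

    assert(br != NULL);
    /* An address chosen by the administrator is never overridden. */
    if (br->addr_set_by_admin || nports == 0)
        return false;
    best = ports[0];
    for (i = 1; i < nports; i++)
        if (memcmp(ports[i], best, SKETCH_ETH_ALEN) < 0)
            best = ports[i];   /* bytewise smallest MAC wins */
    if (memcmp(br->addr, best, SKETCH_ETH_ALEN) == 0)
        return false;
    memcpy(br->addr, best, SKETCH_ETH_ALEN);
    return true;
}
```

Setting the flag once when the address is assigned by hand is enough to
keep later port additions and removals from changing it.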

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-17 16:10:06 -07:00
Stephen Hemminger 43aa192011 bridge: handle process all link-local frames
Any frame addressed to link-local addresses should be processed by local
receive path. The earlier code would process them only if STP was enabled.
Since there are other frames like LACP for bonding, we should always
process them.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-17 16:09:45 -07:00
Pavel Emelyanov 3d00fb9eb1 sctp: fix error path in sctp_proc_init
After the sctp_remaddr_proc_init failed, the proper rollback is
not the sctp_remaddr_proc_exit, but the sctp_assocs_proc_exit.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-17 15:54:14 -07:00
Patrick McHardy a56b8f8158 netfilter: nf_conntrack_h323: fix module unload crash
The H.245 helper is not registered/unregistered, but assigned to
connections manually from the Q.931 helper. This means on unload
existing expectations and connections using the helper are not
cleaned up, leading to the following oops on module unload:

CPU 0 Unable to handle kernel paging request at virtual address c00a6828, epc == 802224dc, ra == 801d4e7c
Oops[#1]:
Cpu 0
$ 0   : 00000000 00000000 00000004 c00a67f0
$ 4   : 802a5ad0 81657e00 00000000 00000000
$ 8   : 00000008 801461c8 00000000 80570050
$12   : 819b0280 819b04b0 00000006 00000000
$16   : 802a5a60 80000000 80b46000 80321010
$20   : 00000000 00000004 802a5ad0 00000001
$24   : 00000000 802257a8
$28   : 802a4000 802a59e8 00000004 801d4e7c
Hi    : 0000000b
Lo    : 00506320
epc   : 802224dc ip_conntrack_help+0x38/0x74     Tainted: P
ra    : 801d4e7c nf_iterate+0xbc/0x130
Status: 1000f403    KERNEL EXL IE
Cause : 00800008
BadVA : c00a6828
PrId  : 00019374
Modules linked in: ip_nat_pptp ip_conntrack_pptp ath_pktlog wlan_acl wlan_wep wlan_tkip wlan_ccmp wlan_xauth ath_pci ath_dev ath_dfs ath_rate_atheros wlan ath_hal ip_nat_tftp ip_conntrack_tftp ip_nat_ftp ip_conntrack_ftp pppoe ppp_async ppp_deflate ppp_mppe pppox ppp_generic slhc
Process swapper (pid: 0, threadinfo=802a4000, task=802a6000)
Stack : 801e7d98 00000004 802a5a60 80000000 801d4e7c 801d4e7c 802a5ad0 00000004
        00000000 00000000 801e7d98 00000000 00000004 802a5ad0 00000000 00000010
        801e7d98 80b46000 802a5a60 80320000 80000000 801d4f8c 802a5b00 00000002
        80063834 00000000 80b46000 802a5a60 801e7d98 80000000 802ba854 00000000
        81a02180 80b7e260 81a021b0 819b0000 819b0000 80570056 00000000 00000001
        ...
Call Trace:
 [<801e7d98>] ip_finish_output+0x0/0x23c
 [<801d4e7c>] nf_iterate+0xbc/0x130
 [<801d4e7c>] nf_iterate+0xbc/0x130
 [<801e7d98>] ip_finish_output+0x0/0x23c
 [<801e7d98>] ip_finish_output+0x0/0x23c
 [<801d4f8c>] nf_hook_slow+0x9c/0x1a4

One way to fix this would be to split helper cleanup from the unregistration
function and invoke it for the H.245 helper, but since ctnetlink needs to be
able to find the helper for synchronization purposes, a better fix is to
register it normally and make sure it's not assigned to connections during
helper lookup. The missing l3num initialization is enough for this, but this
patch changes it to use AF_UNSPEC to make it more explicit.

Reported-by: liannan <liannan@twsz.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-17 15:52:32 -07:00
Patrick McHardy 8a548868db netfilter: nf_conntrack_h323: fix memory leak in module initialization error path
Properly free h323_buffer when helper registration fails.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-17 15:52:07 -07:00
Patrick McHardy 68b80f1138 netfilter: nf_nat: fix RCU races
Fix three ct_extend/NAT extension related races:

- When cleaning up the extension area and removing it from the bysource hash,
  the nat->ct pointer must not be set to NULL since it may still be used in
  a RCU read side

- When replacing a NAT extension area in the bysource hash, the nat->ct
  pointer must be assigned before performing the replacement

- When reallocating extension storage in ct_extend, the old memory must
  not be freed immediately since it may still be used by a RCU read side

Possibly fixes https://bugzilla.redhat.com/show_bug.cgi?id=449315
and/or http://bugzilla.kernel.org/show_bug.cgi?id=10875

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-17 15:51:47 -07:00
David S. Miller 48c5732f4a netrom: Kill spurious NULL'ing of sk->sk_socket.
In nr_release(), one code path calls sock_orphan() which
will NULL out sk->sk_socket already.

In the other case, handling states other than NR_STATE_{0,1,2,3},
seems to not be possible other than due to bugs.  Even for an
uninitialized nr->state value, that would be zero or NR_STATE_0.
It might be wise to stick a WARN_ON() here.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-17 03:19:58 -07:00
David S. Miller c751e4f8b3 x25: Use sock_orphan() instead of open-coded (and buggy) variant.
It doesn't grab the sk_callback_lock, it doesn't NULL out
the sk->sk_sleep waitqueue pointer, etc.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-17 03:05:13 -07:00
David S. Miller 0efffaf9d5 econet: Use sock_orphan() instead of open-coded (and buggy) variant.
It doesn't grab the sk_callback_lock, it doesn't NULL out
the sk->sk_sleep waitqueue pointer, etc.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-17 03:01:47 -07:00
David S. Miller b61d38e055 x25: Use sock_graft() and remove bogus sk_socket and sk_sleep init.
This is the x25 variant of changeset
9375cb8a12
("ax25: Use sock_graft() and remove bogus sk_socket and sk_sleep init.")

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-17 02:44:35 -07:00
David S. Miller 44ccff1f53 rose: Use sock_graft() and remove bogus sk_socket and sk_sleep init.
This is the rose variant of changeset
9375cb8a12
("ax25: Use sock_graft() and remove bogus sk_socket and sk_sleep init.")

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-17 02:39:21 -07:00
David S. Miller 7b66767f96 netrom: Use sock_graft() and remove bogus sk_socket and sk_sleep init.
This is the netrom variant of changeset
9375cb8a12
("ax25: Use sock_graft() and remove bogus sk_socket and sk_sleep init.")

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-17 02:36:44 -07:00
David S. Miller 9375cb8a12 ax25: Use sock_graft() and remove bogus sk_socket and sk_sleep init.
The way that listening sockets work in ax25 is that the packet input
code path creates new socks via ax25_make_new() and attaches them
to the incoming SKB.  This SKB gets queued up into the listening
socket's receive queue.

When accept()'d the sock gets hooked up to the real parent socket.
Alternatively, if the listening socket is closed and released, any
unborn socks stuff up in the receive queue get released.

So during this time period these sockets are unreachable in any
other way, so no wakeup events nor references to their ->sk_socket
and ->sk_sleep members can occur.  And even if they do, all such
paths have to make NULL checks.

So do not deceptively initialize them in ax25_make_new() to the
values in the listening socket.  Leave them at NULL.

Finally, use sock_graft() in ax25_accept().

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-17 02:20:54 -07:00
David S. Miller ee5850defc llc: Use sock_graft() instead of by-hand version.
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-17 01:21:03 -07:00
David S. Miller 22196d3648 decnet: Remove SOCK_SLEEP_{PRE,POST} usage.
Just expand the wait sequence.  And as a nice side-effect
the timeout is respected now.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-17 01:06:01 -07:00
David S. Miller ccc580571c wext: Emit event stream entries correctly when compat.
Four major portions to this change:

1) Add IW_EV_COMPAT_LCP_LEN, IW_EV_COMPAT_POINT_OFF,
   and IW_EV_COMPAT_POINT_LEN helper defines.

2) Delete iw_stream_check_add_*(), they are unused.

3) Add iw_request_info argument to iwe_stream_add_*(), and use it to
   size the event and pointer lengths correctly depending upon whether
   IW_REQUEST_FLAG_COMPAT is set or not.

4) The mechanical transformations to the drivers and wireless stack
   bits to get the iw_request_info passed down into the routines
   modified in #3.  Also, explicit references to IW_EV_LCP_LEN are
   replaced with iwe_stream_lcp_len(info).

With a lot of help and bug fixes from Masakazu Mokuno.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-16 18:50:49 -07:00
David S. Miller 0f5cabba49 wext: Create IW_REQUEST_FLAG_COMPAT and set it as needed.
Now low-level WEXT ioctl handlers can do compat handling
when necessary.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-16 18:34:49 -07:00
David S. Miller 87de87d5e4 wext: Dispatch and handle compat ioctls entirely in net/wireless/wext.c
Next we can kill the hacks in fs/compat_ioctl.c and also
dispatch compat ioctls down into the driver and 80211 protocol
helper layers in order to handle iw_point objects embedded in
stream replies which need to be translated.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-16 18:32:46 -07:00
David S. Miller a67fa76d8b wext: Pull top-level ioctl dispatch logic into helper function.
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-16 18:32:09 -07:00
David S. Miller d291125559 wext: Pass iwreq pointer down into standard/private handlers.
They have no need to see the object as an ifreq.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-16 18:31:55 -07:00
David S. Miller ca1e8bb8e4 wext: Parameterize the standard/private handlers.
The WEXT standard and private handlers to use are now
arguments to wireless_process_ioctl().

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-16 18:30:59 -07:00
David S. Miller 67dd760807 wext: Pull ioctl permission checking out into helper function.
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-16 18:30:47 -07:00
David S. Miller d88174e4d2 wext: Extract private call iw_point handling into seperate functions.
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-16 18:30:21 -07:00
David S. Miller 84149b0fca wext: Extract standard call iw_point handling into seperate function.
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-16 18:30:09 -07:00
David S. Miller 208887d4cc wext: Make adjust_priv_size() take a "struct iw_point *".
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-16 18:29:55 -07:00
David S. Miller 25519a2a76 wext: Remove inline from get_priv_size() and adjust_priv_size().
The compiler inlines when appropriate.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-16 18:29:40 -07:00
David S. Miller caea902f72 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/wireless/rt2x00/Kconfig
	drivers/net/wireless/rt2x00/rt2x00usb.c
	net/sctp/protocol.c
2008-06-16 18:25:48 -07:00
Eric Kinzie 7e903c2ae3 atm: [br2864] fix routed vcmux support
From: Eric Kinzie <ekinzie@cmf.nrl.navy.mil>
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-16 17:18:18 -07:00
Jorge Boncompte [DTI2] 27141666b6 atm: [br2684] Fix oops due to skb->dev being NULL
It happens that if a packet arrives in a VC between the call to open it on
the hardware and the call to change the backend to br2684, br2684_regvcc
processes the packet and oopses dereferencing skb->dev because it is
NULL before the call to br2684_push().

Signed-off-by: Jorge Boncompte [DTI2] <jorge@dti2.net>
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
2008-06-16 17:15:33 -07:00
Pavel Emelyanov 33de014c63 inet6: add struct net argument to inet6_ehashfn
Same as for inet_hashfn, prepare its ipv6 incarnation.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-16 17:13:48 -07:00
Pavel Emelyanov 9f26b3add3 inet: add struct net argument to inet_ehashfn
Although this hash takes addresses into account, the ehash chains
can also be too long when, for instance, communications via lo occur.
So, prepare the inet_hashfn to take struct net into account.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-16 17:13:27 -07:00
Pavel Emelyanov 2086a65078 inet: add struct net argument to inet_lhashfn
Listening-on-one-port sockets in many namespaces produce long 
chains in the listening_hash-es, so prepare the inet_lhashfn to 
take struct net into account.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-16 17:13:08 -07:00
Pavel Emelyanov 7f635ab71e inet: add struct net argument to inet_bhashfn
Binding to some port in many namespaces may create too long
chains in bhash-es, so prepare the hashfn to take struct net
into account.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-16 17:12:49 -07:00
Pavel Emelyanov 19c7578fb2 udp: add struct net argument to udp_hashfn
Every caller already has this one. The new argument is currently 
unused, but this will be fixed shortly.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-16 17:12:29 -07:00
Pavel Emelyanov e31634931d udp: provide a struct net pointer for __udp[46]_lib_mcast_deliver
They both calculate the hash chain, but currently do not have
a struct net pointer, so pass one there via an additional argument,
especially since their callers already have one.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-16 17:12:11 -07:00
Pavel Emelyanov d6266281f8 udp: introduce a udp_hashfn function
Currently the chain to store a UDP socket is calculated with
simple (x & (UDP_HTABLE_SIZE - 1)). But taking net into account
would make this calculation a bit more complex, so moving it into
a function would help.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-16 17:11:50 -07:00
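Illustration for the udp_hashfn commit above: the open-coded mask can be wrapped in a helper so that a namespace-dependent term can be folded in later. This is a hedged sketch only. The table size of 128, the net_mix parameter and the XOR mixing are assumptions made for the example, not the kernel's implementation.

```c
#include <stdint.h>

/* Assumed table size for this sketch; must be a power of two for the mask. */
#define SKETCH_UDP_HTABLE_SIZE 128u

/* The open-coded form quoted in the commit message: x & (SIZE - 1). */
static unsigned int udp_hash_open_coded(uint16_t port)
{
	return port & (SKETCH_UDP_HTABLE_SIZE - 1);
}

/*
 * Hypothetical namespace-aware helper. net_mix stands in for some
 * per-namespace value; passing 0 reproduces the open-coded result.
 */
static unsigned int udp_hashfn_sketch(uint32_t net_mix, uint16_t port)
{
	return (port ^ net_mix) & (SKETCH_UDP_HTABLE_SIZE - 1);
}
```

With every caller going through one function, changing how the namespace contributes to the chain index becomes a one-place change.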
Rami Rosen a9d246dbb0 ipv4: Remove unused definitions in net/ipv4/tcp_ipv4.c.
1) Remove ICMP_MIN_LENGTH, as it is unused.

2) Remove unneeded tcp_v4_send_check() declaration.

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-16 17:07:16 -07:00
Eric Dumazet 68be802cd5 raw: Restore /proc/net/raw correct behavior
I just noticed "cat /proc/net/raw" was buggy, missing '\n' separators.

I believe this was introduced by commit 8cd850efa4 
([RAW]: Cleanup IPv4 raw_seq_show.)

This trivial patch restores correct behavior, and applies to current 
Linus tree (should also be applied to stable tree as well.)

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-16 17:03:32 -07:00
Ben Hutchings 6de329e26c net: Fix test for VLAN TX checksum offload capability
Selected device feature bits can be propagated to VLAN devices, so we
can make use of TX checksum offload and TSO on VLAN-tagged packets.
However, if the physical device does not do VLAN tag insertion or
generic checksum offload then the test for TX checksum offload in
dev_queue_xmit() will see a protocol of htons(ETH_P_8021Q) and yield
false.

This splits the checksum offload test into two functions:

- can_checksum_protocol() tests a given protocol against a feature bitmask

- dev_can_checksum() first tests the skb protocol against the device
  features; if that fails and the protocol is htons(ETH_P_8021Q) then
  it tests the encapsulated protocol against the effective device
  features for VLANs

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-16 17:02:28 -07:00
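Illustration for the VLAN TX checksum commit above: the two-function split it describes can be sketched in userspace roughly as follows. The function names follow the commit message, but the structs, the feature bit values and the use of host byte order are simplifications assumed for this example. Treating "effective device features for VLANs" as features & vlan_features is also an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bits for this sketch. */
#define F_IP_CSUM	0x1u
#define F_GEN_CSUM	0x2u	/* can checksum any protocol */
#define F_IPV6_CSUM	0x4u

/* EtherTypes, kept in host byte order to keep the sketch simple. */
#define P_IP	0x0800u
#define P_IPV6	0x86DDu
#define P_8021Q	0x8100u

struct sketch_dev { uint32_t features; uint32_t vlan_features; };
struct sketch_pkt { uint16_t protocol; uint16_t encapsulated_proto; };

/* Test one protocol against one feature bitmask. */
static bool can_checksum_protocol(uint32_t features, uint16_t protocol)
{
	return (features & F_GEN_CSUM) ||
	       ((features & F_IP_CSUM) && protocol == P_IP) ||
	       ((features & F_IPV6_CSUM) && protocol == P_IPV6);
}

/* Try the outer protocol first, then look inside a VLAN tag. */
static bool dev_can_checksum(const struct sketch_dev *dev,
			     const struct sketch_pkt *pkt)
{
	if (can_checksum_protocol(dev->features, pkt->protocol))
		return true;
	if (pkt->protocol == P_8021Q)
		return can_checksum_protocol(dev->features & dev->vlan_features,
					     pkt->encapsulated_proto);
	return false;
}
```

A device that only offers IPv4 checksum offload fails the first test for a tagged frame, because the outer protocol is 802.1Q, and passes the second when the encapsulated protocol is IPv4.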
Vlad Yasevich 319fa2a24f sctp: Correclty set changeover_active for SFR-CACC
Right now, any time we set a primary transport we set
the changeover_active flag.  As a result, we invoke SFR-CACC
even when there have been no changeover events.

Only set changeover_active when there is a true changeover
event, i.e. we had a primary path and we are changing to
another transport.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-16 17:00:29 -07:00
Wei Yongjun 80896a3584 sctp: Correctly cleanup procfs entries upon failure.
This patch removes the proc fs entries that have already been created
when setting up a later proc fs entry for the SCTP protocol fails.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-16 16:59:55 -07:00
David S. Miller 93653e0448 tcp: Revert reset of deferred accept changes in 2.6.26
Ingo's system is still seeing strange behavior, and he
reports that it goes away if the rest of the deferred
accept changes are reverted too.

Therefore this reverts e4c7884028
("[TCP]: TCP_DEFER_ACCEPT updates - dont retxmt synack") and
539fae89be ("[TCP]: TCP_DEFER_ACCEPT
updates - defer timeout conflicts with max_thresh").

Just like the other revert, these ideas can be revisited for
2.6.27

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-16 16:57:40 -07:00
YOSHIFUJI Hideaki 2b4743bd6b ipv6 sit: Avoid extra need for compat layer in PRL management.
We've introduced extra need of compat layer for ip_tunnel_prl{}
for PRL (Potential Router List) management.  Though compat_ioctl
is still missing in ipv4/ipv6, let's make the interface more
straight-forward and eliminate extra need for nasty compat layer
anyway since the interface is new for 2.6.26.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-16 16:48:20 -07:00
Jesper Dangaard Brouer 47083fc073 pkt_sched: Change HTB_HYSTERESIS to a runtime parameter htb_hysteresis.
Add a htb_hysteresis parameter to sch_htb.ko and, by sysfs magic, make
it runtime adjustable via
/sys/module/sch_htb/parameters/htb_hysteresis (mode 640).

Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
Acked-by: Martin Devera <devik@cdi.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-16 16:39:32 -07:00
Jesper Dangaard Brouer f9ffcedddb pkt_sched: HTB scheduler, change default hysteresis mode to off.
The HTB hysteresis mode reduces the CPU load, but at the
cost of scheduling accuracy.

On ADSL links (512 kbit/s upstream), this inaccuracy introduces
significant jitter, enough to disturb VoIP.  For details see my
master's thesis (http://www.adsl-optimizer.dk/thesis/), chapter 7,
section 7.3.1, pp 69-70.

Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
Acked-by: Martin Devera <devik@cdi.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-16 16:38:33 -07:00
Ingo Molnar 9583f3d9c0 Merge branch 'linus' into core/softirq 2008-06-16 11:24:17 +02:00
Ingo Molnar 766d02786e Merge branch 'linus' into core/rcu 2008-06-16 11:23:36 +02:00
David S. Miller 34a5d71305 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-2.6 2008-06-14 17:33:38 -07:00
David S. Miller 942e7b102a Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-next-2.6 2008-06-14 17:15:39 -07:00
Brian Haley 7d06b2e053 net: change proto destroy method to return void
Change struct proto destroy function pointer to return void.  Noticed
by Al Viro.

Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-14 17:04:49 -07:00
Vladimir Koutny 87291c0269 mac80211: eliminate IBSS warning in rate_lowest_index()
In IBSS mode prior to join/creation of new IBSS it is possible that
a frame from unknown station is received and an ibss_add_sta() is
called. This will cause a warning in rate_lowest_index() since the
list of supported rates of our station is not initialized yet.

The fix is to add ibss stations with a rate we received that frame
at; this single-element set will be extended later based on beacon
data. Also there is no need to store stations from a foreign IBSS.

Signed-off-by: Vladimir Koutny <vlado@ksp.sk>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-14 12:18:14 -04:00
Harvey Harrison c644bce95f mac80211: tkip.c use a local struct tkip_ctx in ieee80211_get_tkip_key
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-14 12:18:14 -04:00
Harvey Harrison 7c70537f97 mac80211: tkip.c fold ieee80211_gen_rc4key into its one caller
Also change the arguments of the phase1, 2 key mixing to take
a pointer to the encryption key and the tkip_ctx in the same
order.

Do the dereference of the encryption key in the callers.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-14 12:18:13 -04:00
Harvey Harrison c801242c38 mac80211: tkip.c consolidate tkip IV writing in helper
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-14 12:18:13 -04:00
Harvey Harrison 87228f5743 mac80211: rx.c use new helpers
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-14 12:18:13 -04:00
Harvey Harrison 002aaf4ea6 mac80211: wme.c use new helpers
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-14 12:18:13 -04:00
Harvey Harrison a494bb1cae mac80211: use new helpers in util.c - ieee80211_get_bssid()
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-14 12:18:13 -04:00
Harvey Harrison d5184cacf3 mac80211: wpa.c use new access helpers
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-14 12:18:13 -04:00
Harvey Harrison 6693be7124 mac80211: add utility function to get header length
Take a __le16 directly rather than a host-endian value.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-14 12:18:13 -04:00
Harvey Harrison c9c6950c14 mac80211: make ieee80211_get_hdrlen_from_skb return unsigned
Many callers already expect it to.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-14 12:18:12 -04:00
Tomas Winkler dc0ae30c31 mac80211: fix beacon interval value
This patch fixes setting beacon interval

1. in register_hw it honors value requested by the driver
2. It uses default 100 instead of 1000 or 10000. Scanning for beacon
interval ~1sec and above is not sane

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-14 12:18:11 -04:00
Ron Rindjunsky 8d5e0d58b3 mac80211: do not fragment while aggregation is in use
This patch denies the use of fragmentation while ampdu is used.

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-14 12:18:10 -04:00
Tony Vroon d2c3cc0070 mac80211: implement EU regulatory domain
Implement missing EU regulatory domain for mac80211. Based on the
information in IEEE 802.11-2007 (specifically pages 1142, 1143 & 1148)
and ETSI 301 893 (V1.4.1).
With thanks to Johannes Berg.

Signed-off-by: Tony Vroon <tony@linx.net>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-14 12:18:03 -04:00
David S. Miller 4ae127d1b6 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/smc911x.c
2008-06-13 20:52:39 -07:00
Tomas Winkler 995ad6c5a4 mac80211: add missing new line in debug print HT_DEBUG
This patch adds '\n' in debug printk (wme.c HT DEBUG)

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-13 16:14:53 -04:00
Abhijeet Kolekar 5c5f9664d5 mac80211 : fix for iwconfig in ad-hoc mode
The patch checks the interface status; if it is in IBSS_JOINED mode,
report the cell ID it is associated with.

Signed-off-by: Abhijeet Kolekar <abhijeet.kolekar@intel.com>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-13 16:14:53 -04:00
Linus Torvalds 51558576ea Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  tcp: Revert 'process defer accept as established' changes.
  ipv6: Fix duplicate initialization of rawv6_prot.destroy
  bnx2x: Updating the Maintainer
  net: Eliminate flush_scheduled_work() calls while RTNL is held.
  drivers/net/r6040.c: correct bad use of round_jiffies()
  fec_mpc52xx: MPC52xx_MESSAGES_DEFAULT: 2nd NETIF_MSG_IFDOWN => IFUP
  ipg: fix receivemode IPG_RM_RECEIVEMULTICAST{,HASH} in ipg_nic_set_multicast_list()
  netfilter: nf_conntrack: fix ctnetlink related crash in nf_nat_setup_info()
  netfilter: Make nflog quiet when no one listen in userspace.
  ipv6: Fail with appropriate error code when setting not-applicable sockopt.
  ipv6: Check IPV6_MULTICAST_LOOP option value.
  ipv6: Check the hop limit setting in ancillary data.
  ipv6 route: Fix route lifetime in netlink message.
  ipv6 mcast: Check address family of gf_group in getsockopt(MS_FILTER).
  dccp: Bug in initial acknowledgment number assignment
  dccp ccid-3: X truncated due to type conversion
  dccp ccid-3: TFRC reverse-lookup Bug-Fix
  dccp ccid-2: Bug-Fix - Ack Vectors need to be ignored on request sockets
  dccp: Fix sparse warnings
  dccp ccid-3: Bug-Fix - Zero RTT is possible
2008-06-13 07:34:47 -07:00
David S. Miller ec0a196626 tcp: Revert 'process defer accept as established' changes.
This reverts two changesets, ec3c0982a2
("[TCP]: TCP_DEFER_ACCEPT updates - process as established") and
the follow-on bug fix 9ae27e0adb
("tcp: Fix slab corruption with ipv6 and tcp6fuzz").

This change causes several problems, first reported by Ingo Molnar
as a distcc-over-loopback regression where connections were getting
stuck.

Ilpo Järvinen first spotted the locking problems.  The new function
added by this code, tcp_defer_accept_check(), only has the
child socket locked, yet it is modifying state of the parent
listening socket.

Fixing that is non-trivial at best, because we can't simply just grab
the parent listening socket lock at this point, because it would
create an ABBA deadlock.  The normal ordering is parent listening
socket --> child socket, but this code path would require the
reverse lock ordering.

Next is a problem noticed by Vitaliy Gusev, he noted:

----------------------------------------
>--- a/net/ipv4/tcp_timer.c
>+++ b/net/ipv4/tcp_timer.c
>@@ -481,6 +481,11 @@ static void tcp_keepalive_timer (unsigned long data)
> 		goto death;
> 	}
>
>+	if (tp->defer_tcp_accept.request && sk->sk_state == TCP_ESTABLISHED) {
>+		tcp_send_active_reset(sk, GFP_ATOMIC);
>+		goto death;

Here socket sk is not attached to listening socket's request queue. tcp_done()
will not call inet_csk_destroy_sock() (and tcp_v4_destroy_sock() which should
release this sk) as socket is not DEAD. Therefore socket sk will be lost for
freeing.
----------------------------------------

Finally, Alexey Kuznetsov argues that there might not even be any
real value or advantage to these new semantics even if we fix all
of the bugs:

----------------------------------------
Hiding from accept() sockets with only out-of-order data only
is the only thing which is impossible with old approach. Is this really
so valuable? My opinion: no, this is nothing but a new loophole
to consume memory without control.
----------------------------------------

So revert this thing for now.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-12 16:34:35 -07:00
David S. Miller f23d60de71 ipv6: Fix duplicate initialization of rawv6_prot.destroy
In changeset 22dd485022
("raw: Raw socket leak.") code was added so that we
flush pending frames on raw sockets to avoid leaks.

The ipv4 part was fine, but the ipv6 part was not
done correctly.  Unlike the ipv4 side, the ipv6 code
already has a .destroy method for rawv6_prot.

So now there were two assignments to this member, and
what the compiler does is use the last one, effectively
making the ipv6 parts of that changeset a NOP.

Fix this by removing the:

	.destroy	   = inet6_destroy_sock,

line, and adding an inet6_destroy_sock() call to the
end of raw6_destroy().

Noticed by Al Viro.

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-12 16:34:34 -07:00
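Illustration for the rawv6_prot.destroy fix above: in C, when an initializer names the same member twice, the later designator overrides the earlier one, which is why the first .destroy assignment silently had no effect. The fix keeps a single .destroy and chains the generic teardown from the end of the protocol-specific one. The sketch below shows that shape with hypothetical types, names and counters; it is not the kernel code.

```c
/* Hypothetical counters so the call sequence can be observed. */
static int flushed, generic_destroyed;

struct sketch_sock { int pending; };

/* Stand-in for the generic teardown that every socket of the family needs. */
static void generic_destroy_sketch(struct sketch_sock *sk)
{
	(void)sk;
	generic_destroyed++;
}

/* Protocol-specific destroy: flush pending frames, then chain to the generic one. */
static void raw_destroy_sketch(struct sketch_sock *sk)
{
	if (sk->pending) {
		sk->pending = 0;
		flushed++;
	}
	generic_destroy_sketch(sk);
}

struct sketch_proto { void (*destroy)(struct sketch_sock *sk); };

/* Exactly one .destroy initializer, so nothing is silently overridden. */
static const struct sketch_proto raw_proto_sketch = {
	.destroy = raw_destroy_sketch,
};
```

Calling raw_proto_sketch.destroy() on a socket with pending data runs both the flush and the generic teardown, in that order.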
David S. Miller e6e30add6b Merge branch 'net-next-2.6-misc-20080612a' of git://git.linux-ipv6.org/gitroot/yoshfuji/linux-2.6-next 2008-06-11 22:33:59 -07:00
Adrian Bunk 0b04082995 net: remove CVS keywords
This patch removes CVS keywords that weren't updated for a long time
from comments.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-11 21:00:38 -07:00
David S. Miller a405657387 Merge branch 'net-2.6-misc-20080611a' of git://git.linux-ipv6.org/gitroot/yoshfuji/linux-2.6-fix 2008-06-11 18:11:16 -07:00
David S. Miller 5cb960a805 Merge branch 'master' of git://eden-feed.erg.abdn.ac.uk/net-2.6 2008-06-11 17:53:04 -07:00
Patrick McHardy ceeff7541e netfilter: nf_conntrack: fix ctnetlink related crash in nf_nat_setup_info()
When creation of a new conntrack entry in ctnetlink fails after having
set up the NAT mappings, the conntrack has an extension area allocated
that is not getting properly destroyed when freeing the conntrack again.
This means the NAT extension is still in the bysource hash, causing a
crash when walking over the hash chain the next time:

BUG: unable to handle kernel paging request at 00120fbd
IP: [<c03d394b>] nf_nat_setup_info+0x221/0x58a
*pde = 00000000
Oops: 0000 [#1] PREEMPT SMP

Pid: 2795, comm: conntrackd Not tainted (2.6.26-rc5 #1)
EIP: 0060:[<c03d394b>] EFLAGS: 00010206 CPU: 1
EIP is at nf_nat_setup_info+0x221/0x58a
EAX: 00120fbd EBX: 00120fbd ECX: 00000001 EDX: 00000000
ESI: 0000019e EDI: e853bbb4 EBP: e853bbc8 ESP: e853bb78
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process conntrackd (pid: 2795, ti=e853a000 task=f7de10f0 task.ti=e853a000)
Stack: 00000000 e853bc2c e85672ec 00000008 c0561084 63c1db4a 00000000 00000000
       00000000 0002e109 61d2b1c3 00000000 00000000 00000000 01114e22 61d2b1c3
       00000000 00000000 f7444674 e853bc04 00000008 c038e728 0000000a f7444674
Call Trace:
 [<c038e728>] nla_parse+0x5c/0xb0
 [<c0397c1b>] ctnetlink_change_status+0x190/0x1c6
 [<c0397eec>] ctnetlink_new_conntrack+0x189/0x61f
 [<c0119aee>] update_curr+0x3d/0x52
 [<c03902d1>] nfnetlink_rcv_msg+0xc1/0xd8
 [<c0390228>] nfnetlink_rcv_msg+0x18/0xd8
 [<c0390210>] nfnetlink_rcv_msg+0x0/0xd8
 [<c038d2ce>] netlink_rcv_skb+0x2d/0x71
 [<c0390205>] nfnetlink_rcv+0x19/0x24
 [<c038d0f5>] netlink_unicast+0x1b3/0x216
 ...

Move invocation of the extension destructors to nf_conntrack_free()
to fix this problem.

Fixes http://bugzilla.kernel.org/show_bug.cgi?id=10875

Reported-and-Tested-by: Krzysztof Piotr Oledzki <ole@ans.pl>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-11 17:51:10 -07:00
Eric Leblond b66985b11b netfilter: Make nflog quiet when no one listen in userspace.
The message "nf_log_packet: can't log since no backend logging module loaded
in! Please either load one, or disable logging explicitly" was displayed for
each logged packet when no userspace application is listening to nflog events.
The message seems to warn for a problem with a kernel module missing but as
said before this is not the case. I thus propose to suppress the message (I
don't see any reason to flood the log because a user application has crashed.)

Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-11 17:50:27 -07:00
YOSHIFUJI Hideaki 1717699cd5 ipv6: Fail with appropriate error code when setting not-applicable sockopt.
IPV6_MULTICAST_HOPS, for example, is not valid for stream sockets.
Since they are virtually unavailable for stream sockets,
we should return ENOPROTOOPT instead of EINVAL.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-12 09:19:09 +09:00
YOSHIFUJI Hideaki 28d4488216 ipv6: Check IPV6_MULTICAST_LOOP option value.
Only 0 and 1 are valid for IPV6_MULTICAST_LOOP socket option,
and we should return an error of EINVAL otherwise, per RFC3493.

Based on patch from Shan Wei <shanwei@cn.fujitsu.com>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-12 09:19:09 +09:00
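Illustration for the IPV6_MULTICAST_LOOP check above: the rule amounts to accepting only 0 and 1 and rejecting everything else with EINVAL. This is a minimal userspace sketch under that reading. The function name and the mc_loop out-parameter are hypothetical, not the kernel's setsockopt path.

```c
#include <errno.h>

/*
 * Store the requested loopback setting, or return -EINVAL and leave
 * *mc_loop untouched when val is anything other than 0 or 1.
 */
static int set_mcast_loop_sketch(int *mc_loop, int val)
{
	if (val != 0 && val != 1)
		return -EINVAL;
	*mc_loop = val;
	return 0;
}
```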
Shan Wei e8766fc86b ipv6: Check the hop limit setting in ancillary data.
When specifying the outgoing hop limit as ancillary data for sendmsg(),
the kernel doesn't check the integer hop limit value as specified in
[RFC-3542] section 6.3.

Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-12 09:19:08 +09:00
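Illustration for the hop limit check above: RFC 3542 section 6.3 treats -1 as "use the kernel default", accepts 0 through 255 as literal values, and rejects anything else with EINVAL. The following is a hedged sketch of such a check. The function name and the kernel_default parameter are hypothetical, and this is not the kernel's ancillary-data parser.

```c
#include <errno.h>

/*
 * Validate a hop limit supplied as ancillary data.
 * On success, *out holds the hop limit to use and 0 is returned.
 */
static int check_hoplimit_sketch(int val, int kernel_default, int *out)
{
	if (val < -1 || val > 255)
		return -EINVAL;
	*out = (val == -1) ? kernel_default : val;
	return 0;
}
```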
YOSHIFUJI Hideaki 36e3deae8b ipv6 route: Fix route lifetime in netlink message.
1) We may have a route lifetime larger than INT_MAX.
In that case we had a weird value in lifetime.
Use INT_MAX if lifetime does not fit in s32.

2) Lifetime is valid only if RTF_EXPIRES is set.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-12 09:19:08 +09:00
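Illustration for the route lifetime fix above: the two rules can be sketched as one small helper. The flag value, the function name, and the choice to report 0 for a route without RTF_EXPIRES are assumptions made for this example, not the kernel's netlink fill routine.

```c
#include <stdint.h>

/* Hypothetical flag bit standing in for RTF_EXPIRES. */
#define SKETCH_RTF_EXPIRES 0x1u

/*
 * Lifetime to report in a signed 32-bit field.
 * INT32_MAX equals INT_MAX wherever int is 32 bits wide.
 */
static int32_t lifetime_for_netlink_sketch(uint32_t flags, uint64_t remaining)
{
	if (!(flags & SKETCH_RTF_EXPIRES))
		return 0;		/* no expiry set: lifetime is not meaningful */
	if (remaining > (uint64_t)INT32_MAX)
		return INT32_MAX;	/* clamp instead of wrapping to a bogus value */
	return (int32_t)remaining;
}
```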
YOSHIFUJI Hideaki 20c61fbd8d ipv6 mcast: Check address family of gf_group in getsockopt(MS_FILTER).
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-12 09:19:08 +09:00
YOSHIFUJI Hideaki 9501f97229 tcp md5sig: Let the caller pass appropriate key for tcp_v{4,6}_do_calc_md5_hash().
As we do for other socket/timewait-socket specific parameters,
let the callers pass appropriate arguments to
tcp_v{4,6}_do_calc_md5_hash().

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-12 03:46:30 +09:00
YOSHIFUJI Hideaki 8d26d76dd4 tcp md5sig: Share most of hash calculation bits between IPv4 and IPv6.
We can share most part of the hash calculation code because
the only difference between IPv4 and IPv6 is their pseudo headers.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-12 02:38:20 +09:00
YOSHIFUJI Hideaki 076fb72233 tcp md5sig: Remove redundant protocol argument.
Protocol is always TCP, so remove useless protocol argument.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-12 02:38:19 +09:00
YOSHIFUJI Hideaki 7d5d5525bd tcp md5sig: Share MD5 Signature option parser between IPv4 and IPv6.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-12 02:38:18 +09:00
YOSHIFUJI Hideaki 81b302a321 key: Use xfrm_addr_cmp() where appropriate.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-12 02:38:17 +09:00
YOSHIFUJI Hideaki 5f95ac9111 key: Share common code path to extract address from sockaddr{}.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-12 02:38:17 +09:00
YOSHIFUJI Hideaki e5b56652c1 key: Share common code path to fill sockaddr{}.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-12 02:38:16 +09:00
YOSHIFUJI Hideaki 9e8b4ed8bb key: Introduce pfkey_sockaddr_len() for raw sockaddr{} length.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-12 02:38:15 +09:00
Benjamin Thery 3de232554a ipv6 netns: Address labels per namespace
This patch makes IPv6 address labels per network namespace.
It keeps the global label tables, ip6addrlbl_table, but
adds a 'net' member to each ip6addrlbl_entry.
This new member is taken into account when matching labels.

Changelog
=========
* v1: Initial version
* v2:
  * Minimize the penalty when network namespaces are not configured:
      *  the 'net' member is added only if CONFIG_NET_NS is
         defined. This saves space when network namespaces are not
         configured.
      * 'net' value is retrieved with the inlined function
         ip6addrlbl_net() that always returns &init_net when
         CONFIG_NET_NS is not defined.
  * 'net' member in ip6addrlbl_entry renamed to the less generic
    'lbl_net' name (helps code search).

Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-12 02:38:15 +09:00
YOSHIFUJI Hideaki 2b5ead4644 ipv6 addrconf: Introduce addrconf_is_prefix_route() helper.
This inline function, for readability, returns if the route
is a "prefix" route regardless if it was installed by RA or by
hand.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-12 02:38:14 +09:00
Rami Rosen 7d120c55df ipv6 mroute: Use MRT6_VERSION instead of MRT_VERSION in ip6mr.c.
MRT6_VERSION should be used instead of MRT_VERSION in ip6mr.c.

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-12 02:38:13 +09:00
Rami Rosen 9cba632e24 ipv6 mcast: Remove unused macro (MLDV2_QQIC) from mcast.c.
This patch removes  MLDV2_QQIC macro from mcast.c
as it is unused.

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-12 02:38:12 +09:00
Linus Torvalds f7f866eed0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (42 commits)
  net: Fix routing tables with id > 255 for legacy software
  sky2: Hold RTNL while calling dev_close()
  s2io iomem annotations
  atl1: fix suspend regression
  qeth: start dev queue after tx drop error
  qeth: Prepare-function to call s390dbf was wrong
  qeth: reduce number of kernel messages
  qeth: Use ccw_device_get_id().
  qeth: layer 3 Oops in ip event handler
  virtio: use callback on empty in virtio_net
  virtio: virtio_net free transmit skbs in a timer
  virtio: Fix typo in virtio_net_hdr comments
  virtio_net: Fix skb->csum_start computation
  ehea: set mac address fix
  sfc: Recover from RX queue flush failure
  add missing lance_* exports
  ixgbe: fix typo
  forcedeth: msi interrupts
  ipsec: pfkey should ignore events when no listeners
  pppoe: Unshare skb before anything else
  ...
2008-06-11 08:39:51 -07:00
Gerrit Renker be4c798a41 dccp: Bug in initial acknowledgment number assignment
Step 8.5 in RFC 4340 says for the newly cloned socket

           Initialize S.GAR := S.ISS,

but what in fact the code (minisocks.c) does is

           Initialize S.GAR := S.ISR,

which is wrong (typo?) -- fixed by the patch.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-06-11 11:19:10 +01:00
Gerrit Renker 7deb0f8510 dccp ccid-3: X truncated due to type conversion
This fixes a bug in computing the inter-packet-interval t_ipi = s/X: 

 scaled_div32(a, b) uses u32 for b, but in "scaled_div32(s, X)" the type of the
 sending rate `X' is u64. Since X is scaled by 2^6, this truncates rates greater
 than 2^26 Bps (~537 Mbps).

Using full 64-bit division now.
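
The truncation can be reproduced in userspace with plain integer types.
The sketch below is illustrative only; truncate_to_u32() and
rate_fits_u32() are hypothetical names and are not kernel functions.

```c
#include <stdint.h>

/* What happens when a 64-bit scaled rate is passed through a u32
 * parameter: the upper 32 bits are silently dropped. */
static uint32_t truncate_to_u32(uint64_t x_scaled)
{
	return (uint32_t)x_scaled;
}

/* X is kept scaled by 2^6, so a rate in bytes per second only fits
 * in 32 bits while (rate << 6) <= UINT32_MAX, i.e. rate < 2^26.
 * Assumes bytes_per_sec is small enough that the shift itself does
 * not overflow 64 bits. */
static int rate_fits_u32(uint64_t bytes_per_sec)
{
	return (bytes_per_sec << 6) <= UINT32_MAX;
}
```

A rate of 2^26 bytes per second is 536,870,912 bits per second, which is
the ~537 Mbps limit quoted above.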

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-06-11 11:19:10 +01:00
Gerrit Renker 1e8a287c79 dccp ccid-3: TFRC reverse-lookup Bug-Fix
This fixes a bug in the reverse lookup of p: given a value f(p), instead of p,
the function returned the smallest tabulated value f(p).

The smallest tabulated value of
	 
   10^6 * f(p) =  sqrt(2*p/3) + 12 * sqrt(3*p/8) * (32 * p^3 + p) 

for p=0.0001 is 8172. 

Since this value is scaled by 10^6, the outcome of this bug is that a loss
of 8172/10^6 = 0.8172% was reported whenever the input was below the table
resolution of 0.01%.

This means that the value was over 80 times too high, resulting in large spikes
of the initial loss interval, thus unnecessarily reducing the throughput.
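
For reference, the "over 80 times" figure follows directly from the
numbers given above (reported loss divided by the table resolution):

```latex
\frac{8172 / 10^{6}}{0.0001} = \frac{0.008172}{0.0001} = 81.72
```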

Also corrected the printk format (%u for u32).

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-06-11 11:19:10 +01:00
Gerrit Renker 65907a433a dccp ccid-2: Bug-Fix - Ack Vectors need to be ignored on request sockets
This fixes an oversight from an earlier patch, ensuring that Ack Vectors
are not processed on request sockets.

The issue is that Ack Vectors must not be parsed on request sockets, since
the Ack Vector feature depends on the selection of the (TX) CCID. During the
initial handshake the CCIDs are undefined, and so RFC 4340, 10.3 applies:

 "Using CCID-specific options and feature options during a negotiation
  for the corresponding CCID feature is NOT RECOMMENDED [...]"

And it is not even possible: when the server receives the Request from the 
client, the CCID and Ack vector features are undefined; when the Ack finalising
the 3-way handshake arrives, the request socket has not been cloned yet into a
full socket. (This order is necessary, since otherwise the newly created socket
would have to be destroyed whenever an option error occurred - a malicious
hacker could simply send garbage options and exploit this.)

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-06-11 11:19:09 +01:00
Gerrit Renker 1e2f0e5e83 dccp: Fix sparse warnings
This patch fixes the following sparse warnings:
 * nested min(max()) expression:
   net/dccp/ccids/ccid3.c:91:21: warning: symbol '__x' shadows an earlier one
   net/dccp/ccids/ccid3.c:91:21: warning: symbol '__y' shadows an earlier one
   
 * Declaration of function prototypes in .c instead of .h file, resulting in
   "should it be static?" warnings. 

 * Declared "struct dccpw" static (local to dccp_probe).
 
 * Disabled dccp_delayed_ack() - not fully removed due to RFC 4340, 11.3
   ("Receivers SHOULD implement delayed acknowledgement timers ...").

 * Used a different local variable name to avoid
   net/dccp/ackvec.c:293:13: warning: symbol 'state' shadows an earlier one
   net/dccp/ackvec.c:238:33: originally declared here

 * Removed unused functions `dccp_ackvector_print' and `dccp_ackvec_print'.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-06-11 11:19:09 +01:00
Gerrit Renker 3294f202dc dccp ccid-3: Bug-Fix - Zero RTT is possible
In commit $(825de27d9e) (from 27th May, commit
message `dccp ccid-3: Fix "t_ipi explosion" bug'), the CCID-3 window counter
computation was fixed to cope with RTTs < 4 microseconds.

Such RTTs can be found e.g. when running CCID-3 over loopback. The fix removed
a check against RTT < 4, but introduced a divide-by-zero bug.

All steady-state RTTs in DCCP are filtered using dccp_sample_rtt(), which
ensures non-zero samples. However, a zero RTT is possible on initialisation,
when there is no RTT sample from the Request/Response exchange.

The fix is to use the fallback-RTT from RFC 4340, 3.4.

This is also better than just fixing update_win_count() since it allows other
parts of the code to always assume that the RTT is non-zero during the time
that the CCID is used.
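
A rough sketch of the idea, assuming RTTs are kept in microseconds. The
names FALLBACK_RTT_USEC and rtt_or_fallback() are hypothetical; the 0.2
second value is the default RTT given in RFC 4340, 3.4.

```c
#include <stdint.h>

/* Default round-trip time from RFC 4340, section 3.4: 0.2 seconds. */
#define FALLBACK_RTT_USEC 200000u

/* Hypothetical helper: substitute the fallback when no RTT sample is
 * available yet, so later code can divide by the RTT safely. */
static uint32_t rtt_or_fallback(uint32_t rtt_usec)
{
	return rtt_usec ? rtt_usec : FALLBACK_RTT_USEC;
}
```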

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-06-11 11:19:09 +01:00
Krzysztof Piotr Oledzki 709772e6e0 net: Fix routing tables with id > 255 for legacy software
Most legacy software does not like tables > 255 as rtm_table is u8
so tb_id is sent &0xff and it is possible to mismatch for example
table 510 with table 254 (main).

This patch introduces RT_TABLE_COMPAT=252 so the code uses it if
tb_id > 255. It makes such old applications happy, new
ones are still able to use RTA_TABLE to get a proper table id.
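
The mismatch and the fix can be illustrated in a few lines. This is a
sketch, not the patch: the two helper names are hypothetical, while
RT_TABLE_COMPAT=252 and the 510 vs. 254 example come from the text
above.

```c
#include <stdint.h>

#define RT_TABLE_COMPAT 252u

/* Old behaviour: the table id is simply cut down to 8 bits. */
static uint8_t legacy_table_truncated(uint32_t tb_id)
{
	return (uint8_t)(tb_id & 0xff);
}

/* New behaviour: ids that do not fit in rtm_table are reported as
 * RT_TABLE_COMPAT instead of aliasing onto a real table. */
static uint8_t legacy_table_compat(uint32_t tb_id)
{
	return tb_id < 256 ? (uint8_t)tb_id : (uint8_t)RT_TABLE_COMPAT;
}
```

With tb_id 510, the truncated value is 254 (main), whereas the compat
value is 252.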

Signed-off-by: Krzysztof Piotr Oledzki <ole@ans.pl>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-10 15:44:49 -07:00
Thomas Graf 573bf470e6 ipv4 addr: Send netlink notification for address label changes
Makes people happy who try to keep a list of addresses up to date by
listening to notifications.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-10 15:40:04 -07:00
Jamal Hadi Salim 99c6f60e72 ipsec: pfkey should ignore events when no listeners
When pfkey has no km listeners, it still does a lot of work
before finding out there aint nobody out there.
If a tree falls in a forest and no one is around to hear it, does it make
a sound? In this case it makes a lot of noise:
With this short-circuit adding 10s of thousands of SAs using
netlink improves performance by ~10%.

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-10 14:25:34 -07:00
Arnaldo Carvalho de Melo ce4a7d0d48 inet{6}_request_sock: Init ->opt and ->pktopts in the constructor
Wei Yongjun noticed that we may call reqsk_free on request sock objects where
the opt fields may not be initialized, fix it by introducing inet_reqsk_alloc
where we initialize ->opt to NULL and set ->pktopts to NULL in
inet6_reqsk_alloc.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-10 12:39:35 -07:00
John W. Linville 9a727a250c net/mac80211/ieee80211_i.h: fix-up merge damage
These definitions were originally removed in "mac80211: remove channel
use statistics".

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-10 13:31:23 -04:00
David S. Miller 65b53e4cc9 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/tg3.c
	drivers/net/wireless/rt2x00/rt2x00dev.c
	net/mac80211/ieee80211_i.h
2008-06-10 02:22:26 -07:00
David S. Miller 788c0a5316 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-next-2.6
Conflicts:

	drivers/net/ps3_gelic_wireless.c
	drivers/net/wireless/libertas/main.c
2008-06-10 01:54:31 -07:00
Rami Rosen e64bda89b8 netfilter: {ip,ip6,nfnetlink}_queue: misc cleanups
- No need to perform data_len = 0 in the switch command, since data_len
  is initialized to 0 in the beginning of the ipq_build_packet_message()
  method.

- {ip,ip6}_queue: We can reach nlmsg_failure only from one place; skb is
  sure to be NULL when getting there; since skb is NULL, there is no need
  to check this fact and call kfree_skb().

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-09 16:00:45 -07:00
Fabian Hugelshofer e57dce60c7 netfilter: ctnetlink: include conntrack status in destroy event message
When a conntrack is destroyed, the connection status does not get
exported to netlink. I don't see a reason for not doing so. This patch
exports the status on all conntrack events.

Signed-off-by: Fabian Hugelshofer <hugelshofer2006@gmx.ch>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-09 15:59:58 -07:00
Fabian Hugelshofer 718d4ad98e netfilter: nf_conntrack: properly account terminating packets
Currently the last packet of a connection isn't accounted when it's causing
abnormal termination.

Introduces nf_ct_kill_acct() which increments the accounting counters on
conntrack kill. The new function was necessary, because there are calls
to nf_ct_kill() which don't need accounting:

nf_conntrack_proto_tcp.c line ~847:
Kills ct and returns NF_REPEAT. We don't want to count twice.

nf_conntrack_proto_tcp.c line ~880:
Kills ct and returns NF_DROP. I think we don't want to count dropped
packets.

nf_conntrack_netlink.c line ~824:
As far as I can see ctnetlink_del_conntrack() is used to destroy a
conntrack on behalf of the user. There is an sk_buff, but I don't think
this is an actual packet. Incrementing counters here is therefore not
desired.

Signed-off-by: Fabian Hugelshofer <hugelshofer2006@gmx.ch>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-09 15:59:40 -07:00
Patrick McHardy 51091764f2 netfilter: nf_conntrack: add nf_ct_kill()
Encapsulate the common

	if (del_timer(&ct->timeout))
		ct->timeout.function((unsigned long)ct);

sequence in a new function.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-09 15:59:06 -07:00
Pekka Enberg 31d8519c9c netfilter: nf_conntrack_extend: use krealloc() in nf_conntrack_extend.c V2
The ksize() API is going away because it is being abused and it doesn't even
work consistently across different allocators. Therefore, convert
net/netfilter/nf_conntrack_extend.c to use krealloc().

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-09 15:58:39 -07:00
James Morris 17e6e59f0a netfilter: ip6_tables: add ip6tables security table
This is a port of the IPv4 security table for IPv6.

Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-09 15:58:05 -07:00
James Morris 560ee653b6 netfilter: ip_tables: add iptables security table for mandatory access control rules
The following patch implements a new "security" table for iptables, so
that MAC (SELinux etc.) networking rules can be managed separately to
standard DAC rules.

This is to help with distro integration of the new secmark-based
network controls, per various previous discussions.

The need for a separate table arises from the fact that existing tools
and usage of iptables will likely clash with centralized MAC policy
management.

The SECMARK and CONNSECMARK targets will still be valid in the mangle
table to prevent breakage of existing users.

Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-09 15:57:24 -07:00
Pablo Neira Ayuso a258860e01 netfilter: ctnetlink: add full support for SCTP to ctnetlink
This patch adds full support for SCTP to ctnetlink. This includes three
new attributes: state, original vtag and reply vtag.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-09 15:56:39 -07:00
Pablo Neira Ayuso 0adf9d6748 netfilter: ctnetlink: group errors into logical errno sets
This patch groups ctnetlink errors into three logical sets:

* Malformed messages: if ctnetlink receives a message without some mandatory
attribute, then it returns EINVAL.
* Unsupported operations: if userspace tries to perform an unsupported
operation, then it returns EOPNOTSUPP.
* Unchangeable: if userspace tries to change some attribute of the
conntrack object that can only be set once, then it returns EBUSY.

This patch reduces the number of -EINVAL from 23 to 14 and it results in
5 -EBUSY and 6 -EOPNOTSUPP.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-09 15:56:20 -07:00
Kuo-lang Tseng 93f6515872 netfilter: ebtables: add IPv6 support
It implements matching functions for IPv6 address & traffic class
(merged from the patch sent by Jan Engelhardt [jengelh@computergmbh.de]
http://marc.info/?l=netfilter-devel&m=120182168424052&w=2), protocol,
and layer-4 port id. Corresponding watcher logging function is also
added for IPv6.

Signed-off-by: Kuo-lang Tseng <kuo-lang.tseng@intel.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-09 15:55:45 -07:00
Pavel Emelyanov 2e761e0532 ipv6 netns: init net is used to set bindv6only for new sock
The bindv6only is tuned via sysctl. It is already on a struct net
and per-net sysctls allow for its modification (ipv6_sysctl_net_init).

Despite this the value configured in the init net is used for the
rest of them.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-09 15:53:30 -07:00
Ursula Braun 469689a4dd af_iucv: exploit target message class support of IUCV
The first 4 bytes of data to be sent are stored additionally into
the message class field of the send request. A receiving target
program (not an af_iucv socket program) can make use of this
information to pre-screen incoming messages.

Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-09 15:51:03 -07:00
Heiko Carstens 7b9d1b22a3 iucv: prevent cpu hotplug when walking cpu_online_map.
The code used preempt_disable() to prevent cpu hotplug, however that
doesn't protect against cpus being added. So use get_online_cpus() instead.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-09 15:50:30 -07:00
Heiko Carstens f1494ed1d3 iucv: fix section mismatch warning.
WARNING: net/iucv/built-in.o(.exit.text+0x9c): Section mismatch in
reference from the function iucv_exit() to the variable
.cpuinit.data:iucv_cpu_notifier

This warning is caused by a reference from unregister_hotcpu_notifier()
from an exit function to a cpuinitdata annotated data structure.
This is a false positive warning since for the non CPU_HOTPLUG case
unregister_hotcpu_notifier() is a nop.
Use __refdata instead of __cpuinitdata to get rid of the warning.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-09 15:49:57 -07:00
Vlad Yasevich 7bfe8bdb80 sctp: Fix problems with the new SCTP_DELAYED_ACK code
The default sack frequency should be 2.  Also fix copy/paste
error when updating all transports.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-09 15:45:05 -07:00
Assaf Krauss be038b3764 mac80211: Checking IBSS support while changing channel in ad-hoc mode
This patch adds a check to the set_channel flow. When attempting to change
the channel while in IBSS mode, and the new channel does not support IBSS
mode, the flow returns with an error value with no consequences on the
mac80211 and driver state.

Signed-off-by: Assaf Krauss <assaf.krauss@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-09 15:53:37 -04:00
Dan Williams 872ba53395 mac80211: decrease IBSS creation latency
Sufficient scans (at least 2 or 3) should have been done within 7
seconds to find an existing IBSS to join.  This should improve IBSS
creation latency; and since IBSS merging is still in effect, shouldn't
have detrimental effects on eventual IBSS convergence.

Signed-off-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-09 15:51:26 -04:00
Assaf Krauss ad81b2f97d mac80211: Fixing slow IBSS rejoin
This patch fixes the issue of slow reconnection to an IBSS cell after
disconnection from it. Now the interface's bssid is reset upon ifdown.

ieee80211_sta_find_ibss:
if (found && memcmp(ifsta->bssid, bssid, ETH_ALEN) != 0 &&
	    (bss = ieee80211_rx_bss_get(dev, bssid,
					local->hw.conf.channel->center_freq,
					ifsta->ssid, ifsta->ssid_len)))

Note:
In general disconnection is still not handled properly in mac80211

Signed-off-by: Assaf Krauss <assaf.krauss@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-09 15:50:20 -04:00
Dan Williams 507b06d062 mac80211: send association event on IBSS create
Otherwise userspace has no idea the IBSS creation succeeded.

Signed-off-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-09 15:50:19 -04:00
Chris Wright ddb2c43594 asn1: additional sanity checking during BER decoding
- Don't trust a length which is greater than the working buffer.
  An invalid length could cause overflow when calculating buffer size
  for decoding oid.

- An oid length of zero is invalid and allows for an off-by-one error when
  decoding oid because the first subid actually encodes first 2 subids.

- A primitive encoding may not have an indefinite length.
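
The three checks above could look roughly like the sketch below. It is a
standalone illustration under stated assumptions: the function
ber_check_primitive(), its parameters, and the error codes are all
hypothetical and do not reflect the decoder's real interface.

```c
#include <stddef.h>

/* Hypothetical result codes for this sketch. */
enum {
	BER_OK = 0,
	BER_ERR_LENGTH = -1,      /* length exceeds the working buffer */
	BER_ERR_EMPTY_OID = -2,   /* oid with zero length */
	BER_ERR_INDEFINITE = -3   /* primitive with indefinite length */
};

/* Hypothetical helper: validate the header of a primitive element.
 *   len         - decoded content length
 *   remaining   - bytes left in the working buffer
 *   is_definite - nonzero if the length is in definite form
 *   is_oid      - nonzero if the element is an object identifier */
static int ber_check_primitive(size_t len, size_t remaining,
			       int is_definite, int is_oid)
{
	if (!is_definite)
		return BER_ERR_INDEFINITE;
	if (len > remaining)
		return BER_ERR_LENGTH;
	if (is_oid && len == 0)
		return BER_ERR_EMPTY_OID;
	return BER_OK;
}
```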

Thanks to Wei Wang from McAfee for report.

Cc: Steven French <sfrench@us.ibm.com>
Cc: stable@kernel.org
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-06-05 14:24:54 -07:00
Denis V. Lunev 9457afee85 netlink: Remove nonblock parameter from netlink_attachskb
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-05 11:23:39 -07:00
Allan Stephens 40aecb1b13 tipc: Message rejection rework preparatory changes
This patch defines a few new message header manipulation routines,
and generalizes the usefulness of another, in preparation for upcoming
rework of TIPC's message rejection code.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-04 17:54:48 -07:00
Allan Stephens 99c145939b tipc: Fix bugs in rejection of message with short header
This patch ensures that TIPC doesn't try to access non-existent
message header fields when rejecting a message with a short header.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-04 17:48:25 -07:00
Allan Stephens 9bef54383d tipc: Message header creation optimizations
This patch eliminates several cases where message header fields
were being set to the same value twice.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-04 17:47:55 -07:00
Allan Stephens bd7845337b tipc: Expand link sequence gap field to 13 bits
This patch increases the "sequence gap" field of the LINK_PROTOCOL
message header from 8 bits to 13 bits (utilizing 5 previously
unused 0 bits).  This ensures that the field is big enough to
indicate the loss of up to 8191 consecutive messages on the link,
thereby accommodating the current worst-case scenario of 4000
lost messages.
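
The field capacity follows from the bit width. In the sketch below the
helper name seq_gap_max() is hypothetical; the 8-bit and 13-bit widths
and the 4000-message worst case come from the text above.

```c
#include <stdint.h>

/* Largest value an unsigned field of the given width can hold.
 * Valid for widths from 1 to 31 bits. */
static uint32_t seq_gap_max(unsigned int bits)
{
	return (1u << bits) - 1u;
}
```

An 8-bit field tops out at 255, well below the 4000-message worst case,
while a 13-bit field reaches 8191.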

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-04 17:47:30 -07:00
Linus Torvalds 3e387fcdc4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (56 commits)
  l2tp: Fix possible oops if transmitting or receiving when tunnel goes down
  tcp: Fix for race due to temporary drop of the socket lock in skb_splice_bits.
  tcp: Increment OUTRSTS in tcp_send_active_reset()
  raw: Raw socket leak.
  lt2p: Fix possible WARN_ON from socket code when UDP socket is closed
  USB ID for Philips CPWUA054/00 Wireless USB Adapter 11g
  ssb: Fix context assertion in ssb_pcicore_dev_irqvecs_enable
  libertas: fix command size for CMD_802_11_SUBSCRIBE_EVENT
  ipw2200: expire and use oldest BSS on adhoc create
  airo warning fix
  b43legacy: Fix controller restart crash
  sctp: Fix ECN markings for IPv6
  sctp: Flush the queue only once during fast retransmit.
  sctp: Start T3-RTX timer when fast retransmitting lowest TSN
  sctp: Correctly implement Fast Recovery cwnd manipulations.
  sctp: Move sctp_v4_dst_saddr out of loop
  sctp: retran_path update bug fix
  tcp: fix skb vs fack_count out-of-sync condition
  sunhme: Cleanup use of deprecated calls to save_and_cli and restore_flags.
  xfrm: xfrm_algo: correct usage of RIPEMD-160
  ...
2008-06-04 17:39:33 -07:00
Allan Stephens 307fdf5e7d tipc: Add missing spinlock in name table display code
This patch ensures that the display code that traverses the
publication lists belonging to a name table entry take its
associated spinlock, to protect against a possible change to
one of its "head of list" pointers caused by a simultaneous
name table lookup operation by another thread of control.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-04 17:38:22 -07:00
Allan Stephens 0f15d36453 tipc: Prevent display of name table types with no publications
This patch adds a check to prevent TIPC's name table display code
from listing a name type entry if it exists only to hold subscription
info, rather than published names.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-04 17:37:59 -07:00
Allan Stephens 7571521756 tipc: Optimize message initialization routine
This patch eliminates the rarely-used "error code" argument
when initializing a TIPC message header, since the default
value of zero is the desired result in most cases; the few
exceptional cases now set the error code explicitly.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-04 17:37:34 -07:00
Allan Stephens 9c396a7bfb tipc: Prevent access of non-existent field in short message header
This patch eliminates a case where TIPC's link code could try reading
a field that is not present in a short message header.  (The random
value obtained was not being used, but the read operation could result
in an invalid memory access exception in extremely rare circumstances.)

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-04 17:36:58 -07:00
Allan Stephens 1265a02108 tipc: Minor optimizations to received message processing
This patch enhances TIPC's handler for incoming messages in two
ways:
- the trivial, single-use routine for processing non-sequenced
  messages has been merged into the main handler
- the interface that received a message is now identified without
  having to access and/or modify the associated sk_buff

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-04 17:32:35 -07:00
Allan Stephens a686e6859e tipc: Fix minor bugs in link session number handling
This patch introduces a new, out-of-range value to indicate that
a link endpoint does not have an existing session established
with its peer, eliminating the risk that the previously used
"invalid session number" value (i.e. zero) might eventually be
assigned as a valid session number and cause incorrect link
behavior.

The patch also introduces explicit bit masking when assigning a
new link session number to ensure it does not exceed 16 bits.
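
A small sketch of both ideas, assuming session numbers are held in a
32-bit variable. The names and the value chosen for INVALID_SESSION are
assumptions made for illustration; the text above only states that the
marker is out of range and that valid numbers are limited to 16 bits.

```c
#include <stdint.h>

#define SESSION_MASK    0xffffu   /* valid session numbers are 16 bits */
#define INVALID_SESSION 0x10000u  /* assumed out-of-range marker */

/* Hypothetical helper: advance the session number, wrapping at 16 bits
 * so the result can never collide with INVALID_SESSION. */
static uint32_t next_session(uint32_t prev)
{
	return (prev + 1u) & SESSION_MASK;
}

/* Hypothetical helper: zero is now an ordinary, valid session number. */
static int session_is_valid(uint32_t session)
{
	return session <= SESSION_MASK;
}
```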

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-04 17:29:39 -07:00
Allan Stephens e0d4e3d0d7 tipc: Fix bugs in message error code display when debugging
This patch corrects two problems in the display of error code
information in TIPC messages when debugging:
- no longer tries to display error code in NAME_DISTRIBUTOR
  messages, which don't have the error field
- now displays error code in 24 byte data messages, which do
  have the error field

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-04 17:29:09 -07:00
Allan Stephens 5307e46957 tipc: Standardize error checking on incoming messages via native API
This patch re-orders & re-groups the error checks performed on
messages being delivered to native API ports, in order to clarify the
similarities and differences required for the various message types.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-04 17:28:45 -07:00
Allan Stephens 84b07c1638 tipc: Fix bug in connection setup via native API
This patch fixes a bug that prevented TIPC from receiving a
connection setup request message on a native TIPC port.
The revised connection setup logic ensures that validation
of the source of a connection-based message is skipped if
the port is not yet connected to a peer.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-04 17:28:21 -07:00
Octavian Purdila 293ad60401 tcp: Fix for race due to temporary drop of the socket lock in skb_splice_bits.
skb_splice_bits temporarily drops the socket lock while iterating over
the socket queue in order to break a reverse locking condition which
happens with sendfile. This, however, opens a window of opportunity
for tcp_collapse() to aggregate skbs and thus potentially free the
current skb used in skb_splice_bits and tcp_read_sock.

This patch fixes the problem by (re-)getting the same "logical skb"
after the lock has been temporarily dropped.

Based on idea and initial patch from Evgeniy Polyakov.

Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
Acked-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-04 15:45:58 -07:00
Sridhar Samudrala 26af65cbeb tcp: Increment OUTRSTS in tcp_send_active_reset()
TCP "resets sent" counter is not incremented when a TCP Reset is 
sent via tcp_send_active_reset().

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-04 15:19:35 -07:00
Denis V. Lunev 22dd485022 raw: Raw socket leak.
The program below just leaks the raw kernel socket

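#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

int main(void) {
        int fd = socket(PF_INET, SOCK_RAW, IPPROTO_UDP);
        struct sockaddr_in addr;

        memset(&addr, 0, sizeof(addr));
        inet_aton("127.0.0.1", &addr.sin_addr);
        addr.sin_family = AF_INET;
        addr.sin_port = htons(2048);
        /* MSG_MORE corks the packet; nothing flushes it before exit */
        sendto(fd, "a", 1, MSG_MORE, (struct sockaddr *)&addr, sizeof(addr));
        return 0;
}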

Corked packet is allocated via sock_wmalloc which holds the owner socket,
so one should uncork it and flush all pending data on close. Do this in the
same way as in UDP.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-04 15:16:12 -07:00
Vlad Yasevich b9031d9d87 sctp: Fix ECN markings for IPv6
Commit e9df2e8fd8 ("[IPV6]: Use
appropriate sock tclass setting for routing lookup.") also changed the
way that ECN capable transports mark this capability in IPv6.  As a
result, SCTP was not marking ECN capability because the traffic class
was never set.  This patch brings back the markings for IPv6 traffic.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-04 12:40:15 -07:00
Vlad Yasevich 8b750ce54b sctp: Flush the queue only once during fast retransmit.
When fast retransmit is triggered by a sack, we should flush the queue
only once so that only 1 retransmit happens.  Also, since we could
potentially have non-fast-rtx chunks on the retransmit queue, we need to
make sure any chunks eligible for fast retransmit are sent first
during fast retransmission.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Tested-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-04 12:39:36 -07:00
Vlad Yasevich 62aeaff5cc sctp: Start T3-RTX timer when fast retransmitting lowest TSN
When we are trying to fast retransmit the lowest outstanding TSN, we
need to restart the T3-RTX timer, so that subsequent timeouts will
correctly tag all the packets necessary for retransmissions.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Tested-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-04 12:39:11 -07:00
Vlad Yasevich a646523481 sctp: Correctly implement Fast Recovery cwnd manipulations.
Correctly keep track of Fast Recovery state and do not reduce
congestion window multiple times during such state.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Tested-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-04 12:38:43 -07:00
Gui Jianfeng 159c6bea37 sctp: Move sctp_v4_dst_saddr out of loop
There's no need to execute sctp_v4_dst_saddr() for each
iteration, just move it out of loop.

Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-04 12:38:07 -07:00
Gui Jianfeng 4141ddc02a sctp: retran_path update bug fix
If the current retran_path is the only active one, it should
be updated to the next inactive one.

Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-04 12:37:33 -07:00
David S. Miller aed5a833fb Merge branch 'net-2.6-misc-20080605a' of git://git.linux-ipv6.org/gitroot/yoshfuji/linux-2.6-fix 2008-06-04 12:10:21 -07:00
Ilpo Järvinen a6604471db tcp: fix skb vs fack_count out-of-sync condition
This bug is able to corrupt fackets_out in very rare cases.
In order for this to cause corruption:
  1) DSACK in the middle of previous SACK block must be generated.
  2) In order to take that particular branch, part or all of the
     DSACKed segment must already be SACKed so that we have that
     in cache in the first place.
  3) The new info must be top enough so that fackets_out will be
     updated on this iteration.
...then fack_count is updated while skb wasn't, then we walk again
that particular segment thus updating fack_count twice for
a single skb and finally that value is assigned to fackets_out
by tcp_sacktag_one.

It is safe to call tcp_sacktag_one just once for a segment (at
DSACK), no need to call again for plain SACK.

Potential problems from the miscount are limited to premature entry
to recovery and to an inflated reordering metric (which could even
cancel each other out in the luckiest scenarios :-)).
Both are quite insignificant even in the worst case, and code already
exists to reset them (fackets_out once sacked_out becomes zero
and the reordering metric on RTO).

This has been reported by a number of people, but because it occurred
quite rarely, it has been very elusive. Andy Furniss was able to
get it to occur a couple of times so that a bit more info was
collected about the problem using a debug patch, though it still
required a lot of checking around. Thanks also to others who have
tried to help here.

This is listed as Bugzilla #10346. The bug was introduced by
me in commit 68f8353b48 ([TCP]: Rewrite SACK block processing & 
sack_recv_cache use), I probably thought back then that there was a
need to scan that entry twice or didn't dare to make it go
through it just once there. Going through twice would have
required restoring fack_count after the walk but as noted above,
I chose to drop the additional walk step altogether here.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-04 12:07:44 -07:00
Adrian-Ken Rueegsegger a13366c632 xfrm: xfrm_algo: correct usage of RIPEMD-160
This patch fixes the usage of RIPEMD-160 in xfrm_algo which in turn
allows hmac(rmd160) to be used as authentication mechanism in IPsec
ESP and AH (see RFC 2857).

Signed-off-by: Adrian-Ken Rueegsegger <rueegsegger@swiss-it.ch>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-04 12:04:55 -07:00
Denis V. Lunev 9596cc826e [IPV6]: Do not change protocol for UDPv6 sockets with pending sent data.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-05 04:02:38 +09:00
Denis V. Lunev 36d926b94a [IPV6]: inet_sk(sk)->cork.opt leak
IPv6 UDP sockets with IPv4 mapped address use udp_sendmsg to send the data
actually. In this case ip_flush_pending_frames should be called instead
of ip6_flush_pending_frames.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-05 04:02:38 +09:00
Denis V. Lunev 49d074f400 [IPV6]: Do not change protocol for raw IPv6 sockets.
It is not allowed to change underlying protocol for
   int fd = socket(PF_INET6, SOCK_RAW, IPPROTO_UDP);

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-05 04:02:37 +09:00
YOSHIFUJI Hideaki 91e1908f56 [IPV6] NETNS: Handle ancillary data in appropriate namespace.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-05 04:02:36 +09:00
YOSHIFUJI Hideaki 187e38384c [IPV6]: Check outgoing interface even if source address is unspecified.
The outgoing interface index (ipi6_ifindex) in IPV6_PKTINFO
ancillary data is not checked if the source address (ipi6_addr)
is unspecified.  If ipi6_ifindex names a nonexistent interface,
the call should fail.

Based on patch from Shan Wei <shanwei@cn.fujitsu.com> and
Brian Haley <brian.haley@hp.com>.

Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-05 04:02:35 +09:00
Yang Hongyang 95b496b666 [IPV6]: Fix the data length of get destination options with short length
If destination options are requested with a buffer length that is too
short for the option, getsockopt() will still return the real length of
the option, which is larger than the buffer space.
This is because ipv6_getsockopt_sticky() returns the real length of
the option.

This patch fixes this problem.
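
The intended behaviour can be sketched in plain C as follows. This is an
illustration only, not the kernel code: copy_option() and its parameters
are hypothetical names. The point is that the length reported back to the
caller should never exceed the buffer the caller supplied.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical copy-out helper: copies at most buflen bytes of the option
 * and returns the number of bytes actually written to buf. */
static size_t copy_option(void *buf, size_t buflen,
                          const void *opt, size_t optlen)
{
        /* Clamp to the caller's buffer instead of reporting optlen. */
        size_t n = optlen < buflen ? optlen : buflen;

        assert(buf != NULL && opt != NULL);
        memcpy(buf, opt, n);
        return n;
}
```

With an 8-byte option and a 4-byte buffer, this helper writes and reports
4 bytes rather than 8.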

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-05 04:02:35 +09:00
Yang Hongyang 05335c2220 [IPV6]: Fix the return value of get destination options with NULL data pointer
If we pass a NULL data buffer to getsockopt(), it will return 0,
and the option length is set to -EFAULT:
    getsockopt(sk, IPPROTO_IPV6, IPV6_DSTOPTS, NULL, &len);

This is because ipv6_getsockopt_sticky() will return -EFAULT or
-EINVAL if some error occurs.

This patch fixes this problem.

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-05 04:02:34 +09:00
YOSHIFUJI Hideaki 4bed72e4f5 [IPV6] ADDRCONF: Allow longer lifetime on 64bit archs.
- Allow longer lifetimes (>= 0x7fffffff/HZ) on 64bit archs
  by using unsigned long.
- Shadow this arithmetic overflow workaround by introducing
  helper functions: addrconf_timeout_fixup() and
  addrconf_finite_timeout().

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-05 04:02:34 +09:00
YOSHIFUJI Hideaki baa2bfb8ae [IPV4] TUNNEL4: Fix incoming packet length check for inter-protocol tunnel.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-05 04:02:33 +09:00
Colin 8283637231 [IPV6] TUNNEL6: Fix incoming packet length check for inter-protocol tunnel.
I discovered strange behavior in the [ipv4 in ipv6] tunnel. When the IPv6 tunnel
payload is less than 40(0x28), a packet can be sent to the network and received on
the physical interface, but is not seen on the IP tunnel interface. No counter
increases on the tunnel interface.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-05 04:02:32 +09:00
Thomas Graf 24ef0da7b8 [IPV6] ADDRCONF: Check range of prefix length
As of now, the prefix length is not validated when adding or deleting
addresses. The value is passed directly into the inet6_ifaddr structure
and later passed on to memcmp() as length indicator which relies on
the value never to exceed 128 (bits).

Due to the missing check, the current code allows for any 8 bit
value to be passed on as prefix length while using the netlink
interface, and any 32 bit value while using the ioctl interface.

[Use unsigned int instead to generate better code - yoshfuji]
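
A range check of this kind might look like the sketch below. The function
and macro names are made up for illustration and are not taken from the
patch; the only facts used are that a prefix length is a bit count and
must not exceed 128.

```c
#include <assert.h>
#include <errno.h>

#define IPV6_MAX_PREFIX_BITS 128u       /* an IPv6 address is 128 bits long */

/* Hypothetical validator: returns 0 if plen is usable, -EINVAL otherwise.
 * Taking an unsigned int means there is no negative case to handle. */
static int check_prefix_len(unsigned int plen)
{
        return plen > IPV6_MAX_PREFIX_BITS ? -EINVAL : 0;
}
```

Any value above 128, whether it arrives as an 8 bit or a 32 bit quantity,
is rejected before it can be used as a comparison length.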

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-05 04:02:31 +09:00
YOSHIFUJI Hideaki a3c960899e [IPV6] UDP: Possible dst leak in udpv6_sendmsg.
ip6_sk_dst_lookup returns held dst entry. It should be released
on all paths beyond this point. Add missed release when up->pending
is set.

Bug report and initial patch by Denis V. Lunev <den@openvz.org>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Acked-by: Denis V. Lunev <den@openvz.org>
2008-06-05 04:02:31 +09:00
YOSHIFUJI Hideaki e51171019b [SCTP]: Fix NULL dereference of asoc.
Commit 7cbca67c07 ("[IPV6]: Support
Source Address Selection API (RFC5014)") introduced NULL dereference
of asoc to sctp_v6_get_saddr in net/sctp/ipv6.c.
Pointed out by Johann Felix Soden <johfel@users.sourceforge.net>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-05 04:02:30 +09:00
Ilpo Järvinen 8aca6cb117 tcp: Fix inconsistency source (CA_Open only when !tcp_left_out(tp))
It is possible that this skip path causes TCP to end up into an
invalid state where ca_state was left to CA_Open while some
segments already came into sacked_out. If next valid ACK doesn't
contain new SACK information TCP fails to enter into
tcp_fastretrans_alert(). Thus at least high_seq is set
incorrectly to a too high seqno because some new data segments
could be sent in between (and also, limited transmit is not
being correctly invoked there). Reordering in both directions
can easily cause this situation to occur.

I guess we would want to use tcp_moderate_cwnd(tp) there as well
as it may be possible to use this to trigger oversized burst to
network by sending an old ACK with huge amount of SACK info, but
I'm a bit unsure about its effects (mainly to FlightSize), so to
be on the safe side I just currently fixed it minimally to keep
TCP's state consistent (obviously, such nasty ACKs have been
possible this far). Though it seems that FlightSize is already
underestimated by some amount, so probably on the long term we
might want to trigger recovery there too, if appropriate, to make
FlightSize calculation to resemble reality at the time when the
losses were discovered (but such change scares me too much now
and requires some more thinking anyway how to do that as it
likely involves some code shuffling).

This bug was found by Brian Vowell while running my TCP debug
patch to find cause of another TCP issue (fackets_out
miscount).

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-04 11:34:22 -07:00
Jarek Poplawski b9c6989646 netfilter: nf_conntrack_ipv6: fix inconsistent lock state in nf_ct_frag6_gather()
[   63.531438] =================================
[   63.531520] [ INFO: inconsistent lock state ]
[   63.531520] 2.6.26-rc4 #7
[   63.531520] ---------------------------------
[   63.531520] inconsistent {softirq-on-W} -> {in-softirq-W} usage.
[   63.531520] tcpsic6/3864 [HC0[0]:SC1[1]:HE1:SE0] takes:
[   63.531520]  (&q->lock#2){-+..}, at: [<c07175b0>] ipv6_frag_rcv+0xd0/0xbd0
[   63.531520] {softirq-on-W} state was registered at:
[   63.531520]   [<c0143bba>] __lock_acquire+0x3aa/0x1080
[   63.531520]   [<c0144906>] lock_acquire+0x76/0xa0
[   63.531520]   [<c07a8f0b>] _spin_lock+0x2b/0x40
[   63.531520]   [<c0727636>] nf_ct_frag6_gather+0x3f6/0x910
 ...

According to this and another similar lockdep report inet_fragment
locks are taken from nf_ct_frag6_gather() with softirqs enabled, but
these locks are mainly used in softirq context, so disabling BHs is
necessary.

Reported-and-tested-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-04 09:58:27 -07:00
Dong Wei d2ee3f2c4b netfilter: xt_connlimit: fix accouning when receive RST packet in ESTABLISHED state
In the xt_connlimit match module, the counter of an IP is decreased when
a TCP packet goes through the chain with ip_conntrack state TW.
This works well when the server and client close the socket
with a FIN packet. But when the client/server closes the socket with an RST
packet (using so_linger), the counter for this connection still exists.
The following patch, which is based on linux-2.6.25.4, fixes this.

Signed-off-by: Dong Wei <dwei.zh@gmail.com>
Acked-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-04 09:57:51 -07:00
Al Viro d430a227d2 bogus format in ip6mr
ptrdiff_t is %t..., not %Z...
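
For reference, a small userspace sketch of the standard C99 rule: the
't' length modifier is the one that matches ptrdiff_t. The helper name
format_offset() is invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: print the distance between two elements of the
 * same array into buf and return what snprintf() returned. */
static int format_offset(char *buf, size_t len,
                         const int *base, const int *elem)
{
        ptrdiff_t off = elem - base;    /* pointer difference is a ptrdiff_t */

        assert(buf != NULL && len > 0);
        return snprintf(buf, len, "%td", off);
}
```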

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-06-04 08:06:02 -07:00
Thomas Graf ab32cd793d route: Remove unused ifa_anycast field
The field was supposed to allow the creation of an anycast route by
assigning an anycast address to an address prefix. It was never
implemented so this field is unused and serves no purpose. Remove it.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-03 16:37:33 -07:00
Thomas Graf bc3ed28caa netlink: Improve returned error codes
Make nlmsg_trim(), nlmsg_cancel(), genlmsg_cancel(), and
nla_nest_cancel() void functions.

Return -EMSGSIZE instead of -1 if the provided message buffer is not
big enough.
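
The error convention can be illustrated with a self-contained sketch.
struct msg_buf and msg_put() are hypothetical stand-ins, not netlink
APIs; they only show a helper that reports a full buffer as -EMSGSIZE
instead of a bare -1.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

struct msg_buf {                        /* hypothetical fixed-size buffer */
        unsigned char data[16];
        size_t used;
};

/* Append len bytes; returns 0 on success or -EMSGSIZE if they do not fit.
 * On failure the buffer is left unchanged. */
static int msg_put(struct msg_buf *m, const void *src, size_t len)
{
        assert(m != NULL && m->used <= sizeof(m->data));
        if (len > sizeof(m->data) - m->used)
                return -EMSGSIZE;
        memcpy(m->data + m->used, src, len);
        m->used += len;
        return 0;
}
```

A caller can then propagate the return value unchanged, and user space
sees a meaningful errno rather than a generic failure.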

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-03 16:36:54 -07:00
Thomas Graf 1f9d11c7c9 route: Mark unused routing attributes as such
Also removes an unused policy entry for an attribute which is
only used in kernel->user direction.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-03 16:36:27 -07:00
Thomas Graf 51b77cae0d route: Mark unused route cache flags as such.
Also removes an obsolete check for the unused flag RTCF_MASQ.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-03 16:36:01 -07:00
Brice Goglin 7557af2515 net_dma: remove duplicate assignment in dma_skb_copy_datagram_iovec
No need to compute copy twice in the frags loop in
dma_skb_copy_datagram_iovec().

Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-03 16:07:45 -07:00
Stephen Hemminger b9f5f52cca net: neighbour table ABI problem
The neighbor table time of last use information is returned in the
incorrect unit. Kernel to user space ABI's need to use USER_HZ (or
milliseconds), otherwise the application has to try and discover the
real system HZ value which is problematic.  Linux has standardized on
keeping USER_HZ consistent (100hz) even when kernel is running
internally at some other value.

This change is small, but it breaks the ABI for older version of
iproute2 utilities.  But these utilities are already broken since they
are looking at the psched_hz values which are completely different. So
let's just go ahead and fix both kernel and user space. Older
utilities will just print wrong values.
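
The unit conversion itself is simple arithmetic. The sketch below is a
userspace illustration with invented names (ticks_to_user_hz(),
USER_HZ_SKETCH), not the kernel's own helper; it assumes the 100 Hz
user-visible rate mentioned above.

```c
#include <assert.h>

#define USER_HZ_SKETCH 100UL    /* user-visible tick rate, assumed 100 Hz */

/* Convert a tick count measured at kernel_hz ticks per second into
 * USER_HZ ticks, so user space never needs to know kernel_hz. */
static unsigned long ticks_to_user_hz(unsigned long ticks,
                                      unsigned long kernel_hz)
{
        assert(kernel_hz != 0);
        /* Whole seconds and remainder are scaled separately so that
         * ticks * USER_HZ_SKETCH cannot overflow for large tick counts. */
        return (ticks / kernel_hz) * USER_HZ_SKETCH
             + (ticks % kernel_hz) * USER_HZ_SKETCH / kernel_hz;
}
```

For example, 2500 ticks on a kernel running at 1000 Hz is 2.5 seconds,
which this function reports as 250 user ticks.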

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-03 16:03:15 -07:00
Pavel Emelyanov 9ecad87794 irda: Sock leak on error path in irda_create.
Bad type/protocol specified result in sk leak.

Fix is simple - release the sk if bad values are given,
but to make it possible just to call sk_free(), I move
some sk initialization a bit lower.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-03 15:18:36 -07:00
Jarek Poplawski 7dccf1f4e1 ax25: Fix NULL pointer dereference and lockup.
From: Jarek Poplawski <jarkao2@gmail.com>

There is only one function in AX25 calling skb_append(), and it really
looks suspicious: appends skb after previously enqueued one, but in
the meantime this previous skb could be removed from the queue.

This patch fixes it the simple way, so this is not fully compatible with
the current method, but testing hasn't shown any problems.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-03 14:53:46 -07:00
Dave Young 537d59af73 bluetooth: rfcomm_dev_state_change deadlock fix
There's logic in __rfcomm_dlc_close:
	rfcomm_dlc_lock(d);
	d->state = BT_CLOSED;
	d->state_changed(d, err);
	rfcomm_dlc_unlock(d);

In rfcomm_dev_state_change, it's possible that rfcomm_dev_put tries to
take the dlc lock, and then we deadlock.

Fix it by unlocking the dlc before rfcomm_dev_get in
rfcomm_dev_state_change.

Why not unlock just before rfcomm_dev_put? Because there's
another problem: rfcomm_dev_get/rfcomm_dev_del will take
rfcomm_dev_lock, but in rfcomm_dev_add the lock order is:
rfcomm_dev_lock --> dlc lock

So I unlock the dlc before rfcomm_dev_lock is taken.

Actually it's a regression caused by commit
1905f6c736 ("bluetooth :
__rfcomm_dlc_close lock fix"), the dlc state_change could be two
callbacks : rfcomm_sk_state_change and rfcomm_dev_state_change. I
missed the rfcomm_sk_state_change that time.

Thanks Arjan van de Ven <arjan@linux.intel.com> for the effort in
commit 4c8411f8c1 ("bluetooth: fix
locking bug in the rfcomm socket cleanup handling") but he missed the
rfcomm_dev_state_change lock issue.

Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-03 14:27:17 -07:00
Tomas Winkler 2d892986e8 mac80211: removing shadowed sband
This patch removes doubly defined sband variable

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-03 15:00:29 -04:00
Tomas Winkler b97e77e044 mac80211: fix unbalanced locking in ieee80211_get_buffered_bc
This patch fixes unbalanced locking in ieee80211_get_buffered_bc

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-03 15:00:29 -04:00
Pavel Roskin 2b2121417e mac80211: fix panic when using hardware WEP
e039fa4a41 ("mac80211: move TX info into
skb->cb") misplaced code for setting hardware WEP keys.  Move it back.
This fixes kernel panic in b43 if WEP is used and hardware encryption
is enabled.

Signed-off-by: Pavel Roskin <proski@gnu.org>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-03 15:00:29 -04:00
Johannes Berg 5854a32e6c mac80211: fix rate control initialisation
In commit 2e92e6f2c5 ("mac80211: use rate
index in TX control") I forgot to initialise a few new variables to -1 which
means that the rate control algorithm is never triggered and 0 is used as
the only rate index, effectively fixing the transmit bitrate at the lowest
supported.

This patch adds the missing initialisation.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Bisected-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-03 15:00:28 -04:00
Emmanuel Grumbach 9306102ea5 mac80211: allow disable FAT in specific configurations
This patch allows the FAT channel to be disabled in specific configurations.

For example the configuration (8, +1), (primary channel 8, extension
channel 12) isn't permitted in the U.S., but (8, -1), (primary channel 8,
extension channel 4) is. When the FAT channel configuration is not
permitted, the FAT channel should be reported as not supported in the
capabilities of the HT IE in the association request, and association is
performed on a 20 MHz channel.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-03 15:00:26 -04:00
Emmanuel Grumbach e623157b8d mac80211: sends HT IE to user level through wext
This patch adds the HT IE to the scan list that is returned to user level
through wext. This is useful to let wpa_supplicant know whether a BSS
supports 11n or not: WEP and TKIP are not supported in 11n.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-03 15:00:17 -04:00
Tomas Winkler b83f4e15e6 mac80211: fix deadlock in sta->lock
This patch fixes a deadlock in sta->lock use, occurring while changing
tx aggregation states, as dev_queue_xmit ends up in the new function
test_and_clear_sta_flags that uses that lock, thus leading to deadlock.

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-03 15:00:16 -04:00
Tomas Winkler 747cf5e924 mac80211: fix ieee80211_get_buffered_bc
fix bss not initialized in ieee80211_get_buffered_bc
and unbalanced locking

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-03 15:00:16 -04:00
Johannes Berg 23c0752a25 mac80211: clean up skb reallocation code
This cleans up the skb reallocation code to avoid problems with
skb->truesize, not resize an skb twice for a single output path
because we didn't expand it enough during the first copy and also
removes the code to further expand it during crypto operations
which will no longer be necessary.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-03 15:00:14 -04:00
Linus Torvalds 1beee8dc8c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (26 commits)
  llc: Fix double accounting of received packets
  netfilter: nf_conntrack_expect: fix error path unwind in nf_conntrack_expect_init()
  bluetooth: fix locking bug in the rfcomm socket cleanup handling
  mac80211: fix alignment issue with compare_ether_addr()
  mac80211: Fix for NULL pointer dereference in sta_info_get()
  mac80211: fix a typo in ieee80211_handle_filtered_frame comment
  rndis_wlan: add missing range check for power_output modparam
  iwlwifi: fix rate scale TLC column selection bug
  iwlwifi: fix exit from stay_in_table state
  rndis_wlan: Make connections to TKIP PSK networks work
  mac80211 : Fixes the status message for iwconfig
  rt2x00: Use atomic interface iteration in irq context
  rt2x00: Reset antenna RSSI after switch
  rt2x00: Don't count retries as failure
  rt2x00: Fix memleak in tx() path
  mac80211: reorder channel and freq reporting in wext scan report
  b43: Fix controller restart crash
  mac80211: fix ieee80211_rx_bss_put/get imbalance
  net/mac80211: always true conditionals
  b43: Upload both beacon templates on initial load
  ...
2008-05-30 07:45:20 -07:00
Arnaldo Carvalho de Melo 3446b9d57e llc: Fix double accounting of received packets
llc_sap_rcv was being preceded by skb_set_owner_r, then calling
llc_state_process that calls sock_queue_rcv_skb, that in turn calls
skb_set_owner_r again, causing the space the socket is allowed to use to be
leaked, which makes the socket get stuck.

Fix it by setting skb->sk at llc_sap_rcv and leave the accounting to be done
only at sock_queue_rcv_skb.

Reported-by: Dmitry Petukhov <dmgenp@gmail.com>
Tested-by: Dmitry Petukhov <dmgenp@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-30 02:57:29 -07:00
Alexey Dobriyan 12293bf911 netfilter: nf_conntrack_expect: fix error path unwind in nf_conntrack_expect_init()
Signed-off-by: Alexey Dobriyan <adobriyan@parallels.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-29 03:19:37 -07:00
David S. Miller 8c3a01d0c2 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-2.6 2008-05-29 01:49:04 -07:00
Arjan van de Ven 4c8411f8c1 bluetooth: fix locking bug in the rfcomm socket cleanup handling
in net/bluetooth/rfcomm/sock.c, rfcomm_sk_state_change() does the
following operation:

        if (parent && sock_flag(sk, SOCK_ZAPPED)) {
                /* We have to drop DLC lock here, otherwise
                 * rfcomm_sock_destruct() will dead lock. */
                rfcomm_dlc_unlock(d);
                rfcomm_sock_kill(sk);
                rfcomm_dlc_lock(d);
        }
}

which is fine, since rfcomm_sock_kill() will call sk_free() which will call
rfcomm_sock_destruct() which takes the rfcomm_dlc_lock()... so far so good.

HOWEVER, this assumes that the rfcomm_sk_state_change() function always gets
called with the rfcomm_dlc_lock() taken. This is the case for all but one
case, and in that case where we don't have the lock, we do a double unlock
followed by an attempt to take the lock, which due to underflow isn't
going anywhere fast.

This patch fixes this by moving the straggling case inside the lock, like
the other usages of the same call are doing in this code.

This was found with the help of the www.kerneloops.org project, where this
deadlock was observed 51 times at this point in time:
http://www.kerneloops.org/search.php?search=rfcomm_sock_destruct

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-29 01:32:47 -07:00
Senthil Balasubramanian c97c23e386 mac80211: fix alignment issue with compare_ether_addr()
This addresses an alignment issue with compare_ether_addr().
The addresses passed to compare_ether_addr should be two-byte aligned.
It may function properly on x86 platforms, but it may not work properly
on IA-64 or ARM processors.

This also fixes a typo in mlme.c where the sk_buff struct name is incorrect.
Though sizeof() works for any incorrect structure pointer name, as it's just
a pointer length that we want, let's just fix it.

Signed-off-by: Senthil Balasubramanian <senthilkumar@atheros.com>
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-05-28 16:43:50 -04:00
Senthil Balasubramanian 70d251b24c mac80211: Fix for NULL pointer dereference in sta_info_get()
This addresses a NULL pointer dereference in sta_info_get().
TID and sta_info are extracted in ADDBA Timer expiry function
through the timer handler's argument.

The problem is extracting the TID (which was stored in the
timer_to_tid[] array of type "u8") through an "int *" typecast, which
may also yield unwanted bytes for the MSB of TID, resulting
in incorrect sta_info and ieee80211_local pointers.

The ieee80211_local pointer is NULL, as illustrated below, so it crashes in
sta_info_get(). The problem started when extracting the ieee80211_local
pointer out of sta_info itself and eventually crashed in
sta_info_get().

The proper way to fix this is to change the data type of TID to u8
instead of u16. However, changing all the occurrences requires
some prototype changes as well. We should fix this in upcoming
patches.

Signed-off-by: Senthil Balasubramanian <senthilkumar@atheros.com>
Signed-off-by: Luis Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-05-28 16:43:49 -04:00
Yi Zhu f6d9710489 mac80211: fix a typo in ieee80211_handle_filtered_frame comment
fix a typo in ieee80211_handle_filtered_frame comment

Signed-off-by: Yi Zhu <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-05-28 16:43:49 -04:00
Abhijeet Kolekar d4231ca3e1 mac80211 : Fixes the status message for iwconfig
iwconfig was showing incorrect status messages when disassociated.
Patch fixes this by always checking for association status in
ioctl calls for getting ap address.

Signed-off-by: Abhijeet Kolekar <abhijeet.kolekar@intel.com>
Acked-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-05-28 16:43:46 -04:00
Tomas Winkler 9381be059b mac80211: reorder channel and freq reporting in wext scan report
This patch switches the order of channel and freq (SIOCGIWFREQ) reports
in scan results in order to overcome wpa_supplicant's inability
to handle channel numbers in the 5.2 GHz band.
Wext reporting of the channel number is ambiguous, as channels 7-12 (802.11j)
exist on both bands.

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Acked-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-05-28 16:43:43 -04:00
Tomas Winkler 167ad6f7a2 mac80211: fix ieee80211_rx_bss_put/get imbalance
This patch fixes ieee80211_rx_bss_put/get imbalance
introduced by 'mac80211: enable IBSS merging' patch.

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-05-28 16:43:42 -04:00
Nicolas Kaiser 679fda1aa4 net/mac80211: always true conditionals
Correct always true conditionals.

Signed-off-by: Nicolas Kaiser <nikai@nikai.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-05-28 16:43:41 -04:00
Gerrit Renker 825de27d9e dccp ccid-3: Fix "t_ipi explosion" bug
The identification of this bug is thanks to Cheng Wei and Tomasz
Grobelny.

To avoid divide-by-zero, the implementation previously ignored RTTs
smaller than 4 microseconds when performing integer division RTT/4.

When the RTT reached a value less than 4 microseconds (as observed on
loopback), this prevented the Window Counter CCVal value from
advancing. As a result, the receiver stopped sending feedback. This in
turn caused non-ending expiries of the nofeedback timer at the sender,
so that the sending rate was progressively reduced until reaching the
minimum of one packet per 64 seconds.

The patch fixes this bug by handling integer division more
intelligently. Due to consistent use of dccp_sample_rtt(),
divide-by-zero-RTT is avoided.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-27 06:33:54 -07:00
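The stall described in the dccp ccid-3 commit above comes from how the number of elapsed quarter-RTTs is computed. The following is a minimal sketch, not the patch itself: both function names are hypothetical, and the second variant (multiply before dividing) is only one plausible way to handle the division more carefully. It assumes the caller guarantees a nonzero RTT sample, which the commit attributes to consistent use of dccp_sample_rtt().

```c
#include <assert.h>
#include <stdint.h>

/*
 * Behaviour described in the commit message: RTTs below 4 us are
 * ignored so that rtt_us / 4 never becomes zero. The result is that
 * the window counter never advances for such RTTs.
 */
static uint64_t sketch_quarter_rtts_skip_small(uint32_t delta_us, uint32_t rtt_us)
{
	if (rtt_us < 4)
		return 0;
	return delta_us / (rtt_us / 4);
}

/*
 * Alternative sketch: scale the elapsed time by 4 first and divide by
 * the full RTT. Only a zero RTT is a problem here, and the caller is
 * assumed to rule that out.
 */
static uint64_t sketch_quarter_rtts_scaled(uint32_t delta_us, uint32_t rtt_us)
{
	assert(rtt_us > 0);
	return (4 * (uint64_t)delta_us) / rtt_us;
}
```

With an RTT of 3 us and 10 us elapsed, the first variant reports no progress at all, while the second reports 13 quarter-RTTs, so a counter driven by it keeps advancing.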
Wei Yongjun 6079a463cf dccp: Fix to handle short sequence numbers packet correctly
RFC 4340 says:
  8.5.  Pseudocode
       ...
       If P.type is not Data, Ack, or DataAck and P.X == 0 (the packet
             has short sequence numbers), drop packet and return

However, DCCP handles packets with short sequence numbers incorrectly:
it currently drops a packet only if P.type is Data, Ack, or DataAck and
P.X == 0.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-27 06:22:38 -07:00
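The RFC 4340 rule quoted in the dccp short sequence number commit above can be written as a small predicate. This is a sketch with hypothetical names, not the kernel code; the packet type values follow the RFC 4340 type numbering.

```c
#include <assert.h>
#include <stdbool.h>

/* DCCP packet types, numbered as in RFC 4340 (names are local to this sketch). */
enum sketch_dccp_type {
	SKETCH_PKT_REQUEST  = 0,
	SKETCH_PKT_RESPONSE = 1,
	SKETCH_PKT_DATA     = 2,
	SKETCH_PKT_ACK      = 3,
	SKETCH_PKT_DATAACK  = 4,
	SKETCH_PKT_CLOSEREQ = 5,
	SKETCH_PKT_CLOSE    = 6,
	SKETCH_PKT_RESET    = 7,
	SKETCH_PKT_SYNC     = 8,
	SKETCH_PKT_SYNCACK  = 9
};

/* Only Data, Ack and DataAck packets may carry short sequence numbers. */
static bool sketch_may_use_short_seq(enum sketch_dccp_type type)
{
	return type == SKETCH_PKT_DATA ||
	       type == SKETCH_PKT_ACK ||
	       type == SKETCH_PKT_DATAACK;
}

/*
 * Return true when the packet must be dropped: X == 0 (short sequence
 * numbers) on a type that is NOT Data, Ack or DataAck. The bug described
 * in the commit corresponds to leaving out the negation.
 */
static bool sketch_drop_short_seq(enum sketch_dccp_type type, unsigned int x)
{
	assert(x <= 1);	/* X is a single header bit */
	return !sketch_may_use_short_seq(type) && x == 0;
}
```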
Linus Torvalds c5e6fd28e5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (52 commits)
  vlan: Use bitmask of feature flags instead of seperate feature bits
  fmvj18x_cs: add NextCom NC5310 rev B support
  xirc2ps_cs: re-initialize the multicast address in do_reset
  3C509: rx_bytes should not be increased when alloc_skb failed
  NETFRONT: Use __skb_queue_purge()
  VIRTIO: Use __skb_queue_purge()
  phylib: do EXPORT_SYMBOL on get_phy_id
  netlink: Fix nla_parse_nested_compat() to call nla_parse() directly
  WAN: protect HDLC proto list while insmod/rmmod
  drivers/net/fs_enet: remove null pointer dereference
  S2io: Version update for napi and MSI-X patches
  S2io: Added napi support when MSIX is enabled.
  S2io: Move all the transmit completions to a single msi-x (alarm) vector
  drivers/net/ehea - remove unnecessary memset after kzalloc
  au1000_eth: remove useless check
  Blackfin EMAC Driver: Removed duplicated include <linux/ethtool.h>
  cpmac bugfixes and enhancements
  e1000e: use resource_size_t, not unsigned long, for phys addrs
  net/usb: add support for Apple USB Ethernet Adapter
  uli526x: add support for netpoll
  ...
2008-05-26 10:14:02 -07:00
Alan Cox 5406460098 irda: Push BKL down into irda ioctl handlers
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-25 23:43:11 -07:00
Alan Cox 866988edac wanrouter: Push down BKL
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-25 23:41:40 -07:00
David S. Miller 43154d08d6 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/cpmac.c
	net/mac80211/mlme.c
2008-05-25 23:26:10 -07:00
Carlos R. Mafra 962cf36c5b Remove argument from open_softirq which is always NULL
As git-grep shows, open_softirq() is always called with the last argument
being NULL

block/blk-core.c:       open_softirq(BLOCK_SOFTIRQ, blk_done_softirq, NULL);
kernel/hrtimer.c:       open_softirq(HRTIMER_SOFTIRQ, run_hrtimer_softirq, NULL);
kernel/rcuclassic.c:    open_softirq(RCU_SOFTIRQ, rcu_process_callbacks, NULL);
kernel/rcupreempt.c:    open_softirq(RCU_SOFTIRQ, rcu_process_callbacks, NULL);
kernel/sched.c: open_softirq(SCHED_SOFTIRQ, run_rebalance_domains, NULL);
kernel/softirq.c:       open_softirq(TASKLET_SOFTIRQ, tasklet_action, NULL);
kernel/softirq.c:       open_softirq(HI_SOFTIRQ, tasklet_hi_action, NULL);
kernel/timer.c: open_softirq(TIMER_SOFTIRQ, run_timer_softirq, NULL);
net/core/dev.c: open_softirq(NET_TX_SOFTIRQ, net_tx_action, NULL);
net/core/dev.c: open_softirq(NET_RX_SOFTIRQ, net_rx_action, NULL);

This observation has already been made by Matthew Wilcox in June 2002
(http://www.cs.helsinki.fi/linux/linux-kernel/2002-25/0687.html)

"I notice that none of the current softirq routines use the data element
passed to them."

and the situation hasn't changed since then. So it appears we can safely
remove that extra argument to save 128 (54) bytes of kernel data (text).

Signed-off-by: Carlos R. Mafra <crmafra@ift.unesp.br>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-25 07:43:15 +02:00
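The shape of the interface after the open_softirq commit above can be modelled in user space. This is only an illustrative model under stated assumptions, not the kernel source: the table size, the run helper, and the counter are hypothetical, and the softirq numbers are local to the sketch.

```c
#include <assert.h>

/* Softirq numbers for this model only; the kernel's list is longer. */
enum { MODEL_NET_TX_SOFTIRQ, MODEL_NET_RX_SOFTIRQ, MODEL_NR_SOFTIRQS };

/* With the always-NULL data pointer gone, an entry holds just the handler. */
struct model_softirq_action {
	void (*action)(struct model_softirq_action *);
};

static struct model_softirq_action model_softirq_vec[MODEL_NR_SOFTIRQS];

/* Registration takes only the softirq number and the handler. */
static void model_open_softirq(int nr, void (*action)(struct model_softirq_action *))
{
	assert(nr >= 0 && nr < MODEL_NR_SOFTIRQS);
	model_softirq_vec[nr].action = action;
}

/* Hypothetical helper: invoke the handler registered for nr, if any. */
static void model_run_softirq(int nr)
{
	assert(nr >= 0 && nr < MODEL_NR_SOFTIRQS);
	if (model_softirq_vec[nr].action)
		model_softirq_vec[nr].action(&model_softirq_vec[nr]);
}

/* Example handler that counts how often it ran. */
static int model_rx_runs;

static void model_net_rx_action(struct model_softirq_action *h)
{
	(void)h;	/* handlers in this model do not need per-entry data */
	model_rx_runs++;
}
```

A caller would register with model_open_softirq(MODEL_NET_RX_SOFTIRQ, model_net_rx_action), mirroring the two-argument form that remains once the third argument is dropped.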
Mike Travis 3f9b48a758 net: Pass reference to cpumask variable in net/sunrpc/svc.c
* Pass reference to cpumask variable instead of using stack.

For inclusion into sched-devel/latest tree.

Based on:
	git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
    +   sched-devel/latest  .../mingo/linux-2.6-sched-devel.git

Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-23 18:44:36 +02:00
Mike Travis 0e12f848b3 net: use performance variant for_each_cpu_mask_nr
Change references from for_each_cpu_mask to for_each_cpu_mask_nr
where appropriate

Reviewed-by: Paul Jackson <pj@sgi.com>
Reviewed-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-23 18:35:12 +02:00
Patrick McHardy 289c79a4bd vlan: Use bitmask of feature flags instead of seperate feature bits
Herbert Xu points out that the use of separate feature bits for features
to be propagated to VLAN devices is going to get messy real soon.
Replace the VLAN feature bits by a bitmask of feature flags to be
propagated and restore the old GSO_SHIFT/MASK values.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-23 00:27:50 -07:00
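The idea in the vlan feature-flag commit above, a per-device bitmask naming which features may be propagated to VLAN devices, can be sketched as follows. The struct, the flag names and their values are hypothetical stand-ins chosen for illustration; they are not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative feature flags (values are arbitrary for this sketch). */
#define SKETCH_F_SG	(1u << 0)
#define SKETCH_F_CSUM	(1u << 1)
#define SKETCH_F_TSO	(1u << 2)

/* Simplified device: its own features plus a mask of VLAN-propagated ones. */
struct sketch_dev {
	uint32_t features;	/* what the device supports */
	uint32_t vlan_features;	/* which of those a VLAN device may inherit */
};

/* A VLAN device inherits only the features that appear in both masks. */
static uint32_t sketch_vlan_inherited(const struct sketch_dev *real_dev)
{
	assert(real_dev != NULL);
	return real_dev->features & real_dev->vlan_features;
}
```

Adding another propagated feature in this model means setting one more bit in the mask, rather than defining a second dedicated bit for each feature.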
Ingo Molnar 2ba4cc319a rcu: fix nf_conntrack_helper.c build bug
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-22 10:08:38 +02:00
Linus Torvalds a0abb93bf9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  net: The world is not perfect patch.
  tcp: Make prior_ssthresh a u32
  xfrm_user: Remove zero length key checks.
  net/ipv4/arp.c: Use common hex_asc helpers
  cassini: Only use chip checksum for ipv4 packets.
  tcp: TCP connection times out if ICMP frag needed is delayed
  netfilter: Move linux/types.h inclusions outside of #ifdef __KERNEL__
  af_key: Fix selector family initialization.
  libertas: Fix ethtool statistics
  mac80211: fix NULL pointer dereference in ieee80211_compatible_rates
  mac80211: don't claim iwspy support
  orinoco_cs: add ID for SpeedStream wireless adapters
  hostap_cs: add ID for Conceptronic CON11CPro
  rtl8187: resource leak in error case
  ath5k: Fix loop variable initializations
2008-05-21 22:14:39 -07:00
Johannes Berg 9e72ebd686 mac80211: remove channel use statistics
The useless channel use statistics are quite a lot of code, currently
use integer divisions in the packet fast path, are rather inaccurate
since they do not account for retries and finally nobody even cares.
Hence, remove them completely.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-05-21 21:48:17 -04:00
Johannes Berg e253008360 mac80211: use multi-queue master netdevice
This patch updates mac80211 and drivers to be multi-queue aware and
use that instead of the internal queue mapping. Also does a number
of cleanups in various pieces of the code that fall out and reduces
internal mac80211 state size.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-05-21 21:48:14 -04:00