Commit Graph

16383 Commits

Author SHA1 Message Date
Gerrit Renker d82b6f85c1 dccp ccid-2: Use u32 timestamps uniformly
Since CCID-2 is de facto a mini implementation of TCP, it makes sense to share
as much code as possible.

Hence this patch aligns CCID-2 timestamping with TCP timestamping.
This also halves the space consumption (on 64-bit systems).

The necessary include file <net/tcp.h> is already included by way of
net/dccp.h. Redundant includes have been removed.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-30 13:45:25 -07:00
Jerry Chu dca43c75e7 tcp: Add TCP_USER_TIMEOUT socket option.
This patch provides a "user timeout" support as described in RFC793. The
socket option is also needed for the the local half of RFC5482 "TCP User
Timeout Option".

TCP_USER_TIMEOUT is a TCP level socket option that takes an unsigned int,
when > 0, to specify the maximum amount of time in ms that transmitted
data may remain unacknowledged before TCP will forcefully close the
corresponding connection and return ETIMEDOUT to the application. If
0 is given, TCP will continue to use the system default.

Increasing the user timeouts allows a TCP connection to survive extended
periods without end-to-end connectivity. Decreasing the user timeouts
allows applications to "fail fast" if so desired. Otherwise it may take
upto 20 minutes with the current system defaults in a normal WAN
environment.

The socket option can be made during any state of a TCP connection, but
is only effective during the synchronized states of a connection
(ESTABLISHED, FIN-WAIT-1, FIN-WAIT-2, CLOSE-WAIT, CLOSING, or LAST-ACK).
Moreover, when used with the TCP keepalive (SO_KEEPALIVE) option,
TCP_USER_TIMEOUT will overtake keepalive to determine when to close a
connection due to keepalive failure.

The option does not change in anyway when TCP retransmits a packet, nor
when a keepalive probe will be sent.

This option, like many others, will be inherited by an acceptor from its
listener.

Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-30 13:23:33 -07:00
Stephen Rothwell 2c70b51962 IPVS: include net/ip6_checksum.h for csum_ipv6_magic
Fixes this build error:

net/netfilter/ipvs/ip_vs_core.c: In function 'ip_vs_nat_icmp_v6':
net/netfilter/ipvs/ip_vs_core.c:640: error: implicit declaration of function 'csum_ipv6_magic'

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-29 21:15:26 -07:00
Akinobu Mita 6a499b242f phonet: use for_each_set_bit
Replace open-coded loop with for_each_set_bit().

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-28 15:37:04 -07:00
Akinobu Mita 762c29164e econet: kill unnecessary spin_lock_init()
The spinlock aun_queue_lock is initialized statically. It is unnecessary
to initialize by spin_lock_init() at module load time.

This is detected by the semantic patch.

// <smpl>
@def@
declarer name DEFINE_SPINLOCK;
identifier spinlock;
@@

DEFINE_SPINLOCK(spinlock);

@@
identifier def.spinlock;
@@

- spin_lock_init(&spinlock);
// </smpl>

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Julia Lawall <julia@diku.dk>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-28 15:37:03 -07:00
Johannes Berg 5b714c6a37 mac80211: fix offchannel queue stop
Somebody noticed this problem, and I outlined
to them how to fix it, but haven't heard back
from them. So while I was adding the state
field I figured I could use it to fix it.

The problem, as I understand it, is that when
we go offchannel while the driver has a queue
stopped, the driver will likely start draining
the queue and then enable it while offchannel.
This in turn will enable the interface queue,
and that leads to transmitting data frames on
the wrong channel.

Fix this by keeping track of offchannel status
per interface, and not enabling the interface
queues on interfaces that are offchannel when
the driver enables a queue.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-27 13:53:31 -04:00
Johannes Berg 34d4bc4d41 mac80211: support runtime interface type changes
Add support to mac80211 for changing the interface
type even when the interface is UP, if the driver
supports it.

To achieve this
 * add a new driver callback for switching,
 * split some of the interface up/down code out
   into new functions (do_open/do_stop), and
 * maintain an own __SDATA_RUNNING bit that will
   not be set during interface type, so that any
   other code doesn't use the interface.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-27 13:53:31 -04:00
Johannes Berg 87490f6db3 mac80211: split out concurrent vif checks
Split the concurrent virtual interface checks
into a new function that can be used to check
for any given new interface type.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-27 13:53:31 -04:00
Johannes Berg bf533e0bfd mac80211: simplify zero address checks
The libertas_tf special code for zero addresses
is a bit too complex, it compares against a stack
value instead of using is_zero_ether_addr() and
tries to update all interfaces even if just the
one that's being brought up needs to be changed.
Additionally, the repeated check for a valid MAC
address need only be done if we actually changed
it on the fly.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-27 13:53:30 -04:00
Johannes Berg 26a58456be mac80211: switch to ieee80211_sdata_running
Since the introduction of ieee80211_sdata_running(),
some new code was introduced that uses netif_running()
instead. Switch all these instances over.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-27 13:53:30 -04:00
Johannes Berg b9dcf712d1 mac80211: clean up ifdown/cleanup paths
There's a lot of redundant code in mac80211's
interface cleanup/down, for example freeing
AP beacons is done both when the interface is
set DOWN as well as when it is torn down, of
which only the former has any effect.

Also, a bunch of things should be closer to
where they matter, like the MLME timers that
we should cancel when disassociating, rather
than only when the interface is set DOWN.

Clean up all this code.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-27 13:53:30 -04:00
Johannes Berg 2337db8db8 mac80211: use subqueue helpers
There are subqueue helpers so that we don't
need to get the TX queue and then wake/stop
it, use those helpers.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-27 13:27:08 -04:00
Johannes Berg a621fa4d6a mac80211: allow changing port control protocol
Some vendor specified mechanisms for 802.1X-style
functionality use a different protocol than EAP
(even if EAP is vendor-extensible). Support this
in mac80211 via the cfg80211 API for it.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Juuso Oikarinen <juuso.oikarinen@nokia.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-27 13:27:07 -04:00
Johannes Berg c0692b8fe2 cfg80211: allow changing port control protocol
Some vendor specified mechanisms for 802.1X-style
functionality use a different protocol than EAP
(even if EAP is vendor-extensible). Allow setting
the ethertype for the protocol when a driver has
support for this. The default if unspecified is
EAP, of course.

Note: This is suitable only for station mode, not
      for AP implementation.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Juuso Oikarinen <juuso.oikarinen@nokia.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-27 13:27:07 -04:00
Johannes Berg 3ffc2a905b mac80211: allow vendor specific cipher suites
Allow drivers to specify their own set of cipher
suites to advertise vendor-specific ciphers. The
driver is then required to implement hardware
crypto offload for it.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Juuso Oikarinen <juuso.oikarinen@nokia.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-27 13:27:07 -04:00
Johannes Berg 7d64b7cc1f cfg80211: allow vendor specific cipher suites
cfg80211 currently rejects all cipher suites it
doesn't know about for key length checking
purposes. This can lead to inconsistencies when
a driver advertises an algorithm that cfg80211
doesn't know about. Remove this rejection so
drivers can specify any algorithm they like.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Juuso Oikarinen <juuso.oikarinen@nokia.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-27 13:27:07 -04:00
Johannes Berg 8789d459bc mac80211: allow scan to complete from any context
The ieee80211_scan_completed() function was a frequent
source of potential deadlocks, since it is called by
drivers but may call back into drivers, so drivers had
to make sure to call it without any locks held, which
frequently lead to more complex code in drivers. Avoid
that problem by allowing the function to be called in
any context, and queueing the actual work it does.
Also update the documentation for it to indicate this.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-27 13:27:06 -04:00
Johannes Berg 5f33c92d18 mac80211: remove unused scan expire define
Since cfg80211 manages the BSS list completely,
this define hasn't been used for a long time
and will never be used again.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-27 13:27:06 -04:00
Eric Dumazet 40d0802b3e gro: __napi_gro_receive() optimizations
compare_ether_header() can have a special implementation on 64 bit
arches if CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is defined.

__napi_gro_receive() and vlan_gro_common() can avoid a conditional
branch to perform device match.

On x86_64, __napi_gro_receive() has now 38 instructions instead of 53

As gcc-4.4.3 still choose to not inline it, add inline keyword to this
performance critical function.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-26 22:03:08 -07:00
Changli Gao 53f91dc1f7 net: use scnprintf() to avoid potential buffer overflow
strlcpy() returns the total length of the string they tried to create, so
we should not use its return value without any check. scnprintf() returns
the number of characters written into @buf not including the trailing '\0',
so use it instead here.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-26 14:11:49 -07:00
Joe Perches 145ce502e4 net/sctp: Use pr_fmt and pr_<level>
Change SCTP_DEBUG_PRINTK and SCTP_DEBUG_PRINTK_IPADDR to
use do { print } while (0) guards.
Add SCTP_DEBUG_PRINTK_CONT to fix errors in log when
lines were continued.
Add #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
Add a missing newline in "Failed bind hash alloc"

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-26 14:11:48 -07:00
Simon Horman dee06e4702 ipvs: switch to GFP_KERNEL allocations
Switch from GFP_ATOMIC allocations to GFP_KERNEL ones in
ip_vs_add_service() and ip_vs_new_dest(), as we hold a mutex and are
allowed to sleep in this context.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-26 13:21:29 -07:00
Simon Horman 4f72816ef0 IPVS: convert __ip_vs_securetcp_lock to a spinlock
Also rename __ip_vs_securetcp_lock to ip_vs_securetcp_lock.

Spinlock conversion was suggested by Eric Dumazet.

Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-26 13:21:29 -07:00
Simon Horman bd14455048 IPVS: convert __ip_vs_sched_lock to a spinlock
Also rename __ip_vs_sched_lock to ip_vs_sched_lock.

Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-26 13:21:28 -07:00
Simon Horman 8870f8427b IPVS: ICMPv6 checksum calculation
Cc: Xiaoyu Du <tingsrain@gmail.com>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-26 13:21:26 -07:00
stephen hemminger aa7c6e5fa0 bridge: avoid ethtool on non running interface
If bridge port is offline, don't call ethtool to query speed.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-25 16:36:51 -07:00
Stephen Hemminger 944c794d64 bridge: fix locking comment
The carrier check is not called from work queue in current code.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-25 16:36:50 -07:00
Julia Lawall b2aff96327 net/netfilter/ipvs: Eliminate memory leak
__ip_vs_service_get and __ip_vs_svc_fwm_get increment a reference count, so
that reference count should be decremented before leaving the function in an
error case.

A simplified version of the semantic match that finds this problem is:
(http://coccinelle.lip6.fr/)

// <smpl>
@r exists@
local idexpression x;
expression E;
identifier f1;
iterator I;
@@

x = __ip_vs_service_get(...);
<... when != x
     when != true (x == NULL || ...)
     when != if (...) { <+...x...+> }
     when != I (...) { <+...x...+> }
(
 x == NULL
|
 x == E
|
 x->f1
)
...>
* return ...;
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-25 16:36:50 -07:00
John W. Linville e569aa78ba Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem
Conflicts:
	drivers/net/wireless/libertas/if_sdio.c
2010-08-25 14:51:42 -04:00
Johannes Berg 5eb5a52da6 mac80211: fix mesh advertisement
When a mac80211-based driver advertises mesh mode
support, this will be advertised to userspace.
However, if mac80211 was compiled without mesh
support, then that won't actually be true. Fix
this by removing the bit for mesh if mesh isn't
compiled in.

Since this synchronizes what we advertise to
cfg80211 and actually support, it means we can
now rely on cfg80211's interface type checks
and need not check again in mac80211.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-25 14:34:56 -04:00
Christian Lamparter 2c15a0cf27 mac80211: fix rcu-unsafe pointer dereference
This patch fixes a potential crash (null-pointer de-
reference) which was introduced in my previous patch:
 "mac80211: AMPDU rx reorder timeout timer"

During a BA teardown, the pointer to the soon-to-be-gone
tid_ampdu_rx element will be nullified. Therefore the
release timer mechanism has to be careful not to
accidentally access the item without any RCU protection.

Signed-off-by: Christian Lamparter <chunkeey@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-25 14:34:56 -04:00
Johannes Berg 74b70a4e38 nl80211: fix missing nesting
commit 95a6ccbb46c70cff376684c752831c014c87029d
Author: Johannes Berg <johannes.berg@intel.com>
Date:   Thu Aug 12 15:38:38 2010 +0200

    cfg80211/mac80211: extensible frame processing

introduced a netlink bug that caused parsing errors
in userspace because it forgot to close a nesting,
which would advertise a nesting length of zero to
userspace, which then completely threw off parsing
and led to

	Illegal nla->nla_type == 0

being printed by libnl.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-25 14:34:56 -04:00
Christian Lamparter 258086a48b mac80211: cancel restart_work in ieee80211_unregister_hw
Unlike most other workqueue-tasks, the restart_work is
not scheduled onto mac80211's private per-interface
workqueue, but onto one of the system-wide workqueues.

Therefore the mac80211-stack has to cancel any pending
restarts, before destroying the shared device context
and handing back the memory. Otherwise - under very
unlucky circumstances - there could be a stale work-
item left, because some other kernel component might
have delayed the execution of ieee80211_restart_work
for too long.

Signed-off-by: Christian Lamparter <chunkeey@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-25 14:33:20 -04:00
Wey-Yi Guy ff67bb86d4 mac80211: fix warning for un-used parameter
mesh_hdr only used when CONFIG_MAC80211_MESH is defined

Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-25 14:33:17 -04:00
Joe Perches 0fb9a9ec27 net/mac80211: Use wiphy_<level>
Standardize logging messages from
	printk(KERN_<level> "%s: " fmt , wiphy_name(foo), args);
to
	wiphy_<level>(foo, fmt, args);

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-25 14:33:17 -04:00
Stephen Hemminger c2e3143e3c tc: add meta match on receive hash
Trivial extension to existing meta data match rules to allow
matching on skb receive hash value.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-24 14:48:10 -07:00
Eric Dumazet ec550d246e net: ip_append_data() optim
Compiler is not smart enough to avoid a conditional branch.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-24 14:45:09 -07:00
Johannes Berg 2e161f78e5 cfg80211/mac80211: extensible frame processing
Allow userspace to register for more than just
action frames by giving the frame subtype, and
make it possible to use this in various modes
as well.

With some tweaks and some added functionality
this will, in the future, also be usable in AP
mode and be able to replace the cooked monitor
interface currently used in that case.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-24 16:27:56 -04:00
Johannes Berg ac4c977d16 mac80211: remove unused don't-encrypt flag
When MFP is disabled, action frames will not
be encrypted since they are management frames
and the only management frames that can then
be encrypted are authentication frames.

Therefore, setting the don't-encrypt flag on
action frames is unnecessary.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-24 16:27:55 -04:00
Johannes Berg 633adf1ad1 cfg80211: mark ieee80211_hdrlen const
This function analyses only its single, value-passed
argument, and has no side effects. Thus it can be
const, which makes mac80211 smaller, for example:

   text	   data	    bss	    dec	    hex	filename
 362518	  16720	    884	 380122	  5ccda	mac80211.ko (before)
 362358	  16720	    884	 379962	  5cc3a	mac80211.ko (after)

a 160 byte saving in text size, and an optimisation
because the function won't be called as often.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-24 16:27:54 -04:00
stephen hemminger 0fdc100bdc ethtool: allow non-netadmin to query settings
The SNMP daemon uses ethtool to determine the speed of
network interfaces. This fails on Debian (and probably elsewhere)
because for security SNMP daemon runs as non-root user (snmp).

Note: A similar patch was rejected previously because of a concern about
the possibility that on some hardware querying the ethtool settings
requires access to the PHY and could slow the machine down.  But the
security risk of requiring SNMP daemon (and related services)
to run as root far out weighs the risk of denial-of-service.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-23 20:43:16 -07:00
Eric Dumazet afdcba371f net: copy_rtnl_link_stats64() simplification
No need to use a temporary struct rtnl_link_stats64 variable,
just copy the source to skb buffer.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Reviewed-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-23 20:43:16 -07:00
Changli Gao 0eec32ff35 net_sched: act_csum: coding style cleanup
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-23 20:43:15 -07:00
David S. Miller 7abac68602 pkt_sched: Make act_csum depend upon INET.
It uses ip_send_check() and stuff like that.

Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-23 20:42:11 -07:00
Gerrit Renker 231cc2aaf1 dccp ccid-2: Replace broken RTT estimator with better algorithm
The current CCID-2 RTT estimator code is in parts broken and lags behind the
suggestions in RFC2988 of using scaled variants for SRTT/RTTVAR.

That code is replaced by the present patch, which reuses the Linux TCP RTT
estimator code.

Further details:
----------------
 1. The minimum RTO of previously one second has been replaced with TCP's, since
    RFC4341, sec. 5 says that the minimum of 1 sec. (suggested in RFC2988, 2.4)
    is not necessary. Instead, the TCP_RTO_MIN is used, which agrees with DCCP's
    concept of a default RTT (RFC 4340, 3.4).
 2. The maximum RTO has been set to DCCP_RTO_MAX (64 sec), which agrees with
    RFC2988, (2.5).
 3. De-inlined the function ccid2_new_ack().
 4. Added a FIXME: the RTT is sampled several times per Ack Vector, which will
    give the wrong estimate. It should be replaced with one sample per Ack.
    However, at the moment this can not be resolved easily, since
    - it depends on TX history code (which also needs some work),
    - the cleanest solution is not to use the `sent' time at all (saves 4 bytes
      per entry) and use DCCP timestamps / elapsed time to estimated the RTT,
      which however is non-trivial to get right (but needs to be done).

Reasons for reusing the Linux TCP estimator algorithm:
------------------------------------------------------
Some time was spent to find a better alternative, using basic RFC2988 as a first
step. Further analysis and experimentation showed that the Linux TCP RTO
estimator is superior to a basic RFC2988 implementation. A summary is on
http://www.erg.abdn.ac.uk/users/gerrit/dccp/notes/ccid2/rto_estimator/

In addition, this estimator fared well in a recent empirical evaluation:

    Rewaskar, Sushant, Jasleen Kaur and F. Donelson Smith.
    A Performance Study of Loss Detection/Recovery in Real-world TCP
    Implementations. Proceedings of 15th IEEE International
    Conference on Network Protocols (ICNP-07), 2007.

Thus there is significant benefit in reusing the existing TCP code.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-23 20:13:31 -07:00
Gerrit Renker c38c92a84a dccp ccid-2: Simplify dec_pipe and rearming of RTO timer
This removes the dec_pipe function and improves the way the RTO timer is rearmed
when a new acknowledgment comes in.

Details and justification for removal:
--------------------------------------
 1) The BUG_ON in dec_pipe is never triggered: pipe is only decremented for TX
    history entries between tail and head, for which it had previously been
    incremented in tx_packet_sent; and it is not decremented twice for the same
    entry, since it is
    - either decremented when a corresponding Ack Vector cell in state 0 or 1
      was received (and then ccid2s_acked==1),
    - or it is decremented when ccid2s_acked==0, as part of the loss detection
      in tx_packet_recv (and hence it can not have been decremented earlier).

 2) Restarting the RTO timer happens for every single entry in each Ack Vector
    parsed by tx_packet_recv (according to RFC 4340, 11.4 this can happen up to
    16192 times per Ack Vector).

 3) The RTO timer should not be restarted when all outstanding data has been
    acknowledged. This is currently done similar to (2), in dec_pipe, when
    pipe has reached 0.

The patch onsolidates the code which rearms the RTO timer, combining the
segments from new_ack and dec_pipe. As a result, the code becomes clearer
(compare with tcp_rearm_rto()).

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-23 20:13:31 -07:00
Gerrit Renker 30564e3555 dccp ccid-2: Remove redundant sanity tests
This removes the ccid2_hc_tx_check_sanity function: it is redundant.

Details:

The tx_check_sanity function performs three tests:
 1) it checks that the circular TX list is sorted
    - in ascending order of sequence number (ccid2s_seq)
    - and time (ccid2s_sent),
    - in the direction from `tail' (hctx_seqt) to `head' (hctx_seqh);
 2) it ensures that the entire list has the length seqbufc * CCID2_SEQBUF_LEN;
 3) it ensures that pipe equals the number of packets that were not
    marked `acked' (ccid2s_acked) between `tail' and `head'.

The following argues that each of these tests is redundant, this can be verified
by going through the code.

(1) is not necessary, since both time and GSS increase from one packet to the
next, so that subsequent insertions in tx_packet_sent (which advance the `head'
pointer) will be in ascending order of time and sequence number.

In (2), the length of the list is always equal to seqbufc times CCID2_SEQBUF_LEN
(set to 1024) unless allocation caused an earlier failure, because:
 * at initialisation (tx_init), there is one chunk of size 1024 and seqbufc=1;
 * subsequent calls to tx_alloc_seq take place whenever head->next == tail in
   tx_packet_sent; then a new chunk of size 1024 is inserted between head and
   tail, and seqbufc is incremented by one.

To show that (3) is redundant requires looking at two cases.

The `pipe' variable of the TX socket is incremented only in tx_packet_sent, and
decremented in tx_packet_recv.  When head == tail (TX history empty) then pipe
should be 0, which is the case directly after initialisation and after a
retransmission timeout has occurred (ccid2_hc_tx_rto_expire).

The first case involves parsing Ack Vectors for packets recorded in the live
portion of the buffer, between tail and head. For each packet marked by the
receiver as received (state 0) or ECN-marked (state 1), pipe is decremented by
one, so for all such packets the BUG_ON in tx_check_sanity will not trigger.

The second case is the loss detection in the second half of tx_packet_recv,
below the comment "Check for NUMDUPACK".

The first while-loop here ensures that the sequence number of `seqp' is either
above or equal to `high_ack', or otherwise equal to the highest sequence number
sent so far (of the entry head->prev, as head points to the next unsent entry).
The next while-loop ("while (1)") counts the number of acked packets starting
from that position of seqp, going backwards in the direction from head->prev to
tail. If NUMDUPACK=3 such packets were counted within this loop, `seqp' points
to the last acknowledged packet of these, and the "if (done == NUMDUPACK)" block
is entered next.
The while-loop contained within that block in turn traverses the list backwards,
from head to tail; the position of `seqp' is saved in the variable `last_acked'.
For each packet not marked as `acked', a congestion event is triggered within
the loop, and pipe is decremented. The loop terminates when `seqp' has reached
`tail', whereupon tail is set to the position previously stored in `last_acked'.
Thus, between `last_acked' and the previous position of `tail',
 - pipe has been decremented earlier if the packet was marked as state 0 or 1;
 - pipe was decremented if the packet was not marked as acked.
That is, pipe has been decremented by the number of packets between `last_acked'
and the previous position of `tail'. As a consequence, pipe now again reflects
the number of packets which have not (yet) been acked between the new position
of tail (at `last_acked') and head->prev, or 0 if head==tail. The result is that
the BUG_ON condition in check_sanity will also not be triggered, hence the test
(3) is also redundant.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-23 20:13:30 -07:00
Gerrit Renker 51c22bb510 dccp ccid-3: No more CCID control blocks in LISTEN state
The CCIDs are activated as last of the features, at the end of the handshake,
were the LISTEN state of the master socket is inherited into the server
state of the child socket. Thus, the only states visible to CCIDs now are
OPEN/PARTOPEN, and the closing states.

This allows to remove tests which were previously necessary to protect
against referencing a socket in the listening state (in CCID-3), but which
now have become redundant.

As a further byproduct of enabling the CCIDs only after the connection has been
fully established, several typecast-initialisations of ccid3_hc_{rx,tx}_sock
can now be eliminated:
 * the CCID is loaded, so it is not necessary to test if it is NULL,
 * if it is possible to load a CCID and leave the private area NULL, then this
    is a bug, which should crash loudly - and earlier,
 * the test for state==OPEN || state==PARTOPEN now reduces only to the closing
   phase (e.g. when the node has received an unexpected Reset).

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-23 20:13:30 -07:00
Gerrit Renker 67b67e365f ccid: ccid-2/3 code cosmetics
This patch collects cosmetics-only changes to separate these from
code changes:
 * update with regard to CodingStyle and whitespace changes,
 * documentation:
   - adding/revising comments,
   - remove CCID-3 RX socket documentation which is either
     duplicate or refers to fields that no longer exist,
 * expand embedded tfrc_tx_info struct inline for consistency,
   removing indirections via #define.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-23 20:13:29 -07:00
David S. Miller 21dc330157 net: Rename skb_has_frags to skb_has_frag_list
SKBs can be "fragmented" in two ways, via a page array (called
skb_shinfo(skb)->frags[]) and via a list of SKBs (called
skb_shinfo(skb)->frag_list).

Since skb_has_frags() tests the latter, it's name is confusing
since it sounds more like it's testing the former.

Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-23 00:13:46 -07:00
Hagen Paul Pfeifer e88c64f0a4 tcp: allow effective reduction of TCP's rcv-buffer via setsockopt
Via setsockopt it is possible to reduce the socket RX buffer
(SO_RCVBUF). TCP method to select the initial window and window scaling
option in tcp_select_initial_window() currently misbehaves and do not
consider a reduced RX socket buffer via setsockopt.

Even though the server's RX buffer is reduced via setsockopt() to 256
byte (Initial Window 384 byte => 256 * 2 - (256 * 2 / 4)) the window
scale option is still 7:

192.168.1.38.40676 > 78.47.222.210.5001: Flags [S], seq 2577214362, win 5840, options [mss 1460,sackOK,TS val 338417 ecr 0,nop,wscale 0], length 0
78.47.222.210.5001 > 192.168.1.38.40676: Flags [S.], seq 1570631029, ack 2577214363, win 384, options [mss 1452,sackOK,TS val 2435248895 ecr 338417,nop,wscale 7], length 0
192.168.1.38.40676 > 78.47.222.210.5001: Flags [.], ack 1, win 5840, options [nop,nop,TS val 338421 ecr 2435248895], length 0

Within tcp_select_initial_window() the original space argument - a
representation of the rx buffer size - is expanded during
tcp_select_initial_window(). Only sysctl_tcp_rmem[2], sysctl_rmem_max
and window_clamp are considered to calculate the initial window.

This patch adjust the window_clamp argument if the user explicitly
reduce the receive buffer.

Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-22 21:42:54 -07:00
Simon Horman c2368e795c bridge: is PACKET_LOOPBACK unlikely()?
While looking at using netdev_rx_handler_register for openvswitch Jesse
Gross suggested that an unlikely() might be worthwhile in that code.
I'm interested to see if its appropriate for the bridge code.

Cc: Jesse Gross <jesse@nicira.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-22 21:09:04 -07:00
Changli Gao 05532121da net: 802.1q: make vlan_hwaccel_do_receive() return void
vlan_hwaccel_do_receive() always returns 0, so make it return void.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-22 21:03:33 -07:00
Stephen Rothwell 2436243a39 net/sched: need to include net/ip6_checksum.h
for the declararion of csum_ipv6_magic.

Fixes this build error on PowerPC (at least):

net/sched/act_csum.c: In function 'tcf_csum_ipv6_icmp':
net/sched/act_csum.c:178: error: implicit declaration of function 'csum_ipv6_magic'

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-22 20:31:14 -07:00
Changli Gao 739a91ef06 net_sched: cls_flow: add key rxhash
We can use rxhash to classify the traffic into flows. As rxhash maybe
supplied by NIC or RPS, it is cheaper.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-21 23:40:14 -07:00
Eric Dumazet 81ce790bd7 irda: use net_device_stats from struct net_device
struct net_device has its own struct net_device_stats member, so use
this one instead of a private copy in the irlan_cb struct.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-21 23:32:31 -07:00
David S. Miller d3c6e7ad09 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2010-08-21 23:32:24 -07:00
Dmitry Kozlov 00959ade36 PPTP: PPP over IPv4 (Point-to-Point Tunneling Protocol)
PPP: introduce "pptp" module which implements point-to-point tunneling protocol using pppox framework
NET: introduce the "gre" module for demultiplexing GRE packets on version criteria
     (required to pptp and ip_gre may coexists)
NET: ip_gre: update to use the "gre" module

This patch introduces then pptp support to the linux kernel which
dramatically speeds up pptp vpn connections and decreases cpu usage in
comparison of existing user-space implementation
(poptop/pptpclient). There is accel-pptp project
(https://sourceforge.net/projects/accel-pptp/) to utilize this module,
it contains plugin for pppd to use pptp in client-mode and modified
pptpd (poptop) to build high-performance pptp NAS.

There was many changes from initial submitted patch, most important are:
1. using rcu instead of read-write locks
2. using static bitmap instead of dynamically allocated
3. using vmalloc for memory allocation instead of BITS_PER_LONG + __get_free_pages
4. fixed many coding style issues
Thanks to Eric Dumazet.

Signed-off-by: Dmitry Kozlov <xeb@mail.ru>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-21 23:05:39 -07:00
Changli Gao 1003489e06 net: rps: fix the wrong network header pointer
__skb_get_rxhash() was broken after the commit:

 commit bfb564e739
 Author: Krishna Kumar <krkumar2@in.ibm.com>
 Date:   Wed Aug 4 06:15:52 2010 +0000

 core: Factor out flow calculation from get_rps_cpu

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-21 22:54:49 -07:00
Grégoire Baron eb4d406545 net/sched: add ACT_CSUM action to update packets checksums
net/sched: add ACT_CSUM action to update packets checksums

ACT_CSUM can be called just after ACT_PEDIT in order to re-compute some
altered checksums in IPv4 and IPv6 packets. The following checksums are
supported by this patch:
 - IPv4: IPv4 header, ICMP, IGMP, TCP, UDP & UDPLite
 - IPv6: ICMPv6, TCP, UDP & UDPLite
It's possible to request in the same action to update different kind of
checksums, if the packets flow mix TCP, UDP and UDPLite, ...

An example of usage is done in the associated iproute2 patch.

Version 3 changes:
 - remove useless goto instructions
 - improve IPv6 hop options decoding

Version 2 changes:
 - coding style correction
 - remove useless arguments of some functions
 - use stack in tcf_csum_dump()
 - add tcf_csum_skb_nextlayer() to factor code

Signed-off-by: Gregoire Baron <baronchon@n7mm.org>
Acked-by: jamal <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-20 01:42:59 -07:00
Eric Dumazet 49e8ab03eb net: build_ehash_secret() and rt_bind_peer() cleanups
Now cmpxchg() is available on all arches, we can use it in
build_ehash_secret() and rt_bind_peer() instead of using spinlocks.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-20 00:50:16 -07:00
Changli Gao b9959c2e44 net_sched: sch_sfq: use proto_ports_offset() to support AH message
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-19 17:16:25 -07:00
Changli Gao aca071c1c1 netfilter: xt_hashlimit: use proto_ports_offset() to support AH message
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-19 17:16:25 -07:00
Changli Gao 3d04ebb6ab netfilter: ipt_CLUSTERIP: use proto_ports_offset() to support AH message
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-19 17:16:24 -07:00
Changli Gao 78d3307ede net_sched: cls_flow: use proto_ports_offset() to support AH message
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-19 17:16:24 -07:00
Changli Gao 12fcdefb36 net: rps: use proto_ports_offset() to handle the AH message correctly
The SPI isn't at the beginning of an AH message.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-19 17:16:23 -07:00
Changli Gao dbe5775bbc net: rps: skip fragment when computing rxhash
Fragmented IP packets may have no transfer header, so when computing
rxhash, we should skip them.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-19 17:10:38 -07:00
Changli Gao 2d47b45951 net: rps: reset network header before calling skb_get_rxhash()
skb_get_rxhash() assumes the network header pointer of the skb is set
properly after the commit:

commit bfb564e739
Author: Krishna Kumar <krkumar2@in.ibm.com>
Date:   Wed Aug 4 06:15:52 2010 +0000

    core: Factor out flow calculation from get_rps_cpu

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-19 17:08:37 -07:00
Eric Dumazet 79c5f51c63 irda: fix a race in irlan_eth_xmit()
After skb is queued, its illegal to dereference it.

Cache skb->len into a temporary variable.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-19 00:41:52 -07:00
Phil Oester 0ac820eebe vlan: Match underlying dev carrier on vlan add
When adding a new vlan, if the underlying interface has no carrier,
then the newly added vlan interface should also have no carrier.
At present, this is not true - the newly added vlan is added with
carrier up.  Fix by checking state of real device.

Signed-off-by: Phil Oester <kernel@linuxace.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-19 00:26:46 -07:00
Eric Dumazet b92840900f atm: remove a net_device_stats clear
No need to clear device stats in lec_open()

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-19 00:14:36 -07:00
Oliver Hartkopp 2244d07bfa net: simplify flags for tx timestamping
This patch removes the abstraction introduced by the union skb_shared_tx in
the shared skb data.

The access of the different union elements at several places led to some
confusion about accessing the shared tx_flags e.g. in skb_orphan_try().

    http://marc.info/?l=linux-netdev&m=128084897415886&w=2

Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-19 00:08:30 -07:00
Eric Dumazet f037590fff rds: fix a leak of kernel memory
struct rds_rdma_notify contains a 32 bits hole on 64bit arches,
make sure it is zeroed before copying it to user.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-18 23:40:03 -07:00
Johannes Berg 68d6ac6d27 netlink: fix compat recvmsg
Since
commit 1dacc76d00
Author: Johannes Berg <johannes@sipsolutions.net>
Date:   Wed Jul 1 11:26:02 2009 +0000

    net/compat/wext: send different messages to compat tasks

we had a race condition when setting and then
restoring frag_list. Eric attempted to fix it,
but the fix created even worse problems.

However, the original motivation I had when I
added the code that turned out to be racy is
no longer clear to me, since we only copy up
to skb->len to userspace, which doesn't include
the frag_list length. As a result, not doing
any frag_list clearing and restoring avoids
the race condition, while not introducing any
other problems.

Additionally, while preparing this patch I found
that since none of the remaining netlink code is
really aware of the frag_list, we need to use the
original skb's information for packet information
and credentials. This fixes, for example, the
group information received by compat tasks.

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: stable@kernel.org [2.6.31+, for 2.6.35 revert 1235f504aa]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-18 23:35:58 -07:00
Julia Lawall 49339ccae5 net/ax25: Use available error codes
Error codes are stored in err, but the return value is always 0.  Return
err instead.

The semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@r@
local idexpression x;
constant C;
@@

if (...) { ...
  x = -C
  ... when != x
(
  return <+...x...+>;
|
  return NULL;
|
  return;
|
* return ...;
)
}
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-18 14:26:31 -07:00
Julia Lawall b3d18f1509 net/ax25: Use available error codes
Error codes are stored in err, but the return value is always 0.  Return
err instead.

The semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@r@
local idexpression x;
constant C;
@@

if (...) { ...
  x = -C
  ... when != x
(
  return <+...x...+>;
|
  return NULL;
|
  return;
|
* return ...;
)
}
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-18 14:26:31 -07:00
Dan Carpenter 00093fab98 net/sched: remove unneeded NULL check
There is no need to check "s".  nla_data() doesn't return NULL.  Also we
already dereferenced "s" at this point so it would have oopsed ealier if
it were NULL.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-18 14:24:51 -07:00
Jarek Poplawski e5093aec2e net: Fix a memmove bug in dev_gro_receive()
>Xin Xiaohui wrote:
> I looked into the code dev_gro_receive(), found the code here:
> if the frags[0] is pulled to 0, then the page will be released,
> and memmove() frags left.
> Is that right? I'm not sure if memmove do right or not, but
> frags[0].size is never set after memove at least. what I think
> a simple way is not to do anything if we found frags[0].size == 0.
> The patch is as followed.
...

This version of the patch fixes the bug directly in memmove.

Reported-by: "Xin, Xiaohui" <xiaohui.xin@intel.com>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-17 17:37:28 -07:00
Allan Stephens 5a68d5ee00 tipc: Prevent missing name table entries when link flip-flops rapidly
Ensure that TIPC does not re-establish communication with a
neighboring node until it has finished updating all data structures
containing information about that node to reflect the earlier loss of
contact.  Previously, it was possible for TIPC to perform its purge of
name table entries relating to the node once contact had already been
re-established, resulting in the unwanted removal of valid name table
entries.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-17 17:32:00 -07:00
Allan Stephens 564e83b51a tipc: Allow connect() to wait indefinitely
Cause a socket whose TIPC_CONN_TIMEOUT option is zero to wait
indefinitely for a response to a connection request using connect().
Previously, specifying a timeout of 0 ms resulted in an immediate
timeout, which was inconsistent with the behavior specified by Posix
for a socket's receive and send timeout.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-17 17:31:59 -07:00
Allan Stephens c2de58140a tipc: Minor enhancements to name table display format
Eliminate printing of dashes after name table column headers
(to adhere more closely to the standard format used in tipc-config),
and simplify name table display logic using array lookups rather
than if-then-else logic.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-17 17:31:58 -07:00
Allan Stephens 76ae0d71d8 tipc: Optimize tipc_node_has_active_links()
Eliminate unnecessary checking for null node pointer and redundant
check of second active link array entry.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-17 17:31:57 -07:00
Allan Stephens 96d841b703 tipc: Remove per-connection sequence number logic
Remove validation of the per-connection sequence numbers on routable
connections, since routable connections are not supported by TIPC.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-17 17:31:56 -07:00
Allan Stephens 0048b826af tipc: Fix bug in broadcast link transmit statistics computation
Modify TIPC's broadcast link so that it counts each piece of a
fragmented message individually, rather than as treating the group
as a single message.  This ensures that proper correlation of sent
and received traffic can be done when the broadcast link statistics
are displayed, and is consistent with the way fragments are counted
by TIPC's unicast links.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-17 17:31:55 -07:00
Allan Stephens 5b1f7bdeb6 tipc: Fix premature broadcast advertisement by sending node
Prevent a TIPC node from sending out a LINK_STATE message
advertising a broadcast message that it is in the process
of sending, but has not yet actually sent.  Previously, it was
possible for a link timeout to occur in between the time the
broadcast link updated its "last message sent" counter and the
time the broadcast message was passed to the broadcast bearer
for transmission.  This ensures that the code which issues
the LINK_STATE message isn't informed of the new message until
the broadcast bearer has had a chance to send it.

Note: The "last message sent" value is stored in the "fsm_msg_count"
field of the link structure used by the broadcast link.  Since the
broadcast link doesn't utilize the normal link FSM, this field can
be re-used rather than adding a new field to the broadcast link.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-17 17:31:55 -07:00
Allan Stephens 7e3e5d0950 tipc: Prevent crash when broadcast link cannot send to all nodes
Allow TIPC's broadcast link to continue operation when it is unable
to send a message to all nodes in the cluster.  Previously, the
broadcast link attempted to put the broadcast pseudo-bearer into a
blocked state; however, this caused a crash because the associated
bearer structure is only partially initialized.  Further
investigation has revealed some conceptual problems with blocking
the pseudo-bearer; consequently, this functionality has been
disabled for the time being and the undelivered message is
eventually resent by the broadcast link's existing message
retransmission mechanism (if possible).

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-17 17:31:54 -07:00
Allan Stephens b02b69c8a4 tipc: Check for disabled bearer when processing incoming messages
Add a check to tipc_recv_msg() to ensure it discards messages
arriving on a newly disabled bearer.  This is needed to deal with a
race condition that can arise if the bearer is in the midst of being
disabled when it receives a message.  Performing the check after
tipc_net_lock has been taken ensures that TIPC's bearers are in a
stable state while the message is being processed.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-17 17:31:53 -07:00
Allan Stephens f662c07058 tipc: correct problems with misleading flags returned using poll()
Prevent TIPC from incorrectly setting returned flags to poll()
in the following cases:

- an unconnected socket no longer indicates that it is always readable

- an unconnected, connecting, or listening socket no longer indicates
  that it is always writable

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-17 17:31:53 -07:00
Allan Stephens 35997e3157 tipc: Provide correct error code for unsupported connect() operation
Modify TIPC to return EOPNOTSUPP if an application attempts to perform
 a non-blocking connect() operation, which is not supported by TIPC.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-17 17:31:52 -07:00
Florian Westphal 3720d40b20 tipc: add SO_RCVLOWAT support to stream socket receive path
Add support for the SO_RCVLOWAT socket option to TIPC's stream socket
type.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-17 17:31:51 -07:00
Anders Kaseorg f813809252 tipc: Fix log buffer memory leak if initialization fails
Moves log buffer cleanup into tipc_core_stop() so that memory allocated
for the log buffer is freed if tipc_core_start() is unsuccessful.

Signed-off-by: Anders Kaseorg <andersk@ksplice.com>
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-17 17:31:51 -07:00
Eric Dumazet 1c40be12f7 net sched: fix some kernel memory leaks
We leak at least 32bits of kernel memory to user land in tc dump,
because we dont init all fields (capab ?) of the dumped structure.

Use C99 initializers so that holes and non explicit fields are zeroed.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-17 15:12:15 -07:00
Eric Dumazet 001389b958 netfilter: {ip,ip6,arp}_tables: avoid lockdep false positive
After commit 24b36f019 (netfilter: {ip,ip6,arp}_tables: dont block
bottom half more than necessary), lockdep can raise a warning
because we attempt to lock a spinlock with BH enabled, while
the same lock is usually locked by another cpu in a softirq context.

Disable again BH to avoid these lockdep warnings.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Diagnosed-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-17 15:12:14 -07:00
Ben Hutchings 0141480205 ethtool: Provide a default implementation of ethtool_ops::get_drvinfo
The driver name and bus address for a net_device can normally be found
through the driver model now.  Instead of requiring drivers to provide
this information redundantly through the ethtool_ops::get_drvinfo
operation, use the driver model to do so if the driver does not define
the operation.  Since ETHTOOL_GDRVINFO no longer requires the driver
to implement any operations, do not require net_device::ethtool_ops to
be set either.

Remove implementations of get_drvinfo and ethtool_ops that provide
only this information.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-17 02:31:15 -07:00
Julia Lawall bb8a10bbd1 net/decnet: Adjust confusing if indentation
Indent the branch of an if.

The semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@r disable braces4@
position p1,p2;
statement S1,S2;
@@

(
if (...) { ... }
|
if (...) S1@p1 S2@p2
)

@script:python@
p1 << r.p1;
p2 << r.p2;
@@

if (p1[0].column == p2[0].column):
  cocci.print_main("branch",p1)
  cocci.print_secs("after",p2)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-16 21:06:30 -07:00
Julia Lawall 510a05edce net/atm: Adjust confusing if indentation
Outdent the code following an if.

The semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@r disable braces4@
position p1,p2;
statement S1,S2;
@@

(
if (...) { ... }
|
if (...) S1@p1 S2@p2
)

@script:python@
p1 << r.p1;
p2 << r.p2;
@@

if (p1[0].column == p2[0].column):
  cocci.print_main("branch",p1)
  cocci.print_secs("after",p2)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-16 21:06:26 -07:00
Krishna Kumar bfb564e739 core: Factor out flow calculation from get_rps_cpu
Factor out flow calculation code from get_rps_cpu, since other
functions can use the same code.

Revisions:

v2 (Ben): Separate flow calcuation out and use in select queue.
v3 (Arnd): Don't re-implement MIN.
v4 (Changli): skb->data points to ethernet header in macvtap, and
	make a fast path. Tested macvtap with this patch.
v5 (Changli):
	- Cache skb->rxhash in skb_get_rxhash
	- macvtap may not have pow(2) queues, so change code for
	  queue selection.
    (Arnd):
	- Use first available queue if all fails.

Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-16 21:06:24 -07:00
David S. Miller eca6fc7836 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2010-08-16 20:44:14 -07:00
Johannes Berg afea0b7af7 cfg80211: check if WEP is available for shared key auth
When shared key auth is requested, cfg80211
should verify that the device is capable of
WEP crypto which is required.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-16 16:45:22 -04:00
Johannes Berg 5daa8a8e69 mac80211: dont advertise WEP if unavailable
When WEP is unavailable, don't advertise it
to cfg80211.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-16 16:45:21 -04:00
Johannes Berg dc1580ddfc mac80211: remove unused status flag checks
The decryption code verifies whether or not
a given frame was decrypted and verified by
hardware. This is unnecessary, as the crypto
RX handler already does it long before the
decryption code is even invoked, so remove
that code to avoid confusion.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-16 16:45:21 -04:00
Johannes Berg 60ae0f2005 mac80211: move key tfm setup
There's no need to keep separate if statements
for setting up the CCMP/AES-CMAC tfm structs;
move that into the existing switch statement.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-16 16:45:20 -04:00
Johannes Berg 97359d1235 mac80211: use cipher suite selectors
Currently, mac80211 translates the cfg80211
cipher suite selectors into ALG_* values.
That isn't all too useful, and some drivers
benefit from the distinction between WEP40
and WEP104 as well. Therefore, convert it
all to use the cipher suite selectors.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-by: Gertjan van Wingerde <gwingerde@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-16 16:45:11 -04:00
Johannes Berg 0460079495 cfg80211: support sysfs namespaces
Enable using network namespaces with
wireless devices even when sysfs is
enabled using the same infrastructure
that was built for netdevs.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-16 15:26:40 -04:00
Johannes Berg d1f5b7a34a mac80211: allow drivers to request SM PS mode change
Sometimes drivers have more information than the
stack about how their antennas/chains are used,
and may require that the SM PS mode be changed.
This could happen, for example, when detecting
that the user disconnected an antenna. Thus this
patch introduces API to allow drivers to request
SM PS mode changes.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-16 15:26:40 -04:00
Johannes Berg 7da7cc1d42 mac80211: per interface idle notification
Sometimes we don't just need to know whether or
not the device is idle, but also per interface.
This adds that reporting capability to mac80211.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-16 15:26:40 -04:00
Andrea Gelmini 1fdaa46e9f net: mac80211: Fix a typo.
"userpace" -> "userspace"

Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-16 15:26:40 -04:00
Johannes Berg 3f3b6a8d90 cfg80211: deauth doesn't always imply disconnect
When an AP sends a deauth frame, or we send one
to an AP, that only means we lost our connection
if we were actually connected to that AP. Check
this to avoid sending spurious "disconnected"
events and breaking "iw ... link" reporting.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-16 15:26:39 -04:00
Christian Lamparter 2bff8ebf32 mac80211: AMPDU rx reorder timeout timer
This patch introduces a new timer, which will release
queued-up MPDUs from the reorder buffer, whenever
they've waited for more than HT_RX_REORDER_BUF_TIMEOUT
(which is at around 100 ms).

The advantage of having a dedicated timer, instead of
relying on a constant stream of freshly arriving aMPDUs
to release the old ones, is particularly observable when
even a small fraction of MPDUs are forever lost at
low network speeds.

Previously under these circumstances frames would become
stuck in the reorder buffer and the network stack of both
HT peers throttled back, instead of revving up and
gunning the pipes.

Signed-off-by: Christian Lamparter <chunkeey@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-16 15:26:39 -04:00
Christian Lamparter 071d9ac253 mac80211: remove unused rate function parameter
This patch removes a few stale parameters and variables
which survived the last, large rx-path reorganization:
"mac80211: correctly place aMPDU RX reorder code"

Signed-off-by: Christian Lamparter <chunkeey@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-16 15:26:39 -04:00
Christian Lamparter aa0c86364f mac80211: put rx handlers into separate functions
This patch takes the reorder logic from the RX path and
moves it into separate routines to make the expired frame
release accessible.

Signed-off-by: Christian Lamparter <chunkeey@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-16 15:26:39 -04:00
Ben Hutchings 1ac62ba7c9 mac80211: Don't squash error codes in key setup functions
ieee80211_add_key() currently returns -ENOMEM in case of any error,
including a missing crypto algorithm.  Change ieee80211_key_alloc()
and ieee80211_aes_{key_setup_encrypt,cmac_key_setup}() to encode
errors with ERR_PTR() rather than returning NULL, and change
ieee80211_add_key() accordingly.

Compile-tested only.

Reported-by: Marcin Owsiany <porridge@debian.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-16 15:26:38 -04:00
Johannes Berg a1699b75a1 mac80211: unify scan and work mutexes
Having both scan and work mutexes is not just
a bit too fine grained, it also creates issues
when there's code that needs both since they
then need to be acquired in the right order,
which can be hard to do.

Therefore, use just a single mutex for both.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-16 15:26:37 -04:00
Johannes Berg fc88518916 mac80211: don't check rates on PLCP error frames
Frames that failed PLCP error checks are most likely
microwave transmissions (well, maybe not ...) and
don't have a proper rate detected, so ignore it.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-16 15:26:36 -04:00
Felix Fietkau ffd2778bb9 mac80211: fix driver offchannel notification when the channel does not change
When running in client mode and associating to an AP, the channel
change is usually performed with the offchannel flag still set.
However after the assoc is complete, the following channel change event
is suppressed because the run time channel is already set to the operating channel.
Fix this by sending channel change notifications to the driver even if
only the offchannel flag changes.

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-16 15:23:43 -04:00
John W. Linville 9714d315d2 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2010-08-16 14:40:44 -04:00
John W. Linville c61029c77f wireless: upcase alpha2 values in queue_regulatory_request
This provides a little more flexibility for human users, and it allows
us to use isalpha rather than the custom is_alpha_upper.

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-16 14:39:47 -04:00
John W. Linville 4e6cbfd09c mac80211: support use of NAPI for bottom-half processing
This patch implement basic infrastructure to support use of NAPI by
mac80211-based hardware drivers.

Because mac80211 devices can support multiple netdevs, a dummy netdev
is used for interfacing with the NAPI code in the core of the network
stack.  That structure is hidden from the hardware drivers, but the
actual napi_struct is exposed in the ieee80211_hw structure so that the
poll routines in drivers can retrieve that structure.  Hardware drivers
can also specify their own weight value for NAPI polling.

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-16 14:39:46 -04:00
David S. Miller daa3766e70 Revert "netlink: netlink_recvmsg() fix"
This reverts commit 1235f504aa.

It causes regressions worse than the problem it was trying
to fix.  Eric will try to solve the problem another way.

Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-15 23:21:50 -07:00
Min Zhang f3d3f616e3 ipv6: remove sysctl jiffies conversion on gc_elasticity and min_adv_mss
sysctl output ipv6 gc_elasticity and min_adv_mss as values divided by
HZ. However, they are not in unit of jiffies, since ip6_rt_min_advmss
refers to packet size and ip6_rt_fc_elasticity is used as scaler as in
expire>>ip6_rt_gc_elasticity, so replace the jiffies conversion
handler will regular handler for them.

This has impact on scripts that are currently working assuming the
divide by HZ, will yield different results with this patch in place.

Signed-off-by: Min Zhang <mzhang@mvista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-14 22:42:51 -07:00
Herbert Xu 2f09a4d5da xfrm: Use GFP_ATOMIC in xfrm_compile_policy
As xfrm_compile_policy runs within a read_lock, we cannot use
GFP_KERNEL for memory allocations.

Reported-by: Luca Tettamanti <kronos.it@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-14 22:38:09 -07:00
Linus Torvalds 2f2c779583 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (30 commits)
  ctcm: rename READ/WRITE defines to avoid redefinitions
  claw: rename READ/WRITE defines to avoid redefinitions
  phylib: available for any speed ethernet
  can: add limit for nframes and clean up signed/unsigned variables
  pkt_sched: Check .walk and .leaf class handlers
  pkt_sched: Fix sch_sfq vs tc_modify_qdisc oops
  caif-spi: Bugfix SPI_DATA_POS settings were inverted.
  caif: Bugfix - Increase default headroom size for control channel.
  net: make netpoll_rx return bool for !CONFIG_NETPOLL
  Bluetooth: Use 3-DH5 payload size for default ERTM max PDU size
  Bluetooth: Fix incorrect setting of remote_tx_win for L2CAP ERTM
  Bluetooth: Change default L2CAP ERTM retransmit timeout
  Bluetooth: Fix endianness issue with L2CAP MPS configuration
  net: Use NET_XMIT_SUCCESS where possible.
  isdn: mISDN: call pci_disable_device() if pci_probe() failed
  isdn: avm: call pci_disable_device() if pci_probe() failed
  isdn: avm: call pci_disable_device() if pci_probe() failed
  usbnet: rx_submit() should return an error code.
  pkt_sched: Add some basic qdisc class ops verification. Was: [PATCH] sfq: add dummy bind/unbind handles
  pkt_sched: sch_sfq: Add dummy unbind_tcf and put handles. Was: [PATCH] sfq: add dummy bind/unbind handles
  ...
2010-08-13 10:38:12 -07:00
Linus Torvalds 2897c684d1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  [NFS] Set CONFIG_KEYS when CONFIG_NFS_USE_KERNEL_DNS is set
  AFS: Implement an autocell mount capability [ver #2]
  DNS: If the DNS server returns an error, allow that to be cached [ver #2]
  NFS: Use kernel DNS resolver [ver #2]
  cifs: update README to include details about 'fsc' option
2010-08-13 10:37:30 -07:00
Linus Torvalds 26df0766a7 Merge branch 'params' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
* 'params' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: (22 commits)
  param: don't deref arg in __same_type() checks
  param: update drivers/acpi/debug.c to new scheme
  param: use module_param in drivers/message/fusion/mptbase.c
  ide: use module_param_named rather than module_param_call
  param: update drivers/char/ipmi/ipmi_watchdog.c to new scheme
  param: lock if_sdio's lbs_helper_name and lbs_fw_name against sysfs changes.
  param: lock myri10ge_fw_name against sysfs changes.
  param: simple locking for sysfs-writable charp parameters
  param: remove unnecessary writable charp
  param: add kerneldoc to moduleparam.h
  param: locking for kernel parameters
  param: make param sections const.
  param: use free hook for charp (fix leak of charp parameters)
  param: add a free hook to kernel_param_ops.
  param: silence .init.text references from param ops
  Add param ops struct for hvc_iucv driver.
  nfs: update for module_param_named API change
  AppArmor: update for module_param_named API change
  param: use ops in struct kernel_param, rather than get and set fns directly
  param: move the EXPORT_SYMBOL to after the definitions.
  ...
2010-08-12 10:01:59 -07:00
David Howells 12fdff3fc2 Add a dummy printk function for the maintenance of unused printks
Add a dummy printk function for the maintenance of unused printks through gcc
format checking, and also so that side-effect checking is maintained too.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-12 09:51:35 -07:00
Randy Dunlap cba86f2e20 phylib: available for any speed ethernet
Several gigabit network drivers (SB1250_MAC, TIGON3, FSL, GIANFAR,
UCC_GETH, MV643XX_ETH, XILINX_LL_TEMAC, S6GMAC, STMMAC_ETH, PASEMI_MAC,
and OCTEON_ETHERNET) select PHYLIB.  These drivers are not under
NET_ETHERNET (10/100 mbit), so this warning is generated (long, irrelevant
parts are omitted):

warning: (NET_DSA && NET && EXPERIMENTAL && NET_ETHERNET && !S390 || ... || SB1250_MAC && NETDEVICES && NETDEV_1000 && SIBYTE_SB1xxx_SOC || TIGON3 && NETDEVICES && NETDEV_1000 && PCI || FSL_PQ_MDIO && NETDEVICES && NETDEV_1000 && FSL_SOC || GIANFAR && NETDEVICES && NETDEV_1000 && FSL_SOC || UCC_GETH && NETDEVICES && NETDEV_1000 && QUICC_ENGINE || MV643XX_ETH && NETDEVICES && NETDEV_1000 && (MV64X60 || PPC32 || PLAT_ORION) || XILINX_LL_TEMAC && NETDEVICES && NETDEV_1000 && (PPC || MICROBLAZE) || S6GMAC && NETDEVICES && NETDEV_1000 && XTENSA_VARIANT_S6000 || STMMAC_ETH && NETDEV_1000 && NETDEVICES && CPU_SUBTYPE_ST40 || PASEMI_MAC && NETDEVICES && NETDEV_10000 && PPC_PASEMI && PCI || OCTEON_ETHERNET && STAGING && !STAGING_EXCLUDE_BUILD && CPU_CAVIUM_OCTEON) selects PHYLIB which has unmet direct dependencies (!S390 && NET_ETHERNET)

PHYLIB is used by non-10/100 mbit ethernet drivers, so change the dependencies
to be NETDEVICES instead of NET_ETHERNET.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-11 23:03:50 -07:00
Oliver Hartkopp 5b75c4973c can: add limit for nframes and clean up signed/unsigned variables
This patch adds a limit for nframes as the number of frames in TX_SETUP and
RX_SETUP are derived from a single byte multiplex value by default.
Use-cases that would require to send/filter more than 256 CAN frames should
be implemented in userspace for complexity reasons anyway.

Additionally the assignments of unsigned values from userspace to signed
values in kernelspace and vice versa are fixed by using unsigned values in
kernelspace consistently.

Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Reported-by: Ben Hawkes <hawkes@google.com>
Acked-by: Urs Thuermann <urs.thuermann@volkswagen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-11 16:12:35 -07:00
Wang Lei 4a2d789267 DNS: If the DNS server returns an error, allow that to be cached [ver #2]
If the DNS server returns an error, allow that to be cached in the DNS resolver
key in lieu of a value.  Userspace passes the desired error number as an option
in the payload:

	"#dnserror=<number>"

Userspace must map h_errno from the name resolution routines to an appropriate
Linux error before passing it up.  Something like the following mapping is
recommended:

	[HOST_NOT_FOUND]	= ENODATA,
	[TRY_AGAIN]		= EAGAIN,
	[NO_RECOVERY]		= ECONNREFUSED,
	[NO_DATA]		= ENODATA,

in lieu of Linux errors specifically for representing name service errors.  The
filesystem must map these errors appropropriately before passing them to
userspace.  AFS is made to map ENODATA and EAGAIN to EDESTADDRREQ for the
return to userspace; ECONNREFUSED is allowed to stand as is.

The error can be seen in /proc/keys as a negative number after the description
of the key.  Compare, for example, the following key entries:

2f97238c I--Q--     1  53s 3f010000     0     0 dns_resol afsdb:grand.centrall.org: -61
338bfbbe I--Q--     1  59m 3f010000     0     0 dns_resol afsdb:grand.central.org: 37

If the error option is supplied in the payload, the main part of the payload is
discarded.  The key should have an expiry time set by userspace.

Signed-off-by: Wang Lei <wang840925@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2010-08-11 17:11:28 +00:00
Rusty Russell d6d1b650ae param: simple locking for sysfs-writable charp parameters
Since the writing to sysfs can free the old one, we need to block that
when we access the charp variables.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Reviewed-by: Takashi Iwai <tiwai@suse.de>
Tested-by: Phil Carmody <ext-phil.2.carmody@nokia.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Dan Williams <dcbw@redhat.com>
Cc: John W. Linville <linville@tuxdriver.com>
Cc: Jing Huang <huangj@brocade.com>
Cc: James E.J. Bottomley <James.Bottomley@suse.de>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: user-mode-linux-devel@lists.sourceforge.net
Cc: libertas-dev@lists.infradead.org
Cc: linux-wireless@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: linux-scsi@vger.kernel.org
Cc: linux-usb@vger.kernel.org
2010-08-11 23:04:31 +09:30
Stephen Rothwell 8e4e15d44a nfs: update for module_param_named API change
After merging the rr tree, today's linux-next build (powerpc
ppc64_defconfig) failed like this:

net/sunrpc/auth.c:74: error: 'param_ops_hashtbl_sz' undeclared here (not in a function)

Caused by commit 0685652df0929cec7d78efa85127f6eb34962132
("param:param_ops") interacting with commit
f8f853ab19fcc415b6eadd273373edc424916212 ("SUNRPC: Make the credential
cache hashtable size configurable") from the nfs tree.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-08-11 23:04:15 +09:30
Rusty Russell 9bbb9e5a33 param: use ops in struct kernel_param, rather than get and set fns directly
This is more kernel-ish, saves some space, and also allows us to
expand the ops without breaking all the callers who are happy for the
new members to be NULL.

The few places which defined their own param types are changed to the
new scheme (more which crept in recently fixed in following patches).

Since we're touching them anyway, we change get() and set() to take a
const struct kernel_param (which they really are).  This causes some
harmless warnings until we fix them (in following patches).

To reduce churn, module_param_call creates the ops struct so the callers
don't have to change (and casts the functions to reduce warnings).
The modern version which takes an ops struct is called module_param_cb.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Reviewed-by: Takashi Iwai <tiwai@suse.de>
Tested-by: Phil Carmody <ext-phil.2.carmody@nokia.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Ville Syrjala <syrjala@sci.fi>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Alessandro Rubini <rubini@ipvvis.unipv.it>
Cc: Michal Januszewski <spock@gentoo.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Neil Brown <neilb@suse.de>
Cc: linux-kernel@vger.kernel.org
Cc: linux-input@vger.kernel.org
Cc: linux-fbdev-devel@lists.sourceforge.net
Cc: linux-nfs@vger.kernel.org
Cc: netdev@vger.kernel.org
2010-08-11 23:04:13 +09:30
Jarek Poplawski 3e9e5a5921 pkt_sched: Check .walk and .leaf class handlers
Require qdisc class ops .walk and .leaf for classful qdisc in
register_qdisc(). The checks could be done later insted, but these
ops are really needed and used by most of classful qdiscs.

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-11 01:37:00 -07:00
Jarek Poplawski 41065fba84 pkt_sched: Fix sch_sfq vs tc_modify_qdisc oops
sch_sfq as a classful qdisc needs the .leaf handler. Otherwise, there
is an oops possible in tc_modify_qdisc()/check_loop().

Fixes commit 7d2681a6ff

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-11 01:36:59 -07:00
Sjur Braendeland 24e263adba caif: Bugfix - Increase default headroom size for control channel.
Headroom size for control channel must be at least 48 bytes in some scenarios.

Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-10 16:39:27 -07:00
David S. Miller 1c114f42a5 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2010-08-10 15:59:38 -07:00
John W. Linville 533b12c88d Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/holtmann/bluetooth-2.6 2010-08-10 16:16:58 -04:00
Mat Martineau cff70fae11 Bluetooth: Fix incorrect setting of remote_tx_win for L2CAP ERTM
remote_tx_win is intended to be set on receipt of an L2CAP
configuration request.  The value is used to determine the size of the
transmit window on the remote side of an ERTM connection, so L2CAP
can stop sending frames when that remote window is full.

An incorrect remote_tx_win value will cause the stack to not fully
utilize the tx window (performance impact), or to overfill the remote
tx window (causing dropped frames or a disconnect).

This patch removes an extra setting of remote_tx_win when a
configuration response is received.  The transmit window has a
different meaning in a response - it is an informational value
less than or equal to the local tx_win.

Signed-off-by: Mat Martineau <mathewm@codeaurora.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-08-10 07:59:11 -04:00
Mat Martineau 86b1b26326 Bluetooth: Fix endianness issue with L2CAP MPS configuration
Incoming configuration values must be converted to native CPU order
before use.  This fixes a bug where a little-endian MPS value is
compared to a native CPU value.  On big-endian processors, this
can cause ERTM and streaming mode segmentation to produce PDUs
that are larger than the remote stack is expecting, or that would
produce fragmented skbs that the current FCS code cannot handle.

Signed-off-by: Mat Martineau <mathewm@codeaurora.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-08-10 07:59:09 -04:00
Ben Greear 9871e50edd net: Use NET_XMIT_SUCCESS where possible.
This is based on work originally done by Patric McHardy.

Signed-off-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-10 02:51:11 -07:00
Jarek Poplawski 68fd26b598 pkt_sched: Add some basic qdisc class ops verification. Was: [PATCH] sfq: add dummy bind/unbind handles
Verify in register_qdisc() some basic qdisc class handlers are present.

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-10 01:39:14 -07:00
Jarek Poplawski da7115d94a pkt_sched: sch_sfq: Add dummy unbind_tcf and put handles. Was: [PATCH] sfq: add dummy bind/unbind handles
Add dummy .unbind_tcf and .put qdisc class ops for easier verification.
(All other schedulers have it like this.)

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-10 01:39:13 -07:00
Linus Torvalds f6cec0ae58 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (59 commits)
  igbvf.txt: Add igbvf Documentation
  igb.txt: Add igb documentation
  e100/e1000*/igb*/ixgb*: Add missing read memory barrier
  ixgbe: fix build error with FCOE_CONFIG without DCB_CONFIG
  netxen: protect tx timeout recovery by rtnl lock
  isdn: gigaset: use after free
  isdn: gigaset: add missing unlock
  solos-pci: Fix race condition in tasklet RX handling
  pkt_sched: Fix sch_sfq vs tcf_bind_filter oops
  net: disable preemption before call smp_processor_id()
  tcp: no md5sig option size check bug
  iwlwifi: fix locking assertions
  iwlwifi: fix TX tracer
  isdn: fix information leak
  net: Fix napi_gro_frags vs netpoll path
  usbnet: remove noisy and hardly useful printk
  rtl8180: avoid potential NULL deref in rtl8180_beacon_work
  ath9k: Remove myself from the MAINTAINERS list
  libertas: scan before assocation if no BSSID was given
  libertas: fix association with some APs by using extended rates
  ...
2010-08-09 21:05:52 -07:00
Johannes Berg fe100acddf cfg80211: fix locking in action frame TX
Accesses to "wdev->current_bss" must be
locked with the wdev lock, which action
frame transmission is missing.

Cc: stable@kernel.org [2.6.33+]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-08-09 15:18:57 -04:00
Jarek Poplawski eb4a5527b1 pkt_sched: Fix sch_sfq vs tcf_bind_filter oops
Since there was added ->tcf_chain() method without ->bind_tcf() to
sch_sfq class options, there is oops when a filter is added with
the classid parameter.

Fixes commit 7d2681a6ff
netdev thread: null pointer at cls_api.c

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Reported-by: Franchoze Eric <franchoze@yandex.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-07 22:45:41 -07:00
Changli Gao cece1945bf net: disable preemption before call smp_processor_id()
Although netif_rx() isn't expected to be called in process context with
preemption enabled, it'd better handle this case. And this is why get_cpu()
is used in the non-RPS #ifdef branch. If tree RCU is selected,
rcu_read_lock() won't disable preemption, so preempt_disable() should be
called explictly.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-07 20:35:43 -07:00
Dmitry Popov ba78e2ddca tcp: no md5sig option size check bug
tcp_parse_md5sig_option doesn't check md5sig option (TCPOPT_MD5SIG)
length, but tcp_v[46]_inbound_md5_hash assume that it's at least 16
bytes long.

Signed-off-by: Dmitry Popov <dp@highloadlab.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-07 20:24:28 -07:00
Linus Torvalds 0d9f9e122c Merge branch 'for-2.6.36' of git://linux-nfs.org/~bfields/linux
* 'for-2.6.36' of git://linux-nfs.org/~bfields/linux: (34 commits)
  nfsd4: fix file open accounting for RDWR opens
  nfsd: don't allow setting maxblksize after svc created
  nfsd: initialize nfsd versions before creating svc
  net: sunrpc: removed duplicated #include
  nfsd41: Fix a crash when a callback is retried
  nfsd: fix startup/shutdown order bug
  nfsd: minor nfsd read api cleanup
  gcc-4.6: nfsd: fix initialized but not read warnings
  nfsd4: share file descriptors between stateid's
  nfsd4: fix openmode checking on IO using lock stateid
  nfsd4: miscellaneous process_open2 cleanup
  nfsd4: don't pretend to support write delegations
  nfsd: bypass readahead cache when have struct file
  nfsd: minor nfsd_svc() cleanup
  nfsd: move more into nfsd_startup()
  nfsd: just keep single lockd reference for nfsd
  nfsd: clean up nfsd_create_serv error handling
  nfsd: fix error handling in __write_ports_addxprt
  nfsd: fix error handling when starting nfsd with rpcbind down
  nfsd4: fix v4 state shutdown error paths
  ...
2010-08-07 14:24:41 -07:00
Linus Torvalds 5df6b8e65a Merge branch 'nfs-for-2.6.36' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* 'nfs-for-2.6.36' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (42 commits)
  NFS: NFSv4.1 is no longer a "developer only" feature
  NFS: NFS_V4 is no longer an EXPERIMENTAL feature
  NFS: Fix /proc/mount for legacy binary interface
  NFS: Fix the locking in nfs4_callback_getattr
  SUNRPC: Defer deleting the security context until gss_do_free_ctx()
  SUNRPC: prevent task_cleanup running on freed xprt
  SUNRPC: Reduce asynchronous RPC task stack usage
  SUNRPC: Move the bound cred to struct rpc_rqst
  SUNRPC: Clean up of rpc_bindcred()
  SUNRPC: Move remaining RPC client related task initialisation into clnt.c
  SUNRPC: Ensure that rpc_exit() always wakes up a sleeping task
  SUNRPC: Make the credential cache hashtable size configurable
  SUNRPC: Store the hashtable size in struct rpc_cred_cache
  NFS: Ensure the AUTH_UNIX credcache is allocated dynamically
  NFS: Fix the NFS users of rpc_restart_call()
  SUNRPC: The function rpc_restart_call() should return success/failure
  NFSv4: Get rid of the bogus RPC_ASSASSINATED(task) checks
  NFSv4: Clean up the process of renewing the NFSv4 lease
  NFSv4.1: Handle NFS4ERR_DELAY on SEQUENCE correctly
  NFS: nfs_rename() should not have to flush out writebacks
  ...
2010-08-07 13:19:36 -07:00
Andrea Gelmini e2aa7f8304 net: sunrpc: removed duplicated #include
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-08-06 17:05:39 -04:00
David S. Miller e225567960 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2010-08-06 13:30:43 -07:00