Commit Graph

5264 Commits

Author SHA1 Message Date
Alexander Duyck 6648bd7e0e ipv4: Add sysctl knob to control early socket demux
This change is meant to add a control for disabling early socket demux.
The main motivation behind this patch is to provide an option to disable
the feature as it adds an additional cost to routing that reduces overall
throughput by up to 5%.  For example one of my systems went from 12.1Mpps
to 11.6 after the early socket demux was added.  It looks like the reason
for the regression is that we are now having to perform two lookups, first
the one for an established socket, and then the one for the routing table.

By adding this patch and toggling the value for ip_early_demux to 0 I am
able to get back to the 12.1Mpps I was previously seeing.

[ Move local variables in ip_rcv_finish() down into the basic
  block in which they are actually used.  -DaveM ]

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-22 17:11:13 -07:00
David S. Miller 41063e9dd1 ipv4: Early TCP socket demux.
Input packet processing for local sockets involves two major demuxes.
One for the route and one for the socket.

But we can optimize this down to one demux for certain kinds of local
sockets.

Currently we only do this for established TCP sockets, but it could
at least in theory be expanded to other kinds of connections.

If a TCP socket is established then it's identity is fully specified.

This means that whatever input route was used during the three-way
handshake must work equally well for the rest of the connection since
the keys will not change.

Once we move to established state, we cache the receive packet's input
route to use later.

Like the existing cached route in sk->sk_dst_cache used for output
packets, we have to check for route invalidations using dst->obsolete
and dst->ops->check().

Early demux occurs outside of a socket locked section, so when a route
invalidation occurs we defer the fixup of sk->sk_rx_dst until we are
actually inside of established state packet processing and thus have
the socket locked.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-19 21:22:05 -07:00
David S. Miller f9242b6b28 inet: Sanitize inet{,6} protocol demux.
Don't pretend that inet_protos[] and inet6_protos[] are hashes, thay
are just a straight arrays.  Remove all unnecessary hash masking.

Document MAX_INET_PROTOS.

Use RAW_HTABLE_SIZE when appropriate.

Reported-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-19 18:56:21 -07:00
David S. Miller a77f4b4acf Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next
John Linville says:

====================
This is a sizeable batch of updates intended for 3.6...

The bulk of the changes here are Bluetooth.  Gustavo says:

	Here goes the first Bluetooth pull request for 3.6, we have
	queued quite a lot of work. Andrei Emeltchenko added the AMP
	Manager code, a lot of work is needed, but the first bit are
	already there. This code is disabled by default.  Mat Martineau
	changed the whole L2CAP ERTM state machine code, replacing
	the old one with a new implementation. Besides that we had
	lot of coding style fixes (to follow net rules), more l2cap
	core separation from socket and many clean ups and fixed all
	over the tree.

Along with the above, there is a healthy dose of ath9k, iwlwifi,
and other driver updates.  There is also another pull from the
wireless tree to resolve some merge issues.  I also fixed-up some
merge discrepencies between net-next and wireless-next.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-19 14:37:15 -07:00
John W. Linville b3c911eeb4 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem
Conflicts:
	drivers/net/wireless/iwlwifi/dvm/testmode.c
	drivers/net/wireless/iwlwifi/pcie/trans.c
2012-06-19 14:41:22 -04:00
Pablo Neira Ayuso 674147e211 netfilter: fix missing symbols if CONFIG_NETFILTER_NETLINK_QUEUE_CT unset
ERROR: "nfqnl_ct_parse" [net/netfilter/nfnetlink_queue.ko] undefined!
ERROR: "nfqnl_ct_seq_adjust" [net/netfilter/nfnetlink_queue.ko] undefined!
ERROR: "nfqnl_ct_put" [net/netfilter/nfnetlink_queue.ko] undefined!
ERROR: "nfqnl_ct_get" [net/netfilter/nfnetlink_queue.ko] undefined!

We have to use CONFIG_NETFILTER_NETLINK_QUEUE_CT in
include/net/netfilter/nfnetlink_queue.h, not CONFIG_NF_CONNTRACK.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-18 21:09:17 -07:00
Pablo Neira Ayuso 7c62234547 netfilter: nfnetlink_queue: fix compilation with NF_CONNTRACK disabled
In "9cb0176 netfilter: add glue code to integrate nfnetlink_queue and ctnetlink"
the compilation with NF_CONNTRACK disabled is broken. This patch fixes this
issue.

I have moved the conntrack part into nfnetlink_queue_ct.c to avoid
peppering the entire nfnetlink_queue.c code with ifdefs.

I also needed to rename nfnetlink_queue.c to nfnetlink_queue_pkt.c
to update the net/netfilter/Makefile to support conditional compilation
of the conntrack integration.

This patch also adds CONFIG_NETFILTER_QUEUE_CT in case you want to explicitly
disable the integration between nf_conntrack and nfnetlink_queue.

Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-06-19 04:44:57 +02:00
David S. Miller 82f437b950 Merge branch 'master' of git://1984.lsi.us.es/nf-next
Pablo says:

====================
This is the second batch of Netfilter updates for net-next. It contains the
kernel changes for the new user-space connection tracking helper
infrastructure.

More details on this infrastructure are provides here:
http://lwn.net/Articles/500196/

Still, I plan to provide some official documentation through the
conntrack-tools user manual on how to setup user-space utilities for this.
So far, it provides two helper in user-space, one for NFSv3 and another for
Oracle/SQLnet/TNS. Yet in my TODO list.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-16 15:23:35 -07:00
Eldad Zack 7f95e1880e include/net/dst.h: neaten asterisk placement
Fix code style - place the asterisk where it belongs.

Signed-off-by: Eldad Zack <eldad@fogrefinery.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-16 15:20:35 -07:00
Pablo Neira Ayuso 12f7a50533 netfilter: add user-space connection tracking helper infrastructure
There are good reasons to supports helpers in user-space instead:

* Rapid connection tracking helper development, as developing code
  in user-space is usually faster.

* Reliability: A buggy helper does not crash the kernel. Moreover,
  we can monitor the helper process and restart it in case of problems.

* Security: Avoid complex string matching and mangling in kernel-space
  running in privileged mode. Going further, we can even think about
  running user-space helpers as a non-root process.

* Extensibility: It allows the development of very specific helpers (most
  likely non-standard proprietary protocols) that are very likely not to be
  accepted for mainline inclusion in the form of kernel-space connection
  tracking helpers.

This patch adds the infrastructure to allow the implementation of
user-space conntrack helpers by means of the new nfnetlink subsystem
`nfnetlink_cthelper' and the existing queueing infrastructure
(nfnetlink_queue).

I had to add the new hook NF_IP6_PRI_CONNTRACK_HELPER to register
ipv[4|6]_helper which results from splitting ipv[4|6]_confirm into
two pieces. This change is required not to break NAT sequence
adjustment and conntrack confirmation for traffic that is enqueued
to our user-space conntrack helpers.

Basic operation, in a few steps:

1) Register user-space helper by means of `nfct':

 nfct helper add ftp inet tcp

 [ It must be a valid existing helper supported by conntrack-tools ]

2) Add rules to enable the FTP user-space helper which is
   used to track traffic going to TCP port 21.

For locally generated packets:

 iptables -I OUTPUT -t raw -p tcp --dport 21 -j CT --helper ftp

For non-locally generated packets:

 iptables -I PREROUTING -t raw -p tcp --dport 21 -j CT --helper ftp

3) Run the test conntrackd in helper mode (see example files under
   doc/helper/conntrackd.conf

 conntrackd

4) Generate FTP traffic going, if everything is OK, then conntrackd
   should create expectations (you can check that with `conntrack':

 conntrack -E expect

    [NEW] 301 proto=6 src=192.168.1.136 dst=130.89.148.12 sport=0 dport=54037 mask-src=255.255.255.255 mask-dst=255.255.255.255 sport=0 dport=65535 master-src=192.168.1.136 master-dst=130.89.148.12 sport=57127 dport=21 class=0 helper=ftp
[DESTROY] 301 proto=6 src=192.168.1.136 dst=130.89.148.12 sport=0 dport=54037 mask-src=255.255.255.255 mask-dst=255.255.255.255 sport=0 dport=65535 master-src=192.168.1.136 master-dst=130.89.148.12 sport=57127 dport=21 class=0 helper=ftp

This confirms that our test helper is receiving packets including the
conntrack information, and adding expectations in kernel-space.

The user-space helper can also store its private tracking information
in the conntrack structure in the kernel via the CTA_HELP_INFO. The
kernel will consider this a binary blob whose layout is unknown. This
information will be included in the information that is transfered
to user-space via glue code that integrates nfnetlink_queue and
ctnetlink.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-06-16 15:40:02 +02:00
Pablo Neira Ayuso ae243bee39 netfilter: ctnetlink: add CTA_HELP_INFO attribute
This attribute can be used to modify and to dump the internal
protocol information.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-06-16 15:09:15 +02:00
Pablo Neira Ayuso 8c88f87cb2 netfilter: nfnetlink_queue: add NAT TCP sequence adjustment if packet mangled
User-space programs that receive traffic via NFQUEUE may mangle packets.
If NAT is enabled, this usually puzzles sequence tracking, leading to
traffic disruptions.

With this patch, nfnl_queue will make the corresponding NAT TCP sequence
adjustment if:

1) The packet has been mangled,
2) the NFQA_CFG_F_CONNTRACK flag has been set, and
3) NAT is detected.

There are some records on the Internet complaning about this issue:
http://stackoverflow.com/questions/260757/packet-mangling-utilities-besides-iptables

By now, we only support TCP since we have no helpers for DCCP or SCTP.
Better to add this if we ever have some helper over those layer 4 protocols.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-06-16 15:09:08 +02:00
Pablo Neira Ayuso 1afc56794e netfilter: nf_ct_helper: implement variable length helper private data
This patch uses the new variable length conntrack extensions.

Instead of using union nf_conntrack_help that contain all the
helper private data information, we allocate variable length
area to store the private helper data.

This patch includes the modification of all existing helpers.
It also includes a couple of include header to avoid compilation
warnings.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-06-16 15:08:55 +02:00
Pablo Neira Ayuso 3cf4c7e381 netfilter: nf_ct_ext: support variable length extensions
We can now define conntrack extensions of variable size. This
patch is useful to get rid of these unions:

union nf_conntrack_help
union nf_conntrack_proto
union nf_conntrack_nat_help

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-06-16 15:08:49 +02:00
Pablo Neira Ayuso 3a8fc53a45 netfilter: nf_ct_helper: allocate 16 bytes for the helper and policy names
This patch modifies the struct nf_conntrack_helper to allocate
the room for the helper name. The maximum length is 16 bytes
(this was already introduced in 2.6.24).

For the maximum length for expectation policy names, I have
also selected 16 bytes.

This patch is required by the follow-up patch to support
user-space connection tracking helpers.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-06-16 15:08:39 +02:00
David S. Miller aee289baaa Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	net/ipv6/route.c

Pull in 'net' again to get the revert of Thomas's change
which introduced regressions.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-16 01:23:04 -07:00
David S. Miller e8803b6c38 Revert "ipv6: Prevent access to uninitialized fib_table_hash via /proc/net/ipv6_route"
This reverts commit 2a0c451ade.

It causes crashes, because now ip6_null_entry is used before
it is initialized.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-16 01:12:19 -07:00
David S. Miller 7e52b33bd5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	net/ipv6/route.c

This deals with a merge conflict between the net-next addition of the
inetpeer network namespace ops, and Thomas Graf's bug fix in
2a0c451ade which makes sure we don't
register /proc/net/ipv6_route before it is actually safe to do so.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-15 15:51:55 -07:00
Thomas Graf 2a0c451ade ipv6: Prevent access to uninitialized fib_table_hash via /proc/net/ipv6_route
/proc/net/ipv6_route reflects the contents of fib_table_hash. The proc
handler is installed in ip6_route_net_init() whereas fib_table_hash is
allocated in fib6_net_init() _after_ the proc handler has been installed.

This opens up a short time frame to access fib_table_hash with its pants
down.

fib6_init() as a whole can't be moved to an earlier position as it also
registers the rtnetlink message handlers which should be registered at
the end. Therefore split it into fib6_init() which is run early and
fib6_init_late() to register the rtnetlink message handlers.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Reviewed-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-15 15:30:15 -07:00
David S. Miller 81aded2467 ipv6: Handle PMTU in ICMP error handlers.
One tricky issue on the ipv6 side vs. ipv4 is that the ICMP callouts
to handle the error pass the 32-bit info cookie in network byte order
whereas ipv4 passes it around in host byte order.

Like the ipv4 side, we have two helper functions.  One for when we
have a socket context and one for when we do not.

ip6ip6 tunnels are not handled here, because they handle PMTU events
by essentially relaying another ICMP packet-too-big message back to
the original sender.

This patch allows us to get rid of rt6_do_pmtu_disc().  It handles all
kinds of situations that simply cannot happen when we do the PMTU
update directly using a fully resolved route.

In fact, the "plen == 128" check in ip6_rt_update_pmtu() can very
likely be removed or changed into a BUG_ON() check.  We should never
have a prefixed ipv6 route when we get there.

Another piece of strange history here is that TCP and DCCP, unlike in
ipv4, never invoke the update_pmtu() method from their ICMP error
handlers.  This is incredibly astonishing since this is the context
where we have the most accurate context in which to make a PMTU
update, namely we have a fully connected socket and associated cached
socket route.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-15 14:54:11 -07:00
David S. Miller 3639339553 ipv4: Handle PMTU in all ICMP error handlers.
With ip_rt_frag_needed() removed, we have to explicitly update PMTU
information in every ICMP error handler.

Create two helper functions to facilitate this.

1) ipv4_sk_update_pmtu()

   This updates the PMTU when we have a socket context to
   work with.

2) ipv4_update_pmtu()

   Raw version, used when no socket context is available.  For this
   interface, we essentially just pass in explicit arguments for
   the flow identity information we would have extracted from the
   socket.

   And you'll notice that ipv4_sk_update_pmtu() is simply implemented
   in terms of ipv4_update_pmtu()

Note that __ip_route_output_key() is used, rather than something like
ip_route_output_flow() or ip_route_output_key().  This is because we
absolutely do not want to end up with a route that does IPSEC
encapsulation and the like.  Instead, we only want the route that
would get us to the node described by the outermost IP header.

Reported-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-14 22:22:07 -07:00
John W. Linville 211c17aaee Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless
Conflicts:
	drivers/net/wireless/ath/ath9k/main.c
	net/bluetooth/hci_event.c
2012-06-13 15:35:35 -04:00
John W. Linville ec8eb9ae58 Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next 2012-06-13 15:12:07 -04:00
John W. Linville 1f7e010282 Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 2012-06-13 14:05:40 -04:00
David S. Miller 43b03f1f6d Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	MAINTAINERS
	drivers/net/wireless/iwlwifi/pcie/trans.c

The iwlwifi conflict was resolved by keeping the code added
in 'net' that turns off the buggy chip feature.

The MAINTAINERS conflict was merely overlapping changes, one
change updated all the wireless web site URLs and the other
changed some GIT trees to be Johannes's instead of John's.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-12 21:59:18 -07:00
Eric Dumazet 5ee31c6898 bonding: Fix corrupted queue_mapping
In the transmit path of the bonding driver, skb->cb is used to
stash the skb->queue_mapping so that the bonding device can set its
own queue mapping.  This value becomes corrupted since the skb->cb is
also used in __dev_xmit_skb.

When transmitting through bonding driver, bond_select_queue is
called from dev_queue_xmit.  In bond_select_queue the original
skb->queue_mapping is copied into skb->cb (via bond_queue_mapping)
and skb->queue_mapping is overwritten with the bond driver queue.

Subsequently in dev_queue_xmit, __dev_xmit_skb is called which writes
the packet length into skb->cb, thereby overwriting the stashed
queue mappping.  In bond_dev_queue_xmit (called from hard_start_xmit),
the queue mapping for the skb is set to the stashed value which is now
the skb length and hence is an invalid queue for the slave device.

If we want to save skb->queue_mapping into skb->cb[], best place is to
add a field in struct qdisc_skb_cb, to make sure it wont conflict with
other layers (eg : Qdiscc, Infiniband...)

This patchs also makes sure (struct qdisc_skb_cb)->data is aligned on 8
bytes :

netem qdisc for example assumes it can store an u64 in it, without
misalignment penalty.

Note : we only have 20 bytes left in (struct qdisc_skb_cb)->data[].
The largest user is CHOKe and it fills it.

Based on a previous patch from Tom Herbert.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Tom Herbert <therbert@google.com>
Cc: John Fastabend <john.r.fastabend@intel.com>
Cc: Roland Dreier <roland@kernel.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-12 15:29:21 -07:00
John W. Linville 0440507bbc Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem 2012-06-12 14:25:04 -04:00
David S. Miller 55afabaa0d inet: Fix BUG triggered by __rt{,6}_get_peer().
If no peer actually gets attached (either because create is zero or
the peer allocation fails) we'll trigger a BUG because we
unconditionally do an rt{,6}_peer_ptr() afterwards.

Fix this by guarding it with the proper check.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-11 15:52:29 -07:00
David S. Miller 67da255210 Merge branch 'master' of git://1984.lsi.us.es/net-next 2012-06-11 12:56:14 -07:00
John W. Linville 2e48686835 Merge tag 'nfc-next-3.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-3.0 2012-06-11 14:46:04 -04:00
David S. Miller 7b34ca2ac7 inet: Avoid potential NULL peer dereference.
We handle NULL in rt{,6}_set_peer but then our caller will try to pass
that NULL pointer into inet_putpeer() which isn't ready for it.

Fix this by moving the NULL check one level up, and then remove the
now unnecessary NULL check from inetpeer_ptr_set_peer().

Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-11 04:13:57 -07:00
David S. Miller 8e77327783 inet: Add inetpeer tree roots to the FIB tables.
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-11 02:09:16 -07:00
David S. Miller b48c80ece9 inet: Add family scope inetpeer flushes.
This implementation can deal with having many inetpeer roots, which is
a necessary prerequisite for per-FIB table rooted peer tables.

Each family (AF_INET, AF_INET6) has a sequence number which we bump
when we get a family invalidation request.

Each peer lookup cheaply checks whether the flush sequence of the
root we are using is out of date, and if so flushes it and updates
the sequence number.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-11 02:09:10 -07:00
David S. Miller 46517008e1 ipv4: Kill ip_rt_frag_needed().
There is zero point to this function.

It's only real substance is to perform an extremely outdated BSD4.2
ICMP check, which we can safely remove.  If you really have a MTU
limited link being routed by a BSD4.2 derived system, here's a nickel
go buy yourself a real router.

The other actions of ip_rt_frag_needed(), checking and conditionally
updating the peer, are done by the per-protocol handlers of the ICMP
event.

TCP, UDP, et al. have a handler which will receive this event and
transmit it back into the associated route via dst_ops->update_pmtu().

This simplification is important, because it eliminates the one place
where we do not have a proper route context in which to make an
inetpeer lookup.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-11 02:08:59 -07:00
David S. Miller 97bab73f98 inet: Hide route peer accesses behind helpers.
We encode the pointer(s) into an unsigned long with one state bit.

The state bit is used so we can store the inetpeer tree root to use
when resolving the peer later.

Later the peer roots will be per-FIB table, and this change works to
facilitate that.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-11 02:08:47 -07:00
Roland Dreier c5d21c4b2a net: Reorder initialization in ip_route_output to fix gcc warning
If I build with W=1, for every file that includes <net/route.h>, I get the warning

    include/net/route.h: In function 'ip_route_output':
    include/net/route.h:135:3: warning: initialized field overwritten [-Woverride-init]
    include/net/route.h:135:3: warning: (near initialization for 'fl4') [-Woverride-init]

(This is with "gcc (Debian 4.6.3-1) 4.6.3")

A fix seems pretty trivial: move the initialization of .flowi4_tos
earlier.  As far as I can tell, this has no effect on code generation.

Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-11 00:04:47 -07:00
David S. Miller c0efc887dc inet: Pass inetpeer root into inet_getpeer*() interfaces.
Otherwise we reference potentially non-existing members when
ipv6 is disabled.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-09 19:12:36 -07:00
David S. Miller 56a6b248eb inet: Consolidate inetpeer_invalidate_tree() interfaces.
We only need one interface for this operation, since we always know
which inetpeer root we want to flush.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-09 16:32:41 -07:00
David S. Miller c3426b4719 inet: Initialize per-netns inetpeer roots in net/ipv{4,6}/route.c
Instead of net/ipv4/inetpeer.c

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-09 16:27:05 -07:00
David S. Miller 2397849baa [PATCH] tcp: Cache inetpeer in timewait socket, and only when necessary.
Since it's guarenteed that we will access the inetpeer if we're trying
to do timewait recycling and TCP options were enabled on the
connection, just cache the peer in the timewait socket.

In the future, inetpeer lookups will be context dependent (per routing
realm), and this helps facilitate that as well.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-09 14:56:12 -07:00
Johannes Berg d13e141481 mac80211: add some missing kernel-doc
Add a few kernel-doc descriptions that were missed
during development.

Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-06-09 10:31:09 +02:00
David S. Miller 4670fd819e tcp: Get rid of inetpeer special cases.
The get_peer method TCP uses is full of special cases that make no
sense accommodating, and it also gets in the way of doing more
reasonable things here.

First of all, if the socket doesn't have a usable cached route, there
is no sense in trying to optimize timewait recycling.

Likewise for the case where we have IP options, such as SRR enabled,
that make the IP header destination address (and thus the destination
address of the route key) differ from that of the connection's
destination address.

Just return a NULL peer in these cases, and thus we're also able to
get rid of the clumsy inetpeer release logic.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-09 01:25:47 -07:00
David S. Miller fbfe95a42e inet: Create and use rt{,6}_get_peer_create().
There's a lot of places that open-code rt{,6}_get_peer() only because
they want to set 'create' to one.  So add an rt{,6}_get_peer_create()
for their sake.

There were also a few spots open-coding plain rt{,6}_get_peer() and
those are transformed here as well.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-08 23:24:18 -07:00
Johan Hedberg 1c2e004183 Bluetooth: Add support for encryption key refresh
With LE/SMP the completion of a security level elavation from medium to
high is indicated by a HCI Encryption Key Refresh Complete event. The
necessary behavior upon receiving this event is a mix of what's done for
auth_complete and encryption_change, which is also where most of the
event handling code has been copied from.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2012-06-08 21:00:40 -03:00
Eric Dumazet 7123aaa3a1 af_unix: speedup /proc/net/unix
/proc/net/unix has quadratic behavior, and can hold unix_table_lock for
a while if high number of unix sockets are alive. (90 ms for 200k
sockets...)

We already have a hash table, so its quite easy to use it.

Problem is unbound sockets are still hashed in a single hash slot
(unix_socket_table[UNIX_HASH_TABLE])

This patch also spreads unbound sockets to 256 hash slots, to speedup
both /proc/net/unix and unix_diag.

Time to read /proc/net/unix with 200k unix sockets :
(time dd if=/proc/net/unix of=/dev/null bs=4k)

before : 520 secs
after : 2 secs

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-08 14:27:23 -07:00
Gao feng 54db0cc2ba inetpeer: add parameter net for inet_getpeer_v4,v6
add struct net as a parameter of inet_getpeer_v[4,6],
use net to replace &init_net.

and modify some places to provide net for inet_getpeer_v[4,6]

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-08 14:27:23 -07:00
Gao feng c8a627ed06 inetpeer: add namespace support for inetpeer
now inetpeer doesn't support namespace,the information will
be leaking across namespace.

this patch move the global vars v4_peers and v6_peers to
netns_ipv4 and netns_ipv6 as a field peers.

add struct pernet_operations inetpeer_ops to initial pernet
inetpeer data.

and change family_to_base and inet_getpeer to support namespace.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-08 14:27:23 -07:00
Gao feng 8264deb818 netfilter: nf_conntrack: add namespace support for cttimeout
This patch adds namespace support for cttimeout.

Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-06-07 14:58:41 +02:00
Pablo Neira Ayuso e76d0af5e4 netfilter: nf_conntrack: remove now unused sysctl for nf_conntrack_l[3|4]proto
Since the sysctl data for l[3|4]proto now resides in pernet nf_proto_net.
We can now remove this unused fields from struct nf_contrack_l[3,4]proto.

Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-06-07 14:58:41 +02:00
Gao feng 7080ba0955 netfilter: nf_ct_icmp: add namespace support
This patch adds namespace support for ICMPv6 protocol tracker.

Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-06-07 14:58:40 +02:00