Introduce per-conntrack locks and use them instead of the global protocol
locks to avoid contention. Especially tcp_lock shows up very high in
profiles on larger machines.
This will also allow to simplify the upcoming reliable event delivery patches.
Signed-off-by: Patrick McHardy <kaber@trash.net>
As the module uses rcu_call() we should make sure that all
rcu callback has been completed before removing the code.
Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On module unload call rcu_barrier(), this is needed as synchronize_rcu()
is not strong enough. The kmem_cache_destroy() does invoke
synchronize_rcu() but it does not provide same protection.
Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This module uses rcu_call() thus it should use rcu_barrier()
on module unload.
Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
Acked-by: Oliver Hartkopp <oliver@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This module uses rcu_call() thus it should use rcu_barrier() on module unload.
Also fixed a trivial typo 'nfetlink' -> 'nfnetlink' in comment.
Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The VLAN 8021q driver needs to call rcu_barrier() when unloading the module,
instead of syncronize_net(). This is needed to make sure that outstanding
call_rcu() callbacks have completed, before the callback function code is
removed on module unload.
Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
cls_cgroup: Fix oops when user send improperly 'tc filter add' request
r8169: fix crash when large packets are received
If the XT_SOCKET_TRANSPARENT flag is set, enabled 'transparent'
socket option is required for the socket to be matched.
Signed-off-by: Laszlo Attila Toth <panther@balabit.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Add a netlink interface for configuration of IEEE 802.15.4 device. Also this
interface specifies events notification sent by devices towards higher layers.
Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: Sergey Lapin <slapin@ossfans.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for communication over IEEE 802.15.4 networks. This implementation
is neither certified nor complete, but aims to that goal. This commit contains
only the socket interface for communication over IEEE 802.15.4 networks.
One can either send RAW datagrams or use SOCK_DGRAM to encapsulate data
inside normal IEEE 802.15.4 packets.
Configuration interface, drivers and software MAC 802.15.4 implementation will
follow.
Initial implementation was done by Maxim Gorbachyov, Maxim Osipov and Pavel
Smolensky as a research project at Siemens AG. Later the stack was heavily
reworked to better suit the linux networking model, and is now maitained
as an open project partially sponsored by Siemens.
Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: Sergey Lapin <slapin@ossfans.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
IEEE 802.15.4 stack requires several constants to be defined/adjusted.
Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: Sergey Lapin <slapin@ossfans.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use PSCHED_SHIFT constant instead of '10' in PSCHED_US2NS() and
PSCHED_NS2US() macros to enable changing this value later.
Additionally use PSCHED_SHIFT in sch_hfsc SM_SHIFT and ISM_SHIFT
definitions. This part of the patch is based on feedback from
Patrick McHardy <kaber@trash.net>.
Reported-by: Antonio Almeida <vexwek@gmail.com>
Tested-by: Antonio Almeida <vexwek@gmail.com>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit f001fde5ea
(net: introduce a list of device addresses dev_addr_list (v6))
added one regression Vegard Nossum found in its testings.
With kmemcheck help, Vegard found some uninitialized memory
was read and reported to user, potentialy leaking kernel data.
( thread can be found on http://lkml.org/lkml/2009/5/30/177 )
dev_addr_init() incorrectly uses sizeof() operator. We were
initializing one byte instead of MAX_ADDR_LEN bytes.
Reported-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
I found a bug in cls_cgroup_change() in cls_cgroup.c.
cls_cgroup_change() expected tca[TCA_OPTIONS] was set from user space properly,
but tc in iproute2-2.6.29-1 (which I used) didn't set it.
In the current source code of tc in git, it set tca[TCA_OPTIONS].
git://git.kernel.org/pub/scm/linux/kernel/git/shemminger/iproute2.git
If we always use a newest iproute2 in git when we use cls_cgroup,
we don't face this oops probably.
But I think, kernel shouldn't panic regardless of use program's behaviour.
Signed-off-by: Minoru Usui <usui@mxm.nes.nec.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
Passive OS fingerprinting netfilter module allows to passively detect
remote OS and perform various netfilter actions based on that knowledge.
This module compares some data (WS, MSS, options and it's order, ttl, df
and others) from packets with SYN bit set with dynamically loaded OS
fingerprints.
Fingerprint matching rules can be downloaded from OpenBSD source tree
or found in archive and loaded via netfilter netlink subsystem into
the kernel via special util found in archive.
Archive contains library file (also attached), which was shipped
with iptables extensions some time ago (at least when ipt_osf existed
in patch-o-matic).
Following changes were made in this release:
* added NLM_F_CREATE/NLM_F_EXCL checks
* dropped _rcu list traversing helpers in the protected add/remove calls
* dropped unneded structures, debug prints, obscure comment and check
Fingerprints can be downloaded from
http://www.openbsd.org/cgi-bin/cvsweb/src/etc/pf.os
or can be found in archive
Example usage:
-d switch removes fingerprints
Please consider for inclusion.
Thank you.
Passive OS fingerprint homepage (archives, examples):
http://www.ioremap.net/projects/osf
Signed-off-by: Evgeniy Polyakov <zbr@ioremap.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Current conntrack code kills the ICMP conntrack entry as soon as
the first reply is received. This is incorrect, as we then see only
the first ICMP echo reply out of several possible duplicates as
ESTABLISHED, while the rest will be INVALID. Also this unnecessarily
increases the conntrackd traffic on H-A firewalls.
Make all the ICMP conntrack entries (including the replied ones)
last for the default of nf_conntrack_icmp{,v6}_timeout seconds.
Signed-off-by: Jan "Yenya" Kasprzak <kas@fi.muni.cz>
Signed-off-by: Patrick McHardy <kaber@trash.net>
With the re-write of the RFKILL subsystem it is now possible to easily
integrate RFKILL soft-switch support into the Bluetooth subsystem. All
Bluetooth devices will now get automatically RFKILL support.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
The Bluetooth source uses some endian conversion helpers, that in the end
translate to kernel standard routines. So remove this obfuscation since it
is fully pointless.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This adds the basic constants required to add support for L2CAP Enhanced
Retransmission feature.
Based on a patch from Nathan Holstein <nathan@lampreynetworks.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This patch fixes the errors without changing the l2cap.o binary:
text data bss dec hex filename
18059 568 0 18627 48c3 l2cap.o.after
18059 568 0 18627 48c3 l2cap.o.before
Signed-off-by: Gustavo F. Padovan <gustavo@las.ic.unicamp.br>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
The initial value of err is not used until it is set to -ENOMEM. So just
remove the initialization completely.
Based on a patch from Gustavo F. Padovan <gustavo@las.ic.unicamp.br>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Using the L2CAP_CONF_HINT macro is easier to understand than using a
hardcoded 0x80 value.
Signed-off-by: Gustavo F. Padovan <gustavo@las.ic.unicamp.br>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Use macros instead of hardcoded numbers to make the L2CAP source code
more readable.
Signed-off-by: Gustavo F. Padovan <gustavo@las.ic.unicamp.br>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Change the name of the Kernel CAPI exported function capi_ctr_reseted()
to something representing its purpose better.
Impact: renaming, no functional change
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
vfree() does its own 'NULL' check, so no need for check before
calling it.
Signed-off-by: Figo.zhang <figo1802@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Increment the iovec base by the offset passed in for the initial
copy_to_user() in memcpy_to_iovecend().
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
skb_dma_unmap() is quite expensive for small packets,
because we use two different cache lines from skb_shared_info.
One to access nr_frags, one to access dma_maps[0]
Instead of dma_maps being an array of MAX_SKB_FRAGS + 1 elements,
let dma_head alone in a new dma_head field, close to nr_frags,
to reduce cache lines misses.
Tested on my dev machine (bnx2 & tg3 adapters), nice speedup !
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Get rid of num_dma_maps in struct skb_shared_info, as it seems unused.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On Thu, Jun 04, 2009 at 09:06:00PM +1000, Herbert Xu wrote:
>
> tun: Optimise handling of bogus gso->hdr_len
>
> As all current versions of virtio_net generate a value for the
> header length that's too small, we should optimise this so that
> we don't copy it twice. This can be done by ensuring that it is
> at least as large as the place where we'll write the checksum.
>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
With this applied we can strengthen the partial checksum check:
In skb_partial_csum_set we check to see if the checksum offset
is within the packet. However, we really should check that it
is within the skb head as that's the only bit we can modify
without copying.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The lock "protects" an assignment and a comparision of an integer.
When the caller of device_cmp() evaluates the result, nat->masq_index
may already have been changed (regardless if the lock is there or not).
So, the lock either has to be held during nf_ct_iterate_cleanup(),
or can be removed.
This does the latter.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Adds support for specifying a range of queues instead of a single queue
id. Flows will be distributed across the given range.
This is useful for multicore systems: Instead of having a single
application read packets from a queue, start multiple
instances on queues x, x+1, .. x+n. Each instance can process
flows independently.
Packets for the same connection are put into the same queue.
Signed-off-by: Holger Eitzenberger <heitzenberger@astaro.com>
Signed-off-by: Florian Westphal <fwestphal@astaro.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
We can use wildcard matching here, just like
ab4f21e6fb ("xtables: use NFPROTO_UNSPEC
in more extensions").
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
My mistake, I should have added that when cleaning up
rfkill and changing wimax.
Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
ip_mc_drop_socket() method is declared in linux/igmp.h, which
is included anyhow in af_inet.c. So there is no need for this declaration.
This patch removes it from af_inet.c.
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* iwm doesn't depend on cfg80211 or wireless extensions
* rndis wlan selects cfg80211 - needs to depend
* mac80211 selects cfg80211 - needs to depend
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The rfkill core didn't initialise the poll delayed work
because it assumed that polling was always done by specifying
the poll function. cfg80211, however, would like to start
polling only later, which is a valid use case and easy to
support, so change rfkill to always initialise the poll
delayed work and thus allow starting polling by calling the
rfkill_resume_polling() function after registration.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Fixes spares warning:
net/wireless/util.c:261:5: warning:
symbol 'ieee80211_get_mesh_hdrlen' was not declared. Should it be static?
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
To be easier on drivers and users, have cfg80211 register an
rfkill structure that drivers can access. When soft-killed,
simply take down all interfaces; when hard-killed the driver
needs to notify us and we will take down the interfaces
after the fact. While rfkilled, interfaces cannot be set UP.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Sometimes it is necessary to know how the state is,
and it is easier to query rfkill than keep track of
it somewhere else, so add a function for that. This
could later be expanded to return hard/soft block,
but so far that isn't necessary.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This patch introduces new cfg80211 API to set the TX power
via cfg80211, puts the wext code into cfg80211 and updates
mac80211 to use all that. The -ENETDOWN bits are a hack but
will go away soon.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The new code added by this patch will make rfkill create
a misc character device /dev/rfkill that userspace can use
to control rfkill soft blocks and get status of devices as
well as events when the status changes.
Using it is very simple -- when you open it you can read
a number of times to get the initial state, and every
further read blocks (you can poll) on getting the next
event from the kernel. The same structure you read is
also used when writing to it to change the soft block of
a given device, all devices of a given type, or all
devices.
This also makes CONFIG_RFKILL_INPUT selectable again in
order to be able to test without it present since its
functionality can now be replaced by userspace entirely
and distros and users may not want the input part of
rfkill interfering with their userspace code. We will
also write a userspace daemon to handle all that and
consequently add the input code to the feature removal
schedule.
In order to have rfkilld support both kernels with and
without CONFIG_RFKILL_INPUT (or new kernels after its
eventual removal) we also add an ioctl (that only exists
if rfkill-input is present) to disable rfkill-input.
It is not very efficient, but at least gives the correct
behaviour in all cases.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This patch completely rewrites the rfkill core to address
the following deficiencies:
* all rfkill drivers need to implement polling where necessary
rather than having one central implementation
* updating the rfkill state cannot be done from arbitrary
contexts, forcing drivers to use schedule_work and requiring
lots of code
* rfkill drivers need to keep track of soft/hard blocked
internally -- the core should do this
* the rfkill API has many unexpected quirks, for example being
asymmetric wrt. alloc/free and register/unregister
* rfkill can call back into a driver from within a function the
driver called -- this is prone to deadlocks and generally
should be avoided
* rfkill-input pointlessly is a separate module
* drivers need to #ifdef rfkill functions (unless they want to
depend on or select RFKILL) -- rfkill should provide inlines
that do nothing if it isn't compiled in
* the rfkill structure is not opaque -- drivers need to initialise
it correctly (lots of sanity checking code required) -- instead
force drivers to pass the right variables to rfkill_alloc()
* the documentation is hard to read because it always assumes the
reader is completely clueless and contains way TOO MANY CAPS
* the rfkill code needlessly uses a lot of locks and atomic
operations in locked sections
* fix LED trigger to actually change the LED when the radio state
changes -- this wasn't done before
Tested-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br> [thinkpad]
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This fixes an incorrect assumption (BUG_ON) made in
cfg80211 when handling country IE regulatory requests.
The assumption was that we won't try to call_crda()
twice for the same event and therefore we will not
recieve two replies through nl80211 for the regulatory
request. As it turns out it is true we don't call_crda()
twice for the same event, however, kobject_uevent_env()
*might* send the udev event twice and/or userspace can
simply process the udev event twice. We remove the BUG_ON()
and simply ignore the duplicate request.
For details refer to this thread:
http://marc.info/?l=linux-wireless&m=124149987921337&w=2
Cc: stable@kernel.org
Reported-by: Maxim Levitsky <maximlevitsky@gmail.com>
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
NETDEV_UP is called after the device is set UP, but sometimes
it is useful to be able to veto the device UP. Introduce a
new NETDEV_PRE_UP notifier that can be used for exactly this.
The first use case will be cfg80211 denying interfaces to be
set UP if the device is known to be rfkill'ed.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When the SME requests to associate to an open AP
ieee80211_sta_set_extra_ie() can be called with zero IE
length. When this happens or when the extra IE has already
been set -EALREADY is passed down and the supplicant will
complain that the operation is already in progress and it will
not let us associate. We correct this by treating -EALREADY
from ieee80211_sta_set_extra_ie() as a success just as we do
for wext.
Cc: Shan.Palanisamy@Atheros.com
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
On non-AP interfaces userspace has no business interfering with
the station management, this can confuse mac80211 (and other
drivers probably wouldn't support it anyway). Allow adding and
removing stations only on AP interfaces.
(Reconcile this w/ previous version of patch posted with same
subject... -- JWL)
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
I accidentally transposed these in the patch that "fixed" the defaults,
leading to extremely low throughput because of the huge min CW.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Instead of hardcoding the key length for validation, use the
constants Zhu Yi recently added and add one for AES_CMAC too.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When a scan finishes only the program that asked for it
knows what kind of scan it was; let's tell everybody else
about the scan parameters as well so they can evaluate
the result of the scan better. Also helps with debugging.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
We have some validation code in mac80211 but said code will
force an invalid AID to 0 which isn't a valid AID either;
instead require a valid AID (1-2007) to be passed in from
userspace in cfg80211 already. Also move the code before
the race comment since it can only be executed during STA
addition and thus is not racy.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Ivo has updated the driver to no longer use the change flag,
so we can remove that, but rt2x00 and ath5k still use the
actual value so let's mark it as deprecated too.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Prior implementation of the new sctp_connectx() call that returns
an association ID did not work correctly on non-blocking socket.
This is because we could not return both a EINPROGRESS error and
an association id. This is a new implementation that supports this.
Originally from Ivan Skytte Jørgensen <isj-sctp@i1.dk
Signed-off-by: Ivan Skytte Jørgensen <isj-sctp@i1.dk
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
RFC 5061 Section 5.1 ASCONF Chunk Procedures said:
B4) Re-transmit the ASCONF Chunk last sent and if possible choose an
alternate destination address (please refer to [RFC4960],
Section 6.4.1). An endpoint MUST NOT add new parameters to this
chunk; it MUST be the same (including its Sequence Number) as
the last ASCONF sent. An endpoint MAY, however, bundle an
additional ASCONF with new ASCONF parameters with the next
Sequence Number. For details, see Section 5.5.
This patch fix to choose an alternate destination address when
re-transmit the ASCONF chunk, with some dup codes cleanup.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
sctp_sack_timeout is defined as int, but the sysctl's maxsize is set
to sizeof(long) and the min/max are defined as long.
Signed-off-by: jean-mickael.guerin@6wind.com
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
If T4-rto timer is expired on a removed transport, kernel panic
will occur when we do failure management on that transport.
You can reproduce this use the following sequence:
Endpoint A Endpoint B
(ESTABLISHED) (ESTABLISHED)
<----------------- ASCONF
(SRC=X)
ASCONF ----------------->
(Delete IP Address = X)
<----------------- ASCONF-ACK
(Success Indication)
<----------------- ASCONF
(T4-rto timer expire)
This patch fixed the problem.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
If T2-shutdown timer is expired on a removed transport, kernel
panic will occur when we do failure management on that transport.
You can reproduce this use the following sequence:
Endpoint A Endpoint B
(ESTABLISHED) (ESTABLISHED)
<----------------- SHUTDOWN
(SRC=X)
ASCONF ----------------->
(Delete IP Address = X)
<----------------- ASCONF-ACK
(Success Indication)
<----------------- SHUTDOWN
(T2-shutdown timer expire)
This patch fixed the problem.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
If socket is create by PF_INET type, it can not used IPv6 address
to send/recv DATA. So only enable IPv6 address support on PF_INET6
socket.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Just fix a typo in net/sctp/sm_statetable.c.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Use Unresolvable Address error cause instead of Invalid Mandatory
Parameter error cause when process ASCONF chunk with invalid address
since address parameters are not mandatory in the ASCONF chunk.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
RFC5061 Section 5.2. Upon Reception of an ASCONF Chunk
V2) In processing the chunk, the receiver should build a
response message with the appropriate error TLVs, as
specified in the Parameter type bits, for any ASCONF
Parameter it does not understand. To indicate an
unrecognized parameter, Cause Type 8 should be used as
defined in the ERROR in Section 3.3.10.8, [RFC4960]. The
endpoint may also use the response to carry rejections for
other reasons, such as resource shortages, etc., using the
Error Cause TLV and an appropriate error condition.
So we should indicate an unrecognized parameter with error
SCTP_ERROR_UNKNOWN_PARAM in ACSONF-ACK chunk, not
SCTP_ERROR_INV_PARAM.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Define three accessors to get/set dst attached to a skb
struct dst_entry *skb_dst(const struct sk_buff *skb)
void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst)
void skb_dst_drop(struct sk_buff *skb)
This one should replace occurrences of :
dst_release(skb->dst)
skb->dst = NULL;
Delete skb->dst field
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Define skb_rtable(const struct sk_buff *skb) accessor to get rtable from skb
Delete skb->rtable field
Setting rtable is not allowed, just set dst instead as rtable is an alias.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes the notify chain infrastructure and replace it
by a simple function pointer. This issue has been mentioned in the
mailing list several times: the use of the notify chain adds
too much overhead for something that is only used by ctnetlink.
This patch also changes nfnetlink_send(). It seems that gfp_any()
returns GFP_KERNEL for user-context request, like those via
ctnetlink, inside the RCU read-side section which is not valid.
Using GFP_KERNEL is also evil since netlink may schedule(),
this leads to "scheduling while atomic" bug reports.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch simplifies the conntrack event caching system by removing
several events:
* IPCT_[*]_VOLATILE, IPCT_HELPINFO and IPCT_NATINFO has been deleted
since the have no clients.
* IPCT_COUNTER_FILLING which is a leftover of the 32-bits counter
days.
* IPCT_REFRESH which is not of any use since we always include the
timeout in the messages.
After this patch, the existing events are:
* IPCT_NEW, IPCT_RELATED and IPCT_DESTROY, that are used to identify
addition and deletion of entries.
* IPCT_STATUS, that notes that the status bits have changes,
eg. IPS_SEEN_REPLY and IPS_ASSURED.
* IPCT_PROTOINFO, that reports that internal protocol information has
changed, eg. the TCP, DCCP and SCTP protocol state.
* IPCT_HELPER, that a helper has been assigned or unassigned to this
entry.
* IPCT_MARK and IPCT_SECMARK, that reports that the mark has changed, this
covers the case when a mark is set to zero.
* IPCT_NATSEQADJ, to report that there's updates in the NAT sequence
adjustment.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
During the module removal there are no possible event listeners
since ctnetlink must be removed before to allow removing
nf_conntrack. This patch removes the event reporting for the
module removal case which is not of any use in the existing code.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch cleans up the message calculation to make it similar
to rtnetlink, moreover, it removes unneeded verbose information.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch is a cleanup, it removes the `nowait' parameter
from all *fill_info() function since it is always set to one.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch cleans up the message handling path in two aspects:
* it uses NLMSG_LENGTH() instead of NLMSG_SPACE() like rtnetlink
does in this case to check if there is enough room for the
Netlink/nfnetlink headers. No need to check for the padding room.
* it removes a redundant header size checking that has been
already do at the beginning of the function.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The patch below adds supporting TCP simultaneous open to conntrack. The
unused LISTEN state is replaced by a new state (SYN_SENT2) denoting the
second SYN sent from the reply direction in the new case. The state table
is updated and the function tcp_in_window is modified to handle
simultaneous open.
The functionality can fairly easily be tested by socat. A sample tcpdump
recording
23:21:34.244733 IP (tos 0x0, ttl 64, id 49224, offset 0, flags [DF], proto TCP (6), length 60) 192.168.0.254.2020 > 192.168.0.1.2020: S, cksum 0xe75f (correct), 3383710133:3383710133(0) win 5840 <mss 1460,sackOK,timestamp 173445629 0,nop,wscale 7>
23:21:34.244783 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 40) 192.168.0.1.2020 > 192.168.0.254.2020: R, cksum 0x0253 (correct), 0:0(0) ack 3383710134 win 0
23:21:36.038680 IP (tos 0x0, ttl 64, id 28092, offset 0, flags [DF], proto TCP (6), length 60) 192.168.0.1.2020 > 192.168.0.254.2020: S, cksum 0x704b (correct), 2634546729:2634546729(0) win 5840 <mss 1460,sackOK,timestamp 824213 0,nop,wscale 1>
23:21:36.038777 IP (tos 0x0, ttl 64, id 49225, offset 0, flags [DF], proto TCP (6), length 60) 192.168.0.254.2020 > 192.168.0.1.2020: S, cksum 0xb179 (correct), 3383710133:3383710133(0) ack 2634546730 win 5840 <mss 1460,sackOK,timestamp 173447423 824213,nop,wscale 7>
23:21:36.038847 IP (tos 0x0, ttl 64, id 28093, offset 0, flags [DF], proto TCP (6), length 52) 192.168.0.1.2020 > 192.168.0.254.2020: ., cksum 0xebad (correct), ack 3383710134 win 2920 <nop,nop,timestamp 824213 173447423>
and the corresponding netlink events:
[NEW] tcp 6 120 SYN_SENT src=192.168.0.254 dst=192.168.0.1 sport=2020 dport=2020 [UNREPLIED] src=192.168.0.1 dst=192.168.0.254 sport=2020 dport=2020
[UPDATE] tcp 6 120 LISTEN src=192.168.0.254 dst=192.168.0.1 sport=2020 dport=2020 src=192.168.0.1 dst=192.168.0.254 sport=2020 dport=2020
[UPDATE] tcp 6 60 SYN_RECV src=192.168.0.254 dst=192.168.0.1 sport=2020 dport=2020 src=192.168.0.1 dst=192.168.0.254 sport=2020 dport=2020
[UPDATE] tcp 6 432000 ESTABLISHED src=192.168.0.254 dst=192.168.0.1 sport=2020 dport=2020 src=192.168.0.1 dst=192.168.0.254 sport=2020 dport=2020 [ASSURED]
The RST packet was dropped in the raw table, thus it did not reach
conntrack. nfnetlink_conntrack is unpatched so it shows the new SYN_SENT2
state as the old unused LISTEN.
With TCP simultaneous open support we satisfy REQ-2 in RFC 5382 ;-) .
Additional minor correction in this patch is that in order to catch
uninitialized reply directions, "td_maxwin == 0" is used instead of
"td_end == 0" because the former can't be true except in uninitialized
state while td_end may accidentally be equal to zero in the mid of a
connection.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch fixes a bug which unconfigured struct tcf_proto keeps
chaining in tc_ctl_tfilter(), and avoids kernel panic in
cls_cgroup_classify() when we use cls_cgroup.
When we execute 'tc filter add', tcf_proto is allocated, initialized
by classifier's init(), and chained. After it's chained,
tc_ctl_tfilter() calls classifier's change(). When classifier's
change() fails, tc_ctl_tfilter() does not free and keeps tcf_proto.
In addition, cls_cgroup is initialized in change() not in init(). It
accesses unconfigured struct tcf_proto which is chained before
change(), then hits Oops.
Signed-off-by: Minoru Usui <usui@mxm.nes.nec.co.jp>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Tested-by: Minoru Usui <usui@mxm.nes.nec.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
After some discussion offline with Christoph Lameter and David Stevens
regarding multicast behaviour in Linux, I'm submitting a slightly
modified patch from the one Christoph submitted earlier.
This patch provides a new socket option IP_MULTICAST_ALL.
In this case, default behaviour is _unchanged_ from the current
Linux standard. The socket option is set by default to provide
original behaviour. Sockets wishing to receive data only from
multicast groups they join explicitly will need to clear this
socket option.
Signed-off-by: Nivedita Singhvi <niv@us.ibm.com>
Signed-off-by: Christoph Lameter<cl@linux.com>
Acked-by: David Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Print-out the error value when sock_alloc_send_skb() fails in
the IPv6 neighbor discovery code - can be useful for debugging.
Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the unlikely event that gprs_writeable() and gprs_xmit() check for
writeability at the same, we could stop the device queue forever.
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>