Commit 'net: skb->dst accessors'(adf30907d6)
broken the sctp protocol stack, the sctp packet can never be sent out after
Eric Dumazet's patch, which have typo in the sctp code.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Vlad Yasevich <vladisalv.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change all the code that deals directly with ICMPv6 type and code
values to use u8 instead of a signed int as that's the actual data
type.
Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 2b85a34e91
(net: No more expensive sock_hold()/sock_put() on each tx)
changed initial sk_wmem_alloc value.
We need to take into account this offset when reporting
sk_wmem_alloc to user, in PROC_FS files or various
ioctls (SIOCOUTQ/TIOCOUTQ)
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On module unload call rcu_barrier(), this is needed as synchronize_rcu()
is not strong enough. The kmem_cache_destroy() does invoke
synchronize_rcu() but it does not provide same protection.
Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Prior implementation of the new sctp_connectx() call that returns
an association ID did not work correctly on non-blocking socket.
This is because we could not return both a EINPROGRESS error and
an association id. This is a new implementation that supports this.
Originally from Ivan Skytte Jørgensen <isj-sctp@i1.dk
Signed-off-by: Ivan Skytte Jørgensen <isj-sctp@i1.dk
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
RFC 5061 Section 5.1 ASCONF Chunk Procedures said:
B4) Re-transmit the ASCONF Chunk last sent and if possible choose an
alternate destination address (please refer to [RFC4960],
Section 6.4.1). An endpoint MUST NOT add new parameters to this
chunk; it MUST be the same (including its Sequence Number) as
the last ASCONF sent. An endpoint MAY, however, bundle an
additional ASCONF with new ASCONF parameters with the next
Sequence Number. For details, see Section 5.5.
This patch fix to choose an alternate destination address when
re-transmit the ASCONF chunk, with some dup codes cleanup.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
sctp_sack_timeout is defined as int, but the sysctl's maxsize is set
to sizeof(long) and the min/max are defined as long.
Signed-off-by: jean-mickael.guerin@6wind.com
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
If T4-rto timer is expired on a removed transport, kernel panic
will occur when we do failure management on that transport.
You can reproduce this use the following sequence:
Endpoint A Endpoint B
(ESTABLISHED) (ESTABLISHED)
<----------------- ASCONF
(SRC=X)
ASCONF ----------------->
(Delete IP Address = X)
<----------------- ASCONF-ACK
(Success Indication)
<----------------- ASCONF
(T4-rto timer expire)
This patch fixed the problem.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
If T2-shutdown timer is expired on a removed transport, kernel
panic will occur when we do failure management on that transport.
You can reproduce this use the following sequence:
Endpoint A Endpoint B
(ESTABLISHED) (ESTABLISHED)
<----------------- SHUTDOWN
(SRC=X)
ASCONF ----------------->
(Delete IP Address = X)
<----------------- ASCONF-ACK
(Success Indication)
<----------------- SHUTDOWN
(T2-shutdown timer expire)
This patch fixed the problem.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
If socket is create by PF_INET type, it can not used IPv6 address
to send/recv DATA. So only enable IPv6 address support on PF_INET6
socket.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Just fix a typo in net/sctp/sm_statetable.c.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Use Unresolvable Address error cause instead of Invalid Mandatory
Parameter error cause when process ASCONF chunk with invalid address
since address parameters are not mandatory in the ASCONF chunk.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
RFC5061 Section 5.2. Upon Reception of an ASCONF Chunk
V2) In processing the chunk, the receiver should build a
response message with the appropriate error TLVs, as
specified in the Parameter type bits, for any ASCONF
Parameter it does not understand. To indicate an
unrecognized parameter, Cause Type 8 should be used as
defined in the ERROR in Section 3.3.10.8, [RFC4960]. The
endpoint may also use the response to carry rejections for
other reasons, such as resource shortages, etc., using the
Error Cause TLV and an appropriate error condition.
So we should indicate an unrecognized parameter with error
SCTP_ERROR_UNKNOWN_PARAM in ACSONF-ACK chunk, not
SCTP_ERROR_INV_PARAM.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Define three accessors to get/set dst attached to a skb
struct dst_entry *skb_dst(const struct sk_buff *skb)
void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst)
void skb_dst_drop(struct sk_buff *skb)
This one should replace occurrences of :
dst_release(skb->dst)
skb->dst = NULL;
Delete skb->dst field
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Define skb_rtable(const struct sk_buff *skb) accessor to get rtable from skb
Delete skb->rtable field
Setting rtable is not allowed, just set dst instead as rtable is an alias.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
this is the sctp code to enable hardware crc32c offload for
adapters that support it.
Originally by: Vlad Yasevich <vladislav.yasevich@hp.com>
modified by Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Setting ->owner as done currently (pde->owner = THIS_MODULE) is racy
as correctly noted at bug #12454. Someone can lookup entry with NULL
->owner, thus not pinning enything, and release it later resulting
in module refcount underflow.
We can keep ->owner and supply it at registration time like ->proc_fops
and ->data.
But this leaves ->owner as easy-manipulative field (just one C assignment)
and somebody will forget to unpin previous/pin current module when
switching ->owner. ->proc_fops is declared as "const" which should give
some thoughts.
->read_proc/->write_proc were just fixed to not require ->owner for
protection.
rmmod'ed directories will be empty and return "." and ".." -- no harm.
And directories with tricky enough readdir and lookup shouldn't be modular.
We definitely don't want such modular code.
Removing ->owner will also make PDE smaller.
So, let's nuke it.
Kudos to Jeff Layton for reminding about this, let's say, oversight.
http://bugzilla.kernel.org/show_bug.cgi?id=12454
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Remove 2 TEST_FRAME hacks that are no longer needed. These allowed
sctp regression tests to compile before, but are no longer needed.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
broken by commit 5e739d1752aca4e8f3e794d431503bfca3162df4; AFAICS should
be -stable fodder as well...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Aced-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
RFC5061 states:
Each adaptation layer that is defined that wishes
to use this parameter MUST specify an adaptation code point in an
appropriate RFC defining its use and meaning.
If the user has not set one - assume they don't want to sent the param
with a zero Adaptation Code Point.
Rationale - Currently the IANA defines zero as reserved - and
1 as the only valid value - so we consider zero to be unset - to save
adding a boolean to the socket structure.
Including this parameter unconditionally causes endpoints that do not
understand it to report errors unnecessarily.
Signed-off-by: Malcolm Lashley <mlashley@gmail.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
RFC3758 Section 3.3.1. Sending Forward-TSN-Supported param in INIT
Note that if the endpoint chooses NOT to include the parameter, then
at no time during the life of the association can it send or process
a FORWARD TSN.
If peer does not support PR-SCTP capable, don't send FORWARD-TSN chunk
to peer.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fix to indicate ASCONF support in INIT-ACK only if peer has
such capable.
This patch also fix to calc the chunk size if peer has no FWD-TSN
capable.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
sctp_inet_listen() call is split between UDP and TCP style. Looking
at the code, the two functions are almost the same and can be
merged into a single helper. This also fixes a bug that was
fixed in the UDP function, but missed in the TCP function.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change sctp_ctl_sock_init() to try IPv4 if IPv6 socket registration
fails. Required if the IPv6 module is loaded with "disable=1", else
SCTP will fail to load.
Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit faee47cdbf
(sctp: Fix the RTO-doubling on idle-link heartbeats)
broke the RTO doubling for data retransmits. If the
heartbeat was sent before the data T3-rtx time, the
the RTO will not double upon the T3-rtx expiration.
Distingish between the operations by passing an argument
to the function.
Additionally, Wei Youngjun pointed out that our treatment
of requested HEARTBEATS and timer HEARTBEATS is the same
wrt resetting congestion window. That needs to be separated,
since user requested HEARTBEATS should not treat the link
as idle.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The functions time_before or time_after are more robust
for comparing jiffies against other values.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The code in sctp_getsockopt_maxburst() doesn't allow len to be larger
then struct sctp_assoc_value, which is a common case where app writers
just pass down the sizeof(buf) or something similar.
This patch fix the problem.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch add the type name "AUTH" and primitive type name
"PRIMITIVE_ASCONF" for debug message.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If ERROR chunk is received with too many error causes in ESTABLISHED
state, the kernel get panic.
This is because sctp limit the max length of cmds to 14, but while
ERROR chunk is received, one error cause will add around 2 cmds by
sctp_add_cmd_sf(). So many error causes will fill the limit of cmds
and panic.
This patch fixed the problem.
This bug can be test by SCTP Conformance Test Suite
<http://networktest.sourceforge.net/>.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
An extra list_del() during the module load failure and unload
resulted in a crash with a list corruption. Now sctp can
be unloaded again.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
During peeloff/accept() sctp needs to save the parent socket state
into the new socket so that any options set on the parent are
inherited by the child socket. This was found when the
parent/listener socket issues SO_BINDTODEVICE, but the
data was misrouted after a route cache flush.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
SCTP incorrectly doubles rto ever time a Hearbeat chunk
is generated. However RFC 4960 states:
On an idle destination address that is allowed to heartbeat, it is
recommended that a HEARTBEAT chunk is sent once per RTO of that
destination address plus the protocol parameter 'HB.interval', with
jittering of +/- 50% of the RTO value, and exponential backoff of the
RTO if the previous HEARTBEAT is unanswered.
Essentially, of if the heartbean is unacknowledged, do we double the RTO.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The sctp crc32c checksum is always generated in little endian.
So, we clean up the code to treat it as little endian and remove
all the __force casts.
Suggested by Herbert Xu.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is a new version of my patch, now using a module parameter instead
of a sysctl, so that the option is harder to find. Please note that,
once the module is loaded, it is still possible to change the value of
the parameter in /sys/module/sctp/parameters/, which is useful if you
want to do performance comparisons without rebooting.
Computation of SCTP checksums significantly affects the performance of
SCTP. For example, using two dual-Opteron 246 connected using a Gbe
network, it was not possible to achieve more than ~730 Mbps, compared to
941 Mbps after disabling SCTP checksums.
Unfortunately, SCTP checksum offloading in NICs is not commonly
available (yet).
By default, checksums are still enabled, of course.
Signed-off-by: Lucas Nussbaum <lucas.nussbaum@ens-lyon.fr>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Base versions handle constant folding now.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a race between sctp_rcv() and sctp_accept() where we
have moved the association from the listening socket to the
accepted socket, but sctp_rcv() processing cached the old
socket and continues to use it.
The easy solution is to check for the socket mismatch once we've
grabed the socket lock. If we hit a mis-match, that means
that were are currently holding the lock on the listening socket,
but the association is refrencing a newly accepted socket. We need
to drop the lock on the old socket and grab the lock on the new one.
A more proper solution might be to create accepted sockets when
the new association is established, similar to TCP. That would
eliminate the race for 1-to-1 style sockets, but it would still
existing for 1-to-many sockets where a user wished to peeloff an
association. For now, we'll live with this easy solution as
it addresses the problem.
Reported-by: Michal Hocko <mhocko@suse.cz>
Reported-by: Karsten Keil <kkeil@suse.de>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Recent changes to the retransmit code exposed a long standing
bug where it was possible for a chunk to be time stamped
after the retransmit timer was reset. This caused a rare
situation where the retrnamist timer has expired, but
nothing was marked for retrnasmission because all of
timesamps on data were less then 1 rto ago. As result,
the timer was never restarted since nothing was retransmitted,
and this resulted in a hung association that did couldn't
complete the data transfer. The solution is to timestamp
the chunk when it's added to the packet for transmission
purposes. After the packet is trsnmitted the rtx timer
is restarted. This guarantees that when the timer expires,
there will be data to retransmit.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 62aeaff5cc
(sctp: Start T3-RTX timer when fast retransmitting lowest TSN)
introduced a regression where it was possible to forcibly
restart the sctp retransmit timer at the transmission of any
new chunk. This resulted in much longer timeout times and
sometimes hung sctp connections.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When I review ocfs2 code, find there are 2 typos to "successfull". After
doing grep "successfull " in kernel tree, 22 typos found totally -- great
minds always think alike :)
This patch fixes all the similar typos. Thanks for Randy's ack and comments.
Signed-off-by: Coly Li <coyli@suse.de>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Roland Dreier <rolandd@cisco.com>
Cc: Jeremy Kerr <jk@ozlabs.org>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Vlad Yasevich <vladislav.yasevich@hp.com>
Cc: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The latest ietf socket extensions API draft said:
8.1.21. Set or Get the SCTP Partial Delivery Point
Note also that the call will fail if the user attempts to set
this value larger than the socket receive buffer size.
This patch add this validity check for SCTP_PARTIAL_DELIVERY_POINT
socket option.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If FWD-TSN chunk is received with bad stream ID, the sctp will not do the
validity check, this may cause memory overflow when overwrite the TSN of
the stream ID.
The FORWARD-TSN chunk is like this:
FORWARD-TSN chunk
Type = 192
Flags = 0
Length = 172
NewTSN = 99
Stream = 10000
StreamSequence = 0xFFFF
This patch fix this problem by discard the chunk if stream ID is not
less than MIS.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>