Commit Graph

209 Commits

Author SHA1 Message Date
Sasha Levin 357c4774b5 tipc: correctly handle releasing a not fully initialized sock
Commit f2f9800d49 "tipc: make tipc node table aware of net
namespace" has added a dereference of sock->sk before making sure it's
not NULL, which makes releasing a tipc socket NULL pointer dereference
for sockets that are not fully initialized.

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-13 17:26:27 -05:00
Ying Xue 3721e9c7c1 tipc: remove redundant timer defined in tipc_sock struct
Remove the redundant timer defined in tipc_sock structure, instead we
can directly reuse the sk_timer defined in sock structure.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-13 16:45:55 -05:00
Ying Xue a62fbccecd tipc: make subscriber server support net namespace
TIPC establishes one subscriber server which allows users to subscribe
their interesting name service status. After tipc supports namespace,
one dedicated tipc stack instance is created for each namespace, and
each instance can be deemed as one independent TIPC node. As a result,
subscriber server must be built for each namespace.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Tested-by: Tero Aho <Tero.Aho@coriant.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-12 16:24:33 -05:00
Ying Xue 3474753954 tipc: make tipc node address support net namespace
If net namespace is supported in tipc, each namespace will be treated
as a separate tipc node. Therefore, every namespace must own its
private tipc node address. This means the "tipc_own_addr" global
variable of node address must be moved to tipc_net structure to
satisfy the requirement. It's turned out that users also can assign
node address for every namespace.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Tested-by: Tero Aho <Tero.Aho@coriant.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-12 16:24:33 -05:00
Ying Xue 4ac1c8d0ee tipc: name tipc name table support net namespace
TIPC name table is used to store the mapping relationship between
TIPC service name and socket port ID. When tipc supports namespace,
it allows users to publish service names only owned by a certain
namespace. Therefore, every namespace must have its private name
table to prevent service names published to one namespace from being
contaminated by other service names in another namespace. Therefore,
The name table global variable (ie, nametbl) and its lock must be
moved to tipc_net structure, and a parameter of namespace must be
added for necessary functions so that they can obtain name table
variable defined in tipc_net structure.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Tested-by: Tero Aho <Tero.Aho@coriant.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-12 16:24:33 -05:00
Ying Xue e05b31f4bf tipc: make tipc socket support net namespace
Now tipc socket table is statically allocated as a global variable.
Through it, we can look up one socket instance with port ID, insert
a new socket instance to the table, and delete a socket from the
table. But when tipc supports net namespace, each namespace must own
its specific socket table. So the global variable of socket table
must be redefined in tipc_net structure. As a concequence, a new
socket table will be allocated when a new namespace is created, and
a socket table will be deallocated when namespace is destroyed.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Tested-by: Tero Aho <Tero.Aho@coriant.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-12 16:24:33 -05:00
Ying Xue f2f9800d49 tipc: make tipc node table aware of net namespace
Global variables associated with node table are below:
- node table list (node_htable)
- node hash table list (tipc_node_list)
- node table lock (node_list_lock)
- node number counter (tipc_num_nodes)
- node link number counter (tipc_num_links)

To make node table support namespace, above global variables must be
moved to tipc_net structure in order to keep secret for different
namespaces. As a consequence, these variables are allocated and
initialized when namespace is created, and deallocated when namespace
is destroyed. After the change, functions associated with these
variables have to utilize a namespace pointer to access them. So
adding namespace pointer as a parameter of these functions is the
major change made in the commit.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Tested-by: Tero Aho <Tero.Aho@coriant.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-12 16:24:32 -05:00
Ying Xue f2f2a96a20 tipc: feed tipc sock pointer to tipc_sk_timeout routine
In order to make tipc socket table aware of namespace, a networking
namespace instance must be passed to tipc_sk_lookup(), allowing it
to look up tipc socket instance with a given port ID from a concrete
socket table. However, as now tipc_sk_timeout() only has one port ID
parameter and is not namespace aware, it's unable to obtain a correct
socket instance through tipc_sk_lookup() just with a port ID,
especially after namespace is completely supported.

If port ID is replaced with socket instance as tipc_sk_timeout()'s
parameter, it's unnecessary to look up socket table. But as the timer
handler - tipc_sk_timeout() is run asynchronously, socket reference
must be held before its timer is launched, and must be carefully
checked to identify whether the socket reference needs to be put or
not when its timer is terminated.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Tested-by: Tero Aho <Tero.Aho@coriant.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-12 16:24:32 -05:00
Ying Xue 2f55c43788 tipc: remove unnecessary wrapper functions of kernel timer APIs
Not only some wrapper function like k_term_timer() is empty, but also
some others including k_start_timer() and k_cancel_timer() don't return
back any value to its caller, what's more, there is no any component
in the kernel world to do such thing. Therefore, these timer interfaces
defined in tipc module should be purged.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Tested-by: Tero Aho <Tero.Aho@coriant.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-12 16:24:31 -05:00
Ying Xue 07f6c4bc04 tipc: convert tipc reference table to use generic rhashtable
As tipc reference table is statically allocated, its memory size
requested on stack initialization stage is quite big even if the
maximum port number is just restricted to 8191 currently, however,
the number already becomes insufficient in practice. But if the
maximum ports is allowed to its theory value - 2^32, its consumed
memory size will reach a ridiculously unacceptable value. Apart from
this, heavy tipc users spend a considerable amount of time in
tipc_sk_get() due to the read-lock on ref_table_lock.

If tipc reference table is converted with generic rhashtable, above
mentioned both disadvantages would be resolved respectively: making
use of the new resizable hash table can avoid locking on the lookup;
smaller memory size is required at initial stage, for example, 256
hash bucket slots are requested at the beginning phase instead of
allocating the entire 8191 slots in old mode. The hash table will
grow if entries exceeds 75% of table size up to a total table size
of 1M, and it will automatically shrink if usage falls below 30%,
but the minimum table size is allowed down to 256.

Also converts ref_table_lock to a separate mutex to protect hash table
mutations on write side. Lastly defers the release of the socket
reference using call_rcu() to allow using an RCU read-side protected
call to rhashtable_lookup().

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Erik Hugne <erik.hugne@ericsson.com>
Cc: Thomas Graf <tgraf@suug.ch>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-08 19:47:14 -08:00
Erik Hugne 88b17b6a22 tipc: drop tx side permission checks
Part of the old remote management feature is a piece of code
that checked permissions on the local system to see if a certain
operation was permitted, and if so pass the command to a remote
node. This serves no purpose after the removal of remote management
with commit 5902385a24 ("tipc: obsolete the remote management
feature") so we remove it.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-09 13:30:13 -05:00
Ying Xue a6ca109443 tipc: use generic SKB list APIs to manage TIPC outgoing packet chains
Use standard SKB list APIs associated with struct sk_buff_head to
manage socket outgoing packet chain and name table outgoing packet
chain, having relevant code simpler and more readable.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-26 12:30:17 -05:00
David S. Miller d3fc6b3fdd Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
More work from Al Viro to move away from modifying iovecs
by using iov_iter instead.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-25 20:02:51 -05:00
Richard Alpe d8182804cf tipc: fix sparse warnings in new nl api
Fix sparse warnings about non-static declaration of static functions
in the new tipc netlink API.

Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-24 16:10:23 -05:00
Al Viro 45dcc687f7 tipc_msg_build(): pass msghdr instead of its ->msg_iov
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-11-24 05:16:41 -05:00
Al Viro 562640f3c3 tipc_sendmsg(): pass msghdr instead of its ->msg_iov
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-11-24 05:16:40 -05:00
Richard Alpe 1a1a143daf tipc: add publication dump to new netlink api
Add TIPC_NL_PUBL_GET command to the new tipc netlink API.

This command supports dumping of all publications for a specific
socket.

Netlink logical layout of request message:
    -> socket
        -> reference

Netlink logical layout of response message:
    -> publication
        -> type
        -> lower
        -> upper

Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-21 15:01:30 -05:00
Richard Alpe 34b78a127c tipc: add sock dump to new netlink api
Add TIPC_NL_SOCK_GET command to the new tipc netlink API.

This command supports dumping of all available sockets with their
associated connection or publication(s). It could be extended to reply
with a single socket if the NLM_F_DUMP isn't set.

The information about a socket includes reference, address, connection
information / publication information.

Netlink logical layout of response message:
-> socket
    -> reference
    -> address
    [
    -> connection
        -> node
        -> socket
        [
        -> connected flag
        -> type
        -> instance
        ]
    ]
    [
    -> publication flag
    ]

Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-21 15:01:30 -05:00
David S. Miller 51f3d02b98 net: Add and use skb_copy_datagram_msg() helper.
This encapsulates all of the skb_copy_datagram_iovec() callers
with call argument signature "skb, offset, msghdr->msg_iov, length".

When we move to iov_iters in the networking, the iov_iter object will
sit in the msghdr.

Having a helper like this means there will be less places to touch
during that transformation.

Based upon descriptions and patch from Al Viro.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05 16:46:40 -05:00
stephen hemminger b2ad5e5fcc tipc: spelling errors
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-30 16:56:41 -04:00
Ying Xue 1a194c2d59 tipc: fix lockdep warning when intra-node messages are delivered
When running tipcTC&tipcTS test suite, below lockdep unsafe locking
scenario is reported:

[ 1109.997854]
[ 1109.997988] =================================
[ 1109.998290] [ INFO: inconsistent lock state ]
[ 1109.998575] 3.17.0-rc1+ #113 Not tainted
[ 1109.998762] ---------------------------------
[ 1109.998762] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
[ 1109.998762] swapper/7/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
[ 1109.998762]  (slock-AF_TIPC){+.?...}, at: [<ffffffffa0011969>] tipc_sk_rcv+0x49/0x2b0 [tipc]
[ 1109.998762] {SOFTIRQ-ON-W} state was registered at:
[ 1109.998762]   [<ffffffff810a4770>] __lock_acquire+0x6a0/0x1d80
[ 1109.998762]   [<ffffffff810a6555>] lock_acquire+0x95/0x1e0
[ 1109.998762]   [<ffffffff81a2d1ce>] _raw_spin_lock+0x3e/0x80
[ 1109.998762]   [<ffffffffa0011969>] tipc_sk_rcv+0x49/0x2b0 [tipc]
[ 1109.998762]   [<ffffffffa0004fe8>] tipc_link_xmit+0xa8/0xc0 [tipc]
[ 1109.998762]   [<ffffffffa000ec6f>] tipc_sendmsg+0x15f/0x550 [tipc]
[ 1109.998762]   [<ffffffffa000f165>] tipc_connect+0x105/0x140 [tipc]
[ 1109.998762]   [<ffffffff817676ee>] SYSC_connect+0xae/0xc0
[ 1109.998762]   [<ffffffff81767b7e>] SyS_connect+0xe/0x10
[ 1109.998762]   [<ffffffff817a9788>] compat_SyS_socketcall+0xb8/0x200
[ 1109.998762]   [<ffffffff81a306e5>] sysenter_dispatch+0x7/0x1f
[ 1109.998762] irq event stamp: 241060
[ 1109.998762] hardirqs last  enabled at (241060): [<ffffffff8105a4ad>] __local_bh_enable_ip+0x6d/0xd0
[ 1109.998762] hardirqs last disabled at (241059): [<ffffffff8105a46f>] __local_bh_enable_ip+0x2f/0xd0
[ 1109.998762] softirqs last  enabled at (241020): [<ffffffff81059a52>] _local_bh_enable+0x22/0x50
[ 1109.998762] softirqs last disabled at (241021): [<ffffffff8105a626>] irq_exit+0x96/0xc0
[ 1109.998762]
[ 1109.998762] other info that might help us debug this:
[ 1109.998762]  Possible unsafe locking scenario:
[ 1109.998762]
[ 1109.998762]        CPU0
[ 1109.998762]        ----
[ 1109.998762]   lock(slock-AF_TIPC);
[ 1109.998762]   <Interrupt>
[ 1109.998762]     lock(slock-AF_TIPC);
[ 1109.998762]
[ 1109.998762]  *** DEADLOCK ***
[ 1109.998762]
[ 1109.998762] 2 locks held by swapper/7/0:
[ 1109.998762]  #0:  (rcu_read_lock){......}, at: [<ffffffff81782dc9>] __netif_receive_skb_core+0x69/0xb70
[ 1109.998762]  #1:  (rcu_read_lock){......}, at: [<ffffffffa0001c90>] tipc_l2_rcv_msg+0x40/0x260 [tipc]
[ 1109.998762]
[ 1109.998762] stack backtrace:
[ 1109.998762] CPU: 7 PID: 0 Comm: swapper/7 Not tainted 3.17.0-rc1+ #113
[ 1109.998762] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[ 1109.998762]  ffffffff82745830 ffff880016c03828 ffffffff81a209eb 0000000000000007
[ 1109.998762]  ffff880017b3cac0 ffff880016c03888 ffffffff81a1c5ef 0000000000000001
[ 1109.998762]  ffff880000000001 ffff880000000000 ffffffff81012d4f 0000000000000000
[ 1109.998762] Call Trace:
[ 1109.998762]  <IRQ>  [<ffffffff81a209eb>] dump_stack+0x4e/0x68
[ 1109.998762]  [<ffffffff81a1c5ef>] print_usage_bug+0x1f1/0x202
[ 1109.998762]  [<ffffffff81012d4f>] ? save_stack_trace+0x2f/0x50
[ 1109.998762]  [<ffffffff810a406c>] mark_lock+0x28c/0x2f0
[ 1109.998762]  [<ffffffff810a3440>] ? print_irq_inversion_bug.part.46+0x1f0/0x1f0
[ 1109.998762]  [<ffffffff810a467d>] __lock_acquire+0x5ad/0x1d80
[ 1109.998762]  [<ffffffff810a70dd>] ? trace_hardirqs_on+0xd/0x10
[ 1109.998762]  [<ffffffff8108ace8>] ? sched_clock_cpu+0x98/0xc0
[ 1109.998762]  [<ffffffff8108ad2b>] ? local_clock+0x1b/0x30
[ 1109.998762]  [<ffffffff810a10dc>] ? lock_release_holdtime.part.29+0x1c/0x1a0
[ 1109.998762]  [<ffffffff8108aa05>] ? sched_clock_local+0x25/0x90
[ 1109.998762]  [<ffffffffa000dec0>] ? tipc_sk_get+0x60/0x80 [tipc]
[ 1109.998762]  [<ffffffff810a6555>] lock_acquire+0x95/0x1e0
[ 1109.998762]  [<ffffffffa0011969>] ? tipc_sk_rcv+0x49/0x2b0 [tipc]
[ 1109.998762]  [<ffffffff810a6fb6>] ? trace_hardirqs_on_caller+0xa6/0x1c0
[ 1109.998762]  [<ffffffff81a2d1ce>] _raw_spin_lock+0x3e/0x80
[ 1109.998762]  [<ffffffffa0011969>] ? tipc_sk_rcv+0x49/0x2b0 [tipc]
[ 1109.998762]  [<ffffffffa000dec0>] ? tipc_sk_get+0x60/0x80 [tipc]
[ 1109.998762]  [<ffffffffa0011969>] tipc_sk_rcv+0x49/0x2b0 [tipc]
[ 1109.998762]  [<ffffffffa00076bd>] tipc_rcv+0x5ed/0x960 [tipc]
[ 1109.998762]  [<ffffffffa0001d1c>] tipc_l2_rcv_msg+0xcc/0x260 [tipc]
[ 1109.998762]  [<ffffffffa0001c90>] ? tipc_l2_rcv_msg+0x40/0x260 [tipc]
[ 1109.998762]  [<ffffffff81783345>] __netif_receive_skb_core+0x5e5/0xb70
[ 1109.998762]  [<ffffffff81782dc9>] ? __netif_receive_skb_core+0x69/0xb70
[ 1109.998762]  [<ffffffff81784eb9>] ? dev_gro_receive+0x259/0x4e0
[ 1109.998762]  [<ffffffff817838f6>] __netif_receive_skb+0x26/0x70
[ 1109.998762]  [<ffffffff81783acd>] netif_receive_skb_internal+0x2d/0x1f0
[ 1109.998762]  [<ffffffff81785518>] napi_gro_receive+0xd8/0x240
[ 1109.998762]  [<ffffffff815bf854>] e1000_clean_rx_irq+0x2c4/0x530
[ 1109.998762]  [<ffffffff815c1a46>] e1000_clean+0x266/0x9c0
[ 1109.998762]  [<ffffffff8108ad2b>] ? local_clock+0x1b/0x30
[ 1109.998762]  [<ffffffff8108aa05>] ? sched_clock_local+0x25/0x90
[ 1109.998762]  [<ffffffff817842b1>] net_rx_action+0x141/0x310
[ 1109.998762]  [<ffffffff810bd710>] ? handle_fasteoi_irq+0xe0/0x150
[ 1109.998762]  [<ffffffff81059fa6>] __do_softirq+0x116/0x4d0
[ 1109.998762]  [<ffffffff8105a626>] irq_exit+0x96/0xc0
[ 1109.998762]  [<ffffffff81a30d07>] do_IRQ+0x67/0x110
[ 1109.998762]  [<ffffffff81a2ee2f>] common_interrupt+0x6f/0x6f
[ 1109.998762]  <EOI>  [<ffffffff8100d2b7>] ? default_idle+0x37/0x250
[ 1109.998762]  [<ffffffff8100d2b5>] ? default_idle+0x35/0x250
[ 1109.998762]  [<ffffffff8100dd1f>] arch_cpu_idle+0xf/0x20
[ 1109.998762]  [<ffffffff810999fd>] cpu_startup_entry+0x27d/0x4d0
[ 1109.998762]  [<ffffffff81034c78>] start_secondary+0x188/0x1f0

When intra-node messages are delivered from one process to another
process, tipc_link_xmit() doesn't disable BH before it directly calls
tipc_sk_rcv() on process context to forward messages to destination
socket. Meanwhile, if messages delivered by remote node arrive at the
node and their destinations are also the same socket, tipc_sk_rcv()
running on process context might be preempted by tipc_sk_rcv() running
BH context. As a result, the latter cannot obtain the socket lock as
the lock was obtained by the former, however, the former has no chance
to be run as the latter is owning the CPU now, so headlock happens. To
avoid it, BH should be always disabled in tipc_sk_rcv().

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-21 15:28:15 -04:00
Ying Xue 7b8613e0a1 tipc: fix a potential deadlock
Locking dependency detected below possible unsafe locking scenario:

           CPU0                          CPU1
T0:  tipc_named_rcv()                tipc_rcv()
T1:  [grab nametble write lock]*     [grab node lock]*
T2:  tipc_update_nametbl()           tipc_node_link_up()
T3:  tipc_nodesub_subscribe()        tipc_nametbl_publish()
T4:  [grab node lock]*               [grab nametble write lock]*

The opposite order of holding nametbl write lock and node lock on
above two different paths may result in a deadlock. If we move the
the updating of the name table after link state named out of node
lock, the reverse order of holding locks will be eliminated, and
as a result, the deadlock risk.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-21 15:28:15 -04:00
Ying Xue cc086fcf92 tipc: fix a potential oops
Commit 6c9808ce09 ("tipc: remove port_lock") accidentally involves
a potential bug: when tipc socket instance(tsk) is not got with given
reference number in tipc_sk_get(), tsk is set to NULL. Subsequently
we jump to exit label where to decrease socket reference counter
pointed by tsk pointer in tipc_sk_put(). However, As now tsk is NULL,
oops may happen because of touching a NULL pointer.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Erik Hugne <erik.hugne@ericsson.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-29 20:22:43 -07:00
Jon Paul Maloy 301bae56f2 tipc: merge struct tipc_port into struct tipc_sock
We complete the merging of the port and socket layer by aggregating
the fields of struct tipc_port directly into struct tipc_sock, and
moving the combined structure into socket.c.

We also move all functions and macros that are not any longer
exposed to the rest of the stack into socket.c, and rename them
accordingly.

Despite the size of this commit, there are no functional changes.
We have only made such changes that are necessary due of the removal
of struct tipc_port.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-23 11:18:35 -07:00
Jon Paul Maloy 808d90f9c5 tipc: remove files ref.h and ref.c
The reference table is now 'socket aware' instead of being generic,
and has in reality become a socket internal table. In order to be
able to minimize the API exposed by the socket layer towards the rest
of the stack, we now move the reference table definitions and functions
into the file socket.c, and rename the functions accordingly.

There are no functional changes in this commit.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-23 11:18:35 -07:00
Jon Paul Maloy 2e84c60b77 tipc: remove include file port.h
We move the inline functions in the file port.h to socket.c, and modify
their names accordingly.

We move struct tipc_port and some macros to socket.h.

Finally, we remove the file port.h.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-23 11:18:35 -07:00
Jon Paul Maloy 0fc87aaebd tipc: remove source file port.c
In this commit, we move the remaining functions in port.c to
socket.c, and give them new names that correspond to their new
location. We then remove the file port.c.

There are only cosmetic changes to the moved functions.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-23 11:18:35 -07:00
Jon Paul Maloy 6c9808ce09 tipc: remove port_lock
In previous commits we have reduced usage of port_lock to a minimum,
and complemented it with usage of bh_lock_sock() at the remaining
locations. The purpose has been to remove this lock altogether, since
it largely duplicates the role of bh_lock_sock. We are now ready to do
this.

However, we still need to protect the BH callers from inadvertent
release of the socket while they hold a reference to it. We do this by
replacing port_lock by a combination of a rw-lock protecting the
reference table as such, and updating the socket reference counter while
the socket is referenced from BH. This technique is more standard and
comprehensible than the previous approach, and turns out to have a
positive effect on overall performance.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-23 11:18:34 -07:00
Jon Paul Maloy 9b50fd087a tipc: replace port pointer with socket pointer in registry
In order to make tipc_sock the only entity referencable from other
parts of the stack, we add a tipc_sock pointer instead of a tipc_port
pointer to the registry. As a consequence, we also let the function
tipc_port_lock() return a pointer to a tipc_sock instead  of a tipc_port.
We keep the function's name for now, since the lock still is owned by
the port.

This is another step in the direction of eliminating port_lock, replacing
its usage with lock_sock() and bh_lock_sock().

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-23 11:18:34 -07:00
Jon Paul Maloy 5a9ee0be33 tipc: use registry when scanning sockets
The functions tipc_port_get_ports() and tipc_port_reinit() scan over
all sockets/ports to access each of them. This is done by using a
dedicated linked list, 'tipc_socks' where all sockets are members. The
list is in turn protected by a spinlock, 'port_list_lock', while each
socket is locked by using port_lock at the moment of access.

In order to reduce complexity and risk of deadlock, we want to get
rid of the linked list and the accompanying spinlock.

This is what we do in this commit. Instead of the linked list, we use
the port registry to scan across the sockets. We also add usage of
bh_lock_sock() inside the scope of port_lock in both functions, as a
preparation for the complete removal of port_lock.

Finally, we move the functions from port.c to socket.c, and rename them
to tipc_sk_sock_show() and tipc_sk_reinit() repectively.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-23 11:18:34 -07:00
Jon Paul Maloy 5b8fa7ce82 tipc: eliminate functions tipc_port_init and tipc_port_destroy
After the latest changes to the socket/port layer the existence of
the functions tipc_port_init() and tipc_port_destroy() cannot be
justified. They are both called only once, from tipc_sk_create() and
tipc_sk_delete() respectively, and their functionality can better be
merged into the latter two functions.

This also entails that all remaining references to port_lock now are
made from inside socket.c, something that will make it easier to remove
this lock.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-23 11:18:34 -07:00
Jon Paul Maloy 739f5e4efc tipc: redefine message acknowledge function
The function tipc_acknowledge() is a remnant from the obsolete native
API. Currently, it grabs port_lock, before building an acknowledge
message and sending it to the peer.

Since all access to socket members now is protected by the socket lock,
it has become unnecessary to grab port_lock here.

In this commit, we remove the usage of port_lock, simplify the
function, and move it to socket.c, renaming it to tipc_sk_send_ack().

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-23 11:18:34 -07:00
Jon Paul Maloy dadebc0029 tipc: eliminate port_connect()/port_disconnect() functions
tipc_port_connect()/tipc_port_disconnect() are remnants of the obsolete
native API. Their only task is to grab port_lock and call the functions
__tipc_port_connect()/__tipc_port_disconnect() respectively, which will
perform the actual state change.

Since socket/port exection now is single-threaded the use of port_lock
is not needed any more, so we can safely replace the two functions with
their lock-free counterparts.

In this commit, we remove the two functions. Furthermore, the contents
of __tipc_port_disconnect() is so trivial that we choose to eliminate
that function too, expanding its functionality into tipc_shutdown().
__tipc_port_connect() is simplified, moved to socket.c, and given the
more correct name tipc_sk_finish_conn(). Finally, we eliminate the
function auto_connect(), and expand its contents into filter_connect().

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-23 11:18:34 -07:00
Jon Paul Maloy 80e44c2225 tipc: eliminate function tipc_port_shutdown()
tipc_port_shutdown() is a remnant from the now obsolete native
interface. As such it grabs port_lock in order to protect itself
from concurrent BH processing.

However, after the recent changes to the port/socket upcalls, sockets
are now basically single-threaded, and all execution, except the read-only
tipc_sk_timer(), is executing within the protection of lock_sock(). So
the use of port_lock is not needed here.

In this commit we eliminate the whole function, and merge it into its
only caller, tipc_shutdown().

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-23 11:18:34 -07:00
Jon Paul Maloy 5728901581 tipc: clean up socket timer function
The last remaining BH upcall to the socket, apart for the message
reception function tipc_sk_rcv(), is the timer function.

We prefer to let this function continue executing in BH, since it only
does read-acces to semi-permanent data, but we make three changes to it:

1) We introduce a bh_lock_sock()/bh_unlock_sock() inside the scope
   of port_lock.  This is a preparation for replacing port_lock with
   bh_lock_sock() at the locations where it is still used.

2) We move the function from port.c to socket.c, as a further step
   of eliminating the port code level altogether.

3) We let it make use of the newly introduced tipc_msg_create()
   function. This enables us to get rid of three context specific
   functions (port_create_self_abort_msg() etc.) in port.c

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-23 11:18:33 -07:00
Jon Paul Maloy 50100a5e39 tipc: use pseudo message to wake up sockets after link congestion
The current link implementation keeps a linked list of blocked ports/
sockets that is populated when there is link congestion. The purpose
of this is to let the link know which users to wake up when the
congestion abates.

This adds unnecessary complexity to the data structure and the code,
since it forces us to involve the link each time we want to delete
a socket. It also forces us to grab the spinlock port_lock within
the scope of node_lock. We want to get rid of this direct dependence,
as well as the deadlock hazard resulting from the usage of port_lock.

In this commit, we instead let the link keep list of a "wakeup" pseudo
messages for use in such situations. Those messages are sent to the
pending sockets via the ordinary message reception path, and wake up
the socket's owner when they are received.

This enables us to get rid of the 'waiting_ports' linked lists in struct
tipc_port that manifest this direct reference. As a consequence, we can
eliminate another BH entry into the socket, and hence the need to grab
port_lock. This is a further step in our effort to remove port_lock
altogether.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-23 11:18:33 -07:00
Erik Hugne ac32c7f705 tipc: fix message importance range check
Commit 3b4f302d85 ("tipc: eliminate
redundant locking") introduced a bug by removing the sanity check
for message importance, allowing programs to assign any value to
the msg_user field. This will mess up the packet reception logic
and may cause random link resets.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-16 20:17:34 -07:00
Wei Yongjun ad025a56a5 tipc: remove duplicated include from socket.c
Remove duplicated include.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-29 15:51:14 -07:00
Wei Yongjun 52f50ce556 tipc: fix sparse non static symbol warnings
Fixes the following sparse warnings:

net/tipc/socket.c:545:5: warning:
 symbol 'tipc_sk_proto_rcv' was not declared. Should it be static?
net/tipc/socket.c:2015:5: warning:
 symbol 'tipc_ioctl' was not declared. Should it be static?

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-20 22:19:04 -07:00
Jon Paul Maloy 9fbfb8b120 tipc: rename temporarily named functions
After the previous commit, we can now give the functions with temporary
names, such as tipc_link_xmit2(), tipc_msg_build2() etc., their proper
names.

There are no functional changes in this commit.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-16 21:38:19 -07:00
Jon Paul Maloy 0abd8ff21f tipc: start using the new multicast functions
In this commit, we convert the socket multicast send function to
directly call the new multicast/broadcast function (tipc_bclink_xmit2())
introduced in the previous commit. We do this instead of letting the
call go via the now obsolete tipc_port_mcast_xmit(), hence saving
a call level and some code complexity.

We also remove the initial destination lookup at the message sending
side, and replace that with an unconditional lookup at the receiving
side, including on the sending node itself. This makes the destination
lookup and message transfer more uniform than before.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-16 21:38:18 -07:00
Jon Paul Maloy 078bec826f tipc: add new functions for multicast and broadcast distribution
We add a new broadcast link transmit function in bclink.c and a new
receive function in socket.c. The purpose is to move the branching
between external and internal destination down to the link layer,
just as we have done with unicast in earlier commits. We also make
use of the new link-independent fragmentation support that was
introduced in an earlier commit series.

This gives a shorter and simpler code path, and makes it possible
to obtain copy-free buffer delivery to all node local destination
sockets.

The new transmission code is added in parallel with the existing one,
and will be used by the socket multicast send function in the next
commit in this series.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-16 21:38:18 -07:00
Fabian Frederick 7ceaa583be tipc: remove unnecessary break after return
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-15 16:26:59 -07:00
Erik Hugne 70452dcb6d tipc: fix a memleak when sending data
This fixes a regression bug caused by:
067608e9d0 ("tipc: introduce direct
iovec to buffer chain fragmentation function")

If data is sent on a nonblocking socket and the destination link
is congested, the buffer chain is leaked. We fix this by freeing
the chain in this case.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08 16:10:01 -07:00
Jon Paul Maloy 60120526c2 tipc: simplify connection congestion handling
As a consequence of the recently introduced serialized access
to the socket in commit 8d94168a761819d10252bab1f8de6d7b202c3baa
("tipc: same receive code path for connection protocol and data
messages") we can make a number of simplifications in the
detection and handling of connection congestion situations.

- We don't need to keep two counters, one for sent messages and one
  for acked messages. There is no longer any risk for races between
  acknowledge messages arriving in BH and data message sending
  running in user context. So we merge this into one counter,
  'sent_unacked', which is incremented at sending and subtracted
  from at acknowledge reception.

- We don't need to set the 'congested' field in tipc_port to
  true before we sent the message, and clear it when sending
  is successful. (As a matter of fact, it was never necessary;
  the field was set in link_schedule_port() before any wakeup
  could arrive anyway.)

- We keep the conditions for link congestion and connection connection
  congestion separated. There would otherwise be a risk that an arriving
  acknowledge message may wake up a user sleeping because of link
  congestion.

- We can simplify reception of acknowledge messages.

We also make some cosmetic/structural changes:

- We rename the 'congested' field to the more correct 'link_cong´.

- We rename 'conn_unacked' to 'rcv_unacked'

- We move the above mentioned fields from struct tipc_port to
  struct tipc_sock.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 12:50:56 -07:00
Jon Paul Maloy ac0074ee70 tipc: clean up connection protocol reception function
We simplify the code for receiving connection probes, leveraging the
recently introduced tipc_msg_reverse() function. We also stick to
the principle of sending a possible response message directly from
the calling (tipc_sk_rcv or backlog_rcv) functions, hence making
the call chain shallower and easier to follow.

We make one small protocol change here, allowed according to
the spec. If a protocol message arrives from a remote socket that
is not the one we are connected to, we are currently generating a
connection abort message and send it to the source. This behavior
is unnecessary, and might even be a security risk, so instead we
now choose to only ignore the message. The consequnce for the sender
is that he will need longer time to discover his mistake (until the
next timeout), but this is an extreme corner case, and may happen
anyway under other circumstances, so we deem this change acceptable.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 12:50:56 -07:00
Jon Paul Maloy ec8a2e5621 tipc: same receive code path for connection protocol and data messages
As a preparation to eliminate port_lock we need to bring reception
of connection protocol messages under proper protection of bh_lock_sock
or socket owner.

We fix this by letting those messages follow the same code path as
incoming data messages.

As a side effect of this change, the last reference to the function
net_route_msg() disappears, and we can eliminate that function.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 12:50:56 -07:00
Jon Paul Maloy 4ccfe5e041 tipc: connection oriented transport uses new send functions
We move the message sending across established connections
to use the message preparation and send functions introduced
earlier in this series. We now do the message preparation
and call to the link send function directly from the socket,
instead of going via the port layer.

As a consequence of this change, the functions tipc_send(),
tipc_port_iovec_rcv(), tipc_port_iovec_reject() and tipc_reject_msg()
become unreferenced and can be eliminated from port.c. For the same
reason, the functions tipc_link_xmit_fast(), tipc_link_iovec_xmit_long()
and tipc_link_iovec_fast() can be eliminated from link.c.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 12:50:55 -07:00
Jon Paul Maloy e2dafe87d3 tipc: RDM/DGRAM transport uses new fragmenting and sending functions
We merge the code for sending port name and port identity addressed
messages into the corresponding send functions in socket.c, and start
using the new fragmenting and transmit functions we just have introduced.

This saves a call level and quite a few code lines, as well as making
this part of the code easier to follow. As a consequence, the functions
tipc_send2name() and tipc_send2port() in port.c can be removed.

For practical reasons, we break out the code for sending multicast messages
from tipc_sendmsg() and move it into a separate function, tipc_sendmcast(),
but we do not yet convert it into using the new build/send functions.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 12:50:55 -07:00
Jon Paul Maloy 5a379074a7 tipc: introduce message evaluation function
When a message arrives in a node and finds no destination
socket, we may need to drop it, reject it, or forward it after
a secondary destination lookup. The latter two cases currently
results in a code path that is perceived as complex, because it
follows a deep call chain via obscure functions such as
net_route_named_msg() and net_route_msg().

We now introduce a function, tipc_msg_eval(), that takes the
decision about whether such a message should be rejected or
forwarded, but leaves it to the caller to actually perform
the indicated action.

If the decision is 'reject', it is still the task of the recently
introduced function tipc_msg_reverse() to take the final decision
about whether the message is rejectable or not. In the latter case
it drops the message.

As a result of this change, we can finally eliminate the function
net_route_named_msg(), and hence become independent of net_route_msg().

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 12:50:55 -07:00