Commit Graph

24610 Commits

Author SHA1 Message Date
David Ward ba1bf474ea net_sched: gred: actually perform idling in WRED mode
gred_dequeue() and gred_drop() do not seem to get called when the
queue is empty, meaning that we never start idling while in WRED
mode. And since qidlestart is not stored by gred_store_wred_set(),
we would never stop idling while in WRED mode if we ever started.
This messes up the average queue size calculation that influences
packet marking/dropping behavior.

Now, we start WRED mode idling as we are removing the last packet
from the queue. Also we now actually stop WRED mode idling when we
are enqueuing a packet.

Cc: Bruce Osler <brosler@cisco.com>
Signed-off-by: David Ward <david.ward@ll.mit.edu>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-13 16:10:13 -04:00
David Ward 1fe37b106b net_sched: gred: fix qave reporting via netlink
q->vars.qavg is a Wlog scaled value, but q->backlog is not. In order
to pass q->vars.qavg as the backlog value, we need to un-scale it.
Additionally, the qave value returned via netlink should not be Wlog
scaled, so we need to un-scale the result of red_calc_qavg().

This caused artificially high values for "Average Queue" to be shown
by 'tc -s -d qdisc', but did not affect the actual operation of GRED.

Signed-off-by: David Ward <david.ward@ll.mit.edu>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-13 16:10:13 -04:00
David Ward c22e464022 net_sched: gred: eliminate redundant DP prio comparisons
Each pair of DPs only needs to be compared once when searching for
a non-unique prio value.

Signed-off-by: David Ward <david.ward@ll.mit.edu>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-13 16:10:13 -04:00
David Ward e29fe837bf net_sched: gred: correct comment about qavg calculation in RIO mode
Signed-off-by: David Ward <david.ward@ll.mit.edu>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-13 16:10:13 -04:00
David S. Miller 930521695c Merge branch 'master' of git://1984.lsi.us.es/nf
Pablo Neira Ayuso say:

====================
The following patchset contains four updates for your net tree, they are:

* Fix crash on timewait sockets, since the TCP early demux was added,
  in nfnetlink_log, from Eric Dumazet.

* Fix broken syslog log-level for xt_LOG and ebt_log since printk format was
  converted from <.> to a 2 bytes pattern using ASCII SOH, from Joe Perches.

* Two security fixes for the TCP connection tracking targeting off-path attacks,
  from Jozsef Kadlecsik. The problem was discovered by Jan Wrobel and it is
  documented in: http://mixedbit.org/reflection_scan/reflection_scan.pdf.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-13 13:53:06 -04:00
Joe Perches 16af511a66 netfilter: log: Fix log-level processing
auto75914331@hushmail.com reports that iptables does not correctly
output the KERN_<level>.

$IPTABLES -A RULE_0_in  -j LOG  --log-level notice --log-prefix "DENY  in: "

result with linux 3.6-rc5
Sep 12 06:37:29 xxxxx kernel: <5>DENY  in: IN=eth0 OUT= MAC=.......

result with linux 3.5.3 and older:
Sep  9 10:43:01 xxxxx kernel: DENY  in: IN=eth0 OUT= MAC......

commit 04d2c8c83d
("printk: convert the format for KERN_<LEVEL> to a 2 byte pattern")
updated the syslog header style but did not update netfilter uses.

Do so.

Use KERN_SOH and string concatenation instead of "%c" KERN_SOH_ASCII
as suggested by Eric Dumazet.

Signed-off-by: Joe Perches <joe@perches.com>
cc: auto75914331@hushmail.com
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-09-12 17:17:35 +02:00
Eric Dumazet bdfc87f7d1 net-sched: sch_cbq: avoid infinite loop
Its possible to setup a bad cbq configuration leading to
an infinite loop in cbq_classify()

DEV_OUT=eth0
ICMP="match ip protocol 1 0xff"
U32="protocol ip u32"
DST="match ip dst"
tc qdisc add dev $DEV_OUT root handle 1: cbq avpkt 1000 \
	bandwidth 100mbit
tc class add dev $DEV_OUT parent 1: classid 1:1 cbq \
	rate 512kbit allot 1500 prio 5 bounded isolated
tc filter add dev $DEV_OUT parent 1: prio 3 $U32 \
	$ICMP $DST 192.168.3.234 flowid 1:

Reported-by: Denys Fedoryschenko <denys@visp.net.lb>
Tested-by: Denys Fedoryschenko <denys@visp.net.lb>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-11 22:20:43 -04:00
Randy Dunlap 1c463e57b3 net: fix net/core/sock.c build error
Fix net/core/sock.c build error when CONFIG_INET is not enabled:

net/built-in.o: In function `sock_edemux':
(.text+0xd396): undefined reference to `inet_twsk_put'

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-10 16:44:45 -04:00
Wei Yongjun 566f26aa70 caif: move the dereference below the NULL test
The dereference should be moved below the NULL test.

spatch with a semantic match is used to found this.
(http://coccinelle.lip6.fr/)

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-10 16:13:31 -04:00
Jozsef Kadlecsik 4a70bbfaef netfilter: Validate the sequence number of dataless ACK packets as well
We spare nothing by not validating the sequence number of dataless
ACK packets and enabling it makes harder off-path attacks.

See: "Reflection scan: an Off-Path Attack on TCP" by Jan Wrobel,
http://arxiv.org/abs/1201.2074

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-09-09 22:13:49 +02:00
Jozsef Kadlecsik 64f509ce71 netfilter: Mark SYN/ACK packets as invalid from original direction
Clients should not send such packets. By accepting them, we open
up a hole by wich ephemeral ports can be discovered in an off-path
attack.

See: "Reflection scan: an Off-Path Attack on TCP" by Jan Wrobel,
http://arxiv.org/abs/1201.2074

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-09-09 22:13:30 +02:00
Chema Gonzalez 6862234238 net: small bug on rxhash calculation
In the current rxhash calculation function, while the
sorting of the ports/addrs is coherent (you get the
same rxhash for packets sharing the same 4-tuple, in
both directions), ports and addrs are sorted
independently. This implies packets from a connection
between the same addresses but crossed ports hash to
the same rxhash.

For example, traffic between A=S:l and B=L:s is hashed
(in both directions) from {L, S, {s, l}}. The same
rxhash is obtained for packets between C=S:s and D=L:l.

This patch ensures that you either swap both addrs and ports,
or you swap none. Traffic between A and B, and traffic
between C and D, get their rxhash from different sources
({L, S, {l, s}} for A<->B, and {L, S, {s, l}} for C<->D)

The patch is co-written with Eric Dumazet <edumazet@google.com>

Signed-off-by: Chema Gonzalez <chema@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-08 18:41:48 -04:00
John W. Linville 777bf135b7 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
John W. Linville says:

====================
Please pull these fixes intended for 3.6.  There are more commits
here than I would like -- I got a bit behind while I was stalking
Steven Rostedt in San Diego last week...  I'll slow it down after this!

There are a couple of pulls here.  One is from Johannes:

"Please pull (according to the below information) to get a few fixes.

 * a fix to properly disconnect in the driver when authentication or
   association fails
 * a fix to prevent invalid information about mesh paths being reported
   to userspace
 * a memory leak fix in an nl80211 error path"

The other comes via Gustavo:

"A few updates for the 3.6 kernel. There are two btusb patches to add
more supported devices through the new USB_VENDOR_AND_INTEFACE_INFO()
macro and another one that add a new device id for a Sony Vaio laptop,
one fix for a user-after-free and, finally, two patches from Vinicius
to fix a issue in SMP pairing."

Along with those...

Arend van Spriel provides a fix for a use-after-free bug in brcmfmac.

Daniel Drake avoids a hang by not trying to touch the libertas hardware
duing suspend if it is already powered-down.

Felix Fietkau provides a batch of ath9k fixes that adress some
potential problems with power settings, as well as a fix to avoid a
potential interrupt storm.

Gertjan van Wingerde provides a register-width fix for rt2x00, and
a rt2x00 fix to prevent incorrectly detecting the rfkill status.
He also provides a device ID patch.

Hante Meuleman gives us three brcmfmac fixes, one that properly
initializes a command structure, one that fixes a race condition that
could lose usb requests, and one that removes some log spam.

Marc Kleine-Budde offers an rt2x00 fix for a voltage setting on some
specific devices.

Mohammed Shafi Shajakhan sent an ath9k fix to avoid a crash related to
using timers that aren't allocated when 2 wire bluetooth coexistence
hardware is in use.

Sergei Poselenov changes rt2800usb to do some validity checking for
received packets, avoiding crashes on an ARM Soc.

Stone Piao gives us an mwifiex fix for an incorrectly set skb length
value for a command buffer.

All of these are localized to their specific drivers, and relatively
small.  The power-related patches from Felix are bigger than I would
like, but I merged them in consideration of their isolation to ath9k
and the sensitive nature of power settings in wireless devices.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-07 14:38:50 -04:00
Eric Dumazet 979402b16c udp: increment UDP_MIB_INERRORS if copy failed
In UDP recvmsg(), we miss an increase of UDP_MIB_INERRORS if the copy
of skb to userspace failed for whatever reason.

Reported-by: Shawn Bohrer <sbohrer@rgmadvisors.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-07 12:56:00 -04:00
Eric Dumazet 0626af3139 netfilter: take care of timewait sockets
Sami Farin reported crashes in xt_LOG because it assumes skb->sk is a
full blown socket.

Since (41063e9 ipv4: Early TCP socket demux), we can have skb->sk
pointing to a timewait socket.

Same fix is needed in nfnetlink_log.

Diagnosed-by: Florian Westphal <fw@strlen.de>
Reported-by: Sami Farin <hvtaifwkbgefbaei@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-09-06 14:28:18 +02:00
Julian Anastasov d013ef2aba tcp: fix possible socket refcount problem for ipv6
commit 144d56e910
("tcp: fix possible socket refcount problem") is missing
the IPv6 part. As tcp_release_cb is shared by both protocols
we should hold sock reference for the TCP_MTU_REDUCED_DEFERRED
bit.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-05 17:16:25 -04:00
John W. Linville 785a7de9ee Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 2012-09-05 14:48:15 -04:00
John W. Linville 518eefe1b6 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth 2012-09-05 14:46:30 -04:00
LEO Airwarosu Yoichi Shinoda 7ce8c7a343 mac80211: Various small fixes for cfg.c: mpath_set_pinfo()
Various small fixes for net/mac80211/cfg.c:mpath_set_pinfo():
Initialize *pinfo before filling members in, handle MESH_PATH_RESOLVED
correctly, and remove bogus assignment; result in correct display
of FLAGS values and meaningful EXPTIME for expired paths in iw utility.

Signed-off-by: Yoichi Shinoda <shinoda@jaist.ac.jp>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-09-05 16:06:05 +02:00
Eric Dumazet c0cc88a762 l2tp: fix a typo in l2tp_eth_dev_recv()
While investigating l2tp bug, I hit a bug in eth_type_trans(),
because not enough bytes were pulled in skb head.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-04 15:54:55 -04:00
David S. Miller 9d148e39d1 Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch 2012-09-04 15:17:52 -04:00
Steffen Klassert 3b59df46a4 xfrm: Workaround incompatibility of ESN and async crypto
ESN for esp is defined in RFC 4303. This RFC assumes that the
sequence number counters are always up to date. However,
this is not true if an async crypto algorithm is employed.

If the sequence number counters are not up to date on sequence
number check, we may incorrectly update the upper 32 bit of
the sequence number. This leads to a DOS.

We workaround this by comparing the upper sequence number,
(used for authentication) with the upper sequence number
computed after the async processing. We drop the packet
if these numbers are different.

To do this, we introduce a recheck function that does this
check in the ESN case.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-04 14:09:45 -04:00
Eric Dumazet 37159ef2c1 l2tp: fix a lockdep splat
Fixes following lockdep splat :

[ 1614.734896] =============================================
[ 1614.734898] [ INFO: possible recursive locking detected ]
[ 1614.734901] 3.6.0-rc3+ #782 Not tainted
[ 1614.734903] ---------------------------------------------
[ 1614.734905] swapper/11/0 is trying to acquire lock:
[ 1614.734907]  (slock-AF_INET){+.-...}, at: [<ffffffffa0209d72>] l2tp_xmit_skb+0x172/0xa50 [l2tp_core]
[ 1614.734920]
[ 1614.734920] but task is already holding lock:
[ 1614.734922]  (slock-AF_INET){+.-...}, at: [<ffffffff815fce23>] tcp_v4_err+0x163/0x6b0
[ 1614.734932]
[ 1614.734932] other info that might help us debug this:
[ 1614.734935]  Possible unsafe locking scenario:
[ 1614.734935]
[ 1614.734937]        CPU0
[ 1614.734938]        ----
[ 1614.734940]   lock(slock-AF_INET);
[ 1614.734943]   lock(slock-AF_INET);
[ 1614.734946]
[ 1614.734946]  *** DEADLOCK ***
[ 1614.734946]
[ 1614.734949]  May be due to missing lock nesting notation
[ 1614.734949]
[ 1614.734952] 7 locks held by swapper/11/0:
[ 1614.734954]  #0:  (rcu_read_lock){.+.+..}, at: [<ffffffff81592801>] __netif_receive_skb+0x251/0xd00
[ 1614.734964]  #1:  (rcu_read_lock){.+.+..}, at: [<ffffffff815d319c>] ip_local_deliver_finish+0x4c/0x4e0
[ 1614.734972]  #2:  (rcu_read_lock){.+.+..}, at: [<ffffffff8160d116>] icmp_socket_deliver+0x46/0x230
[ 1614.734982]  #3:  (slock-AF_INET){+.-...}, at: [<ffffffff815fce23>] tcp_v4_err+0x163/0x6b0
[ 1614.734989]  #4:  (rcu_read_lock){.+.+..}, at: [<ffffffff815da240>] ip_queue_xmit+0x0/0x680
[ 1614.734997]  #5:  (rcu_read_lock_bh){.+....}, at: [<ffffffff815d9925>] ip_finish_output+0x135/0x890
[ 1614.735004]  #6:  (rcu_read_lock_bh){.+....}, at: [<ffffffff81595680>] dev_queue_xmit+0x0/0xe00
[ 1614.735012]
[ 1614.735012] stack backtrace:
[ 1614.735016] Pid: 0, comm: swapper/11 Not tainted 3.6.0-rc3+ #782
[ 1614.735018] Call Trace:
[ 1614.735020]  <IRQ>  [<ffffffff810a50ac>] __lock_acquire+0x144c/0x1b10
[ 1614.735033]  [<ffffffff810a334b>] ? check_usage+0x9b/0x4d0
[ 1614.735037]  [<ffffffff810a6762>] ? mark_held_locks+0x82/0x130
[ 1614.735042]  [<ffffffff810a5df0>] lock_acquire+0x90/0x200
[ 1614.735047]  [<ffffffffa0209d72>] ? l2tp_xmit_skb+0x172/0xa50 [l2tp_core]
[ 1614.735051]  [<ffffffff810a69ad>] ? trace_hardirqs_on+0xd/0x10
[ 1614.735060]  [<ffffffff81749b31>] _raw_spin_lock+0x41/0x50
[ 1614.735065]  [<ffffffffa0209d72>] ? l2tp_xmit_skb+0x172/0xa50 [l2tp_core]
[ 1614.735069]  [<ffffffffa0209d72>] l2tp_xmit_skb+0x172/0xa50 [l2tp_core]
[ 1614.735075]  [<ffffffffa014f7f2>] l2tp_eth_dev_xmit+0x32/0x60 [l2tp_eth]
[ 1614.735079]  [<ffffffff81595112>] dev_hard_start_xmit+0x502/0xa70
[ 1614.735083]  [<ffffffff81594c6e>] ? dev_hard_start_xmit+0x5e/0xa70
[ 1614.735087]  [<ffffffff815957c1>] ? dev_queue_xmit+0x141/0xe00
[ 1614.735093]  [<ffffffff815b622e>] sch_direct_xmit+0xfe/0x290
[ 1614.735098]  [<ffffffff81595865>] dev_queue_xmit+0x1e5/0xe00
[ 1614.735102]  [<ffffffff81595680>] ? dev_hard_start_xmit+0xa70/0xa70
[ 1614.735106]  [<ffffffff815b4daa>] ? eth_header+0x3a/0xf0
[ 1614.735111]  [<ffffffff8161d33e>] ? fib_get_table+0x2e/0x280
[ 1614.735117]  [<ffffffff8160a7e2>] arp_xmit+0x22/0x60
[ 1614.735121]  [<ffffffff8160a863>] arp_send+0x43/0x50
[ 1614.735125]  [<ffffffff8160b82f>] arp_solicit+0x18f/0x450
[ 1614.735132]  [<ffffffff8159d9da>] neigh_probe+0x4a/0x70
[ 1614.735137]  [<ffffffff815a191a>] __neigh_event_send+0xea/0x300
[ 1614.735141]  [<ffffffff815a1c93>] neigh_resolve_output+0x163/0x260
[ 1614.735146]  [<ffffffff815d9cf5>] ip_finish_output+0x505/0x890
[ 1614.735150]  [<ffffffff815d9925>] ? ip_finish_output+0x135/0x890
[ 1614.735154]  [<ffffffff815dae79>] ip_output+0x59/0xf0
[ 1614.735158]  [<ffffffff815da1cd>] ip_local_out+0x2d/0xa0
[ 1614.735162]  [<ffffffff815da403>] ip_queue_xmit+0x1c3/0x680
[ 1614.735165]  [<ffffffff815da240>] ? ip_local_out+0xa0/0xa0
[ 1614.735172]  [<ffffffff815f4402>] tcp_transmit_skb+0x402/0xa60
[ 1614.735177]  [<ffffffff815f5a11>] tcp_retransmit_skb+0x1a1/0x620
[ 1614.735181]  [<ffffffff815f7e93>] tcp_retransmit_timer+0x393/0x960
[ 1614.735185]  [<ffffffff815fce23>] ? tcp_v4_err+0x163/0x6b0
[ 1614.735189]  [<ffffffff815fd317>] tcp_v4_err+0x657/0x6b0
[ 1614.735194]  [<ffffffff8160d116>] ? icmp_socket_deliver+0x46/0x230
[ 1614.735199]  [<ffffffff8160d19e>] icmp_socket_deliver+0xce/0x230
[ 1614.735203]  [<ffffffff8160d116>] ? icmp_socket_deliver+0x46/0x230
[ 1614.735208]  [<ffffffff8160d464>] icmp_unreach+0xe4/0x2c0
[ 1614.735213]  [<ffffffff8160e520>] icmp_rcv+0x350/0x4a0
[ 1614.735217]  [<ffffffff815d3285>] ip_local_deliver_finish+0x135/0x4e0
[ 1614.735221]  [<ffffffff815d319c>] ? ip_local_deliver_finish+0x4c/0x4e0
[ 1614.735225]  [<ffffffff815d3ffa>] ip_local_deliver+0x4a/0x90
[ 1614.735229]  [<ffffffff815d37b7>] ip_rcv_finish+0x187/0x730
[ 1614.735233]  [<ffffffff815d425d>] ip_rcv+0x21d/0x300
[ 1614.735237]  [<ffffffff81592a1b>] __netif_receive_skb+0x46b/0xd00
[ 1614.735241]  [<ffffffff81592801>] ? __netif_receive_skb+0x251/0xd00
[ 1614.735245]  [<ffffffff81593368>] process_backlog+0xb8/0x180
[ 1614.735249]  [<ffffffff81593cf9>] net_rx_action+0x159/0x330
[ 1614.735257]  [<ffffffff810491f0>] __do_softirq+0xd0/0x3e0
[ 1614.735264]  [<ffffffff8109ed24>] ? tick_program_event+0x24/0x30
[ 1614.735270]  [<ffffffff8175419c>] call_softirq+0x1c/0x30
[ 1614.735278]  [<ffffffff8100425d>] do_softirq+0x8d/0xc0
[ 1614.735282]  [<ffffffff8104983e>] irq_exit+0xae/0xe0
[ 1614.735287]  [<ffffffff8175494e>] smp_apic_timer_interrupt+0x6e/0x99
[ 1614.735291]  [<ffffffff81753a1c>] apic_timer_interrupt+0x6c/0x80
[ 1614.735293]  <EOI>  [<ffffffff810a14ad>] ? trace_hardirqs_off+0xd/0x10
[ 1614.735306]  [<ffffffff81336f85>] ? intel_idle+0xf5/0x150
[ 1614.735310]  [<ffffffff81336f7e>] ? intel_idle+0xee/0x150
[ 1614.735317]  [<ffffffff814e6ea9>] cpuidle_enter+0x19/0x20
[ 1614.735321]  [<ffffffff814e7538>] cpuidle_idle_call+0xa8/0x630
[ 1614.735327]  [<ffffffff8100c1ba>] cpu_idle+0x8a/0xe0
[ 1614.735333]  [<ffffffff8173762e>] start_secondary+0x220/0x222

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-04 14:07:50 -04:00
Alan Cox 6cf5c95117 netrom: copy_datagram_iovec can fail
Check for an error from this and if so bail properly.

Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-04 12:57:35 -04:00
Wei Yongjun b4e4f47e94 nl80211: fix possible memory leak nl80211_connect()
connkeys is malloced in nl80211_parse_connkeys() and should
be freed in the error handling case, otherwise it will cause
memory leak.

spatch with a semantic match is used to found this problem.
(http://coccinelle.lip6.fr/)

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-09-04 18:06:00 +02:00
Eliad Peller 3d2abdfdf1 mac80211: clear bssid on auth/assoc failure
ifmgd->bssid wasn't cleared properly in some
auth/assoc failure cases, causing mac80211 and
the low-level driver to go out of sync.

Clear ifmgd->bssid on failure, and notify the driver.

Cc: stable@kernel.org # 3.4+
Signed-off-by: Eliad Peller <eliad@wizery.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-09-04 17:14:19 +02:00
Jesse Gross c303aa94cd openvswitch: Fix FLOW_BUFSIZE definition.
The vlan encapsulation fields in the maximum flow defintion were
never updated when the representation changed before upstreaming.
In theory this could cause a kernel panic when a maximum length
flow is used.  In practice this has never happened (to my knowledge)
because skb allocations are padded out to a cache line so you would
need the right combination of flow and packet being sent to userspace.

Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-09-03 19:06:27 -07:00
Eric Dumazet b379135c40 fq_codel: dont reinit flow state
When fq_codel builds a new flow, it should not reset codel state.

Codel algo needs to get previous values (lastcount, drop_next) to get
proper behavior.

Signed-off-by: Dave Taht <dave.taht@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Dave Taht <dave.taht@bufferbloat.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-03 14:36:50 -04:00
Thomas Graf 4c3a5bdae2 sctp: Don't charge for data in sndbuf again when transmitting packet
SCTP charges wmem_alloc via sctp_set_owner_w() in sctp_sendmsg() and via
skb_set_owner_w() in sctp_packet_transmit(). If a sender runs out of
sndbuf it will sleep in sctp_wait_for_sndbuf() and expects to be waken up
by __sctp_write_space().

Buffer space charged via sctp_set_owner_w() is released in sctp_wfree()
which calls __sctp_write_space() directly.

Buffer space charged via skb_set_owner_w() is released via sock_wfree()
which calls sk->sk_write_space() _if_ SOCK_USE_WRITE_QUEUE is not set.
sctp_endpoint_init() sets SOCK_USE_WRITE_QUEUE on all sockets.

Therefore if sctp_packet_transmit() manages to queue up more than sndbuf
bytes, sctp_wait_for_sndbuf() will never be woken up again unless it is
interrupted by a signal.

This could be fixed by clearing the SOCK_USE_WRITE_QUEUE flag but ...

Charging for the data twice does not make sense in the first place, it
leads to overcharging sndbuf by a factor 2. Therefore this patch only
charges a single byte in wmem_alloc when transmitting an SCTP packet to
ensure that the socket stays alive until the packet has been released.

This means that control chunks are no longer accounted for in wmem_alloc
which I believe is not a problem as skb->truesize will typically lead
to overcharging anyway and thus compensates for any control overhead.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
CC: Vlad Yasevich <vyasevic@redhat.com>
CC: Neil Horman <nhorman@tuxdriver.com>
CC: David Miller <davem@davemloft.net>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-03 13:24:13 -04:00
Eric Dumazet e812347ccf net: sock_edemux() should take care of timewait sockets
sock_edemux() can handle either a regular socket or a timewait socket

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-03 13:22:43 -04:00
Joe Stringer 39855b5ba9 openvswitch: Fix typo
Signed-off-by: Joe Stringer <joe@wand.net.nz>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-09-02 12:18:25 -07:00
Linus Torvalds 0b1a34c992 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) NLA_PUT* --> nla_put_* conversion got one case wrong in
    nfnetlink_log, fix from Patrick McHardy.

 2) Missed error return check in ipw2100 driver, from Julia Lawall.

 3) PMTU updates in ipv4 were setting the expiry time incorrectly, fix
    from Eric Dumazet.

 4) SFC driver erroneously reversed src and dst when reporting filters
    via ethtool.

 5) Memory leak in CAN protocol and wrong setting of IRQF_SHARED in
    sja1000 can platform driver, from Alexey Khoroshilov and Sven
    Schmitt.

 6) Fix multicast traffic scaling regression in ipv4_dst_destroy, only
    take the lock when we really need to.  From Eric Dumazet.

 7) Fix non-root process spoofing in netlink, from Pablo Neira Ayuso.

 8) CWND reduction in TCP is done incorrectly during non-SACK recovery,
    fix from Yuchung Cheng.

 9) Revert netpoll change, and fix what was actually a driver specific
    problem.  From Amerigo Wang.  This should cure bootup hangs with
    netconsole some people reported.

10) Fix xen-netfront invoking __skb_fill_page_desc() with a NULL page
    pointer.  From Ian Campbell.

11) SIP NAT fix for expectiontation creation, from Pablo Neira Ayuso.

12) __ip_rt_update_pmtu() needs RCU locking, from Eric Dumazet.

13) Fix usbnet deadlock on resume, can't use GFP_KERNEL in this
    situation.  From Oliver Neukum.

14) The davinci ethernet driver triggers an OOPS on removal because it
    frees an MDIO object before unregistering it.  Fix from Bin Liu.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (41 commits)
  net: qmi_wwan: add several new Gobi devices
  fddi: 64 bit bug in smt_add_para()
  net: ethernet: fix kernel OOPS when remove davinci_mdio module
  net/xfrm/xfrm_state.c: fix error return code
  net: ipv6: fix error return code
  net: qmi_wwan: new device: Foxconn/Novatel E396
  usbnet: fix deadlock in resume
  cs89x0 : packet reception not working
  netfilter: nf_conntrack: fix racy timer handling with reliable events
  bnx2x: Correct the ndo_poll_controller call
  bnx2x: Move netif_napi_add to the open call
  ipv4: must use rcu protection while calling fib_lookup
  bnx2x: fix 57840_MF pci id
  net: ipv4: ipmr_expire_timer causes crash when removing net namespace
  e1000e: DoS while TSO enabled caused by link partner with small MSS
  l2tp: avoid to use synchronize_rcu in tunnel free function
  gianfar: fix default tx vlan offload feature flag
  netfilter: nf_nat_sip: fix incorrect handling of EBUSY for RTCP expectation
  xen-netfront: use __pskb_pull_tail to ensure linear area is big enough on RX
  netfilter: nfnetlink_log: fix error return code in init path
  ...
2012-09-02 11:28:00 -07:00
Julia Lawall 599901c3e4 net/xfrm/xfrm_state.c: fix error return code
Initialize return variable before exiting on an error path.

A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
(
if@p1 (\(ret < 0\|ret != 0\))
 { ... return ret; }
|
ret@p1 = 0
)
... when != ret = e1
    when != &ret
*if(...)
{
  ... when != ret = e2
      when forall
 return ret;
}

// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-31 16:27:48 -04:00
Julia Lawall 48f125ce1c net: ipv6: fix error return code
Initialize return variable before exiting on an error path.

The initial initialization of the return variable is also dropped, because
that value is never used.

A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
(
if@p1 (\(ret < 0\|ret != 0\))
 { ... return ret; }
|
ret@p1 = 0
)
... when != ret = e1
    when != &ret
*if(...)
{
  ... when != ret = e2
      when forall
 return ret;
}

// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-31 16:27:48 -04:00
David S. Miller 0dcd5052c8 Merge branch 'master' of git://1984.lsi.us.es/nf 2012-08-31 13:06:37 -04:00
Pablo Neira Ayuso 5b423f6a40 netfilter: nf_conntrack: fix racy timer handling with reliable events
Existing code assumes that del_timer returns true for alive conntrack
entries. However, this is not true if reliable events are enabled.
In that case, del_timer may return true for entries that were
just inserted in the dying list. Note that packets / ctnetlink may
hold references to conntrack entries that were just inserted to such
list.

This patch fixes the issue by adding an independent timer for
event delivery. This increases the size of the ecache extension.
Still we can revisit this later and use variable size extensions
to allocate this area on demand.

Tested-by: Oliver Smith <olipro@8.c.9.b.0.7.4.0.1.0.0.2.ip6.arpa>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-08-31 15:50:28 +02:00
Eric Dumazet c5ae7d4192 ipv4: must use rcu protection while calling fib_lookup
Following lockdep splat was reported by Pavel Roskin :

[ 1570.586223] ===============================
[ 1570.586225] [ INFO: suspicious RCU usage. ]
[ 1570.586228] 3.6.0-rc3-wl-main #98 Not tainted
[ 1570.586229] -------------------------------
[ 1570.586231] /home/proski/src/linux/net/ipv4/route.c:645 suspicious rcu_dereference_check() usage!
[ 1570.586233]
[ 1570.586233] other info that might help us debug this:
[ 1570.586233]
[ 1570.586236]
[ 1570.586236] rcu_scheduler_active = 1, debug_locks = 0
[ 1570.586238] 2 locks held by Chrome_IOThread/4467:
[ 1570.586240]  #0:  (slock-AF_INET){+.-...}, at: [<ffffffff814f2c0c>] release_sock+0x2c/0xa0
[ 1570.586253]  #1:  (fnhe_lock){+.-...}, at: [<ffffffff815302fc>] update_or_create_fnhe+0x2c/0x270
[ 1570.586260]
[ 1570.586260] stack backtrace:
[ 1570.586263] Pid: 4467, comm: Chrome_IOThread Not tainted 3.6.0-rc3-wl-main #98
[ 1570.586265] Call Trace:
[ 1570.586271]  [<ffffffff810976ed>] lockdep_rcu_suspicious+0xfd/0x130
[ 1570.586275]  [<ffffffff8153042c>] update_or_create_fnhe+0x15c/0x270
[ 1570.586278]  [<ffffffff815305b3>] __ip_rt_update_pmtu+0x73/0xb0
[ 1570.586282]  [<ffffffff81530619>] ip_rt_update_pmtu+0x29/0x90
[ 1570.586285]  [<ffffffff815411dc>] inet_csk_update_pmtu+0x2c/0x80
[ 1570.586290]  [<ffffffff81558d1e>] tcp_v4_mtu_reduced+0x2e/0xc0
[ 1570.586293]  [<ffffffff81553bc4>] tcp_release_cb+0xa4/0xb0
[ 1570.586296]  [<ffffffff814f2c35>] release_sock+0x55/0xa0
[ 1570.586300]  [<ffffffff815442ef>] tcp_sendmsg+0x4af/0xf50
[ 1570.586305]  [<ffffffff8156fc60>] inet_sendmsg+0x120/0x230
[ 1570.586308]  [<ffffffff8156fb40>] ? inet_sk_rebuild_header+0x40/0x40
[ 1570.586312]  [<ffffffff814f4bdd>] ? sock_update_classid+0xbd/0x3b0
[ 1570.586315]  [<ffffffff814f4c50>] ? sock_update_classid+0x130/0x3b0
[ 1570.586320]  [<ffffffff814ec435>] do_sock_write+0xc5/0xe0
[ 1570.586323]  [<ffffffff814ec4a3>] sock_aio_write+0x53/0x80
[ 1570.586328]  [<ffffffff8114bc83>] do_sync_write+0xa3/0xe0
[ 1570.586332]  [<ffffffff8114c5a5>] vfs_write+0x165/0x180
[ 1570.586335]  [<ffffffff8114c805>] sys_write+0x45/0x90
[ 1570.586340]  [<ffffffff815d2722>] system_call_fastpath+0x16/0x1b

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Pavel Roskin <proski@gnu.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-30 13:33:08 -04:00
Francesco Ruggeri acbb219d5f net: ipv4: ipmr_expire_timer causes crash when removing net namespace
When tearing down a net namespace, ipv4 mr_table structures are freed
without first deactivating their timers. This can result in a crash in
run_timer_softirq.
This patch mimics the corresponding behaviour in ipv6.
Locking and synchronization seem to be adequate.
We are about to kfree mrt, so existing code should already make sure that
no other references to mrt are pending or can be created by incoming traffic.
The functions invoked here do not cause new references to mrt or other
race conditions to be created.
Invoking del_timer_sync guarantees that ipmr_expire_timer is inactive.
Both ipmr_expire_process (whose completion we may have to wait in
del_timer_sync) and mroute_clean_tables internally use mfc_unres_lock
or other synchronizations when needed, and they both only modify mrt.

Tested in Linux 3.4.8.

Signed-off-by: Francesco Ruggeri <fruggeri@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-30 12:51:32 -04:00
xeb@mail.ru 99469c32f7 l2tp: avoid to use synchronize_rcu in tunnel free function
Avoid to use synchronize_rcu in l2tp_tunnel_free because context may be
atomic.

Signed-off-by: Dmitry Kozlov <xeb@mail.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-30 12:31:03 -04:00
Pablo Neira Ayuso 3f509c689a netfilter: nf_nat_sip: fix incorrect handling of EBUSY for RTCP expectation
We're hitting bug while trying to reinsert an already existing
expectation:

kernel BUG at kernel/timer.c:895!
invalid opcode: 0000 [#1] SMP
[...]
Call Trace:
 <IRQ>
 [<ffffffffa0069563>] nf_ct_expect_related_report+0x4a0/0x57a [nf_conntrack]
 [<ffffffff812d423a>] ? in4_pton+0x72/0x131
 [<ffffffffa00ca69e>] ip_nat_sdp_media+0xeb/0x185 [nf_nat_sip]
 [<ffffffffa00b5b9b>] set_expected_rtp_rtcp+0x32d/0x39b [nf_conntrack_sip]
 [<ffffffffa00b5f15>] process_sdp+0x30c/0x3ec [nf_conntrack_sip]
 [<ffffffff8103f1eb>] ? irq_exit+0x9a/0x9c
 [<ffffffffa00ca738>] ? ip_nat_sdp_media+0x185/0x185 [nf_nat_sip]

We have to remove the RTP expectation if the RTCP expectation hits EBUSY
since we keep trying with other ports until we succeed.

Reported-by: Rafal Fitt <rafalf@aplusc.com.pl>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-08-30 18:27:14 +02:00
Julia Lawall 6fc09f10f1 netfilter: nfnetlink_log: fix error return code in init path
Initialize return variable before exiting on an error path.

A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
(
if@p1 (\(ret < 0\|ret != 0\))
 { ... return ret; }
|
ret@p1 = 0
)
... when != ret = e1
    when != &ret
*if(...)
{
  ... when != ret = e2
      when forall
 return ret;
}

// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-08-30 03:29:58 +02:00
Julia Lawall ef6acf68c2 netfilter: ctnetlink: fix error return code in init path
Initialize return variable before exiting on an error path.

A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
(
if@p1 (\(ret < 0\|ret != 0\))
 { ... return ret; }
|
ret@p1 = 0
)
... when != ret = e1
    when != &ret
*if(...)
{
  ... when != ret = e2
      when forall
 return ret;
}

// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-08-30 03:28:22 +02:00
Julia Lawall 0a54e939d8 ipvs: fix error return code
Initialize return variable before exiting on an error path.

A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
(
if@p1 (\(ret < 0\|ret != 0\))
 { ... return ret; }
|
ret@p1 = 0
)
... when != ret = e1
    when != &ret
*if(...)
{
  ... when != ret = e2
      when forall
 return ret;
}

// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-08-30 03:27:19 +02:00
Amerigo Wang 072a9c4860 netpoll: revert 6bdb7fe310 and fix be_poll() instead
Against -net.

In the patch "netpoll: re-enable irq in poll_napi()", I tried to
fix the following warning:

[100718.051041] ------------[ cut here ]------------
[100718.051048] WARNING: at kernel/softirq.c:159 local_bh_enable_ip+0x7d/0xb0()
(Not tainted)
[100718.051049] Hardware name: ProLiant BL460c G7
...
[100718.051068] Call Trace:
[100718.051073]  [<ffffffff8106b747>] ? warn_slowpath_common+0x87/0xc0
[100718.051075]  [<ffffffff8106b79a>] ? warn_slowpath_null+0x1a/0x20
[100718.051077]  [<ffffffff810747ed>] ? local_bh_enable_ip+0x7d/0xb0
[100718.051080]  [<ffffffff8150041b>] ? _spin_unlock_bh+0x1b/0x20
[100718.051085]  [<ffffffffa00ee974>] ? be_process_mcc+0x74/0x230 [be2net]
[100718.051088]  [<ffffffffa00ea68c>] ? be_poll_tx_mcc+0x16c/0x290 [be2net]
[100718.051090]  [<ffffffff8144fe76>] ? netpoll_poll_dev+0xd6/0x490
[100718.051095]  [<ffffffffa01d24a5>] ? bond_poll_controller+0x75/0x80 [bonding]
[100718.051097]  [<ffffffff8144fde5>] ? netpoll_poll_dev+0x45/0x490
[100718.051100]  [<ffffffff81161b19>] ? ksize+0x19/0x80
[100718.051102]  [<ffffffff81450437>] ? netpoll_send_skb_on_dev+0x157/0x240

by reenabling IRQ before calling ->poll, but it seems more
problems are introduced after that patch:

http://ozlabs.org/~akpm/stuff/IMG_20120824_122054.jpg
http://marc.info/?l=linux-netdev&m=134563282530588&w=2

So it is safe to fix be2net driver code directly.

This patch reverts the offending commit and fixes be_poll() by
avoid disabling BH there, this is okay because be_poll()
can be called either by poll_napi() which already disables
IRQ, or by net_rx_action() which already disables BH.

Reported-by: Andrew Morton <akpm@linux-foundation.org>
Reported-by: Sylvain Munaut <s.munaut@whatever-company.com>
Cc: Sylvain Munaut <s.munaut@whatever-company.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Miller <davem@davemloft.net>
Cc: Sathya Perla <sathya.perla@emulex.com>
Cc: Subbu Seetharaman <subbu.seetharaman@emulex.com>
Cc: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: Cong Wang <amwang@redhat.com>
Tested-by: Sylvain Munaut <s.munaut@whatever-company.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-29 15:03:23 -04:00
Vinicius Costa Gomes d8343f1257 Bluetooth: Fix sending a HCI Authorization Request over LE links
In the case that the link is already in the connected state and a
Pairing request arrives from the mgmt interface, hci_conn_security()
would be called but it was not considering LE links.

Reported-by: João Paulo Rechi Vita <jprvita@openbossa.org>
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2012-08-27 08:11:51 -07:00
Vinicius Costa Gomes cc110922da Bluetooth: Change signature of smp_conn_security()
To make it clear that it may be called from contexts that may not have
any knowledge of L2CAP, we change the connection parameter, to receive
a hci_conn.

This also makes it clear that it is checking the security of the link.

Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2012-08-27 08:07:18 -07:00
Linus Torvalds 8497ae61d0 Merge branch 'for-3.6' of git://linux-nfs.org/~bfields/linux
Pull nfsd bugfixes from J. Bruce Fields:
 "Particular thanks to Michael Tokarev, Malahal Naineni, and Jamie
  Heilman for their testing and debugging help."

* 'for-3.6' of git://linux-nfs.org/~bfields/linux:
  svcrpc: fix svc_xprt_enqueue/svc_recv busy-looping
  svcrpc: sends on closed socket should stop immediately
  svcrpc: fix BUG() in svc_tcp_clear_pages
  nfsd4: fix security flavor of NFSv4.0 callback
2012-08-25 11:43:41 -07:00
David S. Miller d05cebb915 Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless
John W. Linville says:

====================
This batch of fixes is intended for 3.6...

Johannes Berg gives us a pair of iwlwifi fixes.  One corrects some
improperly defined ifdefs that lead to crashes and BUG_ONs.  The other
prevents attempts to read SRAM for devices that aren't actually started.

Julia Lawall provides an ipw2100 fix to properly set the return code
from a function call before testing it! :-)

Thomas Huehn corrects the improper use of a constant related to a power
setting in ath5k.

Thomas Pedersen offers a mac80211 fix to properly handle destination
addresses of unicast frames passing though a mesh gate.

Vladimir Zapolskiy provides a brcmsmac fix to properly mark the
interface state when the device goes down.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-24 15:15:10 -04:00
Yuchung Cheng 7c4a56fec3 tcp: fix cwnd reduction for non-sack recovery
The cwnd reduction in fast recovery is based on the number of packets
newly delivered per ACK. For non-sack connections every DUPACK
signifies a packet has been delivered, but the sender mistakenly
skips counting them for cwnd reduction.

The fix is to compute newly_acked_sacked after DUPACKs are accounted
in sacked_out for non-sack connections.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Nandita Dukkipati <nanditad@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-24 13:48:58 -04:00
Pablo Neira Ayuso 20e1db19db netlink: fix possible spoofing from non-root processes
Non-root user-space processes can send Netlink messages to other
processes that are well-known for being subscribed to Netlink
asynchronous notifications. This allows ilegitimate non-root
process to send forged messages to Netlink subscribers.

The userspace process usually verifies the legitimate origin in
two ways:

a) Socket credentials. If UID != 0, then the message comes from
   some ilegitimate process and the message needs to be dropped.

b) Netlink portID. In general, portID == 0 means that the origin
   of the messages comes from the kernel. Thus, discarding any
   message not coming from the kernel.

However, ctnetlink sets the portID in event messages that has
been triggered by some user-space process, eg. conntrack utility.
So other processes subscribed to ctnetlink events, eg. conntrackd,
know that the event was triggered by some user-space action.

Neither of the two ways to discard ilegitimate messages coming
from non-root processes can help for ctnetlink.

This patch adds capability validation in case that dst_pid is set
in netlink_sendmsg(). This approach is aggressive since existing
applications using any Netlink bus to deliver messages between
two user-space processes will break. Note that the exception is
NETLINK_USERSOCK, since it is reserved for netlink-to-netlink
userspace communication.

Still, if anyone wants that his Netlink bus allows netlink-to-netlink
userspace, then they can set NL_NONROOT_SEND. However, by default,
I don't think it makes sense to allow to use NETLINK_ROUTE to
communicate two processes that are sending no matter what information
that is not related to link/neighbouring/routing. They should be using
NETLINK_USERSOCK instead for that.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-24 13:36:09 -04:00