This patch adds namespace support for ICMP protocol tracker.
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch adds namespace support for UDP protocol tracker.
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch adds namespace support for TCP protocol tracker.
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch adds namespace support for the generic layer 4 protocol
tracker.
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch prepares the namespace support for layer 3 protocol trackers.
Basically, this modifies the following interfaces:
* nf_ct_l3proto_[un]register_sysctl.
* nf_conntrack_l3proto_[un]register.
We add a new nf_ct_l3proto_net is used to get the pernet data of l3proto.
This adds rhe new struct nf_ip_net that is used to store the sysctl header
and l3proto_ipv4,l4proto_tcp(6),l4proto_udp(6),l4proto_icmp(v6) because the
protos such tcp and tcp6 use the same data,so making nf_ip_net as a field
of netns_ct is the easiest way to manager it.
This patch also adds init_net to struct nf_conntrack_l3proto to initial
the layer 3 protocol pernet data.
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch prepares the namespace support for layer 4 protocol trackers.
Basically, this modifies the following interfaces:
* nf_ct_[un]register_sysctl
* nf_conntrack_l4proto_[un]register
to include the namespace parameter. We still use init_net in this patch
to prepare the ground for follow-up patches for each layer 4 protocol
tracker.
We add a new net_id field to struct nf_conntrack_l4proto that is used
to store the pernet_operations id for each layer 4 protocol tracker.
Note that AF_INET6's protocols do not need to do sysctl compat. Thus,
we only register compat sysctl when l4proto.l3proto != AF_INET6.
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Redesign all the off-channel code, getting rid of
the generic off-channel work concept, replacing
it with a simple remain-on-channel list.
This fixes a number of small issues with the ROC
implementation:
* offloaded remain-on-channel couldn't be queued,
now we can queue it as well, if needed
* in iwlwifi (the only user) offloaded ROC is
mutually exclusive with scanning, use the new
queue to handle that case -- I expect that it
will later depend on a HW flag
The bigger issue though is that there's a bad bug
in the current implementation: if we get a mgmt
TX request while HW roc is active, and this new
request has a wait time, we actually schedule a
software ROC instead since we can't guarantee the
existing offloaded ROC will still be that long.
To fix this, the queuing mechanism was needed.
The queuing mechanism for offloaded ROC isn't yet
optimal, ideally we should add API to have the HW
extend the ROC if needed. We could add that later
but for now use a software implementation.
Overall, this unifies the behaviour between the
offloaded and software-implemented case as much
as possible.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The IDLE handling in HW off-channel is broken right
now since we turn off IDLE only when the off-channel
period already started. Therefore, all drivers that
use it today (only iwlwifi!) must support off-channel
while idle, so playing with idle isn't needed at all.
Off-channel in general, since it's no longer used for
authentication/association, shouldn't affect PS, so
also remove that logic.
Also document a small caveat for reporting TX status
from off-channel frames in HW remain-on-channel.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Now that we've removed all uses of the set_channel
API except for the monitor channel and in libertas,
clarify this. Split the libertas mesh use into a
new libertas_set_mesh_channel() operation, just to
keep backward compatibility, and rename the normal
set_channel() to set_monitor_channel().
Also describe the desired set_monitor_channel()
semantics more clearly.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
commit 5faa5df1fa (inetpeer: Invalidate the inetpeer tree along with
the routing cache) added a race :
Before freeing an inetpeer, we must respect a RCU grace period, and make
sure no user will attempt to increase refcnt.
inetpeer_invalidate_tree() waits for a RCU grace period before inserting
inetpeer tree into gc_list and waking the worker. At that time, no
concurrent lookup can find a inetpeer in this tree.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Just like the AP mode patch, instead of setting
the channel and then joining the mesh network,
provide the channel to join the network on to
the join_mesh() function.
Like in AP mode, you can also give the channel
to the join-mesh nl80211 command now.
Unlike AP mode, it picks a default channel if
none was given.
As libertas uses mesh mode interfaces but has
no join_mesh callback and we can't simply break
it, keep some compatibility code for that case
and configure the channel directly for it.
In the non-libertas case, where we store the
channel until join, allow setting it while the
interface is down.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Instead of setting the channel first and then
starting the AP, let cfg80211 store the channel
and provide it as one of the AP settings.
This means that now you have to set the channel
before you can start an AP interface, but since
hostapd/wpa_supplicant always do that we're OK
with this change.
Alternatively, it's now possible to give the
channel as an attribute to the start-ap nl80211
command, overriding any preset channel.
Cc: Kalle Valo <kvalo@qca.qualcomm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Change cfg80211_can_beacon_sec_chan() to return true
if there is no secondary channel to simplify all the
current users of it. They all check the channel type
before calling the function because it returns false
if there's no secondary channel.
Also actually document the return value.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
ieee80211_get_operstate() was used by drivers in order to
know whether the sta link is up, but it's no longer needed
(nor used) as mac80211 notifies the drivers about
authorization changes (via the sta_state callback)
Signed-off-by: Eliad Peller <eliad@wizery.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Simplify the use of #ifdef CONFIG_MAC80211_IBSS_DEBUG/#endif
by adding a logging macro to encapsulate the test.
Convert the appropriate uses too.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Simplify the use of #ifdef CONFIG_MAC80211_HT_DEBUG/#endif
by adding a logging macro to encapsulate the test.
Convert the appropriate uses too.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Low level drivers can now set certain netdev feature bits in
netdev_features member of the ieee80211_hw struct. These will be
propagated to every netdev created from this HW.
The white-listed features currently include only ones related to HW
checksumming.
Signed-off-by: Arik Nemtsov <arik@wizery.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
A2MP doesn't use part of the L2CAP chan ops API so we just create general
empty function instead of the A2MP specific one.
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
This patch renames L2CAP_LE_DEFAULT_MTU macro to L2CAP_LE_MIN_MTU
since it represents the minimum MTU value, not the default MTU
value for LE.
Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Those are not used anywhere in code (and never were since introduction
in 2006) so just remove them.
Signed-off-by: Szymon Janc <szymon.janc@tieto.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
A2MP fixed channel do not have sk
Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
A2MP status codes copied from Bluez patch sent by Peter Krystad
<pkrystad@codeaurora.org>.
Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Helper function to build and send A2MP messages.
Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Define AMP Manager and some basic functions.
Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
This move socket specific code to l2cap_sock.c.
Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This remove a bit more of socket code from l2cap core, this calls set the
SOCK_ZAPPED and do some clean up depending on the socket state.
Reported-by: Mat Martineau <mathewm@codeaurora.org>
Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Use chan instead of void * makes more sense here.
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Add HCI commands to deal with Bluetooth AMP controllers.
Those commands will be used by bluetooth and softamp code.
Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Define assigned Protocol and Service Multiplexor (PSM) identifiers
and use them instead of magic numbers.
Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Define Continuation flag which the only flag used from Flags field
in L2CAP Configuration Request and Response.
Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Most of the include were unnecessary or already included by some other
header.
Replace module.h by export.h where possible.
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Fix all warning and errors reported by checkpatch but license trailing
whitespace and bdaddr_t definition.
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Remove magic number with defined link key size.
Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
HCI_QUIRK_NO_RESET name is misleading - purpose of this quirk is to
reset device on close instead of init, not to not reset at all.
Rename it to HCI_QUIRK_RESET_ON_CLOSE to avoid confusion.
Signed-off-by: Szymon Janc <szymon.janc@tieto.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Now that l2cap_ctrl is used to set up control fields, these macros are
not needed.
Signed-off-by: Mat Martineau <mathewm@codeaurora.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
The ERTM specification requires the retransmit timer to be cancelled
when the monitor timer is set. The retransmit timer cannot be set
again while the monitor timer is pending.
Signed-off-by: Mat Martineau <mathewm@codeaurora.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
This deletes the receive code that had handlers for each frame type at
the top level, and then had logic to determine the receive state
within each handler.
Signed-off-by: Mat Martineau <mathewm@codeaurora.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
This fixes a regression from commit
2ead70b839 that is present in all
kernels starting at v3.0.
When L2CAP information was moved to struct l2cap_chan, a check was
added to l2cap_chan_del to avoid certain cleanup operations when ERTM
or streaming mode had not yet been initialized. The logic in the
check did not take in to account that chan->conf_state is set to 0 in
l2cap_chan_ready, so l2cap_chan_del failed to cancel timers and leaked
memory any time the ERTM queues or lists were not empty.
This change makes sure that l2cap_chan_del only returns early if
ERTM initialization was not performed.
Signed-off-by: Mat Martineau <mathewm@codeaurora.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This routine will be called by drivers whenever they receive data in target
mode. This should be unexpected events and as such should be handled by a
standalone API (i.e. not as a callback pointer from an existing API).
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Userspace gets a netlink event upon target mode activation.
The LLCP layer is also signaled when we get an ATR_REQ in order to get
the remote general bytes.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
When NetLabel is not enabled, e.g. CONFIG_NETLABEL=n, and the system
receives a CIPSO tagged packet it is dropped (cipso_v4_validate()
returns non-zero). In most cases this is the correct and desired
behavior, however, in the case where we are simply forwarding the
traffic, e.g. acting as a network bridge, this becomes a problem.
This patch fixes the forwarding problem by providing the basic CIPSO
validation code directly in ip_options_compile() without the need for
the NetLabel or CIPSO code. The new validation code can not perform
any of the CIPSO option label/value verification that
cipso_v4_validate() does, but it can verify the basic CIPSO option
format.
The behavior when NetLabel is enabled is unchanged.
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking changes from David S. Miller:
1) Fix IPSEC header length calculation for transport mode in ESP. The
issue is whether to do the calculation before or after alignment.
Fix from Benjamin Poirier.
2) Fix regression in IPV6 IPSEC fragment length calculations, from Gao
Feng. This is another transport vs tunnel mode issue.
3) Handle AF_UNSPEC connect()s properly in L2TP to avoid OOPSes. Fix
from James Chapman.
4) Fix USB ASIX driver's reception of full sized VLAN packets, from
Eric Dumazet.
5) Allow drop monitor (and, more generically, all generic netlink
protocols) to be automatically loaded as a module. From Neil
Horman.
Fix up trivial conflict in Documentation/feature-removal-schedule.txt
due to new entries added next to each other at the end. As usual.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (38 commits)
net/smsc911x: Repair broken failure paths
virtio-net: remove useless disable on freeze
netdevice: Update netif_dbg for CONFIG_DYNAMIC_DEBUG
drop_monitor: Add module alias to enable automatic module loading
genetlink: Build a generic netlink family module alias
net: add MODULE_ALIAS_NET_PF_PROTO_NAME
r6040: Do a Proper deinit at errorpath and also when driver unloads (calling r6040_remove_one)
r6040: disable pci device if the subsequent calls (after pci_enable_device) fails
skb: avoid unnecessary reallocations in __skb_cow
net: sh_eth: fix the rxdesc pointer when rx descriptor empty happens
asix: allow full size 8021Q frames to be received
rds_rdma: don't assume infiniband device is PCI
l2tp: fix oops in L2TP IP sockets for connect() AF_UNSPEC case
mac80211: fix ADDBA declined after suspend with wowlan
wlcore: fix undefined symbols when CONFIG_PM is not defined
mac80211: fix flag check for QoS NOACK frames
ath9k_hw: apply internal regulator settings on AR933x
ath9k_hw: update AR933x initvals to fix issues with high power devices
ath9k: fix a use-after-free-bug when ath_tx_setup_buffer() fails
ath9k: stop rx dma before stopping tx
...
We call the destroy function when a cgroup starts to be removed, such as
by a rmdir event.
However, because of our reference counters, some objects are still
inflight. Right now, we are decrementing the static_keys at destroy()
time, meaning that if we get rid of the last static_key reference, some
objects will still have charges, but the code to properly uncharge them
won't be run.
This becomes a problem specially if it is ever enabled again, because now
new charges will be added to the staled charges making keeping it pretty
much impossible.
We just need to be careful with the static branch activation: since there
is no particular preferred order of their activation, we need to make sure
that we only start using it after all call sites are active. This is
achieved by having a per-memcg flag that is only updated after
static_key_slow_inc() returns. At this time, we are sure all sites are
active.
This is made per-memcg, not global, for a reason: it also has the effect
of making socket accounting more consistent. The first memcg to be
limited will trigger static_key() activation, therefore, accounting. But
all the others will then be accounted no matter what. After this patch,
only limited memcgs will have its sockets accounted.
[akpm@linux-foundation.org: move enum sock_flag_bits into sock.h,
document enum sock_flag_bits,
convert memcg_proto_active() and memcg_proto_activated() to test_bit(),
redo tcp_update_limit() comment to 80 cols]
Signed-off-by: Glauber Costa <glommer@parallels.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Acked-by: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Since commit ad0081e43a
"ipv6: Fragment locally generated tunnel-mode IPSec6 packets as needed"
the fragment of packets is incorrect.
because tunnel mode needs IPsec headers and trailer for all fragments,
while on transport mode it is sufficient to add the headers to the
first fragment and the trailer to the last.
so modify mtu and maxfraglen base on ipsec mode and if fragment is first
or last.
with my test,it work well(every fragment's size is the mtu)
and does not trigger slow fragment path.
Changes from v1:
though optimization, mtu_prev and maxfraglen_prev can be delete.
replace xfrm mode codes with dst_entry's new frag DST_XFRM_TUNNEL.
add fuction ip6_append_data_mtu to make codes clearer.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull more networking updates from David Miller:
"Ok, everything from here on out will be bug fixes."
1) One final sync of wireless and bluetooth stuff from John Linville.
These changes have all been in his tree for more than a week, and
therefore have had the necessary -next exposure. John was just away
on a trip and didn't have a change to send the pull request until a
day or two ago.
2) Put back some defines in user exposed header file areas that were
removed during the tokenring purge. From Stephen Hemminger and Paul
Gortmaker.
3) A bug fix for UDP hash table allocation got lost in the pile due to
one of those "you got it.. no I've got it.." situations. :-)
From Tim Bird.
4) SKB coalescing in TCP needs to have stricter checks, otherwise we'll
try to coalesce overlapping frags and crash. Fix from Eric Dumazet.
5) RCU routing table lookups can race with free_fib_info(), causing
crashes when we deref the device pointers in the route. Fix by
releasing the net device in the RCU callback. From Yanmin Zhang.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (293 commits)
tcp: take care of overlaps in tcp_try_coalesce()
ipv4: fix the rcu race between free_fib_info and ip_route_output_slow
mm: add a low limit to alloc_large_system_hash
ipx: restore token ring define to include/linux/ipx.h
if: restore token ring ARP type to header
xen: do not disable netfront in dom0
phy/micrel: Fix ID of KSZ9021
mISDN: Add X-Tensions USB ISDN TA XC-525
gianfar:don't add FCB length to hard_header_len
Bluetooth: Report proper error number in disconnection
Bluetooth: Create flags for bt_sk()
Bluetooth: report the right security level in getsockopt
Bluetooth: Lock the L2CAP channel when sending
Bluetooth: Restore locking semantics when looking up L2CAP channels
Bluetooth: Fix a redundant and problematic incoming MTU check
Bluetooth: Add support for Foxconn/Hon Hai AR5BBU22 0489:E03C
Bluetooth: Fix EIR data generation for mgmt_device_found
Bluetooth: Fix Inquiry with RSSI event mask
Bluetooth: improve readability of l2cap_seq_list code
Bluetooth: Fix skb length calculation
...
Pull cgroup updates from Tejun Heo:
"cgroup file type addition / removal is updated so that file types are
added and removed instead of individual files so that dynamic file
type addition / removal can be implemented by cgroup and used by
controllers. blkio controller changes which will come through block
tree are dependent on this. Other changes include res_counter cleanup
and disallowing kthread / PF_THREAD_BOUND threads to be attached to
non-root cgroups.
There's a reported bug with the file type addition / removal handling
which can lead to oops on cgroup umount. The issue is being looked
into. It shouldn't cause problems for most setups and isn't a
security concern."
Fix up trivial conflict in Documentation/feature-removal-schedule.txt
* 'for-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (21 commits)
res_counter: Account max_usage when calling res_counter_charge_nofail()
res_counter: Merge res_counter_charge and res_counter_charge_nofail
cgroups: disallow attaching kthreadd or PF_THREAD_BOUND threads
cgroup: remove cgroup_subsys->populate()
cgroup: get rid of populate for memcg
cgroup: pass struct mem_cgroup instead of struct cgroup to socket memcg
cgroup: make css->refcnt clearing on cgroup removal optional
cgroup: use negative bias on css->refcnt to block css_tryget()
cgroup: implement cgroup_rm_cftypes()
cgroup: introduce struct cfent
cgroup: relocate __d_cgrp() and __d_cft()
cgroup: remove cgroup_add_file[s]()
cgroup: convert memcg controller to the new cftype interface
memcg: always create memsw files if CONFIG_CGROUP_MEM_RES_CTLR_SWAP
cgroup: convert all non-memcg controllers to the new cftype interface
cgroup: relocate cftype and cgroup_subsys definitions in controllers
cgroup: merge cft_release_agent cftype array into the base files array
cgroup: implement cgroup_add_cftypes() and friends
cgroup: build list of all cgroups under a given cgroupfs_root
cgroup: move cgroup_clear_directory() call out of cgroup_populate_dir()
...
Mostly bool conversions, some inline removals and const additions.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ipv6_opt_accepted() returns a bool, and can use const pointers
ipv6_addr_equal(), ipv6_addr_any(), ipv6_addr_loopback(),
ipv6_addr_orchid() return a bool.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- match() method returns a boolean
- return (A && B && C && D) -> return A && B && C && D
- fix indentation
Signed-off-by: Eric Dumazet <edumazet@google.com>
Enable dynamic debugging and remove a bunch of #ifdef/#endifs.
Add a lapb_dbg(level, fmt, ...) macro and replace the
printk(KERN_DEBUG uses.
Add pr_fmt and remove embedded prefixes.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bool conversions where possible.
__inline__ -> inline
space cleanups
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bool/const conversions where possible
__inline__ -> inline
space cleanups
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- sock_flag() accepts a const pointer
- sock_flag() returns a boolean
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
codel_should_drop() logic allows a packet being not dropped if queue
size is under max packet size.
In fq_codel, we have two possible backlogs : The qdisc global one, and
the flow local one.
The meaningful one for codel_should_drop() should be the global backlog,
not the per flow one, so that thin flows can have a non zero drop/mark
probability.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Dave Taht <dave.taht@bufferbloat.net>
Cc: Kathleen Nichols <nichols@pollere.com>
Cc: Van Jacobson <van@pollere.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Support for monitor device intended to capture all the network activity.
This interface could be used by networks sniffers and is already
supported by WireShark. That's a good test point to check that basic
MAC support works.
Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This stack implementation distinguishes several types of slave
interfaces. Another parameter to 'add_iface_' function is added
to clarify the interface type is going to be registered.
Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
According IEEE 802.15.4 standard each node can be either full functionality
device (FFD) or reduce functionality device (RFD). So 2 sets of operations
are needed. This patch declare RFD operations structure.
Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Main RX data path implementation between physical and mac layers.
Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
An interface to allocate and register ieee802154 compatible device.
The allocated device has the following representation in memory:
+-----------------------+
| struct wpan_phy |
+-----------------------+
| struct mac802154_priv |
+-----------------------+
| driver's private data |
+-----------------------+
Used by device drivers to register new instance in the stack.
Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The IEEE 802.15.4 Working Group focuses on the standardization of the
bottom two layers of ISO/OSI protocol stack: Physical (PHY) and MAC.
The MAC layer provides access control to a shared channel and reliable
data delivery. The main functions performed by the MAC sublayer are:
association and disassociation, security control, optional star
network topology functions, such as beacon generation and Guaranteed
Time Slots (GTSs) management, generation of ACK frames (if used), and,
finally, application support for the two possible network topologies
described in the standard.
This is an initial commit which describes main data structures needed
for ieee802.15.4 compatible devices representation in the MAC layer.
Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
defer_setup and suspended are now flags into bt_sk().
Signed-off-by: Gustavo Padovan <gustavo@padovan.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
The ERTM and streaming mode transmit queue must only be accessed while
the L2CAP channel lock is held. Locking the channel before calling
l2cap_chan_send ensures that multiple threads cannot simultaneously
manipulate the queue when sending and receiving concurrently.
L2CAP channel locking had previously moved to the l2cap_chan struct
instead of the associated socket, so some of the old socket locking
can also be removed in this patch.
Signed-off-by: Mat Martineau <mathewm@codeaurora.org>
Signed-off-by: Gustavo Padovan <gustavo@padovan.org>
The mgmt_device_found function expects to receive only the significant
part of the EIR data so it needs to be removed before calling the
function. This patch adds a new eir_get_length() helper function to
calculate the length of the significant part.
Signed-off-by: Vishal Agarwal <vishal.agarwal@stericsson.com>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
It should return bool, not int. The function even
does return true/false.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Add a flag for the HT format (mixed vs. greenfield)
to allow drivers to report that on receive. Not all
drivers will do that though, so allow drivers to set
which radiotap MCS details they report.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Add IV-room in skb also for TKIP and WEP.
Extend patch: "mac80211: support adding IV-room in the skb for CCMP keys"
Signed-off-by: Janusz Dziedzic <janusz.dziedzic@tieto.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
We are going to delete the Token ring support. This removes any
special processing in the core networking for token ring, (aside
from net/tr.c itself), leaving the drivers and remaining tokenring
support present but inert.
The mass removal of the drivers and net/tr.c will be in a separate
commit, so that the history of these files that we still care
about won't have the giant deletion tied into their history.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
The NFC core code already does that for them.
Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
It is now specified that nfc_target_found() and nfc_target_lost() core
functions must not be called from an atomic context. This allow us to
serialize calls and protect the targets table using the nfc device lock
instead of a spinlock.
Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The NFC Core now caches the active nfc target pointer, thereby avoiding
the need to lookup the target table for each invocation of a driver ops.
Consequently, pn533, HCI and NCI now directly receive an nfc_target
pointer instead of a target index.
Cc: Ilan Elias <ilane@ti.com>
Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
David pointed out gcc might generate poor code with 31bit fields.
Using u16 is more than enough and permits a better code output.
Also make the code intent more readable using constants, fixed point arithmetic
not being trivial for everybody.
Suggested-by: David Miller <davem@davemloft.net>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It fixes L2CAP socket based security level elevation during a
connection. The HID profile needs this (for keyboards) and it is the only
way to achieve the security level elevation when using the management
interface to talk to the kernel (hence the management enabling patch
being the one that exposes this issue).
It enables the userspace a security level change when the socket is
already connected and create a way to notify the socket the result of the
request. At the moment of the request the socket is made non writable, if
the request fails the connections closes, otherwise the socket is made
writable again, POLL_OUT is emmited.
Signed-off-by: Gustavo Padovan <gustavo@padovan.org>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
As Van pointed out, interval/sqrt(count) can be implemented using
multiplies only.
http://en.wikipedia.org/wiki/Methods_of_computing_square_roots#Iterative_methods_for_reciprocal_square_roots
This patch implements the Newton method and reciprocal divide.
Total cost is 15 cycles instead of 120 on my Corei5 machine (64bit
kernel).
There is a small 'error' for count values < 5, but we don't really care.
I reuse a hole in struct codel_vars :
- pack the dropping boolean into one bit
- use 31bit to store the reciprocal value of sqrt(count).
Suggested-by: Van Jacobson <van@pollere.net>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Dave Taht <dave.taht@bufferbloat.net>
Cc: Kathleen Nichols <nichols@pollere.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Matt Mathis <mattmathis@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Nandita Dukkipati <nanditad@google.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
An implementation of CoDel AQM, from Kathleen Nichols and Van Jacobson.
http://queue.acm.org/detail.cfm?id=2209336
This AQM main input is no longer queue size in bytes or packets, but the
delay packets stay in (FIFO) queue.
As we don't have infinite memory, we still can drop packets in enqueue()
in case of massive load, but mean of CoDel is to drop packets in
dequeue(), using a control law based on two simple parameters :
target : target sojourn time (default 5ms)
interval : width of moving time window (default 100ms)
Based on initial work from Dave Taht.
Refactored to help future codel inclusion as a plugin for other linux
qdisc (FQ_CODEL, ...), like RED.
include/net/codel.h contains codel algorithm as close as possible than
Kathleen reference.
net/sched/sch_codel.c contains the linux qdisc specific glue.
Separate structures permit a memory efficient implementation of fq_codel
(to be sent as a separate work) : Each flow has its own struct
codel_vars.
timestamps are taken at enqueue() time with 1024 ns precision, allowing
a range of 2199 seconds in queue, and 100Gb links support. iproute2 uses
usec as base unit.
Selected packets are dropped, unless ECN is enabled and packets can get
ECN mark instead.
Tested from 2Mb to 10Gb speeds with no particular problems, on ixgbe and
tg3 drivers (BQL enabled).
Usage: tc qdisc ... codel [ limit PACKETS ] [ target TIME ]
[ interval TIME ] [ ecn ]
qdisc codel 10: parent 1:1 limit 2000p target 3.0ms interval 60.0ms ecn
Sent 13347099587 bytes 8815805 pkt (dropped 0, overlimits 0 requeues 0)
rate 202365Kbit 16708pps backlog 113550b 75p requeues 0
count 116 lastcount 98 ldelay 4.3ms dropping drop_next 816us
maxpacket 1514 ecn_mark 84399 drop_overlimit 0
CoDel must be seen as a base module, and should be used keeping in mind
there is still a FIFO queue. So a typical setup will probably need a
hierarchy of several qdiscs and packet classifiers to be able to meet
whatever constraints a user might have.
One possible example would be to use fq_codel, which combines Fair
Queueing and CoDel, in replacement of sfq / sfq_red.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Dave Taht <dave.taht@bufferbloat.net>
Cc: Kathleen Nichols <nichols@pollere.com>
Cc: Van Jacobson <van@pollere.net>
Cc: Tom Herbert <therbert@google.com>
Cc: Matt Mathis <mattmathis@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It actually works on the input queue and will use its read mem
routines, thus it's better to have in in the tcp_input.c file.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
dst_check() will take care of SA (and obsolete field), hence
IPsec rekeying scenario is taken into account.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Vlad Yaseivch <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use more common code for ERTM and streaming mode segmentation and
transmission, and begin using skb control block data for delaying
extended or enhanced header generation until just before the packet is
transmitted. This code is also better suited for resegmentation,
which is needed when L2CAP links are reconfigured after an AMP channel
move.
Signed-off-by: Mat Martineau <mathewm@codeaurora.org>
Reviewed-by: Ulisses Furquim <ulisses@profusion.mobi>
Signed-off-by: Gustavo Padovan <gustavo@padovan.org>
The Bluetooth Low Energy support so far was disabled by default via
a module parameter. With this change the module parameter will be removed
and Low Energy is enabled by default.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo@padovan.org>
No one is using hci_le_ltk_neg_reply() in bluetooth subsystem.
Signed-off-by: Syam Sidhardhan <s.syam@samsung.com>
Signed-off-by: Gustavo Padovan <gustavo@padovan.org>
In this API, we were using sizeof operator for an array
given as function argument, which is invalid.
However this API is not used anywhere.
Signed-off-by: Syam Sidhardhan <s.syam@samsung.com>
Signed-off-by: Gustavo Padovan <gustavo@padovan.org>
These values are now in the nested l2cap_ctrl struct.
Signed-off-by: Mat Martineau <mathewm@codeaurora.org>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo@padovan.org>
Instead of using modular division, the offset can be calculated using
only addition and subtraction. The previous calculation did not work
as intended and was more difficult to understand, involving unsigned
integer underflow and a check for a negative value where one was not
possible.
Signed-off-by: Mat Martineau <mathewm@codeaurora.org>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo@padovan.org>
User-space pass the remote device address type to kernel through
struct sockaddr_l2 what makes the advertising useless. This patch
removes all advertising cache code.
Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Acked-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
In order to establish a LE connection we need the address type
information. User-space already pass this information to kernel
through struct sockaddr_l2.
This patch adds the dst_type parameter to l2cap_chan_connect so we
are able to pass the address type info from user-space down to
hci_conn layer.
Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Acked-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This patch adds the dst_type parameter to hci_connect function.
Instead of searching the address type in advertising cache, we
use the dst_type parameter to establish LE connections.
The dst_type is ignored for BR/EDR connection establishment.
Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Acked-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This patch moves the helper function bdaddr_to_le to hci_core, so it
can be used in mgmt.c and hci_conn.c.
Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Acked-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This patch adds the address type info to struct sockaddr_l2 so
user-space can inform the remote device address type required
to establish LE connections.
Soon, instead of looking the advertising cache up to discover the
address type, we'll use this address type info to establish LE
connections.
Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Acked-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This patch moves address type macros to bluetooth.h since they will be
used by management interface and Bluetooth socket interface. It also
replaces the macro prefix MGMT_ADDR_ by BDADDR_.
Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Acked-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
No one is using strtoba() in the bluetooth subsystem.
Signed-off-by: Syam Sidhardhan <s.syam@samsung.com>
Signed-off-by: Gustavo Padovan <gustavo@padovan.org>
A sequence list is a data structure used to track frames that need to
be retransmitted, and frames that have been requested for
retransmission by the remote device. It can compactly represent a
list of sequence numbers within the ERTM transmit window. Memory for
the list is allocated once at connection time, and common operations
in ERTM are O(1).
Signed-off-by: Mat Martineau <mathewm@codeaurora.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Some parameters in L2CAP chan are set to default similar way in
socket based channels and A2MP channels. Adds common function which
sets all defaults.
Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com>
Signed-off-by: Gustavo Padovan <gustavo@padovan.org>
This patch removes the MGMT_ADDR_INVALID macro. If the address type
isn't LE, we consider it is BR/EDR type.
Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Following the separation if core and sock code this change avoid
manipulation of sk inside l2cap_chan_create().
Signed-off-by: Gustavo Padovan <gustavo@padovan.org>
Every field from ERTM control headers is now carried in the control
block so it only has to be parsed or generated once, and can be
efficiently accessed throughout the ERTM code.
Signed-off-by: Mat Martineau <mathewm@codeaurora.org>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo@padovan.org>
Adds some missing values for control field parsing, additional data
for the new state machine, and enumerations for states, incoming
packet classification, and state machine events.
Signed-off-by: Mat Martineau <mathewm@codeaurora.org>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo@padovan.org>
This patch adds the HCI_PERIODIC_INQ flag to dev_flags. This flag
tracks if periodic inquiry is enabled or not.
Signed-off-by: Andre Guedes <aguedespe@gmail.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo@padovan.org>
This patch adds a handler function to Periodic Inquiry command
complete event.
Signed-off-by: Andre Guedes <aguedespe@gmail.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo@padovan.org>
This patch adds to hci_core the hci_cancel_le_scan function which
should be used to cancel an ongoing LE scan.
Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
ediv is already in little endian order.
Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
The Device ID details need to be programmed into the kernel for every
controller at least once. So provide management command for this.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
The Device ID information can be provided via Extended Inquiry Data
as well. If a valid source is present, then include it.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
The Inquiry Response TX power tag should be added to the Extended
Inquiry Data (EIR) as well.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
We initialize the "struct device" in hci_alloc_dev() for a long time now
so we can access hdev->dev.parent directly. Hence, we can drop the
temporary field hdev->parent which is used in no other place than
hci_add_sysfs().
SET_HCIDEV_DEV() is never called after registering a device by the
drivers so we do not overwrite internal device-state. Furthermore,
hdev->dev is initialized to 0 by kzalloc() inside hci_alloc_dev() so the
default behavior with dev.parent = NULL is kept.
Signed-off-by: David Herrmann <dh.herrmann@googlemail.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Correct type warnings reported by sparse to show that this
functions takes ediv argument in __le16 format.
Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Keep lmp_subver in host byte order. We have following conversion
in hci_cc_read_local_version:
hdev->lmp_subver = __le16_to_cpu(rp->lmp_subver);
Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
This patch introduces a new mesh configuration parameter "ht_opmode" and will
allow user to check the current HT protection mode selected. Users could
configure the protection mode by the command "iw mesh_iface set mesh_param
mesh_ht_protection_mode=2". The default protection mode of mesh is set to
non-HT mixed mode.
Signed-off-by: Ashok Nagarajan <ashok@cozybit.com>
Reviewed-by: Thomas Pedersen <thomas@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This adds hooks to call into the driver to get additional
stats for the ethtool API.
Signed-off-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Allow master and backup servers to use many threads
for sync traffic. Add sysctl var "sync_ports" to define the
number of threads. Every thread will use single UDP port,
thread 0 will use the default port 8848 while last thread
will use port 8848+sync_ports-1.
The sync traffic for connections is scheduled to many
master threads based on the cp address but one connection is
always assigned to same thread to avoid reordering of the
sync messages.
Remove ip_vs_sync_switch_mode because this check
for sync mode change is still risky. Instead, check for mode
change under sync_buff_lock.
Make sure the backup socks do not block on reading.
Special thanks to Aleksey Chudov for helping in all tests.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Tested-by: Aleksey Chudov <aleksey.chudov@gmail.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Add two new sysctl vars to control the sync rate with the
main idea to reduce the rate for connection templates because
currently it depends on the packet rate for controlled connections.
This mechanism should be useful also for normal connections
with high traffic.
sync_refresh_period: in seconds, difference in reported connection
timer that triggers new sync message. It can be used to
avoid sync messages for the specified period (or half of
the connection timeout if it is lower) if connection state
is not changed from last sync.
sync_retries: integer, 0..3, defines sync retries with period of
sync_refresh_period/8. Useful to protect against loss of
sync messages.
Allow sysctl_sync_threshold to be used with
sysctl_sync_period=0, so that only single sync message is sent
if sync_refresh_period is also 0.
Add new field "sync_endtime" in connection structure to
hold the reported time when connection expires. The 2 lowest
bits will represent the retry count.
As the sysctl_sync_period now can be 0 use ACCESS_ONCE to
avoid division by zero.
Special thanks to Aleksey Chudov for being patient with me,
for his extensive reports and helping in all tests.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Tested-by: Aleksey Chudov <aleksey.chudov@gmail.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
High rate of sync messages in master can lead to
overflowing the socket buffer and dropping the messages.
Fixed sleep of 1 second without wakeup events is not suitable
for loaded masters,
Use delayed_work to schedule sending for queued messages
and limit the delay to IPVS_SYNC_SEND_DELAY (20ms). This will
reduce the rate of wakeups but to avoid sending long bursts we
wakeup the master thread after IPVS_SYNC_WAKEUP_RATE (8) messages.
Add hard limit for the queued messages before sending
by using "sync_qlen_max" sysctl var. It defaults to 1/32 of
the memory pages but actually represents number of messages.
It will protect us from allocating large parts of memory
when the sending rate is lower than the queuing rate.
As suggested by Pablo, add new sysctl var
"sync_sock_size" to configure the SNDBUF (master) or
RCVBUF (slave) socket limit. Default value is 0 (preserve
system defaults).
Change the master thread to detect and block on
SNDBUF overflow, so that we do not drop messages when
the socket limit is low but the sync_qlen_max limit is
not reached. On ENOBUFS or other errors just drop the
messages.
Change master thread to enter TASK_INTERRUPTIBLE
state early, so that we do not miss wakeups due to messages or
kthread_should_stop event.
Thanks to Pablo Neira Ayuso for his valuable feedback!
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
this_cpu_inc() is IRQ safe and faster than
local_bh_disable()/__this_cpu_inc()/local_bh_enable(), at least on x86.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Christoph Lameter <cl@linux.com>
Cc: Tejun Heo <tj@kernel.org>
Reviewed-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch allows you to disable automatic conntrack helper
lookup based on TCP/UDP ports, eg.
echo 0 > /proc/sys/net/netfilter/nf_conntrack_helper
[ Note: flows that already got a helper will keep using it even
if automatic helper assignment has been disabled ]
Once this behaviour has been disabled, you have to explicitly
use the iptables CT target to attach helper to flows.
There are good reasons to stop supporting automatic helper
assignment, for further information, please read:
http://www.netfilter.org/news.html#2012-04-03
This patch also adds one message to inform that automatic helper
assignment is deprecated and it will be removed soon (this is
spotted only once, with the first flow that gets a helper attached
to make it as less annoying as possible).
Signed-off-by: Eric Leblond <eric@regit.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Conflicts:
drivers/net/ethernet/intel/e1000e/param.c
drivers/net/wireless/iwlwifi/iwl-agn-rx.c
drivers/net/wireless/iwlwifi/iwl-trans-pcie-rx.c
drivers/net/wireless/iwlwifi/iwl-trans.h
Resolved the iwlwifi conflict with mainline using 3-way diff posted
by John Linville and Stephen Rothwell. In 'net' we added a bug
fix to make iwlwifi report a more accurate skb->truesize but this
conflicted with RX path changes that happened meanwhile in net-next.
In e1000e a conflict arose in the validation code for settings of
adapter->itr. 'net-next' had more sophisticated logic so that
logic was used.
Signed-off-by: David S. Miller <davem@davemloft.net>
It appears some networks play bad games with the two bits reserved for
ECN. This can trigger false congestion notifications and very slow
transferts.
Since RFC 3168 (6.1.1) forbids SYN packets to carry CT bits, we can
disable TCP ECN negociation if it happens we receive mangled CT bits in
the SYN packet.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Perry Lorier <perryl@google.com>
Cc: Matt Mathis <mattmathis@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Wilmer van der Gaast <wilmer@google.com>
Cc: Ankur Jain <jankur@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Dave Täht <dave.taht@bufferbloat.net>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Extend tcp coalescing implementing it from tcp_queue_rcv(), the main
receiver function when application is not blocked in recvmsg().
Function tcp_queue_rcv() is moved a bit to allow its call from
tcp_data_queue()
This gives good results especially if GRO could not kick, and if skb
head is a fragment.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Alexander Duyck <alexander.h.duyck@intel.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implementing the advanced early retransmit (sysctl_tcp_early_retrans==2).
Delays the fast retransmit by an interval of RTT/4. We borrow the
RTO timer to implement the delay. If we receive another ACK or send
a new packet, the timer is cancelled and restored to original RTO
value offset by time elapsed. When the delayed-ER timer fires,
we enter fast recovery and perform fast retransmit.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch implements RFC 5827 early retransmit (ER) for TCP.
It reduces DUPACK threshold (dupthresh) if outstanding packets are
less than 4 to recover losses by fast recovery instead of timeout.
While the algorithm is simple, small but frequent network reordering
makes this feature dangerous: the connection repeatedly enter
false recovery and degrade performance. Therefore we implement
a mitigation suggested in the appendix of the RFC that delays
entering fast recovery by a small interval, i.e., RTT/4. Currently
ER is conservative and is disabled for the rest of the connection
after the first reordering event. A large scale web server
experiment on the performance impact of ER is summarized in
section 6 of the paper "Proportional Rate Reduction for TCP”,
IMC 2011. http://conferences.sigcomm.org/imc/2011/docs/p155.pdf
Note that Linux has a similar feature called THIN_DUPACK. The
differences are THIN_DUPACK do not mitigate reorderings and is only
used after slow start. Currently ER is disabled if THIN_DUPACK is
enabled. I would be happy to merge THIN_DUPACK feature with ER if
people think it's a good idea.
ER is enabled by sysctl_tcp_early_retrans:
0: Disables ER
1: Reduce dupthresh to packets_out - 1 when outstanding packets < 4.
2: (Default) reduce dupthresh like mode 1. In addition, delay
entering fast recovery by RTT/4.
Note: mode 2 is implemented in the third part of this patch series.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change order of init so netns init is ready
when register ioctl and netlink.
Ver2
Whitespace fixes and __init added.
Reported-by: "Ryan O'Hara" <rohara@redhat.com>
Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
ip_vs_create_timeout_table() can return NULL
All functions protocol init_netns is affected of this patch.
Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
Now that the sematics of udpv6_queue_rcv_skb() match IPv4's
udp_queue_rcv_skb(), introduce the UDP encap_rcv() hook for IPv6.
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Quoting Tore Anderson from :
https://bugzilla.kernel.org/show_bug.cgi?id=42572
When RTAX_FEATURE_ALLFRAG is set on a route, the effective TCP segment
size does not take into account the size of the IPv6 Fragmentation
header that needs to be included in outbound packets, causing every
transmitted TCP segment to be fragmented across two IPv6 packets, the
latter of which will only contain 8 bytes of actual payload.
RTAX_FEATURE_ALLFRAG is typically set on a route in response to
receving a ICMPv6 Packet Too Big message indicating a Path MTU of less
than 1280 bytes. 1280 bytes is the minimum IPv6 MTU, however ICMPv6
PTBs with MTU < 1280 are still valid, in particular when an IPv6
packet is sent to an IPv4 destination through a stateless translator.
Any ICMPv4 Need To Fragment packets originated from the IPv4 part of
the path will be translated to ICMPv6 PTB which may then indicate an
MTU of less than 1280.
The Linux kernel refuses to reduce the effective MTU to anything below
1280 bytes, instead it sets it to exactly 1280 bytes, and
RTAX_FEATURE_ALLFRAG is also set. However, the TCP segment size appears
to be set to 1240 bytes (1280 Path MTU - 40 bytes of IPv6 header),
instead of 1232 (additionally taking into account the 8 bytes required
by the IPv6 Fragmentation extension header).
This in turn results in rather inefficient transmission, as every
transmitted TCP segment now is split in two fragments containing
1232+8 bytes of payload.
After this patch, all the outgoing packets that includes a
Fragmentation header all are "atomic" or "non-fragmented" fragments,
i.e., they both have Offset=0 and More Fragments=0.
With help from David S. Miller
Reported-by: Tore Anderson <tore@fud.no>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Tom Herbert <therbert@google.com>
Tested-by: Tore Anderson <tore@fud.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
bridge: set fake_rtable's dst to NULL to avoid kernel Oops
when bridge is deleted before tap/vif device's delete, kernel may
encounter an oops because of NULL reference to fake_rtable's dst.
Set fake_rtable's dst to NULL before sending packets out can solve
this problem.
v4 reformat, change br_drop_fake_rtable(skb) to {}
v3 enrich commit header
v2 introducing new flag DST_FAKE_RTABLE to dst_entry struct.
[ Use "do { } while (0)" for nop br_drop_fake_rtable()
implementation -DaveM ]
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Peter Huang <peter.huangpeng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix merge between commit 3adadc08cc ("net ax25: Reorder ax25_exit to
remove races") and commit 0ca7a4c87d ("net ax25: Simplify and
cleanup the ax25 sysctl handling")
The former moved around the sysctl register/unregister calls, the
later simply removed them.
With help from Stephen Rothwell.
Signed-off-by: David S. Miller <davem@davemloft.net>
sk_add_backlog() & sk_rcvqueues_full() hard coded sk_rcvbuf as the
memory limit. We need to make this limit a parameter for TCP use.
No functional change expected in this patch, all callers still using the
old sk_rcvbuf limit.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: Rick Jones <rick.jones2@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Randy Dunlap <rdunlap@xenotime.net> reported:
> On 04/23/2012 12:07 AM, Stephen Rothwell wrote:
>
>> Hi all,
>>
>> Changes since 20120420:
>
>
> include/net/ax25.h:447:75: error: expected ';' before '}' token
>
> static inline int ax25_register_dev_sysctl(ax25_dev *ax25_dev) { return 0 };
> static inline void ax25_unregister_dev_sysctl(ax25_dev *ax25_dev) {};
>
> First function: move ';' inside braces.
> Second function: drop the ';'.
Put the semicolons where it makes sense.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Randy Dunlap <rdunlap@xenotime.net> reported:
> On 04/23/2012 12:07 AM, Stephen Rothwell wrote:
>
>> Hi all,
>>
>> Changes since 20120420:
>
>
>
> ERROR: "unregister_net_sysctl_table" [net/phonet/phonet.ko] undefined!
> ERROR: "register_net_sysctl" [net/phonet/phonet.ko] undefined!
>
> when CONFIG_SYSCTL is not enabled.
Add static inline stub functions to gracefully handle the case when sysctl
support is not present.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
ieee80211_ave_rssi need to be declare as export for driver to use it.
Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This commit moves the (substantial) common code shared between
tcp_v4_init_sock() and tcp_v6_init_sock() to a new address-family
independent function, tcp_init_sock().
Centralizing this functionality should help avoid drift issues,
e.g. where the IPv4 side is updated without a corresponding update to
IPv6. There was already some drift: IPv4 initialized snd_cwnd to
TCP_INIT_CWND, while the IPv6 side was still initializing snd_cwnd to
2 (in this case it should not matter, since snd_cwnd is also
initialized in tcp_init_metrics(), but the general risks and
maintenance overhead remain).
When diffing the old and new code, note that new tcp_init_sock()
function uses the order of steps from the tcp_v4_init_sock()
implementation (the order is slightly different in
tcp_v6_init_sock()).
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This includes (according the the previous description):
* TCP_REPAIR sockoption
This one just puts the socket in/out of the repair mode.
Allowed for CAP_NET_ADMIN and for closed/establised sockets only.
When repair mode is turned off and the socket happens to be in
the established state the window probe is sent to the peer to
'unlock' the connection.
* TCP_REPAIR_QUEUE sockoption
This one sets the queue which we're about to repair. The
'no-queue' is set by default.
* TCP_QUEUE_SEQ socoption
Sets the write_seq/rcv_nxt of a selected repaired queue.
Allowed for TCP_CLOSE-d sockets only. When the socket changes
its state the other seq-s are changed by the kernel according
to the protocol rules (most of the existing code is actually
reused).
* Ability to forcibly bind a socket to a port
The sk->sk_reuse is set to SK_FORCE_REUSE.
* Immediate connect modification
The connect syscall initializes the connection, then directly jumps
to the code which finalizes it.
* Silent close modification
The close just aborts the connection (similar to SO_LINGER with 0
time) but without sending any FIN/RST-s to peer.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is just the preparation patch, which makes the needed for
TCP repair code ready for use.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Name them in a "backward compatible" manner, i.e. reuse or not
are still 1 and 0 respectively. The reuse value of 2 means that
the socket with it will forcibly reuse everyone else's port.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
All of the users have been converted to use registera_net_sysctl so we
no longer need register_net_sysctl.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We don't use struct ctl_path anymore so delete the exported constants.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There isn't much advantage here except that strings paths are a bit
easier to read, and converting everything to them allows me to kill off
ctl_path.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The sysctl core no longer natively understands sysctl tables
with .child entries.
Split the ipv6_table to remove the .child entries.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Don't register/unregister every ax25 table in a batch. Instead register
and unregister per device ax25 sysctls as ax25 devices come and go.
This moves ax25 to be a completely modern sysctl user. Registering the
sysctls in just the initial network namespace, removing the use of
.child entries that are no longer natively supported by the sysctl core
and taking advantage of the fact that there are no longer any ordering
constraints between registering and unregistering different sysctl
tables.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
sysctl no longer requires explicit creation of directories. The neigh
directory is always populated with at least a default entry so this
won't cause any user visible changes.
Delete the ipv4_path and the ipv4_skeleton these are no longer needed.
Directly register the ipv4_route_table.
And since I am an idiot remove the header definitions that I should
have removed in the previous patch.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
register_sysctl_rotable never caught on as an interesting way to
register sysctls. My take on the situation is that what we want are
sysctls that we can only see in the initial network namespace. What we
have implemented with register_sysctl_rotable are sysctls that we can
see in all of the network namespaces and can only change in the initial
network namespace.
That is a very silly way to go. Just register the network sysctls
in the initial network namespace and we don't have any weird special
cases to deal with.
The sysctls affected are:
/proc/sys/net/ipv4/ipfrag_secret_interval
/proc/sys/net/ipv4/ipfrag_max_dist
/proc/sys/net/ipv6/ip6frag_secret_interval
/proc/sys/net/ipv6/mld_max_msf
I really don't expect anyone will miss them if they can't read them in a
child user namespace.
CC: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the netfilter code is modified to use register_net_sysctl_table the
kernel fails to boot because the per net sysctl infrasturce is not setup
soon enough. So to avoid races call net_sysctl_init from sock_init().
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Right now all of the networking sysctl registrations are running in a
compatibiity mode. The natvie sysctl registration api takes a cstring
for a path and a simple ctl_table. Implement register_net_sysctl so
that we can register network sysctls without needing to use
compatiblity code in the sysctl core.
Switching from a ctl_path to a cstring results in less boiler plate
and denser code that is a little easier to read.
I would simply have changed the arguments to register_net_sysctl_table
instead of keeping two functions in parallel but gcc will allow a
ctl_path pointer to be passed to a char * pointer with only issuing a
warning resulting in completely incorrect code can be built. Since I
have to change the function name I am taking advantage of the situation
to let both register_net_sysctl and register_net_sysctl_table live for a
short time in parallel which makes clean conversion patches a bit easier
to read and write.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix kernel-doc warning in net/sock.h:
Warning(include/net/sock.h:377): No description found for parameter 'sk_peek_off'
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Functionally, this change is a NOP.
Semantically, rt6_clean_expires() wants to do rt->dst.from = NULL instead of
rt->dst.expires = 0. It is clearing the RTF_EXPIRES flag, so the union is going
to be treated as a pointer (dst.from) not a long (dst.expires).
Signed-off-by: Jiri Bohac <jbohac@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 1716a961 (ipv6: fix problem with expired dst cache) broke PMTU
discovery. rt6_update_expires() calls dst_set_expires(), which only updates
dst->expires if it has not been set previously (expires == 0) or if the new
expires is earlier than the current dst->expires.
rt6_update_expires() needs to zero rt->dst.expires, otherwise it will contain
ivalid data left over from rt->dst.from and will confuse dst_set_expires().
Signed-off-by: Jiri Bohac <jbohac@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add utility function to provide the average rssi per vif
Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Commit b82d1bb4 inadvertendly placed unrelated new code between
TCPCB_EVER_RETRANS and TCPCB_RETRANS and the other macros that refer
to the sacked field in the struct tcp_skb_cb (probably because there
was a misleading empty line there). This commit fixes up the
formatting so that all macros related to the sacked field are adjacent
again.
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
My grand plan to allow drivers to gradually move over
to advertising virtual interface combinations and only
enforce with drivers that do want it enforced doesn't
seem to be working out, only Christian ever added the
advertising (to carl9170), nobody else did.
Begin enforcing combinations in cfg80211 so that users
can rely on the information reported about a device.
Cc: "Luis R. Rodriguez" <mcgrof@qca.qualcomm.com>
Cc: Jouni Malinen <jouni@qca.qualcomm.com>
Cc: Vasanthakumar Thiagarajan <vthiagar@qca.qualcomm.com>
Cc: Senthil Balasubramanian <senthilb@qca.qualcomm.com>
Cc: Kalle Valo <kvalo@qca.qualcomm.com>
Cc: Jiri Slaby <jirislaby@gmail.com>
Cc: Nick Kossifidis <mickflemm@gmail.com>
Cc: Bob Copeland <me@bobcopeland.com>
Cc: Bing Zhao <bzhao@marvell.com>
Cc: Lennert Buytenhek <buytenh@wantstofly.org>
Cc: Ivo van Doorn <IvDoorn@gmail.com>
Cc: Gertjan van Wingerde <gwingerde@gmail.com>
Cc: Helmut Schaa <helmut.schaa@googlemail.com>
Cc: Luciano Coelho <coelho@ti.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
If a key is non persistent then it should not be used in future
connections but it should be kept for current connection. And it
should be removed when connecion is removed.
Signed-off-by: Vishal Agarwal <vishal.agarwal@stericsson.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
This patch changes the return type of function hci_persistent_key
from int to bool because it makes more sense to return information
whether a key is persistent or not as a bool.
Signed-off-by: Vishal Agarwal <vishal.agarwal@stericsson.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Conflicts:
drivers/net/ethernet/atheros/atlx/atl1.c
drivers/net/ethernet/atheros/atlx/atl1.h
Resolved a conflict between a DMA error bug fix and NAPI
support changes in the atl1 driver.
Signed-off-by: David S. Miller <davem@davemloft.net>
Use of "unsigned int" is preferred to bare "unsigned" in net tree.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We must try harder to get unique (addr, port) pairs when
doing port autoselection for sockets with SO_REUSEADDR
option set.
We achieve this by adding a relaxation parameter to
inet_csk_bind_conflict. When 'relax' parameter is off
we return a conflict whenever the current searched
pair (addr, port) is not unique.
This tries to address the problems reported in patch:
8d238b25b1
Revert "tcp: bind() fix when many ports are bound"
Tests where ran for creating and binding(0) many sockets
on 100 IPs. The results are, on average:
* 60000 sockets, 600 ports / IP:
* 0.210 s, 620 (IP, port) duplicates without patch
* 0.219 s, no duplicates with patch
* 100000 sockets, 1000 ports / IP:
* 0.371 s, 1720 duplicates without patch
* 0.373 s, no duplicates with patch
* 200000 sockets, 2000 ports / IP:
* 0.766 s, 6900 duplicates without patch
* 0.768 s, no duplicates with patch
* 500000 sockets, 5000 ports / IP:
* 2.227 s, 41500 duplicates without patch
* 2.284 s, no duplicates with patch
Signed-off-by: Alex Copot <alex.mihai.c@gmail.com>
Signed-off-by: Daniel Baluta <dbaluta@ixiacom.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Updates some comments to track RFC6298
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: H.K. Jerry Chu <hkchu@google.com>
Cc: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert the per-cpu statistics kept for GRE, IPIP, and SIT tunnels
to use 64 bit statistics.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Removes hw.conf.channel usage from the following functions:
* ieee80211_mandatory_rates
* ieee80211_sta_get_rates
* ieee80211_frame_duration
* ieee80211_rts_duration
* ieee80211_ctstoself_duration
This is in preparation for multi-channel operation.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
If the current channel is known, add frequency and channel type to
NL80211_CMD_GET_INTERFACE.
Signed-off-by: Pontus Fuchs <pontus.fuchs@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
net/core/rtnetlink.c: In function ‘rtnl_create_link’:
net/core/rtnetlink.c:1645:3: warning: passing argument 2 of ‘ops->get_tx_queues’ from incompatible pointer type [enabled by default]
net/core/rtnetlink.c:1645:3: note: expected ‘const struct nlattr **’ but argument is of type ‘struct nlattr **’
Signed-off-by: David S. Miller <davem@davemloft.net>
neigh_table_init_no_netlink() is only used in net/core/neighbour.c file.
Signed-off-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Most machines dont use UDP encapsulation (L2TP)
Adds a static_key so that udp_queue_rcv_skb() doesnt have to perform a
test if L2TP never setup the encap_rcv on a socket.
Idea of this patch came after Simon Horman proposal to add a hook on TCP
as well.
If static_key is not yet enabled, the fast path does a single JMP .
When static_key is enabled, JMP destination is patched to reach the real
encap_type/encap_rcv logic, possibly adding cache misses.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Simon Horman <horms@verge.net.au>
Cc: dev@openvswitch.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix spelling and references in rtnetlink.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change get_tx_queues, drop unsused arg/return value real_tx_queues,
and use return by value (with error) rather than call by reference.
Probably bonding should just change to LLTX and the whole get_tx_queues
API could disappear!
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the ipv6 dst cache which copy from the dst generated by ICMPV6 RA packet.
this dst cache will not check expire because it has no RTF_EXPIRES flag.
So this dst cache will always be used until the dst gc run.
Change the struct dst_entry,add a union contains new pointer from and expires.
When rt6_info.rt6i_flags has no RTF_EXPIRES flag,the dst.expires has no use.
we can use this field to point to where the dst cache copy from.
The dst.from is only used in IPV6.
rt6_check_expired check if rt6_info.dst.from is expired.
ip6_rt_copy only set dst.from when the ort has flag RTF_ADDRCONF
and RTF_DEFAULT.then hold the ort.
ip6_dst_destroy release the ort.
Add some functions to operate the RTF_EXPIRES flag and expires(from) together.
and change the code to use these new adding functions.
Changes from v5:
modify ip6_route_add and ndisc_router_discovery to use new adding functions.
Only set dst.from when the ort has flag RTF_ADDRCONF
and RTF_DEFAULT.then hold the ort.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement aggregation algorithm, combining more data into a single
HSI transfer. 4 different traffic categories are supported:
1. TC_PRIO_CONTROL .. TC_PRIO_MAX (CTL)
2. TC_PRIO_INTERACTIVE (VO)
3. TC_PRIO_INTERACTIVE_BULK (VI)
4. TC_PRIO_BESTEFFORT, TC_PRIO_BULK, TC_PRIO_FILLER (BEBK)
Signed-off-by: Dmitry Tarnyagin <dmitry.tarnyagin@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Set traffic class for CAIF packets, based on socket
priority, CAIF protocol type, or type of message.
Traffic class mapping for different packet types:
- control: TC_PRIO_CONTROL;
- flow control: TC_PRIO_CONTROL;
- at: TC_PRIO_CONTROL;
- rfm: TC_PRIO_INTERACTIVE_BULK;
- other sockets: equals to socket's TC;
- network data: no change.
Signed-off-by: Dmitry Tarnyagin <dmitry.tarnyagin@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As specified in RFC6106, DNSSL option contains one or more domain names
of DNS suffixes. 8-bit identifier of the DNSSL option type as assigned
by the IANA is 31. This option should also be treated as userland.
Signed-off-by: Alexey I. Froloff <raorn@raorn.name>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some HW/drivers get notifications when a tag moves out of the radio field.
This notification is now forwarded to user space through netlink.
Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The target index can be used by userspace to uniquely identify a target
and thus should be kept unique, per NFC adapter. Moreover, some protocols
do not provide a logical index when discovering new targets, so we have to
generate one for them.
For NCI or pn533 to fetch their logical index, we added a logical_idx field
to the target structure.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Most NFC HCI chipsets actually use a simplified HDLC link layer to
carry HCI payloads.
This implementation registers itself as an HCI device on behalf of the
NFC driver.
Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This is an implementation of ETSI TS 102 622 specification.
Many NFC chipsets use HCI as the host <-> target protocol on top of a
serial link like i2c.
Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
NFC drivers will call this routine when they detect that a tag leaves the
RF field. This will eventually lead to the corresponding netlink event
to be sent.
Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The firmware may decide to switch channels while already beaconing, e.g.
in response to a cfg80211 connect request on a different vif. Add this
event to notify userspace when an AP or GO interface has successfully
migrated to a new channel, so it can update its configuration
accordingly.
Signed-off-by: Thomas Pedersen <c_tpeder@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
In WoWLAN, we only get the triggers when we actually get
to suspend. As a consequence, drivers currently don't
know that the device should enable wakeup. However, the
device_set_wakeup_enable() API is intended to be called
when the wakeup is enabled, not later when needed.
Add a new set_wakeup() call to cfg80211 and mac80211 to
allow drivers to properly call device_set_wakeup_enable.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
mac80211 currently only supports one hardware queue
per AC. This is already problematic for off-channel
uses since if we go off channel while the BE queue
is full and then try to send an off-channel frame
the frame will never go out. This will become worse
when we support multi-channel since then a queue on
one channel might be full, but we have to stop the
software queue for all channels. That is obviously
not desirable.
To address this problem allow drivers to register
more hardware queues, and allow them to map them to
virtual interfaces. When they stop a hardware queue
the corresponding AC software queues on the correct
interfaces will be stopped as well. Additionally,
there's an off-channel queue to solve that problem
and a per-interface after-DTIM beacon queue. This
allows drivers to manage software queues closer to
how the hardware works.
Currently, there's a limit of 16 hardware queues.
This may or may not be sufficient, we can adjust it
as needed.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The queue mapping redesign that I'm planning to do
will break pure injection unless we handle monitor
interfaces explicitly. One possible option would
be to have the driver tell mac80211 about monitor
mode queues etc., but that would duplicate the API
since we already need to have queue assignments
handled per virtual interface.
So in order to solve this, have a virtual monitor
interface that is added whenever all active vifs
are monitors. We could also use the state of one
of the monitor interfaces, but managing that would
be complicated, so allocate separate state.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Basic rates are added with supported rates IE and extended supported
rates IE.
Signed-off-by: Ashok Nagarajan <ashok@cozybit.com>
Signed-off-by: Thomas Pedersen <thomas@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Report Toffset to userspace.
Let userspace select the mesh synchronization method.
Signed-off-by: Marco Porsch <marco.porsch@s2005.tu-chemnitz.de>
Signed-off-by: Pavel Zubarev <pavel.zubarev@gmail.com>
Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
With the plan to change mac80211's queue API to
not map ACs to queues 1:1, it seems necessary to
clarify some APIs that act on ACs rather than on
queues to spell that out explicitly. Do this.
Also verify that the AC number given is valid.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This field is never set to anything non-zero in
mac80211, so we should be able to remove it.
Unfortunately though, the iwlwifi and iwlegacy
drivers use it for their internal TX status
processing (which shouldn't be using the rate
control API to start with), so add a new field
"status.antenna" for them, at least for now.
In the future, I plan to use the new field to
hold the hardware queue, while the SKB's queue
mapping holds the AC.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Devices that have internal rate control need to be
notified when the bandwidth or SMPS state changes
just like external rate control algorithms get a
notification now.
Add this notification and clarify the change bits
while at it, the HT_CHANGED bit really meant only
bandwidth changed.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The channel type argument to the rate_update()
callback isn't really the correct way to give
the rate control algorithm about the desired
RX bandwidth of the peer.
Remove this argument, and instead update the
STA capabilities with 20/40 appropriately. The
SMPS update done by this callback works in the
same way, so this makes the callback cleaner.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The only reason cgroup was used, was to be consistent with the populate()
interface. Now that we're getting rid of it, not only we no longer need
it, but we also *can't* call it this way.
Since we will no longer rely on populate(), this will be called from
create(). During create, the association between struct mem_cgroup
and struct cgroup does not yet exist, since cgroup internals hasn't
yet initialized its bookkeeping. This means we would not be able
to draw the memcg pointer from the cgroup pointer in these
functions, which is highly undesirable.
Signed-off-by: Glauber Costa <glommer@parallels.com>
Acked-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
CC: Li Zefan <lizefan@huawei.com>
CC: Johannes Weiner <hannes@cmpxchg.org>
CC: Michal Hocko <mhocko@suse.cz>
Since some of the HT code pre-dates 802.11n-2009
some names are wrong. The one that bothers me most
is that "HT operation" is called "HT information"
in our code and that causes confusion.
Rename "HT information" to "HT operation" and also
the control_chan field to primary_chan to match
the name used in the spec.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-by: Bing Zhao <bzhao@marvell.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When the control-rate tables are not set up correctly, it makes
little sense to spam the logs, thus change the WARN_ON to WARN_ON_ONCE.
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
To ensure that old user space versions do not accidentally pick up and
try to use the management channel, use a different channel number.
Reported-by: Keith Packard <keithp@keithp.com>
Acked-by: Johan Hedberg <johan.hedberg@intel.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo@padovan.org>
Although not specified in 8021Qaz spec, it could be useful to enable drivers
whose HW supports setting a rate limit for an ETS TC. This patch adds this
optional attribute to DCB netlink. To use it, drivers should implement and
register the callbacks ieee_setmaxrate and ieee_getmaxrate. The units are 64
bits long and specified in Kbps to enable usage over both slow and very fast
networks.
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The patch removes unused icmp_ioctl() method definition in
include/net/icmp.h.
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
They were error prone due to an embedded goto, and the entire tree has
been converted away from using them.
Signed-off-by: David S. Miller <davem@davemloft.net>
These macros contain a hidden goto, and are thus extremely error
prone and make code hard to audit.
Signed-off-by: David S. Miller <davem@davemloft.net>
If passed 0 as data_length the (parsed < data_length - 1) test will be
true and cause a buffer overflow. In practice we need at least two bytes
for the element length and type so add a test for it to the very
beginning of the function.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo@padovan.org>
The parsed variable is already incremented inside the for-loop so there
no need to increment it again (not to mention that the code was
incrementing it the wrong amount).
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo@padovan.org>
This patch uses the correct flags for checking the HCI_SSP_ENABLED bit.
Without this authentication request was not being initiated.
Signed-off-by: Hemant Gupta <hemant.gupta@stericsson.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Pull networking fixes from David Miller:
1) Name string overrun fix in gianfar driver from Joe Perches.
2) VHOST bug fixes from Michael S. Tsirkin and Nadav Har'El
3) Fix dependencies on xt_LOG netfilter module, from Pablo Neira Ayuso.
4) Fix RCU locking in xt_CT, also from Pablo Neira Ayuso.
5) Add a parameter to skb_add_rx_frag() so we can fix the truesize
adjustments in the drivers that use it. The individual drivers
aren't fixed by this commit, but will be dealt with using follow-on
commits. From Eric Dumazet.
6) Add some device IDs to qmi_wwan driver, from Andrew Bird.
7) Fix a potential rcu_read_lock() imbalancein rt6_fill_node(). From
Eric Dumazet.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
net: fix a potential rcu_read_lock() imbalance in rt6_fill_node()
net: add a truesize parameter to skb_add_rx_frag()
gianfar: Fix possible overrun and simplify interrupt name field creation
USB: qmi_wwan: Add ZTE (Vodafone) K3570-Z and K3571-Z net interfaces
USB: option: Ignore ZTE (Vodafone) K3570/71 net interfaces
USB: qmi_wwan: Add ZTE (Vodafone) K3565-Z and K4505-Z net interfaces
qlcnic: Bug fix for LRO
netfilter: nf_conntrack: permanently attach timeout policy to conntrack
netfilter: xt_CT: fix assignation of the generic protocol tracker
netfilter: xt_CT: missing rcu_read_lock section in timeout assignment
netfilter: cttimeout: fix dependency with l4protocol conntrack module
netfilter: xt_LOG: use CONFIG_IP6_NF_IPTABLES instead of CONFIG_IPV6
vhost: fix release path lockdep checks
vhost: don't forget to schedule()
tools/virtio: stub out strong barriers
tools/virtio: add linux/hrtimer.h stub
tools/virtio: add linux/module.h stub
The station_info struct had demanded dBm signal values, but the
cfg80211 wireless extensions implementation was also accepting
"unspecified" (i.e. RSSI) unit values while the nl80211 code was
completely unaware of them. Resolve this by formally allowing the
"unspecified" units while making nl80211 ignore them.
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
At least on ia64 the (bogus) use of xchg() here results in the compiler
warning about an unused expression result. As only an assignment is
intended here, convert it to such.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
"[RFC PATCH 0/2] audit of linux/device.h users in include/*"
https://lkml.org/lkml/2012/3/4/159
--
Nearly every subsystem has some kind of header with a proto like:
void foo(struct device *dev);
and yet there is no reason for most of these guys to care about the
sub fields within the device struct. This allows us to significantly
reduce the scope of headers including headers. For this instance, a
reduction of about 40% is achieved by replacing the include with the
simple fact that the device is some kind of a struct.
Unlike the much larger module.h cleanup, this one is simply two
commits. One to fix the implicit <linux/device.h> users, and then
one to delete the device.h includes from the linux/include/ dir
wherever possible.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJPbNxLAAoJEOvOhAQsB9HWR6QQAMRUZ94O2069/nW9h4TO/xTr
Hq/80lo/TBBiRmob3iWBP76lzgeeMPPVEX1I6N7YYlhL3IL7HsaJH1DvpIPPHXQP
GFKcBsZ5ZLV8c4CBDSr+/HFNdhXc0bw0awBjBvR7gAsWuZpNFn4WbhizJi4vWAoE
4ydhPu55G1G8TkBtYLJQ8xavxsmiNBSDhd2i+0vn6EVpgmXynjOMG8qXyaS97Jvg
pZLwnN5Wu21coj6+xH3QUKCl1mJ+KGyamWX5gFBVIfsDB3k5H4neijVm7t1en4b0
cWxmXeR/JE3VLEl/17yN2dodD8qw1QzmTWzz1vmwJl2zK+rRRAByBrL0DP7QCwCZ
ppeJbdhkMBwqjtknwrmMwsuAzUdJd79GXA+6Vm+xSEkr6FEPK1M0kGbvaqV9Usgd
ohMewewbO6ddgR9eF7Kw2FAwo0hwkPNEplXIym9rZzFG1h+T0STGSHvkn7LV765E
ul1FapSV3GCxEVRwWTwD28FLU2+0zlkOZ5sxXwNPTT96cNmW+R7TGuslZKNaMNjX
q7eBZxo8DtVt/jqJTntR8bs8052c8g1Ac1IKmlW8VSmFwT1M6VBGRn1/JWAhuUgv
dBK/FF+I1GJTAJWIhaFcKXLHvmV9uhS6JaIhLMDOetoOkpqSptJ42hDG+89WkFRk
o55GQ5TFdoOpqxVzGbvE
=3j4+
-----END PGP SIGNATURE-----
Merge tag 'device-for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux
Pull <linux/device.h> avoidance patches from Paul Gortmaker:
"Nearly every subsystem has some kind of header with a proto like:
void foo(struct device *dev);
and yet there is no reason for most of these guys to care about the
sub fields within the device struct. This allows us to significantly
reduce the scope of headers including headers. For this instance, a
reduction of about 40% is achieved by replacing the include with the
simple fact that the device is some kind of a struct.
Unlike the much larger module.h cleanup, this one is simply two
commits. One to fix the implicit <linux/device.h> users, and then one
to delete the device.h includes from the linux/include/ dir wherever
possible."
* tag 'device-for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux:
device.h: audit and cleanup users in main include dir
device.h: cleanup users outside of linux/include (C files)
"[RFC - PATCH 0/7] consolidation of BUG support code."
https://lkml.org/lkml/2012/1/26/525
--
The changes shown here are to unify linux's BUG support under
the one <linux/bug.h> file. Due to historical reasons, we have
some BUG code in bug.h and some in kernel.h -- i.e. the support for
BUILD_BUG in linux/kernel.h predates the addition of linux/bug.h,
but old code in kernel.h wasn't moved to bug.h at that time. As
a band-aid, kernel.h was including <asm/bug.h> to pseudo link them.
This has caused confusion[1] and general yuck/WTF[2] reactions.
Here is an example that violates the principle of least surprise:
CC lib/string.o
lib/string.c: In function 'strlcat':
lib/string.c:225:2: error: implicit declaration of function 'BUILD_BUG_ON'
make[2]: *** [lib/string.o] Error 1
$
$ grep linux/bug.h lib/string.c
#include <linux/bug.h>
$
We've included <linux/bug.h> for the BUG infrastructure and yet we
still get a compile fail! [We've not kernel.h for BUILD_BUG_ON.]
Ugh - very confusing for someone who is new to kernel development.
With the above in mind, the goals of this changeset are:
1) find and fix any include/*.h files that were relying on the
implicit presence of BUG code.
2) find and fix any C files that were consuming kernel.h and
hence relying on implicitly getting some/all BUG code.
3) Move the BUG related code living in kernel.h to <linux/bug.h>
4) remove the asm/bug.h from kernel.h to finally break the chain.
During development, the order was more like 3-4, build-test, 1-2.
But to ensure that git history for bisect doesn't get needless
build failures introduced, the commits have been reorderd to fix
the problem areas in advance.
[1] https://lkml.org/lkml/2012/1/3/90
[2] https://lkml.org/lkml/2012/1/17/414
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJPbNwpAAoJEOvOhAQsB9HWrqYP/A0t9VB0nK6e42F0OR2P14MZ
GJFtf1B++wwioIrx+KSWSRfSur1C5FKhDbxLR3I/pvkAYl4+T4JvRdMG6xJwxyip
CC1kVQQNDjWVVqzjz2x6rYkOffx6dUlw/ERyIyk+OzP+1HzRIsIrugMqbzGLlX0X
y0v2Tbd0G6xg1DV8lcRdp95eIzcGuUvdb2iY2LGadWZczEOeSXx64Jz3QCFxg3aL
LFU4oovsg8Nb7MRJmqDvHK/oQf5vaTm9WSrS0pvVte0msSQRn8LStYdWC0G9BPCS
GwL86h/eLXlUXQlC5GpgWg1QQt5i2QpjBFcVBIG0IT5SgEPMx+gXyiqZva2KwbHu
LKicjKtfnzPitQnyEV/N6JyV1fb1U6/MsB7ebU5nCCzt9Gr7MYbjZ44peNeprAtu
HMvJ/BNnRr4Ha6nPQNu952AdASPKkxmeXFUwBL1zUbLkOX/bK/vy1ujlcdkFxCD7
fP3t7hghYa737IHk0ehUOhrE4H67hvxTSCKioLUAy/YeN1IcfH/iOQiCBQVLWmoS
AqYV6ou9cqgdYoyila2UeAqegb+8xyubPIHt+lebcaKxs5aGsTg+r3vq5juMDAPs
iwSVYUDcIw9dHer1lJfo7QCy3QUTRDTxh+LB9VlHXQICgeCK02sLBOi9hbEr4/H8
Ko9g8J3BMxcMkXLHT9ud
=PYQT
-----END PGP SIGNATURE-----
Merge tag 'bug-for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux
Pull <linux/bug.h> cleanup from Paul Gortmaker:
"The changes shown here are to unify linux's BUG support under the one
<linux/bug.h> file. Due to historical reasons, we have some BUG code
in bug.h and some in kernel.h -- i.e. the support for BUILD_BUG in
linux/kernel.h predates the addition of linux/bug.h, but old code in
kernel.h wasn't moved to bug.h at that time. As a band-aid, kernel.h
was including <asm/bug.h> to pseudo link them.
This has caused confusion[1] and general yuck/WTF[2] reactions. Here
is an example that violates the principle of least surprise:
CC lib/string.o
lib/string.c: In function 'strlcat':
lib/string.c:225:2: error: implicit declaration of function 'BUILD_BUG_ON'
make[2]: *** [lib/string.o] Error 1
$
$ grep linux/bug.h lib/string.c
#include <linux/bug.h>
$
We've included <linux/bug.h> for the BUG infrastructure and yet we
still get a compile fail! [We've not kernel.h for BUILD_BUG_ON.] Ugh -
very confusing for someone who is new to kernel development.
With the above in mind, the goals of this changeset are:
1) find and fix any include/*.h files that were relying on the
implicit presence of BUG code.
2) find and fix any C files that were consuming kernel.h and hence
relying on implicitly getting some/all BUG code.
3) Move the BUG related code living in kernel.h to <linux/bug.h>
4) remove the asm/bug.h from kernel.h to finally break the chain.
During development, the order was more like 3-4, build-test, 1-2. But
to ensure that git history for bisect doesn't get needless build
failures introduced, the commits have been reorderd to fix the problem
areas in advance.
[1] https://lkml.org/lkml/2012/1/3/90
[2] https://lkml.org/lkml/2012/1/17/414"
Fix up conflicts (new radeon file, reiserfs header cleanups) as per Paul
and linux-next.
* tag 'bug-for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux:
kernel.h: doesn't explicitly use bug.h, so don't include it.
bug: consolidate BUILD_BUG_ON with other bug code
BUG: headers with BUG/BUG_ON etc. need linux/bug.h
bug.h: add include of it to various implicit C users
lib: fix implicit users of kernel.h for TAINT_WARN
spinlock: macroize assert_spin_locked to avoid bug.h dependency
x86: relocate get/set debugreg fcns to include/asm/debugreg.
In some cases the poll() implementation in a driver has to do different
things depending on the events the caller wants to poll for. An example
is when a driver needs to start a DMA engine if the caller polls for
POLLIN, but doesn't want to do that if POLLIN is not requested but instead
only POLLOUT or POLLPRI is requested. This is something that can happen
in the video4linux subsystem among others.
Unfortunately, the current epoll/poll/select implementation doesn't
provide that information reliably. The poll_table_struct does have it: it
has a key field with the event mask. But once a poll() call matches one
or more bits of that mask any following poll() calls are passed a NULL
poll_table pointer.
Also, the eventpoll implementation always left the key field at ~0 instead
of using the requested events mask.
This was changed in eventpoll.c so the key field now contains the actual
events that should be polled for as set by the caller.
The solution to the NULL poll_table pointer is to set the qproc field to
NULL in poll_table once poll() matches the events, not the poll_table
pointer itself. That way drivers can obtain the mask through a new
poll_requested_events inline.
The poll_table_struct can still be NULL since some kernel code calls it
internally (netfs_state_poll() in ./drivers/staging/pohmelfs/netfs.h). In
that case poll_requested_events() returns ~0 (i.e. all events).
Very rarely drivers might want to know whether poll_wait will actually
wait. If another earlier file descriptor in the set already matched the
events the caller wanted to wait for, then the kernel will return from the
select() call without waiting. This might be useful information in order
to avoid doing expensive work.
A new helper function poll_does_not_wait() is added that drivers can use
to detect this situation. This is now used in sock_poll_wait() in
include/net/sock.h. This was the only place in the kernel that needed
this information.
Drivers should no longer access any of the poll_table internals, but use
the poll_requested_events() and poll_does_not_wait() access functions
instead. In order to enforce that the poll_table fields are now prepended
with an underscore and a comment was added warning against using them
directly.
This required a change in unix_dgram_poll() in unix/af_unix.c which used
the key field to get the requested events. It's been replaced by a call
to poll_requested_events().
For qproc it was especially important to change its name since the
behavior of that field changes with this patch since this function pointer
can now be NULL when that wasn't possible in the past.
Any driver accessing the qproc or key fields directly will now fail to compile.
Some notes regarding the correctness of this patch: the driver's poll()
function is called with a 'struct poll_table_struct *wait' argument. This
pointer may or may not be NULL, drivers can never rely on it being one or
the other as that depends on whether or not an earlier file descriptor in
the select()'s fdset matched the requested events.
There are only three things a driver can do with the wait argument:
1) obtain the key field:
events = wait ? wait->key : ~0;
This will still work although it should be replaced with the new
poll_requested_events() function (which does exactly the same).
This will now even work better, since wait is no longer set to NULL
unnecessarily.
2) use the qproc callback. This could be deadly since qproc can now be
NULL. Renaming qproc should prevent this from happening. There are no
kernel drivers that actually access this callback directly, BTW.
3) test whether wait == NULL to determine whether poll would return without
waiting. This is no longer sufficient as the correct test is now
wait == NULL || wait->_qproc == NULL.
However, the worst that can happen here is a slight performance hit in
the case where wait != NULL and wait->_qproc == NULL. In that case the
driver will assume that poll_wait() will actually add the fd to the set
of waiting file descriptors. Of course, poll_wait() will not do that
since it tests for wait->_qproc. This will not break anything, though.
There is only one place in the whole kernel where this happens
(sock_poll_wait() in include/net/sock.h) and that code will be replaced
by a call to poll_does_not_wait() in the next patch.
Note that even if wait->_qproc != NULL drivers cannot rely on poll_wait()
actually waiting. The next file descriptor from the set might match the
event mask and thus any possible waits will never happen.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Reviewed-by: Jonathan Corbet <corbet@lwn.net>
Reviewed-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Davide Libenzi <davidel@xmailserver.org>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch introduces nf_conntrack_l4proto_find_get() and
nf_conntrack_l4proto_put() to fix module dependencies between
timeout objects and l4-protocol conntrack modules.
Thus, we make sure that the module cannot be removed if it is
used by any of the cttimeout objects.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pull vfs pile 1 from Al Viro:
"This is _not_ all; in particular, Miklos' and Jan's stuff is not there
yet."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (64 commits)
ext4: initialization of ext4_li_mtx needs to be done earlier
debugfs-related mode_t whack-a-mole
hfsplus: add an ioctl to bless files
hfsplus: change finder_info to u32
hfsplus: initialise userflags
qnx4: new helper - try_extent()
qnx4: get rid of qnx4_bread/qnx4_getblk
take removal of PF_FORKNOEXEC to flush_old_exec()
trim includes in inode.c
um: uml_dup_mmap() relies on ->mmap_sem being held, but activate_mm() doesn't hold it
um: embed ->stub_pages[] into mmu_context
gadgetfs: list_for_each_safe() misuse
ocfs2: fix leaks on failure exits in module_init
ecryptfs: make register_filesystem() the last potential failure exit
ntfs: forgets to unregister sysctls on register_filesystem() failure
logfs: missing cleanup on register_filesystem() failure
jfs: mising cleanup on register_filesystem() failure
make configfs_pin_fs() return root dentry on success
configfs: configfs_create_dir() has parent dentry in dentry->d_parent
configfs: sanitize configfs_create()
...
Pull security subsystem updates for 3.4 from James Morris:
"The main addition here is the new Yama security module from Kees Cook,
which was discussed at the Linux Security Summit last year. Its
purpose is to collect miscellaneous DAC security enhancements in one
place. This also marks a departure in policy for LSM modules, which
were previously limited to being standalone access control systems.
Chromium OS is using Yama, and I believe there are plans for Ubuntu,
at least.
This patchset also includes maintenance updates for AppArmor, TOMOYO
and others."
Fix trivial conflict in <net/sock.h> due to the jumo_label->static_key
rename.
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (38 commits)
AppArmor: Fix location of const qualifier on generated string tables
TOMOYO: Return error if fails to delete a domain
AppArmor: add const qualifiers to string arrays
AppArmor: Add ability to load extended policy
TOMOYO: Return appropriate value to poll().
AppArmor: Move path failure information into aa_get_name and rename
AppArmor: Update dfa matching routines.
AppArmor: Minor cleanup of d_namespace_path to consolidate error handling
AppArmor: Retrieve the dentry_path for error reporting when path lookup fails
AppArmor: Add const qualifiers to generated string tables
AppArmor: Fix oops in policy unpack auditing
AppArmor: Fix error returned when a path lookup is disconnected
KEYS: testing wrong bit for KEY_FLAG_REVOKED
TOMOYO: Fix mount flags checking order.
security: fix ima kconfig warning
AppArmor: Fix the error case for chroot relative path name lookup
AppArmor: fix mapping of META_READ to audit and quiet flags
AppArmor: Fix underflow in xindex calculation
AppArmor: Fix dropping of allowed operations that are force audited
AppArmor: Add mising end of structure test to caps unpacking
...
Pull networking merge from David Miller:
"1) Move ixgbe driver over to purely page based buffering on receive.
From Alexander Duyck.
2) Add receive packet steering support to e1000e, from Bruce Allan.
3) Convert TCP MD5 support over to RCU, from Eric Dumazet.
4) Reduce cpu usage in handling out-of-order TCP packets on modern
systems, also from Eric Dumazet.
5) Support the IP{,V6}_UNICAST_IF socket options, making the wine
folks happy, from Erich Hoover.
6) Support VLAN trunking from guests in hyperv driver, from Haiyang
Zhang.
7) Support byte-queue-limtis in r8169, from Igor Maravic.
8) Outline code intended for IP_RECVTOS in IP_PKTOPTIONS existed but
was never properly implemented, Jiri Benc fixed that.
9) 64-bit statistics support in r8169 and 8139too, from Junchang Wang.
10) Support kernel side dump filtering by ctmark in netfilter
ctnetlink, from Pablo Neira Ayuso.
11) Support byte-queue-limits in gianfar driver, from Paul Gortmaker.
12) Add new peek socket options to assist with socket migration, from
Pavel Emelyanov.
13) Add sch_plug packet scheduler whose queue is controlled by
userland daemons using explicit freeze and release commands. From
Shriram Rajagopalan.
14) Fix FCOE checksum offload handling on transmit, from Yi Zou."
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1846 commits)
Fix pppol2tp getsockname()
Remove printk from rds_sendmsg
ipv6: fix incorrent ipv6 ipsec packet fragment
cpsw: Hook up default ndo_change_mtu.
net: qmi_wwan: fix build error due to cdc-wdm dependecy
netdev: driver: ethernet: Add TI CPSW driver
netdev: driver: ethernet: add cpsw address lookup engine support
phy: add am79c874 PHY support
mlx4_core: fix race on comm channel
bonding: send igmp report for its master
fs_enet: Add MPC5125 FEC support and PHY interface selection
net: bpf_jit: fix BPF_S_LDX_B_MSH compilation
net: update the usage of CHECKSUM_UNNECESSARY
fcoe: use CHECKSUM_UNNECESSARY instead of CHECKSUM_PARTIAL on tx
net: do not do gso for CHECKSUM_UNNECESSARY in netif_needs_gso
ixgbe: Fix issues with SR-IOV loopback when flow control is disabled
net/hyperv: Fix the code handling tx busy
ixgbe: fix namespace issues when FCoE/DCB is not enabled
rtlwifi: Remove unused ETH_ADDR_LEN defines
igbvf: Use ETH_ALEN
...
Fix up fairly trivial conflicts in drivers/isdn/gigaset/interface.c and
drivers/net/usb/{Kconfig,qmi_wwan.c} as per David.
Pull cgroup changes from Tejun Heo:
"Out of the 8 commits, one fixes a long-standing locking issue around
tasklist walking and others are cleanups."
* 'for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: Walk task list under tasklist_lock in cgroup_enable_task_cg_list
cgroup: Remove wrong comment on cgroup_enable_task_cg_list()
cgroup: remove cgroup_subsys argument from callbacks
cgroup: remove extra calls to find_existing_css_set
cgroup: replace tasklist_lock with rcu_read_lock
cgroup: simplify double-check locking in cgroup_attach_proc
cgroup: move struct cgroup_pidlist out from the header file
cgroup: remove cgroup_attach_task_current_cg()
The <linux/device.h> header includes a lot of stuff, and
it in turn gets a lot of use just for the basic "struct device"
which appears so often.
Clean up the users as follows:
1) For those headers only needing "struct device" as a pointer
in fcn args, replace the include with exactly that.
2) For headers not really using anything from device.h, simply
delete the include altogether.
3) For headers relying on getting device.h implicitly before
being included themselves, now explicitly include device.h
4) For files in which doing #1 or #2 uncovers an implicit
dependency on some other header, fix by explicitly adding
the required header(s).
Any C files that were implicitly relying on device.h to be
present have already been dealt with in advance.
Total removals from #1 and #2: 51. Total additions coming
from #3: 9. Total other implicit dependencies from #4: 7.
As of 3.3-rc1, there were 110, so a net removal of 42 gives
about a 38% reduction in device.h presence in include/*
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>