Comment-only change to better explain why TIPC's configuration lock is
temporarily released while activating support for network interfaces,
and why the existing activation code doesn't require rework.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Permits run-time alteration of default link settings on a per-media
and per-bearer basis, in addition to the existing per-link basis.
The following syntax can now be used:
tipc-config -lt=<link-name|bearer-name|media-name>/<tolerance>
tipc-config -lp=<link-name|bearer-name|media-name>/<priority>
tipc-config -lw=<link-name|bearer-name|media-name>/<window>
Note that changes to the default settings for a given media type has
no effect on the default settings used by existing bearers. Similarly,
changes to default bearer settings has no effect on existing link
endpoints that utilize that interface.
Thanks to Florian Westphal <fw@strlen.de> for his contributions to
the development of this enhancement.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Adds a check to ensure that TIPC ignores an incoming neighbor discovery
message that specifies an invalid media address as its source. The check
ensures that the source address is a valid, non-broadcast address that
could legally be used by a neighboring link endpoint.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Reworks TIPC's media address data structure and associated processing
routines to transfer all media-specific details of address conversion
to the associated TIPC media adaptation code. TIPC's generic bearer code
now only needs to know which media type an address is associated with
and whether or not it is a broadcast address, and totally ignores the
"value" field that contains the actual media-specific addressing info.
These changes eliminate the need for a number of endianness conversion
operations and will make it easier for TIPC to support new media types
in the future.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Enhances TIPC's Ethernet media support to provide 3 new address conversion
routines, which allow TIPC to interpret an address that is in string form
and to convert an address to and from the 20 byte format used in TIPC's
neighbor discovery messages.
These routines are pre-requisites to a follow on commit that hides all
media-specific addressing details from TIPC's generic bearer code.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Enhances conversion of a media address to printable form so that an
unconvertable address will be displayed as a string of hex digits,
rather than not being displayed at all. (Also removes a pointless check
for the existence of the media-specific address conversion routine,
since the routine is not optional.)
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Simplifies error handling performed during media registration, since
TIPC no longer supports the dynamic addition of new media types that
are potentially error-prone. These simplifications include the following:
1) No longer check for premature registration of a new media type.
2) No longer check for negative link priority values (which was pointless
since such values are unsigned, and could cause a compiler warning).
3) No longer generate a warning describing the exact cause of any
registration failure (just warns that overall registration failed).
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Changes TIPC's list of registered media types from an array of media
structures to an array of pointers to media structures. This eliminates
the need to copy of the contents of the structure passed in during media
registration.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Streamlines the detection of an attempt to register a TIPC media structure
using an already registered name or type identifier. The revised logic now
reuses an existing routine to detect an existing name and no longer
unnecessarily manipulates the media type counter during an unsuccessful
registration attempt.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Speeds up the registration of TIPC media types by passing in a structure
containing the required information, rather than by passing in the various
fields describing the media type individually.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Permits a Linux container to use TIPC sockets even when it has its own
network namespace defined by removing the check that prohibits such use.
This makes it possible for users who wish to isolate their container
network traffic from normal network traffic to utilize TIPC.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
v2, based on Jay's review.
I kept the 'link must be up' part, because this is enforced in the code.
Signed-off-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
cc: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
RDBG() wasn't even used, and the messages printed by RT6_DEBUG() were
far from useful. Just get rid of all this stuff, we can replace it
with something more suitable if we want.
Signed-off-by: David S. Miller <davem@davemloft.net>
Include linux/slab.h to fix below build error:
CC drivers/net/ethernet/mellanox/mlx4/resource_tracker.o
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c: In function 'mlx4_init_resource_tracker':
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:233: error: implicit declaration of function 'kzalloc'
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:234: warning: assignment makes pointer from integer without a cast
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c: In function 'mlx4_free_resource_tracker':
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:264: error: implicit declaration of function 'kfree'
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c: In function 'alloc_qp_tr':
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:370: warning: assignment makes pointer from integer without a cast
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c: In function 'alloc_mtt_tr':
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:386: warning: assignment makes pointer from integer without a cast
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c: In function 'alloc_mpt_tr':
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:402: warning: assignment makes pointer from integer without a cast
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c: In function 'alloc_eq_tr':
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:417: warning: assignment makes pointer from integer without a cast
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c: In function 'alloc_cq_tr':
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:431: warning: assignment makes pointer from integer without a cast
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c: In function 'alloc_srq_tr':
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:446: warning: assignment makes pointer from integer without a cast
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c: In function 'alloc_counter_tr':
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:461: warning: assignment makes pointer from integer without a cast
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c: In function 'add_res_range':
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:521: warning: assignment makes pointer from integer without a cast
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c: In function 'mac_add_to_slave':
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:1193: warning: assignment makes pointer from integer without a cast
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c: In function 'add_mcg_res':
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:2521: warning: assignment makes pointer from integer without a cast
make[5]: *** [drivers/net/ethernet/mellanox/mlx4/resource_tracker.o] Error 1
make[4]: *** [drivers/net/ethernet/mellanox/mlx4] Error 2
make[3]: *** [drivers/net/ethernet/mellanox] Error 2
make[2]: *** [drivers/net/ethernet] Error 2
make[1]: *** [drivers/net] Error 2
make: *** [drivers] Error 2
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Otherwise we leave uninitialized kernel memory in there.
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The NLA_PUT macro should accept the actual attribute length, not
the amount of elements in array :(
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We currently have two ways to account traffic in netfilter:
- iptables chain and rule counters:
# iptables -L -n -v
Chain INPUT (policy DROP 3 packets, 867 bytes)
pkts bytes target prot opt in out source destination
8 1104 ACCEPT all -- lo * 0.0.0.0/0 0.0.0.0/0
- use flow-based accounting provided by ctnetlink:
# conntrack -L
tcp 6 431999 ESTABLISHED src=192.168.1.130 dst=212.106.219.168 sport=58152 dport=80 packets=47 bytes=7654 src=212.106.219.168 dst=192.168.1.130 sport=80 dport=58152 packets=49 bytes=66340 [ASSURED] mark=0 use=1
While trying to display real-time accounting statistics, we require
to pool the kernel periodically to obtain this information. This is
OK if the number of flows is relatively low. However, in case that
the number of flows is huge, we can spend a considerable amount of
cycles to iterate over the list of flows that have been obtained.
Moreover, if we want to obtain the sum of the flow accounting results
that match some criteria, we have to iterate over the whole list of
existing flows, look for matchings and update the counters.
This patch adds the extended accounting infrastructure for
nfnetlink which aims to allow displaying real-time traffic accounting
without the need of complicated and resource-consuming implementation
in user-space. Basically, this new infrastructure allows you to create
accounting objects. One accounting object is composed of packet and
byte counters.
In order to manipulate create accounting objects, you require the
new libnetfilter_acct library. It contains several examples of use:
libnetfilter_acct/examples# ./nfacct-add http-traffic
libnetfilter_acct/examples# ./nfacct-get
http-traffic = { pkts = 000000000000, bytes = 000000000000 };
Then, you can use one of this accounting objects in several iptables
rules using the new nfacct match (which comes in a follow-up patch):
# iptables -I INPUT -p tcp --sport 80 -m nfacct --nfacct-name http-traffic
# iptables -I OUTPUT -p tcp --dport 80 -m nfacct --nfacct-name http-traffic
The idea is simple: if one packet matches the rule, the nfacct match
updates the counters.
Thanks to Patrick McHardy, Eric Dumazet, Changli Gao for reviewing and
providing feedback for this contribution.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Aim of this patch is to provide full range of rps_flow_cnt on 64bit arches.
Theorical limit on number of flows is 2^32
Fix some buggy RPS/RFS macros as well.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Tom Herbert <therbert@google.com>
CC: Xi Wang <xi.wang@gmail.com>
CC: Laurent Chavey <chavey@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The get and zero operations have to be done in an atomic context,
otherwise counters added between them will be lost.
This problem was spotted by Changli Gao while discussing the
nfacct infrastructure.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
We can't do this without propagating the const to nlk_sk()
too, otherwise:
net/netlink/af_netlink.c: In function ‘netlink_is_kernel’:
net/netlink/af_netlink.c:103:2: warning: passing argument 1 of ‘nlk_sk’ discards ‘const’ qualifier from pointer target type [enabled by default]
net/netlink/af_netlink.c:96:36: note: expected ‘struct sock *’ but argument is of type ‘const struct sock *’
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
net/bluetooth/l2cap_core.c
Just two overlapping changes, one added an initialization of
a local variable, and another change added a new local variable.
Signed-off-by: David S. Miller <davem@davemloft.net>
The new netem loss model is configured with nested netlink messages.
This code is being overly strict about sizes, and is easily confused
by padding (or possible future expansion). Also message
for gemodel is incorrect.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add backlog (byte count) information in hfsc classes and qdisc, so that
"tc -s" can report it to user, instead of 0 values :
qdisc hfsc 1: root refcnt 6 default 20
Sent 45141660 bytes 30545 pkt (dropped 0, overlimits 91751 requeues 0)
rate 1492Kbit 126pps backlog 103226b 74p requeues 0
...
class hfsc 1:20 parent 1:1 leaf 1201: rt m1 0bit d 0us m2 400000bit ls m1 0bit d 0us m2 200000bit
Sent 49534912 bytes 33519 pkt (dropped 0, overlimits 0 requeues 0)
backlog 81822b 56p requeues 0
period 23 work 49451576 bytes rtwork 13277552 bytes level 0
...
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: John A. Sullivan III <jsullivan@opensourcedevel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We recently made loopback a bool type instead of an int, so the bitwise
AND is redundent.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order to accommodate a 64K buffer we need 64K/PAGE_SIZE plus one more page
in order to allow for a buffer which does not start on a page boundary.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixed the asix_get_wol() routine reported wrong wol status issue.
Signed-off-by: Allan Chou <allan@asix.com.tw>
Tested-by: Eugene <elubarsky@gmail.com>; Allan Chou <allan@asix.com.tw>
Signed-off-by: David S. Miller <davem@davemloft.net>
Just fixed typo of sample code in packet_mmap.txt
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change details:
- Add debugfs support to obtain firmware trace, saved firmware trace on
an IOC crash, driver info and read/write to registers.
- debugfs hierarchy:
bna/pci_dev:<pci_name>
where the pci_name corresponds to the one under /sys/bus/pci/drivers/bna
- Following are the new debugfs entries added:
fwtrc: collect current firmware trace.
fwsave: collect last saved fw trace as a result of firmware crash.
regwr: write one word to chip register
regrd: read one or more words from chip register.
drvinfo: collect the driver information.
Signed-off-by: Krishna Gudipati <kgudipat@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change details:
- The patch adds flash sub-module to the bna driver.
- Added ethtool set_eeprom() and get_eeprom() entry points to
support flash partition read/write operations.
Signed-off-by: Krishna Gudipati <kgudipat@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes the following warning raised
when compile:
WARNING: modpost: missing MODULE_LICENSE()
in drivers/net/ethernet/stmicro/stmmac/stmmac.o
Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
"! --connbytes 23:42" should match if the packet/byte count is not in range.
As there is no explict "invert match" toggle in the match structure,
userspace swaps the from and to arguments
(i.e., as if "--connbytes 42:23" were given).
However, "what <= 23 && what >= 42" will always be false.
Change things so we use "||" in case "from" is larger than "to".
This change may look like it breaks backwards compatibility when "to" is 0.
However, older iptables binaries will refuse "connbytes 42:0",
and current releases treat it to mean "! --connbytes 0:42",
so we should be fine.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The NAT range to nlattr conversation callbacks and helpers are entirely
dead code and are also useless since there are no NAT ranges in conntrack
context, they are only used for initially selecting a tuple. The final NAT
information is contained in the selected tuples of the conntrack entry.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The packet size check originates from a time when UDP helpers could
accidentally mangle incorrect packets (NEWNAT) and is unnecessary
nowadays since the conntrack helpers invoke the NAT helpers for the
proper packet directly.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The inner tuple that is extracted from the packet is unused. The code also
doesn't have any useful side-effects like verifying the packet does contain
enough data to extract the inner tuple since conntrack already does the
same, so remove it.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The only remaining user of NAT protocol module reference counting is NAT
ctnetlink support. Since this is a fairly short sequence of code, convert
over to use RCU and remove module reference counting.
Module unregistration is already protected by RCU using synchronize_rcu(),
so no further changes are necessary.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Use nf_conntrack_hash_rnd in NAT bysource hash to avoid hash chain attacks.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Export the NAT definitions to userspace. So far userspace (specifically,
iptables) has been copying the headers files from include/net. Also
rename some structures and definitions in preparation for IPv6 NAT.
Since these have never been officially exported, this doesn't affect
existing userspace code.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This partially reworks bc01befdcf
which added userspace expectation support.
This patch removes the nf_ct_userspace_expect_list since now we
force to use the new iptables CT target feature to add the helper
extension for conntracks that have attached expectations from
userspace.
A new version of the proof-of-concept code to implement userspace
helpers from userspace is available at:
http://people.netfilter.org/pablo/userspace-conntrack-helpers/nf-ftp-helper-POC.tar.bz2
This patch also modifies the CT target to allow to set the
conntrack's userspace helper status flags. This flag is used
to tell the conntrack system to explicitly allocate the helper
extension.
This helper extension is useful to link the userspace expectations
with the master conntrack that is being tracked from one userspace
helper.
This feature fixes a problem in the current approach of the
userspace helper support. Basically, if the master conntrack that
has got a userspace expectation vanishes, the expectations point to
one invalid memory address. Thus, triggering an oops in the
expectation deletion event path.
I decided not to add a new revision of the CT target because
I only needed to add a new flag for it. I'll document in this
issue in the iptables manpage. I have also changed the return
value from EINVAL to EOPNOTSUPP if one flag not supported is
specified. Thus, in the future adding new features that only
require a new flag can be added without a new revision.
There is no official code using this in userspace (apart from
the proof-of-concept) that uses this infrastructure but there
will be some by beginning 2012.
Reported-by: Sam Roberts <vieuxtech@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
skb->truesize might be big even for a small packet.
Its even bigger after commit 87fb4b7b53 (net: more accurate skb
truesize) and big MTU.
We should allow queueing at least one packet per receiver, even with a
low RCVBUF setting.
Reported-by: Michal Simek <monstr@monstr.eu>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch changes it to yield sooner at halfway instead. Still not a cure-all
for listener overrun if listner is slow, but works much reliably.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Don't inline functions that cover several lines, and do inline
the trivial ones. Also make some arguments const.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit e5671dfae5.
After a follow up discussion with Michal, it was agreed it would
be better to leave the kmem controller with just the tcp files,
deferring the behavior of the other general memory.kmem.* files
for a later time, when more caches are controlled. This is because
generic kmem files are not used by tcp accounting and it is
not clear how other slab caches would fit into the scheme.
We are reverting the original commit so we can track the reference.
Part of the patch is kept, because it was used by the later tcp
code. Conflicts are shown in the bottom. init/Kconfig is removed from
the revert entirely.
Signed-off-by: Glauber Costa <glommer@parallels.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
CC: Kirill A. Shutemov <kirill@shutemov.name>
CC: Paul Menage <paul@paulmenage.org>
CC: Greg Thelen <gthelen@google.com>
CC: Johannes Weiner <jweiner@redhat.com>
CC: David S. Miller <davem@davemloft.net>
Conflicts:
Documentation/cgroups/memory.txt
mm/memcontrol.c
Signed-off-by: David S. Miller <davem@davemloft.net>
Setting a large rps_flow_cnt like (1 << 30) on 32-bit platform will
cause a kernel oops due to insufficient bounds checking.
if (count > 1<<30) {
/* Enforce a limit to prevent overflow */
return -EINVAL;
}
count = roundup_pow_of_two(count);
table = vmalloc(RPS_DEV_FLOW_TABLE_SIZE(count));
Note that the macro RPS_DEV_FLOW_TABLE_SIZE(count) is defined as:
... + (count * sizeof(struct rps_dev_flow))
where sizeof(struct rps_dev_flow) is 8. (1 << 30) * 8 will overflow
32 bits.
This patch replaces the magic number (1 << 30) with a symbolic bound.
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Userspace may not provide TCA_OPTIONS, in fact tc currently does
so not do so if no arguments are specified on the command line.
Return EINVAL instead of panicing.
Signed-off-by: Thomas Graf <tgraf@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 618f9bc74a (net: Move mtu handling down to the protocol
depended handlers) forgot the bridge netfilter case, adding a NULL
dereference in ip_fragment().
Reported-by: Chris Boot <bootc@bootc.net>
CC: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>