Commit Graph

812159 Commits

Author SHA1 Message Date
Sandipan Das 6f20c71d85 bpf: powerpc64: add JIT support for bpf line info
This adds support for generating bpf line info for
JITed programs.

Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-01 21:03:34 +01:00
Daniel Borkmann 2863debfbc Merge branch 'bpf-spinlocks'
Alexei Starovoitov says:

====================
Many algorithms need to read and modify several variables atomically.
Until now it was hard to impossible to implement such algorithms in BPF.
Hence introduce support for bpf_spin_lock.

The api consists of 'struct bpf_spin_lock' that should be placed
inside hash/array/cgroup_local_storage element
and bpf_spin_lock/unlock() helper function.

Example:
struct hash_elem {
    int cnt;
    struct bpf_spin_lock lock;
};
struct hash_elem * val = bpf_map_lookup_elem(&hash_map, &key);
if (val) {
    bpf_spin_lock(&val->lock);
    val->cnt++;
    bpf_spin_unlock(&val->lock);
}

and BPF_F_LOCK flag for lookup/update bpf syscall commands that
allows user space to read/write map elements under lock.

Together these primitives allow race free access to map elements
from bpf programs and from user space.

Key restriction: root only.
Key requirement: maps must be annotated with BTF.

This concept was discussed at Linux Plumbers Conference 2018.
Thank you everyone who participated and helped to iron out details
of api and implementation.

Patch 1: bpf_spin_lock support in the verifier, BTF, hash, array.
Patch 2: bpf_spin_lock in cgroup local storage.
Patches 3,4,5: tests
Patch 6: BPF_F_LOCK flag to lookup/update
Patches 7,8,9: tests

v6->v7:
- fixed this_cpu->__this_cpu per Peter's suggestion and added Ack.
- simplified bpf_spin_lock and load/store overlap check in the verifier
  as suggested by Andrii
- rebase

v5->v6:
- adopted arch_spinlock approach suggested by Peter
- switched to spin_lock_irqsave equivalent as the simplest way
  to avoid deadlocks in rare case of nested networking progs
  (cgroup-bpf prog in preempt_disable vs clsbpf in softirq sharing
  the same map with bpf_spin_lock)
  bpf_spin_lock is only allowed in networking progs that don't
  have arbitrary entry points unlike tracing progs.
- rebase and split test_verifier tests

v4->v5:
- disallow bpf_spin_lock for tracing progs due to insufficient preemption checks
- socket filter progs cannot use bpf_spin_lock due to missing preempt_disable
- fix atomic_set_release. Spotted by Peter.
- fixed hash_of_maps

v3->v4:
- fix BPF_EXIST | BPF_NOEXIST check patch 6. Spotted by Jakub. Thanks!
- rebase

v2->v3:
- fixed build on ia64 and archs where qspinlock is not supported
- fixed missing lock init during lookup w/o BPF_F_LOCK. Spotted by Martin

v1->v2:
- addressed several issues spotted by Daniel and Martin in patch 1
- added test11 to patch 4 as suggested by Daniel
====================

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-01 20:55:41 +01:00
Alexei Starovoitov ba72a7b4ba selftests/bpf: test for BPF_F_LOCK
Add C based test that runs 4 bpf programs in parallel
that update the same hash and array maps.
And another 2 threads that read from these two maps
via lookup(key, value, BPF_F_LOCK) api
to make sure the user space sees consistent value in both
hash and array elements while user space races with kernel bpf progs.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-01 20:55:39 +01:00
Alexei Starovoitov df5d22facd libbpf: introduce bpf_map_lookup_elem_flags()
Introduce
int bpf_map_lookup_elem_flags(int fd, const void *key, void *value, __u64 flags)
helper to lookup array/hash/cgroup_local_storage elements with BPF_F_LOCK flag.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-01 20:55:39 +01:00
Alexei Starovoitov e44ac9a22b tools/bpf: sync uapi/bpf.h
add BPF_F_LOCK definition to tools/include/uapi/linux/bpf.h

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-01 20:55:39 +01:00
Alexei Starovoitov 96049f3afd bpf: introduce BPF_F_LOCK flag
Introduce BPF_F_LOCK flag for map_lookup and map_update syscall commands
and for map_update() helper function.
In all these cases take a lock of existing element (which was provided
in BTF description) before copying (in or out) the rest of map value.

Implementation details that are part of uapi:

Array:
The array map takes the element lock for lookup/update.

Hash:
hash map also takes the lock for lookup/update and tries to avoid the bucket lock.
If old element exists it takes the element lock and updates the element in place.
If element doesn't exist it allocates new one and inserts into hash table
while holding the bucket lock.
In rare case the hashmap has to take both the bucket lock and the element lock
to update old value in place.

Cgroup local storage:
It is similar to array. update in place and lookup are done with lock taken.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-01 20:55:39 +01:00
Alexei Starovoitov ab963beb9f selftests/bpf: add bpf_spin_lock C test
add bpf_spin_lock C based test that requires latest llvm with BTF support

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-01 20:55:39 +01:00
Alexei Starovoitov b4d4556c32 selftests/bpf: add bpf_spin_lock verifier tests
add bpf_spin_lock tests to test_verifier.c that don't require
latest llvm with BTF support

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-01 20:55:39 +01:00
Alexei Starovoitov 7dac3ae42c tools/bpf: sync include/uapi/linux/bpf.h
sync bpf.h

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-01 20:55:39 +01:00
Alexei Starovoitov e16d2f1ab9 bpf: add support for bpf_spin_lock to cgroup local storage
Allow 'struct bpf_spin_lock' to reside inside cgroup local storage.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-01 20:55:38 +01:00
Alexei Starovoitov d83525ca62 bpf: introduce bpf_spin_lock
Introduce 'struct bpf_spin_lock' and bpf_spin_lock/unlock() helpers to let
bpf program serialize access to other variables.

Example:
struct hash_elem {
    int cnt;
    struct bpf_spin_lock lock;
};
struct hash_elem * val = bpf_map_lookup_elem(&hash_map, &key);
if (val) {
    bpf_spin_lock(&val->lock);
    val->cnt++;
    bpf_spin_unlock(&val->lock);
}

Restrictions and safety checks:
- bpf_spin_lock is only allowed inside HASH and ARRAY maps.
- BTF description of the map is mandatory for safety analysis.
- bpf program can take one bpf_spin_lock at a time, since two or more can
  cause dead locks.
- only one 'struct bpf_spin_lock' is allowed per map element.
  It drastically simplifies implementation yet allows bpf program to use
  any number of bpf_spin_locks.
- when bpf_spin_lock is taken the calls (either bpf2bpf or helpers) are not allowed.
- bpf program must bpf_spin_unlock() before return.
- bpf program can access 'struct bpf_spin_lock' only via
  bpf_spin_lock()/bpf_spin_unlock() helpers.
- load/store into 'struct bpf_spin_lock lock;' field is not allowed.
- to use bpf_spin_lock() helper the BTF description of map value must be
  a struct and have 'struct bpf_spin_lock anyname;' field at the top level.
  Nested lock inside another struct is not allowed.
- syscall map_lookup doesn't copy bpf_spin_lock field to user space.
- syscall map_update and program map_update do not update bpf_spin_lock field.
- bpf_spin_lock cannot be on the stack or inside networking packet.
  bpf_spin_lock can only be inside HASH or ARRAY map value.
- bpf_spin_lock is available to root only and to all program types.
- bpf_spin_lock is not allowed in inner maps of map-in-map.
- ld_abs is not allowed inside spin_lock-ed region.
- tracing progs and socket filter progs cannot use bpf_spin_lock due to
  insufficient preemption checks

Implementation details:
- cgroup-bpf class of programs can nest with xdp/tc programs.
  Hence bpf_spin_lock is equivalent to spin_lock_irqsave.
  Other solutions to avoid nested bpf_spin_lock are possible.
  Like making sure that all networking progs run with softirq disabled.
  spin_lock_irqsave is the simplest and doesn't add overhead to the
  programs that don't use it.
- arch_spinlock_t is used when its implemented as queued_spin_lock
- archs can force their own arch_spinlock_t
- on architectures where queued_spin_lock is not available and
  sizeof(arch_spinlock_t) != sizeof(__u32) trivial lock is used.
- presence of bpf_spin_lock inside map value could have been indicated via
  extra flag during map_create, but specifying it via BTF is cleaner.
  It provides introspection for map key/value and reduces user mistakes.

Next steps:
- allow bpf_spin_lock in other map types (like cgroup local storage)
- introduce BPF_F_LOCK flag for bpf_map_update() syscall and helper
  to request kernel to grab bpf_spin_lock before rewriting the value.
  That will serialize access to map elements.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-01 20:55:38 +01:00
David S. Miller d3a5fd3c98 This feature/cleanup patchset includes the following patches:
- bump version strings, by Simon Wunderlich
 
  - Add DHCPACKs for DAT snooping, by Linus Luessing
 
  - Update copyright years for 2019, by Sven Eckelmann
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEE1ilQI7G+y+fdhnrfoSvjmEKSnqEFAlxUKp8WHHN3QHNpbW9u
 d3VuZGVybGljaC5kZQAKCRChK+OYQpKeoc1rD/oC39wZl/ZdubJogpGWI3G9SdU8
 SxbjnaCVCkLMsLJH/wv/OM9OWwHa2OhvOzmMO5Zq47dy5SYXqBMkvv0QOMJcUQqz
 1nPMssTVo9gQLrJiPiP+yXGZ5+sc4dsLCiykFsCs7YRhqeStL1QTZkAOEKfy0K+7
 6G5OQsLT62aY+1OWLbb5oJaB9nlhUY+GyuQZ213jNuYxP7I6MyM9FfMHokASMLn8
 H58fIvDyGkC0PXvYZiJedlnFBU92TPEnBAV/afJ8egcmYQw9jkWL3cbS5ZzqDG4m
 49p9/Xmt2ARsf4UMDxQTEE3elw3tu1PZGPSecTmU+rRzHyYHIIYWnFgTZWmK7/zU
 TKQMlrPx4ky8HOyIY6/5AHNR7x5muchgxf0ft+4Jf0Bf+rGFIgdqfAIeQliUNAUc
 IW+HC0c1SEU/519a6z1V/ARrC6W4qk8aBZ0G4zyx+76KLvxlyvgjEo3XNasyNJnY
 GpHHhpyIeY5xeNOsGmoVrQraJRMqwr4jnWdcmf1LS8o9loB6X7bij8SRKUsEwNi2
 AkIK2sojUf6c4YZW/GVqIgvPuGtZL/Sy9FHx7Ve5f1NRuxpcMStSVV+daYqHYEe6
 72/WYF+oJc4fCR3zAXc67wVoyuPVPxOKpwh2uJUeVbvdvHZ0+dpOV740ktdFjIWQ
 3XAGhl/dSrn7OXYJqA==
 =G6IW
 -----END PGP SIGNATURE-----

Merge tag 'batadv-next-for-davem-20190201' of git://git.open-mesh.org/linux-merge

Simon Wunderlich says:

====================
This feature/cleanup patchset includes the following patches:

 - bump version strings, by Simon Wunderlich

 - Add DHCPACKs for DAT snooping, by Linus Luessing

 - Update copyright years for 2019, by Sven Eckelmann
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-01 11:04:13 -08:00
David S. Miller 962c382d48 New features for the wifi stack:
* airtime fairness scheduling in mac80211, so we can share
  * more authentication offloads to userspace - this is for
    SAE which is part of WPA3 and is hard to do in firmware
  * documentation fixes
  * various mesh improvements
  * various other small improvements/cleanups
 
 This also contains the NLA_POLICY_NESTED{,_ARRAY} change we
 discussed, which affects everyone but there's no other user
 yet.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEH1e1rEeCd0AIMq6MB8qZga/fl8QFAlxUKdwACgkQB8qZga/f
 l8TVaA/+Mw4gwHqAb4oQPfimZi4Zvaefu1lA6xRFEpubMUuo9ntR+qbRQWEt4D5B
 XOyY0xoqqeewxelGdXZ/XvF9vFxK1m9SgP5A5RSswbWETDLV/AIL3x9OdhcBLYMn
 ONm79SgMC0pyP+UePvgtZqNSQpf/3T1jeTBZAADyE3jf42iYhvtpyRFKwJi1mvNT
 wGN6mWz4FD4m3uFQem6bVRFPPfv06f29fK7JcY9iTkP7H8Glt0NK9mrt2OVPqZM4
 g7BDg/epOlKuvFgv5Z1uttNWfEWikZt6fWbyusA7T4UBO7jf7mQKotSczQ1D0awQ
 wWFfLcX8esVmiioMwyyJus1sxxxX3GIcbIfgfd3wM4kq4LfqICuT8NSssziPD5OH
 A1gJMQsVk/bT2uACDA2IoX1m4Sf1hvgdKiyihyMQdcAlJuRBh8Bbbthd2Jzlo+nM
 mHNT2lW2WR+0WslSl+uEk9NCtnOVZKdmSM5dJCmFgc/sycg7+AtksQNBT5p8C7SR
 TsqArDDRHXsySZYFVCDziaJsU/SIp3RQ/5K3owzs7TtCkVOlOJVRlG7/4twa4AdH
 qw3VPspXmPj6N+LDbBfmz5g2YAVQXktTTkrfYtwN5lAD9qfLpCRwO3T+BaVgVASZ
 8GCd0uIuIo8DI61QFf8lLbB5K90FHoSNNfG0yDeB7oBYF+S5v94=
 =9ese
 -----END PGP SIGNATURE-----

Merge tag 'mac80211-next-for-davem-2019-02-01' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next

Johannes Berg says:

====================
New features for the wifi stack:
 * airtime fairness scheduling in mac80211, so we can share
 * more authentication offloads to userspace - this is for
   SAE which is part of WPA3 and is hard to do in firmware
 * documentation fixes
 * various mesh improvements
 * various other small improvements/cleanups

This also contains the NLA_POLICY_NESTED{,_ARRAY} change we
discussed, which affects everyone but there's no other user
yet.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-01 10:20:52 -08:00
Dan Carpenter 39ee6e8204 net: hns3: Check for allocation failure
We should return -ENOMEM if the kcalloc() fails.

Fixes: d174ea75c9 ("net: hns3: add statistics for PFC frames and MAC control frame")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-01 10:01:39 -08:00
Dan Carpenter ef76c77a05 ethtool: remove unnecessary check in ethtool_get_regs()
We recently changed this function in commit f9fc54d313 ("ethtool:
check the return value of get_regs_len") such that if "reglen" is zero
we return directly.  That means we can remove this condition as well.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-01 09:58:07 -08:00
Johannes Berg 7d4194633b mac80211: fix missing/malformed documentation
Fix the missing and malformed documentation that kernel-doc and
sphinx warn about. While at it, also add some things to the docs
to fix missing links.

Sadly, the only way I could find to fix this was to add some
trailing whitespace.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-02-01 12:11:13 +01:00
Johannes Berg 9874b71fa1 cfg80211: add missing documentation that kernel-doc warns about
Add the missing documentation that kernel-doc continually warns
about, to get rid of all that noise.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-02-01 11:53:15 +01:00
Johannes Berg 23323289b1 netlink: reduce NLA_POLICY_NESTED{,_ARRAY} arguments
In typical cases, there's no need to pass both the maxattr
and the policy array pointer, as the maxattr should just be
ARRAY_SIZE(policy) - 1. Therefore, to be less error prone,
just remove the maxattr argument from the default macros
and deduce the size accordingly.

Leave the original macros with a leading underscore to use
here and in case somebody needs to pass a policy pointer
where the policy isn't declared in the same place and thus
ARRAY_SIZE() cannot be used.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-02-01 11:06:55 +01:00
Johannes Berg 752cfee90d Merge remote-tracking branch 'net-next/master' into mac80211-next
Merge net-next so that we get the changes from net, which would
otherwise conflict with the NLA_POLICY_NESTED/_ARRAY changes.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-02-01 11:05:35 +01:00
Matteo Croce 5ac4a12df5 cfg80211: fix typo
Fix spelling mistake in cfg80211.h: "lenght" -> "length".
The typo is also in the special comment block which
translates to documentation.

Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-02-01 11:05:07 +01:00
Toke Høiland-Jørgensen cb86880ee4 mac80211: Fix documentation strings for airtime-related variables
There was a typo in the documentation for weight_multiplier in mac80211.h,
and the doc was missing entirely for airtime and airtime_weight in sta_info.h.

Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-02-01 11:04:53 +01:00
Heiner Kallweit fa6821cbf1 r8169: improve WoL handling
WoL handling for the RTL8168 family is a little bit tricky because of
different types of broken BIOS and/or chip quirks.

Two known issues:
1. Network properly resumes from suspend only if WoL is enabled in the chip.
2. Some notebooks wake up immediately if system is suspended and network
   device is wakeup-enabled.

Few patches tried to deal with this:
7edf6d314c ("r8169: disable WOL per default")
18041b5236 ("r8169: restore previous behavior to accept BIOS WoL
settings")

Currently we have the situation that the chip WoL settings as set by
the BIOS are respected (to prevent issue 1), but the device doesn't get
wakeup-enabled (to prevent issue 2).

This leads to another issue:
If systemd is told to set WoL it first checks whether the requested
settings are active already (and does nothing if yes). Due to the chip
WoL flags being set properly systemd assumes that WoL is configured
properly in our case. Result is that device doesn't get wakeup-enabled
and WoL doesn't work (until it's set e.g. by ethtool).

This patch now:
- leaves the chip WoL settings as is (to prevent issue 1)
- keeps the behavior to not wakeup-enable the device initially
  (to prevent issue 2)
- In addition we report WoL as being disabled in get_wol, matching
  that device isn't wakeup-enabled. If systemd is told to enable WoL,
  it will therefore detect that it has to do something and will
  call set_wol.

Of course the user still has the option to override this with
e.g. ethtool.

v2:
- Don't just exclude __rtl8169_get_wol() from compiling, remove it.
v3:
- adjust commit message

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-31 12:50:42 -08:00
Julian Wiedmann 913564fbc2 macvlan: use netif_is_macvlan_port()
Replace the macvlan_port_exists() macro with its twin from netdevice.h

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-31 09:26:31 -08:00
Valdis Kletnieks 1832f4ef58 bpf, cgroups: clean up kerneldoc warnings
Building with W=1 reveals some bitrot:

  CC      kernel/bpf/cgroup.o
kernel/bpf/cgroup.c:238: warning: Function parameter or member 'flags' not described in '__cgroup_bpf_attach'
kernel/bpf/cgroup.c:367: warning: Function parameter or member 'unused_flags' not described in '__cgroup_bpf_detach'

Add a kerneldoc line for 'flags'.

Fixing the warning for 'unused_flags' is best approached by
removing the unused parameter on the function call.

Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-01-31 10:32:01 +01:00
Valdis Kletnieks 116bfa96a2 bpf: fix missing prototype warnings
Compiling with W=1 generates warnings:

  CC      kernel/bpf/core.o
kernel/bpf/core.c:721:12: warning: no previous prototype for ?bpf_jit_alloc_exec_limit? [-Wmissing-prototypes]
  721 | u64 __weak bpf_jit_alloc_exec_limit(void)
      |            ^~~~~~~~~~~~~~~~~~~~~~~~
kernel/bpf/core.c:757:14: warning: no previous prototype for ?bpf_jit_alloc_exec? [-Wmissing-prototypes]
  757 | void *__weak bpf_jit_alloc_exec(unsigned long size)
      |              ^~~~~~~~~~~~~~~~~~
kernel/bpf/core.c:762:13: warning: no previous prototype for ?bpf_jit_free_exec? [-Wmissing-prototypes]
  762 | void __weak bpf_jit_free_exec(void *addr)
      |             ^~~~~~~~~~~~~~~~~

All three are weak functions that archs can override, provide
proper prototypes for when a new arch provides their own.

Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-01-31 10:31:53 +01:00
Valdis Kletnieks de1da68d9c bpf: fix bitrotted kerneldoc
Over the years, the function signature has changed, but the
kerneldoc block hasn't.

Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-01-31 10:31:44 +01:00
Daniel Borkmann 9f239f68f2 Merge branch 'bpf-tests-probe-kernel-support'
Stanislav Fomichev says:

====================
If test_maps/test_verifier is running against the kernel which doesn't
have _all_ BPF features enabled, it fails with an error. This patch
series tries to probe kernel support for each failed test and skip
it instead. This lets users run BPF selftests in the not-all-bpf-yes
environments and received correct PASS/NON-PASS result.

See https://www.spinics.net/lists/netdev/msg539331.html for more
context.

The series goes like this:

* patch #1 skips sockmap tests in test_maps.c if BPF_MAP_TYPE_SOCKMAP
  map is not supported (if bpf_create_map fails, we probe the kernel
  for support)
* patch #2 skips verifier tests if test->prog_type is not supported (if
  bpf_verify_program fails, we probe the kernel for support)
* patch #3 skips verifier tests if test fixup map is not supported (if
  create_map fails, we probe the kernel for support)
* next patches fix various small issues that arise from the first four:
  * patch #4 sets "unknown func bpf_trace_printk#6" prog_type to
    BPF_PROG_TYPE_TRACEPOINT so it is correctly skipped in
    CONFIG_BPF_EVENTS=n case
  * patch #5 exposes BPF_PROG_TYPE_CGROUP_{SKB,SOCK,SOCK_ADDR} only when
    CONFIG_CGROUP_BPF=y, this makes verifier correctly skip appropriate
    tests

v3 changes:
* rebased on top of Quentin's series which adds probes to libbpf

v2 changes:
* don't sprinkle "ifdef CONFIG_CGROUP_BPF" all around net/core/filter.c,
  doing it only in the bpf_types.h is enough to disable
  BPF_PROG_TYPE_CGROUP_{SKB,SOCK,SOCK_ADDR} prog types for non-cgroup
  enabled kernels
====================

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-01-31 10:13:23 +01:00
Stanislav Fomichev befa618112 bpf: BPF_PROG_TYPE_CGROUP_{SKB, SOCK, SOCK_ADDR} require cgroups enabled
There is no way to exercise appropriate attach points without cgroups
enabled. This lets test_verifier correctly skip tests for these
prog_types if kernel was compiled without BPF cgroup support.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-01-31 10:13:21 +01:00
Stanislav Fomichev cfff578ed5 selftests/bpf: mark verifier test that uses bpf_trace_printk as BPF_PROG_TYPE_TRACEPOINT
We don't have this helper if the kernel was compiled without
CONFIG_BPF_EVENTS. Setting prog_type to BPF_PROG_TYPE_TRACEPOINT
let's verifier correctly skip this test based on the missing
prog_type support in the kernel.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-01-31 10:13:21 +01:00
Stanislav Fomichev 9acea337ef selftests/bpf: skip verifier tests for unsupported map types
Use recently introduced bpf_probe_map_type() to skip tests in the
test_verifier if map creation (create_map) fails. It's handled
explicitly for each fixup, i.e. if bpf_create_map returns negative fd,
we probe the kernel for the appropriate map support and skip the
test is map type is not supported.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-01-31 10:13:21 +01:00
Stanislav Fomichev 8184d44c9a selftests/bpf: skip verifier tests for unsupported program types
Use recently introduced bpf_probe_prog_type() to skip tests in the
test_verifier() if bpf_verify_program() fails. The skipped test is
indicated in the output.

Example:

...
679/p bpf_get_stack return R0 within range SKIP (unsupported program
type 5)
680/p ld_abs: invalid op 1 OK
...
Summary: 863 PASSED, 165 SKIPPED, 3 FAILED

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-01-31 10:13:21 +01:00
Stanislav Fomichev e8ddbfb4bc selftests/bpf: skip sockmap in test_maps if kernel doesn't have support
Use recently introduced bpf_probe_map_type() to skip test_sockmap()
if map creation fails. The skipped test is indicated in the output.

Example:

test_sockmap SKIP (unsupported map type BPF_MAP_TYPE_SOCKMAP)
Fork 1024 tasks to 'test_update_delete'
...
test_sockmap SKIP (unsupported map type BPF_MAP_TYPE_SOCKMAP)
Fork 1024 tasks to 'test_update_delete'
...
test_maps: OK, 2 SKIPPED

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-01-31 10:13:21 +01:00
David S. Miller 630afc7734 Merge branch 'hns3-next'
Huazhong Tan says:

====================
code optimizations & bugfixes for HNS3 driver

This patchset includes bugfixes and code optimizations for the HNS3
ethernet controller driver
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-30 14:50:04 -08:00
Jian Shen 9abeb7d8cf net: hns3: keep flow director state unchanged when reset
In orginal codes, driver always enables flow director when
intializing. When user disable flow director with command
ethtool -K, the flow director will be enabled again after
resetting.

This patch fixes it by only enabling it when first initialzing.

Fixes: 6871af29b3 ("net: hns3: Add reset handle for flow director")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-30 14:50:04 -08:00
Jian Shen c59a85c07e net: hns3: stop sending keep alive msg to PF when VF is resetting
When VF is resetting, it can't communicate to PF with mailbox msg.
This patch adds reset state checking before sending keep alive msg
to PF.

Fixes: a6d818e31d ("net: hns3: Add vport alive state checking support")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-30 14:50:04 -08:00
Peng Li eed9535f9f net: hns3: fix an issue for hclgevf_ae_get_hdev
HNS3 VF driver support NIC and Roce, hdev stores NIC
handle and Roce handle, should use correct parameter for
container_of.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-30 14:50:03 -08:00
Huazhong Tan 9fc5541327 net: hns3: fix improper error handling in the hclge_init_ae_dev()
While hclge_init_umv_space() failed in the hclge_init_ae_dev(),
we should undo all the operation which has been done successfully,
the last success operation maybe hclge_mac_mdio_config(), so if
hclge_init_umv_space() failed, we also need to undo it.

Fixes: 288475b2ad01 ("{topost} net: hns3: refine umv space allocation")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-30 14:50:03 -08:00
Jian Shen 472d7ecee2 net: hns3: fix for rss result nonuniform
The rss result is more uniform when use recommended hash key from
microsoft, instead of the one generated by netdev_rss_key_fill().
Also using hash algorithm "xor" is better than "toeplitz".

This patch modifies the default hash key and hash algorithm.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-30 14:50:03 -08:00
Huazhong Tan e215278548 net: hns3: fix netif_napi_del() not do problem when unloading
When the driver is unloading, if a global reset occurs,
unmap_ring_from_vector() in the hns3_nic_uninit_vector_data() will
fail, and hns3_nic_uninit_vector_data() just return. There may be
some netif_napi_del() not be done.

Since hardware will unmap all ring while resetting, so
hns3_nic_uninit_vector_data() should ignore this error, and do the
rest uninitialization.

Fixes: 76ad4f0ee7 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-30 14:50:03 -08:00
Huazhong Tan c8a8045b2d net: hns3: Fix NULL deref when unloading driver
When the driver is unloading, if there is a calling of ndo_open occurs
between phy_disconnect() and unregister_netdev(), it will end up
causing the kernel to eventually hit a NULL deref:

[14942.417828] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000048
[14942.529878] Mem abort info:
[14942.551166]   ESR = 0x96000006
[14942.567070]   Exception class = DABT (current EL), IL = 32 bits
[14942.623081]   SET = 0, FnV = 0
[14942.639112]   EA = 0, S1PTW = 0
[14942.643628] Data abort info:
[14942.659227]   ISV = 0, ISS = 0x00000006
[14942.674870]   CM = 0, WnR = 0
[14942.679449] user pgtable: 4k pages, 48-bit VAs, pgdp = 00000000224ad6ad
[14942.695595] [0000000000000048] pgd=00000021e6673003, pud=00000021dbf01003, pmd=0000000000000000
[14942.723163] Internal error: Oops: 96000006 [#1] PREEMPT SMP
[14942.729358] Modules linked in: hns3(O) hclge(O) pv680_mii(O) hnae3(O) [last unloaded: hclge]
[14942.738907] CPU: 1 PID: 26629 Comm: kworker/u4:13 Tainted: G           O      4.18.0-rc1-12928-ga960791-dirty #145
[14942.749491] Hardware name: Huawei Technologies Co., Ltd. D05/D05, BIOS Hi1620 FPGA TB BOOT BIOS B763 08/17/2018
[14942.760392] Workqueue: events_power_efficient phy_state_machine
[14942.766644] pstate: 80c00009 (Nzcv daif +PAN +UAO)
[14942.771918] pc : test_and_set_bit+0x18/0x38
[14942.776589] lr : netif_carrier_off+0x24/0x70
[14942.781033] sp : ffff0000121abd20
[14942.784518] x29: ffff0000121abd20 x28: 0000000000000000
[14942.790208] x27: ffff0000164d3cd8 x26: ffff8021da68b7b8
[14942.795832] x25: 0000000000000000 x24: ffff8021eb407800
[14942.801445] x23: 0000000000000000 x22: 0000000000000000
[14942.807046] x21: 0000000000000001 x20: 0000000000000000
[14942.812672] x19: 0000000000000000 x18: ffff000009781708
[14942.818284] x17: 00000000004970e8 x16: ffff00000816ad48
[14942.823900] x15: 0000000000000000 x14: 0000000000000008
[14942.829528] x13: 0000000000000000 x12: 0000000000000f65
[14942.835149] x11: 0000000000000001 x10: 00000000000009d0
[14942.840753] x9 : ffff0000121abaa0 x8 : 0000000000000000
[14942.846360] x7 : ffff000009781708 x6 : 0000000000000003
[14942.851970] x5 : 0000000000000020 x4 : 0000000000000004
[14942.857575] x3 : 0000000000000002 x2 : 0000000000000001
[14942.863180] x1 : 0000000000000048 x0 : 0000000000000000
[14942.868875] Process kworker/u4:13 (pid: 26629, stack limit = 0x00000000c909dbf3)
[14942.876464] Call trace:
[14942.879200]  test_and_set_bit+0x18/0x38
[14942.883376]  phy_link_change+0x38/0x78
[14942.887378]  phy_state_machine+0x3dc/0x4f8
[14942.891968]  process_one_work+0x158/0x470
[14942.896223]  worker_thread+0x50/0x470
[14942.900219]  kthread+0x104/0x130
[14942.903905]  ret_from_fork+0x10/0x1c
[14942.907755] Code: d2800022 8b400c21 f9800031 9ac32044 (c85f7c22)
[14942.914185] ---[ end trace 968c9e12eb740b23 ]---

So this patch fixes it by modifying the timing to do phy_connect_direct()
and phy_disconnect().

Fixes: 256727da73 ("net: hns3: Add MDIO support to HNS3 Ethernet driver for hip08 SoC")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-30 14:50:03 -08:00
Yunsheng Lin de67a690cc net: hns3: only support tc 0 for VF
When the VF shares the same TC config as PF, the business
running on PF and VF must have samiliar module.

For simplicity, we are not considering VF sharing the same tc
configuration as PF use case, so this patch removes the support
of TC configuration from VF and forcing VF to just use single
TC.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-30 14:50:03 -08:00
Huazhong Tan 74354140a5 net: hns3: change hnae3_register_ae_dev() to int
hnae3_register_ae_dev() may fail, and it should return a error code
to its caller, so change hnae3_register_ae_dev() return type to int.

Also, when hnae3_register_ae_dev() return error, hns3_probe() should
do some error handling and return the error code.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-30 14:50:03 -08:00
Peng Li fc0c174f42 net: hns3: use the correct interface to stop|open port
dev_close() stop the netdev and the service base on the netdev
will stop. But ndev->netdev_ops->ndo_stop() may only stop HW
and stack queue, the service base on the netdev can still work.

Fixes: 5668abda09 ("net: hns3: add support for set_ringparam")
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-30 14:50:03 -08:00
Jian Shen 8e1445a653 net: hns3: fix VF dump register issue
In original codes, the .get_regs_len and .get_regs were missed
assigned. This patch fixes it.

Fixes: 1600c3e5f2 ("net: hns3: Support "ethtool -d" for HNS3 VF driver")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-30 14:50:03 -08:00
liyongxin 1a6e552df3 net: hns3: reuse the definition of l3 and l4 header info union
Union l3_hdr_info and l4_hdr_info have already been defined in
the hns3_enet.h, so it is unnecessary to define them elsewhere.

This patch removes the redundant definition, and reuses the one
defined in the hns3_enet.h.

Signed-off-by: liyongxin <liyongxin1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-30 14:50:03 -08:00
David S. Miller a82a3fe018 Merge branch 'net-dsa-mt7530-support-MT7530-in-the-MT7621-SoC'
Greg Ungerer says:

====================
net: dsa: mt7530: support MT7530 in the MT7621 SoC

This is the fourth version of a patch series supporting the MT7530 switch
as used in the MediaTek MT7621 SoC. Unlike the MediaTek MT7623 the MT7621
is built around a dual core MIPS CPU architecture. But inside it uses
basically the same 7530 switch.

This series resolves all issues I had with previous versions, and I can
now reliably use the driver on a 7621 SoC platform. These patches were
generated against linux-5.0-rc4.

The first patch enables support for the existing kernel mediatek ethernet
driver on the MT7621 SoC. This support is from Bjørn Mork, with an update
and fix by me. Using this driver fixed a number of problems I had
(TX checksums, large RX packet drop) over the staging driver
(drivers/staging/mt7621-eth).

Patch 2 modifies the mt7530 DSA driver to support the 7530 switch as
implemented in the Mediatek MT7621 SoC. The last patch updates the
devicetree bindings to reflect the new support in the mt7530 driver.

There is no real dependencies between the patches, so they can be taken
independantly.

Creating a new binding for the MT7621 seems like the only viable approach
to distinguish between a stand alone 7530 switch, the silicon module
in the MT7623 SoC and the silicon in the MT7621. Certainly the 7530 ID
register in the MT7623 and MT7621 returns the same value, "0x7530001".

Looking at the mt7530.c DSA driver it might make some sense to convert
the existing "mediatek,mcm" binding to something like "mediatek,mt7623"
to be consistent with this new MT7621 support. As far as I can tell
this is the intention of this binding.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-30 14:26:07 -08:00
Greg Ungerer 9389b5e946 dt-bindings: net: dsa: add new MT7530 binding to support MT7621
Add devicetree binding to support the compatible mt7530 switch as used
in the MediaTek MT7621 SoC.

Signed-off-by: Greg Ungerer <gerg@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Sean Wang <sean.wang@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-30 14:26:07 -08:00
Greg Ungerer ddda1ac116 net: dsa: mt7530: support the 7530 switch on the Mediatek MT7621 SoC
The MediaTek MT7621 SoC device contains a 7530 switch, and the existing
linux kernel 7530 DSA switch driver can be used with it.

The bulk of the changes required stem from the 7621 having different
regulator and pad setup. The existing setup of these in the 7530
driver appears to be very specific to its implemtation in the Mediatek
7623 SoC. (Not entirely surprising given the 7623 is a quad core ARM
based SoC, and the 7621 is a dual core, dual thread MIPS based SoC).

Create a new devicetree type, "mediatek,mt7621", to support the 7530
switch in the 7621 SoC. There appears to be no usable ID register to
distinguish it from a 7530 in other hardware at runtime. This is used
to carry out the appropriate configuration and setup.

Signed-off-by: Greg Ungerer <gerg@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Sean Wang <sean.wang@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-30 14:26:07 -08:00
Bjørn Mork 889bcbdeee net: ethernet: mediatek: support MT7621 SoC ethernet hardware
The Mediatek MT7621 SoC contains the same ethernet hardware module as
used on a number of other MediaTek SoC parts. There are some minor
differences to deal with but we can use the same driver to support
them all.

This patch is based on work by Bjørn Mork <bjorn@mork.no>, and his
original patch is at:

3293bc63f5

There is an additional compatible devicetree type added, and the primary
change to the code required is to support a single interrupt (for both
RX and TX interrupts).

Signed-off-by: Bjørn Mork <bjorn@mork.no>
[gerg@kernel.org: rebase to mainline and irq handler fix]
Signed-off-by: Greg Ungerer <gerg@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Sean Wang <sean.wang@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-30 14:26:07 -08:00
David S. Miller 08c25fe83a Merge branch 'mlxsw-spectrum_acl-Include-delta-bits-into-hashtable-key'
Ido Schimmel says:

====================
mlxsw: spectrum_acl: Include delta bits into hashtable key

The Spectrum-2 ASIC allows multiple rules to use the same mask provided
that the difference between their masks is small enough (up to 8
consecutive delta bits). A more detailed explanation is provided in
merge commit 756cd36626 ("Merge branch
'mlxsw-Introduce-algorithmic-TCAM-support'").

These delta bits are part of the rule's key and therefore rules that
only differ in their delta bits can be inserted with the same A-TCAM
mask. In case two rules share the same key and only differ in their
priority, then the second will spill to the C-TCAM.

Current code does not take the delta bits into account when checking for
duplicate rules, which leads to unnecessary spillage to the C-TCAM.
This may result in reduced scale and performance.

Patch #1 includes the delta bits in the rule's key to avoid the above
mentioned problem.

Patch #2 adds a tracepoint when a rule is inserted into the C-TCAM.

Patches #3-#5 add test cases to make sure unnecessary spillage into the
C-TCAM does not occur.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-30 10:00:40 -08:00