Commit Graph

726256 Commits

Author SHA1 Message Date
Linus Torvalds 66f8162418 Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fix from Thomas Gleixner:
 "A single fix for the new matrix allocator to prevent vector exhaustion
  by certain network drivers which allocate gazillions of unused vectors
  which cannot be put into reservation mode due to MSI and the lack of
  MSI entry masking.

  The fix/workaround is to spread the vectors across CPUs by searching
  the supplied target CPU mask for the CPU with the smallest number of
  allocated vectors"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irq/matrix: Spread interrupts on allocation
2018-01-21 10:39:58 -08:00
David S. Miller cbcbeedbfd Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter/IPVS updates for net-next

The following patchset contains Netfilter/IPVS updates for your net-next
tree. Basically, a new extension for ip6tables, simplification work of
nf_tables that saves us 500 LoC, allow raw table registration before
defragmentation, conversion of the SNMP helper to use the ASN.1 code
generator, unique 64-bit handle for all nf_tables objects and fixes to
address fallout from previous nf-next batch.  More specifically, they
are:

1) Seven patches to remove family abstraction layer (struct nft_af_info)
   in nf_tables, this simplifies our codebase and it saves us 64 bytes per
   net namespace.

2) Add IPv6 segment routing header matching for ip6tables, from Ahmed
   Abdelsalam.

3) Allow to register iptable_raw table before defragmentation, some
   people do not want to waste cycles on defragmenting traffic that is
   going to be dropped, hence add a new module parameter to enable this
   behaviour in iptables and ip6tables. From Subash Abhinov
   Kasiviswanathan. This patch needed a couple of follow up patches to
   get things tidy from Arnd Bergmann.

4) SNMP helper uses the ASN.1 code generator, from Taehee Yoo. Several
   patches for this helper to prepare this change are also part of this
   patch series.

5) Add 64-bit handles to uniquely objects in nf_tables, from Harsha
   Sharma.

6) Remove log message that several netfilter subsystems print at
   boot/load time.

7) Restore x_tables module autoloading, that got broken in a previous
   patch to allow singleton NAT hook callback registration per hook
   spot, from Florian Westphal. Moreover, return EBUSY to report that
   the singleton NAT hook slot is already in instead.

8) Several fixes for the new nf_tables flowtable representation,
   including incorrect error check after nf_tables_flowtable_lookup(),
   missing Kconfig dependencies that lead to build breakage and missing
   initialization of priority and hooknum in flowtable object.

9) Missing NETFILTER_FAMILY_ARP dependency in Kconfig for the clusterip
   target. This is due to recent updates in the core to shrink the hook
   array size and compile it out if no specific family is enabled via
   .config file. Patch from Florian Westphal.

10) Remove duplicated include header files, from Wei Yongjun.

11) Sparse warning fix for the NFPROTO_INET handling from the core
    due to missing static function definition, also from Wei Yongjun.

12) Restore ICMPv6 Parameter Problem error reporting when
    defragmentation fails, from Subash Abhinov Kasiviswanathan.

13) Remove obsolete owner field initialization from struct
    file_operations, patch from Alexey Dobriyan.

14) Use boolean datatype where needed in the Netfilter codebase, from
    Gustavo A. R. Silva.

15) Remove double semicolon in dynset nf_tables expression, from
    Luis de Bethencourt.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-21 11:35:34 -05:00
Linus Torvalds d517bb79f4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha
Pull alpha fixes from Matt Turner:
 "A build fix and a regression fix"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha:
  alpha/PCI: Fix noname IRQ level detection
  alpha: extend memset16 to EV6 optimised routines
2018-01-20 20:12:47 -08:00
David S. Miller ea9722e265 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2018-01-19

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) bpf array map HW offload, from Jakub.

2) support for bpf_get_next_key() for LPM map, from Yonghong.

3) test_verifier now runs loaded programs, from Alexei.

4) xdp cpumap monitoring, from Jesper.

5) variety of tests, cleanups and small x64 JIT optimization, from Daniel.

6) user space can now retrieve HW JITed program, from Jiong.

Note there is a minor conflict between Russell's arm32 JIT fixes
and removal of bpf_jit_enable variable by Daniel which should
be resolved by keeping Russell's comment and removing that variable.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-20 22:03:46 -05:00
Laura Abbott 91cfc88c66 x86: Use __nostackprotect for sme_encrypt_kernel
Commit bacf6b499e ("x86/mm: Use a struct to reduce parameters for SME
PGD mapping") moved some parameters into a structure.

The structure was large enough to trigger the stack protection canary in
sme_encrypt_kernel which doesn't work this early, causing reboots.

Mark sme_encrypt_kernel appropriately to not use the canary.

Fixes: bacf6b499e ("x86/mm: Use a struct to reduce parameters for SME PGD mapping")
Signed-off-by: Laura Abbott <labbott@redhat.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-01-20 17:22:54 -08:00
Lorenzo Pieralisi 86be89939d alpha/PCI: Fix noname IRQ level detection
The conversion of the alpha architecture PCI host bridge legacy IRQ
mapping/swizzling to the new PCI host bridge map/swizzle hooks carried
out through:

commit 0e4c2eeb75 ("alpha/PCI: Replace pci_fixup_irqs() call with
host bridge IRQ mapping hooks")

implies that IRQ for devices are now allocated through pci_assign_irq()
function in pci_device_probe() that is called when a driver matching a
device is found in order to probe the device through the device driver.

Alpha noname platforms required IRQ level programming to be executed
in sio_fixup_irq_levels(), that is called in noname_init_pci(), a
platform hook called within a subsys_initcall.

In noname_init_pci(), present IRQs are detected through
sio_collect_irq_levels() that check the struct pci_dev->irq number
to detect if an IRQ has been allocated for the device.

By the time sio_collect_irq_levels() is called, some devices may still
have not a matching driver loaded to match them (eg loadable module)
therefore their IRQ allocation is still pending - which means that
sio_collect_irq_levels() does not programme the correct IRQ level for
those devices, causing their IRQ handling to be broken when the device
driver is actually loaded and the device is probed.

Fix the issue by adding code in the noname map_irq() function
(noname_map_irq()) that, whilst mapping/swizzling the IRQ line, it also
ensures that the correct IRQ level programming is executed at platform
level, fixing the issue.

Fixes: 0e4c2eeb75 ("alpha/PCI: Replace pci_fixup_irqs() call with
host bridge IRQ mapping hooks")
Reported-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: stable@vger.kernel.org # 4.14
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: Meelis Roos <mroos@linux.ee>
Signed-off-by: Matt Turner <mattst88@gmail.com>
2018-01-20 16:22:36 -08:00
Linus Torvalds 24b6124047 KVM fixes for v4.15-rc9
ARM:
 * fix incorrect huge page mappings on systems using the contiguous hint
   for hugetlbfs
 * support alternative GICv4 init sequence
 * correctly implement the ARM SMCC for HVC and SMC handling
 
 PPC:
 * add KVM IOCTL for reporting vulnerability and workaround status
 
 s390:
 * provide userspace interface for branch prediction changes in firmware
 
 x86:
 * use correct macros for bits
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABCAAGBQJaY3/eAAoJEED/6hsPKofo64kH/16SCSA9pKJTf39+jLoCPzbp
 tlhzxoaqb9cPNMQBAk8Cj5xNJ6V4Clwnk8iRWaE6dRI5nWQxnxRHiWxnrobHwUbK
 I0zSy+SywynSBnollKzLzQrDUBZ72fv3oLwiYEYhjMvs0zW6Q/vg10WERbav912Q
 bv8nb5e8TbvU500ErndKTXOa8/B6uZYkMVjBNvAHwb+4AQ7bJgDQs5/qOeXllm8A
 MT/SNYop/fkjRP7mQng5XYzoO+70tbe0hWpOQGgBnduzrbkNNvZtYtovusHYytLX
 PAB7DDPbLZm5L2HBo4zvKgTHIoHTxU0X2yfUDzt7O151O2WSyqBRC3y1tpj6xa8=
 =GnNJ
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Radim Krčmář:
 "ARM:
   - fix incorrect huge page mappings on systems using the contiguous
     hint for hugetlbfs
   - support alternative GICv4 init sequence
   - correctly implement the ARM SMCC for HVC and SMC handling

  PPC:
   - add KVM IOCTL for reporting vulnerability and workaround status

  s390:
   - provide userspace interface for branch prediction changes in
     firmware

  x86:
   - use correct macros for bits"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: s390: wire up bpb feature
  KVM: PPC: Book3S: Provide information about hardware/firmware CVE workarounds
  KVM/x86: Fix wrong macro references of X86_CR0_PG_BIT and X86_CR4_PAE_BIT in kvm_valid_sregs()
  arm64: KVM: Fix SMCCC handling of unimplemented SMC/HVC calls
  KVM: arm64: Fix GICv4 init when called from vgic_its_create
  KVM: arm/arm64: Check pagesize when allocating a hugepage at Stage 2
2018-01-20 11:41:09 -08:00
Linus Torvalds e6252e7f58 Final MIPS fixes for 4.15
Some final MIPS fixes for 4.15, including important build fixes and a
 MAINTAINERS update:
 
 - Add myself as MIPS co-maintainer.
 - Fix various all*config build failures (particularly as a result of
   switching the default MIPS platform to the "generic" platform).
 - Fix GCC7 build failures (duplicate const and questionable calls to
   missing __multi3 intrinsic on mips64r6).
 - Fix warnings when CPU Idle is enabled (4.14).
 - Fix AR7 serial output (since 3.17).
 - Fix ralink platform_get_irq error checking (since 3.12).
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEd80NauSabkiESfLYbAtpk944dnoFAlpiZcsACgkQbAtpk944
 dnrjZA/+LOgEap7GeBrtc38PjpG+a2RFbfZ8EOIVWXcUkZmxX7Uf3NMQk5JH/f4H
 ESgxwHeueorOmRuZm1QJMK1fafwc0ZzMOCXUL+oOXJn5iZ9RRxhSYms0AeQqZ3Sl
 ueAOxI9qP0YUcHc0CU50csLgZsdz/IMZcp/s8MwNai001HvyThjGLw1BeKPKiwD6
 0Yzf5aVhAmUtUy5gSXaI1vNXN97o8XdUb71ivPWJ79mHhvmcDfLjHDoTInhSj5mM
 DiLtMwRVGOAlPWXG5b63LhDfsRr0XW6zNnwqd5SK06O5jAE48jJkhWVFIqX/U4eh
 IianE3IAY263TimeLbjcG+poJ4tWAoyvD0c4hirW9Sf8chVrT3nnZ+aegD6YflHg
 aQjvshg7y8ciKqfSQO2kSA/F/5z0dQcs/63pMTDFv0mrJG5pDi6/EzAez8e20yzO
 Iyl8O4aQleTyY7Hr3mfiUCT305xU9+cgGEmRquKyI1cPNr4ITuuZBQSlanoPDXdk
 mLoaK7W+RC78DWVNfmFBvj3ewrExQl/aQZ8MEaFx2tRde9PN8htfvc6YT8aDxHBZ
 UcBrnc8BjYRtvU75yOmSgaDEg4zfTpwohzlNBcB3qaZbQmOanRiXzmN7o13HCZRr
 MWd1ELHX4PFhrEOsEsfB0G+ep6hsEyhlVlbHPij5u8lY3JA+iLM=
 =rHbt
 -----END PGP SIGNATURE-----

Merge tag 'mips_fixes_4.15_2' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/mips

Pull MIPS fixes from James Hogan:
 "Some final MIPS fixes for 4.15, including important build fixes and a
  MAINTAINERS update:

   - Add myself as MIPS co-maintainer.

   - Fix various all*config build failures (particularly as a result of
     switching the default MIPS platform to the "generic" platform).

   - Fix GCC7 build failures (duplicate const and questionable calls to
     missing __multi3 intrinsic on mips64r6).

   - Fix warnings when CPU Idle is enabled (4.14).

   - Fix AR7 serial output (since 3.17).

   - Fix ralink platform_get_irq error checking (since 3.12)"

* tag 'mips_fixes_4.15_2' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/mips:
  MAINTAINERS: Add James as MIPS co-maintainer
  MIPS: Fix undefined reference to physical_memsize
  MIPS: Implement __multi3 for GCC7 MIPS64r6 builds
  MIPS: mm: Fix duplicate "const" on insn_table_MM
  MIPS: CM: Drop WARN_ON(vp != 0)
  MIPS: ralink: Fix platform_get_irq's error checking
  MIPS: Fix CPS SMP NS16550 UART defaults
  MIPS: BCM47XX Avoid compile error with MIPS allnoconfig
  MIPS: RB532: Avoid undefined mac_pton without GENERIC_NET_UTILS
  MIPS: RB532: Avoid undefined early_serial_setup() without SERIAL_8250_CONSOLE
  MIPS: ath25: Avoid undefined early_serial_setup() without SERIAL_8250_CONSOLE
  MIPS: AR7: ensure the port type's FCR value is used
2018-01-20 11:37:00 -08:00
Christian Borntraeger 35b3fde620 KVM: s390: wire up bpb feature
The new firmware interfaces for branch prediction behaviour changes
are transparently available for the guest. Nevertheless, there is
new state attached that should be migrated and properly resetted.
Provide a mechanism for handling reset, migration and VSIE.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
[Changed capability number to 152. - Radim]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-01-20 17:30:47 +01:00
Radim Krčmář 29d24e3f3d Add PPC KVM ioctl to report vulnerability and workaround status to userspace.
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJaYXNEAAoJEJ2a6ncsY3Gf3FoH/0kAN3QcgYRN/0qgOmHpVBmL
 CtcrOEnw2vp2HwRXsCdww7PcGjsXg3S2vlZT0SDJiX2GJKBsV01Uh4xKZ0ENqdCM
 qq4EzP8g5ceHfp8gtpKdYQHM3STTBlU9XvHmbaCp5tRLQiI2cL/68cDNvRibJKOd
 WrXx404MNJiHJ3/WMJ/8kJqzFnySdLba5icuA26xfJA8Ur+NN8ScfxOULJLBjZ64
 c+9kTtBvhqUU4cl1ofVa9c8CoEU7UbGh5U8osTzwKcDkJ2vjmKHEoHiF+lESOejk
 Bs2E7hj4I+S2ihnthy94gYnxGX9TIVlbnbVl3Mj4uDRrbjR9MrF42VJjwZjPlxk=
 =FPrd
 -----END PGP SIGNATURE-----

Merge tag 'kvm-ppc-cve-4.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc

Add PPC KVM ioctl to report vulnerability and workaround status to userspace.
2018-01-20 17:29:00 +01:00
David S. Miller 8565d26bcb Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
The BPF verifier conflict was some minor contextual issue.

The TUN conflict was less trivial.  Cong Wang fixed a memory leak of
tfile->tx_array in 'net'.  This is an skb_array.  But meanwhile in
net-next tun changed tfile->tx_arry into tfile->tx_ring which is a
ptr_ring.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19 22:59:33 -05:00
Alexei Starovoitov 1391040b65 Merge branch 'bpf-misc-improvements'
Daniel Borkmann says:

====================
This series adds various misc improvements to BPF: detection
of BPF helper definition misconfiguration for mem/size argument
pairs, csum_diff helper also for XDP, various test cases,
removal of the recently added pure_initcall(), restriction
of the jit sysctls to cap_sys_admin for initns, a minor size
improvement for x86 jit in alu ops, output of complexity limit
to verifier log and last but not least having the event output
more flexible with moving to const_size_or_zero type.

Thanks!
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-19 18:37:02 -08:00
Daniel Borkmann 1728a4f2ad bpf: move event_output to const_size_or_zero for xdp/skb as well
Similar rationale as in a60dd35d2e ("bpf: change bpf_perf_event_output
arg5 type to ARG_CONST_SIZE_OR_ZERO"), change the type to CONST_SIZE_OR_ZERO
such that we can better deal with optimized code. No changes needed in
bpf_event_output() as it can also deal with 0 size entirely (e.g. as only
wake-up signal with empty frame in perf RB, or packet dumps w/o meta data
as another such possibility).

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-19 18:37:00 -08:00
Daniel Borkmann 4bd95f4b99 bpf: add upper complexity limit to verifier log
Given the limit could potentially get further adjustments in the
future, add it to the log so it becomes obvious what the current
limit is w/o having to check the source first. This may also be
helpful for debugging complexity related issues on kernels that
backport from upstream.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-19 18:37:00 -08:00
Daniel Borkmann de0a444dda bpf, x86: small optimization in alu ops with imm
For the BPF_REG_0 (BPF_REG_A in cBPF, respectively), we can use
the short form of the opcode as dst mapping is on eax/rax and
thus save a byte per such operation. Added to add/sub/and/or/xor
for 32/64 bit when K immediate is used. There may be more such
low-hanging fruit to add in future as well.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-19 18:37:00 -08:00
Daniel Borkmann 2e4a30983b bpf: restrict access to core bpf sysctls
Given BPF reaches far beyond just networking these days, it was
never intended to allow setting and in some cases reading those
knobs out of a user namespace root running without CAP_SYS_ADMIN,
thus tighten such access.

Also the bpf_jit_enable = 2 debugging mode should only be allowed
if kptr_restrict is not set since it otherwise can leak addresses
to the kernel log. Dump a note to the kernel log that this is for
debugging JITs only when enabled.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-19 18:37:00 -08:00
Daniel Borkmann fa9dd599b4 bpf: get rid of pure_initcall dependency to enable jits
Having a pure_initcall() callback just to permanently enable BPF
JITs under CONFIG_BPF_JIT_ALWAYS_ON is unnecessary and could leave
a small race window in future where JIT is still disabled on boot.
Since we know about the setting at compilation time anyway, just
initialize it properly there. Also consolidate all the individual
bpf_jit_enable variables into a single one and move them under one
location. Moreover, don't allow for setting unspecified garbage
values on them.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-19 18:37:00 -08:00
Daniel Borkmann 87c1793b1b bpf: add couple of test cases for div/mod by zero
Add couple of missing test cases for eBPF div/mod by zero to the
new test_verifier prog runtime feature. Also one for an empty prog
and only exit.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-19 18:37:00 -08:00
Daniel Borkmann fcd1c91771 bpf: add couple of test cases for signed extended imms
Add a couple of test cases for interpreter and JIT that are
related to an issue we faced some time ago in Cilium [1],
which is fixed in LLVM with commit e53750e1e086 ("bpf: fix
bug on silently truncating 64-bit immediate").

Test cases were run-time checking kernel to behave as intended
which should also provide some guidance for current or new
JITs in case they should trip over this. Added for cBPF and
eBPF.

  [1] https://github.com/cilium/cilium/pull/2162

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-19 18:36:59 -08:00
Daniel Borkmann 205c380778 bpf: add csum_diff helper to xdp as well
Useful for porting cls_bpf programs w/o increasing program
complexity limits much at the same time, so add the helper
to XDP as well.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-19 18:36:59 -08:00
Daniel Borkmann 9013341594 bpf, verifier: detect misconfigured mem, size argument pair
I've seen two patch proposals now for helper additions that used
ARG_PTR_TO_MEM or similar in reg_X but no corresponding ARG_CONST_SIZE
in reg_X+1. Verifier won't complain in such case, but it will omit
verifying the memory passed to the helper thus ending up badly.
Detect such buggy helper function signature and bail out during
verification rather than finding them through review.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-19 18:36:59 -08:00
Jesper Dangaard Brouer 417f1d9f21 samples/bpf: xdp_monitor include cpumap tracepoints in monitoring
The xdp_redirect_cpu sample have some "builtin" monitoring of the
tracepoints for xdp_cpumap_*, but it is practical to have an external
tool that can monitor these transpoint as an easy way to troubleshoot
an application using XDP + cpumap.

Specifically I need such external tool when working on Suricata and
XDP cpumap redirect. Extend the xdp_monitor tool sample with
monitoring of these xdp_cpumap_* tracepoints.  Model the output format
like xdp_redirect_cpu.

Given I needed to handle per CPU decoding for cpumap, this patch also
add per CPU info on the existing monitor events.  This resembles part
of the builtin monitoring output from sample xdp_rxq_info.  Thus, also
covering part of that sample in an external monitoring tool.

Performance wise, the cpumap tracepoints uses bulking, which cause
them to have very little overhead.  Thus, they are enabled by default.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-20 02:10:55 +01:00
Linus Torvalds 8dd903d2cf SCSI fixes on 20180119
One fix for SAS attached SATA CD-ROMs.  It turns out that the libata
 handling of CD devices relies on the SCSI error handler, so disable
 async aborts (which don't start the error handler) for these devices.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJaBAABCABEFiEERylv2ugWAYZdTacQBWvsc5kRk3gFAlpibTgmHGphbWVzLmJv
 dHRvbWxleUBoYW5zZW5wYXJ0bmVyc2hpcC5jb20ACgkQBWvsc5kRk3h26Q//dPkL
 pddipZ2IdnEZ3FPqI6hHgxevoOzhLvl7gNHXnGnxMQT6A65AAVFm1bPaBAgVH+4g
 aF2pQ9rjY0o+fszpUSIM+W8YpQFOIVEBsIzFbHubHRuKZ8cq6yPFJyWIySmOZU1S
 /6UjIeyYjjS0vySU5VoCPVdRupHDRSIw0Hgp1xuEslt5eV2clPFbZC9Ec1+N0wmX
 Iv1JmvRO4MZrG0nmSp30kln1SvL9zwz1TkfIYI0y/YnBdFHi/EBGe/9D/hBrZFkk
 uGlpVJX6jtqIIlC/4KIVXPeOCgdEWGpFeC39sQrDt7/H+hg9UkXcyzNT6CsE7gsg
 uh1oyBE7+xIgc6oVoEcGfK/NZ2ALwicV4sbIk7kh4K0GJIW5FY3j53DzSe8FWLm5
 CQGLEIGcqild9SHtOQmp5i3yX3nVHK8cM6+icbtvMTKNCnVimQM408mz0r418RpE
 BeKAzXZ5K7xUxZUFTLExbMQe2aOq0cy5wwZzPJEYqH4dko9Q7YbUISGWuXoSQ46Q
 mjAR4j/B2LRQRAthzf2JHmhmKe6qGqV0AH6yB020FhcECcz1pAiDm30oYcRXCCSp
 CVElL5xFtdEQyJLz4t1cXfncmwmZ4h4pm0naO3yOWKj4O1FWFIRXfObaQ44hAdn2
 brxu+ewMIGyBSy1d8erhjzvzl1rRIw4qLUCQMHc=
 =g/Cs
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fix from James Bottomley:
 "One fix for SAS attached SATA CD-ROMs. It turns out that the libata
  handling of CD devices relies on the SCSI error handler, so disable
  async aborts (which don't start the error handler) for these devices"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: libsas: Disable asynchronous aborts for SATA devices
2018-01-19 15:20:00 -08:00
Linus Torvalds 1cf55613a6 All stable@ fixes:
- Fix DM thinp btree corruption seen when inserting a new key/value pair
   into a full root node.
 
 - Fix DM thinp btree removal deadlock due to artificially low number of
   allowed concurrent locks allowed.
 
 - Fix possible DM crypt corruption if kernel keyring service is used.
   Only affects ciphers using following IVs: essiv, lmk and tcw.
 
 - Two DM crypt device initialization error checking fixes.
 
 - Fix DM integrity to allow use of async ciphers that require DMA.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJaYmmiAAoJEMUj8QotnQNa55QIANbqG20YBvLhejdIvoOC4npL
 Lk6PGfAGjftUBocnaua3BMaCCHnPF0DCmVqEZ8Bb07eIUoOygrAbEgJ/o1GjA1Ku
 6oLmw+yMls0cu3Qk0aOF/5Z/TYl/fNgYAH9g7oMy4g4DNeD9nEKdRsz/DF2JTVWH
 t0AUnmMHFLXP001TBW5k4SnC0jqM6JCwd/dn9rlSW6oE5mNLo22+A0vPdpW2tVNW
 odP2mXAlLQUDTrn9wtAKp7yBJKqbIO56Odv2hQ48VOBjSTB/GMue5OY2tibfS2ci
 /qo4OzVwzBDAB+bU8MN6AP6MqimvyPUeCeyH20X8Zzbmsd2Mwz+qd+WOsHrNiNg=
 =R3/E
 -----END PGP SIGNATURE-----

Merge tag 'for-4.15/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper fixes from Mike Snitzer:
 "All fixes marked for stable:

   - Fix DM thinp btree corruption seen when inserting a new key/value
     pair into a full root node.

   - Fix DM thinp btree removal deadlock due to artificially low number
     of allowed concurrent locks allowed.

   - Fix possible DM crypt corruption if kernel keyring service is used.
     Only affects ciphers using following IVs: essiv, lmk and tcw.

   - Two DM crypt device initialization error checking fixes.

   - Fix DM integrity to allow use of async ciphers that require DMA"

* tag 'for-4.15/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm crypt: fix error return code in crypt_ctr()
  dm crypt: wipe kernel key copy after IV initialization
  dm integrity: don't store cipher request on the stack
  dm crypt: fix crash by adding missing check for auth key size
  dm btree: fix serious bug in btree_split_beneath()
  dm thin metadata: THIN_MAX_CONCURRENT_LOCKS should be 6
2018-01-19 15:16:49 -08:00
Daniel Borkmann 05526361af Merge branch 'bpf-lpm-get-next-key'
Yonghong Song says:

====================
This patch set implements MAP_GET_NEXT_KEY command for LPM_TRIE map.
This command is really useful for key enumeration, and for key deletion
if what keys in the trie are unknown.

Patch #1 implements the functionality in the kernel and patch #2
adds a test case in tools/testing/selftests/bpf.
====================

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-19 23:26:42 +01:00
Yonghong Song 8c417dc15f tools/bpf: add a testcase for MAP_GET_NEXT_KEY command of LPM_TRIE map
A test case is added in tools/testing/selftests/bpf/test_lpm_map.c
for MAP_GET_NEXT_KEY command. A four node trie, which
is described in kernel/bpf/lpm_trie.c, is built and the
MAP_GET_NEXT_KEY results are checked.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-19 23:26:41 +01:00
Yonghong Song b471f2f1de bpf: implement MAP_GET_NEXT_KEY command for LPM_TRIE map
Current LPM_TRIE map type does not implement MAP_GET_NEXT_KEY
command. This command is handy when users want to enumerate
keys. Otherwise, a different map which supports key
enumeration may be required to store the keys. If the
map data is sparse and all map data are to be deleted without
closing file descriptor, using MAP_GET_NEXT_KEY to find
all keys is much faster than enumerating all key space.

This patch implements MAP_GET_NEXT_KEY command for LPM_TRIE map.
If user provided key pointer is NULL or the key does not have
an exact match in the trie, the first key will be returned.
Otherwise, the next key will be returned.

In this implemenation, key enumeration follows a postorder
traversal of internal trie. More specific keys
will be returned first than less specific ones, given
a sequence of MAP_GET_NEXT_KEY syscalls.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-19 23:26:41 +01:00
Shuah Khan b7bcc0bbb8 selftests: bpf: update .gitignore with missing generated files
Update .gitignore with missing generated files.

Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-19 23:20:48 +01:00
Roman Gushchin a55aaf6db0 bpftool: recognize BPF_MAP_TYPE_CPUMAP maps
Add BPF_MAP_TYPE_CPUMAP map type to the list
of map type recognized by bpftool and define
corresponding text representation.

Signed-off-by: Roman Gushchin <guro@fb.com>
Cc: Quentin Monnet <quentin.monnet@netronome.com>
Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Acked-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-19 23:16:52 +01:00
David S. Miller 85831e56a1 Merge branch 'dsa-mv88e6xxx-ATU-VTU-irq-fixes'
Andrew Lunn says:

====================
ATU and VTU irq fixes

Further testing and code review found two sets of bugs.

Core review found a cut/paste error in the irq setup code.

A board which does not have an interrupt line from the switch to the
SoC, and experiancing an EPROBE_DEFER throw a splat when the ATU irq
was freed but never registered.

v2: Fix typ0 chip->chip->vtu_prob_irq to chip->vtu_prob_irq
    0-day compile testing.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19 15:57:03 -05:00
Andrew Lunn ae14cafc93 net: dsa: mv88e6xxx: Free ATU/VTU irq only when there is chip irq
We only register the ATU and VTU irq when we have a chip level IRQ.
In the error path, we should only attempt to remove the ATU and VTU
irq if we also have a chip level IRQ.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19 15:57:02 -05:00
Andrew Lunn 9b662a3ec2 net: dsa: mv88e6xxx: Return error from irq_find_mapping()
Fix a cut/paste error. When irq_find_mapping() returns an error for
the ATU or VTU interrupt, return that error, not the value of
chip->device_irq.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19 15:57:02 -05:00
David S. Miller 7677fd01cc Merge branch 'net-sched-cls-add-extack-support'
Alexander Aring says:

====================
net: sched: cls: add extack support

this patch adds extack support for TC classifier subsystem. The first
patch fixes some code style issues for this patch series pointed out
by checkpatch. The other patches until the last one prepares extack
handling for the TC classifier subsystem and handle generic extack
errors.

The last patch is an example for u32 classifier to add extack support
inside the callbacks delete and change. There exists a init callback as
well, but most classifier implementation run a kalloc() once to allocate
something. Not necessary _yet_ to add extack support now.

- Alex

Cc: David Ahern <dsahern@gmail.com>

changes since v3:
 - fix accidentally move of config option mismatch message in PATCH 2/8
   correct one is 4/8, detected by kbuildbot (Thank you)
 - Removed patch "net: sched: cls: add extack support for tc_setup_cb_call"
   PATCH 7/8 in version v2 as suggested by Jakub Kicinski (Thank you)
 - changed NL_SET_ERR_MSG to NL_SET_ERR_MSG_MOD as suggested by Jakub Kicinski
   in u32 cls (Thank You)
 - Removed text from cover letter that I was waiting for Jiri's Patches as
   detected by Jamal Hadi Salim (Thank you).

changes since v2:
 - rebased on Jiri's patches (Thank you)
 - several spelling fixes pointed out by Cong Wang (Thank you)
 - several spelling fixes pointed out by David Ahern (Thank you)
 - use David Ahern recommendation if config option is mismatch, but
   combine it with Cong Wang recommendation to put config name into it
   (Thank you)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19 15:52:52 -05:00
Alexander Aring 4b981dbc22 net: sched: cls_u32: add extack support
This patch adds extack support for the u32 classifier as example for
delete and init callback.

Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Alexander Aring <aring@mojatatu.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19 15:52:51 -05:00
Alexander Aring 1057c55f6b net: sched: cls: add extack support for tcf_change_indev
This patch adds extack handling for the tcf_change_indev function which
is common used by TC classifier implementations.

Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Alexander Aring <aring@mojatatu.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19 15:52:51 -05:00
Alexander Aring 571acf2106 net: sched: cls: add extack support for delete callback
This patch adds extack support for classifier delete callback api. This
prepares to handle extack support inside each specific classifier
implementation.

Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Alexander Aring <aring@mojatatu.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19 15:52:51 -05:00
Alexander Aring 50a561900e net: sched: cls: add extack support for tcf_exts_validate
The tcf_exts_validate function calls the act api change callback. For
preparing extack support for act api, this patch adds the extack as
parameter for this function which is common used in cls implementations.

Furthermore the tcf_exts_validate will call action init callback which
prepares the TC action subsystem for extack support.

Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Alexander Aring <aring@mojatatu.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19 15:52:51 -05:00
Alexander Aring 7306db38a6 net: sched: cls: add extack support for change callback
This patch adds extack support for classifier change callback api. This
prepares to handle extack support inside each specific classifier
implementation.

Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Alexander Aring <aring@mojatatu.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19 15:52:51 -05:00
Alexander Aring c35a4acc29 net: sched: cls_api: handle generic cls errors
This patch adds extack support for generic cls handling. The extack
will be set deeper to each called function which is not part of netdev
core api.

Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Alexander Aring <aring@mojatatu.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19 15:52:51 -05:00
Alexander Aring 8865fdd4e1 net: sched: cls: fix code style issues
This patch changes some code style issues pointed out by checkpatch
inside the TC cls subsystem.

Signed-off-by: Alexander Aring <aring@mojatatu.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19 15:52:51 -05:00
Yuval Mintz fd5204cdfb mlxsw: spectrum: Upper-bound supported FW version
During initialization the driver checks whether the flashed FW image
suits its requirements by checking that it's sufficiently new.
However, there's only a weak backward compatibility scheme that is
actually guaranteed by the FW, so driver must also upper bound the
version to prevent compatibility issues between current driver and some
possible future fw.

Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19 15:45:56 -05:00
David S. Miller ea8de47140 Merge branch 'nfp-devlink-capabilities-extensions-and-updates'
Jakub Kicinski says:

====================
nfp: devlink, capabilities extensions and updates

This series starts with an improvement to the usability of the device
memory accessors (CPP transactions).  Next few patches are devoted to
fixing the devlink locking.  After recent patches for mlxsw the locking
scheme of devlink ops has to be reworked.  Following patches improve
NFP code dealing with "representors", and expands the error message
printed when driver has no support for loaded FW.

Second part of the series is focused on vNIC capabilities read from
vNIC control memory (often referred to as "BAR0" for historical reasons).
TLV capability format is established and immediately made use of.  The
next patches rework parsing of features for control vNIC which allows
apps to mask out features they don't want enabled.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19 15:44:19 -05:00
Jakub Kicinski 81bd5ded60 nfp: bpf: disable all ctrl vNIC capabilities
BPF firmware currently exposes IRQ moderation capability.
The driver will make use of it by default, inserting 50 usec
delay to every control message exchange.  This cuts the number
of messages per second we can exchange by almost half.

None of the other capabilities make much sense for BPF control
vNIC, either.  Disable them all.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19 15:44:19 -05:00
Jakub Kicinski 78a0a65f40 nfp: allow apps to disable ctrl vNIC capabilities
Most vNIC capabilities are netdev related.  It makes no sense
to initialize them and waste FW resources.  Some are even
counter-productive, like IRQ moderation, which will slow
down exchange of control messages.

Add to nfp_app a mask of enabled control vNIC capabilities
for apps to use.  Make flower and BPF enable all capabilities
for now.  No functional changes.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19 15:44:18 -05:00
Jakub Kicinski 545bfa7a6a nfp: split reading capabilities out of nfp_net_init()
nfp_net_init() is a little long and we are about to add more
code to reading capabilties.  Move the capability reading,
parsing and validating out.  Only actual initialization
will stay in nfp_net_init().

No functional changes.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19 15:44:18 -05:00
Jakub Kicinski 527d7d1b99 nfp: read mailbox address from TLV caps
Allow specifying alternative vNIC mailbox location in TLV caps.
This way we can size the mailbox to the needs and not necessarily
waste 512B of ctrl memory space.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19 15:44:18 -05:00
Jakub Kicinski ce991ab666 nfp: read ME frequency from vNIC ctrl memory
PCIe island clock frequency is used when converting coalescing
parameters from usecs to NFP timestamps.  Most chips don't run
at 1200MHz, allow FW to provide us with the real frequency.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19 15:44:18 -05:00
Jakub Kicinski 73a0329b05 nfp: add TLV capabilities to the BAR
NFP is entirely programmable, including the PCI data interface.
Using a fixed control BAR layout certainly makes implementations
easier, but require careful considerations when space is allocated.
Once BAR area is allocated to one feature nothing else can use it.
Allocating space statically also requires it to be sized upfront,
which leads to either unnecessary limitation or wastage.

We currently have a 32bit capability word defined which tells drivers
which application FW features are supported.   Most of the bits
are exhausted.  The same bits are also reused for enabling specific
features.  Bulk of capabilities don't have a need for an enable bit,
however, leading to confusion and wastage.

TLVs seems like a better fit for expressing capabilities of applications
running on programmable hardware.

This patch leaves the front of the BAR as is, and declares a TLV
capability start at offset 0x58.  Most of the space up to 0x0d90
is already allocated, but the used space can be wrapped with RESERVED
TLVs.  E.g.:

Address    Type         Length
 0x0058    RESERVED      0xe00  /* Wrap basic structures */
 0x0e5c    FEATURE_A     0x004
 0x0e64    FEATURE_B     0x004
 0x0e6c    RESERVED      0x990  /* Wrap qeueue stats */
 0x1800    FEATURE_C     0x100

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19 15:44:18 -05:00
Jakub Kicinski bf245f1fb0 nfp: improve app not found message
When driver app matching loaded FW is not found users are faced with:

   nfp: failed to find app with ID 0x%02x

This message does not properly explain that matching driver code is
either not built into the driver or the driver is too old.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19 15:44:18 -05:00
Jakub Kicinski 3eb47dfca0 nfp: protect each repr pointer individually with RCU
Representors are grouped in sets by type.  Currently the whole
sets are under RCU protection, but individual representor pointers
are not.  This causes some inconveniences when representors have
to be destroyed, because we have to allocate new sets to remove
any representors.  Protect the individual pointers with RCU.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19 15:44:18 -05:00