Having a pure_initcall() callback just to permanently enable BPF
JITs under CONFIG_BPF_JIT_ALWAYS_ON is unnecessary and could leave
a small race window in future where JIT is still disabled on boot.
Since we know about the setting at compilation time anyway, just
initialize it properly there. Also consolidate all the individual
bpf_jit_enable variables into a single one and move them under one
location. Moreover, don't allow for setting unspecified garbage
values on them.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
If bpf_needs_clear_a() returns true, only actually clear it if it is
ever used. If it is not used, we don't save and restore it, so the
clearing has the nasty side effect of clobbering caller state.
Also, don't emit stack pointer adjustment instructions if the
adjustment amount is zero.
Signed-off-by: David Daney <david.daney@cavium.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Steven J. Hill <steven.hill@cavium.com>
Cc: linux-mips@linux-mips.org
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/15745/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The SKB vlan_tci and queue_mapping fields are unsigned, don't sign
extend these in the BPF JIT. In the vlan_tci case, the value gets
masked so the change is not needed for correctness, but do it anyway
for agreement with the types defined in struct sk_buff.
Signed-off-by: David Daney <david.daney@cavium.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Steven J. Hill <steven.hill@cavium.com>
Cc: linux-mips@linux-mips.org
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/15746/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This let's us pass some additional "modprobe test-bpf" tests with JIT
enabled.
Reuse the code for SKF_AD_IFINDEX, but substitute the offset and size
of the "type" field.
Signed-off-by: David Daney <david.daney@cavium.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Steven J. Hill <steven.hill@cavium.com>
Cc: linux-mips@linux-mips.org
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/15744/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Kernel source files need not include <linux/kconfig.h> explicitly
because the top Makefile forces to include it with:
-include $(srctree)/include/linux/kconfig.h
This commit removes explicit includes except the following:
* arch/s390/include/asm/facilities_src.h
* tools/testing/radix-tree/linux/kernel.h
These two are used for host programs.
Link: http://lkml.kernel.org/r/1473656164-11929-1-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull MIPS updates from Ralf Baechle:
"This is the main pull request for MIPS for 4.8. Also includes is a
minor SSB cleanup as SSB code traditionally is merged through the MIPS
tree:
ATH25:
- MIPS: Add default configuration for ath25
Boot:
- For zboot, copy appended dtb to the end of the kernel
- store the appended dtb address in a variable
BPF:
- Fix off by one error in offset allocation
Cobalt code:
- Fix typos
Core code:
- debugfs_create_file returns NULL on error, so don't use IS_ERR for
testing for errors.
- Fix double locking issue in RM7000 S-cache code. This would only
affect RM7000 ARC systems on reboot.
- Fix page table corruption on THP permission changes.
- Use compat_sys_keyctl for 32 bit userspace on 64 bit kernels.
David says, there are no compatibility issues raised by this fix.
- Move some signal code around.
- Rewrite r4k count/compare clockevent device registration such that
min_delta_ticks/max_delta_ticks files are guaranteed to be
initialized.
- Only register r4k count/compare as clockevent device if we can
assume the clock to be constant.
- Fix MSA asm warnings in control reg accessors
- uasm and tlbex fixes and tweaking.
- Print segment physical address when EU=1.
- Define AT_VECTOR_SIZE_ARCH for ARCH_DLINFO.
- CP: Allow booting by VP other than VP 0
- Cache handling fixes and optimizations for r4k class caches
- Add hotplug support for R6 processors
- Cleanup hotplug bits in kconfig
- traps: return correct si code for accessing nonmapped addresses
- Remove cpu_has_safe_index_cacheops
Lantiq:
- Register IRQ handler for virtual IRQ number
- Fix EIU interrupt loading code
- Use the real EXIN count
- Fix build error.
Loongson 3:
- Increase HPET_MIN_PROG_DELTA and decrease HPET_MIN_CYCLES
Octeon:
- Delete built-in DTB pruning code for D-Link DSR-1000N.
- Clean up GPIO definitions in dlink_dsr-1000n.dts.
- Add more LEDs to the DSR-100n DTS
- Fix off by one in octeon_irq_gpio_map()
- Typo fixes
- Enable SATA by default in cavium_octeon_defconfig
- Support readq/writeq()
- Remove forced mappings of USB interrupts.
- Ensure DMA descriptors are always in the low 4GB
- Improve USB reset code for OCTEON II.
Pistachio:
- Add maintainers entry for pistachio SoC Support
- Remove plat_setup_iocoherency
Ralink:
- Fix pwm UART in spis group pinmux.
SSB:
- Change bare unsigned to unsigned int to suit coding style
Tools:
- Fix reloc tool compiler warnings.
Other:
- Delete use of ARCH_WANT_OPTIONAL_GPIOLIB"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (61 commits)
MIPS: mm: Fix definition of R6 cache instruction
MIPS: tools: Fix relocs tool compiler warnings
MIPS: Cobalt: Fix typo
MIPS: Octeon: Fix typo
MIPS: Lantiq: Fix build failure
MIPS: Use CPHYSADDR to implement mips32 __pa
MIPS: Octeon: Dlink_dsr-1000n.dts: add more leds.
MIPS: Octeon: Clean up GPIO definitions in dlink_dsr-1000n.dts.
MIPS: Octeon: Delete built-in DTB pruning code for D-Link DSR-1000N.
MIPS: store the appended dtb address in a variable
MIPS: ZBOOT: copy appended dtb to the end of the kernel
MIPS: ralink: fix spis group pinmux
MIPS: Factor o32 specific code into signal_o32.c
MIPS: non-exec stack & heap when non-exec PT_GNU_STACK is present
MIPS: Use per-mm page to execute branch delay slot instructions
MIPS: Modify error handling
MIPS: c-r4k: Use SMP calls for CM indexed cache ops
MIPS: c-r4k: Avoid small flush_icache_range SMP calls
MIPS: c-r4k: Local flush_icache_range cache op override
MIPS: c-r4k: Split r4k_flush_kernel_vmap_range()
...
The use of config_enabled() against config options is ambiguous. In
practical terms, config_enabled() is equivalent to IS_BUILTIN(), but the
author might have used it for the meaning of IS_ENABLED(). Using
IS_ENABLED(), IS_BUILTIN(), IS_MODULE() etc. makes the intention
clearer.
This commit replaces config_enabled() with IS_ENABLED() where possible.
This commit is only touching bool config options.
I noticed two cases where config_enabled() is used against a tristate
option:
- config_enabled(CONFIG_HWMON)
[ drivers/net/wireless/ath/ath10k/thermal.c ]
- config_enabled(CONFIG_BACKLIGHT_CLASS_DEVICE)
[ drivers/gpu/drm/gma500/opregion.c ]
I did not touch them because they should be converted to IS_BUILTIN()
in order to keep the logic, but I was not sure it was the authors'
intention.
Link: http://lkml.kernel.org/r/1465215656-20569-1-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Stas Sergeev <stsp@list.ru>
Cc: Matt Redfearn <matt.redfearn@imgtec.com>
Cc: Joshua Kinard <kumba@gentoo.org>
Cc: Jiri Slaby <jslaby@suse.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: "Dmitry V. Levin" <ldv@altlinux.org>
Cc: yu-cheng yu <yu-cheng.yu@intel.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Will Drewry <wad@chromium.org>
Cc: Nikolay Martynov <mar.kolya@gmail.com>
Cc: Huacai Chen <chenhc@lemote.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com>
Cc: Rafal Milecki <zajec5@gmail.com>
Cc: James Cowgill <James.Cowgill@imgtec.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Alex Smith <alex.smith@imgtec.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Qais Yousef <qais.yousef@imgtec.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Mikko Rapeli <mikko.rapeli@iki.fi>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Brian Norris <computersforpeace@gmail.com>
Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Cc: "Luis R. Rodriguez" <mcgrof@do-not-panic.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Roland McGrath <roland@hack.frob.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Kalle Valo <kvalo@qca.qualcomm.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Tony Wu <tung7970@gmail.com>
Cc: Huaitong Han <huaitong.han@intel.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrea Gelmini <andrea.gelmini@gelma.net>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Rabin Vincent <rabin@rab.in>
Cc: "Maciej W. Rozycki" <macro@imgtec.com>
Cc: David Daney <david.daney@cavium.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dan Carpenter reported [1] a static checker warning that ctx->offsets[]
may be accessed off by one from build_body(), since it's allocated with
fp->len * sizeof(*ctx.offsets) as length. The cBPF arm and ppc code
doesn't have this issue as claimed, so only mips seems to be affected and
should like most other JITs allocate with fp->len + 1. A few number of
JITs (x86, sparc, arm64) handle this differently, where they only require
fp->len array elements.
[1] http://www.spinics.net/lists/mips/msg64193.html
Fixes: c6610de353 ("MIPS: net: Add BPF JIT")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: ast@kernel.org
Cc: linux-mips@linux-mips.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13814/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The SKF_AD_ALU_XOR_X ancillary is not like the other ancillary data
instructions since it XORs A with X while all the others replace A with
some loaded value. All the BPF JITs fail to clear A if this is used as
the first instruction in a filter. This was found using american fuzzy
lop.
Add a helper to determine if A needs to be cleared given the first
instruction in a filter, and use this in the JITs. Except for ARM, the
rest have only been compile-tested.
Fixes: 3480593131 ("net: filter: get rid of BPF_S_* enum")
Signed-off-by: Rabin Vincent <rabin@rab.in>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
As we need to add further flags to the bpf_prog structure, lets migrate
both bools to a bitfield representation. The size of the base structure
(excluding insns) remains unchanged at 40 bytes.
Add also tags for the kmemchecker, so that it doesn't throw false
positives. Even in case gcc would generate suboptimal code, it's not
being accessed in performance critical paths.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit introduces BPF ASM helpers for MIPS and MIPS64 kernels.
The purpose of this patch is to twofold:
1) We are now able to handle negative offsets instead of either
falling back to the interpreter or to simply not do anything and
bail out.
2) Optimize reads from the packet header instead of calling the C
helpers
Because of this patch, we are now able to get rid of quite a bit of
code in the JIT generation process by using MIPS optimized assembly
code. The new assembly code makes the test_bpf testsuite happy with
all 60 test passing successfully compared to the previous
implementation where 2 tests were failing.
Doing some basic analysis in the results between the old
implementation and the new one we can obtain the following
summary running current mainline on an ER8 board (+/- 30us delta is
ignored to prevent noise from kernel scheduling or IRQ latencies):
Summary: 22 tests are faster, 7 are slower and 47 saw no improvement
with the most notable improvement being the tcpdump tests. The 7 tests
that seem to be a bit slower is because they all follow the slow path
(bpf_internal_load_pointer_neg_helper) which is meant to be slow so
that's not a problem.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: netdev@vger.kernel.org
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: netdev@vger.kernel.org
Patchwork: http://patchwork.linux-mips.org/patch/10530/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Use the BPF register names instead of the arch register names to
document how the ABI is structured.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: netdev@vger.kernel.org
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: linux-kernel@vger.kernel.org
Patchwork: http://patchwork.linux-mips.org/patch/10529/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The registers will be used by a subsequent patch introducing
ASM helpers so move them to a common header.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: netdev@vger.kernel.org
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: http://patchwork.linux-mips.org/patch/10528/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The RSZIE was used to determine the register width but MIPS
already defines SZREG so use that instead.
Cc: netdev@vger.kernel.org
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: netdev@vger.kernel.org
Patchwork: http://patchwork.linux-mips.org/patch/10526/
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Move the two scratch registers from s0 and s1 to t4 and t5 in order
to free up some callee-saved registers. We will use these callee-saved
registers to store some permanent data on them in a subsequent patch.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: netdev@vger.kernel.org
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: http://patchwork.linux-mips.org/patch/10525/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Fix stack pointer offset which could potentially corrupt
argument registers in the previous frame. The calculated offset
reflects the size of all the registers we need to preserve so there
is no need for this erroneous subtraction.
[ralf@linux-mips.org: Fixed conflict due to only applying this fix part
of the entire series as part of 4.1 fixes.]
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: netdev@vger.kernel.org
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: http://patchwork.linux-mips.org/patch/10527/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Nothing needs the module pointer any more, and the next patch will
call it from RCU, where the module itself might no longer exist.
Removing the arg is the safest approach.
This just codifies the use of the module_alloc/module_free pattern
which ftrace and bpf use.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: x86@kernel.org
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: linux-cris-kernel@axis.com
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: nios2-dev@lists.rocketboards.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: sparclinux@vger.kernel.org
Cc: netdev@vger.kernel.org
Remove optimize_div() from BPF_MOD | BPF_K case
since we don't know the dividend and fix the
emit_mod() by reading the mod operation result from HI register
Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org>
Reviewed-by: Markos Chandras <markos.chandras@imgtec.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull MIPS updates from Ralf Baechle:
"This is the MIPS pull request for the next kernel:
- Zubair's patch series adds CMA support for MIPS. Doing so it also
touches ARM64 and x86.
- remove the last instance of IRQF_DISABLED from arch/mips
- updates to two of the MIPS defconfig files.
- cleanup of how cache coherency bits are handled on MIPS and
implement support for write-combining.
- platform upgrades for Alchemy
- move MIPS DTS files to arch/mips/boot/dts/"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (24 commits)
MIPS: ralink: remove deprecated IRQF_DISABLED
MIPS: pgtable.h: Implement the pgprot_writecombine function for MIPS
MIPS: cpu-probe: Set the write-combine CCA value on per core basis
MIPS: pgtable-bits: Define the CCA bit for WC writes on Ingenic cores
MIPS: pgtable-bits: Move the CCA bits out of the core's ifdef blocks
MIPS: DMA: Add cma support
x86: use generic dma-contiguous.h
arm64: use generic dma-contiguous.h
asm-generic: Add dma-contiguous.h
MIPS: BPF: Add new emit_long_instr macro
MIPS: ralink: Move device-trees to arch/mips/boot/dts/
MIPS: Netlogic: Move device-trees to arch/mips/boot/dts/
MIPS: sead3: Move device-trees to arch/mips/boot/dts/
MIPS: Lantiq: Move device-trees to arch/mips/boot/dts/
MIPS: Octeon: Move device-trees to arch/mips/boot/dts/
MIPS: Add support for building device-tree binaries
MIPS: Create common infrastructure for building built-in device-trees
MIPS: SEAD3: Enable DEVTMPFS
MIPS: SEAD3: Regenerate defconfigs
MIPS: Alchemy: DB1300: Add touch penirq support
...
Fix:
arch/mips/net/bpf_jit.c: In function 'build_body':
arch/mips/net/bpf_jit.c:762:6: error: unused variable 'tmp'
cc1: all warnings being treated as errors
make[2]: *** [arch/mips/net/bpf_jit.o] Error 1
Seen when building mips:allmodconfig in -next since next-20140924.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
arch/mips/net/bpf_jit.c
drivers/net/can/flexcan.c
Both the flexcan and MIPS bpf_jit conflicts were cases of simple
overlapping changes.
Signed-off-by: David S. Miller <davem@davemloft.net>
This macro uses the capitalized UASM_* macros to emit 32 or 64-bit
instructions depending on the kernel configurations. This allows
us to remove a few CONFIG_64BIT ifdefs from the code.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7446/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Currently we have 2 pkt_type_offset functions doing the same thing and
spread across the architecture files. Remove those and replace them
with a PKT_TYPE_OFFSET macro helper which gets the constant value from a
zero sized sk_buff member right in front of the bitfield with offsetof.
This new offset marker does not change size of struct sk_buff.
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reported by Mikulas Patocka, kmemcheck currently barks out a
false positive since we don't have special kmemcheck annotation
for bitfields used in bpf_prog structure.
We currently have jited:1, len:31 and thus when accessing len
while CONFIG_KMEMCHECK enabled, kmemcheck throws a warning that
we're reading uninitialized memory.
As we don't need the whole bit universe for pages member, we
can just split it to u16 and use a bool flag for jited instead
of a bitfield.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With eBPF getting more extended and exposure to user space is on it's way,
hardening the memory range the interpreter uses to steer its command flow
seems appropriate. This patch moves the to be interpreted bytecode to
read-only pages.
In case we execute a corrupted BPF interpreter image for some reason e.g.
caused by an attacker which got past a verifier stage, it would not only
provide arbitrary read/write memory access but arbitrary function calls
as well. After setting up the BPF interpreter image, its contents do not
change until destruction time, thus we can setup the image on immutable
made pages in order to mitigate modifications to that code. The idea
is derived from commit 314beb9bca ("x86: bpf_jit_comp: secure bpf jit
against spraying attacks").
This is possible because bpf_prog is not part of sk_filter anymore.
After setup bpf_prog cannot be altered during its life-time. This prevents
any modifications to the entire bpf_prog structure (incl. function/JIT
image pointer).
Every eBPF program (including classic BPF that are migrated) have to call
bpf_prog_select_runtime() to select either interpreter or a JIT image
as a last setup step, and they all are being freed via bpf_prog_free(),
including non-JIT. Therefore, we can easily integrate this into the
eBPF life-time, plus since we directly allocate a bpf_prog, we have no
performance penalty.
Tested with seccomp and test_bpf testsuite in JIT/non-JIT mode and manual
inspection of kernel_page_tables. Brad Spengler proposed the same idea
via Twitter during development of this patch.
Joint work with Hannes Frederic Sowa.
Suggested-by: Brad Spengler <spender@grsecurity.net>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Kees Cook <keescook@chromium.org>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
clean up names related to socket filtering and bpf in the following way:
- everything that deals with sockets keeps 'sk_*' prefix
- everything that is pure BPF is changed to 'bpf_*' prefix
split 'struct sk_filter' into
struct sk_filter {
atomic_t refcnt;
struct rcu_head rcu;
struct bpf_prog *prog;
};
and
struct bpf_prog {
u32 jited:1,
len:31;
struct sock_fprog_kern *orig_prog;
unsigned int (*bpf_func)(const struct sk_buff *skb,
const struct bpf_insn *filter);
union {
struct sock_filter insns[0];
struct bpf_insn insnsi[0];
struct work_struct work;
};
};
so that 'struct bpf_prog' can be used independent of sockets and cleans up
'unattached' bpf use cases
split SK_RUN_FILTER macro into:
SK_RUN_FILTER to be used with 'struct sk_filter *' and
BPF_PROG_RUN to be used with 'struct bpf_prog *'
__sk_filter_release(struct sk_filter *) gains
__bpf_prog_release(struct bpf_prog *) helper function
also perform related renames for the functions that work
with 'struct bpf_prog *', since they're on the same lines:
sk_filter_size -> bpf_prog_size
sk_filter_select_runtime -> bpf_prog_select_runtime
sk_filter_free -> bpf_prog_free
sk_unattached_filter_create -> bpf_prog_create
sk_unattached_filter_destroy -> bpf_prog_destroy
sk_store_orig_filter -> bpf_prog_store_orig_filter
sk_release_orig_filter -> bpf_release_orig_filter
__sk_migrate_filter -> bpf_migrate_filter
__sk_prepare_filter -> bpf_prepare_filter
API for attaching classic BPF to a socket stays the same:
sk_attach_filter(prog, struct sock *)/sk_detach_filter(struct sock *)
and SK_RUN_FILTER(struct sk_filter *, ctx) to execute a program
which is used by sockets, tun, af_packet
API for 'unattached' BPF programs becomes:
bpf_prog_create(struct bpf_prog **)/bpf_prog_destroy(struct bpf_prog *)
and BPF_PROG_RUN(struct bpf_prog *, ctx) to execute a program
which is used by isdn, ppp, team, seccomp, ptp, xt_bpf, cls_bpf, test_bpf
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When allocating stack space for BPF memwords we need to use the
appropriate 32 or 64-bit instruction to avoid losing the top 32 bits
of the stack pointer.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: netdev@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7135/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
When loading a pointer to register we need to use the appropriate
32 or 64bit instruction to preserve the pointers' top 32bits.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: netdev@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7180/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The skb->pkt_type field is defined as follows:
u8 pkt_type:3,
fclone:2,
ipvs_property:1,
peeked:1,
nf_trace:1
resulting to the following layout in big-endian systems
[pkt_type][fclone][ipvs_propery][peeked][nf_trace]
^ ^
| |
LSB MSB
As a result, the existing code did not work because it was trying to
match pkt_type == 7 whereas in reality it is 7<<5 on big-endian
systems.
This has been fixed in the interpreter in
0dcceabb0c
"net: filter: fix SKF_AD_PKTTYPE extension on big-endian"
The fix is to look for 7<<5 on big-endian systems for the pkt_type
field, and shift by 5 so the packet type will be at the lower 3 bits
of the A register.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: netdev@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7132/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Remove BUG_ON() if the shift immediate is >=32 to avoid kernel crashes
due to malicious user input. If the shift immediate is >= 32,
we simply load the destination register with 0 since only
32-bit instructions are used by JIT so this will do the
correct thing even on MIPS64.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: netdev@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7179/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Previously, update_on_xread() only set the reset flag if SEEN_X hasn't
been set already. However, SEEN_X is used to indicate that X is used
as destination or source register so there are some cases where X
is only used as source register and we really need to make sure that it
has been initialized in time. As a result of which, drop this function and
always set X to zero if it's used in any of the opcodes.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: netdev@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7133/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
is_range() was meant to check whether the number is within
the s16 range or not. However the return values and consumers expected
the exact opposite. We fix that by inverting the logic in the function
to return 'true' for < s16 and 'false' for > s16.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Reported-by: Alexei Starovoitov <ast@plumgrid.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: netdev@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7131/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
We should prevent spamming the logs during normal execution of bpf-jit.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Suggested-by: Alexei Starovoitov <ast@plumgrid.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: netdev@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7129/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
If VLAN_TAG_PRESENT is not zero, then return 1 as expected by
classic BPF. Otherwise return 0.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: netdev@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7128/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Using VLAN_VID_MASK is not correct to get the vlan tag. Use
~VLAN_PRESENT_MASK instead and make sure it's u16 so the top 16-bits
will be removed. This will ensure that the emit_andi() code will not
treat this as a big 32-bit unsigned value.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: netdev@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7127/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The sltiu and sltu instructions will set the scratch register
to 1 if A <= X|K so fix the emitted branch conditional to check
for scratch != zero rather than scratch >= zero which would complicate
the resuling branch logic given that MIPS does not have a BGT or BGET
instructions to compare general purpose registers directly.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: netdev@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7126/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The SKF_AD_PKTTYPE uses the skb pointer so make sure it's in the
flags so it will be initialized in time.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: netdev@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7125/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The VLAN_VID_MASK and VLAN_TAG_PRESENT are immediates, so using
'and' which expects 3 registers will produce wrong results. Fix
this by using the 'andi' instruction.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: netdev@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7124/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Previously, the negative offset was not checked leading to failures
due to trying to load data beyond the skb struct boundaries. Until we
have proper asm helpers in place, it's best if we return ENOSUPP if K
is negative when trying to JIT the filter or 0 during runtime if we
do an indirect load where the value of X is unknown during build time.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: netdev@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7123/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Reading from the HI register to get the division result is wrong.
The quotient is placed in the LO register.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: netdev@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7122/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
mips: allmodconfig fails in 3.16-rc1 with lots of undefined symbols.
arch/mips/net/bpf_jit.c: In function 'is_load_to_a':
arch/mips/net/bpf_jit.c:559:7: error: 'BPF_S_LD_W_LEN' undeclared (first use in this function)
arch/mips/net/bpf_jit.c:559:7: note: each undeclared identifier is reported only once for each function it appears in
arch/mips/net/bpf_jit.c:560:7: error: 'BPF_S_LD_W_ABS' undeclared (first use in this function)
[...]
The reason behind this is that 3480593131 ("net: filter: get rid of
BPF_S_* enum") was routed via net-next tree, that takes all BPF-related
changes, at a time where MIPS BPF JIT was not part of net-next, while
c6610de353 ("MIPS: net: Add BPF JIT") was routed via mips arch tree
and went into mainline within the same merge window. Thus, fix it up by
converting BPF_S_* in a similar fashion as in 3480593131 for MIPS.
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: linux-kernel@vger.kernel.org <linux-kernel@vger.kernel.org>
Cc: Linux MIPS Mailing List <linux-mips@linux-mips.org>
Patchwork: https://patchwork.linux-mips.org/patch/7099/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This adds initial support for BPF-JIT on MIPS
Tested on mips32 LE/BE and mips64 BE/n64 using
dhcp, ping and various tcpdump filters.
Benchmarking:
Assuming the remote MIPS target uses 192.168.154.181
as its IP address, and the local host uses 192.168.154.136,
the following results can be obtained using the following
tcpdump filter (catches no frames) and a simple
'time ping -f -c 1000000' command.
[root@(none) ~]# tcpdump -p -n -s 0 -i eth0 net 10.0.0.0/24 -d
(000) ldh [12]
(001) jeq #0x800 jt 2 jf 8
(002) ld [26]
(003) and #0xffffff00
(004) jeq #0xa000000 jt 16 jf 5
(005) ld [30]
(006) and #0xffffff00
(007) jeq #0xa000000 jt 16 jf 17
(008) jeq #0x806 jt 10 jf 9
(009) jeq #0x8035 jt 10 jf 17
(010) ld [28]
(011) and #0xffffff00
(012) jeq #0xa000000 jt 16 jf 13
(013) ld [38]
(014) and #0xffffff00
(015) jeq #0xa000000 jt 16 jf 17
(016) ret #65535
- BPF-JIT Disabled
real 1m38.005s
user 0m1.510s
sys 0m6.710s
- BPF-JIT Enabled
real 1m35.215s
user 0m1.200s
sys 0m4.140s
[ralf@linux-mips.org: Resolved conflict.]
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>