Commit Graph

4707 Commits

Author SHA1 Message Date
Linus Torvalds c1488798ad Merge branch 'core-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull STRICT_DEVMEM default from Ingo Molnar:
 "Make CONFIG_STRICT_DEVMEM default-y on x86 and arm64 as well, to
  follow the distro status quo"

* 'core-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  Kconfig: Make STRICT_DEVMEM default-y on x86 and arm64
2018-01-30 10:11:26 -08:00
Jason Gunthorpe e7996a9a77 Linux 4.15
-----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJabj6pAAoJEHm+PkMAQRiGs8cIAJQFkCWnbz86e3vG4DuWhyA8
 CMGHCQdUOxxFGa/ixhIiuetbC0x+JVHAjV2FwVYbAQfaZB3pfw2iR1ncQxpAP1AI
 oLU9vBEqTmwKMPc9CM5rRfnLFWpGcGwUNzgPdxD5yYqGDtcM8K840mF6NdkYe5AN
 xU8rv1wlcFPF4A5pvHCH0pvVmK4VxlVFk/2H67TFdxBs4PyJOnSBnf+bcGWgsKO6
 hC8XIVtcKCH2GfFxt5d0Vgc5QXJEpX1zn2mtCa1MwYRjN2plgYfD84ha0xE7J0B0
 oqV/wnjKXDsmrgVpncr3txd4+zKJFNkdNRE4eLAIupHo2XHTG4HvDJ5dBY2NhGU=
 =sOml
 -----END PGP SIGNATURE-----

Merge tag v4.15 of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git

To resolve conflicts in:
 drivers/infiniband/hw/mlx5/main.c
 drivers/infiniband/hw/mlx5/qp.c

From patches merged into the -rc cycle. The conflict resolution matches
what linux-next has been carrying.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-30 09:30:00 -07:00
Linus Torvalds 0a4b6e2f80 Merge branch 'for-4.16/block' of git://git.kernel.dk/linux-block
Pull block updates from Jens Axboe:
 "This is the main pull request for block IO related changes for the
  4.16 kernel. Nothing major in this pull request, but a good amount of
  improvements and fixes all over the map. This contains:

   - BFQ improvements, fixes, and cleanups from Angelo, Chiara, and
     Paolo.

   - Support for SMR zones for deadline and mq-deadline from Damien and
     Christoph.

   - Set of fixes for bcache by way of Michael Lyle, including fixes
     from himself, Kent, Rui, Tang, and Coly.

   - Series from Matias for lightnvm with fixes from Hans Holmberg,
     Javier, and Matias. Mostly centered around pblk, and the removing
     rrpc 1.2 in preparation for supporting 2.0.

   - A couple of NVMe pull requests from Christoph. Nothing major in
     here, just fixes and cleanups, and support for command tracing from
     Johannes.

   - Support for blk-throttle for tracking reads and writes separately.
     From Joseph Qi. A few cleanups/fixes also for blk-throttle from
     Weiping.

   - Series from Mike Snitzer that enables dm to register its queue more
     logically, something that's alwways been problematic on dm since
     it's a stacked device.

   - Series from Ming cleaning up some of the bio accessor use, in
     preparation for supporting multipage bvecs.

   - Various fixes from Ming closing up holes around queue mapping and
     quiescing.

   - BSD partition fix from Richard Narron, fixing a problem where we
     can't mount newer (10/11) FreeBSD partitions.

   - Series from Tejun reworking blk-mq timeout handling. The previous
     scheme relied on atomic bits, but it had races where we would think
     a request had timed out if it to reused at the wrong time.

   - null_blk now supports faking timeouts, to enable us to better
     exercise and test that functionality separately. From me.

   - Kill the separate atomic poll bit in the request struct. After
     this, we don't use the atomic bits on blk-mq anymore at all. From
     me.

   - sgl_alloc/free helpers from Bart.

   - Heavily contended tag case scalability improvement from me.

   - Various little fixes and cleanups from Arnd, Bart, Corentin,
     Douglas, Eryu, Goldwyn, and myself"

* 'for-4.16/block' of git://git.kernel.dk/linux-block: (186 commits)
  block: remove smart1,2.h
  nvme: add tracepoint for nvme_complete_rq
  nvme: add tracepoint for nvme_setup_cmd
  nvme-pci: introduce RECONNECTING state to mark initializing procedure
  nvme-rdma: remove redundant boolean for inline_data
  nvme: don't free uuid pointer before printing it
  nvme-pci: Suspend queues after deleting them
  bsg: use pr_debug instead of hand crafted macros
  blk-mq-debugfs: don't allow write on attributes with seq_operations set
  nvme-pci: Fix queue double allocations
  block: Set BIO_TRACE_COMPLETION on new bio during split
  blk-throttle: use queue_is_rq_based
  block: Remove kblockd_schedule_delayed_work{,_on}()
  blk-mq: Avoid that blk_mq_delay_run_hw_queue() introduces unintended delays
  blk-mq: Rename blk_mq_request_direct_issue() into blk_mq_request_issue_directly()
  lib/scatterlist: Fix chaining support in sgl_alloc_order()
  blk-throttle: track read and write request individually
  block: add bdev_read_only() checks to common helpers
  block: fail op_is_write() requests to read-only partitions
  blk-throttle: export io_serviced_recursive, io_service_bytes_recursive
  ...
2018-01-29 11:51:49 -08:00
Linus Torvalds bc4e118355 - New Drivers
- Add support for RAVE Supervisory Processor
 
  - (Re)moved drivers
    - Move Realtek Card Reader Driver to Misc
 
  - New Device Support
    - Add support for Pinctrl to axp20x
 
  - New Functionality
    - Add resume support; atmel-flexcom
 
  - Fix-ups
    - Split MFD (mfd) and userspace handlers (platform); cros_ec
    - Fix trivial (whitespace, spelling) issue(s); pcf50633-core
    - Clean-up error handling; ab8500-debugfs
    - General tidying up; tmio_core
    - Kconfig fix-ups; qcom-pm8xxx
    - Licensing changes (SPDX); stm32-lptimer, stm32-timers
    - Device Tree fixups; mc13xxx
    - Simplify/remove unused code; cros_ec_spi, axp20x, ti_am335x_tscadc,
                                   kempld-core, intel_soc_pmic_core.c,
 				  ab8500-debugfs
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJaYK9SAAoJEFGvii+H/HdhTBwP/iQYlVikcs728oQqYhPKcafc
 cH8OxA6mPoD8BDvJkfjyQ/VXFo+OHZQxs7arUYMBpHweqhRGID/uDJItkZ05O7RI
 0AJoqedczfgQzmEFvos4lpnm2kIdxXZstFqQBA0vLqvbOVd8U+LUiQ/2ilOELxa/
 AYUiIKO/gY0jw/1cXkYWMbLI8Z14u04OFrUzFIu8M6KSdMKyQ5RLvSAISL4l/oyO
 fWvYL8ngdmC7BOw0OF7kc5S5KaevP0qZ9kBNb1e0Y1gbmm1b8WhH5eaAcuWD3tR3
 mxa/lQNVLIDfp1XQzTEVbWFmaic5+i4c05WrVbqZ7Q8jgQGrXtwmdcqYc6ifQJoT
 1/3IH7YTYV9+k/B5cSP9m+CCY4BsNjnqXcIW1A0FLJkmCLfU8jvMBBaapXVZk23h
 rgpRYEWRSVGQEa2E/9tDSndpqUcllWriSKYcTtNGX65kIiP1+VQYpUps/Ff7X8bj
 CiPGIGP4jYywk4SAlTjs0Dothh/g3+4CtyMK4ARei9z1P5prKuPMHyG6Xf0PtTMv
 qLD+0vplL2AbpdlpH8U1Eqda+TxM7RinV2US/FGnHJqUwukWOdZGr+3t/uU54Sfu
 TsQe9gCdURvJnGvMXdHO11/jBIQg4PzTKhJfnfONCo5kZMwJ1athhHVqguJyy6US
 SNJBlEDaO4rVMTdbYo9b
 =k4Mk
 -----END PGP SIGNATURE-----

Merge tag 'mfd-next-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd

Pull MFD updates from Lee Jones:
 "New Drivers:
   - Add support for RAVE Supervisory Processor

  Moved drivers:
   - Move Realtek Card Reader Driver to Misc

  New Device Support:
   - Add support for Pinctrl to axp20x

  New Functionality:
   - Add resume support to atmel-flexcom

  Fix-ups:
   - Split MFD (mfd) and userspace handlers (platform) in cros_ec
   - Fix trivial (whitespace, spelling) issue(s) in pcf50633-core
   - Clean-up error handling in ab8500-debugfs
   - General tidying up in tmio_core
   - Kconfig fix-ups for qcom-pm8xxx
   - Licensing changes (SPDX) to stm32-lptimer, stm32-timers
   - Device Tree fixups in mc13xxx
   - Simplify/remove unused code in cros_ec_spi, axp20x, ti_am335x_tscadc,
     kempld-core, intel_soc_pmic_core.c, ab8500-debugfs"

* tag 'mfd-next-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (32 commits)
  mfd: lpc_ich: Do not touch SPI-NOR write protection bit on Apollo Lake
  mfd: axp20x: Mark axp288 CHRG_BAK_CTRL register volatile
  mfd: ab8500: Introduce DEFINE_SHOW_ATTRIBUTE() macro
  atmel_flexcom: Support resuming after a chip reset
  mfd: Remove duplicate includes
  dt-bindings: mfd: mc13xxx: Add the unit address to sysled
  mfd: stm32: Adopt SPDX identifier
  mfd: axp20x: Add pinctrl cell for AXP813
  mfd: pm8xxx: Make elegible for COMPILE_TEST
  mfd: kempld-core: Use resource_size function on resource object
  mfd: tmio: Move register macros to tmio_core.c
  mfd: cros ec: spi: Simplify delay handling between SPI messages
  mfd: palmas: Assign the right powerhold mask for tps65917
  mfd: ab8500-debugfs: Use common error handling code in ab8500_print_modem_registers()
  mfd: ti_am335x_tscadc: Remove redundant assignment to node
  mfd: pcf50633: Fix spelling mistake: 'Falied' -> 'Failed'
  dt-bindings: watchdog: Add bindings for RAVE SP watchdog driver
  watchdog: Add RAVE SP watchdog driver
  mfd: Add driver for RAVE Supervisory Processor
  serdev: Introduce devm_serdev_device_open()
  ...
2018-01-29 10:59:24 -08:00
Daniel Borkmann 21ccaf2149 bpf: add further test cases around div/mod and others
Update selftests to relfect recent changes and add various new
test cases.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-26 16:42:07 -08:00
Bjorn Helgaas 7328c8f48d PCI: Add SPDX GPL-2.0 when no license was specified
b24413180f ("License cleanup: add SPDX GPL-2.0 license identifier to
files with no license") added SPDX GPL-2.0 to several PCI files that
previously contained no license information.

Add SPDX GPL-2.0 to all other PCI files that did not contain any license
information and hence were under the default GPL version 2 license of the
kernel.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-26 11:45:16 -06:00
Steven Rostedt (VMware) 841a915d20 vsprintf: Do not have bprintf dereference pointers
When trace_printk() was introduced, it was discussed that making it be as
low overhead as possible, that the processing of the format string should be
delayed until it is read. That is, a "trace_printk()" should not convert
the %d into numbers and so on, but instead, save the fmt string and all the
args in the buffer at the time of recording. When the trace_printk() data is
read, it would then parse the format string and do the conversions of the
saved arguments in the tracing buffer.

The code to perform this was added to vsprintf where vbin_printf() would
save the arguments of a specified format string in a buffer, then
bstr_printf() could be used to convert the buffer with the same format
string into the final output, as if vsprintf() was called in one go.

The issue arises when dereferenced pointers are used. The problem is that
something like %*pbl which reads a bitmask, will save the pointer to the
bitmask in the buffer. Then the reading of the buffer via bstr_printf() will
then look at the pointer to process the final output. Obviously the value of
that pointer could have changed since the time it was recorded to the time
the buffer is read. Worse yet, the bitmask could be unmapped, and the
reading of the trace buffer could actually cause a kernel oops.

Another problem is that user space tools such as perf and trace-cmd do not
have access to the contents of these pointers, and they become useless when
the tracing buffer is extracted.

Instead of having vbin_printf() simply save the pointer in the buffer for
later processing, have it perform the formatting at the time bin_printf() is
called. This will fix the issue of dereferencing pointers at a later time,
and has the extra benefit of having user space tools understand these
values.

Since perf and trace-cmd already can handle %p[sSfF] via saving kallsyms,
their pointers are saved and not processed during vbin_printf(). If they
were converted, it would break perf and trace-cmd, as they would not know
how to deal with the conversion.

Link: http://lkml.kernel.org/r/20171228204025.14a71d8f@gandalf.local.home

Reported-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-01-23 15:57:30 -05:00
Bart Van Assche 172856eac7 kobject: Export kobj_ns_grab_current() and kobj_ns_drop()
Make it possible to call these two functions from a kernel module.
Note: despite their name, these two functions can be used meaningfully
independent of kobjects. A later patch will add calls to these
functions from the SRP driver because this patch series modifies the
SRP driver such that it can hold a reference to a namespace that can
last longer than the lifetime of the process through which the
namespace reference was obtained.

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-01-23 11:35:04 -05:00
Wei Yongjun a5e1923356 test_firmware: fix missing unlock on error in config_num_requests_store()
Add the missing unlock before return from function
config_num_requests_store() in the error handling case.

Fixes: c92316bf8e ("test_firmware: add batched firmware tests")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-22 16:55:38 +01:00
Wei Yongjun 76f8ab1bd1 test_firmware: make local symbol test_fw_config static
Fixes the following sparse warnings:

lib/test_firmware.c:99:20: warning:
 symbol 'test_fw_config' was not declared. Should it be static?

Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-22 16:55:38 +01:00
Petr Mladek 51ccbb0ae8 Merge branch 'for-4.16-print-symbol' into for-4.16 2018-01-22 10:40:50 +01:00
Daniel Borkmann fcd1c91771 bpf: add couple of test cases for signed extended imms
Add a couple of test cases for interpreter and JIT that are
related to an issue we faced some time ago in Cilium [1],
which is fixed in LLVM with commit e53750e1e086 ("bpf: fix
bug on silently truncating 64-bit immediate").

Test cases were run-time checking kernel to behave as intended
which should also provide some guidance for current or new
JITs in case they should trip over this. Added for cBPF and
eBPF.

  [1] https://github.com/cilium/cilium/pull/2162

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-19 18:36:59 -08:00
Bart Van Assche 8c7a8d1c4b lib/scatterlist: Fix chaining support in sgl_alloc_order()
This patch avoids that workloads with large block sizes (megabytes)
can trigger the following call stack with the ib_srpt driver (that
driver is the only driver that chains scatterlists allocated by
sgl_alloc_order()):

BUG: Bad page state in process kworker/0:1H  pfn:2423a78
page:fffffb03d08e9e00 count:-3 mapcount:0 mapping:          (null) index:0x0
flags: 0x57ffffc0000000()
raw: 0057ffffc0000000 0000000000000000 0000000000000000 fffffffdffffffff
raw: dead000000000100 dead000000000200 0000000000000000 0000000000000000
page dumped because: nonzero _count
CPU: 0 PID: 733 Comm: kworker/0:1H Tainted: G          I      4.15.0-rc7.bart+ #1
Hardware name: HP ProLiant DL380 G7, BIOS P67 08/16/2015
Workqueue: ib-comp-wq ib_cq_poll_work [ib_core]
Call Trace:
 dump_stack+0x5c/0x83
 bad_page+0xf5/0x10f
 get_page_from_freelist+0xa46/0x11b0
 __alloc_pages_nodemask+0x103/0x290
 sgl_alloc_order+0x101/0x180
 target_alloc_sgl+0x2c/0x40 [target_core_mod]
 srpt_alloc_rw_ctxs+0x173/0x2d0 [ib_srpt]
 srpt_handle_new_iu+0x61e/0x7f0 [ib_srpt]
 __ib_process_cq+0x55/0xa0 [ib_core]
 ib_cq_poll_work+0x1b/0x60 [ib_core]
 process_one_work+0x141/0x340
 worker_thread+0x47/0x3e0
 kthread+0xf5/0x130
 ret_from_fork+0x1f/0x30

Fixes: e80a0af475 ("lib/scatterlist: Introduce sgl_alloc() and sgl_free()")
Reported-by: Laurence Oberman <loberman@redhat.com>
Tested-by: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Nicholas A. Bellinger <nab@linux-iscsi.org>
Cc: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-01-19 12:31:03 -07:00
Christoph Hellwig 4bd89ed39b swiotlb: remove various exports
All these symbols are only used by arch dma_ops implementations or
xen-swiotlb.  None of which can be modular.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian König <christian.koenig@amd.com>
2018-01-15 09:35:50 +01:00
Christoph Hellwig 0176adb004 swiotlb: refactor coherent buffer allocation
Factor out a new swiotlb_alloc_buffer helper that allocates DMA coherent
memory from the swiotlb bounce buffer.

This allows to simplify the swiotlb_alloc implemenation that uses
dma_direct_alloc to try to allocate a reachable buffer first.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian König <christian.koenig@amd.com>
2018-01-15 09:35:49 +01:00
Christoph Hellwig a25381aa3a swiotlb: refactor coherent buffer freeing
Factor out a new swiotlb_free_buffer helper that checks if an address
is allocated from the swiotlb bounce buffer, and if yes frees it.

This allows to simplify the swiotlb_free implemenation that uses
dma_direct_free to free the non-bounce buffer allocations.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian König <christian.koenig@amd.com>
2018-01-15 09:35:48 +01:00
Christoph Hellwig aaf796dc6e swiotlb: wire up ->dma_supported in swiotlb_dma_ops
To properly reject too small DMA masks based on the addressability of the
bounce buffer.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian König <christian.koenig@amd.com>
2018-01-15 09:35:46 +01:00
Christoph Hellwig 251533eb35 swiotlb: add common swiotlb_map_ops
Currently all architectures that want to use swiotlb have to implement
their own dma_map_ops instances.  Provide a generic one based on the
x86 implementation which first calls into dma_direct to try a full blown
direct mapping implementation (including e.g. CMA) before falling back
allocating from the swiotlb buffer.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian König <christian.koenig@amd.com>
2018-01-15 09:35:45 +01:00
Christoph Hellwig 7f2c8bbd32 swiotlb: rename swiotlb_free to swiotlb_exit
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2018-01-15 09:35:39 +01:00
Christian König d0bc0c2a31 swiotlb: suppress warning when __GFP_NOWARN is set
TTM tries to allocate coherent memory in chunks of 2MB first to improve
TLB efficiency and falls back to allocating 4K pages if that fails.

Suppress the warning when the 2MB allocations fails since there is a
valid fall back path.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reported-by: Mike Galbraith <efault@gmx.de>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Bug: https://bugs.freedesktop.org/show_bug.cgi?id=104082
CC: stable@vger.kernel.org
Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-01-15 09:35:18 +01:00
Christoph Hellwig 1a9777a8a0 dma-direct: reject too small dma masks
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
2018-01-15 09:35:15 +01:00
Christoph Hellwig 19dca8c0ef dma-direct: make dma_direct_{alloc,free} available to other implementations
So that they don't need to indirect through the operation vector.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com>
2018-01-15 09:35:14 +01:00
Christoph Hellwig 95f183916d dma-direct: retry allocations using GFP_DMA for small masks
If an attempt to allocate memory succeeded, but isn't inside the
supported DMA mask, retry the allocation with GFP_DMA set as a
last resort.

Based on the x86 code, but an off by one error in what is now
dma_coherent_ok has been fixed vs the x86 code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-01-15 09:35:13 +01:00
Christoph Hellwig c61e963734 dma-direct: add support for allocation from ZONE_DMA and ZONE_DMA32
This allows to dip into zones for lower memory if they are available.
If one of the zones is not available the corresponding GFP_* flag
will evaluate to 0 so they won't change anything.  We provide an
arch tunable for those architectures that do not use GFP_DMA for
the lowest 24-bits, given that there are a few.

Roughly based on the x86 code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-01-15 09:35:12 +01:00
Christoph Hellwig 21f237e4d0 dma-direct: use node local allocations for coherent memory
To preserve the x86 behavior.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
2018-01-15 09:35:11 +01:00
Christoph Hellwig 080321d3b3 dma-direct: add support for CMA allocation
Try the CMA allocator for coherent allocations if supported.

Roughly modelled after the x86 code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-01-15 09:35:09 +01:00
Christoph Hellwig 2797596992 dma-direct: add dma address sanity checks
Roughly based on the x86 pci-nommu implementation.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-01-15 09:35:08 +01:00
Christoph Hellwig 2e86a04780 dma-direct: use phys_to_dma
This means it uses whatever linear remapping scheme that the architecture
provides is used in the generic dma_direct ops.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com>
2018-01-15 09:35:07 +01:00
Christoph Hellwig 002e67454f dma-direct: rename dma_noop to dma_direct
The trivial direct mapping implementation already does a virtual to
physical translation which isn't strictly a noop, and will soon learn
to do non-direct but linear physical to dma translations through the
device offset and a few small tricks.  Rename it to a better fitting
name.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com>
2018-01-15 09:35:06 +01:00
Masami Hiramatsu 4b1a29a7f5 error-injection: Support fault injection framework
Support in-kernel fault-injection framework via debugfs.
This allows you to inject a conditional error to specified
function using debugfs interfaces.

Here is the result of test script described in
Documentation/fault-injection/fault-injection.txt

  ===========
  # ./test_fail_function.sh
  1+0 records in
  1+0 records out
  1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.0227404 s, 46.1 MB/s
  btrfs-progs v4.4
  See http://btrfs.wiki.kernel.org for more information.

  Label:              (null)
  UUID:               bfa96010-12e9-4360-aed0-42eec7af5798
  Node size:          16384
  Sector size:        4096
  Filesystem size:    1001.00MiB
  Block group profiles:
    Data:             single            8.00MiB
    Metadata:         DUP              58.00MiB
    System:           DUP              12.00MiB
  SSD detected:       no
  Incompat features:  extref, skinny-metadata
  Number of devices:  1
  Devices:
     ID        SIZE  PATH
      1  1001.00MiB  /dev/loop2

  mount: mount /dev/loop2 on /opt/tmpmnt failed: Cannot allocate memory
  SUCCESS!
  ===========

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-12 17:33:38 -08:00
Masami Hiramatsu 663faf9f7b error-injection: Add injectable error types
Add injectable error types for each error-injectable function.

One motivation of error injection test is to find software flaws,
mistakes or mis-handlings of expectable errors. If we find such
flaws by the test, that is a program bug, so we need to fix it.

But if the tester miss input the error (e.g. just return success
code without processing anything), it causes unexpected behavior
even if the caller is correctly programmed to handle any errors.
That is not what we want to test by error injection.

To clarify what type of errors the caller must expect for each
injectable function, this introduces injectable error types:

 - EI_ETYPE_NULL : means the function will return NULL if it
		    fails. No ERR_PTR, just a NULL.
 - EI_ETYPE_ERRNO : means the function will return -ERRNO
		    if it fails.
 - EI_ETYPE_ERRNO_NULL : means the function will return -ERRNO
		       (ERR_PTR) or NULL.

ALLOW_ERROR_INJECTION() macro is expanded to get one of
NULL, ERRNO, ERRNO_NULL to record the error type for
each function. e.g.

 ALLOW_ERROR_INJECTION(open_ctree, ERRNO)

This error types are shown in debugfs as below.

  ====
  / # cat /sys/kernel/debug/error_injection/list
  open_ctree [btrfs]	ERRNO
  io_ctl_init [btrfs]	ERRNO
  ====

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-12 17:33:38 -08:00
Masami Hiramatsu 540adea380 error-injection: Separate error-injection from kprobe
Since error-injection framework is not limited to be used
by kprobes, nor bpf. Other kernel subsystems can use it
freely for checking safeness of error-injection, e.g.
livepatch, ftrace etc.
So this separate error-injection framework from kprobes.

Some differences has been made:

- "kprobe" word is removed from any APIs/structures.
- BPF_ALLOW_ERROR_INJECTION() is renamed to
  ALLOW_ERROR_INJECTION() since it is not limited for BPF too.
- CONFIG_FUNCTION_ERROR_INJECTION is the config item of this
  feature. It is automatically enabled if the arch supports
  error injection feature for kprobe or ftrace etc.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-12 17:33:38 -08:00
Eric Biggers 7660b1fb36 crypto: chacha20 - use rol32() macro from bitops.h
For chacha20_block(), use the existing 32-bit left-rotate function
instead of defining one ourselves.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-01-12 23:03:01 +11:00
David S. Miller 19d28fbd30 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
BPF alignment tests got a conflict because the registers
are output as Rn_w instead of just Rn in net-next, and
in net a fixup for a testcase prohibits logical operations
on pointers before using them.

Also, we should attempt to patch BPF call args if JIT always on is
enabled.  Instead, if we fail to JIT the subprogs we should pass
an error back up and fail immediately.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-11 22:13:42 -05:00
David S. Miller 661e4e33a9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2018-01-09

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) Prevent out-of-bounds speculation in BPF maps by masking the
   index after bounds checks in order to fix spectre v1, and
   add an option BPF_JIT_ALWAYS_ON into Kconfig that allows for
   removing the BPF interpreter from the kernel in favor of
   JIT-only mode to make spectre v2 harder, from Alexei.

2) Remove false sharing of map refcount with max_entries which
   was used in spectre v1, from Daniel.

3) Add a missing NULL psock check in sockmap in order to fix
   a race, from John.

4) Fix test_align BPF selftest case since a recent change in
   verifier rejects the bit-wise arithmetic on pointers
   earlier but test_align update was missing, from Alexei.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-10 11:17:21 -05:00
Christoph Hellwig ea8c64ace8 dma-mapping: move swiotlb arch helpers to a new header
phys_to_dma, dma_to_phys and dma_capable are helpers published by
architecture code for use of swiotlb and xen-swiotlb only.  Drivers are
not supposed to use these directly, but use the DMA API instead.

Move these to a new asm/dma-direct.h helper, included by a
linux/dma-direct.h wrapper that provides the default linear mapping
unless the architecture wants to override it.

In the MIPS case the existing dma-coherent.h is reused for now as
untangling it will take a bit of work.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Robin Murphy <robin.murphy@arm.com>
2018-01-10 16:40:54 +01:00
Alexei Starovoitov 290af86629 bpf: introduce BPF_JIT_ALWAYS_ON config
The BPF interpreter has been used as part of the spectre 2 attack CVE-2017-5715.

A quote from goolge project zero blog:
"At this point, it would normally be necessary to locate gadgets in
the host kernel code that can be used to actually leak data by reading
from an attacker-controlled location, shifting and masking the result
appropriately and then using the result of that as offset to an
attacker-controlled address for a load. But piecing gadgets together
and figuring out which ones work in a speculation context seems annoying.
So instead, we decided to use the eBPF interpreter, which is built into
the host kernel - while there is no legitimate way to invoke it from inside
a VM, the presence of the code in the host kernel's text section is sufficient
to make it usable for the attack, just like with ordinary ROP gadgets."

To make attacker job harder introduce BPF_JIT_ALWAYS_ON config
option that removes interpreter from the kernel in favor of JIT-only mode.
So far eBPF JIT is supported by:
x64, arm64, arm32, sparc64, s390, powerpc64, mips64

The start of JITed program is randomized and code page is marked as read-only.
In addition "constant blinding" can be turned on with net.core.bpf_jit_harden

v2->v3:
- move __bpf_prog_ret0 under ifdef (Daniel)

v1->v2:
- fix init order, test_bpf and cBPF (Daniel's feedback)
- fix offloaded bpf (Jakub's feedback)
- add 'return 0' dummy in case something can invoke prog->bpf_func
- retarget bpf tree. For bpf-next the patch would need one extra hunk.
  It will be sent when the trees are merged back to net-next

Considered doing:
  int bpf_jit_enable __read_mostly = BPF_EBPF_JIT_DEFAULT;
but it seems better to land the patch as-is and in bpf-next remove
bpf_jit_enable global variable from all JITs, consolidate in one place
and remove this jit_init() function.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-09 22:25:26 +01:00
David S. Miller a0ce093180 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-01-09 10:37:00 -05:00
Joe Perches b6b996b6cd treewide: Use DEVICE_ATTR_RW
Convert DEVICE_ATTR uses to DEVICE_ATTR_RW where possible.

Done with perl script:

$ git grep -w --name-only DEVICE_ATTR | \
  xargs perl -i -e 'local $/; while (<>) { s/\bDEVICE_ATTR\s*\(\s*(\w+)\s*,\s*\(?(\s*S_IRUGO\s*\|\s*S_IWUSR|\s*S_IWUSR\s*\|\s*S_IRUGO\s*|\s*0644\s*)\)?\s*,\s*\1_show\s*,\s*\1_store\s*\)/DEVICE_ATTR_RW(\1)/g; print;}'

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Acked-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Acked-by: Jarkko Nikula <jarkko.nikula@bitmer.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-09 16:33:31 +01:00
Sergey Senozhatsky 04b8eb7a4c symbol lookup: introduce dereference_symbol_descriptor()
dereference_symbol_descriptor() invokes appropriate ARCH specific
function descriptor dereference callbacks:
- dereference_kernel_function_descriptor() if the pointer is a
  kernel symbol;

- dereference_module_function_descriptor() if the pointer is a
  module symbol.

This is the last step needed to make '%pS/%ps' smart enough to
handle function descriptor dereference on affected ARCHs and
to retire '%pF/%pf'.

To refresh it:
  Some architectures (ia64, ppc64, parisc64) use an indirect pointer
  for C function pointers - the function pointer points to a function
  descriptor and we need to dereference it to get the actual function
  pointer.

  Function descriptors live in .opd elf section and all affected
  ARCHs (ia64, ppc64, parisc64) handle it properly for kernel and
  modules. So we, technically, can decide if the dereference is
  needed by simply looking at the pointer: if it belongs to .opd
  section then we need to dereference it.

  The kernel and modules have their own .opd sections, obviously,
  that's why we need to split dereference_function_descriptor()
  and use separate kernel and module dereference arch callbacks.

Link: http://lkml.kernel.org/r/20171206043649.GB15885@jagdpanzerIV
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: James Bottomley <jejb@parisc-linux.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: linux-ia64@vger.kernel.org
Cc: linux-parisc@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Tested-by: Tony Luck <tony.luck@intel.com> #ia64
Tested-by: Santosh Sivaraj <santosh@fossix.org> #powerpc
Tested-by: Helge Deller <deller@gmx.de> #parisc64
Signed-off-by: Petr Mladek <pmladek@suse.com>
2018-01-09 10:45:38 +01:00
Andrew Morton 0d85adb5fb lib/crc-ccitt: Add CCITT-FALSE CRC16 variant
In support of a soon to be published MFD driver using serdev to talk to
a supervisory processor that uses the CCITT-FALSE CRC16 variant in it's
protocol, this patch was tested successfully on an i.MX6 ARM platform.

Link: http://lkml.kernel.org/r/20170413142932.27287-1-andrew.smirnov@gmail.com
Signed-off-by: Andrey Vostrikov <andrey.vostrikov@cogentembedded.com>
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Tested-by: Chris Healy <cphealy@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
2018-01-08 10:08:33 +00:00
Bart Van Assche e80a0af475 lib/scatterlist: Introduce sgl_alloc() and sgl_free()
Many kernel drivers contain code that allocates and frees both a
scatterlist and the pages that populate that scatterlist.
Introduce functions in lib/scatterlist.c that perform these tasks
instead of duplicating this functionality in multiple drivers.
Only include these functions in the build if CONFIG_SGL_ALLOC=y
to avoid that the kernel size increases if this functionality is
not used.

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-01-06 09:18:00 -07:00
Linus Torvalds 64648a5fca Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
 "This fixes the following issues:

   - racy use of ctx->rcvused in af_alg

   - algif_aead crash in chacha20poly1305

   - freeing bogus pointer in pcrypt

   - build error on MIPS in mpi

   - memory leak in inside-secure

   - memory overwrite in inside-secure

   - NULL pointer dereference in inside-secure

   - state corruption in inside-secure

   - build error without CRYPTO_GF128MUL in chelsio

   - use after free in n2"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: inside-secure - do not use areq->result for partial results
  crypto: inside-secure - fix request allocations in invalidation path
  crypto: inside-secure - free requests even if their handling failed
  crypto: inside-secure - per request invalidation
  lib/mpi: Fix umul_ppmm() for MIPS64r6
  crypto: pcrypt - fix freeing pcrypt instances
  crypto: n2 - cure use after free
  crypto: af_alg - Fix race around ctx->rcvused by making it atomic_t
  crypto: chacha20poly1305 - validate the digest size
  crypto: chelsio - select CRYPTO_GF128MUL
2018-01-05 12:10:06 -08:00
Sergey Senozhatsky d202d47b5e lib: do not use print_symbol()
print_symbol() is a very old API that has been obsoleted by %pS format
specifier in a normal printk() call.

Replace print_symbol() with a direct printk("%pS") call.

Link: http://lkml.kernel.org/r/20171211125025.2270-13-sergey.senozhatsky@gmail.com
To: Andrew Morton <akpm@linux-foundation.org>
To: Russell King <linux@armlinux.org.uk>
To: Catalin Marinas <catalin.marinas@arm.com>
To: Mark Salter <msalter@redhat.com>
To: Tony Luck <tony.luck@intel.com>
To: David Howells <dhowells@redhat.com>
To: Yoshinori Sato <ysato@users.sourceforge.jp>
To: Guan Xuetao <gxt@mprc.pku.edu.cn>
To: Borislav Petkov <bp@alien8.de>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: Thomas Gleixner <tglx@linutronix.de>
To: Peter Zijlstra <peterz@infradead.org>
To: Vineet Gupta <vgupta@synopsys.com>
To: Fengguang Wu <fengguang.wu@intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: LKML <linux-kernel@vger.kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-c6x-dev@linux-c6x.org
Cc: linux-ia64@vger.kernel.org
Cc: linux-am33-list@redhat.com
Cc: linux-sh@vger.kernel.org
Cc: linux-edac@vger.kernel.org
Cc: x86@kernel.org
Cc: linux-snps-arc@lists.infradead.org
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
[pmladek@suse.com: updated commit message]
Signed-off-by: Petr Mladek <pmladek@suse.com>
2018-01-05 15:24:00 +01:00
Ingo Molnar 475c5ee193 Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu
Pull RCU updates from Paul E. McKenney:

- Updates to use cond_resched() instead of cond_resched_rcu_qs()
  where feasible (currently everywhere except in kernel/rcu and
  in kernel/torture.c).  Also a couple of fixes to avoid sending
  IPIs to offline CPUs.

- Updates to simplify RCU's dyntick-idle handling.

- Updates to remove almost all uses of smp_read_barrier_depends()
  and read_barrier_depends().

- Miscellaneous fixes.

- Torture-test updates.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-03 14:14:18 +01:00
Greg Kroah-Hartman 8c9076b07c Merge 4.15-rc6 into driver-core-next
We want the fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-02 14:56:51 +01:00
Matthew Wilcox 14ebc28e07 errseq: Add to documentation tree
- Move errseq.rst into core-api
 - Add errseq to the core-api index
 - Promote the header to a more prominent header type, otherwise we get three
   entries in the table of contents.
 - Reformat the table to look nicer and be a little more proportional in
   terms of horizontal width per bit (the SF bit is still disproportionately
   large, but there's no way to fix that).
 - Include errseq kernel-doc in the errseq.rst
 - Neaten some kernel-doc markup

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2018-01-01 12:40:27 -07:00
Linus Torvalds cea92e843e Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Thomas Gleixner:
 "A pile of fixes for long standing issues with the timer wheel and the
  NOHZ code:

   - Prevent timer base confusion accross the nohz switch, which can
     cause unlocked access and data corruption

   - Reinitialize the stale base clock on cpu hotplug to prevent subtle
     side effects including rollovers on 32bit

   - Prevent an interrupt storm when the timer softirq is already
     pending caused by tick_nohz_stop_sched_tick()

   - Move the timer start tracepoint to a place where it actually makes
     sense

   - Add documentation to timerqueue functions as they caused confusion
     several times now"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  timerqueue: Document return values of timerqueue_add/del()
  timers: Invoke timer_start_debug() where it makes sense
  nohz: Prevent a timer interrupt storm in tick_nohz_stop_sched_tick()
  timers: Reinitialize per cpu bases on hotplug
  timers: Use deferrable base independent of base::nohz_active
2017-12-31 12:30:34 -08:00
Linus Torvalds 4288e6b4dd Driver core fixes for 4.15-rc6
Here are 2 driver core fixes for 4.15-rc6, resolving some reported
 issues.
 
 The first is a cacheinfo fix for DT based systems to resolve a reported
 issue that has been around for a while, and the other is to resolve a
 regression in the kobject uevent code that showed up in 4.15-rc1.
 
 Both have been in linux-next for a while with no reported issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWkjLCA8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+yk6GACbBZ/EqSrd5w6//4okaHJLPAi8ligAoNfcBqrH
 TPQycbostUcOBHUvvoUB
 =MaZG
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core fixes from Greg KH:
 "Here are two driver core fixes for 4.15-rc6, resolving some reported
  issues.

  The first is a cacheinfo fix for DT based systems to resolve a
  reported issue that has been around for a while, and the other is to
  resolve a regression in the kobject uevent code that showed up in
  4.15-rc1.

  Both have been in linux-next for a while with no reported issues"

* tag 'driver-core-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  kobject: fix suppressing modalias in uevents delivered over netlink
  drivers: base: cacheinfo: fix cache type for non-architected system cache
2017-12-31 10:50:05 -08:00
Thomas Gleixner 9f4533cd73 timerqueue: Document return values of timerqueue_add/del()
The return values of timerqueue_add/del() are not documented in the kernel doc
comment. Add proper documentation.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: rt@linutronix.de
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
Link: https://lkml.kernel.org/r/20171222145337.872681338@linutronix.de
2017-12-29 23:13:10 +01:00
Jens Axboe 4e5dff41be blk-mq: improve heavily contended tag case
Even with a number of waitqueues, we can get into a situation where we
are heavily contended on the waitqueue lock. I got a report on spc1
where we're spending seconds doing this. Arguably the use case is nasty,
I reproduce it with one device and 1000 threads banging on the device.
But that doesn't mean we shouldn't be handling it better.

What ends up happening is that a thread will fail to get a tag, add
itself to the waitqueue, and subsequently get woken up when a tag is
freed - only to find itself going back to sleep on the waitqueue.

Instead of waking all threads, use an exclusive wait and wake up our
sbitmap batch count instead. This seems to work well for me (massive
improvement for this use case), and it survives basic testing. But I
haven't fully verified it yet.

An additional improvement is running the queue and checking for a new
tag BEFORE needing to add ourselves to the waitqueue.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-12-22 11:09:37 -07:00
David S. Miller fba961ab29 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Lots of overlapping changes.  Also on the net-next side
the XDP state management is handled more in the generic
layers so undo the 'net' nfp fix which isn't applicable
in net-next.

Include a necessary change by Jakub Kicinski, with log message:

====================
cls_bpf no longer takes care of offload tracking.  Make sure
netdevsim performs necessary checks.  This fixes a warning
caused by TC trying to remove a filter it has not added.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-22 11:16:31 -05:00
Herbert Xu 45fa9a324d Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Merge the crypto tree to pick up inside-secure fixes.
2017-12-22 20:00:50 +11:00
James Hogan bbc25bee37 lib/mpi: Fix umul_ppmm() for MIPS64r6
Current MIPS64r6 toolchains aren't able to generate efficient
DMULU/DMUHU based code for the C implementation of umul_ppmm(), which
performs an unsigned 64 x 64 bit multiply and returns the upper and
lower 64-bit halves of the 128-bit result. Instead it widens the 64-bit
inputs to 128-bits and emits a __multi3 intrinsic call to perform a 128
x 128 multiply. This is both inefficient, and it results in a link error
since we don't include __multi3 in MIPS linux.

For example commit 90a53e4432 ("cfg80211: implement regdb signature
checking") merged in v4.15-rc1 recently broke the 64r6_defconfig and
64r6el_defconfig builds by indirectly selecting MPILIB. The same build
errors can be reproduced on older kernels by enabling e.g. CRYPTO_RSA:

lib/mpi/generic_mpih-mul1.o: In function `mpihelp_mul_1':
lib/mpi/generic_mpih-mul1.c:50: undefined reference to `__multi3'
lib/mpi/generic_mpih-mul2.o: In function `mpihelp_addmul_1':
lib/mpi/generic_mpih-mul2.c:49: undefined reference to `__multi3'
lib/mpi/generic_mpih-mul3.o: In function `mpihelp_submul_1':
lib/mpi/generic_mpih-mul3.c:49: undefined reference to `__multi3'
lib/mpi/mpih-div.o In function `mpihelp_divrem':
lib/mpi/mpih-div.c:205: undefined reference to `__multi3'
lib/mpi/mpih-div.c:142: undefined reference to `__multi3'

Therefore add an efficient MIPS64r6 implementation of umul_ppmm() using
inline assembly and the DMULU/DMUHU instructions, to prevent __multi3
calls being emitted.

Fixes: 7fd08ca58a ("MIPS: Add build support for the MIPS R6 ISA")
Signed-off-by: James Hogan <jhogan@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: linux-mips@linux-mips.org
Cc: linux-crypto@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-12-22 19:39:09 +11:00
Jonathan Corbet 27e7c0e813 vsprintf: Fix a dangling documentation reference
A reference to printk-formats.txt didn't get updated when the file moved;
fix that.

Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2017-12-21 13:39:31 -07:00
Tobin C. Harding b3ed23213e doc: convert printk-formats.txt to rst
Documentation/printk-formats.txt is a candidate for conversion to
ReStructuredText format. Some effort has already been made to do this
conversion even thought the suffix is currently .txt

Changes required to complete conversion

 - Move printk-formats.txt to core-api/printk-formats.rst
 - Add entry to Documentation/core-api/index.rst
 - Remove entry from Documentation/00-INDEX
 - Fix minor grammatical errors.
 - Order heading adornments as suggested by rst docs.
 - Use 'Passed by reference' uniformly.
 - Update pointer documentation around %px specifier.
 - Fix erroneous double backticks (to commas).
 - Remove extraneous double backticks (suggested by Jonathan Corbet).
 - Simplify documentation for kobject.

Signed-off-by: Tobin C. Harding <me@tobin.cc>
[jc: downcased "kernel"]
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2017-12-21 13:39:07 -07:00
Dmitry Torokhov 9b3fa47d4a kobject: fix suppressing modalias in uevents delivered over netlink
The commit 4a336a23d6 ("kobject: copy env blob in one go") optimized
constructing uevent data for delivery over netlink by using the raw
environment buffer, instead of reconstructing it from individual
environment pointers. Unfortunately in doing so it broke suppressing
MODALIAS attribute for KOBJ_UNBIND events, as the code that suppressed this
attribute only adjusted the environment pointers, but left the buffer
itself alone. Let's fix it by making sure the offending attribute is
obliterated form the buffer as well.

Reported-by: Tariq Toukan <tariqt@mellanox.com>
Reported-by: Casey Leedom <leedom@chelsio.com>
Fixes: 4a336a23d6 ("kobject: copy env blob in one go")
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-21 11:10:33 +01:00
David S. Miller b36025b19a Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2017-12-17

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) Fix a corner case in generic XDP where we have non-linear skbs
   but enough tailroom in the skb to not miss to linearizing there,
   from Song.

2) Fix BPF JIT bugs in s390x and ppc64 to not recache skb data when
   BPF context is not skb, from Daniel.

3) Fix a BPF JIT bug in sparc64 where recaching skb data after helper
   call would use the wrong register for the skb, from Daniel.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-18 10:49:22 -05:00
David S. Miller c30abd5e40 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Three sets of overlapping changes, two in the packet scheduler
and one in the meson-gxl PHY driver.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-16 22:11:55 -05:00
Linus Torvalds 1f76a75561 Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixes from Ingo Molnar:
 "Misc fixes:

   - Fix a S390 boot hang that was caused by the lock-break logic.
     Remove lock-break to begin with, as review suggested it was
     unreasonably fragile and our confidence in its continued good
     health is lower than our confidence in its removal.

   - Remove the lockdep cross-release checking code for now, because of
     unresolved false positive warnings. This should make lockdep work
     well everywhere again.

   - Get rid of the final (and single) ACCESS_ONCE() straggler and
     remove the API from v4.15.

   - Fix a liblockdep build warning"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  tools/lib/lockdep: Add missing declaration of 'pr_cont()'
  checkpatch: Remove ACCESS_ONCE() warning
  compiler.h: Remove ACCESS_ONCE()
  tools/include: Remove ACCESS_ONCE()
  tools/perf: Convert ACCESS_ONCE() to READ_ONCE()
  locking/lockdep: Remove the cross-release locking checks
  locking/core: Remove break_lock field when CONFIG_GENERIC_LOCKBREAK=y
  locking/core: Fix deadlock during boot on systems with GENERIC_LOCKBREAK
2017-12-15 11:44:59 -08:00
Daniel Borkmann 87ab819430 bpf: add test case for ld_abs and helper changing pkt data
Add a test that i) uses LD_ABS, ii) zeroing R6 before call, iii) calls
a helper that triggers reload of cached skb data, iv) uses LD_ABS again.
It's added for test_bpf in order to do runtime testing after JITing as
well as test_verifier to test that the sequence is allowed.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2017-12-15 09:19:36 -08:00
Chris Wilson 338f1d9d1b lib/rbtree,drm/mm: add rbtree_replace_node_cached()
Add a variant of rbtree_replace_node() that maintains the leftmost cache
of struct rbtree_root_cached when replacing nodes within the rbtree.

As drm_mm is the only rb_replace_node() being used on an interval tree,
the mistake looks fairly self-contained.  Furthermore the only user of
drm_mm_replace_node() is its testsuite...

Testcase: igt/drm_mm/replace

Link: http://lkml.kernel.org/r/20171122100729.3742-1-chris@chris-wilson.co.uk
Link: https://patchwork.freedesktop.org/patch/msgid/20171109212435.9265-1-chris@chris-wilson.co.uk
Fixes: f808c13fd3 ("lib/interval_tree: fast overlap detection")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-14 16:00:48 -08:00
Ingo Molnar e966eaeeb6 locking/lockdep: Remove the cross-release locking checks
This code (CONFIG_LOCKDEP_CROSSRELEASE=y and CONFIG_LOCKDEP_COMPLETIONS=y),
while it found a number of old bugs initially, was also causing too many
false positives that caused people to disable lockdep - which is arguably
a worse overall outcome.

If we disable cross-release by default but keep the code upstream then
in practice the most likely outcome is that we'll allow the situation
to degrade gradually, by allowing entropy to introduce more and more
false positives, until it overwhelms maintenance capacity.

Another bad side effect was that people were trying to work around
the false positives by uglifying/complicating unrelated code. There's
a marked difference between annotating locking operations and
uglifying good code just due to bad lock debugging code ...

This gradual decrease in quality happened to a number of debugging
facilities in the kernel, and lockdep is pretty complex already,
so we cannot risk this outcome.

Either cross-release checking can be done right with no false positives,
or it should not be included in the upstream kernel.

( Note that it might make sense to maintain it out of tree and go through
  the false positives every now and then and see whether new bugs were
  introduced. )

Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-12 12:38:51 +01:00
Randy Dunlap d063623b6a Documentation: add UUID/GUID to kernel-api
Update kernel-doc notation in lib/uuid.c and then add UUID/GUID
function interfaces to kernel-api.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
[jc: tweaked the uuid_is_valid() kerneldoc]
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2017-12-11 15:03:08 -07:00
Kees Cook 0f7cda2b82 Kconfig: Make STRICT_DEVMEM default-y on x86 and arm64
Distros have been shipping with CONFIG_STRICT_DEVMEM=y for years now. It
is probably time to flip this default for x86 and arm64.

Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Laura Abbott <labbott@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: kernel-hardening@lists.openwall.com
Link: http://lkml.kernel.org/r/20171201201000.GA44539@beast
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-11 18:41:26 +01:00
Tom Herbert 64e0cd0d35 rhashtable: Call library function alloc_bucket_locks
To allocate the array of bucket locks for the hash table we now
call library function alloc_bucket_spinlocks. This function is
based on the old alloc_bucket_locks in rhashtable and should
produce the same effect.

Signed-off-by: Tom Herbert <tom@quantonium.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11 09:58:39 -05:00
Tom Herbert 92f36cca57 spinlock: Add library function to allocate spinlock buckets array
Add two new library functions: alloc_bucket_spinlocks and
free_bucket_spinlocks. These are used to allocate and free an array
of spinlocks that are useful as locks for hash buckets. The interface
specifies the maximum number of spinlocks in the array as well
as a CPU multiplier to derive the number of spinlocks to allocate.
The number allocated is rounded up to a power of two to make the
array amenable to hash lookup.

Signed-off-by: Tom Herbert <tom@quantonium.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11 09:58:39 -05:00
Tom Herbert 2db54b475a rhashtable: Add rhastable_walk_peek
This function is like rhashtable_walk_next except that it only returns
the current element in the inter and does not advance the iter.

This patch also creates __rhashtable_walk_find_next. It finds the next
element in the table when the entry cached in iter is NULL or at the end
of a slot. __rhashtable_walk_find_next is called from
rhashtable_walk_next and rhastable_walk_peek.

end_of_table is an added field to the iter structure. This indicates
that the end of table was reached (walker.tbl being NULL is not a
sufficient condition for end of table).

Signed-off-by: Tom Herbert <tom@quantonium.net>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11 09:58:38 -05:00
Tom Herbert 97a6ec4ac0 rhashtable: Change rhashtable_walk_start to return void
Most callers of rhashtable_walk_start don't care about a resize event
which is indicated by a return value of -EAGAIN. So calls to
rhashtable_walk_start are wrapped wih code to ignore -EAGAIN. Something
like this is common:

       ret = rhashtable_walk_start(rhiter);
       if (ret && ret != -EAGAIN)
               goto out;

Since zero and -EAGAIN are the only possible return values from the
function this check is pointless. The condition never evaluates to true.

This patch changes rhashtable_walk_start to return void. This simplifies
code for the callers that ignore -EAGAIN. For the few cases where the
caller cares about the resize event, particularly where the table can be
walked in mulitple parts for netlink or seq file dump, the function
rhashtable_walk_start_check has been added that returns -EAGAIN on a
resize event.

Signed-off-by: Tom Herbert <tom@quantonium.net>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11 09:58:38 -05:00
Christophe Leroy a0e94598e6 Fix misannotated out-of-line _copy_to_user()
Destination is a kernel pointer and source - a userland one
in _copy_from_user(); _copy_to_user() is the other way round.

Fixes: d597580d37 ("generic ...copy_..._user primitives")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-12-11 09:35:11 -05:00
Greg Kroah-Hartman 73cf7e111e Merge 4.15-rc3 into driver-core-next
We want the fixes and changes in here for testing.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-11 08:50:05 +01:00
James Morris 4ded3bec65 Keyrings fixes
-----BEGIN PGP SIGNATURE-----
 
 iQIVAwUAWiqvvPSw1s6N8H32AQLq0Q/+LaD+nbuVGCnTSQ1h3QyU2xJrvhnjJseU
 CpsMctc4XT4iyI7VjhhVZQBQm7bnS3ZEFSz95aDpNsTA8z4VTF3loVvqaaRV2k3q
 dBciol9sqPbzSLLvS49lxoa4ZfxATvwvzhiVq5yi1XavWhl1svf42kH27J3PJ+DV
 IzzaARNZ4GmiJOx4TUfN7jypro6Vc0k6oILAKUjifJ9mCh9eYUoHILzzlOg+FqtY
 OIFcHdc7kbNDJwHaChZDxMK/jbp87feQ6z+H66zoIClV405g9MBIO21xVnikL1fK
 4SHdfLPy4XVgLVYqseAMoAcm7ZIxF/WYBDpfdd77urCc5FPifILWDJ4sn4SPRTZT
 2eE761RmyH3TB6Xe6TyCL7NjpFruI7X2hFExgDVYnbsAlPlMvHB2BvcBMaeFDTnn
 Gh7w+FX4LT9hC9hNyOEop5fB0Bi3A53/3W4vQWStTMNslDh7at9p1SJlcFf/Ix6M
 P1TcsaPR3P0juZ4Klku3JOIyU8e1ogji1W+vjGe/2fCSS5dTZnsF4wOd8dqCc/PB
 gZJk6MDt8CIm7amTEj50Pp+35MlOolnM+TC4DLZ9i8nZf73Ec9SIbuXfX3wKK1JX
 5pde9oUlX3/QTQPso2Dt7Yzl5Vnhwro48Hf5IHV45wyp1stFggB0t4AmpAxA1g87
 odyy5L1AESw=
 =Yz52
 -----END PGP SIGNATURE-----

Merge tag 'keys-fixes-20171208' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs into keys-for-linus

Assorted fixes for keyrings, ASN.1, X.509 and PKCS#7.
2017-12-09 14:39:48 +11:00
Eric Biggers 8dfd2f22d3 509: fix printing uninitialized stack memory when OID is empty
Callers of sprint_oid() do not check its return value before printing
the result.  In the case where the OID is zero-length, -EBADMSG was
being returned without anything being written to the buffer, resulting
in uninitialized stack memory being printed.  Fix this by writing
"(bad)" to the buffer in the cases where -EBADMSG is returned.

Fixes: 4f73175d03 ("X.509: Add utility functions to render OIDs as strings")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2017-12-08 15:13:28 +00:00
Eric Biggers 47e0a208fb X.509: fix buffer overflow detection in sprint_oid()
In sprint_oid(), if the input buffer were to be more than 1 byte too
small for the first snprintf(), 'bufsize' would underflow, causing a
buffer overflow when printing the remainder of the OID.

Fortunately this cannot actually happen currently, because no users pass
in a buffer that can be too small for the first snprintf().

Regardless, fix it by checking the snprintf() return value correctly.

For consistency also tweak the second snprintf() check to look the same.

Fixes: 4f73175d03 ("X.509: Add utility functions to render OIDs as strings")
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <james.l.morris@oracle.com>
2017-12-08 15:13:28 +00:00
Eric Biggers 81a7be2cd6 ASN.1: check for error from ASN1_OP_END__ACT actions
asn1_ber_decoder() was ignoring errors from actions associated with the
opcodes ASN1_OP_END_SEQ_ACT, ASN1_OP_END_SET_ACT,
ASN1_OP_END_SEQ_OF_ACT, and ASN1_OP_END_SET_OF_ACT.  In practice, this
meant the pkcs7_note_signed_info() action (since that was the only user
of those opcodes).  Fix it by checking for the error, just like the
decoder does for actions associated with the other opcodes.

This bug allowed users to leak slab memory by repeatedly trying to add a
specially crafted "pkcs7_test" key (requires CONFIG_PKCS7_TEST_KEY).

In theory, this bug could also be used to bypass module signature
verification, by providing a PKCS#7 message that is misparsed such that
a signature's ->authattrs do not contain its ->msgdigest.  But it
doesn't seem practical in normal cases, due to restrictions on the
format of the ->authattrs.

Fixes: 42d5ec27f8 ("X.509: Add an ASN.1 decoder")
Cc: <stable@vger.kernel.org> # v3.7+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <james.l.morris@oracle.com>
2017-12-08 15:13:27 +00:00
Eric Biggers e0058f3a87 ASN.1: fix out-of-bounds read when parsing indefinite length item
In asn1_ber_decoder(), indefinitely-sized ASN.1 items were being passed
to the action functions before their lengths had been computed, using
the bogus length of 0x80 (ASN1_INDEFINITE_LENGTH).  This resulted in
reading data past the end of the input buffer, when given a specially
crafted message.

Fix it by rearranging the code so that the indefinite length is resolved
before the action is called.

This bug was originally found by fuzzing the X.509 parser in userspace
using libFuzzer from the LLVM project.

KASAN report (cleaned up slightly):

    BUG: KASAN: slab-out-of-bounds in memcpy ./include/linux/string.h:341 [inline]
    BUG: KASAN: slab-out-of-bounds in x509_fabricate_name.constprop.1+0x1a4/0x940 crypto/asymmetric_keys/x509_cert_parser.c:366
    Read of size 128 at addr ffff880035dd9eaf by task keyctl/195

    CPU: 1 PID: 195 Comm: keyctl Not tainted 4.14.0-09238-g1d3b78bbc6e9 #26
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014
    Call Trace:
     __dump_stack lib/dump_stack.c:17 [inline]
     dump_stack+0xd1/0x175 lib/dump_stack.c:53
     print_address_description+0x78/0x260 mm/kasan/report.c:252
     kasan_report_error mm/kasan/report.c:351 [inline]
     kasan_report+0x23f/0x350 mm/kasan/report.c:409
     memcpy+0x1f/0x50 mm/kasan/kasan.c:302
     memcpy ./include/linux/string.h:341 [inline]
     x509_fabricate_name.constprop.1+0x1a4/0x940 crypto/asymmetric_keys/x509_cert_parser.c:366
     asn1_ber_decoder+0xb4a/0x1fd0 lib/asn1_decoder.c:447
     x509_cert_parse+0x1c7/0x620 crypto/asymmetric_keys/x509_cert_parser.c:89
     x509_key_preparse+0x61/0x750 crypto/asymmetric_keys/x509_public_key.c:174
     asymmetric_key_preparse+0xa4/0x150 crypto/asymmetric_keys/asymmetric_type.c:388
     key_create_or_update+0x4d4/0x10a0 security/keys/key.c:850
     SYSC_add_key security/keys/keyctl.c:122 [inline]
     SyS_add_key+0xe8/0x290 security/keys/keyctl.c:62
     entry_SYSCALL_64_fastpath+0x1f/0x96

    Allocated by task 195:
     __do_kmalloc_node mm/slab.c:3675 [inline]
     __kmalloc_node+0x47/0x60 mm/slab.c:3682
     kvmalloc ./include/linux/mm.h:540 [inline]
     SYSC_add_key security/keys/keyctl.c:104 [inline]
     SyS_add_key+0x19e/0x290 security/keys/keyctl.c:62
     entry_SYSCALL_64_fastpath+0x1f/0x96

Fixes: 42d5ec27f8 ("X.509: Add an ASN.1 decoder")
Reported-by: Alexander Potapenko <glider@google.com>
Cc: <stable@vger.kernel.org> # v3.7+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2017-12-08 15:13:27 +00:00
David Ahern 6e237d099f netlink: Relax attr validation for fixed length types
Commit 28033ae4e0 ("net: netlink: Update attr validation to require
exact length for some types") requires attributes using types NLA_U* and
NLA_S* to have an exact length. This change is exposing bugs in various
userspace commands that are sending attributes with an invalid length
(e.g., attribute has type NLA_U8 and userspace sends NLA_U32). While
the commands are clearly broken and need to be fixed, users are arguing
that the sudden change in enforcement is breaking older commands on
newer kernels for use cases that otherwise "worked".

Relax the validation to print a warning mesage similar to what is done
for messages containing extra bytes after parsing.

Fixes: 28033ae4e0 ("net: netlink: Update attr validation to require exact length for some types")
Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-07 14:00:57 -05:00
Greg Kroah-Hartman 045c5f75b7 kobject: Remove redundant license text
Now that the SPDX tag is in all kobject files, that identifies the
license in a specific and legally-defined manner.  So the extra GPL text
wording can be removed as it is no longer needed at all.

This is done on a quest to remove the 700+ different ways that files in
the kernel describe the GPL license text.  And there's unneeded stuff
like the address (sometimes incorrect) for the FSF which is never
needed.

No copyright headers or other non-license-description text was removed.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-07 18:36:43 +01:00
Greg Kroah-Hartman d9d16e16a3 kobject: add SPDX identifiers to all kobject files
It's good to have SPDX identifiers in all files to make it easier to
audit the kernel tree for correct licenses.

Update the kobject files files with the correct SPDX license identifier
based on the license text in the file itself.  The SPDX identifier is a
legally binding shorthand, which can be used instead of the full boiler
plate text.

This work is based on a script and data from Thomas Gleixner, Philippe
Ombredanne, and Kate Stewart.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-07 18:36:43 +01:00
Paul E. McKenney 516df05061 lib/assoc_array: Remove smp_read_barrier_depends()
Now that smp_read_barrier_depends() is implied by READ_ONCE(), the several
smp_read_barrier_depends() calls may be removed from lib/assoc_array.c.
This commit makes this change and marks the READ_ONCE() calls that head
address dependencies.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Kuleshov <kuleshovmail@gmail.com>
Cc: David Howells <dhowells@redhat.com>
2017-12-04 10:52:56 -08:00
Paul E. McKenney b393e8b33e percpu: READ_ONCE() now implies smp_read_barrier_depends()
Because READ_ONCE() now implies smp_read_barrier_depends(), this commit
removes the now-redundant smp_read_barrier_depends() following the
READ_ONCE() in __ref_is_percpu().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
2017-12-04 10:52:53 -08:00
Linus Torvalds e1ba1c99da RISC-V Cleanups and ABI Fixes for 4.15-rc2
This tag contains a handful of small cleanups that are a result of
 feedback that didn't make it into our original patch set, either because
 the feedback hadn't been given yet, I missed the original emails, or
 we weren't ready to submit the changes yet.
 
 I've been maintaining the various cleanup patch sets I have as their own
 branches, which I then merged together and signed.  Each merge commit
 has a short summary of the changes, and each branch is based on your
 latest tag (4.15-rc1, in this case).  If this isn't the right way to do
 this then feel free to suggest something else, but it seems sane to me.
 
 Here's a short summary of the changes, roughly in order of how
 interesting they are.
 
 * libgcc.h has been moved from include/lib, where it's the only member,
   to include/linux.  This is meant to avoid tab completion conflicts.
 * VDSO entries for clock_get/gettimeofday/getcpu have been added.  These
   are simple syscalls now, but we want to let glibc use them from the
   start so we can make them faster later.
 * A VDSO entry for instruction cache flushing has been added so
   userspace can flush the instruction cache.
 * The VDSO symbol versions for __vdso_cmpxchg{32,64} have been removed,
   as those VDSO entries don't actually exist.
 * __io_writes has been corrected to respect the given type.
 * A new READ_ONCE in arch_spin_is_locked().
 * __test_and_op_bit_ord() is now actually ordered.
 * Various small fixes throughout the tree to enable allmodconfig to
   build cleanly.
 * Removal of some dead code in our atomic support headers.
 * Improvements to various comments in our atomic support headers.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEAM520YNJYN/OiG3470yhUCzLq0EFAlohyvMTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRDvTKFQLMurQWkhD/wO/F8vrwsNMOWR8zxvHdB30KD+FHmr
 X1+X9OqnH8AMd4Woj6pS0ap7g0GCKuLiI/bOTrQVVdTJpmKFaJ9rrwRCJzHq43yt
 feRjKyPAFYlvf6YaIEJ3YHU0t3LO1eK27YyFMg6F8y+bZim6oK2GdyfYF0Xiik3B
 L3NkDPSH4oplTJjUI+tzDZdMsuZKhxpXPnbNQA7YZLepz04jOPGWqFrA1C3gAaVQ
 dj1OkOGTSyQFwia7LrIm2g0J5/mqpjAF0KjdiTsvH6G9x3V0HZYU5Br3kHgauWKc
 YrNEbbDl8EakT5QocPf5F4Z8qpO9Hvxjwe2/z27usPtV9FQOPuDDPOygSPykwNNJ
 bDfv9nIE3W7lN26BaRcV2ivY3r9ZpCEcq+qXIiTm3P/uTVqjMq54NkvHnj4ON1ih
 DJZEgkM9L+rm7c9XDn627FBkmkeEndPJcQ3P/nopb5zGTYTb2HGrUt2nM+KR2vuE
 FdYtA9+ll3OzyFO3OVVjiAlxr8Qnwf2wIWXJXxWpcmmchGJ5NeTSZtiD14pAP5eC
 EDpoWwefvhqRMGdOlgq/fkx4Mrhz27euWXine3ZccprABAf7Hxkb/N5ojIJKT7qW
 mN3HL3PC9P0t/HxQEu0q0NLLsP+X/1yZ5HmDl44Y7N8aeCrIUXaB61gsTt6Oi6Ha
 PMJi5PI6VDDQbA==
 =CCe+
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-4.15-rc2_cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux

Pull RISC-V cleanups and ABI fixes from Palmer Dabbelt:
 "This contains a handful of small cleanups that are a result of
  feedback that didn't make it into our original patch set, either
  because the feedback hadn't been given yet, I missed the original
  emails, or we weren't ready to submit the changes yet.

  I've been maintaining the various cleanup patch sets I have as their
  own branches, which I then merged together and signed. Each merge
  commit has a short summary of the changes, and each branch is based on
  your latest tag (4.15-rc1, in this case). If this isn't the right way
  to do this then feel free to suggest something else, but it seems sane
  to me.

  Here's a short summary of the changes, roughly in order of how
  interesting they are.

   - libgcc.h has been moved from include/lib, where it's the only
     member, to include/linux. This is meant to avoid tab completion
     conflicts.

   - VDSO entries for clock_get/gettimeofday/getcpu have been added.
     These are simple syscalls now, but we want to let glibc use them
     from the start so we can make them faster later.

   - A VDSO entry for instruction cache flushing has been added so
     userspace can flush the instruction cache.

   - The VDSO symbol versions for __vdso_cmpxchg{32,64} have been
     removed, as those VDSO entries don't actually exist.

   - __io_writes has been corrected to respect the given type.

   - A new READ_ONCE in arch_spin_is_locked().

   - __test_and_op_bit_ord() is now actually ordered.

   - Various small fixes throughout the tree to enable allmodconfig to
     build cleanly.

   - Removal of some dead code in our atomic support headers.

   - Improvements to various comments in our atomic support headers"

* tag 'riscv-for-linus-4.15-rc2_cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux: (23 commits)
  RISC-V: __io_writes should respect the length argument
  move libgcc.h to include/linux
  RISC-V: Clean up an unused include
  RISC-V: Allow userspace to flush the instruction cache
  RISC-V: Flush I$ when making a dirty page executable
  RISC-V: Add missing include
  RISC-V: Use define for get_cycles like other architectures
  RISC-V: Provide stub of setup_profiling_timer()
  RISC-V: Export some expected symbols for modules
  RISC-V: move empty_zero_page definition to C and export it
  RISC-V: io.h: type fixes for warnings
  RISC-V: use RISCV_{INT,SHORT} instead of {INT,SHORT} for asm macros
  RISC-V: use generic serial.h
  RISC-V: remove spin_unlock_wait()
  RISC-V: `sfence.vma` orderes the instruction cache
  RISC-V: Add READ_ONCE in arch_spin_is_locked()
  RISC-V: __test_and_op_bit_ord should be strongly ordered
  RISC-V: Remove smb_mb__{before,after}_spinlock()
  RISC-V: Remove __smp_bp__{before,after}_atomic
  RISC-V: Comment on why {,cmp}xchg is ordered how it is
  ...
2017-12-01 19:39:12 -05:00
Christoph Hellwig 4db2b604c0 move libgcc.h to include/linux
Introducing a new include/lib directory just for this file totally
messes up tab completion for include/linux, which is highly annoying.

Move it to include/linux where we have headers for all kinds of other
lib/ code as well.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2017-12-01 13:09:40 -08:00
Linus Torvalds ef0010a309 vsprintf: don't use 'restricted_pointer()' when not restricting
Instead, just fall back on the new '%p' behavior which hashes the
pointer.

Otherwise, '%pK' - that was intended to mark a pointer as restricted -
just ends up leaking pointers that a normal '%p' wouldn't leak.  Which
just make the whole thing pointless.

I suspect we should actually get rid of '%pK' entirely, and make it just
work as '%p' regardless, but this is the minimal obvious fix.  People
who actually use 'kptr_restrict' should weigh in on which behavior they
want.

Cc: Tobin Harding <me@tobin.cc>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-29 11:28:09 -08:00
Eric Biggers 9f480faec5 crypto: chacha20 - Fix keystream alignment for chacha20_block()
When chacha20_block() outputs the keystream block, it uses 'u32' stores
directly.  However, the callers (crypto/chacha20_generic.c and
drivers/char/random.c) declare the keystream buffer as a 'u8' array,
which is not guaranteed to have the needed alignment.

Fix it by having both callers declare the keystream as a 'u32' array.
For now this is preferable to switching over to the unaligned access
macros because chacha20_block() is only being used in cases where we can
easily control the alignment (stack buffers).

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-11-29 17:33:33 +11:00
Tobin C. Harding 7b1924a1d9 vsprintf: add printk specifier %px
printk specifier %p now hashes all addresses before printing. Sometimes
we need to see the actual unmodified address. This can be achieved using
%lx but then we face the risk that if in future we want to change the
way the Kernel handles printing of pointers we will have to grep through
the already existent 50 000 %lx call sites. Let's add specifier %px as a
clear, opt-in, way to print a pointer and maintain some level of
isolation from all the other hex integer output within the Kernel.

Add printk specifier %px to print the actual unmodified address.

Signed-off-by: Tobin C. Harding <me@tobin.cc>
2017-11-29 12:13:14 +11:00
Tobin C. Harding ad67b74d24 printk: hash addresses printed with %p
Currently there exist approximately 14 000 places in the kernel where
addresses are being printed using an unadorned %p. This potentially
leaks sensitive information regarding the Kernel layout in memory. Many
of these calls are stale, instead of fixing every call lets hash the
address by default before printing. This will of course break some
users, forcing code printing needed addresses to be updated.

Code that _really_ needs the address will soon be able to use the new
printk specifier %px to print the address.

For what it's worth, usage of unadorned %p can be broken down as
follows (thanks to Joe Perches).

$ git grep -E '%p[^A-Za-z0-9]' | cut -f1 -d"/" | sort | uniq -c
   1084 arch
     20 block
     10 crypto
     32 Documentation
   8121 drivers
   1221 fs
    143 include
    101 kernel
     69 lib
    100 mm
   1510 net
     40 samples
      7 scripts
     11 security
    166 sound
    152 tools
      2 virt

Add function ptr_to_id() to map an address to a 32 bit unique
identifier. Hash any unadorned usage of specifier %p and any malformed
specifiers.

Signed-off-by: Tobin C. Harding <me@tobin.cc>
2017-11-29 12:09:02 +11:00
Tobin C. Harding 57e734423a vsprintf: refactor %pK code out of pointer()
Currently code to handle %pK is all within the switch statement in
pointer(). This is the wrong level of abstraction. Each of the other switch
clauses call a helper function, pK should do the same.

Refactor code out of pointer() to new function restricted_pointer().

Signed-off-by: Tobin C. Harding <me@tobin.cc>
2017-11-29 12:03:24 +11:00
Linus Torvalds 844056fd74 Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer updates from Thomas Gleixner:

 - The final conversion of timer wheel timers to timer_setup().

   A few manual conversions and a large coccinelle assisted sweep and
   the removal of the old initialization mechanisms and the related
   code.

 - Remove the now unused VSYSCALL update code

 - Fix permissions of /proc/timer_list. I still need to get rid of that
   file completely

 - Rename a misnomed clocksource function and remove a stale declaration

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
  m68k/macboing: Fix missed timer callback assignment
  treewide: Remove TIMER_FUNC_TYPE and TIMER_DATA_TYPE casts
  timer: Remove redundant __setup_timer*() macros
  timer: Pass function down to initialization routines
  timer: Remove unused data arguments from macros
  timer: Switch callback prototype to take struct timer_list * argument
  timer: Pass timer_list pointer to callbacks unconditionally
  Coccinelle: Remove setup_timer.cocci
  timer: Remove setup_*timer() interface
  timer: Remove init_timer() interface
  treewide: setup_timer() -> timer_setup() (2 field)
  treewide: setup_timer() -> timer_setup()
  treewide: init_timer() -> setup_timer()
  treewide: Switch DEFINE_TIMER callbacks to struct timer_list *
  s390: cmm: Convert timers to use timer_setup()
  lightnvm: Convert timers to use timer_setup()
  drivers/net: cris: Convert timers to use timer_setup()
  drm/vc4: Convert timers to use timer_setup()
  block/laptop_mode: Convert timers to use timer_setup()
  net/atm/mpc: Avoid open-coded assignment of timer callback function
  ...
2017-11-25 08:37:16 -10:00
Linus Torvalds 14b661ebb6 This pull request contains the following core changes:
General changes:
    * Unconfuse get_unmapped_area and point/unpoint driver methods
    * New partition parser: sharpslpart
    * Kill GENERIC_IO
    * Various fixes
 
 NAND changes:
    * Add a flag to mark NANDs that require 3 address cycles to encode a
      page address
    * Set a default ECC/free layout when NAND_ECC_NONE is requested
    * Fix a bug in panic_nand_write()
    * Another batch of cleanups for the denali driver
    * Fix PM support in the atmel driver
    * Remove support for platform data in the omap driver
    * Fix subpage write in the omap driver
    * Fix irq handling in the mtk driver
    * Change link order of mtk_ecc and mtk_nand drivers to speed up boot
      time
    * Change log level of ECC error messages in the mxc driver
    * Patch the pxa3xx driver to support Armada 8k platforms
    * Add BAM DMA support to the qcom driver
    * Convert gpio-nand to the GPIO desc API
    * Fix ECC handling in the mt29f driver
 
 SPI-NOR changes:
    * Introduce system power management support
    * New mechanism to select the proper .quad_enable() hook by JEDEC ID,
      when needed, instead of only by manufacturer ID
    * Add support to new memory parts from Gigadevice, Winbond, Macronix and
      Everspin
    * Maintainance for Cadence, Intel, Mediatek and STM32 drivers
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJaEzkZAAoJEGb5WYXrGLvBiUMP/25eEatNd5pGo9rtXqX463kp
 Q8zXGwtGp7Y2ThtC2TMbSSZZFdhGXIv3AUGpW+Y1yFMzGbiwWh8T28rdgDKDINhl
 jQteoWGQnZnnLhsMEbApJUqqtlxKFkY6COv/fUItmN8a4E5SyYF6ARKdnxH36Quu
 j/i3Kyd1FjDzJE2jsAE6TuomlNRuj/4S0OiZBTlgMhQvbo282Rush6RmF5zAvsdN
 B+S45Q752Pypg3U+1IYkqFSOtSYS3NM1ynZW7YXdWDwcKxDnKvasebSi+wCqPVc8
 n6hkcnXKIMOB6/bGhLg3FZlrzJcH7cbxy2C40NKFmMa7gw+/h1bmvjZk9hubLEc3
 +EJ8/1e8Z/KNTGu+Iyy2BNHTLI+KFKM5n/7/mpSPHMP/0uQjYs95GUmPlhVrenuv
 wprVsQKj7k92E+5Vm/h+Gys67sEG/rQK0v9UEConzl1s2T7i/hnA2lhPfIFmbMU/
 9U2s0CFobDqFUh+O6FSkLg9AT7+gT2HA1t6bbDTJMgnbFW72vlDUiArniia9hWOx
 dSc5pxMnaSiiqk+uCma4zLv2/3Tyi5dAEMQy+qAlK1EpmwPAsyu3SEMbyraovb9S
 PW0YQcMxVlQ/+EdDZCi83ypMlMQE/fDNcuKVMQD9enbko9yKGEgSZsTm9XwIvAv6
 g0P5jYMind1aNNSfg/QM
 =wVm7
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-20171120' of git://git.infradead.org/linux-mtd

Pull MTD updates from Richard Weinberger:
 "General changes:
   -  Unconfuse get_unmapped_area and point/unpoint driver methods
   -  New partition parser: sharpslpart
   -  Kill GENERIC_IO
   -  Various fixes

  NAND changes:
   -  Add a flag to mark NANDs that require 3 address cycles to encode a
      page address
   -  Set a default ECC/free layout when NAND_ECC_NONE is requested
   -  Fix a bug in panic_nand_write()
   -  Another batch of cleanups for the denali driver
   -  Fix PM support in the atmel driver
   -  Remove support for platform data in the omap driver
   -  Fix subpage write in the omap driver
   -  Fix irq handling in the mtk driver
   -  Change link order of mtk_ecc and mtk_nand drivers to speed up boot
      time
   -  Change log level of ECC error messages in the mxc driver
   -  Patch the pxa3xx driver to support Armada 8k platforms
   -  Add BAM DMA support to the qcom driver
   -  Convert gpio-nand to the GPIO desc API
   -  Fix ECC handling in the mt29f driver

  SPI-NOR changes:
   -  Introduce system power management support
   -  New mechanism to select the proper .quad_enable() hook by JEDEC
      ID, when needed, instead of only by manufacturer ID
   -  Add support to new memory parts from Gigadevice, Winbond, Macronix
      and Everspin
   -  Maintainance for Cadence, Intel, Mediatek and STM32 drivers"

*  tag 'for-linus-20171120' of git://git.infradead.org/linux-mtd: (85 commits)
  mtd: Avoid probe failures when mtd->dbg.dfs_dir is invalid
  mtd: sharpslpart: Add sharpslpart partition parser
  mtd: Add sanity checks in mtd_write/read_oob()
  mtd: remove the get_unmapped_area method
  mtd: implement mtd_get_unmapped_area() using the point method
  mtd: chips/map_rom.c: implement point and unpoint methods
  mtd: chips/map_ram.c: implement point and unpoint methods
  mtd: mtdram: properly handle the phys argument in the point method
  mtd: mtdswap: fix spelling mistake: 'TRESHOLD' -> 'THRESHOLD'
  mtd: slram: use memremap() instead of ioremap()
  kconfig: kill off GENERIC_IO option
  mtd: Fix C++ comment in include/linux/mtd/mtd.h
  mtd: constify mtd_partition
  mtd: plat-ram: Replace manual resource management by devm
  mtd: nand: Fix writing mtdoops to nand flash.
  mtd: intel-spi: Add Intel Lewisburg PCH SPI super SKU PCI ID
  mtd: nand: mtk: fix infinite ECC decode IRQ issue
  mtd: spi-nor: Add support for mr25h128
  mtd: nand: mtk: change the compile sequence of mtk_nand.o and mtk_ecc.o
  mtd: spi-nor: enable 4B opcodes for mx66l51235l
  ...
2017-11-22 20:46:06 -10:00
Kees Cook 24ed960abf treewide: Switch DEFINE_TIMER callbacks to struct timer_list *
This changes all DEFINE_TIMER() callbacks to use a struct timer_list
pointer instead of unsigned long. Since the data argument has already been
removed, none of these callbacks are using their argument currently, so
this renames the argument to "unused".

Done using the following semantic patch:

@match_define_timer@
declarer name DEFINE_TIMER;
identifier _timer, _callback;
@@

 DEFINE_TIMER(_timer, _callback);

@change_callback depends on match_define_timer@
identifier match_define_timer._callback;
type _origtype;
identifier _origarg;
@@

 void
-_callback(_origtype _origarg)
+_callback(struct timer_list *unused)
 { ... }

Signed-off-by: Kees Cook <keescook@chromium.org>
2017-11-21 15:57:05 -08:00
Linus Torvalds fa7f578076 Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:

 - a bit more MM

 - procfs updates

 - dynamic-debug fixes

 - lib/ updates

 - checkpatch

 - epoll

 - nilfs2

 - signals

 - rapidio

 - PID management cleanup and optimization

 - kcov updates

 - sysvipc updates

 - quite a few misc things all over the place

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (94 commits)
  EXPERT Kconfig menu: fix broken EXPERT menu
  include/asm-generic/topology.h: remove unused parent_node() macro
  arch/tile/include/asm/topology.h: remove unused parent_node() macro
  arch/sparc/include/asm/topology_64.h: remove unused parent_node() macro
  arch/sh/include/asm/topology.h: remove unused parent_node() macro
  arch/ia64/include/asm/topology.h: remove unused parent_node() macro
  drivers/pcmcia/sa1111_badge4.c: avoid unused function warning
  mm: add infrastructure for get_user_pages_fast() benchmarking
  sysvipc: make get_maxid O(1) again
  sysvipc: properly name ipc_addid() limit parameter
  sysvipc: duplicate lock comments wrt ipc_addid()
  sysvipc: unteach ids->next_id for !CHECKPOINT_RESTORE
  initramfs: use time64_t timestamps
  drivers/watchdog: make use of devm_register_reboot_notifier()
  kernel/reboot.c: add devm_register_reboot_notifier()
  kcov: update documentation
  Makefile: support flag -fsanitizer-coverage=trace-cmp
  kcov: support comparison operands collection
  kcov: remove pointless current != NULL check
  kernel/panic.c: add TAINT_AUX
  ...
2017-11-17 16:56:17 -08:00
Victor Chibotaru d677a4d601 Makefile: support flag -fsanitizer-coverage=trace-cmp
The flag enables Clang instrumentation of comparison operations
(currently not supported by GCC).  This instrumentation is needed by the
new KCOV device to collect comparison operands.

Link: http://lkml.kernel.org/r/20171011095459.70721-2-glider@google.com
Signed-off-by: Victor Chibotaru <tchibo@google.com>
Signed-off-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Popov <alex.popov@linux.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: <syzkaller@googlegroups.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-17 16:10:04 -08:00
Yury Norov 4441fca0a2 lib: test module for find_*_bit() functions
find_bit functions are widely used in the kernel, including hot paths.
This module tests performance of those functions in 2 typical scenarios:
randomly filled bitmap with relatively equal distribution of set and
cleared bits, and sparse bitmap which has 1 set bit for 500 cleared
bits.

On ThunderX machine:

	 Start testing find_bit() with random-filled bitmap
	find_next_bit:          240043 cycles,  164062 iterations
	find_next_zero_bit:     312848 cycles,  163619 iterations
	find_last_bit:          193748 cycles,  164062 iterations
	find_first_bit:      177720874 cycles,  164062 iterations

	 Start testing find_bit() with sparse bitmap
	find_next_bit:            3633 cycles,     656 iterations
	find_next_zero_bit:     620399 cycles,  327025 iterations
	find_last_bit:            3038 cycles,     656 iterations
	find_first_bit:         691407 cycles,     656 iterations

[arnd@arndb.de: use correct format string for find-bit tests]
  Link: http://lkml.kernel.org/r/20171113135605.3166307-1-arnd@arndb.de
Link: http://lkml.kernel.org/r/20171109140714.13168-1-ynorov@caviumnetworks.com
Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Clement Courbet <courbet@google.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-17 16:10:02 -08:00
Davidlohr Bueso 0b548e33e6 lib/rbtree-test: lower default params
Fengguang reported soft lockups while running the rbtree and interval
tree test modules.  The logic for these tests all occur in init phase,
and we currently are pounding with the default values for number of
nodes and number of iterations of each test.  Reduce the latter by two
orders of magnitude.  This does not influence the value of the tests in
that one thousand times by default is enough to get the picture.

Link: http://lkml.kernel.org/r/20171109161715.xai2dtwqw2frhkcm@linux-n805
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-17 16:10:02 -08:00
Liu, Changcheng 2f9b7e08cb lib/nmi_backtrace.c: fix kernel text address leak
Don't leak idle function address in NMI backtrace.

Link: http://lkml.kernel.org/r/20171106165648.GA95243@sofia
Signed-off-by: Liu Changcheng <changcheng.liu@intel.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-17 16:10:02 -08:00
Stephen Bates 36a3d1dd4e lib/genalloc.c: make the avail variable an atomic_long_t
If the amount of resources allocated to a gen_pool exceeds 2^32 then the
avail atomic overflows and this causes problems when clients try and
borrow resources from the pool.  This is only expected to be an issue on
64 bit systems.

Add the <linux/atomic.h> header to pull in atomic_long* operations.  So
that 32 bit systems continue to use atomic32_t but 64 bit systems can
use atomic64_t.

Link: http://lkml.kernel.org/r/1509033843-25667-1-git-send-email-sbates@raithlin.com
Signed-off-by: Stephen Bates <sbates@raithlin.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Daniel Mentz <danielmentz@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-17 16:10:02 -08:00
Peter Zijlstra e813a61400 lib/int_sqrt: adjust comments
Our current int_sqrt() is not rough nor any approximation; it calculates
the exact value of: floor(sqrt()).  Document this.

Link: http://lkml.kernel.org/r/20171020164645.001652117@infradead.org
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Anshul Garg <aksgarg1989@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: David Miller <davem@davemloft.net>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Joe Perches <joe@perches.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Michael Davidson <md@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-17 16:10:01 -08:00
Peter Zijlstra f8ae107eef lib/int_sqrt: optimize initial value compute
The initial value (@m) compute is:

	m = 1UL << (BITS_PER_LONG - 2);
	while (m > x)
		m >>= 2;

Which is a linear search for the highest even bit smaller or equal to @x
We can implement this using a binary search using __fls() (or better when
its hardware implemented).

	m = 1UL << (__fls(x) & ~1UL);

Especially for small values of @x; which are the more common arguments
when doing a CDF on idle times; the linear search is near to worst case,
while the binary search of __fls() is a constant 6 (or 5 on 32bit)
branches.

      cycles:                 branches:              branch-misses:

PRE:

hot:   43.633557 +- 0.034373  45.333132 +- 0.002277  0.023529 +- 0.000681
cold: 207.438411 +- 0.125840  45.333132 +- 0.002277  6.976486 +- 0.004219

SOFTWARE FLS:

hot:   29.576176 +- 0.028850  26.666730 +- 0.004511  0.019463 +- 0.000663
cold: 165.947136 +- 0.188406  26.666746 +- 0.004511  6.133897 +- 0.004386

HARDWARE FLS:

hot:   24.720922 +- 0.025161  20.666784 +- 0.004509  0.020836 +- 0.000677
cold: 132.777197 +- 0.127471  20.666776 +- 0.004509  5.080285 +- 0.003874

Averages computed over all values <128k using a LFSR to generate order.
Cold numbers have a LFSR based branch trace buffer 'confuser' ran between
each int_sqrt() invocation.

Link: http://lkml.kernel.org/r/20171020164644.936577234@infradead.org
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Suggested-by: Joe Perches <joe@perches.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Anshul Garg <aksgarg1989@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: David Miller <davem@davemloft.net>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Michael Davidson <md@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-17 16:10:01 -08:00
Peter Zijlstra 3f3295709e lib/int_sqrt: optimize small argument
The current int_sqrt() computation is sub-optimal for the case of small
@x.  Which is the interesting case when we're going to do cumulative
distribution functions on idle times, which we assume to be a random
variable, where the target residency of the deepest idle state gives an
upper bound on the variable (5e6ns on recent Intel chips).

In the case of small @x, the compute loop:

	while (m != 0) {
		b = y + m;
		y >>= 1;

		if (x >= b) {
			x -= b;
			y += m;
		}
		m >>= 2;
	}

can be reduced to:

	while (m > x)
		m >>= 2;

Because y==0, b==m and until x>=m y will remain 0.

And while this is computationally equivalent, it runs much faster
because there's less code, in particular less branches.

      cycles:                 branches:              branch-misses:

OLD:

hot:   45.109444 +- 0.044117  44.333392 +- 0.002254  0.018723 +- 0.000593
cold: 187.737379 +- 0.156678  44.333407 +- 0.002254  6.272844 +- 0.004305

PRE:

hot:   67.937492 +- 0.064124  66.999535 +- 0.000488  0.066720 +- 0.001113
cold: 232.004379 +- 0.332811  66.999527 +- 0.000488  6.914634 +- 0.006568

POST:

hot:   43.633557 +- 0.034373  45.333132 +- 0.002277  0.023529 +- 0.000681
cold: 207.438411 +- 0.125840  45.333132 +- 0.002277  6.976486 +- 0.004219

Averages computed over all values <128k using a LFSR to generate order.
Cold numbers have a LFSR based branch trace buffer 'confuser' ran between
each int_sqrt() invocation.

Link: http://lkml.kernel.org/r/20171020164644.876503355@infradead.org
Fixes: 30493cc9dd ("lib/int_sqrt.c: optimize square root algorithm")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Suggested-by: Anshul Garg <aksgarg1989@gmail.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Joe Perches <joe@perches.com>
Cc: David Miller <davem@davemloft.net>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Michael Davidson <md@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-17 16:10:01 -08:00