After attempting to quickly retrieve known errors the kernel proceeds to
kick off a long running ARS. Add a module option to disable this
behavior at initialization time, or at new region discovery time.
Otherwise, ARS can be started manually regardless of the state of this
setting.
Co-developed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
ARS is an operation that can take 10s to 100s of seconds to find media
errors that should rarely be present. If the platform crashes due to
media errors in persistent memory, the expectation is that the BIOS will
report those known errors in a 'short' ARS request.
A 'short' ARS request asks platform firmware to return an ARS payload
with all known errors, but without issuing a 'long' scrub. At driver
init a short request is issued to all PMEM ranges before registering
regions. Then, in the background, a long ARS is scheduled for each
region.
The ARS implementation is simplified to centralize ARS completion work
in the ars_complete() helper. The timeout is removed since there is no
facility to cancel ARS, and this otherwise arranges for system init to
never be blocked waiting for a 'long' ARS. The ars_state flags are used
to coordinate ARS requests from driver init, ARS requests from
userspace, and ARS requests in response to media error notifications.
Given that there is no notification of ARS completion the implementation
still needs to poll. It backs off exponentially to a maximum poll period
of 30 minutes.
Suggested-by: Toshi Kani <toshi.kani@hpe.com>
Co-developed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
acpi_nfit_query_poison() is awkward in that it requires an nfit_spa
argument in order to determine what max_ars value to use. Instead probe
for the minimum max_ars across all scrub-capable ranges in the system
and drop the nfit_spa argument.
This enables a larger rework / simplification of the ARS state machine
whereby the status can be retrieved once and then iterated over all
address ranges to reap completions.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Scan the devicetree for an nvdimm-bus compatible and create
a platform device for them.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Add device-tree binding documentation for the nvdimm region driver.
Cc: devicetree@vger.kernel.org
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
This patch adds peliminary device-tree bindings for persistent memory
regions. The driver registers a libnvdimm bus for each pmem-region
node and each address range under the node is converted to a region
within that bus.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
We want to be able to cross reference the region and bus devices
with the device tree node that they were spawned from. libNVDIMM
handles creating the actual devices for these internally, so we
need to pass in a pointer to the relevant node in the descriptor.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The message about constraining number of online cpus to be less than or
equal to ND_MAX_LANES (256) is only useful for block-aperture
configurations and BTT. Make it debug since it is only relevant when
debugging performance.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The following NULL dereference results from incorrectly assuming that
ndd is valid in this print:
struct nvdimm_drvdata *ndd = to_ndd(&nd_region->mapping[i]);
/*
* Give up if we don't find an instance of a uuid at each
* position (from 0 to nd_region->ndr_mappings - 1), or if we
* find a dimm with two instances of the same uuid.
*/
dev_err(&nd_region->dev, "%s missing label for %pUb\n",
dev_name(ndd->dev), nd_label->uuid);
BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
IP: nd_region_register_namespaces+0xd67/0x13c0 [libnvdimm]
PGD 0 P4D 0
Oops: 0000 [#1] SMP PTI
CPU: 43 PID: 673 Comm: kworker/u609:10 Not tainted 4.16.0-rc4+ #1
[..]
RIP: 0010:nd_region_register_namespaces+0xd67/0x13c0 [libnvdimm]
[..]
Call Trace:
? devres_add+0x2f/0x40
? devm_kmalloc+0x52/0x60
? nd_region_activate+0x9c/0x320 [libnvdimm]
nd_region_probe+0x94/0x260 [libnvdimm]
? kernfs_add_one+0xe4/0x130
nvdimm_bus_probe+0x63/0x100 [libnvdimm]
Switch to using the nvdimm device directly.
Fixes: 0e3b0d123c ("libnvdimm, namespace: allow multiple pmem...")
Cc: <stable@vger.kernel.org>
Reported-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
At initialization time the 'dimm' driver caches a copy of the memory
device's label area and reserves address space for each of the
namespaces defined.
However, as can be seen below, the reservation occurs even when the
index blocks are invalid:
nvdimm nmem0: nvdimm_init_config_data: len: 131072 rc: 0
nvdimm nmem0: config data size: 131072
nvdimm nmem0: __nd_label_validate: nsindex0 labelsize 1 invalid
nvdimm nmem0: __nd_label_validate: nsindex1 labelsize 1 invalid
nvdimm nmem0: : pmem-6025e505: 0x1000000000 @ 0xf50000000 reserve <-- bad
Gate dpa reservation on the presence of valid index blocks.
Cc: <stable@vger.kernel.org>
Fixes: 4a826c83db ("libnvdimm: namespace indices: read and validate")
Reported-by: Krzysztof Rusocki <krzysztof.rusocki@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The default value for smart ctrl_temperature was the same as the
threshold for ctrl_temperature. As a result, any arbitrary smart
injection to the nfit_test dimm could cause this alarm to trigger
and cause an acpi notification. Drop the default value to below the
threshold, so that unrelated injections don't trigger notifications.
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Add support for the smart injection command in the nvdimm unit test
framework. This allows for directly injecting to smart fields and flags
that are supported in the injection command. If the injected values are
past the threshold, then an acpi notification is also triggered.
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
In preparation for re-working the ARS implementation to better handle
short vs long ARS runs, introduce nfit_spa->ars_state. For now this just
replaces the nfit_spa->ars_required bit-field/flag, but going forward it
will be used to track ARS completion and make short vs long ARS
requests.
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
For debug, it is useful for bus providers to be able to retrieve the
'struct device' associated with an nd_region instance that it
registered. We already have to_nd_region() to perform the reverse cast
operation, in fact its duplicate declaration can be removed from the
private drivers/nvdimm/nd.h header.
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
There is a small window whereby ARS scan requests can schedule work that
userspace will miss when polling scrub_show. Hold the init_mutex lock
over calls to report the status to close this potential escape. Also,
make sure that requests to cancel the ARS workqueue are treated as an
idle event.
Cc: <stable@vger.kernel.org>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Fixes: 37b137ff8c ("nfit, libnvdimm: allow an ARS scrub...")
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Commit 1cf03c00e7 "nfit: scrub and register regions in a workqueue"
mistakenly attempts to register a region per BLK aperture. There is
nothing to register for individual apertures as they belong as a set to
a BLK aperture group that are registered with a corresponding
DIMM-control-region. Filter them for registration to prevent some
needless devm_kzalloc() allocations.
Cc: <stable@vger.kernel.org>
Fixes: 1cf03c00e7 ("nfit: scrub and register regions in a workqueue")
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Some BIOSen do not handle 0-byte transfer lengths for the _LSR and _LSW
(label storage read/write) methods. This causes Linux to fallback to the
deprecated _DSM path, or otherwise disable label support.
Introduce acpi_nvdimm_has_method() to detect whether a method is
available rather than calling the method, require _LSI and _LSR to be
paired, and require read support before enabling write support.
Cc: <stable@vger.kernel.org>
Fixes: 4b27db7e26 ("acpi, nfit: add support for the _LS...")
Suggested-by: Erik Schmauss <erik.schmauss@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Per the ACPI specification the only functional purpose for a DIMM
Control Region to be mapped into the system physical address space, from
an OSPM perspective, is to support block-apertures. However, there are
some BIOSen that publish DIMM Control Region SPA entries for pre-boot
environment consumption. Undo the kernel policy of generating disabled
'ndblk' regions when this configuration is detected.
Cc: <stable@vger.kernel.org>
Fixes: 1f7df6f88b ("libnvdimm, nfit: regions (block-data-window...)")
Reviewed-by: Toshi Kani <toshi.kani@hpe.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
sizeof_namespace_index() fails when NVDIMM devices have the minimum
1024 bytes label storage area. nvdimm_num_label_slots() returns 3
slots while the area is only big enough for 2 slots.
Change nvdimm_num_label_slots() to calculate a number of label slots
according to UEFI 2.7 spec.
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
UEFI 2.7 defines in page 758 that:
Initial Label Storage Area Configuration
:
The minimum size of the Label Storage Area is large enough to
hold 2 index blocks and 2 labels.
The mininum index block size is 256 bytes, and the minimum label size
is also 256 bytes.
Change ND_LABEL_MIN_SIZE to (256 * 4) so that NVDIMM devices with
the minimum label storage area do not fail with the size check in
nvdimm_init_config_data().
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Use module_nd_driver() instead of having module_init() and
module_exit() callbacks which just call nd_driver_register() and
nd_driver_unregister().
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Use module_nd_driver() instead of having module_init() and
module_exit() callbacks which just call nd_driver_register() and
nd_driver_unregister().
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Provide a module_nd_driver() wrapper over simple nd_driver_register()
nd_driver_unregister() combinations in module_init() and module_exit()
respectively.
Note an explicit nd_driver_unregister() had to be implemented as nd
bus drivers did call device_unregister() direcly in the module_exit()
function.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
When you load nfit_test you currently see the following error in dmesg:
nfit_test nfit_test.0: found a zero length table '0' parsing nfit
This happens because when we parse the nfit_test.0 table via
acpi_nfit_init(), we specify a size of nfit_test->nfit_size. For the first
pass through nfit_test.0 where (t->setup_hotplug == 0) this is the size of
the entire buffer we allocated, including space for the hot plug
structures, not the size that we've actually filled in.
Fix this by only trying to parse the size of the structures that we've
filled in.
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
It turns out that we were overrunning the 'nfit_buf' buffer in
nfit_test0_setup() in the (t->setup_hotplug == 1) case because we failed to
correctly account for all of the acpi_nfit_memory_map structures.
Fix the structure count which will increase the allocation size of
'nfit_buf' in nfit_test0_alloc(). Also add some WARN_ON()s to
nfit_test0_setup() and nfit_test1_setup() to catch future issues where the
size of the buffer doesn't match the amount of data we're writing.
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
In nfit_test0_setup() and nfit_test1_setup() we keep an 'offset' value
which we use to calculate where in our 'nfit_buf' we will place our next
structure. The handling of 'offset' and the calculation of the placement
of the next structure is a bit inconsistent, though. We don't update
'offset' after we insert each structure, sometimes causing us to update it
for multiple structures' sizes at once. When calculating the position of
the next structure we aren't always able to just use 'offset', but
sometimes have to add in other structure sizes as well.
Fix this by updating 'offset' after each structure insertion in a
consistent way, allowing us to always calculate the position of the next
structure to be inserted by just using 'nfit_buf + offset'.
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Dynamic debug can be instructed to add the function name to the debug
output using the +f switch, so there is no need for the dax modules to
do it again. If a user decides to add the +f switch for the dax modules'
dynamic debug this results in double prints of the function name.
Reported-by: Johannes Thumshirn <jthumshirn@suse.de>
Reported-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Dynamic debug can be instructed to add the function name to the debug
output using the +f switch, so there is no need for the libnvdimm
modules to do it again. If a user decides to add the +f switch for
libnvdimm's dynamic debug this results in double prints of the function
name.
Reported-by: Johannes Thumshirn <jthumshirn@suse.de>
Reported-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Dynamic debug can be instructed to add the function name to the debug
output using the +f switch, so there is no need for the nfit module to
do it again. If a user decides to add the +f switch for nfit's dynamic
debug this results in double prints of the function name like the
following:
[ 2391.935383] acpi_nfit_ctl: nfit ACPI0012:00: acpi_nfit_ctl:nmem8 cmd: 10: func: 1 input length: 0
Thus remove the stray __func__ printing.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Pull x86 fixes from Thomas Gleixner:
"A small set of fixes for x86:
- Add missing instruction suffixes to assembly code so it can be
compiled by newer GAS versions without warnings.
- Switch refcount WARN exceptions to UD2 as we did in general
- Make the reboot on Intel Edison platforms work
- A small documentation update so text and sample command match"
* 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
Documentation, x86, resctrl: Make text and sample command match
x86/platform/intel-mid: Handle Intel Edison reboot correctly
x86/asm: Add instruction suffixes to bitops
x86/entry/64: Add instruction suffix
x86/refcounts: Switch to UD2 for exceptions
Pull x86/pti fixes from Thomas Gleixner:
"Three fixes related to melted spectrum:
- Sync the cpu_entry_area page table to initial_page_table on 32 bit.
Otherwise suspend/resume fails because resume uses
initial_page_table and triggers a triple fault when accessing the
cpu entry area.
- Zero the SPEC_CTL MRS on XEN before suspend to address a
shortcoming in the hypervisor.
- Fix another switch table detection issue in objtool"
* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/cpu_entry_area: Sync cpu_entry_area to initial_page_table
objtool: Fix another switch table detection issue
x86/xen: Zero MSR_IA32_SPEC_CTRL before suspend
Pull timer fixes from Thomas Gleixner:
"A small set of fixes from the timer departement:
- Add a missing timer wheel clock forward when migrating timers off a
unplugged CPU to prevent operating on a stale clock base and
missing timer deadlines.
- Use the proper shift count to extract data from a register value to
prevent evaluating unrelated bits
- Make the error return check in the FSL timer driver work correctly.
Checking an unsigned variable for less than zero does not really
work well.
- Clarify the confusing comments in the ARC timer code"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
timers: Forward timer base before migrating timers
clocksource/drivers/arc_timer: Update some comments
clocksource/drivers/mips-gic-timer: Use correct shift count to extract data
clocksource/drivers/fsl_ftm_timer: Fix error return checking
Pull irq fixlet from Thomas Gleixner:
"Just a documentation update for the missing device tree property of
the R-Car M3N interrupt controller"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
dt-bindings/irqchip/renesas-irqc: Document R-Car M3-N support
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAlqbuK8ACgkQxWXV+ddt
WDsiVQ//eE8Axfw4qWOyHHjozoAIu9kifvOFQIhvKviLiXHJNrD/vBI6YwD1hyD1
rbbLilMsEl1OD1Sq3AeOUMNSSl5qUFEB+CA8vg9GznFTNRobTkL+p96Zt5xlRDu3
lsFFV93tED+dK4D/eSGP+xYbknA8hIk/2gWPkd6hpYKyh2QdsPBYDqCnaEXvd79P
DIP/cAjIfzqQn0FTiZ9wbaES+LPO+NoZgQRC2w9McYQ5CEMc+oAChEmPJRwpPoKy
YdhuwoULniRNHVnVOIVfi4w9jkSPSz7YIgLeRFli/WGBYGcKeHTMFkMa12KdpuUC
JUclOogJ5ZMbFV3C0W8XEJ7Vb9ltIevrH8MgfKP/3BScuZzbJZ+n5KkH2Gf7vcpe
w5cZGOsKDz+35fDCdTmsFpDK9kpGzHq48JlRifOjARbdyqNwVq4emxOeQlO1ygzq
Y9H5UeMpp/FDAm6g/bV8ezCXzwuwUDV9CwAJBD+WCiZhD2nX85FfIp1kfF2zLcUg
Y8irqV6A/J/0BFkF7Iu9AuzTxOdo6zMLkcHEKb+sL7yGxZ2248v5gZxDoyV5iNWR
WY4Y2GaMemZGu6+NyyLFAJzlFyACD5fSy8vT7KD6upCzwmlAtaJ3VULfTrLyi/uM
SNZAKf3WzKBVe5h52GNGRCi5OLTteH6Yqz5m5/h8r74mBO00IrA=
=bF96
-----END PGP SIGNATURE-----
Merge tag 'for-4.16-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- when NR_CPUS is large, a SRCU structure can significantly inflate
size of the main filesystem structure that would not be possible to
allocate by kmalloc, so the kvalloc fallback is used
- improved error handling
- fix endiannes when printing some filesystem attributes via sysfs,
this is could happen when a filesystem is moved between different
endianity hosts
- send fixes: the NO_HOLE mode should not send a write operation for a
file hole
- fix log replay for for special files followed by file hardlinks
- fix log replay failure after unlink and link combination
- fix max chunk size calculation for DUP allocation
* tag 'for-4.16-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
Btrfs: fix log replay failure after unlink and link combination
Btrfs: fix log replay failure after linking special file and fsync
Btrfs: send, fix issuing write op when processing hole in no data mode
btrfs: use proper endianness accessors for super_copy
btrfs: alloc_chunk: fix DUP stripe size handling
btrfs: Handle btrfs_set_extent_delalloc failure in relocate_file_extent_cluster
btrfs: handle failure of add_pending_csums
btrfs: use kvzalloc to allocate btrfs_fs_info
Pull i2c fixes from Wolfram Sang:
"A driver fix and a documentation fix (which makes dependency handling
for the next cycle easier)"
* 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: octeon: Prevent error message on bus error
dt-bindings: at24: sort manufacturers alphabetically
Pull libnvdimm fixes from Dan Williams:
"A 4.16 regression fix, three fixes for -stable, and a cleanup fix:
- During the merge window support for the new ACPI NVDIMM Platform
Capabilities structure disabled support for "deep flush", a
force-unit- access like mechanism for persistent memory. Restore
that mechanism.
- VFIO like RDMA is yet one more memory registration / pinning
interface that is incompatible with Filesystem-DAX. Disable long
term pins of Filesystem-DAX mappings via VFIO.
- The Filesystem-DAX detection to prevent long terms pins mistakenly
also disabled Device-DAX pins which are not subject to the same
block- map collision concerns.
- Similar to the setup path, softlockup warnings can trigger in the
shutdown path for large persistent memory namespaces. Teach
for_each_device_pfn() to perform cond_resched() in all cases.
- Boaz noticed that the might_sleep() in dax_direct_access() is stale
as of the v4.15 kernel.
These have received a build success notification from the 0day robot,
and the longterm pin fixes have appeared in -next. However, I recently
rebased the tree to remove some other fixes that need to be reworked
after review feedback.
* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
memremap: fix softlockup reports at teardown
libnvdimm: re-enable deep flush for pmem devices via fsync()
vfio: disable filesystem-dax page pinning
dax: fix vma_is_fsdax() helper
dax: ->direct_access does not sleep anymore
- suppress sparse warnings about unknown attributes
- fix typos and stale comments
- fix build error of arch/sh
- fix wrong use of ld-option vs cc-ldoption
- remove redundant GCC_PLUGINS_CFLAGS assignment
- fix another memory leak of Kconfig
- fix line number in error messages of Kconfig
- do not write confusing CONFIG_DEFCONFIG_LIST out to .config
- add xstrdup() to Kconfig to handle memory shortage errors
- show also a Debian package name if ncurses is missing
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJamiV5AAoJED2LAQed4NsGgbUQAInk9+SPBtulpK1HayDoYrC6
6LYQWjdO89jbKAaQFsDQ0ZC5HwPgafX18HWunJ67Cs6kunLYTMJLu8cHr5rRHrv1
B+ceAkMe5EE62Bmju35qX54BSgcPyWt9jFloFmdzOG7nR7D+C9DUx2WZnF3Kd0Gb
XPw8R4u2EEKGOMYh5PwYFP5mjhfWMijg6OZaAdcV4E/fSXiwSlDjwuVY1ymhsEAn
OalDNcJ5+Foe9ADy5yLEna0Jlj8j8UMGPRzNErvL2CxZP2CSfpKfuXJABig5x8rk
ZXULN6w7nIN/dpUkPgkTYgF6behUuSUPj+7xBDWbVjRQLMMy6bI65s5hylBzYAMy
tB3hKnpZtrrJNc00XQWLY18b03YzLv5wW0CK8jmRJPLWDE8znb8aMqxwtfJKhfa+
cW5v5KzQm4ercArSy983jj4FJD6ZuQf1lqAARD4vxX5kK9oy+7oJvfzmwrlh1U2c
dk/3VKF2AjUDDdzxq+zgJIWFR1MMTxFKqL95CsdaFEUMRXr80a8XBNYP07R4l3qB
y2zUsZZi7Ad363MkohBCEBdw1yPtHupRETr3xdQlUDh0ZQj94TXB6lX2SaY4v1NS
J3eqTl8t/RHboS5kbPMz8MduO4otH/ZxYDd0uMKA5c/v/9LkD9y+AIpndswk4jcN
/uKkw5sSRgO+9VfsTqed
=sr4l
-----END PGP SIGNATURE-----
Merge tag 'kbuild-fixes-v4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
- suppress sparse warnings about unknown attributes
- fix typos and stale comments
- fix build error of arch/sh
- fix wrong use of ld-option vs cc-ldoption
- remove redundant GCC_PLUGINS_CFLAGS assignment
- fix another memory leak of Kconfig
- fix line number in error messages of Kconfig
- do not write confusing CONFIG_DEFCONFIG_LIST out to .config
- add xstrdup() to Kconfig to handle memory shortage errors
- show also a Debian package name if ncurses is missing
* tag 'kbuild-fixes-v4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
MAINTAINERS: take over Kconfig maintainership
kconfig: fix line number in recursive inclusion error message
Coccinelle: memdup: Fix typo in warning messages
kconfig: Update ncurses package names for menuconfig
kbuild/kallsyms: trivial typo fix
kbuild: test --build-id linker flag by ld-option instead of cc-ldoption
kbuild: drop superfluous GCC_PLUGINS_CFLAGS assignment
kconfig: Don't leak choice names during parsing
sh: fix build error for empty CONFIG_BUILTIN_DTB_SOURCE
kconfig: set SYMBOL_AUTO to the symbol marked with defconfig_list
kconfig: add xstrdup() helper
kbuild: disable sparse warnings about unknown attributes
Makefile: Fix lying comment re. silentoldconfig
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJamnteAAoJEAhfPr2O5OEVjKUP/03nggoRa29J66geB3LtTnsC
lnQa2Dr6+fRji5eLsXC0su8o3Bf3P5KsyP9gfRrFQcsn68x6LiO6Ud4dL8yM6ZFh
4ShF8IYD1oNMT0CTq5bgfr34lyJRYsF0/FbXuO+XApOlL74JfstN4g3MFjBiojAQ
B7URIw4snzLrpiJ+yHELL71tfPnzgUllWXKSDkql6Te+Gx1mYymDRyXdexlVKSiH
hD/5mQ/6YJdAmFTHl0efvKGFEhfXK2l59I+gpRXgsupaSG+rV2RhjTeVPcWUcvCn
P91/hpahtv2gHWZyEYpN9h1zkWFYFBt8exgnMAuo4YtFzAyMueeUowCrHvTL8P9V
TSkLgFFsZBtrNmWsX1pw2JQJxq2qUXWZNBeUf326TQV9Z2WBeiikXiStdemR/IBQ
ueVOH9yuLjEX9die8PCxU+O/hN0VTfKa0A59D+rTLBbj76LG+iAaK/NAVedJzv5v
U3N0lPL0U8BIJ00VevktBfpB7FcsTBv2Wa/GHOgyCYZHW2NgUqj0cK4jfMogFNID
MM5SgM4fv0D03p4KVLtu3d2+QDKmPjYBs6rAjEtU2eXvaoTzOPtciSb8As5MFxin
xseliOBSVchJDRcKic0vMyMv2jDdvRcAEaDE/w+OsUyWPTixDNN1qOThp8OGEc6P
5P0nL89lvpf9StOl3WM5
=ypdv
-----END PGP SIGNATURE-----
Merge tag 'media/v4.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab:
- some build fixes with randconfigs
- an m88ds3103 fix to prevent an OOPS if the chip doesn't provide the
right version during probe (with can happen if the hardware hangs)
- a potential out of array bounds reference in tvp5150
- some fixes and improvements in the DVB memory mapped API (added for
kernel 4.16)
* tag 'media/v4.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
media: vb2: Makefile: place vb2-trace together with vb2-core
media: Don't let tvp5150_get_vbi() go out of vbi_ram_default array
media: dvb: update buffer mmaped flags and frame counter
media: dvb: add continuity error indicators for memory mapped buffers
media: dmxdev: Fix the logic that enables DMA mmap support
media: dmxdev: fix error code for invalid ioctls
media: m88ds3103: don't call a non-initalized function
media: au0828: add VIDEO_V4L2 dependency
media: dvb: fix DVB_MMAP dependency
media: dvb: fix DVB_MMAP symbol name
media: videobuf2: fix build issues with vb2-trace
media: videobuf2: Add VIDEOBUF2_V4L2 Kconfig option for VB2 V4L2 part
x86:
- fix NULL dereference when using userspace lapic
- optimize spectre v1 mitigations by allowing guests to use LFENCE
- make microcode revision configurable to prevent guests from
unnecessarily blacklisting spectre v2 mitigation features
-----BEGIN PGP SIGNATURE-----
iQEcBAABCAAGBQJambvzAAoJEED/6hsPKofo9HwH/2il8xNSLIYf9pJtxZo/puyQ
ZSwByGdeLKBZ1GP1dhdZ8kMk3eBoci0a/sQJmhDiEG6GDf1Mrgri/xj3p60sWwXT
iReG+ZhBKg4QMj/IgOJQrh+53JT73QQP14wIhzc/DSi0Fo0ziqDA/lINxqMKc7oF
b5qratjsb4xF1db4d1g8Ii1VRk64UoBEVpEoP37OOyAu1rgXgDr+9C832KkP0rb+
pVYT8hLFiaYiwVN+WN52/NrIkqBlMvMp3ouRtMAajCQ9OznnraDJE6eTkPGDSDBM
RizSuQbev5R7pcmzCAP9/XKfTbeSUZQei2ZFXQXAOvIXMrQd/ITjbPvnObsceSE=
=wtQd
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Radim Krčmář:
"x86:
- fix NULL dereference when using userspace lapic
- optimize spectre v1 mitigations by allowing guests to use LFENCE
- make microcode revision configurable to prevent guests from
unnecessarily blacklisting spectre v2 mitigation feature"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: fix vcpu initialization with userspace lapic
KVM: X86: Allow userspace to define the microcode version
KVM: X86: Introduce kvm_get_msr_feature()
KVM: SVM: Add MSR-based feature support for serializing LFENCE
KVM: x86: Add a framework for supporting MSR-based features
The cond_resched() currently in the setup path needs to be duplicated in
the teardown path. Rather than require each instance of
for_each_device_pfn() to open code the same sequence, embed it in the
helper.
Link: https://github.com/intel/ixpdimm_sw/issues/11
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: <stable@vger.kernel.org>
Fixes: 7138970383 ("mm, zone_device: Replace {get, put}_zone_device_page()...")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Re-enable deep flush so that users always have a way to be sure that a
write makes it all the way out to media. Writes from the PMEM driver
always arrive at the NVDIMM since movnt is used to bypass the cache, and
the driver relies on the ADR (Asynchronous DRAM Refresh) mechanism to
flush write buffers on power failure. The Deep Flush mechanism is there
to explicitly write buffers to protect against (rare) ADR failure. This
change prevents a regression in deep flush behavior so that applications
can continue to depend on fsync() as a mechanism to trigger deep flush
in the filesystem-DAX case.
Fixes: 06e8ccdab1 ("acpi: nfit: Add support for detect platform CPU cache...")
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
I have recently picked up Kconfig patches to my tree without any
declaration. Making it official now.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Filesystem-DAX is incompatible with 'longterm' page pinning. Without
page cache indirection a DAX mapping maps filesystem blocks directly.
This means that the filesystem must not modify a file's block map while
any page in a mapping is pinned. In order to prevent the situation of
userspace holding of filesystem operations indefinitely, disallow
'longterm' Filesystem-DAX mappings.
RDMA has the same conflict and the plan there is to add a 'with lease'
mechanism to allow the kernel to notify userspace that the mapping is
being torn down for block-map maintenance. Perhaps something similar can
be put in place for vfio.
Note that xfs and ext4 still report:
"DAX enabled. Warning: EXPERIMENTAL, use at your own risk"
...at mount time, and resolving the dax-dma-vs-truncate problem is one
of the last hurdles to remove that designation.
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: kvm@vger.kernel.org
Cc: <stable@vger.kernel.org>
Reported-by: Haozhong Zhang <haozhong.zhang@intel.com>
Tested-by: Haozhong Zhang <haozhong.zhang@intel.com>
Fixes: d475c6346a ("dax,ext2: replace XIP read and write with DAX I/O")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-----BEGIN PGP SIGNATURE-----
iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAlqZy4cUHGJoZWxnYWFz
QGdvb2dsZS5jb20ACgkQWYigwDrT+vzyuw/+M3EDZaGd0SMj5k47qll1udstDEBL
/WkCVTgrVov4TqMh+x5Pba7Yd+/2vFeIx2QbruYS7uGF8YwXMtET6+RJg+vUUMkW
ryIMKl0QRb9X11qJr5GtgOqij//wtV8BXPimGczB02d8ganJFhbxrJSx87z1lshm
g0u5ZNupddMHfT3JW0H4c2AFrkW9+KXjosShPuVzQbkhXxMAk+zbnuC6wRd1+GJV
4BewJMHLNFTg4cEz7O8ao7fRyYsHsaV65HdFoTlSGPrv3PEGZTqapmF9ATbEJTQd
CNQ0QMN+ewFMwZh17oCs0eEkoSLeyzMFVUfXs4IIpMRTxyJbLzuqwlU5VaCtXdkT
ntlETQ09MmNqvyusAmKvrqBjjw2clgvO7NCjJhaMY1ehOH6sEcbvOx83xp6P1RUc
mTqOVV7TbSXyyRFOEa6KunyC2tKmV+t2Cl4dOxVKwaja5J/Eh8GhGlM/r7qt4pEa
h3roNNE1VilHYR/8VcpCVydyueruaMSNrHzlhmIQw+UcsMDzEjBKzuxGFUYNU+0H
2wXZacv2MaDj3qndh8ZS9IHWW1bw6PMBCdf13HOPdEs9DjHX5uI1RsB49Z6rSM75
xXoiB/MHlK/3YgZNB2eEBQjFHfMDD1ylwsnIbeIMhvk074rE8d56C8uNXuBvYYPn
dakUdnUB8abeeD0=
=0inR
-----END PGP SIGNATURE-----
Merge tag 'pci-v4.16-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:
- Update pci.ids location (documentation only) (Randy Dunlap)
- Fix a crash when BIOS didn't assign a BAR and we try to enlarge it
(Christian König)
* tag 'pci-v4.16-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI: Allow release of resources that were never assigned
PCI: Update location of pci.ids file
Pull parisc fixes from Helge Deller:
- a patch to change the ordering of cache and TLB flushes to hopefully
fix the random segfaults we very rarely face (by Dave Anglin).
- a patch to hide the virtual kernel memory layout due to security
reasons.
- two small patches to make the kernel run more smoothly under qemu.
* 'parisc-4.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Reduce irq overhead when run in qemu
parisc: Use cr16 interval timers unconditionally on qemu
parisc: Check if secondary CPUs want own PDC calls
parisc: Hide virtual kernel memory layout
parisc: Fix ordering of cache and TLB flushes
delayed and three error path memory leak fixups from Chengguang.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQEcBAABCAAGBQJamV3qAAoJEEp/3jgCEfOLv6kIAI0RXShR20dz9M3GnL2bDws7
KFkC/aNwbviXMAih3q7Q7ZiwXv6BI/uUK0YIYG/tWZoNHucYGS8ZOFDikdiIMcJR
2VP8Yas1/NvnLlh0ut45s2imvbXinkHcjWOLqgJJmJ2cEP9Ar/mbwz//TgfVMdNm
TSHEm2LPRge9jK3Ab/ZiatExGH6dx5hRJvssSauF+f2iN+q4nKj2aBqle+FKaNWT
Iv0ONi2f7aXnaXRvum4Y1gnKWXzKSFd8Y2Cr5R3tfTrKBv8nmQz54eRsMnnuxK/7
uV6Ujch4LUXBI1etXLqTwRdZQIcw3Li5PYxGrO8fl8mQfJau9yv5REw/rbgB5FU=
=f8gA
-----END PGP SIGNATURE-----
Merge tag 'ceph-for-4.16-rc4' of git://github.com/ceph/ceph-client
Pull ceph fixes from Ilya Dryomov:
"A cap handling fix from Zhi that ensures that metadata writeback isn't
delayed and three error path memory leak fixups from Chengguang"
* tag 'ceph-for-4.16-rc4' of git://github.com/ceph/ceph-client:
ceph: fix potential memory leak in init_caches()
ceph: fix dentry leak when failing to init debugfs
libceph, ceph: avoid memory leak when specifying same option several times
ceph: flush dirty caps of unlinked inode ASAP
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABCAAGBQJamXImAAoJEPfTWPspceCmEP4P/3kkm0JIXtbNZFMb1JZtsjwE
t4OUVEDj4jjRfmZfUVkajPnczM4MSPiXm43PbcOi4NF53mv8k76jyIPhlZREzYzq
MBknibvpqyiWpbii9tBRrR6FGDR/N51//ya9vdPaYBcBssTg6Aqtt4BE5oPfo011
PleGROe1jtrBUNBy2dMy4sHb/MvZ0vRuNPxMsD8Agy5UiVeItAelY/lDn1Hw41BY
O+muE5bw6+yKqB9vGXhV3O4WRh8BofJi1YdzbwbbIzH40ZZK5VTDQc5o19/CFEZ/
uZ8BStOFEWA0LNuarME5fknWcogiedEtszweyiWBbVZo4VqCsfxPoaRCibY/Wg5F
a0UNJ4iSzglhfSMoHJlhvlCAMCyubFSeMSdJjrrpIcyBrziJXpcEXcUnWI43yi4P
FoM8zUni22XnfLWxIdTjVkMRytjtqTLcXOHXdP5N/ESa80jBq3Q76TLmzIKW+kK5
sAre+hgr52NdgovP/NSxsdvsckAolWNe40JI8wLbwNo+lMHr0ckzOG+sAdz1iPRK
iVL0CAlby4A94Wcu+OHCwfY7B9lBrMuMfHsesEM6x1cxgAhd3YNfEJ8g2QolCUEV
KmZizXbV9nnmJfegVC06SgM+D7AR26dwsBG2aoibShuvdxX6jMdUHygyu5DCJdg/
JS+q71jmxb/r1TWe/62r
=AMhV
-----END PGP SIGNATURE-----
Merge tag 'for-linus-20180302' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"A collection of fixes for this series. This is a little larger than
usual at this time, but that's mainly because I was out on vacation
last week. Nothing in here is major in any way, it's just two weeks of
fixes. This contains:
- NVMe pull from Keith, with a set of fixes from the usual suspects.
- mq-deadline zone unlock fix from Damien, fixing an issue with the
SMR zone locking added for 4.16.
- two bcache fixes sent in by Michael, with changes from Coly and
Tang.
- comment typo fix from Eric for blktrace.
- return-value error handling fix for nbd, from Gustavo.
- fix a direct-io case where we don't defer to a completion handler,
making us sleep from IRQ device completion. From Jan.
- a small series from Jan fixing up holes around handling of bdev
references.
- small set of regression fixes from Jiufei, mostly fixing problems
around the gendisk pointer -> partition index change.
- regression fix from Ming, fixing a boundary issue with the discard
page cache invalidation.
- two-patch series from Ming, fixing both a core blk-mq-sched and
kyber issue around token freeing on a requeue condition"
* tag 'for-linus-20180302' of git://git.kernel.dk/linux-block: (24 commits)
block: fix a typo
block: display the correct diskname for bio
block: fix the count of PGPGOUT for WRITE_SAME
mq-deadline: Make sure to always unlock zones
nvmet: fix PSDT field check in command format
nvme-multipath: fix sysfs dangerously created links
nbd: fix return value in error handling path
bcache: fix kcrashes with fio in RAID5 backend dev
bcache: correct flash only vols (check all uuids)
blktrace_api.h: fix comment for struct blk_user_trace_setup
blockdev: Avoid two active bdev inodes for one device
genhd: Fix BUG in blkdev_open()
genhd: Fix use after free in __blkdev_get()
genhd: Add helper put_disk_and_module()
genhd: Rename get_disk() to get_disk_and_module()
genhd: Fix leaked module reference for NVME devices
direct-io: Fix sleep in atomic due to sync AIO
nvme-pci: Fix nvme queue cleanup if IRQ setup fails
block: kyber: fix domain token leak during requeue
blk-mq: don't call io sched's .requeue_request when requeueing rq to ->dispatch
...