At the moment, line metadata is persisted on a separate work queue, that
is kicked each time that a line is closed. The assumption when designing
this was that freeing the write thread from creating a new write request
was better than the potential impact of writes colliding on the media
(user I/O and metadata I/O). Experimentation has proven that this
assumption is wrong; collision can cause up to 25% of bandwidth and
introduce long tail latencies on the write thread, which potentially
cause user write threads to spend more time spinning to get a free entry
on the write buffer.
This patch moves the metadata logic to the write thread. When a line is
closed, remaining metadata is written in memory and is placed on a
metadata queue. The write thread then takes the metadata corresponding
to the previous line, creates the write request and schedules it to
minimize collisions on the media. Using this approach, we see that we
can saturate the media's bandwidth, which helps reducing both write
latencies and the spinning time for user writer threads.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <matias@cnexlabs.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Read requests allocate some extra memory to store its per I/O context.
Instead of requiring yet another memory pool for other type of requests,
generalize this context allocation (and change naming accordingly).
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <matias@cnexlabs.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Erase I/Os are scheduled with the following goals in mind: (i) minimize
LUNs collisions with write I/Os, and (ii) even out the price of erasing
on every write, instead of putting all the burden on when garbage
collection runs. This works well on the current design, but is specific
to the default mapping algorithm.
This patch generalizes the erase path so that other mapping algorithms
can select an arbitrary line to be erased instead. It also gets rid of
the erase semaphore since it creates jittering for user writes.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <matias@cnexlabs.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Allow to configure the number of maximum sectors per write command
through sysfs. This makes it easier to tune write command sizes for
different controller configurations.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <matias@cnexlabs.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Add a new debug counter to measure cache hits on the read path
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <matias@cnexlabs.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Spare a double calculation on the fast write path.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <matias@cnexlabs.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
If nvme_alloc_request fails, propagate the right error, instead of
assuming ENOMEM.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <matias@cnexlabs.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
In case of a failure when submitting a request, convert the ppa_list
addresses to the target format so that it can interpret ppas for
recovery
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <matias@cnexlabs.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pull s390 bugfix from Martin Schwidefsky:
"One last s390 patch for 4.12
Revert the re-IPL semantics back to the v4.7 state. It turned out that
the memory layout may change due to memory hotplug if load-normal is
used"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/ipl: revert Load Normal semantics for LPAR CCW-type re-IPL
Michael reported the segfault when kernel.kptr_restrict=2 is set.
$ perf record ls
...
perf: Segmentation fault
Obtained 16 stack frames.
./perf(dump_stack+0x2d) [0x5068df]
./perf(sighandler_dump_stack+0x2d) [0x5069bf]
./perf() [0x43e47b]
/lib64/libc.so.6(+0x3594f) [0x7f762004794f]
/lib64/libc.so.6(strlen+0x26) [0x7f762009ef86]
/lib64/libc.so.6(__strdup+0xd) [0x7f762009ecbd]
./perf(maps__set_kallsyms_ref_reloc_sym+0x4d) [0x51590f]
./perf(machine__create_kernel_maps+0x136) [0x50a7de]
./perf(perf_session__create_kernel_maps+0x2c) [0x510a81]
./perf(perf_session__new+0x13d) [0x510e23]
./perf() [0x43fd61]
./perf(cmd_record+0x704) [0x441823]
./perf() [0x4bc1a0]
./perf() [0x4bc40d]
./perf() [0x4bc55f]
./perf(main+0x2d5) [0x4bc939]
Segmentation fault (core dumped)
The reason is that with kernel.kptr_restrict=2, we don't get
the symbol from machine__get_running_kernel_start, which we
want to use in maps__set_kallsyms_ref_reloc_sym and we crash.
Check the symbol name value before calling
maps__set_kallsyms_ref_reloc_sym() and succeed without ref_reloc_sym
being set. It's safe because we check its existence before we use it.
Reported-by: Michael Petlan <mpetlan@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20170626095153.553-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The function sbi_send() is local to just pnd2_edac.c and does not need
to be in global scope, so make it static.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170623084855.9197-1-colin.king@canonical.com
Signed-off-by: Borislav Petkov <bp@suse.de>
Larry Finger reported that his Powerbook G4 was no longer booting with v4.12-rc,
userspace was up but giving weird errors such as:
udevd[64]: starting version 175
udevd[64]: Unable to receive ctrl message: Bad address.
modprobe: chdir(4.12-rc1): No such file or directory
He bisected the problem to commit 3448890c32 ("powerpc: get rid of zeroing,
switch to RAW_COPY_USER").
Al identified that the problem is actually a miscompilation by GCC 4.6.3, which
is exposed by the above commit.
Al also pointed out that inlining copy_to/from_user() is probably of little or
no benefit, which is correct. Using Anton's copy_to_user benchmark, with a
pathological single byte copy, we see a small increase in performance
by *removing* inlining:
Before (inlined):
# time ./copy_to_user -w -l 1 -i 10000000 ( x 3 )
real 0m22.063s
real 0m22.059s
real 0m22.076s
After:
# time ./copy_to_user -w -l 1 -i 10000000 ( x 3 )
real 0m21.325s
real 0m21.299s
real 0m21.364s
So as a small performance improvement and to avoid the miscompilation, drop
inlining copy_to/from_user() on 32-bit.
Fixes: 3448890c32 ("powerpc: get rid of zeroing, switch to RAW_COPY_USER")
Reported-by: Larry Finger <Larry.Finger@lwfinger.net>
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The hash table created during vmw_cmdbuf_res_man_create was
never freed. This causes memory leak in context creation.
Added the corresponding drm_ht_remove in vmw_cmdbuf_res_man_destroy.
Tested for memory leak by running piglit overnight and kernel
memory is not inflated which earlier was.
Cc: <stable@vger.kernel.org>
Signed-off-by: Deepak Rawat <drawat@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Since commit:
af2cf278ef ("x86/mm/hotplug: Don't remove PGD entries in remove_pagetable()")
we no longer free PUDs so that we do not have to synchronize
all PGDs on hot-remove/vfree().
But the new 5-level page table patchset reverted that for 4-level
page tables, in the following commit:
f2a6a7050109: ("x86: Convert the rest of the code to support p4d_t")
This patch restores the damage and disables free_pud() if we are in the
4-level page table case, thus avoiding BUG_ON() after hot-remove.
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
[ Clarified the changelog and the code comments. ]
Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/20170624180514.3821-1-jglisse@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
If we write a relocation into the buffer, we require our own implicit
synchronisation added after the start of the execbuf, outside of the
user's control. As we may end up clflushing, or doing the patch itself
on the GPU, asynchronously we need to look at the implicit serialisation
on obj->resv and hence need to disable EXEC_OBJECT_ASYNC for this
object.
If the user does trigger a stall for relocations, we make sure the stall
is complete enough so that the batch is not submitted before we complete
those relocations.
Fixes: 77ae995789 ("drm/i915: Enable userspace to opt-out of implicit fencing")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
(cherry picked from commit 071750e550)
[danvet: Resolve conflicts, resolution reviewed by Tvrtko on irc.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
As we walk the obj->vma_list in per_file_stats(), we need to hold
struct_mutex to prevent alteration of that list.
Fixes: 1d2ac403ae ("drm: Protect dev->filelist with its own mutex")
Fixes: c84455b4ba ("drm/i915: Move debug only per-request pid tracking from request to ctx")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101460
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170617115744.4452-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
(cherry picked from commit 0caf81b5c5)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Since we may track unfenced access (GPU access to the vma that
explicitly requires no fence), vma->last_fence may be set without any
attached fence (vma->fence) and so will not be flushed when we call
i915_vma_put_fence(). Since we stopped doing a full retire of the
activity trackers for unbind, we need to explicitly retire each tracker.
Fixes: b0decaf75b ("drm/i915: Track active vma requests")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170620124321.1108-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
(cherry picked from commit 760a898d80)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Pull x86 fix from Thomas Gleixner:
"A single fix to unbreak the vdso32 build for 64bit kernels caused by
excess #includes in the mshyperv header"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mshyperv: Remove excess #includes from mshyperv.h
Pull timer fixes from Thomas Gleixner:
"A few fixes for timekeeping and timers:
- Plug a subtle race due to a missing READ_ONCE() in the timekeeping
code where reloading of a pointer results in an inconsistent
callback argument being supplied to the clocksource->read function.
- Correct the CLOCK_MONOTONIC_RAW sub-nanosecond accounting in the
time keeping core code, to prevent a possible discontuity.
- Apply a similar fix to the arm64 vdso clock_gettime()
implementation
- Add missing includes to clocksource drivers, which relied on
indirect includes which fails in certain configs.
- Use the proper iomem pointer for read/iounmap in a probe function"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
arm64/vdso: Fix nsec handling for CLOCK_MONOTONIC_RAW
time: Fix CLOCK_MONOTONIC_RAW sub-nanosecond accounting
time: Fix clock->read(clock) race around clocksource changes
clocksource: Explicitly include linux/clocksource.h when needed
clocksource/drivers/arm_arch_timer: Fix read and iounmap of incorrect variable
Pull perf fixes from Thomas Gleixner:
"Three fixlets for perf:
- Return the proper error code if aux buffers for a event are not
supported.
- Calculate the probe offset for inlined functions correctly
- Update the Skylake DTLB load/store miss event so it can count 1G
TLB entries as well"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf probe: Fix probe definition for inlined functions
perf/x86/intel: Add 1G DTLB load/store miss support for SKL
perf/aux: Correct return code of rb_alloc_aux() if !has_aux(ev)
Pull irq fix from Thomas Gleixner:
"A single fix for the MIPS GIC to prevent ftrace recursion"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/mips-gic: Mark count and compare accessors notrace
Pull input fixes from Dmitry Torokhov:
- a quirk to i8042 to ignore timeout bit on Lifebook AH544
- a fixup to Synaptics RMI function 54 that was breaking some Dells
- a fix for memory leak in soc_button_array driver
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: synaptics-rmi4 - only read the F54 query registers which are used
Input: i8042 - add Fujitsu Lifebook AH544 to notimeout list
Input: soc_button_array - fix leaking the ACPI button descriptor buffer
Pull SCSI target fixes from Nicholas Bellinger:
"Here are the target-pending fixes for v4.12-rc7 that have been queued
up for the last 2 weeks. This includes:
- Fix a TMR related kref underflow detected by the recent refcount_t
conversion in upstream.
- Fix a iscsi-target corner case during explicit connection logout
timeout failure.
- Address last fallout in iscsi-target immediate data handling from
v4.4 target-core now allowing control CDB payload underflow"
* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
iscsi-target: Reject immediate data underflow larger than SCSI transfer length
iscsi-target: Fix delayed logout processing greater than SECONDS_FOR_LOGOUT_COMP
target: Fix kref->refcount underflow in transport_cmd_finish_abort
We have to reset the sk->sk_rx_dst when we disconnect a TCP
connection, because otherwise when we re-connect it this
dst reference is simply overridden in tcp_finish_connect().
This fixes a dst leak which leads to a loopback dev refcnt
leak. It is a long-standing bug, Kevin reported a very similar
(if not same) bug before. Thanks to Andrei for providing such
a reliable reproducer which greatly narrows down the problem.
Fixes: 41063e9dd1 ("ipv4: Early TCP socket demux.")
Reported-by: Andrei Vagin <avagin@gmail.com>
Reported-by: Kevin Xu <kaiwen.xu@hulu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In __ip6_datagram_connect(), reset sk->sk_v6_daddr and inet->dport if
error occurs.
In udp_v6_early_demux(), check for sk_state to make sure it is in
TCP_ESTABLISHED state.
Together, it makes sure unconnected UDP socket won't be considered as a
valid candidate for early demux.
v3: add TCP_ESTABLISHED state check in udp_v6_early_demux()
v2: fix compilation error
Fixes: 5425077d73 ("net: ipv6: Add early demux handler for UDP unicast")
Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When mc configuration changes bnx2x_config_mcast() can return 0 for
success, negative for failure and positive for benign reason preventing
its immediate work, e.g., when the command awaits the completion of
a previously sent command.
When removing all configured macs on a 578xx adapter, if a positive
value would be returned driver would errneously log it as an error.
Fixes: c7b7b483cc ("bnx2x: Don't flush multicast MACs")
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- fix warnings of host programs
- fix "make tags" when COMPILE_SOURCE=1 is specified along with O=
- clarify help message of C=1 option
- fix dependency for ncurses compatibility check
- fix "make headers_install" for fakechroot environment
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJZTumLAAoJED2LAQed4NsGUHYP/12pK+wpjj3hPS6dlgC3n6CR
ZzIgBPVVMH+W5wjcKU4JIrhFT3aXHNnV9QTjCakK5Ufubfm7YBpCY+cVWaFzUl4A
CTKGs0NguV200E6bLUnLAWrjC6mSZ17tPuxomx4AClmVkc3y7rT8Hl5L9UMLJ98n
qxWMs3pOvkXKId67zWLuIAu0UTT94s20gkjTRAUxSfV+zahLyWsdwvmIubp2Wa1N
1GrXyA+bHQ+iY4kMuN+sWvIOSn8B7E3ZZGEg9IFd8hVv/ispIYU3Pcm0nNEudxGE
V54/r7noIPgsI9sHYx5mhkxag/AGlXu99IVqMbhLyvM02OndvZFen+GyWMUp+ZTk
j3hQUKtGyUkTqpjQFN3LfONVS5p1Gxlrvj9L4CGjZHNIsxwDDNWHBbkkyQTi3+iR
CPfiV47oPfUoOFg6Yk8GKnHD3tMI3TUtcqHvTCYpMfQz2IU3oMFW6s8h+i7+800W
lmfHHTXmKp+w6Q5+WAcI9LHLdXp2oG68HmNptb/YasDDBYX8q1FGBPJqvDX1snq3
bJS+9KlGarYmWaWa+Y5I0yWzK0kA3E3VP8LO0LgjihzFL45GbTcx2PQ3FumC+/sB
3IPMMCT8EjEkuR1sNAqXY0FpItsGvS1tKPriO753N2cb+uDPQxe2gmHAR4yGD3+E
CWX6t9Cg5KurUHSgGbDc
=yDPZ
-----END PGP SIGNATURE-----
Merge tag 'kbuild-fixes-v4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
"Nothing scary, just some random fixes:
- fix warnings of host programs
- fix "make tags" when COMPILED_SOURCE=1 is specified along with O=
- clarify help message of C=1 option
- fix dependency for ncurses compatibility check
- fix "make headers_install" for fakechroot environment"
* tag 'kbuild-fixes-v4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kconfig: fix sparse warnings in nconfig
kbuild: fix header installation under fakechroot environment
kconfig: Check for libncurses before menuconfig
Kbuild: tiny correction on `make help`
tags: honor COMPILED_SOURCE with apart output directory
genksyms: add printf format attribute to error_with_pos()
Pull timer fix from Eric Biederman:
"This fixes an issue of confusing injected signals with the signals
from posix timers that has existed since posix timers have been in the
kernel.
This patch is slightly simpler than my earlier version of this patch
as I discovered in testing that I had misspelled "#ifdef
CONFIG_POSIX_TIMERS". So I deleted that unnecessary test and made
setting of resched_timer uncondtional.
I have tested this and verified that without this patch there is a
nasty hang that is easy to trigger, and with this patch everything
works properly"
Thomas Gleixner dixit:
"It fixes the problem at hand and covers the ptrace case as well, which
I missed.
Reviewed-and-tested-by: Thomas Gleixner <tglx@linutronix.de>"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
signal: Only reschedule timers on signals timers have sent
A recent commit included linux/slab.h in linux/irq.h. This breaks the build
of vdso32 on a 64-bit kernel.
The reason is that linux/irq.h gets included into the vdso code via
linux/interrupt.h which is included from asm/mshyperv.h. That makes the
32-bit vdso compile fail, because slab.h includes the pgtable headers for
64-bit on a 64-bit build.
Neither linux/clocksource.h nor linux/interrupt.h are needed in the
mshyperv.h header file itself - it has a dependency on <linux/atomic.h>.
Remove the includes and unbreak the build.
Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: devel@linuxdriverproject.org
Fixes: dee863b571 ("hv: export current Hyper-V clocksource")
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1706231038460.2647@nanos
Signed-off-by: Ingo Molnar <mingo@kernel.org>
- three fixes for kprobes/ftrace/livepatch interactions.
- properly handle data breakpoints when using the Radix MMU.
- fix for perf sampling of registers during call_usermodehelper().
- properly initialise the thread_info on our emergency stacks
- add an explicit flush when doing TLB invalidations for a process
using NPU2.
Thanks to:
Alistair Popple, Naveen N. Rao, Nicholas Piggin, Ravi Bangoria,
Masami Hiramatsu.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJZTZy4AAoJEFHr6jzI4aWA9CYQAK+BIZ2wM+QEKDWUc7bHUBfJ
kVkFr59VS4x9w2zL2fKijy3CTNqaEXCUhmCks7PFYxGfF437YaJGVfCBVotuY9Ce
SKTkJujUUf7b1zN+lKz8d9u6AKomE9rYBLpR0LPhDrnpiLbHtyWCeFWsmOB63k4E
05EwIHGAlvIC/dc6bHoeJzSLT5agK2KcCVWjgVzZgkDi7sbYkE8qhPmo/cojSERo
48+o8beAKgU3YEI8OwraxYBlUR71DKfdL7+6xvEo8kVNj5iNMq5GWY+YLvcQgR50
3MLuGxWFZWVRfZY8rrLMajFxNXojwuWuLu/PTT0Kz2ZRgLseF+op0AH2Ezsw4pnZ
CLp0sSKs9BqpwKuFCb1lHiEVnGfOb9CFy3u0nWmQjsE0Bj8HRC433x4fNQcJVUmJ
ZMPXRtZaboPV9jt3UoUhtancMiXdAbTP48N7klFRuVwCOycnxW5yAFkCssFaSpsn
EAidzBDODUXUV6/3paNVsZD7ehVJ/FMBgKSyAoJrcr+RZeFbn4b9m/NvdpdhQIwn
iGrTMhz3YmEhxiZrStYB9aaeaaWKZxd120bnTcfFEcnMOCKUkBSICtqjGLVsBO5e
rQV9P97h+kxf+Wh7DqhkC7br7URpYsYDZa9bCd+SAL1qrGeNZW/RP01ABRZWiSi4
0QVvKZ7uVzyEHIVHXOoj
=a2Ax
-----END PGP SIGNATURE-----
Merge tag 'powerpc-4.12-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"Some more powerpc fixes for 4.12. Most of these actually came in last
week but got held up for some more testing.
- three fixes for kprobes/ftrace/livepatch interactions.
- properly handle data breakpoints when using the Radix MMU.
- fix for perf sampling of registers during call_usermodehelper().
- properly initialise the thread_info on our emergency stacks
- add an explicit flush when doing TLB invalidations for a process
using NPU2.
Thanks to: Alistair Popple, Naveen N. Rao, Nicholas Piggin, Ravi
Bangoria, Masami Hiramatsu"
* tag 'powerpc-4.12-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/64: Initialise thread_info for emergency stacks
powerpc/powernv/npu-dma: Add explicit flush when sending an ATSD
powerpc/perf: Fix oops when kthread execs user process
powerpc/64s: Handle data breakpoints in Radix mode
powerpc/kprobes: Skip livepatch_handler() for jprobes
powerpc/ftrace: Pass the correct stack pointer for DYNAMIC_FTRACE_WITH_REGS
powerpc/kprobes: Pause function_graph tracing during jprobes handling
- I2C and SPI devices are expected to be enumerated by the
I2C and SPI subsystems, respectively, but due to a change made
during the 4.11 cycle, in some cases the ACPI core marks them
as already enumerated which causes the I2C and SPI subsystems
to overlook them, so fix that (Jarkko Nikula).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJZTRdgAAoJEILEb/54YlRx+IgP/17SwIqNGDJMl6t/R+dRtLTv
JWGzp967HLo0UchZUQpIvYT6CTPWytSV4hHxoCFSKSYLmQ8XI/qpu1fNysjsf0lK
PZp6H/qsTRyh9MRbadEhJKYLvb7SVgSLRsht8ts5kiqc1TfgdfT8b/VJF0ggQ4+O
z/QM3yMlnEIkAds23SdRbiy+Y0W45XTvPYHq9H03HIL809DvuQ5MdPTC9Cyr/PI9
KyhGs8zx3ZOuWmuRVmGJvWqw40WMa2ZhUGXbQGYTaVHzcIIeiv345WQn6luWsYPL
ln52ifM+T7wbNsmvCH4MtkW8Ix5yDJNkDM3x2gwkF07egJpYYOa7q/02sPmHz+2Z
daQbh6mKb861a75UiEcZmF1DL4sCpcwGAvKv9ERvrWBsYX6y14K84SLas2j0lKO/
9SzDhKKLVV46u/rmM0qz+9n6tEDoo1Hi7460FToucLEJpjc4aqsE8kUSjshlO+QX
mqdlrHKlWfAq52Ccno7Sn2FhGJG+p9CoQk+BE9rExIoBMlLrtw+fe/oPiHVvTzs8
TP7U5ioHyfdL9sRLfkKXHF21zAwpxvh7EVPED5LK3GwNZ0QjVryN5HAlnwGrDnUe
zL32+hLEpRW5hPmsZOHjSzhZ4o7ihZwpcCTy62uG4zZVf+qLpIM0Xd0dFC1CpZTm
F5/BD8Gj5Dl5n4SO7VM4
=ORkn
-----END PGP SIGNATURE-----
Merge tag 'acpi-4.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fix from Rafael Wysocki:
"This fixes the ACPI-based enumeration of some I2C and SPI devices
broken in 4.11.
Specifics:
- I2C and SPI devices are expected to be enumerated by the I2C and
SPI subsystems, respectively, but due to a change made during the
4.11 cycle, in some cases the ACPI core marks them as already
enumerated which causes the I2C and SPI subsystems to overlook
them, so fix that (Jarkko Nikula)"
* tag 'acpi-4.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / scan: Fix enumeration for special SPI and I2C devices
Pull i2c fix from Wolfram Sang.
* 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: imx: Use correct function to write to register
MVEBU PWM controller embedded in the GPIO controller before
we release v4.12. Hopefully.
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJZTQM9AAoJEEEQszewGV1zdOYP/1GN7dCIE2VemdGDVkt52WUm
8NOy7PMe6XtrfLXQMxpS8ezXH3ag1CegEynuoEnjreZGWji6yYjr7vzXYPVAHgAj
mSnvRt+t+KkaQ1nTRLaVH/DjSfMCZiqBAsJakyvrcnV7GuHrzVOJYiSLhHFu4XXI
ak2Xb4pO8bc32aNVGH24vmvNEZBglJ6jn2YOBNHvkc7IJTvLKZ93nZ+LZaa2pJPz
gWv/Mcz0j1KcyXAY7c0QdZYkY/Rr2RuD/ZTUWpfUJJpPHuD8122S0rDoAc+6abaN
bXjCI7tYW2Pj1u6+4Ky0g/A26Cph3ELc2XZ8spBcr40qKLyAJe/nZpbWE0wpmGCR
0m0pHMwejyFXac6a9aV90gZmzl2aG3YgNNDR2ea6AhBepPfFVUj9fJ1/hi2AD3F9
n+XZw2Q+0MKMlvIct6+f8gtNYro7emHx+U7+4zsqgGM8BvdXQiumDs+2uWLXVjo4
oWjvGXJM/FaK5Qu6iH92dojBIole9ncvCVa4T+OMT5WCFAB19x8ukjeLGJvQq7EM
pj8fouAYQXGZV6jP1rEU1EBuCYtDCz7CmN6lLBVvf6KGwkKt2pgDRCp7feCwvL63
vtBYclEiVjf6uY5JBe2Px2JwjM5BX5b9+NoQyTX8VIvjJTVmTZL6UFSxb6kGFfzB
8TE1rAxwXMF75hsk1wug
=+/4l
-----END PGP SIGNATURE-----
Merge tag 'gpio-v4.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio
Pull GPIO fix from Linus Walleij:
"A single GPIO patch fixing the compatible string for the MVEBU PWM
controller embedded in the GPIO controller before we release v4.12.
Hopefully"
* tag 'gpio-v4.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: mvebu: change compatible string for PWM support
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJZTKPyAAoJEAx081l5xIa+uwEQAIws5U4O3a2L4qAJyKXfLFOL
iupnaWeFmvdSCs0bOuMDvltQh1QSNMj2kO9z8GGeMj23o6qo471JPVigU7uhxMur
xoqsbctrP4P/zq04mK/NrXYrNUL5nMyodJHb06BuztjNbgs9ftAPha97Hxkh4BL8
14dHoNvKthpzF3rOzIsx6ZK0VXnL7kGGgJZsdbcFbbZLUNxsUzYsn+9p1n5G9d7Z
jiEJutORf/n8Op2jXYhU42aYjIDo/mROFcD7rxehW5ZKRyg6rLTzJt1M3IYnAaZC
yOkLR+vwY6P5PzfLlOfKfCh15iEWMB/5kdnRQ0vbFIVe5Tg6ZrtI8gM2VII/rXI0
qRs/pDQJPdt/+vLQ+ryfuxbJWTPX8ZpFVWHpbB44NAw9JY2cgwFnx/NW2a/8LzFM
m4ToLihvN3O9aHHDpUl0Tr0l/coUMW4WyIuj8TZ+IeMo93Y40Dmbf0O29uY9+svs
uvtbFobETX/caF31h7Y0/8zd/LYTDCvnf5ip78/9YzPMTE+0UuwIgWXRtkP+hXUg
djxr+lni7wHyaGk3l2gVvC/YQ4uOD+EVCwU+K3GU0DVjbkE+J3m/Jj1wTl+iU+TS
8lot4baJ9o3AZVJV2ZHbWKvxcitUnqAQQXlJSPRarZf6qnXu6Bg38sU+dueNDjg4
tBLYjT1UKOFosRe3BVuV
=MKIT
-----END PGP SIGNATURE-----
Merge tag 'drm-fixes-for-v4.12-rc7' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"A varied bunch of fixes, one for an API regression with connectors.
Otherwise amdgpu and i915 have a bunch of varied fixes, the shrinker
ones being the most important"
* tag 'drm-fixes-for-v4.12-rc7' of git://people.freedesktop.org/~airlied/linux:
drm: Fix GETCONNECTOR regression
drm/radeon: add a quirk for Toshiba Satellite L20-183
drm/radeon: add a PX quirk for another K53TK variant
drm/amdgpu: adjust default display clock
drm/amdgpu/atom: fix ps allocation size for EnableDispPowerGating
drm/amdgpu: add Polaris12 DID
drm/i915: Don't enable backlight at setup time.
drm/i915: Plumb the correct acquire ctx into intel_crtc_disable_noatomic()
drm/i915: Fix deadlock witha the pipe A quirk during resume
drm/i915: Remove __GFP_NORETRY from our buffer allocator
drm/i915: Encourage our shrinker more when our shmemfs allocations fails
drm/i915: Differentiate between sw write location into ring and last hw read
random_for_linus_stable pull request.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEK2m5VNv+CHkogTfJ8vlZVpUNgaMFAllIfggACgkQ8vlZVpUN
gaMzCAgAgsCFSH9vGm+DgUjABNH++fPB/MVsd8lq8sGzg+rTGCe1pBap409feDkF
+xZfF41Dqts8rchXYn6hqTDuOMfCX9cOxDxxOdhdKG6ntmdGHSZ4T+hM17v6Jgbe
a7M1xs/7Xrfunqsz9bkb1AdReO1wxG7f3a6JixPnQ1K6yc6HZpFZK5mTrd73lSfY
ta+KVrZBvPyVyAcWNQn6ssgTRhrTFwFy/nG4Mz2XteATyo9Z9622z8TGW5tZacnQ
dMgMi9ZMqYuIW/1tA1MmIs5GFkmbZVOqgpbipjhrXEquNCGwj4LQCeeN4qrKnXsw
enAy3z6DRu9C/F7gMHcvpbYETEmSjQ==
=HSEU
-----END PGP SIGNATURE-----
Merge tag 'random_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random
Pull random fixes from Ted Ts'o:
"Fix some locking and gcc optimization issues from the most recent
random_for_linus_stable pull request"
* tag 'random_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random:
random: silence compiler warnings and fix race
to crash
- a DM io reference count fix that resolves a NULL pointer seen when
issuing discards to a DM mirror target's device whose mirror legs do
not all support discards
- a couple DM integrity fixes
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJZTBU6AAoJEMUj8QotnQNaDJEH/3ujmjAnN1gpB4PhMh7kGyKA
i8qB476EYTEH8mPha88lvGoGoTk07cgwuWgQtulbcIlM0PbtbjcRs4lxEPCJW8sm
BeBnwKnBtnd+I+INKK0RCkYNHxO1ciCv4jMe08xNvSOrcmNVI1E4HjQ5GtJX7IMO
eQCEdTsDhf+ZXGnE6ErzfLVYnrazhNhk40+2jSlDFxDL8Qpd43EwMw5iHzeh0ztm
Frf9+JjWlckUS6oVm1AkTygbRrS3FEJ/cM3ei61/kj6hYzFGcP4Ba95Zd/E5k1Ls
9byBw93KTW3Pi5BKeVTo/JgxITcVQUZAWn95qF7HofZn6oBLdEiHzXLHQctl9Qs=
=fZVe
-----END PGP SIGNATURE-----
Merge tag 'for-4.12/dm-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
- a revert of a DM mirror commit that has proven to make the code prone
to crash
- a DM io reference count fix that resolves a NULL pointer seen when
issuing discards to a DM mirror target's device whose mirror legs do
not all support discards
- a couple DM integrity fixes
* tag 'for-4.12/dm-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm io: fix duplicate bio completion due to missing ref count
dm integrity: fix to not disable/enable interrupts from interrupt context
Revert "dm mirror: use all available legs on multiple failures"
dm integrity: reject mappings too large for device
Merge misc fixes from Andrew Morton:
"8 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
fs/exec.c: account for argv/envp pointers
ocfs2: fix deadlock caused by recursive locking in xattr
slub: make sysfs file removal asynchronous
lib/cmdline.c: fix get_options() overflow while parsing ranges
fs/dax.c: fix inefficiency in dax_writeback_mapping_range()
autofs: sanity check status reported with AUTOFS_DEV_IOCTL_FAIL
mm/vmalloc.c: huge-vmap: fail gracefully on unexpected huge vmap mappings
mm, thp: remove cond_resched from __collapse_huge_page_copy
When limiting the argv/envp strings during exec to 1/4 of the stack limit,
the storage of the pointers to the strings was not included. This means
that an exec with huge numbers of tiny strings could eat 1/4 of the stack
limit in strings and then additional space would be later used by the
pointers to the strings.
For example, on 32-bit with a 8MB stack rlimit, an exec with 1677721
single-byte strings would consume less than 2MB of stack, the max (8MB /
4) amount allowed, but the pointers to the strings would consume the
remaining additional stack space (1677721 * 4 == 6710884).
The result (1677721 + 6710884 == 8388605) would exhaust stack space
entirely. Controlling this stack exhaustion could result in
pathological behavior in setuid binaries (CVE-2017-1000365).
[akpm@linux-foundation.org: additional commenting from Kees]
Fixes: b6a2fea393 ("mm: variable length argument support")
Link: http://lkml.kernel.org/r/20170622001720.GA32173@beast
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Qualys Security Advisory <qsa@qualys.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Another deadlock path caused by recursive locking is reported. This
kind of issue was introduced since commit 743b5f1434 ("ocfs2: take
inode lock in ocfs2_iop_set/get_acl()"). Two deadlock paths have been
fixed by commit b891fa5024 ("ocfs2: fix deadlock issue when taking
inode lock at vfs entry points"). Yes, we intend to fix this kind of
case in incremental way, because it's hard to find out all possible
paths at once.
This one can be reproduced like this. On node1, cp a large file from
home directory to ocfs2 mountpoint. While on node2, run
setfacl/getfacl. Both nodes will hang up there. The backtraces:
On node1:
__ocfs2_cluster_lock.isra.39+0x357/0x740 [ocfs2]
ocfs2_inode_lock_full_nested+0x17d/0x840 [ocfs2]
ocfs2_write_begin+0x43/0x1a0 [ocfs2]
generic_perform_write+0xa9/0x180
__generic_file_write_iter+0x1aa/0x1d0
ocfs2_file_write_iter+0x4f4/0xb40 [ocfs2]
__vfs_write+0xc3/0x130
vfs_write+0xb1/0x1a0
SyS_write+0x46/0xa0
On node2:
__ocfs2_cluster_lock.isra.39+0x357/0x740 [ocfs2]
ocfs2_inode_lock_full_nested+0x17d/0x840 [ocfs2]
ocfs2_xattr_set+0x12e/0xe80 [ocfs2]
ocfs2_set_acl+0x22d/0x260 [ocfs2]
ocfs2_iop_set_acl+0x65/0xb0 [ocfs2]
set_posix_acl+0x75/0xb0
posix_acl_xattr_set+0x49/0xa0
__vfs_setxattr+0x69/0x80
__vfs_setxattr_noperm+0x72/0x1a0
vfs_setxattr+0xa7/0xb0
setxattr+0x12d/0x190
path_setxattr+0x9f/0xb0
SyS_setxattr+0x14/0x20
Fix this one by using ocfs2_inode_{lock|unlock}_tracker, which is
exported by commit 439a36b8ef ("ocfs2/dlmglue: prepare tracking logic
to avoid recursive cluster lock").
Link: http://lkml.kernel.org/r/20170622014746.5815-1-zren@suse.com
Fixes: 743b5f1434 ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()")
Signed-off-by: Eric Ren <zren@suse.com>
Reported-by: Thomas Voegtle <tv@lio96.de>
Tested-by: Thomas Voegtle <tv@lio96.de>
Reviewed-by: Joseph Qi <jiangqi903@gmail.com>
Cc: Mark Fasheh <mfasheh@versity.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit bf5eb3de38 ("slub: separate out sysfs_slab_release() from
sysfs_slab_remove()") made slub sysfs file removals synchronous to
kmem_cache shutdown.
Unfortunately, this created a possible ABBA deadlock between slab_mutex
and sysfs draining mechanism triggering the following lockdep warning.
======================================================
[ INFO: possible circular locking dependency detected ]
4.10.0-test+ #48 Not tainted
-------------------------------------------------------
rmmod/1211 is trying to acquire lock:
(s_active#120){++++.+}, at: [<ffffffff81308073>] kernfs_remove+0x23/0x40
but task is already holding lock:
(slab_mutex){+.+.+.}, at: [<ffffffff8120f691>] kmem_cache_destroy+0x41/0x2d0
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (slab_mutex){+.+.+.}:
lock_acquire+0xf6/0x1f0
__mutex_lock+0x75/0x950
mutex_lock_nested+0x1b/0x20
slab_attr_store+0x75/0xd0
sysfs_kf_write+0x45/0x60
kernfs_fop_write+0x13c/0x1c0
__vfs_write+0x28/0x120
vfs_write+0xc8/0x1e0
SyS_write+0x49/0xa0
entry_SYSCALL_64_fastpath+0x1f/0xc2
-> #0 (s_active#120){++++.+}:
__lock_acquire+0x10ed/0x1260
lock_acquire+0xf6/0x1f0
__kernfs_remove+0x254/0x320
kernfs_remove+0x23/0x40
sysfs_remove_dir+0x51/0x80
kobject_del+0x18/0x50
__kmem_cache_shutdown+0x3e6/0x460
kmem_cache_destroy+0x1fb/0x2d0
kvm_exit+0x2d/0x80 [kvm]
vmx_exit+0x19/0xa1b [kvm_intel]
SyS_delete_module+0x198/0x1f0
entry_SYSCALL_64_fastpath+0x1f/0xc2
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(slab_mutex);
lock(s_active#120);
lock(slab_mutex);
lock(s_active#120);
*** DEADLOCK ***
2 locks held by rmmod/1211:
#0: (cpu_hotplug.dep_map){++++++}, at: [<ffffffff810a7877>] get_online_cpus+0x37/0x80
#1: (slab_mutex){+.+.+.}, at: [<ffffffff8120f691>] kmem_cache_destroy+0x41/0x2d0
stack backtrace:
CPU: 3 PID: 1211 Comm: rmmod Not tainted 4.10.0-test+ #48
Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012
Call Trace:
print_circular_bug+0x1be/0x210
__lock_acquire+0x10ed/0x1260
lock_acquire+0xf6/0x1f0
__kernfs_remove+0x254/0x320
kernfs_remove+0x23/0x40
sysfs_remove_dir+0x51/0x80
kobject_del+0x18/0x50
__kmem_cache_shutdown+0x3e6/0x460
kmem_cache_destroy+0x1fb/0x2d0
kvm_exit+0x2d/0x80 [kvm]
vmx_exit+0x19/0xa1b [kvm_intel]
SyS_delete_module+0x198/0x1f0
? SyS_delete_module+0x5/0x1f0
entry_SYSCALL_64_fastpath+0x1f/0xc2
It'd be the cleanest to deal with the issue by removing sysfs files
without holding slab_mutex before the rest of shutdown; however, given
the current code structure, it is pretty difficult to do so.
This patch punts sysfs file removal to a work item. Before commit
bf5eb3de38, the removal was punted to a RCU delayed work item which is
executed after release. Now, we're punting to a different work item on
shutdown which still maintains the goal removing the sysfs files earlier
when destroying kmem_caches.
Link: http://lkml.kernel.org/r/20170620204512.GI21326@htj.duckdns.org
Fixes: bf5eb3de38 ("slub: separate out sysfs_slab_release() from sysfs_slab_remove()")
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Tested-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When using get_options() it's possible to specify a range of numbers,
like 1-100500. The problem is that it doesn't track array size while
calling internally to get_range() which iterates over the range and
fills the memory with numbers.
Link: http://lkml.kernel.org/r/2613C75C-B04D-4BFF-82A6-12F97BA0F620@gmail.com
Signed-off-by: Ilya V. Matveychikov <matvejchikov@gmail.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
dax_writeback_mapping_range() fails to update iteration index when
searching radix tree for entries needing cache flushing. Thus each
pagevec worth of entries is searched starting from the start which is
inefficient and prone to livelocks. Update index properly.
Link: http://lkml.kernel.org/r/20170619124531.21491-1-jack@suse.cz
Fixes: 9973c98ecf ("dax: add support for fsync/sync")
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If a positive status is passed with the AUTOFS_DEV_IOCTL_FAIL ioctl,
autofs4_d_automount() will return
ERR_PTR(status)
with that status to follow_automount(), which will then dereference an
invalid pointer.
So treat a positive status the same as zero, and map to ENOENT.
See comment in systemd src/core/automount.c::automount_send_ready().
Link: http://lkml.kernel.org/r/871sqwczx5.fsf@notabene.neil.brown.name
Signed-off-by: NeilBrown <neilb@suse.com>
Cc: Ian Kent <raven@themaw.net>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Existing code that uses vmalloc_to_page() may assume that any address
for which is_vmalloc_addr() returns true may be passed into
vmalloc_to_page() to retrieve the associated struct page.
This is not un unreasonable assumption to make, but on architectures
that have CONFIG_HAVE_ARCH_HUGE_VMAP=y, it no longer holds, and we need
to ensure that vmalloc_to_page() does not go off into the weeds trying
to dereference huge PUDs or PMDs as table entries.
Given that vmalloc() and vmap() themselves never create huge mappings or
deal with compound pages at all, there is no correct answer in this
case, so return NULL instead, and issue a warning.
When reading /proc/kcore on arm64, you will hit an oops as soon as you
hit the huge mappings used for the various segments that make up the
mapping of vmlinux. With this patch applied, you will no longer hit the
oops, but the kcore contents willl be incorrect (these regions will be
zeroed out)
We are fixing this for kcore specifically, so it avoids vread() for
those regions. At least one other problematic user exists, i.e.,
/dev/kmem, but that is currently broken on arm64 for other reasons.
Link: http://lkml.kernel.org/r/20170609082226.26152-1-ard.biesheuvel@linaro.org
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Laura Abbott <labbott@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: zhong jiang <zhongjiang@huawei.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is a partial revert of commit 338a16ba15 ("mm, thp: copying user
pages must schedule on collapse") which added a cond_resched() to
__collapse_huge_page_copy().
On x86 with CONFIG_HIGHPTE, __collapse_huge_page_copy is called in
atomic context and thus scheduling is not possible. This is only a
possible config on arm and i386.
Although need_resched has been shown to be set for over 100 jiffies
while doing the iteration in __collapse_huge_page_copy, this is better
than doing
if (in_atomic())
cond_resched()
to cover only non-CONFIG_HIGHPTE configs.
Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1706191341550.97821@chino.kir.corp.google.com
Signed-off-by: David Rientjes <rientjes@google.com>
Reported-by: Larry Finger <Larry.Finger@lwfinger.net>
Tested-by: Larry Finger <Larry.Finger@lwfinger.net>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Two fixes to remove spurious WARN_ONs from the new(ish) qedi driver.
The driver already prints a warning message, there's no need to panic
users by printing something that looks like an oops as well.
Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABAgAGBQJZTBWOAAoJEAVr7HOZEZN4g0gP/0DjcJmv8c72CzInKnfB+nF1
y9zrhOojrTD3OhChI0V82rxIfpFLL7ipMCmnl5zElsB0e8R+8w2EwkwO4VJz9lyu
mztjX68HxG3UmEA4wG9euixCUvK8ewt21PjbWF+voS0pj1bf1fSobl74J+5zQD2D
svoB93gqDGAIUVL388NWfwrR8Lo1Ysrc1FMtjIGQcrpx+7lVAKpN7olby7JKQpnx
2HJn64cUiPWX5qajEPnFOfpj8tNecg9Vq7+z8lIcGKXEQk0neraEejbtBWC2z9ZA
/WQJvA4Ed5BH2I4tG8Ba+J3Z5YFGMJhIdPD3P3Xs8rWAJ3Kzte2jWxP3o1T0cu0R
8W3b1EhFmyZbiAbhLf+UOlPGMY5cxej7dMzobH2h985mZAXamBnqO7yKXrGl/mHh
/ai4bC52AdJitOxObaRzyk7ilx4dfSODUfdbTD3j1gj7mIpFoDkgCqA/GrfDFk34
pzill8IiRxKntFgcZUPflBYAuvFZvGZHXvLsV6xxsYehMv7c3uxJ6/9rPowjxEAg
HYmkOH+nQgrgKo27SAviaz9Do9hfxYjJ1hP014DjsCAdtlzb05PhXAhJCAqaPzIG
UpPxpvE1QATI+ZtudLDt7Fk7uR6ggyvkgUkRoNktf4N8rhlUs3kAVQy3Mda3vVW3
97NN3G9kppiLATijLG/b
=ysyU
-----END PGP SIGNATURE-----
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Two fixes to remove spurious WARN_ONs from the new(ish) qedi driver.
The driver already prints a warning message, there's no need to panic
users by printing something that looks like an oops as well"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: qedi: Remove WARN_ON from clear task context.
scsi: qedi: Remove WARN_ON for untracked cleanup.
- don't allow swapon on files on the realtime device, because the swap
code will swap pages out to blocks on the data device, thereby
corrupting the filesystem
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABCgAGBQJZSzmHAAoJEPh/dxk0SrTroa4P/02EPljuA4pOhYlTrsrKyul4
7KnVg1AFk2uYlNbEcZjKJTkhMhvCqtENorAWawixezAbSeumft24DgPVmXxXEGRx
f2ym8UiwEVSdTs2dlP/8HCgrrx3kgaF6H4tYnu4WQxkMkDfE6feTp0TcOsklW8R1
bR+V+Q9xSJ2WRji9mDBu++3jXKa1VlsOzCRDjnWI7E/ZHJ2n8y412qYxaOHPDvl2
g5AG7jOtB2D7nDEVtfuEdsuSIBHrUsZ/LWrpDlXMhTY7eJ5ipjvcs6RtMayufNdE
H5ZeA8bKIJNcpR5Y0MvAb5lQNDA5wg4MTLWfQQ7jlvnI6qaysqWR13UhbfzRBHg8
YDUUWtuyvq+2/gy94VOn82xKTerD8l+KE+pdZUU99qZDsHVZ0FZ0A2IpSA0ZRdj+
xYm2WnzIqgMp5OD0Ef+QYzMr0043eBnD1+CDnG/JbHz/S1nqI4KdzH5t2ndMg9YS
g4sl3qKEwR1ZHnECTu2Q9LWAtF5s8WBgVj3brDG9mdMZXwWYLyGKJDNZ6tsxwOzh
Z2Pp+6Gs5KRqCt5Acok84KjcS7/XVM0a4w9KOjmlZxZ1K9R5abAePGOT+GEGFP4g
qO2WOa+wHX2UlUQI+lYg60PFMCBtO41ewptx/1+ZluREyNE24aIRTQttRRdz2twA
/kF8Uf8eGzPWkyP/uCH3
=qkCp
-----END PGP SIGNATURE-----
Merge tag 'xfs-4.12-fixes-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs fixes from Darrick Wong:
"I have one more bugfix for you for 4.12-rc7 to fix a disk corruption
problem:
- don't allow swapon on files on the realtime device, because the
swap code will swap pages out to blocks on the data device, thereby
corrupting the filesystem"
* tag 'xfs-4.12-fixes-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: don't allow bmap on rt files