Commit Graph

679368 Commits

Author SHA1 Message Date
Rik van Riel 3fed382b46 sched/numa: Implement NUMA node level wake_affine()
Since select_idle_sibling() can place a task anywhere on a socket,
comparing loads between individual CPU cores makes no real sense
for deciding whether to do an affine wakeup across sockets, either.

Instead, compare the load between the sockets in a similar way the
load balancer and the numa balancing code do.

Signed-off-by: Rik van Riel <riel@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: jhladky@redhat.com
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20170623165530.22514-4-riel@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-24 08:57:52 +02:00
Rik van Riel 7d894e6e34 sched/fair: Simplify wake_affine() for the single socket case
Then 'this_cpu' and 'prev_cpu' are in the same socket, select_idle_sibling()
will do its thing regardless of the return value of wake_affine().

Just return true and don't look at all the other things.

Signed-off-by: Rik van Riel <riel@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: jhladky@redhat.com
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20170623165530.22514-3-riel@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-24 08:57:52 +02:00
Rik van Riel 739294fb03 sched/numa: Override part of migrate_degrades_locality() when idle balancing
Several tests in the NAS benchmark seem to run a lot slower with
NUMA balancing enabled, than with NUMA balancing disabled. The
slower run time corresponds with increased idle time.

Overriding the final test of migrate_degrades_locality (but still
doing the other NUMA tests first) seems to improve performance
of those benchmarks.

Reported-by: Jirka Hladky <jhladky@redhat.com>
Signed-off-by: Rik van Riel <riel@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20170623165530.22514-2-riel@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-24 08:57:46 +02:00
Ingo Molnar 1bc3cd4dfa Merge branch 'linus' into sched/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-24 08:57:20 +02:00
Linus Torvalds 94a6df251d powerpc fixes for 4.12 #7
- three fixes for kprobes/ftrace/livepatch interactions.
 
  - properly handle data breakpoints when using the Radix MMU.
 
  - fix for perf sampling of registers during call_usermodehelper().
 
  - properly initialise the thread_info on our emergency stacks
 
  - add an explicit flush when doing TLB invalidations for a process
    using NPU2.
 
 Thanks to:
   Alistair Popple, Naveen N. Rao, Nicholas Piggin, Ravi Bangoria,
   Masami Hiramatsu.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJZTZy4AAoJEFHr6jzI4aWA9CYQAK+BIZ2wM+QEKDWUc7bHUBfJ
 kVkFr59VS4x9w2zL2fKijy3CTNqaEXCUhmCks7PFYxGfF437YaJGVfCBVotuY9Ce
 SKTkJujUUf7b1zN+lKz8d9u6AKomE9rYBLpR0LPhDrnpiLbHtyWCeFWsmOB63k4E
 05EwIHGAlvIC/dc6bHoeJzSLT5agK2KcCVWjgVzZgkDi7sbYkE8qhPmo/cojSERo
 48+o8beAKgU3YEI8OwraxYBlUR71DKfdL7+6xvEo8kVNj5iNMq5GWY+YLvcQgR50
 3MLuGxWFZWVRfZY8rrLMajFxNXojwuWuLu/PTT0Kz2ZRgLseF+op0AH2Ezsw4pnZ
 CLp0sSKs9BqpwKuFCb1lHiEVnGfOb9CFy3u0nWmQjsE0Bj8HRC433x4fNQcJVUmJ
 ZMPXRtZaboPV9jt3UoUhtancMiXdAbTP48N7klFRuVwCOycnxW5yAFkCssFaSpsn
 EAidzBDODUXUV6/3paNVsZD7ehVJ/FMBgKSyAoJrcr+RZeFbn4b9m/NvdpdhQIwn
 iGrTMhz3YmEhxiZrStYB9aaeaaWKZxd120bnTcfFEcnMOCKUkBSICtqjGLVsBO5e
 rQV9P97h+kxf+Wh7DqhkC7br7URpYsYDZa9bCd+SAL1qrGeNZW/RP01ABRZWiSi4
 0QVvKZ7uVzyEHIVHXOoj
 =a2Ax
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.12-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 "Some more powerpc fixes for 4.12. Most of these actually came in last
  week but got held up for some more testing.

   - three fixes for kprobes/ftrace/livepatch interactions.

   - properly handle data breakpoints when using the Radix MMU.

   - fix for perf sampling of registers during call_usermodehelper().

   - properly initialise the thread_info on our emergency stacks

   - add an explicit flush when doing TLB invalidations for a process
     using NPU2.

  Thanks to: Alistair Popple, Naveen N. Rao, Nicholas Piggin, Ravi
  Bangoria, Masami Hiramatsu"

* tag 'powerpc-4.12-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/64: Initialise thread_info for emergency stacks
  powerpc/powernv/npu-dma: Add explicit flush when sending an ATSD
  powerpc/perf: Fix oops when kthread execs user process
  powerpc/64s: Handle data breakpoints in Radix mode
  powerpc/kprobes: Skip livepatch_handler() for jprobes
  powerpc/ftrace: Pass the correct stack pointer for DYNAMIC_FTRACE_WITH_REGS
  powerpc/kprobes: Pause function_graph tracing during jprobes handling
2017-06-23 17:53:16 -07:00
Linus Torvalds cd5545ae87 ACPI fix for v4.12-rc7
- I2C and SPI devices are expected to be enumerated by the
    I2C and SPI subsystems, respectively, but due to a change made
    during the 4.11 cycle, in some cases the ACPI core marks them
    as already enumerated which causes the I2C and SPI subsystems
    to overlook them, so fix that (Jarkko Nikula).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJZTRdgAAoJEILEb/54YlRx+IgP/17SwIqNGDJMl6t/R+dRtLTv
 JWGzp967HLo0UchZUQpIvYT6CTPWytSV4hHxoCFSKSYLmQ8XI/qpu1fNysjsf0lK
 PZp6H/qsTRyh9MRbadEhJKYLvb7SVgSLRsht8ts5kiqc1TfgdfT8b/VJF0ggQ4+O
 z/QM3yMlnEIkAds23SdRbiy+Y0W45XTvPYHq9H03HIL809DvuQ5MdPTC9Cyr/PI9
 KyhGs8zx3ZOuWmuRVmGJvWqw40WMa2ZhUGXbQGYTaVHzcIIeiv345WQn6luWsYPL
 ln52ifM+T7wbNsmvCH4MtkW8Ix5yDJNkDM3x2gwkF07egJpYYOa7q/02sPmHz+2Z
 daQbh6mKb861a75UiEcZmF1DL4sCpcwGAvKv9ERvrWBsYX6y14K84SLas2j0lKO/
 9SzDhKKLVV46u/rmM0qz+9n6tEDoo1Hi7460FToucLEJpjc4aqsE8kUSjshlO+QX
 mqdlrHKlWfAq52Ccno7Sn2FhGJG+p9CoQk+BE9rExIoBMlLrtw+fe/oPiHVvTzs8
 TP7U5ioHyfdL9sRLfkKXHF21zAwpxvh7EVPED5LK3GwNZ0QjVryN5HAlnwGrDnUe
 zL32+hLEpRW5hPmsZOHjSzhZ4o7ihZwpcCTy62uG4zZVf+qLpIM0Xd0dFC1CpZTm
 F5/BD8Gj5Dl5n4SO7VM4
 =ORkn
 -----END PGP SIGNATURE-----

Merge tag 'acpi-4.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI fix from Rafael Wysocki:
 "This fixes the ACPI-based enumeration of some I2C and SPI devices
  broken in 4.11.

  Specifics:

   - I2C and SPI devices are expected to be enumerated by the I2C and
     SPI subsystems, respectively, but due to a change made during the
     4.11 cycle, in some cases the ACPI core marks them as already
     enumerated which causes the I2C and SPI subsystems to overlook
     them, so fix that (Jarkko Nikula)"

* tag 'acpi-4.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI / scan: Fix enumeration for special SPI and I2C devices
2017-06-23 17:49:12 -07:00
Linus Torvalds ba6cbdb673 Merge branch 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fix from Wolfram Sang.

* 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: imx: Use correct function to write to register
2017-06-23 17:46:10 -07:00
Linus Torvalds 25b2398f5c A single GPIO patch fixing the compatible string for the
MVEBU PWM controller embedded in the GPIO controller before
 we release v4.12. Hopefully.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJZTQM9AAoJEEEQszewGV1zdOYP/1GN7dCIE2VemdGDVkt52WUm
 8NOy7PMe6XtrfLXQMxpS8ezXH3ag1CegEynuoEnjreZGWji6yYjr7vzXYPVAHgAj
 mSnvRt+t+KkaQ1nTRLaVH/DjSfMCZiqBAsJakyvrcnV7GuHrzVOJYiSLhHFu4XXI
 ak2Xb4pO8bc32aNVGH24vmvNEZBglJ6jn2YOBNHvkc7IJTvLKZ93nZ+LZaa2pJPz
 gWv/Mcz0j1KcyXAY7c0QdZYkY/Rr2RuD/ZTUWpfUJJpPHuD8122S0rDoAc+6abaN
 bXjCI7tYW2Pj1u6+4Ky0g/A26Cph3ELc2XZ8spBcr40qKLyAJe/nZpbWE0wpmGCR
 0m0pHMwejyFXac6a9aV90gZmzl2aG3YgNNDR2ea6AhBepPfFVUj9fJ1/hi2AD3F9
 n+XZw2Q+0MKMlvIct6+f8gtNYro7emHx+U7+4zsqgGM8BvdXQiumDs+2uWLXVjo4
 oWjvGXJM/FaK5Qu6iH92dojBIole9ncvCVa4T+OMT5WCFAB19x8ukjeLGJvQq7EM
 pj8fouAYQXGZV6jP1rEU1EBuCYtDCz7CmN6lLBVvf6KGwkKt2pgDRCp7feCwvL63
 vtBYclEiVjf6uY5JBe2Px2JwjM5BX5b9+NoQyTX8VIvjJTVmTZL6UFSxb6kGFfzB
 8TE1rAxwXMF75hsk1wug
 =+/4l
 -----END PGP SIGNATURE-----

Merge tag 'gpio-v4.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio

Pull GPIO fix from Linus Walleij:
 "A single GPIO patch fixing the compatible string for the MVEBU PWM
  controller embedded in the GPIO controller before we release v4.12.
  Hopefully"

* tag 'gpio-v4.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
  gpio: mvebu: change compatible string for PWM support
2017-06-23 17:40:41 -07:00
Linus Torvalds 51c933f208 sound fixes for 4.12-rc7
Nothing exciting here, just a few stable fixes:
 - Suppress spurious kernel WARNING in PCM core
 - Fix potential spin deadlock at error handling in firewire
 - HD-audio PCI ID addition / fixup
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJZTKpPAAoJEGwxgFQ9KSmkuREP/06S8dyQoKYd4OTVo+FrVjnK
 8gzaGqkttaMbBk8sIifQKs0FX6rfoYh2xGkkkMQ11TVyqItbw96RbnMpS8/lV45O
 fuC9Bm+YtoY6zyW3AQBaiCepn2EsoA88WTRv2BG4zEpQfZeFSsMKj432gaLun9Dz
 j7WsheqC+7rBJzXLLsun5RveimA0SQAAuES2bT9N8kDQJa9i7tAOWTKFmtvEj+pz
 thO9FvKqG0zIe0sFtUK+Qj+lkGoL10sOffRD3mGcepyJokM/dT6t6KXj6jk0dhwz
 kzrzQNkEZuWzOrXoBopsmNRStQ4CprU86fKfdBcePmmJqv68jvSe2Hfkrq1unxfd
 pj/BFJlxRKu1BA3zrBfvuiVg4nPhn3zb6djp9ecqAKJFzr5kT7cx2fZ3JEKgbTV1
 eP5aX/LS3iqM/0/sgFAUq+IGCBwTPJNVwE1QlzVBC6cAVFExjifdvrT4W5j1Mbj8
 ncy79J6icLjOrG6ykiKCiYVmDUZnhJnYIALZDB1OKaqLMluwV9s645m2NLnQIH/D
 +YagrFeOXkqELoi4gCnsGf9qGM04ob5F9Xah4DMi+b7tlm9EilNT8Hza4ccTcnDu
 Q7wcjDdNe/HtFp1DnAlOFXOnXkaT/m3UmnGZsL9aSTUXz0P5DOrsE7JG8rHoGadJ
 nTQqpXkDSm5zr4VAtWBX
 =5vmf
 -----END PGP SIGNATURE-----

Merge tag 'sound-4.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "Nothing exciting here, just a few stable fixes:

   - suppress spurious kernel WARNING in PCM core

   - fix potential spin deadlock at error handling in firewire

   - HD-audio PCI ID addition / fixup"

* tag 'sound-4.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda - Apply quirks to Broxton-T, too
  ALSA: firewire-lib: Fix stall of process context at packet error
  ALSA: pcm: Don't treat NULL chmap as a fatal error
  ALSA: hda - Add Coffelake PCI ID
2017-06-23 17:37:56 -07:00
Linus Torvalds 311548f173 i915, amdgpu and one core regression fix
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJZTKPyAAoJEAx081l5xIa+uwEQAIws5U4O3a2L4qAJyKXfLFOL
 iupnaWeFmvdSCs0bOuMDvltQh1QSNMj2kO9z8GGeMj23o6qo471JPVigU7uhxMur
 xoqsbctrP4P/zq04mK/NrXYrNUL5nMyodJHb06BuztjNbgs9ftAPha97Hxkh4BL8
 14dHoNvKthpzF3rOzIsx6ZK0VXnL7kGGgJZsdbcFbbZLUNxsUzYsn+9p1n5G9d7Z
 jiEJutORf/n8Op2jXYhU42aYjIDo/mROFcD7rxehW5ZKRyg6rLTzJt1M3IYnAaZC
 yOkLR+vwY6P5PzfLlOfKfCh15iEWMB/5kdnRQ0vbFIVe5Tg6ZrtI8gM2VII/rXI0
 qRs/pDQJPdt/+vLQ+ryfuxbJWTPX8ZpFVWHpbB44NAw9JY2cgwFnx/NW2a/8LzFM
 m4ToLihvN3O9aHHDpUl0Tr0l/coUMW4WyIuj8TZ+IeMo93Y40Dmbf0O29uY9+svs
 uvtbFobETX/caF31h7Y0/8zd/LYTDCvnf5ip78/9YzPMTE+0UuwIgWXRtkP+hXUg
 djxr+lni7wHyaGk3l2gVvC/YQ4uOD+EVCwU+K3GU0DVjbkE+J3m/Jj1wTl+iU+TS
 8lot4baJ9o3AZVJV2ZHbWKvxcitUnqAQQXlJSPRarZf6qnXu6Bg38sU+dueNDjg4
 tBLYjT1UKOFosRe3BVuV
 =MKIT
 -----END PGP SIGNATURE-----

Merge tag 'drm-fixes-for-v4.12-rc7' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
 "A varied bunch of fixes, one for an API regression with connectors.

  Otherwise amdgpu and i915 have a bunch of varied fixes, the shrinker
  ones being the most important"

* tag 'drm-fixes-for-v4.12-rc7' of git://people.freedesktop.org/~airlied/linux:
  drm: Fix GETCONNECTOR regression
  drm/radeon: add a quirk for Toshiba Satellite L20-183
  drm/radeon: add a PX quirk for another K53TK variant
  drm/amdgpu: adjust default display clock
  drm/amdgpu/atom: fix ps allocation size for EnableDispPowerGating
  drm/amdgpu: add Polaris12 DID
  drm/i915: Don't enable backlight at setup time.
  drm/i915: Plumb the correct acquire ctx into intel_crtc_disable_noatomic()
  drm/i915: Fix deadlock witha the pipe A quirk during resume
  drm/i915: Remove __GFP_NORETRY from our buffer allocator
  drm/i915: Encourage our shrinker more when our shmemfs allocations fails
  drm/i915: Differentiate between sw write location into ring and last hw read
2017-06-23 17:35:57 -07:00
Linus Torvalds 7139a06b16 Fix some locking and gcc optimization issues from the most recent
random_for_linus_stable pull request.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEK2m5VNv+CHkogTfJ8vlZVpUNgaMFAllIfggACgkQ8vlZVpUN
 gaMzCAgAgsCFSH9vGm+DgUjABNH++fPB/MVsd8lq8sGzg+rTGCe1pBap409feDkF
 +xZfF41Dqts8rchXYn6hqTDuOMfCX9cOxDxxOdhdKG6ntmdGHSZ4T+hM17v6Jgbe
 a7M1xs/7Xrfunqsz9bkb1AdReO1wxG7f3a6JixPnQ1K6yc6HZpFZK5mTrd73lSfY
 ta+KVrZBvPyVyAcWNQn6ssgTRhrTFwFy/nG4Mz2XteATyo9Z9622z8TGW5tZacnQ
 dMgMi9ZMqYuIW/1tA1MmIs5GFkmbZVOqgpbipjhrXEquNCGwj4LQCeeN4qrKnXsw
 enAy3z6DRu9C/F7gMHcvpbYETEmSjQ==
 =HSEU
 -----END PGP SIGNATURE-----

Merge tag 'random_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random

Pull random fixes from Ted Ts'o:
 "Fix some locking and gcc optimization issues from the most recent
  random_for_linus_stable pull request"

* tag 'random_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random:
  random: silence compiler warnings and fix race
2017-06-23 17:33:46 -07:00
Linus Torvalds 7ec2f7e8d9 - a revert of a DM mirror commit that has proven to make the code prone
to crash
 
 - a DM io reference count fix that resolves a NULL pointer seen when
   issuing discards to a DM mirror target's device whose mirror legs do
   not all support discards
 
 - a couple DM integrity fixes
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJZTBU6AAoJEMUj8QotnQNaDJEH/3ujmjAnN1gpB4PhMh7kGyKA
 i8qB476EYTEH8mPha88lvGoGoTk07cgwuWgQtulbcIlM0PbtbjcRs4lxEPCJW8sm
 BeBnwKnBtnd+I+INKK0RCkYNHxO1ciCv4jMe08xNvSOrcmNVI1E4HjQ5GtJX7IMO
 eQCEdTsDhf+ZXGnE6ErzfLVYnrazhNhk40+2jSlDFxDL8Qpd43EwMw5iHzeh0ztm
 Frf9+JjWlckUS6oVm1AkTygbRrS3FEJ/cM3ei61/kj6hYzFGcP4Ba95Zd/E5k1Ls
 9byBw93KTW3Pi5BKeVTo/JgxITcVQUZAWn95qF7HofZn6oBLdEiHzXLHQctl9Qs=
 =fZVe
 -----END PGP SIGNATURE-----

Merge tag 'for-4.12/dm-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper fixes from Mike Snitzer:

 - a revert of a DM mirror commit that has proven to make the code prone
   to crash

 - a DM io reference count fix that resolves a NULL pointer seen when
   issuing discards to a DM mirror target's device whose mirror legs do
   not all support discards

 - a couple DM integrity fixes

* tag 'for-4.12/dm-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm io: fix duplicate bio completion due to missing ref count
  dm integrity: fix to not disable/enable interrupts from interrupt context
  Revert "dm mirror: use all available legs on multiple failures"
  dm integrity: reject mappings too large for device
2017-06-23 17:32:05 -07:00
Linus Torvalds 337c6ba2d8 Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "8 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  fs/exec.c: account for argv/envp pointers
  ocfs2: fix deadlock caused by recursive locking in xattr
  slub: make sysfs file removal asynchronous
  lib/cmdline.c: fix get_options() overflow while parsing ranges
  fs/dax.c: fix inefficiency in dax_writeback_mapping_range()
  autofs: sanity check status reported with AUTOFS_DEV_IOCTL_FAIL
  mm/vmalloc.c: huge-vmap: fail gracefully on unexpected huge vmap mappings
  mm, thp: remove cond_resched from __collapse_huge_page_copy
2017-06-23 16:30:52 -07:00
Kees Cook 98da7d0885 fs/exec.c: account for argv/envp pointers
When limiting the argv/envp strings during exec to 1/4 of the stack limit,
the storage of the pointers to the strings was not included.  This means
that an exec with huge numbers of tiny strings could eat 1/4 of the stack
limit in strings and then additional space would be later used by the
pointers to the strings.

For example, on 32-bit with a 8MB stack rlimit, an exec with 1677721
single-byte strings would consume less than 2MB of stack, the max (8MB /
4) amount allowed, but the pointers to the strings would consume the
remaining additional stack space (1677721 * 4 == 6710884).

The result (1677721 + 6710884 == 8388605) would exhaust stack space
entirely.  Controlling this stack exhaustion could result in
pathological behavior in setuid binaries (CVE-2017-1000365).

[akpm@linux-foundation.org: additional commenting from Kees]
Fixes: b6a2fea393 ("mm: variable length argument support")
Link: http://lkml.kernel.org/r/20170622001720.GA32173@beast
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Qualys Security Advisory <qsa@qualys.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-23 16:15:56 -07:00
Eric Ren 8818efaaac ocfs2: fix deadlock caused by recursive locking in xattr
Another deadlock path caused by recursive locking is reported.  This
kind of issue was introduced since commit 743b5f1434 ("ocfs2: take
inode lock in ocfs2_iop_set/get_acl()").  Two deadlock paths have been
fixed by commit b891fa5024 ("ocfs2: fix deadlock issue when taking
inode lock at vfs entry points").  Yes, we intend to fix this kind of
case in incremental way, because it's hard to find out all possible
paths at once.

This one can be reproduced like this.  On node1, cp a large file from
home directory to ocfs2 mountpoint.  While on node2, run
setfacl/getfacl.  Both nodes will hang up there.  The backtraces:

On node1:
  __ocfs2_cluster_lock.isra.39+0x357/0x740 [ocfs2]
  ocfs2_inode_lock_full_nested+0x17d/0x840 [ocfs2]
  ocfs2_write_begin+0x43/0x1a0 [ocfs2]
  generic_perform_write+0xa9/0x180
  __generic_file_write_iter+0x1aa/0x1d0
  ocfs2_file_write_iter+0x4f4/0xb40 [ocfs2]
  __vfs_write+0xc3/0x130
  vfs_write+0xb1/0x1a0
  SyS_write+0x46/0xa0

On node2:
  __ocfs2_cluster_lock.isra.39+0x357/0x740 [ocfs2]
  ocfs2_inode_lock_full_nested+0x17d/0x840 [ocfs2]
  ocfs2_xattr_set+0x12e/0xe80 [ocfs2]
  ocfs2_set_acl+0x22d/0x260 [ocfs2]
  ocfs2_iop_set_acl+0x65/0xb0 [ocfs2]
  set_posix_acl+0x75/0xb0
  posix_acl_xattr_set+0x49/0xa0
  __vfs_setxattr+0x69/0x80
  __vfs_setxattr_noperm+0x72/0x1a0
  vfs_setxattr+0xa7/0xb0
  setxattr+0x12d/0x190
  path_setxattr+0x9f/0xb0
  SyS_setxattr+0x14/0x20

Fix this one by using ocfs2_inode_{lock|unlock}_tracker, which is
exported by commit 439a36b8ef ("ocfs2/dlmglue: prepare tracking logic
to avoid recursive cluster lock").

Link: http://lkml.kernel.org/r/20170622014746.5815-1-zren@suse.com
Fixes: 743b5f1434 ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()")
Signed-off-by: Eric Ren <zren@suse.com>
Reported-by: Thomas Voegtle <tv@lio96.de>
Tested-by: Thomas Voegtle <tv@lio96.de>
Reviewed-by: Joseph Qi <jiangqi903@gmail.com>
Cc: Mark Fasheh <mfasheh@versity.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-23 16:15:55 -07:00
Tejun Heo 3b7b314053 slub: make sysfs file removal asynchronous
Commit bf5eb3de38 ("slub: separate out sysfs_slab_release() from
sysfs_slab_remove()") made slub sysfs file removals synchronous to
kmem_cache shutdown.

Unfortunately, this created a possible ABBA deadlock between slab_mutex
and sysfs draining mechanism triggering the following lockdep warning.

  ======================================================
  [ INFO: possible circular locking dependency detected ]
  4.10.0-test+ #48 Not tainted
  -------------------------------------------------------
  rmmod/1211 is trying to acquire lock:
   (s_active#120){++++.+}, at: [<ffffffff81308073>] kernfs_remove+0x23/0x40

  but task is already holding lock:
   (slab_mutex){+.+.+.}, at: [<ffffffff8120f691>] kmem_cache_destroy+0x41/0x2d0

  which lock already depends on the new lock.

  the existing dependency chain (in reverse order) is:

  -> #1 (slab_mutex){+.+.+.}:
	 lock_acquire+0xf6/0x1f0
	 __mutex_lock+0x75/0x950
	 mutex_lock_nested+0x1b/0x20
	 slab_attr_store+0x75/0xd0
	 sysfs_kf_write+0x45/0x60
	 kernfs_fop_write+0x13c/0x1c0
	 __vfs_write+0x28/0x120
	 vfs_write+0xc8/0x1e0
	 SyS_write+0x49/0xa0
	 entry_SYSCALL_64_fastpath+0x1f/0xc2

  -> #0 (s_active#120){++++.+}:
	 __lock_acquire+0x10ed/0x1260
	 lock_acquire+0xf6/0x1f0
	 __kernfs_remove+0x254/0x320
	 kernfs_remove+0x23/0x40
	 sysfs_remove_dir+0x51/0x80
	 kobject_del+0x18/0x50
	 __kmem_cache_shutdown+0x3e6/0x460
	 kmem_cache_destroy+0x1fb/0x2d0
	 kvm_exit+0x2d/0x80 [kvm]
	 vmx_exit+0x19/0xa1b [kvm_intel]
	 SyS_delete_module+0x198/0x1f0
	 entry_SYSCALL_64_fastpath+0x1f/0xc2

  other info that might help us debug this:

   Possible unsafe locking scenario:

	 CPU0                    CPU1
	 ----                    ----
    lock(slab_mutex);
				 lock(s_active#120);
				 lock(slab_mutex);
    lock(s_active#120);

   *** DEADLOCK ***

  2 locks held by rmmod/1211:
   #0:  (cpu_hotplug.dep_map){++++++}, at: [<ffffffff810a7877>] get_online_cpus+0x37/0x80
   #1:  (slab_mutex){+.+.+.}, at: [<ffffffff8120f691>] kmem_cache_destroy+0x41/0x2d0

  stack backtrace:
  CPU: 3 PID: 1211 Comm: rmmod Not tainted 4.10.0-test+ #48
  Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012
  Call Trace:
   print_circular_bug+0x1be/0x210
   __lock_acquire+0x10ed/0x1260
   lock_acquire+0xf6/0x1f0
   __kernfs_remove+0x254/0x320
   kernfs_remove+0x23/0x40
   sysfs_remove_dir+0x51/0x80
   kobject_del+0x18/0x50
   __kmem_cache_shutdown+0x3e6/0x460
   kmem_cache_destroy+0x1fb/0x2d0
   kvm_exit+0x2d/0x80 [kvm]
   vmx_exit+0x19/0xa1b [kvm_intel]
   SyS_delete_module+0x198/0x1f0
   ? SyS_delete_module+0x5/0x1f0
   entry_SYSCALL_64_fastpath+0x1f/0xc2

It'd be the cleanest to deal with the issue by removing sysfs files
without holding slab_mutex before the rest of shutdown; however, given
the current code structure, it is pretty difficult to do so.

This patch punts sysfs file removal to a work item.  Before commit
bf5eb3de38, the removal was punted to a RCU delayed work item which is
executed after release.  Now, we're punting to a different work item on
shutdown which still maintains the goal removing the sysfs files earlier
when destroying kmem_caches.

Link: http://lkml.kernel.org/r/20170620204512.GI21326@htj.duckdns.org
Fixes: bf5eb3de38 ("slub: separate out sysfs_slab_release() from sysfs_slab_remove()")
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Tested-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-23 16:15:55 -07:00
Ilya Matveychikov a91e0f680b lib/cmdline.c: fix get_options() overflow while parsing ranges
When using get_options() it's possible to specify a range of numbers,
like 1-100500.  The problem is that it doesn't track array size while
calling internally to get_range() which iterates over the range and
fills the memory with numbers.

Link: http://lkml.kernel.org/r/2613C75C-B04D-4BFF-82A6-12F97BA0F620@gmail.com
Signed-off-by: Ilya V. Matveychikov <matvejchikov@gmail.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-23 16:15:55 -07:00
Jan Kara 1eb643d02b fs/dax.c: fix inefficiency in dax_writeback_mapping_range()
dax_writeback_mapping_range() fails to update iteration index when
searching radix tree for entries needing cache flushing.  Thus each
pagevec worth of entries is searched starting from the start which is
inefficient and prone to livelocks.  Update index properly.

Link: http://lkml.kernel.org/r/20170619124531.21491-1-jack@suse.cz
Fixes: 9973c98ecf ("dax: add support for fsync/sync")
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-23 16:15:55 -07:00
NeilBrown 9fa4eb8e49 autofs: sanity check status reported with AUTOFS_DEV_IOCTL_FAIL
If a positive status is passed with the AUTOFS_DEV_IOCTL_FAIL ioctl,
autofs4_d_automount() will return

   ERR_PTR(status)

with that status to follow_automount(), which will then dereference an
invalid pointer.

So treat a positive status the same as zero, and map to ENOENT.

See comment in systemd src/core/automount.c::automount_send_ready().

Link: http://lkml.kernel.org/r/871sqwczx5.fsf@notabene.neil.brown.name
Signed-off-by: NeilBrown <neilb@suse.com>
Cc: Ian Kent <raven@themaw.net>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-23 16:15:55 -07:00
Ard Biesheuvel 029c54b095 mm/vmalloc.c: huge-vmap: fail gracefully on unexpected huge vmap mappings
Existing code that uses vmalloc_to_page() may assume that any address
for which is_vmalloc_addr() returns true may be passed into
vmalloc_to_page() to retrieve the associated struct page.

This is not un unreasonable assumption to make, but on architectures
that have CONFIG_HAVE_ARCH_HUGE_VMAP=y, it no longer holds, and we need
to ensure that vmalloc_to_page() does not go off into the weeds trying
to dereference huge PUDs or PMDs as table entries.

Given that vmalloc() and vmap() themselves never create huge mappings or
deal with compound pages at all, there is no correct answer in this
case, so return NULL instead, and issue a warning.

When reading /proc/kcore on arm64, you will hit an oops as soon as you
hit the huge mappings used for the various segments that make up the
mapping of vmlinux.  With this patch applied, you will no longer hit the
oops, but the kcore contents willl be incorrect (these regions will be
zeroed out)

We are fixing this for kcore specifically, so it avoids vread() for
those regions.  At least one other problematic user exists, i.e.,
/dev/kmem, but that is currently broken on arm64 for other reasons.

Link: http://lkml.kernel.org/r/20170609082226.26152-1-ard.biesheuvel@linaro.org
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Laura Abbott <labbott@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: zhong jiang <zhongjiang@huawei.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-23 16:15:55 -07:00
David Rientjes c891d9f6bf mm, thp: remove cond_resched from __collapse_huge_page_copy
This is a partial revert of commit 338a16ba15 ("mm, thp: copying user
pages must schedule on collapse") which added a cond_resched() to
__collapse_huge_page_copy().

On x86 with CONFIG_HIGHPTE, __collapse_huge_page_copy is called in
atomic context and thus scheduling is not possible.  This is only a
possible config on arm and i386.

Although need_resched has been shown to be set for over 100 jiffies
while doing the iteration in __collapse_huge_page_copy, this is better
than doing

	if (in_atomic())
		cond_resched()

to cover only non-CONFIG_HIGHPTE configs.

Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1706191341550.97821@chino.kir.corp.google.com
Signed-off-by: David Rientjes <rientjes@google.com>
Reported-by: Larry Finger <Larry.Finger@lwfinger.net>
Tested-by: Larry Finger <Larry.Finger@lwfinger.net>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-23 16:15:55 -07:00
Linus Torvalds 2592d2ef04 SCSI fixes on 20170622
Two fixes to remove spurious WARN_ONs from the new(ish) qedi driver.
 The driver already prints a warning message, there's no need to panic
 users by printing something that looks like an oops as well.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJZTBWOAAoJEAVr7HOZEZN4g0gP/0DjcJmv8c72CzInKnfB+nF1
 y9zrhOojrTD3OhChI0V82rxIfpFLL7ipMCmnl5zElsB0e8R+8w2EwkwO4VJz9lyu
 mztjX68HxG3UmEA4wG9euixCUvK8ewt21PjbWF+voS0pj1bf1fSobl74J+5zQD2D
 svoB93gqDGAIUVL388NWfwrR8Lo1Ysrc1FMtjIGQcrpx+7lVAKpN7olby7JKQpnx
 2HJn64cUiPWX5qajEPnFOfpj8tNecg9Vq7+z8lIcGKXEQk0neraEejbtBWC2z9ZA
 /WQJvA4Ed5BH2I4tG8Ba+J3Z5YFGMJhIdPD3P3Xs8rWAJ3Kzte2jWxP3o1T0cu0R
 8W3b1EhFmyZbiAbhLf+UOlPGMY5cxej7dMzobH2h985mZAXamBnqO7yKXrGl/mHh
 /ai4bC52AdJitOxObaRzyk7ilx4dfSODUfdbTD3j1gj7mIpFoDkgCqA/GrfDFk34
 pzill8IiRxKntFgcZUPflBYAuvFZvGZHXvLsV6xxsYehMv7c3uxJ6/9rPowjxEAg
 HYmkOH+nQgrgKo27SAviaz9Do9hfxYjJ1hP014DjsCAdtlzb05PhXAhJCAqaPzIG
 UpPxpvE1QATI+ZtudLDt7Fk7uR6ggyvkgUkRoNktf4N8rhlUs3kAVQy3Mda3vVW3
 97NN3G9kppiLATijLG/b
 =ysyU
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Two fixes to remove spurious WARN_ONs from the new(ish) qedi driver.

  The driver already prints a warning message, there's no need to panic
  users by printing something that looks like an oops as well"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: qedi: Remove WARN_ON from clear task context.
  scsi: qedi: Remove WARN_ON for untracked cleanup.
2017-06-23 12:25:37 -07:00
Linus Torvalds 7b249bdc3d Changes since last update:
- don't allow swapon on files on the realtime device, because the swap
   code will swap pages out to blocks on the data device, thereby
   corrupting the filesystem
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCgAGBQJZSzmHAAoJEPh/dxk0SrTroa4P/02EPljuA4pOhYlTrsrKyul4
 7KnVg1AFk2uYlNbEcZjKJTkhMhvCqtENorAWawixezAbSeumft24DgPVmXxXEGRx
 f2ym8UiwEVSdTs2dlP/8HCgrrx3kgaF6H4tYnu4WQxkMkDfE6feTp0TcOsklW8R1
 bR+V+Q9xSJ2WRji9mDBu++3jXKa1VlsOzCRDjnWI7E/ZHJ2n8y412qYxaOHPDvl2
 g5AG7jOtB2D7nDEVtfuEdsuSIBHrUsZ/LWrpDlXMhTY7eJ5ipjvcs6RtMayufNdE
 H5ZeA8bKIJNcpR5Y0MvAb5lQNDA5wg4MTLWfQQ7jlvnI6qaysqWR13UhbfzRBHg8
 YDUUWtuyvq+2/gy94VOn82xKTerD8l+KE+pdZUU99qZDsHVZ0FZ0A2IpSA0ZRdj+
 xYm2WnzIqgMp5OD0Ef+QYzMr0043eBnD1+CDnG/JbHz/S1nqI4KdzH5t2ndMg9YS
 g4sl3qKEwR1ZHnECTu2Q9LWAtF5s8WBgVj3brDG9mdMZXwWYLyGKJDNZ6tsxwOzh
 Z2Pp+6Gs5KRqCt5Acok84KjcS7/XVM0a4w9KOjmlZxZ1K9R5abAePGOT+GEGFP4g
 qO2WOa+wHX2UlUQI+lYg60PFMCBtO41ewptx/1+ZluREyNE24aIRTQttRRdz2twA
 /kF8Uf8eGzPWkyP/uCH3
 =qkCp
 -----END PGP SIGNATURE-----

Merge tag 'xfs-4.12-fixes-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs fixes from Darrick Wong:
 "I have one more bugfix for you for 4.12-rc7 to fix a disk corruption
  problem:

   - don't allow swapon on files on the realtime device, because the
     swap code will swap pages out to blocks on the data device, thereby
     corrupting the filesystem"

* tag 'xfs-4.12-fixes-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: don't allow bmap on rt files
2017-06-23 12:23:06 -07:00
Nicolas Pitre 8887cd9903 sched/rt: Move RT related code from sched/core.c to sched/rt.c
This helps making sched/core.c smaller and hopefully easier to understand and maintain.

Signed-off-by: Nicolas Pitre <nico@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170621182203.30626-3-nicolas.pitre@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-23 10:46:45 +02:00
Nicolas Pitre 06a76fe08d sched/deadline: Move DL related code from sched/core.c to sched/deadline.c
This helps making sched/core.c smaller and hopefully easier to understand and maintain.

Signed-off-by: Nicolas Pitre <nico@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170621182203.30626-2-nicolas.pitre@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-23 10:46:45 +02:00
Nicolas Pitre e1d4eeec5a sched/cpuset: Only offer CONFIG_CPUSETS if SMP is enabled
Make CONFIG_CPUSETS=y depend on SMP as this feature makes no sense
on UP. This allows for configuring out cpuset_cpumask_can_shrink()
and task_can_attach() entirely, which shrinks the kernel a bit.

Signed-off-by: Nicolas Pitre <nico@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170614171926.8345-2-nicolas.pitre@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-23 10:46:44 +02:00
Nicholas Piggin 34f19ff1b5 powerpc/64: Initialise thread_info for emergency stacks
Emergency stacks have their thread_info mostly uninitialised, which in
particular means garbage preempt_count values.

Emergency stack code runs with interrupts disabled entirely, and is
used very rarely, so this has been unnoticed so far. It was found by a
proposed new powerpc watchdog that takes a soft-NMI directly from the
masked_interrupt handler and using the emergency stack. That crashed
at BUG_ON(in_nmi()) in nmi_enter(). preempt_count()s were found to be
garbage.

To fix this, zero the entire THREAD_SIZE allocation, and initialize
the thread_info.

Cc: stable@vger.kernel.org
Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Move it all into setup_64.c, use a function not a macro. Fix
      crashes on Cell by setting preempt_count to 0 not HARDIRQ_OFFSET]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-23 13:25:38 +10:00
Dave Airlie 33ce7563a4 Merge tag 'drm-misc-fixes-2017-06-22' of git://anongit.freedesktop.org/git/drm-misc into drm-fixes
UAPI Changes:
- drm: Fix regression in GETCONNECTOR ioctl returning stale properties (Daniel)

Cc: Daniel Vetter <daniel.vetter@ffwll.ch>

* tag 'drm-misc-fixes-2017-06-22' of git://anongit.freedesktop.org/git/drm-misc:
  drm: Fix GETCONNECTOR regression
2017-06-23 11:44:51 +10:00
Linus Torvalds a38371cba6 Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
 "Various small fixes for stable"

* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
  CIFS: Fix some return values in case of error in 'crypt_message'
  cifs: remove redundant return in cifs_creation_time_get
  CIFS: Improve readdir verbosity
  CIFS: check if pages is null rather than bv for a failed allocation
  CIFS: Set ->should_dirty in cifs_user_readv()
2017-06-22 11:16:55 -07:00
Linus Torvalds 3f7ba7e13e KVM fixes for v4.12-rc7
MIPS:
  - Fix build with KVM, DYNAMIC_DEBUG and JUMP_LABEL.
 
 PPC:
  - Fix host crashes/hangs on POWER9.
  - Properly restore userspace state after KVM_RUN ioctl.
 
 s390:
  - Fix address translation in odd-ball cases (real-space designation
    ASCEs).
 
 x86:
  - Fix privilege escalation in 64-bit Windows guests.
 
 All patches are for stable and the x86 also has a CVE.
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABCAAGBQJZS9uIAAoJEED/6hsPKofou7UH/1AopK/4WzfZqIlObxf1O2K/
 iqeoHlU/7TPz3+YVN4PxCyb9KWxOR1CS6IjmrrQRnl/ncYkFwUI11zb1Dao7mvYo
 L/D4XeT9rLheNATj9RPlznIAbQicN3TFWWczMzR0T2kftHHDAe0rWF1hkyS3BDyY
 n6V6LbG6h6ONacUHUFfDAgRugiI1rKAjKtOeFvylIS5nIe1ez1ocULBxoXVJFxv1
 0XnX/OrWDocGeope0xt6Jmjr7N5cMU0fyjJ+VM4ap8HGmovVUPeXF+cKdaOUyZyS
 L+4goghsHDK8fCrtQiPhL+TqQ7El0OtzzcSScb662vT1wd7haAtrQcv96WFAVE4=
 =Zhvq
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Radim Krčmář:
 "MIPS:
   - Fix build with KVM, DYNAMIC_DEBUG and JUMP_LABEL.

  PPC:
   - Fix host crashes/hangs on POWER9.
   - Properly restore userspace state after KVM_RUN ioctl.

  s390:
   - Fix address translation in odd-ball cases (real-space designation
     ASCEs).

  x86:
   - Fix privilege escalation in 64-bit Windows guests

  All patches are for stable and the x86 also has a CVE"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: fix singlestepping over syscall
  KVM: s390: gaccess: fix real-space designation asce handling for gmap shadows
  KVM: MIPS: Fix maybe-uninitialized build failure
  KVM: PPC: Book3S HV: Ignore timebase offset on POWER9 DD1
  KVM: PPC: Book3S HV: Save/restore host values of debug registers
  KVM: PPC: Book3S HV: Preserve userspace HTM state properly
  KVM: PPC: Book3S HV: Restore critical SPRs to host values on guest exit
  KVM: PPC: Book3S HV: Context-switch EBB registers properly
  KVM: PPC: Book3S HV: Cope with host using large decrementer mode
2017-06-22 11:03:09 -07:00
Linus Torvalds 4f92f0e25a - Bug Fixes
- Use address passed in, rather than hard coded value
   - Correct clock-names value in DT binding documentation
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJZSUUqAAoJEFGvii+H/Hdhw7oP/i2Lks1NeHv+fXR3qWfoYZc6
 QAyAguc1NHRhVi76am+Qr4p5joAg8zrPeRVf8rBzIybS0d834M6ehs87rzkZPo01
 3gkO9nvjWIzbXk9b3IONm8djVvTTEf8dndtfdqp2AD53sSXEm9FOavUB7zVwfOEU
 Tyo/PIOb8xlhK05S+Yqq72gkCMgpQ4YXwzKQ2fvYH5RdHHq8g6hpPExDosrcH84g
 u9Yd9ccNA03E+82rBj0AdNaM3ECm3lHdMLA+6BIolgGpm5PDGBERVYMibPtZCJ00
 t0QbmOXeE8PA6x1hu2tHowz0MpqWdU6IplwxEu2Zd5ycQArjeSp7/zr8p5TNBEnq
 zf7ADGa108hlMW4TKSe+vvsk6ya0G7Rw2QDM2qwp4ZUnU4zEScrbWEVXOX5kt8df
 1oU43358rDDuCMfdHuKxyNgi6rT3b0cke//VGmPGdBlbW7rNBYoDUi3GynSXP38x
 J1L6hEelszr9JDBQDw8s1bfsh0Yux3su+IXHwtBEJd8skEGdWZ9n8AUymEBwCW3o
 41StxTxSutdl/fCu4pde0q2e1KoHNCFRhbxlNY7I69HnxLNzyVfWylEr/TSJb2JN
 qnp1zx96VqGIp4fxSOhvcACm41XIa5+geGeeNde18B1Pc/YxFXuYc6kB7rokkEiS
 GjbpslGVeZcnPCFpyB3c
 =iygS
 -----END PGP SIGNATURE-----

Merge tag 'mfd-fixes-4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd

Pull MFD fixes from Lee Jones:

 - arizona: use address passed in, rather than hard coded value

 - correct STM32 clock-names value in DT binding documentation

* tag 'mfd-fixes-4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd:
  dt-bindings: mfd: Update STM32 timers clock names
  mfd: arizona: Fix typo using hard-coded register
2017-06-22 10:47:29 -07:00
Paolo Bonzini c8401dda2f KVM: x86: fix singlestepping over syscall
TF is handled a bit differently for syscall and sysret, compared
to the other instructions: TF is checked after the instruction completes,
so that the OS can disable #DB at a syscall by adding TF to FMASK.
When the sysret is executed the #DB is taken "as if" the syscall insn
just completed.

KVM emulates syscall so that it can trap 32-bit syscall on Intel processors.
Fix the behavior, otherwise you could get #DB on a user stack which is not
nice.  This does not affect Linux guests, as they use an IST or task gate
for #DB.

This fixes CVE-2017-7518.

Cc: stable@vger.kernel.org
Reported-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-06-22 16:13:29 +02:00
Radim Krčmář d6aa07c169 KVM: s390: fix shadow table handling for nested guests
Some odd-ball cases (real-space designation ASCEs) are handled wrong
 for the shadow page tables. Fix it.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.14 (GNU/Linux)
 
 iQIcBAABAgAGBQJZS6aoAAoJEBF7vIC1phx8Y9gP/0UO7OBobxB10k3SP3aQtisw
 oILlXRxvEskkv6RiTUJGvHwILHiigIVXtIWFIDx+tpX70ifx0/id7KtLiQoEnuqm
 bmt2lsU1VQnO7siJmGvXZvZ4Da6BonlqT6bJkGSHiP2oOGgZByQFlQ3E04ZtyTJ+
 Uc8nSAAsZrZDMCT+2P9OXLT/t3/dGw5C1vI5fiBmyweR4qXjXxGvWw5VtvA0nT4/
 m/vTuEevTymmQeV7LyV0x/Ru3RV9yU2QUVQrctcPrkWicvTdWO/Ml+z+/Q0OHh2A
 5B3sjlS0Dq5qqF6dRlh0YHDV00uuyrOuBSSH2p5Dhdo3+55fy44U243zZayDzarK
 1xrv13iOus0e2GbMxhhB3hWE8E8gw4t9XUyl4ipJdiTGH0IRCS3Wi/yz8aqCS0w/
 1bY858p3cvV+SqTfmeQdvN0ZhcYDGaIPwxheeClE6DKHKr2PBqW2NnSHDBy3tyhD
 5Lz1Xkn0RFxybb8TJhNdY0i2MxprFQdHAvVmhBvLTfspintO0nYygZjPme2OtYhZ
 P7bS2p8F8aR32HyDsN1nUGwwlYpuBAGcwQ/yuGdz11uEfcOPnI40GrWyHakBMzx4
 krrK9WnF7WT1bcqgmB46YvUc+hAuG5smqsUxa1XqLkxOKRvkncYKgYAPj9dg+o8E
 Y+i+/SKxAqhTJcf2loHP
 =SHME
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-master-4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux

KVM: s390: fix shadow table handling for nested guests

Some odd-ball cases (real-space designation ASCEs) are handled wrong
for the shadow page tables. Fix it.
2017-06-22 16:13:06 +02:00
Alistair Popple bbd5ff50af powerpc/powernv/npu-dma: Add explicit flush when sending an ATSD
NPU2 requires an extra explicit flush to an active GPU PID when
sending address translation shoot downs (ATSDs) to reliably flush the
GPU TLB. This patch adds just such a flush at the end of each sequence
of ATSDs.

We can safely use PID 0 which is always reserved and active on the
GPU. PID 0 is only used for init_mm which will never be a user mm on
the GPU. To enforce this we add a check in pnv_npu2_init_context()
just in case someone tries to use PID 0 on the GPU.

Signed-off-by: Alistair Popple <alistair@popple.id.au>
[mpe: Use true/false for bool literals]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-22 21:21:08 +10:00
Heiko Carstens addb63c18a KVM: s390: gaccess: fix real-space designation asce handling for gmap shadows
For real-space designation asces the asce origin part is only a token.
The asce token origin must not be used to generate an effective
address for storage references. This however is erroneously done
within kvm_s390_shadow_tables().

Furthermore within the same function the wrong parts of virtual
addresses are used to generate a corresponding real address
(e.g. the region second index is used as region first index).

Both of the above can result in incorrect address translations. Only
for real space designations with a token origin of zero and addresses
below one megabyte the translation was correct.

Furthermore replace a "!asce.r" statement with a "!*fake" statement to
make it more obvious that a specific condition has nothing to do with
the architecture, but with the fake handling of real space designations.

Fixes: 3218f7094b ("s390/mm: support real-space for gmap shadows")
Cc: David Hildenbrand <david@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-06-22 12:53:34 +02:00
Frederic Weisbecker 387bc8b553 sched/fair: Spare idle load balancing on nohz_full CPUs
Although idle load balancing obviously only concerns idle CPUs, it can
be a disturbance on a busy nohz_full CPU. Indeed a CPU can only get rid
of an idle load balancing duty once a tick fires while it runs a task
and this can take a while on a nohz_full CPU.

We could fix that and escape the idle load balancing duty from the very
idle exit path but that would bring unecessary overhead. Lets just not
bother and leave that job to housekeeping CPUs (those outside nohz_full
range). The nohz_full CPUs simply don't want any disturbance.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1497838322-10913-4-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-22 11:30:02 +02:00
Frederic Weisbecker a0db971e4e nohz: Move idle balancer registration to the idle path
The idle load balancing registration path assumes that we only stop the
tick when the CPU is idle, ignoring the nohz full case. As a result, a
nohz full CPU that is running a task may be chosen to perform idle load
balancing.

Lets make sure that only CPUs in dynticks idle mode can be picked as
idle load balancers.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1497838322-10913-3-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-22 11:30:01 +02:00
Frederic Weisbecker 3c85d6db5e sched/loadavg: Generalize "_idle" naming to "_nohz"
The loadavg naming code still assumes that nohz == idle whereas its code
is actually handling well both nohz idle and nohz full.

So lets fix the naming according to what the code actually does, to
unconfuse the reader.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1497838322-10913-2-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-22 11:30:01 +02:00
Michail Georgios Etairidis 6c782a5ea5 i2c: imx: Use correct function to write to register
The i2c-imx driver incorrectly uses readb()/writeb() to read and
write to the appropriate registers when performing a repeated start.
The appropriate imx_i2c_read_reg()/imx_i2c_write_reg() functions
should be used instead. Performing a repeated start results in
a kernel panic. The platform is imx.

Signed-off-by: Michail G Etairidis <m.etairidis@beck-ipc.com>
Fixes: ce1a78840f ("i2c: imx: add DMA support for freescale i2c driver")
Fixes: 054b62d9f2 ("i2c: imx: fix the i2c bus hang issue when do repeat restart")
Acked-by: Fugang Duan <fugang.duan@nxp.com>
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2017-06-22 10:52:02 +02:00
Linus Torvalds 8d829b9bb8 Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
 "This contains a set of fixes for xen-blkback by way of Konrad, and a
  performance regression fix for blk-mq for shared tags.

  The latter could account for as much as a 50x reduction in
  performance, with the test case from the user with 500 name spaces. A
  more realistic setup on my end with 32 drives showed a 3.5x drop. The
  fix has been thoroughly tested before being committed"

* 'for-linus' of git://git.kernel.dk/linux-block:
  blk-mq: fix performance regression with shared tags
  xen-blkback: don't leak stack data via response ring
  xen/blkback: don't use xen_blkif_get() in xen-blkback kthread
  xen/blkback: don't free be structure too early
  xen/blkback: fix disconnect while I/Os in flight
2017-06-21 22:15:00 -07:00
Darrick J. Wong eb5e248d50 xfs: don't allow bmap on rt files
bmap returns a dumb LBA address but not the block device that goes with
that LBA.  Swapfiles don't care about this and will blindly assume that
the data volume is the correct blockdev, which is totally bogus for
files on the rt subvolume.  This results in the swap code doing IOs to
arbitrary locations on the data device(!) if the passed in mapping is a
realtime file, so just turn off bmap for rt files.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2017-06-21 20:27:35 -07:00
Jarkko Nikula e4330d8bf6 ACPI / scan: Fix enumeration for special SPI and I2C devices
Commit f406270bf7 ("ACPI / scan: Set the visited flag for all
enumerated devices") caused that two group of special SPI or I2C
devices do not enumerate. SPI and I2C devices are expected to be
enumerated by the SPI and I2C subsystems but change caused that
acpi_bus_attach() marks those devices with acpi_device_set_enumerated().

First group of devices are matched using Device Tree compatible property
with special _HID "PRP0001". Those devices have matched scan handler,
acpi_scan_attach_handler() retuns 1 and acpi_bus_attach() marks them
with acpi_device_set_enumerated().

Second group of devices without valid _HID such as "LNXVIDEO" have
device->pnp.type.platform_id set to zero and change again marks them
with acpi_device_set_enumerated().

Fix this by flagging the SPI and I2C devices during struct acpi_device
object initialization time and let the code in acpi_bus_attach() to go
through the device_attach() and acpi_default_enumeration() path for all
SPI and I2C devices.

Fixes: f406270bf7 (ACPI / scan: Set the visited flag for all enumerated devices)
Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: 4.11+ <stable@vger.kernel.org> # 4.11+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-06-21 23:14:55 +02:00
Linus Torvalds 48b6bbef9a Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix refcounting wrt timers which hold onto inet6 address objects,
    from Xin Long.

 2) Fix an ancient bug in wireless wext ioctls, from Johannes Berg.

 3) Firmware handling fixes in brcm80211 driver, from Arend Van Spriel.

 4) Several mlx5 driver fixes (firmware readiness, timestamp cap
    reporting, devlink command validity checking, tc offloading, etc.)
    From Eli Cohen, Maor Dickman, Chris Mi, and Or Gerlitz.

 5) Fix dst leak in IP/IP6 tunnels, from Haishuang Yan.

 6) Fix dst refcount bug in decnet, from Wei Wang.

 7) Netdev can be double freed in register_vlan_device(). Fix from Gao
    Feng.

 8) Don't allow object to be destroyed while it is being dumped in SCTP,
    from Xin Long.

 9) Fix dpaa_eth build when modular, from Madalin Bucur.

10) Fix throw route leaks, from Serhey Popovych.

11) IFLA_GROUP missing from if_nlmsg_size() and ifla_policy[] table,
    also from Serhey Popovych.

12) Fix premature TX SKB free in stmmac, from Niklas Cassel.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (36 commits)
  igmp: add a missing spin_lock_init()
  net: stmmac: free an skb first when there are no longer any descriptors using it
  sfc: remove duplicate up_write on VF filter_sem
  rtnetlink: add IFLA_GROUP to ifla_policy
  ipv6: Do not leak throw route references
  dt-bindings: net: sms911x: Add missing optional VDD regulators
  dpaa_eth: reuse the dma_ops provided by the FMan MAC device
  fsl/fman: propagate dma_ops
  net/core: remove explicit do_softirq() from busy_poll_stop()
  fib_rules: Resolve goto rules target on delete
  sctp: ensure ep is not destroyed before doing the dump
  net/hns:bugfix of ethtool -t phy self_test
  net: 8021q: Fix one possible panic caused by BUG_ON in free_netdev
  cxgb4: notify uP to route ctrlq compl to rdma rspq
  ip6_tunnel: Correct tos value in collect_md mode
  decnet: always not take dst->__refcnt when inserting dst into hash table
  ip6_tunnel: fix potential issue in __ip6_tnl_rcv
  ip_tunnel: fix potential issue in ip_tunnel_rcv
  brcmfmac: fix uninitialized warning in brcmf_usb_probe_phase2()
  net/mlx5e: Avoid doing a cleanup call if the profile doesn't have it
  ...
2017-06-21 12:40:20 -07:00
Linus Torvalds ce879b64a7 Pin control fixes for v4.12, take three:
- Make the AMD driver use a regular interrupt rather than a chained one,
   so the system does not lock up.
 
 - Fix a function call error deep inside the STM32 driver.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJZSQshAAoJEEEQszewGV1zyOMP/iNDNNnLbR3GhrZRA9UwBBqL
 aBYB6zyg4KOWf5Zey93H4MnXe1S0oI7l2W4aTTHTb+F3z4zjYA65AOGPfL5UcPjl
 46xCCgP+tPT3atMV5HdznuP+9dCgbUUjHBtzy0XzhfaRQkJ1XxNU+u82gyuB4JGS
 ZZLxISFcYHDoIb6bKGFssTr6aPe+A5WgxVbV/GntJT9AUEo9kKOo1s2GdS3yNjPl
 MHV4f6cxLMrRqxDsrFPG+uFd0YEsIKe4iBMgE4qBy+ZELWr4D8i2owY6EPLtY17q
 auzlXtFepMGAOMm0EYyQOn2YgKXZyR57sep6cIbeE0f//mgqlvszDT+j9/I7ZPLx
 cFBEI0C1vjw1M2P3/TKgWBMalgZ5mmIwMTKF9qeuVzbTT/leJtUYO5PuAaFj7E62
 bGzx1NEaGqMRoQgxm5kGTgws3TPYzusSq+m+nfY9UQnHdjgkbFh4FBsn54c/PwHM
 5woz+ToAAdCp+eA4DuklBUloiZXjSsLwK8tZBSIlEcKdOJYF2R9vjjhijpUVAZIp
 8TeQWABxcE3NmxCctBV+UGx1geMPyqXvwiyEvc1QOm2iK0Md97YRE0gWDDXE7ne/
 GdEcRCkzGVVl15EtL2u4OgsxejmUdIptelm105eE57k/giGZckUoI9cNpFO5B9/2
 mfIJD46IwgtWYwVJwHqk
 =2zx1
 -----END PGP SIGNATURE-----

Merge tag 'pinctrl-v4.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

Pull more pin control fixes from Linus Walleij:
 "Some late arriving fixes. I should have sent earlier, just swamped
  with work as usual. Thomas patch makes AMD systems usable despite
  firmware bugs so it is fairly important.

   - Make the AMD driver use a regular interrupt rather than a chained
     one, so the system does not lock up.

   - Fix a function call error deep inside the STM32 driver"

* tag 'pinctrl-v4.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: stm32: Fix bad function call
  pinctrl/amd: Use regular interrupt instead of chained
2017-06-21 12:16:12 -07:00
Linus Torvalds db1b5ccd27 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
Pull HID fixes from Jiri Kosina:

 - revert of a commit to magicmouse driver that regressess certain
   devices, from Daniel Stone

 - quirk for a specific Dell mouse, from Sebastian Parschauer

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
  Revert "HID: magicmouse: Set multi-touch keybits for Magic Mouse"
  HID: Add quirk for Dell PIXART OEM mouse
2017-06-21 12:06:29 -07:00
Linus Torvalds dcba71086e Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching
Pull livepatching fix from Jiri Kosina:
 "Fix the way how livepatches are being stacked with respect to RCU,
  from Petr Mladek"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching:
  livepatch: Fix stacking of patches with respect to RCU
2017-06-21 12:02:48 -07:00
Linus Torvalds 021f601980 Merge branch 'ufs-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull more ufs fixes from Al Viro:
 "More UFS fixes, unfortunately including build regression fix for the
  64-bit s_dsize commit. Fixed in this pile:

   - trivial bug in signedness of 32bit timestamps on ufs1

   - ESTALE instead of ufs_error() when doing open-by-fhandle on
     something deleted

   - build regression on 32bit in ufs_new_fragments() - calculating that
     many percents of u64 pulls libgcc stuff on some of those. Mea
     culpa.

   - fix hysteresis loop broken by typo in 2.4.14.7 (right next to the
     location of previous bug).

   - fix the insane limits of said hysteresis loop on filesystems with
     very low percentage of reserved blocks. If it's 5% or less, just
     use the OPTSPACE policy.

   - calculate those limits once and mount time.

  This tree does pass xfstests clean (both ufs1 and ufs2) and it _does_
  survive cross-builds.

  Again, my apologies for missing that, especially since I have noticed
  a related percentage-of-64bit issue in earlier patches (when dealing
  with amount of reserved blocks). Self-LART applied..."

* 'ufs-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  ufs: fix the logics for tail relocation
  ufs_iget(): fail with -ESTALE on deleted inode
  fix signedness of timestamps on ufs1
2017-06-21 11:30:52 -07:00
Helge Deller bd726c90b6 Allow stack to grow up to address space limit
Fix expand_upwards() on architectures with an upward-growing stack (parisc,
metag and partly IA-64) to allow the stack to reliably grow exactly up to
the address space limit given by TASK_SIZE.

Signed-off-by: Helge Deller <deller@gmx.de>
Acked-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-21 11:07:18 -07:00
Hugh Dickins f4cb767d76 mm: fix new crash in unmapped_area_topdown()
Trinity gets kernel BUG at mm/mmap.c:1963! in about 3 minutes of
mmap testing.  That's the VM_BUG_ON(gap_end < gap_start) at the
end of unmapped_area_topdown().  Linus points out how MAP_FIXED
(which does not have to respect our stack guard gap intentions)
could result in gap_end below gap_start there.  Fix that, and
the similar case in its alternative, unmapped_area().

Cc: stable@vger.kernel.org
Fixes: 1be7107fbe ("mm: larger stack guard gap, between vmas")
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Debugged-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-21 10:56:11 -07:00
Jens Axboe 8e8320c931 blk-mq: fix performance regression with shared tags
If we have shared tags enabled, then every IO completion will trigger
a full loop of every queue belonging to a tag set, and every hardware
queue for each of those queues, even if nothing needs to be done.
This causes a massive performance regression if you have a lot of
shared devices.

Instead of doing this huge full scan on every IO, add an atomic
counter to the main queue that tracks how many hardware queues have
been marked as needing a restart. With that, we can avoid looking for
restartable queues, if we don't have to.

Max reports that this restores performance. Before this patch, 4K
IOPS was limited to 22-23K IOPS. With the patch, we are running at
950-970K IOPS.

Fixes: 6d8c6c0f97 ("blk-mq: Restart a single queue if tag sets are shared")
Reported-by: Max Gurtovoy <maxg@mellanox.com>
Tested-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Tested-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-21 10:17:49 -06:00