Commit Graph

1500 Commits

Author SHA1 Message Date
Ben Hutchings 3072601375 powerpc/32: Remove Mac-on-Linux/rtlinux hooks
The symbols exported for use by MOL/rtlinux aren't getting CRCs and I
was about to fix that. But MOL is dead upstream, and the latest work on
it was to make it use KVM instead of its own kernel module. So remove
them instead.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-21 22:09:26 +11:00
Laurent Dufour 819cdcdb7b powerpc/mm: Move mmap_sem unlocking in do_page_fault()
Since the fault retry is now handled earlier, we can release the
mmap_sem lock earlier too and remove later unlocking previously done in
mm_fault_error().

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-21 22:09:18 +11:00
Laurent Dufour 14c02e419a powerpc/mm: Handle VM_FAULT_RETRY earlier
In do_page_fault() if handle_mm_fault() returns VM_FAULT_RETRY, retry
the page fault handling before anything else.

This would simplify the handling of the mmap_sem lock in this part of
the code.

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-21 22:09:11 +11:00
Laurent Dufour c2294e0ffe powerpc/mm: Move mmap_sem unlock up from do_sigbus
Move mmap_sem releasing in the do_sigbus()'s unique caller : mm_fault_error()

No functional changes.

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-21 22:08:53 +11:00
Paul Mackerras fc36a90326 Revert "powerpc/64: Disable use of radix under a hypervisor"
This reverts commit 3f91a89d42.

Now that we do have the machinery for using the radix MMU under a
hypervisor, the extra check and comment introduced in 3f91a89d42 are
no longer correct.  The result is that when booted under a hypervisor
that only allows use of radix, we clear the MMU_FTR_TYPE_RADIX and
then set it again, and print a warning about ignoring the
disable_radix command line option, even though the command line does
not include "disable_radix".

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-21 16:50:28 +11:00
Linus Torvalds f7d6a7283a powerpc fixes for 4.11 #3
Five fairly small fixes for things that went in this cycle.
 
 A fairly large patch to rework the CAS logic on Power9, necessitated by a late
 change to the firmware API, and we can't boot without it.
 
 Three fixes going to stable, allowing more instructions to be emulated on LE,
 fixing a boot crash on 32-bit Freescale BookE machines, and the OPAL XICS
 workaround.
 
 And a patch from me to sort the selects under CONFIG PPC. Annoying churn, but
 worth it in the long run, and best for it to go in now to avoid conflicts.
 
 Thanks to:
   Alexey Kardashevskiy, Anton Blanchard, Balbir Singh, Gautham R. Shenoy,
   Laurentiu Tudor, Nicholas Piggin, Paul Mackerras, Ravi Bangoria, Sachin Sant,
   Shile Zhang, Suraj Jitindar Singh.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJYvqSxAAoJEFHr6jzI4aWAjMQP/06OFGz3VQvO5Q8jPsqRF22y
 Wr+04OKFmKnYVObdQk15HGOagp1fSkWWHfP/eu50kx1WNCzq7tQdLjNSi7H4F3s1
 4NwlaOfSQoxctsVtfnITJkfVScjcxK7XVagswtb3wvBpBx4lwD8fGwxkSxj6NhRw
 PNxLi44wobb8mDyR6L/6tJKBI2Jt12qXZY+kBQIleun5+lF8fNXIu4qPiglMOia6
 oPhXlp4RASt8wz74H8JuMTwGv17MxG+zvbkDPwQC7PI/fohJLybgWEfByN4H5UMy
 7Xi/lWHlShAyc7ulAIN+A1mHKY9LSv45U6qrrHFUJgRftZihoZHe6ekcI+h5oFVX
 chP9oUrQNeeZ5QqUC4rYdWwsMfiXBI0y5+BCupItixXc1LANBH9Ym9IECbgPRP93
 LQVqiS4958KijHlYBOA2zPicl/FnVO16orqakyRS0B3lQ54XBvhcgG8gIXjQr8PM
 Mt2W4r6RtGJ4ddhUPpF/W4lEuR4+dmXfEqs7DkgBKRbvi8XYkiLx2byBNh/OMRUG
 T4ILXsYf50AKRAq/jFTs9A0zkjtmtBeDdn96Mcan8i3WZuTQ7b8mQlC46zEg23A8
 XmTG2xt7N1dMjjwS78CfnvQ8sIVtA9AUfK37aTc0ICMsBCqEcWLAhHKZyCw0h25C
 wq9BMn4e5Gdg2xLTHKlL
 =SxON
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 "Five fairly small fixes for things that went in this cycle.

  A fairly large patch to rework the CAS logic on Power9, necessitated
  by a late change to the firmware API, and we can't boot without it.

  Three fixes going to stable, allowing more instructions to be emulated
  on LE, fixing a boot crash on 32-bit Freescale BookE machines, and the
  OPAL XICS workaround.

  And a patch from me to sort the selects under CONFIG PPC. Annoying
  churn, but worth it in the long run, and best for it to go in now to
  avoid conflicts.

  Thanks to:
    Alexey Kardashevskiy, Anton Blanchard, Balbir Singh, Gautham R.
    Shenoy, Laurentiu Tudor, Nicholas Piggin, Paul Mackerras, Ravi
    Bangoria, Sachin Sant, Shile Zhang, Suraj Jitindar Singh"

* tag 'powerpc-4.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc: Sort the selects under CONFIG_PPC
  powerpc/64: Fix L1D cache shape vector reporting L1I values
  powerpc/64: Avoid panic during boot due to divide by zero in init_cache_info()
  powerpc: Update to new option-vector-5 format for CAS
  powerpc: Parse the command line before calling CAS
  powerpc/xics: Work around limitations of OPAL XICS priority handling
  powerpc/64: Fix checksum folding in csum_add()
  powerpc/powernv: Fix opal tracepoints with JUMP_LABEL=n
  powerpc/booke: Fix boot crash due to null hugepd
  powerpc: Fix compiling a BE kernel with a powerpc64le toolchain
  selftest/powerpc: Fix false failures for skipped tests
  powerpc/powernv: Fix bug due to labeling ambiguity in power_enter_stop
  powerpc/64: Invalidate process table caching after setting process table
  powerpc: emulate_step() tests for load/store instructions
  powerpc: Emulation support for load/store instructions on LE
2017-03-07 10:46:10 -08:00
Suraj Jitindar Singh 014d02cbf1 powerpc: Update to new option-vector-5 format for CAS
On POWER9 the ibm,client-architecture-support (CAS) negotiation process
has been updated to change how the host to guest negotiation is done for
the new hash/radix mmu as well as the nest mmu, process tables and guest
translation shootdown (GTSE).

This is documented in the unreleased PAPR ACR "CAS option vector
additions for P9".

The host tells the guest which options it supports in
ibm,arch-vec-5-platform-support. The guest then chooses a subset of these
to request in the CAS call and these are agreed to in the
ibm,architecture-vec-5 property of the chosen node.

Thus we read ibm,arch-vec-5-platform-support and make our selection before
calling CAS. We then parse the ibm,architecture-vec-5 property of the
chosen node to check whether we should run as hash or radix.

ibm,arch-vec-5-platform-support format:

index value pairs: <index, val> ... <index, val>

index: Option vector 5 byte number
val:   Some representation of supported values

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Acked-by: Paul Mackerras <paulus@ozlabs.org>
[mpe: Don't print about unknown options, be consistent with OV5_FEAT]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-06 21:44:09 +11:00
Paul Mackerras 7a70d7288c powerpc/64: Invalidate process table caching after setting process table
The POWER9 MMU reads and caches entries from the process table.
When we kexec from one kernel to another, the second kernel sets
its process table pointer but doesn't currently do anything to
make the CPU invalidate any cached entries from the old process table.
This adds a tlbie (TLB invalidate entry) instruction with parameters
to invalidate caching of the process table after the new process
table is installed.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-03 11:24:50 +11:00
Ingo Molnar 589ee62844 sched/headers: Prepare to remove the <linux/mm_types.h> dependency from <linux/sched.h>
Update code that relied on sched.h including various MM types for them.

This will allow us to remove the <linux/mm_types.h> include from <linux/sched.h>.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:37 +01:00
Ingo Molnar 68db0cf106 sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task_stack.h>
We are going to split <linux/sched/task_stack.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder <linux/sched/task_stack.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:36 +01:00
Ingo Molnar 010426079e sched/headers: Prepare for new header dependencies before moving more code to <linux/sched/mm.h>
We are going to split more MM APIs out of <linux/sched.h>, which
will have to be picked up from a couple of .c files.

The APIs that we are going to move are:

  arch_pick_mmap_layout()
  arch_get_unmapped_area()
  arch_get_unmapped_area_topdown()
  mm_update_next_owner()

Include the header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:30 +01:00
Ingo Molnar 3f07c01441 sched/headers: Prepare for new header dependencies before moving code to <linux/sched/signal.h>
We are going to split <linux/sched/signal.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder <linux/sched/signal.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:29 +01:00
Linus Torvalds b286cedd47 powerpc updates for 4.11 part 2
Highlights include:
 
  - An update of the disassembly code used by xmon to the latest versions in
    binutils. We've received permission from all the authors of the relevant
    binutils changes to relicense their changes to the relevant files from GPLv3
    to GPLv2, for inclusion in Linux. Thanks to Peter Bergner for doing the leg
    work to get permission from everyone.
 
  - Addition of the "architected" Power9 CPU table entry, allowing us to boot
    in Power9 architected mode under a hypervisor.
 
  - Updates to the Power9 PMU code.
 
  - Implementation of clear_bit_unlock_is_negative_byte() to optimise
    unlock_page().
 
  - Freescale updates from Scott: "Highlights include 8xx breakpoints and perf,
    t1042rdb display support, and board updates."
 
 Thanks to:
   Al Viro, Andrew Donnellan, Aneesh Kumar K.V, Balbir Singh, Douglas Miller,
   Frédéric Weisbecker, Gavin Shan, Madhavan Srinivasan, Michael Roth, Nathan
   Fontenot, Naveen N. Rao, Nicholas Piggin, Peter Bergner, Paul E. McKenney,
   Rashmica Gupta, Russell Currey, Sahil Mehta, Stewart Smith.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJYthsKAAoJEFHr6jzI4aWAaWMQAJ7mAwX98ncoYschPgRmmIun
 f6DtE4IonrxiZ22gp1ct4+c9OFtA+B5FXMcEhOKpfh93lg38PTDjHs9e5kfauD7+
 oTQ2Bg1eXaL48FKdmC5Vs4Kt+/J8e9guGafUC1OVIpTyyRPoZeUDH0lx+kSPV5bd
 PkL+wY/k3W0Njo8WgD1P9u3W15+BxISo/k8c7ajzKTHGBZlAvj5h2gO6XUBNMLyy
 YClB/qIymjZriSB+AeWYD79k8gPbBZPsmZG0ZF1hY060894LgqLB9mPOJdffx/DY
 H7/uP6jcsRDOXTOmyueW1SEmPoQbtysiMd1lNrCXKtC/Okr5uhn2cUhi88AsgWvd
 1QFly2lobcDAKPah/yB7YQGMAcmYvGGNuqrWaosaV2T7r0KprzUYYgCOqzvC3WSJ
 QtVatBzMIqRTMYq+3U4G1aHeCXlRazVQHDuvPby8RdR5b2gIexiqMab2eS7tSMIH
 mCOIunRIvT14g/7wxUV7tahN+ifncNxzAk4DvPO+Wc4FQ4sy7wArv2YipSaWRWtE
 u7tNdBkEwlDkKhJgRU5T0Op2PyMbHwCP8pWuz7PQIhKIcgwmP9wb07BIWG/GGIqn
 07TxJYX2ItabyEMZMsYhzILZqjLyiAaCARANB7ScbQbdP8wdcGZcwismhwnfROIU
 NuxsZg63BUDMoxk7Sauu
 =rspd
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull more powerpc updates from Michael Ellerman:
 "Highlights include:

   - an update of the disassembly code used by xmon to the latest
     versions in binutils. We've received permission from all the
     authors of the relevant binutils changes to relicense their changes
     to the relevant files from GPLv3 to GPLv2, for inclusion in Linux.
     Thanks to Peter Bergner for doing the leg work to get permission
     from everyone.

   - addition of the "architected" Power9 CPU table entry, allowing us
     to boot in Power9 architected mode under a hypervisor.

   - updates to the Power9 PMU code.

   - implementation of clear_bit_unlock_is_negative_byte() to optimise
     unlock_page().

   - Freescale updates from Scott: "Highlights include 8xx breakpoints
     and perf, t1042rdb display support, and board updates."

  Thanks to:
    Al Viro, Andrew Donnellan, Aneesh Kumar K.V, Balbir Singh, Douglas
    Miller, Frédéric Weisbecker, Gavin Shan, Madhavan Srinivasan,
    Michael Roth, Nathan Fontenot, Naveen N. Rao, Nicholas Piggin, Peter
    Bergner, Paul E. McKenney, Rashmica Gupta, Russell Currey, Sahil
    Mehta, Stewart Smith"

* tag 'powerpc-4.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (48 commits)
  powerpc: Remove leftover cputime_to_nsecs call causing build error
  powerpc/mm/hash: Always clear UPRT and Host Radix bits when setting up CPU
  powerpc/optprobes: Fix TOC handling in optprobes trampoline
  powerpc/pseries: Advertise Hot Plug Event support to firmware
  cxl: fix nested locking hang during EEH hotplug
  powerpc/xmon: Dump memory in CPU endian format
  powerpc/pseries: Revert 'Auto-online hotplugged memory'
  powerpc/powernv: Make PCI non-optional
  powerpc/64: Implement clear_bit_unlock_is_negative_byte()
  powerpc/powernv: Remove unused variable in pnv_pci_sriov_disable()
  powerpc/kernel: Remove error message in pcibios_setup_phb_resources()
  powerpc/mm: Fix typo in set_pte_at()
  pci/hotplug/pnv-php: Disable MSI and PCI device properly
  pci/hotplug/pnv-php: Disable surprise hotplug capability on conflicts
  pci/hotplug/pnv-php: Remove WARN_ON() in pnv_php_put_slot()
  powerpc: Add POWER9 architected mode to cputable
  powerpc/perf: use is_kernel_addr macro in perf_get_misc_flags()
  powerpc/perf: Avoid FAB_*_MATCH checks for power9
  powerpc/perf: Add restrictions to PMC5 in power9 DD1
  powerpc/perf: Use Instruction Counter value
  ...
2017-03-01 10:10:16 -08:00
Linus Torvalds 38705613b7 powerpc updates for 4.11 part 1.
Highlights include:
 
  - Support for direct mapped LPC on POWER9, giving Linux direct access to
    devices that may be on there such as a UART.
 
  - Memory hotplug support for the Power9 Radix MMU.
 
  - Add new AUX vectors describing the processor's cache geometry, to be used by
    glibc.
 
  - The ability for a guest to ask the hypervisor to resize the guest's hash
    table, and in addition support for doing so automatically when memory is
    hotplugged into/out-of the guest. This allows the hash table to be sized
    based on the current memory usage of the guest, rather than the maximum
    possible memory usage.
 
  - Implementation of optprobes (kprobe optimisation) for powerpc.
 
 In addition there's the topic branch shared with the KVM tree, which includes
 support for guests to use the Radix MMU on Power9.
 
 Thanks to:
   Alistair Popple, Andrew Donnellan, Aneesh Kumar K.V, Anju T, Anton Blanchard,
   Benjamin Herrenschmidt, Chris Packham, Daniel Axtens, Daniel Borkmann, David
   Gibson, Finn Thain, Gautham R. Shenoy, Gavin Shan, Greg Kurz, Joel Stanley,
   John Allen, Madhavan Srinivasan, Mahesh Salgaonkar, Markus Elfring, Michael
   Neuling, Nathan Fontenot, Naveen N. Rao, Nicholas Piggin, Paul Mackerras, Ravi
   Bangoria, Reza Arbab, Shailendra Singh, Vaibhav Jain, Wei Yongjun.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJYrWj0AAoJEFHr6jzI4aWAsn4P/08Kz3TtOvDuuPGVNoO7fWOn
 ag5/zVt8R4FuCALqWpAZbVqMuUU4wLxG0RuWlmBNYYhrjMC6JxHpOSjQXxM2D7YT
 CdGTJxG414r6HMOeToL9i/z33o0m+KT07tscer+QMKlXVKCR2z0fEJch+zoPBHYA
 5eVBFpLLTtbNiX9UnrcM/vYz61d56kT4YJey9/8qbAkTAc1rMPa8ucU5UiKYJ7yX
 mF8cd7WE+7aqif9V8yN59G2rcbz+h3pbMw/gzImiYsYrUj4fLjU+VTKL5PPT+UFy
 WWBXD3MAMm1dksMMZi3hgoo2BZhDn3RkymeYi6Jo4kDknNMPZzkMxGyvaJ8Eq5H/
 bIYXdS1AbTtvaaEEuWqDFjpnChOEvj/8IeqitU0jjlql8BVjNKg/ESaaKucZr+pO
 pk2Mfvw0Tb/lxJT5qj27yq4aRsxJwdFOoPYCN7MquPp/wV2Tg5M6h4nVQ4T6Wl0S
 tMFQeCqXflhDWh0Xgr2UXpF66/YTj3Du5LasOTkgGeU30Z8TcNGFEmDWShKP3cEm
 e0oQE+OHhIGN4WSBXBAto/Gw8/0v3pXlMs+VEfeHqenwPss2sWtSwXWe8khmiy9e
 DJ48sTzj75/Zx1fiqRldw9YEnrL+NK0eOOpvzxeyKpfvdUytc+chFfEqUmO6kl3Z
 DW2UmlZxmW+b0SfexCHL
 =Icle
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:
 "Highlights include:

   - Support for direct mapped LPC on POWER9, giving Linux direct access
     to devices that may be on there such as a UART.

   - Memory hotplug support for the Power9 Radix MMU.

   - Add new AUX vectors describing the processor's cache geometry, to
     be used by glibc.

   - The ability for a guest to ask the hypervisor to resize the guest's
     hash table, and in addition support for doing so automatically when
     memory is hotplugged into/out-of the guest. This allows the hash
     table to be sized based on the current memory usage of the guest,
     rather than the maximum possible memory usage.

   - Implementation of optprobes (kprobe optimisation) for powerpc.

  In addition there's the topic branch shared with the KVM tree, which
  includes support for guests to use the Radix MMU on Power9.

  Thanks to:
    Alistair Popple, Andrew Donnellan, Aneesh Kumar K.V, Anju T, Anton
    Blanchard, Benjamin Herrenschmidt, Chris Packham, Daniel Axtens,
    Daniel Borkmann, David Gibson, Finn Thain, Gautham R. Shenoy, Gavin
    Shan, Greg Kurz, Joel Stanley, John Allen, Madhavan Srinivasan,
    Mahesh Salgaonkar, Markus Elfring, Michael Neuling, Nathan Fontenot,
    Naveen N. Rao, Nicholas Piggin, Paul Mackerras, Ravi Bangoria, Reza
    Arbab, Shailendra Singh, Vaibhav Jain, Wei Yongjun"

* tag 'powerpc-4.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (129 commits)
  powerpc/mm/radix: Skip ptesync in pte update helpers
  powerpc/mm/radix: Use ptep_get_and_clear_full when clearing pte for full mm
  powerpc/mm/radix: Update pte update sequence for pte clear case
  powerpc/mm: Update PROTFAULT handling in the page fault path
  powerpc/xmon: Fix data-breakpoint
  powerpc/mm: Fix build break with BOOK3S_64=n and MEMORY_HOTPLUG=y
  powerpc/mm: Fix build break when CMA=n && SPAPR_TCE_IOMMU=y
  powerpc/mm: Fix build break with RADIX=y & HUGETLBFS=n
  powerpc/pseries: Fix typo in parameter description
  powerpc/kprobes: Remove kprobe_exceptions_notify()
  kprobes: Introduce weak variant of kprobe_exceptions_notify()
  powerpc/ftrace: Fix confusing help text for DISABLE_MPROFILE_KERNEL
  powerpc/powernv: Fix opal_exit tracepoint opcode
  powerpc: Add a prototype for mcount() so it can be versioned
  powerpc: Drop GPL from of_node_to_nid() export to match other arches
  powerpc/kprobes: Optimize kprobe in kretprobe_trampoline()
  powerpc/kprobes: Implement Optprobes
  powerpc/kprobes: Fixes for kprobe_lookup_name() on BE
  powerpc: Add helper to check if offset is within relative branch range
  powerpc/bpf: Introduce __PPC_SH64()
  ...
2017-02-22 10:30:38 -08:00
Gavin Shan c618f6b188 powerpc/mm: Fix typo in set_pte_at()
This fixes the typo about the _PAGE_PTE in set_pte_at() by changing
"tryint" to "trying to".

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-02-17 22:16:25 +11:00
Michael Ellerman a90e883d8b powerpc/mm: Blacklist SLB symbols from kprobe
We can't sensibly take a trap at this point. So, blacklist these
symbols.

Reported-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-02-17 10:58:52 +11:00
Michael Ellerman e471c393df powerpc/mm: Convert slb_finish_load[_1T] to local symbols
slb_finish_load and slb_finish_load_1T are both only used within
slb_low.S, so make them local symbols.

This makes the code a little clearer, as it's more obvious neither is
intended to be an entry point from arbitrary other code, only the uses
in this file.

It also prevents them being used with kprobes and other tracing tools,
which is good because we're not able to safely take traps at these
locations, so making them local symbols avoids us needing to blacklist
them.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-02-17 10:58:51 +11:00
Paul Mackerras 3f91a89d42 powerpc/64: Disable use of radix under a hypervisor
Currently, if the kernel is running on a POWER9 processor under a
hypervisor, it may try to use the radix MMU even though it doesn't have
the necessary code to do so (it doesn't negotiate use of radix, and it
doesn't do the H_REGISTER_PROC_TBL hcall).  If the hypervisor supports
both radix and HPT, then it will set up the guest to use HPT (since the
guest doesn't request radix in the CAS call), but if the radix feature
bit is set in the ibm,pa-features property (which is valid, since
ibm,pa-features is defined to represent the capabilities of the
processor) the guest will try to use radix, resulting in a crash when
it turns the MMU on.

This makes the minimal fix for the current code, which is to disable
radix unless we are running in hypervisor mode.

Fixes: 2bfd65e45e ("powerpc/mm/radix: Add radix callbacks for early init routines")
Cc: stable@vger.kernel.org # v4.7+
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-02-16 14:01:42 +11:00
Aneesh Kumar K.V 18061c17c8 powerpc/mm: Update PROTFAULT handling in the page fault path
With radix, we can get page fault with DSISR_PROTFAULT value set in case of
PROT_NONE or autonuma mapping. The PROT_NONE case in handled by the vma check
where we consider the access bad. For autonuma we should fall through and fixup
the access mask correctly.

Without this patch we trigger the WARN_ON() on radix. This code moves that
WARN_ON() within a radix_enabled() check. I also moved the WARN_ON() outside
the if condition making it apply for all type of faults (exec/write/read). It
is also conditionalized for book3s, because BOOK3E can also get a PROTFAULT to
handle the D/I cache sync.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-02-15 20:02:39 +11:00
Michael Ellerman da0e7e6276 Merge branch 'topic/ppc-kvm' into next
Merge the topic branch we're sharing with the kvm-ppc tree.
2017-02-14 17:18:29 +11:00
Michael Ellerman a05ef161cd powerpc/mm: Fix build break when CMA=n && SPAPR_TCE_IOMMU=y
Currently the build breaks if CMA=n and SPAPR_TCE_IOMMU=y:

  arch/powerpc/mm/mmu_context_iommu.c: In function ‘mm_iommu_get’:
  arch/powerpc/mm/mmu_context_iommu.c:193:42: error: ‘MIGRATE_CMA’ undeclared (first use in this function)
  if (get_pageblock_migratetype(page) == MIGRATE_CMA) {
  ^~~~~~~~~~~

Fix it by using the existing is_migrate_cma_page(), which evaulates to
false when CMA=n.

Fixes: 2e5bbb5461 ("KVM: PPC: Book3S HV: Migrate pinned pages out of CMA")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-02-14 16:54:22 +11:00
Shailendra Singh be9ba9ff93 powerpc: Drop GPL from of_node_to_nid() export to match other arches
The generic implementation of of_node_to_nid() is EXPORT_SYMBOL, added
in commit 298535c00a ("of, numa: Add NUMA of binding
implementation.").

The powerpc implementation added in commit 953039c8df ("[PATCH]
powerpc: Allow devices to register with numa topology") is
EXPORT_SYMBOL_GPL.

This creates an inconsistency for of_node_to_nid() callers across
architectures.

Update the powerpc implementation to be exported consistently with the
generic implementation.

Signed-off-by: Shailendra Singh <shailendras@nvidia.com>
Reviewed-by: Andy Ritger <aritger@nvidia.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-02-10 13:28:05 +11:00
David Gibson 438cc81a41 powerpc/pseries: Automatically resize HPT for memory hot add/remove
We've now implemented code in the pseries platform to use the new PAPR
interface to allow resizing the hash page table (HPT) at runtime.

This patch uses that interface to automatically attempt to resize the HPT
when memory is hot added or removed.  This tries to always keep the HPT at
a reasonable size for our current memory size.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-02-10 13:28:02 +11:00
David Gibson dbcf929c00 powerpc/pseries: Add support for hash table resizing
This adds support for using two hypercalls to change the size of the
main hash page table while running as a PAPR guest. For now these
hypercalls are only in experimental qemu versions.

The interface is two part: first H_RESIZE_HPT_PREPARE is used to
allocate and prepare the new hash table. This may be slow, but can be
done asynchronously. Then, H_RESIZE_HPT_COMMIT is used to switch to the
new hash table. This requires that no CPUs be concurrently updating the
HPT, and so must be run under stop_machine().

This also adds a debugfs file which can be used to manually control
HPT resizing or testing purposes.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Paul Mackerras <paulus@samba.org>
[mpe: Rename the debugfs file to "hpt_order"]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-02-10 13:27:55 +11:00
Benjamin Herrenschmidt 90c1e3c2fa powerpc/mm/radix: Update ERAT flushes when invalidating TLB
Three tiny changes to the ERAT flushing logic: First don't make
it depend on DD1. It hasn't been decided yet but we might run
DD2 in a mode that also requires explicit flushes for performance
reasons so make it unconditional. We also add a missing isync, and
finally remove the flush from _tlbiel_va as it is only necessary
for congruence-class invalidations (PID, LPID and full TLB), not
targetted invalidations.

Fixes: 96ed1fe511 ("powerpc/mm/radix: Invalidate ERAT on tlbiel for POWER9 DD1")
Cc: stable@vger.kernel.org # v4.9+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-02-09 14:54:33 +11:00
Benjamin Herrenschmidt d7df2443cd powerpc/mm: Fix spurrious segfaults on radix with autonuma
When autonuma (Automatic NUMA balancing) marks a PTE inaccessible it
clears all the protection bits but leave the PTE valid.

With the Radix MMU, an attempt at executing from such a PTE will
take a fault with bit 35 of SRR1 set "SRR1_ISI_N_OR_G".

It is thus incorrect to treat all such faults as errors. We should
pass them to handle_mm_fault() for autonuma to deal with. The case
of pages that are really not executable is handled by the existing
test for VM_EXEC further down.

That leaves us with catching the kernel attempts at executing user
pages. We can catch that earlier, even before we do find_vma.

It is never valid on powerpc for the kernel to take an exec fault
to begin with. So fold that test with the existing test for the
kernel faulting on kernel addresses to bail out early.

Fixes: 1d18ad0268 ("powerpc/mm: Detect instruction fetch denied and report")
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-02-08 23:36:29 +11:00
Paul Mackerras 16ed141677 powerpc/64: Make type of partition table flush depend on partition type
When changing a partition table entry on POWER9, we do a particular
form of the tlbie instruction which flushes all TLBs and caches of
the partition table for a given logical partition ID (LPID).
This instruction has a field in the instruction word, labelled R
(radix), which should be 1 if the partition was previously a radix
partition and 0 if it was a HPT partition.  This implements that
logic.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-01-31 19:11:46 +11:00
Paul Mackerras ba9b399aee powerpc/64: Export pgtable_cache and pgtable_cache_add for KVM
This exports the pgtable_cache array and the pgtable_cache_add
function so that HV KVM can use them for allocating radix page
tables for guests.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-01-31 19:11:45 +11:00
Paul Mackerras cc3d294013 powerpc/64: Enable use of radix MMU under hypervisor on POWER9
To use radix as a guest, we first need to tell the hypervisor via
the ibm,client-architecture call first that we support POWER9 and
architecture v3.00, and that we can do either radix or hash and
that we would like to choose later using an hcall (the
H_REGISTER_PROC_TBL hcall).

Then we need to check whether the hypervisor agreed to us using
radix.  We need to do this very early on in the kernel boot process
before any of the MMU initialization is done.  If the hypervisor
doesn't agree, we can't use radix and therefore clear the radix
MMU feature bit.

Later, when we have set up our process table, which points to the
radix tree for each process, we need to install that using the
H_REGISTER_PROC_TBL hcall.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-01-31 19:11:44 +11:00
Paul Mackerras 18569c1f13 powerpc/64: Don't try to use radix MMU under a hypervisor
Currently, if the kernel is running on a POWER9 processor under a
hypervisor, it will try to use the radix MMU even though it doesn't have
the necessary code to use radix under a hypervisor (it doesn't negotiate
use of radix, and it doesn't do the H_REGISTER_PROC_TBL hcall). The
result is that the guest kernel will crash when it tries to turn on the
MMU.

This fixes it by looking for the /chosen/ibm,architecture-vec-5
property, and if it exists, clears the radix MMU feature bit, before we
decide whether to initialize for radix or HPT. This property is created
by the hypervisor as a result of the guest calling the
ibm,client-architecture-support method to indicate its capabilities, so
it will indicate whether the hypervisor agreed to us using radix.

Systems without a hypervisor may have this property also (for example,
skiboot creates it), so we check the HV bit in the MSR to see whether we
are running as a guest or not. If we are in hypervisor mode, then we can
do whatever we like including using the radix MMU.

The reason for using this property is that in future, when we have
support for using radix under a hypervisor, we will need to check this
property to see whether the hypervisor agreed to us using radix.

Fixes: 2bfd65e45e ("powerpc/mm/radix: Add radix callbacks for early init routines")
Cc: stable@vger.kernel.org # v4.7+
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-01-31 19:11:43 +11:00
Reza Arbab 0d0a4bc2a6 powerpc/mm: unstub radix__vmemmap_remove_mapping()
Use remove_pagetable() and friends for radix vmemmap removal.

We do not require the special-case handling of vmemmap done in the x86
versions of these functions. This is because vmemmap_free() has already
freed the mapped pages, and calls us with an aligned address range.

So, add a few failsafe WARNs, but otherwise the code to remove physical
mappings is already sufficient for vmemmap.

Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-01-31 13:54:20 +11:00
Reza Arbab 4b5d62ca17 powerpc/mm: add radix__remove_section_mapping()
Tear down and free the four-level page tables of physical mappings
during memory hotremove.

Borrow the basic structure of remove_pagetable() and friends from the
identically-named x86 functions. Reduce the frequency of tlb flushes and
page_table_lock spinlocks by only doing them in the outermost function.
There was some question as to whether the locking is needed at all.
Leave it for now, but we could consider dropping it.

Memory must be offline to be removed, thus not in use. So there
shouldn't be the sort of concurrent page walking activity here that
might prompt us to use RCU.

Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-01-31 13:54:19 +11:00
Reza Arbab 6cc27341b2 powerpc/mm: add radix__create_section_mapping()
Wire up memory hotplug page mapping for radix. Share the mapping
function already used by radix_init_pgtable().

Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-01-31 13:54:19 +11:00
Reza Arbab b5200ec9ed powerpc/mm: refactor radix physical page mapping
Move the page mapping code in radix_init_pgtable() into a separate
function that will also be used for memory hotplug.

The current goto loop progressively decreases its mapping size as it
covers the tail of a range whose end is unaligned. Change this to a for
loop which can do the same for both ends of the range.

Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-01-31 13:54:18 +11:00
Alistair Popple 1d0761d255 powerpc/powernv: Initialise nest mmu
POWER9 contains an off core mmu called the nest mmu (NMMU). This is
used by other hardware units on the chip to translate virtual
addresses into real addresses. The unit attempting an address
translation provides the majority of the context required for the
translation request except for the base address of the partition table
(ie. the PTCR) which needs to be programmed into the NMMU.

This patch adds a call to OPAL to set the PTCR for the nest mmu in
opal_init().

Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-01-30 20:24:33 +11:00
Reza Arbab 2a8628d416 powerpc/mm: Allow memory hotplug into an offline node
Relax the check preventing us from hotplugging into an offline node.

This limitation was added in commit 482ec7c403 ("[PATCH] powerpc numa:
Support sparse online node map") to prevent adding resources to an
uninitialized node.

These days, there is no harm in doing so. The addition will actually
cause the node to be initialized and onlined; add_memory_resource()
calls hotadd_new_pgdat() (if necessary) and node_set_online().

Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-01-30 16:49:36 +11:00
Reza Arbab 7656cd8e8e powerpc/mm: Simplify loop control in parse_numa_properties()
The flow of the main loop in parse_numa_properties() is overly
complicated. Simplify it to be less confusing and easier to read.
No functional change.

The end of the main loop in parse_numa_properties() looks like this:

	for_each_node_by_type(...) {
		...
		if (!condition) {
			if (--ranges)
				goto new_range;
			else
				continue;
		}

		statement();

		if (--ranges)
			goto new_range;
		/* else
		 *	continue; <- implicit, this is the end of the loop
		 */
	}

The only effect of !condition is to skip execution of statement(). This
can be rewritten in a simpler way:

	for_each_node_by_type(...) {
		...
		if (condition)
			statement();

		if (--ranges)
			goto new_range;
	}

Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-01-30 16:49:30 +11:00
Reza Arbab a0615a16f7 powerpc/mm: Use the correct pointer when setting a 2MB pte
When setting a 2MB pte, radix__map_kernel_page() is using the address

	ptep = (pte_t *)pudp;

Fix this conversion to use pmdp instead. Use pmdp_ptep() to do this
instead of casting the pointer.

Fixes: 2bfd65e45e ("powerpc/mm/radix: Add radix callbacks for early init routines")
Cc: stable@vger.kernel.org # v4.7+
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-01-30 15:35:13 +11:00
Markus Elfring a967f161ab powerpc/mm: Return directly after a failed __copy_from_user() in sys_subpage_prot()
This function already has multiple exit points, so there's no harm
adding another. Although it looks odd to return directly in a function
which takes a lock, we've actually just dropped the mmap_sem in this
code, so there's really no reason to go via a label. And it means we can
drop the unhelpfully named out2 label.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
[mpe: Rewrite change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-01-25 13:34:22 +11:00
Aneesh Kumar K.V 9859f0c7e0 powerpc/mm: Remove the debug hugepd_ok check
We don't do this for other page table entries. So lets keep this simple
and always return false for hugepd check on a 64K page size config.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-01-23 19:19:28 +11:00
Aneesh Kumar K.V 20717e1ff5 powerpc/mm: Fix little-endian 4K hugetlb
When we switched to big endian page table, we never updated the hugepd
format such that it can work for both big endian and little endian
config. This patch series update hugepd format such that it is looked at
as __be64 value in big endian page table config.

This patch also switch hugepd_t.pd from signed long to unsigned long.
I did update the FSL hugepd_ok check to check for the top bit instead
of checking > 0.

Fixes: 5dc1ef858c ("powerpc/mm: Use big endian Linux page tables for book3s 64")
Cc: stable@vger.kernel.org # v4.7+
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-01-18 11:58:50 +11:00
Aneesh Kumar K.V ff8b85796d powerpc/mm/hugetlb: Don't panic when we don't find the default huge page size
The generic hugetlbfs code can handle not finding the default huge page
size correctly. With HPAGE_SHIFT = 0 we see in dmesg:

  hugetlbfs: disabling because there are no supported hugepage sizes

bash-4.2# echo 30 > /proc/sys/vm/nr_hugepages
bash: echo: write error: Operation not supported

Fixes: 03bb2d6590 ("powerpc: get hugetlbpage handling more generic")
Reported-by: Chris Smart <chris@distroguy.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-01-18 11:58:50 +11:00
Nicholas Piggin bf5ca68dd2 powerpc: Fix pgtable pmd cache init
Commit 9b081e1080 ("powerpc: port 64 bits pgtable_cache to 32 bits")
mixed up PMD_INDEX_SIZE and PMD_CACHE_INDEX a couple of times. This
resulted in 64s/hash/4k configs to panic at boot with a false positive
error check.

Fix that and simplify error handling by moving the check to the caller.

Fixes: 9b081e1080 ("powerpc: port 64 bits pgtable_cache to 32 bits")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-01-18 11:58:30 +11:00
Reza Arbab 32b53c012e powerpc/mm: Fix memory hotplug BUG() on radix
Memory hotplug is leading to hash page table calls, even on radix:

  arch_add_memory
    create_section_mapping
      htab_bolt_mapping
        BUG_ON(!ppc_md.hpte_insert);

To fix, refactor {create,remove}_section_mapping() into hash__ and
radix__ variants. Leave the radix versions stubbed for now.

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-01-17 10:05:43 +11:00
Linus Torvalds b272f732f8 Merge branch 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull SMP hotplug notifier removal from Thomas Gleixner:
 "This is the final cleanup of the hotplug notifier infrastructure. The
  series has been reintgrated in the last two days because there came a
  new driver using the old infrastructure via the SCSI tree.

  Summary:

   - convert the last leftover drivers utilizing notifiers

   - fixup for a completely broken hotplug user

   - prevent setup of already used states

   - removal of the notifiers

   - treewide cleanup of hotplug state names

   - consolidation of state space

  There is a sphinx based documentation pending, but that needs review
  from the documentation folks"

* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/armada-xp: Consolidate hotplug state space
  irqchip/gic: Consolidate hotplug state space
  coresight/etm3/4x: Consolidate hotplug state space
  cpu/hotplug: Cleanup state names
  cpu/hotplug: Remove obsolete cpu hotplug register/unregister functions
  staging/lustre/libcfs: Convert to hotplug state machine
  scsi/bnx2i: Convert to hotplug state machine
  scsi/bnx2fc: Convert to hotplug state machine
  cpu/hotplug: Prevent overwriting of callbacks
  x86/msr: Remove bogus cleanup from the error path
  bus: arm-ccn: Prevent hotplug callback leak
  perf/x86/intel/cstate: Prevent hotplug callback leak
  ARM/imx/mmcd: Fix broken cpu hotplug handling
  scsi: qedi: Convert to hotplug state machine
2016-12-25 14:05:56 -08:00
Thomas Gleixner 73c1b41e63 cpu/hotplug: Cleanup state names
When the state names got added a script was used to add the extra argument
to the calls. The script basically converted the state constant to a
string, but the cleanup to convert these strings into meaningful ones did
not happen.

Replace all the useless strings with 'subsys/xxx/yyy:state' strings which
are used in all the other places already.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Link: http://lkml.kernel.org/r/20161221192112.085444152@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-12-25 10:47:44 +01:00
Linus Torvalds 7c0f6ba682 Replace <asm/uaccess.h> with <linux/uaccess.h> globally
This was entirely automated, using the script by Al:

  PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
  sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
        $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

to do the replacement at the end of the merge window.

Requested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-24 11:46:01 -08:00
Linus Torvalds de399813b5 powerpc updates for 4.10
Highlights include:
 
  - Support for the kexec_file_load() syscall, which is a prereq for secure and
    trusted boot.
 
  - Prevent kernel execution of userspace on P9 Radix (similar to SMEP/PXN).
 
  - Sort the exception tables at build time, to save time at boot, and store
    them as relative offsets to save space in the kernel image & memory.
 
  - Allow building the kernel with thin archives, which should allow us to build
    an allyesconfig once some other fixes land.
 
  - Build fixes to allow us to correctly rebuild when changing the kernel endian
    from big to little or vice versa.
 
  - Plumbing so that we can avoid doing a full mm TLB flush on P9 Radix.
 
  - Initial stack protector support (-fstack-protector).
 
  - Support for dumping the radix (aka. Linux) and hash page tables via debugfs.
 
  - Fix an oops in cxl coredump generation when cxl_get_fd() is used.
 
  - Freescale updates from Scott: "Highlights include 8xx hugepage support,
    qbman fixes/cleanup, device tree updates, and some misc cleanup."
 
  - Many and varied fixes and minor enhancements as always.
 
 Thanks to:
   Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Anshuman Khandual,
   Anton Blanchard, Balbir Singh, Bartlomiej Zolnierkiewicz, Christophe Jaillet,
   Christophe Leroy, Denis Kirjanov, Elimar Riesebieter, Frederic Barrat,
   Gautham R. Shenoy, Geliang Tang, Geoff Levand, Jack Miller, Johan Hovold,
   Lars-Peter Clausen, Libin, Madhavan Srinivasan, Michael Neuling, Nathan
   Fontenot, Naveen N. Rao, Nicholas Piggin, Pan Xinhui, Peter Senna Tschudin,
   Rashmica Gupta, Rui Teng, Russell Currey, Scott Wood, Simon Guo, Suraj
   Jitindar Singh, Thiago Jung Bauermann, Tobias Klauser, Vaibhav Jain.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJYU4YSAAoJEFHr6jzI4aWAC4gQALtIAqqPon0Cd5b/FVVcMbW7
 mMqB2b/0FGEl5GoRTzGUDaQqElilm6AEVfHO86C7DFji/a6olneFfw87iz+mtWuZ
 JvrNq68ZiSnoeszdUy4MgtXFLb5sTzNMev4skaHfjI9E5CepWBoR0zH4G+kNVnd5
 WSgudv8Cq4Px+MEuTOigt3QYjHzZ3cw/XNOOm9c+oGj+PDW4O9UItVI+S1WLoey4
 rAB2nRcLMDPuwfRQC9XsF3zEbkv4h1dEXo/EBRuRpcF+0lLTzFw1lv1WE8OxlUmS
 kAXbty3dIytBfSbtJT0c0Ps6sfQ4HFhu6ZV2fjnxNTz2KDkBIN7LBYHmBYiqY9oZ
 9zvbUWtfiTu5ocfRtTq7rC/Hcj4Kbr9S9F/FvXR0WyDsKgu4xxAovqC3gcn6YjYK
 Rr1tcCI4nUzyhVJVmd+OEhUvc5JbFy9aGage+YeOyejfvvSbXIunaxWlPjoDkvim
 Vjl+UKU8gw51XFssqY5ZBi/HNlMFKYedLpMFp/fItnLglhj50V0eFWkpDgdSCYom
 vo9ifPLZx8n8m8De3H7TV4E0F4gCHcTeqZdu7tW9AAUVM6iLJcDLm3asGmtNh21t
 snOHNOJ5QSIno6ezUUg29T6VBjbPh46fdJJSlIZrEe8OzLZ1haGyttf0tD00PQvY
 Z2W/m3gxafnOeGgBqvyv
 =xOzf
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:
 "Highlights include:

   - Support for the kexec_file_load() syscall, which is a prereq for
     secure and trusted boot.

   - Prevent kernel execution of userspace on P9 Radix (similar to
     SMEP/PXN).

   - Sort the exception tables at build time, to save time at boot, and
     store them as relative offsets to save space in the kernel image &
     memory.

   - Allow building the kernel with thin archives, which should allow us
     to build an allyesconfig once some other fixes land.

   - Build fixes to allow us to correctly rebuild when changing the
     kernel endian from big to little or vice versa.

   - Plumbing so that we can avoid doing a full mm TLB flush on P9
     Radix.

   - Initial stack protector support (-fstack-protector).

   - Support for dumping the radix (aka. Linux) and hash page tables via
     debugfs.

   - Fix an oops in cxl coredump generation when cxl_get_fd() is used.

   - Freescale updates from Scott: "Highlights include 8xx hugepage
     support, qbman fixes/cleanup, device tree updates, and some misc
     cleanup."

   - Many and varied fixes and minor enhancements as always.

  Thanks to:
    Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Anshuman
    Khandual, Anton Blanchard, Balbir Singh, Bartlomiej Zolnierkiewicz,
    Christophe Jaillet, Christophe Leroy, Denis Kirjanov, Elimar
    Riesebieter, Frederic Barrat, Gautham R. Shenoy, Geliang Tang, Geoff
    Levand, Jack Miller, Johan Hovold, Lars-Peter Clausen, Libin,
    Madhavan Srinivasan, Michael Neuling, Nathan Fontenot, Naveen N.
    Rao, Nicholas Piggin, Pan Xinhui, Peter Senna Tschudin, Rashmica
    Gupta, Rui Teng, Russell Currey, Scott Wood, Simon Guo, Suraj
    Jitindar Singh, Thiago Jung Bauermann, Tobias Klauser, Vaibhav Jain"

[ And thanks to Michael, who took time off from a new baby to get this
  pull request done.   - Linus ]

* tag 'powerpc-4.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (174 commits)
  powerpc/fsl/dts: add FMan node for t1042d4rdb
  powerpc/fsl/dts: add sg_2500_aqr105_phy4 alias on t1024rdb
  powerpc/fsl/dts: add QMan and BMan nodes on t1024
  powerpc/fsl/dts: add QMan and BMan nodes on t1023
  soc/fsl/qman: test: use DEFINE_SPINLOCK()
  powerpc/fsl-lbc: use DEFINE_SPINLOCK()
  powerpc/8xx: Implement support of hugepages
  powerpc: get hugetlbpage handling more generic
  powerpc: port 64 bits pgtable_cache to 32 bits
  powerpc/boot: Request no dynamic linker for boot wrapper
  soc/fsl/bman: Use resource_size instead of computation
  soc/fsl/qe: use builtin_platform_driver
  powerpc/fsl_pmc: use builtin_platform_driver
  powerpc/83xx/suspend: use builtin_platform_driver
  powerpc/ftrace: Fix the comments for ftrace_modify_code
  powerpc/perf: macros for power9 format encoding
  powerpc/perf: power9 raw event format encoding
  powerpc/perf: update attribute_group data structure
  powerpc/perf: factor out the event format field
  powerpc/mm/iommu, vfio/spapr: Put pages on VFIO container shutdown
  ...
2016-12-16 09:26:42 -08:00
Michael Ellerman c6f6634721 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux into next
Freescale updates from Scott:

"Highlights include 8xx hugepage support, qbman fixes/cleanup, device
tree updates, and some misc cleanup."
2016-12-16 15:05:38 +11:00
Linus Torvalds 93173b5bf2 Small release, the most interesting stuff is x86 nested virt improvements.
x86: userspace can now hide nested VMX features from guests; nested
 VMX can now run Hyper-V in a guest; support for AVX512_4VNNIW and
 AVX512_FMAPS in KVM; infrastructure support for virtual Intel GPUs.
 
 PPC: support for KVM guests on POWER9; improved support for interrupt
 polling; optimizations and cleanups.
 
 s390: two small optimizations, more stuff is in flight and will be
 in 4.11.
 
 ARM: support for the GICv3 ITS on 32bit platforms.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQExBAABCAAbBQJYTkP0FBxwYm9uemluaUByZWRoYXQuY29tAAoJEL/70l94x66D
 lZIH/iT1n9OQXcuTpYYnQhuCenzI3GZZOIMTbCvK2i5bo0FIJKxVn0EiAAqZSXvO
 nO185FqjOgLuJ1AD1kJuxzye5suuQp4HIPWWgNHcexLuy43WXWKZe0IQlJ4zM2Xf
 u31HakpFmVDD+Cd1qN3yDXtDrRQ79/xQn2kw7CWb8olp+pVqwbceN3IVie9QYU+3
 gCz0qU6As0aQIwq2PyalOe03sO10PZlm4XhsoXgWPG7P18BMRhNLTDqhLhu7A/ry
 qElVMANT7LSNLzlwNdpzdK8rVuKxETwjlc1UP8vSuhrwad4zM2JJ1Exk26nC2NaG
 D0j4tRSyGFIdx6lukZm7HmiSHZ0=
 =mkoB
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
 "Small release, the most interesting stuff is x86 nested virt
  improvements.

  x86:
   - userspace can now hide nested VMX features from guests
   - nested VMX can now run Hyper-V in a guest
   - support for AVX512_4VNNIW and AVX512_FMAPS in KVM
   - infrastructure support for virtual Intel GPUs.

  PPC:
   - support for KVM guests on POWER9
   - improved support for interrupt polling
   - optimizations and cleanups.

  s390:
   - two small optimizations, more stuff is in flight and will be in
     4.11.

  ARM:
   - support for the GICv3 ITS on 32bit platforms"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (94 commits)
  arm64: KVM: pmu: Reset PMSELR_EL0.SEL to a sane value before entering the guest
  KVM: arm/arm64: timer: Check for properly initialized timer on init
  KVM: arm/arm64: vgic-v2: Limit ITARGETSR bits to number of VCPUs
  KVM: x86: Handle the kthread worker using the new API
  KVM: nVMX: invvpid handling improvements
  KVM: nVMX: check host CR3 on vmentry and vmexit
  KVM: nVMX: introduce nested_vmx_load_cr3 and call it on vmentry
  KVM: nVMX: propagate errors from prepare_vmcs02
  KVM: nVMX: fix CR3 load if L2 uses PAE paging and EPT
  KVM: nVMX: load GUEST_EFER after GUEST_CR0 during emulated VM-entry
  KVM: nVMX: generate MSR_IA32_CR{0,4}_FIXED1 from guest CPUID
  KVM: nVMX: fix checks on CR{0,4} during virtual VMX operation
  KVM: nVMX: support restore of VMX capability MSRs
  KVM: nVMX: generate non-true VMX MSRs based on true versions
  KVM: x86: Do not clear RFLAGS.TF when a singlestep trap occurs.
  KVM: x86: Add kvm_skip_emulated_instruction and use it.
  KVM: VMX: Move skip_emulated_instruction out of nested_vmx_check_vmcs12
  KVM: VMX: Reorder some skip_emulated_instruction calls
  KVM: x86: Add a return value to kvm_emulate_cpuid
  KVM: PPC: Book3S: Move prototypes for KVM functions into kvm_ppc.h
  ...
2016-12-13 15:47:02 -08:00
Reza Arbab 4a3bac4e3a powerpc/mm: allow memory hotplug into a memoryless node
Patch series "enable movable nodes on non-x86 configs", v7.

This patchset allows more configs to make use of movable nodes.  When
CONFIG_MOVABLE_NODE is selected, there are two ways to introduce such
nodes into the system:

1. Discover movable nodes at boot. Currently this is only possible on
   x86, but we will enable configs supporting fdt to do the same.

2. Hotplug and online all of a node's memory using online_movable. This
   is already possible on any config supporting memory hotplug, not
   just x86, but the Kconfig doesn't say so. We will fix that.

We'll also remove some cruft on power which would prevent (2).

This patch (of 5):

Remove the check which prevents us from hotplugging into an empty node.

The original commit b226e46212 ("[PATCH] powerpc: don't add memory to
empty node/zone"), states that this was intended to be a temporary measure.
It is a workaround for an oops which no longer occurs.

Link: http://lkml.kernel.org/r/1479160961-25840-2-git-send-email-arbab@linux.vnet.ibm.com
Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Alistair Popple <apopple@au1.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
Cc: Frank Rowand <frowand.list@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12 18:55:07 -08:00
Christophe Leroy 4b91428699 powerpc/8xx: Implement support of hugepages
8xx uses a two level page table with two different linux page size
support (4k and 16k). 8xx also support two different hugepage sizes
512k and 8M. In order to support them on linux we define two different
page table layout.

The size of pages is in the PGD entry, using PS field (bits 28-29):
00 : Small pages (4k or 16k)
01 : 512k pages
10 : reserved
11 : 8M pages

For 512K hugepage size a pgd entry have the below format
[<hugepte address >0101] . The hugepte table allocated will contain 8
entries pointing to 512K huge pte in 4k pages mode and 64 entries in
16k pages mode.

For 8M in 16k mode, a pgd entry have the below format
[<hugepte address >1101] . The hugepte table allocated will contain 8
entries pointing to 8M huge pte.

For 8M in 4k mode, multiple pgd entries point to the same hugepte
address and pgd entry will have the below format
[<hugepte address>1101]. The hugepte table allocated will only have one
entry.

For the time being, we do not support CPU15 ERRATA when HUGETLB is
selected

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> (v3, for the generic bits)
Signed-off-by: Scott Wood <oss@buserror.net>
2016-12-09 22:49:07 -06:00
Christophe Leroy 03bb2d6590 powerpc: get hugetlbpage handling more generic
Today there are two implementations of hugetlbpages which are managed
by exclusive #ifdefs:
* FSL_BOOKE: several directory entries points to the same single hugepage
* BOOK3S: one upper level directory entry points to a table of hugepages

In preparation of implementation of hugepage support on the 8xx, we
need a mix of the two above solutions, because the 8xx needs both cases
depending on the size of pages:
* In 4k page size mode, each PGD entry covers a 4M bytes area. It means
that 2 PGD entries will be necessary to cover an 8M hugepage while a
single PGD entry will cover 8x 512k hugepages.
* In 16 page size mode, each PGD entry covers a 64M bytes area. It means
that 8x 8M hugepages will be covered by one PGD entry and 64x 512k
hugepages will be covers by one PGD entry.

This patch:
* removes #ifdefs in favor of if/else based on the range sizes
* merges the two huge_pte_alloc() functions as they are pretty similar
* merges the two hugetlbpage_init() functions as they are pretty similar

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> (v3)
Signed-off-by: Scott Wood <oss@buserror.net>
2016-12-09 22:48:09 -06:00
Christophe Leroy 9b081e1080 powerpc: port 64 bits pgtable_cache to 32 bits
Today powerpc64 uses a set of pgtable_caches while powerpc32 uses
standard pages when using 4k pages and a single pgtable_cache
if using other size pages.

In preparation of implementing huge pages on the 8xx, this patch
replaces the specific powerpc32 handling by the 64 bits approach.

This is done by:
* moving 64 bits pgtable_cache_add() and pgtable_cache_init()
in a new file called init-common.c
* modifying pgtable_cache_init() to also handle the case
without PMD
* removing the 32 bits version of pgtable_cache_add() and
pgtable_cache_init()
* copying related header contents from 64 bits into both the
book3s/32 and nohash/32 header files

On the 8xx, the following cache sizes will be used:
* 4k pages mode:
- PGT_CACHE(10) for PGD
- PGT_CACHE(3) for 512k hugepage tables
* 16k pages mode:
- PGT_CACHE(6) for PGD
- PGT_CACHE(7) for 512k hugepage tables
- PGT_CACHE(3) for 8M hugepage tables

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Scott Wood <oss@buserror.net>
2016-12-09 22:48:01 -06:00
Alexey Kardashevskiy 4b6fad7097 powerpc/mm/iommu, vfio/spapr: Put pages on VFIO container shutdown
At the moment the userspace tool is expected to request pinning of
the entire guest RAM when VFIO IOMMU SPAPR v2 driver is present.
When the userspace process finishes, all the pinned pages need to
be put; this is done as a part of the userspace memory context (MM)
destruction which happens on the very last mmdrop().

This approach has a problem that a MM of the userspace process
may live longer than the userspace process itself as kernel threads
use userspace process MMs which was runnning on a CPU where
the kernel thread was scheduled to. If this happened, the MM remains
referenced until this exact kernel thread wakes up again
and releases the very last reference to the MM, on an idle system this
can take even hours.

This moves preregistered regions tracking from MM to VFIO; insteads of
using mm_iommu_table_group_mem_t::used, tce_container::prereg_list is
added so each container releases regions which it has pre-registered.

This changes the userspace interface to return EBUSY if a memory
region is already registered in a container. However it should not
have any practical effect as the only userspace tool available now
does register memory region once per container anyway.

As tce_iommu_register_pages/tce_iommu_unregister_pages are called
under container->lock, this does not need additional locking.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-12-02 14:38:34 +11:00
Alexey Kardashevskiy d7baee6901 powerpc/iommu: Stop using @current in mm_iommu_xxx
This changes mm_iommu_xxx helpers to take mm_struct as a parameter
instead of getting it from @current which in some situations may
not have a valid reference to mm.

This changes helpers to receive @mm and moves all references to @current
to the caller, including checks for !current and !current->mm;
checks in mm_iommu_preregistered() are removed as there is no caller
yet.

This moves the mm_iommu_adjust_locked_vm() call to the caller as
it receives mm_iommu_table_group_mem_t but it needs mm.

This should cause no behavioral change.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-12-02 14:38:29 +11:00
Alexey Kardashevskiy 88f54a3581 powerpc/iommu: Pass mm_struct to init/cleanup helpers
We are going to get rid of @current references in mmu_context_boos3s64.c
and cache mm_struct in the VFIO container. Since mm_context_t does not
have reference counting, we will be using mm_struct which does have
the reference counter.

This changes mm_iommu_init/mm_iommu_cleanup to receive mm_struct rather
than mm_context_t (which is embedded into mm).

This should not cause any behavioral change.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-12-02 14:38:27 +11:00
Michael Ellerman dd5ac03e09 powerpc/mm: Fix page table dump build on non-Book3S
In the recent commit 1515ab9321 ("powerpc/mm: Dump hash table") we
added code to dump the hage page table. Currently this can be selected
to build on any platform. However it breaks the build if we're building
for a non-Book3S platform, because none of the hash page table related
defines and so on exist. So restrict it to building only on Book3S.

Similarly in commit 8eb07b1870 ("powerpc/mm: Dump linux pagetables")
we added code to dump the Linux page tables, which uses some constants
which are only defined on Book3S - so guard those with an #ifdef.

Fixes: 1515ab9321 ("powerpc/mm: Dump hash table")
Fixes: 8eb07b1870 ("powerpc/mm: Dump linux pagetables")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-12-01 16:20:18 +11:00
Balbir Singh 0ab5171b89 powerpc/mm: Fix no execute fault handling on pre-POWER5
Aneesh/Ben reported that the change to do_page_fault() we made in commit
1d18ad0268 ("powerpc/mm: Detect instruction fetch denied and report")
needs to handle the case where CPU_FTR_COHERENT_ICACHE is missing but we
have CPU_FTR_NOEXECUTE. In those cases the check added for
SRR1_ISI_N_OR_G might trigger a false positive.

This patch adds a check for CPU_FTR_COHERENT_ICACHE in addition to the
MSR value.

Fixes: 1d18ad0268 ("powerpc/mm: Detect instruction fetch denied and report")
Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-30 17:19:01 +11:00
Benjamin Herrenschmidt dd7b2f035e powerpc/mm: Fix lazy icache flush on pre-POWER5
On 64-bit CPUs with no-execute support and non-snooping icache, such as
970 or POWER4, we have a software mechanism to ensure coherency of the
cache (using exec faults when needed).

This was broken due to a logic error when the code was rewritten
from assembly to C, previously the assembly code did:

  BEGIN_FTR_SECTION
         mr      r4,r30
         mr      r5,r7
         bl      hash_page_do_lazy_icache
  END_FTR_SECTION(CPU_FTR_NOEXECUTE|CPU_FTR_COHERENT_ICACHE, CPU_FTR_NOEXECUTE)

Which tests that:
   (cpu_features & (NOEXECUTE | COHERENT_ICACHE)) == NOEXECUTE

Which says that the current cpu does have NOEXECUTE, but does not have
COHERENT_ICACHE.

Fixes: 91f1da9979 ("powerpc/mm: Convert 4k hash insert to C")
Fixes: 89ff725051 ("powerpc/mm: Convert __hash_page_64K to C")
Fixes: a43c0eb836 ("powerpc/mm: Convert 4k insert from asm to C")
Cc: stable@vger.kernel.org # v4.5+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
[mpe: Change log verbosification]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-29 23:59:40 +11:00
Aneesh Kumar K.V b3603e174f powerpc/mm: update radix__ptep_set_access_flag to not do full mm tlb flush
When we are updating a pte, we just need to flush the tlb mapping
that pte. Right now we do a full mm flush because we don't track the page
size. Now that we have page size details in pte use that to do the
optimized flush

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-28 22:44:33 +11:00
Aneesh Kumar K.V 6d3a0379eb powerpc/mm: Add radix__tlb_flush_pte_p9_dd1()
Now that we have page size details encoded in pte using software pte
bits, use that to find the page size needed for tlb flush.

This function should only be used on P9 DD1, so give it a horrible name
to make that clear.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-28 22:43:45 +11:00
Balbir Singh 3b10d0095a powerpc/mm/radix: Prevent kernel execution of user space
ISA 3 defines new encoded access authority that allows instruction
access prevention in privileged mode and allows normal access
to problem state. This patch just enables IAMR (Instruction Authority
Mask Register), enabling AMR would require more work.

I've tested this with a buggy driver and a simple payload. The payload
is specific to the build I've tested.

mpe: Also tested with LKDTM:

  # echo EXEC_USERSPACE > /sys/kernel/debug/provoke-crash/DIRECT
  lkdtm: Performing direct entry EXEC_USERSPACE
  lkdtm: attempting ok execution at c0000000005bf560
  lkdtm: attempting bad execution at 00003fff8d940000
  Unable to handle kernel paging request for instruction fetch
  Faulting instruction address: 0x3fff8d940000
  Oops: Kernel access of bad area, sig: 11 [#1]
  NIP: 00003fff8d940000 LR: c0000000005bfa58 CTR: 00003fff8d940000
  REGS: c0000000f1fcf900 TRAP: 0400   Not tainted  (4.9.0-rc5-compiler_gcc-6.2.0-00109-g956dbc06232a)
  MSR: 9000000010009033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 48002222  XER: 00000000
  ...
  Call Trace:
    lkdtm_EXEC_USERSPACE+0x104/0x120 (unreliable)
    lkdtm_do_action+0x3c/0x80
    direct_entry+0x100/0x1b0
    full_proxy_write+0x94/0x100
    __vfs_write+0x3c/0x1b0
    vfs_write+0xcc/0x230
    SyS_write+0x60/0x110
    system_call+0x38/0xfc

Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-26 18:48:04 +11:00
Balbir Singh 1d18ad0268 powerpc/mm: Detect instruction fetch denied and report
ISA 3 allows for prevention of instruction fetch and execution
of user mode pages. If such an error occurs, SRR1 bit 35 reports the
error. We catch and report the error in do_page_fault().

Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-25 15:01:35 +11:00
Balbir Singh ee97b6b99f powerpc/mm/radix: Setup AMOR in HV mode to allow key 0
Setup AMOR (Authority Mask Override Register) in HV mode so that the
host and guest kernel can in turn setup IAMR.

This allows us to enable key 0 in a following patch.

Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-25 15:01:31 +11:00
Aneesh Kumar K.V 984d7a1ec6 powerpc/mm: Fixup kernel read only mapping
With commit e58e87adc8 ("powerpc/mm: Update _PAGE_KERNEL_RO") we
started using the ppp value 0b110 to map kernel readonly. But that
facility was only added as part of ISA 2.04. For earlier ISA version
only supported ppp bit value for readonly mapping is 0b011. (This
implies both user and kernel get mapped using the same ppp bit value for
readonly mapping.).
Update the code such that for earlier architecture version we use ppp
value 0b011 for readonly mapping. We don't differentiate between power5+
and power5 here and apply the new ppp bits only from power6 (ISA 2.05).
This keep the changes minimal.

This fixes issue with PS3 spu usage reported at
https://lkml.kernel.org/r/rep.1421449714.geoff@infradead.org

Fixes: e58e87adc8 ("powerpc/mm: Update _PAGE_KERNEL_RO")
Cc: stable@vger.kernel.org # v4.7+
Tested-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-25 14:18:25 +11:00
Michael Ellerman ddbefe7e77 Merge branch 'topic/ppc-kvm' into next
Merge the topic branch we're sharing with the kvm-ppc tree.
2016-11-24 22:14:52 +11:00
Paul Mackerras 9d66195807 powerpc/64: Provide functions for accessing POWER9 partition table
POWER9 requires the host to set up a partition table, which is a
table in memory indexed by logical partition ID (LPID) which
contains the pointers to page tables and process tables for the
host and each guest.

This factors out the initialization of the partition table into
a single function.  This code was previously duplicated between
hash_utils_64.c and pgtable-radix.c.

This provides a function for setting a partition table entry,
which is used in early MMU initialization, and will be used by
KVM whenever a guest is created.  This function includes a tlbie
instruction which will flush all TLB entries for the LPID and
all caches of the partition table entry for the LPID, across the
system.

This also moves a call to memblock_set_current_limit(), which was
in radix_init_partition_table(), but has nothing to do with the
partition table.  By analogy with the similar code for hash, the
call gets moved to near the end of radix__early_init_mmu().  It
now gets called when running as a guest, whereas previously it
would only be called if the kernel is running as the host.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-23 10:32:11 +11:00
Aneesh Kumar K.V 64168f4296 powerpc/mm/coproc: Handle bad address on coproc slb fault
VSID 0 is bad address. Don't create slb entries on coproc fault for
bad address

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Reviewed-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-22 11:57:08 +11:00
Aneesh Kumar K.V cac4a18540 powerpc/mm: Fix missing update of HID register on secondary CPUs
We need to update on secondaries for the selected MMU mode.

Fixes: ad410674f5 ("powerpc/mm: Update the HID bit when switching from radix to hash")
Reported-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-18 23:16:58 +11:00
Michael Neuling 96ed1fe511 powerpc/mm/radix: Invalidate ERAT on tlbiel for POWER9 DD1
On POWER9 DD1, when we do a local TLB invalidate we also need to explicitly
invalidate the ERAT.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-18 15:12:24 +11:00
Suraj Jitindar Singh 555c16328a powerpc/mm: Correct process and partition table max size
Version 3.00 of the ISA states that the PATS (partition table size) field
of the PTCR (partition table control register) and the PRTS (process table
size) field of the partition table entry must both be less than or equal
to 24. However the actual size of the partition and process tables is equal
to 2 to the power of 12 plus the PATS and PRTS fields, respectively. This
means that the max allowable size of each of these tables is 2^36 or 64GB
for both.

Thus when checking the size shift for each we should be checking for values
of greater than 36 instead of the current check for shifts larger than 24
and 23.

Fixes: 2bfd65e45e
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-17 17:11:53 +11:00
Rashmica Gupta 1515ab9321 powerpc/mm: Dump hash table
Useful to be able to dump the kernel hash page table to check
which pages are hashed along with their sizes and other details.

Add a debugfs file to check the hash page table. If radix is enabled
(and so there is no hash page table) then this file doesn't exist. To
use this the PPC_PTDUMP config option must be selected.

Signed-off-by: Rashmica Gupta <rashmicy@gmail.com>
[mpe: Fix build with SPARSEMEM_VMEMMAP=n & PSERIES=n]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-17 17:11:47 +11:00
Rashmica Gupta 8eb07b1870 powerpc/mm: Dump linux pagetables
Useful to be able to dump the kernels page tables to check permissions
and memory types - derived from arm64's implementation.

Add a debugfs file to check the page tables. To use this the PPC_PTDUMP
config option must be selected.

Signed-off-by: Rashmica Gupta <rashmicy@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-17 17:11:46 +11:00
Balbir Singh ac8d3818aa powerpc/mm: Fix typo in radix encodings print
Rename "sift" to "shift".

Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-17 17:11:45 +11:00
Paul Mackerras 6b243fcfb5 powerpc/64: Simplify adaptation to new ISA v3.00 HPTE format
This changes the way that we support the new ISA v3.00 HPTE format.
Instead of adapting everything that uses HPTE values to handle either
the old format or the new format, depending on which CPU we are on,
we now convert explicitly between old and new formats if necessary
in the low-level routines that actually access HPTEs in memory.
This limits the amount of code that needs to know about the new
format and makes the conversions explicit.  This is OK because the
old format contains all the information that is in the new format.

This also fixes operation under a hypervisor, because the H_ENTER
hypercall (and other hypercalls that deal with HPTEs) will continue
to require the HPTE value to be supplied in the old format.  At
present the kernel will not boot in HPT mode on POWER9 under a
hypervisor.

This fixes and partially reverts commit 50de596de8
("powerpc/mm/hash: Add support for Power9 Hash", 2016-04-29).

Fixes: 50de596de8 ("powerpc/mm/hash: Add support for Power9 Hash")
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-16 15:43:25 +11:00
Balbir Singh f923efbcfd powerpc/hash64: Be more careful when generating tlbiel
In ISA v2.05, the tlbiel instruction takes two arguments, RB and L:

tlbiel RB,L

+---------+---------+----+---------+---------+---------+----+
|    31   |    /    | L  |    /    |    RB   |   274   | /  |
| 31 - 26 | 25 - 22 | 21 | 20 - 16 | 15 - 11 |  10 - 1 | 0  |
+---------+---------+----+---------+---------+---------+----+

In ISA v2.06 tlbiel takes only one argument, RB:

tlbiel RB

+---------+---------+---------+---------+---------+----+
|    31   |    /    |    /    |    RB   |   274   | /  |
| 31 - 26 | 25 - 21 | 20 - 16 | 15 - 11 |  10 - 1 | 0  |
+---------+---------+---------+---------+---------+----+

And in ISA v3.00 tlbiel takes five arguments:

tlbiel RB,RS,RIC,PRS,R

+---------+---------+----+---------+----+----+---------+---------+----+
|    31   |    RS   | /  |   RIC   |PRS | R  |    RB   |   274   | /  |
| 31 - 26 | 25 - 21 | 20 | 19 - 18 | 17 | 16 | 15 - 11 |  10 - 1 | 0  |
+---------+---------+----+---------+----+----+---------+---------+----+

However the assembler also accepts "tlbiel RB", and generates
"tlbiel RB,r0,0,0,0".

As you can see above the L field from the v2.05 encoding overlaps with the
reserved field of the v2.06 encoding, and the low bit of the RS field of the
v3.00 encoding.

Currently in __tlbiel() we generate two tlbiel instructions manually using hex
constants. In the first case, for MMU_PAGE_4K, we generate "tlbiel RB,0", which
is safe in all cases, because the L bit is zero.

However in the default case we generate "tlbiel RB,1", therefore setting bit 21
to 1.

This is not an actual bug on v2.06 processors, because the CPU ignores the value
of the reserved field. However software is supposed to encode the reserved
fields as zero to enable forward compatibility.

On v3.00 processors setting bit 21 to 1 and no other bits of RS, means we are
using r1 for the value of RS.

Although it's not obvious, the code sets the IS field (bits 10-11) to 0 (by
omission), and L=1, in the va value, which is passed as RB. We also pass R=0 in
the instruction.

The combination of IS=0, L=1 and R=0 means the value of RS is not used, so even
on ISA v3.00 there is no actual bug.

We should still fix it, as setting a reserved bit on v2.06 is naughty, and we
are only avoiding a bug on v3.00 by accident rather than design. Use
ASM_FTR_IFSET() to generate the single argument form on ISA v2.06 and later, and
the two argument form on pre v2.06.

Although there may be very old toolchains which don't understand tlbiel, we have
other code in the tree which has been using tlbiel for over five years, and no
one has reported any build failures, so just let the assembler generate the
instructions.

Signed-off-by: Balbir Singh <bsingharora@gmail.com>
[mpe: Rewrite change log, use IFSET instead of IFCLR]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-14 11:11:51 +11:00
Nicholas Piggin 61a92f7031 powerpc: Add support for relative exception tables
This halves the exception table size on 64-bit builds, and it allows
build-time sorting of exception tables to work on relocated kernels.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Minor asm fixups and bits to keep the selftests working]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-14 11:11:51 +11:00
Aneesh Kumar K.V bd77c44986 powerpc/mm/radix: Use tlbiel only if we ever ran on the current cpu
Before this patch, we used tlbiel, if we ever ran only on this core.
That was mostly derived from the nohash usage of the same. But is
incorrect, the ISA 3.0 clarifies tlbiel such that:

"All TLB entries that have all of the following properties are made
invalid on the thread executing the tlbiel instruction"

ie. tlbiel only invalidates TLB entries on the current thread. So if the
mm has been used on any other thread (aka. cpu) then we must broadcast
the invalidate.

This bug could lead to invalid TLB entries if a program runs on multiple
threads of a core.

Hence use tlbiel, if we only ever ran on only the current cpu.

Fixes: 1a472c9dba ("powerpc/mm/radix: Add tlbflush routines")
Cc: stable@vger.kernel.org # v4.7+
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-27 21:55:13 +11:00
Aneesh Kumar K.V 8467801cc8 powerpc: Fix numa topology console print
With recent update to printk, we get console output like below:

[    0.550639] Brought up 160 CPUs
[    0.550718] Node 0 CPUs:
[    0.550721]  0
[    0.550754] -39

[    0.550794] Node 1 CPUs:
[    0.550798]  40
[    0.550817] -79

[    0.550856] Node 16 CPUs:
[    0.550860]  80
[    0.550880] -119

[    0.550917] Node 17 CPUs:
[    0.550923]  120
[    0.550942] -159

Fix this by properly using pr_cont(), ie. KERN_CONT.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-19 20:35:41 +11:00
Michael Ellerman 08b5e79ebd powerpc/mm: Drop dump_numa_memory_topology()
At boot we dump the NUMA memory topology in dump_numa_memory_topology(),
at KERN_DEBUG level, resulting in output like:

  Node 0 Memory: 0x0-0x100000000
  Node 1 Memory: 0x100000000-0x200000000

Which is nice enough, but immediately after that we iterate over each
node and call setup_node_data(), which also prints out the node ranges,
at KERN_INFO, giving eg:

  numa: Initmem setup node 0 [mem 0x00000000-0xffffffff]
  numa: Initmem setup node 1 [mem 0x100000000-0x1ffffffff]

Additionally dump_numa_memory_topology() does not use KERN_CONT
correctly, resulting in split output lines on recent kernels.

So drop dump_numa_memory_topology() as superfluous chatter.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-19 20:35:40 +11:00
Frederic Barrat d2cf909cda powerpc/mm: Prevent unlikely crash in copro_calculate_slb()
If a cxl adapter faults on an invalid address for a kernel context, we
may enter copro_calculate_slb() with a NULL mm pointer (kernel
context) and an effective address which looks like a user
address. Which will cause a crash when dereferencing mm. It is clearly
an AFU bug, but there's no reason to crash either. So return an error,
so that cxl can ack the interrupt with an address error.

Fixes: 73d16a6e0e ("powerpc/cell: Move data segment faulting code out of cell platform")
Cc: stable@vger.kernel.org # v3.18+
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-19 20:32:49 +11:00
Linus Torvalds 84d69848c9 Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild
Pull kbuild updates from Michal Marek:

 - EXPORT_SYMBOL for asm source by Al Viro.

   This does bring a regression, because genksyms no longer generates
   checksums for these symbols (CONFIG_MODVERSIONS). Nick Piggin is
   working on a patch to fix this.

   Plus, we are talking about functions like strcpy(), which rarely
   change prototypes.

 - Fixes for PPC fallout of the above by Stephen Rothwell and Nick
   Piggin

 - fixdep speedup by Alexey Dobriyan.

 - preparatory work by Nick Piggin to allow architectures to build with
   -ffunction-sections, -fdata-sections and --gc-sections

 - CONFIG_THIN_ARCHIVES support by Stephen Rothwell

 - fix for filenames with colons in the initramfs source by me.

* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: (22 commits)
  initramfs: Escape colons in depfile
  ppc: there is no clear_pages to export
  powerpc/64: whitelist unresolved modversions CRCs
  kbuild: -ffunction-sections fix for archs with conflicting sections
  kbuild: add arch specific post-link Makefile
  kbuild: allow archs to select link dead code/data elimination
  kbuild: allow architectures to use thin archives instead of ld -r
  kbuild: Regenerate genksyms lexer
  kbuild: genksyms fix for typeof handling
  fixdep: faster CONFIG_ search
  ia64: move exports to definitions
  sparc32: debride memcpy.S a bit
  [sparc] unify 32bit and 64bit string.h
  sparc: move exports to definitions
  ppc: move exports to definitions
  arm: move exports to definitions
  s390: move exports to definitions
  m68k: move exports to definitions
  alpha: move exports to actual definitions
  x86: move exports to actual definitions
  ...
2016-10-14 14:26:58 -07:00
Linus Torvalds d8bfb96a2e powerpc updates for 4.9 #2
Freescale updates from Scott:
 
 "Highlights include qbman support (a prerequisite for datapath drivers
 such as ethernet), a PCI DMA fix+improvement, reset handler changes, more
 8xx optimizations, and some cleanups and fixes."
 
 Fixes:
  - selftests/powerpc: Add missing binaries to .gitignores (Michael Ellerman)
  - selftests/powerpc: Fix build break caused by EXPORT_SYMBOL changes (Michael Ellerman)
  - powerpc/pseries: Fix stack corruption in htpe code (Laurent Dufour)
  - powerpc/64s: Fix power4_fixup_nap placement (Nicholas Piggin)
  - powerpc/64: Fix incorrect return value from __copy_tofrom_user (Paul Mackerras)
  - powerpc/mm/hash64: Fix might_have_hea() check (Michael Ellerman)
 
 Other:
  - MAINTAINERS: Remove myself from PA Semi entries (Olof Johansson)
  - MAINTAINERS: Drop separate pseries entry (Michael Ellerman)
  - MAINTAINERS: Update powerpc website & add selftests (Michael Ellerman)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJX/1EOAAoJEFHr6jzI4aWAu0UQAIsmnc361a4xOl1ODRzJNWSD
 OqBbuGQPZ3I3XPMxB6BBK4mnpR507nb+L1yxMDidDebJal/pRJdXkGi7I5pe9uq+
 12XJcaePtVfmrHKKWAhC/fef0gQHSusBDpIIDquN5QE1BVvUDGbynG4GjnpX9ZaT
 gmXGL03u/yJvUoUNexG7lrMAJ7bZgU8BzFKyojzWtoEDF4SM7rpWKs1hGwojW4/T
 EYcek5uTNo01UsN/WNrtBkHA8eC9unnLk9NisOxvBXu7eJfEq38Bz71fhoowFO+C
 FDRboPdkXxySzzNTBb3hROontLZS2S13upzjcrRo2/f4gxvcimRJtDzxuRKrYX5n
 xdXcZVdFSRsKanbuV0Dwjki05IU4zeOhsHUqYqaS2UD+QlAbNCu0N9DZOhPMn+H2
 8uT3cOOrBLBrhIH3e7DMK9Rx97FBeuCvwrbjnZp8My7s55VXXd2CZTFYf5/wW0b2
 VEf5eoXM1BB2zuh9kFZ785Sq5iYnsKoNhKjoXULkBrf3m7WtmjPIbHzRTJM5ltwt
 YUvFMG6nncQB0ERVOvDIXXNzwVB0JkJTVX2BBZ2a7Fr+8KHE6rTYkgcQiosibUmq
 gLV9M59MFamAgJlna3A1OmGIpEiZ3RrriYL2mgraWwuLUn/qW3yPiPE6hBk0uomL
 cARvlIjGf8rWhi+3qAb1
 =lwsd
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull more powerpc updates from Michael Ellerman:
 "Some more powerpc updates for 4.9:

  Freescale updates from Scott Wood:
   - qbman support (a prerequisite for datapath drivers such as ethernet)
   - a PCI DMA fix+improvement
   - reset handler changes
   - more 8xx optimizations
   - some cleanups and fixes.'

  Fixes:
   - selftests/powerpc: Add missing binaries to .gitignores (Michael Ellerman)
   - selftests/powerpc: Fix build break caused by EXPORT_SYMBOL changes (Michael Ellerman)
   - powerpc/pseries: Fix stack corruption in htpe code (Laurent Dufour)
   - powerpc/64s: Fix power4_fixup_nap placement (Nicholas Piggin)
   - powerpc/64: Fix incorrect return value from __copy_tofrom_user (Paul Mackerras)
   - powerpc/mm/hash64: Fix might_have_hea() check (Michael Ellerman)

  Other:
   - MAINTAINERS: Remove myself from PA Semi entries (Olof Johansson)
   - MAINTAINERS: Drop separate pseries entry (Michael Ellerman)
   - MAINTAINERS: Update powerpc website & add selftests (Michael Ellerman):

* tag 'powerpc-4.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (35 commits)
  powerpc/mm/hash64: Fix might_have_hea() check
  powerpc/64: Fix incorrect return value from __copy_tofrom_user
  powerpc/64s: Fix power4_fixup_nap placement
  powerpc/pseries: Fix stack corruption in htpe code
  selftests/powerpc: Fix build break caused by EXPORT_SYMBOL changes
  MAINTAINERS: Update powerpc website & add selftests
  MAINTAINERS: Drop separate pseries entry
  MAINTAINERS: Remove myself from PA Semi entries
  selftests/powerpc: Add missing binaries to .gitignores
  arch/powerpc: Add CONFIG_FSL_DPAA to corenetXX_smp_defconfig
  soc/qman: Add self-test for QMan driver
  soc/bman: Add self-test for BMan driver
  soc/fsl: Introduce DPAA 1.x QMan device driver
  soc/fsl: Introduce DPAA 1.x BMan device driver
  powerpc/8xx: make user addr DTLB miss the short path
  powerpc/8xx: Move additional DTLBMiss handlers out of exception area
  powerpc/8xx: use r3 to scratch CR in ITLBmiss
  soc/fsl/qe: fix gpio save_regs functions
  powerpc/8xx: add dedicated machine check handler
  powerpc/8xx: add system_reset_exception
  ...
2016-10-14 11:07:42 -07:00
Michael Ellerman 08bf75ba85 powerpc/mm/hash64: Fix might_have_hea() check
In commit 2b4e3ad8f5 ("powerpc/mm/hash64: Don't test for machine type
to detect HEA special case") we changed the logic in might_have_hea()
to check FW_FEATURE_SPLPAR rather than machine_is(pseries).

However the check was incorrectly negated, leading to crashes on
machines with HEA adapters, such as:

  mm: Hashing failure ! EA=0xd000080080004040 access=0x800000000000000e current=NetworkManager
      trap=0x300 vsid=0x13d349c ssize=1 base psize=2 psize 2 pte=0xc0003cc033e701ae
  Unable to handle kernel paging request for data at address 0xd000080080004040
  Call Trace:
    .ehea_create_cq+0x148/0x340 [ehea] (unreliable)
    .ehea_up+0x258/0x1200 [ehea]
    .ehea_open+0x44/0x1a0 [ehea]
    ...

Fix it by removing the negation.

Fixes: 2b4e3ad8f5 ("powerpc/mm/hash64: Don't test for machine type to detect HEA special case")
Cc: stable@vger.kernel.org # v4.8+
Reported-by: Denis Kirjanov <kda@linux-powerpc.org>
Reported-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-12 08:32:28 +11:00
Linus Torvalds 07021b4359 powerpc updates for 4.9
Highlights:
  - Major rework of Book3S 64-bit exception vectors (Nicholas Piggin)
    - Use gas sections for arranging exception vectors et. al.
  - Large set of TM cleanups and selftests (Cyril Bur)
  - Enable transactional memory (TM) lazily for userspace (Cyril Bur)
  - Support for XZ compression in the zImage wrapper (Oliver O'Halloran)
  - Add support for bpf constant blinding (Naveen N. Rao)
  - Beginnings of upstream support for PA Semi Nemo motherboards (Darren Stevens)
 
 Fixes:
  - Ensure .mem(init|exit).text are within _stext/_etext (Michael Ellerman)
  - xmon: Don't use ld on 32-bit (Michael Ellerman)
  - vdso64: Use double word compare on pointers (Anton Blanchard)
  - powerpc/nvram: Fix an incorrect partition merge (Pan Xinhui)
  - powerpc: Fix usage of _PAGE_RO in hugepage (Christophe Leroy)
  - powerpc/mm: Update FORCE_MAX_ZONEORDER range to allow hugetlb w/4K (Aneesh Kumar K.V)
  - Fix memory leak in queue_hotplug_event() error path (Andrew Donnellan)
  - Replay hypervisor maintenance interrupt first (Nicholas Piggin)
 
 Cleanups & features:
  - Sparse fixes/cleanups (Daniel Axtens)
  - Preserve CFAR value on SLB miss caused by access to bogus address (Paul Mackerras)
  - Radix MMU fixups for POWER9 (Aneesh Kumar K.V)
  - Support for setting used_(vsr|vr|spe) in sigreturn path (for CRIU) (Simon Guo)
  - Optimise syscall entry for virtual, relocatable case (Nicholas Piggin)
  - Optimise MSR handling in exception handling (Nicholas Piggin)
  - Support for kexec with Radix MMU (Benjamin Herrenschmidt)
  - powernv EEH fixes (Russell Currey)
  - Suprise PCI hotplug support for powernv (Gavin Shan)
  - Endian/sparse fixes for powernv PCI (Gavin Shan)
  - Defconfig updates (Anton Blanchard)
  - Various performance optimisations (Anton Blanchard)
    - Align hot loops of memset() and backwards_memcpy()
    - During context switch, check before setting mm_cpumask
    - Remove static branch prediction in atomic{, 64}_add_unless
    - Only disable HAVE_EFFICIENT_UNALIGNED_ACCESS on POWER7 little endian
    - Set default CPU type to POWER8 for little endian builds
 
  - KVM: PPC: Book3S HV: Migrate pinned pages out of CMA (Balbir Singh)
  - cxl: Flush PSL cache before resetting the adapter (Frederic Barrat)
  - cxl: replace loop with for_each_child_of_node(), remove unneeded of_node_put() (Andrew Donnellan)
  - Fix HV facility unavailable to use correct handler (Nicholas Piggin)
  - Remove unnecessary syscall trampoline (Nicholas Piggin)
  - fadump: Fix build break when CONFIG_PROC_VMCORE=n (Michael Ellerman)
  - Quieten EEH message when no adapters are found (Anton Blanchard)
  - powernv: Add PHB register dump debugfs handle (Russell Currey)
  - Use kprobe blacklist for exception handlers & asm functions (Nicholas Piggin)
  - Document the syscall ABI (Nicholas Piggin)
  - MAINTAINERS: Update cxl maintainers (Michael Neuling)
  - powerpc: Remove all usages of NO_IRQ (Michael Ellerman)
 
 Minor cleanups:
  - Andrew Donnellan, Christophe Leroy, Colin Ian King, Cyril Bur, Frederic Barrat,
    Pan Xinhui, PrasannaKumar Muralidharan, Rui Teng, Simon Guo.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJX9x5ZAAoJEFHr6jzI4aWAWQ0P+gOhdtayMsRY0k0dzPmYaFr0
 Ha5v968RJaNIyGGM9ARJg8h27PGMaSlBp/9zaYdk1G7xfv/DMR0uq8d8l5pjy/Zw
 Jm72WE4PEX/zAcQxry6Y2fDdumO09crTBA/W0hM1UZzqu0bcVUfD+E51ZFYWW7yh
 fyhT2YnlucxIcT34pxsLqwTIiZYG4xgN3+YGo0wohY1D1GHE3UZ7SXIglb49yM6v
 ZeXrL7SOdERR1w88rC+g99P/cWng5HDS0wPLUbxGT5KIpoOSXOs7EbZwFqQBUy5O
 37PB07K5dDyUbrm++l5lUigldF3W1OZQBN5+n8PciulxxwFX84pllTlAxv1p60JR
 piEKZ8pl023IF7zMGatUG9qcNOcnbxdMsAhoEhlcFi9ulM/yLzbmRTKVfDYm+O/J
 UI+YtcbsgdyOXMdGXCqdpeBNuuypgLG/g7gC8bnk3taS0LUUZLcXtRNuE4tcPJJe
 v8FnszaLkjAi83Lmzt3fgZo7DI1RIPwDSw6fY+nBrxCRfEPRVx3f7KhmUXvSeol5
 Ln9xpk4AtyQt1RHhckxXwWSUgvXVg2ltmz7ElqK4sQ9mO/D2ZIs6R6fPY4VlJLc4
 /2yIV4RLIsbHmdv9IbJ8PBp0VTugSNdicZ904QiAHSZQv/i1mgYuXw3tjR6kuy9f
 bKOzNJTwLV1WUsOlUpiq
 =Jnn8
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:
 "Highlights:
   - Major rework of Book3S 64-bit exception vectors (Nicholas Piggin)
   - Use gas sections for arranging exception vectors et. al.
   - Large set of TM cleanups and selftests (Cyril Bur)
   - Enable transactional memory (TM) lazily for userspace (Cyril Bur)
   - Support for XZ compression in the zImage wrapper (Oliver
     O'Halloran)
   - Add support for bpf constant blinding (Naveen N. Rao)
   - Beginnings of upstream support for PA Semi Nemo motherboards
     (Darren Stevens)

  Fixes:
   - Ensure .mem(init|exit).text are within _stext/_etext (Michael
     Ellerman)
   - xmon: Don't use ld on 32-bit (Michael Ellerman)
   - vdso64: Use double word compare on pointers (Anton Blanchard)
   - powerpc/nvram: Fix an incorrect partition merge (Pan Xinhui)
   - powerpc: Fix usage of _PAGE_RO in hugepage (Christophe Leroy)
   - powerpc/mm: Update FORCE_MAX_ZONEORDER range to allow hugetlb w/4K
     (Aneesh Kumar K.V)
   - Fix memory leak in queue_hotplug_event() error path (Andrew
     Donnellan)
   - Replay hypervisor maintenance interrupt first (Nicholas Piggin)

  Various performance optimisations (Anton Blanchard):
   - Align hot loops of memset() and backwards_memcpy()
   - During context switch, check before setting mm_cpumask
   - Remove static branch prediction in atomic{, 64}_add_unless
   - Only disable HAVE_EFFICIENT_UNALIGNED_ACCESS on POWER7 little
     endian
   - Set default CPU type to POWER8 for little endian builds

  Cleanups & features:
   - Sparse fixes/cleanups (Daniel Axtens)
   - Preserve CFAR value on SLB miss caused by access to bogus address
     (Paul Mackerras)
   - Radix MMU fixups for POWER9 (Aneesh Kumar K.V)
   - Support for setting used_(vsr|vr|spe) in sigreturn path (for CRIU)
     (Simon Guo)
   - Optimise syscall entry for virtual, relocatable case (Nicholas
     Piggin)
   - Optimise MSR handling in exception handling (Nicholas Piggin)
   - Support for kexec with Radix MMU (Benjamin Herrenschmidt)
   - powernv EEH fixes (Russell Currey)
   - Suprise PCI hotplug support for powernv (Gavin Shan)
   - Endian/sparse fixes for powernv PCI (Gavin Shan)
   - Defconfig updates (Anton Blanchard)
   - KVM: PPC: Book3S HV: Migrate pinned pages out of CMA (Balbir Singh)
   - cxl: Flush PSL cache before resetting the adapter (Frederic Barrat)
   - cxl: replace loop with for_each_child_of_node(), remove unneeded
     of_node_put() (Andrew Donnellan)
   - Fix HV facility unavailable to use correct handler (Nicholas
     Piggin)
   - Remove unnecessary syscall trampoline (Nicholas Piggin)
   - fadump: Fix build break when CONFIG_PROC_VMCORE=n (Michael
     Ellerman)
   - Quieten EEH message when no adapters are found (Anton Blanchard)
   - powernv: Add PHB register dump debugfs handle (Russell Currey)
   - Use kprobe blacklist for exception handlers & asm functions
     (Nicholas Piggin)
   - Document the syscall ABI (Nicholas Piggin)
   - MAINTAINERS: Update cxl maintainers (Michael Neuling)
   - powerpc: Remove all usages of NO_IRQ (Michael Ellerman)

  Minor cleanups:
   - Andrew Donnellan, Christophe Leroy, Colin Ian King, Cyril Bur,
     Frederic Barrat, Pan Xinhui, PrasannaKumar Muralidharan, Rui Teng,
     Simon Guo"

* tag 'powerpc-4.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (156 commits)
  powerpc/bpf: Add support for bpf constant blinding
  powerpc/bpf: Implement support for tail calls
  powerpc/bpf: Introduce accessors for using the tmp local stack space
  powerpc/fadump: Fix build break when CONFIG_PROC_VMCORE=n
  powerpc: tm: Enable transactional memory (TM) lazily for userspace
  powerpc/tm: Add TM Unavailable Exception
  powerpc: Remove do_load_up_transact_{fpu,altivec}
  powerpc: tm: Rename transct_(*) to ck(\1)_state
  powerpc: tm: Always use fp_state and vr_state to store live registers
  selftests/powerpc: Add checks for transactional VSXs in signal contexts
  selftests/powerpc: Add checks for transactional VMXs in signal contexts
  selftests/powerpc: Add checks for transactional FPUs in signal contexts
  selftests/powerpc: Add checks for transactional GPRs in signal contexts
  selftests/powerpc: Check that signals always get delivered
  selftests/powerpc: Add TM tcheck helpers in C
  selftests/powerpc: Allow tests to extend their kill timeout
  selftests/powerpc: Introduce GPR asm helper header file
  selftests/powerpc: Move VMX stack frame macros to header file
  selftests/powerpc: Rework FPU stack placement macros and move to header file
  selftests/powerpc: Check for VSX preservation across userspace preemption
  ...
2016-10-07 20:19:31 -07:00
Linus Torvalds 6218590bcb KVM updates for v4.9-rc1
All architectures:
   Move `make kvmconfig` stubs from x86;  use 64 bits for debugfs stats.
 
 ARM:
   Important fixes for not using an in-kernel irqchip; handle SError
   exceptions and present them to guests if appropriate; proxying of GICV
   access at EL2 if guest mappings are unsafe; GICv3 on AArch32 on ARMv8;
   preparations for GICv3 save/restore, including ABI docs; cleanups and
   a bit of optimizations.
 
 MIPS:
   A couple of fixes in preparation for supporting MIPS EVA host kernels;
   MIPS SMP host & TLB invalidation fixes.
 
 PPC:
   Fix the bug which caused guests to falsely report lockups; other minor
   fixes; a small optimization.
 
 s390:
   Lazy enablement of runtime instrumentation; up to 255 CPUs for nested
   guests; rework of machine check deliver; cleanups and fixes.
 
 x86:
   IOMMU part of AMD's AVIC for vmexit-less interrupt delivery; Hyper-V
   TSC page; per-vcpu tsc_offset in debugfs; accelerated INS/OUTS in
   nVMX; cleanups and fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABCAAGBQJX9iDrAAoJEED/6hsPKofoOPoIAIUlgojkb9l2l1XVDgsXdgQL
 sRVhYSVv7/c8sk9vFImrD5ElOPZd+CEAIqFOu45+NM3cNi7gxip9yftUVs7wI5aC
 eDZRWm1E4trDZLe54ZM9ThcqZzZZiELVGMfR1+ZndUycybwyWzafpXYsYyaXp3BW
 hyHM3qVkoWO3dxBWFwHIoO/AUJrWYkRHEByKyvlC6KPxSdBPSa5c1AQwMCoE0Mo4
 K/xUj4gBn9eMelNhg4Oqu/uh49/q+dtdoP2C+sVM8bSdquD+PmIeOhPFIcuGbGFI
 B+oRpUhIuntN39gz8wInJ4/GRSeTuR2faNPxMn4E1i1u4LiuJvipcsOjPfe0a18=
 =fZRB
 -----END PGP SIGNATURE-----

Merge tag 'kvm-4.9-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Radim Krčmář:
 "All architectures:
   - move `make kvmconfig` stubs from x86
   - use 64 bits for debugfs stats

  ARM:
   - Important fixes for not using an in-kernel irqchip
   - handle SError exceptions and present them to guests if appropriate
   - proxying of GICV access at EL2 if guest mappings are unsafe
   - GICv3 on AArch32 on ARMv8
   - preparations for GICv3 save/restore, including ABI docs
   - cleanups and a bit of optimizations

  MIPS:
   - A couple of fixes in preparation for supporting MIPS EVA host
     kernels
   - MIPS SMP host & TLB invalidation fixes

  PPC:
   - Fix the bug which caused guests to falsely report lockups
   - other minor fixes
   - a small optimization

  s390:
   - Lazy enablement of runtime instrumentation
   - up to 255 CPUs for nested guests
   - rework of machine check deliver
   - cleanups and fixes

  x86:
   - IOMMU part of AMD's AVIC for vmexit-less interrupt delivery
   - Hyper-V TSC page
   - per-vcpu tsc_offset in debugfs
   - accelerated INS/OUTS in nVMX
   - cleanups and fixes"

* tag 'kvm-4.9-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (140 commits)
  KVM: MIPS: Drop dubious EntryHi optimisation
  KVM: MIPS: Invalidate TLB by regenerating ASIDs
  KVM: MIPS: Split kernel/user ASID regeneration
  KVM: MIPS: Drop other CPU ASIDs on guest MMU changes
  KVM: arm/arm64: vgic: Don't flush/sync without a working vgic
  KVM: arm64: Require in-kernel irqchip for PMU support
  KVM: PPC: Book3s PR: Allow access to unprivileged MMCR2 register
  KVM: PPC: Book3S PR: Support 64kB page size on POWER8E and POWER8NVL
  KVM: PPC: Book3S: Remove duplicate setting of the B field in tlbie
  KVM: PPC: BookE: Fix a sanity check
  KVM: PPC: Book3S HV: Take out virtual core piggybacking code
  KVM: PPC: Book3S: Treat VTB as a per-subcore register, not per-thread
  ARM: gic-v3: Work around definition of gic_write_bpr1
  KVM: nVMX: Fix the NMI IDT-vectoring handling
  KVM: VMX: Enable MSR-BASED TPR shadow even if APICv is inactive
  KVM: nVMX: Fix reload apic access page warning
  kvmconfig: add virtio-gpu to config fragment
  config: move x86 kvm_guest.config to a common location
  arm64: KVM: Remove duplicating init code for setting VMID
  ARM: KVM: Support vgic-v3
  ...
2016-10-06 10:49:01 -07:00
Linus Torvalds 597f03f9d1 Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull CPU hotplug updates from Thomas Gleixner:
 "Yet another batch of cpu hotplug core updates and conversions:

   - Provide core infrastructure for multi instance drivers so the
     drivers do not have to keep custom lists.

   - Convert custom lists to the new infrastructure. The block-mq custom
     list conversion comes through the block tree and makes the diffstat
     tip over to more lines removed than added.

   - Handle unbalanced hotplug enable/disable calls more gracefully.

   - Remove the obsolete CPU_STARTING/DYING notifier support.

   - Convert another batch of notifier users.

   The relayfs changes which conflicted with the conversion have been
   shipped to me by Andrew.

   The remaining lot is targeted for 4.10 so that we finally can remove
   the rest of the notifiers"

* 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits)
  cpufreq: Fix up conversion to hotplug state machine
  blk/mq: Reserve hotplug states for block multiqueue
  x86/apic/uv: Convert to hotplug state machine
  s390/mm/pfault: Convert to hotplug state machine
  mips/loongson/smp: Convert to hotplug state machine
  mips/octeon/smp: Convert to hotplug state machine
  fault-injection/cpu: Convert to hotplug state machine
  padata: Convert to hotplug state machine
  cpufreq: Convert to hotplug state machine
  ACPI/processor: Convert to hotplug state machine
  virtio scsi: Convert to hotplug state machine
  oprofile/timer: Convert to hotplug state machine
  block/softirq: Convert to hotplug state machine
  lib/irq_poll: Convert to hotplug state machine
  x86/microcode: Convert to hotplug state machine
  sh/SH-X3 SMP: Convert to hotplug state machine
  ia64/mca: Convert to hotplug state machine
  ARM/OMAP/wakeupgen: Convert to hotplug state machine
  ARM/shmobile: Convert to hotplug state machine
  arm64/FP/SIMD: Convert to hotplug state machine
  ...
2016-10-03 19:43:08 -07:00
Balbir Singh 2e5bbb5461 KVM: PPC: Book3S HV: Migrate pinned pages out of CMA
When PCI Device pass-through is enabled via VFIO, KVM-PPC will
pin pages using get_user_pages_fast(). One of the downsides of
the pinning is that the page could be in CMA region. The CMA
region is used for other allocations like the hash page table.
Ideally we want the pinned pages to be from non CMA region.

This patch (currently only for KVM PPC with VFIO) forcefully
migrates the pages out (huge pages are omitted for the moment).
There are more efficient ways of doing this, but that might
be elaborate and might impact a larger audience beyond just
the kvm ppc implementation.

The magic is in new_iommu_non_cma_page() which allocates the
new page from a non CMA region.

I've tested the patches lightly at my end. The full solution
requires migration of THP pages in the CMA region. That work
will be done incrementally on top of this.

Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Acked-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[mpe: Merged via powerpc tree as that's where the changes are]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-29 15:14:44 +10:00
Rui Teng f1a55ce054 powerpc: Clean up tm_abort duplication in hash_utils_64.c
The same logic appears twice and should probably be pulled out into a
function.

Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Rui Teng <rui.teng@linux.vnet.ibm.com>
[mpe: Rename to tm_flush_hash_page() and move comment into the function]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-23 07:54:23 +10:00
Christophe Leroy 6b8cb66a6a powerpc: Fix usage of _PAGE_RO in hugepage
On some CPUs like the 8xx, _PAGE_RW hence _PAGE_WRITE is defined
as 0 and _PAGE_RO has to be set when a page is not writable

_PAGE_RO is defined by default in pte-common.h, however BOOK3S/64
doesn't include that file so _PAGE_RO has to be defined explicitly
in book3s/64/pgtable.h

Fixes: a7b9f671f2 ("powerpc32: adds handling of _PAGE_RO")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-23 07:54:22 +10:00
Aneesh Kumar K.V be34d30059 powerpc/mm: Add radix flush all with IS=3
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-23 07:54:18 +10:00
Benjamin Herrenschmidt fe036a0605 powerpc/64/kexec: Fix MMU cleanup on radix
Just using the hash ops won't work anymore since radix will have
NULL in there. Instead create an mmu_cleanup_all() function which
will do the right thing based on the MMU mode.

For Radix, for now I clear UPRT and the PTCR, effectively switching
back to Radix with no partition table setup.

Currently set it to NULL on BookE thought it might be a good idea
to wipe the TLB there (Scott ?)

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-23 07:54:17 +10:00
Nicholas Piggin 03465f899b powerpc: Use kprobe blacklist for exception handlers
Currently we mark the C implementations of some exception handlers as
__kprobes. This has the effect of putting them in the ".kprobes.text"
section, which separates them from the rest of the text.

Instead we can use the blacklist macros to add the symbols to a
blacklist which kprobes will check. This allows the linker to move
exception handler functions close to callers and avoids trampolines in
larger kernels.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Reword change log a bit]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-19 10:53:54 +10:00
Colin Ian King 3daf3c2069 powerpc/32: Add missing \n and switch to pr_warn()
The message is missing a \n, add it. Switch to pr_warn(), it's shorter
and less ugly.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-13 17:37:11 +10:00
Aneesh Kumar K.V ad410674f5 powerpc/mm: Update the HID bit when switching from radix to hash
Power9 DD1 requires to update the hid0 register when switching from
hash to radix.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-13 17:37:10 +10:00
Aneesh Kumar K.V c6d1a767b9 powerpc/mm/radix: Use different pte update sequence for different POWER9 revs
POWER9 DD1 requires pte to be marked invalid (V=0) before updating
it with the new value. This makes this distinction for the different
revisions.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-13 17:37:10 +10:00
Michael Ellerman 68201fbbb0 powerpc/Makefile: Drop CONFIG_WORD_SIZE for BITS
Commit 2578bfae84 ("[POWERPC] Create and use CONFIG_WORD_SIZE") added
CONFIG_WORD_SIZE, and suggests that other arches were going to do
likewise.

But that never happened, powerpc is the only architecture which uses it.

So switch to using a simple make variable, BITS, like x86, sh, sparc and
tile. It is also easier to spell and simpler, avoiding any confusion
about whether it's defined due to ordering of make vs kconfig.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-13 17:37:06 +10:00
Paul Mackerras f0f558b131 powerpc/mm: Preserve CFAR value on SLB miss caused by access to bogus address
Currently, if userspace or the kernel accesses a completely bogus address,
for example with any of bits 46-59 set, we first take an SLB miss interrupt,
install a corresponding SLB entry with VSID 0, retry the instruction, then
take a DSI/ISI interrupt because there is no HPT entry mapping the address.
However, by the time of the second interrupt, the Come-From Address Register
(CFAR) has been overwritten by the rfid instruction at the end of the SLB
miss interrupt handler.  Since bogus accesses can often be caused by a
function return after the stack has been overwritten, the CFAR value would
be very useful as it could indicate which function it was whose return had
led to the bogus address.

This patch adds code to create a full exception frame in the SLB miss handler
in the case of a bogus address, rather than inserting an SLB entry with a
zero VSID field.  Then we call a new slb_miss_bad_addr() function in C code,
which delivers a signal for a user access or creates an oops for a kernel
access.  In the latter case the oops message will show the CFAR value at the
time of the access.

In the case of the radix MMU, a segment miss interrupt indicates an access
outside the ranges mapped by the page tables.  Previously this was handled
by the code for an unrecoverable SLB miss (one with MSR[RI] = 0), which is
not really correct.  With this patch, we now handle these interrupts with
slb_miss_bad_addr(), which is much more consistent.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-13 17:37:03 +10:00
Paul Mackerras 0eeede0c63 powerpc/mm: Speed up computation of base and actual page size for a HPTE
This replaces a 2-D search through an array with a simple 8-bit table
lookup for determining the actual and/or base page size for a HPT entry.

The encoding in the second doubleword of the HPTE is designed to encode
the actual and base page sizes without using any more bits than would be
needed for a 4k page number, by using between 1 and 8 low-order bits of
the RPN (real page number) field to encode the page sizes.  A single
"large page" bit in the first doubleword indicates that these low-order
bits are to be interpreted like this.

We can determine the page sizes by using the low-order 8 bits of the RPN
to look up a 256-entry table.  For actual page sizes less than 1MB, some
of the upper bits of these 8 bits are going to be real address bits, but
we can cope with that by replicating the entries for those smaller page
sizes.

While we're at it, let's move the hpte_page_size() and hpte_base_page_size()
functions from a KVM-specific header to a header for 64-bit HPT systems,
since this computation doesn't have anything specifically to do with KVM.

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2016-09-09 16:14:48 +10:00