Commit Graph

5429 Commits

Author SHA1 Message Date
Linus Torvalds 7246f60068 powerpc updates for 4.12 part 1.
Highlights include:
 
  - Larger virtual address space on 64-bit server CPUs. By default we use a 128TB
    virtual address space, but a process can request access to the full 512TB by
    passing a hint to mmap().
 
  - Support for the new Power9 "XIVE" interrupt controller.
 
  - TLB flushing optimisations for the radix MMU on Power9.
 
  - Support for CAPI cards on Power9, using the "Coherent Accelerator Interface
    Architecture 2.0".
 
  - The ability to configure the mmap randomisation limits at build and runtime.
 
  - Several small fixes and cleanups to the kprobes code, as well as support for
    KPROBES_ON_FTRACE.
 
  - Major improvements to handling of system reset interrupts, correctly treating
    them as NMIs, giving them a dedicated stack and using a new hypervisor call
    to trigger them, all of which should aid debugging and robustness.
 
 Many fixes and other minor enhancements.
 
 Thanks to:
   Alastair D'Silva, Alexey Kardashevskiy, Alistair Popple, Andrew Donnellan,
   Aneesh Kumar K.V, Anshuman Khandual, Anton Blanchard, Balbir Singh, Ben
   Hutchings, Benjamin Herrenschmidt, Bhupesh Sharma, Chris Packham, Christian
   Zigotzky, Christophe Leroy, Christophe Lombard, Daniel Axtens, David Gibson,
   Gautham R. Shenoy, Gavin Shan, Geert Uytterhoeven, Guilherme G. Piccoli,
   Hamish Martin, Hari Bathini, Kees Cook, Laurent Dufour, Madhavan Srinivasan,
   Mahesh J Salgaonkar, Mahesh Salgaonkar, Masami Hiramatsu, Matt Brown, Matthew
   R. Ochs, Michael Neuling, Naveen N. Rao, Nicholas Piggin, Oliver O'Halloran,
   Pan Xinhui, Paul Mackerras, Rashmica Gupta, Russell Currey, Sukadev
   Bhattiprolu, Thadeu Lima de Souza Cascardo, Tobin C. Harding, Tyrel Datwyler,
   Uma Krishnan, Vaibhav Jain, Vipin K Parashar, Yang Shi.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJZDHUMAAoJEFHr6jzI4aWAT7oQALkE2Nj3gjcn1z0SkFhq/1iO
 Py9Elmqm4E+L6NKYtBY5dS8xVAJ088ffzERyqJ1FY1LHkB8tn8bWRcMQmbjAFzTI
 V4TAzDNI890BN/F4ptrYRwNFxRBHAvZ4NDunTzagwYnwmTzW9PYHmOi4pvWTo3Tw
 KFUQ0joLSEgHzyfXxYB3fyj41u8N0FZvhfazdNSqia2Y5Vwwv/ION5jKplDM+09Y
 EtVEXFvaKAS1sjbM/d/Jo5rblHfR0D9/lYV10+jjyIokjzslIpyTbnj3izeYoM5V
 I4h99372zfsEjBGPPXyM3khL3zizGMSDYRmJHQSaKxjtecS9SPywPTZ8ufO/aSzV
 Ngq6nlND+f1zep29VQ0cxd3Jh40skWOXzxJaFjfDT25xa6FbfsWP2NCtk8PGylZ7
 EyqTuCWkMgIP02KlX3oHvEB2LRRPCDmRU2zECecRGNJrIQwYC2xjoiVi7Q8Qe8rY
 gr7Ib5Jj/a+uiTcCIy37+5nXq2s14/JBOKqxuYZIxeuZFvKYuRUipbKWO05WDOAz
 m/pSzeC3J8AAoYiqR0gcSOuJTOnJpGhs7zrQFqnEISbXIwLW+ICumzOmTAiBqOEY
 Rt8uW2gYkPwKLrE05445RfVUoERaAjaE06eRMOWS6slnngHmmnRJbf3PcoALiJkT
 ediqGEj0/N1HMB31V5tS
 =vSF3
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:
 "Highlights include:

   - Larger virtual address space on 64-bit server CPUs. By default we
     use a 128TB virtual address space, but a process can request access
     to the full 512TB by passing a hint to mmap().

   - Support for the new Power9 "XIVE" interrupt controller.

   - TLB flushing optimisations for the radix MMU on Power9.

   - Support for CAPI cards on Power9, using the "Coherent Accelerator
     Interface Architecture 2.0".

   - The ability to configure the mmap randomisation limits at build and
     runtime.

   - Several small fixes and cleanups to the kprobes code, as well as
     support for KPROBES_ON_FTRACE.

   - Major improvements to handling of system reset interrupts,
     correctly treating them as NMIs, giving them a dedicated stack and
     using a new hypervisor call to trigger them, all of which should
     aid debugging and robustness.

   - Many fixes and other minor enhancements.

  Thanks to: Alastair D'Silva, Alexey Kardashevskiy, Alistair Popple,
  Andrew Donnellan, Aneesh Kumar K.V, Anshuman Khandual, Anton
  Blanchard, Balbir Singh, Ben Hutchings, Benjamin Herrenschmidt,
  Bhupesh Sharma, Chris Packham, Christian Zigotzky, Christophe Leroy,
  Christophe Lombard, Daniel Axtens, David Gibson, Gautham R. Shenoy,
  Gavin Shan, Geert Uytterhoeven, Guilherme G. Piccoli, Hamish Martin,
  Hari Bathini, Kees Cook, Laurent Dufour, Madhavan Srinivasan, Mahesh J
  Salgaonkar, Mahesh Salgaonkar, Masami Hiramatsu, Matt Brown, Matthew
  R. Ochs, Michael Neuling, Naveen N. Rao, Nicholas Piggin, Oliver
  O'Halloran, Pan Xinhui, Paul Mackerras, Rashmica Gupta, Russell
  Currey, Sukadev Bhattiprolu, Thadeu Lima de Souza Cascardo, Tobin C.
  Harding, Tyrel Datwyler, Uma Krishnan, Vaibhav Jain, Vipin K Parashar,
  Yang Shi"

* tag 'powerpc-4.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (214 commits)
  powerpc/64s: Power9 has no LPCR[VRMASD] field so don't set it
  powerpc/powernv: Fix TCE kill on NVLink2
  powerpc/mm/radix: Drop support for CPUs without lockless tlbie
  powerpc/book3s/mce: Move add_taint() later in virtual mode
  powerpc/sysfs: Move #ifdef CONFIG_HOTPLUG_CPU out of the function body
  powerpc/smp: Document irq enable/disable after migrating IRQs
  powerpc/mpc52xx: Don't select user-visible RTAS_PROC
  powerpc/powernv: Document cxl dependency on special case in pnv_eeh_reset()
  powerpc/eeh: Clean up and document event handling functions
  powerpc/eeh: Avoid use after free in eeh_handle_special_event()
  cxl: Mask slice error interrupts after first occurrence
  cxl: Route eeh events to all drivers in cxl_pci_error_detected()
  cxl: Force context lock during EEH flow
  powerpc/64: Allow CONFIG_RELOCATABLE if COMPILE_TEST
  powerpc/xmon: Teach xmon oops about radix vectors
  powerpc/mm/hash: Fix off-by-one in comment about kernel contexts ids
  powerpc/pseries: Enable VFIO
  powerpc/powernv: Fix iommu table size calculation hook for small tables
  powerpc/powernv: Check kzalloc() return value in pnv_pci_table_alloc
  powerpc: Add arch/powerpc/tools directory
  ...
2017-05-05 11:36:44 -07:00
Nicholas Piggin 700b7eadd5 powerpc/64s: Power9 has no LPCR[VRMASD] field so don't set it
Power9/ISAv3 has no VRMASD field in LPCR, we shouldn't be setting reserved bits,
so don't set them on Power9.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-05-03 20:45:55 +10:00
Mahesh Salgaonkar d93b0ac01a powerpc/book3s/mce: Move add_taint() later in virtual mode
machine_check_early() gets called in real mode. The very first time when
add_taint() is called, it prints a warning which ends up calling opal
call (that uses OPAL_CALL wrapper) for writing it to console. If we get a
very first machine check while we are in opal we are doomed. OPAL_CALL
overwrites the PACASAVEDMSR in r13 and in this case when we are done with
MCE handling the original opal call will use this new MSR on it's way
back to opal_return. This usually leads to unexpected behaviour or the
kernel to panic. Instead move the add_taint() call later in the virtual
mode where it is safe to call.

This is broken with current FW level. We got lucky so far for not getting
very first MCE hit while in OPAL. But easily reproducible on Mambo.

Fixes: 27ea2c420c ("powerpc: Set the correct kernel taint on machine check errors.")
Cc: stable@vger.kernel.org # v4.2+
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-05-03 14:45:39 +10:00
Michael Ellerman 3f2290e1b5 powerpc/sysfs: Move #ifdef CONFIG_HOTPLUG_CPU out of the function body
The entire body of unregister_cpu_online() is inside an #ifdef
CONFIG_HOTPLUG_CPU block. This is ugly and means we create an empty function
when hotplug is disabled for no reason.

Instead move the #ifdef out of the function body and define the function to be
NULL in the else case. This means we'll pass NULL to cpuhp_setup_state(), but
that's fine because it accepts NULL to mean there is no teardown callback, which
is exactly what we want.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-05-03 14:45:38 +10:00
Michael Ellerman 687b8f24f1 powerpc/smp: Document irq enable/disable after migrating IRQs
This code was until recently completely undocumented and even now the comment is
not very verbose.

We've already had one patch sent to remove the IRQ enable/disable because it's
"paradoxical and unnecessary". So document it thoroughly to save anyone else
from puzzling over it.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-05-03 14:45:38 +10:00
Linus Torvalds 76f1948a79 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching
Pull livepatch updates from Jiri Kosina:

 - a per-task consistency model is being added for architectures that
   support reliable stack dumping (extending this, currently rather
   trivial set, is currently in the works).

   This extends the nature of the types of patches that can be applied
   by live patching infrastructure. The code stems from the design
   proposal made [1] back in November 2014. It's a hybrid of SUSE's
   kGraft and RH's kpatch, combining advantages of both: it uses
   kGraft's per-task consistency and syscall barrier switching combined
   with kpatch's stack trace switching. There are also a number of
   fallback options which make it quite flexible.

   Most of the heavy lifting done by Josh Poimboeuf with help from
   Miroslav Benes and Petr Mladek

   [1] https://lkml.kernel.org/r/20141107140458.GA21774@suse.cz

 - module load time patch optimization from Zhou Chengming

 - a few assorted small fixes

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching:
  livepatch: add missing printk newlines
  livepatch: Cancel transition a safe way for immediate patches
  livepatch: Reduce the time of finding module symbols
  livepatch: make klp_mutex proper part of API
  livepatch: allow removal of a disabled patch
  livepatch: add /proc/<pid>/patch_state
  livepatch: change to a per-task consistency model
  livepatch: store function sizes
  livepatch: use kstrtobool() in enabled_store()
  livepatch: move patching functions into patch.c
  livepatch: remove unnecessary object loaded check
  livepatch: separate enabled and patched states
  livepatch/s390: add TIF_PATCH_PENDING thread flag
  livepatch/s390: reorganize TIF thread flag bits
  livepatch/powerpc: add TIF_PATCH_PENDING thread flag
  livepatch/x86: add TIF_PATCH_PENDING thread flag
  livepatch: create temporary klp_update_patch_state() stub
  x86/entry: define _TIF_ALLWORK_MASK flags explicitly
  stacktrace/x86: add function for detecting reliable stack traces
2017-05-02 18:24:16 -07:00
Linus Torvalds 2575be8ad3 - constify compression structures; Bhumika Goyal
- restore powerpc dumping; Ankit Kumar
 - fix more bugs in the rarely exercises module unloading logic
 - reorganize filesystem locking to fix problems noticed by lockdep
 - refactor internal pstore APIs to make development and review easier:
   - improve error reporting
   - add kernel-doc structure and function comments
   - avoid insane argument passing by using a common record structure
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 Comment: Kees Cook <kees@outflux.net>
 
 iQIcBAABCgAGBQJZB3wzAAoJEIly9N/cbcAmVQAP/3+GzoxUcL43ypsDa1CmCsFN
 l2roQjzWLNGfHgq5qkS/mNtrUdEvBMUBd2oyhHcaiqM0DsuuO3rKTp6dZ8oczYjN
 6GoTmU8ZwIPze3VEadNPjCIdpsfTMNKtZvVJCTrWnsgXxTawDS89qqr7SCs3qhBS
 Dm1E8oX77YyhOKoGA6O3CJxpdm/Ge4+KpPR6Uwj90eVro04vYoiwnBjyLUzE7w1l
 JcXEGEh1t5NjUxHeMwW7HZwJYZfA3DQ6I3MOOzGhf9tsKp6J0LQTTV8PMSEo1mif
 mLZhDBy8BBlunL+b2Tp3+c4QItGSHkBCWASI2RLa2TM7xvL67oC+qm/WaUyoRovy
 hllEG96rsCs3Zx7fFFsfQCwURcTWfJQMrD+0d/fM+P2ylWvgp+KU6PeLTS9IHu6M
 3n6i5i6A6OY/QvmZr1tN/06kUBjtQmo8EgQ0jxoxAlWyNcJqi93hmJyaRW28KxjS
 tjFTNLZMrslj0UDmjiD6fIuaT6gsGDB+3wAMPVAf+iV/k/2GUlj3ZILe4RaABAe9
 8xaUu11tZ5sTniayZ+10bA+6+K5n7uTlgU8RfFgaUZoRAzHgtyijOmdo6N+HILfK
 klv59B1Fmf6JpDlq7L9vurOqE82FAWFn4DruFM2bAaky2meFUNbYFiNfwK4l6lPI
 pmAgpdgRRvNMBCEmbVfv
 =S14G
 -----END PGP SIGNATURE-----

Merge tag 'pstore-v4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull pstore updates from Kees Cook:
 "This has a large internal refactoring along with several smaller
  fixes.

   - constify compression structures; Bhumika Goyal

   - restore powerpc dumping; Ankit Kumar

   - fix more bugs in the rarely exercises module unloading logic

   - reorganize filesystem locking to fix problems noticed by lockdep

   - refactor internal pstore APIs to make development and review
     easier:
      - improve error reporting
      - add kernel-doc structure and function comments
      - avoid insane argument passing by using a common record
        structure"

* tag 'pstore-v4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (23 commits)
  pstore: Solve lockdep warning by moving inode locks
  pstore: Fix flags to enable dumps on powerpc
  pstore: Remove unused vmalloc.h in pmsg
  pstore: simplify write_user_compat()
  pstore: Remove write_buf() callback
  pstore: Replace arguments for write_buf_user() API
  pstore: Replace arguments for write_buf() API
  pstore: Replace arguments for erase() API
  pstore: Do not duplicate record metadata
  pstore: Allocate records on heap instead of stack
  pstore: Pass record contents instead of copying
  pstore: Always allocate buffer for decompression
  pstore: Replace arguments for write() API
  pstore: Replace arguments for read() API
  pstore: Switch pstore_mkfile to pass record
  pstore: Move record decompression to function
  pstore: Extract common arguments into structure
  pstore: Add kernel-doc for struct pstore_info
  pstore: Improve register_pstore() error reporting
  pstore: Avoid race in module unloading
  ...
2017-05-02 10:35:45 -07:00
Russell Currey c0b64978f0 powerpc/eeh: Clean up and document event handling functions
Remove unnecessary tags in eeh_handle_normal_event(), and add function
comments for eeh_handle_normal_event() and eeh_handle_special_event().

The only functional difference is that in the case of a PE reaching the
maximum number of failures, rather than one message telling you of this
and suggesting you reseat the device, there are two separate messages.

Suggested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Russell Currey <ruscur@russell.cc>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-05-02 22:41:43 +10:00
Russell Currey daeba2956f powerpc/eeh: Avoid use after free in eeh_handle_special_event()
eeh_handle_special_event() is called when an EEH event is detected but
can't be narrowed down to a specific PE.  This function looks through
every PE to find one in an erroneous state, then calls the regular event
handler eeh_handle_normal_event() once it knows which PE has an error.

However, if eeh_handle_normal_event() found that the PE cannot possibly
be recovered, it will free it, rendering the passed PE stale.
This leads to a use after free in eeh_handle_special_event() as it attempts to
clear the "recovering" state on the PE after eeh_handle_normal_event() returns.

Thus, make sure the PE is valid when attempting to clear state in
eeh_handle_special_event().

Fixes: 8a6b1bc70d ("powerpc/eeh: EEH core to handle special event")
Cc: stable@vger.kernel.org # v3.11+
Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Russell Currey <ruscur@russell.cc>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-05-02 22:41:43 +10:00
Linus Torvalds 3fb9268e43 Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 asm updates from Ingo Molnar:
 "The main changes in this cycle were:

   - unwinder fixes and enhancements

   - improve ftrace interaction with the unwinder

   - optimize the code footprint of WARN() and related debugging
     constructs

   - ... plus misc updates, cleanups and fixes"

* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
  x86/unwind: Dump all stacks in unwind_dump()
  x86/unwind: Silence more entry-code related warnings
  x86/ftrace: Fix ebp in ftrace_regs_caller that screws up unwinder
  x86/unwind: Remove unused 'sp' parameter in unwind_dump()
  x86/unwind: Prepend hex mask value with '0x' in unwind_dump()
  x86/unwind: Properly zero-pad 32-bit values in unwind_dump()
  x86/unwind: Ensure stack pointer is aligned
  debug: Avoid setting BUGFLAG_WARNING twice
  x86/unwind: Silence entry-related warnings
  x86/unwind: Read stack return address in update_stack_state()
  x86/unwind: Move common code into update_stack_state()
  debug: Fix __bug_table[] in arch linker scripts
  debug: Add _ONCE() logic to report_bug()
  x86/debug: Define BUG() again for !CONFIG_BUG
  x86/debug: Implement __WARN() using UD0
  x86/ftrace: Use Makefile logic instead of #ifdef for compiling ftrace_*.o
  x86/ftrace: Add -mfentry support to x86_32 with DYNAMIC_FTRACE set
  x86/ftrace: Clean up ftrace_regs_caller
  x86/ftrace: Add stack frame pointer to ftrace_caller
  x86/ftrace: Move the ftrace specific code out of entry_32.S
  ...
2017-05-01 22:07:51 -07:00
Linus Torvalds 3527d3e951 Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
 "The main changes in this cycle were:

   - another round of rq-clock handling debugging, robustization and
     fixes

   - PELT accounting improvements

   - CPU hotplug related ->cpus_allowed affinity handling fixes all
     around the tree

   - ... plus misc fixes, cleanups and updates"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (35 commits)
  sched/x86: Update reschedule warning text
  crypto: N2 - Replace racy task affinity logic
  cpufreq/sparc-us2e: Replace racy task affinity logic
  cpufreq/sparc-us3: Replace racy task affinity logic
  cpufreq/sh: Replace racy task affinity logic
  cpufreq/ia64: Replace racy task affinity logic
  ACPI/processor: Replace racy task affinity logic
  ACPI/processor: Fix error handling in __acpi_processor_start()
  sparc/sysfs: Replace racy task affinity logic
  powerpc/smp: Replace open coded task affinity logic
  ia64/sn/hwperf: Replace racy task affinity logic
  ia64/salinfo: Replace racy task affinity logic
  workqueue: Provide work_on_cpu_safe()
  ia64/topology: Remove cpus_allowed manipulation
  sched/fair: Move the PELT constants into a generated header
  sched/fair: Increase PELT accuracy for small tasks
  sched/fair: Fix comments
  sched/Documentation: Add 'sched-pelt' tool
  sched/fair: Fix corner case in __accumulate_sum()
  sched/core: Remove 'task' parameter and rename tsk_restore_flags() to current_restore_flags()
  ...
2017-05-01 19:12:53 -07:00
Linus Torvalds 174ddfd5df Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer updates from Thomas Gleixner:
 "The timer departement delivers:

   - more year 2038 rework

   - a massive rework of the arm achitected timer

   - preparatory patches to allow NTP correction of clock event devices
     to avoid early expiry

   - the usual pile of fixes and enhancements all over the place"

* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (91 commits)
  timer/sysclt: Restrict timer migration sysctl values to 0 and 1
  arm64/arch_timer: Mark errata handlers as __maybe_unused
  Clocksource/mips-gic: Remove redundant non devicetree init
  MIPS/Malta: Probe gic-timer via devicetree
  clocksource: Use GENMASK_ULL in definition of CLOCKSOURCE_MASK
  acpi/arm64: Add SBSA Generic Watchdog support in GTDT driver
  clocksource: arm_arch_timer: add GTDT support for memory-mapped timer
  acpi/arm64: Add memory-mapped timer support in GTDT driver
  clocksource: arm_arch_timer: simplify ACPI support code.
  acpi/arm64: Add GTDT table parse driver
  clocksource: arm_arch_timer: split MMIO timer probing.
  clocksource: arm_arch_timer: add structs to describe MMIO timer
  clocksource: arm_arch_timer: move arch_timer_needs_of_probing into DT init call
  clocksource: arm_arch_timer: refactor arch_timer_needs_probing
  clocksource: arm_arch_timer: split dt-only rate handling
  x86/uv/time: Set ->min_delta_ticks and ->max_delta_ticks
  unicore32/time: Set ->min_delta_ticks and ->max_delta_ticks
  um/time: Set ->min_delta_ticks and ->max_delta_ticks
  tile/time: Set ->min_delta_ticks and ->max_delta_ticks
  score/time: Set ->min_delta_ticks and ->max_delta_ticks
  ...
2017-05-01 16:15:18 -07:00
Nicholas Piggin c64af6458e powerpc: Add struct smp_ops_t.cause_nmi_ipi operation
Have the NMI IPI code use this op when the platform defines it.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-28 21:02:25 +10:00
Nicholas Piggin ddd703ca06 powerpc: Add NMI IPI infrastructure
Add a simple NMI IPI system that handles concurrency and reentrancy.

The platform does not have to implement a true non-maskable interrupt,
the default is to simply use the debugger break IPI message. This has
now been co-opted for a general IPI message, and users (debugger and
crash) have been reimplemented on top of the NMI system.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Incorporate incremental fixes from Nick]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-28 21:02:25 +10:00
Nicholas Piggin 2b4f3ac564 powerpc: Mark system reset as an NMI with nmi_enter/exit()
System reset is a non-maskable interrupt from Linux's point of view
(occurs under local_irq_disable()), so it should use nmi_enter/exit.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-28 21:02:25 +10:00
Nicholas Piggin b1ee8a3de5 powerpc/64s: Dedicated system reset interrupt stack
The system reset interrupt is used for crash/debug situations, so it is
desirable to have as little impact on the normal state of the system as
possible.

Currently it uses the current kernel stack to process the exception.
This stores into the stack which may be involved with the crash. The
stack pointer may be corrupted, or it may have overflowed.

Avoid or minimise these problems by creating a dedicated NMI stack for
the system reset interrupt to use.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-28 21:02:25 +10:00
Nicholas Piggin c4f3b52ce7 powerpc/64s: Disallow system reset vs system reset reentrancy
In preparation for using a dedicated stack for system reset interrupts,
prevent a nested system reset from recovering, in order to simplify
code that is called in crash/debug path. This allows a system reset
interrupt to just use the base stack pointer.

Keep an in_nmi nesting counter similarly to the in_mce counter. Consider
the interrrupt non-recoverable if it is taken inside another system
reset.

Interrupt nesting could be allowed similarly to MCE, but system reset
is a special case that's not for normal operation, so simplicity wins
until there is requirement for nested system reset interrupts.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-28 21:02:25 +10:00
Nicholas Piggin a3d96f70c1 powerpc/64s: Fix system reset vs general interrupt reentrancy
The system reset interrupt can occur when MSR_EE=0, and it currently
uses the PACA_EXGEN save area.

Some PACA_EXGEN interrupts have a window where MSR_RI=1 and MSR_EE=0
when the save area is still in use. A system reset interrupt in this
window can lead to undetected corruption when the save area gets
overwritten.

This patch introduces PACA_EXNMI save area for system reset exceptions,
which closes this corruption window. It's also helpful to retain the
EXGEN state for debugging situations, even if not considering the
recoverability aspect.

This patch also moves the PACA_EXMC area down to a less frequently used
part of the paca with the new save area.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-28 21:02:25 +10:00
Nicholas Piggin a4087a4d38 powerpc/64s: Exception macro for stack frame and initial register save
This code is common to a few exceptions, and another user will be added.
This causes a trivial change to generated code:

-     604: std     r9,416(r1)
-     608: mfspr   r11,314
-     60c: std     r11,368(r1)
-     610: mfspr   r12,315
+     604: mfspr   r11,314
+     608: mfspr   r12,315
+     60c: std     r9,416(r1)
+     610: std     r11,368(r1)

machine_check_powernv_early could also use this, but that requires non
trivial changes to generated code, so that's for another patch.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-28 21:02:25 +10:00
Nicholas Piggin 83a980f7f4 powerpc/64s: Add exception macro that does not enable RI
Subsequent patches will add more non-RI variant exceptions, so
create a macro for it rather than open-code it.

This does not change generated instructions.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-28 21:02:25 +10:00
Michael Ellerman b13f6683ed Merge branch 'topic/ppc-kvm' into next
Merge the topic branch we were sharing with kvm-ppc, Paul has also
merged it.
2017-04-28 20:19:37 +10:00
Ankit Kumar 041939c1ec pstore: Fix flags to enable dumps on powerpc
After commit c950fd6f20 kernel registers pstore write based on flag set.
Pstore write for powerpc is broken as flags(PSTORE_FLAGS_DMESG) is not set for
powerpc architecture. On panic, kernel doesn't write message to
/fs/pstore/dmesg*(Entry doesn't gets created at all).

This patch enables pstore write for powerpc architecture by setting
PSTORE_FLAGS_DMESG flag.

Fixes: c950fd6f20 ("pstore: Split pstore fragile flags")
Cc: stable@vger.kernel.org # v4.9+
Signed-off-by: Ankit Kumar <ankit@linux.vnet.ibm.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
2017-04-27 14:49:05 -07:00
Naveen N. Rao 096ff2ddba powerpc/ftrace/64: Split further based on -mprofile-kernel
Split ftrace_64.S further retaining the core ftrace 64-bit aspects
in ftrace_64.S and moving ftrace_caller() and ftrace_graph_caller() into
separate files based on -mprofile-kernel. The livepatch routines are all
now contained within the mprofile file.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-27 22:20:29 +10:00
Naveen N. Rao 7853f9c029 powerpc: Split ftrace bits into a separate file
entry_*.S now includes a lot more than just kernel entry/exit code. As a
first step at cleaning this up, let's split out the ftrace bits into
separate files. Also move all related tracing code into a new trace/
subdirectory.

No functional changes.

Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-27 22:20:29 +10:00
Tyrel Datwyler e76ca27790 powerpc/sysfs: Fix reference leak of cpu device_nodes present at boot
For CPUs present at boot each logical CPU acquires a reference to the
associated device node of the core. This happens in register_cpu() which
is called by topology_init(). The result of this is that we end up with
a reference held by each thread of the core. However, these references
are never freed if the CPU core is DLPAR removed.

This patch fixes the reference leaks by acquiring and releasing the references
in the CPU hotplug callbacks un/register_cpu_online(). With this patch symmetric
reference counting is observed with both CPUs present at boot, and those DLPAR
added after boot.

Fixes: f86e4718f2 ("driver/core: cpu: initialize of_node in cpu's device struture")
Cc: stable@vger.kernel.org # v3.12+
Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-25 00:24:59 +10:00
Michael Ellerman 9fc849144c Merge branch 'topic/kprobes' into next
Although most of these kprobes patches are powerpc specific, there's a couple
that touch generic code (with Acks). At the moment there's one conflict with
acme's tree, but it's not too bad. Still just in case some other conflicts show
up, we've put these in a topic branch so another tree could merge some or all of
it if necessary.
2017-04-25 00:24:04 +10:00
Naveen N. Rao 24bd909e94 powerpc/kprobes: Prefer ftrace when probing function entry
KPROBES_ON_FTRACE avoids much of the overhead of regular kprobes as it
eliminates the need for a trap, as well as the need to emulate or single-step
instructions.

Though OPTPROBES provides us with similar performance, we have limited
optprobes trampoline slots. As such, when asked to probe at a function
entry, default to using the ftrace infrastructure.

With:
  # cd /sys/kernel/debug/tracing
  # echo 'p _do_fork' > kprobe_events

before patch:
  # cat ../kprobes/list
  c0000000000daf08  k  _do_fork+0x8    [DISABLED]
  c000000000044fc0  k  kretprobe_trampoline+0x0    [OPTIMIZED]

and after patch:
  # cat ../kprobes/list
  c0000000000d074c  k  _do_fork+0xc    [DISABLED][FTRACE]
  c0000000000412b0  k  kretprobe_trampoline+0x0    [OPTIMIZED]

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-24 19:07:59 +10:00
Naveen N. Rao 1b32cd1715 powerpc: Introduce a new helper to obtain function entry points
kprobe_lookup_name() is specific to the kprobe subsystem and may not always
return the function entry point (in a subsequent patch for KPROBES_ON_FTRACE).
For looking up function entry points, introduce a separate helper and use it
in optprobes.c

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-24 19:07:58 +10:00
Naveen N. Rao ead514d5fb powerpc/kprobes: Add support for KPROBES_ON_FTRACE
Allow kprobes to be placed on ftrace _mcount() call sites. This optimization
avoids the use of a trap, by riding on ftrace infrastructure.

This depends on HAVE_DYNAMIC_FTRACE_WITH_REGS which depends on MPROFILE_KERNEL,
which is only currently enabled on powerpc64le with newer toolchains.

Based on the x86 code by Masami.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-24 19:07:58 +10:00
Naveen N. Rao 2f59be5b97 powerpc/ftrace: Restore LR from pt_regs
Pass the real LR to the ftrace handler. This is needed for KPROBES_ON_FTRACE for
the pre handlers.

Also, with KPROBES_ON_FTRACE, the link register may be updated by the pre
handlers or by a registed kretprobe. Honor updated LR by restoring it from
pt_regs, rather than from the stack save area.

Live patch and function graph continue to work fine with this change.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-24 19:07:57 +10:00
Naveen N. Rao 7aa5b018bf powerpc/kprobes: Blacklist exception handlers
Introduce __head_end to mark end of the early fixed sections and use it to
blacklist all exception handlers from kprobes.

mpe: We do not need to do anything special for relocatable kernels, where the
exception vectors are split from the main kernel, as the split vectors are
already excluded by the check for kernel_text_address().

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
[mpe: Move __head_end outside #ifdef 64-bit to unbreak the 32-bit build]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-23 20:32:25 +10:00
Naveen N. Rao 71f6e58e5e powerpc/kprobes: Convert __kprobes to NOKPROBE_SYMBOL()
Along similar lines as commit 9326638cbe ("kprobes, x86: Use NOKPROBE_SYMBOL()
instead of __kprobes annotation"), convert __kprobes annotation to either
NOKPROBE_SYMBOL() or nokprobe_inline. The latter forces inlining, in which case
the caller needs to be added to NOKPROBE_SYMBOL().

Also:
 - blacklist arch_deref_entry_point(), and
 - convert a few regular inlines to nokprobe_inline in lib/sstep.c

A key benefit is the ability to detect such symbols as being
blacklisted. Before this patch:

  $ cat /sys/kernel/debug/kprobes/blacklist | grep read_mem
  $ perf probe read_mem
  Failed to write event: Invalid argument
    Error: Failed to add events.
  $ dmesg | tail -1
  [ 3736.112815] Could not insert probe at _text+10014968: -22

After patch:
  $ cat /sys/kernel/debug/kprobes/blacklist | grep read_mem
  0xc000000000072b50-0xc000000000072d20	read_mem
  $ perf probe read_mem
  read_mem is blacklisted function, skip it.
  Added new events:
    (null):(null)        (on read_mem)
    probe:read_mem       (on read_mem)

  You can now use it in all perf tools, such as:

	  perf record -e probe:read_mem -aR sleep 1

  $ grep " read_mem" /proc/kallsyms
  c000000000072b50 t read_mem
  c0000000005f3b40 t read_mem
  $ cat /sys/kernel/debug/kprobes/list
  c0000000005f3b48  k  read_mem+0x8    [DISABLED]

Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
[mpe: Minor change log formatting, fix up some conflicts]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-23 20:32:25 +10:00
Naveen N. Rao 700e64377c powerpc/ftrace: Move stack setup and teardown code into ftrace_graph_caller()
Move the stack setup and teardown code into ftrace_graph_caller(). This way, we
don't incur the cost of setting it up unless function graph is enabled for this
function.

Also, remove the extraneous LR restore code after the function graph stub. LR
has previously been restored and neither livepatch_handler() nor
ftrace_graph_caller() return back here.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
[mpe: Drop bad change to non-mprofile-kernel version of ftrace_graph_caller]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-23 20:32:24 +10:00
Naveen N. Rao d08f8a28bc powerpc/kprobes: Remove duplicate saving of MSR
set_current_kprobe() already saves regs->msr into kprobe_saved_msr. Remove the
redundant save.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-23 20:32:24 +10:00
Nicholas Piggin 9cba253df4 powerpc/64s: Simplify POWER9 DD1 idle workaround code
The idle workaround does not need to load PACATOC, and it does not
need to be called within a nested function that requires LR to be
saved.

Load the PACATOC at entry to the idle wakeup. It does not matter which
PACA this comes from, so it's okay to call before the workaround. Then
apply the workaround to get the right PACA.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-23 20:32:23 +10:00
Nicholas Piggin 0d7720a242 powerpc/64s: Idle POWER8 avoid full state loss recovery where possible
If not all threads were in winkle, full state loss recovery is not
necessary and can be avoided. A previous patch removed this optimisation
due to some complexity with the implementation. Re-implement it by
counting the number of threads in winkle with the per-core idle state.
Only restore full state loss if all threads were in winkle.

This has a small window of false positives right before threads execute
winkle and just after they wake up, when the winkle count does not
reflect the true number of threads in winkle. This is not a significant
problem in comparison with even the minimum winkle duration. For
correctness, a false positive is not a problem (only false negatives
would be).

Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-23 20:32:12 +10:00
Nicholas Piggin e420249d44 powerpc/64s: Idle do not hold reservation longer than required
When taking the core idle state lock, grab it immediately like a regular
lock, rather than adding more tests in there. Holding the lock keeps it
stable, so there is no need to do it whole holding the reservation.

Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-23 20:31:57 +10:00
Nicholas Piggin adbcf8d74f powerpc/64s: Expand core idle state bits
In preparation for adding more bits to the core idle state word, move
the lock bit up, and unlock by flipping the lock bit rather than masking
off all but the thread bits.

Add branch hints for atomic operations while we're here.

Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-23 20:31:49 +10:00
Nicholas Piggin 1945bc4549 powerpc/64s: Fix POWER9 machine check handler from stop state
The ISA specifies power save wakeup due to a machine check exception can
cause a machine check interrupt (rather than the usual system reset
interrupt).

The machine check handler copes with this by doing low level machine
check recovery without restoring full state from idle, then queues up a
machine check event for logging, then directly executes the same idle
instruction it woke from. This minimises the work done before recovery
is performed.

The problem is that it requires machine specific instructions and
knowledge of the book3s idle code. Currently it only has code to handle
POWER8 idle, so POWER9 crashes when trying to execute the P8 idle
instructions which don't exist in ISAv3.0B.

cpu 0x0: Vector: e40 (Emulation Assist) at [c0000000008f3810]
    pc: c000000000008380: machine_check_handle_early+0x130/0x2f0
    lr: c00000000053a098: stop_loop+0x68/0xd0
    sp: c0000000008f3a90
   msr: 9000000000081001
  current = 0xc0000000008a1080
  paca    = 0xc00000000ffd0000   softe: 0        irq_happened: 0x01
    pid   = 0, comm = swapper/0

Instead of going to sleep after recovery, do the usual idle wakeup and
state restoration by calling into the normal idle wakeup path. This
reuses the normal idle wakeup paths.

Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Reviewed-by: Mahesh J Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-23 20:31:46 +10:00
Nicholas Piggin 10101aa9aa powerpc/64s: Use alternative feature patching
This reduces the number of nops for POWER8.

Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-23 20:31:43 +10:00
Nicholas Piggin 544686cae8 powerpc/64s: Stop using bit in HSPRG0 to test winkle
The POWER8 idle code has a neat trick of programming the power on engine
to restore a low bit into HSPRG0, so idle wakeup code can test and see
if it has been programmed this way and therefore lost all state. Restore
time can be reduced if winkle has not been reached.

However this messes with our r13 PACA pointer, and requires HSPRG0 to be
written to. It also optimizes the slowest and most uncommon case at the
expense of another SPR write in the common nap state wakeup.

Remove this complexity and assume winkle sleeps always require a state
restore. This speedup could be made entirely contained within the winkle
idle code by counting per-core winkles and setting a thread bitmap when
all have gone to winkle.

Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-23 20:31:39 +10:00
Nicholas Piggin bf0153c143 powerpc/64s: Move remaining system reset idle code into idle_book3s.S
No functional change.

Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-23 20:31:35 +10:00
Nicholas Piggin 2563a70c3b powerpc/64s: Remove unnecessary relocation branch from idle handler
The system reset idle handler system_reset_idle_common is relocated, so
relocation is not required to branch to kvm_start_guest. The superfluous
relocation does not result in incorrect code, but it does not compile
outside of exception-64s.S (with fixed section definitions).

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-23 17:26:35 +10:00
Naveen N. Rao 22d8b3dec2 powerpc/kprobes: Emulate instructions on kprobe handler re-entry
On kprobe handler re-entry, try to emulate the instruction rather than single
stepping always.

Acked-by: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-20 23:18:56 +10:00
Naveen N. Rao 1cabd2f8f7 powerpc/kprobes: Factor out code to emulate instruction into a helper
Factor out code to emulate instruction into a try_to_emulate()
helper function. This makes no functional changes.

Acked-by: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-20 23:18:56 +10:00
Naveen N. Rao a64e3f35a4 powerpc/kretprobes: Override default function entry offset
With ABIv2, we offset 8 bytes into a function to get at the local entry
point.

mpe: NB this function is currently not called, the change to generic code to
call it is being merged via the tip tree.

Acked-by: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-20 23:18:55 +10:00
Naveen N. Rao 290e307076 powerpc/kprobes: Fix handling of function offsets on ABIv2
commit 239aeba764 ("perf powerpc: Fix kprobe and kretprobe handling with
kallsyms on ppc64le") changed how we use the offset field in struct kprobe on
ABIv2. perf now offsets from the global entry point if an offset is specified
and otherwise chooses the local entry point.

Fix the same in kernel for kprobe API users. We do this by extending
kprobe_lookup_name() to accept an additional parameter to indicate the offset
specified with the kprobe registration. If offset is 0, we return the local
function entry and return the global entry point otherwise.

With:
  # cd /sys/kernel/debug/tracing/
  # echo "p _do_fork" >> kprobe_events
  # echo "p _do_fork+0x10" >> kprobe_events

before this patch:
  # cat ../kprobes/list
  c0000000000d0748  k  _do_fork+0x8    [DISABLED]
  c0000000000d0758  k  _do_fork+0x18    [DISABLED]
  c0000000000412b0  k  kretprobe_trampoline+0x0    [OPTIMIZED]

and after:
  # cat ../kprobes/list
  c0000000000d04c8  k  _do_fork+0x8    [DISABLED]
  c0000000000d04d0  k  _do_fork+0x10    [DISABLED]
  c0000000000412b0  k  kretprobe_trampoline+0x0    [OPTIMIZED]

Acked-by: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-20 23:18:55 +10:00
Naveen N. Rao 49e0b4658f kprobes: Convert kprobe_lookup_name() to a function
The macro is now pretty long and ugly on powerpc. In the light of further
changes needed here, convert it to a __weak variant to be over-ridden with a
nicer looking function.

Suggested-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-20 23:18:54 +10:00
Nicholas Piggin 95dbdf4fa0 powerpc/64s: Minor fix for MCE TLB flush for radix
The TLB flush for radix first flushes TLB for radix configuration,
then flushes for hash configuration. The second flush is unnecessary
but does not affect correctness.

Fixes: 1a472c9dba ("powerpc/mm/radix: Add tlbflush routines")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-19 20:00:18 +10:00
Nicholas Piggin 8d1b48ef58 powerpc/64s: Revert setting of LPCR[LPES] on POWER9
The XIVE enablement patches included a change to set the LPES (Logical
Partitioning Environment Selector) bit (bit # 3) in LPCR (Logical Partitioning
Control Register) on POWER9 hosts. This bit sets external interrupts to guest
delivery mode, which uses SRR0/1. The host's EE interrupt handler is written to
expect HSRR0/1 (for earlier CPUs). This should be fine because XIVE is
configured not to deliver EEs to the host (Hypervisor Virtulization Interrupt is
used instead) so the EE handler should never be executed.

However a bug in interrupt controller code, hardware, or odd configuration of a
simulator could result in the host getting an EE incorrectly. Keeping the EE
delivery mode matching the host EE handler prevents strange crashes due to using
the wrong exception registers.

KVM will configure the LPCR to set LPES prior to running a guest so that EEs are
delivered to the guest using SRR0/1.

Fixes: 08a1e650cc ("powerpc: Fixup LPCR:PECE and HEIC setting on POWER9")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Massage change log to avoid referring to LPES0 which is now renamed LPES]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-19 20:00:17 +10:00