Commit Graph

11592 Commits

Author SHA1 Message Date
Mark Brown d82fbf305c powerpc: Ignore zImage.epapr
This is another file we can generate so add it to the list.

Signed-off-by: Mark Brown <broonie@linaro.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14 15:00:03 +10:00
James Yang cc7059b5ea powerpc/math-emu: Fix load/store indexed emulation
Load/store indexed instructions where the index register RA=R0, such
as "lfdx f1,0,r3", are not illegal.

Load/store indexed with update instructions where the index register
RA=R0, such as "lfdux f1,0,r3", are invalid, and, to be consistent
with existing math-emu behavior for other invalid instruction forms,
will signal as illegal.

Signed-off-by: James Yang <James.Yang@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14 14:59:57 +10:00
Kevin Hao 5f20be4478 powerpc: Remove the empty giveup_fpu() function on 32bit kernel
Instead of implementing an empty giveup_fpu() function for each
32bit processor type, replace them with an unique empty inline
function.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14 14:59:50 +10:00
Kevin Hao 037f0eed57 powerpc: Make flush_fp_to_thread() nop when CONFIG_PPC_FPU is disabled
In the current kernel, the function flush_fp_to_thread() is not
dependent on CONFIG_PPC_FPU. So most invocations of this function
is not wrapped by CONFIG_PPC_FPU. Even through we don't really
save the FPRs to the thread struct if CONFIG_PPC_FPU is not enabled,
but there does have some runtime overhead such as the check for
tsk->thread.regs and preempt disable and enable. It really make
no sense to do that. So make it a nop when CONFIG_PPC_FPU is
disabled. Also remove the wrapped #ifdef CONFIG_PPC_FPU
when invoking this function.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14 14:59:44 +10:00
Kevin Hao 662499d04b powerpc: Remove the redundant flush_fp_to_thread() in setup_sigcontext()
In commit c6e6771b(powerpc: Introduce VSX thread_struct and CONFIG_VSX)
we add a invocation of flush_fp_to_thread() before copying the FPR or
VSR to users. But we already invoke the flush_fp_to_thread() in this
function. So remove one of them.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14 14:59:38 +10:00
Kevin Hao d01dd2b8ac powerpc/mpc85xx: Only emulate the unimplemented FP instructions on corenet64
We have split the math emulation into two parts. This makes it
possible to just emulate the unimplemented floating point instructions
on these boards.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14 14:59:31 +10:00
Kevin Hao 6ef94ff2e8 powerpc: remove the unused function disable_kernel_fp()
The only using of function disable_kernel_fp() was already dropped
in the commit 5daf9071 (powerpc: merge align.c).

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14 14:59:25 +10:00
Kevin Hao e05c0e81b0 powerpc: split She math emulation into two parts
For some SoC (such as the FSL BookE) even though there does have
a hardware FPU, but not all floating point instructions are
implemented. Unfortunately some versions of gcc do use these
unimplemented instructions. Then we have to enable the math emulation
to workaround this issue. It seems a little redundant to have the
support to emulate all the floating point instructions in this case.
So split the math emulation into two parts. One is for the SoC which
doesn't have FPU at all and the other for the SoC which does have the
hardware FPU and only need some special floating point instructions to
be emulated.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14 14:59:19 +10:00
Kevin Hao 3a3b5aa63f powerpc: Introduce function emulate_math()
There are two invocations of do_mathemu() in traps.c. And the codes
in these two places are almost the same. Introduce a locale function
to eliminate the duplication. With this change we can also make sure
that in program_check_exception() the PPC_WARN_EMULATED is invoked for
the correctly emulated math instructions.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14 14:59:12 +10:00
Kevin Hao 6761ee3d7e powerpc/math-emu: Move the flush FPU state function into do_mathemu
By doing this we can make sure that the FPU state is only flushed to
the thread struct when it is really needed.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14 14:59:06 +10:00
Kevin Hao 195c4940f7 powerpc/85xx: Enable the math emulation for the corenet64_smp_defconfig
I got the following error on my t4240qds board.
  ntpd[2713]: unhandled signal 4 at 0fd5b448 nip 0fd5b448 lr 0fd5b424 code 30001

The root cause is that the float point instruction 'fsqrt' is used.
But this instruction is not implemented on e6500 core. Even this
does seem a gcc bug, I would like to enable the math emulation
in the kernel to workaround this kind of issue.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14 14:59:00 +10:00
Paul Bolle 39cee08b28 powerpc/8xx: Remove last traces of 8XX_MINIMAL_FPEMU
The Kconfig symbol 8XX_MINIMAL_FPEMU was removed in commit 968219fa33
("powerpc/8xx: Remove 8xx specific "minimal FPU emulation""). But that
commit didn't remove all code depending on that symbol. Do so now.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14 14:58:53 +10:00
Kevin Hao f0870c5530 powerpc/math-emu: Remove the unneeded check for CONFIG_MATH_EMULATION in math.c
The math.c is only built when CONFIG_MATH_EMULATION is enabled.
So the #ifdef check for CONFIG_MATH_EMULATION in it seems redundant.
Drop all of them.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14 14:58:46 +10:00
Kevin Hao cf5c2e543c powerpc/math-emu: Remove the dead code in math.c
The math.c is only built when CONFIG_MATH_EMULATION is enabled.
So we would never get into the case that CONFIG_MATH_EMULATION
is not defined in this file.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14 14:58:40 +10:00
Benjamin Herrenschmidt 8a05dd8514 powerpc/powernv: Enable detection of legacy UARTs
Legacy UARTs can exist on PowerNV, memory-mapped ones on PCI
or IO based ones on the LPC bus.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14 14:58:34 +10:00
Benjamin Herrenschmidt 2db29d28eb powerpc/powernv: Don't crash if there are no OPAL consoles
Some machines might provide the console via a different mechanism
such as direct access to a UART from Linux, in which case OPAL
might not expose any console. In that case, the code would cause
a NULL dereference.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14 14:58:27 +10:00
Benjamin Herrenschmidt e0f5fa99a3 powerpc: Check "status" property before adding legacy ISA serial ports
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14 14:58:21 +10:00
Benjamin Herrenschmidt 309257484c powerpc: Cleanup udbg_16550 and add support for LPC PIO-only UARTs
The udbg_16550 code, which we use for our early consoles and debug
backends was fairly messy. Especially for the debug consoles, it
would re-implement the "high level" getc/putc/poll functions for
each access method. It also had code to configure the UART but only
for the straight MMIO method.

This changes it to instead abstract at the register accessor level,
and have the various functions and configuration routines use these.

The result is simpler and slightly smaller code, and free support
for non-MMIO mapped PIO UARTs, which such as the ones that can be
present on a POWER 8 LPC bus.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14 14:58:15 +10:00
Benjamin Herrenschmidt 3fafe9c202 powerpc/powernv: Add PIO accessors for Power8 LPC bus
This uses the hooks provided by CONFIG_PPC_INDIRECT_PIO to
implement a set of hooks for IO port access to use the LPC
bus via OPAL calls for the first 64K of IO space

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14 14:58:08 +10:00
Benjamin Herrenschmidt b37193b718 powerpc/powernv: Add helper to get ibm,chip-id of a node
This includes walking the parent nodes if necessary.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14 14:58:02 +10:00
Benjamin Herrenschmidt cc0efb57eb powerpc/powernv: Update opal.h to add new LPC and XSCOM functions
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14 14:57:56 +10:00
Benjamin Herrenschmidt ecd73cc5c9 powerpc: Better split CONFIG_PPC_INDIRECT_PIO and CONFIG_PPC_INDIRECT_MMIO
Remove the generic PPC_INDIRECT_IO and ensure we only add overhead
to the right accessors. IE. If only CONFIG_PPC_INDIRECT_PIO is set,
we don't add overhead to all MMIO accessors.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14 14:57:50 +10:00
Tiejun Chen de021bb79c powerpc/ppc64: Rename SOFT_DISABLE_INTS with RECONCILE_IRQ_STATE
The SOFT_DISABLE_INTS seems an odd name for something that updates the
software state to be consistent with interrupts being hard disabled, so
rename SOFT_DISABLE_INTS with RECONCILE_IRQ_STATE to avoid this confusion.

Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14 14:57:47 +10:00
Benjamin Herrenschmidt 7191b61575 powerpc/pmac: Early debug output on screen on 64-bit macs
We have a bunch of CONFIG_PPC_EARLY_DEBUG_* options that are intended
for bringup/debug only. They hard wire a machine specific udbg backend
very early on (before we even probe the platform), and use whatever
tricks are available on each machine/cpu to be able to get some kind
of output out there early on.

So far, on powermac with no serial ports, we have CONFIG_PPC_EARLY_DEBUG_BOOTX
to use the low-level btext engine on the screen, but it doesn't do much, at
least on 64-bit. It only really gets enabled after the platform has been
probed and the MMU enabled.

This adds a way to enable it much earlier. From prom_init.c (while still
running with Open Firmware), we grab the screen details and set things up
using the physical address of the frame buffer.

Then btext itself uses the "rm_ci" feature of the 970 processor (Real
Mode Cache Inhibited) to access it while in real mode.

We need to do a little bit of reorg of the btext code to inline things
better, in order to limit how much we touch memory while in this mode as
the consequences might be ... interesting.

This successfully allowed me to debug problems early on with the G5
(related to gold being broken vs. ppc64 kernels).

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14 14:57:40 +10:00
Gavin Shan 1a85d66bcc powerpc/pci: Remove duplicate check in pcibios_fixup_bus()
pci_read_bridge_bases() already checks if the PCI bus is root
bus or not, so we needn't do same check in pcibios_fixup_bus()
and just remove it.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14 14:57:36 +10:00
Gavin Shan c35d2a8c5f powerpc/powernv: Needn't IO segment map for PHB3
PHB3 doesn't support IO ports and we needn't IO segment map for that.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14 14:57:32 +10:00
Gavin Shan 2f1ec02ea1 powerpc/powernv: Check primary PHB through ID
The index of one specific PCI controller (struct pci_controller::
global_number) can tell that it's primary one or not. So we needn't
additional variable for that and just remove it.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14 14:57:29 +10:00
Gavin Shan f1b7cc3ec1 powerpc/powernv: Fetch PHB bus range from dev-tree
The patch enables fetching bus range from device-tree for the
specific PHB. If we can't get that from device-tree, the default
range [0 255] will be used.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14 14:57:25 +10:00
Gavin Shan 58d714ec7a powerpc/powernv: Free PHB instance upon error
We don't free PHB instance (struct pnv_phb) on error to creating
the associated PCI controler (struct pci_controller). The patch
fixes that. Also, the output messages have been cleaned for a bit
so that they looks unified for IODA_1/2 cases.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14 14:57:21 +10:00
Paul Mackerras 408a7e08b2 powerpc: Fix VRSAVE handling
Since 2002, the kernel has not saved VRSAVE on exception entry and
restored it on exit; rather, VRSAVE gets context-switched in _switch.
This means that when executing in process context in the kernel, the
userspace VRSAVE value is live in the VRSAVE register.

However, the signal code assumes that current->thread.vrsave holds
the current VRSAVE value, which is incorrect.  Therefore, this
commit changes it to use the actual VRSAVE register instead.  (It
still uses current->thread.vrsave as a temporary location to store
it in, as __get_user and __put_user can only transfer to/from a
variable, not an SPR.)

This also modifies the transactional memory code to save and restore
VRSAVE regardless of whether VMX is enabled in the MSR.  This is
because accesses to VRSAVE are not controlled by the MSR.VEC bit,
but can happen at any time.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14 14:57:18 +10:00
Paul Mackerras 1f7bf02876 powerpc: Implement __get_user_pages_fast()
Other architectures have a __get_user_pages_fast(), in addition to the
regular get_user_pages_fast(), which doesn't call get_user_pages() on
failure, and thus doesn't attempt to fault pages in or COW them.  The
generic KVM code uses __get_user_pages_fast() to detect whether a page
for which we have only requested read access is actually writable.

This provides an implementation of __get_user_pages_fast() by
splitting the existing get_user_pages_fast() in two.  With this, the
generic KVM code will get the right answer instead of always
considering such pages non-writable.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14 14:57:14 +10:00
Andy Fleming 39fd40274d powerpc: Convert platforms to smp_generic_cpu_bootable
T4, Cell, powernv, and pseries had the same implementation, so switch
them to use a generic version. A2 apparently had a version, but
removed it at some point, so we remove the declaration, too.

Signed-off-by: Andy Fleming <afleming@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14 14:56:57 +10:00
Andy Fleming 3cd8525023 powerpc: Add smp_generic_cpu_bootable
Cell and PSeries both implemented their own versions of a
cpu_bootable smp_op which do the same thing (well, the PSeries
one has support for more than 2 threads). Copy the PSeries one
to generic code, and rename it smp_generic_cpu_bootable.

Signed-off-by: Andy Fleming <afleming@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14 14:56:50 +10:00
Kevin Hao 3b04c30007 powerpc: Remove the symbol __flush_icache_range
And now the function flush_icache_range() is just a wrapper which
only invoke the function __flush_icache_range() directly. So we
don't have reason to keep it anymore.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14 14:56:44 +10:00
Kevin Hao abb29c3bb1 powerpc: Move the testing of CPU_FTR_COHERENT_ICACHE into __flush_icache_range
In function flush_icache_range(), we use cpu_has_feature() to test
the feature bit of CPU_FTR_COHERENT_ICACHE. But this seems not optimal
for two reasons:
 a) For ppc32, the function __flush_icache_range() already do this
    check with the macro END_FTR_SECTION_IFSET.
 b) Compare with the cpu_has_feature(), the method of using macro
    END_FTR_SECTION_IFSET will not introduce any runtime overhead.

[And while at it, add the missing required isync] -- BenH

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14 14:56:06 +10:00
Anton Blanchard f13c13a005 powerpc: Stop using non-architected shared_proc field in lppaca
Although the shared_proc field in the lppaca works today, it is
not architected. A shared processor partition will always have a non
zero yield_count so use that instead. Create a wrapper so users
don't have to know about the details.

In order for older kernels to continue to work on KVM we need
to set the shared_proc bit. While here, remove the ugly bitfield.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14 11:50:26 +10:00
Anton Blanchard 0c69f9c52c powerpc/pci: Don't use bitfield for force_32bit_msi
Fix a sparse warning about force_32bit_msi being a one bit bitfield.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14 11:50:25 +10:00
Anton Blanchard b0d436c739 powerpc: Fix a number of sparse warnings
Address some of the trivial sparse warnings in arch/powerpc.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14 11:50:24 +10:00
Anton Blanchard a0a96ee9ba powerpc/pseries: Simplify H_GET_TERM_CHAR
plpar_get_term_char is only used once and just adds a layer
of complexity to H_GET_TERM_CHAR. plpar_put_term_char isn't
used at all so we can remove it.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14 11:50:23 +10:00
Anton Blanchard 01b0e07e60 powerpc: Simplify logic in include/uapi/asm/elf.h
Simplify things by putting all the 32bit and 64bit defines
together instead of in two spots.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14 11:50:22 +10:00
Anton Blanchard 23c6bd2aea powerpc: Remove SAVE_VSRU and REST_VSRU macros
We always use VMX loads and stores to manage the high 32
VSRs. Remove these unused macros.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14 11:50:22 +10:00
Anton Blanchard 594bd38303 powerpc: Wrap MSR macros with parentheses
Not having parentheses around a macro is asking for trouble.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14 11:50:21 +10:00
Anton Blanchard 230aef7a6a powerpc: Handle unaligned ldbrx/stdbrx
Normally when we haven't implemented an alignment handler for
a load or store instruction the process will be terminated.

The alignment handler uses the DSISR (or a pseudo one) to locate
the right handler. Unfortunately ldbrx and stdbrx overlap lfs and
stfs so we incorrectly think ldbrx is an lfs and stdbrx is an
stfs.

This bug is particularly nasty - instead of terminating the
process we apply an incorrect fixup and continue on.

With more and more overlapping instructions we should stop
creating a pseudo DSISR and index using the instruction directly,
but for now add a special case to catch ldbrx/stdbrx.

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14 11:50:20 +10:00
Anton Blanchard 5b63fee1fe powerpc: Align p_toc
p_toc is an 8 byte relative offset to the TOC that we place in the
text section. This means it is only 4 byte aligned where it should
be 8 byte aligned. Add an explicit alignment.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-14 11:50:19 +10:00
Sebastian Siewior 6391f697d4 powerpc: 52xx: provide a default in mpc52xx_irqhost_map()
My gcc-4.3.5 fails to compile due to:

|cc1: warnings being treated as errors
|arch/powerpc/platforms/52xx/mpc52xx_pic.c: In function ‘mpc52xx_irqhost_map’:
|arch/powerpc/platforms/52xx/mpc52xx_pic.c:343: error: ‘irqchip’ may be used uninitialized in this function

since commit e34298c ("powerpc: 52xx: nop out unsupported critical
IRQs"). This warning is complete crap since only values 0…3 are possible
which are checked but gcc fails to understand that. I wouldn't care much
but since this is compiled with -Werror I made this patch.
While add it, I replaced the warning from l2irq to l1irq since this is
the number that is evaluated.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Anatolij Gustschin <agust@denx.de>
2013-08-13 00:16:04 +02:00
Thomas Petazzoni ebd97be635 PCI: remove ARCH_SUPPORTS_MSI kconfig option
Now that we have weak versions for each of the PCI MSI architecture
functions, we can actually build the MSI support for all platforms,
regardless of whether they provide or not architecture-specific
versions of those functions. For this reason, the ARCH_SUPPORTS_MSI
hidden kconfig boolean becomes useless, and this patch gets rid of it.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tested-by: Daniel Price <daniel.price@gmail.com>
Tested-by: Thierry Reding <thierry.reding@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux390@de.ibm.com
Cc: linux-s390@vger.kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: linux-ia64@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: David S. Miller <davem@davemloft.net>
Cc: sparclinux@vger.kernel.org
Cc: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
2013-08-12 15:26:48 +00:00
Thomas Petazzoni 4287d824f2 PCI: use weak functions for MSI arch-specific functions
Until now, the MSI architecture-specific functions could be overloaded
using a fairly complex set of #define and compile-time
conditionals. In order to prepare for the introduction of the msi_chip
infrastructure, it is desirable to switch all those functions to use
the 'weak' mechanism. This commit converts all the architectures that
were overidding those MSI functions to use the new strategy.

Note that we keep two separate, non-weak, functions
default_teardown_msi_irqs() and default_restore_msi_irqs() for the
default behavior of the arch_teardown_msi_irqs() and
arch_restore_msi_irqs(), as the default behavior is needed by x86 PCI
code.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tested-by: Daniel Price <daniel.price@gmail.com>
Tested-by: Thierry Reding <thierry.reding@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux390@de.ibm.com
Cc: linux-s390@vger.kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: linux-ia64@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: David S. Miller <davem@davemloft.net>
Cc: sparclinux@vger.kernel.org
Cc: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
2013-08-12 15:26:39 +00:00
Michael Neuling 28e61cc466 powerpc/tm: Fix context switching TAR, PPR and DSCR SPRs
If a transaction is rolled back, the Target Address Register (TAR), Processor
Priority Register (PPR) and Data Stream Control Register (DSCR) should be
restored to the checkpointed values before the transaction began.  Any changes
to these SPRs inside the transaction should not be visible in the abort
handler.

Currently Linux doesn't save or restore the checkpointed TAR, PPR or DSCR.  If
we preempt a processes inside a transaction which has modified any of these, on
process restore, that same transaction may be aborted we but we won't see the
checkpointed versions of these SPRs.

This adds checkpointed versions of these SPRs to the thread_struct and adds the
save/restore of these three SPRs to the treclaim/trechkpt code.

Without this if any of these SPRs are modified during a transaction, users may
incorrectly see a speculated SPR value even if the transaction is aborted.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Cc: <stable@vger.kernel.org> [v3.10]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-09 18:07:12 +10:00
Michael Neuling c2d52644e2 powerpc: Save the TAR register earlier
This moves us to save the Target Address Register (TAR) a earlier in
__switch_to.  It introduces a new function save_tar() to do this.

We need to save the TAR earlier as we will overwrite it in the transactional
memory reclaim/recheckpoint path.  We are going to do this in a subsequent
patch which will fix saving the TAR register when it's modified inside a
transaction.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Cc: <stable@vger.kernel.org> [v3.10]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-09 18:07:08 +10:00
Michael Neuling 2517617e0d powerpc: Fix context switch DSCR on POWER8
POWER8 allows the DSCR to be accessed directly from userspace via a new SPR
number 0x3 (Rather than 0x11.  DSCR SPR number 0x11 is still used on POWER8 but
like POWER7, is only accessible in HV and OS modes).  Currently, we allow this
by setting H/FSCR DSCR bit on boot.

Unfortunately this doesn't work, as the kernel needs to see the DSCR change so
that it knows to no longer restore the system wide version of DSCR on context
switch (ie. to set thread.dscr_inherit).

This clears the H/FSCR DSCR bit initially.  If a process then accesses the DSCR
(via SPR 0x3), it'll trap into the kernel where we set thread.dscr_inherit in
facility_unavailable_exception().

We also change _switch() so that we set or clear the H/FSCR DSCR bit based on
the thread.dscr_inherit.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Cc: <stable@vger.kernel.org> [v3.10]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-09 18:07:05 +10:00
Michael Neuling 74e400cee6 powerpc: Rework setting up H/FSCR bit definitions
This reworks the Facility Status and Control Regsiter (FSCR) config bit
definitions so that we can access the bit numbers.  This is needed for a
subsequent patch to fix the userspace DSCR handling.

HFSCR and FSCR bit definitions are the same, so reuse them.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Cc: <stable@vger.kernel.org> [v3.10]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-09 18:07:01 +10:00
Michael Neuling 88f094120b powerpc: Fix hypervisor facility unavaliable vector number
Currently if we take hypervisor facility unavaliable (from 0xf80/0x4f80) we
mark it as an OS facility unavaliable (0xf60) as the two share the same code
path.

The becomes a problem in facility_unavailable_exception() as we aren't able to
see the hypervisor facility unavailable exceptions.

Below fixes this by duplication the required macros.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Cc: <stable@vger.kernel.org> [v3.10]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-09 18:06:58 +10:00
Thadeu Lima de Souza Cascardo e0e1361462 powerpc/kvm/book3s_pr: Return appropriate error when allocation fails
err was overwritten by a previous function call, and checked to be 0. If
the following page allocation fails, 0 is going to be returned instead
of -ENOMEM.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-09 18:06:54 +10:00
Chen Gang 2fb10672c8 powerpc/kvm: Add signed type cast for comparation
'rmls' is 'unsigned long', lpcr_rmls() will return negative number when
failure occurs, so it need a type cast for comparing.

'lpid' is 'unsigned long', kvmppc_alloc_lpid() return negative number
when failure occurs, so it need a type cast for comparing.

Signed-off-by: Chen Gang <gang.chen@asianux.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-09 18:06:51 +10:00
Mike Qiu 144136dd7a powerpc/eeh: Add missing procfs entry for PowerNV
The procfs entry for global statistics has been missed on PowerNV
platform and the patch is going to add that.

Signed-off-by: Mike Qiu <qiudayu@linux.vnet.ibm.com>
Acked-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-09 18:06:47 +10:00
Aruna Balakrishnaiah 156c9ebdac powerpc/pseries: Add backward compatibilty to read old kernel oops-log
Older kernels has just length information in their header. Handle it
while reading old kernel oops log from pstore.

Applies on top of powerpc/pseries: Fix buffer overflow when reading from pstore

Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-09 18:06:44 +10:00
Aruna Balakrishnaiah 7e76f34fa1 powerpc/pseries: Fix buffer overflow when reading from pstore
When reading from pstore there is a buffer overflow during decompression
due to the header added in unzip_oops. Remove unzip_oops and call
pstore_decompress directly in nvram_pstore_read. Allocate buffer of size
report_length of the oops header as header will not be deallocated in pstore.
Since we have 'openssl' command line tool to decompress the compressed data,
dump the compressed data in case decompression fails instead of not dumping
anything.

Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-09 18:06:40 +10:00
Anton Blanchard 4e90a2a737 powerpc: On POWERNV enable PPC_DENORMALISATION by default
We want PPC_DENORMALISATION enabled when POWERNV is enabled,
so update the Kconfig.

Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: <stable@vger.kernel.org>
2013-08-09 18:05:29 +10:00
Benjamin Herrenschmidt a12e4537ad Merge remote-tracking branch 'scott/next' into next
Merge some Freescale updates from Scott Wood
2013-08-09 16:01:40 +10:00
Catalin Udma c8db32c866 powerpc/e500: Update compilation flags with core specific options
If CONFIG_E500 is enabled, the compilation flags are updated
specifying the target core -mcpu=e5500/e500mc/8540
Also remove -Wa,-me500, being incompatible with -mcpu=e5500/e6500
The assembler option is redundant if the -mcpu= flag is set.
The patch fixes the kernel compilation problem for e5500/e6500
when using gcc option -mcpu=e5500/e6500.

Signed-off-by: Catalin Udma <catalin.udma@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2013-08-07 18:49:44 -05:00
Zhenhua Luo dcbe39fed7 powerpc/fsl: Enable CONFIG_DEVTMPFS_MOUNT so /dev can be mounted correctly
When using recent udev, the /dev node mount requires CONFIG_DEVTMPFS_MOUNT
is enabled in Kernel. The patch enables the option in defconfig of Freescale
QorIQ targets.

Changed defconfig list:
   arch/powerpc/configs/85xx/p1023rds_defconfig
   arch/powerpc/configs/corenet32_smp_defconfig
   arch/powerpc/configs/corenet64_smp_defconfig
   arch/powerpc/configs/mpc85xx_smp_defconfig
   arch/powerpc/configs/mpc85xx_defconfig
   arch/powerpc/configs/mpc83xx_defconfig

Signed-off-by: Zhenhua Luo <zhenhua.luo@freescale.com>
[scottwood@freescale.com: added mpc83xx and non-smp mpc85xx]
Signed-off-by: Scott Wood <scottwood@freescale.com>
2013-08-07 18:38:11 -05:00
Yuanquan Chen 36f6849400 powerpc/pci: fix PCI-e check link issue
For Freescale powerpc platform, the PCI-e bus number uses the reassign mode
by default. It means the second PCI-e controller's hose->first_busno is the
first controller's last bus number adding 1. For some hotpluged device(or
controlled by FPGA), the device is linked to PCI-e slot at linux runtime.
It needs rescan for the system to add it and driver it to work. It successes
to rescan the device linked to the first PCI-e controller's slot, but fails to
rescan the device linked to the second PCI-e controller's slot. The cause is
that the bus->number is reset to 0, which isn't equal to the hose->first_busno
for the second controller checking PCI-e link. So it doesn't really check the
PCI-e link status, the link status is always no_link. The device won't be
really rescaned. Reset the bus->number to hose->first_busno in the function
fsl_pcie_check_link(), it will do the real checking PCI-e link status for the
second controller, the device will be rescaned.

Signed-off-by: Yuanquan Chen <Yuanquan.Chen@freescale.com>
Tested-by: Rojhalat Ibrahim <imr@rtschenk.de>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2013-08-07 18:38:08 -05:00
Kevin Hao c45e91831b powerpc/fsl-pci: enable SWIOTLB in function setup_pci_atmu
This function contains all the stuff we need to check if SWIOTLB
should be enabled or not. So it is more convenient to enable
the SWIOTLB here than later.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2013-08-07 18:38:08 -05:00
Kevin Hao 2d49c42a30 powerpc/fsl-pci: fix the unreachable warning message
The (1ull << mem_log) is never greater than mem unless mem_log++;

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2013-08-07 18:38:07 -05:00
Kevin Hao 7d4d595dad powerpc/mpc85xx: remove the unneeded pci init functions for corenet ds board
The function pci_devs_phb_init is invoked more earlier than we really
probe the pci controller, so it does nothing at all. And we also don't
need the pci_dn stuff for the fsl powerpc64 boards, just remove it.

It also seems that we don't support ISA on all the current corenet ds
boards. So picking a primary bus seems useless, remove that function
too.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
2013-08-07 18:38:07 -05:00
Haijun.Zhang bf57aeb57a powerpc/85xx: add the P1020RDB-PD DTS support
Overview of P1020RDB-PD device:
- DDR3 2GB
- NOR flash 64MB
- NAND flash 128MB
- SPI flash 16MB
- I2C EEPROM 256Kb
- eTSEC1 (RGMII PHY) connected to VSC7385 L2 switch
- eTSEC2 (SGMII PHY)
- eTSEC3 (RGMII PHY)
- SDHC
- 1 USB ports
- TDM ports
- PCIe

Signed-off-by: Haijun Zhang <Haijun.Zhang@freescale.com>
Signed-off-by: Jerry Huang <Chang-Ming.Huang@freescale.com>
Signed-off-by: Xie Xiaobo-R63061 <X.Xie@freescale.com>
CC: Scott Wood <scottwood@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2013-08-07 18:38:07 -05:00
Haijun.Zhang 550593e8f5 powerpc/85xx: add P1020RDB-PD platform support
The p1020rdb-pd has the similar feature as the p1020rdb.
Therefore, p1020rdb-pd use the same platform file as the p1/p2 rdb board.
Overview of P1020RDB-PD platform:
- DDR3 2GB
- NOR flash 64MB
- NAND flash 128MB
- SPI flash 16MB
- I2C EEPROM 256Kb
	- eTSEC1 (RGMII PHY) connected to VSC7385 L2 switch
	- eTSEC2 (SGMII PHY)
- eTSEC3 (RGMII PHY)
	- SDHC
	- 2 USB ports
	- 4 TDM ports
	- PCIe

Signed-off-by: Haijun Zhang <Haijun.Zhang@freescale.com>
Signed-off-by: Jerry Huang <Chang-Ming.Huang@freescale.com>
CC: Scott Wood <scottwood@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2013-08-07 18:38:06 -05:00
Laurentiu TUDOR 4e21b94c9c powerpc/85xx: Move ePAPR paravirt initialization earlier
At console init, when the kernel tries to flush the log buffer
the ePAPR byte-channel based console write fails silently,
losing the buffered messages.
This happens because The ePAPR para-virtualization init isn't
done early enough so that the hcall instruction to be set,
causing the byte-channel write hcall to be a nop.
To fix, change the ePAPR para-virt init to use early device
tree functions and move it in early init.

Signed-off-by: Laurentiu Tudor <Laurentiu.Tudor@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2013-08-07 18:38:06 -05:00
Minghuan Lian f31dd9443a powerpc/fsl_msi: add MSIIR1 support for MPIC v4.3
The original MPIC MSI bank contains 8 registers, MPIC v4.3 MSI bank
contains 16 registers, and this patch adds NR_MSI_REG_MAX and
NR_MSI_IRQS_MAX to describe the maximum capability of MSI bank.
MPIC v4.3 provides MSIIR1 to index these 16 MSI registers. MSIIR1
uses different bits definition than MSIIR. This patch adds
ibs_shift and srs_shift to indicate the bits definition of the
MSIIR and MSIIR1, so the same code can handle the MSIIR and MSIIR1
simultaneously.

Signed-off-by: Minghuan Lian <Minghuan.Lian@freescale.com>
[scottwood@freescale.com: reinstated static on all_avail]
Signed-off-by: Scott Wood <scottwood@freescale.com>
2013-08-07 18:38:05 -05:00
Minghuan Lian 6d854acd24 powerpc/dts: add MPIC v4.3 dts node
For the latest platform T4 and B4, MPIC controller has been updated
to v4.3. This patch adds a new file to describe the latest MPIC.
The MSI blocks number is increased to four, the registers number
of each block is increased to sixteen. MSIIR1 has been added to
access these sixteen MSI registers.

Signed-off-by: Minghuan Lian <Minghuan.Lian@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2013-08-07 18:38:05 -05:00
Hongtao Jia df1024ad87 powerpc/msi: Fix compile error on mpc83xx
mpic_get_primary_version() is not defined when not using MPIC.
The compile error log like:

arch/powerpc/sysdev/built-in.o: In function `fsl_of_msi_probe':
fsl_msi.c:(.text+0x150c): undefined reference to `fsl_mpic_primary_get_version'

Signed-off-by: Jia Hongtao <hongtao.jia@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2013-08-07 18:38:04 -05:00
Priyanka Jain 3c83658ca9 powerpc/perf: Add e6500 PMU driver
e6500 core performance monitors has the following features:
- 6 performance monitor counters
- 512 events supported
- no threshold events

e6500 PMU has more specific events (Data L1 cache misses, Instruction L1
cache misses, etc ) than e500 PMU (which only had Data L1 cache reloads,
etc). Where available, the more specific events have been used which will
produce slightly different results than e500 PMU equivalents.

Signed-off-by: Priyanka Jain <Priyanka.Jain@freescale.com>
Signed-off-by: Lijun Pan <Lijun.Pan@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2013-08-07 18:38:04 -05:00
Lijun Pan 5815c434fd powerpc/perf: add 2 additional performance monitor counters for e6500 core
There are 6 counters in e6500 core instead of 4 in e500 core.

Signed-off-by: Lijun Pan <Lijun.Pan@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2013-08-07 18:38:03 -05:00
Catalin Udma 96c3c9e78f powerpc/perf: increase the perf HW events to 6
This change is required after the e6500 perf support has been added.
There are 6 counters in e6500 core instead of 4 in e500 core and
the MAX_HWEVENTS counter should be changed accordingly from 4 to 6.
Added also runtime check for counters overflow.

Signed-off-by: Catalin Udma <catalin.udma@freescale.com>
Signed-off-by: Lijun Pan <Lijun.Pan@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2013-08-07 18:38:03 -05:00
Lijun Pan a9a5cda069 powerpc/perf: correct typos in counter enumeration
Signed-off-by: Lijun Pan <Lijun.Pan@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2013-08-07 18:38:03 -05:00
Ian Campbell 807f4c4720 powerpc/fsl-booke: Rename b4qds.dts -> b4qds.dtsi.
This file is a common include for B4860 and B4420 but is not a valid DTS itself:
	  DTC     arch/powerpc/boot/b4qds.dtb
	Error: arch/powerpc/boot/dts/b4qds.dts:35.1-2 syntax error
	FATAL ERROR: Unable to parse input tree
	make[1]: *** [arch/powerpc/boot/b4qds.dtb] Error 1
	make: *** [b4qds.dtb] Error 2

I spotted in build tests of device-tree.git, announcement
https://lkml.org/lkml/2013/4/24/209, which builds *.dts. Probably no one would
do this this in real life on linux.git but it still seems worth fixing.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Shaveta Leekha <shaveta@freescale.com>
Cc: Minghuan Lian <Minghuan.Lian@freescale.com>
Cc: Andy Fleming <afleming@freescale.com>
Cc: Poonam Aggrwal <poonam.aggrwal@freescale.com>
Cc: Ramneek Mehresh <ramneek.mehresh@freescale.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Scott Wood <scottwood@freescale.com>
2013-08-07 18:38:02 -05:00
Linus Torvalds e7e2e511ba Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Pull powerpc fixes from Ben Herrenschmidt:
 "Here is not quite a handful of powerpc fixes for rc3.

  The windfarm fix is a regression fix (though not a new one), the PMU
  interrupt rename is not a fix per-se but has been submitted a long
  time ago and I kept forgetting to put it in (it puts us back in sync
  with x86), the other perf bit is just about putting an API/ABI bit
  definition in the right place for userspace to consume, and finally,
  we have a fix for the VPHN (Virtual Partition Home Node) feature
  (notification that the hypervisor is moving nodes around) which could
  cause lockups so we may as well fix it now"

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc/windfarm: Fix noisy slots-fan on Xserve (rm31)
  powerpc: VPHN topology change updates all siblings
  powerpc/perf: Export PERF_EVENT_CONFIG_EBB_SHIFT to userspace
  powerpc: Rename PMU interrupts from CNT to PMI
2013-08-02 14:39:49 -07:00
Linus Torvalds aa8032b6fa PCI updates for v3.11:
Hotplug
       PCI: pciehp: Fix null pointer deref when hot-removing SR-IOV device
       PCI: hotplug: Convert to be builtin only, not modular
       PCI: pciehp: Convert pciehp to be builtin only, not modular
   Resource allocation
       PCI: Retry allocation of only the resource type that failed
   ARM
       PCI: mvebu: Disable prefetchable memory support in PCI-to-PCI bridge
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJR+/RRAAoJEFmIoMA60/r8iloP/is7mx667v9F9WRlYMJss+ig
 K0cVOauokmr/pyS7Tgr0jfhtulXOmN/VJTvCw6h7+qYtsO/DQ7Y5LPXoMW83s5hK
 /N33HDZcm2/1hUxv/4GS3omtO9VefWcmzXmPM7l1fEBQyACTg/zj1Mb97U8j8mhh
 vr6y/pLDLwwaRcQNP/Y2F6H8RsDDWE/IyC9MN1qEz+b7Qve5fAPPrDBgExakzmu/
 UiiJV3nl7fqBeITA/LSWDCgsOeavmmabqZuaXnKWN0L5PaEa3/8Of6kbsG5ZGgdZ
 Y5/Qu0HRtAhaFNAd1670IzcMThC6TJV685z09/OQ+4uKpZ2jJYM26ISSdiMG2He4
 FQmLQcgkxGX0xYxdD7K37VC7O17NH+3jEouM2SNSCXGz5RQ3qvW5F4IvSctHmIO8
 Q0m2HladNYHtOYk1eGNSlxPd3U0nyQwlXSTgJrKYd/cJgUqYjs+Q9zuAmUj+4QPR
 ywwO3rgpetaROY6avw31fWFGqQFAQEQeXvMwTrAJbdIcxvVz77yRiOrD72X61Z7h
 A6owzbMmuYKPuym5EbBr0GoCJWAuHc8GIvXKHQQ9QMGuRAj9X4Qrk+BdudeqlTxD
 e9WKkKGPmPxR6IuSlZdLCNqcTDDlQZFTnq5WJ989pPsUMKFu8LjybKntBEbPtIue
 ydWvibCrxNbTEMyyg+c6
 =z2eX
 -----END PGP SIGNATURE-----

Merge tag 'pci-v3.11-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI fixes from Bjorn Helgaas:
 "Yinghai fixed a couple regressions: one resource assignment problem
  introduced in v3.10 that showed up with SR-IOV on powerpc, and another
  SR-IOV hot-remove issue related to refcounting changes we merged for
  v3.11.

  Yinghai is still working on another SR-IOV-related fix or two, which
  will be simpler if pciehp is non-modular, so I included the Kconfig
  changes now to get them in earlier.

  Finally, a minor fix for the ARM Marvell EBU host bridge driver that
  was merged for v3.11

  Hotplug:
      PCI: pciehp: Fix null pointer deref when hot-removing SR-IOV device
      PCI: hotplug: Convert to be builtin only, not modular
      PCI: pciehp: Convert pciehp to be builtin only, not modular

  Resource allocation:
      PCI: Retry allocation of only the resource type that failed

  ARM:
      PCI: mvebu: Disable prefetchable memory support in PCI-to-PCI bridge"

* tag 'pci-v3.11-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI: mvebu: Disable prefetchable memory support in PCI-to-PCI bridge
  PCI: Retry allocation of only the resource type that failed
  PCI: pciehp: Convert pciehp to be builtin only, not modular
  PCI: hotplug: Convert to be builtin only, not modular
  PCI: pciehp: Fix null pointer deref when hot-removing SR-IOV device
2013-08-02 13:12:52 -07:00
Robert Jennings 3be7db6ab4 powerpc: VPHN topology change updates all siblings
When an associativity level change is found for one thread, the
siblings threads need to be updated as well.  This is done today
for PRRN in stage_topology_update() but is missing for VPHN in
update_cpu_associativity_changes_mask().  This patch will correctly
update all thread siblings during a topology change.

Without this patch a topology update can result in a CPU in
init_sched_groups_power() getting stuck indefinitely in a loop.

This loop is built in build_sched_groups(). As a result of the thread
moving to a node separate from its siblings the struct sched_group will
have its next pointer set to point to itself rather than the sched_group
struct of the next thread.  This happens because we have a domain without
the SD_OVERLAP flag, which is correct, and a topology that doesn't conform
with reality (threads on the same core assigned to different numa nodes).
When this list is traversed by init_sched_groups_power() it will reach
the thread's sched_group structure and loop indefinitely; the cpu will
be stuck at this point.

The bug was exposed when VPHN was enabled in commit b7abef0 (v3.9).

Cc: <stable@vger.kernel.org> [v3.9+]
Reported-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-01 13:11:47 +10:00
Michael Ellerman 8d7c55d01e powerpc/perf: Export PERF_EVENT_CONFIG_EBB_SHIFT to userspace
We use bit 63 of the event code for userspace to request that the event
be counted using EBB (Event Based Branches). Export this value, making
it part of the API - though only on processors that support EBB.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-01 13:11:46 +10:00
Michael Ellerman e8e813ed26 powerpc: Rename PMU interrupts from CNT to PMI
Back in commit 89713ed "Add timer, performance monitor and machine check
counts to /proc/interrupts" we added a count of PMU interrupts to the
output of /proc/interrupts.

At the time we named them "CNT" to match x86.

However in commit 89ccf46 "Rename 'performance counter interrupt'", the
x86 guys renamed theirs from "CNT" to "PMI".

Arguably changing the name could break someone's script, but I think the
chance of that is minimal, and it's preferable to have a name that 1) is
somewhat meaningful, and 2) matches x86.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-01 13:11:46 +10:00
Dongsheng Wang e00c9a0c2e powerpc/mpc85xx: invalidate TLB after hibernation resume
This problem belongs to the core synchronization issues.
The cpu1 already updated spin_table values, but bootcore cannot get
this value in time.

After bootcpu hibiernation restore the pages. we are now running
with the kernel data of the old kernel fully restored. if we reset
the non-bootcpus that will be reset cache(tlb), the non-bootcpus
will get new address(map virtual and physical address spaces).
but bootcpu tlb cache still use boot kernel data, so we need to
invalidate the bootcpu tlb cache make it to get new main memory data.

log:
Enabling non-boot CPUs ...
smp_85xx_kick_cpu: timeout waiting for core 1 to reset
smp: failed starting cpu 1 (rc -2)
Error taking CPU1 up: -2

Signed-off-by: Wang Dongsheng <dongsheng.wang@freescale.com>
Reviewed-by: Anton Vorontsov <anton@enomsg.org>
[scottwood@freescale.com: reworded code comment for clarity]
Signed-off-by: Scott Wood <scottwood@freescale.com>
2013-07-30 15:50:08 -05:00
Wei Yongjun f8dc6eb757 powerpc/fsl_msi: fix error return code in fsl_of_msi_probe()
Fix to return a negative error code in the MSI bitmap alloc error
handling case instead of 0, as done elsewhere in this function.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2013-07-30 15:50:08 -05:00
Hongtao Jia 4e0e3435b5 powerpc/85xx: Add machine check handler to fix PCIe erratum on mpc85xx
A PCIe erratum of mpc85xx may causes a core hang when a link of PCIe
goes down. when the link goes down, Non-posted transactions issued
via the ATMU requiring completion result in an instruction stall.
At the same time a machine-check exception is generated to the core
to allow further processing by the handler. We implements the handler
which skips the instruction caused the stall.

This patch depends on patch:
powerpc/85xx: Add platform_device declaration to fsl_pci.h

Signed-off-by: Zhao Chenhui <b35336@freescale.com>
Signed-off-by: Li Yang <leoli@freescale.com>
Signed-off-by: Liu Shuo <soniccat.liu@gmail.com>
Signed-off-by: Jia Hongtao <hongtao.jia@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2013-07-30 15:50:07 -05:00
Hongtao Jia 9123c5ed45 powerpc: Move opcode definitions from kvm/emulate.c to asm/ppc-opcode.h
Opcode and xopcode are useful definitions not just for KVM. Move these
definitions to asm/ppc-opcode.h for public use.

Also add the opcodes for LHAUX and LWZUX.

Signed-off-by: Jia Hongtao <hongtao.jia@freescale.com>
Signed-off-by: Li Yang <leoli@freescale.com>
[scottwood@freesacle.com: update commit message and rebase]
Signed-off-by: Scott Wood <scottwood@freescale.com>
2013-07-30 15:50:07 -05:00
Bjorn Helgaas 7cd29f4b22 PCI: hotplug: Convert to be builtin only, not modular
Convert CONFIG_HOTPLUG_PCI from tristate to bool.  This only affects
the hotplug core; several of the hotplug drivers can still be modules.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
2013-07-25 14:11:06 -06:00
Paul Mackerras c8ae0ace10 KVM: PPC: Book3S PR: Load up SPRG3 register with guest value on guest entry
Unlike the other general-purpose SPRs, SPRG3 can be read by usermode
code, and is used in recent kernels to store the CPU and NUMA node
numbers so that they can be read by VDSO functions.  Thus we need to
load the guest's SPRG3 value into the real SPRG3 register when entering
the guest, and restore the host's value when exiting the guest.  We don't
need to save the guest SPRG3 value when exiting the guest as usermode
code can't modify SPRG3.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-07-25 15:33:09 +02:00
Santosh Shilimkar 374d5c9964 of: Specify initrd location using 64-bit
On some PAE architectures, the entire range of physical memory could reside
outside the 32-bit limit.  These systems need the ability to specify the
initrd location using 64-bit numbers.

This patch globally modifies the early_init_dt_setup_initrd_arch() function to
use 64-bit numbers instead of the current unsigned long.

There has been quite a bit of debate about whether to use u64 or phys_addr_t.
It was concluded to stick to u64 to be consistent with rest of the device
tree code. As summarized by Geert, "The address to load the initrd is decided
by the bootloader/user and set at that point later in time. The dtb should not
be tied to the kernel you are booting"

More details on the discussion can be found here:
https://lkml.org/lkml/2013/6/20/690
https://lkml.org/lkml/2012/9/13/544

Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Acked-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Signed-off-by: Grant Likely <grant.likely@linaro.org>
2013-07-24 11:10:01 +01:00
Anshuman Khandual ff3d79dc12 powerpc/perf: BHRB filter configuration should follow the task
When the task moves around the system, the corresponding cpuhw
per cpu strcuture should be popullated with the BHRB filter
request value so that PMU could be configured appropriately with
that during the next call into power_pmu_enable().

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Acked-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24 14:42:34 +10:00
Anshuman Khandual 7689bdcab1 powerpc/perf: Ignore separate BHRB privilege state filter request
Completely ignore BHRB privilege state filter request as we are
already configuring that with privilege state filtering attribute
for the accompanying PMU event. This would help achieve cleaner
user space interaction for BHRB.

This patch fixes a situation like this

Before patch:-
------------
./perf record -j any -e branch-misses:k ls
Error:
The sys_perf_event_open() syscall returned with 95 (Operation not
supported) for event (branch-misses:k).
/bin/dmesg may provide additional information.
No CONFIG_PERF_EVENTS=y kernel support configured?

Here 'perf record' actually copies over ':k' filter request into BHRB
privilege state filter config and our previous check in kernel would
fail that.

After patch:-
-------------
./perf record -j any -e branch-misses:k ls
perf  perf.data  perf.data.old  test-mmap-ring
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.002 MB perf.data (~102 samples)]

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Acked-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24 14:42:31 +10:00
Bjorn Helgaas 679750054a powerpc/powernv: Mark pnv_pci_init_ioda2_phb() as __init
Mark pnv_pci_init_ioda2_phb() as __init.  It is called only from an
init function (pnv_pci_init()), and it calls an init function
(pnv_pci_init_ioda_phb()):

    pnv_pci_init                # init
      pnv_pci_init_ioda2_phb    # non-init
	pnv_pci_init_ioda_phb   # init

This should fix a section mismatch warning.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24 14:42:27 +10:00
Aneesh Kumar K.V de640959b6 powerpc/mm: Use the correct SLB(LLP) encoding in tlbie instruction
The sllp value is stored in mmu_psize_defs in such a way that we can easily OR
the value to get the operand for slbmte instruction. ie, the L and LP bits are
not contiguous. Decode the bits and use them correctly in tlbie.
regression is introduced by 1f6aaaccb1
"powerpc: Update tlbie/tlbiel as per ISA doc"

Reported-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24 14:42:24 +10:00
Aneesh Kumar K.V 83383b73ad powerpc/mm: Fix fallthrough bug in hpte_decode
We should not fallthrough different case statements in hpte_decode. Add
break statement to break out of the switch. The regression is introduced by
dcda287a9b "powerpc/mm: Simplify hpte_decode"

Reported-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24 14:42:21 +10:00
Denis Kirjanov ad92c61597 powerpc/pseries: Fix a typo in pSeries_lpar_hpte_insert()
Commit 801eb73f45 introduced
a bug while checking PTE flags. We have to drop the _PAGE_COHERENT flag
when __PAGE_NO_CACHE is set and the cache update policy is not write-through
(i.e. _PAGE_WRITETHRU is not set)

Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
CC:  Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24 14:42:18 +10:00
Gavin Shan ab55d2187d powerpc/eeh: Introdce flag to protect sysfs
The patch introduces flag EEH_DEV_SYSFS to keep track that the sysfs
entries for the corresponding EEH device (then PCI device) has been
added or removed, in order to avoid race condition.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24 14:18:49 +10:00
Gavin Shan 91150af3ad powerpc/eeh: Fix unbalanced enable for IRQ
The patch fixes following issue:

Unbalanced enable for IRQ 23
------------[ cut here ]------------
WARNING: at kernel/irq/manage.c:437
:
NIP [c00000000016de8c] .__enable_irq+0x11c/0x140
LR [c00000000016de88] .__enable_irq+0x118/0x140
Call Trace:
[c000003ea1f23880] [c00000000016de88] .__enable_irq+0x118/0x140 (unreliable)
[c000003ea1f23910] [c00000000016df08] .enable_irq+0x58/0xa0
[c000003ea1f239a0] [c0000000000388b4] .eeh_enable_irq+0xc4/0xe0
[c000003ea1f23a30] [c000000000038a28] .eeh_report_reset+0x78/0x130
[c000003ea1f23ac0] [c000000000037508] .eeh_pe_dev_traverse+0x98/0x170
[c000003ea1f23b60] [c0000000000391ac] .eeh_handle_normal_event+0x2fc/0x3d0
[c000003ea1f23bf0] [c000000000039538] .eeh_handle_event+0x2b8/0x2c0
[c000003ea1f23c90] [c000000000039600] .eeh_event_handler+0xc0/0x170
[c000003ea1f23d30] [c0000000000da9a0] .kthread+0xf0/0x100
[c000003ea1f23e30] [c00000000000a1dc] .ret_from_kernel_thread+0x5c/0x80

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24 14:18:49 +10:00
Gavin Shan 4b83bd452f powerpc/eeh: Don't use pci_dev during BAR restore
While restoring BARs for one specific PCI device, the pci_dev
instance should have been released. So it's not reliable to use
the pci_dev instance on restoring BARs. However, we still need
some information (e.g. PCIe capability position, header type) from
the pci_dev instance. So we have to store those information to
EEH device in advance.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24 14:18:49 +10:00
Gavin Shan f5c57710dd powerpc/eeh: Use partial hotplug for EEH unaware drivers
When EEH error happens to one specific PE, some devices with drivers
supporting EEH won't except hotplug on the device. However, there
might have other deivces without driver, or with driver without EEH
support. For the case, we need do partial hotplug in order to make
sure that the PE becomes absolutely quite during reset. Otherise,
the PE reset might fail and leads to failure of error recovery.

The current code doesn't handle that 'mixed' case properly, it either
uses the error callbacks to the drivers, or tries hotplug, but doesn't
handle a PE (EEH domain) composed of a combination of the two.

The patch intends to support so-called "partial" hotplug for EEH:
Before we do reset, we stop and remove those PCI devices without
EEH sensitive driver. The corresponding EEH devices are not detached
from its PE, but with special flag. After the reset is done, those
EEH devices with the special flag will be scanned one by one.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24 14:18:48 +10:00
Gavin Shan ab444ec97e powerpc/pci: Partial tree hotplug support
When EEH error happens to one specific PE, the device drivers
of its attached EEH devices (PCI devices) are checked to see
the further action: reset with complete hotplug, or reset without
hotplug. However, that's not enough for those PCI devices whose
drivers can't support EEH, or those PCI devices without driver.
So we need do so-called "partial hotplug" on basis of PCI devices.
In the situation, part of PCI devices of the specific PE are
unplugged and plugged again after PE reset.

The patch changes pcibios_add_pci_devices() so that it can support
full hotplug and so-called "partial" hotplug based on device-tree
or real hardware. It's notable that pci_of_scan.c has been changed
for a bit in order to support the "partial" hotplug based on dev-tree.

Most of the generic code already supports that, we just need to
plumb it properly on our side.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24 14:18:48 +10:00
Gavin Shan 9feed42e93 powerpc/eeh: Use safe list traversal when walking EEH devices
Currently, we're trasversing the EEH devices list using list_for_each_entry().
That's not safe enough because the EEH devices might be removed from
its parent PE while doing iteration. The patch replaces that with
list_for_each_entry_safe().

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24 14:18:47 +10:00
Gavin Shan 807a827d4e powerpc/eeh: Keep PE during hotplug
When we do normal hotplug, the PE (shadow EEH structure) shouldn't be
kept around.

However, we need to keep it if the hotplug an artifial one caused by
EEH errors recovery.

Since we remove EEH device through the PCI hook pcibios_release_device(),
the flag "purge_pe" passed to various functions is meaningless. So the patch
removes the meaningless flag and introduce new flag "EEH_PE_KEEP"
to save the PE while doing hotplug during EEH error recovery.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24 14:18:47 +10:00
Gavin Shan 008a4938ea powerpc/pci: Override pcibios_release_device()
The patch overrides pcibios_release_device() to release EEH
resources (EEH cache, unbinding EEH device) for the indicated PCI
device.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24 14:18:46 +10:00
Gavin Shan f2856491d2 powerpc/eeh: Export functions for hotplug
Make some functions public in order to support hotplug on either specific
PCI bus or PCI device in future.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24 14:18:46 +10:00
Gavin Shan 0ba178888b powerpc/eeh: Remove reference to PCI device
We will rely on pcibios_release_device() to remove the EEH cache
and unbind EEH device for the specific PCI device. So we shouldn't
hold the reference to the PCI device from EEH cache and EEH device.
Otherwise, pcibios_release_device() won't be called as we expected.
The patch removes the reference to the PCI device in EEH core.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24 14:18:46 +10:00
Mahesh Salgaonkar ee1dd1e3dc powerpc: Fix the corrupt r3 error during MCE handling.
During Machine Check interrupt on pseries platform, R3 generally points to
memory region inside RTAS (FWNMI) area. We see r3 corruption because when RTAS
delivers the machine check exception it passes the address inside FWNMI area
with the top most bit set. This patch fixes this issue by masking top two bit
in machine check exception handler.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24 14:18:45 +10:00
Michael Ellerman 5d7ead0039 powerpc/perf: Set PPC_FEATURE2_EBB when we register the power8 PMU
The presence or absence of EBB is advertised to userspace via the presence
or absence of PPC_FEATURE2_EBB in cpu_user_features2.

Because the kernel can be built without PMU support, we should only add
PPC_FEATURE2_EBB to cpu_user_features2 when we successfully register the
power8 PMU support.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24 14:18:45 +10:00
Paul Bolle 97645154a9 powerpc/pseries: Drop "select HOTPLUG"
The Kconfig symbol HOTPLUG was removed with commit 40b313608a ("Finally
eradicate CONFIG_HOTPLUG"). But there's still one select statement for
that symbol. It seems that select statement was added after the patch to
remove CONFIG_HOTPLUG was submitted. Anyhow, it is useless and can be
dropped.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24 14:18:44 +10:00
Tiejun Chen 0b88f772bd powerpc: Access local paca after hard irq disabled
In hard_irq_disable(), we accessed the PACA before we hard disabled
the interrupts, potentially causing a warning as get_paca() will
us debug_smp_processor_id().

Move that to after the disabling, and also use local_paca directly
rather than get_paca() to avoid several redundant and useless checks.

Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24 14:18:44 +10:00
Anton Blanchard 0e0ed6406e powerpc/modules: Module CRC relocation fix causes perf issues
Module CRCs are implemented as absolute symbols that get resolved by
a linker script. We build an intermediate .o that contains an
unresolved symbol for each CRC. genksysms parses this .o, calculates
the CRCs and writes a linker script that "resolves" the symbols to
the calculated CRC.

Unfortunately the ppc64 relocatable kernel sees these CRCs as symbols
that need relocating and relocates them at boot. Commit d4703aef
(module: handle ppc64 relocating kcrctabs when CONFIG_RELOCATABLE=y)
added a hook to reverse the bogus relocations. Part of this patch
created a symbol at 0x0:

# head -2 /proc/kallsyms
0000000000000000 T reloc_start
c000000000000000 T .__start

This reloc_start symbol is causing lots of confusion to perf. It
thinks reloc_start is a massive function that stretches from 0x0 to
0xc000000000000000 and we get various cryptic errors out of perf,
including:

problem incrementing symbol count, skipping event

This patch removes the  reloc_start linker script label and instead
defines it as PHYSICAL_START. We also need to wrap it with
CONFIG_PPC64 because the ppc32 kernel can set a non zero
PHYSICAL_START at compile time and we wouldn't want to subtract
it from the CRCs in that case.

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24 14:18:43 +10:00
Michael Neuling 33959f88fc powerpc: Add second POWER8 PVR entry
POWER8 comes with two different PVRs.  This patch enables the additional
PVR in the cputable.

The existing entry (PVR=0x4b) is renamed to POWER8E and the new entry
(PVR=0x4d) is given POWER8.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24 14:18:43 +10:00
Ingo Molnar 5a9821321e perf/core improvements and fixes:
. Add missing 'finished_round' event forwarding in 'perf inject', from Adrian Hunter.
 
 . Assorted tidy ups, from Adrian Hunter.
 
 . Fall back to sysfs event names when parsing fails, from Andi Kleen.
 
 . List pmu events in perf list, from Andi Kleen.
 
 . Cleanup some memory allocation/freeing uses, from David Ahern.
 
 . Add option to collapse undesired parts of call graph, from Greg Price.
 
 . Prep work for multi perf data file storage, from Jiri Olsa.
 
 . Add support for more than two files comparision in 'perf diff', from Jiri Olsa
 
 . A few more 'perf test' improvements, from Jiri Olsa
 
 . libtraceevent cleanups, from Namhyung Kim.
 
 . Remove odd build stall in 'perf sched' by moving a large struct initialization
   from a local variable to a global one, from Namhyung Kim.
 
 . Add support for callchains in the gtk UI, from Namhyung Kim.
 
 . Do not apply symfs for an absolute vmlinux path, fix from Namhyung Kim.
 
 . Use default include path notation for libtraceevent, from Robert Richter.
 
 . Fix 'make tools/perf', from Robert Richter.
 
 . Make Power7 events available, from Runzhen Wang.
 
 . Add --objdump option to 'perf top', from Sukadev Bhattiprolu.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.13 (GNU/Linux)
 
 iQIcBAABAgAGBQJR6EOuAAoJENZQFvNTUqpAqjEP/1Ist/eR9be8YhljMz8Yxl1o
 JXktgxSkMS/n59lRibUuGZrgPKPNxivK6AEbnbZxzZoHDBS8tnAAOXUuUVTtNCoT
 YsQurQjCmyXHIvYqwMaYarrhoIv33LdJyshskW3GZ81UfeeC6QoC56he3VTg1dEd
 k8snS4F8LJpBQizRJN6s959nF+pyw16wqiGYKJ80G1nhPTsStz8NSSWdCRVbyXl9
 fG0S/lLvUfilGT/ixHcvS62ENHiErL4N6jGNV4XeQqoADhrQvCwziDr+BORfJB9K
 udbO0PFS5uR4HOGNqZOPZfPxW8cTUXV9cCscLScKEVUghKz9rzHbPTTSDejXna/h
 cqLjRW1xpWUmIRY7Y5zSoBJIsh2t3vo4TkZoRNZxhCexoOT/qIUL6bWVZoxqaKzG
 xwL4DopOvb/DdUDkb+UB9+9kW4rDoMR1wUb6XXuGx8EqM8LHiA3TAPcGwmNh/IM6
 4maUhgyOFad3rm5mcjO7IoCU7NxoWR1dKsjGYteZeZv17X30UfRFwIIH+8l2Ma4o
 bpBQ4DKu07jXRUZXcUajOhj7cZO+UmE6c4YoAPpv+CQ1YKVH4/YHYf6RBWdjGLqV
 kqOrQuSopVztmaglrh93e+cYkfvNjZAC2EOzhx+UxcPqh9zh1Pil8LE1DwjChW5m
 /Y/NPH98nDvXY3N4vlLK
 =xk/v
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

 * Add missing 'finished_round' event forwarding in 'perf inject', from Adrian Hunter.

 * Assorted tidy ups, from Adrian Hunter.

 * Fall back to sysfs event names when parsing fails, from Andi Kleen.

 * List pmu events in perf list, from Andi Kleen.

 * Cleanup some memory allocation/freeing uses, from David Ahern.

 * Add option to collapse undesired parts of call graph, from Greg Price.

 * Prep work for multi perf data file storage, from Jiri Olsa.

 * Add support for more than two files comparision in 'perf diff', from Jiri Olsa

 * A few more 'perf test' improvements, from Jiri Olsa

 * libtraceevent cleanups, from Namhyung Kim.

 * Remove odd build stall in 'perf sched' by moving a large struct initialization
   from a local variable to a global one, from Namhyung Kim.

 * Add support for callchains in the gtk UI, from Namhyung Kim.

 * Do not apply symfs for an absolute vmlinux path, fix from Namhyung Kim.

 * Use default include path notation for libtraceevent, from Robert Richter.

 * Fix 'make tools/perf', from Robert Richter.

 * Make Power7 events available, from Runzhen Wang.

 * Add --objdump option to 'perf top', from Sukadev Bhattiprolu.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-07-19 09:35:30 +02:00
Takuya Yoshikawa e59dbe09f8 KVM: Introduce kvm_arch_memslots_updated()
This is called right after the memslots is updated, i.e. when the result
of update_memslots() gets installed in install_new_memslots().  Since
the memslots needs to be updated twice when we delete or move a memslot,
kvm_arch_commit_memory_region() does not correspond to this exactly.

In the following patch, x86 will use this new API to check if the mmio
generation has reached its maximum value, in which case mmio sptes need
to be flushed out.

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Acked-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-18 12:29:25 +02:00
Rusty Russell 8c6ffba0ed PTR_RET is now PTR_ERR_OR_ZERO(): Replace most.
Sweep of the simple cases.

Cc: netdev@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-15 11:25:01 +09:30
Linus Torvalds be9c6d9169 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:
 "Just a bunch of small fixes and tidy ups:

   1) Finish the "busy_poll" renames, from Eliezer Tamir.

   2) Fix RCU stalls in IFB driver, from Ding Tianhong.

   3) Linearize buffers properly in tun/macvtap zerocopy code.

   4) Don't crash on rmmod in vxlan, from Pravin B Shelar.

   5) Spinlock used before init in alx driver, from Maarten Lankhorst.

   6) A sparse warning fix in bnx2x broke TSO checksums, fix from Dmitry
      Kravkov.

   7) Dummy and ifb driver load failure paths can oops, fixes from Tan
      Xiaojun and Ding Tianhong.

   8) Correct MTU calculations in IP tunnels, from Alexander Duyck.

   9) Account all TCP retransmits in SNMP stats properly, from Yuchung
      Cheng.

  10) atl1e and via-rhine do not handle DMA mapping failures properly,
      from Neil Horman.

  11) Various equal-cost multipath route fixes in ipv6 from Hannes
      Frederic Sowa"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (36 commits)
  ipv6: only static routes qualify for equal cost multipathing
  via-rhine: fix dma mapping errors
  atl1e: fix dma mapping warnings
  tcp: account all retransmit failures
  usb/net/r815x: fix cast to restricted __le32
  usb/net/r8152: fix integer overflow in expression
  net: access page->private by using page_private
  net: strict_strtoul is obsolete, use kstrtoul instead
  drivers/net/ieee802154: don't use devm_pinctrl_get_select_default() in probe
  drivers/net/ethernet/cadence: don't use devm_pinctrl_get_select_default() in probe
  drivers/net/can/c_can: don't use devm_pinctrl_get_select_default() in probe
  net/usb: add relative mii functions for r815x
  net/tipc: use %*phC to dump small buffers in hex form
  qlcnic: Adding Maintainers.
  gre: Fix MTU sizing check for gretap tunnels
  pkt_sched: sch_qfq: remove forward declaration of qfq_update_agg_ts
  pkt_sched: sch_qfq: improve efficiency of make_eligible
  gso: Update tunnel segmentation to support Tx checksum offload
  inet: fix spacing in assignment
  ifb: fix oops when loading the ifb failed
  ...
2013-07-13 17:42:22 -07:00
Linus Torvalds 560ae37178 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Thomas Gleixner:
 - fix for do_div() abuse on x86
 - locking fix in perf core
 - a pile of (build) fixes and cleanups in perf tools

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
  perf/x86: Fix incorrect use of do_div() in NMI warning
  perf: Fix perf_lock_task_context() vs RCU
  perf: Remove WARN_ON_ONCE() check in __perf_event_enable() for valid scenario
  perf: Clone child context from parent context pmu
  perf script: Fix broken include in Context.xs
  perf tools: Fix -ldw/-lelf link test when static linking
  perf tools: Revert regression in configuration of Python support
  perf tools: Fix perf version generation
  perf stat: Fix per-socket output bug for uncore events
  perf symbols: Fix vdso list searching
  perf evsel: Fix missing increment in sample parsing
  perf tools: Update symbol_conf.nr_events when processing attribute events
  perf tools: Fix new_term() missing free on error path
  perf tools: Fix parse_events_terms() segfault on error path
  perf evsel: Fix count parameter to read call in event_format__new
  perf tools: fix a typo of a Power7 event name
  perf tools: Fix -x/--exclude-other option for report command
  perf evlist: Enhance perf_evlist__start_workload()
  perf record: Remove -f/--force option
  perf record: Remove -A/--append option
  ...
2013-07-13 15:35:47 -07:00
Runzhen Wang cfe0d8ba14 perf tools: Make Power7 events available for perf
Power7 supports over 530 different perf events but only a small subset
of these can be specified by name, for the remaining events, we must
specify them by their raw code:

        perf stat -e r2003c <application>

This patch makes all the POWER7 events available in sysfs.  So we can
instead specify these as:

        perf stat -e 'cpu/PM_CMPLU_STALL_DFU/' <application>

where PM_CMPLU_STALL_DFU is the r2003c in previous example.

Before this patch is applied, the size of power7-pmu.o is:

$ size arch/powerpc/perf/power7-pmu.o
   text	   data	    bss	    dec	    hex	filename
   3073	   2720	      0	   5793	   16a1	arch/powerpc/perf/power7-pmu.o

and after the patch is applied, it is:

$ size arch/powerpc/perf/power7-pmu.o
   text	   data	    bss	    dec	    hex	filename
  15950	  31112	      0	  47062	   b7d6	arch/powerpc/perf/power7-pmu.o

For the run time overhead, I use two scripts, one is "event_name.sh",
which contains 50 event names, it looks like:

 # ./perf record  -e 'cpu/PM_CMPLU_STALL_DFU/' -e .....  /bin/sleep 1

the other one is named "event_code.sh" which use corresponding  events
raw
code instead of events names, it looks like:

 # ./perf record -e r2003c -e ......  /bin/sleep 1

below is the result.

Using events name:

[root@localhost perf]# time ./event_name.sh
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.002 MB perf.data (~102 samples) ]

real	0m1.192s
user	0m0.028s
sys	0m0.106s

Using events raw code:

[root@localhost perf]# time ./event_code.sh
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.003 MB perf.data (~112 samples) ]

real	0m1.198s
user	0m0.028s
sys	0m0.105s

Signed-off-by: Runzhen Wang <runzhen@linux.vnet.ibm.com>
Acked-by: Michael Ellerman <michael@ellerman.id.au>
Cc: icycoder@gmail.com
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Runzhen Wang <runzhew@clemson.edu>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1372407297-6996-3-git-send-email-runzhen@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-07-12 13:46:09 -03:00
Michel Lespinasse 98d1e64f95 mm: remove free_area_cache
Since all architectures have been converted to use vm_unmapped_area(),
there is no remaining use for the free_area_cache.

Signed-off-by: Michel Lespinasse <walken@google.com>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-10 18:11:34 -07:00
Eliezer Tamir 64b0dc517e net: rename busy poll socket op and globals
Rename LL_SO to BUSY_POLL_SO
Rename sysctl_net_ll_{read,poll} to sysctl_busy_{read,poll}
Fix up users of these variables.
Fix documentation for sysctl.

a patch for the socket.7  man page will follow separately,
because of limitations of my mail setup.

Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-10 17:08:27 -07:00
Scott Wood c09ec198eb kvm/ppc/booke: Don't call kvm_guest_enter twice
kvm_guest_enter() was already called by kvmppc_prepare_to_enter().
Don't call it again.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-07-11 00:58:49 +02:00
Scott Wood 5f1c248f52 kvm/ppc: Call trace_hardirqs_on before entry
Currently this is only being done on 64-bit.  Rather than just move it
out of the 64-bit ifdef, move it to kvm_lazy_ee_enable() so that it is
consistent with lazy ee state, and so that we don't track more host
code as interrupts-enabled than necessary.

Rename kvm_lazy_ee_enable() to kvm_fix_ee_before_entry() to reflect
that this function now has a role on 32-bit as well.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-07-11 00:51:28 +02:00
Paul Mackerras 4baa1d871c KVM: PPC: Book3S HV: Allow negative offsets to real-mode hcall handlers
The table of offsets to real-mode hcall handlers in book3s_hv_rmhandlers.S
can contain negative values, if some of the handlers end up before the
table in the vmlinux binary.  Thus we need to use a sign-extending load
to read the values in the table rather than a zero-extending load.
Without this, the host crashes when the guest does one of the hcalls
with negative offsets, due to jumping to a bogus address.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-07-10 13:14:16 +02:00
Paul Mackerras 5448050124 KVM: PPC: Book3S HV: Correct tlbie usage
This corrects the usage of the tlbie (TLB invalidate entry) instruction
in HV KVM.  The tlbie instruction changed between PPC970 and POWER7.
On the PPC970, the bit to select large vs. small page is in the instruction,
not in the RB register value.  This changes the code to use the correct
form on PPC970.

On POWER7 we were calculating the AVAL (Abbreviated Virtual Address, Lower)
field of the RB value incorrectly for 64k pages.  This fixes it.

Since we now have several cases to handle for the tlbie instruction, this
factors out the code to do a sequence of tlbies into a new function,
do_tlbies(), and calls that from the various places where the code was
doing tlbie instructions inline.  It also makes kvmppc_h_bulk_remove()
use the same global_invalidates() function for determining whether to do
local or global TLB invalidations as is used in other places, for
consistency, and also to make sure that kvm->arch.need_tlb_flush gets
updated properly.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-07-10 13:14:09 +02:00
Linus Torvalds 496322bc91 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
 "This is a re-do of the net-next pull request for the current merge
  window.  The only difference from the one I made the other day is that
  this has Eliezer's interface renames and the timeout handling changes
  made based upon your feedback, as well as a few bug fixes that have
  trickeled in.

  Highlights:

   1) Low latency device polling, eliminating the cost of interrupt
      handling and context switches.  Allows direct polling of a network
      device from socket operations, such as recvmsg() and poll().

      Currently ixgbe, mlx4, and bnx2x support this feature.

      Full high level description, performance numbers, and design in
      commit 0a4db187a9 ("Merge branch 'll_poll'")

      From Eliezer Tamir.

   2) With the routing cache removed, ip_check_mc_rcu() gets exercised
      more than ever before in the case where we have lots of multicast
      addresses.  Use a hash table instead of a simple linked list, from
      Eric Dumazet.

   3) Add driver for Atheros CQA98xx 802.11ac wireless devices, from
      Bartosz Markowski, Janusz Dziedzic, Kalle Valo, Marek Kwaczynski,
      Marek Puzyniak, Michal Kazior, and Sujith Manoharan.

   4) Support reporting the TUN device persist flag to userspace, from
      Pavel Emelyanov.

   5) Allow controlling network device VF link state using netlink, from
      Rony Efraim.

   6) Support GRE tunneling in openvswitch, from Pravin B Shelar.

   7) Adjust SOCK_MIN_RCVBUF and SOCK_MIN_SNDBUF for modern times, from
      Daniel Borkmann and Eric Dumazet.

   8) Allow controlling of TCP quickack behavior on a per-route basis,
      from Cong Wang.

   9) Several bug fixes and improvements to vxlan from Stephen
      Hemminger, Pravin B Shelar, and Mike Rapoport.  In particular,
      support receiving on multiple UDP ports.

  10) Major cleanups, particular in the area of debugging and cookie
      lifetime handline, to the SCTP protocol code.  From Daniel
      Borkmann.

  11) Allow packets to cross network namespaces when traversing tunnel
      devices.  From Nicolas Dichtel.

  12) Allow monitoring netlink traffic via AF_PACKET sockets, in a
      manner akin to how we monitor real network traffic via ptype_all.
      From Daniel Borkmann.

  13) Several bug fixes and improvements for the new alx device driver,
      from Johannes Berg.

  14) Fix scalability issues in the netem packet scheduler's time queue,
      by using an rbtree.  From Eric Dumazet.

  15) Several bug fixes in TCP loss recovery handling, from Yuchung
      Cheng.

  16) Add support for GSO segmentation of MPLS packets, from Simon
      Horman.

  17) Make network notifiers have a real data type for the opaque
      pointer that's passed into them.  Use this to properly handle
      network device flag changes in arp_netdev_event().  From Jiri
      Pirko and Timo Teräs.

  18) Convert several drivers over to module_pci_driver(), from Peter
      Huewe.

  19) tcp_fixup_rcvbuf() can loop 500 times over loopback, just use a
      O(1) calculation instead.  From Eric Dumazet.

  20) Support setting of explicit tunnel peer addresses in ipv6, just
      like ipv4.  From Nicolas Dichtel.

  21) Protect x86 BPF JIT against spraying attacks, from Eric Dumazet.

  22) Prevent a single high rate flow from overruning an individual cpu
      during RX packet processing via selective flow shedding.  From
      Willem de Bruijn.

  23) Don't use spinlocks in TCP md5 signing fast paths, from Eric
      Dumazet.

  24) Don't just drop GSO packets which are above the TBF scheduler's
      burst limit, chop them up so they are in-bounds instead.  Also
      from Eric Dumazet.

  25) VLAN offloads are missed when configured on top of a bridge, fix
      from Vlad Yasevich.

  26) Support IPV6 in ping sockets.  From Lorenzo Colitti.

  27) Receive flow steering targets should be updated at poll() time
      too, from David Majnemer.

  28) Fix several corner case regressions in PMTU/redirect handling due
      to the routing cache removal, from Timo Teräs.

  29) We have to be mindful of ipv4 mapped ipv6 sockets in
      upd_v6_push_pending_frames().  From Hannes Frederic Sowa.

  30) Fix L2TP sequence number handling bugs, from James Chapman."

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1214 commits)
  drivers/net: caif: fix wrong rtnl_is_locked() usage
  drivers/net: enic: release rtnl_lock on error-path
  vhost-net: fix use-after-free in vhost_net_flush
  net: mv643xx_eth: do not use port number as platform device id
  net: sctp: confirm route during forward progress
  virtio_net: fix race in RX VQ processing
  virtio: support unlocked queue poll
  net/cadence/macb: fix bug/typo in extracting gem_irq_read_clear bit
  Documentation: Fix references to defunct linux-net@vger.kernel.org
  net/fs: change busy poll time accounting
  net: rename low latency sockets functions to busy poll
  bridge: fix some kernel warning in multicast timer
  sfc: Fix memory leak when discarding scattered packets
  sit: fix tunnel update via netlink
  dt:net:stmmac: Add dt specific phy reset callback support.
  dt:net:stmmac: Add support to dwmac version 3.610 and 3.710
  dt:net:stmmac: Allocate platform data only if its NULL.
  net:stmmac: fix memleak in the open method
  ipv6: rt6_check_neigh should successfully verify neigh if no NUD information are available
  net: ipv6: fix wrong ping_v6_sendmsg return value
  ...
2013-07-09 18:24:39 -07:00
Oleg Nesterov 6961ed96f1 ptrace/powerpc: revert "hw_breakpoints: Fix racy access to ptrace breakpoints"
This reverts commit 07fa7a0a8a ("hw_breakpoints: Fix racy access to
ptrace breakpoints") and removes ptrace_get/put_breakpoints() added by
other commits.

The patch was fine but we can no longer race with SIGKILL after commit
9899d11f65 ("ptrace: ensure arch_ptrace/ptrace_request can never race
with SIGKILL"), the __TASK_TRACED tracee can't be woken up and
->ptrace_bps[] can't go away.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Michael Neuling <mikey@neuling.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jan Kratochvil <jan.kratochvil@redhat.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Prasad <prasad@linux.vnet.ibm.com>
Cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-09 10:33:25 -07:00
Runzhen Wang 7e40c92019 perf tools: fix a typo of a Power7 event name
In the Power7 PMU guide:
https://www.power.org/documentation/commonly-used-metrics-for-performance-analysis/
PM_BRU_MPRED is referred to as PM_BR_MPRED.

It fixed the typo by changing the name of the event in kernel and
documentation accordingly.

This patch changes the ABI, there are some reasons I think it's ok:

- It is relatively new interface, specific to the Power7 platform.

- No tools that we know of actually use this interface at this point
 (none are listed near the interface).

- Users of this interface (eg oprofile users migrating to perf)
  would be more used to the "PM_BR_MPRED" rather than "PM_BRU_MPRED".

- These are in the ABI/testing at this point rather than ABI/stable,
  so hoping we have some wiggle room.

Signed-off-by: Runzhen Wang <runzhen@linux.vnet.ibm.com>
Acked-by: Michael Ellerman <michael@ellerman.id.au>
Cc: icycoder@gmail.com
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Runzhen Wang <runzhew@clemson.edu>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1372407297-6996-2-git-send-email-runzhen@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-07-08 17:40:05 -03:00
Aneesh Kumar K.V 990978e993 powerpc/kvm: Use 256K chunk to track both RMA and hash page table allocation.
Both RMA and hash page table request will be a multiple of 256K. We can use
a chunk size of 256K to track the free/used 256K chunk in the bitmap. This
should help to reduce the bitmap size.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-07-08 16:21:13 +02:00
Aneesh Kumar K.V 6c45b81098 powerpc/kvm: Contiguous memory allocator based RMA allocation
Older version of power architecture use Real Mode Offset register and Real Mode Limit
Selector for mapping guest Real Mode Area. The guest RMA should be physically
contigous since we use the range when address translation is not enabled.

This patch switch RMA allocation code to use contigous memory allocator. The patch
also remove the the linear allocator which not used any more

Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-07-08 16:20:20 +02:00
Aneesh Kumar K.V fa61a4e376 powerpc/kvm: Contiguous memory allocator based hash page table allocation
Powerpc architecture uses a hash based page table mechanism for mapping virtual
addresses to physical address. The architecture require this hash page table to
be physically contiguous. With KVM on Powerpc currently we use early reservation
mechanism for allocating guest hash page table. This implies that we need to
reserve a big memory region to ensure we can create large number of guest
simultaneously with KVM on Power. Another disadvantage is that the reserved memory
is not available to rest of the subsystems and and that implies we limit the total
available memory in the host.

This patch series switch the guest hash page table allocation to use
contiguous memory allocator.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-07-08 16:19:58 +02:00
Alexander Graf f35320288c KVM: PPC: Book3S: Ignore DABR register
We don't emulate breakpoints yet, so just ignore reads and writes
to / from DABR.

This fixes booting of more recent Linux guest kernels for me.

Reported-by: Nello Martuscielli <ppc.addon@gmail.com>
Tested-by: Nello Martuscielli <ppc.addon@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-07-08 16:18:20 +02:00
Linus Torvalds 2cb7b5a38c irqdomain refactoring for v3.11
This is the long awaited simplification of irqdomain. It gets rid of the
 different types of irq domains and instead both linear and tree mappings
 can be supported in a single domain. Doing this removes a lot of special
 case code and makes irq domains simpler to understand overall.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJR1mnQAAoJEEFnBt12D9kBW7gP/jQpLqFU6PTdsCMX459Ib7fE
 JGLQWYjAsqVbVT3H8LZYOZl+ItIbXSQVk+wf0mislnOzxtGxNCxZeJdniFK2RWyz
 sGNm/eEZfp9bvm9jlrPE615QE7U7c9fQ5dqETNS4UMMFFmjgrsY4Z82zYH2/Y/jK
 YbxFUT6+fep5QaF7lYM7phfrs7FrvnpCoxCio6LO+GgxnMxHSsGJ8L9JcojZ5Qoa
 MwlSwsMaQrWJ6PNVbB2YrtlMxjHkNpQ7EV7bNF693PjACVpXf3ItpvQ1k6W/ns06
 j6Nysj4nbjsTdWOVh5mY7tPfknZUDs38rLT5s+RbfHqySMIGQXp6CzieYaJ+lxV9
 YsItyfZE7h3anebjwRv/9XqLNsDM4KmSIJGcnq+QxNSGxO3rXkItWtpCKaqYdWnN
 wI359rSZ//x2ltFzvB14mZJm2j7r9Hh7gH8M5h8W8YTqCbe3oSES3iRhJqnyPxMX
 of4PsQaTAicuLwberGdVkKafg9XIsHPoebBzDxssB4nxPrEKIILL4g54x1rV17TZ
 kBXurd79FhP4gOQxJHf5uEpdxrQ3Cm/PQpD5CG2rZJrrcV7hfP4QMMFHqqbFSIOu
 2h4M4fZX2gsDh3RLboGRGn0GckSPTHiO05yeeBriLzMCVGcZ1JMqAxLUSJpH/UYl
 EzFy9RpwnwLoUPi8jUXE
 =yJRw
 -----END PGP SIGNATURE-----

Merge tag 'irqdomain-for-linus' of git://git.secretlab.ca/git/linux

Pull irqdomain refactoring from Grant Likely:
 "This is the long awaited simplification of irqdomain.  It gets rid of
  the different types of irq domains and instead both linear and tree
  mappings can be supported in a single domain.  Doing this removes a
  lot of special case code and makes irq domains simpler to understand
  overall"

* tag 'irqdomain-for-linus' of git://git.secretlab.ca/git/linux:
  irq: fix checkpatch error
  irqdomain: Include hwirq number in /proc/interrupts
  irqdomain: make irq_linear_revmap() a fast path again
  irqdomain: remove irq_domain_generate_simple()
  irqdomain: Refactor irq_domain_associate_many()
  irqdomain: Beef up debugfs output
  irqdomain: Clean up aftermath of irq_domain refactoring
  irqdomain: Eliminate revmap type
  irqdomain: merge linear and tree reverse mappings.
  irqdomain: Add a name field
  irqdomain: Replace LEGACY mapping with LINEAR
  irqdomain: Relax failure path on setting up mappings
2013-07-06 12:37:04 -07:00
Linus Torvalds 74b9272bbe Device tree updates for v3.11
This branch contains the following changes:
 - Removal of CONFIG_OF_DEVICE, it is always enabled by CONFIG_OF
 - Remove #ifdef from linux/of_platform.h to increase compiler syntax
   coverage
 - Bug fix for address decoding on Bimini and js2x powerpc platforms.
 - miscellaneous binding changes
 
 One note on the above. The binding changes going in from all kinds of
 different trees has gotten rather out of hand. I picked up some during
 this cycle, but even going though my tree isn't a great fit. Ian
 Campbell has prototyped splitting the bindings and .dtb files into a
 separate repository. The plan is to migrate to using that sometime in
 the next few kernel releases which should get rid of a lot of the churn
 on binding docs and .dts files.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJR1fP3AAoJEEFnBt12D9kB3IIP/0Q5ctMespiJ50+ThjGsaR3m
 sUbQkMK46uL/oupXaJT2ybX2PxLN5LpgvO9rPt77hblOoL0+wZt+j9G0pLy1qZQZ
 aHprH9SrpGJv6F0SFbHp/+D/m9vESPv+zwYzL9TvrOALvCD7OSZ7tHLaoF7Y1ADM
 QnZa7pta3Owpu5NsGXaTXLpaZzfXzfWzf4PDzv2FsAIDbtuVJZGJZ7sJVO7Z0r+K
 KCY85uKJ4VOHY0onBVlM6uoCnopOi2XMMkyxYvR28lL2Kiv2b3np46jG3zX1EZH5
 Qxdu85QZn2oio9iaTeYKK8bG9aRIRsXnzCnF2s68n2rQlEtPpWKN9Lj2AS/KJ+Ig
 obFTOFDHmxt1F4GIA0/HIPkDvRd7GTIwgwYYubEMi44E3Mae0N+xzkIRE41vYP7s
 8zaNHbjAjsYjplsvN5gTPxxiU/ta24a5bl7Ont2zmOjAbXCsDajm4NCKZRJ3lb2f
 FHNsS1zHGmqgJ9zt13GQabo/Tp4t3KwTzBirPQsDokRO4eoL6klcS3GCRv82VWC0
 dLnzu92hXcyXgh7mX2sj6sRBSwNygxMn4ZsZJklle38/LynvtrzT72BOZjghS+Vh
 l553uDInjSJ3IBrXnClPoyObcu50cmsBBgsK39FzU+MF9mcCHmkHQiT52zM6ZW3M
 wwY1OfcZk3XaT7akcVu6
 =CndB
 -----END PGP SIGNATURE-----

Merge tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux

Pull device tree updates from Grant Likely:
 "This branch contains the following changes:
   - Removal of CONFIG_OF_DEVICE, it is always enabled by CONFIG_OF
   - Remove #ifdef from linux/of_platform.h to increase compiler syntax
     coverage
   - Bug fix for address decoding on Bimini and js2x powerpc platforms.
   - miscellaneous binding changes

  One note on the above.  The binding changes going in from all kinds of
  different trees has gotten rather out of hand.  I picked up some
  during this cycle, but even going though my tree isn't a great fit.

  Ian Campbell has prototyped splitting the bindings and .dtb files into
  a separate repository.  The plan is to migrate to using that sometime
  in the next few kernel releases which should get rid of a lot of the
  churn on binding docs and .dts files"

* tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux:
  of: Fix address decoding on Bimini and js2x machines
  of: remove CONFIG_OF_DEVICE
  usb: chipidea: depend on CONFIG_OF instead of CONFIG_OF_DEVICE
  of: remove of_platform_driver
  ibmebus: convert of_platform_driver to platform_driver
  driver core: move to_platform_driver to platform_device.h
  mfd: DT bindings for the palmas family MFD
  ARM: dts: omap3-devkit8000: fix NAND memory binding
  of/base: fix typos
  of: remove #ifdef from linux/of_platform.h
2013-07-04 15:51:45 -07:00
Linus Torvalds 80cc38b163 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial tree updates from Jiri Kosina:
 "The usual stuff from trivial tree"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (34 commits)
  treewide: relase -> release
  Documentation/cgroups/memory.txt: fix stat file documentation
  sysctl/net.txt: delete reference to obsolete 2.4.x kernel
  spinlock_api_smp.h: fix preprocessor comments
  treewide: Fix typo in printk
  doc: device tree: clarify stuff in usage-model.txt.
  open firmware: "/aliasas" -> "/aliases"
  md: bcache: Fixed a typo with the word 'arithmetic'
  irq/generic-chip: fix a few kernel-doc entries
  frv: Convert use of typedef ctl_table to struct ctl_table
  sgi: xpc: Convert use of typedef ctl_table to struct ctl_table
  doc: clk: Fix incorrect wording
  Documentation/arm/IXP4xx fix a typo
  Documentation/networking/ieee802154 fix a typo
  Documentation/DocBook/media/v4l fix a typo
  Documentation/video4linux/si476x.txt fix a typo
  Documentation/virtual/kvm/api.txt fix a typo
  Documentation/early-userspace/README fix a typo
  Documentation/video4linux/soc-camera.txt fix a typo
  lguest: fix CONFIG_PAE -> CONFIG_x86_PAE in comment
  ...
2013-07-04 11:40:58 -07:00
Linus Torvalds e61aca5158 Merge branch 'kconfig-diet' from Dave Hansen
Merge Kconfig menu diet patches from Dave Hansen:
 "I think the "Kernel Hacking" menu has gotten a bit out of hand.  It is
  over 120 lines long on my system with everything enabled and options
  are scattered around it haphazardly.

        http://sr71.net/~dave/linux/kconfig-horror.png

  Let's try to introduce some sanity.  This set takes that 120 lines
  down to 55 and makes it vastly easier to find some things.  It's a
  start.

  This set stands on its own, but there is plenty of room for follow-up
  patches.  The arch-specific debug options still end up getting stuck
  in the top-level "kernel hacking" menu.  OPTIMIZE_INLINING, for
  instance, could obviously go in to the "compiler options" menu, but
  the fact that it is defined in arch/ in a separate Kconfig file keeps
  it on its own for the moment.

  The Signed-off-by's in here look funky.  I changed employers while
  working on this set, so I have signoffs from both email addresses"

* emailed patches from Dave Hansen <dave@sr71.net>:
  hang and lockup detection menu
  kconfig: consolidate printk options
  group locking debugging options
  consolidate compilation option configs
  consolidate runtime testing configs
  order memory debugging Kconfig options
  consolidate per-arch stack overflow debugging options
2013-07-04 11:25:51 -07:00
Dave Hansen d1a1dc0be8 consolidate per-arch stack overflow debugging options
Original posting:

	http://lkml.kernel.org/r/20121214184202.F54094D9@kernel.stglabs.ibm.com

Several architectures have similar stack debugging config options.
They all pretty much do the same thing, some with slightly
differing help text.

This patch changes the architectures to instead enable a Kconfig
boolean, and then use that boolean in the generic Kconfig.debug
to present the actual menu option.  This removes a bunch of
duplication and adds consistency across arches.

Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Reviewed-by: H. Peter Anvin <hpa@zytor.com>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com> [for tile]
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-04 11:25:39 -07:00
Linus Torvalds 65b97fb730 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Pull powerpc updates from Ben Herrenschmidt:
 "This is the powerpc changes for the 3.11 merge window.  In addition to
  the usual bug fixes and small updates, the main highlights are:

   - Support for transparent huge pages by Aneesh Kumar for 64-bit
     server processors.  This allows the use of 16M pages as transparent
     huge pages on kernels compiled with a 64K base page size.

   - Base VFIO support for KVM on power by Alexey Kardashevskiy

   - Wiring up of our nvram to the pstore infrastructure, including
     putting compressed oopses in there by Aruna Balakrishnaiah

   - Move, rework and improve our "EEH" (basically PCI error handling
     and recovery) infrastructure.  It is no longer specific to pseries
     but is now usable by the new "powernv" platform as well (no
     hypervisor) by Gavin Shan.

   - I fixed some bugs in our math-emu instruction decoding and made it
     usable to emulate some optional FP instructions on processors with
     hard FP that lack them (such as fsqrt on Freescale embedded
     processors).

   - Support for Power8 "Event Based Branch" facility by Michael
     Ellerman.  This facility allows what is basically "userspace
     interrupts" for performance monitor events.

   - A bunch of Transactional Memory vs.  Signals bug fixes and HW
     breakpoint/watchpoint fixes by Michael Neuling.

  And more ...  I appologize in advance if I've failed to highlight
  something that somebody deemed worth it."

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (156 commits)
  pstore: Add hsize argument in write_buf call of pstore_ftrace_call
  powerpc/fsl: add MPIC timer wakeup support
  powerpc/mpic: create mpic subsystem object
  powerpc/mpic: add global timer support
  powerpc/mpic: add irq_set_wake support
  powerpc/85xx: enable coreint for all the 64bit boards
  powerpc/8xx: Erroneous double irq_eoi() on CPM IRQ in MPC8xx
  powerpc/fsl: Enable CONFIG_E1000E in mpc85xx_smp_defconfig
  powerpc/mpic: Add get_version API both for internal and external use
  powerpc: Handle both new style and old style reserve maps
  powerpc/hw_brk: Fix off by one error when validating DAWR region end
  powerpc/pseries: Support compression of oops text via pstore
  powerpc/pseries: Re-organise the oops compression code
  pstore: Pass header size in the pstore write callback
  powerpc/powernv: Fix iommu initialization again
  powerpc/pseries: Inform the hypervisor we are using EBB regs
  powerpc/perf: Add power8 EBB support
  powerpc/perf: Core EBB support for 64-bit book3s
  powerpc/perf: Drop MMCRA from thread_struct
  powerpc/perf: Don't enable if we have zero events
  ...
2013-07-04 10:29:23 -07:00
Linus Torvalds 7f0ef0267e Merge branch 'akpm' (updates from Andrew Morton)
Merge first patch-bomb from Andrew Morton:
 - various misc bits
 - I'm been patchmonkeying ocfs2 for a while, as Joel and Mark have been
   distracted.  There has been quite a bit of activity.
 - About half the MM queue
 - Some backlight bits
 - Various lib/ updates
 - checkpatch updates
 - zillions more little rtc patches
 - ptrace
 - signals
 - exec
 - procfs
 - rapidio
 - nbd
 - aoe
 - pps
 - memstick
 - tools/testing/selftests updates

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (445 commits)
  tools/testing/selftests: don't assume the x bit is set on scripts
  selftests: add .gitignore for kcmp
  selftests: fix clean target in kcmp Makefile
  selftests: add .gitignore for vm
  selftests: add hugetlbfstest
  self-test: fix make clean
  selftests: exit 1 on failure
  kernel/resource.c: remove the unneeded assignment in function __find_resource
  aio: fix wrong comment in aio_complete()
  drivers/w1/slaves/w1_ds2408.c: add magic sequence to disable P0 test mode
  drivers/memstick/host/r592.c: convert to module_pci_driver
  drivers/memstick/host/jmb38x_ms: convert to module_pci_driver
  pps-gpio: add device-tree binding and support
  drivers/pps/clients/pps-gpio.c: convert to module_platform_driver
  drivers/pps/clients/pps-gpio.c: convert to devm_* helpers
  drivers/parport/share.c: use kzalloc
  Documentation/accounting/getdelays.c: avoid strncpy in accounting tool
  aoe: update internal version number to v83
  aoe: update copyright date
  aoe: perform I/O completions in parallel
  ...
2013-07-03 17:12:13 -07:00
Linus Torvalds 862f001254 PCI changes for the v3.11 merge window:
PCI device hotplug
     - Add pci_alloc_dev() interface (Gu Zheng)
     - Add pci_bus_get()/put() for reference counting (Jiang Liu)
     - Fix SR-IOV reference count issues (Jiang Liu)
     - Remove unused acpi_pci_roots list (Jiang Liu)
 
   MSI
     - Conserve interrupt resources on x86 (Alexander Gordeev)
 
   AER
     - Force fatal severity when component has been reset (Betty Dall)
     - Reset link below Root Port as well as Downstream Port (Betty Dall)
     - Fix "Firmware first" flag setting (Bjorn Helgaas)
     - Don't parse HEST for non-PCIe devices (Bjorn Helgaas)
 
   ASPM
     - Warn when we can't disable ASPM as driver requests (Bjorn Helgaas)
 
   Miscellaneous
     - Add CircuitCo PCI IDs (Darren Hart)
     - Add AMD CZ SATA and SMBus PCI IDs (Shane Huang)
     - Work around Ivytown NTB BAR size issue (Jon Mason)
     - Detect invalid initial BAR values (Kevin Hao)
     - Add pcibios_release_device() (Sebastian Ott)
     - Fix powerpc & sparc PCI_UNKNOWN power state usage (Bjorn Helgaas)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJR0dhDAAoJEFmIoMA60/r8AUoP/RrKOXC2GGZCqUIKjUyxaCU+
 NXwaNhjfRuge5lZHE8fUAnYVveFTv0iNo8if/md866/pS4il3vxaMWRhZrBddXqe
 juyxPUGaOb5NmI2C+g5ebQ1xHhnOU6kWrgQ5kQk5GmJdm6BpiWDCFaalyioYj27v
 FoN/25IG+5EtJjP6kRdQGFZq+RYOqlBfQp4fdFmY5bDsQiCLREH6YWHeNSkH+t1I
 Eh84WESqGzgaZyCb9QKM2AcU/HMKLux4VXAYp9idVr3tH1j9b/klQI7xNW7sPnkY
 LIzKTcfF89iXkhxc7zrF0O/n5rC5Cp7LQpiFMV6yCT3w25yWpq9itOwqcZ/nfCv6
 fje8P1B2lwGrizkwKKLcosTzWkJewvfLkVye90WS3g0i3zlijF4pfEiw3a2ujA91
 MP9/JmX+ZZ5QeGyPuFmYJyMlInH4vtSdegl9jtaeuX4cOnuMP+Ouxnxc+mH2bOfl
 Z5/K1OSCYLfb27uWM7od2lgb+GFHLMP+RMy073h0ZMpDvM6EnZy5iu1zU9+yJO4S
 8/aRhBz4h+YEBinnXOJvHzMfu3wQQ7UvXZqEspgsug2Z5xHvxhMLhrJQgpVUSdsW
 Nrdm1dNdACV/cvt/lWzUE7SmUaOua/r/cVmViF2ryeRET7t65in+5NHXmXP8ET+r
 1WA7pbykfegC9uY84PaK
 =X3Lo
 -----END PGP SIGNATURE-----

Merge tag 'pci-v3.11-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI changes from Bjorn Helgaas:
 "PCI device hotplug
    - Add pci_alloc_dev() interface (Gu Zheng)
    - Add pci_bus_get()/put() for reference counting (Jiang Liu)
    - Fix SR-IOV reference count issues (Jiang Liu)
    - Remove unused acpi_pci_roots list (Jiang Liu)

  MSI
    - Conserve interrupt resources on x86 (Alexander Gordeev)

  AER
    - Force fatal severity when component has been reset (Betty Dall)
    - Reset link below Root Port as well as Downstream Port (Betty Dall)
    - Fix "Firmware first" flag setting (Bjorn Helgaas)
    - Don't parse HEST for non-PCIe devices (Bjorn Helgaas)

  ASPM
    - Warn when we can't disable ASPM as driver requests (Bjorn Helgaas)

  Miscellaneous
    - Add CircuitCo PCI IDs (Darren Hart)
    - Add AMD CZ SATA and SMBus PCI IDs (Shane Huang)
    - Work around Ivytown NTB BAR size issue (Jon Mason)
    - Detect invalid initial BAR values (Kevin Hao)
    - Add pcibios_release_device() (Sebastian Ott)
    - Fix powerpc & sparc PCI_UNKNOWN power state usage (Bjorn Helgaas)"

* tag 'pci-v3.11-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (51 commits)
  MAINTAINERS: Add ACPI folks for ACPI-related things under drivers/pci
  PCI: Add CircuitCo vendor ID and subsystem ID
  PCI: Use pdev->pm_cap instead of pci_find_capability(..,PCI_CAP_ID_PM)
  PCI: Return early on allocation failures to unindent mainline code
  PCI: Simplify IOV implementation and fix reference count races
  PCI: Drop redundant setting of bus->is_added in virtfn_add_bus()
  unicore32/PCI: Remove redundant call of pci_bus_add_devices()
  m68k/PCI: Remove redundant call of pci_bus_add_devices()
  PCI / ACPI / PM: Use correct power state strings in messages
  PCI: Fix comment typo for pcie_pme_remove()
  PCI: Rename pci_release_bus_bridge_dev() to pci_release_host_bridge_dev()
  PCI: Fix refcount issue in pci_create_root_bus() error recovery path
  ia64/PCI: Clean up pci_scan_root_bus() usage
  PCI/AER: Reset link for devices below Root Port or Downstream Port
  ACPI / APEI: Force fatal AER severity when component has been reset
  PCI/AER: Remove "extern" from function declarations
  PCI/AER: Move AER severity defines to aer.h
  PCI/AER: Set dev->__aer_firmware_first only for matching devices
  PCI/AER: Factor out HEST device type matching
  PCI/AER: Don't parse HEST table for non-PCIe devices
  ...
2013-07-03 16:31:35 -07:00
Zhang Yanfei 427fcd23b5 powerpc: Remove savemaxmem parameter setup
saved_max_pfn is used to know the amount of memory that the previous
kernel used.  And for powerpc, we set saved_max_pfn by passing the kernel
commandline parameter "savemaxmem=".

The only user of saved_max_pfn in powerpc is read_oldmem interface.  Since
we have removed read_oldmem, we don't need this parameter anymore.

Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Dave Hansen <dave@sr71.net>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Fleming <matt.fleming@intel.com>
Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-03 16:08:03 -07:00
Jiang Liu 602ddc70ec mm/PPC: prepare for killing free_all_bootmem_node()
Prepare for killing free_all_bootmem_node() by using free_all_bootmem().

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Alexander Graf <agraf@suse.de>
Cc: "Suzuki K. Poulose" <suzuki@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-03 16:07:39 -07:00
Jiang Liu 369a9d8523 mm/ppc: prepare for removing num_physpages and simplify mem_init()
Prepare for removing num_physpages and simplify mem_init().

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-03 16:07:37 -07:00
Jiang Liu 0c98853473 mm: concentrate modification of totalram_pages into the mm core
Concentrate code to modify totalram_pages into the mm core, so the arch
memory initialized code doesn't need to take care of it.  With these
changes applied, only following functions from mm core modify global
variable totalram_pages: free_bootmem_late(), free_all_bootmem(),
free_all_bootmem_node(), adjust_managed_page_count().

With this patch applied, it will be much more easier for us to keep
totalram_pages and zone->managed_pages in consistence.

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Acked-by: David Howells <dhowells@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: <sworddragon2@aol.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michel Lespinasse <walken@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-03 16:07:33 -07:00
Jiang Liu dbe67df4ba mm: enhance free_reserved_area() to support poisoning memory with zero
Address more review comments from last round of code review.
1) Enhance free_reserved_area() to support poisoning freed memory with
   pattern '0'. This could be used to get rid of poison_init_mem()
   on ARM64.
2) A previous patch has disabled memory poison for initmem on s390
   by mistake, so restore to the original behavior.
3) Remove redundant PAGE_ALIGN() when calling free_reserved_area().

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: <sworddragon2@aol.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michel Lespinasse <walken@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-03 16:07:32 -07:00
Jiang Liu 11199692d8 mm: change signature of free_reserved_area() to fix building warnings
Change signature of free_reserved_area() according to Russell King's
suggestion to fix following build warnings:

  arch/arm/mm/init.c: In function 'mem_init':
  arch/arm/mm/init.c:603:2: warning: passing argument 1 of 'free_reserved_area' makes integer from pointer without a cast [enabled by default]
    free_reserved_area(__va(PHYS_PFN_OFFSET), swapper_pg_dir, 0, NULL);
    ^
  In file included from include/linux/mman.h:4:0,
                   from arch/arm/mm/init.c:15:
  include/linux/mm.h:1301:22: note: expected 'long unsigned int' but argument is of type 'void *'
   extern unsigned long free_reserved_area(unsigned long start, unsigned long end,

   mm/page_alloc.c: In function 'free_reserved_area':
>> mm/page_alloc.c:5134:3: warning: passing argument 1 of 'virt_to_phys' makes pointer from integer without a cast [enabled by default]
   In file included from arch/mips/include/asm/page.h:49:0,
                    from include/linux/mmzone.h:20,
                    from include/linux/gfp.h:4,
                    from include/linux/mm.h:8,
                    from mm/page_alloc.c:18:
   arch/mips/include/asm/io.h:119:29: note: expected 'const volatile void *' but argument is of type 'long unsigned int'
   mm/page_alloc.c: In function 'free_area_init_nodes':
   mm/page_alloc.c:5030:34: warning: array subscript is below array bounds [-Warray-bounds]

Also address some minor code review comments.

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Reported-by: Arnd Bergmann <arnd@arndb.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: <sworddragon2@aol.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michel Lespinasse <walken@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-03 16:07:32 -07:00
Wanpeng Li 2415cf12e0 mm/hugetlb: use already existing interface huge_page_shift
Use the already existing interface huge_page_shift instead of h->order +
PAGE_SHIFT.

Signed-off-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-03 16:07:32 -07:00
David S. Miller 0c1072ae02 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/freescale/fec_main.c
	drivers/net/ethernet/renesas/sh_eth.c
	net/ipv4/gre.c

The GRE conflict is between a bug fix (kfree_skb --> kfree_skb_list)
and the splitting of the gre.c code into seperate files.

The FEC conflict was two sets of changes adding ethtool support code
in an "!CONFIG_M5272" CPP protected block.

Finally the sh_eth.c conflict was between one commit add bits set
in the .eesr_err_check mask whilst another commit removed the
.tx_error_check member and assignments.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-03 14:55:13 -07:00
Linus Torvalds f991fae5c6 Power management and ACPI updates for 3.11-rc1
- Hotplug changes allowing device hot-removal operations to fail
   gracefully (instead of crashing the kernel) if they cannot be
   carried out completely.  From Rafael J Wysocki and Toshi Kani.
 
 - Freezer update from Colin Cross and Mandeep Singh Baines targeted
   at making the freezing of tasks a bit less heavy weight operation.
 
 - cpufreq resume fix from Srivatsa S Bhat for a regression introduced
   during the 3.10 cycle causing some cpufreq sysfs attributes to
   return wrong values to user space after resume.
 
 - New freqdomain_cpus sysfs attribute for the acpi-cpufreq driver to
   provide information previously available via related_cpus from
   Lan Tianyu.
 
 - cpufreq fixes and cleanups from Viresh Kumar, Jacob Shin,
   Heiko Stübner, Xiaoguang Chen, Ezequiel Garcia, Arnd Bergmann, and
   Tang Yuantian.
 
 - Fix for an ACPICA regression causing suspend/resume issues to
   appear on some systems introduced during the 3.4 development cycle
   from Lv Zheng.
 
 - ACPICA fixes and cleanups from Bob Moore, Tomasz Nowicki, Lv Zheng,
   Chao Guan, and Zhang Rui.
 
 - New cupidle driver for Xilinx Zynq processors from Michal Simek.
 
 - cpuidle fixes and cleanups from Daniel Lezcano.
 
 - Changes to make suspend/resume work correctly in Xen guests from
   Konrad Rzeszutek Wilk.
 
 - ACPI device power management fixes and cleanups from Fengguang Wu
   and Rafael J Wysocki.
 
 - ACPI documentation updates from Lv Zheng, Aaron Lu and Hanjun Guo.
 
 - Fix for the IA-64 issue that was the reason for reverting commit
   9f29ab1 and updates of the ACPI scan code from Rafael J Wysocki.
 
 - Mechanism for adding CMOS RTC address space handlers from Lan Tianyu
   (to allow some EC-related breakage to be fixed on some systems).
 
 - Spec-compliant implementation of acpi_os_get_timer() from
   Mika Westerberg.
 
 - Modification of do_acpi_find_child() to execute _STA in order to
   to avoid situations in which a pointer to a disabled device object
   is returned instead of an enabled one with the same _ADR value.
   From Jeff Wu.
 
 - Intel BayTrail PCH (Platform Controller Hub) support for the ACPI
   Intel Low-Power Subsystems (LPSS) driver and modificaions of that
   driver to work around a couple of known BIOS issues from
   Mika Westerberg and Heikki Krogerus.
 
 - EC driver fix from Vasiliy Kulikov to make it use get_user() and
   put_user() instead of dereferencing user space pointers blindly.
 
 - Assorted ACPI code cleanups from Bjorn Helgaas, Nicholas Mazzuca and
   Toshi Kani.
 
 - Modification of the "runtime idle" helper routine to take the return
   values of the callbacks executed by it into account and to call
   rpm_suspend() if they return 0, which allows some code bloat
   reduction to be done, from Rafael J Wysocki and Alan Stern.
 
 - New trace points for PM QoS from Sahara <keun-o.park@windriver.com>.
 
 - PM QoS documentation update from Lan Tianyu.
 
 - Assorted core PM code cleanups and changes from Bernie Thompson,
   Bjorn Helgaas, Julius Werner, and Shuah Khan.
 
 - New devfreq driver for the Exynos5-bus device from Abhilash Kesavan.
 
 - Minor devfreq cleanups, fixes and MAINTAINERS update from
   MyungJoo Ham, Abhilash Kesavan, Paul Bolle, Rajagopal Venkat, and
   Wei Yongjun.
 
 - OMAP Adaptive Voltage Scaling (AVS) SmartReflex voltage control
   driver updates from Andrii Tseglytskyi and Nishanth Menon.
 
 /
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iQIcBAABAgAGBQJR0ZNOAAoJEKhOf7ml8uNsDLYP/0EU4rmvw0TWTITfp6RS1KDE
 9GwBn96ZR4Q5bJd9gBCTPSqhHOYMqxWEUp99sn/M2wehG1pk/jw5LO56+2IhM3UZ
 g1HDcJ7te2nVT/iXsKiAGTVhU9Rk0aYwoVSknwk27qpIBGxW9w/s5tLX8pY3Q3Zq
 wL/7aTPjyL+PFFFEaxgH7qLqsl3DhbtYW5AriUBTkXout/tJ4eO1b7MNBncLDh8X
 VQ/0DNCKE95VEJfkO4rk9RKUyVp9GDn0i+HXCD/FS4IA5oYzePdVdNDmXf7g+swe
 CGlTZq8pB+oBpDiHl4lxzbNrKQjRNbGnDUkoRcWqn0nAw56xK+vmYnWJhW99gQ/I
 fKnvxeLca5po1aiqmC4VSJxZIatFZqLrZAI4dzoCLWY+bGeTnCKmj0/F8ytFnZA2
 8IuLLs7/dFOaHXV/pKmpg6FAlFa9CPxoqRFoyqb4M0GjEarADyalXUWsPtG+6xCp
 R/p0CISpwk+guKZR/qPhL7M654S7SHrPwd2DPF0KgGsvk+G2GhoB8EzvD8BVp98Z
 9siCGCdgKQfJQVI6R0k9aFmn/4gRQIAgyPhkhv9tqULUUkiaXki+/t8kPfnb8O/d
 zep+CA57E2G8MYLkDJfpFeKS7GpPD6TIdgFdGmOUC0Y6sl9iTdiw4yTx8O2JM37z
 rHBZfYGkJBrbGRu+Q1gs
 =VBBq
 -----END PGP SIGNATURE-----

Merge tag 'pm+acpi-3.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management and ACPI updates from Rafael Wysocki:
 "This time the total number of ACPI commits is slightly greater than
  the number of cpufreq commits, but Viresh Kumar (who works on cpufreq)
  remains the most active patch submitter.

  To me, the most significant change is the addition of offline/online
  device operations to the driver core (with the Greg's blessing) and
  the related modifications of the ACPI core hotplug code.  Next are the
  freezer updates from Colin Cross that should make the freezing of
  tasks a bit less heavy weight.

  We also have a couple of regression fixes, a number of fixes for
  issues that have not been identified as regressions, two new drivers
  and a bunch of cleanups all over.

  Highlights:

   - Hotplug changes to support graceful hot-removal failures.

     It sometimes is necessary to fail device hot-removal operations
     gracefully if they cannot be carried out completely.  For example,
     if memory from a memory module being hot-removed has been allocated
     for the kernel's own use and cannot be moved elsewhere, it's
     desirable to fail the hot-removal operation in a graceful way
     rather than to crash the kernel, but currenty a success or a kernel
     crash are the only possible outcomes of an attempted memory
     hot-removal.  Needless to say, that is not a very attractive
     alternative and it had to be addressed.

     However, in order to make it work for memory, I first had to make
     it work for CPUs and for this purpose I needed to modify the ACPI
     processor driver.  It's been split into two parts, a resident one
     handling the low-level initialization/cleanup and a modular one
     playing the actual driver's role (but it binds to the CPU system
     device objects rather than to the ACPI device objects representing
     processors).  That's been sort of like a live brain surgery on a
     patient who's riding a bike.

     So this is a little scary, but since we found and fixed a couple of
     regressions it caused to happen during the early linux-next testing
     (a month ago), nobody has complained.

     As a bonus we remove some duplicated ACPI hotplug code, because the
     ACPI-based CPU hotplug is now going to use the common ACPI hotplug
     code.

   - Lighter weight freezing of tasks.

     These changes from Colin Cross and Mandeep Singh Baines are
     targeted at making the freezing of tasks a bit less heavy weight
     operation.  They reduce the number of tasks woken up every time
     during the freezing, by using the observation that the freezer
     simply doesn't need to wake up some of them and wait for them all
     to call refrigerator().  The time needed for the freezer to decide
     to report a failure is reduced too.

     Also reintroduced is the check causing a lockdep warining to
     trigger when try_to_freeze() is called with locks held (which is
     generally unsafe and shouldn't happen).

   - cpufreq updates

     First off, a commit from Srivatsa S Bhat fixes a resume regression
     introduced during the 3.10 cycle causing some cpufreq sysfs
     attributes to return wrong values to user space after resume.  The
     fix is kind of fresh, but also it's pretty obvious once Srivatsa
     has identified the root cause.

     Second, we have a new freqdomain_cpus sysfs attribute for the
     acpi-cpufreq driver to provide information previously available via
     related_cpus.  From Lan Tianyu.

     Finally, we fix a number of issues, mostly related to the
     CPUFREQ_POSTCHANGE notifier and cpufreq Kconfig options and clean
     up some code.  The majority of changes from Viresh Kumar with bits
     from Jacob Shin, Heiko Stübner, Xiaoguang Chen, Ezequiel Garcia,
     Arnd Bergmann, and Tang Yuantian.

   - ACPICA update

     A usual bunch of updates from the ACPICA upstream.

     During the 3.4 cycle we introduced support for ACPI 5 extended
     sleep registers, but they are only supposed to be used if the
     HW-reduced mode bit is set in the FADT flags and the code attempted
     to use them without checking that bit.  That caused suspend/resume
     regressions to happen on some systems.  Fix from Lv Zheng causes
     those registers to be used only if the HW-reduced mode bit is set.

     Apart from this some other ACPICA bugs are fixed and code cleanups
     are made by Bob Moore, Tomasz Nowicki, Lv Zheng, Chao Guan, and
     Zhang Rui.

   - cpuidle updates

     New driver for Xilinx Zynq processors is added by Michal Simek.

     Multidriver support simplification, addition of some missing
     kerneldoc comments and Kconfig-related fixes come from Daniel
     Lezcano.

   - ACPI power management updates

     Changes to make suspend/resume work correctly in Xen guests from
     Konrad Rzeszutek Wilk, sparse warning fix from Fengguang Wu and
     cleanups and fixes of the ACPI device power state selection
     routine.

   - ACPI documentation updates

     Some previously missing pieces of ACPI documentation are added by
     Lv Zheng and Aaron Lu (hopefully, that will help people to
     uderstand how the ACPI subsystem works) and one outdated doc is
     updated by Hanjun Guo.

   - Assorted ACPI updates

     We finally nailed down the IA-64 issue that was the reason for
     reverting commit 9f29ab11dd ("ACPI / scan: do not match drivers
     against objects having scan handlers"), so we can fix it and move
     the ACPI scan handler check added to the ACPI video driver back to
     the core.

     A mechanism for adding CMOS RTC address space handlers is
     introduced by Lan Tianyu to allow some EC-related breakage to be
     fixed on some systems.

     A spec-compliant implementation of acpi_os_get_timer() is added by
     Mika Westerberg.

     The evaluation of _STA is added to do_acpi_find_child() to avoid
     situations in which a pointer to a disabled device object is
     returned instead of an enabled one with the same _ADR value.  From
     Jeff Wu.

     Intel BayTrail PCH (Platform Controller Hub) support is added to
     the ACPI driver for Intel Low-Power Subsystems (LPSS) and that
     driver is modified to work around a couple of known BIOS issues.
     Changes from Mika Westerberg and Heikki Krogerus.

     The EC driver is fixed by Vasiliy Kulikov to use get_user() and
     put_user() instead of dereferencing user space pointers blindly.

     Code cleanups are made by Bjorn Helgaas, Nicholas Mazzuca and Toshi
     Kani.

   - Assorted power management updates

     The "runtime idle" helper routine is changed to take the return
     values of the callbacks executed by it into account and to call
     rpm_suspend() if they return 0, which allows us to reduce the
     overall code bloat a bit (by dropping some code that's not
     necessary any more after that modification).

     The runtime PM documentation is updated by Alan Stern (to reflect
     the "runtime idle" behavior change).

     New trace points for PM QoS are added by Sahara
     (<keun-o.park@windriver.com>).

     PM QoS documentation is updated by Lan Tianyu.

     Code cleanups are made and minor issues are addressed by Bernie
     Thompson, Bjorn Helgaas, Julius Werner, and Shuah Khan.

   - devfreq updates

     New driver for the Exynos5-bus device from Abhilash Kesavan.

     Minor cleanups, fixes and MAINTAINERS update from MyungJoo Ham,
     Abhilash Kesavan, Paul Bolle, Rajagopal Venkat, and Wei Yongjun.

   - OMAP power management updates

     Adaptive Voltage Scaling (AVS) SmartReflex voltage control driver
     updates from Andrii Tseglytskyi and Nishanth Menon."

* tag 'pm+acpi-3.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (162 commits)
  cpufreq: Fix cpufreq regression after suspend/resume
  ACPI / PM: Fix possible NULL pointer deref in acpi_pm_device_sleep_state()
  PM / Sleep: Warn about system time after resume with pm_trace
  cpufreq: don't leave stale policy pointer in cdbs->cur_policy
  acpi-cpufreq: Add new sysfs attribute freqdomain_cpus
  cpufreq: make sure frequency transitions are serialized
  ACPI: implement acpi_os_get_timer() according the spec
  ACPI / EC: Add HP Folio 13 to ec_dmi_table in order to skip DSDT scan
  ACPI: Add CMOS RTC Operation Region handler support
  ACPI / processor: Drop unused variable from processor_perflib.c
  cpufreq: tegra: call CPUFREQ_POSTCHANGE notfier in error cases
  cpufreq: s3c64xx: call CPUFREQ_POSTCHANGE notfier in error cases
  cpufreq: omap: call CPUFREQ_POSTCHANGE notfier in error cases
  cpufreq: imx6q: call CPUFREQ_POSTCHANGE notfier in error cases
  cpufreq: exynos: call CPUFREQ_POSTCHANGE notfier in error cases
  cpufreq: dbx500: call CPUFREQ_POSTCHANGE notfier in error cases
  cpufreq: davinci: call CPUFREQ_POSTCHANGE notfier in error cases
  cpufreq: arm-big-little: call CPUFREQ_POSTCHANGE notfier in error cases
  cpufreq: powernow-k8: call CPUFREQ_POSTCHANGE notfier in error cases
  cpufreq: pcc: call CPUFREQ_POSTCHANGE notfier in error cases
  ...
2013-07-03 14:35:40 -07:00
Linus Torvalds fe489bf450 KVM fixes for 3.11
On the x86 side, there are some optimizations and documentation updates.
 The big ARM/KVM change for 3.11, support for AArch64, will come through
 Catalin Marinas's tree.  s390 and PPC have misc cleanups and bugfixes.
 
 There is a conflict due to "s390/pgtable: fix ipte notify bit" having
 entered 3.10 through Martin Schwidefsky's s390 tree.  This pull request
 has additional changes on top, so this tree's version is the correct one.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.13 (GNU/Linux)
 
 iQIcBAABAgAGBQJR0oU6AAoJEBvWZb6bTYbynnsP/RSUrrHrA8Wu1tqVfAKu+1y5
 6OIihqZ9x11/YMaNofAfv86jqxFu0/j7CzMGphNdjzujqKI+Q1tGe7oiVCmKzoG+
 UvSctWsz0lpllgBtnnrm5tcfmG6rrddhLtpA7m320+xCVx8KV5P4VfyHZEU+Ho8h
 ziPmb2mAQ65gBNX6nLHEJ3ITTgad6gt4NNbrKIYpyXuWZQJypzaRqT/vpc4md+Ed
 dCebMXsL1xgyb98EcnOdrWH1wV30MfucR7IpObOhXnnMKeeltqAQPvaOlKzZh4dK
 +QfxJfdRZVS0cepcxzx1Q2X3dgjoKQsHq1nlIyz3qu1vhtfaqBlixLZk0SguZ/R9
 1S1YqucZiLRO57RD4q0Ak5oxwobu18ZoqJZ6nledNdWwDe8bz/W2wGAeVty19ky0
 qstBdM9jnwXrc0qrVgZp3+s5dsx3NAm/KKZBoq4sXiDLd/yBzdEdWIVkIrU3X9wU
 3X26wOmBxtsB7so/JR7ciTsQHelmLicnVeXohAEP9CjIJffB81xVXnXs0P0SYuiQ
 RzbSCwjPzET4JBOaHWT0Dhv0DTS/EaI97KzlN32US3Bn3WiLlS1oDCoPFoaLqd2K
 LxQMsXS8anAWxFvexfSuUpbJGPnKSidSQoQmJeMGBa9QhmZCht3IL16/Fb641ToN
 xBohzi49L9FDbpOnTYfz
 =1zpG
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
 "On the x86 side, there are some optimizations and documentation
  updates.  The big ARM/KVM change for 3.11, support for AArch64, will
  come through Catalin Marinas's tree.  s390 and PPC have misc cleanups
  and bugfixes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (87 commits)
  KVM: PPC: Ignore PIR writes
  KVM: PPC: Book3S PR: Invalidate SLB entries properly
  KVM: PPC: Book3S PR: Allow guest to use 1TB segments
  KVM: PPC: Book3S PR: Don't keep scanning HPTEG after we find a match
  KVM: PPC: Book3S PR: Fix invalidation of SLB entry 0 on guest entry
  KVM: PPC: Book3S PR: Fix proto-VSID calculations
  KVM: PPC: Guard doorbell exception with CONFIG_PPC_DOORBELL
  KVM: Fix RTC interrupt coalescing tracking
  kvm: Add a tracepoint write_tsc_offset
  KVM: MMU: Inform users of mmio generation wraparound
  KVM: MMU: document fast invalidate all mmio sptes
  KVM: MMU: document fast invalidate all pages
  KVM: MMU: document fast page fault
  KVM: MMU: document mmio page fault
  KVM: MMU: document write_flooding_count
  KVM: MMU: document clear_spte_count
  KVM: MMU: drop kvm_mmu_zap_mmio_sptes
  KVM: MMU: init kvm generation close to mmio wrap-around value
  KVM: MMU: add tracepoint for check_mmio_spte
  KVM: MMU: fast invalidate all mmio sptes
  ...
2013-07-03 13:21:40 -07:00
Linus Torvalds 92295f632c The common clock framework changes for 3.11 include new clock drivers
across several different platforms and architectures, fixes to existing
 drivers, a MAINTAINERS file fix and improvements to the basic clock
 types that allow them to be of use to more platforms than before. Only a
 few fixes to the core framework are included with most all of the
 changes landing in the various clock drivers themselves.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJR0hFpAAoJEDqPOy9afJhJcr4P/iA83uEGXIncUzFsbCIsKNGl
 g8yoasz1mpcAng31kK54lasNpFvUYIEBLKWkJBN56h2J5gJ3OKpi7basnQGRykjh
 +rCg2dV4gKDBQzypdeg5kPxf0YLJ8KrwyngcRm5KA3oxKYtmLHt/M2jmDd3DXP0h
 dKo2bkR52eQ7E+ij6/MWgUObxWiFnH0C5Lf+L0tGHsRlRvZTV6NMymX2v0sIOCw9
 hpsrL+7Hqf/+PBfWBljH0IAiI72I5dGEaJoesdTRWz8qut+duwF+OGlzMZnkclMf
 XU444HnvCq3bWGP5NYf9ASO4RRA2IGEoN/eDeiVX5hvkAjeUNHVNhkcUk28Xn9vI
 ZxCxwBXki4wGayiQm4TTgORozmp4i+NwbOuG+s3For1qQU43aA7ppwysFHGXngQa
 bxXT2pNECLX8WR91vkmEdqeQVm6s++R3zltHFUry1VRjTIs5y18Dt9kR6T16NKKQ
 ztoegeQQ8SI7HOLd+tR2v19VXH4zoV3NYAdkCKJX+VXFg+4mgJERmXoU350V3d8D
 wxde7WU4KLO4+1Fngror8wocP8jRRHfhx5jbBVVJ5n6BnF/Mos24g7KI1VPis3Wr
 XZQibxLW0kZ7/hTHykZCff8D4aNqR4fT/UCKyy2Xw1aR0zkD9ZAQq5oWhL99lsnt
 cCe9GBP0Lum9Zq9jjMwL
 =SwMB
 -----END PGP SIGNATURE-----

Merge tag 'clk-for-linus-3.11' of git://git.linaro.org/people/mturquette/linux

Pull clock framework updates from Mike Turquette:
 "The common clock framework changes for 3.11 include new clock drivers
  across several different platforms and architectures, fixes to
  existing drivers, a MAINTAINERS file fix and improvements to the basic
  clock types that allow them to be of use to more platforms than before.

  Only a few fixes to the core framework are included with most all of
  the changes landing in the various clock drivers themselves."

* tag 'clk-for-linus-3.11' of git://git.linaro.org/people/mturquette/linux: (55 commits)
  clk: tegra: fix ifdef for tegra_periph_reset_assert inline
  clk: tegra: provide tegra_periph_reset_assert alternative
  clk: exynos4: Fix clock aliases for cpufreq related clocks
  clk: samsung: Add MUX_FA macro to pass flag and alias
  clk: add support for Rockchip gate clocks
  clk: vexpress: Make the clock drivers directly available for arm64
  clk: vexpress: Use full node name to identify individual clocks
  clk: tegra: T114: add DFLL DVCO reset control
  clk: tegra: T114: add DFLL source clocks
  clk: tegra: T114: add FCPU clock shaper programming, needed by the DFLL
  clk: gate: add CLK_GATE_HIWORD_MASK
  clk: divider: add CLK_DIVIDER_HIWORD_MASK flag
  clk: mux: add CLK_MUX_HIWORD_MASK
  clk: Always notify whole subtree when reparenting
  MAINTAINERS: make drivers/clk entry match subdirs
  clk: honor CLK_GET_RATE_NOCACHE in clk_set_rate
  clk: use clk_get_rate() for debugfs
  clk: tegra: Use override bits when needed
  clk: tegra: override bits for Tegra30 PLLM
  clk: tegra: override bits for Tegra114 PLLM
  ...
2013-07-03 11:54:50 -07:00
Linus Torvalds 790eac5640 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull second set of VFS changes from Al Viro:
 "Assorted f_pos race fixes, making do_splice_direct() safe to call with
  i_mutex on parent, O_TMPFILE support, Jeff's locks.c series,
  ->d_hash/->d_compare calling conventions changes from Linus, misc
  stuff all over the place."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
  Document ->tmpfile()
  ext4: ->tmpfile() support
  vfs: export lseek_execute() to modules
  lseek_execute() doesn't need an inode passed to it
  block_dev: switch to fixed_size_llseek()
  cpqphp_sysfs: switch to fixed_size_llseek()
  tile-srom: switch to fixed_size_llseek()
  proc_powerpc: switch to fixed_size_llseek()
  ubi/cdev: switch to fixed_size_llseek()
  pci/proc: switch to fixed_size_llseek()
  isapnp: switch to fixed_size_llseek()
  lpfc: switch to fixed_size_llseek()
  locks: give the blocked_hash its own spinlock
  locks: add a new "lm_owner_key" lock operation
  locks: turn the blocked_list into a hashtable
  locks: convert fl_link to a hlist_node
  locks: avoid taking global lock if possible when waking up blocked waiters
  locks: protect most of the file_lock handling with i_lock
  locks: encapsulate the fl_link list handling
  locks: make "added" in __posix_lock_file a bool
  ...
2013-07-03 09:10:19 -07:00
Linus Torvalds e13053f506 Merge branch 'sched-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull voluntary preemption fixes from Ingo Molnar:
 "This tree contains a speedup which is achieved through better
  might_sleep()/might_fault() preemption point annotations for uaccess
  functions, by Michael S Tsirkin:

  1. The only reason uaccess routines might sleep is if they fault.
     Make this explicit for all architectures.

  2. A voluntary preemption point in uaccess functions means compiler
     can't inline them efficiently, this breaks assumptions that they
     are very fast and small that e.g.  net code seems to make.  Remove
     this preemption point so behaviour matches with what callers
     assume.

  3. Accesses (e.g through socket ops) to kernel memory with KERNEL_DS
     like net/sunrpc does will never sleep.  Remove an unconditinal
     might_sleep() in the might_fault() inline in kernel.h (used when
     PROVE_LOCKING is not set).

  4. Accesses with pagefault_disable() return EFAULT but won't cause
     caller to sleep.  Check for that and thus avoid might_sleep() when
     PROVE_LOCKING is set.

  These changes offer a nice speedup for CONFIG_PREEMPT_VOLUNTARY=y
  kernels, here's a network bandwidth measurement between a virtual
  machine and the host:

   before:
        incoming: 7122.77   Mb/s
        outgoing: 8480.37   Mb/s

   after:
        incoming: 8619.24   Mb/s   [ +21.0% ]
        outgoing: 9455.42   Mb/s   [ +11.5% ]

  I kept these changes in a separate tree, separate from scheduler
  changes, because it's a mixed MM and scheduler topic"

* 'sched-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  mm, sched: Allow uaccess in atomic with pagefault_disable()
  mm, sched: Drop voluntary schedule from might_fault()
  x86: uaccess s/might_sleep/might_fault/
  tile: uaccess s/might_sleep/might_fault/
  powerpc: uaccess s/might_sleep/might_fault/
  mn10300: uaccess s/might_sleep/might_fault/
  microblaze: uaccess s/might_sleep/might_fault/
  m32r: uaccess s/might_sleep/might_fault/
  frv: uaccess s/might_sleep/might_fault/
  arm64: uaccess s/might_sleep/might_fault/
  asm-generic: uaccess s/might_sleep/might_fault/
2013-07-02 16:19:24 -07:00
Linus Torvalds 2d722f6d56 Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
 "The main changes:

   - load-calculation cleanups and improvements, by Alex Shi
   - various nohz related tidying up of statisics, by Frederic
     Weisbecker
   - factor out /proc functions to kernel/sched/proc.c, by Paul
     Gortmaker
   - simplify the RT policy scheduler, by Kirill Tkhai
   - various fixes and cleanups"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (42 commits)
  sched/debug: Remove CONFIG_FAIR_GROUP_SCHED mask
  sched/debug: Fix formatting of /proc/<PID>/sched
  sched: Fix typo in struct sched_avg member description
  sched/fair: Fix typo describing flags in enqueue_entity
  sched/debug: Add load-tracking statistics to task
  sched: Change get_rq_runnable_load() to static and inline
  sched/tg: Remove tg.load_weight
  sched/cfs_rq: Change atomic64_t removed_load to atomic_long_t
  sched/tg: Use 'unsigned long' for load variable in task group
  sched: Change cfs_rq load avg to unsigned long
  sched: Consider runnable load average in move_tasks()
  sched: Compute runnable load avg in cpu_load and cpu_avg_load_per_task
  sched: Update cpu load after task_tick
  sched: Fix sleep time double accounting in enqueue entity
  sched: Set an initial value of runnable avg for new forked task
  sched: Move a few runnable tg variables into CONFIG_SMP
  Revert "sched: Introduce temporary FAIR_GROUP_SCHED dependency for load-tracking"
  sched: Don't mix use of typedef ctl_table and struct ctl_table
  sched: Remove WARN_ON(!sd) from init_sched_groups_power()
  sched: Fix memory leakage in build_sched_groups()
  ...
2013-07-02 16:17:25 -07:00
Linus Torvalds f0bb4c0ab0 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
 "Kernel improvements:

   - watchdog driver improvements by Li Zefan
   - Power7 CPI stack events related improvements by Sukadev Bhattiprolu
   - event multiplexing via hrtimers and other improvements by Stephane
     Eranian
   - kernel stack use optimization by Andrew Hunter
   - AMD IOMMU uncore PMU support by Suravee Suthikulpanit
   - NMI handling rate-limits by Dave Hansen
   - various hw_breakpoint fixes by Oleg Nesterov
   - hw_breakpoint overflow period sampling and related signal handling
     fixes by Jiri Olsa
   - Intel Haswell PMU support by Andi Kleen

  Tooling improvements:

   - Reset SIGTERM handler in workload child process, fix from David
     Ahern.
   - Makefile reorganization, prep work for Kconfig patches, from Jiri
     Olsa.
   - Add automated make test suite, from Jiri Olsa.
   - Add --percent-limit option to 'top' and 'report', from Namhyung
     Kim.
   - Sorting improvements, from Namhyung Kim.
   - Expand definition of sysfs format attribute, from Michael Ellerman.

  Tooling fixes:

   - 'perf tests' fixes from Jiri Olsa.
   - Make Power7 CPI stack events available in sysfs, from Sukadev
     Bhattiprolu.
   - Handle death by SIGTERM in 'perf record', fix from David Ahern.
   - Fix printing of perf_event_paranoid message, from David Ahern.
   - Handle realloc failures in 'perf kvm', from David Ahern.
   - Fix divide by 0 in variance, from David Ahern.
   - Save parent pid in thread struct, from David Ahern.
   - Handle JITed code in shared memory, from Andi Kleen.
   - Fixes for 'perf diff', from Jiri Olsa.
   - Remove some unused struct members, from Jiri Olsa.
   - Add missing liblk.a dependency for python/perf.so, fix from Jiri
     Olsa.
   - Respect CROSS_COMPILE in liblk.a, from Rabin Vincent.
   - No need to do locking when adding hists in perf report, only 'top'
     needs that, from Namhyung Kim.
   - Fix alignment of symbol column in in the hists browser (top,
     report) when -v is given, from NAmhyung Kim.
   - Fix 'perf top' -E option behavior, from Namhyung Kim.
   - Fix bug in isupper() and islower(), from Sukadev Bhattiprolu.
   - Fix compile errors in bp_signal 'perf test', from Sukadev
     Bhattiprolu.

  ... and more things"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (102 commits)
  perf/x86: Disable PEBS-LL in intel_pmu_pebs_disable()
  perf/x86: Fix shared register mutual exclusion enforcement
  perf/x86/intel: Support full width counting
  x86: Add NMI duration tracepoints
  perf: Drop sample rate when sampling is too slow
  x86: Warn when NMI handlers take large amounts of time
  hw_breakpoint: Introduce "struct bp_cpuinfo"
  hw_breakpoint: Simplify *register_wide_hw_breakpoint()
  hw_breakpoint: Introduce cpumask_of_bp()
  hw_breakpoint: Simplify the "weight" usage in toggle_bp_slot() paths
  hw_breakpoint: Simplify list/idx mess in toggle_bp_slot() paths
  perf/x86/intel: Add mem-loads/stores support for Haswell
  perf/x86/intel: Support Haswell/v4 LBR format
  perf/x86/intel: Move NMI clearing to end of PMI handler
  perf/x86/intel: Add Haswell PEBS support
  perf/x86/intel: Add simple Haswell PMU support
  perf/x86/intel: Add Haswell PEBS record support
  perf/x86/intel: Fix sparse warning
  perf/x86/amd: AMD IOMMU Performance Counter PERF uncore PMU implementation
  perf/x86/amd: Add IOMMU Performance Counter resource management
  ...
2013-07-02 16:15:23 -07:00
Linus Torvalds ab3d681e9d Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU updates from Ingo Molnar:
 "The major changes:

  - Simplify RCU's grace-period and callback processing based on the new
    numbering for callbacks.

  - Removal of TINY_PREEMPT_RCU in favor of TREE_PREEMPT_RCU for
    single-CPU low-latency systems.

  - SRCU-related changes and fixes.

  - Miscellaneous fixes, including converting a few remaining printk()
    calls to pr_*().

  - Documentation updates"

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (32 commits)
  rcu: Shrink TINY_RCU by reworking CPU-stall ifdefs
  rcu: Shrink TINY_RCU by moving exit_rcu()
  rcu: Remove TINY_PREEMPT_RCU tracing documentation
  rcu: Consolidate rcutiny_plugin.h ifdefs
  rcu: Remove rcu_preempt_note_context_switch()
  rcu: Remove the CONFIG_TINY_RCU ifdefs in rcutiny.h
  rcu: Remove check_cpu_stall_preempt()
  rcu: Simplify RCU_TINY RCU callback invocation
  rcu: Remove rcu_preempt_process_callbacks()
  rcu: Remove rcu_preempt_remove_callbacks()
  rcu: Remove rcu_preempt_check_callbacks()
  rcu: Remove show_tiny_preempt_stats()
  rcu: Remove TINY_PREEMPT_RCU
  powerpc,kvm: fix imbalance srcu_read_[un]lock()
  rcu: Remove srcu_read_lock_raw() and srcu_read_unlock_raw().
  rcu: Apply Dave Jones's NOCB Kconfig help feedback
  rcu: Merge adjacent identical ifdefs
  rcu: Drive quiescent-state-forcing delay from HZ
  rcu: Remove "Experimental" flags
  kthread: Add kworker kthreads to OS-jitter documentation
  ...
2013-07-02 16:13:29 -07:00
Linus Torvalds 0c46d68d19 Merge branch 'core-mutexes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull WW mutex support from Ingo Molnar:
 "This tree adds support for wound/wait style locks, which the graphics
  guys would like to make use of in the TTM graphics subsystem.

  Wound/wait mutexes are used when other multiple lock acquisitions of a
  similar type can be done in an arbitrary order.  The deadlock handling
  used here is called wait/wound in the RDBMS literature: The older
  tasks waits until it can acquire the contended lock.  The younger
  tasks needs to back off and drop all the locks it is currently
  holding, ie the younger task is wounded.

  See this LWN.net description of W/W mutexes:

     https://lwn.net/Articles/548909/

  The comments there outline specific usecases for this facility (which
  have already been implemented for the DRM tree).

  Also see Documentation/ww-mutex-design.txt for more details"

* 'core-mutexes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking-selftests: Handle unexpected failures more strictly
  mutex: Add more w/w tests to test EDEADLK path handling
  mutex: Add more tests to lib/locking-selftest.c
  mutex: Add w/w tests to lib/locking-selftest.c
  mutex: Add w/w mutex slowpath debugging
  mutex: Add support for wound/wait style locks
  arch: Make __mutex_fastpath_lock_retval return whether fastpath succeeded or not
2013-07-02 16:09:13 -07:00
Linus Torvalds fc76a258d4 Driver core patches for 3.11-rc1
Here's the big driver core merge for 3.11-rc1
 
 Lots of little things, and larger firmware subsystem updates, all
 described in the shortlog.  Nice thing here is that we finally get rid
 of CONFIG_HOTPLUG, after 10+ years, thanks to Stephen Rohtwell (it had
 been always on for a number of kernel releases, now it's just removed.)
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iEYEABECAAYFAlHRsGMACgkQMUfUDdst+ylIIACfW8lLxOPVK+iYG699TWEBAkp0
 LFEAnjlpAMJ1JnoZCuWDZObNCev93zGB
 =020+
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-3.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
 "Here's the big driver core merge for 3.11-rc1

  Lots of little things, and larger firmware subsystem updates, all
  described in the shortlog.  Nice thing here is that we finally get rid
  of CONFIG_HOTPLUG, after 10+ years, thanks to Stephen Rohtwell (it had
  been always on for a number of kernel releases, now it's just
  removed)"

* tag 'driver-core-3.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (27 commits)
  driver core: device.h: fix doc compilation warnings
  firmware loader: fix another compile warning with PM_SLEEP unset
  build some drivers only when compile-testing
  firmware loader: fix compile warning with PM_SLEEP set
  kobject: sanitize argument for format string
  sysfs_notify is only possible on file attributes
  firmware loader: simplify holding module for request_firmware
  firmware loader: don't export cache_firmware and uncache_firmware
  drivers/base: Use attribute groups to create sysfs memory files
  firmware loader: fix compile warning
  firmware loader: fix build failure with !CONFIG_FW_LOADER_USER_HELPER
  Documentation: Updated broken link in HOWTO
  Finally eradicate CONFIG_HOTPLUG
  driver core: firmware loader: kill FW_ACTION_NOHOTPLUG requests before suspend
  driver core: firmware loader: don't cache FW_ACTION_NOHOTPLUG firmware
  Documentation: Tidy up some drivers/base/core.c kerneldoc content.
  platform_device: use a macro instead of platform_driver_register
  firmware: move EXPORT_SYMBOL annotations
  firmware: Avoid deadlock of usermodehelper lock at shutdown
  dell_rbu: Select CONFIG_FW_LOADER_USER_HELPER explicitly
  ...
2013-07-02 11:44:19 -07:00
Linus Torvalds 0de10f9ea6 TTY/Serial merge for 3.11-rc1
Here is the big TTY / Serial driver merge for 3.11-rc1.
 
 It's not all that big, nothing major changed in the tty api, which is a
 nice change, just a number of serial driver fixes and updates and new
 drivers, along with some n_tty fixes to help resolve some reported
 issues.
 
 All of these have been in the linux-next releases for a while, with the
 exception of the last revert patch, which was reported this past weekend
 by two different people as being needed.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iEYEABECAAYFAlHRtNIACgkQMUfUDdst+ymYgwCeKdyv9wRJ5cIWZt7Jz8ou3P/C
 76YAoIvMv+fwoFBpyud/sC8eAKGTQ6KB
 =Sjk6
 -----END PGP SIGNATURE-----

Merge tag 'tty-3.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty/serial updates from Greg KH:
 "Here is the big TTY / Serial driver merge for 3.11-rc1.

  It's not all that big, nothing major changed in the tty api, which is
  a nice change, just a number of serial driver fixes and updates and
  new drivers, along with some n_tty fixes to help resolve some reported
  issues.

  All of these have been in the linux-next releases for a while, with
  the exception of the last revert patch, which was reported this past
  weekend by two different people as being needed."

* tag 'tty-3.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (51 commits)
  Revert "serial: 8250_pci: add support for another kind of NetMos Technology PCI 9835 Multi-I/O Controller"
  pch_uart: Add uart_clk selection for the MinnowBoard
  tty: atmel_serial: prepare clk before calling enable
  tty: Reset itty for other pty
  n_tty: Buffer work should not reschedule itself
  n_tty: Fix unsafe update of available buffer space
  n_tty: Untangle read completion variables
  n_tty: Encapsulate minimum_to_wake within N_TTY
  serial: omap: Fix device tree based PM runtime
  serial: imx: Fix serial clock unbalance
  serial/mpc52xx_uart: fix kernel panic when system reboot
  serial: mfd: Add sysrq support
  serial: imx: enable the clocks for console
  tty: serial: add Freescale lpuart driver support
  serial: imx: Improve Kconfig text
  serial: imx: Allow module build
  serial: imx: Fix warning when !CONFIG_SERIAL_IMX_CONSOLE
  tty/serial/sirf: fix error propagation in sirfsoc_uart_probe()
  serial: omap: fix potential NULL pointer dereference in serial_omap_runtime_suspend()
  tty: serial: Enable uartlite for ARM zynq
  ...
2013-07-02 11:32:06 -07:00
Linus Torvalds 63580e51bb Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull VFS patches (part 1) from Al Viro:
 "The major change in this pile is ->readdir() replacement with
  ->iterate(), dealing with ->f_pos races in ->readdir() instances for
  good.

  There's a lot more, but I'd prefer to split the pull request into
  several stages and this is the first obvious cutoff point."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (67 commits)
  [readdir] constify ->actor
  [readdir] ->readdir() is gone
  [readdir] convert ecryptfs
  [readdir] convert coda
  [readdir] convert ocfs2
  [readdir] convert fatfs
  [readdir] convert xfs
  [readdir] convert btrfs
  [readdir] convert hostfs
  [readdir] convert afs
  [readdir] convert ncpfs
  [readdir] convert hfsplus
  [readdir] convert hfs
  [readdir] convert befs
  [readdir] convert cifs
  [readdir] convert freevxfs
  [readdir] convert fuse
  [readdir] convert hpfs
  reiserfs: switch reiserfs_readdir_dentry to inode
  reiserfs: is_privroot_deh() needs only directory inode, actually
  ...
2013-07-02 09:28:37 -07:00
Benjamin Herrenschmidt 7a9cdd95fe Merge remote-tracking branch 'agust/next' into next
From Anatolij:

"There are small cleanups and fixes for mpc512x common code,
mpc512x_defconfig updates and soft reboot support for mpc5125
based boards."
2013-07-02 18:31:08 +10:00
Benjamin Herrenschmidt dd8164c1dd Merge remote-tracking branch 'scott/next' into next
Merge Freescale updates
2013-07-02 17:42:17 +10:00
Dongsheng.wang@freescale.com a63b3bc7db powerpc/fsl: add MPIC timer wakeup support
The driver provides a way to wake up the system by the MPIC timer.

For example,
echo 5 > /sys/devices/system/mpic/timer_wakeup
echo standby > /sys/power/state

After 5 seconds the MPIC timer will generate an interrupt to wake up
the system.

Signed-off-by: Wang Dongsheng <dongsheng.wang@freescale.com>
Signed-off-by: Zhao Chenhui <chenhui.zhao@freescale.com>
Signed-off-by: Li Yang <leoli@freescale.com>
2013-07-01 18:38:42 -05:00
Dongsheng.wang@freescale.com 9e6f31a9db powerpc/mpic: create mpic subsystem object
Register a mpic subsystem at /sys/devices/system/

Signed-off-by: Wang Dongsheng <dongsheng.wang@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2013-07-01 18:38:42 -05:00
Dongsheng.wang@freescale.com 36ca09be6f powerpc/mpic: add global timer support
The MPIC global timer is a hardware timer inside the Freescale PIC complying
with OpenPIC standard. When the specified interval times out, the hardware
timer generates an interrupt. The driver currently is only tested on fsl chip,
but it can potentially support other global timers complying to OpenPIC
standard.

The two independent groups of global timer on fsl chip, group A and group B,
are identical in their functionality, except that they appear at different
locations within the PIC register map. The hardware timer can be cascaded to
create timers larger than the default 31-bit global timers. Timer cascade
fields allow configuration of up to two 63-bit timers. But These two groups
of timers cannot be cascaded together.

It can be used as a wakeup source for low power modes. It also could be used
as periodical timer for protocols, drivers and etc.

Signed-off-by: Wang Dongsheng <dongsheng.wang@freescale.com>
Signed-off-by: Li Yang <leoli@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2013-07-01 18:38:41 -05:00
Dongsheng.wang@freescale.com 5ff04b7287 powerpc/mpic: add irq_set_wake support
Add irq_set_wake support. Just add IRQF_NO_SUSPEND to desc->action->flag.
So the wake up interrupt will not be disable in suspend_device_irqs.

Signed-off-by: Wang Dongsheng <dongsheng.wang@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2013-07-01 18:38:41 -05:00
Kevin Hao 9837b43c5f powerpc/85xx: enable coreint for all the 64bit boards
With the patch 7230c564 (powerpc: Rework lazy-interrupt handling),
it seems that the coreint works pretty well on the 85xx 64bit kernel.
So use the coreint by default for these boards.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2013-07-01 18:38:41 -05:00
LEROY Christophe 7601f59765 powerpc/8xx: Erroneous double irq_eoi() on CPM IRQ in MPC8xx
irq_eoi() is already called by generic_handle_irq() so
it shall not be called a again

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2013-07-01 18:38:32 -05:00
Chunhe Lan 2dd1c132d5 powerpc/fsl: Enable CONFIG_E1000E in mpc85xx_smp_defconfig
On the most boards of Freescale platform, they use the PCI-Express
Intel(R) PRO/1000 gigabit ethernet card to work. So enable the
corresponding driver for it.

Signed-off-by: Chunhe Lan <Chunhe.Lan@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2013-07-01 18:38:32 -05:00
Hongtao Jia 86d379690c powerpc/mpic: Add get_version API both for internal and external use
MPIC version is useful information for both mpic_alloc() and mpic_init().
The patch provide an API to get MPIC version for reusing the code.
Also, some other IP block may need MPIC version for their own use.
The API for external use is also provided.

Signed-off-by: Jia Hongtao <hongtao.jia@freescale.com>
Signed-off-by: Li Yang <leoli@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2013-07-01 18:38:28 -05:00
Benjamin Herrenschmidt c039e3a8dd powerpc: Handle both new style and old style reserve maps
When Jeremy introduced the new device-tree based reserve map, he made
the code in early_reserve_mem_dt() bail out if it found one, thus not
reserving the initrd nor processing the old style map.

I hit problems with variants of kexec that didn't put the initrd in
the new style map either. While these could/will be fixed, I believe
we should be safe here and rather reserve more than not enough.

We could have a firmware passing stuff via the new style map, and
in the middle, a kexec that knows nothing about it and adding other
things to the old style map.

I don't see a big issue with processing both and reserving everything
that needs to be. memblock_reserve() supports overlaps fine these days.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-02 08:20:49 +10:00
Michael Neuling e2a800beac powerpc/hw_brk: Fix off by one error when validating DAWR region end
The Data Address Watchpoint Register (DAWR) on POWER8 can take a 512
byte range but this range must not cross a 512 byte boundary.

Unfortunately we were off by one when calculating the end of the region,
hence we were not allowing some breakpoint regions which were actually
valid.  This fixes this error.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Reported-by: Edjunior Barbosa Machado <emachado@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org # 3.9+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-02 08:20:49 +10:00
Ingo Molnar 2fd1b48788 Linux 3.10
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iQEcBAABAgAGBQJR0K2gAAoJEHm+PkMAQRiGWsEH+gMZSN1qRm34hZ82q1Tx7HvL
 Eb/Gsl3Qw/7G2TlTqgjBUs36IdqV9O2cui/aa3/TfXvdvrx+0GlhRkEwQPc+ygcO
 Mvoyoke4tT4+4jVFdCg1J8avREsa28/6oaHs0ZZxuVmJBBLTJH7aXaNsGn6eU1q9
 9+p798MQis6naIiPC63somlZcCIiBhsuWCPWpEfLMn8G1HWAFTM3xXIbNBqe/brS
 bmIOfhomlIZ5dcdaXGvjtP3+KJhkNDwhkPC4tVYu8JqqgSlrE+a+EGyEuuGqKk10
 U+swiqyuD31uBI9ga54u/2FzSqDiAu6YOcMXevjo/m3g9XLdYbYLvN+nvN8alCQ=
 =Ob6Z
 -----END PGP SIGNATURE-----

Merge tag 'v3.10' into sched/core

Merge in a recent upstream commit:

  c2853c8df5 include/linux/math64.h: add div64_ul()

because:

  72a4cf20cb sched: Change cfs_rq load avg to unsigned long

relies on it.

[ We don't rebase sched/core for this, because the handful of
  followup commits after the broken commit are not behavioral
  changes so are unlikely to be needed during bisection. ]

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-07-01 11:18:53 +02:00
Aruna Balakrishnaiah 40847e5660 powerpc/pseries: Support compression of oops text via pstore
The patch set supports compression of oops messages while writing to NVRAM,
this helps in capturing more of oops data to lnx,oops-log. The pstore file
for oops messages will be in decompressed format making it readable.

In case compression fails, the patch takes care of copying the header added
by pstore and last oops_data_sz bytes of big_oops_buf to NVRAM so that we
have recent oops messages in lnx,oops-log.

In case decompression fails, it will result in absence of oops file but still
have files (in /dev/pstore) for other partitions.

Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-01 18:10:49 +10:00
Aruna Balakrishnaiah fbfe86fc0c powerpc/pseries: Re-organise the oops compression code
nvram_compress() and zip_oops() is used by the nvram_pstore_write
API to compress oops messages hence re-organise the functions
accordingly to avoid forward declarations.

Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-01 18:10:49 +10:00
Aruna Balakrishnaiah 6bbbca7359 pstore: Pass header size in the pstore write callback
Header size is needed to distinguish between header and the dump data.
Incorporate the addition of new argument (hsize) in the pstore write
callback.

Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-01 18:10:48 +10:00
Benjamin Herrenschmidt 74251fe21b powerpc/powernv: Fix iommu initialization again
So because those things always end up in trainwrecks... In 7846de406
we moved back the iommu initialization earlier, essentially undoing
37f02195b which was causing us endless trouble... except that in the
meantime we had merged 959c9bdd58 (to workaround the original breakage)
which is now ... broken :-)

This fixes it by doing a partial revert of the latter (we keep the
ppc_md. path which will be needed in the hotplug case, which happens
also during some EEH error recovery situations).

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: <stable@vger.kernel.org> [v3.10]
2013-07-01 18:10:29 +10:00
Benjamin Herrenschmidt 24a72acac1 Linux 3.10
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iQEcBAABAgAGBQJR0K2gAAoJEHm+PkMAQRiGWsEH+gMZSN1qRm34hZ82q1Tx7HvL
 Eb/Gsl3Qw/7G2TlTqgjBUs36IdqV9O2cui/aa3/TfXvdvrx+0GlhRkEwQPc+ygcO
 Mvoyoke4tT4+4jVFdCg1J8avREsa28/6oaHs0ZZxuVmJBBLTJH7aXaNsGn6eU1q9
 9+p798MQis6naIiPC63somlZcCIiBhsuWCPWpEfLMn8G1HWAFTM3xXIbNBqe/brS
 bmIOfhomlIZ5dcdaXGvjtP3+KJhkNDwhkPC4tVYu8JqqgSlrE+a+EGyEuuGqKk10
 U+swiqyuD31uBI9ga54u/2FzSqDiAu6YOcMXevjo/m3g9XLdYbYLvN+nvN8alCQ=
 =Ob6Z
 -----END PGP SIGNATURE-----

Merge tag 'v3.10' into next

Merge 3.10 in order to get some of the last minute powerpc
changes, resolve conflicts and add additional fixes on top
of them.
2013-07-01 17:57:25 +10:00
Michael Ellerman 6e0b8bc965 powerpc/pseries: Inform the hypervisor we are using EBB regs
On LPAR systems we need to inform the hypervisor that we are using the
EBB registers. We do this by setting a bit in the Virtual Processor Area
(VPA) - formerly known as the lppaca.

For now we do this always, ie. we do not dynamically enable/disable.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-01 11:50:17 +10:00
Michael Ellerman 4df4899911 powerpc/perf: Add power8 EBB support
Add logic to the power8 PMU code to support EBB. Future processors would
also be expected to implement similar constraints. At that time we could
possibly factor these out into common code.

Finally mark the power8 PMU as supporting EBB, which is the actual
enable switch which allows EBBs to be configured.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-01 11:50:13 +10:00
Michael Ellerman 330a1eb777 powerpc/perf: Core EBB support for 64-bit book3s
Add support for EBB (Event Based Branches) on 64-bit book3s. See the
included documentation for more details.

EBBs are a feature which allows the hardware to branch directly to a
specified user space address when a PMU event overflows. This can be
used by programs for self-monitoring with no kernel involvement in the
inner loop.

Most of the logic is in the generic book3s code, primarily to avoid a
proliferation of PMU callbacks.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-01 11:50:10 +10:00
Michael Ellerman 2ac138ca21 powerpc/perf: Drop MMCRA from thread_struct
In commit 59affcd "Context switch more PMU related SPRs" I added more
PMU SPRs to thread_struct, later modified in commit b11ae95. To add
insult to injury it turns out we don't need to switch MMCRA as it's
only user readable, and the value is recomputed by the PMU code.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-01 11:50:07 +10:00
Michael Ellerman 4ea355b536 powerpc/perf: Don't enable if we have zero events
In power_pmu_enable() we still enable the PMU even if we have zero
events. This should have no effect but doesn't make much sense. Instead
just return after telling the hypervisor that we are not using the PMCs.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
CC: <stable@vger.kernel.org> [v3.10]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-01 11:50:03 +10:00
Michael Ellerman 0a48843d6c powerpc/perf: Use existing out label in power_pmu_enable()
In power_pmu_enable() we can use the existing out label to reduce the
number of return paths.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
CC: <stable@vger.kernel.org> [v3.10]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-01 11:50:00 +10:00
Michael Ellerman 7a7a41f9d5 powerpc/perf: Freeze PMC5/6 if we're not using them
On Power8 we can freeze PMC5 and 6 if we're not using them. Normally they
run all the time.

As noticed by Anshuman, we should unfreeze them when we disable the PMU
as there are legacy tools which expect them to run all the time.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
CC: <stable@vger.kernel.org> [v3.10]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-01 11:49:57 +10:00
Michael Ellerman 378a6ee99e powerpc/perf: Rework disable logic in pmu_disable()
In pmu_disable() we disable the PMU by setting the FC (Freeze Counters)
bit in MMCR0. In order to do this we have to read/modify/write MMCR0.

It's possible that we read a value from MMCR0 which has PMAO (PMU Alert
Occurred) set. When we write that value back it will cause an interrupt
to occur. We will then end up in the PMU interrupt handler even though
we are supposed to have just disabled the PMU.

We can avoid this by making sure we never write PMAO back. We should not
lose interrupts because when the PMU is re-enabled the overflowed values
will cause another interrupt.

We also reorder the clearing of SAMPLE_ENABLE so that is done after the
PMU is frozen. Otherwise there is a small window between the clearing of
SAMPLE_ENABLE and the setting of FC where we could take an interrupt and
incorrectly see SAMPLE_ENABLE not set. This would for example change the
logic in perf_read_regs().

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
CC: <stable@vger.kernel.org> [v3.10]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-01 11:49:54 +10:00
Michael Ellerman d8bec4c9cd powerpc/perf: Check that events only include valid bits on Power8
A mistake we have made in the past is that we pull out the fields we
need from the event code, but don't check that there are no unknown bits
set. This means that we can't ever assign meaning to those unknown bits
in future.

Although we have once again failed to do this at release, it is still
early days for Power8 so I think we can still slip this in and get away
with it.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
CC: <stable@vger.kernel.org> [v3.10]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-01 11:49:50 +10:00
Michael Ellerman b14b6260ef powerpc: Wire up the HV facility unavailable exception
Similar to the facility unavailble exception, except the facilities are
controlled by HFSCR.

Adapt the facility_unavailable_exception() so it can be called for
either the regular or Hypervisor facility unavailable exceptions.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
CC: <stable@vger.kernel.org> [v3.10]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-01 11:49:47 +10:00
Michael Ellerman 021424a1fc powerpc: Rename and flesh out the facility unavailable exception handler
The exception at 0xf60 is not the TM (Transactional Memory) unavailable
exception, it is the "Facility Unavailable Exception", rename it as
such.

Flesh out the handler to acknowledge the fact that it can be called for
many reasons, one of which is TM being unavailable.

Use STD_EXCEPTION_COMMON() for the exception body, for some reason we
had it open-coded, I've checked the generated code is identical.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
CC: <stable@vger.kernel.org> [v3.10]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-01 11:49:44 +10:00
Michael Ellerman c9f69518e5 powerpc: Remove KVMTEST from RELON exception handlers
KVMTEST is a macro which checks whether we are taking an exception from
guest context, if so we branch out of line and eventually call into the
KVM code to handle the switch.

When running real guests on bare metal (HV KVM) the hardware ensures
that we never take a relocation on exception when transitioning from
guest to host. For PR KVM we disable relocation on exceptions ourself in
kvmppc_core_init_vm(), as of commit a413f47 "Disable relocation on
exceptions whenever PR KVM is active".

So convert all the RELON macros to use NOTEST, and drop the remaining
KVM_HANDLER() definitions we have for 0xe40 and 0xe80.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
CC: <stable@vger.kernel.org> [v3.9+]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-01 11:49:40 +10:00
Michael Ellerman 1d567cb4bd powerpc: Remove unreachable relocation on exception handlers
We have relocation on exception handlers defined for h_data_storage and
h_instr_storage. However we will never take relocation on exceptions for
these because they can only come from a guest, and we never take
relocation on exceptions when we transition from guest to host.

We also have a handler for hmi_exception (Hypervisor Maintenance) which
is defined in the architecture to never be delivered with relocation on,
see see v2.07 Book III-S section 6.5.

So remove the handlers, leaving a branch to self just to be double extra
paranoid.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
CC: <stable@vger.kernel.org> [v3.9+]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-01 11:49:37 +10:00
Nathan Fontenot dd023217e1 powerpc/numa: Do not update sysfs cpu registration from invalid context
The topology update code that updates the cpu node registration in sysfs
should not be called while in stop_machine(). The register/unregister
calls take a lock and may sleep.

This patch moves these calls outside of the call to stop_machine().

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
CC: <stable@vger.kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-01 11:49:34 +10:00
Chen Gang 8246aca705 powerpc/smp: Section mismatch from smp_release_cpus to __initdata spinning_secondaries
the smp_release_cpus is a normal funciton and called in normal environments,
  but it calls the __initdata spinning_secondaries.
  need modify spinning_secondaries to match smp_release_cpus.

the related warning:
  (the linker report boot_paca.33377, but it should be spinning_secondaries)

-----------------------------------------------------------------------------

WARNING: arch/powerpc/kernel/built-in.o(.text+0x23176): Section mismatch in reference from the function .smp_release_cpus() to the variable .init.data:boot_paca.33377
The function .smp_release_cpus() references
the variable __initdata boot_paca.33377.
This is often because .smp_release_cpus lacks a __initdata
annotation or the annotation of boot_paca.33377 is wrong.

WARNING: arch/powerpc/kernel/built-in.o(.text+0x231fe): Section mismatch in reference from the function .smp_release_cpus() to the variable .init.data:boot_paca.33377
The function .smp_release_cpus() references
the variable __initdata boot_paca.33377.
This is often because .smp_release_cpus lacks a __initdata
annotation or the annotation of boot_paca.33377 is wrong.

-----------------------------------------------------------------------------

Signed-off-by: Chen Gang <gang.chen@asianux.com>
CC: <stable@vger.kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-01 11:49:27 +10:00
Chen Gang 7029705a9d powerpc/nvram64: Need return the related error code on failure occurs
When error occurs, need return the related error code to let upper
caller know about it.

ppc_md.nvram_size() can return the error code (e.g. core99_nvram_size()
in 'arch/powerpc/platforms/powermac/nvram.c').

Also set ret value when only need it, so can save structions for normal
cases.

Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-01 11:46:56 +10:00
Li Zhong cce606feb4 powerpc: Set cpu sibling mask before online cpu
It seems following race is possible:

	cpu0					cpux
smp_init->cpu_up->_cpu_up
	__cpu_up
		kick_cpu(1)
-------------------------------------------------------------------------
		waiting online			...
		...				notify CPU_STARTING
							set cpux active
						set cpux online
-------------------------------------------------------------------------
		finish waiting online
		...
sched_init_smp
	init_sched_domains(cpu_active_mask)
		build_sched_domains
						set cpux sibling info
-------------------------------------------------------------------------

Execution of cpu0 and cpux could be concurrent between two separator
lines.

So if the cpux sibling information was set too late (normally
impossible, but could be triggered by adding some delay in
start_secondary, after setting cpu online), build_sched_domains()
running on cpu0 might see cpux active, with an empty sibling mask, then
cause some bad address accessing like following:

[    0.099855] Unable to handle kernel paging request for data at address 0xc00000038518078f
[    0.099868] Faulting instruction address: 0xc0000000000b7a64
[    0.099883] Oops: Kernel access of bad area, sig: 11 [#1]
[    0.099895] PREEMPT SMP NR_CPUS=16 DEBUG_PAGEALLOC NUMA pSeries
[    0.099922] Modules linked in:
[    0.099940] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.10.0-rc1-00120-gb973425-dirty #16
[    0.099956] task: c0000001fed80000 ti: c0000001fed7c000 task.ti: c0000001fed7c000
[    0.099971] NIP: c0000000000b7a64 LR: c0000000000b7a40 CTR: c0000000000b4934
[    0.099985] REGS: c0000001fed7f760 TRAP: 0300   Not tainted  (3.10.0-rc1-00120-gb973425-dirty)
[    0.099997] MSR: 8000000000009032 <SF,EE,ME,IR,DR,RI>  CR: 24272828  XER: 20000003
[    0.100045] SOFTE: 1
[    0.100053] CFAR: c000000000445ee8
[    0.100064] DAR: c00000038518078f, DSISR: 40000000
[    0.100073]
GPR00: 0000000000000080 c0000001fed7f9e0 c000000000c84d48 0000000000000010
GPR04: 0000000000000010 0000000000000000 c0000001fc55e090 0000000000000000
GPR08: ffffffffffffffff c000000000b80b30 c000000000c962d8 00000003845ffc5f
GPR12: 0000000000000000 c00000000f33d000 c00000000000b9e4 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000001 0000000000000000
GPR20: c000000000ccf750 0000000000000000 c000000000c94d48 c0000001fc504000
GPR24: c0000001fc504000 c0000001fecef848 c000000000c94d48 c000000000ccf000
GPR28: c0000001fc522090 0000000000000010 c0000001fecef848 c0000001fed7fae0
[    0.100293] NIP [c0000000000b7a64] .get_group+0x84/0xc4
[    0.100307] LR [c0000000000b7a40] .get_group+0x60/0xc4
[    0.100318] Call Trace:
[    0.100332] [c0000001fed7f9e0] [c0000000000dbce4] .lock_is_held+0xa8/0xd0 (unreliable)
[    0.100354] [c0000001fed7fa70] [c0000000000bf62c] .build_sched_domains+0x728/0xd14
[    0.100375] [c0000001fed7fbe0] [c000000000af67bc] .sched_init_smp+0x4fc/0x654
[    0.100394] [c0000001fed7fce0] [c000000000adce24] .kernel_init_freeable+0x17c/0x30c
[    0.100413] [c0000001fed7fdb0] [c00000000000ba08] .kernel_init+0x24/0x12c
[    0.100431] [c0000001fed7fe30] [c000000000009f74] .ret_from_kernel_thread+0x5c/0x68
[    0.100445] Instruction dump:
[    0.100456] 38800010 38a00000 4838e3f5 60000000 7c6307b4 2fbf0000 419e0040 3d220001
[    0.100496] 78601f24 39491590 e93e0008 7d6a002a <7d69582a> f97f0000 7d4a002a e93e0010
[    0.100559] ---[ end trace 31fd0ba7d8756001 ]---

This patch tries to move the sibling maps updating before
notify_cpu_starting() and cpu online, and a write barrier there to make
sure sibling maps are updated before active and online mask.

Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-01 11:46:55 +10:00
Paul Gortmaker 061d19f279 powerpc: Delete __cpuinit usage from all users
The __cpuinit type of throwaway sections might have made sense
some time ago when RAM was more constrained, but now the savings
do not offset the cost and complications.  For example, the fix in
commit 5e427ec2d0 ("x86: Fix bit corruption at CPU resume time")
is a good example of the nasty type of bugs that can be created
with improper use of the various __init prefixes.

After a discussion on LKML[1] it was decided that cpuinit should go
the way of devinit and be phased out.  Once all the users are gone,
we can then finally remove the macros themselves from linux/init.h.

This removes all the powerpc uses of the __cpuinit macros.  There
are no __CPUINIT users in assembly files in powerpc.

[1] https://lkml.org/lkml/2013/5/20/589

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Josh Boyer <jwboyer@gmail.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-01 11:10:36 +10:00
Joe Perches cc293bf7a9 powerpc/idle: Convert use of typedef ctl_table to struct ctl_table
This typedef is unnecessary and should just be removed.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-01 11:10:35 +10:00
Bjorn Helgaas 5524f3fc06 powerpc/iommu: Remove unused pci_iommu_init() and pci_direct_iommu_init()
pci_iommu_init() and pci_direct_iommu_init() are not referenced anywhere,
so remove them.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-01 11:10:35 +10:00
Kevin Hao 348c2298a6 powerpc: Don't flush/invalidate the d/icache for an unknown relocation type
For an unknown relocation type since the value of r4 is just the 8bit
relocation type, the sum of r4 and r7 may yield an invalid memory
address. For example:
    In normal case:
             r4 = c00xxxxx
             r7 = 40000000
             r4 + r7 = 000xxxxx

    For an unknown relocation type:
             r4 = 000000xx
             r7 = 40000000
             r4 + r7 = 400000xx
   400000xx is an invalid memory address for a board which has just
   512M memory.

And for operations such as dcbst or icbi may cause bus error for an
invalid memory address on some platforms and then cause the board
reset. So we should skip the flush/invalidate the d/icache for
an unknown relocation type.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Acked-by: Suzuki K. Poulose <suzuki@in.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-01 11:10:34 +10:00
Gavin Shan 9bf41be673 powerpc/powernv: Use dev-node in PCI config accessors
Currently, we're using the combo (PCI bus + devfn) in the PCI
config accessors and PCI config accessors in EEH depends on them.
However, it's not safe to refer the PCI bus which might have been
removed during hotplug. So we're using device node in the PCI
config accessors and the corresponding backends just reuse them.

The patch also fix one potential risk: We possiblly have frozen
PE during the early PCI probe time, but we haven't setup the PE
mapping yet. So the errors should be counted to PE#0.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-01 11:10:33 +10:00
Gavin Shan eeb6361fdd powerpc/eeh: Avoid build warnings
The patch is for avoiding following build warnings:

   The function .pnv_pci_ioda_fixup() references
   the function __init .eeh_init().
   This is often because .pnv_pci_ioda_fixup lacks a __init

   The function .pnv_pci_ioda_fixup() references
   the function __init .eeh_addr_cache_build().
   This is often because .pnv_pci_ioda_fixup lacks a __init

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-01 11:10:33 +10:00
Gavin Shan 56ca4fde90 powerpc/eeh: Refactor the output message
We needn't the the whole backtrace other than one-line message in
the error reporting interrupt handler. For errors triggered by
access PCI config space or MMIO, we replace "WARN(1, ...)" with
pr_err() and dump_stack(). The patch also adds more output messages
to indicate what EEH core is doing. Besides, some printk() are
replaced with pr_warning().

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-01 11:10:33 +10:00
Gavin Shan 88b6d14b2b powerpc/eeh: Fix address catch for PowerNV
On the PowerNV platform, the EEH address cache isn't built correctly
because we skipped the EEH devices without binding PE. The patch
fixes that.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-01 11:10:32 +10:00
Gavin Shan 0b9e267d71 powerpc/powernv: Replace variables with flags
We have 2 fields in "struct pnv_phb" to trace the states. The patch
replace the fields with one and introduces flags for that. The patch
doesn't impact the logic.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-01 11:10:32 +10:00
Gavin Shan 652defed48 powerpc/eeh: Check PCIe link after reset
After reset (e.g. complete reset) in order to bring the fenced PHB
back, the PCIe link might not be ready yet. The patch intends to
make sure the PCIe link is ready before accessing its subordinate
PCI devices. The patch also fixes that wrong values restored to
PCI_COMMAND register for PCI bridges.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-01 11:10:31 +10:00
Gavin Shan c35ae1796b powerpc/eeh: Don't collect PCI-CFG data on PHB
When the PHB is fenced or dead, it's pointless to collect the data
from PCI config space of subordinate PCI devices since it should
return 0xFF's. The patch also fixes overwritten buffer while getting
PCI config data.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-01 11:10:31 +10:00
Michael Neuling 090b9284d7 powerpc/tm: Clear MSR RI in non-recoverable TM code
When we treclaim and trecheckpoint there's an unavoidable period when r1
will not be a valid kernel stack pointer.

This patch clears the MSR recoverable interrupt (RI) bit over these
regions to indicate we have an invalid kernel stack pointer.

For treclaim, the region over which we clear MSR RI is larger than
required to avoid the need for an extra costly mtmsrd.

Thanks to Paulus for suggesting this change.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-30 15:49:43 +10:00
James Yang 80aa0fb494 powerpc: Fix string instr. emulation for 32-bit processes on ppc64
String instruction emulation would erroneously result in a segfault if
the upper bits of the EA are set and is so high that it fails access
check.  Truncate the EA to 32 bits if the process is 32-bit.

Signed-off-by: James Yang <James.Yang@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-30 15:49:40 +10:00
Sebastien Bessiere e1b85c17bf trivial: powerpc: Fix typo in ioei_interrupt() description
Signed-off-by: Sebastien Bessiere <sebastien.bessiere@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-30 15:03:18 +10:00
Gavin Shan ea461abf61 powerpc/eeh: Fix fetching bus for single-dev-PE
While running Linux as guest on top of phyp, we possiblly have
PE that includes single PCI device. However, we didn't return
its PCI bus correctly and it leads to failure on recovery from
EEH errors for single-dev-PE. The patch fixes the issue.

Cc: <stable@vger.kernel.org> # v3.7+
Cc: Steve Best <sbest@us.ibm.com>
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-30 14:08:34 +10:00
Alexander Graf a3ff5fbc94 KVM: PPC: Ignore PIR writes
While technically it's legal to write to PIR and have the identifier changed,
we don't implement logic to do so because we simply expose vcpu_id to the guest.

So instead, let's ignore writes to PIR. This ensures that we don't inject faults
into the guest for something the guest is allowed to do. While at it, we cross
our fingers hoping that it also doesn't mind that we broke its PIR read values.

Signed-off-by: Alexander Graf <agraf@suse.de>
2013-06-30 03:33:22 +02:00
Paul Mackerras 681562cd56 KVM: PPC: Book3S PR: Invalidate SLB entries properly
At present, if the guest creates a valid SLB (segment lookaside buffer)
entry with the slbmte instruction, then invalidates it with the slbie
instruction, then reads the entry with the slbmfee/slbmfev instructions,
the result of the slbmfee will have the valid bit set, even though the
entry is not actually considered valid by the host.  This is confusing,
if not worse.  This fixes it by zeroing out the orige and origv fields
of the SLB entry structure when the entry is invalidated.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-06-30 03:33:22 +02:00
Paul Mackerras 0f296829b5 KVM: PPC: Book3S PR: Allow guest to use 1TB segments
With this, the guest can use 1TB segments as well as 256MB segments.
Since we now have the situation where a single emulated guest segment
could correspond to multiple shadow segments (as the shadow segments
are still 256MB segments), this adds a new kvmppc_mmu_flush_segment()
to scan for all shadow segments that need to be removed.

This restructures the guest HPT (hashed page table) lookup code to
use the correct hashing and matching functions for HPTEs within a
1TB segment.  We use the standard hpt_hash() function instead of
open-coding the hash calculation, and we use HPTE_V_COMPARE() with
an AVPN value that has the B (segment size) field included.  The
calculation of avpn is done a little earlier since it doesn't change
in the loop starting at the do_second label.

The computation in kvmppc_mmu_book3s_64_esid_to_vsid() changes so that
it returns a 256MB VSID even if the guest SLB entry is a 1TB entry.
This is because the users of this function are creating 256MB SLB
entries.  We set a new VSID_1T flag so that entries created from 1T
segments don't collide with entries from 256MB segments.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-06-30 03:33:22 +02:00
Paul Mackerras 6ed1485f65 KVM: PPC: Book3S PR: Don't keep scanning HPTEG after we find a match
The loop in kvmppc_mmu_book3s_64_xlate() that looks up a translation
in the guest hashed page table (HPT) keeps going if it finds an
HPTE that matches but doesn't allow access.  This is incorrect; it
is different from what the hardware does, and there should never be
more than one matching HPTE anyway.  This fixes it to stop when any
matching HPTE is found.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-06-30 03:33:22 +02:00
Paul Mackerras bc1bc4e392 KVM: PPC: Book3S PR: Fix invalidation of SLB entry 0 on guest entry
On entering a PR KVM guest, we invalidate the whole SLB before loading
up the guest entries.  We do this using an slbia instruction, which
invalidates all entries except entry 0, followed by an slbie to
invalidate entry 0.  However, the slbie turns out to be ineffective
in some circumstances (specifically when the host linear mapping uses
64k pages) because of errors in computing the parameter to the slbie.
The result is that the guest kernel hangs very early in boot because
it takes a DSI the first time it tries to access kernel data using
a linear mapping address in real mode.

Currently we construct bits 36 - 43 (big-endian numbering) of the slbie
parameter by taking bits 56 - 63 of the SLB VSID doubleword.  These bits
for the tlbie are C (class, 1 bit), B (segment size, 2 bits) and 5
reserved bits.  For the SLB VSID doubleword these are C (class, 1 bit),
reserved (1 bit), LP (large page size, 2 bits), and 4 reserved bits.
Thus we are not setting the B field correctly, and when LP = 01 as
it is for 64k pages, we are setting a reserved bit.

Rather than add more instructions to calculate the slbie parameter
correctly, this takes a simpler approach, which is to set entry 0 to
zeroes explicitly.  Normally slbmte should not be used to invalidate
an entry, since it doesn't invalidate the ERATs, but it is OK to use
it to invalidate an entry if it is immediately followed by slbia,
which does invalidate the ERATs.  (This has been confirmed with the
Power architects.)  This approach takes fewer instructions and will
work whatever the contents of entry 0.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-06-30 03:33:21 +02:00
Paul Mackerras 8ed7b7e9d2 KVM: PPC: Book3S PR: Fix proto-VSID calculations
This makes sure the calculation of the proto-VSIDs used by PR KVM
is done with 64-bit arithmetic.  Since vcpu3s->context_id[] is int,
when we do vcpu3s->context_id[0] << ESID_BITS the shift will be done
with 32-bit instructions, possibly leading to significant bits
getting lost, as the context id can be up to 524283 and ESID_BITS is
18.  To fix this we cast the context id to u64 before shifting.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-06-30 03:33:21 +02:00
Tiejun Chen 5f17ce8b95 KVM: PPC: Guard doorbell exception with CONFIG_PPC_DOORBELL
Availablity of the doorbell_exception function is guarded by
CONFIG_PPC_DOORBELL. Use the same define to guard our caller
of it.

Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
[agraf: improve patch description]
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-06-30 03:33:21 +02:00
Linus Torvalds 6c355beafd Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Pull powerpc fixes from Ben Herrenschmidt:
 "We discovered some breakage in our "EEH" (PCI Error Handling) code
  while doing error injection, due to a couple of regressions.  One of
  them is due to a patch (37f02195be "powerpc/pci: fix PCI-e devices
  rescan issue on powerpc platform") that, in hindsight, I shouldn't
  have merged considering that it caused more problems than it solved.

  Please pull those two fixes.  One for a simple EEH address cache
  initialization issue.  The other one is a patch from Guenter that I
  had originally planned to put in 3.11 but which happens to also fix
  that other regression (a kernel oops during EEH error handling and
  possibly hotplug).

  With those two, the couple of test machines I've hammered with error
  injection are remaining up now.  EEH appears to still fail to recover
  on some devices, so there is another problem that Gavin is looking
  into but at least it's no longer crashing the kernel."

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc/pci: Improve device hotplug initialization
  powerpc/eeh: Add eeh_dev to the cache during boot
2013-06-29 17:02:48 -07:00
Guenter Roeck 7846de406f powerpc/pci: Improve device hotplug initialization
Commit 37f02195b (powerpc/pci: fix PCI-e devices rescan issue on powerpc
platform) fixes a problem with interrupt and DMA initialization on hot
plugged devices. With this commit, interrupt and DMA initialization for
hot plugged devices is handled in the pci device enable function.

This approach has a couple of drawbacks. First, it creates two code paths
for device initialization, one for hot plugged devices and another for devices
known during the initial PCI scan. Second, the initialization code for hot
plugged devices is only called when the device is enabled, ie typically
in the probe function. Also, the platform specific setup code is called each
time pci_enable_device() is called, not only once during device discovery,
meaning it is actually called multiple times, once for devices discovered
during the initial scan and again each time a driver is re-loaded.

The visible result is that interrupt pins are only assigned to hot plugged
devices when the device driver is loaded. Effectively this changes the PCI
probe API, since pci_dev->irq and the device's dma configuration will now
only be valid after pci_enable() was called at least once. A more subtle
change is that platform specific PCI device setup is moved from device
discovery into the driver's probe function, more specifically into the
pci_enable_device() call.

To fix the inconsistencies, add new function pcibios_add_device.
Call pcibios_setup_device from pcibios_setup_bus_devices if device setup
is not complete, and from pcibios_add_device if bus setup is complete.

With this change, device setup code is moved back into device initialization,
and called exactly once for both static and hot plugged devices.

[ This also fixes a regression introduced by the above patch which
  causes dev->irq to be overwritten under some cirumstances after
  MSIs have been enabled for the device which leads to crashes due
  to the MSI core "hijacking" dev->irq to store the base MSI number
  and not the LSI. --BenH
]

Cc: Yuanquan Chen <Yuanquan.Chen@freescale.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Hiroo Matsumoto <matsumoto.hiroo@jp.fujitsu.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-30 08:46:46 +10:00
Al Viro b33159b7d2 proc_powerpc: switch to fixed_size_llseek()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29 12:57:50 +04:00
Al Viro 5f99f4e79a [readdir] switch dcache_readdir() users to ->iterate()
new helpers - dir_emit_dot(file, ctx, dentry), dir_emit_dotdot(file, ctx),
dir_emit_dots(file, ctx).

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29 12:46:48 +04:00
Al Viro 40d158e618 consolidate io_remap_pfn_range definitions
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29 12:46:35 +04:00
Thadeu Lima de Souza Cascardo 1abd601864 powerpc/eeh: Add eeh_dev to the cache during boot
commit f8f7d63fd9 ("powerpc/eeh: Trace eeh
device from I/O cache") broke EEH on pseries for devices that were
present during boot and have not been hotplugged/DLPARed.

eeh_check_failure will get the eeh_dev from the cache, and will get
NULL. eeh_addr_cache_build adds the addresses to the cache, but eeh_dev
for the giving pci_device is not set yet. Just reordering the call to
eeh_addr_cache_insert_dev works fine. The ordering is similar to the one
in eeh_add_device_late.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Acked-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-28 12:02:07 +10:00
Rafael J. Wysocki 39a95f4861 Merge branch 'pm-cpufreq-assorted' into pm-cpufreq
* pm-cpufreq-assorted: (21 commits)
  cpufreq: powernow-k8: call CPUFREQ_POSTCHANGE notfier in error cases
  cpufreq: pcc: call CPUFREQ_POSTCHANGE notfier in error cases
  cpufreq: e_powersaver: call CPUFREQ_POSTCHANGE notfier in error cases
  cpufreq: ACPI: call CPUFREQ_POSTCHANGE notfier in error cases
  cpufreq: make __cpufreq_notify_transition() static
  cpufreq: Fix minor formatting issues
  cpufreq: Fix governor start/stop race condition
  cpufreq: Simplify userspace governor
  cpufreq: powerpc: move cpufreq driver to drivers/cpufreq
  cpufreq: kirkwood: Select CPU_FREQ_TABLE option
  cpufreq: big.LITTLE needs cpufreq table
  cpufreq: SPEAr needs cpufreq table
  cpufreq: powerpc: Add cpufreq driver for Freescale e500mc SoCs
  cpufreq: remove unnecessary cpufreq_cpu_{get|put}() calls
  cpufreq: MAINTAINERS: Add git tree path for ARM specific updates
  cpufreq: rename index as driver_data in cpufreq_frequency_table
  cpufreq: Don't create empty /sys/devices/system/cpu/cpufreq directory
  cpufreq: Move get_cpu_idle_time() to cpufreq.c
  cpufreq: governors: Move get_governor_parent_kobj() to cpufreq.c
  cpufreq: Add EXPORT_SYMBOL_GPL for have_governor_per_policy
  ...
2013-06-27 21:46:45 +02:00
Maarten Lankhorst a41b56efa7 arch: Make __mutex_fastpath_lock_retval return whether fastpath succeeded or not
This will allow me to call functions that have multiple
arguments if fastpath fails. This is required to support ticket
mutexes, because they need to be able to pass an extra argument
to the fail function.

Originally I duplicated the functions, by adding
__mutex_fastpath_lock_retval_arg. This ended up being just a
duplication of the existing function, so a way to test if
fastpath was called ended up being better.

This also cleaned up the reservation mutex patch some by being
able to call an atomic_set instead of atomic_xchg, and making it
easier to detect if the wrong unlock function was previously
used.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: dri-devel@lists.freedesktop.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: robclark@gmail.com
Cc: rostedt@goodmis.org
Cc: daniel@ffwll.ch
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20130620113105.4001.83929.stgit@patser
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-26 12:10:55 +02:00
Wei Yongjun 98c7355fb3 powerpc/83xx: use module_i2c_driver to simplify the code
Use the module_i2c_driver() macro to make the code smaller
and a bit simpler.

dpatch engine is used to auto generate this patch.
(https://github.com/weiyj/dpatch)

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2013-06-25 16:34:11 -05:00
Linus Torvalds ad46547056 Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Pull powerpc bugfix from Ben Herrenschmidt:
 "This is a fix for a regression causing a freescale "83xx" based
  platforms to crash on boot due to some PCI breakage"

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc/pci: Fix boot panic on mpc83xx (regression)
2013-06-25 09:06:48 -10:00
Gavin Shan 5459ae1431 powerpc/eeh: Use interruptible sleep in keehd
To replace down() with down_interrutible() to avoid following
warning:

[c00000007ba7b710] [c000000000014410] .__switch_to+0x1b0/0x380
[c00000007ba7b7c0] [c0000000007b408c] .__schedule+0x3ec/0x970
[c00000007ba7ba50] [c0000000007b1f24] .schedule_timeout+0x1a4/0x2b0
[c00000007ba7bb30] [c0000000007b34a4] .__down+0xa4/0x104
[c00000007ba7bbf0] [c0000000000b9230] .down+0x60/0x70
[c00000007ba7bc80] [c0000000000336d0] .eeh_event_handler+0x70/0x190
[c00000007ba7bd30] [c0000000000b1a58] .kthread+0xe8/0xf0
[c00000007ba7be30] [c00000000000a05c] .ret_from_kernel_thread+0x5c/0x8

This also avoids keeping the load average up while doing nothing.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-25 17:24:41 +10:00
Gavin Shan ef6a285773 powerpc/eeh: Remove eeh_mutex
Originally, eeh_mutex was introduced to protect the PE hierarchy
tree and the attached EEH devices because EEH core was possiblly
running with multiple threads to access the PE hierarchy tree.
However, we now have only one kthread in EEH core. So we needn't
the eeh_mutex and just remove it.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-25 17:24:41 +10:00
Nathan Fontenot ff1e768341 powerpc/mm: Fix build warnings with CONFIG_TRANSPARENT_HUGEPAGE disabled
Building with CONFIG_TRANSPARENT_HUGEPAGE disabled causes the following
build wearnings;

powerpc/arch/powerpc/include/asm/mmu-hash64.h: In function ‘__hash_page_thp’:
powerpc/arch/powerpc/include/asm/mmu-hash64.h:354: warning: no return statement in function returning non-void

This patch adds a return -1 to the static inline for __hash_page_thp()
to correct the warnings.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-25 17:24:40 +10:00
Aruna Balakrishnaiah 99b308e3bb powerpc/pseries: Enable PSTORE in pseries_defconfig
Since now we have pstore support for nvram in pseries, enable it
in the default config. With this config option enabled, pstore
infra-structure will be used to read/write the messages from/to nvram.

Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com>
Acked-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-25 17:24:40 +10:00
Michael Neuling 540e07c67e powerpc/hw_brk: Fix clearing of extraneous IRQ
In 9422de3 "powerpc: Hardware breakpoints rewrite to handle non DABR breakpoint
registers" we changed the way we mark extraneous irqs with this:

-	info->extraneous_interrupt = !((bp->attr.bp_addr <= dar) &&
-			(dar - bp->attr.bp_addr < bp->attr.bp_len));
+	if (!((bp->attr.bp_addr <= dar) &&
+	      (dar - bp->attr.bp_addr < bp->attr.bp_len)))
+		info->type |= HW_BRK_TYPE_EXTRANEOUS_IRQ;

Unfortunately this is bogus as it never clears extraneous IRQ if it's already
set.

This correctly clears extraneous IRQ before possibly setting it.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Reported-by: Edjunior Barbosa Machado <emachado@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-25 17:24:40 +10:00
Michael Neuling b0b0aa9c7f powerpc/hw_brk: Fix setting of length for exact mode breakpoints
The smallest match region for both the DABR and DAWR is 8 bytes, so the
kernel needs to filter matches when users want to look at regions smaller than
this.

Currently we set the length of PPC_BREAKPOINT_MODE_EXACT breakpoints to 8.
This is wrong as in exact mode we should only match on 1 address, hence the
length should be 1.

This ensures that the kernel will filter out any exact mode hardware breakpoint
matches on any addresses other than the requested one.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Reported-by: Edjunior Barbosa Machado <emachado@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-25 17:24:39 +10:00
Anatolij Gustschin dd0120dea6 powerpc/mpc512x: enable USB support in defconfig
Enable USB EHCI, mass storage and USB gadget support.

Signed-off-by: Anatolij Gustschin <agust@denx.de>
2013-06-25 08:51:19 +02:00
Gerhard Sittig 6d18c9042c powerpc/mpc512x: commit re-generated defconfig
This patch does not change the content, it merely re-orders
configuration items and drops explicit options which already
apply as the default.

Signed-off-by: Gerhard Sittig <gsi@denx.de>
Signed-off-by: Anatolij Gustschin <agust@denx.de>
2013-06-25 08:37:17 +02:00
Joe Liccese 8c43d2b0ca powerpc: Add T4 LAC device tree binding & defs
The Interlaken is a narrow, high speed channelized chip-to-chip interface. To
facilitate interoperability between a data path device and a look-aside
co-processor, the Interlaken Look-Aside protocol is defined for short
transaction-related transfers. Although based on the Interlaken protocol,
Interlaken Look-Aside is not directly compatible with Interlaken and can be
considered a different operation mode.

The Interlaken LA controller connects internal platform to Interlaken serial
interface. It accepts LA command through software portals, which are system
memory mapped 4KB spaces. The LA commands are then translated into the
Interlaken control words and data words, which are sent on TX side to TCAM
through SerDes lanes.

Signed-off-by: Joe Liccese <joe.liccese@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2013-06-24 19:52:36 -05:00
Greg Kroah-Hartman 805bf3daf3 Merge 3.10-rc7 into tty-next
We want the tty fixes in this branch as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-06-24 15:17:53 -07:00
Greg Kroah-Hartman b5aef682e0 Merge 3.10-rc7 into driver-core-next
We want the firmware merge fixes, and other bits, in here now.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-06-24 15:14:43 -07:00
Rojhalat Ibrahim b37e161388 powerpc/pci: Fix boot panic on mpc83xx (regression)
The following commit caused a fatal oops when booting on mpc83xx with
a non-express PCI bus (regardless of whether a PCI device is present):

commit 50d8f87d2b
Author: Rojhalat Ibrahim <imr@rtschenk.de>
Date:   Mon Apr 8 10:15:28 2013 +0200

    powerpc/fsl-pci Make PCIe hotplug work with Freescale PCIe controllers

    Up to now the PCIe link status on Freescale PCIe controllers was only
    checked once at boot time. So hotplug did not work. With this patch the
    link status is checked on every config read. PCIe devices not present at
    boot time are found after doing 'echo 1 >/sys/bus/pci/rescan'.

    Signed-off-by: Rojhalat Ibrahim <imr@rtschenk.de>
    Signed-off-by: Kumar Gala <galak@kernel.crashing.org>

This patch fixes the issue by calling setup_indirect_pci for all device types.
fsl_indirect_read_config is now only used for booke/86xx PCIe controllers.

Reported-by: Michael Guntsche <mike@it-loops.com>
Cc: Scott Wood <scottwood@freescale.com>
Signed-off-by: Rojhalat Ibrahim <imr@rtschenk.de>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2013-06-24 16:54:09 -05:00
Matteo Facchinetti 0875a88e85 powerpc/mpc512x: add MPC5125 reset module support for system restart
Only part of MPC5125 reset module is like as MPC5121.
In detail, RCWH register doesn't contain informations about:
- PCI arbiter
- NAND flash page size
- NAND flash port size

For this reason, in device tree, this module has a different name then
MPC5121 reset module but use the same "struct mpc512x_reset_module"
register definition and the same restart procedure.

Signed-off-by: Matteo Facchinetti <engineering@sirius-es.it>
Signed-off-by: Anatolij Gustschin <agust@denx.de>
2013-06-24 21:36:49 +02:00
Linus Torvalds 9d0be540d7 KVM fixes for 3.10-rc6
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.13 (GNU/Linux)
 
 iQIcBAABAgAGBQJRwc4AAAoJEBvWZb6bTYbyT/wQAIAvvjM+jlqOsZCUjnutdWXf
 VGvJB1jXPEa6cuUkaE3RYO8rLLJz/PC4Geg8I/awtjeSkPvldeu0FihSkdbrNkpJ
 3dSAJ0cR8+ScUZB74Lt1pfwwEAdZk61MU1opD1qy3wWTy5Rk2xp4qzE2aCVJoLkh
 K8gscLrCHR47DELcm+AWCtHt/0VOZSAVe9z/7Qf9eAMCNmH/efHgrZZTxF/1aBkI
 7XaZdqT+d84D/Bc2isdRNvFYt9rkuII2GCeRciw2t3QsgVoBXn7FVK2OXbufIaYW
 /XX0YiNpSnUpcpOtGw2MwzpKgDSRE9QbnOUVsWThTDeG7Q5d72ioCFuRDba8p3Lc
 u146nQM495bAdJQz6IW3JCeiZlQ86/QjFquk9hJuCwQpuHmlRpVEqFbLVnZEnYvk
 lc9dnK7BcaftbMgo/j6DMv6Bpq/EDp1JrPXzNT4BwG7ptArEXzmdoktwpONjrZzI
 1oZV5xCcNhTjyqlndiZKxneKXoqz/3NOdR+EV0yh3+0l4PK6C52jd6PaKRv7qN3l
 Lt35uFrmKqzOR12KVStYREyh1Sy43ADE4tIIPnOBbcXz23Df4/vVSb1Bqv0nPq4i
 JhsRsGoR8ZdDBd7c40XlvpxOiMmubFLSklX0WaKn2yIZH9FMj63Ak0wet7qhaqb8
 lbJRUH8gnK0RCVEoUFpB
 =yGPS
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
 "Three one-line fixes for my first pull request; one for x86 host, one
  for x86 guest, one for PPC"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  x86: kvmclock: zero initialize pvclock shared memory area
  kvm/ppc/booke: Delay kvmppc_lazy_ee_enable
  KVM: x86: remove vcpu's CPL check in host-invoked XCR set
2013-06-21 06:29:22 -10:00
Aneesh Kumar K.V 1a5272866f powerpc: Optimize hugepage invalidate
Hugepage invalidate involves invalidating multiple hpte entries.
Optimize the operation using H_BULK_REMOVE on lpar platforms.
On native, reduce the number of tlb flush.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21 16:01:58 +10:00
Aneesh Kumar K.V 437d496457 powerpc/THP: Enable THP on PPC64
We enable only if the we support 16MB page size.

Reviewed-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21 16:01:58 +10:00
Aneesh Kumar K.V d8e355a20f powerpc: split hugepage when using subpage protection
We find all the overlapping vma and mark them such that we don't allocate
hugepage in that range. Also we split existing huge page so that the
normal page hash can be invalidated and new page faulted in with new
protection bits.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21 16:01:57 +10:00
Aneesh Kumar K.V a00e7bea0d powerpc: disable assert_pte_locked for collapse_huge_page
With THP we set pmd to none, before we do pte_clear. Hence we can't
walk page table to get the pte lock ptr and verify whether it is locked.
THP do take pte lock before calling pte_clear. So we don't change the locking
rules here. It is that we can't use page table walking to check whether
pte locks are held with THP.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21 16:01:57 +10:00
Aneesh Kumar K.V 7888b4ddb4 powerpc: Prevent gcc to re-read the pagetables
GCC is very likely to read the pagetables just once and cache them in
the local stack or in a register, but it is can also decide to re-read
the pagetables. The problem is that the pagetable in those places can
change from under gcc.

With THP/hugetlbfs the pmd (and pgd for hugetlbfs giga pages) can
change under gup_fast. The pages won't be freed untill we finish
gup fast because we have irq disabled and we free these pages via
rcu callback.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21 16:01:56 +10:00
Aneesh Kumar K.V 0ac52dd766 powerpc: Make linux pagetable walk safe with THP enabled
We need to have irqs disabled to handle all the possible parallel update for
linux page table without holding locks.

Events that we are intersted in while walking page tables are
1) Page fault
2) umap
3) THP split
4) THP collapse

A) local_irq_disabled:
------------------------
1) page fault:
A none to valid transition via page fault is not an issue because we
would either see a none or valid. If it is none, we would error out
the page table walk. We may need to use on stack values when checking for
type of page table elements, because if we do

if (!is_hugepd()) {
    if (!pmd_none() {
       if (pmd_bad() {

We could take that bad condition because the pmd got converted to a hugepd
after the !is_hugepd check via a hugetlb fault.

The right way would be to check for pmd_none higher up or use on stack value.

2) A valid to none conversion via unmap:
We can safely walk the upper level table, because we don't remove the the
page table entries until rcu grace period. So even if we followed a
wrong pointer we still have the pointer valid till the grace period.

A PTE pointer returned need to be atomically checked for _PAGE_PRESENT and
 _PAGE_BUSY. A valid pointer returned could becoming none later. To prevent
pte_clear we take _PAGE_BUSY.

3) THP split:
A valid transparent hugepage is converted to nomal page. Before we split we
do pmd_splitting_flush, which sets the hugepage PTE to _PAGE_SPLITTING
So when walking page table we need to check for pmd_trans_splitting and
handle that. The pte returned should also need to be checked for
_PAGE_SPLITTING before setting _PAGE_BUSY similar to _PAGE_PRESENT. We save
the value of PTE on stack and check for the flag in the local pte value.
If we don't have the value set we can safely operate on the local pte value
and we atomicaly set _PAGE_BUSY.

4) THP collapse:
A normal page gets converted to hugepage. In the collapse path, we
mark the pmd none early (pmdp_clear_flush). With irq disabled, if we
are aleady walking page table we would see the pmd_none and won't continue.
If we see a valid PMD, we should still check for _PAGE_PRESENT before
setting _PAGE_BUSY, to make sure we didn't collapse the PTE to a Huge PTE.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21 16:01:56 +10:00
Aneesh Kumar K.V 6d492ecc64 powerpc/THP: Add code to handle HPTE faults for hugepages
The deposted PTE page in the second half of the PMD table is used to
track the state on hash PTEs. After updating the HPTE, we mark the
coresponding slot in the deposted PTE page valid.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21 16:01:56 +10:00
Aneesh Kumar K.V c367714ce8 powerpc: Update gup_pmd_range to handle transparent hugepages
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21 16:01:55 +10:00
Aneesh Kumar K.V db7cb5b924 powerpc/kvm: Handle transparent hugepage in KVM
We can find pte that are splitting while walking page tables. Return
None pte in that case.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21 16:01:55 +10:00
Aneesh Kumar K.V 12bc9f6fc1 powerpc: Replace find_linux_pte with find_linux_pte_or_hugepte
Replace find_linux_pte with find_linux_pte_or_hugepte and explicitly
document why we don't need to handle transparent hugepages at callsites.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21 16:01:54 +10:00
Aneesh Kumar K.V ac52ae4721 powerpc: Update find_linux_pte_or_hugepte to handle transparent hugepages
Reviewed-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21 16:01:54 +10:00
Aneesh Kumar K.V 29409997f8 powerpc: move find_linux_pte_or_hugepte and gup_hugepte to common code
We will use this in the later patch for handling THP pages

Reviewed-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21 16:01:54 +10:00
Aneesh Kumar K.V 074c2eae3e powerpc/THP: Implement transparent hugepages for ppc64
We now have pmd entries covering 16MB range and the PMD table double its original size.
We use the second half of the PMD table to deposit the pgtable (PTE page).
The depoisted PTE page is further used to track the HPTE information. The information
include [ secondary group | 3 bit hidx | valid ]. We use one byte per each HPTE entry.
With 16MB hugepage and 64K HPTE we need 256 entries and with 4K HPTE we need
4096 entries. Both will fit in a 4K PTE page. On hugepage invalidate we need to walk
the PTE page and invalidate all valid HPTEs.

This patch implements necessary arch specific functions for THP support and also
hugepage invalidate logic. These PMD related functions are intentionally kept
similar to their PTE counter-part.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21 16:01:53 +10:00
Aneesh Kumar K.V f940f52898 powerpc/THP: Double the PMD table size for THP
THP code does PTE page allocation along with large page request and deposit them
for later use. This is to ensure that we won't have any failures when we split
hugepages to regular pages.

On powerpc we want to use the deposited PTE page for storing hash pte slot and
secondary bit information for the HPTEs. We use the second half
of the pmd table to save the deposted PTE page.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21 16:01:53 +10:00
Aneesh Kumar K.V db3d853490 powerpc/mm: handle hugepage size correctly when invalidating hpte entries
If a hash bucket gets full, we "evict" a more/less random entry from it.
When we do that we don't invalidate the TLB (hpte_remove) because we assume
the old translation is still technically "valid". This implies that when
we are invalidating or updating pte, even if HPTE entry is not valid
we should do a tlb invalidate. With hugepages, we need to pass the correct
actual page size value for tlb invalidation.

This change update the patch 0608d69246
"powerpc/mm: Always invalidate tlb on hpte invalidate and update" to handle
transparent hugepages correctly.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21 16:01:52 +10:00
Gavin Shan 8998897b8f powerpc/eeh: Debugfs for error injection
The patch creates debugfs entries (powerpc/PCIxxxx/err_injct) for
injecting EEH errors for testing purpose.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21 16:01:52 +10:00
Gavin Shan 37c367f279 powerpc/powernv: Debugfs directory for PHB
The patch creates one debugfs directory ("powerpc/PCIxxxx") for
each PHB so that we can hook EEH error injection debugfs entry
there in proceeding patch.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21 16:01:51 +10:00
Gavin Shan 7cb9d93dc6 powerpc/eeh: Register OPAL notifier for PCI error
The patch registers OPAL event notifier and process the PCI errors
from firmware. If we have pending PCI errors, special EEH event
(without binding PE) will be sent to EEH core for processing.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21 16:01:51 +10:00
Gavin Shan e8e71fa426 powernv/opal: Disable OPAL notifier upon poweroff
While we're restarting or powering off the system, we needn't
the OPAL notifier any more. So just to disable that.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21 16:01:51 +10:00
Gavin Shan 1bc98de26d powernv/opal: Notifier for OPAL events
This patch implements a notifier to receive a notification on OPAL
event mask changes. The notifier is only called as a result of an OPAL
interrupt, which will happen upon reception of FSP messages or PCI errors.
Any event mask change detected as a result of opal_poll_events() will not
result in a notifier call.

[benh: changelog]
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-21 16:01:50 +10:00
Gavin Shan b95cd2cd44 powerpc/eeh: Allow to check fenced PHB proactively
It's meaningless to handle frozen PE if we already had fenced PHB.
The patch intends to check the PHB state before checking PE. If the
PHB has been put into fenced state, we need take care of that firstly.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:06:53 +10:00
Gavin Shan be7e744607 powerpc/eeh: Enable EEH check for config access
The patch enables EEH check and let EEH core to process the EEH
errors for PowerNV platform while accessing config space. Originally,
the implementation already had mechanism to check EEH errors and
tried to recover from them. However, we never let EEH core to handle
the EEH errors.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:06:50 +10:00
Gavin Shan e9cc17d4de powerpc/eeh: Initialization for PowerNV
The patch initializes EEH for PowerNV platform. Because the OPAL
APIs requires HUB ID, we need trace that through struct pnv_phb.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:06:47 +10:00
Gavin Shan 29310e5e86 powerpc/eeh: PowerNV EEH backends
The patch adds EEH backends for PowerNV platform. It's notable that
part of those EEH backends call to the I/O chip dependent backends.

[Removed pointless change to eeh_pseries.c -- BenH]

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:06:43 +10:00
Gavin Shan 70f942db46 powerpc/eeh: I/O chip next error
The patch implements the backend for EEH core to retrieve next
EEH error to handle. For the informational errors, we won't bother
the EEH core. Otherwise, the EEH should take appropriate actions
depending on the return value:

	0 - No further errors detected
	1 - Frozen PE
	2 - Fenced PHB
	3 - Dead PHB
	4 - Dead IOC

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:06:40 +10:00
Gavin Shan bf90dfea23 powerpc/eeh: I/O chip PE log and bridge setup
The patch adds backends to retrieve error log and configure p2p
bridges for the indicated PE.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:06:37 +10:00
Gavin Shan 9d5cab0010 powerpc/eeh: I/O chip PE reset
The patch adds the I/O chip backend to do PE reset. For now, we
focus on PCI bus dependent PE. If PHB PE has been put into error
state, the PHB will take complete reset. Besides, the root bridge
will take fundamental or hot reset accordingly if the indicated
PE locates at the toppest of PCI hierarchy tree. Otherwise, the
upstream p2p bridge will take hot reset.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:06:33 +10:00
Gavin Shan 8c41a7f3f7 powerpc/eeh: I/O chip EEH state retrieval
The patch adds I/O chip backend to retrieve the state for the
indicated PE. While the PE state is temperarily unavailable,
the upper layer (powernv platform) should return default delay
(1 second).

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:06:30 +10:00
Gavin Shan eb0059836b powerpc/eeh: I/O chip EEH enable option
The patch adds the backend to enable or disable EEH functionality
for the specified PE. The backend is also used to enable MMIO or
DMA path for the problematic PE. It's notable that all PEs on
PowerNV platform support EEH functionality by default, and we
disallow to disable EEH for the specific PE.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:06:27 +10:00
Gavin Shan 73370c662b powerpc/eeh: I/O chip post initialization
The post initialization (struct eeh_ops::post_init) is called after
the EEH probe is done. On the other hand, the EEH core post
initialization is designed to call platform and then I/O chip backend
on PowerNV platform.

The patch adds the backend for I/O chip to notify the platform
that the specific PHB is ready to supply EEH service.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:06:24 +10:00
Gavin Shan 8747f36324 powerpc/eeh: EEH backend for P7IOC
For EEH on PowerNV platform, the overall architecture is different
from that on pSeries platform. In order to support multiple I/O chips
in future, we split EEH to 3 layers for PowerNV platform: EEH core,
platform layer, I/O layer. It would give EEH implementation on PowerNV
platform much more flexibility in future.

The patch adds the EEH backend for P7IOC.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:06:20 +10:00
Gavin Shan 23773230c8 powerpc/eeh: Sync OPAL API with firmware
The patch synchronizes OPAL APIs between kernel and firmware. Also,
we starts to replace opal_pci_get_phb_diag_data() with the similar
opal_pci_get_phb_diag_data2() and the former OPAL API would return
OPAL_UNSUPPORTED from now on.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:06:17 +10:00
Gavin Shan 8a6b1bc70d powerpc/eeh: EEH core to handle special event
On PowerNV platform, the EEH event caused by interrupt won't have
binding PE. The patch enables EEH core to handle the special event.
To avoid the current logic we have, The eeh_handle_event() is renamed
to eeh_handle_normal_event(), and the eeh_handle_special_event() is
introduced. The function eeh_handle_event() dispatches to above two
functions according to the input parameter. Besides, new backend
"next_error" added to eeh_ops and it's expected to have following
return values:

        4 - Dead IOC           3 - Dead PHB
        2 - Fenced PHB         1 - Frozen PE
        0 - No error found

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:06:14 +10:00
Gavin Shan 4907581dc2 powerpc/eeh: Export confirm_error_lock
An EEH event is created and queued to the event queue for each
ingress EEH error. When there're mutiple EEH errors, we need serialize
the process to keep consistent PE state (flags). The spinlock
"confirm_error_lock" was introduced for the purpose. We'll inject
EEH event upon error reporting interrupts on PowerNV platform. So
we export the spinlock for that to use for consistent PE state.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:06:11 +10:00
Gavin Shan 9986659534 powerpc/eeh: Allow to purge EEH events
On PowerNV platform, we might run into the situation where subsequent
events are duplicated events of former one, which is being processed.
For the case, we need the function implemented by the patch to purge
EEH events accordingly.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:06:07 +10:00
Gavin Shan 5a71978e4b powerpc/eeh: Trace time on first error for PE
We're not expecting that one specific PE got frozen for over 5
times in last hour. Otherwise, the PE will be removed from the
system upon newly coming EEH errors. The patch introduces time
stamp to trace the first error on specific PE in last hour and
function to update that accordingly. Besides, the time stamp
is recovered during PE hotplug path as we did for frozen count.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:06:04 +10:00
Gavin Shan c86085580d powerpc/eeh: Single kthread to handle events
We possiblly have multiple kthreads running for multiple EEH errors
(events) and use one spinlock to make the process of handling those
EEH events serialized. That's unnecessary and the patch creates only
one kthread, which is started during EEH core initialization time in
eeh_init(). A new semaphore introduced to count the number of existing
EEH events in the queue and the kthread waiting on the semaphore.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:06:01 +10:00
Gavin Shan 26a74850b3 powerpc/eeh: Delay EEH probe during hotplug
While doing EEH recovery, the PCI devices of the problematic PE
should be removed and then added to the system again. During the
so-called hotplug event, the PCI devices of the problematic PE
will be probed through early/late phase. We would delay EEH probe
on late point for PowerNV platform since the PCI device isn't
available in early phase.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:05:58 +10:00
Gavin Shan 326a98ea93 powerpc/eeh: Refactor eeh_reset_pe_once()
We shouldn't check that the returned PE status is exactly equal to
(EEH_STATE_MMIO_ACTIVE | EEH_STATE_DMA_ACTIVE) but instead only check
that they are both set.

[benh: changelog]
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:05:54 +10:00
Gavin Shan 21fd21f590 powerpc/eeh: EEH post initialization operation
The patch adds new EEH operation post_init. It's used to notify
the platform that EEH core has completed the EEH probe. By that,
PowerNV platform starts to use the services supplied by EEH
functionality.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:05:51 +10:00
Gavin Shan 51fb5f5632 powerpc/eeh: Make eeh_init() public
For EEH on PowerNV platform, we will do EEH probe based on the
real PCI devices. The PCI devices are available after PCI probe.
So we have to call eeh_init() explicitly on PowerNV platform
after PCI probe. The patch also does EEH probe for PowerNV platform
in eeh_init().

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:05:48 +10:00
Gavin Shan 8cdb283371 powerpc/eeh: Trace PCI bus from PE
There're several types of PEs can be supported for now: PHB, Bus
and Device dependent PE. For PCI bus dependent PE, tracing the
corresponding PCI bus from PE (struct eeh_pe) would make the code
more efficient. The patch also enables the retrieval of PCI bus based
on the PCI bus dependent PE.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:05:45 +10:00
Gavin Shan 0156680854 powerpc/eeh: Make eeh_pe_get() public
While processing EEH event interrupt from P7IOC, we need function
to retrieve the PE according to the indicated EEH device. The patch
makes function eeh_pe_get() public so that other source files can call
it for that purpose. Also, the patch fixes referring to wrong BDF
(Bus/Device/Function) address while searching PE in function
__eeh_pe_get().

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:05:41 +10:00
Gavin Shan 9ff67433ce powerpc/eeh: Make eeh_phb_pe_get() public
One of the possible cases indicated by P7IOC interrupt is fenced
PHB. For that case, we need fetch the PE corresponding to the PHB
and disable the PHB and all subordinate PCI buses/devices, recover
from the fenced state and eventually enable the whole PHB. We need
one function to fetch the PHB PE outside eeh_pe.c and the patch is
going to make eeh_phb_pe_get() public for that purpose.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:05:38 +10:00
Gavin Shan 317f06de78 powerpc/eeh: Move common part to kernel directory
The patch moves the common part of EEH core into arch/powerpc/kernel
directory so that we needn't PPC_PSERIES while compiling POWERNV
platform:

        * Move the EEH common part into arch/powerpc/kernel
        * Move the functions for PCI hotplug from pSeries platform to
          arch/powerpc/kernel/pci-hotplug.c
        * Move CONFIG_EEH from arch/powerpc/platforms/pseries/Kconfig to
          arch/powerpc/platforms/Kconfig
        * Adjust makefile accordingly

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:05:35 +10:00
Gavin Shan a84f273c30 powerpc/eeh: Cleanup for EEH core
Cleanup on EEH core to remove unnecessary whitespaces.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:05:31 +10:00
Michael Neuling 87b4e5393a powerpc/tm: Fix return of active 64bit signals
Currently we only restore signals which are transactionally suspended but it's
possible that the transaction can be restored even when it's active.  Most
likely this will result in a transactional rollback by the hardware as the
transaction will have been doomed by an earlier treclaim.

The current code is a legacy of earlier kernel implementations which did
software rollback of active transactions in the kernel.  That code has now gone
but we didn't correctly fix up this part of the signals code which still makes
assumptions based on having software rollback.

This changes the signal return code to always restore both contexts on 64 bit
signal return.  It also ensures that the MSR TM bits are properly restored from
the signal context which they are not currently.

Signed-off-by: Michael Neuling <mikey@neuling.org>
cc: stable@vger.kernel.org (v3.9+)
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:05:28 +10:00
Michael Neuling 55e4341850 powerpc/tm: Fix return of 32bit rt signals to active transactions
Currently we only restore signals which are transactionally suspended but it's
possible that the transaction can be restored even when it's active.  Most
likely this will result in a transactional rollback by the hardware as the
transaction will have been doomed by an earlier treclaim.

The current code is a legacy of earlier kernel implementations which did
software rollback of active transactions in the kernel.  That code has now gone
but we didn't correctly fix up this part of the signals code which still makes
assumptions based on having software rollback.

This changes the signal return code to always restore both contexts on 32 bit
rt signal return.

Signed-off-by: Michael Neuling <mikey@neuling.org>
cc: stable@vger.kernel.org (v3.9+)
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:05:25 +10:00
Michael Neuling 2c27a18f87 powerpc/tm: Fix restoration of MSR on 32bit signal return
Currently we clear out the MSR TM bits on signal return assuming that the
signal should never return to an active transaction.

This is bogus as the user may do this.  It's most likely the transaction will
be doomed due to a treclaim but that's a problem for the HW not the kernel.

The current code is a legacy of earlier kernel implementations which did
software rollback of active transactions in the kernel.  That code has now gone
but we didn't correctly fix up this part of the signals code which still makes
the assumption that it must be returning to a suspended transaction.

This pulls out both MSR TM bits from the user supplied context rather than just
setting TM suspend.  We pull out only the bits needed to ensure the user can't
do anything dangerous to the MSR.

Signed-off-by: Michael Neuling <mikey@neuling.org>
cc: stable@vger.kernel.org (v3.9+)
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:05:22 +10:00
Michael Neuling fee5545071 powerpc/tm: Fix 32 bit non-rt signals
Currently sys_sigreturn() is TM unaware.  Therefore, if we take a 32 bit signal
without SIGINFO (non RT) inside a transaction, on signal return we don't
restore the signal frame correctly.

This checks if the signal frame being restoring is an active transaction, and
if so, it copies the additional state to ptregs so it can be restored.

Signed-off-by: Michael Neuling <mikey@neuling.org>
cc: stable@vger.kernel.org (v3.9+)
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:05:18 +10:00
Michael Neuling 1d25f11fdb powerpc/tm: Fix writing top half of MSR on 32 bit signals
The MSR TM controls are in the top 32 bits of the MSR hence on 32 bit signals,
we stick the top half of the MSR in the checkpointed signal context so that the
user can access it.

Unfortunately, we don't currently write anything to the checkpointed signal
context when coming in a from a non transactional process and hence the top MSR
bits can contain junk.

This updates the 32 bit signal handling code to always write something to the
top MSR bits so that users know if the process is transactional or not and the
kernel can use it on signal return.

Signed-off-by: Michael Neuling <mikey@neuling.org>
cc: stable@vger.kernel.org (v3.9+)
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:05:15 +10:00
Benjamin Herrenschmidt 968219fa33 powerpc/8xx: Remove 8xx specific "minimal FPU emulation"
This is duplicated code from math-emu and implements such a small
subset of the FPU (load/stores/fmr) that it's essentially pointless
nowdays.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:05:12 +10:00
Benjamin Herrenschmidt 4e63f8edfe powerpc/math-emu: Allow math-emu to be used for HW FPU
(Including 64-bit ones)

This allow SW emulation by the kernel of optional instructions
such as fsqrt which aren't implemented on some processors, and
thus fixes some Fedora 19 issues such as Anaconda since the
compiler is set to generate those by default on 64-bit.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:05:09 +10:00
Benjamin Herrenschmidt 04ae900171 powerpc/math-emu: Fix decoding of some instructions
The decoding of some instructions such as fsqrt{s} was incorrect,
using the wrong registers, and thus could not work.

This fixes it and also adds a couple of place holders for missing
instructions.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:05:05 +10:00
Aruna Balakrishnaiah a5e4797b0f powerpc/pseries: Read common partition via pstore
This patch exploits pstore subsystem to read details of common partition
in NVRAM to a separate file in /dev/pstore. For instance, common partition
details will be stored in a file named [common-nvram-6].

Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com>
Reviewed-by: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:05:02 +10:00
Aruna Balakrishnaiah f33f748c96 powerpc/pseries: Read of-config partition via pstore
This patch set exploits the pstore subsystem to read details of
of-config partition in NVRAM to a separate file in /dev/pstore.
For instance, of-config partition details will be stored in a
file named [of-nvram-5].

Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com>
Reviewed-by: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:04:59 +10:00
Aruna Balakrishnaiah edf38465a3 powerpc/pseries: Distinguish between a os-partition and non-os partition
Introduce os_partition member in nvram_os_partition structure to identify
if the partition is an os partition or not. This will be useful to handle
non-os partitions of-config and common.

Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com>
Reviewed-by: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:04:56 +10:00
Aruna Balakrishnaiah 69020eea97 powerpc/pseries: Read rtas partition via pstore
This patch set exploits the pstore subsystem to read details of rtas partition
in NVRAM to a separate file in /dev/pstore. For instance, rtas details will be
stored in a file named [rtas-nvram-4].

Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com>
Reviewed-by: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:04:52 +10:00
Aruna Balakrishnaiah d7563c94f7 powerpc/pseries: Read/Write oops nvram partition via pstore
IBM's p series machines provide persistent storage for LPARs through NVRAM.
NVRAM's lnx,oops-log partition is used to log oops messages.
Currently the kernel provides the contents of p-series NVRAM only as a
simple stream of bytes via /dev/nvram, which must be interpreted in user
space by the nvram command in the powerpc-utils package.

This patch set exploits the pstore subsystem to expose oops partition in
NVRAM as a separate file in /dev/pstore. For instance, Oops messages will be
stored in a file named [dmesg-nvram-2]. In case pstore registration fails it
will fall back to kmsg_dump mechanism.

This patch will read/write the oops messages from/to this partition via pstore.

Signed-off-by: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:04:49 +10:00
Aruna Balakrishnaiah 126746101e powerpc/pseries: Introduce generic read function to read nvram-partitions
Introduce generic read function to read nvram partitions other than rtas.
nvram_read_error_log will be retained which is used to read rtas partition
from rtasd. nvram_read_partition is the generic read function to read from
any nvram partition.

Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com>
Reviewed-by: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:04:46 +10:00
Aruna Balakrishnaiah b1f70e1f72 powerpc/pseries: Add version and timestamp to oops header
Introduce version and timestamp information in the oops header.
oops_log_info (oops header) holds version (to distinguish between old
and new format oops header), length of the oops text
(compressed or uncompressed) and timestamp.

The version field will sit in the same place as the length in old
headers. version is assigned 5000 (greater than oops partition size)
so that existing tools will refuse to dump new style partitions as
the length is too large. The updated tools will work with both
old and new format headers.

Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com>
Reviewed-by: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:04:43 +10:00
Aruna Balakrishnaiah 1bf247f8df powerpc/pseries: Remove syslog prefix in uncompressed oops text
Removal of syslog prefix in the uncompressed oops text will
help in capturing more oops data.

Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com>
Reviewed-by: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:04:39 +10:00
Gavin Shan 2d5c121678 powerpc/eeh: Enhance converting EEH dev
Under some special circumstances, the EEH device doesn't have the
associated device tree node or PCI device. The patch enhances those
functions converting EEH device to device tree node or PCI device
accordingly to avoid unnecessary system crash.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:04:36 +10:00
Gavin Shan 5fb621698e powerpc/eeh: Fix fetching bus for single-dev-PE
While running Linux as guest on top of phyp, we possiblly have
PE that includes single PCI device. However, we didn't return
its PCI bus correctly and it leads to failure on recovery from
EEH errors for single-dev-PE. The patch fixes the issue.

Cc: <stable@vger.kernel.org> # v3.7+
Cc: Steve Best <sbest@us.ibm.com>
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:04:33 +10:00
Anton Blanchard 475e68cfdd powerpc: Align thread->fpr to 16 bytes
On newer CPUs we use VSX loads and stores to the thread->fpr array.
For best performance we need to ensure 16 byte alignment.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:04:30 +10:00
liguang 1b7e0cbe49 powerpc/pseries: Use 'true' instead of '1' for orderly_poweroff
orderly_poweroff is expecting a bool parameter, so
use 'true' instead '1'

Signed-off-by: liguang <lig.fnst@cn.fujitsu.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:04:26 +10:00
liguang a5b45ded09 powerpc/smp: Use '==' instead of '<' for system_state
'system_state < SYSTEM_RUNNING' will have same effect
with 'system_state == SYSTEM_BOOTING', but the later
one is more clearer.

Signed-off-by: liguang <lig.fnst@cn.fujitsu.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:04:23 +10:00
Bharat Bhushan 13d543cd79 powerpc: Restore dbcr0 on user space exit
On BookE (Branch taken + Single Step) is as same as Branch Taken
on BookS and in Linux we simulate BookS behavior for BookE as well.
When doing so, in Branch taken handling we want to set DBCR0_IC but
we update the current->thread->dbcr0 and not DBCR0.

Now on 64bit the current->thread.dbcr0 (and other debug registers)
is synchronized ONLY on context switch flow. But after handling
Branch taken in debug exception if we return back to user space
without context switch then single stepping change (DBCR0_ICMP)
does not get written in h/w DBCR0 and Instruction Complete exception
does not happen.

This fixes using ptrace reliably on BookE-PowerPC

lmbench latency test (lat_syscall) Results are (they varies a little
on each run)

1) ./lat_syscall <action> /dev/shm/uImage

action:	Open	read	write	stat	fstat	null
Before:	3.8618	0.2017	0.2851	1.6789	0.2256	0.0856
After:	3.8580	0.2017	0.2851	1.6955	0.2255	0.0856

1) ./lat_syscall -P 2 -N 10 <action> /dev/shm/uImage
action:	Open	read	write	stat	fstat	null
Before:	4.1388	0.2238	0.3066	1.7106	0.2256	0.0856
After:	4.1413	0.2236	0.3062	1.7107	0.2256	0.0856

[ Slightly modified to avoid extra branch in the fast path
  on Book3S and fix build on all non-BookE 64-bit -- BenH
]

Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:04:19 +10:00
Bharat Bhushan d8899bb2be powerpc: Debug control and status registers are 32bit
Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:04:16 +10:00
Alexey Kardashevskiy 5b25199eff powerpc/vfio: Enable on pSeries platform
The enables VFIO on the pSeries platform, enabling user space
programs to access PCI devices directly.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 16:55:15 +10:00
Alexey Kardashevskiy 4e13c1ac6b powerpc/vfio: Enable on PowerNV platform
This initializes IOMMU groups based on the IOMMU configuration
discovered during the PCI scan on POWERNV (POWER non virtualized)
platform.  The IOMMU groups are to be used later by the VFIO driver,
which is used for PCI pass through.

It also implements an API for mapping/unmapping pages for
guest PCI drivers and providing DMA window properties.
This API is going to be used later by QEMU-VFIO to handle
h_put_tce hypercalls from the KVM guest.

The iommu_put_tce_user_mode() does only a single page mapping
as an API for adding many mappings at once is going to be
added later.

Although this driver has been tested only on the POWERNV
platform, it should work on any platform which supports
TCE tables.  As h_put_tce hypercall is received by the host
kernel and processed by the QEMU (what involves calling
the host kernel again), performance is not the best -
circa 220MB/s on 10Gb ethernet network.

To enable VFIO on POWER, enable SPAPR_TCE_IOMMU config
option and configure VFIO as required.

Cc: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 16:55:14 +10:00
Alistair Popple ab9a4183fd powerpc: Update currituck pci/usb fixup for new board revision
The currituck board uses a different IRQ for the pci usb host
controller depending on the board revision. This patch adds support
for newer board revisions by retrieving the board revision from the
FPGA and mapping the appropriate IRQ.

Signed-off-by: Alistair Popple <alistair@popple.id.au>
Acked-by: Tony Breeds <tony@bakeyournoodle.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 16:55:13 +10:00
Michael Neuling 70a54a4fae powerpc: Fix single step emulation of 32bit overflowed branches
Check truncate_if_32bit() on final write to nip.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 16:55:13 +10:00
Alistair Popple b9ef7d6b11 powerpc: Update default configurations
Update default configurations for systems with CONFIG_BOOTX_TEXT
selected so that they continue to print early debug messages as is
currently the case.

Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 16:55:12 +10:00
Alistair Popple 071df9422a powerpc: Add a configuration option for early BootX/OpenFirmware debug
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 16:55:12 +10:00
Jeremy Kerr 0962e8004e powerpc/prom: Scan reserved-ranges node for memory reservations
Based on benh's proposal at
https://lists.ozlabs.org/pipermail/linuxppc-dev/2012-September/101237.html,
this change provides support for reserving memory from the
reserved-ranges node at the root of the device tree.

We just call memblock_reserve on these ranges for now.

Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 16:55:11 +10:00
Daniel Walker d5d8ec895c powerpc/mm: Make mmap_64.c compile on 32bit powerpc
There appears to be no good reason to keep this as 64bit only. It works
on 32bit also, and has checks so that it can work correctly with 32bit
binaries on 64bit hardware which is why I think this works.

I tested this on qemu using the virtex-ml507 machine type.

Before,

/bin2 # ./test & cat /proc/${!}/maps
00100000-00103000 r-xp 00000000 00:00 0          [vdso]
10000000-10007000 r-xp 00000000 00:01 454        /bin2/test
10017000-10018000 rw-p 00007000 00:01 454        /bin2/test
48000000-48020000 r-xp 00000000 00:01 224        /lib/ld-2.11.3.so
48021000-48023000 rw-p 00021000 00:01 224        /lib/ld-2.11.3.so
bfd03000-bfd24000 rw-p 00000000 00:00 0          [stack]
/bin2 # ./test & cat /proc/${!}/maps
00100000-00103000 r-xp 00000000 00:00 0          [vdso]
0fe6e000-0ffd8000 r-xp 00000000 00:01 214        /lib/libc-2.11.3.so
0ffd8000-0ffe8000 ---p 0016a000 00:01 214        /lib/libc-2.11.3.so
0ffe8000-0ffed000 rw-p 0016a000 00:01 214        /lib/libc-2.11.3.so
0ffed000-0fff0000 rw-p 00000000 00:00 0
10000000-10007000 r-xp 00000000 00:01 454        /bin2/test
10017000-10018000 rw-p 00007000 00:01 454        /bin2/test
48000000-48020000 r-xp 00000000 00:01 224        /lib/ld-2.11.3.so
48020000-48021000 rw-p 00000000 00:00 0
48021000-48023000 rw-p 00021000 00:01 224        /lib/ld-2.11.3.so
bf98a000-bf9ab000 rw-p 00000000 00:00 0          [stack]
/bin2 # ./test & cat /proc/${!}/maps
00100000-00103000 r-xp 00000000 00:00 0          [vdso]
0fe6e000-0ffd8000 r-xp 00000000 00:01 214        /lib/libc-2.11.3.so
0ffd8000-0ffe8000 ---p 0016a000 00:01 214        /lib/libc-2.11.3.so
0ffe8000-0ffed000 rw-p 0016a000 00:01 214        /lib/libc-2.11.3.so
0ffed000-0fff0000 rw-p 00000000 00:00 0
10000000-10007000 r-xp 00000000 00:01 454        /bin2/test
10017000-10018000 rw-p 00007000 00:01 454        /bin2/test
48000000-48020000 r-xp 00000000 00:01 224        /lib/ld-2.11.3.so
48020000-48021000 rw-p 00000000 00:00 0
48021000-48023000 rw-p 00021000 00:01 224        /lib/ld-2.11.3.so
bfa54000-bfa75000 rw-p 00000000 00:00 0          [stack]

After,

bash-4.1# ./test & cat /proc/${!}/maps
[7] 803
00100000-00103000 r-xp 00000000 00:00 0          [vdso]
10000000-10007000 r-xp 00000000 00:01 454        /bin2/test
10017000-10018000 rw-p 00007000 00:01 454        /bin2/test
b7eb0000-b7ed0000 r-xp 00000000 00:01 224        /lib/ld-2.11.3.so
b7ed1000-b7ed3000 rw-p 00021000 00:01 224        /lib/ld-2.11.3.so
bfbc0000-bfbe1000 rw-p 00000000 00:00 0          [stack]
bash-4.1# ./test & cat /proc/${!}/maps
[8] 805
00100000-00103000 r-xp 00000000 00:00 0          [vdso]
10000000-10007000 r-xp 00000000 00:01 454        /bin2/test
10017000-10018000 rw-p 00007000 00:01 454        /bin2/test
b7b03000-b7b23000 r-xp 00000000 00:01 224        /lib/ld-2.11.3.so
b7b24000-b7b26000 rw-p 00021000 00:01 224        /lib/ld-2.11.3.so
bfc27000-bfc48000 rw-p 00000000 00:00 0          [stack]
bash-4.1# ./test & cat /proc/${!}/maps
[9] 807
00100000-00103000 r-xp 00000000 00:00 0          [vdso]
10000000-10007000 r-xp 00000000 00:01 454        /bin2/test
10017000-10018000 rw-p 00007000 00:01 454        /bin2/test
b7f37000-b7f57000 r-xp 00000000 00:01 224        /lib/ld-2.11.3.so
b7f58000-b7f5a000 rw-p 00021000 00:01 224        /lib/ld-2.11.3.so
bff96000-bffb7000 rw-p 00000000 00:00 0          [stack]

Signed-off-by: Daniel Walker <dwalker@fifo90.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 16:55:11 +10:00
Kevin Hao 3139b0a797 powerpc: Remove the unneeded trigger of decrementer interrupt in decrementer_check_overflow
Previously in order to handle the edge sensitive decrementers,
we choose to set the decrementer to 1 to trigger a decrementer
interrupt when re-enabling interrupts. But with the rework of the
lazy EE, we would replay the decrementer interrupt when re-enabling
interrupts if a decrementer interrupt occurs with irq soft-disabled.
So there is no need to trigger a decrementer interrupt in this case
any more.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 16:55:10 +10:00
Scott Wood 39a421ff0b powerpc/mm/nohash: Ignore NULL stale_map entries
This happens with threads that are offline due to CPU hotplug
(including threads that were never "plugged in" to begin with because
SMT is disabled).

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 16:55:10 +10:00
Suzuki K. Poulose 35fd219a26 powerpc: Move the single step enable code to a generic path
This patch moves the single step enable code used by kprobe to a generic
routine header so that, it can be re-used by other code, in this case,
uprobes. No functional changes.

Signed-off-by: Suzuki K. Poulose <suzuki@in.ibm.com>
Cc:	Ananth N Mavinakaynahalli <ananth@in.ibm.com>
Cc:	Kumar Gala <galak@kernel.crashing.org>
Cc:	linuxppc-dev@ozlabs.org
Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 16:55:09 +10:00
Suzuki K. Poulose 85f395c5b0 powerpc/kprobes: Do not disable External interrupts during single step
External/Decrement exceptions have lower priority than the Debug Exception.
So, we don't have to disable the External interrupts before a single step.
However, on BookE, Critical Input Exception(CE) has higher priority than a
Debug Exception. Hence we mask them.

Signed-off-by: 	Suzuki K. Poulose <suzuki@in.ibm.com>
Cc:		Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc:		Ananth N Mavinakaynahalli <ananth@in.ibm.com>
Cc:		Kumar Gala <galak@kernel.crashing.org>
Cc:		linuxppc-dev@ozlabs.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 16:55:09 +10:00
Thomas Gleixner e80034047b powerpc: Mark low level irq handlers NO_THREAD
These low level handlers cannot be threaded. Mark them NO_THREAD

Reported-by: leroy christophe <christophe.leroy@c-s.fr>
Tested-by: leroy christophe <christophe.leroy@c-s.fr>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 16:55:08 +10:00
Aneesh Kumar K.V 8bbd9f04b7 powerpc: Fix bad pmd error with book3E config
Book3E uses the hugepd at PMD level and don't encode pte directly
at the pmd level. So it will find the lower bits of pmd set
and the pmd_bad check throws error. Infact the current code
will never take the free_hugepd_range call at all because it will
clear the pmd if it find a hugepd pointer. Fix this by clearing
bad pmd only if it is not a hugepd pointer.

This is regression introduced by e2b3d202d1
"powerpc: Switch 16GB and 16MB explicit hugepages to a different page table format"

Reported-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 15:25:21 +10:00
David S. Miller d98cae64e4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/wireless/ath/ath9k/Kconfig
	drivers/net/xen-netback/netback.c
	net/batman-adv/bat_iv_ogm.c
	net/wireless/nl80211.c

The ath9k Kconfig conflict was a change of a Kconfig option name right
next to the deletion of another option.

The xen-netback conflict was overlapping changes involving the
handling of the notify list in xen_netbk_rx_action().

Batman conflict resolution provided by Antonio Quartulli, basically
keep everything in both conflict hunks.

The nl80211 conflict is a little more involved.  In 'net' we added a
dynamic memory allocation to nl80211_dump_wiphy() to fix a race that
Linus reported.  Meanwhile in 'net-next' the handlers were converted
to use pre and post doit handlers which use a flag to determine
whether to hold the RTNL mutex around the operation.

However, the dump handlers to not use this logic.  Instead they have
to explicitly do the locking.  There were apparent bugs in the
conversion of nl80211_dump_wiphy() in that we were not dropping the
RTNL mutex in all the return paths, and it seems we very much should
be doing so.  So I fixed that whilst handling the overlapping changes.

To simplify the initial returns, I take the RTNL mutex after we try
to allocate 'tb'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-19 16:49:39 -07:00
Ingo Molnar b1fe9987b7 Merge branch 'rcu/next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu
Pull RCU changes from Paul E. McKenney:

"The major changes for this series are:

 1.      Simplify RCU's grace-period and callback processing based on
         the new numbering for callbacks.  These were posted to LKML at
         https://lkml.org/lkml/2013/5/20/330.

 2.      Documentation updates.  These were posted to LKML at
         https://lkml.org/lkml/2013/5/20/348.

 3.      Miscellaneous fixes, including converting a few remaining printk()
         calls to pr_*().  These were posted to LKML at
         https://lkml.org/lkml/2013/5/20/324.

 4.      SRCU-related changes and fixes.  These were posted to LKML at
         https://lkml.org/lkml/2013/5/20/425.

 5.      Removal of TINY_PREEMPT_RCU in favor of TREE_PREEMPT_RCU for
         single-CPU low-latency systems.  These were posted to LKML at
         https://lkml.org/lkml/2013/5/20/427."

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-19 13:48:57 +02:00
Viresh Kumar 0a0fca9d83 sched: Rename sched.c as sched/core.c in comments and Documentation
Most of the stuff from kernel/sched.c was moved to kernel/sched/core.c long time
back and the comments/Documentation never got updated.

I figured it out when I was going through sched-domains.txt and so thought of
fixing it globally.

I haven't crossed check if the stuff that is referenced in sched/core.c by all
these files is still present and hasn't changed as that wasn't the motive behind
this patch.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/cdff76a265326ab8d71922a1db5be599f20aad45.1370329560.git.viresh.kumar@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-19 12:58:42 +02:00
Scott Wood f8941fbe20 kvm/ppc/booke: Delay kvmppc_lazy_ee_enable
kwmppc_lazy_ee_enable() should be called as late as possible,
or else we get things like WARN_ON(preemptible()) in enable_kernel_fp()
in configurations where preemptible() works.

Note that book3s_pr already waits until just before __kvmppc_vcpu_run
to call kvmppc_lazy_ee_enable().

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-19 12:15:13 +02:00
Greg Kroah-Hartman bb07b00be7 Merge 3.10-rc6 into driver-core-next
We want these fixes here too.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-06-17 16:57:20 -07:00
Eliezer Tamir dafcc4380d net: add socket option for low latency polling
adds a socket option for low latency polling.
This allows overriding the global sysctl value with a per-socket one.
Unexport sysctl_net_ll_poll since for now it's not needed in modules.

Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 15:48:14 -07:00
Greg Kroah-Hartman bf32d52c45 Merge 3.10-rc6 into tty-next
We want the changes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-06-17 12:00:22 -07:00
Linus Torvalds 5938930e71 Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Pull powerpc fixes from Benjamin Herrenschmidt:
 "So here are 3 fixes still for 3.10.  Fixes are simple, bugs are nasty
  (though not recent regressions, nasty enough) and all targeted at
  stable"

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc: Fix missing/delayed calls to irq_work
  powerpc: Fix emulation of illegal instructions on PowerNV platform
  powerpc: Fix stack overflow crash in resume_kernel when ftracing
2013-06-14 19:25:04 -10:00
Benjamin Herrenschmidt 230b303479 powerpc: Fix missing/delayed calls to irq_work
When replaying interrupts (as a result of the interrupt occurring
while soft-disabled), in the case of the decrementer, we are exclusively
testing for a pending timer target. However we also use decrementer
interrupts to trigger the new "irq_work", which in this case would
be missed.

This change the logic to force a replay in both cases of a timer
boundary reached and a decrementer interrupt having actually occurred
while disabled. The former test is still useful to catch cases where
a CPU having been hard-disabled for a long time completely misses the
interrupt due to a decrementer rollover.

CC: <stable@vger.kernel.org> [v3.4+]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tested-by: Steven Rostedt <rostedt@goodmis.org>
2013-06-15 12:33:30 +10:00
Paul Mackerras bf593907f7 powerpc: Fix emulation of illegal instructions on PowerNV platform
Normally, the kernel emulates a few instructions that are unimplemented
on some processors (e.g. the old dcba instruction), or privileged (e.g.
mfpvr).  The emulation of unimplemented instructions is currently not
working on the PowerNV platform.  The reason is that on these machines,
unimplemented and illegal instructions cause a hypervisor emulation
assist interrupt, rather than a program interrupt as on older CPUs.
Our vector for the emulation assist interrupt just calls
program_check_exception() directly, without setting the bit in SRR1
that indicates an illegal instruction interrupt.  This fixes it by
making the emulation assist interrupt set that bit before calling
program_check_interrupt().  With this, old programs that use no-longer
implemented instructions such as dcba now work again.

CC: <stable@vger.kernel.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-15 12:24:11 +10:00
Michael Ellerman 0e37739b1c powerpc: Fix stack overflow crash in resume_kernel when ftracing
It's possible for us to crash when running with ftrace enabled, eg:

  Bad kernel stack pointer bffffd12 at c00000000000a454
  cpu 0x3: Vector: 300 (Data Access) at [c00000000ffe3d40]
      pc: c00000000000a454: resume_kernel+0x34/0x60
      lr: c00000000000335c: performance_monitor_common+0x15c/0x180
      sp: bffffd12
     msr: 8000000000001032
     dar: bffffd12
   dsisr: 42000000

If we look at current's stack (paca->__current->stack) we see it is
equal to c0000002ecab0000. Our stack is 16K, and comparing to
paca->kstack (c0000002ecab3e30) we can see that we have overflowed our
kernel stack. This leads to us writing over our struct thread_info, and
in this case we have corrupted thread_info->flags and set
_TIF_EMULATE_STACK_STORE.

Dumping the stack we see:

  3:mon> t c0000002ecab0000
  [c0000002ecab0000] c00000000002131c .performance_monitor_exception+0x5c/0x70
  [c0000002ecab0080] c00000000000335c performance_monitor_common+0x15c/0x180
  --- Exception: f01 (Performance Monitor) at c0000000000fb2ec .trace_hardirqs_off+0x1c/0x30
  [c0000002ecab0370] c00000000016fdb0 .trace_graph_entry+0xb0/0x280 (unreliable)
  [c0000002ecab0410] c00000000003d038 .prepare_ftrace_return+0x98/0x130
  [c0000002ecab04b0] c00000000000a920 .ftrace_graph_caller+0x14/0x28
  [c0000002ecab0520] c0000000000d6b58 .idle_cpu+0x18/0x90
  [c0000002ecab05a0] c00000000000a934 .return_to_handler+0x0/0x34
  [c0000002ecab0620] c00000000001e660 .timer_interrupt+0x160/0x300
  [c0000002ecab06d0] c0000000000025dc decrementer_common+0x15c/0x180
  --- Exception: 901 (Decrementer) at c0000000000104d4 .arch_local_irq_restore+0x74/0xa0
  [c0000002ecab09c0] c0000000000fe044 .trace_hardirqs_on+0x14/0x30 (unreliable)
  [c0000002ecab0fb0] c00000000016fe3c .trace_graph_entry+0x13c/0x280
  [c0000002ecab1050] c00000000003d038 .prepare_ftrace_return+0x98/0x130
  [c0000002ecab10f0] c00000000000a920 .ftrace_graph_caller+0x14/0x28
  [c0000002ecab1160] c0000000000161f0 .__ppc64_runlatch_on+0x10/0x40
  [c0000002ecab11d0] c00000000000a934 .return_to_handler+0x0/0x34
  --- Exception: 901 (Decrementer) at c0000000000104d4 .arch_local_irq_restore+0x74/0xa0

  ... and so on

__ppc64_runlatch_on() is called from RUNLATCH_ON in the exception entry
path. At that point the irq state is not consistent, ie. interrupts are
hard disabled (by the exception entry), but the paca soft-enabled flag
may be out of sync.

This leads to the local_irq_restore() in trace_graph_entry() actually
enabling interrupts, which we do not want. Because we have not yet
reprogrammed the decrementer we immediately take another decrementer
exception, and recurse.

The fix is twofold. Firstly make sure we call DISABLE_INTS before
calling RUNLATCH_ON. The badly named DISABLE_INTS actually reconciles
the irq state in the paca with the hardware, making it safe again to
call local_irq_save/restore().

Although that should be sufficient to fix the bug, we also mark the
runlatch routines as notrace. They are called very early in the
exception entry and we are asking for trouble tracing them. They are
also fairly uninteresting and tracing them just adds unnecessary
overhead.

[ This regression was introduced by fe1952fc0a
  "powerpc: Rework runlatch code" by myself --BenH
]

CC: <stable@vger.kernel.org> [v3.4+]
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-15 12:21:57 +10:00
Bjorn Helgaas df58f46c0f Merge branch 'pci/jiang-bus-lock-v3' into next
* pci/jiang-bus-lock-v3:
  PCI: Return early on allocation failures to unindent mainline code
  PCI: Simplify IOV implementation and fix reference count races
  PCI: Drop redundant setting of bus->is_added in virtfn_add_bus()
  unicore32/PCI: Remove redundant call of pci_bus_add_devices()
  m68k/PCI: Remove redundant call of pci_bus_add_devices()
  PCI: Rename pci_release_bus_bridge_dev() to pci_release_host_bridge_dev()
  PCI: Fix refcount issue in pci_create_root_bus() error recovery path
  ia64/PCI: Clean up pci_scan_root_bus() usage
  PCI: Convert alloc_pci_dev(void) to pci_alloc_dev(bus)
  PCI: Introduce pci_alloc_dev(struct pci_bus*) to replace alloc_pci_dev()
  PCI: Introduce pci_bus_{get|put}() to manage PCI bus reference count

Conflicts:
	drivers/pci/probe.c
2013-06-14 17:47:46 -06:00
Rob Herring c45640e4a9 ibmebus: convert of_platform_driver to platform_driver
ibmebus is the last remaining user of of_platform_driver and the
conversion to a regular platform driver is trivial.

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Grant Likely <grant.likely@linaro.org>
2013-06-12 12:37:26 +01:00
Linus Torvalds af180b81a3 Merge branch 'fixes' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm bugfixes from Gleb Natapov:
 "There is one more fix for MIPS KVM ABI here, MIPS and PPC build
  breakage fixes and a couple of PPC bug fixes"

* 'fixes' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  kvm/ppc/booke64: Fix lazy ee handling in kvmppc_handle_exit()
  kvm/ppc/booke: Hold srcu lock when calling gfn functions
  kvm/ppc/booke64: Disable e6500 support
  kvm/ppc/booke64: Fix AltiVec interrupt numbers and build breakage
  mips/kvm: Use KVM_REG_MIPS and proper size indicators for *_ONE_REG
  kvm: Add definition of KVM_REG_MIPS
  KVM: add kvm_para_available to asm-generic/kvm_para.h
2013-06-11 11:16:43 -07:00
Scott Wood 7c11c0ccc7 kvm/ppc/booke64: Fix lazy ee handling in kvmppc_handle_exit()
EE is hard-disabled on entry to kvmppc_handle_exit(), so call
hard_irq_disable() so that PACA_IRQ_HARD_DIS is set, and soft_enabled
is unset.

Without this, we get warnings such as arch/powerpc/kernel/time.c:300,
and sometimes host kernel hangs.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-06-11 11:11:00 +03:00
Scott Wood f1e89028f0 kvm/ppc/booke: Hold srcu lock when calling gfn functions
KVM core expects arch code to acquire the srcu lock when calling
gfn_to_memslot and similar functions.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-06-11 11:10:59 +03:00
Scott Wood 2b6398fcf2 kvm/ppc/booke64: Disable e6500 support
The previous patch made 64-bit booke KVM build again, but Altivec
support is still not complete, and we can't prevent the guest from
turning on Altivec (which can corrupt host state until state
save/restore is implemented).  Disable e6500 on KVM until this is
fixed.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-06-11 11:10:56 +03:00
Mihai Caraman 4edd1ae91b kvm/ppc/booke64: Fix AltiVec interrupt numbers and build breakage
Interrupt numbers defined for Book3E follows IVORs definition. Align
BOOKE_INTERRUPT_ALTIVEC_UNAVAIL and BOOKE_INTERRUPT_ALTIVEC_ASSIST to this
rule which also fixes the build breakage.
IVORs 32 and 33 are shared so reflect this in the interrupts naming.

This fixes a build break for 64-bit booke KVM.

Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-06-11 11:10:49 +03:00
Lai Jiangshan 505d6421bd powerpc,kvm: fix imbalance srcu_read_[un]lock()
At the point of up_out label in kvmppc_hv_setup_htab_rma(),
srcu read lock is still held.

We have to release it before return.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Alexander Graf <agraf@suse.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: kvm@vger.kernel.org
Cc: kvm-ppc@vger.kernel.org
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-06-10 13:45:26 -07:00
Grant Likely fa40f37757 irqdomain: Clean up aftermath of irq_domain refactoring
After refactoring the irqdomain code, there are a number of API
functions that are merely empty wrappers around core code. Drop those
wrappers out of the C file and replace them with static inlines in the
header.

Signed-off-by: Grant Likely <grant.likely@linaro.org>
2013-06-10 11:52:09 +01:00
Michael Ellerman b11ae95100 powerpc: Partial revert of "Context switch more PMU related SPRs"
In commit 59affcd I added context switching of more PMU SPRs, because
they are potentially exposed to userspace on Power8. However despite me
being a smart arse in the commit message it's actually not correct. In
particular it interacts badly with a global perf record.

We will have to do something more complicated, but that will have to
wait for 3.11.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-10 08:36:35 +10:00
Michael Ellerman 6772faa1ba powerpc/perf: Fix deadlock caused by calling printk() in PMU exception
In commit bc09c21 "Fix finding overflowed PMC in interrupt" we added
a printk() to the PMU exception handler. Unfortunately that is not safe.

The problem is that the PMU exception may run even when interrupts are
soft disabled, aka NMI context. We do this so that we can profile parts
of the kernel that have interrupts soft-disabled.

But by calling printk() from the exception handler, we can potentially
deadlock in the printk code on logbuf_lock, eg:

  [c00000038ba575c0] c000000000081928 .vprintk_emit+0xa8/0x540
  [c00000038ba576a0] c0000000007bcde8 .printk+0x48/0x58
  [c00000038ba57710] c000000000076504 .perf_event_interrupt+0x2d4/0x490
  [c00000038ba57810] c00000000001f6f8 .performance_monitor_exception+0x48/0x60
  [c00000038ba57880] c0000000000032cc performance_monitor_common+0x14c/0x180
  --- Exception: f01 (Performance Monitor) at c0000000007b25d4 ._raw_spin_lock_irq
  +0x64/0xc0
  [c00000038ba57bf0] c00000000007ed90 .devkmsg_read+0xd0/0x5a0
  [c00000038ba57d00] c0000000001c2934 .vfs_read+0xc4/0x1e0
  [c00000038ba57d90] c0000000001c2cd8 .SyS_read+0x58/0xd0
  [c00000038ba57e30] c000000000009d54 syscall_exit+0x0/0x98
  --- Exception: c01 (System Call) at 00001fffffbf6f7c
  SP (3ffff6d4de10) is in userspace

Fix it by making sure we only call printk() when we are not in NMI
context.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Cc: <stable@vger.kernel.org> # 3.9
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-10 08:36:32 +10:00
Michael Neuling 82a9f16adc powerpc/hw_breakpoints: Add DABRX cpu feature to fix 32-bit regression
When introducing support for DABRX in 4474ef0, we broke older 32-bit CPUs
that don't have that register.

Some CPUs have a DABR but not DABRX.  Configuration are:
- No 32bit CPUs have DABRX but some have DABR.
- POWER4+ and below have the DABR but no DABRX.
- 970 and POWER5 and above have DABR and DABRX.
- POWER8 has DAWR, hence no DABRX.

This introduces CPU_FTR_DABRX and sets it on appropriate CPUs.  We use
the top 64 bits for CPU FTR bits since only 64 bit CPUs have this.

Processors that don't have the DABRX will still work as they will fall
back to software filtering these breakpoints via perf_exclude_event().

Signed-off-by: Michael Neuling <mikey@neuling.org>
Reported-by: "Gorelik, Jacob (335F)" <jacob.gorelik@jpl.nasa.gov>
cc: stable@vger.kernel.org (v3.9 only)
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-10 08:36:29 +10:00
Michael Neuling fb0fce3e55 powerpc/power8: Update denormalization handler
POWER8 can take a denormalisation exception on any VSX registers.

This does the extra 32 VSX registers we don't currently handle.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-10 08:36:26 +10:00
Michael Neuling d7c67fb1cf powerpc/pseries: Simplify denormalization handler
The following simplifies the denorm code by using macros to generate the long
stream of almost identical instructions.

This patch results in no changes to the output binary, but removes a lot of
lines of code.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-10 08:36:22 +10:00
Michael Neuling 6a60f9e7d8 powerpc/power8: Fix oprofile and perf
In 2ac6f42 powerpc/cputable: Fix oprofile_cpu_type on power8
we broke all power8 hw events.

This reverts this change and uses oprofile_type instead. Perf now works
on POWER8 again and oprofile will revert to using timers on POWER8.

Kudos to mpe this fix.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-10 08:36:19 +10:00
Gavin Shan b8b3de224f powerpc/eeh: Don't check RTAS token to get PE addr
RTAS token "ibm,get-config-addr-info" or ibm,get-config-addr-info2"
are used to retrieve the PE address according to PCI address, which
made up of domain/bus/slot/function. If we don't have those 2 tokens,
the domain/bus/slot/function would be used as the address for EEH
RTAS operations. Some older f/w might not have those 2 tokens and
that blocks the EEH functionality to be initialized. It was introduced
by commit e2af155c ("powerpc/eeh: pseries platform EEH initialization").

The patch skips the check on those 2 tokens so we can bring up EEH
functionality successfully. And domain/bus/slot/function will be
used as address for EEH RTAS operations.

Cc: <stable@vger.kernel.org> # v3.4+
Reported-by: Robert Knight <knight@princeton.edu>
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Tested-by: Robert Knight <knight@princeton.edu>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-10 08:36:16 +10:00
Kevin Hao c5df457ffe powerpc/pci: Check the bus address instead of resource address in pcibios_fixup_resources
If a BAR has the value of 0, we would assume that it is unset yet and
then mark the resource as unset and would reassign it later. But after
commit 6c5705fe (powerpc/PCI: get rid of device resource fixups)
the pcibios_fixup_resources is invoked after the bus address was
translated to linux resource. So the value of res->start is resource
address. And since the resource and bus address may be different, we
should translate it to the bus address before doing the check.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-10 08:36:13 +10:00
Greg Kroah-Hartman 1ba7055af2 Merge 3.10-rc5 into tty-next 2013-06-08 21:23:33 -07:00