__kuser_cmpxchg64 has a return path using bx lr to get back to the caller.
This is actually ok since the code in question is predicated on
CONFIG_CPU_32v6K, but for the sake of consistency using the usr_ret
macro is probably better.
Acked-by: Dave Martin <dave.martin@linaro.org>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Fix compilation failure, when Thumb support is not enabled:
arch/arm/kernel/entry-armv.S: Assembler messages:
arch/arm/kernel/entry-armv.S:501: Error: backward ref to unknown label "2:"
arch/arm/kernel/entry-armv.S:502: Error: backward ref to unknown label "3:"
make[2]: *** [arch/arm/kernel/entry-armv.o] Error 1
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Reviewed-by: Dave Martin <dave.martin@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Even when CONFIG_MULTI_IRQ_HANDLER is selected, the core code
requires the arch_irq_handler_default macro to be defined as
a fallback.
It turns out nobody is using that particular feature as both PXA
and shmobile have all their machine descriptors populated with
the interrupt handler, leaving unused code (or empty macros) in
their entry-macro.S file just to be able to compile entry-armv.S.
Make CONFIG_MULTI_IRQ_HANDLER exclusive wrt arch_irq_handler_default,
which allows to remove one test from the hot path. Also cleanup both
PXA and shmobile entry-macro.S.
Cc: Paul Mundt <lethal@linux-sh.org>
Acked-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Eric Miao <eric.y.miao@gmail.com>
Tested-by: Jamie Iles <jamie@jamieiles.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
When v6 and >=v7 boards are supported in the same kernel, the
__und_usr code currently makes a build-time assumption that Thumb-2
instructions occurring in userspace don't need to be supported.
Strictly speaking this is incorrect.
This patch fixes the above case by doing a run-time check on the
CPU architecture in these cases. This only affects kernels which
support v6 and >=v7 CPUs together: plain v6 and plain v7 kernels
are unaffected.
Signed-off-by: Dave Martin <dave.martin@linaro.org>
Reviewed-by: Jon Medhurst <tixy@yxit.co.uk>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
When testing whether a Thumb-2 instruction is 32 bits long or not,
the masking done in order to test bits 11-15 of the first
instruction halfword won't affect the result of the comparison, so
remove it.
Signed-off-by: Dave Martin <dave.martin@linaro.org>
Reviewed-by: Jon Medhurst <tixy@yxit.co.uk>
Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The SVC IRQ, prefetch and data abort handlers preserve the SPSR value
via r5 across the exception. Rather than re-loading it from pt_regs,
use the preserved value instead.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Tail-call the main C data abort handler code from the per-CPU helper
code. Update the comments in the code wrt the new calling and return
register state.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Tail-call the main C prefetch abort handler code from the per-CPU
helper code. Also note that the helper function becomes ABI
compliant in terms of the registers preserved.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This avoids the irq entry assembly corrupting r5, thereby allowing it
to be preserved through to the svc exit code.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
All handlers now call trace_hardirqs_off, so move this common code into
the (svc|usr)_entry assembler macros.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
As we no longer re-enable interrupts in these exception handlers, add
the irqsoff tracing calls to them so that the kernel tracks the state
more accurately.
Note that these calls are conditional on IRQSOFF_TRACER:
kernel ----------> user ---------> kernel
^ irqs enabled ^ irqs disabled
No kernel code can run on the local CPU until we've re-entered the
kernel through one of the exception handlers - and userspace can not
take any locks etc. So, the kernel doesn't care about the IRQ mask
state while userspace is running unless we're doing IRQ off latency
tracing. So, we can (and do) avoid the overhead of updating the IRQ
mask state on every kernel->user and user->kernel transition.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Add irqtrace function calls to the undefined exception handler, so
that we get sane lockdep traces from locking problems in undefined
exception handlers.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Avoid enabling interrupts if the parent context had interrupts enabled
in the abort handler assembly code, and move this into the breakpoint/
page/alignment fault handlers instead.
This gets rid of some special-casing for the breakpoint fault handlers
from the low level abort handler path.
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This avoids unnecessary instructions for CPUs which implement the IFAR
(instruction fault address register).
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This allows us to avoid moving registers twice to work around the
clobbered registers when we add calls to trace_hardirqs_{on,off}.
Ensure that all SVC handlers return with SPSR in r5 for consistency.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
There's no point checking to see whether IRQs were masked in the parent
context when returning from IRQ handling - the fact that we're handling
an IRQ means that the parent context must have had IRQs unmasked.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
irq_enter() and irq_exit() already take care of the preempt_count
handling for interrupts, which increment and decrement the hardirq
bits of the preempt count. So we can remove the preempt count handing
in our IRQ entry/exit assembly, like x86 did some 9 years ago.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Replace r4 with ip for calling abort helpers - ip is allowed to be
corrupted by called functions in the ABI, so it makes more sense to
use such a register.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Some user space applications are designed around the ability to perform
atomic operations on 64 bit values. Since this is natively possible
only with ARMv6k and above, let's provide a new kuser helper to perform
the operation with kernel supervision on pre ARMv6k hardware.
Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Tested-by: Dave Martin <dave.martin@linaro.org>
Digging into some assembly file in order to get information about the
kuser helpers is not that convivial. Let's move that information to
a better formatted file in Documentation/arm/ and improve on it a bit.
Thanks to Dave Martin <dave.martin@linaro.org> for the initial cleanup and
clarifications.
Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Acked-by: Dave Martin <dave.martin@linaro.org>
This patch fixes the lockdep warning of "unannotated irqs-off"[1].
After entering __irq_usr, arm core will disable interrupt automatically,
but __irq_usr does not annotate the irq disable, so lockdep may complain
the warning if it has chance to check this in irq handler.
This patch adds trace_hardirqs_off in __irq_usr before entering irq_handler
to handle the irq, also calls ret_to_user_from_irq to avoid calling
disable_irq again.
This is also a fix for irq off tracer.
[1], lockdep warning log of "unannotated irqs-off"
[ 13.804687] ------------[ cut here ]------------
[ 13.809570] WARNING: at kernel/lockdep.c:3335 check_flags+0x78/0x1d0()
[ 13.816467] Modules linked in:
[ 13.819732] Backtrace:
[ 13.822357] [<c01cb42c>] (dump_backtrace+0x0/0x100) from [<c06abb14>] (dump_stack+0x20/0x24)
[ 13.831268] r6:c07d8c2c r5:00000d07 r4:00000000 r3:00000000
[ 13.837280] [<c06abaf4>] (dump_stack+0x0/0x24) from [<c01ffc04>] (warn_slowpath_common+0x5c/0x74)
[ 13.846649] [<c01ffba8>] (warn_slowpath_common+0x0/0x74) from [<c01ffc48>] (warn_slowpath_null+0x2c/0x34)
[ 13.856781] r8:00000000 r7:00000000 r6:c18b8194 r5:60000093 r4:ef182000
[ 13.863708] r3:00000009
[ 13.866485] [<c01ffc1c>] (warn_slowpath_null+0x0/0x34) from [<c0237d84>] (check_flags+0x78/0x1d0)
[ 13.875823] [<c0237d0c>] (check_flags+0x0/0x1d0) from [<c023afc8>] (lock_acquire+0x4c/0x150)
[ 13.884704] [<c023af7c>] (lock_acquire+0x0/0x150) from [<c06af638>] (_raw_spin_lock+0x4c/0x84)
[ 13.893798] [<c06af5ec>] (_raw_spin_lock+0x0/0x84) from [<c01f9a44>] (sched_ttwu_pending+0x58/0x8c)
[ 13.903320] r6:ef92d040 r5:00000003 r4:c18b8180
[ 13.908233] [<c01f99ec>] (sched_ttwu_pending+0x0/0x8c) from [<c01f9a90>] (scheduler_ipi+0x18/0x1c)
[ 13.917663] r6:ef183fb0 r5:00000003 r4:00000000 r3:00000001
[ 13.923645] [<c01f9a78>] (scheduler_ipi+0x0/0x1c) from [<c01bc458>] (do_IPI+0x9c/0xfc)
[ 13.932006] [<c01bc3bc>] (do_IPI+0x0/0xfc) from [<c06b0888>] (__irq_usr+0x48/0xe0)
[ 13.939971] Exception stack(0xef183fb0 to 0xef183ff8)
[ 13.945281] 3fa0: ffffffc3 0001500c 00000001 0001500c
[ 13.953948] 3fc0: 00000050 400b45f0 400d9000 00000000 00000001 400d9600 6474e552 bea05b3c
[ 13.962585] 3fe0: 400d96c0 bea059c0 400b6574 400b65d8 20000010 ffffffff
[ 13.969573] r6:00000403 r5:fa240100 r4:ffffffff r3:20000010
[ 13.975585] ---[ end trace efc4896ab0fb62cb ]---
[ 13.980468] possible reason: unannotated irqs-off.
[ 13.985534] irq event stamp: 1610
[ 13.989044] hardirqs last enabled at (1610): [<c01c703c>] no_work_pending+0x8/0x2c
[ 13.997131] hardirqs last disabled at (1609): [<c01c7024>] ret_slow_syscall+0xc/0x1c
[ 14.005371] softirqs last enabled at (0): [<c01fe5e4>] copy_process+0x2cc/0xa24
[ 14.013183] softirqs last disabled at (0): [< (null)>] (null)
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This allows the cache/processor/fault glue to be more easily used
from assembler code. Tested on Assabet and Tegra 2.
Tested-by: Colin Cross <ccross@android.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Per subarch interrupt handler macros V3.
This patch breaks out code from the irq_handler macro
into arch_irq_handler and arch_irq_handler_default.
The macros are put in the header file "entry-macro-multi.S"
The arch_irq_handler_default macro is designed to be
used by irq_handler in entry-armv.S while arch_irq_handler
is suitable for per-subarch use.
Signed-off-by: Magnus Damm <damm@opensource.se>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Normally different ARM platform has different way to decode the IRQ
hardware status and demultiplex to the corresponding IRQ handler.
This is highly optimized by macro irq_handler in entry-armv.S, and
each machine defines their own macro to decode the IRQ number.
However, this prevents multiple machine classes to be built into a
single kernel.
By allowing each machine to specify thier own handler, and making
function pointer 'handle_arch_irq' to point to it at run time, this
can be solved. And introduce CONFIG_MULTI_IRQ_HANDLER to allow both
solutions to work.
Comparing with the highly optimized macro of irq_handler, the new
function must be written with care not to lose too much performance.
And the IPI stuff on SMP is expected to move to the provided arch
IRQ handler as well.
The assembly code to invoke handle_arch_irq is optimized by Russell
King.
Signed-off-by: Eric Miao <eric.miao@canonical.com>
Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* __fixup_smp_on_up has been modified with support for the
THUMB2_KERNEL case. For THUMB2_KERNEL only, fixups are split
into halfwords in case of misalignment, since we can't rely on
unaligned accesses working before turning the MMU on.
No attempt is made to optimise the aligned case, since the
number of fixups is typically small, and it seems best to keep
the code as simple as possible.
* Add a rotate in the fixup_smp code in order to support
CPU_BIG_ENDIAN, as suggested by Nicolas Pitre.
* Add an assembly-time sanity-check to ALT_UP() to ensure that
the content really is the right size (4 bytes).
(No check is done for ALT_SMP(). Possibly, this could be fixed
by splitting the two uses ot ALT_SMP() (ALT_SMP...SMP_UP versus
ALT_SMP...SMP_UP_B) into two macros. In the first case,
ALT_SMP needs to expand to >= 4 bytes, not == 4.)
* smp_mpidr.h (which implements ALT_SMP()/ALT_UP() manually due
to macro limitations) has not been modified: the affected
instruction (mov) has no 16-bit encoding, so the correct
instruction size is satisfied in this case.
* A "mode" parameter has been added to smp_dmb:
smp_dmb arm @ assumes 4-byte instructions (for ARM code, e.g. kuser)
smp_dmb @ uses W() to ensure 4-byte instructions for ALT_SMP()
This avoids assembly failures due to use of W() inside smp_dmb,
when assembling pure-ARM code in the vectors page.
There might be a better way to achieve this.
* Kconfig: make SMP_ON_UP depend on
(!THUMB2_KERNEL || !BIG_ENDIAN) i.e., THUMB2_KERNEL is now
supported, but only if !BIG_ENDIAN (The fixup code for Thumb-2
currently assumes little-endian order.)
Tested using a single generic realview kernel on:
ARM RealView PB-A8 (CONFIG_THUMB2_KERNEL={n,y})
ARM RealView PBX-A9 (SMP)
Signed-off-by: Dave Martin <dave.martin@linaro.org>
Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
On ARM, debug exceptions occur in the form of data or prefetch aborts.
One difference is that debug exceptions require access to per-cpu banked
registers and data structures which are not saved in the low-level exception
code. For kernels built with CONFIG_PREEMPT, there is an unlikely scenario
that the debug handler ends up running on a different CPU from the one
that originally signalled the event, resulting in random data being read
from the wrong registers.
This patch adds a debug_entry macro to the low-level exception handling
code which checks whether the taken exception is a debug exception. If
it is, the preempt count for the faulting process is incremented. After
the debug handler has finished, the count is decremented.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The existing code invokes the syscall with rubbish in r7,
due to what looks like an incorrect literal load idiom.
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Dave Martin <dave.martin@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This allows us to use smp_cross_call() to trigger a number of different
software generated interrupts, rather than combining them all on one
SGI. Recover the SGI number via do_IPI.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch removes the domain switching functionality via the set_fs and
__switch_to functions on cores that have a TLS register.
Currently, the ioremap and vmalloc areas share the same level 1 page
tables and therefore have the same domain (DOMAIN_KERNEL). When the
kernel domain is modified from Client to Manager (via the __set_fs or in
the __switch_to function), the XN (eXecute Never) bit is overridden and
newer CPUs can speculatively prefetch the ioremap'ed memory.
Linux performs the kernel domain switching to allow user-specific
functions (copy_to/from_user, get/put_user etc.) to access kernel
memory. In order for these functions to work with the kernel domain set
to Client, the patch modifies the LDRT/STRT and related instructions to
the LDR/STR ones.
The user pages access rights are also modified for kernel read-only
access rather than read/write so that the copy-on-write mechanism still
works. CPU_USE_DOMAINS gets disabled only if the hardware has a TLS register
(CPU_32v6K is defined) since writing the TLS value to the high vectors page
isn't possible.
The user addresses passed to the kernel are checked by the access_ok()
function so that they do not point to the kernel space.
Tested-by: Anton Vorontsov <cbouatmailru@gmail.com>
Cc: Tony Lindgren <tony@atomide.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
UP systems do not implement all the instructions that SMP systems have,
so in order to boot a SMP kernel on a UP system, we need to rewrite
parts of the kernel.
Do this using an 'alternatives' scheme, where the kernel code and data
is modified prior to initialization to replace the SMP instructions,
thereby rendering the problematical code ineffectual. We use the linker
to generate a list of 32-bit word locations and their replacement values,
and run through these replacements when we detect a UP system.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
CPU: Testing write buffer coherency: ok
------------[ cut here ]------------
WARNING: at kernel/lockdep.c:3145 check_flags+0xcc/0x1dc()
Modules linked in:
[<c0035120>] (unwind_backtrace+0x0/0xf8) from [<c0355374>] (dump_stack+0x20/0x24)
[<c0355374>] (dump_stack+0x20/0x24) from [<c0060c04>] (warn_slowpath_common+0x58/0x70)
[<c0060c04>] (warn_slowpath_common+0x58/0x70) from [<c0060c3c>] (warn_slowpath_null+0x20/0x24)
[<c0060c3c>] (warn_slowpath_null+0x20/0x24) from [<c008f224>] (check_flags+0xcc/0x1dc)
[<c008f224>] (check_flags+0xcc/0x1dc) from [<c00945dc>] (lock_acquire+0x50/0x140)
[<c00945dc>] (lock_acquire+0x50/0x140) from [<c0358434>] (_raw_spin_lock+0x50/0x88)
[<c0358434>] (_raw_spin_lock+0x50/0x88) from [<c00fd114>] (set_task_comm+0x2c/0x60)
[<c00fd114>] (set_task_comm+0x2c/0x60) from [<c007e184>] (kthreadd+0x30/0x108)
[<c007e184>] (kthreadd+0x30/0x108) from [<c0030104>] (kernel_thread_exit+0x0/0x8)
---[ end trace 1b75b31a2719ed1c ]---
possible reason: unannotated irqs-on.
irq event stamp: 3
hardirqs last enabled at (2): [<c0059bb0>] finish_task_switch+0x48/0xb0
hardirqs last disabled at (3): [<c002f0b0>] ret_slow_syscall+0xc/0x1c
softirqs last enabled at (0): [<c005f3e0>] copy_process+0x394/0xe5c
softirqs last disabled at (0): [<(null)>] (null)
Fix this by ensuring that the lockdep interrupt state is manipulated in
the appropriate places. We essentially treat userspace as an entirely
separate environment which isn't relevant to lockdep (lockdep doesn't
monitor userspace.) We don't tell lockdep that IRQs will be enabled
in that environment.
Instead, when creating kernel threads (which is a rare event compared
to entering/leaving userspace) we have to update the lockdep state. Do
this by starting threads with IRQs disabled, and in the kthread helper,
tell lockdep that IRQs are enabled, and enable them.
This provides lockdep with a consistent view of the current IRQ state
in kernel space.
This also revert portions of 0d928b0b61
which didn't fix the problem.
Tested-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The TLS register is only available on ARM1136 r1p0 and later.
Set HWCAP_TLS flags if hardware TLS is available and test for
it if CONFIG_CPU_32v6K is not set for V6.
Note that we set the TLS instruction in __kuser_get_tls
dynamically as suggested by Jamie Lokier <jamie@shareable.org>.
Also the __switch_to code is optimized out in most cases as
suggested by Nicolas Pitre <nico@fluxnic.net>.
Reviewed-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
A new random value for the canary is stored in the task struct whenever
a new task is forked. This is meant to allow for different canary values
per task. On ARM, GCC expects the canary value to be found in a global
variable called __stack_chk_guard. So this variable has to be updated
with the value stored in the task struct whenever a task switch occurs.
Because the variable GCC expects is global, this cannot work on SMP
unfortunately. So, on SMP, the same initial canary value is kept
throughout, making this feature a bit less effective although it is still
useful.
One way to overcome this GCC limitation would be to locate the
__stack_chk_guard variable into a memory page of its own for each CPU,
and then use TLB locking to have each CPU see its own page at the same
virtual address for each of them.
Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
With CONFIG_KPROBES enabled two section are getting created which
leads to below build break.
LOG:
AS arch/arm/kernel/entry-armv.o
arch/arm/kernel/entry-armv.S: Assembler messages:
arch/arm/kernel/entry-armv.S:431: Error: symbol ret_from_exception is in a different section
arch/arm/kernel/entry-armv.S:490: Error: symbol ret_from_exception is in a different section
arch/arm/kernel/entry-armv.S:491: Error: symbol __und_usr_unknown is in a different section
This was introduced by commit 4260415f6a
Reported-by: Anand Gadiyar <gadiyar@ti.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
/tmp/ccJ3ssZW.s: Assembler messages:
/tmp/ccJ3ssZW.s:1952: Error: can't resolve `.text' {.text section} - `.LFB1077'
This is caused because:
.section .data
.section .text
.section .text
.previous
does not return us to the .text section, but the .data section; this
makes use of .previous dangerous if the ordering of previous sections
is not known.
Fix up the other users of .previous; .pushsection and .popsection are
a safer pairing to use than .section and .previous.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The __kuser_cmpxchg code uses an ARMv6 dmb instruction, rather than
one based upon the architecture being built for. Switch to using
the macro provided for this purpose, which also eliminates the
need for an ifdef.
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Use a definition for the cmpxchg SWI instead of hard-coding the number.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Nicolas Pitre <nico@fluxnic.net>
The 32-bit wide variant of "mov pc, reg" in Thumb-2 is unpredictable
causing improper handling of the undefined instructions not caught by
the kernel. This patch adds a movw_pc macro for such situations
(currently only used in call_fpe).
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>