Directives such as .long and .word do not magically cause the
assembler location counter to become aligned in gas. As a result,
using these directives in code sections can result in misaligned
data words when building a Thumb-2 kernel (CONFIG_THUMB2_KERNEL).
This is a Bad Thing, since the ABI permits the compiler to assume
that fundamental types of word size or above are word- aligned when
accessing them from C. If the data is not really word-aligned,
this can cause impaired performance and stray alignment faults in
some circumstances.
In general, the following rules should be applied when using data
word declaration directives inside code sections:
* .quad and .double:
.align 3
* .long, .word, .single, .float:
.align (or .align 2)
* .short:
No explicit alignment required, since Thumb-2
instructions are always 2 or 4 bytes in size.
immediately after an instruction.
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Dave Martin <dave.martin@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
MVFR0 and MVFR1 are only available starting with ARM1136 r1p0 release
according to "B.5 VFP changes" in DDI0211F_arm1136_r1p0_trm.pdf. This is
also when TLS register got added, so we can use HAS_TLS also to test for
MVFR0 and MVFR1.
Otherwise VFPFMRX and VFPFMXR access fails and we get:
Internal error: Oops - undefined instruction: 0 [#1]
PC is at no_old_VFP_process+0x8/0x3c
LR is at __und_svc+0x48/0x80
...
Signed-off-by: Tony Lindgren <tony@atomide.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
vfp_put_double() takes the double value in r0,r1 not r1,r2.
Reported-by: Tarun Kanti DebBarma <tarun.kanti@ti.com>
Cc: <stable@kernel.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
From: Imre Deak <imre.deak@nokia.com>
Recently the UP versions of these functions were refactored and as
a side effect it became possible to call them for the current thread.
This isn't true for the SMP versions however, so fix this up.
Signed-off-by: Imre Deak <imre.deak@nokia.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
A CPU has VFPv3 hardware if the FPSID[19:16] bits are 2 or more.
Currently Linux was only checking for 3 or more.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
If we're only reading the VFP context via the ptrace call, there's
no need to invalidate the hardware context - we only need to do that
on PTRACE_SETVFPREGS. This allows more efficient monitoring of a
traced task.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The more I look at vfp_sync_state(), the more I believe it's trying
to do its job in a really obscure way.
Essentially, last_VFP_context[] tracks who owns the state in the VFP
hardware. If last_VFP_context[] is the context for the thread which
we're interested in, then the VFP hardware has context which is not
saved in the software state - so we need to bring the software state
up to date.
If last_VFP_context[] is for some other thread, we really don't care
what state the VFP hardware is in; it doesn't contain any information
pertinent to the thread we're trying to deal with - so don't touch
the hardware.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Commit c98929c07a removed the clearing of the FPSCR[31:28] bits from the
vfp_raise_exceptions() function and the new bits are or'ed with the old
FPSCR bits leading to unexpected results (the original commit was
referring to the cumulative bits - FPSCR[4:0]).
Reported-by: Tom Hameenanttila <tmhameen@marvell.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This avoids races in the VFP code where the dead thread may have
state on another CPU. By moving this code to exit_thread(), we
will be running as the thread, and therefore be running on the
current CPU.
This means that we can ensure that the only local state is accessed
in the thread notifiers.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
When the VFP notifier is called for flush_thread(), we may be
preemptible, meaning we might migrate to another CPU, which means
referencing the current CPU number without some form of locking is
invalid, and can cause data corruption.
For the most cases, this isn't a problem since atomic notifiers are run
under rcu lock, which for most configurations results in preemption
being disabled - except when the preemptable tree-based rcu
implementation is selected.
Let's make it safe anyway.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Since the Thumb-2 instructions can be 16-bit wide, data in the .text
sections may not be aligned to a 32-bit word and this leads to unaligned
exceptions. This patch does not affect the ARM code generation.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This CPU generates synchronous VFP exceptions in a non-standard way -
the FPEXC.EX bit set but without the FPSCR.IXE bit being set like in the
VFP subarchitecture 1 or just the FPEXC.DEX bit like in VFP
subarchitecture 2. The main problem is that the faulty instruction
(which needs to be emulated in software) will be restarted several times
(normally until a context switch disables the VFP). This patch ensures
that the VFP exception is treated as synchronous.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Nicolas Pitre <nico@cam.org>
We've observed that ARM VFP state can be corrupted during VFP exception
handling when PREEMPT is enabled. The exact conditions are difficult
to reproduce but appear to occur during VFP exception handling when a
task causes a VFP exception which is handled via VFP_bounce and is then
preempted by yet another task which in turn causes yet another VFP
exception. Since the VFP_bounce code is not preempt safe, VFP state then
becomes corrupt. In order to prevent preemption from occuring while
handling a VFP exception, this patch disables preemption while handling
VFP exceptions.
Signed-off-by: George G. Davis <gdavis@mvista.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The VFPv3D16 is a VFPv3 CPU configuration where only 16 double registers
are present, as the VFPv2 configuration. This patch adds the
corresponding hwcap bits so that applications or debuggers have more
information about the supported features.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch adds ptrace support for setting and getting the VFP registers
using PTRACE_SETVFPREGS and PTRACE_GETVFPREGS. The user_vfp structure
defined in asm/user.h contains 32 double registers (to cover VFPv3 and
Neon hardware) and the FPSCR register.
Cc: Paul Brook <paul@codesourcery.com>
Cc: Daniel Jacobowitz <dan@codesourcery.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
When CONFIG_PM is selected, the VFP code does not have any handler
installed to deal with either saving the VFP state of the current
task, nor does it do anything to try and restore the VFP after a
resume.
On resume, the VFP will have been reset and the co-processor access
control registers are in an indeterminate state (very probably the
CP10 and CP11 the VFP uses will have been disabled by the ARM core
reset). When this happens, resume will break as soon as it tries to
unfreeze the tasks and restart scheduling.
Add a sys device to allow us to hook the suspend call to save the
current thread state if the thread is using VFP and a resume hook
which restores the CP10/CP11 access and ensures the VFP is disabled
so that the lazy swapping will take place on next access.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
On ARMv7, conditional undefined instructions may generate exceptions
even if the condition is not met. The vfphw.S contains the FPINST and
FPINST2 access instructions which may not be present on processors with
synchronous VFP exceptions.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This declaration specifies the "function" type and size for various
assembly functions, mainly needed for generating the correct branch
instructions in Thumb-2.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
It's never used and the comments refer to nonatomic and retry
interchangably. So get rid of it.
Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
This patch adds the support for VFPv3 (the kernel currently supports
VFPv2). The main difference is 32 double registers (compared to 16).
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch allows the VFP support code to run correctly on CPUs
compatible with the common VFP subarchitecture specification (Appendix
B in the ARM ARM v7-A and v7-R edition). It implements support for VFP
subarchitecture 2 while being backwards compatible with
subarchitecture 1.
On VFP subarchitecture 1, the arithmetic exceptions are asynchronous
(or imprecise as described in the old ARM ARM) unless the FPSCR.IXE
bit is 1. The exceptional instructions can be read from FPINST and
FPINST2 registers. With VFP subarchitecture 2, the arithmetic
exceptions can also be synchronous and marked by the FPEXC.DEX bit
(the FPEXC.EX bit is cleared). CPUs implementing the synchronous
arithmetic exceptions don't have the FPINST and FPINST2 registers and
accessing them would trigger and undefined exception.
Note that FPEXC.EX bit has an additional meaning on subarchitecture 1
- if it isn't set, there is no additional information in FPINST and
FPINST2 that needs to be saved at context switch or when lazy-loading
the VFP state of a different thread.
The patch also removes the clearing of the cumulative exception flags in
FPSCR when additional exceptions were raised. It is up to the user
application to clear these bits.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
These two instructions exceptionally take a single precision register
as their operand. This means we can't use vfp_get_dm() to read the
register number - we need to use vfp_get_sm() instead. Add a flag to
indicate this exception to the general rule.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The vector stride of the double-precision vector instructions must be changed
to 1-2 from even 2-4, because the double registers numbering has been
changed to 0-15 from even 0-30 by
1356c1948d commit.
Signed-off-by: Takashi Ohmasa <ohmasa.takashi@jp.panasonic.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
All exception flags of the FPEXC register must be cleared before
returning from exception code to user code, including FP2V and OFC.
Signed-off-by: Takashi Ohmasa <ohmasa.takashi@jp.panasonic.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The variable AFLAGS is a wellknown variable and the usage by
kbuild may result in unexpected behaviour.
On top of that several people over time has asked for a way to
pass in additional flags to gcc.
This patch replace use of AFLAGS with KBUILD_AFLAGS all over
the tree.
Patch was tested on following architectures:
alpha, arm, i386, x86_64, mips, sparc, sparc64, ia64, m68k, s390
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
vfp_init() takes care of the condition when CONFIG_VFP=y but no real VFP
device exists. However, when this condition is true, a compiler might
misplace code lines in a way that will break this support. (To be more
specific - fmrx(FPSID) might be executed before vfp_testing_entry
assignment, which will end up with Oops - undefined instruction).
This patch adds a barrier() to guarantee the right execution ordering.
Signed-off-by: Assaf Hoffman
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Fix a real section mismatch issue; the test code is thrown away after
initialisation, but if we do not detect the VFP hardware, it is left
hooked into the exception handler. Any VFP instructions which are
subsequently executed risk calling the discarded exception handler.
Introduce a new "null" handler which returns to the "unrecognised
fault" return address.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The current lazy saving of the VFP registers is no longer possible
with thread migration on SMP. This patch implements a per-CPU
vfp-state pointer and the saving of the VFP registers at every context
switch. The registers restoring is still performed in a lazy way.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
When we install the handlers for context switching, we must enable
VFP on all CPU cores, otherwise undefined (and random) effects
occur.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Don't set HWCAP_VFP in the processor support file; not only does it
depend on the processor features, but it also depends on the support
code being present. Therefore, only set it if the support code
detects that we have a VFP coprocessor attached.
Also, move the VFP handling of the coprocessor access register into
the VFP support code.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The SIGFPE signal should be generated if Division by Zero exception is detected.
Signed-off-by: Takashi Ohmasa <ohmasa.takashi@jp.panasonic.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The significand should be shifted until the value of bit [62] is 1
to normalize the denormal double number.
Signed-off-by: Takashi Ohmasa <ohmasa.takashi@jp.panasonic.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
It looks like Zach Brown's patch pr_debug-check-pr_debug-arguments
worked as inteded. That is, it doesn't "allow completely incorrect code
to build." :).
The arm build fails with the following message:
CC arch/arm/vfp/vfpsingle.o
arch/arm/vfp/vfpsingle.c: In function `__vfp_single_normaliseround':
arch/arm/vfp/vfpsingle.c:201: error: `func' undeclared (first use in
this function)
arch/arm/vfp/vfpsingle.c:201: error: (Each undeclared identifier is
reported only once
arch/arm/vfp/vfpsingle.c:201: error: for each function it appears in.)
make[1]: *** [arch/arm/vfp/vfpsingle.o] Error 1
make: *** [arch/arm/vfp] Error 2
The following patch fixes the issue by using func only when DEBUG is
defined.
Signed-off-by: Frederik Deweerdt <frederik.deweerdt@gmail.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Zach Brown <zach.brown@oracle.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- Document the meaning for OP_SCALAR, OP_SD and add OP_DD.
- Formatting cleanups
- Remove now redundant code for making compare instructions
operate on scalar values.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
VECITR in Floating-Point Exception register indicates the number of
remaining short vector iterations after a potential exception was
detected.
In case of exception caused by scalar instructions, VECITR is NOT updated.
Therefore emulation for VFP must ignore VECITR field
and treat "veclen" as zero when recognizing scalar instructing.
Signed-off-by: Gen Fukatsu <fukatsu.gen@jp.panasonic.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Daniel Jacobowitz
The ARM kernel has several uses of asm("foo%?"). %? is a GCC internal
modifier used to output conditional execution predicates. However, no
version of GCC supports conditionalizing asm statements. GCC 4.2 will
correctly expand %? to the empty string in user asms. Earlier versions may
reuse the condition from the previous instruction. In 'if (foo) asm
("bar%?");' this is somewhat likely to be right... but not reliable.
So, the only safe thing to do is to remove the uses of %?. I believe
the tlbflush.h occurances were supposed to be removed before, based
on the comment about %? not working at the top of that file.
Old versions of GCC could omit branches around user asms if the asm didn't
mark the condition codes as clobbered. This problem hasn't been seen on any
recent (3.x or 4.x) GCC, but it could theoretically happen. So, where
%? was removed a cc clobber was added.
Signed-off-by: Daniel Jacobowitz <dan@codesourcery.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The common case for the thread notifier is a context switch. Tell
gcc that this is the most likely condition so it can optimise the
function for this case.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Daniel Jacobowitz
vfp_put_double didn't work in a CONFIG_AEABI kernel. By swapping
the arguments, we arrange for them to be in the same place regardless
of ABI. I made the same change to vfp_put_float for consistency.
Signed-off-by: Daniel Jacobowitz <dan@codesourcery.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Daniel Jacobowitz
The fcvtds and fcvtsd instructions were generating a qnan bit pattern
for both quiet and signalling NaNs.
Signed-off-by: Daniel Jacobowitz <dan@codesourcery.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Daniel Jacobowitz
The fcvtsd/fcvtds emulation was left behind when the numbering of double
precision registers was changed from 0-30 to 0-15. Both conversion
instructions were writing their results to the wrong register. Also,
the conversion instructions should stop after the first element even
if a vector length is specified.
Signed-off-by: Daniel Jacobowitz <dan@codesourcery.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Daniel Jacobowitz
The recent fix to hide VFP_NAN_FLAG broke the check in vfp_raise_exceptions;
it would attempt to deliver an exception mask of 0xfffffeff instead of reporting
a serious error condition using printk. Define a safe constant to use for
an invalid exception maskm, and use it at both ends.
Signed-off-by: Daniel Jacobowitz <dan@codesourcery.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Since we pass flags to the compiler to control code generation based
on the least capable selected CPU, if we want to include VFP support,
we must tweak the assembler flags to allow the VFP instructions.
Moreover, we must not use the mrrc/mcrr versions since these will not
be recognised by the assembler.
We do not convert all instructions to the VFP-equivalent (yet) since
binutils appears to barf on "fmrx rn, fpinst" and doesn't provide any
other way (other than using the mrc equivalent) to encode this
instruction - which is rather a problem when you have a VFP
implementation which requires these instructions.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Some machine classes need to allow VFP support to be built into the
kernel, but still allow the kernel to run even though VFP isn't
present. Unfortunately, the kernel hard-codes VFP instructions
into the thread switch, which prevents this being run-time selectable.
Solve this by introducing a notifier which things such as VFP can
hook into to be informed of events which affect the VFP subsystem
(eg, creation and destruction of threads, switches between threads.)
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>