The ARM architecture mandates that we detect tininess before rounding,
so set the softfloat fp_status up appropriately.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Catch the UNPREDICTABLE case for Neon VTBL,VTBX, and UNDEF it
rather than allowing the helper function to index off the end
of the register file.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Many of the Neon "2 register misc" instruction forms require invalid
size fields to cause the instruction to UNDEF. Pull this information
out into an array; this simplifies the code and also means we can do
the check early and avoid the problem of leaking TCG temporaries in
the illegal_op case.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Add missing checks for cases which must UNDEF in the Neon "2 registers and
a scalar" data processing instruction space.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Add missing UNDEF checks for instructions in the Neon "3 registers of
different widths" data processing space.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
For Neon "one register and a modified immediate value" forms, the
combination op=1 cmode=1111 is unallocated and should UNDEF.
All instructions of this form also UNDEF if Q == 1 and Vd<0> == 1.
We also add a comment on the only UNPREDICTABLE in this space.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Collapse some switch cases for VSRI into those for VSHL, VSLI,
since the bodies are the same. (This is not completely obvious
for the size < 3 case, but since for VSRI we know U=1 the
GEN_NEON_INTEGER_OP() expansion is equivalent to the open-coded
VSHL/VSLI case.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Correctly handle all the UNDEF cases for Neon instructions of the
"2 registers and shift" form, and make sure that we check for these
cases early enough not to leak TCG temporaries.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Since we know that the case of (pairwise && q) has been caught
earlier, we can simplify the register setup code for each pass
in the three-register-same-size Neon loop.
Signed-off-by: Juha Riihimäki <juha.riihimaki@nokia.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Correct the handling of UNDEF cases for the NEON "3 registers same
size" forms, by adding missing checks and rationalising some others
so they are done early enough to avoid leaking TCG temporaries.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Simplify the checks for invalid size values for the Neon "three registers
of the same size" instruction forms (and add them where they were missing)
by using a lookup table.
This includes adding symbolic constants for the op values in this space,
since we now use them in multiple places.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Currently target-arm/ assumes at least ARMv5 core. Add support for
handling also ARMv4/ARMv4T. This changes the following instructions:
BX(v4T and later)
BKPT, BLX, CDP2, CLZ, LDC2, LDRD, MCRR, MCRR2, MRRC, MCRR, MRC2, MRRC,
MRRC2, PLD QADD, QDADD, QDSUB, QSUB, STRD, SMLAxy, SMLALxy, SMLAWxy,
SMULxy, SMULWxy, STC2 (v5 and later)
All instructions that are "v5TE and later" are also bound to just v5, as
that's how it was before.
This patch doesn _not_ include disabling of cp15 access and base-updated
data abort model (that will be required to emulate chips based on a
ARM7TDMI), because:
* no ARM7TDMI chips are currently emulated (or planned)
* those features aren't strictly necessary for my purposes (SA-1 core
emulation).
All v5 models are handled as they are v5T. Internally we still have a
check if the model is a v5(T) or v5TE, but as all emulated cores are
v5TE, those two cases are simply aliased (for now).
Patch is heavily based on patch by Filip Navara <filip.navara@gmail.com>
which in turn is based on work by Ulrich Hecht <uli@suse.de> and Vincent
Sanders <vince@kyllikki.org>.
Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
tcg_gen_exit_tb takes a parameter of type tcg_target_long,
so the type casts of pointer to long should be replaced by
type casts of pointer to tcg_target_long (suggested by Blue Swirl).
These changes are needed for build environments where
sizeof(long) != sizeof(void *), especially for w64.
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Make the Neon helper routines use the correct FP status from
the CPUEnv rather than using a dummy static one. This means
they will correctly handle denormals and NaNs and will set
FPSCR exception bits properly.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Use the global 'env' variable in the helper functions in iwmmxt_helper.c.
This means we don't need to pass env as an argument to them any more.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Use the global 'env' variable in the helper functions in neon_helper.c.
This means we don't need to pass env as an argument to them any more.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Correct the argument and return types for the float<->int conversion helper
functions so that integer arguments and return values are declared as
uint32_t/uint64_t, not float32/float64. This allows us to remove the
hand-rolled functions which were doing bitwise copies between the types
via unions.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Nathan Froyd <froydnj@codesourcery.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Use the new softfloat min/max functions to implement the Neon VMAX
and VMIN instructions. This allows us to get the right behaviour
for NaN and negative zero.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Implement ABD by taking the absolute value of the difference
of the operands (as the ARM ARM specifies) rather than by
flipping the order of the operands to the subtract based
on the results of a comparison. The latter approch gives
the wrong answers for some edge cases like negative zero.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Implementing the floating-point versions of VCLE #0 and VCLT #0 by
doing a GT comparison and inverting the result gives the wrong
result if the input is a NaN. Implement as a GT comparison with the
operands swapped instead.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Fix the helper functions implementing the Neon floating point comparison
ops (VCGE, VCGT, VCEQ, VACGT, VACGE) to return the right answer when
one of the values being compared is a NaN.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Use the softfloat make_float32 and float32_val macros to convert between
softfloat's float32 type and raw uint32_t types, rather than private
conversion functions.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Nathan Froyd <froydnj@codesourcery.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Move the allocation and freeing of the TCG temp used for the address for
Neon load/store instructions so that we don't allocate the temporary
until we've done enough decoding to know that the instruction is not
an UNDEF pattern; this avoids leaking the TCG temp in these cases.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Fix several bugs in VLD of single element to all lanes:
The "single element to all lanes" form of VLD1 differs from those for
VLD2, VLD3 and VLD4 in that bit 5 indicates whether the loaded element
should be written to one or two Dregs (rather than being a register
stride). Handle this by special-casing VLD1 rather than trying to
have one loop which deals with both VLD1 and 2/3/4.
Handle VLD4.32 with 16 byte alignment specified, rather than UNDEFfing.
UNDEF for the invalid size and alignment combinations.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
The preferred way to create a constant floating point value is to use
make_float32() rather than doing a runtime int32_to_float32().
Convert the code in the VRSQRTS helper to work this way.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Correct the handling of edge cases for the VRECPS instruction:
* this is a Neon instruction so uses the "standard FPSCR value"
* (zero, inf) is a special case which returns 2.0
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
SMUAD and SMLAD are supposed to set the Q bit if the addition of
the two 16x16 multiply products and optional accumulator overflows
considered as a signed value. However we were only doing this check
for the addition of the accumulator, not when adding the products,
with the effect that we were mishandling the edge case where
both inputs are 0x80008000.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Fix the signed modulo arithmetic helpers for the v6media
instructions (SADD8, SSUB8, SADD16, SSUB16, SASX, SSAX) to set
the GE bits correctly (based on the result of the add or subtract
before it is truncated to 16 bits, not after).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Decode of Thumb load/store was merging together the cases of 'bit 11==0'
(reg+reg LSL imm) and 'bit 11==1' (reg+imm). This happens to work for
valid instruction patterns but meant that we would not UNDEF for the
cases the architecture mandates that we must. Make the decode actually
look at bit 11 as well as [10..8] so that we UNDEF in the right places.
This change also removes what was a spurious unreachable 'case 8',
and correctly frees TCG temporaries on the illegal-insn codepaths.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
All implementations are now the same, and there is only one caller,
so inline the function there.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Newer ARM kernels try to probe for whether the CPU has hardware breakpoint
support. For this to work QEMU has to implement a minimal set of the cp14
debug registers. The architecture requires v7 cores to implement debug
and so there is no defined way to report its absence; however in practice
returning a zero DBGDIDR (ie with a reserved value for "debug architecture
version") should cause well-written hw debug users to do the right thing.
We also implement DBGDRAR and DBGDSAR as RAZ, indicating no memory mapped
debug components.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Use the new TCG temporary leak debugging facilities to
check that each ARM instruction does not leak temporaries.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
This commit removes the ad-hoc resource leak checking code from
target-arm. This includes replacing all uses of new_tmp() with
tcg_temp_new_i32() and all uses of dead_tmp() with
tcg_temp_free_i32().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Implement VA->PA translations by cp15-c7 that went through unchanged
previously.
Signed-off-by: Adam Lackorzynski <adam@os.inf.tu-dresden.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
The code for Thumb2 ORNS (or negated and set flags) was trashing
a TCG input register which was needed later for use in calculating
flags, with the effect that the carry flag was always set with
the wrong sense. Fix this by using the TCG orc op instead of
separate not and or ops.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Fix two bugs in the translation of the instructions VMOV sa,sb,rx,ry and
VMOV rx,ry,sa,sb (which copy between a pair of ARM core registers and a
pair of VFP single precision registers):
* An incorrect condition meant these instruction patterns were being
treated as load/store multiple, which resulted in the generation
of bad code and a runtime segfault
* The order of the core register pair was reversed so the values would
go to the wrong registers
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
In v7 of the ARM architecture, WFI (wait for interrupt) is a first-class
instruction, but in previous versions this functionality was provided
via a cp15 coprocessor register. Add correct feature checks to the
decoding of the cp15 WFI instructions so that they behave correctly
for newer cores. In particular, the old 0,c7,c8,2 encoding used on
ARM940 has been reused for VA-to-PA translation in v6 and v7.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Adam Lackorzynski <adam@os.inf.tu-dresden.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Now use the same algorithm as described in the ARM ARM.
Signed-off-by: Christophe Lyon <christophe.lyon@st.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Now use the same algorithm as described in the ARM ARM.
Signed-off-by: Christophe Lyon <christophe.lyon@st.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
These two constants will be used by helper functions such as recpe_f32
and rsqrte_f32.
Signed-off-by: Christophe Lyon <christophe.lyon@st.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
For Neon shifts by immediate and narrow, correctly handle the case
where the source registers and the destination registers overlap
(the second pass should use the original register contents, not the
results of the first pass).
This includes a refactoring to pull the size check outside the
loop rather than inside, since there is now very little common
code between the size == 3 and size != 3 case.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Pull the code which decodes narrowing operations as being either
signed/unsigned saturate or plain out into its own function.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Correctly handle VQRSHL of unsigned values by a shift count of the
width of the data type or larger, which must be special-cased in the
qrshl_u* helper functions.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Handle the case of signed VQRSHL by a shift count of the width of the
data type or larger, which must be special cased in the qrshl_s*
helper functions.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Call the normal shift helpers instead of the rounding ones.
Signed-off-by: Christophe Lyon <christophe.lyon@st.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>