Commit Graph

123 Commits

Author SHA1 Message Date
Richard Henderson
ee03027a2c target/arm: Take an exception if PC is misaligned
For A64, any input to an indirect branch can cause this.

For A32, many indirect branch paths force the branch to be aligned,
but BXWritePC does not.  This includes the BX instruction but also
other interworking changes to PC.  Prior to v8, this case is UNDEFINED.
With v8, this is CONSTRAINED UNPREDICTABLE and may either raise an
exception or force align the PC.

We choose to raise an exception because we have the infrastructure,
it makes the generated code for gen_bx simpler, and it has the
possibility of catching more guest bugs.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-12-15 10:35:26 +00:00
Peter Maydell
8e228c9e4b target/arm: Implement HSTR.TJDBX
In v7A, the HSTR register has a TJDBX bit which traps NS EL0/EL1
access to the JOSCR and JMCR trivial Jazelle registers, and also BXJ.
Implement these traps. In v8A this HSTR bit doesn't exist, so don't
trap for v8A CPUs.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210816180305.20137-3-peter.maydell@linaro.org
2021-08-26 17:02:01 +01:00
Peter Maydell
e534629296 target/arm: Implement M-profile trapping on division by zero
Unlike A-profile, for M-profile the UDIV and SDIV insns can be
configured to raise an exception on division by zero, using the CCR
DIV_0_TRP bit.

Implement support for setting this bit by making the helper functions
raise the appropriate exception.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210730151636.17254-3-peter.maydell@linaro.org
2021-08-25 10:48:50 +01:00
Richard Henderson
b5cf742841 accel/tcg: Remove TranslatorOps.breakpoint_check
The hook is now unused, with breakpoints checked outside translation.

Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-07-21 07:47:05 -10:00
Peter Maydell
507b6a500c target/arm: Implement MVE VLDR/VSTR (non-widening forms)
Implement the forms of the MVE VLDR and VSTR insns which perform
non-widening loads of bytes, halfwords or words from memory into
vector elements of the same width (encodings T5, T6, T7).

(At the moment we know for MVE and M-profile in general that
vfp_access_check() can never return false, but we include the
conventional return-true-on-failure check for consistency
with non-M-profile translation code.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210617121628.20116-2-peter.maydell@linaro.org
2021-06-21 16:49:38 +01:00
Richard Henderson
458d0ab683 target/arm: Implement bfloat widening fma (indexed)
This is BFMLAL{B,T} for both AArch64 AdvSIMD and SVE,
and VFMA{B,T}.BF16 for AArch32 NEON.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210525225817.400336-11-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-06-03 16:43:26 +01:00
Richard Henderson
5693887f2e target/arm: Implement bfloat widening fma (vector)
This is BFMLAL{B,T} for both AArch64 AdvSIMD and SVE,
and VFMA{B,T}.BF16 for AArch32 NEON.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210525225817.400336-10-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-06-03 16:43:26 +01:00
Richard Henderson
81266a1f58 target/arm: Implement bfloat16 matrix multiply accumulate
This is BFMMLA for both AArch64 AdvSIMD and SVE,
and VMMLA.BF16 for AArch32 NEON.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210525225817.400336-9-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-06-03 16:43:26 +01:00
Richard Henderson
839144784b target/arm: Implement bfloat16 dot product (indexed)
This is BFDOT for both AArch64 AdvSIMD and SVE,
and VDOT.BF16 for AArch32 NEON.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210525225817.400336-8-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-06-03 16:43:26 +01:00
Richard Henderson
cb8657f7f9 target/arm: Implement bfloat16 dot product (vector)
This is BFDOT for both AArch64 AdvSIMD and SVE,
and VDOT.BF16 for AArch32 NEON.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210525225817.400336-7-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-06-03 16:43:26 +01:00
Richard Henderson
d29b17ca3e target/arm: Implement vector float32 to bfloat16 conversion
This is BFCVT{N,T} for both AArch64 AdvSIMD and SVE,
and VCVT.BF16.F32 for AArch32 NEON.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210525225817.400336-5-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-06-03 16:43:26 +01:00
Richard Henderson
3a98ac40fa target/arm: Implement scalar float32 to bfloat16 conversion
This is the 64-bit BFCVT and the 32-bit VCVT{B,T}.BF16.F32.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210525225817.400336-4-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-06-03 16:43:26 +01:00
Richard Henderson
2323c5ffd4 target/arm: Implement integer matrix multiply accumulate
This is {S,U,US}MMLA for both AArch64 AdvSIMD and SVE,
and V{S,U,US}MMLA.S8 for AArch32 NEON.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210525010358.152808-91-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-05-25 16:01:44 +01:00
Stephen Long
50d102bd42 target/arm: Implement SVE2 fp multiply-add long
Implements both vectored and indexed FMLALB, FMLALT, FMLSLB, FMLSLT

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Stephen Long <steplong@quicinc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210525010358.152808-83-richard.henderson@linaro.org
Message-Id: <20200504171240.11220-1-steplong@quicinc.com>
[rth: Rearrange to use float16_to_float32_by_bits.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-05-25 16:01:44 +01:00
Richard Henderson
6a98cb2ae0 target/arm: Implement SVE mixed sign dot product
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210525010358.152808-68-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-05-25 16:01:44 +01:00
Richard Henderson
2867039a9f target/arm: Implement SVE mixed sign dot product (indexed)
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210525010358.152808-67-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-05-25 16:01:44 +01:00
Richard Henderson
1aee2d70e3 target/arm: Implement SVE2 saturating multiply high (indexed)
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210525010358.152808-60-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-05-25 16:01:44 +01:00
Richard Henderson
169d7c5825 target/arm: Implement SVE2 signed saturating doubling multiply high
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210525010358.152808-59-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-05-25 16:01:44 +01:00
Richard Henderson
636ddeb15c target/arm: Pass separate addend to FCMLA helpers
For SVE, we potentially have a 4th argument coming from the
movprfx instruction.  Currently we do not optimize movprfx,
so the problem is not visible.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210525010358.152808-51-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-05-25 16:01:44 +01:00
Richard Henderson
bc2bd6974e target/arm: Pass separate addend to {U, S}DOT helpers
For SVE, we potentially have a 4th argument coming from the
movprfx instruction.  Currently we do not optimize movprfx,
so the problem is not visible.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210525010358.152808-50-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-05-25 16:01:44 +01:00
Richard Henderson
e6eba6e532 target/arm: Implement SVE2 XAR
In addition, use the same vector generator interface for AdvSIMD.
This fixes a bug in which the AdvSIMD insn failed to clear the
high bits of the SVE register.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210525010358.152808-44-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-05-25 16:01:44 +01:00
Richard Henderson
ab3ddf3185 target/arm: Implement SVE2 saturating multiply-add high
SVE2 has two additional sizes of the operation and unlike NEON,
there is no saturation flag.  Create new entry points for SVE2
that do not set QC.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210525010358.152808-36-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-05-25 16:01:44 +01:00
Richard Henderson
5dad1ba52f target/arm: Implement SVE2 Integer Multiply - Unpredicated
For MUL, we can rely on generic support.  For SMULH and UMULH,
create some trivial helpers.  For PMUL, back in a21bb78e58,
we organized helper_gvec_pmul_b in preparation for this use.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210525010358.152808-3-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-05-25 16:01:43 +01:00
Richard Henderson
604cef3e57 target/arm: Fix neon VTBL/VTBX for len > 1
The helper function did not get updated when we reorganized
the vector register file for SVE.  Since then, the neon dregs
are non-sequential and cannot be simply indexed.

At the same time, make the helper function operate on 64-bit
quantities so that we do not have to call it twice.

Fixes: c39c2b9043
Reported-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
[PMM: use aa32_vfp_dreg() rather than opencoding]
Message-id: 20201105171126.88014-1-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-11-10 11:03:48 +00:00
Peter Maydell
61db12d9f9 target/arm: AArch32 VCVT fixed-point to float is always round-to-nearest
For AArch32, unlike the VCVT of integer to float, which honours the
rounding mode specified by the FPSCR, VCVT of fixed-point to float is
always round-to-nearest. (AArch64 fixed-point-to-float conversions
always honour the FPCR rounding mode.)

Implement this by providing _round_to_nearest versions of the
relevant helpers which set the rounding mode temporarily when making
the call to the underlying softfloat function.

We only need to change the VFP VCVT instructions, because the
standard- FPSCR value used by the Neon VCVT is always set to
round-to-nearest, so we don't need to do the extra work of saving
and restoring the rounding mode.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20201013103532.13391-1-peter.maydell@linaro.org
2020-10-20 16:12:00 +01:00
Peter Maydell
c50d8d1440 target/arm/vec_helper: Add gvec fp indexed multiply-and-add operations
Add gvec helpers for doing Neon-style indexed non-fused fp
multiply-and-accumulate operations.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20200828183354.27913-44-peter.maydell@linaro.org
2020-09-01 11:45:32 +01:00
Peter Maydell
23afcdd251 target/arm: Implement fp16 for Neon VRINTX
Convert the Neon VRINTX insn to use gvec, and use this to implement
fp16 support for it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200828183354.27913-42-peter.maydell@linaro.org
2020-09-01 11:45:25 +01:00
Peter Maydell
18725916b1 target/arm: Implement fp16 for Neon VRINT-with-specified-rounding-mode
Convert the Neon VRINT-with-specified-rounding-mode insns to gvec,
and use this to implement the fp16 versions.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200828183354.27913-41-peter.maydell@linaro.org
2020-09-01 11:45:20 +01:00
Peter Maydell
ca88a6efdf target/arm: Implement fp16 for Neon VCVT with rounding modes
Convert the Neon VCVT with-specified-rounding-mode instructions
to gvec, and use this to implement fp16 support for them.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200828183354.27913-40-peter.maydell@linaro.org
2020-09-01 11:38:58 +01:00
Peter Maydell
24018cf399 target/arm: Implement fp16 for Neon VCVT fixed-point
Implement fp16 for the Neon VCVT insns which convert between
float and fixed-point.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200828183354.27913-39-peter.maydell@linaro.org
2020-09-01 11:38:50 +01:00
Peter Maydell
7b959c5890 target/arm: Convert Neon VCVT fixed-point to gvec
Convert the Neon VCVT float<->fixed-point insns to a
gvec style, in preparation for adding fp16 support.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200828183354.27913-38-peter.maydell@linaro.org
2020-09-01 11:38:42 +01:00
Peter Maydell
7782a9afec target/arm: Implement fp16 for Neon float-integer VCVT
Convert the Neon float-integer VCVT insns to gvec, and use this
to implement fp16 support for them.

Note that unlike the VFP int<->fp16 VCVT insns we converted
earlier and which convert to/from a 32-bit integer, these
Neon insns convert to/from 16-bit integers. So we can use
the existing vfp conversion helpers for the f32<->u32/i32
case but need to provide our own for f16<->u16/i16.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200828183354.27913-37-peter.maydell@linaro.org
2020-09-01 11:38:35 +01:00
Peter Maydell
1dc587ee9b target/arm: Implement fp16 for Neon pairwise fp ops
Convert the Neon pairwise fp ops to use a single gvic-style
helper to do the full operation instead of one helper call
for each 32-bit part. This allows us to use the same
framework to implement the fp16.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200828183354.27913-36-peter.maydell@linaro.org
2020-09-01 11:38:20 +01:00
Peter Maydell
40fde72dda target/arm: Implement fp16 for Neon VRSQRTS
Convert the Neon VRSQRTS insn to using a gvec helper,
and use this to implement the fp16 case.

As with VRECPS, we adjust the phrasing of the new implementation
slightly so that the fp32 version parallels the fp16 one.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200828183354.27913-35-peter.maydell@linaro.org
2020-09-01 11:38:13 +01:00
Peter Maydell
ac8c62c4e5 target/arm: Implement fp16 for Neon VRECPS
Convert the Neon VRECPS insn to using a gvec helper, and
use this to implement the fp16 case.

The phrasing of the new float32_recps_nf() is slightly different from
the old recps_f32() so that it parallels the f16 version; for f16 we
can't assume that flush-to-zero is always enabled.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200828183354.27913-34-peter.maydell@linaro.org
2020-09-01 11:38:06 +01:00
Peter Maydell
635187aaa9 target/arm: Implement fp16 for Neon fp compare-vs-0
Convert the neon floating-point vector compare-vs-0 insns VCEQ0,
VCGT0, VCLE0, VCGE0 and VCLT0 to use a gvec helper, and use this to
implement the fp16 case.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200828183354.27913-33-peter.maydell@linaro.org
2020-09-01 11:37:59 +01:00
Peter Maydell
cf722d75b3 target/arm: Implement fp16 for Neon VFMA, VMFS
Convert the neon floating-point vector operations VFMA and VFMS
to use a gvec helper, and use this to implement the fp16 case.

This is the last use of do_3same_fp() so we can now delete
that function.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200828183354.27913-32-peter.maydell@linaro.org
2020-09-01 11:37:51 +01:00
Peter Maydell
e5adc70665 target/arm: Implement fp16 for Neon VMLA, VMLS operations
Convert the Neon floating-point VMLA and VMLS insns over to using a
gvec helper, and use this to implement the fp16 case.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200828183354.27913-31-peter.maydell@linaro.org
2020-09-01 11:31:48 +01:00
Peter Maydell
e22705bb94 target/arm: Implement fp16 for Neon VMAXNM, VMINNM
Convert the Neon floating point VMAXNM and VMINNM insns to
using a gvec helper and use this to implement the fp16 case.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200828183354.27913-30-peter.maydell@linaro.org
2020-09-01 11:31:16 +01:00
Peter Maydell
e43268c54b target/arm: Implement fp16 for Neon VMAX, VMIN
Convert the Neon float-point VMAX and VMIN insns over to using
a gvec helper, and use this to implement the fp16 case.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200828183354.27913-29-peter.maydell@linaro.org
2020-09-01 11:31:12 +01:00
Peter Maydell
bb2741da18 target/arm: Implement fp16 for VACGE, VACGT
Convert the neon floating-point vector absolute comparison ops
VACGE and VACGT over to using a gvec hepler and use this to
implement the fp16 case.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200828183354.27913-28-peter.maydell@linaro.org
2020-09-01 11:31:05 +01:00
Peter Maydell
ad505db233 target/arm: Implement fp16 for VCEQ, VCGE, VCGT comparisons
Convert the Neon floating-point vector comparison ops VCEQ,
VCGE and VCGT over to using a gvec helper and use this to
implement the fp16 case.

(We put the float16_ceq() etc functions above the DO_2OP()
macro definition because later when we convert the
compare-against-zero instructions we'll want their
definitions to be visible at that point in the source file.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200828183354.27913-27-peter.maydell@linaro.org
2020-09-01 11:30:58 +01:00
Peter Maydell
e4a6d4a69e target/arm: Implement FP16 for Neon VADD, VSUB, VABD, VMUL
Implement FP16 support for the Neon insns which use the DO_3S_FP_GVEC
macro: VADD, VSUB, VABD, VMUL.

For VABD this requires us to implement a new gvec_fabd_h helper
using the machinery we have already for the other helpers.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200828183354.27913-24-peter.maydell@linaro.org
2020-09-01 11:19:32 +01:00
Peter Maydell
0a6f4b4cb3 target/arm: Implement VFP fp16 VRINT*
Implement the fp16 version of the VFP VRINT* insns.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200828183354.27913-19-peter.maydell@linaro.org
2020-09-01 11:19:32 +01:00
Peter Maydell
414ba270c4 target/arm: Use macros instead of open-coding fp16 conversion helpers
Now the VFP_CONV_FIX macros can handle fp16's distinction between the
width of the operation and the width of the type used to pass operands,
use the macros rather than the open-coded functions.

This creates an extra six helper functions, all of which we are going
to need for the AArch32 VFP fp16 instructions.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200828183354.27913-15-peter.maydell@linaro.org
2020-09-01 11:19:32 +01:00
Peter Maydell
1b88b054c5 target/arm: Implement VFP fp16 VCMP
Implement fp16 version of VCMP.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200828183354.27913-11-peter.maydell@linaro.org
2020-09-01 11:19:32 +01:00
Peter Maydell
ce2d65a5d1 target/arm: Implement VFP fp16 for VABS, VNEG, VSQRT
Implement VFP fp16 for VABS, VNEG and VSQRT. This is all
the fp16 insns that use the DO_VFP_2OP macro, because there
is no fp16 version of VMOV_reg.

Notes:
 * the gen_helper_vfp_negh already exists as we needed to create
   it for the fp16 multiply-add insns
 * as usual we need to use the f16 version of the fp_status;
   this is only relevant for VSQRT

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200828183354.27913-9-peter.maydell@linaro.org
2020-09-01 11:19:32 +01:00
Peter Maydell
9886fe2834 target/arm: Implement VFP fp16 for fused-multiply-add
Implement VFP fp16 support for fused multiply-add insns
VFNMA, VFNMS, VFMA, VFMS.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200828183354.27913-7-peter.maydell@linaro.org
2020-09-01 11:19:32 +01:00
Peter Maydell
e7cb0ded52 target/arm: Implement VFP fp16 VMLA, VMLS, VNMLS, VNMLA, VNMUL
Implement fp16 versions of the VFP VMLA, VMLS, VNMLS, VNMLA, VNMUL
instructions. (These are all the remaining ones which we implement
via do_vfp_3op_[hsd]p().)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200828183354.27913-5-peter.maydell@linaro.org
2020-09-01 11:19:32 +01:00
Peter Maydell
120a0eb3ea target/arm: Implement VFP fp16 for VFP_BINOP operations
Implmeent VFP fp16 support for simple binary-operator VFP insns VADD,
VSUB, VMUL, VDIV, VMINNM and VMAXNM:

 * make the VFP_BINOP() macro generate float16 helpers as well as
   float32 and float64
 * implement a do_vfp_3op_hp() function similar to the existing
   do_vfp_3op_sp()
 * add decode for the half-precision insn patterns

Note that the VFP_BINOP macro use creates a couple of unused helper
functions vfp_maxh and vfp_minh, but they're small so it's not worth
splitting the BINOP operations into "needs halfprec" and "no
halfprec" groups.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200828183354.27913-4-peter.maydell@linaro.org
2020-09-01 11:19:32 +01:00