target/arm: Fix float16 pairwise Neon ops on big-endian hosts

In the neon_padd/pmax/pmin helpers for float16, a cut-and-paste error
meant we were using the H4() address swizzler macro rather than the
H2() which is required for 2-byte data. This had no effect on
little-endian hosts but meant we put the result data into the
destination Dreg in the wrong order on big-endian hosts.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20201028191712.4910-2-peter.maydell@linaro.org
2020-11-02 16:52:15 +00:00 · 2020-11-02 16:52:15 +00:00 · 552714c081
commit 552714c081
parent 8aab18a2c5
1 changed file with 4 additions and 4 deletions
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -1858,10 +1858,10 @@ DO_ABA(gvec_uaba_d, uint64_t)
        r2 = float16_##OP(m[H2(0)], m[H2(1)], fpst);                    \
        r3 = float16_##OP(m[H2(2)], m[H2(3)], fpst);                    \
                                                                        \
-        d[H4(0)] = r0;                                                  \
-        d[H4(1)] = r1;                                                  \
-        d[H4(2)] = r2;                                                  \
-        d[H4(3)] = r3;                                                  \
+        d[H2(0)] = r0;                                                  \
+        d[H2(1)] = r1;                                                  \
+        d[H2(2)] = r2;                                                  \
+        d[H2(3)] = r3;                                                  \
    }

 DO_NEON_PAIRWISE(neon_padd, add)
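
To illustrate why the wrong swizzler only misbehaves on big-endian hosts, here is a minimal standalone sketch, not part of the patch. It assumes the big-endian definitions of the swizzlers are `(x) ^ 3` for 2-byte elements and `(x) ^ 1` for 4-byte elements within a 64-bit chunk; on little-endian hosts both are the identity, which is why the bug was invisible there. The names `H2_BE`, `H4_BE`, and `store_results` are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed big-endian forms of the index swizzlers (hypothetical names).
 * On a little-endian host both would simply be (x).
 */
#define H2_BE(x) ((x) ^ 3)   /* 2-byte elements in a 64-bit chunk */
#define H4_BE(x) ((x) ^ 1)   /* 4-byte elements in a 64-bit chunk */

/*
 * Store four 16-bit results r[0..3] into a 64-bit register image d[],
 * viewed as four host uint16_t slots. If use_h4 is nonzero, the wrong
 * (pre-fix) swizzler is used for the destination index.
 */
static void store_results(uint16_t d[4], const uint16_t r[4], int use_h4)
{
    for (int i = 0; i < 4; i++) {
        int slot = use_h4 ? H4_BE(i) : H2_BE(i);
        assert(slot >= 0 && slot < 4);
        d[slot] = r[i];
    }
}
```

Under these assumptions, H2 maps element indices 0, 1, 2, 3 to host slots 3, 2, 1, 0, while H4 maps them to 1, 0, 3, 2. The H4 mapping only swaps neighbours, so the 16-bit results end up in the wrong positions of the destination register.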