Przemyslaw Wirkus 84ae721396 [arm] Implement usadv16qi and ssadv16qi standard names
This patch implements the usadv16qi and ssadv16qi standard names for arm.

The V16QImode variant is important as it is the most commonly used pattern:
reducing vectors of bytes into an int.
The midend expects the optab to compute the absolute differences of operands 1
and 2 and reduce them while widening along the way up to SImode. So the inputs
are V16QImode and the output is V4SImode.
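
As a rough illustration of what the optab is expected to compute, here is a
scalar sketch in C for the unsigned case. The function names are hypothetical
and do not appear in the patch. The assignment of byte differences to the four
accumulator lanes is an assumption made for this example; the reduction in the
loop only depends on the sum of all lanes.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical scalar model of one unsigned V16QI sum-of-absolute-differences
   step.  ACC models the V4SI accumulator (operand 3); A and B model the two
   V16QI inputs (operands 1 and 2).  The lane choice (i % 4) is an assumption
   for illustration only.  */
static void
usad_v16qi_model (uint32_t acc[4], const uint8_t a[16], const uint8_t b[16])
{
  for (int i = 0; i < 16; i++)
    acc[i % 4] += (uint32_t) abs ((int) a[i] - (int) b[i]);
}

/* Final reduction of the four SImode lanes into a single integer, as the
   loop epilogue does once the vector loop has finished.  */
static uint32_t
reduce_v4si_model (const uint32_t acc[4])
{
  return acc[0] + acc[1] + acc[2] + acc[3];
}
```

Calling usad_v16qi_model once per 16-byte chunk and reduce_v4si_model once at
the end gives the same total as the scalar loop over all bytes.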

I've based my solution on the current AArch64 implementation of the usadv16qi
and ssadv16qi standard names (r260437). This solution emits the following
sequence of instructions:

        VABDL.u8        tmp, op1, op2   # op1, op2 lowpart
        VABAL.u8        tmp, op1, op2   # op1, op2 highpart
        VPADAL.u16      op3, tmp
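
The lane-level effect of this sequence can be sketched in portable C. This is
an illustrative model, not code from the patch: all function names are
hypothetical, and the arrays stand in for NEON D and Q registers.

```c
#include <stdint.h>

/* Absolute difference of two unsigned bytes, widened to 16 bits.  */
static uint16_t
absdiff_u8 (uint8_t a, uint8_t b)
{
  return (uint16_t) (a > b ? a - b : b - a);
}

/* Model of VABDL.u8: tmp[i] = |a[i] - b[i]|, widened to 16 bits.  */
static void
vabdl_u8_model (uint16_t tmp[8], const uint8_t a[8], const uint8_t b[8])
{
  for (int i = 0; i < 8; i++)
    tmp[i] = absdiff_u8 (a[i], b[i]);
}

/* Model of VABAL.u8: tmp[i] += |a[i] - b[i]|.  The sum of two byte
   differences is at most 510, so it fits in 16 bits.  */
static void
vabal_u8_model (uint16_t tmp[8], const uint8_t a[8], const uint8_t b[8])
{
  for (int i = 0; i < 8; i++)
    tmp[i] = (uint16_t) (tmp[i] + absdiff_u8 (a[i], b[i]));
}

/* Model of VPADAL.u16: add adjacent pairs of 16-bit lanes and accumulate
   the widened result into the 32-bit lanes of ACC.  */
static void
vpadal_u16_model (uint32_t acc[4], const uint16_t tmp[8])
{
  for (int i = 0; i < 4; i++)
    acc[i] += (uint32_t) tmp[2 * i] + (uint32_t) tmp[2 * i + 1];
}

/* The three-instruction sequence: low halves, high halves, then pairwise
   accumulate into operand 3.  */
static void
usadv16qi_sequence_model (uint32_t op3[4], const uint8_t op1[16],
                          const uint8_t op2[16])
{
  uint16_t tmp[8];
  vabdl_u8_model (tmp, op1, op2);          /* op1, op2 lowpart */
  vabal_u8_model (tmp, op1 + 8, op2 + 8);  /* op1, op2 highpart */
  vpadal_u16_model (op3, tmp);
}
```

In this model the 16-bit temporary cannot overflow, which is why a single
widening step to 16 bits before the pairwise accumulation into 32 bits is
enough.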

So, for the code:

$ arm-none-linux-gnueabihf-gcc -S -O3 -march=armv8-a+simd -mfpu=auto -mfloat-abi=hard usadv16qi.c -dp

#define N 1024
unsigned char pix1[N];
unsigned char pix2[N];

int
foo (void)
{
  int i_sum = 0;
  int i;
  for (i = 0; i < N; i++)
    i_sum += __builtin_abs (pix1[i] - pix2[i]);
  return i_sum;
}

we now generate on arm:
foo:
        movw    r3, #:lower16:pix2      @ 57    [c=4 l=4]  *arm_movsi_vfp/3
        movt    r3, #:upper16:pix2      @ 58    [c=4 l=4]  *arm_movt/0
        vmov.i32        q9, #0  @ v4si  @ 3     [c=4 l=4]  *neon_movv4si/2
        movw    r2, #:lower16:pix1      @ 59    [c=4 l=4]  *arm_movsi_vfp/3
        movt    r2, #:upper16:pix1      @ 60    [c=4 l=4]  *arm_movt/0
        add     r1, r3, #1024   @ 8     [c=4 l=4]  *arm_addsi3/4
.L2:
        vld1.8  {q11}, [r3]!    @ 11    [c=8 l=4]  *movmisalignv16qi_neon_load
        vld1.8  {q10}, [r2]!    @ 10    [c=8 l=4]  *movmisalignv16qi_neon_load
        cmp     r1, r3  @ 21    [c=4 l=4]  *arm_cmpsi_insn/2
        vabdl.u8        q8, d20, d22    @ 12    [c=8 l=4]  neon_vabdluv8qi
        vabal.u8        q8, d21, d23    @ 15    [c=88 l=4]  neon_vabaluv8qi
        vpadal.u16      q9, q8  @ 16    [c=8 l=4]  neon_vpadaluv8hi
        bne     .L2             @ 22    [c=16 l=4]  arm_cond_branch
        vadd.i32        d18, d18, d19   @ 24    [c=120 l=4]  quad_halves_plusv4si
        vpadd.i32       d18, d18, d18   @ 25    [c=8 l=4]  neon_vpadd_internalv2si
        vmov.32 r0, d18[0]      @ 30    [c=12 l=4]  vec_extractv2sisi/1

instead of:
foo:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        movw    r3, #:lower16:pix1
        movt    r3, #:upper16:pix1
        vmov.i32        q9, #0  @ v4si
        movw    r2, #:lower16:pix2
        movt    r2, #:upper16:pix2
        add     r1, r3, #1024
.L2:
        vld1.8  {q8}, [r3]!
        vld1.8  {q11}, [r2]!
        vmovl.u8 q10, d16
        cmp     r1, r3
        vmovl.u8 q8, d17
        vmovl.u8 q12, d22
        vmovl.u8 q11, d23
        vsub.i16        q10, q10, q12
        vsub.i16        q8, q8, q11
        vabs.s16        q10, q10
        vabs.s16        q8, q8
        vaddw.s16       q9, q9, d20
        vaddw.s16       q9, q9, d21
        vaddw.s16       q9, q9, d16
        vaddw.s16       q9, q9, d17
        bne     .L2
        vadd.i32        d18, d18, d19
        vpadd.i32       d18, d18, d18
        vmov.32 r0, d18[0]

2019-06-12  Przemyslaw Wirkus  <przemyslaw.wirkus@arm.com>

        * config/arm/iterators.md (VABAL): New int iterator.
        * config/arm/neon.md (<sup>sadv16qi): New define_expand.
        * config/arm/unspecs.md ("unspec"): Define UNSPEC_VABAL_S, UNSPEC_VABAL_U
        values.

        * gcc.target/arm/ssadv16qi.c: New test.
        * gcc.target/arm/usadv16qi.c: Likewise.

From-SVN: r272180
2019-06-12 08:27:59 +00:00
config Generalize getconf _NPROCESSORS_ONLN 2019-05-30 09:06:48 +00:00
contrib
fixincludes
gcc [arm] Implement usadv16qi and ssadv16qi standard names 2019-06-12 08:27:59 +00:00
gnattools
gotools
include Add warn_unused_result attribute for memory-related functions in libiberty. 2019-06-10 07:42:43 +00:00
INSTALL
intl
libada
libatomic
libbacktrace
libcc1
libcpp
libdecnumber
libffi
libgcc * libgcov-merge.c (__gcov_merge_single): Revert previous change. 2019-06-11 09:54:17 +02:00
libgfortran
libgo go/internal/gccgoimporter: ignore unexported and imported names 2019-06-07 00:07:50 +00:00
libgomp re PR target/90811 ([nvptx] ptxas error on OpenMP offloaded code) 2019-06-11 18:40:10 +02:00
libhsail-rt
libiberty cp-demangle.c: Don't define CP_DYNAMIC_ARRAYS if __STDC_NO_VLA__ is non-zero. 2019-05-31 12:25:48 -06:00
libitm
libobjc
liboffloadmic
libphobos
libquadmath
libsanitizer
libssp
libstdc++-v3 Fix ConstexprIterator requirements tests - No constexpr algorithms! 2019-06-11 16:29:35 +00:00
libvtv
lto-plugin
maintainer-scripts
zlib
.dir-locals.el
.gitattributes
.gitignore
ABOUT-NLS
ar-lib
ChangeLog removed extra .com, fixed e-mail. 2019-06-11 20:15:43 +00:00
ChangeLog.jit
ChangeLog.tree-ssa
compile
config-ml.in
config.guess
config.rpath
config.sub
configure Import these changes from the binutils/gdb repository: 2019-06-11 12:05:49 +00:00
configure.ac Import these changes from the binutils/gdb repository: 2019-06-11 12:05:49 +00:00
COPYING
COPYING3
COPYING3.LIB
COPYING.LIB
COPYING.RUNTIME
depcomp
install-sh
libtool-ldflags
libtool.m4
lt~obsolete.m4
ltgcc.m4
ltmain.sh
ltoptions.m4
ltsugar.m4
ltversion.m4
MAINTAINERS * MAINTAINERS (Write After Approval): Add myself. 2019-06-11 19:31:36 +00:00
Makefile.def Import these changes from the binutils/gdb repository: 2019-06-11 12:05:49 +00:00
Makefile.in Import these changes from the binutils/gdb repository: 2019-06-11 12:05:49 +00:00
Makefile.tpl
missing
mkdep
mkinstalldirs
move-if-change
multilib.am
README
symlink-tree
test-driver
ylwrap

This directory contains the GNU Compiler Collection (GCC).

The GNU Compiler Collection is free software.  See the files whose
names start with COPYING for copying permission.  The manuals, and
some of the runtime libraries, are under different terms; see the
individual source files for details.

The directory INSTALL contains copies of the installation information
as HTML and plain text.  The source of this information is
gcc/doc/install.texi.  The installation information includes details
of what is included in the GCC sources and what files GCC installs.

See the file gcc/doc/gcc.texi (together with other files that it
includes) for usage and porting information.  An online readable
version of the manual is in the files gcc/doc/gcc.info*.

See http://gcc.gnu.org/bugs/ for how to report bugs usefully.

Copyright years on GCC source files may be listed using range
notation, e.g., 1987-2012, indicating that every year in the range,
inclusive, is a copyrightable year that could otherwise be listed
individually.