Alejandro Martinez 9feeafd7f9 [Aarch64][SVE] Dot product support
This patch does two things. For the general vectoriser, it adds support to
perform fully masked reductions over expressions that don't support masking.
This is achieved by using VEC_COND_EXPR where possible.  At the moment this is
implemented for DOT_PROD_EXPR only, but the framework is there to extend it to
other expressions.

Related to that, this patch adds support to vectorize dot product using SVE.  It
also uses the new functionality to ensure that the resulting loop is masked.

Given this input code:

uint32_t
dotprod (uint8_t *restrict x, uint8_t *restrict y, int n)
{
  uint32_t sum = 0;

  for (int i = 0; i < n; i++)
    {
      sum += x[i] * y[i];
    }

  return sum;
}

The resulting SVE code is:

0000000000000000 <dotprod>:
   0:	7100005f 	cmp	w2, #0x0
   4:	5400024d 	b.le	4c <dotprod+0x4c>
   8:	d2800003 	mov	x3, #0x0                   	// #0
   c:	93407c42 	sxtw	x2, w2
  10:	2538c001 	mov	z1.b, #0
  14:	25221fe0 	whilelo	p0.b, xzr, x2
  18:	2538c003 	mov	z3.b, #0
  1c:	d503201f 	nop
  20:	a4034002 	ld1b	{z2.b}, p0/z, [x0, x3]
  24:	a4034020 	ld1b	{z0.b}, p0/z, [x1, x3]
  28:	0430e3e3 	incb	x3
  2c:	0523c000 	sel	z0.b, p0, z0.b, z3.b
  30:	25221c60 	whilelo	p0.b, x3, x2
  34:	44820401 	udot	z1.s, z0.b, z2.b
  38:	54ffff41 	b.ne	20 <dotprod+0x20>  // b.any
  3c:	2598e3e0 	ptrue	p0.s
  40:	04812021 	uaddv	d1, p0, z1.s
  44:	1e260020 	fmov	w0, s1
  48:	d65f03c0 	ret
  4c:	1e2703e1 	fmov	s1, wzr
  50:	1e260020 	fmov	w0, s1
  54:	d65f03c0 	ret

Notice how udot is used inside a fully masked loop.
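
As a rough illustration of the masking idea (not code from the patch), the
scalar C model below mimics the loop above.  The names VL and
dotprod_masked_model are hypothetical, and the fixed 16-byte vector length is
an assumption made only for the sketch; real SVE vector lengths are
implementation-defined.  Inactive lanes are forced to zero before the
multiply, so an unpredicated dot product still accumulates the right sum.

```c
#include <stdint.h>

#define VL 16  /* Assumed vector length in bytes, for illustration only.  */

/* Scalar model of the fully masked loop: lanes at or beyond N are
   inactive and contribute zero to the reduction.  */
uint32_t
dotprod_masked_model (const uint8_t *x, const uint8_t *y, int n)
{
  uint32_t sum = 0;

  for (int base = 0; base < n; base += VL)
    {
      for (int lane = 0; lane < VL; lane++)
        {
          /* Models the whilelo predicate.  */
          int active = base + lane < n;
          /* Models the zeroing predicated loads (ld1b with p0/z);
             inactive lanes are never read from memory.  */
          uint8_t a = active ? x[base + lane] : 0;
          uint8_t b = active ? y[base + lane] : 0;
          /* Models the select against a zero vector (VEC_COND_EXPR).  */
          b = active ? b : 0;
          /* Models the unpredicated udot accumulation.  */
          sum += (uint32_t) a * b;
        }
    }

  return sum;
}
```

In this model the select is redundant with the zeroing loads, much as the sel
instruction is in the listing above; it is kept to show where the
VEC_COND_EXPR sits relative to the reduction.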

I tested this patch on an aarch64 machine by bootstrapping the compiler and
running the checks.

gcc/Changelog:

2019-05-02  Alejandro Martinez  <alejandro.martinezvicente@arm.com>

	* config/aarch64/aarch64-sve.md (<sur>dot_prod<vsi2qi>): Taken from SVE
	ACLE branch.
	* config/aarch64/iterators.md: Copied Vetype_fourth, VSI2QI and vsi2qi from
	SVE ACLE branch.
	* tree-vect-loop.c (use_mask_by_cond_expr_p): New function to check if a
	VEC_COND_EXPR can be inserted to emulate a conditional internal function.
	(build_vect_cond_expr): Emit the VEC_COND_EXPR.
	(vectorizable_reduction): Use the functions above to vectorize, in a
	fully masked loop, codes that don't have a conditional internal
	function.

gcc/testsuite/Changelog:
 
2019-05-02  Alejandro Martinez  <alejandro.martinezvicente@arm.com>

	* gcc.target/aarch64/sve/dot_1.c: New test for dot product.

From-SVN: r270790
2019-05-02 09:58:00 +00:00
config
contrib * check-internal-format-escaping.py: New version using polib. 2019-04-30 10:14:40 -06:00
fixincludes
gcc [Aarch64][SVE] Dot product support 2019-05-02 09:58:00 +00:00
gnattools
gotools
include libiberty.h (vasprintf): Don't declare if HAVE_DECL_VASPRINTF is not defined. 2019-04-26 09:35:01 -06:00
INSTALL
intl
libada
libatomic
libbacktrace
libcc1
libcpp
libdecnumber
libffi
libgcc
libgfortran
libgo compiler: recognize and optimize map range clear 2019-05-01 21:37:00 +00:00
libgomp
libhsail-rt
libiberty d-demangle.c (dlang_parse_assocarray): Correctly handle error result. 2019-04-30 08:39:14 -06:00
libitm
libobjc
liboffloadmic
libphobos libphobos: Fix multilib builds for s390x-linux-gnu 2019-04-29 05:42:48 +00:00
libquadmath
libsanitizer
libssp
libstdc++-v3 Update Solaris baselines for GCC 9.1 2019-05-01 16:14:30 +00:00
libvtv
lto-plugin
maintainer-scripts
zlib
.dir-locals.el
.gitattributes
.gitignore
ABOUT-NLS
ar-lib
ChangeLog
ChangeLog.jit
ChangeLog.tree-ssa
compile
config-ml.in
config.guess
config.rpath
config.sub
configure
configure.ac
COPYING
COPYING3
COPYING3.LIB
COPYING.LIB
COPYING.RUNTIME
depcomp
install-sh
libtool-ldflags
libtool.m4
lt~obsolete.m4
ltgcc.m4
ltmain.sh
ltoptions.m4
ltsugar.m4
ltversion.m4
MAINTAINERS
Makefile.def
Makefile.in
Makefile.tpl
missing
mkdep
mkinstalldirs
move-if-change
multilib.am
README
symlink-tree
test-driver
ylwrap

This directory contains the GNU Compiler Collection (GCC).

The GNU Compiler Collection is free software.  See the files whose
names start with COPYING for copying permission.  The manuals, and
some of the runtime libraries, are under different terms; see the
individual source files for details.

The directory INSTALL contains copies of the installation information
as HTML and plain text.  The source of this information is
gcc/doc/install.texi.  The installation information includes details
of what is included in the GCC sources and what files GCC installs.

See the file gcc/doc/gcc.texi (together with other files that it
includes) for usage and porting information.  An online readable
version of the manual is in the files gcc/doc/gcc.info*.

See http://gcc.gnu.org/bugs/ for how to report bugs usefully.

Copyright years on GCC source files may be listed using range
notation, e.g., 1987-2012, indicating that every year in the range,
inclusive, is a copyrightable year that could otherwise be listed
individually.