rs6000/test: Add emulated gather test case

As verified, the vectorizer's emulated gather capability
(r12-2733) speeds up SPEC2017 510.parest_r on Power8/9/10
by 5% ~ 9% with the option sets Ofast unroll and Ofast lto.

This patch adds a test case, similar to the one for i386,
to extend test coverage for the 510.parest_r hotspots.

Unlike the i386 test, this one uses unsigned int as
INDEXTYPE, since the unpack support for unsigned int
(r12-3134) also matters for vectorizing the hotspots.
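
For readers unfamiliar with the term, the following is a rough hand-written sketch of what "emulated gather" means for this loop, assuming two double lanes per vector. It is an illustration only, not the code the vectorizer emits; the helper names vmul_scalar and vmul_emulated are hypothetical. The indexed loads from dst are done as scalar loads with zero-extended unsigned indices, while the contiguous luval load and the multiply-add stand in for the vector part.

```c
#include <assert.h>
#include <stddef.h>

#define INDEXTYPE unsigned int

/* Scalar reference: the same loop shape as in the test case.  */
static double vmul_scalar(const INDEXTYPE *rowstart, const INDEXTYPE *rowend,
                          const double *luval, const double *dst)
{
  double res = 0;
  for (const INDEXTYPE *col = rowstart; col != rowend; ++col, ++luval)
    res += *luval * dst[*col];
  return res;
}

/* Hypothetical illustration of an emulated gather with two lanes.
   Not compiler output; the lane count is an assumption.  */
static double vmul_emulated(const INDEXTYPE *rowstart, const INDEXTYPE *rowend,
                            const double *luval, const double *dst)
{
  assert(rowend >= rowstart);
  size_t n = (size_t) (rowend - rowstart);
  double acc[2] = { 0.0, 0.0 };
  size_t i = 0;

  for (; i + 2 <= n; i += 2)
    {
      /* Each unsigned index is zero-extended to pointer width as a scalar.  */
      size_t i0 = rowstart[i];
      size_t i1 = rowstart[i + 1];
      /* Two scalar loads replace the gather and form one "vector".  */
      double g[2] = { dst[i0], dst[i1] };
      /* Contiguous luval elements and the multiply-add are the vector part.  */
      acc[0] += luval[i] * g[0];
      acc[1] += luval[i + 1] * g[1];
    }

  /* Reduce the lanes, then handle any leftover element.  */
  double res = acc[0] + acc[1];
  for (; i < n; ++i)
    res += luval[i] * dst[rowstart[i]];
  return res;
}
```

Note that the two-lane version sums in a different order than the scalar loop, which is one reason the test is built with -Ofast.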

gcc/testsuite/ChangeLog:

	* gcc.target/powerpc/vect-gather-1.c: New test.
Kewen Lin 2021-11-28 19:59:59 -06:00
parent 68332ab7ec
commit 300dbea126
1 changed files with 20 additions and 0 deletions

@@ -0,0 +1,20 @@
/* { dg-do compile } */
/* Profitable from Power8 since it supports efficient unaligned load. */
/* { dg-options "-Ofast -mdejagnu-cpu=power8 -fdump-tree-vect-details -fdump-tree-forwprop4" } */
#ifndef INDEXTYPE
#define INDEXTYPE unsigned int
#endif
double vmul(INDEXTYPE *rowstart, INDEXTYPE *rowend,
            double *luval, double *dst)
{
  double res = 0;
  for (const INDEXTYPE * col = rowstart; col != rowend; ++col, ++luval)
    res += *luval * dst[*col];
  return res;
}
/* With gather emulation this should be profitable to vectorize from Power8. */
/* { dg-final { scan-tree-dump "loop vectorized" "vect" } } */
/* The index vector loads and promotions should be scalar after forwprop. */
/* { dg-final { scan-tree-dump-not "vec_unpack" "forwprop4" } } */