tree-optimization/105053 - fix reduction chain epilogue generation

When we optimize permutations in a reduction chain we have to
be careful to select the correct live-out stmt.  Otherwise the
reduction result is unused and the retained scalar code executes
only as many iterations as the vector loop.

2022-03-25  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/105053
	* tree-vect-loop.cc (vect_create_epilog_for_reduction): Pick
	the correct live-out stmt for a reduction chain.

	* g++.dg/vect/pr105053.cc: New testcase.
Richard Biener 2022-03-25 14:31:25 +01:00
parent d0b938a761
commit fe705dce2e
2 changed files with 36 additions and 3 deletions

g++.dg/vect/pr105053.cc

@@ -0,0 +1,25 @@
+// { dg-require-effective-target c++11 }
+// { dg-require-effective-target int32plus }
+#include <vector>
+#include <tuple>
+#include <algorithm>
+int main()
+{
+  const int n = 4;
+  std::vector<std::tuple<int,int,double>> vec
+    = { { 1597201307, 1817606674, 0. },
+        { 1380347796, 1721941769, 0.},
+        {837975613, 1032707773, 0.},
+        {1173654292, 2020064272, 0.} } ;
+  int sup1 = 0;
+  for(int i=0;i<n;++i)
+    sup1=std::max(sup1,std::max(std::get<0>(vec[i]),std::get<1>(vec[i])));
+  int sup2 = 0;
+  for(int i=0;i<n;++i)
+    sup2=std::max(std::max(sup2,std::get<0>(vec[i])),std::get<1>(vec[i]));
+  if (sup1 != sup2)
+    std::abort ();
+  return 0;
+}

tree-vect-loop.cc

@@ -5271,9 +5271,17 @@ vect_create_epilog_for_reduction (loop_vec_info loop_vinfo,
     /* All statements produce live-out values. */
     live_out_stmts = SLP_TREE_SCALAR_STMTS (slp_node);
   else if (slp_node)
-    /* The last statement in the reduction chain produces the live-out
-       value. */
-    single_live_out_stmt[0] = SLP_TREE_SCALAR_STMTS (slp_node)[group_size - 1];
+    {
+      /* The last statement in the reduction chain produces the live-out
+         value. Note SLP optimization can shuffle scalar stmts to
+         optimize permutations so we have to search for the last stmt. */
+      for (k = 0; k < group_size; ++k)
+        if (!REDUC_GROUP_NEXT_ELEMENT (SLP_TREE_SCALAR_STMTS (slp_node)[k]))
+          {
+            single_live_out_stmt[0] = SLP_TREE_SCALAR_STMTS (slp_node)[k];
+            break;
+          }
+    }

   unsigned vec_num;
   int ncopies;
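
The idea behind the change can be modelled outside of GCC. The following is only a sketch under stated assumptions: `Stmt`, `last_by_position` and `last_by_chain` are hypothetical names invented for illustration, not GCC types or functions, and the `next` pointer merely stands in for what `REDUC_GROUP_NEXT_ELEMENT` provides in the patch. It contrasts picking the last array slot (the old behaviour) with searching for the element that has no successor (the new behaviour).

```cpp
#include <vector>

// Hypothetical stand-in for a scalar statement in a reduction chain.
// `next` models the successor link; it is null only for the final
// statement of the chain.
struct Stmt {
  int id;
  const Stmt *next;
};

// Old approach: trust the position in the (possibly permuted) array.
const Stmt *last_by_position(const std::vector<const Stmt *> &stmts) {
  return stmts.empty() ? nullptr : stmts.back();
}

// New approach: search for the statement that has no successor,
// independent of where it sits in the array.
const Stmt *last_by_chain(const std::vector<const Stmt *> &stmts) {
  for (const Stmt *s : stmts)
    if (!s->next)
      return s;
  return nullptr;
}
```

For a chain a -> b -> c stored in the shuffled order {a, c, b}, `last_by_position` returns b while `last_by_chain` returns c. When the array is still in chain order, both functions agree.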