rs6000.c (rs6000_builtin_vectorization_cost): Correct costs for vec_construct.

2016-08-12  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* config/rs6000/rs6000.c (rs6000_builtin_vectorization_cost):
	Correct costs for vec_construct.

From-SVN: r239417
Bill Schmidt 2016-08-12 15:23:34 +00:00 committed by William Schmidt
parent 8eb414aa6c
commit 42b5ebf32c
2 changed files with 14 additions and 5 deletions


@@ -1,3 +1,8 @@
+2016-08-12  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>
+
+	* config/rs6000/rs6000.c (rs6000_builtin_vectorization_cost):
+	Correct costs for vec_construct.
+
 2016-08-12  Bin Cheng  <bin.cheng@arm.com>
 
 	PR tree-optimization/69848


@@ -5266,16 +5266,20 @@ rs6000_builtin_vectorization_cost (enum vect_cost_for_stmt type_of_cost,
         return 2;
       case vec_construct:
-        elements = TYPE_VECTOR_SUBPARTS (vectype);
+        /* This is a rough approximation assuming non-constant elements
+           constructed into a vector via element insertion.  FIXME:
+           vec_construct is not granular enough for uniformly good
+           decisions.  If the initialization is a splat, this is
+           cheaper than we estimate.  Improve this someday.  */
         elem_type = TREE_TYPE (vectype);
         /* 32-bit vectors loaded into registers are stored as double
-           precision, so we need n/2 converts in addition to the usual
-           n/2 merges to construct a vector of short floats from them.  */
+           precision, so we need 2 permutes, 2 converts, and 1 merge
+           to construct a vector of short floats from them.  */
         if (SCALAR_FLOAT_TYPE_P (elem_type)
             && TYPE_PRECISION (elem_type) == 32)
-          return elements + 1;
+          return 5;
         else
-          return elements / 2 + 1;
+          return max (2, TYPE_VECTOR_SUBPARTS (vectype) - 1);
       default:
         gcc_unreachable ();
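
To make the effect of the change easier to see, the following is a minimal stand-alone sketch of the old and new `vec_construct` cost formulas from the hunk above. It is not GCC code: the function names `old_vec_construct_cost`, `new_vec_construct_cost`, and `max_int`, and the `is_float32` flag (standing in for the `SCALAR_FLOAT_TYPE_P` / `TYPE_PRECISION` test) are hypothetical, and `elements` stands in for `TYPE_VECTOR_SUBPARTS (vectype)`.

```c
#include <assert.h>

/* Hypothetical helper; GCC itself uses its own max macro.  */
static int max_int (int a, int b)
{
  return a > b ? a : b;
}

/* Cost model before the commit: n + 1 for 32-bit floats,
   n / 2 + 1 for everything else.  */
static int old_vec_construct_cost (int elements, int is_float32)
{
  return is_float32 ? elements + 1 : elements / 2 + 1;
}

/* Cost model after the commit: a flat 5 for 32-bit floats
   (2 permutes + 2 converts + 1 merge), otherwise n - 1
   element insertions with a floor of 2.  */
static int new_vec_construct_cost (int elements, int is_float32)
{
  return is_float32 ? 5 : max_int (2, elements - 1);
}

/* Sanity checks for a few 128-bit vector shapes.  */
static void check_vec_construct_model (void)
{
  assert (old_vec_construct_cost (4, 1) == 5);   /* 4 x float  */
  assert (new_vec_construct_cost (4, 1) == 5);
  assert (old_vec_construct_cost (8, 0) == 5);   /* 8 x short  */
  assert (new_vec_construct_cost (8, 0) == 7);
  assert (old_vec_construct_cost (16, 0) == 9);  /* 16 x char  */
  assert (new_vec_construct_cost (16, 0) == 15);
}
```

Under this model the estimate is unchanged for two- and four-element vectors, and rises for vectors with eight or more integer elements, where the old `n / 2 + 1` formula undercounted the element insertions.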