extend.texi: Add fvect-cost-model flag.
gcc/ChangeLog:

2007-06-08 Harsha Jagasia <harsha.jagasia@amd.com>
	    Tony Linthicum <tony.linthicum@amd.com>

	* doc/extend.texi: Add fvect-cost-model flag.
	* common.opt (fvect-cost-model): New flag.
	* tree-vectorizer.c (new_stmt_vec_info): Initialize inside and outside
	cost fields in stmt_vec_info struct for STMT.
	* tree-vectorizer.h (stmt_vec_info): Define inside and outside cost
	fields in stmt_vec_info struct and access functions for the same.
	(TARG_COND_BRANCH_COST): Define cost of conditional branch.
	(TARG_VEC_STMT_COST): Define cost of any vector operation, excluding
	load, store and vector to scalar operation.
	(TARG_VEC_TO_SCALAR_COST): Define cost of vector to scalar operation.
	(TARG_VEC_LOAD_COST): Define cost of aligned vector load.
	(TARG_VEC_UNALIGNED_LOAD_COST): Define cost of misaligned vector load.
	(TARG_VEC_STORE_COST): Define cost of vector store.
	(vect_estimate_min_profitable_iters): Define new function.
	* tree-vect-analyze.c (vect_analyze_operations): Add a compile-time
	check to evaluate if loop iterations are less than minimum profitable
	iterations determined by cost model or minimum vect loop bound defined
	by user, whichever is more conservative.
	* tree-vect-transform.c (vect_do_peeling_for_loop_bound): Add a
	run-time check to evaluate if loop iterations are less than minimum
	profitable iterations determined by cost model or minimum vect loop
	bound defined by user, whichever is more conservative.
	(vect_estimate_min_profitable_iters): New function to estimate
	minimum iterations required for vector version of loop to be
	profitable over scalar version.
	(vect_model_reduction_cost): New function.
	(vect_model_induction_cost): New function.
	(vect_model_simple_cost): New function.
	(vect_cost_strided_group_size): New function.
	(vect_model_store_cost): New function.
	(vect_model_load_cost): New function.
	(vectorizable_reduction): Call vect_model_reduction_cost during
	analysis phase.
	(vectorizable_induction): Call vect_model_induction_cost during
	analysis phase.
	(vectorizable_load): Call vect_model_load_cost during analysis phase.
	(vectorizable_store): Call vect_model_store_cost during analysis phase.
	(vectorizable_call, vectorizable_assignment, vectorizable_operation,
	vectorizable_promotion, vectorizable_demotion): Call
	vect_model_simple_cost during analysis phase.

gcc/testsuite/ChangeLog:

2007-06-08 Harsha Jagasia <harsha.jagasia@amd.com>

	* gcc.dg/vect/costmodel: New directory.
	* gcc.dg/vect/costmodel/i386: New directory.
	* gcc.dg/vect/costmodel/i386/i386-costmodel-vect.exp: New testsuite.
	* gcc.dg/vect/costmodel/i386/costmodel-fast-math-vect-pr29925.c:
	New test.
	* gcc.dg/vect/costmodel/i386/costmodel-vect-31.c: New test.
	* gcc.dg/vect/costmodel/i386/costmodel-vect-33.c: New test.
	* gcc.dg/vect/costmodel/i386/costmodel-vect-68.c: New test.
	* gcc.dg/vect/costmodel/i386/costmodel-vect-reduc-1char.c: New test.
	* gcc.dg/vect/costmodel/x86_64: New directory.
	* gcc.dg/vect/costmodel/x86_64/x86_64-costmodel-vect.exp:
	New testsuite.
	* gcc.dg/vect/costmodel/x86_64/costmodel-fast-math-vect-pr29925.c:
	New test.
	* gcc.dg/vect/costmodel/x86_64/costmodel-vect-31.c: New test.
	* gcc.dg/vect/costmodel/x86_64/costmodel-vect-33.c: New test.
	* gcc.dg/vect/costmodel/x86_64/costmodel-vect-68.c: New test.
	* gcc.dg/vect/costmodel/x86_64/costmodel-vect-reduc-1char.c: New test.
	* gcc.dg/vect/costmodel/x86_64/costmodel-pr30843.c: New test.

Co-Authored-By: Tony Linthicum <tony.linthicum@amd.com>
From-SVN: r125575
parent c8e2516ccf
commit 792ed98bb7
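The commit message describes estimating the minimum number of iterations at which the vector version of a loop becomes profitable over the scalar version. One way to read that is as a break-even calculation. The sketch below is an assumption for illustration only, not the code this commit adds: the function name `min_profitable_iters_sketch`, its parameters, and the cost formula (a one-time outside cost plus a per-vector-iteration inside cost, with epilogue costs ignored) are all hypothetical. It returns -1 when the vector version can never win, matching the negative-result convention the analysis hunk checks for.

```c
#include <assert.h>

/* Hypothetical break-even estimate (illustrative names, simplified model).
   Scalar cost of n iterations:  n * scalar_iter_cost
   Vector cost of n iterations:  vec_outside_cost + (n / vf) * vec_iter_cost
   Returns the smallest n for which the vector cost is strictly lower,
   or -1 if the vector version is never cheaper.  */
int
min_profitable_iters_sketch (int scalar_iter_cost, int vec_iter_cost,
                             int vec_outside_cost, int vf)
{
  assert (vf > 0 && scalar_iter_cost >= 0
          && vec_iter_cost >= 0 && vec_outside_cost >= 0);

  /* Cost saved by one vector iteration compared with vf scalar ones.  */
  int gain_per_vec_iter = scalar_iter_cost * vf - vec_iter_cost;

  if (gain_per_vec_iter <= 0)
    return -1;

  /* Solve n * gain_per_vec_iter > vec_outside_cost * vf for integer n.  */
  return (vec_outside_cost * vf) / gain_per_vec_iter + 1;
}
```

With a scalar cost of 1, a vector cost of 2, an outside cost of 10 and a vectorization factor of 4, the sketch reports 21 iterations: at 20 iterations both versions cost 20 units, so the vector version only pulls ahead from iteration 21.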
@@ -1,3 +1,47 @@
+2007-06-08 Harsha Jagasia <harsha.jagasia@amd.com>
+	    Tony Linthicum <tony.linthicum@amd.com>
+
+	* doc/extend.texi: Add fvect-cost-model flag.
+	* common.opt (fvect-cost-model): New flag.
+	* tree-vectorizer.c (new_stmt_vec_info): Initialize inside and outside
+	cost fields in stmt_vec_info struct for STMT.
+	* tree-vectorizer.h (stmt_vec_info): Define inside and outside cost
+	fields in stmt_vec_info struct and access functions for the same.
+	(TARG_COND_BRANCH_COST): Define cost of conditional branch.
+	(TARG_VEC_STMT_COST): Define cost of any vector operation, excluding
+	load, store and vector to scalar operation.
+	(TARG_VEC_TO_SCALAR_COST): Define cost of vector to scalar operation.
+	(TARG_VEC_LOAD_COST): Define cost of aligned vector load.
+	(TARG_VEC_UNALIGNED_LOAD_COST): Define cost of misaligned vector load.
+	(TARG_VEC_STORE_COST): Define cost of vector store.
+	(vect_estimate_min_profitable_iters): Define new function.
+	* tree-vect-analyze.c (vect_analyze_operations): Add a compile-time
+	check to evaluate if loop iterations are less than minimum profitable
+	iterations determined by cost model or minimum vect loop bound defined
+	by user, whichever is more conservative.
+	* tree-vect-transform.c (vect_do_peeling_for_loop_bound): Add a
+	run-time check to evaluate if loop iterations are less than minimum
+	profitable iterations determined by cost model or minimum vect loop
+	bound defined by user, whichever is more conservative.
+	(vect_estimate_min_profitable_iters): New function to estimate
+	minimum iterations required for vector version of loop to be
+	profitable over scalar version.
+	(vect_model_reduction_cost): New function.
+	(vect_model_induction_cost): New function.
+	(vect_model_simple_cost): New function.
+	(vect_cost_strided_group_size): New function.
+	(vect_model_store_cost): New function.
+	(vect_model_load_cost): New function.
+	(vectorizable_reduction): Call vect_model_reduction_cost during
+	analysis phase.
+	(vectorizable_induction): Call vect_model_induction_cost during
+	analysis phase.
+	(vectorizable_load): Call vect_model_load_cost during analysis phase.
+	(vectorizable_store): Call vect_model_store_cost during analysis phase.
+	(vectorizable_call, vectorizable_assignment, vectorizable_operation,
+	vectorizable_promotion, vectorizable_demotion): Call
+	vect_model_simple_cost during analysis phase.
+
 2007-06-08 Simon Baldwin <simonb@google.com>
 
 	* reg-stack.c (get_true_reg): Readability change. Moved default case
@@ -1110,6 +1110,10 @@ ftree-vectorize
 Common Report Var(flag_tree_vectorize) Optimization
 Enable loop vectorization on trees
 
+fvect-cost-model
+Common Report Var(flag_vect_cost_model) Optimization
+Enable use of cost model in vectorization
+
 ftree-vect-loop-version
 Common Report Var(flag_tree_vect_loop_version) Init(1) Optimization
 Enable loop versioning when doing loop vectorization on trees
@@ -357,7 +357,7 @@ Objective-C and Objective-C++ Dialects}.
 -fcheck-data-deps @gol
 -ftree-dominator-opts -ftree-dse -ftree-copyrename -ftree-sink @gol
 -ftree-ch -ftree-sra -ftree-ter -ftree-fre -ftree-vectorize @gol
--ftree-vect-loop-version -ftree-salias -fipa-pta -fweb @gol
+-ftree-vect-loop-version -fvect-cost-model -ftree-salias -fipa-pta -fweb @gol
 -ftree-copy-prop -ftree-store-ccp -ftree-store-copy-prop -fwhole-program @gol
 --param @var{name}=@var{value}
 -O -O0 -O1 -O2 -O3 -Os}
@@ -5666,6 +5666,9 @@ the loop are generated along with runtime checks for alignment or dependence
 to control which version is executed. This option is enabled by default
 except at level @option{-Os} where it is disabled.
 
+@item -fvect-cost-model
+Enable cost model for vectorization.
+
 @item -ftree-vrp
 Perform Value Range Propagation on trees. This is similar to the
 constant propagation pass, but instead of values, ranges of values are
@@ -1,3 +1,25 @@
+2007-06-08 Harsha Jagasia <harsha.jagasia@amd.com>
+
+	* gcc.dg/vect/costmodel: New directory.
+	* gcc.dg/vect/costmodel/i386: New directory.
+	* gcc.dg/vect/costmodel/i386/i386-costmodel-vect.exp: New testsuite.
+	* gcc.dg/vect/costmodel/i386/costmodel-fast-math-vect-pr29925.c:
+	New test.
+	* gcc.dg/vect/costmodel/i386/costmodel-vect-31.c: New test.
+	* gcc.dg/vect/costmodel/i386/costmodel-vect-33.c: New test.
+	* gcc.dg/vect/costmodel/i386/costmodel-vect-68.c: New test.
+	* gcc.dg/vect/costmodel/i386/costmodel-vect-reduc-1char.c: New test.
+	* gcc.dg/vect/costmodel/x86_64: New directory.
+	* gcc.dg/vect/costmodel/x86_64/x86_64-costmodel-vect.exp:
+	New testsuite.
+	* gcc.dg/vect/costmodel/x86_64/costmodel-fast-math-vect-pr29925.c:
+	New test.
+	* gcc.dg/vect/costmodel/x86_64/costmodel-vect-31.c: New test.
+	* gcc.dg/vect/costmodel/x86_64/costmodel-vect-33.c: New test.
+	* gcc.dg/vect/costmodel/x86_64/costmodel-vect-68.c: New test.
+	* gcc.dg/vect/costmodel/x86_64/costmodel-vect-reduc-1char.c: New test.
+	* gcc.dg/vect/costmodel/x86_64/costmodel-pr30843.c: New test.
+
 2007-06-08 Uros Bizjak <ubizjak@gmail.com>
 
 	PR tree-optimization/32243
@@ -0,0 +1,39 @@
+/* { dg-require-effective-target vect_float } */
+
+#include <stdlib.h>
+#include "../../tree-vect.h"
+
+void interp_pitch(float *exc, float *interp, int pitch, int len)
+{
+  int i,k;
+  int maxj;
+
+  maxj=3;
+  for (i=0;i<len;i++)
+    {
+      float tmp = 0;
+      for (k=0;k<7;k++)
+        {
+          tmp += exc[i-pitch+k+maxj-6];
+        }
+      interp[i] = tmp;
+    }
+}
+
+int main()
+{
+  float *exc = calloc(126,sizeof(float));
+  float *interp = calloc(80,sizeof(float));
+  int pitch = -35;
+
+  check_vect ();
+
+  interp_pitch(exc, interp, pitch, 80);
+  free(exc);
+  free(interp);
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "vectorization not profitable" 1 "vect" } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
+
gcc/testsuite/gcc.dg/vect/costmodel/i386/costmodel-vect-31.c (Normal file, 91 changes)
@@ -0,0 +1,91 @@
+/* { dg-require-effective-target vect_int } */
+
+#include <stdarg.h>
+#include "../../tree-vect.h"
+
+#define N 32
+
+struct t{
+  int k[N];
+  int l;
+};
+
+struct s{
+  char a; /* aligned */
+  char b[N-1]; /* unaligned (offset 1B) */
+  char c[N]; /* aligned (offset NB) */
+  struct t d; /* aligned (offset 2NB) */
+  struct t e; /* unaligned (offset 2N+4N+4 B) */
+};
+
+int main1 ()
+{
+  int i;
+  struct s tmp;
+
+  /* unaligned */
+  for (i = 0; i < N/2; i++)
+    {
+      tmp.b[i] = 5;
+    }
+
+  /* check results: */
+  for (i = 0; i <N/2; i++)
+    {
+      if (tmp.b[i] != 5)
+        abort ();
+    }
+
+  /* aligned */
+  for (i = 0; i < N/2; i++)
+    {
+      tmp.c[i] = 6;
+    }
+
+  /* check results: */
+  for (i = 0; i <N/2; i++)
+    {
+      if (tmp.c[i] != 6)
+        abort ();
+    }
+
+  /* aligned */
+  for (i = 0; i < N/2; i++)
+    {
+      tmp.d.k[i] = 7;
+    }
+
+  /* check results: */
+  for (i = 0; i <N/2; i++)
+    {
+      if (tmp.d.k[i] != 7)
+        abort ();
+    }
+
+  /* unaligned */
+  for (i = 0; i < N/2; i++)
+    {
+      tmp.e.k[i] = 8;
+    }
+
+  /* check results: */
+  for (i = 0; i <N/2; i++)
+    {
+      if (tmp.e.k[i] != 8)
+        abort ();
+    }
+
+  return 0;
+}
+
+int main (void)
+{
+  check_vect ();
+
+  return main1 ();
+}
+
+/* { dg-final { scan-tree-dump-times "vectorization not profitable" 2 "vect" } }
+*/
+/* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
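The aligned/unaligned comments on the members of `struct s` in costmodel-vect-31.c can be checked with `offsetof`. The sketch below is illustrative and not part of the commit. It assumes a 4-byte `int` with 4-byte alignment and a 16-byte vector alignment (as with SSE2), and that the enclosing object itself starts on a 16-byte boundary. The names `VECTOR_ALIGN`, `is_vector_aligned` and `check_layout` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define N 32
#define VECTOR_ALIGN 16   /* assumed vector alignment in bytes */

/* Same layout as the structs in the test.  */
struct t { int k[N]; int l; };

struct s {
  char a;
  char b[N-1];
  char c[N];
  struct t d;
  struct t e;
};

/* True if a member at OFFSET is vector-aligned, assuming the enclosing
   object starts on a VECTOR_ALIGN boundary.  */
bool
is_vector_aligned (size_t offset)
{
  return offset % VECTOR_ALIGN == 0;
}

/* Check the offsets named in the test's comments.  */
void
check_layout (void)
{
  assert (offsetof (struct s, b) == 1);        /* offset 1B */
  assert (offsetof (struct s, c) == N);        /* offset NB */
  assert (offsetof (struct s, d) == 2 * N);    /* offset 2NB */
  /* offset 2N+4N+4 B when sizeof (int) is 4 */
  assert (offsetof (struct s, e) == 2 * N + (N + 1) * sizeof (int));
}
```

Under these assumptions `e` sits at byte 196, which leaves a remainder of 4 modulo 16, so stores to `tmp.e.k` start misaligned while stores to `tmp.c` and `tmp.d.k` do not.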
gcc/testsuite/gcc.dg/vect/costmodel/i386/costmodel-vect-33.c (Normal file, 39 changes)
@@ -0,0 +1,39 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_int } */
+
+#include <stdarg.h>
+#include "../../tree-vect.h"
+
+#define N 16
+struct test {
+  char ca[N];
+};
+
+extern struct test s;
+
+int main1 ()
+{
+  int i;
+
+  for (i = 0; i < N; i++)
+    {
+      s.ca[i] = 5;
+    }
+
+  /* check results: */
+  for (i = 0; i < N; i++)
+    {
+      if (s.ca[i] != 5)
+        abort ();
+    }
+
+  return 0;
+}
+
+int main (void)
+{
+  return main1 ();
+}
+
+/* { dg-final { scan-tree-dump-times "vectorization not profitable" 1 "vect" } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
gcc/testsuite/gcc.dg/vect/costmodel/i386/costmodel-vect-68.c (Normal file, 89 changes)
@@ -0,0 +1,89 @@
+/* { dg-require-effective-target vect_int } */
+
+#include <stdarg.h>
+#include "../../tree-vect.h"
+
+#define N 32
+
+struct s{
+  int m;
+  int n[N][N][N];
+};
+
+struct test1{
+  struct s a; /* array a.n is unaligned */
+  int b;
+  int c;
+  struct s e; /* array e.n is aligned */
+};
+
+int main1 ()
+{
+  int i,j;
+  struct test1 tmp1;
+
+  /* 1. unaligned */
+  for (i = 0; i < N; i++)
+    {
+      tmp1.a.n[1][2][i] = 5;
+    }
+
+  /* check results: */
+  for (i = 0; i <N; i++)
+    {
+      if (tmp1.a.n[1][2][i] != 5)
+        abort ();
+    }
+
+  /* 2. aligned */
+  for (i = 3; i < N-1; i++)
+    {
+      tmp1.a.n[1][2][i] = 6;
+    }
+
+  /* check results: */
+  for (i = 3; i < N-1; i++)
+    {
+      if (tmp1.a.n[1][2][i] != 6)
+        abort ();
+    }
+
+  /* 3. aligned */
+  for (i = 0; i < N; i++)
+    {
+      tmp1.e.n[1][2][i] = 7;
+    }
+
+  /* check results: */
+  for (i = 0; i < N; i++)
+    {
+      if (tmp1.e.n[1][2][i] != 7)
+        abort ();
+    }
+
+  /* 4. unaligned */
+  for (i = 3; i < N-3; i++)
+    {
+      tmp1.e.n[1][2][i] = 8;
+    }
+
+  /* check results: */
+  for (i = 3; i <N-3; i++)
+    {
+      if (tmp1.e.n[1][2][i] != 8)
+        abort ();
+    }
+
+  return 0;
+}
+
+int main (void)
+{
+  check_vect ();
+
+  return main1 ();
+}
+
+/* { dg-final { scan-tree-dump-times "vectorization not profitable" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
@@ -0,0 +1,51 @@
+/* { dg-require-effective-target vect_int } */
+
+#include <stdarg.h>
+#include "../../tree-vect.h"
+
+#define N 16
+#define DIFF 242
+
+void
+main1 (unsigned char x, unsigned char max_result, unsigned char min_result)
+{
+  int i;
+  unsigned char ub[N] = {1,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45};
+  unsigned char uc[N] = {1,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+  unsigned char udiff = 2;
+  unsigned char umax = x;
+  unsigned char umin = x;
+
+  for (i = 0; i < N; i++) {
+    udiff += (unsigned char)(ub[i] - uc[i]);
+  }
+
+  for (i = 0; i < N; i++) {
+    umax = umax < uc[i] ? uc[i] : umax;
+  }
+
+  for (i = 0; i < N; i++) {
+    umin = umin > uc[i] ? uc[i] : umin;
+  }
+
+  /* check results: */
+  if (udiff != DIFF)
+    abort ();
+  if (umax != max_result)
+    abort ();
+  if (umin != min_result)
+    abort ();
+}
+
+int main (void)
+{
+  check_vect ();
+
+  main1 (100, 100, 1);
+  main1 (0, 15, 0);
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail vect_no_int_max } } } */
+/* { dg-final { scan-tree-dump-times "vectorization not profitable" 2 "vect" { xfail vect_no_int_max } } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
@@ -0,0 +1,67 @@
+# Copyright (C) 1997, 2004, 2005, 2006 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
+
+# GCC testsuite that uses the `dg.exp' driver.
+
+# Load support procs.
+load_lib gcc-dg.exp
+
+# Exit immediately if this isn't a x86 target.
+if { ![istarget i?86*-*-*] && ![istarget x86_64-*-*] } then {
+  return
+}
+
+# Set up flags used for tests that don't specify options.
+set DEFAULT_VECTCFLAGS ""
+
+# These flags are used for all targets.
+lappend DEFAULT_VECTCFLAGS "-O2" "-ftree-vectorize" "-fvect-cost-model"
+
+# If the target system supports vector instructions, the default action
+# for a test is 'run', otherwise it's 'compile'. Save current default.
+# Executing vector instructions on a system without hardware vector support
+# is also disabled by a call to check_vect, but disabling execution here is
+# more efficient.
+global dg-do-what-default
+set save-dg-do-what-default ${dg-do-what-default}
+
+lappend DEFAULT_VECTCFLAGS "-msse2"
+set dg-do-what-default run
+
+# Initialize `dg'.
+dg-init
+
+lappend DEFAULT_VECTCFLAGS "-fdump-tree-vect-details"
+
+# Main loop.
+dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/costmodel-vect-*.\[cS\]]] \
+        "" $DEFAULT_VECTCFLAGS
+
+#### Tests with special options
+global SAVED_DEFAULT_VECTCFLAGS
+set SAVED_DEFAULT_VECTCFLAGS $DEFAULT_VECTCFLAGS
+
+# -ffast-math tests
+set DEFAULT_VECTCFLAGS $SAVED_DEFAULT_VECTCFLAGS
+lappend DEFAULT_VECTCFLAGS "-ffast-math"
+dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/costmodel-fast-math-vect*.\[cS\]]] \
+        "" $DEFAULT_VECTCFLAGS
+
+# Clean up.
+set dg-do-what-default ${save-dg-do-what-default}
+
+# All done.
+dg-finish
@@ -0,0 +1,39 @@
+/* { dg-require-effective-target vect_float } */
+
+#include <stdlib.h>
+#include "../../tree-vect.h"
+
+void interp_pitch(float *exc, float *interp, int pitch, int len)
+{
+  int i,k;
+  int maxj;
+
+  maxj=3;
+  for (i=0;i<len;i++)
+    {
+      float tmp = 0;
+      for (k=0;k<7;k++)
+        {
+          tmp += exc[i-pitch+k+maxj-6];
+        }
+      interp[i] = tmp;
+    }
+}
+
+int main()
+{
+  float *exc = calloc(126,sizeof(float));
+  float *interp = calloc(80,sizeof(float));
+  int pitch = -35;
+
+  check_vect ();
+
+  interp_pitch(exc, interp, pitch, 80);
+  free(exc);
+  free(interp);
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "vectorization not profitable" 1 "vect" } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
+
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_long } */
+
+#include <stdarg.h>
+#include "../../tree-vect.h"
+
+#define N 16
+
+void dacP98FillRGBMap (unsigned char *pBuffer)
+{
+  unsigned long dw, dw1;
+  unsigned long *pdw = (unsigned long *)(pBuffer);
+
+  for( dw = 256, dw1 = 0; dw; dw--, dw1 += 0x01010101)
+    {
+      *pdw++ = dw1;
+      *pdw++ = dw1;
+      *pdw++ = dw1;
+      *pdw++ = dw1;
+    }
+}
+
+/* { dg-final { scan-tree-dump-times "vectorization not profitable" 1 "vect" { target vect_interleave
+} } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
+
@@ -0,0 +1,91 @@
+/* { dg-require-effective-target vect_int } */
+
+#include <stdarg.h>
+#include "../../tree-vect.h"
+
+#define N 32
+
+struct t{
+  int k[N];
+  int l;
+};
+
+struct s{
+  char a; /* aligned */
+  char b[N-1]; /* unaligned (offset 1B) */
+  char c[N]; /* aligned (offset NB) */
+  struct t d; /* aligned (offset 2NB) */
+  struct t e; /* unaligned (offset 2N+4N+4 B) */
+};
+
+int main1 ()
+{
+  int i;
+  struct s tmp;
+
+  /* unaligned */
+  for (i = 0; i < N/2; i++)
+    {
+      tmp.b[i] = 5;
+    }
+
+  /* check results: */
+  for (i = 0; i <N/2; i++)
+    {
+      if (tmp.b[i] != 5)
+        abort ();
+    }
+
+  /* aligned */
+  for (i = 0; i < N/2; i++)
+    {
+      tmp.c[i] = 6;
+    }
+
+  /* check results: */
+  for (i = 0; i <N/2; i++)
+    {
+      if (tmp.c[i] != 6)
+        abort ();
+    }
+
+  /* aligned */
+  for (i = 0; i < N/2; i++)
+    {
+      tmp.d.k[i] = 7;
+    }
+
+  /* check results: */
+  for (i = 0; i <N/2; i++)
+    {
+      if (tmp.d.k[i] != 7)
+        abort ();
+    }
+
+  /* unaligned */
+  for (i = 0; i < N/2; i++)
+    {
+      tmp.e.k[i] = 8;
+    }
+
+  /* check results: */
+  for (i = 0; i <N/2; i++)
+    {
+      if (tmp.e.k[i] != 8)
+        abort ();
+    }
+
+  return 0;
+}
+
+int main (void)
+{
+  check_vect ();
+
+  return main1 ();
+}
+
+/* { dg-final { scan-tree-dump-times "vectorization not profitable" 2 "vect" } }
+*/
+/* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
@@ -0,0 +1,39 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_int } */
+
+#include <stdarg.h>
+#include "../../tree-vect.h"
+
+#define N 16
+struct test {
+  char ca[N];
+};
+
+extern struct test s;
+
+int main1 ()
+{
+  int i;
+
+  for (i = 0; i < N; i++)
+    {
+      s.ca[i] = 5;
+    }
+
+  /* check results: */
+  for (i = 0; i < N; i++)
+    {
+      if (s.ca[i] != 5)
+        abort ();
+    }
+
+  return 0;
+}
+
+int main (void)
+{
+  return main1 ();
+}
+
+/* { dg-final { scan-tree-dump-times "vectorization not profitable" 1 "vect" } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
@@ -0,0 +1,89 @@
+/* { dg-require-effective-target vect_int } */
+
+#include <stdarg.h>
+#include "../../tree-vect.h"
+
+#define N 32
+
+struct s{
+  int m;
+  int n[N][N][N];
+};
+
+struct test1{
+  struct s a; /* array a.n is unaligned */
+  int b;
+  int c;
+  struct s e; /* array e.n is aligned */
+};
+
+int main1 ()
+{
+  int i,j;
+  struct test1 tmp1;
+
+  /* 1. unaligned */
+  for (i = 0; i < N; i++)
+    {
+      tmp1.a.n[1][2][i] = 5;
+    }
+
+  /* check results: */
+  for (i = 0; i <N; i++)
+    {
+      if (tmp1.a.n[1][2][i] != 5)
+        abort ();
+    }
+
+  /* 2. aligned */
+  for (i = 3; i < N-1; i++)
+    {
+      tmp1.a.n[1][2][i] = 6;
+    }
+
+  /* check results: */
+  for (i = 3; i < N-1; i++)
+    {
+      if (tmp1.a.n[1][2][i] != 6)
+        abort ();
+    }
+
+  /* 3. aligned */
+  for (i = 0; i < N; i++)
+    {
+      tmp1.e.n[1][2][i] = 7;
+    }
+
+  /* check results: */
+  for (i = 0; i < N; i++)
+    {
+      if (tmp1.e.n[1][2][i] != 7)
+        abort ();
+    }
+
+  /* 4. unaligned */
+  for (i = 3; i < N-3; i++)
+    {
+      tmp1.e.n[1][2][i] = 8;
+    }
+
+  /* check results: */
+  for (i = 3; i <N-3; i++)
+    {
+      if (tmp1.e.n[1][2][i] != 8)
+        abort ();
+    }
+
+  return 0;
+}
+
+int main (void)
+{
+  check_vect ();
+
+  return main1 ();
+}
+
+/* { dg-final { scan-tree-dump-times "vectorization not profitable" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
@@ -0,0 +1,51 @@
+/* { dg-require-effective-target vect_int } */
+
+#include <stdarg.h>
+#include "../../tree-vect.h"
+
+#define N 16
+#define DIFF 242
+
+void
+main1 (unsigned char x, unsigned char max_result, unsigned char min_result)
+{
+  int i;
+  unsigned char ub[N] = {1,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45};
+  unsigned char uc[N] = {1,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+  unsigned char udiff = 2;
+  unsigned char umax = x;
+  unsigned char umin = x;
+
+  for (i = 0; i < N; i++) {
+    udiff += (unsigned char)(ub[i] - uc[i]);
+  }
+
+  for (i = 0; i < N; i++) {
+    umax = umax < uc[i] ? uc[i] : umax;
+  }
+
+  for (i = 0; i < N; i++) {
+    umin = umin > uc[i] ? uc[i] : umin;
+  }
+
+  /* check results: */
+  if (udiff != DIFF)
+    abort ();
+  if (umax != max_result)
+    abort ();
+  if (umin != min_result)
+    abort ();
+}
+
+int main (void)
+{
+  check_vect ();
+
+  main1 (100, 100, 1);
+  main1 (0, 15, 0);
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail vect_no_int_max } } } */
+/* { dg-final { scan-tree-dump-times "vectorization not profitable" 2 "vect" { xfail vect_no_int_max } } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
@@ -0,0 +1,70 @@
+# Copyright (C) 1997, 2004, 2005, 2006 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
+
+# GCC testsuite that uses the `dg.exp' driver.
+
+# Load support procs.
+load_lib gcc-dg.exp
+
+# Exit immediately if this isn't a x86 target.
+if { (![istarget x86_64-*-*] && ![istarget i?86-*-*])
+     || ![is-effective-target lp64] } then {
+  return
+}
+
+# Set up flags used for tests that don't specify options.
+set DEFAULT_VECTCFLAGS ""
+
+# These flags are used for all targets.
+lappend DEFAULT_VECTCFLAGS "-O2" "-ftree-vectorize" "-fvect-cost-model"
+
+# If the target system supports vector instructions, the default action
+# for a test is 'run', otherwise it's 'compile'. Save current default.
+# Executing vector instructions on a system without hardware vector support
+# is also disabled by a call to check_vect, but disabling execution here is
+# more efficient.
+global dg-do-what-default
+set save-dg-do-what-default ${dg-do-what-default}
+
+lappend DEFAULT_VECTCFLAGS "-msse2"
+set dg-do-what-default run
+
+# Initialize `dg'.
+dg-init
+
+lappend DEFAULT_VECTCFLAGS "-fdump-tree-vect-details"
+
+# Main loop.
+dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/costmodel-pr*.\[cS\]]] \
+        "" $DEFAULT_VECTCFLAGS
+dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/costmodel-vect-*.\[cS\]]] \
+        "" $DEFAULT_VECTCFLAGS
+
+#### Tests with special options
+global SAVED_DEFAULT_VECTCFLAGS
+set SAVED_DEFAULT_VECTCFLAGS $DEFAULT_VECTCFLAGS
+
+# -ffast-math tests
+set DEFAULT_VECTCFLAGS $SAVED_DEFAULT_VECTCFLAGS
+lappend DEFAULT_VECTCFLAGS "-ffast-math"
+dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/costmodel-fast-math-vect*.\[cS\]]] \
+        "" $DEFAULT_VECTCFLAGS
+
+# Clean up.
+set dg-do-what-default ${save-dg-do-what-default}
+
+# All done.
+dg-finish
@ -1,5 +1,5 @@
|
||||
/* Analysis Utilities for Loop Vectorization.
|
||||
Copyright (C) 2003,2004,2005,2006 Free Software Foundation, Inc.
|
||||
Copyright (C) 2003, 2004, 2005, 2006, 2007 Free Software Foundation, Inc.
|
||||
Contributed by Dorit Naishlos <dorit@il.ibm.com>
|
||||
|
||||
This file is part of GCC.
|
||||
@ -300,6 +300,9 @@ vect_analyze_operations (loop_vec_info loop_vinfo)
|
||||
tree phi;
|
||||
stmt_vec_info stmt_info;
|
||||
bool need_to_vectorize = false;
|
||||
int min_profitable_iters;
|
||||
int min_scalar_loop_bound;
|
||||
unsigned int th;
|
||||
|
||||
if (vect_print_dump_info (REPORT_DETAILS))
|
||||
fprintf (vect_dump, "=== vect_analyze_operations ===");
|
||||
@ -443,8 +446,6 @@ vect_analyze_operations (loop_vec_info loop_vinfo)
|
||||
} /* stmts in bb */
|
||||
} /* bbs */
|
||||
|
||||
/* TODO: Analyze cost. Decide if worth while to vectorize. */
|
||||
|
||||
/* All operations in the loop are either irrelevant (deal with loop
|
||||
control, or dead), or only used outside the loop and can be moved
|
||||
out of the loop (e.g. invariants, inductions). The loop can be
|
||||
@ -468,16 +469,55 @@ vect_analyze_operations (loop_vec_info loop_vinfo)
|
||||
vectorization_factor, LOOP_VINFO_INT_NITERS (loop_vinfo));
|
||||
|
||||
if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
|
||||
&& ((LOOP_VINFO_INT_NITERS (loop_vinfo) < vectorization_factor)
|
||||
|| (LOOP_VINFO_INT_NITERS (loop_vinfo) <=
|
||||
((unsigned) (PARAM_VALUE (PARAM_MIN_VECT_LOOP_BOUND))
|
||||
* vectorization_factor))))
|
||||
&& (LOOP_VINFO_INT_NITERS (loop_vinfo) < vectorization_factor))
|
||||
{
|
||||
if (vect_print_dump_info (REPORT_UNVECTORIZED_LOOPS))
|
||||
fprintf (vect_dump, "not vectorized: iteration count too small.");
|
||||
fprintf (vect_dump, "not vectorized: iteration count too small.");
|
||||
if (vect_print_dump_info (REPORT_DETAILS))
|
||||
fprintf (vect_dump,"not vectorized: iteration count smaller than "
|
||||
"vectorization factor.");
|
||||
return false;
|
||||
}
|
||||
|
||||
/* Analyze cost. Decide if worth while to vectorize. */
|
||||
|
||||
min_profitable_iters = vect_estimate_min_profitable_iters (loop_vinfo);
|
||||
|
||||
if (min_profitable_iters < 0)
|
||||
{
|
||||
if (vect_print_dump_info (REPORT_UNVECTORIZED_LOOPS))
|
||||
fprintf (vect_dump, "not vectorized: vectorization not profitable.");
|
||||
if (vect_print_dump_info (REPORT_DETAILS))
|
||||
fprintf (vect_dump, "not vectorized: vector version will never be "
|
||||
"profitable.");
|
||||
return false;
|
||||
}
|
||||
|
||||
min_scalar_loop_bound = (PARAM_VALUE (PARAM_MIN_VECT_LOOP_BOUND))
|
||||
* vectorization_factor;
|
||||
|
||||
/* Use the cost model only if it is more conservative than user specified
|
||||
threshold. */
|
||||
|
||||
th = (unsigned) min_scalar_loop_bound;
|
||||
if (min_profitable_iters
|
||||
&& (!min_scalar_loop_bound
|
||||
|| min_profitable_iters > min_scalar_loop_bound))
|
||||
th = (unsigned) min_profitable_iters;
|
||||
|
||||
if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
|
||||
&& LOOP_VINFO_INT_NITERS (loop_vinfo) < th)
|
||||
{
|
||||
if (vect_print_dump_info (REPORT_UNVECTORIZED_LOOPS))
|
||||
fprintf (vect_dump, "not vectorized: vectorization not "
|
||||
"profitable.");
|
||||
if (vect_print_dump_info (REPORT_DETAILS))
|
||||
fprintf (vect_dump, "not vectorized: iteration count smaller than "
|
||||
"user specified loop bound parameter or minimum "
|
||||
"profitable iterations (whichever is more conservative).");
|
||||
return false;
|
||||
}
|
||||
|
||||
if (!LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
|
||||
|| LOOP_VINFO_INT_NITERS (loop_vinfo) % vectorization_factor != 0
|
||||
|| LOOP_PEELING_FOR_ALIGNMENT (loop_vinfo))
|
||||
|
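The threshold selection added to vect_analyze_operations above can be read in isolation. The following is a minimal sketch, assuming a hypothetical stand-alone helper named pick_threshold that is not part of the patch. It takes the cost-model estimate and the user bound (PARAM_MIN_VECT_LOOP_BOUND already multiplied by the vectorization factor) as plain integers.

```c
#include <assert.h>

/* Hypothetical helper, not part of the patch: mirrors the threshold
   selection in the hunk above.  A min_profitable_iters of 0 means the
   cost model is disabled; a min_scalar_loop_bound of 0 means the user
   set no bound.  */
unsigned int
pick_threshold (int min_profitable_iters, int min_scalar_loop_bound)
{
  unsigned int th;

  /* The caller in the patch returns early on a negative estimate.  */
  assert (min_profitable_iters >= 0 && min_scalar_loop_bound >= 0);

  /* Start from the user bound, then let the cost model override it only
     when the cost model asks for more iterations.  */
  th = (unsigned int) min_scalar_loop_bound;
  if (min_profitable_iters
      && (!min_scalar_loop_bound
          || min_profitable_iters > min_scalar_loop_bound))
    th = (unsigned int) min_profitable_iters;

  return th;
}
```

With these definitions, pick_threshold (12, 8) yields 12 and pick_threshold (4, 8) yields 8, so the larger (more conservative) of the two limits is used whenever both are set.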
@ -74,6 +74,490 @@ static void vect_update_inits_of_drs (loop_vec_info, tree);
|
||||
static int vect_min_worthwhile_factor (enum tree_code);
|
||||
|
||||
|
||||
/* Function vect_estimate_min_profitable_iters
|
||||
|
||||
Return the number of iterations required for the vector version of the
|
||||
loop to be profitable relative to the cost of the scalar version of the
|
||||
loop.
|
||||
|
||||
TODO: Take profile info into account before making vectorization
|
||||
decisions, if available. */
|
||||
|
||||
int
|
||||
vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo)
|
||||
{
|
||||
int i;
|
||||
int min_profitable_iters;
|
||||
int peel_iters_prologue;
|
||||
int peel_iters_epilogue;
|
||||
int vec_inside_cost = 0;
|
||||
int vec_outside_cost = 0;
|
||||
int scalar_single_iter_cost = 0;
|
||||
int vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
|
||||
struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
|
||||
basic_block *bbs = LOOP_VINFO_BBS (loop_vinfo);
|
||||
int nbbs = loop->num_nodes;
|
||||
|
||||
/* Cost model disabled. */
|
||||
if (!flag_vect_cost_model)
|
||||
{
|
||||
if (vect_print_dump_info (REPORT_DETAILS))
|
||||
fprintf (vect_dump, "cost model disabled.");
|
||||
return 0;
|
||||
}
|
||||
|
||||
/* Requires loop versioning tests to handle misalignment.
|
||||
FIXME: Make cost depend on number of stmts in may_misalign list. */
|
||||
|
||||
if (LOOP_VINFO_MAY_MISALIGN_STMTS (loop_vinfo))
|
||||
{
|
||||
vec_outside_cost += TARG_COND_BRANCH_COST;
|
||||
if (vect_print_dump_info (REPORT_DETAILS))
|
||||
fprintf (vect_dump, "cost model: Adding cost of checks for loop "
|
||||
"versioning.\n");
|
||||
}
|
||||
|
||||
/* Requires a prologue loop when peeling to handle misalignment. Add cost of
|
||||
two guards, one for the peeled loop and one for the vector loop. */
|
||||
|
||||
peel_iters_prologue = LOOP_PEELING_FOR_ALIGNMENT (loop_vinfo);
|
||||
if (peel_iters_prologue)
|
||||
{
|
||||
vec_outside_cost += 2 * TARG_COND_BRANCH_COST;
|
||||
if (vect_print_dump_info (REPORT_DETAILS))
|
||||
fprintf (vect_dump, "cost model: Adding cost of checks for "
|
||||
"prologue.\n");
|
||||
}
|
||||
|
||||
/* Requires an epilogue loop to finish up remaining iterations after vector
|
||||
loop. Add cost of two guards, one for the peeled loop and one for the
|
||||
vector loop. */
|
||||
|
||||
if ((peel_iters_prologue < 0)
|
||||
|| !LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
|
||||
|| LOOP_VINFO_INT_NITERS (loop_vinfo) % vf)
|
||||
{
|
||||
vec_outside_cost += 2 * TARG_COND_BRANCH_COST;
|
||||
if (vect_print_dump_info (REPORT_DETAILS))
|
||||
fprintf (vect_dump, "cost model : Adding cost of checks for "
|
||||
"epilogue.\n");
|
||||
}
|
||||
|
||||
/* Count statements in scalar loop. Using this as scalar cost for a single
|
||||
iteration for now.
|
||||
|
||||
TODO: Add outer loop support.
|
||||
|
||||
TODO: Consider assigning different costs to different scalar
|
||||
statements. */
|
||||
|
||||
for (i = 0; i < nbbs; i++)
|
||||
{
|
||||
block_stmt_iterator si;
|
||||
basic_block bb = bbs[i];
|
||||
|
||||
for (si = bsi_start (bb); !bsi_end_p (si); bsi_next (&si))
|
||||
{
|
||||
tree stmt = bsi_stmt (si);
|
||||
stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
|
||||
if (!STMT_VINFO_RELEVANT_P (stmt_info)
|
||||
&& !STMT_VINFO_LIVE_P (stmt_info))
|
||||
continue;
|
||||
scalar_single_iter_cost++;
|
||||
vec_inside_cost += STMT_VINFO_INSIDE_OF_LOOP_COST (stmt_info);
|
||||
vec_outside_cost += STMT_VINFO_OUTSIDE_OF_LOOP_COST (stmt_info);
|
||||
}
|
||||
}
|
||||
|
||||
/* Add additional cost for the peeled instructions in prologue and epilogue
|
||||
loop.
|
||||
|
||||
FORNOW: If we don't know the value of peel_iters for prologue or epilogue
|
||||
at compile-time - we assume the worst.
|
||||
|
||||
TODO: Build an expression that represents peel_iters for prologue and
|
||||
epilogue to be used in a run-time test. */
|
||||
|
||||
peel_iters_prologue = LOOP_PEELING_FOR_ALIGNMENT (loop_vinfo);
|
||||
|
||||
if (peel_iters_prologue < 0)
|
||||
{
|
||||
peel_iters_prologue = vf - 1;
|
||||
if (vect_print_dump_info (REPORT_DETAILS))
|
||||
fprintf (vect_dump, "cost model: "
|
||||
"prologue peel iters set conservatively.");
|
||||
|
||||
/* If peeling for alignment is unknown, loop bound of main loop becomes
|
||||
unknown. */
|
||||
peel_iters_epilogue = vf - 1;
|
||||
if (vect_print_dump_info (REPORT_DETAILS))
|
||||
fprintf (vect_dump, "cost model: "
|
||||
"epilogue peel iters set conservatively because "
|
||||
"peeling for alignment is unknown .");
|
||||
}
|
||||
else
|
||||
{
|
||||
if (!LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo))
|
||||
{
|
||||
peel_iters_epilogue = vf - 1;
|
||||
if (vect_print_dump_info (REPORT_DETAILS))
|
||||
fprintf (vect_dump, "cost model: "
|
||||
"epilogue peel iters set conservatively because "
|
||||
"loop iterations are unknown .");
|
||||
}
|
||||
else
|
||||
peel_iters_epilogue =
|
||||
(LOOP_VINFO_INT_NITERS (loop_vinfo) - peel_iters_prologue)
|
||||
% vf;
|
||||
}
|
||||
|
||||
vec_outside_cost += (peel_iters_prologue * scalar_single_iter_cost)
|
||||
+ (peel_iters_epilogue * scalar_single_iter_cost);
|
||||
|
||||
/* Calculate number of iterations required to make the vector version
|
||||
profitable, relative to the loop bodies only. The following condition
|
||||
must hold true: ((SIC*VF)-VIC)*niters > VOC*VF, where
|
||||
SIC = scalar iteration cost, VIC = vector iteration cost,
|
||||
VOC = vector outside cost and VF = vectorization factor. */
|
||||
|
||||
if ((scalar_single_iter_cost * vf) > vec_inside_cost)
|
||||
{
|
||||
if (vec_outside_cost == 0)
|
||||
min_profitable_iters = 1;
|
||||
else
|
||||
{
|
||||
min_profitable_iters = (vec_outside_cost * vf)
|
||||
/ ((scalar_single_iter_cost * vf)
|
||||
- vec_inside_cost);
|
||||
|
||||
if ((scalar_single_iter_cost * vf * min_profitable_iters)
|
||||
<= ((vec_inside_cost * min_profitable_iters)
|
||||
+ (vec_outside_cost * vf)))
|
||||
min_profitable_iters++;
|
||||
}
|
||||
}
|
||||
/* vector version will never be profitable. */
|
||||
else
|
||||
{
|
||||
if (vect_print_dump_info (REPORT_DETAILS))
|
||||
fprintf (vect_dump, "cost model: vector iteration cost = %d "
|
||||
"is divisible by scalar iteration cost = %d by a factor "
|
||||
"greater than or equal to the vectorization factor = %d .",
|
||||
vec_inside_cost, scalar_single_iter_cost, vf);
|
||||
return -1;
|
||||
}
|
||||
|
||||
if (vect_print_dump_info (REPORT_DETAILS))
|
||||
{
|
||||
fprintf (vect_dump, "Cost model analysis: \n");
|
||||
fprintf (vect_dump, " Vector inside of loop cost: %d\n",
|
||||
vec_inside_cost);
|
||||
fprintf (vect_dump, " Vector outside of loop cost: %d\n",
|
||||
vec_outside_cost);
|
||||
fprintf (vect_dump, " Scalar cost: %d\n", scalar_single_iter_cost);
|
||||
fprintf (vect_dump, " prologue iterations: %d\n",
|
||||
peel_iters_prologue);
|
||||
fprintf (vect_dump, " epilogue iterations: %d\n",
|
||||
peel_iters_epilogue);
|
||||
fprintf (vect_dump, " Calculated minimum iters for profitability: %d\n",
|
||||
min_profitable_iters);
|
||||
fprintf (vect_dump, " Actual minimum iters for profitability: %d\n",
|
||||
min_profitable_iters < vf ? vf : min_profitable_iters);
|
||||
}
|
||||
|
||||
return min_profitable_iters < vf ? vf : min_profitable_iters;
|
||||
}
|
||||
|
||||
|
||||
/* TODO: Close dependency between vect_model_*_cost and vectorizable_*
|
||||
functions. Design better to avoid maintenance issues. */
|
||||
|
||||
/* Function vect_model_reduction_cost.
|
||||
|
||||
Models cost for a reduction operation, including the vector ops
|
||||
generated within the strip-mine loop, the initial definition before
|
||||
the loop, and the epilogue code that must be generated. */
|
||||
|
||||
static void
|
||||
vect_model_reduction_cost (stmt_vec_info stmt_info, enum tree_code reduc_code,
|
||||
int ncopies)
|
||||
{
|
||||
int outer_cost = 0;
|
||||
enum tree_code code;
|
||||
optab optab;
|
||||
tree vectype;
|
||||
tree orig_stmt;
|
||||
tree reduction_op;
|
||||
enum machine_mode mode;
|
||||
tree operation = GIMPLE_STMT_OPERAND (STMT_VINFO_STMT (stmt_info), 1);
|
||||
int op_type = TREE_CODE_LENGTH (TREE_CODE (operation));
|
||||
|
||||
/* Cost of reduction op inside loop. */
|
||||
STMT_VINFO_INSIDE_OF_LOOP_COST (stmt_info) += ncopies * TARG_VEC_STMT_COST;
|
||||
|
||||
reduction_op = TREE_OPERAND (operation, op_type-1);
|
||||
vectype = get_vectype_for_scalar_type (TREE_TYPE (reduction_op));
|
||||
mode = TYPE_MODE (vectype);
|
||||
orig_stmt = STMT_VINFO_RELATED_STMT (stmt_info);
|
||||
|
||||
if (!orig_stmt)
|
||||
orig_stmt = STMT_VINFO_STMT (stmt_info);
|
||||
|
||||
code = TREE_CODE (GIMPLE_STMT_OPERAND (orig_stmt, 1));
|
||||
|
||||
/* Add in cost for initial definition. */
|
||||
outer_cost += TARG_VEC_STMT_COST;
|
||||
|
||||
/* Determine cost of epilogue code.
|
||||
|
||||
We have a reduction operator that will reduce the vector in one statement.
|
||||
Also requires scalar extract. */
|
||||
|
||||
if (reduc_code < NUM_TREE_CODES)
|
||||
outer_cost += TARG_VEC_STMT_COST + TARG_VEC_TO_SCALAR_COST;
|
||||
else
|
||||
{
|
||||
int vec_size_in_bits = tree_low_cst (TYPE_SIZE (vectype), 1);
|
||||
tree bitsize =
|
||||
TYPE_SIZE (TREE_TYPE ( GIMPLE_STMT_OPERAND (orig_stmt, 0)));
|
||||
int element_bitsize = tree_low_cst (bitsize, 1);
|
||||
int nelements = vec_size_in_bits / element_bitsize;
|
||||
|
||||
optab = optab_for_tree_code (code, vectype);
|
||||
|
||||
/* We have a whole vector shift available. */
|
||||
if (!VECTOR_MODE_P (mode)
|
||||
|| optab->handlers[mode].insn_code == CODE_FOR_nothing)
|
||||
/* Final reduction via vector shifts and the reduction operator. Also
|
||||
requires scalar extract. */
|
||||
outer_cost += ((exact_log2(nelements) * 2 + 1) * TARG_VEC_STMT_COST);
|
||||
else
|
||||
/* Use extracts and reduction op for final reduction. For N elements,
|
||||
we have N extracts and N-1 reduction ops. */
|
||||
outer_cost += ((nelements + nelements - 1) * TARG_VEC_STMT_COST);
|
||||
}
|
||||
|
||||
STMT_VINFO_OUTSIDE_OF_LOOP_COST (stmt_info) = outer_cost;
|
||||
|
||||
if (vect_print_dump_info (REPORT_DETAILS))
|
||||
fprintf (vect_dump, "vect_model_reduction_cost: inside_cost = %d, "
|
||||
"outside_cost = %d .", STMT_VINFO_INSIDE_OF_LOOP_COST (stmt_info),
|
||||
STMT_VINFO_OUTSIDE_OF_LOOP_COST (stmt_info));
|
||||
}
|
||||
|
||||
|
||||
/* Function vect_model_induction_cost.
|
||||
|
||||
Models cost for induction operations. */
|
||||
|
||||
static void
|
||||
vect_model_induction_cost (stmt_vec_info stmt_info, int ncopies)
|
||||
{
|
||||
/* loop cost for vec_loop. */
|
||||
STMT_VINFO_INSIDE_OF_LOOP_COST (stmt_info) = ncopies * TARG_VEC_STMT_COST;
|
||||
/* prologue cost for vec_init and vec_step. */
|
||||
STMT_VINFO_OUTSIDE_OF_LOOP_COST (stmt_info) = 2 * TARG_VEC_STMT_COST;
|
||||
|
||||
if (vect_print_dump_info (REPORT_DETAILS))
|
||||
fprintf (vect_dump, "vect_model_induction_cost: inside_cost = %d, "
|
||||
"outside_cost = %d .", STMT_VINFO_INSIDE_OF_LOOP_COST (stmt_info),
|
||||
STMT_VINFO_OUTSIDE_OF_LOOP_COST (stmt_info));
|
||||
}
|
||||
|
||||
|
||||
/* Function vect_model_simple_cost.
|
||||
|
||||
Models cost for simple operations, i.e. those that only emit ncopies of a
|
||||
single op. Right now, this does not account for multiple insns that could
|
||||
be generated for the single vector op. We will handle that shortly. */
|
||||
|
||||
static void
|
||||
vect_model_simple_cost (stmt_vec_info stmt_info, int ncopies)
|
||||
{
|
||||
STMT_VINFO_INSIDE_OF_LOOP_COST (stmt_info) = ncopies * TARG_VEC_STMT_COST;
|
||||
|
||||
if (vect_print_dump_info (REPORT_DETAILS))
|
||||
fprintf (vect_dump, "vect_model_simple_cost: inside_cost = %d, "
|
||||
"outside_cost = %d .", STMT_VINFO_INSIDE_OF_LOOP_COST (stmt_info),
|
||||
STMT_VINFO_OUTSIDE_OF_LOOP_COST (stmt_info));
|
||||
}
|
||||
|
||||
|
||||
/* Function vect_cost_strided_group_size
|
||||
|
||||
For strided load or store, return the group_size only if it is the first
|
||||
load or store of a group, else return 1. This ensures that group size is
|
||||
only returned once per group. */
|
||||
|
||||
static int
|
||||
vect_cost_strided_group_size (stmt_vec_info stmt_info)
|
||||
{
|
||||
tree first_stmt = DR_GROUP_FIRST_DR (stmt_info);
|
||||
|
||||
if (first_stmt == STMT_VINFO_STMT (stmt_info))
|
||||
return DR_GROUP_SIZE (stmt_info);
|
||||
|
||||
return 1;
|
||||
}
|
||||
|
||||
|
||||
/* Function vect_model_store_cost
|
||||
|
||||
Models cost for stores. In the case of strided accesses, one access
|
||||
has the overhead of the strided access attributed to it. */
|
||||
|
||||
static void
|
||||
vect_model_store_cost (stmt_vec_info stmt_info, int ncopies)
|
||||
{
|
||||
int cost = 0;
|
||||
int group_size;
|
||||
|
||||
/* Strided access? */
|
||||
if (DR_GROUP_FIRST_DR (stmt_info))
|
||||
group_size = vect_cost_strided_group_size (stmt_info);
|
||||
/* Not a strided access. */
|
||||
else
|
||||
group_size = 1;
|
||||
|
||||
/* Is this an access in a group of stores, which provide strided access?
|
||||
If so, add in the cost of the permutes. */
|
||||
if (group_size > 1)
|
||||
{
|
||||
/* Uses a high and low interleave operation for each needed permute. */
|
||||
cost = ncopies * exact_log2(group_size) * group_size
|
||||
* TARG_VEC_STMT_COST;
|
||||
|
||||
if (vect_print_dump_info (REPORT_DETAILS))
|
||||
fprintf (vect_dump, "vect_model_store_cost: strided group_size = %d .",
|
||||
group_size);
|
||||
|
||||
}
|
||||
|
||||
/* Costs of the stores. */
|
||||
cost += ncopies * TARG_VEC_STORE_COST;
|
||||
|
||||
STMT_VINFO_INSIDE_OF_LOOP_COST (stmt_info) = cost;
|
||||
|
||||
if (vect_print_dump_info (REPORT_DETAILS))
|
||||
fprintf (vect_dump, "vect_model_store_cost: inside_cost = %d, "
|
||||
"outside_cost = %d .", STMT_VINFO_INSIDE_OF_LOOP_COST (stmt_info),
|
||||
STMT_VINFO_OUTSIDE_OF_LOOP_COST (stmt_info));
|
||||
}
|
||||
|
||||
|
||||
/* Function vect_model_load_cost
|
||||
|
||||
Models cost for loads. In the case of strided accesses, the last access
|
||||
has the overhead of the strided access attributed to it. Since unaligned
|
||||
accesses are supported for loads, we also account for the costs of the
|
||||
access scheme chosen. */
|
||||
|
||||
static void
|
||||
vect_model_load_cost (stmt_vec_info stmt_info, int ncopies)
|
||||
|
||||
{
|
||||
int inner_cost = 0;
|
||||
int group_size;
|
||||
int alignment_support_cheme;
|
||||
tree first_stmt;
|
||||
struct data_reference *dr = STMT_VINFO_DATA_REF (stmt_info), *first_dr;
|
||||
|
||||
/* Strided accesses? */
|
||||
first_stmt = DR_GROUP_FIRST_DR (stmt_info);
|
||||
if (first_stmt)
|
||||
{
|
||||
group_size = vect_cost_strided_group_size (stmt_info);
|
||||
first_dr = STMT_VINFO_DATA_REF (vinfo_for_stmt (first_stmt));
|
||||
}
|
||||
/* Not a strided access. */
|
||||
else
|
||||
{
|
||||
group_size = 1;
|
||||
first_dr = dr;
|
||||
}
|
||||
|
||||
alignment_support_cheme = vect_supportable_dr_alignment (first_dr);
|
||||
|
||||
/* Is this an access in a group of loads providing strided access?
|
||||
If so, add in the cost of the permutes. */
|
||||
if (group_size > 1)
|
||||
{
|
||||
/* Uses even and odd extract operations for each needed permute. */
|
||||
inner_cost = ncopies * exact_log2(group_size) * group_size
|
||||
* TARG_VEC_STMT_COST;
|
||||
|
||||
if (vect_print_dump_info (REPORT_DETAILS))
|
||||
fprintf (vect_dump, "vect_model_load_cost: strided group_size = %d .",
|
||||
group_size);
|
||||
|
||||
}
|
||||
|
||||
/* The loads themselves. */
|
||||
switch (alignment_support_cheme)
|
||||
{
|
||||
case dr_aligned:
|
||||
{
|
||||
inner_cost += ncopies * TARG_VEC_LOAD_COST;
|
||||
|
||||
if (vect_print_dump_info (REPORT_DETAILS))
|
||||
fprintf (vect_dump, "vect_model_load_cost: aligned.");
|
||||
|
||||
break;
|
||||
}
|
||||
case dr_unaligned_supported:
|
||||
{
|
||||
/* Here, we assign an additional cost for the unaligned load. */
|
||||
inner_cost += ncopies * TARG_VEC_UNALIGNED_LOAD_COST;
|
||||
|
||||
if (vect_print_dump_info (REPORT_DETAILS))
|
||||
fprintf (vect_dump, "vect_model_load_cost: unaligned supported by "
|
||||
"hardware.");
|
||||
|
||||
break;
|
||||
}
|
||||
case dr_unaligned_software_pipeline:
|
||||
{
|
||||
int outer_cost = 0;
|
||||
|
||||
if (vect_print_dump_info (REPORT_DETAILS))
|
||||
fprintf (vect_dump, "vect_model_load_cost: unaligned software "
|
||||
"pipelined.");
|
||||
|
||||
/* Unaligned software pipeline has a load of an address, an initial
|
||||
load, and possibly a mask operation to "prime" the loop. However,
|
||||
if this is an access in a group of loads, which provide strided
|
||||
access, then the above cost should only be considered for one
|
||||
access in the group. Inside the loop, there is a load op
|
||||
and a realignment op. */
|
||||
|
||||
if ((!DR_GROUP_FIRST_DR (stmt_info)) || group_size > 1)
|
||||
{
|
||||
outer_cost = 2*TARG_VEC_STMT_COST;
|
||||
if (targetm.vectorize.builtin_mask_for_load)
|
||||
outer_cost += TARG_VEC_STMT_COST;
|
||||
}
|
||||
|
||||
STMT_VINFO_OUTSIDE_OF_LOOP_COST (stmt_info) = outer_cost;
|
||||
|
||||
inner_cost += ncopies * (TARG_VEC_LOAD_COST + TARG_VEC_STMT_COST);
|
||||
|
||||
break;
|
||||
}
|
||||
|
||||
default:
|
||||
gcc_unreachable ();
|
||||
}
|
||||
|
||||
STMT_VINFO_INSIDE_OF_LOOP_COST (stmt_info) = inner_cost;
|
||||
|
||||
if (vect_print_dump_info (REPORT_DETAILS))
|
||||
fprintf (vect_dump, "vect_model_load_cost: inside_cost = %d, "
|
||||
"outside_cost = %d .", STMT_VINFO_INSIDE_OF_LOOP_COST (stmt_info),
|
||||
STMT_VINFO_OUTSIDE_OF_LOOP_COST (stmt_info));
|
||||
|
||||
}
|
||||
|
||||
|
||||
/* Function vect_get_new_vect_var.
|
||||
|
||||
Returns a name for a new variable. The current naming scheme appends the
|
||||
@ -1655,6 +2139,7 @@ vectorizable_reduction (tree stmt, block_stmt_iterator *bsi, tree *vec_stmt)
|
||||
if (!vec_stmt) /* transformation not required. */
|
||||
{
|
||||
STMT_VINFO_TYPE (stmt_info) = reduc_vec_info_type;
|
||||
vect_model_reduction_cost (stmt_info, epilog_reduc_code, ncopies);
|
||||
return true;
|
||||
}
|
||||
|
||||
@ -1862,9 +2347,15 @@ vectorizable_call (tree stmt, block_stmt_iterator *bsi, tree *vec_stmt)
|
||||
|
||||
gcc_assert (ZERO_SSA_OPERANDS (stmt, SSA_OP_ALL_VIRTUALS));
|
||||
|
||||
ncopies = (LOOP_VINFO_VECT_FACTOR (loop_vinfo)
|
||||
/ TYPE_VECTOR_SUBPARTS (vectype_out));
|
||||
|
||||
if (!vec_stmt) /* transformation not required. */
|
||||
{
|
||||
STMT_VINFO_TYPE (stmt_info) = call_vec_info_type;
|
||||
if (vect_print_dump_info (REPORT_DETAILS))
|
||||
fprintf (vect_dump, "=== vectorizable_call ===");
|
||||
vect_model_simple_cost (stmt_info, ncopies);
|
||||
return true;
|
||||
}
|
||||
|
||||
@ -1873,8 +2364,6 @@ vectorizable_call (tree stmt, block_stmt_iterator *bsi, tree *vec_stmt)
|
||||
if (vect_print_dump_info (REPORT_DETAILS))
|
||||
fprintf (vect_dump, "transform operation.");
|
||||
|
||||
ncopies = (LOOP_VINFO_VECT_FACTOR (loop_vinfo)
|
||||
/ TYPE_VECTOR_SUBPARTS (vectype_out));
|
||||
gcc_assert (ncopies >= 1);
|
||||
|
||||
/* Handle def. */
|
||||
@ -2302,6 +2791,9 @@ vectorizable_assignment (tree stmt, block_stmt_iterator *bsi, tree *vec_stmt)
|
||||
if (!vec_stmt) /* transformation not required. */
|
||||
{
|
||||
STMT_VINFO_TYPE (stmt_info) = assignment_vec_info_type;
|
||||
if (vect_print_dump_info (REPORT_DETAILS))
|
||||
fprintf (vect_dump, "=== vectorizable_assignment ===");
|
||||
vect_model_simple_cost (stmt_info, ncopies);
|
||||
return true;
|
||||
}
|
||||
|
||||
@ -2392,6 +2884,9 @@ vectorizable_induction (tree phi, block_stmt_iterator *bsi ATTRIBUTE_UNUSED,
|
||||
if (!vec_stmt) /* transformation not required. */
|
||||
{
|
||||
STMT_VINFO_TYPE (stmt_info) = induc_vec_info_type;
|
||||
if (vect_print_dump_info (REPORT_DETAILS))
|
||||
fprintf (vect_dump, "=== vectorizable_induction ===");
|
||||
vect_model_induction_cost (stmt_info, ncopies);
|
||||
return true;
|
||||
}
|
||||
|
||||
@ -2555,6 +3050,9 @@ vectorizable_operation (tree stmt, block_stmt_iterator *bsi, tree *vec_stmt)
|
||||
if (!vec_stmt) /* transformation not required. */
|
||||
{
|
||||
STMT_VINFO_TYPE (stmt_info) = op_vec_info_type;
|
||||
if (vect_print_dump_info (REPORT_DETAILS))
|
||||
fprintf (vect_dump, "=== vectorizable_operation ===");
|
||||
vect_model_simple_cost (stmt_info, ncopies);
|
||||
return true;
|
||||
}
|
||||
|
||||
@ -2772,6 +3270,9 @@ vectorizable_type_demotion (tree stmt, block_stmt_iterator *bsi,
|
||||
if (!vec_stmt) /* transformation not required. */
|
||||
{
|
||||
STMT_VINFO_TYPE (stmt_info) = type_demotion_vec_info_type;
|
||||
if (vect_print_dump_info (REPORT_DETAILS))
|
||||
fprintf (vect_dump, "=== vectorizable_demotion ===");
|
||||
vect_model_simple_cost (stmt_info, ncopies);
|
||||
return true;
|
||||
}
|
||||
|
||||
@ -2932,6 +3433,9 @@ vectorizable_type_promotion (tree stmt, block_stmt_iterator *bsi,
|
||||
if (!vec_stmt) /* transformation not required. */
|
||||
{
|
||||
STMT_VINFO_TYPE (stmt_info) = type_promotion_vec_info_type;
|
||||
if (vect_print_dump_info (REPORT_DETAILS))
|
||||
fprintf (vect_dump, "=== vectorizable_promotion ===");
|
||||
vect_model_simple_cost (stmt_info, 2*ncopies);
|
||||
return true;
|
||||
}
|
||||
|
||||
@ -3252,14 +3756,12 @@ vectorizable_store (tree stmt, block_stmt_iterator *bsi, tree *vec_stmt)
|
||||
if (!vec_stmt) /* transformation not required. */
|
||||
{
|
||||
STMT_VINFO_TYPE (stmt_info) = store_vec_info_type;
|
||||
vect_model_store_cost (stmt_info, ncopies);
|
||||
return true;
|
||||
}
|
||||
|
||||
/** Transform. **/
|
||||
|
||||
if (vect_print_dump_info (REPORT_DETAILS))
|
||||
fprintf (vect_dump, "transform store. ncopies = %d",ncopies);
|
||||
|
||||
if (strided_store)
|
||||
{
|
||||
first_stmt = DR_GROUP_FIRST_DR (stmt_info);
|
||||
@ -3284,6 +3786,9 @@ vectorizable_store (tree stmt, block_stmt_iterator *bsi, tree *vec_stmt)
|
||||
group_size = 1;
|
||||
}
|
||||
|
||||
if (vect_print_dump_info (REPORT_DETAILS))
|
||||
fprintf (vect_dump, "transform store. ncopies = %d",ncopies);
|
||||
|
||||
dr_chain = VEC_alloc (tree, heap, group_size);
|
||||
oprnds = VEC_alloc (tree, heap, group_size);
|
||||
|
||||
@ -3915,14 +4420,15 @@ vectorizable_load (tree stmt, block_stmt_iterator *bsi, tree *vec_stmt)
|
||||
if (!vec_stmt) /* transformation not required. */
|
||||
{
|
||||
STMT_VINFO_TYPE (stmt_info) = load_vec_info_type;
|
||||
vect_model_load_cost (stmt_info, ncopies);
|
||||
return true;
|
||||
}
|
||||
|
||||
/** Transform. **/
|
||||
|
||||
if (vect_print_dump_info (REPORT_DETAILS))
|
||||
fprintf (vect_dump, "transform load.");
|
||||
|
||||
/** Transform. **/
|
||||
|
||||
if (strided_load)
|
||||
{
|
||||
first_stmt = DR_GROUP_FIRST_DR (stmt_info);
|
||||
@ -4807,6 +5313,8 @@ vect_do_peeling_for_loop_bound (loop_vec_info loop_vinfo, tree *ratio)
|
||||
basic_block preheader;
|
||||
int loop_num;
|
||||
unsigned int th;
|
||||
int min_scalar_loop_bound;
|
||||
int min_profitable_iters;
|
||||
|
||||
if (vect_print_dump_info (REPORT_DETAILS))
|
||||
fprintf (vect_dump, "=== vect_do_peeling_for_loop_bound ===");
|
||||
@ -4822,11 +5330,28 @@ vect_do_peeling_for_loop_bound (loop_vec_info loop_vinfo, tree *ratio)
|
||||
&ratio_mult_vf_name, ratio);
|
||||
|
||||
loop_num = loop->num;
|
||||
/* Threshold for vectorized loop. */
|
||||
th = (PARAM_VALUE (PARAM_MIN_VECT_LOOP_BOUND)) *
|
||||
LOOP_VINFO_VECT_FACTOR (loop_vinfo);
|
||||
|
||||
/* Analyze cost to set threshold for vectorized loop. */
|
||||
min_profitable_iters = vect_estimate_min_profitable_iters (loop_vinfo);
|
||||
|
||||
min_scalar_loop_bound = (PARAM_VALUE (PARAM_MIN_VECT_LOOP_BOUND))
|
||||
* LOOP_VINFO_VECT_FACTOR (loop_vinfo);
|
||||
|
||||
/* Use the cost model only if it is more conservative than user specified
|
||||
threshold. */
|
||||
|
||||
th = (unsigned) min_scalar_loop_bound;
|
||||
if (min_profitable_iters
|
||||
&& (!min_scalar_loop_bound
|
||||
|| min_profitable_iters > min_scalar_loop_bound))
|
||||
th = (unsigned) min_profitable_iters;
|
||||
|
||||
if (vect_print_dump_info (REPORT_DETAILS))
|
||||
fprintf (vect_dump, "vectorization may not be profitable.");
|
||||
|
||||
new_loop = slpeel_tree_peel_loop_to_edge (loop, single_exit (loop),
|
||||
ratio_mult_vf_name, ni_name, false, th);
|
||||
ratio_mult_vf_name, ni_name, false,
|
||||
th);
|
||||
gcc_assert (new_loop);
|
||||
gcc_assert (loop_num == loop->num);
|
||||
#ifdef ENABLE_CHECKING
|
||||
|
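The profitability condition used by vect_estimate_min_profitable_iters in this file, ((SIC*VF)-VIC)*niters > VOC*VF, can be checked by hand with a small sketch. The function below is a hypothetical stand-alone version of that arithmetic, assuming the four costs are passed in as plain integers instead of being collected from the loop; the name min_profitable_iters and its parameters are illustrative only.

```c
#include <assert.h>

/* Hypothetical stand-alone version of the arithmetic, not the GCC
   function itself.
   sic = scalar cost of one iteration, vic = vector cost of one vector
   iteration, voc = vector cost outside the loop, vf = vectorization
   factor.  Returns -1 when the vector version can never win.  */
int
min_profitable_iters (int sic, int vic, int voc, int vf)
{
  int iters;

  assert (vf > 0);

  /* One vector iteration replaces vf scalar iterations; if it is not
     cheaper than those, no iteration count makes it profitable.  */
  if (sic * vf <= vic)
    return -1;

  if (voc == 0)
    iters = 1;
  else
    {
      /* Integer division rounds down, so bump the result by one when
         the strict inequality does not hold yet.  */
      iters = (voc * vf) / (sic * vf - vic);
      if (sic * vf * iters <= vic * iters + voc * vf)
        iters++;
    }

  /* Never report fewer iterations than one full vector.  */
  return iters < vf ? vf : iters;
}
```

For example, with sic = 4, vic = 6, voc = 20 and vf = 4 the division gives 80 / 10 = 8; at 8 iterations both sides of the inequality equal 128, so the result is bumped to 9.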
@ -1351,6 +1351,8 @@ new_stmt_vec_info (tree stmt, loop_vec_info loop_vinfo)
|
||||
else
|
||||
STMT_VINFO_DEF_TYPE (res) = vect_loop_def;
|
||||
STMT_VINFO_SAME_ALIGN_REFS (res) = VEC_alloc (dr_p, heap, 5);
|
||||
STMT_VINFO_INSIDE_OF_LOOP_COST (res) = 0;
|
||||
STMT_VINFO_OUTSIDE_OF_LOOP_COST (res) = 0;
|
||||
DR_GROUP_FIRST_DR (res) = NULL_TREE;
|
||||
DR_GROUP_NEXT_DR (res) = NULL_TREE;
|
||||
DR_GROUP_SIZE (res) = 0;
|
||||
|
@ -1,5 +1,5 @@
|
||||
/* Loop Vectorization
|
||||
Copyright (C) 2003, 2004, 2005, 2006 Free Software Foundation, Inc.
|
||||
Copyright (C) 2003, 2004, 2005, 2006, 2007 Free Software Foundation, Inc.
|
||||
Contributed by Dorit Naishlos <dorit@il.ibm.com>
|
||||
|
||||
This file is part of GCC.
|
||||
@ -268,6 +268,13 @@ typedef struct _stmt_vec_info {
|
||||
/* For loads only, if there is a store with the same location, this field is
|
||||
TRUE. */
|
||||
bool read_write_dep;
|
||||
|
||||
/* Vectorization costs associated with statement. */
|
||||
struct
|
||||
{
|
||||
int outside_of_loop; /* Statements generated outside loop. */
|
||||
int inside_of_loop; /* Statements generated inside loop. */
|
||||
} cost;
|
||||
} *stmt_vec_info;
|
||||
|
||||
/* Access Functions. */
|
||||
@ -300,6 +307,42 @@ typedef struct _stmt_vec_info {
|
||||
#define DR_GROUP_READ_WRITE_DEPENDENCE(S) (S)->read_write_dep
|
||||
|
||||
#define STMT_VINFO_RELEVANT_P(S) ((S)->relevant != vect_unused_in_loop)
|
||||
#define STMT_VINFO_OUTSIDE_OF_LOOP_COST(S) (S)->cost.outside_of_loop
|
||||
#define STMT_VINFO_INSIDE_OF_LOOP_COST(S) (S)->cost.inside_of_loop
|
||||
|
||||
/* These are some defines for the initial implementation of the vectorizer's
|
||||
cost model. These will later be target specific hooks. */
|
||||
|
||||
/* Cost of conditional branch. */
|
||||
#ifndef TARG_COND_BRANCH_COST
|
||||
#define TARG_COND_BRANCH_COST 3
|
||||
#endif
|
||||
|
||||
/* Cost of any vector operation, excluding load, store or vector to scalar
|
||||
operation. */
|
||||
#ifndef TARG_VEC_STMT_COST
|
||||
#define TARG_VEC_STMT_COST 1
|
||||
#endif
|
||||
|
||||
/* Cost of vector to scalar operation. */
|
||||
#ifndef TARG_VEC_TO_SCALAR_COST
|
||||
#define TARG_VEC_TO_SCALAR_COST 1
|
||||
#endif
|
||||
|
||||
/* Cost of aligned vector load. */
|
||||
#ifndef TARG_VEC_LOAD_COST
|
||||
#define TARG_VEC_LOAD_COST 1
|
||||
#endif
|
||||
|
||||
/* Cost of misaligned vector load. */
|
||||
#ifndef TARG_VEC_UNALIGNED_LOAD_COST
|
||||
#define TARG_VEC_UNALIGNED_LOAD_COST 2
|
||||
#endif
|
||||
|
||||
/* Cost of vector store. */
|
||||
#ifndef TARG_VEC_STORE_COST
|
||||
#define TARG_VEC_STORE_COST 1
|
||||
#endif
|
||||
|
||||
static inline void set_stmt_info (stmt_ann_t ann, stmt_vec_info stmt_info);
|
||||
static inline stmt_vec_info vinfo_for_stmt (tree stmt);
|
||||
@ -437,6 +480,7 @@ extern bool vectorizable_condition (tree, block_stmt_iterator *, tree *);
|
||||
extern bool vectorizable_live_operation (tree, block_stmt_iterator *, tree *);
|
||||
extern bool vectorizable_reduction (tree, block_stmt_iterator *, tree *);
|
||||
extern bool vectorizable_induction (tree, block_stmt_iterator *, tree *);
|
||||
extern int vect_estimate_min_profitable_iters (loop_vec_info);
|
||||
/* Driver for transformation stage. */
|
||||
extern void vect_transform_loop (loop_vec_info);
|
||||
|
||||
|
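To see how the default cost macros defined in tree-vectorizer.h feed into a per-statement cost, here is a minimal sketch of the store cost formula from vect_model_store_cost. The helpers log2_exact and store_inside_cost are hypothetical names introduced for illustration; the macro values are the defaults from the header above, and log2_exact is an assumed stand-in for GCC's exact_log2.

```c
#include <assert.h>

/* Default values taken from the header in this patch.  */
#define TARG_VEC_STMT_COST 1
#define TARG_VEC_STORE_COST 1

/* Hypothetical stand-in for exact_log2: log2 of a power of two,
   -1 for anything else.  */
int
log2_exact (int x)
{
  int n = 0;

  if (x <= 0 || (x & (x - 1)) != 0)
    return -1;
  while (x > 1)
    {
      x >>= 1;
      n++;
    }
  return n;
}

/* Hypothetical helper: inside-of-loop cost of a store, following the
   formula in vect_model_store_cost.  group_size is 1 for a non-strided
   store; the sketch assumes it is otherwise a power of two.  */
int
store_inside_cost (int ncopies, int group_size)
{
  int cost = 0;

  assert (ncopies >= 1 && group_size >= 1);

  /* Strided group: add the interleave (permute) operations.  */
  if (group_size > 1)
    cost = ncopies * log2_exact (group_size) * group_size
           * TARG_VEC_STMT_COST;

  /* The stores themselves.  */
  cost += ncopies * TARG_VEC_STORE_COST;

  return cost;
}
```

With ncopies = 2, a group of 4 stores costs 2 * 2 * 4 = 16 for the permutes plus 2 for the stores, 18 in total, while a non-strided store with the same ncopies costs 2.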