extend.texi: Add fvect-cost-model flag.

gcc/ChangeLog:
2007-06-08  Harsha Jagasia <harsha.jagasia@amd.com>
            Tony Linthicum <tony.linthicum@amd.com>

	* doc/extend.texi: Add fvect-cost-model flag.
	* common.opt (fvect-cost-model): New flag.
	* tree-vectorizer.c (new_stmt_vec_info): Initialize inside and outside
	cost fields in stmt_vec_info struct for STMT.
	* tree-vectorizer.h (stmt_vec_info): Define inside and outside cost
	fields in stmt_vec_info struct and access functions for the same.
	(TARG_COND_BRANCH_COST): Define cost of conditional branch.
	(TARG_VEC_STMT_COST): Define cost of any vector operation, excluding
	load, store and vector to scalar operation.
	(TARG_VEC_TO_SCALAR_COST): Define cost of vector to scalar operation.
	(TARG_VEC_LOAD_COST): Define cost of aligned vector load.
	(TARG_VEC_UNALIGNED_LOAD_COST): Define cost of misaligned vector load.
	(TARG_VEC_STORE_COST): Define cost of vector store.
	(vect_estimate_min_profitable_iters): Define new function.
	* tree-vect-analyze.c (vect_analyze_operations): Add a compile-time
	check to evaluate if loop iterations are less than minimum profitable
	iterations determined by cost model or minimum vect loop bound defined
	by user, whichever is more conservative.
	* tree-vect-transform.c (vect_do_peeling_for_loop_bound): Add a
	run-time check to evaluate if loop iterations are less than minimum
	profitable iterations determined by cost model or minimum vect loop
	bound defined by user, whichever is more conservative.
	(vect_estimate_min_profitable_iters): New function to estimate
	minimum iterations required for vector version of loop to be
	profitable over scalar version.
	(vect_model_reduction_cost): New function.
	(vect_model_induction_cost): New function.
	(vect_model_simple_cost): New function.
	(vect_cost_strided_group_size): New function.
	(vect_model_store_cost): New function.
	(vect_model_load_cost): New function.
	(vectorizable_reduction): Call vect_model_reduction_cost during
	analysis phase.
	(vectorizable_induction): Call vect_model_induction_cost during
	analysis phase.
	(vectorizable_load): Call vect_model_load_cost during analysis phase.
	(vectorizable_store): Call vect_model_store_cost during analysis phase.
	(vectorizable_call, vectorizable_assignment, vectorizable_operation,
	vectorizable_promotion, vectorizable_demotion): Call 
	vect_model_simple_cost during analysis phase.

gcc/testsuite/ChangeLog:
2007-06-08  Harsha Jagasia <harsha.jagasia@amd.com>

	* gcc.dg/vect/costmodel: New directory.
	* gcc.dg/vect/costmodel/i386: New directory.
	* gcc.dg/vect/costmodel/i386/i386-costmodel-vect.exp: New testsuite.
	* gcc.dg/vect/costmodel/i386/costmodel-fast-math-vect-pr29925.c:
	New test.
	* gcc.dg/vect/costmodel/i386/costmodel-vect-31.c: New test.
	* gcc.dg/vect/costmodel/i386/costmodel-vect-33.c: New test.
	* gcc.dg/vect/costmodel/i386/costmodel-vect-68.c: New test.
	* gcc.dg/vect/costmodel/i386/costmodel-vect-reduc-1char.c: New test.
	* gcc.dg/vect/costmodel/x86_64: New directory.
	* gcc.dg/vect/costmodel/x86_64/x86_64-costmodel-vect.exp:
	New testsuite.	
	* gcc.dg/vect/costmodel/x86_64/costmodel-fast-math-vect-pr29925.c:
	New test.
	* gcc.dg/vect/costmodel/x86_64/costmodel-vect-31.c: New test.
	* gcc.dg/vect/costmodel/x86_64/costmodel-vect-33.c: New test.
	* gcc.dg/vect/costmodel/x86_64/costmodel-vect-68.c: New test.
	* gcc.dg/vect/costmodel/x86_64/costmodel-vect-reduc-1char.c: New test.
	* gcc.dg/vect/costmodel/x86_64/costmodel-pr30843.c: New test.

Co-Authored-By: Tony Linthicum <tony.linthicum@amd.com>

From-SVN: r125575
Harsha Jagasia 2007-06-08 16:30:49 +00:00 committed by Harsha Jagasia
parent c8e2516ccf
commit 792ed98bb7
21 changed files with 1486 additions and 21 deletions

View File

@ -1,3 +1,47 @@
2007-06-08 Harsha Jagasia <harsha.jagasia@amd.com>
Tony Linthicum <tony.linthicum@amd.com>
* doc/extend.texi: Add fvect-cost-model flag.
* common.opt (fvect-cost-model): New flag.
* tree-vectorizer.c (new_stmt_vec_info): Initialize inside and outside
cost fields in stmt_vec_info struct for STMT.
* tree-vectorizer.h (stmt_vec_info): Define inside and outside cost
fields in stmt_vec_info struct and access functions for the same.
(TARG_COND_BRANCH_COST): Define cost of conditional branch.
(TARG_VEC_STMT_COST): Define cost of any vector operation, excluding
load, store and vector to scalar operation.
(TARG_VEC_TO_SCALAR_COST): Define cost of vector to scalar operation.
(TARG_VEC_LOAD_COST): Define cost of aligned vector load.
(TARG_VEC_UNALIGNED_LOAD_COST): Define cost of misaligned vector load.
(TARG_VEC_STORE_COST): Define cost of vector store.
(vect_estimate_min_profitable_iters): Define new function.
* tree-vect-analyze.c (vect_analyze_operations): Add a compile-time
check to evaluate if loop iterations are less than minimum profitable
iterations determined by cost model or minimum vect loop bound defined
by user, whichever is more conservative.
* tree-vect-transform.c (vect_do_peeling_for_loop_bound): Add a
run-time check to evaluate if loop iterations are less than minimum
profitable iterations determined by cost model or minimum vect loop
bound defined by user, whichever is more conservative.
(vect_estimate_min_profitable_iters): New function to estimate
minimum iterations required for vector version of loop to be
profitable over scalar version.
(vect_model_reduction_cost): New function.
(vect_model_induction_cost): New function.
(vect_model_simple_cost): New function.
(vect_cost_strided_group_size): New function.
(vect_model_store_cost): New function.
(vect_model_load_cost): New function.
(vectorizable_reduction): Call vect_model_reduction_cost during
analysis phase.
(vectorizable_induction): Call vect_model_induction_cost during
analysis phase.
(vectorizable_load): Call vect_model_load_cost during analysis phase.
(vectorizable_store): Call vect_model_store_cost during analysis phase.
(vectorizable_call, vectorizable_assignment, vectorizable_operation,
vectorizable_promotion, vectorizable_demotion): Call
vect_model_simple_cost during analysis phase.
2007-06-08 Simon Baldwin <simonb@google.com>
* reg-stack.c (get_true_reg): Readability change. Moved default case

View File

@ -1110,6 +1110,10 @@ ftree-vectorize
Common Report Var(flag_tree_vectorize) Optimization
Enable loop vectorization on trees
fvect-cost-model
Common Report Var(flag_vect_cost_model) Optimization
Enable use of cost model in vectorization
ftree-vect-loop-version
Common Report Var(flag_tree_vect_loop_version) Init(1) Optimization
Enable loop versioning when doing loop vectorization on trees

View File

@ -357,7 +357,7 @@ Objective-C and Objective-C++ Dialects}.
-fcheck-data-deps @gol
-ftree-dominator-opts -ftree-dse -ftree-copyrename -ftree-sink @gol
-ftree-ch -ftree-sra -ftree-ter -ftree-fre -ftree-vectorize @gol
- -ftree-vect-loop-version -ftree-salias -fipa-pta -fweb @gol
+ -ftree-vect-loop-version -fvect-cost-model -ftree-salias -fipa-pta -fweb @gol
-ftree-copy-prop -ftree-store-ccp -ftree-store-copy-prop -fwhole-program @gol
--param @var{name}=@var{value}
-O -O0 -O1 -O2 -O3 -Os}
@ -5666,6 +5666,9 @@ the loop are generated along with runtime checks for alignment or dependence
to control which version is executed. This option is enabled by default
except at level @option{-Os} where it is disabled.
@item -fvect-cost-model
Enable cost model for vectorization.
@item -ftree-vrp
Perform Value Range Propagation on trees. This is similar to the
constant propagation pass, but instead of values, ranges of values are

View File

@ -1,3 +1,25 @@
2007-06-08 Harsha Jagasia <harsha.jagasia@amd.com>
* gcc.dg/vect/costmodel: New directory.
* gcc.dg/vect/costmodel/i386: New directory.
* gcc.dg/vect/costmodel/i386/i386-costmodel-vect.exp: New testsuite.
* gcc.dg/vect/costmodel/i386/costmodel-fast-math-vect-pr29925.c:
New test.
* gcc.dg/vect/costmodel/i386/costmodel-vect-31.c: New test.
* gcc.dg/vect/costmodel/i386/costmodel-vect-33.c: New test.
* gcc.dg/vect/costmodel/i386/costmodel-vect-68.c: New test.
* gcc.dg/vect/costmodel/i386/costmodel-vect-reduc-1char.c: New test.
* gcc.dg/vect/costmodel/x86_64: New directory.
* gcc.dg/vect/costmodel/x86_64/x86_64-costmodel-vect.exp:
New testsuite.
* gcc.dg/vect/costmodel/x86_64/costmodel-fast-math-vect-pr29925.c:
New test.
* gcc.dg/vect/costmodel/x86_64/costmodel-vect-31.c: New test.
* gcc.dg/vect/costmodel/x86_64/costmodel-vect-33.c: New test.
* gcc.dg/vect/costmodel/x86_64/costmodel-vect-68.c: New test.
* gcc.dg/vect/costmodel/x86_64/costmodel-vect-reduc-1char.c: New test.
* gcc.dg/vect/costmodel/x86_64/costmodel-pr30843.c: New test.
2007-06-08 Uros Bizjak <ubizjak@gmail.com>
PR tree-optimization/32243

View File

@ -0,0 +1,39 @@
/* { dg-require-effective-target vect_float } */
#include <stdlib.h>
#include "../../tree-vect.h"
void interp_pitch(float *exc, float *interp, int pitch, int len)
{
int i,k;
int maxj;
maxj=3;
for (i=0;i<len;i++)
{
float tmp = 0;
for (k=0;k<7;k++)
{
tmp += exc[i-pitch+k+maxj-6];
}
interp[i] = tmp;
}
}
int main()
{
float *exc = calloc(126,sizeof(float));
float *interp = calloc(80,sizeof(float));
int pitch = -35;
check_vect ();
interp_pitch(exc, interp, pitch, 80);
free(exc);
free(interp);
return 0;
}
/* { dg-final { scan-tree-dump-times "vectorization not profitable" 1 "vect" } } */
/* { dg-final { cleanup-tree-dump "vect" } } */

View File

@ -0,0 +1,91 @@
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include "../../tree-vect.h"
#define N 32
struct t{
int k[N];
int l;
};
struct s{
char a; /* aligned */
char b[N-1]; /* unaligned (offset 1B) */
char c[N]; /* aligned (offset NB) */
struct t d; /* aligned (offset 2NB) */
struct t e; /* unaligned (offset 2N+4N+4 B) */
};
int main1 ()
{
int i;
struct s tmp;
/* unaligned */
for (i = 0; i < N/2; i++)
{
tmp.b[i] = 5;
}
/* check results: */
for (i = 0; i <N/2; i++)
{
if (tmp.b[i] != 5)
abort ();
}
/* aligned */
for (i = 0; i < N/2; i++)
{
tmp.c[i] = 6;
}
/* check results: */
for (i = 0; i <N/2; i++)
{
if (tmp.c[i] != 6)
abort ();
}
/* aligned */
for (i = 0; i < N/2; i++)
{
tmp.d.k[i] = 7;
}
/* check results: */
for (i = 0; i <N/2; i++)
{
if (tmp.d.k[i] != 7)
abort ();
}
/* unaligned */
for (i = 0; i < N/2; i++)
{
tmp.e.k[i] = 8;
}
/* check results: */
for (i = 0; i <N/2; i++)
{
if (tmp.e.k[i] != 8)
abort ();
}
return 0;
}
int main (void)
{
check_vect ();
return main1 ();
}
/* { dg-final { scan-tree-dump-times "vectorization not profitable" 2 "vect" } } */
/* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" } } */
/* { dg-final { cleanup-tree-dump "vect" } } */

View File

@ -0,0 +1,39 @@
/* { dg-do compile } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include "../../tree-vect.h"
#define N 16
struct test {
char ca[N];
};
extern struct test s;
int main1 ()
{
int i;
for (i = 0; i < N; i++)
{
s.ca[i] = 5;
}
/* check results: */
for (i = 0; i < N; i++)
{
if (s.ca[i] != 5)
abort ();
}
return 0;
}
int main (void)
{
return main1 ();
}
/* { dg-final { scan-tree-dump-times "vectorization not profitable" 1 "vect" } } */
/* { dg-final { cleanup-tree-dump "vect" } } */

View File

@ -0,0 +1,89 @@
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include "../../tree-vect.h"
#define N 32
struct s{
int m;
int n[N][N][N];
};
struct test1{
struct s a; /* array a.n is unaligned */
int b;
int c;
struct s e; /* array e.n is aligned */
};
int main1 ()
{
int i,j;
struct test1 tmp1;
/* 1. unaligned */
for (i = 0; i < N; i++)
{
tmp1.a.n[1][2][i] = 5;
}
/* check results: */
for (i = 0; i <N; i++)
{
if (tmp1.a.n[1][2][i] != 5)
abort ();
}
/* 2. aligned */
for (i = 3; i < N-1; i++)
{
tmp1.a.n[1][2][i] = 6;
}
/* check results: */
for (i = 3; i < N-1; i++)
{
if (tmp1.a.n[1][2][i] != 6)
abort ();
}
/* 3. aligned */
for (i = 0; i < N; i++)
{
tmp1.e.n[1][2][i] = 7;
}
/* check results: */
for (i = 0; i < N; i++)
{
if (tmp1.e.n[1][2][i] != 7)
abort ();
}
/* 4. unaligned */
for (i = 3; i < N-3; i++)
{
tmp1.e.n[1][2][i] = 8;
}
/* check results: */
for (i = 3; i <N-3; i++)
{
if (tmp1.e.n[1][2][i] != 8)
abort ();
}
return 0;
}
int main (void)
{
check_vect ();
return main1 ();
}
/* { dg-final { scan-tree-dump-times "vectorization not profitable" 1 "vect" } } */
/* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" } } */
/* { dg-final { cleanup-tree-dump "vect" } } */

View File

@ -0,0 +1,51 @@
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include "../../tree-vect.h"
#define N 16
#define DIFF 242
void
main1 (unsigned char x, unsigned char max_result, unsigned char min_result)
{
int i;
unsigned char ub[N] = {1,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45};
unsigned char uc[N] = {1,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
unsigned char udiff = 2;
unsigned char umax = x;
unsigned char umin = x;
for (i = 0; i < N; i++) {
udiff += (unsigned char)(ub[i] - uc[i]);
}
for (i = 0; i < N; i++) {
umax = umax < uc[i] ? uc[i] : umax;
}
for (i = 0; i < N; i++) {
umin = umin > uc[i] ? uc[i] : umin;
}
/* check results: */
if (udiff != DIFF)
abort ();
if (umax != max_result)
abort ();
if (umin != min_result)
abort ();
}
int main (void)
{
check_vect ();
main1 (100, 100, 1);
main1 (0, 15, 0);
return 0;
}
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail vect_no_int_max } } } */
/* { dg-final { scan-tree-dump-times "vectorization not profitable" 2 "vect" { xfail vect_no_int_max } } } */
/* { dg-final { cleanup-tree-dump "vect" } } */

View File

@ -0,0 +1,67 @@
# Copyright (C) 1997, 2004, 2005, 2006 Free Software Foundation, Inc.
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
# GCC testsuite that uses the `dg.exp' driver.
# Load support procs.
load_lib gcc-dg.exp
# Exit immediately if this isn't a x86 target.
if { ![istarget i?86*-*-*] && ![istarget x86_64-*-*] } then {
return
}
# Set up flags used for tests that don't specify options.
set DEFAULT_VECTCFLAGS ""
# These flags are used for all targets.
lappend DEFAULT_VECTCFLAGS "-O2" "-ftree-vectorize" "-fvect-cost-model"
# If the target system supports vector instructions, the default action
# for a test is 'run', otherwise it's 'compile'. Save current default.
# Executing vector instructions on a system without hardware vector support
# is also disabled by a call to check_vect, but disabling execution here is
# more efficient.
global dg-do-what-default
set save-dg-do-what-default ${dg-do-what-default}
lappend DEFAULT_VECTCFLAGS "-msse2"
set dg-do-what-default run
# Initialize `dg'.
dg-init
lappend DEFAULT_VECTCFLAGS "-fdump-tree-vect-details"
# Main loop.
dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/costmodel-vect-*.\[cS\]]] \
"" $DEFAULT_VECTCFLAGS
#### Tests with special options
global SAVED_DEFAULT_VECTCFLAGS
set SAVED_DEFAULT_VECTCFLAGS $DEFAULT_VECTCFLAGS
# -ffast-math tests
set DEFAULT_VECTCFLAGS $SAVED_DEFAULT_VECTCFLAGS
lappend DEFAULT_VECTCFLAGS "-ffast-math"
dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/costmodel-fast-math-vect*.\[cS\]]] \
"" $DEFAULT_VECTCFLAGS
# Clean up.
set dg-do-what-default ${save-dg-do-what-default}
# All done.
dg-finish

View File

@ -0,0 +1,39 @@
/* { dg-require-effective-target vect_float } */
#include <stdlib.h>
#include "../../tree-vect.h"
void interp_pitch(float *exc, float *interp, int pitch, int len)
{
int i,k;
int maxj;
maxj=3;
for (i=0;i<len;i++)
{
float tmp = 0;
for (k=0;k<7;k++)
{
tmp += exc[i-pitch+k+maxj-6];
}
interp[i] = tmp;
}
}
int main()
{
float *exc = calloc(126,sizeof(float));
float *interp = calloc(80,sizeof(float));
int pitch = -35;
check_vect ();
interp_pitch(exc, interp, pitch, 80);
free(exc);
free(interp);
return 0;
}
/* { dg-final { scan-tree-dump-times "vectorization not profitable" 1 "vect" } } */
/* { dg-final { cleanup-tree-dump "vect" } } */

View File

@ -0,0 +1,26 @@
/* { dg-do compile } */
/* { dg-require-effective-target vect_long } */
#include <stdarg.h>
#include "../../tree-vect.h"
#define N 16
void dacP98FillRGBMap (unsigned char *pBuffer)
{
unsigned long dw, dw1;
unsigned long *pdw = (unsigned long *)(pBuffer);
for( dw = 256, dw1 = 0; dw; dw--, dw1 += 0x01010101)
{
*pdw++ = dw1;
*pdw++ = dw1;
*pdw++ = dw1;
*pdw++ = dw1;
}
}
/* { dg-final { scan-tree-dump-times "vectorization not profitable" 1 "vect" { target vect_interleave } } } */
/* { dg-final { cleanup-tree-dump "vect" } } */

View File

@ -0,0 +1,91 @@
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include "../../tree-vect.h"
#define N 32
struct t{
int k[N];
int l;
};
struct s{
char a; /* aligned */
char b[N-1]; /* unaligned (offset 1B) */
char c[N]; /* aligned (offset NB) */
struct t d; /* aligned (offset 2NB) */
struct t e; /* unaligned (offset 2N+4N+4 B) */
};
int main1 ()
{
int i;
struct s tmp;
/* unaligned */
for (i = 0; i < N/2; i++)
{
tmp.b[i] = 5;
}
/* check results: */
for (i = 0; i <N/2; i++)
{
if (tmp.b[i] != 5)
abort ();
}
/* aligned */
for (i = 0; i < N/2; i++)
{
tmp.c[i] = 6;
}
/* check results: */
for (i = 0; i <N/2; i++)
{
if (tmp.c[i] != 6)
abort ();
}
/* aligned */
for (i = 0; i < N/2; i++)
{
tmp.d.k[i] = 7;
}
/* check results: */
for (i = 0; i <N/2; i++)
{
if (tmp.d.k[i] != 7)
abort ();
}
/* unaligned */
for (i = 0; i < N/2; i++)
{
tmp.e.k[i] = 8;
}
/* check results: */
for (i = 0; i <N/2; i++)
{
if (tmp.e.k[i] != 8)
abort ();
}
return 0;
}
int main (void)
{
check_vect ();
return main1 ();
}
/* { dg-final { scan-tree-dump-times "vectorization not profitable" 2 "vect" } } */
/* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" } } */
/* { dg-final { cleanup-tree-dump "vect" } } */

View File

@ -0,0 +1,39 @@
/* { dg-do compile } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include "../../tree-vect.h"
#define N 16
struct test {
char ca[N];
};
extern struct test s;
int main1 ()
{
int i;
for (i = 0; i < N; i++)
{
s.ca[i] = 5;
}
/* check results: */
for (i = 0; i < N; i++)
{
if (s.ca[i] != 5)
abort ();
}
return 0;
}
int main (void)
{
return main1 ();
}
/* { dg-final { scan-tree-dump-times "vectorization not profitable" 1 "vect" } } */
/* { dg-final { cleanup-tree-dump "vect" } } */

View File

@ -0,0 +1,89 @@
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include "../../tree-vect.h"
#define N 32
struct s{
int m;
int n[N][N][N];
};
struct test1{
struct s a; /* array a.n is unaligned */
int b;
int c;
struct s e; /* array e.n is aligned */
};
int main1 ()
{
int i,j;
struct test1 tmp1;
/* 1. unaligned */
for (i = 0; i < N; i++)
{
tmp1.a.n[1][2][i] = 5;
}
/* check results: */
for (i = 0; i <N; i++)
{
if (tmp1.a.n[1][2][i] != 5)
abort ();
}
/* 2. aligned */
for (i = 3; i < N-1; i++)
{
tmp1.a.n[1][2][i] = 6;
}
/* check results: */
for (i = 3; i < N-1; i++)
{
if (tmp1.a.n[1][2][i] != 6)
abort ();
}
/* 3. aligned */
for (i = 0; i < N; i++)
{
tmp1.e.n[1][2][i] = 7;
}
/* check results: */
for (i = 0; i < N; i++)
{
if (tmp1.e.n[1][2][i] != 7)
abort ();
}
/* 4. unaligned */
for (i = 3; i < N-3; i++)
{
tmp1.e.n[1][2][i] = 8;
}
/* check results: */
for (i = 3; i <N-3; i++)
{
if (tmp1.e.n[1][2][i] != 8)
abort ();
}
return 0;
}
int main (void)
{
check_vect ();
return main1 ();
}
/* { dg-final { scan-tree-dump-times "vectorization not profitable" 1 "vect" } } */
/* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" } } */
/* { dg-final { cleanup-tree-dump "vect" } } */

View File

@ -0,0 +1,51 @@
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include "../../tree-vect.h"
#define N 16
#define DIFF 242
void
main1 (unsigned char x, unsigned char max_result, unsigned char min_result)
{
int i;
unsigned char ub[N] = {1,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45};
unsigned char uc[N] = {1,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
unsigned char udiff = 2;
unsigned char umax = x;
unsigned char umin = x;
for (i = 0; i < N; i++) {
udiff += (unsigned char)(ub[i] - uc[i]);
}
for (i = 0; i < N; i++) {
umax = umax < uc[i] ? uc[i] : umax;
}
for (i = 0; i < N; i++) {
umin = umin > uc[i] ? uc[i] : umin;
}
/* check results: */
if (udiff != DIFF)
abort ();
if (umax != max_result)
abort ();
if (umin != min_result)
abort ();
}
int main (void)
{
check_vect ();
main1 (100, 100, 1);
main1 (0, 15, 0);
return 0;
}
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail vect_no_int_max } } } */
/* { dg-final { scan-tree-dump-times "vectorization not profitable" 2 "vect" { xfail vect_no_int_max } } } */
/* { dg-final { cleanup-tree-dump "vect" } } */

View File

@ -0,0 +1,70 @@
# Copyright (C) 1997, 2004, 2005, 2006 Free Software Foundation, Inc.
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
# GCC testsuite that uses the `dg.exp' driver.
# Load support procs.
load_lib gcc-dg.exp
# Exit immediately if this isn't a x86 target.
if { (![istarget x86_64-*-*] && ![istarget i?86-*-*])
|| ![is-effective-target lp64] } then {
return
}
# Set up flags used for tests that don't specify options.
set DEFAULT_VECTCFLAGS ""
# These flags are used for all targets.
lappend DEFAULT_VECTCFLAGS "-O2" "-ftree-vectorize" "-fvect-cost-model"
# If the target system supports vector instructions, the default action
# for a test is 'run', otherwise it's 'compile'. Save current default.
# Executing vector instructions on a system without hardware vector support
# is also disabled by a call to check_vect, but disabling execution here is
# more efficient.
global dg-do-what-default
set save-dg-do-what-default ${dg-do-what-default}
lappend DEFAULT_VECTCFLAGS "-msse2"
set dg-do-what-default run
# Initialize `dg'.
dg-init
lappend DEFAULT_VECTCFLAGS "-fdump-tree-vect-details"
# Main loop.
dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/costmodel-pr*.\[cS\]]] \
"" $DEFAULT_VECTCFLAGS
dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/costmodel-vect-*.\[cS\]]] \
"" $DEFAULT_VECTCFLAGS
#### Tests with special options
global SAVED_DEFAULT_VECTCFLAGS
set SAVED_DEFAULT_VECTCFLAGS $DEFAULT_VECTCFLAGS
# -ffast-math tests
set DEFAULT_VECTCFLAGS $SAVED_DEFAULT_VECTCFLAGS
lappend DEFAULT_VECTCFLAGS "-ffast-math"
dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/costmodel-fast-math-vect*.\[cS\]]] \
"" $DEFAULT_VECTCFLAGS
# Clean up.
set dg-do-what-default ${save-dg-do-what-default}
# All done.
dg-finish

View File

@ -1,5 +1,5 @@
/* Analysis Utilities for Loop Vectorization.
- Copyright (C) 2003,2004,2005,2006 Free Software Foundation, Inc.
+ Copyright (C) 2003, 2004, 2005, 2006, 2007 Free Software Foundation, Inc.
Contributed by Dorit Naishlos <dorit@il.ibm.com>
This file is part of GCC.
@ -300,6 +300,9 @@ vect_analyze_operations (loop_vec_info loop_vinfo)
tree phi;
stmt_vec_info stmt_info;
bool need_to_vectorize = false;
int min_profitable_iters;
int min_scalar_loop_bound;
unsigned int th;
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "=== vect_analyze_operations ===");
@ -443,8 +446,6 @@ vect_analyze_operations (loop_vec_info loop_vinfo)
} /* stmts in bb */
} /* bbs */
- /* TODO: Analyze cost. Decide if worth while to vectorize. */
/* All operations in the loop are either irrelevant (deal with loop
control, or dead), or only used outside the loop and can be moved
out of the loop (e.g. invariants, inductions). The loop can be
@ -468,16 +469,55 @@ vect_analyze_operations (loop_vec_info loop_vinfo)
vectorization_factor, LOOP_VINFO_INT_NITERS (loop_vinfo));
if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
- && ((LOOP_VINFO_INT_NITERS (loop_vinfo) < vectorization_factor)
- || (LOOP_VINFO_INT_NITERS (loop_vinfo) <=
- ((unsigned) (PARAM_VALUE (PARAM_MIN_VECT_LOOP_BOUND))
- * vectorization_factor))))
+ && (LOOP_VINFO_INT_NITERS (loop_vinfo) < vectorization_factor))
{
if (vect_print_dump_info (REPORT_UNVECTORIZED_LOOPS))
- fprintf (vect_dump, "not vectorized: iteration count too small.");
+ fprintf (vect_dump, "not vectorized: iteration count too small.");
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump,"not vectorized: iteration count smaller than "
"vectorization factor.");
return false;
}
/* Analyze cost. Decide if worth while to vectorize. */
min_profitable_iters = vect_estimate_min_profitable_iters (loop_vinfo);
if (min_profitable_iters < 0)
{
if (vect_print_dump_info (REPORT_UNVECTORIZED_LOOPS))
fprintf (vect_dump, "not vectorized: vectorization not profitable.");
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "not vectorized: vector version will never be "
"profitable.");
return false;
}
min_scalar_loop_bound = (PARAM_VALUE (PARAM_MIN_VECT_LOOP_BOUND))
* vectorization_factor;
/* Use the cost model only if it is more conservative than user specified
threshold. */
th = (unsigned) min_scalar_loop_bound;
if (min_profitable_iters
&& (!min_scalar_loop_bound
|| min_profitable_iters > min_scalar_loop_bound))
th = (unsigned) min_profitable_iters;
if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
&& LOOP_VINFO_INT_NITERS (loop_vinfo) < th)
{
if (vect_print_dump_info (REPORT_UNVECTORIZED_LOOPS))
fprintf (vect_dump, "not vectorized: vectorization not "
"profitable.");
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "not vectorized: iteration count smaller than "
"user specified loop bound parameter or minimum "
"profitable iterations (whichever is more conservative).");
return false;
}
if (!LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
|| LOOP_VINFO_INT_NITERS (loop_vinfo) % vectorization_factor != 0
|| LOOP_PEELING_FOR_ALIGNMENT (loop_vinfo))
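
The threshold selection added to vect_analyze_operations above can be read in isolation as the sketch below. It is a stand-alone illustration, not code from this commit: `pick_threshold` and its parameter names are hypothetical, and `min_vect_loop_bound` stands in for `PARAM_VALUE (PARAM_MIN_VECT_LOOP_BOUND)`.

```c
#include <assert.h>

/* Hypothetical helper, not part of GCC.  It mirrors the choice of `th'
   in the hunk above: start from the user-specified bound (scaled by the
   vectorization factor) and let the cost model override it only when
   the cost model asks for more iterations, or when the user bound is
   zero.  */
static unsigned
pick_threshold (int min_profitable_iters, int min_vect_loop_bound, int vf)
{
  int min_scalar_loop_bound;
  unsigned th;

  assert (vf > 0);
  min_scalar_loop_bound = min_vect_loop_bound * vf;
  th = (unsigned) min_scalar_loop_bound;

  /* A cost-model result of 0 means the model is disabled, so the user
     bound is kept as is.  */
  if (min_profitable_iters
      && (!min_scalar_loop_bound
          || min_profitable_iters > min_scalar_loop_bound))
    th = (unsigned) min_profitable_iters;

  return th;
}
```

With a vectorization factor of 4, a cost-model estimate of 7 and no user bound gives a threshold of 7, while the same estimate with a user bound of 2 gives 8, because the scaled user bound is then the larger of the two values.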

View File

@ -74,6 +74,490 @@ static void vect_update_inits_of_drs (loop_vec_info, tree);
static int vect_min_worthwhile_factor (enum tree_code);
/* Function vect_estimate_min_profitable_iters
Return the number of iterations required for the vector version of the
loop to be profitable relative to the cost of the scalar version of the
loop.
TODO: Take profile info into account before making vectorization
decisions, if available. */
int
vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo)
{
int i;
int min_profitable_iters;
int peel_iters_prologue;
int peel_iters_epilogue;
int vec_inside_cost = 0;
int vec_outside_cost = 0;
int scalar_single_iter_cost = 0;
int vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
basic_block *bbs = LOOP_VINFO_BBS (loop_vinfo);
int nbbs = loop->num_nodes;
/* Cost model disabled. */
if (!flag_vect_cost_model)
{
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "cost model disabled.");
return 0;
}
/* Requires loop versioning tests to handle misalignment.
FIXME: Make cost depend on number of stmts in may_misalign list. */
if (LOOP_VINFO_MAY_MISALIGN_STMTS (loop_vinfo))
{
vec_outside_cost += TARG_COND_BRANCH_COST;
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "cost model: Adding cost of checks for loop "
"versioning.\n");
}
/* Requires a prologue loop when peeling to handle misalignment. Add cost of
two guards, one for the peeled loop and one for the vector loop. */
peel_iters_prologue = LOOP_PEELING_FOR_ALIGNMENT (loop_vinfo);
if (peel_iters_prologue)
{
vec_outside_cost += 2 * TARG_COND_BRANCH_COST;
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "cost model: Adding cost of checks for "
"prologue.\n");
}
/* Requires an epilogue loop to finish up remaining iterations after vector
loop. Add cost of two guards, one for the peeled loop and one for the
vector loop. */
if ((peel_iters_prologue < 0)
|| !LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
|| LOOP_VINFO_INT_NITERS (loop_vinfo) % vf)
{
vec_outside_cost += 2 * TARG_COND_BRANCH_COST;
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "cost model: Adding cost of checks for "
"epilogue.\n");
}
/* Count statements in scalar loop. Using this as scalar cost for a single
iteration for now.
TODO: Add outer loop support.
TODO: Consider assigning different costs to different scalar
statements. */
for (i = 0; i < nbbs; i++)
{
block_stmt_iterator si;
basic_block bb = bbs[i];
for (si = bsi_start (bb); !bsi_end_p (si); bsi_next (&si))
{
tree stmt = bsi_stmt (si);
stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
if (!STMT_VINFO_RELEVANT_P (stmt_info)
&& !STMT_VINFO_LIVE_P (stmt_info))
continue;
scalar_single_iter_cost++;
vec_inside_cost += STMT_VINFO_INSIDE_OF_LOOP_COST (stmt_info);
vec_outside_cost += STMT_VINFO_OUTSIDE_OF_LOOP_COST (stmt_info);
}
}
/* Add additional cost for the peeled instructions in prologue and epilogue
loop.
FORNOW: If we don't know the value of peel_iters for prologue or epilogue
at compile-time - we assume the worst.
TODO: Build an expression that represents peel_iters for prologue and
epilogue to be used in a run-time test. */
peel_iters_prologue = LOOP_PEELING_FOR_ALIGNMENT (loop_vinfo);
if (peel_iters_prologue < 0)
{
peel_iters_prologue = vf - 1;
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "cost model: "
"prologue peel iters set conservatively.");
/* If peeling for alignment is unknown, loop bound of main loop becomes
unknown. */
peel_iters_epilogue = vf - 1;
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "cost model: "
"epilogue peel iters set conservatively because "
"peeling for alignment is unknown .");
}
else
{
if (!LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo))
{
peel_iters_epilogue = vf - 1;
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "cost model: "
"epilogue peel iters set conservatively because "
"loop iterations are unknown .");
}
else
peel_iters_epilogue =
(LOOP_VINFO_INT_NITERS (loop_vinfo) - peel_iters_prologue)
% vf;
}
vec_outside_cost += (peel_iters_prologue * scalar_single_iter_cost)
+ (peel_iters_epilogue * scalar_single_iter_cost);
/* Calculate number of iterations required to make the vector version
profitable, relative to the loop bodies only. The following condition
must hold true: ((SIC*VF)-VIC)*niters > VOC*VF, where
SIC = scalar iteration cost, VIC = vector iteration cost,
VOC = vector outside cost and VF = vectorization factor. */
if ((scalar_single_iter_cost * vf) > vec_inside_cost)
{
if (vec_outside_cost == 0)
min_profitable_iters = 1;
else
{
min_profitable_iters = (vec_outside_cost * vf)
/ ((scalar_single_iter_cost * vf)
- vec_inside_cost);
if ((scalar_single_iter_cost * vf * min_profitable_iters)
<= ((vec_inside_cost * min_profitable_iters)
+ (vec_outside_cost * vf)))
min_profitable_iters++;
}
}
/* vector version will never be profitable. */
else
{
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "cost model: vector iteration cost = %d "
"is divisible by scalar iteration cost = %d by a factor "
"greater than or equal to the vectorization factor = %d .",
vec_inside_cost, scalar_single_iter_cost, vf);
return -1;
}
if (vect_print_dump_info (REPORT_DETAILS))
{
fprintf (vect_dump, "Cost model analysis: \n");
fprintf (vect_dump, " Vector inside of loop cost: %d\n",
vec_inside_cost);
fprintf (vect_dump, " Vector outside of loop cost: %d\n",
vec_outside_cost);
fprintf (vect_dump, " Scalar cost: %d\n", scalar_single_iter_cost);
fprintf (vect_dump, " prologue iterations: %d\n",
peel_iters_prologue);
fprintf (vect_dump, " epilogue iterations: %d\n",
peel_iters_epilogue);
fprintf (vect_dump, " Calculated minimum iters for profitability: %d\n",
min_profitable_iters);
fprintf (vect_dump, " Actual minimum iters for profitability: %d\n",
min_profitable_iters < vf ? vf : min_profitable_iters);
}
return min_profitable_iters < vf ? vf : min_profitable_iters;
}
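The profitability arithmetic above can be checked in isolation. The following is a minimal sketch, not part of the patch: the function name `min_profitable_iters_sketch` and its parameter names are hypothetical, and it assumes small non-negative integer costs so that the products do not overflow. It follows the same steps as the code above: reject the loop when SIC*VF <= VIC, take the integer quotient VOC*VF / (SIC*VF - VIC), bump it by one when the strict inequality does not yet hold, and clamp the result to at least VF.

```c
#include <assert.h>

/* Hypothetical standalone helper mirroring the arithmetic in
   vect_estimate_min_profitable_iters.
   sic = scalar cost of one iteration, vic = vector cost of one iteration,
   voc = vector cost outside the loop, vf = vectorization factor.
   Returns -1 when the vector loop body is never cheaper.  */
static int
min_profitable_iters_sketch (int sic, int vic, int voc, int vf)
{
  int n;

  assert (vf > 0);

  /* The vector body must beat VF scalar iterations.  */
  if (sic * vf <= vic)
    return -1;

  if (voc == 0)
    n = 1;
  else
    {
      n = (voc * vf) / (sic * vf - vic);
      /* Integer division rounds down; add one iteration if
         ((SIC*VF)-VIC)*n > VOC*VF does not hold yet.  */
      if (sic * vf * n <= vic * n + voc * vf)
        n++;
    }

  /* Never report fewer iterations than one full vector iteration.  */
  return n < vf ? vf : n;
}
```

For example, with SIC = 1, VIC = 2, VOC = 6 and VF = 4 the quotient is 24 / 2 = 12, the two sides are equal at 12 iterations (48 versus 48), and the result is bumped to 13.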
/* TODO: Close dependency between vect_model_*_cost and vectorizable_*
functions. Design better to avoid maintenance issues. */
/* Function vect_model_reduction_cost.
Models cost for a reduction operation, including the vector ops
generated within the strip-mine loop, the initial definition before
the loop, and the epilogue code that must be generated. */
static void
vect_model_reduction_cost (stmt_vec_info stmt_info, enum tree_code reduc_code,
int ncopies)
{
int outer_cost = 0;
enum tree_code code;
optab optab;
tree vectype;
tree orig_stmt;
tree reduction_op;
enum machine_mode mode;
tree operation = GIMPLE_STMT_OPERAND (STMT_VINFO_STMT (stmt_info), 1);
int op_type = TREE_CODE_LENGTH (TREE_CODE (operation));
/* Cost of reduction op inside loop. */
STMT_VINFO_INSIDE_OF_LOOP_COST (stmt_info) += ncopies * TARG_VEC_STMT_COST;
reduction_op = TREE_OPERAND (operation, op_type-1);
vectype = get_vectype_for_scalar_type (TREE_TYPE (reduction_op));
mode = TYPE_MODE (vectype);
orig_stmt = STMT_VINFO_RELATED_STMT (stmt_info);
if (!orig_stmt)
orig_stmt = STMT_VINFO_STMT (stmt_info);
code = TREE_CODE (GIMPLE_STMT_OPERAND (orig_stmt, 1));
/* Add in cost for initial definition. */
outer_cost += TARG_VEC_STMT_COST;
/* Determine cost of epilogue code.
We have a reduction operator that will reduce the vector in one statement.
Also requires scalar extract. */
if (reduc_code < NUM_TREE_CODES)
outer_cost += TARG_VEC_STMT_COST + TARG_VEC_TO_SCALAR_COST;
else
{
int vec_size_in_bits = tree_low_cst (TYPE_SIZE (vectype), 1);
tree bitsize =
TYPE_SIZE (TREE_TYPE ( GIMPLE_STMT_OPERAND (orig_stmt, 0)));
int element_bitsize = tree_low_cst (bitsize, 1);
int nelements = vec_size_in_bits / element_bitsize;
optab = optab_for_tree_code (code, vectype);
/* We have a whole vector shift available. */
if (VECTOR_MODE_P (mode)
&& optab->handlers[mode].insn_code != CODE_FOR_nothing)
/* Final reduction via vector shifts and the reduction operator. Also
requires scalar extract. */
outer_cost += ((exact_log2(nelements) * 2 + 1) * TARG_VEC_STMT_COST);
else
/* Use extracts and reduction op for final reduction. For N elements,
we have N extracts and N-1 reduction ops. */
outer_cost += ((nelements + nelements - 1) * TARG_VEC_STMT_COST);
}
STMT_VINFO_OUTSIDE_OF_LOOP_COST (stmt_info) = outer_cost;
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "vect_model_reduction_cost: inside_cost = %d, "
"outside_cost = %d .", STMT_VINFO_INSIDE_OF_LOOP_COST (stmt_info),
STMT_VINFO_OUTSIDE_OF_LOOP_COST (stmt_info));
}
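To make the three epilogue cases concrete, here is a small sketch of the outside-of-loop cost computed by vect_model_reduction_cost. It is an illustration only: the enum `epilogue_kind`, the helper `reduc_log2` and the function `reduction_outside_cost_sketch` are hypothetical names, the unit costs are assumed to equal the TARG_* defaults introduced by this patch, and the element count is assumed to be a power of two.

```c
#include <assert.h>

/* Assumed to match the default TARG_VEC_STMT_COST and
   TARG_VEC_TO_SCALAR_COST values from this patch.  */
#define SKETCH_VEC_STMT_COST 1
#define SKETCH_VEC_TO_SCALAR_COST 1

/* Hypothetical classification of how the final reduction is emitted.  */
enum epilogue_kind
{
  EPILOGUE_REDUC_INSN,  /* single reduction statement plus scalar extract */
  EPILOGUE_SHIFTS,      /* whole-vector shifts plus the reduction operator */
  EPILOGUE_EXTRACTS     /* N extracts and N-1 scalar reduction ops */
};

/* log2 of a positive power of two (hypothetical helper).  */
static int
reduc_log2 (int n)
{
  int bits = 0;

  assert (n > 0 && (n & (n - 1)) == 0);
  while (n > 1)
    {
      n >>= 1;
      bits++;
    }
  return bits;
}

static int
reduction_outside_cost_sketch (enum epilogue_kind kind, int nelements)
{
  /* Initial definition before the loop.  */
  int cost = SKETCH_VEC_STMT_COST;

  switch (kind)
    {
    case EPILOGUE_REDUC_INSN:
      cost += SKETCH_VEC_STMT_COST + SKETCH_VEC_TO_SCALAR_COST;
      break;
    case EPILOGUE_SHIFTS:
      cost += (reduc_log2 (nelements) * 2 + 1) * SKETCH_VEC_STMT_COST;
      break;
    case EPILOGUE_EXTRACTS:
      cost += (nelements + nelements - 1) * SKETCH_VEC_STMT_COST;
      break;
    }
  return cost;
}
```

With four elements and unit costs, the three schemes cost 3, 6 and 8 respectively, so the model prefers a direct reduction instruction, then shifts, then element extracts.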
/* Function vect_model_induction_cost.
Models cost for induction operations. */
static void
vect_model_induction_cost (stmt_vec_info stmt_info, int ncopies)
{
/* loop cost for vec_loop. */
STMT_VINFO_INSIDE_OF_LOOP_COST (stmt_info) = ncopies * TARG_VEC_STMT_COST;
/* prologue cost for vec_init and vec_step. */
STMT_VINFO_OUTSIDE_OF_LOOP_COST (stmt_info) = 2 * TARG_VEC_STMT_COST;
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "vect_model_induction_cost: inside_cost = %d, "
"outside_cost = %d .", STMT_VINFO_INSIDE_OF_LOOP_COST (stmt_info),
STMT_VINFO_OUTSIDE_OF_LOOP_COST (stmt_info));
}
/* Function vect_model_simple_cost.
Models cost for simple operations, i.e. those that only emit ncopies of a
single op. Right now, this does not account for multiple insns that could
be generated for the single vector op. We will handle that shortly. */
static void
vect_model_simple_cost (stmt_vec_info stmt_info, int ncopies)
{
STMT_VINFO_INSIDE_OF_LOOP_COST (stmt_info) = ncopies * TARG_VEC_STMT_COST;
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "vect_model_simple_cost: inside_cost = %d, "
"outside_cost = %d .", STMT_VINFO_INSIDE_OF_LOOP_COST (stmt_info),
STMT_VINFO_OUTSIDE_OF_LOOP_COST (stmt_info));
}
/* Function vect_cost_strided_group_size
For strided load or store, return the group_size only if it is the first
load or store of a group, else return 1. This ensures that group size is
only returned once per group. */
static int
vect_cost_strided_group_size (stmt_vec_info stmt_info)
{
tree first_stmt = DR_GROUP_FIRST_DR (stmt_info);
if (first_stmt == STMT_VINFO_STMT (stmt_info))
return DR_GROUP_SIZE (stmt_info);
return 1;
}
/* Function vect_model_store_cost
Models cost for stores. In the case of strided accesses, one access
has the overhead of the strided access attributed to it. */
static void
vect_model_store_cost (stmt_vec_info stmt_info, int ncopies)
{
int cost = 0;
int group_size;
/* Strided access? */
if (DR_GROUP_FIRST_DR (stmt_info))
group_size = vect_cost_strided_group_size (stmt_info);
/* Not a strided access. */
else
group_size = 1;
/* Is this an access in a group of stores, which provide strided access?
If so, add in the cost of the permutes. */
if (group_size > 1)
{
/* Uses a high and low interleave operation for each needed permute. */
cost = ncopies * exact_log2(group_size) * group_size
* TARG_VEC_STMT_COST;
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "vect_model_store_cost: strided group_size = %d .",
group_size);
}
/* Costs of the stores. */
cost += ncopies * TARG_VEC_STORE_COST;
STMT_VINFO_INSIDE_OF_LOOP_COST (stmt_info) = cost;
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "vect_model_store_cost: inside_cost = %d, "
"outside_cost = %d .", STMT_VINFO_INSIDE_OF_LOOP_COST (stmt_info),
STMT_VINFO_OUTSIDE_OF_LOOP_COST (stmt_info));
}
/* Function vect_model_load_cost
Models cost for loads. In the case of strided accesses, the last access
has the overhead of the strided access attributed to it. Since unaligned
accesses are supported for loads, we also account for the costs of the
access scheme chosen. */
static void
vect_model_load_cost (stmt_vec_info stmt_info, int ncopies)
{
int inner_cost = 0;
int group_size;
int alignment_support_cheme;
tree first_stmt;
struct data_reference *dr = STMT_VINFO_DATA_REF (stmt_info), *first_dr;
/* Strided accesses? */
first_stmt = DR_GROUP_FIRST_DR (stmt_info);
if (first_stmt)
{
group_size = vect_cost_strided_group_size (stmt_info);
first_dr = STMT_VINFO_DATA_REF (vinfo_for_stmt (first_stmt));
}
/* Not a strided access. */
else
{
group_size = 1;
first_dr = dr;
}
alignment_support_cheme = vect_supportable_dr_alignment (first_dr);
/* Is this an access in a group of loads providing strided access?
If so, add in the cost of the permutes. */
if (group_size > 1)
{
/* Uses even and odd extract operations for each needed permute. */
inner_cost = ncopies * exact_log2(group_size) * group_size
* TARG_VEC_STMT_COST;
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "vect_model_load_cost: strided group_size = %d .",
group_size);
}
/* The loads themselves. */
switch (alignment_support_cheme)
{
case dr_aligned:
{
inner_cost += ncopies * TARG_VEC_LOAD_COST;
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "vect_model_load_cost: aligned.");
break;
}
case dr_unaligned_supported:
{
/* Here, we assign an additional cost for the unaligned load. */
inner_cost += ncopies * TARG_VEC_UNALIGNED_LOAD_COST;
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "vect_model_load_cost: unaligned supported by "
"hardware.");
break;
}
case dr_unaligned_software_pipeline:
{
int outer_cost = 0;
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "vect_model_load_cost: unaligned software "
"pipelined.");
/* Unaligned software pipeline has a load of an address, an initial
load, and possibly a mask operation to "prime" the loop. However,
if this is an access in a group of loads, which provide strided
access, then the above cost should only be considered for one
access in the group. Inside the loop, there is a load op
and a realignment op. */
if ((!DR_GROUP_FIRST_DR (stmt_info)) || group_size > 1)
{
outer_cost = 2*TARG_VEC_STMT_COST;
if (targetm.vectorize.builtin_mask_for_load)
outer_cost += TARG_VEC_STMT_COST;
}
STMT_VINFO_OUTSIDE_OF_LOOP_COST (stmt_info) = outer_cost;
inner_cost += ncopies * (TARG_VEC_LOAD_COST + TARG_VEC_STMT_COST);
break;
}
default:
gcc_unreachable ();
}
STMT_VINFO_INSIDE_OF_LOOP_COST (stmt_info) = inner_cost;
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "vect_model_load_cost: inside_cost = %d, "
"outside_cost = %d .", STMT_VINFO_INSIDE_OF_LOOP_COST (stmt_info),
STMT_VINFO_OUTSIDE_OF_LOOP_COST (stmt_info));
}
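The store and load models share the same permute term and differ only in the per-copy access cost and the software-pipelining setup. The sketch below restates both in standalone form. All `sk_`-prefixed names are hypothetical, the unit costs are assumed to equal the TARG_* defaults from this patch, `group_size` stands for the value returned by vect_cost_strided_group_size, and group sizes are assumed to be powers of two.

```c
#include <assert.h>

/* Assumed to match the TARG_* defaults added by this patch.  */
#define SK_VEC_STMT_COST 1
#define SK_VEC_LOAD_COST 1
#define SK_VEC_UNALIGNED_LOAD_COST 2
#define SK_VEC_STORE_COST 1

enum sk_alignment
{
  SK_ALIGNED,
  SK_UNALIGNED_HW,
  SK_UNALIGNED_SW_PIPELINE
};

struct sk_cost
{
  int inside;   /* cost of statements inside the loop */
  int outside;  /* cost of statements outside the loop */
};

/* log2 of a positive power of two (hypothetical helper).  */
static int
sk_perm_log2 (int n)
{
  int bits = 0;

  assert (n > 0 && (n & (n - 1)) == 0);
  while (n > 1)
    {
      n >>= 1;
      bits++;
    }
  return bits;
}

/* Interleave/extract cost charged once per strided group.  */
static int
sk_permute_cost (int ncopies, int group_size)
{
  if (group_size <= 1)
    return 0;
  return ncopies * sk_perm_log2 (group_size) * group_size * SK_VEC_STMT_COST;
}

static struct sk_cost
sk_store_cost (int ncopies, int group_size)
{
  struct sk_cost c = { 0, 0 };

  c.inside = sk_permute_cost (ncopies, group_size)
             + ncopies * SK_VEC_STORE_COST;
  return c;
}

static struct sk_cost
sk_load_cost (enum sk_alignment align, int ncopies, int group_size,
              int is_strided, int has_mask_builtin)
{
  struct sk_cost c = { 0, 0 };

  c.inside = sk_permute_cost (ncopies, group_size);
  switch (align)
    {
    case SK_ALIGNED:
      c.inside += ncopies * SK_VEC_LOAD_COST;
      break;
    case SK_UNALIGNED_HW:
      c.inside += ncopies * SK_VEC_UNALIGNED_LOAD_COST;
      break;
    case SK_UNALIGNED_SW_PIPELINE:
      /* Address load and initial load, plus an optional mask operation,
         charged to one access per group only.  */
      if (!is_strided || group_size > 1)
        {
          c.outside = 2 * SK_VEC_STMT_COST;
          if (has_mask_builtin)
            c.outside += SK_VEC_STMT_COST;
        }
      /* Inside the loop: one load and one realignment op per copy.  */
      c.inside += ncopies * (SK_VEC_LOAD_COST + SK_VEC_STMT_COST);
      break;
    }
  return c;
}
```

As a usage example, a software-pipelined strided load with one copy, a group of four and a mask builtin costs 8 for the permutes plus 2 for the load and realignment inside the loop, and 3 outside it.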
/* Function vect_get_new_vect_var.
Returns a name for a new variable. The current naming scheme appends the
@ -1655,6 +2139,7 @@ vectorizable_reduction (tree stmt, block_stmt_iterator *bsi, tree *vec_stmt)
if (!vec_stmt) /* transformation not required. */
{
STMT_VINFO_TYPE (stmt_info) = reduc_vec_info_type;
vect_model_reduction_cost (stmt_info, epilog_reduc_code, ncopies);
return true;
}
@ -1862,9 +2347,15 @@ vectorizable_call (tree stmt, block_stmt_iterator *bsi, tree *vec_stmt)
gcc_assert (ZERO_SSA_OPERANDS (stmt, SSA_OP_ALL_VIRTUALS));
ncopies = (LOOP_VINFO_VECT_FACTOR (loop_vinfo)
/ TYPE_VECTOR_SUBPARTS (vectype_out));
if (!vec_stmt) /* transformation not required. */
{
STMT_VINFO_TYPE (stmt_info) = call_vec_info_type;
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "=== vectorizable_call ===");
vect_model_simple_cost (stmt_info, ncopies);
return true;
}
@ -1873,8 +2364,6 @@ vectorizable_call (tree stmt, block_stmt_iterator *bsi, tree *vec_stmt)
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "transform operation.");
ncopies = (LOOP_VINFO_VECT_FACTOR (loop_vinfo)
/ TYPE_VECTOR_SUBPARTS (vectype_out));
gcc_assert (ncopies >= 1);
/* Handle def. */
@ -2302,6 +2791,9 @@ vectorizable_assignment (tree stmt, block_stmt_iterator *bsi, tree *vec_stmt)
if (!vec_stmt) /* transformation not required. */
{
STMT_VINFO_TYPE (stmt_info) = assignment_vec_info_type;
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "=== vectorizable_assignment ===");
vect_model_simple_cost (stmt_info, ncopies);
return true;
}
@ -2392,6 +2884,9 @@ vectorizable_induction (tree phi, block_stmt_iterator *bsi ATTRIBUTE_UNUSED,
if (!vec_stmt) /* transformation not required. */
{
STMT_VINFO_TYPE (stmt_info) = induc_vec_info_type;
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "=== vectorizable_induction ===");
vect_model_induction_cost (stmt_info, ncopies);
return true;
}
@ -2555,6 +3050,9 @@ vectorizable_operation (tree stmt, block_stmt_iterator *bsi, tree *vec_stmt)
if (!vec_stmt) /* transformation not required. */
{
STMT_VINFO_TYPE (stmt_info) = op_vec_info_type;
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "=== vectorizable_operation ===");
vect_model_simple_cost (stmt_info, ncopies);
return true;
}
@ -2772,6 +3270,9 @@ vectorizable_type_demotion (tree stmt, block_stmt_iterator *bsi,
if (!vec_stmt) /* transformation not required. */
{
STMT_VINFO_TYPE (stmt_info) = type_demotion_vec_info_type;
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "=== vectorizable_demotion ===");
vect_model_simple_cost (stmt_info, ncopies);
return true;
}
@ -2932,6 +3433,9 @@ vectorizable_type_promotion (tree stmt, block_stmt_iterator *bsi,
if (!vec_stmt) /* transformation not required. */
{
STMT_VINFO_TYPE (stmt_info) = type_promotion_vec_info_type;
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "=== vectorizable_promotion ===");
vect_model_simple_cost (stmt_info, 2*ncopies);
return true;
}
@ -3252,14 +3756,12 @@ vectorizable_store (tree stmt, block_stmt_iterator *bsi, tree *vec_stmt)
if (!vec_stmt) /* transformation not required. */
{
STMT_VINFO_TYPE (stmt_info) = store_vec_info_type;
vect_model_store_cost (stmt_info, ncopies);
return true;
}
/** Transform. **/
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "transform store. ncopies = %d",ncopies);
if (strided_store)
{
first_stmt = DR_GROUP_FIRST_DR (stmt_info);
@ -3284,6 +3786,9 @@ vectorizable_store (tree stmt, block_stmt_iterator *bsi, tree *vec_stmt)
group_size = 1;
}
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "transform store. ncopies = %d",ncopies);
dr_chain = VEC_alloc (tree, heap, group_size);
oprnds = VEC_alloc (tree, heap, group_size);
@ -3915,14 +4420,15 @@ vectorizable_load (tree stmt, block_stmt_iterator *bsi, tree *vec_stmt)
if (!vec_stmt) /* transformation not required. */
{
STMT_VINFO_TYPE (stmt_info) = load_vec_info_type;
vect_model_load_cost (stmt_info, ncopies);
return true;
}
/** Transform. **/
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "transform load.");
/** Transform. **/
if (strided_load)
{
first_stmt = DR_GROUP_FIRST_DR (stmt_info);
@ -4807,6 +5313,8 @@ vect_do_peeling_for_loop_bound (loop_vec_info loop_vinfo, tree *ratio)
basic_block preheader;
int loop_num;
unsigned int th;
int min_scalar_loop_bound;
int min_profitable_iters;
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "=== vect_do_peeling_for_loop_bound ===");
@ -4822,11 +5330,28 @@ vect_do_peeling_for_loop_bound (loop_vec_info loop_vinfo, tree *ratio)
&ratio_mult_vf_name, ratio);
loop_num = loop->num;
/* Threshold for vectorized loop. */
th = (PARAM_VALUE (PARAM_MIN_VECT_LOOP_BOUND)) *
LOOP_VINFO_VECT_FACTOR (loop_vinfo);
/* Analyze cost to set threshold for vectorized loop. */
min_profitable_iters = vect_estimate_min_profitable_iters (loop_vinfo);
min_scalar_loop_bound = (PARAM_VALUE (PARAM_MIN_VECT_LOOP_BOUND))
* LOOP_VINFO_VECT_FACTOR (loop_vinfo);
/* Use the cost model only if it is more conservative than user specified
threshold. */
th = (unsigned) min_scalar_loop_bound;
if (min_profitable_iters
&& (!min_scalar_loop_bound
|| min_profitable_iters > min_scalar_loop_bound))
th = (unsigned) min_profitable_iters;
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "vectorization may not be profitable.");
new_loop = slpeel_tree_peel_loop_to_edge (loop, single_exit (loop),
ratio_mult_vf_name, ni_name, false, th);
ratio_mult_vf_name, ni_name, false,
th);
gcc_assert (new_loop);
gcc_assert (loop_num == loop->num);
#ifdef ENABLE_CHECKING
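The threshold selection in vect_do_peeling_for_loop_bound picks the larger of the user bound and the cost-model estimate. A minimal sketch of that choice follows. The function name `peel_threshold_sketch` and its parameters are hypothetical, `min_vect_loop_bound` stands in for PARAM_VALUE (PARAM_MIN_VECT_LOOP_BOUND), and the sketch assumes a non-negative estimate, that is, a loop the cost model has not already rejected with -1.

```c
#include <assert.h>

/* Hypothetical restatement of the run-time threshold choice:
   use the cost-model estimate only when it is more conservative
   than the user-specified bound scaled by the vectorization factor.  */
static unsigned
peel_threshold_sketch (int min_profitable_iters, int min_vect_loop_bound,
                       int vf)
{
  int min_scalar_loop_bound = min_vect_loop_bound * vf;
  unsigned th = (unsigned) min_scalar_loop_bound;

  /* Assumption of this sketch: the estimate is not the -1 sentinel.  */
  assert (min_profitable_iters >= 0);

  if (min_profitable_iters
      && (!min_scalar_loop_bound
          || min_profitable_iters > min_scalar_loop_bound))
    th = (unsigned) min_profitable_iters;

  return th;
}
```

With VF = 4, an estimate of 13 and no user bound, the threshold is 13; with a user bound of 5 the scaled bound of 20 wins; an estimate of 30 overrides that bound again.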

@ -1351,6 +1351,8 @@ new_stmt_vec_info (tree stmt, loop_vec_info loop_vinfo)
else
STMT_VINFO_DEF_TYPE (res) = vect_loop_def;
STMT_VINFO_SAME_ALIGN_REFS (res) = VEC_alloc (dr_p, heap, 5);
STMT_VINFO_INSIDE_OF_LOOP_COST (res) = 0;
STMT_VINFO_OUTSIDE_OF_LOOP_COST (res) = 0;
DR_GROUP_FIRST_DR (res) = NULL_TREE;
DR_GROUP_NEXT_DR (res) = NULL_TREE;
DR_GROUP_SIZE (res) = 0;

@ -1,5 +1,5 @@
/* Loop Vectorization
Copyright (C) 2003, 2004, 2005, 2006 Free Software Foundation, Inc.
Copyright (C) 2003, 2004, 2005, 2006, 2007 Free Software Foundation, Inc.
Contributed by Dorit Naishlos <dorit@il.ibm.com>
This file is part of GCC.
@ -268,6 +268,13 @@ typedef struct _stmt_vec_info {
/* For loads only, if there is a store with the same location, this field is
TRUE. */
bool read_write_dep;
/* Vectorization costs associated with statement. */
struct
{
int outside_of_loop; /* Statements generated outside loop. */
int inside_of_loop; /* Statements generated inside loop. */
} cost;
} *stmt_vec_info;
/* Access Functions. */
@ -300,6 +307,42 @@ typedef struct _stmt_vec_info {
#define DR_GROUP_READ_WRITE_DEPENDENCE(S) (S)->read_write_dep
#define STMT_VINFO_RELEVANT_P(S) ((S)->relevant != vect_unused_in_loop)
#define STMT_VINFO_OUTSIDE_OF_LOOP_COST(S) (S)->cost.outside_of_loop
#define STMT_VINFO_INSIDE_OF_LOOP_COST(S) (S)->cost.inside_of_loop
/* These are some defines for the initial implementation of the vectorizer's
cost model. These will later be target specific hooks. */
/* Cost of conditional branch. */
#ifndef TARG_COND_BRANCH_COST
#define TARG_COND_BRANCH_COST 3
#endif
/* Cost of any vector operation, excluding load, store or vector to scalar
operation. */
#ifndef TARG_VEC_STMT_COST
#define TARG_VEC_STMT_COST 1
#endif
/* Cost of vector to scalar operation. */
#ifndef TARG_VEC_TO_SCALAR_COST
#define TARG_VEC_TO_SCALAR_COST 1
#endif
/* Cost of aligned vector load. */
#ifndef TARG_VEC_LOAD_COST
#define TARG_VEC_LOAD_COST 1
#endif
/* Cost of misaligned vector load. */
#ifndef TARG_VEC_UNALIGNED_LOAD_COST
#define TARG_VEC_UNALIGNED_LOAD_COST 2
#endif
/* Cost of vector store. */
#ifndef TARG_VEC_STORE_COST
#define TARG_VEC_STORE_COST 1
#endif
static inline void set_stmt_info (stmt_ann_t ann, stmt_vec_info stmt_info);
static inline stmt_vec_info vinfo_for_stmt (tree stmt);
@ -437,6 +480,7 @@ extern bool vectorizable_condition (tree, block_stmt_iterator *, tree *);
extern bool vectorizable_live_operation (tree, block_stmt_iterator *, tree *);
extern bool vectorizable_reduction (tree, block_stmt_iterator *, tree *);
extern bool vectorizable_induction (tree, block_stmt_iterator *, tree *);
extern int vect_estimate_min_profitable_iters (loop_vec_info);
/* Driver for transformation stage. */
extern void vect_transform_loop (loop_vec_info);