re PR tree-optimization/24659 (Conversions are not vectorized)

PR tree-optimization/24659
        * optabs.h (enum optab_index): Add OTI_vec_unpacks_float_hi,
	OTI_vec_unpacks_float_lo, OTI_vec_unpacku_float_hi,
	OTI_vec_unpacku_float_lo, OTI_vec_pack_sfix_trunc and
	OTI_vec_pack_ufix_trunc.
	(vec_unpacks_float_hi_optab): Define new macro.
	(vec_unpacks_float_lo_optab): Ditto.
	(vec_unpacku_float_hi_optab): Ditto.
	(vec_unpacku_float_lo_optab): Ditto.
	(vec_pack_sfix_trunc_optab): Ditto.
	(vec_pack_ufix_trunc_optab): Ditto.
	* genopinit.c (optabs): Implement vec_unpack[s|u]_float_[hi|lo]_optab
	and vec_pack_[s|u]fix_trunc_optab using
	vec_unpack[s|u]_float_[hi|lo]_* and vec_pack_[u|s]fix_trunc_* patterns.
	* tree-vectorizer.c (supportable_widening_operation): Handle
	FLOAT_EXPR and CONVERT_EXPR.  Update comment.
	(supportable_narrowing_operation): New function.
	* tree-vectorizer.h (supportable_narrowing_operation): Prototype.
	* tree-vect-transform.c (vectorizable_conversion): Handle
	(nunits_in == nunits_out / 2) and (nunits_out == nunits_in / 2) cases.
	(vect_gen_widened_results_half): Move before vectorizable_conversion.
	(vectorizable_type_demotion): Call supportable_narrowing_operation()
	to check for target support.
	* optabs.c (optab_for_tree_code): Return vec_unpack[s|u]_float_hi_optab
	for VEC_UNPACK_FLOAT_HI_EXPR, vec_unpack[s|u]_float_lo_optab
	for VEC_UNPACK_FLOAT_LO_EXPR and vec_pack_[u|s]fix_trunc_optab
	for VEC_PACK_FIX_TRUNC_EXPR.
	(expand_binop): Special case mode of the result for
	vec_pack_[u|s]fix_trunc_optab.
	(init_optabs): Initialize vec_unpack[s|u]_float_[hi|lo]_optab and
	vec_pack_[u|s]fix_trunc_optab.

	* tree.def (VEC_UNPACK_FLOAT_HI_EXPR, VEC_UNPACK_FLOAT_LO_EXPR,
	VEC_PACK_FIX_TRUNC_EXPR): New tree codes.
	* tree-pretty-print.c (dump_generic_node): Handle
	VEC_UNPACK_FLOAT_HI_EXPR, VEC_UNPACK_FLOAT_LO_EXPR and
	VEC_PACK_FIX_TRUNC_EXPR.
	(op_prio): Ditto.
	* expr.c (expand_expr_real_1): Ditto.
	* tree-inline.c (estimate_num_insns_1): Ditto.
	* tree-vect-generic.c (expand_vector_operations_1): Ditto.

	* config/i386/sse.md (vec_unpacks_float_hi_v8hi): New expander.
	(vec_unpacks_float_lo_v8hi): Ditto.
	(vec_unpacku_float_hi_v8hi): Ditto.
	(vec_unpacku_float_lo_v8hi): Ditto.
	(vec_unpacks_float_hi_v4si): Ditto.
	(vec_unpacks_float_lo_v4si): Ditto.
	(vec_pack_sfix_trunc_v2df): Ditto.

	* doc/c-tree.texi (Expression trees) [VEC_UNPACK_FLOAT_HI_EXPR]:
	Document.
	[VEC_UNPACK_FLOAT_LO_EXPR]: Ditto.
	[VEC_PACK_FIX_TRUNC_EXPR]: Ditto.
	* doc/md.texi (Standard Names) [vec_pack_sfix_trunc]: Document.
	[vec_pack_ufix_trunc]: Ditto.
	[vec_unpacks_float_hi]: Ditto.
	[vec_unpacks_float_lo]: Ditto.
	[vec_unpacku_float_hi]: Ditto.
	[vec_unpacku_float_lo]: Ditto.
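
	Note: the element-wise meaning of the new tree codes can be sketched
	in plain C.  This is a hypothetical scalar model, not code from the
	patch; the function names are invented for illustration, and "lo" is
	assumed to mean elements 0 .. n/2-1 (which half a target treats as
	"high" may depend on its endianness).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of VEC_UNPACK_FLOAT_LO_EXPR (high == 0) and
   VEC_UNPACK_FLOAT_HI_EXPR (high != 0) for a vector of N shorts:
   one half of IN is converted to float, giving N/2 elements that are
   twice as wide.  The lo/hi element order is an assumption.  */
void
model_unpack_float (const short *in, size_t n, int high, float *out)
{
  size_t half = n / 2;
  size_t base = high ? half : 0;

  assert (n % 2 == 0);
  for (size_t i = 0; i < half; i++)
    out[i] = (float) in[base + i];
}

/* Hypothetical model of VEC_PACK_FIX_TRUNC_EXPR: two input vectors of
   N doubles are converted to int with truncation and concatenated
   into one vector of 2*N ints.  */
void
model_pack_fix_trunc (const double *a, const double *b, size_t n, int *out)
{
  for (size_t i = 0; i < n; i++)
    {
      out[i] = (int) a[i];
      out[n + i] = (int) b[i];
    }
}
```

	In this model a short -> float loop consumes one input vector and
	produces two output vectors (lo and hi), while a double -> int loop
	consumes two input vectors per output vector.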

testsuite/ChangeLog:

	PR tree-optimization/24659
	* gcc.dg/vect/vect-floatint-conversion-2.c: New test.
	* gcc.dg/vect/vect-intfloat-conversion-1.c: Require vect_float,
	not vect_int target.
	* gcc.dg/vect/vect-intfloat-conversion-2.c: Require vect_float,
	not vect_int target.  Loop is vectorized for vect_intfloat_cvt
	targets.
	* gcc.dg/vect/vect-intfloat-conversion-3.c: New test.
	* gcc.dg/vect/vect-intfloat-conversion-4a.c: New test.
	* gcc.dg/vect/vect-intfloat-conversion-4b.c: New test.
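
	Note: the loops exercised by these tests have the following shape.
	This is a reduced sketch modelled on the test sources, with invented
	function names.  Whether a given loop is actually vectorized depends
	on the target and on the options in use; the tests only expect it on
	vect_intfloat_cvt and vect_floatint_cvt targets.

```c
#include <assert.h>
#include <stddef.h>

/* Widening conversion: each 16-bit integer becomes a 32-bit float,
   so one input vector yields two output vectors.  */
void
short_to_float (float *dst, const short *src, size_t n)
{
  assert (dst && src);
  for (size_t i = 0; i < n; i++)
    dst[i] = (float) src[i];
}

/* Widening conversion: each 32-bit integer becomes a 64-bit double.  */
void
int_to_double (double *dst, const int *src, size_t n)
{
  assert (dst && src);
  for (size_t i = 0; i < n; i++)
    dst[i] = (double) src[i];
}

/* Narrowing conversion: each 64-bit double is truncated to a 32-bit
   integer, so two input vectors are packed into one output vector.  */
void
double_to_int (int *dst, const double *src, size_t n)
{
  assert (dst && src);
  for (size_t i = 0; i < n; i++)
    dst[i] = (int) src[i];
}
```

	The vectorizer reports the outcome in its dump file, which is what
	the "vectorized 1 loops" scans in the tests look for.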

From-SVN: r124784
Uros Bizjak 2007-05-17 08:31:05 +02:00
parent f59d2a7c86
commit d9987fb407
22 changed files with 791 additions and 150 deletions


@ -1,3 +1,66 @@
2007-05-17 Uros Bizjak <ubizjak@gmail.com>
PR tree-optimization/24659
* optabs.h (enum optab_index): Add OTI_vec_unpacks_float_hi,
OTI_vec_unpacks_float_lo, OTI_vec_unpacku_float_hi,
OTI_vec_unpacku_float_lo, OTI_vec_pack_sfix_trunc and
OTI_vec_pack_ufix_trunc.
(vec_unpacks_float_hi_optab): Define new macro.
(vec_unpacks_float_lo_optab): Ditto.
(vec_unpacku_float_hi_optab): Ditto.
(vec_unpacku_float_lo_optab): Ditto.
(vec_pack_sfix_trunc_optab): Ditto.
(vec_pack_ufix_trunc_optab): Ditto.
* genopinit.c (optabs): Implement vec_unpack[s|u]_float_[hi|lo]_optab
and vec_pack_[s|u]fix_trunc_optab using
vec_unpack[s|u]_float_[hi|lo]_* and vec_pack_[u|s]fix_trunc_* patterns.
* tree-vectorizer.c (supportable_widening_operation): Handle
FLOAT_EXPR and CONVERT_EXPR. Update comment.
(supportable_narrowing_operation): New function.
* tree-vectorizer.h (supportable_narrowing_operation): Prototype.
* tree-vect-transform.c (vectorizable_conversion): Handle
(nunits_in == nunits_out / 2) and (nunits_out == nunits_in / 2) cases.
(vect_gen_widened_results_half): Move before vectorizable_conversion.
(vectorizable_type_demotion): Call supportable_narrowing_operation()
to check for target support.
* optabs.c (optab_for_tree_code): Return vec_unpack[s|u]_float_hi_optab
for VEC_UNPACK_FLOAT_HI_EXPR, vec_unpack[s|u]_float_lo_optab
for VEC_UNPACK_FLOAT_LO_EXPR and vec_pack_[u|s]fix_trunc_optab
for VEC_PACK_FIX_TRUNC_EXPR.
(expand_binop): Special case mode of the result for
vec_pack_[u|s]fix_trunc_optab.
(init_optabs): Initialize vec_unpack[s|u]_float_[hi|lo]_optab and
vec_pack_[u|s]fix_trunc_optab.
* tree.def (VEC_UNPACK_FLOAT_HI_EXPR, VEC_UNPACK_FLOAT_LO_EXPR,
VEC_PACK_FIX_TRUNC_EXPR): New tree codes.
* tree-pretty-print.c (dump_generic_node): Handle
VEC_UNPACK_FLOAT_HI_EXPR, VEC_UNPACK_FLOAT_LO_EXPR and
VEC_PACK_FIX_TRUNC_EXPR.
(op_prio): Ditto.
* expr.c (expand_expr_real_1): Ditto.
* tree-inline.c (estimate_num_insns_1): Ditto.
* tree-vect-generic.c (expand_vector_operations_1): Ditto.
* config/i386/sse.md (vec_unpacks_float_hi_v8hi): New expander.
(vec_unpacks_float_lo_v8hi): Ditto.
(vec_unpacku_float_hi_v8hi): Ditto.
(vec_unpacku_float_lo_v8hi): Ditto.
(vec_unpacks_float_hi_v4si): Ditto.
(vec_unpacks_float_lo_v4si): Ditto.
(vec_pack_sfix_trunc_v2df): Ditto.
* doc/c-tree.texi (Expression trees) [VEC_UNPACK_FLOAT_HI_EXPR]:
Document.
[VEC_UNPACK_FLOAT_LO_EXPR]: Ditto.
[VEC_PACK_FIX_TRUNC_EXPR]: Ditto.
* doc/md.texi (Standard Names) [vec_pack_sfix_trunc]: Document.
[vec_pack_ufix_trunc]: Ditto.
[vec_unpacks_float_hi]: Ditto.
[vec_unpacks_float_lo]: Ditto.
[vec_unpacku_float_hi]: Ditto.
[vec_unpacku_float_lo]: Ditto.
2007-05-16 Uros Bizjak <ubizjak@gmail.com>
* soft-fp/README: Update for new files.
@ -46,14 +109,15 @@
2007-05-16 Paolo Bonzini <bonzini@gnu.org>
* config/i386/i386.c (legitimize_tls_address): Mark __tls_get_addr
calls as pure.
* config/i386/i386.c (legitimize_tls_address): Mark __tls_get_addr
calls as pure.
2007-05-16 Eric Christopher <echristo@apple.com>
* config/rs6000/rs6000.c (rs6000_emit_prologue): Move altivec register
saving after stack push. Set sp_offset whenever we push.
(rs6000_emit_epilogue): Move altivec register restore before stack push.
saving after stack push. Set sp_offset whenever we push.
(rs6000_emit_epilogue): Move altivec register restore before
stack push.
2007-05-16 Richard Sandiford <richard@codesourcery.com>
@ -496,7 +560,7 @@
dumps.
2007-05-08 Sandra Loosemore <sandra@codesourcery.com>
Nigel Stephens <nigel@mips.com>
Nigel Stephens <nigel@mips.com>
* config/mips/mips.h (MAX_FPRS_PER_FMT): Renamed from FP_INC.
Update comments and all uses.
@ -563,7 +627,7 @@
* configure: Regenerate.
* config.in: Regenerate.
2007-05-07 Naveen.H.S <naveen.hs@kpitcummins.com>
2007-05-07 Naveen.H.S <naveen.hs@kpitcummins.com>
* config/m32c/muldiv.md (mulhisi3_c): Limit the mode of the 2nd
operand to HI mode.
@ -1062,7 +1126,7 @@
PR middle-end/22156
Temporarily revert:
2007-04-06 Andreas Tobler <a.tobler@schweiz.org>
* tree-sra.c (sra_build_elt_assignment): Initialize min/maxshift.
* tree-sra.c (sra_build_elt_assignment): Initialize min/maxshift.
2007-04-05 Alexandre Oliva <aoliva@redhat.com>
* tree-sra.c (try_instantiate_multiple_fields): Needlessly
initialize align to silence bogus warning.
@ -1274,17 +1338,17 @@
PR tree-optimization/30965
PR tree-optimization/30978
* Makefile.in (tree-ssa-forwprop.o): Depend on $(FLAGS_H).
* tree-ssa-forwprop.c (forward_propagate_into_cond_1): Remove.
(find_equivalent_equality_comparison): Likewise.
(simplify_cond): Likewise.
(get_prop_source_stmt): New helper.
(get_prop_dest_stmt): Likewise.
* tree-ssa-forwprop.c (forward_propagate_into_cond_1): Remove.
(find_equivalent_equality_comparison): Likewise.
(simplify_cond): Likewise.
(get_prop_source_stmt): New helper.
(get_prop_dest_stmt): Likewise.
(can_propagate_from): Likewise.
(remove_prop_source_from_use): Likewise.
(combine_cond_expr_cond): Likewise.
(forward_propagate_comparison): New function.
(forward_propagate_into_cond): Rewrite to use fold for
tree combining.
(combine_cond_expr_cond): Likewise.
(forward_propagate_comparison): New function.
(forward_propagate_into_cond): Rewrite to use fold for
tree combining.
(tree_ssa_forward_propagate_single_use_vars): Call
forward_propagate_comparison to propagate comparisons.


@ -2205,6 +2205,80 @@
(parallel [(const_int 0) (const_int 1)]))))]
"TARGET_SSE2")
(define_expand "vec_unpacks_float_hi_v8hi"
[(match_operand:V4SF 0 "register_operand" "")
(match_operand:V8HI 1 "register_operand" "")]
"TARGET_SSE2"
{
rtx tmp = gen_reg_rtx (V4SImode);
emit_insn (gen_vec_unpacks_hi_v8hi (tmp, operands[1]));
emit_insn (gen_sse2_cvtdq2ps (operands[0], tmp));
DONE;
})
(define_expand "vec_unpacks_float_lo_v8hi"
[(match_operand:V4SF 0 "register_operand" "")
(match_operand:V8HI 1 "register_operand" "")]
"TARGET_SSE2"
{
rtx tmp = gen_reg_rtx (V4SImode);
emit_insn (gen_vec_unpacks_lo_v8hi (tmp, operands[1]));
emit_insn (gen_sse2_cvtdq2ps (operands[0], tmp));
DONE;
})
(define_expand "vec_unpacku_float_hi_v8hi"
[(match_operand:V4SF 0 "register_operand" "")
(match_operand:V8HI 1 "register_operand" "")]
"TARGET_SSE2"
{
rtx tmp = gen_reg_rtx (V4SImode);
emit_insn (gen_vec_unpacku_hi_v8hi (tmp, operands[1]));
emit_insn (gen_sse2_cvtdq2ps (operands[0], tmp));
DONE;
})
(define_expand "vec_unpacku_float_lo_v8hi"
[(match_operand:V4SF 0 "register_operand" "")
(match_operand:V8HI 1 "register_operand" "")]
"TARGET_SSE2"
{
rtx tmp = gen_reg_rtx (V4SImode);
emit_insn (gen_vec_unpacku_lo_v8hi (tmp, operands[1]));
emit_insn (gen_sse2_cvtdq2ps (operands[0], tmp));
DONE;
})
(define_expand "vec_unpacks_float_hi_v4si"
[(set (match_dup 2)
(vec_select:V4SI
(match_operand:V4SI 1 "nonimmediate_operand" "")
(parallel [(const_int 2)
(const_int 3)
(const_int 2)
(const_int 3)])))
(set (match_operand:V2DF 0 "register_operand" "")
(float:V2DF
(vec_select:V2SI
(match_dup 2)
(parallel [(const_int 0) (const_int 1)]))))]
"TARGET_SSE2"
{
operands[2] = gen_reg_rtx (V4SImode);
})
(define_expand "vec_unpacks_float_lo_v4si"
[(set (match_operand:V2DF 0 "register_operand" "")
(float:V2DF
(vec_select:V2SI
(match_operand:V4SI 1 "nonimmediate_operand" "")
(parallel [(const_int 0) (const_int 1)]))))]
"TARGET_SSE2")
(define_expand "vec_pack_trunc_v2df"
[(match_operand:V4SF 0 "register_operand" "")
(match_operand:V2DF 1 "nonimmediate_operand" "")
@ -2222,6 +2296,25 @@
DONE;
})
(define_expand "vec_pack_sfix_trunc_v2df"
[(match_operand:V4SI 0 "register_operand" "")
(match_operand:V2DF 1 "nonimmediate_operand" "")
(match_operand:V2DF 2 "nonimmediate_operand" "")]
"TARGET_SSE2"
{
rtx r1, r2;
r1 = gen_reg_rtx (V4SImode);
r2 = gen_reg_rtx (V4SImode);
emit_insn (gen_sse2_cvttpd2dq (r1, operands[1]));
emit_insn (gen_sse2_cvttpd2dq (r2, operands[2]));
emit_insn (gen_sse2_punpcklqdq (gen_lowpart (V2DImode, operands[0]),
gen_lowpart (V2DImode, r1),
gen_lowpart (V2DImode, r2)));
DONE;
})
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;
;; Parallel double-precision floating point element swizzling
@ -3525,7 +3618,7 @@
"TARGET_SSE2"
{
rtx op1, op2, h1, l1, h2, l2, h3, l3;
op1 = gen_lowpart (V16QImode, operands[1]);
op2 = gen_lowpart (V16QImode, operands[2]);
h1 = gen_reg_rtx (V16QImode);
@ -3534,7 +3627,7 @@
l2 = gen_reg_rtx (V16QImode);
h3 = gen_reg_rtx (V16QImode);
l3 = gen_reg_rtx (V16QImode);
emit_insn (gen_vec_interleave_highv16qi (h1, op1, op2));
emit_insn (gen_vec_interleave_lowv16qi (l1, op1, op2));
emit_insn (gen_vec_interleave_highv16qi (h2, l1, h1));
@ -3544,7 +3637,7 @@
emit_insn (gen_vec_interleave_lowv16qi (operands[0], l3, h3));
DONE;
})
;; Reduce:
;; op1 = abcdefgh
;; op2 = ijklmnop
@ -3560,14 +3653,14 @@
"TARGET_SSE2"
{
rtx op1, op2, h1, l1, h2, l2;
op1 = gen_lowpart (V8HImode, operands[1]);
op2 = gen_lowpart (V8HImode, operands[2]);
h1 = gen_reg_rtx (V8HImode);
l1 = gen_reg_rtx (V8HImode);
h2 = gen_reg_rtx (V8HImode);
l2 = gen_reg_rtx (V8HImode);
emit_insn (gen_vec_interleave_highv8hi (h1, op1, op2));
emit_insn (gen_vec_interleave_lowv8hi (l1, op1, op2));
emit_insn (gen_vec_interleave_highv8hi (h2, l1, h1));
@ -3575,7 +3668,7 @@
emit_insn (gen_vec_interleave_lowv8hi (operands[0], l2, h2));
DONE;
})
;; Reduce:
;; op1 = abcd
;; op2 = efgh
@ -3589,12 +3682,12 @@
"TARGET_SSE2"
{
rtx op1, op2, h1, l1;
op1 = gen_lowpart (V4SImode, operands[1]);
op2 = gen_lowpart (V4SImode, operands[2]);
h1 = gen_reg_rtx (V4SImode);
l1 = gen_reg_rtx (V4SImode);
emit_insn (gen_vec_interleave_highv4si (h1, op1, op2));
emit_insn (gen_vec_interleave_lowv4si (l1, op1, op2));
emit_insn (gen_vec_interleave_lowv4si (operands[0], l1, h1));


@ -1983,8 +1983,11 @@ This macro returns the attributes on the type @var{type}.
@tindex VEC_WIDEN_MULT_LO_EXPR
@tindex VEC_UNPACK_HI_EXPR
@tindex VEC_UNPACK_LO_EXPR
@tindex VEC_UNPACK_FLOAT_HI_EXPR
@tindex VEC_UNPACK_FLOAT_LO_EXPR
@tindex VEC_PACK_TRUNC_EXPR
@tindex VEC_PACK_SAT_EXPR
@tindex VEC_PACK_FIX_TRUNC_EXPR
@tindex VEC_EXTRACT_EVEN_EXPR
@tindex VEC_EXTRACT_ODD_EXPR
@tindex VEC_INTERLEAVE_HIGH_EXPR
@ -2846,6 +2849,17 @@ high @code{N/2} elements of the vector are extracted and widened (promoted).
In the case of @code{VEC_UNPACK_LO_EXPR} the low @code{N/2} elements of the
vector are extracted and widened (promoted).
@item VEC_UNPACK_FLOAT_HI_EXPR
@item VEC_UNPACK_FLOAT_LO_EXPR
These nodes represent unpacking of the high and low parts of the input vector,
where the values are converted from fixed point to floating point. The
single operand is a vector that contains @code{N} elements of the same
integral type. The result is a vector that contains half as many elements
of a floating point type whose size is twice as wide. In the case of
@code{VEC_UNPACK_FLOAT_HI_EXPR} the high @code{N/2} elements of the vector
are extracted, converted and widened. In the case of
@code{VEC_UNPACK_FLOAT_LO_EXPR} the low @code{N/2} elements of the vector
are extracted, converted and widened.
@item VEC_PACK_TRUNC_EXPR
This node represents packing of truncated elements of the two input vectors
into the output vector. Input operands are vectors that contain the same
@ -2862,6 +2876,15 @@ vector that contains twice as many elements of an integral type whose size
is half as wide. The elements of the two vectors are demoted and merged
(concatenated) to form the output vector.
@item VEC_PACK_FIX_TRUNC_EXPR
This node represents packing of elements of the two input vectors into the
output vector, where the values are converted from floating point
to fixed point. Input operands are vectors that contain the same number
of elements of a floating point type. The result is a vector that contains
twice as many elements of an integral type whose size is half as wide. The
elements of the two vectors are merged (concatenated) to form the output
vector.
@item VEC_EXTRACT_EVEN_EXPR
@item VEC_EXTRACT_ODD_EXPR
These nodes represent extracting of the even/odd elements of the two input


@ -3607,6 +3607,14 @@ Operand 0 is the resulting vector in which the elements of the two input
vectors are concatenated after narrowing them down using signed/unsigned
saturating arithmetic.
@cindex @code{vec_pack_sfix_trunc_@var{m}} instruction pattern
@cindex @code{vec_pack_ufix_trunc_@var{m}} instruction pattern
@item @samp{vec_pack_sfix_trunc_@var{m}}, @samp{vec_pack_ufix_trunc_@var{m}}
Narrow, convert to signed/unsigned integral type and merge the elements
of two vectors. Operands 1 and 2 are vectors of the same mode having N
floating point elements of size S. Operand 0 is the resulting vector
in which 2*N elements of size S/2 are concatenated.
@cindex @code{vec_unpacks_hi_@var{m}} instruction pattern
@cindex @code{vec_unpacks_lo_@var{m}} instruction pattern
@item @samp{vec_unpacks_hi_@var{m}}, @samp{vec_unpacks_lo_@var{m}}
@ -3624,11 +3632,24 @@ integral elements. The input vector (operand 1) has N elements of size S.
Widen (promote) the high/low elements of the vector using zero extension and
place the resulting N/2 values of size 2*S in the output vector (operand 0).
@cindex @code{vec_unpacks_float_hi_@var{m}} instruction pattern
@cindex @code{vec_unpacks_float_lo_@var{m}} instruction pattern
@cindex @code{vec_unpacku_float_hi_@var{m}} instruction pattern
@cindex @code{vec_unpacku_float_lo_@var{m}} instruction pattern
@item @samp{vec_unpacks_float_hi_@var{m}}, @samp{vec_unpacks_float_lo_@var{m}}
@itemx @samp{vec_unpacku_float_hi_@var{m}}, @samp{vec_unpacku_float_lo_@var{m}}
Extract, convert to floating point type and widen the high/low part of a
vector of signed/unsigned integral elements. The input vector (operand 1)
has N elements of size S. Convert the high/low elements of the vector using
floating point conversion and place the resulting N/2 values of size 2*S in
the output vector (operand 0).
@cindex @code{vec_widen_umult_hi_@var{m}} instruction pattern
@cindex @code{vec_widen_umult_lo_@var{m}} instruction pattern
@cindex @code{vec_widen_smult_hi_@var{m}} instruction pattern
@cindex @code{vec_widen_smult_lo_@var{m}} instruction pattern
@item @samp{vec_widen_umult_hi_@var{m}}, @samp{vec_widen_umult_lo_@var{m}}, @samp{vec_widen_smult_hi_@var{m}}, @samp{vec_widen_smult_lo_@var{m}}
@item @samp{vec_widen_umult_hi_@var{m}}, @samp{vec_widen_umult_lo_@var{m}}
@itemx @samp{vec_widen_smult_hi_@var{m}}, @samp{vec_widen_smult_lo_@var{m}}
Signed/Unsigned widening multiplication. The two inputs (operands 1 and 2)
are vectors with N signed/unsigned elements of size S. Multiply the high/low
elements of the two vectors, and put the N/2 products of size 2*S in the


@ -9001,6 +9001,21 @@ expand_expr_real_1 (tree exp, rtx target, enum machine_mode tmode,
return temp;
}
case VEC_UNPACK_FLOAT_HI_EXPR:
case VEC_UNPACK_FLOAT_LO_EXPR:
{
op0 = expand_normal (TREE_OPERAND (exp, 0));
/* The signedness is determined from input operand. */
this_optab = optab_for_tree_code (code,
TREE_TYPE (TREE_OPERAND (exp, 0)));
temp = expand_widen_pattern_expr
(exp, op0, NULL_RTX, NULL_RTX,
target, TYPE_UNSIGNED (TREE_TYPE (TREE_OPERAND (exp, 0))));
gcc_assert (temp);
return temp;
}
case VEC_WIDEN_MULT_HI_EXPR:
case VEC_WIDEN_MULT_LO_EXPR:
{
@ -9016,6 +9031,7 @@ expand_expr_real_1 (tree exp, rtx target, enum machine_mode tmode,
case VEC_PACK_TRUNC_EXPR:
case VEC_PACK_SAT_EXPR:
case VEC_PACK_FIX_TRUNC_EXPR:
{
mode = TYPE_MODE (TREE_TYPE (TREE_OPERAND (exp, 0)));
goto binop;


@ -233,9 +233,15 @@ static const char * const optabs[] =
"vec_unpacks_lo_optab->handlers[$A].insn_code = CODE_FOR_$(vec_unpacks_lo_$a$)",
"vec_unpacku_hi_optab->handlers[$A].insn_code = CODE_FOR_$(vec_unpacku_hi_$a$)",
"vec_unpacku_lo_optab->handlers[$A].insn_code = CODE_FOR_$(vec_unpacku_lo_$a$)",
"vec_unpacks_float_hi_optab->handlers[$A].insn_code = CODE_FOR_$(vec_unpacks_float_hi_$a$)",
"vec_unpacks_float_lo_optab->handlers[$A].insn_code = CODE_FOR_$(vec_unpacks_float_lo_$a$)",
"vec_unpacku_float_hi_optab->handlers[$A].insn_code = CODE_FOR_$(vec_unpacku_float_hi_$a$)",
"vec_unpacku_float_lo_optab->handlers[$A].insn_code = CODE_FOR_$(vec_unpacku_float_lo_$a$)",
"vec_pack_trunc_optab->handlers[$A].insn_code = CODE_FOR_$(vec_pack_trunc_$a$)",
"vec_pack_ssat_optab->handlers[$A].insn_code = CODE_FOR_$(vec_pack_ssat_$a$)",
"vec_pack_usat_optab->handlers[$A].insn_code = CODE_FOR_$(vec_pack_usat_$a$)"
"vec_pack_usat_optab->handlers[$A].insn_code = CODE_FOR_$(vec_pack_usat_$a$)",
"vec_pack_sfix_trunc_optab->handlers[$A].insn_code = CODE_FOR_$(vec_pack_sfix_trunc_$a$)",
"vec_pack_ufix_trunc_optab->handlers[$A].insn_code = CODE_FOR_$(vec_pack_ufix_trunc_$a$)"
};
static void gen_insn (rtx);


@ -340,12 +340,26 @@ optab_for_tree_code (enum tree_code code, tree type)
return TYPE_UNSIGNED (type) ?
vec_unpacku_lo_optab : vec_unpacks_lo_optab;
case VEC_UNPACK_FLOAT_HI_EXPR:
/* The signedness is determined from input operand. */
return TYPE_UNSIGNED (type) ?
vec_unpacku_float_hi_optab : vec_unpacks_float_hi_optab;
case VEC_UNPACK_FLOAT_LO_EXPR:
/* The signedness is determined from input operand. */
return TYPE_UNSIGNED (type) ?
vec_unpacku_float_lo_optab : vec_unpacks_float_lo_optab;
case VEC_PACK_TRUNC_EXPR:
return vec_pack_trunc_optab;
case VEC_PACK_SAT_EXPR:
return TYPE_UNSIGNED (type) ? vec_pack_usat_optab : vec_pack_ssat_optab;
case VEC_PACK_FIX_TRUNC_EXPR:
return TYPE_UNSIGNED (type) ?
vec_pack_ufix_trunc_optab : vec_pack_sfix_trunc_optab;
default:
break;
}
@ -1375,7 +1389,9 @@ expand_binop (enum machine_mode mode, optab binoptab, rtx op0, rtx op1,
if (binoptab == vec_pack_trunc_optab
|| binoptab == vec_pack_usat_optab
|| binoptab == vec_pack_ssat_optab)
|| binoptab == vec_pack_ssat_optab
|| binoptab == vec_pack_ufix_trunc_optab
|| binoptab == vec_pack_sfix_trunc_optab)
{
/* The mode of the result is different than the mode of the
arguments. */
@ -5565,9 +5581,15 @@ init_optabs (void)
vec_unpacks_lo_optab = init_optab (UNKNOWN);
vec_unpacku_hi_optab = init_optab (UNKNOWN);
vec_unpacku_lo_optab = init_optab (UNKNOWN);
vec_unpacks_float_hi_optab = init_optab (UNKNOWN);
vec_unpacks_float_lo_optab = init_optab (UNKNOWN);
vec_unpacku_float_hi_optab = init_optab (UNKNOWN);
vec_unpacku_float_lo_optab = init_optab (UNKNOWN);
vec_pack_trunc_optab = init_optab (UNKNOWN);
vec_pack_usat_optab = init_optab (UNKNOWN);
vec_pack_ssat_optab = init_optab (UNKNOWN);
vec_pack_ufix_trunc_optab = init_optab (UNKNOWN);
vec_pack_sfix_trunc_optab = init_optab (UNKNOWN);
powi_optab = init_optab (UNKNOWN);


@ -298,11 +298,24 @@ enum optab_index
elements. */
OTI_vec_unpacku_hi,
OTI_vec_unpacku_lo,
/* Extract, convert to floating point and widen the high/low part of
a vector of signed or unsigned integer elements. */
OTI_vec_unpacks_float_hi,
OTI_vec_unpacks_float_lo,
OTI_vec_unpacku_float_hi,
OTI_vec_unpacku_float_lo,
/* Narrow (demote) and merge the elements of two vectors. */
OTI_vec_pack_trunc,
OTI_vec_pack_usat,
OTI_vec_pack_ssat,
/* Convert to signed/unsigned integer, narrow and merge elements
of two vectors of floating point elements. */
OTI_vec_pack_sfix_trunc,
OTI_vec_pack_ufix_trunc,
/* Perform a raise to the power of integer. */
OTI_powi,
@ -446,9 +459,15 @@ extern GTY(()) optab optab_table[OTI_MAX];
#define vec_unpacks_lo_optab (optab_table[OTI_vec_unpacks_lo])
#define vec_unpacku_hi_optab (optab_table[OTI_vec_unpacku_hi])
#define vec_unpacku_lo_optab (optab_table[OTI_vec_unpacku_lo])
#define vec_unpacks_float_hi_optab (optab_table[OTI_vec_unpacks_float_hi])
#define vec_unpacks_float_lo_optab (optab_table[OTI_vec_unpacks_float_lo])
#define vec_unpacku_float_hi_optab (optab_table[OTI_vec_unpacku_float_hi])
#define vec_unpacku_float_lo_optab (optab_table[OTI_vec_unpacku_float_lo])
#define vec_pack_trunc_optab (optab_table[OTI_vec_pack_trunc])
#define vec_pack_ssat_optab (optab_table[OTI_vec_pack_ssat])
#define vec_pack_usat_optab (optab_table[OTI_vec_pack_usat])
#define vec_pack_sfix_trunc_optab (optab_table[OTI_vec_pack_sfix_trunc])
#define vec_pack_ufix_trunc_optab (optab_table[OTI_vec_pack_ufix_trunc])
#define powi_optab (optab_table[OTI_powi])


@ -1,3 +1,16 @@
2007-05-17 Uros Bizjak <ubizjak@gmail.com>
PR tree-optimization/24659
* gcc.dg/vect/vect-floatint-conversion-2.c: New test.
* gcc.dg/vect/vect-intfloat-conversion-1.c: Require vect_float,
not vect_int target.
* gcc.dg/vect/vect-intfloat-conversion-2.c: Require vect_float,
not vect_int target. Loop is vectorized for vect_intfloat_cvt
targets.
* gcc.dg/vect/vect-intfloat-conversion-3.c: New test.
* gcc.dg/vect/vect-intfloat-conversion-4a.c: New test.
* gcc.dg/vect/vect-intfloat-conversion-4b.c: New test.
2007-05-16 Uros Bizjak <ubizjak@gmail.com>
* gcc.dg/torture/fp-int-convert-float128.c: Do not xfail for i?86-*-*
@ -746,7 +759,7 @@
* g++.dg/expr/bitfield8.C: New test.
2007-04-17 Joseph Myers <joseph@codesourcery.com>
Richard Sandiford <richard@codesourcery.com>
Richard Sandiford <richard@codesourcery.com>
* lib/target-supports.exp (check_profiling_available): Return 0
for uClibc with -p or -pg.


@ -0,0 +1,40 @@
/* { dg-require-effective-target vect_double } */
#include <stdarg.h>
#include "tree-vect.h"
#define N 32
int
main1 ()
{
int i;
double db[N] = {0.4,3.5,6.6,9.4,12.5,15.6,18.4,21.5,24.6,27.4,30.5,33.6,36.4,39.5,42.6,45.4,0.5,3.6,6.4,9.5,12.6,15.4,18.5,21.6,24.4,27.5,30.6,33.4,36.5,39.6,42.4,45.5};
int ia[N];
/* double -> int */
for (i = 0; i < N; i++)
{
ia[i] = (int) db[i];
}
/* check results: */
for (i = 0; i < N; i++)
{
if (ia[i] != (int) db[i])
abort ();
}
return 0;
}
int
main (void)
{
check_vect ();
return main1 ();
}
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vect_floatint_cvt } } } */
/* { dg-final { cleanup-tree-dump "vect" } } */


@ -1,4 +1,4 @@
/* { dg-require-effective-target vect_int } */
/* { dg-require-effective-target vect_float } */
#include <stdarg.h>
#include "tree-vect.h"


@ -1,4 +1,4 @@
/* { dg-require-effective-target vect_int } */
/* { dg-require-effective-target vect_float } */
#include <stdarg.h>
#include "tree-vect.h"
@ -36,5 +36,5 @@ int main (void)
return main1 ();
}
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target powerpc*-*-* i?86-*-* x86_64-*-* } } } */
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vect_intfloat_cvt } } } */
/* { dg-final { cleanup-tree-dump "vect" } } */


@ -0,0 +1,38 @@
/* { dg-require-effective-target vect_double } */
#include <stdarg.h>
#include "tree-vect.h"
#define N 32
int main1 ()
{
int i;
int ib[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45,0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45};
double da[N];
/* int -> double */
for (i = 0; i < N; i++)
{
da[i] = (double) ib[i];
}
/* check results: */
for (i = 0; i < N; i++)
{
if (da[i] != (double) ib[i])
abort ();
}
return 0;
}
int main (void)
{
check_vect ();
return main1 ();
}
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vect_intfloat_cvt } } } */
/* { dg-final { cleanup-tree-dump "vect" } } */


@ -0,0 +1,38 @@
/* { dg-require-effective-target vect_float } */
#include <stdarg.h>
#include "tree-vect.h"
#define N 32
int main1 ()
{
int i;
short sb[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45,0,-3,-6,-9,-12,-15,-18,-21,-24,-27,-30,-33,-36,-39,-42,-45};
float fa[N];
/* short -> float */
for (i = 0; i < N; i++)
{
fa[i] = (float) sb[i];
}
/* check results: */
for (i = 0; i < N; i++)
{
if (fa[i] != (float) sb[i])
abort ();
}
return 0;
}
int main (void)
{
check_vect ();
return main1 ();
}
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vect_intfloat_cvt } } } */
/* { dg-final { cleanup-tree-dump "vect" } } */


@ -0,0 +1,38 @@
/* { dg-require-effective-target vect_float } */
#include <stdarg.h>
#include "tree-vect.h"
#define N 32
int main1 ()
{
int i;
unsigned short usb[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45,0,65533,65530,65527,65524,65521,65518,65515,65512,65509,65506,65503,65500,65497,65494,65491};
float fa[N];
/* unsigned short -> float */
for (i = 0; i < N; i++)
{
fa[i] = (float) usb[i];
}
/* check results: */
for (i = 0; i < N; i++)
{
if (fa[i] != (float) usb[i])
abort ();
}
return 0;
}
int main (void)
{
check_vect ();
return main1 ();
}
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vect_intfloat_cvt } } } */
/* { dg-final { cleanup-tree-dump "vect" } } */


@ -2148,8 +2148,11 @@ estimate_num_insns_1 (tree *tp, int *walk_subtrees, void *data)
case VEC_WIDEN_MULT_LO_EXPR:
case VEC_UNPACK_HI_EXPR:
case VEC_UNPACK_LO_EXPR:
case VEC_UNPACK_FLOAT_HI_EXPR:
case VEC_UNPACK_FLOAT_LO_EXPR:
case VEC_PACK_TRUNC_EXPR:
case VEC_PACK_SAT_EXPR:
case VEC_PACK_FIX_TRUNC_EXPR:
case WIDEN_MULT_EXPR:


@ -1943,6 +1943,18 @@ dump_generic_node (pretty_printer *buffer, tree node, int spc, int flags,
pp_string (buffer, " > ");
break;
case VEC_UNPACK_FLOAT_HI_EXPR:
pp_string (buffer, " VEC_UNPACK_FLOAT_HI_EXPR < ");
dump_generic_node (buffer, TREE_OPERAND (node, 0), spc, flags, false);
pp_string (buffer, " > ");
break;
case VEC_UNPACK_FLOAT_LO_EXPR:
pp_string (buffer, " VEC_UNPACK_FLOAT_LO_EXPR < ");
dump_generic_node (buffer, TREE_OPERAND (node, 0), spc, flags, false);
pp_string (buffer, " > ");
break;
case VEC_PACK_TRUNC_EXPR:
pp_string (buffer, " VEC_PACK_TRUNC_EXPR < ");
dump_generic_node (buffer, TREE_OPERAND (node, 0), spc, flags, false);
@ -1950,7 +1962,7 @@ dump_generic_node (pretty_printer *buffer, tree node, int spc, int flags,
dump_generic_node (buffer, TREE_OPERAND (node, 1), spc, flags, false);
pp_string (buffer, " > ");
break;
case VEC_PACK_SAT_EXPR:
pp_string (buffer, " VEC_PACK_SAT_EXPR < ");
dump_generic_node (buffer, TREE_OPERAND (node, 0), spc, flags, false);
@ -1958,7 +1970,15 @@ dump_generic_node (pretty_printer *buffer, tree node, int spc, int flags,
dump_generic_node (buffer, TREE_OPERAND (node, 1), spc, flags, false);
pp_string (buffer, " > ");
break;
case VEC_PACK_FIX_TRUNC_EXPR:
pp_string (buffer, " VEC_PACK_FIX_TRUNC_EXPR < ");
dump_generic_node (buffer, TREE_OPERAND (node, 0), spc, flags, false);
pp_string (buffer, ", ");
dump_generic_node (buffer, TREE_OPERAND (node, 1), spc, flags, false);
pp_string (buffer, " > ");
break;
case BLOCK:
{
tree t;
@ -2352,6 +2372,8 @@ op_prio (tree op)
case VEC_RSHIFT_EXPR:
case VEC_UNPACK_HI_EXPR:
case VEC_UNPACK_LO_EXPR:
case VEC_UNPACK_FLOAT_HI_EXPR:
case VEC_UNPACK_FLOAT_LO_EXPR:
case VEC_PACK_TRUNC_EXPR:
case VEC_PACK_SAT_EXPR:
return 16;


@ -421,8 +421,11 @@ expand_vector_operations_1 (block_stmt_iterator *bsi)
|| code == VEC_WIDEN_MULT_LO_EXPR
|| code == VEC_UNPACK_HI_EXPR
|| code == VEC_UNPACK_LO_EXPR
|| code == VEC_UNPACK_FLOAT_HI_EXPR
|| code == VEC_UNPACK_FLOAT_LO_EXPR
|| code == VEC_PACK_TRUNC_EXPR
|| code == VEC_PACK_SAT_EXPR)
|| code == VEC_PACK_SAT_EXPR
|| code == VEC_PACK_FIX_TRUNC_EXPR)
type = TREE_TYPE (TREE_OPERAND (rhs, 0));
/* Optabs will try converting a negation into a subtraction, so


@ -210,7 +210,7 @@ vect_create_addr_base_for_vector_ref (tree stmt,
accessed in the loop by STMT, along with the def-use update chain to
appropriately advance the pointer through the loop iterations. Also set
aliasing information for the pointer. This vector pointer is used by the
callers to this function to create a memory reference expression for vector
callers to this function to create a memory reference expression for vector
load/store access.
Input:
@ -1931,6 +1931,64 @@ vectorizable_call (tree stmt, block_stmt_iterator *bsi, tree *vec_stmt)
}
/* Function vect_gen_widened_results_half
Create a vector stmt whose code, type, number of arguments, and result
variable are CODE, VECTYPE, OP_TYPE, and VEC_DEST, and its arguments are
VEC_OPRND0 and VEC_OPRND1. The new vector stmt is to be inserted at BSI.
In the case that CODE is a CALL_EXPR, this means that a call to DECL
needs to be created (DECL is a function-decl of a target-builtin).
STMT is the original scalar stmt that we are vectorizing. */
static tree
vect_gen_widened_results_half (enum tree_code code, tree vectype, tree decl,
tree vec_oprnd0, tree vec_oprnd1, int op_type,
tree vec_dest, block_stmt_iterator *bsi,
tree stmt)
{
tree expr;
tree new_stmt;
tree new_temp;
tree sym;
ssa_op_iter iter;
/* Generate half of the widened result: */
if (code == CALL_EXPR)
{
/* Target specific support */
if (op_type == binary_op)
expr = build_call_expr (decl, 2, vec_oprnd0, vec_oprnd1);
else
expr = build_call_expr (decl, 1, vec_oprnd0);
}
else
{
/* Generic support */
gcc_assert (op_type == TREE_CODE_LENGTH (code));
if (op_type == binary_op)
expr = build2 (code, vectype, vec_oprnd0, vec_oprnd1);
else
expr = build1 (code, vectype, vec_oprnd0);
}
new_stmt = build_gimple_modify_stmt (vec_dest, expr);
new_temp = make_ssa_name (vec_dest, new_stmt);
GIMPLE_STMT_OPERAND (new_stmt, 0) = new_temp;
vect_finish_stmt_generation (stmt, new_stmt, bsi);
if (code == CALL_EXPR)
{
FOR_EACH_SSA_TREE_OPERAND (sym, new_stmt, iter, SSA_OP_ALL_VIRTUALS)
{
if (TREE_CODE (sym) == SSA_NAME)
sym = SSA_NAME_VAR (sym);
mark_sym_for_renaming (sym);
}
}
return new_stmt;
}
/* Function vectorizable_conversion.
Check if STMT performs a conversion operation, that can be vectorized.
@ -1946,21 +2004,24 @@ vectorizable_conversion (tree stmt, block_stmt_iterator * bsi,
tree scalar_dest;
tree operation;
tree op0;
tree vec_oprnd0 = NULL_TREE;
tree vec_oprnd0 = NULL_TREE, vec_oprnd1 = NULL_TREE;
stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
enum tree_code code;
enum tree_code code, code1 = ERROR_MARK, code2 = ERROR_MARK;
tree decl1 = NULL_TREE, decl2 = NULL_TREE;
tree new_temp;
tree def, def_stmt;
enum vect_def_type dt0;
tree new_stmt;
stmt_vec_info prev_stmt_info;
int nunits_in;
int nunits_out;
int ncopies, j;
tree vectype_out, vectype_in;
int ncopies, j;
tree expr;
tree rhs_type, lhs_type;
tree builtin_decl;
stmt_vec_info prev_stmt_info;
enum { NARROW, NONE, WIDEN } modifier;
/* Is STMT a vectorizable conversion? */
@ -1998,23 +2059,36 @@ vectorizable_conversion (tree stmt, block_stmt_iterator * bsi,
scalar_dest = GIMPLE_STMT_OPERAND (stmt, 0);
lhs_type = TREE_TYPE (scalar_dest);
vectype_out = get_vectype_for_scalar_type (lhs_type);
gcc_assert (STMT_VINFO_VECTYPE (stmt_info) == vectype_out);
nunits_out = TYPE_VECTOR_SUBPARTS (vectype_out);
/* FORNOW: need to extend to support short<->float conversions as well. */
if (nunits_out != nunits_in)
/* FORNOW */
if (nunits_in == nunits_out / 2)
modifier = NARROW;
else if (nunits_out == nunits_in)
modifier = NONE;
else if (nunits_out == nunits_in / 2)
modifier = WIDEN;
else
return false;
if (modifier == NONE)
gcc_assert (STMT_VINFO_VECTYPE (stmt_info) == vectype_out);
/* Bail out if the types are both integral or non-integral */
if ((INTEGRAL_TYPE_P (rhs_type) && INTEGRAL_TYPE_P (lhs_type))
|| (!INTEGRAL_TYPE_P (rhs_type) && !INTEGRAL_TYPE_P (lhs_type)))
return false;
if (modifier == NARROW)
ncopies = LOOP_VINFO_VECT_FACTOR (loop_vinfo) / nunits_out;
else
ncopies = LOOP_VINFO_VECT_FACTOR (loop_vinfo) / nunits_in;
/* Sanity check: make sure that at least one copy of the vectorized stmt
needs to be generated. */
ncopies = LOOP_VINFO_VECT_FACTOR (loop_vinfo) / nunits_in;
gcc_assert (ncopies >= 1);
/* Check the operands of the operation. */
if (!vect_is_simple_use (op0, loop_vinfo, &def_stmt, &def, &dt0))
{
if (vect_print_dump_info (REPORT_DETAILS))
@ -2023,21 +2097,31 @@ vectorizable_conversion (tree stmt, block_stmt_iterator * bsi,
}
/* Supportable by target? */
if (!targetm.vectorize.builtin_conversion (code, vectype_in))
if ((modifier == NONE
&& !targetm.vectorize.builtin_conversion (code, vectype_in))
|| (modifier == WIDEN
&& !supportable_widening_operation (code, stmt, vectype_in,
&decl1, &decl2,
&code1, &code2))
|| (modifier == NARROW
&& !supportable_narrowing_operation (code, stmt, vectype_in,
&code1)))
{
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "op not supported by target.");
return false;
}
if (modifier != NONE)
STMT_VINFO_VECTYPE (stmt_info) = vectype_in;
if (!vec_stmt) /* transformation not required. */
{
STMT_VINFO_TYPE (stmt_info) = type_conversion_vec_info_type;
return true;
}
/** Transform. **/
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "transform conversion.");
@ -2045,37 +2129,113 @@ vectorizable_conversion (tree stmt, block_stmt_iterator * bsi,
vec_dest = vect_create_destination_var (scalar_dest, vectype_out);
prev_stmt_info = NULL;
for (j = 0; j < ncopies; j++)
switch (modifier)
{
tree sym;
ssa_op_iter iter;
case NONE:
for (j = 0; j < ncopies; j++)
{
tree sym;
ssa_op_iter iter;
if (j == 0)
vec_oprnd0 = vect_get_vec_def_for_operand (op0, stmt, NULL);
else
vec_oprnd0 = vect_get_vec_def_for_stmt_copy (dt0, vec_oprnd0);
if (j == 0)
vec_oprnd0 = vect_get_vec_def_for_operand (op0, stmt, NULL);
else
vec_oprnd0 = vect_get_vec_def_for_stmt_copy (dt0, vec_oprnd0);
builtin_decl =
targetm.vectorize.builtin_conversion (code, vectype_in);
new_stmt = build_call_expr (builtin_decl, 1, vec_oprnd0);
builtin_decl =
targetm.vectorize.builtin_conversion (code, vectype_in);
new_stmt = build_call_expr (builtin_decl, 1, vec_oprnd0);
/* Arguments are ready. create the new vector stmt. */
new_stmt = build_gimple_modify_stmt (vec_dest, new_stmt);
new_temp = make_ssa_name (vec_dest, new_stmt);
GIMPLE_STMT_OPERAND (new_stmt, 0) = new_temp;
vect_finish_stmt_generation (stmt, new_stmt, bsi);
FOR_EACH_SSA_TREE_OPERAND (sym, new_stmt, iter, SSA_OP_ALL_VIRTUALS)
{
if (TREE_CODE (sym) == SSA_NAME)
sym = SSA_NAME_VAR (sym);
mark_sym_for_renaming (sym);
}
/* Arguments are ready. create the new vector stmt. */
new_stmt = build_gimple_modify_stmt (vec_dest, new_stmt);
new_temp = make_ssa_name (vec_dest, new_stmt);
GIMPLE_STMT_OPERAND (new_stmt, 0) = new_temp;
vect_finish_stmt_generation (stmt, new_stmt, bsi);
FOR_EACH_SSA_TREE_OPERAND (sym, new_stmt, iter, SSA_OP_ALL_VIRTUALS)
{
if (TREE_CODE (sym) == SSA_NAME)
sym = SSA_NAME_VAR (sym);
mark_sym_for_renaming (sym);
}
if (j == 0)
STMT_VINFO_VEC_STMT (stmt_info) = *vec_stmt = new_stmt;
else
STMT_VINFO_RELATED_STMT (prev_stmt_info) = new_stmt;
prev_stmt_info = vinfo_for_stmt (new_stmt);
if (j == 0)
STMT_VINFO_VEC_STMT (stmt_info) = *vec_stmt = new_stmt;
else
STMT_VINFO_RELATED_STMT (prev_stmt_info) = new_stmt;
prev_stmt_info = vinfo_for_stmt (new_stmt);
}
break;
case WIDEN:
/* In case the vectorization factor (VF) is bigger than the number
of elements that we can fit in a vectype (nunits), we have to
generate more than one vector stmt - i.e - we need to "unroll"
the vector stmt by a factor VF/nunits. */
for (j = 0; j < ncopies; j++)
{
if (j == 0)
vec_oprnd0 = vect_get_vec_def_for_operand (op0, stmt, NULL);
else
vec_oprnd0 = vect_get_vec_def_for_stmt_copy (dt0, vec_oprnd0);
STMT_VINFO_VECTYPE (stmt_info) = vectype_in;
/* Generate first half of the widened result: */
new_stmt
= vect_gen_widened_results_half (code1, vectype_out, decl1,
vec_oprnd0, vec_oprnd1,
unary_op, vec_dest, bsi, stmt);
if (j == 0)
STMT_VINFO_VEC_STMT (stmt_info) = new_stmt;
else
STMT_VINFO_RELATED_STMT (prev_stmt_info) = new_stmt;
prev_stmt_info = vinfo_for_stmt (new_stmt);
/* Generate second half of the widened result: */
new_stmt
= vect_gen_widened_results_half (code2, vectype_out, decl2,
vec_oprnd0, vec_oprnd1,
unary_op, vec_dest, bsi, stmt);
STMT_VINFO_RELATED_STMT (prev_stmt_info) = new_stmt;
prev_stmt_info = vinfo_for_stmt (new_stmt);
}
break;
case NARROW:
/* In case the vectorization factor (VF) is bigger than the number
of elements that we can fit in a vectype (nunits), we have to
generate more than one vector stmt - i.e - we need to "unroll"
the vector stmt by a factor VF/nunits. */
for (j = 0; j < ncopies; j++)
{
/* Handle uses. */
if (j == 0)
{
vec_oprnd0 = vect_get_vec_def_for_operand (op0, stmt, NULL);
vec_oprnd1 = vect_get_vec_def_for_stmt_copy (dt0, vec_oprnd0);
}
else
{
vec_oprnd0 = vect_get_vec_def_for_stmt_copy (dt0, vec_oprnd1);
vec_oprnd1 = vect_get_vec_def_for_stmt_copy (dt0, vec_oprnd0);
}
/* Arguments are ready. Create the new vector stmt. */
expr = build2 (code1, vectype_out, vec_oprnd0, vec_oprnd1);
new_stmt = build_gimple_modify_stmt (vec_dest, expr);
new_temp = make_ssa_name (vec_dest, new_stmt);
GIMPLE_STMT_OPERAND (new_stmt, 0) = new_temp;
vect_finish_stmt_generation (stmt, new_stmt, bsi);
if (j == 0)
STMT_VINFO_VEC_STMT (stmt_info) = new_stmt;
else
STMT_VINFO_RELATED_STMT (prev_stmt_info) = new_stmt;
prev_stmt_info = vinfo_for_stmt (new_stmt);
}
*vec_stmt = STMT_VINFO_VEC_STMT (stmt_info);
}
return true;
}
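The modifier selection and copy count in vectorizable_conversion above can be modelled outside the compiler. The following is only a sketch: `conv_modifier`, `classify_conversion` and `conversion_ncopies` are hypothetical names that do not exist in GCC, and the element counts are assumed to be powers of two, as they are for real vector types.

```c
#include <assert.h>

/* Hypothetical stand-in for the function-local enum in the patch,
   plus an extra value for the "return false" case.  */
enum conv_modifier { CONV_NARROW, CONV_NONE, CONV_WIDEN, CONV_UNSUPPORTED };

/* Classify a conversion by the element counts of its input and output
   vector types, in the same order the patch tests them.  */
static enum conv_modifier
classify_conversion (int nunits_in, int nunits_out)
{
  assert (nunits_in > 0 && nunits_out > 0);
  if (nunits_in == nunits_out / 2)
    return CONV_NARROW;   /* two input vectors feed one output vector */
  if (nunits_out == nunits_in)
    return CONV_NONE;     /* one-to-one, handled by a target builtin */
  if (nunits_out == nunits_in / 2)
    return CONV_WIDEN;    /* one input vector yields two output vectors */
  return CONV_UNSUPPORTED;
}

/* Number of copies of the vector stmt for vectorization factor VF:
   divided by the output element count when narrowing, by the input
   element count otherwise.  */
static int
conversion_ncopies (enum conv_modifier modifier, int vf,
                    int nunits_in, int nunits_out)
{
  int ncopies = (modifier == CONV_NARROW) ? vf / nunits_out : vf / nunits_in;
  assert (ncopies >= 1);
  return ncopies;
}
```

Under these assumptions a short-to-float conversion with 8 input and 4 output elements is classified as CONV_WIDEN, and a double-to-int conversion with 2 input and 4 output elements as CONV_NARROW.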
@ -2525,7 +2685,7 @@ vectorizable_operation (tree stmt, block_stmt_iterator *bsi, tree *vec_stmt)
bool
vectorizable_type_demotion (tree stmt, block_stmt_iterator *bsi,
tree *vec_stmt)
{
tree vec_dest;
tree scalar_dest;
@ -2534,7 +2694,7 @@ vectorizable_type_demotion (tree stmt, block_stmt_iterator *bsi,
tree vec_oprnd0=NULL, vec_oprnd1=NULL;
stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
enum tree_code code;
enum tree_code code, code1 = ERROR_MARK;
tree new_temp;
tree def, def_stmt;
enum vect_def_type dt0;
@ -2548,8 +2708,6 @@ vectorizable_type_demotion (tree stmt, block_stmt_iterator *bsi,
tree expr;
tree vectype_in;
tree scalar_type;
optab optab;
enum machine_mode vec_mode;
if (!STMT_VINFO_RELEVANT_P (stmt_info))
return false;
@ -2607,13 +2765,7 @@ vectorizable_type_demotion (tree stmt, block_stmt_iterator *bsi,
}
/* Supportable by target? */
code = VEC_PACK_TRUNC_EXPR;
optab = optab_for_tree_code (code, vectype_in);
if (!optab)
return false;
vec_mode = TYPE_MODE (vectype_in);
if (optab->handlers[(int) vec_mode].insn_code == CODE_FOR_nothing)
if (!supportable_narrowing_operation (code, stmt, vectype_in, &code1))
return false;
STMT_VINFO_VECTYPE (stmt_info) = vectype_in;
@ -2652,7 +2804,7 @@ vectorizable_type_demotion (tree stmt, block_stmt_iterator *bsi,
}
/* Arguments are ready. Create the new vector stmt. */
expr = build2 (code, vectype_out, vec_oprnd0, vec_oprnd1);
expr = build2 (code1, vectype_out, vec_oprnd0, vec_oprnd1);
new_stmt = build_gimple_modify_stmt (vec_dest, expr);
new_temp = make_ssa_name (vec_dest, new_stmt);
GIMPLE_STMT_OPERAND (new_stmt, 0) = new_temp;
@ -2671,64 +2823,6 @@ vectorizable_type_demotion (tree stmt, block_stmt_iterator *bsi,
}
/* Function vect_gen_widened_results_half
Create a vector stmt whose code, type, number of arguments, and result
variable are CODE, VECTYPE, OP_TYPE, and VEC_DEST, and its arguments are
VEC_OPRND0 and VEC_OPRND1. The new vector stmt is to be inserted at BSI.
In the case that CODE is a CALL_EXPR, this means that a call to DECL
needs to be created (DECL is a function-decl of a target-builtin).
STMT is the original scalar stmt that we are vectorizing. */
static tree
vect_gen_widened_results_half (enum tree_code code, tree vectype, tree decl,
tree vec_oprnd0, tree vec_oprnd1, int op_type,
tree vec_dest, block_stmt_iterator *bsi,
tree stmt)
{
tree expr;
tree new_stmt;
tree new_temp;
tree sym;
ssa_op_iter iter;
/* Generate half of the widened result: */
if (code == CALL_EXPR)
{
/* Target specific support */
if (op_type == binary_op)
expr = build_call_expr (decl, 2, vec_oprnd0, vec_oprnd1);
else
expr = build_call_expr (decl, 1, vec_oprnd0);
}
else
{
/* Generic support */
gcc_assert (op_type == TREE_CODE_LENGTH (code));
if (op_type == binary_op)
expr = build2 (code, vectype, vec_oprnd0, vec_oprnd1);
else
expr = build1 (code, vectype, vec_oprnd0);
}
new_stmt = build_gimple_modify_stmt (vec_dest, expr);
new_temp = make_ssa_name (vec_dest, new_stmt);
GIMPLE_STMT_OPERAND (new_stmt, 0) = new_temp;
vect_finish_stmt_generation (stmt, new_stmt, bsi);
if (code == CALL_EXPR)
{
FOR_EACH_SSA_TREE_OPERAND (sym, new_stmt, iter, SSA_OP_ALL_VIRTUALS)
{
if (TREE_CODE (sym) == SSA_NAME)
sym = SSA_NAME_VAR (sym);
mark_sym_for_renaming (sym);
}
}
return new_stmt;
}
/* Function vectorizable_type_promotion
Check if STMT performs a binary or unary operation that involves
@ -2785,7 +2879,8 @@ vectorizable_type_promotion (tree stmt, block_stmt_iterator *bsi,
operation = GIMPLE_STMT_OPERAND (stmt, 1);
code = TREE_CODE (operation);
if (code != NOP_EXPR && code != WIDEN_MULT_EXPR)
if (code != NOP_EXPR && code != CONVERT_EXPR
&& code != WIDEN_MULT_EXPR)
return false;
op0 = TREE_OPERAND (operation, 0);


@ -1736,10 +1736,10 @@ vect_is_simple_use (tree operand, loop_vec_info loop_vinfo, tree *def_stmt,
widening operation that is supported by the target platform in
vector form (i.e., when operating on arguments of type VECTYPE).
The two kinds of widening operations we currently support are
NOP and WIDEN_MULT. This function checks if these operations
are supported by the target platform either directly (via vector
tree-codes), or via target builtins.
Widening operations we currently support are NOP (CONVERT), FLOAT
and WIDEN_MULT. This function checks if these operations are supported
by the target platform either directly (via vector tree-codes), or via
target builtins.
Output:
- CODE1 and CODE2 are codes of vector operations to be used when
@ -1815,6 +1815,7 @@ supportable_widening_operation (enum tree_code code, tree stmt, tree vectype,
break;
case NOP_EXPR:
case CONVERT_EXPR:
if (BYTES_BIG_ENDIAN)
{
c1 = VEC_UNPACK_HI_EXPR;
@ -1827,6 +1828,19 @@ supportable_widening_operation (enum tree_code code, tree stmt, tree vectype,
}
break;
case FLOAT_EXPR:
if (BYTES_BIG_ENDIAN)
{
c1 = VEC_UNPACK_FLOAT_HI_EXPR;
c2 = VEC_UNPACK_FLOAT_LO_EXPR;
}
else
{
c2 = VEC_UNPACK_FLOAT_HI_EXPR;
c1 = VEC_UNPACK_FLOAT_LO_EXPR;
}
break;
default:
gcc_unreachable ();
}
@ -1851,6 +1865,63 @@ supportable_widening_operation (enum tree_code code, tree stmt, tree vectype,
}
/* Function supportable_narrowing_operation
Check whether an operation represented by the code CODE is a
narrowing operation that is supported by the target platform in
vector form (i.e., when operating on arguments of type VECTYPE).
Narrowing operations we currently support are NOP (CONVERT) and
FIX_TRUNC. This function checks if these operations are supported by
the target platform directly via vector tree-codes.
Output:
- CODE1 is the code of a vector operation to be used when
vectorizing the operation, if available. */
bool
supportable_narrowing_operation (enum tree_code code,
tree stmt, tree vectype,
enum tree_code *code1)
{
enum machine_mode vec_mode;
enum insn_code icode1;
optab optab1;
tree expr = GIMPLE_STMT_OPERAND (stmt, 1);
tree type = TREE_TYPE (expr);
tree narrow_vectype = get_vectype_for_scalar_type (type);
enum tree_code c1;
switch (code)
{
case NOP_EXPR:
case CONVERT_EXPR:
c1 = VEC_PACK_TRUNC_EXPR;
break;
case FIX_TRUNC_EXPR:
c1 = VEC_PACK_FIX_TRUNC_EXPR;
break;
default:
gcc_unreachable ();
}
*code1 = c1;
optab1 = optab_for_tree_code (c1, vectype);
if (!optab1)
return false;
vec_mode = TYPE_MODE (vectype);
if ((icode1 = optab1->handlers[(int) vec_mode].insn_code) == CODE_FOR_nothing
|| insn_data[icode1].operand[0].mode != TYPE_MODE (narrow_vectype))
return false;
return true;
}
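The shape of the check in supportable_narrowing_operation (map the scalar code to a pack code, then confirm both that a handler exists for the input mode and that its result mode is the expected narrow vector mode) can be illustrated with a toy table. Everything here is an assumption made for illustration: `op_code`, `vmode`, `mock_handlers` and `narrowing_supported` are hypothetical and do not reflect the real optab data structures or any actual target.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical miniature of the scalar and vector operation codes.  */
enum op_code { OP_NOP, OP_CONVERT, OP_FIX_TRUNC,
               OP_PACK_TRUNC, OP_PACK_FIX_TRUNC, OP_NONE };

/* Hypothetical vector modes.  */
enum vmode { MODE_V2DF, MODE_V4SI, MODE_V8HI, MODE_COUNT };

/* One entry per (pack operation, input mode).  */
struct handler { bool available; enum vmode result_mode; };

/* Made-up target: row 0 is pack-trunc (V4SI -> V8HI),
   row 1 is pack-fix-trunc (V2DF -> V4SI).  Missing entries are
   zero-initialized, i.e. not available.  */
static const struct handler mock_handlers[2][MODE_COUNT] = {
  [0] = { [MODE_V4SI] = { true, MODE_V8HI } },
  [1] = { [MODE_V2DF] = { true, MODE_V4SI } },
};

/* Return true if CODE can be narrowed from IN_MODE to NARROW_MODE;
   store the vector operation to use in *CODE1.  */
static bool
narrowing_supported (enum op_code code, enum vmode in_mode,
                     enum vmode narrow_mode, enum op_code *code1)
{
  int row;
  switch (code)
    {
    case OP_NOP:
    case OP_CONVERT:
      *code1 = OP_PACK_TRUNC;
      row = 0;
      break;
    case OP_FIX_TRUNC:
      *code1 = OP_PACK_FIX_TRUNC;
      row = 1;
      break;
    default:
      assert (!"unexpected scalar code");
      return false;
    }

  const struct handler *h = &mock_handlers[row][in_mode];
  /* A handler must exist and must produce the expected narrow mode.  */
  return h->available && h->result_mode == narrow_mode;
}
```

The second half of the condition is the part the old inline check in vectorizable_type_demotion did not have: a handler for the input mode is not sufficient if it produces a different result mode.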
/* Function reduction_code_for_scalar_code
Input:


@ -398,6 +398,9 @@ extern enum dr_alignment_support vect_supportable_dr_alignment
extern bool reduction_code_for_scalar_code (enum tree_code, enum tree_code *);
extern bool supportable_widening_operation (enum tree_code, tree, tree,
tree *, tree *, enum tree_code *, enum tree_code *);
extern bool supportable_narrowing_operation (enum tree_code, tree, tree,
enum tree_code *);
/* Creation and deletion of loop and stmt info structs. */
extern loop_vec_info new_loop_vec_info (struct loop *loop);
extern void destroy_loop_vec_info (loop_vec_info);


@ -1085,13 +1085,20 @@ DEFTREECODE (GIMPLE_MODIFY_STMT, "gimple_modify_stmt", tcc_gimple_stmt, 2)
DEFTREECODE (VEC_WIDEN_MULT_HI_EXPR, "widen_mult_hi_expr", tcc_binary, 2)
DEFTREECODE (VEC_WIDEN_MULT_LO_EXPR, "widen_mult_hi_expr", tcc_binary, 2)
/* Unpack (extract and promote/widen) the high/low elements of the input vector
into the output vector. The input vector has twice as many elements
as the output vector, that are half the size of the elements
/* Unpack (extract and promote/widen) the high/low elements of the input
vector into the output vector. The input vector has twice as many
elements as the output vector, that are half the size of the elements
of the output vector. This is used to support type promotion. */
DEFTREECODE (VEC_UNPACK_HI_EXPR, "vec_unpack_hi_expr", tcc_unary, 1)
DEFTREECODE (VEC_UNPACK_LO_EXPR, "vec_unpack_lo_expr", tcc_unary, 1)
/* Unpack (extract) the high/low elements of the input vector, convert
fixed point values to floating point and widen elements into the
output vector. The input vector has twice as many elements as the output
vector, that are half the size of the elements of the output vector. */
DEFTREECODE (VEC_UNPACK_FLOAT_HI_EXPR, "vec_unpack_float_hi_expr", tcc_unary, 1)
DEFTREECODE (VEC_UNPACK_FLOAT_LO_EXPR, "vec_unpack_float_lo_expr", tcc_unary, 1)
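As a rough scalar model of the two unpack-float codes, consider an 8-element short vector widened into two 4-element float vectors. This is a sketch under stated assumptions: the function names are hypothetical, and "lo" is taken to mean the lower-indexed half, which matches the little-endian ordering chosen in supportable_widening_operation; on a big-endian target the roles of the halves are swapped.

```c
#include <assert.h>
#include <stddef.h>

#define N_IN 8              /* elements in the input vector */
#define N_OUT (N_IN / 2)    /* elements in each output vector */

/* Convert the lower-indexed half of IN to float (little-endian "lo").  */
static void
unpack_float_lo (const short in[N_IN], float out[N_OUT])
{
  assert (in != NULL && out != NULL);
  for (size_t i = 0; i < N_OUT; i++)
    out[i] = (float) in[i];
}

/* Convert the higher-indexed half of IN to float (little-endian "hi").  */
static void
unpack_float_hi (const short in[N_IN], float out[N_OUT])
{
  assert (in != NULL && out != NULL);
  for (size_t i = 0; i < N_OUT; i++)
    out[i] = (float) in[i + N_OUT];
}
```

Applying both functions to the same input gives the two result vectors that the WIDEN case generates per copy.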
/* Pack (demote/narrow and merge) the elements of the two input vectors
into the output vector using truncation/saturation.
The elements of the input vectors are twice the size of the elements of the
@ -1099,6 +1106,12 @@ DEFTREECODE (VEC_UNPACK_LO_EXPR, "vec_unpack_lo_expr", tcc_unary, 1)
DEFTREECODE (VEC_PACK_TRUNC_EXPR, "vec_pack_trunc_expr", tcc_binary, 2)
DEFTREECODE (VEC_PACK_SAT_EXPR, "vec_pack_sat_expr", tcc_binary, 2)
/* Convert floating point values of the two input vectors to integer
and pack (narrow and merge) the elements into the output vector. The
elements of the input vectors are twice the size of the elements of
the output vector. */
DEFTREECODE (VEC_PACK_FIX_TRUNC_EXPR, "vec_pack_fix_trunc_expr", tcc_binary, 2)
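A scalar model of the pack-fix-trunc code, for two 2-element double vectors packed into one 4-element int vector, might look like the sketch below. The function name is hypothetical, and placing the first operand's elements in the lower-indexed half of the result is an assumption made for illustration; the C cast truncates toward zero, which is the FIX_TRUNC behaviour being vectorized.

```c
#include <assert.h>
#include <stddef.h>

#define N_WIDE 2                 /* elements in each input vector */
#define N_NARROW (2 * N_WIDE)    /* elements in the output vector */

/* Truncate each double toward zero and merge both inputs into OUT:
   elements of A first, then elements of B (assumed ordering).  */
static void
pack_fix_trunc (const double a[N_WIDE], const double b[N_WIDE],
                int out[N_NARROW])
{
  assert (a != NULL && b != NULL && out != NULL);
  for (size_t i = 0; i < N_WIDE; i++)
    {
      out[i] = (int) a[i];
      out[i + N_WIDE] = (int) b[i];
    }
}
```

This mirrors the NARROW case in vectorizable_conversion, where each generated statement consumes two input vector defs (vec_oprnd0 and vec_oprnd1).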
/* Extract even/odd fields from vectors. */
DEFTREECODE (VEC_EXTRACT_EVEN_EXPR, "vec_extracteven_expr", tcc_binary, 2)
DEFTREECODE (VEC_EXTRACT_ODD_EXPR, "vec_extractodd_expr", tcc_binary, 2)