re PR tree-optimization/24659 (Conversions are not vectorized)
PR tree-optimization/24659 * optabs.h (enum optab_index): Add OTI_vec_unpacks_float_hi, OTI_vec_unpacks_float_lo, OTI_vec_unpacku_float_hi, OTI_vec_unpacku_float_lo, OTI_vec_pack_sfix_trunc and OTI_vec_pack_ufix_trunc. (vec_unpacks_float_hi_optab): Define new macro. (vec_unpacks_float_lo_optab): Ditto. (vec_unpacku_float_hi_optab): Ditto. (vec_unpacku_float_lo_optab): Ditto. (vec_pack_sfix_trunc_optab): Ditto. (vec_pack_ufix_trunc_optab): Ditto. * genopinit.c (optabs): Implement vec_unpack[s|u]_[hi|lo]_optab and vec_pack_[s|u]fix_trunc_optab using vec_unpack[s|u]_[hi\lo]_* and vec_pack_[u|s]fix_trunc_* patterns * tree-vectorizer.c (supportable_widening_operation): Handle FLOAT_EXPR and CONVERT_EXPR. Update comment. (supportable_narrowing_operation): New function. * tree-vectorizer.h (supportable_narrowing_operation): Prototype. * tree-vect-transform.c (vectorizable_conversion): Handle (nunits_in == nunits_out / 2) and (nunits_out == nunits_in / 2) cases. (vect_gen_widened_results_half): Move before vectorizable_conversion. (vectorizable_type_demotion): Call supportable_narrowing_operation() to check for target support. * optabs.c (optab_for_tree_code) Return vec_unpack[s|u]_float_hi_optab for VEC_UNPACK_FLOAT_HI_EXPR, vec_unpack[s|u]_float_lo_optab for VEC_UNPACK_FLOAT_LO_EXPR and vec_pack_[u|s]fix_trunc_optab for VEC_PACK_FIX_TRUNC_EXPR. (expand_binop): Special case mode of the result for vec_pack_[u|s]fix_trunc_optab. (init_optabs): Initialize vec_unpack[s|u]_[hi|lo]_optab and vec_pack_[u|s]fix_trunc_optab. * tree.def (VEC_UNPACK_FLOAT_HI_EXPR, VEC_UNPACK_FLOAT_LO_EXPR, VEC_PACK_FIX_TRUNC_EXPR): New tree codes. * tree-pretty-print.c (dump_generic_node): Handle VEC_UNPACK_FLOAT_HI_EXPR, VEC_UNPACK_FLOAT_LO_EXPR and VEC_PACK_FIX_TRUNC_EXPR. (op_prio): Ditto. * expr.c (expand_expr_real_1): Ditto. * tree-inline.c (estimate_num_insns_1): Ditto. * tree-vect-generic.c (expand_vector_operations_1): Ditto. * config/i386/sse.md (vec_unpacks_float_hi_v8hi): New expander. 
(vec_unpacks_float_lo_v8hi): Ditto. (vec_unpacku_float_hi_v8hi): Ditto. (vec_unpacku_float_lo_v8hi): Ditto. (vec_unpacks_float_hi_v4si): Ditto. (vec_unpacks_float_lo_v4si): Ditto. (vec_pack_sfix_trunc_v2df): Ditto. * doc/c-tree.texi (Expression trees) [VEC_UNPACK_FLOAT_HI_EXPR]: Document. [VEC_UNPACK_FLOAT_LO_EXPR]: Ditto. [VEC_PACK_FIX_TRUNC_EXPR]: Ditto. * doc/md.texi (Standard Names) [vec_pack_sfix_trunc]: Document. [vec_pack_ufix_trunc]: Ditto. [vec_unpacks_float_hi]: Ditto. [vec_unpacks_float_lo]: Ditto. [vec_unpacku_float_hi]: Ditto. [vec_unpacku_float_lo]: Ditto. testsuite/ChangeLog: PR tree-optimization/24659 * gcc.dg/vect/vect-floatint-conversion-2.c: New test. * gcc.dg/vect/vect-intfloat-conversion-1.c: Require vect_float, not vect_int target. * gcc.dg/vect/vect-intfloat-conversion-2.c: Require vect_float, not vect_int target. Loop is vectorized for vect_intfloat_cvt targets. * gcc.dg/vect/vect-intfloat-conversion-3.c: New test. * gcc.dg/vect/vect-intfloat-conversion-4a.c: New test. * gcc.dg/vect/vect-intfloat-conversion-4b.c: New test. From-SVN: r124784
parent
f59d2a7c86
commit
d9987fb407
|
@ -1,3 +1,66 @@
|
||||||
|
2007-05-17 Uros Bizjak <ubizjak@gmail.com>
|
||||||
|
|
||||||
|
PR tree-optimization/24659
|
||||||
|
* optabs.h (enum optab_index): Add OTI_vec_unpacks_float_hi,
|
||||||
|
OTI_vec_unpacks_float_lo, OTI_vec_unpacku_float_hi,
|
||||||
|
OTI_vec_unpacku_float_lo, OTI_vec_pack_sfix_trunc and
|
||||||
|
OTI_vec_pack_ufix_trunc.
|
||||||
|
(vec_unpacks_float_hi_optab): Define new macro.
|
||||||
|
(vec_unpacks_float_lo_optab): Ditto.
|
||||||
|
(vec_unpacku_float_hi_optab): Ditto.
|
||||||
|
(vec_unpacku_float_lo_optab): Ditto.
|
||||||
|
(vec_pack_sfix_trunc_optab): Ditto.
|
||||||
|
(vec_pack_ufix_trunc_optab): Ditto.
|
||||||
|
* genopinit.c (optabs): Implement vec_unpack[s|u]_[hi|lo]_optab
|
||||||
|
and vec_pack_[s|u]fix_trunc_optab using
|
||||||
|
vec_unpack[s|u]_[hi|lo]_* and vec_pack_[u|s]fix_trunc_* patterns.
|
||||||
|
* tree-vectorizer.c (supportable_widening_operation): Handle
|
||||||
|
FLOAT_EXPR and CONVERT_EXPR. Update comment.
|
||||||
|
(supportable_narrowing_operation): New function.
|
||||||
|
* tree-vectorizer.h (supportable_narrowing_operation): Prototype.
|
||||||
|
* tree-vect-transform.c (vectorizable_conversion): Handle
|
||||||
|
(nunits_in == nunits_out / 2) and (nunits_out == nunits_in / 2) cases.
|
||||||
|
(vect_gen_widened_results_half): Move before vectorizable_conversion.
|
||||||
|
(vectorizable_type_demotion): Call supportable_narrowing_operation()
|
||||||
|
to check for target support.
|
||||||
|
* optabs.c (optab_for_tree_code): Return vec_unpack[s|u]_float_hi_optab
|
||||||
|
for VEC_UNPACK_FLOAT_HI_EXPR, vec_unpack[s|u]_float_lo_optab
|
||||||
|
for VEC_UNPACK_FLOAT_LO_EXPR and vec_pack_[u|s]fix_trunc_optab
|
||||||
|
for VEC_PACK_FIX_TRUNC_EXPR.
|
||||||
|
(expand_binop): Special case mode of the result for
|
||||||
|
vec_pack_[u|s]fix_trunc_optab.
|
||||||
|
(init_optabs): Initialize vec_unpack[s|u]_[hi|lo]_optab and
|
||||||
|
vec_pack_[u|s]fix_trunc_optab.
|
||||||
|
|
||||||
|
* tree.def (VEC_UNPACK_FLOAT_HI_EXPR, VEC_UNPACK_FLOAT_LO_EXPR,
|
||||||
|
VEC_PACK_FIX_TRUNC_EXPR): New tree codes.
|
||||||
|
* tree-pretty-print.c (dump_generic_node): Handle
|
||||||
|
VEC_UNPACK_FLOAT_HI_EXPR, VEC_UNPACK_FLOAT_LO_EXPR and
|
||||||
|
VEC_PACK_FIX_TRUNC_EXPR.
|
||||||
|
(op_prio): Ditto.
|
||||||
|
* expr.c (expand_expr_real_1): Ditto.
|
||||||
|
* tree-inline.c (estimate_num_insns_1): Ditto.
|
||||||
|
* tree-vect-generic.c (expand_vector_operations_1): Ditto.
|
||||||
|
|
||||||
|
* config/i386/sse.md (vec_unpacks_float_hi_v8hi): New expander.
|
||||||
|
(vec_unpacks_float_lo_v8hi): Ditto.
|
||||||
|
(vec_unpacku_float_hi_v8hi): Ditto.
|
||||||
|
(vec_unpacku_float_lo_v8hi): Ditto.
|
||||||
|
(vec_unpacks_float_hi_v4si): Ditto.
|
||||||
|
(vec_unpacks_float_lo_v4si): Ditto.
|
||||||
|
(vec_pack_sfix_trunc_v2df): Ditto.
|
||||||
|
|
||||||
|
* doc/c-tree.texi (Expression trees) [VEC_UNPACK_FLOAT_HI_EXPR]:
|
||||||
|
Document.
|
||||||
|
[VEC_UNPACK_FLOAT_LO_EXPR]: Ditto.
|
||||||
|
[VEC_PACK_FIX_TRUNC_EXPR]: Ditto.
|
||||||
|
* doc/md.texi (Standard Names) [vec_pack_sfix_trunc]: Document.
|
||||||
|
[vec_pack_ufix_trunc]: Ditto.
|
||||||
|
[vec_unpacks_float_hi]: Ditto.
|
||||||
|
[vec_unpacks_float_lo]: Ditto.
|
||||||
|
[vec_unpacku_float_hi]: Ditto.
|
||||||
|
[vec_unpacku_float_lo]: Ditto.
|
||||||
|
|
||||||
2007-05-16 Uros Bizjak <ubizjak@gmail.com>
|
2007-05-16 Uros Bizjak <ubizjak@gmail.com>
|
||||||
|
|
||||||
* soft-fp/README: Update for new files.
|
* soft-fp/README: Update for new files.
|
||||||
|
@ -53,7 +116,8 @@
|
||||||
|
|
||||||
* config/rs6000/rs6000.c (rs6000_emit_prologue): Move altivec register
|
* config/rs6000/rs6000.c (rs6000_emit_prologue): Move altivec register
|
||||||
saving after stack push. Set sp_offset whenever we push.
|
saving after stack push. Set sp_offset whenever we push.
|
||||||
(rs6000_emit_epilogue): Move altivec register restore before stack push.
|
(rs6000_emit_epilogue): Move altivec register restore before
|
||||||
|
stack push.
|
||||||
|
|
||||||
2007-05-16 Richard Sandiford <richard@codesourcery.com>
|
2007-05-16 Richard Sandiford <richard@codesourcery.com>
|
||||||
|
|
||||||
|
|
|
@ -2205,6 +2205,80 @@
|
||||||
(parallel [(const_int 0) (const_int 1)]))))]
|
(parallel [(const_int 0) (const_int 1)]))))]
|
||||||
"TARGET_SSE2")
|
"TARGET_SSE2")
|
||||||
|
|
||||||
|
(define_expand "vec_unpacks_float_hi_v8hi"
|
||||||
|
[(match_operand:V4SF 0 "register_operand" "")
|
||||||
|
(match_operand:V8HI 1 "register_operand" "")]
|
||||||
|
"TARGET_SSE2"
|
||||||
|
{
|
||||||
|
rtx tmp = gen_reg_rtx (V4SImode);
|
||||||
|
|
||||||
|
emit_insn (gen_vec_unpacks_hi_v8hi (tmp, operands[1]));
|
||||||
|
emit_insn (gen_sse2_cvtdq2ps (operands[0], tmp));
|
||||||
|
DONE;
|
||||||
|
})
|
||||||
|
|
||||||
|
(define_expand "vec_unpacks_float_lo_v8hi"
|
||||||
|
[(match_operand:V4SF 0 "register_operand" "")
|
||||||
|
(match_operand:V8HI 1 "register_operand" "")]
|
||||||
|
"TARGET_SSE2"
|
||||||
|
{
|
||||||
|
rtx tmp = gen_reg_rtx (V4SImode);
|
||||||
|
|
||||||
|
emit_insn (gen_vec_unpacks_lo_v8hi (tmp, operands[1]));
|
||||||
|
emit_insn (gen_sse2_cvtdq2ps (operands[0], tmp));
|
||||||
|
DONE;
|
||||||
|
})
|
||||||
|
|
||||||
|
(define_expand "vec_unpacku_float_hi_v8hi"
|
||||||
|
[(match_operand:V4SF 0 "register_operand" "")
|
||||||
|
(match_operand:V8HI 1 "register_operand" "")]
|
||||||
|
"TARGET_SSE2"
|
||||||
|
{
|
||||||
|
rtx tmp = gen_reg_rtx (V4SImode);
|
||||||
|
|
||||||
|
emit_insn (gen_vec_unpacku_hi_v8hi (tmp, operands[1]));
|
||||||
|
emit_insn (gen_sse2_cvtdq2ps (operands[0], tmp));
|
||||||
|
DONE;
|
||||||
|
})
|
||||||
|
|
||||||
|
(define_expand "vec_unpacku_float_lo_v8hi"
|
||||||
|
[(match_operand:V4SF 0 "register_operand" "")
|
||||||
|
(match_operand:V8HI 1 "register_operand" "")]
|
||||||
|
"TARGET_SSE2"
|
||||||
|
{
|
||||||
|
rtx tmp = gen_reg_rtx (V4SImode);
|
||||||
|
|
||||||
|
emit_insn (gen_vec_unpacku_lo_v8hi (tmp, operands[1]));
|
||||||
|
emit_insn (gen_sse2_cvtdq2ps (operands[0], tmp));
|
||||||
|
DONE;
|
||||||
|
})
|
||||||
|
|
||||||
|
(define_expand "vec_unpacks_float_hi_v4si"
|
||||||
|
[(set (match_dup 2)
|
||||||
|
(vec_select:V4SI
|
||||||
|
(match_operand:V4SI 1 "nonimmediate_operand" "")
|
||||||
|
(parallel [(const_int 2)
|
||||||
|
(const_int 3)
|
||||||
|
(const_int 2)
|
||||||
|
(const_int 3)])))
|
||||||
|
(set (match_operand:V2DF 0 "register_operand" "")
|
||||||
|
(float:V2DF
|
||||||
|
(vec_select:V2SI
|
||||||
|
(match_dup 2)
|
||||||
|
(parallel [(const_int 0) (const_int 1)]))))]
|
||||||
|
"TARGET_SSE2"
|
||||||
|
{
|
||||||
|
operands[2] = gen_reg_rtx (V4SImode);
|
||||||
|
})
|
||||||
|
|
||||||
|
(define_expand "vec_unpacks_float_lo_v4si"
|
||||||
|
[(set (match_operand:V2DF 0 "register_operand" "")
|
||||||
|
(float:V2DF
|
||||||
|
(vec_select:V2SI
|
||||||
|
(match_operand:V4SI 1 "nonimmediate_operand" "")
|
||||||
|
(parallel [(const_int 0) (const_int 1)]))))]
|
||||||
|
"TARGET_SSE2")
|
||||||
|
|
||||||
(define_expand "vec_pack_trunc_v2df"
|
(define_expand "vec_pack_trunc_v2df"
|
||||||
[(match_operand:V4SF 0 "register_operand" "")
|
[(match_operand:V4SF 0 "register_operand" "")
|
||||||
(match_operand:V2DF 1 "nonimmediate_operand" "")
|
(match_operand:V2DF 1 "nonimmediate_operand" "")
|
||||||
|
@ -2222,6 +2296,25 @@
|
||||||
DONE;
|
DONE;
|
||||||
})
|
})
|
||||||
|
|
||||||
|
(define_expand "vec_pack_sfix_trunc_v2df"
|
||||||
|
[(match_operand:V4SI 0 "register_operand" "")
|
||||||
|
(match_operand:V2DF 1 "nonimmediate_operand" "")
|
||||||
|
(match_operand:V2DF 2 "nonimmediate_operand" "")]
|
||||||
|
"TARGET_SSE2"
|
||||||
|
{
|
||||||
|
rtx r1, r2;
|
||||||
|
|
||||||
|
r1 = gen_reg_rtx (V4SImode);
|
||||||
|
r2 = gen_reg_rtx (V4SImode);
|
||||||
|
|
||||||
|
emit_insn (gen_sse2_cvttpd2dq (r1, operands[1]));
|
||||||
|
emit_insn (gen_sse2_cvttpd2dq (r2, operands[2]));
|
||||||
|
emit_insn (gen_sse2_punpcklqdq (gen_lowpart (V2DImode, operands[0]),
|
||||||
|
gen_lowpart (V2DImode, r1),
|
||||||
|
gen_lowpart (V2DImode, r2)));
|
||||||
|
DONE;
|
||||||
|
})
|
||||||
|
|
||||||
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
|
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
|
||||||
;;
|
;;
|
||||||
;; Parallel double-precision floating point element swizzling
|
;; Parallel double-precision floating point element swizzling
|
||||||
|
|
|
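The `vec_pack_sfix_trunc_v2df` expander in the sse.md hunk above converts each V2DF operand with `cvttpd2dq` and then joins the two low 64-bit halves with `punpcklqdq`. A minimal scalar sketch of that data flow follows. The `model_*` names are hypothetical and exist only for this illustration; the sketch assumes every input is in `int32_t` range (the C cast is undefined otherwise) and is not the compiler's actual implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of cvttpd2dq: truncate two doubles toward
   zero into the low two 32-bit lanes and zero the high two lanes.
   Assumes both inputs fit in int32_t.  */
static void
model_cvttpd2dq (int32_t out[4], const double in[2])
{
  out[0] = (int32_t) in[0];
  out[1] = (int32_t) in[1];
  out[2] = 0;
  out[3] = 0;
}

/* Hypothetical scalar model of punpcklqdq: the result holds the low
   64 bits of A (two 32-bit lanes here) followed by the low 64 bits of B.  */
static void
model_punpcklqdq (int32_t out[4], const int32_t a[4], const int32_t b[4])
{
  out[0] = a[0];
  out[1] = a[1];
  out[2] = b[0];
  out[3] = b[1];
}

/* Model of the expander: two conversions, then one interleave.  */
void
model_pack_sfix_trunc_v2df (int32_t out[4], const double op1[2],
                            const double op2[2])
{
  int32_t r1[4], r2[4];

  model_cvttpd2dq (r1, op1);
  model_cvttpd2dq (r2, op2);
  model_punpcklqdq (out, r1, r2);
}
```

The zeroed high lanes of each intermediate result are discarded by the interleave, which is why a single `punpcklqdq` is enough to merge the two halves.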
@ -1983,8 +1983,11 @@ This macro returns the attributes on the type @var{type}.
|
||||||
@tindex VEC_WIDEN_MULT_LO_EXPR
|
@tindex VEC_WIDEN_MULT_LO_EXPR
|
||||||
@tindex VEC_UNPACK_HI_EXPR
|
@tindex VEC_UNPACK_HI_EXPR
|
||||||
@tindex VEC_UNPACK_LO_EXPR
|
@tindex VEC_UNPACK_LO_EXPR
|
||||||
|
@tindex VEC_UNPACK_FLOAT_HI_EXPR
|
||||||
|
@tindex VEC_UNPACK_FLOAT_LO_EXPR
|
||||||
@tindex VEC_PACK_TRUNC_EXPR
|
@tindex VEC_PACK_TRUNC_EXPR
|
||||||
@tindex VEC_PACK_SAT_EXPR
|
@tindex VEC_PACK_SAT_EXPR
|
||||||
|
@tindex VEC_PACK_FIX_TRUNC_EXPR
|
||||||
@tindex VEC_EXTRACT_EVEN_EXPR
|
@tindex VEC_EXTRACT_EVEN_EXPR
|
||||||
@tindex VEC_EXTRACT_ODD_EXPR
|
@tindex VEC_EXTRACT_ODD_EXPR
|
||||||
@tindex VEC_INTERLEAVE_HIGH_EXPR
|
@tindex VEC_INTERLEAVE_HIGH_EXPR
|
||||||
|
@ -2846,6 +2849,17 @@ high @code{N/2} elements of the vector are extracted and widened (promoted).
|
||||||
In the case of @code{VEC_UNPACK_LO_EXPR} the low @code{N/2} elements of the
|
In the case of @code{VEC_UNPACK_LO_EXPR} the low @code{N/2} elements of the
|
||||||
vector are extracted and widened (promoted).
|
vector are extracted and widened (promoted).
|
||||||
|
|
||||||
|
@item VEC_UNPACK_FLOAT_HI_EXPR
|
||||||
|
@item VEC_UNPACK_FLOAT_LO_EXPR
|
||||||
|
These nodes represent unpacking of the high and low parts of the input vector,
|
||||||
|
where the values are converted from fixed point to floating point. The
|
||||||
|
single operand is a vector that contains @code{N} elements of the same
|
||||||
|
integral type. The result is a vector that contains half as many elements
|
||||||
|
of a floating point type whose size is twice as wide. In the case of
|
||||||
|
@code{VEC_UNPACK_FLOAT_HI_EXPR} the high @code{N/2} elements of the vector are
|
||||||
|
extracted, converted and widened. In the case of @code{VEC_UNPACK_FLOAT_LO_EXPR}
|
||||||
|
the low @code{N/2} elements of the vector are extracted, converted and widened.
|
||||||
|
|
||||||
@item VEC_PACK_TRUNC_EXPR
|
@item VEC_PACK_TRUNC_EXPR
|
||||||
This node represents packing of truncated elements of the two input vectors
|
This node represents packing of truncated elements of the two input vectors
|
||||||
into the output vector. Input operands are vectors that contain the same
|
into the output vector. Input operands are vectors that contain the same
|
||||||
|
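The `VEC_UNPACK_FLOAT_HI_EXPR` and `VEC_UNPACK_FLOAT_LO_EXPR` description in the hunk above can be read as a pair of scalar loops. The sketch below is an illustration only: the function names and the choice of `N = 8` `short` elements are assumptions, and "high" is taken to mean the upper-indexed half, which matches the i386 `vec_unpacks_float_hi_v4si` expander (it selects elements 2 and 3) but may differ on other targets.

```c
#include <assert.h>

#define N 8  /* element count of the input vector; illustrative only */

/* Low half: elements 0 .. N/2-1 are converted to a floating point
   type twice as wide as the input element type.  */
void
model_unpack_float_lo (float out[N / 2], const short in[N])
{
  for (int i = 0; i < N / 2; i++)
    out[i] = (float) in[i];
}

/* High half: elements N/2 .. N-1, under the assumption stated above
   that "high" means the upper-indexed elements.  */
void
model_unpack_float_hi (float out[N / 2], const short in[N])
{
  for (int i = 0; i < N / 2; i++)
    out[i] = (float) in[N / 2 + i];
}
```

Together the two results cover all `N` input elements, so a vectorized `short` to `float` loop needs one HI and one LO operation per input vector.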
@ -2862,6 +2876,15 @@ vector that contains twice as many elements of an integral type whose size
|
||||||
is half as wide. The elements of the two vectors are demoted and merged
|
is half as wide. The elements of the two vectors are demoted and merged
|
||||||
(concatenated) to form the output vector.
|
(concatenated) to form the output vector.
|
||||||
|
|
||||||
|
@item VEC_PACK_FIX_TRUNC_EXPR
|
||||||
|
This node represents packing of elements of the two input vectors into the
|
||||||
|
output vector, where the values are converted from floating point
|
||||||
|
to fixed point. Input operands are vectors that contain the same number
|
||||||
|
of elements of a floating point type. The result is a vector that contains
|
||||||
|
twice as many elements of an integral type whose size is half as wide. The
|
||||||
|
elements of the two vectors are merged (concatenated) to form the output
|
||||||
|
vector.
|
||||||
|
|
||||||
@item VEC_EXTRACT_EVEN_EXPR
|
@item VEC_EXTRACT_EVEN_EXPR
|
||||||
@item VEC_EXTRACT_ODD_EXPR
|
@item VEC_EXTRACT_ODD_EXPR
|
||||||
These nodes represent extracting of the even/odd elements of the two input
|
These nodes represent extracting of the even/odd elements of the two input
|
||||||
|
|
|
@ -3607,6 +3607,14 @@ Operand 0 is the resulting vector in which the elements of the two input
|
||||||
vectors are concatenated after narrowing them down using signed/unsigned
|
vectors are concatenated after narrowing them down using signed/unsigned
|
||||||
saturating arithmetic.
|
saturating arithmetic.
|
||||||
|
|
||||||
|
@cindex @code{vec_pack_sfix_trunc_@var{m}} instruction pattern
|
||||||
|
@cindex @code{vec_pack_ufix_trunc_@var{m}} instruction pattern
|
||||||
|
@item @samp{vec_pack_sfix_trunc_@var{m}}, @samp{vec_pack_ufix_trunc_@var{m}}
|
||||||
|
Narrow, convert to signed/unsigned integral type and merge the elements
|
||||||
|
of two vectors. Operands 1 and 2 are vectors of the same mode having N
|
||||||
|
floating point elements of size S. Operand 0 is the resulting vector
|
||||||
|
in which 2*N elements of size S/2 are concatenated.
|
||||||
|
|
||||||
@cindex @code{vec_unpacks_hi_@var{m}} instruction pattern
|
@cindex @code{vec_unpacks_hi_@var{m}} instruction pattern
|
||||||
@cindex @code{vec_unpacks_lo_@var{m}} instruction pattern
|
@cindex @code{vec_unpacks_lo_@var{m}} instruction pattern
|
||||||
@item @samp{vec_unpacks_hi_@var{m}}, @samp{vec_unpacks_lo_@var{m}}
|
@item @samp{vec_unpacks_hi_@var{m}}, @samp{vec_unpacks_lo_@var{m}}
|
||||||
|
@ -3624,11 +3632,24 @@ integral elements. The input vector (operand 1) has N elements of size S.
|
||||||
Widen (promote) the high/low elements of the vector using zero extension and
|
Widen (promote) the high/low elements of the vector using zero extension and
|
||||||
place the resulting N/2 values of size 2*S in the output vector (operand 0).
|
place the resulting N/2 values of size 2*S in the output vector (operand 0).
|
||||||
|
|
||||||
|
@cindex @code{vec_unpacks_float_hi_@var{m}} instruction pattern
|
||||||
|
@cindex @code{vec_unpacks_float_lo_@var{m}} instruction pattern
|
||||||
|
@cindex @code{vec_unpacku_float_hi_@var{m}} instruction pattern
|
||||||
|
@cindex @code{vec_unpacku_float_lo_@var{m}} instruction pattern
|
||||||
|
@item @samp{vec_unpacks_float_hi_@var{m}}, @samp{vec_unpacks_float_lo_@var{m}}
|
||||||
|
@itemx @samp{vec_unpacku_float_hi_@var{m}}, @samp{vec_unpacku_float_lo_@var{m}}
|
||||||
|
Extract, convert to floating point type and widen the high/low part of a
|
||||||
|
vector of signed/unsigned integral elements. The input vector (operand 1)
|
||||||
|
has N elements of size S. Convert the high/low elements of the vector using
|
||||||
|
floating point conversion and place the resulting N/2 values of size 2*S in
|
||||||
|
the output vector (operand 0).
|
||||||
|
|
||||||
@cindex @code{vec_widen_umult_hi_@var{m}} instruction pattern
|
@cindex @code{vec_widen_umult_hi_@var{m}} instruction pattern
|
||||||
@cindex @code{vec_widen_umult_lo_@var{m}} instruction pattern
|
@cindex @code{vec_widen_umult_lo_@var{m}} instruction pattern
|
||||||
@cindex @code{vec_widen_smult_hi_@var{m}} instruction pattern
|
@cindex @code{vec_widen_smult_hi_@var{m}} instruction pattern
|
||||||
@cindex @code{vec_widen_smult_lo_@var{m}} instruction pattern
|
@cindex @code{vec_widen_smult_lo_@var{m}} instruction pattern
|
||||||
@item @samp{vec_widen_umult_hi_@var{m}}, @samp{vec_widen_umult_lo_@var{m}}, @samp{vec_widen_smult_hi_@var{m}}, @samp{vec_widen_smult_lo_@var{m}}
|
@item @samp{vec_widen_umult_hi_@var{m}}, @samp{vec_widen_umult_lo_@var{m}}
|
||||||
|
@itemx @samp{vec_widen_smult_hi_@var{m}}, @samp{vec_widen_smult_lo_@var{m}}
|
||||||
Signed/Unsigned widening multiplication. The two inputs (operands 1 and 2)
|
Signed/Unsigned widening multiplication. The two inputs (operands 1 and 2)
|
||||||
are vectors with N signed/unsigned elements of size S. Multiply the high/low
|
are vectors with N signed/unsigned elements of size S. Multiply the high/low
|
||||||
elements of the two vectors, and put the N/2 products of size 2*S in the
|
elements of the two vectors, and put the N/2 products of size 2*S in the
|
||||||
|
|
16
gcc/expr.c
|
@ -9001,6 +9001,21 @@ expand_expr_real_1 (tree exp, rtx target, enum machine_mode tmode,
|
||||||
return temp;
|
return temp;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
case VEC_UNPACK_FLOAT_HI_EXPR:
|
||||||
|
case VEC_UNPACK_FLOAT_LO_EXPR:
|
||||||
|
{
|
||||||
|
op0 = expand_normal (TREE_OPERAND (exp, 0));
|
||||||
|
/* The signedness is determined from input operand. */
|
||||||
|
this_optab = optab_for_tree_code (code,
|
||||||
|
TREE_TYPE (TREE_OPERAND (exp, 0)));
|
||||||
|
temp = expand_widen_pattern_expr
|
||||||
|
(exp, op0, NULL_RTX, NULL_RTX,
|
||||||
|
target, TYPE_UNSIGNED (TREE_TYPE (TREE_OPERAND (exp, 0))));
|
||||||
|
|
||||||
|
gcc_assert (temp);
|
||||||
|
return temp;
|
||||||
|
}
|
||||||
|
|
||||||
case VEC_WIDEN_MULT_HI_EXPR:
|
case VEC_WIDEN_MULT_HI_EXPR:
|
||||||
case VEC_WIDEN_MULT_LO_EXPR:
|
case VEC_WIDEN_MULT_LO_EXPR:
|
||||||
{
|
{
|
||||||
|
@ -9016,6 +9031,7 @@ expand_expr_real_1 (tree exp, rtx target, enum machine_mode tmode,
|
||||||
|
|
||||||
case VEC_PACK_TRUNC_EXPR:
|
case VEC_PACK_TRUNC_EXPR:
|
||||||
case VEC_PACK_SAT_EXPR:
|
case VEC_PACK_SAT_EXPR:
|
||||||
|
case VEC_PACK_FIX_TRUNC_EXPR:
|
||||||
{
|
{
|
||||||
mode = TYPE_MODE (TREE_TYPE (TREE_OPERAND (exp, 0)));
|
mode = TYPE_MODE (TREE_TYPE (TREE_OPERAND (exp, 0)));
|
||||||
goto binop;
|
goto binop;
|
||||||
|
|
|
@ -233,9 +233,15 @@ static const char * const optabs[] =
|
||||||
"vec_unpacks_lo_optab->handlers[$A].insn_code = CODE_FOR_$(vec_unpacks_lo_$a$)",
|
"vec_unpacks_lo_optab->handlers[$A].insn_code = CODE_FOR_$(vec_unpacks_lo_$a$)",
|
||||||
"vec_unpacku_hi_optab->handlers[$A].insn_code = CODE_FOR_$(vec_unpacku_hi_$a$)",
|
"vec_unpacku_hi_optab->handlers[$A].insn_code = CODE_FOR_$(vec_unpacku_hi_$a$)",
|
||||||
"vec_unpacku_lo_optab->handlers[$A].insn_code = CODE_FOR_$(vec_unpacku_lo_$a$)",
|
"vec_unpacku_lo_optab->handlers[$A].insn_code = CODE_FOR_$(vec_unpacku_lo_$a$)",
|
||||||
|
"vec_unpacks_float_hi_optab->handlers[$A].insn_code = CODE_FOR_$(vec_unpacks_float_hi_$a$)",
|
||||||
|
"vec_unpacks_float_lo_optab->handlers[$A].insn_code = CODE_FOR_$(vec_unpacks_float_lo_$a$)",
|
||||||
|
"vec_unpacku_float_hi_optab->handlers[$A].insn_code = CODE_FOR_$(vec_unpacku_float_hi_$a$)",
|
||||||
|
"vec_unpacku_float_lo_optab->handlers[$A].insn_code = CODE_FOR_$(vec_unpacku_float_lo_$a$)",
|
||||||
"vec_pack_trunc_optab->handlers[$A].insn_code = CODE_FOR_$(vec_pack_trunc_$a$)",
|
"vec_pack_trunc_optab->handlers[$A].insn_code = CODE_FOR_$(vec_pack_trunc_$a$)",
|
||||||
"vec_pack_ssat_optab->handlers[$A].insn_code = CODE_FOR_$(vec_pack_ssat_$a$)",
|
"vec_pack_ssat_optab->handlers[$A].insn_code = CODE_FOR_$(vec_pack_ssat_$a$)",
|
||||||
"vec_pack_usat_optab->handlers[$A].insn_code = CODE_FOR_$(vec_pack_usat_$a$)"
|
"vec_pack_usat_optab->handlers[$A].insn_code = CODE_FOR_$(vec_pack_usat_$a$)",
|
||||||
|
"vec_pack_sfix_trunc_optab->handlers[$A].insn_code = CODE_FOR_$(vec_pack_sfix_trunc_$a$)",
|
||||||
|
"vec_pack_ufix_trunc_optab->handlers[$A].insn_code = CODE_FOR_$(vec_pack_ufix_trunc_$a$)"
|
||||||
};
|
};
|
||||||
|
|
||||||
static void gen_insn (rtx);
|
static void gen_insn (rtx);
|
||||||
|
|
24
gcc/optabs.c
|
@ -340,12 +340,26 @@ optab_for_tree_code (enum tree_code code, tree type)
|
||||||
return TYPE_UNSIGNED (type) ?
|
return TYPE_UNSIGNED (type) ?
|
||||||
vec_unpacku_lo_optab : vec_unpacks_lo_optab;
|
vec_unpacku_lo_optab : vec_unpacks_lo_optab;
|
||||||
|
|
||||||
|
case VEC_UNPACK_FLOAT_HI_EXPR:
|
||||||
|
/* The signedness is determined from input operand. */
|
||||||
|
return TYPE_UNSIGNED (type) ?
|
||||||
|
vec_unpacku_float_hi_optab : vec_unpacks_float_hi_optab;
|
||||||
|
|
||||||
|
case VEC_UNPACK_FLOAT_LO_EXPR:
|
||||||
|
/* The signedness is determined from input operand. */
|
||||||
|
return TYPE_UNSIGNED (type) ?
|
||||||
|
vec_unpacku_float_lo_optab : vec_unpacks_float_lo_optab;
|
||||||
|
|
||||||
case VEC_PACK_TRUNC_EXPR:
|
case VEC_PACK_TRUNC_EXPR:
|
||||||
return vec_pack_trunc_optab;
|
return vec_pack_trunc_optab;
|
||||||
|
|
||||||
case VEC_PACK_SAT_EXPR:
|
case VEC_PACK_SAT_EXPR:
|
||||||
return TYPE_UNSIGNED (type) ? vec_pack_usat_optab : vec_pack_ssat_optab;
|
return TYPE_UNSIGNED (type) ? vec_pack_usat_optab : vec_pack_ssat_optab;
|
||||||
|
|
||||||
|
case VEC_PACK_FIX_TRUNC_EXPR:
|
||||||
|
return TYPE_UNSIGNED (type) ?
|
||||||
|
vec_pack_ufix_trunc_optab : vec_pack_sfix_trunc_optab;
|
||||||
|
|
||||||
default:
|
default:
|
||||||
break;
|
break;
|
||||||
}
|
}
|
||||||
|
@ -1375,7 +1389,9 @@ expand_binop (enum machine_mode mode, optab binoptab, rtx op0, rtx op1,
|
||||||
|
|
||||||
if (binoptab == vec_pack_trunc_optab
|
if (binoptab == vec_pack_trunc_optab
|
||||||
|| binoptab == vec_pack_usat_optab
|
|| binoptab == vec_pack_usat_optab
|
||||||
|| binoptab == vec_pack_ssat_optab)
|
|| binoptab == vec_pack_ssat_optab
|
||||||
|
|| binoptab == vec_pack_ufix_trunc_optab
|
||||||
|
|| binoptab == vec_pack_sfix_trunc_optab)
|
||||||
{
|
{
|
||||||
/* The mode of the result is different than the mode of the
|
/* The mode of the result is different than the mode of the
|
||||||
arguments. */
|
arguments. */
|
||||||
|
@ -5565,9 +5581,15 @@ init_optabs (void)
|
||||||
vec_unpacks_lo_optab = init_optab (UNKNOWN);
|
vec_unpacks_lo_optab = init_optab (UNKNOWN);
|
||||||
vec_unpacku_hi_optab = init_optab (UNKNOWN);
|
vec_unpacku_hi_optab = init_optab (UNKNOWN);
|
||||||
vec_unpacku_lo_optab = init_optab (UNKNOWN);
|
vec_unpacku_lo_optab = init_optab (UNKNOWN);
|
||||||
|
vec_unpacks_float_hi_optab = init_optab (UNKNOWN);
|
||||||
|
vec_unpacks_float_lo_optab = init_optab (UNKNOWN);
|
||||||
|
vec_unpacku_float_hi_optab = init_optab (UNKNOWN);
|
||||||
|
vec_unpacku_float_lo_optab = init_optab (UNKNOWN);
|
||||||
vec_pack_trunc_optab = init_optab (UNKNOWN);
|
vec_pack_trunc_optab = init_optab (UNKNOWN);
|
||||||
vec_pack_usat_optab = init_optab (UNKNOWN);
|
vec_pack_usat_optab = init_optab (UNKNOWN);
|
||||||
vec_pack_ssat_optab = init_optab (UNKNOWN);
|
vec_pack_ssat_optab = init_optab (UNKNOWN);
|
||||||
|
vec_pack_ufix_trunc_optab = init_optab (UNKNOWN);
|
||||||
|
vec_pack_sfix_trunc_optab = init_optab (UNKNOWN);
|
||||||
|
|
||||||
powi_optab = init_optab (UNKNOWN);
|
powi_optab = init_optab (UNKNOWN);
|
||||||
|
|
||||||
|
|
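The `optab_for_tree_code` hunk above picks the `vec_unpacku_float_*` or `vec_unpacks_float_*` optab from the signedness of the input operand. The following sketch shows, for a single 16-bit lane, why that choice changes the result; the function names are hypothetical and the code is a scalar model, not GCC's implementation. It also suggests why the i386 `vec_unpacku_float_*_v8hi` expanders can reuse the signed `cvtdq2ps` conversion: after zero extension to 32 bits the value is never negative.

```c
#include <assert.h>
#include <stdint.h>

/* Signed variant: sign-extend the 16-bit pattern to 32 bits, then
   convert.  The extension is spelled out arithmetically so the code
   does not rely on implementation-defined narrowing casts.  */
float
model_unpacks_float_lane (uint16_t bits)
{
  int32_t wide = bits >= 0x8000 ? (int32_t) bits - 0x10000 : (int32_t) bits;
  return (float) wide;
}

/* Unsigned variant: zero-extend to 32 bits, then convert.  The widened
   value lies in 0 .. 65535, so a signed int-to-float conversion gives
   the same answer an unsigned one would.  */
float
model_unpacku_float_lane (uint16_t bits)
{
  int32_t wide = (int32_t) bits;
  return (float) wide;
}
```

The bit pattern 65533 therefore converts to -3.0 through the signed path and to 65533.0 through the unsigned path, the same pair of values exercised by the `short` and `unsigned short` arrays in the new tests.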
19
gcc/optabs.h
|
@ -298,11 +298,24 @@ enum optab_index
|
||||||
elements. */
|
elements. */
|
||||||
OTI_vec_unpacku_hi,
|
OTI_vec_unpacku_hi,
|
||||||
OTI_vec_unpacku_lo,
|
OTI_vec_unpacku_lo,
|
||||||
|
|
||||||
|
/* Extract, convert to floating point and widen the high/low part of
|
||||||
|
a vector of signed or unsigned integer elements. */
|
||||||
|
OTI_vec_unpacks_float_hi,
|
||||||
|
OTI_vec_unpacks_float_lo,
|
||||||
|
OTI_vec_unpacku_float_hi,
|
||||||
|
OTI_vec_unpacku_float_lo,
|
||||||
|
|
||||||
/* Narrow (demote) and merge the elements of two vectors. */
|
/* Narrow (demote) and merge the elements of two vectors. */
|
||||||
OTI_vec_pack_trunc,
|
OTI_vec_pack_trunc,
|
||||||
OTI_vec_pack_usat,
|
OTI_vec_pack_usat,
|
||||||
OTI_vec_pack_ssat,
|
OTI_vec_pack_ssat,
|
||||||
|
|
||||||
|
/* Convert to signed/unsigned integer, narrow and merge elements
|
||||||
|
of two vectors of floating point elements. */
|
||||||
|
OTI_vec_pack_sfix_trunc,
|
||||||
|
OTI_vec_pack_ufix_trunc,
|
||||||
|
|
||||||
/* Perform a raise to the power of integer. */
|
/* Perform a raise to the power of integer. */
|
||||||
OTI_powi,
|
OTI_powi,
|
||||||
|
|
||||||
|
@ -446,9 +459,15 @@ extern GTY(()) optab optab_table[OTI_MAX];
|
||||||
#define vec_unpacks_lo_optab (optab_table[OTI_vec_unpacks_lo])
|
#define vec_unpacks_lo_optab (optab_table[OTI_vec_unpacks_lo])
|
||||||
#define vec_unpacku_hi_optab (optab_table[OTI_vec_unpacku_hi])
|
#define vec_unpacku_hi_optab (optab_table[OTI_vec_unpacku_hi])
|
||||||
#define vec_unpacku_lo_optab (optab_table[OTI_vec_unpacku_lo])
|
#define vec_unpacku_lo_optab (optab_table[OTI_vec_unpacku_lo])
|
||||||
|
#define vec_unpacks_float_hi_optab (optab_table[OTI_vec_unpacks_float_hi])
|
||||||
|
#define vec_unpacks_float_lo_optab (optab_table[OTI_vec_unpacks_float_lo])
|
||||||
|
#define vec_unpacku_float_hi_optab (optab_table[OTI_vec_unpacku_float_hi])
|
||||||
|
#define vec_unpacku_float_lo_optab (optab_table[OTI_vec_unpacku_float_lo])
|
||||||
#define vec_pack_trunc_optab (optab_table[OTI_vec_pack_trunc])
|
#define vec_pack_trunc_optab (optab_table[OTI_vec_pack_trunc])
|
||||||
#define vec_pack_ssat_optab (optab_table[OTI_vec_pack_ssat])
|
#define vec_pack_ssat_optab (optab_table[OTI_vec_pack_ssat])
|
||||||
#define vec_pack_usat_optab (optab_table[OTI_vec_pack_usat])
|
#define vec_pack_usat_optab (optab_table[OTI_vec_pack_usat])
|
||||||
|
#define vec_pack_sfix_trunc_optab (optab_table[OTI_vec_pack_sfix_trunc])
|
||||||
|
#define vec_pack_ufix_trunc_optab (optab_table[OTI_vec_pack_ufix_trunc])
|
||||||
|
|
||||||
#define powi_optab (optab_table[OTI_powi])
|
#define powi_optab (optab_table[OTI_powi])
|
||||||
|
|
||||||
|
|
|
@ -1,3 +1,16 @@
|
||||||
|
2007-05-17 Uros Bizjak <ubizjak@gmail.com>
|
||||||
|
|
||||||
|
PR tree-optimization/24659
|
||||||
|
* gcc.dg/vect/vect-floatint-conversion-2.c: New test.
|
||||||
|
* gcc.dg/vect/vect-intfloat-conversion-1.c: Require vect_float,
|
||||||
|
not vect_int target.
|
||||||
|
* gcc.dg/vect/vect-intfloat-conversion-2.c: Require vect_float,
|
||||||
|
not vect_int target. Loop is vectorized for vect_intfloat_cvt
|
||||||
|
targets.
|
||||||
|
* gcc.dg/vect/vect-intfloat-conversion-3.c: New test.
|
||||||
|
* gcc.dg/vect/vect-intfloat-conversion-4a.c: New test.
|
||||||
|
* gcc.dg/vect/vect-intfloat-conversion-4b.c: New test.
|
||||||
|
|
||||||
2007-05-16 Uros Bizjak <ubizjak@gmail.com>
|
2007-05-16 Uros Bizjak <ubizjak@gmail.com>
|
||||||
|
|
||||||
* gcc.dg/torture/fp-int-convert-float128.c: Do not xfail for i?86-*-*
|
* gcc.dg/torture/fp-int-convert-float128.c: Do not xfail for i?86-*-*
|
||||||
|
|
|
@ -0,0 +1,40 @@
|
||||||
|
/* { dg-require-effective-target vect_double } */
|
||||||
|
|
||||||
|
#include <stdarg.h>
|
||||||
|
#include "tree-vect.h"
|
||||||
|
|
||||||
|
#define N 32
|
||||||
|
|
||||||
|
int
|
||||||
|
main1 ()
|
||||||
|
{
|
||||||
|
int i;
|
||||||
|
double db[N] = {0.4,3.5,6.6,9.4,12.5,15.6,18.4,21.5,24.6,27.4,30.5,33.6,36.4,39.5,42.6,45.4,0.5,3.6,6.4,9.5,12.6,15.4,18.5,21.6,24.4,27.5,30.6,33.4,36.5,39.6,42.4,45.5};
|
||||||
|
int ia[N];
|
||||||
|
|
||||||
|
/* double -> int */
|
||||||
|
for (i = 0; i < N; i++)
|
||||||
|
{
|
||||||
|
ia[i] = (int) db[i];
|
||||||
|
}
|
||||||
|
|
||||||
|
/* check results: */
|
||||||
|
for (i = 0; i < N; i++)
|
||||||
|
{
|
||||||
|
if (ia[i] != (int) db[i])
|
||||||
|
abort ();
|
||||||
|
}
|
||||||
|
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
int
|
||||||
|
main (void)
|
||||||
|
{
|
||||||
|
check_vect ();
|
||||||
|
|
||||||
|
return main1 ();
|
||||||
|
}
|
||||||
|
|
||||||
|
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vect_floatint_cvt } } } */
|
||||||
|
/* { dg-final { cleanup-tree-dump "vect" } } */
|
|
@ -1,4 +1,4 @@
|
||||||
/* { dg-require-effective-target vect_int } */
|
/* { dg-require-effective-target vect_float } */
|
||||||
|
|
||||||
#include <stdarg.h>
|
#include <stdarg.h>
|
||||||
#include "tree-vect.h"
|
#include "tree-vect.h"
|
||||||
|
|
|
@ -1,4 +1,4 @@
|
||||||
/* { dg-require-effective-target vect_int } */
|
/* { dg-require-effective-target vect_float } */
|
||||||
|
|
||||||
#include <stdarg.h>
|
#include <stdarg.h>
|
||||||
#include "tree-vect.h"
|
#include "tree-vect.h"
|
||||||
|
@ -36,5 +36,5 @@ int main (void)
|
||||||
return main1 ();
|
return main1 ();
|
||||||
}
|
}
|
||||||
|
|
||||||
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target powerpc*-*-* i?86-*-* x86_64-*-* } } } */
|
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vect_intfloat_cvt } } } */
|
||||||
/* { dg-final { cleanup-tree-dump "vect" } } */
|
/* { dg-final { cleanup-tree-dump "vect" } } */
|
||||||
|
|
|
@ -0,0 +1,38 @@
|
||||||
|
/* { dg-require-effective-target vect_double } */
|
||||||
|
|
||||||
|
#include <stdarg.h>
|
||||||
|
#include "tree-vect.h"
|
||||||
|
|
||||||
|
#define N 32
|
||||||
|
|
||||||
|
int main1 ()
|
||||||
|
{
|
||||||
|
int i;
|
||||||
|
int ib[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45,0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45};
|
||||||
|
double da[N];
|
||||||
|
|
||||||
|
/* int -> double */
|
||||||
|
for (i = 0; i < N; i++)
|
||||||
|
{
|
||||||
|
da[i] = (double) ib[i];
|
||||||
|
}
|
||||||
|
|
||||||
|
/* check results: */
|
||||||
|
for (i = 0; i < N; i++)
|
||||||
|
{
|
||||||
|
if (da[i] != (double) ib[i])
|
||||||
|
abort ();
|
||||||
|
}
|
||||||
|
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
int main (void)
|
||||||
|
{
|
||||||
|
check_vect ();
|
||||||
|
|
||||||
|
return main1 ();
|
||||||
|
}
|
||||||
|
|
||||||
|
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vect_intfloat_cvt } } } */
|
||||||
|
/* { dg-final { cleanup-tree-dump "vect" } } */
|
|
@ -0,0 +1,38 @@
|
||||||
|
/* { dg-require-effective-target vect_float } */
|
||||||
|
|
||||||
|
#include <stdarg.h>
|
||||||
|
#include "tree-vect.h"
|
||||||
|
|
||||||
|
#define N 32
|
||||||
|
|
||||||
|
int main1 ()
|
||||||
|
{
|
||||||
|
int i;
|
||||||
|
short sb[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45,0,-3,-6,-9,-12,-15,-18,-21,-24,-27,-30,-33,-36,-39,-42,-45};
|
||||||
|
float fa[N];
|
||||||
|
|
||||||
|
/* short -> float */
|
||||||
|
for (i = 0; i < N; i++)
|
||||||
|
{
|
||||||
|
fa[i] = (float) sb[i];
|
||||||
|
}
|
||||||
|
|
||||||
|
/* check results: */
|
||||||
|
for (i = 0; i < N; i++)
|
||||||
|
{
|
||||||
|
if (fa[i] != (float) sb[i])
|
||||||
|
abort ();
|
||||||
|
}
|
||||||
|
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
int main (void)
|
||||||
|
{
|
||||||
|
check_vect ();
|
||||||
|
|
||||||
|
return main1 ();
|
||||||
|
}
|
||||||
|
|
||||||
|
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vect_intfloat_cvt } } } */
|
||||||
|
/* { dg-final { cleanup-tree-dump "vect" } } */
|
|
@ -0,0 +1,38 @@
|
||||||
|
/* { dg-require-effective-target vect_float } */
|
||||||
|
|
||||||
|
#include <stdarg.h>
|
||||||
|
#include "tree-vect.h"
|
||||||
|
|
||||||
|
#define N 32
|
||||||
|
|
||||||
|
int main1 ()
|
||||||
|
{
|
||||||
|
int i;
|
||||||
|
unsigned short usb[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45,0,65533,65530,65527,65524,65521,65518,65515,65512,65509,65506,65503,65500,65497,65494,65491};
|
||||||
|
float fa[N];
|
||||||
|
|
||||||
|
/* unsigned short -> float */
|
||||||
|
for (i = 0; i < N; i++)
|
||||||
|
{
|
||||||
|
fa[i] = (float) usb[i];
|
||||||
|
}
|
||||||
|
|
||||||
|
/* check results: */
|
||||||
|
for (i = 0; i < N; i++)
|
||||||
|
{
|
||||||
|
if (fa[i] != (float) usb[i])
|
||||||
|
abort ();
|
||||||
|
}
|
||||||
|
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
int main (void)
|
||||||
|
{
|
||||||
|
check_vect ();
|
||||||
|
|
||||||
|
return main1 ();
|
||||||
|
}
|
||||||
|
|
||||||
|
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vect_intfloat_cvt } } } */
|
||||||
|
/* { dg-final { cleanup-tree-dump "vect" } } */
|
|
@ -2148,8 +2148,11 @@ estimate_num_insns_1 (tree *tp, int *walk_subtrees, void *data)
|
||||||
case VEC_WIDEN_MULT_LO_EXPR:
|
case VEC_WIDEN_MULT_LO_EXPR:
|
||||||
case VEC_UNPACK_HI_EXPR:
|
case VEC_UNPACK_HI_EXPR:
|
||||||
case VEC_UNPACK_LO_EXPR:
|
case VEC_UNPACK_LO_EXPR:
|
||||||
|
case VEC_UNPACK_FLOAT_HI_EXPR:
|
||||||
|
case VEC_UNPACK_FLOAT_LO_EXPR:
|
||||||
case VEC_PACK_TRUNC_EXPR:
|
case VEC_PACK_TRUNC_EXPR:
|
||||||
case VEC_PACK_SAT_EXPR:
|
case VEC_PACK_SAT_EXPR:
|
||||||
|
case VEC_PACK_FIX_TRUNC_EXPR:
|
||||||
|
|
||||||
case WIDEN_MULT_EXPR:
|
case WIDEN_MULT_EXPR:
|
||||||
|
|
||||||
|
|
|
@ -1943,6 +1943,18 @@ dump_generic_node (pretty_printer *buffer, tree node, int spc, int flags,
|
||||||
pp_string (buffer, " > ");
|
pp_string (buffer, " > ");
|
||||||
break;
|
break;
|
||||||
|
|
||||||
|
case VEC_UNPACK_FLOAT_HI_EXPR:
|
||||||
|
pp_string (buffer, " VEC_UNPACK_FLOAT_HI_EXPR < ");
|
||||||
|
dump_generic_node (buffer, TREE_OPERAND (node, 0), spc, flags, false);
|
||||||
|
pp_string (buffer, " > ");
|
||||||
|
break;
|
||||||
|
|
||||||
|
case VEC_UNPACK_FLOAT_LO_EXPR:
|
||||||
|
pp_string (buffer, " VEC_UNPACK_FLOAT_LO_EXPR < ");
|
||||||
|
dump_generic_node (buffer, TREE_OPERAND (node, 0), spc, flags, false);
|
||||||
|
pp_string (buffer, " > ");
|
||||||
|
break;
|
||||||
|
|
||||||
case VEC_PACK_TRUNC_EXPR:
|
case VEC_PACK_TRUNC_EXPR:
|
||||||
pp_string (buffer, " VEC_PACK_TRUNC_EXPR < ");
|
pp_string (buffer, " VEC_PACK_TRUNC_EXPR < ");
|
||||||
dump_generic_node (buffer, TREE_OPERAND (node, 0), spc, flags, false);
|
dump_generic_node (buffer, TREE_OPERAND (node, 0), spc, flags, false);
|
||||||
|
@ -1959,6 +1971,14 @@ dump_generic_node (pretty_printer *buffer, tree node, int spc, int flags,
|
||||||
pp_string (buffer, " > ");
|
pp_string (buffer, " > ");
|
||||||
break;
|
break;
|
||||||
|
|
||||||
|
case VEC_PACK_FIX_TRUNC_EXPR:
|
||||||
|
pp_string (buffer, " VEC_PACK_FIX_TRUNC_EXPR < ");
|
||||||
|
dump_generic_node (buffer, TREE_OPERAND (node, 0), spc, flags, false);
|
||||||
|
pp_string (buffer, ", ");
|
||||||
|
dump_generic_node (buffer, TREE_OPERAND (node, 1), spc, flags, false);
|
||||||
|
pp_string (buffer, " > ");
|
||||||
|
break;
|
||||||
|
|
||||||
case BLOCK:
|
case BLOCK:
|
||||||
{
|
{
|
||||||
tree t;
|
tree t;
|
||||||
|
@ -2352,6 +2372,8 @@ op_prio (tree op)
|
||||||
case VEC_RSHIFT_EXPR:
|
case VEC_RSHIFT_EXPR:
|
||||||
case VEC_UNPACK_HI_EXPR:
|
case VEC_UNPACK_HI_EXPR:
|
||||||
case VEC_UNPACK_LO_EXPR:
|
case VEC_UNPACK_LO_EXPR:
|
||||||
|
case VEC_UNPACK_FLOAT_HI_EXPR:
|
||||||
|
case VEC_UNPACK_FLOAT_LO_EXPR:
|
||||||
case VEC_PACK_TRUNC_EXPR:
|
case VEC_PACK_TRUNC_EXPR:
|
||||||
case VEC_PACK_SAT_EXPR:
|
case VEC_PACK_SAT_EXPR:
|
||||||
return 16;
|
return 16;
|
||||||
|
|
|
@ -421,8 +421,11 @@ expand_vector_operations_1 (block_stmt_iterator *bsi)
|
||||||
|| code == VEC_WIDEN_MULT_LO_EXPR
|
|| code == VEC_WIDEN_MULT_LO_EXPR
|
||||||
|| code == VEC_UNPACK_HI_EXPR
|
|| code == VEC_UNPACK_HI_EXPR
|
||||||
|| code == VEC_UNPACK_LO_EXPR
|
|| code == VEC_UNPACK_LO_EXPR
|
||||||
|
|| code == VEC_UNPACK_FLOAT_HI_EXPR
|
||||||
|
|| code == VEC_UNPACK_FLOAT_LO_EXPR
|
||||||
|| code == VEC_PACK_TRUNC_EXPR
|
|| code == VEC_PACK_TRUNC_EXPR
|
||||||
|| code == VEC_PACK_SAT_EXPR)
|
|| code == VEC_PACK_SAT_EXPR
|
||||||
|
|| code == VEC_PACK_FIX_TRUNC_EXPR)
|
||||||
type = TREE_TYPE (TREE_OPERAND (rhs, 0));
|
type = TREE_TYPE (TREE_OPERAND (rhs, 0));
|
||||||
|
|
||||||
/* Optabs will try converting a negation into a subtraction, so
|
/* Optabs will try converting a negation into a subtraction, so
|
||||||
|
|
|
@ -1931,6 +1931,64 @@ vectorizable_call (tree stmt, block_stmt_iterator *bsi, tree *vec_stmt)
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
|
/* Function vect_gen_widened_results_half
|
||||||
|
|
||||||
|
Create a vector stmt whose code, type, number of arguments, and result
|
||||||
|
variable are CODE, VECTYPE, OP_TYPE, and VEC_DEST, and its arguments are
|
||||||
|
VEC_OPRND0 and VEC_OPRND1. The new vector stmt is to be inserted at BSI.
|
||||||
|
In the case that CODE is a CALL_EXPR, this means that a call to DECL
|
||||||
|
needs to be created (DECL is a function-decl of a target-builtin).
|
||||||
|
STMT is the original scalar stmt that we are vectorizing. */
|
||||||
|
|
||||||
|
static tree
|
||||||
|
vect_gen_widened_results_half (enum tree_code code, tree vectype, tree decl,
|
||||||
|
tree vec_oprnd0, tree vec_oprnd1, int op_type,
|
||||||
|
tree vec_dest, block_stmt_iterator *bsi,
|
||||||
|
tree stmt)
|
||||||
|
{
|
||||||
|
tree expr;
|
||||||
|
tree new_stmt;
|
||||||
|
tree new_temp;
|
||||||
|
tree sym;
|
||||||
|
ssa_op_iter iter;
|
||||||
|
|
||||||
|
/* Generate half of the widened result: */
|
||||||
|
if (code == CALL_EXPR)
|
||||||
|
{
|
||||||
|
/* Target specific support */
|
||||||
|
if (op_type == binary_op)
|
||||||
|
expr = build_call_expr (decl, 2, vec_oprnd0, vec_oprnd1);
|
||||||
|
else
|
||||||
|
expr = build_call_expr (decl, 1, vec_oprnd0);
|
||||||
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
/* Generic support */
|
||||||
|
gcc_assert (op_type == TREE_CODE_LENGTH (code));
|
||||||
|
if (op_type == binary_op)
|
||||||
|
expr = build2 (code, vectype, vec_oprnd0, vec_oprnd1);
|
||||||
|
else
|
||||||
|
expr = build1 (code, vectype, vec_oprnd0);
|
||||||
|
}
|
||||||
|
new_stmt = build_gimple_modify_stmt (vec_dest, expr);
|
||||||
|
new_temp = make_ssa_name (vec_dest, new_stmt);
|
||||||
|
GIMPLE_STMT_OPERAND (new_stmt, 0) = new_temp;
|
||||||
|
vect_finish_stmt_generation (stmt, new_stmt, bsi);
|
||||||
|
|
||||||
|
if (code == CALL_EXPR)
|
||||||
|
{
|
||||||
|
FOR_EACH_SSA_TREE_OPERAND (sym, new_stmt, iter, SSA_OP_ALL_VIRTUALS)
|
||||||
|
{
|
||||||
|
if (TREE_CODE (sym) == SSA_NAME)
|
||||||
|
sym = SSA_NAME_VAR (sym);
|
||||||
|
mark_sym_for_renaming (sym);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
return new_stmt;
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
/* Function vectorizable_conversion.
|
/* Function vectorizable_conversion.
|
||||||
|
|
||||||
Check if STMT performs a conversion operation, that can be vectorized.
|
Check if STMT performs a conversion operation, that can be vectorized.
|
||||||
|
@ -1946,21 +2004,24 @@ vectorizable_conversion (tree stmt, block_stmt_iterator * bsi,
|
||||||
tree scalar_dest;
|
tree scalar_dest;
|
||||||
tree operation;
|
tree operation;
|
||||||
tree op0;
|
tree op0;
|
||||||
tree vec_oprnd0 = NULL_TREE;
|
tree vec_oprnd0 = NULL_TREE, vec_oprnd1 = NULL_TREE;
|
||||||
stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
|
stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
|
||||||
loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
|
loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
|
||||||
enum tree_code code;
|
enum tree_code code, code1 = CODE_FOR_nothing, code2 = CODE_FOR_nothing;
|
||||||
|
tree decl1 = NULL_TREE, decl2 = NULL_TREE;
|
||||||
tree new_temp;
|
tree new_temp;
|
||||||
tree def, def_stmt;
|
tree def, def_stmt;
|
||||||
enum vect_def_type dt0;
|
enum vect_def_type dt0;
|
||||||
tree new_stmt;
|
tree new_stmt;
|
||||||
|
stmt_vec_info prev_stmt_info;
|
||||||
int nunits_in;
|
int nunits_in;
|
||||||
int nunits_out;
|
int nunits_out;
|
||||||
int ncopies, j;
|
|
||||||
tree vectype_out, vectype_in;
|
tree vectype_out, vectype_in;
|
||||||
|
int ncopies, j;
|
||||||
|
tree expr;
|
||||||
tree rhs_type, lhs_type;
|
tree rhs_type, lhs_type;
|
||||||
tree builtin_decl;
|
tree builtin_decl;
|
||||||
stmt_vec_info prev_stmt_info;
|
enum { NARROW, NONE, WIDEN } modifier;
|
||||||
|
|
||||||
/* Is STMT a vectorizable conversion? */
|
/* Is STMT a vectorizable conversion? */
|
||||||
|
|
||||||
|
@ -1998,23 +2059,36 @@ vectorizable_conversion (tree stmt, block_stmt_iterator * bsi,
|
||||||
scalar_dest = GIMPLE_STMT_OPERAND (stmt, 0);
|
scalar_dest = GIMPLE_STMT_OPERAND (stmt, 0);
|
||||||
lhs_type = TREE_TYPE (scalar_dest);
|
lhs_type = TREE_TYPE (scalar_dest);
|
||||||
vectype_out = get_vectype_for_scalar_type (lhs_type);
|
vectype_out = get_vectype_for_scalar_type (lhs_type);
|
||||||
gcc_assert (STMT_VINFO_VECTYPE (stmt_info) == vectype_out);
|
|
||||||
nunits_out = TYPE_VECTOR_SUBPARTS (vectype_out);
|
nunits_out = TYPE_VECTOR_SUBPARTS (vectype_out);
|
||||||
|
|
||||||
/* FORNOW: need to extend to support short<->float conversions as well. */
|
/* FORNOW */
|
||||||
if (nunits_out != nunits_in)
|
if (nunits_in == nunits_out / 2)
|
||||||
|
modifier = NARROW;
|
||||||
|
else if (nunits_out == nunits_in)
|
||||||
|
modifier = NONE;
|
||||||
|
else if (nunits_out == nunits_in / 2)
|
||||||
|
modifier = WIDEN;
|
||||||
|
else
|
||||||
return false;
|
return false;
|
||||||
|
|
||||||
|
if (modifier == NONE)
|
||||||
|
gcc_assert (STMT_VINFO_VECTYPE (stmt_info) == vectype_out);
|
||||||
|
|
||||||
/* Bail out if the types are both integral or non-integral */
|
/* Bail out if the types are both integral or non-integral */
|
||||||
if ((INTEGRAL_TYPE_P (rhs_type) && INTEGRAL_TYPE_P (lhs_type))
|
if ((INTEGRAL_TYPE_P (rhs_type) && INTEGRAL_TYPE_P (lhs_type))
|
||||||
|| (!INTEGRAL_TYPE_P (rhs_type) && !INTEGRAL_TYPE_P (lhs_type)))
|
|| (!INTEGRAL_TYPE_P (rhs_type) && !INTEGRAL_TYPE_P (lhs_type)))
|
||||||
return false;
|
return false;
|
||||||
|
|
||||||
|
if (modifier == NARROW)
|
||||||
|
ncopies = LOOP_VINFO_VECT_FACTOR (loop_vinfo) / nunits_out;
|
||||||
|
else
|
||||||
|
ncopies = LOOP_VINFO_VECT_FACTOR (loop_vinfo) / nunits_in;
|
||||||
|
|
||||||
/* Sanity check: make sure that at least one copy of the vectorized stmt
|
/* Sanity check: make sure that at least one copy of the vectorized stmt
|
||||||
needs to be generated. */
|
needs to be generated. */
|
||||||
ncopies = LOOP_VINFO_VECT_FACTOR (loop_vinfo) / nunits_in;
|
|
||||||
gcc_assert (ncopies >= 1);
|
gcc_assert (ncopies >= 1);
|
||||||
|
|
||||||
|
/* Check the operands of the operation. */
|
||||||
if (!vect_is_simple_use (op0, loop_vinfo, &def_stmt, &def, &dt0))
|
if (!vect_is_simple_use (op0, loop_vinfo, &def_stmt, &def, &dt0))
|
||||||
{
|
{
|
||||||
if (vect_print_dump_info (REPORT_DETAILS))
|
if (vect_print_dump_info (REPORT_DETAILS))
|
||||||
|
@ -2023,13 +2097,24 @@ vectorizable_conversion (tree stmt, block_stmt_iterator * bsi,
|
||||||
}
|
}
|
||||||
|
|
||||||
/* Supportable by target? */
|
/* Supportable by target? */
|
||||||
if (!targetm.vectorize.builtin_conversion (code, vectype_in))
|
if ((modifier == NONE
|
||||||
|
&& !targetm.vectorize.builtin_conversion (code, vectype_in))
|
||||||
|
|| (modifier == WIDEN
|
||||||
|
&& !supportable_widening_operation (code, stmt, vectype_in,
|
||||||
|
&decl1, &decl2,
|
||||||
|
&code1, &code2))
|
||||||
|
|| (modifier == NARROW
|
||||||
|
&& !supportable_narrowing_operation (code, stmt, vectype_in,
|
||||||
|
&code1)))
|
||||||
{
|
{
|
||||||
if (vect_print_dump_info (REPORT_DETAILS))
|
if (vect_print_dump_info (REPORT_DETAILS))
|
||||||
fprintf (vect_dump, "op not supported by target.");
|
fprintf (vect_dump, "op not supported by target.");
|
||||||
return false;
|
return false;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
if (modifier != NONE)
|
||||||
|
STMT_VINFO_VECTYPE (stmt_info) = vectype_in;
|
||||||
|
|
||||||
if (!vec_stmt) /* transformation not required. */
|
if (!vec_stmt) /* transformation not required. */
|
||||||
{
|
{
|
||||||
STMT_VINFO_TYPE (stmt_info) = type_conversion_vec_info_type;
|
STMT_VINFO_TYPE (stmt_info) = type_conversion_vec_info_type;
|
||||||
|
@ -2037,7 +2122,6 @@ vectorizable_conversion (tree stmt, block_stmt_iterator * bsi,
|
||||||
}
|
}
|
||||||
|
|
||||||
/** Transform. **/
|
/** Transform. **/
|
||||||
|
|
||||||
if (vect_print_dump_info (REPORT_DETAILS))
|
if (vect_print_dump_info (REPORT_DETAILS))
|
||||||
fprintf (vect_dump, "transform conversion.");
|
fprintf (vect_dump, "transform conversion.");
|
||||||
|
|
||||||
|
@ -2045,6 +2129,9 @@ vectorizable_conversion (tree stmt, block_stmt_iterator * bsi,
|
||||||
vec_dest = vect_create_destination_var (scalar_dest, vectype_out);
|
vec_dest = vect_create_destination_var (scalar_dest, vectype_out);
|
||||||
|
|
||||||
prev_stmt_info = NULL;
|
prev_stmt_info = NULL;
|
||||||
|
switch (modifier)
|
||||||
|
{
|
||||||
|
case NONE:
|
||||||
for (j = 0; j < ncopies; j++)
|
for (j = 0; j < ncopies; j++)
|
||||||
{
|
{
|
||||||
tree sym;
|
tree sym;
|
||||||
|
@ -2077,6 +2164,79 @@ vectorizable_conversion (tree stmt, block_stmt_iterator * bsi,
|
||||||
STMT_VINFO_RELATED_STMT (prev_stmt_info) = new_stmt;
|
STMT_VINFO_RELATED_STMT (prev_stmt_info) = new_stmt;
|
||||||
prev_stmt_info = vinfo_for_stmt (new_stmt);
|
prev_stmt_info = vinfo_for_stmt (new_stmt);
|
||||||
}
|
}
|
||||||
|
break;
|
||||||
|
|
||||||
|
case WIDEN:
|
||||||
|
/* In case the vectorization factor (VF) is bigger than the number
|
||||||
|
of elements that we can fit in a vectype (nunits), we have to
|
||||||
|
generate more than one vector stmt - i.e - we need to "unroll"
|
||||||
|
the vector stmt by a factor VF/nunits. */
|
||||||
|
for (j = 0; j < ncopies; j++)
|
||||||
|
{
|
||||||
|
if (j == 0)
|
||||||
|
vec_oprnd0 = vect_get_vec_def_for_operand (op0, stmt, NULL);
|
||||||
|
else
|
||||||
|
vec_oprnd0 = vect_get_vec_def_for_stmt_copy (dt0, vec_oprnd0);
|
||||||
|
|
||||||
|
STMT_VINFO_VECTYPE (stmt_info) = vectype_in;
|
||||||
|
|
||||||
|
/* Generate first half of the widened result: */
|
||||||
|
new_stmt
|
||||||
|
= vect_gen_widened_results_half (code1, vectype_out, decl1,
|
||||||
|
vec_oprnd0, vec_oprnd1,
|
||||||
|
unary_op, vec_dest, bsi, stmt);
|
||||||
|
if (j == 0)
|
||||||
|
STMT_VINFO_VEC_STMT (stmt_info) = new_stmt;
|
||||||
|
else
|
||||||
|
STMT_VINFO_RELATED_STMT (prev_stmt_info) = new_stmt;
|
||||||
|
prev_stmt_info = vinfo_for_stmt (new_stmt);
|
||||||
|
|
||||||
|
/* Generate second half of the widened result: */
|
||||||
|
new_stmt
|
||||||
|
= vect_gen_widened_results_half (code2, vectype_out, decl2,
|
||||||
|
vec_oprnd0, vec_oprnd1,
|
||||||
|
unary_op, vec_dest, bsi, stmt);
|
||||||
|
STMT_VINFO_RELATED_STMT (prev_stmt_info) = new_stmt;
|
||||||
|
prev_stmt_info = vinfo_for_stmt (new_stmt);
|
||||||
|
}
|
||||||
|
break;
|
||||||
|
|
||||||
|
case NARROW:
|
||||||
|
/* In case the vectorization factor (VF) is bigger than the number
|
||||||
|
of elements that we can fit in a vectype (nunits), we have to
|
||||||
|
generate more than one vector stmt - i.e - we need to "unroll"
|
||||||
|
the vector stmt by a factor VF/nunits. */
|
||||||
|
for (j = 0; j < ncopies; j++)
|
||||||
|
{
|
||||||
|
/* Handle uses. */
|
||||||
|
if (j == 0)
|
||||||
|
{
|
||||||
|
vec_oprnd0 = vect_get_vec_def_for_operand (op0, stmt, NULL);
|
||||||
|
vec_oprnd1 = vect_get_vec_def_for_stmt_copy (dt0, vec_oprnd0);
|
||||||
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
vec_oprnd0 = vect_get_vec_def_for_stmt_copy (dt0, vec_oprnd1);
|
||||||
|
vec_oprnd1 = vect_get_vec_def_for_stmt_copy (dt0, vec_oprnd0);
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Arguments are ready. Create the new vector stmt. */
|
||||||
|
expr = build2 (code1, vectype_out, vec_oprnd0, vec_oprnd1);
|
||||||
|
new_stmt = build_gimple_modify_stmt (vec_dest, expr);
|
||||||
|
new_temp = make_ssa_name (vec_dest, new_stmt);
|
||||||
|
GIMPLE_STMT_OPERAND (new_stmt, 0) = new_temp;
|
||||||
|
vect_finish_stmt_generation (stmt, new_stmt, bsi);
|
||||||
|
|
||||||
|
if (j == 0)
|
||||||
|
STMT_VINFO_VEC_STMT (stmt_info) = new_stmt;
|
||||||
|
else
|
||||||
|
STMT_VINFO_RELATED_STMT (prev_stmt_info) = new_stmt;
|
||||||
|
|
||||||
|
prev_stmt_info = vinfo_for_stmt (new_stmt);
|
||||||
|
}
|
||||||
|
|
||||||
|
*vec_stmt = STMT_VINFO_VEC_STMT (stmt_info);
|
||||||
|
}
|
||||||
return true;
|
return true;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
@ -2534,7 +2694,7 @@ vectorizable_type_demotion (tree stmt, block_stmt_iterator *bsi,
|
||||||
tree vec_oprnd0=NULL, vec_oprnd1=NULL;
|
tree vec_oprnd0=NULL, vec_oprnd1=NULL;
|
||||||
stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
|
stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
|
||||||
loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
|
loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
|
||||||
enum tree_code code;
|
enum tree_code code, code1 = CODE_FOR_nothing;
|
||||||
tree new_temp;
|
tree new_temp;
|
||||||
tree def, def_stmt;
|
tree def, def_stmt;
|
||||||
enum vect_def_type dt0;
|
enum vect_def_type dt0;
|
||||||
|
@ -2548,8 +2708,6 @@ vectorizable_type_demotion (tree stmt, block_stmt_iterator *bsi,
|
||||||
tree expr;
|
tree expr;
|
||||||
tree vectype_in;
|
tree vectype_in;
|
||||||
tree scalar_type;
|
tree scalar_type;
|
||||||
optab optab;
|
|
||||||
enum machine_mode vec_mode;
|
|
||||||
|
|
||||||
if (!STMT_VINFO_RELEVANT_P (stmt_info))
|
if (!STMT_VINFO_RELEVANT_P (stmt_info))
|
||||||
return false;
|
return false;
|
||||||
|
@ -2607,13 +2765,7 @@ vectorizable_type_demotion (tree stmt, block_stmt_iterator *bsi,
|
||||||
}
|
}
|
||||||
|
|
||||||
/* Supportable by target? */
|
/* Supportable by target? */
|
||||||
code = VEC_PACK_TRUNC_EXPR;
|
if (!supportable_narrowing_operation (code, stmt, vectype_in, &code1))
|
||||||
optab = optab_for_tree_code (code, vectype_in);
|
|
||||||
if (!optab)
|
|
||||||
return false;
|
|
||||||
|
|
||||||
vec_mode = TYPE_MODE (vectype_in);
|
|
||||||
if (optab->handlers[(int) vec_mode].insn_code == CODE_FOR_nothing)
|
|
||||||
return false;
|
return false;
|
||||||
|
|
||||||
STMT_VINFO_VECTYPE (stmt_info) = vectype_in;
|
STMT_VINFO_VECTYPE (stmt_info) = vectype_in;
|
||||||
|
@ -2652,7 +2804,7 @@ vectorizable_type_demotion (tree stmt, block_stmt_iterator *bsi,
|
||||||
}
|
}
|
||||||
|
|
||||||
/* Arguments are ready. Create the new vector stmt. */
|
/* Arguments are ready. Create the new vector stmt. */
|
||||||
expr = build2 (code, vectype_out, vec_oprnd0, vec_oprnd1);
|
expr = build2 (code1, vectype_out, vec_oprnd0, vec_oprnd1);
|
||||||
new_stmt = build_gimple_modify_stmt (vec_dest, expr);
|
new_stmt = build_gimple_modify_stmt (vec_dest, expr);
|
||||||
new_temp = make_ssa_name (vec_dest, new_stmt);
|
new_temp = make_ssa_name (vec_dest, new_stmt);
|
||||||
GIMPLE_STMT_OPERAND (new_stmt, 0) = new_temp;
|
GIMPLE_STMT_OPERAND (new_stmt, 0) = new_temp;
|
||||||
|
@ -2671,64 +2823,6 @@ vectorizable_type_demotion (tree stmt, block_stmt_iterator *bsi,
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
/* Function vect_gen_widened_results_half
|
|
||||||
|
|
||||||
Create a vector stmt whose code, type, number of arguments, and result
|
|
||||||
variable are CODE, VECTYPE, OP_TYPE, and VEC_DEST, and its arguments are
|
|
||||||
VEC_OPRND0 and VEC_OPRND1. The new vector stmt is to be inserted at BSI.
|
|
||||||
In the case that CODE is a CALL_EXPR, this means that a call to DECL
|
|
||||||
needs to be created (DECL is a function-decl of a target-builtin).
|
|
||||||
STMT is the original scalar stmt that we are vectorizing. */
|
|
||||||
|
|
||||||
static tree
|
|
||||||
vect_gen_widened_results_half (enum tree_code code, tree vectype, tree decl,
|
|
||||||
tree vec_oprnd0, tree vec_oprnd1, int op_type,
|
|
||||||
tree vec_dest, block_stmt_iterator *bsi,
|
|
||||||
tree stmt)
|
|
||||||
{
|
|
||||||
tree expr;
|
|
||||||
tree new_stmt;
|
|
||||||
tree new_temp;
|
|
||||||
tree sym;
|
|
||||||
ssa_op_iter iter;
|
|
||||||
|
|
||||||
/* Generate half of the widened result: */
|
|
||||||
if (code == CALL_EXPR)
|
|
||||||
{
|
|
||||||
/* Target specific support */
|
|
||||||
if (op_type == binary_op)
|
|
||||||
expr = build_call_expr (decl, 2, vec_oprnd0, vec_oprnd1);
|
|
||||||
else
|
|
||||||
expr = build_call_expr (decl, 1, vec_oprnd0);
|
|
||||||
}
|
|
||||||
else
|
|
||||||
{
|
|
||||||
/* Generic support */
|
|
||||||
gcc_assert (op_type == TREE_CODE_LENGTH (code));
|
|
||||||
if (op_type == binary_op)
|
|
||||||
expr = build2 (code, vectype, vec_oprnd0, vec_oprnd1);
|
|
||||||
else
|
|
||||||
expr = build1 (code, vectype, vec_oprnd0);
|
|
||||||
}
|
|
||||||
new_stmt = build_gimple_modify_stmt (vec_dest, expr);
|
|
||||||
new_temp = make_ssa_name (vec_dest, new_stmt);
|
|
||||||
GIMPLE_STMT_OPERAND (new_stmt, 0) = new_temp;
|
|
||||||
vect_finish_stmt_generation (stmt, new_stmt, bsi);
|
|
||||||
|
|
||||||
if (code == CALL_EXPR)
|
|
||||||
{
|
|
||||||
FOR_EACH_SSA_TREE_OPERAND (sym, new_stmt, iter, SSA_OP_ALL_VIRTUALS)
|
|
||||||
{
|
|
||||||
if (TREE_CODE (sym) == SSA_NAME)
|
|
||||||
sym = SSA_NAME_VAR (sym);
|
|
||||||
mark_sym_for_renaming (sym);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
return new_stmt;
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
/* Function vectorizable_type_promotion
|
/* Function vectorizable_type_promotion
|
||||||
|
|
||||||
Check if STMT performs a binary or unary operation that involves
|
Check if STMT performs a binary or unary operation that involves
|
||||||
|
@ -2785,7 +2879,8 @@ vectorizable_type_promotion (tree stmt, block_stmt_iterator *bsi,
|
||||||
|
|
||||||
operation = GIMPLE_STMT_OPERAND (stmt, 1);
|
operation = GIMPLE_STMT_OPERAND (stmt, 1);
|
||||||
code = TREE_CODE (operation);
|
code = TREE_CODE (operation);
|
||||||
if (code != NOP_EXPR && code != WIDEN_MULT_EXPR)
|
if (code != NOP_EXPR && code != CONVERT_EXPR
|
||||||
|
&& code != WIDEN_MULT_EXPR)
|
||||||
return false;
|
return false;
|
||||||
|
|
||||||
op0 = TREE_OPERAND (operation, 0);
|
op0 = TREE_OPERAND (operation, 0);
|
||||||
|
|
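Note on the vectorizable_conversion hunks above: the new NARROW / NONE / WIDEN classification and the ncopies computation can be modelled outside of GCC. The following is only a minimal standalone sketch, not code from the patch. The function names classify_conversion and copies_needed and the UNSUPPORTED enumerator are hypothetical, introduced here for illustration; the comparisons themselves follow the diff.

```c
#include <assert.h>

/* NARROW, NONE and WIDEN mirror the local enum added by the patch.
   UNSUPPORTED is a hypothetical stand-in for the "return false" path.  */
enum modifier { NARROW, NONE, WIDEN, UNSUPPORTED };

/* Classify a conversion by comparing the number of elements in the
   input vector type with the number in the output vector type.  */
enum modifier
classify_conversion (int nunits_in, int nunits_out)
{
  assert (nunits_in > 0 && nunits_out > 0);

  if (nunits_in == nunits_out / 2)
    return NARROW;              /* e.g. 2 doubles in, 4 ints out */
  else if (nunits_out == nunits_in)
    return NONE;                /* e.g. 4 ints in, 4 floats out */
  else if (nunits_out == nunits_in / 2)
    return WIDEN;               /* e.g. 8 shorts in, 4 floats out */
  else
    return UNSUPPORTED;
}

/* Number of vector stmt copies for vectorization factor VF: the diff
   divides by nunits_out when narrowing and by nunits_in otherwise.  */
int
copies_needed (enum modifier m, int vf, int nunits_in, int nunits_out)
{
  return m == NARROW ? vf / nunits_out : vf / nunits_in;
}
```

In the WIDEN case each copy produces two vector statements (one per half of the widened result), which is why the transform calls vect_gen_widened_results_half twice per iteration.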
|
@ -1736,10 +1736,10 @@ vect_is_simple_use (tree operand, loop_vec_info loop_vinfo, tree *def_stmt,
|
||||||
widening operation that is supported by the target platform in
|
widening operation that is supported by the target platform in
|
||||||
vector form (i.e., when operating on arguments of type VECTYPE).
|
vector form (i.e., when operating on arguments of type VECTYPE).
|
||||||
|
|
||||||
The two kinds of widening operations we currently support are
|
Widening operations we currently support are NOP (CONVERT), FLOAT
|
||||||
NOP and WIDEN_MULT. This function checks if these operations
|
and WIDEN_MULT. This function checks if these operations are supported
|
||||||
are supported by the target platform either directly (via vector
|
by the target platform either directly (via vector tree-codes), or via
|
||||||
tree-codes), or via target builtins.
|
target builtins.
|
||||||
|
|
||||||
Output:
|
Output:
|
||||||
- CODE1 and CODE2 are codes of vector operations to be used when
|
- CODE1 and CODE2 are codes of vector operations to be used when
|
||||||
|
@ -1815,6 +1815,7 @@ supportable_widening_operation (enum tree_code code, tree stmt, tree vectype,
|
||||||
break;
|
break;
|
||||||
|
|
||||||
case NOP_EXPR:
|
case NOP_EXPR:
|
||||||
|
case CONVERT_EXPR:
|
||||||
if (BYTES_BIG_ENDIAN)
|
if (BYTES_BIG_ENDIAN)
|
||||||
{
|
{
|
||||||
c1 = VEC_UNPACK_HI_EXPR;
|
c1 = VEC_UNPACK_HI_EXPR;
|
||||||
|
@ -1827,6 +1828,19 @@ supportable_widening_operation (enum tree_code code, tree stmt, tree vectype,
|
||||||
}
|
}
|
||||||
break;
|
break;
|
||||||
|
|
||||||
|
case FLOAT_EXPR:
|
||||||
|
if (BYTES_BIG_ENDIAN)
|
||||||
|
{
|
||||||
|
c1 = VEC_UNPACK_FLOAT_HI_EXPR;
|
||||||
|
c2 = VEC_UNPACK_FLOAT_LO_EXPR;
|
||||||
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
c2 = VEC_UNPACK_FLOAT_HI_EXPR;
|
||||||
|
c1 = VEC_UNPACK_FLOAT_LO_EXPR;
|
||||||
|
}
|
||||||
|
break;
|
||||||
|
|
||||||
default:
|
default:
|
||||||
gcc_unreachable ();
|
gcc_unreachable ();
|
||||||
}
|
}
|
||||||
|
@ -1851,6 +1865,63 @@ supportable_widening_operation (enum tree_code code, tree stmt, tree vectype,
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
|
/* Function supportable_narrowing_operation
|
||||||
|
|
||||||
|
Check whether an operation represented by the code CODE is a
|
||||||
|
narrowing operation that is supported by the target platform in
|
||||||
|
vector form (i.e., when operating on arguments of type VECTYPE).
|
||||||
|
|
||||||
|
Narrowing operations we currently support are NOP (CONVERT) and
|
||||||
|
FIX_TRUNC. This function checks if these operations are supported by
|
||||||
|
the target platform directly via vector tree-codes.
|
||||||
|
|
||||||
|
Output:
|
||||||
|
- CODE1 is the code of a vector operation to be used when
|
||||||
|
vectorizing the operation, if available. */
|
||||||
|
|
||||||
|
bool
|
||||||
|
supportable_narrowing_operation (enum tree_code code,
|
||||||
|
tree stmt, tree vectype,
|
||||||
|
enum tree_code *code1)
|
||||||
|
{
|
||||||
|
enum machine_mode vec_mode;
|
||||||
|
enum insn_code icode1;
|
||||||
|
optab optab1;
|
||||||
|
tree expr = GIMPLE_STMT_OPERAND (stmt, 1);
|
||||||
|
tree type = TREE_TYPE (expr);
|
||||||
|
tree narrow_vectype = get_vectype_for_scalar_type (type);
|
||||||
|
enum tree_code c1;
|
||||||
|
|
||||||
|
switch (code)
|
||||||
|
{
|
||||||
|
case NOP_EXPR:
|
||||||
|
case CONVERT_EXPR:
|
||||||
|
c1 = VEC_PACK_TRUNC_EXPR;
|
||||||
|
break;
|
||||||
|
|
||||||
|
case FIX_TRUNC_EXPR:
|
||||||
|
c1 = VEC_PACK_FIX_TRUNC_EXPR;
|
||||||
|
break;
|
||||||
|
|
||||||
|
default:
|
||||||
|
gcc_unreachable ();
|
||||||
|
}
|
||||||
|
|
||||||
|
*code1 = c1;
|
||||||
|
optab1 = optab_for_tree_code (c1, vectype);
|
||||||
|
|
||||||
|
if (!optab1)
|
||||||
|
return false;
|
||||||
|
|
||||||
|
vec_mode = TYPE_MODE (vectype);
|
||||||
|
if ((icode1 = optab1->handlers[(int) vec_mode].insn_code) == CODE_FOR_nothing
|
||||||
|
|| insn_data[icode1].operand[0].mode != TYPE_MODE (narrow_vectype))
|
||||||
|
return false;
|
||||||
|
|
||||||
|
return true;
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
/* Function reduction_code_for_scalar_code
|
/* Function reduction_code_for_scalar_code
|
||||||
|
|
||||||
Input:
|
Input:
|
||||||
|
|
|
@ -398,6 +398,9 @@ extern enum dr_alignment_support vect_supportable_dr_alignment
|
||||||
extern bool reduction_code_for_scalar_code (enum tree_code, enum tree_code *);
|
extern bool reduction_code_for_scalar_code (enum tree_code, enum tree_code *);
|
||||||
extern bool supportable_widening_operation (enum tree_code, tree, tree,
|
extern bool supportable_widening_operation (enum tree_code, tree, tree,
|
||||||
tree *, tree *, enum tree_code *, enum tree_code *);
|
tree *, tree *, enum tree_code *, enum tree_code *);
|
||||||
|
extern bool supportable_narrowing_operation (enum tree_code, tree, tree,
|
||||||
|
enum tree_code *);
|
||||||
|
|
||||||
/* Creation and deletion of loop and stmt info structs. */
|
/* Creation and deletion of loop and stmt info structs. */
|
||||||
extern loop_vec_info new_loop_vec_info (struct loop *loop);
|
extern loop_vec_info new_loop_vec_info (struct loop *loop);
|
||||||
extern void destroy_loop_vec_info (loop_vec_info);
|
extern void destroy_loop_vec_info (loop_vec_info);
|
||||||
|
|
19
gcc/tree.def
|
@ -1085,13 +1085,20 @@ DEFTREECODE (GIMPLE_MODIFY_STMT, "gimple_modify_stmt", tcc_gimple_stmt, 2)
|
||||||
DEFTREECODE (VEC_WIDEN_MULT_HI_EXPR, "widen_mult_hi_expr", tcc_binary, 2)
|
DEFTREECODE (VEC_WIDEN_MULT_HI_EXPR, "widen_mult_hi_expr", tcc_binary, 2)
|
||||||
DEFTREECODE (VEC_WIDEN_MULT_LO_EXPR, "widen_mult_hi_expr", tcc_binary, 2)
|
DEFTREECODE (VEC_WIDEN_MULT_LO_EXPR, "widen_mult_hi_expr", tcc_binary, 2)
|
||||||
|
|
||||||
/* Unpack (extract and promote/widen) the high/low elements of the input vector
|
/* Unpack (extract and promote/widen) the high/low elements of the input
|
||||||
into the output vector. The input vector has twice as many elements
|
vector into the output vector. The input vector has twice as many
|
||||||
as the output vector, that are half the size of the elements
|
elements as the output vector, that are half the size of the elements
|
||||||
of the output vector. This is used to support type promotion. */
|
of the output vector. This is used to support type promotion. */
|
||||||
DEFTREECODE (VEC_UNPACK_HI_EXPR, "vec_unpack_hi_expr", tcc_unary, 1)
|
DEFTREECODE (VEC_UNPACK_HI_EXPR, "vec_unpack_hi_expr", tcc_unary, 1)
|
||||||
DEFTREECODE (VEC_UNPACK_LO_EXPR, "vec_unpack_lo_expr", tcc_unary, 1)
|
DEFTREECODE (VEC_UNPACK_LO_EXPR, "vec_unpack_lo_expr", tcc_unary, 1)
|
||||||
|
|
||||||
|
/* Unpack (extract) the high/low elements of the input vector, convert
|
||||||
|
fixed point values to floating point and widen elements into the
|
||||||
|
output vector. The input vector has twice as many elements as the output
|
||||||
|
vector, that are half the size of the elements of the output vector. */
|
||||||
|
DEFTREECODE (VEC_UNPACK_FLOAT_HI_EXPR, "vec_unpack_float_hi_expr", tcc_unary, 1)
|
||||||
|
DEFTREECODE (VEC_UNPACK_FLOAT_LO_EXPR, "vec_unpack_float_lo_expr", tcc_unary, 1)
|
||||||
|
|
||||||
/* Pack (demote/narrow and merge) the elements of the two input vectors
|
/* Pack (demote/narrow and merge) the elements of the two input vectors
|
||||||
into the output vector using truncation/saturation.
|
into the output vector using truncation/saturation.
|
||||||
The elements of the input vectors are twice the size of the elements of the
|
The elements of the input vectors are twice the size of the elements of the
|
||||||
|
@ -1099,6 +1106,12 @@ DEFTREECODE (VEC_UNPACK_LO_EXPR, "vec_unpack_lo_expr", tcc_unary, 1)
|
||||||
DEFTREECODE (VEC_PACK_TRUNC_EXPR, "vec_pack_trunc_expr", tcc_binary, 2)
|
DEFTREECODE (VEC_PACK_TRUNC_EXPR, "vec_pack_trunc_expr", tcc_binary, 2)
|
||||||
DEFTREECODE (VEC_PACK_SAT_EXPR, "vec_pack_sat_expr", tcc_binary, 2)
|
DEFTREECODE (VEC_PACK_SAT_EXPR, "vec_pack_sat_expr", tcc_binary, 2)
|
||||||
|
|
||||||
|
/* Convert floating point values of the two input vectors to integer
|
||||||
|
and pack (narrow and merge) the elements into the output vector. The
|
||||||
|
elements of the input vector are twice the size of the elements of
|
||||||
|
the output vector. */
|
||||||
|
DEFTREECODE (VEC_PACK_FIX_TRUNC_EXPR, "vec_pack_fix_trunc_expr", tcc_binary, 2)
|
||||||
|
|
||||||
/* Extract even/odd fields from vectors. */
|
/* Extract even/odd fields from vectors. */
|
||||||
DEFTREECODE (VEC_EXTRACT_EVEN_EXPR, "vec_extracteven_expr", tcc_binary, 2)
|
DEFTREECODE (VEC_EXTRACT_EVEN_EXPR, "vec_extracteven_expr", tcc_binary, 2)
|
||||||
DEFTREECODE (VEC_EXTRACT_ODD_EXPR, "vec_extractodd_expr", tcc_binary, 2)
|
DEFTREECODE (VEC_EXTRACT_ODD_EXPR, "vec_extractodd_expr", tcc_binary, 2)
|
||||||
|
|
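Note on the new tree codes in gcc/tree.def: their element-wise meaning, as described in the comments above, can be sketched with plain scalar loops. This is an illustrative model only, not how GCC expands these codes. The element counts (8 shorts, 2 doubles) are example sizes, the helper names are hypothetical, and the sketch assumes that "lo" means the lower-indexed half; in the patch the choice of HI versus LO for the first half depends on BYTES_BIG_ENDIAN.

```c
#include <assert.h>

#define IN_ELTS  8   /* example: one vector of 8 shorts */
#define OUT_ELTS 4   /* example: one vector of 4 floats */

/* Model of VEC_UNPACK_FLOAT_LO_EXPR: take the lower-indexed half of the
   input (an assumption, see the note above) and convert each integer
   element to a float of twice the width.  */
void
unpack_float_lo (const short in[IN_ELTS], float out[OUT_ELTS])
{
  assert (in && out);
  for (int i = 0; i < OUT_ELTS; i++)
    out[i] = (float) in[i];
}

/* Model of VEC_UNPACK_FLOAT_HI_EXPR: same, for the other half.  */
void
unpack_float_hi (const short in[IN_ELTS], float out[OUT_ELTS])
{
  assert (in && out);
  for (int i = 0; i < OUT_ELTS; i++)
    out[i] = (float) in[i + OUT_ELTS];
}

/* Model of VEC_PACK_FIX_TRUNC_EXPR: convert the floating point elements
   of two input vectors to integers (truncating toward zero) and merge
   them into one output vector with twice as many, narrower elements.  */
void
pack_fix_trunc (const double a[2], const double b[2], int out[4])
{
  assert (a && b && out);
  for (int i = 0; i < 2; i++)
    {
      out[i] = (int) a[i];
      out[i + 2] = (int) b[i];
    }
}
```

The two unpack helpers together cover one full input vector, which matches the unary tree codes being emitted in pairs; the pack helper takes two inputs, which matches VEC_PACK_FIX_TRUNC_EXPR being declared tcc_binary.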