[AArch64] Use SVE binary immediate instructions for conditional arithmetic
This patch lets us use the immediate forms of FADD, FSUB, FSUBR,
FMUL, FMAXNM and FMINNM for conditional arithmetic. (We already
use them for normal unconditional arithmetic.)

2019-08-15 Richard Sandiford <richard.sandiford@arm.com>
           Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>

gcc/
        * config/aarch64/aarch64.c (aarch64_print_vector_float_operand):
        Print 2.0 naturally.
        (aarch64_sve_float_mul_immediate_p): Return true for 2.0.
        * config/aarch64/predicates.md
        (aarch64_sve_float_negated_arith_immediate): New predicate,
        renamed from aarch64_sve_float_arith_with_sub_immediate.
        (aarch64_sve_float_arith_with_sub_immediate): Test for both
        positive and negative constants.
        (aarch64_sve_float_arith_with_sub_operand): Redefine as a register
        or an aarch64_sve_float_arith_with_sub_immediate.
        * config/aarch64/constraints.md (vsN): Use
        aarch64_sve_float_negated_arith_immediate.
        * config/aarch64/iterators.md (SVE_COND_FP_BINARY_I1): New int
        iterator.
        (sve_pred_fp_rhs2_immediate): New int attribute.
        * config/aarch64/aarch64-sve.md
        (cond_<SVE_COND_FP_BINARY:optab><SVE_F:mode>): Use
        sve_pred_fp_rhs1_operand and sve_pred_fp_rhs2_operand.
        (*cond_<SVE_COND_FP_BINARY_I1:optab><SVE_F:mode>_2_const)
        (*cond_<SVE_COND_FP_BINARY_I1:optab><SVE_F:mode>_any_const)
        (*cond_add<SVE_F:mode>_2_const, *cond_add<SVE_F:mode>_any_const)
        (*cond_sub<mode>_3_const, *cond_sub<mode>_any_const): New patterns.

gcc/testsuite/
        * gcc.target/aarch64/sve/cond_fadd_1.c: New test.
        * gcc.target/aarch64/sve/cond_fadd_1_run.c: Likewise.
        * gcc.target/aarch64/sve/cond_fadd_2.c: Likewise.
        * gcc.target/aarch64/sve/cond_fadd_2_run.c: Likewise.
        * gcc.target/aarch64/sve/cond_fadd_3.c: Likewise.
        * gcc.target/aarch64/sve/cond_fadd_3_run.c: Likewise.
        * gcc.target/aarch64/sve/cond_fadd_4.c: Likewise.
        * gcc.target/aarch64/sve/cond_fadd_4_run.c: Likewise.
        * gcc.target/aarch64/sve/cond_fsubr_1.c: Likewise.
        * gcc.target/aarch64/sve/cond_fsubr_1_run.c: Likewise.
        * gcc.target/aarch64/sve/cond_fsubr_2.c: Likewise.
        * gcc.target/aarch64/sve/cond_fsubr_2_run.c: Likewise.
        * gcc.target/aarch64/sve/cond_fsubr_3.c: Likewise.
        * gcc.target/aarch64/sve/cond_fsubr_3_run.c: Likewise.
        * gcc.target/aarch64/sve/cond_fsubr_4.c: Likewise.
        * gcc.target/aarch64/sve/cond_fsubr_4_run.c: Likewise.
        * gcc.target/aarch64/sve/cond_fmaxnm_1.c: Likewise.
        * gcc.target/aarch64/sve/cond_fmaxnm_1_run.c: Likewise.
        * gcc.target/aarch64/sve/cond_fmaxnm_2.c: Likewise.
        * gcc.target/aarch64/sve/cond_fmaxnm_2_run.c: Likewise.
        * gcc.target/aarch64/sve/cond_fmaxnm_3.c: Likewise.
        * gcc.target/aarch64/sve/cond_fmaxnm_3_run.c: Likewise.
        * gcc.target/aarch64/sve/cond_fmaxnm_4.c: Likewise.
        * gcc.target/aarch64/sve/cond_fmaxnm_4_run.c: Likewise.
        * gcc.target/aarch64/sve/cond_fminnm_1.c: Likewise.
        * gcc.target/aarch64/sve/cond_fminnm_1_run.c: Likewise.
        * gcc.target/aarch64/sve/cond_fminnm_2.c: Likewise.
        * gcc.target/aarch64/sve/cond_fminnm_2_run.c: Likewise.
        * gcc.target/aarch64/sve/cond_fminnm_3.c: Likewise.
        * gcc.target/aarch64/sve/cond_fminnm_3_run.c: Likewise.
        * gcc.target/aarch64/sve/cond_fminnm_4.c: Likewise.
        * gcc.target/aarch64/sve/cond_fminnm_4_run.c: Likewise.
        * gcc.target/aarch64/sve/cond_fmul_1.c: Likewise.
        * gcc.target/aarch64/sve/cond_fmul_1_run.c: Likewise.
        * gcc.target/aarch64/sve/cond_fmul_2.c: Likewise.
        * gcc.target/aarch64/sve/cond_fmul_2_run.c: Likewise.
        * gcc.target/aarch64/sve/cond_fmul_3.c: Likewise.
        * gcc.target/aarch64/sve/cond_fmul_3_run.c: Likewise.
        * gcc.target/aarch64/sve/cond_fmul_4.c: Likewise.
        * gcc.target/aarch64/sve/cond_fmul_4_run.c: Likewise.

Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org>
From-SVN: r274508
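As a rough illustration of what the commit message describes, the sketch below models in plain C which constants the immediate forms accept and how a conditional addition of a constant could be mapped to FADD, FSUB, or the register form. It is not GCC code: the names `sve_fp_op`, `add_choice`, `imm_form_available`, `select_cond_add` and `cond_add_half` are hypothetical. The accepted constants are taken from the comments and tests in this patch (0.5 and 1.0 for FADD/FSUB/FSUBR, 0.5 and 2.0 for FMUL, 0.0 and 1.0 for FMAXNM/FMINNM).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tags for the six instructions named in the commit message.  */
enum sve_fp_op { OP_FADD, OP_FSUB, OP_FSUBR, OP_FMUL, OP_FMAXNM, OP_FMINNM };

/* Hypothetical outcome of choosing an instruction for "y + c".  */
enum add_choice { ADD_FADD_IMM, ADD_FSUB_IMM, ADD_FADD_REG };

/* Illustrative model (not GCC code) of which constants each immediate
   form accepts.  The sign of zero is ignored for simplicity.  */
static bool
imm_form_available (enum sve_fp_op op, double c)
{
  switch (op)
    {
    case OP_FADD:
    case OP_FSUB:
    case OP_FSUBR:
      return c == 0.5 || c == 1.0;
    case OP_FMUL:
      return c == 0.5 || c == 2.0;
    case OP_FMAXNM:
    case OP_FMINNM:
      return c == 0.0 || c == 1.0;
    }
  return false;
}

/* Adding a negative constant can use FSUB with the negated value;
   anything else falls back to the register form of FADD.  */
static enum add_choice
select_cond_add (double c)
{
  if (imm_form_available (OP_FADD, c))
    return ADD_FADD_IMM;
  if (imm_form_available (OP_FSUB, -c))
    return ADD_FSUB_IMM;
  return ADD_FADD_REG;
}

/* Scalar loop of the kind the new tests vectorize: a conditional add
   of 0.5 that keeps y[i] where the condition is false.  */
void
cond_add_half (float *restrict x, const float *restrict y,
               const int32_t *restrict pred, int n)
{
  for (int i = 0; i < n; ++i)
    x[i] = pred[i] != 1 ? y[i] + 0.5f : y[i];
}
```

Under this model, adding 2.0 or -2.0 selects the register form, which is consistent with the `fmov` plus register `fadd` sequences that the `cond_fadd` tests expect for those constants.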
parent bf30864e4c
commit a19ba9e1b1
@ -1,3 +1,29 @@
|
|||
2019-08-15 Richard Sandiford <richard.sandiford@arm.com>
|
||||
Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
|
||||
|
||||
* config/aarch64/aarch64.c (aarch64_print_vector_float_operand):
|
||||
Print 2.0 naturally.
|
||||
(aarch64_sve_float_mul_immediate_p): Return true for 2.0.
|
||||
* config/aarch64/predicates.md
|
||||
(aarch64_sve_float_negated_arith_immediate): New predicate,
|
||||
renamed from aarch64_sve_float_arith_with_sub_immediate.
|
||||
(aarch64_sve_float_arith_with_sub_immediate): Test for both
|
||||
positive and negative constants.
|
||||
(aarch64_sve_float_arith_with_sub_operand): Redefine as a register
|
||||
or an aarch64_sve_float_arith_with_sub_immediate.
|
||||
* config/aarch64/constraints.md (vsN): Use
|
||||
aarch64_sve_float_negated_arith_immediate.
|
||||
* config/aarch64/iterators.md (SVE_COND_FP_BINARY_I1): New int
|
||||
iterator.
|
||||
(sve_pred_fp_rhs2_immediate): New int attribute.
|
||||
* config/aarch64/aarch64-sve.md
|
||||
(cond_<SVE_COND_FP_BINARY:optab><SVE_F:mode>): Use
|
||||
sve_pred_fp_rhs1_operand and sve_pred_fp_rhs2_operand.
|
||||
(*cond_<SVE_COND_FP_BINARY_I1:optab><SVE_F:mode>_2_const)
|
||||
(*cond_<SVE_COND_FP_BINARY_I1:optab><SVE_F:mode>_any_const)
|
||||
(*cond_add<SVE_F:mode>_2_const, *cond_add<SVE_F:mode>_any_const)
|
||||
(*cond_sub<mode>_3_const, *cond_sub<mode>_any_const): New patterns.
|
||||
|
||||
2019-08-15 Richard Sandiford <richard.sandiford@arm.com>
|
||||
Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
|
||||
|
||||
|
|
|
@ -2553,14 +2553,14 @@
 ;; ---- [FP] General binary arithmetic corresponding to unspecs
 ;; -------------------------------------------------------------------------
 ;; Includes merging forms of:
-;; - FADD
+;; - FADD (constant forms handled in the "Addition" section)
 ;; - FDIV
 ;; - FDIVR
-;; - FMAXNM
-;; - FMINNM
-;; - FMUL
-;; - FSUB
-;; - FSUBR
+;; - FMAXNM (including #0.0 and #1.0)
+;; - FMINNM (including #0.0 and #1.0)
+;; - FMUL (including #0.5 and #2.0)
+;; - FSUB (constant forms handled in the "Addition" section)
+;; - FSUBR (constant forms handled in the "Subtraction" section)
 ;; -------------------------------------------------------------------------
 
 ;; Unpredicated floating-point binary operations.
@ -2603,8 +2603,8 @@
           (unspec:SVE_F
             [(match_dup 1)
              (const_int SVE_STRICT_GP)
-             (match_operand:SVE_F 2 "register_operand")
-             (match_operand:SVE_F 3 "register_operand")]
+             (match_operand:SVE_F 2 "<sve_pred_fp_rhs1_operand>")
+             (match_operand:SVE_F 3 "<sve_pred_fp_rhs2_operand>")]
             SVE_COND_FP_BINARY)
           (match_operand:SVE_F 4 "aarch64_simd_reg_or_zero")]
          UNSPEC_SEL))]
@ -2635,6 +2635,30 @@
|
|||
[(set_attr "movprfx" "*,yes")]
|
||||
)
|
||||
|
||||
;; Same for operations that take a 1-bit constant.
|
||||
(define_insn_and_rewrite "*cond_<optab><mode>_2_const"
|
||||
[(set (match_operand:SVE_F 0 "register_operand" "=w, ?w")
|
||||
(unspec:SVE_F
|
||||
[(match_operand:<VPRED> 1 "register_operand" "Upl, Upl")
|
||||
(unspec:SVE_F
|
||||
[(match_operand 4)
|
||||
(match_operand:SI 5 "aarch64_sve_gp_strictness")
|
||||
(match_operand:SVE_F 2 "register_operand" "0, w")
|
||||
(match_operand:SVE_F 3 "<sve_pred_fp_rhs2_immediate>")]
|
||||
SVE_COND_FP_BINARY_I1)
|
||||
(match_dup 2)]
|
||||
UNSPEC_SEL))]
|
||||
"TARGET_SVE && aarch64_sve_pred_dominates_p (&operands[4], operands[1])"
|
||||
"@
|
||||
<sve_fp_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, #%3
|
||||
movprfx\t%0, %2\;<sve_fp_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, #%3"
|
||||
"&& !rtx_equal_p (operands[1], operands[4])"
|
||||
{
|
||||
operands[4] = copy_rtx (operands[1]);
|
||||
}
|
||||
[(set_attr "movprfx" "*,yes")]
|
||||
)
|
||||
|
||||
;; Predicated floating-point operations, merging with the second input.
|
||||
(define_insn_and_rewrite "*cond_<optab><mode>_3"
|
||||
[(set (match_operand:SVE_F 0 "register_operand" "=w, ?&w")
|
||||
|
@ -2700,6 +2724,44 @@
|
|||
[(set_attr "movprfx" "yes")]
|
||||
)
|
||||
|
||||
;; Same for operations that take a 1-bit constant.
|
||||
(define_insn_and_rewrite "*cond_<optab><mode>_any_const"
|
||||
[(set (match_operand:SVE_F 0 "register_operand" "=w, w, ?w")
|
||||
(unspec:SVE_F
|
||||
[(match_operand:<VPRED> 1 "register_operand" "Upl, Upl, Upl")
|
||||
(unspec:SVE_F
|
||||
[(match_operand 5)
|
||||
(match_operand:SI 6 "aarch64_sve_gp_strictness")
|
||||
(match_operand:SVE_F 2 "register_operand" "w, w, w")
|
||||
(match_operand:SVE_F 3 "<sve_pred_fp_rhs2_immediate>")]
|
||||
SVE_COND_FP_BINARY_I1)
|
||||
(match_operand:SVE_F 4 "aarch64_simd_reg_or_zero" "Dz, 0, w")]
|
||||
UNSPEC_SEL))]
|
||||
"TARGET_SVE
|
||||
&& !rtx_equal_p (operands[2], operands[4])
|
||||
&& aarch64_sve_pred_dominates_p (&operands[5], operands[1])"
|
||||
"@
|
||||
movprfx\t%0.<Vetype>, %1/z, %2.<Vetype>\;<sve_fp_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, #%3
|
||||
movprfx\t%0.<Vetype>, %1/m, %2.<Vetype>\;<sve_fp_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, #%3
|
||||
#"
|
||||
"&& 1"
|
||||
{
|
||||
if (reload_completed
|
||||
&& register_operand (operands[4], <MODE>mode)
|
||||
&& !rtx_equal_p (operands[0], operands[4]))
|
||||
{
|
||||
emit_insn (gen_vcond_mask_<mode><vpred> (operands[0], operands[2],
|
||||
operands[4], operands[1]));
|
||||
operands[4] = operands[2] = operands[0];
|
||||
}
|
||||
else if (!rtx_equal_p (operands[1], operands[5]))
|
||||
operands[5] = copy_rtx (operands[1]);
|
||||
else
|
||||
FAIL;
|
||||
}
|
||||
[(set_attr "movprfx" "yes")]
|
||||
)
|
||||
|
||||
;; -------------------------------------------------------------------------
|
||||
;; ---- [FP] Addition
|
||||
;; -------------------------------------------------------------------------
|
||||
|
@ -2729,7 +2791,76 @@
|
|||
[(set (match_dup 0) (plus:SVE_F (match_dup 2) (match_dup 3)))]
|
||||
)
|
||||
|
||||
-;; Merging forms are handled through SVE_COND_FP_BINARY.
|
||||
;; Predicated floating-point addition of a constant, merging with the
|
||||
;; first input.
|
||||
(define_insn_and_rewrite "*cond_add<mode>_2_const"
|
||||
[(set (match_operand:SVE_F 0 "register_operand" "=w, w, ?w, ?w")
|
||||
(unspec:SVE_F
|
||||
[(match_operand:<VPRED> 1 "register_operand" "Upl, Upl, Upl, Upl")
|
||||
(unspec:SVE_F
|
||||
[(match_operand 4)
|
||||
(match_operand:SI 5 "aarch64_sve_gp_strictness")
|
||||
(match_operand:SVE_F 2 "register_operand" "0, 0, w, w")
|
||||
(match_operand:SVE_F 3 "aarch64_sve_float_arith_with_sub_immediate" "vsA, vsN, vsA, vsN")]
|
||||
UNSPEC_COND_FADD)
|
||||
(match_dup 2)]
|
||||
UNSPEC_SEL))]
|
||||
"TARGET_SVE && aarch64_sve_pred_dominates_p (&operands[4], operands[1])"
|
||||
"@
|
||||
fadd\t%0.<Vetype>, %1/m, %0.<Vetype>, #%3
|
||||
fsub\t%0.<Vetype>, %1/m, %0.<Vetype>, #%N3
|
||||
movprfx\t%0, %2\;fadd\t%0.<Vetype>, %1/m, %0.<Vetype>, #%3
|
||||
movprfx\t%0, %2\;fsub\t%0.<Vetype>, %1/m, %0.<Vetype>, #%N3"
|
||||
"&& !rtx_equal_p (operands[1], operands[4])"
|
||||
{
|
||||
operands[4] = copy_rtx (operands[1]);
|
||||
}
|
||||
[(set_attr "movprfx" "*,*,yes,yes")]
|
||||
)
|
||||
|
||||
;; Predicated floating-point addition of a constant, merging with an
|
||||
;; independent value.
|
||||
(define_insn_and_rewrite "*cond_add<mode>_any_const"
|
||||
[(set (match_operand:SVE_F 0 "register_operand" "=w, w, w, w, ?w, ?w")
|
||||
(unspec:SVE_F
|
||||
[(match_operand:<VPRED> 1 "register_operand" "Upl, Upl, Upl, Upl, Upl, Upl")
|
||||
(unspec:SVE_F
|
||||
[(match_operand 5)
|
||||
(match_operand:SI 6 "aarch64_sve_gp_strictness")
|
||||
(match_operand:SVE_F 2 "register_operand" "w, w, w, w, w, w")
|
||||
(match_operand:SVE_F 3 "aarch64_sve_float_arith_with_sub_immediate" "vsA, vsN, vsA, vsN, vsA, vsN")]
|
||||
UNSPEC_COND_FADD)
|
||||
(match_operand:SVE_F 4 "aarch64_simd_reg_or_zero" "Dz, Dz, 0, 0, w, w")]
|
||||
UNSPEC_SEL))]
|
||||
"TARGET_SVE
|
||||
&& !rtx_equal_p (operands[2], operands[4])
|
||||
&& aarch64_sve_pred_dominates_p (&operands[5], operands[1])"
|
||||
"@
|
||||
movprfx\t%0.<Vetype>, %1/z, %2.<Vetype>\;fadd\t%0.<Vetype>, %1/m, %0.<Vetype>, #%3
|
||||
movprfx\t%0.<Vetype>, %1/z, %2.<Vetype>\;fsub\t%0.<Vetype>, %1/m, %0.<Vetype>, #%N3
|
||||
movprfx\t%0.<Vetype>, %1/m, %2.<Vetype>\;fadd\t%0.<Vetype>, %1/m, %0.<Vetype>, #%3
|
||||
movprfx\t%0.<Vetype>, %1/m, %2.<Vetype>\;fsub\t%0.<Vetype>, %1/m, %0.<Vetype>, #%N3
|
||||
#
|
||||
#"
|
||||
"&& 1"
|
||||
{
|
||||
if (reload_completed
|
||||
&& register_operand (operands[4], <MODE>mode)
|
||||
&& !rtx_equal_p (operands[0], operands[4]))
|
||||
{
|
||||
emit_insn (gen_vcond_mask_<mode><vpred> (operands[0], operands[2],
|
||||
operands[4], operands[1]));
|
||||
operands[4] = operands[2] = operands[0];
|
||||
}
|
||||
else if (!rtx_equal_p (operands[1], operands[5]))
|
||||
operands[5] = copy_rtx (operands[1]);
|
||||
else
|
||||
FAIL;
|
||||
}
|
||||
[(set_attr "movprfx" "yes")]
|
||||
)
|
||||
|
||||
;; Register merging forms are handled through SVE_COND_FP_BINARY.
|
||||
|
||||
;; -------------------------------------------------------------------------
|
||||
;; ---- [FP] Subtraction
|
||||
|
@ -2765,7 +2896,71 @@
|
|||
[(set (match_dup 0) (minus:SVE_F (match_dup 2) (match_dup 3)))]
|
||||
)
|
||||
|
||||
-;; Merging forms are handled through SVE_COND_FP_BINARY.
|
||||
;; Predicated floating-point subtraction from a constant, merging with the
|
||||
;; second input.
|
||||
(define_insn_and_rewrite "*cond_sub<mode>_3_const"
|
||||
[(set (match_operand:SVE_F 0 "register_operand" "=w, ?w")
|
||||
(unspec:SVE_F
|
||||
[(match_operand:<VPRED> 1 "register_operand" "Upl, Upl")
|
||||
(unspec:SVE_F
|
||||
[(match_operand 4)
|
||||
(match_operand:SI 5 "aarch64_sve_gp_strictness")
|
||||
(match_operand:SVE_F 2 "aarch64_sve_float_arith_immediate")
|
||||
(match_operand:SVE_F 3 "register_operand" "0, w")]
|
||||
UNSPEC_COND_FSUB)
|
||||
(match_dup 3)]
|
||||
UNSPEC_SEL))]
|
||||
"TARGET_SVE && aarch64_sve_pred_dominates_p (&operands[4], operands[1])"
|
||||
"@
|
||||
fsubr\t%0.<Vetype>, %1/m, %0.<Vetype>, #%2
|
||||
movprfx\t%0, %3\;fsubr\t%0.<Vetype>, %1/m, %0.<Vetype>, #%2"
|
||||
"&& !rtx_equal_p (operands[1], operands[4])"
|
||||
{
|
||||
operands[4] = copy_rtx (operands[1]);
|
||||
}
|
||||
[(set_attr "movprfx" "*,yes")]
|
||||
)
|
||||
|
||||
;; Predicated floating-point subtraction from a constant, merging with an
|
||||
;; independent value.
|
||||
(define_insn_and_rewrite "*cond_sub<mode>_any_const"
|
||||
[(set (match_operand:SVE_F 0 "register_operand" "=w, w, ?w")
|
||||
(unspec:SVE_F
|
||||
[(match_operand:<VPRED> 1 "register_operand" "Upl, Upl, Upl")
|
||||
(unspec:SVE_F
|
||||
[(match_operand 5)
|
||||
(match_operand:SI 6 "aarch64_sve_gp_strictness")
|
||||
(match_operand:SVE_F 2 "aarch64_sve_float_arith_immediate")
|
||||
(match_operand:SVE_F 3 "register_operand" "w, w, w")]
|
||||
UNSPEC_COND_FSUB)
|
||||
(match_operand:SVE_F 4 "aarch64_simd_reg_or_zero" "Dz, 0, w")]
|
||||
UNSPEC_SEL))]
|
||||
"TARGET_SVE
|
||||
&& !rtx_equal_p (operands[3], operands[4])
|
||||
&& aarch64_sve_pred_dominates_p (&operands[5], operands[1])"
|
||||
"@
|
||||
movprfx\t%0.<Vetype>, %1/z, %3.<Vetype>\;fsubr\t%0.<Vetype>, %1/m, %0.<Vetype>, #%2
|
||||
movprfx\t%0.<Vetype>, %1/m, %3.<Vetype>\;fsubr\t%0.<Vetype>, %1/m, %0.<Vetype>, #%2
|
||||
#"
|
||||
"&& 1"
|
||||
{
|
||||
if (reload_completed
|
||||
&& register_operand (operands[4], <MODE>mode)
|
||||
&& !rtx_equal_p (operands[0], operands[4]))
|
||||
{
|
||||
emit_insn (gen_vcond_mask_<mode><vpred> (operands[0], operands[3],
|
||||
operands[4], operands[1]));
|
||||
operands[4] = operands[3] = operands[0];
|
||||
}
|
||||
else if (!rtx_equal_p (operands[1], operands[5]))
|
||||
operands[5] = copy_rtx (operands[1]);
|
||||
else
|
||||
FAIL;
|
||||
}
|
||||
[(set_attr "movprfx" "yes")]
|
||||
)
|
||||
|
||||
;; Register merging forms are handled through SVE_COND_FP_BINARY.
|
||||
|
||||
;; -------------------------------------------------------------------------
|
||||
;; ---- [FP] Absolute difference
|
||||
|
@ -2939,7 +3134,8 @@
   [(set (match_dup 0) (mult:SVE_F (match_dup 2) (match_dup 3)))]
 )
 
-;; Merging forms are handled through SVE_COND_FP_BINARY.
+;; Merging forms are handled through SVE_COND_FP_BINARY and
+;; SVE_COND_FP_BINARY_I1.
 
 ;; -------------------------------------------------------------------------
 ;; ---- [FP] Binary logical operations
@ -3064,7 +3260,8 @@
   [(set_attr "movprfx" "*,*,yes,yes")]
 )
 
-;; Merging forms are handled through SVE_COND_FP_BINARY.
+;; Merging forms are handled through SVE_COND_FP_BINARY and
+;; SVE_COND_FP_BINARY_I1.
 
 ;; -------------------------------------------------------------------------
 ;; ---- [PRED] Binary logical operations
@ -8289,6 +8289,8 @@ aarch64_print_vector_float_operand (FILE *f, rtx x, bool negate)
      fixed form in the assembly syntax. */
   if (real_equal (&r, &dconst0))
     asm_fprintf (f, "0.0");
+  else if (real_equal (&r, &dconst2))
+    asm_fprintf (f, "2.0");
   else if (real_equal (&r, &dconst1))
     asm_fprintf (f, "1.0");
   else if (real_equal (&r, &dconsthalf))
@ -15205,11 +15207,10 @@ aarch64_sve_float_mul_immediate_p (rtx x)
 {
   rtx elt;
 
-  /* GCC will never generate a multiply with an immediate of 2, so there is no
-     point testing for it (even though it is a valid constant). */
   return (const_vec_duplicate_p (x, &elt)
           && GET_CODE (elt) == CONST_DOUBLE
-          && real_equal (CONST_DOUBLE_REAL_VALUE (elt), &dconsthalf));
+          && (real_equal (CONST_DOUBLE_REAL_VALUE (elt), &dconsthalf)
+              || real_equal (CONST_DOUBLE_REAL_VALUE (elt), &dconst2)));
 }
 
 /* Return true if replicating VAL32 is a valid 2-byte or 4-byte immediate
@ -458,4 +458,4 @@
 (define_constraint "vsN"
   "@internal
    A constraint that matches the negative of vsA"
-  (match_operand 0 "aarch64_sve_float_arith_with_sub_immediate"))
+  (match_operand 0 "aarch64_sve_float_negated_arith_immediate"))
@ -1713,6 +1713,10 @@
                                          UNSPEC_COND_FMUL
                                          UNSPEC_COND_FSUB])
 
+(define_int_iterator SVE_COND_FP_BINARY_I1 [UNSPEC_COND_FMAXNM
+                                            UNSPEC_COND_FMINNM
+                                            UNSPEC_COND_FMUL])
+
 (define_int_iterator SVE_COND_FP_BINARY_REG [UNSPEC_COND_FDIV])
 
 ;; Floating-point max/min operations that correspond to optabs,
@ -2108,3 +2112,9 @@
    (UNSPEC_COND_FMINNM "aarch64_sve_float_maxmin_operand")
    (UNSPEC_COND_FMUL "aarch64_sve_float_mul_operand")
    (UNSPEC_COND_FSUB "register_operand")])
+
+;; Likewise for immediates only.
+(define_int_attr sve_pred_fp_rhs2_immediate
+  [(UNSPEC_COND_FMAXNM "aarch64_sve_float_maxmin_immediate")
+   (UNSPEC_COND_FMINNM "aarch64_sve_float_maxmin_immediate")
+   (UNSPEC_COND_FMUL "aarch64_sve_float_mul_immediate")])
@ -663,10 +663,14 @@
   (and (match_code "const,const_vector")
        (match_test "aarch64_sve_float_arith_immediate_p (op, false)")))
 
-(define_predicate "aarch64_sve_float_arith_with_sub_immediate"
+(define_predicate "aarch64_sve_float_negated_arith_immediate"
   (and (match_code "const,const_vector")
        (match_test "aarch64_sve_float_arith_immediate_p (op, true)")))
 
+(define_predicate "aarch64_sve_float_arith_with_sub_immediate"
+  (ior (match_operand 0 "aarch64_sve_float_arith_immediate")
+       (match_operand 0 "aarch64_sve_float_negated_arith_immediate")))
+
 (define_predicate "aarch64_sve_float_mul_immediate"
   (and (match_code "const,const_vector")
        (match_test "aarch64_sve_float_mul_immediate_p (op)")))
 
@ -730,7 +734,7 @@
        (match_operand 0 "aarch64_sve_float_arith_immediate")))
 
 (define_predicate "aarch64_sve_float_arith_with_sub_operand"
-  (ior (match_operand 0 "aarch64_sve_float_arith_operand")
+  (ior (match_operand 0 "register_operand")
        (match_operand 0 "aarch64_sve_float_arith_with_sub_immediate")))
 
 (define_predicate "aarch64_sve_float_mul_operand"
@ -1,3 +1,47 @@
|
|||
2019-08-15 Richard Sandiford <richard.sandiford@arm.com>
|
||||
Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
|
||||
|
||||
* gcc.target/aarch64/sve/cond_fadd_1.c: New test.
|
||||
* gcc.target/aarch64/sve/cond_fadd_1_run.c: Likewise.
|
||||
* gcc.target/aarch64/sve/cond_fadd_2.c: Likewise.
|
||||
* gcc.target/aarch64/sve/cond_fadd_2_run.c: Likewise.
|
||||
* gcc.target/aarch64/sve/cond_fadd_3.c: Likewise.
|
||||
* gcc.target/aarch64/sve/cond_fadd_3_run.c: Likewise.
|
||||
* gcc.target/aarch64/sve/cond_fadd_4.c: Likewise.
|
||||
* gcc.target/aarch64/sve/cond_fadd_4_run.c: Likewise.
|
||||
* gcc.target/aarch64/sve/cond_fsubr_1.c: Likewise.
|
||||
* gcc.target/aarch64/sve/cond_fsubr_1_run.c: Likewise.
|
||||
* gcc.target/aarch64/sve/cond_fsubr_2.c: Likewise.
|
||||
* gcc.target/aarch64/sve/cond_fsubr_2_run.c: Likewise.
|
||||
* gcc.target/aarch64/sve/cond_fsubr_3.c: Likewise.
|
||||
* gcc.target/aarch64/sve/cond_fsubr_3_run.c: Likewise.
|
||||
* gcc.target/aarch64/sve/cond_fsubr_4.c: Likewise.
|
||||
* gcc.target/aarch64/sve/cond_fsubr_4_run.c: Likewise.
|
||||
* gcc.target/aarch64/sve/cond_fmaxnm_1.c: Likewise.
|
||||
* gcc.target/aarch64/sve/cond_fmaxnm_1_run.c: Likewise.
|
||||
* gcc.target/aarch64/sve/cond_fmaxnm_2.c: Likewise.
|
||||
* gcc.target/aarch64/sve/cond_fmaxnm_2_run.c: Likewise.
|
||||
* gcc.target/aarch64/sve/cond_fmaxnm_3.c: Likewise.
|
||||
* gcc.target/aarch64/sve/cond_fmaxnm_3_run.c: Likewise.
|
||||
* gcc.target/aarch64/sve/cond_fmaxnm_4.c: Likewise.
|
||||
* gcc.target/aarch64/sve/cond_fmaxnm_4_run.c: Likewise.
|
||||
* gcc.target/aarch64/sve/cond_fminnm_1.c: Likewise.
|
||||
* gcc.target/aarch64/sve/cond_fminnm_1_run.c: Likewise.
|
||||
* gcc.target/aarch64/sve/cond_fminnm_2.c: Likewise.
|
||||
* gcc.target/aarch64/sve/cond_fminnm_2_run.c: Likewise.
|
||||
* gcc.target/aarch64/sve/cond_fminnm_3.c: Likewise.
|
||||
* gcc.target/aarch64/sve/cond_fminnm_3_run.c: Likewise.
|
||||
* gcc.target/aarch64/sve/cond_fminnm_4.c: Likewise.
|
||||
* gcc.target/aarch64/sve/cond_fminnm_4_run.c: Likewise.
|
||||
* gcc.target/aarch64/sve/cond_fmul_1.c: Likewise.
|
||||
* gcc.target/aarch64/sve/cond_fmul_1_run.c: Likewise.
|
||||
* gcc.target/aarch64/sve/cond_fmul_2.c: Likewise.
|
||||
* gcc.target/aarch64/sve/cond_fmul_2_run.c: Likewise.
|
||||
* gcc.target/aarch64/sve/cond_fmul_3.c: Likewise.
|
||||
* gcc.target/aarch64/sve/cond_fmul_3_run.c: Likewise.
|
||||
* gcc.target/aarch64/sve/cond_fmul_4.c: Likewise.
|
||||
* gcc.target/aarch64/sve/cond_fmul_4_run.c: Likewise.
|
||||
|
||||
2019-08-15 Richard Sandiford <richard.sandiford@arm.com>
|
||||
Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
|
||||
|
||||
|
|
|
@ -0,0 +1,62 @@
|
|||
/* { dg-do compile } */
|
||||
/* { dg-options "-O2 -ftree-vectorize" } */
|
||||
|
||||
#include <stdint.h>
|
||||
|
||||
#define DEF_LOOP(TYPE, PRED_TYPE, NAME, CONST) \
|
||||
void __attribute__ ((noipa)) \
|
||||
test_##TYPE##_##NAME (TYPE *__restrict x, \
|
||||
TYPE *__restrict y, \
|
||||
PRED_TYPE *__restrict pred, \
|
||||
int n) \
|
||||
{ \
|
||||
for (int i = 0; i < n; ++i) \
|
||||
x[i] = pred[i] != 1 ? y[i] + (TYPE) CONST : y[i]; \
|
||||
}
|
||||
|
||||
#define TEST_TYPE(T, TYPE, PRED_TYPE) \
|
||||
T (TYPE, PRED_TYPE, half, 0.5) \
|
||||
T (TYPE, PRED_TYPE, one, 1.0) \
|
||||
T (TYPE, PRED_TYPE, two, 2.0) \
|
||||
T (TYPE, PRED_TYPE, minus_half, -0.5) \
|
||||
T (TYPE, PRED_TYPE, minus_one, -1.0) \
|
||||
T (TYPE, PRED_TYPE, minus_two, -2.0)
|
||||
|
||||
#define TEST_ALL(T) \
|
||||
TEST_TYPE (T, _Float16, int16_t) \
|
||||
TEST_TYPE (T, float, int32_t) \
|
||||
TEST_TYPE (T, double, int64_t)
|
||||
|
||||
TEST_ALL (DEF_LOOP)
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #-2\.0} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #-2\.0} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #-2\.0} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 2 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 2 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 2 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-not {\tmov\tz} } } */
|
||||
/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */
|
||||
/* { dg-final { scan-assembler-not {\tsel\t} } } */
|
|
@ -0,0 +1,32 @@
|
|||
/* { dg-do run { target aarch64_sve_hw } } */
|
||||
/* { dg-options "-O2 -ftree-vectorize" } */
|
||||
|
||||
#include "cond_fadd_1.c"
|
||||
|
||||
#define N 99
|
||||
|
||||
#define TEST_LOOP(TYPE, PRED_TYPE, NAME, CONST) \
|
||||
{ \
|
||||
TYPE x[N], y[N]; \
|
||||
PRED_TYPE pred[N]; \
|
||||
for (int i = 0; i < N; ++i) \
|
||||
{ \
|
||||
y[i] = i * i; \
|
||||
pred[i] = i % 3; \
|
||||
} \
|
||||
test_##TYPE##_##NAME (x, y, pred, N); \
|
||||
for (int i = 0; i < N; ++i) \
|
||||
{ \
|
||||
TYPE expected = i % 3 != 1 ? y[i] + (TYPE) CONST : y[i]; \
|
||||
if (x[i] != expected) \
|
||||
__builtin_abort (); \
|
||||
asm volatile ("" ::: "memory"); \
|
||||
} \
|
||||
}
|
||||
|
||||
int
|
||||
main (void)
|
||||
{
|
||||
TEST_ALL (TEST_LOOP)
|
||||
return 0;
|
||||
}
|
|
@ -0,0 +1,56 @@
|
|||
/* { dg-do compile } */
|
||||
/* { dg-options "-O2 -ftree-vectorize" } */
|
||||
|
||||
#include <stdint.h>
|
||||
|
||||
#define DEF_LOOP(TYPE, NAME, CONST) \
|
||||
void __attribute__ ((noipa)) \
|
||||
test_##TYPE##_##NAME (TYPE *__restrict x, \
|
||||
TYPE *__restrict y, \
|
||||
TYPE *__restrict z, \
|
||||
int n) \
|
||||
{ \
|
||||
for (int i = 0; i < n; ++i) \
|
||||
x[i] = y[i] < 8 ? z[i] + (TYPE) CONST : y[i]; \
|
||||
}
|
||||
|
||||
#define TEST_TYPE(T, TYPE) \
|
||||
T (TYPE, half, 0.5) \
|
||||
T (TYPE, one, 1.0) \
|
||||
T (TYPE, two, 2.0) \
|
||||
T (TYPE, minus_half, -0.5) \
|
||||
T (TYPE, minus_one, -1.0) \
|
||||
T (TYPE, minus_two, -2.0)
|
||||
|
||||
#define TEST_ALL(T) \
|
||||
TEST_TYPE (T, float) \
|
||||
TEST_TYPE (T, double)
|
||||
|
||||
TEST_ALL (DEF_LOOP)
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #-2\.0} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #-2\.0} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 2 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 2 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s\n} 6 } } */
|
||||
/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 6 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-not {\tmov\tz} } } */
|
||||
/* { dg-final { scan-assembler-not {\tsel\t} } } */
|
|
@ -0,0 +1,31 @@
|
|||
/* { dg-do run { target aarch64_sve_hw } } */
|
||||
/* { dg-options "-O2 -ftree-vectorize" } */
|
||||
|
||||
#include "cond_fadd_2.c"
|
||||
|
||||
#define N 99
|
||||
|
||||
#define TEST_LOOP(TYPE, NAME, CONST) \
|
||||
{ \
|
||||
TYPE x[N], y[N], z[N]; \
|
||||
for (int i = 0; i < N; ++i) \
|
||||
{ \
|
||||
y[i] = i % 13; \
|
||||
z[i] = i * i; \
|
||||
} \
|
||||
test_##TYPE##_##NAME (x, y, z, N); \
|
||||
for (int i = 0; i < N; ++i) \
|
||||
{ \
|
||||
TYPE expected = y[i] < 8 ? z[i] + (TYPE) CONST : y[i]; \
|
||||
if (x[i] != expected) \
|
||||
__builtin_abort (); \
|
||||
asm volatile ("" ::: "memory"); \
|
||||
} \
|
||||
}
|
||||
|
||||
int
|
||||
main (void)
|
||||
{
|
||||
TEST_ALL (TEST_LOOP)
|
||||
return 0;
|
||||
}
|
|
@ -0,0 +1,65 @@
|
|||
/* { dg-do compile } */
|
||||
/* { dg-options "-O2 -ftree-vectorize" } */
|
||||
|
||||
#include <stdint.h>
|
||||
|
||||
#define DEF_LOOP(TYPE, PRED_TYPE, NAME, CONST) \
|
||||
void __attribute__ ((noipa)) \
|
||||
test_##TYPE##_##NAME (TYPE *__restrict x, \
|
||||
TYPE *__restrict y, \
|
||||
PRED_TYPE *__restrict pred, \
|
||||
int n) \
|
||||
{ \
|
||||
for (int i = 0; i < n; ++i) \
|
||||
x[i] = pred[i] != 1 ? y[i] + (TYPE) CONST : 4; \
|
||||
}
|
||||
|
||||
#define TEST_TYPE(T, TYPE, PRED_TYPE) \
|
||||
T (TYPE, PRED_TYPE, half, 0.5) \
|
||||
T (TYPE, PRED_TYPE, one, 1.0) \
|
||||
T (TYPE, PRED_TYPE, two, 2.0) \
|
||||
T (TYPE, PRED_TYPE, minus_half, -0.5) \
|
||||
T (TYPE, PRED_TYPE, minus_one, -1.0) \
|
||||
T (TYPE, PRED_TYPE, minus_two, -2.0)
|
||||
|
||||
#define TEST_ALL(T) \
|
||||
TEST_TYPE (T, _Float16, int16_t) \
|
||||
TEST_TYPE (T, float, int32_t) \
|
||||
TEST_TYPE (T, double, int64_t)
|
||||
|
||||
TEST_ALL (DEF_LOOP)
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #-2\.0} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #-2\.0} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #-2\.0} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 2 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 2 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 2 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h, p[0-7], z[0-9]+\.h, z[0-9]+\.h\n} 6 } } */
|
||||
/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.s, p[0-7], z[0-9]+\.s, z[0-9]+\.s\n} 6 } } */
|
||||
/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.d, p[0-7], z[0-9]+\.d, z[0-9]+\.d\n} 6 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */
|
||||
/* { dg-final { scan-assembler-not {\tmov\tz} } } */
|
|
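Reviewer note: the scans in the file above expect `fsub ... #0.5` and `fsub ... #1.0` for the `minus_half` and `minus_one` loops, which add a negative constant. The following is a minimal scalar sketch of why that rewrite is safe, not code from the patch. The function names `cond_add_ref`, `cond_sub_ref` and `neg_add_matches_sub` are hypothetical, and the loop shape is copied from `DEF_LOOP` with `TYPE` fixed to `float`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar reference for the DEF_LOOP shape: lanes with
   pred != 1 receive y + c, all other lanes receive the fallback 4.  */
static void
cond_add_ref (float *x, const float *y, const int32_t *pred, int n, float c)
{
  for (int i = 0; i < n; ++i)
    x[i] = pred[i] != 1 ? y[i] + c : 4;
}

/* The same loop written as a subtraction.  */
static void
cond_sub_ref (float *x, const float *y, const int32_t *pred, int n, float c)
{
  for (int i = 0; i < n; ++i)
    x[i] = pred[i] != 1 ? y[i] - c : 4;
}

/* Returns 1 if adding -c and subtracting c agree on every lane,
   using the same inputs as the run tests (y = i * i, pred = i % 3).  */
static int
neg_add_matches_sub (float c)
{
  enum { N = 99 };
  float x1[N], x2[N], y[N];
  int32_t pred[N];

  for (int i = 0; i < N; ++i)
    {
      y[i] = i * i;
      pred[i] = i % 3;
    }
  cond_add_ref (x1, y, pred, N, -c);
  cond_sub_ref (x2, y, pred, N, c);
  for (int i = 0; i < N; ++i)
    if (x1[i] != x2[i])
      return 0;
  return 1;
}
```

Calling `neg_add_matches_sub (0.5f)` or `neg_add_matches_sub (1.0f)` from a small driver should return 1, since IEEE subtraction is defined as addition of the negated operand.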
@ -0,0 +1,32 @@
|
|||
/* { dg-do run { target aarch64_sve_hw } } */
|
||||
/* { dg-options "-O2 -ftree-vectorize" } */
|
||||
|
||||
#include "cond_fadd_3.c"
|
||||
|
||||
#define N 99
|
||||
|
||||
#define TEST_LOOP(TYPE, PRED_TYPE, NAME, CONST) \
|
||||
{ \
|
||||
TYPE x[N], y[N]; \
|
||||
PRED_TYPE pred[N]; \
|
||||
for (int i = 0; i < N; ++i) \
|
||||
{ \
|
||||
y[i] = i * i; \
|
||||
pred[i] = i % 3; \
|
||||
} \
|
||||
test_##TYPE##_##NAME (x, y, pred, N); \
|
||||
for (int i = 0; i < N; ++i) \
|
||||
{ \
|
||||
TYPE expected = i % 3 != 1 ? y[i] + (TYPE) CONST : 4; \
|
||||
if (x[i] != expected) \
|
||||
__builtin_abort (); \
|
||||
asm volatile ("" ::: "memory"); \
|
||||
} \
|
||||
}
|
||||
|
||||
int
|
||||
main (void)
|
||||
{
|
||||
TEST_ALL (TEST_LOOP)
|
||||
return 0;
|
||||
}
|
|
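Reviewer note: the run tests all follow one pattern: fill the inputs, call the function under test, then recompute each lane in scalar code and abort on a mismatch. The sketch below restates that pattern as a self-contained function that counts mismatches instead of aborting. `test_float_half` is a hypothetical stand-in for one `DEF_LOOP` instance, and `count_mismatches` is an invented name. The comment on the empty `asm` reflects the usual reason for this idiom in vectorizer tests and is an assumption, since the patch does not state it.

```c
#include <assert.h>
#include <stdint.h>

#define N 99

/* Hypothetical stand-in for one DEF_LOOP instance (float, CONST = 0.5).  */
static void
test_float_half (float *restrict x, const float *restrict y,
                 const int32_t *restrict pred, int n)
{
  for (int i = 0; i < n; ++i)
    x[i] = pred[i] != 1 ? y[i] + 0.5f : 4;
}

/* Returns the number of lanes that differ from the scalar expectation.  */
static int
count_mismatches (void)
{
  float x[N], y[N];
  int32_t pred[N];
  int bad = 0;

  for (int i = 0; i < N; ++i)
    {
      y[i] = i * i;
      pred[i] = i % 3;
    }
  test_float_half (x, y, pred, N);
  for (int i = 0; i < N; ++i)
    {
      float expected = i % 3 != 1 ? y[i] + 0.5f : 4;
      if (x[i] != expected)
        ++bad;
      /* Empty asm with a memory clobber: assumed here to keep the
         checking loop itself from being vectorized, so the expected
         values come from plain scalar code.  */
      __asm__ volatile ("" ::: "memory");
    }
  return bad;
}
```

A driver would treat any non-zero return from `count_mismatches ()` as a failure, which is what `__builtin_abort ()` does in the real tests.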
@ -0,0 +1,64 @@
|
|||
/* { dg-do compile } */
|
||||
/* { dg-options "-O2 -ftree-vectorize" } */
|
||||
|
||||
#include <stdint.h>
|
||||
|
||||
#define DEF_LOOP(TYPE, PRED_TYPE, NAME, CONST) \
|
||||
void __attribute__ ((noipa)) \
|
||||
test_##TYPE##_##NAME (TYPE *__restrict x, \
|
||||
TYPE *__restrict y, \
|
||||
PRED_TYPE *__restrict pred, \
|
||||
int n) \
|
||||
{ \
|
||||
for (int i = 0; i < n; ++i) \
|
||||
x[i] = pred[i] != 1 ? y[i] + (TYPE) CONST : 0; \
|
||||
}
|
||||
|
||||
#define TEST_TYPE(T, TYPE, PRED_TYPE) \
|
||||
T (TYPE, PRED_TYPE, half, 0.5) \
|
||||
T (TYPE, PRED_TYPE, one, 1.0) \
|
||||
T (TYPE, PRED_TYPE, two, 2.0) \
|
||||
T (TYPE, PRED_TYPE, minus_half, -0.5) \
|
||||
T (TYPE, PRED_TYPE, minus_one, -1.0) \
|
||||
T (TYPE, PRED_TYPE, minus_two, -2.0)
|
||||
|
||||
#define TEST_ALL(T) \
|
||||
TEST_TYPE (T, _Float16, int16_t) \
|
||||
TEST_TYPE (T, float, int32_t) \
|
||||
TEST_TYPE (T, double, int64_t)
|
||||
|
||||
TEST_ALL (DEF_LOOP)
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #-2\.0} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #-2\.0} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #-2\.0} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 2 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 2 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 2 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.s, p[0-7]/z, z[0-9]+\.s\n} 6 } } */
|
||||
/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.d, p[0-7]/z, z[0-9]+\.d\n} 6 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-not {\tmov\tz} } } */
|
||||
/* { dg-final { scan-assembler-not {\tsel\t} } } */
|
|
@ -0,0 +1,32 @@
|
|||
/* { dg-do run { target aarch64_sve_hw } } */
|
||||
/* { dg-options "-O2 -ftree-vectorize" } */
|
||||
|
||||
#include "cond_fadd_4.c"
|
||||
|
||||
#define N 99
|
||||
|
||||
#define TEST_LOOP(TYPE, PRED_TYPE, NAME, CONST) \
|
||||
{ \
|
||||
TYPE x[N], y[N]; \
|
||||
PRED_TYPE pred[N]; \
|
||||
for (int i = 0; i < N; ++i) \
|
||||
{ \
|
||||
y[i] = i * i; \
|
||||
pred[i] = i % 3; \
|
||||
} \
|
||||
test_##TYPE##_##NAME (x, y, pred, N); \
|
||||
for (int i = 0; i < N; ++i) \
|
||||
{ \
|
||||
TYPE expected = i % 3 != 1 ? y[i] + (TYPE) CONST : 0; \
|
||||
if (x[i] != expected) \
|
||||
__builtin_abort (); \
|
||||
asm volatile ("" ::: "memory"); \
|
||||
} \
|
||||
}
|
||||
|
||||
int
|
||||
main (void)
|
||||
{
|
||||
TEST_ALL (TEST_LOOP)
|
||||
return 0;
|
||||
}
|
|
@ -0,0 +1,55 @@
|
|||
/* { dg-do compile } */
|
||||
/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
|
||||
|
||||
#include <stdint.h>
|
||||
|
||||
#ifndef FN
|
||||
#define FN(X) __builtin_fmax##X
|
||||
#endif
|
||||
|
||||
#define DEF_LOOP(FN, TYPE, PRED_TYPE, NAME, CONST) \
|
||||
void __attribute__ ((noipa)) \
|
||||
test_##TYPE##_##NAME (TYPE *__restrict x, \
|
||||
TYPE *__restrict y, \
|
||||
PRED_TYPE *__restrict pred, \
|
||||
int n) \
|
||||
{ \
|
||||
for (int i = 0; i < n; ++i) \
|
||||
x[i] = pred[i] != 1 ? FN (y[i], CONST) : y[i]; \
|
||||
}
|
||||
|
||||
#define TEST_TYPE(T, FN, TYPE, PRED_TYPE) \
|
||||
T (FN, TYPE, PRED_TYPE, zero, 0) \
|
||||
T (FN, TYPE, PRED_TYPE, one, 1) \
|
||||
T (FN, TYPE, PRED_TYPE, two, 2)
|
||||
|
||||
#define TEST_ALL(T) \
|
||||
TEST_TYPE (T, FN (f16), _Float16, int16_t) \
|
||||
TEST_TYPE (T, FN (f32), float, int32_t) \
|
||||
TEST_TYPE (T, FN (f64), double, int64_t)
|
||||
|
||||
TEST_ALL (DEF_LOOP)
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.0\n} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-not {\tmov\tz} } } */
|
||||
/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */
|
||||
/* { dg-final { scan-assembler-not {\tsel\t} } } */
|
|
@ -0,0 +1,32 @@
|
|||
/* { dg-do run { target aarch64_sve_hw } } */
|
||||
/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
|
||||
|
||||
#include "cond_fmaxnm_1.c"
|
||||
|
||||
#define N 99
|
||||
|
||||
#define TEST_LOOP(FN, TYPE, PRED_TYPE, NAME, CONST) \
|
||||
{ \
|
||||
TYPE x[N], y[N]; \
|
||||
PRED_TYPE pred[N]; \
|
||||
for (int i = 0; i < N; ++i) \
|
||||
{ \
|
||||
y[i] = i * i; \
|
||||
pred[i] = i % 3; \
|
||||
} \
|
||||
test_##TYPE##_##NAME (x, y, pred, N); \
|
||||
for (int i = 0; i < N; ++i) \
|
||||
{ \
|
||||
TYPE expected = i % 3 != 1 ? FN (y[i], CONST) : y[i]; \
|
||||
if (x[i] != expected) \
|
||||
__builtin_abort (); \
|
||||
asm volatile ("" ::: "memory"); \
|
||||
} \
|
||||
}
|
||||
|
||||
int
|
||||
main (void)
|
||||
{
|
||||
TEST_ALL (TEST_LOOP)
|
||||
return 0;
|
||||
}
|
|
@ -0,0 +1,48 @@
|
|||
/* { dg-do compile } */
|
||||
/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
|
||||
|
||||
#include <stdint.h>
|
||||
|
||||
#ifndef FN
|
||||
#define FN(X) __builtin_fmax##X
|
||||
#endif
|
||||
|
||||
#define DEF_LOOP(FN, TYPE, NAME, CONST) \
|
||||
void __attribute__ ((noipa)) \
|
||||
test_##TYPE##_##NAME (TYPE *__restrict x, \
|
||||
TYPE *__restrict y, \
|
||||
TYPE *__restrict z, \
|
||||
int n) \
|
||||
{ \
|
||||
for (int i = 0; i < n; ++i) \
|
||||
x[i] = y[i] < 8 ? FN (z[i], CONST) : y[i]; \
|
||||
}
|
||||
|
||||
#define TEST_TYPE(T, FN, TYPE) \
|
||||
T (FN, TYPE, zero, 0) \
|
||||
T (FN, TYPE, one, 1) \
|
||||
T (FN, TYPE, two, 2)
|
||||
|
||||
#define TEST_ALL(T) \
|
||||
TEST_TYPE (T, FN (f32), float) \
|
||||
TEST_TYPE (T, FN (f64), double)
|
||||
|
||||
TEST_ALL (DEF_LOOP)
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.0\n} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s\n} 3 } } */
|
||||
/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 3 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-not {\tmov\tz} } } */
|
||||
/* { dg-final { scan-assembler-not {\tsel\t} } } */
|
|
@ -0,0 +1,31 @@
|
|||
/* { dg-do run { target aarch64_sve_hw } } */
|
||||
/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
|
||||
|
||||
#include "cond_fmaxnm_2.c"
|
||||
|
||||
#define N 99
|
||||
|
||||
#define TEST_LOOP(FN, TYPE, NAME, CONST) \
|
||||
{ \
|
||||
TYPE x[N], y[N], z[N]; \
|
||||
for (int i = 0; i < N; ++i) \
|
||||
{ \
|
||||
y[i] = i % 13; \
|
||||
z[i] = i * i; \
|
||||
} \
|
||||
test_##TYPE##_##NAME (x, y, z, N); \
|
||||
for (int i = 0; i < N; ++i) \
|
||||
{ \
|
||||
TYPE expected = y[i] < 8 ? FN (z[i], CONST) : y[i]; \
|
||||
if (x[i] != expected) \
|
||||
__builtin_abort (); \
|
||||
asm volatile ("" ::: "memory"); \
|
||||
} \
|
||||
}
|
||||
|
||||
int
|
||||
main (void)
|
||||
{
|
||||
TEST_ALL (TEST_LOOP)
|
||||
return 0;
|
||||
}
|
|
@ -0,0 +1,54 @@
|
|||
/* { dg-do compile } */
|
||||
/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
|
||||
|
||||
#include <stdint.h>
|
||||
|
||||
#ifndef FN
|
||||
#define FN(X) __builtin_fmax##X
|
||||
#endif
|
||||
|
||||
#define DEF_LOOP(FN, TYPE, PRED_TYPE, NAME, CONST) \
|
||||
void __attribute__ ((noipa)) \
|
||||
test_##TYPE##_##NAME (TYPE *__restrict x, \
|
||||
TYPE *__restrict y, \
|
||||
PRED_TYPE *__restrict pred, \
|
||||
int n) \
|
||||
{ \
|
||||
for (int i = 0; i < n; ++i) \
|
||||
x[i] = pred[i] != 1 ? FN (y[i], CONST) : 4; \
|
||||
}
|
||||
|
||||
#define TEST_TYPE(T, FN, TYPE, PRED_TYPE) \
|
||||
T (FN, TYPE, PRED_TYPE, zero, 0) \
|
||||
T (FN, TYPE, PRED_TYPE, one, 1) \
|
||||
T (FN, TYPE, PRED_TYPE, two, 2)
|
||||
|
||||
#define TEST_ALL(T) \
|
||||
TEST_TYPE (T, FN (f16), _Float16, int16_t) \
|
||||
TEST_TYPE (T, FN (f32), float, int32_t) \
|
||||
TEST_TYPE (T, FN (f64), double, int64_t)
|
||||
|
||||
TEST_ALL (DEF_LOOP)
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.0\n} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h, p[0-7], z[0-9]+\.h, z[0-9]+\.h\n} 3 } } */
|
||||
/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.s, p[0-7], z[0-9]+\.s, z[0-9]+\.s\n} 3 } } */
|
||||
/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.d, p[0-7], z[0-9]+\.d, z[0-9]+\.d\n} 3 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */
|
||||
/* { dg-final { scan-assembler-not {\tmov\tz} } } */
|
|
@ -0,0 +1,32 @@
|
|||
/* { dg-do run { target aarch64_sve_hw } } */
|
||||
/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
|
||||
|
||||
#include "cond_fmaxnm_3.c"
|
||||
|
||||
#define N 99
|
||||
|
||||
#define TEST_LOOP(FN, TYPE, PRED_TYPE, NAME, CONST) \
|
||||
{ \
|
||||
TYPE x[N], y[N]; \
|
||||
PRED_TYPE pred[N]; \
|
||||
for (int i = 0; i < N; ++i) \
|
||||
{ \
|
||||
y[i] = i * i; \
|
||||
pred[i] = i % 3; \
|
||||
} \
|
||||
test_##TYPE##_##NAME (x, y, pred, N); \
|
||||
for (int i = 0; i < N; ++i) \
|
||||
{ \
|
||||
TYPE expected = i % 3 != 1 ? FN (y[i], CONST) : 4; \
|
||||
if (x[i] != expected) \
|
||||
__builtin_abort (); \
|
||||
asm volatile ("" ::: "memory"); \
|
||||
} \
|
||||
}
|
||||
|
||||
int
|
||||
main (void)
|
||||
{
|
||||
TEST_ALL (TEST_LOOP)
|
||||
return 0;
|
||||
}
|
|
@ -0,0 +1,53 @@
|
|||
/* { dg-do compile } */
|
||||
/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
|
||||
|
||||
#include <stdint.h>
|
||||
|
||||
#ifndef FN
|
||||
#define FN(X) __builtin_fmax##X
|
||||
#endif
|
||||
|
||||
#define DEF_LOOP(FN, TYPE, PRED_TYPE, NAME, CONST) \
|
||||
void __attribute__ ((noipa)) \
|
||||
test_##TYPE##_##NAME (TYPE *__restrict x, \
|
||||
TYPE *__restrict y, \
|
||||
PRED_TYPE *__restrict pred, \
|
||||
int n) \
|
||||
{ \
|
||||
for (int i = 0; i < n; ++i) \
|
||||
x[i] = pred[i] != 1 ? FN (y[i], CONST) : 0; \
|
||||
}
|
||||
|
||||
#define TEST_TYPE(T, FN, TYPE, PRED_TYPE) \
|
||||
T (FN, TYPE, PRED_TYPE, zero, 0) \
|
||||
T (FN, TYPE, PRED_TYPE, one, 1) \
|
||||
T (FN, TYPE, PRED_TYPE, two, 2)
|
||||
|
||||
#define TEST_ALL(T) \
|
||||
TEST_TYPE (T, FN (f16), _Float16, int16_t) \
|
||||
TEST_TYPE (T, FN (f32), float, int32_t) \
|
||||
TEST_TYPE (T, FN (f64), double, int64_t)
|
||||
|
||||
TEST_ALL (DEF_LOOP)
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.0\n} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.s, p[0-7]/z, z[0-9]+\.s\n} 3 } } */
|
||||
/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.d, p[0-7]/z, z[0-9]+\.d\n} 3 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-not {\tmov\tz} } } */
|
||||
/* { dg-final { scan-assembler-not {\tsel\t} } } */
|
|
@ -0,0 +1,32 @@
|
|||
/* { dg-do run { target aarch64_sve_hw } } */
|
||||
/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
|
||||
|
||||
#include "cond_fmaxnm_4.c"
|
||||
|
||||
#define N 99
|
||||
|
||||
#define TEST_LOOP(FN, TYPE, PRED_TYPE, NAME, CONST) \
|
||||
{ \
|
||||
TYPE x[N], y[N]; \
|
||||
PRED_TYPE pred[N]; \
|
||||
for (int i = 0; i < N; ++i) \
|
||||
{ \
|
||||
y[i] = i * i; \
|
||||
pred[i] = i % 3; \
|
||||
} \
|
||||
test_##TYPE##_##NAME (x, y, pred, N); \
|
||||
for (int i = 0; i < N; ++i) \
|
||||
{ \
|
||||
TYPE expected = i % 3 != 1 ? FN (y[i], CONST) : 0; \
|
||||
if (x[i] != expected) \
|
||||
__builtin_abort (); \
|
||||
asm volatile ("" ::: "memory"); \
|
||||
} \
|
||||
}
|
||||
|
||||
int
|
||||
main (void)
|
||||
{
|
||||
TEST_ALL (TEST_LOOP)
|
||||
return 0;
|
||||
}
|
|
@ -0,0 +1,29 @@
|
|||
/* { dg-do compile } */
|
||||
/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
|
||||
|
||||
#define FN(X) __builtin_fmin##X
|
||||
#include "cond_fmaxnm_1.c"
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.0\n} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-not {\tmov\tz} } } */
|
||||
/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */
|
||||
/* { dg-final { scan-assembler-not {\tsel\t} } } */
|
|
@ -0,0 +1,5 @@
|
|||
/* { dg-do run { target aarch64_sve_hw } } */
|
||||
/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
|
||||
|
||||
#define FN(X) __builtin_fmin##X
|
||||
#include "cond_fmaxnm_1_run.c"
|
|
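Reviewer note: the `fminnm` tests reuse the `fmaxnm` sources by defining `FN` before the `#include`, which works because the shared file only supplies its default inside `#ifndef FN`. The sketch below shows that override mechanism in a single translation unit. `ref_max_f32`, `ref_min_f32` and `apply_fn` are hypothetical stand-ins for the `__builtin_fmax##X` and `__builtin_fmin##X` builtins used by the tests, and unlike those builtins they make no attempt to handle NaNs.

```c
#include <assert.h>

/* Hypothetical stand-ins for the fmax/fmin builtins; not NaN-aware.  */
static float ref_max_f32 (float a, float b) { return a > b ? a : b; }
static float ref_min_f32 (float a, float b) { return a < b ? a : b; }

/* What a wrapper test does before including the shared file.  */
#define FN(X) ref_min_##X

/* What the shared file does: provide a default only when the includer
   has not already chosen one.  Here the default is skipped.  */
#ifndef FN
#define FN(X) ref_max_##X
#endif

/* FN (f32) pastes to ref_min_f32 because of the earlier definition.  */
static float
apply_fn (float y, float c)
{
  return FN (f32) (y, c);
}
```

With the override in place, `apply_fn (4.0f, 1.0f)` yields `1.0f`. Removing the first `#define` would select the maximum instead.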
@ -0,0 +1,23 @@
|
|||
/* { dg-do compile } */
|
||||
/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
|
||||
|
||||
#define FN(X) __builtin_fmin##X
|
||||
#include "cond_fmaxnm_2.c"
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.0\n} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s\n} 3 } } */
|
||||
/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 3 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-not {\tmov\tz} } } */
|
||||
/* { dg-final { scan-assembler-not {\tsel\t} } } */
|
|
@ -0,0 +1,5 @@
|
|||
/* { dg-do run { target aarch64_sve_hw } } */
|
||||
/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
|
||||
|
||||
#define FN(X) __builtin_fmin##X
|
||||
#include "cond_fmaxnm_2_run.c"
|
|
@ -0,0 +1,28 @@
|
|||
/* { dg-do compile } */
|
||||
/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
|
||||
|
||||
#define FN(X) __builtin_fmin##X
|
||||
#include "cond_fmaxnm_3.c"
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.0\n} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h, p[0-7], z[0-9]+\.h, z[0-9]+\.h\n} 3 } } */
|
||||
/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.s, p[0-7], z[0-9]+\.s, z[0-9]+\.s\n} 3 } } */
|
||||
/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.d, p[0-7], z[0-9]+\.d, z[0-9]+\.d\n} 3 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */
|
||||
/* { dg-final { scan-assembler-not {\tmov\tz} } } */
|
|
@ -0,0 +1,5 @@
|
|||
/* { dg-do run { target aarch64_sve_hw } } */
|
||||
/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
|
||||
|
||||
#define FN(X) __builtin_fmin##X
|
||||
#include "cond_fmaxnm_3_run.c"
|
|
@ -0,0 +1,27 @@
|
|||
/* { dg-do compile } */
|
||||
/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
|
||||
|
||||
#define FN(X) __builtin_fmin##X
|
||||
#include "cond_fmaxnm_4.c"
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.0\n} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.s, p[0-7]/z, z[0-9]+\.s\n} 3 } } */
|
||||
/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.d, p[0-7]/z, z[0-9]+\.d\n} 3 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-not {\tmov\tz} } } */
|
||||
/* { dg-final { scan-assembler-not {\tsel\t} } } */
|
|
@ -0,0 +1,5 @@
|
|||
/* { dg-do run { target aarch64_sve_hw } } */
|
||||
/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
|
||||
|
||||
#define FN(X) __builtin_fmin##X
|
||||
#include "cond_fmaxnm_4_run.c"
|
|
@ -0,0 +1,47 @@
|
|||
/* { dg-do compile } */
|
||||
/* { dg-options "-O2 -ftree-vectorize" } */
|
||||
|
||||
#include <stdint.h>
|
||||
|
||||
#define DEF_LOOP(TYPE, PRED_TYPE, NAME, CONST) \
|
||||
void __attribute__ ((noipa)) \
|
||||
test_##TYPE##_##NAME (TYPE *__restrict x, \
|
||||
TYPE *__restrict y, \
|
||||
PRED_TYPE *__restrict pred, \
|
||||
int n) \
|
||||
{ \
|
||||
for (int i = 0; i < n; ++i) \
|
||||
x[i] = pred[i] != 1 ? y[i] * (TYPE) CONST : y[i]; \
|
||||
}
|
||||
|
||||
#define TEST_TYPE(T, TYPE, PRED_TYPE) \
|
||||
T (TYPE, PRED_TYPE, half, 0.5) \
|
||||
T (TYPE, PRED_TYPE, two, 2.0) \
|
||||
T (TYPE, PRED_TYPE, four, 4.0)
|
||||
|
||||
#define TEST_ALL(T) \
|
||||
TEST_TYPE (T, _Float16, int16_t) \
|
||||
TEST_TYPE (T, float, int32_t) \
|
||||
TEST_TYPE (T, double, int64_t)
|
||||
|
||||
TEST_ALL (DEF_LOOP)
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #2\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #2\.0\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #2\.0\n} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #4\.0} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #4\.0} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #4\.0} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
|
||||
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
|
||||
|
||||
/* { dg-final { scan-assembler-not {\tmov\tz} } } */
|
||||
/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */
|
||||
/* { dg-final { scan-assembler-not {\tsel\t} } } */
|
|
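Reviewer note: across these tests the scans expect an immediate operand only for a small set of constants, and an `fmov` plus a register operand otherwise (for example `#4.0` in the `fmul` tests and `#2.0` in the `fadd` and `fmaxnm` tests). The sketch below collects those expectations in one place. The enum and the helper `has_immediate_form` are hypothetical names for illustration and are not the predicates the patch adds to the compiler.

```c
#include <assert.h>
#include <stdbool.h>

enum sve_fp_op { SVE_FADD, SVE_FSUB, SVE_FMUL, SVE_FMAXNM, SVE_FMINNM };

/* Hypothetical helper: true if the tests expect constant C to be
   encoded as an immediate for OP.  Simplification: -0.0 is not
   distinguished from 0.0 here.  */
static bool
has_immediate_form (enum sve_fp_op op, double c)
{
  switch (op)
    {
    case SVE_FADD:
    case SVE_FSUB:
      return c == 0.5 || c == 1.0;
    case SVE_FMUL:
      return c == 0.5 || c == 2.0;
    case SVE_FMAXNM:
    case SVE_FMINNM:
      return c == 0.0 || c == 1.0;
    }
  return false;
}
```

For instance, `has_immediate_form (SVE_FMUL, 2.0)` is true while `has_immediate_form (SVE_FMUL, 4.0)` is false, matching the `fmul ... #2\.0` and `fmov ... #4\.0` scans above.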
@ -0,0 +1,32 @@
|
|||
/* { dg-do run { target aarch64_sve_hw } } */
|
||||
/* { dg-options "-O2 -ftree-vectorize" } */
|
||||
|
||||
#include "cond_fmul_1.c"
|
||||
|
||||
#define N 99
|
||||
|
||||
#define TEST_LOOP(TYPE, PRED_TYPE, NAME, CONST) \
|
||||
{ \
|
||||
TYPE x[N], y[N]; \
|
||||
PRED_TYPE pred[N]; \
|
||||
for (int i = 0; i < N; ++i) \
|
||||
{ \
|
||||
y[i] = i * i; \
|
||||
pred[i] = i % 3; \
|
||||
} \
|
||||
test_##TYPE##_##NAME (x, y, pred, N); \
|
||||
for (int i = 0; i < N; ++i) \
|
||||
{ \
|
||||
TYPE expected = i % 3 != 1 ? y[i] * (TYPE) CONST : y[i]; \
|
||||
if (x[i] != expected) \
|
||||
__builtin_abort (); \
|
||||
asm volatile ("" ::: "memory"); \
|
||||
} \
|
||||
}
|
||||
|
||||
int
|
||||
main (void)
|
||||
{
|
||||
TEST_ALL (TEST_LOOP)
|
||||
return 0;
|
||||
}
|
|
@@ -0,0 +1,44 @@
/* { dg-do compile } */
/* { dg-options "-O2 -ftree-vectorize" } */

#include <stdint.h>

#define DEF_LOOP(TYPE, NAME, CONST) \
  void __attribute__ ((noipa)) \
  test_##TYPE##_##NAME (TYPE *__restrict x, \
                        TYPE *__restrict y, \
                        TYPE *__restrict z, \
                        int n) \
  { \
    for (int i = 0; i < n; ++i) \
      x[i] = y[i] < 8 ? z[i] * (TYPE) CONST : y[i]; \
  }

#define TEST_TYPE(T, TYPE) \
  T (TYPE, half, 0.5) \
  T (TYPE, two, 2.0) \
  T (TYPE, four, 4.0)

#define TEST_ALL(T) \
  TEST_TYPE (T, float) \
  TEST_TYPE (T, double)

TEST_ALL (DEF_LOOP)

/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */

/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #2\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #2\.0\n} 1 } } */

/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #4\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #4\.0} 1 } } */

/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */

/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s\n} 3 } } */
/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 3 } } */

/* { dg-final { scan-assembler-not {\tmov\tz} } } */
/* { dg-final { scan-assembler-not {\tsel\t} } } */
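The scan patterns above expect `#0.5` and `#2.0` to appear directly as fmul immediates, while 4.0 is expected to be loaded with fmov and multiplied via the register form. That matches the SVE encoding, in which FMUL (immediate) accepts only 0.5 and 2.0. A minimal sketch of that classification follows; `fmul_imm_ok` is a hypothetical helper for illustration, not one of GCC's predicates.

```c
#include <stdbool.h>

/* Hypothetical helper (not a GCC function): true if C can be encoded as the
   immediate operand of SVE FMUL, which accepts only 0.5 and 2.0.  */
static bool
fmul_imm_ok (double c)
{
  return c == 0.5 || c == 2.0;
}
```

Under this rule the `half` and `two` loops can use the immediate form, and the `four` loop falls back to fmov plus a register fmul, which is what the test counts.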
@@ -0,0 +1,31 @@
/* { dg-do run { target aarch64_sve_hw } } */
/* { dg-options "-O2 -ftree-vectorize" } */

#include "cond_fmul_2.c"

#define N 99

#define TEST_LOOP(TYPE, NAME, CONST) \
  { \
    TYPE x[N], y[N], z[N]; \
    for (int i = 0; i < N; ++i) \
      { \
        y[i] = i % 13; \
        z[i] = i * i; \
      } \
    test_##TYPE##_##NAME (x, y, z, N); \
    for (int i = 0; i < N; ++i) \
      { \
        TYPE expected = y[i] < 8 ? z[i] * (TYPE) CONST : y[i]; \
        if (x[i] != expected) \
          __builtin_abort (); \
        asm volatile ("" ::: "memory"); \
      } \
  }

int
main (void)
{
  TEST_ALL (TEST_LOOP)
  return 0;
}
@@ -0,0 +1,50 @@
/* { dg-do compile } */
/* { dg-options "-O2 -ftree-vectorize" } */

#include <stdint.h>

#define DEF_LOOP(TYPE, PRED_TYPE, NAME, CONST) \
  void __attribute__ ((noipa)) \
  test_##TYPE##_##NAME (TYPE *__restrict x, \
                        TYPE *__restrict y, \
                        PRED_TYPE *__restrict pred, \
                        int n) \
  { \
    for (int i = 0; i < n; ++i) \
      x[i] = pred[i] != 1 ? y[i] * (TYPE) CONST : 8; \
  }

#define TEST_TYPE(T, TYPE, PRED_TYPE) \
  T (TYPE, PRED_TYPE, half, 0.5) \
  T (TYPE, PRED_TYPE, two, 2.0) \
  T (TYPE, PRED_TYPE, four, 4.0)

#define TEST_ALL(T) \
  TEST_TYPE (T, _Float16, int16_t) \
  TEST_TYPE (T, float, int32_t) \
  TEST_TYPE (T, double, int64_t)

TEST_ALL (DEF_LOOP)

/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */

/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #2\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #2\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #2\.0\n} 1 } } */

/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #4\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #4\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #4\.0} 1 } } */

/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */

/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h, p[0-7], z[0-9]+\.h, z[0-9]+\.h\n} 3 } } */
/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.s, p[0-7], z[0-9]+\.s, z[0-9]+\.s\n} 3 } } */
/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.d, p[0-7], z[0-9]+\.d, z[0-9]+\.d\n} 3 } } */

/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */
/* { dg-final { scan-assembler-not {\tmov\tz} } } */
@@ -0,0 +1,32 @@
/* { dg-do run { target aarch64_sve_hw } } */
/* { dg-options "-O2 -ftree-vectorize" } */

#include "cond_fmul_3.c"

#define N 99

#define TEST_LOOP(TYPE, PRED_TYPE, NAME, CONST) \
  { \
    TYPE x[N], y[N]; \
    PRED_TYPE pred[N]; \
    for (int i = 0; i < N; ++i) \
      { \
        y[i] = i * i; \
        pred[i] = i % 3; \
      } \
    test_##TYPE##_##NAME (x, y, pred, N); \
    for (int i = 0; i < N; ++i) \
      { \
        TYPE expected = i % 3 != 1 ? y[i] * (TYPE) CONST : 8; \
        if (x[i] != expected) \
          __builtin_abort (); \
        asm volatile ("" ::: "memory"); \
      } \
  }

int
main (void)
{
  TEST_ALL (TEST_LOOP)
  return 0;
}
@@ -0,0 +1,49 @@
/* { dg-do compile } */
/* { dg-options "-O2 -ftree-vectorize" } */

#include <stdint.h>

#define DEF_LOOP(TYPE, PRED_TYPE, NAME, CONST) \
  void __attribute__ ((noipa)) \
  test_##TYPE##_##NAME (TYPE *__restrict x, \
                        TYPE *__restrict y, \
                        PRED_TYPE *__restrict pred, \
                        int n) \
  { \
    for (int i = 0; i < n; ++i) \
      x[i] = pred[i] != 1 ? y[i] * (TYPE) CONST : 0; \
  }

#define TEST_TYPE(T, TYPE, PRED_TYPE) \
  T (TYPE, PRED_TYPE, half, 0.5) \
  T (TYPE, PRED_TYPE, two, 2.0) \
  T (TYPE, PRED_TYPE, four, 4.0)

#define TEST_ALL(T) \
  TEST_TYPE (T, _Float16, int16_t) \
  TEST_TYPE (T, float, int32_t) \
  TEST_TYPE (T, double, int64_t)

TEST_ALL (DEF_LOOP)

/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */

/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #2\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #2\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #2\.0\n} 1 } } */

/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #4\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #4\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #4\.0} 1 } } */

/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */

/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.s, p[0-7]/z, z[0-9]+\.s\n} 3 } } */
/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.d, p[0-7]/z, z[0-9]+\.d\n} 3 } } */

/* { dg-final { scan-assembler-not {\tmov\tz} } } */
/* { dg-final { scan-assembler-not {\tsel\t} } } */
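When the "else" value is zero, the test above expects a zeroing `movprfx` (`/z`) followed by a merging `fmul` (`/m`) rather than a `sel`. The sketch below is a scalar model of that two-instruction sequence, written only to illustrate the lane semantics. `model_movprfx_z_fmul_imm` is a hypothetical name, and the `bool` array stands in for the governing predicate register.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical scalar model of:
     movprfx zd, pg/z, zn      (active lanes copy zn, inactive lanes become 0)
     fmul    zd, pg/m, zd, #imm (active lanes are multiplied, inactive kept)  */
static void
model_movprfx_z_fmul_imm (float *zd, const float *zn, const bool *pg,
                          float imm, size_t lanes)
{
  for (size_t i = 0; i < lanes; ++i)
    {
      float prefixed = pg[i] ? zn[i] : 0.0f;
      zd[i] = pg[i] ? prefixed * imm : prefixed;
    }
}
```

The net effect per lane is `pg ? zn * imm : 0`, which is the source-level expression `pred[i] != 1 ? y[i] * (TYPE) CONST : 0` in DEF_LOOP.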
@@ -0,0 +1,32 @@
/* { dg-do run { target aarch64_sve_hw } } */
/* { dg-options "-O2 -ftree-vectorize" } */

#include "cond_fmul_4.c"

#define N 99

#define TEST_LOOP(TYPE, PRED_TYPE, NAME, CONST) \
  { \
    TYPE x[N], y[N]; \
    PRED_TYPE pred[N]; \
    for (int i = 0; i < N; ++i) \
      { \
        y[i] = i * i; \
        pred[i] = i % 3; \
      } \
    test_##TYPE##_##NAME (x, y, pred, N); \
    for (int i = 0; i < N; ++i) \
      { \
        TYPE expected = i % 3 != 1 ? y[i] * (TYPE) CONST : 0; \
        if (x[i] != expected) \
          __builtin_abort (); \
        asm volatile ("" ::: "memory"); \
      } \
  }

int
main (void)
{
  TEST_ALL (TEST_LOOP)
  return 0;
}
@@ -0,0 +1,47 @@
/* { dg-do compile } */
/* { dg-options "-O2 -ftree-vectorize" } */

#include <stdint.h>

#define DEF_LOOP(TYPE, PRED_TYPE, NAME, CONST) \
  void __attribute__ ((noipa)) \
  test_##TYPE##_##NAME (TYPE *__restrict x, \
                        TYPE *__restrict y, \
                        PRED_TYPE *__restrict pred, \
                        int n) \
  { \
    for (int i = 0; i < n; ++i) \
      x[i] = pred[i] != 1 ? (TYPE) CONST - y[i] : y[i]; \
  }

#define TEST_TYPE(T, TYPE, PRED_TYPE) \
  T (TYPE, PRED_TYPE, half, 0.5) \
  T (TYPE, PRED_TYPE, one, 1.0) \
  T (TYPE, PRED_TYPE, two, 2.0)

#define TEST_ALL(T) \
  TEST_TYPE (T, _Float16, int16_t) \
  TEST_TYPE (T, float, int32_t) \
  TEST_TYPE (T, double, int64_t)

TEST_ALL (DEF_LOOP)

/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */

/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */

/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */

/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */

/* { dg-final { scan-assembler-not {\tmov\tz} } } */
/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */
/* { dg-final { scan-assembler-not {\tsel\t} } } */
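The loop body `(TYPE) CONST - y[i]` has the constant on the left of the subtraction, which is why the test above looks for the reversed form `fsubr` with `#0.5` and `#1.0` immediates, and for an fmov of 2.0 in the remaining case (the FSUBR immediate accepts only 0.5 and 1.0). The sketch below is a scalar model of the merging immediate form, given only to illustrate the lane semantics. `model_fsubr_imm_merge` is a hypothetical name, and the `bool` array stands in for the governing predicate.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical scalar model of "fsubr zdn, pg/m, zdn, #imm":
   active lanes become imm - zdn[i]; inactive lanes keep their old value.  */
static void
model_fsubr_imm_merge (float *zdn, const bool *pg, float imm, size_t lanes)
{
  for (size_t i = 0; i < lanes; ++i)
    if (pg[i])
      zdn[i] = imm - zdn[i];
}
```

Because inactive lanes keep the original `y[i]`, the "else" arm of the conditional comes for free, which is consistent with the test requiring no `mov`, `movprfx` or `sel`.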
@@ -0,0 +1,32 @@
/* { dg-do run { target aarch64_sve_hw } } */
/* { dg-options "-O2 -ftree-vectorize" } */

#include "cond_fsubr_1.c"

#define N 99

#define TEST_LOOP(TYPE, PRED_TYPE, NAME, CONST) \
  { \
    TYPE x[N], y[N]; \
    PRED_TYPE pred[N]; \
    for (int i = 0; i < N; ++i) \
      { \
        y[i] = i * i; \
        pred[i] = i % 3; \
      } \
    test_##TYPE##_##NAME (x, y, pred, N); \
    for (int i = 0; i < N; ++i) \
      { \
        TYPE expected = i % 3 != 1 ? (TYPE) CONST - y[i] : y[i]; \
        if (x[i] != expected) \
          __builtin_abort (); \
        asm volatile ("" ::: "memory"); \
      } \
  }

int
main (void)
{
  TEST_ALL (TEST_LOOP)
  return 0;
}
@@ -0,0 +1,44 @@
/* { dg-do compile } */
/* { dg-options "-O2 -ftree-vectorize" } */

#include <stdint.h>

#define DEF_LOOP(TYPE, NAME, CONST) \
  void __attribute__ ((noipa)) \
  test_##TYPE##_##NAME (TYPE *__restrict x, \
                        TYPE *__restrict y, \
                        TYPE *__restrict z, \
                        int n) \
  { \
    for (int i = 0; i < n; ++i) \
      x[i] = y[i] < 8 ? (TYPE) CONST - z[i] : y[i]; \
  }

#define TEST_TYPE(T, TYPE) \
  T (TYPE, half, 0.5) \
  T (TYPE, one, 1.0) \
  T (TYPE, two, 2.0)

#define TEST_ALL(T) \
  TEST_TYPE (T, float) \
  TEST_TYPE (T, double)

TEST_ALL (DEF_LOOP)

/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */

/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */

/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */

/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */

/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s\n} 3 } } */
/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 3 } } */

/* { dg-final { scan-assembler-not {\tmov\tz} } } */
/* { dg-final { scan-assembler-not {\tsel\t} } } */
@@ -0,0 +1,31 @@
/* { dg-do run { target aarch64_sve_hw } } */
/* { dg-options "-O2 -ftree-vectorize" } */

#include "cond_fsubr_2.c"

#define N 99

#define TEST_LOOP(TYPE, NAME, CONST) \
  { \
    TYPE x[N], y[N], z[N]; \
    for (int i = 0; i < N; ++i) \
      { \
        y[i] = i % 13; \
        z[i] = i * i; \
      } \
    test_##TYPE##_##NAME (x, y, z, N); \
    for (int i = 0; i < N; ++i) \
      { \
        TYPE expected = y[i] < 8 ? (TYPE) CONST - z[i] : y[i]; \
        if (x[i] != expected) \
          __builtin_abort (); \
        asm volatile ("" ::: "memory"); \
      } \
  }

int
main (void)
{
  TEST_ALL (TEST_LOOP)
  return 0;
}
@@ -0,0 +1,50 @@
/* { dg-do compile } */
/* { dg-options "-O2 -ftree-vectorize" } */

#include <stdint.h>

#define DEF_LOOP(TYPE, PRED_TYPE, NAME, CONST) \
  void __attribute__ ((noipa)) \
  test_##TYPE##_##NAME (TYPE *__restrict x, \
                        TYPE *__restrict y, \
                        PRED_TYPE *__restrict pred, \
                        int n) \
  { \
    for (int i = 0; i < n; ++i) \
      x[i] = pred[i] != 1 ? (TYPE) CONST - y[i] : 4; \
  }

#define TEST_TYPE(T, TYPE, PRED_TYPE) \
  T (TYPE, PRED_TYPE, half, 0.5) \
  T (TYPE, PRED_TYPE, one, 1.0) \
  T (TYPE, PRED_TYPE, two, 2.0)

#define TEST_ALL(T) \
  TEST_TYPE (T, _Float16, int16_t) \
  TEST_TYPE (T, float, int32_t) \
  TEST_TYPE (T, double, int64_t)

TEST_ALL (DEF_LOOP)

/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */

/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */

/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */

/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */

/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h, p[0-7], z[0-9]+\.h, z[0-9]+\.h\n} 3 } } */
/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.s, p[0-7], z[0-9]+\.s, z[0-9]+\.s\n} 3 } } */
/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.d, p[0-7], z[0-9]+\.d, z[0-9]+\.d\n} 3 } } */

/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */
/* { dg-final { scan-assembler-not {\tmov\tz} } } */
@@ -0,0 +1,32 @@
/* { dg-do run { target aarch64_sve_hw } } */
/* { dg-options "-O2 -ftree-vectorize" } */

#include "cond_fsubr_3.c"

#define N 99

#define TEST_LOOP(TYPE, PRED_TYPE, NAME, CONST) \
  { \
    TYPE x[N], y[N]; \
    PRED_TYPE pred[N]; \
    for (int i = 0; i < N; ++i) \
      { \
        y[i] = i * i; \
        pred[i] = i % 3; \
      } \
    test_##TYPE##_##NAME (x, y, pred, N); \
    for (int i = 0; i < N; ++i) \
      { \
        TYPE expected = i % 3 != 1 ? (TYPE) CONST - y[i] : 4; \
        if (x[i] != expected) \
          __builtin_abort (); \
        asm volatile ("" ::: "memory"); \
      } \
  }

int
main (void)
{
  TEST_ALL (TEST_LOOP)
  return 0;
}
@@ -0,0 +1,49 @@
/* { dg-do compile } */
/* { dg-options "-O2 -ftree-vectorize" } */

#include <stdint.h>

#define DEF_LOOP(TYPE, PRED_TYPE, NAME, CONST) \
  void __attribute__ ((noipa)) \
  test_##TYPE##_##NAME (TYPE *__restrict x, \
                        TYPE *__restrict y, \
                        PRED_TYPE *__restrict pred, \
                        int n) \
  { \
    for (int i = 0; i < n; ++i) \
      x[i] = pred[i] != 1 ? (TYPE) CONST - y[i] : 0; \
  }

#define TEST_TYPE(T, TYPE, PRED_TYPE) \
  T (TYPE, PRED_TYPE, half, 0.5) \
  T (TYPE, PRED_TYPE, one, 1.0) \
  T (TYPE, PRED_TYPE, two, 2.0)

#define TEST_ALL(T) \
  TEST_TYPE (T, _Float16, int16_t) \
  TEST_TYPE (T, float, int32_t) \
  TEST_TYPE (T, double, int64_t)

TEST_ALL (DEF_LOOP)

/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */

/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */

/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */

/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */

/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.s, p[0-7]/z, z[0-9]+\.s\n} 3 } } */
/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.d, p[0-7]/z, z[0-9]+\.d\n} 3 } } */

/* { dg-final { scan-assembler-not {\tmov\tz} } } */
/* { dg-final { scan-assembler-not {\tsel\t} } } */
@@ -0,0 +1,32 @@
/* { dg-do run { target aarch64_sve_hw } } */
/* { dg-options "-O2 -ftree-vectorize" } */

#include "cond_fsubr_4.c"

#define N 99

#define TEST_LOOP(TYPE, PRED_TYPE, NAME, CONST) \
  { \
    TYPE x[N], y[N]; \
    PRED_TYPE pred[N]; \
    for (int i = 0; i < N; ++i) \
      { \
        y[i] = i * i; \
        pred[i] = i % 3; \
      } \
    test_##TYPE##_##NAME (x, y, pred, N); \
    for (int i = 0; i < N; ++i) \
      { \
        TYPE expected = i % 3 != 1 ? (TYPE) CONST - y[i] : 0; \
        if (x[i] != expected) \
          __builtin_abort (); \
        asm volatile ("" ::: "memory"); \
      } \
  }

int
main (void)
{
  TEST_ALL (TEST_LOOP)
  return 0;
}