Add a compatible_vector_types_p target hook

One problem with adding an N-bit vector extension to an existing architecture is to decide how N-bit vectors should be passed to functions and returned from functions. Allowing all N-bit vector types to be passed in registers breaks backwards compatibility, since N-bit vectors could be used (and emulated) before the vector extension was added. But always passing N-bit vectors on the stack would be inefficient for things like vector libm functions. For SVE we took the compromise position of predefining new SVE vector types that are distinct from all existing vector types, including GNU-style vectors. The new types are passed and returned in an efficient way while existing vector types are passed and returned in the traditional way. In the right circumstances, the two types are inter-convertible. The SVE types are created using: vectype = build_distinct_type_copy (vectype); SET_TYPE_STRUCTURAL_EQUALITY (vectype); TYPE_ARTIFICIAL (vectype) = 1; The C frontend maintains this distinction, using VIEW_CONVERT_EXPR to convert from one type to the other. However, the distinction can be lost during gimple, which treats two vector types with the same mode, number of elements, and element type as equivalent. And for most targets that's the right thing to do. This patch therefore adds a hook that lets the target choose whether such vector types are indeed equivalent. Note that the new tests fail for -mabi=ilp32 in the same way as other ACLE-based tests. I'm still planning to fix that as a follow-on. 2020-01-09 Richard Sandiford <richard.sandiford@arm.com> gcc/ * target.def (compatible_vector_types_p): New target hook. * hooks.h (hook_bool_const_tree_const_tree_true): Declare. * hooks.c (hook_bool_const_tree_const_tree_true): New function. * doc/tm.texi.in (TARGET_COMPATIBLE_VECTOR_TYPES_P): New hook. * doc/tm.texi: Regenerate. * gimple-expr.c: Include target.h. (useless_type_conversion_p): Use targetm.compatible_vector_types_p. * config/aarch64/aarch64.c (aarch64_compatible_vector_types_p): New function. (TARGET_COMPATIBLE_VECTOR_TYPES_P): Define. * config/aarch64/aarch64-sve-builtins.cc (gimple_folder::convert_pred): Use the original predicate if it already has a suitable type. gcc/testsuite/ * gcc.target/aarch64/sve/pcs/gnu_vectors_1.c: New test. * gcc.target/aarch64/sve/pcs/gnu_vectors_2.c: Likewise. From-SVN: r280047
2020-01-09 15:08:26 +00:00 · 2020-01-09 15:08:26 +00:00 · 482b2b43e5
parent 15df004070
commit 482b2b43e5
12 changed files with 296 additions and 6 deletions
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@ -1,3 +1,18 @@
+2020-01-09  Richard Sandiford  <richard.sandiford@arm.com>
+
+	* target.def (compatible_vector_types_p): New target hook.
+	* hooks.h (hook_bool_const_tree_const_tree_true): Declare.
+	* hooks.c (hook_bool_const_tree_const_tree_true): New function.
+	* doc/tm.texi.in (TARGET_COMPATIBLE_VECTOR_TYPES_P): New hook.
+	* doc/tm.texi: Regenerate.
+	* gimple-expr.c: Include target.h.
+	(useless_type_conversion_p): Use targetm.compatible_vector_types_p.
+	* config/aarch64/aarch64.c (aarch64_compatible_vector_types_p): New
+	function.
+	(TARGET_COMPATIBLE_VECTOR_TYPES_P): Define.
+	* config/aarch64/aarch64-sve-builtins.cc (gimple_folder::convert_pred):
+	Use the original predicate if it already has a suitable type.
+
 2020-01-09  Martin Jambor  <mjambor@suse.cz>

 	* cgraph.h (cgraph_edge): Make remove, set_call_stmt, make_direct,
--- a/gcc/config/aarch64/aarch64-sve-builtins.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
@ -2265,9 +2265,13 @@ tree
 gimple_folder::convert_pred (gimple_seq &stmts, tree vectype,
 			     unsigned int argno)
 {
-  tree predtype = truth_type_for (vectype);
  tree pred = gimple_call_arg (call, argno);
-  return gimple_build (&stmts, VIEW_CONVERT_EXPR, predtype, pred);
+  if (known_eq (TYPE_VECTOR_SUBPARTS (TREE_TYPE (pred)),
+		TYPE_VECTOR_SUBPARTS (vectype)))
+    return pred;
+
+  return gimple_build (&stmts, VIEW_CONVERT_EXPR,
+		       truth_type_for (vectype), pred);
 }

 /* Return a pointer to the address in a contiguous load or store,
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@ -2098,6 +2098,15 @@ aarch64_fntype_abi (const_tree fntype)
  return default_function_abi;
 }

+/* Implement TARGET_COMPATIBLE_VECTOR_TYPES_P.  */
+
+static bool
+aarch64_compatible_vector_types_p (const_tree type1, const_tree type2)
+{
+  return (aarch64_sve::builtin_type_p (type1)
+	  == aarch64_sve::builtin_type_p (type2));
+}
+
 /* Return true if we should emit CFI for register REGNO.  */

 static bool
@ -22099,6 +22108,9 @@ aarch64_libgcc_floating_mode_supported_p
 #undef TARGET_VECTOR_MODE_SUPPORTED_P
 #define TARGET_VECTOR_MODE_SUPPORTED_P aarch64_vector_mode_supported_p

+#undef TARGET_COMPATIBLE_VECTOR_TYPES_P
+#define TARGET_COMPATIBLE_VECTOR_TYPES_P aarch64_compatible_vector_types_p
+
 #undef TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT
 #define TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT \
  aarch64_builtin_support_vector_misalignment
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@ -4324,6 +4324,27 @@ insns involving vector mode @var{mode}.  At the very least, it
 must have move patterns for this mode.
@end deftypefn

+@deftypefn {Target Hook} bool TARGET_COMPATIBLE_VECTOR_TYPES_P (const_tree @var{type1}, const_tree @var{type2})
+Return true if there is no target-specific reason for treating
+vector types @var{type1} and @var{type2} as distinct types.  The caller
+has already checked for target-independent reasons, meaning that the
+types are known to have the same mode, to have the same number of elements,
+and to have what the caller considers to be compatible element types.
+
+The main reason for defining this hook is to reject pairs of types
+that are handled differently by the target's calling convention.
+For example, when a new @var{N}-bit vector architecture is added
+to a target, the target may want to handle normal @var{N}-bit
+@code{VECTOR_TYPE} arguments and return values in the same way as
+before, to maintain backwards compatibility.  However, it may also
+provide new, architecture-specific @code{VECTOR_TYPE}s that are passed
+and returned in a more efficient way.  It is then important to maintain
+a distinction between the ``normal'' @code{VECTOR_TYPE}s and the new
+architecture-specific ones.
+
+The default implementation returns true, which is correct for most targets.
+@end deftypefn
+
@deftypefn {Target Hook} opt_machine_mode TARGET_ARRAY_MODE (machine_mode @var{mode}, unsigned HOST_WIDE_INT @var{nelems})
 Return the mode that GCC should use for an array that has
@var{nelems} elements, with each element having mode @var{mode}.
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@ -3365,6 +3365,8 @@ stack.

@hook TARGET_VECTOR_MODE_SUPPORTED_P

+@hook TARGET_COMPATIBLE_VECTOR_TYPES_P
+
@hook TARGET_ARRAY_MODE

@hook TARGET_ARRAY_MODE_SUPPORTED_P
--- a/gcc/gimple-expr.c
+++ b/gcc/gimple-expr.c
@ -37,6 +37,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-pass.h"
 #include "stringpool.h"
 #include "attribs.h"
+#include "target.h"

 /* ----- Type related -----  */

@ -147,10 +148,12 @@ useless_type_conversion_p (tree outer_type, tree inner_type)

  /* Recurse for vector types with the same number of subparts.  */
  else if (TREE_CODE (inner_type) == VECTOR_TYPE
-	   && TREE_CODE (outer_type) == VECTOR_TYPE
-	   && TYPE_PRECISION (inner_type) == TYPE_PRECISION (outer_type))
-    return useless_type_conversion_p (TREE_TYPE (outer_type),
-				      TREE_TYPE (inner_type));
+	   && TREE_CODE (outer_type) == VECTOR_TYPE)
+    return (known_eq (TYPE_VECTOR_SUBPARTS (inner_type),
+		      TYPE_VECTOR_SUBPARTS (outer_type))
+	    && useless_type_conversion_p (TREE_TYPE (outer_type),
+					  TREE_TYPE (inner_type))
+	    && targetm.compatible_vector_types_p (inner_type, outer_type));

  else if (TREE_CODE (inner_type) == ARRAY_TYPE
 	   && TREE_CODE (outer_type) == ARRAY_TYPE)
--- a/gcc/hooks.c
+++ b/gcc/hooks.c
@ -312,6 +312,12 @@ hook_bool_const_tree_false (const_tree)
  return false;
 }

+bool
+hook_bool_const_tree_const_tree_true (const_tree, const_tree)
+{
+  return true;
+}
+
 bool
 hook_bool_tree_true (tree)
 {
--- a/gcc/hooks.h
+++ b/gcc/hooks.h
@ -45,6 +45,7 @@ extern bool hook_bool_uint_uint_mode_false (unsigned int, unsigned int,
 extern bool hook_bool_uint_mode_true (unsigned int, machine_mode);
 extern bool hook_bool_tree_false (tree);
 extern bool hook_bool_const_tree_false (const_tree);
+extern bool hook_bool_const_tree_const_tree_true (const_tree, const_tree);
 extern bool hook_bool_tree_true (tree);
 extern bool hook_bool_const_tree_true (const_tree);
 extern bool hook_bool_gsiptr_false (gimple_stmt_iterator *);
--- a/gcc/target.def
+++ b/gcc/target.def
@ -3410,6 +3410,29 @@ must have move patterns for this mode.",
 bool, (machine_mode mode),
 hook_bool_mode_false)

+DEFHOOK
+(compatible_vector_types_p,
+ "Return true if there is no target-specific reason for treating\n\
+vector types @var{type1} and @var{type2} as distinct types.  The caller\n\
+has already checked for target-independent reasons, meaning that the\n\
+types are known to have the same mode, to have the same number of elements,\n\
+and to have what the caller considers to be compatible element types.\n\
+\n\
+The main reason for defining this hook is to reject pairs of types\n\
+that are handled differently by the target's calling convention.\n\
+For example, when a new @var{N}-bit vector architecture is added\n\
+to a target, the target may want to handle normal @var{N}-bit\n\
+@code{VECTOR_TYPE} arguments and return values in the same way as\n\
+before, to maintain backwards compatibility.  However, it may also\n\
+provide new, architecture-specific @code{VECTOR_TYPE}s that are passed\n\
+and returned in a more efficient way.  It is then important to maintain\n\
+a distinction between the ``normal'' @code{VECTOR_TYPE}s and the new\n\
+architecture-specific ones.\n\
+\n\
+The default implementation returns true, which is correct for most targets.",
+ bool, (const_tree type1, const_tree type2),
+ hook_bool_const_tree_const_tree_true)
+
 DEFHOOK
 (vector_alignment,
 "This hook can be used to define the alignment for a vector of type\n\
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@ -1,3 +1,8 @@
+2020-01-09  Richard Sandiford  <richard.sandiford@arm.com>
+
+	* gcc.target/aarch64/sve/pcs/gnu_vectors_1.c: New test.
+	* gcc.target/aarch64/sve/pcs/gnu_vectors_2.c: Likewise.
+
 2020-01-09  Tobias Burnus  <tobias@codesourcery.com>

 	PR fortran/84135
--- a/gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_1.c
@ -0,0 +1,99 @@
+/* { dg-options "-O -msve-vector-bits=256 -fomit-frame-pointer" } */
+
+#include <arm_sve.h>
+
+typedef float16_t float16x16_t __attribute__((vector_size (32)));
+typedef float32_t float32x8_t __attribute__((vector_size (32)));
+typedef float64_t float64x4_t __attribute__((vector_size (32)));
+typedef int8_t int8x32_t __attribute__((vector_size (32)));
+typedef int16_t int16x16_t __attribute__((vector_size (32)));
+typedef int32_t int32x8_t __attribute__((vector_size (32)));
+typedef int64_t int64x4_t __attribute__((vector_size (32)));
+typedef uint8_t uint8x32_t __attribute__((vector_size (32)));
+typedef uint16_t uint16x16_t __attribute__((vector_size (32)));
+typedef uint32_t uint32x8_t __attribute__((vector_size (32)));
+typedef uint64_t uint64x4_t __attribute__((vector_size (32)));
+
+void float16_callee (float16x16_t);
+void float32_callee (float32x8_t);
+void float64_callee (float64x4_t);
+void int8_callee (int8x32_t);
+void int16_callee (int16x16_t);
+void int32_callee (int32x8_t);
+void int64_callee (int64x4_t);
+void uint8_callee (uint8x32_t);
+void uint16_callee (uint16x16_t);
+void uint32_callee (uint32x8_t);
+void uint64_callee (uint64x4_t);
+
+void
+float16_caller (void)
+{
+  float16_callee (svdup_f16 (1.0));
+}
+
+void
+float32_caller (void)
+{
+  float32_callee (svdup_f32 (2.0));
+}
+
+void
+float64_caller (void)
+{
+  float64_callee (svdup_f64 (3.0));
+}
+
+void
+int8_caller (void)
+{
+  int8_callee (svindex_s8 (0, 1));
+}
+
+void
+int16_caller (void)
+{
+  int16_callee (svindex_s16 (0, 2));
+}
+
+void
+int32_caller (void)
+{
+  int32_callee (svindex_s32 (0, 3));
+}
+
+void
+int64_caller (void)
+{
+  int64_callee (svindex_s64 (0, 4));
+}
+
+void
+uint8_caller (void)
+{
+  uint8_callee (svindex_u8 (1, 1));
+}
+
+void
+uint16_caller (void)
+{
+  uint16_callee (svindex_u16 (1, 2));
+}
+
+void
+uint32_caller (void)
+{
+  uint32_callee (svindex_u32 (1, 3));
+}
+
+void
+uint64_caller (void)
+{
+  uint64_callee (svindex_u64 (1, 4));
+}
+
+/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.b, p[0-7], \[x0\]} 2 } } */
+/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.h, p[0-7], \[x0\]} 3 } } */
+/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s, p[0-7], \[x0\]} 3 } } */
+/* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d, p[0-7], \[x0\]} 3 } } */
+/* { dg-final { scan-assembler-times {\tadd\tx0, sp, #?16\n} 11 } } */
--- a/gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_2.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_2.c
@ -0,0 +1,99 @@
+/* { dg-options "-O -msve-vector-bits=256 -fomit-frame-pointer" } */
+
+#include <arm_sve.h>
+
+typedef float16_t float16x16_t __attribute__((vector_size (32)));
+typedef float32_t float32x8_t __attribute__((vector_size (32)));
+typedef float64_t float64x4_t __attribute__((vector_size (32)));
+typedef int8_t int8x32_t __attribute__((vector_size (32)));
+typedef int16_t int16x16_t __attribute__((vector_size (32)));
+typedef int32_t int32x8_t __attribute__((vector_size (32)));
+typedef int64_t int64x4_t __attribute__((vector_size (32)));
+typedef uint8_t uint8x32_t __attribute__((vector_size (32)));
+typedef uint16_t uint16x16_t __attribute__((vector_size (32)));
+typedef uint32_t uint32x8_t __attribute__((vector_size (32)));
+typedef uint64_t uint64x4_t __attribute__((vector_size (32)));
+
+void float16_callee (svfloat16_t);
+void float32_callee (svfloat32_t);
+void float64_callee (svfloat64_t);
+void int8_callee (svint8_t);
+void int16_callee (svint16_t);
+void int32_callee (svint32_t);
+void int64_callee (svint64_t);
+void uint8_callee (svuint8_t);
+void uint16_callee (svuint16_t);
+void uint32_callee (svuint32_t);
+void uint64_callee (svuint64_t);
+
+void
+float16_caller (float16x16_t arg)
+{
+  float16_callee (arg);
+}
+
+void
+float32_caller (float32x8_t arg)
+{
+  float32_callee (arg);
+}
+
+void
+float64_caller (float64x4_t arg)
+{
+  float64_callee (arg);
+}
+
+void
+int8_caller (int8x32_t arg)
+{
+  int8_callee (arg);
+}
+
+void
+int16_caller (int16x16_t arg)
+{
+  int16_callee (arg);
+}
+
+void
+int32_caller (int32x8_t arg)
+{
+  int32_callee (arg);
+}
+
+void
+int64_caller (int64x4_t arg)
+{
+  int64_callee (arg);
+}
+
+void
+uint8_caller (uint8x32_t arg)
+{
+  uint8_callee (arg);
+}
+
+void
+uint16_caller (uint16x16_t arg)
+{
+  uint16_callee (arg);
+}
+
+void
+uint32_caller (uint32x8_t arg)
+{
+  uint32_callee (arg);
+}
+
+void
+uint64_caller (uint64x4_t arg)
+{
+  uint64_callee (arg);
+}
+
+/* { dg-final { scan-assembler-times {\tld1b\tz0\.b, p[0-7]/z, \[x0\]} 2 } } */
+/* { dg-final { scan-assembler-times {\tld1h\tz0\.h, p[0-7]/z, \[x0\]} 3 } } */
+/* { dg-final { scan-assembler-times {\tld1w\tz0\.s, p[0-7]/z, \[x0\]} 3 } } */
+/* { dg-final { scan-assembler-times {\tld1d\tz0\.d, p[0-7]/z, \[x0\]} 3 } } */
+/* { dg-final { scan-assembler-not {\tst1[bhwd]\t} } } */