Add a compatible_vector_types_p target hook

One problem with adding an N-bit vector extension to an existing
architecture is to decide how N-bit vectors should be passed to
functions and returned from functions.  Allowing all N-bit vector
types to be passed in registers breaks backwards compatibility,
since N-bit vectors could be used (and emulated) before the vector
extension was added.  But always passing N-bit vectors on the
stack would be inefficient for things like vector libm functions.

For SVE we took the compromise position of predefining new SVE vector
types that are distinct from all existing vector types, including
GNU-style vectors.  The new types are passed and returned in an
efficient way while existing vector types are passed and returned
in the traditional way.  In the right circumstances, the two types
are inter-convertible.

The SVE types are created using:

      vectype = build_distinct_type_copy (vectype);
      SET_TYPE_STRUCTURAL_EQUALITY (vectype);
      TYPE_ARTIFICIAL (vectype) = 1;

The C frontend maintains this distinction, using VIEW_CONVERT_EXPR
to convert from one type to the other.  However, the distinction can
be lost during gimple, which treats two vector types with the same
mode, number of elements, and element type as equivalent.  And for
most targets that's the right thing to do.

This patch therefore adds a hook that lets the target choose
whether such vector types are indeed equivalent.

Note that the new tests fail for -mabi=ilp32 in the same way as other
ACLE-based tests.  I'm still planning to fix that as a follow-on.

2020-01-09  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* target.def (compatible_vector_types_p): New target hook.
	* hooks.h (hook_bool_const_tree_const_tree_true): Declare.
	* hooks.c (hook_bool_const_tree_const_tree_true): New function.
	* doc/tm.texi.in (TARGET_COMPATIBLE_VECTOR_TYPES_P): New hook.
	* doc/tm.texi: Regenerate.
	* gimple-expr.c: Include target.h.
	(useless_type_conversion_p): Use targetm.compatible_vector_types_p.
	* config/aarch64/aarch64.c (aarch64_compatible_vector_types_p): New
	function.
	(TARGET_COMPATIBLE_VECTOR_TYPES_P): Define.
	* config/aarch64/aarch64-sve-builtins.cc (gimple_folder::convert_pred):
	Use the original predicate if it already has a suitable type.

gcc/testsuite/
	* gcc.target/aarch64/sve/pcs/gnu_vectors_1.c: New test.
	* gcc.target/aarch64/sve/pcs/gnu_vectors_2.c: Likewise.

From-SVN: r280047
This commit is contained in:
Richard Sandiford 2020-01-09 15:08:26 +00:00 committed by Richard Sandiford
parent 15df004070
commit 482b2b43e5
12 changed files with 296 additions and 6 deletions

View File

@ -1,3 +1,18 @@
2020-01-09 Richard Sandiford <richard.sandiford@arm.com>
* target.def (compatible_vector_types_p): New target hook.
* hooks.h (hook_bool_const_tree_const_tree_true): Declare.
* hooks.c (hook_bool_const_tree_const_tree_true): New function.
* doc/tm.texi.in (TARGET_COMPATIBLE_VECTOR_TYPES_P): New hook.
* doc/tm.texi: Regenerate.
* gimple-expr.c: Include target.h.
(useless_type_conversion_p): Use targetm.compatible_vector_types_p.
* config/aarch64/aarch64.c (aarch64_compatible_vector_types_p): New
function.
(TARGET_COMPATIBLE_VECTOR_TYPES_P): Define.
* config/aarch64/aarch64-sve-builtins.cc (gimple_folder::convert_pred):
Use the original predicate if it already has a suitable type.
2020-01-09 Martin Jambor <mjambor@suse.cz>
* cgraph.h (cgraph_edge): Make remove, set_call_stmt, make_direct,

View File

@ -2265,9 +2265,13 @@ tree
gimple_folder::convert_pred (gimple_seq &stmts, tree vectype,
unsigned int argno)
{
tree predtype = truth_type_for (vectype);
tree pred = gimple_call_arg (call, argno);
return gimple_build (&stmts, VIEW_CONVERT_EXPR, predtype, pred);
if (known_eq (TYPE_VECTOR_SUBPARTS (TREE_TYPE (pred)),
TYPE_VECTOR_SUBPARTS (vectype)))
return pred;
return gimple_build (&stmts, VIEW_CONVERT_EXPR,
truth_type_for (vectype), pred);
}
/* Return a pointer to the address in a contiguous load or store,

View File

@ -2098,6 +2098,15 @@ aarch64_fntype_abi (const_tree fntype)
return default_function_abi;
}
/* Implement TARGET_COMPATIBLE_VECTOR_TYPES_P. */
static bool
aarch64_compatible_vector_types_p (const_tree type1, const_tree type2)
{
return (aarch64_sve::builtin_type_p (type1)
== aarch64_sve::builtin_type_p (type2));
}
/* Return true if we should emit CFI for register REGNO. */
static bool
@ -22099,6 +22108,9 @@ aarch64_libgcc_floating_mode_supported_p
#undef TARGET_VECTOR_MODE_SUPPORTED_P
#define TARGET_VECTOR_MODE_SUPPORTED_P aarch64_vector_mode_supported_p
#undef TARGET_COMPATIBLE_VECTOR_TYPES_P
#define TARGET_COMPATIBLE_VECTOR_TYPES_P aarch64_compatible_vector_types_p
#undef TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT
#define TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT \
aarch64_builtin_support_vector_misalignment

View File

@ -4324,6 +4324,27 @@ insns involving vector mode @var{mode}. At the very least, it
must have move patterns for this mode.
@end deftypefn
@deftypefn {Target Hook} bool TARGET_COMPATIBLE_VECTOR_TYPES_P (const_tree @var{type1}, const_tree @var{type2})
Return true if there is no target-specific reason for treating
vector types @var{type1} and @var{type2} as distinct types. The caller
has already checked for target-independent reasons, meaning that the
types are known to have the same mode, to have the same number of elements,
and to have what the caller considers to be compatible element types.
The main reason for defining this hook is to reject pairs of types
that are handled differently by the target's calling convention.
For example, when a new @var{N}-bit vector architecture is added
to a target, the target may want to handle normal @var{N}-bit
@code{VECTOR_TYPE} arguments and return values in the same way as
before, to maintain backwards compatibility. However, it may also
provide new, architecture-specific @code{VECTOR_TYPE}s that are passed
and returned in a more efficient way. It is then important to maintain
a distinction between the ``normal'' @code{VECTOR_TYPE}s and the new
architecture-specific ones.
The default implementation returns true, which is correct for most targets.
@end deftypefn
@deftypefn {Target Hook} opt_machine_mode TARGET_ARRAY_MODE (machine_mode @var{mode}, unsigned HOST_WIDE_INT @var{nelems})
Return the mode that GCC should use for an array that has
@var{nelems} elements, with each element having mode @var{mode}.

View File

@ -3365,6 +3365,8 @@ stack.
@hook TARGET_VECTOR_MODE_SUPPORTED_P
@hook TARGET_COMPATIBLE_VECTOR_TYPES_P
@hook TARGET_ARRAY_MODE
@hook TARGET_ARRAY_MODE_SUPPORTED_P

View File

@ -37,6 +37,7 @@ along with GCC; see the file COPYING3. If not see
#include "tree-pass.h"
#include "stringpool.h"
#include "attribs.h"
#include "target.h"
/* ----- Type related ----- */
@ -147,10 +148,12 @@ useless_type_conversion_p (tree outer_type, tree inner_type)
/* Recurse for vector types with the same number of subparts. */
else if (TREE_CODE (inner_type) == VECTOR_TYPE
&& TREE_CODE (outer_type) == VECTOR_TYPE
&& TYPE_PRECISION (inner_type) == TYPE_PRECISION (outer_type))
return useless_type_conversion_p (TREE_TYPE (outer_type),
TREE_TYPE (inner_type));
&& TREE_CODE (outer_type) == VECTOR_TYPE)
return (known_eq (TYPE_VECTOR_SUBPARTS (inner_type),
TYPE_VECTOR_SUBPARTS (outer_type))
&& useless_type_conversion_p (TREE_TYPE (outer_type),
TREE_TYPE (inner_type))
&& targetm.compatible_vector_types_p (inner_type, outer_type));
else if (TREE_CODE (inner_type) == ARRAY_TYPE
&& TREE_CODE (outer_type) == ARRAY_TYPE)

View File

@ -312,6 +312,12 @@ hook_bool_const_tree_false (const_tree)
return false;
}
bool
hook_bool_const_tree_const_tree_true (const_tree, const_tree)
{
return true;
}
bool
hook_bool_tree_true (tree)
{

View File

@ -45,6 +45,7 @@ extern bool hook_bool_uint_uint_mode_false (unsigned int, unsigned int,
extern bool hook_bool_uint_mode_true (unsigned int, machine_mode);
extern bool hook_bool_tree_false (tree);
extern bool hook_bool_const_tree_false (const_tree);
extern bool hook_bool_const_tree_const_tree_true (const_tree, const_tree);
extern bool hook_bool_tree_true (tree);
extern bool hook_bool_const_tree_true (const_tree);
extern bool hook_bool_gsiptr_false (gimple_stmt_iterator *);

View File

@ -3410,6 +3410,29 @@ must have move patterns for this mode.",
bool, (machine_mode mode),
hook_bool_mode_false)
DEFHOOK
(compatible_vector_types_p,
"Return true if there is no target-specific reason for treating\n\
vector types @var{type1} and @var{type2} as distinct types. The caller\n\
has already checked for target-independent reasons, meaning that the\n\
types are known to have the same mode, to have the same number of elements,\n\
and to have what the caller considers to be compatible element types.\n\
\n\
The main reason for defining this hook is to reject pairs of types\n\
that are handled differently by the target's calling convention.\n\
For example, when a new @var{N}-bit vector architecture is added\n\
to a target, the target may want to handle normal @var{N}-bit\n\
@code{VECTOR_TYPE} arguments and return values in the same way as\n\
before, to maintain backwards compatibility. However, it may also\n\
provide new, architecture-specific @code{VECTOR_TYPE}s that are passed\n\
and returned in a more efficient way. It is then important to maintain\n\
a distinction between the ``normal'' @code{VECTOR_TYPE}s and the new\n\
architecture-specific ones.\n\
\n\
The default implementation returns true, which is correct for most targets.",
bool, (const_tree type1, const_tree type2),
hook_bool_const_tree_const_tree_true)
DEFHOOK
(vector_alignment,
"This hook can be used to define the alignment for a vector of type\n\

View File

@ -1,3 +1,8 @@
2020-01-09 Richard Sandiford <richard.sandiford@arm.com>
* gcc.target/aarch64/sve/pcs/gnu_vectors_1.c: New test.
* gcc.target/aarch64/sve/pcs/gnu_vectors_2.c: Likewise.
2020-01-09 Tobias Burnus <tobias@codesourcery.com>
PR fortran/84135

View File

@ -0,0 +1,99 @@
/* { dg-options "-O -msve-vector-bits=256 -fomit-frame-pointer" } */
#include <arm_sve.h>
typedef float16_t float16x16_t __attribute__((vector_size (32)));
typedef float32_t float32x8_t __attribute__((vector_size (32)));
typedef float64_t float64x4_t __attribute__((vector_size (32)));
typedef int8_t int8x32_t __attribute__((vector_size (32)));
typedef int16_t int16x16_t __attribute__((vector_size (32)));
typedef int32_t int32x8_t __attribute__((vector_size (32)));
typedef int64_t int64x4_t __attribute__((vector_size (32)));
typedef uint8_t uint8x32_t __attribute__((vector_size (32)));
typedef uint16_t uint16x16_t __attribute__((vector_size (32)));
typedef uint32_t uint32x8_t __attribute__((vector_size (32)));
typedef uint64_t uint64x4_t __attribute__((vector_size (32)));
void float16_callee (float16x16_t);
void float32_callee (float32x8_t);
void float64_callee (float64x4_t);
void int8_callee (int8x32_t);
void int16_callee (int16x16_t);
void int32_callee (int32x8_t);
void int64_callee (int64x4_t);
void uint8_callee (uint8x32_t);
void uint16_callee (uint16x16_t);
void uint32_callee (uint32x8_t);
void uint64_callee (uint64x4_t);
void
float16_caller (void)
{
float16_callee (svdup_f16 (1.0));
}
void
float32_caller (void)
{
float32_callee (svdup_f32 (2.0));
}
void
float64_caller (void)
{
float64_callee (svdup_f64 (3.0));
}
void
int8_caller (void)
{
int8_callee (svindex_s8 (0, 1));
}
void
int16_caller (void)
{
int16_callee (svindex_s16 (0, 2));
}
void
int32_caller (void)
{
int32_callee (svindex_s32 (0, 3));
}
void
int64_caller (void)
{
int64_callee (svindex_s64 (0, 4));
}
void
uint8_caller (void)
{
uint8_callee (svindex_u8 (1, 1));
}
void
uint16_caller (void)
{
uint16_callee (svindex_u16 (1, 2));
}
void
uint32_caller (void)
{
uint32_callee (svindex_u32 (1, 3));
}
void
uint64_caller (void)
{
uint64_callee (svindex_u64 (1, 4));
}
/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.b, p[0-7], \[x0\]} 2 } } */
/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.h, p[0-7], \[x0\]} 3 } } */
/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s, p[0-7], \[x0\]} 3 } } */
/* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d, p[0-7], \[x0\]} 3 } } */
/* { dg-final { scan-assembler-times {\tadd\tx0, sp, #?16\n} 11 } } */

View File

@ -0,0 +1,99 @@
/* { dg-options "-O -msve-vector-bits=256 -fomit-frame-pointer" } */
#include <arm_sve.h>
typedef float16_t float16x16_t __attribute__((vector_size (32)));
typedef float32_t float32x8_t __attribute__((vector_size (32)));
typedef float64_t float64x4_t __attribute__((vector_size (32)));
typedef int8_t int8x32_t __attribute__((vector_size (32)));
typedef int16_t int16x16_t __attribute__((vector_size (32)));
typedef int32_t int32x8_t __attribute__((vector_size (32)));
typedef int64_t int64x4_t __attribute__((vector_size (32)));
typedef uint8_t uint8x32_t __attribute__((vector_size (32)));
typedef uint16_t uint16x16_t __attribute__((vector_size (32)));
typedef uint32_t uint32x8_t __attribute__((vector_size (32)));
typedef uint64_t uint64x4_t __attribute__((vector_size (32)));
void float16_callee (svfloat16_t);
void float32_callee (svfloat32_t);
void float64_callee (svfloat64_t);
void int8_callee (svint8_t);
void int16_callee (svint16_t);
void int32_callee (svint32_t);
void int64_callee (svint64_t);
void uint8_callee (svuint8_t);
void uint16_callee (svuint16_t);
void uint32_callee (svuint32_t);
void uint64_callee (svuint64_t);
void
float16_caller (float16x16_t arg)
{
float16_callee (arg);
}
void
float32_caller (float32x8_t arg)
{
float32_callee (arg);
}
void
float64_caller (float64x4_t arg)
{
float64_callee (arg);
}
void
int8_caller (int8x32_t arg)
{
int8_callee (arg);
}
void
int16_caller (int16x16_t arg)
{
int16_callee (arg);
}
void
int32_caller (int32x8_t arg)
{
int32_callee (arg);
}
void
int64_caller (int64x4_t arg)
{
int64_callee (arg);
}
void
uint8_caller (uint8x32_t arg)
{
uint8_callee (arg);
}
void
uint16_caller (uint16x16_t arg)
{
uint16_callee (arg);
}
void
uint32_caller (uint32x8_t arg)
{
uint32_callee (arg);
}
void
uint64_caller (uint64x4_t arg)
{
uint64_callee (arg);
}
/* { dg-final { scan-assembler-times {\tld1b\tz0\.b, p[0-7]/z, \[x0\]} 2 } } */
/* { dg-final { scan-assembler-times {\tld1h\tz0\.h, p[0-7]/z, \[x0\]} 3 } } */
/* { dg-final { scan-assembler-times {\tld1w\tz0\.s, p[0-7]/z, \[x0\]} 3 } } */
/* { dg-final { scan-assembler-times {\tld1d\tz0\.d, p[0-7]/z, \[x0\]} 3 } } */
/* { dg-final { scan-assembler-not {\tst1[bhwd]\t} } } */