Define AVX broadcast intrinsics

This defines `_mm256_broadcast_ps` and `_mm256_broadcast_pd`. The `_ss`
and `_sd` variants are not available as LLVM intrinsics; Clang
implements them as inline functions in its headers.

Intel reference: https://software.intel.com/en-us/node/514144.

Note: the argument type should really be "0hPc" (a pointer to a vector
of half the width), but internally the LLVM intrinsic takes a pointer to
a signed 8-bit integer ("s8SPc"), and LLVM rejects any other pointer
type. Calling these intrinsics therefore requires a transmute.
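
The note above can be illustrated with a small portable sketch. It is a
scalar model, not the real binding: `broadcast_ps_raw` and `broadcast_ps`
are hypothetical names, plain arrays stand in for the vector types, and a
pointer cast is used where the real call site would transmute. It only
shows the shape of the problem: the raw entry point wants a byte pointer,
so a typed wrapper has to reinterpret the pointer to the half-width
vector before calling it.

```rust
/// Hypothetical stand-in for the raw binding. Like the LLVM intrinsic, it
/// accepts a byte pointer instead of a pointer to a 128-bit vector.
///
/// Safety: `ptr` must point to 16 readable bytes holding four `f32` values.
unsafe fn broadcast_ps_raw(ptr: *const i8) -> [f32; 8] {
    // A byte pointer carries no alignment guarantee, so read unaligned.
    let half: [f32; 4] = unsafe { std::ptr::read_unaligned(ptr as *const [f32; 4]) };
    // Copy the 128-bit half into both halves of the 256-bit result.
    let mut out = [0.0f32; 8];
    out[..4].copy_from_slice(&half);
    out[4..].copy_from_slice(&half);
    out
}

/// Typed wrapper: callers pass the half-width vector, and the pointer
/// reinterpretation is confined to this one function.
fn broadcast_ps(half: &[f32; 4]) -> [f32; 8] {
    let ptr = half as *const [f32; 4] as *const i8;
    // Sound because `half` is a valid reference to 16 bytes of `f32` data.
    unsafe { broadcast_ps_raw(ptr) }
}
```

Calling `broadcast_ps(&[1.0, 2.0, 3.0, 4.0])` yields the four input
values repeated in the lower and upper halves of the result.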

The AVX2 broadcast intrinsics `_mm256_broadcastss_ps` and
`_mm256_broadcastsd_pd` are not available as LLVM intrinsics. In Clang
they are implemented using the shufflevector builtin.
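
As a rough illustration of the shuffle-based approach, the scalar model
below mimics what `_mm256_broadcastss_ps` computes. It assumes the
shuffle uses an all-zero index mask, so every output lane selects input
lane 0; `broadcastss_ps_model` is a hypothetical name and plain arrays
stand in for the vector types.

```rust
/// Scalar model of `_mm256_broadcastss_ps`: the lowest `f32` of a 128-bit
/// input is copied to all eight lanes of the 256-bit result.
fn broadcastss_ps_model(a: [f32; 4]) -> [f32; 8] {
    // Assumed shuffle mask: each of the 8 output lanes selects input lane 0.
    const MASK: [usize; 8] = [0; 8];
    let mut out = [0.0f32; 8];
    for (lane, &src) in MASK.iter().enumerate() {
        out[lane] = a[src];
    }
    out
}
```

Only lane 0 of the input influences the result; the other three lanes
are ignored.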
Ruud van Asseldonk 2016-03-08 21:41:18 +01:00
parent 8f0479b2a5
commit 37efeae886

@@ -8,6 +8,13 @@
"ret": "f(32-64)",
"args": ["0", "0"]
},
{
"intrinsic": "256_broadcast_{0.data_type}",
"width": [256],
"llvm": "vbroadcastf128.{0.data_type}.256",
"ret": "f(32-64)",
"args": ["s8SPc"]
},
{
"intrinsic": "256_dp_ps",
"width": [256],