gcc/gcc/config/i386/mmintrin.h

950 lines
31 KiB
C
Raw Normal View History

/* Copyright (C) 2002-2017 Free Software Foundation, Inc.
This file is part of GCC.
GCC is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3, or (at your option)
any later version.
GCC is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
Under Section 7 of GPL version 3, you are granted additional
permissions described in the GCC Runtime Library Exception, version
3.1, as published by the Free Software Foundation.
You should have received a copy of the GNU General Public License and
a copy of the GCC Runtime Library Exception along with this program;
see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
<http://www.gnu.org/licenses/>. */
/* Implemented from the specification included in the Intel C++ Compiler
User Guide and Reference, version 9.0. */
#ifndef _MMINTRIN_H_INCLUDED
#define _MMINTRIN_H_INCLUDED
#if defined __x86_64__ && !defined __SSE__ || !defined __MMX__
Allow mmintrin headers to work with function specific target opts. Allow mmintrin headers to work with function specific target opts. Please see discussion here: http://gcc.gnu.org/ml/gcc-patches/2013-04/msg00740.html * config/i386/i386.c (ix86_pragma_target_parse): Restore target when current target options does not apply. * config/i386/i386-protos.h (ix86_reset_previous_fndecl): New function. * config/i386/i386.c (ix86_reset_previous_fndecl): Ditto. * config/i386/bmiintrin.h: Pass appropriate target attributes to header. * config/i386/mmintrin.h: Ditto. * config/i386/nmmintrin.h: Ditto. * config/i386/avx2intrin.h: Ditto. * config/i386/fxsrintrin.h: Ditto. * config/i386/tbmintrin.h: Ditto. * config/i386/xsaveintrin.h: Ditto. * config/i386/f16cintrin.h: Ditto. * config/i386/xtestintrin.h: Ditto. * config/i386/xsaveoptintrin.h: Ditto. * config/i386/bmi2intrin.h: Ditto. * config/i386/lzcntintrin.h: Ditto. * config/i386/smmintrin.h: Ditto. * config/i386/wmmintrin.h: Ditto. * config/i386/x86intrin.h: Remove all header include guards. * config/i386/prfchwintrin.h: Ditto. * config/i386/pmmintrin.h: Ditto. * config/i386/tmmintrin.h: Ditto. * config/i386/xmmintrin.h: Ditto. * config/i386/popcntintrin.h: Ditto. * config/i386/rdseedintrin.h: Ditto. * config/i386/ammintrin.h: Ditto. * config/i386/emmintrin.h: Ditto. * config/i386/immintrin.h: Remove all header include guards. * config/i386/fma4intrin.h: Ditto. * config/i386/lwpintrin.h: Ditto. * config/i386/xopintrin.h: Ditto. * config/i386/ia32intrin.h: Ditto. * config/i386/avxintrin.h: Ditto. * config/i386/rtmintrin.h: Ditto. * config/i386/fmaintrin.h: Ditto. * config/i386/mm3dnow.h: Ditto. * testsuite/gcc.target/i386/intrinsics_1.c: New test. * testsuite/gcc.target/i386/intrinsics_2.c: Ditto. * testsuite/gcc.target/i386/intrinsics_3.c: Ditto. * testsuite/gcc.target/i386/intrinsics_4.c: Ditto. * testsuite/gcc.target/i386/intrinsics_5.c: Ditto. * testsuite/gcc.target/i386/intrinsics_6.c: Ditto. * testsuite/gcc.target/i386/avx-1.c: Provide macros for builtins needing immediate arguments in f16cintrin.h and rtmintrin.h. From-SVN: r200349
2013-06-23 08:15:19 +02:00
#pragma GCC push_options
#ifdef __x86_64__
#pragma GCC target("sse,mmx")
#else
Allow mmintrin headers to work with function specific target opts. Allow mmintrin headers to work with function specific target opts. Please see discussion here: http://gcc.gnu.org/ml/gcc-patches/2013-04/msg00740.html * config/i386/i386.c (ix86_pragma_target_parse): Restore target when current target options does not apply. * config/i386/i386-protos.h (ix86_reset_previous_fndecl): New function. * config/i386/i386.c (ix86_reset_previous_fndecl): Ditto. * config/i386/bmiintrin.h: Pass appropriate target attributes to header. * config/i386/mmintrin.h: Ditto. * config/i386/nmmintrin.h: Ditto. * config/i386/avx2intrin.h: Ditto. * config/i386/fxsrintrin.h: Ditto. * config/i386/tbmintrin.h: Ditto. * config/i386/xsaveintrin.h: Ditto. * config/i386/f16cintrin.h: Ditto. * config/i386/xtestintrin.h: Ditto. * config/i386/xsaveoptintrin.h: Ditto. * config/i386/bmi2intrin.h: Ditto. * config/i386/lzcntintrin.h: Ditto. * config/i386/smmintrin.h: Ditto. * config/i386/wmmintrin.h: Ditto. * config/i386/x86intrin.h: Remove all header include guards. * config/i386/prfchwintrin.h: Ditto. * config/i386/pmmintrin.h: Ditto. * config/i386/tmmintrin.h: Ditto. * config/i386/xmmintrin.h: Ditto. * config/i386/popcntintrin.h: Ditto. * config/i386/rdseedintrin.h: Ditto. * config/i386/ammintrin.h: Ditto. * config/i386/emmintrin.h: Ditto. * config/i386/immintrin.h: Remove all header include guards. * config/i386/fma4intrin.h: Ditto. * config/i386/lwpintrin.h: Ditto. * config/i386/xopintrin.h: Ditto. * config/i386/ia32intrin.h: Ditto. * config/i386/avxintrin.h: Ditto. * config/i386/rtmintrin.h: Ditto. * config/i386/fmaintrin.h: Ditto. * config/i386/mm3dnow.h: Ditto. * testsuite/gcc.target/i386/intrinsics_1.c: New test. * testsuite/gcc.target/i386/intrinsics_2.c: Ditto. * testsuite/gcc.target/i386/intrinsics_3.c: Ditto. * testsuite/gcc.target/i386/intrinsics_4.c: Ditto. * testsuite/gcc.target/i386/intrinsics_5.c: Ditto. * testsuite/gcc.target/i386/intrinsics_6.c: Ditto. * testsuite/gcc.target/i386/avx-1.c: Provide macros for builtins needing immediate arguments in f16cintrin.h and rtmintrin.h. From-SVN: r200349
2013-06-23 08:15:19 +02:00
#pragma GCC target("mmx")
#endif
Allow mmintrin headers to work with function specific target opts. Allow mmintrin headers to work with function specific target opts. Please see discussion here: http://gcc.gnu.org/ml/gcc-patches/2013-04/msg00740.html * config/i386/i386.c (ix86_pragma_target_parse): Restore target when current target options does not apply. * config/i386/i386-protos.h (ix86_reset_previous_fndecl): New function. * config/i386/i386.c (ix86_reset_previous_fndecl): Ditto. * config/i386/bmiintrin.h: Pass appropriate target attributes to header. * config/i386/mmintrin.h: Ditto. * config/i386/nmmintrin.h: Ditto. * config/i386/avx2intrin.h: Ditto. * config/i386/fxsrintrin.h: Ditto. * config/i386/tbmintrin.h: Ditto. * config/i386/xsaveintrin.h: Ditto. * config/i386/f16cintrin.h: Ditto. * config/i386/xtestintrin.h: Ditto. * config/i386/xsaveoptintrin.h: Ditto. * config/i386/bmi2intrin.h: Ditto. * config/i386/lzcntintrin.h: Ditto. * config/i386/smmintrin.h: Ditto. * config/i386/wmmintrin.h: Ditto. * config/i386/x86intrin.h: Remove all header include guards. * config/i386/prfchwintrin.h: Ditto. * config/i386/pmmintrin.h: Ditto. * config/i386/tmmintrin.h: Ditto. * config/i386/xmmintrin.h: Ditto. * config/i386/popcntintrin.h: Ditto. * config/i386/rdseedintrin.h: Ditto. * config/i386/ammintrin.h: Ditto. * config/i386/emmintrin.h: Ditto. * config/i386/immintrin.h: Remove all header include guards. * config/i386/fma4intrin.h: Ditto. * config/i386/lwpintrin.h: Ditto. * config/i386/xopintrin.h: Ditto. * config/i386/ia32intrin.h: Ditto. * config/i386/avxintrin.h: Ditto. * config/i386/rtmintrin.h: Ditto. * config/i386/fmaintrin.h: Ditto. * config/i386/mm3dnow.h: Ditto. * testsuite/gcc.target/i386/intrinsics_1.c: New test. * testsuite/gcc.target/i386/intrinsics_2.c: Ditto. * testsuite/gcc.target/i386/intrinsics_3.c: Ditto. * testsuite/gcc.target/i386/intrinsics_4.c: Ditto. * testsuite/gcc.target/i386/intrinsics_5.c: Ditto. * testsuite/gcc.target/i386/intrinsics_6.c: Ditto. * testsuite/gcc.target/i386/avx-1.c: Provide macros for builtins needing immediate arguments in f16cintrin.h and rtmintrin.h. From-SVN: r200349
2013-06-23 08:15:19 +02:00
#define __DISABLE_MMX__
#endif /* __MMX__ */
/* The Intel API is flexible enough that we must allow aliasing with other
vector types, and their scalar components. */
typedef int __m64 __attribute__ ((__vector_size__ (8), __may_alias__));
/* Unaligned version of the same type */
typedef int __m64_u __attribute__ ((__vector_size__ (8), __may_alias__, __aligned__ (1)));
/* Internal data types for implementing the intrinsics. */
typedef int __v2si __attribute__ ((__vector_size__ (8)));
typedef short __v4hi __attribute__ ((__vector_size__ (8)));
typedef char __v8qi __attribute__ ((__vector_size__ (8)));
re PR target/22152 (Poor loop optimization when using mmx builtins) 2008-03-08 Uros Bizjak <ubizjak@gmail.com> PR target/22152 * config/i386/i386-modes.def (V1DI): New vector mode. * config/i386/i386.h (VALID_MMX_REG_MODE): Add V1DImode. * config/i386/mmx.md (MMXMODEI8): New mode iterator. (MMXMODE248): Ditto. (MMXMODE): Add V1DI mode. (mmxvecsize): Change DI mode to V1DI mode. ("mov<mode>): Use MMXMODEI8 mode iterator. ("*mov<mode>_internal_rex64"): Ditto. ("*mov<mode>_internal"): Ditto. ("mmx_add<mode>3"): Ditto. Handle V1DImode for TARGET_SSE2. ("mmx_sub<mode>3"): Ditto. ("mmx_adddi3"): Remove insn pattern. ("mmx_subdi3"): Ditto. ("mmx_ashr<mode>3"): Use SImode and "yN" constraint for operand 2. ("mmx_lshr<mode>3"): Ditto. Use MMXMODE248 mode iterator. ("mmx_ashl<mode>3"): Ditto. ("mmx_lshrdi3"): Remove insn pattern. ("mmx_ashldi3"): Ditto. * config/i386/i386.c (classify_argument): Handle V1DImode. (function_arg_advance_32): Ditto. (function_arg_32): Ditto. (struct builtin_description) [IX86_BUILTIN_PADDQ]: Use mmx_addv1di3 insn pattern. [IX86_BUILTIN_PSUBQ]: Use mmx_subv1di3 insn pattern. [IX86_BUILTIN_PSLL?, IX86_BUILTIN_PSRL?, IX86_BUILTIN_PSRA?, IX86_BUILTIN_PSLL?I, IX86_BUILTIN_PSRL?I, IX86_BUILTIN_PSRA?I, IX86_BUILTIN_PSLL?I128, IX86_BUILTIN_PSRL?I128, IX86_BUILTIN_PSRA?I128]: Remove definitions of built-in functions. (V1DI_type_node): New node. (v1di_ftype_v1di_int): Ditto. (v1di_ftype_v1di_v1di): Ditto. (v2si_ftype_v2si_si): Ditto. (v4hi_ftype_v4hi_di): Remove node. (v2si_ftype_v2si_di): Ditto. (ix86_init_mmx_sse_builtins): Handle V1DImode. (__builtin_ia32_psll?, __builtin_ia32_psrl?, __builtin_ia32_psra?): Redefine builtins using def_builtin_const with *_ftype_*_int node. (__builtin_ia32_psll?i, __builtin_ia32_psrl?i, __builtin_ia32_psra?i): Add new builtins using def_builtin_const. (ix86_expand_builtin) [IX86_BUILTIN_PSLL?, IX86_BUILTIN_PSRL?, IX86_BUILTIN_PSRA?, IX86_BUILTIN_PSLL?I, IX86_BUILTIN_PSRL?I, IX86_BUILTIN_PSRA?I]: Handle builtin definitions. * config/i386/mmintrin.h (__v1di): New typedef. (_mm_add_si64): Cast arguments to __v1di type. (_mm_sub_si64): Ditto. (_mm_sll_pi16): Cast __count to __v4hi type. (_mm_sll_pi32): Cast __count to __v2si type. (_mm_sll_si64): Cast arguments to __v1di type. (_mm_srl_pi16): Cast __count to __v4hi type. (_mm_srl_pi32): Cast __count to __v2si type. (_mm_srl_si64): Cast arguments to __v1di type. (_mm_sra_pi16): Cast __count to __v4hi type. (_mm_sra_pi32): Cast __count to __v2si type. (_mm_slli_pi16): Use __builtin_ia32_psllwi. (_mm_slli_pi32): Use __builtin_ia32_pslldi. (_mm_slli_si64): Use __builtin_ia32_psllqi. Cast __m to __v1di type. (_mm_srli_pi16): Use __builtin_ia32_psrlwi. (_mm_srli_pi32): Use __builtin_ia32_psrldi. (_mm_srli_si64): Use __builtin_ia32_psrlqi. Cast __m to __v1di type. (_mm_srai_pi16): Use __builtin_ia32_psrawi. (_mm_srai_pi32): Use __builtin_ia32_psradi. * config/i386/i386.md (UNSPEC_NOP): Remove unspec definition. * doc/extend.texi (X86 Built-in Functions) [__builtin_ia32_psll?, __builtin_ia32_psrl?, __builtin_ia32_psra?, __builtin_ia32_psll?i, __builtin_ia32_psrl?i, __builtin_ia32_psra?i]: Add new builtins. From-SVN: r133023
2008-03-08 07:59:33 +01:00
typedef long long __v1di __attribute__ ((__vector_size__ (8)));
typedef float __v2sf __attribute__ ((__vector_size__ (8)));
/* Empty the multimedia state. */
extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_empty (void)
{
__builtin_ia32_emms ();
}
extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_empty (void)
{
_mm_empty ();
}
/* Convert I to a __m64 object. The integer is zero-extended to 64-bits. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_cvtsi32_si64 (int __i)
{
return (__m64) __builtin_ia32_vec_init_v2si (__i, 0);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_from_int (int __i)
{
return _mm_cvtsi32_si64 (__i);
}
#ifdef __x86_64__
/* Convert I to a __m64 object. */
/* Intel intrinsic. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_from_int64 (long long __i)
{
return (__m64) __i;
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_cvtsi64_m64 (long long __i)
{
return (__m64) __i;
}
/* Microsoft intrinsic. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_cvtsi64x_si64 (long long __i)
{
return (__m64) __i;
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_set_pi64x (long long __i)
{
return (__m64) __i;
}
#endif
/* Convert the lower 32 bits of the __m64 object into an integer. */
extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_cvtsi64_si32 (__m64 __i)
{
return __builtin_ia32_vec_ext_v2si ((__v2si)__i, 0);
}
extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_to_int (__m64 __i)
{
return _mm_cvtsi64_si32 (__i);
}
#ifdef __x86_64__
/* Convert the __m64 object to a 64bit integer. */
/* Intel intrinsic. */
extern __inline long long __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_to_int64 (__m64 __i)
{
return (long long)__i;
}
extern __inline long long __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_cvtm64_si64 (__m64 __i)
{
return (long long)__i;
}
/* Microsoft intrinsic. */
extern __inline long long __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_cvtsi64_si64x (__m64 __i)
{
return (long long)__i;
}
#endif
/* Pack the four 16-bit values from M1 into the lower four 8-bit values of
the result, and the four 16-bit values from M2 into the upper four 8-bit
values of the result, all with signed saturation. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_packs_pi16 (__m64 __m1, __m64 __m2)
{
return (__m64) __builtin_ia32_packsswb ((__v4hi)__m1, (__v4hi)__m2);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_packsswb (__m64 __m1, __m64 __m2)
{
return _mm_packs_pi16 (__m1, __m2);
}
/* Pack the two 32-bit values from M1 in to the lower two 16-bit values of
the result, and the two 32-bit values from M2 into the upper two 16-bit
values of the result, all with signed saturation. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_packs_pi32 (__m64 __m1, __m64 __m2)
{
return (__m64) __builtin_ia32_packssdw ((__v2si)__m1, (__v2si)__m2);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_packssdw (__m64 __m1, __m64 __m2)
{
return _mm_packs_pi32 (__m1, __m2);
}
/* Pack the four 16-bit values from M1 into the lower four 8-bit values of
the result, and the four 16-bit values from M2 into the upper four 8-bit
values of the result, all with unsigned saturation. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_packs_pu16 (__m64 __m1, __m64 __m2)
{
return (__m64) __builtin_ia32_packuswb ((__v4hi)__m1, (__v4hi)__m2);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_packuswb (__m64 __m1, __m64 __m2)
{
return _mm_packs_pu16 (__m1, __m2);
}
/* Interleave the four 8-bit values from the high half of M1 with the four
8-bit values from the high half of M2. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_unpackhi_pi8 (__m64 __m1, __m64 __m2)
{
return (__m64) __builtin_ia32_punpckhbw ((__v8qi)__m1, (__v8qi)__m2);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_punpckhbw (__m64 __m1, __m64 __m2)
{
return _mm_unpackhi_pi8 (__m1, __m2);
}
/* Interleave the two 16-bit values from the high half of M1 with the two
16-bit values from the high half of M2. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_unpackhi_pi16 (__m64 __m1, __m64 __m2)
{
return (__m64) __builtin_ia32_punpckhwd ((__v4hi)__m1, (__v4hi)__m2);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_punpckhwd (__m64 __m1, __m64 __m2)
{
return _mm_unpackhi_pi16 (__m1, __m2);
}
/* Interleave the 32-bit value from the high half of M1 with the 32-bit
value from the high half of M2. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_unpackhi_pi32 (__m64 __m1, __m64 __m2)
{
return (__m64) __builtin_ia32_punpckhdq ((__v2si)__m1, (__v2si)__m2);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_punpckhdq (__m64 __m1, __m64 __m2)
{
return _mm_unpackhi_pi32 (__m1, __m2);
}
/* Interleave the four 8-bit values from the low half of M1 with the four
8-bit values from the low half of M2. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_unpacklo_pi8 (__m64 __m1, __m64 __m2)
{
return (__m64) __builtin_ia32_punpcklbw ((__v8qi)__m1, (__v8qi)__m2);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_punpcklbw (__m64 __m1, __m64 __m2)
{
return _mm_unpacklo_pi8 (__m1, __m2);
}
/* Interleave the two 16-bit values from the low half of M1 with the two
16-bit values from the low half of M2. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_unpacklo_pi16 (__m64 __m1, __m64 __m2)
{
return (__m64) __builtin_ia32_punpcklwd ((__v4hi)__m1, (__v4hi)__m2);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_punpcklwd (__m64 __m1, __m64 __m2)
{
return _mm_unpacklo_pi16 (__m1, __m2);
}
/* Interleave the 32-bit value from the low half of M1 with the 32-bit
value from the low half of M2. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_unpacklo_pi32 (__m64 __m1, __m64 __m2)
{
return (__m64) __builtin_ia32_punpckldq ((__v2si)__m1, (__v2si)__m2);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_punpckldq (__m64 __m1, __m64 __m2)
{
return _mm_unpacklo_pi32 (__m1, __m2);
}
/* Add the 8-bit values in M1 to the 8-bit values in M2. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_add_pi8 (__m64 __m1, __m64 __m2)
{
return (__m64) __builtin_ia32_paddb ((__v8qi)__m1, (__v8qi)__m2);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_paddb (__m64 __m1, __m64 __m2)
{
return _mm_add_pi8 (__m1, __m2);
}
/* Add the 16-bit values in M1 to the 16-bit values in M2. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_add_pi16 (__m64 __m1, __m64 __m2)
{
return (__m64) __builtin_ia32_paddw ((__v4hi)__m1, (__v4hi)__m2);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_paddw (__m64 __m1, __m64 __m2)
{
return _mm_add_pi16 (__m1, __m2);
}
/* Add the 32-bit values in M1 to the 32-bit values in M2. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_add_pi32 (__m64 __m1, __m64 __m2)
{
return (__m64) __builtin_ia32_paddd ((__v2si)__m1, (__v2si)__m2);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_paddd (__m64 __m1, __m64 __m2)
{
return _mm_add_pi32 (__m1, __m2);
}
/* Add the 64-bit values in M1 to the 64-bit values in M2. */
Allow mmintrin headers to work with function specific target opts. Allow mmintrin headers to work with function specific target opts. Please see discussion here: http://gcc.gnu.org/ml/gcc-patches/2013-04/msg00740.html * config/i386/i386.c (ix86_pragma_target_parse): Restore target when current target options does not apply. * config/i386/i386-protos.h (ix86_reset_previous_fndecl): New function. * config/i386/i386.c (ix86_reset_previous_fndecl): Ditto. * config/i386/bmiintrin.h: Pass appropriate target attributes to header. * config/i386/mmintrin.h: Ditto. * config/i386/nmmintrin.h: Ditto. * config/i386/avx2intrin.h: Ditto. * config/i386/fxsrintrin.h: Ditto. * config/i386/tbmintrin.h: Ditto. * config/i386/xsaveintrin.h: Ditto. * config/i386/f16cintrin.h: Ditto. * config/i386/xtestintrin.h: Ditto. * config/i386/xsaveoptintrin.h: Ditto. * config/i386/bmi2intrin.h: Ditto. * config/i386/lzcntintrin.h: Ditto. * config/i386/smmintrin.h: Ditto. * config/i386/wmmintrin.h: Ditto. * config/i386/x86intrin.h: Remove all header include guards. * config/i386/prfchwintrin.h: Ditto. * config/i386/pmmintrin.h: Ditto. * config/i386/tmmintrin.h: Ditto. * config/i386/xmmintrin.h: Ditto. * config/i386/popcntintrin.h: Ditto. * config/i386/rdseedintrin.h: Ditto. * config/i386/ammintrin.h: Ditto. * config/i386/emmintrin.h: Ditto. * config/i386/immintrin.h: Remove all header include guards. * config/i386/fma4intrin.h: Ditto. * config/i386/lwpintrin.h: Ditto. * config/i386/xopintrin.h: Ditto. * config/i386/ia32intrin.h: Ditto. * config/i386/avxintrin.h: Ditto. * config/i386/rtmintrin.h: Ditto. * config/i386/fmaintrin.h: Ditto. * config/i386/mm3dnow.h: Ditto. * testsuite/gcc.target/i386/intrinsics_1.c: New test. * testsuite/gcc.target/i386/intrinsics_2.c: Ditto. * testsuite/gcc.target/i386/intrinsics_3.c: Ditto. * testsuite/gcc.target/i386/intrinsics_4.c: Ditto. * testsuite/gcc.target/i386/intrinsics_5.c: Ditto. * testsuite/gcc.target/i386/intrinsics_6.c: Ditto. * testsuite/gcc.target/i386/avx-1.c: Provide macros for builtins needing immediate arguments in f16cintrin.h and rtmintrin.h. From-SVN: r200349
2013-06-23 08:15:19 +02:00
#ifndef __SSE2__
#pragma GCC push_options
#pragma GCC target("sse2,mmx")
Allow mmintrin headers to work with function specific target opts. Allow mmintrin headers to work with function specific target opts. Please see discussion here: http://gcc.gnu.org/ml/gcc-patches/2013-04/msg00740.html * config/i386/i386.c (ix86_pragma_target_parse): Restore target when current target options does not apply. * config/i386/i386-protos.h (ix86_reset_previous_fndecl): New function. * config/i386/i386.c (ix86_reset_previous_fndecl): Ditto. * config/i386/bmiintrin.h: Pass appropriate target attributes to header. * config/i386/mmintrin.h: Ditto. * config/i386/nmmintrin.h: Ditto. * config/i386/avx2intrin.h: Ditto. * config/i386/fxsrintrin.h: Ditto. * config/i386/tbmintrin.h: Ditto. * config/i386/xsaveintrin.h: Ditto. * config/i386/f16cintrin.h: Ditto. * config/i386/xtestintrin.h: Ditto. * config/i386/xsaveoptintrin.h: Ditto. * config/i386/bmi2intrin.h: Ditto. * config/i386/lzcntintrin.h: Ditto. * config/i386/smmintrin.h: Ditto. * config/i386/wmmintrin.h: Ditto. * config/i386/x86intrin.h: Remove all header include guards. * config/i386/prfchwintrin.h: Ditto. * config/i386/pmmintrin.h: Ditto. * config/i386/tmmintrin.h: Ditto. * config/i386/xmmintrin.h: Ditto. * config/i386/popcntintrin.h: Ditto. * config/i386/rdseedintrin.h: Ditto. * config/i386/ammintrin.h: Ditto. * config/i386/emmintrin.h: Ditto. * config/i386/immintrin.h: Remove all header include guards. * config/i386/fma4intrin.h: Ditto. * config/i386/lwpintrin.h: Ditto. * config/i386/xopintrin.h: Ditto. * config/i386/ia32intrin.h: Ditto. * config/i386/avxintrin.h: Ditto. * config/i386/rtmintrin.h: Ditto. * config/i386/fmaintrin.h: Ditto. * config/i386/mm3dnow.h: Ditto. * testsuite/gcc.target/i386/intrinsics_1.c: New test. * testsuite/gcc.target/i386/intrinsics_2.c: Ditto. * testsuite/gcc.target/i386/intrinsics_3.c: Ditto. * testsuite/gcc.target/i386/intrinsics_4.c: Ditto. * testsuite/gcc.target/i386/intrinsics_5.c: Ditto. * testsuite/gcc.target/i386/intrinsics_6.c: Ditto. * testsuite/gcc.target/i386/avx-1.c: Provide macros for builtins needing immediate arguments in f16cintrin.h and rtmintrin.h. From-SVN: r200349
2013-06-23 08:15:19 +02:00
#define __DISABLE_SSE2__
#endif /* __SSE2__ */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_add_si64 (__m64 __m1, __m64 __m2)
{
re PR target/22152 (Poor loop optimization when using mmx builtins) 2008-03-08 Uros Bizjak <ubizjak@gmail.com> PR target/22152 * config/i386/i386-modes.def (V1DI): New vector mode. * config/i386/i386.h (VALID_MMX_REG_MODE): Add V1DImode. * config/i386/mmx.md (MMXMODEI8): New mode iterator. (MMXMODE248): Ditto. (MMXMODE): Add V1DI mode. (mmxvecsize): Change DI mode to V1DI mode. ("mov<mode>): Use MMXMODEI8 mode iterator. ("*mov<mode>_internal_rex64"): Ditto. ("*mov<mode>_internal"): Ditto. ("mmx_add<mode>3"): Ditto. Handle V1DImode for TARGET_SSE2. ("mmx_sub<mode>3"): Ditto. ("mmx_adddi3"): Remove insn pattern. ("mmx_subdi3"): Ditto. ("mmx_ashr<mode>3"): Use SImode and "yN" constraint for operand 2. ("mmx_lshr<mode>3"): Ditto. Use MMXMODE248 mode iterator. ("mmx_ashl<mode>3"): Ditto. ("mmx_lshrdi3"): Remove insn pattern. ("mmx_ashldi3"): Ditto. * config/i386/i386.c (classify_argument): Handle V1DImode. (function_arg_advance_32): Ditto. (function_arg_32): Ditto. (struct builtin_description) [IX86_BUILTIN_PADDQ]: Use mmx_addv1di3 insn pattern. [IX86_BUILTIN_PSUBQ]: Use mmx_subv1di3 insn pattern. [IX86_BUILTIN_PSLL?, IX86_BUILTIN_PSRL?, IX86_BUILTIN_PSRA?, IX86_BUILTIN_PSLL?I, IX86_BUILTIN_PSRL?I, IX86_BUILTIN_PSRA?I, IX86_BUILTIN_PSLL?I128, IX86_BUILTIN_PSRL?I128, IX86_BUILTIN_PSRA?I128]: Remove definitions of built-in functions. (V1DI_type_node): New node. (v1di_ftype_v1di_int): Ditto. (v1di_ftype_v1di_v1di): Ditto. (v2si_ftype_v2si_si): Ditto. (v4hi_ftype_v4hi_di): Remove node. (v2si_ftype_v2si_di): Ditto. (ix86_init_mmx_sse_builtins): Handle V1DImode. (__builtin_ia32_psll?, __builtin_ia32_psrl?, __builtin_ia32_psra?): Redefine builtins using def_builtin_const with *_ftype_*_int node. (__builtin_ia32_psll?i, __builtin_ia32_psrl?i, __builtin_ia32_psra?i): Add new builtins using def_builtin_const. (ix86_expand_builtin) [IX86_BUILTIN_PSLL?, IX86_BUILTIN_PSRL?, IX86_BUILTIN_PSRA?, IX86_BUILTIN_PSLL?I, IX86_BUILTIN_PSRL?I, IX86_BUILTIN_PSRA?I]: Handle builtin definitions. * config/i386/mmintrin.h (__v1di): New typedef. (_mm_add_si64): Cast arguments to __v1di type. (_mm_sub_si64): Ditto. (_mm_sll_pi16): Cast __count to __v4hi type. (_mm_sll_pi32): Cast __count to __v2si type. (_mm_sll_si64): Cast arguments to __v1di type. (_mm_srl_pi16): Cast __count to __v4hi type. (_mm_srl_pi32): Cast __count to __v2si type. (_mm_srl_si64): Cast arguments to __v1di type. (_mm_sra_pi16): Cast __count to __v4hi type. (_mm_sra_pi32): Cast __count to __v2si type. (_mm_slli_pi16): Use __builtin_ia32_psllwi. (_mm_slli_pi32): Use __builtin_ia32_pslldi. (_mm_slli_si64): Use __builtin_ia32_psllqi. Cast __m to __v1di type. (_mm_srli_pi16): Use __builtin_ia32_psrlwi. (_mm_srli_pi32): Use __builtin_ia32_psrldi. (_mm_srli_si64): Use __builtin_ia32_psrlqi. Cast __m to __v1di type. (_mm_srai_pi16): Use __builtin_ia32_psrawi. (_mm_srai_pi32): Use __builtin_ia32_psradi. * config/i386/i386.md (UNSPEC_NOP): Remove unspec definition. * doc/extend.texi (X86 Built-in Functions) [__builtin_ia32_psll?, __builtin_ia32_psrl?, __builtin_ia32_psra?, __builtin_ia32_psll?i, __builtin_ia32_psrl?i, __builtin_ia32_psra?i]: Add new builtins. From-SVN: r133023
2008-03-08 07:59:33 +01:00
return (__m64) __builtin_ia32_paddq ((__v1di)__m1, (__v1di)__m2);
}
Allow mmintrin headers to work with function specific target opts. Allow mmintrin headers to work with function specific target opts. Please see discussion here: http://gcc.gnu.org/ml/gcc-patches/2013-04/msg00740.html * config/i386/i386.c (ix86_pragma_target_parse): Restore target when current target options does not apply. * config/i386/i386-protos.h (ix86_reset_previous_fndecl): New function. * config/i386/i386.c (ix86_reset_previous_fndecl): Ditto. * config/i386/bmiintrin.h: Pass appropriate target attributes to header. * config/i386/mmintrin.h: Ditto. * config/i386/nmmintrin.h: Ditto. * config/i386/avx2intrin.h: Ditto. * config/i386/fxsrintrin.h: Ditto. * config/i386/tbmintrin.h: Ditto. * config/i386/xsaveintrin.h: Ditto. * config/i386/f16cintrin.h: Ditto. * config/i386/xtestintrin.h: Ditto. * config/i386/xsaveoptintrin.h: Ditto. * config/i386/bmi2intrin.h: Ditto. * config/i386/lzcntintrin.h: Ditto. * config/i386/smmintrin.h: Ditto. * config/i386/wmmintrin.h: Ditto. * config/i386/x86intrin.h: Remove all header include guards. * config/i386/prfchwintrin.h: Ditto. * config/i386/pmmintrin.h: Ditto. * config/i386/tmmintrin.h: Ditto. * config/i386/xmmintrin.h: Ditto. * config/i386/popcntintrin.h: Ditto. * config/i386/rdseedintrin.h: Ditto. * config/i386/ammintrin.h: Ditto. * config/i386/emmintrin.h: Ditto. * config/i386/immintrin.h: Remove all header include guards. * config/i386/fma4intrin.h: Ditto. * config/i386/lwpintrin.h: Ditto. * config/i386/xopintrin.h: Ditto. * config/i386/ia32intrin.h: Ditto. * config/i386/avxintrin.h: Ditto. * config/i386/rtmintrin.h: Ditto. * config/i386/fmaintrin.h: Ditto. * config/i386/mm3dnow.h: Ditto. * testsuite/gcc.target/i386/intrinsics_1.c: New test. * testsuite/gcc.target/i386/intrinsics_2.c: Ditto. * testsuite/gcc.target/i386/intrinsics_3.c: Ditto. * testsuite/gcc.target/i386/intrinsics_4.c: Ditto. * testsuite/gcc.target/i386/intrinsics_5.c: Ditto. * testsuite/gcc.target/i386/intrinsics_6.c: Ditto. * testsuite/gcc.target/i386/avx-1.c: Provide macros for builtins needing immediate arguments in f16cintrin.h and rtmintrin.h. From-SVN: r200349
2013-06-23 08:15:19 +02:00
#ifdef __DISABLE_SSE2__
#undef __DISABLE_SSE2__
#pragma GCC pop_options
#endif /* __DISABLE_SSE2__ */
/* Add the 8-bit values in M1 to the 8-bit values in M2 using signed
saturated arithmetic. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_adds_pi8 (__m64 __m1, __m64 __m2)
{
return (__m64) __builtin_ia32_paddsb ((__v8qi)__m1, (__v8qi)__m2);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_paddsb (__m64 __m1, __m64 __m2)
{
return _mm_adds_pi8 (__m1, __m2);
}
/* Add the 16-bit values in M1 to the 16-bit values in M2 using signed
saturated arithmetic. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_adds_pi16 (__m64 __m1, __m64 __m2)
{
return (__m64) __builtin_ia32_paddsw ((__v4hi)__m1, (__v4hi)__m2);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_paddsw (__m64 __m1, __m64 __m2)
{
return _mm_adds_pi16 (__m1, __m2);
}
/* Add the 8-bit values in M1 to the 8-bit values in M2 using unsigned
saturated arithmetic. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_adds_pu8 (__m64 __m1, __m64 __m2)
{
return (__m64) __builtin_ia32_paddusb ((__v8qi)__m1, (__v8qi)__m2);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_paddusb (__m64 __m1, __m64 __m2)
{
return _mm_adds_pu8 (__m1, __m2);
}
/* Add the 16-bit values in M1 to the 16-bit values in M2 using unsigned
saturated arithmetic. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_adds_pu16 (__m64 __m1, __m64 __m2)
{
return (__m64) __builtin_ia32_paddusw ((__v4hi)__m1, (__v4hi)__m2);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_paddusw (__m64 __m1, __m64 __m2)
{
return _mm_adds_pu16 (__m1, __m2);
}
/* Subtract the 8-bit values in M2 from the 8-bit values in M1. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_sub_pi8 (__m64 __m1, __m64 __m2)
{
return (__m64) __builtin_ia32_psubb ((__v8qi)__m1, (__v8qi)__m2);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_psubb (__m64 __m1, __m64 __m2)
{
return _mm_sub_pi8 (__m1, __m2);
}
/* Subtract the 16-bit values in M2 from the 16-bit values in M1. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_sub_pi16 (__m64 __m1, __m64 __m2)
{
return (__m64) __builtin_ia32_psubw ((__v4hi)__m1, (__v4hi)__m2);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_psubw (__m64 __m1, __m64 __m2)
{
return _mm_sub_pi16 (__m1, __m2);
}
/* Subtract the 32-bit values in M2 from the 32-bit values in M1. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_sub_pi32 (__m64 __m1, __m64 __m2)
{
return (__m64) __builtin_ia32_psubd ((__v2si)__m1, (__v2si)__m2);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_psubd (__m64 __m1, __m64 __m2)
{
return _mm_sub_pi32 (__m1, __m2);
}
/* Add the 64-bit values in M1 to the 64-bit values in M2. */
Allow mmintrin headers to work with function specific target opts. Allow mmintrin headers to work with function specific target opts. Please see discussion here: http://gcc.gnu.org/ml/gcc-patches/2013-04/msg00740.html * config/i386/i386.c (ix86_pragma_target_parse): Restore target when current target options does not apply. * config/i386/i386-protos.h (ix86_reset_previous_fndecl): New function. * config/i386/i386.c (ix86_reset_previous_fndecl): Ditto. * config/i386/bmiintrin.h: Pass appropriate target attributes to header. * config/i386/mmintrin.h: Ditto. * config/i386/nmmintrin.h: Ditto. * config/i386/avx2intrin.h: Ditto. * config/i386/fxsrintrin.h: Ditto. * config/i386/tbmintrin.h: Ditto. * config/i386/xsaveintrin.h: Ditto. * config/i386/f16cintrin.h: Ditto. * config/i386/xtestintrin.h: Ditto. * config/i386/xsaveoptintrin.h: Ditto. * config/i386/bmi2intrin.h: Ditto. * config/i386/lzcntintrin.h: Ditto. * config/i386/smmintrin.h: Ditto. * config/i386/wmmintrin.h: Ditto. * config/i386/x86intrin.h: Remove all header include guards. * config/i386/prfchwintrin.h: Ditto. * config/i386/pmmintrin.h: Ditto. * config/i386/tmmintrin.h: Ditto. * config/i386/xmmintrin.h: Ditto. * config/i386/popcntintrin.h: Ditto. * config/i386/rdseedintrin.h: Ditto. * config/i386/ammintrin.h: Ditto. * config/i386/emmintrin.h: Ditto. * config/i386/immintrin.h: Remove all header include guards. * config/i386/fma4intrin.h: Ditto. * config/i386/lwpintrin.h: Ditto. * config/i386/xopintrin.h: Ditto. * config/i386/ia32intrin.h: Ditto. * config/i386/avxintrin.h: Ditto. * config/i386/rtmintrin.h: Ditto. * config/i386/fmaintrin.h: Ditto. * config/i386/mm3dnow.h: Ditto. * testsuite/gcc.target/i386/intrinsics_1.c: New test. * testsuite/gcc.target/i386/intrinsics_2.c: Ditto. * testsuite/gcc.target/i386/intrinsics_3.c: Ditto. * testsuite/gcc.target/i386/intrinsics_4.c: Ditto. * testsuite/gcc.target/i386/intrinsics_5.c: Ditto. * testsuite/gcc.target/i386/intrinsics_6.c: Ditto. * testsuite/gcc.target/i386/avx-1.c: Provide macros for builtins needing immediate arguments in f16cintrin.h and rtmintrin.h. From-SVN: r200349
2013-06-23 08:15:19 +02:00
#ifndef __SSE2__
#pragma GCC push_options
#pragma GCC target("sse2,mmx")
Allow mmintrin headers to work with function specific target opts. Allow mmintrin headers to work with function specific target opts. Please see discussion here: http://gcc.gnu.org/ml/gcc-patches/2013-04/msg00740.html * config/i386/i386.c (ix86_pragma_target_parse): Restore target when current target options does not apply. * config/i386/i386-protos.h (ix86_reset_previous_fndecl): New function. * config/i386/i386.c (ix86_reset_previous_fndecl): Ditto. * config/i386/bmiintrin.h: Pass appropriate target attributes to header. * config/i386/mmintrin.h: Ditto. * config/i386/nmmintrin.h: Ditto. * config/i386/avx2intrin.h: Ditto. * config/i386/fxsrintrin.h: Ditto. * config/i386/tbmintrin.h: Ditto. * config/i386/xsaveintrin.h: Ditto. * config/i386/f16cintrin.h: Ditto. * config/i386/xtestintrin.h: Ditto. * config/i386/xsaveoptintrin.h: Ditto. * config/i386/bmi2intrin.h: Ditto. * config/i386/lzcntintrin.h: Ditto. * config/i386/smmintrin.h: Ditto. * config/i386/wmmintrin.h: Ditto. * config/i386/x86intrin.h: Remove all header include guards. * config/i386/prfchwintrin.h: Ditto. * config/i386/pmmintrin.h: Ditto. * config/i386/tmmintrin.h: Ditto. * config/i386/xmmintrin.h: Ditto. * config/i386/popcntintrin.h: Ditto. * config/i386/rdseedintrin.h: Ditto. * config/i386/ammintrin.h: Ditto. * config/i386/emmintrin.h: Ditto. * config/i386/immintrin.h: Remove all header include guards. * config/i386/fma4intrin.h: Ditto. * config/i386/lwpintrin.h: Ditto. * config/i386/xopintrin.h: Ditto. * config/i386/ia32intrin.h: Ditto. * config/i386/avxintrin.h: Ditto. * config/i386/rtmintrin.h: Ditto. * config/i386/fmaintrin.h: Ditto. * config/i386/mm3dnow.h: Ditto. * testsuite/gcc.target/i386/intrinsics_1.c: New test. * testsuite/gcc.target/i386/intrinsics_2.c: Ditto. * testsuite/gcc.target/i386/intrinsics_3.c: Ditto. * testsuite/gcc.target/i386/intrinsics_4.c: Ditto. * testsuite/gcc.target/i386/intrinsics_5.c: Ditto. * testsuite/gcc.target/i386/intrinsics_6.c: Ditto. * testsuite/gcc.target/i386/avx-1.c: Provide macros for builtins needing immediate arguments in f16cintrin.h and rtmintrin.h. From-SVN: r200349
2013-06-23 08:15:19 +02:00
#define __DISABLE_SSE2__
#endif /* __SSE2__ */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_sub_si64 (__m64 __m1, __m64 __m2)
{
re PR target/22152 (Poor loop optimization when using mmx builtins) 2008-03-08 Uros Bizjak <ubizjak@gmail.com> PR target/22152 * config/i386/i386-modes.def (V1DI): New vector mode. * config/i386/i386.h (VALID_MMX_REG_MODE): Add V1DImode. * config/i386/mmx.md (MMXMODEI8): New mode iterator. (MMXMODE248): Ditto. (MMXMODE): Add V1DI mode. (mmxvecsize): Change DI mode to V1DI mode. ("mov<mode>): Use MMXMODEI8 mode iterator. ("*mov<mode>_internal_rex64"): Ditto. ("*mov<mode>_internal"): Ditto. ("mmx_add<mode>3"): Ditto. Handle V1DImode for TARGET_SSE2. ("mmx_sub<mode>3"): Ditto. ("mmx_adddi3"): Remove insn pattern. ("mmx_subdi3"): Ditto. ("mmx_ashr<mode>3"): Use SImode and "yN" constraint for operand 2. ("mmx_lshr<mode>3"): Ditto. Use MMXMODE248 mode iterator. ("mmx_ashl<mode>3"): Ditto. ("mmx_lshrdi3"): Remove insn pattern. ("mmx_ashldi3"): Ditto. * config/i386/i386.c (classify_argument): Handle V1DImode. (function_arg_advance_32): Ditto. (function_arg_32): Ditto. (struct builtin_description) [IX86_BUILTIN_PADDQ]: Use mmx_addv1di3 insn pattern. [IX86_BUILTIN_PSUBQ]: Use mmx_subv1di3 insn pattern. [IX86_BUILTIN_PSLL?, IX86_BUILTIN_PSRL?, IX86_BUILTIN_PSRA?, IX86_BUILTIN_PSLL?I, IX86_BUILTIN_PSRL?I, IX86_BUILTIN_PSRA?I, IX86_BUILTIN_PSLL?I128, IX86_BUILTIN_PSRL?I128, IX86_BUILTIN_PSRA?I128]: Remove definitions of built-in functions. (V1DI_type_node): New node. (v1di_ftype_v1di_int): Ditto. (v1di_ftype_v1di_v1di): Ditto. (v2si_ftype_v2si_si): Ditto. (v4hi_ftype_v4hi_di): Remove node. (v2si_ftype_v2si_di): Ditto. (ix86_init_mmx_sse_builtins): Handle V1DImode. (__builtin_ia32_psll?, __builtin_ia32_psrl?, __builtin_ia32_psra?): Redefine builtins using def_builtin_const with *_ftype_*_int node. (__builtin_ia32_psll?i, __builtin_ia32_psrl?i, __builtin_ia32_psra?i): Add new builtins using def_builtin_const. (ix86_expand_builtin) [IX86_BUILTIN_PSLL?, IX86_BUILTIN_PSRL?, IX86_BUILTIN_PSRA?, IX86_BUILTIN_PSLL?I, IX86_BUILTIN_PSRL?I, IX86_BUILTIN_PSRA?I]: Handle builtin definitions. * config/i386/mmintrin.h (__v1di): New typedef. (_mm_add_si64): Cast arguments to __v1di type. (_mm_sub_si64): Ditto. (_mm_sll_pi16): Cast __count to __v4hi type. (_mm_sll_pi32): Cast __count to __v2si type. (_mm_sll_si64): Cast arguments to __v1di type. (_mm_srl_pi16): Cast __count to __v4hi type. (_mm_srl_pi32): Cast __count to __v2si type. (_mm_srl_si64): Cast arguments to __v1di type. (_mm_sra_pi16): Cast __count to __v4hi type. (_mm_sra_pi32): Cast __count to __v2si type. (_mm_slli_pi16): Use __builtin_ia32_psllwi. (_mm_slli_pi32): Use __builtin_ia32_pslldi. (_mm_slli_si64): Use __builtin_ia32_psllqi. Cast __m to __v1di type. (_mm_srli_pi16): Use __builtin_ia32_psrlwi. (_mm_srli_pi32): Use __builtin_ia32_psrldi. (_mm_srli_si64): Use __builtin_ia32_psrlqi. Cast __m to __v1di type. (_mm_srai_pi16): Use __builtin_ia32_psrawi. (_mm_srai_pi32): Use __builtin_ia32_psradi. * config/i386/i386.md (UNSPEC_NOP): Remove unspec definition. * doc/extend.texi (X86 Built-in Functions) [__builtin_ia32_psll?, __builtin_ia32_psrl?, __builtin_ia32_psra?, __builtin_ia32_psll?i, __builtin_ia32_psrl?i, __builtin_ia32_psra?i]: Add new builtins. From-SVN: r133023
2008-03-08 07:59:33 +01:00
return (__m64) __builtin_ia32_psubq ((__v1di)__m1, (__v1di)__m2);
}
Allow mmintrin headers to work with function specific target opts. Allow mmintrin headers to work with function specific target opts. Please see discussion here: http://gcc.gnu.org/ml/gcc-patches/2013-04/msg00740.html * config/i386/i386.c (ix86_pragma_target_parse): Restore target when current target options does not apply. * config/i386/i386-protos.h (ix86_reset_previous_fndecl): New function. * config/i386/i386.c (ix86_reset_previous_fndecl): Ditto. * config/i386/bmiintrin.h: Pass appropriate target attributes to header. * config/i386/mmintrin.h: Ditto. * config/i386/nmmintrin.h: Ditto. * config/i386/avx2intrin.h: Ditto. * config/i386/fxsrintrin.h: Ditto. * config/i386/tbmintrin.h: Ditto. * config/i386/xsaveintrin.h: Ditto. * config/i386/f16cintrin.h: Ditto. * config/i386/xtestintrin.h: Ditto. * config/i386/xsaveoptintrin.h: Ditto. * config/i386/bmi2intrin.h: Ditto. * config/i386/lzcntintrin.h: Ditto. * config/i386/smmintrin.h: Ditto. * config/i386/wmmintrin.h: Ditto. * config/i386/x86intrin.h: Remove all header include guards. * config/i386/prfchwintrin.h: Ditto. * config/i386/pmmintrin.h: Ditto. * config/i386/tmmintrin.h: Ditto. * config/i386/xmmintrin.h: Ditto. * config/i386/popcntintrin.h: Ditto. * config/i386/rdseedintrin.h: Ditto. * config/i386/ammintrin.h: Ditto. * config/i386/emmintrin.h: Ditto. * config/i386/immintrin.h: Remove all header include guards. * config/i386/fma4intrin.h: Ditto. * config/i386/lwpintrin.h: Ditto. * config/i386/xopintrin.h: Ditto. * config/i386/ia32intrin.h: Ditto. * config/i386/avxintrin.h: Ditto. * config/i386/rtmintrin.h: Ditto. * config/i386/fmaintrin.h: Ditto. * config/i386/mm3dnow.h: Ditto. * testsuite/gcc.target/i386/intrinsics_1.c: New test. * testsuite/gcc.target/i386/intrinsics_2.c: Ditto. * testsuite/gcc.target/i386/intrinsics_3.c: Ditto. * testsuite/gcc.target/i386/intrinsics_4.c: Ditto. * testsuite/gcc.target/i386/intrinsics_5.c: Ditto. * testsuite/gcc.target/i386/intrinsics_6.c: Ditto. * testsuite/gcc.target/i386/avx-1.c: Provide macros for builtins needing immediate arguments in f16cintrin.h and rtmintrin.h. From-SVN: r200349
2013-06-23 08:15:19 +02:00
#ifdef __DISABLE_SSE2__
#undef __DISABLE_SSE2__
#pragma GCC pop_options
#endif /* __DISABLE_SSE2__ */
/* Subtract the 8-bit values in M2 from the 8-bit values in M1 using signed
saturating arithmetic. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_subs_pi8 (__m64 __m1, __m64 __m2)
{
return (__m64) __builtin_ia32_psubsb ((__v8qi)__m1, (__v8qi)__m2);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_psubsb (__m64 __m1, __m64 __m2)
{
return _mm_subs_pi8 (__m1, __m2);
}
/* Subtract the 16-bit values in M2 from the 16-bit values in M1 using
signed saturating arithmetic. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_subs_pi16 (__m64 __m1, __m64 __m2)
{
return (__m64) __builtin_ia32_psubsw ((__v4hi)__m1, (__v4hi)__m2);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_psubsw (__m64 __m1, __m64 __m2)
{
return _mm_subs_pi16 (__m1, __m2);
}
/* Subtract the 8-bit values in M2 from the 8-bit values in M1 using
unsigned saturating arithmetic. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_subs_pu8 (__m64 __m1, __m64 __m2)
{
return (__m64) __builtin_ia32_psubusb ((__v8qi)__m1, (__v8qi)__m2);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_psubusb (__m64 __m1, __m64 __m2)
{
return _mm_subs_pu8 (__m1, __m2);
}
/* Subtract the 16-bit values in M2 from the 16-bit values in M1 using
unsigned saturating arithmetic. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_subs_pu16 (__m64 __m1, __m64 __m2)
{
return (__m64) __builtin_ia32_psubusw ((__v4hi)__m1, (__v4hi)__m2);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_psubusw (__m64 __m1, __m64 __m2)
{
return _mm_subs_pu16 (__m1, __m2);
}
/* Multiply four 16-bit values in M1 by four 16-bit values in M2 producing
four 32-bit intermediate results, which are then summed by pairs to
produce two 32-bit results. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_madd_pi16 (__m64 __m1, __m64 __m2)
{
return (__m64) __builtin_ia32_pmaddwd ((__v4hi)__m1, (__v4hi)__m2);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_pmaddwd (__m64 __m1, __m64 __m2)
{
return _mm_madd_pi16 (__m1, __m2);
}
/* Multiply four signed 16-bit values in M1 by four signed 16-bit values in
M2 and produce the high 16 bits of the 32-bit results. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_mulhi_pi16 (__m64 __m1, __m64 __m2)
{
return (__m64) __builtin_ia32_pmulhw ((__v4hi)__m1, (__v4hi)__m2);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_pmulhw (__m64 __m1, __m64 __m2)
{
return _mm_mulhi_pi16 (__m1, __m2);
}
/* Multiply four 16-bit values in M1 by four 16-bit values in M2 and produce
the low 16 bits of the results. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_mullo_pi16 (__m64 __m1, __m64 __m2)
{
return (__m64) __builtin_ia32_pmullw ((__v4hi)__m1, (__v4hi)__m2);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_pmullw (__m64 __m1, __m64 __m2)
{
return _mm_mullo_pi16 (__m1, __m2);
}
/* Shift four 16-bit values in M left by COUNT. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_sll_pi16 (__m64 __m, __m64 __count)
{
re PR target/22152 (Poor loop optimization when using mmx builtins) 2008-03-08 Uros Bizjak <ubizjak@gmail.com> PR target/22152 * config/i386/i386-modes.def (V1DI): New vector mode. * config/i386/i386.h (VALID_MMX_REG_MODE): Add V1DImode. * config/i386/mmx.md (MMXMODEI8): New mode iterator. (MMXMODE248): Ditto. (MMXMODE): Add V1DI mode. (mmxvecsize): Change DI mode to V1DI mode. ("mov<mode>): Use MMXMODEI8 mode iterator. ("*mov<mode>_internal_rex64"): Ditto. ("*mov<mode>_internal"): Ditto. ("mmx_add<mode>3"): Ditto. Handle V1DImode for TARGET_SSE2. ("mmx_sub<mode>3"): Ditto. ("mmx_adddi3"): Remove insn pattern. ("mmx_subdi3"): Ditto. ("mmx_ashr<mode>3"): Use SImode and "yN" constraint for operand 2. ("mmx_lshr<mode>3"): Ditto. Use MMXMODE248 mode iterator. ("mmx_ashl<mode>3"): Ditto. ("mmx_lshrdi3"): Remove insn pattern. ("mmx_ashldi3"): Ditto. * config/i386/i386.c (classify_argument): Handle V1DImode. (function_arg_advance_32): Ditto. (function_arg_32): Ditto. (struct builtin_description) [IX86_BUILTIN_PADDQ]: Use mmx_addv1di3 insn pattern. [IX86_BUILTIN_PSUBQ]: Use mmx_subv1di3 insn pattern. [IX86_BUILTIN_PSLL?, IX86_BUILTIN_PSRL?, IX86_BUILTIN_PSRA?, IX86_BUILTIN_PSLL?I, IX86_BUILTIN_PSRL?I, IX86_BUILTIN_PSRA?I, IX86_BUILTIN_PSLL?I128, IX86_BUILTIN_PSRL?I128, IX86_BUILTIN_PSRA?I128]: Remove definitions of built-in functions. (V1DI_type_node): New node. (v1di_ftype_v1di_int): Ditto. (v1di_ftype_v1di_v1di): Ditto. (v2si_ftype_v2si_si): Ditto. (v4hi_ftype_v4hi_di): Remove node. (v2si_ftype_v2si_di): Ditto. (ix86_init_mmx_sse_builtins): Handle V1DImode. (__builtin_ia32_psll?, __builtin_ia32_psrl?, __builtin_ia32_psra?): Redefine builtins using def_builtin_const with *_ftype_*_int node. (__builtin_ia32_psll?i, __builtin_ia32_psrl?i, __builtin_ia32_psra?i): Add new builtins using def_builtin_const. (ix86_expand_builtin) [IX86_BUILTIN_PSLL?, IX86_BUILTIN_PSRL?, IX86_BUILTIN_PSRA?, IX86_BUILTIN_PSLL?I, IX86_BUILTIN_PSRL?I, IX86_BUILTIN_PSRA?I]: Handle builtin definitions. * config/i386/mmintrin.h (__v1di): New typedef. (_mm_add_si64): Cast arguments to __v1di type. (_mm_sub_si64): Ditto. (_mm_sll_pi16): Cast __count to __v4hi type. (_mm_sll_pi32): Cast __count to __v2si type. (_mm_sll_si64): Cast arguments to __v1di type. (_mm_srl_pi16): Cast __count to __v4hi type. (_mm_srl_pi32): Cast __count to __v2si type. (_mm_srl_si64): Cast arguments to __v1di type. (_mm_sra_pi16): Cast __count to __v4hi type. (_mm_sra_pi32): Cast __count to __v2si type. (_mm_slli_pi16): Use __builtin_ia32_psllwi. (_mm_slli_pi32): Use __builtin_ia32_pslldi. (_mm_slli_si64): Use __builtin_ia32_psllqi. Cast __m to __v1di type. (_mm_srli_pi16): Use __builtin_ia32_psrlwi. (_mm_srli_pi32): Use __builtin_ia32_psrldi. (_mm_srli_si64): Use __builtin_ia32_psrlqi. Cast __m to __v1di type. (_mm_srai_pi16): Use __builtin_ia32_psrawi. (_mm_srai_pi32): Use __builtin_ia32_psradi. * config/i386/i386.md (UNSPEC_NOP): Remove unspec definition. * doc/extend.texi (X86 Built-in Functions) [__builtin_ia32_psll?, __builtin_ia32_psrl?, __builtin_ia32_psra?, __builtin_ia32_psll?i, __builtin_ia32_psrl?i, __builtin_ia32_psra?i]: Add new builtins. From-SVN: r133023
2008-03-08 07:59:33 +01:00
return (__m64) __builtin_ia32_psllw ((__v4hi)__m, (__v4hi)__count);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_psllw (__m64 __m, __m64 __count)
{
return _mm_sll_pi16 (__m, __count);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_slli_pi16 (__m64 __m, int __count)
{
re PR target/22152 (Poor loop optimization when using mmx builtins) 2008-03-08 Uros Bizjak <ubizjak@gmail.com> PR target/22152 * config/i386/i386-modes.def (V1DI): New vector mode. * config/i386/i386.h (VALID_MMX_REG_MODE): Add V1DImode. * config/i386/mmx.md (MMXMODEI8): New mode iterator. (MMXMODE248): Ditto. (MMXMODE): Add V1DI mode. (mmxvecsize): Change DI mode to V1DI mode. ("mov<mode>): Use MMXMODEI8 mode iterator. ("*mov<mode>_internal_rex64"): Ditto. ("*mov<mode>_internal"): Ditto. ("mmx_add<mode>3"): Ditto. Handle V1DImode for TARGET_SSE2. ("mmx_sub<mode>3"): Ditto. ("mmx_adddi3"): Remove insn pattern. ("mmx_subdi3"): Ditto. ("mmx_ashr<mode>3"): Use SImode and "yN" constraint for operand 2. ("mmx_lshr<mode>3"): Ditto. Use MMXMODE248 mode iterator. ("mmx_ashl<mode>3"): Ditto. ("mmx_lshrdi3"): Remove insn pattern. ("mmx_ashldi3"): Ditto. * config/i386/i386.c (classify_argument): Handle V1DImode. (function_arg_advance_32): Ditto. (function_arg_32): Ditto. (struct builtin_description) [IX86_BUILTIN_PADDQ]: Use mmx_addv1di3 insn pattern. [IX86_BUILTIN_PSUBQ]: Use mmx_subv1di3 insn pattern. [IX86_BUILTIN_PSLL?, IX86_BUILTIN_PSRL?, IX86_BUILTIN_PSRA?, IX86_BUILTIN_PSLL?I, IX86_BUILTIN_PSRL?I, IX86_BUILTIN_PSRA?I, IX86_BUILTIN_PSLL?I128, IX86_BUILTIN_PSRL?I128, IX86_BUILTIN_PSRA?I128]: Remove definitions of built-in functions. (V1DI_type_node): New node. (v1di_ftype_v1di_int): Ditto. (v1di_ftype_v1di_v1di): Ditto. (v2si_ftype_v2si_si): Ditto. (v4hi_ftype_v4hi_di): Remove node. (v2si_ftype_v2si_di): Ditto. (ix86_init_mmx_sse_builtins): Handle V1DImode. (__builtin_ia32_psll?, __builtin_ia32_psrl?, __builtin_ia32_psra?): Redefine builtins using def_builtin_const with *_ftype_*_int node. (__builtin_ia32_psll?i, __builtin_ia32_psrl?i, __builtin_ia32_psra?i): Add new builtins using def_builtin_const. (ix86_expand_builtin) [IX86_BUILTIN_PSLL?, IX86_BUILTIN_PSRL?, IX86_BUILTIN_PSRA?, IX86_BUILTIN_PSLL?I, IX86_BUILTIN_PSRL?I, IX86_BUILTIN_PSRA?I]: Handle builtin definitions. * config/i386/mmintrin.h (__v1di): New typedef. (_mm_add_si64): Cast arguments to __v1di type. (_mm_sub_si64): Ditto. (_mm_sll_pi16): Cast __count to __v4hi type. (_mm_sll_pi32): Cast __count to __v2si type. (_mm_sll_si64): Cast arguments to __v1di type. (_mm_srl_pi16): Cast __count to __v4hi type. (_mm_srl_pi32): Cast __count to __v2si type. (_mm_srl_si64): Cast arguments to __v1di type. (_mm_sra_pi16): Cast __count to __v4hi type. (_mm_sra_pi32): Cast __count to __v2si type. (_mm_slli_pi16): Use __builtin_ia32_psllwi. (_mm_slli_pi32): Use __builtin_ia32_pslldi. (_mm_slli_si64): Use __builtin_ia32_psllqi. Cast __m to __v1di type. (_mm_srli_pi16): Use __builtin_ia32_psrlwi. (_mm_srli_pi32): Use __builtin_ia32_psrldi. (_mm_srli_si64): Use __builtin_ia32_psrlqi. Cast __m to __v1di type. (_mm_srai_pi16): Use __builtin_ia32_psrawi. (_mm_srai_pi32): Use __builtin_ia32_psradi. * config/i386/i386.md (UNSPEC_NOP): Remove unspec definition. * doc/extend.texi (X86 Built-in Functions) [__builtin_ia32_psll?, __builtin_ia32_psrl?, __builtin_ia32_psra?, __builtin_ia32_psll?i, __builtin_ia32_psrl?i, __builtin_ia32_psra?i]: Add new builtins. From-SVN: r133023
2008-03-08 07:59:33 +01:00
return (__m64) __builtin_ia32_psllwi ((__v4hi)__m, __count);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_psllwi (__m64 __m, int __count)
{
return _mm_slli_pi16 (__m, __count);
}
/* Shift two 32-bit values in M left by COUNT. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_sll_pi32 (__m64 __m, __m64 __count)
{
re PR target/22152 (Poor loop optimization when using mmx builtins) 2008-03-08 Uros Bizjak <ubizjak@gmail.com> PR target/22152 * config/i386/i386-modes.def (V1DI): New vector mode. * config/i386/i386.h (VALID_MMX_REG_MODE): Add V1DImode. * config/i386/mmx.md (MMXMODEI8): New mode iterator. (MMXMODE248): Ditto. (MMXMODE): Add V1DI mode. (mmxvecsize): Change DI mode to V1DI mode. ("mov<mode>): Use MMXMODEI8 mode iterator. ("*mov<mode>_internal_rex64"): Ditto. ("*mov<mode>_internal"): Ditto. ("mmx_add<mode>3"): Ditto. Handle V1DImode for TARGET_SSE2. ("mmx_sub<mode>3"): Ditto. ("mmx_adddi3"): Remove insn pattern. ("mmx_subdi3"): Ditto. ("mmx_ashr<mode>3"): Use SImode and "yN" constraint for operand 2. ("mmx_lshr<mode>3"): Ditto. Use MMXMODE248 mode iterator. ("mmx_ashl<mode>3"): Ditto. ("mmx_lshrdi3"): Remove insn pattern. ("mmx_ashldi3"): Ditto. * config/i386/i386.c (classify_argument): Handle V1DImode. (function_arg_advance_32): Ditto. (function_arg_32): Ditto. (struct builtin_description) [IX86_BUILTIN_PADDQ]: Use mmx_addv1di3 insn pattern. [IX86_BUILTIN_PSUBQ]: Use mmx_subv1di3 insn pattern. [IX86_BUILTIN_PSLL?, IX86_BUILTIN_PSRL?, IX86_BUILTIN_PSRA?, IX86_BUILTIN_PSLL?I, IX86_BUILTIN_PSRL?I, IX86_BUILTIN_PSRA?I, IX86_BUILTIN_PSLL?I128, IX86_BUILTIN_PSRL?I128, IX86_BUILTIN_PSRA?I128]: Remove definitions of built-in functions. (V1DI_type_node): New node. (v1di_ftype_v1di_int): Ditto. (v1di_ftype_v1di_v1di): Ditto. (v2si_ftype_v2si_si): Ditto. (v4hi_ftype_v4hi_di): Remove node. (v2si_ftype_v2si_di): Ditto. (ix86_init_mmx_sse_builtins): Handle V1DImode. (__builtin_ia32_psll?, __builtin_ia32_psrl?, __builtin_ia32_psra?): Redefine builtins using def_builtin_const with *_ftype_*_int node. (__builtin_ia32_psll?i, __builtin_ia32_psrl?i, __builtin_ia32_psra?i): Add new builtins using def_builtin_const. (ix86_expand_builtin) [IX86_BUILTIN_PSLL?, IX86_BUILTIN_PSRL?, IX86_BUILTIN_PSRA?, IX86_BUILTIN_PSLL?I, IX86_BUILTIN_PSRL?I, IX86_BUILTIN_PSRA?I]: Handle builtin definitions. * config/i386/mmintrin.h (__v1di): New typedef. (_mm_add_si64): Cast arguments to __v1di type. (_mm_sub_si64): Ditto. (_mm_sll_pi16): Cast __count to __v4hi type. (_mm_sll_pi32): Cast __count to __v2si type. (_mm_sll_si64): Cast arguments to __v1di type. (_mm_srl_pi16): Cast __count to __v4hi type. (_mm_srl_pi32): Cast __count to __v2si type. (_mm_srl_si64): Cast arguments to __v1di type. (_mm_sra_pi16): Cast __count to __v4hi type. (_mm_sra_pi32): Cast __count to __v2si type. (_mm_slli_pi16): Use __builtin_ia32_psllwi. (_mm_slli_pi32): Use __builtin_ia32_pslldi. (_mm_slli_si64): Use __builtin_ia32_psllqi. Cast __m to __v1di type. (_mm_srli_pi16): Use __builtin_ia32_psrlwi. (_mm_srli_pi32): Use __builtin_ia32_psrldi. (_mm_srli_si64): Use __builtin_ia32_psrlqi. Cast __m to __v1di type. (_mm_srai_pi16): Use __builtin_ia32_psrawi. (_mm_srai_pi32): Use __builtin_ia32_psradi. * config/i386/i386.md (UNSPEC_NOP): Remove unspec definition. * doc/extend.texi (X86 Built-in Functions) [__builtin_ia32_psll?, __builtin_ia32_psrl?, __builtin_ia32_psra?, __builtin_ia32_psll?i, __builtin_ia32_psrl?i, __builtin_ia32_psra?i]: Add new builtins. From-SVN: r133023
2008-03-08 07:59:33 +01:00
return (__m64) __builtin_ia32_pslld ((__v2si)__m, (__v2si)__count);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_pslld (__m64 __m, __m64 __count)
{
return _mm_sll_pi32 (__m, __count);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_slli_pi32 (__m64 __m, int __count)
{
re PR target/22152 (Poor loop optimization when using mmx builtins) 2008-03-08 Uros Bizjak <ubizjak@gmail.com> PR target/22152 * config/i386/i386-modes.def (V1DI): New vector mode. * config/i386/i386.h (VALID_MMX_REG_MODE): Add V1DImode. * config/i386/mmx.md (MMXMODEI8): New mode iterator. (MMXMODE248): Ditto. (MMXMODE): Add V1DI mode. (mmxvecsize): Change DI mode to V1DI mode. ("mov<mode>): Use MMXMODEI8 mode iterator. ("*mov<mode>_internal_rex64"): Ditto. ("*mov<mode>_internal"): Ditto. ("mmx_add<mode>3"): Ditto. Handle V1DImode for TARGET_SSE2. ("mmx_sub<mode>3"): Ditto. ("mmx_adddi3"): Remove insn pattern. ("mmx_subdi3"): Ditto. ("mmx_ashr<mode>3"): Use SImode and "yN" constraint for operand 2. ("mmx_lshr<mode>3"): Ditto. Use MMXMODE248 mode iterator. ("mmx_ashl<mode>3"): Ditto. ("mmx_lshrdi3"): Remove insn pattern. ("mmx_ashldi3"): Ditto. * config/i386/i386.c (classify_argument): Handle V1DImode. (function_arg_advance_32): Ditto. (function_arg_32): Ditto. (struct builtin_description) [IX86_BUILTIN_PADDQ]: Use mmx_addv1di3 insn pattern. [IX86_BUILTIN_PSUBQ]: Use mmx_subv1di3 insn pattern. [IX86_BUILTIN_PSLL?, IX86_BUILTIN_PSRL?, IX86_BUILTIN_PSRA?, IX86_BUILTIN_PSLL?I, IX86_BUILTIN_PSRL?I, IX86_BUILTIN_PSRA?I, IX86_BUILTIN_PSLL?I128, IX86_BUILTIN_PSRL?I128, IX86_BUILTIN_PSRA?I128]: Remove definitions of built-in functions. (V1DI_type_node): New node. (v1di_ftype_v1di_int): Ditto. (v1di_ftype_v1di_v1di): Ditto. (v2si_ftype_v2si_si): Ditto. (v4hi_ftype_v4hi_di): Remove node. (v2si_ftype_v2si_di): Ditto. (ix86_init_mmx_sse_builtins): Handle V1DImode. (__builtin_ia32_psll?, __builtin_ia32_psrl?, __builtin_ia32_psra?): Redefine builtins using def_builtin_const with *_ftype_*_int node. (__builtin_ia32_psll?i, __builtin_ia32_psrl?i, __builtin_ia32_psra?i): Add new builtins using def_builtin_const. (ix86_expand_builtin) [IX86_BUILTIN_PSLL?, IX86_BUILTIN_PSRL?, IX86_BUILTIN_PSRA?, IX86_BUILTIN_PSLL?I, IX86_BUILTIN_PSRL?I, IX86_BUILTIN_PSRA?I]: Handle builtin definitions. * config/i386/mmintrin.h (__v1di): New typedef. (_mm_add_si64): Cast arguments to __v1di type. (_mm_sub_si64): Ditto. (_mm_sll_pi16): Cast __count to __v4hi type. (_mm_sll_pi32): Cast __count to __v2si type. (_mm_sll_si64): Cast arguments to __v1di type. (_mm_srl_pi16): Cast __count to __v4hi type. (_mm_srl_pi32): Cast __count to __v2si type. (_mm_srl_si64): Cast arguments to __v1di type. (_mm_sra_pi16): Cast __count to __v4hi type. (_mm_sra_pi32): Cast __count to __v2si type. (_mm_slli_pi16): Use __builtin_ia32_psllwi. (_mm_slli_pi32): Use __builtin_ia32_pslldi. (_mm_slli_si64): Use __builtin_ia32_psllqi. Cast __m to __v1di type. (_mm_srli_pi16): Use __builtin_ia32_psrlwi. (_mm_srli_pi32): Use __builtin_ia32_psrldi. (_mm_srli_si64): Use __builtin_ia32_psrlqi. Cast __m to __v1di type. (_mm_srai_pi16): Use __builtin_ia32_psrawi. (_mm_srai_pi32): Use __builtin_ia32_psradi. * config/i386/i386.md (UNSPEC_NOP): Remove unspec definition. * doc/extend.texi (X86 Built-in Functions) [__builtin_ia32_psll?, __builtin_ia32_psrl?, __builtin_ia32_psra?, __builtin_ia32_psll?i, __builtin_ia32_psrl?i, __builtin_ia32_psra?i]: Add new builtins. From-SVN: r133023
2008-03-08 07:59:33 +01:00
return (__m64) __builtin_ia32_pslldi ((__v2si)__m, __count);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_pslldi (__m64 __m, int __count)
{
return _mm_slli_pi32 (__m, __count);
}
/* Shift the 64-bit value in M left by COUNT. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_sll_si64 (__m64 __m, __m64 __count)
{
re PR target/22152 (Poor loop optimization when using mmx builtins) 2008-03-08 Uros Bizjak <ubizjak@gmail.com> PR target/22152 * config/i386/i386-modes.def (V1DI): New vector mode. * config/i386/i386.h (VALID_MMX_REG_MODE): Add V1DImode. * config/i386/mmx.md (MMXMODEI8): New mode iterator. (MMXMODE248): Ditto. (MMXMODE): Add V1DI mode. (mmxvecsize): Change DI mode to V1DI mode. ("mov<mode>): Use MMXMODEI8 mode iterator. ("*mov<mode>_internal_rex64"): Ditto. ("*mov<mode>_internal"): Ditto. ("mmx_add<mode>3"): Ditto. Handle V1DImode for TARGET_SSE2. ("mmx_sub<mode>3"): Ditto. ("mmx_adddi3"): Remove insn pattern. ("mmx_subdi3"): Ditto. ("mmx_ashr<mode>3"): Use SImode and "yN" constraint for operand 2. ("mmx_lshr<mode>3"): Ditto. Use MMXMODE248 mode iterator. ("mmx_ashl<mode>3"): Ditto. ("mmx_lshrdi3"): Remove insn pattern. ("mmx_ashldi3"): Ditto. * config/i386/i386.c (classify_argument): Handle V1DImode. (function_arg_advance_32): Ditto. (function_arg_32): Ditto. (struct builtin_description) [IX86_BUILTIN_PADDQ]: Use mmx_addv1di3 insn pattern. [IX86_BUILTIN_PSUBQ]: Use mmx_subv1di3 insn pattern. [IX86_BUILTIN_PSLL?, IX86_BUILTIN_PSRL?, IX86_BUILTIN_PSRA?, IX86_BUILTIN_PSLL?I, IX86_BUILTIN_PSRL?I, IX86_BUILTIN_PSRA?I, IX86_BUILTIN_PSLL?I128, IX86_BUILTIN_PSRL?I128, IX86_BUILTIN_PSRA?I128]: Remove definitions of built-in functions. (V1DI_type_node): New node. (v1di_ftype_v1di_int): Ditto. (v1di_ftype_v1di_v1di): Ditto. (v2si_ftype_v2si_si): Ditto. (v4hi_ftype_v4hi_di): Remove node. (v2si_ftype_v2si_di): Ditto. (ix86_init_mmx_sse_builtins): Handle V1DImode. (__builtin_ia32_psll?, __builtin_ia32_psrl?, __builtin_ia32_psra?): Redefine builtins using def_builtin_const with *_ftype_*_int node. (__builtin_ia32_psll?i, __builtin_ia32_psrl?i, __builtin_ia32_psra?i): Add new builtins using def_builtin_const. (ix86_expand_builtin) [IX86_BUILTIN_PSLL?, IX86_BUILTIN_PSRL?, IX86_BUILTIN_PSRA?, IX86_BUILTIN_PSLL?I, IX86_BUILTIN_PSRL?I, IX86_BUILTIN_PSRA?I]: Handle builtin definitions. * config/i386/mmintrin.h (__v1di): New typedef. (_mm_add_si64): Cast arguments to __v1di type. (_mm_sub_si64): Ditto. (_mm_sll_pi16): Cast __count to __v4hi type. (_mm_sll_pi32): Cast __count to __v2si type. (_mm_sll_si64): Cast arguments to __v1di type. (_mm_srl_pi16): Cast __count to __v4hi type. (_mm_srl_pi32): Cast __count to __v2si type. (_mm_srl_si64): Cast arguments to __v1di type. (_mm_sra_pi16): Cast __count to __v4hi type. (_mm_sra_pi32): Cast __count to __v2si type. (_mm_slli_pi16): Use __builtin_ia32_psllwi. (_mm_slli_pi32): Use __builtin_ia32_pslldi. (_mm_slli_si64): Use __builtin_ia32_psllqi. Cast __m to __v1di type. (_mm_srli_pi16): Use __builtin_ia32_psrlwi. (_mm_srli_pi32): Use __builtin_ia32_psrldi. (_mm_srli_si64): Use __builtin_ia32_psrlqi. Cast __m to __v1di type. (_mm_srai_pi16): Use __builtin_ia32_psrawi. (_mm_srai_pi32): Use __builtin_ia32_psradi. * config/i386/i386.md (UNSPEC_NOP): Remove unspec definition. * doc/extend.texi (X86 Built-in Functions) [__builtin_ia32_psll?, __builtin_ia32_psrl?, __builtin_ia32_psra?, __builtin_ia32_psll?i, __builtin_ia32_psrl?i, __builtin_ia32_psra?i]: Add new builtins. From-SVN: r133023
2008-03-08 07:59:33 +01:00
return (__m64) __builtin_ia32_psllq ((__v1di)__m, (__v1di)__count);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_psllq (__m64 __m, __m64 __count)
{
return _mm_sll_si64 (__m, __count);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_slli_si64 (__m64 __m, int __count)
{
re PR target/22152 (Poor loop optimization when using mmx builtins) 2008-03-08 Uros Bizjak <ubizjak@gmail.com> PR target/22152 * config/i386/i386-modes.def (V1DI): New vector mode. * config/i386/i386.h (VALID_MMX_REG_MODE): Add V1DImode. * config/i386/mmx.md (MMXMODEI8): New mode iterator. (MMXMODE248): Ditto. (MMXMODE): Add V1DI mode. (mmxvecsize): Change DI mode to V1DI mode. ("mov<mode>): Use MMXMODEI8 mode iterator. ("*mov<mode>_internal_rex64"): Ditto. ("*mov<mode>_internal"): Ditto. ("mmx_add<mode>3"): Ditto. Handle V1DImode for TARGET_SSE2. ("mmx_sub<mode>3"): Ditto. ("mmx_adddi3"): Remove insn pattern. ("mmx_subdi3"): Ditto. ("mmx_ashr<mode>3"): Use SImode and "yN" constraint for operand 2. ("mmx_lshr<mode>3"): Ditto. Use MMXMODE248 mode iterator. ("mmx_ashl<mode>3"): Ditto. ("mmx_lshrdi3"): Remove insn pattern. ("mmx_ashldi3"): Ditto. * config/i386/i386.c (classify_argument): Handle V1DImode. (function_arg_advance_32): Ditto. (function_arg_32): Ditto. (struct builtin_description) [IX86_BUILTIN_PADDQ]: Use mmx_addv1di3 insn pattern. [IX86_BUILTIN_PSUBQ]: Use mmx_subv1di3 insn pattern. [IX86_BUILTIN_PSLL?, IX86_BUILTIN_PSRL?, IX86_BUILTIN_PSRA?, IX86_BUILTIN_PSLL?I, IX86_BUILTIN_PSRL?I, IX86_BUILTIN_PSRA?I, IX86_BUILTIN_PSLL?I128, IX86_BUILTIN_PSRL?I128, IX86_BUILTIN_PSRA?I128]: Remove definitions of built-in functions. (V1DI_type_node): New node. (v1di_ftype_v1di_int): Ditto. (v1di_ftype_v1di_v1di): Ditto. (v2si_ftype_v2si_si): Ditto. (v4hi_ftype_v4hi_di): Remove node. (v2si_ftype_v2si_di): Ditto. (ix86_init_mmx_sse_builtins): Handle V1DImode. (__builtin_ia32_psll?, __builtin_ia32_psrl?, __builtin_ia32_psra?): Redefine builtins using def_builtin_const with *_ftype_*_int node. (__builtin_ia32_psll?i, __builtin_ia32_psrl?i, __builtin_ia32_psra?i): Add new builtins using def_builtin_const. (ix86_expand_builtin) [IX86_BUILTIN_PSLL?, IX86_BUILTIN_PSRL?, IX86_BUILTIN_PSRA?, IX86_BUILTIN_PSLL?I, IX86_BUILTIN_PSRL?I, IX86_BUILTIN_PSRA?I]: Handle builtin definitions. * config/i386/mmintrin.h (__v1di): New typedef. (_mm_add_si64): Cast arguments to __v1di type. (_mm_sub_si64): Ditto. (_mm_sll_pi16): Cast __count to __v4hi type. (_mm_sll_pi32): Cast __count to __v2si type. (_mm_sll_si64): Cast arguments to __v1di type. (_mm_srl_pi16): Cast __count to __v4hi type. (_mm_srl_pi32): Cast __count to __v2si type. (_mm_srl_si64): Cast arguments to __v1di type. (_mm_sra_pi16): Cast __count to __v4hi type. (_mm_sra_pi32): Cast __count to __v2si type. (_mm_slli_pi16): Use __builtin_ia32_psllwi. (_mm_slli_pi32): Use __builtin_ia32_pslldi. (_mm_slli_si64): Use __builtin_ia32_psllqi. Cast __m to __v1di type. (_mm_srli_pi16): Use __builtin_ia32_psrlwi. (_mm_srli_pi32): Use __builtin_ia32_psrldi. (_mm_srli_si64): Use __builtin_ia32_psrlqi. Cast __m to __v1di type. (_mm_srai_pi16): Use __builtin_ia32_psrawi. (_mm_srai_pi32): Use __builtin_ia32_psradi. * config/i386/i386.md (UNSPEC_NOP): Remove unspec definition. * doc/extend.texi (X86 Built-in Functions) [__builtin_ia32_psll?, __builtin_ia32_psrl?, __builtin_ia32_psra?, __builtin_ia32_psll?i, __builtin_ia32_psrl?i, __builtin_ia32_psra?i]: Add new builtins. From-SVN: r133023
2008-03-08 07:59:33 +01:00
return (__m64) __builtin_ia32_psllqi ((__v1di)__m, __count);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_psllqi (__m64 __m, int __count)
{
return _mm_slli_si64 (__m, __count);
}
/* Shift four 16-bit values in M right by COUNT; shift in the sign bit. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_sra_pi16 (__m64 __m, __m64 __count)
{
re PR target/22152 (Poor loop optimization when using mmx builtins) 2008-03-08 Uros Bizjak <ubizjak@gmail.com> PR target/22152 * config/i386/i386-modes.def (V1DI): New vector mode. * config/i386/i386.h (VALID_MMX_REG_MODE): Add V1DImode. * config/i386/mmx.md (MMXMODEI8): New mode iterator. (MMXMODE248): Ditto. (MMXMODE): Add V1DI mode. (mmxvecsize): Change DI mode to V1DI mode. ("mov<mode>): Use MMXMODEI8 mode iterator. ("*mov<mode>_internal_rex64"): Ditto. ("*mov<mode>_internal"): Ditto. ("mmx_add<mode>3"): Ditto. Handle V1DImode for TARGET_SSE2. ("mmx_sub<mode>3"): Ditto. ("mmx_adddi3"): Remove insn pattern. ("mmx_subdi3"): Ditto. ("mmx_ashr<mode>3"): Use SImode and "yN" constraint for operand 2. ("mmx_lshr<mode>3"): Ditto. Use MMXMODE248 mode iterator. ("mmx_ashl<mode>3"): Ditto. ("mmx_lshrdi3"): Remove insn pattern. ("mmx_ashldi3"): Ditto. * config/i386/i386.c (classify_argument): Handle V1DImode. (function_arg_advance_32): Ditto. (function_arg_32): Ditto. (struct builtin_description) [IX86_BUILTIN_PADDQ]: Use mmx_addv1di3 insn pattern. [IX86_BUILTIN_PSUBQ]: Use mmx_subv1di3 insn pattern. [IX86_BUILTIN_PSLL?, IX86_BUILTIN_PSRL?, IX86_BUILTIN_PSRA?, IX86_BUILTIN_PSLL?I, IX86_BUILTIN_PSRL?I, IX86_BUILTIN_PSRA?I, IX86_BUILTIN_PSLL?I128, IX86_BUILTIN_PSRL?I128, IX86_BUILTIN_PSRA?I128]: Remove definitions of built-in functions. (V1DI_type_node): New node. (v1di_ftype_v1di_int): Ditto. (v1di_ftype_v1di_v1di): Ditto. (v2si_ftype_v2si_si): Ditto. (v4hi_ftype_v4hi_di): Remove node. (v2si_ftype_v2si_di): Ditto. (ix86_init_mmx_sse_builtins): Handle V1DImode. (__builtin_ia32_psll?, __builtin_ia32_psrl?, __builtin_ia32_psra?): Redefine builtins using def_builtin_const with *_ftype_*_int node. (__builtin_ia32_psll?i, __builtin_ia32_psrl?i, __builtin_ia32_psra?i): Add new builtins using def_builtin_const. (ix86_expand_builtin) [IX86_BUILTIN_PSLL?, IX86_BUILTIN_PSRL?, IX86_BUILTIN_PSRA?, IX86_BUILTIN_PSLL?I, IX86_BUILTIN_PSRL?I, IX86_BUILTIN_PSRA?I]: Handle builtin definitions. * config/i386/mmintrin.h (__v1di): New typedef. (_mm_add_si64): Cast arguments to __v1di type. (_mm_sub_si64): Ditto. (_mm_sll_pi16): Cast __count to __v4hi type. (_mm_sll_pi32): Cast __count to __v2si type. (_mm_sll_si64): Cast arguments to __v1di type. (_mm_srl_pi16): Cast __count to __v4hi type. (_mm_srl_pi32): Cast __count to __v2si type. (_mm_srl_si64): Cast arguments to __v1di type. (_mm_sra_pi16): Cast __count to __v4hi type. (_mm_sra_pi32): Cast __count to __v2si type. (_mm_slli_pi16): Use __builtin_ia32_psllwi. (_mm_slli_pi32): Use __builtin_ia32_pslldi. (_mm_slli_si64): Use __builtin_ia32_psllqi. Cast __m to __v1di type. (_mm_srli_pi16): Use __builtin_ia32_psrlwi. (_mm_srli_pi32): Use __builtin_ia32_psrldi. (_mm_srli_si64): Use __builtin_ia32_psrlqi. Cast __m to __v1di type. (_mm_srai_pi16): Use __builtin_ia32_psrawi. (_mm_srai_pi32): Use __builtin_ia32_psradi. * config/i386/i386.md (UNSPEC_NOP): Remove unspec definition. * doc/extend.texi (X86 Built-in Functions) [__builtin_ia32_psll?, __builtin_ia32_psrl?, __builtin_ia32_psra?, __builtin_ia32_psll?i, __builtin_ia32_psrl?i, __builtin_ia32_psra?i]: Add new builtins. From-SVN: r133023
2008-03-08 07:59:33 +01:00
return (__m64) __builtin_ia32_psraw ((__v4hi)__m, (__v4hi)__count);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_psraw (__m64 __m, __m64 __count)
{
return _mm_sra_pi16 (__m, __count);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_srai_pi16 (__m64 __m, int __count)
{
re PR target/22152 (Poor loop optimization when using mmx builtins) 2008-03-08 Uros Bizjak <ubizjak@gmail.com> PR target/22152 * config/i386/i386-modes.def (V1DI): New vector mode. * config/i386/i386.h (VALID_MMX_REG_MODE): Add V1DImode. * config/i386/mmx.md (MMXMODEI8): New mode iterator. (MMXMODE248): Ditto. (MMXMODE): Add V1DI mode. (mmxvecsize): Change DI mode to V1DI mode. ("mov<mode>): Use MMXMODEI8 mode iterator. ("*mov<mode>_internal_rex64"): Ditto. ("*mov<mode>_internal"): Ditto. ("mmx_add<mode>3"): Ditto. Handle V1DImode for TARGET_SSE2. ("mmx_sub<mode>3"): Ditto. ("mmx_adddi3"): Remove insn pattern. ("mmx_subdi3"): Ditto. ("mmx_ashr<mode>3"): Use SImode and "yN" constraint for operand 2. ("mmx_lshr<mode>3"): Ditto. Use MMXMODE248 mode iterator. ("mmx_ashl<mode>3"): Ditto. ("mmx_lshrdi3"): Remove insn pattern. ("mmx_ashldi3"): Ditto. * config/i386/i386.c (classify_argument): Handle V1DImode. (function_arg_advance_32): Ditto. (function_arg_32): Ditto. (struct builtin_description) [IX86_BUILTIN_PADDQ]: Use mmx_addv1di3 insn pattern. [IX86_BUILTIN_PSUBQ]: Use mmx_subv1di3 insn pattern. [IX86_BUILTIN_PSLL?, IX86_BUILTIN_PSRL?, IX86_BUILTIN_PSRA?, IX86_BUILTIN_PSLL?I, IX86_BUILTIN_PSRL?I, IX86_BUILTIN_PSRA?I, IX86_BUILTIN_PSLL?I128, IX86_BUILTIN_PSRL?I128, IX86_BUILTIN_PSRA?I128]: Remove definitions of built-in functions. (V1DI_type_node): New node. (v1di_ftype_v1di_int): Ditto. (v1di_ftype_v1di_v1di): Ditto. (v2si_ftype_v2si_si): Ditto. (v4hi_ftype_v4hi_di): Remove node. (v2si_ftype_v2si_di): Ditto. (ix86_init_mmx_sse_builtins): Handle V1DImode. (__builtin_ia32_psll?, __builtin_ia32_psrl?, __builtin_ia32_psra?): Redefine builtins using def_builtin_const with *_ftype_*_int node. (__builtin_ia32_psll?i, __builtin_ia32_psrl?i, __builtin_ia32_psra?i): Add new builtins using def_builtin_const. (ix86_expand_builtin) [IX86_BUILTIN_PSLL?, IX86_BUILTIN_PSRL?, IX86_BUILTIN_PSRA?, IX86_BUILTIN_PSLL?I, IX86_BUILTIN_PSRL?I, IX86_BUILTIN_PSRA?I]: Handle builtin definitions. * config/i386/mmintrin.h (__v1di): New typedef. (_mm_add_si64): Cast arguments to __v1di type. (_mm_sub_si64): Ditto. (_mm_sll_pi16): Cast __count to __v4hi type. (_mm_sll_pi32): Cast __count to __v2si type. (_mm_sll_si64): Cast arguments to __v1di type. (_mm_srl_pi16): Cast __count to __v4hi type. (_mm_srl_pi32): Cast __count to __v2si type. (_mm_srl_si64): Cast arguments to __v1di type. (_mm_sra_pi16): Cast __count to __v4hi type. (_mm_sra_pi32): Cast __count to __v2si type. (_mm_slli_pi16): Use __builtin_ia32_psllwi. (_mm_slli_pi32): Use __builtin_ia32_pslldi. (_mm_slli_si64): Use __builtin_ia32_psllqi. Cast __m to __v1di type. (_mm_srli_pi16): Use __builtin_ia32_psrlwi. (_mm_srli_pi32): Use __builtin_ia32_psrldi. (_mm_srli_si64): Use __builtin_ia32_psrlqi. Cast __m to __v1di type. (_mm_srai_pi16): Use __builtin_ia32_psrawi. (_mm_srai_pi32): Use __builtin_ia32_psradi. * config/i386/i386.md (UNSPEC_NOP): Remove unspec definition. * doc/extend.texi (X86 Built-in Functions) [__builtin_ia32_psll?, __builtin_ia32_psrl?, __builtin_ia32_psra?, __builtin_ia32_psll?i, __builtin_ia32_psrl?i, __builtin_ia32_psra?i]: Add new builtins. From-SVN: r133023
2008-03-08 07:59:33 +01:00
return (__m64) __builtin_ia32_psrawi ((__v4hi)__m, __count);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_psrawi (__m64 __m, int __count)
{
return _mm_srai_pi16 (__m, __count);
}
/* Shift two 32-bit values in M right by COUNT; shift in the sign bit. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_sra_pi32 (__m64 __m, __m64 __count)
{
re PR target/22152 (Poor loop optimization when using mmx builtins) 2008-03-08 Uros Bizjak <ubizjak@gmail.com> PR target/22152 * config/i386/i386-modes.def (V1DI): New vector mode. * config/i386/i386.h (VALID_MMX_REG_MODE): Add V1DImode. * config/i386/mmx.md (MMXMODEI8): New mode iterator. (MMXMODE248): Ditto. (MMXMODE): Add V1DI mode. (mmxvecsize): Change DI mode to V1DI mode. ("mov<mode>): Use MMXMODEI8 mode iterator. ("*mov<mode>_internal_rex64"): Ditto. ("*mov<mode>_internal"): Ditto. ("mmx_add<mode>3"): Ditto. Handle V1DImode for TARGET_SSE2. ("mmx_sub<mode>3"): Ditto. ("mmx_adddi3"): Remove insn pattern. ("mmx_subdi3"): Ditto. ("mmx_ashr<mode>3"): Use SImode and "yN" constraint for operand 2. ("mmx_lshr<mode>3"): Ditto. Use MMXMODE248 mode iterator. ("mmx_ashl<mode>3"): Ditto. ("mmx_lshrdi3"): Remove insn pattern. ("mmx_ashldi3"): Ditto. * config/i386/i386.c (classify_argument): Handle V1DImode. (function_arg_advance_32): Ditto. (function_arg_32): Ditto. (struct builtin_description) [IX86_BUILTIN_PADDQ]: Use mmx_addv1di3 insn pattern. [IX86_BUILTIN_PSUBQ]: Use mmx_subv1di3 insn pattern. [IX86_BUILTIN_PSLL?, IX86_BUILTIN_PSRL?, IX86_BUILTIN_PSRA?, IX86_BUILTIN_PSLL?I, IX86_BUILTIN_PSRL?I, IX86_BUILTIN_PSRA?I, IX86_BUILTIN_PSLL?I128, IX86_BUILTIN_PSRL?I128, IX86_BUILTIN_PSRA?I128]: Remove definitions of built-in functions. (V1DI_type_node): New node. (v1di_ftype_v1di_int): Ditto. (v1di_ftype_v1di_v1di): Ditto. (v2si_ftype_v2si_si): Ditto. (v4hi_ftype_v4hi_di): Remove node. (v2si_ftype_v2si_di): Ditto. (ix86_init_mmx_sse_builtins): Handle V1DImode. (__builtin_ia32_psll?, __builtin_ia32_psrl?, __builtin_ia32_psra?): Redefine builtins using def_builtin_const with *_ftype_*_int node. (__builtin_ia32_psll?i, __builtin_ia32_psrl?i, __builtin_ia32_psra?i): Add new builtins using def_builtin_const. (ix86_expand_builtin) [IX86_BUILTIN_PSLL?, IX86_BUILTIN_PSRL?, IX86_BUILTIN_PSRA?, IX86_BUILTIN_PSLL?I, IX86_BUILTIN_PSRL?I, IX86_BUILTIN_PSRA?I]: Handle builtin definitions. * config/i386/mmintrin.h (__v1di): New typedef. (_mm_add_si64): Cast arguments to __v1di type. (_mm_sub_si64): Ditto. (_mm_sll_pi16): Cast __count to __v4hi type. (_mm_sll_pi32): Cast __count to __v2si type. (_mm_sll_si64): Cast arguments to __v1di type. (_mm_srl_pi16): Cast __count to __v4hi type. (_mm_srl_pi32): Cast __count to __v2si type. (_mm_srl_si64): Cast arguments to __v1di type. (_mm_sra_pi16): Cast __count to __v4hi type. (_mm_sra_pi32): Cast __count to __v2si type. (_mm_slli_pi16): Use __builtin_ia32_psllwi. (_mm_slli_pi32): Use __builtin_ia32_pslldi. (_mm_slli_si64): Use __builtin_ia32_psllqi. Cast __m to __v1di type. (_mm_srli_pi16): Use __builtin_ia32_psrlwi. (_mm_srli_pi32): Use __builtin_ia32_psrldi. (_mm_srli_si64): Use __builtin_ia32_psrlqi. Cast __m to __v1di type. (_mm_srai_pi16): Use __builtin_ia32_psrawi. (_mm_srai_pi32): Use __builtin_ia32_psradi. * config/i386/i386.md (UNSPEC_NOP): Remove unspec definition. * doc/extend.texi (X86 Built-in Functions) [__builtin_ia32_psll?, __builtin_ia32_psrl?, __builtin_ia32_psra?, __builtin_ia32_psll?i, __builtin_ia32_psrl?i, __builtin_ia32_psra?i]: Add new builtins. From-SVN: r133023
2008-03-08 07:59:33 +01:00
return (__m64) __builtin_ia32_psrad ((__v2si)__m, (__v2si)__count);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_psrad (__m64 __m, __m64 __count)
{
return _mm_sra_pi32 (__m, __count);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_srai_pi32 (__m64 __m, int __count)
{
re PR target/22152 (Poor loop optimization when using mmx builtins) 2008-03-08 Uros Bizjak <ubizjak@gmail.com> PR target/22152 * config/i386/i386-modes.def (V1DI): New vector mode. * config/i386/i386.h (VALID_MMX_REG_MODE): Add V1DImode. * config/i386/mmx.md (MMXMODEI8): New mode iterator. (MMXMODE248): Ditto. (MMXMODE): Add V1DI mode. (mmxvecsize): Change DI mode to V1DI mode. ("mov<mode>): Use MMXMODEI8 mode iterator. ("*mov<mode>_internal_rex64"): Ditto. ("*mov<mode>_internal"): Ditto. ("mmx_add<mode>3"): Ditto. Handle V1DImode for TARGET_SSE2. ("mmx_sub<mode>3"): Ditto. ("mmx_adddi3"): Remove insn pattern. ("mmx_subdi3"): Ditto. ("mmx_ashr<mode>3"): Use SImode and "yN" constraint for operand 2. ("mmx_lshr<mode>3"): Ditto. Use MMXMODE248 mode iterator. ("mmx_ashl<mode>3"): Ditto. ("mmx_lshrdi3"): Remove insn pattern. ("mmx_ashldi3"): Ditto. * config/i386/i386.c (classify_argument): Handle V1DImode. (function_arg_advance_32): Ditto. (function_arg_32): Ditto. (struct builtin_description) [IX86_BUILTIN_PADDQ]: Use mmx_addv1di3 insn pattern. [IX86_BUILTIN_PSUBQ]: Use mmx_subv1di3 insn pattern. [IX86_BUILTIN_PSLL?, IX86_BUILTIN_PSRL?, IX86_BUILTIN_PSRA?, IX86_BUILTIN_PSLL?I, IX86_BUILTIN_PSRL?I, IX86_BUILTIN_PSRA?I, IX86_BUILTIN_PSLL?I128, IX86_BUILTIN_PSRL?I128, IX86_BUILTIN_PSRA?I128]: Remove definitions of built-in functions. (V1DI_type_node): New node. (v1di_ftype_v1di_int): Ditto. (v1di_ftype_v1di_v1di): Ditto. (v2si_ftype_v2si_si): Ditto. (v4hi_ftype_v4hi_di): Remove node. (v2si_ftype_v2si_di): Ditto. (ix86_init_mmx_sse_builtins): Handle V1DImode. (__builtin_ia32_psll?, __builtin_ia32_psrl?, __builtin_ia32_psra?): Redefine builtins using def_builtin_const with *_ftype_*_int node. (__builtin_ia32_psll?i, __builtin_ia32_psrl?i, __builtin_ia32_psra?i): Add new builtins using def_builtin_const. (ix86_expand_builtin) [IX86_BUILTIN_PSLL?, IX86_BUILTIN_PSRL?, IX86_BUILTIN_PSRA?, IX86_BUILTIN_PSLL?I, IX86_BUILTIN_PSRL?I, IX86_BUILTIN_PSRA?I]: Handle builtin definitions. * config/i386/mmintrin.h (__v1di): New typedef. (_mm_add_si64): Cast arguments to __v1di type. (_mm_sub_si64): Ditto. (_mm_sll_pi16): Cast __count to __v4hi type. (_mm_sll_pi32): Cast __count to __v2si type. (_mm_sll_si64): Cast arguments to __v1di type. (_mm_srl_pi16): Cast __count to __v4hi type. (_mm_srl_pi32): Cast __count to __v2si type. (_mm_srl_si64): Cast arguments to __v1di type. (_mm_sra_pi16): Cast __count to __v4hi type. (_mm_sra_pi32): Cast __count to __v2si type. (_mm_slli_pi16): Use __builtin_ia32_psllwi. (_mm_slli_pi32): Use __builtin_ia32_pslldi. (_mm_slli_si64): Use __builtin_ia32_psllqi. Cast __m to __v1di type. (_mm_srli_pi16): Use __builtin_ia32_psrlwi. (_mm_srli_pi32): Use __builtin_ia32_psrldi. (_mm_srli_si64): Use __builtin_ia32_psrlqi. Cast __m to __v1di type. (_mm_srai_pi16): Use __builtin_ia32_psrawi. (_mm_srai_pi32): Use __builtin_ia32_psradi. * config/i386/i386.md (UNSPEC_NOP): Remove unspec definition. * doc/extend.texi (X86 Built-in Functions) [__builtin_ia32_psll?, __builtin_ia32_psrl?, __builtin_ia32_psra?, __builtin_ia32_psll?i, __builtin_ia32_psrl?i, __builtin_ia32_psra?i]: Add new builtins. From-SVN: r133023
2008-03-08 07:59:33 +01:00
return (__m64) __builtin_ia32_psradi ((__v2si)__m, __count);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_psradi (__m64 __m, int __count)
{
return _mm_srai_pi32 (__m, __count);
}
/* Shift four 16-bit values in M right by COUNT; shift in zeros. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_srl_pi16 (__m64 __m, __m64 __count)
{
re PR target/22152 (Poor loop optimization when using mmx builtins) 2008-03-08 Uros Bizjak <ubizjak@gmail.com> PR target/22152 * config/i386/i386-modes.def (V1DI): New vector mode. * config/i386/i386.h (VALID_MMX_REG_MODE): Add V1DImode. * config/i386/mmx.md (MMXMODEI8): New mode iterator. (MMXMODE248): Ditto. (MMXMODE): Add V1DI mode. (mmxvecsize): Change DI mode to V1DI mode. ("mov<mode>): Use MMXMODEI8 mode iterator. ("*mov<mode>_internal_rex64"): Ditto. ("*mov<mode>_internal"): Ditto. ("mmx_add<mode>3"): Ditto. Handle V1DImode for TARGET_SSE2. ("mmx_sub<mode>3"): Ditto. ("mmx_adddi3"): Remove insn pattern. ("mmx_subdi3"): Ditto. ("mmx_ashr<mode>3"): Use SImode and "yN" constraint for operand 2. ("mmx_lshr<mode>3"): Ditto. Use MMXMODE248 mode iterator. ("mmx_ashl<mode>3"): Ditto. ("mmx_lshrdi3"): Remove insn pattern. ("mmx_ashldi3"): Ditto. * config/i386/i386.c (classify_argument): Handle V1DImode. (function_arg_advance_32): Ditto. (function_arg_32): Ditto. (struct builtin_description) [IX86_BUILTIN_PADDQ]: Use mmx_addv1di3 insn pattern. [IX86_BUILTIN_PSUBQ]: Use mmx_subv1di3 insn pattern. [IX86_BUILTIN_PSLL?, IX86_BUILTIN_PSRL?, IX86_BUILTIN_PSRA?, IX86_BUILTIN_PSLL?I, IX86_BUILTIN_PSRL?I, IX86_BUILTIN_PSRA?I, IX86_BUILTIN_PSLL?I128, IX86_BUILTIN_PSRL?I128, IX86_BUILTIN_PSRA?I128]: Remove definitions of built-in functions. (V1DI_type_node): New node. (v1di_ftype_v1di_int): Ditto. (v1di_ftype_v1di_v1di): Ditto. (v2si_ftype_v2si_si): Ditto. (v4hi_ftype_v4hi_di): Remove node. (v2si_ftype_v2si_di): Ditto. (ix86_init_mmx_sse_builtins): Handle V1DImode. (__builtin_ia32_psll?, __builtin_ia32_psrl?, __builtin_ia32_psra?): Redefine builtins using def_builtin_const with *_ftype_*_int node. (__builtin_ia32_psll?i, __builtin_ia32_psrl?i, __builtin_ia32_psra?i): Add new builtins using def_builtin_const. (ix86_expand_builtin) [IX86_BUILTIN_PSLL?, IX86_BUILTIN_PSRL?, IX86_BUILTIN_PSRA?, IX86_BUILTIN_PSLL?I, IX86_BUILTIN_PSRL?I, IX86_BUILTIN_PSRA?I]: Handle builtin definitions. * config/i386/mmintrin.h (__v1di): New typedef. (_mm_add_si64): Cast arguments to __v1di type. (_mm_sub_si64): Ditto. (_mm_sll_pi16): Cast __count to __v4hi type. (_mm_sll_pi32): Cast __count to __v2si type. (_mm_sll_si64): Cast arguments to __v1di type. (_mm_srl_pi16): Cast __count to __v4hi type. (_mm_srl_pi32): Cast __count to __v2si type. (_mm_srl_si64): Cast arguments to __v1di type. (_mm_sra_pi16): Cast __count to __v4hi type. (_mm_sra_pi32): Cast __count to __v2si type. (_mm_slli_pi16): Use __builtin_ia32_psllwi. (_mm_slli_pi32): Use __builtin_ia32_pslldi. (_mm_slli_si64): Use __builtin_ia32_psllqi. Cast __m to __v1di type. (_mm_srli_pi16): Use __builtin_ia32_psrlwi. (_mm_srli_pi32): Use __builtin_ia32_psrldi. (_mm_srli_si64): Use __builtin_ia32_psrlqi. Cast __m to __v1di type. (_mm_srai_pi16): Use __builtin_ia32_psrawi. (_mm_srai_pi32): Use __builtin_ia32_psradi. * config/i386/i386.md (UNSPEC_NOP): Remove unspec definition. * doc/extend.texi (X86 Built-in Functions) [__builtin_ia32_psll?, __builtin_ia32_psrl?, __builtin_ia32_psra?, __builtin_ia32_psll?i, __builtin_ia32_psrl?i, __builtin_ia32_psra?i]: Add new builtins. From-SVN: r133023
2008-03-08 07:59:33 +01:00
return (__m64) __builtin_ia32_psrlw ((__v4hi)__m, (__v4hi)__count);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_psrlw (__m64 __m, __m64 __count)
{
return _mm_srl_pi16 (__m, __count);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_srli_pi16 (__m64 __m, int __count)
{
re PR target/22152 (Poor loop optimization when using mmx builtins) 2008-03-08 Uros Bizjak <ubizjak@gmail.com> PR target/22152 * config/i386/i386-modes.def (V1DI): New vector mode. * config/i386/i386.h (VALID_MMX_REG_MODE): Add V1DImode. * config/i386/mmx.md (MMXMODEI8): New mode iterator. (MMXMODE248): Ditto. (MMXMODE): Add V1DI mode. (mmxvecsize): Change DI mode to V1DI mode. ("mov<mode>): Use MMXMODEI8 mode iterator. ("*mov<mode>_internal_rex64"): Ditto. ("*mov<mode>_internal"): Ditto. ("mmx_add<mode>3"): Ditto. Handle V1DImode for TARGET_SSE2. ("mmx_sub<mode>3"): Ditto. ("mmx_adddi3"): Remove insn pattern. ("mmx_subdi3"): Ditto. ("mmx_ashr<mode>3"): Use SImode and "yN" constraint for operand 2. ("mmx_lshr<mode>3"): Ditto. Use MMXMODE248 mode iterator. ("mmx_ashl<mode>3"): Ditto. ("mmx_lshrdi3"): Remove insn pattern. ("mmx_ashldi3"): Ditto. * config/i386/i386.c (classify_argument): Handle V1DImode. (function_arg_advance_32): Ditto. (function_arg_32): Ditto. (struct builtin_description) [IX86_BUILTIN_PADDQ]: Use mmx_addv1di3 insn pattern. [IX86_BUILTIN_PSUBQ]: Use mmx_subv1di3 insn pattern. [IX86_BUILTIN_PSLL?, IX86_BUILTIN_PSRL?, IX86_BUILTIN_PSRA?, IX86_BUILTIN_PSLL?I, IX86_BUILTIN_PSRL?I, IX86_BUILTIN_PSRA?I, IX86_BUILTIN_PSLL?I128, IX86_BUILTIN_PSRL?I128, IX86_BUILTIN_PSRA?I128]: Remove definitions of built-in functions. (V1DI_type_node): New node. (v1di_ftype_v1di_int): Ditto. (v1di_ftype_v1di_v1di): Ditto. (v2si_ftype_v2si_si): Ditto. (v4hi_ftype_v4hi_di): Remove node. (v2si_ftype_v2si_di): Ditto. (ix86_init_mmx_sse_builtins): Handle V1DImode. (__builtin_ia32_psll?, __builtin_ia32_psrl?, __builtin_ia32_psra?): Redefine builtins using def_builtin_const with *_ftype_*_int node. (__builtin_ia32_psll?i, __builtin_ia32_psrl?i, __builtin_ia32_psra?i): Add new builtins using def_builtin_const. (ix86_expand_builtin) [IX86_BUILTIN_PSLL?, IX86_BUILTIN_PSRL?, IX86_BUILTIN_PSRA?, IX86_BUILTIN_PSLL?I, IX86_BUILTIN_PSRL?I, IX86_BUILTIN_PSRA?I]: Handle builtin definitions. * config/i386/mmintrin.h (__v1di): New typedef. (_mm_add_si64): Cast arguments to __v1di type. (_mm_sub_si64): Ditto. (_mm_sll_pi16): Cast __count to __v4hi type. (_mm_sll_pi32): Cast __count to __v2si type. (_mm_sll_si64): Cast arguments to __v1di type. (_mm_srl_pi16): Cast __count to __v4hi type. (_mm_srl_pi32): Cast __count to __v2si type. (_mm_srl_si64): Cast arguments to __v1di type. (_mm_sra_pi16): Cast __count to __v4hi type. (_mm_sra_pi32): Cast __count to __v2si type. (_mm_slli_pi16): Use __builtin_ia32_psllwi. (_mm_slli_pi32): Use __builtin_ia32_pslldi. (_mm_slli_si64): Use __builtin_ia32_psllqi. Cast __m to __v1di type. (_mm_srli_pi16): Use __builtin_ia32_psrlwi. (_mm_srli_pi32): Use __builtin_ia32_psrldi. (_mm_srli_si64): Use __builtin_ia32_psrlqi. Cast __m to __v1di type. (_mm_srai_pi16): Use __builtin_ia32_psrawi. (_mm_srai_pi32): Use __builtin_ia32_psradi. * config/i386/i386.md (UNSPEC_NOP): Remove unspec definition. * doc/extend.texi (X86 Built-in Functions) [__builtin_ia32_psll?, __builtin_ia32_psrl?, __builtin_ia32_psra?, __builtin_ia32_psll?i, __builtin_ia32_psrl?i, __builtin_ia32_psra?i]: Add new builtins. From-SVN: r133023
2008-03-08 07:59:33 +01:00
return (__m64) __builtin_ia32_psrlwi ((__v4hi)__m, __count);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_psrlwi (__m64 __m, int __count)
{
return _mm_srli_pi16 (__m, __count);
}
/* Shift two 32-bit values in M right by COUNT; shift in zeros. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_srl_pi32 (__m64 __m, __m64 __count)
{
re PR target/22152 (Poor loop optimization when using mmx builtins) 2008-03-08 Uros Bizjak <ubizjak@gmail.com> PR target/22152 * config/i386/i386-modes.def (V1DI): New vector mode. * config/i386/i386.h (VALID_MMX_REG_MODE): Add V1DImode. * config/i386/mmx.md (MMXMODEI8): New mode iterator. (MMXMODE248): Ditto. (MMXMODE): Add V1DI mode. (mmxvecsize): Change DI mode to V1DI mode. ("mov<mode>): Use MMXMODEI8 mode iterator. ("*mov<mode>_internal_rex64"): Ditto. ("*mov<mode>_internal"): Ditto. ("mmx_add<mode>3"): Ditto. Handle V1DImode for TARGET_SSE2. ("mmx_sub<mode>3"): Ditto. ("mmx_adddi3"): Remove insn pattern. ("mmx_subdi3"): Ditto. ("mmx_ashr<mode>3"): Use SImode and "yN" constraint for operand 2. ("mmx_lshr<mode>3"): Ditto. Use MMXMODE248 mode iterator. ("mmx_ashl<mode>3"): Ditto. ("mmx_lshrdi3"): Remove insn pattern. ("mmx_ashldi3"): Ditto. * config/i386/i386.c (classify_argument): Handle V1DImode. (function_arg_advance_32): Ditto. (function_arg_32): Ditto. (struct builtin_description) [IX86_BUILTIN_PADDQ]: Use mmx_addv1di3 insn pattern. [IX86_BUILTIN_PSUBQ]: Use mmx_subv1di3 insn pattern. [IX86_BUILTIN_PSLL?, IX86_BUILTIN_PSRL?, IX86_BUILTIN_PSRA?, IX86_BUILTIN_PSLL?I, IX86_BUILTIN_PSRL?I, IX86_BUILTIN_PSRA?I, IX86_BUILTIN_PSLL?I128, IX86_BUILTIN_PSRL?I128, IX86_BUILTIN_PSRA?I128]: Remove definitions of built-in functions. (V1DI_type_node): New node. (v1di_ftype_v1di_int): Ditto. (v1di_ftype_v1di_v1di): Ditto. (v2si_ftype_v2si_si): Ditto. (v4hi_ftype_v4hi_di): Remove node. (v2si_ftype_v2si_di): Ditto. (ix86_init_mmx_sse_builtins): Handle V1DImode. (__builtin_ia32_psll?, __builtin_ia32_psrl?, __builtin_ia32_psra?): Redefine builtins using def_builtin_const with *_ftype_*_int node. (__builtin_ia32_psll?i, __builtin_ia32_psrl?i, __builtin_ia32_psra?i): Add new builtins using def_builtin_const. (ix86_expand_builtin) [IX86_BUILTIN_PSLL?, IX86_BUILTIN_PSRL?, IX86_BUILTIN_PSRA?, IX86_BUILTIN_PSLL?I, IX86_BUILTIN_PSRL?I, IX86_BUILTIN_PSRA?I]: Handle builtin definitions. * config/i386/mmintrin.h (__v1di): New typedef. (_mm_add_si64): Cast arguments to __v1di type. (_mm_sub_si64): Ditto. (_mm_sll_pi16): Cast __count to __v4hi type. (_mm_sll_pi32): Cast __count to __v2si type. (_mm_sll_si64): Cast arguments to __v1di type. (_mm_srl_pi16): Cast __count to __v4hi type. (_mm_srl_pi32): Cast __count to __v2si type. (_mm_srl_si64): Cast arguments to __v1di type. (_mm_sra_pi16): Cast __count to __v4hi type. (_mm_sra_pi32): Cast __count to __v2si type. (_mm_slli_pi16): Use __builtin_ia32_psllwi. (_mm_slli_pi32): Use __builtin_ia32_pslldi. (_mm_slli_si64): Use __builtin_ia32_psllqi. Cast __m to __v1di type. (_mm_srli_pi16): Use __builtin_ia32_psrlwi. (_mm_srli_pi32): Use __builtin_ia32_psrldi. (_mm_srli_si64): Use __builtin_ia32_psrlqi. Cast __m to __v1di type. (_mm_srai_pi16): Use __builtin_ia32_psrawi. (_mm_srai_pi32): Use __builtin_ia32_psradi. * config/i386/i386.md (UNSPEC_NOP): Remove unspec definition. * doc/extend.texi (X86 Built-in Functions) [__builtin_ia32_psll?, __builtin_ia32_psrl?, __builtin_ia32_psra?, __builtin_ia32_psll?i, __builtin_ia32_psrl?i, __builtin_ia32_psra?i]: Add new builtins. From-SVN: r133023
2008-03-08 07:59:33 +01:00
return (__m64) __builtin_ia32_psrld ((__v2si)__m, (__v2si)__count);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_psrld (__m64 __m, __m64 __count)
{
return _mm_srl_pi32 (__m, __count);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_srli_pi32 (__m64 __m, int __count)
{
re PR target/22152 (Poor loop optimization when using mmx builtins) 2008-03-08 Uros Bizjak <ubizjak@gmail.com> PR target/22152 * config/i386/i386-modes.def (V1DI): New vector mode. * config/i386/i386.h (VALID_MMX_REG_MODE): Add V1DImode. * config/i386/mmx.md (MMXMODEI8): New mode iterator. (MMXMODE248): Ditto. (MMXMODE): Add V1DI mode. (mmxvecsize): Change DI mode to V1DI mode. ("mov<mode>): Use MMXMODEI8 mode iterator. ("*mov<mode>_internal_rex64"): Ditto. ("*mov<mode>_internal"): Ditto. ("mmx_add<mode>3"): Ditto. Handle V1DImode for TARGET_SSE2. ("mmx_sub<mode>3"): Ditto. ("mmx_adddi3"): Remove insn pattern. ("mmx_subdi3"): Ditto. ("mmx_ashr<mode>3"): Use SImode and "yN" constraint for operand 2. ("mmx_lshr<mode>3"): Ditto. Use MMXMODE248 mode iterator. ("mmx_ashl<mode>3"): Ditto. ("mmx_lshrdi3"): Remove insn pattern. ("mmx_ashldi3"): Ditto. * config/i386/i386.c (classify_argument): Handle V1DImode. (function_arg_advance_32): Ditto. (function_arg_32): Ditto. (struct builtin_description) [IX86_BUILTIN_PADDQ]: Use mmx_addv1di3 insn pattern. [IX86_BUILTIN_PSUBQ]: Use mmx_subv1di3 insn pattern. [IX86_BUILTIN_PSLL?, IX86_BUILTIN_PSRL?, IX86_BUILTIN_PSRA?, IX86_BUILTIN_PSLL?I, IX86_BUILTIN_PSRL?I, IX86_BUILTIN_PSRA?I, IX86_BUILTIN_PSLL?I128, IX86_BUILTIN_PSRL?I128, IX86_BUILTIN_PSRA?I128]: Remove definitions of built-in functions. (V1DI_type_node): New node. (v1di_ftype_v1di_int): Ditto. (v1di_ftype_v1di_v1di): Ditto. (v2si_ftype_v2si_si): Ditto. (v4hi_ftype_v4hi_di): Remove node. (v2si_ftype_v2si_di): Ditto. (ix86_init_mmx_sse_builtins): Handle V1DImode. (__builtin_ia32_psll?, __builtin_ia32_psrl?, __builtin_ia32_psra?): Redefine builtins using def_builtin_const with *_ftype_*_int node. (__builtin_ia32_psll?i, __builtin_ia32_psrl?i, __builtin_ia32_psra?i): Add new builtins using def_builtin_const. (ix86_expand_builtin) [IX86_BUILTIN_PSLL?, IX86_BUILTIN_PSRL?, IX86_BUILTIN_PSRA?, IX86_BUILTIN_PSLL?I, IX86_BUILTIN_PSRL?I, IX86_BUILTIN_PSRA?I]: Handle builtin definitions. * config/i386/mmintrin.h (__v1di): New typedef. (_mm_add_si64): Cast arguments to __v1di type. (_mm_sub_si64): Ditto. (_mm_sll_pi16): Cast __count to __v4hi type. (_mm_sll_pi32): Cast __count to __v2si type. (_mm_sll_si64): Cast arguments to __v1di type. (_mm_srl_pi16): Cast __count to __v4hi type. (_mm_srl_pi32): Cast __count to __v2si type. (_mm_srl_si64): Cast arguments to __v1di type. (_mm_sra_pi16): Cast __count to __v4hi type. (_mm_sra_pi32): Cast __count to __v2si type. (_mm_slli_pi16): Use __builtin_ia32_psllwi. (_mm_slli_pi32): Use __builtin_ia32_pslldi. (_mm_slli_si64): Use __builtin_ia32_psllqi. Cast __m to __v1di type. (_mm_srli_pi16): Use __builtin_ia32_psrlwi. (_mm_srli_pi32): Use __builtin_ia32_psrldi. (_mm_srli_si64): Use __builtin_ia32_psrlqi. Cast __m to __v1di type. (_mm_srai_pi16): Use __builtin_ia32_psrawi. (_mm_srai_pi32): Use __builtin_ia32_psradi. * config/i386/i386.md (UNSPEC_NOP): Remove unspec definition. * doc/extend.texi (X86 Built-in Functions) [__builtin_ia32_psll?, __builtin_ia32_psrl?, __builtin_ia32_psra?, __builtin_ia32_psll?i, __builtin_ia32_psrl?i, __builtin_ia32_psra?i]: Add new builtins. From-SVN: r133023
2008-03-08 07:59:33 +01:00
return (__m64) __builtin_ia32_psrldi ((__v2si)__m, __count);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_psrldi (__m64 __m, int __count)
{
return _mm_srli_pi32 (__m, __count);
}
/* Shift the 64-bit value in M left by COUNT; shift in zeros. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_srl_si64 (__m64 __m, __m64 __count)
{
re PR target/22152 (Poor loop optimization when using mmx builtins) 2008-03-08 Uros Bizjak <ubizjak@gmail.com> PR target/22152 * config/i386/i386-modes.def (V1DI): New vector mode. * config/i386/i386.h (VALID_MMX_REG_MODE): Add V1DImode. * config/i386/mmx.md (MMXMODEI8): New mode iterator. (MMXMODE248): Ditto. (MMXMODE): Add V1DI mode. (mmxvecsize): Change DI mode to V1DI mode. ("mov<mode>): Use MMXMODEI8 mode iterator. ("*mov<mode>_internal_rex64"): Ditto. ("*mov<mode>_internal"): Ditto. ("mmx_add<mode>3"): Ditto. Handle V1DImode for TARGET_SSE2. ("mmx_sub<mode>3"): Ditto. ("mmx_adddi3"): Remove insn pattern. ("mmx_subdi3"): Ditto. ("mmx_ashr<mode>3"): Use SImode and "yN" constraint for operand 2. ("mmx_lshr<mode>3"): Ditto. Use MMXMODE248 mode iterator. ("mmx_ashl<mode>3"): Ditto. ("mmx_lshrdi3"): Remove insn pattern. ("mmx_ashldi3"): Ditto. * config/i386/i386.c (classify_argument): Handle V1DImode. (function_arg_advance_32): Ditto. (function_arg_32): Ditto. (struct builtin_description) [IX86_BUILTIN_PADDQ]: Use mmx_addv1di3 insn pattern. [IX86_BUILTIN_PSUBQ]: Use mmx_subv1di3 insn pattern. [IX86_BUILTIN_PSLL?, IX86_BUILTIN_PSRL?, IX86_BUILTIN_PSRA?, IX86_BUILTIN_PSLL?I, IX86_BUILTIN_PSRL?I, IX86_BUILTIN_PSRA?I, IX86_BUILTIN_PSLL?I128, IX86_BUILTIN_PSRL?I128, IX86_BUILTIN_PSRA?I128]: Remove definitions of built-in functions. (V1DI_type_node): New node. (v1di_ftype_v1di_int): Ditto. (v1di_ftype_v1di_v1di): Ditto. (v2si_ftype_v2si_si): Ditto. (v4hi_ftype_v4hi_di): Remove node. (v2si_ftype_v2si_di): Ditto. (ix86_init_mmx_sse_builtins): Handle V1DImode. (__builtin_ia32_psll?, __builtin_ia32_psrl?, __builtin_ia32_psra?): Redefine builtins using def_builtin_const with *_ftype_*_int node. (__builtin_ia32_psll?i, __builtin_ia32_psrl?i, __builtin_ia32_psra?i): Add new builtins using def_builtin_const. (ix86_expand_builtin) [IX86_BUILTIN_PSLL?, IX86_BUILTIN_PSRL?, IX86_BUILTIN_PSRA?, IX86_BUILTIN_PSLL?I, IX86_BUILTIN_PSRL?I, IX86_BUILTIN_PSRA?I]: Handle builtin definitions. * config/i386/mmintrin.h (__v1di): New typedef. (_mm_add_si64): Cast arguments to __v1di type. (_mm_sub_si64): Ditto. (_mm_sll_pi16): Cast __count to __v4hi type. (_mm_sll_pi32): Cast __count to __v2si type. (_mm_sll_si64): Cast arguments to __v1di type. (_mm_srl_pi16): Cast __count to __v4hi type. (_mm_srl_pi32): Cast __count to __v2si type. (_mm_srl_si64): Cast arguments to __v1di type. (_mm_sra_pi16): Cast __count to __v4hi type. (_mm_sra_pi32): Cast __count to __v2si type. (_mm_slli_pi16): Use __builtin_ia32_psllwi. (_mm_slli_pi32): Use __builtin_ia32_pslldi. (_mm_slli_si64): Use __builtin_ia32_psllqi. Cast __m to __v1di type. (_mm_srli_pi16): Use __builtin_ia32_psrlwi. (_mm_srli_pi32): Use __builtin_ia32_psrldi. (_mm_srli_si64): Use __builtin_ia32_psrlqi. Cast __m to __v1di type. (_mm_srai_pi16): Use __builtin_ia32_psrawi. (_mm_srai_pi32): Use __builtin_ia32_psradi. * config/i386/i386.md (UNSPEC_NOP): Remove unspec definition. * doc/extend.texi (X86 Built-in Functions) [__builtin_ia32_psll?, __builtin_ia32_psrl?, __builtin_ia32_psra?, __builtin_ia32_psll?i, __builtin_ia32_psrl?i, __builtin_ia32_psra?i]: Add new builtins. From-SVN: r133023
2008-03-08 07:59:33 +01:00
return (__m64) __builtin_ia32_psrlq ((__v1di)__m, (__v1di)__count);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_psrlq (__m64 __m, __m64 __count)
{
return _mm_srl_si64 (__m, __count);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_srli_si64 (__m64 __m, int __count)
{
re PR target/22152 (Poor loop optimization when using mmx builtins) 2008-03-08 Uros Bizjak <ubizjak@gmail.com> PR target/22152 * config/i386/i386-modes.def (V1DI): New vector mode. * config/i386/i386.h (VALID_MMX_REG_MODE): Add V1DImode. * config/i386/mmx.md (MMXMODEI8): New mode iterator. (MMXMODE248): Ditto. (MMXMODE): Add V1DI mode. (mmxvecsize): Change DI mode to V1DI mode. ("mov<mode>): Use MMXMODEI8 mode iterator. ("*mov<mode>_internal_rex64"): Ditto. ("*mov<mode>_internal"): Ditto. ("mmx_add<mode>3"): Ditto. Handle V1DImode for TARGET_SSE2. ("mmx_sub<mode>3"): Ditto. ("mmx_adddi3"): Remove insn pattern. ("mmx_subdi3"): Ditto. ("mmx_ashr<mode>3"): Use SImode and "yN" constraint for operand 2. ("mmx_lshr<mode>3"): Ditto. Use MMXMODE248 mode iterator. ("mmx_ashl<mode>3"): Ditto. ("mmx_lshrdi3"): Remove insn pattern. ("mmx_ashldi3"): Ditto. * config/i386/i386.c (classify_argument): Handle V1DImode. (function_arg_advance_32): Ditto. (function_arg_32): Ditto. (struct builtin_description) [IX86_BUILTIN_PADDQ]: Use mmx_addv1di3 insn pattern. [IX86_BUILTIN_PSUBQ]: Use mmx_subv1di3 insn pattern. [IX86_BUILTIN_PSLL?, IX86_BUILTIN_PSRL?, IX86_BUILTIN_PSRA?, IX86_BUILTIN_PSLL?I, IX86_BUILTIN_PSRL?I, IX86_BUILTIN_PSRA?I, IX86_BUILTIN_PSLL?I128, IX86_BUILTIN_PSRL?I128, IX86_BUILTIN_PSRA?I128]: Remove definitions of built-in functions. (V1DI_type_node): New node. (v1di_ftype_v1di_int): Ditto. (v1di_ftype_v1di_v1di): Ditto. (v2si_ftype_v2si_si): Ditto. (v4hi_ftype_v4hi_di): Remove node. (v2si_ftype_v2si_di): Ditto. (ix86_init_mmx_sse_builtins): Handle V1DImode. (__builtin_ia32_psll?, __builtin_ia32_psrl?, __builtin_ia32_psra?): Redefine builtins using def_builtin_const with *_ftype_*_int node. (__builtin_ia32_psll?i, __builtin_ia32_psrl?i, __builtin_ia32_psra?i): Add new builtins using def_builtin_const. (ix86_expand_builtin) [IX86_BUILTIN_PSLL?, IX86_BUILTIN_PSRL?, IX86_BUILTIN_PSRA?, IX86_BUILTIN_PSLL?I, IX86_BUILTIN_PSRL?I, IX86_BUILTIN_PSRA?I]: Handle builtin definitions. * config/i386/mmintrin.h (__v1di): New typedef. (_mm_add_si64): Cast arguments to __v1di type. (_mm_sub_si64): Ditto. (_mm_sll_pi16): Cast __count to __v4hi type. (_mm_sll_pi32): Cast __count to __v2si type. (_mm_sll_si64): Cast arguments to __v1di type. (_mm_srl_pi16): Cast __count to __v4hi type. (_mm_srl_pi32): Cast __count to __v2si type. (_mm_srl_si64): Cast arguments to __v1di type. (_mm_sra_pi16): Cast __count to __v4hi type. (_mm_sra_pi32): Cast __count to __v2si type. (_mm_slli_pi16): Use __builtin_ia32_psllwi. (_mm_slli_pi32): Use __builtin_ia32_pslldi. (_mm_slli_si64): Use __builtin_ia32_psllqi. Cast __m to __v1di type. (_mm_srli_pi16): Use __builtin_ia32_psrlwi. (_mm_srli_pi32): Use __builtin_ia32_psrldi. (_mm_srli_si64): Use __builtin_ia32_psrlqi. Cast __m to __v1di type. (_mm_srai_pi16): Use __builtin_ia32_psrawi. (_mm_srai_pi32): Use __builtin_ia32_psradi. * config/i386/i386.md (UNSPEC_NOP): Remove unspec definition. * doc/extend.texi (X86 Built-in Functions) [__builtin_ia32_psll?, __builtin_ia32_psrl?, __builtin_ia32_psra?, __builtin_ia32_psll?i, __builtin_ia32_psrl?i, __builtin_ia32_psra?i]: Add new builtins. From-SVN: r133023
2008-03-08 07:59:33 +01:00
return (__m64) __builtin_ia32_psrlqi ((__v1di)__m, __count);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_psrlqi (__m64 __m, int __count)
{
return _mm_srli_si64 (__m, __count);
}
/* Bit-wise AND the 64-bit values in M1 and M2. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_and_si64 (__m64 __m1, __m64 __m2)
{
i386.c (bdesc_2arg): Update names for mmx_ prefixes. * config/i386/i386.c (bdesc_2arg): Update names for mmx_ prefixes. (ix86_expand_builtin): Likewise. Frob MASKMOVQ wrt the input mem just like MASKMOVDQU. Return plain zero for MMX_ZERO. * config/i386/i386.md (MMXMODEI, mov<MMXMODEI>, mov<MMXMODEI>_internal_rex64, mov<MMXMODEI>_internal, movv2sf, movv2sf_internal_rex64, movv2sf_internal, MMXMODE, movmisalign<MMXMODE>, mmx_pmovmskb, mmx_maskmovq, mmx_maskmovq_rex, sse_movntdi, addv8qi3, addv4hi3, addv2si3, mmx_adddi3, ssaddv8qi3, ssaddv4hi3, usaddv8qi3, usaddv4hi3, subv8qi3, subv4hi3, subv2si3, mmx_subdi3, sssubv8qi3, sssubv4hi3, ussubv8qi3, ussubv4hi3, mulv4hi3, smulv4hi3_highpart, umulv4hi3_highpart, mmx_pmaddwd, sse2_umulsidi3, mmx_iordi3, mmx_xordi3, mmx_anddi3, mmx_nanddi3, mmx_uavgv8qi3, mmx_uavgv4hi3, mmx_psadbw, mmx_pinsrw, mmx_pinsrw, mmx_pextrw, mmx_pshufw, eqv8qi3, eqv4hi3, eqv2si3, gtv8qi3, gtv4hi3, gtv2si3, umaxv8qi3, smaxv4hi3, uminv8qi3, sminv4hi3, ashrv4hi3, ashrv2si3, lshrv4hi3, lshrv2si3, mmx_lshrdi3, ashlv4hi3, ashlv2si3, mmx_ashldi3, mmx_packsswb, mmx_packssdw, mmx_packuswb, mmx_punpckhbw, mmx_punpckhwd, mmx_punpckhdq, mmx_punpcklbw, mmx_punpcklwd, mmx_punpckldq, emms, addv2sf3, subv2sf3, subrv2sf3, gtv2sf3, gev2sf3, eqv2sf3, pfmaxv2sf3, pfminv2sf3, mulv2sf3, femms, pf2id, pf2iw, pfacc, pfnacc, pfpnacc, pi2fw, floatv2si2, pfrcpv2sf2, pfrcpit1v2sf3, pfrcpit2v2sf3, pfrsqrtv2sf2, pfrsqit1v2sf3, pmulhrwv4hi3, pswapdv2si2, pswapdv2sf2): Move to mmx.md; rename as necessary with leading mmx_ prefix. (mmx_clrdi, pavgusb): Remove. (ldmxcsr, stmxcsr, sfence, sfence_insn): Move to sse.md; rename with leading sse_ prefix. * config/i386/sse.md: Receive them. * config/i386/mmx.md: New file. (MMXMODE12, MMXMODE24, mmxvecsize): New. (subrv2sf3): Turn into expander for normal subtraction. (mmx_addv2sf3, mmx_mulv2sf3, mmx_smaxv2sf3, mmx_sminv2sf3, mmx_eqv2sf3, mmx_mulv4hi3, mmx_smulv4hi3_highpart, mmx_umulv4hi3_highpart, mmx_pmaddwd, mmx_pmulhrwv4hi3, sse2_umulsidi3, mmx_umaxv8qi3, mmx_smaxv4hi3, mmx_uminv8qi3, mmx_sminv4hi3): Mark commutative; use ix86_binary_operator_ok. (mmx_add<MMXMODEI>3, mmx_ssadd<MMXMODE12>3, mmx_usadd<MMXMODE12>3, mmx_sub<MMXMODEI>3, mmx_sssub<MMXMODE12>3, mmx_ussub<MMXMODE12>3 mmx_ashr<MMXMODE24>3, mmx_lshr<MMXMODE23>3, mmx_ashl<MMXMODE24>3 mmx_eq<MMXMODEI>3, mmx_gt<MMXMODEI>3, mmx_and<MMXMODEI>3, mmx_nand<MMXMODEI>3, mmx_ior<MMXMODEI>3, mmx_xor<MMXMODEI>3): Macroize from existing patterns; use ix86_binary_operator_ok. (mmx_packsswb, mmx_packssdw, mmx_packuswb): Add memory alternative. (mmx_punpckhbw, mmx_punpcklbw, mmx_punpckhwd, mmx_punpcklwd, mmx_punpckhdq, mmx_punpckhdq, mmx_punpckldq): Likewise. Model with vec_select+vec_concat. (mmx_pshufw, mmx_pshufw_1): Likewise. (mmx_uavgv8qi3): Merge pavgusb. Model correcty. (mmx_uavgv4hi3): Model correctly. * config/i386/mmintrin.h (_mm_and_si64, _mm_andnot_si64, _mm_or_si64, _mm_xor_si64): Remove casts. From-SVN: r93107
2005-01-09 12:23:25 +01:00
return __builtin_ia32_pand (__m1, __m2);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_pand (__m64 __m1, __m64 __m2)
{
return _mm_and_si64 (__m1, __m2);
}
/* Bit-wise complement the 64-bit value in M1 and bit-wise AND it with the
64-bit value in M2. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_andnot_si64 (__m64 __m1, __m64 __m2)
{
i386.c (bdesc_2arg): Update names for mmx_ prefixes. * config/i386/i386.c (bdesc_2arg): Update names for mmx_ prefixes. (ix86_expand_builtin): Likewise. Frob MASKMOVQ wrt the input mem just like MASKMOVDQU. Return plain zero for MMX_ZERO. * config/i386/i386.md (MMXMODEI, mov<MMXMODEI>, mov<MMXMODEI>_internal_rex64, mov<MMXMODEI>_internal, movv2sf, movv2sf_internal_rex64, movv2sf_internal, MMXMODE, movmisalign<MMXMODE>, mmx_pmovmskb, mmx_maskmovq, mmx_maskmovq_rex, sse_movntdi, addv8qi3, addv4hi3, addv2si3, mmx_adddi3, ssaddv8qi3, ssaddv4hi3, usaddv8qi3, usaddv4hi3, subv8qi3, subv4hi3, subv2si3, mmx_subdi3, sssubv8qi3, sssubv4hi3, ussubv8qi3, ussubv4hi3, mulv4hi3, smulv4hi3_highpart, umulv4hi3_highpart, mmx_pmaddwd, sse2_umulsidi3, mmx_iordi3, mmx_xordi3, mmx_anddi3, mmx_nanddi3, mmx_uavgv8qi3, mmx_uavgv4hi3, mmx_psadbw, mmx_pinsrw, mmx_pinsrw, mmx_pextrw, mmx_pshufw, eqv8qi3, eqv4hi3, eqv2si3, gtv8qi3, gtv4hi3, gtv2si3, umaxv8qi3, smaxv4hi3, uminv8qi3, sminv4hi3, ashrv4hi3, ashrv2si3, lshrv4hi3, lshrv2si3, mmx_lshrdi3, ashlv4hi3, ashlv2si3, mmx_ashldi3, mmx_packsswb, mmx_packssdw, mmx_packuswb, mmx_punpckhbw, mmx_punpckhwd, mmx_punpckhdq, mmx_punpcklbw, mmx_punpcklwd, mmx_punpckldq, emms, addv2sf3, subv2sf3, subrv2sf3, gtv2sf3, gev2sf3, eqv2sf3, pfmaxv2sf3, pfminv2sf3, mulv2sf3, femms, pf2id, pf2iw, pfacc, pfnacc, pfpnacc, pi2fw, floatv2si2, pfrcpv2sf2, pfrcpit1v2sf3, pfrcpit2v2sf3, pfrsqrtv2sf2, pfrsqit1v2sf3, pmulhrwv4hi3, pswapdv2si2, pswapdv2sf2): Move to mmx.md; rename as necessary with leading mmx_ prefix. (mmx_clrdi, pavgusb): Remove. (ldmxcsr, stmxcsr, sfence, sfence_insn): Move to sse.md; rename with leading sse_ prefix. * config/i386/sse.md: Receive them. * config/i386/mmx.md: New file. (MMXMODE12, MMXMODE24, mmxvecsize): New. (subrv2sf3): Turn into expander for normal subtraction. (mmx_addv2sf3, mmx_mulv2sf3, mmx_smaxv2sf3, mmx_sminv2sf3, mmx_eqv2sf3, mmx_mulv4hi3, mmx_smulv4hi3_highpart, mmx_umulv4hi3_highpart, mmx_pmaddwd, mmx_pmulhrwv4hi3, sse2_umulsidi3, mmx_umaxv8qi3, mmx_smaxv4hi3, mmx_uminv8qi3, mmx_sminv4hi3): Mark commutative; use ix86_binary_operator_ok. (mmx_add<MMXMODEI>3, mmx_ssadd<MMXMODE12>3, mmx_usadd<MMXMODE12>3, mmx_sub<MMXMODEI>3, mmx_sssub<MMXMODE12>3, mmx_ussub<MMXMODE12>3 mmx_ashr<MMXMODE24>3, mmx_lshr<MMXMODE23>3, mmx_ashl<MMXMODE24>3 mmx_eq<MMXMODEI>3, mmx_gt<MMXMODEI>3, mmx_and<MMXMODEI>3, mmx_nand<MMXMODEI>3, mmx_ior<MMXMODEI>3, mmx_xor<MMXMODEI>3): Macroize from existing patterns; use ix86_binary_operator_ok. (mmx_packsswb, mmx_packssdw, mmx_packuswb): Add memory alternative. (mmx_punpckhbw, mmx_punpcklbw, mmx_punpckhwd, mmx_punpcklwd, mmx_punpckhdq, mmx_punpckhdq, mmx_punpckldq): Likewise. Model with vec_select+vec_concat. (mmx_pshufw, mmx_pshufw_1): Likewise. (mmx_uavgv8qi3): Merge pavgusb. Model correcty. (mmx_uavgv4hi3): Model correctly. * config/i386/mmintrin.h (_mm_and_si64, _mm_andnot_si64, _mm_or_si64, _mm_xor_si64): Remove casts. From-SVN: r93107
2005-01-09 12:23:25 +01:00
return __builtin_ia32_pandn (__m1, __m2);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_pandn (__m64 __m1, __m64 __m2)
{
return _mm_andnot_si64 (__m1, __m2);
}
/* Bit-wise inclusive OR the 64-bit values in M1 and M2. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_or_si64 (__m64 __m1, __m64 __m2)
{
i386.c (bdesc_2arg): Update names for mmx_ prefixes. * config/i386/i386.c (bdesc_2arg): Update names for mmx_ prefixes. (ix86_expand_builtin): Likewise. Frob MASKMOVQ wrt the input mem just like MASKMOVDQU. Return plain zero for MMX_ZERO. * config/i386/i386.md (MMXMODEI, mov<MMXMODEI>, mov<MMXMODEI>_internal_rex64, mov<MMXMODEI>_internal, movv2sf, movv2sf_internal_rex64, movv2sf_internal, MMXMODE, movmisalign<MMXMODE>, mmx_pmovmskb, mmx_maskmovq, mmx_maskmovq_rex, sse_movntdi, addv8qi3, addv4hi3, addv2si3, mmx_adddi3, ssaddv8qi3, ssaddv4hi3, usaddv8qi3, usaddv4hi3, subv8qi3, subv4hi3, subv2si3, mmx_subdi3, sssubv8qi3, sssubv4hi3, ussubv8qi3, ussubv4hi3, mulv4hi3, smulv4hi3_highpart, umulv4hi3_highpart, mmx_pmaddwd, sse2_umulsidi3, mmx_iordi3, mmx_xordi3, mmx_anddi3, mmx_nanddi3, mmx_uavgv8qi3, mmx_uavgv4hi3, mmx_psadbw, mmx_pinsrw, mmx_pinsrw, mmx_pextrw, mmx_pshufw, eqv8qi3, eqv4hi3, eqv2si3, gtv8qi3, gtv4hi3, gtv2si3, umaxv8qi3, smaxv4hi3, uminv8qi3, sminv4hi3, ashrv4hi3, ashrv2si3, lshrv4hi3, lshrv2si3, mmx_lshrdi3, ashlv4hi3, ashlv2si3, mmx_ashldi3, mmx_packsswb, mmx_packssdw, mmx_packuswb, mmx_punpckhbw, mmx_punpckhwd, mmx_punpckhdq, mmx_punpcklbw, mmx_punpcklwd, mmx_punpckldq, emms, addv2sf3, subv2sf3, subrv2sf3, gtv2sf3, gev2sf3, eqv2sf3, pfmaxv2sf3, pfminv2sf3, mulv2sf3, femms, pf2id, pf2iw, pfacc, pfnacc, pfpnacc, pi2fw, floatv2si2, pfrcpv2sf2, pfrcpit1v2sf3, pfrcpit2v2sf3, pfrsqrtv2sf2, pfrsqit1v2sf3, pmulhrwv4hi3, pswapdv2si2, pswapdv2sf2): Move to mmx.md; rename as necessary with leading mmx_ prefix. (mmx_clrdi, pavgusb): Remove. (ldmxcsr, stmxcsr, sfence, sfence_insn): Move to sse.md; rename with leading sse_ prefix. * config/i386/sse.md: Receive them. * config/i386/mmx.md: New file. (MMXMODE12, MMXMODE24, mmxvecsize): New. (subrv2sf3): Turn into expander for normal subtraction. (mmx_addv2sf3, mmx_mulv2sf3, mmx_smaxv2sf3, mmx_sminv2sf3, mmx_eqv2sf3, mmx_mulv4hi3, mmx_smulv4hi3_highpart, mmx_umulv4hi3_highpart, mmx_pmaddwd, mmx_pmulhrwv4hi3, sse2_umulsidi3, mmx_umaxv8qi3, mmx_smaxv4hi3, mmx_uminv8qi3, mmx_sminv4hi3): Mark commutative; use ix86_binary_operator_ok. (mmx_add<MMXMODEI>3, mmx_ssadd<MMXMODE12>3, mmx_usadd<MMXMODE12>3, mmx_sub<MMXMODEI>3, mmx_sssub<MMXMODE12>3, mmx_ussub<MMXMODE12>3 mmx_ashr<MMXMODE24>3, mmx_lshr<MMXMODE23>3, mmx_ashl<MMXMODE24>3 mmx_eq<MMXMODEI>3, mmx_gt<MMXMODEI>3, mmx_and<MMXMODEI>3, mmx_nand<MMXMODEI>3, mmx_ior<MMXMODEI>3, mmx_xor<MMXMODEI>3): Macroize from existing patterns; use ix86_binary_operator_ok. (mmx_packsswb, mmx_packssdw, mmx_packuswb): Add memory alternative. (mmx_punpckhbw, mmx_punpcklbw, mmx_punpckhwd, mmx_punpcklwd, mmx_punpckhdq, mmx_punpckhdq, mmx_punpckldq): Likewise. Model with vec_select+vec_concat. (mmx_pshufw, mmx_pshufw_1): Likewise. (mmx_uavgv8qi3): Merge pavgusb. Model correcty. (mmx_uavgv4hi3): Model correctly. * config/i386/mmintrin.h (_mm_and_si64, _mm_andnot_si64, _mm_or_si64, _mm_xor_si64): Remove casts. From-SVN: r93107
2005-01-09 12:23:25 +01:00
return __builtin_ia32_por (__m1, __m2);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_por (__m64 __m1, __m64 __m2)
{
return _mm_or_si64 (__m1, __m2);
}
/* Bit-wise exclusive OR the 64-bit values in M1 and M2. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_xor_si64 (__m64 __m1, __m64 __m2)
{
i386.c (bdesc_2arg): Update names for mmx_ prefixes. * config/i386/i386.c (bdesc_2arg): Update names for mmx_ prefixes. (ix86_expand_builtin): Likewise. Frob MASKMOVQ wrt the input mem just like MASKMOVDQU. Return plain zero for MMX_ZERO. * config/i386/i386.md (MMXMODEI, mov<MMXMODEI>, mov<MMXMODEI>_internal_rex64, mov<MMXMODEI>_internal, movv2sf, movv2sf_internal_rex64, movv2sf_internal, MMXMODE, movmisalign<MMXMODE>, mmx_pmovmskb, mmx_maskmovq, mmx_maskmovq_rex, sse_movntdi, addv8qi3, addv4hi3, addv2si3, mmx_adddi3, ssaddv8qi3, ssaddv4hi3, usaddv8qi3, usaddv4hi3, subv8qi3, subv4hi3, subv2si3, mmx_subdi3, sssubv8qi3, sssubv4hi3, ussubv8qi3, ussubv4hi3, mulv4hi3, smulv4hi3_highpart, umulv4hi3_highpart, mmx_pmaddwd, sse2_umulsidi3, mmx_iordi3, mmx_xordi3, mmx_anddi3, mmx_nanddi3, mmx_uavgv8qi3, mmx_uavgv4hi3, mmx_psadbw, mmx_pinsrw, mmx_pinsrw, mmx_pextrw, mmx_pshufw, eqv8qi3, eqv4hi3, eqv2si3, gtv8qi3, gtv4hi3, gtv2si3, umaxv8qi3, smaxv4hi3, uminv8qi3, sminv4hi3, ashrv4hi3, ashrv2si3, lshrv4hi3, lshrv2si3, mmx_lshrdi3, ashlv4hi3, ashlv2si3, mmx_ashldi3, mmx_packsswb, mmx_packssdw, mmx_packuswb, mmx_punpckhbw, mmx_punpckhwd, mmx_punpckhdq, mmx_punpcklbw, mmx_punpcklwd, mmx_punpckldq, emms, addv2sf3, subv2sf3, subrv2sf3, gtv2sf3, gev2sf3, eqv2sf3, pfmaxv2sf3, pfminv2sf3, mulv2sf3, femms, pf2id, pf2iw, pfacc, pfnacc, pfpnacc, pi2fw, floatv2si2, pfrcpv2sf2, pfrcpit1v2sf3, pfrcpit2v2sf3, pfrsqrtv2sf2, pfrsqit1v2sf3, pmulhrwv4hi3, pswapdv2si2, pswapdv2sf2): Move to mmx.md; rename as necessary with leading mmx_ prefix. (mmx_clrdi, pavgusb): Remove. (ldmxcsr, stmxcsr, sfence, sfence_insn): Move to sse.md; rename with leading sse_ prefix. * config/i386/sse.md: Receive them. * config/i386/mmx.md: New file. (MMXMODE12, MMXMODE24, mmxvecsize): New. (subrv2sf3): Turn into expander for normal subtraction. (mmx_addv2sf3, mmx_mulv2sf3, mmx_smaxv2sf3, mmx_sminv2sf3, mmx_eqv2sf3, mmx_mulv4hi3, mmx_smulv4hi3_highpart, mmx_umulv4hi3_highpart, mmx_pmaddwd, mmx_pmulhrwv4hi3, sse2_umulsidi3, mmx_umaxv8qi3, mmx_smaxv4hi3, mmx_uminv8qi3, mmx_sminv4hi3): Mark commutative; use ix86_binary_operator_ok. (mmx_add<MMXMODEI>3, mmx_ssadd<MMXMODE12>3, mmx_usadd<MMXMODE12>3, mmx_sub<MMXMODEI>3, mmx_sssub<MMXMODE12>3, mmx_ussub<MMXMODE12>3 mmx_ashr<MMXMODE24>3, mmx_lshr<MMXMODE23>3, mmx_ashl<MMXMODE24>3 mmx_eq<MMXMODEI>3, mmx_gt<MMXMODEI>3, mmx_and<MMXMODEI>3, mmx_nand<MMXMODEI>3, mmx_ior<MMXMODEI>3, mmx_xor<MMXMODEI>3): Macroize from existing patterns; use ix86_binary_operator_ok. (mmx_packsswb, mmx_packssdw, mmx_packuswb): Add memory alternative. (mmx_punpckhbw, mmx_punpcklbw, mmx_punpckhwd, mmx_punpcklwd, mmx_punpckhdq, mmx_punpckhdq, mmx_punpckldq): Likewise. Model with vec_select+vec_concat. (mmx_pshufw, mmx_pshufw_1): Likewise. (mmx_uavgv8qi3): Merge pavgusb. Model correcty. (mmx_uavgv4hi3): Model correctly. * config/i386/mmintrin.h (_mm_and_si64, _mm_andnot_si64, _mm_or_si64, _mm_xor_si64): Remove casts. From-SVN: r93107
2005-01-09 12:23:25 +01:00
return __builtin_ia32_pxor (__m1, __m2);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_pxor (__m64 __m1, __m64 __m2)
{
return _mm_xor_si64 (__m1, __m2);
}
/* Compare eight 8-bit values. The result of the comparison is 0xFF if the
test is true and zero if false. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_cmpeq_pi8 (__m64 __m1, __m64 __m2)
{
return (__m64) __builtin_ia32_pcmpeqb ((__v8qi)__m1, (__v8qi)__m2);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_pcmpeqb (__m64 __m1, __m64 __m2)
{
return _mm_cmpeq_pi8 (__m1, __m2);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_cmpgt_pi8 (__m64 __m1, __m64 __m2)
{
return (__m64) __builtin_ia32_pcmpgtb ((__v8qi)__m1, (__v8qi)__m2);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_pcmpgtb (__m64 __m1, __m64 __m2)
{
return _mm_cmpgt_pi8 (__m1, __m2);
}
/* Compare four 16-bit values. The result of the comparison is 0xFFFF if
the test is true and zero if false. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_cmpeq_pi16 (__m64 __m1, __m64 __m2)
{
return (__m64) __builtin_ia32_pcmpeqw ((__v4hi)__m1, (__v4hi)__m2);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_pcmpeqw (__m64 __m1, __m64 __m2)
{
return _mm_cmpeq_pi16 (__m1, __m2);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_cmpgt_pi16 (__m64 __m1, __m64 __m2)
{
return (__m64) __builtin_ia32_pcmpgtw ((__v4hi)__m1, (__v4hi)__m2);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_pcmpgtw (__m64 __m1, __m64 __m2)
{
return _mm_cmpgt_pi16 (__m1, __m2);
}
/* Compare two 32-bit values. The result of the comparison is 0xFFFFFFFF if
the test is true and zero if false. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_cmpeq_pi32 (__m64 __m1, __m64 __m2)
{
return (__m64) __builtin_ia32_pcmpeqd ((__v2si)__m1, (__v2si)__m2);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_pcmpeqd (__m64 __m1, __m64 __m2)
{
return _mm_cmpeq_pi32 (__m1, __m2);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_cmpgt_pi32 (__m64 __m1, __m64 __m2)
{
return (__m64) __builtin_ia32_pcmpgtd ((__v2si)__m1, (__v2si)__m2);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_m_pcmpgtd (__m64 __m1, __m64 __m2)
{
return _mm_cmpgt_pi32 (__m1, __m2);
}
/* Creates a 64-bit zero. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_setzero_si64 (void)
{
re PR rtl-optimization/13366 (ICE using MMX/SSE builtins with -O) PR target/13366 * config/i386/i386.h (enum ix86_builtins): Move ... * config/i386/i386.c: ... here. (IX86_BUILTIN_MOVDDUP, IX86_BUILTIN_MMX_ZERO, IX86_BUILTIN_PEXTRW, IX86_BUILTIN_PINSRW, IX86_BUILTIN_LOADAPS, IX86_BUILTIN_LOADSS, IX86_BUILTIN_STORESS, IX86_BUILTIN_SSE_ZERO, IX86_BUILTIN_PEXTRW128, IX86_BUILTIN_PINSRW128, IX86_BUILTIN_LOADAPD, IX86_BUILTIN_LOADSD, IX86_BUILTIN_STOREAPD, IX86_BUILTIN_STORESD, IX86_BUILTIN_STOREHPD, IX86_BUILTIN_STORELPD, IX86_BUILTIN_SETPD1, IX86_BUILTIN_SETPD, IX86_BUILTIN_CLRPD, IX86_BUILTIN_LOADPD1, IX86_BUILTIN_LOADRPD, IX86_BUILTIN_STOREPD1, IX86_BUILTIN_STORERPD, IX86_BUILTIN_LOADDQA, IX86_BUILTIN_STOREDQA, IX86_BUILTIN_CLRTI, IX86_BUILTIN_LOADDDUP): Remove. (IX86_BUILTIN_VEC_INIT_V2SI, IX86_BUILTIN_VEC_INIT_V4HI, IX86_BUILTIN_VEC_INIT_V8QI, IX86_BUILTIN_VEC_EXT_V2DF, IX86_BUILTIN_VEC_EXT_V2DI, IX86_BUILTIN_VEC_EXT_V4SF, IX86_BUILTIN_VEC_EXT_V8HI, IX86_BUILTIN_VEC_EXT_V4HI, IX86_BUILTIN_VEC_SET_V8HI, IX86_BUILTIN_VEC_SET_V4HI): New. (ix86_init_builtins): Make static. (ix86_init_mmx_sse_builtins): Update for changed builtins. (ix86_expand_binop_builtin): Only use ix86_fixup_binary_operands if all the modes match. Otherwise, fake it. (get_element_number, ix86_expand_vec_init_builtin, ix86_expand_vec_ext_builtin, ix86_expand_vec_set_builtin): New. (ix86_expand_builtin): Make static. Update for changed builtins. (ix86_expand_vector_move_misalign): Use sse2_loadlpd with zero operand instead of sse2_loadsd. Cast sse1 fallback to V4SFmode. (ix86_expand_vector_init_duplicate): New. (ix86_expand_vector_init_low_nonzero): New. (ix86_expand_vector_init_one_var, ix86_expand_vector_init_general): Split out from ix86_expand_vector_init; handle integer modes. (ix86_expand_vector_init): Use them. (ix86_expand_vector_set, ix86_expand_vector_extract): New. * config/i386/i386-protos.h: Update. * config/i386/predicates.md (reg_or_0_operand): New. * config/i386/mmx.md (mov<MMXMODEI>_internal): Add 'r' variants. (movv2sf_internal): Likewise. And a splitter to match them all. (vec_dupv2sf, mmx_concatv2sf, vec_setv2sf, vec_extractv2sf, vec_initv2sf, vec_dupv4hi, vec_dupv2si, mmx_concatv2si, vec_setv2si, vec_extractv2si, vec_initv2si, vec_setv4hi, vec_extractv4hi, vec_initv4hi, vec_setv8qi, vec_extractv8qi, vec_initv8qi): New. (mmx_pinsrw): Fix operand ordering. * config/i386/sse.md (movv4sf splitter): Use direct pattern, rather than sse_loadss expander. (movv2df splitter): Similarly. (sse_loadss, sse_loadlss): Remove. (vec_dupv4sf, sse_concatv2sf, sse_concatv4sf, vec_extractv4sf_0): New. (vec_setv4sf, vec_setv2df): Use ix86_expand_vector_set. (vec_extractv4sf, vec_extractv2df): Use ix86_expand_vector_extract. (sse3_movddup): Rename with '*'. (sse3_movddup splitter): Use gen_rtx_REG instead of gen_lowpart. (sse2_loadsd): Remove. (vec_dupv2df_sse3): Rename from sse3_loadddup. (vec_dupv2df, vec_concatv2df_sse3, vec_concatv2df): New. (sse2_pinsrw): Fix argument ordering. (sse2_loadld, sse2_loadq): Add sse1 alternatives. (sse2_stored): Remove 'r' destination. (vec_dupv4si, vec_dupv2di, sse2_concatv2si, sse1_concatv2si, vec_concatv4si_1, vec_concatv2di, vec_setv2di, vec_extractv2di, vec_initv2di, vec_setv4si, vec_extractv4si, vec_initv4si, vec_setv8hi, vec_extractv8hi, vec_initv8hi, vec_setv16qi, vec_extractv16qi, vec_initv16qi): New. * config/i386/emmintrin.h (__m128i, __m128d): Use typedef, not define. (_mm_set_sd, _mm_set1_pd, _mm_setzero_pd, _mm_set_epi64x, _mm_set_epi32, _mm_set_epi16, _mm_set_epi8, _mm_setzero_si128): Use constructor form. (_mm_load_pd, _mm_store_pd): Use plain dereference. (_mm_load_si128, _mm_store_si128): Likewise. (_mm_load1_pd): Use _mm_set1_pd. (_mm_load_sd): Use _mm_set_sd. (_mm_store_sd, _mm_storeh_pd): Use __builtin_ia32_vec_ext_v2df. (_mm_store1_pd, _mm_storer_pd): Use _mm_store_pd. (_mm_set_epi64): Use _mm_set_epi64x. (_mm_set1_epi64x, _mm_set1_epi64, _mm_set1_epi32, _mm_set_epi16, _mm_set1_epi8, _mm_setr_epi64, _mm_setr_epi32, _mm_setr_epi16, _mm_setr_epi8): Use _mm_set_foo form. (_mm_loadl_epi64, _mm_movpi64_epi64, _mm_move_epi64): Use _mm_set_epi64. (_mm_storel_epi64, _mm_movepi64_pi64): Use __builtin_ia32_vec_ext_v2di. (_mm_extract_epi16): Use __builtin_ia32_vec_ext_v8hi. (_mm_insert_epi16): Use __builtin_ia32_vec_set_v8hi. * config/i386/mmintrin.h (_mm_setzero_si64): Use plain cast. (_mm_set_pi32): Use __builtin_ia32_vec_init_v2si. (_mm_set_pi16): Use __builtin_ia32_vec_init_v4hi. (_mm_set_pi8): Use __builtin_ia32_vec_init_v8qi. (_mm_set1_pi16, _mm_set1_pi8): Use _mm_set_piN variant. * config/i386/pmmintrin.h (_mm_loaddup_pd): Use _mm_load1_pd. (_mm_movedup_pd): Use _mm_shuffle_pd. * config/i386/xmmintrin.h (_mm_setzero_ps, _mm_set_ss, _mm_set1_ps, _mm_set_ps, _mm_setr_ps): Use constructor form. (_mm_cvtpi16_ps, _mm_cvtpu16_ps, _mm_cvtpi8_ps, _mm_cvtpu8_ps, _mm_cvtps_pi8, _mm_cvtpi32x2_ps): Avoid __builtin_ia32_mmx_zero; Use _mm_setzero_ps. (_mm_load_ss, _mm_load1_ps): Use _mm_set* form. (_mm_load_ps, _mm_loadr_ps): Use raw dereference. (_mm_store_ss): Use __builtin_ia32_vec_ext_v4sf. (_mm_store_ps): Use raw dereference. (_mm_store1_ps): Use _mm_storeu_ps. (_mm_storer_ps): Use _mm_store_ps. (_mm_extract_pi16): Use __builtin_ia32_vec_ext_v4hi. (_mm_insert_pi16): Use __builtin_ia32_vec_set_v4hi. From-SVN: r93199
2005-01-11 22:33:14 +01:00
return (__m64)0LL;
}
/* Creates a vector of two 32-bit values; I0 is least significant. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_set_pi32 (int __i1, int __i0)
{
re PR rtl-optimization/13366 (ICE using MMX/SSE builtins with -O) PR target/13366 * config/i386/i386.h (enum ix86_builtins): Move ... * config/i386/i386.c: ... here. (IX86_BUILTIN_MOVDDUP, IX86_BUILTIN_MMX_ZERO, IX86_BUILTIN_PEXTRW, IX86_BUILTIN_PINSRW, IX86_BUILTIN_LOADAPS, IX86_BUILTIN_LOADSS, IX86_BUILTIN_STORESS, IX86_BUILTIN_SSE_ZERO, IX86_BUILTIN_PEXTRW128, IX86_BUILTIN_PINSRW128, IX86_BUILTIN_LOADAPD, IX86_BUILTIN_LOADSD, IX86_BUILTIN_STOREAPD, IX86_BUILTIN_STORESD, IX86_BUILTIN_STOREHPD, IX86_BUILTIN_STORELPD, IX86_BUILTIN_SETPD1, IX86_BUILTIN_SETPD, IX86_BUILTIN_CLRPD, IX86_BUILTIN_LOADPD1, IX86_BUILTIN_LOADRPD, IX86_BUILTIN_STOREPD1, IX86_BUILTIN_STORERPD, IX86_BUILTIN_LOADDQA, IX86_BUILTIN_STOREDQA, IX86_BUILTIN_CLRTI, IX86_BUILTIN_LOADDDUP): Remove. (IX86_BUILTIN_VEC_INIT_V2SI, IX86_BUILTIN_VEC_INIT_V4HI, IX86_BUILTIN_VEC_INIT_V8QI, IX86_BUILTIN_VEC_EXT_V2DF, IX86_BUILTIN_VEC_EXT_V2DI, IX86_BUILTIN_VEC_EXT_V4SF, IX86_BUILTIN_VEC_EXT_V8HI, IX86_BUILTIN_VEC_EXT_V4HI, IX86_BUILTIN_VEC_SET_V8HI, IX86_BUILTIN_VEC_SET_V4HI): New. (ix86_init_builtins): Make static. (ix86_init_mmx_sse_builtins): Update for changed builtins. (ix86_expand_binop_builtin): Only use ix86_fixup_binary_operands if all the modes match. Otherwise, fake it. (get_element_number, ix86_expand_vec_init_builtin, ix86_expand_vec_ext_builtin, ix86_expand_vec_set_builtin): New. (ix86_expand_builtin): Make static. Update for changed builtins. (ix86_expand_vector_move_misalign): Use sse2_loadlpd with zero operand instead of sse2_loadsd. Cast sse1 fallback to V4SFmode. (ix86_expand_vector_init_duplicate): New. (ix86_expand_vector_init_low_nonzero): New. (ix86_expand_vector_init_one_var, ix86_expand_vector_init_general): Split out from ix86_expand_vector_init; handle integer modes. (ix86_expand_vector_init): Use them. (ix86_expand_vector_set, ix86_expand_vector_extract): New. * config/i386/i386-protos.h: Update. * config/i386/predicates.md (reg_or_0_operand): New. * config/i386/mmx.md (mov<MMXMODEI>_internal): Add 'r' variants. (movv2sf_internal): Likewise. And a splitter to match them all. (vec_dupv2sf, mmx_concatv2sf, vec_setv2sf, vec_extractv2sf, vec_initv2sf, vec_dupv4hi, vec_dupv2si, mmx_concatv2si, vec_setv2si, vec_extractv2si, vec_initv2si, vec_setv4hi, vec_extractv4hi, vec_initv4hi, vec_setv8qi, vec_extractv8qi, vec_initv8qi): New. (mmx_pinsrw): Fix operand ordering. * config/i386/sse.md (movv4sf splitter): Use direct pattern, rather than sse_loadss expander. (movv2df splitter): Similarly. (sse_loadss, sse_loadlss): Remove. (vec_dupv4sf, sse_concatv2sf, sse_concatv4sf, vec_extractv4sf_0): New. (vec_setv4sf, vec_setv2df): Use ix86_expand_vector_set. (vec_extractv4sf, vec_extractv2df): Use ix86_expand_vector_extract. (sse3_movddup): Rename with '*'. (sse3_movddup splitter): Use gen_rtx_REG instead of gen_lowpart. (sse2_loadsd): Remove. (vec_dupv2df_sse3): Rename from sse3_loadddup. (vec_dupv2df, vec_concatv2df_sse3, vec_concatv2df): New. (sse2_pinsrw): Fix argument ordering. (sse2_loadld, sse2_loadq): Add sse1 alternatives. (sse2_stored): Remove 'r' destination. (vec_dupv4si, vec_dupv2di, sse2_concatv2si, sse1_concatv2si, vec_concatv4si_1, vec_concatv2di, vec_setv2di, vec_extractv2di, vec_initv2di, vec_setv4si, vec_extractv4si, vec_initv4si, vec_setv8hi, vec_extractv8hi, vec_initv8hi, vec_setv16qi, vec_extractv16qi, vec_initv16qi): New. * config/i386/emmintrin.h (__m128i, __m128d): Use typedef, not define. (_mm_set_sd, _mm_set1_pd, _mm_setzero_pd, _mm_set_epi64x, _mm_set_epi32, _mm_set_epi16, _mm_set_epi8, _mm_setzero_si128): Use constructor form. (_mm_load_pd, _mm_store_pd): Use plain dereference. (_mm_load_si128, _mm_store_si128): Likewise. (_mm_load1_pd): Use _mm_set1_pd. (_mm_load_sd): Use _mm_set_sd. (_mm_store_sd, _mm_storeh_pd): Use __builtin_ia32_vec_ext_v2df. (_mm_store1_pd, _mm_storer_pd): Use _mm_store_pd. (_mm_set_epi64): Use _mm_set_epi64x. (_mm_set1_epi64x, _mm_set1_epi64, _mm_set1_epi32, _mm_set_epi16, _mm_set1_epi8, _mm_setr_epi64, _mm_setr_epi32, _mm_setr_epi16, _mm_setr_epi8): Use _mm_set_foo form. (_mm_loadl_epi64, _mm_movpi64_epi64, _mm_move_epi64): Use _mm_set_epi64. (_mm_storel_epi64, _mm_movepi64_pi64): Use __builtin_ia32_vec_ext_v2di. (_mm_extract_epi16): Use __builtin_ia32_vec_ext_v8hi. (_mm_insert_epi16): Use __builtin_ia32_vec_set_v8hi. * config/i386/mmintrin.h (_mm_setzero_si64): Use plain cast. (_mm_set_pi32): Use __builtin_ia32_vec_init_v2si. (_mm_set_pi16): Use __builtin_ia32_vec_init_v4hi. (_mm_set_pi8): Use __builtin_ia32_vec_init_v8qi. (_mm_set1_pi16, _mm_set1_pi8): Use _mm_set_piN variant. * config/i386/pmmintrin.h (_mm_loaddup_pd): Use _mm_load1_pd. (_mm_movedup_pd): Use _mm_shuffle_pd. * config/i386/xmmintrin.h (_mm_setzero_ps, _mm_set_ss, _mm_set1_ps, _mm_set_ps, _mm_setr_ps): Use constructor form. (_mm_cvtpi16_ps, _mm_cvtpu16_ps, _mm_cvtpi8_ps, _mm_cvtpu8_ps, _mm_cvtps_pi8, _mm_cvtpi32x2_ps): Avoid __builtin_ia32_mmx_zero; Use _mm_setzero_ps. (_mm_load_ss, _mm_load1_ps): Use _mm_set* form. (_mm_load_ps, _mm_loadr_ps): Use raw dereference. (_mm_store_ss): Use __builtin_ia32_vec_ext_v4sf. (_mm_store_ps): Use raw dereference. (_mm_store1_ps): Use _mm_storeu_ps. (_mm_storer_ps): Use _mm_store_ps. (_mm_extract_pi16): Use __builtin_ia32_vec_ext_v4hi. (_mm_insert_pi16): Use __builtin_ia32_vec_set_v4hi. From-SVN: r93199
2005-01-11 22:33:14 +01:00
return (__m64) __builtin_ia32_vec_init_v2si (__i0, __i1);
}
/* Creates a vector of four 16-bit values; W0 is least significant. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_set_pi16 (short __w3, short __w2, short __w1, short __w0)
{
re PR rtl-optimization/13366 (ICE using MMX/SSE builtins with -O) PR target/13366 * config/i386/i386.h (enum ix86_builtins): Move ... * config/i386/i386.c: ... here. (IX86_BUILTIN_MOVDDUP, IX86_BUILTIN_MMX_ZERO, IX86_BUILTIN_PEXTRW, IX86_BUILTIN_PINSRW, IX86_BUILTIN_LOADAPS, IX86_BUILTIN_LOADSS, IX86_BUILTIN_STORESS, IX86_BUILTIN_SSE_ZERO, IX86_BUILTIN_PEXTRW128, IX86_BUILTIN_PINSRW128, IX86_BUILTIN_LOADAPD, IX86_BUILTIN_LOADSD, IX86_BUILTIN_STOREAPD, IX86_BUILTIN_STORESD, IX86_BUILTIN_STOREHPD, IX86_BUILTIN_STORELPD, IX86_BUILTIN_SETPD1, IX86_BUILTIN_SETPD, IX86_BUILTIN_CLRPD, IX86_BUILTIN_LOADPD1, IX86_BUILTIN_LOADRPD, IX86_BUILTIN_STOREPD1, IX86_BUILTIN_STORERPD, IX86_BUILTIN_LOADDQA, IX86_BUILTIN_STOREDQA, IX86_BUILTIN_CLRTI, IX86_BUILTIN_LOADDDUP): Remove. (IX86_BUILTIN_VEC_INIT_V2SI, IX86_BUILTIN_VEC_INIT_V4HI, IX86_BUILTIN_VEC_INIT_V8QI, IX86_BUILTIN_VEC_EXT_V2DF, IX86_BUILTIN_VEC_EXT_V2DI, IX86_BUILTIN_VEC_EXT_V4SF, IX86_BUILTIN_VEC_EXT_V8HI, IX86_BUILTIN_VEC_EXT_V4HI, IX86_BUILTIN_VEC_SET_V8HI, IX86_BUILTIN_VEC_SET_V4HI): New. (ix86_init_builtins): Make static. (ix86_init_mmx_sse_builtins): Update for changed builtins. (ix86_expand_binop_builtin): Only use ix86_fixup_binary_operands if all the modes match. Otherwise, fake it. (get_element_number, ix86_expand_vec_init_builtin, ix86_expand_vec_ext_builtin, ix86_expand_vec_set_builtin): New. (ix86_expand_builtin): Make static. Update for changed builtins. (ix86_expand_vector_move_misalign): Use sse2_loadlpd with zero operand instead of sse2_loadsd. Cast sse1 fallback to V4SFmode. (ix86_expand_vector_init_duplicate): New. (ix86_expand_vector_init_low_nonzero): New. (ix86_expand_vector_init_one_var, ix86_expand_vector_init_general): Split out from ix86_expand_vector_init; handle integer modes. (ix86_expand_vector_init): Use them. (ix86_expand_vector_set, ix86_expand_vector_extract): New. * config/i386/i386-protos.h: Update. * config/i386/predicates.md (reg_or_0_operand): New. * config/i386/mmx.md (mov<MMXMODEI>_internal): Add 'r' variants. (movv2sf_internal): Likewise. And a splitter to match them all. (vec_dupv2sf, mmx_concatv2sf, vec_setv2sf, vec_extractv2sf, vec_initv2sf, vec_dupv4hi, vec_dupv2si, mmx_concatv2si, vec_setv2si, vec_extractv2si, vec_initv2si, vec_setv4hi, vec_extractv4hi, vec_initv4hi, vec_setv8qi, vec_extractv8qi, vec_initv8qi): New. (mmx_pinsrw): Fix operand ordering. * config/i386/sse.md (movv4sf splitter): Use direct pattern, rather than sse_loadss expander. (movv2df splitter): Similarly. (sse_loadss, sse_loadlss): Remove. (vec_dupv4sf, sse_concatv2sf, sse_concatv4sf, vec_extractv4sf_0): New. (vec_setv4sf, vec_setv2df): Use ix86_expand_vector_set. (vec_extractv4sf, vec_extractv2df): Use ix86_expand_vector_extract. (sse3_movddup): Rename with '*'. (sse3_movddup splitter): Use gen_rtx_REG instead of gen_lowpart. (sse2_loadsd): Remove. (vec_dupv2df_sse3): Rename from sse3_loadddup. (vec_dupv2df, vec_concatv2df_sse3, vec_concatv2df): New. (sse2_pinsrw): Fix argument ordering. (sse2_loadld, sse2_loadq): Add sse1 alternatives. (sse2_stored): Remove 'r' destination. (vec_dupv4si, vec_dupv2di, sse2_concatv2si, sse1_concatv2si, vec_concatv4si_1, vec_concatv2di, vec_setv2di, vec_extractv2di, vec_initv2di, vec_setv4si, vec_extractv4si, vec_initv4si, vec_setv8hi, vec_extractv8hi, vec_initv8hi, vec_setv16qi, vec_extractv16qi, vec_initv16qi): New. * config/i386/emmintrin.h (__m128i, __m128d): Use typedef, not define. (_mm_set_sd, _mm_set1_pd, _mm_setzero_pd, _mm_set_epi64x, _mm_set_epi32, _mm_set_epi16, _mm_set_epi8, _mm_setzero_si128): Use constructor form. (_mm_load_pd, _mm_store_pd): Use plain dereference. (_mm_load_si128, _mm_store_si128): Likewise. (_mm_load1_pd): Use _mm_set1_pd. (_mm_load_sd): Use _mm_set_sd. (_mm_store_sd, _mm_storeh_pd): Use __builtin_ia32_vec_ext_v2df. (_mm_store1_pd, _mm_storer_pd): Use _mm_store_pd. (_mm_set_epi64): Use _mm_set_epi64x. (_mm_set1_epi64x, _mm_set1_epi64, _mm_set1_epi32, _mm_set_epi16, _mm_set1_epi8, _mm_setr_epi64, _mm_setr_epi32, _mm_setr_epi16, _mm_setr_epi8): Use _mm_set_foo form. (_mm_loadl_epi64, _mm_movpi64_epi64, _mm_move_epi64): Use _mm_set_epi64. (_mm_storel_epi64, _mm_movepi64_pi64): Use __builtin_ia32_vec_ext_v2di. (_mm_extract_epi16): Use __builtin_ia32_vec_ext_v8hi. (_mm_insert_epi16): Use __builtin_ia32_vec_set_v8hi. * config/i386/mmintrin.h (_mm_setzero_si64): Use plain cast. (_mm_set_pi32): Use __builtin_ia32_vec_init_v2si. (_mm_set_pi16): Use __builtin_ia32_vec_init_v4hi. (_mm_set_pi8): Use __builtin_ia32_vec_init_v8qi. (_mm_set1_pi16, _mm_set1_pi8): Use _mm_set_piN variant. * config/i386/pmmintrin.h (_mm_loaddup_pd): Use _mm_load1_pd. (_mm_movedup_pd): Use _mm_shuffle_pd. * config/i386/xmmintrin.h (_mm_setzero_ps, _mm_set_ss, _mm_set1_ps, _mm_set_ps, _mm_setr_ps): Use constructor form. (_mm_cvtpi16_ps, _mm_cvtpu16_ps, _mm_cvtpi8_ps, _mm_cvtpu8_ps, _mm_cvtps_pi8, _mm_cvtpi32x2_ps): Avoid __builtin_ia32_mmx_zero; Use _mm_setzero_ps. (_mm_load_ss, _mm_load1_ps): Use _mm_set* form. (_mm_load_ps, _mm_loadr_ps): Use raw dereference. (_mm_store_ss): Use __builtin_ia32_vec_ext_v4sf. (_mm_store_ps): Use raw dereference. (_mm_store1_ps): Use _mm_storeu_ps. (_mm_storer_ps): Use _mm_store_ps. (_mm_extract_pi16): Use __builtin_ia32_vec_ext_v4hi. (_mm_insert_pi16): Use __builtin_ia32_vec_set_v4hi. From-SVN: r93199
2005-01-11 22:33:14 +01:00
return (__m64) __builtin_ia32_vec_init_v4hi (__w0, __w1, __w2, __w3);
}
/* Creates a vector of eight 8-bit values; B0 is least significant. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_set_pi8 (char __b7, char __b6, char __b5, char __b4,
char __b3, char __b2, char __b1, char __b0)
{
re PR rtl-optimization/13366 (ICE using MMX/SSE builtins with -O) PR target/13366 * config/i386/i386.h (enum ix86_builtins): Move ... * config/i386/i386.c: ... here. (IX86_BUILTIN_MOVDDUP, IX86_BUILTIN_MMX_ZERO, IX86_BUILTIN_PEXTRW, IX86_BUILTIN_PINSRW, IX86_BUILTIN_LOADAPS, IX86_BUILTIN_LOADSS, IX86_BUILTIN_STORESS, IX86_BUILTIN_SSE_ZERO, IX86_BUILTIN_PEXTRW128, IX86_BUILTIN_PINSRW128, IX86_BUILTIN_LOADAPD, IX86_BUILTIN_LOADSD, IX86_BUILTIN_STOREAPD, IX86_BUILTIN_STORESD, IX86_BUILTIN_STOREHPD, IX86_BUILTIN_STORELPD, IX86_BUILTIN_SETPD1, IX86_BUILTIN_SETPD, IX86_BUILTIN_CLRPD, IX86_BUILTIN_LOADPD1, IX86_BUILTIN_LOADRPD, IX86_BUILTIN_STOREPD1, IX86_BUILTIN_STORERPD, IX86_BUILTIN_LOADDQA, IX86_BUILTIN_STOREDQA, IX86_BUILTIN_CLRTI, IX86_BUILTIN_LOADDDUP): Remove. (IX86_BUILTIN_VEC_INIT_V2SI, IX86_BUILTIN_VEC_INIT_V4HI, IX86_BUILTIN_VEC_INIT_V8QI, IX86_BUILTIN_VEC_EXT_V2DF, IX86_BUILTIN_VEC_EXT_V2DI, IX86_BUILTIN_VEC_EXT_V4SF, IX86_BUILTIN_VEC_EXT_V8HI, IX86_BUILTIN_VEC_EXT_V4HI, IX86_BUILTIN_VEC_SET_V8HI, IX86_BUILTIN_VEC_SET_V4HI): New. (ix86_init_builtins): Make static. (ix86_init_mmx_sse_builtins): Update for changed builtins. (ix86_expand_binop_builtin): Only use ix86_fixup_binary_operands if all the modes match. Otherwise, fake it. (get_element_number, ix86_expand_vec_init_builtin, ix86_expand_vec_ext_builtin, ix86_expand_vec_set_builtin): New. (ix86_expand_builtin): Make static. Update for changed builtins. (ix86_expand_vector_move_misalign): Use sse2_loadlpd with zero operand instead of sse2_loadsd. Cast sse1 fallback to V4SFmode. (ix86_expand_vector_init_duplicate): New. (ix86_expand_vector_init_low_nonzero): New. (ix86_expand_vector_init_one_var, ix86_expand_vector_init_general): Split out from ix86_expand_vector_init; handle integer modes. (ix86_expand_vector_init): Use them. (ix86_expand_vector_set, ix86_expand_vector_extract): New. * config/i386/i386-protos.h: Update. * config/i386/predicates.md (reg_or_0_operand): New. * config/i386/mmx.md (mov<MMXMODEI>_internal): Add 'r' variants. (movv2sf_internal): Likewise. And a splitter to match them all. (vec_dupv2sf, mmx_concatv2sf, vec_setv2sf, vec_extractv2sf, vec_initv2sf, vec_dupv4hi, vec_dupv2si, mmx_concatv2si, vec_setv2si, vec_extractv2si, vec_initv2si, vec_setv4hi, vec_extractv4hi, vec_initv4hi, vec_setv8qi, vec_extractv8qi, vec_initv8qi): New. (mmx_pinsrw): Fix operand ordering. * config/i386/sse.md (movv4sf splitter): Use direct pattern, rather than sse_loadss expander. (movv2df splitter): Similarly. (sse_loadss, sse_loadlss): Remove. (vec_dupv4sf, sse_concatv2sf, sse_concatv4sf, vec_extractv4sf_0): New. (vec_setv4sf, vec_setv2df): Use ix86_expand_vector_set. (vec_extractv4sf, vec_extractv2df): Use ix86_expand_vector_extract. (sse3_movddup): Rename with '*'. (sse3_movddup splitter): Use gen_rtx_REG instead of gen_lowpart. (sse2_loadsd): Remove. (vec_dupv2df_sse3): Rename from sse3_loadddup. (vec_dupv2df, vec_concatv2df_sse3, vec_concatv2df): New. (sse2_pinsrw): Fix argument ordering. (sse2_loadld, sse2_loadq): Add sse1 alternatives. (sse2_stored): Remove 'r' destination. (vec_dupv4si, vec_dupv2di, sse2_concatv2si, sse1_concatv2si, vec_concatv4si_1, vec_concatv2di, vec_setv2di, vec_extractv2di, vec_initv2di, vec_setv4si, vec_extractv4si, vec_initv4si, vec_setv8hi, vec_extractv8hi, vec_initv8hi, vec_setv16qi, vec_extractv16qi, vec_initv16qi): New. * config/i386/emmintrin.h (__m128i, __m128d): Use typedef, not define. (_mm_set_sd, _mm_set1_pd, _mm_setzero_pd, _mm_set_epi64x, _mm_set_epi32, _mm_set_epi16, _mm_set_epi8, _mm_setzero_si128): Use constructor form. (_mm_load_pd, _mm_store_pd): Use plain dereference. (_mm_load_si128, _mm_store_si128): Likewise. (_mm_load1_pd): Use _mm_set1_pd. (_mm_load_sd): Use _mm_set_sd. (_mm_store_sd, _mm_storeh_pd): Use __builtin_ia32_vec_ext_v2df. (_mm_store1_pd, _mm_storer_pd): Use _mm_store_pd. (_mm_set_epi64): Use _mm_set_epi64x. (_mm_set1_epi64x, _mm_set1_epi64, _mm_set1_epi32, _mm_set_epi16, _mm_set1_epi8, _mm_setr_epi64, _mm_setr_epi32, _mm_setr_epi16, _mm_setr_epi8): Use _mm_set_foo form. (_mm_loadl_epi64, _mm_movpi64_epi64, _mm_move_epi64): Use _mm_set_epi64. (_mm_storel_epi64, _mm_movepi64_pi64): Use __builtin_ia32_vec_ext_v2di. (_mm_extract_epi16): Use __builtin_ia32_vec_ext_v8hi. (_mm_insert_epi16): Use __builtin_ia32_vec_set_v8hi. * config/i386/mmintrin.h (_mm_setzero_si64): Use plain cast. (_mm_set_pi32): Use __builtin_ia32_vec_init_v2si. (_mm_set_pi16): Use __builtin_ia32_vec_init_v4hi. (_mm_set_pi8): Use __builtin_ia32_vec_init_v8qi. (_mm_set1_pi16, _mm_set1_pi8): Use _mm_set_piN variant. * config/i386/pmmintrin.h (_mm_loaddup_pd): Use _mm_load1_pd. (_mm_movedup_pd): Use _mm_shuffle_pd. * config/i386/xmmintrin.h (_mm_setzero_ps, _mm_set_ss, _mm_set1_ps, _mm_set_ps, _mm_setr_ps): Use constructor form. (_mm_cvtpi16_ps, _mm_cvtpu16_ps, _mm_cvtpi8_ps, _mm_cvtpu8_ps, _mm_cvtps_pi8, _mm_cvtpi32x2_ps): Avoid __builtin_ia32_mmx_zero; Use _mm_setzero_ps. (_mm_load_ss, _mm_load1_ps): Use _mm_set* form. (_mm_load_ps, _mm_loadr_ps): Use raw dereference. (_mm_store_ss): Use __builtin_ia32_vec_ext_v4sf. (_mm_store_ps): Use raw dereference. (_mm_store1_ps): Use _mm_storeu_ps. (_mm_storer_ps): Use _mm_store_ps. (_mm_extract_pi16): Use __builtin_ia32_vec_ext_v4hi. (_mm_insert_pi16): Use __builtin_ia32_vec_set_v4hi. From-SVN: r93199
2005-01-11 22:33:14 +01:00
return (__m64) __builtin_ia32_vec_init_v8qi (__b0, __b1, __b2, __b3,
__b4, __b5, __b6, __b7);
}
/* Similar, but with the arguments in reverse order. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_setr_pi32 (int __i0, int __i1)
{
return _mm_set_pi32 (__i1, __i0);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_setr_pi16 (short __w0, short __w1, short __w2, short __w3)
{
return _mm_set_pi16 (__w3, __w2, __w1, __w0);
}
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_setr_pi8 (char __b0, char __b1, char __b2, char __b3,
char __b4, char __b5, char __b6, char __b7)
{
return _mm_set_pi8 (__b7, __b6, __b5, __b4, __b3, __b2, __b1, __b0);
}
/* Creates a vector of two 32-bit values, both elements containing I. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_set1_pi32 (int __i)
{
return _mm_set_pi32 (__i, __i);
}
/* Creates a vector of four 16-bit values, all elements containing W. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_set1_pi16 (short __w)
{
re PR rtl-optimization/13366 (ICE using MMX/SSE builtins with -O) PR target/13366 * config/i386/i386.h (enum ix86_builtins): Move ... * config/i386/i386.c: ... here. (IX86_BUILTIN_MOVDDUP, IX86_BUILTIN_MMX_ZERO, IX86_BUILTIN_PEXTRW, IX86_BUILTIN_PINSRW, IX86_BUILTIN_LOADAPS, IX86_BUILTIN_LOADSS, IX86_BUILTIN_STORESS, IX86_BUILTIN_SSE_ZERO, IX86_BUILTIN_PEXTRW128, IX86_BUILTIN_PINSRW128, IX86_BUILTIN_LOADAPD, IX86_BUILTIN_LOADSD, IX86_BUILTIN_STOREAPD, IX86_BUILTIN_STORESD, IX86_BUILTIN_STOREHPD, IX86_BUILTIN_STORELPD, IX86_BUILTIN_SETPD1, IX86_BUILTIN_SETPD, IX86_BUILTIN_CLRPD, IX86_BUILTIN_LOADPD1, IX86_BUILTIN_LOADRPD, IX86_BUILTIN_STOREPD1, IX86_BUILTIN_STORERPD, IX86_BUILTIN_LOADDQA, IX86_BUILTIN_STOREDQA, IX86_BUILTIN_CLRTI, IX86_BUILTIN_LOADDDUP): Remove. (IX86_BUILTIN_VEC_INIT_V2SI, IX86_BUILTIN_VEC_INIT_V4HI, IX86_BUILTIN_VEC_INIT_V8QI, IX86_BUILTIN_VEC_EXT_V2DF, IX86_BUILTIN_VEC_EXT_V2DI, IX86_BUILTIN_VEC_EXT_V4SF, IX86_BUILTIN_VEC_EXT_V8HI, IX86_BUILTIN_VEC_EXT_V4HI, IX86_BUILTIN_VEC_SET_V8HI, IX86_BUILTIN_VEC_SET_V4HI): New. (ix86_init_builtins): Make static. (ix86_init_mmx_sse_builtins): Update for changed builtins. (ix86_expand_binop_builtin): Only use ix86_fixup_binary_operands if all the modes match. Otherwise, fake it. (get_element_number, ix86_expand_vec_init_builtin, ix86_expand_vec_ext_builtin, ix86_expand_vec_set_builtin): New. (ix86_expand_builtin): Make static. Update for changed builtins. (ix86_expand_vector_move_misalign): Use sse2_loadlpd with zero operand instead of sse2_loadsd. Cast sse1 fallback to V4SFmode. (ix86_expand_vector_init_duplicate): New. (ix86_expand_vector_init_low_nonzero): New. (ix86_expand_vector_init_one_var, ix86_expand_vector_init_general): Split out from ix86_expand_vector_init; handle integer modes. (ix86_expand_vector_init): Use them. (ix86_expand_vector_set, ix86_expand_vector_extract): New. * config/i386/i386-protos.h: Update. * config/i386/predicates.md (reg_or_0_operand): New. * config/i386/mmx.md (mov<MMXMODEI>_internal): Add 'r' variants. (movv2sf_internal): Likewise. And a splitter to match them all. (vec_dupv2sf, mmx_concatv2sf, vec_setv2sf, vec_extractv2sf, vec_initv2sf, vec_dupv4hi, vec_dupv2si, mmx_concatv2si, vec_setv2si, vec_extractv2si, vec_initv2si, vec_setv4hi, vec_extractv4hi, vec_initv4hi, vec_setv8qi, vec_extractv8qi, vec_initv8qi): New. (mmx_pinsrw): Fix operand ordering. * config/i386/sse.md (movv4sf splitter): Use direct pattern, rather than sse_loadss expander. (movv2df splitter): Similarly. (sse_loadss, sse_loadlss): Remove. (vec_dupv4sf, sse_concatv2sf, sse_concatv4sf, vec_extractv4sf_0): New. (vec_setv4sf, vec_setv2df): Use ix86_expand_vector_set. (vec_extractv4sf, vec_extractv2df): Use ix86_expand_vector_extract. (sse3_movddup): Rename with '*'. (sse3_movddup splitter): Use gen_rtx_REG instead of gen_lowpart. (sse2_loadsd): Remove. (vec_dupv2df_sse3): Rename from sse3_loadddup. (vec_dupv2df, vec_concatv2df_sse3, vec_concatv2df): New. (sse2_pinsrw): Fix argument ordering. (sse2_loadld, sse2_loadq): Add sse1 alternatives. (sse2_stored): Remove 'r' destination. (vec_dupv4si, vec_dupv2di, sse2_concatv2si, sse1_concatv2si, vec_concatv4si_1, vec_concatv2di, vec_setv2di, vec_extractv2di, vec_initv2di, vec_setv4si, vec_extractv4si, vec_initv4si, vec_setv8hi, vec_extractv8hi, vec_initv8hi, vec_setv16qi, vec_extractv16qi, vec_initv16qi): New. * config/i386/emmintrin.h (__m128i, __m128d): Use typedef, not define. (_mm_set_sd, _mm_set1_pd, _mm_setzero_pd, _mm_set_epi64x, _mm_set_epi32, _mm_set_epi16, _mm_set_epi8, _mm_setzero_si128): Use constructor form. (_mm_load_pd, _mm_store_pd): Use plain dereference. (_mm_load_si128, _mm_store_si128): Likewise. (_mm_load1_pd): Use _mm_set1_pd. (_mm_load_sd): Use _mm_set_sd. (_mm_store_sd, _mm_storeh_pd): Use __builtin_ia32_vec_ext_v2df. (_mm_store1_pd, _mm_storer_pd): Use _mm_store_pd. (_mm_set_epi64): Use _mm_set_epi64x. (_mm_set1_epi64x, _mm_set1_epi64, _mm_set1_epi32, _mm_set_epi16, _mm_set1_epi8, _mm_setr_epi64, _mm_setr_epi32, _mm_setr_epi16, _mm_setr_epi8): Use _mm_set_foo form. (_mm_loadl_epi64, _mm_movpi64_epi64, _mm_move_epi64): Use _mm_set_epi64. (_mm_storel_epi64, _mm_movepi64_pi64): Use __builtin_ia32_vec_ext_v2di. (_mm_extract_epi16): Use __builtin_ia32_vec_ext_v8hi. (_mm_insert_epi16): Use __builtin_ia32_vec_set_v8hi. * config/i386/mmintrin.h (_mm_setzero_si64): Use plain cast. (_mm_set_pi32): Use __builtin_ia32_vec_init_v2si. (_mm_set_pi16): Use __builtin_ia32_vec_init_v4hi. (_mm_set_pi8): Use __builtin_ia32_vec_init_v8qi. (_mm_set1_pi16, _mm_set1_pi8): Use _mm_set_piN variant. * config/i386/pmmintrin.h (_mm_loaddup_pd): Use _mm_load1_pd. (_mm_movedup_pd): Use _mm_shuffle_pd. * config/i386/xmmintrin.h (_mm_setzero_ps, _mm_set_ss, _mm_set1_ps, _mm_set_ps, _mm_setr_ps): Use constructor form. (_mm_cvtpi16_ps, _mm_cvtpu16_ps, _mm_cvtpi8_ps, _mm_cvtpu8_ps, _mm_cvtps_pi8, _mm_cvtpi32x2_ps): Avoid __builtin_ia32_mmx_zero; Use _mm_setzero_ps. (_mm_load_ss, _mm_load1_ps): Use _mm_set* form. (_mm_load_ps, _mm_loadr_ps): Use raw dereference. (_mm_store_ss): Use __builtin_ia32_vec_ext_v4sf. (_mm_store_ps): Use raw dereference. (_mm_store1_ps): Use _mm_storeu_ps. (_mm_storer_ps): Use _mm_store_ps. (_mm_extract_pi16): Use __builtin_ia32_vec_ext_v4hi. (_mm_insert_pi16): Use __builtin_ia32_vec_set_v4hi. From-SVN: r93199
2005-01-11 22:33:14 +01:00
return _mm_set_pi16 (__w, __w, __w, __w);
}
/* Creates a vector of eight 8-bit values, all elements containing B. */
extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_set1_pi8 (char __b)
{
re PR rtl-optimization/13366 (ICE using MMX/SSE builtins with -O) PR target/13366 * config/i386/i386.h (enum ix86_builtins): Move ... * config/i386/i386.c: ... here. (IX86_BUILTIN_MOVDDUP, IX86_BUILTIN_MMX_ZERO, IX86_BUILTIN_PEXTRW, IX86_BUILTIN_PINSRW, IX86_BUILTIN_LOADAPS, IX86_BUILTIN_LOADSS, IX86_BUILTIN_STORESS, IX86_BUILTIN_SSE_ZERO, IX86_BUILTIN_PEXTRW128, IX86_BUILTIN_PINSRW128, IX86_BUILTIN_LOADAPD, IX86_BUILTIN_LOADSD, IX86_BUILTIN_STOREAPD, IX86_BUILTIN_STORESD, IX86_BUILTIN_STOREHPD, IX86_BUILTIN_STORELPD, IX86_BUILTIN_SETPD1, IX86_BUILTIN_SETPD, IX86_BUILTIN_CLRPD, IX86_BUILTIN_LOADPD1, IX86_BUILTIN_LOADRPD, IX86_BUILTIN_STOREPD1, IX86_BUILTIN_STORERPD, IX86_BUILTIN_LOADDQA, IX86_BUILTIN_STOREDQA, IX86_BUILTIN_CLRTI, IX86_BUILTIN_LOADDDUP): Remove. (IX86_BUILTIN_VEC_INIT_V2SI, IX86_BUILTIN_VEC_INIT_V4HI, IX86_BUILTIN_VEC_INIT_V8QI, IX86_BUILTIN_VEC_EXT_V2DF, IX86_BUILTIN_VEC_EXT_V2DI, IX86_BUILTIN_VEC_EXT_V4SF, IX86_BUILTIN_VEC_EXT_V8HI, IX86_BUILTIN_VEC_EXT_V4HI, IX86_BUILTIN_VEC_SET_V8HI, IX86_BUILTIN_VEC_SET_V4HI): New. (ix86_init_builtins): Make static. (ix86_init_mmx_sse_builtins): Update for changed builtins. (ix86_expand_binop_builtin): Only use ix86_fixup_binary_operands if all the modes match. Otherwise, fake it. (get_element_number, ix86_expand_vec_init_builtin, ix86_expand_vec_ext_builtin, ix86_expand_vec_set_builtin): New. (ix86_expand_builtin): Make static. Update for changed builtins. (ix86_expand_vector_move_misalign): Use sse2_loadlpd with zero operand instead of sse2_loadsd. Cast sse1 fallback to V4SFmode. (ix86_expand_vector_init_duplicate): New. (ix86_expand_vector_init_low_nonzero): New. (ix86_expand_vector_init_one_var, ix86_expand_vector_init_general): Split out from ix86_expand_vector_init; handle integer modes. (ix86_expand_vector_init): Use them. (ix86_expand_vector_set, ix86_expand_vector_extract): New. * config/i386/i386-protos.h: Update. * config/i386/predicates.md (reg_or_0_operand): New. * config/i386/mmx.md (mov<MMXMODEI>_internal): Add 'r' variants. (movv2sf_internal): Likewise. And a splitter to match them all. (vec_dupv2sf, mmx_concatv2sf, vec_setv2sf, vec_extractv2sf, vec_initv2sf, vec_dupv4hi, vec_dupv2si, mmx_concatv2si, vec_setv2si, vec_extractv2si, vec_initv2si, vec_setv4hi, vec_extractv4hi, vec_initv4hi, vec_setv8qi, vec_extractv8qi, vec_initv8qi): New. (mmx_pinsrw): Fix operand ordering. * config/i386/sse.md (movv4sf splitter): Use direct pattern, rather than sse_loadss expander. (movv2df splitter): Similarly. (sse_loadss, sse_loadlss): Remove. (vec_dupv4sf, sse_concatv2sf, sse_concatv4sf, vec_extractv4sf_0): New. (vec_setv4sf, vec_setv2df): Use ix86_expand_vector_set. (vec_extractv4sf, vec_extractv2df): Use ix86_expand_vector_extract. (sse3_movddup): Rename with '*'. (sse3_movddup splitter): Use gen_rtx_REG instead of gen_lowpart. (sse2_loadsd): Remove. (vec_dupv2df_sse3): Rename from sse3_loadddup. (vec_dupv2df, vec_concatv2df_sse3, vec_concatv2df): New. (sse2_pinsrw): Fix argument ordering. (sse2_loadld, sse2_loadq): Add sse1 alternatives. (sse2_stored): Remove 'r' destination. (vec_dupv4si, vec_dupv2di, sse2_concatv2si, sse1_concatv2si, vec_concatv4si_1, vec_concatv2di, vec_setv2di, vec_extractv2di, vec_initv2di, vec_setv4si, vec_extractv4si, vec_initv4si, vec_setv8hi, vec_extractv8hi, vec_initv8hi, vec_setv16qi, vec_extractv16qi, vec_initv16qi): New. * config/i386/emmintrin.h (__m128i, __m128d): Use typedef, not define. (_mm_set_sd, _mm_set1_pd, _mm_setzero_pd, _mm_set_epi64x, _mm_set_epi32, _mm_set_epi16, _mm_set_epi8, _mm_setzero_si128): Use constructor form. (_mm_load_pd, _mm_store_pd): Use plain dereference. (_mm_load_si128, _mm_store_si128): Likewise. (_mm_load1_pd): Use _mm_set1_pd. (_mm_load_sd): Use _mm_set_sd. (_mm_store_sd, _mm_storeh_pd): Use __builtin_ia32_vec_ext_v2df. (_mm_store1_pd, _mm_storer_pd): Use _mm_store_pd. (_mm_set_epi64): Use _mm_set_epi64x. (_mm_set1_epi64x, _mm_set1_epi64, _mm_set1_epi32, _mm_set_epi16, _mm_set1_epi8, _mm_setr_epi64, _mm_setr_epi32, _mm_setr_epi16, _mm_setr_epi8): Use _mm_set_foo form. (_mm_loadl_epi64, _mm_movpi64_epi64, _mm_move_epi64): Use _mm_set_epi64. (_mm_storel_epi64, _mm_movepi64_pi64): Use __builtin_ia32_vec_ext_v2di. (_mm_extract_epi16): Use __builtin_ia32_vec_ext_v8hi. (_mm_insert_epi16): Use __builtin_ia32_vec_set_v8hi. * config/i386/mmintrin.h (_mm_setzero_si64): Use plain cast. (_mm_set_pi32): Use __builtin_ia32_vec_init_v2si. (_mm_set_pi16): Use __builtin_ia32_vec_init_v4hi. (_mm_set_pi8): Use __builtin_ia32_vec_init_v8qi. (_mm_set1_pi16, _mm_set1_pi8): Use _mm_set_piN variant. * config/i386/pmmintrin.h (_mm_loaddup_pd): Use _mm_load1_pd. (_mm_movedup_pd): Use _mm_shuffle_pd. * config/i386/xmmintrin.h (_mm_setzero_ps, _mm_set_ss, _mm_set1_ps, _mm_set_ps, _mm_setr_ps): Use constructor form. (_mm_cvtpi16_ps, _mm_cvtpu16_ps, _mm_cvtpi8_ps, _mm_cvtpu8_ps, _mm_cvtps_pi8, _mm_cvtpi32x2_ps): Avoid __builtin_ia32_mmx_zero; Use _mm_setzero_ps. (_mm_load_ss, _mm_load1_ps): Use _mm_set* form. (_mm_load_ps, _mm_loadr_ps): Use raw dereference. (_mm_store_ss): Use __builtin_ia32_vec_ext_v4sf. (_mm_store_ps): Use raw dereference. (_mm_store1_ps): Use _mm_storeu_ps. (_mm_storer_ps): Use _mm_store_ps. (_mm_extract_pi16): Use __builtin_ia32_vec_ext_v4hi. (_mm_insert_pi16): Use __builtin_ia32_vec_set_v4hi. From-SVN: r93199
2005-01-11 22:33:14 +01:00
return _mm_set_pi8 (__b, __b, __b, __b, __b, __b, __b, __b);
}
Allow mmintrin headers to work with function specific target opts. Allow mmintrin headers to work with function specific target opts. Please see discussion here: http://gcc.gnu.org/ml/gcc-patches/2013-04/msg00740.html * config/i386/i386.c (ix86_pragma_target_parse): Restore target when current target options does not apply. * config/i386/i386-protos.h (ix86_reset_previous_fndecl): New function. * config/i386/i386.c (ix86_reset_previous_fndecl): Ditto. * config/i386/bmiintrin.h: Pass appropriate target attributes to header. * config/i386/mmintrin.h: Ditto. * config/i386/nmmintrin.h: Ditto. * config/i386/avx2intrin.h: Ditto. * config/i386/fxsrintrin.h: Ditto. * config/i386/tbmintrin.h: Ditto. * config/i386/xsaveintrin.h: Ditto. * config/i386/f16cintrin.h: Ditto. * config/i386/xtestintrin.h: Ditto. * config/i386/xsaveoptintrin.h: Ditto. * config/i386/bmi2intrin.h: Ditto. * config/i386/lzcntintrin.h: Ditto. * config/i386/smmintrin.h: Ditto. * config/i386/wmmintrin.h: Ditto. * config/i386/x86intrin.h: Remove all header include guards. * config/i386/prfchwintrin.h: Ditto. * config/i386/pmmintrin.h: Ditto. * config/i386/tmmintrin.h: Ditto. * config/i386/xmmintrin.h: Ditto. * config/i386/popcntintrin.h: Ditto. * config/i386/rdseedintrin.h: Ditto. * config/i386/ammintrin.h: Ditto. * config/i386/emmintrin.h: Ditto. * config/i386/immintrin.h: Remove all header include guards. * config/i386/fma4intrin.h: Ditto. * config/i386/lwpintrin.h: Ditto. * config/i386/xopintrin.h: Ditto. * config/i386/ia32intrin.h: Ditto. * config/i386/avxintrin.h: Ditto. * config/i386/rtmintrin.h: Ditto. * config/i386/fmaintrin.h: Ditto. * config/i386/mm3dnow.h: Ditto. * testsuite/gcc.target/i386/intrinsics_1.c: New test. * testsuite/gcc.target/i386/intrinsics_2.c: Ditto. * testsuite/gcc.target/i386/intrinsics_3.c: Ditto. * testsuite/gcc.target/i386/intrinsics_4.c: Ditto. * testsuite/gcc.target/i386/intrinsics_5.c: Ditto. * testsuite/gcc.target/i386/intrinsics_6.c: Ditto. * testsuite/gcc.target/i386/avx-1.c: Provide macros for builtins needing immediate arguments in f16cintrin.h and rtmintrin.h. From-SVN: r200349
2013-06-23 08:15:19 +02:00
#ifdef __DISABLE_MMX__
#undef __DISABLE_MMX__
#pragma GCC pop_options
#endif /* __DISABLE_MMX__ */
#endif /* _MMINTRIN_H_INCLUDED */