OpenE2K/gcc - gcc - Expired Mentality Git

Author	SHA1	Message	Date
Sandra Loosemore	a0db59bc5f	Fortran manual: Update section on Interoperability with C 2021-11-01 Sandra Loosemore <sandra@codesourcery.com> gcc/fortran/ * gfortran.texi (Interoperability with C): Copy-editing. Add more index entries. (Intrinsic Types): Likewise. (Derived Types and struct): Likewise. (Interoperable Global Variables): Likewise. (Interoperable Subroutines and Functions): Likewise. (Working with C Pointers): Likewise. (Further Interoperability of Fortran with C): Likewise. Rewrite to reflect that this is now fully supported by gfortran.	2021-11-04 09:53:02 -07:00
Sandra Loosemore	227e010036	Fortran manual: Revise introductory chapter. Fix various bit-rot in the discussion of standards conformance, remove material that is only of historical interest, copy-editing. Also move discussion of preprocessing out of the introductory chapter. 2021-11-01 Sandra Loosemore <sandra@codesourcery.com> gcc/fortran/ * gfortran.texi (About GNU Fortran): Consolidate material formerly in other sections. Copy-editing. (Preprocessing and conditional compilation): Delete, moving most material to invoke.texi. (GNU Fortran and G77): Delete. (Project Status): Delete. (Standards): Update. (Fortran 95 status): Mention conditional compilation here. (Fortran 2003 status): Rewrite to mention the 1 missing feature instead of all the ones implemented. (Fortran 2008 status): Similarly for the 2 missing features. (Fortran 2018 status): Rewrite to reflect completion of TS29113 feature support. * invoke.texi (Preprocessing Options): Move material formerly in introductory chapter here.	2021-11-04 09:53:02 -07:00
Sandra Loosemore	2b1c757d83	Fortran manual: Combine standard conformance docs in one place. Discussion of conformance with various revisions of the Fortran standard was split between two separate parts of the manual. This patch moves it all to the introductory chapter. 2021-11-01 Sandra Loosemore <sandra@codesourcery.com> gcc/fortran/ * gfortran.texi (Standards): Move discussion of specific standard versions here.... (Fortran standards status): ...from here, and delete this node.	2021-11-04 09:53:02 -07:00
Jan Hubicka	d3f7a2fa64	Workaround ICE in gimple_call_static_chain_flags gcc/ChangeLog: 2021-11-04 Jan Hubicka <hubicka@ucw.cz> PR ipa/103058 * gimple.c (gimple_call_static_chain_flags): Handle case when nested function does not bind locally.	2021-11-04 17:10:47 +01:00
Jason Merrill	fae00a0ac0	c++: use range-for more gcc/cp/ChangeLog: * call.c (build_array_conv): Use range-for. (build_complex_conv): Likewise. * constexpr.c (clear_no_implicit_zero) (reduced_constant_expression_p): Likewise. * decl.c (cp_complete_array_type): Likewise. * decl2.c (mark_vtable_entries): Likewise. * pt.c (iterative_hash_template_arg): (invalid_tparm_referent_p, unify) (type_dependent_expression_p): Likewise. * typeck.c (build_ptrmemfunc_access_expr): Likewise.	2021-11-04 11:35:54 -04:00
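The change in fae00a0ac0 converts loops in the C++ front end to range-based `for`. A minimal sketch of that kind of rewrite, assuming `std::vector` as a stand-in for GCC's internal containers; the names `sum_indexed` and `sum_range_for` are hypothetical and do not come from the commit:

```cpp
#include <cstddef>
#include <vector>

// Index-based loop: the style such a cleanup replaces.
int sum_indexed(const std::vector<int> &elts) {
    int total = 0;
    for (std::size_t i = 0; i < elts.size(); ++i)
        total += elts[i];
    return total;
}

// Range-based for loop: same behavior, no index bookkeeping.
int sum_range_for(const std::vector<int> &elts) {
    int total = 0;
    for (int e : elts)
        total += e;
    return total;
}
```

Both functions visit the elements in the same order, so the rewrite is intended to be purely cosmetic.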
Jonathan Wright	eb04ccf4bf	aarch64: Pass and return Neon vector-tuple types without a parallel Neon vector-tuple types can be passed in registers on function call and return - there is no need to generate a parallel rtx. This patch adds cases to detect vector-tuple modes and generates an appropriate register rtx. This change greatly improves code generated when passing Neon vector-tuple types between functions; many new test cases are added to defend these improvements. gcc/ChangeLog: 2021-10-07 Jonathan Wright <jonathan.wright@arm.com> * config/aarch64/aarch64.c (aarch64_function_value): Generate a register rtx for Neon vector-tuple modes. (aarch64_layout_arg): Likewise. gcc/testsuite/ChangeLog: * gcc.target/aarch64/vector_structure_intrinsics.c: New code generation tests.	2021-11-04 14:55:44 +00:00
Jonathan Wright	511245325a	gcc/lower_subreg.c: Prevent decomposition if modes are not tieable Preventing decomposition if modes are not tieable is necessary to stop AArch64 partial Neon structure modes being treated as packed in registers. This is a necessary prerequisite for a future AArch64 PCS change to maintain good code generation. gcc/ChangeLog: 2021-10-14 Jonathan Wright <jonathan.wright@arm.com> * lower-subreg.c (simple_move): Prevent decomposition if modes are not tieable.	2021-11-04 14:55:44 +00:00
Jonathan Wright	66f206b853	aarch64: Add machine modes for Neon vector-tuple types Until now, GCC has used large integer machine modes (OI, CI and XI) to model Neon vector-tuple types. This is suboptimal for many reasons, the most notable are: 1) Large integer modes are opaque and modifying one vector in the tuple requires a lot of inefficient set/get gymnastics. The result is a lot of superfluous move instructions. 2) Large integer modes do not map well to types that are tuples of 64-bit vectors - we need additional zero-padding which again results in superfluous move instructions. This patch adds new machine modes that better model the C-level Neon vector-tuple types. The approach is somewhat similar to that already used for SVE vector-tuple types. All of the AArch64 backend patterns and builtins that manipulate Neon vector tuples are updated to use the new machine modes. This has the effect of significantly reducing the amount of boiler-plate code in the arm_neon.h header. While this patch increases the quality of code generated in many instances, there is still room for significant improvement - which will be attempted in subsequent patches. gcc/ChangeLog: 2021-08-09 Jonathan Wright <jonathan.wright@arm.com> Richard Sandiford <richard.sandiford@arm.com> * config/aarch64/aarch64-builtins.c (v2x8qi_UP): Define. (v2x4hi_UP): Likewise. (v2x4hf_UP): Likewise. (v2x4bf_UP): Likewise. (v2x2si_UP): Likewise. (v2x2sf_UP): Likewise. (v2x1di_UP): Likewise. (v2x1df_UP): Likewise. (v2x16qi_UP): Likewise. (v2x8hi_UP): Likewise. (v2x8hf_UP): Likewise. (v2x8bf_UP): Likewise. (v2x4si_UP): Likewise. (v2x4sf_UP): Likewise. (v2x2di_UP): Likewise. (v2x2df_UP): Likewise. (v3x8qi_UP): Likewise. (v3x4hi_UP): Likewise. (v3x4hf_UP): Likewise. (v3x4bf_UP): Likewise. (v3x2si_UP): Likewise. (v3x2sf_UP): Likewise. (v3x1di_UP): Likewise. (v3x1df_UP): Likewise. (v3x16qi_UP): Likewise. (v3x8hi_UP): Likewise. (v3x8hf_UP): Likewise. (v3x8bf_UP): Likewise. (v3x4si_UP): Likewise. 
(v3x4sf_UP): Likewise. (v3x2di_UP): Likewise. (v3x2df_UP): Likewise. (v4x8qi_UP): Likewise. (v4x4hi_UP): Likewise. (v4x4hf_UP): Likewise. (v4x4bf_UP): Likewise. (v4x2si_UP): Likewise. (v4x2sf_UP): Likewise. (v4x1di_UP): Likewise. (v4x1df_UP): Likewise. (v4x16qi_UP): Likewise. (v4x8hi_UP): Likewise. (v4x8hf_UP): Likewise. (v4x8bf_UP): Likewise. (v4x4si_UP): Likewise. (v4x4sf_UP): Likewise. (v4x2di_UP): Likewise. (v4x2df_UP): Likewise. (TYPES_GETREGP): Delete. (TYPES_SETREGP): Likewise. (TYPES_LOADSTRUCT_U): Define. (TYPES_LOADSTRUCT_P): Likewise. (TYPES_LOADSTRUCT_LANE_U): Likewise. (TYPES_LOADSTRUCT_LANE_P): Likewise. (TYPES_STORE1P): Move for consistency. (TYPES_STORESTRUCT_U): Define. (TYPES_STORESTRUCT_P): Likewise. (TYPES_STORESTRUCT_LANE_U): Likewise. (TYPES_STORESTRUCT_LANE_P): Likewise. (aarch64_simd_tuple_types): Define. (aarch64_lookup_simd_builtin_type): Handle tuple type lookup. (aarch64_init_simd_builtin_functions): Update frontend lookup for builtin functions after handling arm_neon.h pragma. (register_tuple_type): Manually set modes of single-integer tuple types. Record tuple types. * config/aarch64/aarch64-modes.def (ADV_SIMD_D_REG_STRUCT_MODES): Define D-register tuple modes. (ADV_SIMD_Q_REG_STRUCT_MODES): Define Q-register tuple modes. (SVE_MODES): Give single-vector modes priority over vector-tuple modes. (VECTOR_MODES_WITH_PREFIX): Set partial-vector mode order to be after all single-vector modes. * config/aarch64/aarch64-simd-builtins.def: Update builtin generator macros to reflect modifications to the backend patterns. * config/aarch64/aarch64-simd.md (aarch64_simd_ld2<mode>): Use vector-tuple mode iterator and rename to... (aarch64_simd_ld2<vstruct_elt>): This. (aarch64_simd_ld2r<mode>): Use vector-tuple mode iterator and rename to... (aarch64_simd_ld2r<vstruct_elt>): This. (aarch64_vec_load_lanesoi_lane<mode>): Use vector-tuple mode iterator and rename to... (aarch64_vec_load_lanes<mode>_lane<vstruct_elt>): This.
(vec_load_lanesoi<mode>): Use vector-tuple mode iterator and rename to... (vec_load_lanes<mode><vstruct_elt>): This. (aarch64_simd_st2<mode>): Use vector-tuple mode iterator and rename to... (aarch64_simd_st2<vstruct_elt>): This. (aarch64_vec_store_lanesoi_lane<mode>): Use vector-tuple mode iterator and rename to... (aarch64_vec_store_lanes<mode>_lane<vstruct_elt>): This. (vec_store_lanesoi<mode>): Use vector-tuple mode iterator and rename to... (vec_store_lanes<mode><vstruct_elt>): This. (aarch64_simd_ld3<mode>): Use vector-tuple mode iterator and rename to... (aarch64_simd_ld3<vstruct_elt>): This. (aarch64_simd_ld3r<mode>): Use vector-tuple mode iterator and rename to... (aarch64_simd_ld3r<vstruct_elt>): This. (aarch64_vec_load_lanesci_lane<mode>): Use vector-tuple mode iterator and rename to... (vec_load_lanesci<mode>): This. (aarch64_simd_st3<mode>): Use vector-tuple mode iterator and rename to... (aarch64_simd_st3<vstruct_elt>): This. (aarch64_vec_store_lanesci_lane<mode>): Use vector-tuple mode iterator and rename to... (vec_store_lanesci<mode>): This. (aarch64_simd_ld4<mode>): Use vector-tuple mode iterator and rename to... (aarch64_simd_ld4<vstruct_elt>): This. (aarch64_simd_ld4r<mode>): Use vector-tuple mode iterator and rename to... (aarch64_simd_ld4r<vstruct_elt>): This. (aarch64_vec_load_lanesxi_lane<mode>): Use vector-tuple mode iterator and rename to... (vec_load_lanesxi<mode>): This. (aarch64_simd_st4<mode>): Use vector-tuple mode iterator and rename to... (aarch64_simd_st4<vstruct_elt>): This. (aarch64_vec_store_lanesxi_lane<mode>): Use vector-tuple mode iterator and rename to... (vec_store_lanesxi<mode>): This. (mov<mode>): Define for Neon vector-tuple modes. (aarch64_ld1x3<VALLDIF:mode>): Use vector-tuple mode iterator and rename to... (aarch64_ld1x3<vstruct_elt>): This. (aarch64_ld1_x3_<mode>): Use vector-tuple mode iterator and rename to... (aarch64_ld1_x3_<vstruct_elt>): This. 
(aarch64_ld1x4<VALLDIF:mode>): Use vector-tuple mode iterator and rename to... (aarch64_ld1x4<vstruct_elt>): This. (aarch64_ld1_x4_<mode>): Use vector-tuple mode iterator and rename to... (aarch64_ld1_x4_<vstruct_elt>): This. (aarch64_st1x2<VALLDIF:mode>): Use vector-tuple mode iterator and rename to... (aarch64_st1x2<vstruct_elt>): This. (aarch64_st1_x2_<mode>): Use vector-tuple mode iterator and rename to... (aarch64_st1_x2_<vstruct_elt>): This. (aarch64_st1x3<VALLDIF:mode>): Use vector-tuple mode iterator and rename to... (aarch64_st1x3<vstruct_elt>): This. (aarch64_st1_x3_<mode>): Use vector-tuple mode iterator and rename to... (aarch64_st1_x3_<vstruct_elt>): This. (aarch64_st1x4<VALLDIF:mode>): Use vector-tuple mode iterator and rename to... (aarch64_st1x4<vstruct_elt>): This. (aarch64_st1_x4_<mode>): Use vector-tuple mode iterator and rename to... (aarch64_st1_x4_<vstruct_elt>): This. (aarch64_mov<mode>): Define for vector-tuple modes. (aarch64_be_mov<mode>): Likewise. (aarch64_ld<VSTRUCT:nregs>r<VALLDIF:mode>): Use vector-tuple mode iterator and rename to... (aarch64_ld<nregs>r<vstruct_elt>): This. (aarch64_ld2<mode>_dreg): Use vector-tuple mode iterator and rename to... (aarch64_ld2<vstruct_elt>_dreg): This. (aarch64_ld3<mode>_dreg): Use vector-tuple mode iterator and rename to... (aarch64_ld3<vstruct_elt>_dreg): This. (aarch64_ld4<mode>_dreg): Use vector-tuple mode iterator and rename to... (aarch64_ld4<vstruct_elt>_dreg): This. (aarch64_ld<VSTRUCT:nregs><VDC:mode>): Use vector-tuple mode iterator and rename to... (aarch64_ld<nregs><vstruct_elt>): Use vector-tuple mode iterator and rename to... (aarch64_ld<VSTRUCT:nregs><VQ:mode>): Use vector-tuple mode (aarch64_ld1x2<VQ:mode>): Delete. (aarch64_ld1x2<VDC:mode>): Use vector-tuple mode iterator and rename to... (aarch64_ld1x2<vstruct_elt>): This. (aarch64_ld<VSTRUCT:nregs>_lane<VALLDIF:mode>): Use vector-tuple mode iterator and rename to... (aarch64_ld<nregs>_lane<vstruct_elt>): This.
(aarch64_get_dreg<VSTRUCT:mode><VDC:mode>): Delete. (aarch64_get_qreg<VSTRUCT:mode><VQ:mode>): Likewise. (aarch64_st2<mode>_dreg): Use vector-tuple mode iterator and rename to... (aarch64_st2<vstruct_elt>_dreg): This. (aarch64_st3<mode>_dreg): Use vector-tuple mode iterator and rename to... (aarch64_st3<vstruct_elt>_dreg): This. (aarch64_st4<mode>_dreg): Use vector-tuple mode iterator and rename to... (aarch64_st4<vstruct_elt>_dreg): This. (aarch64_st<VSTRUCT:nregs><VDC:mode>): Use vector-tuple mode iterator and rename to... (aarch64_st<nregs><vstruct_elt>): This. (aarch64_st<VSTRUCT:nregs><VQ:mode>): Use vector-tuple mode iterator and rename to aarch64_st<nregs><vstruct_elt>. (aarch64_st<VSTRUCT:nregs>_lane<VALLDIF:mode>): Use vector-tuple mode iterator and rename to... (aarch64_st<nregs>_lane<vstruct_elt>): This. (aarch64_set_qreg<VSTRUCT:mode><VQ:mode>): Delete. (aarch64_simd_ld1<mode>_x2): Use vector-tuple mode iterator and rename to... (aarch64_simd_ld1<vstruct_elt>_x2): This. * config/aarch64/aarch64.c (aarch64_advsimd_struct_mode_p): Refactor to include new vector-tuple modes. (aarch64_classify_vector_mode): Add cases for new vector-tuple modes. (aarch64_advsimd_partial_struct_mode_p): Define. (aarch64_advsimd_full_struct_mode_p): Likewise. (aarch64_advsimd_vector_array_mode): Likewise. (aarch64_sve_data_mode): Change location in file. (aarch64_array_mode): Handle case of Neon vector-tuple modes. (aarch64_hard_regno_nregs): Handle case of partial Neon vector structures. (aarch64_classify_address): Refactor to include handling of Neon vector-tuple modes. (aarch64_print_operand): Print "d" for "%R" for a partial Neon vector structure. (aarch64_expand_vec_perm_1): Use new vector-tuple mode. (aarch64_modes_tieable_p): Prevent tieing Neon partial struct modes with scalar machine modes larger than 8 bytes. (aarch64_can_change_mode_class): Don't allow changes between partial and full Neon vector-structure modes.
* config/aarch64/arm_neon.h (vst2_lane_f16): Use updated builtin and remove boiler-plate code for opaque mode. (vst2_lane_f32): Likewise. (vst2_lane_f64): Likewise. (vst2_lane_p8): Likewise. (vst2_lane_p16): Likewise. (vst2_lane_p64): Likewise. (vst2_lane_s8): Likewise. (vst2_lane_s16): Likewise. (vst2_lane_s32): Likewise. (vst2_lane_s64): Likewise. (vst2_lane_u8): Likewise. (vst2_lane_u16): Likewise. (vst2_lane_u32): Likewise. (vst2_lane_u64): Likewise. (vst2q_lane_f16): Likewise. (vst2q_lane_f32): Likewise. (vst2q_lane_f64): Likewise. (vst2q_lane_p8): Likewise. (vst2q_lane_p16): Likewise. (vst2q_lane_p64): Likewise. (vst2q_lane_s8): Likewise. (vst2q_lane_s16): Likewise. (vst2q_lane_s32): Likewise. (vst2q_lane_s64): Likewise. (vst2q_lane_u8): Likewise. (vst2q_lane_u16): Likewise. (vst2q_lane_u32): Likewise. (vst2q_lane_u64): Likewise. (vst3_lane_f16): Likewise. (vst3_lane_f32): Likewise. (vst3_lane_f64): Likewise. (vst3_lane_p8): Likewise. (vst3_lane_p16): Likewise. (vst3_lane_p64): Likewise. (vst3_lane_s8): Likewise. (vst3_lane_s16): Likewise. (vst3_lane_s32): Likewise. (vst3_lane_s64): Likewise. (vst3_lane_u8): Likewise. (vst3_lane_u16): Likewise. (vst3_lane_u32): Likewise. (vst3_lane_u64): Likewise. (vst3q_lane_f16): Likewise. (vst3q_lane_f32): Likewise. (vst3q_lane_f64): Likewise. (vst3q_lane_p8): Likewise. (vst3q_lane_p16): Likewise. (vst3q_lane_p64): Likewise. (vst3q_lane_s8): Likewise. (vst3q_lane_s16): Likewise. (vst3q_lane_s32): Likewise. (vst3q_lane_s64): Likewise. (vst3q_lane_u8): Likewise. (vst3q_lane_u16): Likewise. (vst3q_lane_u32): Likewise. (vst3q_lane_u64): Likewise. (vst4_lane_f16): Likewise. (vst4_lane_f32): Likewise. (vst4_lane_f64): Likewise. (vst4_lane_p8): Likewise. (vst4_lane_p16): Likewise. (vst4_lane_p64): Likewise. (vst4_lane_s8): Likewise. (vst4_lane_s16): Likewise. (vst4_lane_s32): Likewise. (vst4_lane_s64): Likewise. (vst4_lane_u8): Likewise. (vst4_lane_u16): Likewise. (vst4_lane_u32): Likewise. (vst4_lane_u64): Likewise. 
(vst4q_lane_f16): Likewise. (vst4q_lane_f32): Likewise. (vst4q_lane_f64): Likewise. (vst4q_lane_p8): Likewise. (vst4q_lane_p16): Likewise. (vst4q_lane_p64): Likewise. (vst4q_lane_s8): Likewise. (vst4q_lane_s16): Likewise. (vst4q_lane_s32): Likewise. (vst4q_lane_s64): Likewise. (vst4q_lane_u8): Likewise. (vst4q_lane_u16): Likewise. (vst4q_lane_u32): Likewise. (vst4q_lane_u64): Likewise. (vtbl3_s8): Likewise. (vtbl3_u8): Likewise. (vtbl3_p8): Likewise. (vtbl4_s8): Likewise. (vtbl4_u8): Likewise. (vtbl4_p8): Likewise. (vld1_u8_x3): Likewise. (vld1_s8_x3): Likewise. (vld1_u16_x3): Likewise. (vld1_s16_x3): Likewise. (vld1_u32_x3): Likewise. (vld1_s32_x3): Likewise. (vld1_u64_x3): Likewise. (vld1_s64_x3): Likewise. (vld1_f16_x3): Likewise. (vld1_f32_x3): Likewise. (vld1_f64_x3): Likewise. (vld1_p8_x3): Likewise. (vld1_p16_x3): Likewise. (vld1_p64_x3): Likewise. (vld1q_u8_x3): Likewise. (vld1q_s8_x3): Likewise. (vld1q_u16_x3): Likewise. (vld1q_s16_x3): Likewise. (vld1q_u32_x3): Likewise. (vld1q_s32_x3): Likewise. (vld1q_u64_x3): Likewise. (vld1q_s64_x3): Likewise. (vld1q_f16_x3): Likewise. (vld1q_f32_x3): Likewise. (vld1q_f64_x3): Likewise. (vld1q_p8_x3): Likewise. (vld1q_p16_x3): Likewise. (vld1q_p64_x3): Likewise. (vld1_u8_x2): Likewise. (vld1_s8_x2): Likewise. (vld1_u16_x2): Likewise. (vld1_s16_x2): Likewise. (vld1_u32_x2): Likewise. (vld1_s32_x2): Likewise. (vld1_u64_x2): Likewise. (vld1_s64_x2): Likewise. (vld1_f16_x2): Likewise. (vld1_f32_x2): Likewise. (vld1_f64_x2): Likewise. (vld1_p8_x2): Likewise. (vld1_p16_x2): Likewise. (vld1_p64_x2): Likewise. (vld1q_u8_x2): Likewise. (vld1q_s8_x2): Likewise. (vld1q_u16_x2): Likewise. (vld1q_s16_x2): Likewise. (vld1q_u32_x2): Likewise. (vld1q_s32_x2): Likewise. (vld1q_u64_x2): Likewise. (vld1q_s64_x2): Likewise. (vld1q_f16_x2): Likewise. (vld1q_f32_x2): Likewise. (vld1q_f64_x2): Likewise. (vld1q_p8_x2): Likewise. (vld1q_p16_x2): Likewise. (vld1q_p64_x2): Likewise. (vld1_s8_x4): Likewise. (vld1q_s8_x4): Likewise. 
(vld1_s16_x4): Likewise. (vld1q_s16_x4): Likewise. (vld1_s32_x4): Likewise. (vld1q_s32_x4): Likewise. (vld1_u8_x4): Likewise. (vld1q_u8_x4): Likewise. (vld1_u16_x4): Likewise. (vld1q_u16_x4): Likewise. (vld1_u32_x4): Likewise. (vld1q_u32_x4): Likewise. (vld1_f16_x4): Likewise. (vld1q_f16_x4): Likewise. (vld1_f32_x4): Likewise. (vld1q_f32_x4): Likewise. (vld1_p8_x4): Likewise. (vld1q_p8_x4): Likewise. (vld1_p16_x4): Likewise. (vld1q_p16_x4): Likewise. (vld1_s64_x4): Likewise. (vld1_u64_x4): Likewise. (vld1_p64_x4): Likewise. (vld1q_s64_x4): Likewise. (vld1q_u64_x4): Likewise. (vld1q_p64_x4): Likewise. (vld1_f64_x4): Likewise. (vld1q_f64_x4): Likewise. (vld2_s64): Likewise. (vld2_u64): Likewise. (vld2_f64): Likewise. (vld2_s8): Likewise. (vld2_p8): Likewise. (vld2_p64): Likewise. (vld2_s16): Likewise. (vld2_p16): Likewise. (vld2_s32): Likewise. (vld2_u8): Likewise. (vld2_u16): Likewise. (vld2_u32): Likewise. (vld2_f16): Likewise. (vld2_f32): Likewise. (vld2q_s8): Likewise. (vld2q_p8): Likewise. (vld2q_s16): Likewise. (vld2q_p16): Likewise. (vld2q_p64): Likewise. (vld2q_s32): Likewise. (vld2q_s64): Likewise. (vld2q_u8): Likewise. (vld2q_u16): Likewise. (vld2q_u32): Likewise. (vld2q_u64): Likewise. (vld2q_f16): Likewise. (vld2q_f32): Likewise. (vld2q_f64): Likewise. (vld3_s64): Likewise. (vld3_u64): Likewise. (vld3_f64): Likewise. (vld3_s8): Likewise. (vld3_p8): Likewise. (vld3_s16): Likewise. (vld3_p16): Likewise. (vld3_s32): Likewise. (vld3_u8): Likewise. (vld3_u16): Likewise. (vld3_u32): Likewise. (vld3_f16): Likewise. (vld3_f32): Likewise. (vld3_p64): Likewise. (vld3q_s8): Likewise. (vld3q_p8): Likewise. (vld3q_s16): Likewise. (vld3q_p16): Likewise. (vld3q_s32): Likewise. (vld3q_s64): Likewise. (vld3q_u8): Likewise. (vld3q_u16): Likewise. (vld3q_u32): Likewise. (vld3q_u64): Likewise. (vld3q_f16): Likewise. (vld3q_f32): Likewise. (vld3q_f64): Likewise. (vld3q_p64): Likewise. (vld4_s64): Likewise. (vld4_u64): Likewise. (vld4_f64): Likewise. (vld4_s8): Likewise. 
(vld4_p8): Likewise. (vld4_s16): Likewise. (vld4_p16): Likewise. (vld4_s32): Likewise. (vld4_u8): Likewise. (vld4_u16): Likewise. (vld4_u32): Likewise. (vld4_f16): Likewise. (vld4_f32): Likewise. (vld4_p64): Likewise. (vld4q_s8): Likewise. (vld4q_p8): Likewise. (vld4q_s16): Likewise. (vld4q_p16): Likewise. (vld4q_s32): Likewise. (vld4q_s64): Likewise. (vld4q_u8): Likewise. (vld4q_u16): Likewise. (vld4q_u32): Likewise. (vld4q_u64): Likewise. (vld4q_f16): Likewise. (vld4q_f32): Likewise. (vld4q_f64): Likewise. (vld4q_p64): Likewise. (vld2_dup_s8): Likewise. (vld2_dup_s16): Likewise. (vld2_dup_s32): Likewise. (vld2_dup_f16): Likewise. (vld2_dup_f32): Likewise. (vld2_dup_f64): Likewise. (vld2_dup_u8): Likewise. (vld2_dup_u16): Likewise. (vld2_dup_u32): Likewise. (vld2_dup_p8): Likewise. (vld2_dup_p16): Likewise. (vld2_dup_p64): Likewise. (vld2_dup_s64): Likewise. (vld2_dup_u64): Likewise. (vld2q_dup_s8): Likewise. (vld2q_dup_p8): Likewise. (vld2q_dup_s16): Likewise. (vld2q_dup_p16): Likewise. (vld2q_dup_s32): Likewise. (vld2q_dup_s64): Likewise. (vld2q_dup_u8): Likewise. (vld2q_dup_u16): Likewise. (vld2q_dup_u32): Likewise. (vld2q_dup_u64): Likewise. (vld2q_dup_f16): Likewise. (vld2q_dup_f32): Likewise. (vld2q_dup_f64): Likewise. (vld2q_dup_p64): Likewise. (vld3_dup_s64): Likewise. (vld3_dup_u64): Likewise. (vld3_dup_f64): Likewise. (vld3_dup_s8): Likewise. (vld3_dup_p8): Likewise. (vld3_dup_s16): Likewise. (vld3_dup_p16): Likewise. (vld3_dup_s32): Likewise. (vld3_dup_u8): Likewise. (vld3_dup_u16): Likewise. (vld3_dup_u32): Likewise. (vld3_dup_f16): Likewise. (vld3_dup_f32): Likewise. (vld3_dup_p64): Likewise. (vld3q_dup_s8): Likewise. (vld3q_dup_p8): Likewise. (vld3q_dup_s16): Likewise. (vld3q_dup_p16): Likewise. (vld3q_dup_s32): Likewise. (vld3q_dup_s64): Likewise. (vld3q_dup_u8): Likewise. (vld3q_dup_u16): Likewise. (vld3q_dup_u32): Likewise. (vld3q_dup_u64): Likewise. (vld3q_dup_f16): Likewise. (vld3q_dup_f32): Likewise. (vld3q_dup_f64): Likewise. 
(vld3q_dup_p64): Likewise. (vld4_dup_s64): Likewise. (vld4_dup_u64): Likewise. (vld4_dup_f64): Likewise. (vld4_dup_s8): Likewise. (vld4_dup_p8): Likewise. (vld4_dup_s16): Likewise. (vld4_dup_p16): Likewise. (vld4_dup_s32): Likewise. (vld4_dup_u8): Likewise. (vld4_dup_u16): Likewise. (vld4_dup_u32): Likewise. (vld4_dup_f16): Likewise. (vld4_dup_f32): Likewise. (vld4_dup_p64): Likewise. (vld4q_dup_s8): Likewise. (vld4q_dup_p8): Likewise. (vld4q_dup_s16): Likewise. (vld4q_dup_p16): Likewise. (vld4q_dup_s32): Likewise. (vld4q_dup_s64): Likewise. (vld4q_dup_u8): Likewise. (vld4q_dup_u16): Likewise. (vld4q_dup_u32): Likewise. (vld4q_dup_u64): Likewise. (vld4q_dup_f16): Likewise. (vld4q_dup_f32): Likewise. (vld4q_dup_f64): Likewise. (vld4q_dup_p64): Likewise. (vld2_lane_u8): Likewise. (vld2_lane_u16): Likewise. (vld2_lane_u32): Likewise. (vld2_lane_u64): Likewise. (vld2_lane_s8): Likewise. (vld2_lane_s16): Likewise. (vld2_lane_s32): Likewise. (vld2_lane_s64): Likewise. (vld2_lane_f16): Likewise. (vld2_lane_f32): Likewise. (vld2_lane_f64): Likewise. (vld2_lane_p8): Likewise. (vld2_lane_p16): Likewise. (vld2_lane_p64): Likewise. (vld2q_lane_u8): Likewise. (vld2q_lane_u16): Likewise. (vld2q_lane_u32): Likewise. (vld2q_lane_u64): Likewise. (vld2q_lane_s8): Likewise. (vld2q_lane_s16): Likewise. (vld2q_lane_s32): Likewise. (vld2q_lane_s64): Likewise. (vld2q_lane_f16): Likewise. (vld2q_lane_f32): Likewise. (vld2q_lane_f64): Likewise. (vld2q_lane_p8): Likewise. (vld2q_lane_p16): Likewise. (vld2q_lane_p64): Likewise. (vld3_lane_u8): Likewise. (vld3_lane_u16): Likewise. (vld3_lane_u32): Likewise. (vld3_lane_u64): Likewise. (vld3_lane_s8): Likewise. (vld3_lane_s16): Likewise. (vld3_lane_s32): Likewise. (vld3_lane_s64): Likewise. (vld3_lane_f16): Likewise. (vld3_lane_f32): Likewise. (vld3_lane_f64): Likewise. (vld3_lane_p8): Likewise. (vld3_lane_p16): Likewise. (vld3_lane_p64): Likewise. (vld3q_lane_u8): Likewise. (vld3q_lane_u16): Likewise. (vld3q_lane_u32): Likewise. 
(vld3q_lane_u64): Likewise. (vld3q_lane_s8): Likewise. (vld3q_lane_s16): Likewise. (vld3q_lane_s32): Likewise. (vld3q_lane_s64): Likewise. (vld3q_lane_f16): Likewise. (vld3q_lane_f32): Likewise. (vld3q_lane_f64): Likewise. (vld3q_lane_p8): Likewise. (vld3q_lane_p16): Likewise. (vld3q_lane_p64): Likewise. (vld4_lane_u8): Likewise. (vld4_lane_u16): Likewise. (vld4_lane_u32): Likewise. (vld4_lane_u64): Likewise. (vld4_lane_s8): Likewise. (vld4_lane_s16): Likewise. (vld4_lane_s32): Likewise. (vld4_lane_s64): Likewise. (vld4_lane_f16): Likewise. (vld4_lane_f32): Likewise. (vld4_lane_f64): Likewise. (vld4_lane_p8): Likewise. (vld4_lane_p16): Likewise. (vld4_lane_p64): Likewise. (vld4q_lane_u8): Likewise. (vld4q_lane_u16): Likewise. (vld4q_lane_u32): Likewise. (vld4q_lane_u64): Likewise. (vld4q_lane_s8): Likewise. (vld4q_lane_s16): Likewise. (vld4q_lane_s32): Likewise. (vld4q_lane_s64): Likewise. (vld4q_lane_f16): Likewise. (vld4q_lane_f32): Likewise. (vld4q_lane_f64): Likewise. (vld4q_lane_p8): Likewise. (vld4q_lane_p16): Likewise. (vld4q_lane_p64): Likewise. (vqtbl2_s8): Likewise. (vqtbl2_u8): Likewise. (vqtbl2_p8): Likewise. (vqtbl2q_s8): Likewise. (vqtbl2q_u8): Likewise. (vqtbl2q_p8): Likewise. (vqtbl3_s8): Likewise. (vqtbl3_u8): Likewise. (vqtbl3_p8): Likewise. (vqtbl3q_s8): Likewise. (vqtbl3q_u8): Likewise. (vqtbl3q_p8): Likewise. (vqtbl4_s8): Likewise. (vqtbl4_u8): Likewise. (vqtbl4_p8): Likewise. (vqtbl4q_s8): Likewise. (vqtbl4q_u8): Likewise. (vqtbl4q_p8): Likewise. (vqtbx2_s8): Likewise. (vqtbx2_u8): Likewise. (vqtbx2_p8): Likewise. (vqtbx2q_s8): Likewise. (vqtbx2q_u8): Likewise. (vqtbx2q_p8): Likewise. (vqtbx3_s8): Likewise. (vqtbx3_u8): Likewise. (vqtbx3_p8): Likewise. (vqtbx3q_s8): Likewise. (vqtbx3q_u8): Likewise. (vqtbx3q_p8): Likewise. (vqtbx4_s8): Likewise. (vqtbx4_u8): Likewise. (vqtbx4_p8): Likewise. (vqtbx4q_s8): Likewise. (vqtbx4q_u8): Likewise. (vqtbx4q_p8): Likewise. (vst1_s64_x2): Likewise. (vst1_u64_x2): Likewise. (vst1_f64_x2): Likewise. 
(vst1_s8_x2): Likewise. (vst1_p8_x2): Likewise. (vst1_s16_x2): Likewise. (vst1_p16_x2): Likewise. (vst1_s32_x2): Likewise. (vst1_u8_x2): Likewise. (vst1_u16_x2): Likewise. (vst1_u32_x2): Likewise. (vst1_f16_x2): Likewise. (vst1_f32_x2): Likewise. (vst1_p64_x2): Likewise. (vst1q_s8_x2): Likewise. (vst1q_p8_x2): Likewise. (vst1q_s16_x2): Likewise. (vst1q_p16_x2): Likewise. (vst1q_s32_x2): Likewise. (vst1q_s64_x2): Likewise. (vst1q_u8_x2): Likewise. (vst1q_u16_x2): Likewise. (vst1q_u32_x2): Likewise. (vst1q_u64_x2): Likewise. (vst1q_f16_x2): Likewise. (vst1q_f32_x2): Likewise. (vst1q_f64_x2): Likewise. (vst1q_p64_x2): Likewise. (vst1_s64_x3): Likewise. (vst1_u64_x3): Likewise. (vst1_f64_x3): Likewise. (vst1_s8_x3): Likewise. (vst1_p8_x3): Likewise. (vst1_s16_x3): Likewise. (vst1_p16_x3): Likewise. (vst1_s32_x3): Likewise. (vst1_u8_x3): Likewise. (vst1_u16_x3): Likewise. (vst1_u32_x3): Likewise. (vst1_f16_x3): Likewise. (vst1_f32_x3): Likewise. (vst1_p64_x3): Likewise. (vst1q_s8_x3): Likewise. (vst1q_p8_x3): Likewise. (vst1q_s16_x3): Likewise. (vst1q_p16_x3): Likewise. (vst1q_s32_x3): Likewise. (vst1q_s64_x3): Likewise. (vst1q_u8_x3): Likewise. (vst1q_u16_x3): Likewise. (vst1q_u32_x3): Likewise. (vst1q_u64_x3): Likewise. (vst1q_f16_x3): Likewise. (vst1q_f32_x3): Likewise. (vst1q_f64_x3): Likewise. (vst1q_p64_x3): Likewise. (vst1_s8_x4): Likewise. (vst1q_s8_x4): Likewise. (vst1_s16_x4): Likewise. (vst1q_s16_x4): Likewise. (vst1_s32_x4): Likewise. (vst1q_s32_x4): Likewise. (vst1_u8_x4): Likewise. (vst1q_u8_x4): Likewise. (vst1_u16_x4): Likewise. (vst1q_u16_x4): Likewise. (vst1_u32_x4): Likewise. (vst1q_u32_x4): Likewise. (vst1_f16_x4): Likewise. (vst1q_f16_x4): Likewise. (vst1_f32_x4): Likewise. (vst1q_f32_x4): Likewise. (vst1_p8_x4): Likewise. (vst1q_p8_x4): Likewise. (vst1_p16_x4): Likewise. (vst1q_p16_x4): Likewise. (vst1_s64_x4): Likewise. (vst1_u64_x4): Likewise. (vst1_p64_x4): Likewise. (vst1q_s64_x4): Likewise. (vst1q_u64_x4): Likewise. (vst1q_p64_x4): Likewise. 
(vst1_f64_x4): Likewise. (vst1q_f64_x4): Likewise. (vst2_s64): Likewise. (vst2_u64): Likewise. (vst2_f64): Likewise. (vst2_s8): Likewise. (vst2_p8): Likewise. (vst2_s16): Likewise. (vst2_p16): Likewise. (vst2_s32): Likewise. (vst2_u8): Likewise. (vst2_u16): Likewise. (vst2_u32): Likewise. (vst2_f16): Likewise. (vst2_f32): Likewise. (vst2_p64): Likewise. (vst2q_s8): Likewise. (vst2q_p8): Likewise. (vst2q_s16): Likewise. (vst2q_p16): Likewise. (vst2q_s32): Likewise. (vst2q_s64): Likewise. (vst2q_u8): Likewise. (vst2q_u16): Likewise. (vst2q_u32): Likewise. (vst2q_u64): Likewise. (vst2q_f16): Likewise. (vst2q_f32): Likewise. (vst2q_f64): Likewise. (vst2q_p64): Likewise. (vst3_s64): Likewise. (vst3_u64): Likewise. (vst3_f64): Likewise. (vst3_s8): Likewise. (vst3_p8): Likewise. (vst3_s16): Likewise. (vst3_p16): Likewise. (vst3_s32): Likewise. (vst3_u8): Likewise. (vst3_u16): Likewise. (vst3_u32): Likewise. (vst3_f16): Likewise. (vst3_f32): Likewise. (vst3_p64): Likewise. (vst3q_s8): Likewise. (vst3q_p8): Likewise. (vst3q_s16): Likewise. (vst3q_p16): Likewise. (vst3q_s32): Likewise. (vst3q_s64): Likewise. (vst3q_u8): Likewise. (vst3q_u16): Likewise. (vst3q_u32): Likewise. (vst3q_u64): Likewise. (vst3q_f16): Likewise. (vst3q_f32): Likewise. (vst3q_f64): Likewise. (vst3q_p64): Likewise. (vst4_s64): Likewise. (vst4_u64): Likewise. (vst4_f64): Likewise. (vst4_s8): Likewise. (vst4_p8): Likewise. (vst4_s16): Likewise. (vst4_p16): Likewise. (vst4_s32): Likewise. (vst4_u8): Likewise. (vst4_u16): Likewise. (vst4_u32): Likewise. (vst4_f16): Likewise. (vst4_f32): Likewise. (vst4_p64): Likewise. (vst4q_s8): Likewise. (vst4q_p8): Likewise. (vst4q_s16): Likewise. (vst4q_p16): Likewise. (vst4q_s32): Likewise. (vst4q_s64): Likewise. (vst4q_u8): Likewise. (vst4q_u16): Likewise. (vst4q_u32): Likewise. (vst4q_u64): Likewise. (vst4q_f16): Likewise. (vst4q_f32): Likewise. (vst4q_f64): Likewise. (vst4q_p64): Likewise. (vtbx4_s8): Likewise. (vtbx4_u8): Likewise. (vtbx4_p8): Likewise. 
(vld1_bf16_x2): Likewise. (vld1q_bf16_x2): Likewise. (vld1_bf16_x3): Likewise. (vld1q_bf16_x3): Likewise. (vld1_bf16_x4): Likewise. (vld1q_bf16_x4): Likewise. (vld2_bf16): Likewise. (vld2q_bf16): Likewise. (vld2_dup_bf16): Likewise. (vld2q_dup_bf16): Likewise. (vld3_bf16): Likewise. (vld3q_bf16): Likewise. (vld3_dup_bf16): Likewise. (vld3q_dup_bf16): Likewise. (vld4_bf16): Likewise. (vld4q_bf16): Likewise. (vld4_dup_bf16): Likewise. (vld4q_dup_bf16): Likewise. (vst1_bf16_x2): Likewise. (vst1q_bf16_x2): Likewise. (vst1_bf16_x3): Likewise. (vst1q_bf16_x3): Likewise. (vst1_bf16_x4): Likewise. (vst1q_bf16_x4): Likewise. (vst2_bf16): Likewise. (vst2q_bf16): Likewise. (vst3_bf16): Likewise. (vst3q_bf16): Likewise. (vst4_bf16): Likewise. (vst4q_bf16): Likewise. (vld2_lane_bf16): Likewise. (vld2q_lane_bf16): Likewise. (vld3_lane_bf16): Likewise. (vld3q_lane_bf16): Likewise. (vld4_lane_bf16): Likewise. (vld4q_lane_bf16): Likewise. (vst2_lane_bf16): Likewise. (vst2q_lane_bf16): Likewise. (vst3_lane_bf16): Likewise. (vst3q_lane_bf16): Likewise. (vst4_lane_bf16): Likewise. (vst4q_lane_bf16): Likewise. * config/aarch64/geniterators.sh: Modify iterator regex to match new vector-tuple modes. * config/aarch64/iterators.md (insn_count): Extend mode attribute with vector-tuple type information. (nregs): Likewise. (Vendreg): Likewise. (Vetype): Likewise. (Vtype): Likewise. (VSTRUCT_2D): New mode iterator. (VSTRUCT_2DNX): Likewise. (VSTRUCT_2DX): Likewise. (VSTRUCT_2Q): Likewise. (VSTRUCT_2QD): Likewise. (VSTRUCT_3D): Likewise. (VSTRUCT_3DNX): Likewise. (VSTRUCT_3DX): Likewise. (VSTRUCT_3Q): Likewise. (VSTRUCT_3QD): Likewise. (VSTRUCT_4D): Likewise. (VSTRUCT_4DNX): Likewise. (VSTRUCT_4DX): Likewise. (VSTRUCT_4Q): Likewise. (VSTRUCT_4QD): Likewise. (VSTRUCT_D): Likewise. (VSTRUCT_Q): Likewise. (VSTRUCT_QD): Likewise. (VSTRUCT_ELT): New mode attribute. (vstruct_elt): Likewise. * genmodes.c (VECTOR_MODE): Add default prefix and order parameters. (VECTOR_MODE_WITH_PREFIX): Define. 
(make_vector_mode): Add mode prefix and order parameters. gcc/testsuite/ChangeLog: * gcc.target/aarch64/advsimd-intrinsics/bf16_vldN_lane_2.c: Relax incorrect register number requirement. * gcc.target/aarch64/sve/pcs/struct_3_256.c: Accept equivalent codegen with fmov.	2021-11-04 14:54:36 +00:00
Jonathan Wright	4e5929e457	gcc/expmed.c: Ensure vector modes are tieable before extraction Extracting a bitfield from a vector can be achieved by casting the vector to a new type whose elements are the same size as the desired bitfield, before generating a subreg. However, this is only an optimization if the original vector can be accessed in the new machine mode without first being copied - a condition denoted by the TARGET_MODES_TIEABLE_P hook. This patch adds a check to make sure that the vector modes are tieable before attempting to generate a subreg. This is a necessary prerequisite for a subsequent patch that will introduce new machine modes for Arm Neon vector-tuple types. gcc/ChangeLog: 2021-10-11 Jonathan Wright <jonathan.wright@arm.com> * expmed.c (extract_bit_field_1): Ensure modes are tieable.	2021-11-04 14:51:09 +00:00
Jonathan Wright	2fc2026061	gcc/expr.c: Remove historic workaround for broken SIMD subreg A long time ago, using a parallel to take a subreg of a SIMD register was broken. This temporary fix[1] (from 2003) spilled these registers to memory and reloaded the appropriate part to obtain the subreg. The fix initially existed for the benefit of the PowerPC E500 - a platform for which GCC removed support a number of years ago. Regardless, a proper mechanism for taking a subreg of a SIMD register exists now anyway. This patch removes the workaround thus preventing SIMD registers being dumped to memory unnecessarily - which sometimes can't be fixed by later passes. [1] https://gcc.gnu.org/pipermail/gcc-patches/2003-April/102099.html gcc/ChangeLog: 2021-10-11 Jonathan Wright <jonathan.wright@arm.com> * expr.c (emit_group_load_1): Remove historic workaround.	2021-11-04 14:50:55 +00:00
Jonathan Wright	8197ab94b4	aarch64: Move Neon vector-tuple type declaration into the compiler Declare the Neon vector-tuple types inside the compiler instead of in the arm_neon.h header. This is a necessary first step before adding corresponding machine modes to the AArch64 backend. The vector-tuple types are implemented using a #pragma. This means initialization of builtin functions that have vector-tuple types as arguments or return values has to be delayed until the #pragma is handled. gcc/ChangeLog: 2021-09-10 Jonathan Wright <jonathan.wright@arm.com> * config/aarch64/aarch64-builtins.c (aarch64_init_simd_builtins): Factor out main loop to... (aarch64_init_simd_builtin_functions): This new function. (register_tuple_type): Define. (aarch64_scalar_builtin_type_p): Define. (handle_arm_neon_h): Define. * config/aarch64/aarch64-c.c (aarch64_pragma_aarch64): Handle pragma for arm_neon.h. * config/aarch64/aarch64-protos.h (aarch64_advsimd_struct_mode_p): Declare. (handle_arm_neon_h): Likewise. * config/aarch64/aarch64.c (aarch64_advsimd_struct_mode_p): Remove static modifier. * config/aarch64/arm_neon.h (target): Remove Neon vector structure type definitions.	2021-11-04 14:50:40 +00:00
H.J. Lu	fbe58ba97a	x86: Check leal/addl gcc.target/i386/amxtile-3.c for x32 Check leal and addl for x32 to fix: FAIL: gcc.target/i386/amxtile-3.c scan-assembler addq[ \\t]+\\$12 FAIL: gcc.target/i386/amxtile-3.c scan-assembler leaq[ \\t]+4 FAIL: gcc.target/i386/amxtile-3.c scan-assembler leaq[ \\t]+8 * gcc.target/i386/amxtile-3.c: Check leal/addl for x32.	2021-11-04 07:41:52 -07:00
Aldy Hernandez	6a9678f0b3	path solver: Prefer range_of_expr instead of range_on_edge. The range_of_expr method provides better caching than range_on_edge. If we have a statement, we can just use it and avoid the range_on_edge dance. Plus we can use all the range_of_expr fanciness. Tested on x86-64 and ppc64le Linux with the usual regstrap. I also verified that the before and after number of threads was the same or greater in a suite of .ii files from a bootstrap. gcc/ChangeLog: PR tree-optimization/102943 * gimple-range-path.cc (path_range_query::range_on_path_entry): Prefer range_of_expr unless there are no statements in the BB.	2021-11-04 15:39:03 +01:00
Aldy Hernandez	e441162269	Avoid repeating calculations in threader. We already attempt to resolve the current path on entry to find_paths_to_name(), so there's no need to do so again for each exported range since nothing has changed. Removing this redundant calculation avoids 22% of calls into the path solver. Tested on x86-64 and ppc64le Linux with the usual regstrap. I also verified that the before and after number of threads was the same in a suite of .ii files from a bootstrap. gcc/ChangeLog: PR tree-optimization/102943 * tree-ssa-threadbackward.c (back_threader::find_paths_to_names): Avoid duplicate calculation of paths.	2021-11-04 15:37:35 +01:00
Aldy Hernandez	5ea1ce43b6	path solver: Only compute relations for imports. We are currently calculating implicit PHI relations for all PHI arguments. This creates unnecessary work, as we only care about SSA names in the import bitmap. Similarly for inter-path relationals. We can avoid things not in the bitmap. Tested on x86-64 and ppc64le Linux with the usual regstrap. I also verified that the before and after number of threads was the same in a suite of .ii files from a bootstrap. gcc/ChangeLog: PR tree-optimization/102943 * gimple-range-path.cc (path_range_query::compute_phi_relations): Only compute relations for SSA names in the import list. (path_range_query::compute_outgoing_relations): Same. * gimple-range-path.h (path_range_query::import_p): New.	2021-11-04 15:37:35 +01:00
H.J. Lu	333efaea63	libffi: Add --enable-cet to configure When --enable-cet is used to configure GCC, enable Intel CET in libffi. * Makefile.am (AM_CFLAGS): Add $(CET_FLAGS). (AM_CCASFLAGS): Likewise. * configure.ac (CET_FLAGS): Add GCC_CET_FLAGS and AC_SUBST. * Makefile.in: Regenerate. * aclocal.m4: Likewise. * configure: Likewise. * include/Makefile.in: Likewise. * man/Makefile.in: Likewise. * testsuite/Makefile.in: Likewise.	2021-11-04 07:19:22 -07:00
Martin Liska	af1bfcc04c	Add -v option for git_check_commit.py. Doing so, one can see: $ git gcc-verify a50914d2111c72d2cd5cb8cf474133f4f85a25f6 -v Checking a50914d2111c72d2cd5cb8cf474133f4f85a25f6: FAILED ERR: unchanged file mentioned in a ChangeLog: "gcc/common.opt" ERR: unchanged file mentioned in a ChangeLog (did you mean "gcc/testsuite/g++.dg/pr102955.C"?): "gcc/testsuite/gcc.dg/pr102955.c" - gcc/testsuite/gcc.dg/pr102955.c ? ^^ ^ + gcc/testsuite/g++.dg/pr102955.C ? ^^ ^ contrib/ChangeLog: * gcc-changelog/git_check_commit.py: Add -v option. * gcc-changelog/git_commit.py: Print verbose diff for wrong filename.	2021-11-04 15:01:52 +01:00
Tamar Christina	5914a7b5c6	testsuite: Add more guards to complex tests This test hopefully fixes all the remaining target specific test issues by 1: Unrolling all add testcases by 16 using pragma GCC unroll 2. On armhf use Adv.SIMD instead of MVE to test. MVE's autovec is too incomplete to be a general test target. 3. Add appropriate vect_<type> and float<size> guards on testcases. gcc/testsuite/ChangeLog: PR testsuite/103042 * gcc.dg/vect/complex/bb-slp-complex-add-pattern-int.c: Update guards. * gcc.dg/vect/complex/bb-slp-complex-add-pattern-long.c: Likewise. * gcc.dg/vect/complex/bb-slp-complex-add-pattern-short.c: Likewise. * gcc.dg/vect/complex/bb-slp-complex-add-pattern-unsigned-int.c: Likewise. * gcc.dg/vect/complex/bb-slp-complex-add-pattern-unsigned-long.c: Likewise. * gcc.dg/vect/complex/bb-slp-complex-add-pattern-unsigned-short.c: Likewise. * gcc.dg/vect/complex/complex-add-pattern-template.c: Likewise. * gcc.dg/vect/complex/complex-add-template.c: Likewise. * gcc.dg/vect/complex/complex-operations-run.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-add-double.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-add-float.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-add-half-float.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-add-pattern-double.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-add-pattern-float.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-add-pattern-half-float.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-double.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-float.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-half-float.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-double.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-float.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-half-float.c: Likewise. 
* gcc.dg/vect/complex/fast-math-bb-slp-complex-mul-double.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-mul-float.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-mul-half-float.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-add-double.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-add-float.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-add-half-float.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-add-pattern-double.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-add-pattern-float.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-add-pattern-half-float.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-mla-double.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-mla-float.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-mla-half-float.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-mls-double.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-mls-float.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-mls-half-float.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-mul-double.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-mul-float.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-mul-half-float.c: Likewise. * gcc.dg/vect/complex/vect-complex-add-pattern-byte.c: Likewise. * gcc.dg/vect/complex/vect-complex-add-pattern-int.c: Likewise. * gcc.dg/vect/complex/vect-complex-add-pattern-long.c: Likewise. * gcc.dg/vect/complex/vect-complex-add-pattern-short.c: Likewise. * gcc.dg/vect/complex/vect-complex-add-pattern-unsigned-byte.c: Likewise. * gcc.dg/vect/complex/vect-complex-add-pattern-unsigned-int.c: Likewise. * gcc.dg/vect/complex/vect-complex-add-pattern-unsigned-long.c: Likewise. * gcc.dg/vect/complex/vect-complex-add-pattern-unsigned-short.c: Likewise.	2021-11-04 13:43:36 +00:00
David Malcolm	347682ea46	analyzer: fix ICE in sm_state_map::dump when dumping trees gcc/analyzer/ChangeLog: * program-state.cc (sm_state_map::dump): Use default_tree_printer as format decoder. Signed-off-by: David Malcolm <dmalcolm@redhat.com>	2021-11-04 09:32:05 -04:00
Richard Biener	d136035016	rtl-optimization/103075 - avoid ICEing on unfolded int-to-float converts The following avoids asserting in exact_int_to_float_conversion_p that the argument is not constant which it in fact can be with -frounding-math and inexact int-to-float conversions. Say so. 2021-11-04 Richard Biener <rguenther@suse.de> PR rtl-optimization/103075 * simplify-rtx.c (exact_int_to_float_conversion_p): Return false for a VOIDmode operand. * gcc.dg/pr103075.c: New testcase.	2021-11-04 13:33:19 +01:00
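The "inexact int-to-float conversions" mentioned in the entry above can be illustrated with a small sketch. It is not the code in simplify-rtx.c; `converts_exactly_to_float` is a hypothetical helper that only shows which 32-bit integers survive a conversion to `float` without rounding, assuming IEEE 754 single precision.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not GCC's exact_int_to_float_conversion_p):
// returns true when `value` is representable as a float without rounding.
// A double holds every 32-bit integer exactly, so it serves as the reference.
bool converts_exactly_to_float(std::int32_t value) {
    float converted = static_cast<float>(value);
    return static_cast<double>(converted) == static_cast<double>(value);
}
```

A float has a 24-bit significand, so 16777216 (2^24) converts exactly while 16777217 does not; the result of the second conversion depends on the rounding mode, which is why such a conversion cannot be folded under -frounding-math.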
Richard Sandiford	d43fc1df73	aarch64: Move more code into aarch64_vector_costs This patch moves more code into aarch64_vector_costs and reuses some of the information that is now available in the base class. I'm planning to significantly rework this code, with more hooks into the vectoriser, but this seemed worth doing as a first step. gcc/ * config/aarch64/aarch64.c (aarch64_vector_costs): Make member variables private and add "m_" to their names. Remove is_loop. (aarch64_record_potential_advsimd_unrolling): Replace with... (aarch64_vector_costs::record_potential_advsimd_unrolling): ...this. (aarch64_analyze_loop_vinfo): Replace with... (aarch64_vector_costs::analyze_loop_vinfo): ...this. Move initialization of (m_)vec_flags to add_stmt_cost. (aarch64_analyze_bb_vinfo): Delete. (aarch64_count_ops): Replace with... (aarch64_vector_costs::count_ops): ...this. (aarch64_vector_costs::add_stmt_cost): Set m_vec_flags, using m_costing_for_scalar to test whether we're costing scalar or vector code. (aarch64_adjust_body_cost_sve): Replace with... (aarch64_vector_costs::adjust_body_cost_sve): ...this. (aarch64_adjust_body_cost): Replace with... (aarch64_vector_costs::adjust_body_cost): ...this. (aarch64_vector_costs::finish_cost): Use m_vinfo instead of is_loop.	2021-11-04 12:31:17 +00:00
Richard Sandiford	6239dd0512	vect: Convert cost hooks to classes The current vector cost interface has quite a bit of redundancy built in. Each target that defines its own hooks has to replicate the basic unsigned[3] management. Currently each target also duplicates the cost adjustment for inner loops. This patch instead defines a vector_costs class for holding the scalar or vector cost and allows targets to subclass it. There is then only one costing hook: to create a new costs structure of the appropriate type. Everything else can be virtual functions, with common concepts implemented in the base class rather than in each target's derivation. This might seem like excess C++-ification, but it shaves ~100 LOC. I've also got some follow-on changes that become significantly easier with this patch. Maybe it could help with things like weighting blocks based on frequency too. This will clash with Andre's unrolling patches. His patches have priority so this patch should queue behind them. The x86 and rs6000 parts fully convert to a self-contained class. The equivalent aarch64 changes are more complex, so this patch just does the bare minimum. A later patch will rework the aarch64 bits. gcc/ * target.def (targetm.vectorize.init_cost): Replace with... (targetm.vectorize.create_costs): ...this. (targetm.vectorize.add_stmt_cost): Delete. (targetm.vectorize.finish_cost): Likewise. (targetm.vectorize.destroy_cost_data): Likewise. * doc/tm.texi.in (TARGET_VECTORIZE_INIT_COST): Replace with... (TARGET_VECTORIZE_CREATE_COSTS): ...this. (TARGET_VECTORIZE_ADD_STMT_COST): Delete. (TARGET_VECTORIZE_FINISH_COST): Likewise. (TARGET_VECTORIZE_DESTROY_COST_DATA): Likewise. * doc/tm.texi: Regenerate. * tree-vectorizer.h (vec_info::vec_info): Remove target_cost_data parameter. (vec_info::target_cost_data): Change from a void * to a vector_costs . (vector_costs): New class. (init_cost): Take a vec_info and return a vector_costs. (dump_stmt_cost): Remove data parameter. 
(add_stmt_cost): Replace vinfo and data parameters with a vector_costs. (add_stmt_costs): Likewise. (finish_cost): Replace data parameter with a vector_costs. (destroy_cost_data): Delete. tree-vectorizer.c (dump_stmt_cost): Remove data argument and don't print it. (vec_info::vec_info): Remove the target_cost_data parameter and initialize the member variable to null instead. (vec_info::~vec_info): Delete target_cost_data instead of calling destroy_cost_data. (vector_costs::add_stmt_cost): New function. (vector_costs::finish_cost): Likewise. (vector_costs::record_stmt_cost): Likewise. (vector_costs::adjust_cost_for_freq): Likewise. * tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Update call to vec_info::vec_info. (vect_compute_single_scalar_iteration_cost): Update after above changes to costing interface. (vect_analyze_loop_operations): Likewise. (vect_estimate_min_profitable_iters): Likewise. (vect_analyze_loop_2): Initialize LOOP_VINFO_TARGET_COST_DATA at the start_over point, where it needs to be recreated after trying without slp. Update retry code accordingly. * tree-vect-slp.c (_bb_vec_info::_bb_vec_info): Update call to vec_info::vec_info. (vect_slp_analyze_operation): Update after above changes to costing interface. (vect_bb_vectorization_profitable_p): Likewise. * targhooks.h (default_init_cost): Replace with... (default_vectorize_create_costs): ...this. (default_add_stmt_cost): Delete. (default_finish_cost, default_destroy_cost_data): Likewise. * targhooks.c (default_init_cost): Replace with... (default_vectorize_create_costs): ...this. (default_add_stmt_cost): Delete, moving logic to vector_costs instead. (default_finish_cost, default_destroy_cost_data): Delete. * config/aarch64/aarch64.c (aarch64_vector_costs): Inherit from vector_costs. Add a constructor. (aarch64_init_cost): Replace with... (aarch64_vectorize_create_costs): ...this. (aarch64_add_stmt_cost): Replace with... (aarch64_vector_costs::add_stmt_cost): ...this. 
Use record_stmt_cost to adjust the cost for inner loops. (aarch64_finish_cost): Replace with... (aarch64_vector_costs::finish_cost): ...this. (aarch64_destroy_cost_data): Delete. (TARGET_VECTORIZE_INIT_COST): Replace with... (TARGET_VECTORIZE_CREATE_COSTS): ...this. (TARGET_VECTORIZE_ADD_STMT_COST): Delete. (TARGET_VECTORIZE_FINISH_COST): Likewise. (TARGET_VECTORIZE_DESTROY_COST_DATA): Likewise. * config/i386/i386.c (ix86_vector_costs): New structure. (ix86_init_cost): Replace with... (ix86_vectorize_create_costs): ...this. (ix86_add_stmt_cost): Replace with... (ix86_vector_costs::add_stmt_cost): ...this. Use adjust_cost_for_freq to adjust the cost for inner loops. (ix86_finish_cost, ix86_destroy_cost_data): Delete. (TARGET_VECTORIZE_INIT_COST): Replace with... (TARGET_VECTORIZE_CREATE_COSTS): ...this. (TARGET_VECTORIZE_ADD_STMT_COST): Delete. (TARGET_VECTORIZE_FINISH_COST): Likewise. (TARGET_VECTORIZE_DESTROY_COST_DATA): Likewise. * config/rs6000/rs6000.c (TARGET_VECTORIZE_INIT_COST): Replace with... (TARGET_VECTORIZE_CREATE_COSTS): ...this. (TARGET_VECTORIZE_ADD_STMT_COST): Delete. (TARGET_VECTORIZE_FINISH_COST): Likewise. (TARGET_VECTORIZE_DESTROY_COST_DATA): Likewise. (rs6000_cost_data): Inherit from vector_costs. Add a constructor. Drop loop_info, cost and costing_for_scalar in favor of the corresponding vector_costs member variables. Add "m_" to the names of the remaining member variables and initialize them. (rs6000_density_test): Replace with... (rs6000_cost_data::density_test): ...this. (rs6000_init_cost): Replace with... (rs6000_vectorize_create_costs): ...this. (rs6000_update_target_cost_per_stmt): Replace with... (rs6000_cost_data::update_target_cost_per_stmt): ...this. (rs6000_add_stmt_cost): Replace with... (rs6000_cost_data::add_stmt_cost): ...this. Use adjust_cost_for_freq to adjust the cost for inner loops. (rs6000_adjust_vect_cost_per_loop): Replace with... (rs6000_cost_data::adjust_vect_cost_per_loop): ...this. 
(rs6000_finish_cost): Replace with... (rs6000_cost_data::finish_cost): ...this. Group loop code into a single if statement and pass the loop_vinfo down to subroutines. (rs6000_destroy_cost_data): Delete.	2021-11-04 12:31:17 +00:00
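The design described in the entry above (one hook that creates a costs object, with everything else as virtual functions on a shared base class) can be sketched as follows. This is a simplified illustration, not GCC's `vector_costs` class: every name, the member signatures, and the inner-loop weight are assumptions made for the example.

```cpp
#include <cassert>
#include <memory>

// Hypothetical base class: owns the running total and the adjustment
// for inner loops, so targets no longer duplicate that logic.
class VectorCosts {
public:
    virtual ~VectorCosts() = default;

    // Default costing: count * unit cost, weighted if in an inner loop.
    virtual void add_stmt_cost(unsigned count, unsigned unit_cost,
                               bool in_inner_loop) {
        total_ += adjust_for_inner_loop(count * unit_cost, in_inner_loop);
    }

    unsigned finish_cost() const { return total_; }

protected:
    // Arbitrary illustrative weight for statements in an inner loop.
    static constexpr unsigned kInnerLoopWeight = 50;

    unsigned adjust_for_inner_loop(unsigned cost, bool in_inner_loop) const {
        return in_inner_loop ? cost * kInnerLoopWeight : cost;
    }

    unsigned total_ = 0;
};

// Hypothetical target subclass: tweaks the unit cost, then reuses the
// common accumulation and inner-loop handling from the base class.
class ExampleTargetCosts : public VectorCosts {
public:
    void add_stmt_cost(unsigned count, unsigned unit_cost,
                       bool in_inner_loop) override {
        VectorCosts::add_stmt_cost(count, unit_cost + 1, in_inner_loop);
    }
};

// The single "hook": create a costs object of the appropriate type.
std::unique_ptr<VectorCosts> create_costs(bool use_target_model) {
    if (use_target_model)
        return std::make_unique<ExampleTargetCosts>();
    return std::make_unique<VectorCosts>();
}
```

Because ownership lives in the returned object, a separate destroy hook is unnecessary in this sketch: the costs are released when the `std::unique_ptr` goes out of scope.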
Martin Liska	af976d90fa	libsanitizer: update LOCAL_PATCHES libsanitizer/ChangeLog: * LOCAL_PATCHES: Update git revision.	2021-11-04 13:26:58 +01:00
H.J. Lu	65ade6a34c	libsanitizer: Apply local patches	2021-11-04 13:26:17 +01:00
Martin Liska	0cedf1fb76	libsanitizer: Apply autoreconf.	2021-11-04 13:26:05 +01:00
Martin Liska	cb0437584b	libsanitizer: merge from master (c86b4503a94c277534ce4b9a5c015a6ac151b98a).	2021-11-04 13:24:53 +01:00
Aldy Hernandez	bb27f5e9ec	Convert arrays in ssa pointer_equiv_analyzer to auto_vec's. The problem in this PR is an off-by-one bug. We should've allocated num_ssa_names + 1. However, in fixing this, I noticed that num_ssa_names can change between queries, so I have replaced the array with an auto_vec and added code to grow the vector as necessary. Tested on x86-64 Linux. PR tree-optimization/103062 gcc/ChangeLog: PR tree-optimization/103062 * value-pointer-equiv.cc (ssa_equiv_stack::ssa_equiv_stack): Increase size of allocation by 1. (ssa_equiv_stack::push_replacement): Grow as needed. (ssa_equiv_stack::get_replacement): Same. (pointer_equiv_analyzer::pointer_equiv_analyzer): Same. (pointer_equiv_analyzer::~pointer_equiv_analyzer): Remove delete. (pointer_equiv_analyzer::set_global_equiv): Grow as needed. (pointer_equiv_analyzer::get_equiv): Same. (pointer_equiv_analyzer::get_equiv_expr): Remove const. * value-pointer-equiv.h (class pointer_equiv_analyzer): Remove const markers. Use auto_vec instead of tree . gcc/testsuite/ChangeLog: gcc.dg/pr103062.c: New test.	2021-11-04 11:48:04 +01:00
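The fix described in the entry above has two parts: allocate one slot more than the current name count, and grow on demand because the count can change between queries. A minimal sketch of that pattern follows, using `std::vector` in place of GCC's `auto_vec`; `GrowableTable` and its members are hypothetical names, and `-1` stands in for "no entry".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a table indexed by SSA name version.
class GrowableTable {
public:
    // Allocate num_names + 1 slots so that index num_names is valid.
    explicit GrowableTable(std::size_t num_names)
        : slots_(num_names + 1, -1) {}

    void set(std::size_t version, int value) {
        grow_to_fit(version);
        slots_[version] = value;
    }

    // Reading past the end reports "no entry" instead of overrunning.
    int get(std::size_t version) const {
        return version < slots_.size() ? slots_[version] : -1;
    }

    std::size_t size() const { return slots_.size(); }

private:
    // Grow when an index beyond the current size is written.
    void grow_to_fit(std::size_t version) {
        if (version >= slots_.size())
            slots_.resize(version + 1, -1);
    }

    std::vector<int> slots_;
};
```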
Jonathan Wakely	a45d577b2b	libstdc++: Refactor emplace-like functions in std::variant libstdc++-v3/ChangeLog: * include/std/variant (__detail::__variant::__emplace): New function template. (_Copy_assign_base::operator=): Reorder conditions to match bulleted list of effects in the standard. Use __emplace instead of _M_reset followed by _Construct. (_Move_assign_base::operator=): Likewise. (__construct_by_index): Remove. (variant::emplace): Use __emplace instead of _M_reset followed by __construct_by_index. (variant::swap): Hoist valueless cases out of visitor. Use __emplace to replace _M_reset followed by _Construct.	2021-11-04 09:36:10 +00:00
Jonathan Wakely	30ab6d9e43	libstdc++: Optimize std::variant traits and improve diagnostics By defining additional partial specializations of _Nth_type we can reduce the number of recursive instantiations needed to get from N to 0. We can also use _Nth_type in variant_alternative, to take advantage of that new optimization. By adding a static_assert to variant_alternative we get a nicer error than 'invalid use of incomplete type'. By defining partial specializations of std::variant_size_v for the common case we can avoid instantiating the std::variant_size class template. The __tuple_count class template and __tuple_count_v variable template can be simplified to a single variable template, __count. By adding a deleted constructor to the _Variant_union primary template we can (very slightly) improve diagnostics for invalid attempts to construct a std::variant with an out-of-range index. Instead of a confusing error about "too many initializers for ..." we get a call to a deleted function. By using _Nth_type instead of variant_alternative (for cv-unqualified variant types) we avoid instantiating variant_alternative. By adding deleted overloads of variant::emplace we get better diagnostics for emplace<invalid-index> or emplace<invalid-type>. Instead of getting errors explaining why each of the four overloads wasn't valid, we just get one error about calling a deleted function. libstdc++-v3/ChangeLog: * include/std/variant (_Nth_type): Define partial specializations to reduce number of instantiations. (variant_size_v): Define partial specializations to avoid instantiations. (variant_alternative): Use _Nth_type. Add static assert. (__tuple_count, __tuple_count_v): Replace with ... (__count): New variable template. (_Variant_union): Add deleted constructor. (variant::__to_type): Use _Nth_type. (variant::emplace): Use _Nth_type. Add deleted overloads for invalid types and indices.	2021-11-04 09:36:09 +00:00
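The first optimization in the entry above, extra partial specializations that shorten the recursion from N down to 0, can be sketched like this. It is not the libstdc++ `_Nth_type` definition: `nth_impl` and `nth_type_t` are hypothetical names, and the `std::enable_if_t` parameter is one assumed way of keeping the specializations unambiguous.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Primary template, declared only; the Enable slot is always void.
template <std::size_t N, typename Enable, typename... Ts>
struct nth_impl;

// Direct answers for indices 0, 1 and 2: no recursion needed.
template <typename T0, typename... Rest>
struct nth_impl<0, void, T0, Rest...> { using type = T0; };

template <typename T0, typename T1, typename... Rest>
struct nth_impl<1, void, T0, T1, Rest...> { using type = T1; };

template <typename T0, typename T1, typename T2, typename... Rest>
struct nth_impl<2, void, T0, T1, T2, Rest...> { using type = T2; };

// For N >= 3, drop three types per step instead of one.
template <std::size_t N, typename T0, typename T1, typename T2,
          typename... Rest>
struct nth_impl<N, std::enable_if_t<(N >= 3)>, T0, T1, T2, Rest...>
    : nth_impl<N - 3, void, Rest...> {};

template <std::size_t N, typename... Ts>
using nth_type_t = typename nth_impl<N, void, Ts...>::type;
```

With the classic one-type-per-step recursion, looking up index 4 instantiates five specializations; in this sketch it instantiates two.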
Jonathan Wakely	7551a99574	libstdc++: Fix handling of const types in std::variant [PR102912] Prior to r12-4447 (implementing P2231R1 constexpr changes) we didn't construct the correct member of the union in __variant_construct_single, we just plopped an object in the memory occupied by the union: void* __storage = std::addressof(__lhs._M_u); using _Type = remove_reference_t<decltype(__rhs_mem)>; ::new (__storage) _Type(std::forward<decltype(__rhs_mem)>(__rhs_mem)); We didn't care whether we had variant<int, const int>, we would just place an int (or const int) into the storage, and then set the _M_index to say which one it was. In the new constexpr-friendly code we use std::construct_at to construct the union object, which constructs the active member of the right type. But now we need to know exactly the right type. We have to distinguish between alternatives of type int and const int, and we have to be able to find a const int (or const std::string, as in the OP) among the alternatives. So my change from remove_reference_t<decltype(__rhs_mem)> to remove_cvref_t<_Up> was wrong. It strips the const from const int, and then we can't find the index of the const int alternative. But just using remove_reference_t doesn't work either. When the copy assignment operator of std::variant<int> uses __variant_construct_single it passes a const int& as __rhs_mem, but if we don't strip the const then we try to find const int among the alternatives, and that fails. Similarly for the copy constructor, which also uses a const int& as the initializer for a non-const int alternative. The root cause of the problem is that __variant_construct_single doesn't know the index of the type it's supposed to construct, and the new _Variant_storage::__index_of<_Type> helper doesn't work if __rhs_mem and the alternative being constructed have different const-qualification. We need to replace __variant_construct_single with something that knows the index of the alternative being constructed. 
All uses of that function do actually know the index, but that context is lost by the time we call __variant_construct_single. This patch replaces that function and __variant_construct, inlining their effects directly into the callers. libstdc++-v3/ChangeLog: PR libstdc++/102912 * include/std/variant (_Variant_storage::__index_of): Remove. (__variant_construct_single): Remove. (__variant_construct): Remove. (_Copy_ctor_base::_Copy_ctor_base(const _Copy_ctor_base&)): Do construction directly instead of using __variant_construct. (_Move_ctor_base::_Move_ctor_base(_Move_ctor_base&&)): Likewise. (_Move_ctor_base::_M_destructive_move()): Remove. (_Move_ctor_base::_M_destructive_copy()): Remove. (_Copy_assign_base::operator=(const _Copy_assign_base&)): Do construction directly instead of using _M_destructive_copy. (variant::swap): Do construction directly instead of using _M_destructive_move. * testsuite/20_util/variant/102912.cc: New test.	2021-11-04 09:36:09 +00:00
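The lookup problem described in the entry above can be reproduced with a small type-to-index helper. This is a sketch, not the removed `_Variant_storage::__index_of`; `index_of` is a hypothetical function that returns the position of the first exact match, or the number of alternatives when there is none.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Base case: no alternatives left, so contribute 0. The overall result
// then equals the number of alternatives, meaning "not found".
template <typename T>
constexpr std::size_t index_of() { return 0; }

// Compare T against the first alternative exactly (const matters),
// otherwise continue with the remaining alternatives.
template <typename T, typename First, typename... Rest>
constexpr std::size_t index_of() {
    return std::is_same<T, First>::value ? 0 : 1 + index_of<T, Rest...>();
}
```

For alternatives `<int, const int>`, searching for `const int` yields index 1, but searching after stripping `const` yields index 0, the wrong alternative. For a single `const int` alternative, searching for plain `int` finds nothing at all. Both failures disappear once the caller passes the index it already knows, which is the approach the patch takes.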
Richard Biener	fa62db42b9	VN/PRE TLC This removes an always true parameter of vn_nary_op_insert_into and moves valueization to the two callers of vn_nary_op_compute_hash instead of doing it therein where this function name does not suggest such thing. Also remove extra valueization from PRE phi-translation. 2021-11-03 Richard Biener <rguenther@suse.de> * tree-ssa-sccvn.c (vn_nary_op_insert_into): Remove always true parameter and inline valueization. (vn_nary_op_lookup_1): Inline valueization from ... (vn_nary_op_compute_hash): ... here and remove it here. * tree-ssa-pre.c (phi_translate_1): Do not valueize before vn_nary_lookup_pieces. (get_representative_for): Mark created SSA representatives as visited.	2021-11-04 10:15:36 +01:00
Jiufu Guo	f75e56f46d	Update dg-require-effective-target for pr101145 cases For test cases pr101145.c, some types are not able to be vectorized on some targets. This patch updates dg-require-effective-target according to test cases. gcc/testsuite/ChangeLog: gcc.dg/vect/pr101145_1.c: Update case. * gcc.dg/vect/pr101145_2.c: Update case. * gcc.dg/vect/pr101145_3.c: Update case.	2021-11-04 17:13:14 +08:00
Martin Liska	b9003cf734	Disable warning for an ASAN test-case. gcc/testsuite/ChangeLog: * g++.dg/asan/asan_test.C: Disable one warning.	2021-11-04 09:54:00 +01:00
Richard Sandiford	518f865f4b	simplify-rtx: Fix vec_select index check Vector lane indices follow memory (array) order, so lane 0 corresponds to the high element rather than the low element on big-endian targets. This was causing quite a few execution failures on aarch64_be, such as gcc.c-torture/execute/pr47538.c. gcc/ * simplify-rtx.c (simplify_context::simplify_gen_vec_select): Assert that the operand has a vector mode. Use subreg_lowpart_offset to test whether an index corresponds to the low part. gcc/testsuite/ * gcc.dg/rtl/aarch64/big-endian-cse-1.c: New test.	2021-11-04 08:28:44 +00:00
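The endianness point in the entry above can be sketched with two small helpers. They are hypothetical and do not use `subreg_lowpart_offset`; they only encode the stated rule that lane indices follow memory order, so the low part is lane 0 on little-endian targets and the last lane on big-endian targets.

```cpp
#include <cassert>

// Hypothetical helper: which lane holds the low-order part of a vector
// with `num_lanes` lanes, given that lanes are numbered in memory order.
unsigned lowpart_lane_index(unsigned num_lanes, bool big_endian) {
    return big_endian ? num_lanes - 1 : 0;
}

// Hypothetical check: does selecting `lane` pick the low part?
// Testing `lane == 0` unconditionally is the mistake being fixed.
bool selects_lowpart(unsigned lane, unsigned num_lanes, bool big_endian) {
    return lane == lowpart_lane_index(num_lanes, big_endian);
}
```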
Richard Sandiford	95318d469f	Fix RTL frontend handling of const_vectors The RTL frontend makes sure that CONST_INTs use shared rtxes where appropriate. We should do the same thing for CONST_VECTORs, reusing CONST0_RTX, CONST1_RTX and CONSTM1_RTX. This also has the effect of setting CONST_VECTOR_NELTS_PER_PATTERN and CONST_VECTOR_NPATTERNS. While looking at where to add that, I noticed we had some dead #includes in read-rtl.c. Some of the stuff that read-rtl-function.c does was once in that file instead. gcc/ * read-rtl.c: Remove dead !GENERATOR_FILE block. * read-rtl-function.c (function_reader::consolidate_singletons): Generate canonical CONST_VECTORs.	2021-11-04 08:28:44 +00:00
liuhongt	bc9c8e5f8a	Extend vternlog define_insn_and_split to memory_operand to enable more optimization. gcc/ChangeLog: PR target/101989 * config/i386/predicates.md (reg_or_notreg_operand): Rename to .. (regmem_or_bitnot_regmem_operand): .. and extend to handle memory_operand. * config/i386/sse.md (<avx512>_vpternlog<mode>_1): Force_reg the operands which are required to be register_operand. (<avx512>_vpternlog<mode>_2): Ditto. (<avx512>_vpternlog<mode>_3): Ditto. (<avx512>_vternlog<mode>_all): Disallow embedded broadcast for vector HFmodes since it's not a real AVX512FP16 instruction. gcc/testsuite/ChangeLog: * gcc.target/i386/pr101989-3.c: New test.	2021-11-04 16:09:52 +08:00
liuhongt	22ce7382fc	Simplify (trunc)copysign((extend)a, (extend)b) to .COPYSIGN (a,b). a and b are same type as the truncation type and have less precision than extend type. gcc/ChangeLog: PR target/102464 * match.pd: simplify (trunc)copysign((extend)a, (extend)b) to .COPYSIGN (a,b) when a and b are same type as the truncation type and have less precision than extend type. gcc/testsuite/ChangeLog: * gcc.target/i386/pr102464-copysign-1.c: New test.	2021-11-04 16:09:46 +08:00
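The equivalence behind the simplification above can be checked at the source level. The sketch below is not the match.pd pattern; the two function names are hypothetical, with `float` standing in for the narrow type and `double` for the extended type.

```cpp
#include <cassert>
#include <cmath>

// The long form: extend both operands, take copysign, truncate back.
float copysign_via_double(float a, float b) {
    return static_cast<float>(
        std::copysign(static_cast<double>(a), static_cast<double>(b)));
}

// The simplified form: copysign directly in the narrow type.
float copysign_direct(float a, float b) {
    return std::copysign(a, b);
}
```

Extending a float to double is exact and copysign only changes the sign bit, so the truncation back to float cannot round; the two functions return the same value for the same inputs.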
Richard Biener	d0d428c4ce	Update TARGET_MEM_REF documentation This updates the internals manual documentation of TARGET_MEM_REF and amends MEM_REF. The former was seriously out of date. 2021-11-04 Richard Biener <rguenther@suse.de> gcc/ * doc/generic.texi: Update TARGET_MEM_REF and MEM_REF documentation.	2021-11-04 08:41:58 +01:00
Hongyu Wang	3fd0723f0a	i386: Auto vectorize sdot_prod, usdot_prod with VNNI instruction. AVX512VNNI/AVXVNNI has vpdpwssd for HImode, vpdpbusd for QImode, so adjust HImode sdot_prod expander and add QImode usdot_prod expander to enhance vectorization for dotprod. gcc/ChangeLog: * config/i386/sse.md (VI2_AVX512VNNIBW): New mode iterator. (VI1_AVX512VNNI): Likewise. (SDOT_VPDP_SUF): New mode_attr. (VI1SI): Likewise. (vi1si): Likewise. (sdot_prod<mode>): Use VI2_AVX512F iterator, expand to vpdpwssd when VNNI targets available. (usdot_prod<mode>): New expander for vector QImode. gcc/testsuite/ChangeLog: * gcc.target/i386/vnni-auto-vectorize-1.c: New test. * gcc.target/i386/vnni-auto-vectorize-2.c: Ditto.	2021-11-04 14:41:30 +08:00
Hongyu Wang	7fcc22dae7	i386: Fix wrong result for AMX-TILE intrinsic when parsing expression. _tile_loadd, _tile_stored, _tile_streamloadd intrinsics are defined by macro, so the parameters should be wrapped by parentheses to accept expressions. gcc/ChangeLog: * config/i386/amxtileintrin.h (_tile_loadd_internal): Add parentheses to base and stride. (_tile_stream_loadd_internal): Likewise. (_tile_stored_internal): Likewise. gcc/testsuite/ChangeLog: * gcc.target/i386/amxtile-3.c: New test.	2021-11-04 13:01:16 +08:00
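The class of bug fixed in the entry above is easy to reproduce with any function-like macro. The macros below are hypothetical and unrelated to the real definitions in amxtileintrin.h; they only show how an unparenthesized parameter changes the result when the argument is an expression.

```cpp
#include <cassert>

// Hypothetical macros for illustration only.
#define SCALE_UNSAFE(stride) (stride * 4)    // parameter not parenthesized
#define SCALE_SAFE(stride)   ((stride) * 4)  // parameter parenthesized

// Expands to (a + b * 4): the multiplication binds tighter than the +.
int scale_unsafe(int a, int b) { return SCALE_UNSAFE(a + b); }

// Expands to ((a + b) * 4): the argument is evaluated as a unit.
int scale_safe(int a, int b) { return SCALE_SAFE(a + b); }
```

With `a = 1` and `b = 2`, the unsafe form yields 9 and the safe form yields 12.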
Marek Polacek	cd389e5f94	testsuite: Fix g++.dg/opt/pr102970.C This test uses a generic lambda, only available since C++14, so don't run it in earlier modes. gcc/testsuite/ChangeLog: * g++.dg/opt/pr102970.C: Only run in C++14 and up.	2021-11-03 20:40:28 -04:00
GCC Administrator	18ae471f7b	Daily bump.	2021-11-04 00:16:32 +00:00
Maciej W. Rozycki	c79399c7e1	MAINTAINERS: Clarify the policy WRT the Write After Approval list * MAINTAINERS: Clarify the policy WRT the Write After Approval list.	2021-11-03 17:05:48 +00:00
Maciej W. Rozycki	a31056e919	RISC-V: Fix register class subset checks for CLASS_MAX_NREGS Fix the register class subset checks in the determination of the maximum number of consecutive registers needed to hold a value of a given mode. The number depends on whether a register is a general-purpose or a floating-point register, so check whether the register class requested is a subset (argument 1 to `reg_class_subset_p') rather than superset (argument 2) of GR_REGS or FP_REGS class respectively. gcc/ * config/riscv/riscv.c (riscv_class_max_nregs): Swap the arguments to `reg_class_subset_p'.	2021-11-03 17:05:48 +00:00
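The argument-order issue in the entry above can be sketched with register classes modelled as bit sets. This is not the RISC-V backend code: `RegSet`, `is_subset`, and the example class masks are all hypothetical, chosen only to show that a subset test is not symmetric.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model: one bit per hard register.
using RegSet = std::uint32_t;

// Hypothetical register classes for illustration.
constexpr RegSet kGeneralRegs = 0x00FFu;  // all general-purpose registers
constexpr RegSet kCallerSaved = 0x000Fu;  // a smaller class inside them
constexpr RegSet kAllRegs     = 0xFFFFu;  // general plus floating-point

// True when every register in `candidate` is also in `container`.
// The candidate (the requested class) goes first, the container second.
bool is_subset(RegSet candidate, RegSet container) {
    return (candidate & ~container) == 0;
}

// The requested class is treated as general-purpose only when it is
// contained in kGeneralRegs, not when it merely contains kGeneralRegs.
bool is_general_class(RegSet requested) {
    return is_subset(requested, kGeneralRegs);
}
```

Swapping the arguments would accept `kAllRegs` as a general-purpose class and reject `kCallerSaved`, which is the opposite of the intended result.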
Jonathan Wakely	1e7a269856	libstdc++: Fix regression in std::list::sort [PR66742] The standard does not require const-correct comparisons in list::sort. libstdc++-v3/ChangeLog: PR libstdc++/66742 * include/bits/list.tcc (list::sort): Use mutable iterators for comparisons. * include/bits/stl_list.h (_Scratch_list::_Ptr_cmp): Likewise. * testsuite/23_containers/list/operations/66742.cc: Check non-const comparisons.	2021-11-03 15:15:27 +00:00
Joseph Myers	600dcd74b8	c: Fold implicit integer-to-floating conversions in static initializers with -frounding-math [PR103031] Recent fixes to avoid inappropriate folding of some conversions to floating-point types with -frounding-math also prevented such folding in C static initializers, when folding (in the default rounding mode, exceptions discarded) is required for correctness. Folding for static initializers is handled via functions in fold-const.c calling START_FOLD_INIT and END_FOLD_INIT to adjust flags such as flag_rounding_math that should not apply in static initializer context, but no such function was being called for the folding of these implicit conversions to the type of the object being initialized, only for explicit conversions as part of the initializer. Arrange for relevant folding (a fold call in convert, in particular) to use this special initializer handling (via a new fold_init function, in particular). Because convert is used by language-independent code but defined in each front end, this isn't as simple as just adding a new default argument to it. Instead, I added a new convert_init function; that then gets called by c-family code, and C and C++ need convert_init implementations (the C++ one does nothing different from convert and will never actually get called because the new convert_and_check argument will never be true from C++), but other languages don't. Bootstrapped with no regressions for x86_64-pc-linux-gnu. gcc/ PR c/103031 * fold-const.c (fold_init): New function. * fold-const.h (fold_init): New prototype. gcc/c-family/ PR c/103031 * c-common.c (convert_and_check): Add argument init_const. Call convert_init if init_const. * c-common.h (convert_and_check): Update prototype. (convert_init): New prototype. gcc/c/ PR c/103031 * c-convert.c (c_convert): New function, based on convert. (convert): Make into wrapper of c_convert. (convert_init): New function. * c-typeck.c (enum impl_conv): Add ic_init_const. (convert_for_assignment): Handle ic_init_const like ic_init. Add new argument to convert_and_check call. (digest_init): Pass ic_init_const to convert_for_assignment for initializers required to be constant. gcc/cp/ PR c/103031 * cvt.c (convert_init): New function. gcc/testsuite/ PR c/103031 * gcc.dg/init-rounding-math-1.c: New test.	2021-11-03 14:59:22 +00:00
Andrew MacLeod	502ffb1f38	Switch vrp2 to ranger. This patch flips the default for the VRP2 pass to execute ranger vrp rather than the assert_expr version of VRP. * params.opt (param_vrp2_mode): Make ranger the default for VRP2.	2021-11-03 10:37:24 -04:00
Andrew MacLeod	1410b20801	Testcase adjustments for pass vrp1. Unify testcases for the vrp1 pass so they will work with the output from either VRP or ranger. gcc/testsuite/ * gcc.dg/tree-ssa/pr23744.c: Tweak output checks. * gcc.dg/tree-ssa/vrp07.c: Ditto. * gcc.dg/tree-ssa/vrp08.c: Ditto. * gcc.dg/tree-ssa/vrp09.c: Ditto. * gcc.dg/tree-ssa/vrp20.c: Ditto. * gcc.dg/tree-ssa/vrp92.c: Ditto. * jit.dg/test-sum-of-squares.c: Ditto.	2021-11-03 10:13:32 -04:00
Andrew MacLeod	6d936684fc	For ranges, PHIs don't need to process arg == def. If an argument of a phi is the same as the DEF of the phi, then the range on the incoming edge doesn't need to be taken into account since it can't be anything other than itself. * gimple-range-fold.cc (fold_using_range::range_of_phi): Don't import a range from edge if arg == phidef.	2021-11-03 10:13:32 -04:00
Andrew MacLeod	b18394ce15	Check for constant builtin value first. The original code imported from EVRP for evaluating built_in_constant_p didn't check to see if the value was a constant before checking the inlining flag. Now we check for a constant first. * gimple-range-fold.cc (fold_using_range::range_of_builtin_call): Test for constant before any other processing.	2021-11-03 10:13:32 -04:00

189326 Commits