OpenE2K/gcc - gcc - Expired Mentality Git

Author	SHA1	Message	Date
Richard Biener	716a583692	c++/102228 - make lookup_anon_field O(1) For the testcase in PR101555 lookup_anon_field takes the majority of parsing time followed by get_class_binding_direct/fields_linear_search which is PR83309. The situation with anon aggregates is particularly dire when we need to build accesses to their members and the anon aggregates are nested. There for each such access we recursively build sub-accesses to the anon aggregate FIELD_DECLs bottom-up, DFS searching for them. That's inefficient since as I believe there's a 1:1 relationship between anon aggregate types and the FIELD_DECL used to place them. The patch below does away with the search in lookup_anon_field and instead records the single FIELD_DECL in the anon aggregate types lang-specific data, re-using the RTTI typeinfo_var field. That speeds up the compile of the testcase with -fsyntax-only from about 4.5s to slightly less than 1s. I tried to poke holes into the 1:1 relationship idea with my C++ knowledge but failed (which might not say much). It also leaves a hole for the case when the C++ FE itself duplicates such type and places it at a semantically different position. I've tried to poke holes into it with the duplication mechanism I understand (templates) but failed. 2021-09-08 Richard Biener <rguenther@suse.de> PR c++/102228 gcc/cp/ * cp-tree.h (ANON_AGGR_TYPE_FIELD): New define. * decl.c (fixup_anonymous_aggr): Wipe RTTI info put in place on invalid code. * decl2.c (reset_type_linkage): Guard CLASSTYPE_TYPEINFO_VAR access. * module.cc (trees_in::read_class_def): Likewise. Reconstruct ANON_AGGR_TYPE_FIELD. * semantics.c (finish_member_declaration): Populate ANON_AGGR_TYPE_FIELD for anon aggregate typed members. * typeck.c (lookup_anon_field): Remove DFS search and return ANON_AGGR_TYPE_FIELD directly.	2021-09-08 17:43:40 +02:00
Joseph Myers	d27d694151	testsuite: Allow .sdata in more cases in gcc.dg/array-quals-1.c When testing for Nios II (gcc-testresults shows this for MIPS as well), failures of gcc.dg/array-quals-1.c appear where a symbol was found in .sdata rather than one of the expected sections. FAIL: gcc.dg/array-quals-1.c scan-assembler-symbol-section symbol ^_?a$ (found a) has section ^\\.(const\|rodata\|srodata)\|\\[RO\\] (found .sdata) FAIL: gcc.dg/array-quals-1.c scan-assembler-symbol-section symbol ^_?b$ (found b) has section ^\\.(const\|rodata\|srodata)\|\\[RO\\] (found .sdata) FAIL: gcc.dg/array-quals-1.c scan-assembler-symbol-section symbol ^_?c$ (found c) has section ^\\.(const\|rodata\|srodata)\|\\[RO\\] (found .sdata) FAIL: gcc.dg/array-quals-1.c scan-assembler-symbol-section symbol ^_?d$ (found d) has section ^\\.(const\|rodata\|srodata)\|\\[RO\\] (found .sdata) Jakub's commit `0b34dbc0a2` allowed .sdata for many variables in that test where use of .sdata caused a failure on powerpc-linux. I'm presuming the choice of which variables had .sdata allowed was based only on the code generated for powerpc-linux, not on any reason it would be wrong to allow it for the other variables; thus, this patch adjusts the test to allow .sdata for some more variables where that is needed on Nios II (and in one case where it's not needed on Nios II, but the test results on gcc-testresults suggest that it is needed on MIPS). Tested with no regressions with cross to nios2-elf. * gcc.dg/array-quals-1.c: Allow .sdata section in more cases.	2021-09-08 15:38:18 +00:00
Joseph Myers	d081516ae1	testsuite: Use explicit -ftree-cselim in tests using -fdump-tree-cselim-details When testing for Nios II (gcc-testresults shows this for various other targets as well), tests scanning cselim dumps produce an UNRESOLVED result because those dumps do not exist. cselim is enabled conditionally by code in toplev.c: if (flag_tree_cselim == AUTODETECT_VALUE) { if (HAVE_conditional_move) flag_tree_cselim = 1; else flag_tree_cselim = 0; } Add explicit -ftree-cselim to dg-options in the affected tests (as already used by some other tests of cselim dumps) so that this dump exists on all architectures. Tested with no regressions with cross to nios2-elf, where this causes the tests in question to PASS instead of being UNRESOLVED. * gcc.dg/tree-ssa/pr89430-1.c, gcc.dg/tree-ssa/pr89430-2.c, gcc.dg/tree-ssa/pr89430-3.c, gcc.dg/tree-ssa/pr89430-4.c, gcc.dg/tree-ssa/pr89430-5.c, gcc.dg/tree-ssa/pr89430-6.c, gcc.dg/tree-ssa/pr89430-7-comp-ref.c, gcc.dg/tree-ssa/pr89430-8-mem-ref-size.c, gcc.dg/tree-ssa/pr99473-1.c: Use -ftree-cselim.	2021-09-08 14:57:20 +00:00
Segher Boessenkool	86e6268cff	rs6000: Fix ELFv2 r12 use in epilogue We cannot use r12 here, it is already in use as the GEP (for sibling calls). 2021-09-08 Segher Boessenkool <segher@kernel.crashing.org> PR target/102107 * config/rs6000/rs6000-logue.c (rs6000_emit_epilogue): For ELFv2 use r11 instead of r12 for restoring CR.	2021-09-08 13:27:56 +00:00
Jakub Jelinek	7485a52551	i386: Fix up xorsign for AVX [PR89984] Thinking about it more this morning, while this patch fixes the problems revealed in the testcase, the recent PR89984 change was buggy too, but perhaps that can be fixed incrementally. Because for AVX the new code destructively modifies op1. If that is different from dest, say on: float foo (float x, float y) { return x * __builtin_copysignf (1.0f, y) + y; } then we get after RA: (insn 8 7 9 2 (set (reg:SF 20 xmm0 [orig:82 _2 ] [82]) (unspec:SF [ (reg:SF 20 xmm0 [88]) (reg:SF 21 xmm1 [89]) (mem/u/c:V4SF (symbol_ref/u:DI (".LC0") [flags 0x2]) [0 S16 A128]) ] UNSPEC_XORSIGN)) "hohoho.c":4:12 649 {xorsignsf3_1} (nil)) (insn 9 8 15 2 (set (reg:SF 20 xmm0 [87]) (plus:SF (reg:SF 20 xmm0 [orig:82 _2 ] [82]) (reg:SF 21 xmm1 [89]))) "hohoho.c":4:44 1021 {fop_sf_comm} (nil)) but split the xorsign into: vandps .LC0(%rip), %xmm1, %xmm1 vxorps %xmm0, %xmm1, %xmm0 and then the addition: vaddss %xmm1, %xmm0, %xmm0 which means we miscompile it - instead of adding y in the end we add __builtin_copysignf (0.0f, y). So, wonder if we don't want instead in addition to the &Yv <- Yv, 0 alternative (enabled for both pre-AVX and AVX as in this patch) the &Yv <- Yv, Yv where destination must be different from inputs and another Yv <- Yv, Yv where it can be the same but then need a match_scratch (with X for the other alternatives and =Yv for the last one). That way we'd always have a safe register we can store the op1 & mask value into, either the destination (in the first alternative known to be equal to op1 which is needed for non-AVX but ok for AVX too), in the second alternative known to be different from both inputs and in the third which could be used for those float bar (float x, float y) { return x * __builtin_copysignf (1.0f, y); } cases where op1 is naturally xmm1 and dest == op0 naturally xmm0 we'd use some other register like xmm2. On Wed, Sep 08, 2021 at 05:23:40PM +0800, Hongtao Liu wrote: > I'm curious why we need the post_reload splitter @xorsign<mode>3_1 > for scalar mode, can't we just expand them into and/xor operations in > the expander, just like vector modes did. Following seems to work for all the testcases I've tried (and in some generates better code than the post-reload splitter). 2021-09-08 Jakub Jelinek <jakub@redhat.com> liuhongt <hongtao.liu@intel.com> PR target/89984 * config/i386/i386.md (@xorsign<mode>3_1): Remove. * config/i386/i386-expand.c (ix86_expand_xorsign): Expand right away into AND with mask and XOR, using paradoxical subregs. (ix86_split_xorsign): Remove. * config/i386/i386-protos.h (ix86_split_xorsign): Remove. * gcc.target/i386/avx-pr102224.c: Fix up PR number. * gcc.dg/pr89984.c: New test. * gcc.target/i386/avx-pr89984.c: New test.	2021-09-08 14:06:10 +02:00
liuhongt	6576ad5add	Compile __{mul,div}hc3 into libgcc_s.so.1. libgcc/ChangeLog: * config/i386/t-softfp: Compile __{mul,div}hc3 into libgcc_s.so.1.	2021-09-08 19:18:15 +08:00
Di Zhao	7285f39455	tree-optimization/102183 - sccvn: fix result compare in vn_nary_op_insert_into If the first predicate value is different and copied, the comparison will then be between val->result and the copied one. That can cause inserting extra vn_pvals. gcc/ChangeLog: * tree-ssa-sccvn.c (vn_nary_op_insert_into): fix result compare	2021-09-08 18:47:18 +08:00
Jakub Jelinek	87d55da7d7	libgcc, i386: Export hf and hc from libgcc_s.so.1 The following patch exports it for Linux from config/i386/.ver where it IMNSHO belongs, aarch64 already exports some of those at GCC_11 and other targets might add them at completely different gcc versions. 2021-09-08 Jakub Jelinek <jakub@redhat.com> Iain Sandoe <iain@sandoe.co.uk> * config/i386/libgcc-glibc.ver: Add %inherit GCC_12.0.0 GCC_7.0.0 and export hf and hc functions at GCC_12.0.0.	2021-09-08 11:34:45 +02:00
Jakub Jelinek	a7b626d98a	i386: Fix up @xorsign<mode>3_1 [PR102224] As the testcase shows, we miscompile @xorsign<mode>3_1 if both input operands are in the same register, because the splitter overwrites op1 before with op1 & mask before using op0. For dest = xorsign op0, op0 we can actually simplify it from dest = (op0 & mask) ^ op0 to dest = op0 & ~mask (aka abs). The expander change is an optimization improvement, if we at expansion time know it is xorsign op0, op0, we can emit abs right away and get better code through that. The @xorsign<mode>3_1 is a fix for the case where xorsign wouldn't be known to have same operands during expansion, but during RTL optimizations they would appear. For non-AVX we need to use earlyclobber, we require dest and op1 to be the same but op0 must be different because we overwrite op1 first. For AVX the constraints ensure that at most 2 of the 3 operands may be the same register and if both inputs are the same, handles that case. This case can be easily tested with the xorsign<mode>3 expander change reverted. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Thinking about it more this morning, while this patch fixes the problems revealed in the testcase, the recent PR89984 change was buggy too, but perhaps that can be fixed incrementally. Because for AVX the new code destructively modifies op1. If that is different from dest, say on: float foo (float x, float y) { return x * __builtin_copysignf (1.0f, y) + y; } then we get after RA: (insn 8 7 9 2 (set (reg:SF 20 xmm0 [orig:82 _2 ] [82]) (unspec:SF [ (reg:SF 20 xmm0 [88]) (reg:SF 21 xmm1 [89]) (mem/u/c:V4SF (symbol_ref/u:DI (".LC0") [flags 0x2]) [0 S16 A128]) ] UNSPEC_XORSIGN)) "hohoho.c":4:12 649 {xorsignsf3_1} (nil)) (insn 9 8 15 2 (set (reg:SF 20 xmm0 [87]) (plus:SF (reg:SF 20 xmm0 [orig:82 _2 ] [82]) (reg:SF 21 xmm1 [89]))) "hohoho.c":4:44 1021 {fop_sf_comm} (nil)) but split the xorsign into: vandps .LC0(%rip), %xmm1, %xmm1 vxorps %xmm0, %xmm1, %xmm0 and then the addition: vaddss %xmm1, %xmm0, %xmm0 which means we miscompile it - instead of adding y in the end we add __builtin_copysignf (0.0f, y). So, wonder if we don't want instead in addition to the &Yv <- Yv, 0 alternative (enabled for both pre-AVX and AVX as in this patch) the &Yv <- Yv, Yv where destination must be different from inputs and another Yv <- Yv, Yv where it can be the same but then need a match_scratch (with X for the other alternatives and =Yv for the last one). That way we'd always have a safe register we can store the op1 & mask value into, either the destination (in the first alternative known to be equal to op1 which is needed for non-AVX but ok for AVX too), in the second alternative known to be different from both inputs and in the third which could be used for those float bar (float x, float y) { return x * __builtin_copysignf (1.0f, y); } cases where op1 is naturally xmm1 and dest == op0 naturally xmm0 we'd use some other register like xmm2. 2021-09-08 Jakub Jelinek <jakub@redhat.com> PR target/102224 * config/i386/i386.md (xorsign<mode>3): If operands[1] is equal to operands[2], emit abs<mode>2 instead. (@xorsign<mode>3_1): Add early-clobbers for output operand, enable first alternative even for avx, add another alternative with =&Yv <- 0, Yv, Yvm constraints. * config/i386/i386-expand.c (ix86_split_xorsign): If op0 is equal to op1, emit vpandn instead. * gcc.dg/pr102224.c: New test. * gcc.target/i386/avx-pr102224.c: New test.	2021-09-08 11:25:31 +02:00
liuhongt	4a61bcaca0	AVX512FP16: Add abi test for zmm gcc/testsuite/ChangeLog: * gcc.target/x86_64/abi/avx512fp16/m512h/abi-avx512fp16-zmm.exp: New file. * gcc.target/x86_64/abi/avx512fp16/m512h/args.h: Likewise. * gcc.target/x86_64/abi/avx512fp16/m512h/asm-support.S: Likewise. * gcc.target/x86_64/abi/avx512fp16/m512h/avx512fp16-zmm-check.h: Likewise. * gcc.target/x86_64/abi/avx512fp16/m512h/test_m512_returning.c: Likewise. * gcc.target/x86_64/abi/avx512fp16/m512h/test_passing_m512.c: Likewise. * gcc.target/x86_64/abi/avx512fp16/m512h/test_passing_structs.c: Likewise. * gcc.target/x86_64/abi/avx512fp16/m512h/test_passing_unions.c: Likewise. * gcc.target/x86_64/abi/avx512fp16/m512h/test_varargs-m512.c: Likewise.	2021-09-08 12:44:50 +08:00
liuhongt	07308cdb0c	AVX512FP16: Add ABI test for ymm. gcc/testsuite/ChangeLog: * gcc.target/x86_64/abi/avx512fp16/m256h/abi-avx512fp16-ymm.exp: New exp file. * gcc.target/x86_64/abi/avx512fp16/m256h/args.h: New header. * gcc.target/x86_64/abi/avx512fp16/m256h/avx512fp16-ymm-check.h: Likewise. * gcc.target/x86_64/abi/avx512fp16/m256h/asm-support.S: New. * gcc.target/x86_64/abi/avx512fp16/m256h/test_m256_returning.c: New test. * gcc.target/x86_64/abi/avx512fp16/m256h/test_passing_m256.c: Likewise. * gcc.target/x86_64/abi/avx512fp16/m256h/test_passing_structs.c: Likewise. * gcc.target/x86_64/abi/avx512fp16/m256h/test_passing_unions.c: Likewise. * gcc.target/x86_64/abi/avx512fp16/m256h/test_varargs-m256.c: Likewise.	2021-09-08 12:44:50 +08:00
H.J. Lu	22ce16ffa4	AVX512FP16: Add ABI tests for xmm. Copied from regular XMM ABI tests. Only run AVX512FP16 ABI tests for ELF targets. gcc/testsuite/ChangeLog: * gcc.target/x86_64/abi/avx512fp16/abi-avx512fp16-xmm.exp: New exp file for abi test. * gcc.target/x86_64/abi/avx512fp16/args.h: New header file for abi test. * gcc.target/x86_64/abi/avx512fp16/avx512fp16-check.h: Likewise. * gcc.target/x86_64/abi/avx512fp16/avx512fp16-xmm-check.h: Likewise. * gcc.target/x86_64/abi/avx512fp16/defines.h: Likewise. * gcc.target/x86_64/abi/avx512fp16/macros.h: Likewise. * gcc.target/x86_64/abi/avx512fp16/asm-support.S: New asm for abi check. * gcc.target/x86_64/abi/avx512fp16/test_3_element_struct_and_unions.c: New test. * gcc.target/x86_64/abi/avx512fp16/test_basic_alignment.c: Likewise. * gcc.target/x86_64/abi/avx512fp16/test_basic_array_size_and_align.c: Likewise. * gcc.target/x86_64/abi/avx512fp16/test_basic_returning.c: Likewise. * gcc.target/x86_64/abi/avx512fp16/test_basic_sizes.c: Likewise. * gcc.target/x86_64/abi/avx512fp16/test_basic_struct_size_and_align.c: Likewise. * gcc.target/x86_64/abi/avx512fp16/test_basic_union_size_and_align.c: Likewise. * gcc.target/x86_64/abi/avx512fp16/test_complex_returning.c: Likewise. * gcc.target/x86_64/abi/avx512fp16/test_m64m128_returning.c: Likewise. * gcc.target/x86_64/abi/avx512fp16/test_passing_floats.c: Likewise. * gcc.target/x86_64/abi/avx512fp16/test_passing_m64m128.c: Likewise. * gcc.target/x86_64/abi/avx512fp16/test_passing_structs.c: Likewise. * gcc.target/x86_64/abi/avx512fp16/test_passing_unions.c: Likewise. * gcc.target/x86_64/abi/avx512fp16/test_struct_returning.c: Likewise. * gcc.target/x86_64/abi/avx512fp16/test_varargs-m128.c: Likewise.	2021-09-08 12:44:50 +08:00
H.J. Lu	5bbd88bb1e	AVX512FP16: Add tests for vector passing in variable arguments. gcc/testsuite/ChangeLog: * gcc.target/i386/avx512fp16-vararg-1.c: New test. * gcc.target/i386/avx512fp16-vararg-2.c: Ditto. * gcc.target/i386/avx512fp16-vararg-3.c: Ditto. * gcc.target/i386/avx512fp16-vararg-4.c: Ditto.	2021-09-08 12:44:50 +08:00
liuhongt	2f3318dbcf	AVX512FP16: Add testcase for vector init and broadcast intrinsics. gcc/testsuite/ChangeLog: * gcc.target/i386/m512-check.h: Add union128h, union256h, union512h. * gcc.target/i386/avx512fp16-10a.c: New test. * gcc.target/i386/avx512fp16-10b.c: Ditto. * gcc.target/i386/avx512fp16-1a.c: Ditto. * gcc.target/i386/avx512fp16-1b.c: Ditto. * gcc.target/i386/avx512fp16-1c.c: Ditto. * gcc.target/i386/avx512fp16-1d.c: Ditto. * gcc.target/i386/avx512fp16-1e.c: Ditto. * gcc.target/i386/avx512fp16-2a.c: Ditto. * gcc.target/i386/avx512fp16-2b.c: Ditto. * gcc.target/i386/avx512fp16-2c.c: Ditto. * gcc.target/i386/avx512fp16-3a.c: Ditto. * gcc.target/i386/avx512fp16-3b.c: Ditto. * gcc.target/i386/avx512fp16-3c.c: Ditto. * gcc.target/i386/avx512fp16-4.c: Ditto. * gcc.target/i386/avx512fp16-5.c: Ditto. * gcc.target/i386/avx512fp16-6.c: Ditto. * gcc.target/i386/avx512fp16-7.c: Ditto. * gcc.target/i386/avx512fp16-8.c: Ditto. * gcc.target/i386/avx512fp16-9a.c: Ditto. * gcc.target/i386/avx512fp16-9b.c: Ditto. * gcc.target/i386/pr54855-13.c: Ditto. * gcc.target/i386/avx512fp16-vec_set_var.c: Ditto.	2021-09-08 12:44:50 +08:00
liuhongt	9e2a82e1f9	AVX512FP16: Support vector init/broadcast/set/extract for FP16. gcc/ChangeLog: * config/i386/avx512fp16intrin.h (_mm_set_ph): New intrinsic. (_mm256_set_ph): Likewise. (_mm512_set_ph): Likewise. (_mm_setr_ph): Likewise. (_mm256_setr_ph): Likewise. (_mm512_setr_ph): Likewise. (_mm_set1_ph): Likewise. (_mm256_set1_ph): Likewise. (_mm512_set1_ph): Likewise. (_mm_setzero_ph): Likewise. (_mm256_setzero_ph): Likewise. (_mm512_setzero_ph): Likewise. (_mm_set_sh): Likewise. (_mm_load_sh): Likewise. (_mm_store_sh): Likewise. * config/i386/i386-builtin-types.def (V8HF): New type. (DEF_FUNCTION_TYPE (V8HF, V8HI)): New builtin function type * config/i386/i386-expand.c (ix86_expand_vector_init_duplicate): Support vector HFmodes. (ix86_expand_vector_init_one_nonzero): Likewise. (ix86_expand_vector_init_one_var): Likewise. (ix86_expand_vector_init_interleave): Likewise. (ix86_expand_vector_init_general): Likewise. (ix86_expand_vector_set): Likewise. (ix86_expand_vector_extract): Likewise. (ix86_expand_vector_init_concat): Likewise. (ix86_expand_sse_movcc): Handle vector HFmodes. (ix86_expand_vector_set_var): Ditto. * config/i386/i386-modes.def: Add HF vector modes in comment. * config/i386/i386.c (classify_argument): Add HF vector modes. (ix86_hard_regno_mode_ok): Allow HF vector modes for AVX512FP16. (ix86_vector_mode_supported_p): Likewise. (ix86_set_reg_reg_cost): Handle vector HFmode. (ix86_get_ssemov): Handle vector HFmode. (function_arg_advance_64): Pass unamed V16HFmode and V32HFmode by stack. (function_arg_advance_32): Pass V8HF/V16HF/V32HF by sse reg for 32bit mode. (function_arg_advance_32): Ditto. * config/i386/i386.h (VALID_AVX512FP16_REG_MODE): New. (VALID_AVX256_REG_OR_OI_MODE): Rename to .. (VALID_AVX256_REG_OR_OI_VHF_MODE): .. this, and add V16HF. (VALID_SSE2_REG_VHF_MODE): New. (VALID_AVX512VL_128_REG_MODE): Add V8HF and TImode. (SSE_REG_MODE_P): Add vector HFmode. * config/i386/i386.md (mode): Add HF vector modes. (MODE_SIZE): Likewise. (ssemodesuffix): Add ph suffix for HF vector modes. * config/i386/sse.md (VFH_128): New mode iterator. (VMOVE): Adjust for HF vector modes. (V): Likewise. (V_256_512): Likewise. (avx512): Likewise. (avx512fmaskmode): Likewise. (shuffletype): Likewise. (sseinsnmode): Likewise. (ssedoublevecmode): Likewise. (ssehalfvecmode): Likewise. (ssehalfvecmodelower): Likewise. (ssePScmode): Likewise. (ssescalarmode): Likewise. (ssescalarmodelower): Likewise. (sseintprefix): Likewise. (i128): Likewise. (bcstscalarsuff): Likewise. (xtg_mode): Likewise. (VI12HF_AVX512VL): New mode_iterator. (VF_AVX512FP16): Likewise. (VIHF): Likewise. (VIHF_256): Likewise. (VIHF_AVX512BW): Likewise. (V16_256): Likewise. (V32_512): Likewise. (sseintmodesuffix): New mode_attr. (sse): Add scalar and vector HFmodes. (ssescalarmode): Add vector HFmode mapping. (ssescalarmodesuffix): Add sh suffix for HFmode. (<sse>_vm<insn><mode>3): Use VFH_128. (<sse>_vm<multdiv_mnemonic><mode>3): Likewise. (ieee_<ieee_maxmin><mode>3): Likewise. (<avx512>_blendm<mode>): New define_insn. (vec_setv8hf): New define_expand. (vec_set<mode>_0): New define_insn for HF vector set. (avx512fp16_movsh): Likewise. (avx512fp16_movsh): Likewise. (vec_extract_lo_v32hi): Rename to ... (vec_extract_lo_<mode>): ... this, and adjust to allow HF vector modes. (vec_extract_hi_v32hi): Likewise. (vec_extract_hi_<mode>): Likewise. (vec_extract_lo_v16hi): Likewise. (vec_extract_lo_<mode>): Likewise. (vec_extract_hi_v16hi): Likewise. (vec_extract_hi_<mode>): Likewise. (vec_set_hi_v16hi): Likewise. (vec_set_hi_<mode>): Likewise. (vec_set_lo_v16hi): Likewise. (vec_set_lo_<mode>): Likewise. (vec_extract<mode>_0): New define_insn_and_split for HF vector extract. (vec_extracthf): New define_insn. (VEC_EXTRACT_MODE): Add HF vector modes. (PINSR_MODE): Add V8HF. (sse2p4_1): Likewise. (pinsr_evex_isa): Likewise. (<sse2p4_1>_pinsr<ssemodesuffix>): Adjust to support insert for V8HFmode. (pbroadcast_evex_isa): Add HF vector modes. (AVX2_VEC_DUP_MODE): Likewise. (VEC_INIT_MODE): Likewise. (VEC_INIT_HALF_MODE): Likewise. (avx2_pbroadcast<mode>): Adjust to support HF vector mode broadcast. (avx2_pbroadcast<mode>_1): Likewise. (<avx512>_vec_dup<mode>_1): Likewise. (<avx512>_vec_dup<mode><mask_name>): Likewise. (<mask_codefor><avx512>_vec_dup_gpr<mode><mask_name>): Likewise.	2021-09-08 12:44:50 +08:00
Guo, Xuepeng	a68412117f	AVX512FP16: Initial support for AVX512FP16 feature and scalar _Float16 instructions. gcc/ChangeLog: * common/config/i386/cpuinfo.h (get_available_features): Detect FEATURE_AVX512FP16. * common/config/i386/i386-common.c (OPTION_MASK_ISA_AVX512FP16_SET, OPTION_MASK_ISA_AVX512FP16_UNSET, OPTION_MASK_ISA2_AVX512FP16_SET, OPTION_MASK_ISA2_AVX512FP16_UNSET): New. (OPTION_MASK_ISA2_AVX512BW_UNSET, OPTION_MASK_ISA2_AVX512BF16_UNSET): Add AVX512FP16. (ix86_handle_option): Handle -mavx512fp16. * common/config/i386/i386-cpuinfo.h (enum processor_features): Add FEATURE_AVX512FP16. * common/config/i386/i386-isas.h: Add entry for AVX512FP16. * config.gcc: Add avx512fp16intrin.h. * config/i386/avx512fp16intrin.h: New intrinsic header. * config/i386/cpuid.h: Add bit_AVX512FP16. * config/i386/i386-builtin-types.def: (FLOAT16): New primitive type. * config/i386/i386-builtins.c: Support _Float16 type for i386 backend. (ix86_register_float16_builtin_type): New function. (ix86_float16_type_node): New. * config/i386/i386-c.c (ix86_target_macros_internal): Define __AVX512FP16__. * config/i386/i386-expand.c (ix86_expand_branch): Support HFmode. (ix86_prepare_fp_compare_args): Adjust TARGET_SSE_MATH && SSE_FLOAT_MODE_P to SSE_FLOAT_MODE_SSEMATH_OR_HF_P. (ix86_expand_fp_movcc): Ditto. * config/i386/i386-isa.def: Add PTA define for AVX512FP16. * config/i386/i386-options.c (isa2_opts): Add -mavx512fp16. (ix86_valid_target_attribute_inner_p): Add avx512fp16 attribute. * config/i386/i386.c (ix86_get_ssemov): Use vmovdqu16/vmovw/vmovsh for HFmode/HImode scalar or vector. (ix86_get_excess_precision): Use FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 when TARGET_AVX512FP16 existed. (sse_store_index): Use SFmode cost for HFmode cost. (inline_memory_move_cost): Add HFmode, and perfer SSE cost over GPR cost for HFmode. (ix86_hard_regno_mode_ok): Allow HImode in sse register. (ix86_mangle_type): Add manlging for _Float16 type. (inline_secondary_memory_needed): No memory is needed for 16bit movement between gpr and sse reg under TARGET_AVX512FP16. (ix86_multiplication_cost): Adjust TARGET_SSE_MATH && SSE_FLOAT_MODE_P to SSE_FLOAT_MODE_SSEMATH_OR_HF_P. (ix86_division_cost): Ditto. (ix86_rtx_costs): Ditto. (ix86_add_stmt_cost): Ditto. (ix86_optab_supported_p): Ditto. * config/i386/i386.h (VALID_AVX512F_SCALAR_MODE): Add HFmode. (SSE_FLOAT_MODE_SSEMATH_OR_HF_P): Add HFmode. (PTA_SAPPHIRERAPIDS): Add PTA_AVX512FP16. * config/i386/i386.md (mode): Add HFmode. (MODE_SIZE): Add HFmode. (isa): Add avx512fp16. (enabled): Handle avx512fp16. (ssemodesuffix): Add sh suffix for HFmode. (comm): Add mult, div. (plusminusmultdiv): New code iterator. (insn): Add mult, div. (movhf_internal): Adjust for avx512fp16 instruction. (movhi_internal): Ditto. (cmpi<unord>hf): New define_insn for HFmode. (ieee_s<ieee_maxmin>hf3): Likewise. (extendhf<mode>2): Likewise. (trunc<mode>hf2): Likewise. (float<floatunssuffix><mode>hf2): Likewise. (<insn>hf): Likewise. (cbranchhf4): New expander. (movhfcc): Likewise. (<insn>hf3): Likewise. (mulhf3): Likewise. (divhf3): Likewise. config/i386/i386.opt: Add mavx512fp16. * config/i386/immintrin.h: Include avx512fp16intrin.h. * doc/invoke.texi: Add mavx512fp16. * doc/extend.texi: Add avx512fp16 Usage Notes. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add -mavx512fp16 in dg-options. * gcc.target/i386/avx-2.c: Ditto. * gcc.target/i386/avx512-check.h: Check cpuid for AVX512FP16. * gcc.target/i386/funcspec-56.inc: Add new target attribute check. * gcc.target/i386/sse-13.c: Add -mavx512fp16. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Ditto. * gcc.target/i386/sse-23.c: Ditto. * lib/target-supports.exp: (check_effective_target_avx512fp16): New. * g++.target/i386/float16-1.C: New test. * g++.target/i386/float16-2.C: Ditto. * g++.target/i386/float16-3.C: Ditto. * gcc.target/i386/avx512fp16-12a.c: Ditto. * gcc.target/i386/avx512fp16-12b.c: Ditto. * gcc.target/i386/float16-3a.c: Ditto. * gcc.target/i386/float16-3b.c: Ditto. * gcc.target/i386/float16-4a.c: Ditto. * gcc.target/i386/float16-4b.c: Ditto. * gcc.target/i386/pr54855-12.c: Ditto. * g++.dg/other/i386-2.C: Ditto. * g++.dg/other/i386-3.C: Ditto. Co-Authored-By: H.J. Lu <hongjiu.lu@intel.com> Co-Authored-By: Liu Hongtao <hongtao.liu@intel.com> Co-Authored-By: Wang Hongyu <hongyu.wang@intel.com> Co-Authored-By: Xu Dianhong <dianhong.xu@intel.com>	2021-09-08 12:44:50 +08:00
liuhongt	f19a327077	Support -fexcess-precision=16 which will enable FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 when backend supports _Float16. gcc/ada/ChangeLog: * gcc-interface/misc.c (gnat_post_options): Issue an error for -fexcess-precision=16. gcc/c-family/ChangeLog: * c-common.c (excess_precision_mode_join): Update below comments. (c_ts18661_flt_eval_method): Set excess_precision_type to EXCESS_PRECISION_TYPE_FLOAT16 when -fexcess-precision=16. * c-cppbuiltin.c (cpp_atomic_builtins): Update below comments. (c_cpp_flt_eval_method_iec_559): Set excess_precision_type to EXCESS_PRECISION_TYPE_FLOAT16 when -fexcess-precision=16. gcc/ChangeLog: * common.opt: Support -fexcess-precision=16. * config/aarch64/aarch64.c (aarch64_excess_precision): Return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 when EXCESS_PRECISION_TYPE_FLOAT16. * config/arm/arm.c (arm_excess_precision): Ditto. * config/i386/i386.c (ix86_get_excess_precision): Ditto. * config/m68k/m68k.c (m68k_excess_precision): Issue an error when EXCESS_PRECISION_TYPE_FLOAT16. * config/s390/s390.c (s390_excess_precision): Ditto. * coretypes.h (enum excess_precision_type): Add EXCESS_PRECISION_TYPE_FLOAT16. * doc/tm.texi (TARGET_C_EXCESS_PRECISION): Update documents. * doc/tm.texi.in (TARGET_C_EXCESS_PRECISION): Ditto. * doc/extend.texi (Half-Precision): Document -fexcess-precision=16. * flag-types.h (enum excess_precision): Add EXCESS_PRECISION_FLOAT16. * target.def (excess_precision): Update document. * tree.c (excess_precision_type): Set excess_precision_type to EXCESS_PRECISION_FLOAT16 when -fexcess-precision=16. gcc/fortran/ChangeLog: * options.c (gfc_post_options): Issue an error for -fexcess-precision=16. gcc/testsuite/ChangeLog: * gcc.target/i386/float16-6.c: New test. * gcc.target/i386/float16-7.c: New test.	2021-09-08 12:44:49 +08:00
liuhongt	a549a9a39a	Adjust the wording for x86 _Float16 type. gcc/ChangeLog: * doc/extend.texi: (@node Floating Types): Adjust the wording. (@node Half-Precision): Ditto.	2021-09-08 09:10:45 +08:00
GCC Administrator	b2748138c0	Daily bump.	2021-09-08 00:16:23 +00:00
Max Filippov	b552c4e601	gcc: xtensa: fix PR target/102115 2021-09-07 Takayuki 'January June' Suwa <jjsuwa_sys3175@yahoo.co.jp> gcc/ PR target/102115 * config/xtensa/xtensa.c (xtensa_emit_move_sequence): Add 'CONST_INT_P (src)' to the condition of the block that tries to eliminate literal when loading integer contant.	2021-09-07 15:40:26 -07:00
Ian Lance Taylor	21b046bade	runtime: use hash32, not hash64, for amd64p32, mips64p32, mips64p32le Fixes PR go/102102 Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/348015	2021-09-07 15:05:11 -07:00
David Faust	d9996ccb94	doc: BPF CO-RE documentation Document the new command line options (-mco-re and -mno-co-re), the new BPF target builtin (__builtin_preserve_access_index), and the new BPF target attribute (preserve_access_index) introduced with BPF CO-RE. gcc/ChangeLog: * doc/extend.texi (BPF Type Attributes) New node. Document new preserve_access_index attribute. Document new preserve_access_index builtin. * doc/invoke.texi: Document -mco-re and -mno-co-re options.	2021-09-07 13:48:58 -07:00
David Faust	f4cdfd4856	bpf testsuite: Add BPF CO-RE tests This commit adds several tests for the new BPF CO-RE functionality to the BPF target testsuite. gcc/testsuite/ChangeLog: * gcc.target/bpf/core-attr-1.c: New test. * gcc.target/bpf/core-attr-2.c: Likewise. * gcc.target/bpf/core-attr-3.c: Likewise. * gcc.target/bpf/core-attr-4.c: Likewise * gcc.target/bpf/core-builtin-1.c: Likewise * gcc.target/bpf/core-builtin-2.c: Likewise. * gcc.target/bpf/core-builtin-3.c: Likewise. * gcc.target/bpf/core-section-1.c: Likewise.	2021-09-07 13:48:58 -07:00
David Faust	8bdabb3754	bpf: BPF CO-RE support This commit introduces support for BPF Compile Once - Run Everywhere (CO-RE) in GCC. gcc/ChangeLog: * config/bpf/bpf.c: Adjust includes. (bpf_handle_preserve_access_index_attribute): New function. (bpf_attribute_table): Use it here. (bpf_builtins): Add BPF_BUILTIN_PRESERVE_ACCESS_INDEX. (bpf_option_override): Handle "-mco-re" option. (bpf_asm_init_sections): New. (TARGET_ASM_INIT_SECTIONS): Redefine. (bpf_file_end): New. (TARGET_ASM_FILE_END): Redefine. (bpf_init_builtins): Add "__builtin_preserve_access_index". (bpf_core_compute, bpf_core_get_index): New. (is_attr_preserve_access): New. (bpf_expand_builtin): Handle new builtins. (bpf_core_newdecl, bpf_core_is_maybe_aggregate_access): New. (bpf_core_walk): New. (bpf_resolve_overloaded_builtin): New. (TARGET_RESOLVE_OVERLOADED_BUILTIN): Redefine. (handle_attr): New. (pass_bpf_core_attr): New RTL pass. * config/bpf/bpf-passes.def: New file. * config/bpf/bpf-protos.h (make_pass_bpf_core_attr): New. * config/bpf/coreout.c: New file. * config/bpf/coreout.h: Likewise. * config/bpf/t-bpf (TM_H): Add $(srcdir)/config/bpf/coreout.h. (coreout.o): New rule. (PASSES_EXTRA): Add $(srcdir)/config/bpf/bpf-passes.def. * config.gcc (bpf): Add coreout.h to extra_headers. Add coreout.o to extra_objs. Add $(srcdir)/config/bpf/coreout.c to target_gtfiles.	2021-09-07 13:48:58 -07:00
David Faust	0a2bd52f1a	btf: expose get_btf_id Expose the function get_btf_id, so that it may be used by the BPF backend. This enables the BPF CO-RE machinery in the BPF backend to lookup BTF type IDs, in order to create CO-RE relocation records. A prototype is added in ctfc.h gcc/ChangeLog: * btfout.c (get_btf_id): Function is no longer static. * ctfc.h: Expose it here.	2021-09-07 13:48:58 -07:00
David Faust	5b723401b3	ctfc: add function to lookup CTF ID of a TREE type Add a new function, ctf_lookup_tree_type, to return the CTF type ID associated with a type via its is TREE node. The function is exposed via a prototype in ctfc.h. gcc/ChangeLog: * ctfc.c (ctf_lookup_tree_type): New function. * ctfc.h: Likewise.	2021-09-07 13:48:58 -07:00
David Faust	44e4ed6a3c	ctfc: externalize ctf_dtd_lookup Expose the function ctf_dtd_lookup, so that it can be used by the BPF CO-RE machinery. The function is no longer static, and an extern prototype is added in ctfc.h. gcc/ChangeLog: * ctfc.c (ctf_dtd_lookup): Function is no longer static. * ctfc.h: Analogous change.	2021-09-07 13:48:58 -07:00
David Faust	81eced213c	dwarf: externalize lookup_type_die Expose the function lookup_type_die in dwarf2out, so that it can be used by CTF/BTF when adding BPF CO-RE information. The function is now non-static, and an extern prototype is added in dwarf2out.h. gcc/ChangeLog: * dwarf2out.c (lookup_type_die): Function is no longer static. * dwarf2out.h: Expose it here.	2021-09-07 13:48:57 -07:00
Hans-Peter Nilsson	578cd82af7	Fix fatal typo in gcc.dg/no_profile_instrument_function-attr-2.c Dejagnu is unfortunately brittle: a syntax error in a directive can abort the test-run for the current "tool" (gcc, g++, gfortran), and if you don't check for this condition or actually read the stdout log yourself, your tools may make you believe the test was successful without regressions. At the very least, always grep for ^ERROR: in the stdout log! With r12-3379, the testsuite got such a fatal syntax error, causing the gcc test-run to abort at (e.g.): ... FAIL: gcc.dg/memchr.c (test for excess errors) FAIL: gcc.dg/memcmp-3.c (test for excess errors) ERROR: (DejaGnu) proc "scan-tree-dump-not\" = foo {"} optimized" does not exist. The error code is TCL LOOKUP COMMAND scan-tree-dump-not\" The info on the error is: invalid command name "scan-tree-dump-not"" while executing "::tcl_unknown scan-tree-dump-not\" = foo {"} optimized" ("uplevel" body line 1) invoked from within "uplevel 1 ::tcl_unknown $args" === gcc Summary === # of expected passes 63740 # of unexpected failures 38 # of unexpected successes 2 # of expected failures 351 # of unresolved testcases 3 # of unsupported tests 662 x/cris-elf/gccobj/gcc/xgcc version 12.0.0 20210907 (experimental)\ [master r12-3391-g849d5f5929fc] (GCC) testsuite: * gcc.dg/no_profile_instrument_function-attr-2.c: Fix typo in last change.	2021-09-07 22:36:59 +02:00
Harald Anlauf	2a1537a19c	Fortran - improve error recovery determining array element from constructor gcc/fortran/ChangeLog: PR fortran/101327 * expr.c (find_array_element): When bounds cannot be determined as constant, return error instead of aborting. gcc/testsuite/ChangeLog: PR fortran/101327 * gfortran.dg/pr101327.f90: New test.	2021-09-07 20:51:49 +02:00
Indu Bhagat	849d5f5929	dwarf2out: Emit BTF in dwarf2out_finish for BPF CO-RE usecase DWARF generation is split between early and late phases when LTO is in effect. This poses challenges for CTF/BTF generation especially if late debug info generation is desirable, as turns out to be the case for BPF CO-RE. The approach taken here in this patch is: 1. LTO is disabled for BPF CO-RE The reason to disable LTO for BPF CO-RE is that if LTO is in effect, BPF CO-RE relocations need to be generated in the LTO link phase _after_ the optimizations are done. This means we need to devise way to combine early and late BTF. At this time, in absence of linker support for BTF sections, it makes sense to steer clear of LTO for BPF CO-RE and bypass the issue. 2. The BPF backend updates the write_symbols with BPF_WITH_CORE_DEBUG to convey the case that BTF with CO-RE support needs to be generated. This information is used by the debug info emission routines to defer the emission of BTF/CO-RE until dwarf2out_finish. So, in other words, dwarf2out_early_finish - Always emit CTF here. - if (BTF && !BTF_WITH_CORE), emit BTF now. dwarf2out_finish - if (BTF_WITH_CORE) emit BTF now. gcc/ChangeLog: * dwarf2ctf.c (ctf_debug_finalize): Make it static. (ctf_debug_early_finish): New definition. (ctf_debug_finish): Likewise. * dwarf2ctf.h (ctf_debug_finalize): Remove declaration. (ctf_debug_early_finish): New declaration. (ctf_debug_finish): Likewise. * dwarf2out.c (dwarf2out_finish): Invoke ctf_debug_finish. (dwarf2out_early_finish): Invoke ctf_debug_early_finish.	2021-09-07 11:18:54 -07:00
Indu Bhagat	e29a9607fa	bpf: Add new -mco-re option for BPF CO-RE -mco-re in the BPF backend enables code generation for the CO-RE usecase. LTO is disabled for CO-RE compilations. gcc/ChangeLog: * config/bpf/bpf.c (bpf_option_override): For BPF backend, disable LTO support when compiling for CO-RE. * config/bpf/bpf.opt: Add new command line option -mco-re. gcc/testsuite/ChangeLog: * gcc.target/bpf/core-lto-1.c: New test.	2021-09-07 11:17:55 -07:00
Indu Bhagat	053db9a49b	debug: Add BTF_WITH_CORE_DEBUG debug format To best handle BTF/CO-RE in GCC, a distinct BTF_WITH_CORE_DEBUG debug format is being added. This helps the compiler detect whether BTF with CO-RE relocations needs to be emitted. gcc/ChangeLog: * flag-types.h (enum debug_info_type): Add new enum DINFO_TYPE_BTF_WITH_CORE. (BTF_WITH_CORE_DEBUG): New bitmask. * flags.h (btf_with_core_debuginfo_p): New declaration. * opts.c (btf_with_core_debuginfo_p): New definition.	2021-09-07 11:16:53 -07:00
Jason Merrill	c03db573b9	tree: Change error_operand_p to an inline function I've thought for a while that many of the macros in tree.h and such should become inline functions. This one in particular was confusing Coverity; the null check in the macro made it think that all code guarded by error_operand_p would also need null checks. gcc/ChangeLog: * tree.h (error_operand_p): Change to inline function.	2021-09-07 14:09:38 -04:00
Jakub Jelinek	81f9718139	c++: Fix up constexpr evaluation of deleting dtors [PR100495] We do not save bodies of constexpr clones and instead evaluate the bodies of the constexpr functions they were cloned from. I believe that is just fine for constructors because complete vs. base ctors differ only in classes that have virtual bases and such constructors aren't constexpr, similarly complete/base destructors. But as the testcase below shows, for deleting destructors it is not fine, deleting dtors while marked as clones in fact are just artificial functions with synthetized body which calls the user destructor and deallocation. So, either we'd need to evaluate the destructor and afterwards synthetize and evaluate the deallocation, or we can just save and use the deleting dtors bodies. The latter seems much easier to me. 2021-09-07 Jakub Jelinek <jakub@redhat.com> PR c++/100495 * constexpr.c (maybe_save_constexpr_fundef): Save body even for constexpr deleting dtors. (cxx_eval_call_expression): Don't use DECL_CLONED_FUNCTION for deleting dtors. * g++.dg/cpp2a/constexpr-new21.C: New test.	2021-09-07 19:33:28 +02:00
Tobias Burnus	ff7bc505b1	libgomp.texi: Extend OpenMP 5.0 Implementation Status libgomp/ * libgomp.texi (OpenMP Implementation Status): Extend OpenMP 5.0 section. (OpenACC Profiling Interface): Fix typo.	2021-09-07 18:30:25 +02:00
Aldy Hernandez	020e2db0a8	Rename forwarder_block_p in treading code to empty_block_with_phis_p. gcc/ChangeLog: * tree-ssa-threadedge.c (forwarder_block_p): Rename to... (empty_block_with_phis_p): ...this. (potentially_threadable_block): Same. (jump_threader::thread_through_normal_block): Same.	2021-09-07 18:04:36 +02:00
Tobias Burnus	fc4f0631de	libgfortran: Makefile fix for ISO_Fortran_binding.h libgfortran/ChangeLog: * Makefile.am (gfor_built_src): Depend on include/ISO_Fortran_binding.h not on ISO_Fortran_binding.h. (ISO_Fortran_binding.h): Rename make target to ... (include/ISO_Fortran_binding.h): ... this. * Makefile.in: Regenerate.	2021-09-07 17:46:05 +02:00
Eric Botcazou	81e9178fe7	Fix PR debug/101947 This is the recent LTO bootstrap failure with Ada enabled. The compiler now generates DW_OP_deref_type for a unit of the Ada front-end, which means that the offset of base types in the CU must be computed during early DWARF too. gcc/ PR debug/101947 * dwarf2out.c (mark_base_types): New overloaded function. (dwarf2out_early_finish): Invoke it on the COMDAT type list as well as the compilation unit, and call move_marked_base_types afterward.	2021-09-07 15:42:51 +02:00
H.J. Lu	ad9fcb961c	x86: Enable FMA in unsigned SI to SF expanders Enable FMA in scalar/vector unsigned SI to SF expanders. Don't check TARGET_AVX512F which has vcvtusi2ss and vcvtudq2ps instructions. gcc/ PR target/85819 * config/i386/i386-expand.c (ix86_expand_convert_uns_sisf_sse): Enable FMA. (ix86_expand_vector_convert_uns_vsivsf): Likewise. gcc/testsuite/ PR target/85819 * gcc.target/i386/pr85819-1a.c: New test. * gcc.target/i386/pr85819-1b.c: Likewise. * gcc.target/i386/pr85819-2a.c: Likewise. * gcc.target/i386/pr85819-2b.c: Likewise. * gcc.target/i386/pr85819-2c.c: Likewise. * gcc.target/i386/pr85819-3.c: Likewise.	2021-09-07 05:28:07 -07:00
Richard Biener	843068149e	tree-optimization/102226 - fix epilogue vector re-use This fixes re-use of the reduction value in epilogue vectorization when a conversion from/to variable lenght vectors is required. 2021-09-07 Richard Biener <rguenther@suse.de> PR tree-optimization/102226 * tree-vect-loop.c (vect_transform_cycle_phi): Record the converted value for the epilogue PHI use. * g++.dg/vect/pr102226.cc: New testcase.	2021-09-07 13:10:37 +02:00
Marcel Vollweiler	ba1cc6956b	C, C++, Fortran, OpenMP: Add support for 'flush seq_cst' construct. This patch adds support for the 'seq_cst' memory order clause on the 'flush' directive which was introduced in OpenMP 5.1. gcc/c-family/ChangeLog: * c-omp.c (c_finish_omp_flush): Handle MEMMODEL_SEQ_CST. gcc/c/ChangeLog: * c-parser.c (c_parser_omp_flush): Parse 'seq_cst' clause on 'flush' directive. gcc/cp/ChangeLog: * parser.c (cp_parser_omp_flush): Parse 'seq_cst' clause on 'flush' directive. * semantics.c (finish_omp_flush): Handle MEMMODEL_SEQ_CST. gcc/fortran/ChangeLog: * openmp.c (gfc_match_omp_flush): Parse 'seq_cst' clause on 'flush' directive. * trans-openmp.c (gfc_trans_omp_flush): Handle OMP_MEMORDER_SEQ_CST. gcc/testsuite/ChangeLog: * c-c++-common/gomp/flush-1.c: Add test case for 'seq_cst'. * c-c++-common/gomp/flush-2.c: Add test case for 'seq_cst'. * g++.dg/gomp/attrs-1.C: Adapt test to handle all flush clauses. * g++.dg/gomp/attrs-2.C: Adapt test to handle all flush clauses. * gfortran.dg/gomp/flush-1.f90: Add test case for 'seq_cst'. * gfortran.dg/gomp/flush-2.f90: Add test case for 'seq_cst'.	2021-09-07 03:46:28 -07:00
Martin Liska	aad72d2ea8	inline: do not einline when no_profile_instrument_function is different PR gcov-profile/80223 gcc/ChangeLog: * ipa-inline.c (can_inline_edge_p): Similarly to sanitizer options, do not inline when no_profile_instrument_function attributes are different in early inliner. It's fine to inline it after PGO instrumentation. gcc/testsuite/ChangeLog: * gcc.dg/no_profile_instrument_function-attr-2.c: New test.	2021-09-07 11:47:57 +02:00
Richard Biener	f387ff788f	tree-optimization/101555 - avoid redundant alias queries in PRE This avoids doing redundant work during PHI translation to invalidate mems when translating their corresponding VUSE through the blocks virtual PHI node. All the invalidation work is already done by prune_clobbered_mems. This speeds up the compile of the testcase from 275s with PRE taking 91% of the compile-time down to 43s with PRE taking 16% of the compile-time. 2021-09-07 Richard Biener <rguenther@suse.de> PR tree-optimization/101555 * tree-ssa-pre.c (translate_vuse_through_block): Do not perform an alias walk to determine the validity of the mem at the start of the block which is already guaranteed by means of prune_clobbered_mems. (phi_translate_1): Pass edge to translate_vuse_through_block.	2021-09-07 11:14:17 +02:00
Tobias Burnus	cff72ef4e2	libgomp.texi: Add OpenMP Implementation Status libgomp/ * libgomp.texi (Enabling OpenMP): Refer to OMP spec in general not to 4.5; link to new section. (OpenMP Implementation Status): New.	2021-09-07 11:01:38 +02:00
Sandra Loosemore	13beaf9e8d	Fortran: Revert to non-multilib-specific ISO_Fortran_binding.h Commit `fef67987cf` changed the libgfortran build process to generate multilib-specific versions of ISO_Fortran_binding.h from a template, by running gfortran to identify the values of the Fortran kind constants C_LONG_DOUBLE, C_FLOAT128, and C_INT128_T. This caused multiple problems with search paths, both for build-tree testing and installed-tree use, not all of which have been fixed. This patch reverts to a non-multilib-specific .h file that uses GCC's predefined preprocessor symbols to detect the supported types and map them to kind values in the same way as the Fortran front end. 2021-09-06 Sandra Loosemore <sandra@codesourcery.com> libgfortran/ * ISO_Fortran_binding-1-tmpl.h: Deleted. * ISO_Fortran_binding-2-tmpl.h: Deleted. * ISO_Fortran_binding-3-tmpl.h: Deleted. * ISO_Fortran_binding.h: New file to replace the above. * Makefile.am (gfor_cdir): Remove MULTISUBDIR. (ISO_Fortran_binding.h): Simplify to just copy the file. * Makefile.in: Regenerated. * mk-kinds-h.sh: Revert pieces no longer needed for ISO_Fortran_binding.h.	2021-09-06 21:28:50 -07:00
Xionghu Luo	546ecb0054	rs6000: Expand fmod and remainder when built with fast-math [PR97142] fmod/fmodf and remainder/remainderf could be expanded instead of library call when fast-math build, which is much faster. fmodf: fdivs f0,f1,f2 friz f0,f0 fnmsubs f1,f2,f0,f1 remainderf: fdivs f0,f1,f2 frin f0,f0 fnmsubs f1,f2,f0,f1 SPEC2017 Ofast P8LE: 511.povray_r +1.14%, 526.blender_r +1.72% gcc/ChangeLog: 2021-09-07 Xionghu Luo <luoxhu@linux.ibm.com> PR target/97142 * config/rs6000/rs6000.md (fmod<mode>3): New define_expand. (remainder<mode>3): Likewise. gcc/testsuite/ChangeLog: 2021-09-07 Xionghu Luo <luoxhu@linux.ibm.com> PR target/97142 * gcc.target/powerpc/pr97142.c: New test.	2021-09-06 20:22:50 -05:00
YunQiang Su	58572bbb62	MIPS: add .module arch and ase to all output asm Currently, the asm output file for MIPS has no rev info. It can make some trouble, for example: assembler is mips1 by default, gcc is fpxx by default. To assemble the output of gcc -S, we have to pass -mips2 to assembler. The same situation is for some CPU has extension insn. Octeon is an example. So we can just add ".set arch=octeon". If an ASE is enabled, .module ase will also be used. gcc/ChangeLog: * config/mips/mips.c (mips_file_start): add .module for arch and ase.	2021-09-07 08:45:37 +08:00
GCC Administrator	9f99555f29	Daily bump.	2021-09-07 00:16:34 +00:00
Roger Sayle	74cb45e67d	Correct implementation of wi::clz As diagnosed with Jakub and Richard in the analysis of PR 102134, the current implementation of wi::clz has incorrect/inconsistent behaviour. As mentioned by Richard in comment #7, clz should (always) return zero for negative values, but the current implementation can only return 0 when precision is a multiple of HOST_BITS_PER_WIDE_INT. The fix is simply to reorder/shuffle the existing tests. 2021-09-06 Roger Sayle <roger@nextmovesoftware.com> gcc/ChangeLog * wide-int.cc (wi::clz): Reorder tests to ensure the result is zero for all negative values.	2021-09-06 22:50:45 +01:00

1 2 3 4 5 ...

187830 Commits