range_from_dom makes a recursive call to resolve the immediate dominator
when there are multiple incoming edges to a block. This is not necessary
when the dominator already has an on-entry cache value.
PR tree-optimization/106234
* gimple-range-cache.cc (ranger_cache::range_from_dom): Check dominator
cache value before recursively resolving it.
This patch upgrades x86_64's scalar-to-vector (STV) pass to more
aggressively transform 128-bit scalar TImode operations into vector
V1TImode operations performed on SSE registers. TImode functionality
already exists in STV, but only for move operations. This change
brings support for logical operations (AND, IOR, XOR, NOT and ANDN)
and comparisons.
The effect of these changes is conveniently demonstrated by the new
sse4_1-stv-5.c test case:
__int128 a[16];
__int128 b[16];
__int128 c[16];
void foo()
{
  for (unsigned int i=0; i<16; i++)
    a[i] = b[i] & ~c[i];
}
which when currently compiled on mainline with -O2 -msse4 produces:
foo:    xorl    %eax, %eax
.L2:    movq    c(%rax), %rsi
        movq    c+8(%rax), %rdi
        addq    $16, %rax
        notq    %rsi
        notq    %rdi
        andq    b-16(%rax), %rsi
        andq    b-8(%rax), %rdi
        movq    %rsi, a-16(%rax)
        movq    %rdi, a-8(%rax)
        cmpq    $256, %rax
        jne     .L2
        ret
but with this patch now produces:
foo:    xorl    %eax, %eax
.L2:    movdqa  c(%rax), %xmm0
        pandn   b(%rax), %xmm0
        addq    $16, %rax
        movaps  %xmm0, a-16(%rax)
        cmpq    $256, %rax
        jne     .L2
        ret
Technically, the STV pass is implemented by three C++ classes: an
abstract base class "scalar_chain" that contains common functionality,
and two derived classes: general_scalar_chain (which handles SI and
DI modes) and timode_scalar_chain (which handles TImode). As
mentioned previously, because only TImode moves were handled, the
two worker classes behaved significantly differently. These changes
bring the functionality of these two classes closer together, which
is reflected by refactoring more shared code from general_scalar_chain
to the parent scalar_chain and reusing it from timode_scalar_chain.
There still remain significant differences (and simplifications) so
the existing division of classes (as specializations) continues to
make sense.
2022-07-11 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* config/i386/i386-features.h (scalar_chain): Add fields
insns_conv, n_sse_to_integer and n_integer_to_sse to this
parent class, moved from general_scalar_chain.
(scalar_chain::convert_compare): Protected method moved
from general_scalar_chain.
(mark_dual_mode_def): Make protected, not private virtual.
(scalar_chain::convert_op): New private virtual method.
(general_scalar_chain::general_scalar_chain): Simplify constructor.
(general_scalar_chain::~general_scalar_chain): Delete destructor.
(general_scalar_chain): Move insns_conv, n_sse_to_integer and
n_integer_to_sse fields to parent class, scalar_chain.
(general_scalar_chain::mark_dual_mode_def): Delete prototype.
(general_scalar_chain::convert_compare): Delete prototype.
(timode_scalar_chain::compute_convert_gain): Remove simplistic
implementation, convert to a method prototype.
(timode_scalar_chain::mark_dual_mode_def): Delete prototype.
(timode_scalar_chain::convert_op): Prototype new virtual method.
* config/i386/i386-features.cc (scalar_chain::scalar_chain):
Allocate insns_conv and initialize n_sse_to_integer and
n_integer_to_sse fields in constructor.
(scalar_chain::~scalar_chain): Free insns_conv in destructor.
(general_scalar_chain::general_scalar_chain): Delete
constructor, now defined in the class declaration.
(general_scalar_chain::~general_scalar_chain): Delete destructor.
(scalar_chain::mark_dual_mode_def): Renamed from
general_scalar_chain::mark_dual_mode_def.
(timode_scalar_chain::mark_dual_mode_def): Delete.
(scalar_chain::convert_compare): Renamed from
general_scalar_chain::convert_compare.
(timode_scalar_chain::compute_convert_gain): New method to
determine the gain from converting a TImode chain to V1TImode.
(timode_scalar_chain::convert_op): New method to convert an
operand from TImode to V1TImode.
(timode_scalar_chain::convert_insn) <case REG>: Only PUT_MODE
on REG_EQUAL notes that were originally TImode (not CONST_INT).
Handle AND, ANDN, XOR, IOR, NOT and COMPARE.
(timode_mem_p): Helper predicate to check whether an operand is a
memory reference with sufficient alignment for TImode STV.
(timode_scalar_to_vector_candidate_p): Use convertible_comparison_p
to check whether COMPARE is convertible. Handle SET_DESTs that
are REG_P or MEM_P and SET_SRCs that are REG, CONST_INT,
CONST_WIDE_INT, MEM, AND, ANDN, IOR, XOR or NOT.
gcc/testsuite/ChangeLog
* gcc.target/i386/sse4_1-stv-2.c: New test case, pand.
* gcc.target/i386/sse4_1-stv-3.c: New test case, por.
* gcc.target/i386/sse4_1-stv-4.c: New test case, pxor.
* gcc.target/i386/sse4_1-stv-5.c: New test case, pandn.
* gcc.target/i386/sse4_1-stv-6.c: New test case, ptest.
In g:76c3041b856cb0 I'd removed a "C ? optab_vector : optab_mixed_sign"
argument from a call to directly_supported_p, thinking that the argument
only existed because of the condition (which I was removing). But the
difference between the scalar and vector forms matters for shifts,
so we do still need the argument.
gcc/
PR tree-optimization/106250
* tree-vect-loop.cc (vectorizable_reduction): Reinstate final
argument to directly_supported_p.
In r13-1544, handle_pragma_diagnostic was refactored to support processing
early pragmas. During that process, the part looking up option arguments was
inadvertently moved too early, prior to checking that the option was valid,
causing PR106252. Fixed by moving the check back to where it belongs.
gcc/c-family/ChangeLog:
PR preprocessor/106252
* c-pragma.cc (handle_pragma_diagnostic_impl): Don't look up the
option argument prior to verifying the option was found.
When working on a smaller region, such as a loop version copy, most of
the time is now spent recomputing the dominance fast queries, which does
a full function DFS walk. The dominance queries within the region of
interest should be O(log n) without fast queries and we should do
on the order of O(n) of them which overall means reasonable
complexity.
For the artificial testcase I'm looking at this shaves off
considerable time again.
* tree-into-ssa.cc (update_ssa): Do not forcefully
re-compute dominance fast queries for TODO_update_ssa_no_phi.
The following fixes the last commit to honor the case we are not
vectorizing a loop.
PR tree-optimization/106228
* tree-vect-data-refs.cc (vect_setup_realignment): Adjust
VUSE compute for the non-loop case.
When we do TODO_update_ssa_no_phi we already avoid computing
dominance frontiers for all blocks - it is worth also avoiding
walking all dominated blocks in the update domwalk and restricting
the walk to the SEME region with the affected blocks. We can
do that by walking the CFG in reverse from blocks_to_update to
the common immediate dominator, marking blocks in the region
and telling the domwalk to STOP when leaving it.
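As a sketch in plain C++ (illustrative only; the actual implementation
uses GCC's bitmap and domwalk APIs, and these names are made up):

#include <stack>
#include <vector>

struct block
{
  std::vector<int> preds;   /* predecessor block ids */
  bool in_region = false;   /* set for blocks inside the SEME region */
};

/* Walk the CFG in reverse from the blocks needing an SSA update up to
   the common immediate dominator ENTRY, marking every block visited;
   the update domwalk can then STOP at any unmarked block.  */
void
mark_seme_region (std::vector<block> &cfg,
                  const std::vector<int> &blocks_to_update, int entry)
{
  std::stack<int> work;
  for (int b : blocks_to_update)
    if (!cfg[b].in_region)
      {
        cfg[b].in_region = true;
        work.push (b);
      }
  cfg[entry].in_region = true;  /* the region entry is part of it */
  while (!work.empty ())
    {
      int b = work.top ();
      work.pop ();
      if (b == entry)
        continue;  /* do not walk above the region entry */
      for (int p : cfg[b].preds)
        if (!cfg[p].in_region)
          {
            cfg[p].in_region = true;
            work.push (p);
          }
    }
}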
For an artificial testcase with N adjacent loops with one
unswitching opportunity that takes the incremental SSA updating
off the -ftime-report radar:
tree loop unswitching : 11.25 ( 3%) 0.09 ( 5%) 11.53 ( 3%) 36M ( 9%)
`- tree SSA incremental : 35.74 ( 9%) 0.07 ( 4%) 36.65 ( 9%) 2734k ( 1%)
improves to
tree loop unswitching : 10.21 ( 3%) 0.05 ( 3%) 11.50 ( 3%) 36M ( 9%)
`- tree SSA incremental : 0.66 ( 0%) 0.02 ( 1%) 0.49 ( 0%) 2734k ( 1%)
For less localized updates the SEME region likely isn't constrained
enough, so I've restricted the extra work to TODO_update_ssa_no_phi
callers.
* tree-into-ssa.cc (rewrite_mode::REWRITE_UPDATE_REGION): New.
(rewrite_update_dom_walker::rewrite_update_dom_walker): Update.
(rewrite_update_dom_walker::m_in_region_flag): New.
(rewrite_update_dom_walker::before_dom_children): If the region
to update is marked, STOP at exits.
(rewrite_blocks): For REWRITE_UPDATE_REGION mark the region
to be updated.
(dump_update_ssa): Use bitmap_empty_p.
(update_ssa): Likewise. Use REWRITE_UPDATE_REGION when
TODO_update_ssa_no_phi.
* tree-cfgcleanup.cc (cleanup_tree_cfg_noloop): Account
pending update_ssa to the caller.
The following avoids the need to massage the target optimization
node at WPA time when we fixup the optimization node, copying
FP related flags from callee to caller. The target is already
set up to fixup, but that only works when not switching between
functions. After fixing that the fixup is then done at LTRANS
time when materializing the function.
2022-07-01 Richard Biener <rguenther@suse.de>
PR target/105459
* config/i386/i386-options.cc (ix86_set_current_function):
Rebuild the target optimization node whenever necessary,
not only when the optimization node didn't change.
* gcc.dg/lto/pr105459_0.c: New testcase.
The following adds the missing assignment of a virtual use operand to a
load created in vect_setup_realignment, which shows as a bootstrap
failure on powerpc64-linux and extra testsuite failures for targets
where misaligned loads are not supported or not optimal.
PR tree-optimization/106228
* tree-vect-data-refs.cc (vect_setup_realignment): Properly
set a VUSE operand on the emitted load.
Currently SSA_NAME_RANGE_INFO only handles integer ranges, and loses
half the precision in the process because of its use of legacy
value_range's. This patch rewrites all the SSA_NAME_RANGE_INFO
(nonzero bits included) to use the recently contributed
vrange_storage. With it, we'll be able to efficiently save any ranges
supported by ranger in GC memory. Presently this will only be
irange's, but shortly we'll add floating ranges and others to the mix.
As per the discussion with the trailing_wide_ints adjustments and
vrange_storage, we'll be able to save integer ranges with a maximum of
5 sub-ranges. This could be adjusted later if more sub-ranges are
needed (unlikely).
Since this is a behavior changing patch, I would like to take a few
days for discussion, and commit early next week if all goes well.
A few notes.
First, we get rid of the SSA_NAME_ANTI_RANGE_P bit in the SSA_NAME
since we store full resolution ranges. Perhaps it could be re-used
for something else.
The range_info_def struct is gone in favor of an opaque type handled
by vrange_storage. It currently supports irange, but will support
frange, prange, etc, in due time.
From the looks of it, set_range_info was an update operation despite
its name, as we improved the nonzero bits with each call, even though
we clobbered the ranges. Presumably this was because doing a proper
intersect of ranges lost information with the anti-range hack. We no
longer have this limitation, so now we formalize both set_range_info
and set_nonzero_bits as update operations. After all, we should
never be losing information, but enhancing it whenever possible. This
means that, if folks' finger-memory is not offended, as a follow-up,
I'd like to rename set_nonzero_bits and set_range_info to update_*.
I have kept the same global API we had in tree-ssanames.h, with the
caveat that all set operations are now update as discussed above.
There is a 2% performance penalty for evrp and a 3% penalty for VRP
that is coincidentally in line with a previous improvement of the same
amount in the vrange abstraction patchset. Interestingly, this
penalty is mostly due to the wide int to tree dance we keep doing with
irange and legacy. In a first draft of this patch where I was
streaming trees directly, there was actually a small improvement
instead. I hope to get some of the gain back when we move irange's to
wide-ints, though I'm not in a hurry ;-).
Tested and benchmarked on x86-64 Linux. Tested on ppc64le Linux.
Comments welcome.
gcc/ChangeLog:
* gimple-range.cc (gimple_ranger::export_global_ranges): Remove
verification against legacy value_range.
(gimple_ranger::register_inferred_ranges): Same.
(gimple_ranger::export_global_ranges): Rename update_global_range
to set_range_info.
* tree-core.h (struct range_info_def): Remove.
(struct irange_storage_slot): New.
(struct tree_base): Remove SSA_NAME_ANTI_RANGE_P documentation.
(struct tree_ssa_name): Add vrange_storage support.
* tree-ssanames.cc (range_info_p): New.
(range_info_fits_p): New.
(range_info_alloc): New.
(range_info_free): New.
(range_info_get_range): New.
(range_info_set_range): New.
(set_range_info_raw): Remove.
(set_range_info): Adjust to use vrange_storage.
(set_nonzero_bits): Same.
(get_nonzero_bits): Same.
(duplicate_ssa_name_range_info): Remove overload taking
value_range_kind.
Rewrite tree overload to use vrange_storage.
(duplicate_ssa_name_fn): Adjust to use vrange_storage.
* tree-ssanames.h (struct range_info_def): Remove.
(set_range_info): Adjust prototype to take vrange.
* tree-vrp.cc (vrp_asserts::remove_range_assertions): Call
duplicate_ssa_name_range_info.
* tree.h (SSA_NAME_ANTI_RANGE_P): Remove.
(SSA_NAME_RANGE_TYPE): Remove.
* value-query.cc (get_ssa_name_range_info): Adjust to use
vrange_storage.
(update_global_range): Remove.
(get_range_global): Remove as_a<irange>.
* value-query.h (update_global_range): Remove.
* tree-ssa-dom.cc (set_global_ranges_from_unreachable_edges):
Rename update_global_range to set_range_info.
* value-range-storage.cc (vrange_storage::alloc_slot): Remove
gcc_unreachable.
The handling of #pragma GCC diagnostic uses input_location, which is not always
as precise as needed; in particular the relative location of some tokens and a
_Pragma directive will crucially determine whether a given diagnostic is enabled
or suppressed in the desired way. PR97498 shows how the C frontend ends up with
input_location pointing to the beginning of the line containing a _Pragma()
directive, resulting in the wrong behavior if the diagnostic to be modified
pertains to some tokens found earlier on the same line. This patch fixes that by
addressing two issues:
a) libcpp was not assigning a valid location to the CPP_PRAGMA token
generated by the _Pragma directive.
b) the C frontend was not setting input_location to something reasonable.
With this change, the C frontend is able to change input_location to point to
the _Pragma token as needed.
This is just a two-line fix (one for each of a) and b)), the testsuite changes
were needed only because the location on the tested warnings has been somewhat
improved, so the tests need to look for the new locations.
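As a hedged illustration of the kind of sensitivity involved (this is
my own sketch, not the new pr97498.c test):

/* Whether -Wunused-function fires for 'f' depends on the location
   recorded for the pragma: the _Pragma token itself (after 'f', so
   the warning is kept) versus the start of the line (before 'f', so
   the warning would wrongly be suppressed).  */
static void f (void) {} _Pragma ("GCC diagnostic ignored \"-Wunused-function\"")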
gcc/c/ChangeLog:
PR preprocessor/97498
* c-parser.cc (c_parser_pragma): Set input_location to the
location of the pragma, rather than the start of the line.
libcpp/ChangeLog:
PR preprocessor/97498
* directives.cc (destringize_and_run): Override the location of
the CPP_PRAGMA token from a _Pragma directive to the location of
the expansion point, as is done for the tokens lexed from it.
gcc/testsuite/ChangeLog:
PR preprocessor/97498
* c-c++-common/pr97498.c: New test.
* c-c++-common/gomp/pragma-3.c: Adapt for improved warning locations.
* c-c++-common/gomp/pragma-5.c: Likewise.
* gcc.dg/pragma-message.c: Likewise.
libgomp/ChangeLog:
* testsuite/libgomp.oacc-c-c++-common/reduction-5.c: Adapt for
improved warning locations.
* testsuite/libgomp.oacc-c-c++-common/vred2d-128.c: Likewise.
In discussions with Andrew we realized varying_p() was returning true
for a range of the entire domain with a non-empty nonzero mask. This
is confusing as varying_p() should only return true when absolutely no
information is available. A nonzero mask that has any cleared bits is
extra information and must return false for varying_p(). This patch
fixes this oversight. Now a range of the entire domain with nonzero
bits is internally set to VR_RANGE (with the appropriate end points
set). VR_VARYING ranges must have a null nonzero mask.
Also, the union and intersect code were not quite right in the presence of
nonzero masks. Sometimes we would drop masks to -1 unnecessarily. I
was trying to be too smart in avoiding extra work when the mask was
NULL, but there's also an implicit mask in the range that must be
taken into account. For example, [0,0] may have no nonzero bits set
explicitly, but the mask is really 0x0. This will all be simpler when
we drop trees, because the nonzero bits will always be set, even if
-1.
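A small model of that implicit mask, simplified to unsigned int (the
real code works on wide_ints; the exact formula here is my own sketch):

#include <stdio.h>

/* For a range [0, hi] every bit above the highest set bit of hi is
   known zero, so the effective nonzero mask keeps only the low bits;
   for [0,0] it is 0x0 even though no mask was stored explicitly.  */
static unsigned
implied_mask (unsigned hi)
{
  unsigned m = 0;
  while (m < hi)
    m = (m << 1) | 1;
  return m;
}

int
main (void)
{
  printf ("[0,0]   -> 0x%x\n", implied_mask (0));   /* 0x0 */
  printf ("[0,5]   -> 0x%x\n", implied_mask (5));   /* 0x7 */
  printf ("[0,255] -> 0x%x\n", implied_mask (255)); /* 0xff */
  return 0;
}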
Finally, I've added unit tests to the nonzero mask code. This should
help us maintain sanity going forward.
There should be no visible changes, as the main consumer of this code
is the SSA_NAME_RANGE_INFO patchset which has yet to be committed.
Tested on x86-64 Linux.
gcc/ChangeLog:
* value-range.cc (irange::operator=): Call verify_range.
(irange::irange_set): Normalize kind after everything else has
been set.
(irange::irange_set_anti_range): Same.
(irange::set): Same.
(irange::verify_range): Disallow nonzero masks for VARYING.
(irange::irange_union): Call verify_range.
Handle nonzero masks better.
(irange::irange_intersect): Same.
(irange::set_nonzero_bits): Calculate mask if either range has an
explicit mask.
(irange::intersect_nonzero_bits): Same.
(irange::union_nonzero_bits): Same.
(range_tests_nonzero_bits): New.
(range_tests): Call range_tests_nonzero_bits.
* value-range.h (class irange): Remove set_nonzero_bits method
with trees.
(irange::varying_compatible_p): Set nonzero mask.
Like add.w/sub.w/mul.w, div.w/mod.w/div.wu/mod.wu also sign-extend the
output on LA64. But LoongArch v1.00 mandates that the inputs of 32-bit
division be sign-extended, so we have to expand 32-bit division into
RTL sequences.
We defined div.w/mod.w/div.wu/mod.wu as a (DI, DI) -> SI instruction.
This definition does not convey the fact that these instructions
store the result as a sign-extended value in a 64-bit GR, so the
compiler would emit unnecessary sign-extend operations. For example:
int div(int a, int b) { return a / b; }
was compiled to:
div.w $r4, $r4, $r5
slli.w $r4, $r4, 0 # this is unnecessary
jr $r1
To remove this unnecessary operation, we change the division
instructions to (DI, DI) -> DI and describe the sign-extend behavior
explicitly in the RTL template. In the expander for 32-bit division we
then use simplify_gen_subreg to extract the lower 32 bits.
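An executable model of the resulting div.w semantics (a sketch of the
instruction's behavior, not compiler code):

#include <assert.h>
#include <stdint.h>

/* Both inputs must already hold sign-extended 32-bit values in their
   64-bit GRs (the LoongArch v1.00 input rule), and the quotient is
   produced already sign-extended, so no slli.w is needed after it.  */
int64_t
la64_div_w (int64_t rj, int64_t rk)
{
  assert (rj == (int32_t) rj && rk == (int32_t) rk);
  return (int64_t) ((int32_t) rj / (int32_t) rk);
}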
gcc/ChangeLog:
* config/loongarch/loongarch.md (<any_div>di3_fake): Describe
the sign-extend of result in the RTL template.
(<any_div><mode>3): Adjust for <any_div>di3_fake change.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/div-4.c: New test.
Currently in the description of LoongArch integer division instructions,
the output is marked as earlyclobbered ('&'). It's necessary when
loongarch_check_zero_div_p() is true because clobbering operand 2 (the
divisor) would make checking for a zero divisor impossible.
But, for -mno-check-zero-division (the default of GCC >= 12.2 for
optimized code), the output is not earlyclobbered at all. And, the
read of operand 1 only occurs before clobbering the output. So we make
three alternatives for an idiv instruction:
* (=r,r,r): For -mno-check-zero-division.
* (=&r,r,r): For -mcheck-zero-division.
* (=&r,0,r): For -mcheck-zero-division, to explicitly allow patterns
like "div.d $a0, $a0, $a1".
gcc/ChangeLog:
* config/loongarch/loongarch.cc (loongarch_check_zero_div_p):
Remove static, for use in the machine description file.
* config/loongarch/loongarch-protos.h:
(loongarch_check_zero_div_p): Add prototype.
* config/loongarch/loongarch.md (enabled): New attr.
(*<optab><mode>3): Add (=r,r,r) and (=&r,0,r) alternatives for
idiv. Conditionally enable the alternatives using
loongarch_check_zero_div_p.
(<optab>di3_fake): Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/div-1.c: New test.
* gcc.target/loongarch/div-2.c: New test.
* gcc.target/loongarch/div-3.c: New test.
(mult (sign_extend:DI rj:SI) (sign_extend:DI rk:SI)) should be
"mulw.d.w", not "mul.d".
gcc/ChangeLog:
* config/loongarch/loongarch.md (mulsidi3_64bit): Use mulw.d.w
instead of mul.d.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/mulw_d_w.c: New test.
* gcc.c-torture/execute/mul-sext.c: New test.
The fast union operation is sometimes setting a range of the entire
domain, but leaving the kind bit as VR_RANGE instead of downgrading it
to VR_VARYING.
Tested on x86-64 Linux.
gcc/ChangeLog:
* value-range.cc (irange::irange_single_pair_union): Set
VR_VARYING when appropriate.
gcc/
* expr.cc (store_expr): Identify trailing NULs in a STRING_CST
initializer and use clear_storage rather than copying the
NULs to the destination array.
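For illustration, the kind of store this affects (a hypothetical
example, not taken from the commit; 'use' is a made-up helper):

extern void use (char *);

void
g (void)
{
  char buf[100] = "abc";  /* the few string bytes are copied; the
                             many trailing NUL bytes are now emitted
                             via clear_storage instead */
  use (buf);
}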
The comment is obsolete because char_traits.h does not include stl_algobase.h
anymore and stl_algobase.h is included directly from <string> a few lines
below.
libstdc++-v3/ChangeLog:
* include/std/string: Remove obsolete comment about char_traits.h including
stl_algobase.h.
The patch tweaks several peephole2s in i386.md that propagate the flags
register, but take its mode from the SET_SRC rather than preserve the
mode of the original SET_DEST. This encounters problems when the
SET_SRC is a VOIDmode CONST_INT. Fixed by using match_operand with a
flags_reg_operand predicate.
2022-07-09 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* config/i386/i386.md (define_peephole2): Use match_operand of
flags_reg_operand to capture and preserve the mode of FLAGS_REG.
(define_peephole2): Likewise.
(define_peephole2): Likewise...
This patch fixes the current two FAILs of pr65105-5.c on x86 when
compiled with -m32. These (temporary) breakages were fallout from my
patches to improve/upgrade (scalar) double word comparisons.
On mainline, the i386 backend currently represents a critical comparison
using (compare (and (not reg1) reg2) (const_int 0)) which isn't/wasn't
recognized by the STV pass' convertible_comparison_p. This simple STV
patch adds support for this pattern (*testdi_not_doubleword) and
generates the vector pandn and ptest instructions expected in the
existing (failing) test case.
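A hedged illustration in the spirit of pr65105-5.c (not the exact
test) of code that now STVs to pandn plus ptest with -m32:

int
foo (long long a, long long b)
{
  return (~a & b) == 0;
}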
2022-07-09 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* config/i386/i386-features.cc (convert_compare): Add support
for *testdi_not_doubleword pattern, "(compare (and (not ...)))"
by generating a pandn followed by ptest.
(convertible_comparison_p): Recognize both *cmpdi_doubleword and
recent *testdi_not_doubleword comparison patterns.
Hi All,
I seem to have broken the s390 bootstrap because I added a new parameter to the
store_bit_field function to indicate whether the value of the field being set
is currently undefined.
If it's undefined we use a subreg instead. In this case the value of false
restores the old behavior.
Ok for master?
Thanks,
Tamar
gcc/ChangeLog:
* config/s390/s390.cc (s390_expand_atomic): Pass false to store_bit_field to
indicate that the value is not undefined.
The problem here is that when we mark the ssa name that was referenced in the now-removed
dead store (to a write-only static variable), the inline-asm that computed it would also be
removed, even though it was also defining another ssa name. This fixes the problem by
checking that the statement defines only one ssa name.
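A sketch of the bug's shape (illustrative; the new inline-asm-1.c test
may differ):

static int wo;  /* write-only static variable */

int
f (int x)
{
  int a, b;
  asm ("" : "=r" (a), "=r" (b) : "r" (x));
  wo = a;    /* dead store; removing it leaves 'a' unused */
  return b;  /* but the asm also defines 'b', so it must stay */
}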
Committed as approved after being bootstrapped and tested on x86_64 with no regressions.
PR tree-optimization/106087
gcc/ChangeLog:
* tree-ssa-dce.cc (simple_dce_from_worklist): Check
to make sure the statement is only defining one operand.
gcc/testsuite/ChangeLog:
* gcc.c-torture/compile/inline-asm-1.c: New test.
QNX uses sys/link.h rather than link.h for dl_iterate_phdr
Fixes https://github.com/ianlancetaylor/libbacktrace/issues/86
* configure.ac: Check for sys/link.h. Use either link.h or
sys/link.h when checking for dl_iterate_phdr.
* elf.c: Include sys/link.h if available.
* configure, config.h.in: Regenerate.
For gcc.dg/tree-ssa/alias-access-path-13.c to work, SRA must think of
accesses to foo.inn.val and to foo itself as different ones, i.e. they
need to have different offset and size, which on 32bit platforms they
do not. Fixed by replacing a dummy long int field of the union with a
struct of two integers.
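A sketch of the shape of the fix (field names are illustrative, not
copied from the test):

union foo
{
  struct inner { int val; } inn;
  struct { int x, y; } pad;  /* was a single 'long int', which has
                                the same size as int on 32bit */
};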
Tested by:
make -k check-gcc RUNTESTFLAGS="tree-ssa.exp=alias-access-path-13.c" and
make -k check-gcc RUNTESTFLAGS="--target_board=unix'{-m32}' tree-ssa.exp=alias-access-path-13.c"
on an x86_64-linux, also with patched SRA to verify it still tests the
original intent.
gcc/testsuite/ChangeLog:
2022-07-08 Martin Jambor <mjambor@suse.cz>
PR testsuite/106216
* gcc.dg/tree-ssa/alias-access-path-13.c (union foo): Replace a long
int field with a struct that is larger than an int also on 32bit
platforms.
libcpp recognizes a lone \r as a valid line ending, so the infrastructure
for retrieving source lines to be output in diagnostics needs to do the
same. This patch fixes file_cache_slot::get_next_line() accordingly so that
diagnostics display the correct part of the source when \r line endings are in
use.
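The idea behind the helper, sketched (the real find_end_of_line in
input.cc may differ in details):

/* A line can end in '\n', '\r' or "\r\n", so scanning for '\n'
   alone misses files with lone-\r line endings.  */
static const char *
find_eol_sketch (const char *p, const char *end)
{
  for (; p < end; ++p)
    if (*p == '\n' || *p == '\r')
      return p;
  return end;
}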
gcc/ChangeLog:
PR preprocessor/91733
* input.cc (find_end_of_line): New helper function.
(file_cache_slot::get_next_line): Recognize \r as a line ending.
* diagnostic-show-locus.cc (test_escaping_bytes_1): Adapt selftest
since \r will now be interpreted as a line-ending.
gcc/testsuite/ChangeLog:
PR preprocessor/91733
* c-c++-common/pr91733.c: New test.
Split the report_conflicting_sanitizer_options(..., SANITIZE_ADDRESS
| SANITIZE_HWADDRESS) call into two calls, as we don't have any
conflicting option that covers both address and hwaddress at once.
PR sanitizer/106132
gcc/ChangeLog:
* opts.cc (finish_options): Use 2 calls to
report_conflicting_sanitizer_options.
gcc/testsuite/ChangeLog:
* c-c++-common/hwasan/arguments-3.c: Cover new ICE.
When we knowingly have broken virtual SSA form we need to update
it before we eventually perform slpeel manual updating which will
call delete_update_ssa. Currently that's done on-demand but
communicating whether it's a known unavoidable case is broken
there. The following makes that a synchronous operation, but
instead of actually performing the update we record the
need, clear the update SSA sub-state and force virtual renaming
at the very end of the vectorization pass.
PR tree-optimization/106226
* tree-vect-loop-manip.cc (vect_do_peeling): Assert that
no SSA update is needed. Move virtual SSA update ...
* tree-vectorizer.cc (pass_vectorize::execute): ... here,
via forced virtual renaming when TODO_update_ssa_only_virtuals
is queued.
(vect_transform_loops): Return TODO_update_ssa_only_virtuals
when virtual SSA update is required.
(try_vectorize_loop_1): Adjust.
* tree-vect-stmts.cc (vectorizable_simd_clone_call): Allow
virtual renaming if the ABI forces an aggregate return
but the original call did not have a virtual definition.
* gfortran.dg/pr106226.f: New testcase.
Right now the following is printed:
lto-dump
.file "<artificial>"
.ident "GCC: (GNU) 13.0.0 20220707 (experimental)"
.section .note.GNU-stack,"",@progbits
After the patch we print -help and do not emit any assembly output:
lto-dump
Usage: lto-dump [OPTION]... SUB_COMMAND [OPTION]...
LTO dump tool command line options.
-list [options] Dump the symbol list.
-demangle Dump the demangled output.
-defined-only Dump only the defined symbols.
...
gcc/lto/ChangeLog:
* lto-dump.cc (lto_main): Exit in the function
as we don't want any LTO bytecode processing.
gcc/ChangeLog:
* toplev.cc (init_asm_output): Do not init asm_out_file.
Hi All,
My previous patch can cause a problem if the pattern matches after veclower
as it may replace the construct with a vector sequence which the target may not
directly support.
As such, don't perform the rewriting after veclower unless the target
supports the operation. Before veclower, do the rewriting as well if
the target didn't support the original operation either.
gcc/ChangeLog:
PR tree-optimization/106063
* match.pd: Do not apply the pattern after veclower if the operation
is not supported.
gcc/testsuite/ChangeLog:
PR tree-optimization/106063
* gcc.dg/pr106063.c: New test.
When lowering COMPLEX_EXPR we currently emit two VEC_EXTRACTs. One for the
lowpart and one for the highpart.
The problem with this is that in RTL the lvalue of the RTX is the only thing
tying the two instructions together.
This means that e.g. combine is unable to try to combine the two instructions
for setting the lowpart and highpart.
For ISAs that have bit-extract instructions we can eliminate one of the extracts
if, and only if, we're setting the entire complex number.
This patch changes the expand code, when we're setting the entire complex number,
to generate a subreg for the lowpart instead of a vec_extract.
This allows us to optimize sequences such as:
_Complex int f(int a, int b) {
  _Complex int t = a + b * 1i;
  return t;
}
from:
f:
        bfi     x2, x0, 0, 32
        bfi     x2, x1, 32, 32
        mov     x0, x2
        ret
into:
f:
        bfi     x0, x1, 32, 32
        ret
I have also confirmed the codegen for x86_64 did not change.
gcc/ChangeLog:
* expmed.cc (store_bit_field_1): Add parameter that indicates if value is
still undefined and if so emit a subreg move instead.
(store_integral_bit_field): Likewise.
(store_bit_field): Likewise.
* expr.h (write_complex_part): Likewise.
* expmed.h (store_bit_field): Add new parameter.
* builtins.cc (expand_ifn_atomic_compare_exchange_into_call): Use new
parameter.
(expand_ifn_atomic_compare_exchange): Likewise.
* calls.cc (store_unaligned_arguments_into_pseudos): Likewise.
* emit-rtl.cc (validate_subreg): Likewise.
* expr.cc (emit_group_store): Likewise.
(copy_blkmode_from_reg): Likewise.
(copy_blkmode_to_reg): Likewise.
(clear_storage_hints): Likewise.
(write_complex_part): Likewise.
(emit_move_complex_parts): Likewise.
(expand_assignment): Likewise.
(store_expr): Likewise.
(store_field): Likewise.
(expand_expr_real_2): Likewise.
* ifcvt.cc (noce_emit_move_insn): Likewise.
* internal-fn.cc (expand_arith_set_overflow): Likewise.
(expand_arith_overflow_result_store): Likewise.
(expand_addsub_overflow): Likewise.
(expand_neg_overflow): Likewise.
(expand_mul_overflow): Likewise.
(expand_arith_overflow): Likewise.
gcc/testsuite/ChangeLog:
* g++.target/aarch64/complex-init.C: New test.
Under the LA architecture, when the stack is dropped by too large an amount,
the process of dropping the stack is divided into two steps.
step1: Drop part of the stack and save the callee-saved registers on it.
step2: Drop the rest of the stack.
The stack drop operation is optimized when frame->total_size minus
frame->sp_fp_offset is an integer multiple of 4096, which reduces the number
of instructions required to drop the stack. However, this optimization was
not taking effect because of the original calculation method.
Consider the following case:
#include <stdio.h>

int main()
{
  char buf[1024 * 12];
  printf ("%p\n", buf);
  return 0;
}
As you can see from the generated assembler, the old GCC emits two more
instructions than the new GCC (lines 14 and 24).
new old
10 main: | 11 main:
11 addi.d $r3,$r3,-16 | 12 lu12i.w $r13,-12288>>12
12 lu12i.w $r13,-12288>>12 | 13 addi.d $r3,$r3,-2032
13 lu12i.w $r5,-12288>>12 | 14 ori $r13,$r13,2016
14 lu12i.w $r12,12288>>12 | 15 lu12i.w $r5,-12288>>12
15 st.d $r1,$r3,8 | 16 lu12i.w $r12,12288>>12
16 add.d $r12,$r12,$r5 | 17 st.d $r1,$r3,2024
17 add.d $r3,$r3,$r13 | 18 add.d $r12,$r12,$r5
18 add.d $r5,$r12,$r3 | 19 add.d $r3,$r3,$r13
19 la.local $r4,.LC0 | 20 add.d $r5,$r12,$r3
20 bl %plt(printf) | 21 la.local $r4,.LC0
21 lu12i.w $r13,12288>>12 | 22 bl %plt(printf)
22 add.d $r3,$r3,$r13 | 23 lu12i.w $r13,8192>>12
23 ld.d $r1,$r3,8 | 24 ori $r13,$r13,2080
24 or $r4,$r0,$r0 | 25 add.d $r3,$r3,$r13
25 addi.d $r3,$r3,16 | 26 ld.d $r1,$r3,2024
26 jr $r1 | 27 or $r4,$r0,$r0
| 28 addi.d $r3,$r3,2032
| 29 jr $r1
gcc/ChangeLog:
* config/loongarch/loongarch.cc (loongarch_compute_frame_info):
Modify fp_sp_offset and gp_sp_offset's calculation method,
when frame->mask or frame->fmask is zero, don't minus UNITS_PER_WORD
or UNITS_PER_FP_REG.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/prolog-opt.c: New test.
In r13-1045-gcb7fd1ea85feea I assumed that substitution into generic
DECL_TI_ARGS corresponds to an identity mapping of the given arguments,
and hence it's safe to always elide such substitution. But this PR
demonstrates that such a substitution isn't always the identity mapping,
in particular when there's an ARGUMENT_PACK_SELECT argument, which gets
handled specially during substitution:
* when substituting an APS into a template parameter, we strip the
APS to its underlying argument;
* and when substituting an APS into a pack expansion, we strip the
APS to its underlying argument pack.
In this testcase, when expanding the pack expansion pattern (idx + Ns)...
with Ns={0,1}, we specialize idx twice, first with Ns=APS<0,{0,1}> and
then Ns=APS<1,{0,1}>. The DECL_TI_ARGS of idx are the generic template
arguments of the enclosing class template impl, so before r13-1045,
we'd substitute into its DECL_TI_ARGS which gave Ns={0,1} as desired.
But after r13-1045, we elide this substitution and end up attempting to
hash the original Ns argument, an APS, which ICEs.
So this patch reverts that part of r13-1045. I considered using
preserve_args in this case instead, but that'd break the static_assert
in the testcase because preserve_args always strips APS to its
underlying argument, but here we want to strip it to its underlying
argument pack, so we'd incorrectly end up forming the specializations
impl<0>::idx and impl<1>::idx instead of impl<0,1>::idx.
Although we can't elide the substitution into DECL_TI_ARGS in light of
ARGUMENT_PACK_SELECT, it should still be safe to elide template argument
coercion in the case of a non-template decl, which this patch preserves.
It's unfortunate that we need to remove this optimization just because
it doesn't hold for one special tree code. So this patch implements a
heuristic in tsubst_template_args to avoid allocating a new TREE_VEC if
the substituted elements are identical to those of a level from ARGS, as
well as a similar heuristic for tsubst_argument_pack. It turns out that
about 40% of all calls to tsubst_template_args benefit from this, and it
reduces memory usage by about 4% for e.g. range-v3's zip.cpp (relative to
r13-1045) which more than makes up for the reversion.
PR c++/105956
gcc/cp/ChangeLog:
* pt.cc (template_arg_to_parm): Define.
(tsubst_argument_pack): Try to reuse the corresponding
ARGUMENT_PACK from 'args' when substituting into a generic
ARGUMENT_PACK for a variadic template parameter.
(tsubst_template_args): Move variable declarations closer to
their first use. Replace 'orig_t' with 'r'. Rename 'need_new'
to 'const_subst_p'. Heuristically detect if the substituted
elements are identical to that of a level from 'args' and avoid
allocating a new TREE_VEC if so. Add sanity check for the
length of the new TREE_VEC, and remove dead ARGUMENT_PACK_P test.
(tsubst_decl) <case TYPE_DECL, case VAR_DECL>: Revert
r13-1045-gcb7fd1ea85feea change for avoiding substitution into
DECL_TI_ARGS, but still avoid coercion in this case.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/variadic183.C: New test.
libcpp's class label_text stores a char * for a string and a flag saying
whether it owns the buffer. I added this class before we could use
C++11, and so to avoid lots of copying it required an explicit call
to label_text::maybe_free to potentially free the buffer.
Now that we can use C++11, this patch removes label_text::maybe_free in
favor of doing the cleanup in the destructor, and using C++ move
semantics to avoid any copying. This allows lots of messy cleanup code
to be eliminated in favor of implicit destruction (mostly in the
analyzer).
No functional change intended.
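A condensed sketch of the resulting design (the real label_text in
libcpp/include/line-map.h has more to it, e.g. take/borrow factories):

#include <cstdlib>

class label_text_sketch
{
public:
  label_text_sketch () : m_buffer (nullptr), m_owned (false) {}
  ~label_text_sketch () { if (m_owned) free (m_buffer); }

  /* Movable but not copyable: buffers transfer without copies or
     double frees, and are released in the destructor instead of via
     an explicit maybe_free call.  */
  label_text_sketch (label_text_sketch &&other)
    : m_buffer (other.m_buffer), m_owned (other.m_owned)
  {
    other.m_buffer = nullptr;
    other.m_owned = false;
  }
  label_text_sketch &operator= (label_text_sketch &&other)
  {
    if (this != &other)
      {
        if (m_owned)
          free (m_buffer);
        m_buffer = other.m_buffer;
        m_owned = other.m_owned;
        other.m_buffer = nullptr;
        other.m_owned = false;
      }
    return *this;
  }
  label_text_sketch (const label_text_sketch &) = delete;
  label_text_sketch &operator= (const label_text_sketch &) = delete;

private:
  char *m_buffer;
  bool m_owned;
};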
gcc/analyzer/ChangeLog:
* call-info.cc (call_info::print): Update for removal of
label_text::maybe_free in favor of automatic memory management.
* checker-path.cc (checker_event::dump): Likewise.
(checker_event::prepare_for_emission): Likewise.
(state_change_event::get_desc): Likewise.
(superedge_event::should_filter_p): Likewise.
(start_cfg_edge_event::get_desc): Likewise.
(warning_event::get_desc): Likewise.
(checker_path::dump): Likewise.
(checker_path::debug): Likewise.
* diagnostic-manager.cc
(diagnostic_manager::prune_for_sm_diagnostic): Likewise.
(diagnostic_manager::prune_interproc_events): Likewise.
* program-state.cc (sm_state_map::to_json): Likewise.
* region.cc (region::to_json): Likewise.
* sm-malloc.cc (inform_nonnull_attribute): Likewise.
* store.cc (binding_map::to_json): Likewise.
(store::to_json): Likewise.
* svalue.cc (svalue::to_json): Likewise.
gcc/c-family/ChangeLog:
* c-format.cc (range_label_for_format_type_mismatch::get_text):
Update for removal of label_text::maybe_free in favor of automatic
memory management.
gcc/ChangeLog:
* diagnostic-format-json.cc (json_from_location_range): Update for
removal of label_text::maybe_free in favor of automatic memory
management.
* diagnostic-format-sarif.cc
(sarif_builder::make_location_object): Likewise.
* diagnostic-show-locus.cc (struct pod_label_text): New.
(class line_label): Convert m_text from label_text to pod_label_text.
(layout::print_any_labels): Move "text" to the line_label.
* tree-diagnostic-path.cc (path_label::get_text): Update for
removal of label_text::maybe_free in favor of automatic memory
management.
(event_range::print): Likewise.
(default_tree_diagnostic_path_printer): Likewise.
(default_tree_make_json_for_path): Likewise.
libcpp/ChangeLog:
* include/line-map.h: Include <utility>.
(class label_text): Delete maybe_free method in favor of a
destructor. Add move ctor and assignment operator. Add deletion
of the copy ctor and copy-assignment operator. Rename field
m_caller_owned to m_owned. Add std::move where necessary; add
moved_from member function.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
gcc/analyzer/ChangeLog:
PR analyzer/106225
* sm-taint.cc (taint_state_machine::on_stmt): Move handling of
assignments from division to...
(taint_state_machine::check_for_tainted_divisor): ...this new
function. Reject warning when the divisor is known to be non-zero.
* sm.cc: Include "analyzer/program-state.h".
(sm_context::get_old_region_model): New.
* sm.h (sm_context::get_old_region_model): New decl.
gcc/testsuite/ChangeLog:
PR analyzer/106225
* gcc.dg/analyzer/taint-divisor-1.c: Add test coverage for various
correct and incorrect checks against zero.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
The front-end bug that prevented this constexpr loop from working has
been fixed since GCC 12.1 so we can remove the workaround.
libstdc++-v3/ChangeLog:
PR c++/89074
* include/bits/char_traits.h (__gnu_cxx::char_traits::move):
Remove workaround for front-end bug.