In upstream dmd, the compiler front-end and run-time have been merged
together into one repository. Both dmd and libdruntime now track that merged repository.
D front-end changes:
- Deprecated `scope(failure)' blocks that contain `return' statements.
- Deprecated using integers for `version' or `debug' conditions.
- Deprecated returning a discarded void value from a function.
- `new' can now allocate an associative array.
D runtime changes:
- Added avx512f detection to core.cpuid module.
Phobos changes:
- Changed std.experimental.logger.core.sharedLog to return
shared(Logger).
gcc/d/ChangeLog:
* dmd/MERGE: Merge upstream dmd d7772a2369.
* dmd/VERSION: Bump version to v2.100.1.
* d-codegen.cc (get_frameinfo): Check whether decision to generate
closure changed since semantic finished.
* d-lang.cc (d_handle_option): Remove handling of -fdebug=level and
-fversion=level.
* decl.cc (DeclVisitor::visit (VarDeclaration *)): Generate evaluation
of noreturn variable initializers before throw.
* expr.cc (ExprVisitor::visit (AssignExp *)): Don't generate
assignment for noreturn types, only evaluate for side effects.
* lang.opt (fdebug=): Undocument -fdebug=level.
(fversion=): Undocument -fversion=level.
libphobos/ChangeLog:
* configure: Regenerate.
* configure.ac (libtool_VERSION): Update to 4:0:0.
* libdruntime/MERGE: Merge upstream druntime d7772a2369.
* libdruntime/Makefile.am (DRUNTIME_DSOURCES): Add
core/internal/array/duplication.d.
* libdruntime/Makefile.in: Regenerate.
* src/MERGE: Merge upstream phobos 5748ca43f.
* testsuite/libphobos.gc/nocollect.d:
A SET operation that writes memory may have the same value as an
earlier store, but if the alias sets of the new and earlier store do
not conflict then the set is not truly redundant. This can happen,
for example, if objects of different types share a stack slot.
To fix this we define a new function in cselib that first checks for
equality and, if that is successful, finds the earlier store in the
value history and checks the alias sets.
The routine is used in two places elsewhere in the compiler:
cfgcleanup and postreload.
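As an illustration of the stack-slot case, consider the following
hypothetical sketch (use_long and use_double are assumed external
functions, and whether the two locals actually share a slot depends on
the compiler):
    extern void use_long (long *);
    extern void use_double (double *);
    void f (void)
    {
      {
        long l = 0;      /* store to the shared stack slot, alias set of long */
        use_long (&l);
      }
      {
        double d = 0.0;  /* value-identical bit pattern stored to the same
                            slot, but with the alias set of double */
        use_double (&d);
      }
    }
The second store writes the same value as the first, yet removing it
as redundant would lose the TBAA distinction between the two types.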
gcc/ChangeLog:
PR rtl-optimization/106187
* alias.h (mems_same_for_tbaa_p): Declare.
* alias.cc (mems_same_for_tbaa_p): New function.
* dse.cc (record_store): Use it instead of open-coding
alias check.
* cselib.h (cselib_redundant_set_p): Declare.
* cselib.cc: Include alias.h.
(cselib_redundant_set_p): New function.
* cfgcleanup.cc (mark_effect): Use cselib_redundant_set_p instead
of rtx_equal_for_cselib_p.
* postreload.cc (reload_cse_simplify): Use cselib_redundant_set_p.
(reload_cse_noop_set_p): Delete.
The option prints the TOP N counters in a stable format
suitable for comparison (diff).
gcc/ChangeLog:
* doc/gcov-dump.texi: Document the new option.
* gcov-dump.cc (main): Parse the new option.
(print_usage): Show the option.
(tag_counters): Sort key:value pairs of TOP N counter.
This patch adds a peephole2 to i386.md to implement the suggestion in
PR target/47949, of using xchg instead of mov for moving values to/from
the %rax/%eax register, controlled by -Oz, as the xchg instruction is
one byte shorter than the move it is replacing.
The new test case is taken from the PR:
int foo(int x) { return x; }
where previously we'd generate:
foo: mov %edi,%eax // 2 bytes
ret
but with this patch, using -Oz, we generate:
foo: xchg %eax,%edi // 1 byte
ret
On the CSiBE benchmark, this saves a total of 10238 bytes (reducing
the -Oz total from 3661796 bytes to 3651558 bytes, a 0.28% saving).
Interestingly, some modern architectures (such as Zen 3) implement
xchg using zero latency register renaming (just like mov), so in theory
this transformation could be enabled when optimizing for speed, if
benchmarking shows the improved code density produces consistently
better performance. However, this is architecture dependent, and
there may be interactions using xchg (instead of a single_set) in the
late RTL passes (such as cprop_hardreg), so for now I've restricted
this to -Oz.
2022-08-03 Roger Sayle <roger@nextmovesoftware.com>
Uroš Bizjak <ubizjak@gmail.com>
gcc/ChangeLog
PR target/47949
* config/i386/i386.md (peephole2): New peephole2 to convert
SWI48 moves to/from %rax/%eax where the src is dead to xchg,
when optimizing for minimal size with -Oz.
gcc/testsuite/ChangeLog
PR target/47949
* gcc.target/i386/pr47949.c: New test case.
This patch adds an extra optimization to *cmp<dwi>_doubleword to improve
the code generated for comparisons against -1. Hypothetically, if a
comparison against -1 reached this splitter we'd currently generate code
that looks like:
notq %rdx ; 3 bytes
notq %rax ; 3 bytes
orq %rdx, %rax ; 3 bytes
setne %al
With this patch we would instead generate the superior:
andq %rdx, %rax ; 3 bytes
cmpq $-1, %rax ; 4 bytes
setne %al
which is both faster and smaller, and also what's currently generated
thanks to the middle-end splitting double word comparisons against
zero and minus one during RTL expansion. Should that change, this would
become a missed-optimization regression, but this patch also (potentially)
helps suitable comparisons created by CSE and combine.
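For reference, a comparison of this shape might come from source like
the following (a hypothetical example; as noted above, such comparisons
are currently split by the middle-end during RTL expansion):
    int foo (unsigned __int128 x)
    {
      return x == (unsigned __int128) -1;  /* double word compare against -1 */
    }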
2022-08-03 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* config/i386/i386.md (*cmp<dwi>_doubleword): Add a special case
to split comparisons against -1 using AND and CMP -1 instructions.
This patch improves TImode STV by adding support for logical shifts by
integer constants that are multiples of 8. For the test case:
unsigned __int128 a, b;
void foo() { a = b << 16; }
on x86_64, gcc -O2 currently generates:
movq b(%rip), %rax
movq b+8(%rip), %rdx
shldq $16, %rax, %rdx
salq $16, %rax
movq %rax, a(%rip)
movq %rdx, a+8(%rip)
ret
with this patch we now generate:
movdqa b(%rip), %xmm0
pslldq $2, %xmm0
movaps %xmm0, a(%rip)
ret
2022-08-03 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* config/i386/i386-features.cc (compute_convert_gain): Add gain
for converting suitable TImode shift to a V1TImode shift.
(timode_scalar_chain::convert_insn): Add support for converting
suitable ASHIFT and LSHIFTRT.
(timode_scalar_to_vector_candidate_p): Consider logical shifts
by integer constants that are multiples of 8 to be candidates.
gcc/testsuite/ChangeLog
* gcc.target/i386/sse4_1-stv-7.c: New test case.
This patch implements some additional zero-extension and sign-extension
related optimizations in simplify-rtx.cc. The original motivation comes
from PR rtl-optimization/71775, where in comment #2 Andrew Pinski sees:
Failed to match this instruction:
(set (reg:DI 88 [ _1 ])
(sign_extend:DI (subreg:SI (ctz:DI (reg/v:DI 86 [ x ])) 0)))
On many platforms the result of DImode CTZ is constrained to be a
small unsigned integer (between 0 and 64), hence the truncation to
32-bits (using a SUBREG) and the following sign extension back to
64-bits are effectively a no-op, so the above should ideally (often)
be simplified to "(set (reg:DI 88) (ctz:DI (reg/v:DI 86 [ x ]))".
To implement this, and some closely related transformations, we build
upon the existing val_signbit_known_clear_p predicate. In the first
chunk, nonzero_bits knows that FFS and ABS can't leave the sign bit
set, so the simplification of ABS (ABS (x)) and ABS (FFS (x))
can itself be simplified. The second transformation is that we can
canonicalize SIGN_EXTEND to ZERO_EXTEND (as in the PR 71775 case above)
when the operand's sign-bit is known to be clear. The final two chunks
are for SIGN_EXTEND of a truncating SUBREG, and ZERO_EXTEND of a
truncating SUBREG respectively. The nonzero_bits of a truncating
SUBREG pessimistically thinks that the upper bits may have an
arbitrary value (by taking the SUBREG), so we need to look deeper at the
SUBREG's operand to confirm that the high bits are known to be zero.
Unfortunately, for PR rtl-optimization/71775, ctz:DI on x86_64 with
default architecture options is undefined at zero, so we can't be sure
the upper bits of reg:DI 88 will be sign extended (all zeros or all ones).
nonzero_bits knows this, so the above transformations don't trigger,
but the transformations themselves are perfectly valid for other
operations such as FFS, POPCOUNT and PARITY, and on other targets/-march
settings where CTZ is defined at zero.
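As a source-level sketch of a case where the new simplifications do
apply (a hypothetical example; POPCOUNT, unlike CTZ with default x86_64
options, is defined for all inputs):
    long long foo (unsigned long long x)
    {
      int t = __builtin_popcountll (x);  /* result is known to be in [0, 64] */
      return t;  /* the truncation to SImode and sign extension back to
                    DImode are effectively no-ops here */
    }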
2022-08-03 Roger Sayle <roger@nextmovesoftware.com>
Segher Boessenkool <segher@kernel.crashing.org>
Richard Sandiford <richard.sandiford@arm.com>
gcc/ChangeLog
* simplify-rtx.cc (simplify_unary_operation_1) <ABS>: Add
optimizations for CLRSB, PARITY, POPCOUNT, SS_ABS and LSHIFTRT
that are all positive to complement the existing FFS and
idempotent ABS simplifications.
<SIGN_EXTEND>: Canonicalize SIGN_EXTEND to ZERO_EXTEND when
val_signbit_known_clear_p is true of the operand.
Simplify sign extensions of SUBREG truncations of operands
that are already suitably (zero) extended.
<ZERO_EXTEND>: Simplify zero extensions of SUBREG truncations
of operands that are already suitably zero extended.
Previously, all gimple_cond types were understood; with float values,
this is no longer true. We should gracefully do nothing if the
gcond type is not supported.
PR tree-optimization/106510
gcc/
* gimple-range-fold.cc (fur_source::register_outgoing_edges):
Check for unsupported statements early.
gcc/testsuite/
* gcc.dg/pr106510.c: New.
I missed the -details dump flag, plus I wasn't checking the actual folding.
As a bonus I had flipped the dump file name and the count, so the test
was coming out as unresolved, which I missed because I was only checking
for failures and passes.
Whooops.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/vrp-float-1.c: Adjust test so it passes.
When propagating on-entry values in the cache, checking if any equivalence
has a known value can improve results. No new calculations are made.
Only queries via dominators which do not populate the cache are checked.
PR tree-optimization/106474
gcc/
* gimple-range-cache.cc (ranger_cache::fill_block_cache): Query
range of equivalences that may contribute to the range.
gcc/testsuite/
* g++.dg/pr106474.C: New.
Contrary to CTF and our previous expectations, as per [1], it turns out
that in BTF:
1) The `encoding' field in integer types shall not be treated as a
bitmap, but as an enumerated value, i.e. these bits are exclusive to each
other.
2) The CHAR bit in `encoding' shall _not_ be set when emitting types
for `char' nor `unsigned char'.
Consequently this patch clears the CHAR bit before emitting the
variable part of BTF integral types. It also updates the testsuite
accordingly, expanding it to check for BOOL bits.
[1] https://lore.kernel.org/bpf/a73586ad-f2dc-0401-1eba-2004357b7edf@fb.com/T/#t
gcc/ChangeLog:
* btfout.cc (output_asm_btf_vlen_bytes): Do not use the CHAR
encoding bit in BTF.
gcc/testsuite/ChangeLog:
* gcc.dg/debug/btf/btf-int-1.c: Do not check for char bits in
bti_encoding and check for bool bits.
I am trying to make sense of back_threader_profitability::profitable_path_p
and the first thing I notice is that we do
/* Threading is profitable if the path duplicated is hot but also
in a case we separate cold path from hot path and permit optimization
of the hot path later. Be on the agressive side here. In some testcases,
as in PR 78407 this leads to noticeable improvements. */
if (m_speed_p
&& ((taken_edge && optimize_edge_for_speed_p (taken_edge))
|| contains_hot_bb))
{
if (n_insns >= param_max_fsm_thread_path_insns)
{
if (dump_file && (dump_flags & TDF_DETAILS))
fprintf (dump_file, " FAIL: Jump-thread path not considered: "
"the number of instructions on the path "
"exceeds PARAM_MAX_FSM_THREAD_PATH_INSNS.\n");
return false;
}
...
}
else if (!m_speed_p && n_insns > 1)
{
if (dump_file && (dump_flags & TDF_DETAILS))
fprintf (dump_file, " FAIL: Jump-thread path not considered: "
"duplication of %i insns is needed and optimizing for size.\n",
n_insns);
return false;
}
...
return true;
thus we apply the n_insns >= param_max_fsm_thread_path_insns only
to "hot paths". The comment above this isn't entirely clear whether
this is by design ("Be on the aggressive side here ...") but I think
this is a mistake. In fact the "hot path" check seems entirely
useless since if the path is not hot we simply continue threading it.
This was caused by r12-324-g69e5544210e3c0 and the following simply
reverts the offending change.
* tree-ssa-threadbackward.cc
(back_threader_profitability::profitable_path_p): Apply
size constraints to all paths again.
Without further ado, here is the implementation for floating point
range operators, plus the switch to enable all ranger clients to
handle floats.
These are bare bone implementations good enough for relation operators
to work, while keeping the NAN bits up to date in the frange. There
is also minimal support for keeping track of +-INF when it is obvious.
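A small example of what the relational operators and NAN tracking
enable (an illustrative sketch, not one of the new tests):
    void foo (float x)
    {
      if (x > 1.0f)
        {
          /* x > 1.0f implies x is not a NAN, since comparisons with a
             NAN are false, so this test can be folded away.  */
          if (__builtin_isnan (x))
            __builtin_abort ();
        }
    }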
Tested on x86-64 Linux.
gcc/ChangeLog:
* range-op-float.cc (finite_operands_p): New.
(frelop_early_resolve): New.
(default_frelop_fold_range): New.
(class foperator_equal): New.
(class foperator_not_equal): New.
(class foperator_lt): New.
(class foperator_le): New.
(class foperator_gt): New.
(class foperator_ge): New.
(class foperator_unordered): New.
(class foperator_ordered): New.
(class foperator_relop_unknown): New.
(floating_op_table::floating_op_table): Add above classes to
floating op table.
* value-range.h (frange::supports_p): Enable.
gcc/testsuite/ChangeLog:
* g++.dg/opt/pr94589-2.C: XFAIL.
* gcc.dg/tree-ssa/vrp-float-1.c: New test.
* gcc.dg/tree-ssa/vrp-float-11.c: New test.
* gcc.dg/tree-ssa/vrp-float-3.c: New test.
* gcc.dg/tree-ssa/vrp-float-4.c: New test.
* gcc.dg/tree-ssa/vrp-float-6.c: New test.
* gcc.dg/tree-ssa/vrp-float-7.c: New test.
* gcc.dg/tree-ssa/vrp-float-8.c: New test.
This patch allows us to export floating point ranges into the SSA name
(SSA_NAME_RANGE_INFO).
[Richi, in PR24021 you suggested that match.pd could use global float
ranges, because it would generally not invoke ranger. This patch
implements the boiler plate to save the frange globally.]
[Jeff, we've also been talking in parallel of using NAN knowledge
during expansion to RTL. This patch will provide the NAN bits in the
SSA name.]
Since frange's current implementation is just a shell, with no
actual endpoints, frange_storage_slot only contains frange_props which
fits inside a byte. When we have endpoints, y'all can decide if it's
worth saving them, or if the NAN/etc bits are good enough.
gcc/ChangeLog:
* tree-core.h (struct tree_ssa_name): Add frange_info and
reshuffle the rest.
* value-range-storage.cc (vrange_storage::alloc_slot): Add case
for frange.
(vrange_storage::set_vrange): Same.
(vrange_storage::get_vrange): Same.
(vrange_storage::fits_p): Same.
(frange_storage_slot::alloc_slot): New.
(frange_storage_slot::set_frange): New.
(frange_storage_slot::get_frange): New.
(frange_storage_slot::fits_p): New.
* value-range-storage.h (class frange_storage_slot): New.
ipa-* still works on legacy value_range's, which only support
integrals. This patch limits the query to integrals, so as not to get a
floating point range that can't exist in an irange.
gcc/ChangeLog:
* ipa-prop.cc (ipa_compute_jump_functions_for_edge): Limit ranger
query to integrals.
This adds EDGE_COPY_SRC_JOINER_BLOCK sources to the set of blocks
we need to check we can duplicate.
PR tree-optimization/106497
* tree-ssa-threadupdate.cc (fwd_jt_path_registry::update_cfg):
Also verify we can copy EDGE_COPY_SRC_JOINER_BLOCK.
* gcc.dg/torture/pr106497.c: New testcase.
gcc/ChangeLog:
* profile.cc (compute_branch_probabilities): Dump details only
if TDF_DETAILS.
* symtab.cc (symtab_node::dump_base): Do not dump the pointer unless
TDF_ADDRESS is used, as it makes comparison harder.
The following reduces the number of SSA updates done during autopar
OMP expansion, specifically avoiding the cases that just add virtual
operands (where maybe none have been before) in dead regions of the CFG.
Instead virtual SSA update is delayed until after the pass. There's
much more TLC needed here, but test coverage makes it really difficult.
PR tree-optimization/106498
* omp-expand.cc (expand_omp_taskreg): Do not perform virtual
SSA update here.
(expand_omp_for): Or here.
(execute_expand_omp): Instead schedule it here together
with CFG cleanup via TODO.
This adjusts the assert guard to include -flto-partition=none which
behaves as WPA.
PR lto/106334
* dwarf2out.cc (dwarf2out_register_external_die): Adjust
assert.
The following builds upon the logic of the PR105679 fix by avoiding
threading to a known edge that is predicted as probably never executed.
PR tree-optimization/106495
* tree-ssa-threadbackward.cc
(back_threader_profitability::profitable_path_p): If known_edge
is probably never executed avoid threading.
This adds a clarifying "note" to address space mismatch diagnostics.
For example, it improves the diagnostic for
gcc.target/i386/addr-space-typeck-2.c from:
addr-space-typeck-2.c: In function 'test_bad_call':
addr-space-typeck-2.c:12:22: error: passing argument 2 of 'expects_seg_gs'
from pointer to non-enclosed address space
12 | expects_seg_gs (0, ptr, 1);
| ^~~
to:
addr-space-typeck-2.c: In function 'test_bad_call':
addr-space-typeck-2.c:12:22: error: passing argument 2 of 'expects_seg_gs'
from pointer to non-enclosed address space
12 | expects_seg_gs (0, ptr, 1);
| ^~~
addr-space-typeck-2.c:7:51: note: expected '__seg_gs void *' but argument
is of type 'void *'
7 | extern void expects_seg_gs (int i, void __seg_gs *param, int j);
| ~~~~~~~~~~~~~~~^~~~~
I took the liberty of adding the test coverage to i386 since we need
a specific target to test this on.
gcc/c/ChangeLog:
* c-typeck.cc (build_c_cast): Quote names of address spaces in
diagnostics.
(convert_for_assignment): Add a note to address space mismatch
diagnostics, specifying the expected and actual types.
gcc/testsuite/ChangeLog:
* gcc.target/i386/addr-space-typeck-1.c: New test.
* gcc.target/i386/addr-space-typeck-2.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
This patch resolves PR target/106481, and is an oversight in my recent
battles with REG_EQUAL notes during TImode STV (see PR target/106278
https://gcc.gnu.org/pipermail/gcc-patches/2022-July/598416.html).
The current behaviour (from the patch above) is that we check that the mode of
the REG_EQUAL note is TImode before using PUT_MODE to set it to V1TImode.
However, the new test case reveals that this doesn't consider REG_EQUAL
notes that are CONST_INT or CONST_WIDE_INT, i.e. that are VOIDmode,
and so STV produces:
(insn 85 84 86 2 (set (reg:V1TI 113)
(reg:V1TI 84)) "pr106481.c":13:3 1766 {movv1ti_internal}
(expr_list:REG_EQUAL (const_wide_int 0x0ffffffff00000004)
(nil)))
which causes problems as the const_wide_int isn't a valid immediate
constant for V1TImode. With this patch, we now generate the correct:
(insn 85 84 86 2 (set (reg:V1TI 113)
(reg:V1TI 84)) "pr106481.c":13:3 1766 {movv1ti_internal}
(expr_list:REG_EQUAL (const_vector:V1TI [
(const_wide_int 0x0ffffffff00000004)
])
(nil)))
2022-08-01 Roger Sayle <roger@nextmovesoftware.com>
Uroš Bizjak <ubizjak@gmail.com>
gcc/ChangeLog
PR target/106481
* config/i386/i386-features.cc (timode_scalar_chain::convert_insn):
Convert a CONST_SCALAR_INT_P in a REG_EQUAL note into a V1TImode
CONST_VECTOR.
gcc/testsuite/ChangeLog
PR target/106481
* gcc.target/i386/pr106481.c: New test case.
We can't always use the PLT entry as the function address for local IFUNC
functions. When the PIC register is needed for a PLT call, an indirect call
via the PLT entry will fail since the PIC register may not be set up
properly for the indirect call. Add ix86_ifunc_ref_local_ok to return false
when the PLT entry can't be used as a local IFUNC function pointer.
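A hypothetical sketch of the problematic pattern, where the address of
a local IFUNC is taken in position-independent code:
    static int impl (void) { return 0; }

    /* IFUNC resolver: picks the implementation at load time.  */
    static void *resolver (void) { return (void *) impl; }

    static int fn (void) __attribute__ ((ifunc ("resolver")));

    /* Taking fn's address: it must not resolve to a PLT entry that
       relies on a PIC register which may not be set up at the indirect
       call site.  */
    int (*fn_ptr) (void) = fn;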
gcc/
PR target/83782
* config/i386/i386.cc (ix86_ifunc_ref_local_ok): New.
(TARGET_IFUNC_REF_LOCAL_OK): Use it.
gcc/testsuite/
PR target/83782
* gcc.target/i386/pr83782-1.c: Require non-ia32.
* gcc.target/i386/pr83782-2.c: Likewise.
* gcc.target/i386/pr83782-3.c: New test.
The kernel bpftool expects BTF_KIND_FUNC entries in BTF to include an
annotation reflecting the linkage of functions (static, global). For
whatever reason they abuse the `vlen' field of the BTF_KIND_FUNC entry
instead of adding a variable part to the record as is done with
other entry kinds.
This patch makes GCC include this linkage info in BTF_KIND_FUNC
entries.
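For instance (an illustrative example, not the new testcases), the two
functions below get different linkage annotations in their
BTF_KIND_FUNC entries:
    static int foo (void) { return 0; }  /* linkage: static */
    int bar (void) { return foo (); }    /* linkage: global */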
Tested on the bpf-unknown-none target.
gcc/ChangeLog:
PR debug/106263
* ctfc.h (struct ctf_dtdef): Add field linkage.
* ctfc.cc (ctf_add_function): Set ctti_linkage.
* dwarf2ctf.cc (gen_ctf_function_type): Pass a linkage for
function types and subprograms.
* btfout.cc (btf_asm_func_type): Emit linkage information for the
function.
(btf_dtd_emit_preprocess_cb): Propagate the linkage information
for functions.
gcc/testsuite/ChangeLog:
PR debug/106263
* gcc.dg/debug/btf/btf-function-4.c: New test.
* gcc.dg/debug/btf/btf-function-5.c: Likewise.
Ensure that both parameters to vector shifts use the same mode. This is most
important for amdgcn where the masks are DImode.
gcc/ChangeLog:
* omp-simd-clone.cc (simd_clone_adjust): Convert shift_cnt to match
the mask type.
Co-authored-by: Jakub Jelinek <jakub@redhat.com>
This patch fixes a missed optimization in match.pd. It takes the pattern
x / y * y == x and optimizes it to x % y == 0. This produces fewer
instructions. This simplification does not happen for complex types.
This patch also adds tests for the optimization rule.
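A sketch of the simplification at the source level (a hypothetical
example, not the new tests):
    int is_multiple (int x, int y)
    {
      return x / y * y == x;  /* now folded to x % y == 0 */
    }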
Bootstrapped/regtested on x86_64-pc-linux-gnu.
PR tree-optimization/104992
gcc/ChangeLog:
* match.pd (x / y * y == x): New simplification.
gcc/testsuite/ChangeLog:
* g++.dg/pr104992-1.C: New test.
* gcc.dg/pr104992.c: New test.
GCC fails to bootstrap when configured with --enable-languages=all on
machines that have older versions of GNAT installed as the system Ada
compiler. In configure, it's not sufficient to check whether gnat is
available; we must also check whether a sufficiently recent version of GNAT is
installed. This patch tweaks config/acx.m4 so that conftest.adb also
contains a reference to System.CRTL.int64 as required by the current
version of gcc/ada/osint.adb. This fixes the build when the system
Ada is GNAT v4.8.5 (on Redhat 7) by disabling ada, but continues to
work fine when the system Ada is GNAT v11.3.1.
2022-08-01 Roger Sayle <roger@nextmovesoftware.com>
Arnaud Charlet <charlet@adacore.com>
config/ChangeLog
* acx.m4 (AC_PROG_GNAT): Update conftest.adb to include
features required of the host gnat compiler.
ChangeLog
* configure: Regenerate.
The boz_15.f90 test FAILs on powerpc64le-linux when -mabi=ieeelongdouble
is used (either default through --with-long-double-format=ieee or
when used explicitly).
The problem is that the read/write transfer routines are called with
BT_REAL (or BT_COMPLEX) type and kind 17 which is magic we use to say
it is the IEEE quad real(kind=16) rather than the IBM double double
real(kind=16). For the floating point input/output we then handle kind
17 specially, but for B/O/Z we just treat the bytes of the floating point
value as a binary blob, and using 17 in that case results in unexpected
behavior: for write it means we don't correctly estimate how many chars we'll
need and print ******************** etc. rather than what we should, and
even with explicit size we'd print one further byte than intended.
For read it would even mean overwriting some unrelated byte after the
floating point object.
Fixed by using 16 instead of 17 in the read_radix and write_{b,o,z} calls.
2022-08-01 Jakub Jelinek <jakub@redhat.com>
PR libfortran/106079
* io/transfer.c (formatted_transfer_scalar_read,
formatted_transfer_scalar_write): For type BT_REAL with kind 17
change kind to 16 before calling read_radix or write_{b,o,z}.
These are some assorted cleanups to the frange class to make it easier
to drop in an implementation with FP endpoints:
* frange::set() had some asserts limiting the type of arguments
passed. There's no reason why we can't handle all the variants.
Worst comes to worst, we can always return a VARYING, which is
conservative and correct.
* frange::normalize_kind() now returns a boolean that can be used in
union and intersection to indicate that the range changed.
* Implement vrp_val_max and vrp_val_min for floats. Also, move them
earlier in the header file so frange can use them.
Tested on x86-64 Linux.
gcc/ChangeLog:
* value-range.cc (tree_compare): New.
(frange::set): Make more general.
(frange::normalize_kind): Cleanup and return bool.
(frange::union_): Use normalize_kind return value.
(frange::intersect): Same.
(frange::verify_range): Remove unnecessary else.
* value-range.h (vrp_val_max): Move before frange class.
(vrp_val_min): Same.
(frange::frange): Remove set to m_type.
Make all vrange::supports_*_p methods take const_tree, as they can end
up being called from functions that operate on const_tree.
Tested on x86-64 Linux.
gcc/ChangeLog:
* value-range.cc (vrange::supports_type_p): Use const_tree.
(irange::supports_type_p): Same.
(frange::supports_type_p): Same.
* value-range.h (Value_Range::supports_type_p): Same.
(irange::supports_p): Same.
Even though ranger is type agnostic, SCEV seems to only work with
integers. This patch removes some FIXME notes making it explicit that
bounds_of_var_in_loop only works with iranges.
Tested on x86-64 Linux.
gcc/ChangeLog:
* gimple-range-fold.cc (fold_using_range::range_of_phi): Only
query SCEV for integers.
(fold_using_range::range_of_ssa_name_with_loop_info): Remove
irange check.
2bfb0fcb51510f22723c8cdfefe [Sanitizer][MIPS] Fix stat struct size for the O32 ABI.
Signed-off-by: Dimitrije Milosevic <dimitrije.milosevic@syrmia.com>.
This patch adds rot[lr]64ti2_doubleword patterns to the x86_64 backend,
to move splitting of 128-bit TImode rotates by 64 bits after reload,
matching what we now do for 64-bit DImode rotations by 32 bits with -m32.
In theory, moving the point at which this rotation is split should have little
influence on code generation, but in practice "reload" sometimes
decides to make use of the increased flexibility to reduce the number
of registers used, and the code size, by using xchg.
For example:
__int128 x;
__int128 y;
__int128 a;
__int128 b;
void foo()
{
unsigned __int128 t = x;
t ^= a;
t = (t<<64) | (t>>64);
t ^= b;
y = t;
}
Before:
movq x(%rip), %rsi
movq x+8(%rip), %rdi
xorq a(%rip), %rsi
xorq a+8(%rip), %rdi
movq %rdi, %rax
movq %rsi, %rdx
xorq b(%rip), %rax
xorq b+8(%rip), %rdx
movq %rax, y(%rip)
movq %rdx, y+8(%rip)
ret
After:
movq x(%rip), %rax
movq x+8(%rip), %rdx
xorq a(%rip), %rax
xorq a+8(%rip), %rdx
xchgq %rdx, %rax
xorq b(%rip), %rax
xorq b+8(%rip), %rdx
movq %rax, y(%rip)
movq %rdx, y+8(%rip)
ret
On some modern architectures this is a small win, on some older
architectures this is a small loss. The decision which code to
generate is made in "reload", and could probably be tweaked by
register preferencing. The much bigger win is that (eventually) all
TImode shifts and rotates by constants will become potential
candidates for TImode STV.
2022-07-31 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* config/i386/i386.md (define_expand <any_rotate>ti3): For
rotations by 64 bits use new rot[lr]64ti2_doubleword pattern.
(rot[lr]64ti2_doubleword): New post-reload splitter.
This patch resolves PR target/106450, some more fall-out from more
aggressive TImode scalar-to-vector (STV) optimizations. I continue
to be caught out by how far TImode STV has diverged from DImode/SImode
STV, which therefore requires additional (unexpected) tweaking. Many
thanks to H.J. Lu for pointing out timode_remove_non_convertible_regs
needs to be extended to handle XOR (and other new operations).
Unhelpfully the comment above this function states that it's the TImode
version of "remove_non_convertible_regs", which doesn't exist anymore,
so I've resurrected an explanatory comment from the git history.
By refactoring the checks for hard regs and already "marked" regs
into timode_check_non_convertible_regs itself, all of its callers are
simplified. This patch then uses FOR_EACH_INSN_USE and FOR_EACH_INSN_DEF
to generically handle arbitrary (non-move) instructions (including
unary and binary operations), calling timode_check_non_convertible_regs
on each TImode register USE and DEF.
2022-07-31 Roger Sayle <roger@nextmovesoftware.com>
H.J. Lu <hjl.tools@gmail.com>
gcc/ChangeLog
PR target/106450
* config/i386/i386-features.cc (timode_check_non_convertible_regs):
Do nothing if REGNO is set in the REGS bitmap, or is a hard reg.
(timode_remove_non_convertible_regs): Update comment.
Call timode_check_non_convertible_regs on all TImode register
DEFs and USEs in each instruction.
gcc/testsuite/ChangeLog
PR target/106450
* gcc.target/i386/pr106450.c: New test case.
gcc/fortran/ChangeLog:
PR fortran/92805
* match.cc (gfc_match_small_literal_int): Make gobbling of leading
whitespace optional.
(gfc_match_name): Likewise.
(gfc_match_char): Likewise.
* match.h (gfc_match_small_literal_int): Adjust prototype.
(gfc_match_name): Likewise.
(gfc_match_char): Likewise.
* primary.cc (match_kind_param): Match small literal int or name
without gobbling whitespace.
(get_kind): Do not skip over blanks.
(match_string_constant): Likewise.
gcc/testsuite/ChangeLog:
PR fortran/92805
* gfortran.dg/literal_constants.f: New test.
* gfortran.dg/literal_constants.f90: New test.
Co-authored-by: Steven G. Kargl <kargl@gcc.gnu.org>
gcc/fortran/ChangeLog:
PR fortran/77652
* check.cc (gfc_check_associated): Make the rank check of POINTER
vs. TARGET match the allowed forms of pointer assignment for the
selected Fortran standard.
gcc/testsuite/ChangeLog:
PR fortran/77652
* gfortran.dg/associated_target_9a.f90: New test.
* gfortran.dg/associated_target_9b.f90: New test.
In C++, since all tokens are lexed from libcpp up front, diagnostics generated
by libcpp after lexing has completed do not get a valid location from libcpp
(rather, libcpp thinks they all pertain to the end of the file.) This has long
been addressed using the global variable "done_lexing", which the C++ frontend
sets at the appropriate time; when done_lexing is true, then c_cpp_diagnostic(),
which outputs libcpp's diagnostics, uses input_location instead of the wrong
libcpp location. The C++ frontend arranges that input_location will point to the
token it is currently processing, so this generally works fine. However, there
is one exception currently, which is -Wunused-macros. This gets generated at the
end of processing in cpp_finish (), since we need to wait until then to
determine whether a macro was eventually used or not. But the locations it
passes to c_cpp_diagnostic () were remembered from the original lexing and hence
they should not be overridden with input_location, which is now the one
incorrectly pointing to the end of the file.
Fixed by setting done_lexing=false again just prior to calling cpp_finish (). I
also renamed the variable from done_lexing to "override_libcpp_locations", since
it's now not strictly about lexing anymore.
There is no new testcase with this patch, since we already had an xfailed
testcase which is now fixed.
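For example (a minimal illustration, not the previously xfailed
testcase itself), compiling the following with -Wunused-macros should
report the unused macro at the #define line rather than at the end of
the file:
    #define NEVER_USED 1  /* -Wunused-macros should point here */

    int main (void) { return 0; }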
gcc/c-family/ChangeLog:
PR c++/66290
* c-common.h: Rename global done_lexing to
override_libcpp_locations.
* c-common.cc (c_cpp_diagnostic): Likewise.
* c-opts.cc (c_common_finish): Set override_libcpp_locations
(formerly done_lexing) immediately prior to calling cpp_finish ().
gcc/cp/ChangeLog:
PR c++/66290
* parser.cc (cp_lexer_new_main): Rename global done_lexing to
override_libcpp_locations.
gcc/testsuite/ChangeLog:
PR c++/66290
* c-c++-common/pragma-diag-15.c: Remove xfail for C++.
This patch fixes PR bootstrap/106472 by adding a missing dependency
to Makefile.def to allow make bootstrap when configured using
"--enable-languages=go" (and not using make with multiple threads).
2022-07-31 Roger Sayle <roger@nextmovesoftware.com>
ChangeLog
PR bootstrap/106472
* Makefile.def (dependencies): Make configure-target-libgo depend
upon all-target-libbacktrace.
Here the CONSTRUCTOR we were providing for D{} had an entry for the B base
subobject at offset 0 following the entry for the C base, causing
output_constructor_regular_field to ICE due to going backwards. It might be
nice for that function to be more tolerant of empty fields, but it also
seems reasonable for the front end to prune the useless entry.
PR c++/106369
gcc/cp/ChangeLog:
* constexpr.cc (reduced_constant_expression_p): Return false
if a CONSTRUCTOR initializes an empty field.
gcc/testsuite/ChangeLog:
* g++.dg/cpp1z/constexpr-lambda27.C: New test.