This removes a premature and not working optimization from the
gimplifier. When gimplification is requested not to produce a SSA
name we try to avoid generating a copy when we did so anyway but
instead replace the LHS of its definition. But that only works in
case there are no uses of the SSA name already which is something
we cannot easily check, so the following removes said optimization.
Statistics on the whole bootstrap shows we hit this optimization
only for libiberty/cp-demangle.c and overall we have 21652112
gimplifications where just 240 copies are elided. Preserving
the optimization would require scanning the original expression
and the pre and post sequences for SSA names and uses, that seems
excessive to avoid these 240 copies.
2021-06-22 Richard Biener <rguenther@suse.de>
PR middle-end/101156
* gimplify.c (gimplify_expr): Remove premature incorrect
optimization.
* gcc.dg/pr101156.c: New testcase.
(cherry picked from commit b4e21c8046)
This PR shows that building an ao_ref from value-numbers is prone to
expose bogus contextual alias info to the oracle. The following makes
sure to construct ao_refs from SSA names available at the program point
only.
On the way it modifies the awkward valueize_refs[_1] API.
2021-06-08 Richard Biener <rguenther@suse.de>
PR tree-optimization/100923
* tree-ssa-sccvn.c (valueize_refs_1): Take a pointer to
the operand vector to be valueized.
(valueize_refs): Likewise.
(valueize_shared_reference_ops_from_ref): Adjust.
(valueize_shared_reference_ops_from_call): Likewise.
(vn_reference_lookup_3): Likewise.
(vn_reference_lookup_pieces): Likewise. Re-valueize
with honoring availability when we are about to create
the ao_ref and valueized before.
(vn_reference_lookup): Likewise.
(vn_reference_insert_pieces): Adjust.
* gcc.dg/torture/pr100923.c: New testcase.
(cherry picked from commit 7a56d3d3e9)
When we face a sm_ord vs sm_unord for the same ref during
store sequence merging we assert that the ref is already marked
unsupported. But it can be that it will only be marked so
during the ongoing merging so instead of asserting mark it here.
Also apply some optimization to not waste resources to search
for already unsupported refs.
2021-06-16 Richard Biener <rguenther@suse.de>
PR tree-optimization/101088
* tree-ssa-loop-im.c (sm_seq_valid_bb): Only look for
supported refs on edges. Do not assert same ref but
different kind stores are unsuported but mark them so.
(hoist_memory_references): Only look for supported refs
on exits.
* gcc.dg/torture/pr101088.c: New testcase.
(cherry picked from commit 43fc4234ad)
This plugs a hole in store-motion where it fails to perform dependence
checking on conditionally executed but not store-motioned refs.
2021-06-11 Richard Biener <rguenther@suse.de>
PR tree-optimization/101025
* tree-ssa-loop-im.c (sm_seq_valid_bb): Make sure to process
all refs that require dependence checking.
* gcc.dg/torture/pr101025.c: New testcase.
(cherry picked from commit b8b80b8aa3)
Rust places text files in archives. AIX ld ignores such files with a
warning. The collect2 wrapper for ld had been exiting with a fatal
error if it scanned an archive that contained a non-COFF file.
This patch updates collect2.c to issue a warning and ignore the file
member, matching the behavior of AIX ld. GCC can encounter archives
created by Rust and should not issue a fatal error. This changes
fatal_error to warning, with an implicit location and no associated
optimization flag.
gcc/ChangeLog:
2021-05-20 Clement Chigot <clement.chigot@atos.net>
David Edelsohn <dje.gcc@gmail.com>
* collect2.c (scan_prog_file): Issue non-fatal warning for
non-COFF files.
(cherry picked from commit 5a3bf28119)
The D front-end semantic pass sometimes declares a temporary inside a
return expression. This is now detected with the RESULT_DECL replacing
the temporary, allowing for RVO to be done.
PR d/101273
gcc/d/ChangeLog:
* toir.cc (IRVisitor::visit (ReturnStatement *)): Detect returns that
use a temporary, and replace with return value.
gcc/testsuite/ChangeLog:
* gdc.dg/torture/pr101273.d: New test.
(cherry picked from commit 152f4d0e4d)
To prevent the RHS of an assignment modifying the LHS before the
assignment proper, a target_expr is forced so that function calls that
return with slot optimization modify the temporary instead. This did
not work for conditional expressions however, to give one example. So
now the RHS is always forced to a temporary.
PR d/101282
gcc/d/ChangeLog:
* d-codegen.cc (build_assign): Force target_expr on RHS for non-POD
assignment expressions.
gcc/testsuite/ChangeLog:
* gdc.dg/torture/pr101282.d: New test.
(cherry picked from commit c77230856e)
Fix failures seen on i686 due to relying on exact floating-point
equality when testing results of vector division.
gcc/testsuite/ChangeLog:
* jit.dg/test-vector-rvalues.cc (check_div): Add specialization
for v4f, to avoid relying on exact floating-point equality.
* jit.dg/test-vector-types.cc (check_div): Likewise.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
On i686, test_i386_basic_asm_4 has:
error: inconsistent operand constraints in an 'asm'
and test_i386_basic_asm_5 has:
/tmp/libgccjit-9FsLie/fake.s:9: Error: bad register name `%rdi'
/tmp/libgccjit-9FsLie/fake.s:10: Error: bad register name `%rsi'
This is only intended as a smoketest of asm support, so only run
it on x86_64.
gcc/testsuite/ChangeLog:
* jit.dg/test-asm.c: Remove i?86-*-* from target specifier.
* jit.dg/test-asm.cc: Likewise.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
gcc/analyzer/ChangeLog:
* store.cc (binding_cluster::get_any_binding): Make symbolic reads
from a cluster with concrete bindings return unknown.
gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/symbolic-7.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
gcc/analyzer/ChangeLog:
* region-model-manager.cc
(region_model_manager::get_or_create_int_cst): New.
(region_model_manager::maybe_undo_optimize_bit_field_compare): Use
it to simplify away a local tree.
* region-model.cc (region_model::on_setjmp): Likewise.
(region_model::on_longjmp): Likewise.
* region-model.h (region_model_manager::get_or_create_int_cst):
New decl.
* store.cc (binding_cluster::zero_fill_region): Use it to simplify
away a local tree.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Looks like my patch for PR analyzer/99212 implicitly assumed
little-endian, which the following patch fixes.
Fixes bitfields-1.c on:
- armeb-none-linux-gnueabihf
- cris-elf
- powerpc64-darwin
- s390-linux-gnu
gcc/analyzer/ChangeLog:
PR analyzer/99212
PR analyzer/101082
* engine.cc: Include "target.h".
(impl_run_checkers): Log BITS_BIG_ENDIAN, BYTES_BIG_ENDIAN, and
WORDS_BIG_ENDIAN.
* region-model-manager.cc
(region_model_manager::maybe_fold_binop): Move support for masking
via ARG0 & CST into...
(region_model_manager::maybe_undo_optimize_bit_field_compare):
...this new function. Flatten by converting from nested
conditionals to a series of early return statements to reject
failures. Reject if type is not unsigned_char_type_node.
Handle BYTES_BIG_ENDIAN when determining which bits are bound
in the binding_map.
* region-model.h
(region_model_manager::maybe_undo_optimize_bit_field_compare):
New decl.
* store.cc (bit_range::dump): New function.
* store.h (bit_range::dump): New decl.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
While debugging another issue I noticed that the analyzer could fail to
merge nodes for control flow in which one path had called a function
and another path hadn't:
BB
/ \
/ \
fn call no fn call
\ /
\ /
join BB
The root cause was that the worklist sort function wasn't prioritizing
call strings, and thus it was fully exploring the "no function called"
path to the exit BB, and only then exploring the "within the function call"
parts of the "funcion called" path.
This patch prioritizes call strings when sorting the worklist so that
the nodes with deeper call strings are processed before those with shallower
call strings, thus allowing such nodes to be merged at the joinpoint.
gcc/analyzer/ChangeLog:
* engine.cc (worklist::key_t::cmp): Move sort by call_string to
before SCC.
gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/loop-0-up-to-n-by-1-with-iter-obj.c: Update
expected number of enodes after the loop.
* gcc.dg/analyzer/paths-8.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
This patch verifies the previous fix for bitfield sizes by implementing
enough support for bitfields in the analyzer to get the test cases to pass.
The patch implements support in the analyzer for reading from a
BIT_FIELD_REF, and support for folding BIT_AND_EXPR of a mask, to handle
the cases generated in tests.
The existing bitfields tests in data-model-1.c turned out to rely on
undefined behavior, in that they were assigning values to a signed
bitfield that were outside of the valid range of values. I believe that
that's why we were seeing target-specific differences in the test
results (PR analyzer/99212). The patch updates the test to remove the
undefined behaviors.
gcc/analyzer/ChangeLog:
PR analyzer/99212
* region-model-manager.cc
(region_model_manager::maybe_fold_binop): Add support for folding
BIT_AND_EXPR of compound_svalue and a mask constant.
* region-model.cc (region_model::get_rvalue_1): Implement
BIT_FIELD_REF in terms of...
(region_model::get_rvalue_for_bits): New function.
* region-model.h (region_model::get_rvalue_for_bits): New decl.
* store.cc (bit_range::from_mask): New function.
(selftest::test_bit_range_intersects_p): New selftest.
(selftest::assert_bit_range_from_mask_eq): New.
(ASSERT_BIT_RANGE_FROM_MASK_EQ): New macro.
(selftest::assert_no_bit_range_from_mask_eq): New.
(ASSERT_NO_BIT_RANGE_FROM_MASK): New macro.
(selftest::test_bit_range_from_mask): New selftest.
(selftest::analyzer_store_cc_tests): Call the new selftests.
* store.h (bit_range::intersects_p): New.
(bit_range::from_mask): New decl.
(concrete_binding::get_bit_range): New accessor.
(store_manager::get_concrete_binding): New overload taking
const bit_range &.
gcc/testsuite/ChangeLog:
PR analyzer/99212
* gcc.dg/analyzer/bitfields-1.c: New test.
* gcc.dg/analyzer/data-model-1.c (struct sbits): Make bitfields
explicitly signed.
(test_44): Update test values assigned to the bits to ones that
fit in the range of the bitfield type. Remove xfails.
(test_45): Remove xfails.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
gcc/analyzer/ChangeLog:
* analyzer.h (int_size_in_bits): New decl.
* region.cc (int_size_in_bits): New function.
(region::get_bit_size): Reimplement in terms of the above.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
gcc/analyzer/ChangeLog:
* store.cc (concrete_binding::dump_to_pp): Move bulk of
implementation to...
(bit_range::dump_to_pp): ...this new function.
(bit_range::cmp): New.
(concrete_binding::overlaps_p): Update for use of bit_range.
(concrete_binding::cmp_ptr_ptr): Likewise.
* store.h (struct bit_range): New.
(class concrete_binding): Replace fields m_start_bit_offset and
m_size_in_bits with new field m_bit_range.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
gcc/ChangeLog:
* diagnostic-show-locus.c (diagnostic_show_locus): Don't reject
printing the same location twice if there are fix-it hints,
multiple locations, or a label.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
PR analyzer/100615 reports a missing leak diagnostic.
The issue is that the code calls strsep which the analyzer doesn't
have special knowledge of, and so conservatively assumes that it
could free the pointer, so drops malloc state for it.
Properly "teaching" the analyzer about strsep would require it
to support bifurcating state at a call, which is currently fiddly to
do, so for now this patch notes that strsep doesn't affect the
malloc state machine, allowing the analyzer to correctly detect the leak.
gcc/analyzer/ChangeLog:
PR analyzer/100615
* sm-malloc.cc: Include "analyzer/function-set.h".
(malloc_state_machine::on_stmt): Call unaffected_by_call_p and
bail on the functions it recognizes.
(malloc_state_machine::unaffected_by_call_p): New.
gcc/testsuite/ChangeLog:
PR analyzer/100615
* gcc.dg/analyzer/pr100615.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
PR analyzer/100244 reports an ICE on a -Wanalyzer-free-of-non-heap
due to a case where free_of_non_heap::describe_state_change can be
passed a NULL change.m_expr for a suitably complicated symbolic value.
Bulletproof it by checking for change.m_expr being NULL before
dereferencing it.
gcc/analyzer/ChangeLog:
PR analyzer/100244
* sm-malloc.cc (free_of_non_heap::describe_state_change):
Bulletproof against change.m_expr being NULL.
gcc/testsuite/ChangeLog:
PR analyzer/100244
* g++.dg/analyzer/pr100244.C: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
A big difference between ELF and PE-COFF is that, with the latter, you can
build position-independent executables or DLLs without generating PIC; as
a matter of fact, flag_pic has historically been forced to 0 for 32-bit:
/* Don't allow flag_pic to propagate since gas may produce invalid code
otherwise. */
\
do {
\
flag_pic = TARGET_64BIT ? 1 : 0; \
} while (0)
The reason is that the linker builds a .reloc section that collects the
absolute relocations in the generated binary, and the loader uses them to
relocate it at load time if need be (e.g. if --dynamicbase is enabled).
Up to binutils 2.35, the GNU linker didn't build the .reloc section for
executables and defaulted to --enable-auto-image-base for DLLs, which means
that DLLs had an essentially unique load address and, therefore, need not
be relocated by the loader in most cases.
With binutils 2.36 and later, the GNU linker builds a .reloc section for
executables (thus making them PIE), --enable-auto-image-base is disabled
and --dynamicbase is enabled by default, which means that essentially all
the binaries are relocated at load time.
This badly breaks the 32-bit compiler configured to use DWARF-2 EH because
the loader corrupts the .eh_frame section when processing the relocations
contained in the .reloc section.
gcc/
* config/i386/i386.c (asm_preferred_eh_data_format): Always use the
PIC encodings for PE-COFF targets.
This is a minor regression present on mainline and 11 branch, whereby the
value of the Enum_Rep attribute is always unsigned.
gcc/ada/
PR ada/101094
* exp_attr.adb (Get_Integer_Type): Return an integer type with the
same signedness as the input type.
For a composite literal we only need to introduce a temporary variable
if we may be converting to an interface type, so only do it then.
This saves over 80% of compilation time when using gccgo to compile
cmd/internal/obj/x86, as the GCC middle-end spends a lot of time
pointlessly computing interactions between temporary variables.
For PR debug/101064
For golang/go#46600
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/331513
We weren't passing 'flags' to the recursive call to cp_parser_declarator
in the ptr-operator case and as an effect, delayed parsing of noexcept
didn't work as advertised. The following change passes more than just
CP_PARSER_FLAGS_DELAY_NOEXCEPT but that doesn't seem to break anything.
I'm now also passing member_p and static_p, as a consequence, two tests
needed small tweaks.
PR c++/100752
gcc/cp/ChangeLog:
* parser.c (cp_parser_declarator): Pass flags down to
cp_parser_declarator. Also pass static_p/member_p.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/noexcept69.C: New test.
* g++.dg/parse/saved1.C: Adjust dg-error.
* g++.dg/template/crash50.C: Likewise.
(cherry picked from commit f9c80eb12c)
The recent float128 ISA3.1 support (r12-1340) has some typos,
it makes the libgcc build fail if it's with one binutils
(assembler) which doesn't support Power10 insns. The error
looks like:
Error: invalid switch -mpower10
Error: unrecognized option -mpower10
... [...libgcc/shared-object.mk:14: float128-p10.o] Error 1
What this patch does are:
- fix test target typo libgcc_cv_powerpc_3_1_float128_hw
(written wrongly as libgcc_cv_powerpc_float128_hw, so it's
going to build ISA3.1 stuffs just when detecting ISA3.0).
- fix test used for libgcc_cv_powerpc_3_1_float128_hw check.
- fix test option used for libgcc_cv_powerpc_3_1_float128_hw
check.
- remove the ISA3.1 related contents from t-float128-hw.
- add new macro FLOAT128_HW_INSNS_ISA3_1 to differentiate
ISA3.1 content from ISA3.0 part in ifunc support.
Bootstrapped/regtested on:
- powerpc64le-linux-gnu P10
- powerpc64le-linux-gnu P9 (w/i and w/o p10 supported as)
- powerpc64-linux-gnu P8 (w/i and w/o p10 supported as)
libgcc/ChangeLog:
PR target/101235
* configure: Regenerate.
* configure.ac (test for libgcc_cv_powerpc_3_1_float128_hw): Fix
typos among the name, CFLAGS and the test.
* config/rs6000/t-float128-hw (fp128_3_1_hw_funcs, fp128_3_1_hw_src,
fp128_3_1_hw_static_obj, fp128_3_1_hw_shared_obj, fp128_3_1_hw_obj):
Remove.
* config/rs6000/t-float128-p10-hw (FLOAT128_HW_INSNS): Append
macro FLOAT128_HW_INSNS_ISA3_1.
(FP128_3_1_CFLAGS_HW): Fix option typo.
* config/rs6000/float128-ifunc.c (SW_OR_HW_ISA3_1): Guard this with
FLOAT128_HW_INSNS_ISA3_1.
(__floattikf_resolve): Likewise.
(__floatuntikf_resolve): Likewise.
(__fixkfti_resolve): Likewise.
(__fixunskfti_resolve): Likewise.
(__floattikf): Likewise.
(__floatuntikf): Likewise.
(__fixkfti): Likewise.
(__fixunskfti): Likewise.
(cherry picked from commit 47749c43ac)
This fixes SLP permute propagation to not propagate across operations
that have different semantics on different lanes like for example
the recently added COMPLEX_ADD_ROT90.
2021-06-24 Richard Biener <rguenther@suse.de>
* tree-vect-slp.c (vect_optimize_slp): Do not propagate
across operations that have different semantics on different
lanes.
This moves the check for same operands after verifying we're
facing compatible calls.
2021-06-22 Richard Biener <rguenther@suse.de>
PR tree-optimization/101158
* tree-vect-slp.c (vect_build_slp_tree_1): Move same operand
checking after checking for matching operation.
* gfortran.dg/pr101158.f90: New testcase.
(cherry picked from commit 7a22d8a764)
The check whether two blocks are in the same irreducible region
and thus post-dominance checks being unreliable was incomplete
since an irreducible region can contain reducible sub-regions but
if one block is in the irreducible part and one not the check
still doesn't work as expected.
2021-06-22 Richard Biener <rguenther@suse.de>
PR tree-optimization/101151
* tree-ssa-sink.c (statement_sink_location): Expand irreducible
region check.
* gcc.dg/torture/pr101151.c: New testcase.
(cherry picked from commit a2ef8395fa)
We were ignoring DR_STEP for VF == 1 which is OK only in case
the scalar order is preserved or both DR steps are the same.
2021-06-23 Richard Biener <rguenther@suse.de>
PR tree-optimization/101105
* tree-vect-data-refs.c (vect_prune_runtime_alias_test_list):
Only ignore steps when they are equal or scalar order is preserved.
* gcc.dg/torture/pr101105.c: New testcase.
(cherry picked from commit 50374fdacb)
This fixes the bogus use of TYPE_PRECISION on vector types
from optimizing -((int)x >> 31) into (unsigned)x >> 31.
2021-05-19 Richard Biener <rguenther@suse.de>
PR middle-end/100672
* fold-const.c (fold_negate_expr_1): Use element_precision.
(negate_expr_p): Likewise.
* gcc.dg/torture/pr100672.c: New testcase.
(cherry picked from commit 8d51039cb7)
When the assembler supports it, the compiler automatically passes --gdwarf-5
to it, which has an interesting side effect: any assembly instruction prior
to the first .file directive defines a new line associated with .file 0 in
the .debug_line section and of course the numbering of these implicit lines
has nothing to do with that of the source code. This can be problematic in
Ada when we do not generate .file/.loc directives for compiled-generated
functions to avoid too jumpy a debugging experience.
gcc/
* dwarf2out.c (dwarf2out_assembly_start): Emit .file 0 marker here..
(dwarf2out_finish): ...instead of here.
The issues are that 1) they use readelf instead of objdump and 2) they use
ELF syntax in the assembly code.
gcc/
* configure.ac (--gdwarf-5 option): Use objdump instead of readelf.
(working --gdwarf-4/--gdwarf-5 for all sources): Likewise.
(--gdwarf-4 not refusing generated .debug_line): Adjust for Windows.
* configure: Regenerate.