183402 Commits

Author SHA1 Message Date
Maya Rashish
f9d4544df5 aarch64: Run SUBTARGET_INIT_BUILTINS if it exists
Some subtargets don't provide the canonical function names as
the symbol name in C libraries, and libcalls will only work if
the builtins are patched to emit the correct library name.

For example, on NetBSD, cabsl has the symbol name __c99_cabsl,
and the patching is done via netbsd_patch_builtin.

With this change, libgfortran.so is correctly built with a
reference to __c99_cabsl, instead of "cabsl" which is not defined.
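
As a rough illustration of the renaming described above, the sketch below models the patching step as a lookup from the canonical C name to the symbol the C library exports. The table and the function library_symbol_for are hypothetical and are not GCC's netbsd_patch_builtin; only the cabsl to __c99_cabsl pair comes from the message.

```cpp
#include <map>
#include <string>

// Hypothetical model: map a canonical builtin name to the library symbol
// that a subtarget wants emitted.  Names without an entry are unchanged.
inline std::string library_symbol_for(const std::string &canonical)
{
  static const std::map<std::string, std::string> renames = {
    {"cabsl", "__c99_cabsl"},  // the NetBSD example from the commit message
  };
  auto it = renames.find(canonical);
  return it == renames.end() ? canonical : it->second;
}
```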

gcc/ChangeLog:
	* config/aarch64/aarch64.c (aarch64_init_builtins):
	Call SUBTARGET_INIT_BUILTINS.
2021-02-15 18:38:55 +00:00
Peter Bergner
a33927c9ab rtl-optimization: Fix uninitialized use of opaque mode variable ICE [PR98872]
The initialize_uninitialized_regs function emits (set (reg:) (CONST0_RTX))
for all uninitialized pseudo uses.  However, some modes (e.g., opaque modes)
may not have a CONST0_RTX defined, leading to an ICE when we try to create
the initialization insn.  The fix is to skip emitting the initialization
if there is no CONST0_RTX defined for the mode.
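
The shape of the fix can be sketched outside GCC as follows. This is a minimal model, assuming a hypothetical PseudoUse type in which an empty optional stands in for a mode whose CONST0_RTX is NULL; it is not the code in init-regs.c.

```cpp
#include <optional>
#include <vector>

// Hypothetical stand-ins; these are not GCC's rtx or machine_mode types.
struct PseudoUse {
  int regno;
  std::optional<int> zero_constant;  // empty when the mode has no zero constant
};

// Returns the register numbers for which an initializing set would be emitted.
inline std::vector<int> regs_to_initialize(const std::vector<PseudoUse> &uses)
{
  std::vector<int> result;
  for (const PseudoUse &use : uses) {
    if (!use.zero_constant)
      continue;  // e.g. an opaque mode: skip instead of building a bad insn
    result.push_back(use.regno);
  }
  return result;
}
```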

2021-02-15  Peter Bergner  <bergner@linux.ibm.com>

gcc/
	PR rtl-optimization/98872
	* init-regs.c (initialize_uninitialized_regs): Skip initialization
	if CONST0_RTX is NULL.

gcc/testsuite/
	PR rtl-optimization/98872
	* gcc.target/powerpc/pr98872.c: New test.
2021-02-15 10:39:24 -06:00
Jonathan Wakely
cc9a0a3d79 libstdc++: Fix __thread_yield for non-gthreads targets
The __gthread_yield() function is only defined for gthreads targets, so
check _GLIBCXX_HAS_GTHREADS before using it.

Also reorder __thread_relax and __thread_yield so that the former can
use the latter instead of repeating the same preprocessor checks.

libstdc++-v3/ChangeLog:

	* include/bits/atomic_wait.h (__thread_yield()): Check
	_GLIBCXX_HAS_GTHREADS before using __gthread_yield.
	(__thread_relax()): Use __thread_yield() instead of repeating
	the preprocessor checks for __gthread_yield.
2021-02-15 15:52:25 +00:00
Jonathan Wakely
d27153f038 libstdc++: Add missing return and use reserved name
The once_flag::_M_activate() function is only ever called immediately
after a call to once_flag::_M_passive(), and so in the non-gthreads case
it is impossible for _M_passive() to be true in the body of
_M_activate(). Add a check for it anyway, to avoid warnings about
missing return.

Also replace a non-reserved name with a reserved one.

libstdc++-v3/ChangeLog:

	* include/std/mutex (once_flag::_M_activate()): Add explicit
	return statement for passive case.
	(once_flag::_M_finish(bool)): Use reserved name for parameter.
2021-02-15 15:52:25 +00:00
Richard Sandiford
abe07a74bb rtl-ssa: Reduce the amount of temporary memory needed [PR98863]
The rtl-ssa code uses an on-the-side IL and needs to build that IL
for each block and RTL insn.  I'd originally not used the classical
dominance frontier method for placing phis on the basis that it seemed
like more work in this context: we're having to visit everything in
an RPO walk anyway, so for non-backedge cases we can tell immediately
whether a phi node is needed.  We then speculatively created phis for
registers that are live across backedges and simplified them later.
This avoided having to walk most of the IL twice (once to build the
initial IL, and once to link uses to phis).

However, as shown in PR98863, this leads to excessive temporary
memory in extreme cases, since we had to record the value of
every live register on exit from every block.  In that PR,
there were many registers that were live (but unused) across
a large region of code.

This patch does use the classical approach to placing phis, but tries
to use the existing DF defs information to avoid two walks of the IL.
We still use the previous approach for memory, since there is no
up-front information to indicate whether a block defines memory or not.
However, since memory is just treated as a single unified thing
(like for gimple vops), memory doesn't suffer from the same
scalability problems as registers.
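
For readers unfamiliar with the classical approach, here is a minimal sketch of phi placement over iterated dominance frontiers for a single register. It assumes the dominance frontiers are already computed, and the name phi_blocks_for is hypothetical; it is not the rtl-ssa implementation, which additionally relies on the DF defs information.

```cpp
#include <set>
#include <vector>

// df[b] is the (precomputed) dominance frontier of block b.
// def_blocks is the set of blocks that define the register.
// Returns the blocks that need a phi node for that register.
inline std::set<int> phi_blocks_for(const std::vector<std::set<int>> &df,
                                    const std::set<int> &def_blocks)
{
  std::set<int> phis;
  std::vector<int> worklist(def_blocks.begin(), def_blocks.end());
  while (!worklist.empty()) {
    int b = worklist.back();
    worklist.pop_back();
    for (int f : df[b]) {
      // A new phi is itself a definition, so its frontier must be
      // visited too, unless f is already queued as a defining block.
      if (phis.insert(f).second && !def_blocks.count(f))
        worklist.push_back(f);
    }
  }
  return phis;
}
```

For a diamond CFG (0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3), a definition in block 1 needs a phi only in the join block 3.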

With this change, fwprop no longer seems to be a memory-hog outlier
in the PR: the maximum RSS is similar with and without fwprop.

The PR also shows the problems inherent in using bitmap operations
involving the live-in and live-out sets, which in the testcase are
very large.  I've therefore tried to reduce those operations to the
bare minimum.

The patch also includes other compile-time optimisations motivated
by the PR; see the changelog for details.

I tried adding:

    for (int i = 0; i < 200; ++i)
      {
	crtl->ssa = new rtl_ssa::function_info (cfun);
	delete crtl->ssa;
      }

to fwprop.c to stress the code.  fwprop then took 35% of the compile
time for the problematic partition in the PR (measured on a release
build).  fwprop takes less than .5% of the compile time when running
normally.

The command:

  git diff 0b76990a9d75d97b84014e37519086b81824c307~ gcc/fwprop.c | \
    patch -p1 -R

still gives a working compiler that uses the old fwprop.c.  The compile
time with that version is very similar.

For a more reasonable testcase like optabs.ii at -O, I saw a 6.7%
compile time regression with the loop above added (i.e. creating
the info 201 times per pass instead of once per pass).  That goes
down to 4.8% with -O -g.  I can't measure a significant difference
with a normal compiler (no 200-iteration loop).

So I think that (as expected) the patch does make things a bit
slower in the normal case.  But like Richi says, peak memory usage
is harder for users to work around than slightly slower compile times.

gcc/
	PR rtl-optimization/98863
	* rtl-ssa/functions.h (function_info::bb_live_out_info): Delete.
	(function_info::build_info): Turn into a declaration, moving the
	definition to internals.h.
	(function_info::bb_walker): Declare.
	(function_info::create_reg_use): Likewise.
	(function_info::calculate_potential_phi_regs): Take a build_info
	parameter.
	(function_info::place_phis, function_info::create_ebbs): Declare.
	(function_info::calculate_ebb_live_in_for_debug): Likewise.
	(function_info::populate_backedge_phis): Delete.
	(function_info::start_block, function_info::end_block): Declare.
	(function_info::populate_phi_inputs): Delete.
	(function_info::m_potential_phi_regs): Move information to build_info.
	* rtl-ssa/internals.h: New file.
	(function_info::bb_phi_info): New class.
	(function_info::build_info): Moved from functions.h.
	Add a constructor and destructor.
	(function_info::build_info::ebb_use): Delete.
	(function_info::build_info::ebb_def): Likewise.
	(function_info::build_info::bb_live_out): Likewise.
	(function_info::build_info::tmp_ebb_live_in_for_debug): New variable.
	(function_info::build_info::potential_phi_regs): Likewise.
	(function_info::build_info::potential_phi_regs_for_debug): Likewise.
	(function_info::build_info::ebb_def_regs): Likewise.
	(function_info::build_info::bb_phis): Likewise.
	(function_info::build_info::bb_mem_live_out): Likewise.
	(function_info::build_info::bb_to_rpo): Likewise.
	(function_info::build_info::def_stack): Likewise.
	(function_info::build_info::old_def_stack_limit): Likewise.
	* rtl-ssa/internals.inl (function_info::build_info::record_reg_def):
	Remove the regno argument.  Push the previous definition onto the
	definition stack where necessary.
	* rtl-ssa/accesses.cc: Include internals.h.
	* rtl-ssa/changes.cc: Likewise.
	* rtl-ssa/blocks.cc: Likewise.
	(function_info::build_info::build_info): Define.
	(function_info::build_info::~build_info): Likewise.
	(function_info::bb_walker): New class.
	(function_info::bb_walker::bb_walker): Define.
	(function_info::add_live_out_use): Convert a logarithmic-complexity
	test into a linear one.  Allow the same definition to be passed
	multiple times.
	(function_info::calculate_potential_phi_regs): Moved from
	functions.cc.  Take a build_info parameter and store the
	information there instead.
	(function_info::place_phis): New function.
	(function_info::add_entry_block_defs): Update call to record_reg_def.
	(function_info::calculate_ebb_live_in_for_debug): New function.
	(function_info::add_phi_nodes): Use bb_phis to decide which
	registers need phi nodes and initialize ebb_def_regs accordingly.
	Do not add degenerate phis here.
	(function_info::add_artificial_accesses): Use create_reg_use.
	Assert that all definitions are listed in the DF LR sets.
	Update call to record_reg_def.
	(function_info::record_block_live_out): Record live-out register
	values in the phis of successor blocks.  Use the live-out set
	when processing the last block in an EBB, instead of always
	using the live-in sets of successor blocks.  AND the live sets
	with the set of registers that have been defined in the EBB,
	rather than with all potential phi registers.  Cope correctly
	with branches back to the start of the current EBB.
	(function_info::start_block): New function.
	(function_info::end_block): Likewise.
	(function_info::populate_phi_inputs): Likewise.
	(function_info::create_ebbs): Likewise.
	(function_info::process_all_blocks): Rewrite into a multi-phase
	process.
	* rtl-ssa/functions.cc: Include internals.h.
	(function_info::calculate_potential_phi_regs): Move to blocks.cc.
	(function_info::init_function_data): Remove caller.
	* rtl-ssa/insns.cc: Include internals.h.
	(function_info::create_reg_use): New function.  Lazily create any
	degenerate phis needed by the linear RPO view.
	(function_info::record_use): Use create_reg_use.  When processing
	debug uses, use potential_phi_regs and test it before checking
	whether the register is live on entry to the current EBB.  Lazily
	calculate ebb_live_in_for_debug.
	(function_info::record_call_clobbers): Update call to record_reg_def.
	(function_info::record_def): Likewise.
2021-02-15 15:05:22 +00:00
Martin Liska
40f235b5f0 Fix 2 more leaks related to gen_command_line_string.
gcc/ChangeLog:

	* toplev.c (init_asm_output): Free output of
	gen_command_line_string function.
	(process_options): Likewise.
2021-02-15 16:01:58 +01:00
Martin Liska
26cedbce4b Add 2 missing Param keywords.
gcc/ChangeLog:

	* params.opt: Add 2 missing Param keywords.
2021-02-15 15:09:04 +01:00
Eric Botcazou
8ec4f693fb Fix cast in df_worklist_dataflow_doublequeue
The existing cast to float gives weird results in the RTL dump files
on x86 when the compiler is configured --with-fpmath=sse.

gcc/
	* df-core.c (df_worklist_dataflow_doublequeue): Use proper cast.
2021-02-15 10:43:30 +01:00
Jakub Jelinek
70099a6acf match.pd: Fix up A % (cast) (pow2cst << B) simplification [PR99079]
The (mod @0 (convert?@3 (power_of_two_cand@1 @2))) simplification
uses tree_nop_conversion_p (type, TREE_TYPE (@3)) condition, but I believe
it doesn't check what it was meant to check.  On convert?@3
TREE_TYPE (@3) is not the type of what it has been converted from, but
what it has been converted to, which needs to be (because it is operand
of normal binary operation) equal or compatible to type of the modulo
result and first operand - type.
I could fix that by using && tree_nop_conversion_p (type, TREE_TYPE (@1))
and be done with it, but actually most of the non-nop conversions are IMHO
ok and so we would regress those optimizations.
In particular, if we have say narrowing conversions (foo5 and foo6 in
the new testcase), I think we are fine, either the shift of the power of two
constant after narrowing conversion is still that power of two (or negation
of that) and then it will still work, or the result of narrowing conversion
is 0 and then we would have UB which we can ignore.
Similarly, widening conversions where the shift result is unsigned are fine,
or even widening conversions where the shift result is signed, but we sign
extend to a signed wider divisor, the problematic case of INT_MIN will
become x % (long long) INT_MIN and we can still optimize that to
x & (long long) INT_MAX.
What doesn't work is the case in the pr99079.c testcase, widening conversion
of a signed shift result to wider unsigned divisor, where if the shift
is negative, we end up with x % (unsigned long long) INT_MIN which is
x % 0xffffffff80000000ULL where the divisor is not a power of two and
we can't optimize that to x & 0x7fffffffULL.

So, the patch rejects only the single problematic case.

Furthermore, when the shift result is signed, we were introducing UB into
a program which previously didn't have one (well, left shift into the sign
bit is UB in some language/version pairs, but it is definitely valid in
C++20 - wonder if I shouldn't move the gcc.c-torture/execute/pr99079.c
testcase to g++.dg/torture/pr99079.C and use -std=c++20), by adding that
subtraction of 1, x % (1 << 31) in C++20 is well defined, but
x & ((1 << 31) - 1) triggers UB on the subtraction.
So, the patch performs the subtraction in the unsigned type if it isn't
wrapping.
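
The problematic case can be checked with ordinary integer arithmetic. The sketch below uses hypothetical helper names (mod_form, mask_form) and takes the already-shifted signed 32-bit value as a parameter; it models the source-level semantics only, not the match.pd pattern.

```cpp
#include <climits>
#include <cstdint>

// x % (unsigned long long) shifted: a negative 'shifted' is sign-extended,
// so INT_MIN becomes 0xffffffff80000000, which is not a power of two.
inline std::uint64_t mod_form(std::uint64_t x, int shifted)
{
  return x % static_cast<std::uint64_t>(shifted);
}

// x & (shifted - 1), with the subtraction done in the unsigned type so
// that shifted == INT_MIN does not overflow a signed int.
inline std::uint64_t mask_form(std::uint64_t x, int shifted)
{
  std::uint32_t mask = static_cast<std::uint32_t>(shifted) - 1u;
  return x & mask;
}
```

For a positive power of two the two forms agree, e.g. 100 % 16 and 100 & 15 are both 4. For shifted == INT_MIN and x == 0x80000000 they differ: the modulo leaves x unchanged while the mask yields 0.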

2021-02-15  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/99079
	* match.pd (A % (pow2pcst << N) -> A & ((pow2pcst << N) - 1)): Remove
	useless tree_nop_conversion_p (type, TREE_TYPE (@3)) check.  Instead
	require both type and TREE_TYPE (@1) to be integral types and either
	type having smaller or equal precision, or TREE_TYPE (@1) being
	unsigned type, or type being signed type.  If TREE_TYPE (@1)
	doesn't have wrapping overflow, perform the subtraction of one in
	unsigned type.

	* gcc.dg/fold-modpow2-2.c: New test.
	* gcc.c-torture/execute/pr99079.c: New test.
2021-02-15 09:16:06 +01:00
GCC Administrator
c5ae38e8dc Daily bump. 2021-02-15 00:16:18 +00:00
Jan Hubicka
9966699d7a Fix memory leak in ipa-reference
2021-02-14  Jan Hubicka  <hubicka@ucw.cz>
	    Richard Biener  <rguether@suse.de>

	PR ipa/97346
	* ipa-reference.c (ipa_init): Only conditionally initialize
	reference_vars_to_consider.
	(propagate): Conditionally deinitialize reference_vars_to_consider.
	(ipa_reference_write_optimization_summary): Sanity check that
	reference_vars_to_consider is not allocated.
2021-02-14 23:24:44 +01:00
Jonathan Wakely
4e3590d06c libstdc++: Restore <unistd.h> in testsuite_fs.h header [PR 99096]
libstdc++-v3/ChangeLog:

	PR libstdc++/99096
	* testsuite/util/testsuite_fs.h: Always include <unistd.h>.
2021-02-14 20:38:32 +00:00
GCC Administrator
c8656df666 Daily bump. 2021-02-14 00:16:34 +00:00
Levy Hsu
18fabc35f4 RISC-V: Avoid zero/sign extend for volatile loads. Fix for 97417.
This expands sub-word loads as a zero/sign extended load, followed by
a subreg.  This helps eliminate unnecessary zero/sign extend insns after
the load, particularly for volatiles, but also in some other cases.
Testing shows that it gives consistent code size decreases.

Tested with riscv32-elf rv32imac/ilp32 and riscv64-linux rv64gc/lp64d
builds and checks.  Some -gsplit-stack tests fail with the patch, but
this turns out to be an existing bug with the split-stack support that
I hadn't noticed before.  It isn't a bug in this patch.  Ignoring that
there are no regressions.

Committed.

	gcc/
	PR target/97417
	* config/riscv/riscv-shorten-memrefs.c (pass_shorten_memrefs): Add
	extend parameter to get_si_mem_base_reg declaration.
	(get_si_mem_base_reg): Add extend parameter.  Set it.
	(analyze): Pass extend arg to get_si_mem_base_reg.
	(transform): Likewise.  Use it when rewriting mems.
	* config/riscv/riscv.c (riscv_legitimize_move): Check for subword
	loads and emit sign/zero extending load followed by subreg move.
2021-02-13 12:33:44 -08:00
Jim Wilson
a4953810ba RISC-V: Shorten memrefs improvement, partial fix 97417.
We already have a check for riscv_shorten_memrefs in riscv_address_cost.
This adds the same check to riscv_rtx_costs.  Making this work also
requires a change to riscv_compressed_lw_address_p to work before reload
by checking the offset and assuming any pseudo reg is OK.  Testing shows
that this consistently gives small code size reductions.

	gcc/
	PR target/97417
	* config/riscv/riscv.c (riscv_compressed_lw_address_p): Drop early
	exit when !reload_completed.  Only perform check for compressed reg
	if reload_completed.
	(riscv_rtx_costs): In MEM case, when optimizing for size and
	shorten memrefs, if not compressible, then increase cost.
2021-02-13 12:13:08 -08:00
Jakub Jelinek
05402ca65a passes: Enable split4 with selective scheduling 2 [PR98439]
As mentioned in the PR, we have 5 split passes (+ splitting during final).
split1 is before RA and is unconditional,
split2 is after RA and is gated on optimize > 0,
split3 is before sched2 and is gated on
  defined(INSN_SCHEDULING) && optimize > 0 && flag_schedule_insns_after_reload
split4 is before regstack and is gated on
  HAVE_ATTR_length && defined (STACK_REGS) && !gate (split3)
split5 is before shorten_branches and is gated on
  HAVE_ATTR_length && !defined (STACK_REGS)
and the splitting during final works only when !HAVE_ATTR_length.
STACK_REGS is a macro enabled only on i386/x86_64.

The problem with the following testcase is that split3 before sched2
is the last splitting pass for the target/command line options set,
but selective scheduling unlike normal scheduling can create new
instructions that need to be split, which means we ICE during final as
there are insns that require splitting but nothing split them.

This patch fixes it by doing split4 also when -fselective-scheduling2
is enabled on x86 and split3 has been run.  As that option isn't on
by default, it should slow down compilation only for those that enable
that option.
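
The gating change can be modelled with plain booleans. This is a sketch under stated assumptions: the Config struct and the gate_* functions are hypothetical names that mirror the conditions listed above, not the code in recog.c.

```cpp
// Hypothetical model of the conditions listed in the message above.
struct Config {
  bool have_attr_length;
  bool stack_regs;                   // enabled only on i386/x86_64
  bool insn_scheduling;
  int optimize;
  bool schedule_insns_after_reload;
  bool selective_scheduling2;        // -fselective-scheduling2
};

inline bool gate_split3(const Config &c)
{
  return c.insn_scheduling && c.optimize > 0 && c.schedule_insns_after_reload;
}

// Before the patch: split4 never runs when split3 has run.
inline bool gate_split4_old(const Config &c)
{
  return c.have_attr_length && c.stack_regs && !gate_split3(c);
}

// After the patch: split4 also runs after split3 if selective
// scheduling 2 is on, since it can create insns that need splitting.
inline bool gate_split4_new(const Config &c)
{
  return c.have_attr_length && c.stack_regs
         && (!gate_split3(c) || c.selective_scheduling2);
}
```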

2021-02-13  Jakub Jelinek  <jakub@redhat.com>

	PR rtl-optimization/98439
	* recog.c (pass_split_before_regstack::gate): Enable even when
	pass_split_before_sched2 is enabled if -fselective-scheduling2 is
	on.

	* gcc.target/i386/pr98439.c: New test.
2021-02-13 16:08:29 +01:00
Iain Buclaw
a3b38b7781 d: Merge upstream dmd 7132b3537
Splits out all semantic passes for Dsymbol, Type, and TemplateParameter
nodes into Visitors in separate files, and the copyright years of all
sources have been updated.

Reviewed-on: https://github.com/dlang/dmd/pull/12190

gcc/d/ChangeLog:

	* dmd/MERGE: Merge upstream dmd 7132b3537.
	* Make-lang.in (D_FRONTEND_OBJS): Add d/dsymbolsem.o, d/semantic2.o,
	d/semantic3.o, and d/templateparamsem.o.
	* d-compiler.cc (Compiler::genCmain): Update calls to semantic
	entrypoint functions.
	* d-lang.cc (d_parse_file): Likewise.
	* typeinfo.cc (make_frontend_typeinfo): Likewise.
2021-02-13 12:50:45 +01:00
Jakub Jelinek
0f3a743b68 i386: Add combiner splitter to optimize V2SImode memory rotation [PR96166]
Since the x86 backend enabled V2SImode vectorization (with
TARGET_MMX_WITH_SSE), slp vectorization can kick in and emit
        movq    (%rdi), %xmm1
        pshufd  $225, %xmm1, %xmm0
        movq    %xmm0, (%rdi)
instead of
        rolq    $32, (%rdi)
we used to emit (or emit when slp vectorization is disabled).
I think the rotate is both smaller and faster, so this patch adds
a combiner splitter to optimize that back.
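
The equivalence the splitter relies on is that swapping the two 32-bit elements of a 64-bit value is the same as rotating it by 32 bits. A minimal sketch, with hypothetical function names:

```cpp
#include <cstdint>

// Swap the two 32-bit elements, as the movq/pshufd/movq sequence does
// for the low 64 bits.
inline std::uint64_t swap_v2si(std::uint64_t v)
{
  std::uint32_t lo = static_cast<std::uint32_t>(v);
  std::uint32_t hi = static_cast<std::uint32_t>(v >> 32);
  return (static_cast<std::uint64_t>(lo) << 32) | hi;
}

// Rotate a 64-bit value left by 32 bits, as rolq $32 does.
inline std::uint64_t rotate_left_32(std::uint64_t v)
{
  return (v << 32) | (v >> 32);
}
```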

2021-02-13  Jakub Jelinek  <jakub@redhat.com>

	PR target/96166
	* config/i386/mmx.md (*mmx_pshufd_1): Add a combine splitter for
	swap of V2SImode elements in memory into DImode memory rotate by 32.

	* gcc.target/i386/pr96166.c: New test.
2021-02-13 10:32:16 +01:00
GCC Administrator
fab095dad5 Daily bump. 2021-02-13 00:16:38 +00:00
Jakub Jelinek
eb64b0b285 testsuite: Restrict gcc.dg/rtl/aarch64/multi-subreg-1.c test to aarch64 only
2021-02-13  Jakub Jelinek  <jakub@redhat.com>

	* gcc.dg/rtl/aarch64/multi-subreg-1.c: Add dg-do compile directive
	and restrict the test to aarch64-*-* target only.
2021-02-13 00:02:28 +01:00
Nathan Sidwell
8c4137c7ea c++: Seed imported bindings [PR 99039]
As mentioned in 99040's fix, we can get inter-module using decls.  If the
using decl is the only reference to an import, we'll have failed to
seed our imports leading to an assertion failure.  The fix is
straightforward: check binding contents when seeding imports.

	gcc/cp/
	* module.cc (module_state::write_cluster): Check bindings for
	imported using-decls.
	gcc/testsuite/
	* g++.dg/modules/pr99039_a.C: New.
	* g++.dg/modules/pr99039_b.C: New.
2021-02-12 13:50:03 -08:00
Nathan Sidwell
0c27fe96f8 c++: Register streamed-in decls when new [PR 99040]
With modules one can have using-decls referring to their own scope.  This
is the way to export things from the GMF or from an import.  The
problem was I was using current_ns == CP_DECL_CONTEXT (decl) to
determine whether a decl should be registered in a namespace level or
not.  But that's an inadequate check and we ended up reregistering
decls and creating a circular list.  We should be registering the decl
when first encountered -- whether we bind it is orthogonal to that.

	PR c++/99040
	gcc/cp/
	* module.cc (trees_in::decl_value): Call add_module_namespace_decl
	for new namespace-scope entities.
	(module_state::read_cluster): Don't call add_module_decl here.
	* name-lookup.h (add_module_decl): Rename to ...
	(add_module_namespace_decl): ... this.
	* name-lookup.c (newbinding_bookkeeping): Move into ...
	(do_pushdecl): ... here.  Its only remaining caller.
	(add_module_decl): Rename to ...
	(add_module_namespace_decl): ... here.  Add checking-assert for
	circularity. Don't call newbinding_bookkeeping, just extern_c
	checking and incomplete var checking.
	gcc/testsuite/
	* g++.dg/modules/pr99040_a.C: New.
	* g++.dg/modules/pr99040_b.C: New.
	* g++.dg/modules/pr99040_c.C: New.
	* g++.dg/modules/pr99040_d.C: New.
2021-02-12 13:50:03 -08:00
Nathan Sidwell
8f93e1b892 Expunge namespace-scope IDENTIFIER_TYPE_VALUE & global_type_name [PR 99039]
IDENTIFIER_TYPE_VALUE and friends are a remnant of G++'s C origins.  They
hold elaborated types on identifier-nodes.  While this is fine for C
and for local and class-scopes in C++, it fails badly for namespaces.
In that case a marker 'global_type_node' was used, which essentially
signified 'this is a namespace-scope type *somewhere*', and you'd have
to do a regular name_lookup to find it.  As the parser and
substitution machinery has advanced over the last 25 years or so,
there's not much outside of actual name-lookup that uses that.
Amusingly the IDENTIFIER_HAS_TYPE_VALUE predicate will do an actual
name-lookup and then users would repeat that lookup to find the
now-known to be there type.

Rather late I realized that this interferes with the lazy loading of
module entities, because we were setting IDENTIFIER_TYPE_VALUE to
global_type_node.  But we could be inside some local scope where that
identifier is bound to some local type.  Not good!

Rather than add more cruft to look at an identifier's shadow stack and
alter that as necessary, this takes the approach of removing the
existing cruft.

We nuke the few places outside of name lookup that use
IDENTIFIER_TYPE_VALUE, replacing them with either proper name
lookups, alternative sequences, or in some cases asserting that they
(no longer) happen.  Class template instantiation was calling pushtag
after setting IDENTIFIER_TYPE_VALUE in order to stop pushtag creating
an implicit typedef and pushing it, but to get the bookkeeping it
needed.  Let's just do the bookkeeping directly.

Then we can stop having a 'bound at namespace-scope' marker at all,
which means lazy loading won't screw up local shadow stacks.  Also, it
simplifies set_identifier_type_value_with_scope, as it never needs to
inspect the scope stack.  When developing this patch, I discovered a
number of places we'd put an actual namespace-scope type on the
type_value slot, rather than global_type_node.  You might notice this
is killing at least two 'why are we doing this?' comments.

While this doesn't fix the two PRs mentioned, it is a necessary step.

	PR c++/99039
	PR c++/99040
	gcc/cp/
	* cp-tree.h (CPTI_GLOBAL_TYPE): Delete.
	(global_type_node): Delete.
	(IDENTIFIER_TYPE_VALUE): Delete.
	(IDENTIFIER_HAS_TYPE_VALUE): Delete.
	(get_type_value): Delete.
	* name-lookup.h (identifier_type_value): Delete.
	* name-lookup.c (check_module_override): Don't
	SET_IDENTIFIER_TYPE_VALUE here.
	(do_pushdecl): Nor here.
	(identifier_type_value_1, identifier_type_value): Delete.
	(set_identifier_type_value_with_scope): Only
	SET_IDENTIFIER_TYPE_VALUE for local and class scopes.
	(pushdecl_namespace_level): Remove shadow stack nadgering.
	(do_pushtag): Use REAL_IDENTIFIER_TYPE_VALUE.
	* call.c (check_dtor_name): Use lookup_name.
	* decl.c (cxx_init_decl_processing): Drop global_type_node.
	* decl2.c (cplus_decl_attributes): Don't SET_IDENTIFIER_TYPE_VALUE
	here.
	* init.c (get_type_value): Delete.
	* pt.c (instantiate_class_template_1): Don't call pushtag or
	SET_IDENTIFIER_TYPE_VALUE here.
	(tsubst): Assert never an identifier.
	(dependent_type_p): Drop global_type_node assert.
	* typeck.c (error_args_num): Don't use IDENTIFIER_HAS_TYPE_VALUE
	to determine ctorness.
	gcc/testsuite/
	* g++.dg/lookup/pr99039.C: New.
2021-02-12 13:50:03 -08:00
Michael Matloob
9769564e74 compiler: open byte slice and string embeds using the absolute path
The paths vector contains the names of the files that the embed_files_
map is keyed by. While the code processing embed.FS values looks up
the paths in the embed_files_ map, the code processing string and byte
slice embeds tries opening the files using their names directly. Look
up the full paths in the embed_files_ map when opening them.

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/291429
2021-02-12 11:18:32 -08:00
Martin Sebor
f3d7fd1475 PR c/99055 - memory leak in warn_parm_array_mismatch
gcc/c-family/ChangeLog:

	PR c/99055
	* c-warn.c (warn_parm_array_mismatch): Free strings returned from
	print_generic_expr_to_str.

gcc/ChangeLog:

	* tree-pretty-print.c (print_generic_expr_to_str): Update comment.
2021-02-12 11:18:52 -07:00
Steve Kargl
0631e008ad libgfortran: Fix PR95647 by changing the interfaces of operators .eq. and .ne.
The FE converts the old school .eq. to ==,
and then tracks the ==.  The module starts with == and so it does not
properly overload the .eq.  Reversing the interfaces fixes this.

2021-02-12  Steve Kargl <sgk@troutmask.apl.washington.edu>

libgfortran/ChangeLog:

	PR libfortran/95647
	* ieee/ieee_arithmetic.F90: Flip interfaces of operators .eq. to
	== and .ne. to /= .

gcc/testsuite/ChangeLog:

	PR libfortran/95647
	* gfortran.dg/ieee/ieee_12.f90: New test.
2021-02-12 07:58:16 -08:00
Richard Sandiford
adfee3c4c0 rtl-ssa: Use right obstack for temporary allocation
I noticed while working on PR98863 that we were using the main
obstack to allocate temporary uses.  That was safe, but represents
a kind of local memory leak.

gcc/
	* rtl-ssa/accesses.cc (function_info::make_use_available): Use
	m_temp_obstack rather than m_obstack to allocate the temporary use.
2021-02-12 15:54:49 +00:00
Richard Sandiford
f60226fd72 df: Record all definitions in DF_LR_BB_INFO->def [PR98863]
df_lr_bb_local_compute has:

      FOR_EACH_INSN_INFO_DEF (def, insn_info)
	/* If the def is to only part of the reg, it does
	   not kill the other defs that reach here.  */
	if (!(DF_REF_FLAGS (def) & (DF_REF_PARTIAL | DF_REF_CONDITIONAL)))

However, as noted in the comment in the patch and below, almost all
partial definitions have an associated use.  This means that the
confluence function:

  IN = (OUT & ~DEF) | USE

is unaffected by whether partial definitions are in DEF or not.
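
This can be checked with a small bitset model of the equation. The RegSet alias and the live_in function are hypothetical names for illustration; the sketch assumes only the formula quoted above.

```cpp
#include <bitset>

using RegSet = std::bitset<8>;  // one bit per register, for illustration

// IN = (OUT & ~DEF) | USE
inline RegSet live_in(const RegSet &out, const RegSet &def, const RegSet &use)
{
  return (out & ~def) | use;
}
```

If register 1 is partially defined and therefore also appears in USE, the result is the same whether or not bit 1 is set in DEF, because the final OR with USE puts the bit back.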

Even though the choice doesn't matter for the LR problem itself,
it's IMO much more convenient for consumers if DEF contains all the
definitions in the block.  The only pre-RTL-SSA code that tries to
consume DEF directly is shrink-wrap.c, which already has to work
around the incompleteness of the information:

	  /* DF_LR_BB_INFO (bb)->def does not comprise the DF_REF_PARTIAL and
	     DF_REF_CONDITIONAL defs.  So if DF_LIVE doesn't exist, i.e.
	     at -O1, just give up searching NEXT_BLOCK.  */

I hit the same problem when trying to fix the RTL-SSA part of PR98863.

This patch treats partial definitions as both a def and a use,
just like the df_ref records almost always do.

To show that partial definitions almost always have uses:

  DF_REF_CONDITIONAL:

    Added by:

      case COND_EXEC:
	df_defs_record (collection_rec, COND_EXEC_CODE (x),
			bb, insn_info, DF_REF_CONDITIONAL);
	break;

    Later, df_get_conditional_uses creates uses for all DF_REF_CONDITIONAL
    definitions.

  DF_REF_PARTIAL:

    In total, there are 4 locations at which we add partial definitions.

    Case 1:

      if (GET_CODE (dst) == STRICT_LOW_PART)
	{
	  flags |= DF_REF_READ_WRITE | DF_REF_PARTIAL | DF_REF_STRICT_LOW_PART;

	  loc = &XEXP (dst, 0);
	  dst = *loc;
	}

    Corresponding use:

      case STRICT_LOW_PART:
	{
	  rtx *temp = &XEXP (dst, 0);
	  /* A strict_low_part uses the whole REG and not just the
	   SUBREG.  */
	  dst = XEXP (dst, 0);
	  df_uses_record (collection_rec,
			  (GET_CODE (dst) == SUBREG) ? &SUBREG_REG (dst) : temp,
			  DF_REF_REG_USE, bb, insn_info,
			  DF_REF_READ_WRITE | DF_REF_STRICT_LOW_PART);
	}
	break;

    Case 2:

      if (GET_CODE (dst) == ZERO_EXTRACT)
	{
	  flags |= DF_REF_READ_WRITE | DF_REF_PARTIAL | DF_REF_ZERO_EXTRACT;

	  loc = &XEXP (dst, 0);
	  dst = *loc;
	}

    Corresponding use:

      case ZERO_EXTRACT:
	{
	  df_uses_record (collection_rec, &XEXP (dst, 1),
			  DF_REF_REG_USE, bb, insn_info, flags);
	  df_uses_record (collection_rec, &XEXP (dst, 2),
			  DF_REF_REG_USE, bb, insn_info, flags);
	  if (GET_CODE (XEXP (dst,0)) == MEM)
	    df_uses_record (collection_rec, &XEXP (dst, 0),
			    DF_REF_REG_USE, bb, insn_info,
			    flags);
	  else
	    df_uses_record (collection_rec, &XEXP (dst, 0),
			    DF_REF_REG_USE, bb, insn_info,
			    DF_REF_READ_WRITE | DF_REF_ZERO_EXTRACT);
----------------------------^^^^^^^^^^^^^^^^^
	}
	break;

    Case 3:

      else if (GET_CODE (dst) == SUBREG && REG_P (SUBREG_REG (dst)))
	{
	  if (read_modify_subreg_p (dst))
	    flags |= DF_REF_READ_WRITE | DF_REF_PARTIAL;

	  flags |= DF_REF_SUBREG;

	  df_ref_record (DF_REF_REGULAR, collection_rec,
			 dst, loc, bb, insn_info, DF_REF_REG_DEF, flags);
	}

    Corresponding use:

      case SUBREG:
	if (read_modify_subreg_p (dst))
	  {
	    df_uses_record (collection_rec, &SUBREG_REG (dst),
			    DF_REF_REG_USE, bb, insn_info,
			    flags | DF_REF_READ_WRITE | DF_REF_SUBREG);
	    break;
	  }

    Case 4:

      /*  If this is a multiword hardreg, we create some extra
	  datastructures that will enable us to easily build REG_DEAD
	  and REG_UNUSED notes.  */
      if (collection_rec
	  && (endregno != regno + 1) && insn_info)
	{
	  /* Sets to a subreg of a multiword register are partial.
	     Sets to a non-subreg of a multiword register are not.  */
	  if (GET_CODE (reg) == SUBREG)
	    ref_flags |= DF_REF_PARTIAL;
	  ref_flags |= DF_REF_MW_HARDREG;

    Corresponding use:

      None.  However, this case should be rare to non-existent on most
      targets, and the current handling seems suspect.  See the comment
      in the patch for more details.

gcc/
	* df-problems.c (df_lr_bb_local_compute): Treat partial definitions
	as read-modify operations.

gcc/testsuite/
	* gcc.dg/rtl/aarch64/multi-subreg-1.c: New test.
2021-02-12 15:54:48 +00:00
Jonathan Wakely
b7210405ed libstdc++: Re-enable workaround for _wstat64 bug, again [PR 88881]
I forgot that the workaround is present in both filesystem::status and
filesystem::symlink_status. This restores it in the latter.

libstdc++-v3/ChangeLog:

	PR libstdc++/88881
	* src/c++17/fs_ops.cc (fs::symlink_status): Re-enable workaround.
2021-02-12 15:30:35 +00:00
Jonathan Wakely
1dfd95f0a0 libstdc++: Fix filesystem::rename on Windows [PR 98985]
The _wrename function won't overwrite an existing file, so use
MoveFileEx instead. That allows renaming directories over files, which
POSIX doesn't allow, so check for that case explicitly and report an
error.

Also document the deviation from the expected behaviour, and add a test
for filesystem::rename which was previously missing.

The Filesystem TS experimental::filesystem::rename doesn't have that
extra code to handle directories correctly, so the relevant parts of the
new test are not run on Windows.

libstdc++-v3/ChangeLog:

	* doc/xml/manual/status_cxx2014.xml: Document implementation
	specific properties of std::experimental::filesystem::rename.
	* doc/xml/manual/status_cxx2017.xml: Document implementation
	specific properties of std::filesystem::rename.
	* doc/html/*: Regenerate.
	* src/c++17/fs_ops.cc (fs::rename): Implement correct behaviour
	for directories on Windows.
	* src/filesystem/ops-common.h (__gnu_posix::rename): Use
	MoveFileExW on Windows.
	* testsuite/27_io/filesystem/operations/rename.cc: New test.
	* testsuite/experimental/filesystem/operations/rename.cc: New test.
2021-02-12 15:29:50 +00:00
Jonathan Wakely
4179ec1079 libstdc++: Make "nonexistent" paths less predictable in filesystem tests
The helper function for creating new paths doesn't work well on Windows,
because the PID of a process started by Wine is very consistent and so
the same path gets created each time.
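
A minimal sketch of the idea, assuming a hypothetical helper named
unique_name (this is not the testsuite's nonexistent_path, and the
caller-supplied pid parameter is an illustration only): appending a
random number keeps the name from repeating even when the PID does.

```cpp
#include <cassert>
#include <random>
#include <string>

// Hypothetical helper, not the testsuite's nonexistent_path: build a name
// from a caller-supplied process ID plus a random number, so that a
// repeated PID alone does not reproduce the same name.
inline std::string unique_name(const std::string& prefix, long pid)
{
  assert(!prefix.empty());
  std::random_device rd;  // non-deterministic source where available
  return prefix + '-' + std::to_string(pid) + '-' + std::to_string(rd());
}
```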

libstdc++-v3/ChangeLog:

	* testsuite/util/testsuite_fs.h (nonexistent_path): Add
	random number to the path.
2021-02-12 15:13:02 +00:00
Jonathan Wakely
d1a821b93c libstdc++: Include scope ID in net::internet::address_v6::to_string()
libstdc++-v3/ChangeLog:

	* include/experimental/internet (address_v6::to_string): Include
	scope ID in string.
	* testsuite/experimental/net/internet/address/v6/members.cc:
	Test to_string() results.
2021-02-12 15:08:29 +00:00
Jonathan Wakely
970ba71925 libstdc++: Fix errors in <experimental/internet>
libstdc++-v3/ChangeLog:

	* include/experimental/internet (address_v6::any): Avoid using
	memcpy in constexpr function.
	(address_v6::loopback): Likewise.
	(make_address_v6): Fix missing return statements on error paths.
	* include/experimental/io_context: Avoid -Wdangling-else
	warning.
	* testsuite/experimental/net/internet/address/v4/members.cc:
	Remove unused variables.
	* testsuite/experimental/net/internet/address/v6/members.cc:
	New test.
2021-02-12 14:30:14 +00:00
Jonathan Wakely
87eaa3c525 libstdc++: Add unused attributes to shared_ptr functions
This avoids some warnings when building with -fno-rtti because the
function parameters are only used when RTTI is enabled.
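
A minimal sketch of the pattern, assuming hypothetical names
(make_shared_tag and is_make_shared_tag are illustrations, not the
libstdc++ declarations): the parameter is only read in the RTTI branch,
so it carries an unused attribute.

```cpp
#include <cassert>
#include <typeinfo>

struct make_shared_tag {};

// Hypothetical helper, not the libstdc++ function: the parameter is read
// only when RTTI is available, so it is marked unused to keep -fno-rtti
// builds free of unused-parameter warnings.
inline bool is_make_shared_tag([[gnu::unused]] const std::type_info& ti)
{
#ifdef __cpp_rtti
  return ti == typeid(make_shared_tag);
#else
  return false;
#endif
}
```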

libstdc++-v3/ChangeLog:

	* include/bits/shared_ptr_base.h (__shared_ptr::_M_get_deleter):
	Add unused attribute to parameter.
	* src/c++11/shared_ptr.cc (_Sp_make_shared_tag::_S_eq):
	Likewise.
2021-02-12 14:30:14 +00:00
Jonathan Wakely
c4ece1d96a libstdc++: XFAIL tests that depend on RTTI
The std::emit_on_flush manipulator depends on dynamic_cast, so fails
without RTTI.

The std::async code can't catch a forced_unwind exception when RTTI is
disabled, so it can't rethrow it either, and the test aborts.

libstdc++-v3/ChangeLog:

	* testsuite/27_io/basic_ostream/emit/1.cc: Expect test to fail
	if -fno-rtti is used.
	* testsuite/30_threads/async/forced_unwind.cc: Expect test
	to abort if -fno-rtti is used.
2021-02-12 14:30:13 +00:00
Jonathan Wakely
0bd242ec5a libstdc++: Make test memory_resource work without exceptions and RTTI
libstdc++-v3/ChangeLog:

	* testsuite/util/testsuite_allocator.h (memory_resource):
	Remove requirement for RTTI and exceptions to be enabled.
2021-02-12 14:30:13 +00:00
Jonathan Wakely
e9c3105211 libstdc++: Only use dynamic_cast in tests when RTTI is enabled
libstdc++-v3/ChangeLog:

	* testsuite/27_io/basic_istringstream/rdbuf/char/2832.cc: Use
	static_cast when RTTI is disabled.
	* testsuite/27_io/basic_istringstream/rdbuf/wchar_t/2832.cc:
	Likewise.
	* testsuite/27_io/basic_ostringstream/rdbuf/char/2832.cc:
	Likewise.
	* testsuite/27_io/basic_ostringstream/rdbuf/wchar_t/2832.cc:
	Likewise.
	* testsuite/27_io/basic_stringstream/str/char/2.cc:
	Likewise.
	* testsuite/27_io/basic_stringstream/str/wchar_t/2.cc:
	Likewise.
2021-02-12 14:30:13 +00:00
Jonathan Wakely
14b554c462 libstdc++: Fix errors when syncbuf is used without RTTI
libstdc++-v3/ChangeLog:

	* include/std/ostream (__syncbuf_base::_S_get): Mark parameter
	as unused and only use dynamic_cast when RTTI is enabled.
2021-02-12 14:30:12 +00:00
Jonathan Wakely
4591f7e532 libstdc++: Fix bootstrap with -fno-rtti [PR 99077]
When libstdc++ is built without RTTI the __ios_failure type is just an
alias for std::ios_failure, so trying to construct it from an int won't
compile. This changes the RTTI-enabled __ios_failure type to have the
same constructor parameters as std::ios_failure, so that the constructor
takes the same arguments whether RTTI is enabled or not.

The __throw_ios_failure function now constructs the error_code, instead
of the __ios_failure constructor. As a drive-by fix that error_code is
constructed with std::generic_category() not std::system_category(),
because the int comes from errno which corresponds to the generic
category.
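
A minimal sketch of why the generic category fits an errno value,
assuming a hypothetical helper named make_errno_code (not the internal
__throw_ios_failure code): an error_code built this way compares equal
to the matching std::errc enumerator.

```cpp
#include <cassert>
#include <cerrno>
#include <system_error>

// Hypothetical helper, not the libstdc++ internal: wrap an errno value in
// an error_code using the generic (POSIX errno) category.
inline std::error_code make_errno_code(int err)
{
  assert(err >= 0);
  return std::error_code(err, std::generic_category());
}
```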

libstdc++-v3/ChangeLog:

	PR libstdc++/99077
	* src/c++11/cxx11-ios_failure.cc (__ios_failure(const char*, int)):
	Change int parameter to error_code, to match std::ios_failure.
	(__throw_ios_failure(const char*, int)): Construct error_code
	from int parameter.
2021-02-12 14:30:12 +00:00
Christophe Lyon
71b8ed7c61 testsuite, arm: Add -mthumb to pr98931.c [PR target/98931]
This test forces -march=armv8.1-m.main, which supports only Thumb mode.
However, if the toolchain is not configured --with-thumb, the test
fails with:
error: target CPU does not support ARM mode

Adding -mthumb to dg-options fixes the problem.

2021-02-12  Christophe Lyon  <christophe.lyon@linaro.org>

	PR target/98931
	gcc/testsuite/
	* gcc.target/arm/pr98931.c: Add -mthumb
2021-02-12 14:19:24 +00:00
Arnaud Charlet
3fbf81a252 [Ada] Remove unused subprograms (continued)
gcc/ada/

	* repinfo.ads, repinfo.adb (*SO_Ref*): Restore.
2021-02-12 08:53:41 -05:00
Tobias Burnus
f699e0b165 Fortran: Fix rank of assumed-rank array [PR99043]
gcc/fortran/ChangeLog:

	PR fortran/99043
	* trans-expr.c (gfc_conv_procedure_call): Don't reset
	rank of assumed-rank array.

gcc/testsuite/ChangeLog:

	PR fortran/99043
	* gfortran.dg/assumed_rank_20.f90: New test.
2021-02-12 14:43:41 +01:00
Richard Biener
6cc886bf42 middle-end/38474 - fix alias walk budget accounting in IPA analysis
The walk_aliased_vdef calls do not update the walking budget until
it is hit by a single call (and then in one case it resumes with
no limit at all).  The following rectifies this in multiple places.
It also makes the updates more consistent and fixes
determine_known_aggregate_parts to account its own alias queries.

2021-02-12  Richard Biener  <rguenther@suse.de>

	PR middle-end/38474
	* ipa-fnsummary.c (unmodified_parm_1): Only walk when
	fbi->aa_walk_budget is bigger than zero.  Update
	fbi->aa_walk_budget.
	(param_change_prob): Likewise.
	* ipa-prop.c (detect_type_change_from_memory_writes):
	Properly account walk_aliased_vdefs.
	(parm_preserved_before_stmt_p): Canonicalize updates.
	(parm_ref_data_preserved_p): Likewise.
	(parm_ref_data_pass_through_p): Likewise.
	(determine_known_aggregate_parts): Account own alias queries.
2021-02-12 12:34:28 +01:00
Martin Liska
bc6087c575 Fix producer string memory leaks
gcc/ChangeLog:

	* opts-common.c (decode_cmdline_option): Release werror_arg.
	* opts.c (gen_producer_string): Release output of
	gen_command_line_string.
2021-02-12 10:25:06 +01:00
Jakub Jelinek
cf059e1c09 c++: Fix endless errors on invalid requirement seq [PR97742]
As the testcase shows, if we reach CPP_EOF during parsing of requirement
sequence, we end up with endless loop where we always report invalid
requirement expression, don't consume any token (as we are at eof) and
repeat.

This patch stops the loop when we reach CPP_EOF.
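
A minimal sketch of the loop shape, assuming a hypothetical token type
and function (this is not the cp_parser code, and reporting one error at
end of input is an assumption of the sketch): an invalid token is
reported and consumed, but end of input cannot be consumed, so it must
end the loop.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class token { requirement, invalid, close_brace, eof };

// Hypothetical sketch, not the cp_parser code: parse requirements until the
// closing brace and return the number of errors reported.
inline int parse_requirement_seq(const std::vector<token>& toks)
{
  int errors = 0;
  std::size_t pos = 0;
  for (;;)
    {
      assert(pos <= toks.size());
      // Running off the end of the vector is treated as eof.
      token t = pos < toks.size() ? toks[pos] : token::eof;
      if (t == token::close_brace)
        break;
      if (t == token::eof)
        {
          // eof cannot be consumed; report once and stop instead of
          // reporting the same error forever.
          ++errors;
          break;
        }
      if (t == token::invalid)
        ++errors;
      ++pos;  // consume the token and continue
    }
  return errors;
}
```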

2021-02-12  Jakub Jelinek  <jakub@redhat.com>

	PR c++/97742
	* parser.c (cp_parser_requirement_seq): Stop iterating after reaching
	CPP_EOF.

	* g++.dg/cpp2a/concepts-requires24.C: New test.
2021-02-12 09:58:25 +01:00
Richard Biener
95d94b52ea tree-optimization/38474 - fix store-merging compile-time regression
The following puts a limit on the number of alias tests we do in
terminate_all_aliasing_chains which is quadratic in the number of
overall stores currently tracked.  There is already a limit in
place on the maximum number of stores in a single chain so the
following adds a limit on the number of chains tracked.  The
worst number of overall stores tracked from the defaults (64 and 64)
is then 4096 which when imposed as the sole limit for the testcase
still causes

 store merging                      :  71.65 ( 56%)

because the testcase is somewhat degenerate with most chains
consisting only of a single store (and 25% of exactly three stores).
The single stores are all CLOBBERs at the point variables go out of
scope.  Note unpatched we have

 store merging                      : 308.60 ( 84%)

Limiting the number of chains to 64 brings this down to

 store merging                      :   1.52 (  3%)

which is more reasonable.  There are ideas on how to make
terminate_all_aliasing_chains cheaper but for this degenerate case
they would not have any effect so I'll defer those to GCC 12.

I'm not sure we want to have both --params; just keeping the
more to-the-point max-stores-to-track works but makes the
degenerate case above slower.
I made the current default 1024 which for the testcase
(without limiting chains) results in 25% compile time and 20s,
putting it in the same ballpark as the next offender (which is PTA).

This is a regression on trunk and the GCC 10 branch btw.
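
A minimal sketch of the two limits described above, assuming a
hypothetical chain_tracker class (not pass_store_merging itself): the
oldest chains are terminated whenever the number of chains or the total
number of tracked stores exceeds its limit.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical tracker, not pass_store_merging: chains are kept in creation
// order and the oldest are dropped once either limit is exceeded.
class chain_tracker
{
public:
  chain_tracker(std::size_t max_chains, std::size_t max_stores)
    : m_max_chains(max_chains), m_max_stores(max_stores), m_n_stores(0) {}

  // Start a new chain holding a single store.
  void start_chain()
  {
    m_chains.push_back(1);
    ++m_n_stores;
    enforce_limits();
  }

  // Add one store to the newest chain.
  void add_store()
  {
    assert(!m_chains.empty());
    ++m_chains.back();
    ++m_n_stores;
    enforce_limits();
  }

  std::size_t n_chains() const { return m_chains.size(); }
  std::size_t n_stores() const { return m_n_stores; }

private:
  // Terminate oldest chains until both limits hold again.
  void enforce_limits()
  {
    while (m_chains.size() > m_max_chains || m_n_stores > m_max_stores)
      {
        m_n_stores -= m_chains.front();
        m_chains.pop_front();
      }
  }

  std::deque<std::size_t> m_chains;  // store count per chain, oldest first
  std::size_t m_max_chains;
  std::size_t m_max_stores;
  std::size_t m_n_stores;            // total stores across all chains
};
```

The sketch leaves out the existing per-chain limit; with that limit and
the chain limit both at 64, the worst case is 64 * 64 = 4096 tracked
stores, as stated above.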

2021-02-11  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/38474
	* params.opt (-param=max-store-chains-to-track=): New param.
	(-param=max-stores-to-track=): Likewise.
	* doc/invoke.texi (max-store-chains-to-track): Document.
	(max-stores-to-track): Likewise.
	* gimple-ssa-store-merging.c (pass_store_merging::m_n_chains):
	New.
	(pass_store_merging::m_n_stores): Likewise.
	(pass_store_merging::terminate_and_process_chain): Update
	m_n_stores and m_n_chains.
	(pass_store_merging::process_store): Likewise.   Terminate
	oldest chains if the number of stores or chains get too large.
	(imm_store_chain_info::terminate_and_process_chain): Dump
	chain length.
2021-02-12 09:38:52 +01:00
Jason Merrill
ac001ddd0c c++: variadic lambda template and empty pack [PR97246]
In get<0>, Is is empty, so the first parameter pack of the lambda is empty,
but after the fix for PR94546 we were wrongly associating it with the
partial instantiation of 'v'.

gcc/cp/ChangeLog:

	PR c++/97246
	PR c++/94546
	* pt.c (extract_fnparm_pack): Check DECL_PACK_P here.
	(register_parameter_specializations): Not here.

gcc/testsuite/ChangeLog:

	PR c++/97246
	* g++.dg/cpp2a/lambda-generic-variadic21.C: New test.
2021-02-11 21:30:24 -05:00
Ian Lance Taylor
3e2f329e94 libbacktrace: check for objcopy --add-gnu-debuglink using --help
* configure.ac: Check for objcopy --add-gnu-debuglink by using
	objcopy --help.
	* configure: Regenerate
2021-02-11 18:10:25 -08:00
David Malcolm
467a482052 analyzer: fix ICE in print_mem_ref [PR98969]
PR analyzer/98969 and PR analyzer/99064 describe ICEs, in both cases
within print_mem_ref, when falsely reporting memory leaks - though it
is possible to generate the ICE on other diagnostics (which I added
in one of the test cases).

This patch fixes the ICE, leaving the fix for the leak false positives
as followup work.

The analyzer uses region_model::get_representative_path_var and
region_model::get_representative_tree to map back from its svalue
and region classes to the tree type used by the rest of the compiler,
and, in particular, for diagnostics.

The root cause of the ICE is sloppiness about types within those
functions; specifically when casts were stripped off svalues.  To
track these down I added wrapper functions that verify that the
types of the results are correct, and in doing so found various
other type-safety issues, which the patch also fixes.

Doing so led to various changes in diagnostics messages due to
more accurate types, but I felt that these changes weren't
desirable.
For example, the warning at CVE-2005-1689-minimal.c line 48
which expects:
  double-'free' of 'inbuf.data'
changed to
  double-'free' of '(char *)inbuf.data'

So I added stripping of top-level casts where necessary to avoid
cluttering diagnostics.

Finally, the more accurate types led to worse results from
readability_comparator, where e.g. the event message at line 50
of sensitive-1.c regressed from the precise:
  passing sensitive value 'password' in call to 'called_by_test_5' from 'test_5'
to the vaguer:
  calling 'called_by_test_5' from 'test_5'
This was due to erroneously picking the initial value of "password"
in the caller frame as the best value within the *callee* frame, due to
"char *" vs "const char *", which confuses the logic for tracking values
that pass along callgraph edges.  The patch fixes this by combining the
readability tests for tree and stack depth, rather than performing
them in sequence, so that it favors the value in the deepest frame.

As noted above, the patch fixes the ICEs, but does not fix the
leak false positives.

gcc/analyzer/ChangeLog:
	PR analyzer/98969
	* engine.cc (readability): Add names for the various arbitrary
	values.  Handle NOP_EXPR and INTEGER_CST.
	(readability_comparator): Combine the readability tests for
	tree and stack depth, rather than performing them sequentially.
	(impl_region_model_context::on_state_leak): Strip off top-level
	casts.
	* region-model.cc (region_model::get_representative_path_var): Add
	type-checking, moving the bulk of the implementation to...
	(region_model::get_representative_path_var_1): ...here.  Respect
	types in casts by recursing and re-adding the cast, rather than
	merely stripping them off.  Use the correct type when handling
	region_svalue.
	(region_model::get_representative_tree): Strip off any top-level
	cast.
	(region_model::get_representative_path_var): Add type-checking,
	moving the bulk of the implementation to...
	(region_model::get_representative_path_var_1): ...here.
	* region-model.h (region_model::get_representative_path_var_1):
	New decl
	(region_model::get_representative_path_var_1): New decl.
	* store.cc (append_pathvar_with_type): New.
	(binding_cluster::get_representative_path_vars): Cast path_vars
	to the correct type when adding them to *OUT_PVS.

gcc/testsuite/ChangeLog:
	PR analyzer/98969
	* g++.dg/analyzer/pr99064.C: New test.
	* gcc.dg/analyzer/pr98969.c: New test.
2021-02-11 20:32:10 -05:00
GCC Administrator
0c5cdb31bd Daily bump. 2021-02-12 00:16:25 +00:00