This splits can_associate_p into checks for SSA defs and checks
for the type so it can be called from is_reassociable_op to
catch cases not caught by the earlier fix.
2021-05-11 Richard Biener <rguenther@suse.de>
PR tree-optimization/100519
* tree-ssa-reassoc.c (can_associate_p): Split into...
(can_associate_op_p): ... this
(can_associate_type_p): ... and this.
(is_reassociable_op): Call can_associate_op_p.
(break_up_subtract_bb): Call the appropriate predicates.
(reassociate_bb): Likewise.
* gcc.dg/torture/pr100519.c: New testcase.
Size_In_Slots uses the Nkind to look up the size in a table indexed
by Nkind. This patch fixes a couple of places where the Nkind is
wrong (uninitialized or zeroed out) so Size_In_Slots cannot be used.
gcc/ada/
PR ada/100564
* atree.adb (Change_Node): Do not call Zero_Slots on a Node_Id
when the Nkind has not yet been set; call the other Zero_Slots
that takes a range of slot offsets. Call the new Mutate_Nkind
that takes an Old_Size, for the same reason -- the size cannot
be computed without the Nkind.
(Mutate_Nkind): New function that allows specifying the Old_Size.
(Size_In_Slots): Assert that the Nkind has proper (nonzero) value.
* atree.ads: Minor reformatting.
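The constraint described above can be sketched in C as follows. This is only an illustration, not the Ada code in atree.adb: the enum, the table contents and the function names are hypothetical. It shows a size lookup that is valid only once the kind is set, next to a zeroing helper that takes an explicit slot range for callers that cannot use the lookup.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node kinds; kind 0 is reserved to mean "not yet set".  */
enum node_kind { KIND_UNSET = 0, KIND_IDENT, KIND_CALL, KIND_COUNT };

/* Hypothetical size table indexed by kind.  */
static const size_t size_table[KIND_COUNT] = { 0, 2, 5 };

/* Look up the size from the kind.  Only valid once the kind has a
   proper (nonzero) value, which the assertion enforces.  */
size_t
size_in_slots (enum node_kind kind)
{
  assert (kind > KIND_UNSET && kind < KIND_COUNT);
  return size_table[kind];
}

/* Zero the slots FIRST..LAST inclusive.  Callers that do not yet know
   the kind pass the range explicitly instead of using size_in_slots.  */
void
zero_slots (unsigned *slots, size_t first, size_t last)
{
  for (size_t i = first; i <= last; i++)
    slots[i] = 0;
}
```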
gcc/ChangeLog:
* lto-wrapper.c (print_lto_docs_link): New function.
(run_gcc): Print warning about missing job server detection
after we know NR of partitions. Do the same for -flto{,=1}.
* opts.c (get_option_html_page): Support -flto option.
In this testcase the compile unit consists of a single
text section with a single embedded DECL_IGNORED_P function.
So we have a kind of multi-range text section here.
To avoid an ICE in output_rnglists we need to make sure
that have_multiple_function_sections is set to true.
This is a regression from
e69ac02037 ("Add line debug info for virtual thunks")
2021-05-12 Bernd Edlinger <bernd.edlinger@hotmail.de>
PR debug/100515
* dwarf2out.c (dwarf2out_finish): Set
have_multiple_function_sections with multi-range text_section.
* gcc.dg/debug/dwarf2/pr100515.c: New testcase.
This makes the rtvec_alloc argument size_t, catching overflowed and
truncated arguments (from "invalid" testcases) by verifying the
argument against INT_MAX, which is the limit set by the int-typed
rtvec_def.num_elem member.
2021-05-12 Richard Biener <rguenther@suse.de>
PR middle-end/100547
* rtl.h (rtvec_alloc): Make argument size_t.
* rtl.c (rtvec_alloc): Verify the count is less than INT_MAX.
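A minimal sketch of the range check, in C. The names vec_def and vec_alloc_checked are hypothetical stand-ins, not GCC's rtvec_def or rtvec_alloc, and this sketch returns NULL on a bad count, which is an assumption made for illustration only. It shows why taking a size_t and comparing against INT_MAX catches counts that would otherwise be silently truncated when stored in an int member.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical vector header whose length field is an int, like the
   int-typed num_elem member mentioned above.  */
struct vec_def
{
  int num_elem;
  void *elem[];
};

/* Allocate a zeroed vector of N elements.  Returns NULL if N does not
   fit in the int length field, if the byte size would overflow size_t,
   or if the allocation fails.  */
struct vec_def *
vec_alloc_checked (size_t n)
{
  if (n > (size_t) INT_MAX)
    return NULL;
  if (n > (SIZE_MAX - sizeof (struct vec_def)) / sizeof (void *))
    return NULL;

  struct vec_def *v = calloc (1, sizeof *v + n * sizeof v->elem[0]);
  if (v != NULL)
    v->num_elem = (int) n;
  return v;
}
```

Had the parameter been an int, a caller passing a large unsigned count would have had it converted before the function could inspect it; with size_t the check sees the original value.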
The inliner doesn't remap DEBUG_EXPR_DECLs, so the same decls can appear
in multiple functions.
Furthermore, expansion reuses corresponding DEBUG_EXPRs too, so they again
can be reused in multiple functions.
Neither of these is a major problem; DEBUG_EXPRs are just magic value holders
and what value they stand for is independent in each function and driven by
what debug stmts or DEBUG_INSNs they are bound to.
Except for DEBUG_EXPR*s with vector types, TYPE_MODE can be either BLKmode
or some vector mode depending on whether current function's enabled ISAs
support that vector mode or not. On the following testcase, we expand it
first in foo function without AVX2 enabled and so the DEBUG_EXPR is
BLKmode, but later the same DEBUG_EXPR_DECL is used in a simd clone with
AVX2 enabled and expansion ICEs because of a mode mismatch.
The following patch fixes that by forcing recreation of a DEBUG_EXPR if
there is a mode mismatch for a vector-typed DEBUG_EXPR_DECL. DEBUG_EXPRs
will still be reused between functions otherwise, and within the same
function the mode should always be the same.
2021-05-12 Jakub Jelinek <jakub@redhat.com>
PR middle-end/100508
* cfgexpand.c (expand_debug_expr): For DEBUG_EXPR_DECL with vector
type, don't reuse DECL_RTL if it has different mode, instead force
creation of a new DEBUG_EXPR.
* gcc.dg/gomp/pr100508.c: New test.
> Somewhere in RTL (_M_value&1)==_M_value is turned into (_M_value&-2)==0,
> that could be worth doing already in GIMPLE.
Apparently it is
/* Simplify eq/ne (and/ior x y) x/y) for targets with a BICS instruction or
constant folding if x/y is a constant. */
if ((code == EQ || code == NE)
&& (op0code == AND || op0code == IOR)
&& !side_effects_p (op1)
&& op1 != CONST0_RTX (cmp_mode))
{
/* Both (eq/ne (and x y) x) and (eq/ne (ior x y) y) simplify to
(eq/ne (and (not y) x) 0). */
...
/* Both (eq/ne (and x y) y) and (eq/ne (ior x y) x) simplify to
(eq/ne (and (not x) y) 0). */
Yes, doing that on GIMPLE for the case where the not argument is constant
would simplify the phiopt follow-up (it would be single imm use then).
On Thu, May 06, 2021 at 09:42:41PM +0200, Marc Glisse wrote:
> We can probably do it in 2 steps, first something like
>
> (for cmp (eq ne)
> (simplify
> (cmp (bit_and:c @0 @1) @0)
> (cmp (bit_and @0 (bit_not! @1)) { build_zero_cst (TREE_TYPE (@0)); })))
>
> to get rid of the double use, and then simplify X&C==0 to X<=~C if C is a
> mask 111...000 (I thought we already had a function to detect such masks, or
> the 000...111, but I can't find them anymore).
Ok, here is the first step then.
2021-05-12 Jakub Jelinek <jakub@redhat.com>
Marc Glisse <marc.glisse@inria.fr>
PR tree-optimization/94589
* match.pd ((X & Y) == X -> (X & ~Y) == 0,
(X | Y) == Y -> (X & ~Y) == 0): New GIMPLE simplifications.
* gcc.dg/tree-ssa/pr94589-1.c: New test.
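The two identities named in the ChangeLog entry can be checked exhaustively on a small type. The following C sketch does that for all pairs of 8-bit values. The function names are hypothetical and the code is only a sanity check of the algebra, not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Original form: (X & Y) == X, where X is used twice.  */
static bool
and_eq_x (uint8_t x, uint8_t y)
{
  return (x & y) == x;
}

/* Original form: (X | Y) == Y, where Y is used twice.  */
static bool
ior_eq_y (uint8_t x, uint8_t y)
{
  return (x | y) == y;
}

/* Rewritten form: (X & ~Y) == 0, where each operand is used once.  */
static bool
andnot_eq_zero (uint8_t x, uint8_t y)
{
  return (x & (uint8_t) ~y) == 0;
}

/* Compare the three forms over every pair of 8-bit values.  Returns
   true if they agree everywhere.  */
bool
forms_agree (void)
{
  for (unsigned x = 0; x < 256; x++)
    for (unsigned y = 0; y < 256; y++)
      {
        bool rewritten = andnot_eq_zero ((uint8_t) x, (uint8_t) y);
        if (and_eq_x ((uint8_t) x, (uint8_t) y) != rewritten
            || ior_eq_y ((uint8_t) x, (uint8_t) y) != rewritten)
          return false;
      }
  return true;
}
```

All three forms test whether the bits set in X are a subset of the bits set in Y, which is why they agree.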
C2X adds #elifdef and #elifndef preprocessor directives; these have
also been proposed for C++. Implement these directives in libcpp
accordingly.
In this implementation, #elifdef and #elifndef are treated as
non-directives for any language version other than c2x and gnu2x (if
the feature is accepted for C++, it can trivially be enabled for
relevant C++ versions). In strict conformance modes for prior
language versions, this is required, as illustrated by the
c11-elifdef-1.c test added.
Bootstrapped with no regressions for x86_64-pc-linux-gnu.
libcpp/
* include/cpplib.h (struct cpp_options): Add elifdef.
* init.c (struct lang_flags): Add elifdef.
(lang_defaults): Update to include elifdef initializers.
(cpp_set_lang): Set elifdef for pfile based on language.
* directives.c (STDC2X, ELIFDEF): New macros.
(EXTENSION): Increase value to 3.
(DIRECTIVE_TABLE): Add #elifdef and #elifndef.
(_cpp_handle_directive): Do not treat ELIFDEF directives as
directives for language versions without the #elifdef feature.
(do_elif): Handle #elifdef and #elifndef.
(do_elifdef, do_elifndef): New functions.
gcc/testsuite/
* gcc.dg/cpp/c11-elifdef-1.c, gcc.dg/cpp/c2x-elifdef-1.c,
gcc.dg/cpp/c2x-elifdef-2.c: New tests.
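For readers unfamiliar with the new directives, the sketch below shows the older spelling that they abbreviate. The macro names are hypothetical. The compiled code deliberately uses "#elif defined" so that it builds in any language mode; the comment gives the C2X spelling of the same branch.

```c
/* Hypothetical feature macros, chosen only for this illustration.  */
#define HAVE_BACKEND_B 1

/* Pre-C2X spelling.  In C2X the second branch below could instead be
   written "#elifdef HAVE_BACKEND_B"; the negated test
   "#elif !defined (M)" likewise becomes "#elifndef M".  */
#ifdef HAVE_BACKEND_A
# define BACKEND_NAME "A"
#elif defined (HAVE_BACKEND_B)
# define BACKEND_NAME "B"
#else
# define BACKEND_NAME "none"
#endif

/* Return the name selected by the conditional group above.  */
const char *
backend_name (void)
{
  return BACKEND_NAME;
}
```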
Resolves:
PR middle-end/21433 - The COMPONENT_REF case of expand_expr_real_1 is probably wrong
gcc/ChangeLog:
PR middle-end/21433
* expr.c (expand_expr_real_1): Replace unreachable code with an assert.
The libcpp function cpp_avoid_paste is used to insert whitespace in
preprocessed output where needed to avoid two consecutive
preprocessing tokens, that logically (e.g. when stringized) do not
have whitespace between them, from being incorrectly lexed as one when
the preprocessed input is reread by a compiler.
This fails to allow for digit separators, meaning that invalid
code, that has a CPP_NUMBER (from a macro expansion) followed by a
character literal, can result in preprocessed output with a valid use
of digit separators, so that required syntax errors do not occur when
compiling with -save-temps. Fix this by handling that case in
cpp_avoid_paste (as with other cases in cpp_avoid_paste, this doesn't
try to check whether the language version in use supports digit
separators; it's always OK to have unnecessary whitespace in
preprocessed output).
Note: there are other cases, with various kinds of wide character or
string literal following a CPP_NUMBER, where spurious pasting of
preprocessing tokens can occur but the sequence of tokens remains
invalid both before and after that pasting. Maybe cpp_avoid_paste
should also handle those cases (and similar cases after a CPP_NAME),
to ensure the sequence of preprocessing tokens in preprocessed output
is exactly right, whether or not it affects whether syntax errors
occur. This patch only addresses the case with digit separators where
invalid code can fail to be diagnosed without the space inserted.
Bootstrapped with no regressions for x86_64-pc-linux-gnu.
libcpp/
* lex.c (cpp_avoid_paste): Do not allow pasting CPP_NUMBER with
CPP_CHAR.
gcc/testsuite/
* g++.dg/cpp1y/digit-sep-paste.C, gcc.dg/c2x-digit-separators-3.c:
New tests.
The type of the output operands *p and *q of the extended asm statement
of function foo is unsigned long whereas the type of the corresponding
input operands is int. On IBM Z, for example, this means that
the immediates 2 and 3 are written into registers in SI mode and read in
DI mode, resulting in wrong values. Fixed by lifting the input operands
to type long.
gcc/testsuite/ChangeLog:
* gcc.dg/guality/pr43077-1.c: Align types of output and input
operands by lifting immediates to type long.
floating_to_chars.cc includes the Ryu sources into an anonymous
namespace as a convenient way to give all its symbols internal linkage.
But an entity declared extern "C" always has external linkage even
from within an anonymous namespace, so this trick doesn't work in the
presence of extern "C", and it causes the Ryu function generic_to_chars
to be visible from libstdc++.a.
This patch removes the only use of extern "C" from our local copy of
Ryu along with some declarations for never-defined functions that GCC
now warns about.
libstdc++-v3/ChangeLog:
* src/c++17/ryu/LOCAL_PATCHES: Update.
* src/c++17/ryu/ryu_generic_128.h: Remove extern "C".
Remove declarations for never-defined functions.
* testsuite/20_util/to_chars/4.cc: New test.
The header synopsis test fails to define NOTHROW for C++98.
The shared_ptr test should be skipped for C++98.
The debug mode one should work for C++98 too, it just needs to avoid
C++11 syntax that isn't valid in C++98.
libstdc++-v3/ChangeLog:
* testsuite/20_util/headers/memory/synopsis.cc: Define C++98
alternative for macro.
* testsuite/20_util/shared_ptr/creation/99006.cc: Add effective
target keyword.
* testsuite/25_algorithms/copy/debug/99402.cc: Avoid C++11
syntax.
The changes in 75c6a925da were slightly
incorrect, because the converting constructor should be noexcept, and
the POCMA and is_always_equal traits should still be present in C++20.
This fixes it, and slightly refactors the preprocessor conditions and
order of members. Also add comments explaining things.
The non-standard construct and destroy members added for PR 78052 can be
private if allocator_traits<allocator<void>> is made a friend.
libstdc++-v3/ChangeLog:
* include/bits/allocator.h (allocator<void>) [C++20]: Add
missing noexcept to constructor. Restore missing POCMA and
is_always_equal traits.
[C++17]: Make construct and destroy members private and
declare allocator_traits as a friend.
* include/bits/memoryfwd.h (allocator_traits): Declare.
* include/ext/malloc_allocator.h (malloc_allocator::allocate):
Add nodiscard attribute. Add static assertion for LWG 3307.
* include/ext/new_allocator.h (new_allocator::allocate): Add
static assertion for LWG 3307.
* testsuite/20_util/allocator/void.cc: Check that converting
constructor is noexcept. Check for propagation traits and
size_type and difference_type. Check that pointer and
const_pointer are gone in C++20.
C2X adds digit separators, as in C++. Enable them accordingly in
libcpp and c-lex.c. Some basic tests are added that digit separators
behave as expected for C2X and are properly disabled for C11; further
test coverage is included in the existing g++.dg/cpp1y/digit-sep*.C
tests.
Bootstrapped with no regressions for x86_64-pc-linux-gnu.
gcc/c-family/
* c-lex.c (interpret_float): Handle digit separators for C2X.
libcpp/
* init.c (lang_defaults): Enable digit separators for GNUC2X and
STDC2X.
gcc/testsuite/
* gcc.dg/c11-digit-separators-1.c,
gcc.dg/c2x-digit-separators-1.c, gcc.dg/c2x-digit-separators-2.c:
New tests.
My recent change to reject calling rvalue() with an argument of class type
crashes on this testcase, where we use rvalue() on what we expect to be an
argument of integer or vector type. Fixed by checking first.
gcc/cp/ChangeLog:
PR c++/100517
* typeck.c (build_reinterpret_cast_1): Check intype on
cast to vector.
gcc/testsuite/ChangeLog:
PR c++/100517
* g++.dg/ext/vector41.C: New test.
This removes stale users of maybe_fold_reference where IL constraints
make it never do anything.
2021-05-11 Richard Biener <rguenther@suse.de>
* gimple-fold.c (gimple_fold_call): Do not call
maybe_fold_reference on call arguments or the static chain.
(fold_stmt_1): Do not call maybe_fold_reference on GIMPLE_ASM
inputs.
This fixes unintended clobbering of SSA_NAME_DEF_STMT of the
cloned/inlined-from SSA name during IPA parameter manipulation
of call stmt LHSs. gimple_call_set_lhs adjusts SSA_NAME_DEF_STMT
of the lhs to the stmt being modified, but when
ipa_param_body_adjustments::modify_call_stmt is called the
cloning/inlining process has not yet remapped the stmt's operands
to the copy variants; they are still the originals.
2021-05-11 Richard Biener <rguenther@suse.de>
PR ipa/100513
* ipa-param-manipulation.c
(ipa_param_body_adjustments::modify_call_stmt): Avoid
altering SSA_NAME_DEF_STMT by adjusting the call's LHS
via gimple_call_lhs_ptr.
The PR shows us attaching REG_CFA_ADJUST_CFA notes to stack pointer
adjustments emitted in cmse_nonsecure_call_inline_register_clear (when
-march=armv8.1-m.main). However, the stack pointer is not guaranteed to
be the CFA reg. If we're at -O0 or we have -fno-omit-frame-pointer, then
the frame pointer will be used as the CFA reg, and these notes on the sp
adjustments will lead to ICEs in dwarf2out_frame_debug_adjust_cfa.
This patch avoids emitting these notes if the current function has a
frame pointer.
gcc/ChangeLog:
PR target/99725
* config/arm/arm.c (cmse_nonsecure_call_inline_register_clear):
Avoid emitting CFA adjusts on the sp if we have the fp.
gcc/testsuite/ChangeLog:
PR target/99725
* gcc.target/arm/cmse/pr99725.c: New test.
This patch removes the duplication between the mul_laneq<mode>3
and the older mul-lane patterns. The older patterns were previously
divided into two based on whether the indexed operand had the same mode
as the other operands or whether it had the opposite length from the
other operands (64-bit vs. 128-bit). However, it seemed easier to
divide them instead based on whether the indexed operand was 64-bit or
128-bit, since that maps directly to the arm_neon.h “q” conventions.
Also, it looks like the older patterns were missing cases for
V8HF<->V4HF combinations, which meant that vmul_laneq_f16 and
vmulq_lane_f16 didn't produce single instructions.
There was a typo in the V2SF entry for VCONQ, but in practice
no patterns were using that entry until now.
The test passes for both endiannesses, but endianness does change
the mapping between regexps and functions.
gcc/
* config/aarch64/iterators.md (VMUL_CHANGE_NLANES): Delete.
(VMULD): New iterator.
(VCOND): Handle V4HF and V8HF.
(VCONQ): Fix entry for V2SF.
* config/aarch64/aarch64-simd.md (mul_lane<mode>3): Use VMULD
instead of VMUL. Use a 64-bit vector mode for the indexed operand.
(*aarch64_mul3_elt_<vswap_width_name><mode>): Merge with...
(mul_laneq<mode>3): ...this define_insn. Use VMUL instead of VDQSF.
Use a 128-bit vector mode for the indexed operand. Use stype for
the scheduling type.
gcc/testsuite/
* gcc.target/aarch64/fmul_lane_1.c: New test.
This adjusts maybe_fold_reference to adhere to its desired behavior
of performing constant folding and thus explicitly avoid returning
unfolded reference trees.
2021-05-11 Richard Biener <rguenther@suse.de>
* gimple-fold.c (maybe_fold_reference): Only return
is_gimple_min_invariant values.
When folding a constant initializer, looking through aliases to
incompatible types can lead to us trying to fold a constant
to an aggregate type which can't work. Simply avoid trying
to constant fold non-register typed symbols.
2021-05-11 Richard Biener <rguenther@suse.de>
PR middle-end/100509
* gimple-fold.c (fold_gimple_assign): Only call
get_symbol_constant_value on register type symbols.
* gcc.dg/pr100509.c: New testcase.
Instead of selecting bits 62 to (wraparound) 59 from r2 and inserting them
into r3, we select bits 60 to 62 from r3 and insert them into r2
nowadays. Adjust the test accordingly.
gcc/testsuite/ChangeLog:
* gcc.target/s390/risbg-ll-3.c: Change match pattern.
When a taskloop doesn't have any iterations, GOMP_taskloop* takes an early
return, doesn't create any tasks and more importantly, doesn't create
a taskgroup and doesn't register task reductions. But, the code emitted
in the callers assumes task reductions have been registered and performs
the reduction handling and task reduction unregistration. The pointer
to the task reduction private variables is reused: on input it is the alignment
and only on output it is the pointer, so for a taskloop with no iterations
the caller attempts to dereference the alignment value as if it were a pointer
and crashes. We could in the early returns register the task reductions
only to have them looped over and unregistered in the caller, but I think
it is better to tell the caller there is nothing to task reduce and bypass
all that.
2021-05-11 Jakub Jelinek <jakub@redhat.com>
PR middle-end/100471
* omp-low.c (lower_omp_task_reductions): For OMP_TASKLOOP, if data
is 0, bypass the reduction loop including
GOMP_taskgroup_reduction_unregister call.
* taskloop.c (GOMP_taskloop): If GOMP_TASK_FLAG_REDUCTION and not
GOMP_TASK_FLAG_NOGROUP, when doing early return clear the task
reduction pointer.
* testsuite/libgomp.c/task-reduction-4.c: New test.
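The shape of code affected is a taskloop with a reduction clause whose loop may run zero times. The C sketch below is a hypothetical illustration of that shape, not the contents of the new testcase. When built without OpenMP support the pragmas are ignored and the loop simply runs serially.

```c
/* Sum the integers 0 .. n-1 using a taskloop with a task reduction.
   For n <= 0 the loop has no iterations, which is the case described
   above; the result must then be 0.  */
int
sum_below (int n)
{
  int sum = 0;
#pragma omp parallel
#pragma omp single
#pragma omp taskloop reduction (+: sum)
  for (int i = 0; i < n; i++)
    sum += i;
  return sum;
}
```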
This patch teaches rs6000_density_test to only care about the vector
version cost calculation and to return early when calculating the single
scalar iteration cost.
Bootstrapped/regtested on powerpc64le-linux-gnu P9.
gcc/ChangeLog:
* config/rs6000/rs6000.c (struct rs6000_cost_data): New member
costing_for_scalar.
(rs6000_density_test): Early return if costing_for_scalar is true.
(rs6000_init_cost): Init costing_for_scalar of rs6000_cost_data.
The rs6000 port function rs6000_density_test wants to distinguish whether
the current cost model is for the scalar version of a loop or block, or
the vector version. As Richi suggested, this patch introduces one
new parameter costing_for_scalar to the init_cost hook to pass down this
information explicitly.
gcc/ChangeLog:
* doc/tm.texi: Regenerated.
* target.def (init_cost): Add new parameter costing_for_scalar.
* targhooks.c (default_init_cost): Adjust for new parameter.
* targhooks.h (default_init_cost): Likewise.
* tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Likewise.
(vect_compute_single_scalar_iteration_cost): Likewise.
(vect_analyze_loop_2): Likewise.
* tree-vect-slp.c (_bb_vec_info::_bb_vec_info): Likewise.
(vect_bb_vectorization_profitable_p): Likewise.
* tree-vectorizer.h (init_cost): Likewise.
* config/aarch64/aarch64.c (aarch64_init_cost): Likewise.
* config/i386/i386.c (ix86_init_cost): Likewise.
* config/rs6000/rs6000.c (rs6000_init_cost): Likewise.
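The pattern of passing the flag at initialization and returning early on it can be sketched in C as below. Everything here is hypothetical: the struct layout, the sketch_ function names and the 10% penalty are invented for illustration and do not reflect the real hook signature or the rs6000 heuristics.

```c
#include <stdbool.h>

/* Hypothetical per-costing state; the real target cost data holds more.  */
struct sketch_cost_data
{
  unsigned cost;
  bool costing_for_scalar;
};

/* Record at initialization time whether this costing run is for the
   scalar version, so later steps need not guess.  */
struct sketch_cost_data
sketch_init_cost (bool costing_for_scalar)
{
  struct sketch_cost_data data = { 0, costing_for_scalar };
  return data;
}

/* Apply a vector-only adjustment.  Returns early, leaving the cost
   untouched, when the run is costing the scalar version.  */
void
sketch_density_test (struct sketch_cost_data *data)
{
  if (data->costing_for_scalar)
    return;
  /* Invented penalty, used only to make the effect observable.  */
  data->cost += data->cost / 10;
}
```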
This patch moves rs6000_vect_nonmem (target cost_data
related information) into the target cost_data struct.
As Richi pointed out, we can gather data from add_stmt_cost
invocations. This is one preparatory step toward centralizing
target cost_data related stuff.
gcc/ChangeLog:
* config/rs6000/rs6000.c (rs6000_vect_nonmem): Renamed to
vect_nonmem and moved into...
(struct rs6000_cost_data): ...here.
(rs6000_init_cost): Use vect_nonmem of cost_data instead.
(rs6000_add_stmt_cost): Likewise.
(rs6000_finish_cost): Likewise.
This unconditionally enables the maybe_save_operator_binding mechanism
for all function templates, so that when resolving a dependent operator
expression from a function template we ignore later-declared
namespace-scope bindings that weren't visible at template definition
time. This patch additionally makes the mechanism apply to dependent
comma and compound-assignment operator expressions.
Note that this doesn't fix the testcases in PR83035 or PR99692 because
there the dependent operator expressions aren't at function scope. I'm
not sure how to adapt this mechanism for these testcases, since although
we'll in both testcases have a TEMPLATE_DECL to associate the lookup
result with, at instantiation time we won't have an appropriate binding
level to push to.
gcc/cp/ChangeLog:
PR c++/51577
* name-lookup.c (maybe_save_operator_binding): Unconditionally
enable for all function templates, not just generic lambdas.
Handle compound-assignment operator expressions.
* typeck.c (build_x_compound_expr): Call maybe_save_operator_binding
in the type-dependent case.
(build_x_modify_expr): Likewise. Move declaration of 'op' closer
to its first use.
gcc/testsuite/ChangeLog:
PR c++/51577
* g++.dg/lookup/operator-3.C: New test.
This PR is about CTAD but the underlying problems are more general;
CTAD is a good trigger for them because of the necessary substitution
into constraints that deduction guide generation entails.
In the testcase below, when generating the implicit deduction guide for
the constrained constructor template for A, we substitute the generic
flattening map 'tsubst_args' into the constructor's constraints. During
this substitution, tsubst_pack_expansion returns a rebuilt pack
expansion for sizeof...(xs), but doesn't carry over the
PACK_EXPANSION_LOCAL_P (and PACK_EXPANSION_SIZEOF_P) flag from the
original tree to the rebuilt one. The flag is otherwise unset on the
original tree but gets set for the rebuilt tree from make_pack_expansion
since at_function_scope_p() is true (we're inside main). This leads to
a crash during satisfaction when substituting into the pack expansion
because we don't have local_specializations set up (and it'd be set up
for us if PACK_EXPANSION_LOCAL_P is unset).
Similarly, tsubst_constraint needs to set cp_unevaluated so that the
substitution performed therein doesn't rely on local_specializations.
This avoids a crash during CTAD for C below.
gcc/cp/ChangeLog:
PR c++/100138
* constraint.cc (tsubst_constraint): Set up cp_unevaluated.
(satisfy_atom): Set up iloc_sentinel before calling
cxx_constant_value.
* pt.c (tsubst_pack_expansion): When returning a rebuilt pack
expansion, carry over PACK_EXPANSION_LOCAL_P and
PACK_EXPANSION_SIZEOF_P from the original pack expansion.
gcc/testsuite/ChangeLog:
PR c++/100138
* g++.dg/cpp2a/concepts-ctad4.C: New test.
gcc/ada/
PR bootstrap/100506
* Make-generated.in: Replace version.c with ada/version.c.
* gcc-interface/Make-lang.in: Add version.o to GNAT1_C_OBJS.
Add version.o to GNAT_ADA_OBJS and GNATBIND_OBJS.
* gcc-interface/Makefile.in: Add version.o to TOOLS_LIBS.
* gnatvsn.adb: Start using a new C symbol gnat_version_string.
* version.c: New file.
This pragma is relatively recent and may be problematic for the bootstrap.
gcc/ada/
* atree.ads (Slot): Remove pragma Provide_Shift_Operators.
(Shift_Left): New intrinsic function.
(Shift_Right): Likewise.
* atree.adb (Get_1_Bit_Val): Use Natural instead of Integer.
(Get_2_Bit_Val): Likewise.
(Get_4_Bit_Val): Likewise.
(Get_8_Bit_Val): Likewise.
(Set_1_Bit_Val): Likewise.
(Set_2_Bit_Val): Likewise.
(Set_4_Bit_Val): Likewise.
(Set_8_Bit_Val): Likewise.