When --enable-cet is used to configure GCC, enable Intel CET in libphobos.
* Makefile.am (AM_MAKEFLAGS): Add $(CET_FLAGS) to GCC FLAGS.
* configure.ac (CET_FLAGS): Add GCC_CET_FLAGS and AC_SUBST.
* Makefile.in: Regenerated.
* aclocal.m4: Likewise.
* configure.ac: Likewise.
There are several places where we insert bind expressions while
making the coroutine AST transforms. These should be marked as
having side-effects where relevant, which had been omitted. This
leads to at least one failure in the cppcoros test suite, where a loop
body is dropped in gimplification because it is not marked.
gcc/cp/ChangeLog:
2020-05-08 Iain Sandoe <iain@sandoe.co.uk>
PR c++/95003
* coroutines.cc (build_actor_fn): Ensure that bind scopes
are marked as having side-effects where necessary.
(replace_statement_captures): Likewise.
(morph_fn_to_coro): Likewise.
gcc/testsuite/ChangeLog:
2020-05-08 Iain Sandoe <iain@sandoe.co.uk>
PR c++/95003
* g++.dg/coroutines/torture/pr95003.C: New test.
The existing directives-only code (a) punched a hole through the
libcpp interface and (b) didn't support raw string literals. This
reimplements this preprocessing mode. I added a proper callback
interface, and adjusted c-ppoutput to use it. Sadly I cannot get rid
of the libcpp/internal.h include for unrelated reasons.
The new scanner is in lex.c, and works by doing some backwards scanning
when it finds a character of interest. This reduces the number of
cases one has to deal with in forward scanning. It may have different
failure modes than forward scanning on bad tokenization.
Finally, I moved some cpp tests from the C-specific gcc.dg/cpp directory
to the c-c++-common/cpp shared directory.
libcpp/
* directives-only.c: Delete.
* Makefile.in (libcpp_a_OBJS, libcpp_a_SOURCES): Remove it.
* include/cpplib.h (enum CPP_DO_task): New enum.
(cpp_directive_only_preprocess): Declare.
* internal.h (_cpp_dir_only_callbacks): Delete.
(_cpp_preprocess_dir_only): Delete.
* lex.c (do_peek_backslash, do_peek_next, do_peek_prev): New.
(cpp_directives_only_process): New implementation.
gcc/c-family/
Reimplement directives only processing.
* c-ppoutput.c (token_streamer): New.
(directives_only_cb): New. Swallow ...
(print_lines_directives_only): ... this.
(scan_translation_unit_directives_only): Reimplement using the
published interface.
gcc/testsuite/
* gcc.dg/cpp/counter-[23].c: Move to c-c++-common/cpp.
* gcc.dg/cpp/dir-only-*: Likewise.
* c-c++-common/cpp/dir-only-[78].c: New.
This delays the SLP permutation check to vectorizable_load and optimizes
permutations only after all SLP instances have been generated and the
vectorization factor is determined.
2020-05-08 Richard Biener <rguenther@suse.de>
* tree-vectorizer.h (vec_info::slp_loads): New.
(vect_optimize_slp): Declare.
* tree-vect-slp.c (vect_attempt_slp_rearrange_stmts): Do
nothing when there are no loads.
(vect_gather_slp_loads): Gather loads into a vector.
(vect_supported_load_permutation_p): Remove.
(vect_analyze_slp_instance): Do not verify permutation
validity here.
(vect_analyze_slp): Optimize permutations of reductions
after all SLP instances have been gathered and gather
all loads.
(vect_optimize_slp): New function split out from
vect_supported_load_permutation_p. Elide some permutations.
(vect_slp_analyze_bb_1): Call vect_optimize_slp.
* tree-vect-loop.c (vect_analyze_loop_2): Likewise.
* tree-vect-stmts.c (vectorizable_load): Check whether
the load can be permuted. When generating code assert we can.
* gcc.dg/vect/bb-slp-pr68892.c: Adjust for not supported
SLP permutations becoming builds from scalars.
* gcc.dg/vect/bb-slp-pr78205.c: Likewise.
* gcc.dg/vect/bb-slp-34.c: Likewise.
Two aliased objects must have distinct addresses, even if they have
size zero, so we make sure to allocate at least one byte for them.
* gcc-interface/decl.c (gnat_to_gnu_entity) <E_Variable>: Force at
least the unit size for an aliased object of a constrained nominal
subtype whose size is variable.
The first tweak is to remove the TREE_OVERFLOW flag on INTEGER_CSTs
because it prevents them from being uniquized in LTO mode.
The second, unrelated tweak is to canonicalize the packable types made
by gigi so that at most one per type is present in the GENERIC IL.
* gcc-interface/decl.c (gnat_to_gnu_entity) <E_Array_Subtype>: Deal
with artificial maximally-sized types designed by access types.
* gcc-interface/utils.c (packable_type_hash): New structure.
(packable_type_hasher): Likewise.
(packable_type_hash_table): New hash table.
(init_gnat_utils): Initialize it.
(destroy_gnat_utils): Destroy it.
(packable_type_hasher::equal): New method.
(hash_packable_type): New static function.
(canonicalize_packable_type): Likewise.
(make_packable_type): Make sure not to use too small a type for the
size of the new fields. Canonicalize the type if it is named.
The information was missing in cases where the front-end was able to turn
the range comparison into a simple comparison.
* gcc-interface/trans.c (Raise_Error_to_gnu): Always compute a lower
bound and an upper bound for use by the -gnateE switch for range and
comparison operators.
We mark the type of In parameters in Ada with the const qualifier, but
it is stripped by free_lang_data_in_type so do not do it in LTO mode.
* gcc-interface/decl.c (gnat_to_gnu_param): Do not make a variant
of the type in LTO mode.
This fixes an issue with redundant store elimination in FRE/PRE
which, when invoked by the DOM elimination walk, ends up using
possibly stale availability data from the RPO walk. It also
fixes a missed optimization during valueization of addresses
by making sure to use get_addr_base_and_unit_offset_1 which can
valueize and adjusting that to also valueize ARRAY_REFs low-bound.
2020-05-08 Richard Biener <rguenther@suse.de>
* tree-ssa-sccvn.c (rpo_avail): Change type to
eliminate_dom_walker *.
(eliminate_with_rpo_vn): Adjust rpo_avail to make vn_valueize
use the DOM walker availability.
(vn_reference_fold_indirect): Use get_addr_base_and_unit_offset_1
with vn_valueize as valueization callback.
(vn_reference_maybe_forwprop_address): Likewise.
* tree-dfa.c (get_addr_base_and_unit_offset_1): Also valueize
array_ref_low_bound.
* gnat.dg/opt83.adb: New testcase.
We already have x - ((x - y) & -(z < w)) and
x + ((y - x) & -(z < w)) simplifications, this one adds
x ^ ((x ^ y) & -(z < w)) (not merged using for because of the
:c that can be present on bit_xor and can't on minus).
2020-05-08 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/94786
* match.pd (A ^ ((A ^ B) & -(C cmp D)) -> (C cmp D) ? B : A): New
simplification.
* gcc.dg/tree-ssa/pr94786.c: New test.
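For illustration, a minimal sketch of the identity behind the new pattern, written as ordinary C++ rather than match.pd syntax. The function names are made up for this example and are not taken from the testcase; the mask is all ones when the comparison holds, so the xor form selects y, otherwise x.

```cpp
#include <cassert>

// Branch-free form matched by the new pattern. The mask is all ones
// when z < w and zero otherwise.
unsigned select_xor(unsigned x, unsigned y, int z, int w) {
    unsigned mask = -static_cast<unsigned>(z < w);
    return x ^ ((x ^ y) & mask);
}

// Form the simplification is described as producing: (C cmp D) ? B : A.
unsigned select_cond(unsigned x, unsigned y, int z, int w) {
    return (z < w) ? y : x;
}

// Checks that both forms agree on a handful of sample inputs
// (hypothetical helper, only for this sketch).
void check_xor_select() {
    const unsigned vals[] = {0u, 1u, 0x1234u, ~0u};
    for (unsigned x : vals)
        for (unsigned y : vals) {
            assert(select_xor(x, y, 1, 2) == select_cond(x, y, 1, 2));
            assert(select_xor(x, y, 2, 1) == select_cond(x, y, 2, 1));
        }
}
```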
The following peephole2 changes:
- addl (%rdi), %esi
+ xorl %eax, %eax
+ addl %esi, (%rdi)
setc %al
- movl %esi, (%rdi)
- movzbl %al, %eax
ret
on the testcase. *add<mode>3_cc_overflow_1, being an add{l,q} insn, is
commutative, so if TARGET_READ_MODIFY_WRITE we can replace
addl (%rdi), %esi; movl %esi, (%rdi)
with
addl %esi, (%rdi)
if %esi is dead after those two insns.
2020-05-08 Jakub Jelinek <jakub@redhat.com>
PR target/94857
* config/i386/i386.md (peephole2 after *add<mode>3_cc_overflow_1): New
define_peephole2.
* gcc.target/i386/pr94857.c: New test.
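As a rough sketch, source of the following shape is the kind of code where an add from memory, a carry test and a store back to the same location can appear together. The function name is hypothetical and this is not claimed to be the actual testcase; whether a given compiler emits exactly the sequence shown above depends on version, options and tuning.

```cpp
#include <cassert>

// Adds x to *p in place and reports whether the unsigned addition
// wrapped around (i.e. produced a carry).
int add_in_place_carry(unsigned *p, unsigned x) {
    unsigned old = *p;
    *p += x;
    return old > *p;  // true exactly when the addition carried
}

// Small self-check for the sketch (hypothetical helper).
void check_add_in_place_carry() {
    unsigned v = ~0u;
    assert(add_in_place_carry(&v, 1u) == 1 && v == 0u);
    unsigned w = 1u;
    assert(add_in_place_carry(&w, 2u) == 0 && w == 3u);
}
```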
On Thu, May 07, 2020 at 02:45:29PM +0200, Thomas Schwinge wrote:
> >>+ for (tree op = win; TREE_CODE (op) == COMPOUND_EXPR;
>
> ..., and new 'op' variable here.
>
> >>+ op = TREE_OPERAND (op, 1))
> >>+ v.safe_push (op);
> >>+ FOR_EACH_VEC_ELT_REVERSE (v, i, op)
> >>+ ret = build2_loc (EXPR_LOCATION (op), COMPOUND_EXPR,
> >>+ TREE_TYPE (win), TREE_OPERAND (op, 0),
> >>+ ret);
> >>+ return ret;
> >> }
> >> while (TREE_CODE (op) == NOP_EXPR)
> >> {
There is no reason for the shadowing and op at this point acts as a
temporary and will be overwritten in FOR_EACH_VEC_ELT_REVERSE anyway.
So, we can just s/tree // here.
2020-05-08 Jakub Jelinek <jakub@redhat.com>
PR middle-end/94724
* tree.c (get_narrower): Reuse the op temporary instead of
shadowing it.
The following patch canonicalizes M = X >> (prec - 1); (X + M) ^ M
for signed integral types into ABS_EXPR (X). For X == min it is already
UB because M is -1 and min + -1 is UB, so we can use ABS_EXPR rather than
say ABSU_EXPR + cast.
The backend might then emit the abs code back using the shift and addition
and xor if it is the best sequence for the target, but could do something
different that is better.
2020-05-08 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/94783
* match.pd ((X + (X >> (prec - 1))) ^ (X >> (prec - 1)) to abs (X)):
New simplification.
* gcc.dg/tree-ssa/pr94783.c: New test.
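A small sketch of the source-level idiom being canonicalized, assuming int as the signed type and an arithmetic right shift of negative values (which GCC provides). The function name is invented for the example. As noted above, the minimum value is excluded because x + m already overflows there.

```cpp
#include <cassert>
#include <climits>

// Shift/add/xor idiom for the absolute value: m is 0 for non-negative
// x and -1 for negative x (assuming an arithmetic right shift).
int abs_via_shift(int x) {
    const int prec = static_cast<int>(sizeof(int) * CHAR_BIT);
    int m = x >> (prec - 1);
    return (x + m) ^ m;  // undefined for x == INT_MIN, like abs itself
}

// Small self-check for the sketch (hypothetical helper).
void check_abs_via_shift() {
    assert(abs_via_shift(-5) == 5);
    assert(abs_via_shift(7) == 7);
    assert(abs_via_shift(0) == 0);
    assert(abs_via_shift(INT_MIN + 1) == INT_MAX);
}
```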
The ffs expanders on several targets (x86, ia64, aarch64 at least)
emit a conditional move or similar code to handle the case when the
argument is 0, which makes the code longer.
If we know from VRP that the argument will not be zero, we can (if the
target also has a ctz expander) just use ctz, which is undefined at zero,
and thus the expander doesn't need to deal with that.
2020-05-08 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/94956
* match.pd (FFS): Optimize __builtin_ffs* of non-zero argument into
__builtin_ctz* + 1 if direct IFN_CTZ is supported.
* gcc.target/i386/pr94956.c: New test.
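To illustrate the equivalence the optimization relies on, here is a sketch using the GCC/Clang builtins directly. The wrapper names are hypothetical; the second function is only valid when its argument is known to be non-zero, which is the information VRP supplies in the compiler.

```cpp
#include <cassert>

// General form: returns 0 for x == 0, otherwise one plus the index of
// the least significant set bit.
int ffs_general(int x) {
    return __builtin_ffs(x);
}

// Form usable only when x != 0: __builtin_ctz is undefined at zero.
int ffs_nonzero(int x) {
    return __builtin_ctz(static_cast<unsigned>(x)) + 1;
}

// Compares both forms over a range of non-zero inputs
// (hypothetical helper for this sketch).
bool ffs_forms_agree() {
    for (int x = -1000; x <= 1000; ++x) {
        if (x == 0)
            continue;  // excluded: ctz is undefined here
        if (ffs_general(x) != ffs_nonzero(x))
            return false;
    }
    return true;
}
```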
Implemented thusly. The TYPE_OVERFLOW_WRAPS is there just because the
pattern above it has it too, if you want, I can throw it away from both.
2020-05-08 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/94913
* match.pd (A - B + -1 >= A to B >= A): New simplification.
(A - B > A to A < B): Don't test TYPE_OVERFLOW_WRAPS which is always
true for TYPE_UNSIGNED integral types.
* gcc.dg/tree-ssa/pr94913.c: New test.
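As a sanity check of the first rewrite, the sketch below compares both sides exhaustively for an 8-bit unsigned type, where wrap-around is explicit via a cast. The 8-bit width and the function names are choices made for this example only; the pattern itself is stated for unsigned integral types in general.

```cpp
#include <cassert>
#include <cstdint>

// Left-hand side: A - B + -1 >= A, computed modulo 2^8.
bool original_form(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(a - b - 1) >= a;
}

// Right-hand side after simplification: B >= A.
bool simplified_form(std::uint8_t a, std::uint8_t b) {
    return b >= a;
}

// Exhaustively compares both forms over all 8-bit operand pairs.
bool forms_agree_for_all_u8() {
    for (unsigned a = 0; a < 256; ++a)
        for (unsigned b = 0; b < 256; ++b) {
            const auto x = static_cast<std::uint8_t>(a);
            const auto y = static_cast<std::uint8_t>(b);
            if (original_form(x, y) != simplified_form(x, y))
                return false;
        }
    return true;
}
```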
My recent combine-stack-adj.c change broke df checking bootstrap: while
most of the changes are done through validate_change/confirm_changes,
which update df info, the removal of REG_EQUAL notes didn't update df info.
2020-05-08 Jakub Jelinek <jakub@redhat.com>
PR bootstrap/94961
PR rtl-optimization/94516
* rtl.h (remove_reg_equal_equiv_notes): Add a bool argument defaulted
to false.
* rtlanal.c (remove_reg_equal_equiv_notes): Add no_rescan argument.
Call df_notes_rescan if that argument is not true and returning true.
* combine.c (adjust_for_new_dest): Pass true as second argument to
remove_reg_equal_equiv_notes.
* postreload.c (reload_combine_recognize_pattern): Don't call
df_notes_rescan.
setnbc[r] is like setbc[r], but it writes -1 instead of 1 to the GPR.
2020-05-07 Segher Boessenkool <segher@kernel.crashing.org>
* config/rs6000/rs6000.md (*setnbc_<un>signed_<GPR:mode>): New
define_insn.
(*setnbcr_<un>signed_<GPR:mode>): New define_insn.
(*neg_eq_<mode>): Avoid for TARGET_FUTURE; add missing && 1.
(*neg_ne_<mode>): Likewise.
New instructions setbc and setbcr. setbc sets a GPR to 1 if some
condition register bit is set, and 0 otherwise; setbcr does it the
other way around.
2020-05-07 Segher Boessenkool <segher@kernel.crashing.org>
* config/rs6000/rs6000.md (setbc_<un>signed_<GPR:mode>): New
define_insn.
(*setbcr_<un>signed_<GPR:mode>): Likewise.
(cstore<mode>4): Use setbc[r] if available.
(<code><GPR:mode><GPR2:mode>2_isel): Avoid for TARGET_FUTURE.
(eq<mode>3): Use setbc for TARGET_FUTURE.
(*eq<mode>3): Avoid for TARGET_FUTURE.
(ne<mode>3): Replace :P with :GPR; use setbc for TARGET_FUTURE;
else for non-Pmode, use gen_eq and gen_xor.
(*ne<mode>3): Avoid for TARGET_FUTURE.
(*eqsi3_ext<mode>): Avoid for TARGET_FUTURE; fix missing && 1.
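For reference, a plain C++ model of the results of the four instructions as described in the two entries above. This is only a sketch of the semantics in terms of a single condition register bit; the function names mirror the mnemonics but are ordinary functions, not intrinsics.

```cpp
#include <cassert>

// bit is the value of the selected condition register bit.
long setbc(bool bit)   { return bit ? 1 : 0; }   // 1 if set, else 0
long setbcr(bool bit)  { return bit ? 0 : 1; }   // the other way around
long setnbc(bool bit)  { return bit ? -1 : 0; }  // like setbc, but -1
long setnbcr(bool bit) { return bit ? 0 : -1; }  // like setbcr, but -1

// Checks the relation between the plain and negated variants
// (hypothetical helper for this sketch).
void check_setbc_model() {
    const bool bits[] = {false, true};
    for (bool bit : bits) {
        assert(setnbc(bit) == -setbc(bit));
        assert(setnbcr(bit) == -setbcr(bit));
        assert(setbc(bit) + setbcr(bit) == 1);
    }
}
```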
* config/h8300/h8300.md: Move expanders and patterns into
files based on functionality.
* config/h8300/addsub.md: New file.
* config/h8300/bitfield.md: New file.
* config/h8300/combiner.md: New file.
* config/h8300/divmod.md: New file.
* config/h8300/extensions.md: New file.
* config/h8300/jumpcall.md: New file.
* config/h8300/logical.md: New file.
* config/h8300/movepush.md: New file.
* config/h8300/multiply.md: New file.
* config/h8300/other.md: New file.
* config/h8300/proepi.md: New file.
* config/h8300/shiftrotate.md: New file.
* config/h8300/testcompare.md: New file.
commit da1de1d91088ac506c1bed0fba9b0f04c5b8c876
* config/h8300/h8300.md (adds/subs splitters): Merge into single
splitter.
(negation expanders and patterns): Simplify and combine using
iterators.
(one_cmpl expanders and patterns): Likewise.
(tablejump, indirect_jump patterns): Likewise.
(shift and rotate expanders and patterns): Likewise.
(absolute value expander and pattern): Drop expander, rename pattern
to just "abssf2".
(peephole2 patterns): Move into...
* config/h8300/peepholes.md: New file.
Some new algorithms need to use _GLIBCXX_STD_A to refer to the "normal"
version of the algorithm, to work around the namespace dance done for
parallel mode.
PR libstdc++/94971 (partial)
* include/bits/ranges_algo.h (ranges::__sample_fn): Qualify
std::sample using macro to work in parallel mode.
(__sort_fn): Likewise for std::sort.
(ranges::__nth_element_fn): Likewise for std::nth_element.
* include/bits/stl_algobase.h (lexicographical_compare_three_way):
Likewise for std::__min_cmp.
* include/parallel/algobase.h (lexicographical_compare_three_way):
Add to namespace std::__parallel.
This is a correct fix for the incorrect cppcheck suggestion to make
these parameters const. In order to do that, the dereference operators
need to be const. The conversions to the underlying iterator can be const
too.
PR c/92472
* include/parallel/multiway_merge.h (_GuardedIterator::operator*)
(_GuardedIterator::operator _RAIter, _UnguardedIterator::operator*)
(_UnguardedIterator::operator _RAIter): Add const qualifier.
(operator<(_GuardedIterator&, _GuardedIterator&)
(operator<=(_GuardedIterator&, _GuardedIterator&)
(operator<(_UnguardedIterator&, _UnguardedIterator&)
(operator<=(_UnguardedIterator&, _UnguardedIterator&): Change
parameters to const references.
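A simplified sketch of why the operators need the const qualifier once the comparison takes const references. GuardedIter is a hypothetical stand-in written for this example, not the libstdc++ class, and its ordering rule (an exhausted iterator compares greater than any other) is an assumption of the sketch.

```cpp
#include <cassert>
#include <iterator>

// Hypothetical, simplified stand-in for a guarded iterator.
template <typename RAIter>
class GuardedIter {
    RAIter cur_;
    RAIter end_;

public:
    GuardedIter(RAIter begin, RAIter end) : cur_(begin), end_(end) {}

    // const-qualified so it can be called through a const reference.
    typename std::iterator_traits<RAIter>::reference operator*() const {
        return *cur_;
    }

    // Conversion to the underlying iterator can be const as well.
    operator RAIter() const { return cur_; }

    // Takes const references; dereferencing a and b below only compiles
    // because operator* is const.
    friend bool operator<(const GuardedIter& a, const GuardedIter& b) {
        if (a.cur_ == a.end_)
            return false;  // exhausted: treated as greater than anything
        if (b.cur_ == b.end_)
            return true;
        return *a < *b;
    }
};

// Small demonstration (hypothetical helper for this sketch).
bool guarded_iter_demo() {
    int xs[] = {1, 2};
    int ys[] = {3};
    const GuardedIter<int*> a(xs, xs + 2);
    const GuardedIter<int*> b(ys, ys + 1);
    const GuardedIter<int*> done(xs + 2, xs + 2);
    return (a < b) && !(b < a) && (b < done) && !(done < a);
}
```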
When we have completely missing key information (e.g. the
coroutine_traits) or a partially transformed function body, we
need to try and balance returning useful information about
failures with the possibility that some part of the diagnostics
machinery or following code will not be able to handle the
state.
The PRs (and revised testcase) point to cases where that processing
has failed.
This revises the process to avoid special handling for the
ramp, and falls back on the same code used for regular function
fails.
There are test-cases (in addition to the ones for the PRs) that now
cover all early exit points [where the transforms are considered
to have failed in a manner that does not allow compilation to
continue].
gcc/cp/ChangeLog:
2020-05-07 Iain Sandoe <iain@sandoe.co.uk>
PR c++/94817
PR c++/94829
* coroutines.cc (morph_fn_to_coro): Set unformed outline
functions to error_mark_node. For early error returns suppress
warnings about missing ramp return values. Fix reinstatement
of the function body on pre-existing initial error.
* decl.c (finish_function): Use the normal error path for fails
in the ramp function, do not try to compile the helpers if the
transform fails.
gcc/testsuite/ChangeLog:
2020-05-07 Iain Sandoe <iain@sandoe.co.uk>
PR c++/94817
PR c++/94829
* g++.dg/coroutines/coro-missing-final-suspend.C: New test.
* g++.dg/coroutines/coro-missing-initial-suspend.C: New test.
* g++.dg/coroutines/coro-missing-promise-yield.C: Check for
continuation of compilation.
* g++.dg/coroutines/coro-missing-promise.C: Likewise.
* g++.dg/coroutines/coro-missing-ret-value.C: Likewise.
* g++.dg/coroutines/coro-missing-ret-void.C: Likewise.
* g++.dg/coroutines/coro-missing-ueh-3.C: Likewise.
* g++.dg/coroutines/pr94817.C: New test.
* g++.dg/coroutines/pr94829.C: New test.
This PR points out that we don't detect long double -> double narrowing
when long double happens to have the same precision as double; on x86_64
this can be achieved by -mlong-double-64.
[dcl.init.list]#7.2 specifically says "from long double to double or float,
or from double to float", but check_narrowing only checks
TYPE_PRECISION (type) < TYPE_PRECISION (ftype)
so we need to handle the other cases too, e.g. by same_type_p as in
the following patch.
PR c++/94590 - Detect long double -> double narrowing.
* typeck2.c (check_narrowing): Detect long double -> double
narrowing even when double and long double have the same
precision. Make it handle conversions to float too.
* g++.dg/cpp0x/Wnarrowing18.C: New test.
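One way to observe which conversions count as narrowing without triggering a diagnostic is a small SFINAE probe on list-initialization. This is a sketch written for this note, not the new test; the trait name is hypothetical, and note that it also reports true when no conversion exists at all.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Primary template: assume narrowing (or no conversion at all).
template <typename From, typename To, typename = void>
struct is_narrowing : std::true_type {};

// Chosen when To{from} is well-formed, i.e. list-initialization from a
// non-constant From does not narrow.
template <typename From, typename To>
struct is_narrowing<From, To, decltype(void(To{std::declval<From>()}))>
    : std::false_type {};

// The cases listed in [dcl.init.list], plus one widening conversion
// (hypothetical helper for this sketch).
bool narrowing_probe_demo() {
    return is_narrowing<long double, double>::value
        && is_narrowing<long double, float>::value
        && is_narrowing<double, float>::value
        && !is_narrowing<float, double>::value;
}
```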
This is an ICE on invalid, because we're specializing S::foo in the
wrong namespace. cp_parser_class_specifier_1 parses S::foo in M
and then it tries to push the nested-name-specifier of foo, which is
S. By that, we're breaking the assumption of push_inner_scope that
the pushed scope must be a scope nested inside current scope: current
scope is M, but the namespace context of S is N, and N is not nested
in M, so we fell into an infinite loop in push_inner_scope_r.
(cp_parser_class_head called check_specialization_namespace which already
gave a permerror.)
PR c++/94255
* parser.c (cp_parser_class_specifier_1): Check that the scope is
nested inside current scope before pushing it.
* g++.dg/template/spec41.C: New test.
* tree-ssa-operands.c (operands_scanner): New class.
(operands_bitmap_obstack): Remove.
(n_initialized): Remove.
(build_uses): Move to operands_scanner class.
(build_vuse): Same as above.
(build_vdef): Same as above.
(verify_ssa_operands): Same as above.
(finalize_ssa_uses): Same as above.
(cleanup_build_arrays): Same as above.
(finalize_ssa_stmt_operands): Same as above.
(start_ssa_stmt_operands): Same as above.
(append_use): Same as above.
(append_vdef): Same as above.
(add_virtual_operand): Same as above.
(add_stmt_operand): Same as above.
(get_mem_ref_operands): Same as above.
(get_tmr_operands): Same as above.
(maybe_add_call_vops): Same as above.
(get_asm_stmt_operands): Same as above.
(get_expr_operands): Same as above.
(parse_ssa_operands): Same as above.
(finalize_ssa_defs): Same as above.
(build_ssa_operands): Same as above, plus create a C-like wrapper.
(update_stmt_operands): Create an instance of operands_scanner.
This was approved in the Prague 2020 WG21 meeting so let's adjust the
comment. Since it's supposed to be a DR I think we should no longer
limit it to C++20.
P1957R2
* typeck2.c (check_narrowing): Consider T* to bool narrowing
in C++11 and up.
* g++.dg/cpp0x/initlist92.C: Don't expect an error in C++20 only.
externally_visible_p wasn't the correct predicate to use (even if it
worked); instead we should use DECL_EXTERNAL || TREE_PUBLIC.
2020-05-07 Richard Biener <rguenther@suse.de>
PR ipa/94947
* tree-ssa-structalias.c (refered_from_nonlocal_fn): Use
DECL_EXTERNAL || TREE_PUBLIC instead of externally_visible.
(refered_from_nonlocal_var): Likewise.
(ipa_pta_execute): Likewise.
I was looking at DR 296 and noticed that we say "nonstatic" instead of
"non-static", which is the version the standard uses. So this patch
fixes the spelling throughout the front end. Did not check e.g.
non-dependent or any other.
* decl.c (grok_op_properties): Fix spelling of non-static.
* typeck.c (build_class_member_access_expr): Likewise.
* g++.dg/other/operator1.C: Adjust expected message.
* g++.dg/overload/operator2.C: Likewise.
* g++.dg/template/error30.C: Likewise.
* g++.old-deja/g++.jason/operator.C: Likewise.
This extends DECL_GIMPLE_REG_P to all types so we can clear
TREE_ADDRESSABLE even for integers with partial defs, not just
complex and vector variables. To make that transition easier
the patch inverts DECL_GIMPLE_REG_P to DECL_NOT_GIMPLE_REG_P
since that makes the default the current state for all other
types besides complex and vectors.
For the testcase in PR94703 we're able to expand the partial
def'ed local integer to a register then, producing a single
movl rather than going through the stack.
On i?86 this execute FAILs gcc.dg/torture/pr71522.c because
we now expand a round-trip through a long double automatic var
to a register fld/fst which normalizes the value. For that
during RTL expansion we're looking for problematic punnings
of decls and avoid pseudos for those - I chose integer or
BLKmode accesses on decls with modes where precision doesn't
match bitsize which covers the XFmode case.
2020-05-07 Richard Biener <rguenther@suse.de>
PR middle-end/94703
* tree-core.h (tree_decl_common::gimple_reg_flag): Rename ...
(tree_decl_common::not_gimple_reg_flag): ... to this.
* tree.h (DECL_GIMPLE_REG_P): Rename ...
(DECL_NOT_GIMPLE_REG_P): ... to this.
* gimple-expr.c (copy_var_decl): Copy DECL_NOT_GIMPLE_REG_P.
(create_tmp_reg): Simplify.
(create_tmp_reg_fn): Likewise.
(is_gimple_reg): Check DECL_NOT_GIMPLE_REG_P for all regs.
* gimplify.c (create_tmp_from_val): Simplify.
(gimplify_bind_expr): Likewise.
(gimplify_compound_literal_expr): Likewise.
(gimplify_function_tree): Likewise.
(prepare_gimple_addressable): Set DECL_NOT_GIMPLE_REG_P.
* asan.c (create_odr_indicator): Do not clear DECL_GIMPLE_REG_P.
(asan_add_global): Copy it.
* cgraphunit.c (cgraph_node::expand_thunk): Force args
to be GIMPLE regs.
* function.c (gimplify_parameters): Copy
DECL_NOT_GIMPLE_REG_P.
* ipa-param-manipulation.c
(ipa_param_body_adjustments::common_initialization): Simplify.
(ipa_param_body_adjustments::reset_debug_stmts): Copy
DECL_NOT_GIMPLE_REG_P.
* omp-low.c (lower_omp_for_scan): Do not set DECL_GIMPLE_REG_P.
* sanopt.c (sanitize_rewrite_addressable_params): Likewise.
* tree-cfg.c (make_blocks_1): Simplify.
(verify_address): Do not verify DECL_GIMPLE_REG_P setting.
* tree-eh.c (lower_eh_constructs_2): Simplify.
* tree-inline.c (declare_return_variable): Adjust and
generalize.
(copy_decl_to_var): Copy DECL_NOT_GIMPLE_REG_P.
(copy_result_decl_to_var): Likewise.
* tree-into-ssa.c (pass_build_ssa::execute): Adjust comment.
* tree-nested.c (create_tmp_var_for): Simplify.
* tree-parloops.c (separate_decls_in_region_name): Copy
DECL_NOT_GIMPLE_REG_P.
* tree-sra.c (create_access_replacement): Adjust and
generalize partial def support.
* tree-ssa-forwprop.c (pass_forwprop::execute): Set
DECL_NOT_GIMPLE_REG_P on decls we introduce partial defs on.
* tree-ssa.c (maybe_optimize_var): Handle clearing of
TREE_ADDRESSABLE and setting/clearing DECL_NOT_GIMPLE_REG_P
independently.
* lto-streamer-out.c (hash_tree): Hash DECL_NOT_GIMPLE_REG_P.
* tree-streamer-out.c (pack_ts_decl_common_value_fields): Stream
DECL_NOT_GIMPLE_REG_P.
* tree-streamer-in.c (unpack_ts_decl_common_value_fields): Likewise.
* cfgexpand.c (avoid_type_punning_on_regs): New.
(discover_nonconstant_array_refs): Call
avoid_type_punning_on_regs to avoid unsupported mode punning.
lto/
* lto-common.c (compare_tree_sccs_1): Compare
DECL_NOT_GIMPLE_REG_P.
c/
* gimple-parser.c (c_parser_parse_ssa_name): Do not set
DECL_GIMPLE_REG_P.
cp/
* optimize.c (update_cloned_parm): Copy DECL_NOT_GIMPLE_REG_P.
* gcc.dg/tree-ssa/pr94703.c: New testcase.
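For illustration, a sketch of what a partial definition of a local integer can look like in source form. The function name is made up and this is not claimed to be the testcase from the PR; which half of r is written depends on the target's byte order.

```cpp
#include <cassert>
#include <cstring>

// Only half of r is overwritten: a partial definition of a local
// integer that would otherwise be forced to live in memory.
unsigned set_lowpart(const unsigned *x) {
    unsigned r = 0;
    std::memcpy(&r, x, sizeof(unsigned) / 2);
    return r;
}

// Byte-order independent check: exactly half of the bits get copied
// (hypothetical helper for this sketch).
bool set_lowpart_demo() {
    const unsigned all_ones = ~0u;
    const unsigned half_bits = sizeof(unsigned) * 4;  // 8 bits/byte assumed
    const unsigned low = all_ones >> half_bits;
    const unsigned high = all_ones << half_bits;
    const unsigned r = set_lowpart(&all_ones);
    return r == low || r == high;
}
```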
The testcase in its current form doesn't FAIL without the patch on
x86_64-linux unless also testing with -m32. As 64-bit testing on that
target is probably far more common, and we can also use attributes
that FAIL without the patch with -m64, the following patch adjusts the
test so that it FAILs without the patch for both -m64 and -m32 (but not
-mx32) and PASSes with the patch.
2020-05-07 Jakub Jelinek <jakub@redhat.com>
PR c++/94946
* g++.dg/ext/attr-parm-1.C: Enable the test also for lp64 x86, use
sysv_abi and ms_abi attributes in that case instead of fastcall and
no attribute.
If the second argument of __builtin_speculation_safe_value is
error_mark_node (or has such a type), we ICE during
useless_type_conversion_p.
2020-05-07 Jakub Jelinek <jakub@redhat.com>
PR c/94968
* c-common.c (speculation_safe_value_resolve_params): Return false if
error_operand_p (val2).
(resolve_overloaded_builtin) <case BUILT_IN_SPECULATION_SAFE_VALUE_N>:
Remove extraneous semicolon.
* gcc.dg/pr94968.c: New test.
The attached patch fixes a bootstrap failure on AArch32 introduced by
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=308bc496884706af4b3077171cbac684c7a6f7c6
This makes the declaration of arm_add_stmt_cost match the definition, and removes the redundant
class keyword from the definition.
2020-05-07 Alex Coplan <alex.coplan@arm.com>
* config/arm/arm.c (arm_add_stmt_cost): Fix declaration, remove class
from definition.
This rewrites store-motion to process candidates where we can
ensure order preservation separately and with no need to disambiguate
against all stores. Those candidates we cannot handle this way
are validated to be independent of all stores (w/o TBAA) and then
processed as "unordered" (all conditionally executed stores are so
as well).
This will necessarily cause
FAIL: gcc.dg/graphite/pr80906.c scan-tree-dump graphite "isl AST to Gimple succeeded"
because the SM previously performed is not valid for exactly the PR57359
reason; we still perform SM of qc for the innermost loop but that's not enough.
There is still room for improvement because we still check some constraints
for the order preserving cases that are only necessary in the current
strict way for the unordered ones. Leaving that for the future.
2020-05-07 Richard Biener <rguenther@suse.de>
PR tree-optimization/57359
* tree-ssa-loop-im.c (im_mem_ref::indep_loop): Remove.
(im_mem_ref::dep_loop): Repurpose.
(LOOP_DEP_BIT): Remove.
(enum dep_kind): New.
(enum dep_state): Likewise.
(record_loop_dependence): New function to populate the
dependence cache.
(query_loop_dependence): New function to query the dependence
cache.
(memory_accesses::refs_in_loop): Rename to ...
(memory_accesses::refs_loaded_in_loop): ... this and change to
only record loads.
(outermost_indep_loop): Adjust.
(mem_ref_alloc): Likewise.
(gather_mem_refs_stmt): Likewise.
(mem_refs_may_alias_p): Add tbaa_p parameter and pass it down.
(struct sm_aux): New.
(execute_sm): Split code generation on exits, record state
into new hash-map.
(enum sm_kind): New.
(execute_sm_exit): Exit code generation part.
(sm_seq_push_down): Helper for sm_seq_valid_bb performing
dependence checking on stores reached from exits.
(sm_seq_valid_bb): New function gathering SM stores on exits.
(hoist_memory_references): Re-implement.
(refs_independent_p): Add tbaa_p parameter and pass it down.
(record_dep_loop): Remove.
(ref_indep_loop_p_1): Fold into ...
(ref_indep_loop_p): ... this and generalize for three kinds
of dependence queries.
(can_sm_ref_p): Adjust according to hoist_memory_references
changes.
(store_motion_loop): Don't do anything if the set of SM
candidates is empty.
(tree_ssa_lim_initialize): Adjust.
(tree_ssa_lim_finalize): Likewise.
* gcc.dg/torture/pr57359-1.c: New testcase.
* gcc.dg/torture/pr57359-2.c: Likewise.
* gcc.dg/tree-ssa/ssa-lim-14.c: Likewise.
* gcc.dg/graphite/pr80906.c: XFAIL.