Commit Graph

193484 Commits

Author SHA1 Message Date
Andrew MacLeod 156d7d8dbc Use infer instead of side-effect for ranges.
Rename the files and classes to reflect the term infer rather than side-effect.

	* Makefile.in (OBJS): Use gimple-range-infer.o.
	* gimple-range-cache.cc (ranger_cache::fill_block_cache): Change msg.
	(ranger_cache::range_from_dom): Rename var side_effect to infer.
	(ranger_cache::apply_inferred_ranges): Rename from apply_side_effects.
	* gimple-range-cache.h: Include gimple-range-infer.h.
	(class ranger_cache): Adjust prototypes, use infer_range_manager.
	* gimple-range-infer.cc: Rename from gimple-range-side-effects.cc.
	(gimple_infer_range::*): Rename from stmt_side_effects.
	(infer_range_manager::*): Rename from side_effect_manager.
	* gimple-range-side-effect.cc: Rename.
	* gimple-range-side-effect.h: Rename.
	* gimple-range-infer.h: Rename from gimple-range-side-effects.h.
	(class gimple_infer_range): Rename from stmt_side_effects.
	(class infer_range_manager): Rename from side_effect_manager.
	* gimple-range.cc (gimple_ranger::register_inferred_ranges): Rename
	from register_side_effects.
	* gimple-range.h (register_inferred_ranges): Adjust prototype.
	* range-op.h: Adjust comment.
	* tree-vrp.cc (rvrp_folder::pre_fold_bb): Use register_inferred_ranges.
	(rvrp_folder::post_fold_bb): Use register_inferred_ranges.
2022-05-25 10:33:07 -04:00
Simon Cook 63f198553d RISC-V: Don't unconditionally add m,a,f,d in arch-canonicalize
This solves an issue where rv32i, etc. are canonicalized to rv32imafd
since the g->i addition of 'm', 'a', 'f', 'd' is not actually gated by
whether the input was rv32g/rv64g.
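
The gating described above can be illustrated with a minimal Python sketch. It is not the code of arch-canonicalize itself: the function name expand_g and its simplified string handling are assumptions made for illustration, and it only looks at the base-ISA letter.

```python
def expand_g(arch: str) -> str:
    """Expand the 'g' shorthand of a RISC-V arch string (simplified model).

    Hypothetical helper: it only inspects the base-ISA letter that follows
    the 'rv32'/'rv64' prefix and ignores everything else a real
    canonicalizer has to do (ordering, implied extensions, versions).
    """
    prefix, base, rest = arch[:4], arch[4:5], arch[5:]
    if prefix not in ("rv32", "rv64"):
        raise ValueError(f"unsupported arch string: {arch!r}")
    if base == "g":
        # Only the 'g' shorthand implies i, m, a, f and d.
        return prefix + "imafd" + rest
    # Any other base (for example plain 'i') is left untouched.
    return arch
```

With this gate, an input such as rv32i stays rv32i, while rv64g still expands to rv64imafd.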

gcc/ChangeLog:

	* config/riscv/arch-canonicalize: Only add mafd extension if
	base was rv32/rv64g.
2022-05-25 22:00:17 +08:00
Tobias Burnus 2a790686fd GCN: Add gfx908/gfx90a to -march/-mtune in invoke.texi
gcc/
	* doc/invoke.texi (AMD GCN Options): Add gfx908/gfx90a.
2022-05-25 14:37:13 +02:00
Jakub Jelinek 7a3ee77a2e c: Improve build_component_ref diagnostics [PR91134]
On the following testcase (the first dg-error line) we emit a weird
diagnostic and even a fixit on pointerpointer->member
where pointerpointer is pointer to pointer to struct and we say
'pointerpointer' is a pointer; did you mean to use '->'?
The first part is indeed true, but suggesting -> when the code already
does use -> is confusing.
The following patch adjusts callers so that they tell it if it is from
. parsing or from -> parsing and in the latter case suggests to dereference
the left operand instead by adding (* before it and ) after it (before ->).
Or would a suggestion to add [0] before -> be better?

2022-05-25  Jakub Jelinek  <jakub@redhat.com>

	PR c/91134
gcc/c/
	* c-tree.h (build_component_ref): Add ARROW_LOC location_t argument.
	* c-typeck.cc (build_component_ref): Likewise.  If DATUM is
	INDIRECT_REF and ARROW_LOC isn't UNKNOWN_LOCATION, print a different
	diagnostic and fixit hint if DATUM has pointer type.
	* c-parser.cc (c_parser_postfix_expression,
	c_parser_omp_variable_list): Adjust build_component_ref callers.
	* gimple-parser.cc (c_parser_gimple_postfix_expression_after_primary):
	Likewise.
gcc/objc/
	* objc-act.cc (objc_build_component_ref): Adjust build_component_ref
	caller.
gcc/testsuite/
	* gcc.dg/pr91134.c: New test.
2022-05-25 14:21:54 +02:00
Iain Buclaw 329417d775 d: add more 'final' and 'override' to gcc/d/*.cc 'visit' impls
The first round of adding these missed several more cases in other
files where the Visitor pattern is used in the D front-end.

gcc/d/ChangeLog:

	* expr.cc: Add "final" and "override" to all "visit" vfunc decls
	as appropriate.
	* imports.cc: Likewise.
	* typeinfo.cc: Likewise.

Signed-off-by: Iain Buclaw <ibuclaw@gdcproject.org>
2022-05-25 13:12:53 +02:00
Richard Biener 19aec65ae1 Fix misspelled default
This fixes misspelled defaut: in switch statements in three
new testcases.

2022-05-25  Richard Biener  <rguenther@suse.de>

	* gcc.dg/loop-unswitch-10.c: Fix misspelled defaut:
	* gcc.dg/loop-unswitch-11.c: Likewise.
	* gcc.dg/loop-unswitch-14.c: Likewise.
2022-05-25 12:56:16 +02:00
Jakub Jelinek af02daff55 asan: Fix up instrumentation of assignments which are both loads and stores [PR105714]
On the following testcase with -Os asan pass sees:
  <bb 6> [local count: 354334800]:
  # h_21 = PHI <h_15(6), 0(5)>
  *c.3_5 = *d.2_4;
  h_15 = h_21 + 1;
  if (h_15 != 3)
    goto <bb 6>; [75.00%]
  else
    goto <bb 7>; [25.00%]

  <bb 7> [local count: 118111600]:
  *c.3_5 = MEM[(struct a *)&b + 12B];
  _13 = c.3_5->x;
  return _13;
It instruments the
  *c.3_5 = *d.2_4;
assignment by adding
  .ASAN_CHECK (7, c.3_5, 4, 4);
  .ASAN_CHECK (6, d.2_4, 4, 4);
before it (which later lowers to checking the corresponding shadow
memory).  But when considering instrumentation of
  *c.3_5 = MEM[(struct a *)&b + 12B];
it doesn't instrument anything, because it sees that *c.3_5 store is
already instrumented in a dominating block and so there is no need
to instrument *c.3_5 store again (i.e. add another
  .ASAN_CHECK (7, c.3_5, 4, 4);
).  That is true, but misses the fact that we still want to
instrument the MEM[(struct a *)&b + 12B] load.

The following patch fixes that by changing has_stmt_been_instrumented_p
to consider both store and load in the assignment if it does both
(returning true iff both have been instrumented).
That matches how we handle e.g. builtin calls, where we also perform AND
of all the memory locs involved in the call.
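
The rule "true iff both have been instrumented" can be modelled in a few lines of Python. This is only a sketch of the logic: the function name, the use of strings for memory references, and the set of already-checked references are assumptions, not the data structures used in asan.cc.

```python
def has_stmt_been_instrumented(store_ref, load_ref, instrumented):
    """Model of the AND over the memory references of one assignment.

    store_ref / load_ref are strings naming the destination and source
    memory references, or None when the assignment has no such reference.
    'instrumented' is the set of references already checked in a
    dominating position.
    """
    refs = [r for r in (store_ref, load_ref) if r is not None]
    if not refs:
        # Assumption of this sketch: no memory reference, nothing to report.
        return False
    # Every reference must already be covered; one missing check is enough
    # to require new instrumentation for the statement.
    return all(r in instrumented for r in refs)


# References checked before the loop exit in the example above.
checked = {"*c.3_5", "*d.2_4"}
needs_no_work = has_stmt_been_instrumented("*c.3_5", "*d.2_4", checked)
load_missing = has_stmt_been_instrumented(
    "*c.3_5", "MEM[(struct a *)&b + 12B]", checked)
```

In this model the second statement is reported as not yet instrumented, because its load has no earlier check even though its store does.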

I've verified that we still don't add the redundant
  .ASAN_CHECK (7, c.3_5, 4, 4);
call but just add
  _18 = &MEM[(struct a *)&b + 12B];
  .ASAN_CHECK (6, _18, 4, 4);
to instrument the load.

2022-05-25  Jakub Jelinek  <jakub@redhat.com>

	PR sanitizer/105714
	* asan.cc (has_stmt_been_instrumented_p): For assignments which
	are both stores and loads, return true only if both destination
	and source have been instrumented.

	* gcc.dg/asan/pr105714.c: New test.
2022-05-25 12:05:08 +02:00
Jakub Jelinek c125f504c4 libgomp: Fix occasional hangs with taskwait nowait depend
Richi reported occasional hangs with taskwait-depend-nowait-1.*
tests and I've finally managed to reproduce.  The problem is if
taskwait depend without nowait is encountered soon after
taskwait depend nowait and the former depends on the latter and there
is no other work to do, the taskwait depend without nowait is put
to sleep, but the empty_task optimization in
gomp_task_run_post_handle_dependers wouldn't wake it up in that
case.  gomp_task_run_post_handle_dependers normally does some wakeups
because it schedules more work (another task), which is not the
case of empty_task, but we need to do the wakeups that would be done
upon task completion so that we awake sleeping threads when the
last child is done.
So, the taskwait-depend-nowait-1.* testcase is fixed with the
else if (__builtin_expect (task->parent_depends_on, 0) part of
the patch.
The new testcase can hang on another problem, if the empty task
is the last task of a taskgroup, we need to use atomic store
like elsewhere to decrease the counter to 0, and wake up taskgroup
end if needed.
Yet another spot which can sleep is normal taskwait (without depend),
but I believe nothing needs to be done for that - in that case we
await solely until the children's queue has no tasks, tasks still
waiting for dependencies aren't accounted in that, but the reason
is that if taskwait should wait for something, there needs to be at least
one active child doing something (in the children queue), which then
possibly awakes some of its siblings when the dependencies are met,
or in the empty task case awakes further dependencies, but in any
case the child that finished is still handled as active child and
will awake taskwait at the end if there is nothing further to
do.
Last sleeping case are barriers, but that is handled by ++ret and
awaking the barrier.

2022-05-25  Jakub Jelinek  <jakub@redhat.com>

	* task.c (gomp_task_run_post_handle_dependers): If empty_task
	is the last task taskwait depend depends on, wake it up.
	Similarly if it is the last child of a taskgroup, use atomic
	store instead of decrement and awake taskgroup wait if any.
	* testsuite/libgomp.c-c++-common/taskwait-depend-nowait-2.c: New test.
2022-05-25 11:10:41 +02:00
Martin Liska a1c9f779f7 Add GIMPLE switch support to loop unswitching
This patch adds support to unswitch loops with switch statements
based on invariant index.  It furthermore reworks the cost model
to allow an overall budget of statements to be created per original
loop by all unswitching opportunities in the loop.  Compared to
the original all unswitching opportunities in a loop are
pre-evaluated before the first transform which will allow future
changes to select the most profitable candidates first.

To efficiently support switch statements the pass now uses
ranger to simplify switch statements and conditions in loop
copies based on ranges extracted from the recorded set of
predicates unswitched.
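
As a rough illustration of what unswitching on an invariant switch index means, the following Python sketch shows a loop before and after the transformation. The function names are hypothetical, the if/elif chain stands in for a C switch statement, and nothing here reflects the pass's cost model or its use of ranger.

```python
def switch_in_loop(values, mode):
    """Loop whose body dispatches on 'mode', which never changes in the loop."""
    total = 0
    for v in values:
        if mode == 0:
            total += v
        elif mode == 1:
            total -= v
        else:
            total += 2 * v
    return total


def unswitched(values, mode):
    """Same computation with the invariant dispatch hoisted out of the loop.

    Each branch gets its own copy of the loop in which the dispatch has
    been simplified away.
    """
    total = 0
    if mode == 0:
        for v in values:
            total += v
    elif mode == 1:
        for v in values:
            total -= v
    else:
        for v in values:
            total += 2 * v
    return total
```

The price is code growth, one loop copy per case, which is why a per-loop budget of created statements matters.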

gcc/ChangeLog:

	* dbgcnt.def (DEBUG_COUNTER): Add loop_unswitch counter.
	* params.opt (max-unswitch-level): Remove.
	* doc/invoke.texi (max-unswitch-level): Likewise.
	* tree-cfg.cc (gimple_lv_add_condition_to_bb): Support not
	gimplified expressions.
	* tree-ssa-loop-unswitch.cc (struct unswitch_predicate): New.
	(tree_may_unswitch_on): Rename to ...
	(find_unswitching_predicates_for_bb): ... this and handle
	switch statements.
	(get_predicates_for_bb): Likewise.
	(set_predicates_for_bb): Likewise.
	(init_loop_unswitch_info): Likewise.
	(tree_ssa_unswitch_loops): Prepare stuff before calling
	tree_unswitch_single_loop.
	(tree_unswitch_single_loop): Rework the function using
	pre-computed predicates and with a per original loop cost model.
	(merge_last): New.
	(add_predicate_to_path): Likewise.
	(find_range_for_lhs): Likewise.
	(simplify_using_entry_checks): Rename to ...
	(evaluate_control_stmt_using_entry_checks): ... this, handle
	switch statements and improve simplifications using ranger.
	(simplify_loop_version): Rework using
	evaluate_control_stmt_using_entry_checks.
	(evaluate_bbs): New.
	(evaluate_loop_insns_for_predicate): Likewise.
	(tree_unswitch_loop): Adjust to allow switch statements and
	pass in the edge to unswitch.
	(clean_up_after_unswitching): New.
	(pass_tree_unswitch::execute): Pass down fun.

gcc/testsuite/ChangeLog:

	* gcc.dg/loop-unswitch-7.c: New test.
	* gcc.dg/loop-unswitch-8.c: New test.
	* gcc.dg/loop-unswitch-9.c: New test.
	* gcc.dg/loop-unswitch-10.c: New test.
	* gcc.dg/loop-unswitch-11.c: New test.
	* gcc.dg/loop-unswitch-12.c: New test.
	* gcc.dg/loop-unswitch-13.c: New test.
	* gcc.dg/loop-unswitch-14.c: New test.
	* gcc.dg/loop-unswitch-15.c: New test.
	* gcc.dg/loop-unswitch-16.c: New test.
	* gcc.dg/loop-unswitch-17.c: New test.
	* gcc.dg/torture/20220518-1.c: New test.
	* gcc.dg/torture/20220518-2.c: New test.
	* gcc.dg/torture/20220525-1.c: New test.
	* gcc.dg/alias-10.c: Adjust.
	* gcc.dg/tree-ssa/loop-6.c: Likewise.
	* gcc.dg/loop-unswitch-1.c: Likewise.

Co-authored-by: Richard Biener  <rguenther@suse.de>
2022-05-25 10:37:13 +02:00
Szabolcs Nagy 0d344b5576 aarch64: Fix pac-ret with unusual dwarf in libgcc unwinder [PR104689]
The RA_SIGN_STATE dwarf pseudo-register is normally only set using the
DW_CFA_AARCH64_negate_ra_state (== DW_CFA_window_save) operation which
toggles the return address signedness state (the default state is 0).
(It may be set by remember/restore_state CFI too, those save/restore
the state of all registers.)

However RA_SIGN_STATE can be set directly via DW_CFA_val_expression too.
GCC does not generate such CFI but some other compilers reportedly do.

Note: the toggle operation must not be mixed with other dwarf register
rule CFI within the same CIE and FDE.

In libgcc we assume REG_UNSAVED means the RA_STATE is set using toggle
operations, otherwise we assume its value is set by other CFI.

libgcc/ChangeLog:

	PR target/104689
	* config/aarch64/aarch64-unwind.h (aarch64_frob_update_context):
	Handle the !REG_UNSAVED case.
	* unwind-dw2.c (execute_cfa_program): Fail toggle if !REG_UNSAVED.

gcc/testsuite/ChangeLog:

	PR target/104689
	* gcc.target/aarch64/pr104689.c: New test.
2022-05-25 09:17:06 +01:00
GCC Administrator 768f49a20f Daily bump. 2022-05-25 00:17:06 +00:00
Eugene Rozenfeld 5af22024f6 Fix profile count maintenance in vectorizer peeling.
This patch changes the code to save/restore profile counts for
the epilog loop (when not using scalar loop in the epilog)
instead of scaling them down and then back up, which may lead
to problems if we scale down to 0.
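
The information loss can be seen with plain integer arithmetic. The sketch below is an assumption-laden model: the helper scale and its round-to-nearest behaviour are made up for illustration and do not reproduce GCC's profile_count class.

```python
def scale(count, num, den):
    """Scale an integer count by num/den, rounding to nearest (a model)."""
    return (count * num + den // 2) // den


original = 3

# Scale down by 1/10, then back up by 10: the value collapses to zero
# and multiplying zero cannot bring it back.
scaled_down = scale(original, 1, 10)
scaled_back = scale(scaled_down, 10, 1)

# Save and restore involves no arithmetic, so nothing is lost.
saved = original
restored = saved
```

Once a count has been scaled to 0, no later scale factor can recover the original value, whereas the saved copy is restored exactly.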

Tested on x86_64-pc-linux-gnu.

gcc/ChangeLog:

	* tree-vect-loop-manip.cc (vect_do_peeling): Save/restore profile
	counts for the epilog loop.
2022-05-24 16:49:45 -07:00
Martin Sebor 10d1986aee PR middle-end/105604 - ICE: in tree_to_shwi with vla in struct and sprintf
gcc/ChangeLog:

	PR middle-end/105604
	* gimple-ssa-sprintf.cc (set_aggregate_size_and_offset): Add comments.
	(get_origin_and_offset_r): Remove null handling.  Handle variable array
	sizes.
	(get_origin_and_offset): Handle null argument here.  Simplify.
	(alias_offset): Update comment.
	* pointer-query.cc (field_at_offset): Update comment.  Handle members
	of variable-length types.

gcc/testsuite/ChangeLog:

	PR middle-end/105604
	* gcc.dg/Wrestrict-24.c: New test.
	* gcc.dg/Wrestrict-25.c: New test.
	* gcc.dg/Wrestrict-26.c: New test.

Co-authored-by: Richard Biener <rguenther@suse.de>
2022-05-24 16:05:50 -06:00
Jason Merrill 1189c03859 c++: *this folding in constexpr call
The code in cxx_eval_call_expression to fold *this was doing the wrong thing
for array decay; we can use cxx_fold_indirect_ref instead.

gcc/cp/ChangeLog:

	* constexpr.cc (cxx_fold_indirect_ref): Add default arg.
	(cxx_eval_call_expression): Call it.
	(cxx_fold_indirect_ref_1): Handle null empty_base.
2022-05-24 15:52:03 -04:00
Joel Brobecker 0aee03cb63 gcc.misc-tests/outputs.exp: Use link test to check for -gsplit-dwarf support
We have noticed that, when running the GCC testsuite on AArch64
RTEMS 6, we have about 150 tests failing due to a link failure.
When investigating, we found that all the tests were failing
due to the use of -gsplit-dwarf.

On this platform, using -gsplit-dwarf currently causes an error
during the link:

    | /[...]/ld: a.out section `.unexpected_sections' will not fit
    |    in region `UNEXPECTED_SECTIONS'
    | /[...]/ld: region `UNEXPECTED_SECTIONS' overflowed by 56 bytes

The error is a bit cryptic, but the source of the issue is that
the linker does not currently support the sections generated
by -gsplit-dwarf (.debug_gnu_pubnames, .debug_gnu_pubtypes).
This means that the -gsplit-dwarf feature itself really isn't
supported on this platform, at least for the moment.

This commit enhances the -gsplit-dwarf support check to be
a compile-and-link check, rather than just a compile check.
This allows it to properly detect that this feature isn't
supported on platforms such as AArch64 RTEMS where the compilation
works, but not the link.

Tested on aarch64-rtems, where a little over 150 tests are now
passing, instead of failing, as well as on x86_64-linux, where
the results are identical, and where the .log file was also manually
inspected to make sure that the use of the -gsplit-dwarf option
was preserved.

gcc/testsuite/ChangeLog:

	* gcc.misc-tests/outputs.exp: Make the -gsplit-dwarf test
	a compile-and-link test rather than a compile-only test.
2022-05-24 12:51:42 -07:00
Jason Merrill 72f76540ad c++: discarded-value and constexpr
I've been thinking for a while that the 'lval' parameter needed a third
value for discarded-value expressions; most importantly,
cxx_eval_store_expression does extra work for an lvalue result, and we also
don't want to do the l->r conversion.

Mostly this is pretty mechanical.  Apart from the _store_ fix, I also use
vc_discard for substatements of a STATEMENT_LIST other than a stmt-expr
result, and avoid building _REFs to be ignored in a few other places.

gcc/cp/ChangeLog:

	* constexpr.cc (enum value_cat): New. Change all 'lval' parameters
	from int to value_cat.  Change most false to vc_prvalue, most true
	to vc_glvalue, cases where the return value is ignored to
	vc_discard.
	(cxx_eval_statement_list): Only vc_prvalue for stmt-expr result.
	(cxx_eval_store_expression): Only build _REF for vc_glvalue.
	(cxx_eval_array_reference, cxx_eval_component_reference)
	(cxx_eval_indirect_ref, cxx_eval_constant_expression): Likewise.
2022-05-24 15:50:26 -04:00
Jason Merrill 2540e2c604 c++: constexpr empty base redux [PR105622]
Here calling the constructor for s.__size_ had ctx->ctor for s itself
because cxx_eval_store_expression doesn't create a ctor for the empty field.
Then cxx_eval_call_expression returned the s initializer, and my empty base
overhaul in r13-160 got confused because the type of init is not an empty
class.  But that's OK, we should be checking the type of the original LHS
instead.  We also want to use initialized_type in the condition, in case
init is an AGGR_INIT_EXPR.

I spent quite a while working on more complex solutions before coming back
to this simple one.

	PR c++/105622

gcc/cp/ChangeLog:

	* constexpr.cc (cxx_eval_store_expression): Adjust assert.
	Use initialized_type.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp2a/no_unique_address14.C: New test.
2022-05-24 15:49:27 -04:00
Prathamesh Kulkarni ae8decf1d2 Add new parameter to vec_perm_const hook for specifying operand mode.
The rationale of the patch is to support vec_perm_expr of the form:
lhs = vec_perm_expr<rhs, mask>
where lhs and rhs are vector types with different lengths but have
same element type. For example, lhs is SVE vector and rhs
is corresponding AdvSIMD vector.

It would also allow expressing extract even/odd and interleave operations
with a VEC_PERM_EXPR.  The interleave currently has the issue that we have
to artificially widen the inputs with "dont-care" elements.
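
A small Python model may help to see why differing input and result lengths are useful. The helper vec_perm below is hypothetical and only mimics the selection semantics (indices into the concatenation of both inputs); it says nothing about how any target implements the hook.

```python
def vec_perm(v0, v1, sel):
    """Select elements from the concatenation of v0 and v1 by index."""
    both = list(v0) + list(v1)
    return [both[i] for i in sel]


a = ["a0", "a1", "a2", "a3"]
b = ["b0", "b1", "b2", "b3"]

# Interleave: the result has twice as many elements as each input, so no
# "dont-care" padding of the inputs is needed in this model.
interleaved = vec_perm(a, b, [0, 4, 1, 5, 2, 6, 3, 7])

# Extract even / odd: the result has half as many elements as the input.
wide = ["w0", "w1", "w2", "w3", "w4", "w5", "w6", "w7"]
even = vec_perm(wide, wide, [0, 2, 4, 6])
odd = vec_perm(wide, wide, [1, 3, 5, 7])
```

In both cases the length of the selector, not the length of the inputs, determines the length of the result.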

gcc/ChangeLog:

	* target.def (vec_perm_const): Define new parameter op_mode and
	update doc.
	* doc/tm.texi: Regenerate.
	* config/aarch64/aarch64.cc (aarch64_vectorize_vec_perm_const): Adjust
	vec_perm_const hook to add new parameter op_mode and return false
	if result and operand modes do not match.
	* config/arm/arm.cc (arm_vectorize_vec_perm_const): Likewise.
	* config/gcn/gcn.cc (gcn_vectorize_vec_perm_const): Likewise.
	* config/ia64/ia64.cc (ia64_vectorize_vec_perm_const): Likewise.
	* config/mips/mips.cc (mips_vectorize_vec_perm_const): Likewise.
	* config/rs6000/rs6000.cc (rs6000_vectorize_vec_perm_const): Likewise.
	* config/s390/s390.cc (s390_vectorize_vec_perm_const): Likewise.
	* config/sparc/sparc.cc (sparc_vectorize_vec_perm_const): Likewise.
	* config/i386/i386-expand.cc (ix86_vectorize_vec_perm_const): Likewise.
	* config/i386/i386-expand.h (ix86_vectorize_vec_perm_const): Adjust
	prototype.
	* config/i386/sse.md (ashrv4di3): Adjust call to vec_perm_const hook.
	(ashrv2di3): Likewise.
	* optabs.cc (expand_vec_perm_const): Likewise.
	* optabs-query.h (can_vec_perm_const_p): Adjust prototype.
	* optabs-query.cc (can_vec_perm_const_p): Define new parameter
	op_mode and pass it to vec_perm_const hook.
	(can_mult_highpart_p): Adjust call to can_vec_perm_const_p.
	* match.pd (vec_perm X Y CST): Likewise.
	* tree-ssa-forwprop.cc (simplify_vector_constructor): Likewise.
	* tree-vect-data-refs.cc (vect_grouped_store_supported): Likewise.
	(vect_grouped_load_supported): Likewise.
	(vect_shift_permute_load_chain): Likewise.
	* tree-vect-generic.cc (lower_vec_perm): Likewise.
	* tree-vect-loop-manip.cc (interleave_supported_p): Likewise.
	* tree-vect-loop.cc (have_whole_vector_shift): Likewise.
	* tree-vect-patterns.cc (vect_recog_rotate_pattern): Likewise.
	* tree-vect-slp.cc (can_duplicate_and_interleave_p): Likewise.
	(vect_transform_slp_perm_load): Likewise.
	(vectorizable_slp_permutation): Likewise.
	* tree-vect-stmts.cc (perm_mask_for_reverse): Likewise.
	(vectorizable_bswap): Likewise.
	(scan_store_can_perm_p): Likewise.
	(vect_gen_perm_mask_checked): Likewise.
2022-05-25 00:42:00 +05:30
H.J. Lu 2f4f7de787 x86: Document -mcet-switch
When -fcf-protection=branch is used, the compiler will generate jump
tables for switch statements where the indirect jump is prefixed with
the NOTRACK prefix, so it can jump to non-ENDBR targets.  Since the
indirect jump targets are generated by the compiler and stored in
read-only memory, this does not result in a direct loss of hardening.
But if the jump table index is attacker-controlled, the indirect jump
may not be constrained by CET.

Document -mcet-switch to generate jump tables for switch statements with
ENDBR and skip the NOTRACK prefix for indirect jump.  This option should
be used when the NOTRACK prefix is disabled.

	PR target/104816
	* config/i386/i386.opt: Remove Undocumented.
	* doc/invoke.texi: Document -mcet-switch.
2022-05-24 09:05:07 -07:00
Andrew Stubbs cde52d3a2d amdgcn: Add gfx90a support
This adds architecture options and multilibs for the AMD GFX90a GPUs.
It also tidies up some of the ISA selection code, and corrects a few small
mistakes in the gfx908 naming.

gcc/ChangeLog:

	* config.gcc (amdgcn): Accept --with-arch=gfx908 and gfx90a.
	* config/gcn/gcn-opts.h (enum gcn_isa): New.
	(TARGET_GCN3): Use enum gcn_isa.
	(TARGET_GCN3_PLUS): Likewise.
	(TARGET_GCN5): Likewise.
	(TARGET_GCN5_PLUS): Likewise.
	(TARGET_CDNA1): New.
	(TARGET_CDNA1_PLUS): New.
	(TARGET_CDNA2): New.
	(TARGET_CDNA2_PLUS): New.
	(TARGET_M0_LDS_LIMIT): New.
	(TARGET_PACKED_WORK_ITEMS): New.
	* config/gcn/gcn.cc (gcn_isa): Change to enum gcn_isa.
	(gcn_option_override): Recognise CDNA ISA variants.
	(gcn_omp_device_kind_arch_isa): Support gfx90a.
	(gcn_expand_prologue): Make m0 init optional.
	Add support for packed work items.
	(output_file_start): Support gfx90a.
	(gcn_hsa_declare_function_name): Support gfx90a metadata.
	* config/gcn/gcn.h (TARGET_CPU_CPP_BUILTINS): Add __CDNA1__ and
	__CDNA2__.
	* config/gcn/gcn.md (<su>mulsi3_highpart): Use TARGET_GCN5_PLUS.
	(<su>mulsi3_highpart_imm): Likewise.
	(<su>mulsidi3): Likewise.
	(<su>mulsidi3_imm): Likewise.
	* config/gcn/gcn.opt (gpu_type): Add gfx90a.
	* config/gcn/mkoffload.cc (EF_AMDGPU_MACH_AMDGCN_GFX90a): New.
	(main): Support gfx90a.
	* config/gcn/t-gcn-hsa: Add gfx90a multilib.
	* config/gcn/t-omp-device: Add gfx90a isa.

libgomp/ChangeLog:

	* plugin/plugin-gcn.c (EF_AMDGPU_MACH): Add
	EF_AMDGPU_MACH_AMDGCN_GFX90a.
	(gcn_gfx90a_s): New.
	(isa_hsa_name): Support gfx90a.
	(isa_code): Likewise.
2022-05-24 16:18:14 +01:00
Andrew Stubbs 8086230e7a amdgcn: Remove LLVM 9 assembler/linker support
The minimum required LLVM version is now 13.0.1, and is enforced by configure.

gcc/ChangeLog:

	* config.in: Regenerate.
	* config/gcn/gcn-hsa.h (X_FIJI): Delete.
	(X_900): Delete.
	(X_906): Delete.
	(X_908): Delete.
	(S_FIJI): Delete.
	(S_900): Delete.
	(S_906): Delete.
	(S_908): Delete.
	(NO_XNACK): New macro.
	(NO_SRAM_ECC): New macro.
	(SRAMOPT): Keep only v4 variant.
	(HSACO3_SELECT_OPT): Delete.
	(DRIVER_SELF_SPECS): Delete.
	(ASM_SPEC): Remove LLVM 9 support.
	* config/gcn/gcn-valu.md
	(gather<mode>_insn_2offsets<exec>): Remove assembler bug workaround.
	(scatter<mode>_insn_2offsets<exec_scatter>): Likewise.
	* config/gcn/gcn.cc (output_file_start): Remove LLVM 9 support.
	(print_operand_address): Remove assembler bug workaround.
	* config/gcn/mkoffload.cc (EF_AMDGPU_XNACK_V3): Delete.
	(EF_AMDGPU_SRAM_ECC_V3): Delete.
	(SET_XNACK_ON): Delete v3 variants.
	(SET_XNACK_OFF): Delete v3 variants.
	(TEST_XNACK): Delete v3 variants.
	(SET_SRAM_ECC_ON): Delete v3 variants.
	(SET_SRAM_ECC_ANY): Delete v3 variants.
	(SET_SRAM_ECC_OFF): Delete v3 variants.
	(SET_SRAM_ECC_UNSUPPORTED): Delete v3 variants.
	(TEST_SRAM_ECC_ANY): Delete v3 variants.
	(TEST_SRAM_ECC_ON): Delete v3 variants.
	(copy_early_debug_info): Remove v3 support.
	(main): Remove v3 support.
	* configure: Regenerate.
	* configure.ac: Replace all GCN feature checks with a version check.
2022-05-24 16:18:13 +01:00
David Malcolm 2c5c645663 libiberty: remove FINAL and OVERRIDE from ansidecl.h
libiberty's ansidecl.h provides macros FINAL and OVERRIDE to allow
virtual functions to be labelled with the C++11 "final" and "override"
specifiers, but with empty implementations on pre-C++11 C++ compilers.

We've used the macros in many places in GCC, but as of GCC 11
onwards GCC has required a C++11 compiler, such as GCC 4.8 or later.
On the assumption that any such compiler correctly implements "final"
and "override", I've simplified GCC's codebase by replacing all uses of
the FINAL and OVERRIDE macros in GCC's source tree with the lower-case
specifiers (via commits r13-690-gff171cb13df671 and
r13-716-g8473ef7be60443)

The macros are reportedly not used anywhere in binutils-gdb.

This patch completes this transition for GCC by eliminating the macros
from ansidecl.h.

include/ChangeLog:
	* ansidecl.h: Drop macros OVERRIDE and FINAL.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2022-05-24 10:22:37 -04:00
Roger Sayle e8a25550da Optimize double word negation of zero extended values on x86.
It's not uncommon for GCC to convert between a (zero or one) Boolean
value and a (zero or all ones) mask value, possibly of a wider type,
using negation.

Currently on x86_64, the following simple test case:
__int128 foo(unsigned long x) { return -(__int128)x; }

compiles with -O2 to:

        movq    %rdi, %rax
        xorl    %edx, %edx
        negq    %rax
        adcq    $0, %rdx
        negq    %rdx
        ret

with this patch, which adds an additional peephole2 to i386.md,
we instead generate the improved:

        movq    %rdi, %rax
        negq    %rax
        sbbq    %rdx, %rdx
        ret

[and likewise for the (DImode) long long version using -m32.]
A peephole2 is appropriate as the double word negation and the
operation providing the xor are typically only split after combine.
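
The equivalence of the two instruction sequences can be checked with a small Python emulation. This is a sketch under stated assumptions: the helper names are hypothetical, registers are modelled as 64-bit integers, and only the carry flag behaviour of neg (CF set when the operand is non-zero), adc and sbb is modelled.

```python
MASK64 = (1 << 64) - 1


def neg64(value):
    """Model x86 'neg': return (result, carry); CF is set iff value != 0."""
    return (-value) & MASK64, int(value != 0)


def old_sequence(x):
    rax, rdx = x & MASK64, 0         # movq %rdi, %rax ; xorl %edx, %edx
    rax, cf = neg64(rax)             # negq %rax
    rdx = (rdx + 0 + cf) & MASK64    # adcq $0, %rdx
    rdx, _ = neg64(rdx)              # negq %rdx
    return (rdx << 64) | rax


def new_sequence(x):
    rax = x & MASK64                 # movq %rdi, %rax
    rax, cf = neg64(rax)             # negq %rax
    rdx = (0 - cf) & MASK64          # sbbq %rdx, %rdx  (rdx - rdx - CF)
    return (rdx << 64) | rax


def reference(x):
    """-(__int128)x for a zero-extended 64-bit x, as a 128-bit pattern."""
    return (-(x & MASK64)) & ((1 << 128) - 1)
```

Since sbb computes rdx - rdx - CF, the previous contents of rdx cancel out, which is why the xor that zeroed the high word is no longer needed.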

In fact, the new peephole2 sequence:
;; Convert:
;;   xorl %edx, %edx
;;   negl %eax
;;   adcl $0, %edx
;;   negl %edx
;; to:
;;   negl %eax
;;   sbbl %edx, %edx    // *x86_mov<mode>cc_0_m1

is nearly identical to (and placed immediately after) the existing:
;; Convert:
;;   mov %esi, %edx
;;   negl %eax
;;   adcl $0, %edx
;;   negl %edx
;; to:
;;   xorl %edx, %edx
;;   negl %eax
;;   sbbl %esi, %edx

One potential objection/concern is that "sbb? %reg,%reg" may possibly be
incorrectly perceived as a false register dependency on older hardware,
much like "xor? %reg,%reg" may be perceived as a false dependency on
really old hardware.  This doesn't currently appear to be a concern
for the i386 backend's *x86_mov<mode>cc_0_m1 as shown by the following
test code:

int bar(unsigned int x, unsigned int y) {
  return x > y ? -1 : 0;
}

which currently generates a "naked" sbb:
        cmp     esi, edi
        sbb     eax, eax
        ret

If anyone does potentially encounter a stall, it would be easy to add
a splitter or peephole2 controlled by a tuning flag to insert an additional
xor to break the false dependency chain (when not optimizing for size),
but I don't believe this is required on recent microarchitectures.

2022-05-24 Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	* config/i386/i386.md (peephole2): Convert xor;neg;adc;neg,
	i.e. a double word negation of a zero extended operand, to
	neg;sbb.

gcc/testsuite/ChangeLog
	* gcc.target/i386/neg-zext-1.c: New test case for -m32.
	* gcc.target/i386/neg-zext-2.c: New test case for -m64.
2022-05-24 15:18:56 +01:00
Roger Sayle 793f847ba7 PR tree-optimization/105668: Provide vcond_mask_v1tiv1ti pattern.
This patch is an alternate/supplementary fix to PR tree-optimization/105668
that provides a vcond_mask_v1tiv1ti optab/define_expand to the i386 backend.
An undocumented feature/bug of GCC's vectorization is that any target that
provides a vec_cmpeq<mode><mode> has to also provide a matching
vcond_mask<mode><mode>.  This backend patch preserves the status quo,
rather than fixes the underlying problem.

One aspect of this clean-up is that ix86_expand_sse_movcc provides
fallback implementations using pand/pandn/por that effectively make
V2DImode and V1TImode vcond_mask available on any TARGET_SSE2, not
just TARGET_SSE4_2.  This allows a simplification as V2DI mode can
be handled by using a VI_128 mode iterator instead of a VI124_128
mode iterator, and instead this define_expand is effectively renamed
to provide a V1TImode vcond_mask expander (as V1TI isn't in VI_128).

2022-05-24  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	PR tree-optimization/105668
	* config/i386/i386-expand.cc (ix86_expand_sse_movcc): Support
	V1TImode, just like V2DImode.
	* config/i386/sse.md (vcond_mask_<mode><sseintvecmodelower>):
	Use VI_128 mode iterator instead of VI124_128 to include V2DI.
	(vcond_mask_v2div2di): Delete.
	(vcond_mask_v1tiv1ti): New define_expand.

gcc/testsuite/ChangeLog
	PR tree-optimization/105668
	* gcc.target/i386/pr105668.c: New test case.
2022-05-24 15:15:12 +01:00
Roger Sayle 9e7a0e42a1 Minor improvement to genpreds.cc
This simple patch implements Richard Biener's suggestion in comment #6
of PR tree-optimization/52171 (from February 2013) that the insn-preds
code generated by genpreds can avoid using strncmp when matching constant
strings of length one.

The effect of this patch is best explained by the diff of insn-preds.cc:
<       if (!strncmp (str + 1, "g", 1))
---
>       if (str[1] == 'g')
3104c3104
<       if (!strncmp (str + 1, "m", 1))
---
>       if (str[1] == 'm')
3106c3106
<       if (!strncmp (str + 1, "c", 1))
---
>       if (str[1] == 'c')
...

The equivalent optimization is performed by GCC (but perhaps not by the
host compiler), but generating simpler/smaller code may encourage further
optimizations (such as use of a switch statement).
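
The idea of special-casing one-character strings in a code generator can be sketched as follows. The function emit_match is hypothetical and written in Python for brevity; it is not the code of genpreds.cc, and it assumes the text needs no escaping in C.

```python
def emit_match(offset, text):
    """Return a C condition testing whether 'text' occurs at str + offset.

    Hypothetical generator helper: strings of length one become a plain
    character comparison, longer strings keep the strncmp call.
    """
    if len(text) == 1:
        # A single character needs no library call.
        return f"str[{offset}] == '{text}'"
    return f'!strncmp (str + {offset}, "{text}", {len(text)})'


short_form = emit_match(1, "g")
long_form = emit_match(1, "gm")
```

The generated condition for a single character has the same shape as the lines marked with > in the diff above.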

2022-05-24  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	* genpreds.cc (write_lookup_constraint_1): Avoid generating a call
	to strncmp for strings of length one.
2022-05-24 14:31:59 +01:00
Patrick Palka d0ef9e0619 c++: set TYPE_CANONICAL for more template types
When forming a class template specialization, lookup_template_class
uses structural equality for the specialized type whenever one of its
template arguments uses structural equality.  This is the sensible thing
to do in a vacuum, but given that we already effectively deduplicate class
specializations via the type_specializations table, we ought to be able
to safely assume that each class specialization is unique and therefore
canonical, regardless of the canonicity of the template arguments.
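
The argument that deduplication makes each specialization canonical can be illustrated by analogy with an interning table. The Python sketch below is not GCC's implementation: the class and function names are hypothetical, and the dictionary merely plays the role of the type_specializations table.

```python
class Specialization:
    """One class template specialization, e.g. vector<int> (a model)."""

    def __init__(self, template, args):
        self.template = template
        self.args = args


class SpecializationTable:
    """Hands out exactly one object per (template, arguments) pair."""

    def __init__(self):
        self._entries = {}

    def lookup(self, template, args):
        key = (template, tuple(args))
        if key not in self._entries:
            self._entries[key] = Specialization(template, tuple(args))
        return self._entries[key]


def structurally_equal(a, b):
    """Slow path: compare the pieces instead of the object identity."""
    return a.template == b.template and a.args == b.args


table = SpecializationTable()
v1 = table.lookup("vector", ["int"])
v2 = table.lookup("vector", ["int"])
v3 = table.lookup("vector", ["long"])
```

Because the table never creates a second object for the same key, an identity comparison gives the same answer as the structural comparison, only cheaper.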

To that end this patch makes us use the canonical type machinery for all
type specializations, except for the case where a PARM_DECL appears in
the template arguments (this special case was recently added by
r12-3766-g72394d38d929c7).

Additionally, this patch makes us use the canonical type machinery for
TEMPLATE_TEMPLATE_PARMs and BOUND_TEMPLATE_TEMPLATE_PARMs, by extending
canonical_type_parameter appropriately.  A comment in tsubst says it's
unsafe to set TYPE_CANONICAL for a lowered TEMPLATE_TEMPLATE_PARM, but
I'm not sure this is true anymore.  According to Jason, this comment
(from r120341) became obsolete when later that year r129844 started to
substitute the template parms of ttps.  Note that r10-7817-ga6f400239d792d
recently changed process_template_parm to clear TYPE_CANONICAL for
TEMPLATE_TEMPLATE_PARM consistent with the tsubst comment; this patch
changes both functions to set instead of clear TYPE_CANONICAL for ttps.

These changes improve compile time of template-heavy code by around 10%
for me (with a release compiler).  For instance, compile time for the
libstdc++ test std/ranges/adaptors/all.cc drops from 1.45s to 1.25s, and
for the range-v3 test test/view/zip.cpp from 5.38s to 4.88s.  The total
number of calls to structural_comptypes for the latter test drops from
10.5M to 1.8M.  Memory use is unaffected (as expected).

The new testcase verifies we check the r12-3766 PARM_DECL special case
in bind_template_template_parm too.

gcc/cp/ChangeLog:

	* cp-tree.h (any_template_arguments_need_structural_equality_p):
	Declare.
	* pt.cc (struct ctp_hasher): Define.
	(ctp_table): Define.
	(canonical_type_parameter): Use it.
	(process_template_parm): Set TYPE_CANONICAL for
	TEMPLATE_TEMPLATE_PARM too.
	(lookup_template_class_1): Remove now outdated comment for the
	any_template_arguments_need_structural_equality_p test.
	(tsubst) <case TEMPLATE_TEMPLATE_PARM, etc>: Don't specifically
	clear TYPE_CANONICAL for ttps.  Set TYPE_CANONICAL on the
	substituted type later.
	(any_template_arguments_need_structural_equality_p): Return
	true for any_targ_node.  Don't return true just because a
	template argument uses structural equality.  Add comment for
	the PARM_DECL special case.
	(rewrite_template_parm): Set TYPE_CANONICAL on the rewritten
	parm's type later.
	* tree.cc (bind_template_template_parm): Set TYPE_CANONICAL
	when safe to do so.
	* typeck.cc (structural_comptypes) [check_alias]: Increment
	processing_template_decl before checking
	dependent_alias_template_spec_p.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp0x/constexpr-52830a.C: New test.
2022-05-24 09:27:39 -04:00
David Malcolm 442cf0977a d: add 'final' and 'override' to gcc/d/*.cc 'visit' impls
gcc/d/ChangeLog:
	* decl.cc: Add "final" and "override" to all "visit" vfunc decls
	as appropriate.
	* expr.cc: Likewise.
	* toir.cc: Likewise.
	* typeinfo.cc: Likewise.
	* types.cc: Likewise.
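
For readers unfamiliar with the specifiers, here is a minimal sketch of what such a change looks like. The Visitor and DeclVisitor classes are hypothetical and are not taken from gcc/d; "override" makes the compiler reject a method that does not actually override a base-class virtual, and "final" forbids further overriding.

```cpp
#include <cassert>
#include <string>

// Hypothetical visitor base class with overloaded virtual 'visit' methods.
struct Visitor {
  virtual ~Visitor() = default;
  virtual std::string visit(int) { return "base:int"; }
  virtual std::string visit(double) { return "base:double"; }
};

// Hypothetical implementation class.
struct DeclVisitor : Visitor {
  using Visitor::visit;  // keep the visit(double) overload visible

  // A signature mismatch here (e.g. visit(long)) would now be a compile
  // error instead of silently declaring a new, unrelated function.
  std::string visit(int) final override { return "decl:int"; }
};
```

Calling visit(1) through a Visitor reference bound to a DeclVisitor dispatches to the derived implementation, while visit(2.0) still reaches the base-class version.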

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2022-05-24 09:07:22 -04:00
ShiYulong d44e471cf0 RISC-V: Cache Management Operation instructions testcases
This commit adds testcases about CMO instructions.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/cmo-zicbom-1.c: New test.
	* gcc.target/riscv/cmo-zicbom-2.c: New test.
	* gcc.target/riscv/cmo-zicbop-1.c: New test.
	* gcc.target/riscv/cmo-zicbop-2.c: New test.
	* gcc.target/riscv/cmo-zicboz-1.c: New test.
	* gcc.target/riscv/cmo-zicboz-2.c: New test.
2022-05-24 21:00:45 +08:00
ShiYulong 3df3ca9014 RISC-V: Cache Management Operation instructions
This commit adds cbo.clean, cbo.flush, cbo.inval, cbo.zero, prefetch.i,
prefetch.r and prefetch.w instructions.

Changes from the previous version:
We use unspec_volatile instead of unspec for those cache operations.
We use UNSPECV instead of UNSPEC and move them to unspecv.

gcc/ChangeLog:

	* config/riscv/predicates.md (imm5_operand): Add a new operand type for
	prefetch instructions.
	* config/riscv/riscv-builtins.cc (AVAIL): Add new AVAILs for CMO ISA
	Extensions.
	(RISCV_ATYPE_SI): New.
	(RISCV_ATYPE_DI): New.
	* config/riscv/riscv-ftypes.def (0): New.
	(1): New.
	* config/riscv/riscv.md (riscv_clean_<mode>): New.
	(riscv_flush_<mode>): New.
	(riscv_inval_<mode>): New.
	(riscv_zero_<mode>): New.
	(prefetch): New.
	(riscv_prefetchi_<mode>): New.
	* config/riscv/riscv-cmo.def: New file.
2022-05-24 21:00:39 +08:00
ShiYulong 23c738bcba RISC-V: Add minimal support for Zicbo[mzp]
This commit adds minimal support for 'Zicbom','Zicboz' and 'Zicbop' extensions.

gcc/ChangeLog:

	* common/config/riscv/riscv-common.cc: Add zicbom, zicboz, zicbop extensions.
	* config/riscv/riscv-opts.h (MASK_ZICBOZ): New.
	(MASK_ZICBOM): New.
	(MASK_ZICBOP): New.
	(TARGET_ZICBOZ): New.
	(TARGET_ZICBOM): New.
	(TARGET_ZICBOP): New.
	* config/riscv/riscv.opt (riscv_zicmo_subext): New.
2022-05-24 21:00:33 +08:00
David Malcolm 4665cfbc4c tree-vect-slp-patterns.cc: add 'final' and 'override' to vect_pattern::build impls
gcc/ChangeLog:
	* tree-vect-slp-patterns.cc: Add "final" and "override" to
	vect_pattern::build impls as appropriate.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2022-05-24 08:53:30 -04:00
David Malcolm f31ba11652 ipa: add 'final' and 'override' to call_summary_base vfunc impls
gcc/ChangeLog:
	* ipa-cp.cc: Add "final" and "override" to call_summary_base vfunc
	implementations, removing redundant "virtual" as appropriate.
	* ipa-fnsummary.h: Likewise.
	* ipa-modref.cc: Likewise.
	* ipa-param-manipulation.cc: Likewise.
	* ipa-profile.cc: Likewise.
	* ipa-prop.h: Likewise.
	* ipa-pure-const.cc: Likewise.
	* ipa-reference.cc: Likewise.
	* ipa-sra.cc: Likewise.
	* symbol-summary.h: Likewise.
	* symtab-thunks.cc: Likewise.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2022-05-24 08:51:00 -04:00
Martin Liska bd06c36f77 Revert "Mitigate -Wmaybe-uninitialized in expmed.cc."
This reverts commit c5c5237231.
2022-05-24 13:30:00 +02:00
Martin Liska c5c5237231 Mitigate -Wmaybe-uninitialized in expmed.cc.
It's the warning I see every time I build GCC:

In file included from /home/marxin/Programming/gcc/gcc/coretypes.h:478,
                 from /home/marxin/Programming/gcc/gcc/expmed.cc:26:
In function ‘poly_uint16 mode_to_bytes(machine_mode)’,
    inlined from ‘typename if_nonpoly<typename T::measurement_type>::type GET_MODE_SIZE(const T&) [with T = scalar_int_mode]’ at /home/marxin/Programming/gcc/gcc/machmode.h:647:24,
    inlined from ‘rtx_def* emit_store_flag_1(rtx, rtx_code, rtx, rtx, machine_mode, int, int, machine_mode)’ at /home/marxin/Programming/gcc/gcc/expmed.cc:5728:56:
/home/marxin/Programming/gcc/gcc/machmode.h:550:49: warning: ‘*(unsigned int*)((char*)&int_mode + offsetof(scalar_int_mode, scalar_int_mode::m_mode))’ may be used uninitialized [-Wmaybe-uninitialized]
  550 |           ? mode_size_inline (mode) : mode_size[mode]);
      |                                                 ^~~~
/home/marxin/Programming/gcc/gcc/expmed.cc: In function ‘rtx_def* emit_store_flag_1(rtx, rtx_code, rtx, rtx, machine_mode, int, int, machine_mode)’:
/home/marxin/Programming/gcc/gcc/expmed.cc:5657:19: note: ‘*(unsigned int*)((char*)&int_mode + offsetof(scalar_int_mode, scalar_int_mode::m_mode))’ was declared here
 5657 |   scalar_int_mode int_mode;
      |                   ^~~~~~~~

Can we please mitigate it?

gcc/ChangeLog:

	* expmed.cc (emit_store_flag_1): Mitigate -Wmaybe-uninitialized
	warning.
2022-05-24 13:26:47 +02:00
Bruno Haible 3677eb80b6 Extend --with-zstd documentation
The patch that was so far added for documenting --with-zstd is pretty
minimal:
  - it refers to undocumented options --with-zstd-include and
    --with-zstd-lib;
  - it suggests that --with-zstd can be used without an argument;
  - it does not clarify how this option applies to cross-compilation.

How about adding the same details as for the --with-isl,
--with-isl-include, --with-isl-lib options, mutatis mutandis? This patch
does that.

	PR other/105527

gcc/ChangeLog:

	* doc/install.texi (Configuration): Add more details about --with-zstd.
	Document --with-zstd-include and --with-zstd-lib.

Signed-off-by: Bruno Haible <bruno@clisp.org>
2022-05-24 13:23:43 +02:00
Richard Biener 91c7c5edd2 middle-end/105711 - properly handle CONST_INT when expanding bitfields
This is another place where we fail to pass down the mode of a
CONST_INT.

2022-05-24  Richard Biener  <rguenther@suse.de>

	PR middle-end/105711
	* expmed.cc (extract_bit_field_as_subreg): Add op0_mode parameter
	and use it.
	(extract_bit_field_1): Pass down the mode of op0 to
	extract_bit_field_as_subreg.

	* gcc.target/i386/pr105711.c: New testcase.
2022-05-24 12:12:13 +02:00
Tobias Burnus 4fb2b4f7ea OpenMP: Support nowait with Fortran [PR105378]
Fortran part to C/C++/libgomp
commit r13-724-gb43836914bdc2a37563cf31359b2c4803bfe4374

gcc/fortran/

	PR c/105378
	* openmp.cc (gfc_match_omp_taskwait): Accept nowait.

gcc/testsuite/

	PR c/105378
	* gfortran.dg/gomp/taskwait-depend-nowait-1.f90: New.

libgomp/

	PR c/105378
	* libgomp.texi (OpenMP 5.1): Set 'taskwait nowait' to 'Y'.
	* testsuite/libgomp.fortran/taskwait-depend-nowait-1.f90: New.
2022-05-24 10:45:26 +02:00
Vineet Gupta b646d7d279 RISC-V: Inhibit FP <--> int register moves via tune param
Under extreme register pressure, the compiler can use FP <--> int
moves as a cheap alternative to spilling to memory.
This was seen with SPEC2017 FP benchmark 507.cactu:
ML_BSSN_Advect.cc:ML_BSSN_Advect_Body()

|	fmv.d.x	fa5,s9	# PDupwindNthSymm2Xt1, PDupwindNthSymm2Xt1
| .LVL325:
|	ld	s9,184(sp)		# _12469, %sfp
| ...
| .LVL339:
|	fmv.x.d	s4,fa5	# PDupwindNthSymm2Xt1, PDupwindNthSymm2Xt1
|

The FMV instructions could be costlier (than stack spill) on certain
micro-architectures, thus this needs to be a per-cpu tunable
(default being to inhibit on all existing RV cpus).
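
The effect of such a tunable can be sketched as follows. This is a simplified illustration, not the code in riscv.cc: RegClass, TuneParam and register_move_cost are hypothetical names, and the baseline cost of 2 for moves within one register class is an assumption made for the example. Only the fmv_cost value of 8 comes from the ChangeLog below.

```cpp
#include <cassert>

// Hypothetical register classes and per-CPU tuning record.
enum class RegClass { GR, FP };

struct TuneParam {
  int fmv_cost;  // cost charged for an int <-> FP register move
};

// Return the cost the register allocator would see for a register move.
// A high fmv_cost makes cross-class moves look worse than a stack spill,
// which discourages the allocator from using them.
int register_move_cost(const TuneParam &tune, RegClass from, RegClass to) {
  if (from != to)
    return tune.fmv_cost;  // int <-> FP move: per-CPU tunable
  return 2;                // assumed baseline for same-class moves
}
```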

Without the fix, a testsuite run that includes the new test reports 10
failures, corresponding to the build variations of pr105666.c:

| 		=== gcc Summary ===
|
| # of expected passes		123318   (+10)
| # of unexpected failures	34       (-10)
| # of unexpected successes	4
| # of expected failures	780
| # of unresolved testcases	4
| # of unsupported tests	2796

gcc/ChangeLog:

	* config/riscv/riscv.cc (struct riscv_tune_param): Add
	  fmv_cost.
	(rocket_tune_info): Add default fmv_cost 8.
	(sifive_7_tune_info): Ditto.
	(thead_c906_tune_info): Ditto.
	(optimize_size_tune_info): Ditto.
	(riscv_register_move_cost): Use fmv_cost for int<->fp moves.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/pr105666.c: New test.

Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
2022-05-24 15:55:17 +08:00
Jakub Jelinek b43836914b openmp: Add taskwait nowait depend support [PR105378]
This patch adds support for (so far C/C++)
  #pragma omp taskwait nowait depend(...)
directive, which is like
  #pragma omp task depend(...)
  ;
but slightly optimized on the library side, so that it creates
the task only for the purpose of dependency tracking and doesn't actually
schedule it and wait for it when the dependencies are satisfied, instead
makes its dependencies satisfied right away.
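
A small usage sketch of the directive follows. The function name produce_then_consume is hypothetical. The example assumes a compiler that accepts the nowait clause on taskwait (which is what this patch adds) when built with -fopenmp; without -fopenmp the pragmas are ignored and the code runs sequentially with the same result.

```cpp
#include <cassert>

// Hypothetical example: chain two tasks through an empty dependency node.
int produce_then_consume() {
  int x = 0, y = 0;
  #pragma omp parallel
  #pragma omp single
  {
    // Producer task: writes x.
    #pragma omp task depend(out: x)
    x = 42;

    // Dependency-only node: does not block the encountering thread.
    #pragma omp taskwait depend(inout: x) nowait

    // Consumer task: ordered after the node above, hence after the producer.
    #pragma omp task depend(in: x) depend(out: y)
    y = x + 1;

    // Plain taskwait: wait for all child tasks before leaving the block.
    #pragma omp taskwait
  }
  return y;
}
```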

2022-05-24  Jakub Jelinek  <jakub@redhat.com>

	PR c/105378
gcc/
	* omp-builtins.def (BUILT_IN_GOMP_TASKWAIT_DEPEND_NOWAIT): New
	builtin.
	* gimplify.cc (gimplify_omp_task): Diagnose taskwait with nowait
	clause but no depend clauses.
	* omp-expand.cc (expand_taskwait_call): Use
	BUILT_IN_GOMP_TASKWAIT_DEPEND_NOWAIT rather than
	BUILT_IN_GOMP_TASKWAIT_DEPEND if nowait clause is present.
gcc/c/
	* c-parser.cc (OMP_TASKWAIT_CLAUSE_MASK): Add nowait clause.
gcc/cp/
	* parser.cc (OMP_TASKWAIT_CLAUSE_MASK): Add nowait clause.
gcc/testsuite/
	* c-c++-common/gomp/taskwait-depend-nowait-1.c: New test.
libgomp/
	* libgomp_g.h (GOMP_taskwait_depend_nowait): Declare.
	* libgomp.map (GOMP_taskwait_depend_nowait): Export at GOMP_5.1.1.
	* task.c (empty_task): New function.
	(gomp_task_run_post_handle_depend_hash): Declare earlier.
	(gomp_task_run_post_handle_depend): Declare.
	(GOMP_task): Optimize fn == empty_task if there is nothing to wait
	for.
	(gomp_task_run_post_handle_dependers): Optimize task->fn == empty_task.
	(GOMP_taskwait_depend_nowait): New function.
	* testsuite/libgomp.c-c++-common/taskwait-depend-nowait-1.c: New test.
2022-05-24 09:12:44 +02:00
Richard Biener 1adf11822b tree-optimization/100221 - improve DSE a bit
When facing multiple PHI defs and one feeding the other we can
postpone processing uses of one and thus can proceed.

2022-05-20  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/100221
	* tree-ssa-dse.cc (contains_phi_arg): New function.
	(dse_classify_store): Postpone PHI defs that feed another PHI in defs.

	* gcc.dg/tree-ssa/ssa-dse-44.c: New testcase.
	* gcc.dg/tree-ssa/ssa-dse-45.c: Likewise.
2022-05-24 08:20:11 +02:00
Richard Biener d918faea12 tree-optimization/105629 - spaceship recognition regression
With the extra GENERIC folding we now do to
(unsigned int) __v._M_value & 1 != (unsigned int) __v._M_value
we end up with a sign-extending conversion to unsigned int
rather than the sign-conversion to unsigned char we expect.
Relaxing that fixes the regression.

2022-05-23  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/105629
	* tree-ssa-phiopt.cc (spaceship_replacement): Allow
	a sign-extending conversion.
2022-05-24 08:20:11 +02:00
Kewen Lin 8fa8bca9f5 testsuite/rs6000: Adjust gcc.target/powerpc/pr78604.c [PR105706]
Commit r13-707 adjusts the below gimple:

  iftmp.7_4 = _1 < _2 ? val2_7(D) : val1_8(D);

to

  _3 = _1 >= _2;
  iftmp.7_4 = _3 ? val1_8(D) : val2_7(D);

and results in one more vect_model_simple_cost dump for each
function.  The match count needs to be adjusted accordingly.
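
The two gimple forms compute the same value; the rewrite only inverts the comparison and swaps the arms. A source-level sketch of that equivalence, using the hypothetical function names select_before and select_after:

```cpp
#include <cassert>

// Form before r13-707: the comparison is embedded in the conditional.
int select_before(int a, int b, int val1, int val2) {
  return a < b ? val2 : val1;
}

// Form after r13-707: the inverted comparison is a separate statement
// and the two arms are swapped.
int select_after(int a, int b, int val1, int val2) {
  bool c = a >= b;
  return c ? val1 : val2;
}
```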

	PR testsuite/105706

gcc/testsuite/ChangeLog:

	* gcc.target/powerpc/pr78604.c: Adjust.
2022-05-24 01:00:40 -05:00
Kewen Lin 149d04ccbb rs6000: Skip debug insns for union [PR105627]
As PR105627 exposes, the analyze_swaps pass should skip debug
insns when doing unionfind_union.  One debug insn can use
several pseudos; if we take debug insns into account, we can
union the insns defining them and generate some unexpected
unions.

Based on the assumption that a pseudo defined by a debug insn
cannot be used by a nondebug insn, we just assert that a debug
insn never shows up in function union_defs.

	PR target/105627

gcc/ChangeLog:

	* config/rs6000/rs6000-p8swap.cc (union_defs): Assert def_insn can't
	be a debug insn.
	(union_uses): Skip debug use_insn.

gcc/testsuite/ChangeLog:

	* gcc.target/powerpc/pr105627.c: New test.
2022-05-24 01:00:22 -05:00
GCC Administrator 168fc8bda1 Daily bump. 2022-05-24 00:17:03 +00:00
H.J. Lu f1a80c05db x86: Avoid uninitialized variable in PR target/104441 test
PR target/104441
	* gcc.target/i386/pr104441-1a.c (load8bit_4x4_avx2): Initialize
	src23.
2022-05-23 16:57:17 -07:00
Vineet Gupta ef85d150b5 RISC-V: Enable TARGET_SUPPORTS_WIDE_INT
This is on par with other major arches such as aarch64, i386, s390 ...

gcc/ChangeLog

	* config/riscv/predicates.md (const_0_operand): Remove
	const_double.
	* config/riscv/riscv.cc (riscv_rtx_costs): Add check for
	CONST_DOUBLE.
	* config/riscv/riscv.h (TARGET_SUPPORTS_WIDE_INT): New define.

Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-05-23 16:32:10 -07:00
David Malcolm 8473ef7be6 test plugins: use "final" and "override" directly, rather than via macros
gcc/testsuite/ChangeLog:
	* gcc.dg/plugin/analyzer_gil_plugin.c: Replace uses of "FINAL" and
	"OVERRIDE" with "final" and "override".

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2022-05-23 19:28:48 -04:00
David Malcolm 58c9c7407a jit: use 'final' and 'override' where appropriate
gcc/jit/ChangeLog:
	* jit-recording.h: Add "final" and "override" to all vfunc
	implementations that were missing them, as appropriate.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2022-05-23 15:09:30 -04:00
David Malcolm 2ac1459f04 analyzer: use 'final' and 'override' where appropriate
gcc/analyzer/ChangeLog:
	* call-info.cc: Add "final" and "override" to all vfunc
	implementations that were missing them, as appropriate.
	* engine.cc: Likewise.
	* region-model.cc: Likewise.
	* sm-malloc.cc: Likewise.
	* supergraph.h: Likewise.
	* svalue.cc: Likewise.
	* varargs.cc: Likewise.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2022-05-23 15:08:13 -04:00
Mayshao a239aff82c [x86_64]: Zhaoxin lujiazui enablement
This patch fixes the Zhaoxin CPU vendor ID detection problem and adds
Zhaoxin "lujiazui" processor support.  Currently GCC can't recognize
Zhaoxin CPUs (vendor ID "CentaurHauls" and "Shanghai") if the user passes
the -march=native option, which is confusing for users.  This patch enables
-march=native on Zhaoxin family 7 processors and adds
-march/-mtune=lujiazui; costs and tuning are set according to the
characteristics of the processor.
We add a new md file to describe the lujiazui pipeline.
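
To illustrate what vendor ID detection involves, the sketch below rebuilds the 12-byte CPUID vendor string from the EBX, EDX and ECX values of CPUID leaf 0 and compares it against the two vendor IDs named above. The helper names vendor_string and is_zhaoxin_vendor are hypothetical. The space-padded spelling "  Shanghai  " is an assumption based on the fixed 12-byte width of the vendor field. Real detection also has to check the CPU family, which this sketch omits.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>
#include <string>

// Rebuild the vendor string from the CPUID leaf 0 register values.
// The bytes are stored little-endian in the order EBX, EDX, ECX.
std::string vendor_string(std::uint32_t ebx, std::uint32_t edx,
                          std::uint32_t ecx) {
  std::string s;
  for (std::uint32_t reg : {ebx, edx, ecx})
    for (int i = 0; i < 4; ++i)
      s.push_back(static_cast<char>((reg >> (8 * i)) & 0xff));
  return s;
}

// Hypothetical check for the two vendor IDs mentioned in the commit
// message; the padded "  Shanghai  " spelling is an assumption.
bool is_zhaoxin_vendor(const std::string &vendor) {
  return vendor == "CentaurHauls" || vendor == "  Shanghai  ";
}
```

On real hardware the three register values would come from executing CPUID with EAX set to 0; here they are plain function arguments so the decoding can be exercised on any host.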

Testing:
Bootstrap is ok, and no regressions for i386/x86-64 testsuite.

Background:
Related Zhaoxin linux kernel patch can be found at:
https://lore.kernel.org/lkml/01042674b2f741b2aed1f797359bdffb@zhaoxin.com/

Related Zhaoxin glibc patch can be found at:
https://sourceware.org/git/?p=glibc.git;a=commit;h=32ac0b988466785d6e3cc1dffc364bb26fc63193

gcc/ChangeLog:

	* common/config/i386/cpuinfo.h (get_zhaoxin_cpu): Detect
	the specific type of Zhaoxin CPU, and return Zhaoxin CPU name.
	(cpu_indicator_init): Handle Zhaoxin processors.
	* common/config/i386/i386-common.cc: Add lujiazui.
	* common/config/i386/i386-cpuinfo.h (enum processor_vendor): Add
	VENDOR_ZHAOXIN.
	(enum processor_types): Add ZHAOXIN_FAM7H.
	(enum processor_subtypes): Add ZHAOXIN_FAM7H_LUJIAZUI.
	* config.gcc: Add lujiazui.
	* config/i386/cpuid.h (signature_SHANGHAI_ebx): Add
	signatures for Zhaoxin.
	(signature_SHANGHAI_ecx): Ditto.
	(signature_SHANGHAI_edx): Ditto.
	* config/i386/driver-i386.cc (host_detect_local_cpu): Let
	-march=native recognize lujiazui processors.
	* config/i386/i386-c.cc (ix86_target_macros_internal): Add lujiazui.
	* config/i386/i386-options.cc (m_LUJIAZUI): New definition.
	* config/i386/i386.h (enum processor_type): Ditto.
	* config/i386/i386.md: Add lujiazui.
	* config/i386/x86-tune-costs.h (struct processor_costs): Add
	lujiazui costs.
	* config/i386/x86-tune-sched.cc (ix86_issue_rate): Add lujiazui.
	(ix86_adjust_cost): Ditto.
	* config/i386/x86-tune.def (X86_TUNE_SCHEDULE): Add lujiazui tunings.
	(X86_TUNE_PARTIAL_REG_DEPENDENCY): Ditto.
	(X86_TUNE_SSE_PARTIAL_REG_DEPENDENCY): Ditto.
	(X86_TUNE_SSE_PARTIAL_REG_FP_CONVERTS_DEPENDENCY): Ditto.
	(X86_TUNE_SSE_PARTIAL_REG_CONVERTS_DEPENDENCY): Ditto.
	(X86_TUNE_MOVX): Ditto.
	(X86_TUNE_MEMORY_MISMATCH_STALL): Ditto.
	(X86_TUNE_FUSE_CMP_AND_BRANCH_32): Ditto.
	(X86_TUNE_FUSE_CMP_AND_BRANCH_64): Ditto.
	(X86_TUNE_FUSE_CMP_AND_BRANCH_SOFLAGS): Ditto.
	(X86_TUNE_FUSE_ALU_AND_BRANCH): Ditto.
	(X86_TUNE_ACCUMULATE_OUTGOING_ARGS): Ditto.
	(X86_TUNE_USE_LEAVE): Ditto.
	(X86_TUNE_PUSH_MEMORY): Ditto.
	(X86_TUNE_LCP_STALL): Ditto.
	(X86_TUNE_USE_INCDEC): Ditto.
	(X86_TUNE_INTEGER_DFMODE_MOVES): Ditto.
	(X86_TUNE_OPT_AGU): Ditto.
	(X86_TUNE_PREFER_KNOWN_REP_MOVSB_STOSB): Ditto.
	(X86_TUNE_MISALIGNED_MOVE_STRING_PRO_EPILOGUES): Ditto.
	(X86_TUNE_USE_SAHF): Ditto.
	(X86_TUNE_USE_BT): Ditto.
	(X86_TUNE_AVOID_FALSE_DEP_FOR_BMI): Ditto.
	(X86_TUNE_ONE_IF_CONV_INSN): Ditto.
	(X86_TUNE_AVOID_MFENCE): Ditto.
	(X86_TUNE_EXPAND_ABS): Ditto.
	(X86_TUNE_USE_SIMODE_FIOP): Ditto.
	(X86_TUNE_USE_FFREEP): Ditto.
	(X86_TUNE_EXT_80387_CONSTANTS): Ditto.
	(X86_TUNE_SSE_UNALIGNED_LOAD_OPTIMAL): Ditto.
	(X86_TUNE_SSE_UNALIGNED_STORE_OPTIMAL): Ditto.
	(X86_TUNE_SSE_TYPELESS_STORES): Ditto.
	(X86_TUNE_SSE_LOAD0_BY_PXOR): Ditto.
	* doc/extend.texi: Add details about lujiazui.
	* doc/invoke.texi: Add details about lujiazui.
	* config/i386/lujiazui.md: Introduce lujiazui cpu and include new md file.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/funcspec-56.inc: Test -arch=lujiazui and -tune=lujiazui.
	* g++.target/i386/mv32.C: Ditto.

Signed-off-by: mayshao <mayshao-oc@zhaoxin.com>
2022-05-23 17:53:27 +02:00