Commit Graph

193469 Commits

Author SHA1 Message Date
Jason Merrill 72f76540ad c++: discarded-value and constexpr
I've been thinking for a while that the 'lval' parameter needed a third
value for discarded-value expressions; most importantly,
cxx_eval_store_expression does extra work for an lvalue result, and we also
don't want to do the l->r conversion.

Mostly this is pretty mechanical.  Apart from the _store_ fix, I also use
vc_discard for substatements of a STATEMENT_LIST other than a stmt-expr
result, and avoid building _REFs to be ignored in a few other places.

gcc/cp/ChangeLog:

	* constexpr.cc (enum value_cat): New. Change all 'lval' parameters
	from int to value_cat.  Change most false to vc_prvalue, most true
	to vc_glvalue, cases where the return value is ignored to
	vc_discard.
	(cxx_eval_statement_list): Only vc_prvalue for stmt-expr result.
	(cxx_eval_store_expression): Only build _REF for vc_glvalue.
	(cxx_eval_array_reference, cxx_eval_component_reference)
	(cxx_eval_indirect_ref, cxx_eval_constant_expression): Likewise.
2022-05-24 15:50:26 -04:00
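The three-way distinction this commit describes can be pictured with a toy evaluator. This is only an illustrative sketch: the enumerator names come from the ChangeLog above, while `toy_result` and `toy_eval_store` are hypothetical and do not mirror the real code in constexpr.cc.

```cpp
#include <cassert>

// Enumerator names are taken from the ChangeLog; their numeric values are
// left to the compiler here and are not meant to match constexpr.cc.
enum value_cat { vc_discard, vc_prvalue, vc_glvalue };

// Hypothetical result of evaluating "target = init" in the toy model.
struct toy_result {
  int *ref;        // set only when the caller asked for a glvalue
  bool has_value;  // true only when the caller asked for a prvalue
  int value;
};

// Hypothetical stand-in for a store evaluator: the store always happens,
// but the result is only built in the form the caller asked for.
inline toy_result toy_eval_store(int &target, int init, value_cat lval) {
  target = init;
  switch (lval) {
    case vc_glvalue:
      return {&target, false, 0};      // build the reference
    case vc_prvalue:
      return {nullptr, true, target};  // read the stored value
    case vc_discard:
      break;                           // nothing to build
  }
  return {nullptr, false, 0};
}

// Small self-check covering the three cases.
inline void toy_value_cat_demo() {
  int x = 0;
  assert(toy_eval_store(x, 1, vc_glvalue).ref == &x);
  assert(toy_eval_store(x, 2, vc_prvalue).value == 2);
  const toy_result r = toy_eval_store(x, 3, vc_discard);
  assert(r.ref == nullptr && !r.has_value && x == 3);
}
```

With a plain boolean there is no way to express the third case, so a caller that ignores the result would still pay for building it.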
Jason Merrill 2540e2c604 c++: constexpr empty base redux [PR105622]
Here calling the constructor for s.__size_ had ctx->ctor for s itself
because cxx_eval_store_expression doesn't create a ctor for the empty field.
Then cxx_eval_call_expression returned the s initializer, and my empty base
overhaul in r13-160 got confused because the type of init is not an empty
class.  But that's OK, we should be checking the type of the original LHS
instead.  We also want to use initialized_type in the condition, in case
init is an AGGR_INIT_EXPR.

I spent quite a while working on more complex solutions before coming back
to this simple one.

	PR c++/105622

gcc/cp/ChangeLog:

	* constexpr.cc (cxx_eval_store_expression): Adjust assert.
	Use initialized_type.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp2a/no_unique_address14.C: New test.
2022-05-24 15:49:27 -04:00
Prathamesh Kulkarni ae8decf1d2 Add new parameter to vec_perm_const hook for specifying operand mode.
The rationale of the patch is to support vec_perm_expr of the form:
lhs = vec_perm_expr<rhs, mask>
where lhs and rhs are vector types with different lengths but have
same element type. For example, lhs is SVE vector and rhs
is corresponding AdvSIMD vector.

It would also allow extract even/odd and interleave operations to be
expressed with a VEC_PERM_EXPR.  The interleave currently has the issue that
we have to artificially widen the inputs with "don't-care" elements.

gcc/ChangeLog:

	* target.def (vec_perm_const): Define new parameter op_mode and
	update doc.
	* doc/tm.texi: Regenerate.
	* config/aarch64/aarch64.cc (aarch64_vectorize_vec_perm_const): Adjust
	vec_perm_const hook to add new parameter op_mode and return false
	if result and operand modes do not match.
	* config/arm/arm.cc (arm_vectorize_vec_perm_const): Likewise.
	* config/gcn/gcn.cc (gcn_vectorize_vec_perm_const): Likewise.
	* config/ia64/ia64.cc (ia64_vectorize_vec_perm_const): Likewise.
	* config/mips/mips.cc (mips_vectorize_vec_perm_const): Likewise.
	* config/rs6000/rs6000.cc (rs6000_vectorize_vec_perm_const): Likewise.
	* config/s390/s390.cc (s390_vectorize_vec_perm_const): Likewise.
	* config/sparc/sparc.cc (sparc_vectorize_vec_perm_const): Likewise.
	* config/i386/i386-expand.cc (ix86_vectorize_vec_perm_const): Likewise.
	* config/i386/i386-expand.h (ix86_vectorize_vec_perm_const): Adjust
	prototype.
	* config/i386/sse.md (ashrv4di3): Adjust call to vec_perm_const hook.
	(ashrv2di3): Likewise.
	* optabs.cc (expand_vec_perm_const): Likewise.
	* optabs-query.h (can_vec_perm_const_p): Adjust prototype.
	* optabs-query.cc (can_vec_perm_const_p): Define new parameter
	op_mode and pass it to vec_perm_const hook.
	(can_mult_highpart_p): Adjust call to can_vec_perm_const_p.
	* match.pd (vec_perm X Y CST): Likewise.
	* tree-ssa-forwprop.cc (simplify_vector_constructor): Likewise.
	* tree-vect-data-refs.cc (vect_grouped_store_supported): Likewise.
	(vect_grouped_load_supported): Likewise.
	(vect_shift_permute_load_chain): Likewise.
	* tree-vect-generic.cc (lower_vec_perm): Likewise.
	* tree-vect-loop-manip.cc (interleave_supported_p): Likewise.
	* tree-vect-loop.cc (have_whole_vector_shift): Likewise.
	* tree-vect-patterns.cc (vect_recog_rotate_pattern): Likewise.
	* tree-vect-slp.cc (can_duplicate_and_interleave_p): Likewise.
	(vect_transform_slp_perm_load): Likewise.
	(vectorizable_slp_permutation): Likewise.
	* tree-vect-stmts.cc (perm_mask_for_reverse): Likewise.
	(vectorizable_bswap): Likewise.
	(scan_store_can_perm_p): Likewise.
	(vect_gen_perm_mask_checked): Likewise.
2022-05-25 00:42:00 +05:30
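A permute whose result length differs from its operand length can be modelled in a few lines. The sketch below is a toy model only: `toy_vec_perm` is a hypothetical helper, it uses plain `std::vector` rather than machine modes, and it asserts on out-of-range selector indices instead of reproducing the exact VEC_PERM_EXPR selector rules.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy two-input permute: element i of the result is element sel[i] of the
// concatenation of a and b.  The result has sel.size() elements, which may
// differ from the length of the inputs.
inline std::vector<int> toy_vec_perm(const std::vector<int> &a,
                                     const std::vector<int> &b,
                                     const std::vector<std::size_t> &sel) {
  std::vector<int> out;
  out.reserve(sel.size());
  for (std::size_t idx : sel) {
    assert(idx < a.size() + b.size());
    out.push_back(idx < a.size() ? a[idx] : b[idx - a.size()]);
  }
  return out;
}

inline void toy_vec_perm_demo() {
  const std::vector<int> a{10, 11}, b{20, 21};
  // Interleave two 2-element inputs into a 4-element result, with no need
  // to pad the inputs to 4 elements first.
  assert((toy_vec_perm(a, b, {0, 2, 1, 3}) ==
          std::vector<int>{10, 20, 11, 21}));
  // Extract the even elements of a 4-element input into a 2-element result.
  const std::vector<int> w{1, 2, 3, 4};
  assert((toy_vec_perm(w, w, {0, 2}) == std::vector<int>{1, 3}));
}
```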
H.J. Lu 2f4f7de787 x86: Document -mcet-switch
When -fcf-protection=branch is used, the compiler will generate jump
tables for switch statements where the indirect jump is prefixed with
the NOTRACK prefix, so it can jump to non-ENDBR targets.  Since the
indirect jump targets are generated by the compiler and stored in
read-only memory, this does not result in a direct loss of hardening.
But if the jump table index is attacker-controlled, the indirect jump
may not be constrained by CET.

Document -mcet-switch to generate jump tables for switch statements with
ENDBR and skip the NOTRACK prefix for indirect jump.  This option should
be used when the NOTRACK prefix is disabled.

	PR target/104816
	* config/i386/i386.opt: Remove Undocumented.
	* doc/invoke.texi: Document -mcet-switch.
2022-05-24 09:05:07 -07:00
Andrew Stubbs cde52d3a2d amdgcn: Add gfx90a support
This adds architecture options and multilibs for the AMD GFX90a GPUs.
It also tidies up some of the ISA selection code, and corrects a few small
mistakes in the gfx908 naming.

gcc/ChangeLog:

	* config.gcc (amdgcn): Accept --with-arch=gfx908 and gfx90a.
	* config/gcn/gcn-opts.h (enum gcn_isa): New.
	(TARGET_GCN3): Use enum gcn_isa.
	(TARGET_GCN3_PLUS): Likewise.
	(TARGET_GCN5): Likewise.
	(TARGET_GCN5_PLUS): Likewise.
	(TARGET_CDNA1): New.
	(TARGET_CDNA1_PLUS): New.
	(TARGET_CDNA2): New.
	(TARGET_CDNA2_PLUS): New.
	(TARGET_M0_LDS_LIMIT): New.
	(TARGET_PACKED_WORK_ITEMS): New.
	* config/gcn/gcn.cc (gcn_isa): Change to enum gcn_isa.
	(gcn_option_override): Recognise CDNA ISA variants.
	(gcn_omp_device_kind_arch_isa): Support gfx90a.
	(gcn_expand_prologue): Make m0 init optional.
	Add support for packed work items.
	(output_file_start): Support gfx90a.
	(gcn_hsa_declare_function_name): Support gfx90a metadata.
	* config/gcn/gcn.h (TARGET_CPU_CPP_BUILTINS): Add __CDNA1__ and
	__CDNA2__.
	* config/gcn/gcn.md (<su>mulsi3_highpart): Use TARGET_GCN5_PLUS.
	(<su>mulsi3_highpart_imm): Likewise.
	(<su>mulsidi3): Likewise.
	(<su>mulsidi3_imm): Likewise.
	* config/gcn/gcn.opt (gpu_type): Add gfx90a.
	* config/gcn/mkoffload.cc (EF_AMDGPU_MACH_AMDGCN_GFX90a): New.
	(main): Support gfx90a.
	* config/gcn/t-gcn-hsa: Add gfx90a multilib.
	* config/gcn/t-omp-device: Add gfx90a isa.

libgomp/ChangeLog:

	* plugin/plugin-gcn.c (EF_AMDGPU_MACH): Add
	EF_AMDGPU_MACH_AMDGCN_GFX90a.
	(gcn_gfx90a_s): New.
	(isa_hsa_name): Support gfx90a.
	(isa_code): Likewise.
2022-05-24 16:18:14 +01:00
Andrew Stubbs 8086230e7a amdgcn: Remove LLVM 9 assembler/linker support
The minimum required LLVM version is now 13.0.1, and is enforced by configure.

gcc/ChangeLog:

	* config.in: Regenerate.
	* config/gcn/gcn-hsa.h (X_FIJI): Delete.
	(X_900): Delete.
	(X_906): Delete.
	(X_908): Delete.
	(S_FIJI): Delete.
	(S_900): Delete.
	(S_906): Delete.
	(S_908): Delete.
	(NO_XNACK): New macro.
	(NO_SRAM_ECC): New macro.
	(SRAMOPT): Keep only v4 variant.
	(HSACO3_SELECT_OPT): Delete.
	(DRIVER_SELF_SPECS): Delete.
	(ASM_SPEC): Remove LLVM 9 support.
	* config/gcn/gcn-valu.md
	(gather<mode>_insn_2offsets<exec>): Remove assembler bug workaround.
	(scatter<mode>_insn_2offsets<exec_scatter>): Likewise.
	* config/gcn/gcn.cc (output_file_start): Remove LLVM 9 support.
	(print_operand_address): Remove assembler bug workaround.
	* config/gcn/mkoffload.cc (EF_AMDGPU_XNACK_V3): Delete.
	(EF_AMDGPU_SRAM_ECC_V3): Delete.
	(SET_XNACK_ON): Delete v3 variants.
	(SET_XNACK_OFF): Delete v3 variants.
	(TEST_XNACK): Delete v3 variants.
	(SET_SRAM_ECC_ON): Delete v3 variants.
	(SET_SRAM_ECC_ANY): Delete v3 variants.
	(SET_SRAM_ECC_OFF): Delete v3 variants.
	(SET_SRAM_ECC_UNSUPPORTED): Delete v3 variants.
	(TEST_SRAM_ECC_ANY): Delete v3 variants.
	(TEST_SRAM_ECC_ON): Delete v3 variants.
	(copy_early_debug_info): Remove v3 support.
	(main): Remove v3 support.
	* configure: Regenerate.
	* configure.ac: Replace all GCN feature checks with a version check.
2022-05-24 16:18:13 +01:00
David Malcolm 2c5c645663 libiberty: remove FINAL and OVERRIDE from ansidecl.h
libiberty's ansidecl.h provides macros FINAL and OVERRIDE to allow
virtual functions to be labelled with the C++11 "final" and "override"
specifiers, but with empty implementations on pre-C++11 C++ compilers.

We've used the macros in many places in GCC, but from GCC 11
onwards GCC has required a C++11 compiler, such as GCC 4.8 or later.
On the assumption that any such compiler correctly implements "final"
and "override", I've simplified GCC's codebase by replacing all uses of
the FINAL and OVERRIDE macros in GCC's source tree with the lower-case
specifiers (via commits r13-690-gff171cb13df671 and
r13-716-g8473ef7be60443).

The macros are reportedly not used anywhere in binutils-gdb.

This patch completes this transition for GCC by eliminating the macros
from ansidecl.h.

include/ChangeLog:
	* ansidecl.h: Drop macros OVERRIDE and FINAL.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2022-05-24 10:22:37 -04:00
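For readers unfamiliar with the macros, the sketch below contrasts the two spellings. The macro names come from the commit message, but the `__cplusplus` guard is an approximation of the idea rather than the text of ansidecl.h, and the `visitor` classes are hypothetical.

```cpp
#include <cassert>

// Approximation of the compatibility macros: they expand to the C++11
// specifiers when available and to nothing otherwise.
#if __cplusplus >= 201103L
# define OVERRIDE override
# define FINAL final
#else
# define OVERRIDE
# define FINAL
#endif

struct visitor {
  virtual ~visitor() {}
  virtual int visit() { return 0; }
};

// Old style: specifiers spelled through the macros.
struct old_style_visitor : visitor {
  int visit() FINAL OVERRIDE { return 1; }
};

// New style: the plain C++11 specifiers.
struct new_style_visitor : visitor {
  int visit() final override { return 2; }
};

inline int call_visit(visitor &v) { return v.visit(); }

inline void final_override_demo() {
  old_style_visitor o;
  new_style_visitor n;
  assert(call_visit(o) == 1);
  assert(call_visit(n) == 2);
}
```

Once every supported compiler implements C++11, both spellings mean the same thing, so the macro layer only adds indirection.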
Roger Sayle e8a25550da Optimize double word negation of zero extended values on x86.
It's not uncommon for GCC to convert between a (zero or one) Boolean
value and a (zero or all ones) mask value, possibly of a wider type,
using negation.

Currently on x86_64, the following simple test case:
__int128 foo(unsigned long x) { return -(__int128)x; }

compiles with -O2 to:

        movq    %rdi, %rax
        xorl    %edx, %edx
        negq    %rax
        adcq    $0, %rdx
        negq    %rdx
        ret

with this patch, which adds an additional peephole2 to i386.md,
we instead generate the improved:

        movq    %rdi, %rax
        negq    %rax
        sbbq    %rdx, %rdx
        ret

[and likewise for the (DImode) long long version using -m32.]
A peephole2 is appropriate as the double word negation and the
operation providing the xor are typically only split after combine.

In fact, the new peephole2 sequence:
;; Convert:
;;   xorl %edx, %edx
;;   negl %eax
;;   adcl $0, %edx
;;   negl %edx
;; to:
;;   negl %eax
;;   sbbl %edx, %edx    // *x86_mov<mode>cc_0_m1

is nearly identical to (and placed immediately after) the existing:
;; Convert:
;;   mov %esi, %edx
;;   negl %eax
;;   adcl $0, %edx
;;   negl %edx
;; to:
;;   xorl %edx, %edx
;;   negl %eax
;;   sbbl %esi, %edx

One potential objection/concern is that "sbb? %reg,%reg" may possibly be
incorrectly perceived as a false register dependency on older hardware,
much like "xor? %reg,%reg" may be perceived as a false dependency on
really old hardware.  This doesn't currently appear to be a concern
for the i386 backend's *x86_mov<mode>cc_0_m1 as shown by the following
test code:

int bar(unsigned int x, unsigned int y) {
  return x > y ? -1 : 0;
}

which currently generates a "naked" sbb:
        cmp     esi, edi
        sbb     eax, eax
        ret

If anyone does potentially encounter a stall, it would be easy to add
a splitter or peephole2 controlled by a tuning flag to insert an additional
xor to break the false dependency chain (when not optimizing for size),
but I don't believe this is required on recent microarchitectures.

2022-05-24 Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	* config/i386/i386.md (peephole2): Convert xor;neg;adc;neg,
	i.e. a double word negation of a zero extended operand, to
	neg;sbb.

gcc/testsuite/ChangeLog
	* gcc.target/i386/neg-zext-1.c: New test case for -m32.
	* gcc.target/i386/neg-zext-2.c: New test case for -m64.
2022-05-24 15:18:56 +01:00
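The arithmetic behind the neg;sbb sequence can be checked in portable code. The sketch below is an illustration, not part of the patch: it uses the unsigned `__int128` extension type (available in GCC and Clang on 64-bit targets) to avoid signed overflow, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Reference: zero-extend to 128 bits, then negate.
inline unsigned __int128 neg_zext_reference(std::uint64_t x) {
  return -static_cast<unsigned __int128>(x);
}

// Model of neg;sbb: the low word is 0 - x, the subtraction borrows exactly
// when x is nonzero, and the high word becomes 0 or all ones accordingly.
inline unsigned __int128 neg_zext_model(std::uint64_t x) {
  const std::uint64_t lo = 0 - x;                  // negq %rax
  const std::uint64_t carry = (x != 0) ? 1u : 0u;  // carry flag after neg
  const std::uint64_t hi = 0 - carry;              // sbbq %rdx, %rdx
  return (static_cast<unsigned __int128>(hi) << 64) | lo;
}

inline void neg_zext_demo() {
  const std::uint64_t samples[] = {0u, 1u, 2u, 0x8000000000000000u,
                                   0xFFFFFFFFFFFFFFFFu};
  for (std::uint64_t x : samples)
    assert(neg_zext_model(x) == neg_zext_reference(x));
}
```

Because the high word of the zero-extended operand is known to be zero, the high word of the result depends only on the carry out of the low-word negation, which is why the xor/adc/neg triple is redundant.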
Roger Sayle 793f847ba7 PR tree-optimization/105668: Provide vcond_mask_v1tiv1ti pattern.
This patch is an alternate/supplementary fix to PR tree-optimization/105668
that provides a vcond_mask_v1tiv1ti optab/define_expand to the i386 backend.
An undocumented feature/bug of GCC's vectorization is that any target that
provides a vec_cmpeq<mode><mode> has to also provide a matching
vcond_mask<mode><mode>.  This backend patch preserves the status quo,
rather than fixes the underlying problem.

One aspect of this clean-up is that ix86_expand_sse_movcc provides
fallback implementations using pand/pandn/por that effectively make
V2DImode and V1TImode vcond_mask available on any TARGET_SSE2, not
just TARGET_SSE4_2.  This allows a simplification as V2DI mode can
be handled by using a VI_128 mode iterator instead of a VI124_128
mode iterator, and instead this define_expand is effectively renamed
to provide a V1TImode vcond_mask expander (as V1TI isn't in VI_128).

2022-05-24  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	PR tree-optimization/105668
	* config/i386/i386-expand.cc (ix86_expand_sse_movcc): Support
	V1TImode, just like V2DImode.
	* config/i386/sse.md (vcond_mask_<mode><sseintvecmodelower>):
	Use VI_128 mode iterator instead of VI124_128 to include V2DI.
	(vcond_mask_v2div2di): Delete.
	(vcond_mask_v1tiv1ti): New define_expand.

gcc/testsuite/ChangeLog
	PR tree-optimization/105668
	* gcc.target/i386/pr105668.c: New test case.
2022-05-24 15:15:12 +01:00
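The pand/pandn/por fallback mentioned above is the classic bitwise select. The scalar sketch below shows it on one 64-bit lane; `toy_select` is a hypothetical name, and the choice of which operand the set mask bits pick is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Take the bits of a where the mask is set and the bits of b where it is
// clear, using only and, and-not and or.
inline std::uint64_t toy_select(std::uint64_t mask, std::uint64_t a,
                                std::uint64_t b) {
  const std::uint64_t from_a = mask & a;   // pand
  const std::uint64_t from_b = ~mask & b;  // pandn
  return from_a | from_b;                  // por
}

inline void toy_select_demo() {
  assert(toy_select(~std::uint64_t{0}, 1, 2) == 1);  // all-ones mask picks a
  assert(toy_select(0, 1, 2) == 2);                  // zero mask picks b
  assert(toy_select(0xFF00, 0xABCD, 0x1234) == 0xAB34);
}
```

None of the three operations depends on the lane width, which is consistent with the message's point that the fallback works for V2DImode and V1TImode on any SSE2 target.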
Roger Sayle 9e7a0e42a1 Minor improvement to genpreds.cc
This simple patch implements Richard Biener's suggestion in comment #6
of PR tree-optimization/52171 (from February 2013) that the insn-preds
code generated by genpreds can avoid using strncmp when matching constant
strings of length one.

The effect of this patch is best explained by the diff of insn-preds.cc:
<       if (!strncmp (str + 1, "g", 1))
---
>       if (str[1] == 'g')
3104c3104
<       if (!strncmp (str + 1, "m", 1))
---
>       if (str[1] == 'm')
3106c3106
<       if (!strncmp (str + 1, "c", 1))
---
>       if (str[1] == 'c')
...

The equivalent optimization is performed by GCC (but perhaps not by the
host compiler), but generating simpler/smaller code may encourage further
optimizations (such as use of a switch statement).

2022-05-24  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	* genpreds.cc (write_lookup_constraint_1): Avoid generating a call
	to strncmp for strings of length one.
2022-05-24 14:31:59 +01:00
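The equivalence that the generated code relies on can be checked directly. In this sketch the two helper names are hypothetical, and the input string is assumed to have at least one character so that `str + 1` is a valid pointer.

```cpp
#include <cassert>
#include <cstring>

// Form generated before the patch: compare one character with strncmp.
inline bool toy_match_strncmp(const char *str) {
  return !std::strncmp(str + 1, "g", 1);
}

// Form generated after the patch: compare the character directly.
inline bool toy_match_char(const char *str) {
  return str[1] == 'g';
}

inline void toy_match_demo() {
  const char *samples[] = {"Yg", "Ym", "Y", "Ygx"};
  for (const char *s : samples)
    assert(toy_match_strncmp(s) == toy_match_char(s));
}
```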
Patrick Palka d0ef9e0619 c++: set TYPE_CANONICAL for more template types
When forming a class template specialization, lookup_template_class
uses structural equality for the specialized type whenever one of its
template arguments uses structural equality.  This is the sensible thing
to do in a vacuum, but given that we already effectively deduplicate class
specializations via the type_specializations table, we ought to be able
to safely assume that each class specialization is unique and therefore
canonical, regardless of the canonicity of the template arguments.

To that end this patch makes us use the canonical type machinery for all
type specializations, except for the case where a PARM_DECL appears in
the template arguments (this special case was recently added by
r12-3766-g72394d38d929c7).

Additionally, this patch makes us use the canonical type machinery for
TEMPLATE_TEMPLATE_PARMs and BOUND_TEMPLATE_TEMPLATE_PARMs, by extending
canonical_type_parameter appropriately.  A comment in tsubst says it's
unsafe to set TYPE_CANONICAL for a lowered TEMPLATE_TEMPLATE_PARM, but
I'm not sure this is true anymore.  According to Jason, this comment
(from r120341) became obsolete when later that year r129844 started to
substitute the template parms of ttps.  Note that r10-7817-ga6f400239d792d
recently changed process_template_parm to clear TYPE_CANONICAL for
TEMPLATE_TEMPLATE_PARM consistent with the tsubst comment; this patch
changes both functions to set instead of clear TYPE_CANONICAL for ttps.

These changes improve compile time of template-heavy code by around 10%
for me (with a release compiler).  For instance, compile time for the
libstdc++ test std/ranges/adaptors/all.cc drops from 1.45s to 1.25s, and
for the range-v3 test test/view/zip.cpp from 5.38s to 4.88s.  The total
number of calls to structural_comptypes for the latter test drops from
10.5M to 1.8M.  Memory use is unaffected (as expected).

The new testcase verifies we check the r12-3766 PARM_DECL special case
in bind_template_template_parm too.

gcc/cp/ChangeLog:

	* cp-tree.h (any_template_arguments_need_structural_equality_p):
	Declare.
	* pt.cc (struct ctp_hasher): Define.
	(ctp_table): Define.
	(canonical_type_parameter): Use it.
	(process_template_parm): Set TYPE_CANONICAL for
	TEMPLATE_TEMPLATE_PARM too.
	(lookup_template_class_1): Remove now outdated comment for the
	any_template_arguments_need_structural_equality_p test.
	(tsubst) <case TEMPLATE_TEMPLATE_PARM, etc>: Don't specifically
	clear TYPE_CANONICAL for ttps.  Set TYPE_CANONICAL on the
	substituted type later.
	(any_template_arguments_need_structural_equality_p): Return
	true for any_targ_node.  Don't return true just because a
	template argument uses structural equality.  Add comment for
	the PARM_DECL special case.
	(rewrite_template_parm): Set TYPE_CANONICAL on the rewritten
	parm's type later.
	* tree.cc (bind_template_template_parm): Set TYPE_CANONICAL
	when safe to do so.
	* typeck.cc (structural_comptypes) [check_alias]: Increment
	processing_template_decl before checking
	dependent_alias_template_spec_p.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp0x/constexpr-52830a.C: New test.
2022-05-24 09:27:39 -04:00
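The difference between structural comparison and canonical-type comparison can be sketched with a small interning table. Everything here is hypothetical and much simpler than the compiler's data structures: `toy_type`, `toy_type_table` and `toy_structural_equal` only illustrate why deduplicated nodes can be compared by pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct toy_type {
  std::size_t id;  // unique per node created by the table
  std::string name;
  std::vector<const toy_type *> args;
};

// Structural comparison: walks both types recursively.
inline bool toy_structural_equal(const toy_type &a, const toy_type &b) {
  if (a.name != b.name || a.args.size() != b.args.size()) return false;
  for (std::size_t i = 0; i < a.args.size(); ++i)
    if (!toy_structural_equal(*a.args[i], *b.args[i])) return false;
  return true;
}

// Interning table: one node per (name, argument ids) key, so two requests
// for the same specialization return the same pointer.
class toy_type_table {
 public:
  const toy_type *get(const std::string &name,
                      const std::vector<const toy_type *> &args = {}) {
    std::vector<std::size_t> arg_ids;
    for (const toy_type *arg : args) arg_ids.push_back(arg->id);
    const key_type key(name, arg_ids);
    const auto it = index_.find(key);
    if (it != index_.end()) return it->second;
    nodes_.push_back(
        std::unique_ptr<toy_type>(new toy_type{nodes_.size(), name, args}));
    index_[key] = nodes_.back().get();
    return nodes_.back().get();
  }

 private:
  typedef std::pair<std::string, std::vector<std::size_t> > key_type;
  std::map<key_type, const toy_type *> index_;
  std::vector<std::unique_ptr<toy_type> > nodes_;
};

inline void toy_canonical_demo() {
  toy_type_table table;
  const toy_type *i = table.get("int");
  const toy_type *v1 = table.get("vector", {i});
  const toy_type *v2 = table.get("vector", {i});
  assert(v1 == v2);  // deduplicated: a pointer comparison is enough
  const toy_type copy{99, "vector", {i}};  // built outside the table
  assert(&copy != v1 && toy_structural_equal(copy, *v1));
}
```

In this model every lookup through the table yields a unique node, so equality is a single pointer comparison; only nodes built outside the table need the recursive walk.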
David Malcolm 442cf0977a d: add 'final' and 'override' to gcc/d/*.cc 'visit' impls
gcc/d/ChangeLog:
	* decl.cc: Add "final" and "override" to all "visit" vfunc decls
	as appropriate.
	* expr.cc: Likewise.
	* toir.cc: Likewise.
	* typeinfo.cc: Likewise.
	* types.cc: Likewise.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2022-05-24 09:07:22 -04:00
ShiYulong d44e471cf0 RISC-V: Cache Management Operation instructions testcases
This commit adds testcases about CMO instructions.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/cmo-zicbom-1.c: New test.
	* gcc.target/riscv/cmo-zicbom-2.c: New test.
	* gcc.target/riscv/cmo-zicbop-1.c: New test.
	* gcc.target/riscv/cmo-zicbop-2.c: New test.
	* gcc.target/riscv/cmo-zicboz-1.c: New test.
	* gcc.target/riscv/cmo-zicboz-2.c: New test.
2022-05-24 21:00:45 +08:00
ShiYulong 3df3ca9014 RISC-V: Cache Management Operation instructions
This commit adds cbo.clean, cbo.flush, cbo.inval, cbo.zero, prefetch.i,
prefetch.r and prefetch.w instructions.

Differences from the previous version:
We use unspec_volatile instead of unspec for those cache operations.
We use UNSPECV instead of UNSPEC and move them to unspecv.

gcc/ChangeLog:

	* config/riscv/predicates.md (imm5_operand): Add a new operand type for
	prefetch instructions.
	* config/riscv/riscv-builtins.cc (AVAIL): Add new AVAILs for CMO ISA
	Extensions.
	(RISCV_ATYPE_SI): New.
	(RISCV_ATYPE_DI): New.
	* config/riscv/riscv-ftypes.def (0): New.
	(1): New.
	* config/riscv/riscv.md (riscv_clean_<mode>): New.
	(riscv_flush_<mode>): New.
	(riscv_inval_<mode>): New.
	(riscv_zero_<mode>): New.
	(prefetch): New.
	(riscv_prefetchi_<mode>): New.
	* config/riscv/riscv-cmo.def: New file.
2022-05-24 21:00:39 +08:00
ShiYulong 23c738bcba RISC-V: Add minimal support for Zicbo[mzp]
This commit adds minimal support for 'Zicbom', 'Zicboz' and 'Zicbop' extensions.

gcc/ChangeLog:

	* common/config/riscv/riscv-common.cc: Add zicbom, zicboz, zicbop extensions.
	* config/riscv/riscv-opts.h (MASK_ZICBOZ): New.
	(MASK_ZICBOM): New.
	(MASK_ZICBOP): New.
	(TARGET_ZICBOZ): New.
	(TARGET_ZICBOM): New.
	(TARGET_ZICBOP): New.
	* config/riscv/riscv.opt (riscv_zicmo_subext): New.
2022-05-24 21:00:33 +08:00
David Malcolm 4665cfbc4c tree-vect-slp-patterns.cc: add 'final' and 'override' to vect_pattern::build impls
gcc/ChangeLog:
	* tree-vect-slp-patterns.cc: Add "final" and "override" to
	vect_pattern::build impls as appropriate.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2022-05-24 08:53:30 -04:00
David Malcolm f31ba11652 ipa: add 'final' and 'override' to call_summary_base vfunc impls
gcc/ChangeLog:
	* ipa-cp.cc: Add "final" and "override" to call_summary_base vfunc
	implementations, removing redundant "virtual" as appropriate.
	* ipa-fnsummary.h: Likewise.
	* ipa-modref.cc: Likewise.
	* ipa-param-manipulation.cc: Likewise.
	* ipa-profile.cc: Likewise.
	* ipa-prop.h: Likewise.
	* ipa-pure-const.cc: Likewise.
	* ipa-reference.cc: Likewise.
	* ipa-sra.cc: Likewise.
	* symbol-summary.h: Likewise.
	* symtab-thunks.cc: Likewise.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2022-05-24 08:51:00 -04:00
Martin Liska bd06c36f77 Revert "Mitigate -Wmaybe-uninitialized in expmed.cc."
This reverts commit c5c5237231.
2022-05-24 13:30:00 +02:00
Martin Liska c5c5237231 Mitigate -Wmaybe-uninitialized in expmed.cc.
It's the warning I see every time I build GCC:

In file included from /home/marxin/Programming/gcc/gcc/coretypes.h:478,
                 from /home/marxin/Programming/gcc/gcc/expmed.cc:26:
In function ‘poly_uint16 mode_to_bytes(machine_mode)’,
    inlined from ‘typename if_nonpoly<typename T::measurement_type>::type GET_MODE_SIZE(const T&) [with T = scalar_int_mode]’ at /home/marxin/Programming/gcc/gcc/machmode.h:647:24,
    inlined from ‘rtx_def* emit_store_flag_1(rtx, rtx_code, rtx, rtx, machine_mode, int, int, machine_mode)’ at /home/marxin/Programming/gcc/gcc/expmed.cc:5728:56:
/home/marxin/Programming/gcc/gcc/machmode.h:550:49: warning: ‘*(unsigned int*)((char*)&int_mode + offsetof(scalar_int_mode, scalar_int_mode::m_mode))’ may be used uninitialized [-Wmaybe-uninitialized]
  550 |           ? mode_size_inline (mode) : mode_size[mode]);
      |                                                 ^~~~
/home/marxin/Programming/gcc/gcc/expmed.cc: In function ‘rtx_def* emit_store_flag_1(rtx, rtx_code, rtx, rtx, machine_mode, int, int, machine_mode)’:
/home/marxin/Programming/gcc/gcc/expmed.cc:5657:19: note: ‘*(unsigned int*)((char*)&int_mode + offsetof(scalar_int_mode, scalar_int_mode::m_mode))’ was declared here
 5657 |   scalar_int_mode int_mode;
      |                   ^~~~~~~~

Can we please mitigate it?

gcc/ChangeLog:

	* expmed.cc (emit_store_flag_1): Mitigate -Wmaybe-uninitialized
	warning.
2022-05-24 13:26:47 +02:00
Bruno Haible 3677eb80b6 Extend --with-zstd documentation
The patch that was so far added for documenting --with-zstd is pretty
minimal:
  - it refers to undocumented options --with-zstd-include and
    --with-zstd-lib;
  - it suggests that --with-zstd can be used without an argument;
  - it does not clarify how this option applies to cross-compilation.

How about adding the same details as for the --with-isl,
--with-isl-include, --with-isl-lib options, mutatis mutandis? This patch
does that.

	PR other/105527

gcc/ChangeLog:

	* doc/install.texi (Configuration): Add more details about --with-zstd.
	Document --with-zstd-include and --with-zstd-lib

Signed-off-by: Bruno Haible <bruno@clisp.org>
2022-05-24 13:23:43 +02:00
Richard Biener 91c7c5edd2 middle-end/105711 - properly handle CONST_INT when expanding bitfields
This is another place where we fail to pass down the mode of a
CONST_INT.

2022-05-24  Richard Biener  <rguenther@suse.de>

	PR middle-end/105711
	* expmed.cc (extract_bit_field_as_subreg): Add op0_mode parameter
	and use it.
	(extract_bit_field_1): Pass down the mode of op0 to
	extract_bit_field_as_subreg.

	* gcc.target/i386/pr105711.c: New testcase.
2022-05-24 12:12:13 +02:00
Tobias Burnus 4fb2b4f7ea OpenMP: Support nowait with Fortran [PR105378]
Fortran part to C/C++/libgomp
commit r13-724-gb43836914bdc2a37563cf31359b2c4803bfe4374

gcc/fortran/

	PR c/105378
	* openmp.cc (gfc_match_omp_taskwait): Accept nowait.

gcc/testsuite/

	PR c/105378
	* gfortran.dg/gomp/taskwait-depend-nowait-1.f90: New.

libgomp/

	PR c/105378
	* libgomp.texi (OpenMP 5.1): Set 'taskwait nowait' to 'Y'.
	* testsuite/libgomp.fortran/taskwait-depend-nowait-1.f90: New.
2022-05-24 10:45:26 +02:00
Vineet Gupta b646d7d279 RISC-V: Inhibit FP <--> int register moves via tune param
Under extreme register pressure, the compiler can use FP <--> int
moves as a cheap alternative to spilling to memory.
This was seen with SPEC2017 FP benchmark 507.cactu:
ML_BSSN_Advect.cc:ML_BSSN_Advect_Body()

|	fmv.d.x	fa5,s9	# PDupwindNthSymm2Xt1, PDupwindNthSymm2Xt1
| .LVL325:
|	ld	s9,184(sp)		# _12469, %sfp
| ...
| .LVL339:
|	fmv.x.d	s4,fa5	# PDupwindNthSymm2Xt1, PDupwindNthSymm2Xt1
|

The FMV instructions could be costlier (than stack spill) on certain
micro-architectures, thus this needs to be a per-cpu tunable
(default being to inhibit on all existing RV cpus).

A testsuite run with the new test reports 10 failures without the fix,
corresponding to the build variations of pr105666.c.

| 		=== gcc Summary ===
|
| # of expected passes		123318   (+10)
| # of unexpected failures	34       (-10)
| # of unexpected successes	4
| # of expected failures	780
| # of unresolved testcases	4
| # of unsupported tests	2796

gcc/ChangeLog:

	* config/riscv/riscv.cc (struct riscv_tune_param): Add
	fmv_cost.
	(rocket_tune_info): Add default fmv_cost 8.
	(sifive_7_tune_info): Ditto.
	(thead_c906_tune_info): Ditto.
	(optimize_size_tune_info): Ditto.
	(riscv_register_move_cost): Use fmv_cost for int<->fp moves.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/pr105666.c: New test.

Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
2022-05-24 15:55:17 +08:00
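The effect of a per-CPU move cost can be sketched as follows. Only the field name `fmv_cost` and its default of 8 come from the ChangeLog; the register classes, the function name and the base cost of 2 for same-class moves are assumptions made for illustration.

```cpp
#include <cassert>

// Hypothetical register classes for the toy model.
enum toy_reg_class { TOY_GR_REGS, TOY_FP_REGS };

struct toy_tune_param {
  int fmv_cost;  // cost of an FP <--> int register move
};

// Hypothetical cost hook: moves that cross between the integer and FP
// register files use the tunable cost, other moves use an assumed base cost.
inline int toy_register_move_cost(const toy_tune_param &tune,
                                  toy_reg_class from, toy_reg_class to) {
  const int base_cost = 2;  // assumption, not taken from the patch
  return from != to ? tune.fmv_cost : base_cost;
}

inline void toy_move_cost_demo() {
  const toy_tune_param tune = {8};  // default value named in the ChangeLog
  assert(toy_register_move_cost(tune, TOY_GR_REGS, TOY_FP_REGS) == 8);
  assert(toy_register_move_cost(tune, TOY_FP_REGS, TOY_GR_REGS) == 8);
  assert(toy_register_move_cost(tune, TOY_GR_REGS, TOY_GR_REGS) == 2);
}
```

The register allocator weighs such a cost against the cost of a spill, so raising it makes a stack spill look relatively cheaper than a cross-file move.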
Jakub Jelinek b43836914b openmp: Add taskwait nowait depend support [PR105378]
This patch adds support for (so far C/C++)
  #pragma omp taskwait nowait depend(...)
directive, which is like
  #pragma omp task depend(...)
  ;
but slightly optimized on the library side, so that it creates
the task only for the purpose of dependency tracking and doesn't actually
schedule it and wait for it when the dependencies are satisfied; instead
it makes its dependencies satisfied right away.

2022-05-24  Jakub Jelinek  <jakub@redhat.com>

	PR c/105378
gcc/
	* omp-builtins.def (BUILT_IN_GOMP_TASKWAIT_DEPEND_NOWAIT): New
	builtin.
	* gimplify.cc (gimplify_omp_task): Diagnose taskwait with nowait
	clause but no depend clauses.
	* omp-expand.cc (expand_taskwait_call): Use
	BUILT_IN_GOMP_TASKWAIT_DEPEND_NOWAIT rather than
	BUILT_IN_GOMP_TASKWAIT_DEPEND if nowait clause is present.
gcc/c/
	* c-parser.cc (OMP_TASKWAIT_CLAUSE_MASK): Add nowait clause.
gcc/cp/
	* parser.cc (OMP_TASKWAIT_CLAUSE_MASK): Add nowait clause.
gcc/testsuite/
	* c-c++-common/gomp/taskwait-depend-nowait-1.c: New test.
libgomp/
	* libgomp_g.h (GOMP_taskwait_depend_nowait): Declare.
	* libgomp.map (GOMP_taskwait_depend_nowait): Export at GOMP_5.1.1.
	* task.c (empty_task): New function.
	(gomp_task_run_post_handle_depend_hash): Declare earlier.
	(gomp_task_run_post_handle_depend): Declare.
	(GOMP_task): Optimize fn == empty_task if there is nothing to wait
	for.
	(gomp_task_run_post_handle_dependers): Optimize task->fn == empty_task.
	(GOMP_taskwait_depend_nowait): New function.
	* testsuite/libgomp.c-c++-common/taskwait-depend-nowait-1.c: New test.
2022-05-24 09:12:44 +02:00
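A small usage sketch of the directive follows. It assumes a compiler that accepts `taskwait nowait depend(...)` when built with -fopenmp; without OpenMP the pragmas are ignored and the function simply runs serially with the same result. The function name `toy_pipeline` is hypothetical.

```cpp
#include <cassert>

// Computes 20 + 22 with tasks that are ordered only through depend clauses.
inline int toy_pipeline() {
  int a = 0, b = 0, sum = 0;
  #pragma omp parallel
  #pragma omp single
  {
    #pragma omp task shared(a) depend(out: a)
    a = 20;
    #pragma omp task shared(b) depend(out: b)
    b = 22;
    // Join point that does not block the encountering thread: it behaves
    // like an empty task that depends on a and b and produces sum.
    #pragma omp taskwait nowait depend(in: a, b) depend(out: sum)
    #pragma omp task shared(a, b, sum) depend(inout: sum)
    sum = a + b;
  }
  return sum;
}

inline void toy_pipeline_demo() {
  assert(toy_pipeline() == 42);
}
```

The last task depends on `sum`, so it is ordered after the taskwait, which in turn is ordered after both producer tasks.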
Richard Biener 1adf11822b tree-optimization/100221 - improve DSE a bit
When facing multiple PHI defs and one feeding the other we can
postpone processing uses of one and thus can proceed.

2022-05-20  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/100221
	* tree-ssa-dse.cc (contains_phi_arg): New function.
	(dse_classify_store): Postpone PHI defs that feed another PHI in defs.

	* gcc.dg/tree-ssa/ssa-dse-44.c: New testcase.
	* gcc.dg/tree-ssa/ssa-dse-45.c: Likewise.
2022-05-24 08:20:11 +02:00
Richard Biener d918faea12 tree-optimization/105629 - spaceship recognition regression
With the extra GENERIC folding we now do to
(unsigned int) __v._M_value & 1 != (unsigned int) __v._M_value
we end up with a sign-extending conversion to unsigned int
rather than the sign-conversion to unsigned char we expect.
Relaxing that fixes the regression.

2022-05-23  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/105629
	* tree-ssa-phiopt.cc (spaceship_replacement): Allow
	a sign-extending conversion.
2022-05-24 08:20:11 +02:00
Kewen Lin 8fa8bca9f5 testsuite/rs6000: Adjust gcc.target/powerpc/pr78604.c [PR105706]
Commit r13-707 adjusts the below gimple:

  iftmp.7_4 = _1 < _2 ? val2_7(D) : val1_8(D);

to

  _3 = _1 >= _2;
  iftmp.7_4 = _3 ? val1_8(D) : val2_7(D);

and results in one more vect_model_simple_cost dump for each
function.  The match count needs to be adjusted accordingly.

	PR testsuite/105706

gcc/testsuite/ChangeLog:

	* gcc.target/powerpc/pr78604.c: Adjust.
2022-05-24 01:00:40 -05:00
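The two gimple forms quoted above select the same value for integer operands. The sketch below checks this with hypothetical helper names and plain `int` operands; it says nothing about floating-point comparisons, where NaNs would break the equivalence.

```cpp
#include <cassert>

// Form before r13-707: a single conditional on a < b.
inline int toy_before(int a, int b, int val1, int val2) {
  return a < b ? val2 : val1;
}

// Form after r13-707: a separate inverted comparison, then a conditional
// with the arms swapped.
inline int toy_after(int a, int b, int val1, int val2) {
  const bool cond = a >= b;
  return cond ? val1 : val2;
}

inline void toy_cond_demo() {
  for (int a = -2; a <= 2; ++a)
    for (int b = -2; b <= 2; ++b)
      assert(toy_before(a, b, 100, 200) == toy_after(a, b, 100, 200));
}
```

The second form has an extra statement for the comparison, which matches the message's observation that one more cost line is dumped per function.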
Kewen Lin 149d04ccbb rs6000: Skip debug insns for union [PR105627]
As PR105627 exposes, pass analyze_swaps should skip debug
insns when doing unionfind_union.  One debug insn can use
several pseudos; if we took debug insns into account, we could
union the insns defining those pseudos and generate some unexpected
unions.

Based on the assumption that it's impossible to have a
pseudo which is defined by a debug insn but used by a
nondebug insn, we just assert that a debug insn never shows up in
function union_defs.

	PR target/105627

gcc/ChangeLog:

	* config/rs6000/rs6000-p8swap.cc (union_defs): Assert def_insn can't
	be a debug insn.
	(union_uses): Skip debug use_insn.

gcc/testsuite/ChangeLog:

	* gcc.target/powerpc/pr105627.c: New test.
2022-05-24 01:00:22 -05:00
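Why a debug insn can cause unexpected unions is easiest to see in a toy union-find. The model below is a simplified sketch, not the data structures of rs6000-p8swap.cc: `toy_insn`, `toy_union_find` and `toy_build_webs` are hypothetical, and each insn simply lists the insns that define the pseudos it uses.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

struct toy_insn {
  bool is_debug;
  std::vector<std::size_t> def_insns;  // insns defining the pseudos used here
};

class toy_union_find {
 public:
  explicit toy_union_find(std::size_t n) : parent_(n) {
    std::iota(parent_.begin(), parent_.end(), std::size_t{0});
  }
  std::size_t find(std::size_t x) {
    while (parent_[x] != x) {
      parent_[x] = parent_[parent_[x]];  // path halving
      x = parent_[x];
    }
    return x;
  }
  void unite(std::size_t a, std::size_t b) {
    const std::size_t ra = find(a);
    const std::size_t rb = find(b);
    parent_[ra] = rb;
  }

 private:
  std::vector<std::size_t> parent_;
};

// Union each non-debug insn with the insns defining its inputs.  Debug
// insns are skipped so that they cannot glue unrelated webs together.
inline toy_union_find toy_build_webs(const std::vector<toy_insn> &insns) {
  toy_union_find uf(insns.size());
  for (std::size_t i = 0; i < insns.size(); ++i) {
    if (insns[i].is_debug) continue;
    for (std::size_t d : insns[i].def_insns) uf.unite(i, d);
  }
  return uf;
}

inline void toy_webs_demo() {
  // 0 and 1 define two pseudos; 2 is a debug insn using both;
  // 3 uses only the first pseudo and 4 only the second.
  const std::vector<toy_insn> insns = {
      {false, {}}, {false, {}}, {true, {0, 1}}, {false, {0}}, {false, {1}}};
  toy_union_find uf = toy_build_webs(insns);
  assert(uf.find(0) == uf.find(3));
  assert(uf.find(1) == uf.find(4));
  assert(uf.find(0) != uf.find(1));  // the debug insn did not merge them
}
```

If the `is_debug` check were removed, insn 2 would join the webs of insns 0 and 1, which is the kind of unexpected union the commit message describes.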
GCC Administrator 168fc8bda1 Daily bump.
2022-05-24 00:17:03 +00:00
H.J. Lu f1a80c05db x86: Avoid uninitialized variable in PR target/104441 test
PR target/104441
	* gcc.target/i386/pr104441-1a.c (load8bit_4x4_avx2): Initialize
	src23.
2022-05-23 16:57:17 -07:00
Vineet Gupta ef85d150b5 RISC-V: Enable TARGET_SUPPORTS_WIDE_INT
This is on par with other major arches such as aarch64, i386, s390 ...

gcc/ChangeLog

	* config/riscv/predicates.md (const_0_operand): Remove
	const_double.
	* config/riscv/riscv.cc (riscv_rtx_costs): Add check for
	CONST_DOUBLE.
	* config/riscv/riscv.h (TARGET_SUPPORTS_WIDE_INT): New define.

Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-05-23 16:32:10 -07:00
David Malcolm 8473ef7be6 test plugins: use "final" and "override" directly, rather than via macros
gcc/testsuite/ChangeLog:
	* gcc.dg/plugin/analyzer_gil_plugin.c: Replace uses of "FINAL" and
	"OVERRIDE" with "final" and "override".

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2022-05-23 19:28:48 -04:00
David Malcolm 58c9c7407a jit: use 'final' and 'override' where appropriate
gcc/jit/ChangeLog:
	* jit-recording.h: Add "final" and "override" to all vfunc
	implementations that were missing them, as appropriate.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2022-05-23 15:09:30 -04:00
David Malcolm 2ac1459f04 analyzer: use 'final' and 'override' where appropriate
gcc/analyzer/ChangeLog:
	* call-info.cc: Add "final" and "override" to all vfunc
	implementations that were missing them, as appropriate.
	* engine.cc: Likewise.
	* region-model.cc: Likewise.
	* sm-malloc.cc: Likewise.
	* supergraph.h: Likewise.
	* svalue.cc: Likewise.
	* varargs.cc: Likewise.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2022-05-23 15:08:13 -04:00
Mayshao a239aff82c [x86_64]: Zhaoxin lujiazui enablement
This patch fixes the Zhaoxin CPU vendor ID detection problem and adds Zhaoxin
"lujiazui" processor support.  Currently GCC can't recognize Zhaoxin CPUs
(vendor IDs "CentaurHauls" and "Shanghai") if the user passes -march=native,
which is confusing for users.  This patch enables -march=native on Zhaoxin
family 7 processors and adds -march/-mtune=lujiazui; costs and tunings are set
according to the characteristics of the processor.
We add a new md file to describe the lujiazui pipeline.

Testing:
Bootstrap is ok, and no regressions for i386/x86-64 testsuite.

Background:
Related Zhaoxin linux kernel patch can be found at:
https://lore.kernel.org/lkml/01042674b2f741b2aed1f797359bdffb@zhaoxin.com/

Related Zhaoxin glibc patch can be found at:
https://sourceware.org/git/?p=glibc.git;a=commit;h=32ac0b988466785d6e3cc1dffc364bb26fc63193

gcc/ChangeLog:

	* common/config/i386/cpuinfo.h (get_zhaoxin_cpu): Detect
	the specific type of Zhaoxin CPU, and return Zhaoxin CPU name.
	(cpu_indicator_init): Handle Zhaoxin processors.
	* common/config/i386/i386-common.cc: Add lujiazui.
	* common/config/i386/i386-cpuinfo.h (enum processor_vendor): Add
	VENDOR_ZHAOXIN.
	(enum processor_types): Add ZHAOXIN_FAM7H.
	(enum processor_subtypes): Add ZHAOXIN_FAM7H_LUJIAZUI.
	* config.gcc: Add lujiazui.
	* config/i386/cpuid.h (signature_SHANGHAI_ebx): Add
	signatures for Zhaoxin.
	(signature_SHANGHAI_ecx): Ditto.
	(signature_SHANGHAI_edx): Ditto.
	* config/i386/driver-i386.cc (host_detect_local_cpu): Let
	-march=native recognize lujiazui processors.
	* config/i386/i386-c.cc (ix86_target_macros_internal): Add lujiazui.
	* config/i386/i386-options.cc (m_LUJIAZUI): New definition.
	* config/i386/i386.h (enum processor_type): Ditto.
	* config/i386/i386.md: Add lujiazui.
	* config/i386/x86-tune-costs.h (struct processor_costs): Add
	lujiazui costs.
	* config/i386/x86-tune-sched.cc (ix86_issue_rate): Add lujiazui.
	(ix86_adjust_cost): Ditto.
	* config/i386/x86-tune.def (X86_TUNE_SCHEDULE): Add lujiazui tunings.
	(X86_TUNE_PARTIAL_REG_DEPENDENCY): Ditto.
	(X86_TUNE_SSE_PARTIAL_REG_DEPENDENCY): Ditto.
	(X86_TUNE_SSE_PARTIAL_REG_FP_CONVERTS_DEPENDENCY): Ditto.
	(X86_TUNE_SSE_PARTIAL_REG_CONVERTS_DEPENDENCY): Ditto.
	(X86_TUNE_MOVX): Ditto.
	(X86_TUNE_MEMORY_MISMATCH_STALL): Ditto.
	(X86_TUNE_FUSE_CMP_AND_BRANCH_32): Ditto.
	(X86_TUNE_FUSE_CMP_AND_BRANCH_64): Ditto.
	(X86_TUNE_FUSE_CMP_AND_BRANCH_SOFLAGS): Ditto.
	(X86_TUNE_FUSE_ALU_AND_BRANCH): Ditto.
	(X86_TUNE_ACCUMULATE_OUTGOING_ARGS): Ditto.
	(X86_TUNE_USE_LEAVE): Ditto.
	(X86_TUNE_PUSH_MEMORY): Ditto.
	(X86_TUNE_LCP_STALL): Ditto.
	(X86_TUNE_USE_INCDEC): Ditto.
	(X86_TUNE_INTEGER_DFMODE_MOVES): Ditto.
	(X86_TUNE_OPT_AGU): Ditto.
	(X86_TUNE_PREFER_KNOWN_REP_MOVSB_STOSB): Ditto.
	(X86_TUNE_MISALIGNED_MOVE_STRING_PRO_EPILOGUES): Ditto.
	(X86_TUNE_USE_SAHF): Ditto.
	(X86_TUNE_USE_BT): Ditto.
	(X86_TUNE_AVOID_FALSE_DEP_FOR_BMI): Ditto.
	(X86_TUNE_ONE_IF_CONV_INSN): Ditto.
	(X86_TUNE_AVOID_MFENCE): Ditto.
	(X86_TUNE_EXPAND_ABS): Ditto.
	(X86_TUNE_USE_SIMODE_FIOP): Ditto.
	(X86_TUNE_USE_FFREEP): Ditto.
	(X86_TUNE_EXT_80387_CONSTANTS): Ditto.
	(X86_TUNE_SSE_UNALIGNED_LOAD_OPTIMAL): Ditto.
	(X86_TUNE_SSE_UNALIGNED_STORE_OPTIMAL): Ditto.
	(X86_TUNE_SSE_TYPELESS_STORES): Ditto.
	(X86_TUNE_SSE_LOAD0_BY_PXOR): Ditto.
	* doc/extend.texi: Add details about lujiazui.
	* doc/invoke.texi: Add details about lujiazui.
	* config/i386/lujiazui.md: Introduce lujiazui cpu and include new md file.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/funcspec-56.inc: Test -arch=lujiazui and -tune=lujiazui.
	* g++.target/i386/mv32.C: Ditto.

Signed-off-by: mayshao <mayshao-oc@zhaoxin.com>
2022-05-23 17:53:27 +02:00
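The vendor ID check described in the commit above can be sketched as follows. This is an illustration only, not the code from cpuinfo.h: the helper names are hypothetical, and the space-padded form "  Shanghai  " of the 12-byte vendor string is an assumption about how the ID quoted in the message is laid out in the registers. Note also that a string match alone is not sufficient in practice, since the commit additionally keys on the processor family.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>
#include <string>

// Pack four characters the way CPUID returns them: first char in the
// low byte of the register.
inline uint32_t pack4 (const char *p)
{
  uint32_t v = 0;
  for (int i = 0; i < 4; ++i)
    v |= static_cast<uint32_t> (static_cast<unsigned char> (p[i])) << (8 * i);
  return v;
}

// Rebuild the 12-byte vendor string from the CPUID leaf 0 registers.
// The bytes are read in the order EBX, EDX, ECX.
inline std::string vendor_string (uint32_t ebx, uint32_t ecx, uint32_t edx)
{
  std::string s;
  for (uint32_t reg : { ebx, edx, ecx })
    for (int shift = 0; shift < 32; shift += 8)
      s.push_back (static_cast<char> ((reg >> shift) & 0xff));
  return s;
}

// Hypothetical helper: true for either vendor ID named in the commit
// message.  The padded "  Shanghai  " spelling is an assumption.
inline bool zhaoxin_vendor_p (const std::string &vendor)
{
  return vendor == "CentaurHauls" || vendor == "  Shanghai  ";
}
```

A caller would obtain the three register values from CPUID leaf 0 and pass them to vendor_string before testing the result with zhaoxin_vendor_p.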
Dimitar Dimitrov e6c04ac9fd testsuite: mallign: Handle word size of 1 byte
This patch fixes a spurious warning for the pru-unknown-elf target:
  gcc/testsuite/gcc.dg/mallign.c:12:27: warning: ignoring return value of 'malloc' declared with attribute 'warn_unused_result' [-Wunused-result]

For 8-bit targets the resulting mask ignores all bits in the value
returned by malloc.  Fix by first checking the target word size.

gcc/testsuite/ChangeLog:

	* gcc.dg/mallign.c: Skip check if sizeof(word)==1.

Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
2022-05-23 18:44:38 +03:00
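The arithmetic behind the commit above can be sketched as follows, assuming the test masks the pointer with sizeof(word)-1 (the function names here are hypothetical). With a 1-byte word the mask is zero, so no bit of the malloc result is examined.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Mask that selects the low bits which must be zero for word alignment.
// For word_size == 1 the mask is 0, so no address bit is ever tested.
constexpr std::uintptr_t align_mask (std::size_t word_size)
{
  return static_cast<std::uintptr_t> (word_size - 1);
}

// True when ADDR is not a multiple of WORD_SIZE (a power of two).
constexpr bool misaligned_p (std::uintptr_t addr, std::size_t word_size)
{
  return (addr & align_mask (word_size)) != 0;
}

// Hypothetical guard mirroring "skip check if sizeof(word)==1".
constexpr bool check_is_meaningful (std::size_t word_size)
{
  return word_size > 1;
}

static_assert (align_mask (1) == 0, "1-byte words yield an empty mask");
```

Because the masked value is always zero for a 1-byte word, the check can never fail on such a target, which is why guarding it on the word size is a reasonable fix.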
Nathan Sidwell b7feb71d45 demangler: C++ modules support
This adds demangling support for C++ modules.  It introduces a new 'W'
component along with augmented behaviour of 'S' components.

	include/
	* demangle.h (enum demangle_component_type): Add module components.
	libiberty/
	* cp-demangle.c (d_make_comp): Adjust.
	(d_name, d_prefix): Adjust subst handling. Add module handling.
	(d_maybe_module_name): New.
	(d_unqualified_name): Add incoming module parm. Handle it.  Adjust all callers.
	(d_special_name): Add 'GI' support.
	(d_count_template_scopes): Adjust.
	(d_print_comp_inner): Print module.
	* testsuite/demangle-expected: New test cases
2022-05-23 05:39:15 -07:00
Martin Liska 63798f67dc tilepro: fix missing ARRAY_SIZE macro
gcc/ChangeLog:

	* config/tilepro/gen-mul-tables.cc (ARRAY_SIZE): Add new macro.
2022-05-23 13:54:53 +02:00
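The commit above does not show the macro body. A minimal sketch of the usual idiom, assuming the conventional definition, is given below; the sample table is hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Assumed definition, matching the common idiom; the commit text does
// not show the actual macro body.
#ifndef ARRAY_SIZE
#define ARRAY_SIZE(a) (sizeof (a) / sizeof ((a)[0]))
#endif

// Hypothetical table used only to exercise the macro.
static const int sample_table[] = { 3, 5, 7, 11 };

// Number of elements in sample_table, computed at compile time.
inline std::size_t sample_table_length ()
{
  return ARRAY_SIZE (sample_table);
}
```

The macro only works on true arrays; applied to a pointer it silently yields the wrong count, so it should be used where the array definition is in scope.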
Richard Biener 0236ea984c Remove forward_propagate_into_cond
This is a first cleanup opportunity from the COND_EXPR gimplification
which allows us to remove now redundant forward_propagate_into_cond.

2022-05-23  Richard Biener  <rguenther@suse.de>

	* tree-ssa-forwprop.cc (forward_propagate_into_cond): Remove.
	(pass_forwprop::execute): Do not propagate into COND_EXPR conditions.
2022-05-23 12:55:13 +02:00
Richard Biener 19dd439389 Remove is_gimple_condexpr
This removes is_gimple_condexpr.  Note that the vectorizer, via patterns,
still creates COND_EXPRs with embedded GENERIC conditions and has
a reference to the function in comments.  Otherwise is_gimple_condexpr
is now equal to is_gimple_val.

2022-05-16  Richard Biener  <rguenther@suse.de>

	* gimple-expr.cc (is_gimple_condexpr): Remove.
	* gimple-expr.h (is_gimple_condexpr): Likewise.
	* gimplify.cc (gimplify_expr): Remove is_gimple_condexpr usage.
	* tree-if-conv.cc (set_bb_predicate): Likewise.
	(add_to_predicate_list): Likewise.
	(gen_phi_arg_condition): Likewise.
	(predicate_scalar_phi): Likewise.
	(predicate_statements): Likewise.
2022-05-23 11:30:39 +02:00
Richard Biener 68e0063397 Force the selection operand of a GIMPLE COND_EXPR to be a register
This does away with allowing the selection operand to be a GENERIC
tcc_comparison tree.  It keeps those for vectorizer pattern recognition;
they are short-lived, and removing that instance is a bigger task.

The patch doesn't yet remove dead code and functionality, that's
left for a followup.  Instead the patch makes sure to produce
valid GIMPLE IL and continue to optimize COND_EXPRs where the
previous IL allowed and the new IL showed regressions in the testsuite.

2022-05-16  Richard Biener  <rguenther@suse.de>

	* gimple-expr.cc (is_gimple_condexpr): Equate to is_gimple_val.
	* gimplify.cc (gimplify_pure_cond_expr): Gimplify the condition
	as is_gimple_val.
	* gimple-fold.cc (valid_gimple_rhs_p): Simplify.
	* tree-cfg.cc (verify_gimple_assign_ternary): Likewise.
	* gimple-loop-interchange.cc (loop_cand::undo_simple_reduction):
	Build the condition of the COND_EXPR separately.
	* tree-ssa-loop-im.cc (move_computations_worker): Likewise.
	* tree-vect-generic.cc (expand_vector_condition): Likewise.
	* tree-vect-loop.cc (vect_create_epilog_for_reduction):
	Likewise.
	* vr-values.cc (simplify_using_ranges::simplify): Likewise.
	* tree-vect-patterns.cc: Add comment indicating we are
	building invalid COND_EXPRs and why.
	* omp-expand.cc (expand_omp_simd): Gimplify the condition
	to the COND_EXPR separately.
	(expand_omp_atomic_cas): Note part that should be unreachable
	now.
	* tree-ssa-forwprop.cc (forward_propagate_into_cond): Adjust
	condition for valid replacements.
	* tree-if-conv.cc (predicate_bbs): Simulate previous
	re-folding of the condition in folded COND_EXPRs which
	is necessary because of unfolded GIMPLE_CONDs in the IL
	as in for example gcc.dg/fold-bopcond-1.c.
	* gimple-range-gori.cc (gori_compute::condexpr_adjust):
	Handle that the comparison is now in the def stmt of
	the select operand.  Required by gcc.dg/pr104526.c.

	* gcc.dg/gimplefe-27.c: Adjust.
	* gcc.dg/gimplefe-45.c: Likewise.
	* gcc.dg/pr101145-2.c: Likewise.
	* gcc.dg/pr98211.c: Likewise.
	* gcc.dg/torture/pr89595.c: Likewise.
	* gcc.dg/tree-ssa/divide-7.c: Likewise.
	* gcc.dg/tree-ssa/ssa-lim-12.c: Likewise.
2022-05-23 11:30:39 +02:00
Tobias Burnus 49d1a2f913 OpenMP: Handle descriptors in target's firstprivate [PR104949]
For allocatable/pointer arrays, a firstprivate to a device
not only needs to privatize the descriptor but also the actual
data. This is implemented as:
  firstprivate(x) firstprivate(x.data) attach(x [bias: &x.data-&x])
where the address of x in device memory is saved in hostaddrs[i]
by libgomp and the middle end actually passes hostaddrs[i]' to
attach.

As a side effect, has_device_addr(array_desc) had to be changed:
before, it was converted to firstprivate in the front end; now
it is handled in omp-low.cc as has_device_addr requires a shallow
firstprivate (not touching the data pointer) while the normal
firstprivate requires (now) a deep firstprivate.

gcc/fortran/ChangeLog:

	PR fortran/104949
	* f95-lang.cc (LANG_HOOKS_OMP_ARRAY_SIZE): Redefine.
	* trans-openmp.cc (gfc_omp_array_size): New.
	(gfc_trans_omp_variable_list): Never turn has_device_addr
	to firstprivate.
	* trans.h (gfc_omp_array_size): New.

gcc/ChangeLog:

	PR fortran/104949
	* langhooks-def.h (lhd_omp_array_size): New.
	(LANG_HOOKS_OMP_ARRAY_SIZE): Define.
	(LANG_HOOKS_DECLS): Add it.
	* langhooks.cc (lhd_omp_array_size): New.
	* langhooks.h (struct lang_hooks_for_decls): Add hook.
	* omp-low.cc (scan_sharing_clauses, lower_omp_target):
	Handle GOMP_MAP_FIRSTPRIVATE for array descriptors.

libgomp/ChangeLog:

	PR fortran/104949
	* target.c (gomp_map_vars_internal, copy_firstprivate_data):
	Support attach for GOMP_MAP_FIRSTPRIVATE.
	* testsuite/libgomp.fortran/target-firstprivate-1.f90: New test.
	* testsuite/libgomp.fortran/target-firstprivate-2.f90: New test.
	* testsuite/libgomp.fortran/target-firstprivate-3.f90: New test.
2022-05-23 10:54:32 +02:00
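The distinction the commit above draws between a shallow and a deep firstprivate can be sketched in host-only code. This is an analogy under stated assumptions, not libgomp's implementation: the descriptor struct is a hypothetical stand-in for a Fortran array descriptor, and "attach" is modelled as repointing the copied descriptor's data field at the copied storage.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a Fortran array descriptor.
struct descriptor
{
  double *data;
  std::size_t n;
};

// Shallow privatization: only the descriptor is copied, so the data
// pointer still aliases the original storage (the has_device_addr case).
inline descriptor shallow_private (const descriptor &x)
{
  return x;
}

// Deep privatization: copy the descriptor, copy the data, then "attach"
// by pointing the private descriptor at the private data.
inline descriptor deep_private (const descriptor &x,
                                std::vector<double> &storage)
{
  descriptor d = x;
  storage.assign (x.data, x.data + x.n);
  d.data = storage.data ();
  return d;
}
```

Writes through the deep copy leave the original array untouched, while writes through the shallow copy are visible in the original, which is the behavioural difference the two clauses rely on.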
Roger Sayle 7707d7fddf Some additional ix86_rtx_costs clean-ups: NEG, AND, andn and pandn.
Double-word NOT requires two operations, but double-word NEG requires
three operations.  Using SSE, vector NOT requires a pxor with -1, but
AND of NOT is cheap thanks to the existence of pandn.  There's also some
legacy (aka incorrect) logic explicitly testing for DImode [independently
of TARGET_64BIT] in determining the cost of logic operations that's not
required.

2022-05-23  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	* config/i386/i386.cc (ix86_rtx_costs) <case AND>: Split from
	XOR/IOR case.  Account for two instructions for double-word
	operations.  In case of vector pandn, account for single
	instruction.  Likewise for integer andn with TARGET_BMI.
	<case NOT>: Vector NOT requires more than 1 instruction (pxor).
	<case NEG>: Double-word negation requires 3 instructions.
2022-05-23 08:47:42 +01:00
Tsukasa OI 075fb873c2 RISC-V: Fix canonical extension order (K and J)
This commit fixes canonical extension order to follow the RISC-V ISA
Manual draft-20210402-1271737 or later.

gcc/ChangeLog:

	* common/config/riscv/riscv-common.cc (riscv_supported_std_ext):
	Fix "K" extension prefix to be placed before "J".
	* config/riscv/arch-canonicalize: Likewise.

Signed-off-by: Tsukasa OI <research_trasio@irq.a4lg.com>
2022-05-23 10:50:25 +08:00
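Canonical ordering of single-letter extensions amounts to sorting by position in a fixed order string. The sketch below illustrates that idea; the order string is an assumption made for the example (the commit text only states that "K" comes before "J"), and the function name is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Assumed ordering string for illustration, with "k" before "j".  The
// commit text only fixes the relative order of those two letters.
static const std::string canonical_order = "imafdqlcbkjtpvn";

// Sort single-letter extensions by their position in canonical_order.
// Letters missing from the order string sort last.
inline std::string canonicalize (std::string exts)
{
  std::stable_sort (exts.begin (), exts.end (),
                    [] (char a, char b)
                    {
                      return canonical_order.find (a)
                             < canonical_order.find (b);
                    });
  return exts;
}

// Small self-check of the K-before-J rule.
inline void check_k_before_j ()
{
  assert (canonicalize ("jk") == "kj");
}
```

Keeping the order in one string means a fix like the one in this commit is a one-character swap rather than a change to comparison logic.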
liuhongt 657612fb9f Increase move cost between mask and gpr.
kmovd only uses port 5, which is often the performance bottleneck.
Also, from a latency perspective, spill and reload can mostly be
handled by STLF or even MRN, which take only 1 cycle.

So the patch increases the move cost between gpr and mask to be the
same as gpr <-> sse register.

gcc/ChangeLog:

	* config/i386/x86-tune-costs.h (skylake_cost): Increase gpr
	<-> mask cost from 5 to 6.
	(icelake_cost): Ditto.

gcc/testsuite/ChangeLog:
	* gcc.target/i386/spill_to_mask-1.c: New test.
2022-05-23 09:57:04 +08:00
GCC Administrator 260f189335 Daily bump. 2022-05-23 00:16:28 +00:00
GCC Administrator a60228404f Daily bump. 2022-05-22 00:16:38 +00:00
Dimitar Dimitrov 570fbf448d testsuite: Skip vectorize tests for PRU
PRU has single-cycle constant cost for any jump, and it cannot
vectorise.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/gen-vect-11.c: For PRU target, skip the
	vectorizing checks in tree dumps.
	* gcc.dg/tree-ssa/gen-vect-11a.c: Ditto.
	* gcc.dg/tree-ssa/gen-vect-2.c: Ditto.
	* gcc.dg/tree-ssa/gen-vect-25.c: Ditto.
	* gcc.dg/tree-ssa/gen-vect-26.c: Ditto.
	* gcc.dg/tree-ssa/gen-vect-28.c: Ditto.
	* gcc.dg/tree-ssa/gen-vect-32.c: Ditto.

Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
2022-05-21 21:27:29 +03:00
Dimitar Dimitrov addacdc87b testsuite: Adjust pr91088.c for default_packed targets
PR ipa/91088

gcc/testsuite/ChangeLog:

	* gcc.dg/ipa/pr91088.c: Adjust member offset checks to
	accommodate targets which pack structures by default.

Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
2022-05-21 21:27:29 +03:00
Dimitar Dimitrov 0380b6575f testsuite: Skip gcc.dg/pr46647.c for PRU
Like AVR and Cris, PRU has no alignment requirements.  Thus it is
also affected by PR53535.

	PR middle-end/53535

gcc/testsuite/ChangeLog:

	* gcc.dg/pr46647.c: Skip for pru target.

Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
2022-05-21 21:27:29 +03:00