Commit Graph

193949 Commits

Author SHA1 Message Date
Jia-wei Chen b18e5d7e5f RISC-V/testsuite: Fix pr105666.c under rv32
In the rv32 regression tests, this case reports an error:

"cc1: error: ABI requires '-march=rv32'"

Adding the '-mabi' option fixes this.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/pr105666.c: New options.
2022-06-16 12:22:27 +08:00
liuhongt 1089d08311 Simplify (B * v + C) * D -> BD * v + CD when B,C,D are all INTEGER_CST.
Similarly for (v + B) * C + D -> C * v + BCD.
Here BD, CD and BCD stand for the folded constants B * D, C * D and B * C + D.
Don't simplify when there's overflow and overflow is UB for the type of v.
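
As an illustration only (this is not the match.pd code), the identity behind the fold can be checked in plain C. The helper names `fold_mul_add` and `folded_form_matches` are hypothetical. The sketch assumes a GCC- or Clang-compatible compiler, because it uses `__builtin_mul_overflow` to show the kind of overflow test that decides whether the fold is safe for a signed type; the unsigned comparison relies on wraparound being well defined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: compute the folded constants BD = B * D and
   CD = C * D for (B * v + C) * D -> BD * v + CD.  Returns false when
   either product overflows int32_t, which is the case where the fold
   must be skipped if overflow is undefined for the type.  */
bool fold_mul_add(int32_t b, int32_t c, int32_t d, int32_t *bd, int32_t *cd)
{
    assert(bd && cd);
    if (__builtin_mul_overflow(b, d, bd))
        return false;
    if (__builtin_mul_overflow(c, d, cd))
        return false;
    return true;
}

/* With wrapping (unsigned) arithmetic the original and folded forms
   agree for every input, so the fold is always valid there.  */
bool folded_form_matches(uint32_t v, uint32_t b, uint32_t c, uint32_t d)
{
    uint32_t original = (b * v + c) * d;
    uint32_t folded = (b * d) * v + c * d;
    return original == folded;
}
```

For example, `fold_mul_add(3, 4, 5, &bd, &cd)` yields BD = 15 and CD = 20, while a call whose constants overflow reports failure instead of producing a wrapped constant.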

gcc/ChangeLog:

	PR tree-optimization/53533
	* match.pd: Simplify (B * v + C) * D -> BD * v + CD and
	(v + B) * C + D -> C * v + BCD when B,C,D are all INTEGER_CST,
	and there's no overflow or !TYPE_OVERFLOW_UNDEFINED.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/pr53533-1.c: New test.
	* gcc.target/i386/pr53533-2.c: New test.
	* gcc.target/i386/pr53533-3.c: New test.
	* gcc.target/i386/pr53533-4.c: New test.
	* gcc.target/i386/pr53533-5.c: New test.
	* gcc.dg/vect/slp-11a.c: Adjust testcase.
2022-06-16 09:26:36 +08:00
GCC Administrator 499b9c5f09 Daily bump. 2022-06-16 00:16:44 +00:00
Takayuki 'January June' Suwa ce3867d414 xtensa: Eliminate [DS]Cmode hard register clobber that is immediately followed by a whole overwrite of the register
RTL expansion of an assignment to a [DS]Cmode hard register includes an
obstructive register clobber.

The simplest example:

    double _Complex test(double _Complex c) {
      return c;
    }

will be converted to:

    (set (reg:DF 42 [ c ]) (reg:DF 2 a2))
    (set (reg:DF 43 [ c+8 ]) (reg:DF 4 a4))
    (clobber (reg:DC 2 a2))
    (set (reg:DF 2 a2) (reg:DF 42 [ c ]))
    (set (reg:DF 4 a4) (reg:DF 43 [ c+8 ]))
    (use (reg:DC 2 a2))
    (return)

and then finally:

    test:
	mov	a8, a2
	mov	a9, a3
	mov	a6, a4
	mov	a7, a5
	mov	a2, a8
	mov	a3, a9
	mov	a4, a6
	mov	a5, a7
	ret

As you can see, this is ridiculous.

This patch eliminates such clobbers so that the optimizer can prune away the
wasted move instructions:

    test:
	ret

gcc/ChangeLog:

	* config/xtensa/xtensa.md (DSC): New split pattern and mode iterator.
2022-06-15 16:55:36 -07:00
Takayuki 'January June' Suwa cfad4856fa xtensa: Eliminate unwanted reg-reg moves during DFmode input reloads
When spilled DFmode registers are reloaded, they are first loaded into a pair
of SImode regs and then copied from those regs.  Such unwanted reg-reg moves
do not seem to be eliminated at the "cprop_hardreg" stage, although output
reloads have no such problem.

Luckily, such inefficiencies are easy to resolve with a peephole2
pattern.

gcc/ChangeLog:

	* config/xtensa/predicates.md (reload_operand):
	New predicate.
	* config/xtensa/xtensa.md: New peephole2 pattern.
2022-06-15 16:55:36 -07:00
Takayuki 'January June' Suwa c95e307e3a xtensa: Add some dedicated patterns that correspond to GIMPLE canonicalizations
This patch offers better RTL representations than the straightforward
derivations from some of the tree optimizers' canonicalized forms.

- rounding up to even, such as '(x + (x & 1))', is canonicalized to
  '((x + 1) & -2)', but the former takes one instruction fewer than the
  latter in the Xtensa ISA.
- signed greater than or equal to zero as a logical value,
  '((signed)x >= 0)', is canonicalized to '((unsigned)(x ^ -1) >> 31)',
  but the equivalent '(((signed)x >> 31) + 1)' takes one instruction fewer.
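
The equivalences in the two bullets above can be sanity-checked in C. This is only a sketch with hypothetical function names, not the new insn-and-split patterns. It uses unsigned arithmetic for the rounding forms so that wraparound is well defined, and it assumes that `>>` on a negative `int32_t` is an arithmetic shift, which is implementation-defined in ISO C but is what GCC does.

```c
#include <assert.h>
#include <stdint.h>

/* Round up to even: source form and GIMPLE-canonical form.
   ~1u is the unsigned spelling of -2.  */
uint32_t round_up_even_source(uint32_t x)    { return x + (x & 1u); }
uint32_t round_up_even_canonical(uint32_t x) { return (x + 1u) & ~1u; }

/* "x >= 0" as a 0/1 value: canonical form and the shorter equivalent.
   The second form assumes an arithmetic right shift of negative values.  */
uint32_t ge_zero_canonical(int32_t x) { return (uint32_t)(x ^ -1) >> 31; }
uint32_t ge_zero_shift_add(int32_t x) { return (uint32_t)((x >> 31) + 1); }

/* Abort if either pair of forms disagrees for the given value.  */
void check_equivalent(int32_t x)
{
    assert(round_up_even_source((uint32_t)x)
           == round_up_even_canonical((uint32_t)x));
    assert(ge_zero_canonical(x) == ge_zero_shift_add(x));
}
```

Calling `check_equivalent` over a range of values, including `INT32_MIN`, `-1`, `0` and `INT32_MAX`, exercises the boundary cases of both pairs.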

gcc/ChangeLog:

	* config/xtensa/xtensa.md (*round_up_to_even):
	New insn-and-split pattern.
	(*signed_ge_zero): Ditto.
2022-06-15 16:55:36 -07:00
Takayuki 'January June' Suwa 43b0c56fda xtensa: Add support for sibling call optimization
This patch introduces support for sibling call optimization when the call0
ABI is in effect.
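
For readers unfamiliar with the term, the following C sketch shows the shape of code that is a candidate for sibling call optimization. The function names are hypothetical and the example is not taken from the patch or its testcase. Whether the compiler actually emits a jump instead of a call followed by a return depends on the target, the ABI and the optimization level (GCC's -foptimize-sibling-calls is enabled at -O2).

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical callee.  The noinline attribute (GCC/Clang) keeps it a
   real call so that the tail call below is not simply inlined away.  */
__attribute__((noinline)) int triple(int x)
{
    assert(x <= INT_MAX / 3 && x >= INT_MIN / 3);
    return x * 3;
}

/* The call to triple() is the last action of this function and its
   result is returned unchanged, so the caller's frame is no longer
   needed: the call can be turned into a jump to triple().  */
int triple_of_next(int x)
{
    return triple(x + 1);
}
```

The observable result is the same with or without the optimization; only the generated call sequence differs.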

gcc/ChangeLog:

	* config/xtensa/xtensa-protos.h (xtensa_prepare_expand_call,
	xtensa_emit_sibcall): New prototypes.
	(xtensa_expand_epilogue): Add new argument that specifies whether
	or not sibling call.
	* config/xtensa/xtensa.cc (TARGET_FUNCTION_OK_FOR_SIBCALL):
	New macro definition.
	(xtensa_prepare_expand_call): New function in order to share
	the common code.
	(xtensa_emit_sibcall, xtensa_function_ok_for_sibcall):
	New functions.
	(xtensa_expand_epilogue): Add new argument sibcall_p and use it
	for sibling call handling.
	* config/xtensa/xtensa.md (call, call_value):
	Use xtensa_prepare_expand_call.
	(call_internal, call_value_internal):
	Add the condition in order to be disabled if sibling call.
	(sibcall, sibcall_value, sibcall_epilogue): New expansions.
	(sibcall_internal, sibcall_value_internal): New insn patterns,
	and split ones in order to take care of the indirect sibcalls.

gcc/testsuite/ChangeLog:

	* gcc.target/xtensa/sibcalls.c: New.
2022-06-15 16:55:36 -07:00
Takayuki 'January June' Suwa 96518f714e xtensa: Document new -mextra-l32r-costs= Xtensa-specific option
gcc/ChangeLog:
	* doc/invoke.texi: Document -mextra-l32r-costs= option.
2022-06-15 16:55:36 -07:00
David Malcolm 63c0731994 analyzer: fix up paths for inlining (PR analyzer/105962)
-fanalyzer runs late compared to other code analysis tools, in that it
runs on the partially-optimized gimple-ssa representation.  I chose this
point to run in the hope of easy integration with LTO.

As PR analyzer/105962 notes, this means that function inlining can occur
before the -fanalyzer "sees" the user's code.  For example given:

void foo (void *p)
{
  __builtin_free (p);
}

void bar (void *q)
{
  foo (q);
  foo (q);
}

Below -O2, -fanalyzer shows the calls and returns:

inline-1.c: In function ‘foo’:
inline-1.c:3:3: warning: double-‘free’ of ‘p’ [CWE-415] [-Wanalyzer-double-free]
    3 |   __builtin_free (p);
      |   ^~~~~~~~~~~~~~~~~~
  ‘bar’: events 1-2
    |
    |    6 | void bar (void *q)
    |      |      ^~~
    |      |      |
    |      |      (1) entry to ‘bar’
    |    7 | {
    |    8 |   foo (q);
    |      |   ~~~~~~~
    |      |   |
    |      |   (2) calling ‘foo’ from ‘bar’
    |
    +--> ‘foo’: events 3-4
           |
           |    1 | void foo (void *p)
           |      |      ^~~
           |      |      |
           |      |      (3) entry to ‘foo’
           |    2 | {
           |    3 |   __builtin_free (p);
           |      |   ~~~~~~~~~~~~~~~~~~
           |      |   |
           |      |   (4) first ‘free’ here
           |
    <------+
    |
  ‘bar’: events 5-6
    |
    |    8 |   foo (q);
    |      |   ^~~~~~~
    |      |   |
    |      |   (5) returning to ‘bar’ from ‘foo’
    |    9 |   foo (q);
    |      |   ~~~~~~~
    |      |   |
    |      |   (6) passing freed pointer ‘q’ in call to ‘foo’ from ‘bar’
    |
    +--> ‘foo’: events 7-8
           |
           |    1 | void foo (void *p)
           |      |      ^~~
           |      |      |
           |      |      (7) entry to ‘foo’
           |    2 | {
           |    3 |   __builtin_free (p);
           |      |   ~~~~~~~~~~~~~~~~~~
           |      |   |
           |      |   (8) second ‘free’ here; first ‘free’ was at (4)
           |

but at -O2, -fanalyzer "sees" this gimple:

void bar (void * q)
{
  <bb 2> [local count: 1073741824]:
  __builtin_free (q_2(D));
  __builtin_free (q_2(D));
  return;
}

where "foo" has been inlined away, leading to this unhelpful output:

In function ‘foo’,
    inlined from ‘bar’ at inline-1.c:9:3:
inline-1.c:3:3: warning: double-‘free’ of ‘q’ [CWE-415] [-Wanalyzer-double-free]
    3 |   __builtin_free (p);
      |   ^~~~~~~~~~~~~~~~~~
  ‘bar’: events 1-2
    |
    |    3 |   __builtin_free (p);
    |      |   ^~~~~~~~~~~~~~~~~~
    |      |   |
    |      |   (1) first ‘free’ here
    |      |   (2) second ‘free’ here; first ‘free’ was at (1)

where the stack frame information in the execution path suggests that these
events are happening in "bar", in the top stack frame.

This is what the analyzer sees, but I find it hard to decipher such
output.  Hence, as a workaround for the fact that -fanalyzer runs so
late, this patch attempts to reconstruct the "true" stack frame
information, and to inject events showing inline calls, based on the
inlining chain information recorded in the location_t values for the events.

Doing so leads to this output at -O2 on the above example (with
-fdiagnostics-show-path-depths):

In function ‘foo’,
    inlined from ‘bar’ at inline-1.c:9:3:
inline-1.c:3:3: warning: double-‘free’ of ‘q’ [CWE-415] [-Wanalyzer-double-free]
    3 |   __builtin_free (p);
      |   ^~~~~~~~~~~~~~~~~~
  ‘bar’: events 1-2 (depth 1)
    |
    |    6 | void bar (void *q)
    |      |      ^~~
    |      |      |
    |      |      (1) entry to ‘bar’
    |    7 | {
    |    8 |   foo (q);
    |      |   ~
    |      |   |
    |      |   (2) inlined call to ‘foo’ from ‘bar’
    |
    +--> ‘foo’: event 3 (depth 2)
           |
           |    3 |   __builtin_free (p);
           |      |   ^~~~~~~~~~~~~~~~~~
           |      |   |
           |      |   (3) first ‘free’ here
           |
    <------+
    |
  ‘bar’: event 4 (depth 1)
    |
    |    9 |   foo (q);
    |      |   ^
    |      |   |
    |      |   (4) inlined call to ‘foo’ from ‘bar’
    |
    +--> ‘foo’: event 5 (depth 2)
           |
           |    3 |   __builtin_free (p);
           |      |   ^~~~~~~~~~~~~~~~~~
           |      |   |
           |      |   (5) second ‘free’ here; first ‘free’ was at (3)
           |

reconstructing the calls and returns.

The patch also adds a new option, -fno-analyzer-undo-inlining, which can
be used to disable this reconstruction, restoring the output listed
above (this time with -fdiagnostics-show-path-depths):

In function ‘foo’,
    inlined from ‘bar’ at inline-1.c:9:3:
inline-1.c:3:3: warning: double-‘free’ of ‘q’ [CWE-415] [-Wanalyzer-double-free]
    3 |   __builtin_free (p);
      |   ^~~~~~~~~~~~~~~~~~
  ‘bar’: events 1-2 (depth 1)
    |
    |    3 |   __builtin_free (p);
    |      |   ^~~~~~~~~~~~~~~~~~
    |      |   |
    |      |   (1) first ‘free’ here
    |      |   (2) second ‘free’ here; first ‘free’ was at (1)
    |

gcc/analyzer/ChangeLog:
	PR analyzer/105962
	* analyzer.opt (fanalyzer-undo-inlining): New option.
	* checker-path.cc: Include "diagnostic-core.h" and
	"inlining-iterator.h".
	(event_kind_to_string): Handle EK_INLINED_CALL.
	(class inlining_info): New class.
	(checker_event::checker_event): Move here from checker-path.h.
	Store original fndecl and depth, and calculate effective fndecl
	and depth based on inlining information.
	(checker_event::dump): Emit original depth as well as effective
	depth when they differ; likewise for fndecl.
	(region_creation_event::get_desc): Use m_effective_fndecl.
	(inlined_call_event::get_desc): New.
	(inlined_call_event::get_meaning): New.
	(checker_path::inject_any_inlined_call_events): New.
	* checker-path.h (enum event_kind): Add EK_INLINED_CALL.
	(checker_event::checker_event): Make protected, and move
	definition to checker-path.cc.
	(checker_event::get_fndecl): Use effective fndecl.
	(checker_event::get_stack_depth): Use effective stack depth.
	(checker_event::get_logical_location): Use effective stack depth.
	(checker_event::get_original_stack_depth): New.
	(checker_event::m_fndecl): Rename to...
	(checker_event::m_original_fndecl): ...this.
	(checker_event::m_depth): Rename to...
	(checker_event::m_original_depth): ...this.
	(checker_event::m_effective_fndecl): New field.
	(checker_event::m_effective_depth): New field.
	(class inlined_call_event): New checker_event subclass.
	(checker_path::inject_any_inlined_call_events): New decl.
	* diagnostic-manager.cc: Include "inlining-iterator.h".
	(diagnostic_manager::emit_saved_diagnostic): Call
	checker_path::inject_any_inlined_call_events.
	(diagnostic_manager::prune_for_sm_diagnostic): Handle
	EK_INLINED_CALL.
	* engine.cc (tainted_args_function_custom_event::get_desc): Use
	effective fndecl.
	* inlining-iterator.h: New file.

gcc/testsuite/ChangeLog:
	PR analyzer/105962
	* gcc.dg/analyzer/inlining-1-multiline.c: New test.
	* gcc.dg/analyzer/inlining-1-no-undo.c: New test.
	* gcc.dg/analyzer/inlining-1.c: New test.
	* gcc.dg/analyzer/inlining-2-multiline.c: New test.
	* gcc.dg/analyzer/inlining-2.c: New test.
	* gcc.dg/analyzer/inlining-3-multiline.c: New test.
	* gcc.dg/analyzer/inlining-3.c: New test.
	* gcc.dg/analyzer/inlining-4-multiline.c: New test.
	* gcc.dg/analyzer/inlining-4.c: New test.
	* gcc.dg/analyzer/inlining-5-multiline.c: New test.
	* gcc.dg/analyzer/inlining-5.c: New test.
	* gcc.dg/analyzer/inlining-6-multiline.c: New test.
	* gcc.dg/analyzer/inlining-6.c: New test.
	* gcc.dg/analyzer/inlining-7-multiline.c: New test.
	* gcc.dg/analyzer/inlining-7.c: New test.

gcc/ChangeLog:
	PR analyzer/105962
	* doc/invoke.texi: Add -fno-analyzer-undo-inlining.
	* tree-diagnostic-path.cc (default_tree_diagnostic_path_printer):
	Extend -fdiagnostics-path-format=separate-events so that with
	-fdiagnostics-show-path-depths it prints fndecls as well as stack
	depths.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2022-06-15 17:44:14 -04:00
David Malcolm b06b84dbca value-relation.h: add 'final' and 'override' to relation_oracle vfunc impls
gcc/ChangeLog:
	* value-relation.h: Add "final" and "override" to relation_oracle
	vfunc implementations as appropriate.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2022-06-15 17:42:17 -04:00
David Malcolm c540077a3b analyzer: show saved diagnostics as nodes in .eg.dot dumps
I've been using this tweak to the output of
-fdump-analyzer-exploded-graph in my working copies for a while;
the extra red nodes make it *much* easier to find the places where
diagnostics are being emitted (or rejected by the diagnostic_manager).

gcc/analyzer/ChangeLog:
	* diagnostic-manager.cc (saved_diagnostic::dump_dot_id): New.
	(saved_diagnostic::dump_as_dot_node): New.
	* diagnostic-manager.h (saved_diagnostic::dump_dot_id): New decl.
	(saved_diagnostic::dump_as_dot_node): New decl.
	* engine.cc (exploded_node::dump_dot): Add nodes for saved
	diagnostics.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2022-06-15 17:40:33 -04:00
David Malcolm 44681d4547 analyzer: add more uninit test coverage
gcc/testsuite/ChangeLog:
	* gcc.dg/analyzer/uninit-1.c: Add test coverage of attempts
	to jump through an uninitialized function pointer, and of attempts
	to pass an uninitialized value to a function call.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2022-06-15 17:39:42 -04:00
Iain Buclaw 90f2a11141 d: Add `@no_sanitize' attribute to compiler and library.
The `@no_sanitize' attribute disables a particular sanitizer for this
function, analogous to `__attribute__((no_sanitize))'.  The library also
defines `@noSanitize' to be compatible with the LLVM D compiler's
`ldc.attributes'.
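
For comparison, this is roughly what the C attribute mentioned above looks like in use. It is a sketch of the C counterpart only, not of the D implementation, and the function name is hypothetical. It assumes a compiler that accepts `no_sanitize` with a string argument (GCC 8 or later, or Clang); without a -fsanitize= option the attribute has no effect.

```c
#include <assert.h>
#include <stddef.h>

/* Ask the compiler not to instrument this one function with
   AddressSanitizer; other functions in the file are unaffected.  */
__attribute__((no_sanitize("address")))
size_t count_nonzero(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    size_t n = 0;
    for (size_t i = 0; i < len; i++)
        n += (buf[i] != 0);
    return n;
}
```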

gcc/d/ChangeLog:

	* d-attribs.cc (d_langhook_attribute_table): Add no_sanitize.
	(d_handle_no_sanitize_attribute): New function.

libphobos/ChangeLog:

	* libdruntime/gcc/attributes.d (no_sanitize): Define.
	(noSanitize): Define.

gcc/testsuite/ChangeLog:

	* gdc.dg/asan/attr_no_sanitize1.d: New test.
	* gdc.dg/ubsan/attr_no_sanitize2.d: New test.
2022-06-15 23:16:21 +02:00
François Dumont dc9b92facf libstdc++: [_Hashtable] Insert range of types convertible to value_type PR 105717
Fix insertion of range of instances convertible to value_type.

libstdc++-v3/ChangeLog:

	PR libstdc++/105717
	* include/bits/hashtable_policy.h (_ConvertToValueType): New.
	* include/bits/hashtable.h (_Hashtable<>::_M_insert_unique_aux): New.
	(_Hashtable<>::_M_insert(_Arg&&, const _NodeGenerator&, true_type)): Use latters.
	(_Hashtable<>::_M_insert(_Arg&&, const _NodeGenerator&, false_type)): Likewise.
	(_Hashtable(_InputIterator, _InputIterator, size_type, const _Hash&, const _Equal&,
	const allocator_type&, true_type)): Use this.insert range.
	(_Hashtable(_InputIterator, _InputIterator, size_type, const _Hash&, const _Equal&,
	const allocator_type&, false_type)): Use _M_insert.
	* testsuite/23_containers/unordered_map/cons/56112.cc: Check how many times conversion
	is done.
	* testsuite/23_containers/unordered_map/insert/105717.cc: New test.
	* testsuite/23_containers/unordered_set/insert/105717.cc: New test.
2022-06-15 20:21:52 +02:00
Iain Buclaw 636b01ab49 d: Add `@visibility' and `@hidden' attributes.
The `@visibility' attribute is functionally the same as
`__attribute__((visibility))', and `@hidden' is a convenience alias to
`@visibility("hidden")' defined in the `gcc.attributes' module.

As the visibility of a symbol is also indirectly controlled by the
`export' keyword, the handling of this in the code generation pass has
been improved so that conflicts will be appropriately diagnosed.

gcc/d/ChangeLog:

	* d-attribs.cc (d_langhook_attribute_table): Add visibility.
	(insert_type_attribute): Use decl_attributes instead of
	merge_attributes.
	(insert_decl_attribute): Likewise.
	(apply_user_attributes): Do nothing when no UDAs applied.
	(d_handle_visibility_attribute): New function.
	* d-gimplify.cc (d_gimplify_binary_expr): Adjust.
	* d-tree.h (set_visibility_for_decl): Declare.
	* decl.cc (get_symbol_decl): Move setting of visibility flags to...
	(set_visibility_for_decl): ... here.  New function.
	* types.cc (TypeVisitor::visit (TypeStruct *)): Call
	set_visibility_for_decl().
	(TypeVisitor::visit (TypeClass *)): Likewise.

gcc/testsuite/ChangeLog:

	* gdc.dg/attr_visibility1.d: New test.
	* gdc.dg/attr_visibility2.d: New test.
	* gdc.dg/attr_visibility3.d: New test.

libphobos/ChangeLog:

	* libdruntime/gcc/attributes.d (visibility): Define.
	(hidden): Define.
2022-06-15 20:11:04 +02:00
David Edelsohn 49d14a841f testsuite: AIX operator new
The testcase relies on C++ "operator new", which requires AIX
runtime linking to override the symbol at runtime.

	* g++.dg/cpp1z/aligned-new9.C: Skip on AIX.
2022-06-15 13:49:35 -04:00
Richard Sandiford 9d2fe6d427 Revert recent internal-fn changes [PR105975]
The recent internal-fn “clean-ups” triggered problems on nvptx
because some of the omp_simt_* patterns had modeless operands.
I wondered about adapting expand_fn_using_insn to cope with that,
but then the problem becomes: what should the mode of operand 0
be when there is no lhs?  The answer depends on the target insn.
For GOMP_SIMT_ENTER_ALLOC the answer was: use Pmode.
For GOMP_SIMT_ORDERED_PRED and others the answer was: elide the call.
(However, GOMP_SIMT_ORDERED_PRED doesn't seem to have ECF_* flags
that would normally allow it to be dropped at the gimple level.)

So these instructions seem to be special enough that they need
their own code after all.  This patch reverts the second patch
and most of the first.  The only part retained from the first
is splitting expand_fn_using_insn out of expand_direct_optab_fn,
since I think expand_fn_using_insn could still be useful in future.

gcc/
	PR middle-end/105975
	Revert everything apart from the expand_fn_using_insn and
	expand_direct_optab_fn changes from:

	* internal-fn.def (DEF_INTERNAL_INSN_FN): New macro.
	(GOMP_SIMT_ENTER_ALLOC, GOMP_SIMT_EXIT, GOMP_SIMT_LANE)
	(GOMP_SIMT_LAST_LANE, GOMP_SIMT_ORDERED_PRED, GOMP_SIMT_VOTE_ANY)
	(GOMP_SIMT_XCHG_BFLY, GOMP_SIMT_XCHG_IDX): Use it.
	* internal-fn.h (direct_internal_fn_info::directly_mapped): New
	member variable.
	(direct_internal_fn_info::vectorizable): Reduce to 1 bit.
	(direct_internal_fn_p): Also return true for internal functions
	that map directly to instructions defined target-insns.def.
	(direct_internal_fn): Adjust comment accordingly.
	* internal-fn.cc (direct_insn, optab1, optab2, vectorizable_optab1)
	(vectorizable_optab2): New local macros.
	(not_direct): Initialize directly_mapped.
	(mask_load_direct, load_lanes_direct, mask_load_lanes_direct)
	(gather_load_direct, len_load_direct, mask_store_direct)
	(store_lanes_direct, mask_store_lanes_direct, vec_cond_mask_direct)
	(vec_cond_direct, scatter_store_direct, len_store_direct)
	(vec_set_direct, unary_direct, binary_direct, ternary_direct)
	(cond_unary_direct, cond_binary_direct, cond_ternary_direct)
	(while_direct, fold_extract_direct, fold_left_direct)
	(mask_fold_left_direct, check_ptrs_direct): Use the macros above.
	(expand_GOMP_SIMT_ENTER_ALLOC, expand_GOMP_SIMT_EXIT): Delete.
	(expand_GOMP_SIMT_LANE, expand_GOMP_SIMT_LAST_LANE): Likewise.
	(expand_GOMP_SIMT_ORDERED_PRED, expand_GOMP_SIMT_VOTE_ANY): Likewise.
	(expand_GOMP_SIMT_XCHG_BFLY, expand_GOMP_SIMT_XCHG_IDX): Likewise.
	(direct_internal_fn_types): Handle functions that map to instructions
	defined in target-insns.def.
	(direct_internal_fn_types): Likewise.
	(direct_internal_fn_supported_p): Likewise.
	(internal_fn_expanders): Likewise.

	(expand_fn_using_insn): New function,
	split out and adapted from...
	(expand_direct_optab_fn): ...here.
	(expand_GOMP_SIMT_ENTER_ALLOC): Use it.
	(expand_GOMP_SIMT_EXIT): Likewise.
	(expand_GOMP_SIMT_LANE): Likewise.
	(expand_GOMP_SIMT_LAST_LANE): Likewise.
	(expand_GOMP_SIMT_ORDERED_PRED): Likewise.
	(expand_GOMP_SIMT_VOTE_ANY): Likewise.
	(expand_GOMP_SIMT_XCHG_BFLY): Likewise.
	(expand_GOMP_SIMT_XCHG_IDX): Likewise.
2022-06-15 17:40:09 +01:00
Richard Earnshaw 8aaa948059 arm: big-endian issue in gen_cpymem_ldrd_strd [PR105981]
The code in gen_cpymem_ldrd_strd has been incorrect for big-endian
since r230663.  The problem is that we use gen_lowpart, etc. to split
the 64-bit quantity, but fail to account for the fact that these
routines are really dealing with 64-bit /values/ and in big-endian the
ordering of the sub-registers changes.

To fix this, I've renamed the conceptually misnamed low_reg and hi_reg
as first_reg and second_reg, and then used different logic for
big-endian targets to initialize these values.  This makes the logic
clearer than trying to think about high bits and low bits.
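
The idea of naming the halves by position rather than by significance can be sketched in C as follows. This is a hypothetical illustration, not the code in gen_cpymem_ldrd_strd: the function name and signature are assumptions. It models which 32-bit half of a 64-bit value sits at the lower memory address, and so belongs in the first register of a doubleword load/store pair.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Split a 64-bit value into the word stored at the lower address
   (*first) and the word stored at the higher address (*second).
   On a big-endian target the most significant word comes first.  */
void split_doubleword(uint64_t value, bool big_endian,
                      uint32_t *first, uint32_t *second)
{
    assert(first && second);
    uint32_t low = (uint32_t)value;
    uint32_t high = (uint32_t)(value >> 32);
    *first = big_endian ? high : low;
    *second = big_endian ? low : high;
}
```

Thinking in terms of "first" and "second" keeps the little-endian and big-endian cases symmetric; only the initialization differs.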

gcc/ChangeLog:

	PR target/105981
	* config/arm/arm.cc (gen_cpymem_ldrd_strd): Rename low_reg and hi_reg
	to first_reg and second_reg respectively.  Initialize them correctly
	when generating big-endian code.
2022-06-15 16:09:01 +01:00
Nathan Sidwell 052d89537a c++: Use better module partition naming
It turns out that 'implementation partition' is not a term used in the
std, and is confusing to users.  Let's use the better term 'internal
partition'.  While there, adjust header unit naming.

	gcc/cp/
	* module.cc (module_state::write_readme): Use less confusing
	importable unit names.
2022-06-15 08:02:37 -07:00
Richard Earnshaw dc8071da0e arm: fix thinko in arm_bfi_1_p() [PR105974]
I clearly wasn't thinking straight when I wrote the arm_bfi_1_p
function and used XUINT rather than UINTVAL when extracting CONST_INT
values.  It seemed to work in testing, but was incorrect and failed
RTL checking.

Fixed thusly:

gcc/ChangeLog:

	PR target/105974
	* config/arm/arm.cc (arm_bfi_1_p): Use UINTVAL instead of XUINT.
2022-06-15 13:43:25 +01:00
Iain Buclaw 57b2adae53 d: Set TYPE_ARTIFICIAL on internal TypeInfo types
Prevents them from triggering warnings when compiling with `-Wpadded'.

gcc/d/ChangeLog:

	* typeinfo.cc (make_internal_typeinfo): Set TYPE_ARTIFICIAL.

gcc/testsuite/ChangeLog:

	* gdc.dg/Wpadded.d: New test.
2022-06-15 13:42:28 +02:00
Richard Biener 8c2733e16e tree-optimization/105971 - less surprising refs_may_alias_p_2
When DSE asks whether __real a is using __imag a it gets a surprising
result when a is a FUNCTION_DECL.  The following makes this case
less surprising to callers while keeping the bail-out for the
non-decl case, where it is true that PTA doesn't track aliases to code
correctly.

2022-06-15  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/105971
	* tree-ssa-alias.cc (refs_may_alias_p_2): Put bail-out for
	FUNCTION_DECL and LABEL_DECL refs after decl-decl disambiguation
	to leak less surprising alias results.

	* gcc.dg/torture/pr106971.c: New testcase.
2022-06-15 13:15:11 +02:00
Richard Biener edb9330c29 tree-optimization/105969 - FPE with array diagnostics
For a [0][0] array we have to be careful when dividing by the element
size which is zero for the outermost dimension.  Luckily the division
is only for an overflow check which is pointless for array size zero.
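
The pattern being fixed can be illustrated with a small C sketch. It is not the code from get_origin_and_offset_r; the function names are hypothetical. It shows an overflow check that divides by the element size, and the early return that keeps a zero element size from reaching the division.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Would index * elt_size exceed SIZE_MAX?  A zero element size (as for
   the outermost dimension of a [0][0] array) can never overflow, so
   return early instead of dividing by it.  */
bool byte_offset_overflows(size_t index, size_t elt_size)
{
    if (elt_size == 0)
        return false;
    return index > SIZE_MAX / elt_size;
}

/* Compute the byte offset once the check above has passed.  */
size_t byte_offset(size_t index, size_t elt_size)
{
    assert(!byte_offset_overflows(index, elt_size));
    return index * elt_size;
}
```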

2022-06-15  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/105969
	* gimple-ssa-sprintf.cc (get_origin_and_offset_r): Avoid division
	by zero in overflow check.

	* gcc.dg/pr105969.c: New testcase.
2022-06-15 13:14:58 +02:00
Iain Buclaw f4c3ce32fa d: Delay completing aggregate and enum types until after attributes have been applied.
Because of forward/recursive references, the TYPE_SIZE, TYPE_ALIGN, and
TYPE_MODE of structs and enums were set before laying out their members.

This adds a new macro TYPE_FORWARD_REFERENCES for storing those forward
references against the incomplete type, laying them out after the type
has been completed.  Construction of the TYPE_DECL has also been moved
earlier in the type generation pass, which makes it possible to add
gdc-specific type attributes to the D front-end in the future.

gcc/d/ChangeLog:

	* d-attribs.cc (apply_user_attributes): Set ATTR_FLAG_TYPE_IN_PLACE
	only on incomplete types.
	* d-codegen.cc (copy_aggregate_type): Set TYPE_STUB_DECL after copy.
	* d-compiler.cc (Compiler::onParseModule): Adjust.
	* d-tree.h (AGGREGATE_OR_ENUM_TYPE_CHECK): Define.
	(TYPE_FORWARD_REFERENCES): Define.
	* decl.cc (gcc_attribute_p): Update documentation.
	(DeclVisitor::visit (StructDeclaration *)): Exit before building type
	node if gcc.attributes symbol.
	(DeclVisitor::visit (ClassDeclaration *)): Build type node and add
	TYPE_NAME to current binding level before emitting anything else.
	(DeclVisitor::visit (InterfaceDeclaration *)): Likewise.
	(DeclVisitor::visit (EnumDeclaration *)): Likewise.
	(build_type_decl): Move rest_of_decl_compilation() call to
	finish_aggregate_type().
	* types.cc (insert_aggregate_field): Move layout_decl() call to
	finish_aggregate_type().
	(insert_aggregate_bitfield): Likewise.
	(layout_aggregate_members): Adjust.
	(finish_incomplete_fields): New function.
	(finish_aggregate_type): Handle forward referenced field types.  Call
	rest_of_type_compilation() after completing the aggregate.
	(TypeVisitor::visit (TypeEnum *)): Don't set size and alignment until
	after apply_user_attributes().  Call rest_of_type_compilation() after
	completing the enumeral.
	(TypeVisitor::visit (TypeStruct *)): Call build_type_decl() before
	apply_user_attributes().  Don't set size, alignment, and mode until
	after apply_user_attributes().
	(TypeVisitor::visit (TypeClass *)): Call build_type_decl() before
	apply_user_attributes().
2022-06-15 12:33:42 +02:00
Richard Sandiford 2636660b6f aarch64: Revert bogus fix for PR105254
In f2ebf2d98e I'd forced the
chosen unroll factor to be a factor of the VF, in order to
work around an exact_div ICE in PR105254.  This was completely
bogus -- clearly I didn't look in enough detail at why we ended
up with an unrolled VF that wasn't a multiple of the UF.

Kewen has since fixed the bug properly for PR105940, so this
patch reverts my earlier attempt.  Sorry for the stupidity.

gcc/
	PR tree-optimization/105254
	PR tree-optimization/105940

	Revert:

	* config/aarch64/aarch64.cc
	(aarch64_vector_costs::determine_suggested_unroll_factor): Take a
	loop_vec_info as argument.  Restrict the unroll factor to values
	that divide the VF.
	(aarch64_vector_costs::finish_cost): Update call accordingly.

gcc/testsuite/
	* gcc.target/aarch64/sve/cost_model_14.c: New test.
2022-06-15 11:12:51 +01:00
Richard Sandiford 183a4f3829 gen: Allow unspec numbers in .md attributes
Tamar pointed out that:

  (unspec:M ... <FOO>)

didn't work when a value of attribute FOO was defined by
define_constant, such as in:

  (define_int_attribute FOO [(UNSPEC_A "UNSPEC_B") ...])

This is because symbolic constants are substituted during lexing
and only apply to bare symbol names, not strings.

One option would have been to extend this lexing substitution
to define_*_attribute values as well.  However, that would replace
symbolic names with integer constants in the generated .cc code,
making it less readable.

This patch goes for the more localised approach of only
applying define_constants when we want their integer value.

I don't think any changes to the docs are needed.  This isn't
adding a new feature, it's just making an existing one work in
the expected way.

gcc/
	* read-rtl.cc (find_int): Substitute symbolic constants
	before converting the string to an integer.
2022-06-15 11:12:51 +01:00
Jakub Jelinek 7bfb3f488a openmp: Fix up get-mapped-ptr-1.{c,f90} tests
On Tue, Jun 14, 2022 at 06:41:37PM +0200, Thomas Schwinge wrote:
> In an offloading configuration, I'm seeing:
>
>     PASS: libgomp.fortran/get-mapped-ptr-1.f90   -O  (test for excess errors)
>     [-PASS:-]{+FAIL:+} libgomp.fortran/get-mapped-ptr-1.f90   -O  execution test
>
> Does that one need similar treatment?

I assume not just that but libgomp.c-c++-common/get-mapped-ptr-1.c too?

Both need the same treatment, and in the get-mapped-ptr-1.c case there
is even UB: while the Fortran version was using c_loc (q) as the host
pointer, the C/C++ version was using q, which was the value of an
uninitialized pointer.

2022-06-15  Jakub Jelinek  <jakub@redhat.com>

	* testsuite/libgomp.c-c++-common/get-mapped-ptr-1.c (main): Initialize
	q to the address of an automatic variable.  Use -5 instead of -1 in
	omp_get_mapped_ptr call.  Add test with omp_initial_device.
	* testsuite/libgomp.fortran/get-mapped-ptr-1.f90 (main): Use -5 instead
	of -1 in omp_get_mapped_ptr call.  Add test with omp_initial_device.
	Renumber stop arguments afterwards.
2022-06-15 10:45:04 +02:00
Roger Sayle acb1e6f43d Fold truncations of left shifts in match.pd
Whilst investigating PR 55278, I noticed that the tree-ssa optimizers
aren't eliminating the promotions of shifts to "int" as inserted by the
c-family front-ends, instead leaving this simplification to
the RTL optimizers.  This patch allows match.pd to do this itself earlier,
narrowing (T)(X << C) to (T)X << C when the constant C is known to be
valid for the (narrower) type T.

Hence for this simple test case:
short foo(short x) { return x << 5; }

the .optimized dump currently looks like:

short int foo (short int x)
{
  int _1;
  int _2;
  short int _4;

  <bb 2> [local count: 1073741824]:
  _1 = (int) x_3(D);
  _2 = _1 << 5;
  _4 = (short int) _2;
  return _4;
}

but with this patch, now becomes:

short int foo (short int x)
{
  short int _2;

  <bb 2> [local count: 1073741824]:
  _2 = x_1(D) << 5;
  return _2;
}

This is always reasonable as RTL expansion knows how to use
widening optabs if it makes sense at the RTL level to perform
this shift in a wider mode.
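
The reason the narrowing is safe can be sketched in C: the low bits of a left shift depend only on the low bits of the operand, so truncating before or after the shift gives the same narrow result. The function names below are hypothetical, and the sketch uses unsigned types to keep every shift well defined; it illustrates the arithmetic, not the match.pd pattern itself.

```c
#include <assert.h>
#include <stdint.h>

/* (T)(X << C): shift in the wide type, then truncate to 16 bits.  */
uint16_t truncate_after_shift(uint32_t x, unsigned c)
{
    assert(c < 16);   /* C must be a valid shift count for the narrow type */
    return (uint16_t)(x << c);
}

/* (T)X << C: truncate to 16 bits first, then shift and keep 16 bits.  */
uint16_t shift_after_truncate(uint32_t x, unsigned c)
{
    assert(c < 16);
    uint16_t narrow = (uint16_t)x;
    return (uint16_t)((uint32_t)narrow << c);
}
```

For any 32-bit `x` and any `c` below 16, the two functions return the same value, because bits shifted in from above bit 15 of `x` never reach the low 16 bits of the result.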

Of course, there's often a catch.  The above simplification not only
reduces the number of statements in gimple, but also allows further
optimizations, for example the recognition of rotate idioms
and bswap16.  Alas, optimizing things earlier than anticipated
requires several testsuite changes [though all these tests have
been confirmed to generate identical assembly code on x86_64].
The only significant change is that the vectorization pass wouldn't
previously lower rotations of signed integer types.  Hence this
patch includes a refinement to tree-vect-patterns to allow signed
types, by using the equivalent unsigned shifts.

2022-06-15  Roger Sayle  <roger@nextmovesoftware.com>
	    Richard Biener  <rguenther@suse.de>

gcc/ChangeLog
	* match.pd (convert (lshift @1 INTEGER_CST@2)): Narrow integer
	left shifts by a constant when the result is truncated, and the
	shift constant is well-defined.
	* tree-vect-patterns.cc (vect_recog_rotate_pattern): Add
	support for rotations of signed integer types, by lowering
	using unsigned vector shifts.

gcc/testsuite/ChangeLog
	* gcc.dg/fold-convlshift-4.c: New test case.
	* gcc.dg/optimize-bswaphi-1.c: Update found bswap count.
	* gcc.dg/tree-ssa/pr61839_3.c: Shift is now optimized before VRP.
	* gcc.dg/vect/vect-over-widen-1-big-array.c: Remove obsolete tests.
	* gcc.dg/vect/vect-over-widen-1.c: Likewise.
	* gcc.dg/vect/vect-over-widen-3-big-array.c: Likewise.
	* gcc.dg/vect/vect-over-widen-3.c: Likewise.
	* gcc.dg/vect/vect-over-widen-4-big-array.c: Likewise.
	* gcc.dg/vect/vect-over-widen-4.c: Likewise.
2022-06-15 09:31:13 +02:00
liuhongt 4b1a827f02 Fix ICE in extract_insn, at recog.cc:2791
(In reply to Uroš Bizjak from comment #1)
> Instruction does not accept memory operand for operand 3:
>
> (define_insn_and_split
> "*<sse4_1>_blendv<ssefltmodesuffix><avxsizesuffix>_ltint"
>   [(set (match_operand:<ssebytemode> 0 "register_operand" "=Yr,*x,x")
> 	(unspec:<ssebytemode>
> 	  [(match_operand:<ssebytemode> 1 "register_operand" "0,0,x")
> 	   (match_operand:<ssebytemode> 2 "vector_operand" "YrBm,*xBm,xm")
> 	   (subreg:<ssebytemode>
> 	     (lt:VI48_AVX
> 	       (match_operand:VI48_AVX 3 "register_operand" "Yz,Yz,x")
> 	       (match_operand:VI48_AVX 4 "const0_operand")) 0)]
> 	  UNSPEC_BLENDV))]
>
> The problematic insn is:
>
> (define_insn_and_split "*avx_cmp<mode>3_ltint_not"
>  [(set (match_operand:VI48_AVX  0 "register_operand")
>        (vec_merge:VI48_AVX
> 	 (match_operand:VI48_AVX 1 "vector_operand")
> 	 (match_operand:VI48_AVX 2 "vector_operand")
> 	 (unspec:<avx512fmaskmode>
> 	   [(subreg:VI48_AVX
> 	    (not:<ssebytemode>
> 	      (match_operand:<ssebytemode> 3 "vector_operand")) 0)
> 	    (match_operand:VI48_AVX 4 "const0_operand")
> 	    (match_operand:SI 5 "const_0_to_7_operand")]
> 	    UNSPEC_PCMP)))]
>
> which gets split to the above pattern.
>
> In the preparation statements we have:
>
>   if (!MEM_P (operands[3]))
>     operands[3] = force_reg (<ssebytemode>mode, operands[3]);
>   operands[3] = lowpart_subreg (<MODE>mode, operands[3], <ssebytemode>mode);
>
> Which won't fly when operand 3 is memory operand...
>

gcc/ChangeLog:

	PR target/105953
	* config/i386/sse.md (*avx_cmp<mode>3_ltint_not): Force_reg
	operands[3].

gcc/testsuite/ChangeLog:

	* g++.target/i386/pr105953.C: New test.
2022-06-15 13:55:31 +08:00
GCC Administrator 4adc5350fe Daily bump. 2022-06-15 00:16:24 +00:00
Ian Lance Taylor cf79b1117b syscall: gofmt
Add blank lines after //sys comments where needed, and then run gofmt
on the syscall package with the new formatter.

This is the libgo version of CL 407136.

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/412074
2022-06-14 15:11:17 -07:00
Jonathan Wakely 6abe341558 libstdc++: Check lengths first in operator== for basic_string [PR62187]
As confirmed by LWG 2852, the calls to traits_type::compare do not need
to be observable, so we can make operator== compare string lengths
first and return immediately for non-equal lengths. This avoids doing a
slow string comparison for "abc...xyz" == "abc...xy". Previously we only
did this optimization for std::char_traits<char>, but we can enable it
unconditionally thanks to LWG 2852.

For comparisons with a const char* we can call traits_type::length right
away to do the same optimization. That strlen call can be folded away
for constant arguments, making it very efficient.

For the pre-C++20 operator== and operator!= overloads we can swap the
order of the arguments to take advantage of the operator== improvements.
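
The idea can be sketched in C as a rough analogue (this is not the
libstdc++ code; struct sized_str and both functions are hypothetical
names for illustration): compare the cheap lengths first, and only
compare contents when the lengths match.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical length-carrying string, standing in for basic_string.  */
struct sized_str {
    const char *data;
    size_t len;
};

/* Equality that rejects on a length mismatch before touching contents.  */
int sized_str_equal(const struct sized_str *a, const struct sized_str *b)
{
    if (a->len != b->len)
        return 0;
    return memcmp(a->data, b->data, a->len) == 0;
}

/* Comparison against a C string: take its length up front, then apply
   the same length-first check.  */
int sized_str_equal_cstr(const struct sized_str *a, const char *s)
{
    size_t n = strlen(s);
    return a->len == n && memcmp(a->data, s, n) == 0;
}
```

When the C string argument is a constant, a compiler may be able to
fold the strlen call away, as the message above notes.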

libstdc++-v3/ChangeLog:

	PR libstdc++/62187
	* include/bits/basic_string.h (operator==): Always compare
	lengths before checking string contents.
	[!__cpp_lib_three_way_comparison] (operator==, operator!=):
	Reorder arguments.
2022-06-14 21:07:48 +01:00
Jonathan Wakely 1b65779f46 libstdc++: Inline all basic_string::compare overloads [PR59048]
Defining the compare member functions inline allows calls to
traits_type::length and std::min to be inlined, taking advantage of
constant expression arguments. When not inline, the compiler prefers to
use the explicit instantiation definitions in libstdc++.so and can't
take advantage of constant arguments.

libstdc++-v3/ChangeLog:

	PR libstdc++/59048
	* include/bits/basic_string.h (compare): Define inline.
	* include/bits/basic_string.tcc (compare): Remove out-of-line
	definitions.
	* include/bits/cow_string.h (compare): Define inline.
	* testsuite/21_strings/basic_string/operations/compare/char/3.cc:
	New test.
2022-06-14 21:07:47 +01:00
Jonathan Wakely 29da01709f libstdc++: Fix indentation in allocator base classes
libstdc++-v3/ChangeLog:

	* include/bits/new_allocator.h: Fix indentation.
	* include/ext/malloc_allocator.h: Likewise.
2022-06-14 21:07:47 +01:00
Jonathan Wakely 0a9af7b4ef libstdc++: Check for size overflow in constexpr allocation [PR105957]
libstdc++-v3/ChangeLog:

	PR libstdc++/105957
	* include/bits/allocator.h (allocator::allocate): Check for
	overflow in constexpr allocation.
	* testsuite/20_util/allocator/105975.cc: New test.
2022-06-14 21:07:47 +01:00
Surya Kumari Jangala 3e16b4359e regrename: Fix -fcompare-debug issue in check_new_reg_p [PR105041]
In check_new_reg_p, the nregs of a du chain is computed by obtaining the
MODE of the first element in the chain, and then calling
hard_regno_nregs() with the MODE. But the first element of the chain can
be a DEBUG_INSN whose mode need not be the same as the rest of the
elements in the du chain. This resulted in an -fcompare-debug failure
as check_new_reg_p was returning a different result with -g for the same
candidate register. We can instead obtain nregs from the du chain
itself.

2022-06-10  Surya Kumari Jangala  <jskumari@linux.ibm.com>

gcc/
	PR rtl-optimization/105041
	* regrename.cc (check_new_reg_p): Use nregs value from du chain.

gcc/testsuite/
	PR rtl-optimization/105041
	* gcc.target/powerpc/pr105041.c: New test.
2022-06-14 17:36:48 +00:00
Segher Boessenkool e0e3ce6348 rs6000: Delete VS_scalar
It is just the same as VEC_base, which is a more generic name.

2022-06-14  Segher Boessenkool  <segher@kernel.crashing.org>

	* config/rs6000/vsx.md (VS_scalar): Delete.
	(rest of file): Adjust.
2022-06-14 17:31:15 +00:00
Nathan Sidwell e8609768fb c++: Elide calls to NOP module initializers
gcc/cp
	* cp-tree.h (fini_modules): Add has_inits parm.
	* decl2.cc (c_parse_final_cleanups): Check for
	inits, adjust fini_modules flags.
	* module.cc (module_state): Rename call_init_p to
	active_init_p.
	(module_state::write_config): Write active_init.
	(module_state::read_config): Read it.
	(module_determine_import_inits): Clear active_init_p
	of covered inits.
	(late_finish_module): Add has_init parm.  Record it.
	(fini_modules): Adjust.

	gcc/testsuite/
	* g++.dg/modules/init-2_a.C: Adjust.
	* g++.dg/modules/init-2_c.C: Adjust.
	* g++.dg/modules/init-2_d.C: New.
2022-06-14 07:57:36 -07:00
Jan Hubicka 8f6c317b3a Fix ipa-cp wrt volatile loads
Check for the volatile flag in ipa_load_from_parm_agg.

gcc/ChangeLog:

2022-06-10  Jan Hubicka  <hubicka@ucw.cz>

	PR ipa/105739
	* ipa-prop.cc (ipa_load_from_parm_agg): Punt on volatile loads.

gcc/testsuite/ChangeLog:

2022-06-10  Jan Hubicka  <hubicka@ucw.cz>

	* gcc.dg/ipa/pr105739.c: New test.
2022-06-14 14:05:53 +02:00
Philipp Tomsich 0247ad3e0f RISC-V: Split slli+sh[123]add.uw opportunities to avoid zext.w
When encountering a prescaled (biased) value as a candidate for
sh[123]add.uw, the combine pass will present this as shifted by the
aggregate amount (prescale + shift-amount) with an appropriately
adjusted mask constant that has fewer than 32 bits set.

E.g., here's the failing expression seen in combine for a prescale of
1 and a shift of 2 (note how 0x3fffffff8 >> 3 is 0x7fffffff).
  Trying 7, 8 -> 10:
      7: r78:SI=r81:DI#0<<0x1
        REG_DEAD r81:DI
      8: r79:DI=zero_extend(r78:SI)
        REG_DEAD r78:SI
     10: r80:DI=r79:DI<<0x2+r82:DI
        REG_DEAD r79:DI
        REG_DEAD r82:DI
  Failed to match this instruction:
  (set (reg:DI 80 [ cD.1491 ])
      (plus:DI (and:DI (ashift:DI (reg:DI 81)
                       (const_int 3 [0x3]))
               (const_int 17179869176 [0x3fffffff8]))
          (reg:DI 82)))

To address this, we introduce a splitter handling these cases.
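
The equivalence behind the combined expression can be checked with a
small C model (a sketch for illustration only; the function names are
hypothetical and merely mirror the pseudo-register numbers in the dump
above).

```c
#include <stdint.h>

/* The three-insn sequence: SImode shift by 1, zero-extend to DImode,
   then shift by 2 and add.  */
uint64_t seq_form(uint64_t r81, uint64_t r82)
{
    uint32_t r78 = (uint32_t)r81 << 1;   /* insn 7: low 32 bits, shifted */
    uint64_t r79 = (uint64_t)r78;        /* insn 8: zero_extend */
    return (r79 << 2) + r82;             /* insn 10 */
}

/* The single expression that combine presents: shift by the aggregate
   amount (1 + 2 = 3) and mask with 0x3fffffff8.  */
uint64_t combined_form(uint64_t r81, uint64_t r82)
{
    return ((r81 << 3) & UINT64_C(0x3fffffff8)) + r82;
}
```

Both functions keep bits 0..30 of r81 and place them at bit positions
3..33, which is why the mask has only 31 bits set.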

Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Co-developed-by: Manolis Tsamis <manolis.tsamis@vrull.eu>

gcc/ChangeLog:

	* config/riscv/bitmanip.md: Add split to handle opportunities
	for slli + sh[123]add.uw

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/zba-shadd.c: New test.
2022-06-14 13:37:51 +02:00
Philipp Tomsich 4bf0dcb049 RISC-V: add consecutive_bits_operand predicate
Provide an easy way to constrain for constants that are a single,
consecutive run of ones.
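
For illustration, one way to test whether a value is a single
consecutive run of ones is sketched below in C.  This is an assumption
about the property being tested, not the implementation in
predicates.md, and the function name is hypothetical.

```c
#include <stdint.h>

/* Returns 1 if v consists of exactly one contiguous run of set bits
   (for example 0x0ff0 or 0x8000000000000000), 0 otherwise.  */
int is_single_run_of_ones(uint64_t v)
{
    if (v == 0)
        return 0;
    /* Fill the trailing zeros with ones; a single run then becomes a
       value of the form 2^k - 1 (or all ones).  */
    uint64_t filled = v | (v - 1);
    /* A value of that form has no bits in common with itself plus one.  */
    return (filled & (filled + 1)) == 0;
}
```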

gcc/ChangeLog:

	* config/riscv/predicates.md (consecutive_bits_operand):
	Implement new predicate.

Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
2022-06-14 13:35:49 +02:00
Richard Biener e07a876c07 tree-optimization/105946 - avoid accessing excess args from uninit diag
The uninit diagnostics use pass-by-reference and access attributes,
but that iterates over function type arguments, which can in some
cases apparently outrun the actual arguments, leading to ICEs.
The following simply ignores arguments that are not present.

2022-06-14  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/105946
	* tree-ssa-uninit.cc (maybe_warn_pass_by_reference):
	Do not look at arguments not specified in the function call.
2022-06-14 12:52:49 +02:00
Richard Biener 90467f0ad6 middle-end/105965 - add missing v_c_e <{ el }> simplification
When we got the simplification of bit-field-ref to view-convert
we lost the ability to detect FMAs since we cannot look through

  _1 = {_10};
  _11 = VIEW_CONVERT_EXPR<float>(_1);

The following amends the (view_convert CONSTRUCTOR) pattern
to handle this case.

2022-06-14  Richard Biener  <rguenther@suse.de>

	PR middle-end/105965
	* match.pd (view_convert CONSTRUCTOR): Handle single-element
	CTOR case.

	* gcc.target/i386/pr105965.c: New testcase.
2022-06-14 12:52:49 +02:00
Eric Botcazou be6676286a Restore bootstrap on ARM
The -Wuse-after-free warning is explicitly disabled for destructors on ARM
because of the special ABI and the previous change to the warning machinery
uncovered another case where the warning data would be incorrectly erased.

gcc/
	* warning-control.cc (copy_warning) [generic version]: Do not erase
	the warning data of the destination location when the no-warning
	bit is not set on the source.
	(copy_warning) [tree version]: Return early if TO is equal to FROM.
	(copy_warning) [gimple version]: Likewise.
gcc/testsuite/
	* g++.dg/warn/Wuse-after-free5.C: New test.
2022-06-14 12:41:11 +02:00
Kewen Lin f907cf4c07 vect: Move suggested_unroll_factor applying [PR105940]
As PR105940 shows, when the rs6000 port tries to set
m_suggested_unroll_factor to 4 or so, there is an ICE on:

  exact_div (LOOP_VINFO_VECT_FACTOR (loop_vinfo),
             loop_vinfo->suggested_unroll_factor);

In function vect_analyze_loop_2, the current place where
suggested_unroll_factor is applied can't guarantee that it is
applied in all cases.  As the test case shows, the vectorizer
can retry with SLP forced off, and the vf is then reset from
saved_vectorization_factor, which has not had
suggested_unroll_factor applied.  This means we can end
up with a vf that neglects suggested_unroll_factor.
I think this is contrary to the design; we should apply
suggested_unroll_factor after start_over.

	PR tree-optimization/105940

gcc/ChangeLog:

	* tree-vect-loop.cc (vect_analyze_loop_2): Move the place of
	applying suggested_unroll_factor after start_over.
2022-06-14 00:57:01 -05:00
Takayuki 'January June' Suwa 077438933c xtensa: Optimize bitwise AND operation with some specific forms of constants
This patch offers several insn-and-split patterns for bitwise AND with
register and constant that can be represented as:

i.   1's least significant N bits and the others 0's (17 <= N <= 31)
ii.  1's most significant N bits and the others 0's (12 <= N <= 31)
iii. M 1's sequence of bits and trailing N 0's bits, that cannot fit into a
	"MOVI Ax, simm12" instruction (1 <= M <= 16, 1 <= N <= 30)

It also offers shortcuts for conditional branches that test whether the
result of any of the abovementioned operations is (not) equal to zero.
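
As a rough illustration of why masks of forms i and ii need no constant
load, the C sketch below applies them with a pair of shifts.  Whether
the new patterns split into exactly these sequences is an assumption,
not something stated in this message, and the function names are
hypothetical.

```c
#include <stdint.h>

/* Form i: keep the least significant n bits (1 <= n <= 31).
   Equivalent to x & ((1u << n) - 1).  */
uint32_t and_low_bits(uint32_t x, unsigned n)
{
    unsigned s = 32 - n;
    return (x << s) >> s;   /* shift left, then logical shift right */
}

/* Form ii: keep the most significant n bits (1 <= n <= 31).
   Equivalent to x & ~((1u << (32 - n)) - 1).  */
uint32_t and_high_bits(uint32_t x, unsigned n)
{
    unsigned s = 32 - n;
    return (x >> s) << s;   /* logical shift right, then shift left */
}
```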

gcc/ChangeLog:

	* config/xtensa/predicates.md (shifted_mask_operand):
	New predicate.
	* config/xtensa/xtensa.md (*andsi3_const_pow2_minus_one):
	New insn-and-split pattern.
	(*andsi3_const_negative_pow2, *andsi3_const_shifted_mask,
	*masktrue_const_pow2_minus_one, *masktrue_const_negative_pow2,
	*masktrue_const_shifted_mask): Ditto.
2022-06-13 17:25:48 -07:00
Takayuki 'January June' Suwa 70ce04ca35 xtensa: Make use of BALL/BNALL instructions
In the Xtensa ISA, there is no single machine instruction that calculates unary
bitwise negation, but a few similar fused instructions exist:

  "BALL  Ax, Ay, label"  // if ((~Ax & Ay) == 0) goto label;
  "BNALL Ax, Ay, label"  // if ((~Ax & Ay) != 0) goto label;

These instructions have never been emitted before, but there seems to be no
reason not to make use of them.
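
The branch conditions given in the comments above can be modelled in C
as follows (a sketch for illustration; the function names are
hypothetical).  The BALL condition holds exactly when every bit set in
Ay is also set in Ax.

```c
#include <stdint.h>

/* Condition under which "BALL Ax, Ay, label" branches.  */
int ball_taken(uint32_t ax, uint32_t ay)
{
    return (~ax & ay) == 0;
}

/* Condition under which "BNALL Ax, Ay, label" branches.  */
int bnall_taken(uint32_t ax, uint32_t ay)
{
    return (~ax & ay) != 0;
}

/* Alternative formulation of the BALL condition: all bits of ay are
   present in ax.  */
int all_mask_bits_set(uint32_t ax, uint32_t ay)
{
    return (ax & ay) == ay;
}
```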

gcc/ChangeLog:

	* config/xtensa/xtensa.md (*masktrue_bitcmpl): New insn pattern.

gcc/testsuite/ChangeLog:

	* gcc.target/xtensa/BALL-BNALL.c: New.
2022-06-13 17:25:48 -07:00
Takayuki 'January June' Suwa e1b193c1cc xtensa: Simplify conditional branch/move insn patterns
No need to describe the "false side" conditional insn patterns anymore.

gcc/ChangeLog:

	* config/xtensa/xtensa-protos.h (xtensa_emit_branch):
	Remove the first argument.
	(xtensa_emit_bit_branch): Remove it because now called only from the
	output statement of *bittrue insn pattern.
	* config/xtensa/xtensa.cc (gen_int_relational): Remove the last
	argument 'p_invert', and make so that the condition is reversed by
	itself as needed.
	(xtensa_expand_conditional_branch): Share the common path, and remove
	condition inversion code.
	(xtensa_emit_branch, xtensa_emit_movcc): Simplify by removing the
	"false side" pattern.
	(xtensa_emit_bit_branch): Remove it because of the abovementioned
	reason, and move the function body to *bittrue insn pattern.
	* config/xtensa/xtensa.md (*bittrue): Transplant the output
	statement from removed xtensa_emit_bit_branch().
	(*bfalse, *ubfalse, *bitfalse, *maskfalse): Remove the "false side"
	insn patterns.
2022-06-13 17:25:48 -07:00
Takayuki 'January June' Suwa 1c68ec1f8a xtensa: Improve shift operations more
This patch introduces funnel shifter utilization, and rearranges existing
"per-byte shift" insn patterns.

gcc/ChangeLog:

	* config/xtensa/predicates.md (logical_shift_operator,
	xtensa_shift_per_byte_operator): New predicates.
	* config/xtensa/xtensa-protos.h (xtensa_shlrd_which_direction):
	New prototype.
	* config/xtensa/xtensa.cc (xtensa_shlrd_which_direction):
	New helper function for funnel shift patterns.
	* config/xtensa/xtensa.md (ior_op): New code iterator.
	(*ashlsi3_1): Replace with new split pattern.
	(*shift_per_byte): Unify *ashlsi3_3x, *ashrsi3_3x and *lshrsi3_3x.
	(*shift_per_byte_omit_AND_0, *shift_per_byte_omit_AND_1):
	New insn-and-split patterns that redirect to *xtensa_shift_per_byte,
	in order to omit unnecessary bitwise AND operation.
	(*shlrd_reg_<code>, *shlrd_const_<code>, *shlrd_per_byte_<code>,
	*shlrd_per_byte_<code>_omit_AND):
	New insn patterns for funnel shifts.

gcc/testsuite/ChangeLog:

	* gcc.target/xtensa/funnel_shifter.c: New.
2022-06-13 17:25:48 -07:00
GCC Administrator c3642271e8 Daily bump. 2022-06-14 00:16:39 +00:00