In the testcase for PR94991, we are doing FAIL for scalar floating move expand
pattern since TARGET_FLOAT is false with option -mgeneral-regs-only. But move
expand pattern cannot fail. It would be better to replace the FAIL with code
that bitcasts to the equivalent integer mode using gen_lowpart.
2020-05-11 Felix Yang <felix.yang@huawei.com>
gcc/
PR target/94991
* config/aarch64/aarch64.md (mov<mode>):
Bitcasts to the equivalent integer mode using gen_lowpart
instead of doing FAIL for scalar floating point move.
gcc/testsuite/
PR target/94991
* gcc.target/aarch64/mgeneral-regs_5.c: New test.
Given the C code:
unsigned long long inv(unsigned a, unsigned b, unsigned c)
{
return a ? b : ~c;
}
Prior to this patch, AArch64 GCC at -O2 generates:
inv:
cmp w0, 0
mvn w2, w2
csel w0, w1, w2, ne
ret
and after applying the patch, we get:
inv:
cmp w0, 0
csinv w0, w1, w2, ne
ret
The new pattern also catches the optimization for the symmetric case where the
body of foo reads a ? ~b : c.
Similarly, with the following code:
unsigned long long neg(unsigned a, unsigned b, unsigned c)
{
return a ? b : -c;
}
GCC at -O2 previously gave:
neg:
cmp w0, 0
neg w2, w2
csel w0, w1, w2, ne
but now gives:
neg:
cmp w0, 0
csneg w0, w1, w2, ne
ret
with the corresponding code for the symmetric case as above.
2020-05-11 Alex Coplan <alex.coplan@arm.com>
gcc/
* config/aarch64/aarch64.c (aarch64_if_then_else_costs): Add case
to correctly calculate cost for new pattern (*csinv3_uxtw_insn3).
* config/aarch64/aarch64.md (*csinv3_utxw_insn1): New.
(*csinv3_uxtw_insn2): New.
(*csinv3_uxtw_insn3): New.
* config/aarch64/iterators.md (neg_not_cs): New.
gcc/testsuite/
* gcc.target/aarch64/csinv-neg.c: New test.
- The testcase uses the -fgnu-tm option but does not ensure that support
is enabled. This patch adds the test to the testcase.
* gcc/testsuite/g++.dg/ipa/pr94856.C: Require fgnu-tm.
Enable V2SFmode vectorization and vectorize V2SFmode PLUS,
MINUS, MULT, MIN and MAX operations using XMM registers.
To avoid unwanted secondary effects (e.g. exceptions), load values
to XMM registers using MOVQ that clears high bits of the XMM
register outside V2SFmode.
The compiler now vectorizes e.g.:
float r[2], a[2], b[2];
void
test_plus (void)
{
for (int i = 0; i < 2; i++)
r[i] = a[i] + b[i];
}
to:
movq a(%rip), %xmm0
movq b(%rip), %xmm1
addps %xmm1, %xmm0
movlps %xmm0, r(%rip)
ret
gcc/ChangeLog:
PR target/95046
* config/i386/i386.c (ix86_vector_mode_supported_p):
Vectorize 3dNOW! vector modes for TARGET_MMX_WITH_SSE.
* config/i386/mmx.md (*mov<mode>_internal): Do not set
mode of alternative 13 to V2SF for TARGET_MMX_WITH_SSE.
(mmx_addv2sf3): Change operand predicates from
nonimmediate_operand to register_mmxmem_operand.
(addv2sf3): New expander.
(*mmx_addv2sf3): Add SSE/AVX alternatives. Change operand
predicates from nonimmediate_operand to register_mmxmem_operand.
Enable instruction pattern for TARGET_MMX_WITH_SSE.
(mmx_subv2sf3): Change operand predicate from
nonimmediate_operand to register_mmxmem_operand.
(mmx_subrv2sf3): Ditto.
(subv2sf3): New expander.
(*mmx_subv2sf3): Add SSE/AVX alternatives. Change operand
predicates from nonimmediate_operand to register_mmxmem_operand.
Enable instruction pattern for TARGET_MMX_WITH_SSE.
(mmx_mulv2sf3): Change operand predicates from
nonimmediate_operand to register_mmxmem_operand.
(mulv2sf3): New expander.
(*mmx_mulv2sf3): Add SSE/AVX alternatives. Change operand
predicates from nonimmediate_operand to register_mmxmem_operand.
Enable instruction pattern for TARGET_MMX_WITH_SSE.
(mmx_<code>v2sf3): Change operand predicates from
nonimmediate_operand to register_mmxmem_operand.
(<code>v2sf3): New expander.
(*mmx_<code>v2sf3): Add SSE/AVX alternatives. Change operand
predicates from nonimmediate_operand to register_mmxmem_operand.
Enable instruction pattern for TARGET_MMX_WITH_SSE.
(mmx_ieee_<ieee_maxmin>v2sf3): Ditto.
testsuite/ChangeLog:
PR target/95046
* gcc.target/i386/pr95046-1.c: New test.
This change is from a patch developed for gcc-5. The code
has moved on since then requiring a change to interface.c
2020-05-11 Janus Weil <janus@gcc.gnu.org>
Dominique d'Humieres <dominiq@lps.ens.fr>
gcc/fortran/
PR fortran/59107
* gfortran.h: Rename field resolved as resolve_symbol_called
and assign two 2 bits instead of 1.
* interface.c (check_dtio_interface1): Use new field name.
(gfc_find_typebound_dtio_proc): Use new field name.
* resolve.c (gfc_resolve_intrinsic): Replace check of the formal
field with resolve_symbol_called is at least 2, if it is not
set the field to 2. (resolve_typebound_procedure): Use new field
name. (resolve_symbol): Use new field name and check whether it
is at least 1, if it is not set the field to 1.
2020-05-11 Mark Eggleston <markeggleston@gcc.gnu.org>
gcc/testsuite/
PR fortran/59107
* gfortran.dg/pr59107.f90: New test.
Use determine_value_range to get value range info for fold convert expressions
with internal operation PLUS_EXPR/MINUS_EXPR/MULT_EXPR when not overflow on
wrapping overflow inner type. i.e.:
(long unsigned int)((unsigned int)n * 10 + 1)
=>
(long unsigned int)n * (long unsigned int)10 + (long unsigned int)1
With this patch for affine combination, load/store motion could detect
more address refs independency and promote some memory expressions to
registers within loop.
PS: Replace the previous "(T1)(X + CST) as (T1)X - (T1)(-CST))"
to "(T1)(X + CST) as (T1)X + (T1)(CST))" for wrapping overflow.
Bootstrap and regression tested pass on Power8-LE.
gcc/ChangeLog
2020-05-11 Xiong Hu Luo <luoxhu@linux.ibm.com>
PR tree-optimization/83403
* tree-affine.c (expr_to_aff_combination): Replace SSA_NAME with
determine_value_range, Add fold conversion of MULT_EXPR, fix the
previous PLUS_EXPR.
gcc/testsuite/ChangeLog
2020-05-11 Xiong Hu Luo <luoxhu@linux.ibm.com>
PR tree-optimization/83403
* gcc.dg/tree-ssa/pr83403-1.c: New test.
* gcc.dg/tree-ssa/pr83403-2.c: New test.
* gcc.dg/tree-ssa/pr83403.h: New header.
Avoids race condition when checking for an iterator to be singular or
to be comparable to another iterator.
* src/c++/debug.cc
(_Safe_sequence_base::_M_attach_single): Set attached iterator
sequence pointer and version.
(_Safe_sequence_base::_M_detach_single): Reset detached iterator.
(_Safe_iterator_base::_M_attach): Remove attached iterator sequence
pointer and version asignments.
(_Safe_iterator_base::_M_attach_single): Likewise.
(_Safe_iterator_base::_M_detach_single): Remove detached iterator
reset.
(_Safe_iterator_base::_M_singular): Use atomic load to access parent
sequence.
(_Safe_iterator_base::_M_can_compare): Likewise.
(_Safe_iterator_base::_M_get_mutex): Likewise.
(_Safe_local_iterator_base::_M_attach): Remove attached iterator container
pointer and version assignments.
(_Safe_local_iterator_base::_M_attach_single): Likewise.
(_Safe_unordered_container_base::_M_attach_local_single):
Set attached iterator container pointer and version.
(_Safe_unordered_container_base::_M_detach_local_single): Reset detached
iterator.
Division by zero in declaration statements could sometimes
generate NULL pointers being passed around that lead to ICEs.
2020-05-10 Harald Anlauf <anlauf@gmx.de>
gcc/fortran/
PR fortran/93499
* arith.c (gfc_divide): Catch division by zero.
(eval_intrinsic_f3): Safeguard for NULL operands.
gcc/testsuite/
PR fortran/93499
* gfortran.dg/pr93499.f90: New test.
libbacktrace/
PR libbacktrace/88745
* macho.c: New file.
* filetype.awk: Recognize Mach-O files.
* Makefile.am (FORMAT_FILES): Add macho.c.
(check_DATA): New variable. Set to .dSYM if HAVE_DSYMUTIL.
(%.dSYM): New pattern target.
(test_macho_SOURCES, test_macho_CFLAGS): New targets.
(test_macho_LDADD): New target.
(BUILDTESTS): Add test_macho.
(macho.lo): Add dependencies.
* configure.ac: Recognize macho file type. Check for
mach-o/dyld.h. Don't try to run objcopy if we don't find it.
Look for dsymutil and define a HAVE_DSYMUTIL conditional.
* Makefile.in: Regenerate.
* configure: Regenerate.
* config.h.in: Regenerate.
This supports FreeBSD and NetBSD when /proc is not mounted.
libbacktrace/
* fileline.c (sysctl_exec_name): New static function.
(sysctl_exec_name1): New macro or static function.
(sysctl_exec_name2): Likewise.
(fileline_initialize): Try sysctl_exec_name[12].
* configure.ac: Check for sysctl args to fetch executable name.
* configure: Regenerate.
* config.h.in: Regenerate.
This is the mode where the GNAT compiler does not use special encodings
in the debug info to describe some Ada constructs, for example packed
array types.
* gcc-interface/ada-tree.h (TYPE_PACKED_ARRAY_TYPE_P): Rename into...
(TYPE_BIT_PACKED_ARRAY_TYPE_P): ...this.
(TYPE_IS_PACKED_ARRAY_TYPE_P): Rename into...
(BIT_PACKED_ARRAY_TYPE_P): ...this.
(TYPE_IMPL_PACKED_ARRAY_P): Adjust to above renaming.
* gcc-interface/gigi.h (maybe_pad_type): Remove IS_USER_TYPE..
* gcc-interface/decl.c (gnat_to_gnu_entity) <E_Variable>: Adjust
call to maybe_pad_type.
<E_Ordinary_Fixed_Point_Type>: Remove const qualifiers for tree.
<E_Signed_Integer_Subtype>: Remove redundant test and redundant call
to associate_original_type_to_packed_array. Turn into assertion.
Call associate_original_type_to_packed_array and modify
gnu_entity_name accordingly. Explicitly set the parallel type
for GNAT encodings.
Call create_type_decl in the misaligned case before maybe_pad_type.
<E_Array_Type>: Do not use the name of the implementation type for
a packed array when not using GNAT encodings.
<E_Array_Subtype>: Move around setting flags. Use the result of the
call to associate_original_type_to_packed_array for gnu_entity_name.
<E_Record_Subtype>: Create XVS type and XVZ variable only if debug
info is requested for the type.
Call create_type_decl if a padded type was created for a type entity
(gnat_to_gnu_component_type): Use local variable and adjust calls to
maybe_pad_type.
(gnat_to_gnu_subprog_type): Adjust call to maybe_pad_type.
(gnat_to_gnu_field): Likewise.
(validate_size): Adjust to renaming of macro.
(set_rm_size): Likewise.
(associate_original_type_to_packed_array): Adjust return type and
return the name of the original type if GNAT encodings are not used
* gcc-interface/misc.c (gnat_get_debug_typ): Remove obsolete stuff.
(gnat_get_fixed_point_type_info): Remove const qualifiers for tree.
(gnat_get_array_descr_info): Likewise and set variables lazily.
Remove call to maybe_debug_type. Simplify a few computations.
(enumerate_modes): Remove const qualifier for tree.
* gcc-interface/utils.c (make_type_from_size): Adjust to renaming.
(maybe_pad_type): Remove IS_USER_TYPE parameter and adjust. Remove
specific code for implementation types for packed arrays.
(compute_deferred_decl_context): Remove const qualifier for tree.
(convert): Adjust call to maybe_pad_type.
(unchecked_convert): Likewise.
* gcc-interface/utils2.c (is_simple_additive_expressio): Likewise.
This can happen because we build an array type on the fly in case there
is an apparent type inconsistency in the construct.
* gcc-interface/utils2.c (build_binary_op) <ARRAY_RANGE_REF>: Use
build_nonshared_array_type to build the common type and declare it.
This was already the case in -gnatp mode.
* gcc-interface/misc.c (gnat_init_gcc_eh): Do not override the user
for -fnon-call-exceptions in default mode.
This prevents gigi from making a local copy of large aggregates.
* gcc-interface/trans.c (lvalue_required_p) <N_Selected_Component>:
Merge with N_Slice.
<N_Allocator>: Move to...
(lvalue_for_aggregate_p): ...here. New function.
(Identifier_to_gnu): For an identifier with aggregate type, also
call lvalue_for_aggregate_p if lvalue_required_p returned false
before substituting the identifier with the constant.
Aggregates can be surrounded by a qualified expression and this
prepares the support code in gigi for accepting them.
* gcc-interface/trans.c (gnat_to_gnu) <N_Assignment_Statement>: Deal
with qualified "others" aggregates in the memset case.
This happens when it is passed by copy and not passed in.
* gcc-interface/decl.c (gnat_to_gnu_param): Also back-annotate the
mechanism in the case of an Out parameter only passed by copy-out.
It was in the ada/gcc-interface repository and is outdated.
* tree.h (expr_align): Delete.
* tree.c (expr_align): Likewise.
ada/
* gcc-interface/utils2.c: Include builtins.h.
(known_alignment) <ADDR_EXPR>: Use DECL_ALIGN for DECL_P operands
and get_object_alignment for the rest.
two-types-6.c never emitted the warning, even in 4.5/4.6, and pr93382.c
doesn't have properly escaped parens, so doesn't check whether they are
literally present in the message.
2020-05-09 Jakub Jelinek <jakub@redhat.com>
PR testsuite/95008
* gcc.dg/two-types-6.c: Remove dg-warning directive that never
triggered.
* gcc.dg/analyzer/pr93382.c: Properly escape ()s in the diagnostic
message.
While gcc seems to prefer transforming tests on the result of
reversible operations, into tests on the original, it also can
work with the destination, if allocated to the same register as
it commonly-enough is. The re-use is easily covered in a
test-case. (N.B.: the value 0x80000000 appears to be considered
invalid and unimportant.) Spotted as a "microregression" in
libgcc when comparing to the cc0 version.
gcc:
* config/cris/cris.c (cris_select_cc_mode): Return CC_NZmode for
NEG too. Correct comment.
* config/cris/cris.md ("<anz>neg<mode>2<setnz>"): Rename from
"neg<mode>2".
Enables the use of btst / btstq for a single bit (at other bits
than 0, including as indicated by a variable) to set
condition-codes. There's also a bug-fix for the bit-0-btstq
pattern; it shouldn't generate CCmode as only the Z flag is
valid, still using CC_NZmode is ok, as only equality-tests are
generated. The cris_rtx_costs tweak is necessary or else
combine will consider the btst not preferable. It reduces the
difference to cc0-costs beyond the threshold to the
transformation being seen as profitable, but there's still a
difference in values for the pre-split-time btst+branch as
opposed to the cc0 btst and branch, with both appearing to be
the cost of several insns (18 and 22).
gcc:
* config/cris/cris-modes.def (CC_ZnN): New CC_MODE.
* config/cris/cris.c (cris_rtx_costs): Handle pre-split bit-test
* config/cris/cris.md (ZnNNZSET, ZnNNZUSE): New mode_iterators.
(znnCC, rznnCC): New code_attrs.
("*btst<mode>"): Iterator over ZnNNZSET instead of NZVCSET. Remove
obseolete comment. Add belt-and-suspenders mode-test to condition.
Add fixme regarding remaining matched-but-not-generated case.
("*cbranch<mode>4_btstrq1_<CC>"): New insn_and_split.
("*cbranch<mode>4_btstqb0_<CC>"): Rename from
"*cbranch<mode>4_btstq<CC>". Split to CC_NZ instead of CC.
("*b<zcond:code><mode>"): Iterate over ZnNNZUSE instead of NZUSE.
Handle output of CC_ZnNmode.
("*b<nzcond:code>_reversed<mode>"): Ditto.
Enables dropping of compares with zero of the result, through
any CCmode substitution.
gcc:
* config/cris/cris.md
("<acc><anz><anzvc><shlr>si3<setcc><setnz><setnzvc>"): Rename
from "<shlr>si3".
("<acc><anz><anzvc>clzsi2<setcc><setnz><setnzvc>"): Rename
from "clzsi2".
("<acc><anz><anzvc>bswapsi2<setcc><setnz><setnzvc>"): Rename
from "bswapsi2".
("*uminsi3<setcc><setnz><setnzvc>"): Rename from "*uminsi3".
Enabling dropping of compares with zero of the result, through
any CCmode substitution. Beware that this will cause
size-suboptimal operands to appear for e.g. 32-bit "and":
-65536, -256, 255, 65535; for 16-bit "and" -256, -31..-1, 255;
for 8-bit "and" -31..-1. Fixed for 0..31 for 16- and 8-bit
sizes as it seemed worthwhile and used in libgcc.
gcc:
* config/cris/cris.md ("*expanded_andsi<setcc><setnz><setnzvc>"):
Rename from "*expanded_andsi".
("*iorsi3<setcc><setnz><setnzvc>"): Similar from "*iorsi3".
Decorate "cc" attribute to make "cc<cccc><ccnz><ccnzvc>".
("*iorhi3<setcc><setnz><setnzvc>"): Similar from "*iorhi3".
("*iorqi3<setcc><setnz><setnzvc>"): Similar from "*iorqi3".
("*expanded_andhi<setcc><setnz><setnzvc>"): Similar from
"*expanded_andhi". Add quick cc-setting alternative for 0..31.
("*andqi3<setcc><setnz><setnzvc>"): Similar from "*andqi3".
("<acc><anz><anzvc>xorsi3<setcc><setnz><setnzvc>"): Rename
from "xorsi3".
("<acc><anz><anzvc>one_cmplsi2<setcc><setnz><setnzvc>"): Rename
from "one_cmplsi2".
Enabling dropping of compares with zero of the result, through
the non-VC-setting CCmode substitution. Beware that the
substitutions for 8- and 16-bit patterns will in some cases be
size-neutral; e.g. replacing an "addq 1..63,$rN" + "test.w $rN"
or "subq 1..63,$rN" + "test.w $rN" with an "add.w -63..63,$rN".
gcc:
* config/cris/cris.md ("*adddi3<setnz>"): Rename from "*adddi3".
cris: Enable 32-bit addition to set condition codes.
("*subdi3<setnz>"): Similarly from "*subdi3".
("*addsi3<setnz>"): Similarly from "*addsi3".
("*subsi3<setnz>"): Similarly from "*subsi3".
("*addhi3<setnz>"): Similarly from "*addhi3" and decorate the
"cc" attribute to "cc<ccnz>".
("*addqi3<setnz>"): Similarly from "*addqi3".
("*sub<mode>3<setnz>"): Similarly from "*sub<mode>3".
Enable dropping of compares with zero of the result, through the
three CCmode substitutions and the cmpelim pass.
gcc:
* config/cris/cris.md
("<acc><anz><anzvc>extend<mode>si2<setcc><setnz><setnzvc>"):
Rename from "extend<mode>si2".
("<acc><anz><anzvc>zero_extend<mode>si2<setcc><setnz><setnzvc>"):
Similar, from "zero_extend<mode>si2".
Like with movsi_internal. Looks like the "cc" attribute didn't
need tweaking for "movhi", but did for "movqi". N.B.: disabled
alternatives make cause a later alternative to match.
Also, non-anonymous insns get declarations and gen_* functions.
We don't want that; even if it doesn't affect generated code
it's sloppy. (This may or may not be preferable to the
name decorations obfuscating standard pattern names.)
Also anonymize left-over non-anonymous branches; they haven't
been needing names since the cbranch pattern was made the
generic method.
gcc:
* config/cris/cris.md ("anz", "anzvc", "acc"): New define_subst_attrs.
("<acc><anz><anzvc>movhi<setcc><setnz><setnzvc>"): Rename from
"movhi". Rename "cc" attribute to "cc<cccc><ccnz><ccnzvc>".
("<acc><anz><anzvc>movqi<setcc><setnz><setnzvc>"): Similar from
"movqi". Correct contents of, and rename "cc" attribute to
"cc<cccc><ccnz><ccnzvc>".
("*b<zcond:code><mode>"): Rename from "b<zcond:code><mode>".
("*b<nzvccond:code><mode>"): Rename from "b<nzvccond:code><mode>".
("*b<rnzcond:code><mode>"): Rename from "*b<rnzcond:code><mode>".
Completion of, and first use of, the CRIS-specific parts of the
condition-code-setting framework, making use of the define_subst
machinery and the cmpelim optimization pass. This round, just
moves in SImode. Note the re-use of the cc0 era "cc" attribute
(tweaks needed).
gcc:
* config/cris/cris.md ("cc"): Comment on new use.
("cc_enabled"): New attribute.
("enabled"): Make default fall back to cc_enabled.
("setnz", "ccnz", "setnzvc", "ccnzvc", "setcc", "cccc"): New
default_subst_attrs.
("setnz_subst", "setnzvc_subst", "setcc_subst"): New default_subst.
("*movsi_internal<setcc><setnz><setnzvc>"): Rename from
"*movsi_internal". Correct contents of, and rename attribute
"cc" to "cc<cccc><ccnz><ccnzvc>".
This is just the framework bits of splitting CCmode into classes
where the cc-setter can merge mode (CCmode), classes where the
cc-setter must set V and C "usefully" (as well as N and Z flags)
and classes where the cc-setter is something like an arithmetic
instruction, where N and Z are valid but C and V reflect the
operation rather than a compare of the result with zero. This
should yield identical or near-identical code.
The old split of conditions into the ncond and ocond sets took
into account the transformations done by final.c:alter_cond from
cc_status.flags & CC_NO_OVERFLOW, and wasn't a reflection of the
hardware description of the conditions (i.e. whether V mattered
or not).
gcc:
Prepare for cmpelim pass to eliminate redundant compare insns.
* config/cris/cris-modes.def: New file.
* config/cris/cris-protos.h (cris_select_cc_mode): Declare.
(cris_notice_update_cc): Remove left-over declaration.
* config/cris/cris.c (TARGET_CC_MODES_COMPATIBLE): Define.
(cris_select_cc_mode, cris_cc_modes_compatible): New functions.
* config/cris/cris.h (SELECT_CC_MODE): Define.
* config/cris/cris.md (NZSET, NZUSE, NZVCSET, NZVCUSE): New
mode_iterators.
(cond): New code_iterator.
(nzcond): Replacement for incorrect ncond. All callers changed.
(nzvccond): Replacement for ocond. All callers changed.
(rnzcond): Replacement for rcond. All callers changed.
(xCC): New code_attr.
(cmp_op1c, cmp_op0c): Renumber from cmp_op1c and cmp_op2c. All
users changed.
("*cmpdi<NZVCSET:mode>"): Rename from "*cmpdi". Replace
CCmode with iteration over NZVCSET.
("*cmp_ext<BW:mode><NZVCSET:mode>"): Similarly; rename from
"*cmp_ext<mode>".
("*cmpsi<NZVCSET:mode>"): Similarly, from "*cmpsi".
("*cmp<BW:mode><NZVCSET:mode>"): Similarly from "*cmp<mode>".
("*btst<mode>"): Similarly, from "*btst".
("*cbranch<mode><code>4"): Rename from "*cbranch<mode>4",
iterating over cond instead of matching the comparison with
ordered_comparison_operator.
("*cbranch<mode>4_btstq<CC>"): Correct label operand number.
("b<zcond:code><mode>"): Rename from "b<ncond:code>", iterating
over NZUSE.
("b<nzvccond:code><mode>"): Similarly from "b<ocond:code>", over
NZVCUSE. Remove FIXME.
("*b<nzcond:code>_reversed<mode>"): Similarly from
"*b<ncond:code>_reversed", over NZUSE.
("*b<nzvccond:code>_reversed<mode>"): Similarly from
"*b<ocond:code>_reversed", over NZVCUSE. Remove FIXME.
("b<rnzcond:code><mode>"): Similarly from "b<rcond:code>",
over NZUSE. Reinstate "b<oCC>" vs. "b<CC>" mnemonic choice,
depending on CC_NZmode vs. CCmode. Remove FIXME.
("*b<rnzcond:code>_reversed<mode>"): Similarly from
"*b<rcond:code>_reversed", over NZUSE.
("*cstore<mode><code>4"): Rename from "*cstore<mode>4",
iterating over cond instead of matching the comparison with
ordered_comparison_operator.
("*s<nzcond:code><mode>"): Rename from "*s<ncond:code>",
iterating over NZUSE.
("*s<rnzcond:code><mode>"): Similar from "*s<rcond:code>", over
NZUSE. Reinstate "b<oCC>" vs. "b<CC>" mnemonic choice,
depending on CC_NZmode vs. CCmode.
("*s<nzvccond:code><mode>"): Simlar from "*s<ocond:code>", over
NZVCUSE. Remove FIXME.
A separated follow-up to the previous change: Also emit moves
from zero as not clobbering condition-codes.
(note: actually folded into the previous ChangeLog-entry)
gcc:
* config/cris/cris.md ("movsi"): For a zero-source post-reload,
generate a clobberless variant.
("*mov_fromzero<mode>_split"): New split.
("*mov_fromzero<mode>"): New insn.
In preparation for compare-elimination (for it to be obviously
useful), we have to have some common insn in-between that
doesn't clobber condition-codes. A move to memory is an obvious
choice. Note the FIXME: we can do this for a zero source too;
later.
gcc:
* config/cris/cris.md ("movsi"): For memory destination
post-reload, generate clobberless variant.
("*mov_tomem<mode>_split"): New split.
("*mov_tomem<mode>"): New insn.
("enabled", mov_tomem_enabled): Define and use to exclude "x" ->
"Q>m" for less-than-SImode.
For some reason (like a buglet in the user in jump.c), defining this makes
a beneficial difference in ledf2, thus this is separated to its own commit.
Also, add comment on (not defining) REVERSE_CONDITION.
gcc:
* config/cris/cris.h (REVERSIBLE_CC_MODE): Define to true.
This made a whole lot of difference regarding regressions in the
delay-slot filling. Before this, comparing __lshrdi3 for v10
before/after decc0ration and other nearby functions was worse by
several missing delay-slot fills; now down to 1.
Also, add a comment about *not* defining
TARGET_FIXED_CONDITION_CODE_REGS.
gcc:
* config/cris/cris.c (TARGET_FLAGS_REGNUM): Define.
As the added FIXME says, the new insn_and_split generates only a
small subset of the bit-tests that can be matched by "*btst" and
that were emitted by the undecc0rated cris.md at combine-time,
but it's naturally separable from a general variant by being
just what's needed for the test-cases that were previously
xfailed, and that no additional CCmodes are required.
gcc:
PR target/93372
* config/cris/cris.md (zcond): New code_iterator.
("*cbranch<mode>4_btstq<CC>"): New insn_and_split.
In the parlance of <https://gcc.gnu.org/wiki/CC0Transition>,
this is a basic "type 2" conversion, without
condition-code-related optimizations (just plain CCmode), but
with "cstore{M}4" defined. CRIS is somewhat similar to the
m68k; most instructions affect condition-codes. To wit, it
lacks sufficient instructions to compose an arbitrary valid
address in a register, specifically from a valid address where
involved registers have to be spilled or adjusted, without
affecting condition-codes in CRIS_CC0_REGNUM aka. dccr.
On the other hand, moving dccr to and from a stackpointer-plus-
constant-offset-address *can* be done without additional register
use, and moving to or from a general register does not affect
it. There's no instruction to add a constant to a register or
to put a constant in a register, without affecting dccr, but
there *is* an instruction to add a register (optionally scaled)
to another without affecting dccr (i.e. "addi"). Also, moves
*to* memory from any register do not affect dccr, and likewise
between another special registers and a general register. Maybe
some of that opens up the solution-space to a better solution
than clobbering dccr until reload_completed; to be investigated.
FAOD: I know what to do in the direction of defining and using
additional CCmodes, but prefer to do the full transition in
smaller steps.
Regarding the similarity to m68k, I didn't follow the steps of
the m68k cc0 transition, making use of the final_postscan_insn
hook as with the a NOTICE_UPDATE_CC machinery. For one, because
it seems to be lacking in that it keeps compare-elimination
restricted to output-time, but also because it seems a bad match
considering that CRIS has delay-slots; better try to eliminate
compares earlier. Another approach which I originally intended
to implement, that of the visium port of defining three variants
for most insns (not counting the define_subst expansions;
unaffecting-before-reload, clobbering and setting), seems
overworked and bloating the machine description. I may be
proven wrong, but I prefer we fix gcc if some something bails on
seeing a parallel with a clobber of that specific hard-register.
Also, I chose to remove most anonymous combination-matching
patterns; matchers, splitters and peepholes instead of
converting them to add clobbers of CRIS_CC0_REGNUM. There are
exclusions: those covered in the test-suite, if trivial enough.
Many of these patterns are used to handle the side-effect-
assignment addressing-modes as put together by combine: a
"prefix instruction" before the main instruction, where the main
instruction uses the post-incremented-register addressing-mode
and the "left-over" instruction-field in the prefixed insn to
assign a register. An example: the hopefully descriptive
"move.d $r9,[$r0=$r1+1234]" compared to "move.d $r9,[$r1+1234]";
both formed by the prefix insn "biap.w 1234,$r1" before
respectively "move.d $r9,[$r0+]" and "move.d $r9,[$r0]". Other
prefix variants exist. Useful, but optional, except where
side-effect assignment was used in a special case in the
function prologue; adjusted to a less optimal combination.
Support like the function cris_side_effect_mode_ok is kept.
I intend to put back as many as I find use for, of those
anonymous patterns in a controlled manner, with self-contained
test-cases proving their usability, rather than symmetry with
other instructions and similar addressing modes, which guided
the original introduction. I've entered pr93372 to track code
performance regressions related to this transition, with focus
on target-side causes and fixes; besides the function prologue
special-case, there were some checking presence of the bit-test
(btstq) instruction.
The now-gone "tst<mode>" patterns deserve a comment too: they
were an artefact from pre-"cbranch" era, now fully folded into
the "cmp<mode>" patterns.
I've left the now-unused "cc" insn attribute in, for the time
being; to be removed, used or transformed to be useful with
further work to fix pr93372. It can't be used as is, because
"normal" doesn't mean "like a compare instruction" but "handled
by NOTICE_UPDATE_CC" and may in fact be reflecting e.g. reverse
operands, something that bit me during the conversion.
gcc:
Move trivially from cc0 to reg:CC model, removing most optimizations.
* config/cris/cris.md: Remove all side-effect patterns and their
splitters. Remove most peepholes. Add clobbers of CRIS_CC0_REGNUM
to all but post-reload control-flow and movem insns. Remove
constraints on all modified expanders. Remove obsoleted cc0-related
references.
(attr "cc"): Remove alternative "rev".
(mode_iterator BWDD, DI_, SI_): New.
(mode_attr sCC_destc, cmp_op1c, cmp_op2c): New.
("tst<mode>"): Remove; fold as "M" alternative into compare insn.
("mstep_shift", "mstep_mul"): Remove patterns.
("s<rcond>", "s<ocond>", "s<ncond>"): Anonymize.
* config/cris/cris.c: Change all non-condition-code,
non-control-flow emitted insns to add a parallel with clobber of
CRIS_CC0_REGNUM, mostly by changing from gen_rtx_SET with
emit_insn to use of emit_move_insn, gen_add2_insn or
cris_emit_insn, as convenient.
(cris_reg_overlap_mentioned_p)
(cris_normal_notice_update_cc, cris_notice_update_cc): Remove.
(cris_movem_load_rest_p): Don't assume all elements in a
PARALLEL are SETs.
(cris_store_multiple_op_p): Ditto.
(cris_emit_insn): New function.
* cris/cris-protos.h (cris_emit_insn): Declare.