Aggregates can be surrounded by a qualified expression and this
prepares the support code in gigi for accepting them.
* gcc-interface/trans.c (gnat_to_gnu) <N_Assignment_Statement>: Deal
with qualified "others" aggregates in the memset case.
This happens when it is passed by copy and not passed in.
* gcc-interface/decl.c (gnat_to_gnu_param): Also back-annotate the
mechanism in the case of an Out parameter only passed by copy-out.
It was in the ada/gcc-interface repository and is outdated.
* tree.h (expr_align): Delete.
* tree.c (expr_align): Likewise.
ada/
* gcc-interface/utils2.c: Include builtins.h.
(known_alignment) <ADDR_EXPR>: Use DECL_ALIGN for DECL_P operands
and get_object_alignment for the rest.
two-types-6.c never emitted the warning, even in 4.5/4.6, and pr93382.c
doesn't have properly escaped parens, so doesn't check whether they are
literally present in the message.
2020-05-09 Jakub Jelinek <jakub@redhat.com>
PR testsuite/95008
* gcc.dg/two-types-6.c: Remove dg-warning directive that never
triggered.
* gcc.dg/analyzer/pr93382.c: Properly escape ()s in the diagnostic
message.
While gcc seems to prefer transforming tests on the result of
reversible operations, into tests on the original, it also can
work with the destination, if allocated to the same register as
it commonly-enough is. The re-use is easily covered in a
test-case. (N.B.: the value 0x80000000 appears to be considered
invalid and unimportant.) Spotted as a "microregression" in
libgcc when comparing to the cc0 version.
gcc:
* config/cris/cris.c (cris_select_cc_mode): Return CC_NZmode for
NEG too. Correct comment.
* config/cris/cris.md ("<anz>neg<mode>2<setnz>"): Rename from
"neg<mode>2".
Enables the use of btst / btstq for a single bit (at other bits
than 0, including as indicated by a variable) to set
condition-codes. There's also a bug-fix for the bit-0-btstq
pattern; it shouldn't generate CCmode as only the Z flag is
valid, still using CC_NZmode is ok, as only equality-tests are
generated. The cris_rtx_costs tweak is necessary or else
combine will consider the btst not preferable. It reduces the
difference to cc0-costs beyond the threshold to the
transformation being seen as profitable, but there's still a
difference in values for the pre-split-time btst+branch as
opposed to the cc0 btst and branch, with both appearing to be
the cost of several insns (18 and 22).
gcc:
* config/cris/cris-modes.def (CC_ZnN): New CC_MODE.
* config/cris/cris.c (cris_rtx_costs): Handle pre-split bit-test
* config/cris/cris.md (ZnNNZSET, ZnNNZUSE): New mode_iterators.
(znnCC, rznnCC): New code_attrs.
("*btst<mode>"): Iterate over ZnNNZSET instead of NZVCSET. Remove
obsolete comment. Add belt-and-suspenders mode-test to condition.
Add fixme regarding remaining matched-but-not-generated case.
("*cbranch<mode>4_btstrq1_<CC>"): New insn_and_split.
("*cbranch<mode>4_btstqb0_<CC>"): Rename from
"*cbranch<mode>4_btstq<CC>". Split to CC_NZ instead of CC.
("*b<zcond:code><mode>"): Iterate over ZnNNZUSE instead of NZUSE.
Handle output of CC_ZnNmode.
("*b<nzcond:code>_reversed<mode>"): Ditto.
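
As a rough illustration of the source shape this entry is about, here is a
minimal sketch of a single-bit test at a variable bit position feeding a
branch. The function names are hypothetical and are not taken from the
testsuite; whether the compiler actually emits btst for them depends on the
target and options. The bit number is assumed to be less than the width of
unsigned.

```c
/* Hypothetical examples: test one bit of X, selected by N, and branch.  */

/* Returns A when bit N of X is set, otherwise B.  Assumes N is smaller
   than the number of bits in unsigned.  */
int pick_on_bit(unsigned x, unsigned n, int a, int b)
{
    if (x & (1u << n))
        return a;
    return b;
}

/* The same test written as shift-then-mask; it yields 0 or 1.  */
int bit_is_set(unsigned x, unsigned n)
{
    return (x >> n) & 1u;
}
```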
Enables dropping of compares with zero of the result, through
any CCmode substitution.
gcc:
* config/cris/cris.md
("<acc><anz><anzvc><shlr>si3<setcc><setnz><setnzvc>"): Rename
from "<shlr>si3".
("<acc><anz><anzvc>clzsi2<setcc><setnz><setnzvc>"): Rename
from "clzsi2".
("<acc><anz><anzvc>bswapsi2<setcc><setnz><setnzvc>"): Rename
from "bswapsi2".
("*uminsi3<setcc><setnz><setnzvc>"): Rename from "*uminsi3".
Enabling dropping of compares with zero of the result, through
any CCmode substitution. Beware that this will cause
size-suboptimal operands to appear for e.g. 32-bit "and":
-65536, -256, 255, 65535; for 16-bit "and" -256, -31..-1, 255;
for 8-bit "and" -31..-1. Fixed for 0..31 for 16- and 8-bit
sizes as it seemed worthwhile and used in libgcc.
gcc:
* config/cris/cris.md ("*expanded_andsi<setcc><setnz><setnzvc>"):
Rename from "*expanded_andsi".
("*iorsi3<setcc><setnz><setnzvc>"): Similar from "*iorsi3".
Decorate "cc" attribute to make "cc<cccc><ccnz><ccnzvc>".
("*iorhi3<setcc><setnz><setnzvc>"): Similar from "*iorhi3".
("*iorqi3<setcc><setnz><setnzvc>"): Similar from "*iorqi3".
("*expanded_andhi<setcc><setnz><setnzvc>"): Similar from
"*expanded_andhi". Add quick cc-setting alternative for 0..31.
("*andqi3<setcc><setnz><setnzvc>"): Similar from "*andqi3".
("<acc><anz><anzvc>xorsi3<setcc><setnz><setnzvc>"): Rename
from "xorsi3".
("<acc><anz><anzvc>one_cmplsi2<setcc><setnz><setnzvc>"): Rename
from "one_cmplsi2".
Enabling dropping of compares with zero of the result, through
the non-VC-setting CCmode substitution. Beware that the
substitutions for 8- and 16-bit patterns will in some cases be
size-neutral; e.g. replacing an "addq 1..63,$rN" + "test.w $rN"
or "subq 1..63,$rN" + "test.w $rN" with an "add.w -63..63,$rN".
cris: Enable 32-bit addition to set condition codes.
gcc:
* config/cris/cris.md ("*adddi3<setnz>"): Rename from "*adddi3".
("*subdi3<setnz>"): Similarly from "*subdi3".
("*addsi3<setnz>"): Similarly from "*addsi3".
("*subsi3<setnz>"): Similarly from "*subsi3".
("*addhi3<setnz>"): Similarly from "*addhi3" and decorate the
"cc" attribute to "cc<ccnz>".
("*addqi3<setnz>"): Similarly from "*addqi3".
("*sub<mode>3<setnz>"): Similarly from "*sub<mode>3".
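
A minimal sketch of the kind of source these substitutions target, assuming
nothing beyond standard C: the result of an addition or subtraction is
compared with zero right after it is computed, so a flag-setting arithmetic
insn can make the separate compare redundant. The function names are
hypothetical; whether the compare is really dropped depends on the target,
the operand sizes and the optimization level.

```c
/* Hypothetical examples: arithmetic whose result is compared with zero.  */

/* Unsigned arithmetic is used so that wrap-around is well defined.  */
int sum_is_zero(unsigned a, unsigned b)
{
    unsigned s = a + b;   /* the add can set N and Z itself */
    return s == 0;        /* so this test needs no separate compare */
}

int diff_is_nonzero(unsigned a, unsigned b)
{
    unsigned d = a - b;
    return d != 0;
}
```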
Enable dropping of compares with zero of the result, through the
three CCmode substitutions and the cmpelim pass.
gcc:
* config/cris/cris.md
("<acc><anz><anzvc>extend<mode>si2<setcc><setnz><setnzvc>"):
Rename from "extend<mode>si2".
("<acc><anz><anzvc>zero_extend<mode>si2<setcc><setnz><setnzvc>"):
Similar, from "zero_extend<mode>si2".
Like with movsi_internal. Looks like the "cc" attribute didn't
need tweaking for "movhi", but did for "movqi". N.B.: disabled
alternatives may cause a later alternative to match.
Also, non-anonymous insns get declarations and gen_* functions.
We don't want that; even if it doesn't affect generated code
it's sloppy. (This may or may not be preferable to the
name decorations obfuscating standard pattern names.)
Also anonymize left-over non-anonymous branches; they haven't
been needing names since the cbranch pattern was made the
generic method.
gcc:
* config/cris/cris.md ("anz", "anzvc", "acc"): New define_subst_attrs.
("<acc><anz><anzvc>movhi<setcc><setnz><setnzvc>"): Rename from
"movhi". Rename "cc" attribute to "cc<cccc><ccnz><ccnzvc>".
("<acc><anz><anzvc>movqi<setcc><setnz><setnzvc>"): Similar from
"movqi". Correct contents of, and rename "cc" attribute to
"cc<cccc><ccnz><ccnzvc>".
("*b<zcond:code><mode>"): Rename from "b<zcond:code><mode>".
("*b<nzvccond:code><mode>"): Rename from "b<nzvccond:code><mode>".
("*b<rnzcond:code><mode>"): Rename from "b<rnzcond:code><mode>".
Completion of, and first use of, the CRIS-specific parts of the
condition-code-setting framework, making use of the define_subst
machinery and the cmpelim optimization pass. This round, just
moves in SImode. Note the re-use of the cc0 era "cc" attribute
(tweaks needed).
gcc:
* config/cris/cris.md ("cc"): Comment on new use.
("cc_enabled"): New attribute.
("enabled"): Make default fall back to cc_enabled.
("setnz", "ccnz", "setnzvc", "ccnzvc", "setcc", "cccc"): New
define_subst_attrs.
("setnz_subst", "setnzvc_subst", "setcc_subst"): New define_substs.
("*movsi_internal<setcc><setnz><setnzvc>"): Rename from
"*movsi_internal". Correct contents of, and rename attribute
"cc" to "cc<cccc><ccnz><ccnzvc>".
This is just the framework bits of splitting CCmode into classes
where the cc-setter can merge mode (CCmode), classes where the
cc-setter must set V and C "usefully" (as well as N and Z flags)
and classes where the cc-setter is something like an arithmetic
instruction, where N and Z are valid but C and V reflect the
operation rather than a compare of the result with zero. This
should yield identical or near-identical code.
The old split of conditions into the ncond and ocond sets took
into account the transformations done by final.c:alter_cond from
cc_status.flags & CC_NO_OVERFLOW, and wasn't a reflection of the
hardware description of the conditions (i.e. whether V mattered
or not).
gcc:
Prepare for cmpelim pass to eliminate redundant compare insns.
* config/cris/cris-modes.def: New file.
* config/cris/cris-protos.h (cris_select_cc_mode): Declare.
(cris_notice_update_cc): Remove left-over declaration.
* config/cris/cris.c (TARGET_CC_MODES_COMPATIBLE): Define.
(cris_select_cc_mode, cris_cc_modes_compatible): New functions.
* config/cris/cris.h (SELECT_CC_MODE): Define.
* config/cris/cris.md (NZSET, NZUSE, NZVCSET, NZVCUSE): New
mode_iterators.
(cond): New code_iterator.
(nzcond): Replacement for incorrect ncond. All callers changed.
(nzvccond): Replacement for ocond. All callers changed.
(rnzcond): Replacement for rcond. All callers changed.
(xCC): New code_attr.
(cmp_op1c, cmp_op0c): Renumber from cmp_op1c and cmp_op2c. All
users changed.
("*cmpdi<NZVCSET:mode>"): Rename from "*cmpdi". Replace
CCmode with iteration over NZVCSET.
("*cmp_ext<BW:mode><NZVCSET:mode>"): Similarly; rename from
"*cmp_ext<mode>".
("*cmpsi<NZVCSET:mode>"): Similarly, from "*cmpsi".
("*cmp<BW:mode><NZVCSET:mode>"): Similarly from "*cmp<mode>".
("*btst<mode>"): Similarly, from "*btst".
("*cbranch<mode><code>4"): Rename from "*cbranch<mode>4",
iterating over cond instead of matching the comparison with
ordered_comparison_operator.
("*cbranch<mode>4_btstq<CC>"): Correct label operand number.
("b<zcond:code><mode>"): Rename from "b<ncond:code>", iterating
over NZUSE.
("b<nzvccond:code><mode>"): Similarly from "b<ocond:code>", over
NZVCUSE. Remove FIXME.
("*b<nzcond:code>_reversed<mode>"): Similarly from
"*b<ncond:code>_reversed", over NZUSE.
("*b<nzvccond:code>_reversed<mode>"): Similarly from
"*b<ocond:code>_reversed", over NZVCUSE. Remove FIXME.
("b<rnzcond:code><mode>"): Similarly from "b<rcond:code>",
over NZUSE. Reinstate "b<oCC>" vs. "b<CC>" mnemonic choice,
depending on CC_NZmode vs. CCmode. Remove FIXME.
("*b<rnzcond:code>_reversed<mode>"): Similarly from
"*b<rcond:code>_reversed", over NZUSE.
("*cstore<mode><code>4"): Rename from "*cstore<mode>4",
iterating over cond instead of matching the comparison with
ordered_comparison_operator.
("*s<nzcond:code><mode>"): Rename from "*s<ncond:code>",
iterating over NZUSE.
("*s<rnzcond:code><mode>"): Similar from "*s<rcond:code>", over
NZUSE. Reinstate "b<oCC>" vs. "b<CC>" mnemonic choice,
depending on CC_NZmode vs. CCmode.
("*s<nzvccond:code><mode>"): Similar from "*s<ocond:code>", over
NZVCUSE. Remove FIXME.
A separated follow-up to the previous change: Also emit moves
from zero as not clobbering condition-codes.
(note: actually folded into the previous ChangeLog-entry)
gcc:
* config/cris/cris.md ("movsi"): For a zero-source post-reload,
generate a clobberless variant.
("*mov_fromzero<mode>_split"): New split.
("*mov_fromzero<mode>"): New insn.
In preparation for compare-elimination (for it to be obviously
useful), we have to have some common insn in-between that
doesn't clobber condition-codes. A move to memory is an obvious
choice. Note the FIXME: we can do this for a zero source too;
later.
gcc:
* config/cris/cris.md ("movsi"): For memory destination
post-reload, generate clobberless variant.
("*mov_tomem<mode>_split"): New split.
("*mov_tomem<mode>"): New insn.
("enabled", mov_tomem_enabled): Define and use to exclude "x" ->
"Q>m" for less-than-SImode.
For some reason (like a buglet in the user in jump.c), defining this makes
a beneficial difference in ledf2, thus this is separated to its own commit.
Also, add comment on (not defining) REVERSE_CONDITION.
gcc:
* config/cris/cris.h (REVERSIBLE_CC_MODE): Define to true.
This made a whole lot of difference regarding regressions in the
delay-slot filling. Before this, comparing __lshrdi3 for v10
before/after decc0ration and other nearby functions was worse by
several missing delay-slot fills; now down to 1.
Also, add a comment about *not* defining
TARGET_FIXED_CONDITION_CODE_REGS.
gcc:
* config/cris/cris.c (TARGET_FLAGS_REGNUM): Define.
As the added FIXME says, the new insn_and_split generates only a
small subset of the bit-tests that can be matched by "*btst" and
that were emitted by the undecc0rated cris.md at combine-time,
but it's naturally separable from a general variant by being
just what's needed for the test-cases that were previously
xfailed, and that no additional CCmodes are required.
gcc:
PR target/93372
* config/cris/cris.md (zcond): New code_iterator.
("*cbranch<mode>4_btstq<CC>"): New insn_and_split.
In the parlance of <https://gcc.gnu.org/wiki/CC0Transition>,
this is a basic "type 2" conversion, without
condition-code-related optimizations (just plain CCmode), but
with "cstore{M}4" defined. CRIS is somewhat similar to the
m68k; most instructions affect condition-codes. To wit, it
lacks sufficient instructions to compose an arbitrary valid
address in a register, specifically from a valid address where
involved registers have to be spilled or adjusted, without
affecting condition-codes in CRIS_CC0_REGNUM aka. dccr.
On the other hand, moving dccr to and from a stackpointer-plus-
constant-offset-address *can* be done without additional register
use, and moving to or from a general register does not affect
it. There's no instruction to add a constant to a register or
to put a constant in a register, without affecting dccr, but
there *is* an instruction to add a register (optionally scaled)
to another without affecting dccr (i.e. "addi"). Also, moves
*to* memory from any register do not affect dccr, and likewise
between other special registers and a general register. Maybe
some of that opens up the solution-space to a better solution
than clobbering dccr until reload_completed; to be investigated.
FAOD: I know what to do in the direction of defining and using
additional CCmodes, but prefer to do the full transition in
smaller steps.
Regarding the similarity to m68k, I didn't follow the steps of
the m68k cc0 transition, making use of the final_postscan_insn
hook as with the NOTICE_UPDATE_CC machinery. For one, because
it seems to be lacking in that it keeps compare-elimination
restricted to output-time, but also because it seems a bad match
considering that CRIS has delay-slots; better try to eliminate
compares earlier. Another approach which I originally intended
to implement, that of the visium port of defining three variants
for most insns (not counting the define_subst expansions;
unaffecting-before-reload, clobbering and setting), seems
overworked and bloating the machine description. I may be
proven wrong, but I prefer we fix gcc if something bails on
seeing a parallel with a clobber of that specific hard-register.
Also, I chose to remove most anonymous combination-matching
patterns; matchers, splitters and peepholes instead of
converting them to add clobbers of CRIS_CC0_REGNUM. There are
exclusions: those covered in the test-suite, if trivial enough.
Many of these patterns are used to handle the side-effect-
assignment addressing-modes as put together by combine: a
"prefix instruction" before the main instruction, where the main
instruction uses the post-incremented-register addressing-mode
and the "left-over" instruction-field in the prefixed insn to
assign a register. An example: the hopefully descriptive
"move.d $r9,[$r0=$r1+1234]" compared to "move.d $r9,[$r1+1234]";
both formed by the prefix insn "biap.w 1234,$r1" before
respectively "move.d $r9,[$r0+]" and "move.d $r9,[$r0]". Other
prefix variants exist. Useful, but optional, except where
side-effect assignment was used in a special case in the
function prologue; adjusted to a less optimal combination.
Support like the function cris_side_effect_mode_ok is kept.
I intend to put back as many as I find use for, of those
anonymous patterns in a controlled manner, with self-contained
test-cases proving their usability, rather than symmetry with
other instructions and similar addressing modes, which guided
the original introduction. I've entered pr93372 to track code
performance regressions related to this transition, with focus
on target-side causes and fixes; besides the function prologue
special-case, there were some checking presence of the bit-test
(btstq) instruction.
The now-gone "tst<mode>" patterns deserve a comment too: they
were an artefact from pre-"cbranch" era, now fully folded into
the "cmp<mode>" patterns.
I've left the now-unused "cc" insn attribute in, for the time
being; to be removed, used or transformed to be useful with
further work to fix pr93372. It can't be used as is, because
"normal" doesn't mean "like a compare instruction" but "handled
by NOTICE_UPDATE_CC" and may in fact be reflecting e.g. reverse
operands, something that bit me during the conversion.
gcc:
Move trivially from cc0 to reg:CC model, removing most optimizations.
* config/cris/cris.md: Remove all side-effect patterns and their
splitters. Remove most peepholes. Add clobbers of CRIS_CC0_REGNUM
to all but post-reload control-flow and movem insns. Remove
constraints on all modified expanders. Remove obsoleted cc0-related
references.
(attr "cc"): Remove alternative "rev".
(mode_iterator BWDD, DI_, SI_): New.
(mode_attr sCC_destc, cmp_op1c, cmp_op2c): New.
("tst<mode>"): Remove; fold as "M" alternative into compare insn.
("mstep_shift", "mstep_mul"): Remove patterns.
("s<rcond>", "s<ocond>", "s<ncond>"): Anonymize.
* config/cris/cris.c: Change all non-condition-code,
non-control-flow emitted insns to add a parallel with clobber of
CRIS_CC0_REGNUM, mostly by changing from gen_rtx_SET with
emit_insn to use of emit_move_insn, gen_add2_insn or
cris_emit_insn, as convenient.
(cris_reg_overlap_mentioned_p)
(cris_normal_notice_update_cc, cris_notice_update_cc): Remove.
(cris_movem_load_rest_p): Don't assume all elements in a
PARALLEL are SETs.
(cris_store_multiple_op_p): Ditto.
(cris_emit_insn): New function.
* config/cris/cris-protos.h (cris_emit_insn): Declare.
Part of the removal of crisv32-* and cris-*-linux* (cris-elf remains).
Essentially everything is gone, including functions and
target-specific definitions and most obvious knock-on effects,
like removing unused functions and arguments.
There's one exception: the register-class effects of the CRIS v32
ACR register are deliberately excluded and left in (i.e. its
use by-number is removed and the ACR_REGS regclass is always
unusable - but present). Changing register class definitions to
remove ACR_REGS and related classes (folding their uses into
remaining classes), causes extra register moves in libgcc (as an
immediate observation; actual net effect unknown), which is
unwanted both for performance reasons and also causing extra
work comparing before/after cc0-machinery-conversion changes
ahead. The actual cause and solution for these negative effects
of cleaning up the register-classes will at the moment have to
remain to-be-investigated.
If CRIS v32 support is reinstated, consider doing the .md part
not as separate patterns with opposite conditions but merged
patterns with necessarily-different alternatives using the
"enabled" attribute (which was not invented back then).
Also, a single ACR-related RTL-dump example in a cris.md
comment, related to a strict_low_part issue is kept, but marked
as obsolete.
Note that the "b" register-constraint (non-ACR registers; can be
used for post-increment) is left in, as that may have extant
uses outside of gcc. Its availability is tested by
gcc.target/cris/asm-b-1.c. When ACR register classes are
removed, it's probably best to make it equal to GENERAL_REGS.
gcc:
* config/cris: Remove shared-library and CRIS v32 support.
Part of the removal of crisv32-* and cris-*-linux* (cris-elf remains).
Uses of "cris*" (as opposed to "cris") are deliberately left unadjusted.
gcc/testsuite:
* gcc.dg/20020919-1.c, gcc.dg/pr31866.c, gcc.dg/pr46647.c,
gcc.dg/sibcall-10.c, gcc.dg/sibcall-3.c, gcc.dg/sibcall-4.c,
gcc.dg/sibcall-9.c, gcc.dg/torture/cris-asm-mof-1.c,
gcc.dg/torture/cris-volatile-1.c, gcc.dg/torture/pr38948.c,
gcc.dg/tree-ssa/20040204-1.c, gcc.dg/tree-ssa/loop-1.c,
gcc.dg/weak/typeof-2.c, lib/target-supports.exp: Remove remaining
traces of crisv32-*.
Part of the removal of crisv32-* and cris-*-linux* (cris-elf remains).
After this, within gcc.target, grep -i v32 and grep -i linux
find no matches, except for a comment in
gcc.target/cris/asmreg-1.c, now grammar-corrected.
gcc/testsuite:
* gcc.target/cris/: Adjust for removing crisv32-* and cris-linux-*.
Part of the removal of crisv32-* and cris-*-linux* (cris-elf remains).
libgcc:
* config.host: Remove support for crisv32-*-* and cris*-*-linux.
* config/cris/libgcc-glibc.ver, config/cris/t-linux: Remove.
Or really, move from the obsolete targets section, to
unsupported targets section, and remove crisv32-*-* and
cris-*-linux* from the rest.
Part of the removal of crisv32-* and cris-*-linux* (cris-elf remains).
gcc:
* config.gcc: Remove support for crisv32-*-* and cris-*-linux*.
Compared to the cc0 version, I noticed a regression in
delay-slot-filling for CRIS for several functions in libgcc with
a similar layout, one being lshrdi3, where with cc0 all
delay-slots were filled, as exposed by the test-case in
gcc.target/cris/pr93372-1.c.
There's one slot that fails to be filled for the decc0rated CRIS
port. A gdb session shows it is because of the automatic
inclusion of TARGET_FLAGS_REGNUM in "registers needed at the end
of the function" because there are insns in the epilogue that
clobber the condition-code register. I'm not trying to tell a
clobber from a set, as parallels with set instead of clobber
seem likely to happen too, for targets with TARGET_FLAGS_REGNUM
set.
Other targets with delay-slots and one dedicated often-clobbered
condition-code-register should consider defining
TARGET_FLAGS_REGNUM. I noticed it improved delay-slot-filling
also in other situations than this.
(Previously approved by Jeff Law.)
gcc:
* resource.c (init_resource_info): Filter-out TARGET_FLAGS_REGNUM
from end_of_function_needs.
When __CET__ is defined, <cet.h> should be included to add Intel CET
marker to object file and _CET_ENDBR should be placed at function entry
to indicate indirect branch target.
* libdruntime/config/x86/switchcontext.S: Include <cet.h> if
__CET__ is defined.
(_CET_ENDBR): New. Define if __CET__ is not defined.
(fiber_switchContext): Add _CET_ENDBR after .cfi_startproc.
When --enable-cet is used to configure GCC, enable Intel CET in libphobos.
* Makefile.am (AM_MAKEFLAGS): Add $(CET_FLAGS) to GCC FLAGS.
* configure.ac (CET_FLAGS): Add GCC_CET_FLAGS and AC_SUBST.
* Makefile.in: Regenerated.
* aclocal.m4: Likewise.
* configure: Likewise.
There are several places where we insert bind expressions while
making the coroutine AST transforms. These should be marked as
having side-effects where relevant, which had been omitted. This
leads to at least one failure in the cppcoros test suite, where a loop
body is dropped in gimplification because it is not marked.
gcc/cp/ChangeLog:
2020-05-08 Iain Sandoe <iain@sandoe.co.uk>
PR c++/95003
* coroutines.cc (build_actor_fn): Ensure that bind scopes
are marked as having side-effects where necessary.
(replace_statement_captures): Likewise.
(morph_fn_to_coro): Likewise.
gcc/testsuite/ChangeLog:
2020-05-08 Iain Sandoe <iain@sandoe.co.uk>
PR c++/95003
* g++.dg/coroutines/torture/pr95003.C: New test.
The existing directives-only code (a) punched a hole through the
libcpp interface and (b) didn't support raw string literals. This
reimplements this preprocessing mode. I added a proper callback
interface, and adjusted c-ppoutput to use it. Sadly I cannot get rid
of the libcpp/internal.h include for unrelated reasons.
The new scanner is in lex.c, and works by doing some backwards scanning
when it finds a character of interest. This reduces the number of
cases one has to deal with in forward scanning. It may have different
failure modes than forward scanning on bad tokenization.
Finally, I moved some cpp tests from the C-specific gcc.dg/cpp directory
to the c-c++-common/cpp shared directory.
libcpp/
* directives-only.c: Delete.
* Makefile.in (libcpp_a_OBJS, libcpp_a_SOURCES): Remove it.
* include/cpplib.h (enum CPP_DO_task): New enum.
(cpp_directive_only_preprocess): Declare.
* internal.h (_cpp_dir_only_callbacks): Delete.
(_cpp_preprocess_dir_only): Delete.
* lex.c (do_peek_backslash, do_peek_next, do_peek_prev): New.
(cpp_directives_only_process): New implementation.
gcc/c-family/
Reimplement directives only processing.
* c-ppoutput.c (token_streamer): New.
(directives_only_cb): New. Swallow ...
(print_lines_directives_only): ... this.
(scan_translation_unit_directives_only): Reimplement using the
published interface.
gcc/testsuite/
* gcc.dg/cpp/counter-[23].c: Move to c-c++-common/cpp.
* gcc.dg/cpp/dir-only-*: Likewise.
* c-c++-common/cpp/dir-only-[78].c: New.
This delays the SLP permutation check to vectorizable_load and optimizes
permutations only after all SLP instances have been generated and the
vectorization factor is determined.
2020-05-08 Richard Biener <rguenther@suse.de>
* tree-vectorizer.h (vec_info::slp_loads): New.
(vect_optimize_slp): Declare.
* tree-vect-slp.c (vect_attempt_slp_rearrange_stmts): Do
nothing when there are no loads.
(vect_gather_slp_loads): Gather loads into a vector.
(vect_supported_load_permutation_p): Remove.
(vect_analyze_slp_instance): Do not verify permutation
validity here.
(vect_analyze_slp): Optimize permutations of reductions
after all SLP instances have been gathered and gather
all loads.
(vect_optimize_slp): New function split out from
vect_supported_load_permutation_p. Elide some permutations.
(vect_slp_analyze_bb_1): Call vect_optimize_slp.
* tree-vect-loop.c (vect_analyze_loop_2): Likewise.
* tree-vect-stmts.c (vectorizable_load): Check whether
the load can be permuted. When generating code assert we can.
* gcc.dg/vect/bb-slp-pr68892.c: Adjust for not supported
SLP permutations becoming builds from scalars.
* gcc.dg/vect/bb-slp-pr78205.c: Likewise.
* gcc.dg/vect/bb-slp-34.c: Likewise.
Two aliased objects must have distinct addresses, even if they have
size zero, so we make sure to allocate at least one byte for them.
* gcc-interface/decl.c (gnat_to_gnu_entity) <E_Variable>: Force at
least the unit size for an aliased object of a constrained nominal
subtype whose size is variable.
The first tweak is to remove the TREE_OVERFLOW flag on INTEGER_CSTs
because it prevents them from being uniquized in LTO mode.
The second, unrelated tweak is to canonicalize the packable types made
by gigi so that at most one per type is present in the GENERIC IL.
* gcc-interface/decl.c (gnat_to_gnu_entity) <E_Array_Subtype>: Deal
with artificial maximally-sized types designated by access types.
* gcc-interface/utils.c (packable_type_hash): New structure.
(packable_type_hasher): Likewise.
(packable_type_hash_table): New hash table.
(init_gnat_utils): Initialize it.
(destroy_gnat_utils): Destroy it.
(packable_type_hasher::equal): New method.
(hash_packable_type): New static function.
(canonicalize_packable_type): Likewise.
(make_packable_type): Make sure not to use too small a type for the
size of the new fields. Canonicalize the type if it is named.
The information was missing in cases where the front-end was able to turn
the range comparison into a simple comparison.
* gcc-interface/trans.c (Raise_Error_to_gnu): Always compute a lower
bound and an upper bound for use by the -gnateE switch for range and
comparison operators.
We mark the type of In parameters in Ada with the const qualifier, but
it is stripped by free_lang_data_in_type so do not do it in LTO mode.
* gcc-interface/decl.c (gnat_to_gnu_param): Do not make a variant
of the type in LTO mode.
This fixes an issue with redundant store elimination in FRE/PRE
which, when invoked by the DOM elimination walk, ends up using
possibly stale availability data from the RPO walk. It also
fixes a missed optimization during valueization of addresses
by making sure to use get_addr_base_and_unit_offset_1 which can
valueize and adjusting that to also valueize ARRAY_REFs low-bound.
2020-05-08 Richard Biener <rguenther@suse.de>
* tree-ssa-sccvn.c (rpo_avail): Change type to
eliminate_dom_walker *.
(eliminate_with_rpo_vn): Adjust rpo_avail to make vn_valueize
use the DOM walker availability.
(vn_reference_fold_indirect): Use get_addr_base_and_unit_offset_1
with vn_valueize as valueization callback.
(vn_reference_maybe_forwprop_address): Likewise.
* tree-dfa.c (get_addr_base_and_unit_offset_1): Also valueize
array_ref_low_bound.
* gnat.dg/opt83.adb: New testcase.
We already have x - ((x - y) & -(z < w)) and
x + ((y - x) & -(z < w)) simplifications, this one adds
x ^ ((x ^ y) & -(z < w)) (not merged using for because of the
:c that can be present on bit_xor and can't on minus).
2020-05-08 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/94786
* match.pd (A ^ ((A ^ B) & -(C cmp D)) -> (C cmp D) ? B : A): New
simplification.
* gcc.dg/tree-ssa/pr94786.c: New test.
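
The identity behind the new pattern can be sketched in plain C. This is only
an illustration with hypothetical function names, not the pr94786.c testcase:
the mask -(z < w) is all-ones when the comparison holds and zero otherwise,
so the xor form selects y or x without a branch.

```c
#include <stdint.h>

/* Branchless form matched by the pattern: MASK is all-ones when z < w,
   and zero otherwise.  */
uint32_t select_xor(uint32_t x, uint32_t y, int z, int w)
{
    uint32_t mask = -(uint32_t)(z < w);
    return x ^ ((x ^ y) & mask);
}

/* Conditional form the simplification rewrites it into.  */
uint32_t select_cond(uint32_t x, uint32_t y, int z, int w)
{
    return (z < w) ? y : x;
}
```

With an all-ones mask the expression reduces to x ^ x ^ y, which is y; with a
zero mask it reduces to x.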
The following peephole2 changes:
- addl (%rdi), %esi
+ xorl %eax, %eax
+ addl %esi, (%rdi)
setc %al
- movl %esi, (%rdi)
- movzbl %al, %eax
ret
on the testcase. *add<mode>3_cc_overflow_1, being an add{l,q} insn, is
commutative, so if TARGET_READ_MODIFY_WRITE we can replace
addl (%rdi), %esi; movl %esi, (%rdi)
with
addl %esi, (%rdi)
if %esi is dead after those two insns.
2020-05-08 Jakub Jelinek <jakub@redhat.com>
PR target/94857
* config/i386/i386.md (peephole2 after *add<mode>3_cc_overflow_1): New
define_peephole2.
* gcc.target/i386/pr94857.c: New test.
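
A function of roughly the following shape could give rise to the add, setc
and store sequence shown above. This is a guess at the kind of source
involved, with a hypothetical name, and not the contents of pr94857.c; it
relies on the GCC builtin __builtin_add_overflow.

```c
/* Hypothetical example: add X to *P in place and report unsigned
   overflow (carry) as the return value.  */
int add_in_place(unsigned *p, unsigned x)
{
    return __builtin_add_overflow(*p, x, p);
}
```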
On Thu, May 07, 2020 at 02:45:29PM +0200, Thomas Schwinge wrote:
> >>+ for (tree op = win; TREE_CODE (op) == COMPOUND_EXPR;
>
> ..., and new 'op' variable here.
>
> >>+ op = TREE_OPERAND (op, 1))
> >>+ v.safe_push (op);
> >>+ FOR_EACH_VEC_ELT_REVERSE (v, i, op)
> >>+ ret = build2_loc (EXPR_LOCATION (op), COMPOUND_EXPR,
> >>+ TREE_TYPE (win), TREE_OPERAND (op, 0),
> >>+ ret);
> >>+ return ret;
> >> }
> >> while (TREE_CODE (op) == NOP_EXPR)
> >> {
There is no reason for the shadowing and op at this point acts as a
temporary and will be overwritten in FOR_EACH_VEC_ELT_REVERSE anyway.
So, we can just s/tree // here.
2020-05-08 Jakub Jelinek <jakub@redhat.com>
PR middle-end/94724
* tree.c (get_narrower): Reuse the op temporary instead of
shadowing it.
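
The difference between shadowing and reusing a temporary can be sketched with
a small stand-alone example. The functions below are hypothetical and use int
instead of tree; they only illustrate the effect of dropping the type from
the for-loop initializer.

```c
/* Shadowing version: the loop declares its own 'op', hiding the outer one
   (GCC's -Wshadow warns about this).  */
int sum_shadowed(const int *v, int n)
{
    int op = 0;                     /* outer temporary */
    int sum = 0;
    for (int op = 0; op < n; op++)  /* inner 'op' shadows the outer one */
        sum += v[op];
    (void) op;                      /* the outer 'op' is still 0 here */
    return sum;
}

/* Reusing version: without the type, the loop assigns the existing 'op'.  */
int sum_reused(const int *v, int n)
{
    int op;
    int sum = 0;
    for (op = 0; op < n; op++)
        sum += v[op];
    return sum;
}
```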
The following patch canonicalizes M = X >> (prec - 1); (X + M) ^ M
for signed integral types into ABS_EXPR (X). For X == min it is already
UB because M is -1 and min + -1 is UB, so we can use ABS_EXPR rather than
say ABSU_EXPR + cast.
The backend might then emit the abs code back using the shift and addition
and xor if it is the best sequence for the target, but could do something
different that is better.
2020-05-08 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/94783
* match.pd ((X + (X >> (prec - 1))) ^ (X >> (prec - 1)) to abs (X)):
New simplification.
* gcc.dg/tree-ssa/pr94783.c: New test.
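
The idiom being canonicalized can be written out as below. This is a sketch
with a hypothetical function name, assuming that right-shifting a negative
int is an arithmetic shift (which GCC provides, though ISO C leaves it
implementation-defined); as noted above, the result is undefined for the
minimum value.

```c
#include <limits.h>

/* M is 0 for non-negative X and -1 for negative X, so (X + M) ^ M is X
   unchanged in the first case and ~(X - 1) == -X in the second.  */
int abs_by_shift(int x)
{
    int m = x >> (sizeof(int) * CHAR_BIT - 1);
    return (x + m) ^ m;
}
```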
The ffs expanders on several targets (x86, ia64, aarch64 at least)
emit a conditional move or similar code to handle the case when the
argument is 0, which makes the code longer.
If we know from VRP that the argument will not be zero, we can (if the
target also has a ctz expander) just use ctz, which is undefined at zero,
and thus the expander doesn't need to deal with that.
2020-05-08 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/94956
* match.pd (FFS): Optimize __builtin_ffs* of non-zero argument into
__builtin_ctz* + 1 if direct IFN_CTZ is supported.
* gcc.target/i386/pr94956.c: New test.
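
The equivalence used here can be checked with the GCC builtins directly. The
wrapper names below are hypothetical, and the ctz variant is only valid when
the caller guarantees a non-zero argument.

```c
/* General form: handles zero, for which it returns 0.  */
int ffs_general(unsigned x)
{
    return __builtin_ffs((int) x);
}

/* Form usable when X is known to be non-zero: ctz is undefined at zero,
   so no special case is needed.  */
int ffs_nonzero(unsigned x)
{
    return __builtin_ctz(x) + 1;
}
```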
Implemented thusly. The TYPE_OVERFLOW_WRAPS is there just because the
pattern above it has it too, if you want, I can throw it away from both.
2020-05-08 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/94913
* match.pd (A - B + -1 >= A to B >= A): New simplification.
(A - B > A to A < B): Don't test TYPE_OVERFLOW_WRAPS which is always
true for TYPE_UNSIGNED integral types.
* gcc.dg/tree-ssa/pr94913.c: New test.
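
For unsigned (wrapping) types, the two comparisons and their simplified forms
can be written side by side. The function names are hypothetical and this is
not the pr94913.c testcase.

```c
/* A - B + -1 >= A, which the new pattern simplifies to B >= A.  */
int ge_original(unsigned a, unsigned b)   { return a - b - 1u >= a; }
int ge_simplified(unsigned a, unsigned b) { return b >= a; }

/* A - B > A, which the existing pattern simplifies to A < B.  */
int gt_original(unsigned a, unsigned b)   { return a - b > a; }
int gt_simplified(unsigned a, unsigned b) { return a < b; }
```

When b >= a, the subtraction a - b - 1 wraps around to a value that is at
least a; when b < a, it stays in the range 0 to a - 1.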
My recent combine-stack-adj.c change broke df checking bootstrap.
While most of the changes are done through validate_change/confirm_changes,
which update df info, the removal of REG_EQUAL notes didn't update df info.
2020-05-08 Jakub Jelinek <jakub@redhat.com>
PR bootstrap/94961
PR rtl-optimization/94516
* rtl.h (remove_reg_equal_equiv_notes): Add a bool argument defaulted
to false.
* rtlanal.c (remove_reg_equal_equiv_notes): Add no_rescan argument.
Call df_notes_rescan if that argument is not true and returning true.
* combine.c (adjust_for_new_dest): Pass true as second argument to
remove_reg_equal_equiv_notes.
* postreload.c (reload_combine_recognize_pattern): Don't call
df_notes_rescan.