ARC700 targets have a store/load pipeline hazard: if we load within 2
cycles of a store, and the load and the store are at the same address,
then we pay a multi-cycle penalty.
This commit avoids this by inserting nop instructions between the store
and the load.
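
For illustration, the kind of source that can expose the hazard is a
store followed almost immediately by a load from the same address. The
function below is a hypothetical sketch, not the contents of
arc700-stld-hazard.c; the name store_then_load is made up, and the
volatile qualifier is only there so the compiler cannot forward the
stored value and drop the load.

```c
/* Hypothetical example: a store to *P followed directly by a load from
   the same address.  */
int
store_then_load (volatile int *p, int value)
{
  *p = value;      /* Store.  */
  return *p + 1;   /* Load from the same address, right after the store.  */
}
```

Whether the two accesses really end up within 2 cycles of each other
depends on how the instructions are scheduled, so this only shows the
shape of the problem.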
gcc/ChangeLog:
* config/arc/arc-protos.h (arc_store_addr_hazard_p): Declare.
* config/arc/arc.c (arc_store_addr_hazard_p): New function.
(workaround_arc_anomaly): Call arc_store_addr_hazard_p for ARC700.
* config/arc/arc700.md: Add define_bypass for store/load.
gcc/testsuite/ChangeLog:
* gcc.target/arc/arc700-stld-hazard.c: New file.
From-SVN: r243007
2016-11-30 Janus Weil <janus@gcc.gnu.org>
PR fortran/78592
* interface.c (gfc_find_specific_dtio_proc): Rearrange code to avoid
dereferencing a null pointer.
2016-11-30 Janus Weil <janus@gcc.gnu.org>
PR fortran/78592
* gfortran.dg/dtio_18.f90: New test case.
From-SVN: r243005
PR sanitizer/78541
* gcc.dg/asan/pr78541-2.c: New test.
* gcc.dg/asan/pr78541.c: New test.
PR sanitizer/78541
* asan.c (asan_expand_mark_ifn): Properly
select a VAR_DECL from FRAME.* component reference.
From-SVN: r243003
The comment for the added case to simplify_truncation reads
/* Turn (truncate:M1 (*_extract:M2 (reg:M2) (len) (pos))) into
(*_extract:M1 (truncate:M1 (reg:M2)) (len) (pos')) if possible without
changing len. */
but I forgot to check that the two modes M2 are actually the same.
PR rtl-optimization/78583
* simplify-rtx.c (simplify_truncation): Add check missing from the
previous commit.
From-SVN: r243000
PR78590 shows a problem in change_zero_ext, where we change a zero_extend
of a subreg to a logical and. We should only do this if the thing we are
taking the subreg of is a scalar integer, otherwise we will take a subreg
of (e.g.) a float in a different size, which is nonsensical and hits an
assert.
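
As a sketch of why the transform is only meaningful for integers: in C
terms, zero-extending the low part of an integer is the same as masking
it with a logical and. The function names below are hypothetical and
the widths (8 and 32 bits) are chosen only for the example.

```c
#include <stdint.h>

/* Zero-extend the low 8 bits of X, written as a narrowing conversion
   followed by a widening one (roughly a zero_extend of a subreg).  */
uint32_t
zext_low8_cast (uint32_t x)
{
  return (uint32_t) (uint8_t) x;
}

/* The same value computed as a logical and with a mask.  */
uint32_t
zext_low8_and (uint32_t x)
{
  return x & 0xffu;
}
```

There is no comparable "low part" of a float in a different size, which
is why the scalar integer check is needed.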
PR rtl-optimization/78590
* combine.c (change_zero_ext): Transform zero_extend of subregs only
if the subreg_reg is a scalar integer mode.
From-SVN: r242999
2016-11-30 Janus Weil <janus@gcc.gnu.org>
PR fortran/78573
* decl.c (build_struct): On error, return directly and do not build
class symbol.
2016-11-30 Janus Weil <janus@gcc.gnu.org>
PR fortran/78573
* gfortran.dg/class_61.f90: New test case.
From-SVN: r242996
* lra-constraints.c (check_and_process_move): Constrain the
range of DCLASS and SCLASS to avoid false positive out of bounds
array index warning.
From-SVN: r242993
With -buildmode=c-archive, initsig is called before the memory
allocator has been initialized. The code was doing a memory
allocation because of the call to funcPC(sigtramp). When escape
analysis is fully implemented, that call should not allocate. For
now, finesse the issue by calling a C function to get the C function
pointer value of sigtramp.
When returning from a call from C to a Go function, a deferred
function is run to go back to syscall mode. When the call occurs on a
non-Go thread, that call sets g to nil, making it impossible to add
the _defer struct back to the pool. Just drop it and let the garbage
collector clean it up.
Reviewed-on: https://go-review.googlesource.com/33675
From-SVN: r242992
The ICE in PR preprocessor/78569 appears to be due to an attempt to
generate substring locations in a .i file where the underlying .c file
has changed since the .i file was generated.
This can't work, so it seems safest for the on-demand substring
locations to be unavailable for such files, falling back to
"whole string" locations instead.
gcc/ChangeLog:
PR preprocessor/78569
* input.c (get_substring_ranges_for_loc): Fail gracefully if
line directives were present.
gcc/testsuite/ChangeLog:
PR preprocessor/78569
* gcc.dg/format/pr78569.c: New test case.
From-SVN: r242990
2016-11-29 Michael Meissner <meissner@linux.vnet.ibm.com>
PR target/78594
* config/rs6000/rs6000.md (mov<mode>_internal, QHI iterator): Add
'x' to stxsi<wd>x print pattern, so that QImode and HImode values
residing in traditional altivec registers can be stored
correctly.
From-SVN: r242983
2016-11-29 Max Filippov <jcmvbkbc@gmail.com>
gcc/
* config/xtensa/xtensa.c (hwloop_optimize): Don't emit zero
overhead loop start between a call and its CALL_ARG_LOCATION
note.
From-SVN: r242979
PR target/71331
* config/tilegx/tilegx.c (tilegx_function_profiler): Save r10
to stack before calling mcount.
(tilegx_can_use_return_insn_p): Clean up code.
From-SVN: r242969
gcc/cp/ChangeLog:
PR c++/77922
* name-lookup.c (lookup_name_fuzzy): Filter out reserved words
that were filtered out by init_reswords.
gcc/ChangeLog:
PR c++/72774
PR c++/72786
PR c++/77922
PR c++/78313
* spellcheck.c (selftest::test_find_closest_string): Verify that
we don't offer the goal string as a suggestion.
* spellcheck.h (best_match::get_best_meaningful_candidate): Don't
offer the goal string as a suggestion.
gcc/testsuite/ChangeLog:
PR c++/72774
PR c++/72786
PR c++/77922
PR c++/78313
* g++.dg/spellcheck-c++-11-keyword.C: New test case.
* g++.dg/spellcheck-macro-ordering.C: New test case.
* g++.dg/spellcheck-pr78313.C: New test case.
From-SVN: r242965
2016-11-29 Richard Biener <rguenther@suse.de>
* tree-cfg.c (lower_phi_internal_fn): Do not look for further
PHIs after a regular stmt.
(stmt_starts_bb_p): PHIs not preceded by a PHI or a label
start a new BB.
From-SVN: r242959
PR gcov-profile/78582
* gcc.dg/pr78582.c: New test.
PR gcov-profile/78582
* tree-profile.c (gimple_gen_time_profiler): Make one extra BB
to prevent PHI argument clash.
From-SVN: r242958
The dump expects literals which would only be present if the target's
int size is 32 bits.
Fix by explicitly using 32 bit ints for targets with __SIZEOF_INT__ < 4.
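
A minimal sketch of the approach, assuming GCC's predefined
__SIZEOF_INT__, __INT32_TYPE__ and __UINT32_TYPE__ macros. The typedef
names int32 and uint32 and the function wrap_add are hypothetical and
not taken from pr31096-1.c.

```c
/* Pick types that are 32 bits wide even where int is narrower.  */
#if __SIZEOF_INT__ < 4
typedef __INT32_TYPE__ int32;
typedef __UINT32_TYPE__ uint32;
#else
typedef int int32;
typedef unsigned int uint32;
#endif

/* Unsigned addition that wraps at 32 bits on the targets considered
   here.  */
uint32
wrap_add (uint32 a, uint32 b)
{
  return a + b;
}
```

With these typedefs, constants in the dump that assume 32-bit
arithmetic should stay the same on targets with 16-bit int.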
gcc/testsuite/
2016-11-29 Senthil Kumar Selvaraj <senthil_kumar.selvaraj@atmel.com>
* testsuite/gcc.dg/pr31096-1.c: Use __{U,}INT32_TYPE__ for
targets with sizeof(int) < 4.
From-SVN: r242954
These testcases test that we generate the expected code for all of the
rl*i* instructions, that is, rotate-and-mask and rotate-and-mask-insert
for immediate rotation counts. All the testcases do rotate, shift left,
as well as shift right; if that results in an instruction that does not
exist the testcases generate a multiplication instead, so that we can
detect if this is handled properly.
Many 32-bit instructions zero-extend their result properly in 64-bit
mode, but the rs6000 port does not yet know this. These testcases test
the status quo, so they will need updating whenever we handle this.
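
For readers unfamiliar with these instructions, here is a rough C
sketch of the two operations for one immediate rotation count. The
function names, the rotation count (8) and the masks are hypothetical
and are not taken from the new testcases.

```c
#include <stdint.h>

/* Rotate X left by N bits, with N reduced modulo 32.  */
static uint32_t
rotl32 (uint32_t x, unsigned n)
{
  n &= 31;
  return n ? (x << n) | (x >> (32 - n)) : x;
}

/* Rotate-and-mask: rotate left by 8, then keep the low 16 bits.  */
uint32_t
rotate_and_mask (uint32_t x)
{
  return rotl32 (x, 8) & 0xffffu;
}

/* Rotate-and-mask-insert: rotate SRC left by 8 and insert the bits
   selected by the mask into DST, leaving the other bits of DST alone.  */
uint32_t
rotate_and_mask_insert (uint32_t dst, uint32_t src)
{
  const uint32_t mask = 0x00ffff00u;
  return (dst & ~mask) | (rotl32 (src, 8) & mask);
}
```

Whether a given rotation count and mask map to a single instruction
depends on the instruction's mask encoding, which this sketch does not
model.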
gcc/testsuite/
* gcc.target/powerpc/rldic-0.c: New testcase.
* gcc.target/powerpc/rldic-1.c: New testcase.
* gcc.target/powerpc/rldic-2.c: New testcase.
* gcc.target/powerpc/rldicl-0.c: New testcase.
* gcc.target/powerpc/rldicl-1.c: New testcase.
* gcc.target/powerpc/rldicl-2.c: New testcase.
* gcc.target/powerpc/rldicr-0.c: New testcase.
* gcc.target/powerpc/rldicr-1.c: New testcase.
* gcc.target/powerpc/rldicr-2.c: New testcase.
* gcc.target/powerpc/rldicx.h: New file.
* gcc.target/powerpc/rldimi-0.c: New testcase.
* gcc.target/powerpc/rldimi-1.c: New testcase.
* gcc.target/powerpc/rldimi-2.c: New testcase.
* gcc.target/powerpc/rldimi.h: New file.
* gcc.target/powerpc/rlwimi-0.c: New testcase.
* gcc.target/powerpc/rlwimi-1.c: New testcase.
* gcc.target/powerpc/rlwimi-2.c: New testcase.
* gcc.target/powerpc/rlwimi.h: New file.
* gcc.target/powerpc/rlwinm-0.c: New testcase.
* gcc.target/powerpc/rlwinm-1.c: New testcase.
* gcc.target/powerpc/rlwinm-2.c: New testcase.
* gcc.target/powerpc/rlwinm.h: New file.
From-SVN: r242951
change_zero_ext handles (zero_extend:M1 (subreg:M2 (reg:M1) ...))
already; this patch extends it to also deal with any
(zero_extend:M1 (subreg:M2 (reg:M3) ...)) where the subreg is not
paradoxical.
* combine.c (change_zero_ext): Also handle extends from a subreg
to a mode bigger than that of the operand of the subreg.
From-SVN: r242950
If we use ABI_V4 and we have a big stack frame, we end the epilogue
with a "mr 1,11" (or similar) instruction. This instruction however
has no dependencies on the earlier restores from stack (done via r11),
so sched2 can end up reordering the insns. That is bad: we have no red
zone, so the restores would then read from stack that has already been
deallocated.
This fixes it by making that restore depend on the memory accesses.
PR target/77687
* config/rs6000/rs6000.c (rs6000_emit_stack_reset): Emit the
stack_restore_tie insn instead of stack_tie, for the SVR4 and
SPE ABIs.
* config/rs6000/rs6000.md (stack_restore_tie): New define_insn.
From-SVN: r242949
This patch changes spread_components to use a simpler algorithm that
puts prologue components as early as possible, and epilogue components
as late as possible. This allows better scheduling, and also saves a
bit of code size. The set of blocks that run with some specific
component enabled after this patch is a strict superset of the set that
had it enabled before the patch.
It does this by finding for every component the basic blocks where that
component is not needed on some path from the entry block (it reuses
head_components to store this), and similarly the blocks where the
component is not needed on some path to the exit block (or the exit can
not be reached from that block) (stored in tail_components). Blocks
that then are not in both of those two sets get the component active.
* shrink-wrap.c (init_separate_shrink_wrap): Do not clear
head_components and tail_components.
(spread_components): New algorithm.
(emit_common_tails_for_components): Clear head_components and
tail_components.
(insert_prologue_epilogue_for_components): Write extra output to the
dump file for sibcalls and abnormal exits.
From-SVN: r242948
Combine can turn a conditional trap into an unconditional trap. If it
does that it should make the code after it unreachable (an unconditional
trap should be the last insn in its bb, and that bb has no successors).
This patch seems to work. It is hard to be sure, since this is very
hard to trigger. Quite a few other passes look like they need something
similar as well, but I don't see anything else handling it yet either.
PR rtl-optimization/78342
* combine.c: Include "cfghooks.h".
(try_combine): If we create an unconditional trap, break the basic
block in two just after it, and remove the edge between; also, set
the *new_direct_jump_p flag so that cleanup_cfg is run.
From-SVN: r242947
simplify_truncation changes the truncation of many operations into
the operation on the truncation. This patch makes this code also
handle extracts.
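
In C terms the idea looks roughly like the sketch below: truncating the
result of an operation gives the same value as doing the operation on
the truncated inputs, and the same holds for extracting a bit field
that lies entirely within the narrower mode. The function names, the
widths and the field position are hypothetical, and C shifts hide the
position adjustment (pos') that the RTL form needs.

```c
#include <stdint.h>

/* Truncate after the operation.  */
uint8_t
trunc_of_add (uint32_t a, uint32_t b)
{
  return (uint8_t) (a + b);
}

/* Do the operation on the truncated values.  */
uint8_t
add_of_trunc (uint32_t a, uint32_t b)
{
  return (uint8_t) ((uint8_t) a + (uint8_t) b);
}

/* Extract LEN bits of X starting at bit POS (LEN must be below 32).  */
static uint32_t
extract_bits (uint32_t x, unsigned pos, unsigned len)
{
  return (x >> pos) & ((1u << len) - 1u);
}

/* Truncate the result of an extract from the wide value.  */
uint8_t
trunc_of_extract (uint32_t x)
{
  return (uint8_t) extract_bits (x, 2, 4);
}

/* Extract the same field from the truncated value.  */
uint8_t
extract_of_trunc (uint32_t x)
{
  return (uint8_t) extract_bits ((uint8_t) x, 2, 4);
}
```

The two extract variants agree here only because bits 2 to 5 lie inside
the low 8 bits; a field that reaches beyond the narrower mode would not
be preserved by truncating first.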
* simplify-rtx.c (simplify_truncation): Handle truncate of zero_extract
and sign_extract.
From-SVN: r242946