Supporting TLS for -mpcrel turns out to be relatively simple. The
existing TLSGD and TLSLD unspecs happily can have their GOT pointer
reg element replaced with zero, refelecting the fact that optimisation
of calls to __tls_get_addr when pc-rel won't use the GOT pointer.
Some other insns also can be reused, and just a few added.
* config/rs6000/predicates.md (unspec_tls): Allow const0_rtx for got
element of unspec vec.
* config/rs6000/rs6000.c (rs6000_legitimize_tls_address): Support
PC-relative TLS.
* config/rs6000/rs6000.md (UNSPEC_TLSTLS_PCREL): New unspec.
(tls_gd_pcrel, tls_ld_pcrel): New insns.
(tls_dtprel, tls_tprel): Set attr prefixed when tls_size is not 16.
(tls_got_tprel_pcrel, tls_tls_pcrel): New insns.
From-SVN: r278076
2019-11-11 Michael Meissner <meissner@linux.ibm.com>
* config/rs6000/predicates.md (prefixed_memory): New predicate.
* config/rs6000/rs6000.md (stack_protect_setdi): Deal with either
address being a prefixed load/store.
(stack_protect_testdi): Deal with either address being a prefixed
load.
From-SVN: r278069
This PR was caused by the SLP handling in get_group_load_store_type
returning VMAT_CONTIGUOUS rather than VMAT_CONTIGUOUS_REVERSE for
downward groups.
A more elaborate fix would be to try to combine the reverse permutation
into SLP_TREE_LOAD_PERMUTATION for loads, but that's really a follow-on
optimisation and not backport material. It might also not necessarily
be a win, if the target supports (say) reversing and odd/even swaps
as independent permutes but doesn't recognise the combined form.
2019-11-11 Richard Sandiford <richard.sandiford@arm.com>
gcc/
PR tree-optimization/92420
* tree-vect-stmts.c (get_negative_load_store_type): Move further
up file.
(get_group_load_store_type): Use it for reversed SLP accesses.
gcc/testsuite/
* gcc.dg/vect/pr92420.c: New test.
From-SVN: r278064
Bump the minimum MPFR version to 3.1.0, released 2011-10-03. With this
requirement one can still build GCC with the operating system provided
MPFR on old but still supported operating systems like SLES 12 (MPFR
3.1.2) or RHEL/CentOS 7.x (MPFR 3.1.1).
This allows removing some code in the Fortran frontend, as well as
fixing PR 91828.
ChangeLog:
2019-11-11 Janne Blomqvist <jb@gcc.gnu.org>
PR fortran/91828
* configure.ac: Bump minimum MPFR to 3.1.0, recommended to 3.1.6+.
* configure: Regenerated.
gcc/ChangeLog:
2019-11-11 Janne Blomqvist <jb@gcc.gnu.org>
PR fortran/91828
* doc/install.texi: Document that the minimum MPFR version is
3.1.0.
gcc/fortran/ChangeLog:
2019-11-11 Janne Blomqvist <jb@gcc.gnu.org>
PR fortran/91828
* simplify.c (gfc_simplify_fraction): Remove fallback path for
MPFR < 3.1.0.
From-SVN: r278058
The movsi_ne variants are in a wrong order, leading to wrong
computation of the internal attribute "cond". Hence, to errors when
outputting annul-true or annul-false instructions.
gcc/
xxxx-xx-xx Claudiu Zissulescu <claziss@synopsys.com>
Shahab Vahedi <shahab@synopsys.com>
* config/arc/arc.md (movsi_ne): Reorder instruction variants.
testsuite/
xxxx-xx-xx Shahab Vahedi <shahab@synopsys.com>
* gcc.target/arc/delay-slot-limm.c: New test.
From-SVN: r278057
There are cases when an pic address gets complicated, and it needs to
be resolved via force_reg function found in
prepare_move_operands. When this happens, we need to disambiguate the
pic address and re-legitimize it.
gcc/
xxxx-xx-xx Claudiu Zissulescu <claziss@synopsys.com>
* config/arc/arc.c (arc_legitimize_pic_address): Consider UNSPECs
as well, if interesting recover the symbol and re-legitimize the
pic address.
gcc/testsuite/
xxxx-xx-xx Claudiu Zissulescu <claziss@synopsys.com>
* gcc.target/arc/pic-2.c: New file.
From-SVN: r278056
2019-11-11 Martin Liska <mliska@suse.cz>
* Make-lang.in: Relax dependency of lto-dump.o to
LTO_OBJS which will allow faster linking (mainly with LTO).
From-SVN: r278054
niters for epilogue
gcc/ChangeLog:
2019-11-11 Andre Vieira <andre.simoesdiasvieira@arm.com>
* tree-vect-loop-manip.c (vect_do_peeling): Take epilogue gaps into
account when checking if there are enough iterations to vectorize
epilogue.
gcc/testsuite/ChangeLog:
2019-11-11 Andre Vieira <andre.simoesdiasvieira@arm.com>
* gcc.dg/vect/vect-reduc-epilogue-gaps.c: New test.
From-SVN: r278049
2019-11-11 José Rui Faustino de Sousa <jrfsousa@gmail.com>
libgfortran/
PR fortran/92142
* runtime/ISO_Fortran_binding.c (CFI_setpointer): Don't
override descriptor attribute; with -fcheck, check that
it is a pointer.
gcc/testsuite/
PR fortran/92142
* gcc/testsuite/gfortran.dg/ISO_Fortran_binding_16.c: New.
* gcc/testsuite/gfortran.dg/ISO_Fortran_binding_16.f90: New.
* gcc/testsuite/gfortran.dg/ISO_Fortran_binding_10.c: Correct
upper bounds for case 0.
From-SVN: r278048
On x86, since -fPIC and -shared should be used to create offload image,
we put them the last to properly create offload image.
2019-11-11 H.J. Lu <hjl.tools@gmail.com>
PR target/87833
* config/i386/intelmic-mkoffload.c (prepare_target_image): Put
-fPIC and -shared the last to create offload image.
From-SVN: r278041
The 'gcc/configure' script sources all 'gcc/*/config-lang.in' files, but fails
to emit such dependency information into the build machinery. That means,
currently, when something gets changed in a 'gcc/*/config-lang.in' file, this
is not noticed, and doesn't propagate through the build machinery.
Handling of configure fragments is modelled in the same way as it already
exists for Makefile fragments.
gcc/
* Makefile.in (LANG_CONFIGUREFRAGS): Define.
(config.status): Use/depend on it.
* configure.ac (all_lang_configurefrags): Track, 'AC_SUBST'.
* configure: Regenerate.
From-SVN: r278035
In this patch, loop unroll adjust hook is introduced for powerpc. We
can do target related heuristic adjustment in this hook. In this patch,
-funroll-loops is enabled for small loops at O2 and above with an option
-munroll-small-loops to guard the small loops unrolling, and it works
fine with -flto.
gcc/
2019-11-11 Jiufu Guo <guojiufu@linux.ibm.com>
PR tree-optimization/88760
* gcc/config/rs6000/rs6000.opt (-munroll-only-small-loops): New option.
* gcc/common/config/rs6000/rs6000-common.c
(rs6000_option_optimization_table) [OPT_LEVELS_2_PLUS_SPEED_ONLY]:
Turn on -funroll-loops and -munroll-only-small-loops.
[OPT_LEVELS_ALL]: Turn off -fweb and -frename-registers.
* config/rs6000/rs6000.c (rs6000_option_override_internal): Remove
set of PARAM_MAX_UNROLL_TIMES and PARAM_MAX_UNROLLED_INSNS.
Turn off -munroll-only-small-loops for explicit -funroll-loops.
(TARGET_LOOP_UNROLL_ADJUST): Add loop unroll adjust hook.
(rs6000_loop_unroll_adjust): Define it. Use -munroll-only-small-loops.
gcc.testsuite/
2019-11-11 Jiufu Guo <guojiufu@linux.ibm.com>
PR tree-optimization/88760
* gcc.dg/pr59643.c: Update back to r277550.
From-SVN: r278034
To align with rs6000_insn_cost costing more for load type insns,
this patch is to make load insns cost more in vectorization cost
function. The latency of load insns is about twice that of
"simple" instructions; 2 vs. 1 on older cores, and 4 (or so) vs.
2 on newer cores. Considering that the result of load usually
is used somehow later (true-dep) but store won't, we keep the
store as before.
The SPEC2017 performance evaluation on Power8 shows 525.x264_r
+9.56%, 511.povray_r +2.08%, 527.cam4_r 1.16% gains, no
significant degradation, SPECINT geomean +0.88%, SPECFP geomean
+0.26%.
The SPEC2017 performance evaluation on Power9 shows no significant
improvement or degradation, SPECINT geomean +0.04%, SPECFP geomean
+0.04%.
The SPEC2006 performance evaluation on Power8 shows 454.calculix
+4.41% gain but 416.gamess -1.19% and 453.povray -3.83% degradation.
I looked into the two degradation bmks, the degradation were NOT
due to hotspot changes by vectorization, were all side effects.
SPECINT geomean +0.10%, SPECFP geomean no changed considering
the degradation.
gcc/ChangeLog
2019-11-11 Kewen Lin <linkw@gcc.gnu.org>
* config/rs6000/rs6000.c (rs6000_builtin_vectorization_cost): Make
scalar_load, vector_load, unaligned_load and vector_gather_load cost
more to conform hardware latency and insn cost settings.
From-SVN: r278033
Some of the solution to PR71767 is incomplete, and we need finer-grained
control over whether symbols need to be made linker-visible. This is a
preparation patch, providing the flag.
gcc/ChangeLog:
2019-11-10 Iain Sandoe <iain@sandoe.co.uk>
* config/darwin.h (MACHO_SYMBOL_FLAG_LINKER_VIS): New.
(MACHO_SYMBOL_LINKER_VIS_P): New.
From-SVN: r278028
As part of PR 91413, GFortran now prints a warning when a variable is
moved from the stack to static storage. However, when the user
explicitly specifies that all local variables should be put in static
storage with the -fno-automatic option, don't print this warning.
Regtested on x86_64-pc-linux-gnu, committed as obvious.
gcc/fortran/ChangeLog:
2019-11-10 Janne Blomqvist <jb@gcc.gnu.org>
PR fortran/91413
* trans-decl.c (gfc_finish_var_decl): Don't print warning when
-fno-automatic is enabled.
From-SVN: r278027
This paper was delayed until the February meeting in Prague so that we could
get a better idea of what the impact on existing code would actually be. To
that end, I'm implementing it now.
* typeck2.c (check_narrowing): Treat pointer->bool as a narrowing
conversion with -std=c++2a.
From-SVN: r278026
2019-11-10 Paul Thomas <pault@gcc.gnu.org>
PR fortran/92123
*decl.c (gfc_verify_c_interop_param): Remove error asserting
that pointer or allocatable variables in a bind C procedure are
not supported. Delete some trailing spaces.
* trans-stmt.c (trans_associate_var): Correct the attempt to
treat scalar pointer or allocatable temporaries as if they are
array descriptors.
2019-11-10 Paul Thomas <pault@gcc.gnu.org>
PR fortran/92123
* gfortran.dg/bind_c_procs_3.f90 : New test.
* gfortran.dg/ISO_Fortran_binding_15.c : New test.
* gfortran.dg/ISO_Fortran_binding_15.f90 : Additional source.
From-SVN: r278025
The liveness of eliminable hard registers is not tracked by LRA between
basic blocks, so they should not be used as spill registers as LRA may
decide to allocate them to pseudos while the spilled value is still live.
2019-11-10 Kwok Cheung Yeung <kcy@codesourcery.com>
gcc/
* lra-spills.c (assign_spill_hard_regs): Do not spill into
registers in eliminable_regset.
From-SVN: r278024
* ipa-inline.c (compute_uninlined_call_time,
compute_inlined_call_time): Take edge frequency as
parameter rather than computing it by itself.
(big_speedup_p, edge_badness): Manually CSE sreal
frequency calculations.
From-SVN: r278023
* ipa-prop.c (ipa_propagate_indirect_call_infos): Remove ipa edge
args summaries of inlined edge unless it holds info about
described reference.
From-SVN: r278020
Sometimes combine wants to do a move in CCFPmode, but we don't currently
handle moves in any CC mode other than CCmode. Fix that oversight.
* config/rs6000/rs6000.md (CC_any): New mode iterator.
(*movcc_internal1): Rename to...
(*movcc_<mode> for CC_any): ... this. Support moves of all CC modes.
From-SVN: r278017
* cgraph.h (struct cgraph_node): Add ipcp_clone flag.
(cgraph_node::create_virtual_clone): Copy it.
* ipa-cp.c (ipcp_versionable_function_p): Watch for missing
summaries.
(ignore_edge_p): If caller has ipa-cp disabled, skip the edge, too.
(ipcp_verify_propagated_values): Do not verify nodes where ipcp
is disabled.
(propagate_constants_across_call): If callee is not analyzed, give up.
(propagate_constants_topo): Lower to bottom latties of all callees of
functions with ipa-cp disabled.
(ipcp_propagate_stage): Skip functions with ipa-cp disabled.
(cgraph_edge_brings_value_p): Check for availability first.
(create_specialized_node): Set ipcp_clone.
(ipcp_store_bits_results): Check that info is present.
* ipa-fnsummary.c (evaluate_properties_for_edge): Do not analyze
thunks.
(ipa_call_context::duplicate_from, ipa_call_context::equal_to): Be
conservative when callee summary is missing.
(remap_edge_summaries): Lookup call summary only when needed.
* ipa-icf.c (sem_function::param_used_p): Be ready for missing summary.
* ipa-prpo.c (ipa_alloc_node_params, ipa_initialize_node_params):
Use get_create.
(ipa_analyze_node): Use get_create.
(propagate_controlled_uses): Do not propagate when function is not
analyzed.
(ipa_propagate_indirect_call_infos): Remove summary of inline clone.
(ipa_read_node_info): Use get_create.
* ipa-prop.h (IPA_NODE_REF): Use get.
(IPA_NODE_REF_GET_CREATE): New.
From-SVN: r278016
* ipa-inline-analysis.c (do_estimate_growth_1): Add support for
capping the growth cumulated.
(offline_size): Break out from ...
(estimate_growth): ... here.
(check_callers): Add N, OFFLINE and MIN_SIZE and KNOWN_EDGE
parameters.
(growth_likely_positive): Turn to ...
(growth_positive_p): Re-implement.
* ipa-inline.h (growth_likely_positive): Remove.
(growth_positive_p): Declare.
* ipa-inline.c (want_inline_small_function_p): Use
growth_positive_p.
(want_inline_function_to_all_callers_p): Likewise.
From-SVN: r278007
* ipa-fnsummary.c (estimate_edge_size_and_time): Do not call
estimate_edge_devirt_benefit when not computing hints;
do not compute time when not asked for.
(estimate_calls_size_and_time): Pass NULL hints and time when
these are not computed; do not evaluate hint predicates when these are
not computed.
(ipa_merge_fn_summary_after_inlining): Do not re-evaluate edge
frequency.
From-SVN: r278005
PR tree-optimization/92401
* gimple-match-head.c (gimple_resimplify1): Call const_unop only
if res_op->code is an expression with code length 1.
* gimple-match-head.c (gimple_resimplify2): Call const_binop only
if res_op->code is an expression with code length 2.
* gimple-match-head.c (gimple_resimplify3): Call fold_ternary only
if res_op->code is an expression with code length 3.
* g++.dg/opt/pr92401.C: New test.
From-SVN: r278004