gcc/ChangeLog:
PR middle-end/83688
* gimple-ssa-sprintf.c (format_result::alias_info): New struct.
(directive::argno): New member.
(format_result::aliases, format_result::alias_count): New data members.
(format_result::append_alias): New member function.
(fmtresult::dst_offset): New data member.
(pass_sprintf_length::call_info::dst_origin): New data member.
(pass_sprintf_length::call_info::dst_field, dst_offset): Same.
(char_type_p, array_elt_at_offset, field_at_offset): New functions.
(get_origin_and_offset): Same.
(format_string): Call it.
(format_directive): Call append_alias and set directive argument
number.
(maybe_warn_overlap): New function.
(pass_sprintf_length::compute_format_length): Call it.
(pass_sprintf_length::handle_gimple_call): Initialize new members.
* gcc/tree-ssa-strlen.c (): Also enable when -Wrestrict is on.
gcc/testsuite/ChangeLog:
PR tree-optimization/35503
* gcc.dg/tree-ssa/builtin-sprintf-warn-23.c: New test.
From-SVN: r278098
try_forward_edges does not update dominance info, and merge_blocks
relies on it being up-to-date. In PR92430 stale dominance info makes
merge_blocks produce a loop in the dominator tree, which in turn makes
delete_basic_block loop forever.
Fix by freeing dominance info at the beginning of cleanup_cfg.
gcc/ChangeLog:
2019-11-12 Ilya Leoshkevich <iii@linux.ibm.com>
PR rtl-optimization/92430
* cfgcleanup.c (pass_jump_after_combine::execute): Free
dominance info at the beginning.
gcc/testsuite/ChangeLog:
2019-11-12 Ilya Leoshkevich <iii@linux.ibm.com>
PR rtl-optimization/92430
* gcc.dg/pr92430.c: New test (from Arseny Solokha).
From-SVN: r278095
2019-11-12 Martin Liska <mliska@suse.cz>
* common/common-target.def:
Do not mention set_default_param_value
and set_param_value.
* doc/tm.texi: Likewise.
From-SVN: r278088
2019-11-12 Martin Liska <mliska@suse.cz>
* common.opt: Remove param_values.
* config/i386/i386-options.c (ix86_valid_target_attribute_p):
Remove finalize_options_struct.
* gcc.c (driver::decode_argv): Do not call global_init_params
and finish_params.
(driver::finalize): Do not call params_c_finalize
and finalize_options_struct.
* opt-suggestions.c (option_proposer::get_completions): Remove
special casing of params.
(option_proposer::find_param_completions): Remove.
(test_completion_partial_match): Update expected output.
* opt-suggestions.h: Remove find_param_completions.
* opts-common.c (add_misspelling_candidates): Add
--param with a space.
* opts.c (handle_param): Remove.
(init_options_struct):. Remove init_options_struct and
similar calls.
(finalize_options_struct): Remove.
(common_handle_option): Use SET_OPTION_IF_UNSET.
* opts.h (finalize_options_struct): Remove.
* toplev.c (general_init): Do not call global_init_params.
(toplev::finalize): Do not call params_c_finalize and
finalize_options_struct.
From-SVN: r278087
2019-11-12 Martin Liska <mliska@suse.cz>
* Makefile.in: Include params.opt.
* flag-types.h (enum parloops_schedule_type): Add
parloops_schedule_type used in params.opt.
* params.opt: New file.
From-SVN: r278084
2019-11-12 Martin Liska <mliska@suse.cz>
* common.opt: Remove --param and --param= options.
* opt-functions.awk: Mark CL_PARAMS for options
that have Param keyword.
* opts-common.c (decode_cmdline_options_to_array):
Replace --param key=value with --param=key=value.
* opts.c (print_filtered_help): Remove special
printing of params.
(print_specific_help): Update title for params.
(common_handle_option): Do not handle OPT__param.
opts.h (SET_OPTION_IF_UNSET): New macro.
* doc/options.texi: Document Param keyword.
From-SVN: r278083
The `serial' construct (cf. section 2.5.3 of the OpenACC 2.6 standard)
is equivalent to a `parallel' construct with clauses `num_gangs(1)
num_workers(1) vector_length(1)' implied.
These clauses are therefore not supported with the `serial'
construct. All the remaining clauses accepted with `parallel' are also
accepted with `serial'.
The `serial' construct is implemented like `parallel', except for
hardcoding dimensions rather than taking them from the relevant
clauses, in `expand_omp_target'.
Separate codes are used to denote the `serial' construct throughout the
middle end, even though the mapping of `serial' to an equivalent
`parallel' construct could have been done in the individual language
frontends. In particular, this allows to distinguish between compute
constructs in warnings, error messages, dumps etc.
2019-11-12 Maciej W. Rozycki <macro@codesourcery.com>
Tobias Burnus <tobias@codesourcery.com>
Frederik Harwath <frederik@codesourcery.com>
Thomas Schwinge <thomas@codesourcery.com>
gcc/
* gimple.h (gf_mask): Add GF_OMP_TARGET_KIND_OACC_SERIAL
enumeration constant.
(is_gimple_omp_oacc): Handle GF_OMP_TARGET_KIND_OACC_SERIAL.
(is_gimple_omp_offloaded): Likewise.
* gimplify.c (omp_region_type): Add ORT_ACC_SERIAL enumeration
constant. Adjust the value of ORT_NONE accordingly.
(is_gimple_stmt): Handle OACC_SERIAL.
(oacc_default_clause): Handle ORT_ACC_SERIAL.
(gomp_needs_data_present): Likewise.
(gimplify_adjust_omp_clauses): Likewise.
(gimplify_omp_workshare): Handle OACC_SERIAL.
(gimplify_expr): Likewise.
* omp-expand.c (expand_omp_target):
Handle GF_OMP_TARGET_KIND_OACC_SERIAL.
(build_omp_regions_1, omp_make_gimple_edges): Likewise.
* omp-low.c (is_oacc_parallel): Rename function to...
(is_oacc_parallel_or_serial): ... this.
Handle GF_OMP_TARGET_KIND_OACC_SERIAL.
(scan_sharing_clauses): Adjust accordingly.
(scan_omp_for): Likewise.
(lower_oacc_head_mark): Likewise.
(convert_from_firstprivate_int): Likewise.
(lower_omp_target): Likewise.
(check_omp_nesting_restrictions): Handle
GF_OMP_TARGET_KIND_OACC_SERIAL.
(lower_oacc_reductions): Likewise.
(lower_omp_target): Likewise.
* tree.def (OACC_SERIAL): New tree code.
* tree-pretty-print.c (dump_generic_node): Handle OACC_SERIAL.
* doc/generic.texi (OpenACC): Document OACC_SERIAL.
gcc/c-family/
* c-pragma.h (pragma_kind): Add PRAGMA_OACC_SERIAL enumeration
constant.
* c-pragma.c (oacc_pragmas): Add "serial" entry.
gcc/c/
* c-parser.c (OACC_SERIAL_CLAUSE_MASK): New macro.
(c_parser_oacc_kernels_parallel): Rename function to...
(c_parser_oacc_compute): ... this. Handle PRAGMA_OACC_SERIAL.
(c_parser_omp_construct): Update accordingly.
gcc/cp/
* constexpr.c (potential_constant_expression_1): Handle
OACC_SERIAL.
* parser.c (OACC_SERIAL_CLAUSE_MASK): New macro.
(cp_parser_oacc_kernels_parallel): Rename function to...
(cp_parser_oacc_compute): ... this. Handle PRAGMA_OACC_SERIAL.
(cp_parser_omp_construct): Update accordingly.
(cp_parser_pragma): Handle PRAGMA_OACC_SERIAL. Fix alphabetic
order.
* pt.c (tsubst_expr): Handle OACC_SERIAL.
gcc/fortran/
* gfortran.h (gfc_statement): Add ST_OACC_SERIAL_LOOP,
ST_OACC_END_SERIAL_LOOP, ST_OACC_SERIAL and ST_OACC_END_SERIAL
enumeration constants.
(gfc_exec_op): Add EXEC_OACC_SERIAL_LOOP and EXEC_OACC_SERIAL
enumeration constants.
* match.h (gfc_match_oacc_serial): New prototype.
(gfc_match_oacc_serial_loop): Likewise.
* dump-parse-tree.c (show_omp_node, show_code_node): Handle
EXEC_OACC_SERIAL_LOOP and EXEC_OACC_SERIAL.
* match.c (match_exit_cycle): Handle EXEC_OACC_SERIAL_LOOP.
* openmp.c (OACC_SERIAL_CLAUSES): New macro.
(gfc_match_oacc_serial_loop): New function.
(gfc_match_oacc_serial): Likewise.
(oacc_is_loop): Handle EXEC_OACC_SERIAL_LOOP.
(resolve_omp_clauses): Handle EXEC_OACC_SERIAL.
(oacc_code_to_statement): Handle EXEC_OACC_SERIAL and
EXEC_OACC_SERIAL_LOOP.
(gfc_resolve_oacc_directive): Likewise.
* parse.c (decode_oacc_directive) <'s'>: Add case for "serial"
and "serial loop".
(next_statement): Handle ST_OACC_SERIAL_LOOP and ST_OACC_SERIAL.
(gfc_ascii_statement): Likewise. Handle ST_OACC_END_SERIAL_LOOP
and ST_OACC_END_SERIAL.
(parse_oacc_structured_block): Handle ST_OACC_SERIAL.
(parse_oacc_loop): Handle ST_OACC_SERIAL_LOOP and
ST_OACC_END_SERIAL_LOOP.
(parse_executable): Handle ST_OACC_SERIAL_LOOP and
ST_OACC_SERIAL.
(is_oacc): Handle EXEC_OACC_SERIAL_LOOP and EXEC_OACC_SERIAL.
* resolve.c (gfc_resolve_blocks, gfc_resolve_code): Likewise.
* st.c (gfc_free_statement): Likewise.
* trans-openmp.c (gfc_trans_oacc_construct): Handle
EXEC_OACC_SERIAL.
(gfc_trans_oacc_combined_directive): Handle
EXEC_OACC_SERIAL_LOOP.
(gfc_trans_oacc_directive): Handle EXEC_OACC_SERIAL_LOOP and
EXEC_OACC_SERIAL.
* trans.c (trans_code): Likewise.
gcc/testsuite/
* c-c++-common/goacc/parallel-dims.c: New test.
* gfortran.dg/goacc/parallel-dims.f90: New test.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: New test.
* testsuite/libgomp.oacc-fortran/parallel-dims-aux.c: New test.
* testsuite/libgomp.oacc-fortran/parallel-dims.f89: New test.
* testsuite/libgomp.oacc-fortran/parallel-dims-2.f90: New test.
Reviewed-by: Thomas Schwinge <thomas@codesourcery.com>
Co-Authored-By: Frederik Harwath <frederik@codesourcery.com>
Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
Co-Authored-By: Tobias Burnus <tobias@codesourcery.com>
From-SVN: r278082
PR tree-optimization/92452
* tree-vrp.c (vrp_prop::check_array_ref): If TRUNC_DIV_EXPR folds
into NULL_TREE, set up_bound to NULL_TREE instead of computing
MINUS_EXPR on it.
* c-c++-common/pr92452.c: New test.
From-SVN: r278080
Supporting TLS for -mpcrel turns out to be relatively simple. The
existing TLSGD and TLSLD unspecs happily can have their GOT pointer
reg element replaced with zero, refelecting the fact that optimisation
of calls to __tls_get_addr when pc-rel won't use the GOT pointer.
Some other insns also can be reused, and just a few added.
* config/rs6000/predicates.md (unspec_tls): Allow const0_rtx for got
element of unspec vec.
* config/rs6000/rs6000.c (rs6000_legitimize_tls_address): Support
PC-relative TLS.
* config/rs6000/rs6000.md (UNSPEC_TLSTLS_PCREL): New unspec.
(tls_gd_pcrel, tls_ld_pcrel): New insns.
(tls_dtprel, tls_tprel): Set attr prefixed when tls_size is not 16.
(tls_got_tprel_pcrel, tls_tls_pcrel): New insns.
From-SVN: r278076
2019-11-11 Michael Meissner <meissner@linux.ibm.com>
* config/rs6000/predicates.md (prefixed_memory): New predicate.
* config/rs6000/rs6000.md (stack_protect_setdi): Deal with either
address being a prefixed load/store.
(stack_protect_testdi): Deal with either address being a prefixed
load.
From-SVN: r278069
This PR was caused by the SLP handling in get_group_load_store_type
returning VMAT_CONTIGUOUS rather than VMAT_CONTIGUOUS_REVERSE for
downward groups.
A more elaborate fix would be to try to combine the reverse permutation
into SLP_TREE_LOAD_PERMUTATION for loads, but that's really a follow-on
optimisation and not backport material. It might also not necessarily
be a win, if the target supports (say) reversing and odd/even swaps
as independent permutes but doesn't recognise the combined form.
2019-11-11 Richard Sandiford <richard.sandiford@arm.com>
gcc/
PR tree-optimization/92420
* tree-vect-stmts.c (get_negative_load_store_type): Move further
up file.
(get_group_load_store_type): Use it for reversed SLP accesses.
gcc/testsuite/
* gcc.dg/vect/pr92420.c: New test.
From-SVN: r278064
Bump the minimum MPFR version to 3.1.0, released 2011-10-03. With this
requirement one can still build GCC with the operating system provided
MPFR on old but still supported operating systems like SLES 12 (MPFR
3.1.2) or RHEL/CentOS 7.x (MPFR 3.1.1).
This allows removing some code in the Fortran frontend, as well as
fixing PR 91828.
ChangeLog:
2019-11-11 Janne Blomqvist <jb@gcc.gnu.org>
PR fortran/91828
* configure.ac: Bump minimum MPFR to 3.1.0, recommended to 3.1.6+.
* configure: Regenerated.
gcc/ChangeLog:
2019-11-11 Janne Blomqvist <jb@gcc.gnu.org>
PR fortran/91828
* doc/install.texi: Document that the minimum MPFR version is
3.1.0.
gcc/fortran/ChangeLog:
2019-11-11 Janne Blomqvist <jb@gcc.gnu.org>
PR fortran/91828
* simplify.c (gfc_simplify_fraction): Remove fallback path for
MPFR < 3.1.0.
From-SVN: r278058
The movsi_ne variants are in a wrong order, leading to wrong
computation of the internal attribute "cond". Hence, to errors when
outputting annul-true or annul-false instructions.
gcc/
xxxx-xx-xx Claudiu Zissulescu <claziss@synopsys.com>
Shahab Vahedi <shahab@synopsys.com>
* config/arc/arc.md (movsi_ne): Reorder instruction variants.
testsuite/
xxxx-xx-xx Shahab Vahedi <shahab@synopsys.com>
* gcc.target/arc/delay-slot-limm.c: New test.
From-SVN: r278057
There are cases when an pic address gets complicated, and it needs to
be resolved via force_reg function found in
prepare_move_operands. When this happens, we need to disambiguate the
pic address and re-legitimize it.
gcc/
xxxx-xx-xx Claudiu Zissulescu <claziss@synopsys.com>
* config/arc/arc.c (arc_legitimize_pic_address): Consider UNSPECs
as well, if interesting recover the symbol and re-legitimize the
pic address.
gcc/testsuite/
xxxx-xx-xx Claudiu Zissulescu <claziss@synopsys.com>
* gcc.target/arc/pic-2.c: New file.
From-SVN: r278056
2019-11-11 Martin Liska <mliska@suse.cz>
* Make-lang.in: Relax dependency of lto-dump.o to
LTO_OBJS which will allow faster linking (mainly with LTO).
From-SVN: r278054
niters for epilogue
gcc/ChangeLog:
2019-11-11 Andre Vieira <andre.simoesdiasvieira@arm.com>
* tree-vect-loop-manip.c (vect_do_peeling): Take epilogue gaps into
account when checking if there are enough iterations to vectorize
epilogue.
gcc/testsuite/ChangeLog:
2019-11-11 Andre Vieira <andre.simoesdiasvieira@arm.com>
* gcc.dg/vect/vect-reduc-epilogue-gaps.c: New test.
From-SVN: r278049
2019-11-11 José Rui Faustino de Sousa <jrfsousa@gmail.com>
libgfortran/
PR fortran/92142
* runtime/ISO_Fortran_binding.c (CFI_setpointer): Don't
override descriptor attribute; with -fcheck, check that
it is a pointer.
gcc/testsuite/
PR fortran/92142
* gcc/testsuite/gfortran.dg/ISO_Fortran_binding_16.c: New.
* gcc/testsuite/gfortran.dg/ISO_Fortran_binding_16.f90: New.
* gcc/testsuite/gfortran.dg/ISO_Fortran_binding_10.c: Correct
upper bounds for case 0.
From-SVN: r278048
On x86, since -fPIC and -shared should be used to create offload image,
we put them the last to properly create offload image.
2019-11-11 H.J. Lu <hjl.tools@gmail.com>
PR target/87833
* config/i386/intelmic-mkoffload.c (prepare_target_image): Put
-fPIC and -shared the last to create offload image.
From-SVN: r278041
The 'gcc/configure' script sources all 'gcc/*/config-lang.in' files, but fails
to emit such dependency information into the build machinery. That means,
currently, when something gets changed in a 'gcc/*/config-lang.in' file, this
is not noticed, and doesn't propagate through the build machinery.
Handling of configure fragments is modelled in the same way as it already
exists for Makefile fragments.
gcc/
* Makefile.in (LANG_CONFIGUREFRAGS): Define.
(config.status): Use/depend on it.
* configure.ac (all_lang_configurefrags): Track, 'AC_SUBST'.
* configure: Regenerate.
From-SVN: r278035
In this patch, loop unroll adjust hook is introduced for powerpc. We
can do target related heuristic adjustment in this hook. In this patch,
-funroll-loops is enabled for small loops at O2 and above with an option
-munroll-small-loops to guard the small loops unrolling, and it works
fine with -flto.
gcc/
2019-11-11 Jiufu Guo <guojiufu@linux.ibm.com>
PR tree-optimization/88760
* gcc/config/rs6000/rs6000.opt (-munroll-only-small-loops): New option.
* gcc/common/config/rs6000/rs6000-common.c
(rs6000_option_optimization_table) [OPT_LEVELS_2_PLUS_SPEED_ONLY]:
Turn on -funroll-loops and -munroll-only-small-loops.
[OPT_LEVELS_ALL]: Turn off -fweb and -frename-registers.
* config/rs6000/rs6000.c (rs6000_option_override_internal): Remove
set of PARAM_MAX_UNROLL_TIMES and PARAM_MAX_UNROLLED_INSNS.
Turn off -munroll-only-small-loops for explicit -funroll-loops.
(TARGET_LOOP_UNROLL_ADJUST): Add loop unroll adjust hook.
(rs6000_loop_unroll_adjust): Define it. Use -munroll-only-small-loops.
gcc.testsuite/
2019-11-11 Jiufu Guo <guojiufu@linux.ibm.com>
PR tree-optimization/88760
* gcc.dg/pr59643.c: Update back to r277550.
From-SVN: r278034
To align with rs6000_insn_cost costing more for load type insns,
this patch is to make load insns cost more in vectorization cost
function. The latency of load insns is about twice that of
"simple" instructions; 2 vs. 1 on older cores, and 4 (or so) vs.
2 on newer cores. Considering that the result of load usually
is used somehow later (true-dep) but store won't, we keep the
store as before.
The SPEC2017 performance evaluation on Power8 shows 525.x264_r
+9.56%, 511.povray_r +2.08%, 527.cam4_r 1.16% gains, no
significant degradation, SPECINT geomean +0.88%, SPECFP geomean
+0.26%.
The SPEC2017 performance evaluation on Power9 shows no significant
improvement or degradation, SPECINT geomean +0.04%, SPECFP geomean
+0.04%.
The SPEC2006 performance evaluation on Power8 shows 454.calculix
+4.41% gain but 416.gamess -1.19% and 453.povray -3.83% degradation.
I looked into the two degradation bmks, the degradation were NOT
due to hotspot changes by vectorization, were all side effects.
SPECINT geomean +0.10%, SPECFP geomean no changed considering
the degradation.
gcc/ChangeLog
2019-11-11 Kewen Lin <linkw@gcc.gnu.org>
* config/rs6000/rs6000.c (rs6000_builtin_vectorization_cost): Make
scalar_load, vector_load, unaligned_load and vector_gather_load cost
more to conform hardware latency and insn cost settings.
From-SVN: r278033