; Parameter options of the compiler.
; Copyright (C) 2019-2022 Free Software Foundation, Inc.
;
; This file is part of GCC.
;
; GCC is free software; you can redistribute it and/or modify it under
; the terms of the GNU General Public License as published by the Free
; Software Foundation; either version 3, or (at your option) any later
; version.
;
; GCC is distributed in the hope that it will be useful, but WITHOUT ANY
; WARRANTY; without even the implied warranty of MERCHANTABILITY or
; FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
; for more details.
;
; You should have received a copy of the GNU General Public License
; along with GCC; see the file COPYING3. If not see
; <http://www.gnu.org/licenses/>.
; See the GCC internals manual (options.texi) for a description of this file's format.
; Please try to keep this file in ASCII collating order.
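; Usage sketch (illustrative, not part of the upstream file): each record
; below defines a knob that can be set with --param on the command line,
; for example
;   gcc -O2 -fsanitize=address --param=asan-stack=0 foo.c
; where foo.c is a hypothetical source file.  A value outside a record's
; IntegerRange is expected to be rejected with a diagnostic.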
-param=align-loop-iterations=
Common Joined UInteger Var(param_align_loop_iterations) Init(4) Param Optimization
Loops iterating at least the selected number of iterations will get loop alignment.
-param=align-threshold=
Common Joined UInteger Var(param_align_threshold) Init(100) IntegerRange(1, 65536) Param Optimization
Select the fraction of the maximal frequency of executions of a basic block in a function used to decide whether a given basic block gets alignment.
-param=asan-globals=
Common Joined UInteger Var(param_asan_globals) Init(1) IntegerRange(0, 1) Param
Enable asan globals protection.
-param=asan-instrument-allocas=
Common Joined UInteger Var(param_asan_protect_allocas) Init(1) IntegerRange(0, 1) Param Optimization
Enable asan allocas/VLAs protection.
-param=asan-instrument-reads=
Common Joined UInteger Var(param_asan_instrument_reads) Init(1) IntegerRange(0, 1) Param Optimization
Enable asan load operations protection.
-param=asan-instrument-writes=
Common Joined UInteger Var(param_asan_instrument_writes) Init(1) IntegerRange(0, 1) Param Optimization
Enable asan store operations protection.
-param=asan-instrumentation-with-call-threshold=
Common Joined UInteger Var(param_asan_instrumentation_with_call_threshold) Init(7000) Param Optimization
Use callbacks instead of inline code if the number of accesses in a function becomes greater than or equal to this number.
-param=asan-memintrin=
Common Joined UInteger Var(param_asan_memintrin) Init(1) IntegerRange(0, 1) Param Optimization
Enable asan builtin functions protection.
-param=asan-stack=
Common Joined UInteger Var(param_asan_stack) Init(1) IntegerRange(0, 1) Param Optimization
Enable asan stack protection.
-param=asan-use-after-return=
Common Joined UInteger Var(param_asan_use_after_return) Init(1) IntegerRange(0, 1) Param Optimization
Enable asan detection of use-after-return bugs.
-param=hwasan-instrument-stack=
Common Joined UInteger Var(param_hwasan_instrument_stack) Init(1) IntegerRange(0, 1) Param Optimization
Enable hwasan instrumentation of statically sized stack-allocated variables.
-param=hwasan-random-frame-tag=
Common Joined UInteger Var(param_hwasan_random_frame_tag) Init(1) IntegerRange(0, 1) Param Optimization
Use random base tag for each frame, as opposed to base always zero.
-param=hwasan-instrument-allocas=
Common Joined UInteger Var(param_hwasan_instrument_allocas) Init(1) IntegerRange(0, 1) Param Optimization
Enable hwasan instrumentation of allocas/VLAs.
-param=hwasan-instrument-reads=
Common Joined UInteger Var(param_hwasan_instrument_reads) Init(1) IntegerRange(0, 1) Param Optimization
Enable hwasan instrumentation of load operations.
-param=hwasan-instrument-writes=
Common Joined UInteger Var(param_hwasan_instrument_writes) Init(1) IntegerRange(0, 1) Param Optimization
Enable hwasan instrumentation of store operations.
-param=hwasan-instrument-mem-intrinsics=
Common Joined UInteger Var(param_hwasan_instrument_mem_intrinsics) Init(1) IntegerRange(0, 1) Param Optimization
Enable hwasan instrumentation of builtin functions.
-param=avg-loop-niter=
Common Joined UInteger Var(param_avg_loop_niter) Init(10) IntegerRange(1, 65536) Param Optimization
Average number of iterations of a loop.
-param=avoid-fma-max-bits=
Common Joined UInteger Var(param_avoid_fma_max_bits) IntegerRange(0, 512) Param Optimization
Maximum number of bits for which we avoid creating FMAs.
-param=builtin-expect-probability=
Common Joined UInteger Var(param_builtin_expect_probability) Init(90) IntegerRange(0, 100) Param Optimization
Set the estimated probability in percentage for builtin expect. The default value is 90% probability.
-param=builtin-string-cmp-inline-length=
Common Joined UInteger Var(param_builtin_string_cmp_inline_length) Init(3) IntegerRange(0, 100) Param Optimization
The maximum length of a constant string for a builtin string cmp call eligible for inlining. The default value is 3.
-param=case-values-threshold=
Common Joined UInteger Var(param_case_values_threshold) Param Optimization
The smallest number of different values for which it is best to use a jump-table instead of a tree of conditional branches; if 0, use the default for the machine.
-param=comdat-sharing-probability=
Common Joined UInteger Var(param_comdat_sharing_probability) Init(20) Param Optimization
Probability that a COMDAT function will be shared with a different compilation unit.
-param=cxx-max-namespaces-for-diagnostic-help=
Common Joined UInteger Var(param_cxx_max_namespaces_for_diagnostic_help) Init(1000) Param
Maximum number of namespaces to search for alternatives when name lookup fails.
-param=dse-max-alias-queries-per-store=
Common Joined UInteger Var(param_dse_max_alias_queries_per_store) Init(256) Param Optimization
Maximum number of queries into the alias oracle per store.
-param=dse-max-object-size=
Common Joined UInteger Var(param_dse_max_object_size) Init(256) Param Optimization
Maximum size (in bytes) of objects tracked bytewise by dead store elimination.
-param=early-inlining-insns=
Common Joined UInteger Var(param_early_inlining_insns) Init(6) Optimization Param
Maximal estimated growth of a function body caused by early inlining of a single call.
-param=evrp-sparse-threshold=
Common Joined UInteger Var(param_evrp_sparse_threshold) Init(800) Optimization Param
Maximum number of basic blocks before EVRP uses a sparse cache.
-param=evrp-switch-limit=
Common Joined UInteger Var(param_evrp_switch_limit) Init(50) Optimization Param
Maximum number of outgoing edges in a switch before EVRP will not process it.
-param=fsm-scale-path-blocks=
Common Joined UInteger Var(param_fsm_scale_path_blocks) Init(3) IntegerRange(1, 10) Param Optimization
Scale factor to apply to the number of blocks in a threading path when comparing to the number of (scaled) statements.
-param=fsm-scale-path-stmts=
Common Joined UInteger Var(param_fsm_scale_path_stmts) Init(2) IntegerRange(1, 10) Param Optimization
Scale factor to apply to the number of statements in a threading path when comparing to the number of (scaled) blocks.
-param=gcse-after-reload-critical-fraction=
Common Joined UInteger Var(param_gcse_after_reload_critical_fraction) Init(10) Param Optimization
The threshold ratio of critical edges execution count that permits performing redundancy elimination after reload.
-param=gcse-after-reload-partial-fraction=
Common Joined UInteger Var(param_gcse_after_reload_partial_fraction) Init(3) Param Optimization
The threshold ratio for performing partial redundancy elimination after reload.
-param=gcse-cost-distance-ratio=
Common Joined UInteger Var(param_gcse_cost_distance_ratio) Init(10) Param Optimization
Scaling factor in calculation of maximum distance an expression can be moved by GCSE optimizations.
-param=gcse-unrestricted-cost=
Common Joined UInteger Var(param_gcse_unrestricted_cost) Init(3) Param Optimization
Cost at which GCSE optimizations will not constrain the distance an expression can travel.
-param=ggc-min-expand=
Common Joined UInteger Var(param_ggc_min_expand) Init(30) Param
Minimum heap expansion to trigger garbage collection, as a percentage of the total size of the heap.
-param=ggc-min-heapsize=
Common Joined UInteger Var(param_ggc_min_heapsize) Init(4096) Param
Minimum heap size before we start collecting garbage, in kilobytes.
-param=gimple-fe-computed-hot-bb-threshold=
Common Joined UInteger Var(param_gimple_fe_computed_hot_bb_threshold) Param
The number of executions of a basic block which is considered hot. The parameter is used only in GIMPLE FE.
-param=graphite-allow-codegen-errors=
Common Joined UInteger Var(param_graphite_allow_codegen_errors) IntegerRange(0, 1) Param
Whether codegen errors should be ICEs when -fchecking.
-param=graphite-max-arrays-per-scop=
Common Joined UInteger Var(param_graphite_max_arrays_per_scop) Init(100) Param Optimization
Maximum number of arrays per SCoP.
-param=graphite-max-nb-scop-params=
Common Joined UInteger Var(param_graphite_max_nb_scop_params) Init(10) Param Optimization
Maximum number of parameters in a SCoP.
-param=hash-table-verification-limit=
Common Joined UInteger Var(param_hash_table_verification_limit) Init(10) Param
The number of elements for which hash table verification is done for each searched element.
-param=hot-bb-count-fraction=
Common Joined UInteger Var(param_hot_bb_count_fraction) Init(10000) Param
The denominator n of fraction 1/n of the maximal execution count of a basic block in the entire program that a basic block needs to at least have in order to be considered hot (used in non-LTO mode).
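; Worked example (illustrative numbers, assuming the comparison works as the
; help text above describes): with the default n = 10000 and a maximal
; basic-block execution count of 5000000 in the program, a basic block needs
; a count of at least 5000000 / 10000 = 500 to be considered hot.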
-param=hot-bb-count-ws-permille=
Common Joined UInteger Var(param_hot_bb_count_ws_permille) Init(990) IntegerRange(0, 1000) Param
The number of most executed permilles of the profiled execution of the entire program of which the execution count of a basic block must be part in order to be considered hot (used in LTO mode).
-param=hot-bb-frequency-fraction=
Common Joined UInteger Var(param_hot_bb_frequency_fraction) Init(1000) Param
The denominator n of fraction 1/n of the execution frequency of the entry block of a function that a basic block of this function needs to at least have in order to be considered hot.
-param=inline-heuristics-hint-percent=
Common Joined UInteger Var(param_inline_heuristics_hint_percent) Init(200) Optimization IntegerRange(100, 1000000) Param
The scale (in percent) applied to the inline-insns-single and auto limits when heuristics hint that inlining is very profitable.
-param=inline-min-speedup=
Common Joined UInteger Var(param_inline_min_speedup) Init(30) Optimization IntegerRange(0, 100) Param
The minimal estimated speedup allowing the inliner to ignore inline-insns-single and inline-insns-auto.
-param=inline-unit-growth=
Common Joined UInteger Var(param_inline_unit_growth) Init(40) Optimization Param
How much a given compilation unit can grow because of inlining (in percent).
-param=integer-share-limit=
Common Joined UInteger Var(param_integer_share_limit) Init(251) IntegerRange(2, 65536) Param
The upper bound for sharing integer constants.
-param=ipa-cp-eval-threshold=
Common Joined UInteger Var(param_ipa_cp_eval_threshold) Init(500) Param Optimization
Threshold ipa-cp opportunity evaluation that is still considered beneficial to clone.
-param=ipa-cp-loop-hint-bonus=
Common Joined UInteger Var(param_ipa_cp_loop_hint_bonus) Init(64) Param Optimization
Compile-time bonus IPA-CP assigns to candidates which make loop bounds or strides known.
-param=ipa-cp-max-recursive-depth=
Common Joined UInteger Var(param_ipa_cp_max_recursive_depth) Init(8) Param Optimization
Maximum depth of recursive cloning for a self-recursive function.
-param=ipa-cp-min-recursive-probability=
Common Joined UInteger Var(param_ipa_cp_min_recursive_probability) Init(2) Param Optimization
Perform recursive cloning only when the probability of the call being executed exceeds the parameter.
-param=ipa-cp-recursive-freq-factor=
Common Joined UInteger Var(param_ipa_cp_recursive_freq_factor) Init(6) Param Optimization
When propagating IPA-CP effect estimates, multiply frequencies of recursive edges that bring back an unchanged value by this factor.
-param=ipa-cp-recursion-penalty=
Common Joined UInteger Var(param_ipa_cp_recursion_penalty) Init(40) IntegerRange(0, 100) Param Optimization
Percentage penalty the recursive functions will receive when they are evaluated for cloning.
-param=ipa-cp-single-call-penalty=
Common Joined UInteger Var(param_ipa_cp_single_call_penalty) Init(15) IntegerRange(0, 100) Param Optimization
Percentage penalty that functions containing a single call to another function receive when they are evaluated for cloning.
-param=ipa-cp-unit-growth=
Common Joined UInteger Var(param_ipa_cp_unit_growth) Init(10) Param Optimization
How much a given compilation unit can grow because of interprocedural constant propagation (in percent).
-param=ipa-cp-large-unit-insns=
Common Joined UInteger Var(param_ipa_cp_large_unit_insns) Optimization Init(16000) Param
The size of translation unit that IPA-CP pass considers large.
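; Worked example (a sketch based on the formula given when this parameter was
; introduced; the names base and limit are illustrative only):
;   base  = MAX (original unit size, ipa-cp-large-unit-insns)
;   limit = base + base * ipa-cp-unit-growth / 100
; With the defaults (16000 and 10) and an estimated unit size of 10513:
;   base  = MAX (10513, 16000) = 16000
;   limit = 16000 + 16000 * 10 / 100 = 17600
; The limits can be adjusted on the command line, for example (file.c is a
; placeholder):
;   gcc -O3 --param=ipa-cp-large-unit-insns=20000 --param=ipa-cp-unit-growth=15 file.c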
-param=ipa-cp-value-list-size=
Common Joined UInteger Var(param_ipa_cp_value_list_size) Init(8) Param Optimization
Maximum size of a list of values associated with each parameter for interprocedural constant propagation.
-param=ipa-cp-profile-count-base=
Common Joined UInteger Var(param_ipa_cp_profile_count_base) Init(10) IntegerRange(0, 100) Param Optimization
When using profile feedback, use the edge at this percentage position in the frequency histogram as the basis for IPA-CP heuristics.
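; Illustration (an assumption about the selection, not a statement of the exact
; implementation): if the counts of the edges eligible for cloning are sorted
; from hottest to coldest, the default value of 10 picks the edge about 10% of
; the way down the list, i.e. roughly the 90th percentile.  With 200 eligible
; edges that would be the edge near position 20.  To try another position
; (file.c is a placeholder):
;   gcc -O3 -fprofile-use --param=ipa-cp-profile-count-base=5 file.c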
-param=ipa-jump-function-lookups=
Common Joined UInteger Var(param_ipa_jump_function_lookups) Init(8) Param Optimization
Maximum number of statements visited during jump function offset discovery.
-param=ipa-max-aa-steps=
Common Joined UInteger Var(param_ipa_max_aa_steps) Init(25000) Param Optimization
Maximum number of statements that will be visited by IPA formal parameter analysis based on alias analysis in any given function.
-param=ipa-max-agg-items=
Common Joined UInteger Var(param_ipa_max_agg_items) Init(16) Param Optimization
Maximum number of aggregate content items for a parameter in jump functions and lattices.
-param=ipa-max-param-expr-ops=
Common Joined UInteger Var(param_ipa_max_param_expr_ops) Init(10) Param Optimization
Maximum number of operations in a parameter expression that can be handled by IPA analysis.
-param=ipa-max-loop-predicates=
Common Joined UInteger Var(param_ipa_max_loop_predicates) Init(16) Param Optimization
Maximum number of different predicates used to track properties of loops in IPA analysis.
-param=ipa-max-switch-predicate-bounds=
Common Joined UInteger Var(param_ipa_max_switch_predicate_bounds) Init(5) Param Optimization
Maximal number of boundary endpoints of case ranges of switch statement used during IPA function summary generation.
-param=ipa-sra-max-replacements=
Common Joined UInteger Var(param_ipa_sra_max_replacements) Optimization Init(8) IntegerRange(0, 16) Param
Maximum number of pieces that IPA-SRA tracks per formal parameter and, as a consequence, also the maximum number of replacements of a formal parameter.
-param=ipa-sra-ptr-growth-factor=
Common Joined UInteger Var(param_ipa_sra_ptr_growth_factor) Init(2) Param Optimization
Maximum allowed growth of number and total size of new parameters that ipa-sra replaces a pointer to an aggregate with.
-param=ira-loop-reserved-regs=
Common Joined UInteger Var(param_ira_loop_reserved_regs) Init(2) Param Optimization
The number of registers in each class kept unused by loop invariant motion.
-param=ira-max-conflict-table-size=
Common Joined UInteger Var(param_ira_max_conflict_table_size) Init(1000) Param Optimization
Max size of conflict table in MB.
-param=ira-max-loops-num=
Common Joined UInteger Var(param_ira_max_loops_num) Init(100) Param Optimization
Maximum number of loops for regional RA.
-param=ira-consider-dup-in-all-alts=
Common Joined UInteger Var(param_ira_consider_dup_in_all_alts) Init(1) IntegerRange(0, 1) Param Optimization
Control whether IRA heavily considers matching constraints (duplicated operand numbers) in all available alternatives for the preferred register class. If set to zero, IRA respects a matching constraint only when it is in the only available alternative with an appropriate register class. Otherwise, IRA checks all available alternatives for the preferred register class, even after it has found a choice with an appropriate register class, and respects any qualifying matching constraint it finds.
-param=iv-always-prune-cand-set-bound=
Common Joined UInteger Var(param_iv_always_prune_cand_set_bound) Init(10) Param Optimization
If the number of candidates in the set is smaller than this bound, we always try to remove unused ivs during its optimization.
-param=iv-consider-all-candidates-bound=
Common Joined UInteger Var(param_iv_consider_all_candidates_bound) Init(40) Param Optimization
Bound on the number of candidates below which all candidates are considered in iv optimizations.
-param=iv-max-considered-uses=
Common Joined UInteger Var(param_iv_max_considered_uses) Init(250) Param Optimization
Bound on the number of iv uses in a loop that is optimized in iv optimizations.
-param=jump-table-max-growth-ratio-for-size=
Common Joined UInteger Var(param_jump_table_max_growth_ratio_for_size) Init(300) Param Optimization
The maximum code size growth ratio when expanding into a jump table (in percent). The parameter is used when optimizing for size.
-param=jump-table-max-growth-ratio-for-speed=
Common Joined UInteger Var(param_jump_table_max_growth_ratio_for_speed) Init(800) Param Optimization
The maximum code size growth ratio when expanding into a jump table (in percent). The parameter is used when optimizing for speed.
-param=l1-cache-line-size=
Common Joined UInteger Var(param_l1_cache_line_size) Init(32) Param Optimization
The size of L1 cache line, in bytes.
-param=destructive-interference-size=
Common Joined UInteger Var(param_destruct_interfere_size) Init(0) Param Optimization
The minimum recommended offset between two concurrently-accessed objects to
avoid additional performance degradation due to contention introduced by the
implementation. Typically the L1 cache line size, but can be larger to
accommodate a variety of target processors with different cache line sizes.
C++17 code might use this value in structure layout, but is strongly
discouraged from doing so in public ABIs.
-param=constructive-interference-size=
Common Joined UInteger Var(param_construct_interfere_size) Init(0) Param Optimization
The maximum recommended size of contiguous memory occupied by two objects
accessed with temporal locality by concurrent threads. Typically the L1 cache
line size, but can be smaller to accommodate a variety of target processors with
different cache line sizes.
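; Usage sketch (illustrative; the struct name and file name are hypothetical):
; C++17 code that is private to one project can keep per-thread data apart with
;   #include <new>
;   struct alignas (std::hardware_destructive_interference_size) counter { long n; };
; and both values can be pinned for a build, for example:
;   g++ -std=c++17 --param=destructive-interference-size=64 --param=constructive-interference-size=64 file.cc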
-param=l1-cache-size=
Common Joined UInteger Var(param_l1_cache_size) Init(64) Param Optimization
The size of L1 cache, in kilobytes.
-param=l2-cache-size=
Common Joined UInteger Var(param_l2_cache_size) Init(512) Param Optimization
The size of L2 cache, in kilobytes.
-param=large-function-growth=
Common Joined UInteger Var(param_large_function_growth) Optimization Init(100) Param
Maximal growth due to inlining of large function (in percent).
-param=large-function-insns=
Common Joined UInteger Var(param_large_function_insns) Optimization Init(2700) Param
The size of function body to be considered large.
-param=large-stack-frame=
Common Joined UInteger Var(param_large_stack_frame) Init(256) Optimization Param
The size of stack frame to be considered large.
-param=large-stack-frame-growth=
Common Joined UInteger Var(param_stack_frame_growth) Optimization Init(1000) Param
Maximal stack frame growth due to inlining (in percent).
-param=large-unit-insns=
Common Joined UInteger Var(param_large_unit_insns) Optimization Init(10000) Param
The size of translation unit to be considered large.
-param=lazy-modules=
C++ Joined UInteger Var(param_lazy_modules) Init(32768) Param
Maximum number of concurrently open C++ module files when lazy loading.
-param=lim-expensive=
Common Joined UInteger Var(param_lim_expensive) Init(20) Param Optimization
The minimum cost of an expensive expression in the loop invariant motion.
-param=logical-op-non-short-circuit=
Common Joined UInteger Var(param_logical_op_non_short_circuit) Init(-1) IntegerRange(0, 1) Param
True if a non-short-circuit operation is optimal.
-param=loop-block-tile-size=
Common Joined UInteger Var(param_loop_block_tile_size) Init(51) Param Optimization
Size of tiles for loop blocking.
-param=loop-interchange-max-num-stmts=
Common Joined UInteger Var(param_loop_interchange_max_num_stmts) Init(64) Param Optimization
The maximum number of stmts in loop nest for loop interchange.
-param=loop-interchange-stride-ratio=
Common Joined UInteger Var(param_loop_interchange_stride_ratio) Init(2) Param Optimization
The minimum stride ratio for loop interchange to be profitable.
-param=loop-invariant-max-bbs-in-loop=
Common Joined UInteger Var(param_loop_invariant_max_bbs_in_loop) Init(10000) Param Optimization
Maximum number of basic blocks in a loop for loop invariant motion.
-param=loop-max-datarefs-for-datadeps=
Common Joined UInteger Var(param_loop_max_datarefs_for_datadeps) Init(1000) Param Optimization
Maximum number of datarefs in loop for building loop data dependencies.
-param=loop-versioning-max-inner-insns=
Common Joined UInteger Var(param_loop_versioning_max_inner_insns) Init(200) Param Optimization
The maximum number of instructions in an inner loop that is being considered for versioning.
-param=loop-versioning-max-outer-insns=
Common Joined UInteger Var(param_loop_versioning_max_outer_insns) Init(100) Param Optimization
The maximum number of instructions in an outer loop that is being considered for versioning, on top of the instructions in inner loops.
-param=lra-inheritance-ebb-probability-cutoff=
Common Joined UInteger Var(param_lra_inheritance_ebb_probability_cutoff) Init(40) IntegerRange(0, 100) Param Optimization
Minimal fall-through edge probability in percentage used to add BB to inheritance EBB in LRA.
-param=lra-max-considered-reload-pseudos=
Common Joined UInteger Var(param_lra_max_considered_reload_pseudos) Init(500) Param Optimization
The max number of reload pseudos which are considered during spilling a non-reload pseudo.
-param=lto-max-partition=
Common Joined UInteger Var(param_max_partition_size) Init(1000000) Param
Maximal size of a partition for LTO (in estimated instructions).
-param=lto-max-streaming-parallelism=
Common Joined UInteger Var(param_max_lto_streaming_parallelism) Init(32) IntegerRange(1, 65536) Param
Maximal number of LTO partitions streamed in parallel.
-param=lto-min-partition=
Common Joined UInteger Var(param_min_partition_size) Init(10000) Param
Minimal size of a partition for LTO (in estimated instructions).
-param=lto-partitions=
Common Joined UInteger Var(param_lto_partitions) Init(128) IntegerRange(1, 65536) Param
Number of partitions the program should be split into.
-param=max-average-unrolled-insns=
Common Joined UInteger Var(param_max_average_unrolled_insns) Init(80) Param Optimization
The maximum number of instructions to consider to unroll in a loop on average.
-param=max-combine-insns=
Common Joined UInteger Var(param_max_combine_insns) Init(4) IntegerRange(2, 4) Param Optimization
The maximum number of insns combine tries to combine.
-param=max-completely-peel-loop-nest-depth=
Common Joined UInteger Var(param_max_unroll_iterations) Init(8) Param Optimization
The maximum depth of a loop nest we completely peel.
-param=max-completely-peel-times=
Common Joined UInteger Var(param_max_completely_peel_times) Init(16) Param Optimization
The maximum number of peelings of a single loop that is peeled completely.
-param=max-completely-peeled-insns=
Common Joined UInteger Var(param_max_completely_peeled_insns) Init(200) Param Optimization
The maximum number of insns of a completely peeled loop.
-param=max-crossjump-edges=
Common Joined UInteger Var(param_max_crossjump_edges) Init(100) Param Optimization
The maximum number of incoming edges to consider for crossjumping.
-param=max-cse-insns=
Common Joined UInteger Var(param_max_cse_insns) Init(1000) Param Optimization
The maximum number of instructions CSE processes before flushing.
-param=max-cse-path-length=
Common Joined UInteger Var(param_max_cse_path_length) Init(10) IntegerRange(1, 65536) Param Optimization
The maximum length of a path considered in CSE.
-param=max-cselib-memory-locations=
Common Joined UInteger Var(param_max_cselib_memory_locations) Init(500) Param Optimization
The maximum number of memory locations recorded by cselib.
-param=max-debug-marker-count=
Common Joined UInteger Var(param_max_debug_marker_count) Init(100000) Param Optimization
Max. count of debug markers to expand or inline.
-param=max-delay-slot-insn-search=
Common Joined UInteger Var(param_max_delay_slot_insn_search) Init(100) Param Optimization
The maximum number of instructions to consider to fill a delay slot.
-param=max-delay-slot-live-search=
Common Joined UInteger Var(param_max_delay_slot_live_search) Init(333) Param Optimization
The maximum number of instructions to consider to find accurate live register information.
-param=max-dse-active-local-stores=
Common Joined UInteger Var(param_max_dse_active_local_stores) Init(5000) Param Optimization
Maximum number of active local stores in RTL dead store elimination.
-param=max-early-inliner-iterations=
Common Joined UInteger Var(param_early_inliner_max_iterations) Init(1) Param Optimization
The maximum number of nested indirect inlining performed by early inliner.
-param=max-fields-for-field-sensitive=
Common Joined UInteger Var(param_max_fields_for_field_sensitive) Param
Maximum number of fields in a structure before pointer analysis treats the structure as a single variable.
-param=max-fsm-thread-path-insns=
Common Joined UInteger Var(param_max_fsm_thread_path_insns) Init(100) IntegerRange(1, 999999) Param Optimization
Maximum number of instructions to copy when duplicating blocks on a finite state automaton jump thread path.
-param=max-gcse-insertion-ratio=
Common Joined UInteger Var(param_max_gcse_insertion_ratio) Init(20) Param Optimization
The maximum ratio of insertions to deletions of expressions in GCSE.
-param=max-gcse-memory=
Common Joined UInteger Var(param_max_gcse_memory) Init(131072) Param Optimization
The maximum amount of memory to be allocated by GCSE, in kilobytes.
-param=max-goto-duplication-insns=
Common Joined UInteger Var(param_max_goto_duplication_insns) Init(8) Param Optimization
The maximum number of insns to duplicate when unfactoring computed gotos.
-param=max-grow-copy-bb-insns=
Common Joined UInteger Var(param_max_grow_copy_bb_insns) Init(8) Param Optimization
The maximum expansion factor when copying basic blocks.
-param=max-hoist-depth=
Common Joined UInteger Var(param_max_hoist_depth) Init(30) Param Optimization
Maximum depth of search in the dominator tree for expressions to hoist.
-param=max-inline-functions-called-once-loop-depth=
Common Joined UInteger Var(param_inline_functions_called_once_loop_depth) Init(6) Optimization Param
Maximum loop depth of a call which is considered for inlining functions called once.
-param=max-inline-functions-called-once-insns=
Common Joined UInteger Var(param_inline_functions_called_once_insns) Init(4000) Optimization Param
Maximum combined size of caller and callee which is inlined if callee is called once.
-param=max-inline-insns-auto=
Common Joined UInteger Var(param_max_inline_insns_auto) Init(15) Optimization Param
The maximum number of instructions when automatically inlining.
-param=max-inline-insns-recursive=
Common Joined UInteger Var(param_max_inline_insns_recursive) Optimization Init(450) Param
The maximum number of instructions an inline function can grow to via recursive inlining.
-param=max-inline-insns-recursive-auto=
Common Joined UInteger Var(param_max_inline_insns_recursive_auto) Optimization Init(450) Param
The maximum number of instructions a non-inline function can grow to via recursive inlining.
-param=max-inline-insns-single=
Common Joined UInteger Var(param_max_inline_insns_single) Optimization Init(70) Param
The maximum number of instructions in a single function eligible for inlining.
-param=max-inline-insns-size=
Common Joined UInteger Var(param_max_inline_insns_size) Optimization Param
The maximum number of instructions when inlining for size.
-param=max-inline-insns-small=
Common Joined UInteger Var(param_max_inline_insns_small) Optimization Param
The maximum number of instructions when automatically inlining small functions.
-param=max-inline-recursive-depth=
Common Joined UInteger Var(param_max_inline_recursive_depth) Optimization Init(8) Param
The maximum depth of recursive inlining for inline functions.
-param=max-inline-recursive-depth-auto=
Common Joined UInteger Var(param_max_inline_recursive_depth_auto) Optimization Init(8) Param
The maximum depth of recursive inlining for non-inline functions.
-param=max-isl-operations=
Common Joined UInteger Var(param_max_isl_operations) Init(350000) Param Optimization
Maximum number of isl operations, 0 means unlimited.
-param=max-iterations-computation-cost=
Common Joined UInteger Var(param_max_iterations_computation_cost) Init(10) Param Optimization
Bound on the cost of an expression to compute the number of iterations.
-param=max-iterations-to-track=
Common Joined UInteger Var(param_max_iterations_to_track) Init(1000) Param Optimization
Bound on the number of iterations the brute-force number-of-iterations analysis algorithm evaluates.
-param=max-jump-thread-duplication-stmts=
Common Joined UInteger Var(param_max_jump_thread_duplication_stmts) Init(15) Param Optimization
Maximum number of statements allowed in a block that needs to be duplicated when threading jumps.
-param=max-jump-thread-paths=
Common Joined UInteger Var(param_max_jump_thread_paths) Init(64) IntegerRange(1, 65536) Param Optimization
Search space limit for the backwards jump threader.
-param=max-last-value-rtl=
Common Joined UInteger Var(param_max_last_value_rtl) Init(10000) Param Optimization
The maximum number of RTL nodes that can be recorded as combiner's last value.
-param=max-loop-header-insns=
Common Joined UInteger Var(param_max_loop_header_insns) Init(20) Param Optimization
The maximum number of insns in loop header duplicated by the copy loop headers pass.
-param=max-modulo-backtrack-attempts=
Common Joined UInteger Var(param_max_modulo_backtrack_attempts) Init(40) Param Optimization
The maximum number of backtrack attempts the scheduler should make when modulo scheduling a loop.
-param=min-pagesize=
Common Joined UInteger Var(param_min_pagesize) Init(4096) Param Optimization
Minimum page size for warning purposes.
-param=max-partial-antic-length=
Common Joined UInteger Var(param_max_partial_antic_length) Init(100) Param Optimization
Maximum length of partial antic set when performing tree PRE optimization.
-param=max-peel-branches=
Common Joined UInteger Var(param_max_peel_branches) Init(32) Param Optimization
The maximum number of branches on the path through the peeled sequence.
-param=max-peel-times=
Common Joined UInteger Var(param_max_peel_times) Init(16) Param Optimization
The maximum number of peelings of a single loop.
-param=max-peeled-insns=
Common Joined UInteger Var(param_max_peeled_insns) Init(100) Param Optimization
The maximum number of insns of a peeled loop.
-param=max-pending-list-length=
Common Joined UInteger Var(param_max_pending_list_length) Init(32) Param Optimization
The maximum length of scheduling's pending operations list.
-param=max-pipeline-region-blocks=
Common Joined UInteger Var(param_max_pipeline_region_blocks) Init(15) Param Optimization
The maximum number of blocks in a region to be considered for interblock scheduling.
-param=max-pipeline-region-insns=
Common Joined UInteger Var(param_max_pipeline_region_insns) Init(200) Param Optimization
The maximum number of insns in a region to be considered for interblock scheduling.
-param=max-pow-sqrt-depth=
Common Joined UInteger Var(param_max_pow_sqrt_depth) Init(5) IntegerRange(1, 32) Param Optimization
Maximum depth of sqrt chains to use when synthesizing exponentiation by a real constant.
-param=max-predicted-iterations=
Common Joined UInteger Var(param_max_predicted_iterations) Init(100) IntegerRange(0, 65536) Param Optimization
The maximum number of loop iterations we predict statically.
-param=max-reload-search-insns=
Common Joined UInteger Var(param_max_reload_search_insns) Init(100) Param Optimization
The maximum number of instructions to search backward when looking for an equivalent reload.
-param=max-rtl-if-conversion-insns=
Common Joined UInteger Var(param_max_rtl_if_conversion_insns) Init(10) IntegerRange(0, 99) Param Optimization
Maximum number of insns in a basic block to consider for RTL if-conversion.
-param=max-rtl-if-conversion-predictable-cost=
Common Joined UInteger Var(param_max_rtl_if_conversion_predictable_cost) Init(20) IntegerRange(0, 200) Param Optimization
Maximum permissible cost for the sequence that would be generated by the RTL if-conversion pass for a branch that is considered predictable.
-param=max-rtl-if-conversion-unpredictable-cost=
Common Joined UInteger Var(param_max_rtl_if_conversion_unpredictable_cost) Init(40) IntegerRange(0, 200) Param Optimization
Maximum permissible cost for the sequence that would be generated by the RTL if-conversion pass for a branch that is considered unpredictable.
-param=max-sched-extend-regions-iters=
Common Joined UInteger Var(param_max_sched_extend_regions_iters) Param Optimization
The maximum number of iterations through CFG to extend regions.
-param=max-sched-insn-conflict-delay=
Common Joined UInteger Var(param_max_sched_insn_conflict_delay) Init(3) IntegerRange(1, 10) Param Optimization
The maximum conflict delay for an insn to be considered for speculative motion.
-param=max-sched-ready-insns=
Common Joined UInteger Var(param_max_sched_ready_insns) Init(100) IntegerRange(1, 65536) Param Optimization
The maximum number of instructions ready to be issued to be considered by the scheduler during the first scheduling pass.
-param=max-sched-region-blocks=
Common Joined UInteger Var(param_max_sched_region_blocks) Init(10) Param Optimization
The maximum number of blocks in a region to be considered for interblock scheduling.
-param=max-sched-region-insns=
Common Joined UInteger Var(param_max_sched_region_insns) Init(100) Param Optimization
The maximum number of insns in a region to be considered for interblock scheduling.
-param=max-slsr-cand-scan=
Common Joined UInteger Var(param_max_slsr_candidate_scan) Init(50) IntegerRange(1, 999999) Param Optimization
Maximum length of candidate scans for straight-line strength reduction.
-param=max-speculative-devirt-maydefs=
Common Joined UInteger Var(param_max_speculative_devirt_maydefs) Init(50) Param Optimization
Maximum number of may-defs visited when devirtualizing speculatively.
-param=max-ssa-name-query-depth=
Common Joined UInteger Var(param_max_ssa_name_query_depth) Init(3) IntegerRange(1, 10) Param
Maximum recursion depth allowed when querying a property of an SSA name.
-param=max-stores-to-merge=
Common Joined UInteger Var(param_max_stores_to_merge) Init(64) IntegerRange(2, 65536) Param Optimization
Maximum number of constant stores to merge in the store merging pass.
-param=max-stores-to-sink=
Common Joined UInteger Var(param_max_stores_to_sink) Init(2) Param Optimization
Maximum number of conditional store pairs that can be sunk.
-param=max-store-chains-to-track=
Common Joined UInteger Var(param_max_store_chains_to_track) Init(64) IntegerRange(1, 65536) Param
Maximum number of store chains to track at the same time in the store merging pass.
-param=max-stores-to-track=
Common Joined UInteger Var(param_max_stores_to_track) Init(1024) IntegerRange(2, 1048576) Param
Maximum number of stores to track at the same time in the store merging pass.
-param=max-tail-merge-comparisons=
Common Joined UInteger Var(param_max_tail_merge_comparisons) Init(10) Param Optimization
Maximum number of similar basic blocks to compare a basic block with.
-param=max-tail-merge-iterations=
Common Joined UInteger Var(param_max_tail_merge_iterations) Init(2) Param Optimization
Maximum number of iterations of the pass over a function.
-param=max-tracked-strlens=
Common Joined UInteger Var(param_max_tracked_strlens) Init(10000) Param Optimization
Maximum number of strings for which strlen optimization pass will track string lengths.
-param=max-tree-if-conversion-phi-args=
Common Joined UInteger Var(param_max_tree_if_conversion_phi_args) Init(4) IntegerRange(2, 65536) Param Optimization
Maximum number of arguments in a PHI supported by TREE if-conversion unless the loop is marked with simd pragma.
-param=max-unroll-times=
Common Joined UInteger Var(param_max_unroll_times) Init(8) Param Optimization
The maximum number of unrollings of a single loop.
-param=max-unrolled-insns=
Common Joined UInteger Var(param_max_unrolled_insns) Init(200) Param Optimization
The maximum number of instructions to consider to unroll in a loop.
-param=max-unswitch-insns=
Common Joined UInteger Var(param_max_unswitch_insns) Init(50) Param Optimization
The maximum number of insns of an unswitched loop.
-param=max-variable-expansions-in-unroller=
Common Joined UInteger Var(param_max_variable_expansions) Init(1) Param Optimization
If -fvariable-expansion-in-unroller is used, the maximum number of times that an individual variable will be expanded during loop unrolling.
-param=max-vartrack-expr-depth=
Common Joined UInteger Var(param_max_vartrack_expr_depth) Init(12) Param Optimization
Max. recursion depth for expanding var tracking expressions.
-param=max-vartrack-reverse-op-size=
Common Joined UInteger Var(param_max_vartrack_reverse_op_size) Init(50) Param Optimization
Max. size of loc list for which reverse ops should be added.
-param=max-vartrack-size=
Common Joined UInteger Var(param_max_vartrack_size) Init(50000000) Param Optimization
Max. size of var tracking hash tables.
-param=max-find-base-term-values=
Common Joined UInteger Var(param_max_find_base_term_values) Init(200) Param Optimization
Maximum number of VALUEs handled during a single find_base_term call.
-param=max-vrp-switch-assertions=
Common Joined UInteger Var(param_max_vrp_switch_assertions) Init(10) Param Optimization
Maximum number of assertions to add along the default edge of a switch statement during VRP.
-param=min-crossjump-insns=
Common Joined UInteger Var(param_min_crossjump_insns) Init(5) IntegerRange(1, 65536) Param Optimization
The minimum number of matching instructions to consider for crossjumping.
-param=min-inline-recursive-probability=
Common Joined UInteger Var(param_min_inline_recursive_probability) Init(10) Optimization Param
Inline recursively only when the probability of the call being executed exceeds the parameter.
-param=min-insn-to-prefetch-ratio=
Common Joined UInteger Var(param_min_insn_to_prefetch_ratio) Init(9) Param Optimization
Min. ratio of insns to prefetches to enable prefetching for a loop with an unknown trip count.
-param=min-loop-cond-split-prob=
Common Joined UInteger Var(param_min_loop_cond_split_prob) Init(30) IntegerRange(0, 100) Param Optimization
The minimum threshold for probability of semi-invariant condition statement to trigger loop split.
-param=min-nondebug-insn-uid=
Common Joined UInteger Var(param_min_nondebug_insn_uid) Param
The minimum UID to be used for a nondebug insn.
-param=min-size-for-stack-sharing=
Common Joined UInteger Var(param_min_size_for_stack_sharing) Init(32) Param Optimization
The minimum size of variables taking part in stack slot sharing when not optimizing.
-param=min-spec-prob=
Common Joined UInteger Var(param_min_spec_prob) Init(40) Param Optimization
The minimum probability of reaching a source block for interblock speculative scheduling.
-param=min-vect-loop-bound=
Common Joined UInteger Var(param_min_vect_loop_bound) Param Optimization
If -ftree-vectorize is used, the minimal loop bound of a loop to be considered for vectorization.
-param=openacc-kernels=
Common Joined Enum(openacc_kernels) Var(param_openacc_kernels) Init(OPENACC_KERNELS_PARLOOPS) Param
--param=openacc-kernels=[decompose|parloops] Specify mode of OpenACC 'kernels' constructs handling.
Enum
Name(openacc_kernels) Type(enum openacc_kernels)
EnumValue
Enum(openacc_kernels) String(decompose) Value(OPENACC_KERNELS_DECOMPOSE)
EnumValue
Enum(openacc_kernels) String(parloops) Value(OPENACC_KERNELS_PARLOOPS)
2021-05-20 16:11:37 +02:00
-param=openacc-privatization=
Common Joined Enum(openacc_privatization) Var(param_openacc_privatization) Init(OPENACC_PRIVATIZATION_QUIET) Param
--param=openacc-privatization=[quiet|noisy] Specify mode of OpenACC privatization diagnostics.
Enum
Name(openacc_privatization) Type(enum openacc_privatization)
EnumValue
Enum(openacc_privatization) String(quiet) Value(OPENACC_PRIVATIZATION_QUIET)
EnumValue
Enum(openacc_privatization) String(noisy) Value(OPENACC_PRIVATIZATION_NOISY)
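; Usage sketch (illustrative only; the source file name foo.c is hypothetical):
;   gcc -fopenacc --param=openacc-privatization=noisy foo.c
; "quiet" is the default per Init() above; "noisy" selects the other diagnostics mode.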
-param=parloops-chunk-size=
Common Joined UInteger Var(param_parloops_chunk_size) Param Optimization
Chunk size of omp schedule for loops parallelized by parloops.
-param=parloops-min-per-thread=
Common Joined UInteger Var(param_parloops_min_per_thread) Init(100) IntegerRange(2, 65536) Param Optimization
Minimum number of iterations per thread of an innermost parallelized loop.
-param=parloops-schedule=
Common Joined Var(param_parloops_schedule) Enum(parloops_schedule_type) Param Optimization
--param=parloops-schedule=[static|dynamic|guided|auto|runtime] Schedule type of omp schedule for loops parallelized by parloops.
Enum
Name(parloops_schedule_type) Type(int)
EnumValue
Enum(parloops_schedule_type) String(static) Value(PARLOOPS_SCHEDULE_STATIC)
EnumValue
Enum(parloops_schedule_type) String(dynamic) Value(PARLOOPS_SCHEDULE_DYNAMIC)
EnumValue
Enum(parloops_schedule_type) String(guided) Value(PARLOOPS_SCHEDULE_GUIDED)
EnumValue
Enum(parloops_schedule_type) String(auto) Value(PARLOOPS_SCHEDULE_AUTO)
EnumValue
Enum(parloops_schedule_type) String(runtime) Value(PARLOOPS_SCHEDULE_RUNTIME)
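; Usage sketch (illustrative only; the thread count, chunk size and file name are arbitrary):
;   gcc -O2 -ftree-parallelize-loops=4 --param=parloops-schedule=dynamic --param=parloops-chunk-size=64 foo.c
; The parloops params are expected to matter only when loop auto-parallelization is enabled.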
-param=partial-inlining-entry-probability=
Common Joined UInteger Var(param_partial_inlining_entry_probability) Init(70) Optimization IntegerRange(0, 100) Param
Maximum probability of the entry BB of split region (in percent relative to entry BB of the function) to make partial inlining happen.
-param=predictable-branch-outcome=
Common Joined UInteger Var(param_predictable_branch_outcome) Init(2) IntegerRange(0, 50) Param Optimization
Maximal estimated outcome of branch considered predictable.
-param=prefetch-dynamic-strides=
Common Joined UInteger Var(param_prefetch_dynamic_strides) Init(1) IntegerRange(0, 1) Param Optimization
Whether software prefetch hints should be issued for non-constant strides.
-param=prefetch-latency=
Common Joined UInteger Var(param_prefetch_latency) Init(200) Param Optimization
The number of insns executed before prefetch is completed.
-param=prefetch-min-insn-to-mem-ratio=
Common Joined UInteger Var(param_prefetch_min_insn_to_mem_ratio) Init(3) Param Optimization
Min. ratio of insns to mem ops to enable prefetching in a loop.
-param=prefetch-minimum-stride=
Common Joined UInteger Var(param_prefetch_minimum_stride) Init(-1) Param Optimization
The minimum constant stride beyond which prefetch hints should be used.
-param=profile-func-internal-id=
Common Joined UInteger Var(param_profile_func_internal_id) IntegerRange(0, 1) Param
Use internal function id in profile lookup.
-param=ranger-debug=
Common Joined Var(param_ranger_debug) Enum(ranger_debug) Init(RANGER_DEBUG_NONE) Param Optimization
--param=ranger-debug=[none|trace|gori|cache|tracegori|all] Specifies the output mode for debugging ranger.
Enum
Name(ranger_debug) Type(enum ranger_debug) UnknownError(unknown ranger debug mode %qs)
EnumValue
Enum(ranger_debug) String(none) Value(RANGER_DEBUG_NONE)
EnumValue
Enum(ranger_debug) String(trace) Value(RANGER_DEBUG_TRACE)
EnumValue
Enum(ranger_debug) String(cache) Value(RANGER_DEBUG_TRACE_CACHE)
EnumValue
Enum(ranger_debug) String(gori) Value(RANGER_DEBUG_GORI)
EnumValue
Enum(ranger_debug) String(tracegori) Value(RANGER_DEBUG_TRACE_GORI)
EnumValue
Enum(ranger_debug) String(all) Value(RANGER_DEBUG_ALL)
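; Usage sketch (illustrative only; it assumes the debug output lands in the pass dump files,
; which should be checked against the ranger sources; foo.c is hypothetical):
;   gcc -O2 --param=ranger-debug=trace -fdump-tree-all-details foo.c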
-param=ranger-logical-depth=
Common Joined UInteger Var(param_ranger_logical_depth) Init(6) IntegerRange(1, 999) Param Optimization
Maximum depth of logical expression evaluation ranger will look through when
evaluating outgoing edge ranges.
-param=relation-block-limit=
Common Joined UInteger Var(param_relation_block_limit) Init(200) IntegerRange(0, 9999) Param Optimization
Maximum number of relations the oracle will register in a basic block.
-param=rpo-vn-max-loop-depth=
Common Joined UInteger Var(param_rpo_vn_max_loop_depth) Init(7) IntegerRange(2, 65536) Param Optimization
Maximum depth of a loop nest to fully value-number optimistically.
-param=sccvn-max-alias-queries-per-access=
Common Joined UInteger Var(param_sccvn_max_alias_queries_per_access) Init(1000) Param Optimization
Maximum number of disambiguations to perform per memory access.
-param=scev-max-expr-complexity=
Common Joined UInteger Var(param_scev_max_expr_complexity) Init(10) Param Optimization
Bound on the complexity of the expressions in the scalar evolutions analyzer.
-param=scev-max-expr-size=
Common Joined UInteger Var(param_scev_max_expr_size) Init(100) Param Optimization
Bound on size of expressions used in the scalar evolutions analyzer.
-param=sched-autopref-queue-depth=
Common Joined UInteger Var(param_sched_autopref_queue_depth) Init(-1) Param Optimization
Hardware autoprefetcher scheduler model control flag. Number of lookahead cycles the model looks into; at '0' only the instruction sorting heuristic is enabled. Disabled by default.
-param=sched-mem-true-dep-cost=
Common Joined UInteger Var(param_sched_mem_true_dep_cost) Init(1) Param Optimization
Minimal distance between possibly conflicting store and load.
-param=sched-pressure-algorithm=
Common Joined UInteger Var(param_sched_pressure_algorithm) Init(1) IntegerRange(1, 2) Param Optimization
Which -fsched-pressure algorithm to apply.
-param=sched-spec-prob-cutoff=
Common Joined UInteger Var(param_sched_spec_prob_cutoff) Init(40) IntegerRange(0, 100) Param Optimization
The minimal probability of speculation success (in percent) for a speculative insn to be scheduled.
-param=sched-state-edge-prob-cutoff=
Common Joined UInteger Var(param_sched_state_edge_prob_cutoff) Init(10) IntegerRange(0, 100) Param Optimization
The minimum probability an edge must have for the scheduler to save its state across it.
-param=selsched-insns-to-rename=
Common Joined UInteger Var(param_selsched_insns_to_rename) Init(2) Param Optimization
Maximum number of instructions in the ready list that are considered eligible for renaming.
-param=selsched-max-lookahead=
Common Joined UInteger Var(param_selsched_max_lookahead) Init(50) Param Optimization
The maximum size of the lookahead window of selective scheduling.
-param=selsched-max-sched-times=
Common Joined UInteger Var(param_selsched_max_sched_times) Init(2) IntegerRange(1, 65536) Param Optimization
Maximum number of times that an insn could be scheduled.
-param=simultaneous-prefetches=
Common Joined UInteger Var(param_simultaneous_prefetches) Init(3) Param Optimization
The number of prefetches that can run at the same time.
-param=sink-frequency-threshold=
Common Joined UInteger Var(param_sink_frequency_threshold) Init(75) IntegerRange(0, 100) Param Optimization
Target block's relative execution frequency (as a percentage) required to sink a statement.
-param=sms-dfa-history=
Common Joined UInteger Var(param_sms_dfa_history) IntegerRange(0, 16) Param Optimization
The number of cycles the swing modulo scheduler considers when checking conflicts using DFA.
-param=sms-loop-average-count-threshold=
Common Joined UInteger Var(param_sms_loop_average_count_threshold) Param Optimization
A threshold on the average loop count considered by the swing modulo scheduler.
-param=sms-max-ii-factor=
Common Joined UInteger Var(param_sms_max_ii_factor) Init(2) IntegerRange(1, 16) Param Optimization
A factor for tuning the upper bound that swing modulo scheduler uses for scheduling a loop.
-param=sms-min-sc=
Common Joined UInteger Var(param_sms_min_sc) Init(2) IntegerRange(1, 2) Param Optimization
The minimum value of stage count that swing modulo scheduler will generate.
-param=sra-max-scalarization-size-Osize=
Common Joined UInteger Var(param_sra_max_scalarization_size_size) Param Optimization
Maximum size, in storage units, of an aggregate which should be considered for scalarization when compiling for size.
-param=sra-max-scalarization-size-Ospeed=
Common Joined UInteger Var(param_sra_max_scalarization_size_speed) Param Optimization
Maximum size, in storage units, of an aggregate which should be considered for scalarization when compiling for speed.
-param=sra-max-propagations=
Common Joined UInteger Var(param_sra_max_propagations) Param Optimization Init(32)
Maximum number of artificial accesses to enable forward propagation that Scalar Replacement of Aggregates will keep for one local variable.
-param=ssa-name-def-chain-limit=
Common Joined UInteger Var(param_ssa_name_def_chain_limit) Init(512) Param Optimization
The maximum number of SSA_NAME assignments to follow in determining a value.
-param=ssp-buffer-size=
Common Joined UInteger Var(param_ssp_buffer_size) Init(8) IntegerRange(1, 65536) Param Optimization
The lower bound for a buffer to be considered for stack smashing protection.
-param=stack-clash-protection-guard-size=
Common Joined UInteger Var(param_stack_clash_protection_guard_size) Init(12) IntegerRange(12, 30) Param Optimization
Size of the stack guard expressed as a power of two in bytes.
-param=stack-clash-protection-probe-interval=
Common Joined UInteger Var(param_stack_clash_protection_probe_interval) Init(12) IntegerRange(10, 16) Param Optimization
Interval in which to probe the stack expressed as a power of two in bytes.
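; Worked example: both stack-clash values are base-2 exponents, so the defaults of 12
; mean 2^12 = 4096 bytes, and a guard size of 16 means 2^16 = 65536 bytes.
; Usage sketch (illustrative only; foo.c is hypothetical, and a target may restrict
; which values it accepts):
;   gcc -O2 -fstack-clash-protection --param=stack-clash-protection-guard-size=16 foo.c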
-param=store-merging-allow-unaligned=
Common Joined UInteger Var(param_store_merging_allow_unaligned) Init(1) IntegerRange(0, 1) Param Optimization
Allow the store merging pass to introduce unaligned stores if it is legal to do so.
-param=store-merging-max-size=
Common Joined UInteger Var(param_store_merging_max_size) Init(65536) IntegerRange(1, 65536) Param Optimization
Maximum size of a single store merging region in bytes.
-param=switch-conversion-max-branch-ratio=
Common Joined UInteger Var(param_switch_conversion_branch_ratio) Init(8) IntegerRange(1, 65536) Param Optimization
The maximum ratio between array size and switch branches for a switch conversion to take place.
-param=modref-max-bases=
Common Joined UInteger Var(param_modref_max_bases) Init(32) Param Optimization
Maximum number of bases stored in each modref tree.
-param=modref-max-refs=
Common Joined UInteger Var(param_modref_max_refs) Init(16) Param Optimization
Maximum number of references stored in each modref base.
-param=modref-max-accesses=
Common Joined UInteger Var(param_modref_max_accesses) Init(16) Param Optimization
Maximum number of accesses stored in each modref reference.
-param=modref-max-tests=
Common Joined UInteger Var(param_modref_max_tests) Init(64) Param Optimization
Maximum number of tests performed by modref query.
-param=modref-max-depth=
Common Joined UInteger Var(param_modref_max_depth) Init(256) IntegerRange(1, 65536) Param Optimization
Maximum depth of DFS walk used by modref escape analysis.
-param=modref-max-escape-points=
Common Joined UInteger Var(param_modref_max_escape_points) Init(256) Param Optimization
Maximum number of escape points tracked by modref per SSA-name.
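The cap above can be pictured with a minimal sketch. It is not the ipa-modref implementation: the `Lattice` class, its flag bits, and the policy of dropping to the conservative state when the cap is hit are assumptions made for illustration. Only the default of 256 comes from the `Init(256)` property.

```python
from dataclasses import dataclass, field

MAX_ESCAPE_POINTS = 256  # mirrors Init(256) of modref-max-escape-points


@dataclass
class Lattice:
    """Hypothetical stand-in for the per-SSA-name escape tracking state."""
    flags: int = 0b111  # illustrative flag bits; 0 means "nothing known"
    escape_points: list = field(default_factory=list)

    def add_escape_point(self, call_id, arg_index, limit=MAX_ESCAPE_POINTS):
        """Record an escape; return False once tracking has been given up."""
        if self.flags == 0:
            return False
        point = (call_id, arg_index)
        if point in self.escape_points:
            return True
        if len(self.escape_points) >= limit:
            # Assumed policy: too many escape points, fall back to the
            # conservative state and stop tracking this SSA name.
            self.flags = 0
            self.escape_points.clear()
            return False
        self.escape_points.append(point)
        return True
```

A lower limit bounds the per-name bookkeeping, at the cost of giving up on names that escape at many call sites.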
Merge load/stores in ipa-modref summaries this patch adds logic needed to merge neighbouring accesses in ipa-modref summaries. This helps analyzing array initializers and similar code. It is bit of work, since it breaks the fact that modref tree makes a good lattice for dataflow: the access ranges can be extended indefinitely. For this reason I added counter tracking number of adjustments and a cap to limit them during the dataflow. gcc/ChangeLog: * doc/invoke.texi: Document --param modref-max-adjustments. * ipa-modref-tree.c (test_insert_search_collapse): Update. (test_merge): Update. * ipa-modref-tree.h (struct modref_access_node): Add adjustments; (modref_access_node::operator==): Fix handling of access ranges. (modref_access_node::contains): Constify parameter; handle also mismatched parm offsets. (modref_access_node::update): New function. (modref_access_node::merge): New function. (unspecified_modref_access_node): Update constructor. (modref_ref_node::insert_access): Add record_adjustments parameter; handle merging. (modref_ref_node::try_merge_with): New private function. (modref_tree::insert): New record_adjustments parameter. (modref_tree::merge): New record_adjustments parameter. (modref_tree::copy_from): Update. * ipa-modref.c (dump_access): Dump adjustments field. (get_access): Update constructor. (record_access): Update call of insert. (record_access_lto): Update call of insert. (merge_call_side_effects): Add record_adjustments parameter. (get_access_for_fnspec): Update. (process_fnspec): Update. (analyze_call): Update. (analyze_function): Update. (read_modref_records): Update. (ipa_merge_modref_summary_after_inlining): Update. (propagate_unknown_call): Update. (modref_propagate_in_scc): Update. * params.opt (param-max-modref-adjustments=): New. gcc/testsuite/ChangeLog: * gcc.dg/ipa/modref-1.c: Update testcase. * gcc.dg/tree-ssa/modref-4.c: Update testcase. * gcc.dg/tree-ssa/modref-8.c: New test.
2021-08-25 21:43:07 +02:00
-param=modref-max-adjustments=
Common Joined UInteger Var(param_modref_max_adjustments) Init(8) IntegerRange(0, 254) Param Optimization
Maximum number of times a given range is adjusted during the dataflow.
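To illustrate why an adjustment cap is needed, here is a small sketch of range merging with a counter. It is a simplification: `AccessRange`, its half-open `[start, end)` representation, and the "widen to unbounded" fallback are hypothetical and not taken from `ipa-modref-tree.h`. Only the default of 8 comes from `Init(8)`.

```python
MAX_ADJUSTMENTS = 8  # mirrors Init(8) of modref-max-adjustments


class AccessRange:
    """Hypothetical access range [start, end) with an adjustment counter."""

    def __init__(self, start, end):
        self.start = start
        self.end = end
        self.unbounded = False  # True once we stop tracking exact bounds
        self.adjustments = 0

    def merge(self, start, end, limit=MAX_ADJUSTMENTS):
        """Absorb [start, end) if it touches this range; return True if absorbed."""
        if self.unbounded:
            return True
        if end < self.start or start > self.end:
            return False  # disjoint and not adjacent: keep as a separate access
        if start >= self.start and end <= self.end:
            return True  # already covered, nothing to adjust
        self.adjustments += 1
        if self.adjustments > limit:
            # Assumed fallback: without a cap the range could keep growing
            # on every dataflow iteration, so give up on exact bounds.
            self.unbounded = True
        else:
            self.start = min(self.start, start)
            self.end = max(self.end, end)
        return True
```

With a limit of 2, merging three consecutive 4-byte stores into `AccessRange(0, 4)` extends the range twice and then widens it to unbounded on the third adjustment.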
-param=threader-debug=
Common Joined Var(param_threader_debug) Enum(threader_debug) Init(THREADER_DEBUG_NONE) Param Optimization
--param=threader-debug=[none|all] Enables verbose dumping of the threader solver.
Enum
Name(threader_debug) Type(enum threader_debug) UnknownError(unknown threader debug mode %qs)
EnumValue
Enum(threader_debug) String(none) Value(THREADER_DEBUG_NONE)
EnumValue
Enum(threader_debug) String(all) Value(THREADER_DEBUG_ALL)
-param=tm-max-aggregate-size=
Common Joined UInteger Var(param_tm_max_aggregate_size) Init(9) Param Optimization
Size in bytes after which thread-local aggregates should be instrumented with the logging functions instead of save/restore pairs.
-param=tracer-dynamic-coverage=
Common Joined UInteger Var(param_tracer_dynamic_coverage) Init(75) IntegerRange(0, 100) Param Optimization
The percentage of the function, weighted by execution frequency, that must be covered by trace formation. Used when profile feedback is not available.
-param=tracer-dynamic-coverage-feedback=
Common Joined UInteger Var(param_tracer_dynamic_coverage_feedback) Init(95) IntegerRange(0, 100) Param Optimization
The percentage of the function, weighted by execution frequency, that must be covered by trace formation. Used when profile feedback is available.
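The two coverage parameters can be read as a stopping rule. The sketch below is a greedy simplification and not the tracer pass itself: the function name, the dictionary of block weights, and the hottest-first ordering are assumptions. Only the defaults 75 and 95 come from the `Init` properties.

```python
def select_trace_seeds(block_weights, has_profile_feedback,
                       coverage=75, coverage_feedback=95):
    """Pick hottest blocks until the weighted coverage target is met.

    block_weights maps a (hypothetical) block name to its execution weight.
    """
    target_percent = coverage_feedback if has_profile_feedback else coverage
    total = sum(block_weights.values())
    covered = 0
    chosen = []
    hottest_first = sorted(block_weights.items(),
                           key=lambda item: item[1], reverse=True)
    for name, weight in hottest_first:
        if covered * 100 >= total * target_percent:
            break  # enough of the function is covered
        chosen.append(name)
        covered += weight
    return chosen
```

The higher target with profile feedback reflects that measured frequencies are more trustworthy than estimated ones, so more of the function is worth covering.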
-param=tracer-max-code-growth=
Common Joined UInteger Var(param_tracer_max_code_growth) Init(100) Param Optimization
Maximal code growth caused by tail duplication (in percent).
-param=tracer-min-branch-probability=
Common Joined UInteger Var(param_tracer_min_branch_probability) Init(50) IntegerRange(0, 100) Param Optimization
Stop forward growth if the probability of best edge is less than this threshold (in percent). Used when profile feedback is not available.
-param=tracer-min-branch-probability-feedback=
Common Joined UInteger Var(param_tracer_min_branch_probability_feedback) Init(80) IntegerRange(0, 100) Param Optimization
Stop forward growth if the probability of best edge is less than this threshold (in percent). Used when profile feedback is available.
-param=tracer-min-branch-ratio=
Common Joined UInteger Var(param_tracer_min_branch_ratio) Init(10) IntegerRange(0, 100) Param Optimization
Stop reverse growth if the reverse probability of best edge is less than this threshold (in percent).
-param=tree-reassoc-width=
Common Joined UInteger Var(param_tree_reassoc_width) Param Optimization
Set the maximum number of instructions executed in parallel in reassociated tree. If 0, use the target dependent heuristic.
tsan: Add optional support for distinguishing volatiles Add support to optionally emit different instrumentation for accesses to volatile variables. While the default TSAN runtime likely will never require this feature, other runtimes for different environments that have subtly different memory models or assumptions may require distinguishing volatiles. One such environment are OS kernels, where volatile is still used in various places, and often declare volatile to be appropriate even in multi-threaded contexts. One such example is the Linux kernel, which implements various synchronization primitives using volatile (READ_ONCE(), WRITE_ONCE()). Here the Kernel Concurrency Sanitizer (KCSAN), is a runtime that uses TSAN instrumentation but otherwise implements a very different approach to race detection from TSAN: https://github.com/google/ktsan/wiki/KCSAN Due to recent changes in requirements by the Linux kernel, KCSAN requires that the compiler supports tsan-distinguish-volatile (among several new requirements): https://lore.kernel.org/lkml/20200521142047.169334-7-elver@google.com/ gcc/ * params.opt: Define --param=tsan-distinguish-volatile=[0,1]. * sanitizer.def (BUILT_IN_TSAN_VOLATILE_READ1): Define new builtin for volatile instrumentation of reads/writes. (BUILT_IN_TSAN_VOLATILE_READ2): Likewise. (BUILT_IN_TSAN_VOLATILE_READ4): Likewise. (BUILT_IN_TSAN_VOLATILE_READ8): Likewise. (BUILT_IN_TSAN_VOLATILE_READ16): Likewise. (BUILT_IN_TSAN_VOLATILE_WRITE1): Likewise. (BUILT_IN_TSAN_VOLATILE_WRITE2): Likewise. (BUILT_IN_TSAN_VOLATILE_WRITE4): Likewise. (BUILT_IN_TSAN_VOLATILE_WRITE8): Likewise. (BUILT_IN_TSAN_VOLATILE_WRITE16): Likewise. * tsan.c (get_memory_access_decl): Argument if access is volatile. If param tsan-distinguish-volatile is non-zero, and access if volatile, return volatile instrumentation decl. (instrument_expr): Check if access is volatile. gcc/testsuite/ * c-c++-common/tsan/volatile.c: New test.
2020-06-09 15:15:39 +02:00
-param=tsan-distinguish-volatile=
Common Joined UInteger Var(param_tsan_distinguish_volatile) IntegerRange(0, 1) Param
Emit special instrumentation for accesses to volatiles.
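As a rough sketch of the selection this parameter controls, the function below picks the name of the runtime hook for one access. The helper itself is hypothetical; the `__tsan_volatile_read*`/`__tsan_volatile_write*` names are assumed to correspond to the `BUILT_IN_TSAN_VOLATILE_*` builtins named in the commit annotation above. The parameter has no `Init`, so 0 (off) is taken as the default.

```python
def tsan_access_function(size, is_write, is_volatile, distinguish_volatile=0):
    """Return the (assumed) runtime hook name for one instrumented access."""
    if size not in (1, 2, 4, 8, 16):
        raise ValueError("unsupported access size: %r" % (size,))
    kind = "write" if is_write else "read"
    # Volatile hooks are used only when the parameter is non-zero
    # and the access itself is volatile.
    if distinguish_volatile and is_volatile:
        return "__tsan_volatile_%s%d" % (kind, size)
    return "__tsan_%s%d" % (kind, size)
```

For example, a 4-byte volatile load maps to `__tsan_volatile_read4` with `--param=tsan-distinguish-volatile=1` and to `__tsan_read4` otherwise.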
-param=tsan-instrument-func-entry-exit=
Common Joined UInteger Var(param_tsan_instrument_func_entry_exit) Init(1) IntegerRange(0, 1) Param
Emit instrumentation calls to __tsan_func_entry() and __tsan_func_exit().
-param=uninit-control-dep-attempts=
Common Joined UInteger Var(param_uninit_control_dep_attempts) Init(1000) IntegerRange(1, 65536) Param Optimization
Maximum number of nested calls to search for control dependencies during uninitialized variable analysis.
-param=uninlined-function-insns=
Convert inliner to function specific param infrastructure This patch adds opt_for_fn for all cross module params used by inliner so they can be modified at function granuality. With inlining almost always there are three functions to consider (callee and caller of the inlined edge and the outer function caller is inlined to). I always use the outer function params since that is how local parameters behave. I hope it is kind of what is also expected in most case: it is better to inline agressively into -O3 compiled code rather than inline agressively -O3 functions into their callers. New params infrastructure is nice. One drawback is that is very hard to search for individual param uses since they all occupy global namespace. With C++ world we had chance to do something like params.param_flag_name or params::param_flag_name instead... Bootstrapped/regtested x86_64-linux, comitted. * cif-code.def (MAX_INLINE_INSNS_SINGLE_O2_LIMIT): Remove. * doc/invoke.texi (max-inline-insns-single-O2, inline-heuristics-hint-percent-O2, inline-min-speedup-O2, early-inlining-insns-O2): Remove documentation. * ipa-fnsummary.c (analyze_function_body, compute_fn_summary): Use opt_for_fn when accessing parameters. * ipa-inline.c (caller_growth_limits, can_inline_edge_p, inline_insns_auto, can_inline_edge_by_limits_p, want_early_inline_function_p, big_speedup_p, want_inline_small_function_p, want_inline_self_recursive_call_p, recursive_inlining, compute_max_insns, inline_small_functions): Likewise. * opts.c (default_options): Add -O3 defaults for OPT__param_early_inlining_insns_, OPT__param_inline_heuristics_hint_percent_, OPT__param_inline_min_speedup_, OPT__param_max_inline_insns_single_. 
* params.opt (-param=early-inlining-insns-O2=, -param=inline-heuristics-hint-percent-O2=, -param=inline-min-speedup-O2=, -param=max-inline-insns-single-O2= -param=early-inlining-insns=, -param=inline-heuristics-hint-percent=, -param=inline-min-speedup=, -param=inline-unit-growth=, -param=large-function-growth=, -param=large-stack-frame=, -param=large-stack-frame-growth=, -param=large-unit-insns=, -param=max-inline-insns-recursive=, -param=max-inline-insns-recursive-auto=, -param=max-inline-insns-single=, -param=max-inline-insns-size=, -param=max-inline-insns-small=, -param=max-inline-recursive-depth=, -param=max-inline-recursive-depth-auto=, -param=min-inline-recursive-probability=, -param=partial-inlining-entry-probability=, -param=uninlined-function-insns=, -param=uninlined-function-time=, -param=uninlined-thunk-insns=, -param=uninlined-thunk-time=): Add Optimization. * g++.dg/tree-ssa/pr53844.C: Drop -O2 from param name. * g++.dg/tree-ssa/pr61034.C: Likewise. * g++.dg/tree-ssa/pr8781.C: Likewise. * g++.dg/warn/Wstringop-truncation-1.C: Likewise. * gcc.dg/ipa/pr63416.c: Likewise. * gcc.dg/tree-ssa/ssa-thread-12.c: Likewise. * gcc.dg/vect/pr66142.c: Likewise. * gcc.dg/winline-3.c: Likewise. * gcc.target/powerpc/pr72804.c: Likewise. From-SVN: r278644
2019-11-23 14:11:25 +01:00
Common Joined UInteger Var(param_uninlined_function_insns) Init(2) Optimization IntegerRange(0, 1000000) Param
Instructions accounted for function prologue, epilogue and other overhead.
-param=uninlined-function-time=
Common Joined UInteger Var(param_uninlined_function_time) Optimization IntegerRange(0, 1000000) Param
Time accounted for function prologue, epilogue and other overhead.
-param=uninlined-thunk-insns=
Common Joined UInteger Var(param_uninlined_function_thunk_insns) Optimization Init(2) IntegerRange(0, 1000000) Param
Instructions accounted for function thunk overhead.
-param=uninlined-thunk-time=
Common Joined UInteger Var(param_uninlined_function_thunk_time) Optimization Init(2) IntegerRange(0, 1000000) Param
Time accounted for function thunk overhead.
-param=unlikely-bb-count-fraction=
Common Joined UInteger Var(param_unlikely_bb_count_fraction) Init(20) Param Optimization
The denominator n of the fraction 1/n of the number of profiled runs of the entire program. A basic block whose execution count is below that fraction is considered unlikely.
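Read literally, the description amounts to the comparison count < runs / n. A minimal sketch, with a hypothetical function name and integer arithmetic chosen to avoid division:

```python
def is_unlikely_block(bb_count, profiled_runs, fraction_denominator=20):
    """True if bb_count is below profiled_runs / fraction_denominator."""
    # Multiply instead of dividing so that no rounding is involved.
    return bb_count * fraction_denominator < profiled_runs
```

With the default of 20 and 100 profiled runs, a block executed 4 times is unlikely, while one executed 5 times is not.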
-param=unroll-jam-max-unroll=
Common Joined UInteger Var(param_unroll_jam_max_unroll) Init(4) Param Optimization
Maximum unroll factor for the unroll-and-jam transformation.
-param=unroll-jam-min-percent=
Common Joined UInteger Var(param_unroll_jam_min_percent) Init(1) IntegerRange(0, 100) Param Optimization
Minimum percentage of memrefs that must go away for unroll-and-jam to be considered profitable.
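The profitability threshold can be sketched as a percentage check. The function and its handling of an empty loop are assumptions made for illustration, not the pass's actual logic:

```python
def unroll_jam_profitable(total_memrefs, removed_memrefs, min_percent=1):
    """True if at least min_percent of the memory references would go away."""
    if total_memrefs == 0:
        return False  # assumed: nothing to gain without memory references
    return removed_memrefs * 100 >= min_percent * total_memrefs
```

In this sketch, removing 1 of 100 references meets the default threshold of 1 percent, while removing 1 of 200 does not.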
-param=use-after-scope-direct-emission-threshold=
Common Joined UInteger Var(param_use_after_scope_direct_emission_threshold) Init(256) Param Optimization
Use direct poisoning/unpoisoning instructions for variables smaller than or equal to this number.
-param=use-canonical-types=
Common Joined UInteger Var(param_use_canonical_types) Init(1) IntegerRange(0, 1) Param
Whether to use canonical types.
-param=vect-epilogues-nomask=
Common Joined UInteger Var(param_vect_epilogues_nomask) Init(1) IntegerRange(0, 1) Param Optimization
Enable loop epilogue vectorization using smaller vector size.
-param=vect-max-peeling-for-alignment=
Common Joined UInteger Var(param_vect_max_peeling_for_alignment) Init(-1) IntegerRange(0, 64) Param Optimization
Maximum number of loop peels to enhance alignment of data references in a loop.
-param=vect-max-version-for-alias-checks=
Common Joined UInteger Var(param_vect_max_version_for_alias_checks) Init(10) Param Optimization
Bound on number of runtime checks inserted by the vectorizer's loop versioning for alias check.
-param=vect-max-version-for-alignment-checks=
Common Joined UInteger Var(param_vect_max_version_for_alignment_checks) Init(6) Param Optimization
Bound on number of runtime checks inserted by the vectorizer's loop versioning for alignment check.
vect: Support length-based partial vectors approach Power9 supports vector load/store instruction lxvl/stxvl which allow us to operate partial vectors with one specific length. This patch extends some of current mask-based partial vectors support code for length-based approach, also adds some length specific support code. So far it assumes that we can only have one partial vectors approach at the same time, it will disable to use partial vectors if both approaches co-exist. Like the description of optab len_load/len_store, the length-based approach can have two flavors, one is length in bytes, the other is length in lanes. This patch is mainly implemented and tested for length in bytes, but as Richard S. suggested, most of code has considered both flavors. This also introduces one parameter vect-partial-vector-usage allow users to control when the loop vectorizer considers using partial vectors as an alternative to falling back to scalar code. gcc/ChangeLog: * config/rs6000/rs6000.c (rs6000_option_override_internal): Set param_vect_partial_vector_usage to 0 explicitly. * doc/invoke.texi (vect-partial-vector-usage): Document new option. * optabs-query.c (get_len_load_store_mode): New function. * optabs-query.h (get_len_load_store_mode): New declare. * params.opt (vect-partial-vector-usage): New. * tree-vect-loop-manip.c (vect_set_loop_controls_directly): Add the handlings for vectorization using length-based partial vectors, call vect_gen_len for length generation, and rename some variables with items instead of scalars. (vect_set_loop_condition_partial_vectors): Add the handlings for vectorization using length-based partial vectors. (vect_do_peeling): Allow remaining eiters less than epilogue vf for LOOP_VINFO_USING_PARTIAL_VECTORS_P. * tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Init epil_using_partial_vectors_p. (_loop_vec_info::~_loop_vec_info): Call release_vec_loop_controls for lengths destruction. (vect_verify_loop_lens): New function. 
(vect_analyze_loop): Add handlings for epilogue of loop when it's marked to use vectorization using partial vectors. (vect_analyze_loop_2): Add the check to allow only one vectorization approach using partial vectorization at the same time. Check param vect-partial-vector-usage for partial vectors decision. Mark LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P if the epilogue is considerable to use partial vectors. Call release_vec_loop_controls for lengths destruction. (vect_estimate_min_profitable_iters): Adjust for loop vectorization using length-based partial vectors. (vect_record_loop_mask): Init factor to 1 for vectorization using mask-based partial vectors. (vect_record_loop_len): New function. (vect_get_loop_len): Likewise. * tree-vect-stmts.c (check_load_store_for_partial_vectors): Add checks for vectorization using length-based partial vectors. Factor some code to lambda function get_valid_nvectors. (vectorizable_store): Add handlings when using length-based partial vectors. (vectorizable_load): Likewise. (vect_gen_len): New function. * tree-vectorizer.h (struct rgroup_controls): Add field factor mainly for length-based partial vectors. (vec_loop_lens): New typedef. (_loop_vec_info): Add lens and epil_using_partial_vectors_p. (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P): New macro. (LOOP_VINFO_LENS): Likewise. (LOOP_VINFO_FULLY_WITH_LENGTH_P): Likewise. (vect_record_loop_len): New declare. (vect_get_loop_len): Likewise. (vect_gen_len): Likewise.
2020-07-20 03:40:10 +02:00
-param=vect-partial-vector-usage=
Common Joined UInteger Var(param_vect_partial_vector_usage) Init(2) IntegerRange(0, 2) Param Optimization
Controls how loop vectorizer uses partial vectors. 0 means never, 1 means only for loops whose need to iterate can be removed, 2 means for all loops. The default value is 2.
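The three values can be summarized as a small decision function. This is a sketch of the documented meaning only; the function and the boolean describing the loop are hypothetical names:

```python
def may_use_partial_vectors(usage, iteration_need_removable):
    """Interpret vect-partial-vector-usage for one loop.

    iteration_need_removable: True if partial vectors would remove the
    loop's need to iterate (a hypothetical, precomputed property).
    """
    if usage == 0:
        return False  # never use partial vectors
    if usage == 1:
        return iteration_need_removable  # only where iterating can be removed
    if usage == 2:
        return True  # all loops are candidates
    raise ValueError("vect-partial-vector-usage must be 0, 1 or 2")
```

The `IntegerRange(0, 2)` property means the option machinery rejects other values before the vectorizer sees them, so the final `raise` only guards the sketch itself.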
vect: Replace hardcoded inner loop cost factor This patch is to replace the current hardcoded weight factor 50, which is applied by the loop vectorizer to the cost of statements in an inner loop relative to the loop being vectorized, with one newly added member inner_loop_cost_factor in loop vinfo. It also introduces one parameter vect-inner-loop-cost-factor whose default value is 50, and is used to initialize the inner_loop_cost_factor member. The motivation here is that: if targets want to have one unique function to gather some information in each add_stmt_cost call, no matter that it's put before or after the cost tweaking part for inner loop, it may have the need to adjust (expand or shrink) the gathered data as the factor. Now the factor is hardcoded, it's not easily maintained. Bootstrapped/regtested on powerpc64le-linux-gnu P9, x86_64-redhat-linux and aarch64-linux-gnu. gcc/ChangeLog: * doc/invoke.texi (vect-inner-loop-cost-factor): Document new parameter. * params.opt (vect-inner-loop-cost-factor): New. * targhooks.c (default_add_stmt_cost): Replace hardcoded factor 50 with LOOP_VINFO_INNER_LOOP_COST_FACTOR, include head file tree-vectorizer.h and its required ones. * config/aarch64/aarch64.c (aarch64_add_stmt_cost): Replace hardcoded factor 50 with LOOP_VINFO_INNER_LOOP_COST_FACTOR. * config/arm/arm.c (arm_add_stmt_cost): Likewise. * config/i386/i386.c (ix86_add_stmt_cost): Likewise. * config/rs6000/rs6000.c (rs6000_add_stmt_cost): Likewise. * tree-vect-loop.c (vect_compute_single_scalar_iteration_cost): Likewise. (_loop_vec_info::_loop_vec_info): Init inner_loop_cost_factor. * tree-vectorizer.h (_loop_vec_info): Add inner_loop_cost_factor. (LOOP_VINFO_INNER_LOOP_COST_FACTOR): New macro.
2021-05-19 12:42:51 +02:00
-param=vect-inner-loop-cost-factor=
Common Joined UInteger Var(param_vect_inner_loop_cost_factor) Init(50) IntegerRange(1, 10000) Param Optimization
The maximum factor which the loop vectorizer applies to the cost of statements in an inner loop relative to the loop being vectorized.
-param=vect-induction-float=
Common Joined UInteger Var(param_vect_induction_float) Init(1) IntegerRange(0, 1) Param Optimization
Enable loop vectorization of floating point inductions.
-param=vrp1-mode=
Common Joined Var(param_vrp1_mode) Enum(vrp_mode) Init(VRP_MODE_VRP) Param Optimization
--param=vrp1-mode=[vrp|ranger] Specifies the mode VRP1 should operate in.
-param=vrp2-mode=
Common Joined Var(param_vrp2_mode) Enum(vrp_mode) Init(VRP_MODE_RANGER) Param Optimization
--param=vrp2-mode=[vrp|ranger] Specifies the mode VRP2 should operate in.
Enum
Name(vrp_mode) Type(enum vrp_mode) UnknownError(unknown vrp mode %qs)
EnumValue
Enum(vrp_mode) String(vrp) Value(VRP_MODE_VRP)
EnumValue
Enum(vrp_mode) String(ranger) Value(VRP_MODE_RANGER)
; This comment is to ensure we retain the blank line above.