IVOPT performance tuning patch. The main problem is a variant of the maximal weight
bipartite matching/assignment problem, i.e., one with an additional global
cost function. The complexity of an algorithm that finds the optimal solution
exceeds O(n^2). The existing algorithm in gcc tries to find the solution in 3 stages:
1) Find the initial solution set (dynamic programming style)
2) Extend the solution set
3) Prune the solution set.
The problem is that in step 1, the initial set tends to be too large, so
the final solution is very likely only a local optimum.
This patch addresses the problem and sees very large SPEC improvements.
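The three stages can be illustrated with a toy model. This is only a sketch under stated assumptions, not the code in gcc: the names set_cost, select_candidates and PER_IV_COST are hypothetical, each use is assumed to pay the cost of its cheapest selected candidate, and the global cost is modelled as a flat charge per selected candidate.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define N_USES 3
#define N_CANDS 3
#define PER_IV_COST 4	/* Hypothetical global charge per selected candidate.  */

/* Cost of the candidate set SEL: every use pays for its cheapest selected
   candidate, and every selected candidate adds PER_IV_COST.  An empty set
   cannot express any use, so it costs INT_MAX.  */
int
set_cost (const int cost[N_USES][N_CANDS], const bool sel[N_CANDS])
{
  int total = 0;
  bool any = false;

  for (int c = 0; c < N_CANDS; c++)
    if (sel[c])
      {
	total += PER_IV_COST;
	any = true;
      }
  if (!any)
    return INT_MAX;

  for (int u = 0; u < N_USES; u++)
    {
      int best = INT_MAX;
      for (int c = 0; c < N_CANDS; c++)
	if (sel[c] && cost[u][c] < best)
	  best = cost[u][c];
      total += best;
    }
  return total;
}

/* Greedy three stage selection.  Fills SEL and returns its cost.  */
int
select_candidates (const int cost[N_USES][N_CANDS], bool sel[N_CANDS])
{
  /* Stage 1: initial set, each use picks its locally cheapest candidate.  */
  for (int c = 0; c < N_CANDS; c++)
    sel[c] = false;
  for (int u = 0; u < N_USES; u++)
    {
      int best = 0;
      for (int c = 1; c < N_CANDS; c++)
	if (cost[u][c] < cost[u][best])
	  best = c;
      sel[best] = true;
    }
  int cur = set_cost (cost, sel);
  assert (cur != INT_MAX);	/* Stage 1 always selects something.  */

  /* Stage 2: extend, keep an extra candidate only if the total drops.  */
  for (int c = 0; c < N_CANDS; c++)
    if (!sel[c])
      {
	sel[c] = true;
	int t = set_cost (cost, sel);
	if (t < cur)
	  cur = t;
	else
	  sel[c] = false;
      }

  /* Stage 3: prune, drop a candidate only if the total drops.  */
  for (int c = 0; c < N_CANDS; c++)
    if (sel[c])
      {
	sel[c] = false;
	int t = set_cost (cost, sel);
	if (t < cur)
	  cur = t;
	else
	  sel[c] = true;
      }
  return cur;
}
```

In this model stage 1 ignores the global charge entirely, so it can select one candidate per use and leave all the repair work to the greedy pruning in stage 3, which is the situation the paragraph above describes.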
Another problem area is that ivopts often creates loop invariant expressions, and
such expressions increase register pressure, which is not counted. This patch
addresses that as well.
The third main problem is that profile data is not considered in cost computation.
The fourth problem is that the cost of loop invariant computation is not properly adjusted.
There are more tuning opportunities, namely:
1) Do not check ivs dependency during ivs set pruning (this improves dealII by 8% on core2)
2) Unconditionally consider all important candidates in partial set expansion (in addition
to the extended solution based on selected candidates)
3) Revisit the two stage initial set computation.
From-SVN: r162653
2010-07-28 Daniel Kraft <d@domob.eu>
* gfortran.h (gfc_build_intrinsic_call): New method.
* expr.c (gfc_build_intrinsic_call): New method.
* simplify.c (range_check): Ignore non-constant value.
(simplify_bound_dim): Handle non-variable expressions and
fix memory leak with non-free'ed expression.
(simplify_bound): Handle non-variable expressions.
(gfc_simplify_shape): Ditto.
(gfc_simplify_size): Ditto, but only in certain cases possible.
2010-07-28 Daniel Kraft <d@domob.eu>
* gfortran.dg/bound_8.f90: New test.
From-SVN: r162648
PR debug/45103
* dwarf2out.c (dwarf2out_var_location): Always consider
NOTE_DURING_CALL_P notes, even when not followed by real instructions.
From-SVN: r162646
2010-07-28 Richard Guenther <rguenther@suse.de>
* tree-ssa-ccp.c: Remove comment regarding STORE-CCP.
(set_lattice_value): Do not query an old default value.
(get_value_for_expr): New function. Properly canonicalize
float values.
(ccp_visit_phi_node): Use it.
From-SVN: r162638
* config/rs6000/rs6000.c (rs6000_override_options):
Use TARGET_MACHO inline, move darwin_one_byte_bool from here...
... to darwin_rs6000_override_options.
(rs6000_return_in_memory): Update preceding comment for darwin
64 bit ABI. Use TARGET_MACHO inline.
(rs6000_darwin64_struct_check_p): New.
(function_arg_advance): Use rs6000_darwin64_struct_check_p.
(function_arg): Likewise.
(rs6000_arg_partial_bytes): Likewise.
(rs6000_function_value): Likewise.
From-SVN: r162635
* config/darwin-driver.c (SWITCH_TAKES_ARG,
WORD_SWITCH_TAKES_ARG): Remove.
* cppspec.c (SWITCH_TAKES_ARG, WORD_SWITCH_TAKES_ARG): Remove.
* defaults.h (DEFAULT_SWITCH_TAKES_ARG,
DEFAULT_WORD_SWITCH_TAKES_ARG): Move from gcc.h.
(SWITCH_TAKES_ARG, WORD_SWITCH_TAKES_ARG): Move default
definitions from gcc.c.
* gcc.c (SWITCH_TAKES_ARG, WORD_SWITCH_TAKES_ARG): Move to
defaults.h.
* gcc.h (DEFAULT_SWITCH_TAKES_ARG, DEFAULT_WORD_SWITCH_TAKES_ARG):
Move to defaults.h.
* opts-common.c: Include tm.h.
(decode_cmdline_option): Use SWITCH_TAKES_ARG and
WORD_SWITCH_TAKES_ARG to count arguments to unknown options.
Handle more than one argument. Set canonical_option_num_elements.
(decode_cmdline_options_to_array): Set
canonical_option_num_elements and trailing elements of
canonical_option.
* opts.h (struct cl_decoded_option): Allow four elements in
canonical_option. Add field canonical_option_num_elements.
* Makefile.in (opts-common.o): Update dependencies.
ada:
* gcc-interface/misc.c (gnat_init_options): Ignore erroneous
options. Check canonical_option_num_elements on options copied.
fortran:
* gfortranspec.c (SWITCH_TAKES_ARG, WORD_SWITCH_TAKES_ARG):
Remove.
From-SVN: r162620
PR middle-end/44790
PR middle-end/44993
* expr.c (expand_expr_real_1) <MEM_REF>: Revert latest change. Make
sure the base has address_mode before adding the offset.
From-SVN: r162618
PR target/42495
PR middle-end/42574
* config/arm/arm.c (legitimize_pic_address): Use
gen_calculate_pic_address pattern to emit calculation of PIC address.
(will_be_in_index_register): New function.
(arm_legitimate_address_outer_p, thumb2_legitimate_address_p,)
(thumb1_legitimate_address_p): Use it provided !strict_p.
* config/arm/arm.md (calculate_pic_address): New expand and split.
From-SVN: r162595
* gcse.c (struct expr:max_distance): New field.
(doing_code_hoisting_p): New static variable.
(want_to_gcse_p): Change signature. Allow constrained hoisting of
simple expressions, don't change behavior for PRE. Set max_distance.
(insert_expr_in_table): Set new max_distance field.
(hash_scan_set): Update.
(hoist_expr_reaches_here_p): Stop search after max_distance
instructions.
(find_occr_in_bb): New static function. Use it in ...
(hoist_code): Calculate sizes of basic block before any changes are
done. Pass max_distance to hoist_expr_reaches_here_p.
(one_code_hoisting_pass): Set doing_code_hoisting_p.
* params.def (PARAM_GCSE_COST_DISTANCE_RATIO,)
(PARAM_GCSE_UNRESTRICTED_COST): New parameters.
* params.h (GCSE_COST_DISTANCE_RATIO, GCSE_UNRESTRICTED_COST): New
macros.
* doc/invoke.texi (gcse-cost-distance-ratio, gcse-unrestricted-cost):
Document.
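The idea of stopping the hoisting search after max_distance instructions can be sketched on a toy CFG. This is an illustration only, not the gcse.c implementation: the struct and function names are hypothetical, every block is assumed to have at most one predecessor, and the exact accounting in hoist_expr_reaches_here_p may differ.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy basic block; both fields are assumptions made for this sketch.  */
struct toy_block
{
  int pred;	/* Index of the single predecessor, or -1 for the entry.  */
  int n_insns;	/* Block size, computed before any code is moved.  */
};

/* Return true if block TARGET is reached from block FROM by following
   predecessor links while crossing fewer than MAX_DISTANCE instructions
   in the intermediate blocks.  */
bool
reaches_within_distance (const struct toy_block *bbs, int from, int target,
			 int max_distance)
{
  assert (from >= 0 && target >= 0);

  for (int bb = bbs[from].pred; bb != -1; bb = bbs[bb].pred)
    {
      if (bb == target)
	return true;

      /* Charge the whole intermediate block against the budget.  */
      max_distance -= bbs[bb].n_insns;
      if (max_distance <= 0)
	return false;
    }
  return false;
}
```

Block sizes are stored up front in this sketch because hoisting changes instruction counts, which matches the note on hoist_code above about calculating sizes before any changes are done.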
From-SVN: r162589
PR target/44542
* cfgexpand.c (expand_one_stack_var_at): Limit align to maximum
of max_used_stack_slot_alignment and PREFERRED_STACK_BOUNDARY
instead of MAX_SUPPORTED_STACK_ALIGNMENT.
(expand_one_var): Don't consider DECL_ALIGN for variables for
which expand_one_stack_var_at has been already called.
From-SVN: r162582
PR testsuite/44701
* doc/md.texi: Clarify m and es constraints on PowerPC and m and S
constraints on IA-64.
* gcc.target/powerpc/asm-es-2.c (f2): Add <> constraints.
From-SVN: r162581