lto.c (lto_balanced_map): Fix typos in head comment.

* lto.c (lto_balanced_map): Fix typos in head comment.
	(lto_promote_cross_file_statics): Fix long lines and remove redundant
	test.

From-SVN: r172584
Eric Botcazou 2011-04-16 10:43:04 +00:00
parent e7f2301855
commit b984326076
2 changed files with 67 additions and 53 deletions


@@ -1,3 +1,9 @@
+2011-04-16 Eric Botcazou <ebotcazou@adacore.com>
+
+	* lto.c (lto_balanced_map): Fix typos in head comment.
+	(lto_promote_cross_file_statics): Fix long lines and remove redundant
+	test.
+
 2011-04-16 Jan Hubicka <jh@suse.cz>

 	* lto.c (lto_balanced_map): Update.
@@ -5,7 +11,8 @@
 2011-04-14 Jan Hubicka <jh@suse.cz>

 	* lto.c: Include ipa-inline.h
-	(add_cgraph_node_to_partition, undo_partition): Use inline_summary accessor.
+	(add_cgraph_node_to_partition, undo_partition): Use inline_summary
+	accessor.
 	(ipa_node_duplication_hook): Fix declaration.
 	* Make-lang.in (lto.o): Update dependencies.
@@ -77,8 +84,8 @@
 	PR lto/45721
 	PR lto/45375
-	* lto.c (partition_cgraph_node_p, partition_varpool_node_p): Weakrefs are
-	not partitioned.
+	* lto.c (partition_cgraph_node_p, partition_varpool_node_p): Weakrefs
+	are not partitioned.

 2010-12-22 Nathan Froyd <froydnj@codesourcery.com>
@@ -376,11 +383,12 @@
 	* lto.c (add_cgraph_node_to_partition): Forward declare; walk also
 	nodes from same comdat group as well as all comdat functions referenced
 	here.
-	(add_varpool_node_to_partition, add_references_to_partition): New function.
-	(lto_1_1_map): Skip COMDAT fnctions/variables; use add_varpool_node_to_partition;
-	clear aux flags when done.
-	(lto_promote_cross_file_statics): Do not promote stuff that gets duplicated to
-	each ltrans.
+	(add_varpool_node_to_partition, add_references_to_partition): New
+	function.
+	(lto_1_1_map): Skip COMDAT fnctions/variables; use
+	add_varpool_node_to_partition; clear aux flags when done.
+	(lto_promote_cross_file_statics): Do not promote stuff that gets
+	duplicated to each ltrans.

 2010-07-04 Jan Hubicka <jh@suse.cz>
@@ -588,8 +596,8 @@
 2010-04-30 Jan Hubicka <jh@suse.cz>

-	* lto.c (get_filename_for_set): Look for cgraph node and if none found, use
-	default name.
+	* lto.c (get_filename_for_set): Look for cgraph node and if none found,
+	use default name.
 	(lto_wpa_write_files): Write any non-empty partition.

 2010-04-30 Jan Hubicka <jh@suse.cz>


@@ -1,5 +1,5 @@
 /* Top-level LTO routines.
-   Copyright 2009, 2010 Free Software Foundation, Inc.
+   Copyright 2009, 2010, 2011 Free Software Foundation, Inc.
    Contributed by CodeSourcery, Inc.

 This file is part of GCC.
@@ -962,40 +962,43 @@ lto_1_to_1_map (void)
 }

-/* Group cgraph nodes in qually sized partitions.
+/* Group cgraph nodes into equally-sized partitions.

-   The algorithm deciding paritions are simple: nodes are taken in predefined
-   order. The order correspond to order we wish to have functions in final
-   output. In future this will be given by function reordering pass, but at
-   the moment we use topological order that serve a good approximation.
+   The partitioning algorithm is simple: nodes are taken in predefined order.
+   The order corresponds to the order we want functions to have in the final
+   output. In the future this will be given by function reordering pass, but
+   at the moment we use the topological order, which is a good approximation.

-   The goal is to partition this linear order into intervals (partitions) such
-   that all partitions have approximately the same size and that the number of
-   callgraph or IPA reference edgess crossing boundaries is minimal.
+   The goal is to partition this linear order into intervals (partitions) so
+   that all the partitions have approximately the same size and the number of
+   callgraph or IPA reference edges crossing boundaries is minimal.

    This is a lot faster (O(n) in size of callgraph) than algorithms doing
-   priority based graph clustering that are generally O(n^2) and since WHOPR
-   is designed to make things go well across partitions, it leads to good results.
+   priority-based graph clustering that are generally O(n^2) and, since
+   WHOPR is designed to make things go well across partitions, it leads
+   to good results.

-   We compute the expected size of partition as
-   max (total_size / lto_partitions, min_partition_size).
-   We use dynamic expected size of partition, so small programs
-   are partitioning into enough partitions to allow use of multiple CPUs while
-   large programs are not partitioned too much. Creating too many partition
-   increase streaming overhead significandly.
+   We compute the expected size of a partition as:

-   In the future we would like to bound maximal size of partition to avoid
-   ltrans stage consuming too much memory. At the moment however WPA stage is
-   most memory intensive phase at large benchmark since too many types and
-   declarations are read into memory.
+     max (total_size / lto_partitions, min_partition_size)

-   The function implement simple greedy algorithm. Nodes are begin added into
-   current partition until 3/4th of expected partition size is reached.
-   After this threshold we keep track of boundary size (number of edges going to
-   other partitions) and continue adding functions until the current partition
-   grows into a double of expected partition size. Then the process is undone
-   till the point when minimal ration of boundary size and in partition calls
-   was reached. */
+   We use dynamic expected size of partition so small programs are partitioned
+   into enough partitions to allow use of multiple CPUs, while large programs
+   are not partitioned too much. Creating too many partitions significantly
+   increases the streaming overhead.
+
+   In the future, we would like to bound the maximal size of partitions so as
+   to prevent the LTRANS stage from consuming too much memory. At the moment,
+   however, the WPA stage is the most memory intensive for large benchmarks,
+   since too many types and declarations are read into memory.
+
+   The function implements a simple greedy algorithm. Nodes are being added
+   to the current partition until after 3/4 of the expected partition size is
+   reached. Past this threshold, we keep track of boundary size (number of
+   edges going to other partitions) and continue adding functions until after
+   the current partition has grown to twice the expected partition size. Then
+   the process is undone to the point where the minimal ratio of boundary size
+   and in-partition calls was reached. */

 static void
 lto_balanced_map (void)
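
The greedy scheme described in the head comment of lto_balanced_map can be sketched as a small stand-alone C function. This is only an illustration under simplifying assumptions, not the code in lto.c: the names `struct edge`, `count_edges` and `balanced_map_sketch` are hypothetical, nodes are plain array indices already in the desired linear order, the cost is taken as boundary / (internal + 1) to avoid dividing by zero, and the edge counts are recomputed from scratch at each step, so the sketch does not reproduce the O(n) behaviour the comment mentions.

```c
/* Hypothetical toy model: node I has weight SIZE[I]; an edge joins two
   node indices.  None of these names come from lto.c.  */
struct edge { int from, to; };

static int
max_int (int a, int b)
{
  return a > b ? a : b;
}

/* For the interval [LO, HI), count edges with exactly one endpoint inside
   (boundary) and edges with both endpoints inside (internal).  */
static void
count_edges (const struct edge *edges, int n_edges, int lo, int hi,
             int *boundary, int *internal)
{
  *boundary = *internal = 0;
  for (int e = 0; e < n_edges; e++)
    {
      int a = edges[e].from >= lo && edges[e].from < hi;
      int b = edges[e].to >= lo && edges[e].to < hi;
      if (a && b)
        (*internal)++;
      else if (a || b)
        (*boundary)++;
    }
}

/* Split nodes 0..N-1 into consecutive intervals and store the partition
   index of node I in PART[I].  Returns the number of partitions.  */
int
balanced_map_sketch (const int *size, int n,
                     const struct edge *edges, int n_edges,
                     int n_partitions, int min_partition_size, int *part)
{
  int total = 0;
  for (int i = 0; i < n; i++)
    total += size[i];

  /* max (total_size / lto_partitions, min_partition_size)  */
  int expected = max_int (total / n_partitions, min_partition_size);

  int cur = 0, lo = 0;
  while (lo < n)
    {
      int cur_size = 0, best_end = -1, i = lo;
      double best_cost = 0.0;

      while (i < n)
        {
          cur_size += size[i++];

          /* Past 3/4 of the expected size, remember the cheapest cut.  */
          if (cur_size * 4 > expected * 3)
            {
              int boundary, internal;
              count_edges (edges, n_edges, lo, i, &boundary, &internal);
              double cost = (double) boundary / (internal + 1);
              if (best_end < 0 || cost < best_cost)
                {
                  best_cost = cost;
                  best_end = i;
                }
            }

          /* Stop growing once twice the expected size is exceeded.  */
          if (cur_size > 2 * expected)
            break;
        }

      /* Ran out of nodes before the 3/4 threshold: take what is left.  */
      if (best_end < 0)
        best_end = i;

      /* "Undo" back to the best cut: nodes past it start the next round.  */
      for (int k = lo; k < best_end; k++)
        part[k] = cur;
      cur++;
      lo = best_end;
    }
  return cur;
}
```

For example, with nine nodes of size 10 forming three triangles chained by single edges and three requested partitions, the sketch grows the first partition to seven nodes and then rolls it back to six, where only one edge crosses the boundary. The remaining three nodes form the second partition.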
@@ -1330,7 +1333,8 @@ lto_promote_cross_file_statics (void)
   n_sets = VEC_length (ltrans_partition, ltrans_partitions);
   for (i = 0; i < n_sets; i++)
     {
-      ltrans_partition part = VEC_index (ltrans_partition, ltrans_partitions, i);
+      ltrans_partition part
+        = VEC_index (ltrans_partition, ltrans_partitions, i);
       set = part->cgraph_set;
       vset = part->varpool_set;
@@ -1361,16 +1365,15 @@ lto_promote_cross_file_statics (void)
             promote_var (vnode);
         }

-      /* We export initializers of read-only var into each partition
-         referencing it. Folding might take declarations from the
-         initializers and use it; so everything referenced from the
-         initializers needs can be accessed from this partition after
-         folding.
+      /* We export the initializer of a read-only var into each partition
+         referencing the var. Folding might take declarations from the
+         initializer and use them, so everything referenced from the
+         initializer can be accessed from this partition after folding.

          This means that we need to promote all variables and functions
-         referenced from all initializers from readonly vars referenced
-         from this partition that are not in this partition.
-         This needs to be done recursively. */
+         referenced from all initializers of read-only vars referenced
+         from this partition that are not in this partition. This needs
+         to be done recursively. */
       for (vnode = varpool_nodes; vnode; vnode = vnode->next)
         if (const_value_known_p (vnode->decl)
             && DECL_INITIAL (vnode->decl)
@@ -1378,13 +1381,16 @@ lto_promote_cross_file_statics (void)
             && referenced_from_this_partition_p (&vnode->ref_list, set, vset)
             && !pointer_set_insert (inserted, vnode))
           VEC_safe_push (varpool_node_ptr, heap, promoted_initializers, vnode);
+
       while (!VEC_empty (varpool_node_ptr, promoted_initializers))
         {
           int i;
           struct ipa_ref *ref;

           vnode = VEC_pop (varpool_node_ptr, promoted_initializers);
-          for (i = 0; ipa_ref_list_reference_iterate (&vnode->ref_list, i, ref); i++)
+          for (i = 0;
+               ipa_ref_list_reference_iterate (&vnode->ref_list, i, ref);
+               i++)
             {
               if (ref->refered_type == IPA_REF_CGRAPH)
                 {
@@ -1399,17 +1405,17 @@ lto_promote_cross_file_statics (void)
                   struct varpool_node *v = ipa_ref_varpool_node (ref);
                   if (varpool_node_in_set_p (v, vset))
                     continue;
-                  /* Constant pool references use internal labels and thus can not
-                     be made global. It is sensible to keep those ltrans local to
-                     allow better optimization. */
+
+                  /* Constant pool references use internal labels and thus
+                     cannot be made global. It is sensible to keep those
+                     ltrans local to allow better optimization. */
                   if (DECL_IN_CONSTANT_POOL (v->decl))
                     {
                       if (!pointer_set_insert (inserted, vnode))
                         VEC_safe_push (varpool_node_ptr, heap,
                                        promoted_initializers, v);
                     }
-                  else if (!DECL_IN_CONSTANT_POOL (v->decl)
-                           && !v->externally_visible && v->analyzed)
+                  else if (!v->externally_visible && v->analyzed)
                     {
                       if (promote_var (v)
                           && DECL_INITIAL (v->decl)