[PATCH 3/7] OpenMP 4.0 offloading infrastructure: Offload tables.

gcc/
	* Makefile.in (GTFILES): Add omp-low.h to list of GC files.
	* cgraphunit.c: Include omp-low.h.
	* doc/tm.texi: Regenerate.
	* doc/tm.texi.in (TARGET_RECORD_OFFLOAD_SYMBOL): Document.
	* gengtype.c (open_base_files): Add omp-low.h to ifiles.
	* lto-cgraph.c (output_offload_tables): New function.
	(input_offload_tables): Likewise.
	* lto-section-in.c (lto_section_name): Add "offload_table".
	* lto-section-names.h (OFFLOAD_VAR_TABLE_SECTION_NAME): Define.
	(OFFLOAD_FUNC_TABLE_SECTION_NAME): Likewise.
	* lto-streamer-out.c (lto_output): Call output_offload_tables.
	* lto-streamer.h (lto_section_type): Add LTO_section_offload_table.
	(output_offload_tables, input_offload_tables): Declare.
	* omp-low.c: Include common/common-target.h and lto-section-names.h.
	(offload_funcs, offload_vars): New global <tree, va_gc> vectors.
	(expand_omp_target): Add child_fn into offload_funcs vector.
	(add_decls_addresses_to_decl_constructor): New function.
	(omp_finish_file): Likewise.
	* omp-low.h (omp_finish_file, offload_funcs, offload_vars): Declare.
	* target.def (record_offload_symbol): New DEFHOOK.
	* toplev.c: Include omp-low.h.
	(compile_file): Call omp_finish_file.
	* varpool.c: Include omp-low.h.
	(varpool_node::get_create): Add decl into offload_vars vector.

gcc/lto/
	* lto/lto.c (read_cgraph_and_symbols): Call input_offload_tables.

Co-Authored-By: Andrey Turetskiy <andrey.turetskiy@intel.com>
Co-Authored-By: Bernd Schmidt <bernds@codesourcery.com>
Co-Authored-By: Ilya Tocar <ilya.tocar@intel.com>
Co-Authored-By: Michael Zolotukhin <michael.v.zolotukhin@intel.com>

From-SVN: r217489
Ilya Verbin 2014-11-13 13:44:04 +00:00 committed by Kirill Yukhin
parent 3f341ee716
commit ec6fe917cd
18 changed files with 267 additions and 2 deletions

@ -7,6 +7,36 @@
Andrey Turetskiy <andrey.turetskiy@intel.com>
Michael Zolotukhin <michael.v.zolotukhin@intel.com>
* Makefile.in (GTFILES): Add omp-low.h to list of GC files.
* cgraphunit.c: Include omp-low.h.
* doc/tm.texi: Regenerate.
* doc/tm.texi.in (TARGET_RECORD_OFFLOAD_SYMBOL): Document.
* gengtype.c (open_base_files): Add omp-low.h to ifiles.
* lto-cgraph.c (output_offload_tables): New function.
(input_offload_tables): Likewise.
* lto-section-in.c (lto_section_name): Add "offload_table".
* lto-section-names.h (OFFLOAD_VAR_TABLE_SECTION_NAME): Define.
(OFFLOAD_FUNC_TABLE_SECTION_NAME): Likewise.
* lto-streamer-out.c (lto_output): Call output_offload_tables.
* lto-streamer.h (lto_section_type): Add LTO_section_offload_table.
(output_offload_tables, input_offload_tables): Declare.
* omp-low.c: Include common/common-target.h and lto-section-names.h.
(offload_funcs, offload_vars): New global <tree, va_gc> vectors.
(expand_omp_target): Add child_fn into offload_funcs vector.
(add_decls_addresses_to_decl_constructor): New function.
(omp_finish_file): Likewise.
* omp-low.h (omp_finish_file, offload_funcs, offload_vars): Declare.
* target.def (record_offload_symbol): New DEFHOOK.
* toplev.c: Include omp-low.h.
(compile_file): Call omp_finish_file.
* varpool.c: Include omp-low.h.
(varpool_node::get_create): Add decl into offload_vars vector.
2014-11-13 Ilya Verbin <ilya.verbin@intel.com>
Ilya Tocar <ilya.tocar@intel.com>
Andrey Turetskiy <andrey.turetskiy@intel.com>
Bernd Schmidt <bernds@codesourcery.com>
* cgraph.c: Include context.h.
(cgraph_node::create): Set node->offloadable and g->have_offload if
decl has "omp declare target" attribute.

@ -2320,6 +2320,7 @@ GTFILES = $(CPP_ID_DATA_H) $(srcdir)/input.h $(srcdir)/coretypes.h \
$(srcdir)/tree-profile.c $(srcdir)/tree-nested.c \
$(srcdir)/tree-parloops.c \
$(srcdir)/omp-low.c \
$(srcdir)/omp-low.h \
$(srcdir)/targhooks.c $(out_file) $(srcdir)/passes.c $(srcdir)/cgraphunit.c \
$(srcdir)/cgraphclones.c \
$(srcdir)/tree-phinodes.c \

@ -225,6 +225,7 @@ along with GCC; see the file COPYING3. If not see
#include "dbgcnt.h"
#include "tree-chkp.h"
#include "lto-section-names.h"
#include "omp-low.h"
/* Queue of cgraph nodes scheduled to be added into cgraph. This is a
secondary queue used during optimization to accommodate passes that

@ -11396,6 +11396,12 @@ If defined, this function returns an appropriate alignment in bits for an atomic
ISO C11 requires atomic compound assignments that may raise floating-point exceptions to raise exceptions corresponding to the arithmetic operation whose result was successfully stored in a compare-and-exchange sequence. This requires code equivalent to calls to @code{feholdexcept}, @code{feclearexcept} and @code{feupdateenv} to be generated at appropriate points in the compare-and-exchange sequence. This hook should set @code{*@var{hold}} to an expression equivalent to the call to @code{feholdexcept}, @code{*@var{clear}} to an expression equivalent to the call to @code{feclearexcept} and @code{*@var{update}} to an expression equivalent to the call to @code{feupdateenv}. The three expressions are @code{NULL_TREE} on entry to the hook and may be left as @code{NULL_TREE} if no code is required in a particular place. The default implementation leaves all three expressions as @code{NULL_TREE}. The @code{__atomic_feraiseexcept} function from @code{libatomic} may be of use as part of the code generated in @code{*@var{update}}.
@end deftypefn
@deftypefn {Target Hook} void TARGET_RECORD_OFFLOAD_SYMBOL (tree)
Used when offloaded functions are seen in the compilation unit and no named
sections are available. It is called once for each symbol that must be
recorded in the offload function and variable table.
@end deftypefn
@defmac TARGET_SUPPORTS_WIDE_INT
On older ports, large integers are stored in @code{CONST_DOUBLE} rtl

@ -8169,6 +8169,8 @@ and the associated definitions of those functions.
@hook TARGET_ATOMIC_ASSIGN_EXPAND_FENV
@hook TARGET_RECORD_OFFLOAD_SYMBOL
@defmac TARGET_SUPPORTS_WIDE_INT
On older ports, large integers are stored in @code{CONST_DOUBLE} rtl

@ -1843,7 +1843,7 @@ open_base_files (void)
"tree-ssa.h", "reload.h", "cpp-id-data.h", "tree-chrec.h",
"except.h", "output.h", "cfgloop.h", "target.h", "lto-streamer.h",
"target-globals.h", "ipa-ref.h", "cgraph.h", "ipa-prop.h",
"ipa-inline.h", "dwarf2out.h", NULL
"ipa-inline.h", "dwarf2out.h", "omp-low.h", NULL
};
const char *const *ifp;
outf_p gtype_desc_c;

@ -61,6 +61,7 @@ along with GCC; see the file COPYING3. If not see
#include "context.h"
#include "pass_manager.h"
#include "ipa-utils.h"
#include "omp-low.h"
/* True when asm nodes has been output. */
bool asm_nodes_output = false;
@ -1068,6 +1069,50 @@ read_string (struct lto_input_block *ib)
return str;
}
/* Output function/variable tables that will allow libgomp to look up offload
target code.
OFFLOAD_FUNCS is filled in expand_omp_target, OFFLOAD_VARS is filled in
varpool_node::get_create. In WHOPR (partitioned) mode during the WPA stage
both OFFLOAD_FUNCS and OFFLOAD_VARS are filled by input_offload_tables. */
void
output_offload_tables (void)
{
if (vec_safe_is_empty (offload_funcs) && vec_safe_is_empty (offload_vars))
return;
struct lto_simple_output_block *ob
= lto_create_simple_output_block (LTO_section_offload_table);
for (unsigned i = 0; i < vec_safe_length (offload_funcs); i++)
{
streamer_write_enum (ob->main_stream, LTO_symtab_tags,
LTO_symtab_last_tag, LTO_symtab_unavail_node);
lto_output_fn_decl_index (ob->decl_state, ob->main_stream,
(*offload_funcs)[i]);
}
for (unsigned i = 0; i < vec_safe_length (offload_vars); i++)
{
streamer_write_enum (ob->main_stream, LTO_symtab_tags,
LTO_symtab_last_tag, LTO_symtab_variable);
lto_output_var_decl_index (ob->decl_state, ob->main_stream,
(*offload_vars)[i]);
}
streamer_write_uhwi_stream (ob->main_stream, 0);
lto_destroy_simple_output_block (ob);
/* In WHOPR mode during the WPA stage the joint offload tables need to be
streamed to one partition only. That's why we free offload_funcs and
offload_vars after the first call of output_offload_tables. */
if (flag_wpa)
{
vec_free (offload_funcs);
vec_free (offload_vars);
}
}
/* Overwrite the information in NODE based on FILE_DATA, TAG, FLAGS,
STACK_SIZE, SELF_TIME and SELF_SIZE. This is called either to initialize
NODE or to replace the values in it, for instance because the first
@ -1794,6 +1839,55 @@ input_symtab (void)
}
}
/* Input function/variable tables that will allow libgomp to look up offload
target code, and store them into OFFLOAD_FUNCS and OFFLOAD_VARS. */
void
input_offload_tables (void)
{
struct lto_file_decl_data **file_data_vec = lto_get_file_decl_data ();
struct lto_file_decl_data *file_data;
unsigned int j = 0;
while ((file_data = file_data_vec[j++]))
{
const char *data;
size_t len;
struct lto_input_block *ib
= lto_create_simple_input_block (file_data, LTO_section_offload_table,
&data, &len);
if (!ib)
continue;
enum LTO_symtab_tags tag
= streamer_read_enum (ib, LTO_symtab_tags, LTO_symtab_last_tag);
while (tag)
{
if (tag == LTO_symtab_unavail_node)
{
int decl_index = streamer_read_uhwi (ib);
tree fn_decl
= lto_file_decl_data_get_fn_decl (file_data, decl_index);
vec_safe_push (offload_funcs, fn_decl);
}
else if (tag == LTO_symtab_variable)
{
int decl_index = streamer_read_uhwi (ib);
tree var_decl
= lto_file_decl_data_get_var_decl (file_data, decl_index);
vec_safe_push (offload_vars, var_decl);
}
else
fatal_error ("invalid offload table in %s", file_data->file_name);
tag = streamer_read_enum (ib, LTO_symtab_tags, LTO_symtab_last_tag);
}
lto_destroy_simple_input_block (file_data, LTO_section_offload_table,
ib, data, len);
}
}
/* True when we need optimization summary for NODE. */
static int
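
The offload table section written by output_offload_tables and read back by input_offload_tables is a sequence of (tag, decl index) records closed by a zero word. The following is a minimal standalone sketch of that record format, not GCC code: the Tag values, the OffloadTables struct, and the write_tables/read_tables helpers are hypothetical stand-ins for the LTO streamer calls, and a plain vector of integers stands in for the section contents.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Illustrative tag values (assumption); 0 is reserved as the terminator.
enum Tag : std::uint64_t { TAG_END = 0, TAG_FUNC = 1, TAG_VAR = 2 };

// Hypothetical stand-in for the offload_funcs / offload_vars vectors.
struct OffloadTables {
  std::vector<std::uint64_t> funcs;  // decl indices of offloaded functions
  std::vector<std::uint64_t> vars;   // decl indices of offloaded variables
};

// Emit one (tag, index) record per entry, then a single 0 terminator.
// Empty tables produce no output at all, mirroring the early return.
std::vector<std::uint64_t> write_tables(const OffloadTables &t) {
  std::vector<std::uint64_t> out;
  if (t.funcs.empty() && t.vars.empty())
    return out;
  for (std::uint64_t idx : t.funcs) { out.push_back(TAG_FUNC); out.push_back(idx); }
  for (std::uint64_t idx : t.vars) { out.push_back(TAG_VAR); out.push_back(idx); }
  out.push_back(TAG_END);
  return out;
}

// Append the records of one stream to the joint tables; stop at the 0 tag.
void read_tables(const std::vector<std::uint64_t> &in, OffloadTables &t) {
  std::size_t pos = 0;
  while (pos < in.size() && in[pos] != TAG_END) {
    std::uint64_t tag = in[pos++];
    if (pos >= in.size())
      throw std::runtime_error("truncated offload table");
    std::uint64_t idx = in[pos++];
    if (tag == TAG_FUNC)
      t.funcs.push_back(idx);
    else if (tag == TAG_VAR)
      t.vars.push_back(idx);
    else
      throw std::runtime_error("invalid offload table");
  }
}
```

Calling read_tables once per input stream accumulates a joint table, which is roughly what the per-file loop in input_offload_tables does during WPA.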

@ -70,7 +70,8 @@ const char *lto_section_name[LTO_N_SECTION_TYPES] =
"cgraphopt",
"inline",
"ipcp_trans",
"icf"
"icf",
"offload_table"
};

@ -35,4 +35,7 @@ extern const char *section_name_prefix;
#define LTO_SEGMENT_NAME "__GNU_LTO"
#define OFFLOAD_VAR_TABLE_SECTION_NAME ".gnu.offload_vars"
#define OFFLOAD_FUNC_TABLE_SECTION_NAME ".gnu.offload_funcs"
#endif /* GCC_LTO_SECTION_NAMES_H */

@ -2308,6 +2308,8 @@ lto_output (void)
statements using the statement UIDs. */
output_symtab ();
output_offload_tables ();
#ifdef ENABLE_CHECKING
lto_bitmap_free (output);
#endif

@ -247,6 +247,7 @@ enum lto_section_type
LTO_section_inline_summary,
LTO_section_ipcp_transform,
LTO_section_ipa_icf,
LTO_section_offload_table,
LTO_N_SECTION_TYPES /* Must be last. */
};
@ -822,6 +823,8 @@ bool lto_symtab_encoder_encode_initializer_p (lto_symtab_encoder_t,
varpool_node *);
void output_symtab (void);
void input_symtab (void);
void output_offload_tables (void);
void input_offload_tables (void);
bool referenced_from_other_partition_p (struct ipa_ref_list *,
lto_symtab_encoder_t);
bool reachable_from_other_partition_p (struct cgraph_node *,

@ -3,6 +3,13 @@
Andrey Turetskiy <andrey.turetskiy@intel.com>
Michael Zolotukhin <michael.v.zolotukhin@intel.com>
* lto/lto.c (read_cgraph_and_symbols): Call input_offload_tables.
2014-11-13 Ilya Verbin <ilya.verbin@intel.com>
Ilya Tocar <ilya.tocar@intel.com>
Andrey Turetskiy <andrey.turetskiy@intel.com>
Bernd Schmidt <bernds@codesourcery.com>
* lto-object.c (lto_obj_add_section): Use section_name_prefix instead of
LTO_SECTION_NAME_PREFIX.
* lto-partition.c (lto_promote_cross_file_statics): Call

@ -3034,6 +3034,8 @@ read_cgraph_and_symbols (unsigned nfiles, const char **fnames)
/* Read the symtab. */
input_symtab ();
input_offload_tables ();
/* Store resolutions into the symbol table. */
ld_plugin_symbol_resolution_t *res;

@ -77,6 +77,7 @@ along with GCC; see the file COPYING3. If not see
#include "optabs.h"
#include "cfgloop.h"
#include "target.h"
#include "common/common-target.h"
#include "omp-low.h"
#include "gimple-low.h"
#include "tree-cfgcleanup.h"
@ -87,6 +88,7 @@ along with GCC; see the file COPYING3. If not see
#include "tree-eh.h"
#include "cilk.h"
#include "context.h"
#include "lto-section-names.h"
/* Lowering of OpenMP parallel and workshare constructs proceeds in two
@ -235,6 +237,9 @@ static tree scan_omp_1_op (tree *, int *, void *);
*handled_ops_p = false; \
break;
/* Holds offload tables with decls. */
vec<tree, va_gc> *offload_funcs, *offload_vars;
/* Convenience function for calling scan_omp_1_op on tree operands. */
static inline tree
@ -8409,6 +8414,9 @@ expand_omp_target (struct omp_region *region)
DECL_STRUCT_FUNCTION (child_fn)->curr_properties = cfun->curr_properties;
cgraph_node::add_new_function (child_fn, true);
/* Add the new function to the offload table. */
vec_safe_push (offload_funcs, child_fn);
/* Fix the callgraph edges for child_cfun. Those for cfun will be
fixed in a following pass. */
push_cfun (child_cfun);
@ -12423,4 +12431,91 @@ make_pass_omp_simd_clone (gcc::context *ctxt)
return new pass_omp_simd_clone (ctxt);
}
/* Helper function for omp_finish_file routine. Takes decls from V_DECLS and
adds their addresses and sizes to constructor-vector V_CTOR. */
static void
add_decls_addresses_to_decl_constructor (vec<tree, va_gc> *v_decls,
vec<constructor_elt, va_gc> *v_ctor)
{
unsigned len = vec_safe_length (v_decls);
for (unsigned i = 0; i < len; i++)
{
tree it = (*v_decls)[i];
bool is_function = TREE_CODE (it) != VAR_DECL;
CONSTRUCTOR_APPEND_ELT (v_ctor, NULL_TREE, build_fold_addr_expr (it));
if (!is_function)
CONSTRUCTOR_APPEND_ELT (v_ctor, NULL_TREE,
fold_convert (const_ptr_type_node,
DECL_SIZE_UNIT (it)));
}
}
/* Create new symbols containing (address, size) pairs for global variables,
marked with "omp declare target" attribute, as well as addresses for the
functions, which are outlined target regions. */
void
omp_finish_file (void)
{
unsigned num_funcs = vec_safe_length (offload_funcs);
unsigned num_vars = vec_safe_length (offload_vars);
if (num_funcs == 0 && num_vars == 0)
return;
if (targetm_common.have_named_sections)
{
vec<constructor_elt, va_gc> *v_f, *v_v;
vec_alloc (v_f, num_funcs);
vec_alloc (v_v, num_vars * 2);
add_decls_addresses_to_decl_constructor (offload_funcs, v_f);
add_decls_addresses_to_decl_constructor (offload_vars, v_v);
tree vars_decl_type = build_array_type_nelts (pointer_sized_int_node,
num_vars * 2);
tree funcs_decl_type = build_array_type_nelts (pointer_sized_int_node,
num_funcs);
TYPE_ALIGN (vars_decl_type) = TYPE_ALIGN (pointer_sized_int_node);
TYPE_ALIGN (funcs_decl_type) = TYPE_ALIGN (pointer_sized_int_node);
tree ctor_v = build_constructor (vars_decl_type, v_v);
tree ctor_f = build_constructor (funcs_decl_type, v_f);
TREE_CONSTANT (ctor_v) = TREE_CONSTANT (ctor_f) = 1;
TREE_STATIC (ctor_v) = TREE_STATIC (ctor_f) = 1;
tree funcs_decl = build_decl (UNKNOWN_LOCATION, VAR_DECL,
get_identifier (".offload_func_table"),
funcs_decl_type);
tree vars_decl = build_decl (UNKNOWN_LOCATION, VAR_DECL,
get_identifier (".offload_var_table"),
vars_decl_type);
TREE_STATIC (funcs_decl) = TREE_STATIC (vars_decl) = 1;
/* Do not align tables more than TYPE_ALIGN (pointer_sized_int_node),
otherwise a joint table in a binary will contain padding between
tables from multiple object files. */
DECL_USER_ALIGN (funcs_decl) = DECL_USER_ALIGN (vars_decl) = 1;
DECL_ALIGN (funcs_decl) = TYPE_ALIGN (funcs_decl_type);
DECL_ALIGN (vars_decl) = TYPE_ALIGN (vars_decl_type);
DECL_INITIAL (funcs_decl) = ctor_f;
DECL_INITIAL (vars_decl) = ctor_v;
set_decl_section_name (funcs_decl, OFFLOAD_FUNC_TABLE_SECTION_NAME);
set_decl_section_name (vars_decl, OFFLOAD_VAR_TABLE_SECTION_NAME);
varpool_node::finalize_decl (vars_decl);
varpool_node::finalize_decl (funcs_decl);
}
else
{
for (unsigned i = 0; i < num_funcs; i++)
{
tree it = (*offload_funcs)[i];
targetm.record_offload_symbol (it);
}
for (unsigned i = 0; i < num_vars; i++)
{
tree it = (*offload_vars)[i];
targetm.record_offload_symbol (it);
}
}
}
#include "gt-omp-low.h"
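
The layout that omp_finish_file gives the two tables can be pictured with a small standalone model: one pointer-sized word per function, and an (address, size) pair per variable. This is only a sketch under stated assumptions. Decl, build_table, concat and total_var_bytes are hypothetical names, integers stand in for real addresses, and concat models a linker joining same-named sections from several object files.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a decl: kind, address, and size in bytes.
struct Decl {
  bool is_function;
  std::uintptr_t addr;
  std::size_t size;
};

// Mirror of the helper's logic: push the address of every decl, and for
// variables also push the size, so variables occupy two words each.
std::vector<std::uintptr_t> build_table(const std::vector<Decl> &decls) {
  std::vector<std::uintptr_t> table;
  for (const Decl &d : decls) {
    table.push_back(d.addr);
    if (!d.is_function)
      table.push_back(static_cast<std::uintptr_t>(d.size));
  }
  return table;
}

// Model of section concatenation: tables are joined back to back, which
// only yields a valid joint table if no padding is inserted between them.
std::vector<std::uintptr_t> concat(std::vector<std::uintptr_t> a,
                                   const std::vector<std::uintptr_t> &b) {
  a.insert(a.end(), b.begin(), b.end());
  return a;
}

// Walk a joint variable table in (address, size) pairs and sum the sizes.
std::size_t total_var_bytes(const std::vector<std::uintptr_t> &var_table) {
  std::size_t total = 0;
  for (std::size_t i = 0; i + 1 < var_table.size(); i += 2)
    total += var_table[i + 1];
  return total;
}
```

A reader such as total_var_bytes relies on the fixed two-word stride, which is why the patch keeps the table alignment at that of pointer_sized_int_node: any padding between per-object tables would be misread as an entry.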

@ -27,5 +27,9 @@ extern void omp_expand_local (basic_block);
extern void free_omp_regions (void);
extern tree omp_reduction_init (tree, tree);
extern bool make_gimple_omp_edges (basic_block, struct omp_region **, int *);
extern void omp_finish_file (void);
extern GTY(()) vec<tree, va_gc> *offload_funcs;
extern GTY(()) vec<tree, va_gc> *offload_vars;
#endif /* GCC_OMP_LOW_H */

@ -1779,6 +1779,14 @@ HOOK_VECTOR_END (vectorize)
#undef HOOK_PREFIX
#define HOOK_PREFIX "TARGET_"
DEFHOOK
(record_offload_symbol,
"Used when offloaded functions are seen in the compilation unit and no named\n\
sections are available. It is called once for each symbol that must be\n\
recorded in the offload function and variable table.",
void, (tree),
hook_void_tree)
/* Allow target specific overriding of option settings after options have
been changed by an attribute or pragma or when it is reset at the
end of the code affected by an attribute or pragma. */
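
The record_offload_symbol hook defaults to a no-op (hook_void_tree) and is only consulted when the target has no named sections. The standalone sketch below models that dispatch; it is not GCC code. TargetHooks, finish_file, example_record and the use of strings as symbols are all hypothetical, chosen only to show the default-hook pattern.

```cpp
#include <cstddef>
#include <string>
#include <vector>

using Symbol = std::string;  // stand-in for a tree decl (assumption)

// Default hook: do nothing, like a void hook taking one tree argument.
void hook_noop(const Symbol &) {}

// Hypothetical, heavily reduced model of the target hook vector.
struct TargetHooks {
  bool have_named_sections = true;
  void (*record_offload_symbol)(const Symbol &) = hook_noop;
};

// Example override for a target without named sections: remember symbols.
std::vector<Symbol> recorded;
void example_record(const Symbol &s) { recorded.push_back(s); }

// Returns how many symbols would go into the section-based tables.
// Without named sections, every symbol is handed to the hook instead.
std::size_t finish_file(const TargetHooks &t,
                        const std::vector<Symbol> &funcs,
                        const std::vector<Symbol> &vars) {
  if (funcs.empty() && vars.empty())
    return 0;
  if (t.have_named_sections)
    return funcs.size() + vars.size();
  for (const Symbol &f : funcs)
    t.record_offload_symbol(f);
  for (const Symbol &v : vars)
    t.record_offload_symbol(v);
  return 0;
}
```

With the default hook in place, a target lacking named sections silently records nothing, so such a target presumably has to provide its own hook to make offloading work.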

@ -98,6 +98,7 @@ along with GCC; see the file COPYING3. If not see
#include "insn-codes.h"
#include "optabs.h"
#include "tree-chkp.h"
#include "omp-low.h"
#if defined(DBX_DEBUGGING_INFO) || defined(XCOFF_DEBUGGING_INFO)
#include "dbxout.h"
@ -601,6 +602,8 @@ compile_file (void)
if (flag_check_pointer_bounds)
chkp_finish_file ();
omp_finish_file ();
output_shared_constant_pool ();
output_object_blocks ();
finish_tm_clone_pairs ();

@ -50,6 +50,7 @@ along with GCC; see the file COPYING3. If not see
#include "gimple.h"
#include "lto-streamer.h"
#include "context.h"
#include "omp-low.h"
const char * const tls_model_names[]={"none", "tls-emulated", "tls-real",
"tls-global-dynamic", "tls-local-dynamic",
@ -171,6 +172,8 @@ varpool_node::get_create (tree decl)
{
node->offloadable = 1;
g->have_offload = true;
if (!in_lto_p)
vec_safe_push (offload_vars, decl);
}
node->register_symbol ();