Merge of HSA
2016-01-19 Martin Jambor <mjambor@suse.cz>
Martin Liska <mliska@suse.cz>
Michael Matz <matz@suse.de>

libgomp/
* plugin/Makefrag.am: Add HSA plugin requirements.
* plugin/configfrag.ac (HSA_RUNTIME_INCLUDE): New variable.
(HSA_RUNTIME_LIB): Likewise.
(HSA_RUNTIME_CPPFLAGS): Likewise.
(HSA_RUNTIME_INCLUDE): New substitution.
(HSA_RUNTIME_LIB): Likewise.
(HSA_RUNTIME_LDFLAGS): Likewise.
(hsa-runtime): New configure option.
(hsa-runtime-include): Likewise.
(hsa-runtime-lib): Likewise.
(PLUGIN_HSA): New substitution variable. Fill HSA_RUNTIME_INCLUDE and HSA_RUNTIME_LIB according to the new configure options.
(PLUGIN_HSA_CPPFLAGS): Likewise.
(PLUGIN_HSA_LDFLAGS): Likewise.
(PLUGIN_HSA_LIBS): Likewise. Check that we have access to HSA run-time.
* libgomp-plugin.h (offload_target_type): New element OFFLOAD_TARGET_TYPE_HSA.
* libgomp.h (gomp_target_task): New fields firstprivate_copies and args.
(bool gomp_create_target_task): Updated.
(gomp_device_descr): Extra parameter of run_func and async_run_func, new field can_run_func.
* libgomp_g.h (GOMP_target_ext): Update prototype.
* oacc-host.c (host_run): Added a new parameter args.
* target.c (calculate_firstprivate_requirements): New function.
(copy_firstprivate_data): Likewise.
(gomp_target_fallback_firstprivate): Use them.
(gomp_target_unshare_firstprivate): New function.
(gomp_get_target_fn_addr): Allow returning NULL for shared memory devices.
(GOMP_target): Do host fallback for all shared memory devices. Do not pass any args to plugins.
(GOMP_target_ext): Introduce device-specific argument parameter args. Allow host fallback if device shares memory. Do not remap data if device has shared memory.
(gomp_target_task_fn): Likewise. Also treat shared memory devices like host fallback for mappings.
(GOMP_target_data): Treat shared memory devices like host fallback.
(GOMP_target_data_ext): Likewise.
(GOMP_target_update): Likewise.
(GOMP_target_update_ext): Likewise. Also pass NULL as args to gomp_create_target_task.
(GOMP_target_enter_exit_data): Likewise.
(omp_target_alloc): Treat shared memory devices like host fallback.
(omp_target_free): Likewise.
(omp_target_is_present): Likewise.
(omp_target_memcpy): Likewise.
(omp_target_memcpy_rect): Likewise.
(omp_target_associate_ptr): Likewise.
(gomp_load_plugin_for_device): Also load can_run.
* task.c (GOMP_PLUGIN_target_task_completion): Free firstprivate_copies.
(gomp_create_target_task): Accept new argument args and store it to ttask.
* plugin/plugin-hsa.c: New file.

gcc/
* Makefile.in (OBJS): Add new source files.
(GTFILES): Add hsa.c.
* common.opt (disable_hsa): New variable.
(-Whsa): New warning.
* config.in (ENABLE_HSA): New.
* configure.ac: Treat hsa differently from other accelerators.
(OFFLOAD_TARGETS): Define ENABLE_OFFLOADING according to $enable_offloading.
(ENABLE_HSA): Define ENABLE_HSA according to $enable_hsa.
* doc/install.texi (Configuration): Document --with-hsa-runtime, --with-hsa-runtime-include, --with-hsa-runtime-lib and --with-hsa-kmt-lib.
* doc/invoke.texi (-Whsa): Document.
(hsa-gen-debug-stores): Likewise.
* lto-wrapper.c (compile_images_for_offload_targets): Do not attempt to invoke offload compiler for hsa acclerator.
* opts.c (common_handle_option): Determine whether HSA offloading should be performed.
* params.def (PARAM_HSA_GEN_DEBUG_STORES): New parameter.
* builtin-types.def (BT_FN_VOID_UINT_PTR_INT_PTR): New.
(BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR_INT_INT): Removed.
(BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR_PTR): New.
* gimple-low.c (lower_stmt): Also handle GIMPLE_OMP_GRID_BODY.
* gimple-pretty-print.c (dump_gimple_omp_for): Also handle GF_OMP_FOR_KIND_GRID_LOOP.
(dump_gimple_omp_block): Also handle GIMPLE_OMP_GRID_BODY.
(pp_gimple_stmt_1): Likewise.
* gimple-walk.c (walk_gimple_stmt): Likewise.
* gimple.c (gimple_build_omp_grid_body): New function.
(gimple_copy): Also handle GIMPLE_OMP_GRID_BODY.
* gimple.def (GIMPLE_OMP_GRID_BODY): New.
* gimple.h (enum gf_mask): Added GF_OMP_PARALLEL_GRID_PHONY, GF_OMP_FOR_KIND_GRID_LOOP, GF_OMP_FOR_GRID_PHONY and GF_OMP_TEAMS_GRID_PHONY.
(gimple_statement_omp_single_layout): Updated comments.
(gimple_build_omp_grid_body): New function.
(gimple_has_substatements): Also handle GIMPLE_OMP_GRID_BODY.
(gimple_omp_for_grid_phony): New function.
(gimple_omp_for_set_grid_phony): Likewise.
(gimple_omp_parallel_grid_phony): Likewise.
(gimple_omp_parallel_set_grid_phony): Likewise.
(gimple_omp_teams_grid_phony): Likewise.
(gimple_omp_teams_set_grid_phony): Likewise.
(gimple_return_set_retbnd): Also handle GIMPLE_OMP_GRID_BODY.
* omp-builtins.def (BUILT_IN_GOMP_OFFLOAD_REGISTER): New.
(BUILT_IN_GOMP_OFFLOAD_UNREGISTER): Likewise.
(BUILT_IN_GOMP_TARGET): Updated type.
* omp-low.c: Include symbol-summary.h, hsa.h and params.h.
(adjust_for_condition): New function.
(get_omp_for_step_from_incr): Likewise.
(extract_omp_for_data): Moved parts to adjust_for_condition and get_omp_for_step_from_incr.
(build_outer_var_ref): Handle GIMPLE_OMP_GRID_BODY.
(fixup_child_record_type): Bail out if receiver_decl is NULL.
(scan_sharing_clauses): Handle OMP_CLAUSE__GRIDDIM_.
(scan_omp_parallel): Do not create child functions for phony constructs.
(check_omp_nesting_restrictions): Handle GIMPLE_OMP_GRID_BODY.
(scan_omp_1_op): Checking assert we are not remapping to ERROR_MARK. Also also handle GIMPLE_OMP_GRID_BODY.
(parallel_needs_hsa_kernel_p): New function.
(expand_parallel_call): Register apprpriate parallel child functions as HSA kernels.
(grid_launch_attributes_trees): New type.
(grid_attr_trees): New variable.
(grid_create_kernel_launch_attr_types): New function.
(grid_insert_store_range_dim): Likewise.
(grid_get_kernel_launch_attributes): Likewise.
(get_target_argument_identifier_1): Likewise.
(get_target_argument_identifier): Likewise.
(get_target_argument_value): Likewise.
(push_target_argument_according_to_value): Likewise.
(get_target_arguments): Likewise.
(expand_omp_target): Call get_target_arguments instead of looking up for teams and thread limit.
(grid_expand_omp_for_loop): New function.
(grid_arg_decl_map): New type.
(grid_remap_kernel_arg_accesses): New function.
(grid_expand_target_kernel_body): New function.
(expand_omp): Call it.
(lower_omp_for): Do not emit phony constructs.
(lower_omp_taskreg): Do not emit phony constructs but create for them a temporary variable receiver_decl.
(lower_omp_taskreg): Do not emit phony constructs.
(lower_omp_teams): Likewise.
(lower_omp_grid_body): New function.
(lower_omp_1): Call it.
(grid_reg_assignment_to_local_var_p): New function.
(grid_seq_only_contains_local_assignments): Likewise.
(grid_find_single_omp_among_assignments_1): Likewise.
(grid_find_single_omp_among_assignments): Likewise.
(grid_find_ungridifiable_statement): Likewise.
(grid_target_follows_gridifiable_pattern): Likewise.
(grid_remap_prebody_decls): Likewise.
(grid_copy_leading_local_assignments): Likewise.
(grid_process_kernel_body_copy): Likewise.
(grid_attempt_target_gridification): Likewise.
(grid_gridify_all_targets_stmt): Likewise.
(grid_gridify_all_targets): Likewise.
(execute_lower_omp): Call grid_gridify_all_targets.
(make_gimple_omp_edges): Handle GIMPLE_OMP_GRID_BODY.
* tree-core.h (omp_clause_code): Added OMP_CLAUSE__GRIDDIM_.
(tree_omp_clause): Added union field dimension.
* tree-pretty-print.c (dump_omp_clause): Handle OMP_CLAUSE__GRIDDIM_.
* tree.c (omp_clause_num_ops): Added number of arguments of OMP_CLAUSE__GRIDDIM_.
(omp_clause_code_name): Added name of OMP_CLAUSE__GRIDDIM_.
(walk_tree_1): Handle OMP_CLAUSE__GRIDDIM_.
* tree.h (OMP_CLAUSE_GRIDDIM_DIMENSION): New.
(OMP_CLAUSE_SET_GRIDDIM_DIMENSION): Likewise.
(OMP_CLAUSE_GRIDDIM_SIZE): Likewise.
(OMP_CLAUSE_GRIDDIM_GROUP): Likewise.
* passes.def: Schedule pass_ipa_hsa and pass_gen_hsail.
* tree-pass.h (make_pass_gen_hsail): Declare.
(make_pass_ipa_hsa): Likewise.
* ipa-hsa.c: New file.
* lto-section-in.c (lto_section_name): Add hsa section name.
* lto-streamer.h (lto_section_type): Add hsa section.
* timevar.def (TV_IPA_HSA): New.
* hsa-brig-format.h: New file.
* hsa-brig.c: New file.
* hsa-dump.c: Likewise.
* hsa-gen.c: Likewise.
* hsa.c: Likewise.
* hsa.h: Likewise.
* toplev.c (compile_file): Call hsa_output_brig.
* hsa-regalloc.c: New file.

gcc/fortran/
* types.def (BT_FN_VOID_UINT_PTR_INT_PTR): New.
(BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR_INT_INT): Removed.
(BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR_PTR): New.

gcc/lto/
* lto-partition.c: Include "hsa.h"
(add_symbol_to_partition_1): Put hsa implementations into the same partition as host implementations.

liboffloadmic/
* plugin/libgomp-plugin-intelmic.cpp (GOMP_OFFLOAD_async_run): New unused parameter.
(GOMP_OFFLOAD_run): Likewise.

include/
* gomp-constants.h (GOMP_DEVICE_HSA): New macro.
(GOMP_VERSION_HSA): Likewise.
(GOMP_TARGET_ARG_DEVICE_MASK): Likewise.
(GOMP_TARGET_ARG_DEVICE_ALL): Likewise.
(GOMP_TARGET_ARG_SUBSEQUENT_PARAM): Likewise.
(GOMP_TARGET_ARG_ID_MASK): Likewise.
(GOMP_TARGET_ARG_NUM_TEAMS): Likewise.
(GOMP_TARGET_ARG_THREAD_LIMIT): Likewise.
(GOMP_TARGET_ARG_VALUE_SHIFT): Likewise.
(GOMP_TARGET_ARG_HSA_KERNEL_ATTRIBUTES): Likewise.

From-SVN: r232549
parent 2bedb645f2
commit b2b4005150
132
gcc/ChangeLog
@ -1,3 +1,135 @@
2016-01-19 Martin Jambor <mjambor@suse.cz>
Martin Liska <mliska@suse.cz>
Michael Matz <matz@suse.de>

* Makefile.in (OBJS): Add new source files.
(GTFILES): Add hsa.c.
* common.opt (disable_hsa): New variable.
(-Whsa): New warning.
* config.in (ENABLE_HSA): New.
* configure.ac: Treat hsa differently from other accelerators.
(OFFLOAD_TARGETS): Define ENABLE_OFFLOADING according to
$enable_offloading.
(ENABLE_HSA): Define ENABLE_HSA according to $enable_hsa.
* doc/install.texi (Configuration): Document --with-hsa-runtime,
--with-hsa-runtime-include, --with-hsa-runtime-lib and
--with-hsa-kmt-lib.
* doc/invoke.texi (-Whsa): Document.
(hsa-gen-debug-stores): Likewise.
* lto-wrapper.c (compile_images_for_offload_targets): Do not attempt
to invoke offload compiler for hsa acclerator.
* opts.c (common_handle_option): Determine whether HSA offloading
should be performed.
* params.def (PARAM_HSA_GEN_DEBUG_STORES): New parameter.
* builtin-types.def (BT_FN_VOID_UINT_PTR_INT_PTR): New.
(BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR_INT_INT): Removed.
(BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR_PTR): New.
* gimple-low.c (lower_stmt): Also handle GIMPLE_OMP_GRID_BODY.
* gimple-pretty-print.c (dump_gimple_omp_for): Also handle
GF_OMP_FOR_KIND_GRID_LOOP.
(dump_gimple_omp_block): Also handle GIMPLE_OMP_GRID_BODY.
(pp_gimple_stmt_1): Likewise.
* gimple-walk.c (walk_gimple_stmt): Likewise.
* gimple.c (gimple_build_omp_grid_body): New function.
(gimple_copy): Also handle GIMPLE_OMP_GRID_BODY.
* gimple.def (GIMPLE_OMP_GRID_BODY): New.
* gimple.h (enum gf_mask): Added GF_OMP_PARALLEL_GRID_PHONY,
GF_OMP_FOR_KIND_GRID_LOOP, GF_OMP_FOR_GRID_PHONY and
GF_OMP_TEAMS_GRID_PHONY.
(gimple_statement_omp_single_layout): Updated comments.
(gimple_build_omp_grid_body): New function.
(gimple_has_substatements): Also handle GIMPLE_OMP_GRID_BODY.
(gimple_omp_for_grid_phony): New function.
(gimple_omp_for_set_grid_phony): Likewise.
(gimple_omp_parallel_grid_phony): Likewise.
(gimple_omp_parallel_set_grid_phony): Likewise.
(gimple_omp_teams_grid_phony): Likewise.
(gimple_omp_teams_set_grid_phony): Likewise.
(gimple_return_set_retbnd): Also handle GIMPLE_OMP_GRID_BODY.
* omp-builtins.def (BUILT_IN_GOMP_OFFLOAD_REGISTER): New.
(BUILT_IN_GOMP_OFFLOAD_UNREGISTER): Likewise.
(BUILT_IN_GOMP_TARGET): Updated type.
* omp-low.c: Include symbol-summary.h, hsa.h and params.h.
(adjust_for_condition): New function.
(get_omp_for_step_from_incr): Likewise.
(extract_omp_for_data): Moved parts to adjust_for_condition and
get_omp_for_step_from_incr.
(build_outer_var_ref): Handle GIMPLE_OMP_GRID_BODY.
(fixup_child_record_type): Bail out if receiver_decl is NULL.
(scan_sharing_clauses): Handle OMP_CLAUSE__GRIDDIM_.
(scan_omp_parallel): Do not create child functions for phony
constructs.
(check_omp_nesting_restrictions): Handle GIMPLE_OMP_GRID_BODY.
(scan_omp_1_op): Checking assert we are not remapping to
ERROR_MARK. Also also handle GIMPLE_OMP_GRID_BODY.
(parallel_needs_hsa_kernel_p): New function.
(expand_parallel_call): Register apprpriate parallel child
functions as HSA kernels.
(grid_launch_attributes_trees): New type.
(grid_attr_trees): New variable.
(grid_create_kernel_launch_attr_types): New function.
(grid_insert_store_range_dim): Likewise.
(grid_get_kernel_launch_attributes): Likewise.
(get_target_argument_identifier_1): Likewise.
(get_target_argument_identifier): Likewise.
(get_target_argument_value): Likewise.
(push_target_argument_according_to_value): Likewise.
(get_target_arguments): Likewise.
(expand_omp_target): Call get_target_arguments instead of looking
up for teams and thread limit.
(grid_expand_omp_for_loop): New function.
(grid_arg_decl_map): New type.
(grid_remap_kernel_arg_accesses): New function.
(grid_expand_target_kernel_body): New function.
(expand_omp): Call it.
(lower_omp_for): Do not emit phony constructs.
(lower_omp_taskreg): Do not emit phony constructs but create for them
a temporary variable receiver_decl.
(lower_omp_taskreg): Do not emit phony constructs.
(lower_omp_teams): Likewise.
(lower_omp_grid_body): New function.
(lower_omp_1): Call it.
(grid_reg_assignment_to_local_var_p): New function.
(grid_seq_only_contains_local_assignments): Likewise.
(grid_find_single_omp_among_assignments_1): Likewise.
(grid_find_single_omp_among_assignments): Likewise.
(grid_find_ungridifiable_statement): Likewise.
(grid_target_follows_gridifiable_pattern): Likewise.
(grid_remap_prebody_decls): Likewise.
(grid_copy_leading_local_assignments): Likewise.
(grid_process_kernel_body_copy): Likewise.
(grid_attempt_target_gridification): Likewise.
(grid_gridify_all_targets_stmt): Likewise.
(grid_gridify_all_targets): Likewise.
(execute_lower_omp): Call grid_gridify_all_targets.
(make_gimple_omp_edges): Handle GIMPLE_OMP_GRID_BODY.
* tree-core.h (omp_clause_code): Added OMP_CLAUSE__GRIDDIM_.
(tree_omp_clause): Added union field dimension.
* tree-pretty-print.c (dump_omp_clause): Handle OMP_CLAUSE__GRIDDIM_.
* tree.c (omp_clause_num_ops): Added number of arguments of
OMP_CLAUSE__GRIDDIM_.
(omp_clause_code_name): Added name of OMP_CLAUSE__GRIDDIM_.
(walk_tree_1): Handle OMP_CLAUSE__GRIDDIM_.
* tree.h (OMP_CLAUSE_GRIDDIM_DIMENSION): New.
(OMP_CLAUSE_SET_GRIDDIM_DIMENSION): Likewise.
(OMP_CLAUSE_GRIDDIM_SIZE): Likewise.
(OMP_CLAUSE_GRIDDIM_GROUP): Likewise.
* passes.def: Schedule pass_ipa_hsa and pass_gen_hsail.
* tree-pass.h (make_pass_gen_hsail): Declare.
(make_pass_ipa_hsa): Likewise.
* ipa-hsa.c: New file.
* lto-section-in.c (lto_section_name): Add hsa section name.
* lto-streamer.h (lto_section_type): Add hsa section.
* timevar.def (TV_IPA_HSA): New.
* hsa-brig-format.h: New file.
* hsa-brig.c: New file.
* hsa-dump.c: Likewise.
* hsa-gen.c: Likewise.
* hsa.c: Likewise.
* hsa.h: Likewise.
* toplev.c (compile_file): Call hsa_output_brig.
* hsa-regalloc.c: New file.

2016-01-18 Jeff Law <law@redhat.com>

PR tree-optimization/69320
@ -1297,6 +1297,11 @@ OBJS = \
graphite-sese-to-poly.o \
gtype-desc.o \
haifa-sched.o \
hsa.o \
hsa-gen.o \
hsa-regalloc.o \
hsa-brig.o \
hsa-dump.o \
hw-doloop.o \
hwint.o \
ifcvt.o \
@ -1321,6 +1326,7 @@ OBJS = \
ipa-icf.o \
ipa-icf-gimple.o \
ipa-reference.o \
ipa-hsa.o \
ipa-ref.o \
ipa-utils.o \
ipa.o \
@ -2404,6 +2410,7 @@ GTFILES = $(CPP_ID_DATA_H) $(srcdir)/input.h $(srcdir)/coretypes.h \
$(srcdir)/sancov.c \
$(srcdir)/ipa-devirt.c \
$(srcdir)/internal-fn.h \
$(srcdir)/hsa.c \
@all_gtfiles@

# Compute the list of GT header files from the corresponding C sources,
@ -478,6 +478,8 @@ DEF_FUNCTION_TYPE_4 (BT_FN_BOOL_UINT_LONGPTR_LONGPTR_LONGPTR,
DEF_FUNCTION_TYPE_4 (BT_FN_BOOL_UINT_ULLPTR_ULLPTR_ULLPTR,
BT_BOOL, BT_UINT, BT_PTR_ULONGLONG, BT_PTR_ULONGLONG,
BT_PTR_ULONGLONG)
DEF_FUNCTION_TYPE_4 (BT_FN_VOID_UINT_PTR_INT_PTR, BT_VOID, BT_INT, BT_PTR,
BT_INT, BT_PTR)

DEF_FUNCTION_TYPE_5 (BT_FN_INT_STRING_INT_SIZE_CONST_STRING_VALIST_ARG,
BT_INT, BT_STRING, BT_INT, BT_SIZE, BT_CONST_STRING,
@ -555,10 +557,9 @@ DEF_FUNCTION_TYPE_9 (BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_BOOL_UINT_PTR_INT,
BT_VOID, BT_PTR_FN_VOID_PTR, BT_PTR,
BT_PTR_FN_VOID_PTR_PTR, BT_LONG, BT_LONG,
BT_BOOL, BT_UINT, BT_PTR, BT_INT)

DEF_FUNCTION_TYPE_10 (BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR_INT_INT,
BT_VOID, BT_INT, BT_PTR_FN_VOID_PTR, BT_SIZE, BT_PTR,
BT_PTR, BT_PTR, BT_UINT, BT_PTR, BT_INT, BT_INT)
DEF_FUNCTION_TYPE_9 (BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR_PTR,
BT_VOID, BT_INT, BT_PTR_FN_VOID_PTR, BT_SIZE, BT_PTR,
BT_PTR, BT_PTR, BT_UINT, BT_PTR, BT_PTR)

DEF_FUNCTION_TYPE_11 (BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_UINT_LONG_INT_LONG_LONG_LONG,
BT_VOID, BT_PTR_FN_VOID_PTR, BT_PTR,
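In the hunk above, the two trailing `BT_INT` arguments of the target builtin type give way to a single `BT_PTR`, which the commit message describes as a device-specific argument parameter (`args`) filled in by `get_target_arguments`. The following is a minimal sketch of how such an array could carry the teams and thread-limit values. The bit layout is an assumption made for illustration and is not taken from this diff: every `SKETCH_*` name and value is hypothetical, and the real constants are the `GOMP_TARGET_ARG_*` macros that the commit adds to include/gomp-constants.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit layout, for illustration only.  */
#define SKETCH_ARG_DEVICE_MASK      ((1 << 7) - 1)        /* bits 0-6: device selector */
#define SKETCH_ARG_SUBSEQUENT_PARAM (1 << 7)              /* value sits in the next slot */
#define SKETCH_ARG_ID_MASK          (((1 << 8) - 1) << 8) /* bits 8-15: argument id */
#define SKETCH_ARG_NUM_TEAMS        (1 << 8)
#define SKETCH_ARG_THREAD_LIMIT     (2 << 8)
#define SKETCH_ARG_VALUE_SHIFT      16

/* Pack a device selector, an argument id and a small non-negative value
   into one pointer-sized word.  Larger values need the two-slot form.  */
static intptr_t
sketch_pack_arg (int device, int id, int value)
{
  assert ((device & ~SKETCH_ARG_DEVICE_MASK) == 0);
  assert (id != 0 && (id & ~SKETCH_ARG_ID_MASK) == 0);
  assert (value >= 0 && value < (1 << 15));
  return ((intptr_t) value << SKETCH_ARG_VALUE_SHIFT) | id | device;
}

/* Scan a NULL-terminated, well-formed argument array for ID and return
   its value, or FALLBACK when the array is absent or lacks the id.
   Device matching is left out to keep the sketch short.  */
static int
sketch_find_arg (void **args, int id, int fallback)
{
  if (args == NULL)
    return fallback;
  while (*args != NULL)
    {
      intptr_t word = (intptr_t) *args++;
      int found = (word & SKETCH_ARG_ID_MASK) == id;
      if (word & SKETCH_ARG_SUBSEQUENT_PARAM)
        {
          /* Two-slot form: the value occupies the following slot.  */
          intptr_t value = (intptr_t) *args++;
          if (found)
            return (int) value;
        }
      else if (found)
        return (int) (word >> SKETCH_ARG_VALUE_SHIFT);
    }
  return fallback;
}
```

A single pointer keeps the builtin signature stable when further per-device arguments are introduced, which is one plausible reason for replacing the two fixed integers; the commit itself does not state the motivation.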
@ -239,6 +239,10 @@ Inserts call to __sanitizer_cov_trace_pc into every basic block.
Variable
bool dump_base_name_prefixed = false

; Flag whether HSA generation has been explicitely disabled
Variable
bool flag_disable_hsa = false

###
Driver
@ -593,6 +597,10 @@ Wfree-nonheap-object
Common Var(warn_free_nonheap_object) Init(1) Warning
Warn when attempting to free a non-heap object.

Whsa
Common Var(warn_hsa) Init(1) Warning
Warn when a function cannot be expanded to HSAIL.

Winline
Common Var(warn_inline) Warning
Warn when an inlined function cannot be inlined.
@ -144,6 +144,12 @@
#endif


/* Define this to enable support for generating HSAIL. */
#ifndef USED_FOR_TARGET
#undef ENABLE_HSA
#endif


/* Define if gcc should always pass --build-id to linker. */
#ifndef USED_FOR_TARGET
#undef ENABLE_LD_BUILDID
19
gcc/configure
vendored
@ -7700,6 +7700,13 @@ fi
for tgt in `echo $enable_offload_targets | sed 's/,/ /g'`; do
tgt=`echo $tgt | sed 's/=.*//'`

if echo "$tgt" | grep "^hsa" > /dev/null ; then
enable_hsa=1
else
enable_offloading=1
fi

if test x"$offload_targets" = x; then
offload_targets=$tgt
else
@ -7711,7 +7718,7 @@ cat >>confdefs.h <<_ACEOF
#define OFFLOAD_TARGETS "$offload_targets"
_ACEOF

if test x"$offload_targets" != x; then
if test x"$enable_offloading" != x; then

$as_echo "#define ENABLE_OFFLOADING 1" >>confdefs.h
@ -7721,6 +7728,12 @@ $as_echo "#define ENABLE_OFFLOADING 0" >>confdefs.h
fi

if test x"$enable_hsa" = x1 ; then

$as_echo "#define ENABLE_HSA 1" >>confdefs.h

fi


# Check whether --with-multilib-list was given.
if test "${with_multilib_list+set}" = set; then :
@ -18406,7 +18419,7 @@ else
lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
lt_status=$lt_dlunknown
cat > conftest.$ac_ext <<_LT_EOF
#line 18409 "configure"
#line 18422 "configure"
#include "confdefs.h"

#if HAVE_DLFCN_H
@ -18512,7 +18525,7 @@ else
lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
lt_status=$lt_dlunknown
cat > conftest.$ac_ext <<_LT_EOF
#line 18515 "configure"
#line 18528 "configure"
#include "confdefs.h"

#if HAVE_DLFCN_H
@ -940,6 +940,13 @@ AC_SUBST(accel_dir_suffix)
for tgt in `echo $enable_offload_targets | sed 's/,/ /g'`; do
tgt=`echo $tgt | sed 's/=.*//'`

if echo "$tgt" | grep "^hsa" > /dev/null ; then
enable_hsa=1
else
enable_offloading=1
fi

if test x"$offload_targets" = x; then
offload_targets=$tgt
else
@ -948,7 +955,7 @@ for tgt in `echo $enable_offload_targets | sed 's/,/ /g'`; do
done
AC_DEFINE_UNQUOTED(OFFLOAD_TARGETS, "$offload_targets",
[Define to offload targets, separated by commas.])
if test x"$offload_targets" != x; then
if test x"$enable_offloading" != x; then
AC_DEFINE(ENABLE_OFFLOADING, 1,
[Define this to enable support for offloading.])
else
@ -956,6 +963,11 @@ else
[Define this to enable support for offloading.])
fi

if test x"$enable_hsa" = x1 ; then
AC_DEFINE(ENABLE_HSA, 1,
[Define this to enable support for generating HSAIL.])
fi

AC_ARG_WITH(multilib-list,
[AS_HELP_STRING([--with-multilib-list], [select multilibs (AArch64, SH and x86-64 only)])],
:,
@ -1992,6 +1992,28 @@ specifying paths @var{path1}, @dots{}, @var{pathN}.
% @var{srcdir}/configure \
--enable-offload-target=i686-unknown-linux-gnu=/path/to/i686/compiler,x86_64-pc-linux-gnu
@end smallexample

If @samp{hsa} is specified as one of the targets, the compiler will be
built with support for HSA GPU accelerators. Because the same
compiler will emit the accelerator code, no path should be specified.

@item --with-hsa-runtime=@var{pathname}
@itemx --with-hsa-runtime-include=@var{pathname}
@itemx --with-hsa-runtime-lib=@var{pathname}

If you configure GCC with HSA offloading but do not have the HSA
run-time library installed in a standard location then you can
explicitly specify the directory where they are installed. The
@option{--with-hsa-runtime=@/@var{hsainstalldir}} option is a
shorthand for
@option{--with-hsa-runtime-lib=@/@var{hsainstalldir}/lib} and
@option{--with-hsa-runtime-include=@/@var{hsainstalldir}/include}.

@item --with-hsa-kmt-lib=@var{pathname}

If you configure GCC with HSA offloading but do not have the HSA
KMT library installed in a standard location then you can
explicitly specify the directory where it resides.
@end table

@subheading Cross-Compiler-Specific Options
@ -305,7 +305,7 @@ Objective-C and Objective-C++ Dialects}.
-Wunused-but-set-parameter -Wunused-but-set-variable @gol
-Wuseless-cast -Wvariadic-macros -Wvector-operation-performance @gol
-Wvla -Wvolatile-register-var -Wwrite-strings @gol
-Wzero-as-null-pointer-constant}
-Wzero-as-null-pointer-constant -Whsa}

@item C and Objective-C-only Warning Options
@gccoptlist{-Wbad-function-cast -Wmissing-declarations @gol
@ -5693,6 +5693,10 @@ Suppress warnings when a positional initializer is used to initialize
a structure that has been marked with the @code{designated_init}
attribute.

@item -Whsa
Issue a warning when HSAIL cannot be emitted for the compiled function or
OpenMP construct.

@end table

@node Debugging Options
@ -9508,6 +9512,12 @@ dynamic, guided, auto, runtime). The default is static.
Maximum depth of recursion when querying properties of SSA names in things
like fold routines. One level of recursion corresponds to following a
use-def chain.

@item hsa-gen-debug-stores
Enable emission of special debug stores within HSA kernels which are
then read and reported by libgomp plugin. Generation of these stores
is disabled by default, use @option{--param hsa-gen-debug-stores=1} to
enable it.
@end table
@end table
@ -1,3 +1,9 @@
2016-01-19 Martin Jambor <mjambor@suse.cz>

* types.def (BT_FN_VOID_UINT_PTR_INT_PTR): New.
(BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR_INT_INT): Removed.
(BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR_PTR): New.

2016-01-15 Paul Thomas <pault@gcc.gnu.org>

PR fortran/64324
@ -159,6 +159,8 @@ DEF_FUNCTION_TYPE_4 (BT_FN_BOOL_UINT_LONGPTR_LONGPTR_LONGPTR,
DEF_FUNCTION_TYPE_4 (BT_FN_BOOL_UINT_ULLPTR_ULLPTR_ULLPTR,
BT_BOOL, BT_UINT, BT_PTR_ULONGLONG, BT_PTR_ULONGLONG,
BT_PTR_ULONGLONG)
DEF_FUNCTION_TYPE_4 (BT_FN_VOID_UINT_PTR_INT_PTR, BT_VOID, BT_INT, BT_PTR,
BT_INT, BT_PTR)

DEF_FUNCTION_TYPE_5 (BT_FN_VOID_OMPFN_PTR_UINT_UINT_UINT,
BT_VOID, BT_PTR_FN_VOID_PTR, BT_PTR, BT_UINT, BT_UINT,
@ -220,10 +222,9 @@ DEF_FUNCTION_TYPE_9 (BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_BOOL_UINT_PTR_INT,
BT_VOID, BT_PTR_FN_VOID_PTR, BT_PTR,
BT_PTR_FN_VOID_PTR_PTR, BT_LONG, BT_LONG,
BT_BOOL, BT_UINT, BT_PTR, BT_INT)

DEF_FUNCTION_TYPE_10 (BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR_INT_INT,
DEF_FUNCTION_TYPE_9 (BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR_PTR,
BT_VOID, BT_INT, BT_PTR_FN_VOID_PTR, BT_SIZE, BT_PTR,
BT_PTR, BT_PTR, BT_UINT, BT_PTR, BT_INT, BT_INT)
BT_PTR, BT_PTR, BT_UINT, BT_PTR, BT_PTR)

DEF_FUNCTION_TYPE_11 (BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_UINT_LONG_INT_LONG_LONG_LONG,
BT_VOID, BT_PTR_FN_VOID_PTR, BT_PTR,
@ -358,6 +358,7 @@ lower_stmt (gimple_stmt_iterator *gsi, struct lower_data *data)
case GIMPLE_OMP_TASK:
case GIMPLE_OMP_TARGET:
case GIMPLE_OMP_TEAMS:
case GIMPLE_OMP_GRID_BODY:
data->cannot_fallthru = false;
lower_omp_directive (gsi, data);
data->cannot_fallthru = false;
@ -1187,6 +1187,9 @@ dump_gimple_omp_for (pretty_printer *buffer, gomp_for *gs, int spc, int flags)
case GF_OMP_FOR_KIND_CILKSIMD:
pp_string (buffer, "#pragma simd");
break;
case GF_OMP_FOR_KIND_GRID_LOOP:
pp_string (buffer, "#pragma omp for grid_loop");
break;
default:
gcc_unreachable ();
}
@ -1494,6 +1497,9 @@ dump_gimple_omp_block (pretty_printer *buffer, gimple *gs, int spc, int flags)
case GIMPLE_OMP_SECTION:
pp_string (buffer, "#pragma omp section");
break;
case GIMPLE_OMP_GRID_BODY:
pp_string (buffer, "#pragma omp gridified body");
break;
default:
gcc_unreachable ();
}
@ -2301,6 +2307,7 @@ pp_gimple_stmt_1 (pretty_printer *buffer, gimple *gs, int spc, int flags)
case GIMPLE_OMP_MASTER:
case GIMPLE_OMP_TASKGROUP:
case GIMPLE_OMP_SECTION:
case GIMPLE_OMP_GRID_BODY:
dump_gimple_omp_block (buffer, gs, spc, flags);
break;
@ -655,6 +655,7 @@ walk_gimple_stmt (gimple_stmt_iterator *gsi, walk_stmt_fn callback_stmt,
case GIMPLE_OMP_SINGLE:
case GIMPLE_OMP_TARGET:
case GIMPLE_OMP_TEAMS:
case GIMPLE_OMP_GRID_BODY:
ret = walk_gimple_seq_mod (gimple_omp_body_ptr (stmt), callback_stmt,
callback_op, wi);
if (ret)
14
gcc/gimple.c
@ -954,6 +954,19 @@ gimple_build_omp_master (gimple_seq body)
  return p;
}

/* Build a GIMPLE_OMP_GRID_BODY statement.

   BODY is the sequence of statements to be executed by the kernel. */

gimple *
gimple_build_omp_grid_body (gimple_seq body)
{
  gimple *p = gimple_alloc (GIMPLE_OMP_GRID_BODY, 0);
  if (body)
    gimple_omp_set_body (p, body);

  return p;
}

/* Build a GIMPLE_OMP_TASKGROUP statement.
@ -1807,6 +1820,7 @@ gimple_copy (gimple *stmt)
case GIMPLE_OMP_SECTION:
case GIMPLE_OMP_MASTER:
case GIMPLE_OMP_TASKGROUP:
case GIMPLE_OMP_GRID_BODY:
copy_omp_body:
new_seq = gimple_seq_copy (gimple_omp_body (stmt));
gimple_omp_set_body (copy, new_seq);
@ -376,6 +376,10 @@ DEFGSCODE(GIMPLE_OMP_TEAMS, "gimple_omp_teams", GSS_OMP_SINGLE_LAYOUT)
CLAUSES is an OMP_CLAUSE chain holding the associated clauses. */
DEFGSCODE(GIMPLE_OMP_ORDERED, "gimple_omp_ordered", GSS_OMP_SINGLE_LAYOUT)

/* GIMPLE_OMP_GRID_BODY <BODY> represents a parallel loop lowered for execution
on a GPU. It is an artificial statement created by omp lowering. */
DEFGSCODE(GIMPLE_OMP_GRID_BODY, "gimple_omp_gpukernel", GSS_OMP)

/* GIMPLE_PREDICT <PREDICT, OUTCOME> specifies a hint for branch prediction.

PREDICT is one of the predictors from predict.def.
65
gcc/gimple.h
@ -146,6 +146,7 @@ enum gf_mask {
GF_CALL_CTRL_ALTERING = 1 << 7,
GF_CALL_WITH_BOUNDS = 1 << 8,
GF_OMP_PARALLEL_COMBINED = 1 << 0,
GF_OMP_PARALLEL_GRID_PHONY = 1 << 1,
GF_OMP_TASK_TASKLOOP = 1 << 0,
GF_OMP_FOR_KIND_MASK = (1 << 4) - 1,
GF_OMP_FOR_KIND_FOR = 0,
@ -153,12 +154,14 @@ enum gf_mask {
GF_OMP_FOR_KIND_TASKLOOP = 2,
GF_OMP_FOR_KIND_CILKFOR = 3,
GF_OMP_FOR_KIND_OACC_LOOP = 4,
GF_OMP_FOR_KIND_GRID_LOOP = 5,
/* Flag for SIMD variants of OMP_FOR kinds. */
GF_OMP_FOR_SIMD = 1 << 3,
GF_OMP_FOR_KIND_SIMD = GF_OMP_FOR_SIMD | 0,
GF_OMP_FOR_KIND_CILKSIMD = GF_OMP_FOR_SIMD | 1,
GF_OMP_FOR_COMBINED = 1 << 4,
GF_OMP_FOR_COMBINED_INTO = 1 << 5,
GF_OMP_FOR_GRID_PHONY = 1 << 6,
GF_OMP_TARGET_KIND_MASK = (1 << 4) - 1,
GF_OMP_TARGET_KIND_REGION = 0,
GF_OMP_TARGET_KIND_DATA = 1,
@ -172,6 +175,7 @@ enum gf_mask {
GF_OMP_TARGET_KIND_OACC_ENTER_EXIT_DATA = 9,
GF_OMP_TARGET_KIND_OACC_DECLARE = 10,
GF_OMP_TARGET_KIND_OACC_HOST_DATA = 11,
GF_OMP_TEAMS_GRID_PHONY = 1 << 0,

/* True on an GIMPLE_OMP_RETURN statement if the return does not require
a thread synchronization via some sort of barrier. The exact barrier
@ -733,7 +737,7 @@ struct GTY((tag("GSS_OMP_SINGLE_LAYOUT")))
{
/* [ WORD 1-7 ] : base class */

/* [ WORD 7 ] */
/* [ WORD 8 ] */
tree clauses;
};
@ -1454,6 +1458,7 @@ gomp_task *gimple_build_omp_task (gimple_seq, tree, tree, tree, tree,
tree, tree);
gimple *gimple_build_omp_section (gimple_seq);
gimple *gimple_build_omp_master (gimple_seq);
gimple *gimple_build_omp_grid_body (gimple_seq);
gimple *gimple_build_omp_taskgroup (gimple_seq);
gomp_continue *gimple_build_omp_continue (tree, tree);
gomp_ordered *gimple_build_omp_ordered (gimple_seq, tree);
@ -1714,6 +1719,7 @@ gimple_has_substatements (gimple *g)
case GIMPLE_OMP_CRITICAL:
case GIMPLE_WITH_CLEANUP_EXPR:
case GIMPLE_TRANSACTION:
case GIMPLE_OMP_GRID_BODY:
return true;

default:
@ -5079,6 +5085,24 @@ gimple_omp_for_set_pre_body (gimple *gs, gimple_seq pre_body)
|
||||
omp_for_stmt->pre_body = pre_body;
|
||||
}
|
||||
|
||||
/* Return the kernel_phony of OMP_FOR statement. */
|
||||
|
||||
static inline bool
|
||||
gimple_omp_for_grid_phony (const gomp_for *omp_for)
|
||||
{
|
||||
return (gimple_omp_subcode (omp_for) & GF_OMP_FOR_GRID_PHONY) != 0;
|
||||
}
|
||||
|
||||
/* Set kernel_phony flag of OMP_FOR to VALUE. */
|
||||
|
||||
static inline void
|
||||
gimple_omp_for_set_grid_phony (gomp_for *omp_for, bool value)
|
||||
{
|
||||
if (value)
|
||||
omp_for->subcode |= GF_OMP_FOR_GRID_PHONY;
|
||||
else
|
||||
omp_for->subcode &= ~GF_OMP_FOR_GRID_PHONY;
|
||||
}
|
||||
|
||||
/* Return the clauses associated with OMP_PARALLEL GS. */
|
||||
|
||||
@ -5165,6 +5189,24 @@ gimple_omp_parallel_set_data_arg (gomp_parallel *omp_parallel_stmt,
|
||||
omp_parallel_stmt->data_arg = data_arg;
|
||||
}
|
||||
|
||||
/* Return the kernel_phony flag of OMP_PARALLEL_STMT. */
|
||||
|
||||
static inline bool
|
||||
gimple_omp_parallel_grid_phony (const gomp_parallel *stmt)
|
||||
{
|
||||
return (gimple_omp_subcode (stmt) & GF_OMP_PARALLEL_GRID_PHONY) != 0;
|
||||
}
|
||||
|
||||
/* Set kernel_phony flag of OMP_PARALLEL_STMT to VALUE. */
|
||||
|
||||
static inline void
|
||||
gimple_omp_parallel_set_grid_phony (gomp_parallel *stmt, bool value)
|
||||
{
|
||||
if (value)
|
||||
stmt->subcode |= GF_OMP_PARALLEL_GRID_PHONY;
|
||||
else
|
||||
stmt->subcode &= ~GF_OMP_PARALLEL_GRID_PHONY;
|
||||
}
|
||||
|
||||
/* Return the clauses associated with OMP_TASK GS. */
|
||||
|
||||
@ -5638,6 +5680,24 @@ gimple_omp_teams_set_clauses (gomp_teams *omp_teams_stmt, tree clauses)
|
||||
omp_teams_stmt->clauses = clauses;
|
||||
}
|
||||
|
||||
/* Return the kernel_phony flag of an OMP_TEAMS_STMT. */
|
||||
|
||||
static inline bool
|
||||
gimple_omp_teams_grid_phony (const gomp_teams *omp_teams_stmt)
|
||||
{
|
||||
return (gimple_omp_subcode (omp_teams_stmt) & GF_OMP_TEAMS_GRID_PHONY) != 0;
|
||||
}
|
||||
|
||||
/* Set kernel_phony flag of an OMP_TEAMS_STMT to VALUE. */
|
||||
|
||||
static inline void
|
||||
gimple_omp_teams_set_grid_phony (gomp_teams *omp_teams_stmt, bool value)
|
||||
{
|
||||
if (value)
|
||||
omp_teams_stmt->subcode |= GF_OMP_TEAMS_GRID_PHONY;
|
||||
else
|
||||
omp_teams_stmt->subcode &= ~GF_OMP_TEAMS_GRID_PHONY;
|
||||
}
|
||||
|
||||
/* Return the clauses associated with OMP_SECTIONS GS. */
|
||||
|
||||
@ -6002,7 +6062,8 @@ gimple_return_set_retbnd (gimple *gs, tree retval)
|
||||
case GIMPLE_OMP_RETURN: \
|
||||
case GIMPLE_OMP_ATOMIC_LOAD: \
|
||||
case GIMPLE_OMP_ATOMIC_STORE: \
|
||||
case GIMPLE_OMP_CONTINUE
|
||||
case GIMPLE_OMP_CONTINUE: \
|
||||
case GIMPLE_OMP_GRID_BODY
|
||||
|
||||
static inline bool
|
||||
is_gimple_omp (const gimple *stmt)
|
||||
|
1234
gcc/hsa-brig-format.h
Normal file
File diff suppressed because it is too large
2560
gcc/hsa-brig.c
Normal file
File diff suppressed because it is too large
1189
gcc/hsa-dump.c
Normal file
File diff suppressed because it is too large
6151
gcc/hsa-gen.c
Normal file
File diff suppressed because it is too large
719
gcc/hsa-regalloc.c
Normal file
@ -0,0 +1,719 @@
|
||||
/* HSAIL IL Register allocation and out-of-SSA.
|
||||
Copyright (C) 2013-2016 Free Software Foundation, Inc.
|
||||
Contributed by Michael Matz <matz@suse.de>
|
||||
|
||||
This file is part of GCC.
|
||||
|
||||
GCC is free software; you can redistribute it and/or modify
|
||||
it under the terms of the GNU General Public License as published by
|
||||
the Free Software Foundation; either version 3, or (at your option)
|
||||
any later version.
|
||||
|
||||
GCC is distributed in the hope that it will be useful,
|
||||
but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||
GNU General Public License for more details.
|
||||
|
||||
You should have received a copy of the GNU General Public License
|
||||
along with GCC; see the file COPYING3. If not see
|
||||
<http://www.gnu.org/licenses/>. */
|
||||
|
||||
#include "config.h"
|
||||
#include "system.h"
|
||||
#include "coretypes.h"
|
||||
#include "tm.h"
|
||||
#include "is-a.h"
|
||||
#include "vec.h"
|
||||
#include "tree.h"
|
||||
#include "dominance.h"
|
||||
#include "cfg.h"
|
||||
#include "cfganal.h"
|
||||
#include "function.h"
|
||||
#include "bitmap.h"
|
||||
#include "dumpfile.h"
|
||||
#include "cgraph.h"
|
||||
#include "print-tree.h"
|
||||
#include "cfghooks.h"
|
||||
#include "symbol-summary.h"
|
||||
#include "hsa.h"
|
||||
|
||||
|
||||
/* Process a PHI node PHI of basic block BB as a part of naive out-of-SSA. */
|
||||
|
||||
static void
|
||||
naive_process_phi (hsa_insn_phi *phi)
|
||||
{
|
||||
unsigned count = phi->operand_count ();
|
||||
for (unsigned i = 0; i < count; i++)
|
||||
{
|
||||
gcc_checking_assert (phi->get_op (i));
|
||||
hsa_op_base *op = phi->get_op (i);
|
||||
hsa_bb *hbb;
|
||||
edge e;
|
||||
|
||||
if (!op)
|
||||
break;
|
||||
|
||||
e = EDGE_PRED (phi->m_bb, i);
|
||||
if (single_succ_p (e->src))
|
||||
hbb = hsa_bb_for_bb (e->src);
|
||||
else
|
||||
{
|
||||
basic_block old_dest = e->dest;
|
||||
hbb = hsa_init_new_bb (split_edge (e));
|
||||
|
||||
/* If switch insn used this edge, fix jump table. */
|
||||
hsa_bb *source = hsa_bb_for_bb (e->src);
|
||||
hsa_insn_sbr *sbr;
|
||||
if (source->m_last_insn
|
||||
&& (sbr = dyn_cast <hsa_insn_sbr *> (source->m_last_insn)))
|
||||
sbr->replace_all_labels (old_dest, hbb->m_bb);
|
||||
}
|
||||
|
||||
hsa_build_append_simple_mov (phi->m_dest, op, hbb);
|
||||
}
|
||||
}
|
||||
|
||||
/* Naive out-of SSA. */
|
||||
|
||||
static void
|
||||
naive_outof_ssa (void)
|
||||
{
|
||||
basic_block bb;
|
||||
|
||||
hsa_cfun->m_in_ssa = false;
|
||||
|
||||
FOR_ALL_BB_FN (bb, cfun)
|
||||
{
|
||||
hsa_bb *hbb = hsa_bb_for_bb (bb);
|
||||
hsa_insn_phi *phi;
|
||||
|
||||
for (phi = hbb->m_first_phi;
|
||||
phi;
|
||||
phi = phi->m_next ? as_a <hsa_insn_phi *> (phi->m_next) : NULL)
|
||||
naive_process_phi (phi);
|
||||
|
||||
/* Zap PHI nodes, they will be deallocated when everything else will. */
|
||||
hbb->m_first_phi = NULL;
|
||||
hbb->m_last_phi = NULL;
|
||||
}
|
||||
}
|
||||
|
||||
/* Return register class number for the given HSA TYPE. 0 means the 'c' one
|
||||
bit register class, 1 means 's' 32 bit class, 2 stands for 'd' 64 bit class
|
||||
and 3 for 'q' 128 bit class. */
|
||||
|
||||
static int
|
||||
m_reg_class_for_type (BrigType16_t type)
|
||||
{
|
||||
switch (type)
|
||||
{
|
||||
case BRIG_TYPE_B1:
|
||||
return 0;
|
||||
|
||||
case BRIG_TYPE_U8:
|
||||
case BRIG_TYPE_U16:
|
||||
case BRIG_TYPE_U32:
|
||||
case BRIG_TYPE_S8:
|
||||
case BRIG_TYPE_S16:
|
||||
case BRIG_TYPE_S32:
|
||||
case BRIG_TYPE_F16:
|
||||
case BRIG_TYPE_F32:
|
||||
case BRIG_TYPE_B8:
|
||||
case BRIG_TYPE_B16:
|
||||
case BRIG_TYPE_B32:
|
||||
case BRIG_TYPE_U8X4:
|
||||
case BRIG_TYPE_S8X4:
|
||||
case BRIG_TYPE_U16X2:
|
||||
case BRIG_TYPE_S16X2:
|
||||
case BRIG_TYPE_F16X2:
|
||||
return 1;
|
||||
|
||||
case BRIG_TYPE_U64:
|
||||
case BRIG_TYPE_S64:
|
||||
case BRIG_TYPE_F64:
|
||||
case BRIG_TYPE_B64:
|
||||
case BRIG_TYPE_U8X8:
|
||||
case BRIG_TYPE_S8X8:
|
||||
case BRIG_TYPE_U16X4:
|
||||
case BRIG_TYPE_S16X4:
|
||||
case BRIG_TYPE_F16X4:
|
||||
case BRIG_TYPE_U32X2:
|
||||
case BRIG_TYPE_S32X2:
|
||||
case BRIG_TYPE_F32X2:
|
||||
return 2;
|
||||
|
||||
case BRIG_TYPE_B128:
|
||||
case BRIG_TYPE_U8X16:
|
||||
case BRIG_TYPE_S8X16:
|
||||
case BRIG_TYPE_U16X8:
|
||||
case BRIG_TYPE_S16X8:
|
||||
case BRIG_TYPE_F16X8:
|
||||
case BRIG_TYPE_U32X4:
|
||||
case BRIG_TYPE_U64X2:
|
||||
case BRIG_TYPE_S32X4:
|
||||
case BRIG_TYPE_S64X2:
|
||||
case BRIG_TYPE_F32X4:
|
||||
case BRIG_TYPE_F64X2:
|
||||
return 3;
|
||||
|
||||
default:
|
||||
gcc_unreachable ();
|
||||
}
|
||||
}
|
||||
|
||||
/* If the Ith operand of INSN is or contains a register (in an address),
|
||||
return the address of that register operand. If not return NULL. */
|
||||
|
||||
static hsa_op_reg **
|
||||
insn_reg_addr (hsa_insn_basic *insn, int i)
|
||||
{
|
||||
hsa_op_base *op = insn->get_op (i);
|
||||
if (!op)
|
||||
return NULL;
|
||||
hsa_op_reg *reg = dyn_cast <hsa_op_reg *> (op);
|
||||
if (reg)
|
||||
return (hsa_op_reg **) insn->get_op_addr (i);
|
||||
hsa_op_address *addr = dyn_cast <hsa_op_address *> (op);
|
||||
if (addr && addr->m_reg)
|
||||
return &addr->m_reg;
|
||||
return NULL;
|
||||
}
|
||||
|
||||
struct m_reg_class_desc
|
||||
{
|
||||
unsigned next_avail, max_num;
|
||||
unsigned used_num, max_used;
|
||||
uint64_t used[2];
|
||||
char cl_char;
|
||||
};
|
||||
|
||||
/* Rewrite the instructions in BB to observe spilled live ranges.
|
||||
CLASSES is the global register class state. */
|
||||
|
||||
static void
|
||||
rewrite_code_bb (basic_block bb, struct m_reg_class_desc *classes)
|
||||
{
|
||||
hsa_bb *hbb = hsa_bb_for_bb (bb);
|
||||
hsa_insn_basic *insn, *next_insn;
|
||||
|
||||
for (insn = hbb->m_first_insn; insn; insn = next_insn)
|
||||
{
|
||||
next_insn = insn->m_next;
|
||||
unsigned count = insn->operand_count ();
|
||||
for (unsigned i = 0; i < count; i++)
|
||||
{
|
||||
gcc_checking_assert (insn->get_op (i));
|
||||
hsa_op_reg **regaddr = insn_reg_addr (insn, i);
|
||||
|
||||
if (regaddr)
|
||||
{
|
||||
hsa_op_reg *reg = *regaddr;
|
||||
if (reg->m_reg_class)
|
||||
continue;
|
||||
gcc_assert (reg->m_spill_sym);
|
||||
|
||||
int cl = m_reg_class_for_type (reg->m_type);
|
||||
hsa_op_reg *tmp, *tmp2;
|
||||
if (insn->op_output_p (i))
|
||||
tmp = hsa_spill_out (insn, reg, &tmp2);
|
||||
else
|
||||
tmp = hsa_spill_in (insn, reg, &tmp2);
|
||||
|
||||
*regaddr = tmp;
|
||||
|
||||
tmp->m_reg_class = classes[cl].cl_char;
|
||||
tmp->m_hard_num = (char) (classes[cl].max_num + i);
|
||||
if (tmp2)
|
||||
{
|
||||
gcc_assert (cl == 0);
|
||||
tmp2->m_reg_class = classes[1].cl_char;
|
||||
tmp2->m_hard_num = (char) (classes[1].max_num + i);
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/* Dump current function to dump file F, with info specific
|
||||
to register allocation. */
|
||||
|
||||
void
|
||||
dump_hsa_cfun_regalloc (FILE *f)
|
||||
{
|
||||
basic_block bb;
|
||||
|
||||
fprintf (f, "\nHSAIL IL for %s\n", hsa_cfun->m_name);
|
||||
|
||||
FOR_ALL_BB_FN (bb, cfun)
|
||||
{
|
||||
hsa_bb *hbb = (struct hsa_bb *) bb->aux;
|
||||
bitmap_print (dump_file, hbb->m_livein, "m_livein ", "\n");
|
||||
dump_hsa_bb (f, hbb);
|
||||
bitmap_print (dump_file, hbb->m_liveout, "m_liveout ", "\n");
|
||||
}
|
||||
}
|
||||
|
||||
/* Given the global register allocation state CLASSES and a
|
||||
register REG, try to give it a hardware register. If successful,
|
||||
store that hardreg in REG and return it, otherwise return -1.
|
||||
Also changes CLASSES to accommodate for the allocated register. */
|
||||
|
||||
static int
|
||||
try_alloc_reg (struct m_reg_class_desc *classes, hsa_op_reg *reg)
|
||||
{
|
||||
int cl = m_reg_class_for_type (reg->m_type);
|
||||
int ret = -1;
|
||||
if (classes[1].used_num + classes[2].used_num * 2 + classes[3].used_num * 4
|
||||
>= 128 - 5)
|
||||
return -1;
|
||||
if (classes[cl].used_num < classes[cl].max_num)
|
||||
{
|
||||
unsigned int i;
|
||||
classes[cl].used_num++;
|
||||
if (classes[cl].used_num > classes[cl].max_used)
|
||||
classes[cl].max_used = classes[cl].used_num;
|
||||
for (i = 0; i < classes[cl].used_num; i++)
|
||||
if (! (classes[cl].used[i / 64] & (((uint64_t)1) << (i & 63))))
|
||||
break;
|
||||
ret = i;
|
||||
classes[cl].used[i / 64] |= (((uint64_t)1) << (i & 63));
|
||||
reg->m_reg_class = classes[cl].cl_char;
|
||||
reg->m_hard_num = i;
|
||||
}
|
||||
return ret;
|
||||
}
|
||||
|
||||
/* Free up hardregs used by REG, into allocation state CLASSES. */
|
||||
|
||||
static void
|
||||
free_reg (struct m_reg_class_desc *classes, hsa_op_reg *reg)
|
||||
{
|
||||
int cl = m_reg_class_for_type (reg->m_type);
|
||||
int ret = reg->m_hard_num;
|
||||
gcc_assert (reg->m_reg_class == classes[cl].cl_char);
|
||||
classes[cl].used_num--;
|
||||
classes[cl].used[ret / 64] &= ~(((uint64_t)1) << (ret & 63));
|
||||
}
|
||||
|
||||
/* Note that the live range for REG ends at least at END. */
|
||||
|
||||
static void
|
||||
note_lr_end (hsa_op_reg *reg, int end)
|
||||
{
|
||||
if (reg->m_lr_end < end)
|
||||
reg->m_lr_end = end;
|
||||
}
|
||||
|
||||
/* Note that the live range for REG starts at least at BEGIN. */
|
||||
|
||||
static void
|
||||
note_lr_begin (hsa_op_reg *reg, int begin)
|
||||
{
|
||||
if (reg->m_lr_begin > begin)
|
||||
reg->m_lr_begin = begin;
|
||||
}
|
||||
|
||||
/* Given two registers A and B, return -1, 0 or 1 if A's live range
|
||||
starts before, at or after B's live range. */
|
||||
|
||||
static int
|
||||
cmp_begin (const void *a, const void *b)
|
||||
{
|
||||
const hsa_op_reg * const *rega = (const hsa_op_reg * const *)a;
|
||||
const hsa_op_reg * const *regb = (const hsa_op_reg * const *)b;
|
||||
int ret;
|
||||
if (rega == regb)
|
||||
return 0;
|
||||
ret = (*rega)->m_lr_begin - (*regb)->m_lr_begin;
|
||||
if (ret)
|
||||
return ret;
|
||||
return ((*rega)->m_order - (*regb)->m_order);
|
||||
}
|
||||
|
||||
/* Given two registers REGA and REGB, return true if REGA's
|
||||
live range ends after REGB's. This results in a sorting order
|
||||
with earlier end points at the end. */
|
||||
|
||||
static bool
|
||||
cmp_end (hsa_op_reg * const &rega, hsa_op_reg * const &regb)
|
||||
{
|
||||
int ret;
|
||||
if (rega == regb)
|
||||
return false;
|
||||
ret = (regb)->m_lr_end - (rega)->m_lr_end;
|
||||
if (ret)
|
||||
return ret < 0;
|
||||
return (((regb)->m_order - (rega)->m_order)) < 0;
|
||||
}
|
||||
|
||||
/* Expire all old intervals in ACTIVE (a per-regclass vector),
|
||||
that is, those that end before the interval REG starts. Give
|
||||
back the resources so freed into the state CLASSES. */
|
||||
|
||||
static void
|
||||
expire_old_intervals (hsa_op_reg *reg, vec<hsa_op_reg*> *active,
|
||||
struct m_reg_class_desc *classes)
|
||||
{
|
||||
for (int i = 0; i < 4; i++)
|
||||
while (!active[i].is_empty ())
|
||||
{
|
||||
hsa_op_reg *a = active[i].pop ();
|
||||
if (a->m_lr_end > reg->m_lr_begin)
|
||||
{
|
||||
active[i].quick_push (a);
|
||||
break;
|
||||
}
|
||||
free_reg (classes, a);
|
||||
}
|
||||
}
|
||||
|
||||
/* The interval REG didn't get a hardreg. Spill it or one of those
|
||||
from ACTIVE (if the latter, then REG will become allocated to the
|
||||
hardreg that formerly was used by it). */
|
||||
|
||||
static void
|
||||
spill_at_interval (hsa_op_reg *reg, vec<hsa_op_reg*> *active)
|
||||
{
|
||||
int cl = m_reg_class_for_type (reg->m_type);
|
||||
gcc_assert (!active[cl].is_empty ());
|
||||
hsa_op_reg *cand = active[cl][0];
|
||||
if (cand->m_lr_end > reg->m_lr_end)
|
||||
{
|
||||
reg->m_reg_class = cand->m_reg_class;
|
||||
reg->m_hard_num = cand->m_hard_num;
|
||||
active[cl].ordered_remove (0);
|
||||
unsigned place = active[cl].lower_bound (reg, cmp_end);
|
||||
active[cl].quick_insert (place, reg);
|
||||
}
|
||||
else
|
||||
cand = reg;
|
||||
|
||||
gcc_assert (!cand->m_spill_sym);
|
||||
BrigType16_t type = cand->m_type;
|
||||
if (type == BRIG_TYPE_B1)
|
||||
type = BRIG_TYPE_U8;
|
||||
cand->m_reg_class = 0;
|
||||
cand->m_spill_sym = hsa_get_spill_symbol (type);
|
||||
cand->m_spill_sym->m_name_number = cand->m_order;
|
||||
}
|
||||
|
||||
/* Given the global register state CLASSES allocate all HSA virtual
|
||||
registers either to hardregs or to a spill symbol. */
|
||||
|
||||
static void
|
||||
linear_scan_regalloc (struct m_reg_class_desc *classes)
|
||||
{
|
||||
/* Compute liveness. */
|
||||
bool changed;
|
||||
int i, n;
|
||||
int insn_order;
|
||||
int *bbs = XNEWVEC (int, n_basic_blocks_for_fn (cfun));
|
||||
bitmap work = BITMAP_ALLOC (NULL);
|
||||
vec<hsa_op_reg*> ind2reg = vNULL;
|
||||
vec<hsa_op_reg*> active[4] = {vNULL, vNULL, vNULL, vNULL};
|
||||
hsa_insn_basic *m_last_insn;
|
||||
|
||||
/* We will need the reverse post order for linearization,
|
||||
and the post order for liveness analysis, which is the same
|
||||
backward. */
|
||||
n = pre_and_rev_post_order_compute (NULL, bbs, true);
|
||||
ind2reg.safe_grow_cleared (hsa_cfun->m_reg_count);
|
||||
|
||||
/* Give all instructions a linearized number, at the same time
|
||||
build a mapping from register index to register. */
|
||||
insn_order = 1;
|
||||
for (i = 0; i < n; i++)
|
||||
{
|
||||
basic_block bb = BASIC_BLOCK_FOR_FN (cfun, bbs[i]);
|
||||
hsa_bb *hbb = hsa_bb_for_bb (bb);
|
||||
hsa_insn_basic *insn;
|
||||
for (insn = hbb->m_first_insn; insn; insn = insn->m_next)
|
||||
{
|
||||
unsigned opi;
|
||||
insn->m_number = insn_order++;
|
||||
for (opi = 0; opi < insn->operand_count (); opi++)
|
||||
{
|
||||
gcc_checking_assert (insn->get_op (opi));
|
||||
hsa_op_reg **regaddr = insn_reg_addr (insn, opi);
|
||||
if (regaddr)
|
||||
ind2reg[(*regaddr)->m_order] = *regaddr;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/* Initialize all live ranges to [after-end, 0). */
|
||||
for (i = 0; i < hsa_cfun->m_reg_count; i++)
|
||||
if (ind2reg[i])
|
||||
ind2reg[i]->m_lr_begin = insn_order, ind2reg[i]->m_lr_end = 0;
|
||||
|
||||
/* Classic liveness analysis, as long as something changes:
|
||||
m_liveout is union (m_livein of successors)
|
||||
m_livein is m_liveout minus defs plus uses. */
|
||||
do
|
||||
{
|
||||
changed = false;
|
||||
for (i = n - 1; i >= 0; i--)
|
||||
{
|
||||
edge e;
|
||||
edge_iterator ei;
|
||||
basic_block bb = BASIC_BLOCK_FOR_FN (cfun, bbs[i]);
|
||||
hsa_bb *hbb = hsa_bb_for_bb (bb);
|
||||
|
||||
/* Union of successors m_livein (or empty if none). */
|
||||
bool first = true;
|
||||
FOR_EACH_EDGE (e, ei, bb->succs)
|
||||
if (e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun))
|
||||
{
|
||||
hsa_bb *succ = hsa_bb_for_bb (e->dest);
|
||||
if (first)
|
||||
{
|
||||
bitmap_copy (work, succ->m_livein);
|
||||
first = false;
|
||||
}
|
||||
else
|
||||
bitmap_ior_into (work, succ->m_livein);
|
||||
}
|
||||
if (first)
|
||||
bitmap_clear (work);
|
||||
|
||||
bitmap_copy (hbb->m_liveout, work);
|
||||
|
||||
/* Remove defs, include uses in a backward insn walk. */
|
||||
hsa_insn_basic *insn;
|
||||
for (insn = hbb->m_last_insn; insn; insn = insn->m_prev)
|
||||
{
|
||||
unsigned opi;
|
||||
unsigned ndefs = insn->input_count ();
|
||||
for (opi = 0; opi < ndefs && insn->get_op (opi); opi++)
|
||||
{
|
||||
gcc_checking_assert (insn->get_op (opi));
|
||||
hsa_op_reg **regaddr = insn_reg_addr (insn, opi);
|
||||
if (regaddr)
|
||||
bitmap_clear_bit (work, (*regaddr)->m_order);
|
||||
}
|
||||
for (; opi < insn->operand_count (); opi++)
|
||||
{
|
||||
gcc_checking_assert (insn->get_op (opi));
|
||||
hsa_op_reg **regaddr = insn_reg_addr (insn, opi);
|
||||
if (regaddr)
|
||||
bitmap_set_bit (work, (*regaddr)->m_order);
|
||||
}
|
||||
}
|
||||
|
||||
/* Note if that changed something. */
|
||||
if (bitmap_ior_into (hbb->m_livein, work))
|
||||
changed = true;
|
||||
}
|
||||
}
|
||||
while (changed);
|
||||
|
||||
/* Make one pass through all instructions in linear order,
|
||||
noting and merging possible live range start and end points. */
|
||||
m_last_insn = NULL;
|
||||
for (i = n - 1; i >= 0; i--)
|
||||
{
|
||||
basic_block bb = BASIC_BLOCK_FOR_FN (cfun, bbs[i]);
|
||||
hsa_bb *hbb = hsa_bb_for_bb (bb);
|
||||
hsa_insn_basic *insn;
|
||||
int after_end_number;
|
||||
unsigned bit;
|
||||
bitmap_iterator bi;
|
||||
|
||||
if (m_last_insn)
|
||||
after_end_number = m_last_insn->m_number;
|
||||
else
|
||||
after_end_number = insn_order;
|
||||
/* Everything live-out in this BB has at least an end point
|
||||
after us. */
|
||||
EXECUTE_IF_SET_IN_BITMAP (hbb->m_liveout, 0, bit, bi)
|
||||
note_lr_end (ind2reg[bit], after_end_number);
|
||||
|
||||
for (insn = hbb->m_last_insn; insn; insn = insn->m_prev)
|
||||
{
|
||||
unsigned opi;
|
||||
unsigned ndefs = insn->input_count ();
|
||||
for (opi = 0; opi < insn->operand_count (); opi++)
|
||||
{
|
||||
gcc_checking_assert (insn->get_op (opi));
|
||||
hsa_op_reg **regaddr = insn_reg_addr (insn, opi);
|
||||
if (regaddr)
|
||||
{
|
||||
hsa_op_reg *reg = *regaddr;
|
||||
if (opi < ndefs)
|
||||
note_lr_begin (reg, insn->m_number);
|
||||
else
|
||||
note_lr_end (reg, insn->m_number);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/* Everything live-in in this BB has a start point before
|
||||
our first insn. */
|
||||
int before_start_number;
|
||||
if (hbb->m_first_insn)
|
||||
before_start_number = hbb->m_first_insn->m_number;
|
||||
else
|
||||
before_start_number = after_end_number;
|
||||
before_start_number--;
|
||||
EXECUTE_IF_SET_IN_BITMAP (hbb->m_livein, 0, bit, bi)
|
||||
note_lr_begin (ind2reg[bit], before_start_number);
|
||||
|
||||
if (hbb->m_first_insn)
|
||||
m_last_insn = hbb->m_first_insn;
|
||||
}
|
||||
|
||||
for (i = 0; i < hsa_cfun->m_reg_count; i++)
|
||||
if (ind2reg[i])
|
||||
{
|
||||
/* All regs that have still their start at after all code actually
|
||||
are defined at the start of the routine (prologue). */
|
||||
if (ind2reg[i]->m_lr_begin == insn_order)
|
||||
ind2reg[i]->m_lr_begin = 0;
|
||||
/* All regs that have no use but a def will have lr_end == 0,
|
||||
they are actually live from def until after the insn they are
|
||||
defined in. */
|
||||
if (ind2reg[i]->m_lr_end == 0)
|
||||
ind2reg[i]->m_lr_end = ind2reg[i]->m_lr_begin + 1;
|
||||
}
|
||||
|
||||
/* Sort all intervals by increasing start point. */
|
||||
gcc_assert (ind2reg.length () == (size_t) hsa_cfun->m_reg_count);
|
||||
|
||||
#ifdef ENABLE_CHECKING
|
||||
for (unsigned i = 0; i < ind2reg.length (); i++)
|
||||
gcc_assert (ind2reg[i]);
|
||||
#endif
|
||||
|
||||
ind2reg.qsort (cmp_begin);
|
||||
for (i = 0; i < 4; i++)
|
||||
active[i].reserve_exact (hsa_cfun->m_reg_count);
|
||||
|
||||
/* Now comes the linear scan allocation. */
|
||||
for (i = 0; i < hsa_cfun->m_reg_count; i++)
|
||||
{
|
||||
hsa_op_reg *reg = ind2reg[i];
|
||||
if (!reg)
|
||||
continue;
|
||||
expire_old_intervals (reg, active, classes);
|
||||
int cl = m_reg_class_for_type (reg->m_type);
|
||||
if (try_alloc_reg (classes, reg) >= 0)
|
||||
{
|
||||
unsigned place = active[cl].lower_bound (reg, cmp_end);
|
||||
active[cl].quick_insert (place, reg);
|
||||
}
|
||||
else
|
||||
spill_at_interval (reg, active);
|
||||
|
||||
/* Some interesting dumping as we go. */
|
||||
if (dump_file)
|
||||
{
|
||||
fprintf (dump_file, " reg%d: [%5d, %5d)->",
|
||||
reg->m_order, reg->m_lr_begin, reg->m_lr_end);
|
||||
if (reg->m_reg_class)
|
||||
fprintf (dump_file, "$%c%i", reg->m_reg_class, reg->m_hard_num);
|
||||
else
|
||||
fprintf (dump_file, "[%%__%s_%i]",
|
||||
hsa_seg_name (reg->m_spill_sym->m_segment),
|
||||
reg->m_spill_sym->m_name_number);
|
||||
for (int cl = 0; cl < 4; cl++)
|
||||
{
|
||||
bool first = true;
|
||||
hsa_op_reg *r;
|
||||
fprintf (dump_file, " {");
|
||||
for (int j = 0; active[cl].iterate (j, &r); j++)
|
||||
if (first)
|
||||
{
|
||||
fprintf (dump_file, "%d", r->m_order);
|
||||
first = false;
|
||||
}
|
||||
else
|
||||
fprintf (dump_file, ", %d", r->m_order);
|
||||
fprintf (dump_file, "}");
|
||||
}
|
||||
fprintf (dump_file, "\n");
|
||||
}
|
||||
}
|
||||
|
||||
BITMAP_FREE (work);
|
||||
free (bbs);
|
||||
|
||||
if (dump_file)
|
||||
{
|
||||
fprintf (dump_file, "------- After liveness: -------\n");
|
||||
dump_hsa_cfun_regalloc (dump_file);
|
||||
fprintf (dump_file, " ----- Intervals:\n");
|
||||
for (i = 0; i < hsa_cfun->m_reg_count; i++)
|
||||
{
|
||||
hsa_op_reg *reg = ind2reg[i];
|
||||
if (!reg)
|
||||
continue;
|
||||
fprintf (dump_file, " reg%d: [%5d, %5d)->", reg->m_order,
|
||||
reg->m_lr_begin, reg->m_lr_end);
|
||||
if (reg->m_reg_class)
|
||||
fprintf (dump_file, "$%c%i\n", reg->m_reg_class, reg->m_hard_num);
|
||||
else
|
||||
fprintf (dump_file, "[%%__%s_%i]\n",
|
||||
hsa_seg_name (reg->m_spill_sym->m_segment),
|
||||
reg->m_spill_sym->m_name_number);
|
||||
}
|
||||
}
|
||||
|
||||
for (i = 0; i < 4; i++)
|
||||
active[i].release ();
|
||||
ind2reg.release ();
|
||||
}
|
||||
|
||||
/* Entry point for register allocation. */
|
||||
|
||||
static void
|
||||
regalloc (void)
|
||||
{
|
||||
basic_block bb;
|
||||
m_reg_class_desc classes[4];
|
||||
|
||||
/* If there are no registers used in the function, exit right away. */
|
||||
if (hsa_cfun->m_reg_count == 0)
|
||||
return;
|
||||
|
||||
memset (classes, 0, sizeof (classes));
|
||||
classes[0].next_avail = 0;
|
||||
classes[0].max_num = 7;
|
||||
classes[0].cl_char = 'c';
|
||||
classes[1].cl_char = 's';
|
||||
classes[2].cl_char = 'd';
|
||||
classes[3].cl_char = 'q';
|
||||
|
||||
for (int i = 1; i < 4; i++)
|
||||
{
|
||||
classes[i].next_avail = 0;
|
||||
classes[i].max_num = 20;
|
||||
}
|
||||
|
||||
linear_scan_regalloc (classes);
|
||||
|
||||
FOR_ALL_BB_FN (bb, cfun)
|
||||
rewrite_code_bb (bb, classes);
|
||||
}
|
||||
|
||||
/* Out of SSA and register allocation on HSAIL IL. */
|
||||
|
||||
void
|
||||
hsa_regalloc (void)
|
||||
{
|
||||
naive_outof_ssa ();
|
||||
|
||||
if (dump_file)
|
||||
{
|
||||
fprintf (dump_file, "------- After out-of-SSA: -------\n");
|
||||
dump_hsa_cfun (dump_file);
|
||||
}
|
||||
|
||||
regalloc ();
|
||||
|
||||
if (dump_file)
|
||||
{
|
||||
fprintf (dump_file, "------- After register allocation: -------\n");
|
||||
dump_hsa_cfun (dump_file);
|
||||
}
|
||||
}
|
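The slot bookkeeping done by try_alloc_reg and free_reg above can be looked at in isolation. What follows is a minimal, self-contained sketch and not the GCC code itself: reg_class_state, alloc_slot and free_slot are hypothetical names introduced here for illustration, and the HSA-specific check on the combined s/d/q register budget is deliberately left out.

```cpp
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-class allocation state; only the
   fields this sketch needs are kept.  */
struct reg_class_state
{
  unsigned used_num;   /* Number of slots currently allocated.  */
  unsigned max_num;    /* Capacity of this class (at most 128 here).  */
  uint64_t used[2];    /* One bit per slot.  */
};

/* Claim the lowest free slot in CL and return its index, or return -1
   when the class is already full.  */
static int
alloc_slot (reg_class_state &cl)
{
  if (cl.used_num >= cl.max_num)
    return -1;

  /* Fewer than max_num bits are set, so a clear bit exists below
     max_num and the loop always breaks.  */
  unsigned i;
  for (i = 0; i < cl.max_num; i++)
    if (!(cl.used[i / 64] & (((uint64_t) 1) << (i & 63))))
      break;

  cl.used[i / 64] |= ((uint64_t) 1) << (i & 63);
  cl.used_num++;
  return (int) i;
}

/* Give SLOT, previously returned by alloc_slot, back to CL.  */
static void
free_slot (reg_class_state &cl, int slot)
{
  assert (cl.used[slot / 64] & (((uint64_t) 1) << (slot & 63)));
  cl.used[slot / 64] &= ~(((uint64_t) 1) << (slot & 63));
  cl.used_num--;
}
```

With a class of capacity three, successive calls to alloc_slot hand out slots 0, 1 and 2 and then fail; freeing slot 1 makes it the next slot handed out, which is the reuse that expire_old_intervals relies on when it calls free_reg.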
947
gcc/hsa.c
Normal file
@ -0,0 +1,947 @@
|
||||
/* Implementation of commonly needed HSAIL related functions and methods.
|
||||
Copyright (C) 2013-2016 Free Software Foundation, Inc.
|
||||
Contributed by Martin Jambor <mjambor@suse.cz> and
|
||||
Martin Liska <mliska@suse.cz>.
|
||||
|
||||
This file is part of GCC.
|
||||
|
||||
GCC is free software; you can redistribute it and/or modify
|
||||
it under the terms of the GNU General Public License as published by
|
||||
the Free Software Foundation; either version 3, or (at your option)
|
||||
any later version.
|
||||
|
||||
GCC is distributed in the hope that it will be useful,
|
||||
but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||
GNU General Public License for more details.
|
||||
|
||||
You should have received a copy of the GNU General Public License
|
||||
along with GCC; see the file COPYING3. If not see
|
||||
<http://www.gnu.org/licenses/>. */
|
||||
|
||||
#include "config.h"
|
||||
#include "system.h"
|
||||
#include "coretypes.h"
|
||||
#include "tm.h"
|
||||
#include "is-a.h"
|
||||
#include "hash-set.h"
|
||||
#include "hash-map.h"
|
||||
#include "vec.h"
|
||||
#include "tree.h"
|
||||
#include "dumpfile.h"
|
||||
#include "gimple-pretty-print.h"
|
||||
#include "diagnostic-core.h"
|
||||
#include "alloc-pool.h"
|
||||
#include "cgraph.h"
|
||||
#include "print-tree.h"
|
||||
#include "stringpool.h"
|
||||
#include "symbol-summary.h"
|
||||
#include "hsa.h"
|
||||
#include "internal-fn.h"
|
||||
#include "ctype.h"
|
||||
|
||||
/* Structure containing intermediate HSA representation of the generated
|
||||
function. */
|
||||
class hsa_function_representation *hsa_cfun;
|
||||
|
||||
/* Element of the mapping vector between a host decl and an HSA kernel. */
|
||||
|
||||
struct GTY(()) hsa_decl_kernel_map_element
|
||||
{
|
||||
/* The decl of the host function. */
|
||||
tree decl;
|
||||
/* Name of the HSA kernel in BRIG. */
|
||||
char * GTY((skip)) name;
|
||||
/* Size of OMP data, if the kernel contains a kernel dispatch. */
|
||||
unsigned omp_data_size;
|
||||
/* True if the function is gridified kernel. */
|
||||
bool gridified_kernel_p;
|
||||
};
|
||||
|
||||
/* Mapping between decls and corresponding HSA kernels in this compilation
|
||||
unit. */
|
||||
|
||||
static GTY (()) vec<hsa_decl_kernel_map_element, va_gc>
|
||||
*hsa_decl_kernel_mapping;
|
||||
|
||||
/* Mapping between decls and corresponding HSA kernels
|
||||
called by the function. */
|
||||
hash_map <tree, vec <const char *> *> *hsa_decl_kernel_dependencies;
|
||||
|
||||
/* Hash function to lookup a symbol for a decl. */
|
||||
hash_table <hsa_noop_symbol_hasher> *hsa_global_variable_symbols;
|
||||
|
||||
/* HSA summaries. */
|
||||
hsa_summary_t *hsa_summaries = NULL;
|
||||
|
||||
/* HSA number of threads. */
|
||||
hsa_symbol *hsa_num_threads = NULL;
|
||||
|
||||
/* HSA function that cannot be expanded to HSAIL. */
|
||||
hash_set <tree> *hsa_failed_functions = NULL;
|
||||
|
||||
/* True if compilation unit-wide data are already allocated and initialized. */
|
||||
static bool compilation_unit_data_initialized;
|
||||
|
||||
/* Return true if FNDECL represents an HSA-callable function. */
|
||||
|
||||
bool
|
||||
hsa_callable_function_p (tree fndecl)
|
||||
{
|
||||
return (lookup_attribute ("omp declare target", DECL_ATTRIBUTES (fndecl))
|
||||
&& !lookup_attribute ("oacc function", DECL_ATTRIBUTES (fndecl)));
|
||||
}
|
||||
|
||||
/* Allocate HSA structures that are used when dealing with different
|
||||
functions. */
|
||||
|
||||
void
|
||||
hsa_init_compilation_unit_data (void)
|
||||
{
|
||||
if (compilation_unit_data_initialized)
|
||||
return;
|
||||
|
||||
compilation_unit_data_initialized = true;
|
||||
|
||||
hsa_global_variable_symbols = new hash_table <hsa_noop_symbol_hasher> (8);
|
||||
hsa_failed_functions = new hash_set <tree> ();
|
||||
hsa_emitted_internal_decls = new hash_table <hsa_internal_fn_hasher> (2);
|
||||
}
|
||||
|
||||
/* Free data structures that are used when dealing with different
|
||||
functions. */
|
||||
|
||||
void
|
||||
hsa_deinit_compilation_unit_data (void)
|
||||
{
|
||||
gcc_assert (compilation_unit_data_initialized);
|
||||
|
||||
delete hsa_failed_functions;
|
||||
delete hsa_emitted_internal_decls;
|
||||
|
||||
for (hash_table <hsa_noop_symbol_hasher>::iterator it
|
||||
= hsa_global_variable_symbols->begin ();
|
||||
it != hsa_global_variable_symbols->end ();
|
||||
++it)
|
||||
{
|
||||
hsa_symbol *sym = *it;
|
||||
delete sym;
|
||||
}
|
||||
|
||||
delete hsa_global_variable_symbols;
|
||||
|
||||
if (hsa_num_threads)
|
||||
{
|
||||
delete hsa_num_threads;
|
||||
hsa_num_threads = NULL;
|
||||
}
|
||||
|
||||
compilation_unit_data_initialized = false;
|
||||
}
|
||||
|
||||
/* Return true if we are generating large HSA machine model. */
|
||||
|
||||
bool
|
||||
hsa_machine_large_p (void)
|
||||
{
|
||||
/* FIXME: I suppose this is technically wrong but should work for me now. */
|
||||
return (GET_MODE_BITSIZE (Pmode) == 64);
|
||||
}
|
||||
|
||||
/* Return true if we are using the full HSA profile. */
|
||||
|
||||
bool
|
||||
hsa_full_profile_p (void)
|
||||
{
|
||||
return true;
|
||||
}
|
||||
|
||||
/* Return true if a register in operand number OPNUM of instruction
|
||||
is an output. False if it is an input. */
|
||||
|
||||
bool
|
||||
hsa_insn_basic::op_output_p (unsigned opnum)
|
||||
{
|
||||
switch (m_opcode)
|
||||
{
|
||||
case HSA_OPCODE_PHI:
|
||||
case BRIG_OPCODE_CBR:
|
||||
case BRIG_OPCODE_SBR:
|
||||
case BRIG_OPCODE_ST:
|
||||
case BRIG_OPCODE_SIGNALNORET:
|
||||
/* FIXME: There are probably missing cases here, double check. */
|
||||
return false;
|
||||
case BRIG_OPCODE_EXPAND:
|
||||
/* Example: expand_v4_b32_b128 (dest0, dest1, dest2, dest3), src0. */
|
||||
return opnum < operand_count () - 1;
|
||||
default:
|
||||
return opnum == 0;
|
||||
}
|
||||
}
|
||||
|
||||
/* Return true if OPCODE is a floating-point bit instruction opcode. */
|
||||
|
||||
bool
|
||||
hsa_opcode_floating_bit_insn_p (BrigOpcode16_t opcode)
|
||||
{
|
||||
switch (opcode)
|
||||
{
|
||||
case BRIG_OPCODE_NEG:
|
||||
case BRIG_OPCODE_ABS:
|
||||
case BRIG_OPCODE_CLASS:
|
||||
case BRIG_OPCODE_COPYSIGN:
|
||||
return true;
|
||||
default:
|
||||
return false;
|
||||
}
|
||||
}
|
||||
|
||||
/* Return the number of destination operands for this INSN. */
|
||||
|
||||
unsigned
|
||||
hsa_insn_basic::input_count ()
|
||||
{
|
||||
switch (m_opcode)
|
||||
{
|
||||
default:
|
||||
return 1;
|
||||
|
||||
case BRIG_OPCODE_NOP:
|
||||
return 0;
|
||||
|
||||
case BRIG_OPCODE_EXPAND:
|
||||
return 2;
|
||||
|
||||
case BRIG_OPCODE_LD:
|
||||
/* ld_v[234] not yet handled. */
|
||||
return 1;
|
||||
|
||||
case BRIG_OPCODE_ST:
|
||||
return 0;
|
||||
|
||||
case BRIG_OPCODE_ATOMICNORET:
|
||||
return 0;
|
||||
|
||||
case BRIG_OPCODE_SIGNAL:
|
||||
return 1;
|
||||
|
||||
case BRIG_OPCODE_SIGNALNORET:
|
||||
return 0;
|
||||
|
||||
case BRIG_OPCODE_MEMFENCE:
|
||||
return 0;
|
||||
|
||||
case BRIG_OPCODE_RDIMAGE:
|
||||
case BRIG_OPCODE_LDIMAGE:
|
||||
case BRIG_OPCODE_STIMAGE:
|
||||
case BRIG_OPCODE_QUERYIMAGE:
|
||||
case BRIG_OPCODE_QUERYSAMPLER:
|
||||
sorry ("HSA image ops not handled");
|
||||
return 0;
|
||||
|
||||
case BRIG_OPCODE_CBR:
|
||||
case BRIG_OPCODE_BR:
|
||||
return 0;
|
||||
|
||||
case BRIG_OPCODE_SBR:
|
||||
return 0; /* ??? */
|
||||
|
||||
case BRIG_OPCODE_WAVEBARRIER:
|
||||
return 0; /* ??? */
|
||||
|
||||
case BRIG_OPCODE_BARRIER:
|
||||
case BRIG_OPCODE_ARRIVEFBAR:
|
||||
case BRIG_OPCODE_INITFBAR:
|
||||
case BRIG_OPCODE_JOINFBAR:
|
||||
case BRIG_OPCODE_LEAVEFBAR:
|
||||
case BRIG_OPCODE_RELEASEFBAR:
|
||||
case BRIG_OPCODE_WAITFBAR:
|
||||
return 0;
|
||||
|
||||
case BRIG_OPCODE_LDF:
|
||||
return 1;
|
||||
|
||||
case BRIG_OPCODE_ACTIVELANECOUNT:
|
||||
case BRIG_OPCODE_ACTIVELANEID:
|
||||
case BRIG_OPCODE_ACTIVELANEMASK:
|
||||
case BRIG_OPCODE_ACTIVELANEPERMUTE:
|
||||
return 1; /* ??? */
|
||||
|
||||
case BRIG_OPCODE_CALL:
|
||||
case BRIG_OPCODE_SCALL:
|
||||
case BRIG_OPCODE_ICALL:
|
||||
return 0;
|
||||
|
||||
case BRIG_OPCODE_RET:
|
||||
return 0;
|
||||
|
||||
case BRIG_OPCODE_ALLOCA:
|
||||
return 1;
|
||||
|
||||
case BRIG_OPCODE_CLEARDETECTEXCEPT:
|
||||
return 0;
|
||||
|
||||
case BRIG_OPCODE_SETDETECTEXCEPT:
|
||||
return 0;
|
||||
|
||||
case BRIG_OPCODE_PACKETCOMPLETIONSIG:
|
||||
case BRIG_OPCODE_PACKETID:
|
||||
case BRIG_OPCODE_CASQUEUEWRITEINDEX:
|
||||
case BRIG_OPCODE_LDQUEUEREADINDEX:
|
||||
case BRIG_OPCODE_LDQUEUEWRITEINDEX:
|
||||
case BRIG_OPCODE_STQUEUEREADINDEX:
|
||||
case BRIG_OPCODE_STQUEUEWRITEINDEX:
|
||||
return 1; /* ??? */
|
||||
|
||||
case BRIG_OPCODE_ADDQUEUEWRITEINDEX:
|
||||
return 1;
|
||||
|
||||
case BRIG_OPCODE_DEBUGTRAP:
|
||||
return 0;
|
||||
|
||||
case BRIG_OPCODE_GROUPBASEPTR:
|
||||
case BRIG_OPCODE_KERNARGBASEPTR:
|
||||
return 1; /* ??? */
|
||||
|
||||
case HSA_OPCODE_ARG_BLOCK:
|
||||
return 0;
|
||||
|
||||
case BRIG_KIND_DIRECTIVE_COMMENT:
|
||||
return 0;
|
||||
}
|
||||
}
|
||||
|
||||
/* Return the number of source operands for this INSN. */
|
||||
|
||||
unsigned
|
||||
hsa_insn_basic::num_used_ops ()
|
||||
{
|
||||
gcc_checking_assert (input_count () <= operand_count ());
|
||||
|
||||
return operand_count () - input_count ();
|
||||
}
|
||||
|
||||
/* Set alignment to VALUE. */
|
||||
|
||||
void
|
||||
hsa_insn_mem::set_align (BrigAlignment8_t value)
|
||||
{
|
||||
/* TODO: Perhaps remove this dump later on: */
|
||||
if (dump_file && (dump_flags & TDF_DETAILS) && value < m_align)
|
||||
{
|
||||
fprintf (dump_file, "Decreasing alignment to %u in instruction ", value);
|
||||
dump_hsa_insn (dump_file, this);
|
||||
}
|
||||
m_align = value;
|
||||
}
|
||||
|
||||
/* Return size of HSA type T in bits. */
|
||||
|
||||
unsigned
|
||||
hsa_type_bit_size (BrigType16_t t)
|
||||
{
|
||||
switch (t)
|
||||
{
|
||||
case BRIG_TYPE_B1:
|
||||
return 1;
|
||||
|
||||
case BRIG_TYPE_U8:
|
||||
case BRIG_TYPE_S8:
|
||||
case BRIG_TYPE_B8:
|
||||
return 8;
|
||||
|
||||
case BRIG_TYPE_U16:
|
||||
case BRIG_TYPE_S16:
|
||||
case BRIG_TYPE_B16:
|
||||
case BRIG_TYPE_F16:
|
||||
return 16;
|
||||
|
||||
case BRIG_TYPE_U32:
|
||||
case BRIG_TYPE_S32:
|
||||
case BRIG_TYPE_B32:
|
||||
case BRIG_TYPE_F32:
|
||||
case BRIG_TYPE_U8X4:
|
||||
case BRIG_TYPE_U16X2:
|
||||
case BRIG_TYPE_S8X4:
|
||||
case BRIG_TYPE_S16X2:
|
||||
case BRIG_TYPE_F16X2:
|
||||
return 32;
|
||||
|
||||
case BRIG_TYPE_U64:
|
||||
case BRIG_TYPE_S64:
|
||||
case BRIG_TYPE_F64:
|
||||
case BRIG_TYPE_B64:
|
||||
case BRIG_TYPE_U8X8:
|
||||
case BRIG_TYPE_U16X4:
|
||||
case BRIG_TYPE_U32X2:
|
||||
case BRIG_TYPE_S8X8:
|
||||
case BRIG_TYPE_S16X4:
|
||||
case BRIG_TYPE_S32X2:
|
||||
case BRIG_TYPE_F16X4:
|
||||
case BRIG_TYPE_F32X2:
|
||||
|
||||
return 64;
|
||||
|
||||
case BRIG_TYPE_B128:
|
||||
case BRIG_TYPE_U8X16:
|
||||
case BRIG_TYPE_U16X8:
|
||||
case BRIG_TYPE_U32X4:
|
||||
case BRIG_TYPE_U64X2:
|
||||
case BRIG_TYPE_S8X16:
|
||||
case BRIG_TYPE_S16X8:
|
||||
case BRIG_TYPE_S32X4:
|
||||
case BRIG_TYPE_S64X2:
|
||||
case BRIG_TYPE_F16X8:
|
||||
case BRIG_TYPE_F32X4:
|
||||
case BRIG_TYPE_F64X2:
|
||||
return 128;
|
||||
|
||||
default:
|
||||
gcc_assert (hsa_seen_error ());
|
||||
return t;
|
||||
}
|
||||
}
|
||||
|
||||
/* Return BRIG bit-type with BITSIZE length. */
|
||||
|
||||
BrigType16_t
|
||||
hsa_bittype_for_bitsize (unsigned bitsize)
|
||||
{
|
||||
switch (bitsize)
|
||||
{
|
||||
case 1:
|
||||
return BRIG_TYPE_B1;
|
||||
case 8:
|
||||
return BRIG_TYPE_B8;
|
||||
case 16:
|
||||
return BRIG_TYPE_B16;
|
||||
case 32:
|
||||
return BRIG_TYPE_B32;
|
||||
case 64:
|
||||
return BRIG_TYPE_B64;
|
||||
case 128:
|
||||
return BRIG_TYPE_B128;
|
||||
default:
|
||||
gcc_unreachable ();
|
||||
}
|
||||
}
|
||||
|
||||
/* Return BRIG unsigned int type with BITSIZE length. */
|
||||
|
||||
BrigType16_t
|
||||
hsa_uint_for_bitsize (unsigned bitsize)
|
||||
{
|
||||
switch (bitsize)
|
||||
{
|
||||
case 8:
|
||||
return BRIG_TYPE_U8;
|
||||
case 16:
|
||||
return BRIG_TYPE_U16;
|
||||
case 32:
|
||||
return BRIG_TYPE_U32;
|
||||
case 64:
|
||||
return BRIG_TYPE_U64;
|
||||
default:
|
||||
gcc_unreachable ();
|
||||
}
|
||||
}
|
||||
|
||||
/* Return BRIG float type with BITSIZE length. */
|
||||
|
||||
BrigType16_t
|
||||
hsa_float_for_bitsize (unsigned bitsize)
|
||||
{
|
||||
switch (bitsize)
|
||||
{
|
||||
case 16:
|
||||
return BRIG_TYPE_F16;
|
||||
case 32:
|
||||
return BRIG_TYPE_F32;
|
||||
case 64:
|
||||
return BRIG_TYPE_F64;
|
||||
default:
|
||||
gcc_unreachable ();
|
||||
}
|
||||
}
|
||||
|
||||
/* Return HSA bit-type with the same size as the type T. */
|
||||
|
||||
BrigType16_t
|
||||
hsa_bittype_for_type (BrigType16_t t)
|
||||
{
|
||||
return hsa_bittype_for_bitsize (hsa_type_bit_size (t));
|
||||
}
|
||||
|
||||
/* Return true if and only if TYPE is a floating point number type. */
|
||||
|
||||
bool
|
||||
hsa_type_float_p (BrigType16_t type)
|
||||
{
|
||||
switch (type & BRIG_TYPE_BASE_MASK)
|
||||
{
|
||||
case BRIG_TYPE_F16:
|
||||
case BRIG_TYPE_F32:
|
||||
case BRIG_TYPE_F64:
|
||||
return true;
|
||||
default:
|
||||
return false;
|
||||
}
|
||||
}
|
||||
|
||||
/* Return true if and only if TYPE is an integer number type. */
|
||||
|
||||
bool
|
||||
hsa_type_integer_p (BrigType16_t type)
|
||||
{
|
||||
switch (type & BRIG_TYPE_BASE_MASK)
|
||||
{
|
||||
case BRIG_TYPE_U8:
|
||||
case BRIG_TYPE_U16:
|
||||
case BRIG_TYPE_U32:
|
||||
case BRIG_TYPE_U64:
|
||||
case BRIG_TYPE_S8:
|
||||
case BRIG_TYPE_S16:
|
||||
case BRIG_TYPE_S32:
|
||||
case BRIG_TYPE_S64:
|
||||
return true;
|
||||
default:
|
||||
return false;
|
||||
}
|
||||
}
|
||||
|
||||
/* Return true if and only if TYPE is a bit-type. */
|
||||
|
||||
bool
|
||||
hsa_btype_p (BrigType16_t type)
|
||||
{
|
||||
switch (type & BRIG_TYPE_BASE_MASK)
|
||||
{
|
||||
case BRIG_TYPE_B8:
|
||||
case BRIG_TYPE_B16:
|
||||
case BRIG_TYPE_B32:
|
||||
case BRIG_TYPE_B64:
|
||||
case BRIG_TYPE_B128:
|
||||
return true;
|
||||
default:
|
||||
return false;
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
/* Return the HSA alignment encoding for an alignment of N bits. */
|
||||
|
||||
BrigAlignment8_t
|
||||
hsa_alignment_encoding (unsigned n)
|
||||
{
|
||||
gcc_assert (n >= 8 && !(n & (n - 1)));
|
||||
if (n >= 256)
|
||||
return BRIG_ALIGNMENT_32;
|
||||
|
||||
switch (n)
|
||||
{
|
||||
case 8:
|
||||
return BRIG_ALIGNMENT_1;
|
||||
case 16:
|
||||
return BRIG_ALIGNMENT_2;
|
||||
case 32:
|
||||
return BRIG_ALIGNMENT_4;
|
||||
case 64:
|
||||
return BRIG_ALIGNMENT_8;
|
||||
case 128:
|
||||
return BRIG_ALIGNMENT_16;
|
||||
default:
|
||||
gcc_unreachable ();
|
||||
}
|
||||
}
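A minimal standalone sketch of the mapping above, assuming that each BRIG_ALIGNMENT_N value names an alignment of N bytes. The helper name alignment_bytes_sketch is hypothetical and not part of the patch; it returns the byte count instead of the BRIG enumerator so it can be compiled without the BRIG headers.

```cpp
#include <cassert>

// Hypothetical stand-in for hsa_alignment_encoding: takes an alignment in
// bits and returns the byte alignment that the chosen BRIG_ALIGNMENT_*
// value is assumed to name.
unsigned alignment_bytes_sketch(unsigned bits) {
  // Same precondition as the gcc_assert in the patch: at least one byte
  // and a power of two.
  assert(bits >= 8 && (bits & (bits - 1)) == 0);

  // The patch clamps everything from 256 bits upwards to BRIG_ALIGNMENT_32.
  if (bits >= 256)
    return 32;

  // 8 -> 1, 16 -> 2, 32 -> 4, 64 -> 8, 128 -> 16.
  return bits / 8;
}
```

The clamp means a request for 512-bit alignment is silently reduced to 32 bytes rather than rejected.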
|
||||
|
||||
/* Return natural alignment of HSA TYPE. */
|
||||
|
||||
BrigAlignment8_t
|
||||
hsa_natural_alignment (BrigType16_t type)
|
||||
{
|
||||
return hsa_alignment_encoding (hsa_type_bit_size (type & ~BRIG_TYPE_ARRAY));
|
||||
}
|
||||
|
||||
/* Call the correct destructor of an HSA instruction. */
|
||||
|
||||
void
|
||||
hsa_destroy_insn (hsa_insn_basic *insn)
|
||||
{
|
||||
if (hsa_insn_phi *phi = dyn_cast <hsa_insn_phi *> (insn))
|
||||
phi->~hsa_insn_phi ();
|
||||
else if (hsa_insn_br *br = dyn_cast <hsa_insn_br *> (insn))
|
||||
br->~hsa_insn_br ();
|
||||
else if (hsa_insn_cmp *cmp = dyn_cast <hsa_insn_cmp *> (insn))
|
||||
cmp->~hsa_insn_cmp ();
|
||||
else if (hsa_insn_mem *mem = dyn_cast <hsa_insn_mem *> (insn))
|
||||
mem->~hsa_insn_mem ();
|
||||
else if (hsa_insn_atomic *atomic = dyn_cast <hsa_insn_atomic *> (insn))
|
||||
atomic->~hsa_insn_atomic ();
|
||||
else if (hsa_insn_seg *seg = dyn_cast <hsa_insn_seg *> (insn))
|
||||
seg->~hsa_insn_seg ();
|
||||
else if (hsa_insn_call *call = dyn_cast <hsa_insn_call *> (insn))
|
||||
call->~hsa_insn_call ();
|
||||
else if (hsa_insn_arg_block *block = dyn_cast <hsa_insn_arg_block *> (insn))
|
||||
block->~hsa_insn_arg_block ();
|
||||
else if (hsa_insn_sbr *sbr = dyn_cast <hsa_insn_sbr *> (insn))
|
||||
sbr->~hsa_insn_sbr ();
|
||||
else if (hsa_insn_comment *comment = dyn_cast <hsa_insn_comment *> (insn))
|
||||
comment->~hsa_insn_comment ();
|
||||
else
|
||||
insn->~hsa_insn_basic ();
|
||||
}
|
||||
|
||||
/* Call the correct destructor of an HSA operand. */
|
||||
|
||||
void
|
||||
hsa_destroy_operand (hsa_op_base *op)
|
||||
{
|
||||
if (hsa_op_code_list *list = dyn_cast <hsa_op_code_list *> (op))
|
||||
list->~hsa_op_code_list ();
|
||||
else if (hsa_op_operand_list *list = dyn_cast <hsa_op_operand_list *> (op))
|
||||
list->~hsa_op_operand_list ();
|
||||
else if (hsa_op_reg *reg = dyn_cast <hsa_op_reg *> (op))
|
||||
reg->~hsa_op_reg ();
|
||||
else if (hsa_op_immed *immed = dyn_cast <hsa_op_immed *> (op))
|
||||
immed->~hsa_op_immed ();
|
||||
else
|
||||
op->~hsa_op_base ();
|
||||
}
|
||||
|
||||
/* Create a mapping between the original function DECL and kernel name NAME. */
|
||||
|
||||
void
|
||||
hsa_add_kern_decl_mapping (tree decl, char *name, unsigned omp_data_size,
|
||||
bool gridified_kernel_p)
|
||||
{
|
||||
hsa_decl_kernel_map_element dkm;
|
||||
dkm.decl = decl;
|
||||
dkm.name = name;
|
||||
dkm.omp_data_size = omp_data_size;
|
||||
dkm.gridified_kernel_p = gridified_kernel_p;
|
||||
vec_safe_push (hsa_decl_kernel_mapping, dkm);
|
||||
}
|
||||
|
||||
/* Return the number of kernel decl name mappings. */
|
||||
|
||||
unsigned
|
||||
hsa_get_number_decl_kernel_mappings (void)
|
||||
{
|
||||
return vec_safe_length (hsa_decl_kernel_mapping);
|
||||
}
|
||||
|
||||
/* Return the decl in the Ith kernel decl name mapping. */
|
||||
|
||||
tree
|
||||
hsa_get_decl_kernel_mapping_decl (unsigned i)
|
||||
{
|
||||
return (*hsa_decl_kernel_mapping)[i].decl;
|
||||
}
|
||||
|
||||
/* Return the name in the Ith kernel decl name mapping. */
|
||||
|
||||
char *
|
||||
hsa_get_decl_kernel_mapping_name (unsigned i)
|
||||
{
|
||||
return (*hsa_decl_kernel_mapping)[i].name;
|
||||
}
|
||||
|
||||
/* Return maximum OMP size for kernel decl name mapping. */
|
||||
|
||||
unsigned
|
||||
hsa_get_decl_kernel_mapping_omp_size (unsigned i)
|
||||
{
|
||||
return (*hsa_decl_kernel_mapping)[i].omp_data_size;
|
||||
}
|
||||
|
||||
/* Return true if the function in the Ith kernel decl name mapping is a gridified kernel. */
|
||||
|
||||
bool
|
||||
hsa_get_decl_kernel_mapping_gridified (unsigned i)
|
||||
{
|
||||
return (*hsa_decl_kernel_mapping)[i].gridified_kernel_p;
|
||||
}
|
||||
|
||||
/* Free the mapping between original decls and kernel names. */
|
||||
|
||||
void
|
||||
hsa_free_decl_kernel_mapping (void)
|
||||
{
|
||||
if (hsa_decl_kernel_mapping == NULL)
|
||||
return;
|
||||
|
||||
for (unsigned i = 0; i < hsa_decl_kernel_mapping->length (); ++i)
|
||||
free ((*hsa_decl_kernel_mapping)[i].name);
|
||||
ggc_free (hsa_decl_kernel_mapping);
|
||||
}
|
||||
|
||||
/* Add new kernel dependency. */
|
||||
|
||||
void
|
||||
hsa_add_kernel_dependency (tree caller, const char *called_function)
|
||||
{
|
||||
if (hsa_decl_kernel_dependencies == NULL)
|
||||
hsa_decl_kernel_dependencies = new hash_map<tree, vec<const char *> *> ();
|
||||
|
||||
vec <const char *> *s = NULL;
|
||||
vec <const char *> **slot = hsa_decl_kernel_dependencies->get (caller);
|
||||
if (slot == NULL)
|
||||
{
|
||||
s = new vec <const char *> ();
|
||||
hsa_decl_kernel_dependencies->put (caller, s);
|
||||
}
|
||||
else
|
||||
s = *slot;
|
||||
|
||||
s->safe_push (called_function);
|
||||
}
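The get-or-create pattern in hsa_add_kernel_dependency can be sketched with standard containers. This is an illustration only: DependencyMapSketch and add_kernel_dependency_sketch are hypothetical names, and std::map with std::string keys stands in for GCC's hash_map keyed by tree.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for hsa_decl_kernel_dependencies: caller name ->
// list of called function names.
using DependencyMapSketch = std::map<std::string, std::vector<std::string>>;

// Record that CALLER depends on CALLED. operator[] creates an empty vector
// the first time a caller is seen, which mirrors the "slot == NULL" branch
// in the patch.
void add_kernel_dependency_sketch(DependencyMapSketch &deps,
                                  const std::string &caller,
                                  const std::string &called) {
  deps[caller].push_back(called);
}
```

Calling the helper twice with the same caller leaves one map entry holding two callees, in insertion order.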
|
||||
|
||||
/* Modify the name P in-place so that it is a valid HSA identifier. */
|
||||
|
||||
void
|
||||
hsa_sanitize_name (char *p)
|
||||
{
|
||||
for (; *p; p++)
|
||||
if (*p == '.' || *p == '-')
|
||||
*p = '_';
|
||||
}
|
||||
|
||||
/* Clone the name P, prepend a leading ampersand and sanitize the name. */
|
||||
|
||||
char *
|
||||
hsa_brig_function_name (const char *p)
|
||||
{
|
||||
unsigned len = strlen (p);
|
||||
char *buf = XNEWVEC (char, len + 2);
|
||||
|
||||
buf[0] = '&';
|
||||
buf[len + 1] = '\0';
|
||||
memcpy (buf + 1, p, len);
|
||||
|
||||
hsa_sanitize_name (buf);
|
||||
return buf;
|
||||
}
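The two naming helpers above can be illustrated in isolation. The sketch below uses std::string instead of the manually allocated buffer in the patch; the function names are hypothetical and chosen only for this example.

```cpp
#include <string>

// Hypothetical stand-in for hsa_sanitize_name: replace every '.' and '-'
// with '_' so the result is usable as an HSA identifier.
std::string sanitize_name_sketch(std::string name) {
  for (char &c : name)
    if (c == '.' || c == '-')
      c = '_';
  return name;
}

// Hypothetical stand-in for hsa_brig_function_name: prepend '&' and then
// sanitize the whole string. The '&' itself is not touched by sanitizing.
std::string brig_function_name_sketch(const std::string &name) {
  return sanitize_name_sketch("&" + name);
}
```

For instance, a clone named foo.hsa.0 would come out as &foo_hsa_0 under this sketch.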
|
||||
|
||||
/* Return the name of DECL; if DECL has no name, generate one from its UID. */
|
||||
|
||||
const char *
|
||||
hsa_get_declaration_name (tree decl)
|
||||
{
|
||||
if (!DECL_NAME (decl))
|
||||
{
|
||||
char buf[64];
|
||||
snprintf (buf, 64, "__hsa_anonymous_%i", DECL_UID (decl));
|
||||
const char *ggc_str = ggc_strdup (buf);
|
||||
return ggc_str;
|
||||
}
|
||||
|
||||
tree name_tree;
|
||||
if (TREE_CODE (decl) == FUNCTION_DECL
|
||||
|| (TREE_CODE (decl) == VAR_DECL && is_global_var (decl)))
|
||||
name_tree = DECL_ASSEMBLER_NAME (decl);
|
||||
else
|
||||
name_tree = DECL_NAME (decl);
|
||||
|
||||
const char *name = IDENTIFIER_POINTER (name_tree);
|
||||
/* User-defined assembly names have a prepended asterisk symbol. */
|
||||
if (name[0] == '*')
|
||||
name++;
|
||||
|
||||
return name;
|
||||
}
|
||||
|
||||
void
|
||||
hsa_summary_t::link_functions (cgraph_node *gpu, cgraph_node *host,
|
||||
hsa_function_kind kind, bool gridified_kernel_p)
|
||||
{
|
||||
hsa_function_summary *gpu_summary = get (gpu);
|
||||
hsa_function_summary *host_summary = get (host);
|
||||
|
||||
gpu_summary->m_kind = kind;
|
||||
host_summary->m_kind = kind;
|
||||
|
||||
gpu_summary->m_gpu_implementation_p = true;
|
||||
host_summary->m_gpu_implementation_p = false;
|
||||
|
||||
gpu_summary->m_gridified_kernel_p = gridified_kernel_p;
|
||||
host_summary->m_gridified_kernel_p = gridified_kernel_p;
|
||||
|
||||
gpu_summary->m_binded_function = host;
|
||||
host_summary->m_binded_function = gpu;
|
||||
|
||||
tree gdecl = gpu->decl;
|
||||
DECL_ATTRIBUTES (gdecl)
|
||||
= tree_cons (get_identifier ("flatten"), NULL_TREE,
|
||||
DECL_ATTRIBUTES (gdecl));
|
||||
|
||||
tree fn_opts = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (gdecl);
|
||||
if (fn_opts == NULL_TREE)
|
||||
fn_opts = optimization_default_node;
|
||||
fn_opts = copy_node (fn_opts);
|
||||
TREE_OPTIMIZATION (fn_opts)->x_flag_tree_loop_vectorize = false;
|
||||
TREE_OPTIMIZATION (fn_opts)->x_flag_tree_slp_vectorize = false;
|
||||
DECL_FUNCTION_SPECIFIC_OPTIMIZATION (gdecl) = fn_opts;
|
||||
}
|
||||
|
||||
/* Add a HOST function to HSA summaries. */
|
||||
|
||||
void
|
||||
hsa_register_kernel (cgraph_node *host)
|
||||
{
|
||||
if (hsa_summaries == NULL)
|
||||
hsa_summaries = new hsa_summary_t (symtab);
|
||||
hsa_function_summary *s = hsa_summaries->get (host);
|
||||
s->m_kind = HSA_KERNEL;
|
||||
}
|
||||
|
||||
/* Add a pair of functions to HSA summaries. GPU is an HSA implementation of
|
||||
a HOST function. */
|
||||
|
||||
void
|
||||
hsa_register_kernel (cgraph_node *gpu, cgraph_node *host)
|
||||
{
|
||||
if (hsa_summaries == NULL)
|
||||
hsa_summaries = new hsa_summary_t (symtab);
|
||||
hsa_summaries->link_functions (gpu, host, HSA_KERNEL, true);
|
||||
}
|
||||
|
||||
/* Return true if expansion of the current HSA function has already failed. */
|
||||
|
||||
bool
|
||||
hsa_seen_error (void)
|
||||
{
|
||||
return hsa_cfun->m_seen_error;
|
||||
}
|
||||
|
||||
/* Mark current HSA function as failed. */
|
||||
|
||||
void
|
||||
hsa_fail_cfun (void)
|
||||
{
|
||||
hsa_failed_functions->add (hsa_cfun->m_decl);
|
||||
hsa_cfun->m_seen_error = true;
|
||||
}
|
||||
|
||||
char *
|
||||
hsa_internal_fn::name ()
|
||||
{
|
||||
char *name = xstrdup (internal_fn_name (m_fn));
|
||||
for (char *ptr = name; *ptr; ptr++)
|
||||
*ptr = TOLOWER (*ptr);
|
||||
|
||||
const char *suffix = NULL;
|
||||
if (m_type_bit_size == 32)
|
||||
suffix = "f";
|
||||
|
||||
if (suffix)
|
||||
{
|
||||
char *name2 = concat (name, suffix, NULL);
|
||||
free (name);
|
||||
name = name2;
|
||||
}
|
||||
|
||||
hsa_sanitize_name (name);
|
||||
return name;
|
||||
}
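As a rough illustration of the naming scheme in hsa_internal_fn::name, the sketch below lowercases an internal function name and appends "f" for the 32-bit variant. It assumes the input is an upper-case name such as "SQRT"; internal_fn_name_sketch is a hypothetical helper, not a GCC function.

```cpp
#include <cctype>
#include <string>

// Hypothetical stand-in for hsa_internal_fn::name.
std::string internal_fn_name_sketch(const std::string &upper_name,
                                    unsigned type_bit_size) {
  std::string name;
  // Lowercase the name, as the TOLOWER loop in the patch does.
  for (unsigned char c : upper_name)
    name += static_cast<char>(std::tolower(c));

  // Only the 32-bit variant gets a suffix; other sizes keep the bare name.
  if (type_bit_size == 32)
    name += "f";

  // Same replacement as hsa_sanitize_name: '.' and '-' become '_'.
  for (char &c : name)
    if (c == '.' || c == '-')
      c = '_';
  return name;
}
```

Under this sketch a 32-bit square root maps to sqrtf and a 64-bit one to sqrt, matching the usual C math library naming.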
|
||||
|
||||
unsigned
|
||||
hsa_internal_fn::get_arity ()
|
||||
{
|
||||
switch (m_fn)
|
||||
{
|
||||
case IFN_ACOS:
|
||||
case IFN_ASIN:
|
||||
case IFN_ATAN:
|
||||
case IFN_COS:
|
||||
case IFN_EXP:
|
||||
case IFN_EXP10:
|
||||
case IFN_EXP2:
|
||||
case IFN_EXPM1:
|
||||
case IFN_LOG:
|
||||
case IFN_LOG10:
|
||||
case IFN_LOG1P:
|
||||
case IFN_LOG2:
|
||||
case IFN_LOGB:
|
||||
case IFN_SIGNIFICAND:
|
||||
case IFN_SIN:
|
||||
case IFN_SQRT:
|
||||
case IFN_TAN:
|
||||
case IFN_CEIL:
|
||||
case IFN_FLOOR:
|
||||
case IFN_NEARBYINT:
|
||||
case IFN_RINT:
|
||||
case IFN_ROUND:
|
||||
case IFN_TRUNC:
|
||||
return 1;
|
||||
case IFN_ATAN2:
|
||||
case IFN_COPYSIGN:
|
||||
case IFN_FMOD:
|
||||
case IFN_POW:
|
||||
case IFN_REMAINDER:
|
||||
case IFN_SCALB:
|
||||
case IFN_LDEXP:
|
||||
return 2;
|
||||
break;
|
||||
case IFN_CLRSB:
|
||||
case IFN_CLZ:
|
||||
case IFN_CTZ:
|
||||
case IFN_FFS:
|
||||
case IFN_PARITY:
|
||||
case IFN_POPCOUNT:
|
||||
default:
|
||||
/* As we produce a sorry message for unknown internal functions,
|
||||
reaching this label is definitely a bug. */
|
||||
gcc_unreachable ();
|
||||
}
|
||||
}
|
||||
|
||||
BrigType16_t
|
||||
hsa_internal_fn::get_argument_type (int n)
|
||||
{
|
||||
switch (m_fn)
|
||||
{
|
||||
case IFN_ACOS:
|
||||
case IFN_ASIN:
|
||||
case IFN_ATAN:
|
||||
case IFN_COS:
|
||||
case IFN_EXP:
|
||||
case IFN_EXP10:
|
||||
case IFN_EXP2:
|
||||
case IFN_EXPM1:
|
||||
case IFN_LOG:
|
||||
case IFN_LOG10:
|
||||
case IFN_LOG1P:
|
||||
case IFN_LOG2:
|
||||
case IFN_LOGB:
|
||||
case IFN_SIGNIFICAND:
|
||||
case IFN_SIN:
|
||||
case IFN_SQRT:
|
||||
case IFN_TAN:
|
||||
case IFN_CEIL:
|
||||
case IFN_FLOOR:
|
||||
case IFN_NEARBYINT:
|
||||
case IFN_RINT:
|
||||
case IFN_ROUND:
|
||||
case IFN_TRUNC:
|
||||
case IFN_ATAN2:
|
||||
case IFN_COPYSIGN:
|
||||
case IFN_FMOD:
|
||||
case IFN_POW:
|
||||
case IFN_REMAINDER:
|
||||
case IFN_SCALB:
|
||||
return hsa_float_for_bitsize (m_type_bit_size);
|
||||
case IFN_LDEXP:
|
||||
{
|
||||
if (n == -1 || n == 0)
|
||||
return hsa_float_for_bitsize (m_type_bit_size);
|
||||
else
|
||||
return BRIG_TYPE_S32;
|
||||
}
|
||||
default:
|
||||
/* As we produce a sorry message for unknown internal functions,
|
||||
reaching this label is definitely a bug. */
|
||||
gcc_unreachable ();
|
||||
}
|
||||
}
|
||||
|
||||
#include "gt-hsa.h"
|
331
gcc/ipa-hsa.c
Normal file
@ -0,0 +1,331 @@
|
||||
/* Interprocedural pass that creates HSA clones.
|
||||
Copyright (C) 2015-2016 Free Software Foundation, Inc.
|
||||
Contributed by Martin Liska <mliska@suse.cz>
|
||||
|
||||
This file is part of GCC.
|
||||
|
||||
GCC is free software; you can redistribute it and/or modify it under
|
||||
the terms of the GNU General Public License as published by the Free
|
||||
Software Foundation; either version 3, or (at your option) any later
|
||||
version.
|
||||
|
||||
GCC is distributed in the hope that it will be useful, but WITHOUT ANY
|
||||
WARRANTY; without even the implied warranty of MERCHANTABILITY or
|
||||
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
|
||||
for more details.
|
||||
|
||||
You should have received a copy of the GNU General Public License
|
||||
along with GCC; see the file COPYING3. If not see
|
||||
<http://www.gnu.org/licenses/>. */
|
||||
|
||||
/* Interprocedural HSA pass is responsible for creation of HSA clones.
|
||||
For all these HSA clones, we emit HSAIL instructions and pass processing
|
||||
is terminated. */
|
||||
|
||||
#include "config.h"
|
||||
#include "system.h"
|
||||
#include "coretypes.h"
|
||||
#include "tm.h"
|
||||
#include "is-a.h"
|
||||
#include "hash-set.h"
|
||||
#include "vec.h"
|
||||
#include "tree.h"
|
||||
#include "tree-pass.h"
|
||||
#include "function.h"
|
||||
#include "basic-block.h"
|
||||
#include "gimple.h"
|
||||
#include "dumpfile.h"
|
||||
#include "gimple-pretty-print.h"
|
||||
#include "tree-streamer.h"
|
||||
#include "stringpool.h"
|
||||
#include "cgraph.h"
|
||||
#include "print-tree.h"
|
||||
#include "symbol-summary.h"
|
||||
#include "hsa.h"
|
||||
|
||||
namespace {
|
||||
|
||||
/* If NODE is not versionable, warn about not emitting HSAIL and return false.
|
||||
Otherwise return true. */
|
||||
|
||||
static bool
|
||||
check_warn_node_versionable (cgraph_node *node)
|
||||
{
|
||||
if (!node->local.versionable)
|
||||
{
|
||||
warning_at (EXPR_LOCATION (node->decl), OPT_Whsa,
|
||||
"could not emit HSAIL for function %s: function cannot be "
|
||||
"cloned", node->name ());
|
||||
return false;
|
||||
}
|
||||
return true;
|
||||
}
|
||||
|
||||
/* The function creates HSA clones for all functions that were either
|
||||
marked as HSA kernels or are callable HSA functions. Apart from that,
|
||||
we redirect all edges that come from an HSA clone and end in another
|
||||
HSA clone to connect these two functions. */
|
||||
|
||||
static unsigned int
|
||||
process_hsa_functions (void)
|
||||
{
|
||||
struct cgraph_node *node;
|
||||
|
||||
if (hsa_summaries == NULL)
|
||||
hsa_summaries = new hsa_summary_t (symtab);
|
||||
|
||||
FOR_EACH_DEFINED_FUNCTION (node)
|
||||
{
|
||||
hsa_function_summary *s = hsa_summaries->get (node);
|
||||
|
||||
/* A linked function is skipped. */
|
||||
if (s->m_binded_function != NULL)
|
||||
continue;
|
||||
|
||||
if (s->m_kind != HSA_NONE)
|
||||
{
|
||||
if (!check_warn_node_versionable (node))
|
||||
continue;
|
||||
cgraph_node *clone
|
||||
= node->create_virtual_clone (vec <cgraph_edge *> (),
|
||||
NULL, NULL, "hsa");
|
||||
TREE_PUBLIC (clone->decl) = TREE_PUBLIC (node->decl);
|
||||
|
||||
clone->force_output = true;
|
||||
hsa_summaries->link_functions (clone, node, s->m_kind, false);
|
||||
|
||||
if (dump_file)
|
||||
fprintf (dump_file, "Created a new HSA clone: %s, type: %s\n",
|
||||
clone->name (),
|
||||
s->m_kind == HSA_KERNEL ? "kernel" : "function");
|
||||
}
|
||||
else if (hsa_callable_function_p (node->decl))
|
||||
{
|
||||
if (!check_warn_node_versionable (node))
|
||||
continue;
|
||||
cgraph_node *clone
|
||||
= node->create_virtual_clone (vec <cgraph_edge *> (),
|
||||
NULL, NULL, "hsa");
|
||||
TREE_PUBLIC (clone->decl) = TREE_PUBLIC (node->decl);
|
||||
|
||||
if (!cgraph_local_p (node))
|
||||
clone->force_output = true;
|
||||
hsa_summaries->link_functions (clone, node, HSA_FUNCTION, false);
|
||||
|
||||
if (dump_file)
|
||||
fprintf (dump_file, "Created a new HSA function clone: %s\n",
|
||||
clone->name ());
|
||||
}
|
||||
}
|
||||
|
||||
/* Redirect all edges that are between HSA clones. */
|
||||
FOR_EACH_DEFINED_FUNCTION (node)
|
||||
{
|
||||
cgraph_edge *e = node->callees;
|
||||
|
||||
while (e)
|
||||
{
|
||||
hsa_function_summary *src = hsa_summaries->get (node);
|
||||
if (src->m_kind != HSA_NONE && src->m_gpu_implementation_p)
|
||||
{
|
||||
hsa_function_summary *dst = hsa_summaries->get (e->callee);
|
||||
if (dst->m_kind != HSA_NONE && !dst->m_gpu_implementation_p)
|
||||
{
|
||||
e->redirect_callee (dst->m_binded_function);
|
||||
if (dump_file)
|
||||
fprintf (dump_file,
|
||||
"Redirecting edge to HSA function: %s->%s\n",
|
||||
xstrdup_for_dump (e->caller->name ()),
|
||||
xstrdup_for_dump (e->callee->name ()));
|
||||
}
|
||||
}
|
||||
|
||||
e = e->next_callee;
|
||||
}
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
/* Iterate all HSA functions and stream out HSA function summary. */
|
||||
|
||||
static void
|
||||
ipa_hsa_write_summary (void)
|
||||
{
|
||||
struct bitpack_d bp;
|
||||
struct cgraph_node *node;
|
||||
struct output_block *ob;
|
||||
unsigned int count = 0;
|
||||
lto_symtab_encoder_iterator lsei;
|
||||
lto_symtab_encoder_t encoder;
|
||||
|
||||
if (!hsa_summaries)
|
||||
return;
|
||||
|
||||
ob = create_output_block (LTO_section_ipa_hsa);
|
||||
encoder = ob->decl_state->symtab_node_encoder;
|
||||
ob->symbol = NULL;
|
||||
for (lsei = lsei_start_function_in_partition (encoder); !lsei_end_p (lsei);
|
||||
lsei_next_function_in_partition (&lsei))
|
||||
{
|
||||
node = lsei_cgraph_node (lsei);
|
||||
hsa_function_summary *s = hsa_summaries->get (node);
|
||||
|
||||
if (s->m_kind != HSA_NONE)
|
||||
count++;
|
||||
}
|
||||
|
||||
streamer_write_uhwi (ob, count);
|
||||
|
||||
/* Process all of the functions. */
|
||||
for (lsei = lsei_start_function_in_partition (encoder); !lsei_end_p (lsei);
|
||||
lsei_next_function_in_partition (&lsei))
|
||||
{
|
||||
node = lsei_cgraph_node (lsei);
|
||||
hsa_function_summary *s = hsa_summaries->get (node);
|
||||
|
||||
if (s->m_kind != HSA_NONE)
|
||||
{
|
||||
encoder = ob->decl_state->symtab_node_encoder;
|
||||
int node_ref = lto_symtab_encoder_encode (encoder, node);
|
||||
streamer_write_uhwi (ob, node_ref);
|
||||
|
||||
bp = bitpack_create (ob->main_stream);
|
||||
bp_pack_value (&bp, s->m_kind, 2);
|
||||
bp_pack_value (&bp, s->m_gpu_implementation_p, 1);
|
||||
bp_pack_value (&bp, s->m_binded_function != NULL, 1);
|
||||
streamer_write_bitpack (&bp);
|
||||
if (s->m_binded_function)
|
||||
stream_write_tree (ob, s->m_binded_function->decl, true);
|
||||
}
|
||||
}
|
||||
|
||||
streamer_write_char_stream (ob->main_stream, 0);
|
||||
produce_asm (ob, NULL);
|
||||
destroy_output_block (ob);
|
||||
}
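The per-function record written by ipa_hsa_write_summary packs three fields into four bits: two for the kind, one for the GPU-implementation flag and one for whether a linked function follows. A minimal sketch of that packing and the matching unpacking is given below. The struct, the function names and the least-significant-bit-first layout are assumptions made for illustration; the real code goes through GCC's bitpack_d helpers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mirror of the streamed fields of hsa_function_summary.
struct SummaryBitsSketch {
  unsigned kind;            // fits in 2 bits, like m_kind
  bool gpu_implementation;  // like m_gpu_implementation_p
  bool has_linked_function; // like m_binded_function != NULL
};

// Pack the fields in the order the patch writes them (assumed LSB first).
std::uint8_t pack_summary_sketch(const SummaryBitsSketch &s) {
  assert(s.kind < 4); // must fit in the 2-bit field
  return static_cast<std::uint8_t>(s.kind
                                   | (s.gpu_implementation << 2)
                                   | (s.has_linked_function << 3));
}

// Unpack in the same order, as the reader side must.
SummaryBitsSketch unpack_summary_sketch(std::uint8_t bits) {
  SummaryBitsSketch s;
  s.kind = bits & 0x3;
  s.gpu_implementation = (bits >> 2) & 0x1;
  s.has_linked_function = (bits >> 3) & 0x1;
  return s;
}
```

The reader has to consume the fields with the same widths and in the same order as the writer, which is why ipa_hsa_read_section unpacks 2, 1 and 1 bits in sequence.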
|
||||
|
||||
/* Read section in file FILE_DATA of length LEN with data DATA. */
|
||||
|
||||
static void
|
||||
ipa_hsa_read_section (struct lto_file_decl_data *file_data, const char *data,
|
||||
size_t len)
|
||||
{
|
||||
const struct lto_function_header *header
|
||||
= (const struct lto_function_header *) data;
|
||||
const int cfg_offset = sizeof (struct lto_function_header);
|
||||
const int main_offset = cfg_offset + header->cfg_size;
|
||||
const int string_offset = main_offset + header->main_size;
|
||||
struct data_in *data_in;
|
||||
unsigned int i;
|
||||
unsigned int count;
|
||||
|
||||
lto_input_block ib_main ((const char *) data + main_offset,
|
||||
header->main_size, file_data->mode_table);
|
||||
|
||||
data_in
|
||||
= lto_data_in_create (file_data, (const char *) data + string_offset,
|
||||
header->string_size, vNULL);
|
||||
count = streamer_read_uhwi (&ib_main);
|
||||
|
||||
for (i = 0; i < count; i++)
|
||||
{
|
||||
unsigned int index;
|
||||
struct cgraph_node *node;
|
||||
lto_symtab_encoder_t encoder;
|
||||
|
||||
index = streamer_read_uhwi (&ib_main);
|
||||
encoder = file_data->symtab_node_encoder;
|
||||
node = dyn_cast<cgraph_node *> (lto_symtab_encoder_deref (encoder,
|
||||
index));
|
||||
gcc_assert (node->definition);
|
||||
hsa_function_summary *s = hsa_summaries->get (node);
|
||||
|
||||
struct bitpack_d bp = streamer_read_bitpack (&ib_main);
|
||||
s->m_kind = (hsa_function_kind) bp_unpack_value (&bp, 2);
|
||||
s->m_gpu_implementation_p = bp_unpack_value (&bp, 1);
|
||||
bool has_tree = bp_unpack_value (&bp, 1);
|
||||
|
||||
if (has_tree)
|
||||
{
|
||||
tree decl = stream_read_tree (&ib_main, data_in);
|
||||
s->m_binded_function = cgraph_node::get_create (decl);
|
||||
}
|
||||
}
|
||||
lto_free_section_data (file_data, LTO_section_ipa_hsa, NULL, data,
|
||||
len);
|
||||
lto_data_in_delete (data_in);
|
||||
}
|
||||
|
||||
/* Load streamed HSA functions summary and assign the summary to a function. */
|
||||
|
||||
static void
|
||||
ipa_hsa_read_summary (void)
|
||||
{
|
||||
struct lto_file_decl_data **file_data_vec = lto_get_file_decl_data ();
|
||||
struct lto_file_decl_data *file_data;
|
||||
unsigned int j = 0;
|
||||
|
||||
if (hsa_summaries == NULL)
|
||||
hsa_summaries = new hsa_summary_t (symtab);
|
||||
|
||||
while ((file_data = file_data_vec[j++]))
|
||||
{
|
||||
size_t len;
|
||||
const char *data = lto_get_section_data (file_data, LTO_section_ipa_hsa,
|
||||
NULL, &len);
|
||||
|
||||
if (data)
|
||||
ipa_hsa_read_section (file_data, data, len);
|
||||
}
|
||||
}
|
||||
|
||||
const pass_data pass_data_ipa_hsa =
|
||||
{
|
||||
IPA_PASS, /* type */
|
||||
"hsa", /* name */
|
||||
OPTGROUP_NONE, /* optinfo_flags */
|
||||
TV_IPA_HSA, /* tv_id */
|
||||
0, /* properties_required */
|
||||
0, /* properties_provided */
|
||||
0, /* properties_destroyed */
|
||||
0, /* todo_flags_start */
|
||||
TODO_dump_symtab, /* todo_flags_finish */
|
||||
};
|
||||
|
||||
class pass_ipa_hsa : public ipa_opt_pass_d
|
||||
{
|
||||
public:
|
||||
pass_ipa_hsa (gcc::context *ctxt)
|
||||
: ipa_opt_pass_d (pass_data_ipa_hsa, ctxt,
|
||||
NULL, /* generate_summary */
|
||||
ipa_hsa_write_summary, /* write_summary */
|
||||
ipa_hsa_read_summary, /* read_summary */
|
||||
ipa_hsa_write_summary, /* write_optimization_summary */
|
||||
ipa_hsa_read_summary, /* read_optimization_summary */
|
||||
NULL, /* stmt_fixup */
|
||||
0, /* function_transform_todo_flags_start */
|
||||
NULL, /* function_transform */
|
||||
NULL) /* variable_transform */
|
||||
{}
|
||||
|
||||
/* opt_pass methods: */
|
||||
virtual bool gate (function *);
|
||||
|
||||
virtual unsigned int execute (function *) { return process_hsa_functions (); }
|
||||
|
||||
}; // class pass_ipa_hsa
|
||||
|
||||
bool
|
||||
pass_ipa_hsa::gate (function *)
|
||||
{
|
||||
return hsa_gen_requested_p ();
|
||||
}
|
||||
|
||||
} // anon namespace
|
||||
|
||||
ipa_opt_pass_d *
|
||||
make_pass_ipa_hsa (gcc::context *ctxt)
|
||||
{
|
||||
return new pass_ipa_hsa (ctxt);
|
||||
}
|
@ -51,7 +51,8 @@ const char *lto_section_name[LTO_N_SECTION_TYPES] =
|
||||
"ipcp_trans",
|
||||
"icf",
|
||||
"offload_table",
|
||||
"mode_table"
|
||||
"mode_table",
|
||||
"hsa"
|
||||
};
|
||||
|
||||
|
||||
|
@ -244,6 +244,7 @@ enum lto_section_type
|
||||
LTO_section_ipa_icf,
|
||||
LTO_section_offload_table,
|
||||
LTO_section_mode_table,
|
||||
LTO_section_ipa_hsa,
|
||||
LTO_N_SECTION_TYPES /* Must be last. */
|
||||
};
|
||||
|
||||
|
@ -736,6 +736,7 @@ compile_images_for_offload_targets (unsigned in_argc, char *in_argv[],
|
||||
return;
|
||||
unsigned num_targets = parse_env_var (target_names, &names, NULL);
|
||||
|
||||
int next_name_entry = 0;
|
||||
const char *compiler_path = getenv ("COMPILER_PATH");
|
||||
if (!compiler_path)
|
||||
goto out;
|
||||
@ -745,13 +746,19 @@ compile_images_for_offload_targets (unsigned in_argc, char *in_argv[],
   offload_names = XCNEWVEC (char *, num_targets + 1);
   for (unsigned i = 0; i < num_targets; i++)
     {
-      offload_names[i]
+      /* HSA uses neither LTO-like streaming nor a separate offload
+         compiler; skip it. */
+      if (strcmp (names[i], "hsa") == 0)
+        continue;
+
+      offload_names[next_name_entry]
         = compile_offload_image (names[i], compiler_path, in_argc, in_argv,
                                  compiler_opts, compiler_opt_count,
                                  linker_opts, linker_opt_count);
-      if (!offload_names[i])
+      if (!offload_names[next_name_entry])
         fatal_error (input_location,
                      "problem with building target image for %s\n", names[i]);
+      next_name_entry++;
     }

  out:
|
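The switch from `i` to `next_name_entry` matters because the loop can now skip an entry. Indexing the output by `i` would leave a NULL slot in the middle of the zero-initialized array, and any consumer that walks it until the first NULL would stop early (that consumer behaviour is an assumption, since this hunk does not show it). A small sketch of the same skip-and-compact loop, with the hypothetical helper `demo_filter_targets` standing in for the lto-wrapper code:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy every entry of NAMES except "hsa" into OUT, packing the kept
   entries from index 0 and NULL-terminating the result.  OUT must have
   room for COUNT + 1 pointers.  Returns the number of entries kept.
   This is a hypothetical helper, not part of lto-wrapper.  */
static size_t
demo_filter_targets (const char *const *names, size_t count,
                     const char **out)
{
  size_t next = 0;
  for (size_t i = 0; i < count; i++)
    {
      /* Skipped entries do not advance the output index.  */
      if (strcmp (names[i], "hsa") == 0)
        continue;
      out[next++] = names[i];
    }
  out[next] = NULL;
  return next;
}
```

Given the three names "nvptx-none", "hsa" and "x86_64-intelmicemul-linux-gnu", the helper keeps two entries and stores them contiguously.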
@ -1,3 +1,10 @@
|
||||
2016-01-19 Martin Liska <mliska@suse.cz>
|
||||
Martin Jambor <mjambor@suse.cz>
|
||||
|
||||
* lto-partition.c: Include "hsa.h".
|
||||
(add_symbol_to_partition_1): Put hsa implementations into the
|
||||
same partition as host implementations.
|
||||
|
||||
2016-01-12 Jan Hubicka <hubicka@ucw.cz>
|
||||
|
||||
PR lto/69003
|
||||
|
@ -34,6 +34,7 @@ along with GCC; see the file COPYING3. If not see
|
||||
#include "ipa-prop.h"
|
||||
#include "ipa-inline.h"
|
||||
#include "lto-partition.h"
|
||||
#include "hsa.h"
|
||||
|
||||
vec<ltrans_partition> ltrans_partitions;
|
||||
|
||||
@ -170,6 +171,24 @@ add_symbol_to_partition_1 (ltrans_partition part, symtab_node *node)
|
||||
Therefore put it into the same partition. */
|
||||
if (cnode->instrumented_version)
|
||||
add_symbol_to_partition_1 (part, cnode->instrumented_version);
|
||||
|
||||
/* Add an HSA function associated with the symbol. */
|
||||
if (hsa_summaries != NULL)
|
||||
{
|
||||
hsa_function_summary *s = hsa_summaries->get (cnode);
|
||||
if (s->m_kind == HSA_KERNEL)
|
||||
{
|
||||
/* Add the bound function. */
|
||||
bool added = add_symbol_to_partition_1 (part,
|
||||
s->m_binded_function);
|
||||
gcc_assert (added);
|
||||
if (symtab->dump_file)
|
||||
fprintf (symtab->dump_file,
|
||||
"adding an HSA function (host/gpu) to the "
|
||||
"partition: %s\n",
|
||||
s->m_binded_function->name ());
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
add_references_to_partition (part, node);
|
||||
|
@ -340,8 +340,13 @@ DEF_GOMP_BUILTIN (BUILT_IN_GOMP_SINGLE_COPY_START, "GOMP_single_copy_start",
                   BT_FN_PTR, ATTR_NOTHROW_LEAF_LIST)
 DEF_GOMP_BUILTIN (BUILT_IN_GOMP_SINGLE_COPY_END, "GOMP_single_copy_end",
                   BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST)
+DEF_GOMP_BUILTIN (BUILT_IN_GOMP_OFFLOAD_REGISTER, "GOMP_offload_register_ver",
+                  BT_FN_VOID_UINT_PTR_INT_PTR, ATTR_NOTHROW_LIST)
+DEF_GOMP_BUILTIN (BUILT_IN_GOMP_OFFLOAD_UNREGISTER,
+                  "GOMP_offload_unregister_ver",
+                  BT_FN_VOID_UINT_PTR_INT_PTR, ATTR_NOTHROW_LIST)
 DEF_GOMP_BUILTIN (BUILT_IN_GOMP_TARGET, "GOMP_target_ext",
-                  BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR_INT_INT,
+                  BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR_PTR,
                   ATTR_NOTHROW_LIST)
 DEF_GOMP_BUILTIN (BUILT_IN_GOMP_TARGET_DATA, "GOMP_target_data_ext",
                   BT_FN_VOID_INT_SIZE_PTR_PTR_PTR, ATTR_NOTHROW_LIST)
1456
gcc/omp-low.c
File diff suppressed because it is too large
31
gcc/opts.c
@ -1916,8 +1916,35 @@ common_handle_option (struct gcc_options *opts,
       break;

     case OPT_foffload_:
-      /* Deferred. */
-      break;
+      {
+        const char *p = arg;
+        opts->x_flag_disable_hsa = true;
+        while (*p != 0)
+          {
+            const char *comma = strchr (p, ',');
+
+            if ((strncmp (p, "disable", 7) == 0)
+                && (p[7] == ',' || p[7] == '\0'))
+              {
+                opts->x_flag_disable_hsa = true;
+                break;
+              }
+
+            if ((strncmp (p, "hsa", 3) == 0)
+                && (p[3] == ',' || p[3] == '\0'))
+              {
+#ifdef ENABLE_HSA
+                opts->x_flag_disable_hsa = false;
+#else
+                sorry ("HSA has not been enabled during configuration");
+#endif
+              }
+            if (!comma)
+              break;
+            p = comma + 1;
+          }
+        break;
+      }

 #ifndef ACCEL_COMPILER
     case OPT_foffload_abi_:
|
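The `-foffload=` handling above walks a comma-separated list token by token: HSA starts out disabled, a `disable` token ends the scan with HSA disabled, and an exact `hsa` token enables it. The sketch below restates that control flow as a standalone function so the matching rule (prefix compare plus a check that the token ends at `,` or the terminator) is easier to see. The names `demo_token_is` and `demo_hsa_disabled` are hypothetical, and the `hsa_configured` parameter is an assumption standing in for the `ENABLE_HSA` preprocessor check; the real code calls `sorry ()` rather than ignoring the token.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Return true if the token starting at P is exactly WORD, i.e. WORD is
   a prefix of P and is followed by a comma or the end of the string.  */
static bool
demo_token_is (const char *p, const char *word)
{
  size_t n = strlen (word);
  return strncmp (p, word, n) == 0 && (p[n] == ',' || p[n] == '\0');
}

/* Walk the comma-separated list ARG and report whether HSA stays
   disabled.  HSA_CONFIGURED models the ENABLE_HSA build-time check
   (an assumption of this sketch).  */
static bool
demo_hsa_disabled (const char *arg, bool hsa_configured)
{
  bool disabled = true;
  const char *p = arg;
  while (*p != 0)
    {
      const char *comma = strchr (p, ',');

      /* "disable" wins immediately, whatever came before it.  */
      if (demo_token_is (p, "disable"))
        return true;

      if (demo_token_is (p, "hsa") && hsa_configured)
        disabled = false;

      if (!comma)
        break;
      p = comma + 1;
    }
  return disabled;
}
```

Under these assumptions, a list such as "nvptx-none,hsa" enables HSA, while "hsa,disable" and "hsa-foo" both leave it disabled.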
@ -1183,6 +1183,11 @@ DEFPARAM (PARAM_MAX_RTL_IF_CONVERSION_INSNS,
|
||||
"Maximum number of insns in a basic block to consider for RTL "
|
||||
"if-conversion.",
|
||||
10, 0, 99)
|
||||
|
||||
DEFPARAM (PARAM_HSA_GEN_DEBUG_STORES,
|
||||
"hsa-gen-debug-stores",
|
||||
"Level of hsa debug stores verbosity",
|
||||
0, 0, 1)
|
||||
/*
|
||||
|
||||
Local variables:
|
||||
|
@ -151,6 +151,7 @@ along with GCC; see the file COPYING3. If not see
|
||||
NEXT_PASS (pass_ipa_cp);
|
||||
NEXT_PASS (pass_ipa_cdtor_merge);
|
||||
NEXT_PASS (pass_target_clone);
|
||||
NEXT_PASS (pass_ipa_hsa);
|
||||
NEXT_PASS (pass_ipa_inline);
|
||||
NEXT_PASS (pass_ipa_pure_const);
|
||||
NEXT_PASS (pass_ipa_reference);
|
||||
@ -386,6 +387,7 @@ along with GCC; see the file COPYING3. If not see
|
||||
NEXT_PASS (pass_nrv);
|
||||
NEXT_PASS (pass_cleanup_cfg_post_optimizing);
|
||||
NEXT_PASS (pass_warn_function_noreturn);
|
||||
NEXT_PASS (pass_gen_hsail);
|
||||
|
||||
NEXT_PASS (pass_expand);
|
||||
|
||||
|
@ -97,6 +97,7 @@ DEFTIMEVAR (TV_WHOPR_WPA_IO , "whopr wpa I/O")
|
||||
DEFTIMEVAR (TV_WHOPR_PARTITIONING , "whopr partitioning")
|
||||
DEFTIMEVAR (TV_WHOPR_LTRANS , "whopr ltrans")
|
||||
DEFTIMEVAR (TV_IPA_REFERENCE , "ipa reference")
|
||||
DEFTIMEVAR (TV_IPA_HSA , "ipa HSA")
|
||||
DEFTIMEVAR (TV_IPA_PROFILE , "ipa profile")
|
||||
DEFTIMEVAR (TV_IPA_AUTOFDO , "auto profile")
|
||||
DEFTIMEVAR (TV_IPA_PURE_CONST , "ipa pure const")
|
||||
|
@ -75,6 +75,7 @@ along with GCC; see the file COPYING3. If not see
|
||||
#include "gcse.h"
|
||||
#include "tree-chkp.h"
|
||||
#include "omp-low.h"
|
||||
#include "hsa.h"
|
||||
|
||||
#if defined(DBX_DEBUGGING_INFO) || defined(XCOFF_DEBUGGING_INFO)
|
||||
#include "dbxout.h"
|
||||
@ -518,6 +519,8 @@ compile_file (void)
|
||||
|
||||
omp_finish_file ();
|
||||
|
||||
hsa_output_brig ();
|
||||
|
||||
output_shared_constant_pool ();
|
||||
output_object_blocks ();
|
||||
finish_tm_clone_pairs ();
|
||||
|
@ -458,7 +458,11 @@ enum omp_clause_code {
   OMP_CLAUSE_VECTOR_LENGTH,

   /* OpenACC clause: tile ( size-expr-list ). */
-  OMP_CLAUSE_TILE
+  OMP_CLAUSE_TILE,
+
+  /* OpenMP internal-only clause to specify grid dimensions of a gridified
+     kernel. */
+  OMP_CLAUSE__GRIDDIM_
 };

 #undef DEFTREESTRUCT
@ -1375,6 +1379,9 @@ struct GTY(()) tree_omp_clause {
|
||||
enum tree_code reduction_code;
|
||||
enum omp_clause_linear_kind linear_kind;
|
||||
enum tree_code if_modifier;
|
||||
/* The dimension an OMP_CLAUSE__GRIDDIM_ clause of a gridified target
|
||||
construct describes. */
|
||||
unsigned int dimension;
|
||||
} GTY ((skip)) subcode;
|
||||
|
||||
/* The gimplification of OMP_CLAUSE_REDUCTION_{INIT,MERGE} for omp-low's
|
||||
|
@ -471,6 +471,7 @@ extern gimple_opt_pass *make_pass_sanopt (gcc::context *ctxt);
|
||||
extern gimple_opt_pass *make_pass_oacc_kernels (gcc::context *ctxt);
|
||||
extern simple_ipa_opt_pass *make_pass_ipa_oacc (gcc::context *ctxt);
|
||||
extern simple_ipa_opt_pass *make_pass_ipa_oacc_kernels (gcc::context *ctxt);
|
||||
extern gimple_opt_pass *make_pass_gen_hsail (gcc::context *ctxt);
|
||||
|
||||
/* IPA Passes */
|
||||
extern simple_ipa_opt_pass *make_pass_ipa_lower_emutls (gcc::context *ctxt);
|
||||
@ -495,6 +496,7 @@ extern ipa_opt_pass_d *make_pass_ipa_cp (gcc::context *ctxt);
|
||||
extern ipa_opt_pass_d *make_pass_ipa_icf (gcc::context *ctxt);
|
||||
extern ipa_opt_pass_d *make_pass_ipa_devirt (gcc::context *ctxt);
|
||||
extern ipa_opt_pass_d *make_pass_ipa_reference (gcc::context *ctxt);
|
||||
extern ipa_opt_pass_d *make_pass_ipa_hsa (gcc::context *ctxt);
|
||||
extern ipa_opt_pass_d *make_pass_ipa_pure_const (gcc::context *ctxt);
|
||||
extern simple_ipa_opt_pass *make_pass_ipa_pta (gcc::context *ctxt);
|
||||
extern simple_ipa_opt_pass *make_pass_ipa_tm (gcc::context *ctxt);
|
||||
|
@ -942,6 +942,18 @@ dump_omp_clause (pretty_printer *pp, tree clause, int spc, int flags)
|
||||
pp_right_paren (pp);
|
||||
break;
|
||||
|
||||
case OMP_CLAUSE__GRIDDIM_:
|
||||
pp_string (pp, "_griddim_(");
|
||||
pp_unsigned_wide_integer (pp, OMP_CLAUSE__GRIDDIM__DIMENSION (clause));
|
||||
pp_colon (pp);
|
||||
dump_generic_node (pp, OMP_CLAUSE__GRIDDIM__SIZE (clause), spc, flags,
|
||||
false);
|
||||
pp_comma (pp);
|
||||
dump_generic_node (pp, OMP_CLAUSE__GRIDDIM__GROUP (clause), spc, flags,
|
||||
false);
|
||||
pp_right_paren (pp);
|
||||
break;
|
||||
|
||||
default:
|
||||
/* Should never happen. */
|
||||
dump_generic_node (pp, clause, spc, flags, false);
|
||||
|
@ -328,6 +328,7 @@ unsigned const char omp_clause_num_ops[] =
|
||||
1, /* OMP_CLAUSE_NUM_WORKERS */
|
||||
1, /* OMP_CLAUSE_VECTOR_LENGTH */
|
||||
1, /* OMP_CLAUSE_TILE */
|
||||
2, /* OMP_CLAUSE__GRIDDIM_ */
|
||||
};
|
||||
|
||||
const char * const omp_clause_code_name[] =
|
||||
@ -398,7 +399,8 @@ const char * const omp_clause_code_name[] =
   "num_gangs",
   "num_workers",
   "vector_length",
-  "tile"
+  "tile",
+  "_griddim_"
 };
@ -11744,6 +11746,7 @@ walk_tree_1 (tree *tp, walk_tree_fn func, void *data,
|
||||
switch (OMP_CLAUSE_CODE (*tp))
|
||||
{
|
||||
case OMP_CLAUSE_GANG:
|
||||
case OMP_CLAUSE__GRIDDIM_:
|
||||
WALK_SUBTREE (OMP_CLAUSE_OPERAND (*tp, 1));
|
||||
/* FALLTHRU */
|
||||
|
||||
|
@ -1636,6 +1636,14 @@ extern void protected_set_expr_location (tree, location_t);
|
||||
#define OMP_CLAUSE_TILE_LIST(NODE) \
|
||||
OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_TILE), 0)
|
||||
|
||||
#define OMP_CLAUSE__GRIDDIM__DIMENSION(NODE) \
|
||||
(OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE__GRIDDIM_)\
|
||||
->omp_clause.subcode.dimension)
|
||||
#define OMP_CLAUSE__GRIDDIM__SIZE(NODE) \
|
||||
OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE__GRIDDIM_), 0)
|
||||
#define OMP_CLAUSE__GRIDDIM__GROUP(NODE) \
|
||||
OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE__GRIDDIM_), 1)
|
||||
|
||||
/* SSA_NAME accessors. */
|
||||
|
||||
/* Returns the IDENTIFIER_NODE giving the SSA name a name or NULL_TREE
|
||||
|
@ -1,3 +1,16 @@
|
||||
2016-01-19 Martin Jambor <mjambor@suse.cz>
|
||||
|
||||
* gomp-constants.h (GOMP_DEVICE_HSA): New macro.
|
||||
(GOMP_VERSION_HSA): Likewise.
|
||||
(GOMP_TARGET_ARG_DEVICE_MASK): Likewise.
|
||||
(GOMP_TARGET_ARG_DEVICE_ALL): Likewise.
|
||||
(GOMP_TARGET_ARG_SUBSEQUENT_PARAM): Likewise.
|
||||
(GOMP_TARGET_ARG_ID_MASK): Likewise.
|
||||
(GOMP_TARGET_ARG_NUM_TEAMS): Likewise.
|
||||
(GOMP_TARGET_ARG_THREAD_LIMIT): Likewise.
|
||||
(GOMP_TARGET_ARG_VALUE_SHIFT): Likewise.
|
||||
(GOMP_TARGET_ARG_HSA_KERNEL_ATTRIBUTES): Likewise.
|
||||
|
||||
2016-01-07 Mike Frysinger <vapier@gentoo.org>
|
||||
|
||||
* longlong.h: Change !__SHMEDIA__ to
|
||||
|
@ -176,6 +176,7 @@ enum gomp_map_kind
|
||||
#define GOMP_DEVICE_NOT_HOST 4
|
||||
#define GOMP_DEVICE_NVIDIA_PTX 5
|
||||
#define GOMP_DEVICE_INTEL_MIC 6
|
||||
#define GOMP_DEVICE_HSA 7
|
||||
|
||||
#define GOMP_DEVICE_ICV -1
|
||||
#define GOMP_DEVICE_HOST_FALLBACK -2
|
||||
@ -201,6 +202,7 @@ enum gomp_map_kind
|
||||
#define GOMP_VERSION 0
|
||||
#define GOMP_VERSION_NVIDIA_PTX 1
|
||||
#define GOMP_VERSION_INTEL_MIC 0
|
||||
#define GOMP_VERSION_HSA 0
|
||||
|
||||
#define GOMP_VERSION_PACK(LIB, DEV) (((LIB) << 16) | (DEV))
|
||||
#define GOMP_VERSION_LIB(PACK) (((PACK) >> 16) & 0xffff)
|
||||
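`GOMP_VERSION_PACK` stores the library version in the upper 16 bits and the device (plugin) version in the lower 16 bits of one word, and `GOMP_VERSION_LIB` recovers the upper half. A short sketch of the round trip follows; the two `GOMP_*` macros are copied from the hunk, while `DEMO_VERSION_DEV` is an assumed counterpart for the lower half that is not visible in this hunk, and `demo_pack` is a hypothetical wrapper.

```c
#include <assert.h>

/* Copied from the hunk above.  */
#define GOMP_VERSION_PACK(LIB, DEV) (((LIB) << 16) | (DEV))
#define GOMP_VERSION_LIB(PACK) (((PACK) >> 16) & 0xffff)

/* Assumed counterpart extracting the device half; not shown in the hunk.  */
#define DEMO_VERSION_DEV(PACK) ((PACK) & 0xffff)

/* Pack a library version and a device version into one word.  Both
   inputs are expected to fit in 16 bits.  */
static unsigned
demo_pack (unsigned lib, unsigned dev)
{
  return GOMP_VERSION_PACK (lib, dev);
}
```

With both `GOMP_VERSION` and `GOMP_VERSION_HSA` defined as 0, the packed value for the HSA plugin is simply 0; nonzero inputs such as library version 1 and device version 3 pack to 0x10003.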
@ -228,4 +230,30 @@ enum gomp_map_kind
|
||||
#define GOMP_LAUNCH_OP(X) (((X) >> GOMP_LAUNCH_OP_SHIFT) & 0xffff)
|
||||
#define GOMP_LAUNCH_OP_MAX 0xffff
|
||||
|
||||
/* Bitmask to apply in order to find out the intended device of a target
|
||||
argument. */
|
||||
#define GOMP_TARGET_ARG_DEVICE_MASK ((1 << 7) - 1)
|
||||
/* The target argument is significant for all devices. */
|
||||
#define GOMP_TARGET_ARG_DEVICE_ALL 0
|
||||
|
||||
/* Flag set when the value is stored in the subsequent element of the
device-specific argument array. */
|
||||
#define GOMP_TARGET_ARG_SUBSEQUENT_PARAM (1 << 7)
|
||||
|
||||
/* Bitmask to apply to a target argument to find out the value identifier. */
|
||||
#define GOMP_TARGET_ARG_ID_MASK (((1 << 8) - 1) << 8)
|
||||
/* Target argument index of NUM_TEAMS. */
|
||||
#define GOMP_TARGET_ARG_NUM_TEAMS (1 << 8)
|
||||
/* Target argument index of THREAD_LIMIT. */
|
||||
#define GOMP_TARGET_ARG_THREAD_LIMIT (2 << 8)
|
||||
|
||||
/* If the value is embedded directly in the target argument, it should be at
most 16 bits wide and shifted left by this many bits. */
|
||||
#define GOMP_TARGET_ARG_VALUE_SHIFT 16
|
||||
|
||||
/* HSA specific data structures. */
|
||||
|
||||
/* Identifiers of device-specific target arguments. */
|
||||
#define GOMP_TARGET_ARG_HSA_KERNEL_ATTRIBUTES (1 << 8)
|
||||
|
||||
#endif
|
||||
|
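Read together, the `GOMP_TARGET_ARG_*` macros describe a small bit layout for each device-specific target argument: bits 0-6 select the device, bit 7 says the value lives in the next array element, bits 8-15 identify the argument, and bits 16 and up can hold a small value inline. The sketch below shows one way to encode and decode such a word under that reading of the comments. The macro values are copied from this diff; `struct demo_target_arg`, `demo_encode_inline` and `demo_decode` are hypothetical helpers, not libgomp functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Constants copied from the gomp-constants.h hunks in this diff.  */
#define GOMP_DEVICE_HSA 7
#define GOMP_TARGET_ARG_DEVICE_MASK ((1 << 7) - 1)
#define GOMP_TARGET_ARG_DEVICE_ALL 0
#define GOMP_TARGET_ARG_SUBSEQUENT_PARAM (1 << 7)
#define GOMP_TARGET_ARG_ID_MASK (((1 << 8) - 1) << 8)
#define GOMP_TARGET_ARG_NUM_TEAMS (1 << 8)
#define GOMP_TARGET_ARG_THREAD_LIMIT (2 << 8)
#define GOMP_TARGET_ARG_VALUE_SHIFT 16
#define GOMP_TARGET_ARG_HSA_KERNEL_ATTRIBUTES (1 << 8)

/* Decoded view of one argument word (hypothetical type).  */
struct demo_target_arg
{
  unsigned device;       /* Intended device, or GOMP_TARGET_ARG_DEVICE_ALL.  */
  unsigned id;           /* Identifier, still in its shifted position.  */
  bool in_next_element;  /* True if the value follows in the next element.  */
  uintptr_t value;       /* Inline value; 0 when in_next_element is set.  */
};

/* Build an argument word that carries VALUE inline.  */
static uintptr_t
demo_encode_inline (unsigned device, unsigned id, uint16_t value)
{
  return ((uintptr_t) value << GOMP_TARGET_ARG_VALUE_SHIFT) | id | device;
}

/* Split an argument word into its fields.  */
static struct demo_target_arg
demo_decode (uintptr_t arg)
{
  struct demo_target_arg d;
  d.device = arg & GOMP_TARGET_ARG_DEVICE_MASK;
  d.id = arg & GOMP_TARGET_ARG_ID_MASK;
  d.in_next_element = (arg & GOMP_TARGET_ARG_SUBSEQUENT_PARAM) != 0;
  d.value = d.in_next_element ? 0 : arg >> GOMP_TARGET_ARG_VALUE_SHIFT;
  return d;
}
```

Note that `GOMP_TARGET_ARG_HSA_KERNEL_ATTRIBUTES` has the same numeric value as `GOMP_TARGET_ARG_NUM_TEAMS`; the device field is what tells a device-specific identifier apart from a generic one.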
@ -1,3 +1,64 @@
|
||||
2016-01-19 Martin Jambor <mjambor@suse.cz>
|
||||
Martin Liska <mliska@suse.cz>
|
||||
|
||||
* plugin/Makefrag.am: Add HSA plugin requirements.
|
||||
* plugin/configfrag.ac (HSA_RUNTIME_INCLUDE): New variable.
|
||||
(HSA_RUNTIME_LIB): Likewise.
|
||||
(HSA_RUNTIME_CPPFLAGS): Likewise.
|
||||
(HSA_RUNTIME_INCLUDE): New substitution.
|
||||
(HSA_RUNTIME_LIB): Likewise.
|
||||
(HSA_RUNTIME_LDFLAGS): Likewise.
|
||||
(hsa-runtime): New configure option.
|
||||
(hsa-runtime-include): Likewise.
|
||||
(hsa-runtime-lib): Likewise.
|
||||
(PLUGIN_HSA): New substitution variable.
|
||||
Fill HSA_RUNTIME_INCLUDE and HSA_RUNTIME_LIB according to the new
|
||||
configure options.
|
||||
(PLUGIN_HSA_CPPFLAGS): Likewise.
|
||||
(PLUGIN_HSA_LDFLAGS): Likewise.
|
||||
(PLUGIN_HSA_LIBS): Likewise.
|
||||
Check that we have access to HSA run-time.
|
||||
* libgomp-plugin.h (offload_target_type): New element
|
||||
OFFLOAD_TARGET_TYPE_HSA.
|
||||
* libgomp.h (gomp_target_task): New fields firstprivate_copies and
|
||||
args.
|
||||
(bool gomp_create_target_task): Updated.
|
||||
(gomp_device_descr): Extra parameter of run_func and async_run_func,
|
||||
new field can_run_func.
|
||||
* libgomp_g.h (GOMP_target_ext): Update prototype.
|
||||
* oacc-host.c (host_run): Added a new parameter args.
|
||||
* target.c (calculate_firstprivate_requirements): New function.
|
||||
(copy_firstprivate_data): Likewise.
|
||||
(gomp_target_fallback_firstprivate): Use them.
|
||||
(gomp_target_unshare_firstprivate): New function.
|
||||
(gomp_get_target_fn_addr): Allow returning NULL for shared memory
|
||||
devices.
|
||||
(GOMP_target): Do host fallback for all shared memory devices. Do not
|
||||
pass any args to plugins.
|
||||
(GOMP_target_ext): Introduce device-specific argument parameter args.
|
||||
Allow host fallback if device shares memory. Do not remap data if
|
||||
device has shared memory.
|
||||
(gomp_target_task_fn): Likewise. Also treat shared memory devices
|
||||
like host fallback for mappings.
|
||||
(GOMP_target_data): Treat shared memory devices like host fallback.
|
||||
(GOMP_target_data_ext): Likewise.
|
||||
(GOMP_target_update): Likewise.
|
||||
(GOMP_target_update_ext): Likewise. Also pass NULL as args to
|
||||
gomp_create_target_task.
|
||||
(GOMP_target_enter_exit_data): Likewise.
|
||||
(omp_target_alloc): Treat shared memory devices like host fallback.
|
||||
(omp_target_free): Likewise.
|
||||
(omp_target_is_present): Likewise.
|
||||
(omp_target_memcpy): Likewise.
|
||||
(omp_target_memcpy_rect): Likewise.
|
||||
(omp_target_associate_ptr): Likewise.
|
||||
(gomp_load_plugin_for_device): Also load can_run.
|
||||
* task.c (GOMP_PLUGIN_target_task_completion): Free
|
||||
firstprivate_copies.
|
||||
(gomp_create_target_task): Accept new argument args and store it to
|
||||
ttask.
|
||||
* plugin/plugin-hsa.c: New file.
|
||||
|
||||
2016-01-18 Tom de Vries <tom@codesourcery.com>
|
||||
|
||||
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-2.c: New test.
|
||||
|
@ -17,7 +17,7 @@
 # Plugins for offload execution, Makefile.am fragment.
 #
-# Copyright (C) 2014-2015 Free Software Foundation, Inc.
+# Copyright (C) 2014-2016 Free Software Foundation, Inc.
 #
 # Contributed by Mentor Embedded.
 #
@ -89,7 +89,8 @@ DIST_COMMON = $(top_srcdir)/plugin/Makefrag.am ChangeLog \
         $(srcdir)/omp_lib.f90.in $(srcdir)/libgomp_f.h.in \
         $(srcdir)/libgomp.spec.in $(srcdir)/../depcomp
 @PLUGIN_NVPTX_TRUE@am__append_1 = libgomp-plugin-nvptx.la
-@USE_FORTRAN_TRUE@am__append_2 = openacc.f90
+@PLUGIN_HSA_TRUE@am__append_2 = libgomp-plugin-hsa.la
+@USE_FORTRAN_TRUE@am__append_3 = openacc.f90
 subdir = .
 ACLOCAL_M4 = $(top_srcdir)/aclocal.m4
 am__aclocal_m4_deps = $(top_srcdir)/../config/acx.m4 \
@ -147,6 +148,17 @@ am__installdirs = "$(DESTDIR)$(toolexeclibdir)" "$(DESTDIR)$(infodir)" \
|
||||
"$(DESTDIR)$(toolexeclibdir)"
|
||||
LTLIBRARIES = $(toolexeclib_LTLIBRARIES)
|
||||
am__DEPENDENCIES_1 =
|
||||
@PLUGIN_HSA_TRUE@libgomp_plugin_hsa_la_DEPENDENCIES = libgomp.la \
|
||||
@PLUGIN_HSA_TRUE@ $(am__DEPENDENCIES_1)
|
||||
@PLUGIN_HSA_TRUE@am_libgomp_plugin_hsa_la_OBJECTS = \
|
||||
@PLUGIN_HSA_TRUE@ libgomp_plugin_hsa_la-plugin-hsa.lo
|
||||
libgomp_plugin_hsa_la_OBJECTS = $(am_libgomp_plugin_hsa_la_OBJECTS)
|
||||
libgomp_plugin_hsa_la_LINK = $(LIBTOOL) --tag=CC \
|
||||
$(libgomp_plugin_hsa_la_LIBTOOLFLAGS) $(LIBTOOLFLAGS) \
|
||||
--mode=link $(CCLD) $(AM_CFLAGS) $(CFLAGS) \
|
||||
$(libgomp_plugin_hsa_la_LDFLAGS) $(LDFLAGS) -o $@
|
||||
@PLUGIN_HSA_TRUE@am_libgomp_plugin_hsa_la_rpath = -rpath \
|
||||
@PLUGIN_HSA_TRUE@ $(toolexeclibdir)
|
||||
@PLUGIN_NVPTX_TRUE@libgomp_plugin_nvptx_la_DEPENDENCIES = libgomp.la \
|
||||
@PLUGIN_NVPTX_TRUE@ $(am__DEPENDENCIES_1)
|
||||
@PLUGIN_NVPTX_TRUE@am_libgomp_plugin_nvptx_la_OBJECTS = \
|
||||
@ -187,7 +199,8 @@ FCLD = $(FC)
 FCLINK = $(LIBTOOL) --tag=FC $(AM_LIBTOOLFLAGS) $(LIBTOOLFLAGS) \
         --mode=link $(FCLD) $(AM_FCFLAGS) $(FCFLAGS) $(AM_LDFLAGS) \
         $(LDFLAGS) -o $@
-SOURCES = $(libgomp_plugin_nvptx_la_SOURCES) $(libgomp_la_SOURCES)
+SOURCES = $(libgomp_plugin_hsa_la_SOURCES) \
+        $(libgomp_plugin_nvptx_la_SOURCES) $(libgomp_la_SOURCES)
 MULTISRCTOP =
 MULTIBUILDTOP =
 MULTIDIRS =
@ -255,6 +268,8 @@ FC = @FC@
|
||||
FCFLAGS = @FCFLAGS@
|
||||
FGREP = @FGREP@
|
||||
GREP = @GREP@
|
||||
HSA_RUNTIME_INCLUDE = @HSA_RUNTIME_INCLUDE@
|
||||
HSA_RUNTIME_LIB = @HSA_RUNTIME_LIB@
|
||||
INSTALL = @INSTALL@
|
||||
INSTALL_DATA = @INSTALL_DATA@
|
||||
INSTALL_PROGRAM = @INSTALL_PROGRAM@
|
||||
@ -299,6 +314,10 @@ PACKAGE_URL = @PACKAGE_URL@
|
||||
PACKAGE_VERSION = @PACKAGE_VERSION@
|
||||
PATH_SEPARATOR = @PATH_SEPARATOR@
|
||||
PERL = @PERL@
|
||||
PLUGIN_HSA = @PLUGIN_HSA@
|
||||
PLUGIN_HSA_CPPFLAGS = @PLUGIN_HSA_CPPFLAGS@
|
||||
PLUGIN_HSA_LDFLAGS = @PLUGIN_HSA_LDFLAGS@
|
||||
PLUGIN_HSA_LIBS = @PLUGIN_HSA_LIBS@
|
||||
PLUGIN_NVPTX = @PLUGIN_NVPTX@
|
||||
PLUGIN_NVPTX_CPPFLAGS = @PLUGIN_NVPTX_CPPFLAGS@
|
||||
PLUGIN_NVPTX_LDFLAGS = @PLUGIN_NVPTX_LDFLAGS@
|
||||
@ -391,7 +410,7 @@ libsubincludedir = $(libdir)/gcc/$(target_alias)/$(gcc_version)/include
 AM_CPPFLAGS = $(addprefix -I, $(search_path))
 AM_CFLAGS = $(XCFLAGS)
 AM_LDFLAGS = $(XLDFLAGS) $(SECTION_LDFLAGS) $(OPT_LDFLAGS)
-toolexeclib_LTLIBRARIES = libgomp.la $(am__append_1)
+toolexeclib_LTLIBRARIES = libgomp.la $(am__append_1) $(am__append_2)
 nodist_toolexeclib_HEADERS = libgomp.spec

 # -Wc is only a libtool option.
@ -415,7 +434,7 @@ libgomp_la_SOURCES = alloc.c barrier.c critical.c env.c error.c iter.c \
         bar.c ptrlock.c time.c fortran.c affinity.c target.c \
         splay-tree.c libgomp-plugin.c oacc-parallel.c oacc-host.c \
         oacc-init.c oacc-mem.c oacc-async.c oacc-plugin.c oacc-cuda.c \
-        priority_queue.c $(am__append_2)
+        priority_queue.c $(am__append_3)

 # Nvidia PTX OpenACC plugin.
 @PLUGIN_NVPTX_TRUE@libgomp_plugin_nvptx_version_info = -version-info $(libtool_VERSION)
@ -426,6 +445,16 @@ libgomp_la_SOURCES = alloc.c barrier.c critical.c env.c error.c iter.c \
|
||||
@PLUGIN_NVPTX_TRUE@ $(lt_host_flags) $(PLUGIN_NVPTX_LDFLAGS)
|
||||
@PLUGIN_NVPTX_TRUE@libgomp_plugin_nvptx_la_LIBADD = libgomp.la $(PLUGIN_NVPTX_LIBS)
|
||||
@PLUGIN_NVPTX_TRUE@libgomp_plugin_nvptx_la_LIBTOOLFLAGS = --tag=disable-static
|
||||
|
||||
# Heterogeneous System Architecture plugin
|
||||
@PLUGIN_HSA_TRUE@libgomp_plugin_hsa_version_info = -version-info $(libtool_VERSION)
|
||||
@PLUGIN_HSA_TRUE@libgomp_plugin_hsa_la_SOURCES = plugin/plugin-hsa.c
|
||||
@PLUGIN_HSA_TRUE@libgomp_plugin_hsa_la_CPPFLAGS = $(AM_CPPFLAGS) $(PLUGIN_HSA_CPPFLAGS)
|
||||
@PLUGIN_HSA_TRUE@libgomp_plugin_hsa_la_LDFLAGS = \
|
||||
@PLUGIN_HSA_TRUE@ $(libgomp_plugin_hsa_version_info) \
|
||||
@PLUGIN_HSA_TRUE@ $(lt_host_flags) $(PLUGIN_HSA_LDFLAGS)
|
||||
@PLUGIN_HSA_TRUE@libgomp_plugin_hsa_la_LIBADD = libgomp.la $(PLUGIN_HSA_LIBS)
|
||||
@PLUGIN_HSA_TRUE@libgomp_plugin_hsa_la_LIBTOOLFLAGS = --tag=disable-static
|
||||
nodist_noinst_HEADERS = libgomp_f.h
|
||||
nodist_libsubinclude_HEADERS = omp.h openacc.h
|
||||
@USE_FORTRAN_TRUE@nodist_finclude_HEADERS = omp_lib.h omp_lib.f90 omp_lib.mod omp_lib_kinds.mod \
|
||||
@ -553,6 +582,8 @@ clean-toolexeclibLTLIBRARIES:
|
||||
echo "rm -f \"$${dir}/so_locations\""; \
|
||||
rm -f "$${dir}/so_locations"; \
|
||||
done
|
||||
libgomp-plugin-hsa.la: $(libgomp_plugin_hsa_la_OBJECTS) $(libgomp_plugin_hsa_la_DEPENDENCIES) $(EXTRA_libgomp_plugin_hsa_la_DEPENDENCIES)
|
||||
$(libgomp_plugin_hsa_la_LINK) $(am_libgomp_plugin_hsa_la_rpath) $(libgomp_plugin_hsa_la_OBJECTS) $(libgomp_plugin_hsa_la_LIBADD) $(LIBS)
|
||||
libgomp-plugin-nvptx.la: $(libgomp_plugin_nvptx_la_OBJECTS) $(libgomp_plugin_nvptx_la_DEPENDENCIES) $(EXTRA_libgomp_plugin_nvptx_la_DEPENDENCIES)
|
||||
$(libgomp_plugin_nvptx_la_LINK) $(am_libgomp_plugin_nvptx_la_rpath) $(libgomp_plugin_nvptx_la_OBJECTS) $(libgomp_plugin_nvptx_la_LIBADD) $(LIBS)
|
||||
libgomp.la: $(libgomp_la_OBJECTS) $(libgomp_la_DEPENDENCIES) $(EXTRA_libgomp_la_DEPENDENCIES)
|
||||
@ -575,6 +606,7 @@ distclean-compile:
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/iter.Plo@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/iter_ull.Plo@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/libgomp-plugin.Plo@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/libgomp_plugin_hsa_la-plugin-hsa.Plo@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/libgomp_plugin_nvptx_la-plugin-nvptx.Plo@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/lock.Plo@am__quote@
|
||||
@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/loop.Plo@am__quote@
|
||||
@ -623,6 +655,13 @@ distclean-compile:
|
||||
@AMDEP_TRUE@@am__fastdepCC_FALSE@ DEPDIR=$(DEPDIR) $(CCDEPMODE) $(depcomp) @AMDEPBACKSLASH@
|
||||
@am__fastdepCC_FALSE@ $(LTCOMPILE) -c -o $@ $<
|
||||
|
||||
libgomp_plugin_hsa_la-plugin-hsa.lo: plugin/plugin-hsa.c
|
||||
@am__fastdepCC_TRUE@ $(LIBTOOL) --tag=CC $(libgomp_plugin_hsa_la_LIBTOOLFLAGS) $(LIBTOOLFLAGS) --mode=compile $(CC) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) $(libgomp_plugin_hsa_la_CPPFLAGS) $(CPPFLAGS) $(AM_CFLAGS) $(CFLAGS) -MT libgomp_plugin_hsa_la-plugin-hsa.lo -MD -MP -MF $(DEPDIR)/libgomp_plugin_hsa_la-plugin-hsa.Tpo -c -o libgomp_plugin_hsa_la-plugin-hsa.lo `test -f 'plugin/plugin-hsa.c' || echo '$(srcdir)/'`plugin/plugin-hsa.c
|
||||
@am__fastdepCC_TRUE@ $(am__mv) $(DEPDIR)/libgomp_plugin_hsa_la-plugin-hsa.Tpo $(DEPDIR)/libgomp_plugin_hsa_la-plugin-hsa.Plo
|
||||
@AMDEP_TRUE@@am__fastdepCC_FALSE@ source='plugin/plugin-hsa.c' object='libgomp_plugin_hsa_la-plugin-hsa.lo' libtool=yes @AMDEPBACKSLASH@
|
||||
@AMDEP_TRUE@@am__fastdepCC_FALSE@ DEPDIR=$(DEPDIR) $(CCDEPMODE) $(depcomp) @AMDEPBACKSLASH@
|
||||
@am__fastdepCC_FALSE@ $(LIBTOOL) --tag=CC $(libgomp_plugin_hsa_la_LIBTOOLFLAGS) $(LIBTOOLFLAGS) --mode=compile $(CC) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) $(libgomp_plugin_hsa_la_CPPFLAGS) $(CPPFLAGS) $(AM_CFLAGS) $(CFLAGS) -c -o libgomp_plugin_hsa_la-plugin-hsa.lo `test -f 'plugin/plugin-hsa.c' || echo '$(srcdir)/'`plugin/plugin-hsa.c
|
||||
|
||||
libgomp_plugin_nvptx_la-plugin-nvptx.lo: plugin/plugin-nvptx.c
|
||||
@am__fastdepCC_TRUE@ $(LIBTOOL) --tag=CC $(libgomp_plugin_nvptx_la_LIBTOOLFLAGS) $(LIBTOOLFLAGS) --mode=compile $(CC) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) $(libgomp_plugin_nvptx_la_CPPFLAGS) $(CPPFLAGS) $(AM_CFLAGS) $(CFLAGS) -MT libgomp_plugin_nvptx_la-plugin-nvptx.lo -MD -MP -MF $(DEPDIR)/libgomp_plugin_nvptx_la-plugin-nvptx.Tpo -c -o libgomp_plugin_nvptx_la-plugin-nvptx.lo `test -f 'plugin/plugin-nvptx.c' || echo '$(srcdir)/'`plugin/plugin-nvptx.c
|
||||
@am__fastdepCC_TRUE@ $(am__mv) $(DEPDIR)/libgomp_plugin_nvptx_la-plugin-nvptx.Tpo $(DEPDIR)/libgomp_plugin_nvptx_la-plugin-nvptx.Plo
|
||||
|
@ -60,6 +60,9 @@
|
||||
/* Define to 1 if you have the `strtoull' function. */
|
||||
#undef HAVE_STRTOULL
|
||||
|
||||
/* Define to 1 if the system has the type `struct _Mutex_Control'. */
|
||||
#undef HAVE_STRUCT__MUTEX_CONTROL
|
||||
|
||||
/* Define to 1 if the target runtime linker supports binding the same symbol
|
||||
to different versions. */
|
||||
#undef HAVE_SYMVER_SYMBOL_RENAMING_RUNTIME_SUPPORT
|
||||
@ -119,6 +122,9 @@
|
||||
/* Define to the version of this package. */
|
||||
#undef PACKAGE_VERSION
|
||||
|
||||
/* Define to 1 if the HSA plugin is built, 0 if not. */
|
||||
#undef PLUGIN_HSA
|
||||
|
||||
/* Define to 1 if the NVIDIA plugin is built, 0 if not. */
|
||||
#undef PLUGIN_NVPTX
|
||||
|
||||
|
166
libgomp/configure
@ -627,10 +627,18 @@ LIBGOMP_BUILD_VERSIONED_SHLIB_FALSE
|
||||
LIBGOMP_BUILD_VERSIONED_SHLIB_TRUE
|
||||
OPT_LDFLAGS
|
||||
SECTION_LDFLAGS
|
||||
PLUGIN_HSA_FALSE
|
||||
PLUGIN_HSA_TRUE
|
||||
PLUGIN_NVPTX_FALSE
|
||||
PLUGIN_NVPTX_TRUE
|
||||
offload_additional_lib_paths
|
||||
offload_additional_options
|
||||
PLUGIN_HSA_LIBS
|
||||
PLUGIN_HSA_LDFLAGS
|
||||
PLUGIN_HSA_CPPFLAGS
|
||||
PLUGIN_HSA
|
||||
HSA_RUNTIME_LIB
|
||||
HSA_RUNTIME_INCLUDE
|
||||
PLUGIN_NVPTX_LIBS
|
||||
PLUGIN_NVPTX_LDFLAGS
|
||||
PLUGIN_NVPTX_CPPFLAGS
|
||||
@ -782,6 +790,10 @@ enable_maintainer_mode
|
||||
with_cuda_driver
|
||||
with_cuda_driver_include
|
||||
with_cuda_driver_lib
|
||||
with_hsa_runtime
|
||||
with_hsa_runtime_include
|
||||
with_hsa_runtime_lib
|
||||
with_hsa_kmt_lib
|
||||
enable_linux_futex
|
||||
enable_tls
|
||||
enable_symvers
|
||||
@ -1453,6 +1465,17 @@ Optional Packages:
|
||||
--with-cuda-driver-lib=PATH
|
||||
specify directory for the installed CUDA driver
|
||||
library
|
||||
--with-hsa-runtime=PATH specify prefix directory for installed HSA run-time
|
||||
package. Equivalent to
|
||||
--with-hsa-runtime-include=PATH/include plus
|
||||
--with-hsa-runtime-lib=PATH/lib
|
||||
--with-hsa-runtime-include=PATH
|
||||
specify directory for installed HSA run-time include
|
||||
files
|
||||
--with-hsa-runtime-lib=PATH
|
||||
specify directory for the installed HSA run-time
|
||||
library
|
||||
--with-hsa-kmt-lib=PATH specify directory for installed HSA KMT library.
|
||||
|
||||
Some influential environment variables:
|
||||
CC C compiler command
|
||||
@ -11121,7 +11144,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 11124 "configure"
+#line 11147 "configure"
 #include "confdefs.h"

 #if HAVE_DLFCN_H
@ -11227,7 +11250,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 11230 "configure"
+#line 11253 "configure"
 #include "confdefs.h"

 #if HAVE_DLFCN_H
@ -15090,7 +15113,7 @@ esac
 # Plugins for offload execution, configure.ac fragment. -*- mode: autoconf -*-
 #
-# Copyright (C) 2014-2015 Free Software Foundation, Inc.
+# Copyright (C) 2014-2016 Free Software Foundation, Inc.
 #
 # Contributed by Mentor Embedded.
 #
@ -15225,6 +15248,72 @@ PLUGIN_NVPTX_LIBS=
|
||||
|
||||
|
||||
|
||||
# Look for HSA run-time, its includes and libraries
|
||||
|
||||
HSA_RUNTIME_INCLUDE=
|
||||
HSA_RUNTIME_LIB=
|
||||
|
||||
|
||||
HSA_RUNTIME_CPPFLAGS=
|
||||
HSA_RUNTIME_LDFLAGS=
|
||||
|
||||
|
||||
# Check whether --with-hsa-runtime was given.
|
||||
if test "${with_hsa_runtime+set}" = set; then :
|
||||
withval=$with_hsa_runtime;
|
||||
fi
|
||||
|
||||
|
||||
# Check whether --with-hsa-runtime-include was given.
|
||||
if test "${with_hsa_runtime_include+set}" = set; then :
|
||||
withval=$with_hsa_runtime_include;
|
||||
fi
|
||||
|
||||
|
||||
# Check whether --with-hsa-runtime-lib was given.
|
||||
if test "${with_hsa_runtime_lib+set}" = set; then :
|
||||
withval=$with_hsa_runtime_lib;
|
||||
fi
|
||||
|
||||
if test "x$with_hsa_runtime" != x; then
|
||||
HSA_RUNTIME_INCLUDE=$with_hsa_runtime/include
|
||||
HSA_RUNTIME_LIB=$with_hsa_runtime/lib
|
||||
fi
|
||||
if test "x$with_hsa_runtime_include" != x; then
|
||||
HSA_RUNTIME_INCLUDE=$with_hsa_runtime_include
|
||||
fi
|
||||
if test "x$with_hsa_runtime_lib" != x; then
|
||||
HSA_RUNTIME_LIB=$with_hsa_runtime_lib
|
||||
fi
|
||||
if test "x$HSA_RUNTIME_INCLUDE" != x; then
|
||||
HSA_RUNTIME_CPPFLAGS=-I$HSA_RUNTIME_INCLUDE
|
||||
fi
|
||||
if test "x$HSA_RUNTIME_LIB" != x; then
|
||||
HSA_RUNTIME_LDFLAGS=-L$HSA_RUNTIME_LIB
|
||||
fi
|
||||
|
||||
|
||||
# Check whether --with-hsa-kmt-lib was given.
|
||||
if test "${with_hsa_kmt_lib+set}" = set; then :
|
||||
withval=$with_hsa_kmt_lib;
|
||||
fi
|
||||
|
||||
if test "x$with_hsa_kmt_lib" != x; then
|
||||
HSA_RUNTIME_LDFLAGS="$HSA_RUNTIME_LDFLAGS -L$with_hsa_kmt_lib"
|
||||
HSA_RUNTIME_LIB=
|
||||
fi
|
||||
|
||||
PLUGIN_HSA=0
|
||||
PLUGIN_HSA_CPPFLAGS=
|
||||
PLUGIN_HSA_LDFLAGS=
|
||||
PLUGIN_HSA_LIBS=
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
# Get offload targets and path to install tree of offloading compiler.
|
||||
offload_additional_options=
|
||||
offload_additional_lib_paths=
|
||||
@ -15277,6 +15366,60 @@ rm -f core conftest.err conftest.$ac_objext \
|
||||
;;
|
||||
esac
|
||||
;;
|
||||
hsa*)
|
||||
case "${target}" in
|
||||
x86_64-*-*)
|
||||
case " ${CC} ${CFLAGS} " in
|
||||
*" -m32 "*)
|
||||
PLUGIN_HSA=0
|
||||
;;
|
||||
*)
|
||||
tgt_name=hsa
|
||||
PLUGIN_HSA=$tgt
|
||||
PLUGIN_HSA_CPPFLAGS=$HSA_RUNTIME_CPPFLAGS
|
||||
PLUGIN_HSA_LDFLAGS=$HSA_RUNTIME_LDFLAGS
|
||||
PLUGIN_HSA_LIBS="-lhsa-runtime64 -lhsakmt"
|
||||
|
||||
PLUGIN_HSA_save_CPPFLAGS=$CPPFLAGS
|
||||
CPPFLAGS="$PLUGIN_HSA_CPPFLAGS $CPPFLAGS"
|
||||
PLUGIN_HSA_save_LDFLAGS=$LDFLAGS
|
||||
LDFLAGS="$PLUGIN_HSA_LDFLAGS $LDFLAGS"
|
||||
PLUGIN_HSA_save_LIBS=$LIBS
|
||||
LIBS="$PLUGIN_HSA_LIBS $LIBS"
|
||||
|
||||
cat confdefs.h - <<_ACEOF >conftest.$ac_ext
|
||||
/* end confdefs.h. */
|
||||
#include "hsa.h"
|
||||
int
|
||||
main ()
|
||||
{
|
||||
hsa_status_t status = hsa_init ()
|
||||
;
|
||||
return 0;
|
||||
}
|
||||
_ACEOF
|
||||
if ac_fn_c_try_link "$LINENO"; then :
|
||||
PLUGIN_HSA=1
|
||||
fi
|
||||
rm -f core conftest.err conftest.$ac_objext \
|
||||
conftest$ac_exeext conftest.$ac_ext
|
||||
CPPFLAGS=$PLUGIN_HSA_save_CPPFLAGS
|
||||
LDFLAGS=$PLUGIN_HSA_save_LDFLAGS
|
||||
LIBS=$PLUGIN_HSA_save_LIBS
|
||||
case $PLUGIN_HSA in
|
||||
hsa*)
|
||||
PLUGIN_HSA=0
|
||||
as_fn_error "HSA run-time package required for HSA support" "$LINENO" 5
|
||||
;;
|
||||
esac
|
||||
;;
|
||||
esac
|
||||
;;
|
||||
*-*-*)
|
||||
PLUGIN_HSA=0
|
||||
;;
|
||||
esac
|
||||
;;
|
||||
*)
|
||||
as_fn_error "unknown offload target specified" "$LINENO" 5
|
||||
;;
|
||||
@ -15313,6 +15456,19 @@ cat >>confdefs.h <<_ACEOF
|
||||
#define PLUGIN_NVPTX $PLUGIN_NVPTX
|
||||
_ACEOF
|
||||
|
||||
if test $PLUGIN_HSA = 1; then
|
||||
PLUGIN_HSA_TRUE=
|
||||
PLUGIN_HSA_FALSE='#'
|
||||
else
|
||||
PLUGIN_HSA_TRUE='#'
|
||||
PLUGIN_HSA_FALSE=
|
||||
fi
|
||||
|
||||
|
||||
cat >>confdefs.h <<_ACEOF
|
||||
#define PLUGIN_HSA $PLUGIN_HSA
|
||||
_ACEOF
|
||||
|
||||
|
||||
|
||||
# Check for functions needed.
|
||||
@ -16712,6 +16868,10 @@ if test -z "${PLUGIN_NVPTX_TRUE}" && test -z "${PLUGIN_NVPTX_FALSE}"; then
|
||||
as_fn_error "conditional \"PLUGIN_NVPTX\" was never defined.
|
||||
Usually this means the macro was only invoked conditionally." "$LINENO" 5
|
||||
fi
|
||||
if test -z "${PLUGIN_HSA_TRUE}" && test -z "${PLUGIN_HSA_FALSE}"; then
|
||||
as_fn_error "conditional \"PLUGIN_HSA\" was never defined.
|
||||
Usually this means the macro was only invoked conditionally." "$LINENO" 5
|
||||
fi
|
||||
if test -z "${LIBGOMP_BUILD_VERSIONED_SHLIB_TRUE}" && test -z "${LIBGOMP_BUILD_VERSIONED_SHLIB_FALSE}"; then
|
||||
as_fn_error "conditional \"LIBGOMP_BUILD_VERSIONED_SHLIB\" was never defined.
|
||||
Usually this means the macro was only invoked conditionally." "$LINENO" 5
|
||||
|
@@ -48,7 +48,8 @@ enum offload_target_type
   OFFLOAD_TARGET_TYPE_HOST = 2,
   /* OFFLOAD_TARGET_TYPE_HOST_NONSHM = 3 removed. */
   OFFLOAD_TARGET_TYPE_NVIDIA_PTX = 5,
-  OFFLOAD_TARGET_TYPE_INTEL_MIC = 6
+  OFFLOAD_TARGET_TYPE_INTEL_MIC = 6,
+  OFFLOAD_TARGET_TYPE_HSA = 7
 };

 /* Auxiliary struct, used for transferring pairs of addresses from plugin
|
@ -496,6 +496,10 @@ struct gomp_target_task
|
||||
struct target_mem_desc *tgt;
|
||||
struct gomp_task *task;
|
||||
struct gomp_team *team;
|
||||
/* Copies of firstprivate mapped data for shared memory accelerators. */
|
||||
void *firstprivate_copies;
|
||||
/* Device-specific target arguments. */
|
||||
void **args;
|
||||
void *hostaddrs[];
|
||||
};
|
||||
|
||||
@ -750,7 +754,8 @@ extern void gomp_task_maybe_wait_for_dependencies (void **);
|
||||
extern bool gomp_create_target_task (struct gomp_device_descr *,
|
||||
void (*) (void *), size_t, void **,
|
||||
size_t *, unsigned short *, unsigned int,
|
||||
void **, enum gomp_target_task_state);
|
||||
void **, void **,
|
||||
enum gomp_target_task_state);
|
||||
|
||||
static void inline
|
||||
gomp_finish_task (struct gomp_task *task)
|
||||
@@ -937,8 +942,9 @@ struct gomp_device_descr
   void *(*dev2host_func) (int, void *, const void *, size_t);
   void *(*host2dev_func) (int, void *, const void *, size_t);
   void *(*dev2dev_func) (int, void *, const void *, size_t);
-  void (*run_func) (int, void *, void *);
-  void (*async_run_func) (int, void *, void *, void *);
+  bool (*can_run_func) (void *);
+  void (*run_func) (int, void *, void *, void **);
+  void (*async_run_func) (int, void *, void *, void **, void *);

   /* Splay tree containing information about mapped memory regions. */
   struct splay_tree_s mem_map;
|
@@ -278,8 +278,7 @@ extern void GOMP_single_copy_end (void *);
 extern void GOMP_target (int, void (*) (void *), const void *,
                          size_t, void **, size_t *, unsigned char *);
 extern void GOMP_target_ext (int, void (*) (void *), size_t, void **, size_t *,
-                             unsigned short *, unsigned int, void **,
-                             int, int);
+                             unsigned short *, unsigned int, void **, void **);
 extern void GOMP_target_data (int, const void *,
                               size_t, void **, size_t *, unsigned char *);
 extern void GOMP_target_data_ext (int, size_t, void **, size_t *,
|
@@ -123,7 +123,8 @@ host_host2dev (int n __attribute__ ((unused)),
 }

 static void
-host_run (int n __attribute__ ((unused)), void *fn_ptr, void *vars)
+host_run (int n __attribute__ ((unused)), void *fn_ptr, void *vars,
+          void **args __attribute__((unused)))
 {
   void (*fn)(void *) = (void (*)(void *)) fn_ptr;

|
@ -38,3 +38,16 @@ libgomp_plugin_nvptx_la_LDFLAGS += $(PLUGIN_NVPTX_LDFLAGS)
|
||||
libgomp_plugin_nvptx_la_LIBADD = libgomp.la $(PLUGIN_NVPTX_LIBS)
|
||||
libgomp_plugin_nvptx_la_LIBTOOLFLAGS = --tag=disable-static
|
||||
endif
|
||||
|
||||
if PLUGIN_HSA
|
||||
# Heterogeneous System Architecture plugin
|
||||
libgomp_plugin_hsa_version_info = -version-info $(libtool_VERSION)
|
||||
toolexeclib_LTLIBRARIES += libgomp-plugin-hsa.la
|
||||
libgomp_plugin_hsa_la_SOURCES = plugin/plugin-hsa.c
|
||||
libgomp_plugin_hsa_la_CPPFLAGS = $(AM_CPPFLAGS) $(PLUGIN_HSA_CPPFLAGS)
|
||||
libgomp_plugin_hsa_la_LDFLAGS = $(libgomp_plugin_hsa_version_info) \
|
||||
$(lt_host_flags)
|
||||
libgomp_plugin_hsa_la_LDFLAGS += $(PLUGIN_HSA_LDFLAGS)
|
||||
libgomp_plugin_hsa_la_LIBADD = libgomp.la $(PLUGIN_HSA_LIBS)
|
||||
libgomp_plugin_hsa_la_LIBTOOLFLAGS = --tag=disable-static
|
||||
endif
|
||||
|
@ -81,6 +81,62 @@ AC_SUBST(PLUGIN_NVPTX_CPPFLAGS)
|
||||
AC_SUBST(PLUGIN_NVPTX_LDFLAGS)
|
||||
AC_SUBST(PLUGIN_NVPTX_LIBS)
|
||||
|
||||
# Look for HSA run-time, its includes and libraries
|
||||
|
||||
HSA_RUNTIME_INCLUDE=
|
||||
HSA_RUNTIME_LIB=
|
||||
AC_SUBST(HSA_RUNTIME_INCLUDE)
|
||||
AC_SUBST(HSA_RUNTIME_LIB)
|
||||
HSA_RUNTIME_CPPFLAGS=
|
||||
HSA_RUNTIME_LDFLAGS=
|
||||
|
||||
AC_ARG_WITH(hsa-runtime,
|
||||
[AS_HELP_STRING([--with-hsa-runtime=PATH],
|
||||
[specify prefix directory for installed HSA run-time package.
|
||||
Equivalent to --with-hsa-runtime-include=PATH/include
|
||||
plus --with-hsa-runtime-lib=PATH/lib])])
|
||||
AC_ARG_WITH(hsa-runtime-include,
|
||||
[AS_HELP_STRING([--with-hsa-runtime-include=PATH],
|
||||
[specify directory for installed HSA run-time include files])])
|
||||
AC_ARG_WITH(hsa-runtime-lib,
|
||||
[AS_HELP_STRING([--with-hsa-runtime-lib=PATH],
|
||||
[specify directory for the installed HSA run-time library])])
|
||||
if test "x$with_hsa_runtime" != x; then
|
||||
HSA_RUNTIME_INCLUDE=$with_hsa_runtime/include
|
||||
HSA_RUNTIME_LIB=$with_hsa_runtime/lib
|
||||
fi
|
||||
if test "x$with_hsa_runtime_include" != x; then
|
||||
HSA_RUNTIME_INCLUDE=$with_hsa_runtime_include
|
||||
fi
|
||||
if test "x$with_hsa_runtime_lib" != x; then
|
||||
HSA_RUNTIME_LIB=$with_hsa_runtime_lib
|
||||
fi
|
||||
if test "x$HSA_RUNTIME_INCLUDE" != x; then
|
||||
HSA_RUNTIME_CPPFLAGS=-I$HSA_RUNTIME_INCLUDE
|
||||
fi
|
||||
if test "x$HSA_RUNTIME_LIB" != x; then
|
||||
HSA_RUNTIME_LDFLAGS=-L$HSA_RUNTIME_LIB
|
||||
fi
|
||||
|
||||
AC_ARG_WITH(hsa-kmt-lib,
|
||||
[AS_HELP_STRING([--with-hsa-kmt-lib=PATH],
|
||||
[specify directory for installed HSA KMT library.])])
|
||||
if test "x$with_hsa_kmt_lib" != x; then
|
||||
HSA_RUNTIME_LDFLAGS="$HSA_RUNTIME_LDFLAGS -L$with_hsa_kmt_lib"
|
||||
HSA_RUNTIME_LIB=
|
||||
fi
|
||||
|
||||
PLUGIN_HSA=0
|
||||
PLUGIN_HSA_CPPFLAGS=
|
||||
PLUGIN_HSA_LDFLAGS=
|
||||
PLUGIN_HSA_LIBS=
|
||||
AC_SUBST(PLUGIN_HSA)
|
||||
AC_SUBST(PLUGIN_HSA_CPPFLAGS)
|
||||
AC_SUBST(PLUGIN_HSA_LDFLAGS)
|
||||
AC_SUBST(PLUGIN_HSA_LIBS)
|
||||
|
||||
|
||||
|
||||
# Get offload targets and path to install tree of offloading compiler.
|
||||
offload_additional_options=
|
||||
offload_additional_lib_paths=
|
||||
@ -122,6 +178,49 @@ if test x"$enable_offload_targets" != x; then
|
||||
;;
|
||||
esac
|
||||
;;
|
||||
hsa*)
|
||||
case "${target}" in
|
||||
x86_64-*-*)
|
||||
case " ${CC} ${CFLAGS} " in
|
||||
*" -m32 "*)
|
||||
PLUGIN_HSA=0
|
||||
;;
|
||||
*)
|
||||
tgt_name=hsa
|
||||
PLUGIN_HSA=$tgt
|
||||
PLUGIN_HSA_CPPFLAGS=$HSA_RUNTIME_CPPFLAGS
|
||||
PLUGIN_HSA_LDFLAGS=$HSA_RUNTIME_LDFLAGS
|
||||
PLUGIN_HSA_LIBS="-lhsa-runtime64 -lhsakmt"
|
||||
|
||||
PLUGIN_HSA_save_CPPFLAGS=$CPPFLAGS
|
||||
CPPFLAGS="$PLUGIN_HSA_CPPFLAGS $CPPFLAGS"
|
||||
PLUGIN_HSA_save_LDFLAGS=$LDFLAGS
|
||||
LDFLAGS="$PLUGIN_HSA_LDFLAGS $LDFLAGS"
|
||||
PLUGIN_HSA_save_LIBS=$LIBS
|
||||
LIBS="$PLUGIN_HSA_LIBS $LIBS"
|
||||
|
||||
AC_LINK_IFELSE(
|
||||
[AC_LANG_PROGRAM(
|
||||
[#include "hsa.h"],
|
||||
[hsa_status_t status = hsa_init ()])],
|
||||
[PLUGIN_HSA=1])
|
||||
CPPFLAGS=$PLUGIN_HSA_save_CPPFLAGS
|
||||
LDFLAGS=$PLUGIN_HSA_save_LDFLAGS
|
||||
LIBS=$PLUGIN_HSA_save_LIBS
|
||||
case $PLUGIN_HSA in
|
||||
hsa*)
|
||||
HSA_PLUGIN=0
|
||||
AC_MSG_ERROR([HSA run-time package required for HSA support])
|
||||
;;
|
||||
esac
|
||||
;;
|
||||
esac
|
||||
;;
|
||||
*-*-*)
|
||||
PLUGIN_HSA=0
|
||||
;;
|
||||
esac
|
||||
;;
|
||||
*)
|
||||
AC_MSG_ERROR([unknown offload target specified])
|
||||
;;
|
||||
@ -145,3 +244,6 @@ AC_DEFINE_UNQUOTED(OFFLOAD_TARGETS, "$offload_targets",
|
||||
AM_CONDITIONAL([PLUGIN_NVPTX], [test $PLUGIN_NVPTX = 1])
|
||||
AC_DEFINE_UNQUOTED([PLUGIN_NVPTX], [$PLUGIN_NVPTX],
|
||||
[Define to 1 if the NVIDIA plugin is built, 0 if not.])
|
||||
AM_CONDITIONAL([PLUGIN_HSA], [test $PLUGIN_HSA = 1])
|
||||
AC_DEFINE_UNQUOTED([PLUGIN_HSA], [$PLUGIN_HSA],
|
||||
[Define to 1 if the HSA plugin is built, 0 if not.])
|
||||
|
1493
libgomp/plugin/plugin-hsa.c
Normal file
File diff suppressed because it is too large
227
libgomp/target.c
@ -1329,6 +1329,49 @@ gomp_target_fallback (void (*fn) (void *), void **hostaddrs)
|
||||
*thr = old_thr;
|
||||
}
|
||||
|
||||
/* Calculate alignment and size requirements of a private copy of data shared
|
||||
as GOMP_MAP_FIRSTPRIVATE and store them to TGT_ALIGN and TGT_SIZE. */
|
||||
|
||||
static inline void
|
||||
calculate_firstprivate_requirements (size_t mapnum, size_t *sizes,
|
||||
unsigned short *kinds, size_t *tgt_align,
|
||||
size_t *tgt_size)
|
||||
{
|
||||
size_t i;
|
||||
for (i = 0; i < mapnum; i++)
|
||||
if ((kinds[i] & 0xff) == GOMP_MAP_FIRSTPRIVATE)
|
||||
{
|
||||
size_t align = (size_t) 1 << (kinds[i] >> 8);
|
||||
if (*tgt_align < align)
|
||||
*tgt_align = align;
|
||||
*tgt_size = (*tgt_size + align - 1) & ~(align - 1);
|
||||
*tgt_size += sizes[i];
|
||||
}
|
||||
}
|
||||
|
||||
/* Copy data shared as GOMP_MAP_FIRSTPRIVATE to TGT. */
|
||||
|
||||
static inline void
|
||||
copy_firstprivate_data (char *tgt, size_t mapnum, void **hostaddrs,
|
||||
size_t *sizes, unsigned short *kinds, size_t tgt_align,
|
||||
size_t tgt_size)
|
||||
{
|
||||
uintptr_t al = (uintptr_t) tgt & (tgt_align - 1);
|
||||
if (al)
|
||||
tgt += tgt_align - al;
|
||||
tgt_size = 0;
|
||||
size_t i;
|
||||
for (i = 0; i < mapnum; i++)
|
||||
if ((kinds[i] & 0xff) == GOMP_MAP_FIRSTPRIVATE)
|
||||
{
|
||||
size_t align = (size_t) 1 << (kinds[i] >> 8);
|
||||
tgt_size = (tgt_size + align - 1) & ~(align - 1);
|
||||
memcpy (tgt + tgt_size, hostaddrs[i], sizes[i]);
|
||||
hostaddrs[i] = tgt + tgt_size;
|
||||
tgt_size = tgt_size + sizes[i];
|
||||
}
|
||||
}
|
||||
|
||||
/* Host fallback with firstprivate map-type handling. */
|
||||
|
||||
static void
|
||||
@@ -1336,37 +1379,40 @@ gomp_target_fallback_firstprivate (void (*fn) (void *), size_t mapnum,
                                    void **hostaddrs, size_t *sizes,
                                    unsigned short *kinds)
 {
-  size_t i, tgt_align = 0, tgt_size = 0;
-  char *tgt = NULL;
-  for (i = 0; i < mapnum; i++)
-    if ((kinds[i] & 0xff) == GOMP_MAP_FIRSTPRIVATE)
-      {
-        size_t align = (size_t) 1 << (kinds[i] >> 8);
-        if (tgt_align < align)
-          tgt_align = align;
-        tgt_size = (tgt_size + align - 1) & ~(align - 1);
-        tgt_size += sizes[i];
-      }
+  size_t tgt_align = 0, tgt_size = 0;
+  calculate_firstprivate_requirements (mapnum, sizes, kinds, &tgt_align,
+                                       &tgt_size);
   if (tgt_align)
     {
-      tgt = gomp_alloca (tgt_size + tgt_align - 1);
-      uintptr_t al = (uintptr_t) tgt & (tgt_align - 1);
-      if (al)
-        tgt += tgt_align - al;
-      tgt_size = 0;
-      for (i = 0; i < mapnum; i++)
-        if ((kinds[i] & 0xff) == GOMP_MAP_FIRSTPRIVATE)
-          {
-            size_t align = (size_t) 1 << (kinds[i] >> 8);
-            tgt_size = (tgt_size + align - 1) & ~(align - 1);
-            memcpy (tgt + tgt_size, hostaddrs[i], sizes[i]);
-            hostaddrs[i] = tgt + tgt_size;
-            tgt_size = tgt_size + sizes[i];
-          }
+      char *tgt = gomp_alloca (tgt_size + tgt_align - 1);
+      copy_firstprivate_data (tgt, mapnum, hostaddrs, sizes, kinds, tgt_align,
+                              tgt_size);
     }
   gomp_target_fallback (fn, hostaddrs);
 }

+/* Handle firstprivate map-type for shared memory devices and the host
+   fallback. Return the pointer of firstprivate copies which has to be freed
+   after use. */
+
+static void *
+gomp_target_unshare_firstprivate (size_t mapnum, void **hostaddrs,
+                                  size_t *sizes, unsigned short *kinds)
+{
+  size_t tgt_align = 0, tgt_size = 0;
+  char *tgt = NULL;
+
+  calculate_firstprivate_requirements (mapnum, sizes, kinds, &tgt_align,
+                                       &tgt_size);
+  if (tgt_align)
+    {
+      tgt = gomp_malloc (tgt_size + tgt_align - 1);
+      copy_firstprivate_data (tgt, mapnum, hostaddrs, sizes, kinds, tgt_align,
+                              tgt_size);
+    }
+  return tgt;
+}
+
 /* Helper function of GOMP_target{,_ext} routines. */

 static void *
||||
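The two helpers introduced above, calculate_firstprivate_requirements and copy_firstprivate_data, split the old inline loop into a sizing pass and a copying pass. The following is a minimal standalone sketch of that two-pass packing, not the libgomp code itself. All `ex_` names are hypothetical, and `EX_MAP_FIRSTPRIVATE` is an arbitrary placeholder rather than the real value of GOMP_MAP_FIRSTPRIVATE from gomp-constants.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Placeholder map kind for this sketch only; NOT the libgomp constant.  */
#define EX_MAP_FIRSTPRIVATE 0x01

/* Round OFFSET up to the next multiple of ALIGN (a power of two).  */
static size_t
ex_round_up (size_t offset, size_t align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return (offset + align - 1) & ~(align - 1);
}

/* Pass 1: accumulate the largest alignment and the packed size of all
   firstprivate entries.  The low byte of a kind is the map type, the
   high byte is log2 of the required alignment.  */
static void
ex_requirements (size_t n, const size_t *sizes, const unsigned short *kinds,
                 size_t *max_align, size_t *total)
{
  for (size_t i = 0; i < n; i++)
    if ((kinds[i] & 0xff) == EX_MAP_FIRSTPRIVATE)
      {
        size_t align = (size_t) 1 << (kinds[i] >> 8);
        if (*max_align < align)
          *max_align = align;
        *total = ex_round_up (*total, align) + sizes[i];
      }
}

/* Pass 2: copy every firstprivate entry into BUF and redirect its host
   address to the private copy.  BUF must hold TOTAL + MAX_ALIGN - 1
   bytes so that its start can be aligned first.  */
static void
ex_copy (char *buf, size_t n, void **addrs, const size_t *sizes,
         const unsigned short *kinds, size_t max_align)
{
  uintptr_t misalign = (uintptr_t) buf & (max_align - 1);
  if (misalign)
    buf += max_align - misalign;

  size_t offset = 0;
  for (size_t i = 0; i < n; i++)
    if ((kinds[i] & 0xff) == EX_MAP_FIRSTPRIVATE)
      {
        size_t align = (size_t) 1 << (kinds[i] >> 8);
        offset = ex_round_up (offset, align);
        memcpy (buf + offset, addrs[i], sizes[i]);
        addrs[i] = buf + offset;
        offset += sizes[i];
      }
}
```

Because the sizing pass is separate, the caller decides where the buffer comes from. That is what lets the patch use gomp_alloca on the host fallback path and gomp_malloc in gomp_target_unshare_firstprivate, where the copies must outlive the calling frame.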
@ -1390,7 +1436,12 @@ gomp_get_target_fn_addr (struct gomp_device_descr *devicep,
|
||||
splay_tree_key tgt_fn = splay_tree_lookup (&devicep->mem_map, &k);
|
||||
gomp_mutex_unlock (&devicep->lock);
|
||||
if (tgt_fn == NULL)
|
||||
gomp_fatal ("Target function wasn't mapped");
|
||||
{
|
||||
if (devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
|
||||
return NULL;
|
||||
else
|
||||
gomp_fatal ("Target function wasn't mapped");
|
||||
}
|
||||
|
||||
return (void *) tgt_fn->tgt_offset;
|
||||
}
|
||||
@ -1416,13 +1467,16 @@ GOMP_target (int device, void (*fn) (void *), const void *unused,
|
||||
void *fn_addr;
|
||||
if (devicep == NULL
|
||||
|| !(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400)
|
||||
/* All shared memory devices should use the GOMP_target_ext function. */
|
||||
|| devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM
|
||||
|| !(fn_addr = gomp_get_target_fn_addr (devicep, fn)))
|
||||
return gomp_target_fallback (fn, hostaddrs);
|
||||
|
||||
struct target_mem_desc *tgt_vars
|
||||
= gomp_map_vars (devicep, mapnum, hostaddrs, NULL, sizes, kinds, false,
|
||||
GOMP_MAP_VARS_TARGET);
|
||||
devicep->run_func (devicep->target_id, fn_addr, (void *) tgt_vars->tgt_start);
|
||||
devicep->run_func (devicep->target_id, fn_addr, (void *) tgt_vars->tgt_start,
|
||||
NULL);
|
||||
gomp_unmap_vars (tgt_vars, true);
|
||||
}
|
||||
|
||||
@ -1430,6 +1484,15 @@ GOMP_target (int device, void (*fn) (void *), const void *unused,
|
||||
and several arguments have been added:
|
||||
FLAGS is a bitmask, see GOMP_TARGET_FLAG_* in gomp-constants.h.
|
||||
DEPEND is array of dependencies, see GOMP_task for details.
|
||||
|
||||
ARGS is a pointer to an array consisting of a variable number of both
device-independent and device-specific arguments. Each argument takes one
or two elements: the first specifies the device it is intended for, the
type and optionally also the value. If the value is not present in the
first element, the whole second element is the actual value. The last
element of the array is a single NULL. Device-independent arguments
include, for example, NUM_TEAMS and THREAD_LIMIT.
|
||||
|
||||
NUM_TEAMS is positive if GOMP_teams will be called in the body with
|
||||
that value, or 1 if teams construct is not present, or 0, if
|
||||
teams construct does not have num_teams clause and so the choice is
|
||||
@ -1443,14 +1506,10 @@ GOMP_target (int device, void (*fn) (void *), const void *unused,
|
||||
void
|
||||
GOMP_target_ext (int device, void (*fn) (void *), size_t mapnum,
|
||||
void **hostaddrs, size_t *sizes, unsigned short *kinds,
|
||||
unsigned int flags, void **depend, int num_teams,
|
||||
int thread_limit)
|
||||
unsigned int flags, void **depend, void **args)
|
||||
{
|
||||
struct gomp_device_descr *devicep = resolve_device (device);
|
||||
|
||||
(void) num_teams;
|
||||
(void) thread_limit;
|
||||
|
||||
if (flags & GOMP_TARGET_FLAG_NOWAIT)
|
||||
{
|
||||
struct gomp_thread *thr = gomp_thread ();
|
||||
@ -1487,7 +1546,7 @@ GOMP_target_ext (int device, void (*fn) (void *), size_t mapnum,
|
||||
&& !thr->task->final_task)
|
||||
{
|
||||
gomp_create_target_task (devicep, fn, mapnum, hostaddrs,
|
||||
sizes, kinds, flags, depend,
|
||||
sizes, kinds, flags, depend, args,
|
||||
GOMP_TARGET_TASK_BEFORE_MAP);
|
||||
return;
|
||||
}
|
||||
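The comment on GOMP_target_ext describes ARGS as a NULL-terminated array in which each argument occupies one or two elements. The sketch below shows one way a plugin could walk such an array. The bit layout (`EX_ARG_*`) and the function name `ex_find_arg` are assumptions made for illustration; this diff does not show the actual encoding, so treat the masks and shifts as hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout of the first element, for this sketch only.  */
#define EX_ARG_DEVICE_MASK   0x7fu       /* target device, 0 = all devices */
#define EX_ARG_SUBSEQUENT    0x80u       /* value is in the next element */
#define EX_ARG_ID_MASK       0xff00u     /* argument type */
#define EX_ARG_VALUE_SHIFT   16          /* inline value starts here */
#define EX_ARG_NUM_TEAMS     (1u << 8)
#define EX_ARG_THREAD_LIMIT  (2u << 8)

/* Search the NULL-terminated ARGS array for an argument of type ID that
   applies to DEVICE.  Return 1 and store its value in *VALUE when found,
   otherwise return 0 and leave *VALUE untouched.  */
static int
ex_find_arg (void **args, unsigned device, unsigned id, intptr_t *value)
{
  assert (value != NULL);
  if (args == NULL)
    return 0;

  while (*args != NULL)
    {
      uintptr_t head = (uintptr_t) *args++;
      intptr_t val;

      if (head & EX_ARG_SUBSEQUENT)
        /* Two-element form: the second element is the value.  It may
           legitimately be zero, so it is consumed unconditionally.  */
        val = (intptr_t) *args++;
      else
        /* One-element form: the value is packed into the upper bits.  */
        val = (intptr_t) (head >> EX_ARG_VALUE_SHIFT);

      unsigned dev = (unsigned) (head & EX_ARG_DEVICE_MASK);
      if ((head & EX_ARG_ID_MASK) == id && (dev == 0 || dev == device))
        {
          *value = val;
          return 1;
        }
    }
  return 0;
}
```

A caller that passes NULL for ARGS, as GOMP_target_update_ext does when it creates a data task, simply finds no arguments under this scheme.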
@@ -1507,17 +1566,30 @@ GOMP_target_ext (int device, void (*fn) (void *), size_t mapnum,
   void *fn_addr;
   if (devicep == NULL
       || !(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400)
-      || !(fn_addr = gomp_get_target_fn_addr (devicep, fn)))
+      || !(fn_addr = gomp_get_target_fn_addr (devicep, fn))
+      || (devicep->can_run_func && !devicep->can_run_func (fn_addr)))
     {
       gomp_target_fallback_firstprivate (fn, mapnum, hostaddrs, sizes, kinds);
       return;
     }

-  struct target_mem_desc *tgt_vars
-    = gomp_map_vars (devicep, mapnum, hostaddrs, NULL, sizes, kinds, true,
-                     GOMP_MAP_VARS_TARGET);
-  devicep->run_func (devicep->target_id, fn_addr, (void *) tgt_vars->tgt_start);
-  gomp_unmap_vars (tgt_vars, true);
+  struct target_mem_desc *tgt_vars;
+  void *fpc = NULL;
+  if (devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
+    {
+      fpc = gomp_target_unshare_firstprivate (mapnum, hostaddrs, sizes, kinds);
+      tgt_vars = NULL;
+    }
+  else
+    tgt_vars = gomp_map_vars (devicep, mapnum, hostaddrs, NULL, sizes, kinds,
+                              true, GOMP_MAP_VARS_TARGET);
+  devicep->run_func (devicep->target_id, fn_addr,
+                     tgt_vars ? (void *) tgt_vars->tgt_start : hostaddrs,
+                     args);
+  if (tgt_vars)
+    gomp_unmap_vars (tgt_vars, true);
+  else
+    free (fpc);
 }

 /* Host fallback for GOMP_target_data{,_ext} routines. */
||||
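GOMP_target_ext now consults an optional can_run_func hook before launching and falls back to the host when the hook refuses the kernel. The hook is loaded with DLSYM_OPT, so a plugin may leave it out. A reduced sketch of that dispatch pattern follows; `struct ex_device`, `ex_launch` and the sample hooks are hypothetical stand-ins and carry far less state than struct gomp_device_descr.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced device descriptor.  */
struct ex_device
{
  bool (*can_run) (void *fn);          /* optional hook, may be NULL */
  void (*run) (void *fn, void *vars);  /* mandatory hook */
};

enum ex_path { EX_HOST_FALLBACK, EX_DEVICE };

/* Run DEVICE_FN on DEV when possible, otherwise run HOST_FN on the host.
   A missing device, a missing device function, or a can_run hook that
   returns false all lead to the host fallback.  */
static enum ex_path
ex_launch (struct ex_device *dev, void *device_fn,
           void (*host_fn) (void *), void *vars)
{
  if (dev == NULL
      || device_fn == NULL
      || (dev->can_run && !dev->can_run (device_fn)))
    {
      host_fn (vars);
      return EX_HOST_FALLBACK;
    }
  dev->run (device_fn, vars);
  return EX_DEVICE;
}

/* Sample hooks used to exercise ex_launch.  */
static bool
ex_refuse_all (void *fn)
{
  (void) fn;
  return false;
}

static void
ex_device_run (void *fn, void *vars)
{
  (void) fn;
  *(int *) vars += 10;   /* mark that the "device" path ran */
}

static void
ex_host_run (void *vars)
{
  *(int *) vars += 1;    /* mark that the host path ran */
}
```

Testing the hook pointer before calling it keeps older plugins working: a plugin without the hook behaves as if every function can run.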
@ -1547,7 +1619,8 @@ GOMP_target_data (int device, const void *unused, size_t mapnum,
|
||||
struct gomp_device_descr *devicep = resolve_device (device);
|
||||
|
||||
if (devicep == NULL
|
||||
|| !(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400))
|
||||
|| !(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400)
|
||||
|| (devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM))
|
||||
return gomp_target_data_fallback ();
|
||||
|
||||
struct target_mem_desc *tgt
|
||||
@ -1565,7 +1638,8 @@ GOMP_target_data_ext (int device, size_t mapnum, void **hostaddrs,
|
||||
struct gomp_device_descr *devicep = resolve_device (device);
|
||||
|
||||
if (devicep == NULL
|
||||
|| !(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400))
|
||||
|| !(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400)
|
||||
|| devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
|
||||
return gomp_target_data_fallback ();
|
||||
|
||||
struct target_mem_desc *tgt
|
||||
@ -1595,7 +1669,8 @@ GOMP_target_update (int device, const void *unused, size_t mapnum,
|
||||
struct gomp_device_descr *devicep = resolve_device (device);
|
||||
|
||||
if (devicep == NULL
|
||||
|| !(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400))
|
||||
|| !(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400)
|
||||
|| devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
|
||||
return;
|
||||
|
||||
gomp_update (devicep, mapnum, hostaddrs, sizes, kinds, false);
|
||||
@ -1626,7 +1701,7 @@ GOMP_target_update_ext (int device, size_t mapnum, void **hostaddrs,
|
||||
if (gomp_create_target_task (devicep, (void (*) (void *)) NULL,
|
||||
mapnum, hostaddrs, sizes, kinds,
|
||||
flags | GOMP_TARGET_FLAG_UPDATE,
|
||||
depend, GOMP_TARGET_TASK_DATA))
|
||||
depend, NULL, GOMP_TARGET_TASK_DATA))
|
||||
return;
|
||||
}
|
||||
else
|
||||
@ -1646,7 +1721,8 @@ GOMP_target_update_ext (int device, size_t mapnum, void **hostaddrs,
|
||||
}
|
||||
|
||||
if (devicep == NULL
|
||||
|| !(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400))
|
||||
|| !(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400)
|
||||
|| devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
|
||||
return;
|
||||
|
||||
struct gomp_thread *thr = gomp_thread ();
|
||||
@ -1756,7 +1832,7 @@ GOMP_target_enter_exit_data (int device, size_t mapnum, void **hostaddrs,
|
||||
{
|
||||
if (gomp_create_target_task (devicep, (void (*) (void *)) NULL,
|
||||
mapnum, hostaddrs, sizes, kinds,
|
||||
flags, depend,
|
||||
flags, depend, NULL,
|
||||
GOMP_TARGET_TASK_DATA))
|
||||
return;
|
||||
}
|
||||
@ -1777,7 +1853,8 @@ GOMP_target_enter_exit_data (int device, size_t mapnum, void **hostaddrs,
|
||||
}
|
||||
|
||||
if (devicep == NULL
|
||||
|| !(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400))
|
||||
|| !(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400)
|
||||
|| devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
|
||||
return;
|
||||
|
||||
struct gomp_thread *thr = gomp_thread ();
|
||||
@ -1815,7 +1892,8 @@ gomp_target_task_fn (void *data)
|
||||
void *fn_addr;
|
||||
if (devicep == NULL
|
||||
|| !(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400)
|
||||
|| !(fn_addr = gomp_get_target_fn_addr (devicep, ttask->fn)))
|
||||
|| !(fn_addr = gomp_get_target_fn_addr (devicep, ttask->fn))
|
||||
|| (devicep->can_run_func && !devicep->can_run_func (fn_addr)))
|
||||
{
|
||||
ttask->state = GOMP_TARGET_TASK_FALLBACK;
|
||||
gomp_target_fallback_firstprivate (ttask->fn, ttask->mapnum,
|
||||
@@ -1826,22 +1904,36 @@ gomp_target_task_fn (void *data)

       if (ttask->state == GOMP_TARGET_TASK_FINISHED)
         {
-          gomp_unmap_vars (ttask->tgt, true);
+          if (ttask->tgt)
+            gomp_unmap_vars (ttask->tgt, true);
           return false;
         }

-      ttask->tgt
-        = gomp_map_vars (devicep, ttask->mapnum, ttask->hostaddrs, NULL,
-                         ttask->sizes, ttask->kinds, true,
-                         GOMP_MAP_VARS_TARGET);
+      void *actual_arguments;
+      if (devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
+        {
+          ttask->tgt = NULL;
+          ttask->firstprivate_copies
+            = gomp_target_unshare_firstprivate (ttask->mapnum, ttask->hostaddrs,
+                                                ttask->sizes, ttask->kinds);
+          actual_arguments = ttask->hostaddrs;
+        }
+      else
+        {
+          ttask->tgt = gomp_map_vars (devicep, ttask->mapnum, ttask->hostaddrs,
+                                      NULL, ttask->sizes, ttask->kinds, true,
+                                      GOMP_MAP_VARS_TARGET);
+          actual_arguments = (void *) ttask->tgt->tgt_start;
+        }
       ttask->state = GOMP_TARGET_TASK_READY_TO_RUN;

-      devicep->async_run_func (devicep->target_id, fn_addr,
-                               (void *) ttask->tgt->tgt_start, (void *) ttask);
+      devicep->async_run_func (devicep->target_id, fn_addr, actual_arguments,
+                               ttask->args, (void *) ttask);
       return true;
     }
   else if (devicep == NULL
-           || !(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400))
+           || !(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400)
+           || devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
     return false;

   size_t i;
@ -1891,7 +1983,8 @@ omp_target_alloc (size_t size, int device_num)
|
||||
if (devicep == NULL)
|
||||
return NULL;
|
||||
|
||||
if (!(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400))
|
||||
if (!(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400)
|
||||
|| devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
|
||||
return malloc (size);
|
||||
|
||||
gomp_mutex_lock (&devicep->lock);
|
||||
@ -1919,7 +2012,8 @@ omp_target_free (void *device_ptr, int device_num)
|
||||
if (devicep == NULL)
|
||||
return;
|
||||
|
||||
if (!(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400))
|
||||
if (!(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400)
|
||||
|| devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
|
||||
{
|
||||
free (device_ptr);
|
||||
return;
|
||||
@ -1946,7 +2040,8 @@ omp_target_is_present (void *ptr, int device_num)
|
||||
if (devicep == NULL)
|
||||
return 0;
|
||||
|
||||
if (!(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400))
|
||||
if (!(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400)
|
||||
|| devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
|
||||
return 1;
|
||||
|
||||
gomp_mutex_lock (&devicep->lock);
|
||||
@ -1976,7 +2071,8 @@ omp_target_memcpy (void *dst, void *src, size_t length, size_t dst_offset,
|
||||
if (dst_devicep == NULL)
|
||||
return EINVAL;
|
||||
|
||||
if (!(dst_devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400))
|
||||
if (!(dst_devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400)
|
||||
|| dst_devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
|
||||
dst_devicep = NULL;
|
||||
}
|
||||
if (src_device_num != GOMP_DEVICE_HOST_FALLBACK)
|
||||
@ -1988,7 +2084,8 @@ omp_target_memcpy (void *dst, void *src, size_t length, size_t dst_offset,
|
||||
if (src_devicep == NULL)
|
||||
return EINVAL;
|
||||
|
||||
if (!(src_devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400))
|
||||
if (!(src_devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400)
|
||||
|| src_devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
|
||||
src_devicep = NULL;
|
||||
}
|
||||
if (src_devicep == NULL && dst_devicep == NULL)
|
||||
@ -2118,7 +2215,8 @@ omp_target_memcpy_rect (void *dst, void *src, size_t element_size,
|
||||
if (dst_devicep == NULL)
|
||||
return EINVAL;
|
||||
|
||||
if (!(dst_devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400))
|
||||
if (!(dst_devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400)
|
||||
|| dst_devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
|
||||
dst_devicep = NULL;
|
||||
}
|
||||
if (src_device_num != GOMP_DEVICE_HOST_FALLBACK)
|
||||
@ -2130,7 +2228,8 @@ omp_target_memcpy_rect (void *dst, void *src, size_t element_size,
|
||||
if (src_devicep == NULL)
|
||||
return EINVAL;
|
||||
|
||||
if (!(src_devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400))
|
||||
if (!(src_devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400)
|
||||
|| src_devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
|
||||
src_devicep = NULL;
|
||||
}
|
||||
|
||||
@ -2166,7 +2265,8 @@ omp_target_associate_ptr (void *host_ptr, void *device_ptr, size_t size,
|
||||
if (devicep == NULL)
|
||||
return EINVAL;
|
||||
|
||||
if (!(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400))
|
||||
if (!(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400)
|
||||
|| devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
|
||||
return EINVAL;
|
||||
|
||||
gomp_mutex_lock (&devicep->lock);
|
||||
@ -2309,6 +2409,7 @@ gomp_load_plugin_for_device (struct gomp_device_descr *device,
|
||||
{
|
||||
DLSYM (run);
|
||||
DLSYM (async_run);
|
||||
DLSYM_OPT (can_run, can_run);
|
||||
DLSYM (dev2dev);
|
||||
}
|
||||
if (device->capabilities & GOMP_OFFLOAD_CAP_OPENACC_200)
|
||||
|
@ -582,6 +582,7 @@ GOMP_PLUGIN_target_task_completion (void *data)
|
||||
return;
|
||||
}
|
||||
ttask->state = GOMP_TARGET_TASK_FINISHED;
|
||||
free (ttask->firstprivate_copies);
|
||||
gomp_target_task_completion (team, task);
|
||||
gomp_mutex_unlock (&team->task_lock);
|
||||
}
|
||||
@ -594,7 +595,7 @@ bool
|
||||
gomp_create_target_task (struct gomp_device_descr *devicep,
|
||||
void (*fn) (void *), size_t mapnum, void **hostaddrs,
|
||||
size_t *sizes, unsigned short *kinds,
|
||||
unsigned int flags, void **depend,
|
||||
unsigned int flags, void **depend, void **args,
|
||||
enum gomp_target_task_state state)
|
||||
{
|
||||
struct gomp_thread *thr = gomp_thread ();
|
||||
@ -654,6 +655,7 @@ gomp_create_target_task (struct gomp_device_descr *devicep,
|
||||
ttask->devicep = devicep;
|
||||
ttask->fn = fn;
|
||||
ttask->mapnum = mapnum;
|
||||
ttask->args = args;
|
||||
memcpy (ttask->hostaddrs, hostaddrs, mapnum * sizeof (void *));
|
||||
ttask->sizes = (size_t *) &ttask->hostaddrs[mapnum];
|
||||
memcpy (ttask->sizes, sizes, mapnum * sizeof (size_t));
|
||||
|
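gomp_create_target_task stores the copied hostaddrs in the flexible array member at the end of struct gomp_target_task and places the sizes directly behind them in the same allocation. A small sketch of that layout, assuming a hypothetical `struct ex_task` with only the members needed here, and assuming size_t needs no stricter alignment than void * (true on common ABIs):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reduced task record; the real struct gomp_target_task
   has more members, including the new firstprivate_copies and args.  */
struct ex_task
{
  size_t mapnum;
  size_t *sizes;       /* points into the tail of this allocation */
  void *hostaddrs[];   /* flexible array member with MAPNUM entries */
};

/* Allocate a task with room for MAPNUM addresses followed by MAPNUM
   sizes, and copy both arrays in.  Returns NULL if allocation fails.
   The caller releases everything with a single free.  */
static struct ex_task *
ex_task_new (size_t mapnum, void **hostaddrs, const size_t *sizes)
{
  assert (hostaddrs != NULL && sizes != NULL);
  struct ex_task *t
    = malloc (sizeof *t + mapnum * (sizeof (void *) + sizeof (size_t)));
  if (t == NULL)
    return NULL;

  t->mapnum = mapnum;
  memcpy (t->hostaddrs, hostaddrs, mapnum * sizeof (void *));
  /* The sizes start right after the last host address.  */
  t->sizes = (size_t *) &t->hostaddrs[mapnum];
  memcpy (t->sizes, sizes, mapnum * sizeof (size_t));
  return t;
}
```

Note that ttask->args, by contrast, is stored as a plain pointer in the patch and is not copied into the task, so under this scheme the array it points to has to stay valid until the task runs.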
@ -111,6 +111,8 @@ FC = @FC@
|
||||
FCFLAGS = @FCFLAGS@
|
||||
FGREP = @FGREP@
|
||||
GREP = @GREP@
|
||||
HSA_RUNTIME_INCLUDE = @HSA_RUNTIME_INCLUDE@
|
||||
HSA_RUNTIME_LIB = @HSA_RUNTIME_LIB@
|
||||
INSTALL = @INSTALL@
|
||||
INSTALL_DATA = @INSTALL_DATA@
|
||||
INSTALL_PROGRAM = @INSTALL_PROGRAM@
|
||||
@ -155,6 +157,10 @@ PACKAGE_URL = @PACKAGE_URL@
|
||||
PACKAGE_VERSION = @PACKAGE_VERSION@
|
||||
PATH_SEPARATOR = @PATH_SEPARATOR@
|
||||
PERL = @PERL@
|
||||
PLUGIN_HSA = @PLUGIN_HSA@
|
||||
PLUGIN_HSA_CPPFLAGS = @PLUGIN_HSA_CPPFLAGS@
|
||||
PLUGIN_HSA_LDFLAGS = @PLUGIN_HSA_LDFLAGS@
|
||||
PLUGIN_HSA_LIBS = @PLUGIN_HSA_LIBS@
|
||||
PLUGIN_NVPTX = @PLUGIN_NVPTX@
|
||||
PLUGIN_NVPTX_CPPFLAGS = @PLUGIN_NVPTX_CPPFLAGS@
|
||||
PLUGIN_NVPTX_LDFLAGS = @PLUGIN_NVPTX_LDFLAGS@
|
||||
|
@ -1,3 +1,8 @@
|
||||
2016-01-19 Martin Jambor <mjambor@suse.cz>
|
||||
* plugin/libgomp-plugin-intelmic.cpp (GOMP_OFFLOAD_async_run): New
|
||||
unused parameter.
|
||||
(GOMP_OFFLOAD_run): Likewise.
|
||||
|
||||
2015-12-14 Ilya Verbin <ilya.verbin@intel.com>
|
||||
|
||||
* plugin/libgomp-plugin-intelmic.cpp (unregister_main_image): Remove.
|
||||
|
@ -528,7 +528,7 @@ GOMP_OFFLOAD_dev2dev (int device, void *dst_ptr, const void *src_ptr,
|
||||
|
||||
extern "C" void
|
||||
GOMP_OFFLOAD_async_run (int device, void *tgt_fn, void *tgt_vars,
|
||||
void *async_data)
|
||||
void **, void *async_data)
|
||||
{
|
||||
TRACE ("(device = %d, tgt_fn = %p, tgt_vars = %p, async_data = %p)", device,
|
||||
tgt_fn, tgt_vars, async_data);
|
||||
@ -544,7 +544,7 @@ GOMP_OFFLOAD_async_run (int device, void *tgt_fn, void *tgt_vars,
|
||||
}
|
||||
|
||||
extern "C" void
|
||||
GOMP_OFFLOAD_run (int device, void *tgt_fn, void *tgt_vars)
|
||||
GOMP_OFFLOAD_run (int device, void *tgt_fn, void *tgt_vars, void **)
|
||||
{
|
||||
TRACE ("(device = %d, tgt_fn = %p, tgt_vars = %p)", device, tgt_fn, tgt_vars);
|
||||
|
||||
|