Add flag -fassume-phsa that is on by default. If -fno-assume-phsa
is given, these optimizations are disabled.
With this flag, gccbrig can generate GENERIC that assumes we are
targeting a phsa-runtime based implementation, which allows us
to expose the work-item context accesses to retrieve WI IDs etc.
which helps optimizers.
First optimization that takes advantage of this is to get rid of
the setworkitemid calls whenever we have non-inlined calls that
use IDs internally.
Other optimizations added in this commit:
- expand absoluteid to similar level of simplicity as workitemid.
At the moment absoluteid is the best indexing ID to end up with
WG vectorization.
- propagate ID variables closer to their uses. This is mainly
to avoid known useless casts, which confuse at least scalar
evolution analysis.
- use signed long long for storing IDs. Unsigned integers have
defined wraparound semantics, which confuse at least scalar
evolution analysis, leading to unvectorizable WI loops.
- also refactor some BRIG function generation helpers to brig_function.
- no point in having the wi-loop as a for-loop. It's really
a do...while and SCEV can analyze it just fine still.
- add consts to ptrs etc. in BRIG builtin defs.
Improves optimization opportunities.
- add qualifiers to generated function parameters.
Const and restrict on the hidden local/private pointers,
the arg buffer and the context pointer help some optimizations.
From-SVN: r259957
We didn't preserve additional space for the alloca frame pointers that
are needed to be saved in the alloca space.
Fixes libgomp.c++/target-6.C execution test.
From-SVN: r259942
PR jit/85384
* acx.m4 (GCC_BASE_VER): Remove \$\$ from sed expression.
* configure.ac (gcc-driver-name.h): Honor --with-gcc-major-version
by using gcc_base_ver to generate a gcc_driver_version, and use
it when generating GCC_DRIVER_NAME.
* configure: Regenerate.
* configure: Regenerate.
From-SVN: r259462
segment variables.
PRM specs defines function and module scope group segment variables
as an experimental feature. However, PRM test suite uses and
hcc relies on them. In addition, hcc assumes certain group variable
layout in its dynamic group segment allocation code.
We cannot have global group memory offsets if we want to
both have kernel-specific group segment size and multiple kernels
calling the same functions that use function scope group memory
variables.
Now group segment is handled by separate book keeping of module
scope and function (kernel) offsets. Each function has a "frame"
in the group segment offset to which is given as an argument.
From-SVN: r253233
* brig-builtins.def: Treat HSAIL barrier builtins as
setjmp/longjump style functions.
* brigfrontend/brig-to-generic.cc: Ensure per WI copies of
private variables are aligned too.
* rt/workitems.c: Assume the host runtime allocates the work group
memory.
From-SVN: r253160
* brig-builtins.def: Added a builtin for class_f64.
* builtin-types.def: Added a builtin type needed by class_f64.
* brigfrontend/brig-code-entry-handler.cc
(brig_code_entry_handler::build_address_operand): Fix a bug
with reg+offset addressing on 32b segments. In large mode,
the offset is treated as 32bits unless it's global, readonly or
kernarg address space.
* rt/workitems.c: Removed a leftover comment.
* rt/arithmetic.c (__hsail_class_f32, __hsail_class_f64): Fix the
check for signaling/non-signalling NaN. Add class_f64 default
implementation.
From-SVN: r247576
* configure.tgt: Fix i?86-*-linux* entry.
* rt/sat_arithmetic.c (__hsail_sat_add_u32, __hsail_sat_add_u64,
__hsail_sat_add_s32, __hsail_sat_add_s64): Use __builtin_add_overflow.
(__hsail_sat_sub_u8, __hsail_sat_sub_u16): Remove pointless for overflow
over maximum.
(__hsail_sat_sub_u32, __hsail_sat_sub_u64, __hsail_sat_sub_s32,
__hsail_sat_sub_s64): Use __builtin_sub_overflow.
(__hsail_sat_mul_u32, __hsail_sat_mul_u64, __hsail_sat_mul_s32,
__hsail_sat_mul_s64): Use __builtin_mul_overflow.
* rt/arithmetic.c (__hsail_borrow_u32, __hsail_borrow_u64): Use
__builtin_sub_overflow_p.
(__hsail_carry_u32, __hsail_carry_u64): Use __builtin_add_overflow_p.
* rt/misc.c (__hsail_groupbaseptr, __hsail_kernargbaseptr_u64):
Cast pointers to uintptr_t first before casting to some other integral
type.
* rt/segment.c (__hsail_segmentp_private, __hsail_segmentp_group): Likewise.
* rt/queue.c (__hsail_ldqueuereadindex, __hsail_ldqueuewriteindex,
__hsail_addqueuewriteindex, __hsail_casqueuewriteindex,
__hsail_stqueuereadindex, __hsail_stqueuewriteindex): Cast integral value
to uintptr_t first before casting to pointer.
* rt/workitems.c (__hsail_alloca_pop_frame): Cast memcpy first argument to
void * to avoid warning.
From-SVN: r245080
2017-01-27 Pekka Jaaskelainen <pekka.jaaskelainen@parmance.com>
* configure.ac: Moved the white list of enabling BRIG FE to
libhsail-rt/configure.tgt.
* configure: Regenerated.
* MAINTAINERS: Updated maintainers for BRIG FE and libhsail-rt.
gcc/
* builtin-types.def: Use unsigned_char_type_node for BT_UINT8. Use
uint16_type_node for BT_UINT16.
gcc/brig/
* config-lang.in: Removed stale target-libbrig reference.
libhsail-rt/
* configure.tgt: Moved the white list of supported targets here
from configure.ac. Added i[3456789]86-*-linux* as a supported env
for the BRIG FE.
* README: Added a proper description of what libhsail-rt is.
From-SVN: r244978
contrib/
* update-copyright.py: Add libhsail-rt to self.default_dirs
and call self.add_dir on it. Add Intel Corporation to external
authors.
gcc/
* brig-builtins.def: Update copyright years.
* config/arm/arm_acle_builtins.def: Update copyright years.
gcc/brig/
Update copyright years.
gcc/testsuite/
* brig.dg/dg.exp: Update copyright years.
* lib/brig-dg.exp: Update copyright years.
* lib/brig.exp: Update copyright years.
libhsail-rt/
Update copyright years.
libstdc++-v3/
* libsupc++/eh_atomics.h: Update copyright years.
* testsuite/20_util/unique_ptr/cons/default.cc: Update copyright years.
From-SVN: r244920
PR other/79046
* configure.ac: Add GCC_BASE_VER.
* Makefile.am (gcc_version): Use @get_gcc_base_ver@ instead of cat to
get version from BASE-VER file.
(ACLOCAL_AMFLAGS): Set to -I .. -I ../config .
* aclocal.m4: Regenerated.
* configure: Regenerated.
* Makefile.in: Regenerated.
From-SVN: r244895