; Options for the NVPTX port
; Copyright (C) 2014-2022 Free Software Foundation, Inc.
;
; This file is part of GCC.
;
; GCC is free software; you can redistribute it and/or modify it under
; the terms of the GNU General Public License as published by the Free
; Software Foundation; either version 3, or (at your option) any later
; version.
;
; GCC is distributed in the hope that it will be useful, but WITHOUT ANY
; WARRANTY; without even the implied warranty of MERCHANTABILITY or
; FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
; for more details.
;
; You should have received a copy of the GNU General Public License
; along with GCC; see the file COPYING3. If not see
; <http://www.gnu.org/licenses/>.

; It's not clear whether this was ever built/tested/used, so this is no longer
; exposed to the user.
;m32
;Target RejectNegative InverseMask(ABI64)
;Generate code for a 32-bit ABI.

m64
Target RejectNegative Mask(ABI64)
Ignored, but preserved for backward compatibility. Only 64-bit ABI is
supported.

mmainkernel
Target RejectNegative
Link in code for a __main kernel.

moptimize
Target Var(nvptx_optimize) Init(-1)
Optimize partition neutering.

msoft-stack
Target Mask(SOFT_STACK)
Use custom stacks instead of local memory for automatic storage.

msoft-stack-reserve-local=
Target Joined RejectNegative UInteger Var(nvptx_softstack_size) Init(128)
Specify size of .local memory used for stack when the exact amount is not known.

muniform-simt
Target Mask(UNIFORM_SIMT)
Generate code that can keep local state uniform across all lanes.

mgomp
Target Mask(GOMP)
Generate code for OpenMP offloading: enables -msoft-stack and -muniform-simt.

misa=
Target RejectNegative ToLower Joined Enum(ptx_isa) Var(ptx_isa_option) Init(PTX_ISA_SM30)
Specify the PTX ISA target architecture to use.

march=
Target RejectNegative Joined Alias(misa=)
Alias:

[nvptx] Add march-map
Say we have an sm_50 board, and we want to run a benchmark using the highest
possible march setting.
Currently there's march=sm_30, march=sm_35, march=sm_53, but no march=sm_50.
So, we'd need to pick march=sm_35.
Likewise, for a test script that handles multiple boards, we'd need a mapping
from native board sm_xx to march, which might have to be updated with newer
gcc releases.
Add an option march-map, such that we can just specify march-map=sm_50, and
let the compiler map this to the appropriate march.
The option is implemented as a list of aliases, so the help output gains a
somewhat lengthy list (17 lines in total):
...
$ gcc --help=target
...
-march-map=sm_30 Same as -misa=sm_30.
-march-map=sm_32 Same as -misa=sm_30.
...
-march-map=sm_87 Same as -misa=sm_80.
-march-map=sm_90 Same as -misa=sm_80.
...
This implementation was chosen in the hope that it'll be easier if
we end up with some misa multilib.
It would be nice to have the mapping list generated from an updated
nvptx-sm.def, but for now it's spelled out in nvptx.opt.
Tested on nvptx.
gcc/ChangeLog:
2022-03-29 Tom de Vries <tdevries@suse.de>
PR target/104714
* config/nvptx/nvptx.opt (march-map=*): Add aliases.
gcc/testsuite/ChangeLog:
2022-03-29 Tom de Vries <tdevries@suse.de>
PR target/104714
* gcc.target/nvptx/march-map.c: New test.
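
The alias list appears to follow one rule: pick the highest supported -misa= value that does not exceed the board's sm_xx. Below is a minimal C sketch of that rule. It assumes the supported values are exactly the alias targets that occur in this file (sm_30, sm_35, sm_53, sm_70, sm_75, sm_80). The names `supported_sm` and `map_march` are hypothetical and not part of GCC, which spells the mapping out as aliases instead of computing it.

```c
#include <stddef.h>

/* Assumed set of -misa= values, taken from the alias targets in the
   march-map list; ascending order matters for the loop below.  */
static const int supported_sm[] = { 30, 35, 53, 70, 75, 80 };

/* Return the highest supported sm value that does not exceed BOARD_SM,
   or -1 if BOARD_SM is below the lowest supported value.  */
int
map_march (int board_sm)
{
  int best = -1;
  for (size_t i = 0; i < sizeof supported_sm / sizeof supported_sm[0]; i++)
    if (supported_sm[i] <= board_sm)
      best = supported_sm[i];
  return best;
}
```

With this sketch, `map_march (50)` yields 35, which matches the `march-map=sm_50` alias to `misa=sm_35`.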

march-map=sm_30
Target RejectNegative Alias(misa=,sm_30)

march-map=sm_32
Target RejectNegative Alias(misa=,sm_30)

march-map=sm_35
Target RejectNegative Alias(misa=,sm_35)

march-map=sm_37
Target RejectNegative Alias(misa=,sm_35)

march-map=sm_50
Target RejectNegative Alias(misa=,sm_35)

march-map=sm_52
Target RejectNegative Alias(misa=,sm_35)

march-map=sm_53
Target RejectNegative Alias(misa=,sm_53)

march-map=sm_60
Target RejectNegative Alias(misa=,sm_53)

march-map=sm_61
Target RejectNegative Alias(misa=,sm_53)

march-map=sm_62
Target RejectNegative Alias(misa=,sm_53)

march-map=sm_70
Target RejectNegative Alias(misa=,sm_70)

march-map=sm_72
Target RejectNegative Alias(misa=,sm_70)

march-map=sm_75
Target RejectNegative Alias(misa=,sm_75)

march-map=sm_80
Target RejectNegative Alias(misa=,sm_80)

march-map=sm_86
Target RejectNegative Alias(misa=,sm_80)

march-map=sm_87
Target RejectNegative Alias(misa=,sm_80)

march-map=sm_90
Target RejectNegative Alias(misa=,sm_80)

Enum
Name(ptx_version) Type(int)
Known PTX ISA versions (for use with the -mptx= option):

EnumValue
Enum(ptx_version) String(3.1) Value(PTX_VERSION_3_1)

EnumValue
Enum(ptx_version) String(6.0) Value(PTX_VERSION_6_0)

EnumValue
Enum(ptx_version) String(6.3) Value(PTX_VERSION_6_3)

EnumValue
Enum(ptx_version) String(7.0) Value(PTX_VERSION_7_0)

EnumValue
Enum(ptx_version) String(_) Value(PTX_VERSION_default)

mptx=
Target RejectNegative ToLower Joined Enum(ptx_version) Var(ptx_version_option)
Specify the PTX ISA version to use.

[nvptx] Initialize ptx regs
With the nvptx target, driver version 510.47.03 and a GT 1030 board, we run into:
...
FAIL: gcc.c-torture/execute/pr53465.c -O1 execution test
FAIL: gcc.c-torture/execute/pr53465.c -O2 execution test
FAIL: gcc.c-torture/execute/pr53465.c -O3 -g execution test
...
while the test-cases pass with nvptx-none-run -O0.
The problem is that the generated ptx contains a read from an uninitialized
ptx register, and the driver JIT doesn't handle this well.
For -O2 and -O3, we can get rid of the FAIL using --param
logical-op-non-short-circuit=0. But not for -O1.
At -O1, the test-case minimizes to:
...
void __attribute__((noinline, noclone))
foo (int y) {
  int c;
  for (int i = 0; i < y; i++)
    {
      int d = i + 1;
      if (i && d <= c)
        __builtin_abort ();
      c = d;
    }
}
int main () {
  foo (2); return 0;
}
...
Note that the test-case does not contain an uninitialized use. In the first
iteration, i is 0 and consequently c is not read. In the second iteration, c
is read, but by that time it's already initialized by 'c = d' from the first
iteration.
AFAICT the problem is introduced as follows: the conditional use of c in the
loop body is translated into an unconditional use of c in the loop header:
...
# c_1 = PHI <c_4(D)(2), c_9(6)>
...
into which forwprop1 then propagates the 'c_9 = d_7' assignment, giving:
...
# c_1 = PHI <c_4(D)(2), d_7(6)>
...
which ends up being translated by expand into an unconditional:
...
(insn 13 12 0 (set (reg/v:SI 22 [ c ])
        (reg/v:SI 23 [ d ])) -1
     (nil))
...
at the start of the loop body, creating an uninitialized read of d on the
path from loop entry.
By disabling coalesce_ssa_name, we get the more usual copies on the incoming
edges. The copy on the loop entry path still does an uninitialized read, but
that one's now initialized by init-regs. The test-case passes, also when
disabling init-regs, so it's possible that the JIT driver doesn't object to
this type of uninitialized read.
Now that we characterized the problem to some degree, we need to fix this,
because either:
- we're violating an undocumented ptx invariant, and this is a compiler bug,
or
- this is a driver JIT bug and we need to work around it.
There are essentially two strategies to address this:
- stop the compiler from creating uninitialized reads
- patch up uninitialized reads using additional initialization
The former will probably involve:
- making some optimizations more conservative in the presence of
uninitialized reads, and
- disabling some other optimizations (where making them more conservative is
not possible, or cannot easily be achieved).
This will probably have a cost penalty for code that does not suffer from
the original problem.
The latter has the problem that it may paper over uninitialized reads
in the source code, or indeed over ones that were incorrectly introduced
by the compiler. But it has the advantage that it allows for the problem to
be addressed at a single location.
There's an existing pass, init-regs, which implements a form of the latter,
but it doesn't work for this example because it only inserts additional
initialization for uses that have no reaching definition at all.
Fix this by adding initialization of uninitialized ptx regs in reorg.
Control the new functionality using -minit-regs=<0|1|2|3>, meaning:
- 0: disabled.
- 1: add initialization of all regs at the entry bb
- 2: add initialization of uninitialized regs at the entry bb
- 3: add initialization of uninitialized regs close to the use
and defaulting to 3.
Tested on nvptx.
gcc/ChangeLog:
2022-02-17 Tom de Vries <tdevries@suse.de>
PR target/104440
* config/nvptx/nvptx.cc (workaround_uninit_method_1)
(workaround_uninit_method_2, workaround_uninit_method_3)
(workaround_uninit): New function.
(nvptx_reorg): Use workaround_uninit.
* config/nvptx/nvptx.opt (minit-regs): New option.

minit-regs=
Target Var(nvptx_init_regs) IntegerRange(0, 3) Joined UInteger Init(3)
Initialize ptx registers.

mptx-comment
Target Var(nvptx_comment) Init(1) Undocumented

[nvptx] Use .alias directive for mptx >= 6.3
Starting with ptx isa version 6.3, a ptx directive .alias is available.
Use this directive to support symbol aliases, as far as possible.
The alias support is off by default. It can be turned on using a switch
-malias.
Furthermore, for pre-sm_75, it's not effective unless the ptx version is
bumped to 6.3 or higher using -mptx (given that the default for pre-sm_75 is
6.0).
The alias support has the following limitations.
Only function aliases are supported.
Weak aliases are not supported. That is, if I disable the check in
nvptx_asm_output_def_from_decls that disallows this, a weak alias is emitted
and parsed by the driver. But the test gcc.dg/globalalias.c starts failing,
with the behaviour matching the comment about "weird behavior of AIX's .set
pseudo-op": a weak alias may resolve to different functions in different
files.
Aliases to weak symbols are not supported (see gcc.dg/localalias.c). This is
currently not prohibited by the compiler, but with the driver link we run
into: "error: Function test with .weak scope cannot be aliased".
Aliases to aliases are not supported (see libgomp.c-c++-common/pr96390.c).
This is currently not prohibited by the compiler, but with the driver link we
run into: "Internal error: alias to unknown symbol".
Unreferenced aliases are not emitted (these can occur, for instance, when
inlining a call to an alias). This avoids driver link error "Internal error:
reference to deleted section".
When enabling malias by default, libgomp detects alias support and
consequently libgomp.a will contain a few uses of .alias. This however
results in aforementioned "Internal error: reference to deleted section" in
many test-cases. Either there's some error with how .alias is used, or
there's a driver bug. While this issue is not resolved, we keep malias
off-by-default.
At some point we may add support in the nvptx-tools linker for symbol
aliases, and define, for instance, malias=ptx and malias=ld to choose between
the two in the compiler.
One example where this support is useful is the OvO (OpenMP vs Offload)
testsuite. The testsuite passes already at -O2. But at -O0, there are errors
in some c++ test-cases due to missing symbol alias support. By compiling with
-malias, the whole testsuite passes also at -O0.
This patch causes a regression:
...
-PASS: gcc.dg/pr60797.c (test for errors, line 4)
+FAIL: gcc.dg/pr60797.c (test for errors, line 4)
...
The test-case is skipped for effective target alias, and both without and with
this patch the nvptx target is considered to not support it, so the test-case is
executed. The test-case expects an error message along the lines of "alias
definitions not supported in this configuration", but instead we run into:
...
gcc.dg/pr60797.c:4:12: error: foo aliased to undefined symbol
...
This is probably due to the fact that the nvptx backend now defines macros
ASM_OUTPUT_DEF and ASM_OUTPUT_DEF_FROM_DECLS, so from the point of view of the
common part of the compiler, aliases are supported.
gcc/ChangeLog:
2022-03-18 Tom de Vries <tdevries@suse.de>
PR target/104957
* config/nvptx/nvptx-protos.h (nvptx_asm_output_def_from_decls): Declare.
* config/nvptx/nvptx.cc (write_fn_proto_1): Don't add function marker
for alias.
(SET_ASM_OP, NVPTX_ASM_OUTPUT_DEF): New macro def.
(nvptx_asm_output_def_from_decls): New function.
* config/nvptx/nvptx.h (ASM_OUTPUT_DEF): New macro def, define to
gcc_unreachable ().
(ASM_OUTPUT_DEF_FROM_DECLS): New macro def, define to
nvptx_asm_output_def_from_decls.
* config/nvptx/nvptx.opt (malias): New opt.
gcc/testsuite/ChangeLog:
2022-03-18 Tom de Vries <tdevries@suse.de>
PR target/104957
* gcc.target/nvptx/alias-1.c: New test.
* gcc.target/nvptx/alias-2.c: New test.
* gcc.target/nvptx/alias-3.c: New test.
* gcc.target/nvptx/alias-4.c: New test.
* gcc.target/nvptx/nvptx.exp
(check_effective_target_runtime_ptx_isa_version_6_3): New proc.
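
For orientation, here is a minimal C sketch of the kind of alias that fits within the limitations listed above: a non-weak function alias whose target is a non-weak, non-alias function defined in the same translation unit, and which is actually referenced. The names `foo` and `bar` are hypothetical and are not taken from the new test-cases. On nvptx this would presumably need -malias, plus -mptx=6.3 or higher for pre-sm_75 targets, as described above.

```c
/* Hypothetical example; not copied from gcc.target/nvptx/alias-*.c.  */
int
foo (void)
{
  return 42;
}

/* bar is a second name for foo: a plain (non-weak) function alias to a
   function that is itself neither weak nor an alias.  */
int bar (void) __attribute__ ((alias ("foo")));
```

A caller that invokes `bar ()` keeps the alias referenced, which matters here because unreferenced aliases are not emitted.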

malias
Target Var(nvptx_alias) Init(0) Undocumented

mexperimental
Target Var(nvptx_experimental) Init(0) Undocumented