8a8efad098
These test cases use directives similar to:

    /* { dg-additional-options "-save-temps" } */
    /* { dg-final { scan-assembler-times "bar.sync" 2 } } */

This expects to scan the PTX offloading compilation assembler code (not host
code!), expecting that nvptx offloading code assembly is produced after the
host code, and thus overwrites the latter file.  (Yes, that's certainly
ugly/fragile...)

..., and this broke with recent commit 1dedc12d18 "revamp dump and aux output
names" plus fix-up commit efc16503ca "handle dumpbase in offloading, adjust
testsuite" (short summary: file names changed), so let's finally make that
robust.

	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/pr85381-2.c: Replace fragile
	'scan-assembler' with 'scan-offload-rtl'.
	* testsuite/libgomp.oacc-c-c++-common/pr85381-3.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/pr85381-4.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/pr85381-5.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/pr85381.c: Likewise.

28 lines
681 B
C
/* { dg-do run { target openacc_nvidia_accel_selected } }
   { dg-skip-if "" { *-*-* } { "*" } { "-O2" } } */
/* { dg-additional-options "-foffload=-fdump-rtl-mach" } */

#define n 1024

int
main (void)
{
  #pragma acc parallel
  {
    #pragma acc loop worker
    for (int i = 0; i < n; i++)
      ;

    #pragma acc loop worker
    for (int i = 0; i < n; i++)
      ;
  }

  return 0;
}

/* Atm, %ntid.y is broadcast from one loop to the next, so there are 2 bar.syncs
   for that (the other two are there for the same reason as in pr85381-2.c).
   Todo: Recompute %ntid.y instead of broadcasting it. */

/* { dg-final { scan-offload-rtl-dump-times "nvptx_barsync" 4 "mach" } } */