gcc/libgomp/testsuite/libgomp.c
Jakub Jelinek 5acef69f9d openmp: Optimize triangular loop logical iterator to actual iterators computation using search for quadratic equation root(s)
This patch implements the optimized logical to actual iterators
computation for triangular loops.

I have a rough implementation using integers, but this one uses floating
point.  There is a small problem: -fopenmp programs aren't linked with
-lm, so the optimization is applied only if the hardware has a sqrt optab
(and it uses the internal function rather than __builtin_sqrt, because it
doesn't need errno handling etc.).

Do you think it is OK this way, or should I use the integral computation
using an inlined isqrt?  We have an inequality of the form
start >= x * t10 + t11 * (((x - 1) * x) / 2)
where t10 and t11 are signed long long values and start is unsigned long long.
The division by 2 is a problem for accuracy in some cases, so
if we do it in integers, we actually need to compute
      long long t12 = 2 * t10 - t11;
      unsigned long long t13 = t12 * t12 + start * 8 * t11;
      unsigned long long isqrt_ = isqrtull (t13);
      long long x = (((long long) isqrt_ - t12) / t11) >> 1;
with careful overflow checking on all the computations before isqrtull
(using the fallback implementation on overflow).
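
The integral variant described above can be sketched as follows.  This is
only an illustration, not the code in omp-expand.c: the names
outer_from_logical and iters_before are hypothetical, isqrtull is a simple
binary-search stand-in for the inlined isqrt mentioned above, and the sketch
assumes t10 > 0, t11 > 0 and inputs small enough that nothing overflows (the
overflow checks and the fallback path are omitted).

```c
#include <assert.h>

/* Hypothetical stand-in for the inlined isqrt: floor (sqrt (n)).  */
static unsigned long long
isqrtull (unsigned long long n)
{
  /* 4294967295 is floor (sqrt (2^64 - 1)), so the root is never larger.  */
  unsigned long long lo = 0, hi = 4294967295ULL;
  while (lo < hi)
    {
      unsigned long long mid = lo + (hi - lo + 1) / 2;
      if (mid <= n / mid)	/* mid * mid <= n, without overflow.  */
	lo = mid;
      else
	hi = mid - 1;
    }
  return lo;
}

/* Number of logical iterations that precede outer iteration X, i.e. the
   right-hand side of the inequality above.  */
static unsigned long long
iters_before (long long x, long long t10, long long t11)
{
  return (unsigned long long) (x * t10 + t11 * (((x - 1) * x) / 2));
}

/* Largest X with iters_before (X) <= START, found as the root of
   t11 * x^2 + t12 * x - 2 * start = 0.
   Assumption: t10 > 0, t11 > 0 and no overflow in t13.  */
static long long
outer_from_logical (unsigned long long start, long long t10, long long t11)
{
  assert (t10 > 0 && t11 > 0);
  long long t12 = 2 * t10 - t11;
  unsigned long long t13
    = (unsigned long long) (t12 * t12) + start * 8 * (unsigned long long) t11;
  unsigned long long isqrt_ = isqrtull (t13);
  return (((long long) isqrt_ - t12) / t11) >> 1;
}
```

For a triangular nest such as
for (i = 0; i < n; i++) for (j = 0; j <= i; j++)
the first outer iteration runs one inner iteration and each later one runs
one more, so t10 = 1 and t11 = 1.  Under those assumptions
outer_from_logical (3, 1, 1) gives outer iteration 2, and the inner offset is
start - iters_before (x, t10, t11).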

2020-07-09  Jakub Jelinek  <jakub@redhat.com>

	* omp-general.h (struct omp_for_data): Add min_inner_iterations
	and factor members.
	* omp-general.c (omp_extract_for_data): Initialize them and remember
	them in OMP_CLAUSE_COLLAPSE_COUNT if needed and restore from there.
	* omp-expand.c (expand_omp_for_init_counts): Fix up computation of
	counts[fd->last_nonrect] if fd->loop.n2 is INTEGER_CST.
	(expand_omp_for_init_vars): For
	fd->first_nonrect + 1 == fd->last_nonrect loops with for now
	INTEGER_CST fd->loop.n2 find quadratic equation roots instead of
	using fallback method when possible.

	* testsuite/libgomp.c/loop-19.c: New test.
	* testsuite/libgomp.c/loop-20.c: New test.
2020-07-09 12:07:17 +02:00
appendix-a
examples-4
affinity-1.c Update copyright years. 2020-01-01 12:51:42 +01:00
affinity-2.c
atomic-1.c
atomic-2.c
atomic-3.c
atomic-4.c
atomic-5.c
atomic-6.c
atomic-10.c
atomic-11.c
atomic-12.c
atomic-13.c
atomic-14.c
atomic-15.c
atomic-16.c
atomic-17.c
autopar-1.c
autopar-2.c
autopar-3.c
autopar-4.c
autopar-5.c
autopar-6.c
autopar-7.c
autopar-8.c
barrier-1.c
c.exp
cancel-for-1.c
cancel-for-2.c
cancel-parallel-1.c
cancel-parallel-2.c
cancel-parallel-3.c
cancel-sections-1.c
collapse-1.c
collapse-2.c
collapse-3.c
copyin-1.c
copyin-2.c
copyin-3.c
critical-1.c
critical-2.c
debug-1.c
depend-1.c
depend-2.c
depend-3.c
depend-4.c
depend-5.c
depend-6.c
depend-7.c
depend-8.c
depend-9.c
depend-10.c
doacross-1.c
doacross-2.c
doacross-3.c
icv-1.c
icv-2.c
lib-1.c
lib-2.c
linear-1.c
lock-1.c
lock-2.c
lock-3.c
loop-1.c
loop-2.c
loop-3.c
loop-4.c
loop-5.c
loop-6.c
loop-7.c
loop-8.c
loop-9.c
loop-10.c
loop-11.c
loop-12.c
loop-16.c
loop-17.c openmp: Non-rectangular loop support for non-composite worksharing loops and distribute 2020-06-27 12:43:36 +02:00
loop-18.c openmp: Non-rectangular loop support for non-composite worksharing loops and distribute 2020-06-27 12:43:36 +02:00
loop-19.c openmp: Optimize triangular loop logical iterator to actual iterators computation using search for quadratic equation root(s) 2020-07-09 12:07:17 +02:00
loop-20.c openmp: Optimize triangular loop logical iterator to actual iterators computation using search for quadratic equation root(s) 2020-07-09 12:07:17 +02:00
nested-1.c
nested-2.c
nested-3.c
nestedfn-1.c
nestedfn-2.c
nestedfn-3.c
nestedfn-4.c
nestedfn-5.c
nestedfn-6.c
nqueens-1.c
omp_hello.c
omp_matvec.c
omp_orphan.c
omp_reduction.c
omp_workshare1.c
omp_workshare2.c
omp_workshare3.c
omp_workshare4.c
omp-loop01.c
omp-loop02.c
omp-loop03.c
omp-nested-1.c
omp-nested-2.c
omp-nested-3.c
omp-parallel-for.c
omp-parallel-if.c
omp-single-1.c
omp-single-2.c
omp-single-3.c
ordered-1.c
ordered-2.c
ordered-3.c
ordered-5.c
parallel-1.c
parloops-exit-first-loop-alt-2.c
parloops-exit-first-loop-alt-3.c
parloops-exit-first-loop-alt-4.c
parloops-exit-first-loop-alt-5.c
parloops-exit-first-loop-alt-6.c
parloops-exit-first-loop-alt-7.c
parloops-exit-first-loop-alt.c
pr24455-1.c
pr24455.c
pr26171.c
pr26943-1.c
pr26943-2.c
pr26943-3.c
pr26943-4.c
pr29947-1.c
pr29947-2.c
pr30494.c
pr32362-1.c
pr32362-2.c
pr32362-3.c
pr32468.c
pr33880.c
pr34513.c
pr35130.c
pr35196.c
pr35549.c
pr35625.c
pr36802-1.c
pr36802-2.c
pr36802-3.c
pr38650.c
pr39154.c
pr39591-1.c
pr39591-2.c
pr39591-3.c
pr42029.c
pr42942.c
pr43893.c
pr46032-2.c
pr46032.c
pr46193.c
pr46886.c
pr48591.c
pr49897-1.c
pr49897-2.c
pr49898-1.c
pr49898-2.c
pr52547.c
pr58392.c
pr58756.c
pr61200.c
pr64734.c
pr66133.c
pr66714.c
pr68960.c
pr69110.c
pr69805.c
pr70680-1.c
pr70680-2.c
pr79940.c
pr80394.c
pr80809-1.c
pr80809-2.c
pr80809-3.c
pr80853.c
pr81687-1.c
pr81687-2.c
pr86416-1.c re PR middle-end/86416 ([OpenMP] Offloading - better lto1 error message if mode not supported on offloading target) 2019-12-19 00:27:28 +01:00
pr86416-2.c re PR middle-end/86416 ([OpenMP] Offloading - better lto1 error message if mode not supported on offloading target) 2019-12-19 00:27:28 +01:00
pr86660.c
pr89002.c
pr90779.c
pr90811.c
pr93566.c tree-nested: Fix handling of *reduction clauses with C array sections [PR93566] 2020-03-15 01:27:40 +01:00
priority.c
private-1.c
reduction-1.c
reduction-2.c
reduction-3.c
reduction-4.c
reduction-5.c
reduction-6.c
reduction-7.c
reduction-8.c
reduction-9.c
reduction-10.c
reduction-11.c
reduction-12.c
reduction-13.c
reduction-14.c
reduction-15.c
scan-1.c
scan-2.c
scan-3.c
scan-4.c
scan-5.c
scan-6.c
scan-7.c
scan-8.c
scan-9.c
scan-10.c
scan-11.c
scan-12.c
scan-13.c
scan-14.c
scan-15.c
scan-16.c
scan-17.c
scan-18.c
scan-19.c
scan-20.c
scan-21.c
scan-22.c
sections-1.c
sections-2.c
shared-1.c
shared-2.c
shared-3.c
simd-1.c
simd-2.c
simd-3.c
simd-4.c
simd-5.c
simd-6.c
simd-7.c
simd-8.c
simd-9.c
simd-10.c
simd-11.c
simd-12.c
simd-13.c
single-1.c
single-2.c
sort-1.c Update copyright years. 2020-01-01 12:51:42 +01:00
static-chunk-size-one.c
switch-conversion-2.c
switch-conversion.c
target-3.c
target-4.c
target-5.c
target-6.c
target-7.c
target-8.c
target-9.c
target-11.c
target-12.c
target-14.c
target-15.c
target-16.c
target-17.c
target-18.c
target-19.c
target-20.c
target-21.c
target-22.c
target-23.c
target-24.c
target-25.c
target-26.c
target-27.c
target-28.c
target-29.c
target-30.c
target-31.c
target-32.c
target-33.c openmp: ignore nowait if async execution is unsupported [PR93481] 2020-02-13 10:18:31 +01:00
target-34.c openmp: ignore nowait if async execution is unsupported [PR93481] 2020-02-13 10:18:31 +01:00
target-35.c
target-36.c
target-37.c
target-38.c openmp: Optimize DECL_IN_CONSTANT_POOL vars in target regions 2020-02-09 08:17:10 +01:00
target-39.c openmp: Implement discovery of implicit declare target to clauses 2020-05-12 09:17:09 +02:00
target-critical-1.c
target-link-1.c Fix OpenMP offload handling for target-link variables for nvptx (PR81689) 2020-03-24 15:13:56 +01:00
target-print-1.c
target-teams-1.c
task-1.c
task-2.c
task-3.c
task-4.c
task-5.c
task-reduction-1.c
task-reduction-2.c
task-reduction-3.c
teams-1.c
teams-2.c
thread-limit-1.c
thread-limit-2.c
thread-limit-3.c
thread-limit-4.c
thread-limit-5.c
udr-2.c
udr-3.c
uns-outer-4.c
vla-1.c