nvptx: Support floating point reciprocal instructions

The following patch adds support for PTX's rcp.rn.f32 and rcp.rn.f64
instructions.  Note that the "rcp.rn" forms of this instruction
calculate the fully IEEE compliant result for the reciprocal, unlike
the rcp.approx variants that just provide fast approximations.
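
As a rough host-side illustration of that distinction (not part of the
patch), the C sketch below contrasts a correctly rounded reciprocal with a
deliberately crude one.  The names recip_rn and recip_crude are
hypothetical, and recip_crude is only a stand-in for the idea of a fast
approximation; it does not model the actual accuracy of rcp.approx.

```c
#include <stdint.h>
#include <string.h>

/* Correctly rounded reciprocal: IEEE 754 division rounds to nearest,
   which is the kind of result the rcp.rn forms are described as giving.  */
float
recip_rn (float x)
{
  return 1.0f / x;
}

/* Hypothetical stand-in for an approximate reciprocal: compute the
   exact quotient, then clear the low 12 bits of its bit pattern, so
   the result can differ from the correctly rounded one.  This is an
   assumption for illustration only, not how rcp.approx behaves.  */
float
recip_crude (float x)
{
  float r = 1.0f / x;
  uint32_t bits;

  memcpy (&bits, &r, sizeof bits);   /* Reinterpret without aliasing UB.  */
  bits &= 0xFFFFF000u;               /* Drop the low 12 mantissa bits.  */
  memcpy (&r, &bits, sizeof r);
  return r;
}
```

For an input such as 4.0f both functions return 0.25f, since that result
has no low mantissa bits to lose; for 3.0f the crude version returns a
slightly smaller value than the correctly rounded one.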

This patch has been tested on nvptx-none hosted on x86_64-pc-linux-gnu
with "make" and "make check" with no new regressions.

2020-07-12  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog:

	* config/nvptx/nvptx.md (recip<mode>2): New instruction.

gcc/testsuite/ChangeLog:

	* gcc.target/nvptx/recip-1.c: New test.
Roger Sayle 2020-07-28 15:55:47 +02:00 committed by Tom de Vries
parent 0f4a54ccb8
commit a0d007d67c
2 changed files with 27 additions and 0 deletions

@@ -879,6 +879,15 @@
  ""
  "%.\\tfma%#%t0\\t%0, %1, %2, %3;")

(define_insn "*recip<mode>2"
  [(set (match_operand:SDFM 0 "nvptx_register_operand" "=R")
        (div:SDFM
          (match_operand:SDFM 2 "const_double_operand" "F")
          (match_operand:SDFM 1 "nvptx_register_operand" "R")))]
  "CONST_DOUBLE_P (operands[2])
   && real_identical (CONST_DOUBLE_REAL_VALUE (operands[2]), &dconst1)"
  "%.\\trcp%#%t0\\t%0, %1;")

(define_insn "div<mode>3"
  [(set (match_operand:SDFM 0 "nvptx_register_operand" "=R")
        (div:SDFM (match_operand:SDFM 1 "nvptx_register_operand" "R")

@@ -0,0 +1,18 @@
/* { dg-do assemble } */
/* { dg-options "-O2 -save-temps" } */

double
foo (double x)
{
  return 1.0 / x;
}

float
foof (float x)
{
  return 1.0f / x;
}

/* { dg-final { scan-assembler-times "rcp.rn.f64" 1 } } */
/* { dg-final { scan-assembler-times "rcp.rn.f32" 1 } } */
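
The condition on the *recip<mode>2 pattern only accepts a division whose
numerator is a constant identical to 1.0, which is why the test divides
1.0 and 1.0f by the argument.  A minimal C sketch of that check follows,
assuming that a bitwise comparison is a fair host-side model of the
identity test; the function name numerator_is_one is hypothetical and
does not exist in GCC.

```c
#include <string.h>

/* Hypothetical model of the insn condition: return 1 only when the
   numerator has exactly the same representation as 1.0.  */
int
numerator_is_one (double numerator)
{
  const double one = 1.0;

  /* Compare representations rather than values, so only an exact 1.0
     qualifies.  */
  return memcmp (&numerator, &one, sizeof one) == 0;
}
```

Under this model, 1.0 / x qualifies for the reciprocal pattern, while a
numerator such as 2.0 or -1.0 does not.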