Go to file
Roger Sayle beed3f8f60 nvptx: Transition nvptx backend to STORE_FLAG_VALUE = 1
This patch to the nvptx backend changes the backend's STORE_FLAG_VALUE
from -1 to 1, by using BImode predicates and selp instructions, instead
of set instructions (almost always followed by integer negation).

Historically, it was reasonable (through rare) for backends to use -1
for representing true during the RTL passes.  However with tree-ssa,
GCC now emits lots of code that reads and writes _Bool values, requiring
STORE_FLAG_VALUE=-1 targets to frequently convert 0/-1 pseudos to 0/1
pseudos using integer negation.  Unfortunately, this process prevents
or complicates many optimizations (negate isn't associative with logical
AND, OR and XOR, and interferes with range/vrp/nonzerobits bounds etc.).

The impact of this is that for a relatively simple logical expression
like "return (x==21) && (y==69);", the nvptx backend currently generates:

                mov.u32 %r26, %ar0;
                mov.u32 %r27, %ar1;
                set.u32.eq.u32  %r30, %r26, 21;
                neg.s32 %r31, %r30;
                mov.u32 %r29, %r31;
                set.u32.eq.u32  %r33, %r27, 69;
                neg.s32 %r34, %r33;
                mov.u32 %r32, %r34;
                cvt.u16.u8      %r39, %r29;
                mov.u16 %r36, %r39;
                cvt.u16.u8      %r39, %r32;
                mov.u16 %r37, %r39;
                and.b16 %r35, %r36, %r37;
                cvt.u32.u16     %r38, %r35;
                cvt.u32.u8      %value, %r38;

This patch tweaks nvptx to generate 0/1 values instead, requiring the
same number of instructions, using (BImode) predicate registers and selp
instructions so as to now generate the almost identical:

                mov.u32 %r26, %ar0;
                mov.u32 %r27, %ar1;
                setp.eq.u32     %r31, %r26, 21;
                selp.u32        %r30, 1, 0, %r31;
                mov.u32 %r29, %r30;
                setp.eq.u32     %r34, %r27, 69;
                selp.u32        %r33, 1, 0, %r34;
                mov.u32 %r32, %r33;
                cvt.u16.u8      %r39, %r29;
                mov.u16 %r36, %r39;
                cvt.u16.u8      %r39, %r32;
                mov.u16 %r37, %r39;
                and.b16 %r35, %r36, %r37;
                cvt.u32.u16     %r38, %r35;
                cvt.u32.u8      %value, %r38;

The hidden benefit is that this sequence can (in theory) be optimized
by the RTL passes to eventually generate a much shorter sequence using
an and.pred instruction (just like Nvidia's nvcc compiler).

This patch has been tested nvptx-none with a "make" and "make -k check"
(including newlib) hosted on x86_64-pc-linux-gnu with no new failures.

gcc/ChangeLog:

	* config/nvptx/nvptx.h (STORE_FLAG_VALUE): Change to 1.
	* config/nvptx/nvptx.md (movbi): Use P1 constraint for true.
	(setcc_from_bi): Remove SImode specific pattern.
	(setcc<mode>_from_bi): Provide more general HSDIM pattern.
	(extendbi<mode>2, zeroextendbi<mode>2): Provide instructions
	for sign- and zero-extending BImode predicates to integers.
	(setcc_int<mode>): Remove previous (-1-based) instructions.
	(cstorebi4): Remove BImode to SImode specific expander.
	(cstore<mode>4): Fix indentation.  Expand using setccsi_from_bi.
	(cstore<mode>4): For both integer and floating point modes.
2022-01-04 12:28:02 +01:00
c++tools
config
contrib Daily bump. 2022-01-04 00:16:40 +00:00
fixincludes Adjust VxWorks fixincludes hack for mkdir to work for C++ 2022-01-04 10:27:11 +00:00
gcc nvptx: Transition nvptx backend to STORE_FLAG_VALUE = 1 2022-01-04 12:28:02 +01:00
gnattools
gotools
include
INSTALL
intl
libada
libatomic
libbacktrace
libcc1
libcody
libcpp
libdecnumber
libffi
libgcc
libgfortran
libgo
libgomp libgomp: Fix GOMP_DEVICE_NUM_VAR stringification during offload image load 2022-01-04 17:26:23 +08:00
libiberty
libitm Daily bump. 2022-01-04 00:16:40 +00:00
libobjc
liboffloadmic
libphobos Daily bump. 2022-01-04 00:16:40 +00:00
libquadmath Daily bump. 2022-01-04 00:16:40 +00:00
libsanitizer
libssp
libstdc++-v3
libvtv
lto-plugin
maintainer-scripts
zlib
.dir-locals.el
.gitattributes
.gitignore
ABOUT-NLS
ar-lib
ChangeLog
ChangeLog.jit
ChangeLog.tree-ssa
compile
config-ml.in
config.guess
config.rpath
config.sub
configure
configure.ac
COPYING
COPYING3
COPYING3.LIB
COPYING.LIB
COPYING.RUNTIME
depcomp
install-sh
libtool-ldflags
libtool.m4
lt~obsolete.m4
ltgcc.m4
ltmain.sh
ltoptions.m4
ltsugar.m4
ltversion.m4
MAINTAINERS
Makefile.def
Makefile.in
Makefile.tpl
missing
mkdep
mkinstalldirs
move-if-change
multilib.am
README
symlink-tree
test-driver
ylwrap

This directory contains the GNU Compiler Collection (GCC).

The GNU Compiler Collection is free software.  See the files whose
names start with COPYING for copying permission.  The manuals, and
some of the runtime libraries, are under different terms; see the
individual source files for details.

The directory INSTALL contains copies of the installation information
as HTML and plain text.  The source of this information is
gcc/doc/install.texi.  The installation information includes details
of what is included in the GCC sources and what files GCC installs.

See the file gcc/doc/gcc.texi (together with other files that it
includes) for usage and porting information.  An online readable
version of the manual is in the files gcc/doc/gcc.info*.

See http://gcc.gnu.org/bugs/ for how to report bugs usefully.

Copyright years on GCC source files may be listed using range
notation, e.g., 1987-2012, indicating that every year in the range,
inclusive, is a copyrightable year that could otherwise be listed
individually.