Roger Sayle c3ed9e0d6e Improved Scalar-To-Vector (STV) support for TImode to V1TImode on x86_64.
This patch upgrades x86_64's scalar-to-vector (STV) pass to more
aggressively transform 128-bit scalar TImode operations into vector
V1TImode operations performed on SSE registers.  TImode functionality
already exists in STV, but only for move operations.  This change
brings support for logical operations (AND, IOR, XOR, NOT and ANDN)
and comparisons.

The effect of these changes is conveniently demonstrated by the new
sse4_1-stv-5.c test case:

__int128 a[16];
__int128 b[16];
__int128 c[16];

void foo()
{
  for (unsigned int i=0; i<16; i++)
    a[i] = b[i] & ~c[i];
}

which when currently compiled on mainline with -O2 -msse4 produces:

foo:    xorl    %eax, %eax
.L2:    movq    c(%rax), %rsi
        movq    c+8(%rax), %rdi
        addq    $16, %rax
        notq    %rsi
        notq    %rdi
        andq    b-16(%rax), %rsi
        andq    b-8(%rax), %rdi
        movq    %rsi, a-16(%rax)
        movq    %rdi, a-8(%rax)
        cmpq    $256, %rax
        jne     .L2
        ret

but with this patch now produces:

foo:    xorl    %eax, %eax
.L2:    movdqa  c(%rax), %xmm0
        pandn   b(%rax), %xmm0
        addq    $16, %rax
        movaps  %xmm0, a-16(%rax)
        cmpq    $256, %rax
        jne     .L2
        ret
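
For readers more comfortable with intrinsics than with assembly, the
vectorized loop body corresponds roughly to the hand-written sketch
below.  It is an illustration, not compiler output: pandn computes
(~first) & second, which is what _mm_andnot_si128 expresses.  The
helper names foo_scalar, foo_vector and forms_agree are hypothetical
and do not appear in the patch or its test cases; the sketch assumes
x86_64 with GCC or Clang, where __int128 is 16-byte aligned.

```cpp
#include <cassert>
#include <cstring>
#include <emmintrin.h>  // SSE2: _mm_load_si128, _mm_andnot_si128, _mm_store_si128

// Same arrays as the test case.
__int128 a[16];
__int128 b[16];
__int128 c[16];

// Scalar reference (hypothetical helper): the source-level semantics.
void foo_scalar(__int128 *out)
{
  for (unsigned int i = 0; i < 16; i++)
    out[i] = b[i] & ~c[i];
}

// Hand-written vector form (hypothetical helper): one aligned 128-bit
// load per operand, one and-not, one aligned store.
void foo_vector()
{
  for (unsigned int i = 0; i < 16; i++)
    {
      __m128i vc = _mm_load_si128(reinterpret_cast<const __m128i *>(&c[i]));
      __m128i vb = _mm_load_si128(reinterpret_cast<const __m128i *>(&b[i]));
      // _mm_andnot_si128(x, y) computes (~x) & y, i.e. b[i] & ~c[i] here.
      _mm_store_si128(reinterpret_cast<__m128i *>(&a[i]),
                      _mm_andnot_si128(vc, vb));
    }
}

// Fills b and c with arbitrary bit patterns and reports whether the
// scalar and vector forms produce identical results.
bool forms_agree()
{
  for (unsigned int i = 0; i < 16; i++)
    {
      unsigned __int128 hi = 0x0123456789abcdefULL * (i + 1);
      unsigned __int128 lo = 0xfedcba9876543210ULL ^ i;
      b[i] = static_cast<__int128>((hi << 64) | lo);
      c[i] = static_cast<__int128>((lo << 64) | hi);
    }
  __int128 expected[16];
  foo_scalar(expected);
  foo_vector();
  return std::memcmp(a, expected, sizeof a) == 0;
}

// Convenience wrapper: aborts if the two forms ever disagree.
void self_check()
{
  assert(forms_agree());
}
```

Calling self_check() from a small driver is enough to confirm that
the and-not operand order matches the scalar expression.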

Technically, the STV pass is implemented by three C++ classes, a common
abstract base class "scalar_chain" that contains common functionality,
and two derived classes: general_scalar_chain (which handles SI and
DI modes) and timode_scalar_chain (which handles TI modes).  As
mentioned previously, because only TI mode moves were handled, the
two worker classes behaved significantly differently.  These changes
bring the functionality of these two classes closer together, which
is reflected by refactoring more shared code from general_scalar_chain
to the parent scalar_chain and reusing it from timode_scalar_chain.
There still remain significant differences (and simplifications) so
the existing division of classes (as specializations) continues to
make sense.
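
The class layout described above can be pictured with the toy model
below.  It is a sketch only, not the code in i386-features.h: operands
are plain strings rather than RTL, convert_binary is a hypothetical
helper invented for illustration, and the V2DI/V4SI choice for the
general chain is an assumption not stated in this commit message.
Only the class names, the n_sse_to_integer and n_integer_to_sse
fields, and the private virtual convert_op come from the ChangeLog.

```cpp
#include <cassert>
#include <string>

// Abstract parent: holds state and logic shared by both chain kinds.
class scalar_chain
{
public:
  virtual ~scalar_chain () = default;

  // Counters that this patch moves up from general_scalar_chain.
  unsigned n_sse_to_integer = 0;
  unsigned n_integer_to_sse = 0;

  // Hypothetical shared helper: the common code lives in the parent
  // and defers only the mode-specific operand conversion to convert_op.
  std::string convert_binary (const std::string &code,
                              const std::string &lhs,
                              const std::string &rhs)
  {
    return code + "(" + convert_op (lhs) + ", " + convert_op (rhs) + ")";
  }

private:
  // Private virtual: callable only from the parent, overridden by
  // each specialization.
  virtual std::string convert_op (const std::string &operand) = 0;
};

// Handles SI and DI modes (vector mode names are an assumption).
class general_scalar_chain : public scalar_chain
{
public:
  explicit general_scalar_chain (const std::string &scalar_mode)
    : vector_mode_ (scalar_mode == "DI" ? "V2DI" : "V4SI") {}

private:
  std::string vector_mode_;

  std::string convert_op (const std::string &operand) override
  {
    return vector_mode_ + ":" + operand;
  }
};

// Handles TI mode, always converting to V1TI.
class timode_scalar_chain : public scalar_chain
{
private:
  std::string convert_op (const std::string &operand) override
  {
    return "V1TI:" + operand;
  }
};
```

In this model, timode_scalar_chain ().convert_binary ("and", "b", "c")
yields "and(V1TI:b, V1TI:c)": the parent drives the conversion and
each derived class supplies only what differs.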

2022-07-11  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	* config/i386/i386-features.h (scalar_chain): Add fields
	insns_conv, n_sse_to_integer and n_integer_to_sse to this
	parent class, moved from general_scalar_chain.
	(scalar_chain::convert_compare): Protected method moved
	from general_scalar_chain.
	(mark_dual_mode_def): Make protected, not private virtual.
	(scalar_chain::convert_op): New private virtual method.

	(general_scalar_chain::general_scalar_chain): Simplify constructor.
	(general_scalar_chain::~general_scalar_chain): Delete destructor.
	(general_scalar_chain): Move insns_conv, n_sse_to_integer and
	n_integer_to_sse fields to parent class, scalar_chain.
	(general_scalar_chain::mark_dual_mode_def): Delete prototype.
	(general_scalar_chain::convert_compare): Delete prototype.

	(timode_scalar_chain::compute_convert_gain): Remove simplistic
	implementation, convert to a method prototype.
	(timode_scalar_chain::mark_dual_mode_def): Delete prototype.
	(timode_scalar_chain::convert_op): Prototype new virtual method.

	* config/i386/i386-features.cc (scalar_chain::scalar_chain):
	Allocate insns_conv and initialize n_sse_to_integer and
	n_integer_to_sse fields in constructor.
	(scalar_chain::~scalar_chain): Free insns_conv in destructor.

	(general_scalar_chain::general_scalar_chain): Delete
	constructor, now defined in the class declaration.
	(general_scalar_chain::~general_scalar_chain): Delete destructor.

	(scalar_chain::mark_dual_mode_def): Renamed from
	general_scalar_chain::mark_dual_mode_def.
	(timode_scalar_chain::mark_dual_mode_def): Delete.
	(scalar_chain::convert_compare): Renamed from
	general_scalar_chain::convert_compare.

	(timode_scalar_chain::compute_convert_gain): New method to
	determine the gain from converting a TImode chain to V1TImode.
	(timode_scalar_chain::convert_op): New method to convert an
	operand from TImode to V1TImode.

	(timode_scalar_chain::convert_insn) <case REG>: Only PUT_MODE
	on REG_EQUAL notes that were originally TImode (not CONST_INT).
	Handle AND, ANDN, XOR, IOR, NOT and COMPARE.
	(timode_mem_p): Helper predicate to check whether operand is a
	memory reference with sufficient alignment for TImode STV.
	(timode_scalar_to_vector_candidate_p): Use convertible_comparison_p
	to check whether COMPARE is convertible.  Handle SET_DESTs that
	are REG_P or MEM_P and SET_SRCs that are REG, CONST_INT,
	CONST_WIDE_INT, MEM, AND, ANDN, IOR, XOR or NOT.

gcc/testsuite/ChangeLog
	* gcc.target/i386/sse4_1-stv-2.c: New test case, pand.
	* gcc.target/i386/sse4_1-stv-3.c: New test case, por.
	* gcc.target/i386/sse4_1-stv-4.c: New test case, pxor.
	* gcc.target/i386/sse4_1-stv-5.c: New test case, pandn.
	* gcc.target/i386/sse4_1-stv-6.c: New test case, ptest.
2022-07-11 16:04:46 +01:00
c++tools
config
contrib
fixincludes
gcc Improved Scalar-To-Vector (STV) support for TImode to V1TImode on x86_64. 2022-07-11 16:04:46 +01:00
gnattools
gotools
include Daily bump. 2022-07-07 00:16:46 +00:00
INSTALL
intl
libada
libatomic
libbacktrace Daily bump. 2022-07-09 00:16:54 +00:00
libcc1
libcody
libcpp Daily bump. 2022-07-11 00:16:25 +00:00
libdecnumber
libffi
libgcc
libgfortran
libgo
libgomp Enhance '_Pragma' diagnostics verification in OMP C/C++ test cases 2022-07-11 11:23:33 +02:00
libiberty
libitm
libobjc
liboffloadmic
libphobos Daily bump. 2022-07-07 00:16:46 +00:00
libquadmath
libsanitizer libsanitizer: Cherry-pick 5d8077565e41 from upstream 2022-07-07 10:19:58 +08:00
libssp
libstdc++-v3 Daily bump. 2022-07-10 00:16:23 +00:00
libvtv
lto-plugin Daily bump. 2022-07-08 00:16:22 +00:00
maintainer-scripts
zlib
.dir-locals.el
.gitattributes
.gitignore
ABOUT-NLS
ar-lib
ChangeLog
ChangeLog.jit
ChangeLog.tree-ssa
compile
config-ml.in
config.guess
config.rpath
config.sub
configure
configure.ac
COPYING
COPYING3
COPYING3.LIB
COPYING.LIB
COPYING.RUNTIME
depcomp
install-sh
libtool-ldflags
libtool.m4
lt~obsolete.m4
ltgcc.m4
ltmain.sh
ltoptions.m4
ltsugar.m4
ltversion.m4
MAINTAINERS
Makefile.def
Makefile.in
Makefile.tpl
missing
mkdep
mkinstalldirs
move-if-change
multilib.am
README
symlink-tree
test-driver
ylwrap

This directory contains the GNU Compiler Collection (GCC).

The GNU Compiler Collection is free software.  See the files whose
names start with COPYING for copying permission.  The manuals, and
some of the runtime libraries, are under different terms; see the
individual source files for details.

The directory INSTALL contains copies of the installation information
as HTML and plain text.  The source of this information is
gcc/doc/install.texi.  The installation information includes details
of what is included in the GCC sources and what files GCC installs.

See the file gcc/doc/gcc.texi (together with other files that it
includes) for usage and porting information.  An online readable
version of the manual is in the files gcc/doc/gcc.info*.

See http://gcc.gnu.org/bugs/ for how to report bugs usefully.

Copyright years on GCC source files may be listed using range
notation, e.g., 1987-2012, indicating that every year in the range,
inclusive, is a copyrightable year that could otherwise be listed
individually.