Jakub Jelinek b320edc0c2 bswap: Recognize (int) __builtin_bswap64 (arg) idioms or __builtin_bswap?? (arg) & mask [PR86723]
The following patch recognizes in the bswap pass (only there for now,
I haven't done it for the store merging pass yet) code sequences that can
be handled by (int32) __builtin_bswap64 (arg), i.e. where n->n is
0x05060708 with a 64-bit non-memory argument.  If the argument is in
memory, we can just load the 32-bit value at offset 4 bytes from the
address and n->n would be 0x01020304.  Only 64 -> 32 bit is handled,
because 64 -> 16 bit or 32 -> 16 bit would mean only two bytes in the
result and is probably not worth it.
Furthermore, the patch handles the case where the 0x0102030405060708 etc.
numbers have some bytes 0 (i.e. known to contain zeros rather than source
bytes), as long as at least two original bytes are in the right
positions (and there are no unknown bytes).  This can be handled by
__builtin_bswap64 (arg) & 0xff0000ffffff00ffULL etc.
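
As a rough illustration of the two idioms described above, here is a minimal sketch in C. The function names (swap_high_half, partial_swap and their *_builtin counterparts) are hypothetical and do not come from the patch or its tests; the sketch only assumes a compiler that provides __builtin_bswap64 (GCC or Clang). It shows hand-written shift-and-mask code next to the builtin form that should compute the same value; whether a given GCC version actually rewrites the first form into the second is not something this sketch demonstrates.

```c
#include <stdint.h>

/* Hypothetical example: the upper four bytes of ARG, byte-reversed,
   returned as a 32-bit value (the n->n == 0x05060708 case).  */
static uint32_t
swap_high_half (uint64_t arg)
{
  return (uint32_t) (((arg >> 56) & 0x000000ffU)
                     | ((arg >> 40) & 0x0000ff00U)
                     | ((arg >> 24) & 0x00ff0000U)
                     | ((arg >> 8) & 0xff000000U));
}

/* The same value written as a truncated 64-bit byte swap.  */
static uint32_t
swap_high_half_builtin (uint64_t arg)
{
  return (uint32_t) __builtin_bswap64 (arg);
}

/* Hypothetical example: a byte swap in which three result bytes are
   known to be zero (result bytes 1, 5 and 6, counting from the least
   significant byte).  */
static uint64_t
partial_swap (uint64_t arg)
{
  return (((arg & 0x00000000000000ffULL) << 56)
          | ((arg & 0x00000000ff000000ULL) << 8)
          | ((arg & 0x000000ff00000000ULL) >> 8)
          | ((arg & 0x0000ff0000000000ULL) >> 24)
          | ((arg & 0xff00000000000000ULL) >> 56));
}

/* The same value written as a full byte swap followed by a mask.  */
static uint64_t
partial_swap_builtin (uint64_t arg)
{
  return __builtin_bswap64 (arg) & 0xff0000ffffff00ffULL;
}
```

For any input, swap_high_half and swap_high_half_builtin should agree, and likewise partial_swap and partial_swap_builtin.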
The latter change is the reason why counting the bswap messages doesn't work
too well in optimize-bswap* tests anymore: because the pass iterates from the
end of a basic block towards its start, it will often match both the bswap at
the end and some of the earlier bswaps with some masks (not a problem
generally, we'll just DCE them away whenever possible).  The pass right now
doesn't handle __builtin_bswap* calls in the pattern matching (which is the
reason why it operates backwards), but it handles blocks in
FOR_EACH_BB_FN (bb, fun) order and matched sequences can span multiple
blocks, so I was worried about cases like:
void bar (unsigned long long);
unsigned long long
foo (unsigned long long value, int x)
{
  unsigned long long tmp = (((value & 0x00000000000000ffull) << 56)
          | ((value & 0x000000000000ff00ull) << 40)
          | ((value & 0x00000000ff000000ull) << 8));
  if (x)
    bar (tmp);
  return (tmp
          | ((value & 0x000000ff00000000ull) >> 8)
          | ((value & 0x0000ff0000000000ull) >> 24)
          | ((value & 0x0000000000ff0000ull) << 24)
          | ((value & 0x00ff000000000000ull) >> 40)
          | ((value & 0xff00000000000000ull) >> 56));
}
but it seems we handle even that fine: bb2, which ends in GIMPLE_COND,
is processed first and there we recognize __builtin_bswap64 (value) & mask1;
in the last bb we recognize tmp | (__builtin_bswap64 (value) & mask2); and
PRE then optimizes that into t = __builtin_bswap64 (value); tmp = t & mask1;
in the first bb and return t; in the last one.
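
The end result described above can be sketched in C as follows. This is only an illustration under stated assumptions: the helper names are hypothetical, the call to bar is left out so the two halves can be compared directly, and the concrete value of mask1 (0xffff00ff00000000) is derived here from the three terms that form tmp in the example; it is not stated in the commit message.

```c
#include <stdint.h>

/* Assumption: mask1 derived from the three terms of tmp in foo;
   they fill result bytes 7, 6 and 4 of the swapped value.  */
#define MASK1 0xffff00ff00000000ULL

/* tmp as written in foo.  */
static uint64_t
tmp_original (uint64_t value)
{
  return (((value & 0x00000000000000ffULL) << 56)
          | ((value & 0x000000000000ff00ULL) << 40)
          | ((value & 0x00000000ff000000ULL) << 8));
}

/* The return value as written in foo.  */
static uint64_t
result_original (uint64_t value)
{
  return (tmp_original (value)
          | ((value & 0x000000ff00000000ULL) >> 8)
          | ((value & 0x0000ff0000000000ULL) >> 24)
          | ((value & 0x0000000000ff0000ULL) << 24)
          | ((value & 0x00ff000000000000ULL) >> 40)
          | ((value & 0xff00000000000000ULL) >> 56));
}

/* tmp after the rewrite: t = __builtin_bswap64 (value); tmp = t & mask1.  */
static uint64_t
tmp_rewritten (uint64_t value)
{
  uint64_t t = __builtin_bswap64 (value);
  return t & MASK1;
}

/* The return value after the rewrite: return t.  */
static uint64_t
result_rewritten (uint64_t value)
{
  return __builtin_bswap64 (value);
}
```

For any input, tmp_original should match tmp_rewritten and result_original should match result_rewritten, which is why a single __builtin_bswap64 call can serve both basic blocks.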

2021-08-23  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/86723
	* gimple-ssa-store-merging.c (find_bswap_or_nop_finalize): Add
	cast64_to_32 argument, set *cast64_to_32 to false, unless n is
	non-memory permutation of 64-bit src which only has bytes of
	0 or [5..8] and n->range is 4.
	(find_bswap_or_nop): Add cast64_to_32 and mask arguments, adjust
	find_bswap_or_nop_finalize caller, support bswap with some bytes
	zeroed, as long as at least two bytes are not zeroed.
	(bswap_replace): Add mask argument and handle masking of bswap
	result.
	(maybe_optimize_vector_constructor): Adjust find_bswap_or_nop
	caller, punt if cast64_to_32 or mask is not all ones.
	(pass_optimize_bswap::execute): Adjust find_bswap_or_nop_finalize
	caller, for now punt if cast64_to_32.

	* gcc.dg/pr86723.c: New test.
	* gcc.target/i386/pr86723.c: New test.
	* gcc.dg/optimize-bswapdi-1.c: Use -fdump-tree-optimized instead of
	-fdump-tree-bswap and scan for number of __builtin_bswap64 calls.
	* gcc.dg/optimize-bswapdi-2.c: Likewise.
	* gcc.dg/optimize-bswapsi-1.c: Use -fdump-tree-optimized instead of
	-fdump-tree-bswap and scan for number of __builtin_bswap32 calls.
	* gcc.dg/optimize-bswapsi-5.c: Likewise.
	* gcc.dg/optimize-bswapsi-3.c: Likewise.  Expect one __builtin_bswap32
	call instead of zero.
2021-08-23 11:54:03 +02:00
c++tools
config Daily bump. 2021-08-19 00:16:42 +00:00
contrib Daily bump. 2021-08-19 00:16:42 +00:00
fixincludes
gcc bswap: Recognize (int) __builtin_bswap64 (arg) idioms or __builtin_bswap?? (arg) & mask [PR86723] 2021-08-23 11:54:03 +02:00
gnattools
gotools
include openmp: Add support for strict modifier on grainsize/num_tasks clauses 2021-08-23 10:16:24 +02:00
INSTALL
intl
libada
libatomic
libbacktrace Daily bump. 2021-08-14 00:16:29 +00:00
libcc1 Daily bump. 2021-08-18 00:16:48 +00:00
libcody
libcpp Daily bump. 2021-08-18 00:16:48 +00:00
libdecnumber
libffi
libgcc Daily bump. 2021-08-22 00:16:40 +00:00
libgfortran Daily bump. 2021-08-11 00:16:27 +00:00
libgo libgo: various fixes for Solaris support 2021-08-14 17:34:52 -07:00
libgomp openmp: Add support for strict modifier on grainsize/num_tasks clauses 2021-08-23 10:16:24 +02:00
libiberty Daily bump. 2021-08-19 00:16:42 +00:00
libitm
libobjc
liboffloadmic
libphobos
libquadmath
libsanitizer Daily bump. 2021-08-12 00:16:28 +00:00
libssp
libstdc++-v3 Daily bump. 2021-08-21 00:16:29 +00:00
libvtv
lto-plugin
maintainer-scripts
zlib
.dir-locals.el
.gitattributes
.gitignore
ABOUT-NLS
ar-lib
ChangeLog Daily bump. 2021-08-22 00:16:40 +00:00
ChangeLog.jit
ChangeLog.tree-ssa
compile
config-ml.in
config.guess
config.rpath
config.sub
configure
configure.ac
COPYING
COPYING3
COPYING3.LIB
COPYING.LIB
COPYING.RUNTIME
depcomp
install-sh
libtool-ldflags
libtool.m4
lt~obsolete.m4
ltgcc.m4
ltmain.sh
ltoptions.m4
ltsugar.m4
ltversion.m4
MAINTAINERS MAINTAINERS: Add myself for write after approval 2021-08-21 21:41:31 +02:00
Makefile.def
Makefile.in configure: Allow host fragments to react to --enable-host-shared. 2021-08-18 19:46:32 +01:00
Makefile.tpl configure: Allow host fragments to react to --enable-host-shared. 2021-08-18 19:46:32 +01:00
missing
mkdep
mkinstalldirs
move-if-change
multilib.am
README
symlink-tree
test-driver
ylwrap

This directory contains the GNU Compiler Collection (GCC).

The GNU Compiler Collection is free software.  See the files whose
names start with COPYING for copying permission.  The manuals, and
some of the runtime libraries, are under different terms; see the
individual source files for details.

The directory INSTALL contains copies of the installation information
as HTML and plain text.  The source of this information is
gcc/doc/install.texi.  The installation information includes details
of what is included in the GCC sources and what files GCC installs.

See the file gcc/doc/gcc.texi (together with other files that it
includes) for usage and porting information.  An online readable
version of the manual is in the files gcc/doc/gcc.info*.

See http://gcc.gnu.org/bugs/ for how to report bugs usefully.

Copyright years on GCC source files may be listed using range
notation, e.g., 1987-2012, indicating that every year in the range,
inclusive, is a copyrightable year that could otherwise be listed
individually.