We failed to compare the rematerialized store values when merging
paths after walking PHIs.
2020-05-25 Richard Biener <rguenther@suse.de>
PR tree-optimization/95295
* tree-ssa-loop-im.c (sm_seq_valid_bb): Compare remat stores
RHSes and drop to full sm_other if they are not equal.
* gcc.dg/torture/pr95295-1.c: New testcase.
* gcc.dg/torture/pr95295-2.c: Likewise.
* gcc.dg/torture/pr95283.c: Likewise.
This skips invariant vector type setting for a scalar shift argument.
2020-05-25 Richard Biener <rguenther@suse.de>
PR tree-optimization/95297
* tree-vect-stmts.c (vectorizable_shift): For scalar_shift_arg
skip updating operand 1 vector type.
* g++.dg/vect/pr95297.cc: New testcase.
* g++.dg/vect/pr95290.cc: Likewise.
This fixes a hole that still allowed forwarding of TARGET_MEM_REF
addresses.
2020-05-25 Richard Biener <rguenther@suse.de>
PR tree-optimization/95308
* tree-ssa-forwprop.c (pass_forwprop::execute): Generalize
test for TARGET_MEM_REFs.
* g++.dg/torture/pr95308.C: New testcase.
This is an internal renaming generated for a generalized loop iteration
made on a tagged record type with predicate, and gigi cannot use the most
efficient way of implementing renamings because the renamed object is an
expression with a non-empty Actions list.
gcc/ada/ChangeLog
* gcc-interface/decl.c (gnat_to_gnu_entity): Add new local variable
and use it throughout the function.
<E_Variable>: Rename local variable and adjust accordingly. In the
case of a renaming, materialize the entity if the renamed object is
an N_Expression_With_Actions node.
<E_Procedure>: Use Alias accessor function consistently.
gcc/testsuite/ChangeLog
* gnat.dg/renaming16.adb: New test.
* gnat.dg/renaming16_pkg.ads: New helper.
Gigi fails to back-annotate the Present_Expr field of variants present
in a type derived from a discriminated untagged record type, which is
for example visible in the output of -gnatRj.
gcc/ada/ChangeLog
* gcc-interface/decl.c (gnat_to_gnu_entity) <E_Record_Type>: Tidy up.
(build_variant_list): Add GNAT_VARIANT_PART parameter and annotate
its variants if it is present. Adjust the recursive call by passing
the variant subpart of variants, if any.
(copy_and_substitute_in_layout): Rename GNU_SUBST_LIST to SUBST_LIST
and adjust throughout. For a type, pass the variant part in the
call to build_variant_list.
The compiler can mishandle a Component_Size clause on an array type
specifying a size that is a multiple of the storage unit, when this
size is not a multiple of the alignment of the component type.
gcc/ada/ChangeLog
* gcc-interface/decl.c (gnat_to_gnu_component_type): Cap alignment
of the component type according to the component size.
gcc/testsuite/ChangeLog
* gnat.dg/array40.adb: New test.
* gnat.dg/array40_pkg.ads: New helper.
* gcc-changelog/git_commit.py: Add trailing '/'
for libdruntime. Allow empty changelog for
only ignored files.
* gcc-changelog/test_email.py: New test for go
patch in ignored location.
* gcc-changelog/test_patches.txt: Add test.
This makes a step back in the representation of fat pointer types in
the debug info with -fgnat-encodings=minimal so as to avoid hiding the
data indirection and to make it easier to synthesize the construct.
gcc/ada/ChangeLog
* gcc-interface/decl.c (gnat_to_gnu_entity) <E_Array_Type>: Add a
description of the various types associated with the unconstrained
type. Declare the fat pointer earlier. Set the current function
as context on the template type, and the fat pointer type on the
array type. Always mark the fat pointer type as artificial and set
it as the context for the pointer type to the array. Also reuse
GNU_ENTITY_NAME. Finish up the unconstrained type at the very end.
* gcc-interface/misc.c (gnat_get_array_descr_info): Do not handle
fat pointer types and tidy up accordingly.
* gcc-interface/utils.c (build_unc_object_type): Do not set the
context on the template type.
Under very specific circumstances the compiler can generate a wrong
assignment to a mutable record object which contains an array component,
because it does not correctly handle the update of the discriminant.
gcc/ada/ChangeLog
* gcc-interface/gigi.h (operand_type): New static inline function.
* gcc-interface/trans.c (gnat_to_gnu): Do not suppress conversion
to the result type at the end for array types.
* gcc-interface/utils2.c (build_binary_op) <MODIFY_EXPR>: Do not
remove conversions between array types on the LHS.
gcc/testsuite/ChangeLog
* gnat.dg/array39.adb: New test.
* gnat.dg/array39_pkg.ads: New helper.
* gnat.dg/array39_pkg.adb: Likewise.
For long module names, the generated name-mangled symbol was
truncated, leading to bogus warnings about COMMON block
mismatches. Provide sufficiently large temporaries.
gcc/fortran/
2020-05-24 Harald Anlauf <anlauf@gmx.de>
PR fortran/95106
* trans-common.c (gfc_sym_mangled_common_id): Enlarge temporaries
for name-mangling.
gcc/testsuite/
2020-05-24 Harald Anlauf <anlauf@gmx.de>
PR fortran/95106
* gfortran.dg/equiv_11.f90: New test.
AIX supports "FAT" libraries containing 32 bit and 64 bit objects
(similar to Darwin), but commands for manipulating libraries do not
default to accept both 32 bit and 64 bit object files. While updating
the AIX configuration to support building and running GCC as a 64 bit
application, I have encountered some build libraries that hard code
AR=ar instead of testing the environment.
This patch adds AR_CHECK_TOOL(AR, ar) to configure.ac for the two
libraries and updates Makefile.in to accept the substitution.
2020-05-23 David Edelsohn <dje.gcc@gmail.com>
libcpp/ChangeLog:
* Makefile.in (AR): Substitute @AR@.
* configure.ac (CHECK_PROG AR): New.
* configure: Regenerate.
libdecnumber/ChangeLog:
* Makefile.in (AR): Substitute @AR@.
* configure.ac (CHECK_PROG AR): New.
* configure: Regenerate.
Now that the frontend issue PR c++/94038 is thoroughly fixed, the
testcase for PR93978 no longer fails to compile with -O -Wall, so add
-Wall to the testcase's compile flags to help ensure we don't regress
here.
libstdc++-v3/ChangeLog:
PR libstdc++/93978
* testsuite/std/ranges/adaptors/93978.cc: Add -Wall to
dg-additional-options. Avoid unused-but-set-variable warning.
Concept evaluation may entail DECL_UID generation and/or template
instantiation, so in general we can't perform it during uid-sensitive
constexpr evaluation.
gcc/cp/ChangeLog:
PR c++/94038
* constexpr.c (cxx_eval_constant_expression)
<case TEMPLATE_ID_EXPR>: Don't evaluate the concept when
constexpr evaluation is uid-sensitive.
gcc/testsuite/ChangeLog:
PR c++/94038
* g++.dg/warn/pr94038-3.C: New test.
The body of this function isn't just a return statement, so it can't be
constexpr until C++14.
PR libstdc++/95289
* include/debug/helper_functions.h (__get_distance): Only declare
as a constexpr function for C++14 and up.
* testsuite/25_algorithms/copy/debug/95289.cc: New test.
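To illustrate the constraint, here is a minimal sketch (not the libstdc++
code) of a function whose body contains a loop and which is therefore
declared constexpr only when compiled as C++14 or later. The macro
SKETCH_CXX14_CONSTEXPR and the function count_steps are hypothetical names
chosen for this example.

```cpp
#include <cassert>

// Hypothetical macro: expands to constexpr only for C++14 and later, where
// a constexpr function body may contain more than a single return statement.
#if __cplusplus >= 201402L
# define SKETCH_CXX14_CONSTEXPR constexpr
#else
# define SKETCH_CXX14_CONSTEXPR
#endif

// Counts the elements in [first, last) with a loop, which a C++11 constexpr
// function would not be allowed to contain.
template<typename Iter>
inline SKETCH_CXX14_CONSTEXPR long
count_steps(Iter first, Iter last)
{
  long n = 0;
  for (; first != last; ++first)
    ++n;
  return n;
}

#if __cplusplus >= 201402L
// In C++14 and later the call is usable in a constant expression.
constexpr int sketch_data[] = { 1, 2, 3 };
static_assert(count_steps(sketch_data, sketch_data + 3) == 3, "");
#endif
```

In C++11 mode the same function still compiles and can be called at run
time; it just cannot appear in a constant expression.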
gcc/fortran/ChangeLog:
2020-05-23 Thomas Koenig <tkoenig@gcc.gnu.org>
PR libfortran/95191
* libgfortran.h (libgfortran_error_codes): Add
LIBERROR_BAD_WAIT_ID.
libgfortran/ChangeLog:
2020-05-23 Thomas Koenig <tkoenig@gcc.gnu.org>
PR libfortran/95191
* io/async.c (async_wait_id): Generate error if ID is higher
than the highest current ID.
* runtime/error.c (translate_error): Handle LIBERROR_BAD_WAIT_ID.
libgomp/ChangeLog:
2020-05-23 Thomas Koenig <tkoenig@gcc.gnu.org>
PR libfortran/95191
* testsuite/libgomp.fortran/async_io_9.f90: New test.
This simplifies the logic of converting Source arguments and pairs of
InputIterator arguments into the native string format. For any input
that is a contiguous range of path::value_type (or char8_t for POSIX)
a string view can be created and the conversion can be done directly,
with no intermediate allocation. Previously some cases created a
basic_string unnecessarily, for example construction from a pair of
path::string_type::iterators, or a pair of non-const value_type*
pointers.
* include/bits/fs_path.h (__detail::_S_range_begin)
(__detail::_S_range_end, path::_S_string_from_iter): Replace with
overloaded function template __detail::__effective_range.
(__detail::__effective_range): New overloaded function template to
create a basic_string or basic_string_view for an effective range.
(__detail::__value_type_is_char): Use __detail::__effective_range.
Do not use remove_const on value type.
(__detail::__value_type_is_char_or_char8_t): Likewise.
(path::path(const Source&, format))
(path::path(const Source&, const locale&))
(path::operator/=(const Source&), path::append(const Source&))
(path::concat(const Source&)): Use __detail::__effective_range.
(path::_S_to_string(InputIterator, InputIterator)): New function
template to create a string view if possible, or string otherwise.
(path::_S_convert): Add overloads that convert a string returned
by __detail::__effective_range. Use if-constexpr to inline conversion
logic from all overloads of _Cvt::_S_convert.
(path::_S_convert_loc): Add overload that converts a string. Use
_S_to_string to avoid allocation when possible.
(path::_Cvt): Remove.
(path::operator+=(CharT)): Remove indirection through path::concat.
* include/experimental/bits/fs_path.h (path::_S_convert_loc): Add
overload for non-const pointers, to avoid constructing a std::string.
* src/c++17/fs_path.cc (path::_S_convert_loc): Replace conditional
compilation with call to _S_convert.
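The idea of viewing a contiguous character range instead of copying it can
be sketched as follows. This is an illustration only, not the fs_path.h
implementation; view_of_range and string_of_range are hypothetical names,
and the sketch assumes plain char rather than path::value_type.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: for a pair of pointers into contiguous storage, build
// a non-owning view. No characters are copied and nothing is allocated.
inline std::string_view
view_of_range(const char* first, const char* last)
{
  return std::string_view(first, static_cast<std::size_t>(last - first));
}

// Hypothetical fallback for arbitrary input iterators: the characters have
// to be copied into an owning string.
template<typename InputIterator>
inline std::string
string_of_range(InputIterator first, InputIterator last)
{
  return std::string(first, last);
}
```

The view returned by view_of_range points into the caller's storage, so it
is only valid for as long as that storage is.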
These functions were originally static members of the path class, but
the 'static' specifiers were not removed when they were moved to
namespace scope. This causes ODR violations when the functions are
called from functions defined in the header, which is incompatible with
Nathan's modules branch. Change them to 'inline' instead.
* include/bits/fs_path.h (__detail::_S_range_begin)
(__detail::_S_range_end): Remove unintentional static specifiers.
* include/experimental/bits/fs_path.h (__detail::_S_range_begin)
(__detail::_S_range_end): Likewise.
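A minimal sketch of the difference, using hypothetical names rather than
the library code: a namespace-scope 'static' function has internal linkage,
so each translation unit that includes the header gets its own function,
whereas an 'inline' function has external linkage and names one entity
across the whole program.

```cpp
#include <cassert>

// Imagine this block lives in a header included by several translation units.
namespace sketch_detail
{
  // Internal linkage: every translation unit that includes the header gets
  // a different function that merely shares the name.
  static int range_begin_static(const int* p) { return *p; }

  // External linkage: all translation units refer to the same function.
  inline int range_begin_inline(const int* p) { return *p; }
}

// An inline function defined in a header must mean the same thing in every
// translation unit, so it calls the inline helper. Calling the static one
// here would make each copy of first_element refer to a different entity.
inline int first_element(const int* p)
{
  return sketch_detail::range_begin_inline(p);
}
```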
This replaces the filesystem::__detail::_Path SFINAE helper with two
separate helpers, _Path and _Path2. This avoids having one helper which
tries to check two different sets of requirements.
The _Path helper now uses variable templates instead of a set of
overloaded functions to detect specializations of basic_string or
basic_string_view.
The __not_<is_void<remove_pointer_t<_Tp1>>> check is not necessary in
C++20 because iterator_traits<void*> is now empty. For C++17 replace
that check with a __safe_iterator_traits helper with partial
specializations for void pointers.
Finally, the __is_encoded_char check no longer uses remove_const_t,
which means that iterators with a const value_type will no longer be
accepted as arguments for path creation. Such iterators resulted in
undefined behaviour anyway, so it's still conforming to reject them in
the constraint checks.
* include/bits/fs_path.h (filesystem::__detail::__is_encoded_char):
Replace alias template with variable template. Don't remove const.
(filesystem::__detail::__is_path_src): Replace overloaded function
template with variable template and specializations.
(filesystem::__detail::__is_path_iter_src): Replace alias template
with class template.
(filesystem::__detail::_Path): Use __is_path_src. Remove support for
iterator pairs.
(filesystem::__detail::_Path2): New alias template for checking
InputIterator requirements.
(filesystem::__detail::__constructible_from): Remove.
(filesystem::path): Replace _Path<Iter, Iter> with _Path2<Iter>.
* testsuite/27_io/filesystem/path/construct/80762.cc: Check with two
constructor arguments of void and void* types.
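The variable-template technique mentioned above can be sketched like this.
The trait name is_string_like is hypothetical and the sketch is much
simpler than the real __is_path_src; it only shows how partial
specializations of a variable template detect specializations of
basic_string and basic_string_view.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical trait, false by default.
template<typename T>
inline constexpr bool is_string_like = false;

// Partial specializations pick out any basic_string ...
template<typename CharT, typename Traits, typename Alloc>
inline constexpr bool
is_string_like<std::basic_string<CharT, Traits, Alloc>> = true;

// ... and any basic_string_view.
template<typename CharT, typename Traits>
inline constexpr bool
is_string_like<std::basic_string_view<CharT, Traits>> = true;

static_assert(is_string_like<std::string>);
static_assert(is_string_like<std::wstring_view>);
static_assert(!is_string_like<const char*>);
```

Compared with a set of overloaded functions, the result is available
directly as a constant, with no decltype on a function call.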
Another case where we need linker-visible symbols in order to
preserve the ld64 atom model. If these symbols are emitted as
'local' the linker cannot see that they are separate from any
global weak entry that precedes them. This will cause the linker
to complain that there is (apparently) direct access to such a
weak global, preventing it from being replaced.
This is a short-term fix for the problem - we need generic
handling for relevant cases (that also does not pessimise objects
by emitting unnecessary symbols and relocations).
gcc/ChangeLog:
2020-05-23 Iain Sandoe <iain@sandoe.co.uk>
* config/darwin.h (ASM_GENERATE_INTERNAL_LABEL):
Make ubsan_{data,type},ASAN symbols linker-visible.
In a function call expression in C++17 evaluation of the function pointer is
sequenced before evaluation of the function arguments, but that doesn't
apply to function calls that were written using operator syntax. In
particular, for operators with right-to-left ordering like assignment, we
must not evaluate the LHS to find a virtual function before we evaluate the
RHS.
gcc/cp/ChangeLog:
* cp-gimplify.c (cp_gimplify_expr) [CALL_EXPR]: Don't preevaluate
the function address if the call used operator syntax.
gcc/testsuite/ChangeLog:
* g++.dg/cpp1z/eval-order9.C: New test.
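The ordering rule can be observed with a small sketch (hypothetical names,
unrelated to the new testcase) that logs when each operand of a
user-defined assignment is evaluated. Under C++17 rules the log is expected
to read "RL", because the operands of an assignment written with operator
syntax are evaluated right to left; earlier standard modes leave the order
unspecified.

```cpp
#include <cassert>
#include <string>

// Records the order in which the two operands were evaluated.
inline std::string& eval_log()
{
  static std::string log;
  return log;
}

struct Cell
{
  int value = 0;
  // User-defined assignment, so `lhs() = rhs()` is a function call written
  // with operator syntax.
  Cell& operator=(int v) { value = v; return *this; }
};

inline Cell& lhs()
{
  static Cell c;
  eval_log() += 'L';
  return c;
}

inline int rhs()
{
  eval_log() += 'R';
  return 42;
}

// Evaluates `lhs() = rhs()` and returns the recorded order.
inline std::string assignment_order()
{
  eval_log().clear();
  lhs() = rhs();
  return eval_log();
}
```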
Plus [u]intptr_t and associated constants.
Refactor the bool, true, false, <stdbool.h> code so it fits into the
new table based design.
gcc/c-family/ChangeLog:
* known-headers.cc (get_stdlib_header_for_name): Add a new
stdlib_hint array for stdbool and stdint.
gcc/testsuite/ChangeLog:
* gcc.dg/spellcheck-stdint.c: New test.
* g++.dg/spellcheck-stdint.C: Likewise.
Currently GCC suggests using _Bool instead of bool and gives no
suggestion when true or false is used but undefined. This patch
makes it so that (for C99 or higher) a fix-it hint is emitted to include
<stdbool.h>.
gcc/c-family/ChangeLog:
* known-headers.cc (get_stdlib_header_for_name): Return
"<stdbool.h>" for "bool", "true" or "false" when STDLIB_C and
flag_isoc99.
gcc/testsuite/ChangeLog:
* gcc.dg/spellcheck-stdbool.c: New test.
With -fstrong-eval-order=all we evaluate the function address before the
arguments. But this caused trouble with virtual functions and
-fsanitize=vptr; we would do vptr sanitization as part of calculating the
'this' argument, and separately look at the vptr in order to find the
function address. Without -fstrong-eval-order=all 'this' is evaluated
first, but with that flag the function address is evaluated first, so we
would access the null vptr before sanitizing it.
Fixed by instrumenting the OBJ_TYPE_REF of a virtual function call instead
of the 'this' argument.
This issue suggests that we should be running the ubsan tests in multiple
standard modes like the rest of the G++ testsuite, so I've made that change
as well.
gcc/cp/ChangeLog:
* cp-ubsan.c (cp_ubsan_maybe_instrument_member_call): For a virtual
call, instrument the OBJ_TYPE_REF.
gcc/testsuite/ChangeLog:
* g++.dg/ubsan/ubsan.exp: Use g++-dg-runtest.
* c-c++-common/ubsan/bounds-13.c: Adjust.
* c-c++-common/ubsan/bounds-2.c: Adjust.
* c-c++-common/ubsan/div-by-zero-1.c: Adjust.
* c-c++-common/ubsan/div-by-zero-6.c: Adjust.
* c-c++-common/ubsan/div-by-zero-7.c: Adjust.
* c-c++-common/ubsan/overflow-add-1.c: Adjust.
* c-c++-common/ubsan/overflow-add-2.c: Adjust.
* c-c++-common/ubsan/overflow-int128.c: Adjust.
* c-c++-common/ubsan/overflow-sub-1.c: Adjust.
* c-c++-common/ubsan/overflow-sub-2.c: Adjust.
* g++.dg/ubsan/pr85029.C: Adjust.
* g++.dg/ubsan/vptr-14.C: Adjust.
Warn about using exit in a signal handler and suggest _exit as an alternative.
gcc/analyzer/ChangeLog:
* sm-signal.cc (signal_unsafe_call::emit): Possibly add
gcc_rich_location note for replacement.
(signal_unsafe_call::get_replacement_fn): New private function.
(get_async_signal_unsafe_fns): Add "exit".
gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/signal-exit.c: New testcase.
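For illustration, a handler that follows the suggested replacement could
look like the sketch below. It assumes a POSIX system; the function names
and the exit status 42 are hypothetical, and the fork is only there so the
effect can be observed from the parent process.

```cpp
#include <cassert>
#include <csignal>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Handler that terminates the process. _exit is async-signal-safe; exit is
// not, because it runs atexit handlers and flushes stdio buffers.
extern "C" void terminate_on_signal(int)
{
  _exit(42);
}

// Forks a child that installs the handler and raises SIGUSR1, then returns
// the child's exit status as seen by the parent (or -1 on failure).
inline int child_exit_status_after_signal()
{
  pid_t pid = fork();
  if (pid < 0)
    return -1;
  if (pid == 0)
    {
      std::signal(SIGUSR1, terminate_on_signal);
      std::raise(SIGUSR1);
      _exit(1); // Not reached if the handler ran.
    }
  int status = 0;
  if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
    return -1;
  return WEXITSTATUS(status);
}
```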
2020-05-22 Uroš Bizjak <ubizjak@gmail.com>
gcc/ChangeLog:
PR target/95255
* config/i386/i386.md (<rounding_insn><mode>2): Do not try to
expand non-sse4 ROUND_ROUNDEVEN rounding via SSE support routines.
gcc/testsuite/ChangeLog:
PR target/95255
* gcc.target/i386/pr95255.c: New test.
This patch avoids streaming completely useless stray references to the global
decl stream. I am re-testing the patch (rebased to current tree) on
x86_64-linux and intend to commit once testing finishes.
gcc/ChangeLog:
2020-05-22 Jan Hubicka <hubicka@ucw.cz>
* lto-streamer-out.c (lto_output_tree): Do not stream final ref if
it is not needed.
gcc/lto/ChangeLog:
2020-05-22 Jan Hubicka <hubicka@ucw.cz>
* lto-common.c (lto_read_decls): Do not skip stray refs.
This patch cleans up the dumping of streaming so that it is clear how the dump
is organized and how much space the individual components need.
Compiling:
int a=1;
main()
{
return a;
}
The output is now:
Creating output block for function_body
Streaming tree <result_decl 0x7ffff7457a50 D.1931>
Start of LTO_trees of size 1
Encoding indexable <integer_type 0x7ffff7463000 sizetype> as 0
10 bytes
^^^ I do not think we should need 10 bytes to stream a single indexable
reference to 0 :)
Start of LTO_trees of size 1
Encoding indexable <integer_type 0x7ffff74630a8 bitsizetype> as 1
10 bytes
Streaming header of <result_decl 0x7ffff7457a50 D.1931> to function_body
Streaming body of <result_decl 0x7ffff7457a50 D.1931> to function_body
Encoding indexable <integer_type 0x7ffff74635e8 int> as 2
Encoding indexable <function_decl 0x7ffff757b500 main> as 0
Streaming ref to <integer_cst 0x7ffff744af18 32>
Streaming ref to <integer_cst 0x7ffff744af30 4>
52 bytes
^^^ Instead of having multiple LTO_trees sections followed by the final tree
it would make a lot of sense to have only one LTO_trees where the first tree
is the one lto_input_tree should return. This is easy to arrange in the DFS
walk - one does not need to pop after every SCC component but pop once at the
end of the walk. However this breaks handling of integer_csts because they may
now become part of an LTO_trees block and be streamed as header + body.
This bypasses the separate code for shared integer_cst streaming. I think
I want to stream everything into header and materialize the tree since it is not
part of SCC anyway.
Streaming tree <block 0x7ffff757e420>
Streaming header of <block 0x7ffff757e420> to function_body
Streaming body of <block 0x7ffff757e420> to function_body
8 bytes
Streaming gimple stmt _2 = a;
Streaming ref to <block 0x7ffff757e420>
4 bytes
Streaming tree <mem_ref 0x7ffff7576f78>
Start of LTO_trees of size 1
Encoding indexable <pointer_type 0x7ffff746b9d8> as 3
10 bytes
Start of LTO_trees of size 1
Streaming header of <addr_expr 0x7ffff75893c0> to function_body
Streaming body of <addr_expr 0x7ffff75893c0> to function_body
Encoding indexable <var_decl 0x7ffff7fcfb40 a> as 0
15 bytes
Streaming header of <mem_ref 0x7ffff7576f78> to function_body
Streaming body of <mem_ref 0x7ffff7576f78> to function_body
Streaming ref to <addr_expr 0x7ffff75893c0>
Streaming ref to <integer_cst 0x7ffff75a3240 0>
42 bytes
Streaming gimple stmt return _2;
Outputting global stream
0: <function_decl 0x7ffff757b500 main>
Streaming tree <function_decl 0x7ffff757b500 main>
Start of LTO_tree_scc of size 1
Streaming header of <optimization_node 0x7ffff744b000> to decls
Streaming body of <optimization_node 0x7ffff744b000> to decls
576 bytes
Start of LTO_tree_scc of size 1
Streaming header of <target_option_node 0x7ffff744a018> to decls
Streaming body of <target_option_node 0x7ffff744a018> to decls
68 bytes
Streaming single tree
Streaming header of <identifier_node 0x7ffff7577aa0 main> to decls
Streaming body of <identifier_node 0x7ffff7577aa0 main> to decls
3 bytes
Streaming single tree
Streaming header of <identifier_node 0x7ffff758a8c0 t.c> to decls
Streaming body of <identifier_node 0x7ffff758a8c0 t.c> to decls
3 bytes
Streaming single tree
Streaming header of <translation_unit_decl 0x7ffff7457ac8 t.c> to decls
Streaming body of <translation_unit_decl 0x7ffff7457ac8 t.c> to decls
Streaming ref to <identifier_node 0x7ffff758a8c0 t.c>
22 bytes
Start of LTO_tree_scc of size 1
Streaming header of <function_type 0x7ffff74717e0> to decls
Streaming body of <function_type 0x7ffff74717e0> to decls
Streaming ref to <integer_type 0x7ffff74635e8 int>
Streaming ref to <integer_cst 0x7ffff744adc8 8>
Streaming ref to <integer_cst 0x7ffff744ade0 1>
Streaming ref to <function_type 0x7ffff74717e0>
38 bytes
Start of LTO_tree_scc of size 1
Streaming header of <function_type 0x7ffff75832a0> to decls
Streaming body of <function_type 0x7ffff75832a0> to decls
Streaming ref to <integer_type 0x7ffff74635e8 int>
Streaming ref to <integer_cst 0x7ffff744adc8 8>
Streaming ref to <integer_cst 0x7ffff744ade0 1>
Streaming ref to <function_type 0x7ffff74717e0>
38 bytes
Start of LTO_tree_scc of size 1
Streaming header of <function_decl 0x7ffff757b500 main> to decls
Streaming body of <function_decl 0x7ffff757b500 main> to decls
Streaming ref to <function_type 0x7ffff75832a0>
Streaming ref to <identifier_node 0x7ffff7577aa0 main>
Streaming ref to <translation_unit_decl 0x7ffff7457ac8 t.c>
Streaming ref to <identifier_node 0x7ffff7577aa0 main>
Streaming ref to <target_option_node 0x7ffff744a018>
Streaming ref to <optimization_node 0x7ffff744b000>
58 bytes
806 bytes
0: <var_decl 0x7ffff7fcfb40 a>
Streaming tree <var_decl 0x7ffff7fcfb40 a>
Streaming single tree
Streaming header of <identifier_node 0x7ffff758a870 a> to decls
Streaming body of <identifier_node 0x7ffff758a870 a> to decls
3 bytes
Streaming single tree
Streaming ref to <integer_type 0x7ffff7463000 sizetype>
7 bytes
Streaming single tree
Streaming ref to <integer_type 0x7ffff74630a8 bitsizetype>
7 bytes
Start of LTO_tree_scc of size 1
Streaming header of <var_decl 0x7ffff7fcfb40 a> to decls
Streaming body of <var_decl 0x7ffff7fcfb40 a> to decls
Streaming ref to <integer_type 0x7ffff74635e8 int>
Streaming ref to <identifier_node 0x7ffff758a870 a>
Streaming ref to <translation_unit_decl 0x7ffff7457ac8 t.c>
Streaming ref to <integer_cst 0x7ffff744af18 32>
Streaming ref to <integer_cst 0x7ffff744af30 4>
Streaming ref to <identifier_node 0x7ffff758a870 a>
Streaming ref to <integer_cst 0x7ffff7468090 1>
49 bytes
66 bytes
gcc/ChangeLog:
2020-05-22 Jan Hubicka <hubicka@ucw.cz>
* lto-section-out.c (lto_output_decl_index): Adjust dump indentation.
* lto-streamer-out.c (create_output_block): Fix whitespace.
(lto_write_tree_1): Add (debug) dump.
(DFS::DFS): Add dump.
(DFS::DFS_write_tree_body): Do not dump here.
(lto_output_tree): Improve dumping; do not stream ref when not needed.
(produce_asm_for_decls): Fix whitespace.
* tree-streamer-out.c (streamer_write_tree_header): Add dump.
Add -mavx512vpopcntdq for -march=native if AVX512VPOPCNTDQ is available.
PR target/95258
* config/i386/driver-i386.c (host_detect_local_cpu): Detect
AVX512VPOPCNTDQ.
2020-05-22 Jakub Jelinek <jakub@redhat.com>
* gcc-changelog/git_commit.py: Add trailing / to
gcc/testsuite/go.test/test and replace gcc/go/frontend/
with gcc/go/gofrontend/ in ignored locations.
This fixes handling of clobbers when commoning stores.
2020-05-22 Richard Biener <rguenther@suse.de>
PR tree-optimization/95268
* tree-ssa-sink.c (sink_common_stores_to_bb): Handle clobbers
properly.
* g++.dg/torture/pr95268.C: New testcase.
This patch seems to solve basically all collisions while building cc1.
From:
[WPA] read 3312246 unshared trees
[WPA] read 1144381 mergeable SCCs of average size 4.833785
[WPA] 8843938 tree bodies read in total
[WPA] tree SCC table: size 524287, 197767 elements, collision ratio: 0.506446
[WPA] tree SCC max chain length 43 (size 1)
[WPA] Compared 946614 SCCs, 775077 collisions (0.818789)
to
[WPA] read 3314520 unshared trees
[WPA] read 1144763 mergeable SCCs of average size 4.835021
[WPA] 8849473 tree bodies read in total
[WPA] tree SCC table: size 524287, 200574 elements, collision ratio: 0.486418
[WPA] tree SCC max chain length 2 (size 1)
[WPA] Compared 944189 SCCs, 179 collisions (0.000190)
The problem is that preloaded nodes all have hash code 0 because
cache->nodes.length is not updated while streaming out.
I also added an arbitrary constant to avoid a clash with the constants 0, used
to hash NULL pointers, and 1, used to hash pointers inside an SCC.
* tree-streamer.c (record_common_node): Fix hash value of pre-streamed
nodes.
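As a toy model of the fix (not the tree-streamer.c code; the struct, the
member names and the offset value are all hypothetical), giving each
preloaded node a hash derived from its position plus a constant offset
makes the hashes distinct and keeps them away from the reserved values 0
and 1.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy cache of preloaded nodes; every name here is hypothetical.
struct ToyCache
{
  std::vector<unsigned> hashes;

  // Gives each preloaded node a hash derived from its position. The offset
  // keeps these hashes away from 0 (used for NULL pointers) and 1 (used for
  // pointers inside an SCC).
  void record_preloaded_node()
  {
    const unsigned offset = 1000; // arbitrary value chosen for the sketch
    hashes.push_back(static_cast<unsigned>(hashes.size()) + offset);
  }
};
```

If the position were not advanced between nodes, every call would produce
the same hash, which is the kind of collision pile-up the statistics above
show.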
This patch saves a few bytes from SCC streaming. First, we stream end markers
that are fully ignored at stream in.
Second, I missed streaming of entry_len in the previous change, so it is
pointlessly streamed for LTO_trees. Moreover entry_len is almost always 1
(always during gcc bootstrap) and thus it makes sense to avoid streaming it
in the majority of cases.
gcc/ChangeLog:
2020-05-21 Jan Hubicka <hubicka@ucw.cz>
* lto-streamer-in.c (lto_read_tree): Do not stream end markers.
(lto_input_scc): Optimize streaming of entry lengths.
* lto-streamer-out.c (lto_write_tree): Do not stream end markers.
(DFS::DFS): Optimize streaming of entry lengths.
This documents new GCC 10 behavior on diagnostic options and -flto.
2020-05-22 Richard Biener <rguenther@suse.de>
PR lto/95190
* doc/invoke.texi (flto): Document behavior of diagnostic
options.
This tries to enforce a set SLP_TREE_VECTYPE in vect_get_constant_vectors
and provides some infrastructure for setting it in the vectorizable_*
functions, amending those.
2020-05-22 Richard Biener <rguenther@suse.de>
* tree-vectorizer.h (vect_is_simple_use): New overload.
(vect_maybe_update_slp_op_vectype): New.
* tree-vect-stmts.c (vect_is_simple_use): New overload
accessing operands of SLP vs. non-SLP operation transparently.
(vect_maybe_update_slp_op_vectype): New function updating
the possibly shared SLP operands vector type.
(vectorizable_operation): Be a bit more SLP vs non-SLP agnostic
using the new vect_is_simple_use overload; update SLP invariant
operand nodes vector type.
(vectorizable_comparison): Likewise.
(vectorizable_call): Likewise.
(vectorizable_conversion): Likewise.
(vectorizable_shift): Likewise.
(vectorizable_store): Likewise.
(vectorizable_condition): Likewise.
(vectorizable_assignment): Likewise.
* tree-vect-loop.c (vectorizable_reduction): Likewise.
* tree-vect-slp.c (vect_get_constant_vectors): Enforce
present SLP_TREE_VECTYPE and check it matches previous
behavior.
This fixes a leftover early out in determining the sequence of stores
to materialize.
2020-05-22 Richard Biener <rguenther@suse.de>
PR tree-optimization/95248
* tree-ssa-loop-im.c (sm_seq_valid_bb): Remove bogus early out.
* gcc.dg/torture/pr95248.c: New testcase.
This adds constructor and destructor to slp_tree factoring common
code. I've not changed the wrappers to overloaded CTORs since
I hope to use object_allocator<> and am not sure whether that can
be done in any fancy way yet.
2020-05-22 Richard Biener <rguenther@suse.de>
* tree-vectorizer.h (_slp_tree::_slp_tree): New.
(_slp_tree::~_slp_tree): Likewise.
* tree-vect-slp.c (_slp_tree::_slp_tree): Factor out code
from allocators.
(_slp_tree::~_slp_tree): Implement.
(vect_free_slp_tree): Simplify.
(vect_create_new_slp_node): Likewise. Add nops parameter.
(vect_build_slp_tree_2): Adjust.
(vect_analyze_slp_instance): Likewise.