Loop vectorization pattern recog fails to walk PHIs when determining
stmt precisions. This fails to recognize non-mask uses for bools
in PHIs and outer loop vectorization.
2021-07-19 Richard Biener <rguenther@suse.de>
PR tree-optimization/101505
* tree-vect-patterns.c (vect_determine_precisions): Walk
PHIs also for loop vectorization.
* gcc.dg/vect/pr101505.c: New testcase.
... seen as of commit a110855667 "Correct
handling of variable offset minus constant in -Warray-bounds [PR100137]".
Awaiting a different solution, of course.
libgomp/
PR target/101484
* config/gcn/team.c: Apply '-Werror=array-bounds' work-around.
* libgomp.h [__AMDGCN__]: Likewise.
This removes the last uses of gimple_expr_type.
2021-07-16 Richard Biener <rguenther@suse.de>
* tree-ssa-sccvn.c (vn_reference_eq): Handle NULL vr->type.
(ao_ref_init_from_vn_reference): Likewise.
(fully_constant_reference): Likewise.
(vn_reference_lookup_call): Do not set vr->type to random
values.
* tree-ssa-pre.c (compute_avail): Do not try to PRE calls
without a value.
* tree-vect-generic.c (expand_vector_piecewise): Pass in
whether we expanded parallel.
(expand_vector_parallel): Adjust.
(expand_vector_addition): Likewise.
(expand_vector_comparison): Likewise.
(expand_vector_operation): Likewise.
(expand_vector_scalar_condition): Likewise.
(expand_vector_conversion): Likewise.
This removes the last uses from value-range code.
2021-07-16 Richard Biener <rguenther@suse.de>
* tree-vrp.c (register_edge_assert_for_2): Use the
type from the LHS.
(vrp_folder::fold_predicate_in): Likewise.
* vr-values.c (gimple_assign_nonzero_p): Likewise.
(vr_values::extract_range_from_comparison): Likewise.
(vr_values::extract_range_from_ubsan_builtin): Use the
type of the first operand.
(vr_values::extract_range_basic): Push down type
computation, use the appropriate LHS.
(vr_values::extract_range_from_assignment): Use the
type of the LHS.
For -mgeneral-regs-only, enable the GPR only instructions which are
enabled implicitly by SSE ISAs unless they have been disabled explicitly.
gcc/
PR target/101492
* common/config/i386/i386-common.c (ix86_handle_option): For
-mgeneral-regs-only, enable the GPR only instructions which are
enabled implicitly by SSE ISAs unless they have been disabled
explicitly.
gcc/testsuite/
PR target/101492
* gcc.target/i386/pr101492-1.c: New test.
* gcc.target/i386/pr101492-2.c: Likewise.
* gcc.target/i386/pr101492-3.c: Likewise.
* gcc.target/i386/pr101492-4.c: Likewise.
Don't issue vzeroupper before function call if callee returns AVX
register since callee must be compiled with AVX.
gcc/
PR target/101495
* config/i386/i386.c (ix86_check_avx_upper_stores): Moved before
ix86_avx_u128_mode_needed.
(ix86_avx_u128_mode_needed): Return AVX_U128_DIRTY if callee
returns AVX register.
gcc/testsuite/
PR target/101495
* gcc.target/i386/avx-vzeroupper-28.c: New test.
2021-07-18 Antoni Boucher <bouanto@zoho.com>
gcc/jit/
PR target/95498
* jit-playback.c (convert): Add support to handle truncation and
extension in the convert function.
gcc/testsuite/
PR target/95498
* jit.dg/all-non-failing-tests.h: New test.
* jit.dg/test-cast.c: New test.
Signed-off-by: Antoni Boucher <bouanto@zoho.com>
range-ops uses wi_fold to individually fold subranges one at a time and
then combined them. This patch first calls wi_fold_in_parts which checks if
one of the subranges is small, and if so, further splits that subrange
into constants.
gcc/
PR tree-optimization/96542
* range-op.cc (range_operator::wi_fold_in_parts): New.
(range_operator::fold_range): Call wi_fold_in_parts.
(operator_lshift::wi_fold): Fix broken lshift by [0,0].
* range-op.h (wi_fold_in_parts): Add prototype.
gcc/testsuite
* gcc.dg/pr96542.c: New.
This adds a deleted overload of std::get<I>(const tuple<Types...>&).
Invalid calls with an out of range index will match the deleted overload
and give a single, clear error about calling a deleted function, instead
of overload resolution errors for every std::get overload in the
library.
This changes the current output of 15+ errors (plus notes and associated
header context) into just two errors (plus context):
error: static assertion failed: tuple index must be in range
error: use of deleted function 'constexpr std::__enable_if_t<(__i >= sizeof... (_Types))> std::get(const std::tuple<_Types ...>&) [with long unsigned int __i = 1; _Elements = {int}; std::__enable_if_t<(__i >= sizeof... (_Types))> = void]'
This seems like a nice improvement, although PR c++/66968 means that
"_Types" is printed in the signature rather than "_Elements".
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* include/std/tuple (get<I>): Add deleted overload for bad
index.
* testsuite/20_util/tuple/element_access/get_neg.cc: Adjust
expected errors.
This is the alias CTAD version of the CTAD bug PR93248, and the fix is
the same: clear cp_unevaluated_operand so that the entire chain of
DECL_ARGUMENTS gets substituted.
PR c++/101233
gcc/cp/ChangeLog:
* pt.c (alias_ctad_tweaks): Clear cp_unevaluated_operand for
substituting DECL_ARGUMENTS.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/class-deduction-alias10.C: New test.
This implements the wording changes of CWG 960 which clarifies that two
reference types are covariant only if they're both lvalue references
or both rvalue references.
DR 960
PR c++/99664
gcc/cp/ChangeLog:
* search.c (check_final_overrider): Compare TYPE_REF_IS_RVALUE
when the return types are references.
gcc/testsuite/ChangeLog:
* g++.dg/inherit/covariant23.C: New test.
I've been experimenting with various new diagnostics that
require a common place for the analyzer to check the validity
of reads or writes to memory (e.g. buffer overflow).
As preliminary work, this patch adds new
region_model::check_region_for_{read|write} functions
which are called anywhere that the analyzer "sees" memory being
read from or written to (via region_model::get_store_value and
region_model::set_value).
This takes over the hardcoded calls to check_for_writable_region
(allowing for other kinds of checks on writes); checking reads is
currently a no-op.
gcc/analyzer/ChangeLog:
* analyzer.h (enum access_direction): New.
* engine.cc (exploded_node::on_longjmp): Update for new param of
get_store_value.
* program-state.cc (program_state::prune_for_point): Likewise.
* region-model-impl-calls.cc (region_model::impl_call_memcpy):
Replace call to check_for_writable_region with call to
check_region_for_write.
(region_model::impl_call_memset): Likewise.
(region_model::impl_call_strcpy): Likewise.
* region-model-reachability.cc (reachable_regions::add): Update
for new param of get_store_value.
* region-model.cc (region_model::get_rvalue_1): Likewise, also for
get_rvalue_for_bits.
(region_model::get_store_value): Add ctxt param and use it to call
check_region_for_read.
(region_model::get_rvalue_for_bits): Add ctxt param and use it to
call get_store_value.
(region_model::check_region_access): New.
(region_model::check_region_for_write): New.
(region_model::check_region_for_read): New.
(region_model::set_value): Update comment. Replace call to
check_for_writable_region with call to check_region_for_write.
* region-model.h (region_model::get_rvalue_for_bits): Add ctxt
param.
(region_model::get_store_value): Add ctxt param.
(region_model::check_region_access): New decl.
(region_model::check_region_for_write): New decl.
(region_model::check_region_for_read): New decl.
* region.cc (region_model::copy_region): Update call to
get_store_value.
* svalue.cc (initial_svalue::implicitly_live_p): Likewise.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
The problem is the buffer is too small to hold "-O" and
the interger. This fixes the problem by use the correct size
instead.
Changes since v1:
* v2: Use HOST_BITS_PER_LONG and just divide by 3 instead of
3.32.
OK? Bootstrapped and tested on x86_64-linux with no regressions.
gcc/c-family/ChangeLog:
PR c/101453
* c-common.c (parse_optimize_options): Use the correct
size for buffer.
This patch adds a tiny subset of the built-in and overload descriptions.
2021-04-02 Bill Schmidt <wschmidt@linux.ibm.com>
gcc/
* config/rs6000/rs6000-builtin-new.def: New.
* config/rs6000/rs6000-overload.def: New.
Currently gengtype supports scanning target-specific files for GC roots,
but those files must exist in the source tree. This patch extends the
support to include header files generated into the build directory. It
also allows targets to specify build dependencies for s-gtype to ensure
the built headers are up to date prior to running gengtype.
2021-06-15 Bill Schmidt <wschmidt@linux.ibm.com>
gcc/
* Makefile.in (EXTRA_GTYPE_DEPS): New variable.
(s-gtype): Depend on EXTRA_GTYPE_DEPS.
* gengtype-state.c (state_writer::write_state_file_list): Add a
parameter to the fileslist expression for the number of build
headers to scan.
(read_state_files_list): Detect build headers and strip the
initial "./" or ".\" from their names.
* gengtype.c (build_headers): New global variable.
(num_build_headers): Likewise.
(open_base_files): Emit #include for each build header.
(main): Detect and count build headers.
* gengtype.h (build_headers): New extern variable.
(num_build_headers): Likewise.
Fix tests when int == long by using long long instead.
gcc/testsuite/ChangeLog:
PR middle-end/101457
* gcc.dg/vect/vect-reduc-dot-19.c: Use long long.
* gcc.dg/vect/vect-reduc-dot-20.c: Likewise.
* gcc.dg/vect/vect-reduc-dot-21.c: Likewise.
* gcc.dg/vect/vect-reduc-dot-22.c: Likewise.
If __int128 is supported then __int_traits<__int128> is guaranteed to be
specialized, so we can remove the preprocessor condition inside the
std::numeric_traits<__detail::__max_size_type> specialization. Simply
using __int_traits<_Sp::__rep> gives the right answer.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* include/bits/max_size_type.h (numeric_limits<__max_size_type>):
Use __int_traits unconditionally.
This reverts c1676651b6 and uses the
__extension__ keyword to prevent pedantic warnings instead of diagnostic
pragmas.
This also adds the __extension__ keyword in <limits> and <bits/random.h>
where there are some more warnings that I missed in the previous commit.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* include/bits/cpp_type_traits.h (__INT_N): Use __extension__
instead of diagnostic pragmas.
* include/bits/functional_hash.h: Likewise.
* include/bits/iterator_concepts.h (__is_signed_int128)
(__is_unsigned_int128): Likewise.
* include/bits/max_size_type.h (__max_size_type): Likewise.
(numeric_limits<__max_size_type>): Likewise.
* include/bits/std_abs.h (abs): Likewise.
* include/bits/stl_algobase.h (__size_to_integer): Likewise.
* include/bits/uniform_int_dist.h (uniform_int_distribution):
Likewise.
* include/ext/numeric_traits.h (_GLIBCXX_INT_N_TRAITS):
Likewise.
* include/std/type_traits (__is_integral_helper<INT_N>)
(__is_signed_integer, __is_unsigned_integer)
(__make_unsigned<INT_N>, __make_signed<INT_N>): Likewise.
* include/std/limits (__INT_N): Add __extension__ keyword.
* include/bits/random.h (_Select_uint_least_t)
(random_device): Likewise.
The primary template for _CachedPosition is a dummy implementation for
non-forward ranges, the iterators for which generally can't be cached.
Because this implementation doesn't actually cache anything, _M_has_value
is defined to be false and so calls to _M_get (which are always guarded
by _M_has_value) are unreachable.
Still, to suppress a "control reaches end of non-void function" warning
I made _M_get return {}, but after P2325 input iterators are no longer
necessarily default constructible so this workaround now breaks valid
programs.
This patch fixes this by instead using __builtin_unreachable to squelch
the warning.
PR libstdc++/101231
libstdc++-v3/ChangeLog:
* include/std/ranges (_CachedPosition::_M_get): For non-forward
ranges, just call __builtin_unreachable.
* testsuite/std/ranges/istream_view.cc (test05): New test.
This gives the new split_view's sentinel type a defaulted default
constructor, something which was overlooked in r12-1665. This patch
also fixes a couple of other issues with the new split_view as reported
in the PR.
PR libstdc++/101214
libstdc++-v3/ChangeLog:
* include/std/ranges (split_view::split_view): Use std::move.
(split_view::_Iterator::_Iterator): Remove redundant
default_initializable constraint.
(split_view::_Sentinel::_Sentinel): Declare.
* testsuite/std/ranges/adaptors/split.cc (test02): New test.
Jonathan pointed me at this issue where
constexpr unsigned f() { constexpr int n = -1; return unsigned{n}; }
is accepted in system headers, despite the narrowing conversion from
a constant. I suspect that whereas narrowing warnings should be
disabled, ill-formed narrowing of constants should be a hard error
(which can still be disabled by -Wno-narrowing).
gcc/cp/ChangeLog:
* typeck2.c (check_narrowing): Don't suppress the pedantic error
in system headers.
libstdc++-v3/ChangeLog:
* testsuite/20_util/ratio/operations/ops_overflow_neg.cc: Add
dg-error.
gcc/testsuite/ChangeLog:
* g++.dg/cpp1y/Wnarrowing2.C: New test.
* g++.dg/cpp1y/Wnarrowing2.h: New test.
This removes a few more uses.
2021-07-16 Richard Biener <rguenther@suse.de>
* gimple-ssa-store-merging.c (verify_symbolic_number_p): Use
the type of the LHS.
(find_bswap_or_nop_1): Likewise.
(find_bswap_or_nop): Likewise.
* tree-vectorizer.h (vect_get_smallest_scalar_type): Adjust
prototype.
* tree-vect-data-refs.c (vect_get_smallest_scalar_type):
Remove unused parameters, pass in the scalar type. Fix
internal store function handling.
* tree-vect-stmts.c (vect_analyze_stmt): Remove assert.
(vect_get_vector_types_for_stmt): Move down check for
existing vector stmt after we've determined a scalar type.
Pass down the used scalar type to vect_get_smallest_scalar_type.
* tree-vect-generic.c (expand_vector_condition): Use
the type of the LHS.
(expand_vector_scalar_condition): Likewise.
(expand_vector_operations_1): Likewise.
* tree-vect-patterns.c (vect_widened_op_tree): Likewise.
(vect_recog_dot_prod_pattern): Likewise.
(vect_recog_sad_pattern): Likewise.
(vect_recog_widen_op_pattern): Likewise.
(vect_recog_widen_sum_pattern): Likewise.
(vect_recog_mixed_size_cond_pattern): Likewise.
This gets rid of a few gimple_expr_type uses.
2021-07-16 Richard Biener <rguenther@suse.de>
* gimple-fold.c (gimple_fold_stmt_to_constant_1): Use
the type of the LHS.
(gimple_assign_nonnegative_warnv_p): Likewise.
(gimple_call_nonnegative_warnv_p): Likewise. Return false
if the call has no LHS.
* gimple.c (gimple_could_trap_p_1): Use the type of the LHS.
* tree-eh.c (stmt_could_throw_1_p): Likewise.
* tree-inline.c (insert_init_stmt): Likewise.
* tree-ssa-loop-niter.c (get_val_for): Likewise.
* tree-outof-ssa.c (ssa_is_replaceable_p): Use the type of
the def.
* tree-ssa-sccvn.c (init_vn_nary_op_from_stmt): Take a
gassign *. Use the type of the lhs.
(vn_nary_op_lookup_stmt): Adjust.
(vn_nary_op_insert_stmt): Likewise.
This helps with generating code for kernel hotpatches, which contain
individual functions and are loaded more than 2G away from vmlinux.
This should not create performance regressions for the normal use
cases, because for local functions ld replaces @PLT calls with direct
calls.
gcc/ChangeLog:
* config/s390/predicates.md (bras_sym_operand): Accept all
functions in 64-bit mode, use UNSPEC_PLT31.
(larl_operand): Use UNSPEC_PLT31.
* config/s390/s390.c (s390_loadrelative_operand_p): Likewise.
(legitimize_pic_address): Likewise.
(s390_emit_tls_call_insn): Mark __tls_get_offset as function,
use UNSPEC_PLT31.
(s390_delegitimize_address): Use UNSPEC_PLT31.
(s390_output_addr_const_extra): Likewise.
(print_operand): Add @PLT to TLS calls, handle %K.
(s390_function_profiler): Mark __fentry__/_mcount as function,
use %K, use UNSPEC_PLT31.
(s390_output_mi_thunk): Use only UNSPEC_GOT, use %K.
(s390_emit_call): Use UNSPEC_PLT31.
(s390_emit_tpf_eh_return): Mark __tpf_eh_return as function.
* config/s390/s390.md (UNSPEC_PLT31): Rename from UNSPEC_PLT.
(*movdi_64): Use %K.
(reload_base_64): Likewise.
(*sibcall_brc): Likewise.
(*sibcall_brcl): Likewise.
(*sibcall_value_brc): Likewise.
(*sibcall_value_brcl): Likewise.
(*bras): Likewise.
(*brasl): Likewise.
(*bras_r): Likewise.
(*brasl_r): Likewise.
(*bras_tls): Likewise.
(*brasl_tls): Likewise.
(main_base_64): Likewise.
(reload_base_64): Likewise.
(@split_stack_call<mode>): Likewise.
gcc/testsuite/ChangeLog:
* g++.dg/ext/visibility/noPLT.C: Skip on s390x.
* g++.target/s390/mi-thunk.C: New test.
* gcc.target/s390/nodatarel-1.c: Move foostatic to the new
tests.
* gcc.target/s390/pr80080-4.c: Allow @PLT suffix.
* gcc.target/s390/risbg-ll-3.c: Likewise.
* gcc.target/s390/call.h: Common code for the new tests.
* gcc.target/s390/call-z10-pic-nodatarel.c: New test.
* gcc.target/s390/call-z10-pic.c: New test.
* gcc.target/s390/call-z10.c: New test.
* gcc.target/s390/call-z9-pic-nodatarel.c: New test.
* gcc.target/s390/call-z9-pic.c: New test.
* gcc.target/s390/call-z9.c: New test.
* gcc.target/s390/mfentry-m64-pic.c: New test.
* gcc.target/s390/tls.h: Common code for the new TLS tests.
* gcc.target/s390/tls-pic.c: New test.
* gcc.target/s390/tls.c: New test.
My previous change to vect_gen_while introduced paths which call
make_temp_ssa_name with a NULL name which isn't supported. The
following fixes that.
2021-07-16 Richard Biener <rguenther@suse.de>
PR tree-optimization/101467
* tree-vect-stmts.c (vect_gen_while): Properly guard
make_temp_ssa_name usage.
A recent change "gcc: Add vec_select -> subreg RTL simplification"
updated the expected test results for SVE extraction tests. The new
result should only have been changed for little endian. This patch
restores the old expected result for big endian.
gcc/testsuite/ChangeLog:
2021-07-15 Jonathan Wright <jonathan.wright@arm.com>
* gcc.target/aarch64/sve/extract_1.c: Split expected results
by big/little endian targets, restoring the old expected
result for big endian.
* gcc.target/aarch64/sve/extract_2.c: Likewise.
* gcc.target/aarch64/sve/extract_3.c: Likewise.
* gcc.target/aarch64/sve/extract_4.c: Likewise.
C-SKY previously used a forked print-sysroot-suffix.sh and define
CSKY_MULTILIB_DIRNAMES to specify OS multilib directories. This
patch delete the forked print-sysroot-suffix.sh and define
MULTILIB_DIRNAMES to generate same directories.
gcc/
* config.gcc: Don't use forked print-sysroot-suffix.sh and
t-sysroot-suffix for C-SKY.
* config/csky/print-sysroot-suffix.sh: Delete.
* config/csky/t-csky-linux: Delete.
* config/csky/t-sysroot-suffix: Define MULTILIB_DIRNAMES
instead of CSKY_MULTILIB_DIRNAMES.
This fixes the partial reduction of the reused reduction vector to
carried out in the correct sign and the correctly signed vector
recorded for the skip edge use.
2021-07-16 Richard Biener <rguenther@suse.de>
* tree-vect-loop.c (vect_transform_cycle_phi): Correct sign
conversion issues with the partial reduction of the reused
vector accumulator.
The following defaults --param vect-partial-vector-usage to zero
for x86_64 matching existing behavior where support for this
is not present.
2021-07-15 Richard Biener <rguenther@suse.de>
* config/i386/i386-options.c (ix86_option_override_internal): Set
param_vect_partial_vector_usage to zero if not set.
This reorders the @{ and @relates tags, and moves the definition of the
__cpp_lib_make_unique macro out of the group, as it seems to confuse
doxygen.
libstdc++-v3/ChangeLog:
* include/bits/unique_ptr.h: Adjust doxygen markup.