Remove unnecessary tests before copying function address to r12.
2020-08-28 Bill Schmidt <wschmidt@linux.ibm.com>
gcc/
* config/rs6000/rs6000.c (rs6000_call_aix): Remove test for r12.
(rs6000_sibcall_aix): Likewise.
An API change broke the amdgcn build.
gcc/ChangeLog:
* config/gcn/gcn-tree.c (gcn_goacc_get_worker_red_decl): Add "true"
parameter to vec_safe_grow_cleared.
gcc/ChangeLog:
* ggc-common.c (gt_pch_save): Add argument to a call.
gcc/jit/ChangeLog:
* jit-recording.c (recording::switch_::make_debug_string): Add argument
to a call.
gcc/fortran/ChangeLog:
PR fortran/94672
* trans-array.c (gfc_trans_g77_array): Check against the parm decl and
set the nonparm decl used for the is-present check to NULL if absent.
gcc/testsuite/ChangeLog:
PR fortran/94672
* gfortran.dg/optional_assumed_charlen_2.f90: New test.
Problem is related to that operand 4 (In original pattern
cond_sub<mode>_any_const) is no longer the same as operand 1, and so
the pattern doesn't match the split condition.
Pattern cond_sub<mode>_any_const is being split by this patch into two
separate patterns:
* Pattern cond_sub<mode>_relaxed_const now matches const_int
SVE_RELAXED_GP operand.
* Pattern cond_sub<mode>_strict_const now matches const_int
SVE_STRICT_GP operand.
* Remove aarch64_sve_pred_dominates_p condition from both patterns.
gcc/ChangeLog:
PR target/96357
* config/aarch64/aarch64-sve.md
(cond_sub<mode>_relaxed_const): Updated and renamed from
cond_sub<mode>_any_const pattern.
(cond_sub<mode>_strict_const): New pattern.
gcc/testsuite/ChangeLog:
PR target/96357
* gcc.target/aarch64/sve/pr96357.c: New test.
This test fails on ILP32 since we're looking for a pattern that could
only be hit on LP64. Disabling the test on ILP32 since the problematic
mult pattern was never hit there, so there's nothing to test.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/mem-shift-canonical.c: Skip on ILP32.
2020-08-28 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/96624
* simplify.c (gfc_simplify_reshape): Detect zero shape and
clear index if found.
gcc/testsuite/
PR fortran/96624
* gfortran.dg/reshape_8.f90 : New test.
gcc.dg/pr96579.c includes gcc.dg/pr96370.c which needs target dfp.
2020-08-28 Christophe Lyon <christophe.lyon@linaro.org>
gcc/testsuite/
* gcc.dg/pr96579.c: Compile only with target dfp.
2020-08-30 Uros Bizjak <ubizjak@gmail.com>
gcc/ChangeLog:
PR target/96744
* config/i386/i386-expand.c (split_double_mode): Also handle
E_P2HImode and E_P2QImode.
* config/i386/sse.md (MASK_DWI): New define_mode_iterator.
(mov<mode>): New expander for P2HI,P2QI.
(*mov<mode>_internal): New define_insn_and_split to split
movement of P2QI/P2HI to 2 movqi/movhi patterns after reload.
gcc/testsuite/ChangeLog:
* gcc.target/i386/double_mask_reg-1.c: New test.
Replace the U+00B7 middle dot character, placed after "mips64p32le"
in the target lists, with a space. The U+00B7 character may not be
considered whitespace by Bourne shell and any non-ASCII character
may render incorrectly in some terminal devices.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/251177
This implements the changes from P0548 "common_type and duration". That
was a change for C++17, but as it corrects some issues introduced by DRs
I'm also treating it as a DR and changing it for all modes from C++11
up.
The main change is that duration<R,P>::period no longer denotes P, but
rather P::type, the reduced ratio. The unary operator+ and operator-
members of duration should now return a duration using that reduced
ratio.
The requirement that common_type<T>::type is the same type as
common_type<T, T>::type (rather than simply T) was already implemented
for PR 89102.
The standard says that duration::operator+() and duration::operator-()
should return common_type_t<duration>, but that seems unnecessarily
expensive to compute. This change just uses duration<rep, period> which
is the same type, so we don't need to instantiate common_type.
As an optimization, this also adds partial specializations of
common_type for two durations of the same type, a single duration, two
time_points of the same type, and a single time_point. These
specializations avoid instantiating other specializations of common_type
and one or both of __duration_common_type or __timepoint_common_type for
the cases where the answer is trivial to obtain.
libstdc++-v3/ChangeLog:
* include/std/chrono (__duration_common_type): Ensure the
reduced ratio is used. Remove unused partial specialization
using __failure_type.
(common_type): Pass reduced ratios to __duration_common_type.
Add partial specializations for simple cases involving a single
duration or time_point type.
(duration::period): Use reduced ratio.
(duration::operator+(), duration::operator-()): Return duration
type using the reduced ratio.
* testsuite/20_util/duration/requirements/typedefs_neg2.cc:
Adjust expected errors.
* testsuite/20_util/duration/requirements/reduced_period.cc: New test.
This fixes the months-based addition for year_month when the
year_month's month component is 0.
libstdc++-v3/ChangeLog:
* include/std/chrono (year_month::operator+): Properly handle a
month value of 0 by casting the month value to int before
subtracting 1 from it so that the difference is sign-extended in
the subsequent addition.
* testsuite/std/time/year_month/1.cc: Test adding months to a
year_month whose month component is below or above the
normalized range of [1,12].
We currently don't enforce a constraint on some of the calendar types'
addition/subtraction operator overloads that take a 'months' arguments:
Constraints: If the argument supplied by the caller for the months
parameter is convertible to years, its implicit conversion sequence to
years is worse than its implicit conversion sequence to months.
This constraint is relevant when adding/subtracting a duration to/from,
say, a year_month where the given duration is convertible to both
'months' and to 'years' (as in the new testcases below). The correct
behavior here in light of this constraint is to perform the operation
through the (more efficient) 'years'-based overload, but we currently
emit an ambiguous overload error.
This patch templatizes the 'months'-based addition/subtraction operator
overloads so that in the event of an implicit-conversion tie, we select
the non-template 'years'-based overload. This is the same approach
that the date library takes for enforcing this constraint.
libstdc++-v3/ChangeLog:
* include/std/chrono
(__detail::__months_years_conversion_disambiguator): Define.
(year_month::operator+=): Templatize the 'months'-based overload
so that the 'years'-based overload is selected in case of
equally-ranked implicit conversion sequences to both 'months'
and 'years' from the supplied argument.
(year_month::operator-=): Likewise.
(year_month::operator+): Likewise.
(year_month::operator-): Likewise.
(year_month_day::operator+=): Likewise.
(year_month_day::operator-=): Likewise.
(year_month_day::operator+): Likewise.
(year_month_day::operator-): Likewise.
(year_month_day_last::operator+=): Likewise.
(year_month_day_last::operator-=): Likewise.
(year_month_day_last::operator+): Likewise
(year_month_day_last::operator-): Likewise.
(year_month_day_weekday::operator+=): Likewise
(year_month_day_weekday::operator-=): Likewise.
(year_month_day_weekday::operator+): Likewise.
(year_month_day_weekday::operator-): Likewise.
(year_month_day_weekday_last::operator+=): Likewise
(year_month_day_weekday_last::operator-=): Likewise.
(year_month_day_weekday_last::operator+): Likewise.
(year_month_day_weekday_last::operator-): Likewise.
(testsuite/std/time/year_month/2.cc): New test.
(testsuite/std/time/year_month_day/2.cc): New test.
(testsuite/std/time/year_month_day_last/2.cc): New test.
(testsuite/std/time/year_month_weekday/2.cc): New test.
(testsuite/std/time/year_month_weekday_last/2.cc): New test.
For _Atomic fields, lowering the alignment of long long or double etc.
fields on ia32 is undesirable, because then one really can't perform atomic
operations on those using cmpxchg8b.
The following patch stops lowering the alignment in fields for _Atomic
types (the x86_field_alignment change) and for -mpreferred-stack-boundary=2
also ensures we don't misalign _Atomic long long etc. automatic variables
(the ix86_{local,minimum}_alignment changes).
Not sure about iamcu_alignment change, I know next to nothing about IA MCU,
but unless it doesn't have cmpxchg8b instruction, it would surprise me if we
don't want to do it as well.
clang apparently doesn't lower the field alignment for _Atomic.
2020-08-27 Jakub Jelinek <jakub@redhat.com>
PR target/65146
* config/i386/i386.c (iamcu_alignment): Don't decrease alignment
for TYPE_ATOMIC types.
(ix86_local_alignment): Likewise.
(ix86_minimum_alignment): Likewise.
(x86_field_alignment): Likewise, and emit a -Wpsabi diagnostic
for it.
* gcc.target/i386/pr65146.c: New test.
Prior to P10, ELFv2 hasn't implemented nonlocal sibcalls. Now that we do,
we need to be sure that r12 is set up prior to such a call.
2020-08-27 Bill Schmidt <wschmidt@linux.ibm.com>
gcc/
PR target/96787
* config/rs6000/rs6000.c (rs6000_sibcall_aix): Support
indirect call for ELFv2.
gcc/testsuite/
PR target/96787
* gcc.target/powerpc/pr96787-1.c: New.
* gcc.target/powerpc/pr96787-2.c: New.
A length expression containing a divide by zero in a character
declaration will result in an ICE if the constant is anymore
complicated that a contant divided by a constant.
The cause was that char_len_param_value can return MATCH_YES
even if a divide by zero was seen. Prior to returning check
whether a divide by zero was seen and if so set it to MATCH_ERROR.
2020-08-27 Mark Eggleston <markeggleston@gcc.gnu.org>
gcc/fortran
PR fortran/95882
* decl.c (char_len_param_value): Check gfc_seen_div0 and
if it is set return MATCH_ERROR.
2020-08-27 Mark Eggleston <markeggleston@gcc.gnu.org>
gcc/testsuite/
PR fortran/95882
* gfortran.dg/pr95882_1.f90: New test.
* gfortran.dg/pr95882_2.f90: New test.
* gfortran.dg/pr95882_3.f90: New test.
* gfortran.dg/pr95882_4.f90: New test.
* gfortran.dg/pr95882_5.f90: New test.
This removes the bogus tranfer of flow-sensitive info in copy_ref_info
plus fixes one oversight in FRE when flow-sensitive non-NULLness was added to
points-to info.
2020-08-27 Richard Biener <rguenther@suse.de>
PR tree-optimization/96522
* tree-ssa-address.c (copy_ref_info): Reset flow-sensitive
info of the copied points-to. Transfer bigger alignment
via the access type.
* tree-ssa-sccvn.c (eliminate_dom_walker::eliminate_stmt):
Reset all flow-sensitive info.
* gcc.dg/torture/pr96522.c: New testcase.
The following streamlines TARGET_MEM_REF dumping building
on what we do for MEM_REF and thus dumping things like
access type, TBAA type and base/clique. I've changed it
to do semantic dumping aka base + offset + step * index
rather than the odd base: A, step: way.
2020-08-27 Richard Biener <rguenther@suse.de>
* tree-pretty-print.c (dump_mem_ref): Handle TARGET_MEM_REFs.
(dump_generic_node): Use dump_mem_ref also for TARGET_MEM_REF.
* gcc.dg/tree-ssa/loop-19.c: Adjust.
* gcc.dg/tree-ssa/loop-2.c: Likewise.
* gcc.dg/tree-ssa/loop-3.c: Likewise.
Inside a (mem) RTX, it is canonical to write multiplications by powers
of two using a (mult) [0]. Outside of a (mem), the canonical way to
write multiplications by powers of two is using (ashift).
Now I observed that LRA does not quite respect this RTL canonicalization
rule. When compiling gcc/testsuite/gcc.dg/torture/pr34330.c with -Os
-ftree-vectorize, the RTL in the dump "281r.ira" has the insn:
(set (reg:SI 111)
(mem:SI (plus:DI (mult:DI (reg:DI 101 [ ivtmp.9 ])
(const_int 4 [0x4]))
(reg/v/f:DI 105 [ b ]))))
but LRA then proceeds to generate a reload, and we get the following
non-canonical insn in "282r.reload":
(set (reg:DI 7 x7 [121])
(plus:DI (mult:DI (reg:DI 5 x5 [orig:101 ivtmp.9 ] [101])
(const_int 4 [0x4]))
(reg/v/f:DI 1 x1 [orig:105 b ] [105])))
This patch fixes LRA to ensure that we generate canonical RTL in this
case. After the patch, we get the following insn in "282r.reload":
(set (reg:DI 7 x7 [121])
(plus:DI (ashift:DI (reg:DI 5 x5 [orig:101 ivtmp.9 ] [101])
(const_int 2 [0x2]))
(reg/v/f:DI 1 x1 [orig:105 b ] [105])))
[0] : https://gcc.gnu.org/onlinedocs/gccint/Insn-Canonicalizations.html
gcc/ChangeLog:
* lra-constraints.c (canonicalize_reload_addr): New.
(curr_insn_transform): Use canonicalize_reload_addr to ensure we
generate canonical RTL for an address reload.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/mem-shift-canonical.c: New test.
This makes sure to put special-ops expanded rhs left where
expression rewrite expects it.
2020-08-27 Richard Biener <rguenther@suse.de>
PR tree-optimization/96579
* tree-ssa-reassoc.c (linearize_expr_tree): If we expand
rhs via special ops make sure to swap operands.
* gcc.dg/pr96579.c: New testcase.
This improves DSEs stmt walking by not considering a DEF without
uses for further processing (and thus giving up when there's two
paths to follow).
2020-08-26 Richard Biener <rguenther@suse.de>
PR tree-optimization/96565
* tree-ssa-dse.c (dse_classify_store): Remove defs with
no uses from further processing.
* gcc.dg/tree-ssa/ssa-dse-40.c: New testcase.
* gcc.dg/builtin-object-size-4.c: Adjust.
Almost all of the proposed resolution for LWG 3448 is already
implemented; the only part left is to adjust the return type of
transform_view::sentinel::operator-.
libstdc++-v3/ChangeLog:
PR libstdc++/95322
* include/std/ranges (transform_view::sentinel::__distance_from):
Give this a deduced return type.
(transform_view::sentinel::operator-): Adjust the return type so
that it's based on the constness of the iterator rather than
that of the sentinel.
* testsuite/std/ranges/adaptors/95322.cc: Refer to LWG 3488.
This implements the proposed resolution for LWG 3406, and adds a
testcase for the example from P1994R1.
libstdc++-v3/ChangeLog:
* include/std/ranges (elements_view::begin): Adjust constraints.
(elements_view::end): Likewise.
(elements_view::_Sentinel::operator==): Templatize to take both
_Iterator<true> and _Iterator<false>.
(elements_view::_Sentinel::operator-): Likewise.
* testsuite/std/ranges/adaptors/elements.cc: Add testcase for
the example from P1994R1.
* testsuite/std/ranges/adaptors/lwg3406.cc: New test.
The example from the paper doesn't compile without the proposed
resolution for LWG 3406, so we'll add a testcase for this once the
proposed resolution is implemented.
libstdc++-v3/ChangeLog:
* include/std/ranges (elements_view::end): Replace these two
overloads with four new overloads.
(elements_view::_Iterator::operator==): Remove.
(elements_view::_Iterator::operator-): Likewise.
(elements_view::_Sentinel): Define.
A number of i386 math optimisation tests are looking assembly instructions
that are only emitted when the compiler knows the target has a C99 libm
available. Since targets like *-elf may not have such a libm, a C99 runtime
requirement is added to these tests.
gcc/testsuite/ChangeLog
* gcc.target/i386/387-7.c: Add dg-require-effective-target c99_runtime.
* gcc.target/i386/387-9.c: Likewise.
* g++.target/i386/avx512bw-pr96246-1.C: Likewise.
* gcc.target/i386/avx512f-rint-sfix-vec-2.c: Likewise.
* gcc.target/i386/avx512f-rintf-sfix-vec-2.c: Likewise.
* g++.target/i386/avx512vl-pr96246-1.C: Likewise.
* gcc.target/i386/pr61403.c: Likewise.
* gcc.target/i386/sse4_1-ceil-sfix-vec.c: Likewise.
* gcc.target/i386/sse4_1-ceilf-sfix-vec.c: Likewise.
* gcc.target/i386/sse4_1-floor-sfix-vec.c: Likewise.
* gcc.target/i386/sse4_1-floorf-sfix-vec.c: Likewise.
* gcc.target/i386/sse4_1-rint-sfix-vec.c: Likewise.
* gcc.target/i386/sse4_1-rintf-sfix-vec.c: Likewise.
* gcc.target/i386/sse4_1-round-sfix-vec.c: Likewise.
* gcc.target/i386/sse4_1-roundf-sfix-vec.c: Likewise.
The wording of the description of -fprofile-exclude-files is easy to
misunderstand. One can be led to believe a file is excluded only if
it matches all of the patterns, not just one. This patch tries to
clarify the function. It also adjusts the wording of
-fprofile-filter-files accordingly.
gcc/
PR gcov-profile/96285
* common.opt, doc/invoke.texi: Clarify wording of
-fprofile-exclude-files and adjust -fprofile-filter-files to
match.
The implementation of define_expand and define_insn patterns to handle
shifts in the MSP430 backend is inconsistent, resulting in missed
opportunities to make best use of the architecture's features.
There's now a single define_expand used as the entry point for all valid
shifts, and the decision to either use a helper function to perform the
shift (often required for the 430 ISA), or fall through to the
define_insn patterns can be made from that expander function.
Shifts by a constant amount have been grouped into one define_insn for
each type of shift, instead of having different define_insn patterns for
shifts by different amounts.
A new target option "-mmax-inline-shift=" has been added to allow tuning
of the number of shift instructions to emit inline, instead of using
a library helper function.
gcc/ChangeLog:
* config/msp430/constraints.md (K): Change unused constraint to
constraint to a const_int between 1 and 19.
(P): New constraint.
* config/msp430/msp430-protos.h (msp430x_logical_shift_right): Remove.
(msp430_expand_shift): New.
(msp430_output_asm_shift_insns): New.
* config/msp430/msp430.c (msp430_rtx_costs): Remove shift costs.
(CSH): Remove.
(msp430_expand_helper): Remove hard-coded generation of some inline
shift insns.
(use_helper_for_const_shift): New.
(msp430_expand_shift): New.
(msp430_output_asm_shift_insns): New.
(msp430_print_operand): Add new 'W' operand selector.
(msp430x_logical_shift_right): Remove.
* config/msp430/msp430.md (HPSI): New define_mode_iterator.
(HDI): Likewise.
(any_shift): New define_code_iterator.
(shift_insn): New define_code_attr.
Adjust unnamed insn patterns searched for by combine.
(ashlhi3): Remove.
(slli_1): Remove.
(430x_shift_left): Remove.
(slll_1): Remove.
(slll_2): Remove.
(ashlsi3): Remove.
(ashldi3): Remove.
(ashrhi3): Remove.
(srai_1): Remove.
(430x_arithmetic_shift_right): Remove.
(srap_1): Remove.
(srap_2): Remove.
(sral_1): Remove.
(sral_2): Remove.
(ashrsi3): Remove.
(ashrdi3): Remove.
(lshrhi3): Remove.
(srli_1): Remove.
(430x_logical_shift_right): Remove.
(srlp_1): Remove.
(srll_1): Remove.
(srll_2x): Remove.
(lshrsi3): Remove.
(lshrdi3): Remove.
(<shift_insn><mode>3): New define_expand.
(<shift_insn>hi3_430): New define_insn.
(<shift_insn>si3_const): Likewise.
(ashl<mode>3_430x): Likewise.
(ashr<mode>3_430x): Likewise.
(lshr<mode>3_430x): Likewise.
(*bitbranch<mode>4_z): Replace renamed predicate msp430_bitpos with
const_0_to_15_operand.
* config/msp430/msp430.opt: New option -mmax-inline-shift=.
* config/msp430/predicates.md (const_1_to_8_operand): New predicate.
(const_0_to_15_operand): Rename msp430_bitpos predicate.
(const_1_to_19_operand): New predicate.
* doc/invoke.texi: Document -mmax-inline-shift=.
libgcc/ChangeLog:
* config/msp430/slli.S (__gnu_mspabi_sllp): New.
* config/msp430/srai.S (__gnu_mspabi_srap): New.
* config/msp430/srli.S (__gnu_mspabi_srlp): New.
gcc/testsuite/ChangeLog:
* gcc.target/msp430/emulate-srli.c: Fix expected assembler text.
* gcc.target/msp430/max-inline-shift-430-no-opt.c: New test.
* gcc.target/msp430/max-inline-shift-430.c: New test.
* gcc.target/msp430/max-inline-shift-430x.c: New test.
The _Tuple_impl constructor for allocator-extended construction from a
different tuple type uses the _Tuple_impl's own _Head type in the
__use_alloc test. That is incorrect, because the argument tuple could
have a different type. Using the wrong type might select the
leading-allocator convention when it should use the trailing-allocator
convention, or vice versa.
libstdc++-v3/ChangeLog:
PR libstdc++/96803
* include/std/tuple
(_Tuple_impl(allocator_arg_t, Alloc, const _Tuple_impl<U...>&)):
Replace parameter pack with a type parameter and a pack and pass
the first type to __use_alloc.
* testsuite/20_util/tuple/cons/96803.cc: New test.
A recent change altered the layout of EBO-helper base classes, resulting
in an ambiguity when the hash function and equality predicate are the
same type.
This modifies the type of one of the base classes, so that we don't get
two base classes of the same type.
libstdc++-v3/ChangeLog:
* include/bits/hashtable_policy.h (_Hash_code_base): Change
index of _Hashtable_ebo_helper base class.
* testsuite/23_containers/unordered_map/dup_types.cc: New test.
This removes all uses of VR_ANTI_RANGE.
gcc/ChangeLog:
* tree-ssa-dom.c (simplify_stmt_for_jump_threading): Abstract code out to...
* tree-vrp.c (find_case_label_range): ...here. Rewrite for to use irange
API.
(simplify_stmt_for_jump_threading): Call find_case_label_range instead of
duplicating the code in simplify_stmt_for_jump_threading.
* tree-vrp.h (find_case_label_range): New prototype.
This fixes vectorized PHI latch edge updating and delay it until
all of the loop is code generated to deal with the case that the
latch def is a PHI in the same block.
2020-08-26 Richard Biener <rguenther@suse.de>
PR tree-optimization/96698
* tree-vectorizer.h (loop_vec_info::reduc_latch_defs): New.
(loop_vec_info::reduc_latch_slp_defs): Likewise.
* tree-vect-stmts.c (vect_transform_stmt): Only record
stmts to update PHI latches from, perform the update ...
* tree-vect-loop.c (vect_transform_loop): ... here after
vectorizing those PHIs.
(info_for_reduction): Properly handle non-reduction PHIs.
* gcc.dg/vect/pr96698.c: New testcase.
Since GCC 6.1 there is no reason we can't just use __glibcxx_assert in
constexpr functions in string_view. As long as the condition is true,
there will be no call to std::__replacement_assert that would make the
function ineligible for constant evaluation.
PR libstdc++/71960
* include/experimental/string_view (basic_string_view):
Enable debug assertions.
* include/std/string_view (basic_string_view):
Likewise.
The corresponding commit had the Co-authored-by: lines in the middle of
the commit message instead of at the end, so the ChangeLog script didn't
consider them.