Bitfields, while they live in memory, aren't something inline-asm can easily
operate on.
For C and "=m" or "+m", we were diagnosing bitfields in the past in the
FE, where c_mark_addressable had:
    case COMPONENT_REF:
      if (DECL_C_BIT_FIELD (TREE_OPERAND (x, 1)))
        {
          error
            ("cannot take address of bit-field %qD", TREE_OPERAND (x, 1));
          return false;
        }
but that check got moved in GCC 6 to build_unary_op instead and now we
emit an error during expansion and ICE afterwards (i.e. error-recovery).
For "m" it used to be diagnosed in c_mark_addressable too, but since
GCC 6 it is ice-on-invalid.
For C++, this was never diagnosed in the FE, but used to be diagnosed
in the gimplifier and/or during expansion before 4.8.
The following patch does multiple things:
1) diagnoses it in the FEs
2) simplifies the inline asm during expansion if any errors have been
reported (similarly to how e.g. the vregs pass, when it detects errors in
an inline asm, either deletes it or simplifies it to the bare minimum,
just labels), so that we don't have error-recovery ICEs there
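As a rough illustration of the user-visible side (a sketch, not taken from the new test): a bit-field has no address, so it is not a valid "m", "=m" or "+m" operand, and the usual workaround is to go through an addressable temporary. The struct and function names below are made up, and the empty asm template assumes a compiler with GNU-style extended asm.

```cpp
#include <cassert>

struct Flags {
    unsigned ready : 1;
    unsigned count : 7;
};

// Hypothetical helper. Using the bit-field directly as a memory operand,
//   asm volatile ("" : "+m" (f.count));
// is invalid, so copy it into an ordinary object first.
unsigned bump_count(Flags &f)
{
    unsigned tmp = f.count;
    asm volatile ("" : "+m" (tmp));  // empty template: tmp is left unchanged
    f.count = (tmp + 1) & 0x7fu;
    return f.count;
}
```

The temporary is an ordinary addressable object, so the memory constraint is satisfiable and the bit-field is only read and written by plain C++ code.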
2021-06-11 Jakub Jelinek <jakub@redhat.com>
PR inline-asm/100785
gcc/
* cfgexpand.c (expand_asm_stmt): If errors are emitted,
remove all inputs, outputs and clobbers from the asm and
set template to "".
gcc/c/
* c-typeck.c (c_mark_addressable): Diagnose trying to make
bit-fields addressable.
gcc/cp/
* typeck.c (cxx_mark_addressable): Diagnose trying to make
bit-fields addressable.
gcc/testsuite/
* c-c++-common/pr100785.c: New test.
(cherry picked from commit 644c2cc5f2)
libstdc++-v3/ChangeLog:
PR libstdc++/100806
* include/bits/semaphore_base.h (__atomic_semaphore::_M_release):
Force _M_release() to wake all waiting threads.
* testsuite/30_threads/semaphore/100806.cc: New test.
(cherry picked from commit e02840c1a9)
The files fixkfti-sw.c and fixunskfti-sw.c are renamed versions of
fixkfti.c and fixunskfti.c respectively to do the conversions in software.
The function names in the files were updated to match the rename, and
some whitespace was fixed. The file float128-p10.c contains the functions
for using the ISA 3.1 hardware instructions to perform the conversions.
2021-06-15 Carl Love <cel@us.ibm.com>
gcc/ChangeLog
* config/rs6000/rs6000.c (__fixkfti, __fixunskfti, __floattikf,
__floatuntikf): Names changed to __fixkfti_sw, __fixunskfti_sw,
__floattikf_sw, __floatuntikf_sw respectively.
* config/rs6000/rs6000.md (floatti<mode>2, floatunsti<mode>2,
fix_trunc<mode>ti2, fixuns_trunc<mode>ti2): Add
define_insn for mode IEEE 128.
gcc/testsuite/ChangeLog
* gcc.target/powerpc/fp128_conversions.c: New file.
* gcc.target/powerpc/int_128bit-runnable.c (vextsd2q,
vcmpuq, vcmpsq, vcmpequq, vcmpequq., vcmpgtsq, vcmpgtsq.,
vcmpgtuq, vcmpgtuq.): Update scan-assembler-times.
(ppc_native_128bit): Remove dg-require-effective-target.
libgcc/ChangeLog
* config.host: Add if test and set for
libgcc_cv_powerpc_3_1_float128_hw.
* config/rs6000/fixkfti.c: Renamed to fixkfti-sw.c.
Change calls of __fixkfti to __fixkfti_sw.
* config/rs6000/fixunskfti.c: Renamed to fixunskfti-sw.c.
Change calls of __fixunskfti to __fixunskfti_sw.
* config/rs6000/float128-p10.c (__floattikf_hw,
__floatuntikf_hw, __fixkfti_hw, __fixunskfti_hw): New file.
* config/rs6000/float128-ifunc.c (SW_OR_HW_ISA3_1): New macro.
(__floattikf_resolve, __floatuntikf_resolve, __fixkfti_resolve,
__fixunskfti_resolve): Add resolve functions.
(__floattikf, __floatuntikf, __fixkfti, __fixunskfti): New functions.
* config/rs6000/float128-sed (floattitf, __floatuntitf,
__fixtfti, __fixunstfti): Add editor commands to change names.
* config/rs6000/float128-sed-hw (__floattitf,
__floatuntitf, __fixtfti, __fixunstfti): Add editor commands to
change names.
* config/rs6000/floattikf.c: Renamed to floattikf-sw.c.
* config/rs6000/floatuntikf.c: Renamed to floatuntikf-sw.c.
* config/rs6000/quad-float128.h (__floattikf_sw,
__floatuntikf_sw, __fixkfti_sw, __fixunskfti_sw, __floattikf_hw,
__floatuntikf_hw, __fixkfti_hw, __fixunskfti_hw, __floattikf,
__floatuntikf, __fixkfti, __fixunskfti): New extern declarations.
* config/rs6000/t-float128 (floattikf, floatuntikf,
fixkfti, fixunskfti): Remove file names from fp128_ppc_funcs.
(floattikf-sw, floatuntikf-sw, fixkfti-sw, fixunskfti-sw): Add
file names to fp128_ppc_funcs.
* config/rs6000/t-float128-hw (fp128_3_1_hw_funcs,
fp128_3_1_hw_src, fp128_3_1_hw_static_obj, fp128_3_1_hw_shared_obj,
fp128_3_1_hw_obj): Add variables for ISA 3.1 support.
* config/rs6000/t-float128-p10-hw: New file.
* configure: Update script for isa 3.1 128-bit float support.
* configure.ac: Add check for 128-bit float hardware support.
This patch also renames and moves the VSX_TI iterator from vsx.md to
VEC_TI in vector.md. The uses of VEC_TI are also updated.
2021-04-29 Carl Love <cel@us.ibm.com>
gcc/ChangeLog
* config/rs6000/altivec.md (altivec_vslq, altivec_vsrq):
Rename to altivec_vslq_<mode>, altivec_vsrq_<mode>, mode VEC_TI.
* config/rs6000/vector.md (VEC_TI): Was named VSX_TI in vsx.md.
(vashlv1ti3): Change to vashl<mode>3, mode VEC_TI.
(vlshrv1ti3): Change to vlshr<mode>3, mode VEC_TI.
* config/rs6000/vsx.md (VSX_TI): Remove define_mode_iterator. Update
uses of VSX_TI to VEC_TI.
gcc/testsuite/ChangeLog
* gcc.target/powerpc/int_128bit-runnable.c: Add shift_right, shift_left
tests.
2021-06-07 Carl Love <cel@us.ibm.com>
gcc/
* config/rs6000/altivec.md (altivec_vrl<VI_char>mi): Fix
bug in argument generation.
gcc/testsuite/
* gcc.target/powerpc/check-builtin-vec_rlnm-runnable.c:
New runnable test case.
* gcc.target/powerpc/vec-rlmi-rlnm.c: Update scan assembler times
for xxlor instruction.
An explicitly deleted function must be deleted on its first declaration. We
were diagnosing this error only with -Wpedantic, but always giving the
"previous declaration" note. For GCC 11, keep the -Wpedantic dependency,
just make the note depend on the previous diagnostic.
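A minimal sketch of the rule in question, using hypothetical function names: deleting an overload on its first declaration is fine, while the commented-out line shows the form that gets diagnosed. The detection trait is only there so the example can be checked; it assumes C++17.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

void log_value(int) {}
void log_value(double) = delete;   // OK: deleted on its first declaration

// The form being diagnosed: deleting a function that was already declared.
//   void log_value(int) = delete;

// Detects whether log_value(T) is callable; selecting a deleted overload
// makes the call ill-formed, which is a substitution failure here.
template <class T, class = void>
struct can_log : std::false_type {};

template <class T>
struct can_log<T, std::void_t<decltype(log_value(std::declval<T>()))>>
    : std::true_type {};

static_assert(can_log<int>::value);
static_assert(!can_log<double>::value);
```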
PR c++/101106
gcc/cp/ChangeLog:
* decl.c (duplicate_decls): Condition note on return value of pedwarn.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/deleted15.C: New test.
Before my r277864, build_new_op promoted enums to int before passing them on
to cp_build_binary_op; after that commit, it doesn't, so
warn_for_sign_compare sees the enum operands and gives a redundant warning.
This warning dates back to 1995, and seems to have been dead code for a long
time--likely since build_new_op was added in 1997--so let's just remove it.
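For context, a sketch of the kind of comparison involved (the enum and function names are invented, and this is not the new test): comparing two different enumeration types is expected to draw a -Wenum-compare warning from the C++ front end, so a second warning from warn_for_sign_compare under -Wall would be redundant.

```cpp
#include <cassert>

enum Color { Red, Green };
enum Fruit { Apple, Pear };

// Compares enumerators of two unrelated enumeration types by value.
// Expect a -Wenum-compare warning on the comparison; the code still compiles
// in the language modes where such comparisons are allowed.
bool same_index(Color c, Fruit f)
{
    return c == f;
}
```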
PR c++/100879
gcc/c-family/ChangeLog:
* c-warn.c (warn_for_sign_compare): Remove C++ enum mismatch
warning.
gcc/testsuite/ChangeLog:
* g++.dg/diagnostic/enum3.C: New test.
Backported from trunk.
Update the count of matches for the fusion combine patterns after
the recent changes to them. At Segher's request, used \m and \M
in the match patterns. Also I have grouped together all alternatives of
each fusion insn, which should hopefully make this test a little less
fragile.
gcc/testsuite/ChangeLog
* gcc.target/powerpc/fusion-p10-2logical.c: Update pattern
match counts.
* gcc.target/powerpc/fusion-p10-addadd.c: Update pattern match
counts.
* gcc.target/powerpc/fusion-p10-ldcmpi.c: Update pattern match
counts.
* gcc.target/powerpc/fusion-p10-logadd.c: Update pattern match
counts.
(cherry picked from commit a798b3f15c)
gcc/fortran/ChangeLog:
PR fortran/100283
PR fortran/101123
* trans-intrinsic.c (gfc_conv_intrinsic_minmax): Unconditionally
convert result of min/max to result type.
gcc/testsuite/ChangeLog:
PR fortran/100283
PR fortran/101123
* gfortran.dg/min0_max0_1.f90: New test.
* gfortran.dg/min0_max0_2.f90: New test.
(cherry picked from commit 6fc5433963)
The standard does not require the iterator's value type to be
convertible to the result type; it only requires that the result of
dereferencing the iterator can be passed to the binary function.
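A sketch of a call that relies on this, assuming a libstdc++ that contains the fix: std::string is not convertible to std::size_t, but the (hypothetical) function object below accepts every combination of accumulator and element that reduce may pass to it, which is what the 'Mandates' conditions ask for.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical binary operation: sums lengths. It is callable with
// (T, T), (T, elem), (elem, T) and (elem, elem), each returning T.
struct AddLengths {
    std::size_t operator()(std::size_t a, std::size_t b) const { return a + b; }
    std::size_t operator()(std::size_t a, const std::string &s) const { return a + s.size(); }
    std::size_t operator()(const std::string &s, std::size_t a) const { return s.size() + a; }
    std::size_t operator()(const std::string &s, const std::string &t) const { return s.size() + t.size(); }
};

std::size_t total_length(const std::vector<std::string> &v)
{
    return std::reduce(v.begin(), v.end(), std::size_t{0}, AddLengths{});
}
```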
libstdc++-v3/ChangeLog:
PR libstdc++/95833
* include/std/numeric (reduce(Iter, Iter, T, BinaryOp)): Replace
incorrect static_assert with ones matching the 'Mandates'
conditions in the standard.
* testsuite/26_numerics/reduce/95833.cc: New test.
(cherry picked from commit 0532452dcd)
When a +cdecp[0-7] extension is passed in the -march string on the command line,
multilib linking fails, as described in PR100856. This patch fixes the issue by
generating a separate canonical string, with the options that are not required
for multilib linking removed from the march string, and assigning it to the new
mlibarch option. The mlibarch string is then used for multilib comparison.
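The idea can be sketched as plain string filtering. This is only an illustration with a made-up function name; the real canonicalization in arm-common.c works on feature bits and does more than drop substrings.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: drop every "+cdecpN" (N in 0..7) piece from an
// -march style string and keep everything else in its original order.
std::string multilib_arch(const std::string &march)
{
    std::string out;
    std::size_t pos = 0;
    while (pos < march.size()) {
        std::size_t next = march.find('+', pos + 1);
        if (next == std::string::npos)
            next = march.size();
        const std::string piece = march.substr(pos, next - pos);
        const bool is_cde = piece.size() == 7
                            && piece.compare(0, 6, "+cdecp") == 0
                            && piece[6] >= '0' && piece[6] <= '7';
        if (!is_cde)
            out += piece;
        pos = next;
    }
    return out;
}
```

With this filter, "armv8.1-m.main+mve+cdecp0" and "armv8.1-m.main+mve" map to the same string, so both would select the same multilib.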
gcc/ChangeLog:
2021-06-10 Srinath Parvathaneni <srinath.parvathaneni@arm.com>
PR target/100856
* common/config/arm/arm-common.c (arm_canon_arch_option_1): New function
derived from arm_canon_arch.
(arm_canon_arch_option): Call it.
(arm_canon_arch_multilib_option): New function.
* config/arm/arm-cpus.in (IGNORE_FOR_MULTILIB): New fgroup.
* config/arm/arm.h (arm_canon_arch_multilib_option): New prototype.
(CANON_ARCH_MULTILIB_SPEC_FUNCTION): New macro.
(MULTILIB_ARCH_CANONICAL_SPECS): New macro.
(DRIVER_SELF_SPECS): Add MULTILIB_ARCH_CANONICAL_SPECS.
* config/arm/arm.opt (mlibarch): New option.
* config/arm/t-rmprofile (MULTILIB_MATCHES): For armv8*-m, replace use
of march on RHS with mlibarch.
gcc/testsuite/ChangeLog:
2021-06-10 Srinath Parvathaneni <srinath.parvathaneni@arm.com>
PR target/100856
* gcc.target/arm/acle/pr100856.c: New test.
* gcc.target/arm/multilib.exp: Add tests for cde options.
(cherry picked from commit f58d03b5df)
The current CMSE support in the multilib build for
"-march=armv8.1-m.main+mve -mfloat-abi=hard -mfpu=auto" is broken,
as described in PR99939; this patch fixes the issue.
gcc/testsuite/ChangeLog:
2021-06-11 Srinath Parvathaneni <srinath.parvathaneni@arm.com>
PR target/99939
* gcc.target/arm/cmse/cmse-18.c: Add separate scan-assembler
directives check for target is v8.1-m.main+mve or not before
comparing the assembly output.
* gcc.target/arm/cmse/cmse-20.c: New test.
libgcc/ChangeLog:
2021-06-11 Srinath Parvathaneni <srinath.parvathaneni@arm.com>
PR target/99939
* config/arm/cmse_nonsecure_call.S: Add __ARM_FEATURE_MVE
macro.
* config/arm/t-arm: To link cmse.o and cmse_nonsecure_call.o
on passing -mcmse option.
(cherry picked from commit c5ed014834)
When compiled with -m32 -O2 -D_GLIBCXX_USE_CXX11_ABI=0 we get a warning
for 21_strings/basic_string/cons/char/1.cc:
bits/char_traits.h:409:56: warning: ‘void* __builtin_memcpy(void*, const void*, unsigned int)’ reading 1073741821 bytes from a region of size 19 [-Wstringop-overread]
The warning is legitimate, even if that line cannot be reached because
we throw std::length_error before getting there. Since the invalid
length is deliberate (and mentioned in a comment) just suppress the
warning, so that the test can verify we get the exception.
Also remove an unused typedef that produces another warning.
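The suppression pattern looks roughly like this. It is a sketch with hypothetical function names rather than the test itself: the copy is never reached with the oversized length because the check throws first, and the pragmas silence the warning for that region. Compilers that do not know -Wstringop-overread may warn about the unknown option in the pragma.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>

#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wstringop-overread"
// Hypothetical stand-in for code that rejects a bad length before copying.
void checked_copy(char *dst, std::size_t cap, const char *src, std::size_t n)
{
    if (n > cap)
        throw std::length_error("checked_copy: length too large");
    std::memcpy(dst, src, n);   // not reached when n is oversized
}
#pragma GCC diagnostic pop

// Deliberately passes an invalid length and reports whether it was rejected.
bool rejects_oversized_length()
{
    char dst[8];
    try {
        checked_copy(dst, sizeof dst, "short", 1073741821u);
    } catch (const std::length_error &) {
        return true;
    }
    return false;
}
```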
libstdc++-v3/ChangeLog:
* testsuite/21_strings/basic_string/cons/char/1.cc: Use
diagnostic pragma to suppress -Wstringop-overread error.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
(cherry picked from commit 92edc4a768)
This removes the 'static' keyword from the helper functions added by
r8-1294 to detect whether the char_traits member functions can be
evaluated at compile time. This prevents the "inlining failed" error
reported in the PR.
The new testcase from the PR is added to the libitm testsuite, because
that's where we can be sure it's OK to use the -fgnu-tm option.
As a drive-by fix, the feature test macros for C++20 P0980R1 support are
made to depend on whether __cpp_lib_is_constant_evaluated is defined.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
PR libstdc++/91488
libstdc++-v3/ChangeLog:
* include/bits/basic_string.h (__cpp_lib_constexpr_string): Only
define C++20 value when std::is_constant_evaluated is available.
* include/bits/char_traits.h (__cpp_lib_constexpr_char_traits):
Likewise.
(__constant_string_p, __constant_array_p): Give external
linkage.
* include/std/version (__cpp_lib_constexpr_char_traits)
(__cpp_lib_constexpr_string): Only define C++20 values when
is_constant_evaluated is available.
libitm/ChangeLog:
* testsuite/libitm.c++/libstdc++-pr91488.C: New test.
(cherry picked from commit b376b1ef38)
> The following patch does create them, but treats all such bitfields as if
> they were in a structure where the particular bitfield is the only field.
While the patch passed bootstrap/regtest on the trunk, when trying to
backport it to 11 branch the bootstrap failed with
atree.ads:3844:34: size for "Node_Record" too small
errors. Turns out the error is not about size being too small, but actually
about size being non-constant, and comes from:
/* In a FIELD_DECL of a RECORD_TYPE, this is a pointer to the storage
representative FIELD_DECL. */
#define DECL_BIT_FIELD_REPRESENTATIVE(NODE) \
(FIELD_DECL_CHECK (NODE)->field_decl.qualifier)
/* For a FIELD_DECL in a QUAL_UNION_TYPE, records the expression, which
if nonzero, indicates that the field occupies the type. */
#define DECL_QUALIFIER(NODE) (FIELD_DECL_CHECK (NODE)->field_decl.qualifier)
so by setting up DECL_BIT_FIELD_REPRESENTATIVE in QUAL_UNION_TYPE we
actually set or modify DECL_QUALIFIER and then construct size as COND_EXPRs
with those bit field representatives (e.g. with array type) as conditions
which doesn't fold into a constant.
The following patch fixes it by not creating DECL_BIT_FIELD_REPRESENTATIVEs
for QUAL_UNION_TYPE as there is nowhere to store them.
Shall we change tree.h to document that DECL_BIT_FIELD_REPRESENTATIVE
is valid also on UNION_TYPE?
I see:
tree-ssa-alias.c- if (TREE_CODE (type1) == RECORD_TYPE
tree-ssa-alias.c: && DECL_BIT_FIELD_REPRESENTATIVE (field1))
tree-ssa-alias.c: field1 = DECL_BIT_FIELD_REPRESENTATIVE (field1);
tree-ssa-alias.c- if (TREE_CODE (type2) == RECORD_TYPE
tree-ssa-alias.c: && DECL_BIT_FIELD_REPRESENTATIVE (field2))
tree-ssa-alias.c: field2 = DECL_BIT_FIELD_REPRESENTATIVE (field2);
Shall we change that to || == UNION_TYPE or do we assume all fields
are overlapping in a UNION_TYPE already?
At other spots (asan, ubsan, expr.c) it is unclear what will happen
if they see a QUAL_UNION_TYPE with a DECL_QUALIFIER (or does the Ada FE
lower that somehow)?
2021-06-18 Jakub Jelinek <jakub@redhat.com>
PR middle-end/101062
* stor-layout.c (finish_bitfield_layout): Don't add bitfield
representatives in QUAL_UNION_TYPE.
The following testcase is miscompiled on x86_64-linux: the bitfield store
is implemented as a RMW 64-bit operation at d+24 when the d variable has
a size of only 28 bytes, and scheduling moves a store to a different
variable, which happens to be right after the d variable, in between the
R and W parts.
The reason for this is that we weren't creating
DECL_BIT_FIELD_REPRESENTATIVEs for bitfields in unions.
The following patch does create them, but treats all such bitfields as if
they were in a structure where the particular bitfield is the only field.
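A sketch of the shape of the problem, not the actual testcase (which uses separate global variables): seven unions holding a bit-field occupy 28 bytes when int is 4 bytes, so a 64-bit read-modify-write of the last element at offset 24 would also touch whatever follows. The type and function names are invented.

```cpp
#include <cassert>

union U { int b : 5; };

struct Block {
    U d[7];          // 28 bytes when int is 4 bytes
    int neighbour;   // whatever happens to follow d
};

// Stores into the last bit-field. A correct compiler must only touch the
// bytes of d[6], so the returned neighbour value is unchanged.
int store_last(Block &blk, int value)
{
    blk.d[6].b = value;
    return blk.neighbour;
}
```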
2021-06-16 Jakub Jelinek <jakub@redhat.com>
PR middle-end/101062
* stor-layout.c (finish_bitfield_representative): For fields in unions
assume nextf is always NULL.
(finish_bitfield_layout): Compute bit field representatives also in
unions, but handle it as if each bitfield was the only field in the
aggregate.
* gcc.dg/pr101062.c: New test.
(cherry picked from commit b4b50bf286)
This force-enables perfect forwarding call wrapper semantics whenever
the extra arguments of a partially applied range adaptor aren't all
trivially copyable, so as to avoid incurring unnecessary copies of
potentially expensive-to-copy objects (such as std::function objects)
when invoking the adaptor.
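The extra condition can be pictured with a small trait. This is a hypothetical sketch assuming C++17, not the code in include/std/ranges: simple forwarding stays enabled only when every captured argument is trivially copyable.

```cpp
#include <cassert>
#include <functional>
#include <type_traits>

// Hypothetical trait mirroring the condition described above.
template <class... Args>
constexpr bool simple_forwarding_ok =
    (std::is_trivially_copyable_v<Args> && ...);

static_assert(simple_forwarding_ok<int>);
static_assert(simple_forwarding_ok<int, long>);
// std::function is not trivially copyable, so it takes the perfect
// forwarding path and is not copied needlessly.
static_assert(!simple_forwarding_ok<std::function<bool(int)>>);
```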
PR libstdc++/100940
libstdc++-v3/ChangeLog:
* include/std/ranges (__adaptor::_Partial): For the "simple"
forwarding partial specializations, also require that
the extra arguments are trivially copyable.
* testsuite/std/ranges/adaptors/100577.cc (test04): New test.
(cherry picked from commit 2b87f3318c)
The _S_has_simple_extra_args mechanism is used to simplify forwarding
of range adaptor's extra arguments when perfect forwarding call wrapper
semantics isn't required for correctness, on a per-adaptor basis.
Both views::take and views::drop are flagged as such, but it turns out
perfect forwarding semantics are needed for these adaptors in some
contrived cases, e.g. when their extra argument is a move-only class
that's implicitly convertible to an integral type.
To fix this, we could just clear the flag for views::take/drop as with
views::split, but that'd come at the cost of acceptable diagnostics
for ill-formed uses of these adaptors (see PR100577).
This patch instead allows adaptors to parameterize their
_S_has_simple_extra_args flag according to the types of the captured extra
arguments, so that we can conditionally disable perfect forwarding
semantics only when the types of the extra arguments permit it. We
then use this finer-grained mechanism to safely disable perfect
forwarding semantics for views::take/drop when the extra argument is
integer-like, rather than incorrectly always disabling it. Similarly,
for views::split, rather than always enabling perfect forwarding
semantics we now safely disable it when the extra argument is a scalar
or a view, and recover good diagnostics for these common cases.
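A sketch of the contrived case and of a per-type flag, assuming C++17. Both the class and the variable template are hypothetical, and std::is_integral_v is only a stand-in for the library's integer-like check.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical argument type: move-only, yet implicitly convertible to int.
struct MoveOnlyCount {
    int n;
    explicit MoveOnlyCount(int v) : n(v) {}
    MoveOnlyCount(MoveOnlyCount &&) = default;
    MoveOnlyCount(const MoveOnlyCount &) = delete;
    operator int() const { return n; }
};

// Hypothetical per-type flag: allow simple (copying) forwarding only when
// the extra argument is an integral type.
template <class Arg>
constexpr bool has_simple_extra_args = std::is_integral_v<Arg>;

static_assert(has_simple_extra_args<int>);
static_assert(!has_simple_extra_args<MoveOnlyCount>);
static_assert(std::is_convertible_v<MoveOnlyCount, int>);
static_assert(!std::is_copy_constructible_v<MoveOnlyCount>);
```

A flag that is unconditionally true would try to copy MoveOnlyCount, which cannot work; keying the flag on the argument type leaves the simple path in place for plain integers.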
PR libstdc++/100940
libstdc++-v3/ChangeLog:
* include/std/ranges (__adaptor::_RangeAdaptor): Document the
template form of _S_has_simple_extra_args.
(__adaptor::__adaptor_has_simple_extra_args): Add _Args template
parameter pack. Try to treat _S_has_simple_extra_args as a
variable template parameterized by _Args.
(__adaptor::_Partial): Pass _Arg/_Args to the constraint
__adaptor_has_simple_extra_args.
(views::_Take::_S_has_simple_extra_args): Templatize according
to the type of the extra argument.
(views::_Drop::_S_has_simple_extra_args): Likewise.
(views::_Split::_S_has_simple_extra_args): Define.
* testsuite/std/ranges/adaptors/100577.cc (test01, test02):
Adjust after changes to _S_has_simple_extra_args mechanism.
(test03): Define.
(cherry picked from commit 0f4a2fb44d)
Using an MMA builtin within an OpenMP parallel code block leads to an SSA
verification ICE on the temporaries we create while expanding the MMA builtins
at gimple time. The solution is to use create_tmp_reg_or_ssa_name(), which
knows when to create either an SSA or register temporary.
2021-06-14 Peter Bergner <bergner@linux.ibm.com>
gcc/
PR target/100777
* config/rs6000/rs6000-call.c (rs6000_gimple_fold_mma_builtin): Use
create_tmp_reg_or_ssa_name().
gcc/testsuite/
PR target/100777
* gcc.target/powerpc/pr100777.c: New test.
(cherry picked from commit 20073534c0)
The __builtin_vsx_assemble_pair and __builtin_mma_assemble_acc built-ins
currently assign their first source operand to the first VSX register
in a pair/quad, their second operand to the second register in a pair/quad, etc.
This is not endian friendly and forces the user to generate different calls
depending on endianness. In agreement with the POWER LLVM team, we've
decided to lightly deprecate the assemble built-ins and replace them with
"build" built-ins that automatically handle endianness so the same built-in
call can be used for both little-endian and big-endian compiles. We are not
removing the assemble built-ins, since there is code in the wild that uses
them, but we are removing their documentation to encourage the use of the
new "build" variants.
gcc/
* config/rs6000/rs6000-builtin.def (build_pair): New built-in.
(build_acc): Likewise.
* config/rs6000/rs6000-call.c (mma_expand_builtin): Swap assemble
source operands in little-endian mode.
(rs6000_gimple_fold_mma_builtin): Handle VSX_BUILTIN_BUILD_PAIR.
(mma_init_builtins): Likewise.
* config/rs6000/rs6000.c (rs6000_split_multireg_move): Handle endianness
ordering for the MMA assemble and build source operands.
* doc/extend.texi (__builtin_vsx_build_acc, __builtin_mma_build_pair):
Document.
(__builtin_mma_assemble_acc, __builtin_mma_assemble_pair): Remove
documentation.
gcc/testsuite/
* gcc.target/powerpc/mma-builtin-4.c (__builtin_vsx_build_pair): Add
tests. Update expected counts.
* gcc.target/powerpc/mma-builtin-5.c (__builtin_mma_build_acc): Add
tests. Update expected counts.
(cherry picked from commit 00d07ec6e1)
The mma_assemble_input_operand predicate does not accept reg+reg indexed
addresses which can lead to ICEs. The lxv and lxvp instructions have
indexed forms (lxvx and lxvpx), so the simple solution is to just allow
indexed addresses in the predicate.
2021-05-30 Peter Bergner <bergner@linux.ibm.com>
gcc/
PR target/99842
* config/rs6000/predicates.md (mma_assemble_input_operand): Allow
indexed form addresses.
gcc/testsuite/
PR target/99842
* g++.target/powerpc/pr99842.C: New.
(cherry picked from commit df4e0359da)
Consider size_t mangling as unsigned int and long [PR100876].
gcc/ChangeLog:
PR middle-end/100876
* builtins.c (gimple_call_return_array): Account for size_t
mangling as either unsigned int or unsigned long.
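For reference, the Itanium C++ ABI mangles unsigned long as 'm' and unsigned int as 'j', so the mangled name of placement operator new depends on what size_t is on the target. A sketch with a made-up helper name, assuming C++17:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// Returns the Itanium-mangled name expected for
//   void *operator new (std::size_t, void *)
// on this target, or an empty string if size_t is neither unsigned long
// nor unsigned int.
std::string placement_new_symbol()
{
    if (std::is_same_v<std::size_t, unsigned long>)
        return "_ZnwmPv";
    if (std::is_same_v<std::size_t, unsigned int>)
        return "_ZnwjPv";
    return "";
}
```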
Teach compute_objsize about placement new [PR100876].
Resolves:
PR c++/100876 - -Wmismatched-new-delete should understand placement new when it's not inlined
gcc/ChangeLog:
PR c++/100876
* builtins.c (gimple_call_return_array): Check for attribute fn spec.
Handle calls to placement new.
(ndecl_dealloc_argno): Avoid placement delete.
gcc/testsuite/ChangeLog:
PR c++/100876
* g++.dg/warn/Wmismatched-new-delete-4.C: New test.
* g++.dg/warn/Wmismatched-new-delete-5.C: New test.
* g++.dg/warn/Wstringop-overflow-7.C: New test.
* g++.dg/warn/Wfree-nonheap-object-6.C: New test.
* g++.dg/analyzer/placement-new.C: Prune out expected warning.
PR middle-end/100732 - ICE on sprintf %s with integer argument
gcc/ChangeLog:
PR middle-end/100732
* gimple-fold.c (gimple_fold_builtin_sprintf): Avoid folding calls
with either source or destination argument of invalid type.
* tree-ssa-uninit.c (maybe_warn_pass_by_reference): Avoid checking
calls with arguments of invalid type.
gcc/testsuite/ChangeLog:
PR middle-end/100732
* gcc.dg/tree-ssa/builtin-snprintf-11.c: New test.
* gcc.dg/tree-ssa/builtin-snprintf-12.c: New test.
* gcc.dg/tree-ssa/builtin-sprintf-28.c: New test.
* gcc.dg/tree-ssa/builtin-sprintf-29.c: New test.
* gcc.dg/uninit-pr100732.c: New test.
PR c/100619 - ICE on a VLA parameter with too many dimensions
gcc/c-family/ChangeLog:
PR c/100619
* c-attribs.c (build_attr_access_from_parms): Handle arbitrarily many
bounds.
gcc/testsuite/ChangeLog:
PR c/100619
* gcc.dg/pr100619.c: New test.
PR middle-end/100574 - ICE in size_remaining, at builtins.c
gcc/ChangeLog:
PR middle-end/100574
* builtins.c (access_ref::get_ref): Improve detection of PHIs with
all null arguments.
gcc/testsuite/ChangeLog:
PR middle-end/100574
* g++.dg/pr100574.C: New test.
Fusion patterns for add-logical/logical-add
This patch modifies the function in genfusion.pl for generating
the logical-logical patterns so that it can also generate the
add-logical and logical-add patterns which are very similar.
Also backported from trunk and combined with the add-logical patch
because that revealed problems on gcc-11.
Add needed earlyclobber to fusion patterns
The add-logical and add-add fusion patterns all have constraint
alternatives "=0,1,&r,r" for the output (3). The inputs 0 and 1
are used in the first fusion instruction and then either may be
reused as a temp for the output of the first insn which is
input to the second. However, if input 2 is the same as 0 or 1,
it gets clobbered unexpectedly. So the first 2 alts need to be
"=&0,&1,&r,r" instead to indicate that in alts 0 and 1, the
register used for 3 is earlyclobber, hence can't be the same as
input 2.
This was actually encountered in the backport of the add-logical
fusion patch to gcc-11. Some code in go hit this case:
<runtime.fillAligned+520>: andc r30,r30,r9
r30 now (~(x|((x&c)+c)))&(~c) --> this is new x
<runtime.fillAligned+524>: b <runtime.fillAligned+288>
<runtime.fillAligned+288>: addi r31,r31,-1
r31 now m-1
<runtime.fillAligned+292>: srd r31,r30,r31
r31 now x>>(m-1)
<runtime.fillAligned+296>: subf r30,r31,r30
r30 now x-(x>>(m-1))
<runtime.fillAligned+300>: or r30,r30,r30 # mdoom
nop
<runtime.fillAligned+304>: not r3,r30
r3 now ~(x-(x>>(m-1))) -- WHOOPS
The or r30,r30,r30 was meant to be or-ing in the earlier value
of r30 which was overwritten by the output of the subf.
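The clobbering can be reproduced with a toy register model. This is a sketch with invented names, not generated code: when the output register is also the register holding the input of the second instruction, the first instruction overwrites that input before it is read.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Regs = std::array<std::uint64_t, 4>;

// Toy model of a fused "subf; or" pair. The first insn writes its result
// into the output register, the second combines that with register c.
// Registers are passed by value so each call starts from the same state.
std::uint64_t fused_subf_or(Regs r, int out, int a, int b, int c)
{
    r[out] = r[b] - r[a];     // subf out,a,b
    r[out] = r[out] | r[c];   // or out,out,c
    return r[out];
}
```

With r0 = 0xF0 and r1 = 0xFF, using a fresh output register gives (0xFF - 0xF0) | 0xF0 = 0xFF, while tying the output to r0 when r0 is also the second insn's input gives 0x0F, because the original r0 is gone by the time the or executes. The earlyclobber marker forbids that second allocation.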
Combined ChangeLog (needed for the scripts to understand):
gcc/ChangeLog
* config/rs6000/genfusion.pl (gen_logical_addsubf): Refactor to
add generation of logical-add and add-logical fusion pairs. Add
earlyclobber to alts 0/1.
(gen_addadd): Add earlyclobber to alts 0/1.
* config/rs6000/rs6000-cpus.def: Add new fusion to ISA 3.1 mask
and powerpc mask.
* config/rs6000/rs6000.c (rs6000_option_override_internal): Turn on
logical-add and add-logical fusion by default.
* config/rs6000/rs6000.opt: Add -mpower10-fusion-logical-add and
-mpower10-fusion-add-logical options.
* config/rs6000/fusion.md: Regenerate file.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/fusion-p10-logadd.c: New file.
Fix the mapping of vec_doublee and vec_floate to builtins.
gcc/ChangeLog:
PR target/100871
* config/s390/vecintrin.h (vec_doublee): Fix to use
__builtin_s390_vflls.
(vec_floate): Fix to use __builtin_s390_vflrd.
gcc/testsuite/ChangeLog:
* gcc.target/s390/zvector/vec-doublee.c: New test.
* gcc.target/s390/zvector/vec-floate.c: New test.
(cherry picked from commit a4fc63e0c3)
I've noticed that this test now sometimes FAILs and sometimes PASSes on
various arches (the line 12 test in particular).
The problem is that a = 0; initialization in the caller no longer happens
before the f(&a) call as what the argument points to is only used in
debug info.
Making the function noipa forces the caller to initialize it and still
tests what the test wants to test, namely that we don't consider *p as
valid location for the c variable at line 18 (after it has been overwritten
with *p = 1;).
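A sketch modelled on that description, not a copy of the test: with noipa (available in GCC 8 and later) the callee is opaque to interprocedural analysis, so the caller has to store a = 0 before the call. The wrapper function name is made up.

```cpp
#include <cassert>

static int v;

// noipa keeps the caller from learning what f does with *p.
__attribute__((noipa)) void f(int *p)
{
    int c = *p;
    v = c;
    *p = 1;   // from here on *p no longer holds the value of c
    v = 0;
}

int call_f()
{
    int a = 0;   // must really be initialized, since f is opaque
    f(&a);
    return a;
}
```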
2021-06-16 Jakub Jelinek <jakub@redhat.com>
* gcc.dg/guality/pr49888.c (f): Use noipa attribute instead of
noinline, noclone.
(cherry picked from commit a490b1dc0b)
As the following testcase shows, libffi's classify_argument didn't properly
handle structures at byte offsets not divisible by
UNITS_PER_WORD. The following patch adjusts it to match what
classify_argument in config/i386/ does for that and also ports the
PR38781 fix there (the second chunk).
This has been committed to upstream libffi already:
5651bea284
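The word-count arithmetic can be checked with a small sketch. The helper names are invented, UNITS_PER_WORD is taken as 8 as on x86-64, and byte_offset is assumed to be the structure's offset within its first word.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t UNITS_PER_WORD = 8;

// Word count based on the size alone.
constexpr std::size_t words_ignoring_offset(std::size_t size)
{
    return (size + UNITS_PER_WORD - 1) / UNITS_PER_WORD;
}

// Word count for size + byte_offset bytes.
constexpr std::size_t words_with_offset(std::size_t size, std::size_t byte_offset)
{
    return (size + byte_offset + UNITS_PER_WORD - 1) / UNITS_PER_WORD;
}

// An 8-byte structure starting 4 bytes into a word straddles two words.
static_assert(words_ignoring_offset(8) == 1);
static_assert(words_with_offset(8, 4) == 2);
```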
2021-06-16 Jakub Jelinek <jakub@redhat.com>
* src/x86/ffi64.c (classify_argument): For FFI_TYPE_STRUCT set words
to number of words needed for type->size + byte_offset bytes rather
than just type->size bytes. Compute pos before the loop and check
total size of the structure.
* testsuite/libffi.call/nested_struct12.c: New test.
(cherry picked from commit 041f741770)
The following testcase ICEs, because we have a mode mismatch.
VEC_PACK_TRUNC_EXPR's operands have different modes from the result
(same vector mode size but twice as large elements),
but we were passing a non-NULL subtarget with the mode of the result
to the expansion of its arguments, so the VEC_PERM_EXPR in one of the
operands, which had V8SImode operands and result, got a V16HImode target.
Fixed by clearing the subtarget if we are changing mode.
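To make the mode relationship concrete, here is a scalar sketch of a pack-with-truncation: two vectors of eight 32-bit elements become one vector of sixteen 16-bit elements with the same total size. The type aliases borrow the mode names for readability only; unsigned elements are used so the truncation is well defined, and the lane order is an assumption.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using V8SI = std::array<std::uint32_t, 8>;     // 8 x 32-bit lanes
using V16HI = std::array<std::uint16_t, 16>;   // 16 x 16-bit lanes

static_assert(8 * sizeof(std::uint32_t) == 16 * sizeof(std::uint16_t),
              "operands and result have the same total size");

// Truncates each 32-bit lane to 16 bits and concatenates the two inputs.
V16HI pack_trunc(const V8SI &lo, const V8SI &hi)
{
    V16HI out{};
    for (std::size_t i = 0; i < 8; ++i) {
        out[i] = static_cast<std::uint16_t>(lo[i]);
        out[i + 8] = static_cast<std::uint16_t>(hi[i]);
    }
    return out;
}
```

A target sized for the result cannot double as scratch space for computing an operand, because the lanes have a different width.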
2021-06-15 Jakub Jelinek <jakub@redhat.com>
PR target/101046
* expr.c (expand_expr_real_2) <case VEC_PACK_FIX_TRUNC_EXPR,
case VEC_PACK_TRUNC_EXPR>: Clear subtarget when changing mode.
(cherry picked from commit 008153c843)
simplify_relational_operation callees typically return just const0_rtx
or const_true_rtx and then simplify_relational_operation attempts to fix
that up if the comparison result has vector mode, or floating mode,
or punt if it has scalar mode and vector mode operands (it doesn't know how
exactly to deal with the scalar masks).
But, simplify_logical_relational_operation has a special case, where
it attempts to fold (x < y) | (x >= y) etc. and if it determines it is
always true, it just returns const_true_rtx, without doing the dances that
simplify_relational_operation does.
That results in an ICE on the following testcase, where such folding happens
during expansion (of debug stmts into DEBUG_INSNs) and we ICE because
all of a sudden a VOIDmode rtx appears where it expects a vector (V4SImode)
rtx.
The following patch fixes that by moving the adjustment into a separate
helper routine and using it from both simplify_relational_operation and
simplify_logical_relational_operation.
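A scalar sketch of why the folded result needs a vector shape: with lane-wise comparisons, (x < y) | (x >= y) is "true" in every lane, so the always-true value for a four-lane comparison is itself a four-lane vector, not a single scalar. The all-ones encoding of a true lane is a common convention assumed here, and the names are invented.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using V4SI = std::array<std::int32_t, 4>;

// Lane-wise x < y: -1 (all bits set) for true, 0 for false.
V4SI less(const V4SI &x, const V4SI &y)
{
    V4SI r{};
    for (std::size_t i = 0; i < 4; ++i)
        r[i] = x[i] < y[i] ? -1 : 0;
    return r;
}

// Lane-wise x >= y, same encoding.
V4SI greater_equal(const V4SI &x, const V4SI &y)
{
    V4SI r{};
    for (std::size_t i = 0; i < 4; ++i)
        r[i] = x[i] >= y[i] ? -1 : 0;
    return r;
}

// Lane-wise bitwise OR of two masks.
V4SI bit_or(const V4SI &a, const V4SI &b)
{
    V4SI r{};
    for (std::size_t i = 0; i < 4; ++i)
        r[i] = a[i] | b[i];
    return r;
}
```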
2021-06-11 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/101008
* simplify-rtx.c (relational_result): New function.
(simplify_logical_relational_operation,
simplify_relational_operation): Use it.
(cherry picked from commit 4bdcdd8fa8)