std::ranges::reverse_view uses make_reverse_iterator in its
implementation as described in [range.reverse.view]. This accidentally
allows ADL as an unqualified name is used in the call. According to
[contents], however, this should be treated as a qualified lookup into
the std namespace.
This leads to errors due to ambiguous name lookups when another
make_reverse_iterator function is found via ADL.
libstdc++-v3/Changelog:
* include/std/ranges (reverse_view::begin, reverse_view::end):
Qualify make_reverse_iterator calls to avoid ADL.
* testsuite/std/ranges/adaptors/reverse.cc: Test that
views::reverse works when make_reverse_iterator is defined
in an associated namespace.
gcc.target/arm/acle/dsp_arith.c uses DSP intrinsics, which arm_acle.h
defines only with __ARM_FEATURE_DSP, so make the test check for that
property rather than arm_qbit_ok.
However, the existing arm_dsp effective target only checks if DSP
features are supported with the current multilib rather than trying
-march and -mfloat-abi options. Thus we introduce a similar effective
target, arm_dsp_ok and associated dg-add-options.
This makes dsp_arith.c unsupported rather than failed when no option
combination is suitable, for instance when running the tests with
-mcpu=cortex-m3.
2021-03-19 Christophe Lyon <christophe.lyon@linaro.org>
gcc/
* doc/sourcebuild.texi (arm_dsp_ok, arm_dsp): Document.
gcc/testsuite/
* lib/target-supports.exp
(check_effective_target_arm_dsp_ok_nocache)
(check_effective_target_arm_dsp_ok, add_options_for_arm_dsp): New.
* gcc.target/arm/acle/dsp_arith.c: Use arm_dsp_ok effective target
and add arm_dsp options.
Make the order in which we try -mfloat-abi options consistent with the
other similar effective targets: try softfp first, then hard.
This shows that a few tests implicitly rely on -mfloat-abi=hard, so we
add this option via dg-additional-options so that it comes after any
potential -mfloat-abi option that the preceding effective-targets
might have added.
armv8_1m-fpXX-move-1.c tests don't need arm_hard_ok because they don't
include arm_mve.h: adding -mfloat-abi=hard when using a soft/softfp
toolchain does not lead to the missing include gnu/stubs-*.h error.
This patch makes armv8_1m-fpXX-move-1.c pass on arm-linux-gnueabi, and
the other tests become unsupported (instead of fail) on this target.
On arm-eabi with default cpu/fpu/mode and a+rm multilibs, the same
mve/intrinsics/* tests become unsupported instead of pass because
arm_hard_ok fails with "selected processor lacks an FPU". Since we
also override the fpu via dg-options, we'd need another effective
target (say arm_hard_mve_ok) that would check -mfloat-abi=hard
-mfpu=auto -march=armv8.1-m.main+mve.fp at the same time. But we have
already so many arm effective targets, it doesn't seem like a good way
forward.
2021-03-19 Christophe Lyon <christophe.lyon@linaro.org>
gcc/testsuite/
* lib/target-supports.exp
(check_effective_target_arm_v8_1m_mve_fp_ok_nocache): Fix
-mfloat-abi= options order.
(check_effective_target_arm_v8_1m_mve_ok_nocache): Likewise
* gcc.target/arm/mve/intrinsics/mve_vector_float2.c: Add
arm_hard_ok effective target and -mfloat-abi=hard additional
option.
* gcc.target/arm/mve/intrinsics/mve_vector_int.c: Likewise.
* gcc.target/arm/mve/intrinsics/mve_vector_uint.c: Likewise.
* gcc.target/arm/mve/intrinsics/mve_vector_uint1.c: Likewise.
* gcc.target/arm/mve/intrinsics/mve_vector_uint2.c: Likewise.
* gcc.target/arm/mve/intrinsics/vgetq_lane_s64.c: Likewise.
* gcc.target/arm/mve/intrinsics/vgetq_lane_u64.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsetq_lane_s64.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsetq_lane_u64.c: Likewise.
* gcc.target/arm/armv8_1m-fp16-move-1.c: Add -mfloat-abi=hard
additional option.
* gcc.target/arm/armv8_1m-fp32-move-1.c: Likewise.
* gcc.target/arm/armv8_1m-fp64-move-1.c: Likewise.
Make the order in which we try -mfloat-abi options consistent with the
other similar effective targets: try softfp first, then hard.
This shows that a few tests implicitly rely on -mfloat-abi=hard, so we
now check arm_hard_ok where needed.
This makes these tests unsupported rather than fail on
arm-linux-gnueabi.
2021-03-19 Christophe Lyon <christophe.lyon@linaro.org>
gcc/testsuite/
* lib/target-supports.exp
(check_effective_target_arm_v8_2a_i8mm_ok_nocache): Fix
-mfloat-abi= options order.
(check_effective_target_arm_v8_2a_bf16_neon_ok_nocache): Likewise.
* gcc.target/arm/bfloat16_scalar_1_1.c: Add arm_hard_ok effective
target and -mfloat-abi=hard additional option.
* gcc.target/arm/bfloat16_simd_1_1.c: Likewise.
* gcc.target/arm/simd/bf16_ma_1.c: Likewise.
* gcc.target/arm/simd/bf16_mmla_1.c: Likewise.
* gcc.target/arm/simd/vdot-2-1.c: Likewise.
* gcc.target/arm/simd/vdot-2-2.c: Likewise.
This test relies on -mfloat-abi=hard to pass (otherwise
test_mov_imm_[12] directly build the 1.0 fp16 representation via movw
r0, #15360 rather than using vmov.f16 s0, #1.0e+0 as expected by
scan-assembler-times)
Adding the arm_hard_ok check makes the test unsupported eg. on
arm-linux-gnueabi instead of reporting a failure.
2021-03-20 Christophe Lyon <christophe.lyon@linaro.org>
gcc/testsuite/
* gcc.target/arm/armv8_2-fp16-scalar-2.c: Add arm_hard_ok.
Several tests override the -mfloat-abi option detected by their
effective targets. Make sure it is supported, so that these tests are
unsupported rather than failures (the inclusion of arm_neon.h
otherwise fails for lack of gnu/stubs-*.h)
This avoids failures with
bfloat16_simd_2_1.c
bfloat16_simd_3_1.c
bf16_vldn_1.c
bf16_vstn_1.c on arm-linux-gnueabi
and
pr51968.c
bfloat16_simd_1_2.c
bfloat16_simd_2_2.c
bfloat16_simd_3_2.c on arm-linux-gnueabihf.
On arm-eabi with default cpu/fpu/mode and a+rm multilibs,
bfloat16_simd_2_1.c, bfloat16_simd_3_1.c, bf16_vstn_1.c and
bf16_vldn_1.c become unsupported instead of pass because arm_hard_ok
fails with "selected processor lacks an FPU". Since we also override
the fpu in dg-additional-options, we'd need another effective target
(say arm_hard_neon_ok) that would check -mfloat-abi=hard -mfpu=neon at
the same time. But we have already so many arm effective targets, it
doesn't seem like a good way forward.
2021-03-19 Christophe Lyon <christophe.lyon@linaro.org>
gcc/testsuite/
* gcc.target/arm/bfloat16_simd_1_2.c: Add arm_softfp_ok.
* gcc.target/arm/bfloat16_simd_2_2.c: Likewise.
* gcc.target/arm/bfloat16_simd_3_2.c: Likewise.
* gcc.target/arm/pr51968.c: Likewise.
* gcc.target/arm/bfloat16_simd_2_1.c: arm_hard_ok.
* gcc.target/arm/bfloat16_simd_3_1.c: Likewise.
* gcc.target/arm/simd/bf16_vldn_1.c: Likewise.
* gcc.target/arm/simd/bf16_vstn_1.c: Likewise.
These tests pass with their current dg-add-options, no need to force
-mfloat=abi.
I've noticed no impact on armv8_1m-shift-imm-1.c and
armv8_1m-shift-reg-1.c, bf16_reinterpret.c now passes on
arm-linux-gnueabi and bf16_dup.c now passes on arm-linux-gnueabihf.
This allows pr51534.c to pass when forcing -mfloat-abi=soft in
runtestflags, otherwise we get an error '-mfloat-abi=soft and
-mfloat-abi=hard may not be used together' because we try to compile
with both flags.
2021-03-19 Christophe Lyon <christophe.lyon@linaro.org>
gcc/testsuite/
* gcc.target/arm/armv8_1m-shift-imm-1.c: Remove -mfloat=abi option.
* gcc.target/arm/armv8_1m-shift-reg-1.c: Likewise.
* gcc.target/arm/bf16_dup.c: Likewise.
* gcc.target/arm/bf16_reinterpret.c: Likewise.
* gcc.target/arm/pr51534.c: Remove -mfloat=abi option.
We need to add the options corresponding to the arm_v8_2a_i8mm_ok
effective target in order to use the right float-abi option:
-mfloat-abi=softfp makes the test pass for arm-linux-gnueabi,
while no -mfloat-abi option is needed for arm-linux-gnueabihf.
2021-03-19 Christophe Lyon <christophe.lyon@linaro.org>
gcc/testsuite/
* gcc.target/arm/simd/vmmla_1.c: Add arm_v8_2a_i8mm options.
A few tests lack the dg-add-options directives associated with the
dg-require-effective-target they are using. Adding them enables to
pass the right float-abi option, and thus make the tests pass instead
of emit an error.
For instance, we now pass -mfloat-abi=softfp on arm-linux-gnueabi
targets and the tests pass.
2021-03-19 Christophe Lyon <christophe.lyon@linaro.org>
gcc/testsuite/
* gcc.target/arm/bfloat16_scalar_typecheck.c: Add
arm_v8_2a_fp16_neon and arm_v8_2a_bf16_neon.
* gcc.target/arm/bfloat16_vector_typecheck_1.c: Likewise.
* gcc.target/arm/bfloat16_vector_typecheck_2.c: Likewise.
Clang does not currently support the __ibm128 type [1] and only supports
the __ieee128 type in the unreleased 12.0.0 version [2]. That means it
is not possible to provide support for -mabi=ieeelongdouble with Clang
in an ABI compatible way (as we do for GCC by defining new facets and
other types in the __gnu_cxx_ldbl128 namespace).
By preventing the definition of _GLIBCXX_LONG_DOUBLE_ALT128_COMPAT when
compiling with Clang, all uses of __ibm128 and __ieee128 types will be
disabled. This can be revisited in future when Clang supports the types
(and provides a way to detect that support using the preprocessor).
[1] https://reviews.llvm.org/D93377
[2] https://reviews.llvm.org/D97846
libstdc++-v3/ChangeLog:
* include/bits/c++config (_GLIBCXX_LONG_DOUBLE_ALT128_COMPAT):
Do not define when compiling with Clang.
In GCC 10, I introduced cp_warn_deprecated_use_scopes so that we can
handle attribute deprecated on a namespace declaration. This
function walks the decl's contexts so that we warn for code like
namespace [[deprecated]] N { struct S { }; }
N::S s;
We call cp_warn_deprecated_use_scopes when we encounter a TYPE_DECL.
But in the following testcase we have a TYPE_DECL whose context is
a deprecated function; that itself is not a reason to warn. This
patch limits for which entities we call cp_warn_deprecated_use;
essentially it's what can follow ::.
I noticed that we didn't test that
struct [[deprecated]] S { static void fn(); };
S::fn();
produces the expected warning, so I've added gen-attrs-73.C.
gcc/cp/ChangeLog:
PR c++/99318
* decl2.c (cp_warn_deprecated_use_scopes): Only call
cp_warn_deprecated_use when decl is a namespace, class, or enum.
gcc/testsuite/ChangeLog:
PR c++/99318
* g++.dg/cpp0x/attributes-namespace6.C: New test.
* g++.dg/cpp0x/gen-attrs-73.C: New test.
aarch64_add_offset uses expand_mult to multiply the SVE VL by an
out-of-range constant. expand_mult takes an argument to indicate
whether the multiplication is signed or unsigned, but in this
context the multiplication is effectively signless and so the
choice seemed arbitrary.
However, one of the things that the signedness input does is
indicate whether signed overflow should be trapped for -ftrapv.
We don't want that here, so we must treat the multiplication
as unsigned.
gcc/
2021-03-23 Jakub Jelinek <jakub@redhat.com>
PR target/99540
* config/aarch64/aarch64.c (aarch64_add_offset): Tell
expand_mult to perform an unsigned rather than a signed
multiplication.
gcc/testsuite/
2021-03-23 Richard Sandiford <richard.sandiford@arm.com>
PR target/99540
* gcc.dg/vect/pr99540.c: New test.
Since CPUID instruction may return different values on hybrid core.
volatile is needed on asm statements in <cpuid.h>.
PR target/99704
* config/i386/cpuid.h (__cpuid): Add __volatile__.
(__cpuid_count): Likewise.
This was simply an overzealous assert. Possibly correct thinking at
the time that code was written, but not true now. Of course we can
have imported artificial decls.
PR c++/99239
gcc/cp/
* decl.c (duplicate_decls): Remove assert about maybe-imported
artificial decls.
gcc/testsuite/
* g++.dg/modules/pr99239_a.H: New.
* g++.dg/modules/pr99239_b.H: New.
This makes sure we'll not run into SLP scheduling issues later by
rejecting all-constant children nodes without any scalar stmts early.
2021-03-23 Richard Biener <rguenther@suse.de>
PR tree-optimization/99721
* tree-vect-slp.c (vect_slp_analyze_node_operations):
Make sure we can schedule the node.
* gfortran.dg/vect/pr99721.f90: New testcase.
These all intend the least significant subpart of the register.
Use the same endian-neutral "subreg_lowpart_operator" predicate that
ARM does instead.
gcc/
* config/riscv/predicates.md (subreg_lowpart_operator): New predicate
* config/riscv/riscv.md (*addsi3_extended2, *subsi3_extended2)
(*negsi2_extended2, *mulsi3_extended2, *<optab>si3_mask)
(*<optab>si3_mask_1, *<optab>di3_mask, *<optab>di3_mask_1)
(*<optab>si3_extend_mask, *<optab>si3_extend_mask_1): Use
new predicate "subreg_lowpart_operator"
gcc/
* config/riscv/riscv.c (riscv_swap_instruction): New function
to byteswap an SImode rtx containing an instruction.
(riscv_trampoline_init): Byteswap the generated instructions
when needed.
We ICE on the following testcase, because std::tuple_element<...,...>::type
is void and for structured bindings we therefore need to create
void & or void && which is invalid. We created such REFERENCE_TYPE and
later ICEd in the middle-end.
The following patch fixes it by diagnosing that.
2021-03-23 Jakub Jelinek <jakub@redhat.com>
PR c++/99650
* decl.c (cp_finish_decomp): Diagnose void initializers when
using tuple_element and get.
* g++.dg/cpp1z/decomp55.C: New test.
In addition to the existing check also ask the target whether a
replacement register may be accessed in a different mode than it was set
before.
gcc/ChangeLog:
* regcprop.c (find_oldest_value_reg): Ask target whether
different mode is fine for replacement register.
This implements the new string_view constructor proposed by P1989R2.
This hasn't been voted into the C++23 draft yet, but it's been reviewed
by LWG and is expected to be approved at the next WG21 meeting.
libstdc++-v3/ChangeLog:
* include/std/string_view (basic_string_view(Range&&)): Define new
constructor and deduction guide.
* testsuite/21_strings/basic_string_view/cons/char/range_c++20.cc: New test.
* testsuite/21_strings/basic_string_view/cons/wchar_t/range_c++20.cc: New test.
We were not correctly handling cross-module redeclarations of
partial-specializations. They have their own TEMPLATE_DECL, which we
need to locate. I had a FIXME there about this case. Guess it's
fixed now.
PR c++/99480
gcc/cp/
* module.cc (depset:#️⃣:make_dependency): Propagate flags for
partial specialization.
(module_may_redeclare): Handle partial specialization.
gcc/testsuite/
* g++.dg/modules/pr99480_a.H: New.
* g++.dg/modules/pr99480_b.H: New.
aarch64 needs to skip memory address validation for LD1R insns. Skipping
the address validation may result in LRA crash for some targets when usual
memory constraint is used. This patch introduces define_relaxed_memory_constraint,
skipping address validation for it, and defining relaxed memory for
aarch64 LD1r insn memory operand.
gcc/ChangeLog:
PR target/99581
* config/aarch64/constraints.md (UtQ): Use
define_relaxed_memory_constraint for it.
* doc/md.texi (define_relaxed_memory_constraint): Describe it.
* genoutput.c (main): Process DEFINE_RELAXED_MEMORY_CONSTRAINT.
* genpreds.c (constraint_data): Add bitfield is_relaxed_memory.
(have_relaxed_memory_constraints): New static var.
(relaxed_memory_start, relaxed_memory_end): Ditto.
(add_constraint): Add arg is_relaxed_memory. Check name for
relaxed memory. Set up is_relaxed_memory in constraint_data and
have_relaxed_memory_constraints. Adjust calls.
(choose_enum_order): Process relaxed memory.
(write_tm_preds_h): Ditto.
(main): Process DEFINE_RELAXED_MEMORY_CONSTRAINT.
* gensupport.c (process_rtx): Process DEFINE_RELAXED_MEMORY_CONSTRAINT.
* ira-costs.c (record_reg_classes): Process CT_RELAXED_MEMORY.
* ira-lives.c (single_reg_class): Use
insn_extra_relaxed_memory_constraint.
* ira.c (ira_setup_alts): CT_RELAXED_MEMORY.
* lra-constraints.c (valid_address_p): Use
insn_extra_relaxed_memory_constraint instead of other memory
constraints.
(process_alt_operands): Process CT_RELAXED_MEMORY.
(curr_insn_transform): Use insn_extra_relaxed_memory_constraint.
* recog.c (asm_operand_ok, preprocess_constraints): Process
CT_RELAXED_MEMORY.
* reload.c (find_reloads): Ditto.
* rtl.def (DEFINE_RELAXED_MEMORY_CONSTRAINT): New.
* stmt.c (parse_input_constraint): Use
insn_extra_relaxed_memory_constraint.
gcc/testsuite/ChangeLog:
PR target/99581
* gcc.target/powerpc/pr99581.c: New.
2021-03-22 Segher Boessenkool <segher@kernel.crashing.org>
PR target/97926
* ubsan.c (ubsan_instrument_float_cast): Don't test for unordered if
there are no NaNs.
This implements the proposed changes for LWG 3537 (which we're allowed
to do as an extension whatever the outcome of the issue). I noticed we
didn't implement LWG 2280 completely, as the std::begin and std::end
overloads for arrays were not noexcept.
libstdc++-v3/ChangeLog:
* include/bits/range_access.h (begin(T (&)[N]), end(T (&)[N])):
Add missing 'noexcept' as per LWG 2280.
(rbegin(T (&)[N]), rend(T (&)[N]), rbegin(initializer_list<T>))
(rend(initializer_list<T>)): Add 'noexcept' as per LWG 3537.
* testsuite/24_iterators/range_access/range_access.cc: Check for
expected noexcept specifiers. Check result types of generic
std::begin and std::end overloads.
* testsuite/24_iterators/range_access/range_access_cpp14.cc:
Check for expected noexcept specifiers.
* testsuite/24_iterators/range_access/range_access_cpp17.cc:
Likewise.
This failure was ultimately from incorrect handling of alias
templates, but required a specific set of occurrences to happen in the
specialization hash table. This cleans up the specialization
streaming to add alias instantiations at the same point as other
instantiations. I also removed some unneeded global variables dealing
with mapping of duplicate decl contexts.
PR c++/99425
gcc/cp/
* cp-tree.h (map_context_from, map_context_to): Delete.
(add_mergeable_specialization): Add is_alias parm.
* pt.c (add_mergeable_specialization): Add is_alias parm, add them.
* module.cc (map_context_from, map_context_to): Delete.
(trees_in::decl_value): Add specializations later, adjust call.
Drop useless alias lookup. Set duplicate fn parm context.
(check_mergeable_decl): Drop context mapping.
(trees_in::is_matching_decl): Likewise.
(trees_in::read_function_def): Drop parameter context adjustment
here.
gcc/testsuite/
* g++.dg/modules/pr99425-1.h: New.
* g++.dg/modules/pr99425-1_a.H: New.
* g++.dg/modules/pr99425-1_b.H: New.
* g++.dg/modules/pr99425-1_c.C: New.
* g++.dg/modules/pr99425-2_a.X: New.
* g++.dg/modules/pr99425-2_b.X: New.
* g++.dg/template/pr99425.C: New.
This fixes around 500 ICEs in the testsuite which can be seen when
testing with -march=armv8.1-m.main+mve -mfloat-abi=hard -mpure-code
(leaving the testsuite free of ICEs in this configuration). All of the
ICEs are in arm_print_operand (which is expecting a mem and gets another
rtx, e.g. a const_vector) when running the output code for
*mve_mov<mode> in alternative 4.
The issue is that MVE vector moves were relying on the arm_reorg pass to
move constant vectors that we can't easily synthesize to the literal
pool. This doesn't work for -mpure-code where the literal pool is
disabled. LLVM puts these in .rodata: I've chosen to do the same here.
With this change, for -mpure-code, we no longer want to allow a constant
on the RHS of a vector load in RA. To achieve this, I added a new
constraint which matches constants only if the literal pool is
available.
gcc/ChangeLog:
PR target/97252
* config/arm/arm-protos.h (neon_make_constant): Add generate
argument to guard emitting insns, default to true.
* config/arm/arm.c (arm_legitimate_constant_p_1): Reject
CONST_VECTORs which neon_make_constant can't handle.
(neon_vdup_constant): Add generate argument, avoid emitting
insns if it's not set.
(neon_make_constant): Plumb new generate argument through.
* config/arm/constraints.md (Ui): New. Use it...
* config/arm/mve.md (*mve_mov<mode>): ... here.
* config/arm/vec-common.md (movv8hf): Use neon_make_constant to
synthesize constants.
This adds a boiler-plate warning to the debug hooks structure to
strongly discourage people from adding new debug hook targets since
we want to get rid of the current abstraction in favor of maintaining
a DWARF view of debug in the middle-end and have support for alternate
output formats to be generated off that DWARF representation.
2021-03-22 Richard Biener <rguenther@suse.de>
* debug.h: Add deprecation warning.
This avoids endless cycling when a PHI node with unchanged backedge
value (the PHI result appearing there) is subject to CSE since doing
that effectively alters the hash entry. The way to avoid this is
to ignore such edges when processing the PHI node.
2021-03-22 Richard Biener <rguenther@suse.de>
PR tree-optimization/99694
* tree-ssa-sccvn.c (visit_phi): Ignore edges with the
PHI result.
* gcc.dg/torture/pr99694.c: New testcase.
The argument is handy when one needs to generate ChangeLog entries
for a different project (e.g. binutils).
contrib/ChangeLog:
* mklog.py: Add --directory argument.
gcc/ChangeLog:
PR target/99702
* config/riscv/riscv.c (riscv_expand_block_move): Get RTL value
after type checking.
gcc/testsuite/ChangeLog:
PR target/99702
* gcc.target/riscv/pr99702.c: New.
The PR66728 changes broke __int128 handling.
It emits wide_int numbers in their minimum unsigned precision
rather than in their full precision.
The problem is then that e.g. the DW_OP_implicit_value path:
int_mode = as_a <scalar_int_mode> (mode);
loc_result = new_loc_descr (DW_OP_implicit_value,
GET_MODE_SIZE (int_mode), 0);
loc_result->dw_loc_oprnd2.val_class = dw_val_class_wide_int;
loc_result->dw_loc_oprnd2.v.val_wide = ggc_alloc<wide_int> ();
*loc_result->dw_loc_oprnd2.v.val_wide = rtx_mode_t (rtl, int_mode);
emits invalid DWARF. In particular this patch fixes there multiple
occurences of:
.byte 0x9e # DW_OP_implicit_value
.uleb128 0x10
.quad 0xffffffffffffffff
+ .quad 0
.quad .LVL46 # Location list begin address (*.LLST40)
.quad .LFE14 # Location list end address (*.LLST40)
where we said the value has 16 byte size but then only emitted 8 byte value.
My understanding is that most of the places that use val_wide expect
the precision they chose (the one of the mode they want etc.), the only
exception is the add_const_value_attribute case where it deals with
VOIDmode CONST_WIDE_INTs, for that I agree when we don't have a mode
we need to fallback to minimum precision (not sure if maximum of
min_precision UNSIGNED and SIGNED wouldn't be better, then consumers
would know if it is signed or unsigned by looking at the MSB),
but that code already computes the precision, just decided to
create the wide_int with much larger precision (e.g. 512 bit
on x86_64).
2021-03-22 Jakub Jelinek <jakub@redhat.com>
PR debug/99562
PR debug/66728
* dwarf2out.c (get_full_len): Use get_precision rather than
min_precision.
(add_const_value_attribute): Make sure add_AT_wide argument has
precision prec rather than some very wide one.
This patch is to fix empty split-conditions of some
define_insn_and_split definitions where their conditions for
define_insn part aren't empty. As Segher and Mike pointed
out, they can sometimes lead to unexpected consequences.
Bootstrapped/regtested on powerpc64le-linux-gnu P9 and
powerpc64-linux-gnu P8.
gcc/ChangeLog:
* config/rs6000/rs6000.md (*rotldi3_insert_sf,
*mov<SFDF:mode><SFDF2:mode>cc_p9, floatsi<mode>2_lfiwax,
floatsi<mode>2_lfiwax_mem, floatunssi<mode>2_lfiwzx,
floatunssi<mode>2_lfiwzx_mem, *floatsidf2_internal,
*floatunssidf2_internal, fix_trunc<mode>si2_stfiwx,
fix_trunc<mode>si2_internal, fixuns_trunc<mode>si2_stfiwx,
*round32<mode>2_fprs, *roundu32<mode>2_fprs,
*fix_trunc<mode>si2_internal): Fix empty split condition.
* config/rs6000/vsx.md (*vsx_le_undo_permute_<mode>,
vsx_reduc_<VEC_reduc_name>_v2df, vsx_reduc_<VEC_reduc_name>_v4sf,
*vsx_reduc_<VEC_reduc_name>_v2df_scalar,
*vsx_reduc_<VEC_reduc_name>_v4sf_scalar): Likewise.
vec_insert defines the element argument type to be signed int by ELFv2
ABI. When expanding a vector with a variable rtx, convert the rtx type
to DImode to support both intrinsic usage and other callers from
rs6000_expand_vector_init produced by v[k] = val when k is long type.
gcc/ChangeLog:
2021-03-21 Xionghu Luo <luoxhu@linux.ibm.com>
PR target/98914
* config/rs6000/rs6000.c (rs6000_expand_vector_set_var_p9):
Convert idx to DImode.
(rs6000_expand_vector_set_var_p8): Likewise.
gcc/testsuite/ChangeLog:
2021-03-21 Xionghu Luo <luoxhu@linux.ibm.com>
PR target/98914
* gcc.target/powerpc/pr98914.c: New test.
Aarch64, ARM and a couple of other architectures have 16-bit floats, HFmode.
As can be seen e.g. on
void
foo (void)
{
__fp16 a = 1.0;
asm ("nop");
a = 2.0;
asm ("nop");
a = 3.0;
asm ("nop");
}
testcase, GCC mishandles this on the dwarf2out.c side by assuming all
floating point types have sizes in multiples of 4 bytes, so what GCC emits
is it says that e.g. the DW_OP_implicit_value will be 2 bytes but then
doesn't emit anything and so anything emitted after it is treated by
consumers as the value and then they get out of sync.
real_to_target which insert_float uses indeed fills it that way, but putting
into an array of long 32 bits each time, but for the half floats it puts
everything into the least significant 16 bits of the first long no matter
what endianity host or target has.
The following patch fixes it. With the patch the -g -O2 -dA output changes
(in a cross without .uleb128 support):
.byte 0x9e // DW_OP_implicit_value
.byte 0x2 // uleb128 0x2
+ .2byte 0x3c00 // fp or vector constant word 0
.byte 0x7 // DW_LLE_start_end (*.LLST0)
.8byte .LVL1 // Location list begin address (*.LLST0)
.8byte .LVL2 // Location list end address (*.LLST0)
.byte 0x4 // uleb128 0x4; Location expression size
.byte 0x9e // DW_OP_implicit_value
.byte 0x2 // uleb128 0x2
+ .2byte 0x4000 // fp or vector constant word 0
.byte 0x7 // DW_LLE_start_end (*.LLST0)
.8byte .LVL2 // Location list begin address (*.LLST0)
.8byte .LFE0 // Location list end address (*.LLST0)
.byte 0x4 // uleb128 0x4; Location expression size
.byte 0x9e // DW_OP_implicit_value
.byte 0x2 // uleb128 0x2
+ .2byte 0x4200 // fp or vector constant word 0
.byte 0 // DW_LLE_end_of_list (*.LLST0)
Bootstrapped/regtested on x86_64-linux, aarch64-linux and
armv7hl-linux-gnueabi, ok for trunk?
I fear the CONST_VECTOR case is still broken, while HFmode elements of vectors
should be fine (it uses eltsize of the element sizes) and likewise SFmode could
be fine, DFmode vectors are emitted as two 32-bit ints regardless of endianity
and I'm afraid it can't be right on big-endian. But I haven't been able to
create a testcase that emits a CONST_VECTOR, for e.g. unused vector vars
with constant operands we emit CONCATN during expansion and thus ...
DW_OP_*piece for each element of the vector and for
DW_TAG_call_site_parameter we give up (because we handle CONST_VECTOR only
in loc_descriptor, not mem_loc_descriptor).
2021-03-21 Jakub Jelinek <jakub@redhat.com>
PR debug/99388
* dwarf2out.c (insert_float): Change return type from void to
unsigned, handle GET_MODE_SIZE (mode) == 2 and return element size.
(mem_loc_descriptor, loc_descriptor, add_const_value_attribute):
Adjust callers.