gcc/ada/
* libgnat/a-strunb.ads (Unbounded_String): Reference is never
null.
* libgnat/a-strunb.adb (Finalize): Copy reference while it needs
to be deallocated.
gcc/ada/
* sem_ch8.adb (Build_Class_Wide_Wrapper): Previous version split
in two subprograms to factorize its functionality:
Find_Suitable_Candidate, and Build_Class_Wide_Wrapper. These
routines are also placed in the new subprogram
Handle_Instance_With_Class_Wide_Type.
(Handle_Instance_With_Class_Wide_Type): New subprogram that
encapsulates all the code that handles instantiations with
class-wide types.
(Analyze_Subprogram_Renaming): Adjust code to invoke the new
nested subprogram Handle_Instance_With_Class_Wide_Type; adjust
documentation.
gcc/ada/
* einfo-utils.ads, einfo-utils.adb (Alias, Set_Alias,
Renamed_Entity, Set_Renamed_Entity, Renamed_Object,
Set_Renamed_Object): Add assertions that reflect how these are
supposed to be used and what they are supposed to return.
(Renamed_Entity_Or_Object): New getter.
(Set_Renamed_Object_Of_Possibly_Void): Setter that allows N to
be E_Void.
* checks.adb (Ensure_Valid): Use Renamed_Entity_Or_Object
because this is called for both cases.
* exp_dbug.adb (Debug_Renaming_Declaration): Use
Renamed_Entity_Or_Object because this is called for both cases.
Add assertions.
* exp_util.adb (Possible_Bit_Aligned_Component): Likewise.
* freeze.adb (Freeze_All_Ent): Likewise.
* sem_ch5.adb (Within_Function): Likewise.
* exp_attr.adb (Calculate_Header_Size): Call Renamed_Entity
instead of Renamed_Object.
* exp_ch11.adb (Expand_N_Raise_Statement): Likewise.
* repinfo.adb (Find_Declaration): Likewise.
* sem_ch10.adb (Same_Unit, Process_Spec_Clauses,
Analyze_With_Clause, Install_Parents): Likewise.
* sem_ch12.adb (Build_Local_Package, Needs_Body_Instantiated,
Build_Subprogram_Renaming, Check_Formal_Package_Instance,
Check_Generic_Actuals, In_Enclosing_Instance,
Denotes_Formal_Package, Process_Nested_Formal,
Check_Initialized_Types, Map_Formal_Package_Entities,
Restore_Nested_Formal): Likewise.
* sem_ch6.adb (Report_Conflict): Likewise.
* sem_ch8.adb (Analyze_Exception_Renaming,
Analyze_Generic_Renaming, Analyze_Package_Renaming,
Is_Primitive_Operator_In_Use, Declared_In_Actual,
Note_Redundant_Use): Likewise.
* sem_warn.adb (Find_Package_Renaming): Likewise.
* sem_elab.adb (Ultimate_Variable): Call Renamed_Object instead
of Renamed_Entity.
* exp_ch6.adb (Get_Function_Id): Call
Set_Renamed_Object_Of_Possibly_Void, because the defining
identifer is still E_Void at this point.
* sem_util.adb (Function_Call_Or_Allocator_Level): Likewise.
Remove redundant (unreachable) code.
(Is_Object_Renaming, Is_Valid_Renaming): Call Renamed_Object
instead of Renamed_Entity.
(Get_Fullest_View): Call Renamed_Entity instead of
Renamed_Object.
(Copy_Node_With_Replacement): Call
Set_Renamed_Object_Of_Possibly_Void because the defining entity
is sometimes E_Void.
* exp_ch5.adb (Expand_N_Assignment_Statement): Protect a call to
Renamed_Object with Is_Object to avoid assertion failure.
* einfo.ads: Minor comment fixes.
* inline.adb: Minor comment fixes.
* tbuild.ads: Minor comment fixes.
gcc/ada/
* sem_ch13.adb (Freeze_Entity_Checks): Perform same check on
predicate expression inside pragma as inside aspect.
* sem_util.adb (Is_Current_Instance): Recognize possible
occurrence of subtype as current instance inside the pragma
Predicate.
Set the 3 possible flags as all individual bits and group for options.
* flag-types.h (enum ranger_debug): Adjust values.
* params.opt (ranger_debug): Ditto.
These tests are testing Advanced SIMD codegen, so if the compiler or the
testsuite is forcing SVE they will fail.
This adds +nosve so that we always generate Advanced SIMD codegen.
gcc/testsuite/ChangeLog:
PR target/102907
* gcc.target/aarch64/shrn-combine-1.c: Disable SVE.
* gcc.target/aarch64/shrn-combine-2.c: Likewise.
* gcc.target/aarch64/shrn-combine-3.c: Likewise.
* gcc.target/aarch64/shrn-combine-4.c: Likewise.
* gcc.target/aarch64/shrn-combine-5.c: Likewise.
* gcc.target/aarch64/shrn-combine-6.c: Likewise.
* gcc.target/aarch64/shrn-combine-7.c: Likewise.
I was not careful with the fix for PR 102505 and did not craft the
check to satisfy the verifier carefully, which lead to PR 102886.
(The verifier has the test structured differently and somewhat
redundantly, so I could not just copy it).
This patch fixes it. I hope it is quite obvious correction of an
oversight and so will commit it if survives bootstrap and testing on
x86_64-linux and ppc64le-linux.
Testcase for this bug is gcc.dg/tree-ssa/sra-18.c (but only on
platforms with constant pools). I will backport the two fixes
to the release branches squashed.
gcc/ChangeLog:
2021-10-22 Martin Jambor <mjambor@suse.cz>
PR tree-optimization/102886
* tree-sra.c (totally_scalarize_subtree): Fix the out of
access-condition.
Just like PR 100382, here we have a DCE removing a
null pointer load which is needed still.
In this case, execute_fixup_cfg removes a store (correctly)
and then removes the null load (incorrectly) due to
not checking stmt_unremovable_because_of_non_call_eh_p.
This patch adds the check in the similar way as the patch
to fix PR 100382 did.
gcc/ChangeLog:
* tree-ssa-dce.c (simple_dce_from_worklist):
Check stmt_unremovable_because_of_non_call_eh_p also
before removing the statement.
Previous refactoring made the possibility of considering re-aligned
loads for unlimited cost model alignment peeling difficult so I
ditched that. Later refactoring made it easily possible again so
the following patch re-instantiates this which should fix the
observed regression on powerpc with altivec.
2021-10-25 Richard Biener <rguenther@suse.de>
PR tree-optimization/102905
* tree-vect-data-refs.c (vect_enhance_data_refs_alignment):
Use vect_supportable_dr_alignment again to determine whether
an access is supported when not aligned.
Similar for sqrt/sqrtl.
gcc/ChangeLog:
PR target/102464
* match.pd: Simplify (_Float16) sqrtf((float) a) to .SQRT(a)
when direct_internal_fn_supported_p, similar for sqrt/sqrtl.
gcc/testsuite/ChangeLog:
PR target/102464
* gcc.target/i386/pr102464-sqrtph.c: New test.
* gcc.target/i386/pr102464-sqrtsh.c: New test.
This fixes a latent issue exposed by now allowing VN_TOP in PHI
arguments. We may only use optimistic equality when merging values on
different edges, not when merging values on the same edge - in particular
we may not choose the undef value on any edge when there's a not undef
value as well.
2021-10-25 Richard Biener <rguenther@suse.de>
PR tree-optimization/102920
* tree-ssa-sccvn.h (expressions_equal_p): Add argument
controlling VN_TOP matching behavior.
* tree-ssa-sccvn.c (expressions_equal_p): Likewise.
(vn_phi_eq): Do not optimistically match VN_TOP.
* gcc.dg/torture/pr102920.c: New testcase.
This patch is to support transform in fast-math something like
_mm512_add_ph(x1, _mm512_fmadd_pch(a, b, _mm512_setzero_ph())) to
_mm512_fmadd_pch(a, b, x1).
And support transform _mm512_add_ph(x1, _mm512_fmul_pch(a, b))
to _mm512_fmadd_pch(a, b, x1).
gcc/ChangeLog:
* config/i386/sse.md (fma_<mode>_fadd_fmul): Add new
define_insn_and_split.
(fma_<mode>_fadd_fcmul):Likewise
(fma_<complexopname>_<mode>_fma_zero):Likewise
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx512fp16-complex-fma.c: New test.
The behavior of the -mdisable-fpregs is confusing in that it doesn't
disable the use of the floating-point registers in all situations.
The -msoft-float disables the use of the floating-point registers in
all situations. The Linux kernel only needs to disable use of the
xmpyu instruction to avoid using the floating-point registers.
This change revises the -mdisable-fpregs option to disable the use of
the floating-point registers in all situations. It is now equivalent
to the -msoft-float option. A new -msoft-mult option is added to
disable use of the xmpyu instruction. The libgcc library can be
compiled with the -msoft-mult option to avoid using hardware integer
multiplication.
2021-10-24 John David Anglin <danglin@gcc.gnu.org>
gcc/ChangeLog:
* config/pa/pa-d.c (pa_d_handle_target_float_abi): Don't check
TARGET_DISABLE_FPREGS.
* config/pa/pa.c (fix_range): Use MASK_SOFT_FLOAT instead of
MASK_DISABLE_FPREGS.
(hppa_rtx_costs): Don't check TARGET_DISABLE_FPREGS. Adjust
cost of hardware integer multiplication.
(pa_conditional_register_usage): Don't check TARGET_DISABLE_FPREGS.
* config/pa/pa.h (INT14_OK_STRICT): Likewise.
* config/pa/pa.md: Don't check TARGET_DISABLE_FPREGS. Check
TARGET_SOFT_FLOAT in patterns that use xmpyu instruction.
* config/pa/pa.opt (mdisable-fpregs): Change target mask to
SOFT_FLOAT. Revise comment.
(msoft-float): New option.
The 'G' constraint only matches a float zero.
2021-10-24 John David Anglin <danglin@gcc.gnu.org>
gcc/ChangeLog:
* config/pa/pa.md: Don't use 'G' constraint in integer move patterns.
This patch cures the testsuite failure of bfin/20090914-3.c, which
currently FAILs on bfin-elf with "(test for excess errors)" due to:
20090914-3.c:3:1: warning: return type defaults to 'int' [-Wimplicit-int]
which is obviously not what this code was intended to test. Fixed by
turning the code into a function returning the final "fract32" result,
as simply specifying an "int" return type for main, results in the
entire function being optimized away, as the result is unused.
2021-10-24 Roger Sayle <roger@nextmovesoftware.com>
gcc/testsuite/ChangeLog
* gcc.target/bfin/20090914-3.c: Tweak test case.
Move bind-c-intent-out-2.f90 to gfortran.dg/ubsan for -fsanitize=undefined.
PR fortran/9262
* gfortran.dg/bind-c-intent-out-2.f90: Moved to ...
* gfortran.dg/ubsan/bind-c-intent-out-2.f90
On x86_64, V1TI mode holds a 128-bit integer value in a (vector) SSE
register (where regular TI mode uses a pair of 64-bit general purpose
scalar registers). This patch improves the implementation of AND, IOR,
XOR and NOT on these values.
The benefit is demonstrated by the following simple test program:
typedef unsigned __int128 v1ti __attribute__ ((__vector_size__ (16)));
v1ti and(v1ti x, v1ti y) { return x & y; }
v1ti ior(v1ti x, v1ti y) { return x | y; }
v1ti xor(v1ti x, v1ti y) { return x ^ y; }
v1ti not(v1ti x) { return ~x; }
For which GCC currently generates the rather large:
and: movdqa %xmm0, %xmm2
movq %xmm1, %rdx
movq %xmm0, %rax
andq %rdx, %rax
movhlps %xmm2, %xmm3
movhlps %xmm1, %xmm4
movq %rax, %xmm0
movq %xmm4, %rdx
movq %xmm3, %rax
andq %rdx, %rax
movq %rax, %xmm5
punpcklqdq %xmm5, %xmm0
ret
ior: movdqa %xmm0, %xmm2
movq %xmm1, %rdx
movq %xmm0, %rax
orq %rdx, %rax
movhlps %xmm2, %xmm3
movhlps %xmm1, %xmm4
movq %rax, %xmm0
movq %xmm4, %rdx
movq %xmm3, %rax
orq %rdx, %rax
movq %rax, %xmm5
punpcklqdq %xmm5, %xmm0
ret
xor: movdqa %xmm0, %xmm2
movq %xmm1, %rdx
movq %xmm0, %rax
xorq %rdx, %rax
movhlps %xmm2, %xmm3
movhlps %xmm1, %xmm4
movq %rax, %xmm0
movq %xmm4, %rdx
movq %xmm3, %rax
xorq %rdx, %rax
movq %rax, %xmm5
punpcklqdq %xmm5, %xmm0
ret
not: movdqa %xmm0, %xmm1
movq %xmm0, %rax
notq %rax
movhlps %xmm1, %xmm2
movq %rax, %xmm0
movq %xmm2, %rax
notq %rax
movq %rax, %xmm3
punpcklqdq %xmm3, %xmm0
ret
with this patch we now generate the much more efficient:
and: pand %xmm1, %xmm0
ret
ior: por %xmm1, %xmm0
ret
xor: pxor %xmm1, %xmm0
ret
not: pcmpeqd %xmm1, %xmm1
pxor %xmm1, %xmm0
ret
For my first few attempts at this patch I tried adding V1TI to the
existing VI and VI12_AVX_512F mode iterators, but these then have
dependencies on other iterators (and attributes), and so on until
everything ties itself into a knot, as V1TI mode isn't really a
first-class vector mode on x86_64. Hence I ultimately opted to use
simple stand-alone patterns (as used by the existing TF mode support).
2021-10-23 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* config/i386/sse.md (<any_logic>v1ti3): New define_insn to
implement V1TImode AND, IOR and XOR on TARGET_SSE2 (and above).
(one_cmplv1ti2): New define expand.
gcc/testsuite/ChangeLog
* gcc.target/i386/sse2-v1ti-logic.c: New test case.
* gcc.target/i386/sse2-v1ti-logic-2.c: New test case.
std::make_any should be constrained so it can only be called if the
construction of the return value would be valid.
libstdc++-v3/ChangeLog:
PR libstdc++/102894
* include/std/any (make_any): Add SFINAE constraint.
* testsuite/20_util/any/102894.cc: New test.
This was not defined in the spec and not consistent in the
implementation causing incosistent behavior. After review we have
updated the CPU implementations and proposed the spec be updated to
specific that FPU tininess checks check for tininess before roudning.
Architecture change draft:
https://openrisc.io/proposals/p18-fpu-tininess
libgcc/ChangeLog:
* config/or1k/sfp-machine.h (_FP_TININESS_AFTER_ROUNDING):
Change to 0.
PR testsuite/102742
libbacktrace/ChangeLog:
* btest.c (MIN_DESCRIPTOR): New.
(MAX_DESCRIPTOR): Likewise.
(check_available_files): Likewise.
(check_open_files): Check only file descriptors that
were not available at the entry.
(main): Call check_available_files.
The following fixes the test for an exit edge I put in place for
the fix for PR45178 where I somehow misunderstood how the cyclic
list works.
2021-10-22 Richard Biener <rguenther@suse.de>
PR tree-optimization/102893
* tree-ssa-dce.c (find_obviously_necessary_stmts): Fix the
test for an exit edge.
* gcc.dg/tree-ssa/ssa-dce-9.c: New testcase.
The equivalence oracle creates a new equiv set at each def point,
killing any incoming equivalences, however in the path sensitive
oracle we create brand new equivalences at each PHI:
BB4:
BB8:
x_5 = PHI <y_8(4)>
Here we note that x_5 == y_8 at the end of the path.
The current code is intersecting this new equivalence with previously
known equivalences coming into the path. This is incorrect, as this
is a new definition. This patch kills any known equivalence before we
register a new one.
This hasn't caused problems so far, but upcoming changes to the
pipeline has us threading more aggressively and triggering corner
cases where this causes incorrect code.
I have tested this patch with the usual regstrap cycle. I have also
hacked a compiler comparing the old and new behavior to see if we were
previously threading paths where the decision was made due to invalid
equivalences. Luckily, there were no such paths, but there were 22
paths in a set of .ii files where disregarding incoming relations
allowed us to thread the path. This is a miniscule improvement,
but we moved a handful of thredable paths earlier in the pipeline,
which is always good.
Tested on x86-64 Linux.
Co-authored-by: Andrew MacLeod <amacleod@redhat.com>
gcc/ChangeLog:
* gimple-range-path.cc (path_range_query::compute_phi_relations):
Kill any global relations we may know before registering a new
one.
* value-relation.cc (path_oracle::killing_def): New.
* value-relation.h (path_oracle::killing_def): New.