An object declaration (other than a deferred constant declaration)
causes freezing where it occurs (13.14(6)), which means every name
occurring within it causes freezing (13.14(4/1)), and when the name in a
subtype_mark causes freezing, the denoted subtype is frozen (13.14(11)).
Hence, one needs to freeze the target type when expanding a qualified
expression.
gcc/ada/
* exp_ch4.adb (Expand_N_Qualified_Expression): Freeze
Target_Type.
Proof of an assertion is not automatic anymore. Add two assertions
before it to guide the prover.
gcc/ada/
* libgnat/s-aridou.adb (Double_Divide): Add intermediate
assertions.
Before this commit, a GNAT compiled with assertions would crash when
attempting to emit CUDA symbols in ALI files for spark_mode/ghost
packages, whose content is a single null statement.
gcc/ada/
* lib-writ.adb (Output_CUDA_Symbols): Check for null packages.
Setting this parameter to zero when calling the Configure procedure
completely disables the tracking of the biggest memory users, which
wasn't clear from the current documentation. So this patch enhances the
documentation of both the Configure procedure and the Dump procedure to
make that explicit.
gcc/ada/
* libgnat/g-debpoo.ads: Improve documentation of the
Stack_Trace_Depth parameter.
The QNX version of __gnat_install_handler calls sigaction for a number
of signals, and then prints an error message when the call failed.
But unfortunately, except for the first call, we forgot to store
sigaction's return value, so the check that ensues uses a potentially
uninitialized variable, which the compiler does detect (this is how we
found this issue).
This change fixes this by making sure we store the result of each
sigaction call before checking it.
While at it, we noticed that the error messages all mentioned the
SIGFPE signal rather than the signal that was just tested, most likely
a copy/paste error. Fixed by this change as well.
gcc/ada/
* init.c (__gnat_install_handler) [__QNX__]: Save sigaction's
return value in err before checking err's value. Fix incorrect
signal names in perror messages.
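The pattern described above can be sketched as follows. This is a minimal illustration under assumptions, not the code in init.c: the helper name install_handler, the no-op handler, and the message text are all hypothetical. The point is only that the result of each sigaction call is stored before it is tested, and that the message names the signal that was just tried.

```cpp
#include <cassert>
#include <cstdio>
#include <signal.h>

// Hypothetical no-op handler standing in for the runtime's real one.
static void on_signal(int, siginfo_t*, void*) {}

// Installs on_signal for sig (hypothetical helper).
// Returns sigaction's result: 0 on success, -1 on failure.
int install_handler(int sig, const char* message) {
  assert(message != nullptr);
  struct sigaction act = {};
  sigemptyset(&act.sa_mask);
  act.sa_flags = SA_SIGINFO;
  act.sa_sigaction = on_signal;

  // Store the result of this very call before testing it.
  const int err = sigaction(sig, &act, nullptr);
  if (err == -1)
    std::perror(message);  // the message names the signal that was just tried
  return err;
}
```

A caller would invoke install_handler once per signal, passing a message that matches that signal. POSIX does not allow changing the action for SIGKILL, so passing SIGKILL is a convenient way to see the failure path.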
On QNX, the sigaction handler is incorrectly installed via the
sa_handler field of struct sigaction, rather than the sa_sigaction
field. This triggers a compilation warning due to a mismatch between the
function's signature and the field's type.
| init.c:2614:18: warning: assignment to 'void (*)(int)'
| from incompatible pointer type 'void (*)(int, siginfo_t *, void *)'
| {aka 'void (*)(int, struct _siginfo *, void *)'}
| [-Wincompatible-pointer-types]
In practice, using the sa_handler field actually works, but only because
those two fields are inside a union:
From target/qnx7/usr/include/signal.h:
| union { \
| __handler_type _sa_handler; \
| __action_type _sa_sigaction; \
| } __sa_un; \
This commit fixes this.
gcc/ada/
* init.c (__gnat_install_handler) [__QNX__]: Set
act.sa_sigaction rather than act.sa_handler.
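For reference, a three-argument handler is normally paired with SA_SIGINFO and installed through sa_sigaction. The sketch below is an assumption-laden illustration (the names last_signal, on_signal and install_and_raise are hypothetical and do not appear in init.c); it installs such a handler and checks that it runs.

```cpp
#include <cassert>
#include <signal.h>

// Records the last signal seen by the handler (hypothetical state).
static volatile sig_atomic_t last_signal = 0;

// Three-argument handler: its type matches sa_sigaction, not sa_handler.
static void on_signal(int sig, siginfo_t* info, void* context) {
  (void)info;
  (void)context;
  last_signal = sig;
}

// Installs the handler through sa_sigaction, raises sig once, and
// reports whether the handler observed it.
bool install_and_raise(int sig) {
  assert(sig > 0);
  struct sigaction act = {};
  sigemptyset(&act.sa_mask);
  act.sa_flags = SA_SIGINFO;     // request the three-argument form
  act.sa_sigaction = on_signal;  // no pointer-type mismatch here
  if (sigaction(sig, &act, nullptr) != 0)
    return false;
  last_signal = 0;
  raise(sig);
  return last_signal == sig;
}
```

Assigning the same function to sa_handler would need a cast or produce the warning quoted above, even though both fields may share storage on a given platform.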
When building the GNAT runtime for QNX, we get the following warning:
| cstreams.c: In function '__gnat_full_name':
| cstreams.c:209:5: warning: implicit declaration of function 'realpath'
| [-Wimplicit-function-declaration]
| 209 | realpath (nam, buffer);
| | ^~~~~~~~
This commit fixes the warning by adding the corresponding #include
of <stdlib.h>.
gcc/ada/
* cstreams.c: Add <stdlib.h> #include.
bzero is marked as legacy in POSIX.1-2001, and using it triggers
deprecation warnings on some systems such as QNX. This change adjusts
the one place where we use it in terminals.c to use memset instead.
This, in turn, allows us to get rid of a hack for HP/UX and Solaris.
gcc/ada/
* terminals.c: Remove bzero #define on HP/UX or Solaris
platforms.
(child_setup_tty): Replace bzero call by equivalent call to
memset.
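The replacement is mechanical: bzero (p, n) becomes memset (p, 0, n). A small sketch, assuming the zeroed object is a struct termios as one would expect in a tty setup routine; the helper names clear_termios and is_all_zero are hypothetical and not taken from terminals.c.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <termios.h>

// Hypothetical helper: clear a termios structure before filling it in.
void clear_termios(struct termios* s) {
  assert(s != nullptr);
  std::memset(s, 0, sizeof *s);  // previously: bzero (s, sizeof *s)
}

// Returns true when the n bytes at p are all zero.
bool is_all_zero(const void* p, std::size_t n) {
  const unsigned char* bytes = static_cast<const unsigned char*>(p);
  for (std::size_t i = 0; i < n; ++i)
    if (bytes[i] != 0)
      return false;
  return true;
}
```

Since memset is declared by <string.h> on every hosted platform, no per-platform #define is needed to make the call available.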
The functions in subpackage Storage_Model_Support (apart from the
Has_*_Aspect functions) are revised to have assertions that will fail
when passed a parameter that doesn't specify the appropriate aspect
(either aspect Storage_Model_Type or Designated_Storage_Model), instead
of returning Empty for bad arguments. Also, various of the functions now
allow either a type with aspect Storage_Model_Type or an object of such
a type.
gcc/ada/
* sem_util.ads (Storage_Model_Support): Revise comments on most
operations within this nested package to reflect that they can
now be passed either a type that has aspect Storage_Model_Type
or an object of such a type. Change the names of the relevant
formals to SM_Obj_Or_Type. Also, add more precise semantic
descriptions in some cases, and declare the subprograms in a
more logical order.
* sem_util.adb (Storage_Model_Support.Storage_Model_Object): Add
an assertion that the type must specify aspect
Designated_Storage_Model, rather than returning Empty when it
doesn't specify that aspect.
(Storage_Model_Support.Storage_Model_Type): Add an assertion
that formal must be an object whose type specifies aspect
Storage_Model_Type, rather than returning Empty for when it
doesn't have such a type (and test Has_Storage_Model_Type_Aspect
rather than Find_Value_Of_Aspect).
(Storage_Model_Support.Get_Storage_Model_Type_Entity): Allow
both objects and types, and add an assertion that the type (or
the type of the object) has a value for aspect
Storage_Model_Type.
Fix the escaping of the loop variable from the loop scope in both forms
of iterated element associations (i.e. "for J in ..." and "for J of
..."). Create a dedicated scope around the analyses of both loops. Also
create a copy of the Loop_Parameter_Specification instead of analyzing
(and modifying) the original Tree as it will be reanalyzed later.
gcc/ada/
* sem_aggr.adb (Resolve_Iterated_Association): Create scope
around N_Iterated_Element_Association handling. Analyze a copy
of the Loop_Parameter_Specification. Call Analyze instead of
Analyze_* to be more homogeneous.
(Sem_Ch5): Remove now unused package.
The front-end drops the declaration of a temporary on the floor because
Insert_Actions fails to climb up out of an N_Iterated_Component_Association
when the temporary is created during the analysis of its Discrete_Choices.
gcc/ada/
* exp_util.adb (Insert_Actions) <N_Iterated_Component_Association>:
Climb up out of the node if the actions come from Discrete_Choices.
Fix a regression in the support for Ada 2022's treatment of calls to
abstract subprograms in pre/post-conditions (thanks to Javier Miranda
for producing this patch).
gcc/ada/
* sem_disp.adb (Check_Dispatching_Context): When checking to see
whether an expression occurs in a class-wide pre/post-condition,
also check for the possibility that it occurs in a class-wide
preconditions subprogram that was introduced as part of
expansion. Without this fix, some legal calls occurring in
class-wide preconditions may be incorrectly flagged as violating
the "a call to an abstract subprogram must be dispatching" rule.
The key is that the protected type is a (limited) private type, which
fools a test in Cleanup_Scopes.
gcc/ada/
* inline.adb (Cleanup_Scopes): Test the underlying type.
The semantic analysis of predicates involves a fair amount of tree
copying because of both semantic and implementation considerations, and
there is a difficulty with quantified expressions since they declare a
new entity that cannot be shared between the various copies of the tree.
This change implements a specific processing for it in New_Copy_Tree
that subsumes a couple of fixes made earlier for variants of the issue.
gcc/ada/
* sem_util.ads (Is_Entity_Of_Quantified_Expression): Declare.
* sem_util.adb (Is_Entity_Of_Quantified_Expression): New
predicate.
(New_Copy_Tree): Deal with all entities of quantified
expressions.
* sem_ch13.adb (Build_Predicate_Functions): Get rid of
superfluous tree copying and remove obsolete code.
* sem_ch6.adb (Fully_Conformant_Expressions): Deal with all
entities of quantified expressions.
Finalization of a record object is required to finalize any components
that have an access discriminant constrained by a per-object expression
before other components. This includes the case of a type extension;
"early finalization" components of the parent type are required to be
finalized before non-early-finalization extension components. This is
implemented in the extension type's finalization procedure by placing
the call to the parent type's finalization procedure between the
finalization of the "early finalization" extension components and the
finalization of the other extension components. Previously that call was
executed after finalizing all of the extension components.
gcc/ada/
* exp_ch7.adb (Build_Finalize_Statements): Add Last_POC_Call
variable to keep track of the last "early finalization" call
generated for type extension's finalization procedure. If
non-empty, then this will indicate the point at which to insert
the call to the parent type's finalization procedure. Modify
nested function Process_Component_List_For_Finalize to set this
variable (and avoid setting it during a recursive call). If
Last_POC_Call is empty, then insert the parent finalization call
before, rather than after, the finalization code for the
extension components.
This moves the implementation of AI12-0101 + AI05-0123 from the expander
to the semantic analyzer and completes the implementation of AI12-0413,
which are both binding interpretations in Ada 2012, fixing a few bugs in
the process and removing a fair amount of duplicated code throughout.
gcc/ada/
* einfo-utils.adb (Remove_Entity): Fix couple of oversights.
* exp_ch3.adb (Is_User_Defined_Equality): Delete.
(User_Defined_Eq): Call Get_User_Defined_Equality.
(Make_Eq_Body): Likewise.
(Predefined_Primitive_Eq_Body): Call Is_User_Defined_Equality.
* exp_ch4.adb (Build_Eq_Call): Call Get_User_Defined_Equality.
(Is_Equality): Delete.
(User_Defined_Primitive_Equality_Op): Likewise.
(Find_Aliased_Equality): Call Is_User_Defined_Equality.
(Expand_N_Op_Eq): Call Underlying_Type unconditionally.
Do not implement AI12-0101 + AI05-0123 here.
(Expand_Set_Membership): Call Resolve_Membership_Equality.
* exp_ch6.adb (Expand_Call_Helper): Remove obsolete code.
* sem_aux.ads (Is_Record_Or_Limited_Type): Delete.
* sem_aux.adb (Is_Record_Or_Limited_Type): Likewise.
* sem_ch4.ads (Nondispatching_Call_To_Abstract_Operation): Declare.
* sem_ch4.adb (Analyze_Call): Call Call_Abstract_Operation.
(Analyze_Membership_Op): Call Resolve_Membership_Equality.
(Nondispatching_Call_To_Abstract_Operation): New procedure.
(Remove_Abstract_Operations): Call it.
* sem_ch6.adb (Check_Untagged_Equality): Remove obsolete error and
call Is_User_Defined_Equality.
* sem_ch7.adb (Inspect_Untagged_Record_Completion): New procedure
implementing AI12-0101 + AI05-0123.
(Analyze_Package_Specification): Call it.
(Declare_Inherited_Private_Subprograms): Minor tweak.
(Uninstall_Declarations): Likewise.
* sem_disp.adb (Check_Direct_Call): Adjust to new implementation
of Is_User_Defined_Equality.
* sem_res.ads (Resolve_Membership_Equality): Declare.
* sem_res.adb (Resolve): Replace direct error handling with call to
Nondispatching_Call_To_Abstract_Operation.
(Resolve_Call): Likewise.
(Resolve_Equality_Op): Likewise. Implement AI12-0413.
(Resolve_Membership_Equality): New procedure.
(Resolve_Membership_Op): Call Get_User_Defined_Equality.
* sem_util.ads (Get_User_Defined_Eq): Rename into...
(Get_User_Defined_Equality): ...this.
* sem_util.adb (Get_User_Defined_Eq): Rename into...
(Get_User_Defined_Equality): ...this. Call Is_User_Defined_Equality.
(Is_User_Defined_Equality): Also check the profile but remove tests
on Comes_From_Source and Parent.
* sinfo.ads (Generic_Parent_Type): Adjust field description.
* uintp.ads (Ubool): Invoke user-defined equality in predicate.
Cleanup related to handling of user-defined equality in GNATprove.
gcc/ada/
* exp_ch3.adb (User_Defined_Eq): Replace duplicated code with a
call to Get_User_Defined_Eq.
When checking components of a record type for their own user-defined
equality function it is enough to find just one such component.
Cleanup related to handling of user-defined equality in GNATprove.
gcc/ada/
* exp_ch3.adb (Build_Untagged_Equality): Exit early when the
outcome of a loop is already known.
This is an incremental change towards supporting shared libraries
for VxWorks on aarch64.
The aarch64-vx7r2 compiler supports compilation with -fpic/PIC. This
change adds aarch64 to the list of CPUs for which GNATLIB_SHARED maps to
gnatlib-shared-dual for vxworks7r2, so "make gnatlib-shared" actually
builds a shared lib.
While other adjustments will be needed to get the runtime tests to pass,
this one is a necessary step and doesn't impair the rest.
gcc/ada/
* Makefile.rtl: Add aarch64 to the list of CPUs for which
GNATLIB_SHARED maps to gnatlib-shared-dual for vxworks7r2.
This aligns Analyze_Negation and Analyze_Unary_Op with the other similar
procedures in Sem_Ch4. No functional changes.
gcc/ada/
* sem_ch4.adb (Analyze_Negation): Minor tweak.
(Analyze_Unary_Op): Likewise.
The problem is that Install_Limited_With_Clause does not fully implement
AI05-0129, in the case where a regular with clause is processed before a
limited_with clause of the same package: the visible "shadow" entity is
that of the incomplete type, instead of that of the full type per the AI.
This requires adjusting Remove_Limited_With_Unit to match the change in
Install_Limited_With_Clause and also Build_Incomplete_Type_Declaration,
which is responsible for synthesizing incomplete types out of full type
declarations for self-referential types.
A small tweak is also needed in Analyze_Subprogram_Body_Helper to align
it with an equivalent processing for CW types in Find_Type_Name. And the
patch also changes the Incomplete_View field in full type declarations
to point to the entity of the view instead of its declaration.
gcc/ada/
* exp_ch3.adb (Build_Assignment): Adjust to the new definition of
Incomplete_View field.
* sem_ch10.ads (Decorate_Type): Declare.
* sem_ch10.adb (Decorate_Type): Move to library level.
(Install_Limited_With_Clause): In the already analyzed case, also
deal with incomplete type declarations present in the sources and
simplify the replacement code.
(Build_Shadow_Entity): Deal with swapped views in package body.
(Restore_Chain_For_Shadow): Deal with incomplete type declarations
present in the sources.
* sem_ch3.adb (Analyze_Full_Type_Declaration): Adjust to the new
definition of Incomplete_View field.
(Build_Incomplete_Type_Declaration): Small consistency tweak.
Set the incomplete type as the Incomplete_View of the full type.
If the scope is a package with a limited view, build a shadow
entity for the incomplete type.
* sem_ch6.adb (Analyze_Subprogram_Body_Helper): When replacing
the limited view of a CW type as designated type of an anonymous
access return type, get to the CW type of the incomplete view of
the tagged type, if any.
(Collect_Primitive_Operations): Adjust to the new definition of
Incomplete_View field.
* sinfo.ads (Incomplete_View): Denote the entity itself instead
of its declaration.
* sem_util.adb: Remove call to Defining_Entity.
Volatile refinement properties (e.g. Async_Writers), which refine the
Volatile aspect in SPARK, are inherited by subtypes from their base
types. In particular, this patch fixes handling of those properties for
subtypes of private types.
gcc/ada/
* sem_util.adb (Type_Or_Variable_Has_Enabled_Property): Given a
subtype recurse into its base type.
Routine Type_Or_Variable_Has_Enabled_Property handles either types or
objects; replace negation with an explicit positive condition.
Cleanup related to handling of volatile refinement aspects in SPARK;
behaviour is unaffected.
gcc/ada/
* sem_util.adb (Type_Or_Variable_Has_Enabled_Property): Clarify.
Routines Is_Enabled and Is_Enabled_Pragma are identical (except for
comments); remove this duplication.
Cleanup related to handling of volatile refinement aspects in SPARK;
behaviour is unaffected.
gcc/ada/
* sem_util.adb (Is_Enabled): Remove; use Is_Enabled_Pragma
instead.
This patch adds support for list items in the has_device_addr clause whose
type is given by C++ template parameters.
gcc/cp/ChangeLog:
* pt.cc (tsubst_omp_clauses): Added OMP_CLAUSE_HAS_DEVICE_ADDR.
* semantics.cc (finish_omp_clauses): Added template decl processing.
libgomp/ChangeLog:
* testsuite/libgomp.c++/target-has-device-addr-7.C: New test.
* testsuite/libgomp.c++/target-has-device-addr-8.C: New test.
* testsuite/libgomp.c++/target-has-device-addr-9.C: New test.
Fixes:
opts-global.cc:75:15: runtime error: store to address 0x00000bc9be70 with insufficient space for an object of type 'char'
which happens when mask == 0, len == 0 and we allocate zero elements.
Eventually, result[0] is written, which triggers the UBSAN error.
gcc/ChangeLog:
* opts-global.cc (write_langs): Allocate at least one byte.
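The failure mode can be reproduced in miniature. The sketch below is not the code in opts-global.cc: the name table, the function name join_langs and the use of malloc are assumptions made for illustration. It shows why the buffer needs at least one byte even when no bit of the mask is set, because the terminating NUL is always stored.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical name table; index n corresponds to bit n of the mask.
static const char* const lang_names[] = {"C", "C++", "Ada", nullptr};

// Returns a malloc'ed, '/'-separated list of the names selected by mask.
// The caller frees the result.
char* join_langs(unsigned mask) {
  std::size_t len = 0;
  for (unsigned n = 0; lang_names[n] != nullptr; n++)
    if (mask & (1u << n))
      len += std::strlen(lang_names[n]) + 1;

  // With mask == 0, len is 0, yet the final NUL store still needs a byte.
  const std::size_t capacity = len > 0 ? len : 1;
  char* result = static_cast<char*>(std::malloc(capacity));
  assert(result != nullptr);

  len = 0;
  for (unsigned n = 0; lang_names[n] != nullptr; n++)
    if (mask & (1u << n)) {
      if (len)
        result[len++] = '/';
      std::strcpy(result + len, lang_names[n]);
      len += std::strlen(lang_names[n]);
    }

  result[len] = '\0';  // out of bounds if zero bytes had been allocated
  return result;
}
```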
The following adds MIN/MAX folding from fold_cond_expr_with_comparison
to the GIMPLE part of match.pd, leaving the GENERIC part in
fold-const.cc since that's constrained on frontend specific things
I did not want to carry to match.pd.
The effect becomes apparent when we no longer can rely on GENERIC
folding of COND_EXPRs in gcc.dg/tree-ssa/pr92834.c and
gcc.dg/tree-ssa/pr94786.c.
2022-05-13 Richard Biener <rguenther@suse.de>
* match.pd (A cmp B ? A : B -> min/max): New patterns
carried over from fold_cond_expr_with_comparison.
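At the source level, and restricted to integer operands where no NaN or signed-zero concerns arise, the equivalences behind these patterns look like the sketch below. The function names are hypothetical and this is plain C++, not match.pd syntax.

```cpp
#include <algorithm>
#include <cassert>

// A < B ? A : B selects the minimum of two integers.
int select_min(int a, int b) { return a < b ? a : b; }

// A > B ? A : B selects the maximum of two integers.
int select_max(int a, int b) { return a > b ? a : b; }

// Compares both conditional forms against std::min/std::max on a small range.
void check_small_range() {
  for (int a = -3; a <= 3; ++a)
    for (int b = -3; b <= 3; ++b) {
      assert(select_min(a, b) == std::min(a, b));
      assert(select_max(a, b) == std::max(a, b));
    }
}
```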
When d->perm[i] == d->perm[i-1] + 1 and d->perm[i] == nelt, the run is not
contiguous. The expander should fail if there are more than two contiguous areas.
gcc/ChangeLog:
PR target/105587
* config/i386/i386-expand.cc
(expand_vec_perm_pslldq_psrldq_por): Fail when (d->perm[i] ==
d->perm[i-1] + 1) && d->perm[i] == nelt && start != -1.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr105587.c: New test.
const_int_operand and other const*_operand predicates do not need
constraints when the constraint is inherited from the range of
constant integer predicate. Remove the constraint in case all
alternatives use the same inherited constraint.
2022-05-15 Uroš Bizjak <ubizjak@gmail.com>
gcc/ChangeLog:
* config/i386/i386.md: Remove constraints when used with
const_int_operand, const0_operand, const_1_operand, constm1_operand,
const8_operand, const128_operand, const248_operand, const123_operand,
const2367_operand, const1248_operand, const359_operand,
const_4_or_8_to_11_operand, const48_operand, const_0_to_1_operand,
const_0_to_3_operand, const_0_to_4_operand, const_0_to_5_operand,
const_0_to_7_operand, const_0_to_15_operand, const_0_to_31_operand,
const_0_to_63_operand, const_0_to_127_operand, const_0_to_255_operand,
const_0_to_255_mul_8_operand, const_1_to_31_operand,
const_1_to_63_operand, const_2_to_3_operand, const_4_to_5_operand,
const_4_to_7_operand, const_6_to_7_operand, const_8_to_9_operand,
const_8_to_11_operand, const_8_to_15_operand, const_10_to_11_operand,
const_12_to_13_operand, const_12_to_15_operand, const_14_to_15_operand,
const_16_to_19_operand, const_16_to_31_operand, const_20_to_23_operand,
const_24_to_27_operand and const_28_to_31_operand.
* config/i386/mmx.md: Ditto.
* config/i386/sse.md: Ditto.
* config/i386/subst.md: Ditto.
* config/i386/sync.md: Ditto.
It has come up several times that Clang considers hidden friends of a class
to be sufficiently memberly to be covered by a friend declaration naming the
class. This is somewhat unclear in the standard: [class.friend] says
"Declaring a class to be a friend implies that private and protected members
of the class granting friendship can be named in the base-specifiers and
member declarations of the befriended class."
A hidden friend is a syntactic member-declaration, but is it a "member
declaration"? CWG was ambivalent, and referred the question to EWG as a
design choice. But recently Patrick mentioned that the current G++ choice
not to treat it as a "member declaration" was making his library work
significantly more cumbersome, so let's go ahead and vote the other way.
This means that the testcases for 100502 and 58993 are now accepted.
DR1699
PR c++/100502
PR c++/58993
gcc/cp/ChangeLog:
* friend.cc (is_friend): Hidden friends count as members.
* search.cc (friend_accessible_p): Likewise.
gcc/testsuite/ChangeLog:
* g++.dg/template/access37.C: Now OK.
* g++.dg/template/friend69.C: Now OK.
* g++.dg/lookup/friend23.C: New test.
While I was backporting the patch for PR102300, it occurred to me that it
would be cleaner to look through the injected-class-name earlier in the
function. I don't think this changes any test results.
gcc/cp/ChangeLog:
* parser.cc (cp_parser_template_name): Look through
injected-class-name.
My patch for 105191 made us use build_value_init more frequently from
build_vec_init_expr, but build_value_init doesn't like to be called to
initialize a class in a template. That's caused trouble in the past, and
seems like a strange restriction, so let's fix it.
PR c++/105589
PR c++/105191
PR c++/92385
gcc/cp/ChangeLog:
* init.cc (build_value_init): Handle class in template.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/initlist-array16.C: New test.
This was fixed by r258755:
PR c++/81311 - wrong C++17 overload resolution.
PR c++/81952
gcc/testsuite/ChangeLog:
* g++.dg/overload/conv-op4.C: New test.
The exporter relies on sorting interface parse methods. It would sort
them as it encountered interface types. However, when an interface
type is an element of a struct or array type, the exporter might
encounter that interface type before sorting the parse methods. If it
then encountered an identical interface type again, it could get
confused about whether the two types are identical or not.
Fix the problem by always sorting the parse methods in the
finalize_methods pass.
Also firm up the export type sorting to make sure we never have this
kind of confusion again. Doing this revealed that we need to be more
careful about sorting in order to handle aliases correctly.
Also fix the interface type hash computation to use the right hash
value when looking at parse methods rather than all methods.
The test case for this is https://go.dev/cl/405759.
Fixes golang/go#52841
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/405556
This patch improves support for vector equality and inequality of
V1TImode vectors, and V2DImode vectors with sse2 but not sse4.
Consider the three functions below:
typedef unsigned int uv4si __attribute__ ((__vector_size__ (16)));
typedef unsigned long long uv2di __attribute__ ((__vector_size__ (16)));
typedef unsigned __int128 uv1ti __attribute__ ((__vector_size__ (16)));
uv4si eq_v4si(uv4si x, uv4si y) { return x == y; }
uv2di eq_v2di(uv2di x, uv2di y) { return x == y; }
uv1ti eq_v1ti(uv1ti x, uv1ti y) { return x == y; }
These all perform vector comparisons of 128bit SSE2 registers, generating
the result as a vector, where ~0 (all 1 bits) represents true and a zero
represents false. eq_v4si is trivially implemented by x86_64's pcmpeqd
instruction. This patch improves the other two cases:
For v2di, gcc -O2 currently generates:
movq %xmm0, %rdx
movq %xmm1, %rax
movdqa %xmm0, %xmm2
cmpq %rax, %rdx
movhlps %xmm2, %xmm3
movhlps %xmm1, %xmm4
sete %al
movq %xmm3, %rdx
movzbl %al, %eax
negq %rax
movq %rax, %xmm0
movq %xmm4, %rax
cmpq %rax, %rdx
sete %al
movzbl %al, %eax
negq %rax
movq %rax, %xmm5
punpcklqdq %xmm5, %xmm0
ret
but with this patch we now generate:
pcmpeqd %xmm0, %xmm1
pshufd $177, %xmm1, %xmm0
pand %xmm1, %xmm0
ret
where the results of a V4SI comparison are shuffled and bit-wise ANDed
to produce the desired result. There's no change in the code generated
for "-O2 -msse4" where the compiler generates a single "pcmpeqq" insn.
For V1TI mode, the results are equally dramatic, where the current -O2
output looks like:
movaps %xmm0, -40(%rsp)
movq -40(%rsp), %rax
movq -32(%rsp), %rdx
movaps %xmm1, -24(%rsp)
movq -24(%rsp), %rcx
movq -16(%rsp), %rsi
xorq %rcx, %rax
xorq %rsi, %rdx
orq %rdx, %rax
sete %al
xorl %edx, %edx
movzbl %al, %eax
negq %rax
adcq $0, %rdx
movq %rax, %xmm2
negq %rdx
movq %rdx, -40(%rsp)
movhps -40(%rsp), %xmm2
movdqa %xmm2, %xmm0
ret
with this patch we now generate:
pcmpeqd %xmm0, %xmm1
pshufd $177, %xmm1, %xmm0
pand %xmm1, %xmm0
pshufd $78, %xmm0, %xmm1
pand %xmm1, %xmm0
ret
performing a V2DI comparison, followed by a shuffle and pand, and with
-O2 -msse4 we take advantage of SSE4.1's pcmpeqq:
pcmpeqq %xmm0, %xmm1
pshufd $78, %xmm1, %xmm0
pand %xmm1, %xmm0
ret
2022-05-13 Roger Sayle <roger@nextmovesoftware.com>
Uroš Bizjak <ubizjak@gmail.com>
gcc/ChangeLog
* config/i386/sse.md (vec_cmpeqv2div2di): Enable for TARGET_SSE2.
For !TARGET_SSE4_1, expand as a V4SI vector comparison, followed
by a pshufd and pand.
(vec_cmpeqv1tiv1ti): New define_expand implementing V1TImode
vector equality as a V2DImode vector comparison (see above),
followed by a pshufd and pand.
gcc/testsuite/ChangeLog
* gcc.target/i386/sse2-v1ti-veq.c: New test case.
* gcc.target/i386/sse2-v1ti-vne.c: New test case.
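To see why the shuffle-and-AND sequences are correct, the instructions can be modelled portably on four 32-bit lanes. This is only a scalar model written for illustration (the type V4 and the helper names are assumptions); it mirrors pcmpeqd, pshufd and pand as used in the sequences above, with immediate 177 swapping adjacent 32-bit lanes and immediate 78 swapping the two 64-bit halves.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Models a 128-bit register as four 32-bit lanes, lane 0 least significant.
using V4 = std::array<std::uint32_t, 4>;

// Models pcmpeqd: a lane becomes all ones when equal, zero otherwise.
V4 cmpeq32(const V4& a, const V4& b) {
  V4 r{};
  for (std::size_t i = 0; i < 4; ++i)
    r[i] = (a[i] == b[i]) ? 0xFFFFFFFFu : 0u;
  return r;
}

// Models pshufd: result lane i takes source lane (imm >> 2*i) & 3.
V4 shuffle32(const V4& v, unsigned imm) {
  assert(imm < 256);
  V4 r{};
  for (std::size_t i = 0; i < 4; ++i)
    r[i] = v[(imm >> (2 * i)) & 3u];
  return r;
}

// Models pand.
V4 and128(const V4& a, const V4& b) {
  V4 r{};
  for (std::size_t i = 0; i < 4; ++i)
    r[i] = a[i] & b[i];
  return r;
}

// 64-bit lane equality: both 32-bit halves of a lane must compare equal.
V4 eq_v2di(const V4& x, const V4& y) {
  const V4 c = cmpeq32(x, y);
  return and128(c, shuffle32(c, 177));
}

// 128-bit equality: both 64-bit halves must compare equal.
V4 eq_v1ti(const V4& x, const V4& y) {
  const V4 c = eq_v2di(x, y);
  return and128(c, shuffle32(c, 78));
}
```

With x and y differing only in the top 32-bit lane, eq_v2di leaves the low 64-bit lane all ones and clears the high one, while eq_v1ti clears the whole register.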
A few tests need not be restricted to 'lp64', so remove the restriction.
A few of those need a simple change to the DejaGnu directives to suppress
'-mcmodel' flags for '-m32'.
2022-05-13 Paul A. Clarke <pc@us.ibm.com>
gcc/testsuite
* g++.target/powerpc/pr65240-1.C: Adjust DejaGnu directives.
* g++.target/powerpc/pr65240-2.C: Likewise.
* g++.target/powerpc/pr65240-3.C: Likewise.
* g++.target/powerpc/pr65240-4.C: Likewise.
* g++.target/powerpc/pr65242.C: Likewise.
* g++.target/powerpc/pr67211.C: Likewise.
* g++.target/powerpc/pr69667.C: Likewise.
* g++.target/powerpc/pr71294.C: Likewise.
This patch implements the missed optimization enhancement PR 83907,
by handling memset with a constant byte value in tree-ssa's strlen
optimization pass. Effectively, this treats memset(dst,'x',3) as
it would memcpy(dst,"xxx",3).
This patch also includes a tweak to handle_store to address another
missed optimization observed in the related test case pr83907-2.c.
The consecutive byte stores to memory get coalesced into a vector
write of a vector const, but unfortunately tree-ssa-strlen's
handle_store didn't previously handle the (unusual) case where the
stored "string" starts with a zero byte but also contains non-zero
bytes.
2022-05-13 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
PR tree-optimization/83907
* tree-ssa-strlen.cc (handle_builtin_memset): Record a strinfo
for memset with a constant char value.
(handle_store): Improved handling of stores with a first byte
of zero, but not storing_all_zeros_p.
gcc/testsuite/ChangeLog
PR tree-optimization/83907
* gcc.dg/tree-ssa/pr83907-1.c: New test case.
* gcc.dg/tree-ssa/pr83907-2.c: New test case.
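The kind of source this targets looks roughly like the following. These functions are illustrative assumptions and are not copied from the new test cases; whether a given compiler version folds the strlen calls to constants depends on the optimization level and is not asserted here, only the run-time results are.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// memset with a constant byte: the buffer is known to hold "xxx".
std::size_t len_after_memset() {
  char buf[4];
  std::memset(buf, 'x', 3);
  buf[3] = '\0';
  const std::size_t n = std::strlen(buf);
  assert(n == 3);  // a strlen pass tracking the memset could fold this
  return n;
}

// The memcpy form that the memset above is treated like.
std::size_t len_after_memcpy() {
  char buf[4];
  std::memcpy(buf, "xxx", 3);
  buf[3] = '\0';
  return std::strlen(buf);
}

// A stored "string" whose first byte is zero but which is not all zeros.
std::size_t len_leading_zero() {
  const char buf[4] = {'\0', 'a', 'b', '\0'};
  return std::strlen(buf);
}
```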
The Zbb support has introduced ctz and clz to the backend, but some
transformations in GCC need to know what the value of c[lt]z at zero
is. This affects how the optab is generated and may suppress use of
CLZ/CTZ in tree passes.
Among other things, this is needed for the transformation of
table-based ctz-implementations, such as in deepsjeng, to work
(see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90838).
Prior to this change, the test case from PR90838 would compile to
the following on RISC-V targets with Zbb:
myctz:
lui a4,%hi(.LC0)
ld a4,%lo(.LC0)(a4)
neg a5,a0
and a5,a5,a0
mul a5,a5,a4
lui a4,%hi(.LANCHOR0)
addi a4,a4,%lo(.LANCHOR0)
srli a5,a5,58
sh2add a5,a5,a4
lw a0,0(a5)
ret
After this change, we get:
myctz:
ctz a0,a0
andi a0,a0,63
ret
Testing this with deepsjeng_r (from SPEC 2017) against QEMU, this
shows a clear reduction in dynamic instruction count:
- before 1961888067076
- after 1907928279874 (2.75% reduction)
This also merges the various target-specific test-cases (for x86-64,
aarch64 and riscv) within gcc.dg/pr90838.c.
This extends the macros (i.e., effective-target keywords) used in
testing (lib/target-supports.exp) to reliably distinguish between RV32
and RV64 via __riscv_xlen (i.e., the integer register bitwidth):
testing for ILP32 could be misleading (as ILP32 is a valid memory
model for 64bit systems).
gcc/ChangeLog:
* config/riscv/riscv.h (CLZ_DEFINED_VALUE_AT_ZERO): Implement.
(CTZ_DEFINED_VALUE_AT_ZERO): Same.
* doc/sourcebuild.texi: Add documentation for RISC-V specific
test target keywords.
gcc/testsuite/ChangeLog:
* gcc.dg/pr90838.c: Add additional flags (dg-additional-options)
when compiling for riscv64 and subsume gcc.target/aarch64/pr90838.c
and gcc.target/i386/pr95863-2.c.
* gcc.target/aarch64/pr90838.c: Removed.
* gcc.target/i386/pr95863-2.c: Removed.
* lib/target-supports.exp: Recognize RV32 or RV64 via XLEN
Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Signed-off-by: Manolis Tsamis <manolis.tsamis@vrull.eu>
Co-authored-by: Manolis Tsamis <manolis.tsamis@vrull.eu>
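For readers unfamiliar with table-based ctz implementations, here is a 32-bit sketch of the idiom together with a model of why a mask can follow the ctz instruction. It is an illustration under assumptions: the PR90838 case shown above is a 64-bit variant, the function names are hypothetical, and ctz_width_at_zero merely models an instruction whose result for a zero input is the operand width.

```cpp
#include <cassert>
#include <cstdint>

// Table-based count of trailing zeros (32-bit de Bruijn multiply).
// For x == 0 the index is 0, so the result is table[0] == 0.
int ctz_table(std::uint32_t x) {
  static const unsigned char table[32] = {
      0,  1,  28, 2,  29, 14, 24, 3,  30, 22, 20, 15, 25, 17, 4,  8,
      31, 27, 13, 23, 21, 19, 16, 7,  26, 12, 18, 6,  11, 5,  10, 9};
  const std::uint32_t lowest_bit = x & (0u - x);  // isolate lowest set bit
  return table[(lowest_bit * 0x077CB531u) >> 27];
}

// Models a ctz instruction that is defined at zero and yields the
// operand width (32 in this model) for a zero input.
int ctz_width_at_zero(std::uint32_t x) {
  if (x == 0)
    return 32;
  int n = 0;
  while ((x & 1u) == 0) {
    x >>= 1;
    ++n;
  }
  assert(n < 32);
  return n;
}

// ctz followed by a mask: the zero input maps to 0, as in the table version.
int ctz_masked(std::uint32_t x) { return ctz_width_at_zero(x) & 31; }
```

Knowing the value at zero is what lets the table lookup be replaced for every input, including zero, which is the role the 63 mask plays in the 64-bit code above.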
The non-member swap for std::exception_ptr is in a nested namespace and
so can only be found by ADL currently. Add a using-declaration so that
qualified std::swap calls will use the std::exception_ptr::swap member,
instead of the generic std::swap.
There's no new test for this, because the generic std::swap works, it
just does more work than is necessary.
Also tell Doxygen to replace the __exception_ptr namespace with
"__unspecified__" in the generated docs, so the real name is not
documented.
libstdc++-v3/ChangeLog:
* doc/doxygen/user.cfg.in (PREDEFINED): Replace __exception_ptr
with "__unspecified__".
* libsupc++/exception_ptr.h: Improve doxygen docs.
(__exception_ptr::swap): Also declare in namespace std.
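From the user's side nothing changes in behaviour: a qualified std::swap on two std::exception_ptr objects compiles and exchanges them either way, and only the overload that ends up being called differs. A small sketch, with hypothetical helper names, that checks the observable result:

```cpp
#include <cassert>
#include <exception>
#include <utility>

// Returns the int carried by p, or -1 if p carries something else.
int int_payload(const std::exception_ptr& p) {
  assert(p != nullptr);
  try {
    std::rethrow_exception(p);
  } catch (int value) {
    return value;
  } catch (...) {
    return -1;
  }
}

// Swaps two exception_ptr objects with a qualified std::swap call and
// reports whether their payloads were exchanged.
bool swap_exchanges_payloads() {
  std::exception_ptr a = std::make_exception_ptr(1);
  std::exception_ptr b = std::make_exception_ptr(2);
  std::swap(a, b);  // qualified call, so ADL is not involved
  return int_payload(a) == 2 && int_payload(b) == 1;
}
```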
This allows std::rethrow_if_nested to work with -fno-rtti by not
attempting the dynamic_cast if it requires RTTI, since that's ill-formed
with -fno-rtti. The cast will still work if a static upcast to
std::nested_exception is allowed.
Also use if-constexpr to avoid the compile-time overload resolution (and
SFINAE) and run-time dispatching for std::rethrow_if_nested and
std::throw_with_nested.
Also add better doxygen comments throughout the file.
libstdc++-v3/ChangeLog:
* libsupc++/nested_exception.h (throw_with_nested) [C++17]: Use
if-constexpr instead of tag dispatching.
(rethrow_if_nested) [C++17]: Likewise.
(rethrow_if_nested) [!__cpp_rtti]: Do not use dynamic_cast if it
would require RTTI.
* testsuite/18_support/nested_exception/rethrow_if_nested-term.cc:
New test.
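As a reminder of how the two functions are used together, the sketch below wraps an exception with std::throw_with_nested and unwinds the chain with std::rethrow_if_nested. It uses only the standard interface; the helper names are hypothetical, and it is built with RTTI enabled, so it does not exercise the -fno-rtti path discussed above.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>
#include <vector>

// Appends e.what() and then the messages of any nested exceptions.
void collect_messages(const std::exception& e, std::vector<std::string>& out) {
  out.push_back(e.what());
  try {
    std::rethrow_if_nested(e);  // no-op when e carries no nested exception
  } catch (const std::exception& inner) {
    collect_messages(inner, out);
  }
}

// Throws "inner", wraps it in "outer", and returns the unwound messages.
std::vector<std::string> nested_messages() {
  std::vector<std::string> out;
  try {
    try {
      throw std::runtime_error("inner");
    } catch (...) {
      std::throw_with_nested(std::runtime_error("outer"));
    }
  } catch (const std::exception& e) {
    collect_messages(e, out);
  }
  assert(!out.empty());
  return out;
}
```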
When folding, the LHS has not been set, so we should be checking the type of
op1. We should also make sure op1 is not undefined.
PR tree-optimization/105597
gcc/
* range-op.cc (operator_minus::lhs_op1_relation): Use op1 instead
of the lhs and make sure it is not undefined.
gcc/testsuite/
* gcc.dg/pr105597.c: New.