This patch extends the earlier and;cmp to not;test optimization to also
perform this transformation for TImode on TARGET_64BIT and DImode on -m32.
One motivation for this is that it's a step towards fixing the current failure
of gcc.target/i386/pr65105-5.c on -m32.
A more direct benefit for x86_64 is that the following code:
int foo(__int128 x, __int128 y)
{
return (x & y) == y;
}
improves with -O2 from 15 instructions:
movq %rdi, %r8
movq %rsi, %rax
movq %rax, %rdi
movq %r8, %rsi
movq %rdx, %r8
andq %rdx, %rsi
andq %rcx, %rdi
movq %rsi, %rax
movq %rdi, %rdx
xorq %r8, %rax
xorq %rcx, %rdx
orq %rdx, %rax
sete %al
movzbl %al, %eax
ret
to the slightly better 13 instructions:
movq %rdi, %r8
movq %rsi, %rax
movq %r8, %rsi
movq %rax, %rdi
notq %rsi
notq %rdi
andq %rdx, %rsi
andq %rcx, %rdi
movq %rsi, %rax
orq %rdi, %rax
sete %al
movzbl %al, %eax
ret
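The rewrite rests on a simple bitwise identity: (x & y) == y holds exactly
when y has no bit set outside x, i.e. when (~x & y) == 0. The following is a
minimal sketch of that equivalence, not the splitter itself. It uses 64-bit
operands instead of __int128 for portability, and the function names are
made up for illustration.

```c
#include <stdint.h>

/* Original form: and, then compare against y. */
static int subset_and_cmp(uint64_t x, uint64_t y)
{
    return (x & y) == y;
}

/* Rewritten form: and-not, then test against zero. */
static int subset_not_test(uint64_t x, uint64_t y)
{
    return (~x & y) == 0;
}

/* Count disagreements over all pairs of 8-bit patterns, each replicated
   across the 64-bit word.  Returns 0 if the two forms always agree. */
static unsigned subset_mismatches(void)
{
    unsigned bad = 0;
    for (unsigned a = 0; a < 256; a++)
        for (unsigned b = 0; b < 256; b++) {
            uint64_t x = a * UINT64_C(0x0101010101010101);
            uint64_t y = b * UINT64_C(0x0101010101010101);
            bad += subset_and_cmp(x, y) != subset_not_test(x, y);
        }
    return bad;
}
```

Since the identity is purely bitwise, checking every combination of single
bits is sufficient; the replicated byte patterns cover those combinations.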
2022-07-05 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* config/i386/i386.cc (ix86_rtx_costs) <COMPARE>: Provide costs
for double word comparisons and tests (comparisons against zero).
* config/i386/i386.md (*test<mode>_not_doubleword): Split DWI
and;cmp into andn;cmp $0 as a pre-reload splitter.
(*andn<dwi>3_doubleword_bmi): Use <dwi> instead of <mode> in name.
(*<any_or><dwi>3_doubleword): Likewise.
gcc/testsuite/ChangeLog
* gcc.target/i386/testnot-3.c: New test case.
This patch is a follow-up to Hongtao's fix for PR target/105854. That
fix is perfectly correct, but what caught my eye was why the compiler
was generating a shift by zero at all. Digging deeper, it
turns out that we can easily optimize __builtin_ia32_palignr for
alignments of 0 and 64 respectively, which may be simplified to moves
of the highpart and lowpart respectively.
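As a rough scalar model of why these two shift counts degenerate to moves,
the sketch below concatenates two 64-bit values, shifts the 128-bit result
right by a bit count, and keeps the low 64 bits. The function and the
operand names first/second are this sketch's own assumptions and are not
meant to mirror the operand order of the builtin or of the md pattern.

```c
#include <stdint.h>

/* Scalar model of a 64-bit palignr: form the 128-bit value first:second
   (first in the upper half), shift it right by `bits`, and return the low
   64 bits.  Counts of 0 and 64 are handled as plain moves; they also have
   to be special-cased because shifting a 64-bit value by 64 is undefined
   in C. */
static uint64_t palignr64_model(uint64_t first, uint64_t second, unsigned bits)
{
    if (bits == 0)
        return second;                  /* nothing shifted: a plain move */
    if (bits == 64)
        return first;                   /* whole lower half shifted out */
    if (bits < 64)
        return (second >> bits) | (first << (64 - bits));
    if (bits < 128)
        return first >> (bits - 64);
    return 0;                           /* everything shifted out */
}
```

Only the 0 and 64 cases return an input unchanged, which is what makes them
candidates for a single move (or no instruction at all when the registers
already match).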
After adding optimizations to simplify the 64-bit DImode palignr, I
started to add the corresponding optimizations for vpalignr (i.e.
128-bit). The first oddity is that sse.md uses TImode and a special
SSESCALARMODE iterator, rather than V1TImode, and indeed the comment
above SSESCALARMODE hints that this should be "dropped in favor of
VIMAX_AVX2_AVX512BW". Hence this patch includes the migration of
<ssse3_avx2>_palignr<mode> to use VIMAX_AVX2_AVX512BW, basically
using V1TImode instead of TImode for 128-bit palignr.
This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-,32},
with no new failures. Ok for mainline?
2022-07-05 Roger Sayle <roger@nextmovesoftware.com>
Hongtao Liu <hongtao.liu@intel.com>
gcc/ChangeLog
* config/i386/i386-builtin.def (__builtin_ia32_palignr128): Change
CODE_FOR_ssse3_palignrti to CODE_FOR_ssse3_palignrv1ti.
* config/i386/i386-expand.cc (expand_vec_perm_palignr): Use V1TImode
and gen_ssse3_palignrv1ti instead of TImode.
* config/i386/sse.md (SSESCALARMODE): Delete.
(define_mode_attr ssse3_avx2): Handle V1TImode instead of TImode.
(<ssse3_avx2>_palignr<mode>): Use VIMAX_AVX2_AVX512BW as a mode
iterator instead of SSESCALARMODE.
(ssse3_palignrdi): Optimize cases where operands[3] is 0 or 64,
using a single move instruction (if required).
gcc/testsuite/ChangeLog
* gcc.target/i386/ssse3-palignr-2.c: New test case.
This patch addresses PR rtl-optimization/96692 on x86_64, by providing
a set of combine splitters to convert the three operation ((A|B)^C)^D
into a two operation sequence using andn when either A or B is the same
register as C or D. This is essentially a reassociation problem that's
only a win if the target supports an and-not instruction (as with -mbmi).
Hence for the new test case:
int f(int a, int b, int c)
{
return (a ^ b) ^ (a | c);
}
GCC on x86_64-pc-linux-gnu with -O2 -mbmi would previously generate:
xorl %edi, %esi
orl %edx, %edi
movl %esi, %eax
xorl %edi, %eax
ret
but with this patch now generates:
andn %edx, %edi, %eax
xorl %esi, %eax
ret
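The two-instruction sequence computes (~a & c) ^ b, which equals the
original expression because a | c can be written as a ^ (~a & c), so the two
occurrences of a cancel under xor. A small sketch that checks this identity
(the function names are hypothetical, and unsigned operands are used to keep
the bit manipulation well defined):

```c
#include <stdint.h>

/* The expression from the test case: three operations. */
static uint32_t f_original(uint32_t a, uint32_t b, uint32_t c)
{
    return (a ^ b) ^ (a | c);
}

/* The reassociated form: one and-not plus one xor. */
static uint32_t f_andn(uint32_t a, uint32_t b, uint32_t c)
{
    return (~a & c) ^ b;
}

/* Compare both forms over all 4-bit values of a, b and c.  The identity
   is bitwise, so this covers every per-bit combination.  Returns the
   number of disagreements. */
static unsigned andn_mismatches(void)
{
    unsigned bad = 0;
    for (uint32_t a = 0; a < 16; a++)
        for (uint32_t b = 0; b < 16; b++)
            for (uint32_t c = 0; c < 16; c++)
                bad += f_original(a, b, c) != f_andn(a, b, c);
    return bad;
}
```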
2022-07-05 Roger Sayle <roger@nextmovesoftware.com>
Uroš Bizjak <ubizjak@gmail.com>
gcc/ChangeLog
PR rtl-optimization/96692
* config/i386/i386.md (define_split): Split ((A | B) ^ C) ^ D
as (X & ~Y) ^ Z on target BMI when either C or D is A or B.
gcc/testsuite/ChangeLog
PR rtl-optimization/96692
* gcc.target/i386/bmi-andn-4.c: New test case.
Like macro locations, we only need to emit ordinary location
information for locations emitted into the CMI. This adds a hash table
noting which ordinary lines are needed. These are then sorted and
(sufficiently) adjacent lines are coalesced to a single map. There is
a tradeoff here: allowing greater separation reduces the number of
line maps, but increases the number of locations. It appears that allowing
2 or 3 intervening lines is the sweet spot, and this patch chooses 2.
Compiling a hello-world that #includes <iostream> in its GMF gives a
5-fold reduction in the number of locations, but about a 4-fold increase
in the number of maps. For one of the xtreme-header tests, the number of
locations halves and the number of maps increases 9-fold.
Module interfaces that emit no entities (or macros, if a header-unit),
will now have no location tables.
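The coalescing step and its tradeoff can be illustrated with a simplified
sketch. It works on plain line numbers rather than on location_t ranges, and
the struct and function names are invented for this example, so it should be
read as a model of the idea rather than of the code in module.cc.

```c
#include <stddef.h>

/* Hypothetical span type: a run of source lines covered by one map. */
struct line_span { unsigned first, last; };

/* Coalesce sorted line numbers into spans.  A needed line joins the
   previous span when at most max_gap unneeded lines separate them.
   Returns the number of spans written to out (out needs room for n). */
static size_t coalesce_lines(const unsigned *lines, size_t n,
                             unsigned max_gap, struct line_span *out)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (count > 0 && lines[i] <= out[count - 1].last + max_gap + 1)
            out[count - 1].last = lines[i];
        else {
            out[count].first = out[count].last = lines[i];
            count++;
        }
    }
    return count;
}

/* Total lines covered by the spans, including unneeded lines that were
   swallowed by the permitted gaps. */
static unsigned covered_lines(const struct line_span *spans, size_t n)
{
    unsigned total = 0;
    for (size_t i = 0; i < n; i++)
        total += spans[i].last - spans[i].first + 1;
    return total;
}
```

For the needed lines 1, 2, 5, 9 and 20, a gap of 2 yields three spans
covering seven lines, while a gap of 0 yields four spans covering five
lines: fewer maps in exchange for more locations.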
gcc/cp/
* module.cc
(struct ord_loc_info, ord_loc_traits): New.
(ord_loc_table, ord_loc_remap): New globals.
(struct location_map_info): Delete.
(struct module_state_config): Rename ordinary_loc_align to
loc_range_bits.
(module_for_ordinary_loc): Adjust.
(module_state::note_location): Note ordinary locations,
return bool.
(module_state::write_location): Adjust ordinary location
streaming.
(module_state::read_location): Likewise.
(module_state::write_init_maps): Allocate ord_loc_table.
(module_state::write_prepare_maps): Reimplement ordinary
map preparation.
(module_state::read_prepare_maps): Adjust.
(module_state::write_ordinary_maps): Reimplement.
(module_state::write_macro_maps): Adjust.
(module_state::read_ordinary_maps): Reimplement.
(module_state::write_macros): Adjust.
(module_state::write_config): Adjust.
(module_state::read_config): Adjust.
(module_state::write_begin): Adjust.
(module_state::read_initial): Adjust.
gcc/testsuite/
* g++.dg/modules/loc-prune-1.C: Adjust.
* g++.dg/modules/loc-prune-4.C: New.
* g++.dg/modules/pr98718_a.C: Adjust.
* g++.dg/modules/pr98718_b.C: Adjust.
* g++.dg/modules/pr99072.H: Adjust.
This is another case like PR106182 where for the 2nd testcase in
the bug there are no removed or discovered loops but still changing
loop exits invalidates LC SSA and it is not enough to just scan for
uses in the blocks that changed loop depth. One might argue that
if we'd include former exit destinations we'd pick up the original
LC SSA use but for virtuals on block merging we'd have propagated
those out (while for regular uses we insert copies). CFG cleanup
can also be entered with loops needing fixup so any heuristics
based on loop structure are bound to fail.
PR tree-optimization/106198
* tree-cfgcleanup.cc (repair_loop_structures): Always do a
full LC SSA rewrite but only if any blocks changed loop
depth.
* gcc.dg/pr106198.c: New testcase.
The following removes the now unused per-loop path in LC SSA rewrite.
* tree-ssa-loop-manip.cc (find_uses_to_rename_def): Remove.
(find_uses_to_rename_in_loop): Likewise.
(rewrite_into_loop_closed_ssa_1): Remove loop parameter and
uses.
(rewrite_into_loop_closed_ssa): Adjust.
The code to remove LC PHI nodes in clean_up_loop_closed_phi does not handle
virtual operands because may_propagate_copy generally returns false
for them. The following copies the merge_blocks variant for
dealing with them.
This fixes a missed jump threading in gcc.dg/auto-init-uninit-4.c
which manifests in bogus uninit diagnostics.
PR tree-optimization/106186
* tree-ssa-propagate.cc (clean_up_loop_closed_phi):
Properly handle virtual PHI nodes.
The following properly handles aggregate returns of the const marked
STORE_LANES internal function to update virtual SSA form on-the-fly
rather than relying on a costly virtual SSA rewrite.
PR tree-optimization/106196
* tree-vect-stmts.cc (vect_finish_stmt_generation): Properly
handle aggregate returns of calls for VDEF updates.
* gcc.dg/torture/pr106196.c: New testcase.
The final loop IV use inserted after the loop is not in LC SSA
(and the code inserts unsimplified _2 = _3 - 0 stmts). In particular
since it splits the exit edge when there's a virtual PHI in the
destination it breaks virtual LC SSA form (but likely also
non-virtual).
The following properly inserts LC PHIs instead.
2022-07-04 Richard Biener <rguenther@suse.de>
* tree-vect-loop-manip.cc (vect_set_loop_condition_normal):
Maintain LC SSA.
The array element type for the two_plus_gigs test was mistakenly put in
as int rather than char.
for gcc/testsuite/ChangeLog
* lib/target-supports.exp (check_effective_target_two_plus_gigs):
Fix array element type. Reported by Hans-Peter Nilsson.
On vxworks, in kernel mode, getpid's return type is a pointer type, so
std::to_string on it fails overload resolution. Restore the type cast
from the original patch that suggested adding the pid.
for libstdc++-v3/ChangeLog
* testsuite/util/testsuite_fs.h (nonexistent_path): Convert
the getpid result to an integral type.
Ada 83 packages like Unchecked_Conversion or Text_IO are obsolete since
Ada 95. GNAT now warns about their uses when warnings on obsolescent
features (Annex J) are active.
gcc/ada/
* doc/gnat_ugn/building_executable_programs_with_gnat.rst
(Warning Message Control): Update description of switch -gnatwj.
* gnat_ugn.texi: Regenerate.
* sem_ch10.adb (Analyze_With_Clause): Warn on WITH clauses for
obsolete renamed units; in Ada 83 mode do not consider
predefined renamings to be obsolete.
gcc/testsuite/
* gnat.dg/renaming1.adb: Update WITH clause.
* gnat.dg/renaming1.ads: Likewise.
* gnat.dg/warn29.adb: Likewise.
This patch reverts a fix for a spurious warning for validity checks on
type Long_Float. This fix was dubious (as it was only affecting
Long_Float and not Float) and apparently is no longer needed.
Cleanup related to improved detection of uninitialised scalar objects.
gcc/ada/
* sem_attr.adb (Note_Possible_Modification): Revert a
special-case for validity checks on Long_Float type.
* snames.ads-tmpl (Name_Attr_Long_Float): Remove name added
exclusively for the mentioned fix.
Formal parameters have their flag Never_Set_In_Source set at the
beginning of Process_Formals routine (regardless of the parameter mode).
There is no need to set it again when Process_Formals calls
Set_Formal_Mode (for parameters of mode IN OUT and OUT).
Code cleanup related to improved detection of uninitialised objects;
behaviour is unaffected.
gcc/ada/
* sem_ch6.adb (Set_Formal_Mode): Remove unnecessary setting of
Never_Set_In_Source.
Code cleanup related to improved detection of uninitialised objects;
semantics is unaffected.
gcc/ada/
* sem_ch6.adb (Process_Formals): Avoid repeated calls to
Expression.
The implementation of __gnat_full_name uses the CRTL realpath; however,
this function returns a null string, so use the default implementation
instead.
gcc/ada/
* cstreams.c (__gnat_full_name) [QNX]: Remove block.
Renaming of an object of ghost type leads to a spurious error. Now
fixed.
gcc/ada/
* ghost.adb (Is_OK_Ghost_Context): Detect ghost type inside object
renaming.
This patch cleans up some code issues found while working on
finalization, and adds some debugging aids.
gcc/ada/
* exp_ch7.adb: Change two constants Is_Protected_Body and
Is_Prot_Body to be Is_Protected_Subp_Body; these are not true
for protected bodies, but for protected subprogram bodies.
(Expand_Cleanup_Actions): No need to search for
Activation_Chain_Entity; just use Activation_Chain_Entity.
* sem_ch8.adb (Find_Direct_Name): Use Entyp constant.
* atree.adb, atree.ads, atree.h, nlists.adb, nlists.ads
(Parent): Provide nonoverloaded versions of Parent, so that they
can be easily found in the debugger.
* debug_a.adb, debug_a.ads: Clarify that we're talking about the
-gnatda switch; switches are case sensitive. Print out the
Chars field if appropriate, which makes it easier to find things
in the output.
(Debug_Output_Astring): Simplify. Also fix an off-by-one
bug ("for I in Vbars'Length .." should have been "for I in
Vbars'Length + 1 .."). Before, it was printing Debug_A_Depth +
1 '|' characters if Debug_A_Depth > Vbars'Length.
When analysing pragma Thread_Local_Storage its argument is analysed by
the call to Check_Arg_Is_Library_Level_Local_Name. There is no need to
reanalyse it. Code cleanup; behaviour is not affected.
gcc/ada/
* sem_prag.adb (Analyze_Pragma): Remove unnecessary call to
Analyze.
Opportunity for extra annotations spotted while fixing detection of
unreachable code that follows calls to procedures annotated with
No_Return.
gcc/ada/
* libgnat/g-socket.adb (Raise_Host_Error): Add No_Return aspect.
(Raise_GAI_Error): Likewise.
* libgnat/g-socket.ads (Raise_Socket_Error): Likewise.
Code cleanup related to examining uses of Check_Unset_Reference for
improved detection of uninitialised scalar objects. Semantics is
unaffected.
gcc/ada/
* sem_util.adb (Aggregate_Constraint_Checks): Fix whitespace;
refactor repeated code; replace a ??? comment with an
explanation based on the comment for the routine spec.
Flag May_Be_Modified underwent a series of renamings between 1996 and
2002. It was changed to Not_Assigned, then to Not_Source_Assigned and
finally to Never_Set_In_Source. Fix remaining references in comments.
gcc/ada/
* sem_util.ads (Note_Possible_Modification): Fix occurrence of
May_Be_Modified in comment.
* sem_warn.ads (Check_Unset_Reference): Fix occurrence of
Not_Assigned in comment.
When using a qualified name such as Pack.Func as the prefix of a 'Result
attribute reference, the prefix is not fully resolved and may contain a
chain of homonyms. Look for the expected function in the homonym chain
instead of issuing an error if the first one is not the expected one.
gcc/ada/
* sem_attr.adb (Analyze_Attribute): Take into account the
possibility of homonyms.
The rewriting as renaming optimization for object declarations is done
partly during analysis, guarded with Expander_Active, and partly during
expansion, so it makes sense to do it entirely during expansion.
This merges the two cases and removes obsolete or unnecessary conditions
guarding the transformation in the process.
gcc/ada/
* exp_ch3.adb (Expand_N_Object_Declaration): Rewrite as a renaming
for any nonaliased local object with nominal unconstrained subtype
originally initialized with the result of a function call that has
been rewritten as the dereference of a reference to the result.
* sem_ch3.adb (Analyze_Object_Declaration): Do not do it here.
To help the bootstrap path, we want to keep the compiler free from any
exception propagation during bootstrap. This has been broken recently in
various places.
Also introduce a way to more easily detect such breakage via the
-DNO_EXCEPTION_PROPAGATION which can now be used as part of BOOT_CFLAGS.
gcc/ada/
* exp_imgv.adb (Build_Enumeration_Image_Tables): Also disable
perfect hash in GNAT_Mode.
* raise-gcc.c (__gnat_Unwind_RaiseException): Add support for
disabling exception propagation.
* sem_eval.adb (Compile_Time_Known_Value): Update comment and
remove wrong call to Check_Error_Detected.
* sem_prag.adb (Check_Loop_Pragma_Grouping, Analyze_Pragma):
Remove exception propagation during bootstrap.
The implementation of the build-in-place return protocol for functions
whose result type is an unconstrained array type generates dangling
references to local bounds built on the stack for the result as soon as
these bounds are not static. The reason is that the implementation
treats the return object, either explicitly present in the source or
synthesized by the compiler, as a regular constrained object until very
late in the game, although it needs to be ultimately rewritten as the
renaming of the dereference of an allocator with unconstrained designated
type in order for the bounds to be part of the allocation.
Recently a partial fix was implemented for the case where the result is an
aggregate, by preventing the return object from being expanded after it has
been analyzed. However, it does not work for the general case of extended
return statements, because the statements therein are still analyzed with
the constrained version of the return object so, after it is changed into
the unconstrained renaming, this yields (sub)type mismatches.
Therefore this change goes the other way around: it rolls back the partial
fix and instead performs the transformation of the return object into the
unconstrained renaming during the expansion of its declaration, in other
words before statements referencing it, if any, are analyzed, thus ensuring
that they see the final version of the object.
gcc/ada/
* exp_aggr.adb (Expand_Array_Aggregate): Remove obsolete code.
Delay the expansion of aggregates initializing return objects of
build-in-place functions.
* exp_ch3.ads (Ensure_Activation_Chain_And_Master): Delete.
* exp_ch3.adb (Ensure_Activation_Chain_And_Master): Fold back to...
(Expand_N_Object_Declaration): ...here.
Perform the expansion of return objects of build-in-place functions
here instead of...
* exp_ch6.ads (Is_Build_In_Place_Return_Object): Declare.
* exp_ch6.adb (Expand_N_Extended_Return_Statement): ...here.
(Is_Build_In_Place_Result_Type): Alphabetize.
(Is_Build_In_Place_Return_Object): New predicate.
* exp_ch7.adb (Enclosing_Function): Delete.
(Process_Object_Declaration): Tidy up handling of return objects.
* sem_ch3.adb (Analyze_Object_Declaration): Do not decorate and
freeze the actual type if it is the same as the nominal type.
* sem_ch6.adb: Remove use and with clauses for Exp_Ch3.
(Analyze_Function_Return): Analyze again all return objects.
(Create_Extra_Formals): Do not force the definition of an Itype
if the subprogram is a compilation unit.
A new warning about unreachable code that follows calls to procedures
with No_Return would flag some dead defensive code. Comments next to
this code suggest that it was added to please some ancient version of
the compiler, but recent releases of GNAT do not require such code.
gcc/ada/
* gnatls.adb (Corresponding_Sdep_Entry): Remove dead return
statement in defensive path; there is another return statement
for a normal execution of this routine, so rule Ada RM 6.5(5),
which requires function to have at least one return statement is
still satisfied.
(Gnatls): Remove dead call to nonreturning Exit_Program after
Output_License_Information which itself does not return.
* libgnat/a-exstat.adb (Bad_EO): Remove raise statement that was
meant to please some ancient version of GNAT.
* libgnat/g-awk.adb (Raise_With_Info): Likewise.
* sem_attr.adb (Check_Reference): Remove dead return statement;
rule Ada RM 6.5(5), which requires function to have at least one
return statement is still satisfied.
(Analyze_Attribute): Remove dead exit statement.
(Check_Reference): Same as above.
* sem_ch12.adb (Instantiate_Formal_Package): Remove dead raise
statement; it was inconsistent with other calls to
Abandon_Instantiation, which are not followed by a raise
statement.
* sem_prag.adb (Process_Convention): Remove dead defensive
assignment.
(Interrupt_State): Remove dead defensive exit statement.
(Do_SPARK_Mode): Likewise.
* sfn_scan.adb (Scan_String): Remove dead defensive assignment.
A new warning about unreachable code that follows calls to procedures
with No_Return would flag many unnecessary return statements. Those
return statements were applied inconsistently, so this patch is
actually more a style cleanup.
gcc/ada/
* sem_attr.adb, sem_prag.adb: Remove dead return statements
after calls to Error_Attr, Error_Pragma, Error_Pragma_Arg and
Placement_Error. All these calls raise exceptions that are
handled to gently recover from errors.
Systematize Word_Size and Memory_Size declarations rather than hard-coding
numerical values or an OS-specific Long_Integer size.
gcc/ada/
* libgnat/system-vxworks-ppc-kernel.ads (Word_Size): Compute
based on Standard'Word_Size.
(Memory_Size): Compute based on Word_Size.
* libgnat/system-vxworks-ppc-rtp-smp.ads: Likewise.
* libgnat/system-vxworks-ppc-rtp.ads: Likewise.
A new warning about unreachable code that follows calls to procedures
with No_Return would flag a clearly unintentional dead call to
Set_Address_Taken in analysis of Code_Address attribute.
This patch resurrects the dead code, which is worth fixing regardless of
the new warning.
gcc/ada/
* sem_attr.adb (Analyze_Attribute): Move call to
Set_Address_Taken so that it is executed when the prefix
attribute is legal.
Cleanup only; behaviour is unaffected.
gcc/ada/
* sem_ch5.adb (Check_Unreachable_Code): Avoid explicit use of
Sloc; this should also help when we finally use Source_Span for
prettier error messages.
Routine Check_Unreachable_Code is only called on nodes belonging to a
list of statements (and it wouldn't make sense to call it on anything
else).
gcc/ada/
* sem_ch5.adb (Check_Unreachable_Code): Remove redundant guard;
the call to Present wasn't needed either.
Code cleanup related to a new detection of uninitialised local scalar
objects; semantics is unaffected.
gcc/ada/
* sem_ch5.adb (Analyze_Block_Statement): Call to List_Length with
No_List is safe and will return zero.
Following a suggestion from Tamar, this patch adds a fallback
implementation of usdot using sdot. Specifically, for 8-bit
input types:
acc_2 = DOT_PROD_EXPR <a_unsigned, b_signed, acc_1>;
becomes:
tmp_1 = DOT_PROD_EXPR <64, b_signed, acc_1>;
tmp_2 = DOT_PROD_EXPR <64, b_signed, tmp_1>;
acc_2 = DOT_PROD_EXPR <a_unsigned - 128, b_signed, tmp_2>;
on the basis that x*y == (x-128)*y + 64*y + 64*y. Doing the two 64*y
operations first should give more time for x to be calculated,
on the off chance that that's useful.
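The arithmetic behind the fallback can be checked per element with a short
scalar sketch. It models a single product term rather than the vector
DOT_PROD_EXPR, and the function names are hypothetical. Subtracting 128
moves the unsigned input into the signed 8-bit range, and the two 64*y terms
add back the 128*y that was removed.

```c
#include <stdint.h>

/* Direct mixed-sign product: unsigned 8-bit x times signed 8-bit y. */
static int32_t usdot_term(uint8_t x, int8_t y)
{
    return (int32_t)x * y;
}

/* Emulation using only signed 8-bit multiplicands, in the same order as
   the three DOT_PROD_EXPRs: two 64*y steps, then (x - 128)*y. */
static int32_t usdot_term_via_sdot(uint8_t x, int8_t y)
{
    int8_t x_biased = (int8_t)((int)x - 128);   /* always in [-128, 127] */
    int32_t acc = 0;
    acc += 64 * (int32_t)y;
    acc += 64 * (int32_t)y;
    acc += (int32_t)x_biased * y;
    return acc;
}

/* Exhaustively compare both forms; returns the number of mismatches. */
static unsigned usdot_mismatches(void)
{
    unsigned bad = 0;
    for (int x = 0; x <= 255; x++)
        for (int y = -128; y <= 127; y++)
            bad += usdot_term((uint8_t)x, (int8_t)y)
                   != usdot_term_via_sdot((uint8_t)x, (int8_t)y);
    return bad;
}
```

The constant 64 is used twice because 128 itself does not fit in a signed
8-bit element.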
gcc/
* tree-vect-patterns.cc (vect_convert_input): Expect the input
type to be signed for optab_vector_mixed_sign. Update the vectype
at the same time as type.
(vect_recog_dot_prod_pattern): Update accordingly. If usdot isn't
available, try sdot instead.
* tree-vect-loop.cc (vect_is_emulated_mixed_dot_prod): New function.
(vect_model_reduction_cost): Model the cost of implementing usdot
using sdot.
(vectorizable_reduction): Likewise. Skip target support test
for lane reductions.
(vect_emulate_mixed_dot_prod): New function.
(vect_transform_reduction): Use it to emulate usdot via sdot.
gcc/testsuite/
* gcc.dg/vect/vect-reduc-dot-9.c: Reduce target requirements
from i8mm to dotprod.
* gcc.dg/vect/vect-reduc-dot-10.c: Likewise.
* gcc.dg/vect/vect-reduc-dot-11.c: Likewise.
* gcc.dg/vect/vect-reduc-dot-12.c: Likewise.
* gcc.dg/vect/vect-reduc-dot-13.c: Likewise.
* gcc.dg/vect/vect-reduc-dot-14.c: Likewise.
* gcc.dg/vect/vect-reduc-dot-15.c: Likewise.
* gcc.dg/vect/vect-reduc-dot-16.c: Likewise.
* gcc.dg/vect/vect-reduc-dot-17.c: Likewise.
* gcc.dg/vect/vect-reduc-dot-18.c: Likewise.
* gcc.dg/vect/vect-reduc-dot-19.c: Likewise.
* gcc.dg/vect/vect-reduc-dot-20.c: Likewise.
* gcc.dg/vect/vect-reduc-dot-21.c: Likewise.
* gcc.dg/vect/vect-reduc-dot-22.c: Likewise.
The testcase shows that when cleaning up the CFG we can end up
with broken LC SSA (for virtual operands with the testcase). The
case here involves deleting a loop after which it is not enough
to scan the blocks with changed loop depth for SSA uses that need
to be rewritten. So make fix_loop_structure return the sum of
the number of new loops and the number of deleted loops.
PR tree-optimization/106182
* loop-init.cc (fix_loop_structure): Return the number
of newly discovered plus the number of deleted loops.
* tree-cfgcleanup.cc (repair_loop_structures): Adjust
variable name.
* gcc.dg/torture/pr106182.c: New testcase.
See gcc/config/newlib-stdint.h, where targets that have
LONG_TYPE_SIZE == 32, get INT32_TYPE defined to "long int".
INT32_TYPE ends up in the target int32_t.
Thus the tests failed for 32-bit newlib targets because the related
warning messages were matched against "aka int", whereas the emitted
messages for these targets have "aka long int".
Tested cris-elf, committed as obvious.
gcc/testsuite:
* gcc.dg/analyzer/allocation-size-1.c,
gcc.dg/analyzer/allocation-size-2.c,
gcc.dg/analyzer/allocation-size-3.c,
gcc.dg/analyzer/allocation-size-4.c,
gcc.dg/analyzer/allocation-size-5.c: Handle int32_t being "long int".
Fortran part to C/C++
commit r13-1002-g03b71406323ddc065b1d7837d8b43b17e4b048b5
gcc/fortran/ChangeLog:
* gfortran.h (gfc_omp_namelist): Update by creating 'linear' struct,
move 'linear_op' as 'op' to id and add 'old_modifier' to it.
* dump-parse-tree.cc (show_omp_namelist): Update accordingly.
* module.cc (mio_omp_declare_simd): Likewise.
* trans-openmp.cc (gfc_trans_omp_clauses): Likewise.
* openmp.cc (resolve_omp_clauses): Likewise; accept new-style
'val' modifier with do/simd.
(gfc_match_omp_clauses): Handle OpenMP 5.2 linear clause syntax.
libgomp/ChangeLog:
* libgomp.texi (OpenMP 5.2): Mark linear-clause change as 'Y'.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/linear-4.c: New test.
* gfortran.dg/gomp/linear-2.f90: New test.
* gfortran.dg/gomp/linear-3.f90: New test.
* gfortran.dg/gomp/linear-4.f90: New test.
* gfortran.dg/gomp/linear-5.f90: New test.
* gfortran.dg/gomp/linear-6.f90: New test.
* gfortran.dg/gomp/linear-7.f90: New test.
* gfortran.dg/gomp/linear-8.f90: New test.
Co-authored-by: Jakub Jelinek <jakub@redhat.com>
The following converts a handful of places that were irange centric.
Tested on x86-64 Linux.
gcc/ChangeLog:
* gimple-range-fold.cc
(fold_using_range::range_of_ssa_name_with_loop_info): Restrict the
call to SCEV for irange supported types.
(fold_using_range::range_of_builtin_int_call): Convert to vrange.
* gimple-range.cc (gimple_ranger::prefill_stmt_dependencies): Same.
* tree-ssa-dom.cc (cprop_operand): Same.
Sorry for the long delay getting back to this, but after deeper
investigation, it turns out that Jeff Law's tingling spider senses
that the original patch wasn't updating everywhere that was required
were spot on. Although my nvptx testing showed no problems with -O2,
compiling the same tests with -O0 found several additional assertion
ICEs (exactly where he'd predicted they'd be).
Here's a revised patch that updates five locations (up from the
previous two). Finding any remaining locations (if any) might be
easier once folks are able to test things on their targets. This
also implements Jeff's suggestion to factor the common code into
helper routines.
2022-07-04 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
PR target/104489
* calls.cc (precompute_register_parameters): Allow promotion
of floating point values to be passed in wider integer modes
by calling new convert_float_to_wider_int.
(expand_call): Allow floating point results to be returned in
wider integer modes by calling new convert_wider_int_to_float.
* cfgexpand.cc (expand_value_return): Allow backends to promote
a scalar floating point return value to a wider integer mode
by calling new convert_float_to_wider_int.
* expr.cc (convert_float_to_wider_int): New function.
(convert_wider_int_to_float): Likewise.
(expand_expr_real_1) <expand_decl_rtl>: Allow backends to promote
scalar FP PARM_DECLs to wider integer modes, by calling new
convert_wider_int_to_float.
* expr.h (convert_modes): Name arguments for improved documentation.
(convert_float_to_wider_int): Prototype new function here.
(convert_wider_int_to_float): Likewise.
* function.cc (assign_parm_setup_stack): Allow floating point
values to be passed on the stack as wider integer modes by
calling new convert_wider_int_to_float.
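As a source-level analogy for what the two new helpers are for, the sketch
below passes a float through a wider integer and recovers it again. The real
helpers operate on RTL and machine modes; the function names here, and the
choice of zero extension, are assumptions of this sketch. It also assumes a
32-bit IEEE 754 float.

```c
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(uint32_t),
               "this sketch assumes a 32-bit float");

/* Reinterpret the float's bits as a same-sized integer, then widen.
   Zero extension is an assumption made for this illustration. */
static uint64_t float_to_wider_int(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return (uint64_t)bits;
}

/* Take the low part of the wide integer and reinterpret it as a float. */
static float wider_int_to_float(uint64_t wide)
{
    uint32_t bits = (uint32_t)wide;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

The round trip is lossless because only the low 32 bits carry the value; the
upper bits are padding introduced by the promotion.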
As the testcase in PR 105860 shows, the code that tries to re-use the
handled_component chains in SRA can be horribly confused by unions,
where it thinks it has found a compatible structure under which it can
chain the references, but in fact it found the type it was looking
for elsewhere in a union and generated a write to a completely wrong
part of an aggregate.
I don't remember whether the plan was to support unions at all in
build_reconstructed_reference but it can work, to an extent, if we
make sure that we start the search only outside the outermost union,
which is what the patch does (and the extra testcase verifies).
gcc/ChangeLog:
2022-07-01 Martin Jambor <mjambor@suse.cz>
PR tree-optimization/105860
* tree-sra.cc (build_reconstructed_reference): Start expr
traversal only just below the outermost union.
gcc/testsuite/ChangeLog:
2022-07-01 Martin Jambor <mjambor@suse.cz>
PR tree-optimization/105860
* gcc.dg/tree-ssa/alias-access-path-13.c: New test.
* gcc.dg/tree-ssa/pr105860.c: Likewise.