gcc/ada/
* exp_ch4.adb (Expand_N_If_Expression): Generate an intermediate
temporary when the expression is a condition in an outer decision
and control-flow optimizations are suppressed.
gcc/ada/
* exp_ch5.adb (Expand_General_Case_Statement.Pattern_Match): Add
new function Indexed_Element to handle array element
comparisons. Handle case choices that are array aggregates,
string literals, or names denoting constants.
* sem_case.adb (Composite_Case_Ops.Array_Case_Ops): New package
providing utilities needed for casing on arrays.
(Composite_Case_Ops.Choice_Analysis): If necessary, include
array length as a "component" (like a discriminant) when
traversing components. We do not (yet) partition choice analysis
to deal with unequal length choices separately. Instead, we
embed everything in the minimum-dimensionality Cartesian product
space needed to handle all choices properly; this is determined
by the length of the longest choice pattern.
(Composite_Case_Ops.Choice_Analysis.Traverse_Discrete_Parts):
Include length as a "component" in the traversal if necessary.
(Composite_Case_Ops.Choice_Analysis.Parse_Choice.Traverse_Choice):
Add support for case choices that are string literals or names
denoting constants.
(Composite_Case_Ops.Choice_Analysis): Include length as a
"component" in the analysis if necessary.
(Check_Choices.Check_Case_Pattern_Choices.Ops.Value_Sets.Value_Index_Count):
Improve error message when capacity exceeded.
* doc/gnat_rm/implementation_defined_pragmas.rst: Update
documentation to reflect current implementation status.
* gnat_rm.texi: Regenerate.
gcc/ada/
* freeze.adb (Check_Component_Storage_Order): Give a specific error
message for non-byte-aligned component in the packed case. Replace
"composite" with "record" in both cases.
In patch r12-3136, niter->control, niter->bound and niter->cmp are
derived from number_of_iterations_lt. For the 'until wrap' condition,
however, the values computed by number_of_iterations_lt do not meet the
requirements of their definitions or those of determine_exit_conditions.
This patch calculates niter->control, niter->bound and niter->cmp in
number_of_iterations_until_wrap.
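For readers unfamiliar with the term, an 'until wrap' exit condition is one that only turns false when an unsigned induction variable overflows. The sketch below is an illustration only, not the test added by this patch: it uses an 8-bit counter so the wrap is cheap to reach, and the helper names count_by_running and count_closed_form are made up for this example.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: run the loop and count how often the body executes.
   The loop continues while n <= ++l, so for n > 0 it stops either at the
   first test or when l wraps from UCHAR_MAX back to 0.  */
unsigned count_by_running(unsigned char n, unsigned char l)
{
    assert(n > 0);              /* n == 0 would never terminate */
    unsigned iterations = 0;
    while (n <= ++l)
        iterations++;
    return iterations;
}

/* Hypothetical closed form for the same count, again assuming n > 0.
   'first' is the value of l seen by the first comparison.  */
unsigned count_closed_form(unsigned char n, unsigned char l)
{
    assert(n > 0);
    unsigned char first = (unsigned char)(l + 1);
    if (n > first)
        return 0;               /* exit test fails straight away */
    /* Body runs for l = first .. UCHAR_MAX, then l wraps to 0.  */
    return (unsigned)UCHAR_MAX - first + 1u;
}
```

With n = 1 and l = 250 on a target with 8-bit char, both helpers count the values 251 through 255, after which l wraps to 0 and the test fails.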
gcc/ChangeLog:
2021-09-22 Jiufu Guo <guojiufu@linux.ibm.com>
PR tree-optimization/102087
* tree-ssa-loop-niter.c (number_of_iterations_until_wrap):
Update bound/cmp/control for niter.
gcc/testsuite/ChangeLog:
2021-09-22 Jiufu Guo <guojiufu@linux.ibm.com>
* gcc.dg/pr102087.c: New test.
PR tree-optimization/102087
We may be asked to fold an artificial statement not in the CFG. Since
there are no outgoing edges from those, avoid calling
register_outgoing_edges.
Tested on x86-64 Linux.
gcc/ChangeLog:
* gimple-range-fold.cc (fold_using_range::range_of_range_op):
Move check for non-empty BB here.
(fur_source::register_outgoing_edges): ...from here.
Cycling through equivalences to improve a range is nowhere near as
efficient as asking the ranger what the range on entry is.
Testing on a hybrid VRP threader shows that this improves our VRP
threading benefit from 14.5% to 18.5% and our overall jump threads from
0.85% to 1.28%.
Tested on x86-64 Linux.
gcc/ChangeLog:
* gimple-range-path.cc (path_range_query::internal_range_of_expr):
Remove call to improve_range_with_equivs.
(path_range_query::improve_range_with_equivs): Remove.
* gimple-range-path.h: Remove improve_range_with_equivs.
Function to generate a simple loop (to be used internally).
Callers will be added in follow-up commits.
gcc/fortran/
* trans-expr.c (gfc_simple_for_loop): New.
* trans.h (gfc_simple_for_loop): New prototype.
Current ubsan complains on every use of __PTR_ALIGN (when ptrdiff_t is
as large as a pointer), due to making calculations relative to a NULL
pointer. This patch avoids the problem by extracting out and
simplifying __BPTR_ALIGN for the usual case. I've continued to use
ptrdiff_t here, where it might be better to throw away __BPTR_ALIGN
entirely and just assume uintptr_t exists.
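The idea can be sketched outside obstack.h as follows. This is not the macro from the patch: align_up is a hypothetical helper, it assumes uintptr_t exists (the alternative mentioned above), and mask is an alignment minus one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: round p up to the next address whose low bits
   (those selected by mask) are zero.  The arithmetic is done on the
   integer value of p itself, never on an offset from a null pointer.  */
char *align_up(char *p, size_t mask)
{
    assert((mask & (mask + 1)) == 0);   /* mask must be 2^k - 1 */
    uintptr_t addr = (uintptr_t)p;
    uintptr_t aligned = (addr + mask) & ~(uintptr_t)mask;
    /* Step forward from p so the result stays derived from p.  */
    return p + (aligned - addr);
}
```

Calling align_up(buf + 1, 15) on a sufficiently large buffer returns a pointer at most 15 bytes further on whose address is a multiple of 16.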
* obstack.h (__PTR_ALIGN): Expand and simplify __BPTR_ALIGN
rather than calculating relative to a NULL pointer.
Avoid emitting a strict low part move if the insv target actually
affects the whole target reg.
gcc/ChangeLog:
PR target/102222
* config/s390/s390.c (s390_expand_insv): Emit a normal move if it
is actually a full copy of the source operand into the target.
Don't emit a strict low part move if source and target mode match.
gcc/testsuite/ChangeLog:
* gcc.target/s390/pr102222.c: New test.
I've used the function for omp single expansion for omp scope as well. That is
mostly OK, but as the testcase shows, there is one important difference.
The omp single expansion always has a fallthru body, because during omp
lowering the body is expanded as if wrapped in an if, to simulate that
one thread runs the body while the others wait (unless nowait) until it
completes and then continue. omp scope is invoked by all threads, so if the
body is non-fallthru, the barrier (unless nowait) at the end will not be
reached by any of the threads.
The following patch fixes that by gracefully handling the case where a cfg
pass optimizes away the region's exit bb.
2021-09-22 Jakub Jelinek <jakub@redhat.com>
PR middle-end/102415
* omp-expand.c (expand_omp_single): If region->exit is NULL,
assert region->entry is GIMPLE_OMP_SCOPE region and return.
* c-c++-common/gomp/scope-3.c: New test.
As the allocate-2.c testcase shows, this change isn't 100% backwards compatible:
one could have allocator and/or align functions that return an OpenMP allocator
handle. Previously those functions would be called; now those names are
treated as keywords for the modifiers. But the change allows specifying extra
alignment requirements for the allocations.
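A minimal sketch of the new modifier syntax, assuming a compiler that accepts the OpenMP 5.1 form of the allocate clause; the function scaled_sum and the 64-byte alignment are chosen for illustration only and are not taken from the new tests.

```c
#include <assert.h>

/* Hypothetical example: sum 2*i for i in [0, n).  Each thread gets its
   own 'scratch'; the align modifier requests that the private copy be
   allocated with at least 64-byte alignment.  */
int scaled_sum(int n)
{
    assert(n >= 0);
    int scratch = 0;
    int total = 0;
    #pragma omp parallel for private(scratch) allocate(align(64): scratch) reduction(+: total)
    for (int i = 0; i < n; i++) {
        scratch = 2 * i;
        total += scratch;
    }
    return total;
}
```

When OpenMP is not enabled the pragma is ignored and the loop runs serially with the same result. The compatibility hazard described above arises only if the program also declares its own function named align or allocator and uses it inside an allocate clause.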
2021-09-22 Jakub Jelinek <jakub@redhat.com>
gcc/
* tree.h (OMP_CLAUSE_ALLOCATE_ALIGN): Define.
* tree.c (omp_clause_num_ops): Change number of OMP_CLAUSE_ALLOCATE
arguments from 2 to 3.
* tree-pretty-print.c (dump_omp_clause): Print allocator() around
allocate clause allocator and print align if present.
* omp-low.c (scan_sharing_clauses): Force allocate_map entry even
for omp_default_mem_alloc if align modifier is present. If align
modifier is present, use TREE_LIST to encode both allocator and
align.
(lower_private_allocate, lower_rec_input_clauses, create_task_copyfn):
Handle align modifier on allocator clause if present.
gcc/c-family/
* c-omp.c (c_omp_split_clauses): Copy over OMP_CLAUSE_ALLOCATE_ALIGN.
gcc/c/
* c-parser.c (c_parser_omp_clause_allocate): Parse allocate clause
modifiers.
gcc/cp/
* parser.c (cp_parser_omp_clause_allocate): Parse allocate clause
modifiers.
* semantics.c (finish_omp_clauses) <OMP_CLAUSE_ALLOCATE>: Perform
semantic analysis of OMP_CLAUSE_ALLOCATE_ALIGN.
* pt.c (tsubst_omp_clauses) <case OMP_CLAUSE_ALLOCATE>: Handle
also OMP_CLAUSE_ALLOCATE_ALIGN.
gcc/testsuite/
* c-c++-common/gomp/allocate-6.c: New test.
* c-c++-common/gomp/allocate-7.c: New test.
* g++.dg/gomp/allocate-4.C: New test.
libgomp/
* testsuite/libgomp.c-c++-common/allocate-2.c: New test.
* testsuite/libgomp.c-c++-common/allocate-3.c: New test.
Existing code in the sfp-machine header has been using __MACH__
as a guard for Mach-O, where symbol aliases are currently not
supported.
__MACH__ is not a sufficient guard for this, since the define
is also emitted for HURD, at least.
Fixed by amending the guard to use __APPLE__ instead.
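The shape of such a guard can be sketched as below. The macro EXAMPLE_HAVE_SYMBOL_ALIASES and the accessor function are hypothetical names for this illustration; they are not what sfp-machine.h defines, and treating every non-Apple target as supporting aliases is a simplification.

```c
/* Hypothetical feature macro, named for this example only.  */
#if defined(__APPLE__)
/* Darwin (Mach-O): symbol aliases are not supported.  */
# define EXAMPLE_HAVE_SYMBOL_ALIASES 0
#else
/* Everything else, including Hurd, which also defines __MACH__.  */
# define EXAMPLE_HAVE_SYMBOL_ALIASES 1
#endif

/* Expose the compile-time decision so it can be inspected at run time.  */
int example_have_symbol_aliases(void)
{
    return EXAMPLE_HAVE_SYMBOL_ALIASES;
}
```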
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
libgcc/ChangeLog:
* config/i386/sfp-machine.h: Guard Mach-O-specific code
using __APPLE__.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr92658-avx512f.c: Refine testcase.
* gcc.target/i386/pr92658-avx512vl.c: Adjust scan-assembler,
only v2di->v2qi truncate is not supported, v4di->v4qi should
be supported.
gcc/ChangeLog:
* config/i386/i386.md (cstorehf3): New define_expand.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx512fp16-builtin-fpcompare-1.c: New test.
* gcc.target/i386/avx512fp16-builtin-fpcompare-2.c: New test.
gcc/ChangeLog:
* config/i386/i386.md (<rounding_insn>hf2): New expander.
(sse4_1_round<mode>2): Extend from MODEF to MODEFH.
* config/i386/sse.md (*sse4_1_round<ssescalarmodesuffix>):
Extend from VF_128 to VFH_128.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx512fp16-builtin-round-1.c: New test.
This patch follows the discussion here[1], where Segher suggested
parameterizing those exact magic constants for density heuristics,
to make them easier to tweak if needed.
The change is intended to be "No Functional Change". I verified
it with SPEC2017 at option sets O2-vect and Ofast-unroll on Power8;
the result is neutral, as expected.
[1]https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579121.html
gcc/ChangeLog:
* config/rs6000/rs6000.opt (rs6000-density-pct-threshold,
rs6000-density-size-threshold, rs6000-density-penalty,
rs6000-density-load-pct-threshold,
rs6000-density-load-num-threshold): New parameter.
* config/rs6000/rs6000.c (rs6000_density_test): Adjust with
corresponding parameters.
This change fixes a primordial C++11 front-end defect where function template
redeclarations with trailing return types that used dependent
sizeof/alignof/noexcept expressions in template value arguments failed to
compare as equivalent to the identical primary template declaration. By
forcing structural AST comparison of the template arguments, we no longer
require TYPE_CANONICAL to match in this case. The new canon-type-{15..18}.C
tests failed with all prior GCC versions, where the redeclarations were
incorrectly reported as ambiguous overloads. The new dependent-name{15,16}.C
tests are regression tests for sneaky problems encountered during
development of this fix. Note that this fix does not address the use of parm
objects' constexpr members as template arguments within a declaration (a
superficially similar longstanding defect).
gcc/cp/ChangeLog:
* pt.c (find_parm_usage_r): New walk_tree callback to find func
parms.
(any_template_arguments_need_structural_equality_p): New special
case.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/constexpr-52830.C: Remove unwanted dg-ice.
* g++.dg/template/canon-type-15.C: New test.
* g++.dg/template/canon-type-16.C: New test.
* g++.dg/template/canon-type-17.C: New test.
* g++.dg/template/canon-type-18.C: New test.
* g++.dg/template/dependent-name15.C: New regression test.
* g++.dg/template/dependent-name16.C: New regression test.
In Go 1.17 the gc toolchain changed to set runtime.GOROOT in cmd/link
(previously it was runtime/internal/sys.GOROOT). Do the same in libgo.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/351313
gotools/:
* Makefile.am (check-runtime): Add goroot.go to --extrafiles.
* Makefile.in: Regenerate.
The default behavior for the path solver is to resort to VARYING when
the range for an unknown SSA is outside the given path. This is both
cheap and fast, but misses a significant number of ranges that
the DOM and VRP threaders could traditionally get.
This patch uses the ranger to resolve any unknown names upon entry to
the path. It also uses equivalences to improve ranges.
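As a rough illustration of why the range on entry matters, consider the plain C sketch below. It is not GCC internals, the function classify is made up for this example, and whether GCC actually threads it depends on pass ordering and options.

```c
/* Hypothetical example.  Inside the loop, x is never redefined, so
   anything known about it comes from code before the path begins.  */
int classify(int x)
{
    int kind = 0;
    if (x <= 100)
        return -1;          /* everything below is reached with x > 100 */

    for (int i = 0; i < 2; i++) {
        if (x > 50)         /* always true given the range on entry */
            kind += 1;
        else
            kind += 10;
    }
    return kind;
}
```

A solver that falls back to VARYING for x cannot fold the test x > 50 on a path that starts inside the loop; one that asks for the range of x on entry to the path can.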
gcc/ChangeLog:
* gimple-range-path.cc (path_range_query::defined_outside_path):
New.
(path_range_query::range_on_path_entry): New.
(path_range_query::internal_range_of_expr): Resolve unknowns
with ranger.
(path_range_query::improve_range_with_equivs): New.
(path_range_query::ssa_range_in_phi): Resolve unknowns with
ranger.
* gimple-range-path.h (class path_range_query): Add
defined_outside_path, range_on_path_entry, and
improve_range_with_equivs.
The path solver takes an initial set of SSA names which are deemed
interesting. These are then solved along the path. Adding any copies
of said SSA names to the list of interesting names yields significantly
better results. This patch adds said copies to the already provided
list.
Currently this code is guarded by "m_resolve", which is the more
expensive mode, but it would be reasonable to make it available always,
especially since adding more imports usually has minimal impact on the
processing time. I will investigate and make it universally available
if this is indeed the case.
gcc/ChangeLog:
* gimple-range-path.cc (path_range_query::add_to_imports): New.
(path_range_query::add_copies_to_imports): New.
(path_range_query::precompute_ranges): Call
add_copies_to_imports.
* gimple-range-path.h (class path_range_query): Add prototypes
for add_copies_to_imports and add_to_imports.
This patch adds relational support to the path solver. It uses a
path_oracle that keeps track of relations within a path which are
augmented by relations on entry to the path. With it, range_of_stmt,
range_of_expr, and friends can give relation aware answers.
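A small C sketch of the kind of question a relation-aware answer helps with. The function relate is made up for this example, and the sketch does not claim to show what GCC does with this exact input.

```c
/* Hypothetical example.  Along the path that takes the first branch the
   relation a < b holds, so the later test a >= b is false on that path
   even though neither a nor b has a useful value range.  */
int relate(int a, int b)
{
    int r = 0;
    if (a < b)
        r = 1;
    if (a >= b)
        r += 2;
    return r;
}
```

Ranges alone say nothing here, since both operands are VARYING; only the recorded relation between a and b decides the second condition along each path.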
gcc/ChangeLog:
* gimple-range-fold.h (class fur_source): Make oracle protected.
* gimple-range-path.cc (path_range_query::path_range_query): Add
resolve argument. Initialize oracle.
(path_range_query::~path_range_query): Delete oracle.
(path_range_query::range_of_stmt): Adapt to use relations.
(path_range_query::precompute_ranges): Pre-compute relations.
(class jt_fur_source): New.
(jt_fur_source::jt_fur_source): New.
(jt_fur_source::register_relation): New.
(jt_fur_source::query_relation): New.
(path_range_query::precompute_relations): New.
(path_range_query::precompute_phi_relations): New.
* gimple-range-path.h (path_range_query): Add resolve argument.
Add oracle, precompute_relations, precompute_phi_relations.
* tree-ssa-threadbackward.c (back_threader::back_threader): Pass
resolve argument to solver.
The code registering outgoing edges from a cond lives in
fold_using_range, which makes it difficult to call from other
places. Also, it refuses to register relations on outgoing
destinations that have more than one predecessor. This latter issue is
a problem because we would like to register outgoing edges along a path
in the path solver (regardless of single_pred_p).
gcc/ChangeLog:
* gimple-range-fold.cc (fold_using_range::range_of_range_op):
Rename postfold_gcond_edges to register_outgoing_edges and
adapt.
(fold_using_range::postfold_gcond_edges): Rename...
(fur_source::register_outgoing_edges): ...to this.
* gimple-range-fold.h (postfold_gcond_edges): Rename to
register_outgoing_edges and move to fur_source.
SCEV won't work without dominators and we can get called without
dominators from debug_ranger.
Another option would be to rename scev_initialized_p to something like
scev_available_p and move the check there. For now, this will do.
gcc/ChangeLog:
* gimple-range-fold.cc (fold_using_range::range_of_phi): Check
dom_info_available_p.
Preallocating the space is slightly cheaper than calling
safe_grow_cleared.
gcc/ChangeLog:
* gimple-range-cache.cc (non_null_ref::non_null_ref): Use create
and quick_grow_cleared instead of safe_grow_cleared.
gcc/ada/
* sem_util.adb (Accessibility_Level): Remove spurious special
case for protected type components.
* exp_ch4.adb (Generate_Accessibility_Check): Use general
Accessibility_Level instead of the low-level function
Type_Access_Level.