2013-07-18 20:55:48 +02:00
|
|
|
/* Description of pass structure
|
2022-01-03 10:42:10 +01:00
|
|
|
Copyright (C) 1987-2022 Free Software Foundation, Inc.
|
2013-07-18 20:55:48 +02:00
|
|
|
|
|
|
|
This file is part of GCC.
|
|
|
|
|
|
|
|
GCC is free software; you can redistribute it and/or modify it under
|
|
|
|
the terms of the GNU General Public License as published by the Free
|
|
|
|
Software Foundation; either version 3, or (at your option) any later
|
|
|
|
version.
|
|
|
|
|
|
|
|
GCC is distributed in the hope that it will be useful, but WITHOUT ANY
|
|
|
|
WARRANTY; without even the implied warranty of MERCHANTABILITY or
|
|
|
|
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
|
|
|
|
for more details.
|
|
|
|
|
|
|
|
You should have received a copy of the GNU General Public License
|
|
|
|
along with GCC; see the file COPYING3. If not see
|
|
|
|
<http://www.gnu.org/licenses/>. */
|
|
|
|
|
|
|
|
/*
|
|
|
|
Macros that should be defined when using this file:
|
|
|
|
INSERT_PASSES_AFTER (PASS)
|
|
|
|
PUSH_INSERT_PASSES_WITHIN (PASS)
|
|
|
|
POP_INSERT_PASSES ()
|
|
|
|
NEXT_PASS (PASS)
|
2016-04-17 07:21:50 +02:00
|
|
|
TERMINATE_PASS_LIST (PASS)
|
2013-07-18 20:55:48 +02:00
|
|
|
*/
|
|
|
|
|
|
|
|
/* All passes needed to lower the function into shape optimizers can
|
|
|
|
operate on. These passes are always run first on the function, but
|
|
|
|
backend might produce already lowered functions that are not processed
|
|
|
|
by these passes. */
|
|
|
|
INSERT_PASSES_AFTER (all_lowering_passes)
|
|
|
|
NEXT_PASS (pass_warn_unused_result);
|
|
|
|
NEXT_PASS (pass_diagnose_omp_blocks);
|
|
|
|
NEXT_PASS (pass_diagnose_tm_blocks);
|
Decompose OpenACC 'kernels' constructs into parts, a sequence of compute constructs
Not yet enabled by default: for now, the current mode of OpenACC 'kernels'
constructs handling still remains '-fopenacc-kernels=parloops', but that is to
change later.
gcc/
* omp-oacc-kernels-decompose.cc: New.
* Makefile.in (OBJS): Add it.
* passes.def: Instantiate it.
* tree-pass.h (make_pass_omp_oacc_kernels_decompose): Declare.
* flag-types.h (enum openacc_kernels): Add.
* doc/invoke.texi (-fopenacc-kernels): Document.
* gimple.h (enum gf_mask): Add
'GF_OMP_TARGET_KIND_OACC_PARALLEL_KERNELS_PARALLELIZED',
'GF_OMP_TARGET_KIND_OACC_PARALLEL_KERNELS_GANG_SINGLE',
'GF_OMP_TARGET_KIND_OACC_DATA_KERNELS'.
(is_gimple_omp_oacc, is_gimple_omp_offloaded): Handle these.
* gimple-pretty-print.c (dump_gimple_omp_target): Likewise.
* omp-expand.c (expand_omp_target, build_omp_regions_1)
(omp_make_gimple_edges): Likewise.
* omp-low.c (scan_sharing_clauses, scan_omp_for)
(check_omp_nesting_restrictions, lower_oacc_reductions)
(lower_oacc_head_mark, lower_omp_target): Likewise.
* omp-offload.c (execute_oacc_device_lower): Likewise.
gcc/c-family/
* c.opt (fopenacc-kernels): Add.
gcc/fortran/
* lang.opt (fopenacc-kernels): Add.
gcc/testsuite/
* c-c++-common/goacc/kernels-decompose-1.c: New.
* c-c++-common/goacc/kernels-decompose-2.c: New.
* c-c++-common/goacc/kernels-decompose-ice-1.c: New.
* c-c++-common/goacc/kernels-decompose-ice-2.c: New.
* gfortran.dg/goacc/kernels-decompose-1.f95: New.
* gfortran.dg/goacc/kernels-decompose-2.f95: New.
* c-c++-common/goacc/if-clause-2.c: Adjust.
* gfortran.dg/goacc/kernels-tree.f95: Likewise.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/declare-vla-kernels-decompose-ice-1.c:
New.
* testsuite/libgomp.oacc-c-c++-common/declare-vla-kernels-decompose.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/declare-vla.c: Adjust.
* testsuite/libgomp.oacc-fortran/pr94358-1.f90: Likewise.
Co-authored-by: Thomas Schwinge <thomas@codesourcery.com>
2019-02-01 00:59:30 +01:00
|
|
|
NEXT_PASS (pass_omp_oacc_kernels_decompose);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_lower_omp);
|
|
|
|
NEXT_PASS (pass_lower_cf);
|
|
|
|
NEXT_PASS (pass_lower_tm);
|
|
|
|
NEXT_PASS (pass_refactor_eh);
|
|
|
|
NEXT_PASS (pass_lower_eh);
|
[C++ coroutines] Initial implementation.
This is the squashed version of the first 6 patches that were split to
facilitate review.
The changes to libiberty (7th patch) to support demangling the co_await
operator stand alone and are applied separately.
The patch series is an initial implementation of a coroutine feature,
expected to be standardised in C++20.
Standardisation status (and potential impact on this implementation)
--------------------------------------------------------------------
The facility was accepted into the working draft for C++20 by WG21 in
February 2019. During following WG21 meetings, design and national body
comments have been reviewed, with no significant change resulting.
The current GCC implementation is against n4835 [1].
At this stage, the remaining potential for change comes from:
* Areas of national body comments that were not resolved in the version we
have worked to:
(a) handling of the situation where aligned allocation is available.
(b) handling of the situation where a user wants coroutines, but does not
want exceptions (e.g. a GPU).
* Agreed changes that have not yet been worded in a draft standard that we
have worked to.
It is not expected that the resolution to these can produce any major
change at this phase of the standardisation process. Such changes should be
limited to the coroutine-specific code.
ABI
---
The various compiler developers 'vendors' have discussed a minimal ABI to
allow one implementation to call coroutines compiled by another.
This amounts to:
1. The layout of a public portion of the coroutine frame.
Coroutines need to preserve state across suspension points, the storage for
this is called a "coroutine frame".
The ABI mandates that pointers into the coroutine frame point to an area
begining with two function pointers (to the resume and destroy functions
described below); these are immediately followed by the "promise object"
described in the standard.
This is sufficient that the builtins can take a coroutine frame pointer and
determine the address of the promise (or call the resume/destroy functions).
2. A number of compiler builtins that the standard library might use.
These are implemented by this patch series.
3. This introduces a new operator 'co_await' the mangling for which is also
agreed between vendors (and has an issue filed for that against the upstream
c++abi). Demangling for this is added to libiberty in a separate patch.
The ABI has currently no target-specific content (a given psABI might elect
to mandate alignment, but the common ABI does not do this).
Standard Library impact
-----------------------
The current implementations require addition of only a single header to
the standard library (no change to the runtime). This header is part of
the patch.
GCC Implementation outline
--------------------------
The standard's design for coroutines does not decorate the definition of
a coroutine in any way, so that a function is only known to be a coroutine
when one of the keywords (co_await, co_yield, co_return) is encountered.
This means that we cannot special-case such functions from the outset, but
must process them differently when they are finalised - which we do from
"finish_function ()".
At a high level, this design of coroutine produces four pieces from the
original user's function:
1. A coroutine state frame (taking the logical place of the activation
record for a regular function). One item stored in that state is the
index of the current suspend point.
2. A "ramp" function
This is what the user calls to construct the coroutine frame and start
the coroutine execution. This will return some object representing the
coroutine's eventual return value (or means to continue it when it it
suspended).
3. A "resume" function.
This is what gets called when a the coroutine is resumed when suspended.
4. A "destroy" function.
This is what gets called when the coroutine state should be destroyed
and its memory released.
The standard's coroutines involve cooperation of the user's authored function
with a provided "promise" class, which includes mandatory methods for
handling the state transitions and providing output values. Most realistic
coroutines will also have one or more 'awaiter' classes that implement the
user's actions for each suspend point. As we parse (or during template
expansion) the types of the promise and awaiter classes become known, and can
then be verified against the signatures expected by the standard.
Once the function is parsed (and templates expanded) we are able to make the
transformation into the four pieces noted above.
The implementation here takes the approach of a series of AST transforms.
The state machine suspend points are encoded in three internal functions
(one of which represents an exit from scope without cleanups). These three
IFNs are lowered early in the middle end, such that the majority of GCC's
optimisers can be run on the resulting output.
As a design choice, we have carried out the outlining of the user's function
in the front end, and taken advantage of the existing middle end's abilities
to inline and DCE where that is profitable.
Since the state machine is actually common to both resumer and destroyer
functions, we make only a single function "actor" that contains both the
resume and destroy paths. The destroy function is represented by a small
stub that sets a value to signal the use of the destroy path and calls the
actor. The idea is that optimisation of the state machine need only be done
once - and then the resume and destroy paths can be identified allowing the
middle end's inline and DCE machinery to optimise as profitable as noted
above.
The middle end components for this implementation are:
A pass that:
1. Lowers the coroutine builtins that allow the standard library header to
interact with the coroutine frame (these fairly simple logical or
numerical substitution of values, given a coroutine frame pointer).
2. Lowers the IFN that represents the exit from state without cleanup.
Essentially, this becomes a gimple goto.
3. Sets the final size of the coroutine frame at this stage.
A second pass (that requires the revised CFG that results from the lowering
of the scope exit IFNs in the first).
1. Lower the IFNs that represent the state machine paths for the resume and
destroy cases.
Patches squashed into this commit:
[C++ coroutines 1] Common code and base definitions.
This part of the patch series provides the gating flag, the keywords,
cpp defines etc.
[C++ coroutines 2] Define builtins and internal functions.
This part of the patch series provides the builtin functions
used by the standard library code and the internal functions
used to implement lowering of the coroutine state machine.
[C++ coroutines 3] Front end parsing and transforms.
There are two parts to this.
1. Parsing, template instantiation and diagnostics for the standard-
mandated class entries.
The user authors a function that becomes a coroutine (lazily) by
making use of any of the co_await, co_yield or co_return keywords.
Unlike a regular function, where the activation record is placed on the
stack, and is destroyed on function exit, a coroutine has some state that
persists between calls - the 'coroutine frame' (thus analogous to a stack
frame).
We transform the user's function into three pieces:
1. A so-called ramp function, that establishes the coroutine frame and
begins execution of the coroutine.
2. An actor function that contains the state machine corresponding to the
user's suspend/resume structure.
3. A stub function that calls the actor function in 'destroy' mode.
The actor function is executed:
* from "resume point 0" by the ramp.
* from resume point N ( > 0 ) for handle.resume() calls.
* from the destroy stub for destroy point N for handle.destroy() calls.
The C++ coroutine design described in the standard makes use of some helper
methods that are authored in a so-called "promise" class provided by the
user.
At parse time (or post substitution) the type of the coroutine promise
will be determined. At that point, we can look up the required promise
class methods and issue diagnostics if they are missing or incorrect. To
avoid repeating these actions at code-gen time, we make use of temporary
'proxy' variables for the coroutine handle and the promise - which will
eventually be instantiated in the coroutine frame.
Each of the keywords will expand to a code sequence (although co_yield is
just syntactic sugar for a co_await).
We defer the analysis and transformatin until template expansion is
complete so that we have complete types at that time.
2. AST analysis and transformation which performs the code-gen for the
outlined state machine.
The entry point here is morph_fn_to_coro () which is called from
finish_function () when we have completed any template expansion.
This is preceded by helper functions that implement the phases below.
The process proceeds in four phases.
A Initial framing.
The user's function body is wrapped in the initial and final suspend
points and we begin building the coroutine frame.
We build empty decls for the actor and destroyer functions at this
time too.
When exceptions are enabled, the user's function body will also be
wrapped in a try-catch block with the catch invoking the promise
class 'unhandled_exception' method.
B Analysis.
The user's function body is analysed to determine the suspend points,
if any, and to capture local variables that might persist across such
suspensions. In most cases, it is not necessary to capture compiler
temporaries, since the tree-lowering nests the suspensions correctly.
However, in the case of a captured reference, there is a lifetime
extension to the end of the full expression - which can mean across a
suspend point in which case it must be promoted to a frame variable.
At the conclusion of analysis, we have a conservative frame layout and
maps of the local variables to their frame entry points.
C Build the ramp function.
Carry out the allocation for the coroutine frame (NOTE; the actual size
computation is deferred until late in the middle end to allow for future
optimisations that will be allowed to elide unused frame entries).
We build the return object.
D Build and expand the actor and destroyer function bodies.
The destroyer is a trivial shim that sets a bit to indicate that the
destroy dispatcher should be used and then calls into the actor.
The actor function is the implementation of the user's state machine.
The current suspend point is noted in an index.
Each suspend point is encoded as a pair of internal functions, one in
the relevant dispatcher, and one representing the suspend point.
During this process, the user's local variables and the proxies for the
self-handle and the promise class instanceare re-written to their
coroutine frame equivalents.
The complete bodies for the ramp, actor and destroy function are passed
back to finish_function for folding and gimplification.
[C++ coroutines 4] Middle end expanders and transforms.
The first part of this is a pass that provides:
* expansion of the library support builtins, these are simple boolean
or numerical substitutions.
* The functionality of implementing an exit from scope without cleanup
is performed here by lowering an IFN to a gimple goto.
This pass has to run for non-coroutine functions, since functions calling
the builtins are not necessarily coroutines (i.e. they are implementing the
library interfaces which may be called from anywhere).
The second part is the expansion of the coroutine IFNs that describe the
state machine connections to the dispatchers. This only has to be run
for functions that are coroutine components. The work done by this pass
is:
In the front end we construct a single actor function that contains
the coroutine state machine.
The actor function has three entry conditions:
1. from the ramp, resume point 0 - to initial-suspend.
2. when resume () is executed (resume point N).
3. from the destroy () shim when that is executed.
The actor function begins with two dispatchers; one for resume and
one for destroy (where the initial entry from the ramp is a special-
case of resume point 0).
Each suspend point and each dispatch entry is marked with an IFN such
that we can connect the relevant dispatchers to their target labels.
So, if we have:
CO_YIELD (NUM, FINAL, RES_LAB, DEST_LAB, FRAME_PTR)
This is await point NUM, and is the final await if FINAL is non-zero.
The resume point is RES_LAB, and the destroy point is DEST_LAB.
We expect to find a CO_ACTOR (NUM) in the resume dispatcher and a
CO_ACTOR (NUM+1) in the destroy dispatcher.
Initially, the intent of keeping the resume and destroy paths together
is that the conditionals controlling them are identical, and thus there
would be duplication of any optimisation of those paths if the split
were earlier.
Subsequent inlining of the actor (and DCE) is then able to extract the
resume and destroy paths as separate functions if that is found
profitable by the optimisers.
Once we have remade the connections to their correct postions, we elide
the labels that the front end inserted.
[C++ coroutines 5] Standard library header.
This provides the interfaces mandated by the standard and implements
the interaction with the coroutine frame by means of inline use of
builtins expanded at compile-time. There should be a 1:1 correspondence
with the standard sections which are cross-referenced.
There is no runtime content.
At this stage, we have the content in an inline namespace "__n4835" for
the CD we worked to.
[C++ coroutines 6] Testsuite.
There are two categories of test:
1. Checks for correctly formed source code and the error reporting.
2. Checks for transformation and code-gen.
The second set are run as 'torture' tests for the standard options
set, including LTO. These are also intentionally run with no options
provided (from the coroutines.exp script).
gcc/ChangeLog:
2020-01-18 Iain Sandoe <iain@sandoe.co.uk>
* Makefile.in: Add coroutine-passes.o.
* builtin-types.def (BT_CONST_SIZE): New.
(BT_FN_BOOL_PTR): New.
(BT_FN_PTR_PTR_CONST_SIZE_BOOL): New.
* builtins.def (DEF_COROUTINE_BUILTIN): New.
* coroutine-builtins.def: New file.
* coroutine-passes.cc: New file.
* function.h (struct GTY function): Add a bit to indicate that the
function is a coroutine component.
* internal-fn.c (expand_CO_FRAME): New.
(expand_CO_YIELD): New.
(expand_CO_SUSPN): New.
(expand_CO_ACTOR): New.
* internal-fn.def (CO_ACTOR): New.
(CO_YIELD): New.
(CO_SUSPN): New.
(CO_FRAME): New.
* passes.def: Add pass_coroutine_lower_builtins,
pass_coroutine_early_expand_ifns.
* tree-pass.h (make_pass_coroutine_lower_builtins): New.
(make_pass_coroutine_early_expand_ifns): New.
* doc/invoke.texi: Document the fcoroutines command line
switch.
gcc/c-family/ChangeLog:
2020-01-18 Iain Sandoe <iain@sandoe.co.uk>
* c-common.c (co_await, co_yield, co_return): New.
* c-common.h (RID_CO_AWAIT, RID_CO_YIELD,
RID_CO_RETURN): New enumeration values.
(D_CXX_COROUTINES): Bit to identify coroutines are active.
(D_CXX_COROUTINES_FLAGS): Guard for coroutine keywords.
* c-cppbuiltin.c (__cpp_coroutines): New cpp define.
* c.opt (fcoroutines): New command-line switch.
gcc/cp/ChangeLog:
2020-01-18 Iain Sandoe <iain@sandoe.co.uk>
* Make-lang.in: Add coroutines.o.
* cp-tree.h (lang_decl-fn): coroutine_p, new bit.
(DECL_COROUTINE_P): New.
* lex.c (init_reswords): Enable keywords when the coroutine flag
is set,
* operators.def (co_await): New operator.
* call.c (add_builtin_candidates): Handle CO_AWAIT_EXPR.
(op_error): Likewise.
(build_new_op_1): Likewise.
(build_new_function_call): Validate coroutine builtin arguments.
* constexpr.c (potential_constant_expression_1): Handle
CO_AWAIT_EXPR, CO_YIELD_EXPR, CO_RETURN_EXPR.
* coroutines.cc: New file.
* cp-objcp-common.c (cp_common_init_ts): Add CO_AWAIT_EXPR,
CO_YIELD_EXPR, CO_RETRN_EXPR as TS expressions.
* cp-tree.def (CO_AWAIT_EXPR, CO_YIELD_EXPR, (CO_RETURN_EXPR): New.
* cp-tree.h (coro_validate_builtin_call): New.
* decl.c (emit_coro_helper): New.
(finish_function): Handle the case when a function is found to
be a coroutine, perform the outlining and emit the outlined
functions. Set a bit to signal that this is a coroutine component.
* parser.c (enum required_token): New enumeration RT_CO_YIELD.
(cp_parser_unary_expression): Handle co_await.
(cp_parser_assignment_expression): Handle co_yield.
(cp_parser_statement): Handle RID_CO_RETURN.
(cp_parser_jump_statement): Handle co_return.
(cp_parser_operator): Handle co_await operator.
(cp_parser_yield_expression): New.
(cp_parser_required_error): Handle RT_CO_YIELD.
* pt.c (tsubst_copy): Handle CO_AWAIT_EXPR.
(tsubst_expr): Handle CO_AWAIT_EXPR, CO_YIELD_EXPR and
CO_RETURN_EXPRs.
* tree.c (cp_walk_subtrees): Likewise.
libstdc++-v3/ChangeLog:
2020-01-18 Iain Sandoe <iain@sandoe.co.uk>
* include/Makefile.am: Add coroutine to the std set.
* include/Makefile.in: Regenerated.
* include/std/coroutine: New file.
gcc/testsuite/ChangeLog:
2020-01-18 Iain Sandoe <iain@sandoe.co.uk>
* g++.dg/coroutines/co-await-syntax-00-needs-expr.C: New test.
* g++.dg/coroutines/co-await-syntax-01-outside-fn.C: New test.
* g++.dg/coroutines/co-await-syntax-02-outside-fn.C: New test.
* g++.dg/coroutines/co-await-syntax-03-auto.C: New test.
* g++.dg/coroutines/co-await-syntax-04-ctor-dtor.C: New test.
* g++.dg/coroutines/co-await-syntax-05-constexpr.C: New test.
* g++.dg/coroutines/co-await-syntax-06-main.C: New test.
* g++.dg/coroutines/co-await-syntax-07-varargs.C: New test.
* g++.dg/coroutines/co-await-syntax-08-lambda-auto.C: New test.
* g++.dg/coroutines/co-return-syntax-01-outside-fn.C: New test.
* g++.dg/coroutines/co-return-syntax-02-outside-fn.C: New test.
* g++.dg/coroutines/co-return-syntax-03-auto.C: New test.
* g++.dg/coroutines/co-return-syntax-04-ctor-dtor.C: New test.
* g++.dg/coroutines/co-return-syntax-05-constexpr-fn.C: New test.
* g++.dg/coroutines/co-return-syntax-06-main.C: New test.
* g++.dg/coroutines/co-return-syntax-07-vararg.C: New test.
* g++.dg/coroutines/co-return-syntax-08-bad-return.C: New test.
* g++.dg/coroutines/co-return-syntax-09-lambda-auto.C: New test.
* g++.dg/coroutines/co-yield-syntax-00-needs-expr.C: New test.
* g++.dg/coroutines/co-yield-syntax-01-outside-fn.C: New test.
* g++.dg/coroutines/co-yield-syntax-02-outside-fn.C: New test.
* g++.dg/coroutines/co-yield-syntax-03-auto.C: New test.
* g++.dg/coroutines/co-yield-syntax-04-ctor-dtor.C: New test.
* g++.dg/coroutines/co-yield-syntax-05-constexpr.C: New test.
* g++.dg/coroutines/co-yield-syntax-06-main.C: New test.
* g++.dg/coroutines/co-yield-syntax-07-varargs.C: New test.
* g++.dg/coroutines/co-yield-syntax-08-needs-expr.C: New test.
* g++.dg/coroutines/co-yield-syntax-09-lambda-auto.C: New test.
* g++.dg/coroutines/coro-builtins.C: New test.
* g++.dg/coroutines/coro-missing-gro.C: New test.
* g++.dg/coroutines/coro-missing-promise-yield.C: New test.
* g++.dg/coroutines/coro-missing-ret-value.C: New test.
* g++.dg/coroutines/coro-missing-ret-void.C: New test.
* g++.dg/coroutines/coro-missing-ueh-1.C: New test.
* g++.dg/coroutines/coro-missing-ueh-2.C: New test.
* g++.dg/coroutines/coro-missing-ueh-3.C: New test.
* g++.dg/coroutines/coro-missing-ueh.h: New test.
* g++.dg/coroutines/coro-pre-proc.C: New test.
* g++.dg/coroutines/coro.h: New file.
* g++.dg/coroutines/coro1-ret-int-yield-int.h: New file.
* g++.dg/coroutines/coroutines.exp: New file.
* g++.dg/coroutines/torture/alloc-00-gro-on-alloc-fail.C: New test.
* g++.dg/coroutines/torture/alloc-01-overload-newdel.C: New test.
* g++.dg/coroutines/torture/call-00-co-aw-arg.C: New test.
* g++.dg/coroutines/torture/call-01-multiple-co-aw.C: New test.
* g++.dg/coroutines/torture/call-02-temp-co-aw.C: New test.
* g++.dg/coroutines/torture/call-03-temp-ref-co-aw.C: New test.
* g++.dg/coroutines/torture/class-00-co-ret.C: New test.
* g++.dg/coroutines/torture/class-01-co-ret-parm.C: New test.
* g++.dg/coroutines/torture/class-02-templ-parm.C: New test.
* g++.dg/coroutines/torture/class-03-operator-templ-parm.C: New test.
* g++.dg/coroutines/torture/class-04-lambda-1.C: New test.
* g++.dg/coroutines/torture/class-05-lambda-capture-copy-local.C: New test.
* g++.dg/coroutines/torture/class-06-lambda-capture-ref.C: New test.
* g++.dg/coroutines/torture/co-await-00-trivial.C: New test.
* g++.dg/coroutines/torture/co-await-01-with-value.C: New test.
* g++.dg/coroutines/torture/co-await-02-xform.C: New test.
* g++.dg/coroutines/torture/co-await-03-rhs-op.C: New test.
* g++.dg/coroutines/torture/co-await-04-control-flow.C: New test.
* g++.dg/coroutines/torture/co-await-05-loop.C: New test.
* g++.dg/coroutines/torture/co-await-06-ovl.C: New test.
* g++.dg/coroutines/torture/co-await-07-tmpl.C: New test.
* g++.dg/coroutines/torture/co-await-08-cascade.C: New test.
* g++.dg/coroutines/torture/co-await-09-pair.C: New test.
* g++.dg/coroutines/torture/co-await-10-template-fn-arg.C: New test.
* g++.dg/coroutines/torture/co-await-11-forwarding.C: New test.
* g++.dg/coroutines/torture/co-await-12-operator-2.C: New test.
* g++.dg/coroutines/torture/co-await-13-return-ref.C: New test.
* g++.dg/coroutines/torture/co-ret-00-void-return-is-ready.C: New test.
* g++.dg/coroutines/torture/co-ret-01-void-return-is-suspend.C: New test.
* g++.dg/coroutines/torture/co-ret-03-different-GRO-type.C: New test.
* g++.dg/coroutines/torture/co-ret-04-GRO-nontriv.C: New test.
* g++.dg/coroutines/torture/co-ret-05-return-value.C: New test.
* g++.dg/coroutines/torture/co-ret-06-template-promise-val-1.C: New test.
* g++.dg/coroutines/torture/co-ret-07-void-cast-expr.C: New test.
* g++.dg/coroutines/torture/co-ret-08-template-cast-ret.C: New test.
* g++.dg/coroutines/torture/co-ret-09-bool-await-susp.C: New test.
* g++.dg/coroutines/torture/co-ret-10-expression-evaluates-once.C: New test.
* g++.dg/coroutines/torture/co-ret-11-co-ret-co-await.C: New test.
* g++.dg/coroutines/torture/co-ret-12-co-ret-fun-co-await.C: New test.
* g++.dg/coroutines/torture/co-ret-13-template-2.C: New test.
* g++.dg/coroutines/torture/co-ret-14-template-3.C: New test.
* g++.dg/coroutines/torture/co-yield-00-triv.C: New test.
* g++.dg/coroutines/torture/co-yield-01-multi.C: New test.
* g++.dg/coroutines/torture/co-yield-02-loop.C: New test.
* g++.dg/coroutines/torture/co-yield-03-tmpl.C: New test.
* g++.dg/coroutines/torture/co-yield-04-complex-local-state.C: New test.
* g++.dg/coroutines/torture/co-yield-05-co-aw.C: New test.
* g++.dg/coroutines/torture/co-yield-06-fun-parm.C: New test.
* g++.dg/coroutines/torture/co-yield-07-template-fn-param.C: New test.
* g++.dg/coroutines/torture/co-yield-08-more-refs.C: New test.
* g++.dg/coroutines/torture/co-yield-09-more-templ-refs.C: New test.
* g++.dg/coroutines/torture/coro-torture.exp: New file.
* g++.dg/coroutines/torture/exceptions-test-0.C: New test.
* g++.dg/coroutines/torture/func-params-00.C: New test.
* g++.dg/coroutines/torture/func-params-01.C: New test.
* g++.dg/coroutines/torture/func-params-02.C: New test.
* g++.dg/coroutines/torture/func-params-03.C: New test.
* g++.dg/coroutines/torture/func-params-04.C: New test.
* g++.dg/coroutines/torture/func-params-05.C: New test.
* g++.dg/coroutines/torture/func-params-06.C: New test.
* g++.dg/coroutines/torture/lambda-00-co-ret.C: New test.
* g++.dg/coroutines/torture/lambda-01-co-ret-parm.C: New test.
* g++.dg/coroutines/torture/lambda-02-co-yield-values.C: New test.
* g++.dg/coroutines/torture/lambda-03-auto-parm-1.C: New test.
* g++.dg/coroutines/torture/lambda-04-templ-parm.C: New test.
* g++.dg/coroutines/torture/lambda-05-capture-copy-local.C: New test.
* g++.dg/coroutines/torture/lambda-06-multi-capture.C: New test.
* g++.dg/coroutines/torture/lambda-07-multi-yield.C: New test.
* g++.dg/coroutines/torture/lambda-08-co-ret-parm-ref.C: New test.
* g++.dg/coroutines/torture/local-var-0.C: New test.
* g++.dg/coroutines/torture/local-var-1.C: New test.
* g++.dg/coroutines/torture/local-var-2.C: New test.
* g++.dg/coroutines/torture/local-var-3.C: New test.
* g++.dg/coroutines/torture/local-var-4.C: New test.
* g++.dg/coroutines/torture/mid-suspend-destruction-0.C: New test.
* g++.dg/coroutines/torture/pr92933.C: New test.
2020-01-18 12:54:46 +01:00
|
|
|
NEXT_PASS (pass_coroutine_lower_builtins);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_build_cfg);
|
|
|
|
NEXT_PASS (pass_warn_function_return);
|
[C++ coroutines] Initial implementation.
This is the squashed version of the first 6 patches that were split to
facilitate review.
The changes to libiberty (7th patch) to support demangling the co_await
operator stand alone and are applied separately.
The patch series is an initial implementation of a coroutine feature,
expected to be standardised in C++20.
Standardisation status (and potential impact on this implementation)
--------------------------------------------------------------------
The facility was accepted into the working draft for C++20 by WG21 in
February 2019. During following WG21 meetings, design and national body
comments have been reviewed, with no significant change resulting.
The current GCC implementation is against n4835 [1].
At this stage, the remaining potential for change comes from:
* Areas of national body comments that were not resolved in the version we
have worked to:
(a) handling of the situation where aligned allocation is available.
(b) handling of the situation where a user wants coroutines, but does not
want exceptions (e.g. a GPU).
* Agreed changes that have not yet been worded in a draft standard that we
have worked to.
It is not expected that the resolution to these can produce any major
change at this phase of the standardisation process. Such changes should be
limited to the coroutine-specific code.
ABI
---
The various compiler developers 'vendors' have discussed a minimal ABI to
allow one implementation to call coroutines compiled by another.
This amounts to:
1. The layout of a public portion of the coroutine frame.
Coroutines need to preserve state across suspension points, the storage for
this is called a "coroutine frame".
The ABI mandates that pointers into the coroutine frame point to an area
begining with two function pointers (to the resume and destroy functions
described below); these are immediately followed by the "promise object"
described in the standard.
This is sufficient that the builtins can take a coroutine frame pointer and
determine the address of the promise (or call the resume/destroy functions).
2. A number of compiler builtins that the standard library might use.
These are implemented by this patch series.
3. This introduces a new operator 'co_await' the mangling for which is also
agreed between vendors (and has an issue filed for that against the upstream
c++abi). Demangling for this is added to libiberty in a separate patch.
The ABI has currently no target-specific content (a given psABI might elect
to mandate alignment, but the common ABI does not do this).
Standard Library impact
-----------------------
The current implementations require addition of only a single header to
the standard library (no change to the runtime). This header is part of
the patch.
GCC Implementation outline
--------------------------
The standard's design for coroutines does not decorate the definition of
a coroutine in any way, so that a function is only known to be a coroutine
when one of the keywords (co_await, co_yield, co_return) is encountered.
This means that we cannot special-case such functions from the outset, but
must process them differently when they are finalised - which we do from
"finish_function ()".
At a high level, this design of coroutine produces four pieces from the
original user's function:
1. A coroutine state frame (taking the logical place of the activation
record for a regular function). One item stored in that state is the
index of the current suspend point.
2. A "ramp" function
This is what the user calls to construct the coroutine frame and start
the coroutine execution. This will return some object representing the
coroutine's eventual return value (or means to continue it when it it
suspended).
3. A "resume" function.
This is what gets called when a the coroutine is resumed when suspended.
4. A "destroy" function.
This is what gets called when the coroutine state should be destroyed
and its memory released.
The standard's coroutines involve cooperation of the user's authored function
with a provided "promise" class, which includes mandatory methods for
handling the state transitions and providing output values. Most realistic
coroutines will also have one or more 'awaiter' classes that implement the
user's actions for each suspend point. As we parse (or during template
expansion) the types of the promise and awaiter classes become known, and can
then be verified against the signatures expected by the standard.
Once the function is parsed (and templates expanded) we are able to make the
transformation into the four pieces noted above.
The implementation here takes the approach of a series of AST transforms.
The state machine suspend points are encoded in three internal functions
(one of which represents an exit from scope without cleanups). These three
IFNs are lowered early in the middle end, such that the majority of GCC's
optimisers can be run on the resulting output.
As a design choice, we have carried out the outlining of the user's function
in the front end, and taken advantage of the existing middle end's abilities
to inline and DCE where that is profitable.
Since the state machine is actually common to both resumer and destroyer
functions, we make only a single function "actor" that contains both the
resume and destroy paths. The destroy function is represented by a small
stub that sets a value to signal the use of the destroy path and calls the
actor. The idea is that optimisation of the state machine need only be done
once - and then the resume and destroy paths can be identified allowing the
middle end's inline and DCE machinery to optimise as profitable as noted
above.
The middle end components for this implementation are:
A pass that:
1. Lowers the coroutine builtins that allow the standard library header to
interact with the coroutine frame (these fairly simple logical or
numerical substitution of values, given a coroutine frame pointer).
2. Lowers the IFN that represents the exit from state without cleanup.
Essentially, this becomes a gimple goto.
3. Sets the final size of the coroutine frame at this stage.
A second pass (that requires the revised CFG that results from the lowering
of the scope exit IFNs in the first).
1. Lower the IFNs that represent the state machine paths for the resume and
destroy cases.
Patches squashed into this commit:
[C++ coroutines 1] Common code and base definitions.
This part of the patch series provides the gating flag, the keywords,
cpp defines etc.
[C++ coroutines 2] Define builtins and internal functions.
This part of the patch series provides the builtin functions
used by the standard library code and the internal functions
used to implement lowering of the coroutine state machine.
[C++ coroutines 3] Front end parsing and transforms.
There are two parts to this.
1. Parsing, template instantiation and diagnostics for the standard-
mandated class entries.
The user authors a function that becomes a coroutine (lazily) by
making use of any of the co_await, co_yield or co_return keywords.
Unlike a regular function, where the activation record is placed on the
stack, and is destroyed on function exit, a coroutine has some state that
persists between calls - the 'coroutine frame' (thus analogous to a stack
frame).
We transform the user's function into three pieces:
1. A so-called ramp function, that establishes the coroutine frame and
begins execution of the coroutine.
2. An actor function that contains the state machine corresponding to the
user's suspend/resume structure.
3. A stub function that calls the actor function in 'destroy' mode.
The actor function is executed:
* from "resume point 0" by the ramp.
* from resume point N ( > 0 ) for handle.resume() calls.
* from the destroy stub for destroy point N for handle.destroy() calls.
The C++ coroutine design described in the standard makes use of some helper
methods that are authored in a so-called "promise" class provided by the
user.
At parse time (or post substitution) the type of the coroutine promise
will be determined. At that point, we can look up the required promise
class methods and issue diagnostics if they are missing or incorrect. To
avoid repeating these actions at code-gen time, we make use of temporary
'proxy' variables for the coroutine handle and the promise - which will
eventually be instantiated in the coroutine frame.
Each of the keywords will expand to a code sequence (although co_yield is
just syntactic sugar for a co_await).
We defer the analysis and transformatin until template expansion is
complete so that we have complete types at that time.
2. AST analysis and transformation which performs the code-gen for the
outlined state machine.
The entry point here is morph_fn_to_coro () which is called from
finish_function () when we have completed any template expansion.
This is preceded by helper functions that implement the phases below.
The process proceeds in four phases.
A Initial framing.
The user's function body is wrapped in the initial and final suspend
points and we begin building the coroutine frame.
We build empty decls for the actor and destroyer functions at this
time too.
When exceptions are enabled, the user's function body will also be
wrapped in a try-catch block with the catch invoking the promise
class 'unhandled_exception' method.
B Analysis.
The user's function body is analysed to determine the suspend points,
if any, and to capture local variables that might persist across such
suspensions. In most cases, it is not necessary to capture compiler
temporaries, since the tree-lowering nests the suspensions correctly.
However, in the case of a captured reference, there is a lifetime
extension to the end of the full expression - which can mean across a
suspend point in which case it must be promoted to a frame variable.
At the conclusion of analysis, we have a conservative frame layout and
maps of the local variables to their frame entry points.
C Build the ramp function.
Carry out the allocation for the coroutine frame (NOTE; the actual size
computation is deferred until late in the middle end to allow for future
optimisations that will be allowed to elide unused frame entries).
We build the return object.
D Build and expand the actor and destroyer function bodies.
The destroyer is a trivial shim that sets a bit to indicate that the
destroy dispatcher should be used and then calls into the actor.
The actor function is the implementation of the user's state machine.
The current suspend point is noted in an index.
Each suspend point is encoded as a pair of internal functions, one in
the relevant dispatcher, and one representing the suspend point.
During this process, the user's local variables and the proxies for the
self-handle and the promise class instanceare re-written to their
coroutine frame equivalents.
The complete bodies for the ramp, actor and destroy function are passed
back to finish_function for folding and gimplification.
[C++ coroutines 4] Middle end expanders and transforms.
The first part of this is a pass that provides:
* expansion of the library support builtins, these are simple boolean
or numerical substitutions.
* The functionality of implementing an exit from scope without cleanup
is performed here by lowering an IFN to a gimple goto.
This pass has to run for non-coroutine functions, since functions calling
the builtins are not necessarily coroutines (i.e. they are implementing the
library interfaces which may be called from anywhere).
The second part is the expansion of the coroutine IFNs that describe the
state machine connections to the dispatchers. This only has to be run
for functions that are coroutine components. The work done by this pass
is:
In the front end we construct a single actor function that contains
the coroutine state machine.
The actor function has three entry conditions:
1. from the ramp, resume point 0 - to initial-suspend.
2. when resume () is executed (resume point N).
3. from the destroy () shim when that is executed.
The actor function begins with two dispatchers; one for resume and
one for destroy (where the initial entry from the ramp is a special-
case of resume point 0).
Each suspend point and each dispatch entry is marked with an IFN such
that we can connect the relevant dispatchers to their target labels.
So, if we have:
CO_YIELD (NUM, FINAL, RES_LAB, DEST_LAB, FRAME_PTR)
This is await point NUM, and is the final await if FINAL is non-zero.
The resume point is RES_LAB, and the destroy point is DEST_LAB.
We expect to find a CO_ACTOR (NUM) in the resume dispatcher and a
CO_ACTOR (NUM+1) in the destroy dispatcher.
Initially, the intent of keeping the resume and destroy paths together
is that the conditionals controlling them are identical, and thus there
would be duplication of any optimisation of those paths if the split
were earlier.
Subsequent inlining of the actor (and DCE) is then able to extract the
resume and destroy paths as separate functions if that is found
profitable by the optimisers.
Once we have remade the connections to their correct postions, we elide
the labels that the front end inserted.
[C++ coroutines 5] Standard library header.
This provides the interfaces mandated by the standard and implements
the interaction with the coroutine frame by means of inline use of
builtins expanded at compile-time. There should be a 1:1 correspondence
with the standard sections which are cross-referenced.
There is no runtime content.
At this stage, we have the content in an inline namespace "__n4835" for
the CD we worked to.
[C++ coroutines 6] Testsuite.
There are two categories of test:
1. Checks for correctly formed source code and the error reporting.
2. Checks for transformation and code-gen.
The second set are run as 'torture' tests for the standard options
set, including LTO. These are also intentionally run with no options
provided (from the coroutines.exp script).
gcc/ChangeLog:
2020-01-18 Iain Sandoe <iain@sandoe.co.uk>
* Makefile.in: Add coroutine-passes.o.
* builtin-types.def (BT_CONST_SIZE): New.
(BT_FN_BOOL_PTR): New.
(BT_FN_PTR_PTR_CONST_SIZE_BOOL): New.
* builtins.def (DEF_COROUTINE_BUILTIN): New.
* coroutine-builtins.def: New file.
* coroutine-passes.cc: New file.
* function.h (struct GTY function): Add a bit to indicate that the
function is a coroutine component.
* internal-fn.c (expand_CO_FRAME): New.
(expand_CO_YIELD): New.
(expand_CO_SUSPN): New.
(expand_CO_ACTOR): New.
* internal-fn.def (CO_ACTOR): New.
(CO_YIELD): New.
(CO_SUSPN): New.
(CO_FRAME): New.
* passes.def: Add pass_coroutine_lower_builtins,
pass_coroutine_early_expand_ifns.
* tree-pass.h (make_pass_coroutine_lower_builtins): New.
(make_pass_coroutine_early_expand_ifns): New.
* doc/invoke.texi: Document the fcoroutines command line
switch.
gcc/c-family/ChangeLog:
2020-01-18 Iain Sandoe <iain@sandoe.co.uk>
* c-common.c (co_await, co_yield, co_return): New.
* c-common.h (RID_CO_AWAIT, RID_CO_YIELD,
RID_CO_RETURN): New enumeration values.
(D_CXX_COROUTINES): Bit to identify coroutines are active.
(D_CXX_COROUTINES_FLAGS): Guard for coroutine keywords.
* c-cppbuiltin.c (__cpp_coroutines): New cpp define.
* c.opt (fcoroutines): New command-line switch.
gcc/cp/ChangeLog:
2020-01-18 Iain Sandoe <iain@sandoe.co.uk>
* Make-lang.in: Add coroutines.o.
* cp-tree.h (lang_decl-fn): coroutine_p, new bit.
(DECL_COROUTINE_P): New.
* lex.c (init_reswords): Enable keywords when the coroutine flag
is set,
* operators.def (co_await): New operator.
* call.c (add_builtin_candidates): Handle CO_AWAIT_EXPR.
(op_error): Likewise.
(build_new_op_1): Likewise.
(build_new_function_call): Validate coroutine builtin arguments.
* constexpr.c (potential_constant_expression_1): Handle
CO_AWAIT_EXPR, CO_YIELD_EXPR, CO_RETURN_EXPR.
* coroutines.cc: New file.
* cp-objcp-common.c (cp_common_init_ts): Add CO_AWAIT_EXPR,
CO_YIELD_EXPR, CO_RETRN_EXPR as TS expressions.
* cp-tree.def (CO_AWAIT_EXPR, CO_YIELD_EXPR, (CO_RETURN_EXPR): New.
* cp-tree.h (coro_validate_builtin_call): New.
* decl.c (emit_coro_helper): New.
(finish_function): Handle the case when a function is found to
be a coroutine, perform the outlining and emit the outlined
functions. Set a bit to signal that this is a coroutine component.
* parser.c (enum required_token): New enumeration RT_CO_YIELD.
(cp_parser_unary_expression): Handle co_await.
(cp_parser_assignment_expression): Handle co_yield.
(cp_parser_statement): Handle RID_CO_RETURN.
(cp_parser_jump_statement): Handle co_return.
(cp_parser_operator): Handle co_await operator.
(cp_parser_yield_expression): New.
(cp_parser_required_error): Handle RT_CO_YIELD.
* pt.c (tsubst_copy): Handle CO_AWAIT_EXPR.
(tsubst_expr): Handle CO_AWAIT_EXPR, CO_YIELD_EXPR and
CO_RETURN_EXPRs.
* tree.c (cp_walk_subtrees): Likewise.
libstdc++-v3/ChangeLog:
2020-01-18 Iain Sandoe <iain@sandoe.co.uk>
* include/Makefile.am: Add coroutine to the std set.
* include/Makefile.in: Regenerated.
* include/std/coroutine: New file.
gcc/testsuite/ChangeLog:
2020-01-18 Iain Sandoe <iain@sandoe.co.uk>
* g++.dg/coroutines/co-await-syntax-00-needs-expr.C: New test.
* g++.dg/coroutines/co-await-syntax-01-outside-fn.C: New test.
* g++.dg/coroutines/co-await-syntax-02-outside-fn.C: New test.
* g++.dg/coroutines/co-await-syntax-03-auto.C: New test.
* g++.dg/coroutines/co-await-syntax-04-ctor-dtor.C: New test.
* g++.dg/coroutines/co-await-syntax-05-constexpr.C: New test.
* g++.dg/coroutines/co-await-syntax-06-main.C: New test.
* g++.dg/coroutines/co-await-syntax-07-varargs.C: New test.
* g++.dg/coroutines/co-await-syntax-08-lambda-auto.C: New test.
* g++.dg/coroutines/co-return-syntax-01-outside-fn.C: New test.
* g++.dg/coroutines/co-return-syntax-02-outside-fn.C: New test.
* g++.dg/coroutines/co-return-syntax-03-auto.C: New test.
* g++.dg/coroutines/co-return-syntax-04-ctor-dtor.C: New test.
* g++.dg/coroutines/co-return-syntax-05-constexpr-fn.C: New test.
* g++.dg/coroutines/co-return-syntax-06-main.C: New test.
* g++.dg/coroutines/co-return-syntax-07-vararg.C: New test.
* g++.dg/coroutines/co-return-syntax-08-bad-return.C: New test.
* g++.dg/coroutines/co-return-syntax-09-lambda-auto.C: New test.
* g++.dg/coroutines/co-yield-syntax-00-needs-expr.C: New test.
* g++.dg/coroutines/co-yield-syntax-01-outside-fn.C: New test.
* g++.dg/coroutines/co-yield-syntax-02-outside-fn.C: New test.
* g++.dg/coroutines/co-yield-syntax-03-auto.C: New test.
* g++.dg/coroutines/co-yield-syntax-04-ctor-dtor.C: New test.
* g++.dg/coroutines/co-yield-syntax-05-constexpr.C: New test.
* g++.dg/coroutines/co-yield-syntax-06-main.C: New test.
* g++.dg/coroutines/co-yield-syntax-07-varargs.C: New test.
* g++.dg/coroutines/co-yield-syntax-08-needs-expr.C: New test.
* g++.dg/coroutines/co-yield-syntax-09-lambda-auto.C: New test.
* g++.dg/coroutines/coro-builtins.C: New test.
* g++.dg/coroutines/coro-missing-gro.C: New test.
* g++.dg/coroutines/coro-missing-promise-yield.C: New test.
* g++.dg/coroutines/coro-missing-ret-value.C: New test.
* g++.dg/coroutines/coro-missing-ret-void.C: New test.
* g++.dg/coroutines/coro-missing-ueh-1.C: New test.
* g++.dg/coroutines/coro-missing-ueh-2.C: New test.
* g++.dg/coroutines/coro-missing-ueh-3.C: New test.
* g++.dg/coroutines/coro-missing-ueh.h: New test.
* g++.dg/coroutines/coro-pre-proc.C: New test.
* g++.dg/coroutines/coro.h: New file.
* g++.dg/coroutines/coro1-ret-int-yield-int.h: New file.
* g++.dg/coroutines/coroutines.exp: New file.
* g++.dg/coroutines/torture/alloc-00-gro-on-alloc-fail.C: New test.
* g++.dg/coroutines/torture/alloc-01-overload-newdel.C: New test.
* g++.dg/coroutines/torture/call-00-co-aw-arg.C: New test.
* g++.dg/coroutines/torture/call-01-multiple-co-aw.C: New test.
* g++.dg/coroutines/torture/call-02-temp-co-aw.C: New test.
* g++.dg/coroutines/torture/call-03-temp-ref-co-aw.C: New test.
* g++.dg/coroutines/torture/class-00-co-ret.C: New test.
* g++.dg/coroutines/torture/class-01-co-ret-parm.C: New test.
* g++.dg/coroutines/torture/class-02-templ-parm.C: New test.
* g++.dg/coroutines/torture/class-03-operator-templ-parm.C: New test.
* g++.dg/coroutines/torture/class-04-lambda-1.C: New test.
* g++.dg/coroutines/torture/class-05-lambda-capture-copy-local.C: New test.
* g++.dg/coroutines/torture/class-06-lambda-capture-ref.C: New test.
* g++.dg/coroutines/torture/co-await-00-trivial.C: New test.
* g++.dg/coroutines/torture/co-await-01-with-value.C: New test.
* g++.dg/coroutines/torture/co-await-02-xform.C: New test.
* g++.dg/coroutines/torture/co-await-03-rhs-op.C: New test.
* g++.dg/coroutines/torture/co-await-04-control-flow.C: New test.
* g++.dg/coroutines/torture/co-await-05-loop.C: New test.
* g++.dg/coroutines/torture/co-await-06-ovl.C: New test.
* g++.dg/coroutines/torture/co-await-07-tmpl.C: New test.
* g++.dg/coroutines/torture/co-await-08-cascade.C: New test.
* g++.dg/coroutines/torture/co-await-09-pair.C: New test.
* g++.dg/coroutines/torture/co-await-10-template-fn-arg.C: New test.
* g++.dg/coroutines/torture/co-await-11-forwarding.C: New test.
* g++.dg/coroutines/torture/co-await-12-operator-2.C: New test.
* g++.dg/coroutines/torture/co-await-13-return-ref.C: New test.
* g++.dg/coroutines/torture/co-ret-00-void-return-is-ready.C: New test.
* g++.dg/coroutines/torture/co-ret-01-void-return-is-suspend.C: New test.
* g++.dg/coroutines/torture/co-ret-03-different-GRO-type.C: New test.
* g++.dg/coroutines/torture/co-ret-04-GRO-nontriv.C: New test.
* g++.dg/coroutines/torture/co-ret-05-return-value.C: New test.
* g++.dg/coroutines/torture/co-ret-06-template-promise-val-1.C: New test.
* g++.dg/coroutines/torture/co-ret-07-void-cast-expr.C: New test.
* g++.dg/coroutines/torture/co-ret-08-template-cast-ret.C: New test.
* g++.dg/coroutines/torture/co-ret-09-bool-await-susp.C: New test.
* g++.dg/coroutines/torture/co-ret-10-expression-evaluates-once.C: New test.
* g++.dg/coroutines/torture/co-ret-11-co-ret-co-await.C: New test.
* g++.dg/coroutines/torture/co-ret-12-co-ret-fun-co-await.C: New test.
* g++.dg/coroutines/torture/co-ret-13-template-2.C: New test.
* g++.dg/coroutines/torture/co-ret-14-template-3.C: New test.
* g++.dg/coroutines/torture/co-yield-00-triv.C: New test.
* g++.dg/coroutines/torture/co-yield-01-multi.C: New test.
* g++.dg/coroutines/torture/co-yield-02-loop.C: New test.
* g++.dg/coroutines/torture/co-yield-03-tmpl.C: New test.
* g++.dg/coroutines/torture/co-yield-04-complex-local-state.C: New test.
* g++.dg/coroutines/torture/co-yield-05-co-aw.C: New test.
* g++.dg/coroutines/torture/co-yield-06-fun-parm.C: New test.
* g++.dg/coroutines/torture/co-yield-07-template-fn-param.C: New test.
* g++.dg/coroutines/torture/co-yield-08-more-refs.C: New test.
* g++.dg/coroutines/torture/co-yield-09-more-templ-refs.C: New test.
* g++.dg/coroutines/torture/coro-torture.exp: New file.
* g++.dg/coroutines/torture/exceptions-test-0.C: New test.
* g++.dg/coroutines/torture/func-params-00.C: New test.
* g++.dg/coroutines/torture/func-params-01.C: New test.
* g++.dg/coroutines/torture/func-params-02.C: New test.
* g++.dg/coroutines/torture/func-params-03.C: New test.
* g++.dg/coroutines/torture/func-params-04.C: New test.
* g++.dg/coroutines/torture/func-params-05.C: New test.
* g++.dg/coroutines/torture/func-params-06.C: New test.
* g++.dg/coroutines/torture/lambda-00-co-ret.C: New test.
* g++.dg/coroutines/torture/lambda-01-co-ret-parm.C: New test.
* g++.dg/coroutines/torture/lambda-02-co-yield-values.C: New test.
* g++.dg/coroutines/torture/lambda-03-auto-parm-1.C: New test.
* g++.dg/coroutines/torture/lambda-04-templ-parm.C: New test.
* g++.dg/coroutines/torture/lambda-05-capture-copy-local.C: New test.
* g++.dg/coroutines/torture/lambda-06-multi-capture.C: New test.
* g++.dg/coroutines/torture/lambda-07-multi-yield.C: New test.
* g++.dg/coroutines/torture/lambda-08-co-ret-parm-ref.C: New test.
* g++.dg/coroutines/torture/local-var-0.C: New test.
* g++.dg/coroutines/torture/local-var-1.C: New test.
* g++.dg/coroutines/torture/local-var-2.C: New test.
* g++.dg/coroutines/torture/local-var-3.C: New test.
* g++.dg/coroutines/torture/local-var-4.C: New test.
* g++.dg/coroutines/torture/mid-suspend-destruction-0.C: New test.
* g++.dg/coroutines/torture/pr92933.C: New test.
2020-01-18 12:54:46 +01:00
|
|
|
NEXT_PASS (pass_coroutine_early_expand_ifns);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_expand_omp);
|
2017-06-09 11:35:05 +02:00
|
|
|
NEXT_PASS (pass_build_cgraph_edges);
|
2016-04-17 07:21:50 +02:00
|
|
|
TERMINATE_PASS_LIST (all_lowering_passes)
|
2013-07-18 20:55:48 +02:00
|
|
|
|
|
|
|
/* Interprocedural optimization passes. */
|
|
|
|
INSERT_PASSES_AFTER (all_small_ipa_passes)
|
|
|
|
NEXT_PASS (pass_ipa_free_lang_data);
|
|
|
|
NEXT_PASS (pass_ipa_function_and_variable_visibility);
|
ipa-chkp.c: New.
gcc/
2014-11-05 Ilya Enkovich <ilya.enkovich@intel.com>
* ipa-chkp.c: New.
* ipa-chkp.h: New.
* tree-chkp.c: New.
* tree-chkp.h: New.
* tree-chkp-opt.c: New.
* rtl-chkp.c: New.
* rtl-chkp.h: New.
* Makefile.in (OBJS): Add ipa-chkp.o, rtl-chkp.o, tree-chkp.o
tree-chkp-opt.o.
(GTFILES): Add tree-chkp.c.
* mode-classes.def (MODE_POINTER_BOUNDS): New.
* tree.def (POINTER_BOUNDS_TYPE): New.
* genmodes.c (complete_mode): Support MODE_POINTER_BOUNDS.
(POINTER_BOUNDS_MODE): New.
(make_pointer_bounds_mode): New.
* machmode.h (POINTER_BOUNDS_MODE_P): New.
* stor-layout.c (int_mode_for_mode): Support MODE_POINTER_BOUNDS.
(layout_type): Support POINTER_BOUNDS_TYPE.
* tree-pretty-print.c (dump_generic_node): Support POINTER_BOUNDS_TYPE.
* tree-core.h (tree_index): Add TI_POINTER_BOUNDS_TYPE.
* tree.c (build_int_cst_wide): Support POINTER_BOUNDS_TYPE.
(type_contains_placeholder_1): Likewise.
(build_common_tree_nodes): Initialize
pointer_bounds_type_node.
* tree.h (POINTER_BOUNDS_TYPE_P): New.
(pointer_bounds_type_node): New.
(POINTER_BOUNDS_P): New.
(BOUNDED_TYPE_P): New.
(BOUNDED_P): New.
(CALL_WITH_BOUNDS_P): New.
* gimple.h (gf_mask): Add GF_CALL_WITH_BOUNDS.
(gimple_call_with_bounds_p): New.
(gimple_call_set_with_bounds): New.
(gimple_return_retbnd): New.
(gimple_return_set_retbnd): New
* gimple.c (gimple_build_return): Increase number of ops
for return statement.
(gimple_build_call_from_tree): Propagate CALL_WITH_BOUNDS_P
flag.
* gimple-pretty-print.c (dump_gimple_return): Print second op.
* rtl.h (CALL_EXPR_WITH_BOUNDS_P): New.
* gimplify.c (gimplify_init_constructor): Avoid infinite
loop during gimplification of bounds initializer.
* calls.c: Include tree-chkp.h, rtl-chkp.h, bitmap.h.
(special_function_p): Use original decl name when analyzing
instrumentation clone.
(arg_data): Add fields special_slot, pointer_arg and
pointer_offset.
(store_bounds): New.
(emit_call_1): Propagate instrumentation flag for CALL.
(initialize_argument_information): Compute pointer_arg,
pointer_offset and special_slot for pointer bounds arguments.
(finalize_must_preallocate): Preallocate when storing bounds
in bounds table.
(compute_argument_addresses): Skip pointer bounds.
(expand_call): Store bounds into tables separately. Return
result joined with resulting bounds.
* cfgexpand.c: Include tree-chkp.h, rtl-chkp.h.
(expand_call_stmt): Propagate bounds flag for CALL_EXPR.
(expand_return): Add returned bounds arg. Handle returned bounds.
(expand_gimple_stmt_1): Adjust to new expand_return signature.
(gimple_expand_cfg): Reset rtx bounds map.
* expr.c: Include tree-chkp.h, rtl-chkp.h.
(expand_assignment): Handle returned bounds.
(store_expr_with_bounds): New. Replaces store_expr with new bounds
target argument. Handle bounds returned by calls.
(store_expr): Now wraps store_expr_with_bounds.
* expr.h (store_expr_with_bounds): New.
* function.c: Include tree-chkp.h, rtl-chkp.h.
(bounds_parm_data): New.
(use_register_for_decl): Do not registerize decls used for bounds
stores and loads.
(assign_parms_augmented_arg_list): Add bounds of the result
structure pointer as the second argument.
(assign_parm_find_entry_rtl): Mark bounds are never passed on
the stack.
(assign_parm_is_stack_parm): Likewise.
(assign_parm_load_bounds): New.
(assign_bounds): New.
(assign_parms): Load bounds and determine a location for
returned bounds.
(diddle_return_value_1): New.
(diddle_return_value): Handle returned bounds.
* function.h (rtl_data): Add field for returned bounds.
* varasm.c: Include tree-chkp.h.
(output_constant): Support POINTER_BOUNDS_TYPE.
(output_constant_pool_2): Support MODE_POINTER_BOUNDS.
(ultimate_transparent_alias_target): Move up.
(make_decl_rtl): For instrumented function use
name of the original decl.
(assemble_start_function): Mark function as global
in case it is instrumentation clone of the global
function.
(do_assemble_alias): Follow transparent alias chain
for identifier. Check if original alias is public.
(maybe_assemble_visibility): Use visibility of the
original function for instrumented version.
(default_unique_section): Likewise.
* emit-rtl.c (immed_double_const): Support MODE_POINTER_BOUNDS.
(init_emit_once): Build pointer bounds zero constants.
* explow.c (trunc_int_for_mode): Support MODE_POINTER_BOUNDS.
* target.def (builtin_chkp_function): New.
(chkp_bound_type): New.
(chkp_bound_mode): New.
(chkp_make_bounds_constant): New.
(chkp_initialize_bounds): New.
(load_bounds_for_arg): New.
(store_bounds_for_arg): New.
(load_returned_bounds): New.
(store_returned_bounds): New.
(chkp_function_value_bounds): New.
(setup_incoming_vararg_bounds): New.
(function_arg): Update hook description with new possible return
value CONST_INT.
* targhooks.h (default_load_bounds_for_arg): New.
(default_store_bounds_for_arg): New.
(default_load_returned_bounds): New.
(default_store_returned_bounds): New.
(default_chkp_bound_type): New.
(default_chkp_bound_mode): New.
(default_builtin_chkp_function): New.
(default_chkp_function_value_bounds): New.
(default_chkp_make_bounds_constant): New.
(default_chkp_initialize_bounds): New.
(default_setup_incoming_vararg_bounds): New.
* targhooks.c (default_load_bounds_for_arg): New.
(default_store_bounds_for_arg): New.
(default_load_returned_bounds): New.
(default_store_returned_bounds): New.
(default_chkp_bound_type): New.
(default_chkp_bound_mode); New.
(default_builtin_chkp_function): New.
(default_chkp_function_value_bounds): New.
(default_chkp_make_bounds_constant): New.
(default_chkp_initialize_bounds): New.
(default_setup_incoming_vararg_bounds): New.
* builtin-types.def (BT_BND): New.
(BT_FN_PTR_CONST_PTR): New.
(BT_FN_CONST_PTR_CONST_PTR): New.
(BT_FN_BND_CONST_PTR): New.
(BT_FN_CONST_PTR_BND): New.
(BT_FN_PTR_CONST_PTR_SIZE): New.
(BT_FN_PTR_CONST_PTR_CONST_PTR): New.
(BT_FN_VOID_PTRPTR_CONST_PTR): New.
(BT_FN_VOID_CONST_PTR_SIZE): New.
(BT_FN_VOID_PTR_BND): New.
(BT_FN_CONST_PTR_CONST_PTR_CONST_PTR): New.
(BT_FN_BND_CONST_PTR_SIZE): New.
(BT_FN_PTR_CONST_PTR_CONST_PTR_SIZE): New.
(BT_FN_VOID_CONST_PTR_BND_CONST_PTR): New.
* chkp-builtins.def: New.
* builtins.def: include chkp-builtins.def.
(DEF_CHKP_BUILTIN): New.
* builtins.c: Include tree-chkp.h and rtl-chkp.h.
(expand_builtin): Support BUILT_IN_CHKP_INIT_PTR_BOUNDS,
BUILT_IN_CHKP_NULL_PTR_BOUNDS, BUILT_IN_CHKP_COPY_PTR_BOUNDS,
BUILT_IN_CHKP_CHECK_PTR_LBOUNDS, BUILT_IN_CHKP_CHECK_PTR_UBOUNDS,
BUILT_IN_CHKP_CHECK_PTR_BOUNDS, BUILT_IN_CHKP_SET_PTR_BOUNDS,
BUILT_IN_CHKP_NARROW_PTR_BOUNDS, BUILT_IN_CHKP_STORE_PTR_BOUNDS,
BUILT_IN_CHKP_GET_PTR_LBOUND, BUILT_IN_CHKP_GET_PTR_UBOUND,
BUILT_IN_CHKP_BNDMK, BUILT_IN_CHKP_BNDSTX, BUILT_IN_CHKP_BNDCL,
BUILT_IN_CHKP_BNDCU, BUILT_IN_CHKP_BNDLDX, BUILT_IN_CHKP_BNDRET,
BUILT_IN_CHKP_INTERSECT, BUILT_IN_CHKP_NARROW,
BUILT_IN_CHKP_EXTRACT_LOWER, BUILT_IN_CHKP_EXTRACT_UPPER.
(std_expand_builtin_va_start): Init bounds for va_list.
* cppbuiltin.c (define_builtin_macros_for_compilation_flags): Add
__CHKP__ macro when Pointer Bounds Checker is on.
* params.def (PARAM_CHKP_MAX_CTOR_SIZE): New.
* passes.def (pass_ipa_chkp_versioning): New.
(pass_early_local_passes): Renamed to pass_build_ssa_passes.
(pass_fixup_cfg): Moved to pass_chkp_instrumentation_passes.
(pass_chkp_instrumentation_passes): New.
(pass_ipa_chkp_produce_thunks): New.
(pass_local_optimization_passes): New.
(pass_chkp_opt): New.
* tree-pass.h (make_pass_ipa_chkp_versioning): New.
(make_pass_ipa_chkp_produce_thunks): New.
(make_pass_chkp): New.
(make_pass_chkp_opt): New.
(make_pass_early_local_passes): Renamed to ...
(make_pass_build_ssa_passes): This.
(make_pass_chkp_instrumentation_passes): New.
(make_pass_local_optimization_passes): New.
* passes.c (pass_manager::execute_early_local_passes): Execute
early passes in three steps.
(execute_all_early_local_passes): Renamed to ...
(execute_build_ssa_passes): This.
(pass_data_early_local_passes): Renamed to ...
(pass_data_build_ssa_passes): This.
(pass_early_local_passes): Renamed to ...
(pass_build_ssa_passes): This.
(pass_data_chkp_instrumentation_passes): New.
(pass_chkp_instrumentation_passes): New.
(pass_data_local_optimization_passes): New.
(pass_local_optimization_passes): New.
(make_pass_early_local_passes): Renamed to ...
(make_pass_build_ssa_passes): This.
(make_pass_chkp_instrumentation_passes): New.
(make_pass_local_optimization_passes): New.
* c-family/c.opt (fcheck-pointer-bounds): New.
(fchkp-check-incomplete-type): New.
(fchkp-zero-input-bounds-for-main): New.
(fchkp-first-field-has-own-bounds): New.
(fchkp-narrow-bounds): New.
(fchkp-narrow-to-innermost-array): New.
(fchkp-optimize): New.
(fchkp-use-fast-string-functions): New.
(fchkp-use-nochk-string-functions): New.
(fchkp-use-static-bounds): New.
(fchkp-use-static-const-bounds): New.
(fchkp-treat-zero-dynamic-size-as-infinite): New.
(fchkp-check-read): New.
(fchkp-check-write): New.
(fchkp-store-bounds): New.
(fchkp-instrument-calls): New.
(fchkp-instrument-marked-only): New.
(Wchkp): New.
* c-family/c-common.c (handle_bnd_variable_size_attribute): New.
(handle_bnd_legacy): New.
(handle_bnd_instrument): New.
(c_common_attribute_table): Add bnd_variable_size, bnd_legacy
and bnd_instrument. Fix documentation.
(c_common_format_attribute_table): Likewsie.
* toplev.c: include tree-chkp.h.
(process_options): Check Pointer Bounds Checker is supported.
(compile_file): Add chkp_finish_file call.
* ipa-cp.c (initialize_node_lattices): Use cgraph_local_p
to handle instrumentation clones properly.
(propagate_constants_accross_call): Do not propagate
through instrumentation thunks.
* ipa-pure-const.c (propagate_pure_const): Support
IPA_REF_CHKP.
* ipa-inline.c (early_inliner): Check edge has summary allocated.
* ipa-split.c: Include tree-chkp.h.
(find_retbnd): New.
(split_part_set_ssa_name_p): New.
(consider_split): Do not split retbnd and retval
producers.
(insert_bndret_call_after): new.
(split_function): Propagate Pointer Bounds Checker
instrumentation marks and handle returned bounds.
* tree-ssa-sccvn.h (vn_reference_op_struct): Transform opcode
into bit field and add with_bounds field.
* tree-ssa-sccvn.c (copy_reference_ops_from_call): Set
with_bounds field for instrumented calls.
* tree-ssa-pre.c (create_component_ref_by_pieces_1): Restore
CALL_WITH_BOUNDS_P flag for calls.
* tree-ssa-ccp.c: Include tree-chkp.h.
(insert_clobber_before_stack_restore): Handle
BUILT_IN_CHKP_BNDRET calls.
* tree-ssa-dce.c: Include tree-chkp.h.
(propagate_necessity): For free call fed by alloc check
bounds are also provided by the same alloc.
(eliminate_unnecessary_stmts): Handle BUILT_IN_CHKP_BNDRET
used by free calls.
* tree-inline.c: Include tree-chkp.h.
(declare_return_variable): Add arg holding
returned bounds slot. Create and initialize returned bounds var.
(remap_gimple_stmt): Handle returned bounds.
Return sequence of statements instead of a single statement.
(insert_init_stmt): Add declaration.
(remap_gimple_seq): Adjust to new remap_gimple_stmt signature.
(copy_bb): Adjust to changed return type of remap_gimple_stmt.
Properly handle bounds in va_arg_pack and va_arg_pack_len.
(expand_call_inline): Handle returned bounds. Add bounds copy
for generated mem to mem assignments.
* tree-inline.h (copy_body_data): Add fields retbnd and
assign_stmts.
* value-prof.c: Include tree-chkp.h.
(gimple_ic): Support returned bounds.
* ipa.c (cgraph_build_static_cdtor_1): Support contructors
with "chkp ctor" and "bnd_legacy" attributes.
(symtab_remove_unreachable_nodes): Keep initial values for
pointer bounds to be used for checks eliminations.
(process_references): Handle IPA_REF_CHKP.
(walk_polymorphic_call_targets): Likewise.
* ipa-visibility.c (cgraph_externally_visible_p): Mark
instrumented 'main' as externally visible.
(function_and_variable_visibility): Filter instrumentation
thunks.
* cgraph.h (cgraph_thunk_info): Add add_pointer_bounds_args
field.
(cgraph_node): Add instrumented_version, orig_decl and
instrumentation_clone fields.
(symtab_node::get_alias_target): Allow IPA_REF_CHKP reference.
(varpool_node): Add need_bounds_init field.
(cgraph_local_p): New.
* cgraph.c: Include tree-chkp.h.
(cgraph_node::remove): Fix instrumented_version
of the referenced node if any.
(cgraph_node::dump): Dump instrumentation_clone and
instrumented_version fields.
(cgraph_node::verify_node): Check correctness of IPA_REF_CHKP
references and instrumentation thunks.
(cgraph_can_remove_if_no_direct_calls_and_refs_p): Keep
all not instrumented instrumentation clones alive.
(cgraph_redirect_edge_call_stmt_to_callee): Support
returned bounds.
* cgraphbuild.c (rebuild_cgraph_edges): Rebuild IPA_REF_CHKP
reference.
(cgraph_rebuild_references): Likewise.
* cgraphunit.c: Include tree-chkp.h.
(assemble_thunks_and_aliases): Skip thunks calling instrumneted
function version.
(varpool_finalize_decl): Register statically initialized decls
in Pointer Bounds Checker.
(walk_polymorphic_call_targets): Do not mark generated call to
__builtin_unreachable as with_bounds.
(output_weakrefs): If there are both instrumented and original
versions, output only one of them.
(cgraph_node::expand_thunk): Set with_bounds flag
for created call statement.
* ipa-ref.h (ipa_ref_use): Add IPA_REF_CHKP.
(ipa_ref): increase size of use field.
* symtab.c (ipa_ref_use_name): Add element for IPA_REF_CHKP.
* varpool.c (dump_varpool_node): Dump need_bounds_init field.
(ctor_for_folding): Do not fold constant bounds vars.
* lto-streamer.h (LTO_minor_version): Change minor version from
0 to 1.
* lto-cgraph.c (compute_ltrans_boundary): Keep initial values for
pointer bounds.
(lto_output_node): Output instrumentation_clone,
thunk.add_pointer_bounds_args and orig_decl field.
(lto_output_ref): Adjust to new ipa_ref::use field size.
(input_overwrite_node): Read instrumentation_clone field.
(input_node): Read thunk.add_pointer_bounds_args and orig_decl
fields.
(input_ref): Adjust to new ipa_ref::use field size.
(input_cgraph_1): Compute instrumented_version fields and restore
IDENTIFIER_TRANSPARENT_ALIAS chains.
(lto_output_varpool_node): Output
need_bounds_init value.
(input_varpool_node): Read need_bounds_init value.
* lto-partition.c (add_symbol_to_partition_1): Keep original
and instrumented versions together.
(privatize_symbol_name): Restore transparent alias chain if required.
(add_references_to_partition): Add references to pointer bounds vars.
* dbxout.c (dbxout_type): Ignore POINTER_BOUNDS_TYPE.
* dwarf2out.c (gen_subprogram_die): Ignore bound args.
(gen_type_die_with_usage): Skip pointer bounds.
(dwarf2out_global_decl): Likewise.
(is_base_type): Support POINTER_BOUNDS_TYPE.
(gen_formal_types_die): Skip pointer bounds.
(gen_decl_die): Likewise.
* var-tracking.c (vt_add_function_parameters): Skip
bounds parameters.
* ipa-icf.c (sem_function::merge): Do not merge when instrumentation
thunk still exists.
(sem_variable::merge): Reset need_bounds_init flag.
* doc/extend.texi: Document Pointer Bounds Checker built-in functions
and attributes.
* doc/tm.texi.in (TARGET_LOAD_BOUNDS_FOR_ARG): New.
(TARGET_STORE_BOUNDS_FOR_ARG): New.
(TARGET_LOAD_RETURNED_BOUNDS): New.
(TARGET_STORE_RETURNED_BOUNDS): New.
(TARGET_CHKP_FUNCTION_VALUE_BOUNDS): New.
(TARGET_SETUP_INCOMING_VARARG_BOUNDS): New.
(TARGET_BUILTIN_CHKP_FUNCTION): New.
(TARGET_CHKP_BOUND_TYPE): New.
(TARGET_CHKP_BOUND_MODE): New.
(TARGET_CHKP_MAKE_BOUNDS_CONSTANT): New.
(TARGET_CHKP_INITIALIZE_BOUNDS): New.
* doc/tm.texi: Regenerated.
* doc/rtl.texi (MODE_POINTER_BOUNDS): New.
(BND32mode): New.
(BND64mode): New.
* doc/invoke.texi (-mmpx): New.
(-mno-mpx): New.
(chkp-max-ctor-size): New.
* config/i386/constraints.md (w): New.
(Ti): New.
(Tb): New.
* config/i386/i386-c.c (ix86_target_macros_internal): Add __MPX__.
* config/i386/i386-modes.def (BND32): New.
(BND64): New.
* config/i386/i386-protos.h (ix86_bnd_prefixed_insn_p): New.
* config/i386/i386.c: Include tree-chkp.h, rtl-chkp.h, tree-iterator.h.
(regclass_map): Add bound registers.
(dbx_register_map): Likewise.
(dbx64_register_map): Likewise.
(svr4_dbx_register_map): Likewise.
(isa_opts): Add -mmpx.
(PTA_MPX): New.
(ix86_option_override_internal): Support MPX ISA.
(ix86_conditional_register_usage): Support bound registers.
(ix86_code_end): Add MPX bnd prefix.
(output_set_got): Likewise.
(print_reg): Avoid prefixes for bound registers.
(ix86_print_operand): Add '!' (MPX bnd) print prefix support.
(ix86_print_operand_punct_valid_p): Likewise.
(ix86_print_operand_address): Support UNSPEC_BNDMK_ADDR and
UNSPEC_BNDLDX_ADDR.
(ix86_output_call_insn): Add MPX bnd prefix to branch instructions.
(ix86_class_likely_spilled_p): Add bound regs support.
(ix86_hard_regno_mode_ok): Likewise.
(x86_order_regs_for_local_alloc): Likewise.
(ix86_bnd_prefixed_insn_p): New.
(ix86_builtins): Add
IX86_BUILTIN_BNDMK, IX86_BUILTIN_BNDSTX,
IX86_BUILTIN_BNDLDX, IX86_BUILTIN_BNDCL,
IX86_BUILTIN_BNDCU, IX86_BUILTIN_BNDRET,
IX86_BUILTIN_BNDNARROW, IX86_BUILTIN_BNDINT,
IX86_BUILTIN_SIZEOF, IX86_BUILTIN_BNDLOWER,
IX86_BUILTIN_BNDUPPER.
(builtin_isa): Add leaf_p and nothrow_p fields.
(def_builtin): Initialize leaf_p and nothrow_p.
(ix86_add_new_builtins): Handle leaf_p and nothrow_p
flags.
(bdesc_mpx): New.
(bdesc_mpx_const): New.
(ix86_init_mpx_builtins): New.
(ix86_init_builtins): Call ix86_init_mpx_builtins.
(ix86_emit_cmove): New.
(ix86_emit_move_max): New.
(ix86_expand_builtin): Expand IX86_BUILTIN_BNDMK,
IX86_BUILTIN_BNDSTX, IX86_BUILTIN_BNDLDX,
IX86_BUILTIN_BNDCL, IX86_BUILTIN_BNDCU,
IX86_BUILTIN_BNDRET, IX86_BUILTIN_BNDNARROW,
IX86_BUILTIN_BNDINT, IX86_BUILTIN_SIZEOF,
IX86_BUILTIN_BNDLOWER, IX86_BUILTIN_BNDUPPER.
(ix86_function_value_bounds): New.
(ix86_builtin_mpx_function): New.
(ix86_get_arg_address_for_bt): New.
(ix86_load_bounds): New.
(ix86_store_bounds): New.
(ix86_load_returned_bounds): New.
(ix86_store_returned_bounds): New.
(ix86_mpx_bound_mode): New.
(ix86_make_bounds_constant): New.
(ix86_initialize_bounds):
(TARGET_LOAD_BOUNDS_FOR_ARG): New.
(TARGET_STORE_BOUNDS_FOR_ARG): New.
(TARGET_LOAD_RETURNED_BOUNDS): New.
(TARGET_STORE_RETURNED_BOUNDS): New.
(TARGET_CHKP_BOUND_MODE): New.
(TARGET_BUILTIN_CHKP_FUNCTION): New.
(TARGET_CHKP_FUNCTION_VALUE_BOUNDS): New.
(TARGET_CHKP_MAKE_BOUNDS_CONSTANT): New.
(TARGET_CHKP_INITIALIZE_BOUNDS): New.
(ix86_option_override_internal): Do not
support x32 with MPX.
(init_cumulative_args): Init stdarg, bnd_regno, bnds_in_bt
and force_bnd_pass.
(function_arg_advance_32): Return number of used integer
registers.
(function_arg_advance_64): Likewise.
(function_arg_advance_ms_64): Likewise.
(ix86_function_arg_advance): Handle pointer bounds.
(ix86_function_arg): Likewise.
(ix86_function_value_regno_p): Mark fisrt bounds registers as
possible function value.
(ix86_function_value_1): Handle pointer bounds type/mode
(ix86_return_in_memory): Likewise.
(ix86_print_operand): Analyse insn to decide abounf "bnd" prefix.
(ix86_expand_call): Generate returned bounds.
(ix86_setup_incoming_vararg_bounds): New.
(ix86_va_start): Initialize bounds for pointers in va_list.
(TARGET_SETUP_INCOMING_VARARG_BOUNDS): New.
* config/i386/i386.h (TARGET_MPX): New.
(TARGET_MPX_P): New.
(FIRST_PSEUDO_REGISTER): Fix to new value.
(FIXED_REGISTERS): Add bound registers.
(CALL_USED_REGISTERS): Likewise.
(REG_ALLOC_ORDER): Likewise.
(HARD_REGNO_NREGS): Likewise.
(VALID_BND_REG_MODE): New.
(FIRST_BND_REG): New.
(LAST_BND_REG): New.
(reg_class): Add BND_REGS.
(REG_CLASS_NAMES): Likewise.
(REG_CLASS_CONTENTS): Likewise.
(BND_REGNO_P): New.
(ANY_BND_REG_P): New.
(BNDmode): New.
(HI_REGISTER_NAMES): Add bound registers.
(ix86_args): Add bnd_regno, bnds_in_bt, force_bnd_pass and
stdarg fields.
* config/i386/i386.md (UNSPEC_BNDMK): New.
(UNSPEC_BNDMK_ADDR): New.
(UNSPEC_BNDSTX): New.
(UNSPEC_BNDLDX): New.
(UNSPEC_BNDLDX_ADDR): New.
(UNSPEC_BNDCL): New.
(UNSPEC_BNDCU): New.
(UNSPEC_BNDCN): New.
(UNSPEC_MPX_FENCE): New.
(UNSPEC_SIZEOF): New.
(BND0_REG): New.
(BND1_REG): New.
(type): Add mpxmov, mpxmk, mpxchk, mpxld, mpxst.
(length_immediate): Support mpxmov, mpxmk, mpxchk, mpxld, mpxst.
(prefix_rep): Check for bnd prefix.
(prefix_0f): Support mpxmov, mpxmk, mpxchk, mpxld, mpxst.
(length_nobnd): New.
(length): Use length_nobnd when specified.
(memory): Support mpxmov, mpxmk, mpxchk, mpxld, mpxst.
(BND): New.
(bnd_ptr): New.
(BNDCHECK): New.
(bndcheck): New.
(*jcc_1): Add MPX bnd prefix.
(*jcc_2): Likewise.
(jump): Likewise.
(*indirect_jump): Likewise.
(*tablejump_1): Likewise.
(simple_return_internal): Likewise.
(simple_return_internal_long): Likewise.
(simple_return_pop_internal): Likewise.
(simple_return_indirect_internal): Likewise.
(<mode>_mk): New.
(*<mode>_mk): New.
(mov<mode>): New.
(*mov<mode>_internal_mpx): New.
(<mode>_<bndcheck>): New.
(*<mode>_<bndcheck>): New.
(<mode>_ldx): New.
(*<mode>_ldx): New.
(<mode>_stx): New.
(*<mode>_stx): New.
move_size_reloc_<mode>): New.
* config/i386/predicates.md (address_mpx_no_base_operand): New.
(address_mpx_no_index_operand): New.
(bnd_mem_operator): New.
(symbol_operand): New.
(x86_64_immediate_size_operand): New.
* config/i386/i386.opt (mmpx): New.
* config/i386/i386-builtin-types.def (BND): New.
(ULONG): New.
(BND_FTYPE_PCVOID_ULONG): New.
(VOID_FTYPE_BND_PCVOID): New.
(VOID_FTYPE_PCVOID_PCVOID_BND): New.
(BND_FTYPE_PCVOID_PCVOID): New.
(BND_FTYPE_PCVOID): New.
(BND_FTYPE_BND_BND): New.
(PVOID_FTYPE_PVOID_PVOID_ULONG): New.
(PVOID_FTYPE_PCVOID_BND_ULONG): New.
(ULONG_FTYPE_VOID): New.
(PVOID_FTYPE_BND): New.
gcc/testsuite/
2014-11-05 Ilya Enkovich <ilya.enkovich@intel.com>
* gcc.target/i386/chkp-builtins-1.c: New.
* gcc.target/i386/chkp-builtins-2.c: New.
* gcc.target/i386/chkp-builtins-3.c: New.
* gcc.target/i386/chkp-builtins-4.c: New.
* gcc.target/i386/chkp-remove-bndint-1.c: New.
* gcc.target/i386/chkp-remove-bndint-2.c: New.
* gcc.target/i386/chkp-const-check-1.c: New.
* gcc.target/i386/chkp-const-check-2.c: New.
* gcc.target/i386/chkp-lifetime-1.c: New.
* gcc.dg/pr37858.c: Replace early_local_cleanups pass name
with build_ssa_passes.
From-SVN: r217125
2014-11-05 13:42:03 +01:00
|
|
|
NEXT_PASS (pass_build_ssa_passes);
|
|
|
|
PUSH_INSERT_PASSES_WITHIN (pass_build_ssa_passes)
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_fixup_cfg);
|
|
|
|
NEXT_PASS (pass_build_ssa);
|
2022-04-05 09:51:32 +02:00
|
|
|
NEXT_PASS (pass_walloca, /*strict_mode_p=*/true);
|
2021-05-05 19:07:39 +02:00
|
|
|
NEXT_PASS (pass_warn_printf);
|
2016-02-16 21:46:17 +01:00
|
|
|
NEXT_PASS (pass_warn_nonnull_compare);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_early_warn_uninitialized);
|
2022-02-03 22:51:46 +01:00
|
|
|
NEXT_PASS (pass_warn_access, /*early=*/true);
|
2019-02-14 08:31:14 +01:00
|
|
|
NEXT_PASS (pass_ubsan);
|
2015-03-27 05:02:28 +01:00
|
|
|
NEXT_PASS (pass_nothrow);
|
2016-12-02 18:05:10 +01:00
|
|
|
NEXT_PASS (pass_rebuild_cgraph_edges);
|
ipa-chkp.c: New.
gcc/
2014-11-05 Ilya Enkovich <ilya.enkovich@intel.com>
* ipa-chkp.c: New.
* ipa-chkp.h: New.
* tree-chkp.c: New.
* tree-chkp.h: New.
* tree-chkp-opt.c: New.
* rtl-chkp.c: New.
* rtl-chkp.h: New.
* Makefile.in (OBJS): Add ipa-chkp.o, rtl-chkp.o, tree-chkp.o
tree-chkp-opt.o.
(GTFILES): Add tree-chkp.c.
* mode-classes.def (MODE_POINTER_BOUNDS): New.
* tree.def (POINTER_BOUNDS_TYPE): New.
* genmodes.c (complete_mode): Support MODE_POINTER_BOUNDS.
(POINTER_BOUNDS_MODE): New.
(make_pointer_bounds_mode): New.
* machmode.h (POINTER_BOUNDS_MODE_P): New.
* stor-layout.c (int_mode_for_mode): Support MODE_POINTER_BOUNDS.
(layout_type): Support POINTER_BOUNDS_TYPE.
* tree-pretty-print.c (dump_generic_node): Support POINTER_BOUNDS_TYPE.
* tree-core.h (tree_index): Add TI_POINTER_BOUNDS_TYPE.
* tree.c (build_int_cst_wide): Support POINTER_BOUNDS_TYPE.
(type_contains_placeholder_1): Likewise.
(build_common_tree_nodes): Initialize
pointer_bounds_type_node.
* tree.h (POINTER_BOUNDS_TYPE_P): New.
(pointer_bounds_type_node): New.
(POINTER_BOUNDS_P): New.
(BOUNDED_TYPE_P): New.
(BOUNDED_P): New.
(CALL_WITH_BOUNDS_P): New.
* gimple.h (gf_mask): Add GF_CALL_WITH_BOUNDS.
(gimple_call_with_bounds_p): New.
(gimple_call_set_with_bounds): New.
(gimple_return_retbnd): New.
(gimple_return_set_retbnd): New
* gimple.c (gimple_build_return): Increase number of ops
for return statement.
(gimple_build_call_from_tree): Propagate CALL_WITH_BOUNDS_P
flag.
* gimple-pretty-print.c (dump_gimple_return): Print second op.
* rtl.h (CALL_EXPR_WITH_BOUNDS_P): New.
* gimplify.c (gimplify_init_constructor): Avoid infinite
loop during gimplification of bounds initializer.
* calls.c: Include tree-chkp.h, rtl-chkp.h, bitmap.h.
(special_function_p): Use original decl name when analyzing
instrumentation clone.
(arg_data): Add fields special_slot, pointer_arg and
pointer_offset.
(store_bounds): New.
(emit_call_1): Propagate instrumentation flag for CALL.
(initialize_argument_information): Compute pointer_arg,
pointer_offset and special_slot for pointer bounds arguments.
(finalize_must_preallocate): Preallocate when storing bounds
in bounds table.
(compute_argument_addresses): Skip pointer bounds.
(expand_call): Store bounds into tables separately. Return
result joined with resulting bounds.
* cfgexpand.c: Include tree-chkp.h, rtl-chkp.h.
(expand_call_stmt): Propagate bounds flag for CALL_EXPR.
(expand_return): Add returned bounds arg. Handle returned bounds.
(expand_gimple_stmt_1): Adjust to new expand_return signature.
(gimple_expand_cfg): Reset rtx bounds map.
* expr.c: Include tree-chkp.h, rtl-chkp.h.
(expand_assignment): Handle returned bounds.
(store_expr_with_bounds): New. Replaces store_expr with new bounds
target argument. Handle bounds returned by calls.
(store_expr): Now wraps store_expr_with_bounds.
* expr.h (store_expr_with_bounds): New.
* function.c: Include tree-chkp.h, rtl-chkp.h.
(bounds_parm_data): New.
(use_register_for_decl): Do not registerize decls used for bounds
stores and loads.
(assign_parms_augmented_arg_list): Add bounds of the result
structure pointer as the second argument.
(assign_parm_find_entry_rtl): Mark bounds are never passed on
the stack.
(assign_parm_is_stack_parm): Likewise.
(assign_parm_load_bounds): New.
(assign_bounds): New.
(assign_parms): Load bounds and determine a location for
returned bounds.
(diddle_return_value_1): New.
(diddle_return_value): Handle returned bounds.
* function.h (rtl_data): Add field for returned bounds.
* varasm.c: Include tree-chkp.h.
(output_constant): Support POINTER_BOUNDS_TYPE.
(output_constant_pool_2): Support MODE_POINTER_BOUNDS.
(ultimate_transparent_alias_target): Move up.
(make_decl_rtl): For instrumented function use
name of the original decl.
(assemble_start_function): Mark function as global
in case it is instrumentation clone of the global
function.
(do_assemble_alias): Follow transparent alias chain
for identifier. Check if original alias is public.
(maybe_assemble_visibility): Use visibility of the
original function for instrumented version.
(default_unique_section): Likewise.
* emit-rtl.c (immed_double_const): Support MODE_POINTER_BOUNDS.
(init_emit_once): Build pointer bounds zero constants.
* explow.c (trunc_int_for_mode): Support MODE_POINTER_BOUNDS.
* target.def (builtin_chkp_function): New.
(chkp_bound_type): New.
(chkp_bound_mode): New.
(chkp_make_bounds_constant): New.
(chkp_initialize_bounds): New.
(load_bounds_for_arg): New.
(store_bounds_for_arg): New.
(load_returned_bounds): New.
(store_returned_bounds): New.
(chkp_function_value_bounds): New.
(setup_incoming_vararg_bounds): New.
(function_arg): Update hook description with new possible return
value CONST_INT.
* targhooks.h (default_load_bounds_for_arg): New.
(default_store_bounds_for_arg): New.
(default_load_returned_bounds): New.
(default_store_returned_bounds): New.
(default_chkp_bound_type): New.
(default_chkp_bound_mode): New.
(default_builtin_chkp_function): New.
(default_chkp_function_value_bounds): New.
(default_chkp_make_bounds_constant): New.
(default_chkp_initialize_bounds): New.
(default_setup_incoming_vararg_bounds): New.
* targhooks.c (default_load_bounds_for_arg): New.
(default_store_bounds_for_arg): New.
(default_load_returned_bounds): New.
(default_store_returned_bounds): New.
(default_chkp_bound_type): New.
(default_chkp_bound_mode); New.
(default_builtin_chkp_function): New.
(default_chkp_function_value_bounds): New.
(default_chkp_make_bounds_constant): New.
(default_chkp_initialize_bounds): New.
(default_setup_incoming_vararg_bounds): New.
* builtin-types.def (BT_BND): New.
(BT_FN_PTR_CONST_PTR): New.
(BT_FN_CONST_PTR_CONST_PTR): New.
(BT_FN_BND_CONST_PTR): New.
(BT_FN_CONST_PTR_BND): New.
(BT_FN_PTR_CONST_PTR_SIZE): New.
(BT_FN_PTR_CONST_PTR_CONST_PTR): New.
(BT_FN_VOID_PTRPTR_CONST_PTR): New.
(BT_FN_VOID_CONST_PTR_SIZE): New.
(BT_FN_VOID_PTR_BND): New.
(BT_FN_CONST_PTR_CONST_PTR_CONST_PTR): New.
(BT_FN_BND_CONST_PTR_SIZE): New.
(BT_FN_PTR_CONST_PTR_CONST_PTR_SIZE): New.
(BT_FN_VOID_CONST_PTR_BND_CONST_PTR): New.
* chkp-builtins.def: New.
* builtins.def: include chkp-builtins.def.
(DEF_CHKP_BUILTIN): New.
* builtins.c: Include tree-chkp.h and rtl-chkp.h.
(expand_builtin): Support BUILT_IN_CHKP_INIT_PTR_BOUNDS,
BUILT_IN_CHKP_NULL_PTR_BOUNDS, BUILT_IN_CHKP_COPY_PTR_BOUNDS,
BUILT_IN_CHKP_CHECK_PTR_LBOUNDS, BUILT_IN_CHKP_CHECK_PTR_UBOUNDS,
BUILT_IN_CHKP_CHECK_PTR_BOUNDS, BUILT_IN_CHKP_SET_PTR_BOUNDS,
BUILT_IN_CHKP_NARROW_PTR_BOUNDS, BUILT_IN_CHKP_STORE_PTR_BOUNDS,
BUILT_IN_CHKP_GET_PTR_LBOUND, BUILT_IN_CHKP_GET_PTR_UBOUND,
BUILT_IN_CHKP_BNDMK, BUILT_IN_CHKP_BNDSTX, BUILT_IN_CHKP_BNDCL,
BUILT_IN_CHKP_BNDCU, BUILT_IN_CHKP_BNDLDX, BUILT_IN_CHKP_BNDRET,
BUILT_IN_CHKP_INTERSECT, BUILT_IN_CHKP_NARROW,
BUILT_IN_CHKP_EXTRACT_LOWER, BUILT_IN_CHKP_EXTRACT_UPPER.
(std_expand_builtin_va_start): Init bounds for va_list.
* cppbuiltin.c (define_builtin_macros_for_compilation_flags): Add
__CHKP__ macro when Pointer Bounds Checker is on.
* params.def (PARAM_CHKP_MAX_CTOR_SIZE): New.
* passes.def (pass_ipa_chkp_versioning): New.
(pass_early_local_passes): Renamed to pass_build_ssa_passes.
(pass_fixup_cfg): Moved to pass_chkp_instrumentation_passes.
(pass_chkp_instrumentation_passes): New.
(pass_ipa_chkp_produce_thunks): New.
(pass_local_optimization_passes): New.
(pass_chkp_opt): New.
* tree-pass.h (make_pass_ipa_chkp_versioning): New.
(make_pass_ipa_chkp_produce_thunks): New.
(make_pass_chkp): New.
(make_pass_chkp_opt): New.
(make_pass_early_local_passes): Renamed to ...
(make_pass_build_ssa_passes): This.
(make_pass_chkp_instrumentation_passes): New.
(make_pass_local_optimization_passes): New.
* passes.c (pass_manager::execute_early_local_passes): Execute
early passes in three steps.
(execute_all_early_local_passes): Renamed to ...
(execute_build_ssa_passes): This.
(pass_data_early_local_passes): Renamed to ...
(pass_data_build_ssa_passes): This.
(pass_early_local_passes): Renamed to ...
(pass_build_ssa_passes): This.
(pass_data_chkp_instrumentation_passes): New.
(pass_chkp_instrumentation_passes): New.
(pass_data_local_optimization_passes): New.
(pass_local_optimization_passes): New.
(make_pass_early_local_passes): Renamed to ...
(make_pass_build_ssa_passes): This.
(make_pass_chkp_instrumentation_passes): New.
(make_pass_local_optimization_passes): New.
* c-family/c.opt (fcheck-pointer-bounds): New.
(fchkp-check-incomplete-type): New.
(fchkp-zero-input-bounds-for-main): New.
(fchkp-first-field-has-own-bounds): New.
(fchkp-narrow-bounds): New.
(fchkp-narrow-to-innermost-array): New.
(fchkp-optimize): New.
(fchkp-use-fast-string-functions): New.
(fchkp-use-nochk-string-functions): New.
(fchkp-use-static-bounds): New.
(fchkp-use-static-const-bounds): New.
(fchkp-treat-zero-dynamic-size-as-infinite): New.
(fchkp-check-read): New.
(fchkp-check-write): New.
(fchkp-store-bounds): New.
(fchkp-instrument-calls): New.
(fchkp-instrument-marked-only): New.
(Wchkp): New.
* c-family/c-common.c (handle_bnd_variable_size_attribute): New.
(handle_bnd_legacy): New.
(handle_bnd_instrument): New.
(c_common_attribute_table): Add bnd_variable_size, bnd_legacy
and bnd_instrument. Fix documentation.
(c_common_format_attribute_table): Likewsie.
* toplev.c: include tree-chkp.h.
(process_options): Check Pointer Bounds Checker is supported.
(compile_file): Add chkp_finish_file call.
* ipa-cp.c (initialize_node_lattices): Use cgraph_local_p
to handle instrumentation clones properly.
(propagate_constants_accross_call): Do not propagate
through instrumentation thunks.
* ipa-pure-const.c (propagate_pure_const): Support
IPA_REF_CHKP.
* ipa-inline.c (early_inliner): Check edge has summary allocated.
* ipa-split.c: Include tree-chkp.h.
(find_retbnd): New.
(split_part_set_ssa_name_p): New.
(consider_split): Do not split retbnd and retval
producers.
(insert_bndret_call_after): new.
(split_function): Propagate Pointer Bounds Checker
instrumentation marks and handle returned bounds.
* tree-ssa-sccvn.h (vn_reference_op_struct): Transform opcode
into bit field and add with_bounds field.
* tree-ssa-sccvn.c (copy_reference_ops_from_call): Set
with_bounds field for instrumented calls.
* tree-ssa-pre.c (create_component_ref_by_pieces_1): Restore
CALL_WITH_BOUNDS_P flag for calls.
* tree-ssa-ccp.c: Include tree-chkp.h.
(insert_clobber_before_stack_restore): Handle
BUILT_IN_CHKP_BNDRET calls.
* tree-ssa-dce.c: Include tree-chkp.h.
(propagate_necessity): For free call fed by alloc check
bounds are also provided by the same alloc.
(eliminate_unnecessary_stmts): Handle BUILT_IN_CHKP_BNDRET
used by free calls.
* tree-inline.c: Include tree-chkp.h.
(declare_return_variable): Add arg holding
returned bounds slot. Create and initialize returned bounds var.
(remap_gimple_stmt): Handle returned bounds.
Return sequence of statements instead of a single statement.
(insert_init_stmt): Add declaration.
(remap_gimple_seq): Adjust to new remap_gimple_stmt signature.
(copy_bb): Adjust to changed return type of remap_gimple_stmt.
Properly handle bounds in va_arg_pack and va_arg_pack_len.
(expand_call_inline): Handle returned bounds. Add bounds copy
for generated mem to mem assignments.
* tree-inline.h (copy_body_data): Add fields retbnd and
assign_stmts.
* value-prof.c: Include tree-chkp.h.
(gimple_ic): Support returned bounds.
* ipa.c (cgraph_build_static_cdtor_1): Support contructors
with "chkp ctor" and "bnd_legacy" attributes.
(symtab_remove_unreachable_nodes): Keep initial values for
pointer bounds to be used for checks eliminations.
(process_references): Handle IPA_REF_CHKP.
(walk_polymorphic_call_targets): Likewise.
* ipa-visibility.c (cgraph_externally_visible_p): Mark
instrumented 'main' as externally visible.
(function_and_variable_visibility): Filter instrumentation
thunks.
* cgraph.h (cgraph_thunk_info): Add add_pointer_bounds_args
field.
(cgraph_node): Add instrumented_version, orig_decl and
instrumentation_clone fields.
(symtab_node::get_alias_target): Allow IPA_REF_CHKP reference.
(varpool_node): Add need_bounds_init field.
(cgraph_local_p): New.
* cgraph.c: Include tree-chkp.h.
(cgraph_node::remove): Fix instrumented_version
of the referenced node if any.
(cgraph_node::dump): Dump instrumentation_clone and
instrumented_version fields.
(cgraph_node::verify_node): Check correctness of IPA_REF_CHKP
references and instrumentation thunks.
(cgraph_can_remove_if_no_direct_calls_and_refs_p): Keep
all not instrumented instrumentation clones alive.
(cgraph_redirect_edge_call_stmt_to_callee): Support
returned bounds.
* cgraphbuild.c (rebuild_cgraph_edges): Rebuild IPA_REF_CHKP
reference.
(cgraph_rebuild_references): Likewise.
* cgraphunit.c: Include tree-chkp.h.
(assemble_thunks_and_aliases): Skip thunks calling instrumneted
function version.
(varpool_finalize_decl): Register statically initialized decls
in Pointer Bounds Checker.
(walk_polymorphic_call_targets): Do not mark generated call to
__builtin_unreachable as with_bounds.
(output_weakrefs): If there are both instrumented and original
versions, output only one of them.
(cgraph_node::expand_thunk): Set with_bounds flag
for created call statement.
* ipa-ref.h (ipa_ref_use): Add IPA_REF_CHKP.
(ipa_ref): increase size of use field.
* symtab.c (ipa_ref_use_name): Add element for IPA_REF_CHKP.
* varpool.c (dump_varpool_node): Dump need_bounds_init field.
(ctor_for_folding): Do not fold constant bounds vars.
* lto-streamer.h (LTO_minor_version): Change minor version from
0 to 1.
* lto-cgraph.c (compute_ltrans_boundary): Keep initial values for
pointer bounds.
(lto_output_node): Output instrumentation_clone,
thunk.add_pointer_bounds_args and orig_decl field.
(lto_output_ref): Adjust to new ipa_ref::use field size.
(input_overwrite_node): Read instrumentation_clone field.
(input_node): Read thunk.add_pointer_bounds_args and orig_decl
fields.
(input_ref): Adjust to new ipa_ref::use field size.
(input_cgraph_1): Compute instrumented_version fields and restore
IDENTIFIER_TRANSPARENT_ALIAS chains.
(lto_output_varpool_node): Output
need_bounds_init value.
(input_varpool_node): Read need_bounds_init value.
* lto-partition.c (add_symbol_to_partition_1): Keep original
and instrumented versions together.
(privatize_symbol_name): Restore transparent alias chain if required.
(add_references_to_partition): Add references to pointer bounds vars.
* dbxout.c (dbxout_type): Ignore POINTER_BOUNDS_TYPE.
* dwarf2out.c (gen_subprogram_die): Ignore bound args.
(gen_type_die_with_usage): Skip pointer bounds.
(dwarf2out_global_decl): Likewise.
(is_base_type): Support POINTER_BOUNDS_TYPE.
(gen_formal_types_die): Skip pointer bounds.
(gen_decl_die): Likewise.
* var-tracking.c (vt_add_function_parameters): Skip
bounds parameters.
* ipa-icf.c (sem_function::merge): Do not merge when instrumentation
thunk still exists.
(sem_variable::merge): Reset need_bounds_init flag.
* doc/extend.texi: Document Pointer Bounds Checker built-in functions
and attributes.
* doc/tm.texi.in (TARGET_LOAD_BOUNDS_FOR_ARG): New.
(TARGET_STORE_BOUNDS_FOR_ARG): New.
(TARGET_LOAD_RETURNED_BOUNDS): New.
(TARGET_STORE_RETURNED_BOUNDS): New.
(TARGET_CHKP_FUNCTION_VALUE_BOUNDS): New.
(TARGET_SETUP_INCOMING_VARARG_BOUNDS): New.
(TARGET_BUILTIN_CHKP_FUNCTION): New.
(TARGET_CHKP_BOUND_TYPE): New.
(TARGET_CHKP_BOUND_MODE): New.
(TARGET_CHKP_MAKE_BOUNDS_CONSTANT): New.
(TARGET_CHKP_INITIALIZE_BOUNDS): New.
* doc/tm.texi: Regenerated.
* doc/rtl.texi (MODE_POINTER_BOUNDS): New.
(BND32mode): New.
(BND64mode): New.
* doc/invoke.texi (-mmpx): New.
(-mno-mpx): New.
(chkp-max-ctor-size): New.
* config/i386/constraints.md (w): New.
(Ti): New.
(Tb): New.
* config/i386/i386-c.c (ix86_target_macros_internal): Add __MPX__.
* config/i386/i386-modes.def (BND32): New.
(BND64): New.
* config/i386/i386-protos.h (ix86_bnd_prefixed_insn_p): New.
* config/i386/i386.c: Include tree-chkp.h, rtl-chkp.h, tree-iterator.h.
(regclass_map): Add bound registers.
(dbx_register_map): Likewise.
(dbx64_register_map): Likewise.
(svr4_dbx_register_map): Likewise.
(isa_opts): Add -mmpx.
(PTA_MPX): New.
(ix86_option_override_internal): Support MPX ISA.
(ix86_conditional_register_usage): Support bound registers.
(ix86_code_end): Add MPX bnd prefix.
(output_set_got): Likewise.
(print_reg): Avoid prefixes for bound registers.
(ix86_print_operand): Add '!' (MPX bnd) print prefix support.
(ix86_print_operand_punct_valid_p): Likewise.
(ix86_print_operand_address): Support UNSPEC_BNDMK_ADDR and
UNSPEC_BNDLDX_ADDR.
(ix86_output_call_insn): Add MPX bnd prefix to branch instructions.
(ix86_class_likely_spilled_p): Add bound regs support.
(ix86_hard_regno_mode_ok): Likewise.
(x86_order_regs_for_local_alloc): Likewise.
(ix86_bnd_prefixed_insn_p): New.
(ix86_builtins): Add
IX86_BUILTIN_BNDMK, IX86_BUILTIN_BNDSTX,
IX86_BUILTIN_BNDLDX, IX86_BUILTIN_BNDCL,
IX86_BUILTIN_BNDCU, IX86_BUILTIN_BNDRET,
IX86_BUILTIN_BNDNARROW, IX86_BUILTIN_BNDINT,
IX86_BUILTIN_SIZEOF, IX86_BUILTIN_BNDLOWER,
IX86_BUILTIN_BNDUPPER.
(builtin_isa): Add leaf_p and nothrow_p fields.
(def_builtin): Initialize leaf_p and nothrow_p.
(ix86_add_new_builtins): Handle leaf_p and nothrow_p
flags.
(bdesc_mpx): New.
(bdesc_mpx_const): New.
(ix86_init_mpx_builtins): New.
(ix86_init_builtins): Call ix86_init_mpx_builtins.
(ix86_emit_cmove): New.
(ix86_emit_move_max): New.
(ix86_expand_builtin): Expand IX86_BUILTIN_BNDMK,
IX86_BUILTIN_BNDSTX, IX86_BUILTIN_BNDLDX,
IX86_BUILTIN_BNDCL, IX86_BUILTIN_BNDCU,
IX86_BUILTIN_BNDRET, IX86_BUILTIN_BNDNARROW,
IX86_BUILTIN_BNDINT, IX86_BUILTIN_SIZEOF,
IX86_BUILTIN_BNDLOWER, IX86_BUILTIN_BNDUPPER.
(ix86_function_value_bounds): New.
(ix86_builtin_mpx_function): New.
(ix86_get_arg_address_for_bt): New.
(ix86_load_bounds): New.
(ix86_store_bounds): New.
(ix86_load_returned_bounds): New.
(ix86_store_returned_bounds): New.
(ix86_mpx_bound_mode): New.
(ix86_make_bounds_constant): New.
(ix86_initialize_bounds):
(TARGET_LOAD_BOUNDS_FOR_ARG): New.
(TARGET_STORE_BOUNDS_FOR_ARG): New.
(TARGET_LOAD_RETURNED_BOUNDS): New.
(TARGET_STORE_RETURNED_BOUNDS): New.
(TARGET_CHKP_BOUND_MODE): New.
(TARGET_BUILTIN_CHKP_FUNCTION): New.
(TARGET_CHKP_FUNCTION_VALUE_BOUNDS): New.
(TARGET_CHKP_MAKE_BOUNDS_CONSTANT): New.
(TARGET_CHKP_INITIALIZE_BOUNDS): New.
(ix86_option_override_internal): Do not
support x32 with MPX.
(init_cumulative_args): Init stdarg, bnd_regno, bnds_in_bt
and force_bnd_pass.
(function_arg_advance_32): Return number of used integer
registers.
(function_arg_advance_64): Likewise.
(function_arg_advance_ms_64): Likewise.
(ix86_function_arg_advance): Handle pointer bounds.
(ix86_function_arg): Likewise.
(ix86_function_value_regno_p): Mark fisrt bounds registers as
possible function value.
(ix86_function_value_1): Handle pointer bounds type/mode
(ix86_return_in_memory): Likewise.
(ix86_print_operand): Analyse insn to decide abounf "bnd" prefix.
(ix86_expand_call): Generate returned bounds.
(ix86_setup_incoming_vararg_bounds): New.
(ix86_va_start): Initialize bounds for pointers in va_list.
(TARGET_SETUP_INCOMING_VARARG_BOUNDS): New.
* config/i386/i386.h (TARGET_MPX): New.
(TARGET_MPX_P): New.
(FIRST_PSEUDO_REGISTER): Fix to new value.
(FIXED_REGISTERS): Add bound registers.
(CALL_USED_REGISTERS): Likewise.
(REG_ALLOC_ORDER): Likewise.
(HARD_REGNO_NREGS): Likewise.
(VALID_BND_REG_MODE): New.
(FIRST_BND_REG): New.
(LAST_BND_REG): New.
(reg_class): Add BND_REGS.
(REG_CLASS_NAMES): Likewise.
(REG_CLASS_CONTENTS): Likewise.
(BND_REGNO_P): New.
(ANY_BND_REG_P): New.
(BNDmode): New.
(HI_REGISTER_NAMES): Add bound registers.
(ix86_args): Add bnd_regno, bnds_in_bt, force_bnd_pass and
stdarg fields.
* config/i386/i386.md (UNSPEC_BNDMK): New.
(UNSPEC_BNDMK_ADDR): New.
(UNSPEC_BNDSTX): New.
(UNSPEC_BNDLDX): New.
(UNSPEC_BNDLDX_ADDR): New.
(UNSPEC_BNDCL): New.
(UNSPEC_BNDCU): New.
(UNSPEC_BNDCN): New.
(UNSPEC_MPX_FENCE): New.
(UNSPEC_SIZEOF): New.
(BND0_REG): New.
(BND1_REG): New.
(type): Add mpxmov, mpxmk, mpxchk, mpxld, mpxst.
(length_immediate): Support mpxmov, mpxmk, mpxchk, mpxld, mpxst.
(prefix_rep): Check for bnd prefix.
(prefix_0f): Support mpxmov, mpxmk, mpxchk, mpxld, mpxst.
(length_nobnd): New.
(length): Use length_nobnd when specified.
(memory): Support mpxmov, mpxmk, mpxchk, mpxld, mpxst.
(BND): New.
(bnd_ptr): New.
(BNDCHECK): New.
(bndcheck): New.
(*jcc_1): Add MPX bnd prefix.
(*jcc_2): Likewise.
(jump): Likewise.
(*indirect_jump): Likewise.
(*tablejump_1): Likewise.
(simple_return_internal): Likewise.
(simple_return_internal_long): Likewise.
(simple_return_pop_internal): Likewise.
(simple_return_indirect_internal): Likewise.
(<mode>_mk): New.
(*<mode>_mk): New.
(mov<mode>): New.
(*mov<mode>_internal_mpx): New.
(<mode>_<bndcheck>): New.
(*<mode>_<bndcheck>): New.
(<mode>_ldx): New.
(*<mode>_ldx): New.
(<mode>_stx): New.
(*<mode>_stx): New.
move_size_reloc_<mode>): New.
* config/i386/predicates.md (address_mpx_no_base_operand): New.
(address_mpx_no_index_operand): New.
(bnd_mem_operator): New.
(symbol_operand): New.
(x86_64_immediate_size_operand): New.
* config/i386/i386.opt (mmpx): New.
* config/i386/i386-builtin-types.def (BND): New.
(ULONG): New.
(BND_FTYPE_PCVOID_ULONG): New.
(VOID_FTYPE_BND_PCVOID): New.
(VOID_FTYPE_PCVOID_PCVOID_BND): New.
(BND_FTYPE_PCVOID_PCVOID): New.
(BND_FTYPE_PCVOID): New.
(BND_FTYPE_BND_BND): New.
(PVOID_FTYPE_PVOID_PVOID_ULONG): New.
(PVOID_FTYPE_PCVOID_BND_ULONG): New.
(ULONG_FTYPE_VOID): New.
(PVOID_FTYPE_BND): New.
gcc/testsuite/
2014-11-05 Ilya Enkovich <ilya.enkovich@intel.com>
* gcc.target/i386/chkp-builtins-1.c: New.
* gcc.target/i386/chkp-builtins-2.c: New.
* gcc.target/i386/chkp-builtins-3.c: New.
* gcc.target/i386/chkp-builtins-4.c: New.
* gcc.target/i386/chkp-remove-bndint-1.c: New.
* gcc.target/i386/chkp-remove-bndint-2.c: New.
* gcc.target/i386/chkp-const-check-1.c: New.
* gcc.target/i386/chkp-const-check-2.c: New.
* gcc.target/i386/chkp-lifetime-1.c: New.
* gcc.dg/pr37858.c: Replace early_local_cleanups pass name
with build_ssa_passes.
From-SVN: r217125
2014-11-05 13:42:03 +01:00
|
|
|
POP_INSERT_PASSES ()
|
|
|
|
|
|
|
|
NEXT_PASS (pass_local_optimization_passes);
|
|
|
|
PUSH_INSERT_PASSES_WITHIN (pass_local_optimization_passes)
|
|
|
|
NEXT_PASS (pass_fixup_cfg);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_rebuild_cgraph_edges);
|
cgraphunit.c (symbol_table::process_new_functions): Update.
* cgraphunit.c (symbol_table::process_new_functions): Update.
* ipa-fnsummary.c (pass_data_inline_parameters): Remove.
(inline_generate_summary): Rename to ...
(ipa_fn_summary_generate): ... this one.
(inline_read_summary): Rename to ...
(ipa_fn_summary_read): ... this one.
(inline_write_summary): Rename to ...
(ipa_fn_summary_write): ... this one.
(inline_free_summary): Rename to ...
(ipa_free_fn_summary): ... this one.
(pass_data_local_fn_summary, pass_local_fn_summary,
make_pass_local_fn_summary, pass_data_ipa_free_fn_summary,
pass_ipa_free_fn_summary, make_pass_ipa_free_fn_summary,
pass_data_ipa_fn_summary, pass_ipa_fn_summary,
make_pass_ipa_fn_summary): New.
* ipa-fnsummary.h (inline_generate_summary, inline_read_summary,
inline_write_summary, inline_free_summary): Remove.
(ipa_free_fn_summary) : New.
* ipa-inline.c (ipa_inline): Update.
(pass_ipa_inline): Do not generate summaries.
* ipa.c (pass_data_ipa_free_fn_summary, pass_ipa_free_fn_summary):
Remove.
* passes.def: Replace pass_inline_parameters by pass_local_fn_summary
and add pass_ipa_fn_summary.
* tree-pass.h (make_pass_ipa_fn_summary, make_pass_local_fn_summary):
New.
(make_pass_inline_parameters): Remove.
* lto.c (do_whole_program_analysis): Replace inline_free_summary
by ipa_free_fn_summary.
* gcc.dg/ipa/ctor-empty-1.c: Update template.
* gcc.dg/ipa/inline-5.c: Likewise.
* gfortran.dg/pr48636.f90: Likewise.
From-SVN: r248375
2017-05-23 18:20:53 +02:00
|
|
|
NEXT_PASS (pass_local_fn_summary);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_early_inline);
|
2021-11-23 23:30:29 +01:00
|
|
|
NEXT_PASS (pass_warn_recursion);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_all_early_optimizations);
|
|
|
|
PUSH_INSERT_PASSES_WITHIN (pass_all_early_optimizations)
|
|
|
|
NEXT_PASS (pass_remove_cgraph_callee_edges);
|
passes: Fix up subobject __bos [PR101419]
The following testcase is miscompiled, because VN during cunrolli changes
__bos argument from address of a larger field to address of a smaller field
and so __builtin_object_size (, 1) then folds into smaller value than the
actually available size.
copy_reference_ops_from_ref has a hack for this, but it was using
cfun->after_inlining as a check whether the hack can be ignored, and
cunrolli is after_inlining.
This patch uses a property to make it exact (set at the end of objsz
pass that doesn't do insert_min_max_p) and additionally based on discussions
in the PR moves the objsz pass earlier after IPA.
2021-07-13 Jakub Jelinek <jakub@redhat.com>
Richard Biener <rguenther@suse.de>
PR tree-optimization/101419
* tree-pass.h (PROP_objsz): Define.
(make_pass_early_object_sizes): Declare.
* passes.def (pass_all_early_optimizations): Rename pass_object_sizes
there to pass_early_object_sizes, drop parameter.
(pass_all_optimizations): Move pass_object_sizes right after pass_ccp,
drop parameter, move pass_post_ipa_warn right after that.
* tree-object-size.c (pass_object_sizes::execute): Rename to...
(object_sizes_execute): ... this. Add insert_min_max_p argument.
(pass_data_object_sizes): Move after object_sizes_execute.
(pass_object_sizes): Likewise. In execute method call
object_sizes_execute, drop set_pass_param method and insert_min_max_p
non-static data member and its initializer in the ctor.
(pass_data_early_object_sizes, pass_early_object_sizes,
make_pass_early_object_sizes): New.
* tree-ssa-sccvn.c (copy_reference_ops_from_ref): Use
(cfun->curr_properties & PROP_objsz) instead of cfun->after_inlining.
* gcc.dg/builtin-object-size-10.c: Pass -fdump-tree-early_objsz-details
instead of -fdump-tree-objsz1-details in dg-options and adjust names
of dump file in scan-tree-dump.
* gcc.dg/pr101419.c: New test.
2021-07-13 11:04:22 +02:00
|
|
|
NEXT_PASS (pass_early_object_sizes);
|
2015-11-16 13:40:41 +01:00
|
|
|
/* Don't record nonzero bits before IPA to avoid
|
|
|
|
using too much memory. */
|
|
|
|
NEXT_PASS (pass_ccp, false /* nonzero_p */);
|
2013-07-18 20:55:48 +02:00
|
|
|
/* After CCP we rewrite no longer addressed locals into SSA
|
|
|
|
form if possible. */
|
|
|
|
NEXT_PASS (pass_forwprop);
|
2021-10-29 17:30:42 +02:00
|
|
|
NEXT_PASS (pass_early_thread_jumps, /*first=*/true);
|
2014-04-14 15:55:46 +02:00
|
|
|
NEXT_PASS (pass_sra_early);
|
2013-07-18 20:55:48 +02:00
|
|
|
/* pass_build_ealias is a dummy pass that ensures that we
|
|
|
|
execute TODO_rebuild_alias at this point. */
|
|
|
|
NEXT_PASS (pass_build_ealias);
|
2019-07-01 09:54:38 +02:00
|
|
|
NEXT_PASS (pass_fre, true /* may_iterate */);
|
2016-09-21 01:23:55 +02:00
|
|
|
NEXT_PASS (pass_early_vrp);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_merge_phi);
|
2015-04-22 03:32:14 +02:00
|
|
|
NEXT_PASS (pass_dse);
|
2021-01-16 09:17:38 +01:00
|
|
|
NEXT_PASS (pass_cd_dce, false /* update_address_taken_p */);
|
2018-10-23 13:34:56 +02:00
|
|
|
NEXT_PASS (pass_phiopt, true /* early_p */);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_tail_recursion);
|
2020-08-28 10:26:13 +02:00
|
|
|
NEXT_PASS (pass_if_to_switch);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_convert_switch);
|
ipa-chkp.c: New.
gcc/
2014-11-05 Ilya Enkovich <ilya.enkovich@intel.com>
* ipa-chkp.c: New.
* ipa-chkp.h: New.
* tree-chkp.c: New.
* tree-chkp.h: New.
* tree-chkp-opt.c: New.
* rtl-chkp.c: New.
* rtl-chkp.h: New.
* Makefile.in (OBJS): Add ipa-chkp.o, rtl-chkp.o, tree-chkp.o
tree-chkp-opt.o.
(GTFILES): Add tree-chkp.c.
* mode-classes.def (MODE_POINTER_BOUNDS): New.
* tree.def (POINTER_BOUNDS_TYPE): New.
* genmodes.c (complete_mode): Support MODE_POINTER_BOUNDS.
(POINTER_BOUNDS_MODE): New.
(make_pointer_bounds_mode): New.
* machmode.h (POINTER_BOUNDS_MODE_P): New.
* stor-layout.c (int_mode_for_mode): Support MODE_POINTER_BOUNDS.
(layout_type): Support POINTER_BOUNDS_TYPE.
* tree-pretty-print.c (dump_generic_node): Support POINTER_BOUNDS_TYPE.
* tree-core.h (tree_index): Add TI_POINTER_BOUNDS_TYPE.
* tree.c (build_int_cst_wide): Support POINTER_BOUNDS_TYPE.
(type_contains_placeholder_1): Likewise.
(build_common_tree_nodes): Initialize
pointer_bounds_type_node.
* tree.h (POINTER_BOUNDS_TYPE_P): New.
(pointer_bounds_type_node): New.
(POINTER_BOUNDS_P): New.
(BOUNDED_TYPE_P): New.
(BOUNDED_P): New.
(CALL_WITH_BOUNDS_P): New.
* gimple.h (gf_mask): Add GF_CALL_WITH_BOUNDS.
(gimple_call_with_bounds_p): New.
(gimple_call_set_with_bounds): New.
(gimple_return_retbnd): New.
(gimple_return_set_retbnd): New
* gimple.c (gimple_build_return): Increase number of ops
for return statement.
(gimple_build_call_from_tree): Propagate CALL_WITH_BOUNDS_P
flag.
* gimple-pretty-print.c (dump_gimple_return): Print second op.
* rtl.h (CALL_EXPR_WITH_BOUNDS_P): New.
* gimplify.c (gimplify_init_constructor): Avoid infinite
loop during gimplification of bounds initializer.
* calls.c: Include tree-chkp.h, rtl-chkp.h, bitmap.h.
(special_function_p): Use original decl name when analyzing
instrumentation clone.
(arg_data): Add fields special_slot, pointer_arg and
pointer_offset.
(store_bounds): New.
(emit_call_1): Propagate instrumentation flag for CALL.
(initialize_argument_information): Compute pointer_arg,
pointer_offset and special_slot for pointer bounds arguments.
(finalize_must_preallocate): Preallocate when storing bounds
in bounds table.
(compute_argument_addresses): Skip pointer bounds.
(expand_call): Store bounds into tables separately. Return
result joined with resulting bounds.
* cfgexpand.c: Include tree-chkp.h, rtl-chkp.h.
(expand_call_stmt): Propagate bounds flag for CALL_EXPR.
(expand_return): Add returned bounds arg. Handle returned bounds.
(expand_gimple_stmt_1): Adjust to new expand_return signature.
(gimple_expand_cfg): Reset rtx bounds map.
* expr.c: Include tree-chkp.h, rtl-chkp.h.
(expand_assignment): Handle returned bounds.
(store_expr_with_bounds): New. Replaces store_expr with new bounds
target argument. Handle bounds returned by calls.
(store_expr): Now wraps store_expr_with_bounds.
* expr.h (store_expr_with_bounds): New.
* function.c: Include tree-chkp.h, rtl-chkp.h.
(bounds_parm_data): New.
(use_register_for_decl): Do not registerize decls used for bounds
stores and loads.
(assign_parms_augmented_arg_list): Add bounds of the result
structure pointer as the second argument.
(assign_parm_find_entry_rtl): Mark bounds are never passed on
the stack.
(assign_parm_is_stack_parm): Likewise.
(assign_parm_load_bounds): New.
(assign_bounds): New.
(assign_parms): Load bounds and determine a location for
returned bounds.
(diddle_return_value_1): New.
(diddle_return_value): Handle returned bounds.
* function.h (rtl_data): Add field for returned bounds.
* varasm.c: Include tree-chkp.h.
(output_constant): Support POINTER_BOUNDS_TYPE.
(output_constant_pool_2): Support MODE_POINTER_BOUNDS.
(ultimate_transparent_alias_target): Move up.
(make_decl_rtl): For instrumented function use
name of the original decl.
(assemble_start_function): Mark function as global
in case it is instrumentation clone of the global
function.
(do_assemble_alias): Follow transparent alias chain
for identifier. Check if original alias is public.
(maybe_assemble_visibility): Use visibility of the
original function for instrumented version.
(default_unique_section): Likewise.
* emit-rtl.c (immed_double_const): Support MODE_POINTER_BOUNDS.
(init_emit_once): Build pointer bounds zero constants.
* explow.c (trunc_int_for_mode): Support MODE_POINTER_BOUNDS.
* target.def (builtin_chkp_function): New.
(chkp_bound_type): New.
(chkp_bound_mode): New.
(chkp_make_bounds_constant): New.
(chkp_initialize_bounds): New.
(load_bounds_for_arg): New.
(store_bounds_for_arg): New.
(load_returned_bounds): New.
(store_returned_bounds): New.
(chkp_function_value_bounds): New.
(setup_incoming_vararg_bounds): New.
(function_arg): Update hook description with new possible return
value CONST_INT.
* targhooks.h (default_load_bounds_for_arg): New.
(default_store_bounds_for_arg): New.
(default_load_returned_bounds): New.
(default_store_returned_bounds): New.
(default_chkp_bound_type): New.
(default_chkp_bound_mode): New.
(default_builtin_chkp_function): New.
(default_chkp_function_value_bounds): New.
(default_chkp_make_bounds_constant): New.
(default_chkp_initialize_bounds): New.
(default_setup_incoming_vararg_bounds): New.
* targhooks.c (default_load_bounds_for_arg): New.
(default_store_bounds_for_arg): New.
(default_load_returned_bounds): New.
(default_store_returned_bounds): New.
(default_chkp_bound_type): New.
(default_chkp_bound_mode); New.
(default_builtin_chkp_function): New.
(default_chkp_function_value_bounds): New.
(default_chkp_make_bounds_constant): New.
(default_chkp_initialize_bounds): New.
(default_setup_incoming_vararg_bounds): New.
* builtin-types.def (BT_BND): New.
(BT_FN_PTR_CONST_PTR): New.
(BT_FN_CONST_PTR_CONST_PTR): New.
(BT_FN_BND_CONST_PTR): New.
(BT_FN_CONST_PTR_BND): New.
(BT_FN_PTR_CONST_PTR_SIZE): New.
(BT_FN_PTR_CONST_PTR_CONST_PTR): New.
(BT_FN_VOID_PTRPTR_CONST_PTR): New.
(BT_FN_VOID_CONST_PTR_SIZE): New.
(BT_FN_VOID_PTR_BND): New.
(BT_FN_CONST_PTR_CONST_PTR_CONST_PTR): New.
(BT_FN_BND_CONST_PTR_SIZE): New.
(BT_FN_PTR_CONST_PTR_CONST_PTR_SIZE): New.
(BT_FN_VOID_CONST_PTR_BND_CONST_PTR): New.
* chkp-builtins.def: New.
* builtins.def: include chkp-builtins.def.
(DEF_CHKP_BUILTIN): New.
* builtins.c: Include tree-chkp.h and rtl-chkp.h.
(expand_builtin): Support BUILT_IN_CHKP_INIT_PTR_BOUNDS,
BUILT_IN_CHKP_NULL_PTR_BOUNDS, BUILT_IN_CHKP_COPY_PTR_BOUNDS,
BUILT_IN_CHKP_CHECK_PTR_LBOUNDS, BUILT_IN_CHKP_CHECK_PTR_UBOUNDS,
BUILT_IN_CHKP_CHECK_PTR_BOUNDS, BUILT_IN_CHKP_SET_PTR_BOUNDS,
BUILT_IN_CHKP_NARROW_PTR_BOUNDS, BUILT_IN_CHKP_STORE_PTR_BOUNDS,
BUILT_IN_CHKP_GET_PTR_LBOUND, BUILT_IN_CHKP_GET_PTR_UBOUND,
BUILT_IN_CHKP_BNDMK, BUILT_IN_CHKP_BNDSTX, BUILT_IN_CHKP_BNDCL,
BUILT_IN_CHKP_BNDCU, BUILT_IN_CHKP_BNDLDX, BUILT_IN_CHKP_BNDRET,
BUILT_IN_CHKP_INTERSECT, BUILT_IN_CHKP_NARROW,
BUILT_IN_CHKP_EXTRACT_LOWER, BUILT_IN_CHKP_EXTRACT_UPPER.
(std_expand_builtin_va_start): Init bounds for va_list.
* cppbuiltin.c (define_builtin_macros_for_compilation_flags): Add
__CHKP__ macro when Pointer Bounds Checker is on.
* params.def (PARAM_CHKP_MAX_CTOR_SIZE): New.
* passes.def (pass_ipa_chkp_versioning): New.
(pass_early_local_passes): Renamed to pass_build_ssa_passes.
(pass_fixup_cfg): Moved to pass_chkp_instrumentation_passes.
(pass_chkp_instrumentation_passes): New.
(pass_ipa_chkp_produce_thunks): New.
(pass_local_optimization_passes): New.
(pass_chkp_opt): New.
* tree-pass.h (make_pass_ipa_chkp_versioning): New.
(make_pass_ipa_chkp_produce_thunks): New.
(make_pass_chkp): New.
(make_pass_chkp_opt): New.
(make_pass_early_local_passes): Renamed to ...
(make_pass_build_ssa_passes): This.
(make_pass_chkp_instrumentation_passes): New.
(make_pass_local_optimization_passes): New.
* passes.c (pass_manager::execute_early_local_passes): Execute
early passes in three steps.
(execute_all_early_local_passes): Renamed to ...
(execute_build_ssa_passes): This.
(pass_data_early_local_passes): Renamed to ...
(pass_data_build_ssa_passes): This.
(pass_early_local_passes): Renamed to ...
(pass_build_ssa_passes): This.
(pass_data_chkp_instrumentation_passes): New.
(pass_chkp_instrumentation_passes): New.
(pass_data_local_optimization_passes): New.
(pass_local_optimization_passes): New.
(make_pass_early_local_passes): Renamed to ...
(make_pass_build_ssa_passes): This.
(make_pass_chkp_instrumentation_passes): New.
(make_pass_local_optimization_passes): New.
* c-family/c.opt (fcheck-pointer-bounds): New.
(fchkp-check-incomplete-type): New.
(fchkp-zero-input-bounds-for-main): New.
(fchkp-first-field-has-own-bounds): New.
(fchkp-narrow-bounds): New.
(fchkp-narrow-to-innermost-array): New.
(fchkp-optimize): New.
(fchkp-use-fast-string-functions): New.
(fchkp-use-nochk-string-functions): New.
(fchkp-use-static-bounds): New.
(fchkp-use-static-const-bounds): New.
(fchkp-treat-zero-dynamic-size-as-infinite): New.
(fchkp-check-read): New.
(fchkp-check-write): New.
(fchkp-store-bounds): New.
(fchkp-instrument-calls): New.
(fchkp-instrument-marked-only): New.
(Wchkp): New.
* c-family/c-common.c (handle_bnd_variable_size_attribute): New.
(handle_bnd_legacy): New.
(handle_bnd_instrument): New.
(c_common_attribute_table): Add bnd_variable_size, bnd_legacy
and bnd_instrument. Fix documentation.
(c_common_format_attribute_table): Likewsie.
* toplev.c: include tree-chkp.h.
(process_options): Check Pointer Bounds Checker is supported.
(compile_file): Add chkp_finish_file call.
* ipa-cp.c (initialize_node_lattices): Use cgraph_local_p
to handle instrumentation clones properly.
(propagate_constants_accross_call): Do not propagate
through instrumentation thunks.
* ipa-pure-const.c (propagate_pure_const): Support
IPA_REF_CHKP.
* ipa-inline.c (early_inliner): Check edge has summary allocated.
* ipa-split.c: Include tree-chkp.h.
(find_retbnd): New.
(split_part_set_ssa_name_p): New.
(consider_split): Do not split retbnd and retval
producers.
(insert_bndret_call_after): new.
(split_function): Propagate Pointer Bounds Checker
instrumentation marks and handle returned bounds.
* tree-ssa-sccvn.h (vn_reference_op_struct): Transform opcode
into bit field and add with_bounds field.
* tree-ssa-sccvn.c (copy_reference_ops_from_call): Set
with_bounds field for instrumented calls.
* tree-ssa-pre.c (create_component_ref_by_pieces_1): Restore
CALL_WITH_BOUNDS_P flag for calls.
* tree-ssa-ccp.c: Include tree-chkp.h.
(insert_clobber_before_stack_restore): Handle
BUILT_IN_CHKP_BNDRET calls.
* tree-ssa-dce.c: Include tree-chkp.h.
(propagate_necessity): For free call fed by alloc check
bounds are also provided by the same alloc.
(eliminate_unnecessary_stmts): Handle BUILT_IN_CHKP_BNDRET
used by free calls.
* tree-inline.c: Include tree-chkp.h.
(declare_return_variable): Add arg holding
returned bounds slot. Create and initialize returned bounds var.
(remap_gimple_stmt): Handle returned bounds.
Return sequence of statements instead of a single statement.
(insert_init_stmt): Add declaration.
(remap_gimple_seq): Adjust to new remap_gimple_stmt signature.
(copy_bb): Adjust to changed return type of remap_gimple_stmt.
Properly handle bounds in va_arg_pack and va_arg_pack_len.
(expand_call_inline): Handle returned bounds. Add bounds copy
for generated mem to mem assignments.
* tree-inline.h (copy_body_data): Add fields retbnd and
assign_stmts.
* value-prof.c: Include tree-chkp.h.
(gimple_ic): Support returned bounds.
* ipa.c (cgraph_build_static_cdtor_1): Support contructors
with "chkp ctor" and "bnd_legacy" attributes.
(symtab_remove_unreachable_nodes): Keep initial values for
pointer bounds to be used for checks eliminations.
(process_references): Handle IPA_REF_CHKP.
(walk_polymorphic_call_targets): Likewise.
* ipa-visibility.c (cgraph_externally_visible_p): Mark
instrumented 'main' as externally visible.
(function_and_variable_visibility): Filter instrumentation
thunks.
* cgraph.h (cgraph_thunk_info): Add add_pointer_bounds_args
field.
(cgraph_node): Add instrumented_version, orig_decl and
instrumentation_clone fields.
(symtab_node::get_alias_target): Allow IPA_REF_CHKP reference.
(varpool_node): Add need_bounds_init field.
(cgraph_local_p): New.
* cgraph.c: Include tree-chkp.h.
(cgraph_node::remove): Fix instrumented_version
of the referenced node if any.
(cgraph_node::dump): Dump instrumentation_clone and
instrumented_version fields.
(cgraph_node::verify_node): Check correctness of IPA_REF_CHKP
references and instrumentation thunks.
(cgraph_can_remove_if_no_direct_calls_and_refs_p): Keep
all not instrumented instrumentation clones alive.
(cgraph_redirect_edge_call_stmt_to_callee): Support
returned bounds.
* cgraphbuild.c (rebuild_cgraph_edges): Rebuild IPA_REF_CHKP
reference.
(cgraph_rebuild_references): Likewise.
* cgraphunit.c: Include tree-chkp.h.
(assemble_thunks_and_aliases): Skip thunks calling instrumneted
function version.
(varpool_finalize_decl): Register statically initialized decls
in Pointer Bounds Checker.
(walk_polymorphic_call_targets): Do not mark generated call to
__builtin_unreachable as with_bounds.
(output_weakrefs): If there are both instrumented and original
versions, output only one of them.
(cgraph_node::expand_thunk): Set with_bounds flag
for created call statement.
* ipa-ref.h (ipa_ref_use): Add IPA_REF_CHKP.
(ipa_ref): increase size of use field.
* symtab.c (ipa_ref_use_name): Add element for IPA_REF_CHKP.
* varpool.c (dump_varpool_node): Dump need_bounds_init field.
(ctor_for_folding): Do not fold constant bounds vars.
* lto-streamer.h (LTO_minor_version): Change minor version from
0 to 1.
* lto-cgraph.c (compute_ltrans_boundary): Keep initial values for
pointer bounds.
(lto_output_node): Output instrumentation_clone,
thunk.add_pointer_bounds_args and orig_decl field.
(lto_output_ref): Adjust to new ipa_ref::use field size.
(input_overwrite_node): Read instrumentation_clone field.
(input_node): Read thunk.add_pointer_bounds_args and orig_decl
fields.
(input_ref): Adjust to new ipa_ref::use field size.
(input_cgraph_1): Compute instrumented_version fields and restore
IDENTIFIER_TRANSPARENT_ALIAS chains.
(lto_output_varpool_node): Output
need_bounds_init value.
(input_varpool_node): Read need_bounds_init value.
* lto-partition.c (add_symbol_to_partition_1): Keep original
and instrumented versions together.
(privatize_symbol_name): Restore transparent alias chain if required.
(add_references_to_partition): Add references to pointer bounds vars.
* dbxout.c (dbxout_type): Ignore POINTER_BOUNDS_TYPE.
* dwarf2out.c (gen_subprogram_die): Ignore bound args.
(gen_type_die_with_usage): Skip pointer bounds.
(dwarf2out_global_decl): Likewise.
(is_base_type): Support POINTER_BOUNDS_TYPE.
(gen_formal_types_die): Skip pointer bounds.
(gen_decl_die): Likewise.
* var-tracking.c (vt_add_function_parameters): Skip
bounds parameters.
* ipa-icf.c (sem_function::merge): Do not merge when instrumentation
thunk still exists.
(sem_variable::merge): Reset need_bounds_init flag.
* doc/extend.texi: Document Pointer Bounds Checker built-in functions
and attributes.
* doc/tm.texi.in (TARGET_LOAD_BOUNDS_FOR_ARG): New.
(TARGET_STORE_BOUNDS_FOR_ARG): New.
(TARGET_LOAD_RETURNED_BOUNDS): New.
(TARGET_STORE_RETURNED_BOUNDS): New.
(TARGET_CHKP_FUNCTION_VALUE_BOUNDS): New.
(TARGET_SETUP_INCOMING_VARARG_BOUNDS): New.
(TARGET_BUILTIN_CHKP_FUNCTION): New.
(TARGET_CHKP_BOUND_TYPE): New.
(TARGET_CHKP_BOUND_MODE): New.
(TARGET_CHKP_MAKE_BOUNDS_CONSTANT): New.
(TARGET_CHKP_INITIALIZE_BOUNDS): New.
* doc/tm.texi: Regenerated.
* doc/rtl.texi (MODE_POINTER_BOUNDS): New.
(BND32mode): New.
(BND64mode): New.
* doc/invoke.texi (-mmpx): New.
(-mno-mpx): New.
(chkp-max-ctor-size): New.
* config/i386/constraints.md (w): New.
(Ti): New.
(Tb): New.
* config/i386/i386-c.c (ix86_target_macros_internal): Add __MPX__.
* config/i386/i386-modes.def (BND32): New.
(BND64): New.
* config/i386/i386-protos.h (ix86_bnd_prefixed_insn_p): New.
* config/i386/i386.c: Include tree-chkp.h, rtl-chkp.h, tree-iterator.h.
(regclass_map): Add bound registers.
(dbx_register_map): Likewise.
(dbx64_register_map): Likewise.
(svr4_dbx_register_map): Likewise.
(isa_opts): Add -mmpx.
(PTA_MPX): New.
(ix86_option_override_internal): Support MPX ISA.
(ix86_conditional_register_usage): Support bound registers.
(ix86_code_end): Add MPX bnd prefix.
(output_set_got): Likewise.
(print_reg): Avoid prefixes for bound registers.
(ix86_print_operand): Add '!' (MPX bnd) print prefix support.
(ix86_print_operand_punct_valid_p): Likewise.
(ix86_print_operand_address): Support UNSPEC_BNDMK_ADDR and
UNSPEC_BNDLDX_ADDR.
(ix86_output_call_insn): Add MPX bnd prefix to branch instructions.
(ix86_class_likely_spilled_p): Add bound regs support.
(ix86_hard_regno_mode_ok): Likewise.
(x86_order_regs_for_local_alloc): Likewise.
(ix86_bnd_prefixed_insn_p): New.
(ix86_builtins): Add
IX86_BUILTIN_BNDMK, IX86_BUILTIN_BNDSTX,
IX86_BUILTIN_BNDLDX, IX86_BUILTIN_BNDCL,
IX86_BUILTIN_BNDCU, IX86_BUILTIN_BNDRET,
IX86_BUILTIN_BNDNARROW, IX86_BUILTIN_BNDINT,
IX86_BUILTIN_SIZEOF, IX86_BUILTIN_BNDLOWER,
IX86_BUILTIN_BNDUPPER.
(builtin_isa): Add leaf_p and nothrow_p fields.
(def_builtin): Initialize leaf_p and nothrow_p.
(ix86_add_new_builtins): Handle leaf_p and nothrow_p
flags.
(bdesc_mpx): New.
(bdesc_mpx_const): New.
(ix86_init_mpx_builtins): New.
(ix86_init_builtins): Call ix86_init_mpx_builtins.
(ix86_emit_cmove): New.
(ix86_emit_move_max): New.
(ix86_expand_builtin): Expand IX86_BUILTIN_BNDMK,
IX86_BUILTIN_BNDSTX, IX86_BUILTIN_BNDLDX,
IX86_BUILTIN_BNDCL, IX86_BUILTIN_BNDCU,
IX86_BUILTIN_BNDRET, IX86_BUILTIN_BNDNARROW,
IX86_BUILTIN_BNDINT, IX86_BUILTIN_SIZEOF,
IX86_BUILTIN_BNDLOWER, IX86_BUILTIN_BNDUPPER.
(ix86_function_value_bounds): New.
(ix86_builtin_mpx_function): New.
(ix86_get_arg_address_for_bt): New.
(ix86_load_bounds): New.
(ix86_store_bounds): New.
(ix86_load_returned_bounds): New.
(ix86_store_returned_bounds): New.
(ix86_mpx_bound_mode): New.
(ix86_make_bounds_constant): New.
(ix86_initialize_bounds):
(TARGET_LOAD_BOUNDS_FOR_ARG): New.
(TARGET_STORE_BOUNDS_FOR_ARG): New.
(TARGET_LOAD_RETURNED_BOUNDS): New.
(TARGET_STORE_RETURNED_BOUNDS): New.
(TARGET_CHKP_BOUND_MODE): New.
(TARGET_BUILTIN_CHKP_FUNCTION): New.
(TARGET_CHKP_FUNCTION_VALUE_BOUNDS): New.
(TARGET_CHKP_MAKE_BOUNDS_CONSTANT): New.
(TARGET_CHKP_INITIALIZE_BOUNDS): New.
(ix86_option_override_internal): Do not
support x32 with MPX.
(init_cumulative_args): Init stdarg, bnd_regno, bnds_in_bt
and force_bnd_pass.
(function_arg_advance_32): Return number of used integer
registers.
(function_arg_advance_64): Likewise.
(function_arg_advance_ms_64): Likewise.
(ix86_function_arg_advance): Handle pointer bounds.
(ix86_function_arg): Likewise.
(ix86_function_value_regno_p): Mark fisrt bounds registers as
possible function value.
(ix86_function_value_1): Handle pointer bounds type/mode
(ix86_return_in_memory): Likewise.
(ix86_print_operand): Analyse insn to decide abounf "bnd" prefix.
(ix86_expand_call): Generate returned bounds.
(ix86_setup_incoming_vararg_bounds): New.
(ix86_va_start): Initialize bounds for pointers in va_list.
(TARGET_SETUP_INCOMING_VARARG_BOUNDS): New.
* config/i386/i386.h (TARGET_MPX): New.
(TARGET_MPX_P): New.
(FIRST_PSEUDO_REGISTER): Fix to new value.
(FIXED_REGISTERS): Add bound registers.
(CALL_USED_REGISTERS): Likewise.
(REG_ALLOC_ORDER): Likewise.
(HARD_REGNO_NREGS): Likewise.
(VALID_BND_REG_MODE): New.
(FIRST_BND_REG): New.
(LAST_BND_REG): New.
(reg_class): Add BND_REGS.
(REG_CLASS_NAMES): Likewise.
(REG_CLASS_CONTENTS): Likewise.
(BND_REGNO_P): New.
(ANY_BND_REG_P): New.
(BNDmode): New.
(HI_REGISTER_NAMES): Add bound registers.
(ix86_args): Add bnd_regno, bnds_in_bt, force_bnd_pass and
stdarg fields.
* config/i386/i386.md (UNSPEC_BNDMK): New.
(UNSPEC_BNDMK_ADDR): New.
(UNSPEC_BNDSTX): New.
(UNSPEC_BNDLDX): New.
(UNSPEC_BNDLDX_ADDR): New.
(UNSPEC_BNDCL): New.
(UNSPEC_BNDCU): New.
(UNSPEC_BNDCN): New.
(UNSPEC_MPX_FENCE): New.
(UNSPEC_SIZEOF): New.
(BND0_REG): New.
(BND1_REG): New.
(type): Add mpxmov, mpxmk, mpxchk, mpxld, mpxst.
(length_immediate): Support mpxmov, mpxmk, mpxchk, mpxld, mpxst.
(prefix_rep): Check for bnd prefix.
(prefix_0f): Support mpxmov, mpxmk, mpxchk, mpxld, mpxst.
(length_nobnd): New.
(length): Use length_nobnd when specified.
(memory): Support mpxmov, mpxmk, mpxchk, mpxld, mpxst.
(BND): New.
(bnd_ptr): New.
(BNDCHECK): New.
(bndcheck): New.
(*jcc_1): Add MPX bnd prefix.
(*jcc_2): Likewise.
(jump): Likewise.
(*indirect_jump): Likewise.
(*tablejump_1): Likewise.
(simple_return_internal): Likewise.
(simple_return_internal_long): Likewise.
(simple_return_pop_internal): Likewise.
(simple_return_indirect_internal): Likewise.
(<mode>_mk): New.
(*<mode>_mk): New.
(mov<mode>): New.
(*mov<mode>_internal_mpx): New.
(<mode>_<bndcheck>): New.
(*<mode>_<bndcheck>): New.
(<mode>_ldx): New.
(*<mode>_ldx): New.
(<mode>_stx): New.
(*<mode>_stx): New.
move_size_reloc_<mode>): New.
* config/i386/predicates.md (address_mpx_no_base_operand): New.
(address_mpx_no_index_operand): New.
(bnd_mem_operator): New.
(symbol_operand): New.
(x86_64_immediate_size_operand): New.
* config/i386/i386.opt (mmpx): New.
* config/i386/i386-builtin-types.def (BND): New.
(ULONG): New.
(BND_FTYPE_PCVOID_ULONG): New.
(VOID_FTYPE_BND_PCVOID): New.
(VOID_FTYPE_PCVOID_PCVOID_BND): New.
(BND_FTYPE_PCVOID_PCVOID): New.
(BND_FTYPE_PCVOID): New.
(BND_FTYPE_BND_BND): New.
(PVOID_FTYPE_PVOID_PVOID_ULONG): New.
(PVOID_FTYPE_PCVOID_BND_ULONG): New.
(ULONG_FTYPE_VOID): New.
(PVOID_FTYPE_BND): New.
gcc/testsuite/
2014-11-05 Ilya Enkovich <ilya.enkovich@intel.com>
* gcc.target/i386/chkp-builtins-1.c: New.
* gcc.target/i386/chkp-builtins-2.c: New.
* gcc.target/i386/chkp-builtins-3.c: New.
* gcc.target/i386/chkp-builtins-4.c: New.
* gcc.target/i386/chkp-remove-bndint-1.c: New.
* gcc.target/i386/chkp-remove-bndint-2.c: New.
* gcc.target/i386/chkp-const-check-1.c: New.
* gcc.target/i386/chkp-const-check-2.c: New.
* gcc.target/i386/chkp-lifetime-1.c: New.
* gcc.dg/pr37858.c: Replace early_local_cleanups pass name
with build_ssa_passes.
From-SVN: r217125
2014-11-05 13:42:03 +01:00
|
|
|
NEXT_PASS (pass_cleanup_eh);
|
|
|
|
NEXT_PASS (pass_profile);
|
|
|
|
NEXT_PASS (pass_local_pure_const);
|
2021-11-11 18:14:45 +01:00
|
|
|
NEXT_PASS (pass_modref);
|
2013-07-18 20:55:48 +02:00
|
|
|
/* Split functions creates parts that are not run through
|
|
|
|
early optimizations again. It is thus good idea to do this
|
ipa-chkp.c: New.
gcc/
2014-11-05 Ilya Enkovich <ilya.enkovich@intel.com>
* ipa-chkp.c: New.
* ipa-chkp.h: New.
* tree-chkp.c: New.
* tree-chkp.h: New.
* tree-chkp-opt.c: New.
* rtl-chkp.c: New.
* rtl-chkp.h: New.
* Makefile.in (OBJS): Add ipa-chkp.o, rtl-chkp.o, tree-chkp.o
tree-chkp-opt.o.
(GTFILES): Add tree-chkp.c.
* mode-classes.def (MODE_POINTER_BOUNDS): New.
* tree.def (POINTER_BOUNDS_TYPE): New.
* genmodes.c (complete_mode): Support MODE_POINTER_BOUNDS.
(POINTER_BOUNDS_MODE): New.
(make_pointer_bounds_mode): New.
* machmode.h (POINTER_BOUNDS_MODE_P): New.
* stor-layout.c (int_mode_for_mode): Support MODE_POINTER_BOUNDS.
(layout_type): Support POINTER_BOUNDS_TYPE.
* tree-pretty-print.c (dump_generic_node): Support POINTER_BOUNDS_TYPE.
* tree-core.h (tree_index): Add TI_POINTER_BOUNDS_TYPE.
* tree.c (build_int_cst_wide): Support POINTER_BOUNDS_TYPE.
(type_contains_placeholder_1): Likewise.
(build_common_tree_nodes): Initialize
pointer_bounds_type_node.
* tree.h (POINTER_BOUNDS_TYPE_P): New.
(pointer_bounds_type_node): New.
(POINTER_BOUNDS_P): New.
(BOUNDED_TYPE_P): New.
(BOUNDED_P): New.
(CALL_WITH_BOUNDS_P): New.
* gimple.h (gf_mask): Add GF_CALL_WITH_BOUNDS.
(gimple_call_with_bounds_p): New.
(gimple_call_set_with_bounds): New.
(gimple_return_retbnd): New.
(gimple_return_set_retbnd): New
* gimple.c (gimple_build_return): Increase number of ops
for return statement.
(gimple_build_call_from_tree): Propagate CALL_WITH_BOUNDS_P
flag.
* gimple-pretty-print.c (dump_gimple_return): Print second op.
* rtl.h (CALL_EXPR_WITH_BOUNDS_P): New.
* gimplify.c (gimplify_init_constructor): Avoid infinite
loop during gimplification of bounds initializer.
* calls.c: Include tree-chkp.h, rtl-chkp.h, bitmap.h.
(special_function_p): Use original decl name when analyzing
instrumentation clone.
(arg_data): Add fields special_slot, pointer_arg and
pointer_offset.
(store_bounds): New.
(emit_call_1): Propagate instrumentation flag for CALL.
(initialize_argument_information): Compute pointer_arg,
pointer_offset and special_slot for pointer bounds arguments.
(finalize_must_preallocate): Preallocate when storing bounds
in bounds table.
(compute_argument_addresses): Skip pointer bounds.
(expand_call): Store bounds into tables separately. Return
result joined with resulting bounds.
* cfgexpand.c: Include tree-chkp.h, rtl-chkp.h.
(expand_call_stmt): Propagate bounds flag for CALL_EXPR.
(expand_return): Add returned bounds arg. Handle returned bounds.
(expand_gimple_stmt_1): Adjust to new expand_return signature.
(gimple_expand_cfg): Reset rtx bounds map.
* expr.c: Include tree-chkp.h, rtl-chkp.h.
(expand_assignment): Handle returned bounds.
(store_expr_with_bounds): New. Replaces store_expr with new bounds
target argument. Handle bounds returned by calls.
(store_expr): Now wraps store_expr_with_bounds.
* expr.h (store_expr_with_bounds): New.
* function.c: Include tree-chkp.h, rtl-chkp.h.
(bounds_parm_data): New.
(use_register_for_decl): Do not registerize decls used for bounds
stores and loads.
(assign_parms_augmented_arg_list): Add bounds of the result
structure pointer as the second argument.
(assign_parm_find_entry_rtl): Mark bounds are never passed on
the stack.
(assign_parm_is_stack_parm): Likewise.
(assign_parm_load_bounds): New.
(assign_bounds): New.
(assign_parms): Load bounds and determine a location for
returned bounds.
(diddle_return_value_1): New.
(diddle_return_value): Handle returned bounds.
* function.h (rtl_data): Add field for returned bounds.
* varasm.c: Include tree-chkp.h.
(output_constant): Support POINTER_BOUNDS_TYPE.
(output_constant_pool_2): Support MODE_POINTER_BOUNDS.
(ultimate_transparent_alias_target): Move up.
(make_decl_rtl): For instrumented function use
name of the original decl.
(assemble_start_function): Mark function as global
in case it is instrumentation clone of the global
function.
(do_assemble_alias): Follow transparent alias chain
for identifier. Check if original alias is public.
(maybe_assemble_visibility): Use visibility of the
original function for instrumented version.
(default_unique_section): Likewise.
* emit-rtl.c (immed_double_const): Support MODE_POINTER_BOUNDS.
(init_emit_once): Build pointer bounds zero constants.
* explow.c (trunc_int_for_mode): Support MODE_POINTER_BOUNDS.
* target.def (builtin_chkp_function): New.
(chkp_bound_type): New.
(chkp_bound_mode): New.
(chkp_make_bounds_constant): New.
(chkp_initialize_bounds): New.
(load_bounds_for_arg): New.
(store_bounds_for_arg): New.
(load_returned_bounds): New.
(store_returned_bounds): New.
(chkp_function_value_bounds): New.
(setup_incoming_vararg_bounds): New.
(function_arg): Update hook description with new possible return
value CONST_INT.
* targhooks.h (default_load_bounds_for_arg): New.
(default_store_bounds_for_arg): New.
(default_load_returned_bounds): New.
(default_store_returned_bounds): New.
(default_chkp_bound_type): New.
(default_chkp_bound_mode): New.
(default_builtin_chkp_function): New.
(default_chkp_function_value_bounds): New.
(default_chkp_make_bounds_constant): New.
(default_chkp_initialize_bounds): New.
(default_setup_incoming_vararg_bounds): New.
* targhooks.c (default_load_bounds_for_arg): New.
(default_store_bounds_for_arg): New.
(default_load_returned_bounds): New.
(default_store_returned_bounds): New.
(default_chkp_bound_type): New.
(default_chkp_bound_mode); New.
(default_builtin_chkp_function): New.
(default_chkp_function_value_bounds): New.
(default_chkp_make_bounds_constant): New.
(default_chkp_initialize_bounds): New.
(default_setup_incoming_vararg_bounds): New.
* builtin-types.def (BT_BND): New.
(BT_FN_PTR_CONST_PTR): New.
(BT_FN_CONST_PTR_CONST_PTR): New.
(BT_FN_BND_CONST_PTR): New.
(BT_FN_CONST_PTR_BND): New.
(BT_FN_PTR_CONST_PTR_SIZE): New.
(BT_FN_PTR_CONST_PTR_CONST_PTR): New.
(BT_FN_VOID_PTRPTR_CONST_PTR): New.
(BT_FN_VOID_CONST_PTR_SIZE): New.
(BT_FN_VOID_PTR_BND): New.
(BT_FN_CONST_PTR_CONST_PTR_CONST_PTR): New.
(BT_FN_BND_CONST_PTR_SIZE): New.
(BT_FN_PTR_CONST_PTR_CONST_PTR_SIZE): New.
(BT_FN_VOID_CONST_PTR_BND_CONST_PTR): New.
* chkp-builtins.def: New.
* builtins.def: include chkp-builtins.def.
(DEF_CHKP_BUILTIN): New.
* builtins.c: Include tree-chkp.h and rtl-chkp.h.
(expand_builtin): Support BUILT_IN_CHKP_INIT_PTR_BOUNDS,
BUILT_IN_CHKP_NULL_PTR_BOUNDS, BUILT_IN_CHKP_COPY_PTR_BOUNDS,
BUILT_IN_CHKP_CHECK_PTR_LBOUNDS, BUILT_IN_CHKP_CHECK_PTR_UBOUNDS,
BUILT_IN_CHKP_CHECK_PTR_BOUNDS, BUILT_IN_CHKP_SET_PTR_BOUNDS,
BUILT_IN_CHKP_NARROW_PTR_BOUNDS, BUILT_IN_CHKP_STORE_PTR_BOUNDS,
BUILT_IN_CHKP_GET_PTR_LBOUND, BUILT_IN_CHKP_GET_PTR_UBOUND,
BUILT_IN_CHKP_BNDMK, BUILT_IN_CHKP_BNDSTX, BUILT_IN_CHKP_BNDCL,
BUILT_IN_CHKP_BNDCU, BUILT_IN_CHKP_BNDLDX, BUILT_IN_CHKP_BNDRET,
BUILT_IN_CHKP_INTERSECT, BUILT_IN_CHKP_NARROW,
BUILT_IN_CHKP_EXTRACT_LOWER, BUILT_IN_CHKP_EXTRACT_UPPER.
(std_expand_builtin_va_start): Init bounds for va_list.
* cppbuiltin.c (define_builtin_macros_for_compilation_flags): Add
__CHKP__ macro when Pointer Bounds Checker is on.
* params.def (PARAM_CHKP_MAX_CTOR_SIZE): New.
* passes.def (pass_ipa_chkp_versioning): New.
(pass_early_local_passes): Renamed to pass_build_ssa_passes.
(pass_fixup_cfg): Moved to pass_chkp_instrumentation_passes.
(pass_chkp_instrumentation_passes): New.
(pass_ipa_chkp_produce_thunks): New.
(pass_local_optimization_passes): New.
(pass_chkp_opt): New.
* tree-pass.h (make_pass_ipa_chkp_versioning): New.
(make_pass_ipa_chkp_produce_thunks): New.
(make_pass_chkp): New.
(make_pass_chkp_opt): New.
(make_pass_early_local_passes): Renamed to ...
(make_pass_build_ssa_passes): This.
(make_pass_chkp_instrumentation_passes): New.
(make_pass_local_optimization_passes): New.
* passes.c (pass_manager::execute_early_local_passes): Execute
early passes in three steps.
(execute_all_early_local_passes): Renamed to ...
(execute_build_ssa_passes): This.
(pass_data_early_local_passes): Renamed to ...
(pass_data_build_ssa_passes): This.
(pass_early_local_passes): Renamed to ...
(pass_build_ssa_passes): This.
(pass_data_chkp_instrumentation_passes): New.
(pass_chkp_instrumentation_passes): New.
(pass_data_local_optimization_passes): New.
(pass_local_optimization_passes): New.
(make_pass_early_local_passes): Renamed to ...
(make_pass_build_ssa_passes): This.
(make_pass_chkp_instrumentation_passes): New.
(make_pass_local_optimization_passes): New.
* c-family/c.opt (fcheck-pointer-bounds): New.
(fchkp-check-incomplete-type): New.
(fchkp-zero-input-bounds-for-main): New.
(fchkp-first-field-has-own-bounds): New.
(fchkp-narrow-bounds): New.
(fchkp-narrow-to-innermost-array): New.
(fchkp-optimize): New.
(fchkp-use-fast-string-functions): New.
(fchkp-use-nochk-string-functions): New.
(fchkp-use-static-bounds): New.
(fchkp-use-static-const-bounds): New.
(fchkp-treat-zero-dynamic-size-as-infinite): New.
(fchkp-check-read): New.
(fchkp-check-write): New.
(fchkp-store-bounds): New.
(fchkp-instrument-calls): New.
(fchkp-instrument-marked-only): New.
(Wchkp): New.
* c-family/c-common.c (handle_bnd_variable_size_attribute): New.
(handle_bnd_legacy): New.
(handle_bnd_instrument): New.
(c_common_attribute_table): Add bnd_variable_size, bnd_legacy
and bnd_instrument. Fix documentation.
(c_common_format_attribute_table): Likewsie.
* toplev.c: include tree-chkp.h.
(process_options): Check Pointer Bounds Checker is supported.
(compile_file): Add chkp_finish_file call.
* ipa-cp.c (initialize_node_lattices): Use cgraph_local_p
to handle instrumentation clones properly.
(propagate_constants_accross_call): Do not propagate
through instrumentation thunks.
* ipa-pure-const.c (propagate_pure_const): Support
IPA_REF_CHKP.
* ipa-inline.c (early_inliner): Check edge has summary allocated.
* ipa-split.c: Include tree-chkp.h.
(find_retbnd): New.
(split_part_set_ssa_name_p): New.
(consider_split): Do not split retbnd and retval
producers.
(insert_bndret_call_after): new.
(split_function): Propagate Pointer Bounds Checker
instrumentation marks and handle returned bounds.
* tree-ssa-sccvn.h (vn_reference_op_struct): Transform opcode
into bit field and add with_bounds field.
* tree-ssa-sccvn.c (copy_reference_ops_from_call): Set
with_bounds field for instrumented calls.
* tree-ssa-pre.c (create_component_ref_by_pieces_1): Restore
CALL_WITH_BOUNDS_P flag for calls.
* tree-ssa-ccp.c: Include tree-chkp.h.
(insert_clobber_before_stack_restore): Handle
BUILT_IN_CHKP_BNDRET calls.
* tree-ssa-dce.c: Include tree-chkp.h.
(propagate_necessity): For free call fed by alloc check
bounds are also provided by the same alloc.
(eliminate_unnecessary_stmts): Handle BUILT_IN_CHKP_BNDRET
used by free calls.
* tree-inline.c: Include tree-chkp.h.
(declare_return_variable): Add arg holding
returned bounds slot. Create and initialize returned bounds var.
(remap_gimple_stmt): Handle returned bounds.
Return sequence of statements instead of a single statement.
(insert_init_stmt): Add declaration.
(remap_gimple_seq): Adjust to new remap_gimple_stmt signature.
(copy_bb): Adjust to changed return type of remap_gimple_stmt.
Properly handle bounds in va_arg_pack and va_arg_pack_len.
(expand_call_inline): Handle returned bounds. Add bounds copy
for generated mem to mem assignments.
* tree-inline.h (copy_body_data): Add fields retbnd and
assign_stmts.
* value-prof.c: Include tree-chkp.h.
(gimple_ic): Support returned bounds.
* ipa.c (cgraph_build_static_cdtor_1): Support contructors
with "chkp ctor" and "bnd_legacy" attributes.
(symtab_remove_unreachable_nodes): Keep initial values for
pointer bounds to be used for checks eliminations.
(process_references): Handle IPA_REF_CHKP.
(walk_polymorphic_call_targets): Likewise.
* ipa-visibility.c (cgraph_externally_visible_p): Mark
instrumented 'main' as externally visible.
(function_and_variable_visibility): Filter instrumentation
thunks.
* cgraph.h (cgraph_thunk_info): Add add_pointer_bounds_args
field.
(cgraph_node): Add instrumented_version, orig_decl and
instrumentation_clone fields.
(symtab_node::get_alias_target): Allow IPA_REF_CHKP reference.
(varpool_node): Add need_bounds_init field.
(cgraph_local_p): New.
* cgraph.c: Include tree-chkp.h.
(cgraph_node::remove): Fix instrumented_version
of the referenced node if any.
(cgraph_node::dump): Dump instrumentation_clone and
instrumented_version fields.
(cgraph_node::verify_node): Check correctness of IPA_REF_CHKP
references and instrumentation thunks.
(cgraph_can_remove_if_no_direct_calls_and_refs_p): Keep
all not instrumented instrumentation clones alive.
(cgraph_redirect_edge_call_stmt_to_callee): Support
returned bounds.
* cgraphbuild.c (rebuild_cgraph_edges): Rebuild IPA_REF_CHKP
reference.
(cgraph_rebuild_references): Likewise.
* cgraphunit.c: Include tree-chkp.h.
(assemble_thunks_and_aliases): Skip thunks calling instrumneted
function version.
(varpool_finalize_decl): Register statically initialized decls
in Pointer Bounds Checker.
(walk_polymorphic_call_targets): Do not mark generated call to
__builtin_unreachable as with_bounds.
(output_weakrefs): If there are both instrumented and original
versions, output only one of them.
(cgraph_node::expand_thunk): Set with_bounds flag
for created call statement.
* ipa-ref.h (ipa_ref_use): Add IPA_REF_CHKP.
(ipa_ref): increase size of use field.
* symtab.c (ipa_ref_use_name): Add element for IPA_REF_CHKP.
* varpool.c (dump_varpool_node): Dump need_bounds_init field.
(ctor_for_folding): Do not fold constant bounds vars.
* lto-streamer.h (LTO_minor_version): Change minor version from
0 to 1.
* lto-cgraph.c (compute_ltrans_boundary): Keep initial values for
pointer bounds.
(lto_output_node): Output instrumentation_clone,
thunk.add_pointer_bounds_args and orig_decl field.
(lto_output_ref): Adjust to new ipa_ref::use field size.
(input_overwrite_node): Read instrumentation_clone field.
(input_node): Read thunk.add_pointer_bounds_args and orig_decl
fields.
(input_ref): Adjust to new ipa_ref::use field size.
(input_cgraph_1): Compute instrumented_version fields and restore
IDENTIFIER_TRANSPARENT_ALIAS chains.
(lto_output_varpool_node): Output
need_bounds_init value.
(input_varpool_node): Read need_bounds_init value.
* lto-partition.c (add_symbol_to_partition_1): Keep original
and instrumented versions together.
(privatize_symbol_name): Restore transparent alias chain if required.
(add_references_to_partition): Add references to pointer bounds vars.
* dbxout.c (dbxout_type): Ignore POINTER_BOUNDS_TYPE.
* dwarf2out.c (gen_subprogram_die): Ignore bound args.
(gen_type_die_with_usage): Skip pointer bounds.
(dwarf2out_global_decl): Likewise.
(is_base_type): Support POINTER_BOUNDS_TYPE.
(gen_formal_types_die): Skip pointer bounds.
(gen_decl_die): Likewise.
* var-tracking.c (vt_add_function_parameters): Skip
bounds parameters.
* ipa-icf.c (sem_function::merge): Do not merge when instrumentation
thunk still exists.
(sem_variable::merge): Reset need_bounds_init flag.
* doc/extend.texi: Document Pointer Bounds Checker built-in functions
and attributes.
* doc/tm.texi.in (TARGET_LOAD_BOUNDS_FOR_ARG): New.
(TARGET_STORE_BOUNDS_FOR_ARG): New.
(TARGET_LOAD_RETURNED_BOUNDS): New.
(TARGET_STORE_RETURNED_BOUNDS): New.
(TARGET_CHKP_FUNCTION_VALUE_BOUNDS): New.
(TARGET_SETUP_INCOMING_VARARG_BOUNDS): New.
(TARGET_BUILTIN_CHKP_FUNCTION): New.
(TARGET_CHKP_BOUND_TYPE): New.
(TARGET_CHKP_BOUND_MODE): New.
(TARGET_CHKP_MAKE_BOUNDS_CONSTANT): New.
(TARGET_CHKP_INITIALIZE_BOUNDS): New.
* doc/tm.texi: Regenerated.
* doc/rtl.texi (MODE_POINTER_BOUNDS): New.
(BND32mode): New.
(BND64mode): New.
* doc/invoke.texi (-mmpx): New.
(-mno-mpx): New.
(chkp-max-ctor-size): New.
* config/i386/constraints.md (w): New.
(Ti): New.
(Tb): New.
* config/i386/i386-c.c (ix86_target_macros_internal): Add __MPX__.
* config/i386/i386-modes.def (BND32): New.
(BND64): New.
* config/i386/i386-protos.h (ix86_bnd_prefixed_insn_p): New.
* config/i386/i386.c: Include tree-chkp.h, rtl-chkp.h, tree-iterator.h.
(regclass_map): Add bound registers.
(dbx_register_map): Likewise.
(dbx64_register_map): Likewise.
(svr4_dbx_register_map): Likewise.
(isa_opts): Add -mmpx.
(PTA_MPX): New.
(ix86_option_override_internal): Support MPX ISA.
(ix86_conditional_register_usage): Support bound registers.
(ix86_code_end): Add MPX bnd prefix.
(output_set_got): Likewise.
(print_reg): Avoid prefixes for bound registers.
(ix86_print_operand): Add '!' (MPX bnd) print prefix support.
(ix86_print_operand_punct_valid_p): Likewise.
(ix86_print_operand_address): Support UNSPEC_BNDMK_ADDR and
UNSPEC_BNDLDX_ADDR.
(ix86_output_call_insn): Add MPX bnd prefix to branch instructions.
(ix86_class_likely_spilled_p): Add bound regs support.
(ix86_hard_regno_mode_ok): Likewise.
(x86_order_regs_for_local_alloc): Likewise.
(ix86_bnd_prefixed_insn_p): New.
(ix86_builtins): Add
IX86_BUILTIN_BNDMK, IX86_BUILTIN_BNDSTX,
IX86_BUILTIN_BNDLDX, IX86_BUILTIN_BNDCL,
IX86_BUILTIN_BNDCU, IX86_BUILTIN_BNDRET,
IX86_BUILTIN_BNDNARROW, IX86_BUILTIN_BNDINT,
IX86_BUILTIN_SIZEOF, IX86_BUILTIN_BNDLOWER,
IX86_BUILTIN_BNDUPPER.
(builtin_isa): Add leaf_p and nothrow_p fields.
(def_builtin): Initialize leaf_p and nothrow_p.
(ix86_add_new_builtins): Handle leaf_p and nothrow_p
flags.
(bdesc_mpx): New.
(bdesc_mpx_const): New.
(ix86_init_mpx_builtins): New.
(ix86_init_builtins): Call ix86_init_mpx_builtins.
(ix86_emit_cmove): New.
(ix86_emit_move_max): New.
(ix86_expand_builtin): Expand IX86_BUILTIN_BNDMK,
IX86_BUILTIN_BNDSTX, IX86_BUILTIN_BNDLDX,
IX86_BUILTIN_BNDCL, IX86_BUILTIN_BNDCU,
IX86_BUILTIN_BNDRET, IX86_BUILTIN_BNDNARROW,
IX86_BUILTIN_BNDINT, IX86_BUILTIN_SIZEOF,
IX86_BUILTIN_BNDLOWER, IX86_BUILTIN_BNDUPPER.
(ix86_function_value_bounds): New.
(ix86_builtin_mpx_function): New.
(ix86_get_arg_address_for_bt): New.
(ix86_load_bounds): New.
(ix86_store_bounds): New.
(ix86_load_returned_bounds): New.
(ix86_store_returned_bounds): New.
(ix86_mpx_bound_mode): New.
(ix86_make_bounds_constant): New.
(ix86_initialize_bounds):
(TARGET_LOAD_BOUNDS_FOR_ARG): New.
(TARGET_STORE_BOUNDS_FOR_ARG): New.
(TARGET_LOAD_RETURNED_BOUNDS): New.
(TARGET_STORE_RETURNED_BOUNDS): New.
(TARGET_CHKP_BOUND_MODE): New.
(TARGET_BUILTIN_CHKP_FUNCTION): New.
(TARGET_CHKP_FUNCTION_VALUE_BOUNDS): New.
(TARGET_CHKP_MAKE_BOUNDS_CONSTANT): New.
(TARGET_CHKP_INITIALIZE_BOUNDS): New.
(ix86_option_override_internal): Do not
support x32 with MPX.
(init_cumulative_args): Init stdarg, bnd_regno, bnds_in_bt
and force_bnd_pass.
(function_arg_advance_32): Return number of used integer
registers.
(function_arg_advance_64): Likewise.
(function_arg_advance_ms_64): Likewise.
(ix86_function_arg_advance): Handle pointer bounds.
(ix86_function_arg): Likewise.
(ix86_function_value_regno_p): Mark fisrt bounds registers as
possible function value.
(ix86_function_value_1): Handle pointer bounds type/mode
(ix86_return_in_memory): Likewise.
(ix86_print_operand): Analyse insn to decide abounf "bnd" prefix.
(ix86_expand_call): Generate returned bounds.
(ix86_setup_incoming_vararg_bounds): New.
(ix86_va_start): Initialize bounds for pointers in va_list.
(TARGET_SETUP_INCOMING_VARARG_BOUNDS): New.
* config/i386/i386.h (TARGET_MPX): New.
(TARGET_MPX_P): New.
(FIRST_PSEUDO_REGISTER): Fix to new value.
(FIXED_REGISTERS): Add bound registers.
(CALL_USED_REGISTERS): Likewise.
(REG_ALLOC_ORDER): Likewise.
(HARD_REGNO_NREGS): Likewise.
(VALID_BND_REG_MODE): New.
(FIRST_BND_REG): New.
(LAST_BND_REG): New.
(reg_class): Add BND_REGS.
(REG_CLASS_NAMES): Likewise.
(REG_CLASS_CONTENTS): Likewise.
(BND_REGNO_P): New.
(ANY_BND_REG_P): New.
(BNDmode): New.
(HI_REGISTER_NAMES): Add bound registers.
(ix86_args): Add bnd_regno, bnds_in_bt, force_bnd_pass and
stdarg fields.
* config/i386/i386.md (UNSPEC_BNDMK): New.
(UNSPEC_BNDMK_ADDR): New.
(UNSPEC_BNDSTX): New.
(UNSPEC_BNDLDX): New.
(UNSPEC_BNDLDX_ADDR): New.
(UNSPEC_BNDCL): New.
(UNSPEC_BNDCU): New.
(UNSPEC_BNDCN): New.
(UNSPEC_MPX_FENCE): New.
(UNSPEC_SIZEOF): New.
(BND0_REG): New.
(BND1_REG): New.
(type): Add mpxmov, mpxmk, mpxchk, mpxld, mpxst.
(length_immediate): Support mpxmov, mpxmk, mpxchk, mpxld, mpxst.
(prefix_rep): Check for bnd prefix.
(prefix_0f): Support mpxmov, mpxmk, mpxchk, mpxld, mpxst.
(length_nobnd): New.
(length): Use length_nobnd when specified.
(memory): Support mpxmov, mpxmk, mpxchk, mpxld, mpxst.
(BND): New.
(bnd_ptr): New.
(BNDCHECK): New.
(bndcheck): New.
(*jcc_1): Add MPX bnd prefix.
(*jcc_2): Likewise.
(jump): Likewise.
(*indirect_jump): Likewise.
(*tablejump_1): Likewise.
(simple_return_internal): Likewise.
(simple_return_internal_long): Likewise.
(simple_return_pop_internal): Likewise.
(simple_return_indirect_internal): Likewise.
(<mode>_mk): New.
(*<mode>_mk): New.
(mov<mode>): New.
(*mov<mode>_internal_mpx): New.
(<mode>_<bndcheck>): New.
(*<mode>_<bndcheck>): New.
(<mode>_ldx): New.
(*<mode>_ldx): New.
(<mode>_stx): New.
(*<mode>_stx): New.
move_size_reloc_<mode>): New.
* config/i386/predicates.md (address_mpx_no_base_operand): New.
(address_mpx_no_index_operand): New.
(bnd_mem_operator): New.
(symbol_operand): New.
(x86_64_immediate_size_operand): New.
* config/i386/i386.opt (mmpx): New.
* config/i386/i386-builtin-types.def (BND): New.
(ULONG): New.
(BND_FTYPE_PCVOID_ULONG): New.
(VOID_FTYPE_BND_PCVOID): New.
(VOID_FTYPE_PCVOID_PCVOID_BND): New.
(BND_FTYPE_PCVOID_PCVOID): New.
(BND_FTYPE_PCVOID): New.
(BND_FTYPE_BND_BND): New.
(PVOID_FTYPE_PVOID_PVOID_ULONG): New.
(PVOID_FTYPE_PCVOID_BND_ULONG): New.
(ULONG_FTYPE_VOID): New.
(PVOID_FTYPE_BND): New.
gcc/testsuite/
2014-11-05 Ilya Enkovich <ilya.enkovich@intel.com>
* gcc.target/i386/chkp-builtins-1.c: New.
* gcc.target/i386/chkp-builtins-2.c: New.
* gcc.target/i386/chkp-builtins-3.c: New.
* gcc.target/i386/chkp-builtins-4.c: New.
* gcc.target/i386/chkp-remove-bndint-1.c: New.
* gcc.target/i386/chkp-remove-bndint-2.c: New.
* gcc.target/i386/chkp-const-check-1.c: New.
* gcc.target/i386/chkp-const-check-2.c: New.
* gcc.target/i386/chkp-lifetime-1.c: New.
* gcc.dg/pr37858.c: Replace early_local_cleanups pass name
with build_ssa_passes.
From-SVN: r217125
2014-11-05 13:42:03 +01:00
|
|
|
late. */
|
|
|
|
NEXT_PASS (pass_split_functions);
|
2018-08-10 11:31:51 +02:00
|
|
|
NEXT_PASS (pass_strip_predict_hints, true /* early_p */);
|
2013-07-18 20:55:48 +02:00
|
|
|
POP_INSERT_PASSES ()
|
|
|
|
NEXT_PASS (pass_release_ssa_names);
|
|
|
|
NEXT_PASS (pass_rebuild_cgraph_edges);
|
cgraphunit.c (symbol_table::process_new_functions): Update.
* cgraphunit.c (symbol_table::process_new_functions): Update.
* ipa-fnsummary.c (pass_data_inline_parameters): Remove.
(inline_generate_summary): Rename to ...
(ipa_fn_summary_generate): ... this one.
(inline_read_summary): Rename to ...
(ipa_fn_summary_read): ... this one.
(inline_write_summary): Rename to ...
(ipa_fn_summary_write): ... this one.
(inline_free_summary): Rename to ...
(ipa_free_fn_summary): ... this one.
(pass_data_local_fn_summary, pass_local_fn_summary,
make_pass_local_fn_summary, pass_data_ipa_free_fn_summary,
pass_ipa_free_fn_summary, make_pass_ipa_free_fn_summary,
pass_data_ipa_fn_summary, pass_ipa_fn_summary,
make_pass_ipa_fn_summary): New.
* ipa-fnsummary.h (inline_generate_summary, inline_read_summary,
inline_write_summary, inline_free_summary): Remove.
(ipa_free_fn_summary) : New.
* ipa-inline.c (ipa_inline): Update.
(pass_ipa_inline): Do not generate summaries.
* ipa.c (pass_data_ipa_free_fn_summary, pass_ipa_free_fn_summary):
Remove.
* passes.def: Replace pass_inline_parameters by pass_local_fn_summary
and add pass_ipa_fn_summary.
* tree-pass.h (make_pass_ipa_fn_summary, make_pass_local_fn_summary):
New.
(make_pass_inline_parameters): Remove.
* lto.c (do_whole_program_analysis): Replace inline_free_summary
by ipa_free_fn_summary.
* gcc.dg/ipa/ctor-empty-1.c: Update template.
* gcc.dg/ipa/inline-5.c: Likewise.
* gfortran.dg/pr48636.f90: Likewise.
From-SVN: r248375
2017-05-23 18:20:53 +02:00
|
|
|
NEXT_PASS (pass_local_fn_summary);
|
2013-07-18 20:55:48 +02:00
|
|
|
POP_INSERT_PASSES ()
|
2015-12-16 14:49:07 +01:00
|
|
|
|
2018-11-20 14:25:04 +01:00
|
|
|
NEXT_PASS (pass_ipa_remove_symbols);
|
2015-12-16 14:49:07 +01:00
|
|
|
NEXT_PASS (pass_ipa_oacc);
|
|
|
|
PUSH_INSERT_PASSES_WITHIN (pass_ipa_oacc)
|
|
|
|
NEXT_PASS (pass_ipa_pta);
|
|
|
|
/* Pass group that runs when the function is an offloaded function
|
|
|
|
containing oacc kernels loops. */
|
|
|
|
NEXT_PASS (pass_ipa_oacc_kernels);
|
|
|
|
PUSH_INSERT_PASSES_WITHIN (pass_ipa_oacc_kernels)
|
|
|
|
NEXT_PASS (pass_oacc_kernels);
|
|
|
|
PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels)
|
|
|
|
NEXT_PASS (pass_ch);
|
2019-07-01 09:54:38 +02:00
|
|
|
NEXT_PASS (pass_fre, true /* may_iterate */);
|
2015-12-16 14:49:07 +01:00
|
|
|
/* We use pass_lim to rewrite in-memory iteration and reduction
|
|
|
|
variable accesses in loops into local variables accesses. */
|
|
|
|
NEXT_PASS (pass_lim);
|
|
|
|
NEXT_PASS (pass_dominator, false /* may_peel_loop_headers_p */);
|
|
|
|
NEXT_PASS (pass_dce);
|
2016-01-18 13:52:42 +01:00
|
|
|
NEXT_PASS (pass_parallelize_loops, true /* oacc_kernels_p */);
|
2015-12-16 14:49:07 +01:00
|
|
|
NEXT_PASS (pass_expand_omp_ssa);
|
|
|
|
NEXT_PASS (pass_rebuild_cgraph_edges);
|
|
|
|
POP_INSERT_PASSES ()
|
|
|
|
POP_INSERT_PASSES ()
|
|
|
|
POP_INSERT_PASSES ()
|
|
|
|
|
2017-02-03 16:22:47 +01:00
|
|
|
NEXT_PASS (pass_target_clone);
|
2014-10-21 19:59:30 +02:00
|
|
|
NEXT_PASS (pass_ipa_auto_profile);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_ipa_tree_profile);
|
|
|
|
PUSH_INSERT_PASSES_WITHIN (pass_ipa_tree_profile)
|
|
|
|
NEXT_PASS (pass_feedback_split_functions);
|
|
|
|
POP_INSERT_PASSES ()
|
2017-12-20 20:41:38 +01:00
|
|
|
NEXT_PASS (pass_ipa_free_fn_summary, true /* small_p */);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_ipa_increase_alignment);
|
|
|
|
NEXT_PASS (pass_ipa_tm);
|
|
|
|
NEXT_PASS (pass_ipa_lower_emutls);
|
2016-04-17 07:21:50 +02:00
|
|
|
TERMINATE_PASS_LIST (all_small_ipa_passes)
|
2013-07-18 20:55:48 +02:00
|
|
|
|
|
|
|
INSERT_PASSES_AFTER (all_regular_ipa_passes)
|
2019-09-27 15:23:16 +02:00
|
|
|
NEXT_PASS (pass_analyzer);
|
Optimize ODR enum streaming
it turns out that half of the global decl stream of cc1 LTO build consits
TREE_LISTS, identifiers and integer cosntats representing TYPE_VALUES of enums.
Those are streamed only to produce ODR warning and used otherwise, so this
patch moves the info to a separate section that is represented and streamed
more effectively.
This also adds place for more info that may be used for ODR diagnostics
(i.e. at the moment we do not warn when the declarations differs i.e. by the
associated member functions and their types) and the type inheritance graph
rather then poluting the global stream.
I was bit unsure what enums we want to store into the section. All parsed
enums is probably too expensive, only those enums streamed to represent IL is
bit hard to get, so I went for those seen by free lang data.
As a plus we now get bit more precise warning because also the location of
mismatched enum CONST_DECL is streamed.
It changes:
[WPA] read 4608466 unshared trees
[WPA] read 2942094 mergeable SCCs of average size 1.365328
[WPA] 8625389 tree bodies read in total
[WPA] tree SCC table: size 524287, 247652 elements, collision ratio: 0.383702
[WPA] tree SCC max chain length 2 (size 1)
[WPA] Compared 2694442 SCCs, 228 collisions (0.000085)
[WPA] Merged 2694419 SCCs
[WPA] Merged 3731982 tree bodies
[WPA] Merged 633335 types
[WPA] 122077 types prevailed (155548 associated trees)
...
[WPA] Compression: 110593119 input bytes, 287696614 uncompressed bytes (ratio: 2.601397)
[WPA] Size of mmap'd section decls: 85628556 bytes
[WPA] Size of mmap'd section function_body: 13842928 bytes
[WPA] read 1720989 unshared trees
[WPA] read 1252217 mergeable SCCs of average size 1.858507
[WPA] 4048243 tree bodies read in total
[WPA] tree SCC table: size 524287, 226524 elements, collision ratio: 0.491759
[WPA] tree SCC max chain length 2 (size 1)
[WPA] Compared 1025693 SCCs, 196 collisions (0.000191)
[WPA] Merged 1025670 SCCs
[WPA] Merged 2063373 tree bodies
[WPA] Merged 633497 types
[WPA] 122299 types prevailed (155827 associated trees)
...
[WPA] Compression: 103428770 input bytes, 281151423 uncompressed bytes (ratio: 2.718310)
[WPA] Size of mmap'd section decls: 49390917 bytes
[WPA] Size of mmap'd section function_body: 13858258 bytes
...
[WPA] Size of mmap'd section odr_types: 29054816 bytes
So number of SCCs streamed drops to 38% and the number of unshared trees (that
are bit misnamed since it is mostly integer_cst) to 37%.
Things speeds up correspondingly, but I did not save time report from previous
build.
The enum values are still quite surprisingly large. I may take a look into
ways getting it smaller incrementally, but it streams reasonably fast:
Time variable usr sys wall GGC
phase opt and generate : 25.20 ( 68%) 10.88 ( 72%) 36.13 ( 69%) 868060 kB ( 52%)
phase stream in : 4.46 ( 12%) 0.90 ( 6%) 5.38 ( 10%) 790724 kB ( 48%)
phase stream out : 6.69 ( 18%) 3.32 ( 22%) 10.03 ( 19%) 8 kB ( 0%)
ipa lto gimple in : 0.79 ( 2%) 1.86 ( 12%) 2.39 ( 5%) 252612 kB ( 15%)
ipa lto gimple out : 2.48 ( 7%) 0.78 ( 5%) 3.26 ( 6%) 0 kB ( 0%)
ipa lto decl in : 1.71 ( 5%) 0.46 ( 3%) 2.34 ( 4%) 417883 kB ( 25%)
ipa lto decl out : 3.28 ( 9%) 0.07 ( 0%) 3.27 ( 6%) 0 kB ( 0%)
whopr wpa I/O : 0.40 ( 1%) 2.24 ( 15%) 2.77 ( 5%) 8 kB ( 0%)
lto stream decompression : 1.38 ( 4%) 0.31 ( 2%) 1.36 ( 3%) 0 kB ( 0%)
ipa ODR types : 0.18 ( 0%) 0.02 ( 0%) 0.25 ( 0%) 0 kB ( 0%)
ipa inlining heuristics : 11.64 ( 31%) 1.45 ( 10%) 13.12 ( 25%) 453160 kB ( 27%)
ipa pure const : 1.74 ( 5%) 0.00 ( 0%) 1.76 ( 3%) 0 kB ( 0%)
ipa icf : 1.72 ( 5%) 5.33 ( 35%) 7.06 ( 13%) 16593 kB ( 1%)
whopr partitioning : 2.22 ( 6%) 0.01 ( 0%) 2.23 ( 4%) 5689 kB ( 0%)
TOTAL : 37.17 15.20 52.46 1660886 kB
LTO-bootstrapped/regtested x86_64-linux, will comit it shortly.
gcc/ChangeLog:
2020-06-03 Jan Hubicka <hubicka@ucw.cz>
* ipa-devirt.c: Include data-streamer.h, lto-streamer.h and
streamer-hooks.h.
(odr_enums): New static var.
(struct odr_enum_val): New struct.
(class odr_enum): New struct.
(odr_enum_map): New hashtable.
(odr_types_equivalent_p): Drop code testing TYPE_VALUES.
(add_type_duplicate): Likewise.
(free_odr_warning_data): Do not free TYPE_VALUES.
(register_odr_enum): New function.
(ipa_odr_summary_write): New function.
(ipa_odr_read_section): New function.
(ipa_odr_summary_read): New function.
(class pass_ipa_odr): New pass.
(make_pass_ipa_odr): New function.
* ipa-utils.h (register_odr_enum): Declare.
* lto-section-in.c: (lto_section_name): Add odr_types section.
* lto-streamer.h (enum lto_section_type): Add odr_types section.
* passes.def: Add odr_types pass.
* lto-streamer-out.c (DFS::DFS_write_tree_body): Do not stream
TYPE_VALUES.
(hash_tree): Likewise.
* tree-streamer-in.c (lto_input_ts_type_non_common_tree_pointers):
Likewise.
* tree-streamer-out.c (write_ts_type_non_common_tree_pointers):
Likewise.
* timevar.def (TV_IPA_ODR): New timervar.
* tree-pass.h (make_pass_ipa_odr): Declare.
* tree.c (free_lang_data_in_type): Regiser ODR types.
gcc/lto/ChangeLog:
2020-06-03 Jan Hubicka <hubicka@ucw.cz>
* lto-common.c (compare_tree_sccs_1): Do not compare TYPE_VALUES.
gcc/testsuite/ChangeLog:
2020-06-03 Jan Hubicka <hubicka@ucw.cz>
* g++.dg/lto/pr84805_0.C: Update.
2020-06-03 21:16:43 +02:00
|
|
|
NEXT_PASS (pass_ipa_odr);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_ipa_whole_program_visibility);
|
|
|
|
NEXT_PASS (pass_ipa_profile);
|
2014-10-16 12:47:55 +02:00
|
|
|
NEXT_PASS (pass_ipa_icf);
|
2013-09-01 17:14:24 +02:00
|
|
|
NEXT_PASS (pass_ipa_devirt);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_ipa_cp);
|
2019-09-20 00:25:04 +02:00
|
|
|
NEXT_PASS (pass_ipa_sra);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_ipa_cdtor_merge);
|
cgraphunit.c (symbol_table::process_new_functions): Update.
* cgraphunit.c (symbol_table::process_new_functions): Update.
* ipa-fnsummary.c (pass_data_inline_parameters): Remove.
(inline_generate_summary): Rename to ...
(ipa_fn_summary_generate): ... this one.
(inline_read_summary): Rename to ...
(ipa_fn_summary_read): ... this one.
(inline_write_summary): Rename to ...
(ipa_fn_summary_write): ... this one.
(inline_free_summary): Rename to ...
(ipa_free_fn_summary): ... this one.
(pass_data_local_fn_summary, pass_local_fn_summary,
make_pass_local_fn_summary, pass_data_ipa_free_fn_summary,
pass_ipa_free_fn_summary, make_pass_ipa_free_fn_summary,
pass_data_ipa_fn_summary, pass_ipa_fn_summary,
make_pass_ipa_fn_summary): New.
* ipa-fnsummary.h (inline_generate_summary, inline_read_summary,
inline_write_summary, inline_free_summary): Remove.
(ipa_free_fn_summary) : New.
* ipa-inline.c (ipa_inline): Update.
(pass_ipa_inline): Do not generate summaries.
* ipa.c (pass_data_ipa_free_fn_summary, pass_ipa_free_fn_summary):
Remove.
* passes.def: Replace pass_inline_parameters by pass_local_fn_summary
and add pass_ipa_fn_summary.
* tree-pass.h (make_pass_ipa_fn_summary, make_pass_local_fn_summary):
New.
(make_pass_inline_parameters): Remove.
* lto.c (do_whole_program_analysis): Replace inline_free_summary
by ipa_free_fn_summary.
* gcc.dg/ipa/ctor-empty-1.c: Update template.
* gcc.dg/ipa/inline-5.c: Likewise.
* gfortran.dg/pr48636.f90: Likewise.
From-SVN: r248375
2017-05-23 18:20:53 +02:00
|
|
|
NEXT_PASS (pass_ipa_fn_summary);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_ipa_inline);
|
|
|
|
NEXT_PASS (pass_ipa_pure_const);
|
2020-09-20 07:25:16 +02:00
|
|
|
NEXT_PASS (pass_ipa_modref);
|
2017-12-20 20:41:38 +01:00
|
|
|
NEXT_PASS (pass_ipa_free_fn_summary, false /* small_p */);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_ipa_reference);
|
2014-06-24 05:07:13 +02:00
|
|
|
/* This pass needs to be scheduled after any IP code duplication. */
|
|
|
|
NEXT_PASS (pass_ipa_single_use);
|
2014-05-19 02:58:43 +02:00
|
|
|
/* Comdat privatization come last, as direct references to comdat local
|
|
|
|
symbols are not allowed outside of the comdat group. Privatizing early
|
|
|
|
would result in missed optimizations due to this restriction. */
|
|
|
|
NEXT_PASS (pass_ipa_comdats);
|
2016-04-17 07:21:50 +02:00
|
|
|
TERMINATE_PASS_LIST (all_regular_ipa_passes)
|
2013-07-18 20:55:48 +02:00
|
|
|
|
|
|
|
/* Simple IPA passes executed after the regular passes. In WHOPR mode the
|
|
|
|
passes are executed after partitioning and thus see just parts of the
|
|
|
|
compiled unit. */
|
|
|
|
INSERT_PASSES_AFTER (all_late_ipa_passes)
|
|
|
|
NEXT_PASS (pass_ipa_pta);
|
cgraph.h (enum cgraph_simd_clone_arg_type): New.
* cgraph.h (enum cgraph_simd_clone_arg_type): New.
(struct cgraph_simd_clone_arg, struct cgraph_simd_clone): New.
(struct cgraph_node): Add simdclone and simd_clones fields.
* config/i386/i386.c (ix86_simd_clone_compute_vecsize_and_simdlen,
ix86_simd_clone_adjust, ix86_simd_clone_usable): New functions.
(TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN,
TARGET_SIMD_CLONE_ADJUST, TARGET_SIMD_CLONE_USABLE): Define.
* doc/tm.texi.in (TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN,
TARGET_SIMD_CLONE_ADJUST, TARGET_SIMD_CLONE_USABLE): Add.
* doc/tm.texi: Regenerated.
* ggc.h (ggc_alloc_cleared_simd_clone_stat): New function.
* ipa-cp.c (determine_versionability): Fail if "omp declare simd"
attribute is present.
* omp-low.c: Include pretty-print.h, ipa-prop.h and tree-eh.h.
(simd_clone_vector_of_formal_parm_types): New function.
(simd_clone_struct_alloc, simd_clone_struct_copy,
simd_clone_vector_of_formal_parm_types, simd_clone_clauses_extract,
simd_clone_compute_base_data_type, simd_clone_mangle,
simd_clone_create, simd_clone_adjust_return_type,
create_tmp_simd_array, simd_clone_adjust_argument_types,
simd_clone_init_simd_arrays): New functions.
(struct modify_stmt_info): New type.
(ipa_simd_modify_stmt_ops, ipa_simd_modify_function_body,
simd_clone_adjust, expand_simd_clones, ipa_omp_simd_clone): New
functions.
(pass_data_omp_simd_clone): New variable.
(pass_omp_simd_clone): New class.
(make_pass_omp_simd_clone): New function.
* passes.def (pass_omp_simd_clone): New.
* target.def (TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN,
TARGET_SIMD_CLONE_ADJUST, TARGET_SIMD_CLONE_USABLE): New target
hooks.
* target.h (struct cgraph_node, struct cgraph_simd_node): Declare.
* tree-core.h (OMP_CLAUSE_LINEAR_VARIABLE_STRIDE): Document.
* tree.h (OMP_CLAUSE_LINEAR_VARIABLE_STRIDE): Define.
* tree-pass.h (make_pass_omp_simd_clone): New prototype.
* tree-vect-data-refs.c: Include cgraph.h.
(vect_analyze_data_refs): Inline by hand find_data_references_in_loop
and find_data_references_in_bb, if find_data_references_in_stmt
fails, still allow calls to #pragma omp declare simd functions
in #pragma omp simd loops unless they contain data references among
the call arguments or in lhs.
* tree-vect-loop.c (vect_determine_vectorization_factor): Handle
calls with no lhs.
(vect_transform_loop): Allow NULL STMT_VINFO_VECTYPE for calls without
lhs.
* tree-vectorizer.h (enum stmt_vec_info_type): Add
call_simd_clone_vec_info_type.
(struct _stmt_vec_info): Add simd_clone_fndecl field.
(STMT_VINFO_SIMD_CLONE_FNDECL): Define.
* tree-vect-stmts.c: Include tree-ssa-loop.h,
tree-scalar-evolution.h and cgraph.h.
(vectorizable_call): Handle calls without lhs. Assert
!stmt_can_throw_internal instead of failing for it. Don't update
EH stuff.
(struct simd_call_arg_info): New.
(vectorizable_simd_clone_call): New function.
(vect_transform_stmt): Call it.
(vect_analyze_stmt): Likewise. Allow NULL STMT_VINFO_VECTYPE for
calls without lhs.
* ipa-prop.c (ipa_add_new_function): Only call ipa_analyze_node
if cgraph_function_with_gimple_body_p is true.
c/
* c-decl.c (c_builtin_function_ext_scope): Avoid binding if
external_scope is NULL.
cp/
* semantics.c (finish_omp_clauses): For #pragma omp declare simd
linear clause step call maybe_constant_value.
testsuite/
* g++.dg/gomp/declare-simd-1.C (f38): Make sure
simdlen is a power of two.
* gcc.dg/gomp/simd-clones-2.c: Compile on all targets.
Remove -msse2. Adjust regexps for name mangling changes.
* gcc.dg/gomp/simd-clones-3.c: Likewise.
* gcc.dg/vect/vect-simd-clone-1.c: New test.
* gcc.dg/vect/vect-simd-clone-2.c: New test.
* gcc.dg/vect/vect-simd-clone-3.c: New test.
* gcc.dg/vect/vect-simd-clone-4.c: New test.
* gcc.dg/vect/vect-simd-clone-5.c: New test.
* gcc.dg/vect/vect-simd-clone-6.c: New test.
* gcc.dg/vect/vect-simd-clone-7.c: New test.
* gcc.dg/vect/vect-simd-clone-8.c: New test.
* gcc.dg/vect/vect-simd-clone-9.c: New test.
* gcc.dg/vect/vect-simd-clone-10.c: New test.
* gcc.dg/vect/vect-simd-clone-10.h: New file.
* gcc.dg/vect/vect-simd-clone-10a.c: New file.
* gcc.dg/vect/vect-simd-clone-11.c: New test.
Co-Authored-By: Jakub Jelinek <jakub@redhat.com>
From-SVN: r205442
2013-11-27 12:20:06 +01:00
|
|
|
NEXT_PASS (pass_omp_simd_clone);
|
2016-04-17 07:21:50 +02:00
|
|
|
TERMINATE_PASS_LIST (all_late_ipa_passes)
|
2013-07-18 20:55:48 +02:00
|
|
|
|
|
|
|
/* These passes are run after IPA passes on every function that is being
|
|
|
|
output to the assembler file. */
|
|
|
|
INSERT_PASSES_AFTER (all_passes)
|
2013-11-05 16:09:40 +01:00
|
|
|
NEXT_PASS (pass_fixup_cfg);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_lower_eh_dispatch);
|
2021-03-02 13:20:11 +01:00
|
|
|
NEXT_PASS (pass_oacc_loop_designation);
|
openacc: Middle-end worker-partitioning support
This patch implements worker-partitioning support in the middle end,
by rewriting gimple. The OpenACC execution model requires that code
can run in either "worker single" mode where only a single worker per
gang is active, or "worker partitioned" mode, where multiple workers
per gang are active. This means we need to do something equivalent
to spawning additional workers when transitioning from worker-single
to worker-partitioned mode. However, GPUs typically fix the number of
threads of invoked kernels at launch time, so we need to do something
with the "extra" threads when they are not wanted.
The scheme used is to conditionalise each basic block that executes
in "worker single" mode for worker 0 only. Conditional branches
are handled specially so "idle" (non-0) workers follow along with
worker 0. On transitioning to "worker partitioned" mode, any variables
modified by worker 0 are propagated to the other workers via GPU shared
memory. Special care is taken for routine calls, writes through pointers,
and so forth, as follows:
- There are two types of function calls to consider in worker-single
mode: "normal" calls to maths library routines, etc. are called from
worker 0 only. OpenACC routines may contain worker-partitioned loops
themselves, so are called from all workers, including "idle" ones.
- SSA names set in worker-single mode, but used in worker-partitioned
mode, are copied to shared memory in worker 0. Other workers retrieve
the value from the appropriate shared-memory location after a barrier,
and new phi nodes are introduced at the convergence point to resolve
the worker 0/other worker copies of the value.
- Local scalar variables (on the stack) also need special handling. We
broadcast any variables that are written in the current worker-single
block, and that are read in any worker-partitioned block. (This is
believed to be safe, and is flow-insensitive to ease analysis.)
- Local aggregates (arrays and composites) on the stack are *not*
broadcast. Instead we force gimple stmts modifying elements/fields of
local aggregates into fully-partitioned mode. The RHS of the
assignment is a scalar, and is thus subject to broadcasting as above.
- Writes through pointers may affect any local variable that has
its address taken. We use points-to analysis to determine the set
of potentially-affected variables for a given pointer indirection.
We broadcast any such variable which is used in worker-partitioned
mode, on a per-block basis for any block containing a write through
a pointer.
Some slides about the implementation (from 2018) are available at:
https://jtb20.github.io/gcnworkers.pdf
gcc/
* Makefile.in (OBJS): Add omp-oacc-neuter-broadcast.o.
* doc/tm.texi.in (TARGET_GOACC_CREATE_WORKER_BROADCAST_RECORD):
Add documentation hook.
* doc/tm.texi: Regenerate.
* omp-oacc-neuter-broadcast.cc: New file.
* omp-builtins.def (BUILT_IN_GOACC_BARRIER)
(BUILT_IN_GOACC_SINGLE_START, BUILT_IN_GOACC_SINGLE_COPY_START)
(BUILT_IN_GOACC_SINGLE_COPY_END): New builtins.
* passes.def (pass_omp_oacc_neuter_broadcast): Add pass.
* target.def (goacc.create_worker_broadcast_record): Add target
hook.
* tree-pass.h (make_pass_omp_oacc_neuter_broadcast): Add
prototype.
* config/gcn/gcn-protos.h (gcn_goacc_adjust_propagation_record):
Rename prototype to...
(gcn_goacc_create_worker_broadcast_record): ... this.
* config/gcn/gcn-tree.c (gcn_goacc_adjust_propagation_record): Rename
function to...
(gcn_goacc_create_worker_broadcast_record): ... this.
* config/gcn/gcn.c (TARGET_GOACC_ADJUST_PROPAGATION_RECORD):
Rename to...
(TARGET_GOACC_CREATE_WORKER_BROADCAST_RECORD): ... this.
Co-Authored-By: Nathan Sidwell <nathan@codesourcery.com> (via 'gcc/config/nvptx/nvptx.c' master)
Co-Authored-By: Kwok Cheung Yeung <kcy@codesourcery.com>
Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
2021-03-02 13:20:11 +01:00
|
|
|
NEXT_PASS (pass_omp_oacc_neuter_broadcast);
|
2015-09-30 21:16:29 +02:00
|
|
|
NEXT_PASS (pass_oacc_device_lower);
|
2016-11-22 17:57:29 +01:00
|
|
|
NEXT_PASS (pass_omp_device_lower);
|
2015-12-15 15:56:50 +01:00
|
|
|
NEXT_PASS (pass_omp_target_link);
|
2020-04-14 08:53:19 +02:00
|
|
|
NEXT_PASS (pass_adjust_alignment);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_all_optimizations);
|
|
|
|
PUSH_INSERT_PASSES_WITHIN (pass_all_optimizations)
|
|
|
|
NEXT_PASS (pass_remove_cgraph_callee_edges);
|
|
|
|
/* Initial scalar cleanups before alias computation.
|
|
|
|
They ensure memory accesses are not indirect wherever possible. */
|
2018-08-10 11:31:51 +02:00
|
|
|
NEXT_PASS (pass_strip_predict_hints, false /* early_p */);
|
2015-11-16 13:40:41 +01:00
|
|
|
NEXT_PASS (pass_ccp, true /* nonzero_p */);
|
2013-07-18 20:55:48 +02:00
|
|
|
/* After CCP we rewrite no longer addressed locals into SSA
|
|
|
|
form if possible. */
|
passes: Fix up subobject __bos [PR101419]
The following testcase is miscompiled, because VN during cunrolli changes
__bos argument from address of a larger field to address of a smaller field
and so __builtin_object_size (, 1) then folds into smaller value than the
actually available size.
copy_reference_ops_from_ref has a hack for this, but it was using
cfun->after_inlining as a check whether the hack can be ignored, and
cunrolli is after_inlining.
This patch uses a property to make it exact (set at the end of objsz
pass that doesn't do insert_min_max_p) and additionally based on discussions
in the PR moves the objsz pass earlier after IPA.
2021-07-13 Jakub Jelinek <jakub@redhat.com>
Richard Biener <rguenther@suse.de>
PR tree-optimization/101419
* tree-pass.h (PROP_objsz): Define.
(make_pass_early_object_sizes): Declare.
* passes.def (pass_all_early_optimizations): Rename pass_object_sizes
there to pass_early_object_sizes, drop parameter.
(pass_all_optimizations): Move pass_object_sizes right after pass_ccp,
drop parameter, move pass_post_ipa_warn right after that.
* tree-object-size.c (pass_object_sizes::execute): Rename to...
(object_sizes_execute): ... this. Add insert_min_max_p argument.
(pass_data_object_sizes): Move after object_sizes_execute.
(pass_object_sizes): Likewise. In execute method call
object_sizes_execute, drop set_pass_param method and insert_min_max_p
non-static data member and its initializer in the ctor.
(pass_data_early_object_sizes, pass_early_object_sizes,
make_pass_early_object_sizes): New.
* tree-ssa-sccvn.c (copy_reference_ops_from_ref): Use
(cfun->curr_properties & PROP_objsz) instead of cfun->after_inlining.
* gcc.dg/builtin-object-size-10.c: Pass -fdump-tree-early_objsz-details
instead of -fdump-tree-objsz1-details in dg-options and adjust names
of dump file in scan-tree-dump.
* gcc.dg/pr101419.c: New test.
2021-07-13 11:04:22 +02:00
|
|
|
NEXT_PASS (pass_object_sizes);
|
|
|
|
NEXT_PASS (pass_post_ipa_warn);
|
2022-01-16 00:41:40 +01:00
|
|
|
/* Must run before loop unrolling. */
|
|
|
|
NEXT_PASS (pass_warn_access, /*early=*/true);
|
2014-01-17 18:50:10 +01:00
|
|
|
NEXT_PASS (pass_complete_unrolli);
|
2015-10-21 22:11:33 +02:00
|
|
|
NEXT_PASS (pass_backprop);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_phiprop);
|
|
|
|
NEXT_PASS (pass_forwprop);
|
|
|
|
/* pass_build_alias is a dummy pass that ensures that we
|
|
|
|
execute TODO_rebuild_alias at this point. */
|
|
|
|
NEXT_PASS (pass_build_alias);
|
|
|
|
NEXT_PASS (pass_return_slot);
|
2019-07-01 09:54:38 +02:00
|
|
|
NEXT_PASS (pass_fre, true /* may_iterate */);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_merge_phi);
|
2021-10-29 17:30:42 +02:00
|
|
|
NEXT_PASS (pass_thread_jumps_full, /*first=*/true);
|
2015-11-16 13:40:05 +01:00
|
|
|
NEXT_PASS (pass_vrp, true /* warn_array_bounds_p */);
|
2021-04-07 12:09:44 +02:00
|
|
|
NEXT_PASS (pass_dse);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_dce);
|
2021-04-27 14:27:40 +02:00
|
|
|
/* pass_stdarg is always run and at this point we execute
|
|
|
|
TODO_remove_unused_locals to prune CLOBBERs of dead
|
|
|
|
variables which are otherwise a churn on alias walkings. */
|
2015-04-29 11:13:49 +02:00
|
|
|
NEXT_PASS (pass_stdarg);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_call_cdce);
|
|
|
|
NEXT_PASS (pass_cselim);
|
2014-06-17 14:36:34 +02:00
|
|
|
NEXT_PASS (pass_copy_prop);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_tree_ifcombine);
|
2015-05-07 11:52:38 +02:00
|
|
|
NEXT_PASS (pass_merge_phi);
|
2018-10-23 13:34:56 +02:00
|
|
|
NEXT_PASS (pass_phiopt, false /* early_p */);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_tail_recursion);
|
|
|
|
NEXT_PASS (pass_ch);
|
|
|
|
NEXT_PASS (pass_lower_complex);
|
|
|
|
NEXT_PASS (pass_sra);
|
|
|
|
/* The dom pass will also resolve all __builtin_constant_p calls
|
|
|
|
that are still there to 0. This has to be done after some
|
|
|
|
propagations have already run, but before some more dead code
|
|
|
|
is removed, and this place fits nicely. Remember this when
|
|
|
|
trying to move or duplicate pass_dominator somewhere earlier. */
|
2021-10-29 17:30:42 +02:00
|
|
|
NEXT_PASS (pass_thread_jumps, /*first=*/true);
|
2015-11-16 13:40:24 +01:00
|
|
|
NEXT_PASS (pass_dominator, true /* may_peel_loop_headers_p */);
|
2019-04-25 16:32:16 +02:00
|
|
|
/* Threading can leave many const/copy propagations in the IL.
|
|
|
|
Clean them up. Failure to do so well can lead to false
|
|
|
|
positives from warnings for erroneous code. */
|
|
|
|
NEXT_PASS (pass_copy_prop);
|
|
|
|
/* Identify paths that should never be executed in a conforming
|
|
|
|
program and isolate those paths. */
|
2013-11-05 20:47:44 +01:00
|
|
|
NEXT_PASS (pass_isolate_erroneous_paths);
|
2021-09-14 20:41:18 +02:00
|
|
|
NEXT_PASS (pass_reassoc, true /* early_p */);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_dce);
|
|
|
|
NEXT_PASS (pass_forwprop);
|
2018-10-23 13:34:56 +02:00
|
|
|
NEXT_PASS (pass_phiopt, false /* early_p */);
|
2015-11-16 13:40:41 +01:00
|
|
|
NEXT_PASS (pass_ccp, true /* nonzero_p */);
|
2013-07-18 20:55:48 +02:00
|
|
|
/* After CCP we rewrite no longer addressed locals into SSA
|
|
|
|
form if possible. */
|
|
|
|
NEXT_PASS (pass_cse_sincos);
|
|
|
|
NEXT_PASS (pass_optimize_bswap);
|
2015-07-09 11:01:51 +02:00
|
|
|
NEXT_PASS (pass_laddress);
|
2016-05-19 09:39:52 +02:00
|
|
|
NEXT_PASS (pass_lim);
|
2016-10-18 23:40:58 +02:00
|
|
|
NEXT_PASS (pass_walloca, false);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_pre);
|
2022-03-17 08:10:59 +01:00
|
|
|
NEXT_PASS (pass_sink_code, false /* unsplit edges */);
|
2015-12-04 19:27:54 +01:00
|
|
|
NEXT_PASS (pass_sancov);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_asan);
|
|
|
|
NEXT_PASS (pass_tsan);
|
2021-04-07 12:09:44 +02:00
|
|
|
NEXT_PASS (pass_dse);
|
2016-05-19 09:39:52 +02:00
|
|
|
NEXT_PASS (pass_dce);
|
2014-06-23 18:51:10 +02:00
|
|
|
/* Pass group that runs when 1) enabled, 2) there are loops
|
2014-08-14 15:14:24 +02:00
|
|
|
in the function. Make sure to run pass_fix_loops before
|
|
|
|
to discover/remove loops before running the gate function
|
|
|
|
of pass_tree_loop. */
|
|
|
|
NEXT_PASS (pass_fix_loops);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_tree_loop);
|
|
|
|
PUSH_INSERT_PASSES_WITHIN (pass_tree_loop)
|
2019-05-02 16:08:08 +02:00
|
|
|
/* Before loop_init we rewrite no longer addressed locals into SSA
|
|
|
|
form if possible. */
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_tree_loop_init);
|
|
|
|
NEXT_PASS (pass_tree_unswitch);
|
|
|
|
NEXT_PASS (pass_scev_cprop);
|
2016-10-20 14:18:32 +02:00
|
|
|
NEXT_PASS (pass_loop_split);
|
Add a loop versioning pass
This patch adds a pass that versions loops with variable index strides
for the case in which the stride is 1. E.g.:
for (int i = 0; i < n; ++i)
x[i * stride] = ...;
becomes:
if (stepx == 1)
for (int i = 0; i < n; ++i)
x[i] = ...;
else
for (int i = 0; i < n; ++i)
x[i * stride] = ...;
This is useful for both vector code and scalar code, and in some cases
can enable further optimisations like loop interchange or pattern
recognition.
The pass gives a 7.6% improvement on Cortex-A72 for 554.roms_r at -O3
and a 2.4% improvement for 465.tonto. I haven't found any SPEC tests
that regress.
Sizewise, there's a 10% increase in .text for both 554.roms_r and
465.tonto. That's obviously a lot, but in tonto's case it's because
the whole program is written using assumed-shape arrays and pointers,
so a large number of functions really do benefit from versioning.
roms likewise makes heavy use of assumed-shape arrays, and that
improvement in performance IMO justifies the code growth.
The next biggest .text increase is 4.5% for 548.exchange2_r. I did see
a small (0.4%) speed improvement there, but although both 3-iteration runs
produced stable results, that might still be noise. There was a slightly
larger (non-noise) improvement for a 256-bit SVE model.
481.wrf and 521.wrf_r .text grew by 2.8% and 2.5% respectively, but
without any noticeable improvement in performance. No other test grew
by more than 2%.
Although the main SPEC beneficiaries are all Fortran tests, the
benchmarks we use for SVE also include some C and C++ tests that
benefit.
Using -frepack-arrays gives the same benefits in many Fortran cases.
The problem is that using that option inappropriately can force a full
array copy for arguments that the function only reads once, and so it
isn't really something we can turn on by default. The new pass is
supposed to give most of the benefits of -frepack-arrays without
the risk of unnecessary repacking.
The patch therefore enables the pass by default at -O3.
2018-12-17 Richard Sandiford <richard.sandiford@arm.com>
Ramana Radhakrishnan <ramana.radhakrishnan@arm.com>
Kyrylo Tkachov <kyrylo.tkachov@arm.com>
gcc/
* doc/invoke.texi (-fversion-loops-for-strides): Document
(loop-versioning-group-size, loop-versioning-max-inner-insns)
(loop-versioning-max-outer-insns): Document new --params.
* Makefile.in (OBJS): Add gimple-loop-versioning.o.
* common.opt (fversion-loops-for-strides): New option.
* opts.c (default_options_table): Enable fversion-loops-for-strides
at -O3.
* params.def (PARAM_LOOP_VERSIONING_GROUP_SIZE)
(PARAM_LOOP_VERSIONING_MAX_INNER_INSNS)
(PARAM_LOOP_VERSIONING_MAX_OUTER_INSNS): New parameters.
* passes.def: Add pass_loop_versioning.
* timevar.def (TV_LOOP_VERSIONING): New time variable.
* tree-ssa-propagate.h
(substitute_and_fold_engine::substitute_and_fold): Add an optional
block parameter.
* tree-ssa-propagate.c
(substitute_and_fold_engine::substitute_and_fold): Likewise.
When passed, only walk blocks dominated by that block.
* tree-vrp.h (range_includes_p): Declare.
(range_includes_zero_p): Turn into an inline wrapper around
range_includes_p.
* tree-vrp.c (range_includes_p): New function, generalizing...
(range_includes_zero_p): ...this.
* tree-pass.h (make_pass_loop_versioning): Declare.
* gimple-loop-versioning.cc: New file.
gcc/testsuite/
* gcc.dg/loop-versioning-1.c: New test.
* gcc.dg/loop-versioning-10.c: Likewise.
* gcc.dg/loop-versioning-11.c: Likewise.
* gcc.dg/loop-versioning-2.c: Likewise.
* gcc.dg/loop-versioning-3.c: Likewise.
* gcc.dg/loop-versioning-4.c: Likewise.
* gcc.dg/loop-versioning-5.c: Likewise.
* gcc.dg/loop-versioning-6.c: Likewise.
* gcc.dg/loop-versioning-7.c: Likewise.
* gcc.dg/loop-versioning-8.c: Likewise.
* gcc.dg/loop-versioning-9.c: Likewise.
* gfortran.dg/loop_versioning_1.f90: Likewise.
* gfortran.dg/loop_versioning_2.f90: Likewise.
* gfortran.dg/loop_versioning_3.f90: Likewise.
* gfortran.dg/loop_versioning_4.f90: Likewise.
* gfortran.dg/loop_versioning_5.f90: Likewise.
* gfortran.dg/loop_versioning_6.f90: Likewise.
* gfortran.dg/loop_versioning_7.f90: Likewise.
* gfortran.dg/loop_versioning_8.f90: Likewise.
From-SVN: r267197
2018-12-17 11:05:51 +01:00
|
|
|
NEXT_PASS (pass_loop_versioning);
|
2017-12-07 15:49:54 +01:00
|
|
|
NEXT_PASS (pass_loop_jam);
|
2016-11-25 11:22:57 +01:00
|
|
|
/* All unswitching, final value replacement and splitting can expose
|
|
|
|
empty loops. Remove them now. */
|
2021-01-16 09:17:38 +01:00
|
|
|
NEXT_PASS (pass_cd_dce, false /* update_address_taken_p */);
|
2017-06-07 13:31:44 +02:00
|
|
|
NEXT_PASS (pass_iv_canon);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_loop_distribution);
|
2017-12-07 19:03:53 +01:00
|
|
|
NEXT_PASS (pass_linterchange);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_copy_prop);
|
|
|
|
NEXT_PASS (pass_graphite);
|
|
|
|
PUSH_INSERT_PASSES_WITHIN (pass_graphite)
|
|
|
|
NEXT_PASS (pass_graphite_transforms);
|
|
|
|
NEXT_PASS (pass_lim);
|
|
|
|
NEXT_PASS (pass_copy_prop);
|
2014-06-18 13:45:17 +02:00
|
|
|
NEXT_PASS (pass_dce);
|
2013-07-18 20:55:48 +02:00
|
|
|
POP_INSERT_PASSES ()
|
2016-01-18 13:52:32 +01:00
|
|
|
NEXT_PASS (pass_parallelize_loops, false /* oacc_kernels_p */);
|
2016-01-16 23:19:05 +01:00
|
|
|
NEXT_PASS (pass_expand_omp_ssa);
|
2015-07-02 13:47:31 +02:00
|
|
|
NEXT_PASS (pass_ch_vect);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_if_conversion);
|
tree-vectorizer.h (struct _loop_vec_info): Add scalar_loop field.
* tree-vectorizer.h (struct _loop_vec_info): Add scalar_loop field.
(LOOP_VINFO_SCALAR_LOOP): Define.
(slpeel_tree_duplicate_loop_to_edge_cfg): Add scalar_loop argument.
* config/i386/sse.md (maskload<mode>, maskstore<mode>): New expanders.
* tree-data-ref.c (get_references_in_stmt): Handle MASK_LOAD and
MASK_STORE.
* internal-fn.def (LOOP_VECTORIZED, MASK_LOAD, MASK_STORE): New
internal fns.
* tree-if-conv.c: Include expr.h, optabs.h, tree-ssa-loop-ivopts.h and
tree-ssa-address.h.
(release_bb_predicate): New function.
(free_bb_predicate): Use it.
(reset_bb_predicate): Likewise. Don't unallocate bb->aux
just to immediately allocate it again.
(add_to_predicate_list): Add loop argument. If basic blocks that
dominate loop->latch don't insert any predicate.
(add_to_dst_predicate_list): Adjust caller.
(if_convertible_phi_p): Add any_mask_load_store argument, if true,
handle it like flag_tree_loop_if_convert_stores.
(insert_gimplified_predicates): Likewise.
(ifcvt_can_use_mask_load_store): New function.
(if_convertible_gimple_assign_stmt_p): Add any_mask_load_store
argument, check if some conditional loads or stores can't be
converted into MASK_LOAD or MASK_STORE.
(if_convertible_stmt_p): Add any_mask_load_store argument,
pass it down to if_convertible_gimple_assign_stmt_p.
(predicate_bbs): Don't return bool, only check if the last stmt
of a basic block is GIMPLE_COND and handle that. Adjust
add_to_predicate_list caller.
(if_convertible_loop_p_1): Only call predicate_bbs if
flag_tree_loop_if_convert_stores and free_bb_predicate in that case
afterwards, check gimple_code of stmts here. Replace is_predicated
check with dominance check. Add any_mask_load_store argument,
pass it down to if_convertible_stmt_p and if_convertible_phi_p,
call if_convertible_phi_p only after all if_convertible_stmt_p
calls.
(if_convertible_loop_p): Add any_mask_load_store argument,
pass it down to if_convertible_loop_p_1.
(predicate_mem_writes): Emit MASK_LOAD and/or MASK_STORE calls.
(combine_blocks): Add any_mask_load_store argument, pass
it down to insert_gimplified_predicates and call predicate_mem_writes
if it is set. Call predicate_bbs.
(version_loop_for_if_conversion): New function.
(tree_if_conversion): Adjust if_convertible_loop_p and combine_blocks
calls. Return todo flags instead of bool, call
version_loop_for_if_conversion if if-conversion should be just
for the vectorized loops and nothing else.
(main_tree_if_conversion): Adjust caller. Don't call
tree_if_conversion for dont_vectorize loops if if-conversion
isn't explicitly enabled.
* tree-vect-data-refs.c (vect_check_gather): Handle
MASK_LOAD/MASK_STORE.
(vect_analyze_data_refs, vect_supportable_dr_alignment): Likewise.
* gimple.h (gimple_expr_type): Handle MASK_STORE.
* internal-fn.c (expand_LOOP_VECTORIZED, expand_MASK_LOAD,
expand_MASK_STORE): New functions.
* tree-vectorizer.c: Include tree-cfg.h and gimple-fold.h.
(vect_loop_vectorized_call, fold_loop_vectorized_call): New functions.
(vectorize_loops): Don't try to vectorize loops with
loop->dont_vectorize set. Set LOOP_VINFO_SCALAR_LOOP for if-converted
loops, fold LOOP_VECTORIZED internal call depending on if loop
has been vectorized or not.
* tree-vect-loop-manip.c (slpeel_duplicate_current_defs_from_edges):
New function.
(slpeel_tree_duplicate_loop_to_edge_cfg): Add scalar_loop argument.
If non-NULL, copy basic blocks from scalar_loop instead of loop, but
still to loop's entry or exit edge.
(slpeel_tree_peel_loop_to_edge): Add scalar_loop argument, pass it
down to slpeel_tree_duplicate_loop_to_edge_cfg.
(vect_do_peeling_for_loop_bound, vect_do_peeling_for_loop_alignment):
Adjust callers.
(vect_loop_versioning): If LOOP_VINFO_SCALAR_LOOP, perform loop
versioning from that loop instead of LOOP_VINFO_LOOP, move it to the
right place in the CFG afterwards.
* tree-vect-loop.c (vect_determine_vectorization_factor): Handle
MASK_STORE.
* cfgloop.h (struct loop): Add dont_vectorize field.
* tree-loop-distribution.c (copy_loop_before): Adjust
slpeel_tree_duplicate_loop_to_edge_cfg caller.
* optabs.def (maskload_optab, maskstore_optab): New optabs.
* passes.def: Add a note that pass_vectorize must immediately follow
pass_if_conversion.
* tree-predcom.c (split_data_refs_to_components): Give up if
DR_STMT is a call.
* tree-vect-stmts.c (vect_mark_relevant): Don't crash if lhs
is NULL.
(exist_non_indexing_operands_for_use_p): Handle MASK_LOAD
and MASK_STORE.
(vectorizable_mask_load_store): New function.
(vectorizable_call): Call it for MASK_LOAD or MASK_STORE.
(vect_transform_stmt): Handle MASK_STORE.
* tree-ssa-phiopt.c (cond_if_else_store_replacement): Ignore
DR_STMT where lhs is NULL.
* optabs.h (can_vec_perm_p): Fix up comment typo.
(can_vec_mask_load_store_p): New prototype.
* optabs.c (can_vec_mask_load_store_p): New function.
* gcc.dg/vect/vect-cond-11.c: New test.
* gcc.target/i386/vect-cond-1.c: New test.
* gcc.target/i386/avx2-gather-5.c: New test.
* gcc.target/i386/avx2-gather-6.c: New test.
* gcc.dg/vect/vect-mask-loadstore-1.c: New test.
* gcc.dg/vect/vect-mask-load-1.c: New test.
From-SVN: r205856
2013-12-10 12:46:01 +01:00
|
|
|
/* pass_vectorize must immediately follow pass_if_conversion.
|
|
|
|
Please do not add any other passes in between. */
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_vectorize);
|
2020-11-03 03:51:47 +01:00
|
|
|
PUSH_INSERT_PASSES_WITHIN (pass_vectorize)
|
2014-06-18 13:45:17 +02:00
|
|
|
NEXT_PASS (pass_dce);
|
2020-11-03 03:51:47 +01:00
|
|
|
POP_INSERT_PASSES ()
|
|
|
|
NEXT_PASS (pass_predcom);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_complete_unroll);
|
2020-11-03 03:51:47 +01:00
|
|
|
NEXT_PASS (pass_pre_slp_scalar_cleanup);
|
|
|
|
PUSH_INSERT_PASSES_WITHIN (pass_pre_slp_scalar_cleanup)
|
|
|
|
NEXT_PASS (pass_fre, false /* may_iterate */);
|
|
|
|
NEXT_PASS (pass_dse);
|
|
|
|
POP_INSERT_PASSES ()
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_slp_vectorize);
|
|
|
|
NEXT_PASS (pass_loop_prefetch);
|
2014-06-23 18:51:10 +02:00
|
|
|
/* Run IVOPTs after the last pass that uses data-reference analysis
|
|
|
|
as that doesn't handle TARGET_MEM_REFs. */
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_iv_optimize);
|
|
|
|
NEXT_PASS (pass_lim);
|
|
|
|
NEXT_PASS (pass_tree_loop_done);
|
|
|
|
POP_INSERT_PASSES ()
|
2014-06-23 18:51:10 +02:00
|
|
|
/* Pass group that runs when pass_tree_loop is disabled or there
|
|
|
|
are no loops in the function. */
|
|
|
|
NEXT_PASS (pass_tree_no_loop);
|
|
|
|
PUSH_INSERT_PASSES_WITHIN (pass_tree_no_loop)
|
|
|
|
NEXT_PASS (pass_slp_vectorize);
|
|
|
|
POP_INSERT_PASSES ()
|
2015-06-17 19:59:25 +02:00
|
|
|
NEXT_PASS (pass_simduid_cleanup);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_lower_vector_ssa);
|
2018-05-18 10:43:19 +02:00
|
|
|
NEXT_PASS (pass_lower_switch);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_cse_reciprocals);
|
2021-09-14 20:41:18 +02:00
|
|
|
NEXT_PASS (pass_reassoc, false /* early_p */);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_strength_reduction);
|
2015-12-18 05:04:20 +01:00
|
|
|
NEXT_PASS (pass_split_paths);
|
2014-09-27 02:03:23 +02:00
|
|
|
NEXT_PASS (pass_tracer);
|
2019-07-01 09:54:38 +02:00
|
|
|
NEXT_PASS (pass_fre, false /* may_iterate */);
|
2019-08-26 11:29:07 +02:00
|
|
|
/* After late FRE we rewrite no longer addressed locals into SSA
|
|
|
|
form if possible. */
|
2021-10-29 17:30:42 +02:00
|
|
|
NEXT_PASS (pass_thread_jumps, /*first=*/false);
|
2015-11-16 13:40:24 +01:00
|
|
|
NEXT_PASS (pass_dominator, false /* may_peel_loop_headers_p */);
|
2014-06-24 20:50:00 +02:00
|
|
|
NEXT_PASS (pass_strlen);
|
2021-10-29 17:30:42 +02:00
|
|
|
NEXT_PASS (pass_thread_jumps_full, /*first=*/false);
|
2015-11-16 13:40:05 +01:00
|
|
|
NEXT_PASS (pass_vrp, false /* warn_array_bounds_p */);
|
2021-10-19 09:33:34 +02:00
|
|
|
/* Run CCP to compute alignment and nonzero bits. */
|
introduce try store by multiple pieces
The ldist pass turns even very short loops into memset calls. E.g.,
the TFmode emulation calls end with a loop of up to 3 iterations, to
zero out trailing words, and the loop distribution pass turns them
into calls of the memset builtin.
Though short constant-length clearing memsets are usually dealt with
efficiently, for non-constant-length ones, the options are setmemM, or
a function calls.
RISC-V doesn't have any setmemM pattern, so the loops above end up
"optimized" into memset calls, incurring not only the overhead of an
explicit call, but also discarding the information the compiler has
about the alignment of the destination, and that the length is a
multiple of the word alignment.
This patch handles variable lengths with multiple conditional
power-of-2-constant-sized stores-by-pieces, so as to reduce the
overhead of length compares.
It also changes the last copy-prop pass into ccp, so that pointer
alignment and length's nonzero bits are detected and made available
for the expander, even for ldist-introduced SSA_NAMEs.
for gcc/ChangeLog
* builtins.c (try_store_by_multiple_pieces): New.
(expand_builtin_memset_args): Use it. If target_char_cast
fails, proceed as for non-constant val. Pass len's ctz to...
* expr.c (clear_storage_hints): ... this. Try store by
multiple pieces after setmem.
(clear_storage): Adjust.
* expr.h (clear_storage_hints): Likewise.
(try_store_by_multiple_pieces): Declare.
* passes.def: Replace the last copy_prop with ccp.
2021-05-04 03:48:47 +02:00
|
|
|
NEXT_PASS (pass_ccp, true /* nonzero_p */);
|
2019-04-29 22:21:57 +02:00
|
|
|
NEXT_PASS (pass_warn_restrict);
|
2013-09-12 15:19:21 +02:00
|
|
|
NEXT_PASS (pass_dse);
|
2021-01-16 09:17:38 +01:00
|
|
|
NEXT_PASS (pass_cd_dce, true /* update_address_taken_p */);
|
|
|
|
/* After late CD DCE we rewrite no longer addressed locals into SSA
|
|
|
|
form if possible. */
|
2013-09-12 15:19:21 +02:00
|
|
|
NEXT_PASS (pass_forwprop);
|
2022-03-17 08:10:59 +01:00
|
|
|
NEXT_PASS (pass_sink_code, true /* unsplit edges */);
|
2018-10-23 13:34:56 +02:00
|
|
|
NEXT_PASS (pass_phiopt, false /* early_p */);
|
2013-09-12 15:19:21 +02:00
|
|
|
NEXT_PASS (pass_fold_builtins);
|
|
|
|
NEXT_PASS (pass_optimize_widening_mul);
|
2016-10-28 16:18:50 +02:00
|
|
|
NEXT_PASS (pass_store_merging);
|
2013-09-12 15:19:21 +02:00
|
|
|
NEXT_PASS (pass_tail_calls);
|
2016-02-10 13:46:33 +01:00
|
|
|
/* If DCE is not run before checking for uninitialized uses,
|
2013-07-18 20:55:48 +02:00
|
|
|
we may get false warnings (e.g., testsuite/gcc.dg/uninit-5.c).
|
|
|
|
However, this also causes us to misdiagnose cases that should be
|
2016-02-10 13:46:33 +01:00
|
|
|
real warnings (e.g., testsuite/gcc.dg/pr18501.c). */
|
|
|
|
NEXT_PASS (pass_dce);
|
2013-09-11 14:20:07 +02:00
|
|
|
/* Split critical edges before late uninit warning to reduce the
|
|
|
|
number of false positives from it. */
|
|
|
|
NEXT_PASS (pass_split_crit_edges);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_late_warn_uninitialized);
|
|
|
|
NEXT_PASS (pass_local_pure_const);
|
2020-09-20 07:25:16 +02:00
|
|
|
NEXT_PASS (pass_modref);
|
2021-11-08 18:38:09 +01:00
|
|
|
/* uncprop replaces constants by SSA names. This makes analysis harder
|
|
|
|
and thus it should be run last. */
|
|
|
|
NEXT_PASS (pass_uncprop);
|
2013-07-18 20:55:48 +02:00
|
|
|
POP_INSERT_PASSES ()
|
|
|
|
NEXT_PASS (pass_all_optimizations_g);
|
|
|
|
PUSH_INSERT_PASSES_WITHIN (pass_all_optimizations_g)
|
2022-01-18 13:31:56 +01:00
|
|
|
/* The idea is that with -Og we do not perform any IPA optimization
|
|
|
|
so post-IPA work should be restricted to semantically required
|
|
|
|
passes and all optimization work is done early. */
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_remove_cgraph_callee_edges);
|
2018-08-10 11:31:51 +02:00
|
|
|
NEXT_PASS (pass_strip_predict_hints, false /* early_p */);
|
2013-07-18 20:55:48 +02:00
|
|
|
/* Lower remaining pieces of GIMPLE. */
|
|
|
|
NEXT_PASS (pass_lower_complex);
|
|
|
|
NEXT_PASS (pass_lower_vector_ssa);
|
2018-05-18 10:43:19 +02:00
|
|
|
NEXT_PASS (pass_lower_switch);
|
2013-07-18 20:55:48 +02:00
|
|
|
/* Perform simple scalar cleanup which is constant/copy propagation. */
|
2015-11-16 13:40:41 +01:00
|
|
|
NEXT_PASS (pass_ccp, true /* nonzero_p */);
|
2016-12-21 23:15:59 +01:00
|
|
|
NEXT_PASS (pass_post_ipa_warn);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_object_sizes);
|
|
|
|
/* Fold remaining builtins. */
|
|
|
|
NEXT_PASS (pass_fold_builtins);
|
2019-08-26 20:29:45 +02:00
|
|
|
NEXT_PASS (pass_strlen);
|
2013-07-18 20:55:48 +02:00
|
|
|
/* Copy propagation also copy-propagates constants, this is necessary
|
|
|
|
to forward object-size and builtin folding results properly. */
|
|
|
|
NEXT_PASS (pass_copy_prop);
|
|
|
|
NEXT_PASS (pass_dce);
|
2015-12-04 19:27:54 +01:00
|
|
|
NEXT_PASS (pass_sancov);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_asan);
|
|
|
|
NEXT_PASS (pass_tsan);
|
|
|
|
/* ??? We do want some kind of loop invariant motion, but we possibly
|
|
|
|
need to adjust LIM to be more friendly towards preserving accurate
|
|
|
|
debug information here. */
|
2013-09-11 14:20:07 +02:00
|
|
|
/* Split critical edges before late uninit warning to reduce the
|
|
|
|
number of false positives from it. */
|
|
|
|
NEXT_PASS (pass_split_crit_edges);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_late_warn_uninitialized);
|
2021-11-08 18:38:09 +01:00
|
|
|
/* uncprop replaces constants by SSA names. This makes analysis harder
|
|
|
|
and thus it should be run last. */
|
|
|
|
NEXT_PASS (pass_uncprop);
|
2013-07-18 20:55:48 +02:00
|
|
|
POP_INSERT_PASSES ()
|
|
|
|
NEXT_PASS (pass_tm_init);
|
|
|
|
PUSH_INSERT_PASSES_WITHIN (pass_tm_init)
|
|
|
|
NEXT_PASS (pass_tm_mark);
|
|
|
|
NEXT_PASS (pass_tm_memopt);
|
|
|
|
NEXT_PASS (pass_tm_edges);
|
|
|
|
POP_INSERT_PASSES ()
|
builtin-types.def (BT_FN_BOOL_UINT_LONGPTR_LONGPTR_LONGPTR, [...]): New.
gcc/
2015-10-13 Jakub Jelinek <jakub@redhat.com>
Aldy Hernandez <aldyh@redhat.com>
Ilya Verbin <ilya.verbin@intel.com>
* builtin-types.def (BT_FN_BOOL_UINT_LONGPTR_LONGPTR_LONGPTR,
BT_FN_BOOL_UINT_ULLPTR_ULLPTR_ULLPTR,
BT_FN_BOOL_UINT_LONGPTR_LONG_LONGPTR_LONGPTR,
BT_FN_BOOL_UINT_ULLPTR_ULL_ULLPTR_ULLPTR,
BT_FN_VOID_INT_SIZE_PTR_PTR_PTR_UINT_PTR,
BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR,
BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_BOOL_UINT_PTR_INT,
BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_UINT_LONG_INT_LONG_LONG_LONG,
BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_UINT_LONG_INT_ULL_ULL_ULL,
BT_FN_VOID_LONG_VAR, BT_FN_VOID_ULL_VAR): New.
(BT_FN_VOID_INT_PTR_SIZE_PTR_PTR_PTR,
BT_FN_VOID_INT_OMPFN_PTR_SIZE_PTR_PTR_PTR,
BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_BOOL_UINT_PTR): Remove.
* cgraph.h (enum cgraph_simd_clone_arg_type): Add
SIMD_CLONE_ARG_TYPE_LINEAR_REF_CONSTANT_STEP,
SIMD_CLONE_ARG_TYPE_LINEAR_UVAL_CONSTANT_STEP and
SIMD_CLONE_ARG_TYPE_LINEAR_VAL_CONSTANT_STEP.
(struct cgraph_simd_clone_arg): Adjust comment.
* coretypes.h (struct gomp_ordered): New forward decl.
* gimple.c (gimple_build_omp_critical): Add CLAUSES argument,
set critical clauses to it.
(gimple_build_omp_ordered): Return gomp_ordered * instead of
gimple *. Add CLAUSES argument, set ordered clauses to it.
(gimple_copy): Unshare clauses on GIMPLE_OMP_CRITICAL and
GIMPLE_OMP_ORDERED.
* gimple.def (GIMPLE_OMP_ORDERED): Change from GSS_OMP to
GSS_OMP_SINGLE_LAYOUT, move it after GIMPLE_OMP_TEAMS.
* gimple.h (enum gf_mask): Add GF_OMP_TASK_TASKLOOP. Add another bit
to GF_OMP_FOR_KIND_MASK mask. Add GF_OMP_FOR_KIND_TASKLOOP, renumber
GF_OMP_FOR_KIND_CILKFOR and GF_OMP_FOR_KIND_OACC_LOOP. Adjust
GF_OMP_FOR_SIMD, GF_OMP_FOR_COMBINED and GF_OMP_FOR_COMBINED_INTO.
Add another bit to GF_OMP_TARGET_KIND_MASK mask. Add
GF_OMP_TARGET_KIND_ENTER_DATA and GF_OMP_TARGET_KIND_EXIT_DATA,
renumber
GF_OMP_TARGET_KIND_OACC_{PARALLEL,KERNELS,DATA,UPDATE,ENTER_EXIT_DATA}.
(gomp_critical): Add clauses field.
(gomp_ordered): New struct.
(is_a_helper <gomp_ordered *>::test): New inline.
(gimple_build_omp_critical): Add CLAUSES argument.
(gimple_build_omp_ordered): Likewise. Return gomp_ordered *
instead of gimple *.
(gimple_omp_critical_clauses, gimple_omp_critical_clauses_ptr,
gimple_omp_critical_set_clauses, gimple_omp_ordered_clauses,
gimple_omp_ordered_clauses_ptr, gimple_omp_ordered_set_clauses,
gimple_omp_task_taskloop_p, gimple_omp_task_set_taskloop_p): New
inline functions.
* gimple-pretty-print.c (dump_gimple_omp_for): Handle taskloop.
(dump_gimple_omp_target): Handle enter data and exit data.
(dump_gimple_omp_block): Don't handle GIMPLE_OMP_ORDERED here.
(dump_gimple_omp_critical): Print clauses.
(dump_gimple_omp_ordered): New function.
(dump_gimple_omp_task): Handle taskloop.
(pp_gimple_stmt_1): Use dump_gimple_omp_ordered for
GIMPLE_OMP_ORDERED.
* gimple-walk.c (walk_gimple_op): Walk clauses on
GIMPLE_OMP_CRITICAL and GIMPLE_OMP_ORDERED.
* gimplify.c (enum gimplify_omp_var_data): Add GOVD_MAP_0LEN_ARRAY.
(enum omp_region_type): Add ORT_COMBINED_TARGET and ORT_NONE.
(struct gimplify_omp_ctx): Add loop_iter_var,
target_map_scalars_firstprivate, target_map_pointers_as_0len_arrays
and target_firstprivatize_array_bases fields.
(delete_omp_context): Release loop_iter_var.
(gimplify_bind_expr): Handle ORT_NONE.
(maybe_fold_stmt): Adjust check for ORT_TARGET for the addition of
ORT_COMBINED_TARGET.
(is_gimple_stmt): Return true for OMP_TASKLOOP, OMP_TEAMS and
OMP_TARGET{,_DATA,_UPDATE,_ENTER_DATA,_EXIT_DATA}.
(omp_firstprivatize_variable): Handle ORT_NONE. Adjust check for
ORT_TARGET for the addition of ORT_COMBINED_TARGET. Handle
ctx->target_map_scalars_firstprivate.
(omp_add_variable): Handle ORT_NONE. Allow map clause together with
data sharing clauses. For data sharing clause with VLA decl
on omp target/target data don't add firstprivate for the pointer.
Call omp_notice_variable on TYPE_SIZE_UNIT only if it is a DECL_P.
(omp_notice_threadprivate_variable): Adjust check for ORT_TARGET for
the addition of ORT_COMBINED_TARGET.
(omp_notice_variable): Handle ORT_NONE. Adjust check for ORT_TARGET
for the addition of ORT_COMBINED_TARGET. Handle implicit mapping of
pointers as zero length array sections and
ctx->target_map_scalars_firstprivate mapping of scalars as firstprivate
data sharing.
(omp_check_private): Handle omp_member_access_dummy_var vars.
(find_decl_expr): New function.
(gimplify_scan_omp_clauses): Add CODE argument. For OMP_CLAUSE_IF
complain if OMP_CLAUSE_IF_MODIFIER is present and does not match code.
Handle OMP_CLAUSE_GANG separately. Handle
OMP_CLAUSE_{PRIORITY,GRAINSIZE,NUM_TASKS,NOGROUP,THREADS,SIMD,SIMDLEN}
clauses. Diagnose linear clause on combined
distribute {, parallel for} simd construct, unless it is the loop
iterator. Handle struct element GOMP_MAP_FIRSTPRIVATE_POINTER.
Handle map clauses with COMPONENT_REF. Initialize
ctx->target_map_scalars_firstprivate,
ctx->target_firstprivatize_array_bases and
ctx->target_map_pointers_as_0len_arrays. Add firstprivate for
linear clause even to target region if combined. Remove
map clauses with GOMP_MAP_FIRSTPRIVATE_POINTER kind from
OMP_TARGET_{,ENTER_,EXIT_}DATA. For GOMP_MAP_FIRSTPRIVATE_POINTER
map kind with non-INTEGER_CST OMP_CLAUSE_SIZE firstprivatize the bias.
Handle OMP_CLAUSE_DEPEND_{SINK,SOURCE}. Handle
OMP_CLAUSE_{{USE,IS}_DEVICE_PTR,DEFAULTMAP,HINT}.
For linear clause on worksharing loop combined with parallel add
shared clause on the parallel. Handle OMP_CLAUSE_REDUCTION
with MEM_REF OMP_CLAUSE_DECL. Set DECL_NAME on
omp_member_access_dummy_var vars. Add lastprivate clause to outer
taskloop if needed.
(gimplify_adjust_omp_clauses_1): Handle GOVD_MAP_0LEN_ARRAY.
If gimplify_omp_ctxp->target_firstprivatize_array_bases, use
GOMP_MAP_FIRSTPRIVATE_POINTER map kind instead of
GOMP_MAP_POINTER.
(gimplify_adjust_omp_clauses): Add CODE argument. Handle removal
of GOMP_MAP_FIRSTPRIVATE_POINTER struct elements for struct not seen
in target body. Handle removal of struct mapping if struct is not
seen in target body. Remove GOMP_MAP_STRUCT map clause on
OMP_TARGET_EXIT_DATA. Adjust check for ORT_TARGET for the
addition of ORT_COMBINED_TARGET. Use GOMP_MAP_FIRSTPRIVATE_POINTER
instead of GOMP_MAP_POINTER if ctx->target_firstprivatize_array_bases
for VLAs. Set OMP_CLAUSE_MAP_PRIVATE if both data sharing and map
clause appear together. Handle
OMP_CLAUSE_{{USE,IS}_DEVICE_PTR,DEFAULTMAP,HINT}. Don't remove map
clause if it has map-type-modifier always. Handle
OMP_CLAUSE_{PRIORITY,GRAINSIZE,NUM_TASKS,NOGROUP,THREADS,SIMD,SIMDLEN}
clauses.
(gimplify_oacc_cache, gimplify_omp_parallel, gimplify_omp_task):
Adjust gimplify_scan_omp_clauses and gimplify_adjust_omp_clauses
callers.
(gimplify_omp_for): Likewise. Handle OMP_TASKLOOP. Initialize
loop_iter_var. Use OMP_FOR_ORIG_DECLS. Fix handling of lastprivate
iterators in doacross loops.
(gimplify_omp_workshare): Adjust gimplify_scan_omp_clauses and
gimplify_adjust_omp_clauses callers. Use ORT_COMBINED_TARGET
for OMP_TARGET_COMBINED. Adjust check for ORT_TARGET
for the addition of ORT_COMBINED_TARGET.
(gimplify_omp_target_update): Adjust gimplify_scan_omp_clauses and
gimplify_adjust_omp_clauses callers. Handle OMP_TARGET_ENTER_DATA
and OMP_TARGET_EXIT_DATA.
(gimplify_omp_ordered): New function.
(gimplify_expr): Handle OMP_TASKLOOP, OMP_TARGET_ENTER_DATA and
OMP_TARGET_EXIT_DATA. Use gimplify_omp_ordered for OMP_ORDERED.
Gimplify clauses on OMP_CRITICAL.
* internal-fn.c (expand_GOMP_SIMD_ORDERED_START,
expand_GOMP_SIMD_ORDERED_END): New functions.
* internal-fn.def (GOMP_SIMD_ORDERED_START,
GOMP_SIMD_ORDERED_END): New internal functions.
* omp-builtins.def (BUILT_IN_GOMP_LOOP_DOACROSS_STATIC_START,
BUILT_IN_GOMP_LOOP_DOACROSS_DYNAMIC_START,
BUILT_IN_GOMP_LOOP_DOACROSS_GUIDED_START,
BUILT_IN_GOMP_LOOP_DOACROSS_RUNTIME_START,
BUILT_IN_GOMP_LOOP_ULL_DOACROSS_STATIC_START,
BUILT_IN_GOMP_LOOP_ULL_DOACROSS_DYNAMIC_START,
BUILT_IN_GOMP_LOOP_ULL_DOACROSS_GUIDED_START,
BUILT_IN_GOMP_LOOP_ULL_DOACROSS_RUNTIME_START,
BUILT_IN_GOMP_DOACROSS_POST, BUILT_IN_GOMP_DOACROSS_WAIT,
BUILT_IN_GOMP_DOACROSS_ULL_POST, BUILT_IN_GOMP_DOACROSS_ULL_WAIT,
BUILT_IN_GOMP_TARGET_ENTER_EXIT_DATA, BUILT_IN_GOMP_TASKLOOP,
BUILT_IN_GOMP_TASKLOOP_ULL): New built-ins.
(BUILT_IN_GOMP_TASK): Add INT argument to the end.
(BUILT_IN_GOMP_TARGET): Rename from GOMP_target to GOMP_target_41,
adjust type.
(BUILT_IN_GOMP_TARGET_DATA): Rename from GOMP_target_data to
GOMP_target_data_41, adjust type.
(BUILT_IN_GOMP_TARGET_UPDATE): Rename from GOMP_target_update to
GOMP_target_update_41, adjust type.
* omp-low.c (struct omp_region): Adjust comments, add ord_stmt
field.
(struct omp_for_data): Add ordered and simd_schedule fields.
(omp_member_access_dummy_var, unshare_and_remap_1,
unshare_and_remap, is_taskloop_ctx): New functions.
(is_taskreg_ctx): Use is_parallel_ctx and is_task_ctx.
(extract_omp_for_data): Handle taskloops and doacross loops
and simd schedule modifier.
(omp_adjust_chunk_size): New function.
(get_ws_args_for): Use it.
(lookup_sfield): Change first argument to splay_tree_key,
add overload with first argument tree.
(maybe_lookup_field): Likewise.
(use_pointer_for_field): Handle omp_member_access_dummy_var.
(omp_copy_decl_2): If var is TREE_ADDRESSABLE listed in
task_shared_vars, clear TREE_ADDRESSABLE on the copy.
(build_outer_var_ref): Add LASTPRIVATE argument, handle
taskloops and omp_member_access_dummy_var vars.
(build_sender_ref): Change first argument to splay_tree_key,
add overload with first argument tree.
(install_var_field): For mask & 8 use &DECL_UID as key instead
of the tree itself.
(fixup_child_record_type): Const qualify *.omp_data_i.
(scan_sharing_clauses): Handle OMP_CLAUSE_SHARED_FIRSTPRIVATE,
C/C++ array reductions, OMP_CLAUSE_{IS,USE}_DEVICE_PTR clauses,
OMP_CLAUSE_{PRIORITY,GRAINSIZE,NUM_TASKS,SIMDLEN,THREADS,SIMD} and
OMP_CLAUSE_{NOGROUP,DEFAULTMAP} clauses, OMP_CLAUSE__LOOPTEMP_ clause
on taskloop, GOMP_MAP_FIRSTPRIVATE_POINTER, OMP_CLAUSE_MAP_PRIVATE.
(create_omp_child_function): Set TREE_READONLY on .omp_data_i.
(find_combined_for): Allow searching for different GIMPLE_OMP_FOR
kinds.
(add_taskreg_looptemp_clauses): New function.
(scan_omp_parallel): Use it.
(scan_omp_task): Likewise.
(finish_taskreg_scan): Handle OMP_CLAUSE_SHARED_FIRSTPRIVATE.
For taskloop, move fields for the first two _LOOPTEMP_ clauses first.
(check_omp_nesting_restrictions): Handle GF_OMP_TARGET_KIND_ENTER_DATA
and GF_OMP_TARGET_KIND_EXIT_DATA. Formatting fixes. Allow the
sandwiched taskloop constructs. Type check
OMP_CLAUSE_DEPEND_{KIND,SOURCE}. Allow ordered simd inside of simd
region. Diagnose depend(source) or depend(sink:...) on
target constructs or task/taskloop.
(handle_simd_reference): Use get_name.
(lower_rec_input_clauses): Likewise. Ignore all
OMP_CLAUSE_LASTPRIVATE_FIRSTPRIVATE clauses on taskloop construct.
Allow _LOOPTEMP_ clause on GOMP_TASK. Unshare new_var
before passing it to omp_clause_{default,copy}_ctor. Handle
OMP_CLAUSE_REDUCTION with MEM_REF OMP_CLAUSE_DECL. Set
lastprivate_firstprivate flag for linear that needs copyin and
copyout. Use BUILT_IN_ALLOCA_WITH_ALIGN instead of BUILT_IN_ALLOCA.
(lower_lastprivate_clauses): For OMP_CLAUSE_LASTPRIVATE_FIRSTPRIVATE
on taskloop lookup decl in outer context. Pass true to
build_outer_var_ref lastprivate argument. Handle
OMP_CLAUSE_LASTPRIVATE_TASKLOOP_IV lastprivate if the decl is global
outside of outer taskloop for.
(lower_reduction_clauses): Handle OMP_CLAUSE_REDUCTION with MEM_REF
OMP_CLAUSE_DECL.
(lower_send_clauses): Ignore first two _LOOPTEMP_ clauses in taskloop
GOMP_TASK. Handle OMP_CLAUSE_SHARED_FIRSTPRIVATE. Handle
omp_member_access_dummy_var vars. Handle OMP_CLAUSE_REDUCTION
with MEM_REF OMP_CLAUSE_DECL. Use new lookup_sfield overload.
(lower_send_shared_vars): Ignore fields with NULL or FIELD_DECL
abstract origin. Handle omp_member_access_dummy_var vars.
(expand_parallel_call): Use expand_omp_build_assign.
(expand_task_call): Handle taskloop construct expansion. Add
REGION argument. Use GOMP_TASK_* defines instead of hardcoded
integers. Add priority argument to GOMP_task* calls. Or in
GOMP_TASK_FLAG_PRIORITY into flags if priority is present for
GOMP_task call.
(expand_omp_build_assign): Add prototype. Add AFTER
argument, if true emit statements after *GSI_P and continue linking.
(expand_omp_taskreg): Adjust expand_task_call caller.
(expand_omp_for_init_counts): Rename zero_iter_bb argument to
zero_iter1_bb and first_zero_iter to first_zero_iter1, add
zero_iter2_bb and first_zero_iter2 arguments, handle computation
of counts even for ordered loops.
(expand_omp_for_init_vars): Handle GOMP_TASK inner_stmt.
(expand_omp_ordered_source, expand_omp_ordered_sink,
expand_omp_ordered_source_sink, expand_omp_for_ordered_loops): New
functions.
(expand_omp_for_generic): Use omp_adjust_chunk_size. Handle linear
clauses on worksharing loop. Handle DOACROSS loop expansion.
(expand_omp_for_static_nochunk): Handle linear clauses on
worksharing loop. Adjust expand_omp_for_init_counts
callers.
(expand_omp_for_static_chunk): Likewise. Use omp_adjust_chunk_size.
(expand_omp_simd): Handle addressable fd->loop.v. Adjust
expand_omp_for_init_counts callers.
(expand_omp_taskloop_for_outer, expand_omp_taskloop_for_inner): New
functions.
(expand_omp_for): Call expand_omp_taskloop_for_* for taskloop.
Handle doacross loops.
(expand_omp_target): Handle GF_OMP_TARGET_KIND_ENTER_DATA and
GF_OMP_TARGET_KIND_EXIT_DATA. Pass flags and depend arguments to
GOMP_target_{41,update_41,enter_exit_data} libcalls.
(expand_omp): Don't expand ordered depend constructs here, record
ord_stmt instead for later expand_omp_for_generic.
(build_omp_regions_1): Handle GF_OMP_TARGET_KIND_ENTER_DATA and
GF_OMP_TARGET_KIND_EXIT_DATA. Treat GIMPLE_OMP_ORDERED with depend
clause as stand-alone directive.
(lower_omp_ordered_clauses): New function.
(lower_omp_ordered): Handle OMP_CLAUSE_SIMD, for OMP_CLAUSE_DEPEND
don't lower anything.
(lower_omp_for_lastprivate): Use last _looptemp_ clause
on taskloop for comparison.
(lower_omp_for): Handle taskloop constructs. Adjust OMP_CLAUSE_DECL
and OMP_CLAUSE_LINEAR_STEP so that expand_omp_for_* can use it during
expansion for linear adjustments.
(create_task_copyfn): Handle OMP_CLAUSE_SHARED_FIRSTPRIVATE.
(lower_depend_clauses): Assert not seeing sink/source depend kinds.
Set TREE_ADDRESSABLE on array. Change first argument from gimple *
to tree * pointing to the stmt's clauses.
(lower_omp_taskreg): Adjust lower_depend_clauses caller.
(lower_omp_target): Handle GF_OMP_TARGET_KIND_ENTER_DATA
and GF_OMP_TARGET_KIND_EXIT_DATA, depend clauses,
GOMP_MAP_{RELEASE,ALWAYS_{TO,FROM,TOFROM},FIRSTPRIVATE_POINTER,STRUCT}
map kinds, OMP_CLAUSE_{FIRSTPRIVATE,PRIVATE,{IS,USE}_DEVICE_PTR
clauses. Always use short kind and 8-bit align shift.
(lower_omp_regimplify_p): Use IS_TYPE_OR_DECL_P macro.
(struct lower_omp_regimplify_operands_data): New type.
(lower_omp_regimplify_operands_p, lower_omp_regimplify_operands):
New functions.
(lower_omp_1): Use lower_omp_regimplify_operands instead of
gimple_regimplify_operands.
(make_gimple_omp_edges): Handle GF_OMP_TARGET_KIND_ENTER_DATA and
GF_OMP_TARGET_KIND_EXIT_DATA. Treat GIMPLE_OMP_ORDERED with depend
clause as stand-alone directive.
(simd_clone_clauses_extract): Honor OMP_CLAUSE_LINEAR_KIND.
(simd_clone_mangle): Mangle the various linear kinds
per the new ABI.
(simd_clone_adjust_argument_types): Handle
SIMD_CLONE_ARG_TYPE_LINEAR_*_CONSTANT_STEP.
(simd_clone_init_simd_arrays): Don't do anything for uval.
(simd_clone_adjust): Handle
SIMD_CLONE_ARG_TYPE_LINEAR_REF_CONSTANT_STEP like
SIMD_CLONE_ARG_TYPE_LINEAR_CONSTANT_STEP.
Handle SIMD_CLONE_ARG_TYPE_LINEAR_UVAL_CONSTANT_STEP.
* omp-low.h (omp_member_access_dummy_var): New prototype.
* passes.def (pass_simduid_cleanup): Schedule another copy of the
pass after all optimizations.
* tree.c (omp_clause_code_name): Add entries for
OMP_CLAUSE_{TO_DECLARE,LINK,{USE,IS}_DEVICE_PTR,DEFAULTMAP,HINT}
and OMP_CLAUSE_{PRIORITY,GRAINSIZE,NUM_TASKS,NOGROUP,THREADS,SIMD}.
(omp_clause_num_ops): Likewise. Bump number of OMP_CLAUSE_REDUCTION
arguments to 5 and for OMP_CLAUSE_ORDERED to 1.
(walk_tree_1): Adjust for OMP_CLAUSE_ORDERED having 1 argument and
OMP_CLAUSE_REDUCTION 5 arguments. Handle
OMP_CLAUSE_{TO_DECLARE,LINK,{USE,IS}_DEVICE_PTR,DEFAULTMAP,HINT}
and OMP_CLAUSE_{PRIORITY,GRAINSIZE,NUM_TASKS,NOGROUP,THREADS,SIMD}
clauses.
* tree-core.h (enum omp_clause_linear_kind): New.
(struct tree_omp_clause): Change type of map_kind
from unsigned char to unsigned int. Add subcode.if_modifier
and subcode.linear_kind fields.
(enum omp_clause_code): Add
OMP_CLAUSE_{TO_DECLARE,LINK,{USE,IS}_DEVICE_PTR,DEFAULTMAP,HINT}
and OMP_CLAUSE_{PRIORITY,GRAINSIZE,NUM_TASKS,NOGROUP,THREADS,SIMD}.
(OMP_CLAUSE_REDUCTION): Document
OMP_CLAUSE_REDUCTION_DECL_PLACEHOLDER.
(enum omp_clause_depend_kind): Add OMP_CLAUSE_DEPEND_{SOURCE,SINK}.
* tree.def (OMP_FOR): Add OMP_FOR_ORIG_DECLS operand.
(OMP_CRITICAL): Move before OMP_SINGLE. Add OMP_CRITICAL_CLAUSES
operand.
(OMP_ORDERED): Move before OMP_SINGLE. Add OMP_ORDERED_CLAUSES
operand.
(OMP_TASKLOOP, OMP_TARGET_ENTER_DATA, OMP_TARGET_EXIT_DATA): New tree
codes.
* tree.h (OMP_BODY): Replace OMP_CRITICAL with OMP_TASKGROUP.
(OMP_CLAUSE_SET_MAP_KIND): Cast to unsigned int rather than unsigned
char.
(OMP_CRITICAL_NAME): Adjust to be 3rd operand instead of 2nd.
(OMP_CLAUSE_NUM_TASKS_EXPR): Formatting fix.
(OMP_STANDALONE_CLAUSES): Adjust to cover OMP_TARGET_{ENTER,EXIT}_DATA.
(OMP_CLAUSE_DEPEND_SINK_NEGATIVE, OMP_TARGET_COMBINED,
OMP_CLAUSE_MAP_PRIVATE, OMP_FOR_ORIG_DECLS, OMP_CLAUSE_IF_MODIFIER,
OMP_CLAUSE_MAP_MAYBE_ZERO_LENGTH_ARRAY_SECTION, OMP_CRITICAL_CLAUSES,
OMP_CLAUSE_PRIVATE_TASKLOOP_IV, OMP_CLAUSE_LASTPRIVATE_TASKLOOP_IV,
OMP_CLAUSE_HINT_EXPR, OMP_CLAUSE_SCHEDULE_SIMD,
OMP_CLAUSE_LINEAR_KIND, OMP_CLAUSE_REDUCTION_DECL_PLACEHOLDER,
OMP_CLAUSE_SHARED_FIRSTPRIVATE, OMP_ORDERED_CLAUSES,
OMP_TARGET_ENTER_DATA_CLAUSES, OMP_TARGET_EXIT_DATA_CLAUSES,
OMP_CLAUSE_NUM_TASKS_EXPR, OMP_CLAUSE_GRAINSIZE_EXPR,
OMP_CLAUSE_PRIORITY_EXPR, OMP_CLAUSE_ORDERED_EXPR): Define.
* tree-inline.c (remap_gimple_stmt): Handle clauses on
GIMPLE_OMP_ORDERED and GIMPLE_OMP_CRITICAL. For
IFN_GOMP_SIMD_ORDERED_{START,END} set has_simduid_loops.
* tree-nested.c (convert_nonlocal_omp_clauses): Handle
OMP_CLAUSE_{TO_DECLARE,LINK,{USE,IS}_DEVICE_PTR,SIMDLEN,PRIORITY,SIMD}
and OMP_CLAUSE_{GRAINSIZE,NUM_TASKS,HINT,NOGROUP,THREADS,DEFAULTMAP}
clauses. Handle OMP_CLAUSE_REDUCTION_DECL_PLACEHOLDER.
(convert_local_omp_clauses): Likewise.
* tree-pretty-print.c (dump_omp_clause): Handle
OMP_CLAUSE_{TO_DECLARE,LINK,{USE,IS}_DEVICE_PTR,SIMDLEN,PRIORITY,SIMD}
and OMP_CLAUSE_{GRAINSIZE,NUM_TASKS,HINT,NOGROUP,THREADS,DEFAULTMAP}
clauses. Handle OMP_CLAUSE_IF_MODIFIER, OMP_CLAUSE_ORDERED_EXPR,
OMP_CLAUSE_SCHEDULE_SIMD, OMP_CLAUSE_LINEAR_KIND,
OMP_CLAUSE_DEPEND_{SOURCE,SINK}. Use "delete" for
GOMP_MAP_FORCE_DEALLOC. Handle
GOMP_MAP_{ALWAYS_{TO,FROM,TOFROM},RELEASE,FIRSTPRIVATE_POINTER,STRUCT}.
(dump_generic_node): Handle OMP_TASKLOOP, OMP_TARGET_{ENTER,EXIT}_DATA
and clauses on OMP_ORDERED and OMP_CRITICAL.
* tree-vectorizer.c (adjust_simduid_builtins): Adjust comment.
Remove IFN_GOMP_SIMD_ORDERED_{START,END}.
(vectorize_loops): Adjust comments.
(pass_simduid_cleanup::execute): Likewise.
* tree-vect-stmts.c (vectorizable_simd_clone_call): Handle
SIMD_CLONE_ARG_TYPE_LINEAR_{REF,VAL,UVAL}_CONSTANT_STEP.
* wide-int.h (wi::gcd): New.
gcc/c-family/
2015-10-13 Jakub Jelinek <jakub@redhat.com>
Aldy Hernandez <aldyh@redhat.com>
* c-common.c (enum c_builtin_type): Define DEF_FUNCTION_TYPE_9,
DEF_FUNCTION_TYPE_10 and DEF_FUNCTION_TYPE_11.
(c_define_builtins): Likewise.
* c-common.h (enum c_omp_clause_split): Add
C_OMP_CLAUSE_SPLIT_TASKLOOP.
(c_finish_omp_critical, c_finish_omp_ordered): Add CLAUSES argument.
(c_finish_omp_for): Add ORIG_DECLV argument.
* c-cppbuiltin.c (c_cpp_builtins): Predefine _OPENMP as
201511 instead of 201307.
* c-omp.c (c_finish_omp_critical): Add CLAUSES argument, set
OMP_CRITICAL_CLAUSES to it.
(c_finish_omp_ordered): Add CLAUSES argument, set
OMP_ORDERED_CLAUSES to it.
(c_finish_omp_for): Add ORIG_DECLV argument, set OMP_FOR_ORIG_DECLS
to it if OMP_FOR. Clear DECL_INITIAL on the IVs.
(c_omp_split_clauses): Handle OpenMP 4.5 combined/composite
constructs and new OpenMP 4.5 clauses. Clear
OMP_CLAUSE_SCHEDULE_SIMD if not combined with OMP_SIMD. Add
verification code.
* c-pragma.c (omp_pragmas_simd): Add taskloop.
* c-pragma.h (enum pragma_kind): Add PRAGMA_OMP_TASKLOOP.
(enum pragma_omp_clause): Add
PRAGMA_OMP_CLAUSE_{DEFAULTMAP,GRAINSIZE,HINT,{IS,USE}_DEVICE_PTR}
and PRAGMA_OMP_CLAUSE_{LINK,NOGROUP,NUM_TASKS,PRIORITY,SIMD,THREADS}.
gcc/c/
2015-10-13 Jakub Jelinek <jakub@redhat.com>
Aldy Hernandez <aldyh@redhat.com>
* c-parser.c (c_parser_pragma): Handle PRAGMA_OMP_ORDERED here.
(c_parser_omp_clause_name): Handle OpenMP 4.5 clauses.
(c_parser_omp_variable_list): Handle structure elements for
map, to and from clauses. Handle array sections in reduction
clause. Formatting fixes.
(c_parser_omp_clause_if): Add IS_OMP argument, handle parsing of
if clause modifiers.
(c_parser_omp_clause_num_tasks, c_parser_omp_clause_grainsize,
c_parser_omp_clause_priority, c_parser_omp_clause_hint,
c_parser_omp_clause_defaultmap, c_parser_omp_clause_use_device_ptr,
c_parser_omp_clause_is_device_ptr): New functions.
(c_parser_omp_clause_ordered): Parse optional parameter.
(c_parser_omp_clause_reduction): Handle array reductions.
(c_parser_omp_clause_schedule): Parse optional simd modifier.
(c_parser_omp_clause_nogroup, c_parser_omp_clause_orderedkind): New
functions.
(c_parser_omp_clause_linear): Parse linear clause modifiers.
(c_parser_omp_clause_depend_sink): New function.
(c_parser_omp_clause_depend): Parse source/sink depend kinds.
(c_parser_omp_clause_map): Parse release/delete map kinds and
optional always modifier.
(c_parser_oacc_all_clauses): Adjust c_parser_omp_clause_if
and c_finish_omp_clauses callers.
(c_parser_omp_all_clauses): Likewise. Parse OpenMP 4.5 clauses.
Parse "to" as OMP_CLAUSE_TO_DECLARE if on declare target directive.
(c_parser_oacc_cache): Adjust c_finish_omp_clauses caller.
(OMP_CRITICAL_CLAUSE_MASK): Define.
(c_parser_omp_critical): Parse critical clauses.
(c_parser_omp_for_loop): Handle doacross loops, adjust
c_finish_omp_for and c_finish_omp_clauses callers.
(OMP_SIMD_CLAUSE_MASK): Add simdlen clause.
(c_parser_omp_simd): Allow ordered clause if it has no parameter.
(OMP_FOR_CLAUSE_MASK): Add linear clause.
(c_parser_omp_for): Disallow ordered clause when combined with
distribute. Disallow linear clause when combined with distribute
and not combined with simd.
(OMP_ORDERED_CLAUSE_MASK, OMP_ORDERED_DEPEND_CLAUSE_MASK): Define.
(c_parser_omp_ordered): Add CONTEXT argument, remove LOC argument,
parse clauses and if depend clause is found, don't parse a body.
(c_parser_omp_parallel): Disallow copyin clause on target parallel.
Allow target parallel without for after it.
(OMP_TASK_CLAUSE_MASK): Add priority clause.
(OMP_TARGET_DATA_CLAUSE_MASK): Add use_device_ptr clause.
(c_parser_omp_target_data): Diagnose no map clauses or clauses with
invalid kinds.
(OMP_TARGET_UPDATE_CLAUSE_MASK): Add depend and nowait clauses.
(OMP_TARGET_ENTER_DATA_CLAUSE_MASK,
OMP_TARGET_EXIT_DATA_CLAUSE_MASK): Define.
(c_parser_omp_target_enter_data, c_parser_omp_target_exit_data): New
functions.
(OMP_TARGET_CLAUSE_MASK): Add depend, nowait, private, firstprivate,
defaultmap and is_device_ptr clauses.
(c_parser_omp_target): Parse target parallel and target simd. Set
OMP_TARGET_COMBINED on combined constructs. Parse target enter data
and target exit data. Diagnose invalid map kinds.
(OMP_DECLARE_TARGET_CLAUSE_MASK): Define.
(c_parser_omp_declare_target): Parse OpenMP 4.5 forms of this
construct.
(c_parser_omp_declare_reduction): Use STRIP_NOPS when checking for
&omp_priv.
(OMP_TASKLOOP_CLAUSE_MASK): Define.
(c_parser_omp_taskloop): New function.
(c_parser_omp_construct): Don't handle PRAGMA_OMP_ORDERED here,
handle PRAGMA_OMP_TASKLOOP.
(c_parser_cilk_for): Adjust c_finish_omp_clauses callers.
* c-tree.h (c_finish_omp_clauses): Add two new arguments.
* c-typeck.c (handle_omp_array_sections_1): Fix comment typo.
Add IS_OMP argument, handle structure element bases, diagnose
bitfields, pass IS_OMP recursively, diagnose known zero length
array sections in depend clauses, handle array sections in reduction
clause, diagnose negative length even for pointers.
(handle_omp_array_sections): Add IS_OMP argument, use auto_vec for
types, pass IS_OMP down to handle_omp_array_sections_1, handle
array sections in reduction clause, set
OMP_CLAUSE_MAP_MAYBE_ZERO_LENGTH_ARRAY_SECTION if map could be zero
length array section, use GOMP_MAP_FIRSTPRIVATE_POINTER for IS_OMP.
(c_finish_omp_clauses): Add IS_OMP and DECLARE_SIMD arguments.
Handle new OpenMP 4.5 clauses and new restrictions for the old ones.
gcc/cp/
2015-10-13 Jakub Jelinek <jakub@redhat.com>
Aldy Hernandez <aldyh@redhat.com>
* class.c (finish_struct_1): Call finish_omp_declare_simd_methods.
* cp-gimplify.c (cp_gimplify_expr): Handle OMP_TASKLOOP.
(cp_genericize_r): Likewise.
(cxx_omp_finish_clause): Don't diagnose references.
(cxx_omp_disregard_value_expr): New function.
* cp-objcp-common.h (LANG_HOOKS_OMP_DISREGARD_VALUE_EXPR): Redefine.
* cp-tree.h (OMP_FOR_GIMPLIFYING_P): Document for OMP_TASKLOOP.
(DECL_OMP_PRIVATIZED_MEMBER): Define.
(finish_omp_declare_simd_methods, push_omp_privatization_clauses,
pop_omp_privatization_clauses, save_omp_privatization_clauses,
restore_omp_privatization_clauses, omp_privatize_field,
cxx_omp_disregard_value_expr): New prototypes.
(finish_omp_clauses): Add two new arguments.
(finish_omp_for): Add ORIG_DECLV argument.
* parser.c (cp_parser_lambda_body): Call
save_omp_privatization_clauses and restore_omp_privatization_clauses.
(cp_parser_omp_clause_name): Handle OpenMP 4.5 clauses.
(cp_parser_omp_var_list_no_open): Handle structure elements for
map, to and from clauses. Handle array sections in reduction
clause. Parse this keyword. Formatting fixes.
(cp_parser_omp_clause_if): Add IS_OMP argument, handle parsing of
if clause modifiers.
(cp_parser_omp_clause_num_tasks, cp_parser_omp_clause_grainsize,
cp_parser_omp_clause_priority, cp_parser_omp_clause_hint,
cp_parser_omp_clause_defaultmap): New functions.
(cp_parser_omp_clause_ordered): Parse optional parameter.
(cp_parser_omp_clause_reduction): Handle array reductions.
(cp_parser_omp_clause_schedule): Parse optional simd modifier.
(cp_parser_omp_clause_nogroup, cp_parser_omp_clause_orderedkind):
New functions.
(cp_parser_omp_clause_linear): Parse linear clause modifiers.
(cp_parser_omp_clause_depend_sink): New function.
(cp_parser_omp_clause_depend): Parse source/sink depend kinds.
(cp_parser_omp_clause_map): Parse release/delete map kinds and
optional always modifier.
(cp_parser_oacc_all_clauses): Adjust cp_parser_omp_clause_if
and finish_omp_clauses callers.
(cp_parser_omp_all_clauses): Likewise. Parse OpenMP 4.5 clauses.
Parse "to" as OMP_CLAUSE_TO_DECLARE if on declare target directive.
(OMP_CRITICAL_CLAUSE_MASK): Define.
(cp_parser_omp_critical): Parse critical clauses.
(cp_parser_omp_for_incr): Use cp_tree_equal if
processing_template_decl.
(cp_parser_omp_for_loop_init): Return tree instead of bool. Handle
non-static data member iterators.
(cp_parser_omp_for_loop): Handle doacross loops, adjust
finish_omp_for and finish_omp_clauses callers.
(cp_omp_split_clauses): Adjust finish_omp_clauses caller.
(OMP_SIMD_CLAUSE_MASK): Add simdlen clause.
(cp_parser_omp_simd): Allow ordered clause if it has no parameter.
(OMP_FOR_CLAUSE_MASK): Add linear clause.
(cp_parser_omp_for): Disallow ordered clause when combined with
distribute. Disallow linear clause when combined with distribute
and not combined with simd.
(OMP_ORDERED_CLAUSE_MASK, OMP_ORDERED_DEPEND_CLAUSE_MASK): Define.
(cp_parser_omp_ordered): Add CONTEXT argument, return bool instead
of tree, parse clauses and if depend clause is found, don't parse
a body.
(cp_parser_omp_parallel): Disallow copyin clause on target parallel.
Allow target parallel without for after it.
(OMP_TASK_CLAUSE_MASK): Add priority clause.
(OMP_TARGET_DATA_CLAUSE_MASK): Add use_device_ptr clause.
(cp_parser_omp_target_data): Diagnose no map clauses or clauses with
invalid kinds.
(OMP_TARGET_UPDATE_CLAUSE_MASK): Add depend and nowait clauses.
(OMP_TARGET_ENTER_DATA_CLAUSE_MASK,
OMP_TARGET_EXIT_DATA_CLAUSE_MASK): Define.
(cp_parser_omp_target_enter_data, cp_parser_omp_target_exit_data): New
functions.
(OMP_TARGET_CLAUSE_MASK): Add depend, nowait, private, firstprivate,
defaultmap and is_device_ptr clauses.
(cp_parser_omp_target): Parse target parallel and target simd. Set
OMP_TARGET_COMBINED on combined constructs. Parse target enter data
and target exit data. Diagnose invalid map kinds.
(cp_parser_oacc_cache): Adjust finish_omp_clauses caller.
(OMP_DECLARE_TARGET_CLAUSE_MASK): Define.
(cp_parser_omp_declare_target): Parse OpenMP 4.5 forms of this
construct.
(OMP_TASKLOOP_CLAUSE_MASK): Define.
(cp_parser_omp_taskloop): New function.
(cp_parser_omp_construct): Don't handle PRAGMA_OMP_ORDERED here,
handle PRAGMA_OMP_TASKLOOP.
(cp_parser_pragma): Handle PRAGMA_OMP_ORDERED here directly,
handle PRAGMA_OMP_TASKLOOP, call push_omp_privatization_clauses
and pop_omp_privatization_clauses around parsing calls.
(cp_parser_cilk_for): Adjust finish_omp_clauses caller.
* pt.c (apply_late_template_attributes): Adjust tsubst_omp_clauses
and finish_omp_clauses callers.
(tsubst_omp_clause_decl): Return NULL if decl is NULL.
For TREE_LIST, copy over OMP_CLAUSE_DEPEND_SINK_NEGATIVE bit.
Use tsubst_expr instead of tsubst_copy, undo convert_from_reference
effects.
(tsubst_omp_clauses): Add ALLOW_FIELDS argument. Handle new
OpenMP 4.5 clauses. Use tsubst_omp_clause_decl for more clauses.
If ALLOW_FIELDS, handle non-static data members in the clauses.
Clear OMP_CLAUSE_LINEAR_STEP if it has been cleared before.
(omp_parallel_combined_clauses): New variable.
(tsubst_omp_for_iterator): Add ORIG_DECLV argument, recur on
OMP_FOR_ORIG_DECLS, handle non-static data member iterators.
Improve handling of clauses on combined constructs.
(tsubst_expr): Call push_omp_privatization_clauses and
pop_omp_privatization_clauses around instantiation of certain
OpenMP constructs, improve handling of clauses on combined
constructs, handle OMP_TASKLOOP, adjust tsubst_omp_for_iterator,
tsubst_omp_clauses and finish_omp_for callers, handle clauses on
critical and ordered, handle OMP_TARGET_{ENTER,EXIT}_DATA.
(instantiate_decl): Call save_omp_privatization_clauses and
restore_omp_privatization_clauses around instantiation.
(dependent_omp_for_p): Fix up comment typo. Handle SCOPE_REF.
* semantics.c (omp_private_member_map, omp_private_member_vec,
omp_private_member_ignore_next): New variables.
(finish_non_static_data_member): Return dummy decl for privatized
non-static data members.
(omp_clause_decl_field, omp_clause_printable_decl,
omp_note_field_privatization, omp_privatize_field): New functions.
(handle_omp_array_sections_1): Fix comment typo.
Add IS_OMP argument, handle structure element bases, diagnose
bitfields, pass IS_OMP recursively, diagnose known zero length
array sections in depend clauses, handle array sections in reduction
clause, diagnose negative length even for pointers.
(handle_omp_array_sections): Add IS_OMP argument, use auto_vec for
types, pass IS_OMP down to handle_omp_array_sections_1, handle
array sections in reduction clause, set
OMP_CLAUSE_MAP_MAYBE_ZERO_LENGTH_ARRAY_SECTION if map could be zero
length array section, use GOMP_MAP_FIRSTPRIVATE_POINTER for IS_OMP.
(finish_omp_reduction_clause): Handle array sections and arrays.
Use omp_clause_printable_decl.
(finish_omp_declare_simd_methods, cp_finish_omp_clause_depend_sink):
New functions.
(finish_omp_clauses): Add ALLOW_FIELDS and DECLARE_SIMD arguments.
Handle new OpenMP 4.5 clauses and new restrictions for the old
ones, handle non-static data members, reject this keyword when not
allowed.
(push_omp_privatization_clauses, pop_omp_privatization_clauses,
save_omp_privatization_clauses, restore_omp_privatization_clauses):
New functions.
(handle_omp_for_class_iterator): Handle OMP_TASKLOOP class iterators.
Add collapse and ordered arguments. Fix handling of lastprivate
iterators in doacross loops.
(finish_omp_for): Add ORIG_DECLV argument, handle doacross loops,
adjust c_finish_omp_for, handle_omp_for_class_iterator and
finish_omp_clauses callers. Fill in OMP_CLAUSE_LINEAR_STEP on simd
loops with non-static data member iterators.
gcc/fortran/
2015-10-13 Jakub Jelinek <jakub@redhat.com>
Ilya Verbin <ilya.verbin@intel.com>
* f95-lang.c (DEF_FUNCTION_TYPE_9, DEF_FUNCTION_TYPE_10,
DEF_FUNCTION_TYPE_11, DEF_FUNCTION_TYPE_VAR_1): Define.
* trans-openmp.c (gfc_trans_omp_clauses): Set
OMP_CLAUSE_IF_MODIFIER to ERROR_MARK, OMP_CLAUSE_ORDERED_EXPR
to NULL.
(gfc_trans_omp_critical): Adjust for addition of clauses.
(gfc_trans_omp_ordered): Likewise.
* types.def (BT_FN_BOOL_UINT_LONGPTR_LONGPTR_LONGPTR,
BT_FN_BOOL_UINT_ULLPTR_ULLPTR_ULLPTR,
BT_FN_BOOL_UINT_LONGPTR_LONG_LONGPTR_LONGPTR,
BT_FN_BOOL_UINT_ULLPTR_ULL_ULLPTR_ULLPTR,
BT_FN_VOID_INT_SIZE_PTR_PTR_PTR_UINT_PTR,
BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR,
BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_BOOL_UINT_PTR_INT,
BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_UINT_LONG_INT_LONG_LONG_LONG,
BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_UINT_LONG_INT_ULL_ULL_ULL,
BT_FN_VOID_LONG_VAR, BT_FN_VOID_ULL_VAR): New.
(BT_FN_VOID_INT_PTR_SIZE_PTR_PTR_PTR,
BT_FN_VOID_INT_OMPFN_PTR_SIZE_PTR_PTR_PTR,
BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_BOOL_UINT_PTR): Remove.
gcc/lto/
2015-10-13 Jakub Jelinek <jakub@redhat.com>
* lto-lang.c (DEF_FUNCTION_TYPE_9, DEF_FUNCTION_TYPE_10,
DEF_FUNCTION_TYPE_11): Define.
gcc/jit/
2015-10-13 Jakub Jelinek <jakub@redhat.com>
* jit-builtins.c (DEF_FUNCTION_TYPE_9, DEF_FUNCTION_TYPE_10,
DEF_FUNCTION_TYPE_11): Define.
* jit-builtins.h (DEF_FUNCTION_TYPE_9, DEF_FUNCTION_TYPE_10,
DEF_FUNCTION_TYPE_11): Define.
gcc/ada/
2015-10-13 Jakub Jelinek <jakub@redhat.com>
* gcc-interface/utils.c (DEF_FUNCTION_TYPE_9, DEF_FUNCTION_TYPE_10,
DEF_FUNCTION_TYPE_11): Define.
gcc/testsuite/
2015-10-13 Jakub Jelinek <jakub@redhat.com>
Aldy Hernandez <aldyh@redhat.com>
* c-c++-common/gomp/cancel-1.c (f2): Add map clause to target data.
* c-c++-common/gomp/clauses-1.c: New test.
* c-c++-common/gomp/clauses-2.c: New test.
* c-c++-common/gomp/clauses-3.c: New test.
* c-c++-common/gomp/clauses-4.c: New test.
* c-c++-common/gomp/declare-target-1.c: New test.
* c-c++-common/gomp/declare-target-2.c: New test.
* c-c++-common/gomp/depend-3.c: New test.
* c-c++-common/gomp/depend-4.c: New test.
* c-c++-common/gomp/doacross-1.c: New test.
* c-c++-common/gomp/if-1.c: New test.
* c-c++-common/gomp/if-2.c: New test.
* c-c++-common/gomp/linear-1.c: New test.
* c-c++-common/gomp/map-2.c: New test.
* c-c++-common/gomp/map-3.c: New test.
* c-c++-common/gomp/nesting-1.c (f_omp_parallel,
f_omp_target_data): Add map clause to target data.
* c-c++-common/gomp/nesting-warn-1.c (f_omp_target): Likewise.
* c-c++-common/gomp/ordered-1.c: New test.
* c-c++-common/gomp/ordered-2.c: New test.
* c-c++-common/gomp/ordered-3.c: New test.
* c-c++-common/gomp/pr61486-1.c (foo): Remove linear clause
on non-iterator.
* c-c++-common/gomp/pr61486-2.c (test, test2): Remove ordered
clause and ordered construct where no longer allowed.
* c-c++-common/gomp/priority-1.c: New test.
* c-c++-common/gomp/reduction-1.c: New test.
* c-c++-common/gomp/schedule-simd-1.c: New test.
* c-c++-common/gomp/sink-1.c: New test.
* c-c++-common/gomp/sink-2.c: New test.
* c-c++-common/gomp/sink-3.c: New test.
* c-c++-common/gomp/sink-4.c: New test.
* c-c++-common/gomp/udr-1.c: New test.
* c-c++-common/taskloop-1.c: New test.
* c-c++-common/cpp/openmp-define-3.c: Adjust for the new
value of _OPENMP macro.
* c-c++-common/cilk-plus/PS/body.c (foo): Adjust expected diagnostics.
* c-c++-common/goacc-gomp/nesting-fail-1.c (f_acc_parallel,
f_acc_kernels, f_acc_data, f_acc_loop): Add map clause to target data.
* gcc.dg/gomp/clause-1.c:
* gcc.dg/gomp/reduction-1.c: New test.
* gcc.dg/gomp/sink-fold-1.c: New test.
* gcc.dg/gomp/sink-fold-2.c: New test.
* gcc.dg/gomp/sink-fold-3.c: New test.
* gcc.dg/vect/vect-simd-clone-15.c: New test.
* g++.dg/gomp/clause-1.C (T::test): Remove dg-error on privatization
of non-static data members.
* g++.dg/gomp/clause-3.C (foo): Remove one dg-error directive.
Add some linear clause tests.
* g++.dg/gomp/declare-simd-3.C: New test.
* g++.dg/gomp/linear-1.C: New test.
* g++.dg/gomp/member-1.C: New test.
* g++.dg/gomp/member-2.C: New test.
* g++.dg/gomp/pr66571-2.C: New test.
* g++.dg/gomp/pr67504.C (foo): Add test for ordered clause with
dependent argument.
* g++.dg/gomp/pr67522.C (foo): Add test for invalid array section
in reduction clause.
* g++.dg/gomp/reference-1.C: New test.
* g++.dg/gomp/sink-1.C: New test.
* g++.dg/gomp/sink-2.C: New test.
* g++.dg/gomp/sink-3.C: New test.
* g++.dg/gomp/task-1.C: Remove both dg-error directives.
* g++.dg/gomp/this-1.C: New test.
* g++.dg/gomp/this-2.C: New test.
* g++.dg/vect/simd-clone-2.cc: New test.
* g++.dg/vect/simd-clone-2.h: New test.
* g++.dg/vect/simd-clone-3.cc: New test.
* g++.dg/vect/simd-clone-4.cc: New test.
* g++.dg/vect/simd-clone-4.h: New test.
* g++.dg/vect/simd-clone-5.cc: New test.
include/
2015-10-13 Jakub Jelinek <jakub@redhat.com>
Ilya Verbin <ilya.verbin@intel.com>
* gomp-constants.h (GOMP_MAP_FLAG_ALWAYS): Define.
(enum gomp_map_kind): Add GOMP_MAP_FIRSTPRIVATE,
GOMP_MAP_FIRSTPRIVATE_INT, GOMP_MAP_USE_DEVICE_PTR,
GOMP_MAP_ZERO_LEN_ARRAY_SECTION, GOMP_MAP_ALWAYS_TO,
GOMP_MAP_ALWAYS_FROM, GOMP_MAP_ALWAYS_TOFROM, GOMP_MAP_STRUCT,
GOMP_MAP_DELETE_ZERO_LEN_ARRAY_SECTION, GOMP_MAP_DELETE,
GOMP_MAP_RELEASE, GOMP_MAP_FIRSTPRIVATE_POINTER.
(GOMP_MAP_ALWAYS_TO_P, GOMP_MAP_ALWAYS_FROM_P): Define.
(GOMP_TASK_FLAG_UNTIED, GOMP_TASK_FLAG_FINAL, GOMP_TASK_FLAG_MERGEABLE,
GOMP_TASK_FLAG_DEPEND, GOMP_TASK_FLAG_PRIORITY, GOMP_TASK_FLAG_UP,
GOMP_TASK_FLAG_GRAINSIZE, GOMP_TASK_FLAG_IF, GOMP_TASK_FLAG_NOGROUP,
GOMP_TARGET_FLAG_NOWAIT, GOMP_TARGET_FLAG_EXIT_DATA,
GOMP_TARGET_FLAG_UPDATE): Define.
libgomp/
2015-10-13 Jakub Jelinek <jakub@redhat.com>
Aldy Hernandez <aldyh@redhat.com>
Ilya Verbin <ilya.verbin@intel.com>
* config/linux/affinity.c (omp_get_place_num_procs,
omp_get_place_proc_ids, gomp_get_place_proc_ids_8): New functions.
* config/linux/doacross.h: New file.
* config/posix/affinity.c (omp_get_place_num_procs,
omp_get_place_proc_ids, gomp_get_place_proc_ids_8): New functions.
* config/posix/doacross.h: New file.
* env.c: Include gomp-constants.h.
(struct gomp_task_icv): Rename run_sched_modifier to
run_sched_chunk_size.
(gomp_max_task_priority_var): New variable.
(parse_schedule): Rename run_sched_modifier to run_sched_chunk_size.
(handle_omp_display_env): Change _OPENMP value from 201307 to
201511. Print OMP_MAX_TASK_PRIORITY.
(initialize_env): Parse OMP_MAX_TASK_PRIORITY.
(omp_set_schedule, omp_get_schedule): Rename modifier argument to
chunk_size and run_sched_modifier to run_sched_chunk_size.
(omp_get_max_task_priority, omp_get_initial_device,
omp_get_num_places, omp_get_place_num, omp_get_partition_num_places,
omp_get_partition_place_nums): New functions.
* fortran.c (omp_set_schedule_, omp_set_schedule_8_,
omp_get_schedule_, omp_get_schedule_8_): Rename modifier argument
to chunk_size.
(omp_get_num_places_, omp_get_place_num_procs_,
omp_get_place_num_procs_8_, omp_get_place_proc_ids_,
omp_get_place_proc_ids_8_, omp_get_place_num_,
omp_get_partition_num_places_, omp_get_partition_place_nums_,
omp_get_partition_place_nums_8_, omp_get_initial_device_,
omp_get_max_task_priority_): New functions.
* libgomp_g.h (GOMP_loop_doacross_static_start,
GOMP_loop_doacross_dynamic_start, GOMP_loop_doacross_guided_start,
GOMP_loop_doacross_runtime_start, GOMP_loop_ull_doacross_static_start,
GOMP_loop_ull_doacross_dynamic_start,
GOMP_loop_ull_doacross_guided_start,
GOMP_loop_ull_doacross_runtime_start, GOMP_doacross_post,
GOMP_doacross_wait, GOMP_doacross_ull_post, GOMP_doacross_wait,
GOMP_taskloop, GOMP_taskloop_ull, GOMP_target_41,
GOMP_target_data_41, GOMP_target_update_41,
GOMP_target_enter_exit_data): New prototypes.
(GOMP_task): Add prototype argument.
* libgomp.h (_LIBGOMP_CHECKING_): Define to 0 if not yet defined.
(struct gomp_doacross_work_share): New type.
(struct gomp_work_share): Add doacross field.
(struct gomp_task_icv): Rename run_sched_modifier to
run_sched_chunk_size.
(enum gomp_task_kind): Rename GOMP_TASK_IFFALSE to
GOMP_TASK_UNDEFERRED. Add comments.
(struct gomp_task_depend_entry): Add comments.
(struct gomp_task): Likewise.
(struct gomp_taskgroup): Likewise.
(struct gomp_target_task): New type.
(struct gomp_team): Add comment.
(gomp_get_place_proc_ids_8, gomp_doacross_init,
gomp_doacross_ull_init, gomp_task_maybe_wait_for_dependencies,
gomp_create_target_task, gomp_target_task_fn): New prototypes.
(struct target_var_desc): New type.
(struct target_mem_desc): Adjust comment. Use struct
target_var_desc instead of splay_tree_key for list.
(REFCOUNT_INFINITY): Define.
(struct splay_tree_key_s): Remove copy_from field.
(struct gomp_device_descr): Add dev2dev_func field.
(enum gomp_map_vars_kind): New enum.
(gomp_map_vars): Add one argument.
* libgomp.map (OMP_4.5): Export omp_get_max_task_priority,
omp_get_max_task_priority_, omp_get_num_places, omp_get_num_places_,
omp_get_place_num_procs, omp_get_place_num_procs_,
omp_get_place_num_procs_8_, omp_get_place_proc_ids,
omp_get_place_proc_ids_, omp_get_place_proc_ids_8_, omp_get_place_num,
omp_get_place_num_, omp_get_partition_num_places,
omp_get_partition_num_places_, omp_get_partition_place_nums,
omp_get_partition_place_nums_, omp_get_partition_place_nums_8_,
omp_get_initial_device, omp_get_initial_device_, omp_target_alloc,
omp_target_free, omp_target_is_present, omp_target_memcpy,
omp_target_memcpy_rect, omp_target_associate_ptr and
omp_target_disassociate_ptr.
(GOMP_4.0.2): Renamed to ...
(GOMP_4.5): ... this. Export GOMP_target_41, GOMP_target_data_41,
GOMP_target_update_41, GOMP_target_enter_exit_data, GOMP_taskloop,
GOMP_taskloop_ull, GOMP_loop_doacross_dynamic_start,
GOMP_loop_doacross_guided_start, GOMP_loop_doacross_runtime_start,
GOMP_loop_doacross_static_start, GOMP_doacross_post,
GOMP_doacross_wait, GOMP_loop_ull_doacross_dynamic_start,
GOMP_loop_ull_doacross_guided_start,
GOMP_loop_ull_doacross_runtime_start,
GOMP_loop_ull_doacross_static_start, GOMP_doacross_ull_post and
GOMP_doacross_ull_wait.
* libgomp.texi: Document omp_get_max_task_priority.
Rename modifier argument to chunk_size for omp_set_schedule and
omp_get_schedule. Document OMP_MAX_TASK_PRIORITY env var.
* loop.c (GOMP_loop_runtime_start): Adjust for run_sched_modifier
to run_sched_chunk_size renaming.
(GOMP_loop_ordered_runtime_start): Likewise.
(gomp_loop_doacross_static_start, gomp_loop_doacross_dynamic_start,
gomp_loop_doacross_guided_start, GOMP_loop_doacross_runtime_start,
GOMP_parallel_loop_runtime_start): New functions.
(GOMP_parallel_loop_runtime): Adjust for run_sched_modifier
to run_sched_chunk_size renaming.
(GOMP_loop_doacross_static_start, GOMP_loop_doacross_dynamic_start,
GOMP_loop_doacross_guided_start): New functions or aliases.
* loop_ull.c (GOMP_loop_ull_runtime_start): Adjust for
run_sched_modifier to run_sched_chunk_size renaming.
(GOMP_loop_ull_ordered_runtime_start): Likewise.
(gomp_loop_ull_doacross_static_start,
gomp_loop_ull_doacross_dynamic_start,
gomp_loop_ull_doacross_guided_start,
GOMP_loop_ull_doacross_runtime_start): New functions.
(GOMP_loop_ull_doacross_static_start,
GOMP_loop_ull_doacross_dynamic_start,
GOMP_loop_ull_doacross_guided_start): New functions or aliases.
* oacc-mem.c (acc_map_data, present_create_copy,
gomp_acc_insert_pointer): Pass GOMP_MAP_VARS_OPENACC instead of false
to gomp_map_vars.
(gomp_acc_remove_pointer): Use copy_from from target_var_desc.
* oacc-parallel.c (GOACC_data_start): Pass GOMP_MAP_VARS_OPENACC
instead of false to gomp_map_vars.
(GOACC_parallel_keyed): Likewise. Use copy_from from target_var_desc.
* omp.h.in (omp_lock_hint_t): New type.
(omp_init_lock_with_hint, omp_init_nest_lock_with_hint,
omp_get_num_places, omp_get_place_num_procs, omp_get_place_proc_ids,
omp_get_place_num, omp_get_partition_num_places,
omp_get_partition_place_nums, omp_get_initial_device,
omp_get_max_task_priority, omp_target_alloc, omp_target_free,
omp_target_is_present, omp_target_memcpy, omp_target_memcpy_rect,
omp_target_associate_ptr, omp_target_disassociate_ptr): New
prototypes.
* omp_lib.f90.in (omp_lock_hint_kind): New parameter.
(omp_lock_hint_none, omp_lock_hint_uncontended,
omp_lock_hint_contended, omp_lock_hint_nonspeculative,
omp_lock_hint_speculative): New parameters.
(omp_init_lock_with_hint, omp_init_nest_lock_with_hint,
omp_get_num_places, omp_get_place_num_procs, omp_get_place_proc_ids,
omp_get_place_num, omp_get_partition_num_places,
omp_get_partition_place_nums, omp_get_initial_device,
omp_get_max_task_priority): New interfaces.
(omp_set_schedule, omp_get_schedule): Rename modifier argument
to chunk_size.
* omp_lib.h.in (omp_lock_hint_kind): New parameter.
(omp_lock_hint_none, omp_lock_hint_uncontended,
omp_lock_hint_contended, omp_lock_hint_nonspeculative,
omp_lock_hint_speculative): New parameters.
(omp_init_lock_with_hint, omp_init_nest_lock_with_hint,
omp_get_num_places, omp_get_place_num_procs, omp_get_place_proc_ids,
omp_get_place_num, omp_get_partition_num_places,
omp_get_partition_place_nums, omp_get_initial_device,
omp_get_max_task_priority): New functions and subroutines.
* ordered.c: Include stdarg.h and string.h.
(MAX_COLLAPSED_BITS): Define.
(gomp_doacross_init, GOMP_doacross_post, GOMP_doacross_wait,
gomp_doacross_ull_init, GOMP_doacross_ull_post,
GOMP_doacross_ull_wait): New functions.
* target.c: Include errno.h.
(resolve_device): If device is not initialized, call
gomp_init_device on it.
(gomp_map_lookup): New function.
(gomp_map_vars_existing): Add tgt_var argument, fill it in.
Don't bump refcount if REFCOUNT_INFINITY. Handle
GOMP_MAP_ALWAYS_TO_P.
(get_kind): Rename is_openacc argument to short_mapkind.
(gomp_map_pointer): Use gomp_map_lookup.
(gomp_map_fields_existing): New function.
(gomp_map_vars): Rename is_openacc argument to short_mapkind
and is_target to pragma_kind. Handle GOMP_MAP_VARS_ENTER_DATA,
handle GOMP_MAP_FIRSTPRIVATE_INT, GOMP_MAP_STRUCT,
GOMP_MAP_USE_DEVICE_PTR, GOMP_MAP_ZERO_LEN_ARRAY_SECTION.
Adjust for tgt->list changed type and copy_from living in there.
(gomp_copy_from_async): Adjust for tgt->list changed type and
copy_from living in there.
(gomp_unmap_vars): Likewise.
(gomp_update): Likewise. Rename is_openacc argument to
short_mapkind. Don't fail if object is not mapped.
(gomp_load_image_to_device): Initialize refcount to
REFCOUNT_INFINITY.
(gomp_target_fallback): New function.
(gomp_get_target_fn_addr): Likewise.
(GOMP_target): Adjust gomp_map_vars caller, use
gomp_get_target_fn_addr and gomp_target_fallback.
(GOMP_target_41): New function.
(gomp_target_data_fallback): New function.
(GOMP_target_data): Use it, adjust gomp_map_vars caller.
(GOMP_target_data_41): New function.
(GOMP_target_update): Adjust gomp_update caller.
(GOMP_target_update_41): New function.
(gomp_exit_data, GOMP_target_enter_exit_data,
gomp_target_task_fn, omp_target_alloc, omp_target_free,
omp_target_is_present, omp_target_memcpy,
omp_target_memcpy_rect_worker, omp_target_memcpy_rect,
omp_target_associate_ptr, omp_target_disassociate_ptr,
gomp_load_plugin_for_device): New functions.
* task.c: Include gomp-constants.h. Include taskloop.c
twice to get GOMP_taskloop and GOMP_taskloop_ull definitions.
(gomp_task_handle_depend): New function.
(GOMP_task): Use it. Add priority argument. Use
gomp-constant.h constants instead of hardcoded numbers.
Rename GOMP_TASK_IFFALSE to GOMP_TASK_UNDEFERRED.
(gomp_create_target_task): New function.
(verify_children_queue, verify_taskgroup_queue,
verify_task_queue): New functions.
(gomp_task_run_pre): Call verify_*_queue functions.
If an upcoming tied task is about to leave the sibling or
taskgroup queues in an invalid state, adjust appropriately.
Remove taskgroup argument. Add comments.
(gomp_task_run_post_handle_dependers): Add comments.
(gomp_task_run_post_remove_parent): Likewise.
(gomp_barrier_handle_tasks): Adjust gomp_task_run_pre caller.
(GOMP_taskwait): Likewise. Add comments.
(gomp_task_maybe_wait_for_dependencies): Fix scheduling
problem such that the first non parent_depends_on task does not
end up at the end of the children queue.
(GOMP_taskgroup_start): Rename GOMP_TASK_IFFALSE to
GOMP_TASK_UNDEFERRED.
(GOMP_taskgroup_end): Adjust gomp_task_run_pre caller.
* taskloop.c: New file.
* testsuite/lib/libgomp.exp
(check_effective_target_offload_device_nonshared_as): New proc.
* testsuite/libgomp.c/affinity-2.c: New test.
* testsuite/libgomp.c/doacross-1.c: New test.
* testsuite/libgomp.c/doacross-2.c: New test.
* testsuite/libgomp.c/examples-4/declare_target-1.c (fib_wrapper):
Add map clause to target.
* testsuite/libgomp.c/examples-4/declare_target-4.c (accum): Likewise.
* testsuite/libgomp.c/examples-4/declare_target-5.c (accum): Likewise.
* testsuite/libgomp.c/examples-4/device-1.c (main): Likewise.
* testsuite/libgomp.c/examples-4/device-3.c (main): Likewise.
* testsuite/libgomp.c/examples-4/target_data-3.c (gramSchmidt):
Likewise.
* testsuite/libgomp.c/examples-4/teams-2.c (dotprod): Likewise.
* testsuite/libgomp.c/examples-4/teams-3.c (dotprod): Likewise.
* testsuite/libgomp.c/examples-4/teams-4.c (dotprod): Likewise.
* testsuite/libgomp.c/for-2.h (OMPTGT, OMPTO, OMPFROM): Define if
not defined. Use those where needed.
* testsuite/libgomp.c/for-4.c: New test.
* testsuite/libgomp.c/for-5.c: New test.
* testsuite/libgomp.c/for-6.c: New test.
* testsuite/libgomp.c/linear-1.c: New test.
* testsuite/libgomp.c/ordered-4.c: New test.
* testsuite/libgomp.c/pr66199-2.c (f2): Adjust for linear clause
only allowed on the loop iterator.
* testsuite/libgomp.c/pr66199-3.c: New test.
* testsuite/libgomp.c/pr66199-4.c: New test.
* testsuite/libgomp.c/reduction-7.c: New test.
* testsuite/libgomp.c/reduction-8.c: New test.
* testsuite/libgomp.c/reduction-9.c: New test.
* testsuite/libgomp.c/reduction-10.c: New test.
* testsuite/libgomp.c/target-1.c (fn2, fn3, fn4): Add
map(tofrom:s).
* testsuite/libgomp.c/target-2.c (fn2, fn3, fn4): Likewise.
* testsuite/libgomp.c/target-7.c (foo): Add map(h) where needed.
* testsuite/libgomp.c/target-11.c: New test.
* testsuite/libgomp.c/target-12.c: New test.
* testsuite/libgomp.c/target-13.c: New test.
* testsuite/libgomp.c/target-14.c: New test.
* testsuite/libgomp.c/target-15.c: New test.
* testsuite/libgomp.c/target-16.c: New test.
* testsuite/libgomp.c/target-17.c: New test.
* testsuite/libgomp.c/target-18.c: New test.
* testsuite/libgomp.c/target-19.c: New test.
* testsuite/libgomp.c/target-20.c: New test.
* testsuite/libgomp.c/target-21.c: New test.
* testsuite/libgomp.c/target-22.c: New test.
* testsuite/libgomp.c/target-23.c: New test.
* testsuite/libgomp.c/target-24.c: New test.
* testsuite/libgomp.c/target-25.c: New test.
* testsuite/libgomp.c/target-26.c: New test.
* testsuite/libgomp.c/target-27.c: New test.
* testsuite/libgomp.c/taskloop-1.c: New test.
* testsuite/libgomp.c/taskloop-2.c: New test.
* testsuite/libgomp.c/taskloop-3.c: New test.
* testsuite/libgomp.c/taskloop-4.c: New test.
* testsuite/libgomp.c++/ctor-13.C: New test.
* testsuite/libgomp.c++/doacross-1.C: New test.
* testsuite/libgomp.c++/examples-4/declare_target-2.C:
Replace offload_device with offload_device_nonshared_as.
* testsuite/libgomp.c++/for-12.C: New test.
* testsuite/libgomp.c++/for-13.C: New test.
* testsuite/libgomp.c++/for-14.C: New test.
* testsuite/libgomp.c++/linear-1.C: New test.
* testsuite/libgomp.c++/member-1.C: New test.
* testsuite/libgomp.c++/member-2.C: New test.
* testsuite/libgomp.c++/member-3.C: New test.
* testsuite/libgomp.c++/member-4.C: New test.
* testsuite/libgomp.c++/member-5.C: New test.
* testsuite/libgomp.c++/ordered-1.C: New test.
* testsuite/libgomp.c++/reduction-5.C: New test.
* testsuite/libgomp.c++/reduction-6.C: New test.
* testsuite/libgomp.c++/reduction-7.C: New test.
* testsuite/libgomp.c++/reduction-8.C: New test.
* testsuite/libgomp.c++/reduction-9.C: New test.
* testsuite/libgomp.c++/reduction-10.C: New test.
* testsuite/libgomp.c++/reference-1.C: New test.
* testsuite/libgomp.c++/simd14.C: New test.
* testsuite/libgomp.c++/target-2.C (fn2): Add map(tofrom: s) clause.
* testsuite/libgomp.c++/target-5.C: New test.
* testsuite/libgomp.c++/target-6.C: New test.
* testsuite/libgomp.c++/target-7.C: New test.
* testsuite/libgomp.c++/target-8.C: New test.
* testsuite/libgomp.c++/target-9.C: New test.
* testsuite/libgomp.c++/target-10.C: New test.
* testsuite/libgomp.c++/target-11.C: New test.
* testsuite/libgomp.c++/target-12.C: New test.
* testsuite/libgomp.c++/taskloop-1.C: New test.
* testsuite/libgomp.c++/taskloop-2.C: New test.
* testsuite/libgomp.c++/taskloop-3.C: New test.
* testsuite/libgomp.c++/taskloop-4.C: New test.
* testsuite/libgomp.c++/taskloop-5.C: New test.
* testsuite/libgomp.c++/taskloop-6.C: New test.
* testsuite/libgomp.c++/taskloop-7.C: New test.
* testsuite/libgomp.c++/taskloop-8.C: New test.
* testsuite/libgomp.c++/taskloop-9.C: New test.
* testsuite/libgomp.fortran/affinity1.f90: New test.
* testsuite/libgomp.fortran/affinity2.f90: New test.
liboffloadmic/
2015-10-13 Ilya Verbin <ilya.verbin@intel.com>
* plugin/libgomp-plugin-intelmic.cpp (GOMP_OFFLOAD_dev2dev): New
function.
* plugin/offload_target_main.cpp (__offload_target_tgt2tgt): New
static function, register it in liboffloadmic.
From-SVN: r228777
2015-10-13 21:06:23 +02:00
|
|
|
NEXT_PASS (pass_simduid_cleanup);
|
Commit the vtable verification feature.
Commit the vtable verification feature. This feature is designed to
detect, at run time, if/when the vtable pointer in a C++ object has
been corrupted, before allowing virtual calls through that pointer.
If pointer corruption is detected, execution of the program is halted.
libstdc++-v3 ChangeLog:
2013-08-06 Caroline Tice <cmtice@google.com>
* fragment.am: Add XTEMPLATE_FLAGS.
* configure.ac: Add definitions for --enable-vtable-verify.
* acinclude.m4: Add --enable-vtable-verify and
--disable-vtable-verify; define --enable-vtable-verify; define
VTV_CXXFLAGS, VTV_PCH_CXXFLAGS and VTV_CXXLINKFLAGS.
* config/abi/pre/gnu.ver: Export symbols for vtable verification.
* libsupc++/Makefile.am: Define vtv_sources and add it to
libsupc___la_SOURCES and libsupc__convenience_la_SOURCES.
* libsupc++/vtv_stubs.cc: New file.
* include/Makefile.am: Add VTV_PCH_CXXFLAGS to PCHFLAGS.
* src/Makefile.am: Add VTV_CXXFLAGS to AM_CXXFLAGS; add
VTV_CXXLINKFLAGS to CXXLINK.
* src/c++98/Makefile.am: Comment out XTEMPLATE_FLAGS; add VTV_CXXFLAGS
to AM_CXXFLAGS; add VTV_CXXXLINKFLAGS to CXXLINK.
* src/C++11/Makefile.am: Ditto.
* doc/xml/manual/configure.xml: Add entry for --enable-vtable-verify.
* scripts/testsuite_flags.in: Add cxxvtvflags to Usage; cause
cxxvtvflags to use VTV_CXXFLAGS and VTV_CXXLINKFLAGS.
* testsuite/lib/libstdc++.exp: Add cxxvtvflags; add code to locate
libvtv if --enable-vtable-verify was used; set cxxvtvflags; add
cxxvtvflags to cxx_final.
* testsuite/18_support/bad_exception/23591_thread-1.c: Add
-fvtable-verify=none to compiler flags.
* testsuite/17_intro/freestanding.cc: Add -fvtable-verify=none
to compiler flags.
* configure: Regenerated.
* Makefile.in: Regenerated.
* python/Makefile.in: Regenerated.
* include/Makefile.in: Regenerated.
* libsupc++/Makefile.in: Regenerated.
* config.h.in: Regenerated.
* po/Makefile.in: Regenerated.
* src/Makefile.in: Regenerated.
* src/c++98/Makefile.in: Regenerated.
* src/c++11/Makefile.in: Regenerated.
* doc/Makefile.in: Regenerated.
* testsuite/Makefile.in: Regenerated.
top level ChangeLog:
2013-08-06 Caroline Tice <cmtice@google.com>
* configure.ac: Add target-libvtv to target_libraries; disable libvtv
on non-linux systems; add target-libvtv to noconfigdirs; add
libsupc++/.libs to C++ library search paths.
* configure: Regenerated.
* Makefile.def: Add libvtv to target_modules; make libvtv depend on
libstdc++ and libgcc.
* Makefile.in: Regenerated.
include/ChangeLog:
2013-08-06 Caroline Tice <cmtice@google.com>
* vtv-change-permission.h: New file.
contrib/ChangeLog:
2013-08-06 Caroline Tice4 <cmtice@google.com>
* gcc_update: Add libvtv files.
libgcc/ChangeLog:
2013-08-06 Caroline Tice <cmtice@google.com>
config.host (extra_parts): Add vtv_start.o, vtv_end.o
vtv_start_preinit.o and vtv_end_preinit.o.
configure.ac: Add code to check/set enable_vtable_verify.
Makefile.in: Add rules to build vtv_*.o, if enable_vtable_verify is
true.
vtv_start_preinit.c: New file.
vtv_end_preinit.c: New file.
vtv_start.c: New file.
vtv_end.c: New file.
configure: Regenerated.
gcc/ChangeLog:
2013-08-06 Caroline Tice <cmtice@google.com>
* gcc.c (VTABLE_VERIFICATION_SPEC): New definition.
(LINK_COMMAND_SPEC): Add VTABLE_VERIFICATION_SPEC.
* tree-pass.h: Add pass_vtable_verify.
* varasm.c (assemble_variable): Add code to properly set the comdat
section and name for the .vtable_map_vars section.
(assemble_vtyv_preinit_initializer): New function.
(default_sectin_type_flags): Make sure .vtable_map_vars section has
LINK_ONCE flag.
* output.h: Add function decl for assemble_vtv_preinit_initializer.
* vtable-verify.c: New file.
* vtable-verify.h: New file.
* flag-types.h (enum vtv_priority): Defintions for flag_vtable_verify
initialiation levels.
* timevar.def (TV_VTABLE_VERIFICATION): New definition.
* passes.def: Insert pass_vtable_verify.
* aclocal.m4: Reorder includes.
* doc/invoke.texi: Add documentation for the flags -fvtable-verify=,
-fvtv-debug and -fvtv-counts.
* config/gnu-user.h (GNU_USER_TARGET_STARTFILE_SPEC): Add vtv_start*.o,
as appropriate, if -fvtable-verify=... is used.
(GNU_USER_TARGET_ENDFILE_SPEC): Add vtv_end*.o as appropriate, if
-fvtable-verify=... is used.
* Makefile.in (OBJS): Add vtable-verify.o to list.
(vtable-verify.o): Add new build rule.
(GTFILES): Add vtable-verify.c to list.
* common.opt (fvtable-verify=): New flag.
(vtv_priority): Values for fvtable-verify= flag.
(fvtv-counts): New flag.
(fvtv-debug): New flag.
* tree.h (save_vtable_map_decl): New extern function decl.
gcc/cp/ChangeLog:
2013-08-06 Caroline Tice <cmtice@google.com>
* Make-lang.in (*CXX_AND_OBJCXX_OBJS): Add vtable-class-hierarchy.o to
list.
(vtable-class-hierarchy.o): Add build rule.
* cp-tree.h (vtv_start_verification_constructor_init_function): New
extern function decl.
(vtv_finish_verification_constructor_init_function): New extern
function decl.
(build_vtbl_address): New extern function decl.
(get_mangled_vtable_map_var_name): New extern function decl.
(vtv_compute_class_hierarchy_transitive_closure): New extern function
decl.
(vtv_generate_init_routine): New extern function decl.
(vtv_save_class_info): New extern function decl.
(vtv_recover_class_info): New extern function decl.
(vtv_build_vtable_verify_fndecl): New extern function decl.
* class.c (finish_struct_1): Add call to vtv_save_class_info if
flag_vtable_verify is true.
* config-lang.in: Add vtable-class-hierarchy.c to gtfiles list.
* vtable-class-hierarchy.c: New file.
* mangle.c (get_mangled_vtable_map_var_name): New function.
* decl2.c (start_objects): Update function comment.
(cp_write_global_declarations): Call vtv_recover_class_info,
vtv_compute_class_hierarchy_transitive_closure and
vtv_build_vtable_verify_fndecl, before calling
finalize_compilation_unit, and call vtv_generate_init_rount after, IFF
flag_vtable_verify is true.
(vtv_start_verification_constructor_init_function): New function.
(vtv_finish_verification_constructor_init_function): New function.
* init.c (build_vtbl_address): Remove static qualifier from function.
libvtv/ChangeLog:
2013-08-06 Caroline Tice <cmtice@google.com>
Initial check-in of new vtable verification feature.
* configure.ac : New file.
* acinclude.m4 : New file.
* Makefile.am : New file.
* aclocal.m4 : New file.
* configure.tgt : New file.
* configure: New file (generated).
* Makefile.in: New file (generated).
* vtv_set.h : New file.
* vtv_utils.cc : New file.
* vtv_utils.h : New file.
* vtv_malloc.cc : New file.
* vtv_rts.cc : New file.
* vtv_malloc.h : New file.
* vtv_rts.h : New file.
* vtv_fail.cc : New file.
* vtv_fail.h : New file.
* vtv_map.h : New file.
* scripts/run-testsuite.sh : New file.
* scripts/sum-vtv-counts.c : New file.
* testsuite/parts-test-main.h : New file.
* testusite/dataentry.cc : New file.
* testsuite/temp_deriv.cc : New file.
* testsuite/register_pair.cc : New file.
* testsuite/virtual_inheritance.cc : New file.
* testsuite/field-test.cc : New file.
* testsuite/nested_vcall_test.cc : New file.
* testsuite/template-list-iostream.cc : New file.
* testsuite/register_pair_inserts.cc : New file.
* testsuite/register_pair_inserts_mt.cc : New file.
* testsuite/event.list : New file.
* testsuite/parts-test-extra-parts-views.cc : New file.
* testsuite/parts-test-extra-parts-views.h : New file.
* testsuite/environment-fail-32.s : New file.
* testsuite/parts-test-extra-parts.h : New file.
* testsuite/temp_deriv2.cc : New file.
* testsuite/dlopen_mt.cc : New file.
* testsuite/event.h : New file.
* testsuite/template-list.cc : New file.
* testsuite/replace-fail.cc : New file.
* testsuite/Makefile.am : New file.
* testsuite/Makefile.in: New file (generated).
* testsuite/mempool_negative.c : New file.
* testsuite/parts-test-main.cc : New file.
* testsuite/event-private.cc : New file.
* testsuite/thunk.cc : New file.
* testsuite/event-defintiions.cc : New file.
* testsuite/event-private.h : New file.
* testsuite/parts-test.list : New file.
* testusite/register_pair_mt.cc : New file.
* testsuite/povray-derived.cc : New file.
* testsuite/event-main.cc : New file.
* testsuite/environment.cc : New file.
* testsuite/template-list2.cc : New file.
* testsuite/thunk_vtable_map_attack.cc : New file.
* testsuite/parts-test-extra-parts.cc : New file.
* testsuite/environment-fail-64.s : New file.
* testsuite/dlopen.cc : New file.
* testsuite/so.cc : New file.
* testsuite/temp_deriv3.cc : New file.
* testsuite/const_vtable.cc : New file.
* testsuite/mempool_positive.c : New file.
* testsuite/dup_name.cc : New file.
From-SVN: r201555
2013-08-07 05:38:59 +02:00
|
|
|
NEXT_PASS (pass_vtable_verify);
|
2015-04-17 11:26:59 +02:00
|
|
|
NEXT_PASS (pass_lower_vaarg);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_lower_vector);
|
|
|
|
NEXT_PASS (pass_lower_complex_O0);
|
2015-12-04 19:27:54 +01:00
|
|
|
NEXT_PASS (pass_sancov_O0);
|
2018-05-18 21:52:23 +02:00
|
|
|
NEXT_PASS (pass_lower_switch_O0);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_asan_O0);
|
|
|
|
NEXT_PASS (pass_tsan_O0);
|
2013-11-19 12:45:15 +01:00
|
|
|
NEXT_PASS (pass_sanopt);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_cleanup_eh);
|
|
|
|
NEXT_PASS (pass_lower_resx);
|
|
|
|
NEXT_PASS (pass_nrv);
|
2020-03-09 13:23:03 +01:00
|
|
|
NEXT_PASS (pass_gimple_isel);
|
2021-10-28 05:51:02 +02:00
|
|
|
NEXT_PASS (pass_harden_conditional_branches);
|
|
|
|
NEXT_PASS (pass_harden_compares);
|
2022-02-03 22:51:46 +01:00
|
|
|
NEXT_PASS (pass_warn_access, /*early=*/false);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_cleanup_cfg_post_optimizing);
|
|
|
|
NEXT_PASS (pass_warn_function_noreturn);
|
|
|
|
|
|
|
|
NEXT_PASS (pass_expand);
|
|
|
|
|
|
|
|
NEXT_PASS (pass_rest_of_compilation);
|
|
|
|
PUSH_INSERT_PASSES_WITHIN (pass_rest_of_compilation)
|
|
|
|
NEXT_PASS (pass_instantiate_virtual_regs);
|
|
|
|
NEXT_PASS (pass_into_cfg_layout_mode);
|
|
|
|
NEXT_PASS (pass_jump);
|
|
|
|
NEXT_PASS (pass_lower_subreg);
|
|
|
|
NEXT_PASS (pass_df_initialize_opt);
|
|
|
|
NEXT_PASS (pass_cse);
|
|
|
|
NEXT_PASS (pass_rtl_fwprop);
|
|
|
|
NEXT_PASS (pass_rtl_cprop);
|
|
|
|
NEXT_PASS (pass_rtl_pre);
|
|
|
|
NEXT_PASS (pass_rtl_hoist);
|
|
|
|
NEXT_PASS (pass_rtl_cprop);
|
|
|
|
NEXT_PASS (pass_rtl_store_motion);
|
|
|
|
NEXT_PASS (pass_cse_after_global_opts);
|
|
|
|
NEXT_PASS (pass_rtl_ifcvt);
|
|
|
|
NEXT_PASS (pass_reginfo_init);
|
|
|
|
/* Perform loop optimizations. It might be better to do them a bit
|
|
|
|
sooner, but we want the profile feedback to work more
|
|
|
|
efficiently. */
|
|
|
|
NEXT_PASS (pass_loop2);
|
|
|
|
PUSH_INSERT_PASSES_WITHIN (pass_loop2)
|
|
|
|
NEXT_PASS (pass_rtl_loop_init);
|
|
|
|
NEXT_PASS (pass_rtl_move_loop_invariants);
|
2014-10-15 10:02:06 +02:00
|
|
|
NEXT_PASS (pass_rtl_unroll_loops);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_rtl_doloop);
|
|
|
|
NEXT_PASS (pass_rtl_loop_done);
|
|
|
|
POP_INSERT_PASSES ()
|
2019-07-08 19:35:12 +02:00
|
|
|
NEXT_PASS (pass_lower_subreg2);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_web);
|
|
|
|
NEXT_PASS (pass_rtl_cprop);
|
|
|
|
NEXT_PASS (pass_cse2);
|
|
|
|
NEXT_PASS (pass_rtl_dse1);
|
|
|
|
NEXT_PASS (pass_rtl_fwprop_addr);
|
|
|
|
NEXT_PASS (pass_inc_dec);
|
|
|
|
NEXT_PASS (pass_initialize_regs);
|
|
|
|
NEXT_PASS (pass_ud_rtl_dce);
|
|
|
|
NEXT_PASS (pass_combine);
|
|
|
|
NEXT_PASS (pass_if_after_combine);
|
Move jump threading before reload
r266734 has introduced a new instance of jump threading pass in order to
take advantage of opportunities that combine opens up. It was perceived
back then that it was beneficial to delay it after reload, since that
might produce even more such opportunities.
Unfortunately jump threading interferes with hot/cold partitioning. In
the code from PR92007, it converts the following
+-------------------------- 2/HOT ------------------------+
| |
v v
3/HOT --> 5/HOT --> 8/HOT --> 11/COLD --> 6/HOT --EH--> 16/HOT
| ^
| |
+-------------------------------+
into the following:
+---------------------- 2/HOT ------------------+
| |
v v
3/HOT --> 8/HOT --> 11/COLD --> 6/COLD --EH--> 16/HOT
This makes hot bb 6 dominated by cold bb 11, and because of this
fixup_partitions makes bb 6 cold as well, which in turn makes EH edge
6->16 a crossing one. Not only can't we have crossing EH edges, we are
also not allowed to introduce new crossing edges after reload in
general, since it might require extra registers on some targets.
Therefore, move the jump threading pass between combine and hot/cold
partitioning. Building SPEC 2006 and SPEC 2017 with the old and the new
code indicates that:
* When doing jump threading right after reload, 3889 edges are threaded.
* When doing jump threading right after combine, 3918 edges are
threaded.
This means this change will not introduce performance regressions.
gcc/ChangeLog:
2019-10-28 Ilya Leoshkevich <iii@linux.ibm.com>
PR rtl-optimization/92007
* cfgcleanup.c (thread_jump): Add an assertion that we don't
call it after reload if hot/cold partitioning has been done.
(class pass_postreload_jump): Rename to
pass_jump_after_combine.
(make_pass_postreload_jump): Rename to
make_pass_jump_after_combine.
* passes.def(pass_postreload_jump): Move before reload, rename
to pass_jump_after_combine.
* tree-pass.h (make_pass_postreload_jump): Rename to
make_pass_jump_after_combine.
gcc/testsuite/ChangeLog:
2019-10-28 Ilya Leoshkevich <iii@linux.ibm.com>
PR rtl-optimization/92007
* g++.dg/opt/pr92007.C: New test (from Arseny Solokha).
From-SVN: r277507
2019-10-28 11:04:31 +01:00
|
|
|
NEXT_PASS (pass_jump_after_combine);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_partition_blocks);
|
|
|
|
NEXT_PASS (pass_outof_cfg_layout_mode);
|
|
|
|
NEXT_PASS (pass_split_all_insns);
|
2019-07-08 19:35:12 +02:00
|
|
|
NEXT_PASS (pass_lower_subreg3);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_df_initialize_no_opt);
|
|
|
|
NEXT_PASS (pass_stack_ptr_mod);
|
|
|
|
NEXT_PASS (pass_mode_switching);
|
|
|
|
NEXT_PASS (pass_match_asm_constraints);
|
|
|
|
NEXT_PASS (pass_sms);
|
2013-11-06 20:46:39 +01:00
|
|
|
NEXT_PASS (pass_live_range_shrinkage);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_sched);
|
2018-01-13 19:00:51 +01:00
|
|
|
NEXT_PASS (pass_early_remat);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_ira);
|
|
|
|
NEXT_PASS (pass_reload);
|
|
|
|
NEXT_PASS (pass_postreload);
|
|
|
|
PUSH_INSERT_PASSES_WITHIN (pass_postreload)
|
|
|
|
NEXT_PASS (pass_postreload_cse);
|
|
|
|
NEXT_PASS (pass_gcse2);
|
|
|
|
NEXT_PASS (pass_split_after_reload);
|
|
|
|
NEXT_PASS (pass_ree);
|
|
|
|
NEXT_PASS (pass_compare_elim_after_reload);
|
|
|
|
NEXT_PASS (pass_thread_prologue_and_epilogue);
|
|
|
|
NEXT_PASS (pass_rtl_dse2);
|
|
|
|
NEXT_PASS (pass_stack_adjustments);
|
|
|
|
NEXT_PASS (pass_jump2);
|
2014-06-30 22:14:42 +02:00
|
|
|
NEXT_PASS (pass_duplicate_computed_gotos);
|
2014-11-14 03:32:38 +01:00
|
|
|
NEXT_PASS (pass_sched_fusion);
|
2014-06-30 22:14:42 +02:00
|
|
|
NEXT_PASS (pass_peephole2);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_if_after_reload);
|
|
|
|
NEXT_PASS (pass_regrename);
|
|
|
|
NEXT_PASS (pass_cprop_hardreg);
|
|
|
|
NEXT_PASS (pass_fast_rtl_dce);
|
|
|
|
NEXT_PASS (pass_reorder_blocks);
|
|
|
|
NEXT_PASS (pass_leaf_regs);
|
|
|
|
NEXT_PASS (pass_split_before_sched2);
|
|
|
|
NEXT_PASS (pass_sched2);
|
|
|
|
NEXT_PASS (pass_stack_regs);
|
|
|
|
PUSH_INSERT_PASSES_WITHIN (pass_stack_regs)
|
|
|
|
NEXT_PASS (pass_split_before_regstack);
|
|
|
|
NEXT_PASS (pass_stack_regs_run);
|
|
|
|
POP_INSERT_PASSES ()
|
Reorganize post-ra pipeline for targets without register allocation.
* passes.def (pass_compute_alignments, pass_duplicate_computed_gotos,
pass_variable_tracking, pass_free_cfg, pass_machine_reorg,
pass_cleanup_barriers, pass_delay_slots,
pass_split_for_shorten_branches, pass_convert_to_eh_region_ranges,
pass_shorten_branches, pass_est_nothrow_function_flags,
pass_dwarf2_frame, pass_final): Move outside of pass_postreload and
into pass_late_compilation.
(pass_late_compilation): Add.
* passes.c (pass_data_late_compilation, pass_late_compilation,
make_pass_late_compilation): New.
* timevar.def (TV_LATE_COMPILATION): New.
From-SVN: r217124
2014-11-05 13:14:45 +01:00
|
|
|
POP_INSERT_PASSES ()
|
|
|
|
NEXT_PASS (pass_late_compilation);
|
|
|
|
PUSH_INSERT_PASSES_WITHIN (pass_late_compilation)
|
2020-10-30 20:41:38 +01:00
|
|
|
NEXT_PASS (pass_zero_call_used_regs);
|
2013-07-18 20:55:48 +02:00
|
|
|
NEXT_PASS (pass_compute_alignments);
|
|
|
|
NEXT_PASS (pass_variable_tracking);
|
|
|
|
NEXT_PASS (pass_free_cfg);
|
|
|
|
NEXT_PASS (pass_machine_reorg);
|
|
|
|
NEXT_PASS (pass_cleanup_barriers);
|
|
|
|
NEXT_PASS (pass_delay_slots);
|
|
|
|
NEXT_PASS (pass_split_for_shorten_branches);
|
|
|
|
NEXT_PASS (pass_convert_to_eh_region_ranges);
|
|
|
|
NEXT_PASS (pass_shorten_branches);
|
|
|
|
NEXT_PASS (pass_set_nothrow_function_flags);
|
|
|
|
NEXT_PASS (pass_dwarf2_frame);
|
|
|
|
NEXT_PASS (pass_final);
|
|
|
|
POP_INSERT_PASSES ()
|
|
|
|
NEXT_PASS (pass_df_finish);
|
|
|
|
POP_INSERT_PASSES ()
|
|
|
|
NEXT_PASS (pass_clean_state);
|
2016-04-17 07:21:50 +02:00
|
|
|
TERMINATE_PASS_LIST (all_passes)
|