gcc/gcc/passes.def

539 lines
22 KiB
Modula-2
Raw Normal View History

/* Description of pass structure
2022-01-03 10:42:10 +01:00
Copyright (C) 1987-2022 Free Software Foundation, Inc.
This file is part of GCC.
GCC is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 3, or (at your option) any later
version.
GCC is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
for more details.
You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3. If not see
<http://www.gnu.org/licenses/>. */
/*
Macros that should be defined when using this file:
INSERT_PASSES_AFTER (PASS)
PUSH_INSERT_PASSES_WITHIN (PASS)
POP_INSERT_PASSES ()
NEXT_PASS (PASS)
TERMINATE_PASS_LIST (PASS)
*/
/* All passes needed to lower the function into shape optimizers can
operate on. These passes are always run first on the function, but
backend might produce already lowered functions that are not processed
by these passes. */
INSERT_PASSES_AFTER (all_lowering_passes)
NEXT_PASS (pass_warn_unused_result);
NEXT_PASS (pass_diagnose_omp_blocks);
NEXT_PASS (pass_diagnose_tm_blocks);
Decompose OpenACC 'kernels' constructs into parts, a sequence of compute constructs Not yet enabled by default: for now, the current mode of OpenACC 'kernels' constructs handling still remains '-fopenacc-kernels=parloops', but that is to change later. gcc/ * omp-oacc-kernels-decompose.cc: New. * Makefile.in (OBJS): Add it. * passes.def: Instantiate it. * tree-pass.h (make_pass_omp_oacc_kernels_decompose): Declare. * flag-types.h (enum openacc_kernels): Add. * doc/invoke.texi (-fopenacc-kernels): Document. * gimple.h (enum gf_mask): Add 'GF_OMP_TARGET_KIND_OACC_PARALLEL_KERNELS_PARALLELIZED', 'GF_OMP_TARGET_KIND_OACC_PARALLEL_KERNELS_GANG_SINGLE', 'GF_OMP_TARGET_KIND_OACC_DATA_KERNELS'. (is_gimple_omp_oacc, is_gimple_omp_offloaded): Handle these. * gimple-pretty-print.c (dump_gimple_omp_target): Likewise. * omp-expand.c (expand_omp_target, build_omp_regions_1) (omp_make_gimple_edges): Likewise. * omp-low.c (scan_sharing_clauses, scan_omp_for) (check_omp_nesting_restrictions, lower_oacc_reductions) (lower_oacc_head_mark, lower_omp_target): Likewise. * omp-offload.c (execute_oacc_device_lower): Likewise. gcc/c-family/ * c.opt (fopenacc-kernels): Add. gcc/fortran/ * lang.opt (fopenacc-kernels): Add. gcc/testsuite/ * c-c++-common/goacc/kernels-decompose-1.c: New. * c-c++-common/goacc/kernels-decompose-2.c: New. * c-c++-common/goacc/kernels-decompose-ice-1.c: New. * c-c++-common/goacc/kernels-decompose-ice-2.c: New. * gfortran.dg/goacc/kernels-decompose-1.f95: New. * gfortran.dg/goacc/kernels-decompose-2.f95: New. * c-c++-common/goacc/if-clause-2.c: Adjust. * gfortran.dg/goacc/kernels-tree.f95: Likewise. libgomp/ * testsuite/libgomp.oacc-c-c++-common/declare-vla-kernels-decompose-ice-1.c: New. * testsuite/libgomp.oacc-c-c++-common/declare-vla-kernels-decompose.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/declare-vla.c: Adjust. * testsuite/libgomp.oacc-fortran/pr94358-1.f90: Likewise. Co-authored-by: Thomas Schwinge <thomas@codesourcery.com>
2019-02-01 00:59:30 +01:00
NEXT_PASS (pass_omp_oacc_kernels_decompose);
NEXT_PASS (pass_lower_omp);
NEXT_PASS (pass_lower_cf);
NEXT_PASS (pass_lower_tm);
NEXT_PASS (pass_refactor_eh);
NEXT_PASS (pass_lower_eh);
[C++ coroutines] Initial implementation. This is the squashed version of the first 6 patches that were split to facilitate review. The changes to libiberty (7th patch) to support demangling the co_await operator stand alone and are applied separately. The patch series is an initial implementation of a coroutine feature, expected to be standardised in C++20. Standardisation status (and potential impact on this implementation) -------------------------------------------------------------------- The facility was accepted into the working draft for C++20 by WG21 in February 2019. During following WG21 meetings, design and national body comments have been reviewed, with no significant change resulting. The current GCC implementation is against n4835 [1]. At this stage, the remaining potential for change comes from: * Areas of national body comments that were not resolved in the version we have worked to: (a) handling of the situation where aligned allocation is available. (b) handling of the situation where a user wants coroutines, but does not want exceptions (e.g. a GPU). * Agreed changes that have not yet been worded in a draft standard that we have worked to. It is not expected that the resolution to these can produce any major change at this phase of the standardisation process. Such changes should be limited to the coroutine-specific code. ABI --- The various compiler developers 'vendors' have discussed a minimal ABI to allow one implementation to call coroutines compiled by another. This amounts to: 1. The layout of a public portion of the coroutine frame. Coroutines need to preserve state across suspension points, the storage for this is called a "coroutine frame". The ABI mandates that pointers into the coroutine frame point to an area begining with two function pointers (to the resume and destroy functions described below); these are immediately followed by the "promise object" described in the standard. This is sufficient that the builtins can take a coroutine frame pointer and determine the address of the promise (or call the resume/destroy functions). 2. A number of compiler builtins that the standard library might use. These are implemented by this patch series. 3. This introduces a new operator 'co_await' the mangling for which is also agreed between vendors (and has an issue filed for that against the upstream c++abi). Demangling for this is added to libiberty in a separate patch. The ABI has currently no target-specific content (a given psABI might elect to mandate alignment, but the common ABI does not do this). Standard Library impact ----------------------- The current implementations require addition of only a single header to the standard library (no change to the runtime). This header is part of the patch. GCC Implementation outline -------------------------- The standard's design for coroutines does not decorate the definition of a coroutine in any way, so that a function is only known to be a coroutine when one of the keywords (co_await, co_yield, co_return) is encountered. This means that we cannot special-case such functions from the outset, but must process them differently when they are finalised - which we do from "finish_function ()". At a high level, this design of coroutine produces four pieces from the original user's function: 1. A coroutine state frame (taking the logical place of the activation record for a regular function). One item stored in that state is the index of the current suspend point. 2. A "ramp" function This is what the user calls to construct the coroutine frame and start the coroutine execution. This will return some object representing the coroutine's eventual return value (or means to continue it when it it suspended). 3. A "resume" function. This is what gets called when a the coroutine is resumed when suspended. 4. A "destroy" function. This is what gets called when the coroutine state should be destroyed and its memory released. The standard's coroutines involve cooperation of the user's authored function with a provided "promise" class, which includes mandatory methods for handling the state transitions and providing output values. Most realistic coroutines will also have one or more 'awaiter' classes that implement the user's actions for each suspend point. As we parse (or during template expansion) the types of the promise and awaiter classes become known, and can then be verified against the signatures expected by the standard. Once the function is parsed (and templates expanded) we are able to make the transformation into the four pieces noted above. The implementation here takes the approach of a series of AST transforms. The state machine suspend points are encoded in three internal functions (one of which represents an exit from scope without cleanups). These three IFNs are lowered early in the middle end, such that the majority of GCC's optimisers can be run on the resulting output. As a design choice, we have carried out the outlining of the user's function in the front end, and taken advantage of the existing middle end's abilities to inline and DCE where that is profitable. Since the state machine is actually common to both resumer and destroyer functions, we make only a single function "actor" that contains both the resume and destroy paths. The destroy function is represented by a small stub that sets a value to signal the use of the destroy path and calls the actor. The idea is that optimisation of the state machine need only be done once - and then the resume and destroy paths can be identified allowing the middle end's inline and DCE machinery to optimise as profitable as noted above. The middle end components for this implementation are: A pass that: 1. Lowers the coroutine builtins that allow the standard library header to interact with the coroutine frame (these fairly simple logical or numerical substitution of values, given a coroutine frame pointer). 2. Lowers the IFN that represents the exit from state without cleanup. Essentially, this becomes a gimple goto. 3. Sets the final size of the coroutine frame at this stage. A second pass (that requires the revised CFG that results from the lowering of the scope exit IFNs in the first). 1. Lower the IFNs that represent the state machine paths for the resume and destroy cases. Patches squashed into this commit: [C++ coroutines 1] Common code and base definitions. This part of the patch series provides the gating flag, the keywords, cpp defines etc. [C++ coroutines 2] Define builtins and internal functions. This part of the patch series provides the builtin functions used by the standard library code and the internal functions used to implement lowering of the coroutine state machine. [C++ coroutines 3] Front end parsing and transforms. There are two parts to this. 1. Parsing, template instantiation and diagnostics for the standard- mandated class entries. The user authors a function that becomes a coroutine (lazily) by making use of any of the co_await, co_yield or co_return keywords. Unlike a regular function, where the activation record is placed on the stack, and is destroyed on function exit, a coroutine has some state that persists between calls - the 'coroutine frame' (thus analogous to a stack frame). We transform the user's function into three pieces: 1. A so-called ramp function, that establishes the coroutine frame and begins execution of the coroutine. 2. An actor function that contains the state machine corresponding to the user's suspend/resume structure. 3. A stub function that calls the actor function in 'destroy' mode. The actor function is executed: * from "resume point 0" by the ramp. * from resume point N ( > 0 ) for handle.resume() calls. * from the destroy stub for destroy point N for handle.destroy() calls. The C++ coroutine design described in the standard makes use of some helper methods that are authored in a so-called "promise" class provided by the user. At parse time (or post substitution) the type of the coroutine promise will be determined. At that point, we can look up the required promise class methods and issue diagnostics if they are missing or incorrect. To avoid repeating these actions at code-gen time, we make use of temporary 'proxy' variables for the coroutine handle and the promise - which will eventually be instantiated in the coroutine frame. Each of the keywords will expand to a code sequence (although co_yield is just syntactic sugar for a co_await). We defer the analysis and transformatin until template expansion is complete so that we have complete types at that time. 2. AST analysis and transformation which performs the code-gen for the outlined state machine. The entry point here is morph_fn_to_coro () which is called from finish_function () when we have completed any template expansion. This is preceded by helper functions that implement the phases below. The process proceeds in four phases. A Initial framing. The user's function body is wrapped in the initial and final suspend points and we begin building the coroutine frame. We build empty decls for the actor and destroyer functions at this time too. When exceptions are enabled, the user's function body will also be wrapped in a try-catch block with the catch invoking the promise class 'unhandled_exception' method. B Analysis. The user's function body is analysed to determine the suspend points, if any, and to capture local variables that might persist across such suspensions. In most cases, it is not necessary to capture compiler temporaries, since the tree-lowering nests the suspensions correctly. However, in the case of a captured reference, there is a lifetime extension to the end of the full expression - which can mean across a suspend point in which case it must be promoted to a frame variable. At the conclusion of analysis, we have a conservative frame layout and maps of the local variables to their frame entry points. C Build the ramp function. Carry out the allocation for the coroutine frame (NOTE; the actual size computation is deferred until late in the middle end to allow for future optimisations that will be allowed to elide unused frame entries). We build the return object. D Build and expand the actor and destroyer function bodies. The destroyer is a trivial shim that sets a bit to indicate that the destroy dispatcher should be used and then calls into the actor. The actor function is the implementation of the user's state machine. The current suspend point is noted in an index. Each suspend point is encoded as a pair of internal functions, one in the relevant dispatcher, and one representing the suspend point. During this process, the user's local variables and the proxies for the self-handle and the promise class instanceare re-written to their coroutine frame equivalents. The complete bodies for the ramp, actor and destroy function are passed back to finish_function for folding and gimplification. [C++ coroutines 4] Middle end expanders and transforms. The first part of this is a pass that provides: * expansion of the library support builtins, these are simple boolean or numerical substitutions. * The functionality of implementing an exit from scope without cleanup is performed here by lowering an IFN to a gimple goto. This pass has to run for non-coroutine functions, since functions calling the builtins are not necessarily coroutines (i.e. they are implementing the library interfaces which may be called from anywhere). The second part is the expansion of the coroutine IFNs that describe the state machine connections to the dispatchers. This only has to be run for functions that are coroutine components. The work done by this pass is: In the front end we construct a single actor function that contains the coroutine state machine. The actor function has three entry conditions: 1. from the ramp, resume point 0 - to initial-suspend. 2. when resume () is executed (resume point N). 3. from the destroy () shim when that is executed. The actor function begins with two dispatchers; one for resume and one for destroy (where the initial entry from the ramp is a special- case of resume point 0). Each suspend point and each dispatch entry is marked with an IFN such that we can connect the relevant dispatchers to their target labels. So, if we have: CO_YIELD (NUM, FINAL, RES_LAB, DEST_LAB, FRAME_PTR) This is await point NUM, and is the final await if FINAL is non-zero. The resume point is RES_LAB, and the destroy point is DEST_LAB. We expect to find a CO_ACTOR (NUM) in the resume dispatcher and a CO_ACTOR (NUM+1) in the destroy dispatcher. Initially, the intent of keeping the resume and destroy paths together is that the conditionals controlling them are identical, and thus there would be duplication of any optimisation of those paths if the split were earlier. Subsequent inlining of the actor (and DCE) is then able to extract the resume and destroy paths as separate functions if that is found profitable by the optimisers. Once we have remade the connections to their correct postions, we elide the labels that the front end inserted. [C++ coroutines 5] Standard library header. This provides the interfaces mandated by the standard and implements the interaction with the coroutine frame by means of inline use of builtins expanded at compile-time. There should be a 1:1 correspondence with the standard sections which are cross-referenced. There is no runtime content. At this stage, we have the content in an inline namespace "__n4835" for the CD we worked to. [C++ coroutines 6] Testsuite. There are two categories of test: 1. Checks for correctly formed source code and the error reporting. 2. Checks for transformation and code-gen. The second set are run as 'torture' tests for the standard options set, including LTO. These are also intentionally run with no options provided (from the coroutines.exp script). gcc/ChangeLog: 2020-01-18 Iain Sandoe <iain@sandoe.co.uk> * Makefile.in: Add coroutine-passes.o. * builtin-types.def (BT_CONST_SIZE): New. (BT_FN_BOOL_PTR): New. (BT_FN_PTR_PTR_CONST_SIZE_BOOL): New. * builtins.def (DEF_COROUTINE_BUILTIN): New. * coroutine-builtins.def: New file. * coroutine-passes.cc: New file. * function.h (struct GTY function): Add a bit to indicate that the function is a coroutine component. * internal-fn.c (expand_CO_FRAME): New. (expand_CO_YIELD): New. (expand_CO_SUSPN): New. (expand_CO_ACTOR): New. * internal-fn.def (CO_ACTOR): New. (CO_YIELD): New. (CO_SUSPN): New. (CO_FRAME): New. * passes.def: Add pass_coroutine_lower_builtins, pass_coroutine_early_expand_ifns. * tree-pass.h (make_pass_coroutine_lower_builtins): New. (make_pass_coroutine_early_expand_ifns): New. * doc/invoke.texi: Document the fcoroutines command line switch. gcc/c-family/ChangeLog: 2020-01-18 Iain Sandoe <iain@sandoe.co.uk> * c-common.c (co_await, co_yield, co_return): New. * c-common.h (RID_CO_AWAIT, RID_CO_YIELD, RID_CO_RETURN): New enumeration values. (D_CXX_COROUTINES): Bit to identify coroutines are active. (D_CXX_COROUTINES_FLAGS): Guard for coroutine keywords. * c-cppbuiltin.c (__cpp_coroutines): New cpp define. * c.opt (fcoroutines): New command-line switch. gcc/cp/ChangeLog: 2020-01-18 Iain Sandoe <iain@sandoe.co.uk> * Make-lang.in: Add coroutines.o. * cp-tree.h (lang_decl-fn): coroutine_p, new bit. (DECL_COROUTINE_P): New. * lex.c (init_reswords): Enable keywords when the coroutine flag is set, * operators.def (co_await): New operator. * call.c (add_builtin_candidates): Handle CO_AWAIT_EXPR. (op_error): Likewise. (build_new_op_1): Likewise. (build_new_function_call): Validate coroutine builtin arguments. * constexpr.c (potential_constant_expression_1): Handle CO_AWAIT_EXPR, CO_YIELD_EXPR, CO_RETURN_EXPR. * coroutines.cc: New file. * cp-objcp-common.c (cp_common_init_ts): Add CO_AWAIT_EXPR, CO_YIELD_EXPR, CO_RETRN_EXPR as TS expressions. * cp-tree.def (CO_AWAIT_EXPR, CO_YIELD_EXPR, (CO_RETURN_EXPR): New. * cp-tree.h (coro_validate_builtin_call): New. * decl.c (emit_coro_helper): New. (finish_function): Handle the case when a function is found to be a coroutine, perform the outlining and emit the outlined functions. Set a bit to signal that this is a coroutine component. * parser.c (enum required_token): New enumeration RT_CO_YIELD. (cp_parser_unary_expression): Handle co_await. (cp_parser_assignment_expression): Handle co_yield. (cp_parser_statement): Handle RID_CO_RETURN. (cp_parser_jump_statement): Handle co_return. (cp_parser_operator): Handle co_await operator. (cp_parser_yield_expression): New. (cp_parser_required_error): Handle RT_CO_YIELD. * pt.c (tsubst_copy): Handle CO_AWAIT_EXPR. (tsubst_expr): Handle CO_AWAIT_EXPR, CO_YIELD_EXPR and CO_RETURN_EXPRs. * tree.c (cp_walk_subtrees): Likewise. libstdc++-v3/ChangeLog: 2020-01-18 Iain Sandoe <iain@sandoe.co.uk> * include/Makefile.am: Add coroutine to the std set. * include/Makefile.in: Regenerated. * include/std/coroutine: New file. gcc/testsuite/ChangeLog: 2020-01-18 Iain Sandoe <iain@sandoe.co.uk> * g++.dg/coroutines/co-await-syntax-00-needs-expr.C: New test. * g++.dg/coroutines/co-await-syntax-01-outside-fn.C: New test. * g++.dg/coroutines/co-await-syntax-02-outside-fn.C: New test. * g++.dg/coroutines/co-await-syntax-03-auto.C: New test. * g++.dg/coroutines/co-await-syntax-04-ctor-dtor.C: New test. * g++.dg/coroutines/co-await-syntax-05-constexpr.C: New test. * g++.dg/coroutines/co-await-syntax-06-main.C: New test. * g++.dg/coroutines/co-await-syntax-07-varargs.C: New test. * g++.dg/coroutines/co-await-syntax-08-lambda-auto.C: New test. * g++.dg/coroutines/co-return-syntax-01-outside-fn.C: New test. * g++.dg/coroutines/co-return-syntax-02-outside-fn.C: New test. * g++.dg/coroutines/co-return-syntax-03-auto.C: New test. * g++.dg/coroutines/co-return-syntax-04-ctor-dtor.C: New test. * g++.dg/coroutines/co-return-syntax-05-constexpr-fn.C: New test. * g++.dg/coroutines/co-return-syntax-06-main.C: New test. * g++.dg/coroutines/co-return-syntax-07-vararg.C: New test. * g++.dg/coroutines/co-return-syntax-08-bad-return.C: New test. * g++.dg/coroutines/co-return-syntax-09-lambda-auto.C: New test. * g++.dg/coroutines/co-yield-syntax-00-needs-expr.C: New test. * g++.dg/coroutines/co-yield-syntax-01-outside-fn.C: New test. * g++.dg/coroutines/co-yield-syntax-02-outside-fn.C: New test. * g++.dg/coroutines/co-yield-syntax-03-auto.C: New test. * g++.dg/coroutines/co-yield-syntax-04-ctor-dtor.C: New test. * g++.dg/coroutines/co-yield-syntax-05-constexpr.C: New test. * g++.dg/coroutines/co-yield-syntax-06-main.C: New test. * g++.dg/coroutines/co-yield-syntax-07-varargs.C: New test. * g++.dg/coroutines/co-yield-syntax-08-needs-expr.C: New test. * g++.dg/coroutines/co-yield-syntax-09-lambda-auto.C: New test. * g++.dg/coroutines/coro-builtins.C: New test. * g++.dg/coroutines/coro-missing-gro.C: New test. * g++.dg/coroutines/coro-missing-promise-yield.C: New test. * g++.dg/coroutines/coro-missing-ret-value.C: New test. * g++.dg/coroutines/coro-missing-ret-void.C: New test. * g++.dg/coroutines/coro-missing-ueh-1.C: New test. * g++.dg/coroutines/coro-missing-ueh-2.C: New test. * g++.dg/coroutines/coro-missing-ueh-3.C: New test. * g++.dg/coroutines/coro-missing-ueh.h: New test. * g++.dg/coroutines/coro-pre-proc.C: New test. * g++.dg/coroutines/coro.h: New file. * g++.dg/coroutines/coro1-ret-int-yield-int.h: New file. * g++.dg/coroutines/coroutines.exp: New file. * g++.dg/coroutines/torture/alloc-00-gro-on-alloc-fail.C: New test. * g++.dg/coroutines/torture/alloc-01-overload-newdel.C: New test. * g++.dg/coroutines/torture/call-00-co-aw-arg.C: New test. * g++.dg/coroutines/torture/call-01-multiple-co-aw.C: New test. * g++.dg/coroutines/torture/call-02-temp-co-aw.C: New test. * g++.dg/coroutines/torture/call-03-temp-ref-co-aw.C: New test. * g++.dg/coroutines/torture/class-00-co-ret.C: New test. * g++.dg/coroutines/torture/class-01-co-ret-parm.C: New test. * g++.dg/coroutines/torture/class-02-templ-parm.C: New test. * g++.dg/coroutines/torture/class-03-operator-templ-parm.C: New test. * g++.dg/coroutines/torture/class-04-lambda-1.C: New test. * g++.dg/coroutines/torture/class-05-lambda-capture-copy-local.C: New test. * g++.dg/coroutines/torture/class-06-lambda-capture-ref.C: New test. * g++.dg/coroutines/torture/co-await-00-trivial.C: New test. * g++.dg/coroutines/torture/co-await-01-with-value.C: New test. * g++.dg/coroutines/torture/co-await-02-xform.C: New test. * g++.dg/coroutines/torture/co-await-03-rhs-op.C: New test. * g++.dg/coroutines/torture/co-await-04-control-flow.C: New test. * g++.dg/coroutines/torture/co-await-05-loop.C: New test. * g++.dg/coroutines/torture/co-await-06-ovl.C: New test. * g++.dg/coroutines/torture/co-await-07-tmpl.C: New test. * g++.dg/coroutines/torture/co-await-08-cascade.C: New test. * g++.dg/coroutines/torture/co-await-09-pair.C: New test. * g++.dg/coroutines/torture/co-await-10-template-fn-arg.C: New test. * g++.dg/coroutines/torture/co-await-11-forwarding.C: New test. * g++.dg/coroutines/torture/co-await-12-operator-2.C: New test. * g++.dg/coroutines/torture/co-await-13-return-ref.C: New test. * g++.dg/coroutines/torture/co-ret-00-void-return-is-ready.C: New test. * g++.dg/coroutines/torture/co-ret-01-void-return-is-suspend.C: New test. * g++.dg/coroutines/torture/co-ret-03-different-GRO-type.C: New test. * g++.dg/coroutines/torture/co-ret-04-GRO-nontriv.C: New test. * g++.dg/coroutines/torture/co-ret-05-return-value.C: New test. * g++.dg/coroutines/torture/co-ret-06-template-promise-val-1.C: New test. * g++.dg/coroutines/torture/co-ret-07-void-cast-expr.C: New test. * g++.dg/coroutines/torture/co-ret-08-template-cast-ret.C: New test. * g++.dg/coroutines/torture/co-ret-09-bool-await-susp.C: New test. * g++.dg/coroutines/torture/co-ret-10-expression-evaluates-once.C: New test. * g++.dg/coroutines/torture/co-ret-11-co-ret-co-await.C: New test. * g++.dg/coroutines/torture/co-ret-12-co-ret-fun-co-await.C: New test. * g++.dg/coroutines/torture/co-ret-13-template-2.C: New test. * g++.dg/coroutines/torture/co-ret-14-template-3.C: New test. * g++.dg/coroutines/torture/co-yield-00-triv.C: New test. * g++.dg/coroutines/torture/co-yield-01-multi.C: New test. * g++.dg/coroutines/torture/co-yield-02-loop.C: New test. * g++.dg/coroutines/torture/co-yield-03-tmpl.C: New test. * g++.dg/coroutines/torture/co-yield-04-complex-local-state.C: New test. * g++.dg/coroutines/torture/co-yield-05-co-aw.C: New test. * g++.dg/coroutines/torture/co-yield-06-fun-parm.C: New test. * g++.dg/coroutines/torture/co-yield-07-template-fn-param.C: New test. * g++.dg/coroutines/torture/co-yield-08-more-refs.C: New test. * g++.dg/coroutines/torture/co-yield-09-more-templ-refs.C: New test. * g++.dg/coroutines/torture/coro-torture.exp: New file. * g++.dg/coroutines/torture/exceptions-test-0.C: New test. * g++.dg/coroutines/torture/func-params-00.C: New test. * g++.dg/coroutines/torture/func-params-01.C: New test. * g++.dg/coroutines/torture/func-params-02.C: New test. * g++.dg/coroutines/torture/func-params-03.C: New test. * g++.dg/coroutines/torture/func-params-04.C: New test. * g++.dg/coroutines/torture/func-params-05.C: New test. * g++.dg/coroutines/torture/func-params-06.C: New test. * g++.dg/coroutines/torture/lambda-00-co-ret.C: New test. * g++.dg/coroutines/torture/lambda-01-co-ret-parm.C: New test. * g++.dg/coroutines/torture/lambda-02-co-yield-values.C: New test. * g++.dg/coroutines/torture/lambda-03-auto-parm-1.C: New test. * g++.dg/coroutines/torture/lambda-04-templ-parm.C: New test. * g++.dg/coroutines/torture/lambda-05-capture-copy-local.C: New test. * g++.dg/coroutines/torture/lambda-06-multi-capture.C: New test. * g++.dg/coroutines/torture/lambda-07-multi-yield.C: New test. * g++.dg/coroutines/torture/lambda-08-co-ret-parm-ref.C: New test. * g++.dg/coroutines/torture/local-var-0.C: New test. * g++.dg/coroutines/torture/local-var-1.C: New test. * g++.dg/coroutines/torture/local-var-2.C: New test. * g++.dg/coroutines/torture/local-var-3.C: New test. * g++.dg/coroutines/torture/local-var-4.C: New test. * g++.dg/coroutines/torture/mid-suspend-destruction-0.C: New test. * g++.dg/coroutines/torture/pr92933.C: New test.
2020-01-18 12:54:46 +01:00
NEXT_PASS (pass_coroutine_lower_builtins);
NEXT_PASS (pass_build_cfg);
NEXT_PASS (pass_warn_function_return);
[C++ coroutines] Initial implementation. This is the squashed version of the first 6 patches that were split to facilitate review. The changes to libiberty (7th patch) to support demangling the co_await operator stand alone and are applied separately. The patch series is an initial implementation of a coroutine feature, expected to be standardised in C++20. Standardisation status (and potential impact on this implementation) -------------------------------------------------------------------- The facility was accepted into the working draft for C++20 by WG21 in February 2019. During following WG21 meetings, design and national body comments have been reviewed, with no significant change resulting. The current GCC implementation is against n4835 [1]. At this stage, the remaining potential for change comes from: * Areas of national body comments that were not resolved in the version we have worked to: (a) handling of the situation where aligned allocation is available. (b) handling of the situation where a user wants coroutines, but does not want exceptions (e.g. a GPU). * Agreed changes that have not yet been worded in a draft standard that we have worked to. It is not expected that the resolution to these can produce any major change at this phase of the standardisation process. Such changes should be limited to the coroutine-specific code. ABI --- The various compiler developers 'vendors' have discussed a minimal ABI to allow one implementation to call coroutines compiled by another. This amounts to: 1. The layout of a public portion of the coroutine frame. Coroutines need to preserve state across suspension points, the storage for this is called a "coroutine frame". The ABI mandates that pointers into the coroutine frame point to an area begining with two function pointers (to the resume and destroy functions described below); these are immediately followed by the "promise object" described in the standard. This is sufficient that the builtins can take a coroutine frame pointer and determine the address of the promise (or call the resume/destroy functions). 2. A number of compiler builtins that the standard library might use. These are implemented by this patch series. 3. This introduces a new operator 'co_await' the mangling for which is also agreed between vendors (and has an issue filed for that against the upstream c++abi). Demangling for this is added to libiberty in a separate patch. The ABI has currently no target-specific content (a given psABI might elect to mandate alignment, but the common ABI does not do this). Standard Library impact ----------------------- The current implementations require addition of only a single header to the standard library (no change to the runtime). This header is part of the patch. GCC Implementation outline -------------------------- The standard's design for coroutines does not decorate the definition of a coroutine in any way, so that a function is only known to be a coroutine when one of the keywords (co_await, co_yield, co_return) is encountered. This means that we cannot special-case such functions from the outset, but must process them differently when they are finalised - which we do from "finish_function ()". At a high level, this design of coroutine produces four pieces from the original user's function: 1. A coroutine state frame (taking the logical place of the activation record for a regular function). One item stored in that state is the index of the current suspend point. 2. A "ramp" function This is what the user calls to construct the coroutine frame and start the coroutine execution. This will return some object representing the coroutine's eventual return value (or means to continue it when it it suspended). 3. A "resume" function. This is what gets called when a the coroutine is resumed when suspended. 4. A "destroy" function. This is what gets called when the coroutine state should be destroyed and its memory released. The standard's coroutines involve cooperation of the user's authored function with a provided "promise" class, which includes mandatory methods for handling the state transitions and providing output values. Most realistic coroutines will also have one or more 'awaiter' classes that implement the user's actions for each suspend point. As we parse (or during template expansion) the types of the promise and awaiter classes become known, and can then be verified against the signatures expected by the standard. Once the function is parsed (and templates expanded) we are able to make the transformation into the four pieces noted above. The implementation here takes the approach of a series of AST transforms. The state machine suspend points are encoded in three internal functions (one of which represents an exit from scope without cleanups). These three IFNs are lowered early in the middle end, such that the majority of GCC's optimisers can be run on the resulting output. As a design choice, we have carried out the outlining of the user's function in the front end, and taken advantage of the existing middle end's abilities to inline and DCE where that is profitable. Since the state machine is actually common to both resumer and destroyer functions, we make only a single function "actor" that contains both the resume and destroy paths. The destroy function is represented by a small stub that sets a value to signal the use of the destroy path and calls the actor. The idea is that optimisation of the state machine need only be done once - and then the resume and destroy paths can be identified allowing the middle end's inline and DCE machinery to optimise as profitable as noted above. The middle end components for this implementation are: A pass that: 1. Lowers the coroutine builtins that allow the standard library header to interact with the coroutine frame (these fairly simple logical or numerical substitution of values, given a coroutine frame pointer). 2. Lowers the IFN that represents the exit from state without cleanup. Essentially, this becomes a gimple goto. 3. Sets the final size of the coroutine frame at this stage. A second pass (that requires the revised CFG that results from the lowering of the scope exit IFNs in the first). 1. Lower the IFNs that represent the state machine paths for the resume and destroy cases. Patches squashed into this commit: [C++ coroutines 1] Common code and base definitions. This part of the patch series provides the gating flag, the keywords, cpp defines etc. [C++ coroutines 2] Define builtins and internal functions. This part of the patch series provides the builtin functions used by the standard library code and the internal functions used to implement lowering of the coroutine state machine. [C++ coroutines 3] Front end parsing and transforms. There are two parts to this. 1. Parsing, template instantiation and diagnostics for the standard- mandated class entries. The user authors a function that becomes a coroutine (lazily) by making use of any of the co_await, co_yield or co_return keywords. Unlike a regular function, where the activation record is placed on the stack, and is destroyed on function exit, a coroutine has some state that persists between calls - the 'coroutine frame' (thus analogous to a stack frame). We transform the user's function into three pieces: 1. A so-called ramp function, that establishes the coroutine frame and begins execution of the coroutine. 2. An actor function that contains the state machine corresponding to the user's suspend/resume structure. 3. A stub function that calls the actor function in 'destroy' mode. The actor function is executed: * from "resume point 0" by the ramp. * from resume point N ( > 0 ) for handle.resume() calls. * from the destroy stub for destroy point N for handle.destroy() calls. The C++ coroutine design described in the standard makes use of some helper methods that are authored in a so-called "promise" class provided by the user. At parse time (or post substitution) the type of the coroutine promise will be determined. At that point, we can look up the required promise class methods and issue diagnostics if they are missing or incorrect. To avoid repeating these actions at code-gen time, we make use of temporary 'proxy' variables for the coroutine handle and the promise - which will eventually be instantiated in the coroutine frame. Each of the keywords will expand to a code sequence (although co_yield is just syntactic sugar for a co_await). We defer the analysis and transformatin until template expansion is complete so that we have complete types at that time. 2. AST analysis and transformation which performs the code-gen for the outlined state machine. The entry point here is morph_fn_to_coro () which is called from finish_function () when we have completed any template expansion. This is preceded by helper functions that implement the phases below. The process proceeds in four phases. A Initial framing. The user's function body is wrapped in the initial and final suspend points and we begin building the coroutine frame. We build empty decls for the actor and destroyer functions at this time too. When exceptions are enabled, the user's function body will also be wrapped in a try-catch block with the catch invoking the promise class 'unhandled_exception' method. B Analysis. The user's function body is analysed to determine the suspend points, if any, and to capture local variables that might persist across such suspensions. In most cases, it is not necessary to capture compiler temporaries, since the tree-lowering nests the suspensions correctly. However, in the case of a captured reference, there is a lifetime extension to the end of the full expression - which can mean across a suspend point in which case it must be promoted to a frame variable. At the conclusion of analysis, we have a conservative frame layout and maps of the local variables to their frame entry points. C Build the ramp function. Carry out the allocation for the coroutine frame (NOTE; the actual size computation is deferred until late in the middle end to allow for future optimisations that will be allowed to elide unused frame entries). We build the return object. D Build and expand the actor and destroyer function bodies. The destroyer is a trivial shim that sets a bit to indicate that the destroy dispatcher should be used and then calls into the actor. The actor function is the implementation of the user's state machine. The current suspend point is noted in an index. Each suspend point is encoded as a pair of internal functions, one in the relevant dispatcher, and one representing the suspend point. During this process, the user's local variables and the proxies for the self-handle and the promise class instanceare re-written to their coroutine frame equivalents. The complete bodies for the ramp, actor and destroy function are passed back to finish_function for folding and gimplification. [C++ coroutines 4] Middle end expanders and transforms. The first part of this is a pass that provides: * expansion of the library support builtins, these are simple boolean or numerical substitutions. * The functionality of implementing an exit from scope without cleanup is performed here by lowering an IFN to a gimple goto. This pass has to run for non-coroutine functions, since functions calling the builtins are not necessarily coroutines (i.e. they are implementing the library interfaces which may be called from anywhere). The second part is the expansion of the coroutine IFNs that describe the state machine connections to the dispatchers. This only has to be run for functions that are coroutine components. The work done by this pass is: In the front end we construct a single actor function that contains the coroutine state machine. The actor function has three entry conditions: 1. from the ramp, resume point 0 - to initial-suspend. 2. when resume () is executed (resume point N). 3. from the destroy () shim when that is executed. The actor function begins with two dispatchers; one for resume and one for destroy (where the initial entry from the ramp is a special- case of resume point 0). Each suspend point and each dispatch entry is marked with an IFN such that we can connect the relevant dispatchers to their target labels. So, if we have: CO_YIELD (NUM, FINAL, RES_LAB, DEST_LAB, FRAME_PTR) This is await point NUM, and is the final await if FINAL is non-zero. The resume point is RES_LAB, and the destroy point is DEST_LAB. We expect to find a CO_ACTOR (NUM) in the resume dispatcher and a CO_ACTOR (NUM+1) in the destroy dispatcher. Initially, the intent of keeping the resume and destroy paths together is that the conditionals controlling them are identical, and thus there would be duplication of any optimisation of those paths if the split were earlier. Subsequent inlining of the actor (and DCE) is then able to extract the resume and destroy paths as separate functions if that is found profitable by the optimisers. Once we have remade the connections to their correct postions, we elide the labels that the front end inserted. [C++ coroutines 5] Standard library header. This provides the interfaces mandated by the standard and implements the interaction with the coroutine frame by means of inline use of builtins expanded at compile-time. There should be a 1:1 correspondence with the standard sections which are cross-referenced. There is no runtime content. At this stage, we have the content in an inline namespace "__n4835" for the CD we worked to. [C++ coroutines 6] Testsuite. There are two categories of test: 1. Checks for correctly formed source code and the error reporting. 2. Checks for transformation and code-gen. The second set are run as 'torture' tests for the standard options set, including LTO. These are also intentionally run with no options provided (from the coroutines.exp script). gcc/ChangeLog: 2020-01-18 Iain Sandoe <iain@sandoe.co.uk> * Makefile.in: Add coroutine-passes.o. * builtin-types.def (BT_CONST_SIZE): New. (BT_FN_BOOL_PTR): New. (BT_FN_PTR_PTR_CONST_SIZE_BOOL): New. * builtins.def (DEF_COROUTINE_BUILTIN): New. * coroutine-builtins.def: New file. * coroutine-passes.cc: New file. * function.h (struct GTY function): Add a bit to indicate that the function is a coroutine component. * internal-fn.c (expand_CO_FRAME): New. (expand_CO_YIELD): New. (expand_CO_SUSPN): New. (expand_CO_ACTOR): New. * internal-fn.def (CO_ACTOR): New. (CO_YIELD): New. (CO_SUSPN): New. (CO_FRAME): New. * passes.def: Add pass_coroutine_lower_builtins, pass_coroutine_early_expand_ifns. * tree-pass.h (make_pass_coroutine_lower_builtins): New. (make_pass_coroutine_early_expand_ifns): New. * doc/invoke.texi: Document the fcoroutines command line switch. gcc/c-family/ChangeLog: 2020-01-18 Iain Sandoe <iain@sandoe.co.uk> * c-common.c (co_await, co_yield, co_return): New. * c-common.h (RID_CO_AWAIT, RID_CO_YIELD, RID_CO_RETURN): New enumeration values. (D_CXX_COROUTINES): Bit to identify coroutines are active. (D_CXX_COROUTINES_FLAGS): Guard for coroutine keywords. * c-cppbuiltin.c (__cpp_coroutines): New cpp define. * c.opt (fcoroutines): New command-line switch. gcc/cp/ChangeLog: 2020-01-18 Iain Sandoe <iain@sandoe.co.uk> * Make-lang.in: Add coroutines.o. * cp-tree.h (lang_decl-fn): coroutine_p, new bit. (DECL_COROUTINE_P): New. * lex.c (init_reswords): Enable keywords when the coroutine flag is set, * operators.def (co_await): New operator. * call.c (add_builtin_candidates): Handle CO_AWAIT_EXPR. (op_error): Likewise. (build_new_op_1): Likewise. (build_new_function_call): Validate coroutine builtin arguments. * constexpr.c (potential_constant_expression_1): Handle CO_AWAIT_EXPR, CO_YIELD_EXPR, CO_RETURN_EXPR. * coroutines.cc: New file. * cp-objcp-common.c (cp_common_init_ts): Add CO_AWAIT_EXPR, CO_YIELD_EXPR, CO_RETRN_EXPR as TS expressions. * cp-tree.def (CO_AWAIT_EXPR, CO_YIELD_EXPR, (CO_RETURN_EXPR): New. * cp-tree.h (coro_validate_builtin_call): New. * decl.c (emit_coro_helper): New. (finish_function): Handle the case when a function is found to be a coroutine, perform the outlining and emit the outlined functions. Set a bit to signal that this is a coroutine component. * parser.c (enum required_token): New enumeration RT_CO_YIELD. (cp_parser_unary_expression): Handle co_await. (cp_parser_assignment_expression): Handle co_yield. (cp_parser_statement): Handle RID_CO_RETURN. (cp_parser_jump_statement): Handle co_return. (cp_parser_operator): Handle co_await operator. (cp_parser_yield_expression): New. (cp_parser_required_error): Handle RT_CO_YIELD. * pt.c (tsubst_copy): Handle CO_AWAIT_EXPR. (tsubst_expr): Handle CO_AWAIT_EXPR, CO_YIELD_EXPR and CO_RETURN_EXPRs. * tree.c (cp_walk_subtrees): Likewise. libstdc++-v3/ChangeLog: 2020-01-18 Iain Sandoe <iain@sandoe.co.uk> * include/Makefile.am: Add coroutine to the std set. * include/Makefile.in: Regenerated. * include/std/coroutine: New file. gcc/testsuite/ChangeLog: 2020-01-18 Iain Sandoe <iain@sandoe.co.uk> * g++.dg/coroutines/co-await-syntax-00-needs-expr.C: New test. * g++.dg/coroutines/co-await-syntax-01-outside-fn.C: New test. * g++.dg/coroutines/co-await-syntax-02-outside-fn.C: New test. * g++.dg/coroutines/co-await-syntax-03-auto.C: New test. * g++.dg/coroutines/co-await-syntax-04-ctor-dtor.C: New test. * g++.dg/coroutines/co-await-syntax-05-constexpr.C: New test. * g++.dg/coroutines/co-await-syntax-06-main.C: New test. * g++.dg/coroutines/co-await-syntax-07-varargs.C: New test. * g++.dg/coroutines/co-await-syntax-08-lambda-auto.C: New test. * g++.dg/coroutines/co-return-syntax-01-outside-fn.C: New test. * g++.dg/coroutines/co-return-syntax-02-outside-fn.C: New test. * g++.dg/coroutines/co-return-syntax-03-auto.C: New test. * g++.dg/coroutines/co-return-syntax-04-ctor-dtor.C: New test. * g++.dg/coroutines/co-return-syntax-05-constexpr-fn.C: New test. * g++.dg/coroutines/co-return-syntax-06-main.C: New test. * g++.dg/coroutines/co-return-syntax-07-vararg.C: New test. * g++.dg/coroutines/co-return-syntax-08-bad-return.C: New test. * g++.dg/coroutines/co-return-syntax-09-lambda-auto.C: New test. * g++.dg/coroutines/co-yield-syntax-00-needs-expr.C: New test. * g++.dg/coroutines/co-yield-syntax-01-outside-fn.C: New test. * g++.dg/coroutines/co-yield-syntax-02-outside-fn.C: New test. * g++.dg/coroutines/co-yield-syntax-03-auto.C: New test. * g++.dg/coroutines/co-yield-syntax-04-ctor-dtor.C: New test. * g++.dg/coroutines/co-yield-syntax-05-constexpr.C: New test. * g++.dg/coroutines/co-yield-syntax-06-main.C: New test. * g++.dg/coroutines/co-yield-syntax-07-varargs.C: New test. * g++.dg/coroutines/co-yield-syntax-08-needs-expr.C: New test. * g++.dg/coroutines/co-yield-syntax-09-lambda-auto.C: New test. * g++.dg/coroutines/coro-builtins.C: New test. * g++.dg/coroutines/coro-missing-gro.C: New test. * g++.dg/coroutines/coro-missing-promise-yield.C: New test. * g++.dg/coroutines/coro-missing-ret-value.C: New test. * g++.dg/coroutines/coro-missing-ret-void.C: New test. * g++.dg/coroutines/coro-missing-ueh-1.C: New test. * g++.dg/coroutines/coro-missing-ueh-2.C: New test. * g++.dg/coroutines/coro-missing-ueh-3.C: New test. * g++.dg/coroutines/coro-missing-ueh.h: New test. * g++.dg/coroutines/coro-pre-proc.C: New test. * g++.dg/coroutines/coro.h: New file. * g++.dg/coroutines/coro1-ret-int-yield-int.h: New file. * g++.dg/coroutines/coroutines.exp: New file. * g++.dg/coroutines/torture/alloc-00-gro-on-alloc-fail.C: New test. * g++.dg/coroutines/torture/alloc-01-overload-newdel.C: New test. * g++.dg/coroutines/torture/call-00-co-aw-arg.C: New test. * g++.dg/coroutines/torture/call-01-multiple-co-aw.C: New test. * g++.dg/coroutines/torture/call-02-temp-co-aw.C: New test. * g++.dg/coroutines/torture/call-03-temp-ref-co-aw.C: New test. * g++.dg/coroutines/torture/class-00-co-ret.C: New test. * g++.dg/coroutines/torture/class-01-co-ret-parm.C: New test. * g++.dg/coroutines/torture/class-02-templ-parm.C: New test. * g++.dg/coroutines/torture/class-03-operator-templ-parm.C: New test. * g++.dg/coroutines/torture/class-04-lambda-1.C: New test. * g++.dg/coroutines/torture/class-05-lambda-capture-copy-local.C: New test. * g++.dg/coroutines/torture/class-06-lambda-capture-ref.C: New test. * g++.dg/coroutines/torture/co-await-00-trivial.C: New test. * g++.dg/coroutines/torture/co-await-01-with-value.C: New test. * g++.dg/coroutines/torture/co-await-02-xform.C: New test. * g++.dg/coroutines/torture/co-await-03-rhs-op.C: New test. * g++.dg/coroutines/torture/co-await-04-control-flow.C: New test. * g++.dg/coroutines/torture/co-await-05-loop.C: New test. * g++.dg/coroutines/torture/co-await-06-ovl.C: New test. * g++.dg/coroutines/torture/co-await-07-tmpl.C: New test. * g++.dg/coroutines/torture/co-await-08-cascade.C: New test. * g++.dg/coroutines/torture/co-await-09-pair.C: New test. * g++.dg/coroutines/torture/co-await-10-template-fn-arg.C: New test. * g++.dg/coroutines/torture/co-await-11-forwarding.C: New test. * g++.dg/coroutines/torture/co-await-12-operator-2.C: New test. * g++.dg/coroutines/torture/co-await-13-return-ref.C: New test. * g++.dg/coroutines/torture/co-ret-00-void-return-is-ready.C: New test. * g++.dg/coroutines/torture/co-ret-01-void-return-is-suspend.C: New test. * g++.dg/coroutines/torture/co-ret-03-different-GRO-type.C: New test. * g++.dg/coroutines/torture/co-ret-04-GRO-nontriv.C: New test. * g++.dg/coroutines/torture/co-ret-05-return-value.C: New test. * g++.dg/coroutines/torture/co-ret-06-template-promise-val-1.C: New test. * g++.dg/coroutines/torture/co-ret-07-void-cast-expr.C: New test. * g++.dg/coroutines/torture/co-ret-08-template-cast-ret.C: New test. * g++.dg/coroutines/torture/co-ret-09-bool-await-susp.C: New test. * g++.dg/coroutines/torture/co-ret-10-expression-evaluates-once.C: New test. * g++.dg/coroutines/torture/co-ret-11-co-ret-co-await.C: New test. * g++.dg/coroutines/torture/co-ret-12-co-ret-fun-co-await.C: New test. * g++.dg/coroutines/torture/co-ret-13-template-2.C: New test. * g++.dg/coroutines/torture/co-ret-14-template-3.C: New test. * g++.dg/coroutines/torture/co-yield-00-triv.C: New test. * g++.dg/coroutines/torture/co-yield-01-multi.C: New test. * g++.dg/coroutines/torture/co-yield-02-loop.C: New test. * g++.dg/coroutines/torture/co-yield-03-tmpl.C: New test. * g++.dg/coroutines/torture/co-yield-04-complex-local-state.C: New test. * g++.dg/coroutines/torture/co-yield-05-co-aw.C: New test. * g++.dg/coroutines/torture/co-yield-06-fun-parm.C: New test. * g++.dg/coroutines/torture/co-yield-07-template-fn-param.C: New test. * g++.dg/coroutines/torture/co-yield-08-more-refs.C: New test. * g++.dg/coroutines/torture/co-yield-09-more-templ-refs.C: New test. * g++.dg/coroutines/torture/coro-torture.exp: New file. * g++.dg/coroutines/torture/exceptions-test-0.C: New test. * g++.dg/coroutines/torture/func-params-00.C: New test. * g++.dg/coroutines/torture/func-params-01.C: New test. * g++.dg/coroutines/torture/func-params-02.C: New test. * g++.dg/coroutines/torture/func-params-03.C: New test. * g++.dg/coroutines/torture/func-params-04.C: New test. * g++.dg/coroutines/torture/func-params-05.C: New test. * g++.dg/coroutines/torture/func-params-06.C: New test. * g++.dg/coroutines/torture/lambda-00-co-ret.C: New test. * g++.dg/coroutines/torture/lambda-01-co-ret-parm.C: New test. * g++.dg/coroutines/torture/lambda-02-co-yield-values.C: New test. * g++.dg/coroutines/torture/lambda-03-auto-parm-1.C: New test. * g++.dg/coroutines/torture/lambda-04-templ-parm.C: New test. * g++.dg/coroutines/torture/lambda-05-capture-copy-local.C: New test. * g++.dg/coroutines/torture/lambda-06-multi-capture.C: New test. * g++.dg/coroutines/torture/lambda-07-multi-yield.C: New test. * g++.dg/coroutines/torture/lambda-08-co-ret-parm-ref.C: New test. * g++.dg/coroutines/torture/local-var-0.C: New test. * g++.dg/coroutines/torture/local-var-1.C: New test. * g++.dg/coroutines/torture/local-var-2.C: New test. * g++.dg/coroutines/torture/local-var-3.C: New test. * g++.dg/coroutines/torture/local-var-4.C: New test. * g++.dg/coroutines/torture/mid-suspend-destruction-0.C: New test. * g++.dg/coroutines/torture/pr92933.C: New test.
2020-01-18 12:54:46 +01:00
NEXT_PASS (pass_coroutine_early_expand_ifns);
NEXT_PASS (pass_expand_omp);
NEXT_PASS (pass_build_cgraph_edges);
TERMINATE_PASS_LIST (all_lowering_passes)
/* Interprocedural optimization passes. */
INSERT_PASSES_AFTER (all_small_ipa_passes)
NEXT_PASS (pass_ipa_free_lang_data);
NEXT_PASS (pass_ipa_function_and_variable_visibility);
ipa-chkp.c: New. gcc/ 2014-11-05 Ilya Enkovich <ilya.enkovich@intel.com> * ipa-chkp.c: New. * ipa-chkp.h: New. * tree-chkp.c: New. * tree-chkp.h: New. * tree-chkp-opt.c: New. * rtl-chkp.c: New. * rtl-chkp.h: New. * Makefile.in (OBJS): Add ipa-chkp.o, rtl-chkp.o, tree-chkp.o tree-chkp-opt.o. (GTFILES): Add tree-chkp.c. * mode-classes.def (MODE_POINTER_BOUNDS): New. * tree.def (POINTER_BOUNDS_TYPE): New. * genmodes.c (complete_mode): Support MODE_POINTER_BOUNDS. (POINTER_BOUNDS_MODE): New. (make_pointer_bounds_mode): New. * machmode.h (POINTER_BOUNDS_MODE_P): New. * stor-layout.c (int_mode_for_mode): Support MODE_POINTER_BOUNDS. (layout_type): Support POINTER_BOUNDS_TYPE. * tree-pretty-print.c (dump_generic_node): Support POINTER_BOUNDS_TYPE. * tree-core.h (tree_index): Add TI_POINTER_BOUNDS_TYPE. * tree.c (build_int_cst_wide): Support POINTER_BOUNDS_TYPE. (type_contains_placeholder_1): Likewise. (build_common_tree_nodes): Initialize pointer_bounds_type_node. * tree.h (POINTER_BOUNDS_TYPE_P): New. (pointer_bounds_type_node): New. (POINTER_BOUNDS_P): New. (BOUNDED_TYPE_P): New. (BOUNDED_P): New. (CALL_WITH_BOUNDS_P): New. * gimple.h (gf_mask): Add GF_CALL_WITH_BOUNDS. (gimple_call_with_bounds_p): New. (gimple_call_set_with_bounds): New. (gimple_return_retbnd): New. (gimple_return_set_retbnd): New * gimple.c (gimple_build_return): Increase number of ops for return statement. (gimple_build_call_from_tree): Propagate CALL_WITH_BOUNDS_P flag. * gimple-pretty-print.c (dump_gimple_return): Print second op. * rtl.h (CALL_EXPR_WITH_BOUNDS_P): New. * gimplify.c (gimplify_init_constructor): Avoid infinite loop during gimplification of bounds initializer. * calls.c: Include tree-chkp.h, rtl-chkp.h, bitmap.h. (special_function_p): Use original decl name when analyzing instrumentation clone. (arg_data): Add fields special_slot, pointer_arg and pointer_offset. (store_bounds): New. (emit_call_1): Propagate instrumentation flag for CALL. (initialize_argument_information): Compute pointer_arg, pointer_offset and special_slot for pointer bounds arguments. (finalize_must_preallocate): Preallocate when storing bounds in bounds table. (compute_argument_addresses): Skip pointer bounds. (expand_call): Store bounds into tables separately. Return result joined with resulting bounds. * cfgexpand.c: Include tree-chkp.h, rtl-chkp.h. (expand_call_stmt): Propagate bounds flag for CALL_EXPR. (expand_return): Add returned bounds arg. Handle returned bounds. (expand_gimple_stmt_1): Adjust to new expand_return signature. (gimple_expand_cfg): Reset rtx bounds map. * expr.c: Include tree-chkp.h, rtl-chkp.h. (expand_assignment): Handle returned bounds. (store_expr_with_bounds): New. Replaces store_expr with new bounds target argument. Handle bounds returned by calls. (store_expr): Now wraps store_expr_with_bounds. * expr.h (store_expr_with_bounds): New. * function.c: Include tree-chkp.h, rtl-chkp.h. (bounds_parm_data): New. (use_register_for_decl): Do not registerize decls used for bounds stores and loads. (assign_parms_augmented_arg_list): Add bounds of the result structure pointer as the second argument. (assign_parm_find_entry_rtl): Mark bounds are never passed on the stack. (assign_parm_is_stack_parm): Likewise. (assign_parm_load_bounds): New. (assign_bounds): New. (assign_parms): Load bounds and determine a location for returned bounds. (diddle_return_value_1): New. (diddle_return_value): Handle returned bounds. * function.h (rtl_data): Add field for returned bounds. * varasm.c: Include tree-chkp.h. (output_constant): Support POINTER_BOUNDS_TYPE. (output_constant_pool_2): Support MODE_POINTER_BOUNDS. (ultimate_transparent_alias_target): Move up. (make_decl_rtl): For instrumented function use name of the original decl. (assemble_start_function): Mark function as global in case it is instrumentation clone of the global function. (do_assemble_alias): Follow transparent alias chain for identifier. Check if original alias is public. (maybe_assemble_visibility): Use visibility of the original function for instrumented version. (default_unique_section): Likewise. * emit-rtl.c (immed_double_const): Support MODE_POINTER_BOUNDS. (init_emit_once): Build pointer bounds zero constants. * explow.c (trunc_int_for_mode): Support MODE_POINTER_BOUNDS. * target.def (builtin_chkp_function): New. (chkp_bound_type): New. (chkp_bound_mode): New. (chkp_make_bounds_constant): New. (chkp_initialize_bounds): New. (load_bounds_for_arg): New. (store_bounds_for_arg): New. (load_returned_bounds): New. (store_returned_bounds): New. (chkp_function_value_bounds): New. (setup_incoming_vararg_bounds): New. (function_arg): Update hook description with new possible return value CONST_INT. * targhooks.h (default_load_bounds_for_arg): New. (default_store_bounds_for_arg): New. (default_load_returned_bounds): New. (default_store_returned_bounds): New. (default_chkp_bound_type): New. (default_chkp_bound_mode): New. (default_builtin_chkp_function): New. (default_chkp_function_value_bounds): New. (default_chkp_make_bounds_constant): New. (default_chkp_initialize_bounds): New. (default_setup_incoming_vararg_bounds): New. * targhooks.c (default_load_bounds_for_arg): New. (default_store_bounds_for_arg): New. (default_load_returned_bounds): New. (default_store_returned_bounds): New. (default_chkp_bound_type): New. (default_chkp_bound_mode); New. (default_builtin_chkp_function): New. (default_chkp_function_value_bounds): New. (default_chkp_make_bounds_constant): New. (default_chkp_initialize_bounds): New. (default_setup_incoming_vararg_bounds): New. * builtin-types.def (BT_BND): New. (BT_FN_PTR_CONST_PTR): New. (BT_FN_CONST_PTR_CONST_PTR): New. (BT_FN_BND_CONST_PTR): New. (BT_FN_CONST_PTR_BND): New. (BT_FN_PTR_CONST_PTR_SIZE): New. (BT_FN_PTR_CONST_PTR_CONST_PTR): New. (BT_FN_VOID_PTRPTR_CONST_PTR): New. (BT_FN_VOID_CONST_PTR_SIZE): New. (BT_FN_VOID_PTR_BND): New. (BT_FN_CONST_PTR_CONST_PTR_CONST_PTR): New. (BT_FN_BND_CONST_PTR_SIZE): New. (BT_FN_PTR_CONST_PTR_CONST_PTR_SIZE): New. (BT_FN_VOID_CONST_PTR_BND_CONST_PTR): New. * chkp-builtins.def: New. * builtins.def: include chkp-builtins.def. (DEF_CHKP_BUILTIN): New. * builtins.c: Include tree-chkp.h and rtl-chkp.h. (expand_builtin): Support BUILT_IN_CHKP_INIT_PTR_BOUNDS, BUILT_IN_CHKP_NULL_PTR_BOUNDS, BUILT_IN_CHKP_COPY_PTR_BOUNDS, BUILT_IN_CHKP_CHECK_PTR_LBOUNDS, BUILT_IN_CHKP_CHECK_PTR_UBOUNDS, BUILT_IN_CHKP_CHECK_PTR_BOUNDS, BUILT_IN_CHKP_SET_PTR_BOUNDS, BUILT_IN_CHKP_NARROW_PTR_BOUNDS, BUILT_IN_CHKP_STORE_PTR_BOUNDS, BUILT_IN_CHKP_GET_PTR_LBOUND, BUILT_IN_CHKP_GET_PTR_UBOUND, BUILT_IN_CHKP_BNDMK, BUILT_IN_CHKP_BNDSTX, BUILT_IN_CHKP_BNDCL, BUILT_IN_CHKP_BNDCU, BUILT_IN_CHKP_BNDLDX, BUILT_IN_CHKP_BNDRET, BUILT_IN_CHKP_INTERSECT, BUILT_IN_CHKP_NARROW, BUILT_IN_CHKP_EXTRACT_LOWER, BUILT_IN_CHKP_EXTRACT_UPPER. (std_expand_builtin_va_start): Init bounds for va_list. * cppbuiltin.c (define_builtin_macros_for_compilation_flags): Add __CHKP__ macro when Pointer Bounds Checker is on. * params.def (PARAM_CHKP_MAX_CTOR_SIZE): New. * passes.def (pass_ipa_chkp_versioning): New. (pass_early_local_passes): Renamed to pass_build_ssa_passes. (pass_fixup_cfg): Moved to pass_chkp_instrumentation_passes. (pass_chkp_instrumentation_passes): New. (pass_ipa_chkp_produce_thunks): New. (pass_local_optimization_passes): New. (pass_chkp_opt): New. * tree-pass.h (make_pass_ipa_chkp_versioning): New. (make_pass_ipa_chkp_produce_thunks): New. (make_pass_chkp): New. (make_pass_chkp_opt): New. (make_pass_early_local_passes): Renamed to ... (make_pass_build_ssa_passes): This. (make_pass_chkp_instrumentation_passes): New. (make_pass_local_optimization_passes): New. * passes.c (pass_manager::execute_early_local_passes): Execute early passes in three steps. (execute_all_early_local_passes): Renamed to ... (execute_build_ssa_passes): This. (pass_data_early_local_passes): Renamed to ... (pass_data_build_ssa_passes): This. (pass_early_local_passes): Renamed to ... (pass_build_ssa_passes): This. (pass_data_chkp_instrumentation_passes): New. (pass_chkp_instrumentation_passes): New. (pass_data_local_optimization_passes): New. (pass_local_optimization_passes): New. (make_pass_early_local_passes): Renamed to ... (make_pass_build_ssa_passes): This. (make_pass_chkp_instrumentation_passes): New. (make_pass_local_optimization_passes): New. * c-family/c.opt (fcheck-pointer-bounds): New. (fchkp-check-incomplete-type): New. (fchkp-zero-input-bounds-for-main): New. (fchkp-first-field-has-own-bounds): New. (fchkp-narrow-bounds): New. (fchkp-narrow-to-innermost-array): New. (fchkp-optimize): New. (fchkp-use-fast-string-functions): New. (fchkp-use-nochk-string-functions): New. (fchkp-use-static-bounds): New. (fchkp-use-static-const-bounds): New. (fchkp-treat-zero-dynamic-size-as-infinite): New. (fchkp-check-read): New. (fchkp-check-write): New. (fchkp-store-bounds): New. (fchkp-instrument-calls): New. (fchkp-instrument-marked-only): New. (Wchkp): New. * c-family/c-common.c (handle_bnd_variable_size_attribute): New. (handle_bnd_legacy): New. (handle_bnd_instrument): New. (c_common_attribute_table): Add bnd_variable_size, bnd_legacy and bnd_instrument. Fix documentation. (c_common_format_attribute_table): Likewsie. * toplev.c: include tree-chkp.h. (process_options): Check Pointer Bounds Checker is supported. (compile_file): Add chkp_finish_file call. * ipa-cp.c (initialize_node_lattices): Use cgraph_local_p to handle instrumentation clones properly. (propagate_constants_accross_call): Do not propagate through instrumentation thunks. * ipa-pure-const.c (propagate_pure_const): Support IPA_REF_CHKP. * ipa-inline.c (early_inliner): Check edge has summary allocated. * ipa-split.c: Include tree-chkp.h. (find_retbnd): New. (split_part_set_ssa_name_p): New. (consider_split): Do not split retbnd and retval producers. (insert_bndret_call_after): new. (split_function): Propagate Pointer Bounds Checker instrumentation marks and handle returned bounds. * tree-ssa-sccvn.h (vn_reference_op_struct): Transform opcode into bit field and add with_bounds field. * tree-ssa-sccvn.c (copy_reference_ops_from_call): Set with_bounds field for instrumented calls. * tree-ssa-pre.c (create_component_ref_by_pieces_1): Restore CALL_WITH_BOUNDS_P flag for calls. * tree-ssa-ccp.c: Include tree-chkp.h. (insert_clobber_before_stack_restore): Handle BUILT_IN_CHKP_BNDRET calls. * tree-ssa-dce.c: Include tree-chkp.h. (propagate_necessity): For free call fed by alloc check bounds are also provided by the same alloc. (eliminate_unnecessary_stmts): Handle BUILT_IN_CHKP_BNDRET used by free calls. * tree-inline.c: Include tree-chkp.h. (declare_return_variable): Add arg holding returned bounds slot. Create and initialize returned bounds var. (remap_gimple_stmt): Handle returned bounds. Return sequence of statements instead of a single statement. (insert_init_stmt): Add declaration. (remap_gimple_seq): Adjust to new remap_gimple_stmt signature. (copy_bb): Adjust to changed return type of remap_gimple_stmt. Properly handle bounds in va_arg_pack and va_arg_pack_len. (expand_call_inline): Handle returned bounds. Add bounds copy for generated mem to mem assignments. * tree-inline.h (copy_body_data): Add fields retbnd and assign_stmts. * value-prof.c: Include tree-chkp.h. (gimple_ic): Support returned bounds. * ipa.c (cgraph_build_static_cdtor_1): Support contructors with "chkp ctor" and "bnd_legacy" attributes. (symtab_remove_unreachable_nodes): Keep initial values for pointer bounds to be used for checks eliminations. (process_references): Handle IPA_REF_CHKP. (walk_polymorphic_call_targets): Likewise. * ipa-visibility.c (cgraph_externally_visible_p): Mark instrumented 'main' as externally visible. (function_and_variable_visibility): Filter instrumentation thunks. * cgraph.h (cgraph_thunk_info): Add add_pointer_bounds_args field. (cgraph_node): Add instrumented_version, orig_decl and instrumentation_clone fields. (symtab_node::get_alias_target): Allow IPA_REF_CHKP reference. (varpool_node): Add need_bounds_init field. (cgraph_local_p): New. * cgraph.c: Include tree-chkp.h. (cgraph_node::remove): Fix instrumented_version of the referenced node if any. (cgraph_node::dump): Dump instrumentation_clone and instrumented_version fields. (cgraph_node::verify_node): Check correctness of IPA_REF_CHKP references and instrumentation thunks. (cgraph_can_remove_if_no_direct_calls_and_refs_p): Keep all not instrumented instrumentation clones alive. (cgraph_redirect_edge_call_stmt_to_callee): Support returned bounds. * cgraphbuild.c (rebuild_cgraph_edges): Rebuild IPA_REF_CHKP reference. (cgraph_rebuild_references): Likewise. * cgraphunit.c: Include tree-chkp.h. (assemble_thunks_and_aliases): Skip thunks calling instrumneted function version. (varpool_finalize_decl): Register statically initialized decls in Pointer Bounds Checker. (walk_polymorphic_call_targets): Do not mark generated call to __builtin_unreachable as with_bounds. (output_weakrefs): If there are both instrumented and original versions, output only one of them. (cgraph_node::expand_thunk): Set with_bounds flag for created call statement. * ipa-ref.h (ipa_ref_use): Add IPA_REF_CHKP. (ipa_ref): increase size of use field. * symtab.c (ipa_ref_use_name): Add element for IPA_REF_CHKP. * varpool.c (dump_varpool_node): Dump need_bounds_init field. (ctor_for_folding): Do not fold constant bounds vars. * lto-streamer.h (LTO_minor_version): Change minor version from 0 to 1. * lto-cgraph.c (compute_ltrans_boundary): Keep initial values for pointer bounds. (lto_output_node): Output instrumentation_clone, thunk.add_pointer_bounds_args and orig_decl field. (lto_output_ref): Adjust to new ipa_ref::use field size. (input_overwrite_node): Read instrumentation_clone field. (input_node): Read thunk.add_pointer_bounds_args and orig_decl fields. (input_ref): Adjust to new ipa_ref::use field size. (input_cgraph_1): Compute instrumented_version fields and restore IDENTIFIER_TRANSPARENT_ALIAS chains. (lto_output_varpool_node): Output need_bounds_init value. (input_varpool_node): Read need_bounds_init value. * lto-partition.c (add_symbol_to_partition_1): Keep original and instrumented versions together. (privatize_symbol_name): Restore transparent alias chain if required. (add_references_to_partition): Add references to pointer bounds vars. * dbxout.c (dbxout_type): Ignore POINTER_BOUNDS_TYPE. * dwarf2out.c (gen_subprogram_die): Ignore bound args. (gen_type_die_with_usage): Skip pointer bounds. (dwarf2out_global_decl): Likewise. (is_base_type): Support POINTER_BOUNDS_TYPE. (gen_formal_types_die): Skip pointer bounds. (gen_decl_die): Likewise. * var-tracking.c (vt_add_function_parameters): Skip bounds parameters. * ipa-icf.c (sem_function::merge): Do not merge when instrumentation thunk still exists. (sem_variable::merge): Reset need_bounds_init flag. * doc/extend.texi: Document Pointer Bounds Checker built-in functions and attributes. * doc/tm.texi.in (TARGET_LOAD_BOUNDS_FOR_ARG): New. (TARGET_STORE_BOUNDS_FOR_ARG): New. (TARGET_LOAD_RETURNED_BOUNDS): New. (TARGET_STORE_RETURNED_BOUNDS): New. (TARGET_CHKP_FUNCTION_VALUE_BOUNDS): New. (TARGET_SETUP_INCOMING_VARARG_BOUNDS): New. (TARGET_BUILTIN_CHKP_FUNCTION): New. (TARGET_CHKP_BOUND_TYPE): New. (TARGET_CHKP_BOUND_MODE): New. (TARGET_CHKP_MAKE_BOUNDS_CONSTANT): New. (TARGET_CHKP_INITIALIZE_BOUNDS): New. * doc/tm.texi: Regenerated. * doc/rtl.texi (MODE_POINTER_BOUNDS): New. (BND32mode): New. (BND64mode): New. * doc/invoke.texi (-mmpx): New. (-mno-mpx): New. (chkp-max-ctor-size): New. * config/i386/constraints.md (w): New. (Ti): New. (Tb): New. * config/i386/i386-c.c (ix86_target_macros_internal): Add __MPX__. * config/i386/i386-modes.def (BND32): New. (BND64): New. * config/i386/i386-protos.h (ix86_bnd_prefixed_insn_p): New. * config/i386/i386.c: Include tree-chkp.h, rtl-chkp.h, tree-iterator.h. (regclass_map): Add bound registers. (dbx_register_map): Likewise. (dbx64_register_map): Likewise. (svr4_dbx_register_map): Likewise. (isa_opts): Add -mmpx. (PTA_MPX): New. (ix86_option_override_internal): Support MPX ISA. (ix86_conditional_register_usage): Support bound registers. (ix86_code_end): Add MPX bnd prefix. (output_set_got): Likewise. (print_reg): Avoid prefixes for bound registers. (ix86_print_operand): Add '!' (MPX bnd) print prefix support. (ix86_print_operand_punct_valid_p): Likewise. (ix86_print_operand_address): Support UNSPEC_BNDMK_ADDR and UNSPEC_BNDLDX_ADDR. (ix86_output_call_insn): Add MPX bnd prefix to branch instructions. (ix86_class_likely_spilled_p): Add bound regs support. (ix86_hard_regno_mode_ok): Likewise. (x86_order_regs_for_local_alloc): Likewise. (ix86_bnd_prefixed_insn_p): New. (ix86_builtins): Add IX86_BUILTIN_BNDMK, IX86_BUILTIN_BNDSTX, IX86_BUILTIN_BNDLDX, IX86_BUILTIN_BNDCL, IX86_BUILTIN_BNDCU, IX86_BUILTIN_BNDRET, IX86_BUILTIN_BNDNARROW, IX86_BUILTIN_BNDINT, IX86_BUILTIN_SIZEOF, IX86_BUILTIN_BNDLOWER, IX86_BUILTIN_BNDUPPER. (builtin_isa): Add leaf_p and nothrow_p fields. (def_builtin): Initialize leaf_p and nothrow_p. (ix86_add_new_builtins): Handle leaf_p and nothrow_p flags. (bdesc_mpx): New. (bdesc_mpx_const): New. (ix86_init_mpx_builtins): New. (ix86_init_builtins): Call ix86_init_mpx_builtins. (ix86_emit_cmove): New. (ix86_emit_move_max): New. (ix86_expand_builtin): Expand IX86_BUILTIN_BNDMK, IX86_BUILTIN_BNDSTX, IX86_BUILTIN_BNDLDX, IX86_BUILTIN_BNDCL, IX86_BUILTIN_BNDCU, IX86_BUILTIN_BNDRET, IX86_BUILTIN_BNDNARROW, IX86_BUILTIN_BNDINT, IX86_BUILTIN_SIZEOF, IX86_BUILTIN_BNDLOWER, IX86_BUILTIN_BNDUPPER. (ix86_function_value_bounds): New. (ix86_builtin_mpx_function): New. (ix86_get_arg_address_for_bt): New. (ix86_load_bounds): New. (ix86_store_bounds): New. (ix86_load_returned_bounds): New. (ix86_store_returned_bounds): New. (ix86_mpx_bound_mode): New. (ix86_make_bounds_constant): New. (ix86_initialize_bounds): (TARGET_LOAD_BOUNDS_FOR_ARG): New. (TARGET_STORE_BOUNDS_FOR_ARG): New. (TARGET_LOAD_RETURNED_BOUNDS): New. (TARGET_STORE_RETURNED_BOUNDS): New. (TARGET_CHKP_BOUND_MODE): New. (TARGET_BUILTIN_CHKP_FUNCTION): New. (TARGET_CHKP_FUNCTION_VALUE_BOUNDS): New. (TARGET_CHKP_MAKE_BOUNDS_CONSTANT): New. (TARGET_CHKP_INITIALIZE_BOUNDS): New. (ix86_option_override_internal): Do not support x32 with MPX. (init_cumulative_args): Init stdarg, bnd_regno, bnds_in_bt and force_bnd_pass. (function_arg_advance_32): Return number of used integer registers. (function_arg_advance_64): Likewise. (function_arg_advance_ms_64): Likewise. (ix86_function_arg_advance): Handle pointer bounds. (ix86_function_arg): Likewise. (ix86_function_value_regno_p): Mark fisrt bounds registers as possible function value. (ix86_function_value_1): Handle pointer bounds type/mode (ix86_return_in_memory): Likewise. (ix86_print_operand): Analyse insn to decide abounf "bnd" prefix. (ix86_expand_call): Generate returned bounds. (ix86_setup_incoming_vararg_bounds): New. (ix86_va_start): Initialize bounds for pointers in va_list. (TARGET_SETUP_INCOMING_VARARG_BOUNDS): New. * config/i386/i386.h (TARGET_MPX): New. (TARGET_MPX_P): New. (FIRST_PSEUDO_REGISTER): Fix to new value. (FIXED_REGISTERS): Add bound registers. (CALL_USED_REGISTERS): Likewise. (REG_ALLOC_ORDER): Likewise. (HARD_REGNO_NREGS): Likewise. (VALID_BND_REG_MODE): New. (FIRST_BND_REG): New. (LAST_BND_REG): New. (reg_class): Add BND_REGS. (REG_CLASS_NAMES): Likewise. (REG_CLASS_CONTENTS): Likewise. (BND_REGNO_P): New. (ANY_BND_REG_P): New. (BNDmode): New. (HI_REGISTER_NAMES): Add bound registers. (ix86_args): Add bnd_regno, bnds_in_bt, force_bnd_pass and stdarg fields. * config/i386/i386.md (UNSPEC_BNDMK): New. (UNSPEC_BNDMK_ADDR): New. (UNSPEC_BNDSTX): New. (UNSPEC_BNDLDX): New. (UNSPEC_BNDLDX_ADDR): New. (UNSPEC_BNDCL): New. (UNSPEC_BNDCU): New. (UNSPEC_BNDCN): New. (UNSPEC_MPX_FENCE): New. (UNSPEC_SIZEOF): New. (BND0_REG): New. (BND1_REG): New. (type): Add mpxmov, mpxmk, mpxchk, mpxld, mpxst. (length_immediate): Support mpxmov, mpxmk, mpxchk, mpxld, mpxst. (prefix_rep): Check for bnd prefix. (prefix_0f): Support mpxmov, mpxmk, mpxchk, mpxld, mpxst. (length_nobnd): New. (length): Use length_nobnd when specified. (memory): Support mpxmov, mpxmk, mpxchk, mpxld, mpxst. (BND): New. (bnd_ptr): New. (BNDCHECK): New. (bndcheck): New. (*jcc_1): Add MPX bnd prefix. (*jcc_2): Likewise. (jump): Likewise. (*indirect_jump): Likewise. (*tablejump_1): Likewise. (simple_return_internal): Likewise. (simple_return_internal_long): Likewise. (simple_return_pop_internal): Likewise. (simple_return_indirect_internal): Likewise. (<mode>_mk): New. (*<mode>_mk): New. (mov<mode>): New. (*mov<mode>_internal_mpx): New. (<mode>_<bndcheck>): New. (*<mode>_<bndcheck>): New. (<mode>_ldx): New. (*<mode>_ldx): New. (<mode>_stx): New. (*<mode>_stx): New. move_size_reloc_<mode>): New. * config/i386/predicates.md (address_mpx_no_base_operand): New. (address_mpx_no_index_operand): New. (bnd_mem_operator): New. (symbol_operand): New. (x86_64_immediate_size_operand): New. * config/i386/i386.opt (mmpx): New. * config/i386/i386-builtin-types.def (BND): New. (ULONG): New. (BND_FTYPE_PCVOID_ULONG): New. (VOID_FTYPE_BND_PCVOID): New. (VOID_FTYPE_PCVOID_PCVOID_BND): New. (BND_FTYPE_PCVOID_PCVOID): New. (BND_FTYPE_PCVOID): New. (BND_FTYPE_BND_BND): New. (PVOID_FTYPE_PVOID_PVOID_ULONG): New. (PVOID_FTYPE_PCVOID_BND_ULONG): New. (ULONG_FTYPE_VOID): New. (PVOID_FTYPE_BND): New. gcc/testsuite/ 2014-11-05 Ilya Enkovich <ilya.enkovich@intel.com> * gcc.target/i386/chkp-builtins-1.c: New. * gcc.target/i386/chkp-builtins-2.c: New. * gcc.target/i386/chkp-builtins-3.c: New. * gcc.target/i386/chkp-builtins-4.c: New. * gcc.target/i386/chkp-remove-bndint-1.c: New. * gcc.target/i386/chkp-remove-bndint-2.c: New. * gcc.target/i386/chkp-const-check-1.c: New. * gcc.target/i386/chkp-const-check-2.c: New. * gcc.target/i386/chkp-lifetime-1.c: New. * gcc.dg/pr37858.c: Replace early_local_cleanups pass name with build_ssa_passes. From-SVN: r217125
2014-11-05 13:42:03 +01:00
NEXT_PASS (pass_build_ssa_passes);
PUSH_INSERT_PASSES_WITHIN (pass_build_ssa_passes)
NEXT_PASS (pass_fixup_cfg);
NEXT_PASS (pass_build_ssa);
NEXT_PASS (pass_walloca, /*strict_mode_p=*/true);
NEXT_PASS (pass_warn_printf);
NEXT_PASS (pass_warn_nonnull_compare);
NEXT_PASS (pass_early_warn_uninitialized);
NEXT_PASS (pass_warn_access, /*early=*/true);
NEXT_PASS (pass_ubsan);
NEXT_PASS (pass_nothrow);
NEXT_PASS (pass_rebuild_cgraph_edges);
ipa-chkp.c: New. gcc/ 2014-11-05 Ilya Enkovich <ilya.enkovich@intel.com> * ipa-chkp.c: New. * ipa-chkp.h: New. * tree-chkp.c: New. * tree-chkp.h: New. * tree-chkp-opt.c: New. * rtl-chkp.c: New. * rtl-chkp.h: New. * Makefile.in (OBJS): Add ipa-chkp.o, rtl-chkp.o, tree-chkp.o tree-chkp-opt.o. (GTFILES): Add tree-chkp.c. * mode-classes.def (MODE_POINTER_BOUNDS): New. * tree.def (POINTER_BOUNDS_TYPE): New. * genmodes.c (complete_mode): Support MODE_POINTER_BOUNDS. (POINTER_BOUNDS_MODE): New. (make_pointer_bounds_mode): New. * machmode.h (POINTER_BOUNDS_MODE_P): New. * stor-layout.c (int_mode_for_mode): Support MODE_POINTER_BOUNDS. (layout_type): Support POINTER_BOUNDS_TYPE. * tree-pretty-print.c (dump_generic_node): Support POINTER_BOUNDS_TYPE. * tree-core.h (tree_index): Add TI_POINTER_BOUNDS_TYPE. * tree.c (build_int_cst_wide): Support POINTER_BOUNDS_TYPE. (type_contains_placeholder_1): Likewise. (build_common_tree_nodes): Initialize pointer_bounds_type_node. * tree.h (POINTER_BOUNDS_TYPE_P): New. (pointer_bounds_type_node): New. (POINTER_BOUNDS_P): New. (BOUNDED_TYPE_P): New. (BOUNDED_P): New. (CALL_WITH_BOUNDS_P): New. * gimple.h (gf_mask): Add GF_CALL_WITH_BOUNDS. (gimple_call_with_bounds_p): New. (gimple_call_set_with_bounds): New. (gimple_return_retbnd): New. (gimple_return_set_retbnd): New * gimple.c (gimple_build_return): Increase number of ops for return statement. (gimple_build_call_from_tree): Propagate CALL_WITH_BOUNDS_P flag. * gimple-pretty-print.c (dump_gimple_return): Print second op. * rtl.h (CALL_EXPR_WITH_BOUNDS_P): New. * gimplify.c (gimplify_init_constructor): Avoid infinite loop during gimplification of bounds initializer. * calls.c: Include tree-chkp.h, rtl-chkp.h, bitmap.h. (special_function_p): Use original decl name when analyzing instrumentation clone. (arg_data): Add fields special_slot, pointer_arg and pointer_offset. (store_bounds): New. (emit_call_1): Propagate instrumentation flag for CALL. (initialize_argument_information): Compute pointer_arg, pointer_offset and special_slot for pointer bounds arguments. (finalize_must_preallocate): Preallocate when storing bounds in bounds table. (compute_argument_addresses): Skip pointer bounds. (expand_call): Store bounds into tables separately. Return result joined with resulting bounds. * cfgexpand.c: Include tree-chkp.h, rtl-chkp.h. (expand_call_stmt): Propagate bounds flag for CALL_EXPR. (expand_return): Add returned bounds arg. Handle returned bounds. (expand_gimple_stmt_1): Adjust to new expand_return signature. (gimple_expand_cfg): Reset rtx bounds map. * expr.c: Include tree-chkp.h, rtl-chkp.h. (expand_assignment): Handle returned bounds. (store_expr_with_bounds): New. Replaces store_expr with new bounds target argument. Handle bounds returned by calls. (store_expr): Now wraps store_expr_with_bounds. * expr.h (store_expr_with_bounds): New. * function.c: Include tree-chkp.h, rtl-chkp.h. (bounds_parm_data): New. (use_register_for_decl): Do not registerize decls used for bounds stores and loads. (assign_parms_augmented_arg_list): Add bounds of the result structure pointer as the second argument. (assign_parm_find_entry_rtl): Mark bounds are never passed on the stack. (assign_parm_is_stack_parm): Likewise. (assign_parm_load_bounds): New. (assign_bounds): New. (assign_parms): Load bounds and determine a location for returned bounds. (diddle_return_value_1): New. (diddle_return_value): Handle returned bounds. * function.h (rtl_data): Add field for returned bounds. * varasm.c: Include tree-chkp.h. (output_constant): Support POINTER_BOUNDS_TYPE. (output_constant_pool_2): Support MODE_POINTER_BOUNDS. (ultimate_transparent_alias_target): Move up. (make_decl_rtl): For instrumented function use name of the original decl. (assemble_start_function): Mark function as global in case it is instrumentation clone of the global function. (do_assemble_alias): Follow transparent alias chain for identifier. Check if original alias is public. (maybe_assemble_visibility): Use visibility of the original function for instrumented version. (default_unique_section): Likewise. * emit-rtl.c (immed_double_const): Support MODE_POINTER_BOUNDS. (init_emit_once): Build pointer bounds zero constants. * explow.c (trunc_int_for_mode): Support MODE_POINTER_BOUNDS. * target.def (builtin_chkp_function): New. (chkp_bound_type): New. (chkp_bound_mode): New. (chkp_make_bounds_constant): New. (chkp_initialize_bounds): New. (load_bounds_for_arg): New. (store_bounds_for_arg): New. (load_returned_bounds): New. (store_returned_bounds): New. (chkp_function_value_bounds): New. (setup_incoming_vararg_bounds): New. (function_arg): Update hook description with new possible return value CONST_INT. * targhooks.h (default_load_bounds_for_arg): New. (default_store_bounds_for_arg): New. (default_load_returned_bounds): New. (default_store_returned_bounds): New. (default_chkp_bound_type): New. (default_chkp_bound_mode): New. (default_builtin_chkp_function): New. (default_chkp_function_value_bounds): New. (default_chkp_make_bounds_constant): New. (default_chkp_initialize_bounds): New. (default_setup_incoming_vararg_bounds): New. * targhooks.c (default_load_bounds_for_arg): New. (default_store_bounds_for_arg): New. (default_load_returned_bounds): New. (default_store_returned_bounds): New. (default_chkp_bound_type): New. (default_chkp_bound_mode); New. (default_builtin_chkp_function): New. (default_chkp_function_value_bounds): New. (default_chkp_make_bounds_constant): New. (default_chkp_initialize_bounds): New. (default_setup_incoming_vararg_bounds): New. * builtin-types.def (BT_BND): New. (BT_FN_PTR_CONST_PTR): New. (BT_FN_CONST_PTR_CONST_PTR): New. (BT_FN_BND_CONST_PTR): New. (BT_FN_CONST_PTR_BND): New. (BT_FN_PTR_CONST_PTR_SIZE): New. (BT_FN_PTR_CONST_PTR_CONST_PTR): New. (BT_FN_VOID_PTRPTR_CONST_PTR): New. (BT_FN_VOID_CONST_PTR_SIZE): New. (BT_FN_VOID_PTR_BND): New. (BT_FN_CONST_PTR_CONST_PTR_CONST_PTR): New. (BT_FN_BND_CONST_PTR_SIZE): New. (BT_FN_PTR_CONST_PTR_CONST_PTR_SIZE): New. (BT_FN_VOID_CONST_PTR_BND_CONST_PTR): New. * chkp-builtins.def: New. * builtins.def: include chkp-builtins.def. (DEF_CHKP_BUILTIN): New. * builtins.c: Include tree-chkp.h and rtl-chkp.h. (expand_builtin): Support BUILT_IN_CHKP_INIT_PTR_BOUNDS, BUILT_IN_CHKP_NULL_PTR_BOUNDS, BUILT_IN_CHKP_COPY_PTR_BOUNDS, BUILT_IN_CHKP_CHECK_PTR_LBOUNDS, BUILT_IN_CHKP_CHECK_PTR_UBOUNDS, BUILT_IN_CHKP_CHECK_PTR_BOUNDS, BUILT_IN_CHKP_SET_PTR_BOUNDS, BUILT_IN_CHKP_NARROW_PTR_BOUNDS, BUILT_IN_CHKP_STORE_PTR_BOUNDS, BUILT_IN_CHKP_GET_PTR_LBOUND, BUILT_IN_CHKP_GET_PTR_UBOUND, BUILT_IN_CHKP_BNDMK, BUILT_IN_CHKP_BNDSTX, BUILT_IN_CHKP_BNDCL, BUILT_IN_CHKP_BNDCU, BUILT_IN_CHKP_BNDLDX, BUILT_IN_CHKP_BNDRET, BUILT_IN_CHKP_INTERSECT, BUILT_IN_CHKP_NARROW, BUILT_IN_CHKP_EXTRACT_LOWER, BUILT_IN_CHKP_EXTRACT_UPPER. (std_expand_builtin_va_start): Init bounds for va_list. * cppbuiltin.c (define_builtin_macros_for_compilation_flags): Add __CHKP__ macro when Pointer Bounds Checker is on. * params.def (PARAM_CHKP_MAX_CTOR_SIZE): New. * passes.def (pass_ipa_chkp_versioning): New. (pass_early_local_passes): Renamed to pass_build_ssa_passes. (pass_fixup_cfg): Moved to pass_chkp_instrumentation_passes. (pass_chkp_instrumentation_passes): New. (pass_ipa_chkp_produce_thunks): New. (pass_local_optimization_passes): New. (pass_chkp_opt): New. * tree-pass.h (make_pass_ipa_chkp_versioning): New. (make_pass_ipa_chkp_produce_thunks): New. (make_pass_chkp): New. (make_pass_chkp_opt): New. (make_pass_early_local_passes): Renamed to ... (make_pass_build_ssa_passes): This. (make_pass_chkp_instrumentation_passes): New. (make_pass_local_optimization_passes): New. * passes.c (pass_manager::execute_early_local_passes): Execute early passes in three steps. (execute_all_early_local_passes): Renamed to ... (execute_build_ssa_passes): This. (pass_data_early_local_passes): Renamed to ... (pass_data_build_ssa_passes): This. (pass_early_local_passes): Renamed to ... (pass_build_ssa_passes): This. (pass_data_chkp_instrumentation_passes): New. (pass_chkp_instrumentation_passes): New. (pass_data_local_optimization_passes): New. (pass_local_optimization_passes): New. (make_pass_early_local_passes): Renamed to ... (make_pass_build_ssa_passes): This. (make_pass_chkp_instrumentation_passes): New. (make_pass_local_optimization_passes): New. * c-family/c.opt (fcheck-pointer-bounds): New. (fchkp-check-incomplete-type): New. (fchkp-zero-input-bounds-for-main): New. (fchkp-first-field-has-own-bounds): New. (fchkp-narrow-bounds): New. (fchkp-narrow-to-innermost-array): New. (fchkp-optimize): New. (fchkp-use-fast-string-functions): New. (fchkp-use-nochk-string-functions): New. (fchkp-use-static-bounds): New. (fchkp-use-static-const-bounds): New. (fchkp-treat-zero-dynamic-size-as-infinite): New. (fchkp-check-read): New. (fchkp-check-write): New. (fchkp-store-bounds): New. (fchkp-instrument-calls): New. (fchkp-instrument-marked-only): New. (Wchkp): New. * c-family/c-common.c (handle_bnd_variable_size_attribute): New. (handle_bnd_legacy): New. (handle_bnd_instrument): New. (c_common_attribute_table): Add bnd_variable_size, bnd_legacy and bnd_instrument. Fix documentation. (c_common_format_attribute_table): Likewsie. * toplev.c: include tree-chkp.h. (process_options): Check Pointer Bounds Checker is supported. (compile_file): Add chkp_finish_file call. * ipa-cp.c (initialize_node_lattices): Use cgraph_local_p to handle instrumentation clones properly. (propagate_constants_accross_call): Do not propagate through instrumentation thunks. * ipa-pure-const.c (propagate_pure_const): Support IPA_REF_CHKP. * ipa-inline.c (early_inliner): Check edge has summary allocated. * ipa-split.c: Include tree-chkp.h. (find_retbnd): New. (split_part_set_ssa_name_p): New. (consider_split): Do not split retbnd and retval producers. (insert_bndret_call_after): new. (split_function): Propagate Pointer Bounds Checker instrumentation marks and handle returned bounds. * tree-ssa-sccvn.h (vn_reference_op_struct): Transform opcode into bit field and add with_bounds field. * tree-ssa-sccvn.c (copy_reference_ops_from_call): Set with_bounds field for instrumented calls. * tree-ssa-pre.c (create_component_ref_by_pieces_1): Restore CALL_WITH_BOUNDS_P flag for calls. * tree-ssa-ccp.c: Include tree-chkp.h. (insert_clobber_before_stack_restore): Handle BUILT_IN_CHKP_BNDRET calls. * tree-ssa-dce.c: Include tree-chkp.h. (propagate_necessity): For free call fed by alloc check bounds are also provided by the same alloc. (eliminate_unnecessary_stmts): Handle BUILT_IN_CHKP_BNDRET used by free calls. * tree-inline.c: Include tree-chkp.h. (declare_return_variable): Add arg holding returned bounds slot. Create and initialize returned bounds var. (remap_gimple_stmt): Handle returned bounds. Return sequence of statements instead of a single statement. (insert_init_stmt): Add declaration. (remap_gimple_seq): Adjust to new remap_gimple_stmt signature. (copy_bb): Adjust to changed return type of remap_gimple_stmt. Properly handle bounds in va_arg_pack and va_arg_pack_len. (expand_call_inline): Handle returned bounds. Add bounds copy for generated mem to mem assignments. * tree-inline.h (copy_body_data): Add fields retbnd and assign_stmts. * value-prof.c: Include tree-chkp.h. (gimple_ic): Support returned bounds. * ipa.c (cgraph_build_static_cdtor_1): Support contructors with "chkp ctor" and "bnd_legacy" attributes. (symtab_remove_unreachable_nodes): Keep initial values for pointer bounds to be used for checks eliminations. (process_references): Handle IPA_REF_CHKP. (walk_polymorphic_call_targets): Likewise. * ipa-visibility.c (cgraph_externally_visible_p): Mark instrumented 'main' as externally visible. (function_and_variable_visibility): Filter instrumentation thunks. * cgraph.h (cgraph_thunk_info): Add add_pointer_bounds_args field. (cgraph_node): Add instrumented_version, orig_decl and instrumentation_clone fields. (symtab_node::get_alias_target): Allow IPA_REF_CHKP reference. (varpool_node): Add need_bounds_init field. (cgraph_local_p): New. * cgraph.c: Include tree-chkp.h. (cgraph_node::remove): Fix instrumented_version of the referenced node if any. (cgraph_node::dump): Dump instrumentation_clone and instrumented_version fields. (cgraph_node::verify_node): Check correctness of IPA_REF_CHKP references and instrumentation thunks. (cgraph_can_remove_if_no_direct_calls_and_refs_p): Keep all not instrumented instrumentation clones alive. (cgraph_redirect_edge_call_stmt_to_callee): Support returned bounds. * cgraphbuild.c (rebuild_cgraph_edges): Rebuild IPA_REF_CHKP reference. (cgraph_rebuild_references): Likewise. * cgraphunit.c: Include tree-chkp.h. (assemble_thunks_and_aliases): Skip thunks calling instrumneted function version. (varpool_finalize_decl): Register statically initialized decls in Pointer Bounds Checker. (walk_polymorphic_call_targets): Do not mark generated call to __builtin_unreachable as with_bounds. (output_weakrefs): If there are both instrumented and original versions, output only one of them. (cgraph_node::expand_thunk): Set with_bounds flag for created call statement. * ipa-ref.h (ipa_ref_use): Add IPA_REF_CHKP. (ipa_ref): increase size of use field. * symtab.c (ipa_ref_use_name): Add element for IPA_REF_CHKP. * varpool.c (dump_varpool_node): Dump need_bounds_init field. (ctor_for_folding): Do not fold constant bounds vars. * lto-streamer.h (LTO_minor_version): Change minor version from 0 to 1. * lto-cgraph.c (compute_ltrans_boundary): Keep initial values for pointer bounds. (lto_output_node): Output instrumentation_clone, thunk.add_pointer_bounds_args and orig_decl field. (lto_output_ref): Adjust to new ipa_ref::use field size. (input_overwrite_node): Read instrumentation_clone field. (input_node): Read thunk.add_pointer_bounds_args and orig_decl fields. (input_ref): Adjust to new ipa_ref::use field size. (input_cgraph_1): Compute instrumented_version fields and restore IDENTIFIER_TRANSPARENT_ALIAS chains. (lto_output_varpool_node): Output need_bounds_init value. (input_varpool_node): Read need_bounds_init value. * lto-partition.c (add_symbol_to_partition_1): Keep original and instrumented versions together. (privatize_symbol_name): Restore transparent alias chain if required. (add_references_to_partition): Add references to pointer bounds vars. * dbxout.c (dbxout_type): Ignore POINTER_BOUNDS_TYPE. * dwarf2out.c (gen_subprogram_die): Ignore bound args. (gen_type_die_with_usage): Skip pointer bounds. (dwarf2out_global_decl): Likewise. (is_base_type): Support POINTER_BOUNDS_TYPE. (gen_formal_types_die): Skip pointer bounds. (gen_decl_die): Likewise. * var-tracking.c (vt_add_function_parameters): Skip bounds parameters. * ipa-icf.c (sem_function::merge): Do not merge when instrumentation thunk still exists. (sem_variable::merge): Reset need_bounds_init flag. * doc/extend.texi: Document Pointer Bounds Checker built-in functions and attributes. * doc/tm.texi.in (TARGET_LOAD_BOUNDS_FOR_ARG): New. (TARGET_STORE_BOUNDS_FOR_ARG): New. (TARGET_LOAD_RETURNED_BOUNDS): New. (TARGET_STORE_RETURNED_BOUNDS): New. (TARGET_CHKP_FUNCTION_VALUE_BOUNDS): New. (TARGET_SETUP_INCOMING_VARARG_BOUNDS): New. (TARGET_BUILTIN_CHKP_FUNCTION): New. (TARGET_CHKP_BOUND_TYPE): New. (TARGET_CHKP_BOUND_MODE): New. (TARGET_CHKP_MAKE_BOUNDS_CONSTANT): New. (TARGET_CHKP_INITIALIZE_BOUNDS): New. * doc/tm.texi: Regenerated. * doc/rtl.texi (MODE_POINTER_BOUNDS): New. (BND32mode): New. (BND64mode): New. * doc/invoke.texi (-mmpx): New. (-mno-mpx): New. (chkp-max-ctor-size): New. * config/i386/constraints.md (w): New. (Ti): New. (Tb): New. * config/i386/i386-c.c (ix86_target_macros_internal): Add __MPX__. * config/i386/i386-modes.def (BND32): New. (BND64): New. * config/i386/i386-protos.h (ix86_bnd_prefixed_insn_p): New. * config/i386/i386.c: Include tree-chkp.h, rtl-chkp.h, tree-iterator.h. (regclass_map): Add bound registers. (dbx_register_map): Likewise. (dbx64_register_map): Likewise. (svr4_dbx_register_map): Likewise. (isa_opts): Add -mmpx. (PTA_MPX): New. (ix86_option_override_internal): Support MPX ISA. (ix86_conditional_register_usage): Support bound registers. (ix86_code_end): Add MPX bnd prefix. (output_set_got): Likewise. (print_reg): Avoid prefixes for bound registers. (ix86_print_operand): Add '!' (MPX bnd) print prefix support. (ix86_print_operand_punct_valid_p): Likewise. (ix86_print_operand_address): Support UNSPEC_BNDMK_ADDR and UNSPEC_BNDLDX_ADDR. (ix86_output_call_insn): Add MPX bnd prefix to branch instructions. (ix86_class_likely_spilled_p): Add bound regs support. (ix86_hard_regno_mode_ok): Likewise. (x86_order_regs_for_local_alloc): Likewise. (ix86_bnd_prefixed_insn_p): New. (ix86_builtins): Add IX86_BUILTIN_BNDMK, IX86_BUILTIN_BNDSTX, IX86_BUILTIN_BNDLDX, IX86_BUILTIN_BNDCL, IX86_BUILTIN_BNDCU, IX86_BUILTIN_BNDRET, IX86_BUILTIN_BNDNARROW, IX86_BUILTIN_BNDINT, IX86_BUILTIN_SIZEOF, IX86_BUILTIN_BNDLOWER, IX86_BUILTIN_BNDUPPER. (builtin_isa): Add leaf_p and nothrow_p fields. (def_builtin): Initialize leaf_p and nothrow_p. (ix86_add_new_builtins): Handle leaf_p and nothrow_p flags. (bdesc_mpx): New. (bdesc_mpx_const): New. (ix86_init_mpx_builtins): New. (ix86_init_builtins): Call ix86_init_mpx_builtins. (ix86_emit_cmove): New. (ix86_emit_move_max): New. (ix86_expand_builtin): Expand IX86_BUILTIN_BNDMK, IX86_BUILTIN_BNDSTX, IX86_BUILTIN_BNDLDX, IX86_BUILTIN_BNDCL, IX86_BUILTIN_BNDCU, IX86_BUILTIN_BNDRET, IX86_BUILTIN_BNDNARROW, IX86_BUILTIN_BNDINT, IX86_BUILTIN_SIZEOF, IX86_BUILTIN_BNDLOWER, IX86_BUILTIN_BNDUPPER. (ix86_function_value_bounds): New. (ix86_builtin_mpx_function): New. (ix86_get_arg_address_for_bt): New. (ix86_load_bounds): New. (ix86_store_bounds): New. (ix86_load_returned_bounds): New. (ix86_store_returned_bounds): New. (ix86_mpx_bound_mode): New. (ix86_make_bounds_constant): New. (ix86_initialize_bounds): (TARGET_LOAD_BOUNDS_FOR_ARG): New. (TARGET_STORE_BOUNDS_FOR_ARG): New. (TARGET_LOAD_RETURNED_BOUNDS): New. (TARGET_STORE_RETURNED_BOUNDS): New. (TARGET_CHKP_BOUND_MODE): New. (TARGET_BUILTIN_CHKP_FUNCTION): New. (TARGET_CHKP_FUNCTION_VALUE_BOUNDS): New. (TARGET_CHKP_MAKE_BOUNDS_CONSTANT): New. (TARGET_CHKP_INITIALIZE_BOUNDS): New. (ix86_option_override_internal): Do not support x32 with MPX. (init_cumulative_args): Init stdarg, bnd_regno, bnds_in_bt and force_bnd_pass. (function_arg_advance_32): Return number of used integer registers. (function_arg_advance_64): Likewise. (function_arg_advance_ms_64): Likewise. (ix86_function_arg_advance): Handle pointer bounds. (ix86_function_arg): Likewise. (ix86_function_value_regno_p): Mark fisrt bounds registers as possible function value. (ix86_function_value_1): Handle pointer bounds type/mode (ix86_return_in_memory): Likewise. (ix86_print_operand): Analyse insn to decide abounf "bnd" prefix. (ix86_expand_call): Generate returned bounds. (ix86_setup_incoming_vararg_bounds): New. (ix86_va_start): Initialize bounds for pointers in va_list. (TARGET_SETUP_INCOMING_VARARG_BOUNDS): New. * config/i386/i386.h (TARGET_MPX): New. (TARGET_MPX_P): New. (FIRST_PSEUDO_REGISTER): Fix to new value. (FIXED_REGISTERS): Add bound registers. (CALL_USED_REGISTERS): Likewise. (REG_ALLOC_ORDER): Likewise. (HARD_REGNO_NREGS): Likewise. (VALID_BND_REG_MODE): New. (FIRST_BND_REG): New. (LAST_BND_REG): New. (reg_class): Add BND_REGS. (REG_CLASS_NAMES): Likewise. (REG_CLASS_CONTENTS): Likewise. (BND_REGNO_P): New. (ANY_BND_REG_P): New. (BNDmode): New. (HI_REGISTER_NAMES): Add bound registers. (ix86_args): Add bnd_regno, bnds_in_bt, force_bnd_pass and stdarg fields. * config/i386/i386.md (UNSPEC_BNDMK): New. (UNSPEC_BNDMK_ADDR): New. (UNSPEC_BNDSTX): New. (UNSPEC_BNDLDX): New. (UNSPEC_BNDLDX_ADDR): New. (UNSPEC_BNDCL): New. (UNSPEC_BNDCU): New. (UNSPEC_BNDCN): New. (UNSPEC_MPX_FENCE): New. (UNSPEC_SIZEOF): New. (BND0_REG): New. (BND1_REG): New. (type): Add mpxmov, mpxmk, mpxchk, mpxld, mpxst. (length_immediate): Support mpxmov, mpxmk, mpxchk, mpxld, mpxst. (prefix_rep): Check for bnd prefix. (prefix_0f): Support mpxmov, mpxmk, mpxchk, mpxld, mpxst. (length_nobnd): New. (length): Use length_nobnd when specified. (memory): Support mpxmov, mpxmk, mpxchk, mpxld, mpxst. (BND): New. (bnd_ptr): New. (BNDCHECK): New. (bndcheck): New. (*jcc_1): Add MPX bnd prefix. (*jcc_2): Likewise. (jump): Likewise. (*indirect_jump): Likewise. (*tablejump_1): Likewise. (simple_return_internal): Likewise. (simple_return_internal_long): Likewise. (simple_return_pop_internal): Likewise. (simple_return_indirect_internal): Likewise. (<mode>_mk): New. (*<mode>_mk): New. (mov<mode>): New. (*mov<mode>_internal_mpx): New. (<mode>_<bndcheck>): New. (*<mode>_<bndcheck>): New. (<mode>_ldx): New. (*<mode>_ldx): New. (<mode>_stx): New. (*<mode>_stx): New. move_size_reloc_<mode>): New. * config/i386/predicates.md (address_mpx_no_base_operand): New. (address_mpx_no_index_operand): New. (bnd_mem_operator): New. (symbol_operand): New. (x86_64_immediate_size_operand): New. * config/i386/i386.opt (mmpx): New. * config/i386/i386-builtin-types.def (BND): New. (ULONG): New. (BND_FTYPE_PCVOID_ULONG): New. (VOID_FTYPE_BND_PCVOID): New. (VOID_FTYPE_PCVOID_PCVOID_BND): New. (BND_FTYPE_PCVOID_PCVOID): New. (BND_FTYPE_PCVOID): New. (BND_FTYPE_BND_BND): New. (PVOID_FTYPE_PVOID_PVOID_ULONG): New. (PVOID_FTYPE_PCVOID_BND_ULONG): New. (ULONG_FTYPE_VOID): New. (PVOID_FTYPE_BND): New. gcc/testsuite/ 2014-11-05 Ilya Enkovich <ilya.enkovich@intel.com> * gcc.target/i386/chkp-builtins-1.c: New. * gcc.target/i386/chkp-builtins-2.c: New. * gcc.target/i386/chkp-builtins-3.c: New. * gcc.target/i386/chkp-builtins-4.c: New. * gcc.target/i386/chkp-remove-bndint-1.c: New. * gcc.target/i386/chkp-remove-bndint-2.c: New. * gcc.target/i386/chkp-const-check-1.c: New. * gcc.target/i386/chkp-const-check-2.c: New. * gcc.target/i386/chkp-lifetime-1.c: New. * gcc.dg/pr37858.c: Replace early_local_cleanups pass name with build_ssa_passes. From-SVN: r217125
2014-11-05 13:42:03 +01:00
POP_INSERT_PASSES ()
NEXT_PASS (pass_local_optimization_passes);
PUSH_INSERT_PASSES_WITHIN (pass_local_optimization_passes)
NEXT_PASS (pass_fixup_cfg);
NEXT_PASS (pass_rebuild_cgraph_edges);
2017-05-23 18:20:53 +02:00
NEXT_PASS (pass_local_fn_summary);
NEXT_PASS (pass_early_inline);
NEXT_PASS (pass_warn_recursion);
NEXT_PASS (pass_all_early_optimizations);
PUSH_INSERT_PASSES_WITHIN (pass_all_early_optimizations)
NEXT_PASS (pass_remove_cgraph_callee_edges);
passes: Fix up subobject __bos [PR101419] The following testcase is miscompiled, because VN during cunrolli changes __bos argument from address of a larger field to address of a smaller field and so __builtin_object_size (, 1) then folds into smaller value than the actually available size. copy_reference_ops_from_ref has a hack for this, but it was using cfun->after_inlining as a check whether the hack can be ignored, and cunrolli is after_inlining. This patch uses a property to make it exact (set at the end of objsz pass that doesn't do insert_min_max_p) and additionally based on discussions in the PR moves the objsz pass earlier after IPA. 2021-07-13 Jakub Jelinek <jakub@redhat.com> Richard Biener <rguenther@suse.de> PR tree-optimization/101419 * tree-pass.h (PROP_objsz): Define. (make_pass_early_object_sizes): Declare. * passes.def (pass_all_early_optimizations): Rename pass_object_sizes there to pass_early_object_sizes, drop parameter. (pass_all_optimizations): Move pass_object_sizes right after pass_ccp, drop parameter, move pass_post_ipa_warn right after that. * tree-object-size.c (pass_object_sizes::execute): Rename to... (object_sizes_execute): ... this. Add insert_min_max_p argument. (pass_data_object_sizes): Move after object_sizes_execute. (pass_object_sizes): Likewise. In execute method call object_sizes_execute, drop set_pass_param method and insert_min_max_p non-static data member and its initializer in the ctor. (pass_data_early_object_sizes, pass_early_object_sizes, make_pass_early_object_sizes): New. * tree-ssa-sccvn.c (copy_reference_ops_from_ref): Use (cfun->curr_properties & PROP_objsz) instead of cfun->after_inlining. * gcc.dg/builtin-object-size-10.c: Pass -fdump-tree-early_objsz-details instead of -fdump-tree-objsz1-details in dg-options and adjust names of dump file in scan-tree-dump. * gcc.dg/pr101419.c: New test.
2021-07-13 11:04:22 +02:00
NEXT_PASS (pass_early_object_sizes);
/* Don't record nonzero bits before IPA to avoid
using too much memory. */
NEXT_PASS (pass_ccp, false /* nonzero_p */);
/* After CCP we rewrite no longer addressed locals into SSA
form if possible. */
NEXT_PASS (pass_forwprop);
NEXT_PASS (pass_early_thread_jumps, /*first=*/true);
NEXT_PASS (pass_sra_early);
/* pass_build_ealias is a dummy pass that ensures that we
execute TODO_rebuild_alias at this point. */
NEXT_PASS (pass_build_ealias);
NEXT_PASS (pass_fre, true /* may_iterate */);
Add Early VRP gcc/ChangeLog: 2016-09-21 Kugan Vivekanandarajah <kuganv@linaro.org> * doc/invoke.texi: Document -fdump-tree-evrp. * passes.def: Define new pass_early_vrp. * timevar.def: Define new TV_TREE_EARLY_VRP. * tree-pass.h (make_pass_early_vrp): New. * tree-ssa-propagate.c: Make replace_uses_in non static. * tree-ssa-propagate.h: Export replace_uses_in. * tree-vrp.c (extract_range_for_var_from_comparison_expr): New. (extract_range_from_assert): Factor out extract_range_for_var_from_comparison_expr. (vrp_initialize_lattice): New. (vrp_initialize): Factor out vrp_initialize_lattice. (vrp_valueize): Fix it to reject complex value ranges. (vrp_free_lattice): New. (evrp_dom_walker::before_dom_children): Likewise. (evrp_dom_walker::after_dom_children): Likewise. (evrp_dom_walker::push_value_range): Likewise. (evrp_dom_walker::pop_value_range): Likewise. (execute_early_vrp): Likewise. (execute_vrp): Call vrp_initialize_lattice and vrp_free_lattice. (make_pass_early_vrp): New. gcc/testsuite/ChangeLog: 2016-09-21 Kugan Vivekanandarajah <kuganv@linaro.org> * g++.dg/tree-ssa/pr31146-2.C: Run with -fno-tree-evrp as evrp also does the same transformation. * g++.dg/warn/pr33738.C: XFAIL as optimization now happens in ccp. * gcc.dg/tree-ssa/evrp1.c: New test. * gcc.dg/tree-ssa/evrp2.c: New test. * gcc.dg/tree-ssa/evrp3.c: New test. * gcc.dg/tree-ssa/pr20657.c: Check for the pattern in evrp dump. * gcc.dg/tree-ssa/pr22117.c: Likewise. * gcc.dg/tree-ssa/pr61839_2.c: Likewise. * gcc.dg/tree-ssa/pr64130.c: Likewise. * gcc.dg/tree-ssa/pr37508.c: Change the pattern to be checked as foling now happens early. * gcc.dg/tree-ssa/vrp04.c: Likewise. * gcc.dg/tree-ssa/vrp06.c: Likewise. * gcc.dg/tree-ssa/vrp16.c: Likewise. * gcc.dg/tree-ssa/vrp25.c: Likewise. * gcc.dg/tree-ssa/vrp67.c: Likewise. From-SVN: r240291
2016-09-21 01:23:55 +02:00
NEXT_PASS (pass_early_vrp);
NEXT_PASS (pass_merge_phi);
NEXT_PASS (pass_dse);
NEXT_PASS (pass_cd_dce, false /* update_address_taken_p */);
NEXT_PASS (pass_phiopt, true /* early_p */);
NEXT_PASS (pass_tail_recursion);
NEXT_PASS (pass_if_to_switch);
NEXT_PASS (pass_convert_switch);
ipa-chkp.c: New. gcc/ 2014-11-05 Ilya Enkovich <ilya.enkovich@intel.com> * ipa-chkp.c: New. * ipa-chkp.h: New. * tree-chkp.c: New. * tree-chkp.h: New. * tree-chkp-opt.c: New. * rtl-chkp.c: New. * rtl-chkp.h: New. * Makefile.in (OBJS): Add ipa-chkp.o, rtl-chkp.o, tree-chkp.o tree-chkp-opt.o. (GTFILES): Add tree-chkp.c. * mode-classes.def (MODE_POINTER_BOUNDS): New. * tree.def (POINTER_BOUNDS_TYPE): New. * genmodes.c (complete_mode): Support MODE_POINTER_BOUNDS. (POINTER_BOUNDS_MODE): New. (make_pointer_bounds_mode): New. * machmode.h (POINTER_BOUNDS_MODE_P): New. * stor-layout.c (int_mode_for_mode): Support MODE_POINTER_BOUNDS. (layout_type): Support POINTER_BOUNDS_TYPE. * tree-pretty-print.c (dump_generic_node): Support POINTER_BOUNDS_TYPE. * tree-core.h (tree_index): Add TI_POINTER_BOUNDS_TYPE. * tree.c (build_int_cst_wide): Support POINTER_BOUNDS_TYPE. (type_contains_placeholder_1): Likewise. (build_common_tree_nodes): Initialize pointer_bounds_type_node. * tree.h (POINTER_BOUNDS_TYPE_P): New. (pointer_bounds_type_node): New. (POINTER_BOUNDS_P): New. (BOUNDED_TYPE_P): New. (BOUNDED_P): New. (CALL_WITH_BOUNDS_P): New. * gimple.h (gf_mask): Add GF_CALL_WITH_BOUNDS. (gimple_call_with_bounds_p): New. (gimple_call_set_with_bounds): New. (gimple_return_retbnd): New. (gimple_return_set_retbnd): New * gimple.c (gimple_build_return): Increase number of ops for return statement. (gimple_build_call_from_tree): Propagate CALL_WITH_BOUNDS_P flag. * gimple-pretty-print.c (dump_gimple_return): Print second op. * rtl.h (CALL_EXPR_WITH_BOUNDS_P): New. * gimplify.c (gimplify_init_constructor): Avoid infinite loop during gimplification of bounds initializer. * calls.c: Include tree-chkp.h, rtl-chkp.h, bitmap.h. (special_function_p): Use original decl name when analyzing instrumentation clone. (arg_data): Add fields special_slot, pointer_arg and pointer_offset. (store_bounds): New. (emit_call_1): Propagate instrumentation flag for CALL. (initialize_argument_information): Compute pointer_arg, pointer_offset and special_slot for pointer bounds arguments. (finalize_must_preallocate): Preallocate when storing bounds in bounds table. (compute_argument_addresses): Skip pointer bounds. (expand_call): Store bounds into tables separately. Return result joined with resulting bounds. * cfgexpand.c: Include tree-chkp.h, rtl-chkp.h. (expand_call_stmt): Propagate bounds flag for CALL_EXPR. (expand_return): Add returned bounds arg. Handle returned bounds. (expand_gimple_stmt_1): Adjust to new expand_return signature. (gimple_expand_cfg): Reset rtx bounds map. * expr.c: Include tree-chkp.h, rtl-chkp.h. (expand_assignment): Handle returned bounds. (store_expr_with_bounds): New. Replaces store_expr with new bounds target argument. Handle bounds returned by calls. (store_expr): Now wraps store_expr_with_bounds. * expr.h (store_expr_with_bounds): New. * function.c: Include tree-chkp.h, rtl-chkp.h. (bounds_parm_data): New. (use_register_for_decl): Do not registerize decls used for bounds stores and loads. (assign_parms_augmented_arg_list): Add bounds of the result structure pointer as the second argument. (assign_parm_find_entry_rtl): Mark bounds are never passed on the stack. (assign_parm_is_stack_parm): Likewise. (assign_parm_load_bounds): New. (assign_bounds): New. (assign_parms): Load bounds and determine a location for returned bounds. (diddle_return_value_1): New. (diddle_return_value): Handle returned bounds. * function.h (rtl_data): Add field for returned bounds. * varasm.c: Include tree-chkp.h. (output_constant): Support POINTER_BOUNDS_TYPE. (output_constant_pool_2): Support MODE_POINTER_BOUNDS. (ultimate_transparent_alias_target): Move up. (make_decl_rtl): For instrumented function use name of the original decl. (assemble_start_function): Mark function as global in case it is instrumentation clone of the global function. (do_assemble_alias): Follow transparent alias chain for identifier. Check if original alias is public. (maybe_assemble_visibility): Use visibility of the original function for instrumented version. (default_unique_section): Likewise. * emit-rtl.c (immed_double_const): Support MODE_POINTER_BOUNDS. (init_emit_once): Build pointer bounds zero constants. * explow.c (trunc_int_for_mode): Support MODE_POINTER_BOUNDS. * target.def (builtin_chkp_function): New. (chkp_bound_type): New. (chkp_bound_mode): New. (chkp_make_bounds_constant): New. (chkp_initialize_bounds): New. (load_bounds_for_arg): New. (store_bounds_for_arg): New. (load_returned_bounds): New. (store_returned_bounds): New. (chkp_function_value_bounds): New. (setup_incoming_vararg_bounds): New. (function_arg): Update hook description with new possible return value CONST_INT. * targhooks.h (default_load_bounds_for_arg): New. (default_store_bounds_for_arg): New. (default_load_returned_bounds): New. (default_store_returned_bounds): New. (default_chkp_bound_type): New. (default_chkp_bound_mode): New. (default_builtin_chkp_function): New. (default_chkp_function_value_bounds): New. (default_chkp_make_bounds_constant): New. (default_chkp_initialize_bounds): New. (default_setup_incoming_vararg_bounds): New. * targhooks.c (default_load_bounds_for_arg): New. (default_store_bounds_for_arg): New. (default_load_returned_bounds): New. (default_store_returned_bounds): New. (default_chkp_bound_type): New. (default_chkp_bound_mode); New. (default_builtin_chkp_function): New. (default_chkp_function_value_bounds): New. (default_chkp_make_bounds_constant): New. (default_chkp_initialize_bounds): New. (default_setup_incoming_vararg_bounds): New. * builtin-types.def (BT_BND): New. (BT_FN_PTR_CONST_PTR): New. (BT_FN_CONST_PTR_CONST_PTR): New. (BT_FN_BND_CONST_PTR): New. (BT_FN_CONST_PTR_BND): New. (BT_FN_PTR_CONST_PTR_SIZE): New. (BT_FN_PTR_CONST_PTR_CONST_PTR): New. (BT_FN_VOID_PTRPTR_CONST_PTR): New. (BT_FN_VOID_CONST_PTR_SIZE): New. (BT_FN_VOID_PTR_BND): New. (BT_FN_CONST_PTR_CONST_PTR_CONST_PTR): New. (BT_FN_BND_CONST_PTR_SIZE): New. (BT_FN_PTR_CONST_PTR_CONST_PTR_SIZE): New. (BT_FN_VOID_CONST_PTR_BND_CONST_PTR): New. * chkp-builtins.def: New. * builtins.def: include chkp-builtins.def. (DEF_CHKP_BUILTIN): New. * builtins.c: Include tree-chkp.h and rtl-chkp.h. (expand_builtin): Support BUILT_IN_CHKP_INIT_PTR_BOUNDS, BUILT_IN_CHKP_NULL_PTR_BOUNDS, BUILT_IN_CHKP_COPY_PTR_BOUNDS, BUILT_IN_CHKP_CHECK_PTR_LBOUNDS, BUILT_IN_CHKP_CHECK_PTR_UBOUNDS, BUILT_IN_CHKP_CHECK_PTR_BOUNDS, BUILT_IN_CHKP_SET_PTR_BOUNDS, BUILT_IN_CHKP_NARROW_PTR_BOUNDS, BUILT_IN_CHKP_STORE_PTR_BOUNDS, BUILT_IN_CHKP_GET_PTR_LBOUND, BUILT_IN_CHKP_GET_PTR_UBOUND, BUILT_IN_CHKP_BNDMK, BUILT_IN_CHKP_BNDSTX, BUILT_IN_CHKP_BNDCL, BUILT_IN_CHKP_BNDCU, BUILT_IN_CHKP_BNDLDX, BUILT_IN_CHKP_BNDRET, BUILT_IN_CHKP_INTERSECT, BUILT_IN_CHKP_NARROW, BUILT_IN_CHKP_EXTRACT_LOWER, BUILT_IN_CHKP_EXTRACT_UPPER. (std_expand_builtin_va_start): Init bounds for va_list. * cppbuiltin.c (define_builtin_macros_for_compilation_flags): Add __CHKP__ macro when Pointer Bounds Checker is on. * params.def (PARAM_CHKP_MAX_CTOR_SIZE): New. * passes.def (pass_ipa_chkp_versioning): New. (pass_early_local_passes): Renamed to pass_build_ssa_passes. (pass_fixup_cfg): Moved to pass_chkp_instrumentation_passes. (pass_chkp_instrumentation_passes): New. (pass_ipa_chkp_produce_thunks): New. (pass_local_optimization_passes): New. (pass_chkp_opt): New. * tree-pass.h (make_pass_ipa_chkp_versioning): New. (make_pass_ipa_chkp_produce_thunks): New. (make_pass_chkp): New. (make_pass_chkp_opt): New. (make_pass_early_local_passes): Renamed to ... (make_pass_build_ssa_passes): This. (make_pass_chkp_instrumentation_passes): New. (make_pass_local_optimization_passes): New. * passes.c (pass_manager::execute_early_local_passes): Execute early passes in three steps. (execute_all_early_local_passes): Renamed to ... (execute_build_ssa_passes): This. (pass_data_early_local_passes): Renamed to ... (pass_data_build_ssa_passes): This. (pass_early_local_passes): Renamed to ... (pass_build_ssa_passes): This. (pass_data_chkp_instrumentation_passes): New. (pass_chkp_instrumentation_passes): New. (pass_data_local_optimization_passes): New. (pass_local_optimization_passes): New. (make_pass_early_local_passes): Renamed to ... (make_pass_build_ssa_passes): This. (make_pass_chkp_instrumentation_passes): New. (make_pass_local_optimization_passes): New. * c-family/c.opt (fcheck-pointer-bounds): New. (fchkp-check-incomplete-type): New. (fchkp-zero-input-bounds-for-main): New. (fchkp-first-field-has-own-bounds): New. (fchkp-narrow-bounds): New. (fchkp-narrow-to-innermost-array): New. (fchkp-optimize): New. (fchkp-use-fast-string-functions): New. (fchkp-use-nochk-string-functions): New. (fchkp-use-static-bounds): New. (fchkp-use-static-const-bounds): New. (fchkp-treat-zero-dynamic-size-as-infinite): New. (fchkp-check-read): New. (fchkp-check-write): New. (fchkp-store-bounds): New. (fchkp-instrument-calls): New. (fchkp-instrument-marked-only): New. (Wchkp): New. * c-family/c-common.c (handle_bnd_variable_size_attribute): New. (handle_bnd_legacy): New. (handle_bnd_instrument): New. (c_common_attribute_table): Add bnd_variable_size, bnd_legacy and bnd_instrument. Fix documentation. (c_common_format_attribute_table): Likewsie. * toplev.c: include tree-chkp.h. (process_options): Check Pointer Bounds Checker is supported. (compile_file): Add chkp_finish_file call. * ipa-cp.c (initialize_node_lattices): Use cgraph_local_p to handle instrumentation clones properly. (propagate_constants_accross_call): Do not propagate through instrumentation thunks. * ipa-pure-const.c (propagate_pure_const): Support IPA_REF_CHKP. * ipa-inline.c (early_inliner): Check edge has summary allocated. * ipa-split.c: Include tree-chkp.h. (find_retbnd): New. (split_part_set_ssa_name_p): New. (consider_split): Do not split retbnd and retval producers. (insert_bndret_call_after): new. (split_function): Propagate Pointer Bounds Checker instrumentation marks and handle returned bounds. * tree-ssa-sccvn.h (vn_reference_op_struct): Transform opcode into bit field and add with_bounds field. * tree-ssa-sccvn.c (copy_reference_ops_from_call): Set with_bounds field for instrumented calls. * tree-ssa-pre.c (create_component_ref_by_pieces_1): Restore CALL_WITH_BOUNDS_P flag for calls. * tree-ssa-ccp.c: Include tree-chkp.h. (insert_clobber_before_stack_restore): Handle BUILT_IN_CHKP_BNDRET calls. * tree-ssa-dce.c: Include tree-chkp.h. (propagate_necessity): For free call fed by alloc check bounds are also provided by the same alloc. (eliminate_unnecessary_stmts): Handle BUILT_IN_CHKP_BNDRET used by free calls. * tree-inline.c: Include tree-chkp.h. (declare_return_variable): Add arg holding returned bounds slot. Create and initialize returned bounds var. (remap_gimple_stmt): Handle returned bounds. Return sequence of statements instead of a single statement. (insert_init_stmt): Add declaration. (remap_gimple_seq): Adjust to new remap_gimple_stmt signature. (copy_bb): Adjust to changed return type of remap_gimple_stmt. Properly handle bounds in va_arg_pack and va_arg_pack_len. (expand_call_inline): Handle returned bounds. Add bounds copy for generated mem to mem assignments. * tree-inline.h (copy_body_data): Add fields retbnd and assign_stmts. * value-prof.c: Include tree-chkp.h. (gimple_ic): Support returned bounds. * ipa.c (cgraph_build_static_cdtor_1): Support contructors with "chkp ctor" and "bnd_legacy" attributes. (symtab_remove_unreachable_nodes): Keep initial values for pointer bounds to be used for checks eliminations. (process_references): Handle IPA_REF_CHKP. (walk_polymorphic_call_targets): Likewise. * ipa-visibility.c (cgraph_externally_visible_p): Mark instrumented 'main' as externally visible. (function_and_variable_visibility): Filter instrumentation thunks. * cgraph.h (cgraph_thunk_info): Add add_pointer_bounds_args field. (cgraph_node): Add instrumented_version, orig_decl and instrumentation_clone fields. (symtab_node::get_alias_target): Allow IPA_REF_CHKP reference. (varpool_node): Add need_bounds_init field. (cgraph_local_p): New. * cgraph.c: Include tree-chkp.h. (cgraph_node::remove): Fix instrumented_version of the referenced node if any. (cgraph_node::dump): Dump instrumentation_clone and instrumented_version fields. (cgraph_node::verify_node): Check correctness of IPA_REF_CHKP references and instrumentation thunks. (cgraph_can_remove_if_no_direct_calls_and_refs_p): Keep all not instrumented instrumentation clones alive. (cgraph_redirect_edge_call_stmt_to_callee): Support returned bounds. * cgraphbuild.c (rebuild_cgraph_edges): Rebuild IPA_REF_CHKP reference. (cgraph_rebuild_references): Likewise. * cgraphunit.c: Include tree-chkp.h. (assemble_thunks_and_aliases): Skip thunks calling instrumneted function version. (varpool_finalize_decl): Register statically initialized decls in Pointer Bounds Checker. (walk_polymorphic_call_targets): Do not mark generated call to __builtin_unreachable as with_bounds. (output_weakrefs): If there are both instrumented and original versions, output only one of them. (cgraph_node::expand_thunk): Set with_bounds flag for created call statement. * ipa-ref.h (ipa_ref_use): Add IPA_REF_CHKP. (ipa_ref): increase size of use field. * symtab.c (ipa_ref_use_name): Add element for IPA_REF_CHKP. * varpool.c (dump_varpool_node): Dump need_bounds_init field. (ctor_for_folding): Do not fold constant bounds vars. * lto-streamer.h (LTO_minor_version): Change minor version from 0 to 1. * lto-cgraph.c (compute_ltrans_boundary): Keep initial values for pointer bounds. (lto_output_node): Output instrumentation_clone, thunk.add_pointer_bounds_args and orig_decl field. (lto_output_ref): Adjust to new ipa_ref::use field size. (input_overwrite_node): Read instrumentation_clone field. (input_node): Read thunk.add_pointer_bounds_args and orig_decl fields. (input_ref): Adjust to new ipa_ref::use field size. (input_cgraph_1): Compute instrumented_version fields and restore IDENTIFIER_TRANSPARENT_ALIAS chains. (lto_output_varpool_node): Output need_bounds_init value. (input_varpool_node): Read need_bounds_init value. * lto-partition.c (add_symbol_to_partition_1): Keep original and instrumented versions together. (privatize_symbol_name): Restore transparent alias chain if required. (add_references_to_partition): Add references to pointer bounds vars. * dbxout.c (dbxout_type): Ignore POINTER_BOUNDS_TYPE. * dwarf2out.c (gen_subprogram_die): Ignore bound args. (gen_type_die_with_usage): Skip pointer bounds. (dwarf2out_global_decl): Likewise. (is_base_type): Support POINTER_BOUNDS_TYPE. (gen_formal_types_die): Skip pointer bounds. (gen_decl_die): Likewise. * var-tracking.c (vt_add_function_parameters): Skip bounds parameters. * ipa-icf.c (sem_function::merge): Do not merge when instrumentation thunk still exists. (sem_variable::merge): Reset need_bounds_init flag. * doc/extend.texi: Document Pointer Bounds Checker built-in functions and attributes. * doc/tm.texi.in (TARGET_LOAD_BOUNDS_FOR_ARG): New. (TARGET_STORE_BOUNDS_FOR_ARG): New. (TARGET_LOAD_RETURNED_BOUNDS): New. (TARGET_STORE_RETURNED_BOUNDS): New. (TARGET_CHKP_FUNCTION_VALUE_BOUNDS): New. (TARGET_SETUP_INCOMING_VARARG_BOUNDS): New. (TARGET_BUILTIN_CHKP_FUNCTION): New. (TARGET_CHKP_BOUND_TYPE): New. (TARGET_CHKP_BOUND_MODE): New. (TARGET_CHKP_MAKE_BOUNDS_CONSTANT): New. (TARGET_CHKP_INITIALIZE_BOUNDS): New. * doc/tm.texi: Regenerated. * doc/rtl.texi (MODE_POINTER_BOUNDS): New. (BND32mode): New. (BND64mode): New. * doc/invoke.texi (-mmpx): New. (-mno-mpx): New. (chkp-max-ctor-size): New. * config/i386/constraints.md (w): New. (Ti): New. (Tb): New. * config/i386/i386-c.c (ix86_target_macros_internal): Add __MPX__. * config/i386/i386-modes.def (BND32): New. (BND64): New. * config/i386/i386-protos.h (ix86_bnd_prefixed_insn_p): New. * config/i386/i386.c: Include tree-chkp.h, rtl-chkp.h, tree-iterator.h. (regclass_map): Add bound registers. (dbx_register_map): Likewise. (dbx64_register_map): Likewise. (svr4_dbx_register_map): Likewise. (isa_opts): Add -mmpx. (PTA_MPX): New. (ix86_option_override_internal): Support MPX ISA. (ix86_conditional_register_usage): Support bound registers. (ix86_code_end): Add MPX bnd prefix. (output_set_got): Likewise. (print_reg): Avoid prefixes for bound registers. (ix86_print_operand): Add '!' (MPX bnd) print prefix support. (ix86_print_operand_punct_valid_p): Likewise. (ix86_print_operand_address): Support UNSPEC_BNDMK_ADDR and UNSPEC_BNDLDX_ADDR. (ix86_output_call_insn): Add MPX bnd prefix to branch instructions. (ix86_class_likely_spilled_p): Add bound regs support. (ix86_hard_regno_mode_ok): Likewise. (x86_order_regs_for_local_alloc): Likewise. (ix86_bnd_prefixed_insn_p): New. (ix86_builtins): Add IX86_BUILTIN_BNDMK, IX86_BUILTIN_BNDSTX, IX86_BUILTIN_BNDLDX, IX86_BUILTIN_BNDCL, IX86_BUILTIN_BNDCU, IX86_BUILTIN_BNDRET, IX86_BUILTIN_BNDNARROW, IX86_BUILTIN_BNDINT, IX86_BUILTIN_SIZEOF, IX86_BUILTIN_BNDLOWER, IX86_BUILTIN_BNDUPPER. (builtin_isa): Add leaf_p and nothrow_p fields. (def_builtin): Initialize leaf_p and nothrow_p. (ix86_add_new_builtins): Handle leaf_p and nothrow_p flags. (bdesc_mpx): New. (bdesc_mpx_const): New. (ix86_init_mpx_builtins): New. (ix86_init_builtins): Call ix86_init_mpx_builtins. (ix86_emit_cmove): New. (ix86_emit_move_max): New. (ix86_expand_builtin): Expand IX86_BUILTIN_BNDMK, IX86_BUILTIN_BNDSTX, IX86_BUILTIN_BNDLDX, IX86_BUILTIN_BNDCL, IX86_BUILTIN_BNDCU, IX86_BUILTIN_BNDRET, IX86_BUILTIN_BNDNARROW, IX86_BUILTIN_BNDINT, IX86_BUILTIN_SIZEOF, IX86_BUILTIN_BNDLOWER, IX86_BUILTIN_BNDUPPER. (ix86_function_value_bounds): New. (ix86_builtin_mpx_function): New. (ix86_get_arg_address_for_bt): New. (ix86_load_bounds): New. (ix86_store_bounds): New. (ix86_load_returned_bounds): New. (ix86_store_returned_bounds): New. (ix86_mpx_bound_mode): New. (ix86_make_bounds_constant): New. (ix86_initialize_bounds): (TARGET_LOAD_BOUNDS_FOR_ARG): New. (TARGET_STORE_BOUNDS_FOR_ARG): New. (TARGET_LOAD_RETURNED_BOUNDS): New. (TARGET_STORE_RETURNED_BOUNDS): New. (TARGET_CHKP_BOUND_MODE): New. (TARGET_BUILTIN_CHKP_FUNCTION): New. (TARGET_CHKP_FUNCTION_VALUE_BOUNDS): New. (TARGET_CHKP_MAKE_BOUNDS_CONSTANT): New. (TARGET_CHKP_INITIALIZE_BOUNDS): New. (ix86_option_override_internal): Do not support x32 with MPX. (init_cumulative_args): Init stdarg, bnd_regno, bnds_in_bt and force_bnd_pass. (function_arg_advance_32): Return number of used integer registers. (function_arg_advance_64): Likewise. (function_arg_advance_ms_64): Likewise. (ix86_function_arg_advance): Handle pointer bounds. (ix86_function_arg): Likewise. (ix86_function_value_regno_p): Mark fisrt bounds registers as possible function value. (ix86_function_value_1): Handle pointer bounds type/mode (ix86_return_in_memory): Likewise. (ix86_print_operand): Analyse insn to decide abounf "bnd" prefix. (ix86_expand_call): Generate returned bounds. (ix86_setup_incoming_vararg_bounds): New. (ix86_va_start): Initialize bounds for pointers in va_list. (TARGET_SETUP_INCOMING_VARARG_BOUNDS): New. * config/i386/i386.h (TARGET_MPX): New. (TARGET_MPX_P): New. (FIRST_PSEUDO_REGISTER): Fix to new value. (FIXED_REGISTERS): Add bound registers. (CALL_USED_REGISTERS): Likewise. (REG_ALLOC_ORDER): Likewise. (HARD_REGNO_NREGS): Likewise. (VALID_BND_REG_MODE): New. (FIRST_BND_REG): New. (LAST_BND_REG): New. (reg_class): Add BND_REGS. (REG_CLASS_NAMES): Likewise. (REG_CLASS_CONTENTS): Likewise. (BND_REGNO_P): New. (ANY_BND_REG_P): New. (BNDmode): New. (HI_REGISTER_NAMES): Add bound registers. (ix86_args): Add bnd_regno, bnds_in_bt, force_bnd_pass and stdarg fields. * config/i386/i386.md (UNSPEC_BNDMK): New. (UNSPEC_BNDMK_ADDR): New. (UNSPEC_BNDSTX): New. (UNSPEC_BNDLDX): New. (UNSPEC_BNDLDX_ADDR): New. (UNSPEC_BNDCL): New. (UNSPEC_BNDCU): New. (UNSPEC_BNDCN): New. (UNSPEC_MPX_FENCE): New. (UNSPEC_SIZEOF): New. (BND0_REG): New. (BND1_REG): New. (type): Add mpxmov, mpxmk, mpxchk, mpxld, mpxst. (length_immediate): Support mpxmov, mpxmk, mpxchk, mpxld, mpxst. (prefix_rep): Check for bnd prefix. (prefix_0f): Support mpxmov, mpxmk, mpxchk, mpxld, mpxst. (length_nobnd): New. (length): Use length_nobnd when specified. (memory): Support mpxmov, mpxmk, mpxchk, mpxld, mpxst. (BND): New. (bnd_ptr): New. (BNDCHECK): New. (bndcheck): New. (*jcc_1): Add MPX bnd prefix. (*jcc_2): Likewise. (jump): Likewise. (*indirect_jump): Likewise. (*tablejump_1): Likewise. (simple_return_internal): Likewise. (simple_return_internal_long): Likewise. (simple_return_pop_internal): Likewise. (simple_return_indirect_internal): Likewise. (<mode>_mk): New. (*<mode>_mk): New. (mov<mode>): New. (*mov<mode>_internal_mpx): New. (<mode>_<bndcheck>): New. (*<mode>_<bndcheck>): New. (<mode>_ldx): New. (*<mode>_ldx): New. (<mode>_stx): New. (*<mode>_stx): New. move_size_reloc_<mode>): New. * config/i386/predicates.md (address_mpx_no_base_operand): New. (address_mpx_no_index_operand): New. (bnd_mem_operator): New. (symbol_operand): New. (x86_64_immediate_size_operand): New. * config/i386/i386.opt (mmpx): New. * config/i386/i386-builtin-types.def (BND): New. (ULONG): New. (BND_FTYPE_PCVOID_ULONG): New. (VOID_FTYPE_BND_PCVOID): New. (VOID_FTYPE_PCVOID_PCVOID_BND): New. (BND_FTYPE_PCVOID_PCVOID): New. (BND_FTYPE_PCVOID): New. (BND_FTYPE_BND_BND): New. (PVOID_FTYPE_PVOID_PVOID_ULONG): New. (PVOID_FTYPE_PCVOID_BND_ULONG): New. (ULONG_FTYPE_VOID): New. (PVOID_FTYPE_BND): New. gcc/testsuite/ 2014-11-05 Ilya Enkovich <ilya.enkovich@intel.com> * gcc.target/i386/chkp-builtins-1.c: New. * gcc.target/i386/chkp-builtins-2.c: New. * gcc.target/i386/chkp-builtins-3.c: New. * gcc.target/i386/chkp-builtins-4.c: New. * gcc.target/i386/chkp-remove-bndint-1.c: New. * gcc.target/i386/chkp-remove-bndint-2.c: New. * gcc.target/i386/chkp-const-check-1.c: New. * gcc.target/i386/chkp-const-check-2.c: New. * gcc.target/i386/chkp-lifetime-1.c: New. * gcc.dg/pr37858.c: Replace early_local_cleanups pass name with build_ssa_passes. From-SVN: r217125
2014-11-05 13:42:03 +01:00
NEXT_PASS (pass_cleanup_eh);
NEXT_PASS (pass_profile);
NEXT_PASS (pass_local_pure_const);
Enable pure-const discovery in modref. We newly can handle some extra cases, for example: struct a {int a,b,c;}; __attribute__ ((noinline)) int init (struct a *a) { a->a=1; a->b=2; a->c=3; } int const_fn () { struct a a; init (&a); return a.a + a.b + a.c; } Here pure/const stops on the fact that const_fn calls non-const init, while modref knows that the memory it initializes is local to const_fn. I ended up reordering passes so early modref is done after early pure-const mostly to avoid need to change testsuite which greps for const functions being detects in pure-const. Stil some testuiste compensation is needed. gcc/ChangeLog: 2021-11-11 Jan Hubicka <hubicka@ucw.cz> * ipa-modref.c (analyze_function): Do pure/const discovery, return true on success. (pass_modref::execute): If pure/const is discovered fixup cfg. (ignore_edge): Do not ignore pure/const edges. (modref_propagate_in_scc): Do pure/const discovery, return true if cdtor was promoted pure/const. (pass_ipa_modref::execute): If needed remove unreachable functions. * ipa-pure-const.c (warn_function_noreturn): Fix whitespace. (warn_function_cold): Likewise. (skip_function_for_local_pure_const): Move earlier. (ipa_make_function_const): Break out from ... (ipa_make_function_pure): Break out from ... (propagate_pure_const): ... here. (pass_local_pure_const::execute): Use it. * ipa-utils.h (ipa_make_function_const): Declare. (ipa_make_function_pure): Declare. * passes.def: Move early modref after pure-const. gcc/testsuite/ChangeLog: 2021-11-11 Jan Hubicka <hubicka@ucw.cz> * c-c++-common/tm/inline-asm.c: Disable pure-const. * g++.dg/ipa/modref-1.C: Update template. * gcc.dg/tree-ssa/modref-11.c: Disable pure-const. * gcc.dg/tree-ssa/modref-14.c: New test. * gcc.dg/tree-ssa/modref-8.c: Do not optimize sibling calls. * gfortran.dg/do_subscript_3.f90: Add -O0.
2021-11-11 18:14:45 +01:00
NEXT_PASS (pass_modref);
/* Split functions creates parts that are not run through
early optimizations again. It is thus good idea to do this
ipa-chkp.c: New. gcc/ 2014-11-05 Ilya Enkovich <ilya.enkovich@intel.com> * ipa-chkp.c: New. * ipa-chkp.h: New. * tree-chkp.c: New. * tree-chkp.h: New. * tree-chkp-opt.c: New. * rtl-chkp.c: New. * rtl-chkp.h: New. * Makefile.in (OBJS): Add ipa-chkp.o, rtl-chkp.o, tree-chkp.o tree-chkp-opt.o. (GTFILES): Add tree-chkp.c. * mode-classes.def (MODE_POINTER_BOUNDS): New. * tree.def (POINTER_BOUNDS_TYPE): New. * genmodes.c (complete_mode): Support MODE_POINTER_BOUNDS. (POINTER_BOUNDS_MODE): New. (make_pointer_bounds_mode): New. * machmode.h (POINTER_BOUNDS_MODE_P): New. * stor-layout.c (int_mode_for_mode): Support MODE_POINTER_BOUNDS. (layout_type): Support POINTER_BOUNDS_TYPE. * tree-pretty-print.c (dump_generic_node): Support POINTER_BOUNDS_TYPE. * tree-core.h (tree_index): Add TI_POINTER_BOUNDS_TYPE. * tree.c (build_int_cst_wide): Support POINTER_BOUNDS_TYPE. (type_contains_placeholder_1): Likewise. (build_common_tree_nodes): Initialize pointer_bounds_type_node. * tree.h (POINTER_BOUNDS_TYPE_P): New. (pointer_bounds_type_node): New. (POINTER_BOUNDS_P): New. (BOUNDED_TYPE_P): New. (BOUNDED_P): New. (CALL_WITH_BOUNDS_P): New. * gimple.h (gf_mask): Add GF_CALL_WITH_BOUNDS. (gimple_call_with_bounds_p): New. (gimple_call_set_with_bounds): New. (gimple_return_retbnd): New. (gimple_return_set_retbnd): New * gimple.c (gimple_build_return): Increase number of ops for return statement. (gimple_build_call_from_tree): Propagate CALL_WITH_BOUNDS_P flag. * gimple-pretty-print.c (dump_gimple_return): Print second op. * rtl.h (CALL_EXPR_WITH_BOUNDS_P): New. * gimplify.c (gimplify_init_constructor): Avoid infinite loop during gimplification of bounds initializer. * calls.c: Include tree-chkp.h, rtl-chkp.h, bitmap.h. (special_function_p): Use original decl name when analyzing instrumentation clone. (arg_data): Add fields special_slot, pointer_arg and pointer_offset. (store_bounds): New. (emit_call_1): Propagate instrumentation flag for CALL. (initialize_argument_information): Compute pointer_arg, pointer_offset and special_slot for pointer bounds arguments. (finalize_must_preallocate): Preallocate when storing bounds in bounds table. (compute_argument_addresses): Skip pointer bounds. (expand_call): Store bounds into tables separately. Return result joined with resulting bounds. * cfgexpand.c: Include tree-chkp.h, rtl-chkp.h. (expand_call_stmt): Propagate bounds flag for CALL_EXPR. (expand_return): Add returned bounds arg. Handle returned bounds. (expand_gimple_stmt_1): Adjust to new expand_return signature. (gimple_expand_cfg): Reset rtx bounds map. * expr.c: Include tree-chkp.h, rtl-chkp.h. (expand_assignment): Handle returned bounds. (store_expr_with_bounds): New. Replaces store_expr with new bounds target argument. Handle bounds returned by calls. (store_expr): Now wraps store_expr_with_bounds. * expr.h (store_expr_with_bounds): New. * function.c: Include tree-chkp.h, rtl-chkp.h. (bounds_parm_data): New. (use_register_for_decl): Do not registerize decls used for bounds stores and loads. (assign_parms_augmented_arg_list): Add bounds of the result structure pointer as the second argument. (assign_parm_find_entry_rtl): Mark bounds are never passed on the stack. (assign_parm_is_stack_parm): Likewise. (assign_parm_load_bounds): New. (assign_bounds): New. (assign_parms): Load bounds and determine a location for returned bounds. (diddle_return_value_1): New. (diddle_return_value): Handle returned bounds. * function.h (rtl_data): Add field for returned bounds. * varasm.c: Include tree-chkp.h. (output_constant): Support POINTER_BOUNDS_TYPE. (output_constant_pool_2): Support MODE_POINTER_BOUNDS. (ultimate_transparent_alias_target): Move up. (make_decl_rtl): For instrumented function use name of the original decl. (assemble_start_function): Mark function as global in case it is instrumentation clone of the global function. (do_assemble_alias): Follow transparent alias chain for identifier. Check if original alias is public. (maybe_assemble_visibility): Use visibility of the original function for instrumented version. (default_unique_section): Likewise. * emit-rtl.c (immed_double_const): Support MODE_POINTER_BOUNDS. (init_emit_once): Build pointer bounds zero constants. * explow.c (trunc_int_for_mode): Support MODE_POINTER_BOUNDS. * target.def (builtin_chkp_function): New. (chkp_bound_type): New. (chkp_bound_mode): New. (chkp_make_bounds_constant): New. (chkp_initialize_bounds): New. (load_bounds_for_arg): New. (store_bounds_for_arg): New. (load_returned_bounds): New. (store_returned_bounds): New. (chkp_function_value_bounds): New. (setup_incoming_vararg_bounds): New. (function_arg): Update hook description with new possible return value CONST_INT. * targhooks.h (default_load_bounds_for_arg): New. (default_store_bounds_for_arg): New. (default_load_returned_bounds): New. (default_store_returned_bounds): New. (default_chkp_bound_type): New. (default_chkp_bound_mode): New. (default_builtin_chkp_function): New. (default_chkp_function_value_bounds): New. (default_chkp_make_bounds_constant): New. (default_chkp_initialize_bounds): New. (default_setup_incoming_vararg_bounds): New. * targhooks.c (default_load_bounds_for_arg): New. (default_store_bounds_for_arg): New. (default_load_returned_bounds): New. (default_store_returned_bounds): New. (default_chkp_bound_type): New. (default_chkp_bound_mode); New. (default_builtin_chkp_function): New. (default_chkp_function_value_bounds): New. (default_chkp_make_bounds_constant): New. (default_chkp_initialize_bounds): New. (default_setup_incoming_vararg_bounds): New. * builtin-types.def (BT_BND): New. (BT_FN_PTR_CONST_PTR): New. (BT_FN_CONST_PTR_CONST_PTR): New. (BT_FN_BND_CONST_PTR): New. (BT_FN_CONST_PTR_BND): New. (BT_FN_PTR_CONST_PTR_SIZE): New. (BT_FN_PTR_CONST_PTR_CONST_PTR): New. (BT_FN_VOID_PTRPTR_CONST_PTR): New. (BT_FN_VOID_CONST_PTR_SIZE): New. (BT_FN_VOID_PTR_BND): New. (BT_FN_CONST_PTR_CONST_PTR_CONST_PTR): New. (BT_FN_BND_CONST_PTR_SIZE): New. (BT_FN_PTR_CONST_PTR_CONST_PTR_SIZE): New. (BT_FN_VOID_CONST_PTR_BND_CONST_PTR): New. * chkp-builtins.def: New. * builtins.def: include chkp-builtins.def. (DEF_CHKP_BUILTIN): New. * builtins.c: Include tree-chkp.h and rtl-chkp.h. (expand_builtin): Support BUILT_IN_CHKP_INIT_PTR_BOUNDS, BUILT_IN_CHKP_NULL_PTR_BOUNDS, BUILT_IN_CHKP_COPY_PTR_BOUNDS, BUILT_IN_CHKP_CHECK_PTR_LBOUNDS, BUILT_IN_CHKP_CHECK_PTR_UBOUNDS, BUILT_IN_CHKP_CHECK_PTR_BOUNDS, BUILT_IN_CHKP_SET_PTR_BOUNDS, BUILT_IN_CHKP_NARROW_PTR_BOUNDS, BUILT_IN_CHKP_STORE_PTR_BOUNDS, BUILT_IN_CHKP_GET_PTR_LBOUND, BUILT_IN_CHKP_GET_PTR_UBOUND, BUILT_IN_CHKP_BNDMK, BUILT_IN_CHKP_BNDSTX, BUILT_IN_CHKP_BNDCL, BUILT_IN_CHKP_BNDCU, BUILT_IN_CHKP_BNDLDX, BUILT_IN_CHKP_BNDRET, BUILT_IN_CHKP_INTERSECT, BUILT_IN_CHKP_NARROW, BUILT_IN_CHKP_EXTRACT_LOWER, BUILT_IN_CHKP_EXTRACT_UPPER. (std_expand_builtin_va_start): Init bounds for va_list. * cppbuiltin.c (define_builtin_macros_for_compilation_flags): Add __CHKP__ macro when Pointer Bounds Checker is on. * params.def (PARAM_CHKP_MAX_CTOR_SIZE): New. * passes.def (pass_ipa_chkp_versioning): New. (pass_early_local_passes): Renamed to pass_build_ssa_passes. (pass_fixup_cfg): Moved to pass_chkp_instrumentation_passes. (pass_chkp_instrumentation_passes): New. (pass_ipa_chkp_produce_thunks): New. (pass_local_optimization_passes): New. (pass_chkp_opt): New. * tree-pass.h (make_pass_ipa_chkp_versioning): New. (make_pass_ipa_chkp_produce_thunks): New. (make_pass_chkp): New. (make_pass_chkp_opt): New. (make_pass_early_local_passes): Renamed to ... (make_pass_build_ssa_passes): This. (make_pass_chkp_instrumentation_passes): New. (make_pass_local_optimization_passes): New. * passes.c (pass_manager::execute_early_local_passes): Execute early passes in three steps. (execute_all_early_local_passes): Renamed to ... (execute_build_ssa_passes): This. (pass_data_early_local_passes): Renamed to ... (pass_data_build_ssa_passes): This. (pass_early_local_passes): Renamed to ... (pass_build_ssa_passes): This. (pass_data_chkp_instrumentation_passes): New. (pass_chkp_instrumentation_passes): New. (pass_data_local_optimization_passes): New. (pass_local_optimization_passes): New. (make_pass_early_local_passes): Renamed to ... (make_pass_build_ssa_passes): This. (make_pass_chkp_instrumentation_passes): New. (make_pass_local_optimization_passes): New. * c-family/c.opt (fcheck-pointer-bounds): New. (fchkp-check-incomplete-type): New. (fchkp-zero-input-bounds-for-main): New. (fchkp-first-field-has-own-bounds): New. (fchkp-narrow-bounds): New. (fchkp-narrow-to-innermost-array): New. (fchkp-optimize): New. (fchkp-use-fast-string-functions): New. (fchkp-use-nochk-string-functions): New. (fchkp-use-static-bounds): New. (fchkp-use-static-const-bounds): New. (fchkp-treat-zero-dynamic-size-as-infinite): New. (fchkp-check-read): New. (fchkp-check-write): New. (fchkp-store-bounds): New. (fchkp-instrument-calls): New. (fchkp-instrument-marked-only): New. (Wchkp): New. * c-family/c-common.c (handle_bnd_variable_size_attribute): New. (handle_bnd_legacy): New. (handle_bnd_instrument): New. (c_common_attribute_table): Add bnd_variable_size, bnd_legacy and bnd_instrument. Fix documentation. (c_common_format_attribute_table): Likewsie. * toplev.c: include tree-chkp.h. (process_options): Check Pointer Bounds Checker is supported. (compile_file): Add chkp_finish_file call. * ipa-cp.c (initialize_node_lattices): Use cgraph_local_p to handle instrumentation clones properly. (propagate_constants_accross_call): Do not propagate through instrumentation thunks. * ipa-pure-const.c (propagate_pure_const): Support IPA_REF_CHKP. * ipa-inline.c (early_inliner): Check edge has summary allocated. * ipa-split.c: Include tree-chkp.h. (find_retbnd): New. (split_part_set_ssa_name_p): New. (consider_split): Do not split retbnd and retval producers. (insert_bndret_call_after): new. (split_function): Propagate Pointer Bounds Checker instrumentation marks and handle returned bounds. * tree-ssa-sccvn.h (vn_reference_op_struct): Transform opcode into bit field and add with_bounds field. * tree-ssa-sccvn.c (copy_reference_ops_from_call): Set with_bounds field for instrumented calls. * tree-ssa-pre.c (create_component_ref_by_pieces_1): Restore CALL_WITH_BOUNDS_P flag for calls. * tree-ssa-ccp.c: Include tree-chkp.h. (insert_clobber_before_stack_restore): Handle BUILT_IN_CHKP_BNDRET calls. * tree-ssa-dce.c: Include tree-chkp.h. (propagate_necessity): For free call fed by alloc check bounds are also provided by the same alloc. (eliminate_unnecessary_stmts): Handle BUILT_IN_CHKP_BNDRET used by free calls. * tree-inline.c: Include tree-chkp.h. (declare_return_variable): Add arg holding returned bounds slot. Create and initialize returned bounds var. (remap_gimple_stmt): Handle returned bounds. Return sequence of statements instead of a single statement. (insert_init_stmt): Add declaration. (remap_gimple_seq): Adjust to new remap_gimple_stmt signature. (copy_bb): Adjust to changed return type of remap_gimple_stmt. Properly handle bounds in va_arg_pack and va_arg_pack_len. (expand_call_inline): Handle returned bounds. Add bounds copy for generated mem to mem assignments. * tree-inline.h (copy_body_data): Add fields retbnd and assign_stmts. * value-prof.c: Include tree-chkp.h. (gimple_ic): Support returned bounds. * ipa.c (cgraph_build_static_cdtor_1): Support contructors with "chkp ctor" and "bnd_legacy" attributes. (symtab_remove_unreachable_nodes): Keep initial values for pointer bounds to be used for checks eliminations. (process_references): Handle IPA_REF_CHKP. (walk_polymorphic_call_targets): Likewise. * ipa-visibility.c (cgraph_externally_visible_p): Mark instrumented 'main' as externally visible. (function_and_variable_visibility): Filter instrumentation thunks. * cgraph.h (cgraph_thunk_info): Add add_pointer_bounds_args field. (cgraph_node): Add instrumented_version, orig_decl and instrumentation_clone fields. (symtab_node::get_alias_target): Allow IPA_REF_CHKP reference. (varpool_node): Add need_bounds_init field. (cgraph_local_p): New. * cgraph.c: Include tree-chkp.h. (cgraph_node::remove): Fix instrumented_version of the referenced node if any. (cgraph_node::dump): Dump instrumentation_clone and instrumented_version fields. (cgraph_node::verify_node): Check correctness of IPA_REF_CHKP references and instrumentation thunks. (cgraph_can_remove_if_no_direct_calls_and_refs_p): Keep all not instrumented instrumentation clones alive. (cgraph_redirect_edge_call_stmt_to_callee): Support returned bounds. * cgraphbuild.c (rebuild_cgraph_edges): Rebuild IPA_REF_CHKP reference. (cgraph_rebuild_references): Likewise. * cgraphunit.c: Include tree-chkp.h. (assemble_thunks_and_aliases): Skip thunks calling instrumneted function version. (varpool_finalize_decl): Register statically initialized decls in Pointer Bounds Checker. (walk_polymorphic_call_targets): Do not mark generated call to __builtin_unreachable as with_bounds. (output_weakrefs): If there are both instrumented and original versions, output only one of them. (cgraph_node::expand_thunk): Set with_bounds flag for created call statement. * ipa-ref.h (ipa_ref_use): Add IPA_REF_CHKP. (ipa_ref): increase size of use field. * symtab.c (ipa_ref_use_name): Add element for IPA_REF_CHKP. * varpool.c (dump_varpool_node): Dump need_bounds_init field. (ctor_for_folding): Do not fold constant bounds vars. * lto-streamer.h (LTO_minor_version): Change minor version from 0 to 1. * lto-cgraph.c (compute_ltrans_boundary): Keep initial values for pointer bounds. (lto_output_node): Output instrumentation_clone, thunk.add_pointer_bounds_args and orig_decl field. (lto_output_ref): Adjust to new ipa_ref::use field size. (input_overwrite_node): Read instrumentation_clone field. (input_node): Read thunk.add_pointer_bounds_args and orig_decl fields. (input_ref): Adjust to new ipa_ref::use field size. (input_cgraph_1): Compute instrumented_version fields and restore IDENTIFIER_TRANSPARENT_ALIAS chains. (lto_output_varpool_node): Output need_bounds_init value. (input_varpool_node): Read need_bounds_init value. * lto-partition.c (add_symbol_to_partition_1): Keep original and instrumented versions together. (privatize_symbol_name): Restore transparent alias chain if required. (add_references_to_partition): Add references to pointer bounds vars. * dbxout.c (dbxout_type): Ignore POINTER_BOUNDS_TYPE. * dwarf2out.c (gen_subprogram_die): Ignore bound args. (gen_type_die_with_usage): Skip pointer bounds. (dwarf2out_global_decl): Likewise. (is_base_type): Support POINTER_BOUNDS_TYPE. (gen_formal_types_die): Skip pointer bounds. (gen_decl_die): Likewise. * var-tracking.c (vt_add_function_parameters): Skip bounds parameters. * ipa-icf.c (sem_function::merge): Do not merge when instrumentation thunk still exists. (sem_variable::merge): Reset need_bounds_init flag. * doc/extend.texi: Document Pointer Bounds Checker built-in functions and attributes. * doc/tm.texi.in (TARGET_LOAD_BOUNDS_FOR_ARG): New. (TARGET_STORE_BOUNDS_FOR_ARG): New. (TARGET_LOAD_RETURNED_BOUNDS): New. (TARGET_STORE_RETURNED_BOUNDS): New. (TARGET_CHKP_FUNCTION_VALUE_BOUNDS): New. (TARGET_SETUP_INCOMING_VARARG_BOUNDS): New. (TARGET_BUILTIN_CHKP_FUNCTION): New. (TARGET_CHKP_BOUND_TYPE): New. (TARGET_CHKP_BOUND_MODE): New. (TARGET_CHKP_MAKE_BOUNDS_CONSTANT): New. (TARGET_CHKP_INITIALIZE_BOUNDS): New. * doc/tm.texi: Regenerated. * doc/rtl.texi (MODE_POINTER_BOUNDS): New. (BND32mode): New. (BND64mode): New. * doc/invoke.texi (-mmpx): New. (-mno-mpx): New. (chkp-max-ctor-size): New. * config/i386/constraints.md (w): New. (Ti): New. (Tb): New. * config/i386/i386-c.c (ix86_target_macros_internal): Add __MPX__. * config/i386/i386-modes.def (BND32): New. (BND64): New. * config/i386/i386-protos.h (ix86_bnd_prefixed_insn_p): New. * config/i386/i386.c: Include tree-chkp.h, rtl-chkp.h, tree-iterator.h. (regclass_map): Add bound registers. (dbx_register_map): Likewise. (dbx64_register_map): Likewise. (svr4_dbx_register_map): Likewise. (isa_opts): Add -mmpx. (PTA_MPX): New. (ix86_option_override_internal): Support MPX ISA. (ix86_conditional_register_usage): Support bound registers. (ix86_code_end): Add MPX bnd prefix. (output_set_got): Likewise. (print_reg): Avoid prefixes for bound registers. (ix86_print_operand): Add '!' (MPX bnd) print prefix support. (ix86_print_operand_punct_valid_p): Likewise. (ix86_print_operand_address): Support UNSPEC_BNDMK_ADDR and UNSPEC_BNDLDX_ADDR. (ix86_output_call_insn): Add MPX bnd prefix to branch instructions. (ix86_class_likely_spilled_p): Add bound regs support. (ix86_hard_regno_mode_ok): Likewise. (x86_order_regs_for_local_alloc): Likewise. (ix86_bnd_prefixed_insn_p): New. (ix86_builtins): Add IX86_BUILTIN_BNDMK, IX86_BUILTIN_BNDSTX, IX86_BUILTIN_BNDLDX, IX86_BUILTIN_BNDCL, IX86_BUILTIN_BNDCU, IX86_BUILTIN_BNDRET, IX86_BUILTIN_BNDNARROW, IX86_BUILTIN_BNDINT, IX86_BUILTIN_SIZEOF, IX86_BUILTIN_BNDLOWER, IX86_BUILTIN_BNDUPPER. (builtin_isa): Add leaf_p and nothrow_p fields. (def_builtin): Initialize leaf_p and nothrow_p. (ix86_add_new_builtins): Handle leaf_p and nothrow_p flags. (bdesc_mpx): New. (bdesc_mpx_const): New. (ix86_init_mpx_builtins): New. (ix86_init_builtins): Call ix86_init_mpx_builtins. (ix86_emit_cmove): New. (ix86_emit_move_max): New. (ix86_expand_builtin): Expand IX86_BUILTIN_BNDMK, IX86_BUILTIN_BNDSTX, IX86_BUILTIN_BNDLDX, IX86_BUILTIN_BNDCL, IX86_BUILTIN_BNDCU, IX86_BUILTIN_BNDRET, IX86_BUILTIN_BNDNARROW, IX86_BUILTIN_BNDINT, IX86_BUILTIN_SIZEOF, IX86_BUILTIN_BNDLOWER, IX86_BUILTIN_BNDUPPER. (ix86_function_value_bounds): New. (ix86_builtin_mpx_function): New. (ix86_get_arg_address_for_bt): New. (ix86_load_bounds): New. (ix86_store_bounds): New. (ix86_load_returned_bounds): New. (ix86_store_returned_bounds): New. (ix86_mpx_bound_mode): New. (ix86_make_bounds_constant): New. (ix86_initialize_bounds): (TARGET_LOAD_BOUNDS_FOR_ARG): New. (TARGET_STORE_BOUNDS_FOR_ARG): New. (TARGET_LOAD_RETURNED_BOUNDS): New. (TARGET_STORE_RETURNED_BOUNDS): New. (TARGET_CHKP_BOUND_MODE): New. (TARGET_BUILTIN_CHKP_FUNCTION): New. (TARGET_CHKP_FUNCTION_VALUE_BOUNDS): New. (TARGET_CHKP_MAKE_BOUNDS_CONSTANT): New. (TARGET_CHKP_INITIALIZE_BOUNDS): New. (ix86_option_override_internal): Do not support x32 with MPX. (init_cumulative_args): Init stdarg, bnd_regno, bnds_in_bt and force_bnd_pass. (function_arg_advance_32): Return number of used integer registers. (function_arg_advance_64): Likewise. (function_arg_advance_ms_64): Likewise. (ix86_function_arg_advance): Handle pointer bounds. (ix86_function_arg): Likewise. (ix86_function_value_regno_p): Mark fisrt bounds registers as possible function value. (ix86_function_value_1): Handle pointer bounds type/mode (ix86_return_in_memory): Likewise. (ix86_print_operand): Analyse insn to decide abounf "bnd" prefix. (ix86_expand_call): Generate returned bounds. (ix86_setup_incoming_vararg_bounds): New. (ix86_va_start): Initialize bounds for pointers in va_list. (TARGET_SETUP_INCOMING_VARARG_BOUNDS): New. * config/i386/i386.h (TARGET_MPX): New. (TARGET_MPX_P): New. (FIRST_PSEUDO_REGISTER): Fix to new value. (FIXED_REGISTERS): Add bound registers. (CALL_USED_REGISTERS): Likewise. (REG_ALLOC_ORDER): Likewise. (HARD_REGNO_NREGS): Likewise. (VALID_BND_REG_MODE): New. (FIRST_BND_REG): New. (LAST_BND_REG): New. (reg_class): Add BND_REGS. (REG_CLASS_NAMES): Likewise. (REG_CLASS_CONTENTS): Likewise. (BND_REGNO_P): New. (ANY_BND_REG_P): New. (BNDmode): New. (HI_REGISTER_NAMES): Add bound registers. (ix86_args): Add bnd_regno, bnds_in_bt, force_bnd_pass and stdarg fields. * config/i386/i386.md (UNSPEC_BNDMK): New. (UNSPEC_BNDMK_ADDR): New. (UNSPEC_BNDSTX): New. (UNSPEC_BNDLDX): New. (UNSPEC_BNDLDX_ADDR): New. (UNSPEC_BNDCL): New. (UNSPEC_BNDCU): New. (UNSPEC_BNDCN): New. (UNSPEC_MPX_FENCE): New. (UNSPEC_SIZEOF): New. (BND0_REG): New. (BND1_REG): New. (type): Add mpxmov, mpxmk, mpxchk, mpxld, mpxst. (length_immediate): Support mpxmov, mpxmk, mpxchk, mpxld, mpxst. (prefix_rep): Check for bnd prefix. (prefix_0f): Support mpxmov, mpxmk, mpxchk, mpxld, mpxst. (length_nobnd): New. (length): Use length_nobnd when specified. (memory): Support mpxmov, mpxmk, mpxchk, mpxld, mpxst. (BND): New. (bnd_ptr): New. (BNDCHECK): New. (bndcheck): New. (*jcc_1): Add MPX bnd prefix. (*jcc_2): Likewise. (jump): Likewise. (*indirect_jump): Likewise. (*tablejump_1): Likewise. (simple_return_internal): Likewise. (simple_return_internal_long): Likewise. (simple_return_pop_internal): Likewise. (simple_return_indirect_internal): Likewise. (<mode>_mk): New. (*<mode>_mk): New. (mov<mode>): New. (*mov<mode>_internal_mpx): New. (<mode>_<bndcheck>): New. (*<mode>_<bndcheck>): New. (<mode>_ldx): New. (*<mode>_ldx): New. (<mode>_stx): New. (*<mode>_stx): New. move_size_reloc_<mode>): New. * config/i386/predicates.md (address_mpx_no_base_operand): New. (address_mpx_no_index_operand): New. (bnd_mem_operator): New. (symbol_operand): New. (x86_64_immediate_size_operand): New. * config/i386/i386.opt (mmpx): New. * config/i386/i386-builtin-types.def (BND): New. (ULONG): New. (BND_FTYPE_PCVOID_ULONG): New. (VOID_FTYPE_BND_PCVOID): New. (VOID_FTYPE_PCVOID_PCVOID_BND): New. (BND_FTYPE_PCVOID_PCVOID): New. (BND_FTYPE_PCVOID): New. (BND_FTYPE_BND_BND): New. (PVOID_FTYPE_PVOID_PVOID_ULONG): New. (PVOID_FTYPE_PCVOID_BND_ULONG): New. (ULONG_FTYPE_VOID): New. (PVOID_FTYPE_BND): New. gcc/testsuite/ 2014-11-05 Ilya Enkovich <ilya.enkovich@intel.com> * gcc.target/i386/chkp-builtins-1.c: New. * gcc.target/i386/chkp-builtins-2.c: New. * gcc.target/i386/chkp-builtins-3.c: New. * gcc.target/i386/chkp-builtins-4.c: New. * gcc.target/i386/chkp-remove-bndint-1.c: New. * gcc.target/i386/chkp-remove-bndint-2.c: New. * gcc.target/i386/chkp-const-check-1.c: New. * gcc.target/i386/chkp-const-check-2.c: New. * gcc.target/i386/chkp-lifetime-1.c: New. * gcc.dg/pr37858.c: Replace early_local_cleanups pass name with build_ssa_passes. From-SVN: r217125
2014-11-05 13:42:03 +01:00
late. */
NEXT_PASS (pass_split_functions);
NEXT_PASS (pass_strip_predict_hints, true /* early_p */);
POP_INSERT_PASSES ()
NEXT_PASS (pass_release_ssa_names);
NEXT_PASS (pass_rebuild_cgraph_edges);
2017-05-23 18:20:53 +02:00
NEXT_PASS (pass_local_fn_summary);
POP_INSERT_PASSES ()
NEXT_PASS (pass_ipa_remove_symbols);
NEXT_PASS (pass_ipa_oacc);
PUSH_INSERT_PASSES_WITHIN (pass_ipa_oacc)
NEXT_PASS (pass_ipa_pta);
/* Pass group that runs when the function is an offloaded function
containing oacc kernels loops. */
NEXT_PASS (pass_ipa_oacc_kernels);
PUSH_INSERT_PASSES_WITHIN (pass_ipa_oacc_kernels)
NEXT_PASS (pass_oacc_kernels);
PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels)
NEXT_PASS (pass_ch);
NEXT_PASS (pass_fre, true /* may_iterate */);
/* We use pass_lim to rewrite in-memory iteration and reduction
variable accesses in loops into local variables accesses. */
NEXT_PASS (pass_lim);
NEXT_PASS (pass_dominator, false /* may_peel_loop_headers_p */);
NEXT_PASS (pass_dce);
Add pass_parallelize_loops to pass_oacc_kernels 2016-01-18 Tom de Vries <tom@codesourcery.com> * passes.def: Add pass_parallelize_loops to pass_oacc_kernels. * gcc.dg/autopar/outer-1.c: Update for new parloops instantiation. * gcc.dg/autopar/outer-2.c: Same. * gcc.dg/autopar/outer-3.c: Same. * gcc.dg/autopar/outer-4.c: Same. * gcc.dg/autopar/outer-5.c: Same. * gcc.dg/autopar/outer-6.c: Same. * gcc.dg/autopar/parallelization-1.c: Same. * gcc.dg/autopar/parloops-exit-first-loop-alt-2.c: Same. * gcc.dg/autopar/parloops-exit-first-loop-alt-3.c: Same. * gcc.dg/autopar/parloops-exit-first-loop-alt-4.c: Same. * gcc.dg/autopar/parloops-exit-first-loop-alt-5.c: Same. * gcc.dg/autopar/parloops-exit-first-loop-alt-6.c: Same. * gcc.dg/autopar/parloops-exit-first-loop-alt-7.c: Same. * gcc.dg/autopar/parloops-exit-first-loop-alt-pr66652.c: Same. * gcc.dg/autopar/parloops-exit-first-loop-alt.c: Same. * gcc.dg/autopar/pr39500-1.c: Same. * gcc.dg/autopar/pr39500-2.c: Same. * gcc.dg/autopar/pr46193.c: Same. * gcc.dg/autopar/pr46194.c: Same. * gcc.dg/autopar/pr49580.c: Same. * gcc.dg/autopar/pr49960-1.c: Same. * gcc.dg/autopar/pr49960.c: Same. * gcc.dg/autopar/pr68373.c: Same. * gcc.dg/autopar/reduc-1.c: Same. * gcc.dg/autopar/reduc-1char.c: Same. * gcc.dg/autopar/reduc-1short.c: Same. * gcc.dg/autopar/reduc-2.c: Same. * gcc.dg/autopar/reduc-2char.c: Same. * gcc.dg/autopar/reduc-2short.c: Same. * gcc.dg/autopar/reduc-3.c: Same. * gcc.dg/autopar/reduc-4.c: Same. * gcc.dg/autopar/reduc-6.c: Same. * gcc.dg/autopar/reduc-7.c: Same. * gcc.dg/autopar/reduc-8.c: Same. * gcc.dg/autopar/reduc-9.c: Same. * gcc.dg/autopar/uns-outer-4.c: Same. * gcc.dg/autopar/uns-outer-5.c: Same. * gcc.dg/autopar/uns-outer-6.c: Same. * gfortran.dg/parloops-exit-first-loop-alt-2.f95: Same. * gfortran.dg/parloops-exit-first-loop-alt.f95: Same. From-SVN: r232513
2016-01-18 13:52:42 +01:00
NEXT_PASS (pass_parallelize_loops, true /* oacc_kernels_p */);
NEXT_PASS (pass_expand_omp_ssa);
NEXT_PASS (pass_rebuild_cgraph_edges);
POP_INSERT_PASSES ()
POP_INSERT_PASSES ()
POP_INSERT_PASSES ()
NEXT_PASS (pass_target_clone);
Add AutoFDO. gcc/ChangeLog: 2014-10-21 Dehao Chen <dehao@google.com> * auto-profile.c: New file. * auto-profile.h: New file. * basic-block.h (maybe_hot_count_p): New export func. (add_working_set): New export func. * gcov-io.h (GCOV_TAG_AFDO_FILE_NAMES): New tag. (GCOV_TAG_AFDO_FUNCTION): Likewise. (GCOV_TAG_AFDO_WORKING_SET): Likewise. * opts.c (enable_fdo_optimizations): New func. (common_handle_option): Handle -fauto-profile flag. * ipa-inline.c (want_early_inline_function_p): Iterative-einline. (class pass_early_inline): Export early_inliner. (early_inliner): Likewise. (pass_early_inline::execute): Likewise. * ipa-inline.h (early_inliner): Likewise. * predict.c (maybe_hot_count_p): New export func. (counts_to_freqs): AutoFDO logic. (rebuild_frequencies): Likewise. * tree-profile.c (pass_ipa_tree_profile::gate): Likewise. * profile.c (add_working_set): New func. * Makefile.in (auto-profile.o): New object file. * passes.def (pass_ipa_auto_profile): New pass. * tree-ssa-live.c (remove_unused_scope_block_p): AutoFDO logic. * tree-pass.h (make_pass_ipa_auto_profile): New pass. * toplev.c (compile_file): AutoFDO logic. * doc/invoke.texi (-fauto-profile): New doc. * coverage.c (coverage_init): AutoFDO logic. * common.opt (-fauto-profile): New flag. * timevar.def (TV_IPA_AUTOFDO): New tag. * value-prof.c (gimple_alloc_histogram_value): New export func. (check_ic_target): Likewise. * value-prof.h (gimple_alloc_histogram_value): Likewise. (check_ic_target): Likewise. From-SVN: r216523
2014-10-21 19:59:30 +02:00
NEXT_PASS (pass_ipa_auto_profile);
NEXT_PASS (pass_ipa_tree_profile);
PUSH_INSERT_PASSES_WITHIN (pass_ipa_tree_profile)
NEXT_PASS (pass_feedback_split_functions);
POP_INSERT_PASSES ()
NEXT_PASS (pass_ipa_free_fn_summary, true /* small_p */);
NEXT_PASS (pass_ipa_increase_alignment);
NEXT_PASS (pass_ipa_tm);
NEXT_PASS (pass_ipa_lower_emutls);
TERMINATE_PASS_LIST (all_small_ipa_passes)
INSERT_PASSES_AFTER (all_regular_ipa_passes)
Initial commit of analyzer This patch adds a static analysis pass to the middle-end, focusing for this release on C code, and malloc/free issues in particular. See: https://gcc.gnu.org/wiki/DavidMalcolm/StaticAnalyzer gcc/ChangeLog: * Makefile.in (lang_opt_files): Add analyzer.opt. (ANALYZER_OBJS): New. (OBJS): Add digraph.o, graphviz.o, ordered-hash-map-tests.o, tristate.o and ANALYZER_OBJS. (TEXI_GCCINT_FILES): Add analyzer.texi. * common.opt (-fanalyzer): New driver option. * config.in: Regenerate. * configure: Regenerate. * configure.ac (--disable-analyzer, ENABLE_ANALYZER): New option. (gccdepdir): Also create depdir for "analyzer" subdir. * digraph.cc: New file. * digraph.h: New file. * doc/analyzer.texi: New file. * doc/gccint.texi ("Static Analyzer") New menu item. (analyzer.texi): Include it. * doc/invoke.texi ("Static Analyzer Options"): New list and new section. ("Warning Options"): Add static analysis warnings to the list. (-Wno-analyzer-double-fclose): New option. (-Wno-analyzer-double-free): New option. (-Wno-analyzer-exposure-through-output-file): New option. (-Wno-analyzer-file-leak): New option. (-Wno-analyzer-free-of-non-heap): New option. (-Wno-analyzer-malloc-leak): New option. (-Wno-analyzer-possible-null-argument): New option. (-Wno-analyzer-possible-null-dereference): New option. (-Wno-analyzer-null-argument): New option. (-Wno-analyzer-null-dereference): New option. (-Wno-analyzer-stale-setjmp-buffer): New option. (-Wno-analyzer-tainted-array-index): New option. (-Wno-analyzer-use-after-free): New option. (-Wno-analyzer-use-of-pointer-in-stale-stack-frame): New option. (-Wno-analyzer-use-of-uninitialized-value): New option. (-Wanalyzer-too-complex): New option. (-fanalyzer-call-summaries): New warning. (-fanalyzer-checker=): New warning. (-fanalyzer-fine-grained): New warning. (-fno-analyzer-state-merge): New warning. (-fno-analyzer-state-purge): New warning. (-fanalyzer-transitivity): New warning. (-fanalyzer-verbose-edges): New warning. (-fanalyzer-verbose-state-changes): New warning. (-fanalyzer-verbosity=): New warning. (-fdump-analyzer): New warning. (-fdump-analyzer-callgraph): New warning. (-fdump-analyzer-exploded-graph): New warning. (-fdump-analyzer-exploded-nodes): New warning. (-fdump-analyzer-exploded-nodes-2): New warning. (-fdump-analyzer-exploded-nodes-3): New warning. (-fdump-analyzer-supergraph): New warning. * doc/sourcebuild.texi (dg-require-dot): New. (dg-check-dot): New. * gdbinit.in (break-on-saved-diagnostic): New command. * graphviz.cc: New file. * graphviz.h: New file. * ordered-hash-map-tests.cc: New file. * ordered-hash-map.h: New file. * passes.def (pass_analyzer): Add before pass_ipa_whole_program_visibility. * selftest-run-tests.c (selftest::run_tests): Call selftest::ordered_hash_map_tests_cc_tests. * selftest.h (selftest::ordered_hash_map_tests_cc_tests): New decl. * shortest-paths.h: New file. * timevar.def (TV_ANALYZER): New timevar. (TV_ANALYZER_SUPERGRAPH): Likewise. (TV_ANALYZER_STATE_PURGE): Likewise. (TV_ANALYZER_PLAN): Likewise. (TV_ANALYZER_SCC): Likewise. (TV_ANALYZER_WORKLIST): Likewise. (TV_ANALYZER_DUMP): Likewise. (TV_ANALYZER_DIAGNOSTICS): Likewise. (TV_ANALYZER_SHORTEST_PATHS): Likewise. * tree-pass.h (make_pass_analyzer): New decl. * tristate.cc: New file. * tristate.h: New file. gcc/analyzer/ChangeLog: * ChangeLog: New file. * analyzer-selftests.cc: New file. * analyzer-selftests.h: New file. * analyzer.opt: New file. * analysis-plan.cc: New file. * analysis-plan.h: New file. * analyzer-logging.cc: New file. * analyzer-logging.h: New file. * analyzer-pass.cc: New file. * analyzer.cc: New file. * analyzer.h: New file. * call-string.cc: New file. * call-string.h: New file. * checker-path.cc: New file. * checker-path.h: New file. * constraint-manager.cc: New file. * constraint-manager.h: New file. * diagnostic-manager.cc: New file. * diagnostic-manager.h: New file. * engine.cc: New file. * engine.h: New file. * exploded-graph.h: New file. * pending-diagnostic.cc: New file. * pending-diagnostic.h: New file. * program-point.cc: New file. * program-point.h: New file. * program-state.cc: New file. * program-state.h: New file. * region-model.cc: New file. * region-model.h: New file. * sm-file.cc: New file. * sm-malloc.cc: New file. * sm-malloc.dot: New file. * sm-pattern-test.cc: New file. * sm-sensitive.cc: New file. * sm-signal.cc: New file. * sm-taint.cc: New file. * sm.cc: New file. * sm.h: New file. * state-purge.cc: New file. * state-purge.h: New file. * supergraph.cc: New file. * supergraph.h: New file. gcc/testsuite/ChangeLog: * gcc.dg/analyzer/CVE-2005-1689-minimal.c: New test. * gcc.dg/analyzer/abort.c: New test. * gcc.dg/analyzer/alloca-leak.c: New test. * gcc.dg/analyzer/analyzer-decls.h: New header. * gcc.dg/analyzer/analyzer-verbosity-0.c: New test. * gcc.dg/analyzer/analyzer-verbosity-1.c: New test. * gcc.dg/analyzer/analyzer-verbosity-2.c: New test. * gcc.dg/analyzer/analyzer.exp: New suite. * gcc.dg/analyzer/attribute-nonnull.c: New test. * gcc.dg/analyzer/call-summaries-1.c: New test. * gcc.dg/analyzer/conditionals-2.c: New test. * gcc.dg/analyzer/conditionals-3.c: New test. * gcc.dg/analyzer/conditionals-notrans.c: New test. * gcc.dg/analyzer/conditionals-trans.c: New test. * gcc.dg/analyzer/data-model-1.c: New test. * gcc.dg/analyzer/data-model-2.c: New test. * gcc.dg/analyzer/data-model-3.c: New test. * gcc.dg/analyzer/data-model-4.c: New test. * gcc.dg/analyzer/data-model-5.c: New test. * gcc.dg/analyzer/data-model-5b.c: New test. * gcc.dg/analyzer/data-model-5c.c: New test. * gcc.dg/analyzer/data-model-5d.c: New test. * gcc.dg/analyzer/data-model-6.c: New test. * gcc.dg/analyzer/data-model-7.c: New test. * gcc.dg/analyzer/data-model-8.c: New test. * gcc.dg/analyzer/data-model-9.c: New test. * gcc.dg/analyzer/data-model-11.c: New test. * gcc.dg/analyzer/data-model-12.c: New test. * gcc.dg/analyzer/data-model-13.c: New test. * gcc.dg/analyzer/data-model-14.c: New test. * gcc.dg/analyzer/data-model-15.c: New test. * gcc.dg/analyzer/data-model-16.c: New test. * gcc.dg/analyzer/data-model-17.c: New test. * gcc.dg/analyzer/data-model-18.c: New test. * gcc.dg/analyzer/data-model-19.c: New test. * gcc.dg/analyzer/data-model-path-1.c: New test. * gcc.dg/analyzer/disabling.c: New test. * gcc.dg/analyzer/dot-output.c: New test. * gcc.dg/analyzer/double-free-lto-1-a.c: New test. * gcc.dg/analyzer/double-free-lto-1-b.c: New test. * gcc.dg/analyzer/double-free-lto-1.h: New header. * gcc.dg/analyzer/equivalence.c: New test. * gcc.dg/analyzer/explode-1.c: New test. * gcc.dg/analyzer/explode-2.c: New test. * gcc.dg/analyzer/factorial.c: New test. * gcc.dg/analyzer/fibonacci.c: New test. * gcc.dg/analyzer/fields.c: New test. * gcc.dg/analyzer/file-1.c: New test. * gcc.dg/analyzer/file-2.c: New test. * gcc.dg/analyzer/function-ptr-1.c: New test. * gcc.dg/analyzer/function-ptr-2.c: New test. * gcc.dg/analyzer/function-ptr-3.c: New test. * gcc.dg/analyzer/gzio-2.c: New test. * gcc.dg/analyzer/gzio-3.c: New test. * gcc.dg/analyzer/gzio-3a.c: New test. * gcc.dg/analyzer/gzio.c: New test. * gcc.dg/analyzer/infinite-recursion.c: New test. * gcc.dg/analyzer/loop-2.c: New test. * gcc.dg/analyzer/loop-2a.c: New test. * gcc.dg/analyzer/loop-3.c: New test. * gcc.dg/analyzer/loop-4.c: New test. * gcc.dg/analyzer/loop.c: New test. * gcc.dg/analyzer/malloc-1.c: New test. * gcc.dg/analyzer/malloc-2.c: New test. * gcc.dg/analyzer/malloc-3.c: New test. * gcc.dg/analyzer/malloc-callbacks.c: New test. * gcc.dg/analyzer/malloc-dce.c: New test. * gcc.dg/analyzer/malloc-dedupe-1.c: New test. * gcc.dg/analyzer/malloc-ipa-1.c: New test. * gcc.dg/analyzer/malloc-ipa-10.c: New test. * gcc.dg/analyzer/malloc-ipa-11.c: New test. * gcc.dg/analyzer/malloc-ipa-12.c: New test. * gcc.dg/analyzer/malloc-ipa-13.c: New test. * gcc.dg/analyzer/malloc-ipa-2.c: New test. * gcc.dg/analyzer/malloc-ipa-3.c: New test. * gcc.dg/analyzer/malloc-ipa-4.c: New test. * gcc.dg/analyzer/malloc-ipa-5.c: New test. * gcc.dg/analyzer/malloc-ipa-6.c: New test. * gcc.dg/analyzer/malloc-ipa-7.c: New test. * gcc.dg/analyzer/malloc-ipa-8-double-free.c: New test. * gcc.dg/analyzer/malloc-ipa-8-lto-a.c: New test. * gcc.dg/analyzer/malloc-ipa-8-lto-b.c: New test. * gcc.dg/analyzer/malloc-ipa-8-lto-c.c: New test. * gcc.dg/analyzer/malloc-ipa-8-lto.h: New test. * gcc.dg/analyzer/malloc-ipa-8-unchecked.c: New test. * gcc.dg/analyzer/malloc-ipa-9.c: New test. * gcc.dg/analyzer/malloc-macro-inline-events.c: New test. * gcc.dg/analyzer/malloc-macro-separate-events.c: New test. * gcc.dg/analyzer/malloc-macro.h: New header. * gcc.dg/analyzer/malloc-many-paths-1.c: New test. * gcc.dg/analyzer/malloc-many-paths-2.c: New test. * gcc.dg/analyzer/malloc-many-paths-3.c: New test. * gcc.dg/analyzer/malloc-paths-1.c: New test. * gcc.dg/analyzer/malloc-paths-10.c: New test. * gcc.dg/analyzer/malloc-paths-2.c: New test. * gcc.dg/analyzer/malloc-paths-3.c: New test. * gcc.dg/analyzer/malloc-paths-4.c: New test. * gcc.dg/analyzer/malloc-paths-5.c: New test. * gcc.dg/analyzer/malloc-paths-6.c: New test. * gcc.dg/analyzer/malloc-paths-7.c: New test. * gcc.dg/analyzer/malloc-paths-8.c: New test. * gcc.dg/analyzer/malloc-paths-9.c: New test. * gcc.dg/analyzer/malloc-vs-local-1a.c: New test. * gcc.dg/analyzer/malloc-vs-local-1b.c: New test. * gcc.dg/analyzer/malloc-vs-local-2.c: New test. * gcc.dg/analyzer/malloc-vs-local-3.c: New test. * gcc.dg/analyzer/malloc-vs-local-4.c: New test. * gcc.dg/analyzer/operations.c: New test. * gcc.dg/analyzer/params-2.c: New test. * gcc.dg/analyzer/params.c: New test. * gcc.dg/analyzer/paths-1.c: New test. * gcc.dg/analyzer/paths-1a.c: New test. * gcc.dg/analyzer/paths-2.c: New test. * gcc.dg/analyzer/paths-3.c: New test. * gcc.dg/analyzer/paths-4.c: New test. * gcc.dg/analyzer/paths-5.c: New test. * gcc.dg/analyzer/paths-6.c: New test. * gcc.dg/analyzer/paths-7.c: New test. * gcc.dg/analyzer/pattern-test-1.c: New test. * gcc.dg/analyzer/pattern-test-2.c: New test. * gcc.dg/analyzer/pointer-merging.c: New test. * gcc.dg/analyzer/pr61861.c: New test. * gcc.dg/analyzer/pragma-1.c: New test. * gcc.dg/analyzer/scope-1.c: New test. * gcc.dg/analyzer/sensitive-1.c: New test. * gcc.dg/analyzer/setjmp-1.c: New test. * gcc.dg/analyzer/setjmp-2.c: New test. * gcc.dg/analyzer/setjmp-3.c: New test. * gcc.dg/analyzer/setjmp-4.c: New test. * gcc.dg/analyzer/setjmp-5.c: New test. * gcc.dg/analyzer/setjmp-6.c: New test. * gcc.dg/analyzer/setjmp-7.c: New test. * gcc.dg/analyzer/setjmp-7a.c: New test. * gcc.dg/analyzer/setjmp-8.c: New test. * gcc.dg/analyzer/setjmp-9.c: New test. * gcc.dg/analyzer/signal-1.c: New test. * gcc.dg/analyzer/signal-2.c: New test. * gcc.dg/analyzer/signal-3.c: New test. * gcc.dg/analyzer/signal-4a.c: New test. * gcc.dg/analyzer/signal-4b.c: New test. * gcc.dg/analyzer/strcmp-1.c: New test. * gcc.dg/analyzer/switch.c: New test. * gcc.dg/analyzer/taint-1.c: New test. * gcc.dg/analyzer/zlib-1.c: New test. * gcc.dg/analyzer/zlib-2.c: New test. * gcc.dg/analyzer/zlib-3.c: New test. * gcc.dg/analyzer/zlib-4.c: New test. * gcc.dg/analyzer/zlib-5.c: New test. * gcc.dg/analyzer/zlib-6.c: New test. * lib/gcc-defs.exp (dg-check-dot): New procedure. * lib/target-supports.exp (check_dot_available): New procedure. (check_effective_target_analyzer): New. * lib/target-supports-dg.exp (dg-require-dot): New procedure.
2019-09-27 15:23:16 +02:00
NEXT_PASS (pass_analyzer);
Optimize ODR enum streaming it turns out that half of the global decl stream of cc1 LTO build consits TREE_LISTS, identifiers and integer cosntats representing TYPE_VALUES of enums. Those are streamed only to produce ODR warning and used otherwise, so this patch moves the info to a separate section that is represented and streamed more effectively. This also adds place for more info that may be used for ODR diagnostics (i.e. at the moment we do not warn when the declarations differs i.e. by the associated member functions and their types) and the type inheritance graph rather then poluting the global stream. I was bit unsure what enums we want to store into the section. All parsed enums is probably too expensive, only those enums streamed to represent IL is bit hard to get, so I went for those seen by free lang data. As a plus we now get bit more precise warning because also the location of mismatched enum CONST_DECL is streamed. It changes: [WPA] read 4608466 unshared trees [WPA] read 2942094 mergeable SCCs of average size 1.365328 [WPA] 8625389 tree bodies read in total [WPA] tree SCC table: size 524287, 247652 elements, collision ratio: 0.383702 [WPA] tree SCC max chain length 2 (size 1) [WPA] Compared 2694442 SCCs, 228 collisions (0.000085) [WPA] Merged 2694419 SCCs [WPA] Merged 3731982 tree bodies [WPA] Merged 633335 types [WPA] 122077 types prevailed (155548 associated trees) ... [WPA] Compression: 110593119 input bytes, 287696614 uncompressed bytes (ratio: 2.601397) [WPA] Size of mmap'd section decls: 85628556 bytes [WPA] Size of mmap'd section function_body: 13842928 bytes [WPA] read 1720989 unshared trees [WPA] read 1252217 mergeable SCCs of average size 1.858507 [WPA] 4048243 tree bodies read in total [WPA] tree SCC table: size 524287, 226524 elements, collision ratio: 0.491759 [WPA] tree SCC max chain length 2 (size 1) [WPA] Compared 1025693 SCCs, 196 collisions (0.000191) [WPA] Merged 1025670 SCCs [WPA] Merged 2063373 tree bodies [WPA] Merged 633497 types [WPA] 122299 types prevailed (155827 associated trees) ... [WPA] Compression: 103428770 input bytes, 281151423 uncompressed bytes (ratio: 2.718310) [WPA] Size of mmap'd section decls: 49390917 bytes [WPA] Size of mmap'd section function_body: 13858258 bytes ... [WPA] Size of mmap'd section odr_types: 29054816 bytes So number of SCCs streamed drops to 38% and the number of unshared trees (that are bit misnamed since it is mostly integer_cst) to 37%. Things speeds up correspondingly, but I did not save time report from previous build. The enum values are still quite surprisingly large. I may take a look into ways getting it smaller incrementally, but it streams reasonably fast: Time variable usr sys wall GGC phase opt and generate : 25.20 ( 68%) 10.88 ( 72%) 36.13 ( 69%) 868060 kB ( 52%) phase stream in : 4.46 ( 12%) 0.90 ( 6%) 5.38 ( 10%) 790724 kB ( 48%) phase stream out : 6.69 ( 18%) 3.32 ( 22%) 10.03 ( 19%) 8 kB ( 0%) ipa lto gimple in : 0.79 ( 2%) 1.86 ( 12%) 2.39 ( 5%) 252612 kB ( 15%) ipa lto gimple out : 2.48 ( 7%) 0.78 ( 5%) 3.26 ( 6%) 0 kB ( 0%) ipa lto decl in : 1.71 ( 5%) 0.46 ( 3%) 2.34 ( 4%) 417883 kB ( 25%) ipa lto decl out : 3.28 ( 9%) 0.07 ( 0%) 3.27 ( 6%) 0 kB ( 0%) whopr wpa I/O : 0.40 ( 1%) 2.24 ( 15%) 2.77 ( 5%) 8 kB ( 0%) lto stream decompression : 1.38 ( 4%) 0.31 ( 2%) 1.36 ( 3%) 0 kB ( 0%) ipa ODR types : 0.18 ( 0%) 0.02 ( 0%) 0.25 ( 0%) 0 kB ( 0%) ipa inlining heuristics : 11.64 ( 31%) 1.45 ( 10%) 13.12 ( 25%) 453160 kB ( 27%) ipa pure const : 1.74 ( 5%) 0.00 ( 0%) 1.76 ( 3%) 0 kB ( 0%) ipa icf : 1.72 ( 5%) 5.33 ( 35%) 7.06 ( 13%) 16593 kB ( 1%) whopr partitioning : 2.22 ( 6%) 0.01 ( 0%) 2.23 ( 4%) 5689 kB ( 0%) TOTAL : 37.17 15.20 52.46 1660886 kB LTO-bootstrapped/regtested x86_64-linux, will comit it shortly. gcc/ChangeLog: 2020-06-03 Jan Hubicka <hubicka@ucw.cz> * ipa-devirt.c: Include data-streamer.h, lto-streamer.h and streamer-hooks.h. (odr_enums): New static var. (struct odr_enum_val): New struct. (class odr_enum): New struct. (odr_enum_map): New hashtable. (odr_types_equivalent_p): Drop code testing TYPE_VALUES. (add_type_duplicate): Likewise. (free_odr_warning_data): Do not free TYPE_VALUES. (register_odr_enum): New function. (ipa_odr_summary_write): New function. (ipa_odr_read_section): New function. (ipa_odr_summary_read): New function. (class pass_ipa_odr): New pass. (make_pass_ipa_odr): New function. * ipa-utils.h (register_odr_enum): Declare. * lto-section-in.c: (lto_section_name): Add odr_types section. * lto-streamer.h (enum lto_section_type): Add odr_types section. * passes.def: Add odr_types pass. * lto-streamer-out.c (DFS::DFS_write_tree_body): Do not stream TYPE_VALUES. (hash_tree): Likewise. * tree-streamer-in.c (lto_input_ts_type_non_common_tree_pointers): Likewise. * tree-streamer-out.c (write_ts_type_non_common_tree_pointers): Likewise. * timevar.def (TV_IPA_ODR): New timervar. * tree-pass.h (make_pass_ipa_odr): Declare. * tree.c (free_lang_data_in_type): Regiser ODR types. gcc/lto/ChangeLog: 2020-06-03 Jan Hubicka <hubicka@ucw.cz> * lto-common.c (compare_tree_sccs_1): Do not compare TYPE_VALUES. gcc/testsuite/ChangeLog: 2020-06-03 Jan Hubicka <hubicka@ucw.cz> * g++.dg/lto/pr84805_0.C: Update.
2020-06-03 21:16:43 +02:00
NEXT_PASS (pass_ipa_odr);
NEXT_PASS (pass_ipa_whole_program_visibility);
NEXT_PASS (pass_ipa_profile);
NEXT_PASS (pass_ipa_icf);
NEXT_PASS (pass_ipa_devirt);
NEXT_PASS (pass_ipa_cp);
New IPA-SRA 2019-09-20 Martin Jambor <mjambor@suse.cz> * coretypes.h (cgraph_edge): Declare. * ipa-param-manipulation.c: Rewrite. * ipa-param-manipulation.h: Likewise. * Makefile.in (GTFILES): Added ipa-param-manipulation.h and ipa-sra.c. (OBJS): Added ipa-sra.o. * cgraph.h (ipa_replace_map): Removed fields old_tree, replace_p and ref_p, added fields param_adjustments and performed_splits. (struct cgraph_clone_info): Remove ags_to_skip and combined_args_to_skip, new field param_adjustments. (cgraph_node::create_clone): Changed parameters to use ipa_param_adjustments. (cgraph_node::create_virtual_clone): Likewise. (cgraph_node::create_virtual_clone_with_body): Likewise. (tree_function_versioning): Likewise. (cgraph_build_function_type_skip_args): Removed. * cgraph.c (cgraph_edge::redirect_call_stmt_to_callee): Convert to using ipa_param_adjustments. (clone_of_p): Likewise. * cgraphclones.c (cgraph_build_function_type_skip_args): Removed. (build_function_decl_skip_args): Likewise. (duplicate_thunk_for_node): Adjust parameters using ipa_param_body_adjustments, copy param_adjustments instead of args_to_skip. (cgraph_node::create_clone): Convert to using ipa_param_adjustments. (cgraph_node::create_virtual_clone): Likewise. (cgraph_node::create_version_clone_with_body): Likewise. (cgraph_materialize_clone): Likewise. (symbol_table::materialize_all_clones): Likewise. * ipa-fnsummary.c (ipa_fn_summary_t::duplicate): Simplify ipa_replace_map check. * ipa-cp.c (get_replacement_map): Do not initialize removed fields. (initialize_node_lattices): Make aware that some parameters might have already been removed. (want_remove_some_param_p): New function. (create_specialized_node): Convert to using ipa_param_adjustments and deal with possibly pre-existing adjustments. * lto-cgraph.c (output_cgraph_opt_summary_p): Likewise. (output_node_opt_summary): Do not stream removed fields. Stream parameter adjustments instead of argumetns to skip. (input_node_opt_summary): Likewise. (input_node_opt_summary): Likewise. * lto-section-in.c (lto_section_name): Added ipa-sra section. * lto-streamer.h (lto_section_type): Likewise. * tree-inline.h (copy_body_data): New fields killed_new_ssa_names and param_body_adjs. (copy_decl_to_var): Declare. * tree-inline.c (update_clone_info): Do not remap old_tree. (remap_gimple_stmt): Use ipa_param_body_adjustments to modify gimple statements, walk all extra generated statements and remap their operands. (redirect_all_calls): Add killed SSA names to a hash set. (remap_ssa_name): Do not remap killed SSA names. (copy_arguments_for_versioning): Renames to copy_arguments_nochange, half of functionality moved to ipa_param_body_adjustments. (copy_decl_to_var): Make exported. (copy_body): Destroy killed_new_ssa_names hash set. (expand_call_inline): Remap performed splits. (update_clone_info): Likewise. (tree_function_versioning): Simplify tree_map processing. Updated to accept ipa_param_adjustments and use ipa_param_body_adjustments. * omp-simd-clone.c (simd_clone_vector_of_formal_parm_types): Adjust for the new interface. (simd_clone_clauses_extract): Likewise, make args an auto_vec. (simd_clone_compute_base_data_type): Likewise. (simd_clone_init_simd_arrays): Adjust for the new interface. (simd_clone_adjust_argument_types): Likewise. (struct modify_stmt_info): Likewise. (ipa_simd_modify_stmt_ops): Likewise. (ipa_simd_modify_function_body): Likewise. (simd_clone_adjust): Likewise. * tree-sra.c: Removed IPA-SRA. Include tree-sra.h. (type_internals_preclude_sra_p): Make public. * tree-sra.h: New file. * ipa-inline-transform.c (save_inline_function_body): Update to refelct new tree_function_versioning signature. * ipa-prop.c (adjust_agg_replacement_values): Use a helper from ipa_param_adjustments to get current parameter indices. (ipcp_modif_dom_walker::before_dom_children): Likewise. (ipcp_update_bits): Likewise. (ipcp_update_vr): Likewise. * ipa-split.c (split_function): Convert to using ipa_param_adjustments. * ipa-sra.c: New file. * multiple_target.c (create_target_clone): Update to reflet new type of create_version_clone_with_body. * trans-mem.c (ipa_tm_create_version): Update to reflect new type of tree_function_versioning. (modify_function): Update to reflect new type of tree_function_versioning. * params.def (PARAM_IPA_SRA_MAX_REPLACEMENTS): New. * passes.def: Remove old IPA-SRA and add new one. * tree-pass.h (make_pass_early_ipa_sra): Remove declaration. (make_pass_ipa_sra): Declare. * dbgcnt.def: Remove eipa_sra. Added ipa_sra_params and ipa_sra_retvalues. * doc/invoke.texi (ipa-sra-max-replacements): New. testsuite/ * g++.dg/ipa/pr81248.C: Adjust dg-options and dump-scan. * gcc.dg/ipa/ipa-sra-1.c: Likewise. * gcc.dg/ipa/ipa-sra-10.c: Likewise. * gcc.dg/ipa/ipa-sra-11.c: Likewise. * gcc.dg/ipa/ipa-sra-3.c: Likewise. * gcc.dg/ipa/ipa-sra-4.c: Likewise. * gcc.dg/ipa/ipa-sra-5.c: Likewise. * gcc.dg/ipa/ipacost-2.c: Disable ipa-sra. * gcc.dg/ipa/ipcp-agg-9.c: Likewise. * gcc.dg/ipa/pr78121.c: Adjust scan pattern. * gcc.dg/ipa/vrp1.c: Likewise. * gcc.dg/ipa/vrp2.c: Likewise. * gcc.dg/ipa/vrp3.c: Likewise. * gcc.dg/ipa/vrp7.c: Likewise. * gcc.dg/ipa/vrp8.c: Likewise. * gcc.dg/noreorder.c: use noipa attribute instead of noinline. * gcc.dg/ipa/20040703-wpa.c: New test. * gcc.dg/ipa/ipa-sra-12.c: New test. * gcc.dg/ipa/ipa-sra-13.c: Likewise. * gcc.dg/ipa/ipa-sra-14.c: Likewise. * gcc.dg/ipa/ipa-sra-15.c: Likewise. * gcc.dg/ipa/ipa-sra-16.c: Likewise. * gcc.dg/ipa/ipa-sra-17.c: Likewise. * gcc.dg/ipa/ipa-sra-18.c: Likewise. * gcc.dg/ipa/ipa-sra-19.c: Likewise. * gcc.dg/ipa/ipa-sra-20.c: Likewise. * gcc.dg/ipa/ipa-sra-21.c: Likewise. * gcc.dg/ipa/ipa-sra-22.c: Likewise. * gcc.dg/sso/ipa-sra-1.c: Likewise. * g++.dg/ipa/ipa-sra-2.C: Likewise. * g++.dg/ipa/ipa-sra-3.C: Likewise. * gcc.dg/tree-ssa/ipa-cp-1.c: Make return value used. * g++.dg/ipa/devirt-19.C: Add missing return, add -fipa-cp-clone option. * g++.dg/lto/devirt-19_0.C: Add -fipa-cp-clone option. * gcc.dg/ipa/ipa-sra-2.c: Removed. * gcc.dg/ipa/ipa-sra-6.c: Likewise. From-SVN: r275982
2019-09-20 00:25:04 +02:00
NEXT_PASS (pass_ipa_sra);
NEXT_PASS (pass_ipa_cdtor_merge);
2017-05-23 18:20:53 +02:00
NEXT_PASS (pass_ipa_fn_summary);
NEXT_PASS (pass_ipa_inline);
NEXT_PASS (pass_ipa_pure_const);
New modref/ipa_modref optimization passes 2020-09-19 David Cepelik <d@dcepelik.cz> Jan Hubicka <hubicka@ucw.cz> * Makefile.in: Add ipa-modref.c and ipa-modref-tree.c. * alias.c: (reference_alias_ptr_type_1): Export. * alias.h (reference_alias_ptr_type_1): Declare. * common.opt (fipa-modref): New. * gengtype.c (open_base_files): Add ipa-modref-tree.h and ipa-modref.h * ipa-modref-tree.c: New file. * ipa-modref-tree.h: New file. * ipa-modref.c: New file. * ipa-modref.h: New file. * lto-section-in.c (lto_section_name): Add ipa_modref. * lto-streamer.h (enum lto_section_type): Add LTO_section_ipa_modref. * opts.c (default_options_table): Enable ipa-modref at -O1+. * params.opt (-param=modref-max-bases, -param=modref-max-refs, -param=modref-max-tests): New params. * passes.def: Schedule pass_modref and pass_ipa_modref. * timevar.def (TV_IPA_MODREF): New timevar. (TV_TREE_MODREF): New timevar. * tree-pass.h (make_pass_modref): Declare. (make_pass_ipa_modref): Declare. * tree-ssa-alias.c (dump_alias_stats): Include ipa-modref-tree.h and ipa-modref.h (alias_stats): Add modref_use_may_alias, modref_use_no_alias, modref_clobber_may_alias, modref_clobber_no_alias, modref_tests. (dump_alias_stats): Dump new stats. (nonoverlapping_array_refs_p): Fix formating. (modref_may_conflict): New function. (ref_maybe_used_by_call_p_1): Use it. (call_may_clobber_ref_p_1): Use it. (call_may_clobber_ref_p): Update. (stmt_may_clobber_ref_p_1): Update. * tree-ssa-alias.h (call_may_clobber_ref_p_1): Update.
2020-09-20 07:25:16 +02:00
NEXT_PASS (pass_ipa_modref);
NEXT_PASS (pass_ipa_free_fn_summary, false /* small_p */);
NEXT_PASS (pass_ipa_reference);
/* This pass needs to be scheduled after any IP code duplication. */
NEXT_PASS (pass_ipa_single_use);
/* Comdat privatization come last, as direct references to comdat local
symbols are not allowed outside of the comdat group. Privatizing early
would result in missed optimizations due to this restriction. */
NEXT_PASS (pass_ipa_comdats);
TERMINATE_PASS_LIST (all_regular_ipa_passes)
/* Simple IPA passes executed after the regular passes. In WHOPR mode the
passes are executed after partitioning and thus see just parts of the
compiled unit. */
INSERT_PASSES_AFTER (all_late_ipa_passes)
NEXT_PASS (pass_ipa_pta);
cgraph.h (enum cgraph_simd_clone_arg_type): New. * cgraph.h (enum cgraph_simd_clone_arg_type): New. (struct cgraph_simd_clone_arg, struct cgraph_simd_clone): New. (struct cgraph_node): Add simdclone and simd_clones fields. * config/i386/i386.c (ix86_simd_clone_compute_vecsize_and_simdlen, ix86_simd_clone_adjust, ix86_simd_clone_usable): New functions. (TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN, TARGET_SIMD_CLONE_ADJUST, TARGET_SIMD_CLONE_USABLE): Define. * doc/tm.texi.in (TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN, TARGET_SIMD_CLONE_ADJUST, TARGET_SIMD_CLONE_USABLE): Add. * doc/tm.texi: Regenerated. * ggc.h (ggc_alloc_cleared_simd_clone_stat): New function. * ipa-cp.c (determine_versionability): Fail if "omp declare simd" attribute is present. * omp-low.c: Include pretty-print.h, ipa-prop.h and tree-eh.h. (simd_clone_vector_of_formal_parm_types): New function. (simd_clone_struct_alloc, simd_clone_struct_copy, simd_clone_vector_of_formal_parm_types, simd_clone_clauses_extract, simd_clone_compute_base_data_type, simd_clone_mangle, simd_clone_create, simd_clone_adjust_return_type, create_tmp_simd_array, simd_clone_adjust_argument_types, simd_clone_init_simd_arrays): New functions. (struct modify_stmt_info): New type. (ipa_simd_modify_stmt_ops, ipa_simd_modify_function_body, simd_clone_adjust, expand_simd_clones, ipa_omp_simd_clone): New functions. (pass_data_omp_simd_clone): New variable. (pass_omp_simd_clone): New class. (make_pass_omp_simd_clone): New function. * passes.def (pass_omp_simd_clone): New. * target.def (TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN, TARGET_SIMD_CLONE_ADJUST, TARGET_SIMD_CLONE_USABLE): New target hooks. * target.h (struct cgraph_node, struct cgraph_simd_node): Declare. * tree-core.h (OMP_CLAUSE_LINEAR_VARIABLE_STRIDE): Document. * tree.h (OMP_CLAUSE_LINEAR_VARIABLE_STRIDE): Define. * tree-pass.h (make_pass_omp_simd_clone): New prototype. * tree-vect-data-refs.c: Include cgraph.h. (vect_analyze_data_refs): Inline by hand find_data_references_in_loop and find_data_references_in_bb, if find_data_references_in_stmt fails, still allow calls to #pragma omp declare simd functions in #pragma omp simd loops unless they contain data references among the call arguments or in lhs. * tree-vect-loop.c (vect_determine_vectorization_factor): Handle calls with no lhs. (vect_transform_loop): Allow NULL STMT_VINFO_VECTYPE for calls without lhs. * tree-vectorizer.h (enum stmt_vec_info_type): Add call_simd_clone_vec_info_type. (struct _stmt_vec_info): Add simd_clone_fndecl field. (STMT_VINFO_SIMD_CLONE_FNDECL): Define. * tree-vect-stmts.c: Include tree-ssa-loop.h, tree-scalar-evolution.h and cgraph.h. (vectorizable_call): Handle calls without lhs. Assert !stmt_can_throw_internal instead of failing for it. Don't update EH stuff. (struct simd_call_arg_info): New. (vectorizable_simd_clone_call): New function. (vect_transform_stmt): Call it. (vect_analyze_stmt): Likewise. Allow NULL STMT_VINFO_VECTYPE for calls without lhs. * ipa-prop.c (ipa_add_new_function): Only call ipa_analyze_node if cgraph_function_with_gimple_body_p is true. c/ * c-decl.c (c_builtin_function_ext_scope): Avoid binding if external_scope is NULL. cp/ * semantics.c (finish_omp_clauses): For #pragma omp declare simd linear clause step call maybe_constant_value. testsuite/ * g++.dg/gomp/declare-simd-1.C (f38): Make sure simdlen is a power of two. * gcc.dg/gomp/simd-clones-2.c: Compile on all targets. Remove -msse2. Adjust regexps for name mangling changes. * gcc.dg/gomp/simd-clones-3.c: Likewise. * gcc.dg/vect/vect-simd-clone-1.c: New test. * gcc.dg/vect/vect-simd-clone-2.c: New test. * gcc.dg/vect/vect-simd-clone-3.c: New test. * gcc.dg/vect/vect-simd-clone-4.c: New test. * gcc.dg/vect/vect-simd-clone-5.c: New test. * gcc.dg/vect/vect-simd-clone-6.c: New test. * gcc.dg/vect/vect-simd-clone-7.c: New test. * gcc.dg/vect/vect-simd-clone-8.c: New test. * gcc.dg/vect/vect-simd-clone-9.c: New test. * gcc.dg/vect/vect-simd-clone-10.c: New test. * gcc.dg/vect/vect-simd-clone-10.h: New file. * gcc.dg/vect/vect-simd-clone-10a.c: New file. * gcc.dg/vect/vect-simd-clone-11.c: New test. Co-Authored-By: Jakub Jelinek <jakub@redhat.com> From-SVN: r205442
2013-11-27 12:20:06 +01:00
NEXT_PASS (pass_omp_simd_clone);
TERMINATE_PASS_LIST (all_late_ipa_passes)
/* These passes are run after IPA passes on every function that is being
output to the assembler file. */
INSERT_PASSES_AFTER (all_passes)
NEXT_PASS (pass_fixup_cfg);
NEXT_PASS (pass_lower_eh_dispatch);
[OpenACC] Extract 'pass_oacc_loop_designation' out of 'pass_oacc_device_lower' This really is a separate step -- and another pass to be added between the two, later on. gcc/ * omp-offload.c (oacc_loop_xform_head_tail, oacc_loop_process): 'update_stmt' after modification. (pass_oacc_loop_designation): New function, extracted out of... (pass_oacc_device_lower): ... this. (pass_data_oacc_loop_designation, pass_oacc_loop_designation) (make_pass_oacc_loop_designation): New * passes.def: Add it. * tree-parloops.c (create_parallel_loop): Adjust. * tree-pass.h (make_pass_oacc_loop_designation): New. gcc/testsuite/ * c-c++-common/goacc/classify-kernels-unparallelized.c: 's%oaccdevlow%oaccloops%g'. * c-c++-common/goacc/classify-kernels.c: Likewise. * c-c++-common/goacc/classify-parallel.c: Likewise. * c-c++-common/goacc/classify-routine-nohost.c: Likewise. * c-c++-common/goacc/classify-routine.c: Likewise. * c-c++-common/goacc/classify-serial.c: Likewise. * c-c++-common/goacc/routine-nohost-1.c: Likewise. * g++.dg/goacc/template.C: Likewise. * gcc.dg/goacc/loop-processing-1.c: Likewise. * gfortran.dg/goacc/classify-kernels-unparallelized.f95: Likewise. * gfortran.dg/goacc/classify-kernels.f95: Likewise. * gfortran.dg/goacc/classify-parallel.f95: Likewise. * gfortran.dg/goacc/classify-routine-nohost.f95: Likewise. * gfortran.dg/goacc/classify-routine.f95: Likewise. * gfortran.dg/goacc/classify-serial.f95: Likewise. * gfortran.dg/goacc/routine-multiple-directives-1.f90: Likewise. libgomp/ * testsuite/libgomp.oacc-c-c++-common/pr85486-2.c: 's%oaccdevlow%oaccloops%g'. * testsuite/libgomp.oacc-c-c++-common/pr85486-3.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/pr85486.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/routine-nohost-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/vector-length-128-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/vector-length-128-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/vector-length-128-3.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/vector-length-128-4.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/vector-length-128-5.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/vector-length-128-6.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/vector-length-128-7.c: Likewise. * testsuite/libgomp.oacc-fortran/routine-nohost-1.f90: Likewise. Co-Authored-By: Julian Brown <julian@codesourcery.com> Co-Authored-By: Kwok Cheung Yeung <kcy@codesourcery.com>
2021-03-02 13:20:11 +01:00
NEXT_PASS (pass_oacc_loop_designation);
openacc: Middle-end worker-partitioning support This patch implements worker-partitioning support in the middle end, by rewriting gimple. The OpenACC execution model requires that code can run in either "worker single" mode where only a single worker per gang is active, or "worker partitioned" mode, where multiple workers per gang are active. This means we need to do something equivalent to spawning additional workers when transitioning from worker-single to worker-partitioned mode. However, GPUs typically fix the number of threads of invoked kernels at launch time, so we need to do something with the "extra" threads when they are not wanted. The scheme used is to conditionalise each basic block that executes in "worker single" mode for worker 0 only. Conditional branches are handled specially so "idle" (non-0) workers follow along with worker 0. On transitioning to "worker partitioned" mode, any variables modified by worker 0 are propagated to the other workers via GPU shared memory. Special care is taken for routine calls, writes through pointers, and so forth, as follows: - There are two types of function calls to consider in worker-single mode: "normal" calls to maths library routines, etc. are called from worker 0 only. OpenACC routines may contain worker-partitioned loops themselves, so are called from all workers, including "idle" ones. - SSA names set in worker-single mode, but used in worker-partitioned mode, are copied to shared memory in worker 0. Other workers retrieve the value from the appropriate shared-memory location after a barrier, and new phi nodes are introduced at the convergence point to resolve the worker 0/other worker copies of the value. - Local scalar variables (on the stack) also need special handling. We broadcast any variables that are written in the current worker-single block, and that are read in any worker-partitioned block. (This is believed to be safe, and is flow-insensitive to ease analysis.) - Local aggregates (arrays and composites) on the stack are *not* broadcast. Instead we force gimple stmts modifying elements/fields of local aggregates into fully-partitioned mode. The RHS of the assignment is a scalar, and is thus subject to broadcasting as above. - Writes through pointers may affect any local variable that has its address taken. We use points-to analysis to determine the set of potentially-affected variables for a given pointer indirection. We broadcast any such variable which is used in worker-partitioned mode, on a per-block basis for any block containing a write through a pointer. Some slides about the implementation (from 2018) are available at: https://jtb20.github.io/gcnworkers.pdf gcc/ * Makefile.in (OBJS): Add omp-oacc-neuter-broadcast.o. * doc/tm.texi.in (TARGET_GOACC_CREATE_WORKER_BROADCAST_RECORD): Add documentation hook. * doc/tm.texi: Regenerate. * omp-oacc-neuter-broadcast.cc: New file. * omp-builtins.def (BUILT_IN_GOACC_BARRIER) (BUILT_IN_GOACC_SINGLE_START, BUILT_IN_GOACC_SINGLE_COPY_START) (BUILT_IN_GOACC_SINGLE_COPY_END): New builtins. * passes.def (pass_omp_oacc_neuter_broadcast): Add pass. * target.def (goacc.create_worker_broadcast_record): Add target hook. * tree-pass.h (make_pass_omp_oacc_neuter_broadcast): Add prototype. * config/gcn/gcn-protos.h (gcn_goacc_adjust_propagation_record): Rename prototype to... (gcn_goacc_create_worker_broadcast_record): ... this. * config/gcn/gcn-tree.c (gcn_goacc_adjust_propagation_record): Rename function to... (gcn_goacc_create_worker_broadcast_record): ... this. * config/gcn/gcn.c (TARGET_GOACC_ADJUST_PROPAGATION_RECORD): Rename to... (TARGET_GOACC_CREATE_WORKER_BROADCAST_RECORD): ... this. Co-Authored-By: Nathan Sidwell <nathan@codesourcery.com> (via 'gcc/config/nvptx/nvptx.c' master) Co-Authored-By: Kwok Cheung Yeung <kcy@codesourcery.com> Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
2021-03-02 13:20:11 +01:00
NEXT_PASS (pass_omp_oacc_neuter_broadcast);
NEXT_PASS (pass_oacc_device_lower);
NEXT_PASS (pass_omp_device_lower);
c-common.c (c_common_attribute_table): Handle "omp declare target link" attribute. gcc/c-family/ * c-common.c (c_common_attribute_table): Handle "omp declare target link" attribute. gcc/ * cgraphunit.c (output_in_order): Do not assemble "omp declare target link" variables in ACCEL_COMPILER. * gimplify.c (gimplify_adjust_omp_clauses): Do not remove mapping of "omp declare target link" variables. * omp-low.c (scan_sharing_clauses): Do not remove mapping of "omp declare target link" variables. (add_decls_addresses_to_decl_constructor): For "omp declare target link" variables output address of the artificial pointer instead of address of the variable. Set most significant bit of the size to mark them. (pass_data_omp_target_link): New pass_data. (pass_omp_target_link): New class. (find_link_var_op): New static function. (make_pass_omp_target_link): New function. * passes.def: Add pass_omp_target_link. * tree-pass.h (make_pass_omp_target_link): Declare. * varpool.c (symbol_table::output_variables): Do not assemble "omp declare target link" variables in ACCEL_COMPILER. gcc/lto/ * lto.c: Include stringpool.h and fold-const.h. (offload_handle_link_vars): New static function. (lto_main): Call offload_handle_link_vars. libgomp/ * libgomp.h (REFCOUNT_LINK): Define. (struct splay_tree_key_s): Add link_key. * target.c (gomp_map_vars): Treat REFCOUNT_LINK objects as not mapped. Replace target address of the pointer with target address of newly mapped object in the splay tree. Set link pointer on target to the device address of the mapped object. (gomp_unmap_vars): Restore target address of the pointer in the splay tree for REFCOUNT_LINK objects after unmapping. (gomp_load_image_to_device): Set refcount to REFCOUNT_LINK for "omp declare target link" objects. (gomp_unload_image_from_device): Replace j with i. Force unmap of all "omp declare target link" objects, which were mapped for the image. (gomp_exit_data): Restore target address of the pointer in the splay tree for REFCOUNT_LINK objects after unmapping. * testsuite/libgomp.c/target-link-1.c: New file. From-SVN: r231655
2015-12-15 15:56:50 +01:00
NEXT_PASS (pass_omp_target_link);
NEXT_PASS (pass_adjust_alignment);
NEXT_PASS (pass_all_optimizations);
PUSH_INSERT_PASSES_WITHIN (pass_all_optimizations)
NEXT_PASS (pass_remove_cgraph_callee_edges);
/* Initial scalar cleanups before alias computation.
They ensure memory accesses are not indirect wherever possible. */
NEXT_PASS (pass_strip_predict_hints, false /* early_p */);
NEXT_PASS (pass_ccp, true /* nonzero_p */);
/* After CCP we rewrite no longer addressed locals into SSA
form if possible. */
passes: Fix up subobject __bos [PR101419] The following testcase is miscompiled, because VN during cunrolli changes __bos argument from address of a larger field to address of a smaller field and so __builtin_object_size (, 1) then folds into smaller value than the actually available size. copy_reference_ops_from_ref has a hack for this, but it was using cfun->after_inlining as a check whether the hack can be ignored, and cunrolli is after_inlining. This patch uses a property to make it exact (set at the end of objsz pass that doesn't do insert_min_max_p) and additionally based on discussions in the PR moves the objsz pass earlier after IPA. 2021-07-13 Jakub Jelinek <jakub@redhat.com> Richard Biener <rguenther@suse.de> PR tree-optimization/101419 * tree-pass.h (PROP_objsz): Define. (make_pass_early_object_sizes): Declare. * passes.def (pass_all_early_optimizations): Rename pass_object_sizes there to pass_early_object_sizes, drop parameter. (pass_all_optimizations): Move pass_object_sizes right after pass_ccp, drop parameter, move pass_post_ipa_warn right after that. * tree-object-size.c (pass_object_sizes::execute): Rename to... (object_sizes_execute): ... this. Add insert_min_max_p argument. (pass_data_object_sizes): Move after object_sizes_execute. (pass_object_sizes): Likewise. In execute method call object_sizes_execute, drop set_pass_param method and insert_min_max_p non-static data member and its initializer in the ctor. (pass_data_early_object_sizes, pass_early_object_sizes, make_pass_early_object_sizes): New. * tree-ssa-sccvn.c (copy_reference_ops_from_ref): Use (cfun->curr_properties & PROP_objsz) instead of cfun->after_inlining. * gcc.dg/builtin-object-size-10.c: Pass -fdump-tree-early_objsz-details instead of -fdump-tree-objsz1-details in dg-options and adjust names of dump file in scan-tree-dump. * gcc.dg/pr101419.c: New test.
2021-07-13 11:04:22 +02:00
NEXT_PASS (pass_object_sizes);
NEXT_PASS (pass_post_ipa_warn);
Add -Wdangling-pointer [PR63272]. Resolves: PR c/63272 - GCC should warn when using pointer to dead scoped variable with in the same function gcc/c-family/ChangeLog: PR c/63272 * c.opt (-Wdangling-pointer): New option. gcc/ChangeLog: PR c/63272 * diagnostic-spec.c (nowarn_spec_t::nowarn_spec_t): Handle -Wdangling-pointer. * doc/invoke.texi (-Wdangling-pointer): Document new option. * gimple-ssa-warn-access.cc (pass_waccess::clone): Set new member. (pass_waccess::check_pointer_uses): New function. (pass_waccess::gimple_call_return_arg): New function. (pass_waccess::gimple_call_return_arg_ref): New function. (pass_waccess::check_call_dangling): New function. (pass_waccess::check_dangling_uses): New function overloads. (pass_waccess::check_dangling_stores): New function. (pass_waccess::check_dangling_stores): New function. (pass_waccess::m_clobbers): New data member. (pass_waccess::m_func): New data member. (pass_waccess::m_run_number): New data member. (pass_waccess::m_check_dangling_p): New data member. (pass_waccess::check_alloca): Check m_early_checks_p. (pass_waccess::check_alloc_size_call): Same. (pass_waccess::check_strcat): Same. (pass_waccess::check_strncat): Same. (pass_waccess::check_stxcpy): Same. (pass_waccess::check_stxncpy): Same. (pass_waccess::check_strncmp): Same. (pass_waccess::check_memop_access): Same. (pass_waccess::check_read_access): Same. (pass_waccess::check_builtin): Call check_pointer_uses. (pass_waccess::warn_invalid_pointer): Add arguments. (is_auto_decl): New function. (pass_waccess::check_stmt): New function. (pass_waccess::check_block): Call check_stmt. (pass_waccess::execute): Call check_dangling_uses, check_dangling_stores. Empty m_clobbers. * passes.def (pass_warn_access): Invoke pass two more times. gcc/testsuite/ChangeLog: PR c/63272 * g++.dg/warn/Wfree-nonheap-object-6.C: Disable valid warnings. * g++.dg/warn/ref-temp1.C: Prune expected warning. * gcc.dg/uninit-pr50476.c: Expect a new warning. * c-c++-common/Wdangling-pointer-2.c: New test. * c-c++-common/Wdangling-pointer-3.c: New test. * c-c++-common/Wdangling-pointer-4.c: New test. * c-c++-common/Wdangling-pointer-5.c: New test. * c-c++-common/Wdangling-pointer-6.c: New test. * c-c++-common/Wdangling-pointer.c: New test. * g++.dg/warn/Wdangling-pointer-2.C: New test. * g++.dg/warn/Wdangling-pointer.C: New test. * gcc.dg/Wdangling-pointer-2.c: New test. * gcc.dg/Wdangling-pointer.c: New test.
2022-01-16 00:41:40 +01:00
/* Must run before loop unrolling. */
NEXT_PASS (pass_warn_access, /*early=*/true);
NEXT_PASS (pass_complete_unrolli);
NEXT_PASS (pass_backprop);
NEXT_PASS (pass_phiprop);
NEXT_PASS (pass_forwprop);
/* pass_build_alias is a dummy pass that ensures that we
execute TODO_rebuild_alias at this point. */
NEXT_PASS (pass_build_alias);
NEXT_PASS (pass_return_slot);
NEXT_PASS (pass_fre, true /* may_iterate */);
NEXT_PASS (pass_merge_phi);
NEXT_PASS (pass_thread_jumps_full, /*first=*/true);
NEXT_PASS (pass_vrp, true /* warn_array_bounds_p */);
NEXT_PASS (pass_dse);
NEXT_PASS (pass_dce);
/* pass_stdarg is always run and at this point we execute
TODO_remove_unused_locals to prune CLOBBERs of dead
variables which are otherwise a churn on alias walkings. */
NEXT_PASS (pass_stdarg);
NEXT_PASS (pass_call_cdce);
NEXT_PASS (pass_cselim);
NEXT_PASS (pass_copy_prop);
NEXT_PASS (pass_tree_ifcombine);
NEXT_PASS (pass_merge_phi);
NEXT_PASS (pass_phiopt, false /* early_p */);
NEXT_PASS (pass_tail_recursion);
NEXT_PASS (pass_ch);
NEXT_PASS (pass_lower_complex);
NEXT_PASS (pass_sra);
/* The dom pass will also resolve all __builtin_constant_p calls
that are still there to 0. This has to be done after some
propagations have already run, but before some more dead code
is removed, and this place fits nicely. Remember this when
trying to move or duplicate pass_dominator somewhere earlier. */
NEXT_PASS (pass_thread_jumps, /*first=*/true);
NEXT_PASS (pass_dominator, true /* may_peel_loop_headers_p */);
/* Threading can leave many const/copy propagations in the IL.
Clean them up. Failure to do so well can lead to false
positives from warnings for erroneous code. */
NEXT_PASS (pass_copy_prop);
/* Identify paths that should never be executed in a conforming
program and isolate those paths. */
NEXT_PASS (pass_isolate_erroneous_paths);
NEXT_PASS (pass_reassoc, true /* early_p */);
NEXT_PASS (pass_dce);
NEXT_PASS (pass_forwprop);
NEXT_PASS (pass_phiopt, false /* early_p */);
NEXT_PASS (pass_ccp, true /* nonzero_p */);
/* After CCP we rewrite no longer addressed locals into SSA
form if possible. */
NEXT_PASS (pass_cse_sincos);
NEXT_PASS (pass_optimize_bswap);
NEXT_PASS (pass_laddress);
NEXT_PASS (pass_lim);
NEXT_PASS (pass_walloca, false);
NEXT_PASS (pass_pre);
NEXT_PASS (pass_sink_code, false /* unsplit edges */);
NEXT_PASS (pass_sancov);
NEXT_PASS (pass_asan);
NEXT_PASS (pass_tsan);
NEXT_PASS (pass_dse);
NEXT_PASS (pass_dce);
tree-ssa-loop.c (gate_loop): New function. 2014-06-23 Richard Biener <rguenther@suse.de> * tree-ssa-loop.c (gate_loop): New function. (pass_tree_loop::gate): Call it. (pass_data_tree_no_loop, pass_tree_no_loop, make_pass_tree_no_loop): New. * tree-vectorizer.c: Include tree-scalar-evolution.c (pass_slp_vectorize::execute): Initialize loops and SCEV if required. (pass_slp_vectorize::clone): New method. * timevar.def (TV_TREE_NOLOOP): New. * tree-pass.h (make_pass_tree_no_loop): Declare. * passes.def (pass_tree_no_loop): New pass group with SLP vectorizer. * g++.dg/vect/slp-pr50413.cc: Scan and cleanup appropriate SLP dumps. * g++.dg/vect/slp-pr50819.cc: Likewise. * g++.dg/vect/slp-pr56812.cc: Likewise. * gcc.dg/vect/bb-slp-1.c: Likewise. * gcc.dg/vect/bb-slp-10.c: Likewise. * gcc.dg/vect/bb-slp-11.c: Likewise. * gcc.dg/vect/bb-slp-13.c: Likewise. * gcc.dg/vect/bb-slp-14.c: Likewise. * gcc.dg/vect/bb-slp-15.c: Likewise. * gcc.dg/vect/bb-slp-16.c: Likewise. * gcc.dg/vect/bb-slp-17.c: Likewise. * gcc.dg/vect/bb-slp-18.c: Likewise. * gcc.dg/vect/bb-slp-19.c: Likewise. * gcc.dg/vect/bb-slp-2.c: Likewise. * gcc.dg/vect/bb-slp-20.c: Likewise. * gcc.dg/vect/bb-slp-21.c: Likewise. * gcc.dg/vect/bb-slp-22.c: Likewise. * gcc.dg/vect/bb-slp-23.c: Likewise. * gcc.dg/vect/bb-slp-24.c: Likewise. * gcc.dg/vect/bb-slp-25.c: Likewise. * gcc.dg/vect/bb-slp-26.c: Likewise. * gcc.dg/vect/bb-slp-27.c: Likewise. * gcc.dg/vect/bb-slp-28.c: Likewise. * gcc.dg/vect/bb-slp-29.c: Likewise. * gcc.dg/vect/bb-slp-3.c: Likewise. * gcc.dg/vect/bb-slp-30.c: Likewise. * gcc.dg/vect/bb-slp-31.c: Likewise. * gcc.dg/vect/bb-slp-32.c: Likewise. * gcc.dg/vect/bb-slp-4.c: Likewise. * gcc.dg/vect/bb-slp-5.c: Likewise. * gcc.dg/vect/bb-slp-6.c: Likewise. * gcc.dg/vect/bb-slp-7.c: Likewise. * gcc.dg/vect/bb-slp-8.c: Likewise. * gcc.dg/vect/bb-slp-8a.c: Likewise. * gcc.dg/vect/bb-slp-8b.c: Likewise. * gcc.dg/vect/bb-slp-9.c: Likewise. * gcc.dg/vect/bb-slp-cond-1.c: Likewise. * gcc.dg/vect/bb-slp-pattern-1.c: Likewise. * gcc.dg/vect/bb-slp-pattern-2.c: Likewise. * gcc.dg/vect/fast-math-bb-slp-call-1.c: Likewise. * gcc.dg/vect/fast-math-bb-slp-call-2.c: Likewise. * gcc.dg/vect/fast-math-bb-slp-call-3.c: Likewise. * gcc.dg/vect/no-tree-reassoc-bb-slp-12.c: Likewise. * gcc.dg/vect/no-tree-sra-bb-slp-pr50730.c: Likewise. * gcc.dg/vect/pr26359.c: Likewise. * gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c: Likewise. From-SVN: r211904
2014-06-23 18:51:10 +02:00
/* Pass group that runs when 1) enabled, 2) there are loops
in the function. Make sure to run pass_fix_loops before
to discover/remove loops before running the gate function
of pass_tree_loop. */
NEXT_PASS (pass_fix_loops);
NEXT_PASS (pass_tree_loop);
PUSH_INSERT_PASSES_WITHIN (pass_tree_loop)
/* Before loop_init we rewrite no longer addressed locals into SSA
form if possible. */
NEXT_PASS (pass_tree_loop_init);
NEXT_PASS (pass_tree_unswitch);
NEXT_PASS (pass_scev_cprop);
NEXT_PASS (pass_loop_split);
Add a loop versioning pass This patch adds a pass that versions loops with variable index strides for the case in which the stride is 1. E.g.: for (int i = 0; i < n; ++i) x[i * stride] = ...; becomes: if (stepx == 1) for (int i = 0; i < n; ++i) x[i] = ...; else for (int i = 0; i < n; ++i) x[i * stride] = ...; This is useful for both vector code and scalar code, and in some cases can enable further optimisations like loop interchange or pattern recognition. The pass gives a 7.6% improvement on Cortex-A72 for 554.roms_r at -O3 and a 2.4% improvement for 465.tonto. I haven't found any SPEC tests that regress. Sizewise, there's a 10% increase in .text for both 554.roms_r and 465.tonto. That's obviously a lot, but in tonto's case it's because the whole program is written using assumed-shape arrays and pointers, so a large number of functions really do benefit from versioning. roms likewise makes heavy use of assumed-shape arrays, and that improvement in performance IMO justifies the code growth. The next biggest .text increase is 4.5% for 548.exchange2_r. I did see a small (0.4%) speed improvement there, but although both 3-iteration runs produced stable results, that might still be noise. There was a slightly larger (non-noise) improvement for a 256-bit SVE model. 481.wrf and 521.wrf_r .text grew by 2.8% and 2.5% respectively, but without any noticeable improvement in performance. No other test grew by more than 2%. Although the main SPEC beneficiaries are all Fortran tests, the benchmarks we use for SVE also include some C and C++ tests that benefit. Using -frepack-arrays gives the same benefits in many Fortran cases. The problem is that using that option inappropriately can force a full array copy for arguments that the function only reads once, and so it isn't really something we can turn on by default. The new pass is supposed to give most of the benefits of -frepack-arrays without the risk of unnecessary repacking. The patch therefore enables the pass by default at -O3. 2018-12-17 Richard Sandiford <richard.sandiford@arm.com> Ramana Radhakrishnan <ramana.radhakrishnan@arm.com> Kyrylo Tkachov <kyrylo.tkachov@arm.com> gcc/ * doc/invoke.texi (-fversion-loops-for-strides): Document (loop-versioning-group-size, loop-versioning-max-inner-insns) (loop-versioning-max-outer-insns): Document new --params. * Makefile.in (OBJS): Add gimple-loop-versioning.o. * common.opt (fversion-loops-for-strides): New option. * opts.c (default_options_table): Enable fversion-loops-for-strides at -O3. * params.def (PARAM_LOOP_VERSIONING_GROUP_SIZE) (PARAM_LOOP_VERSIONING_MAX_INNER_INSNS) (PARAM_LOOP_VERSIONING_MAX_OUTER_INSNS): New parameters. * passes.def: Add pass_loop_versioning. * timevar.def (TV_LOOP_VERSIONING): New time variable. * tree-ssa-propagate.h (substitute_and_fold_engine::substitute_and_fold): Add an optional block parameter. * tree-ssa-propagate.c (substitute_and_fold_engine::substitute_and_fold): Likewise. When passed, only walk blocks dominated by that block. * tree-vrp.h (range_includes_p): Declare. (range_includes_zero_p): Turn into an inline wrapper around range_includes_p. * tree-vrp.c (range_includes_p): New function, generalizing... (range_includes_zero_p): ...this. * tree-pass.h (make_pass_loop_versioning): Declare. * gimple-loop-versioning.cc: New file. gcc/testsuite/ * gcc.dg/loop-versioning-1.c: New test. * gcc.dg/loop-versioning-10.c: Likewise. * gcc.dg/loop-versioning-11.c: Likewise. * gcc.dg/loop-versioning-2.c: Likewise. * gcc.dg/loop-versioning-3.c: Likewise. * gcc.dg/loop-versioning-4.c: Likewise. * gcc.dg/loop-versioning-5.c: Likewise. * gcc.dg/loop-versioning-6.c: Likewise. * gcc.dg/loop-versioning-7.c: Likewise. * gcc.dg/loop-versioning-8.c: Likewise. * gcc.dg/loop-versioning-9.c: Likewise. * gfortran.dg/loop_versioning_1.f90: Likewise. * gfortran.dg/loop_versioning_2.f90: Likewise. * gfortran.dg/loop_versioning_3.f90: Likewise. * gfortran.dg/loop_versioning_4.f90: Likewise. * gfortran.dg/loop_versioning_5.f90: Likewise. * gfortran.dg/loop_versioning_6.f90: Likewise. * gfortran.dg/loop_versioning_7.f90: Likewise. * gfortran.dg/loop_versioning_8.f90: Likewise. From-SVN: r267197
2018-12-17 11:05:51 +01:00
NEXT_PASS (pass_loop_versioning);
NEXT_PASS (pass_loop_jam);
/* All unswitching, final value replacement and splitting can expose
empty loops. Remove them now. */
NEXT_PASS (pass_cd_dce, false /* update_address_taken_p */);
NEXT_PASS (pass_iv_canon);
NEXT_PASS (pass_loop_distribution);
re PR tree-optimization/81303 (410.bwaves regression caused by r249919) PR tree-optimization/81303 * Makefile.in (gimple-loop-interchange.o): New object file. * common.opt (floop-interchange): Reuse the option from graphite. * doc/invoke.texi (-floop-interchange): Ditto. New document for -floop-interchange and mention it for -O3. * opts.c (default_options_table): Enable -floop-interchange at -O3. * gimple-loop-interchange.cc: New file. * params.def (PARAM_LOOP_INTERCHANGE_MAX_NUM_STMTS): New parameter. (PARAM_LOOP_INTERCHANGE_STRIDE_RATIO): New parameter. * passes.def (pass_linterchange): New pass. * timevar.def (TV_LINTERCHANGE): New time var. * tree-pass.h (make_pass_linterchange): New declaration. * tree-ssa-loop-ivcanon.c (create_canonical_iv): Change to external interchange. Record IV before/after increment in new parameters. * tree-ssa-loop-ivopts.h (create_canonical_iv): New declaration. * tree-vect-loop.c (vect_is_simple_reduction): Factor out reduction path check into... (check_reduction_path): ...New function here. * tree-vectorizer.h (check_reduction_path): New declaration. gcc/testsuite * gcc.dg/tree-ssa/loop-interchange-1.c: New test. * gcc.dg/tree-ssa/loop-interchange-1b.c: New test. * gcc.dg/tree-ssa/loop-interchange-2.c: New test. * gcc.dg/tree-ssa/loop-interchange-3.c: New test. * gcc.dg/tree-ssa/loop-interchange-4.c: New test. * gcc.dg/tree-ssa/loop-interchange-5.c: New test. * gcc.dg/tree-ssa/loop-interchange-6.c: New test. * gcc.dg/tree-ssa/loop-interchange-7.c: New test. * gcc.dg/tree-ssa/loop-interchange-8.c: New test. * gcc.dg/tree-ssa/loop-interchange-9.c: New test. * gcc.dg/tree-ssa/loop-interchange-10.c: New test. * gcc.dg/tree-ssa/loop-interchange-11.c: New test. * gcc.dg/tree-ssa/loop-interchange-12.c: New test. * gcc.dg/tree-ssa/loop-interchange-13.c: New test. Co-Authored-By: Richard Biener <rguenther@suse.de> From-SVN: r255472
2017-12-07 19:03:53 +01:00
NEXT_PASS (pass_linterchange);
NEXT_PASS (pass_copy_prop);
NEXT_PASS (pass_graphite);
PUSH_INSERT_PASSES_WITHIN (pass_graphite)
NEXT_PASS (pass_graphite_transforms);
NEXT_PASS (pass_lim);
NEXT_PASS (pass_copy_prop);
NEXT_PASS (pass_dce);
POP_INSERT_PASSES ()
NEXT_PASS (pass_parallelize_loops, false /* oacc_kernels_p */);
NEXT_PASS (pass_expand_omp_ssa);
NEXT_PASS (pass_ch_vect);
NEXT_PASS (pass_if_conversion);
tree-vectorizer.h (struct _loop_vec_info): Add scalar_loop field. * tree-vectorizer.h (struct _loop_vec_info): Add scalar_loop field. (LOOP_VINFO_SCALAR_LOOP): Define. (slpeel_tree_duplicate_loop_to_edge_cfg): Add scalar_loop argument. * config/i386/sse.md (maskload<mode>, maskstore<mode>): New expanders. * tree-data-ref.c (get_references_in_stmt): Handle MASK_LOAD and MASK_STORE. * internal-fn.def (LOOP_VECTORIZED, MASK_LOAD, MASK_STORE): New internal fns. * tree-if-conv.c: Include expr.h, optabs.h, tree-ssa-loop-ivopts.h and tree-ssa-address.h. (release_bb_predicate): New function. (free_bb_predicate): Use it. (reset_bb_predicate): Likewise. Don't unallocate bb->aux just to immediately allocate it again. (add_to_predicate_list): Add loop argument. If basic blocks that dominate loop->latch don't insert any predicate. (add_to_dst_predicate_list): Adjust caller. (if_convertible_phi_p): Add any_mask_load_store argument, if true, handle it like flag_tree_loop_if_convert_stores. (insert_gimplified_predicates): Likewise. (ifcvt_can_use_mask_load_store): New function. (if_convertible_gimple_assign_stmt_p): Add any_mask_load_store argument, check if some conditional loads or stores can't be converted into MASK_LOAD or MASK_STORE. (if_convertible_stmt_p): Add any_mask_load_store argument, pass it down to if_convertible_gimple_assign_stmt_p. (predicate_bbs): Don't return bool, only check if the last stmt of a basic block is GIMPLE_COND and handle that. Adjust add_to_predicate_list caller. (if_convertible_loop_p_1): Only call predicate_bbs if flag_tree_loop_if_convert_stores and free_bb_predicate in that case afterwards, check gimple_code of stmts here. Replace is_predicated check with dominance check. Add any_mask_load_store argument, pass it down to if_convertible_stmt_p and if_convertible_phi_p, call if_convertible_phi_p only after all if_convertible_stmt_p calls. (if_convertible_loop_p): Add any_mask_load_store argument, pass it down to if_convertible_loop_p_1. (predicate_mem_writes): Emit MASK_LOAD and/or MASK_STORE calls. (combine_blocks): Add any_mask_load_store argument, pass it down to insert_gimplified_predicates and call predicate_mem_writes if it is set. Call predicate_bbs. (version_loop_for_if_conversion): New function. (tree_if_conversion): Adjust if_convertible_loop_p and combine_blocks calls. Return todo flags instead of bool, call version_loop_for_if_conversion if if-conversion should be just for the vectorized loops and nothing else. (main_tree_if_conversion): Adjust caller. Don't call tree_if_conversion for dont_vectorize loops if if-conversion isn't explicitly enabled. * tree-vect-data-refs.c (vect_check_gather): Handle MASK_LOAD/MASK_STORE. (vect_analyze_data_refs, vect_supportable_dr_alignment): Likewise. * gimple.h (gimple_expr_type): Handle MASK_STORE. * internal-fn.c (expand_LOOP_VECTORIZED, expand_MASK_LOAD, expand_MASK_STORE): New functions. * tree-vectorizer.c: Include tree-cfg.h and gimple-fold.h. (vect_loop_vectorized_call, fold_loop_vectorized_call): New functions. (vectorize_loops): Don't try to vectorize loops with loop->dont_vectorize set. Set LOOP_VINFO_SCALAR_LOOP for if-converted loops, fold LOOP_VECTORIZED internal call depending on if loop has been vectorized or not. * tree-vect-loop-manip.c (slpeel_duplicate_current_defs_from_edges): New function. (slpeel_tree_duplicate_loop_to_edge_cfg): Add scalar_loop argument. If non-NULL, copy basic blocks from scalar_loop instead of loop, but still to loop's entry or exit edge. (slpeel_tree_peel_loop_to_edge): Add scalar_loop argument, pass it down to slpeel_tree_duplicate_loop_to_edge_cfg. (vect_do_peeling_for_loop_bound, vect_do_peeling_for_loop_alignment): Adjust callers. (vect_loop_versioning): If LOOP_VINFO_SCALAR_LOOP, perform loop versioning from that loop instead of LOOP_VINFO_LOOP, move it to the right place in the CFG afterwards. * tree-vect-loop.c (vect_determine_vectorization_factor): Handle MASK_STORE. * cfgloop.h (struct loop): Add dont_vectorize field. * tree-loop-distribution.c (copy_loop_before): Adjust slpeel_tree_duplicate_loop_to_edge_cfg caller. * optabs.def (maskload_optab, maskstore_optab): New optabs. * passes.def: Add a note that pass_vectorize must immediately follow pass_if_conversion. * tree-predcom.c (split_data_refs_to_components): Give up if DR_STMT is a call. * tree-vect-stmts.c (vect_mark_relevant): Don't crash if lhs is NULL. (exist_non_indexing_operands_for_use_p): Handle MASK_LOAD and MASK_STORE. (vectorizable_mask_load_store): New function. (vectorizable_call): Call it for MASK_LOAD or MASK_STORE. (vect_transform_stmt): Handle MASK_STORE. * tree-ssa-phiopt.c (cond_if_else_store_replacement): Ignore DR_STMT where lhs is NULL. * optabs.h (can_vec_perm_p): Fix up comment typo. (can_vec_mask_load_store_p): New prototype. * optabs.c (can_vec_mask_load_store_p): New function. * gcc.dg/vect/vect-cond-11.c: New test. * gcc.target/i386/vect-cond-1.c: New test. * gcc.target/i386/avx2-gather-5.c: New test. * gcc.target/i386/avx2-gather-6.c: New test. * gcc.dg/vect/vect-mask-loadstore-1.c: New test. * gcc.dg/vect/vect-mask-load-1.c: New test. From-SVN: r205856
2013-12-10 12:46:01 +01:00
/* pass_vectorize must immediately follow pass_if_conversion.
Please do not add any other passes in between. */
NEXT_PASS (pass_vectorize);
pass: Run cleanup passes before SLP [PR96789] As the discussion in PR96789, we found that some scalar stmts which can be eliminated by some passes after SLP, but we still modeled their costs when trying to SLP, it could impact vectorizer's decision. One typical case is the case in PR96789 on target Power. As Richard suggested there, this patch is to introduce one pass called pre_slp_scalar_cleanup which has some secondary clean up passes, for now they are FRE and DSE. It introduces one new TODO flags group called pending TODO flags, unlike normal TODO flags, the pending TODO flags are passed down in the pipeline until one of its consumers can perform the requested action. Consumers should then clear the flags for the actions that they have taken. Soem compilation time statistics on all SPEC2017 INT bmks were collected on one Power9 machine for several option sets below: A1: -Ofast -funroll-loops A2: -O1 A3: -O1 -funroll-loops A4: -O2 A5: -O2 -funroll-loops the corresponding increment rate is trivial: A1 A2 A3 A4 A5 0.08% 0.00% -0.38% -0.10% -0.05% Bootstrapped/regtested on powerpc64le-linux-gnu P8. gcc/ChangeLog: PR tree-optimization/96789 * function.h (struct function): New member unsigned pending_TODOs. * passes.c (class pass_pre_slp_scalar_cleanup): New class. (make_pass_pre_slp_scalar_cleanup): New function. (pass_data_pre_slp_scalar_cleanup): New pass data. * passes.def: (pass_pre_slp_scalar_cleanup): New pass, add pass_fre and pass_dse as its children. * timevar.def (TV_SCALAR_CLEANUP): New timevar. * tree-pass.h (PENDING_TODO_force_next_scalar_cleanup): New pending TODO flag. (make_pass_pre_slp_scalar_cleanup): New declare. * tree-ssa-loop-ivcanon.c (tree_unroll_loops_completely_1): Once any outermost loop gets unrolled, flag cfun pending_TODOs PENDING_TODO_force_next_scalar_cleanup on. gcc/testsuite/ChangeLog: PR tree-optimization/96789 * gcc.dg/tree-ssa/ssa-dse-28.c: Adjust. * gcc.dg/tree-ssa/ssa-dse-29.c: Likewise. * gcc.dg/vect/bb-slp-41.c: Likewise. * gcc.dg/tree-ssa/pr96789.c: New test.
2020-11-03 03:51:47 +01:00
PUSH_INSERT_PASSES_WITHIN (pass_vectorize)
NEXT_PASS (pass_dce);
pass: Run cleanup passes before SLP [PR96789] As the discussion in PR96789, we found that some scalar stmts which can be eliminated by some passes after SLP, but we still modeled their costs when trying to SLP, it could impact vectorizer's decision. One typical case is the case in PR96789 on target Power. As Richard suggested there, this patch is to introduce one pass called pre_slp_scalar_cleanup which has some secondary clean up passes, for now they are FRE and DSE. It introduces one new TODO flags group called pending TODO flags, unlike normal TODO flags, the pending TODO flags are passed down in the pipeline until one of its consumers can perform the requested action. Consumers should then clear the flags for the actions that they have taken. Soem compilation time statistics on all SPEC2017 INT bmks were collected on one Power9 machine for several option sets below: A1: -Ofast -funroll-loops A2: -O1 A3: -O1 -funroll-loops A4: -O2 A5: -O2 -funroll-loops the corresponding increment rate is trivial: A1 A2 A3 A4 A5 0.08% 0.00% -0.38% -0.10% -0.05% Bootstrapped/regtested on powerpc64le-linux-gnu P8. gcc/ChangeLog: PR tree-optimization/96789 * function.h (struct function): New member unsigned pending_TODOs. * passes.c (class pass_pre_slp_scalar_cleanup): New class. (make_pass_pre_slp_scalar_cleanup): New function. (pass_data_pre_slp_scalar_cleanup): New pass data. * passes.def: (pass_pre_slp_scalar_cleanup): New pass, add pass_fre and pass_dse as its children. * timevar.def (TV_SCALAR_CLEANUP): New timevar. * tree-pass.h (PENDING_TODO_force_next_scalar_cleanup): New pending TODO flag. (make_pass_pre_slp_scalar_cleanup): New declare. * tree-ssa-loop-ivcanon.c (tree_unroll_loops_completely_1): Once any outermost loop gets unrolled, flag cfun pending_TODOs PENDING_TODO_force_next_scalar_cleanup on. gcc/testsuite/ChangeLog: PR tree-optimization/96789 * gcc.dg/tree-ssa/ssa-dse-28.c: Adjust. * gcc.dg/tree-ssa/ssa-dse-29.c: Likewise. * gcc.dg/vect/bb-slp-41.c: Likewise. * gcc.dg/tree-ssa/pr96789.c: New test.
2020-11-03 03:51:47 +01:00
POP_INSERT_PASSES ()
NEXT_PASS (pass_predcom);
NEXT_PASS (pass_complete_unroll);
pass: Run cleanup passes before SLP [PR96789] As the discussion in PR96789, we found that some scalar stmts which can be eliminated by some passes after SLP, but we still modeled their costs when trying to SLP, it could impact vectorizer's decision. One typical case is the case in PR96789 on target Power. As Richard suggested there, this patch is to introduce one pass called pre_slp_scalar_cleanup which has some secondary clean up passes, for now they are FRE and DSE. It introduces one new TODO flags group called pending TODO flags, unlike normal TODO flags, the pending TODO flags are passed down in the pipeline until one of its consumers can perform the requested action. Consumers should then clear the flags for the actions that they have taken. Soem compilation time statistics on all SPEC2017 INT bmks were collected on one Power9 machine for several option sets below: A1: -Ofast -funroll-loops A2: -O1 A3: -O1 -funroll-loops A4: -O2 A5: -O2 -funroll-loops the corresponding increment rate is trivial: A1 A2 A3 A4 A5 0.08% 0.00% -0.38% -0.10% -0.05% Bootstrapped/regtested on powerpc64le-linux-gnu P8. gcc/ChangeLog: PR tree-optimization/96789 * function.h (struct function): New member unsigned pending_TODOs. * passes.c (class pass_pre_slp_scalar_cleanup): New class. (make_pass_pre_slp_scalar_cleanup): New function. (pass_data_pre_slp_scalar_cleanup): New pass data. * passes.def: (pass_pre_slp_scalar_cleanup): New pass, add pass_fre and pass_dse as its children. * timevar.def (TV_SCALAR_CLEANUP): New timevar. * tree-pass.h (PENDING_TODO_force_next_scalar_cleanup): New pending TODO flag. (make_pass_pre_slp_scalar_cleanup): New declare. * tree-ssa-loop-ivcanon.c (tree_unroll_loops_completely_1): Once any outermost loop gets unrolled, flag cfun pending_TODOs PENDING_TODO_force_next_scalar_cleanup on. gcc/testsuite/ChangeLog: PR tree-optimization/96789 * gcc.dg/tree-ssa/ssa-dse-28.c: Adjust. * gcc.dg/tree-ssa/ssa-dse-29.c: Likewise. * gcc.dg/vect/bb-slp-41.c: Likewise. * gcc.dg/tree-ssa/pr96789.c: New test.
2020-11-03 03:51:47 +01:00
NEXT_PASS (pass_pre_slp_scalar_cleanup);
PUSH_INSERT_PASSES_WITHIN (pass_pre_slp_scalar_cleanup)
NEXT_PASS (pass_fre, false /* may_iterate */);
NEXT_PASS (pass_dse);
POP_INSERT_PASSES ()
NEXT_PASS (pass_slp_vectorize);
NEXT_PASS (pass_loop_prefetch);
tree-ssa-loop.c (gate_loop): New function. 2014-06-23 Richard Biener <rguenther@suse.de> * tree-ssa-loop.c (gate_loop): New function. (pass_tree_loop::gate): Call it. (pass_data_tree_no_loop, pass_tree_no_loop, make_pass_tree_no_loop): New. * tree-vectorizer.c: Include tree-scalar-evolution.c (pass_slp_vectorize::execute): Initialize loops and SCEV if required. (pass_slp_vectorize::clone): New method. * timevar.def (TV_TREE_NOLOOP): New. * tree-pass.h (make_pass_tree_no_loop): Declare. * passes.def (pass_tree_no_loop): New pass group with SLP vectorizer. * g++.dg/vect/slp-pr50413.cc: Scan and cleanup appropriate SLP dumps. * g++.dg/vect/slp-pr50819.cc: Likewise. * g++.dg/vect/slp-pr56812.cc: Likewise. * gcc.dg/vect/bb-slp-1.c: Likewise. * gcc.dg/vect/bb-slp-10.c: Likewise. * gcc.dg/vect/bb-slp-11.c: Likewise. * gcc.dg/vect/bb-slp-13.c: Likewise. * gcc.dg/vect/bb-slp-14.c: Likewise. * gcc.dg/vect/bb-slp-15.c: Likewise. * gcc.dg/vect/bb-slp-16.c: Likewise. * gcc.dg/vect/bb-slp-17.c: Likewise. * gcc.dg/vect/bb-slp-18.c: Likewise. * gcc.dg/vect/bb-slp-19.c: Likewise. * gcc.dg/vect/bb-slp-2.c: Likewise. * gcc.dg/vect/bb-slp-20.c: Likewise. * gcc.dg/vect/bb-slp-21.c: Likewise. * gcc.dg/vect/bb-slp-22.c: Likewise. * gcc.dg/vect/bb-slp-23.c: Likewise. * gcc.dg/vect/bb-slp-24.c: Likewise. * gcc.dg/vect/bb-slp-25.c: Likewise. * gcc.dg/vect/bb-slp-26.c: Likewise. * gcc.dg/vect/bb-slp-27.c: Likewise. * gcc.dg/vect/bb-slp-28.c: Likewise. * gcc.dg/vect/bb-slp-29.c: Likewise. * gcc.dg/vect/bb-slp-3.c: Likewise. * gcc.dg/vect/bb-slp-30.c: Likewise. * gcc.dg/vect/bb-slp-31.c: Likewise. * gcc.dg/vect/bb-slp-32.c: Likewise. * gcc.dg/vect/bb-slp-4.c: Likewise. * gcc.dg/vect/bb-slp-5.c: Likewise. * gcc.dg/vect/bb-slp-6.c: Likewise. * gcc.dg/vect/bb-slp-7.c: Likewise. * gcc.dg/vect/bb-slp-8.c: Likewise. * gcc.dg/vect/bb-slp-8a.c: Likewise. * gcc.dg/vect/bb-slp-8b.c: Likewise. * gcc.dg/vect/bb-slp-9.c: Likewise. * gcc.dg/vect/bb-slp-cond-1.c: Likewise. * gcc.dg/vect/bb-slp-pattern-1.c: Likewise. * gcc.dg/vect/bb-slp-pattern-2.c: Likewise. * gcc.dg/vect/fast-math-bb-slp-call-1.c: Likewise. * gcc.dg/vect/fast-math-bb-slp-call-2.c: Likewise. * gcc.dg/vect/fast-math-bb-slp-call-3.c: Likewise. * gcc.dg/vect/no-tree-reassoc-bb-slp-12.c: Likewise. * gcc.dg/vect/no-tree-sra-bb-slp-pr50730.c: Likewise. * gcc.dg/vect/pr26359.c: Likewise. * gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c: Likewise. From-SVN: r211904
2014-06-23 18:51:10 +02:00
/* Run IVOPTs after the last pass that uses data-reference analysis
as that doesn't handle TARGET_MEM_REFs. */
NEXT_PASS (pass_iv_optimize);
NEXT_PASS (pass_lim);
NEXT_PASS (pass_tree_loop_done);
POP_INSERT_PASSES ()
tree-ssa-loop.c (gate_loop): New function. 2014-06-23 Richard Biener <rguenther@suse.de> * tree-ssa-loop.c (gate_loop): New function. (pass_tree_loop::gate): Call it. (pass_data_tree_no_loop, pass_tree_no_loop, make_pass_tree_no_loop): New. * tree-vectorizer.c: Include tree-scalar-evolution.c (pass_slp_vectorize::execute): Initialize loops and SCEV if required. (pass_slp_vectorize::clone): New method. * timevar.def (TV_TREE_NOLOOP): New. * tree-pass.h (make_pass_tree_no_loop): Declare. * passes.def (pass_tree_no_loop): New pass group with SLP vectorizer. * g++.dg/vect/slp-pr50413.cc: Scan and cleanup appropriate SLP dumps. * g++.dg/vect/slp-pr50819.cc: Likewise. * g++.dg/vect/slp-pr56812.cc: Likewise. * gcc.dg/vect/bb-slp-1.c: Likewise. * gcc.dg/vect/bb-slp-10.c: Likewise. * gcc.dg/vect/bb-slp-11.c: Likewise. * gcc.dg/vect/bb-slp-13.c: Likewise. * gcc.dg/vect/bb-slp-14.c: Likewise. * gcc.dg/vect/bb-slp-15.c: Likewise. * gcc.dg/vect/bb-slp-16.c: Likewise. * gcc.dg/vect/bb-slp-17.c: Likewise. * gcc.dg/vect/bb-slp-18.c: Likewise. * gcc.dg/vect/bb-slp-19.c: Likewise. * gcc.dg/vect/bb-slp-2.c: Likewise. * gcc.dg/vect/bb-slp-20.c: Likewise. * gcc.dg/vect/bb-slp-21.c: Likewise. * gcc.dg/vect/bb-slp-22.c: Likewise. * gcc.dg/vect/bb-slp-23.c: Likewise. * gcc.dg/vect/bb-slp-24.c: Likewise. * gcc.dg/vect/bb-slp-25.c: Likewise. * gcc.dg/vect/bb-slp-26.c: Likewise. * gcc.dg/vect/bb-slp-27.c: Likewise. * gcc.dg/vect/bb-slp-28.c: Likewise. * gcc.dg/vect/bb-slp-29.c: Likewise. * gcc.dg/vect/bb-slp-3.c: Likewise. * gcc.dg/vect/bb-slp-30.c: Likewise. * gcc.dg/vect/bb-slp-31.c: Likewise. * gcc.dg/vect/bb-slp-32.c: Likewise. * gcc.dg/vect/bb-slp-4.c: Likewise. * gcc.dg/vect/bb-slp-5.c: Likewise. * gcc.dg/vect/bb-slp-6.c: Likewise. * gcc.dg/vect/bb-slp-7.c: Likewise. * gcc.dg/vect/bb-slp-8.c: Likewise. * gcc.dg/vect/bb-slp-8a.c: Likewise. * gcc.dg/vect/bb-slp-8b.c: Likewise. * gcc.dg/vect/bb-slp-9.c: Likewise. * gcc.dg/vect/bb-slp-cond-1.c: Likewise. * gcc.dg/vect/bb-slp-pattern-1.c: Likewise. * gcc.dg/vect/bb-slp-pattern-2.c: Likewise. * gcc.dg/vect/fast-math-bb-slp-call-1.c: Likewise. * gcc.dg/vect/fast-math-bb-slp-call-2.c: Likewise. * gcc.dg/vect/fast-math-bb-slp-call-3.c: Likewise. * gcc.dg/vect/no-tree-reassoc-bb-slp-12.c: Likewise. * gcc.dg/vect/no-tree-sra-bb-slp-pr50730.c: Likewise. * gcc.dg/vect/pr26359.c: Likewise. * gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c: Likewise. From-SVN: r211904
2014-06-23 18:51:10 +02:00
/* Pass group that runs when pass_tree_loop is disabled or there
are no loops in the function. */
NEXT_PASS (pass_tree_no_loop);
PUSH_INSERT_PASSES_WITHIN (pass_tree_no_loop)
NEXT_PASS (pass_slp_vectorize);
POP_INSERT_PASSES ()
NEXT_PASS (pass_simduid_cleanup);
NEXT_PASS (pass_lower_vector_ssa);
NEXT_PASS (pass_lower_switch);
NEXT_PASS (pass_cse_reciprocals);
NEXT_PASS (pass_reassoc, false /* early_p */);
NEXT_PASS (pass_strength_reduction);
NEXT_PASS (pass_split_paths);
NEXT_PASS (pass_tracer);
NEXT_PASS (pass_fre, false /* may_iterate */);
/* After late FRE we rewrite no longer addressed locals into SSA
form if possible. */
NEXT_PASS (pass_thread_jumps, /*first=*/false);
NEXT_PASS (pass_dominator, false /* may_peel_loop_headers_p */);
NEXT_PASS (pass_strlen);
NEXT_PASS (pass_thread_jumps_full, /*first=*/false);
NEXT_PASS (pass_vrp, false /* warn_array_bounds_p */);
/* Run CCP to compute alignment and nonzero bits. */
introduce try store by multiple pieces The ldist pass turns even very short loops into memset calls. E.g., the TFmode emulation calls end with a loop of up to 3 iterations, to zero out trailing words, and the loop distribution pass turns them into calls of the memset builtin. Though short constant-length clearing memsets are usually dealt with efficiently, for non-constant-length ones, the options are setmemM, or a function calls. RISC-V doesn't have any setmemM pattern, so the loops above end up "optimized" into memset calls, incurring not only the overhead of an explicit call, but also discarding the information the compiler has about the alignment of the destination, and that the length is a multiple of the word alignment. This patch handles variable lengths with multiple conditional power-of-2-constant-sized stores-by-pieces, so as to reduce the overhead of length compares. It also changes the last copy-prop pass into ccp, so that pointer alignment and length's nonzero bits are detected and made available for the expander, even for ldist-introduced SSA_NAMEs. for gcc/ChangeLog * builtins.c (try_store_by_multiple_pieces): New. (expand_builtin_memset_args): Use it. If target_char_cast fails, proceed as for non-constant val. Pass len's ctz to... * expr.c (clear_storage_hints): ... this. Try store by multiple pieces after setmem. (clear_storage): Adjust. * expr.h (clear_storage_hints): Likewise. (try_store_by_multiple_pieces): Declare. * passes.def: Replace the last copy_prop with ccp.
2021-05-04 03:48:47 +02:00
NEXT_PASS (pass_ccp, true /* nonzero_p */);
NEXT_PASS (pass_warn_restrict);
NEXT_PASS (pass_dse);
NEXT_PASS (pass_cd_dce, true /* update_address_taken_p */);
/* After late CD DCE we rewrite no longer addressed locals into SSA
form if possible. */
NEXT_PASS (pass_forwprop);
NEXT_PASS (pass_sink_code, true /* unsplit edges */);
NEXT_PASS (pass_phiopt, false /* early_p */);
NEXT_PASS (pass_fold_builtins);
NEXT_PASS (pass_optimize_widening_mul);
GIMPLE store merging pass 2016-10-28 Kyrylo Tkachov <kyrylo.tkachov@arm.com> PR middle-end/22141 * Makefile.in (OBJS): Add gimple-ssa-store-merging.o. * common.opt (fstore-merging): New Optimization option. * opts.c (default_options_table): Add entry for OPT_ftree_store_merging. * fold-const.h (can_native_encode_type_p): Declare prototype. * fold-const.c (can_native_encode_type_p): Define. * params.def (PARAM_STORE_MERGING_ALLOW_UNALIGNED): Define. (PARAM_MAX_STORES_TO_MERGE): Likewise. * timevar.def (TV_GIMPLE_STORE_MERGING): New timevar. * passes.def: Insert pass_tree_store_merging. * tree-pass.h (make_pass_store_merging): Declare extern prototype. * gimple-ssa-store-merging.c: New file. * doc/invoke.texi (Optimization Options): Document -fstore-merging. (--param documentation): Document store-merging-allow-unaligned and max-stores-to-merge. 2016-10-28 Kyrylo Tkachov <kyrylo.tkachov@arm.com> Jakub Jelinek <jakub@redhat.com> Andrew Pinski <pinskia@gmail.com> PR middle-end/22141 PR rtl-optimization/23684 * gcc.c-torture/execute/pr22141-1.c: New test. * gcc.c-torture/execute/pr22141-2.c: Likewise. * gcc.target/aarch64/ldp_stp_1.c: Adjust for -fstore-merging. * gcc.target/aarch64/ldp_stp_4.c: Likewise. * gcc.dg/store_merging_1.c: New test. * gcc.dg/store_merging_2.c: Likewise. * gcc.dg/store_merging_3.c: Likewise. * gcc.dg/store_merging_4.c: Likewise. * gcc.dg/store_merging_5.c: Likewise. * gcc.dg/store_merging_6.c: Likewise. * gcc.dg/store_merging_7.c: Likewise. * gcc.target/i386/pr22141.c: Likewise. * gcc.target/i386/pr34012.c: Add -fno-store-merging to dg-options. * g++.dg/init/new17.C: Likewise. Co-Authored-By: Andrew Pinski <pinskia@gmail.com> Co-Authored-By: Jakub Jelinek <jakub@redhat.com> From-SVN: r241649
2016-10-28 16:18:50 +02:00
NEXT_PASS (pass_store_merging);
NEXT_PASS (pass_tail_calls);
/* If DCE is not run before checking for uninitialized uses,
we may get false warnings (e.g., testsuite/gcc.dg/uninit-5.c).
However, this also causes us to misdiagnose cases that should be
real warnings (e.g., testsuite/gcc.dg/pr18501.c). */
NEXT_PASS (pass_dce);
/* Split critical edges before late uninit warning to reduce the
number of false positives from it. */
NEXT_PASS (pass_split_crit_edges);
NEXT_PASS (pass_late_warn_uninitialized);
NEXT_PASS (pass_local_pure_const);
New modref/ipa_modref optimization passes 2020-09-19 David Cepelik <d@dcepelik.cz> Jan Hubicka <hubicka@ucw.cz> * Makefile.in: Add ipa-modref.c and ipa-modref-tree.c. * alias.c: (reference_alias_ptr_type_1): Export. * alias.h (reference_alias_ptr_type_1): Declare. * common.opt (fipa-modref): New. * gengtype.c (open_base_files): Add ipa-modref-tree.h and ipa-modref.h * ipa-modref-tree.c: New file. * ipa-modref-tree.h: New file. * ipa-modref.c: New file. * ipa-modref.h: New file. * lto-section-in.c (lto_section_name): Add ipa_modref. * lto-streamer.h (enum lto_section_type): Add LTO_section_ipa_modref. * opts.c (default_options_table): Enable ipa-modref at -O1+. * params.opt (-param=modref-max-bases, -param=modref-max-refs, -param=modref-max-tests): New params. * passes.def: Schedule pass_modref and pass_ipa_modref. * timevar.def (TV_IPA_MODREF): New timevar. (TV_TREE_MODREF): New timevar. * tree-pass.h (make_pass_modref): Declare. (make_pass_ipa_modref): Declare. * tree-ssa-alias.c (dump_alias_stats): Include ipa-modref-tree.h and ipa-modref.h (alias_stats): Add modref_use_may_alias, modref_use_no_alias, modref_clobber_may_alias, modref_clobber_no_alias, modref_tests. (dump_alias_stats): Dump new stats. (nonoverlapping_array_refs_p): Fix formating. (modref_may_conflict): New function. (ref_maybe_used_by_call_p_1): Use it. (call_may_clobber_ref_p_1): Use it. (call_may_clobber_ref_p): Update. (stmt_may_clobber_ref_p_1): Update. * tree-ssa-alias.h (call_may_clobber_ref_p_1): Update.
2020-09-20 07:25:16 +02:00
NEXT_PASS (pass_modref);
/* uncprop replaces constants by SSA names. This makes analysis harder
and thus it should be run last. */
NEXT_PASS (pass_uncprop);
POP_INSERT_PASSES ()
NEXT_PASS (pass_all_optimizations_g);
PUSH_INSERT_PASSES_WITHIN (pass_all_optimizations_g)
/* The idea is that with -Og we do not perform any IPA optimization
so post-IPA work should be restricted to semantically required
passes and all optimization work is done early. */
NEXT_PASS (pass_remove_cgraph_callee_edges);
NEXT_PASS (pass_strip_predict_hints, false /* early_p */);
/* Lower remaining pieces of GIMPLE. */
NEXT_PASS (pass_lower_complex);
NEXT_PASS (pass_lower_vector_ssa);
NEXT_PASS (pass_lower_switch);
/* Perform simple scalar cleanup which is constant/copy propagation. */
NEXT_PASS (pass_ccp, true /* nonzero_p */);
re PR bootstrap/78817 (stage2 bootstrap failure in vec.h:1613:5: error: argument 1 null where non-null expected after r243661) PR bootstrap/78817 * tree-pass.h (make_pass_post_ipa_warn): Declare. * builtins.c (validate_arglist): Adjust get_nonnull_args call. Check for NULL pointer argument to nonnull arg here. (validate_arg): Revert 2016-12-14 changes. * calls.h (get_nonnull_args): Remove declaration. * tree-ssa-ccp.c: Include diagnostic-core.h. (pass_data_post_ipa_warn): New variable. (pass_post_ipa_warn): New class. (pass_post_ipa_warn::execute): New method. (make_pass_post_ipa_warn): New function. * tree.h (get_nonnull_args): Declare. * tree.c (get_nonnull_args): New function. * calls.c (maybe_warn_null_arg): Removed. (maybe_warn_null_arg): Removed. (initialize_argument_information): Revert 2016-12-14 changes. * passes.def: Add pass_post_ipa_warn after first ccp after IPA. c-family/ * c-common.c (struct nonnull_arg_ctx): New type. (check_function_nonnull): Return bool instead of void. Use nonnull_arg_ctx as context rather than just location_t. (check_nonnull_arg): Adjust for the new context type, set warned_p to true if a warning has been diagnosed. (check_function_arguments): Return bool instead of void. * c-common.h (check_function_arguments): Adjust prototype. c/ * c-typeck.c (build_function_call_vec): If check_function_arguments returns true, set TREE_NO_WARNING on CALL_EXPR. cp/ * typeck.c (cp_build_function_call_vec): If check_function_arguments returns true, set TREE_NO_WARNING on CALL_EXPR. * call.c (build_over_call): Likewise. From-SVN: r243874
2016-12-21 23:15:59 +01:00
NEXT_PASS (pass_post_ipa_warn);
NEXT_PASS (pass_object_sizes);
/* Fold remaining builtins. */
NEXT_PASS (pass_fold_builtins);
PR tree-optimization/83431 - -Wformat-truncation may incorrectly report truncation gcc/ChangeLog: PR c++/83431 * gimple-ssa-sprintf.c (pass_data_sprintf_length): Remove object. (sprintf_dom_walker): Remove class. (get_int_range): Make argument const. (directive::fmtfunc, directive::set_precision): Same. (format_none): Same. (build_intmax_type_nodes): Same. (adjust_range_for_overflow): Same. (format_floating): Same. (format_character): Same. (format_string): Same. (format_plain): Same. (get_int_range): Cast away constness. (format_integer): Same. (get_string_length): Call get_range_strlen_dynamic. Handle null lendata.maxbound. (should_warn_p): Adjust argument scope qualifier. (maybe_warn): Same. (format_directive): Same. (parse_directive): Same. (is_call_safe): Same. (try_substitute_return_value): Same. (sprintf_dom_walker::handle_printf_call): Rename... (handle_printf_call): ...to this. Initialize target to host charmap here instead of in pass_sprintf_length::execute. (struct call_info): Make global. (sprintf_dom_walker::compute_format_length): Make global. (sprintf_dom_walker::handle_gimple_call): Same. * passes.def (pass_sprintf_length): Replace with pass_strlen. * print-rtl.c (print_pattern): Reduce the number of spaces to avoid -Wformat-truncation. * tree-pass.h (make_pass_warn_printf): New function. * tree-ssa-strlen.c (strlen_optimize): New variable. (get_string_length): Add comments. (get_range_strlen_dynamic): New function. (check_and_optimize_call): New function. (handle_integral_assign): New function. (strlen_check_and_optimize_stmt): Factor code out into strlen_check_and_optimize_call and handle_integral_assign. (strlen_dom_walker::evrp): New member. (strlen_dom_walker::before_dom_children): Use evrp member. (strlen_dom_walker::after_dom_children): Use evrp member. (printf_strlen_execute): New function. (pass_strlen::gate): Update to handle printf calls. (dump_strlen_info): New function. (pass_data_warn_printf): New variable. (pass_warn_printf): New class. * tree-ssa-strlen.h (get_range_strlen_dynamic): Declare. (handle_printf_call): Same. * tree-vrp.c (value_range_base::type): Adjust assertion. * vr-values.c (vr_values::update_value_range): Use type of the first argument rather than the second. gcc/testsuite/ChangeLog: PR c++/83431 * gcc.dg/strlenopt-63.c: New test. * gcc.dg/pr79538.c: Adjust text of expected warning. * gcc.dg/pr81292-1.c: Adjust pass name. * gcc.dg/pr81292-2.c: Same. * gcc.dg/pr81703.c: Same. * gcc.dg/strcmpopt_2.c: Same. * gcc.dg/strcmpopt_3.c: Same. * gcc.dg/strcmpopt_4.c: Same. * gcc.dg/strlenopt-1.c: Same. * gcc.dg/strlenopt-10.c: Same. * gcc.dg/strlenopt-11.c: Same. * gcc.dg/strlenopt-13.c: Same. * gcc.dg/strlenopt-14g.c: Same. * gcc.dg/strlenopt-14gf.c: Same. * gcc.dg/strlenopt-15.c: Same. * gcc.dg/strlenopt-16g.c: Same. * gcc.dg/strlenopt-17g.c: Same. * gcc.dg/strlenopt-18g.c: Same. * gcc.dg/strlenopt-19.c: Same. * gcc.dg/strlenopt-1f.c: Same. * gcc.dg/strlenopt-2.c: Same. * gcc.dg/strlenopt-20.c: Same. * gcc.dg/strlenopt-21.c: Same. * gcc.dg/strlenopt-22.c: Same. * gcc.dg/strlenopt-22g.c: Same. * gcc.dg/strlenopt-24.c: Same. * gcc.dg/strlenopt-25.c: Same. * gcc.dg/strlenopt-26.c: Same. * gcc.dg/strlenopt-27.c: Same. * gcc.dg/strlenopt-28.c: Same. * gcc.dg/strlenopt-29.c: Same. * gcc.dg/strlenopt-2f.c: Same. * gcc.dg/strlenopt-3.c: Same. * gcc.dg/strlenopt-30.c: Same. * gcc.dg/strlenopt-31g.c: Same. * gcc.dg/strlenopt-32.c: Same. * gcc.dg/strlenopt-33.c: Same. * gcc.dg/strlenopt-33g.c: Same. * gcc.dg/strlenopt-34.c: Same. * gcc.dg/strlenopt-35.c: Same. * gcc.dg/strlenopt-4.c: Same. * gcc.dg/strlenopt-48.c: Same. * gcc.dg/strlenopt-49.c: Same. * gcc.dg/strlenopt-4g.c: Same. * gcc.dg/strlenopt-4gf.c: Same. * gcc.dg/strlenopt-5.c: Same. * gcc.dg/strlenopt-50.c: Same. * gcc.dg/strlenopt-51.c: Same. * gcc.dg/strlenopt-52.c: Same. * gcc.dg/strlenopt-53.c: Same. * gcc.dg/strlenopt-54.c: Same. * gcc.dg/strlenopt-55.c: Same. * gcc.dg/strlenopt-56.c: Same. * gcc.dg/strlenopt-6.c: Same. * gcc.dg/strlenopt-61.c: Same. * gcc.dg/strlenopt-7.c: Same. * gcc.dg/strlenopt-8.c: Same. * gcc.dg/strlenopt-9.c: Same. * gcc.dg/strlenopt.h (snprintf, snprintf): Declare. * gcc.dg/tree-ssa/builtin-snprintf-6.c: New test. * gcc.dg/tree-ssa/builtin-snprintf-7.c: New test. * gcc.dg/tree-ssa/builtin-snprintf-8.c: New test. * gcc.dg/tree-ssa/builtin-snprintf-9.c: New test. * gcc.dg/tree-ssa/builtin-sprintf-warn-21.c: New test. * gcc.dg/tree-ssa/dump-4.c: New test. * gcc.dg/tree-ssa/pr83501.c: Adjust pass name. From-SVN: r274933
2019-08-26 20:29:45 +02:00
NEXT_PASS (pass_strlen);
/* Copy propagation also copy-propagates constants, this is necessary
to forward object-size and builtin folding results properly. */
NEXT_PASS (pass_copy_prop);
NEXT_PASS (pass_dce);
NEXT_PASS (pass_sancov);
NEXT_PASS (pass_asan);
NEXT_PASS (pass_tsan);
/* ??? We do want some kind of loop invariant motion, but we possibly
need to adjust LIM to be more friendly towards preserving accurate
debug information here. */
/* Split critical edges before late uninit warning to reduce the
number of false positives from it. */
NEXT_PASS (pass_split_crit_edges);
NEXT_PASS (pass_late_warn_uninitialized);
/* uncprop replaces constants by SSA names. This makes analysis harder
and thus it should be run last. */
NEXT_PASS (pass_uncprop);
POP_INSERT_PASSES ()
NEXT_PASS (pass_tm_init);
PUSH_INSERT_PASSES_WITHIN (pass_tm_init)
NEXT_PASS (pass_tm_mark);
NEXT_PASS (pass_tm_memopt);
NEXT_PASS (pass_tm_edges);
POP_INSERT_PASSES ()
builtin-types.def (BT_FN_BOOL_UINT_LONGPTR_LONGPTR_LONGPTR, [...]): New. gcc/ 2015-10-13 Jakub Jelinek <jakub@redhat.com> Aldy Hernandez <aldyh@redhat.com> Ilya Verbin <ilya.verbin@intel.com> * builtin-types.def (BT_FN_BOOL_UINT_LONGPTR_LONGPTR_LONGPTR, BT_FN_BOOL_UINT_ULLPTR_ULLPTR_ULLPTR, BT_FN_BOOL_UINT_LONGPTR_LONG_LONGPTR_LONGPTR, BT_FN_BOOL_UINT_ULLPTR_ULL_ULLPTR_ULLPTR, BT_FN_VOID_INT_SIZE_PTR_PTR_PTR_UINT_PTR, BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR, BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_BOOL_UINT_PTR_INT, BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_UINT_LONG_INT_LONG_LONG_LONG, BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_UINT_LONG_INT_ULL_ULL_ULL, BT_FN_VOID_LONG_VAR, BT_FN_VOID_ULL_VAR): New. (BT_FN_VOID_INT_PTR_SIZE_PTR_PTR_PTR, BT_FN_VOID_INT_OMPFN_PTR_SIZE_PTR_PTR_PTR, BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_BOOL_UINT_PTR): Remove. * cgraph.h (enum cgraph_simd_clone_arg_type): Add SIMD_CLONE_ARG_TYPE_LINEAR_REF_CONSTANT_STEP, SIMD_CLONE_ARG_TYPE_LINEAR_UVAL_CONSTANT_STEP and SIMD_CLONE_ARG_TYPE_LINEAR_VAL_CONSTANT_STEP. (struct cgraph_simd_clone_arg): Adjust comment. * coretypes.h (struct gomp_ordered): New forward decl. * gimple.c (gimple_build_omp_critical): Add CLAUSES argument, set critical clauses to it. (gimple_build_omp_ordered): Return gomp_ordered * instead of gimple *. Add CLAUSES argument, set ordered clauses to it. (gimple_copy): Unshare clauses on GIMPLE_OMP_CRITICAL and GIMPLE_OMP_ORDERED. * gimple.def (GIMPLE_OMP_ORDERED): Change from GSS_OMP to GSS_OMP_SINGLE_LAYOUT, move it after GIMPLE_OMP_TEAMS. * gimple.h (enum gf_mask): Add GF_OMP_TASK_TASKLOOP. Add another bit to GF_OMP_FOR_KIND_MASK mask. Add GF_OMP_FOR_KIND_TASKLOOP, renumber GF_OMP_FOR_KIND_CILKFOR and GF_OMP_FOR_KIND_OACC_LOOP. Adjust GF_OMP_FOR_SIMD, GF_OMP_FOR_COMBINED and GF_OMP_FOR_COMBINED_INTO. Add another bit to GF_OMP_TARGET_KIND_MASK mask. Add GF_OMP_TARGET_KIND_ENTER_DATA and GF_OMP_TARGET_KIND_EXIT_DATA, renumber GF_OMP_TARGET_KIND_OACC_{PARALLEL,KERNELS,DATA,UPDATE,ENTER_EXIT_DATA}. (gomp_critical): Add clauses field. (gomp_ordered): New struct. (is_a_helper <gomp_ordered *>::test): New inline. (gimple_build_omp_critical): Add CLAUSES argument. (gimple_build_omp_ordered): Likewise. Return gomp_ordered * instead of gimple *. (gimple_omp_critical_clauses, gimple_omp_critical_clauses_ptr, gimple_omp_critical_set_clauses, gimple_omp_ordered_clauses, gimple_omp_ordered_clauses_ptr, gimple_omp_ordered_set_clauses, gimple_omp_task_taskloop_p, gimple_omp_task_set_taskloop_p): New inline functions. * gimple-pretty-print.c (dump_gimple_omp_for): Handle taskloop. (dump_gimple_omp_target): Handle enter data and exit data. (dump_gimple_omp_block): Don't handle GIMPLE_OMP_ORDERED here. (dump_gimple_omp_critical): Print clauses. (dump_gimple_omp_ordered): New function. (dump_gimple_omp_task): Handle taskloop. (pp_gimple_stmt_1): Use dump_gimple_omp_ordered for GIMPLE_OMP_ORDERED. * gimple-walk.c (walk_gimple_op): Walk clauses on GIMPLE_OMP_CRITICAL and GIMPLE_OMP_ORDERED. * gimplify.c (enum gimplify_omp_var_data): Add GOVD_MAP_0LEN_ARRAY. (enum omp_region_type): Add ORT_COMBINED_TARGET and ORT_NONE. (struct gimplify_omp_ctx): Add loop_iter_var, target_map_scalars_firstprivate, target_map_pointers_as_0len_arrays and target_firstprivatize_array_bases fields. (delete_omp_context): Release loop_iter_var. (gimplify_bind_expr): Handle ORT_NONE. (maybe_fold_stmt): Adjust check for ORT_TARGET for the addition of ORT_COMBINED_TARGET. (is_gimple_stmt): Return true for OMP_TASKLOOP, OMP_TEAMS and OMP_TARGET{,_DATA,_UPDATE,_ENTER_DATA,_EXIT_DATA}. (omp_firstprivatize_variable): Handle ORT_NONE. Adjust check for ORT_TARGET for the addition of ORT_COMBINED_TARGET. Handle ctx->target_map_scalars_firstprivate. (omp_add_variable): Handle ORT_NONE. Allow map clause together with data sharing clauses. For data sharing clause with VLA decl on omp target/target data don't add firstprivate for the pointer. Call omp_notice_variable on TYPE_SIZE_UNIT only if it is a DECL_P. (omp_notice_threadprivate_variable): Adjust check for ORT_TARGET for the addition of ORT_COMBINED_TARGET. (omp_notice_variable): Handle ORT_NONE. Adjust check for ORT_TARGET for the addition of ORT_COMBINED_TARGET. Handle implicit mapping of pointers as zero length array sections and ctx->target_map_scalars_firstprivate mapping of scalars as firstprivate data sharing. (omp_check_private): Handle omp_member_access_dummy_var vars. (find_decl_expr): New function. (gimplify_scan_omp_clauses): Add CODE argument. For OMP_CLAUSE_IF complain if OMP_CLAUSE_IF_MODIFIER is present and does not match code. Handle OMP_CLAUSE_GANG separately. Handle OMP_CLAUSE_{PRIORITY,GRAINSIZE,NUM_TASKS,NOGROUP,THREADS,SIMD,SIMDLEN} clauses. Diagnose linear clause on combined distribute {, parallel for} simd construct, unless it is the loop iterator. Handle struct element GOMP_MAP_FIRSTPRIVATE_POINTER. Handle map clauses with COMPONENT_REF. Initialize ctx->target_map_scalars_firstprivate, ctx->target_firstprivatize_array_bases and ctx->target_map_pointers_as_0len_arrays. Add firstprivate for linear clause even to target region if combined. Remove map clauses with GOMP_MAP_FIRSTPRIVATE_POINTER kind from OMP_TARGET_{,ENTER_,EXIT_}DATA. For GOMP_MAP_FIRSTPRIVATE_POINTER map kind with non-INTEGER_CST OMP_CLAUSE_SIZE firstprivatize the bias. Handle OMP_CLAUSE_DEPEND_{SINK,SOURCE}. Handle OMP_CLAUSE_{{USE,IS}_DEVICE_PTR,DEFAULTMAP,HINT}. For linear clause on worksharing loop combined with parallel add shared clause on the parallel. Handle OMP_CLAUSE_REDUCTION with MEM_REF OMP_CLAUSE_DECL. Set DECL_NAME on omp_member_access_dummy_var vars. Add lastprivate clause to outer taskloop if needed. (gimplify_adjust_omp_clauses_1): Handle GOVD_MAP_0LEN_ARRAY. If gimplify_omp_ctxp->target_firstprivatize_array_bases, use GOMP_MAP_FIRSTPRIVATE_POINTER map kind instead of GOMP_MAP_POINTER. (gimplify_adjust_omp_clauses): Add CODE argument. Handle removal of GOMP_MAP_FIRSTPRIVATE_POINTER struct elements for struct not seen in target body. Handle removal of struct mapping if struct is not seen in target body. Remove GOMP_MAP_STRUCT map clause on OMP_TARGET_EXIT_DATA. Adjust check for ORT_TARGET for the addition of ORT_COMBINED_TARGET. Use GOMP_MAP_FIRSTPRIVATE_POINTER instead of GOMP_MAP_POINTER if ctx->target_firstprivatize_array_bases for VLAs. Set OMP_CLAUSE_MAP_PRIVATE if both data sharing and map clause appear together. Handle OMP_CLAUSE_{{USE,IS}_DEVICE_PTR,DEFAULTMAP,HINT}. Don't remove map clause if it has map-type-modifier always. Handle OMP_CLAUSE_{PRIORITY,GRAINSIZE,NUM_TASKS,NOGROUP,THREADS,SIMD,SIMDLEN} clauses. (gimplify_oacc_cache, gimplify_omp_parallel, gimplify_omp_task): Adjust gimplify_scan_omp_clauses and gimplify_adjust_omp_clauses callers. (gimplify_omp_for): Likewise. Handle OMP_TASKLOOP. Initialize loop_iter_var. Use OMP_FOR_ORIG_DECLS. Fix handling of lastprivate iterators in doacross loops. (gimplify_omp_workshare): Adjust gimplify_scan_omp_clauses and gimplify_adjust_omp_clauses callers. Use ORT_COMBINED_TARGET for OMP_TARGET_COMBINED. Adjust check for ORT_TARGET for the addition of ORT_COMBINED_TARGET. (gimplify_omp_target_update): Adjust gimplify_scan_omp_clauses and gimplify_adjust_omp_clauses callers. Handle OMP_TARGET_ENTER_DATA and OMP_TARGET_EXIT_DATA. (gimplify_omp_ordered): New function. (gimplify_expr): Handle OMP_TASKLOOP, OMP_TARGET_ENTER_DATA and OMP_TARGET_EXIT_DATA. Use gimplify_omp_ordered for OMP_ORDERED. Gimplify clauses on OMP_CRITICAL. * internal-fn.c (expand_GOMP_SIMD_ORDERED_START, expand_GOMP_SIMD_ORDERED_END): New functions. * internal-fn.def (GOMP_SIMD_ORDERED_START, GOMP_SIMD_ORDERED_END): New internal functions. * omp-builtins.def (BUILT_IN_GOMP_LOOP_DOACROSS_STATIC_START, BUILT_IN_GOMP_LOOP_DOACROSS_DYNAMIC_START, BUILT_IN_GOMP_LOOP_DOACROSS_GUIDED_START, BUILT_IN_GOMP_LOOP_DOACROSS_RUNTIME_START, BUILT_IN_GOMP_LOOP_ULL_DOACROSS_STATIC_START, BUILT_IN_GOMP_LOOP_ULL_DOACROSS_DYNAMIC_START, BUILT_IN_GOMP_LOOP_ULL_DOACROSS_GUIDED_START, BUILT_IN_GOMP_LOOP_ULL_DOACROSS_RUNTIME_START, BUILT_IN_GOMP_DOACROSS_POST, BUILT_IN_GOMP_DOACROSS_WAIT, BUILT_IN_GOMP_DOACROSS_ULL_POST, BUILT_IN_GOMP_DOACROSS_ULL_WAIT, BUILT_IN_GOMP_TARGET_ENTER_EXIT_DATA, BUILT_IN_GOMP_TASKLOOP, BUILT_IN_GOMP_TASKLOOP_ULL): New built-ins. (BUILT_IN_GOMP_TASK): Add INT argument to the end. (BUILT_IN_GOMP_TARGET): Rename from GOMP_target to GOMP_target_41, adjust type. (BUILT_IN_GOMP_TARGET_DATA): Rename from GOMP_target_data to GOMP_target_data_41, adjust type. (BUILT_IN_GOMP_TARGET_UPDATE): Rename from GOMP_target_update to GOMP_target_update_41, adjust type. * omp-low.c (struct omp_region): Adjust comments, add ord_stmt field. (struct omp_for_data): Add ordered and simd_schedule fields. (omp_member_access_dummy_var, unshare_and_remap_1, unshare_and_remap, is_taskloop_ctx): New functions. (is_taskreg_ctx): Use is_parallel_ctx and is_task_ctx. (extract_omp_for_data): Handle taskloops and doacross loops and simd schedule modifier. (omp_adjust_chunk_size): New function. (get_ws_args_for): Use it. (lookup_sfield): Change first argument to splay_tree_key, add overload with first argument tree. (maybe_lookup_field): Likewise. (use_pointer_for_field): Handle omp_member_access_dummy_var. (omp_copy_decl_2): If var is TREE_ADDRESSABLE listed in task_shared_vars, clear TREE_ADDRESSABLE on the copy. (build_outer_var_ref): Add LASTPRIVATE argument, handle taskloops and omp_member_access_dummy_var vars. (build_sender_ref): Change first argument to splay_tree_key, add overload with first argument tree. (install_var_field): For mask & 8 use &DECL_UID as key instead of the tree itself. (fixup_child_record_type): Const qualify *.omp_data_i. (scan_sharing_clauses): Handle OMP_CLAUSE_SHARED_FIRSTPRIVATE, C/C++ array reductions, OMP_CLAUSE_{IS,USE}_DEVICE_PTR clauses, OMP_CLAUSE_{PRIORITY,GRAINSIZE,NUM_TASKS,SIMDLEN,THREADS,SIMD} and OMP_CLAUSE_{NOGROUP,DEFAULTMAP} clauses, OMP_CLAUSE__LOOPTEMP_ clause on taskloop, GOMP_MAP_FIRSTPRIVATE_POINTER, OMP_CLAUSE_MAP_PRIVATE. (create_omp_child_function): Set TREE_READONLY on .omp_data_i. (find_combined_for): Allow searching for different GIMPLE_OMP_FOR kinds. (add_taskreg_looptemp_clauses): New function. (scan_omp_parallel): Use it. (scan_omp_task): Likewise. (finish_taskreg_scan): Handle OMP_CLAUSE_SHARED_FIRSTPRIVATE. For taskloop, move fields for the first two _LOOPTEMP_ clauses first. (check_omp_nesting_restrictions): Handle GF_OMP_TARGET_KIND_ENTER_DATA and GF_OMP_TARGET_KIND_EXIT_DATA. Formatting fixes. Allow the sandwiched taskloop constructs. Type check OMP_CLAUSE_DEPEND_{KIND,SOURCE}. Allow ordered simd inside of simd region. Diagnose depend(source) or depend(sink:...) on target constructs or task/taskloop. (handle_simd_reference): Use get_name. (lower_rec_input_clauses): Likewise. Ignore all OMP_CLAUSE_LASTPRIVATE_FIRSTPRIVATE clauses on taskloop construct. Allow _LOOPTEMP_ clause on GOMP_TASK. Unshare new_var before passing it to omp_clause_{default,copy}_ctor. Handle OMP_CLAUSE_REDUCTION with MEM_REF OMP_CLAUSE_DECL. Set lastprivate_firstprivate flag for linear that needs copyin and copyout. Use BUILT_IN_ALLOCA_WITH_ALIGN instead of BUILT_IN_ALLOCA. (lower_lastprivate_clauses): For OMP_CLAUSE_LASTPRIVATE_FIRSTPRIVATE on taskloop lookup decl in outer context. Pass true to build_outer_var_ref lastprivate argument. Handle OMP_CLAUSE_LASTPRIVATE_TASKLOOP_IV lastprivate if the decl is global outside of outer taskloop for. (lower_reduction_clauses): Handle OMP_CLAUSE_REDUCTION with MEM_REF OMP_CLAUSE_DECL. (lower_send_clauses): Ignore first two _LOOPTEMP_ clauses in taskloop GOMP_TASK. Handle OMP_CLAUSE_SHARED_FIRSTPRIVATE. Handle omp_member_access_dummy_var vars. Handle OMP_CLAUSE_REDUCTION with MEM_REF OMP_CLAUSE_DECL. Use new lookup_sfield overload. (lower_send_shared_vars): Ignore fields with NULL or FIELD_DECL abstract origin. Handle omp_member_access_dummy_var vars. (expand_parallel_call): Use expand_omp_build_assign. (expand_task_call): Handle taskloop construct expansion. Add REGION argument. Use GOMP_TASK_* defines instead of hardcoded integers. Add priority argument to GOMP_task* calls. Or in GOMP_TASK_FLAG_PRIORITY into flags if priority is present for GOMP_task call. (expand_omp_build_assign): Add prototype. Add AFTER argument, if true emit statements after *GSI_P and continue linking. (expand_omp_taskreg): Adjust expand_task_call caller. (expand_omp_for_init_counts): Rename zero_iter_bb argument to zero_iter1_bb and first_zero_iter to first_zero_iter1, add zero_iter2_bb and first_zero_iter2 arguments, handle computation of counts even for ordered loops. (expand_omp_for_init_vars): Handle GOMP_TASK inner_stmt. (expand_omp_ordered_source, expand_omp_ordered_sink, expand_omp_ordered_source_sink, expand_omp_for_ordered_loops): New functions. (expand_omp_for_generic): Use omp_adjust_chunk_size. Handle linear clauses on worksharing loop. Handle DOACROSS loop expansion. (expand_omp_for_static_nochunk): Handle linear clauses on worksharing loop. Adjust expand_omp_for_init_counts callers. (expand_omp_for_static_chunk): Likewise. Use omp_adjust_chunk_size. (expand_omp_simd): Handle addressable fd->loop.v. Adjust expand_omp_for_init_counts callers. (expand_omp_taskloop_for_outer, expand_omp_taskloop_for_inner): New functions. (expand_omp_for): Call expand_omp_taskloop_for_* for taskloop. Handle doacross loops. (expand_omp_target): Handle GF_OMP_TARGET_KIND_ENTER_DATA and GF_OMP_TARGET_KIND_EXIT_DATA. Pass flags and depend arguments to GOMP_target_{41,update_41,enter_exit_data} libcalls. (expand_omp): Don't expand ordered depend constructs here, record ord_stmt instead for later expand_omp_for_generic. (build_omp_regions_1): Handle GF_OMP_TARGET_KIND_ENTER_DATA and GF_OMP_TARGET_KIND_EXIT_DATA. Treat GIMPLE_OMP_ORDERED with depend clause as stand-alone directive. (lower_omp_ordered_clauses): New function. (lower_omp_ordered): Handle OMP_CLAUSE_SIMD, for OMP_CLAUSE_DEPEND don't lower anything. (lower_omp_for_lastprivate): Use last _looptemp_ clause on taskloop for comparison. (lower_omp_for): Handle taskloop constructs. Adjust OMP_CLAUSE_DECL and OMP_CLAUSE_LINEAR_STEP so that expand_omp_for_* can use it during expansion for linear adjustments. (create_task_copyfn): Handle OMP_CLAUSE_SHARED_FIRSTPRIVATE. (lower_depend_clauses): Assert not seeing sink/source depend kinds. Set TREE_ADDRESSABLE on array. Change first argument from gimple * to tree * pointing to the stmt's clauses. (lower_omp_taskreg): Adjust lower_depend_clauses caller. (lower_omp_target): Handle GF_OMP_TARGET_KIND_ENTER_DATA and GF_OMP_TARGET_KIND_EXIT_DATA, depend clauses, GOMP_MAP_{RELEASE,ALWAYS_{TO,FROM,TOFROM},FIRSTPRIVATE_POINTER,STRUCT} map kinds, OMP_CLAUSE_{FIRSTPRIVATE,PRIVATE,{IS,USE}_DEVICE_PTR clauses. Always use short kind and 8-bit align shift. (lower_omp_regimplify_p): Use IS_TYPE_OR_DECL_P macro. (struct lower_omp_regimplify_operands_data): New type. (lower_omp_regimplify_operands_p, lower_omp_regimplify_operands): New functions. (lower_omp_1): Use lower_omp_regimplify_operands instead of gimple_regimplify_operands. (make_gimple_omp_edges): Handle GF_OMP_TARGET_KIND_ENTER_DATA and GF_OMP_TARGET_KIND_EXIT_DATA. Treat GIMPLE_OMP_ORDERED with depend clause as stand-alone directive. (simd_clone_clauses_extract): Honor OMP_CLAUSE_LINEAR_KIND. (simd_clone_mangle): Mangle the various linear kinds per the new ABI. (simd_clone_adjust_argument_types): Handle SIMD_CLONE_ARG_TYPE_LINEAR_*_CONSTANT_STEP. (simd_clone_init_simd_arrays): Don't do anything for uval. (simd_clone_adjust): Handle SIMD_CLONE_ARG_TYPE_LINEAR_REF_CONSTANT_STEP like SIMD_CLONE_ARG_TYPE_LINEAR_CONSTANT_STEP. Handle SIMD_CLONE_ARG_TYPE_LINEAR_UVAL_CONSTANT_STEP. * omp-low.h (omp_member_access_dummy_var): New prototype. * passes.def (pass_simduid_cleanup): Schedule another copy of the pass after all optimizations. * tree.c (omp_clause_code_name): Add entries for OMP_CLAUSE_{TO_DECLARE,LINK,{USE,IS}_DEVICE_PTR,DEFAULTMAP,HINT} and OMP_CLAUSE_{PRIORITY,GRAINSIZE,NUM_TASKS,NOGROUP,THREADS,SIMD}. (omp_clause_num_ops): Likewise. Bump number of OMP_CLAUSE_REDUCTION arguments to 5 and for OMP_CLAUSE_ORDERED to 1. (walk_tree_1): Adjust for OMP_CLAUSE_ORDERED having 1 argument and OMP_CLAUSE_REDUCTION 5 arguments. Handle OMP_CLAUSE_{TO_DECLARE,LINK,{USE,IS}_DEVICE_PTR,DEFAULTMAP,HINT} and OMP_CLAUSE_{PRIORITY,GRAINSIZE,NUM_TASKS,NOGROUP,THREADS,SIMD} clauses. * tree-core.h (enum omp_clause_linear_kind): New. (struct tree_omp_clause): Change type of map_kind from unsigned char to unsigned int. Add subcode.if_modifier and subcode.linear_kind fields. (enum omp_clause_code): Add OMP_CLAUSE_{TO_DECLARE,LINK,{USE,IS}_DEVICE_PTR,DEFAULTMAP,HINT} and OMP_CLAUSE_{PRIORITY,GRAINSIZE,NUM_TASKS,NOGROUP,THREADS,SIMD}. (OMP_CLAUSE_REDUCTION): Document OMP_CLAUSE_REDUCTION_DECL_PLACEHOLDER. (enum omp_clause_depend_kind): Add OMP_CLAUSE_DEPEND_{SOURCE,SINK}. * tree.def (OMP_FOR): Add OMP_FOR_ORIG_DECLS operand. (OMP_CRITICAL): Move before OMP_SINGLE. Add OMP_CRITICAL_CLAUSES operand. (OMP_ORDERED): Move before OMP_SINGLE. Add OMP_ORDERED_CLAUSES operand. (OMP_TASKLOOP, OMP_TARGET_ENTER_DATA, OMP_TARGET_EXIT_DATA): New tree codes. * tree.h (OMP_BODY): Replace OMP_CRITICAL with OMP_TASKGROUP. (OMP_CLAUSE_SET_MAP_KIND): Cast to unsigned int rather than unsigned char. (OMP_CRITICAL_NAME): Adjust to be 3rd operand instead of 2nd. (OMP_CLAUSE_NUM_TASKS_EXPR): Formatting fix. (OMP_STANDALONE_CLAUSES): Adjust to cover OMP_TARGET_{ENTER,EXIT}_DATA. (OMP_CLAUSE_DEPEND_SINK_NEGATIVE, OMP_TARGET_COMBINED, OMP_CLAUSE_MAP_PRIVATE, OMP_FOR_ORIG_DECLS, OMP_CLAUSE_IF_MODIFIER, OMP_CLAUSE_MAP_MAYBE_ZERO_LENGTH_ARRAY_SECTION, OMP_CRITICAL_CLAUSES, OMP_CLAUSE_PRIVATE_TASKLOOP_IV, OMP_CLAUSE_LASTPRIVATE_TASKLOOP_IV, OMP_CLAUSE_HINT_EXPR, OMP_CLAUSE_SCHEDULE_SIMD, OMP_CLAUSE_LINEAR_KIND, OMP_CLAUSE_REDUCTION_DECL_PLACEHOLDER, OMP_CLAUSE_SHARED_FIRSTPRIVATE, OMP_ORDERED_CLAUSES, OMP_TARGET_ENTER_DATA_CLAUSES, OMP_TARGET_EXIT_DATA_CLAUSES, OMP_CLAUSE_NUM_TASKS_EXPR, OMP_CLAUSE_GRAINSIZE_EXPR, OMP_CLAUSE_PRIORITY_EXPR, OMP_CLAUSE_ORDERED_EXPR): Define. * tree-inline.c (remap_gimple_stmt): Handle clauses on GIMPLE_OMP_ORDERED and GIMPLE_OMP_CRITICAL. For IFN_GOMP_SIMD_ORDERED_{START,END} set has_simduid_loops. * tree-nested.c (convert_nonlocal_omp_clauses): Handle OMP_CLAUSE_{TO_DECLARE,LINK,{USE,IS}_DEVICE_PTR,SIMDLEN,PRIORITY,SIMD} and OMP_CLAUSE_{GRAINSIZE,NUM_TASKS,HINT,NOGROUP,THREADS,DEFAULTMAP} clauses. Handle OMP_CLAUSE_REDUCTION_DECL_PLACEHOLDER. (convert_local_omp_clauses): Likewise. * tree-pretty-print.c (dump_omp_clause): Handle OMP_CLAUSE_{TO_DECLARE,LINK,{USE,IS}_DEVICE_PTR,SIMDLEN,PRIORITY,SIMD} and OMP_CLAUSE_{GRAINSIZE,NUM_TASKS,HINT,NOGROUP,THREADS,DEFAULTMAP} clauses. Handle OMP_CLAUSE_IF_MODIFIER, OMP_CLAUSE_ORDERED_EXPR, OMP_CLAUSE_SCHEDULE_SIMD, OMP_CLAUSE_LINEAR_KIND, OMP_CLAUSE_DEPEND_{SOURCE,SINK}. Use "delete" for GOMP_MAP_FORCE_DEALLOC. Handle GOMP_MAP_{ALWAYS_{TO,FROM,TOFROM},RELEASE,FIRSTPRIVATE_POINTER,STRUCT}. (dump_generic_node): Handle OMP_TASKLOOP, OMP_TARGET_{ENTER,EXIT}_DATA and clauses on OMP_ORDERED and OMP_CRITICAL. * tree-vectorizer.c (adjust_simduid_builtins): Adjust comment. Remove IFN_GOMP_SIMD_ORDERED_{START,END}. (vectorize_loops): Adjust comments. (pass_simduid_cleanup::execute): Likewise. * tree-vect-stmts.c (vectorizable_simd_clone_call): Handle SIMD_CLONE_ARG_TYPE_LINEAR_{REF,VAL,UVAL}_CONSTANT_STEP. * wide-int.h (wi::gcd): New. gcc/c-family/ 2015-10-13 Jakub Jelinek <jakub@redhat.com> Aldy Hernandez <aldyh@redhat.com> * c-common.c (enum c_builtin_type): Define DEF_FUNCTION_TYPE_9, DEF_FUNCTION_TYPE_10 and DEF_FUNCTION_TYPE_11. (c_define_builtins): Likewise. * c-common.h (enum c_omp_clause_split): Add C_OMP_CLAUSE_SPLIT_TASKLOOP. (c_finish_omp_critical, c_finish_omp_ordered): Add CLAUSES argument. (c_finish_omp_for): Add ORIG_DECLV argument. * c-cppbuiltin.c (c_cpp_builtins): Predefine _OPENMP as 201511 instead of 201307. * c-omp.c (c_finish_omp_critical): Add CLAUSES argument, set OMP_CRITICAL_CLAUSES to it. (c_finish_omp_ordered): Add CLAUSES argument, set OMP_ORDERED_CLAUSES to it. (c_finish_omp_for): Add ORIG_DECLV argument, set OMP_FOR_ORIG_DECLS to it if OMP_FOR. Clear DECL_INITIAL on the IVs. (c_omp_split_clauses): Handle OpenMP 4.5 combined/composite constructs and new OpenMP 4.5 clauses. Clear OMP_CLAUSE_SCHEDULE_SIMD if not combined with OMP_SIMD. Add verification code. * c-pragma.c (omp_pragmas_simd): Add taskloop. * c-pragma.h (enum pragma_kind): Add PRAGMA_OMP_TASKLOOP. (enum pragma_omp_clause): Add PRAGMA_OMP_CLAUSE_{DEFAULTMAP,GRAINSIZE,HINT,{IS,USE}_DEVICE_PTR} and PRAGMA_OMP_CLAUSE_{LINK,NOGROUP,NUM_TASKS,PRIORITY,SIMD,THREADS}. gcc/c/ 2015-10-13 Jakub Jelinek <jakub@redhat.com> Aldy Hernandez <aldyh@redhat.com> * c-parser.c (c_parser_pragma): Handle PRAGMA_OMP_ORDERED here. (c_parser_omp_clause_name): Handle OpenMP 4.5 clauses. (c_parser_omp_variable_list): Handle structure elements for map, to and from clauses. Handle array sections in reduction clause. Formatting fixes. (c_parser_omp_clause_if): Add IS_OMP argument, handle parsing of if clause modifiers. (c_parser_omp_clause_num_tasks, c_parser_omp_clause_grainsize, c_parser_omp_clause_priority, c_parser_omp_clause_hint, c_parser_omp_clause_defaultmap, c_parser_omp_clause_use_device_ptr, c_parser_omp_clause_is_device_ptr): New functions. (c_parser_omp_clause_ordered): Parse optional parameter. (c_parser_omp_clause_reduction): Handle array reductions. (c_parser_omp_clause_schedule): Parse optional simd modifier. (c_parser_omp_clause_nogroup, c_parser_omp_clause_orderedkind): New functions. (c_parser_omp_clause_linear): Parse linear clause modifiers. (c_parser_omp_clause_depend_sink): New function. (c_parser_omp_clause_depend): Parse source/sink depend kinds. (c_parser_omp_clause_map): Parse release/delete map kinds and optional always modifier. (c_parser_oacc_all_clauses): Adjust c_parser_omp_clause_if and c_finish_omp_clauses callers. (c_parser_omp_all_clauses): Likewise. Parse OpenMP 4.5 clauses. Parse "to" as OMP_CLAUSE_TO_DECLARE if on declare target directive. (c_parser_oacc_cache): Adjust c_finish_omp_clauses caller. (OMP_CRITICAL_CLAUSE_MASK): Define. (c_parser_omp_critical): Parse critical clauses. (c_parser_omp_for_loop): Handle doacross loops, adjust c_finish_omp_for and c_finish_omp_clauses callers. (OMP_SIMD_CLAUSE_MASK): Add simdlen clause. (c_parser_omp_simd): Allow ordered clause if it has no parameter. (OMP_FOR_CLAUSE_MASK): Add linear clause. (c_parser_omp_for): Disallow ordered clause when combined with distribute. Disallow linear clause when combined with distribute and not combined with simd. (OMP_ORDERED_CLAUSE_MASK, OMP_ORDERED_DEPEND_CLAUSE_MASK): Define. (c_parser_omp_ordered): Add CONTEXT argument, remove LOC argument, parse clauses and if depend clause is found, don't parse a body. (c_parser_omp_parallel): Disallow copyin clause on target parallel. Allow target parallel without for after it. (OMP_TASK_CLAUSE_MASK): Add priority clause. (OMP_TARGET_DATA_CLAUSE_MASK): Add use_device_ptr clause. (c_parser_omp_target_data): Diagnose no map clauses or clauses with invalid kinds. (OMP_TARGET_UPDATE_CLAUSE_MASK): Add depend and nowait clauses. (OMP_TARGET_ENTER_DATA_CLAUSE_MASK, OMP_TARGET_EXIT_DATA_CLAUSE_MASK): Define. (c_parser_omp_target_enter_data, c_parser_omp_target_exit_data): New functions. (OMP_TARGET_CLAUSE_MASK): Add depend, nowait, private, firstprivate, defaultmap and is_device_ptr clauses. (c_parser_omp_target): Parse target parallel and target simd. Set OMP_TARGET_COMBINED on combined constructs. Parse target enter data and target exit data. Diagnose invalid map kinds. (OMP_DECLARE_TARGET_CLAUSE_MASK): Define. (c_parser_omp_declare_target): Parse OpenMP 4.5 forms of this construct. (c_parser_omp_declare_reduction): Use STRIP_NOPS when checking for &omp_priv. (OMP_TASKLOOP_CLAUSE_MASK): Define. (c_parser_omp_taskloop): New function. (c_parser_omp_construct): Don't handle PRAGMA_OMP_ORDERED here, handle PRAGMA_OMP_TASKLOOP. (c_parser_cilk_for): Adjust c_finish_omp_clauses callers. * c-tree.h (c_finish_omp_clauses): Add two new arguments. * c-typeck.c (handle_omp_array_sections_1): Fix comment typo. Add IS_OMP argument, handle structure element bases, diagnose bitfields, pass IS_OMP recursively, diagnose known zero length array sections in depend clauses, handle array sections in reduction clause, diagnose negative length even for pointers. (handle_omp_array_sections): Add IS_OMP argument, use auto_vec for types, pass IS_OMP down to handle_omp_array_sections_1, handle array sections in reduction clause, set OMP_CLAUSE_MAP_MAYBE_ZERO_LENGTH_ARRAY_SECTION if map could be zero length array section, use GOMP_MAP_FIRSTPRIVATE_POINTER for IS_OMP. (c_finish_omp_clauses): Add IS_OMP and DECLARE_SIMD arguments. Handle new OpenMP 4.5 clauses and new restrictions for the old ones. gcc/cp/ 2015-10-13 Jakub Jelinek <jakub@redhat.com> Aldy Hernandez <aldyh@redhat.com> * class.c (finish_struct_1): Call finish_omp_declare_simd_methods. * cp-gimplify.c (cp_gimplify_expr): Handle OMP_TASKLOOP. (cp_genericize_r): Likewise. (cxx_omp_finish_clause): Don't diagnose references. (cxx_omp_disregard_value_expr): New function. * cp-objcp-common.h (LANG_HOOKS_OMP_DISREGARD_VALUE_EXPR): Redefine. * cp-tree.h (OMP_FOR_GIMPLIFYING_P): Document for OMP_TASKLOOP. (DECL_OMP_PRIVATIZED_MEMBER): Define. (finish_omp_declare_simd_methods, push_omp_privatization_clauses, pop_omp_privatization_clauses, save_omp_privatization_clauses, restore_omp_privatization_clauses, omp_privatize_field, cxx_omp_disregard_value_expr): New prototypes. (finish_omp_clauses): Add two new arguments. (finish_omp_for): Add ORIG_DECLV argument. * parser.c (cp_parser_lambda_body): Call save_omp_privatization_clauses and restore_omp_privatization_clauses. (cp_parser_omp_clause_name): Handle OpenMP 4.5 clauses. (cp_parser_omp_var_list_no_open): Handle structure elements for map, to and from clauses. Handle array sections in reduction clause. Parse this keyword. Formatting fixes. (cp_parser_omp_clause_if): Add IS_OMP argument, handle parsing of if clause modifiers. (cp_parser_omp_clause_num_tasks, cp_parser_omp_clause_grainsize, cp_parser_omp_clause_priority, cp_parser_omp_clause_hint, cp_parser_omp_clause_defaultmap): New functions. (cp_parser_omp_clause_ordered): Parse optional parameter. (cp_parser_omp_clause_reduction): Handle array reductions. (cp_parser_omp_clause_schedule): Parse optional simd modifier. (cp_parser_omp_clause_nogroup, cp_parser_omp_clause_orderedkind): New functions. (cp_parser_omp_clause_linear): Parse linear clause modifiers. (cp_parser_omp_clause_depend_sink): New function. (cp_parser_omp_clause_depend): Parse source/sink depend kinds. (cp_parser_omp_clause_map): Parse release/delete map kinds and optional always modifier. (cp_parser_oacc_all_clauses): Adjust cp_parser_omp_clause_if and finish_omp_clauses callers. (cp_parser_omp_all_clauses): Likewise. Parse OpenMP 4.5 clauses. Parse "to" as OMP_CLAUSE_TO_DECLARE if on declare target directive. (OMP_CRITICAL_CLAUSE_MASK): Define. (cp_parser_omp_critical): Parse critical clauses. (cp_parser_omp_for_incr): Use cp_tree_equal if processing_template_decl. (cp_parser_omp_for_loop_init): Return tree instead of bool. Handle non-static data member iterators. (cp_parser_omp_for_loop): Handle doacross loops, adjust finish_omp_for and finish_omp_clauses callers. (cp_omp_split_clauses): Adjust finish_omp_clauses caller. (OMP_SIMD_CLAUSE_MASK): Add simdlen clause. (cp_parser_omp_simd): Allow ordered clause if it has no parameter. (OMP_FOR_CLAUSE_MASK): Add linear clause. (cp_parser_omp_for): Disallow ordered clause when combined with distribute. Disallow linear clause when combined with distribute and not combined with simd. (OMP_ORDERED_CLAUSE_MASK, OMP_ORDERED_DEPEND_CLAUSE_MASK): Define. (cp_parser_omp_ordered): Add CONTEXT argument, return bool instead of tree, parse clauses and if depend clause is found, don't parse a body. (cp_parser_omp_parallel): Disallow copyin clause on target parallel. Allow target parallel without for after it. (OMP_TASK_CLAUSE_MASK): Add priority clause. (OMP_TARGET_DATA_CLAUSE_MASK): Add use_device_ptr clause. (cp_parser_omp_target_data): Diagnose no map clauses or clauses with invalid kinds. (OMP_TARGET_UPDATE_CLAUSE_MASK): Add depend and nowait clauses. (OMP_TARGET_ENTER_DATA_CLAUSE_MASK, OMP_TARGET_EXIT_DATA_CLAUSE_MASK): Define. (cp_parser_omp_target_enter_data, cp_parser_omp_target_exit_data): New functions. (OMP_TARGET_CLAUSE_MASK): Add depend, nowait, private, firstprivate, defaultmap and is_device_ptr clauses. (cp_parser_omp_target): Parse target parallel and target simd. Set OMP_TARGET_COMBINED on combined constructs. Parse target enter data and target exit data. Diagnose invalid map kinds. (cp_parser_oacc_cache): Adjust finish_omp_clauses caller. (OMP_DECLARE_TARGET_CLAUSE_MASK): Define. (cp_parser_omp_declare_target): Parse OpenMP 4.5 forms of this construct. (OMP_TASKLOOP_CLAUSE_MASK): Define. (cp_parser_omp_taskloop): New function. (cp_parser_omp_construct): Don't handle PRAGMA_OMP_ORDERED here, handle PRAGMA_OMP_TASKLOOP. (cp_parser_pragma): Handle PRAGMA_OMP_ORDERED here directly, handle PRAGMA_OMP_TASKLOOP, call push_omp_privatization_clauses and pop_omp_privatization_clauses around parsing calls. (cp_parser_cilk_for): Adjust finish_omp_clauses caller. * pt.c (apply_late_template_attributes): Adjust tsubst_omp_clauses and finish_omp_clauses callers. (tsubst_omp_clause_decl): Return NULL if decl is NULL. For TREE_LIST, copy over OMP_CLAUSE_DEPEND_SINK_NEGATIVE bit. Use tsubst_expr instead of tsubst_copy, undo convert_from_reference effects. (tsubst_omp_clauses): Add ALLOW_FIELDS argument. Handle new OpenMP 4.5 clauses. Use tsubst_omp_clause_decl for more clauses. If ALLOW_FIELDS, handle non-static data members in the clauses. Clear OMP_CLAUSE_LINEAR_STEP if it has been cleared before. (omp_parallel_combined_clauses): New variable. (tsubst_omp_for_iterator): Add ORIG_DECLV argument, recur on OMP_FOR_ORIG_DECLS, handle non-static data member iterators. Improve handling of clauses on combined constructs. (tsubst_expr): Call push_omp_privatization_clauses and pop_omp_privatization_clauses around instantiation of certain OpenMP constructs, improve handling of clauses on combined constructs, handle OMP_TASKLOOP, adjust tsubst_omp_for_iterator, tsubst_omp_clauses and finish_omp_for callers, handle clauses on critical and ordered, handle OMP_TARGET_{ENTER,EXIT}_DATA. (instantiate_decl): Call save_omp_privatization_clauses and restore_omp_privatization_clauses around instantiation. (dependent_omp_for_p): Fix up comment typo. Handle SCOPE_REF. * semantics.c (omp_private_member_map, omp_private_member_vec, omp_private_member_ignore_next): New variables. (finish_non_static_data_member): Return dummy decl for privatized non-static data members. (omp_clause_decl_field, omp_clause_printable_decl, omp_note_field_privatization, omp_privatize_field): New functions. (handle_omp_array_sections_1): Fix comment typo. Add IS_OMP argument, handle structure element bases, diagnose bitfields, pass IS_OMP recursively, diagnose known zero length array sections in depend clauses, handle array sections in reduction clause, diagnose negative length even for pointers. (handle_omp_array_sections): Add IS_OMP argument, use auto_vec for types, pass IS_OMP down to handle_omp_array_sections_1, handle array sections in reduction clause, set OMP_CLAUSE_MAP_MAYBE_ZERO_LENGTH_ARRAY_SECTION if map could be zero length array section, use GOMP_MAP_FIRSTPRIVATE_POINTER for IS_OMP. (finish_omp_reduction_clause): Handle array sections and arrays. Use omp_clause_printable_decl. (finish_omp_declare_simd_methods, cp_finish_omp_clause_depend_sink): New functions. (finish_omp_clauses): Add ALLOW_FIELDS and DECLARE_SIMD arguments. Handle new OpenMP 4.5 clauses and new restrictions for the old ones, handle non-static data members, reject this keyword when not allowed. (push_omp_privatization_clauses, pop_omp_privatization_clauses, save_omp_privatization_clauses, restore_omp_privatization_clauses): New functions. (handle_omp_for_class_iterator): Handle OMP_TASKLOOP class iterators. Add collapse and ordered arguments. Fix handling of lastprivate iterators in doacross loops. (finish_omp_for): Add ORIG_DECLV argument, handle doacross loops, adjust c_finish_omp_for, handle_omp_for_class_iterator and finish_omp_clauses callers. Fill in OMP_CLAUSE_LINEAR_STEP on simd loops with non-static data member iterators. gcc/fortran/ 2015-10-13 Jakub Jelinek <jakub@redhat.com> Ilya Verbin <ilya.verbin@intel.com> * f95-lang.c (DEF_FUNCTION_TYPE_9, DEF_FUNCTION_TYPE_10, DEF_FUNCTION_TYPE_11, DEF_FUNCTION_TYPE_VAR_1): Define. * trans-openmp.c (gfc_trans_omp_clauses): Set OMP_CLAUSE_IF_MODIFIER to ERROR_MARK, OMP_CLAUSE_ORDERED_EXPR to NULL. (gfc_trans_omp_critical): Adjust for addition of clauses. (gfc_trans_omp_ordered): Likewise. * types.def (BT_FN_BOOL_UINT_LONGPTR_LONGPTR_LONGPTR, BT_FN_BOOL_UINT_ULLPTR_ULLPTR_ULLPTR, BT_FN_BOOL_UINT_LONGPTR_LONG_LONGPTR_LONGPTR, BT_FN_BOOL_UINT_ULLPTR_ULL_ULLPTR_ULLPTR, BT_FN_VOID_INT_SIZE_PTR_PTR_PTR_UINT_PTR, BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR, BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_BOOL_UINT_PTR_INT, BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_UINT_LONG_INT_LONG_LONG_LONG, BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_UINT_LONG_INT_ULL_ULL_ULL, BT_FN_VOID_LONG_VAR, BT_FN_VOID_ULL_VAR): New. (BT_FN_VOID_INT_PTR_SIZE_PTR_PTR_PTR, BT_FN_VOID_INT_OMPFN_PTR_SIZE_PTR_PTR_PTR, BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_BOOL_UINT_PTR): Remove. gcc/lto/ 2015-10-13 Jakub Jelinek <jakub@redhat.com> * lto-lang.c (DEF_FUNCTION_TYPE_9, DEF_FUNCTION_TYPE_10, DEF_FUNCTION_TYPE_11): Define. gcc/jit/ 2015-10-13 Jakub Jelinek <jakub@redhat.com> * jit-builtins.c (DEF_FUNCTION_TYPE_9, DEF_FUNCTION_TYPE_10, DEF_FUNCTION_TYPE_11): Define. * jit-builtins.h (DEF_FUNCTION_TYPE_9, DEF_FUNCTION_TYPE_10, DEF_FUNCTION_TYPE_11): Define. gcc/ada/ 2015-10-13 Jakub Jelinek <jakub@redhat.com> * gcc-interface/utils.c (DEF_FUNCTION_TYPE_9, DEF_FUNCTION_TYPE_10, DEF_FUNCTION_TYPE_11): Define. gcc/testsuite/ 2015-10-13 Jakub Jelinek <jakub@redhat.com> Aldy Hernandez <aldyh@redhat.com> * c-c++-common/gomp/cancel-1.c (f2): Add map clause to target data. * c-c++-common/gomp/clauses-1.c: New test. * c-c++-common/gomp/clauses-2.c: New test. * c-c++-common/gomp/clauses-3.c: New test. * c-c++-common/gomp/clauses-4.c: New test. * c-c++-common/gomp/declare-target-1.c: New test. * c-c++-common/gomp/declare-target-2.c: New test. * c-c++-common/gomp/depend-3.c: New test. * c-c++-common/gomp/depend-4.c: New test. * c-c++-common/gomp/doacross-1.c: New test. * c-c++-common/gomp/if-1.c: New test. * c-c++-common/gomp/if-2.c: New test. * c-c++-common/gomp/linear-1.c: New test. * c-c++-common/gomp/map-2.c: New test. * c-c++-common/gomp/map-3.c: New test. * c-c++-common/gomp/nesting-1.c (f_omp_parallel, f_omp_target_data): Add map clause to target data. * c-c++-common/gomp/nesting-warn-1.c (f_omp_target): Likewise. * c-c++-common/gomp/ordered-1.c: New test. * c-c++-common/gomp/ordered-2.c: New test. * c-c++-common/gomp/ordered-3.c: New test. * c-c++-common/gomp/pr61486-1.c (foo): Remove linear clause on non-iterator. * c-c++-common/gomp/pr61486-2.c (test, test2): Remove ordered clause and ordered construct where no longer allowed. * c-c++-common/gomp/priority-1.c: New test. * c-c++-common/gomp/reduction-1.c: New test. * c-c++-common/gomp/schedule-simd-1.c: New test. * c-c++-common/gomp/sink-1.c: New test. * c-c++-common/gomp/sink-2.c: New test. * c-c++-common/gomp/sink-3.c: New test. * c-c++-common/gomp/sink-4.c: New test. * c-c++-common/gomp/udr-1.c: New test. * c-c++-common/taskloop-1.c: New test. * c-c++-common/cpp/openmp-define-3.c: Adjust for the new value of _OPENMP macro. * c-c++-common/cilk-plus/PS/body.c (foo): Adjust expected diagnostics. * c-c++-common/goacc-gomp/nesting-fail-1.c (f_acc_parallel, f_acc_kernels, f_acc_data, f_acc_loop): Add map clause to target data. * gcc.dg/gomp/clause-1.c: * gcc.dg/gomp/reduction-1.c: New test. * gcc.dg/gomp/sink-fold-1.c: New test. * gcc.dg/gomp/sink-fold-2.c: New test. * gcc.dg/gomp/sink-fold-3.c: New test. * gcc.dg/vect/vect-simd-clone-15.c: New test. * g++.dg/gomp/clause-1.C (T::test): Remove dg-error on privatization of non-static data members. * g++.dg/gomp/clause-3.C (foo): Remove one dg-error directive. Add some linear clause tests. * g++.dg/gomp/declare-simd-3.C: New test. * g++.dg/gomp/linear-1.C: New test. * g++.dg/gomp/member-1.C: New test. * g++.dg/gomp/member-2.C: New test. * g++.dg/gomp/pr66571-2.C: New test. * g++.dg/gomp/pr67504.C (foo): Add test for ordered clause with dependent argument. * g++.dg/gomp/pr67522.C (foo): Add test for invalid array section in reduction clause. * g++.dg/gomp/reference-1.C: New test. * g++.dg/gomp/sink-1.C: New test. * g++.dg/gomp/sink-2.C: New test. * g++.dg/gomp/sink-3.C: New test. * g++.dg/gomp/task-1.C: Remove both dg-error directives. * g++.dg/gomp/this-1.C: New test. * g++.dg/gomp/this-2.C: New test. * g++.dg/vect/simd-clone-2.cc: New test. * g++.dg/vect/simd-clone-2.h: New test. * g++.dg/vect/simd-clone-3.cc: New test. * g++.dg/vect/simd-clone-4.cc: New test. * g++.dg/vect/simd-clone-4.h: New test. * g++.dg/vect/simd-clone-5.cc: New test. include/ 2015-10-13 Jakub Jelinek <jakub@redhat.com> Ilya Verbin <ilya.verbin@intel.com> * gomp-constants.h (GOMP_MAP_FLAG_ALWAYS): Define. (enum gomp_map_kind): Add GOMP_MAP_FIRSTPRIVATE, GOMP_MAP_FIRSTPRIVATE_INT, GOMP_MAP_USE_DEVICE_PTR, GOMP_MAP_ZERO_LEN_ARRAY_SECTION, GOMP_MAP_ALWAYS_TO, GOMP_MAP_ALWAYS_FROM, GOMP_MAP_ALWAYS_TOFROM, GOMP_MAP_STRUCT, GOMP_MAP_DELETE_ZERO_LEN_ARRAY_SECTION, GOMP_MAP_DELETE, GOMP_MAP_RELEASE, GOMP_MAP_FIRSTPRIVATE_POINTER. (GOMP_MAP_ALWAYS_TO_P, GOMP_MAP_ALWAYS_FROM_P): Define. (GOMP_TASK_FLAG_UNTIED, GOMP_TASK_FLAG_FINAL, GOMP_TASK_FLAG_MERGEABLE, GOMP_TASK_FLAG_DEPEND, GOMP_TASK_FLAG_PRIORITY, GOMP_TASK_FLAG_UP, GOMP_TASK_FLAG_GRAINSIZE, GOMP_TASK_FLAG_IF, GOMP_TASK_FLAG_NOGROUP, GOMP_TARGET_FLAG_NOWAIT, GOMP_TARGET_FLAG_EXIT_DATA, GOMP_TARGET_FLAG_UPDATE): Define. libgomp/ 2015-10-13 Jakub Jelinek <jakub@redhat.com> Aldy Hernandez <aldyh@redhat.com> Ilya Verbin <ilya.verbin@intel.com> * config/linux/affinity.c (omp_get_place_num_procs, omp_get_place_proc_ids, gomp_get_place_proc_ids_8): New functions. * config/linux/doacross.h: New file. * config/posix/affinity.c (omp_get_place_num_procs, omp_get_place_proc_ids, gomp_get_place_proc_ids_8): New functions. * config/posix/doacross.h: New file. * env.c: Include gomp-constants.h. (struct gomp_task_icv): Rename run_sched_modifier to run_sched_chunk_size. (gomp_max_task_priority_var): New variable. (parse_schedule): Rename run_sched_modifier to run_sched_chunk_size. (handle_omp_display_env): Change _OPENMP value from 201307 to 201511. Print OMP_MAX_TASK_PRIORITY. (initialize_env): Parse OMP_MAX_TASK_PRIORITY. (omp_set_schedule, omp_get_schedule): Rename modifier argument to chunk_size and run_sched_modifier to run_sched_chunk_size. (omp_get_max_task_priority, omp_get_initial_device, omp_get_num_places, omp_get_place_num, omp_get_partition_num_places, omp_get_partition_place_nums): New functions. * fortran.c (omp_set_schedule_, omp_set_schedule_8_, omp_get_schedule_, omp_get_schedule_8_): Rename modifier argument to chunk_size. (omp_get_num_places_, omp_get_place_num_procs_, omp_get_place_num_procs_8_, omp_get_place_proc_ids_, omp_get_place_proc_ids_8_, omp_get_place_num_, omp_get_partition_num_places_, omp_get_partition_place_nums_, omp_get_partition_place_nums_8_, omp_get_initial_device_, omp_get_max_task_priority_): New functions. * libgomp_g.h (GOMP_loop_doacross_static_start, GOMP_loop_doacross_dynamic_start, GOMP_loop_doacross_guided_start, GOMP_loop_doacross_runtime_start, GOMP_loop_ull_doacross_static_start, GOMP_loop_ull_doacross_dynamic_start, GOMP_loop_ull_doacross_guided_start, GOMP_loop_ull_doacross_runtime_start, GOMP_doacross_post, GOMP_doacross_wait, GOMP_doacross_ull_post, GOMP_doacross_wait, GOMP_taskloop, GOMP_taskloop_ull, GOMP_target_41, GOMP_target_data_41, GOMP_target_update_41, GOMP_target_enter_exit_data): New prototypes. (GOMP_task): Add prototype argument. * libgomp.h (_LIBGOMP_CHECKING_): Define to 0 if not yet defined. (struct gomp_doacross_work_share): New type. (struct gomp_work_share): Add doacross field. (struct gomp_task_icv): Rename run_sched_modifier to run_sched_chunk_size. (enum gomp_task_kind): Rename GOMP_TASK_IFFALSE to GOMP_TASK_UNDEFERRED. Add comments. (struct gomp_task_depend_entry): Add comments. (struct gomp_task): Likewise. (struct gomp_taskgroup): Likewise. (struct gomp_target_task): New type. (struct gomp_team): Add comment. (gomp_get_place_proc_ids_8, gomp_doacross_init, gomp_doacross_ull_init, gomp_task_maybe_wait_for_dependencies, gomp_create_target_task, gomp_target_task_fn): New prototypes. (struct target_var_desc): New type. (struct target_mem_desc): Adjust comment. Use struct target_var_desc instead of splay_tree_key for list. (REFCOUNT_INFINITY): Define. (struct splay_tree_key_s): Remove copy_from field. (struct gomp_device_descr): Add dev2dev_func field. (enum gomp_map_vars_kind): New enum. (gomp_map_vars): Add one argument. * libgomp.map (OMP_4.5): Export omp_get_max_task_priority, omp_get_max_task_priority_, omp_get_num_places, omp_get_num_places_, omp_get_place_num_procs, omp_get_place_num_procs_, omp_get_place_num_procs_8_, omp_get_place_proc_ids, omp_get_place_proc_ids_, omp_get_place_proc_ids_8_, omp_get_place_num, omp_get_place_num_, omp_get_partition_num_places, omp_get_partition_num_places_, omp_get_partition_place_nums, omp_get_partition_place_nums_, omp_get_partition_place_nums_8_, omp_get_initial_device, omp_get_initial_device_, omp_target_alloc, omp_target_free, omp_target_is_present, omp_target_memcpy, omp_target_memcpy_rect, omp_target_associate_ptr and omp_target_disassociate_ptr. (GOMP_4.0.2): Renamed to ... (GOMP_4.5): ... this. Export GOMP_target_41, GOMP_target_data_41, GOMP_target_update_41, GOMP_target_enter_exit_data, GOMP_taskloop, GOMP_taskloop_ull, GOMP_loop_doacross_dynamic_start, GOMP_loop_doacross_guided_start, GOMP_loop_doacross_runtime_start, GOMP_loop_doacross_static_start, GOMP_doacross_post, GOMP_doacross_wait, GOMP_loop_ull_doacross_dynamic_start, GOMP_loop_ull_doacross_guided_start, GOMP_loop_ull_doacross_runtime_start, GOMP_loop_ull_doacross_static_start, GOMP_doacross_ull_post and GOMP_doacross_ull_wait. * libgomp.texi: Document omp_get_max_task_priority. Rename modifier argument to chunk_size for omp_set_schedule and omp_get_schedule. Document OMP_MAX_TASK_PRIORITY env var. * loop.c (GOMP_loop_runtime_start): Adjust for run_sched_modifier to run_sched_chunk_size renaming. (GOMP_loop_ordered_runtime_start): Likewise. (gomp_loop_doacross_static_start, gomp_loop_doacross_dynamic_start, gomp_loop_doacross_guided_start, GOMP_loop_doacross_runtime_start, GOMP_parallel_loop_runtime_start): New functions. (GOMP_parallel_loop_runtime): Adjust for run_sched_modifier to run_sched_chunk_size renaming. (GOMP_loop_doacross_static_start, GOMP_loop_doacross_dynamic_start, GOMP_loop_doacross_guided_start): New functions or aliases. * loop_ull.c (GOMP_loop_ull_runtime_start): Adjust for run_sched_modifier to run_sched_chunk_size renaming. (GOMP_loop_ull_ordered_runtime_start): Likewise. (gomp_loop_ull_doacross_static_start, gomp_loop_ull_doacross_dynamic_start, gomp_loop_ull_doacross_guided_start, GOMP_loop_ull_doacross_runtime_start): New functions. (GOMP_loop_ull_doacross_static_start, GOMP_loop_ull_doacross_dynamic_start, GOMP_loop_ull_doacross_guided_start): New functions or aliases. * oacc-mem.c (acc_map_data, present_create_copy, gomp_acc_insert_pointer): Pass GOMP_MAP_VARS_OPENACC instead of false to gomp_map_vars. (gomp_acc_remove_pointer): Use copy_from from target_var_desc. * oacc-parallel.c (GOACC_data_start): Pass GOMP_MAP_VARS_OPENACC instead of false to gomp_map_vars. (GOACC_parallel_keyed): Likewise. Use copy_from from target_var_desc. * omp.h.in (omp_lock_hint_t): New type. (omp_init_lock_with_hint, omp_init_nest_lock_with_hint, omp_get_num_places, omp_get_place_num_procs, omp_get_place_proc_ids, omp_get_place_num, omp_get_partition_num_places, omp_get_partition_place_nums, omp_get_initial_device, omp_get_max_task_priority, omp_target_alloc, omp_target_free, omp_target_is_present, omp_target_memcpy, omp_target_memcpy_rect, omp_target_associate_ptr, omp_target_disassociate_ptr): New prototypes. * omp_lib.f90.in (omp_lock_hint_kind): New parameter. (omp_lock_hint_none, omp_lock_hint_uncontended, omp_lock_hint_contended, omp_lock_hint_nonspeculative, omp_lock_hint_speculative): New parameters. (omp_init_lock_with_hint, omp_init_nest_lock_with_hint, omp_get_num_places, omp_get_place_num_procs, omp_get_place_proc_ids, omp_get_place_num, omp_get_partition_num_places, omp_get_partition_place_nums, omp_get_initial_device, omp_get_max_task_priority): New interfaces. (omp_set_schedule, omp_get_schedule): Rename modifier argument to chunk_size. * omp_lib.h.in (omp_lock_hint_kind): New parameter. (omp_lock_hint_none, omp_lock_hint_uncontended, omp_lock_hint_contended, omp_lock_hint_nonspeculative, omp_lock_hint_speculative): New parameters. (omp_init_lock_with_hint, omp_init_nest_lock_with_hint, omp_get_num_places, omp_get_place_num_procs, omp_get_place_proc_ids, omp_get_place_num, omp_get_partition_num_places, omp_get_partition_place_nums, omp_get_initial_device, omp_get_max_task_priority): New functions and subroutines. * ordered.c: Include stdarg.h and string.h. (MAX_COLLAPSED_BITS): Define. (gomp_doacross_init, GOMP_doacross_post, GOMP_doacross_wait, gomp_doacross_ull_init, GOMP_doacross_ull_post, GOMP_doacross_ull_wait): New functions. * target.c: Include errno.h. (resolve_device): If device is not initialized, call gomp_init_device on it. (gomp_map_lookup): New function. (gomp_map_vars_existing): Add tgt_var argument, fill it in. Don't bump refcount if REFCOUNT_INFINITY. Handle GOMP_MAP_ALWAYS_TO_P. (get_kind): Rename is_openacc argument to short_mapkind. (gomp_map_pointer): Use gomp_map_lookup. (gomp_map_fields_existing): New function. (gomp_map_vars): Rename is_openacc argument to short_mapkind and is_target to pragma_kind. Handle GOMP_MAP_VARS_ENTER_DATA, handle GOMP_MAP_FIRSTPRIVATE_INT, GOMP_MAP_STRUCT, GOMP_MAP_USE_DEVICE_PTR, GOMP_MAP_ZERO_LEN_ARRAY_SECTION. Adjust for tgt->list changed type and copy_from living in there. (gomp_copy_from_async): Adjust for tgt->list changed type and copy_from living in there. (gomp_unmap_vars): Likewise. (gomp_update): Likewise. Rename is_openacc argument to short_mapkind. Don't fail if object is not mapped. (gomp_load_image_to_device): Initialize refcount to REFCOUNT_INFINITY. (gomp_target_fallback): New function. (gomp_get_target_fn_addr): Likewise. (GOMP_target): Adjust gomp_map_vars caller, use gomp_get_target_fn_addr and gomp_target_fallback. (GOMP_target_41): New function. (gomp_target_data_fallback): New function. (GOMP_target_data): Use it, adjust gomp_map_vars caller. (GOMP_target_data_41): New function. (GOMP_target_update): Adjust gomp_update caller. (GOMP_target_update_41): New function. (gomp_exit_data, GOMP_target_enter_exit_data, gomp_target_task_fn, omp_target_alloc, omp_target_free, omp_target_is_present, omp_target_memcpy, omp_target_memcpy_rect_worker, omp_target_memcpy_rect, omp_target_associate_ptr, omp_target_disassociate_ptr, gomp_load_plugin_for_device): New functions. * task.c: Include gomp-constants.h. Include taskloop.c twice to get GOMP_taskloop and GOMP_taskloop_ull definitions. (gomp_task_handle_depend): New function. (GOMP_task): Use it. Add priority argument. Use gomp-constant.h constants instead of hardcoded numbers. Rename GOMP_TASK_IFFALSE to GOMP_TASK_UNDEFERRED. (gomp_create_target_task): New function. (verify_children_queue, verify_taskgroup_queue, verify_task_queue): New functions. (gomp_task_run_pre): Call verify_*_queue functions. If an upcoming tied task is about to leave the sibling or taskgroup queues in an invalid state, adjust appropriately. Remove taskgroup argument. Add comments. (gomp_task_run_post_handle_dependers): Add comments. (gomp_task_run_post_remove_parent): Likewise. (gomp_barrier_handle_tasks): Adjust gomp_task_run_pre caller. (GOMP_taskwait): Likewise. Add comments. (gomp_task_maybe_wait_for_dependencies): Fix scheduling problem such that the first non parent_depends_on task does not end up at the end of the children queue. (GOMP_taskgroup_start): Rename GOMP_TASK_IFFALSE to GOMP_TASK_UNDEFERRED. (GOMP_taskgroup_end): Adjust gomp_task_run_pre caller. * taskloop.c: New file. * testsuite/lib/libgomp.exp (check_effective_target_offload_device_nonshared_as): New proc. * testsuite/libgomp.c/affinity-2.c: New test. * testsuite/libgomp.c/doacross-1.c: New test. * testsuite/libgomp.c/doacross-2.c: New test. * testsuite/libgomp.c/examples-4/declare_target-1.c (fib_wrapper): Add map clause to target. * testsuite/libgomp.c/examples-4/declare_target-4.c (accum): Likewise. * testsuite/libgomp.c/examples-4/declare_target-5.c (accum): Likewise. * testsuite/libgomp.c/examples-4/device-1.c (main): Likewise. * testsuite/libgomp.c/examples-4/device-3.c (main): Likewise. * testsuite/libgomp.c/examples-4/target_data-3.c (gramSchmidt): Likewise. * testsuite/libgomp.c/examples-4/teams-2.c (dotprod): Likewise. * testsuite/libgomp.c/examples-4/teams-3.c (dotprod): Likewise. * testsuite/libgomp.c/examples-4/teams-4.c (dotprod): Likewise. * testsuite/libgomp.c/for-2.h (OMPTGT, OMPTO, OMPFROM): Define if not defined. Use those where needed. * testsuite/libgomp.c/for-4.c: New test. * testsuite/libgomp.c/for-5.c: New test. * testsuite/libgomp.c/for-6.c: New test. * testsuite/libgomp.c/linear-1.c: New test. * testsuite/libgomp.c/ordered-4.c: New test. * testsuite/libgomp.c/pr66199-2.c (f2): Adjust for linear clause only allowed on the loop iterator. * testsuite/libgomp.c/pr66199-3.c: New test. * testsuite/libgomp.c/pr66199-4.c: New test. * testsuite/libgomp.c/reduction-7.c: New test. * testsuite/libgomp.c/reduction-8.c: New test. * testsuite/libgomp.c/reduction-9.c: New test. * testsuite/libgomp.c/reduction-10.c: New test. * testsuite/libgomp.c/target-1.c (fn2, fn3, fn4): Add map(tofrom:s). * testsuite/libgomp.c/target-2.c (fn2, fn3, fn4): Likewise. * testsuite/libgomp.c/target-7.c (foo): Add map(h) where needed. * testsuite/libgomp.c/target-11.c: New test. * testsuite/libgomp.c/target-12.c: New test. * testsuite/libgomp.c/target-13.c: New test. * testsuite/libgomp.c/target-14.c: New test. * testsuite/libgomp.c/target-15.c: New test. * testsuite/libgomp.c/target-16.c: New test. * testsuite/libgomp.c/target-17.c: New test. * testsuite/libgomp.c/target-18.c: New test. * testsuite/libgomp.c/target-19.c: New test. * testsuite/libgomp.c/target-20.c: New test. * testsuite/libgomp.c/target-21.c: New test. * testsuite/libgomp.c/target-22.c: New test. * testsuite/libgomp.c/target-23.c: New test. * testsuite/libgomp.c/target-24.c: New test. * testsuite/libgomp.c/target-25.c: New test. * testsuite/libgomp.c/target-26.c: New test. * testsuite/libgomp.c/target-27.c: New test. * testsuite/libgomp.c/taskloop-1.c: New test. * testsuite/libgomp.c/taskloop-2.c: New test. * testsuite/libgomp.c/taskloop-3.c: New test. * testsuite/libgomp.c/taskloop-4.c: New test. * testsuite/libgomp.c++/ctor-13.C: New test. * testsuite/libgomp.c++/doacross-1.C: New test. * testsuite/libgomp.c++/examples-4/declare_target-2.C: Replace offload_device with offload_device_nonshared_as. * testsuite/libgomp.c++/for-12.C: New test. * testsuite/libgomp.c++/for-13.C: New test. * testsuite/libgomp.c++/for-14.C: New test. * testsuite/libgomp.c++/linear-1.C: New test. * testsuite/libgomp.c++/member-1.C: New test. * testsuite/libgomp.c++/member-2.C: New test. * testsuite/libgomp.c++/member-3.C: New test. * testsuite/libgomp.c++/member-4.C: New test. * testsuite/libgomp.c++/member-5.C: New test. * testsuite/libgomp.c++/ordered-1.C: New test. * testsuite/libgomp.c++/reduction-5.C: New test. * testsuite/libgomp.c++/reduction-6.C: New test. * testsuite/libgomp.c++/reduction-7.C: New test. * testsuite/libgomp.c++/reduction-8.C: New test. * testsuite/libgomp.c++/reduction-9.C: New test. * testsuite/libgomp.c++/reduction-10.C: New test. * testsuite/libgomp.c++/reference-1.C: New test. * testsuite/libgomp.c++/simd14.C: New test. * testsuite/libgomp.c++/target-2.C (fn2): Add map(tofrom: s) clause. * testsuite/libgomp.c++/target-5.C: New test. * testsuite/libgomp.c++/target-6.C: New test. * testsuite/libgomp.c++/target-7.C: New test. * testsuite/libgomp.c++/target-8.C: New test. * testsuite/libgomp.c++/target-9.C: New test. * testsuite/libgomp.c++/target-10.C: New test. * testsuite/libgomp.c++/target-11.C: New test. * testsuite/libgomp.c++/target-12.C: New test. * testsuite/libgomp.c++/taskloop-1.C: New test. * testsuite/libgomp.c++/taskloop-2.C: New test. * testsuite/libgomp.c++/taskloop-3.C: New test. * testsuite/libgomp.c++/taskloop-4.C: New test. * testsuite/libgomp.c++/taskloop-5.C: New test. * testsuite/libgomp.c++/taskloop-6.C: New test. * testsuite/libgomp.c++/taskloop-7.C: New test. * testsuite/libgomp.c++/taskloop-8.C: New test. * testsuite/libgomp.c++/taskloop-9.C: New test. * testsuite/libgomp.fortran/affinity1.f90: New test. * testsuite/libgomp.fortran/affinity2.f90: New test. liboffloadmic/ 2015-10-13 Ilya Verbin <ilya.verbin@intel.com> * plugin/libgomp-plugin-intelmic.cpp (GOMP_OFFLOAD_dev2dev): New function. * plugin/offload_target_main.cpp (__offload_target_tgt2tgt): New static function, register it in liboffloadmic. From-SVN: r228777
2015-10-13 21:06:23 +02:00
NEXT_PASS (pass_simduid_cleanup);
Commit the vtable verification feature. Commit the vtable verification feature. This feature is designed to detect, at run time, if/when the vtable pointer in a C++ object has been corrupted, before allowing virtual calls through that pointer. If pointer corruption is detected, execution of the program is halted. libstdc++-v3 ChangeLog: 2013-08-06 Caroline Tice <cmtice@google.com> * fragment.am: Add XTEMPLATE_FLAGS. * configure.ac: Add definitions for --enable-vtable-verify. * acinclude.m4: Add --enable-vtable-verify and --disable-vtable-verify; define --enable-vtable-verify; define VTV_CXXFLAGS, VTV_PCH_CXXFLAGS and VTV_CXXLINKFLAGS. * config/abi/pre/gnu.ver: Export symbols for vtable verification. * libsupc++/Makefile.am: Define vtv_sources and add it to libsupc___la_SOURCES and libsupc__convenience_la_SOURCES. * libsupc++/vtv_stubs.cc: New file. * include/Makefile.am: Add VTV_PCH_CXXFLAGS to PCHFLAGS. * src/Makefile.am: Add VTV_CXXFLAGS to AM_CXXFLAGS; add VTV_CXXLINKFLAGS to CXXLINK. * src/c++98/Makefile.am: Comment out XTEMPLATE_FLAGS; add VTV_CXXFLAGS to AM_CXXFLAGS; add VTV_CXXXLINKFLAGS to CXXLINK. * src/C++11/Makefile.am: Ditto. * doc/xml/manual/configure.xml: Add entry for --enable-vtable-verify. * scripts/testsuite_flags.in: Add cxxvtvflags to Usage; cause cxxvtvflags to use VTV_CXXFLAGS and VTV_CXXLINKFLAGS. * testsuite/lib/libstdc++.exp: Add cxxvtvflags; add code to locate libvtv if --enable-vtable-verify was used; set cxxvtvflags; add cxxvtvflags to cxx_final. * testsuite/18_support/bad_exception/23591_thread-1.c: Add -fvtable-verify=none to compiler flags. * testsuite/17_intro/freestanding.cc: Add -fvtable-verify=none to compiler flags. * configure: Regenerated. * Makefile.in: Regenerated. * python/Makefile.in: Regenerated. * include/Makefile.in: Regenerated. * libsupc++/Makefile.in: Regenerated. * config.h.in: Regenerated. * po/Makefile.in: Regenerated. * src/Makefile.in: Regenerated. * src/c++98/Makefile.in: Regenerated. * src/c++11/Makefile.in: Regenerated. * doc/Makefile.in: Regenerated. * testsuite/Makefile.in: Regenerated. top level ChangeLog: 2013-08-06 Caroline Tice <cmtice@google.com> * configure.ac: Add target-libvtv to target_libraries; disable libvtv on non-linux systems; add target-libvtv to noconfigdirs; add libsupc++/.libs to C++ library search paths. * configure: Regenerated. * Makefile.def: Add libvtv to target_modules; make libvtv depend on libstdc++ and libgcc. * Makefile.in: Regenerated. include/ChangeLog: 2013-08-06 Caroline Tice <cmtice@google.com> * vtv-change-permission.h: New file. contrib/ChangeLog: 2013-08-06 Caroline Tice4 <cmtice@google.com> * gcc_update: Add libvtv files. libgcc/ChangeLog: 2013-08-06 Caroline Tice <cmtice@google.com> config.host (extra_parts): Add vtv_start.o, vtv_end.o vtv_start_preinit.o and vtv_end_preinit.o. configure.ac: Add code to check/set enable_vtable_verify. Makefile.in: Add rules to build vtv_*.o, if enable_vtable_verify is true. vtv_start_preinit.c: New file. vtv_end_preinit.c: New file. vtv_start.c: New file. vtv_end.c: New file. configure: Regenerated. gcc/ChangeLog: 2013-08-06 Caroline Tice <cmtice@google.com> * gcc.c (VTABLE_VERIFICATION_SPEC): New definition. (LINK_COMMAND_SPEC): Add VTABLE_VERIFICATION_SPEC. * tree-pass.h: Add pass_vtable_verify. * varasm.c (assemble_variable): Add code to properly set the comdat section and name for the .vtable_map_vars section. (assemble_vtyv_preinit_initializer): New function. (default_sectin_type_flags): Make sure .vtable_map_vars section has LINK_ONCE flag. * output.h: Add function decl for assemble_vtv_preinit_initializer. * vtable-verify.c: New file. * vtable-verify.h: New file. * flag-types.h (enum vtv_priority): Defintions for flag_vtable_verify initialiation levels. * timevar.def (TV_VTABLE_VERIFICATION): New definition. * passes.def: Insert pass_vtable_verify. * aclocal.m4: Reorder includes. * doc/invoke.texi: Add documentation for the flags -fvtable-verify=, -fvtv-debug and -fvtv-counts. * config/gnu-user.h (GNU_USER_TARGET_STARTFILE_SPEC): Add vtv_start*.o, as appropriate, if -fvtable-verify=... is used. (GNU_USER_TARGET_ENDFILE_SPEC): Add vtv_end*.o as appropriate, if -fvtable-verify=... is used. * Makefile.in (OBJS): Add vtable-verify.o to list. (vtable-verify.o): Add new build rule. (GTFILES): Add vtable-verify.c to list. * common.opt (fvtable-verify=): New flag. (vtv_priority): Values for fvtable-verify= flag. (fvtv-counts): New flag. (fvtv-debug): New flag. * tree.h (save_vtable_map_decl): New extern function decl. gcc/cp/ChangeLog: 2013-08-06 Caroline Tice <cmtice@google.com> * Make-lang.in (*CXX_AND_OBJCXX_OBJS): Add vtable-class-hierarchy.o to list. (vtable-class-hierarchy.o): Add build rule. * cp-tree.h (vtv_start_verification_constructor_init_function): New extern function decl. (vtv_finish_verification_constructor_init_function): New extern function decl. (build_vtbl_address): New extern function decl. (get_mangled_vtable_map_var_name): New extern function decl. (vtv_compute_class_hierarchy_transitive_closure): New extern function decl. (vtv_generate_init_routine): New extern function decl. (vtv_save_class_info): New extern function decl. (vtv_recover_class_info): New extern function decl. (vtv_build_vtable_verify_fndecl): New extern function decl. * class.c (finish_struct_1): Add call to vtv_save_class_info if flag_vtable_verify is true. * config-lang.in: Add vtable-class-hierarchy.c to gtfiles list. * vtable-class-hierarchy.c: New file. * mangle.c (get_mangled_vtable_map_var_name): New function. * decl2.c (start_objects): Update function comment. (cp_write_global_declarations): Call vtv_recover_class_info, vtv_compute_class_hierarchy_transitive_closure and vtv_build_vtable_verify_fndecl, before calling finalize_compilation_unit, and call vtv_generate_init_rount after, IFF flag_vtable_verify is true. (vtv_start_verification_constructor_init_function): New function. (vtv_finish_verification_constructor_init_function): New function. * init.c (build_vtbl_address): Remove static qualifier from function. libvtv/ChangeLog: 2013-08-06 Caroline Tice <cmtice@google.com> Initial check-in of new vtable verification feature. * configure.ac : New file. * acinclude.m4 : New file. * Makefile.am : New file. * aclocal.m4 : New file. * configure.tgt : New file. * configure: New file (generated). * Makefile.in: New file (generated). * vtv_set.h : New file. * vtv_utils.cc : New file. * vtv_utils.h : New file. * vtv_malloc.cc : New file. * vtv_rts.cc : New file. * vtv_malloc.h : New file. * vtv_rts.h : New file. * vtv_fail.cc : New file. * vtv_fail.h : New file. * vtv_map.h : New file. * scripts/run-testsuite.sh : New file. * scripts/sum-vtv-counts.c : New file. * testsuite/parts-test-main.h : New file. * testusite/dataentry.cc : New file. * testsuite/temp_deriv.cc : New file. * testsuite/register_pair.cc : New file. * testsuite/virtual_inheritance.cc : New file. * testsuite/field-test.cc : New file. * testsuite/nested_vcall_test.cc : New file. * testsuite/template-list-iostream.cc : New file. * testsuite/register_pair_inserts.cc : New file. * testsuite/register_pair_inserts_mt.cc : New file. * testsuite/event.list : New file. * testsuite/parts-test-extra-parts-views.cc : New file. * testsuite/parts-test-extra-parts-views.h : New file. * testsuite/environment-fail-32.s : New file. * testsuite/parts-test-extra-parts.h : New file. * testsuite/temp_deriv2.cc : New file. * testsuite/dlopen_mt.cc : New file. * testsuite/event.h : New file. * testsuite/template-list.cc : New file. * testsuite/replace-fail.cc : New file. * testsuite/Makefile.am : New file. * testsuite/Makefile.in: New file (generated). * testsuite/mempool_negative.c : New file. * testsuite/parts-test-main.cc : New file. * testsuite/event-private.cc : New file. * testsuite/thunk.cc : New file. * testsuite/event-defintiions.cc : New file. * testsuite/event-private.h : New file. * testsuite/parts-test.list : New file. * testusite/register_pair_mt.cc : New file. * testsuite/povray-derived.cc : New file. * testsuite/event-main.cc : New file. * testsuite/environment.cc : New file. * testsuite/template-list2.cc : New file. * testsuite/thunk_vtable_map_attack.cc : New file. * testsuite/parts-test-extra-parts.cc : New file. * testsuite/environment-fail-64.s : New file. * testsuite/dlopen.cc : New file. * testsuite/so.cc : New file. * testsuite/temp_deriv3.cc : New file. * testsuite/const_vtable.cc : New file. * testsuite/mempool_positive.c : New file. * testsuite/dup_name.cc : New file. From-SVN: r201555
2013-08-07 05:38:59 +02:00
NEXT_PASS (pass_vtable_verify);
NEXT_PASS (pass_lower_vaarg);
NEXT_PASS (pass_lower_vector);
NEXT_PASS (pass_lower_complex_O0);
NEXT_PASS (pass_sancov_O0);
NEXT_PASS (pass_lower_switch_O0);
NEXT_PASS (pass_asan_O0);
NEXT_PASS (pass_tsan_O0);
bootstrap-ubsan.mk (POSTSTAGE1_LDFLAGS): Add -ldl. config/ * bootstrap-ubsan.mk (POSTSTAGE1_LDFLAGS): Add -ldl. gcc/c-family/ * c-ubsan.c (ubsan_instrument_division): Adjust ubsan_create_data call. (ubsan_instrument_shift): Likewise. (ubsan_instrument_vla): Likewise. gcc/ * opts.c (common_handle_option): Add -fsanitize=null option. Turn off -fdelete-null-pointer-checks option when doing the NULL pointer checking. * sanitizer.def (BUILT_IN_UBSAN_HANDLE_TYPE_MISMATCH): Add. * tree-pass.h (make_pass_ubsan): Declare. (make_pass_sanopt): Declare. * timevar.def (TV_TREE_UBSAN): New timevar. * passes.def: Add pass_sanopt and pass_ubsan. * ubsan.h (ubsan_null_ckind): New enum. (ubsan_mismatch_data): New struct. (ubsan_expand_null_ifn): Declare. (ubsan_create_data): Adjust declaration. (ubsan_type_descriptor): Likewise. * asan.c: Include "ubsan.h". (pass_data_sanopt): New pass. (execute_sanopt): New function. (gate_sanopt): Likewise. (make_pass_sanopt): Likewise. (class pass_sanopt): New class. * ubsan.c: Include tree-pass.h, gimple-ssa.h, gimple-walk.h, gimple-iterator.h and cfgloop.h. (PROB_VERY_UNLIKELY): Define. (tree_type_map_hash): New function. (ubsan_type_descriptor): Add new parameter. Improve type name generation. (ubsan_create_data): Add new parameter. Add pointer data into ubsan structure. (ubsan_expand_null_ifn): New function. (instrument_member_call): Likewise. (instrument_mem_ref): Likewise. (instrument_null): Likewise. (ubsan_pass): Likewise. (gate_ubsan): Likewise. (make_pass_ubsan): Likewise. (ubsan_instrument_unreachable): Adjust ubsan_create_data call. (class pass_ubsan): New class. (pass_data_ubsan): New pass. * flag-types.h (enum sanitize_code): Add SANITIZE_NULL. * internal-fn.c (expand_UBSAN_NULL): New function. * cgraphunit.c (varpool_finalize_decl): Call varpool_assemble_decl even when !flag_toplevel_reorder. * internal-fn.def (UBSAN_NULL): New. gcc/testsuite/ * c-c++-common/ubsan/null-1.c: New test. * c-c++-common/ubsan/null-2.c: New test. * c-c++-common/ubsan/null-3.c: New test. * c-c++-common/ubsan/null-4.c: New test. * c-c++-common/ubsan/null-5.c: New test. * c-c++-common/ubsan/null-6.c: New test. * c-c++-common/ubsan/null-7.c: New test. * c-c++-common/ubsan/null-8.c: New test. * c-c++-common/ubsan/null-9.c: New test. * c-c++-common/ubsan/null-10.c: New test. * c-c++-common/ubsan/null-11.c: New test. * gcc.dg/ubsan/c99-shift-2.c: Adjust dg-output. * c-c++-common/ubsan/shift-1.c: Likewise. * c-c++-common/ubsan/div-by-zero-3.c: Likewise. From-SVN: r205021
2013-11-19 12:45:15 +01:00
NEXT_PASS (pass_sanopt);
NEXT_PASS (pass_cleanup_eh);
NEXT_PASS (pass_lower_resx);
NEXT_PASS (pass_nrv);
Lower VEC_COND_EXPR into internal functions. gcc/ChangeLog: * Makefile.in: Add new file. * expr.c (expand_expr_real_2): Add gcc_unreachable as we should not meet this condition. (do_store_flag): Likewise. * gimplify.c (gimplify_expr): Gimplify first argument of VEC_COND_EXPR to be a SSA name. * internal-fn.c (vec_cond_mask_direct): New. (vec_cond_direct): Likewise. (vec_condu_direct): Likewise. (vec_condeq_direct): Likewise. (expand_vect_cond_optab_fn): New. (expand_vec_cond_optab_fn): Likewise. (expand_vec_condu_optab_fn): Likewise. (expand_vec_condeq_optab_fn): Likewise. (expand_vect_cond_mask_optab_fn): Likewise. (expand_vec_cond_mask_optab_fn): Likewise. (direct_vec_cond_mask_optab_supported_p): Likewise. (direct_vec_cond_optab_supported_p): Likewise. (direct_vec_condu_optab_supported_p): Likewise. (direct_vec_condeq_optab_supported_p): Likewise. * internal-fn.def (VCOND): New OPTAB. (VCONDU): Likewise. (VCONDEQ): Likewise. (VCOND_MASK): Likewise. * optabs.c (get_rtx_code): Make it global. (expand_vec_cond_mask_expr): Removed. (expand_vec_cond_expr): Removed. * optabs.h (expand_vec_cond_expr): Likewise. (vector_compare_rtx): Make it global. * passes.def: Add new pass_gimple_isel pass. * tree-cfg.c (verify_gimple_assign_ternary): Add check for VEC_COND_EXPR about first argument. * tree-pass.h (make_pass_gimple_isel): New. * tree-ssa-forwprop.c (pass_forwprop::execute): Prevent propagation of the first argument of a VEC_COND_EXPR. * tree-ssa-reassoc.c (ovce_extract_ops): Support SSA_NAME as first argument of a VEC_COND_EXPR. (optimize_vec_cond_expr): Likewise. * tree-vect-generic.c (expand_vector_divmod): Make SSA_NAME for a first argument of created VEC_COND_EXPR. (expand_vector_condition): Fix coding style. * tree-vect-stmts.c (vectorizable_condition): Gimplify first argument. * gimple-isel.cc: New file. gcc/testsuite/ChangeLog: * g++.dg/vect/vec-cond-expr-eh.C: New test.
2020-03-09 13:23:03 +01:00
NEXT_PASS (pass_gimple_isel);
NEXT_PASS (pass_harden_conditional_branches);
NEXT_PASS (pass_harden_compares);
NEXT_PASS (pass_warn_access, /*early=*/false);
NEXT_PASS (pass_cleanup_cfg_post_optimizing);
NEXT_PASS (pass_warn_function_noreturn);
NEXT_PASS (pass_expand);
NEXT_PASS (pass_rest_of_compilation);
PUSH_INSERT_PASSES_WITHIN (pass_rest_of_compilation)
NEXT_PASS (pass_instantiate_virtual_regs);
NEXT_PASS (pass_into_cfg_layout_mode);
NEXT_PASS (pass_jump);
NEXT_PASS (pass_lower_subreg);
NEXT_PASS (pass_df_initialize_opt);
NEXT_PASS (pass_cse);
NEXT_PASS (pass_rtl_fwprop);
NEXT_PASS (pass_rtl_cprop);
NEXT_PASS (pass_rtl_pre);
NEXT_PASS (pass_rtl_hoist);
NEXT_PASS (pass_rtl_cprop);
NEXT_PASS (pass_rtl_store_motion);
NEXT_PASS (pass_cse_after_global_opts);
NEXT_PASS (pass_rtl_ifcvt);
NEXT_PASS (pass_reginfo_init);
/* Perform loop optimizations. It might be better to do them a bit
sooner, but we want the profile feedback to work more
efficiently. */
NEXT_PASS (pass_loop2);
PUSH_INSERT_PASSES_WITHIN (pass_loop2)
NEXT_PASS (pass_rtl_loop_init);
NEXT_PASS (pass_rtl_move_loop_invariants);
NEXT_PASS (pass_rtl_unroll_loops);
NEXT_PASS (pass_rtl_doloop);
NEXT_PASS (pass_rtl_loop_done);
POP_INSERT_PASSES ()
NEXT_PASS (pass_lower_subreg2);
NEXT_PASS (pass_web);
NEXT_PASS (pass_rtl_cprop);
NEXT_PASS (pass_cse2);
NEXT_PASS (pass_rtl_dse1);
NEXT_PASS (pass_rtl_fwprop_addr);
NEXT_PASS (pass_inc_dec);
NEXT_PASS (pass_initialize_regs);
NEXT_PASS (pass_ud_rtl_dce);
NEXT_PASS (pass_combine);
NEXT_PASS (pass_if_after_combine);
Move jump threading before reload r266734 has introduced a new instance of jump threading pass in order to take advantage of opportunities that combine opens up. It was perceived back then that it was beneficial to delay it after reload, since that might produce even more such opportunities. Unfortunately jump threading interferes with hot/cold partitioning. In the code from PR92007, it converts the following +-------------------------- 2/HOT ------------------------+ | | v v 3/HOT --> 5/HOT --> 8/HOT --> 11/COLD --> 6/HOT --EH--> 16/HOT | ^ | | +-------------------------------+ into the following: +---------------------- 2/HOT ------------------+ | | v v 3/HOT --> 8/HOT --> 11/COLD --> 6/COLD --EH--> 16/HOT This makes hot bb 6 dominated by cold bb 11, and because of this fixup_partitions makes bb 6 cold as well, which in turn makes EH edge 6->16 a crossing one. Not only can't we have crossing EH edges, we are also not allowed to introduce new crossing edges after reload in general, since it might require extra registers on some targets. Therefore, move the jump threading pass between combine and hot/cold partitioning. Building SPEC 2006 and SPEC 2017 with the old and the new code indicates that: * When doing jump threading right after reload, 3889 edges are threaded. * When doing jump threading right after combine, 3918 edges are threaded. This means this change will not introduce performance regressions. gcc/ChangeLog: 2019-10-28 Ilya Leoshkevich <iii@linux.ibm.com> PR rtl-optimization/92007 * cfgcleanup.c (thread_jump): Add an assertion that we don't call it after reload if hot/cold partitioning has been done. (class pass_postreload_jump): Rename to pass_jump_after_combine. (make_pass_postreload_jump): Rename to make_pass_jump_after_combine. * passes.def(pass_postreload_jump): Move before reload, rename to pass_jump_after_combine. * tree-pass.h (make_pass_postreload_jump): Rename to make_pass_jump_after_combine. gcc/testsuite/ChangeLog: 2019-10-28 Ilya Leoshkevich <iii@linux.ibm.com> PR rtl-optimization/92007 * g++.dg/opt/pr92007.C: New test (from Arseny Solokha). From-SVN: r277507
2019-10-28 11:04:31 +01:00
NEXT_PASS (pass_jump_after_combine);
NEXT_PASS (pass_partition_blocks);
NEXT_PASS (pass_outof_cfg_layout_mode);
NEXT_PASS (pass_split_all_insns);
NEXT_PASS (pass_lower_subreg3);
NEXT_PASS (pass_df_initialize_no_opt);
NEXT_PASS (pass_stack_ptr_mod);
NEXT_PASS (pass_mode_switching);
NEXT_PASS (pass_match_asm_constraints);
NEXT_PASS (pass_sms);
NEXT_PASS (pass_live_range_shrinkage);
NEXT_PASS (pass_sched);
Add an "early rematerialisation" pass This patch looks for pseudo registers that are live across a call and for which no call-preserved hard registers exist. It then recomputes the pseudos as necessary to ensure that they are no longer live across a call. The comment at the head of the file describes the approach. A new target hook selects which modes should be treated in this way. By default none are, in which case the pass is skipped very early. It might also be worth looking for cases like: C1: R1 := f (...) ... C2: R2 := f (...) C3: R1 := C2 and giving the same value number to C1 and C3, effectively treating it like: C1: R1 := f (...) ... C2: R2 := f (...) C3: R1 := f (...) Another (much more expensive) enhancement would be to apply value numbering to all pseudo registers (not just rematerialisation candidates), so that we can handle things like: C1: R1 := f (...R2...) ... C2: R1 := f (...R3...) where R2 and R3 hold the same value. But the current pass seems to catch the vast majority of cases. 2018-01-13 Richard Sandiford <richard.sandiford@linaro.org> gcc/ * Makefile.in (OBJS): Add early-remat.o. * target.def (select_early_remat_modes): New hook. * doc/tm.texi.in (TARGET_SELECT_EARLY_REMAT_MODES): New hook. * doc/tm.texi: Regenerate. * targhooks.h (default_select_early_remat_modes): Declare. * targhooks.c (default_select_early_remat_modes): New function. * timevar.def (TV_EARLY_REMAT): New timevar. * passes.def (pass_early_remat): New pass. * tree-pass.h (make_pass_early_remat): Declare. * early-remat.c: New file. * config/aarch64/aarch64.c (aarch64_select_early_remat_modes): New function. (TARGET_SELECT_EARLY_REMAT_MODES): Define. gcc/testsuite/ * gcc.target/aarch64/sve/spill_1.c: Also test that no predicates are spilled. * gcc.target/aarch64/sve/spill_2.c: New test. * gcc.target/aarch64/sve/spill_3.c: Likewise. * gcc.target/aarch64/sve/spill_4.c: Likewise. * gcc.target/aarch64/sve/spill_5.c: Likewise. * gcc.target/aarch64/sve/spill_6.c: Likewise. * gcc.target/aarch64/sve/spill_7.c: Likewise. From-SVN: r256636
2018-01-13 19:00:51 +01:00
NEXT_PASS (pass_early_remat);
NEXT_PASS (pass_ira);
NEXT_PASS (pass_reload);
NEXT_PASS (pass_postreload);
PUSH_INSERT_PASSES_WITHIN (pass_postreload)
NEXT_PASS (pass_postreload_cse);
NEXT_PASS (pass_gcse2);
NEXT_PASS (pass_split_after_reload);
NEXT_PASS (pass_ree);
NEXT_PASS (pass_compare_elim_after_reload);
NEXT_PASS (pass_thread_prologue_and_epilogue);
NEXT_PASS (pass_rtl_dse2);
NEXT_PASS (pass_stack_adjustments);
NEXT_PASS (pass_jump2);
NEXT_PASS (pass_duplicate_computed_gotos);
NEXT_PASS (pass_sched_fusion);
NEXT_PASS (pass_peephole2);
NEXT_PASS (pass_if_after_reload);
NEXT_PASS (pass_regrename);
NEXT_PASS (pass_cprop_hardreg);
NEXT_PASS (pass_fast_rtl_dce);
NEXT_PASS (pass_reorder_blocks);
NEXT_PASS (pass_leaf_regs);
NEXT_PASS (pass_split_before_sched2);
NEXT_PASS (pass_sched2);
NEXT_PASS (pass_stack_regs);
PUSH_INSERT_PASSES_WITHIN (pass_stack_regs)
NEXT_PASS (pass_split_before_regstack);
NEXT_PASS (pass_stack_regs_run);
POP_INSERT_PASSES ()
POP_INSERT_PASSES ()
NEXT_PASS (pass_late_compilation);
PUSH_INSERT_PASSES_WITHIN (pass_late_compilation)
Add -fzero-call-used-regs option and zero_call_used_regs function attributes. This new feature causes the compiler to zero a subset of all call-used registers at function return. This is used to increase program security by either mitigating Return-Oriented Programming (ROP) attacks or preventing information leakage through registers. gcc/ChangeLog: 2020-10-30 Qing Zhao <qing.zhao@oracle.com> H.J.Lu <hjl.tools@gmail.com> * common.opt: Add new option -fzero-call-used-regs * config/i386/i386.c (zero_call_used_regno_p): New function. (zero_call_used_regno_mode): Likewise. (zero_all_vector_registers): Likewise. (zero_all_st_registers): Likewise. (zero_all_mm_registers): Likewise. (ix86_zero_call_used_regs): Likewise. (TARGET_ZERO_CALL_USED_REGS): Define. * df-scan.c (df_epilogue_uses_p): New function. (df_get_exit_block_use_set): Replace EPILOGUE_USES with df_epilogue_uses_p. * df.h (df_epilogue_uses_p): Declare. * doc/extend.texi: Document the new zero_call_used_regs attribute. * doc/invoke.texi: Document the new -fzero-call-used-regs option. * doc/tm.texi: Regenerate. * doc/tm.texi.in (TARGET_ZERO_CALL_USED_REGS): New hook. * emit-rtl.h (struct rtl_data): New field must_be_zero_on_return. * flag-types.h (namespace zero_regs_flags): New namespace. * function.c (gen_call_used_regs_seq): New function. (class pass_zero_call_used_regs): New class. (pass_zero_call_used_regs::execute): New function. (make_pass_zero_call_used_regs): New function. * optabs.c (expand_asm_reg_clobber_mem_blockage): New function. * optabs.h (expand_asm_reg_clobber_mem_blockage): Declare. * opts.c (zero_call_used_regs_opts): New structure array initialization. (parse_zero_call_used_regs_options): New function. (common_handle_option): Handle -fzero-call-used-regs. * opts.h (zero_call_used_regs_opts): New structure array. * passes.def: Add new pass pass_zero_call_used_regs. * recog.c (valid_insn_p): New function. * recog.h (valid_insn_p): Declare. * resource.c (init_resource_info): Replace EPILOGUE_USES with df_epilogue_uses_p. * target.def (zero_call_used_regs): New hook. * targhooks.c (default_zero_call_used_regs): New function. * targhooks.h (default_zero_call_used_regs): Declare. * tree-pass.h (make_pass_zero_call_used_regs): Declare. gcc/c-family/ChangeLog: 2020-10-30 Qing Zhao <qing.zhao@oracle.com> H.J.Lu <hjl.tools@gmail.com> * c-attribs.c (c_common_attribute_table): Add new attribute zero_call_used_regs. (handle_zero_call_used_regs_attribute): New function. gcc/testsuite/ChangeLog: 2020-10-30 Qing Zhao <qing.zhao@oracle.com> H.J.Lu <hjl.tools@gmail.com> * c-c++-common/zero-scratch-regs-1.c: New test. * c-c++-common/zero-scratch-regs-10.c: New test. * c-c++-common/zero-scratch-regs-11.c: New test. * c-c++-common/zero-scratch-regs-2.c: New test. * c-c++-common/zero-scratch-regs-3.c: New test. * c-c++-common/zero-scratch-regs-4.c: New test. * c-c++-common/zero-scratch-regs-5.c: New test. * c-c++-common/zero-scratch-regs-6.c: New test. * c-c++-common/zero-scratch-regs-7.c: New test. * c-c++-common/zero-scratch-regs-8.c: New test. * c-c++-common/zero-scratch-regs-9.c: New test. * c-c++-common/zero-scratch-regs-attr-usages.c: New test. * gcc.target/i386/zero-scratch-regs-1.c: New test. * gcc.target/i386/zero-scratch-regs-10.c: New test. * gcc.target/i386/zero-scratch-regs-11.c: New test. * gcc.target/i386/zero-scratch-regs-12.c: New test. * gcc.target/i386/zero-scratch-regs-13.c: New test. * gcc.target/i386/zero-scratch-regs-14.c: New test. * gcc.target/i386/zero-scratch-regs-15.c: New test. * gcc.target/i386/zero-scratch-regs-16.c: New test. * gcc.target/i386/zero-scratch-regs-17.c: New test. * gcc.target/i386/zero-scratch-regs-18.c: New test. * gcc.target/i386/zero-scratch-regs-19.c: New test. * gcc.target/i386/zero-scratch-regs-2.c: New test. * gcc.target/i386/zero-scratch-regs-20.c: New test. * gcc.target/i386/zero-scratch-regs-21.c: New test. * gcc.target/i386/zero-scratch-regs-22.c: New test. * gcc.target/i386/zero-scratch-regs-23.c: New test. * gcc.target/i386/zero-scratch-regs-24.c: New test. * gcc.target/i386/zero-scratch-regs-25.c: New test. * gcc.target/i386/zero-scratch-regs-26.c: New test. * gcc.target/i386/zero-scratch-regs-27.c: New test. * gcc.target/i386/zero-scratch-regs-28.c: New test. * gcc.target/i386/zero-scratch-regs-29.c: New test. * gcc.target/i386/zero-scratch-regs-30.c: New test. * gcc.target/i386/zero-scratch-regs-31.c: New test. * gcc.target/i386/zero-scratch-regs-3.c: New test. * gcc.target/i386/zero-scratch-regs-4.c: New test. * gcc.target/i386/zero-scratch-regs-5.c: New test. * gcc.target/i386/zero-scratch-regs-6.c: New test. * gcc.target/i386/zero-scratch-regs-7.c: New test. * gcc.target/i386/zero-scratch-regs-8.c: New test. * gcc.target/i386/zero-scratch-regs-9.c: New test.
2020-10-30 20:41:38 +01:00
NEXT_PASS (pass_zero_call_used_regs);
NEXT_PASS (pass_compute_alignments);
NEXT_PASS (pass_variable_tracking);
NEXT_PASS (pass_free_cfg);
NEXT_PASS (pass_machine_reorg);
NEXT_PASS (pass_cleanup_barriers);
NEXT_PASS (pass_delay_slots);
NEXT_PASS (pass_split_for_shorten_branches);
NEXT_PASS (pass_convert_to_eh_region_ranges);
NEXT_PASS (pass_shorten_branches);
NEXT_PASS (pass_set_nothrow_function_flags);
NEXT_PASS (pass_dwarf2_frame);
NEXT_PASS (pass_final);
POP_INSERT_PASSES ()
NEXT_PASS (pass_df_finish);
POP_INSERT_PASSES ()
NEXT_PASS (pass_clean_state);
TERMINATE_PASS_LIST (all_passes)