This force-enables perfect forwarding call wrapper semantics whenever
the extra arguments of a partially applied range adaptor aren't all
trivially copyable, so as to avoid incurring unnecessary copies of
potentially expensive-to-copy objects (such as std::function objects)
when invoking the adaptor.
PR libstdc++/100940
libstdc++-v3/ChangeLog:
* include/std/ranges (__adaptor::_Partial): For the "simple"
forwarding partial specializations, also require that
the extra arguments are trivially copyable.
* testsuite/std/ranges/adaptors/100577.cc (test04): New test.
The _S_has_simple_extra_args mechanism is used to simplify forwarding
of a range adaptor's extra arguments when perfect forwarding call wrapper
semantics aren't required for correctness, on a per-adaptor basis.
Both views::take and views::drop are flagged as such, but it turns out
perfect forwarding semantics are needed for these adaptors in some
contrived cases, e.g. when their extra argument is a move-only class
that's implicitly convertible to an integral type.
To fix this, we could just clear the flag for views::take/drop as with
views::split, but that'd come at the cost of acceptable diagnostics
for ill-formed uses of these adaptors (see PR100577).
This patch instead allows adaptors to parameterize their
_S_has_simple_extra_args flag according to the types of the captured extra
arguments, so that we can conditionally disable perfect forwarding
semantics only when the types of the extra arguments permit it. We
then use this finer-grained mechanism to safely disable perfect
forwarding semantics for views::take/drop when the extra argument is
integer-like, rather than incorrectly always disabling it. Similarly,
for views::split, rather than always enabling perfect forwarding
semantics we now safely disable it when the extra argument is a scalar
or a view, and recover good diagnostics for these common cases.
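As a rough, minimal sketch of the shape this takes (illustrative only, not
the actual libstdc++ code; _TakeLike and the use of std::integral as the
integer-like check are stand-ins):

#include <concepts>

// Stand-in for an adaptor such as views::_Take: the flag becomes a
// variable template over the type of the captured extra argument.
struct _TakeLike
{
  template<typename _Tp>
    static constexpr bool _S_has_simple_extra_args = std::integral<_Tp>;
};

// Sketch of the constraint: simple forwarding is used only when the
// templatized flag evaluates to true for the captured argument types.
template<typename _Adaptor, typename... _Args>
  concept __has_simple_extra_args_like
    = requires { requires _Adaptor::template _S_has_simple_extra_args<_Args...>; };

static_assert(__has_simple_extra_args_like<_TakeLike, int>);
static_assert(!__has_simple_extra_args_like<_TakeLike, double*>);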
PR libstdc++/100940
libstdc++-v3/ChangeLog:
* include/std/ranges (__adaptor::_RangeAdaptor): Document the
template form of _S_has_simple_extra_args.
(__adaptor::__adaptor_has_simple_extra_args): Add _Args template
parameter pack. Try to treat _S_has_simple_extra_args as a
variable template parameterized by _Args.
(__adaptor::_Partial): Pass _Arg/_Args to the constraint
__adaptor_has_simple_extra_args.
(views::_Take::_S_has_simple_extra_args): Templatize according
to the type of the extra argument.
(views::_Drop::_S_has_simple_extra_args): Likewise.
(views::_Split::_S_has_simple_extra_args): Define.
* testsuite/std/ranges/adaptors/100577.cc (test01, test02):
Adjust after changes to _S_has_simple_extra_args mechanism.
(test03): Define.
This patch implements the OpenMP 5.0 requirement that the reference count
of a mapped structure be incremented/decremented at most once (across all
elements) on a construct.
This is implemented by pulling in libgomp/hashtab.h and using htab_t as a
pointer set. Structure element list siblings also have pointers-to-refcounts
linked together, to naturally achieve uniform increment/decrement without
repeating.
There are still open questions on whether such an htab_t based set is
faster or slower than a sorted pointer array based implementation.  This
is to be researched later.
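As a sketch of the refcount-once idea (std::unordered_set here is purely a
stand-in for the htab_t based pointer set; the real code lives in target.c
and hashtab.h):

#include <unordered_set>

struct mapping_key { unsigned long refcount; };  // simplified splay-tree key

// Bump a mapping's refcount at most once per construct: the per-construct
// set remembers which keys have already been counted.
static void
increment_refcount_once (mapping_key *k,
                         std::unordered_set<mapping_key *> &refcount_set)
{
  if (refcount_set.insert (k).second)   // true only on first insertion
    k->refcount++;
}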
libgomp/ChangeLog:
* hashtab.h (htab_clear): New function with initialization code
factored out from...
(htab_create): ...here, adjust to use htab_clear function.
* libgomp.h (REFCOUNT_SPECIAL): New symbol to denote range of
special refcount values, add comments.
(REFCOUNT_INFINITY): Adjust definition to use REFCOUNT_SPECIAL.
(REFCOUNT_LINK): Likewise.
(REFCOUNT_STRUCTELEM): New special refcount range for structure
element siblings.
(REFCOUNT_STRUCTELEM_P): Macro for testing for structure element
sibling maps.
(REFCOUNT_STRUCTELEM_FLAG_FIRST): Flag to indicate first sibling.
(REFCOUNT_STRUCTELEM_FLAG_LAST): Flag to indicate last sibling.
(REFCOUNT_STRUCTELEM_FIRST_P): Macro to test _FIRST flag.
(REFCOUNT_STRUCTELEM_LAST_P): Macro to test _LAST flag.
(struct splay_tree_key_s): Add structelem_refcount and
structelem_refcount_ptr fields into a union with dynamic_refcount.
Add comments.
(gomp_map_vars): Delete declaration.
(gomp_map_vars_async): Likewise.
(gomp_unmap_vars): Likewise.
(gomp_unmap_vars_async): Likewise.
(goacc_map_vars): New declaration.
(goacc_unmap_vars): Likewise.
* oacc-mem.c (acc_map_data): Adjust to use goacc_map_vars.
(goacc_enter_datum): Likewise.
(goacc_enter_data_internal): Likewise.
* oacc-parallel.c (GOACC_parallel_keyed): Adjust to use goacc_map_vars
and goacc_unmap_vars.
(GOACC_data_start): Adjust to use goacc_map_vars.
(GOACC_data_end): Adjust to use goacc_unmap_vars.
* target.c (hash_entry_type): New typedef.
(htab_alloc): New function hook for hashtab.h.
(htab_free): Likewise.
(htab_hash): Likewise.
(htab_eq): Likewise.
(hashtab.h): Add file include.
(gomp_increment_refcount): New function.
(gomp_decrement_refcount): Likewise.
(gomp_map_vars_existing): Add refcount_set parameter, adjust to use
gomp_increment_refcount.
(gomp_map_fields_existing): Add refcount_set parameter, adjust calls
to gomp_map_vars_existing.
(gomp_map_vars_internal): Add refcount_set parameter, add local openmp_p
variable to guard OpenMP specific paths, adjust calls to
gomp_map_vars_existing, add structure element sibling splay_tree_key
sequence creation code, adjust Fortran map case to avoid increment
under OpenMP.
(gomp_map_vars): Adjust to static, add refcount_set parameter, manage
local refcount_set if caller passed in NULL, adjust call to
gomp_map_vars_internal.
(gomp_map_vars_async): Adjust and rename into...
(goacc_map_vars): ...this new function, adjust call to
gomp_map_vars_internal.
(gomp_remove_splay_tree_key): New function with code factored out from
gomp_remove_var_internal.
(gomp_remove_var_internal): Add code to handle removing multiple
splay_tree_key sequence for structure elements, adjust code to use
gomp_remove_splay_tree_key for splay-tree key removal.
(gomp_unmap_vars_internal): Add refcount_set parameter, adjust to use
gomp_decrement_refcount.
(gomp_unmap_vars): Adjust to static, add refcount_set parameter, manage
local refcount_set if caller passed in NULL, adjust call to
gomp_unmap_vars_internal.
(gomp_unmap_vars_async): Adjust and rename into...
(goacc_unmap_vars): ...this new function, adjust call to
gomp_unmap_vars_internal.
(GOMP_target): Manage refcount_set and adjust calls to gomp_map_vars and
gomp_unmap_vars.
(GOMP_target_ext): Likewise.
(gomp_target_data_fallback): Adjust call to gomp_map_vars.
(GOMP_target_data): Likewise.
(GOMP_target_data_ext): Likewise.
(GOMP_target_end_data): Adjust call to gomp_unmap_vars.
(gomp_exit_data): Add refcount_set parameter, adjust to use
gomp_decrement_refcount, adjust to queue splay-tree keys for removal
after main loop.
(GOMP_target_enter_exit_data): Manage refcount_set and adjust calls to
gomp_map_vars and gomp_exit_data.
(gomp_target_task_fn): Likewise.
* testsuite/libgomp.c-c++-common/refcount-1.c: New testcase.
* testsuite/libgomp.c-c++-common/struct-elem-1.c: New testcase.
* testsuite/libgomp.c-c++-common/struct-elem-2.c: New testcase.
* testsuite/libgomp.c-c++-common/struct-elem-3.c: New testcase.
* testsuite/libgomp.c-c++-common/struct-elem-4.c: New testcase.
* testsuite/libgomp.c-c++-common/struct-elem-5.c: New testcase.
1. Replace PUSH_ARGS with a target calls hook, TARGET_PUSH_ARGUMENT, which
takes an integer argument. When it returns true, push instructions will
be used to pass outgoing arguments. If the argument is nonzero, it is
the number of bytes to push and indicates the PUSH instruction usage is
optional so that the backend can decide if PUSH instructions should be
generated. Otherwise, the argument is zero.
2. Implement the x86 target hook, which returns false when the number of
bytes to push is at least 16 (8 for 32-bit targets) and vector loads and
stores can be used (see the sketch after this list).
3. Remove target PUSH_ARGS definitions which return 0 as it is the same
as the default.
4. Define TARGET_PUSH_ARGUMENT of cr16 and m32c to always return true.
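For illustration, a sketch of the shape such a hook can take (the flag
variables are hypothetical stand-ins, not the real target macros):

// Hypothetical stand-ins for TARGET_64BIT and for vector load/store support.
static bool target_64bit = true;
static bool have_vector_moves = true;

// Shape of an x86-style TARGET_PUSH_ARGUMENT hook: npush == 0 asks whether
// push instructions may be used at all; a nonzero npush is the number of
// bytes to push, and large pushes are declined when vector moves can be used.
static bool
example_push_argument (unsigned int npush)
{
  if (npush == 0)
    return true;
  unsigned int limit = target_64bit ? 16 : 8;
  return !(have_vector_moves && npush >= limit);
}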
gcc/
PR target/100704
* calls.c (expand_call): Replace PUSH_ARGS with
targetm.calls.push_argument (0).
(emit_library_call_value_1): Likewise.
* defaults.h (PUSH_ARGS): Removed.
(PUSH_ARGS_REVERSED): Replace PUSH_ARGS with
targetm.calls.push_argument (0).
* expr.c (block_move_libcall_safe_for_call_parm): Likewise.
(emit_push_insn): Pass the number of bytes to push to
targetm.calls.push_argument and pass 0 if ARGS_ADDR is 0.
* hooks.c (hook_bool_uint_true): New.
* hooks.h (hook_bool_uint_true): Likewise.
* rtlanal.c (nonzero_bits1): Replace PUSH_ARGS with
targetm.calls.push_argument (0).
* target.def (push_argument): Add a targetm.calls hook.
* targhooks.c (default_push_argument): New.
* targhooks.h (default_push_argument): Likewise.
* config/bpf/bpf.h (PUSH_ARGS): Removed.
* config/cr16/cr16.c (TARGET_PUSH_ARGUMENT): New.
* config/cr16/cr16.h (PUSH_ARGS): Removed.
* config/i386/i386.c (ix86_push_argument): New.
(TARGET_PUSH_ARGUMENT): Likewise.
* config/i386/i386.h (PUSH_ARGS): Removed.
* config/m32c/m32c.c (TARGET_PUSH_ARGUMENT): New.
* config/m32c/m32c.h (PUSH_ARGS): Removed.
* config/nios2/nios2.h (PUSH_ARGS): Likewise.
* config/pru/pru.h (PUSH_ARGS): Likewise.
* doc/tm.texi.in: Remove PUSH_ARGS documentation. Add
TARGET_PUSH_ARGUMENT hook.
* doc/tm.texi: Regenerated.
gcc/testsuite/
PR target/100704
* gcc.target/i386/pr100704-1.c: New test.
* gcc.target/i386/pr100704-2.c: Likewise.
* gcc.target/i386/pr100704-3.c: Likewise.
These are various cleanups to the clz/ctz code.
First, ranges from range_of_expr are always numeric so we
should adjust. Also, the checks for non-zero were assuming the argument
was unsigned, which in the PR's testcase it clearly is not. I've cleaned
this up, so that it works either way.
I've also removed the following annoying idiom:
- int newmini = prec - 1 - wi::floor_log2 (r.upper_bound ());
- if (newmini == prec)
This is really a check for r.upper_bound() == 0, as floor_log2(0)
returns -1. It's confusing.
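A tiny self-contained illustration of the equivalence (floor_log2 below
follows the wide-int convention that floor_log2 (0) is -1):

#include <cassert>

static int
floor_log2 (unsigned long long x)
{
  int l = -1;
  while (x) { x >>= 1; ++l; }
  return l;
}

int
main ()
{
  const int prec = 32;
  // newmini == prec happens exactly when the upper bound is zero,
  // so checking the bound directly says the same thing.
  assert (prec - 1 - floor_log2 (0) == prec);
  assert (prec - 1 - floor_log2 (1) == prec - 1);
  return 0;
}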
Tested on x86-64 Linux.
gcc/ChangeLog:
PR tree-optimization/100790
* gimple-range.cc (range_of_builtin_call): Cleanup clz and ctz
code.
gcc/testsuite/ChangeLog:
* gcc.dg/pr100790.c: New test.
Fix the mapping of vec_doublee and vec_floate to builtins.
gcc/ChangeLog:
PR target/100871
* config/s390/vecintrin.h (vec_doublee): Fix to use
__builtin_s390_vflls.
(vec_floate): Fix to use __builtin_s390_vflrd.
gcc/testsuite/ChangeLog:
* gcc.target/s390/zvector/vec-doublee.c: New test.
* gcc.target/s390/zvector/vec-floate.c: New test.
This makes it clear the caller owns the vector, and ensures it is cleaned up.
Signed-off-by: Trevor Saunders <tbsaunde@tbsaunde.org>
gcc/ChangeLog:
* dominance.c (get_dominated_by_region): Return auto_vec<basic_block>.
* dominance.h (get_dominated_by_region): Likewise.
* tree-cfg.c (gimple_duplicate_sese_region): Adjust.
(gimple_duplicate_sese_tail): Likewise.
(move_sese_region_to_fn): Likewise.
This ensures the callers of collect_callers () take ownership of the vector and
free it when appropriate.
Signed-off-by: Trevor Saunders <tbsaunde@tbsaunde.org>
gcc/ChangeLog:
* cgraph.c (cgraph_node::collect_callers): Return
auto_vec<cgraph_edge *>.
* cgraph.h (cgraph_node::collect_callers): Likewise.
* ipa-cp.c (create_specialized_node): Adjust.
(decide_about_value): Likewise.
(decide_whether_version_node): Likewise.
* ipa-sra.c (process_isra_node_results): Likewise.
- Unfortunately using_auto_storage () needs to handle m_vec being null.
- Handle self-move of an auto_vec.
- Make sure auto_vec defines the class's move constructor and assignment
operator, as well as ones taking vec<T>, so the compiler does not generate
them for us. Per https://en.cppreference.com/w/cpp/language/move_constructor
the ones taking vec<T> do not count as the class's move constructor or
assignment operator, but we want them as well so that a plain vec can be
assigned to an auto_vec.
- Explicitly delete auto_vec's copy constructor and assignment operator. This
prevents unintentional expensive copies of the vector and makes it clear,
when copies are needed, that that is what is intended. When it is necessary
to copy a vector, copy () can be used.
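A condensed sketch of the resulting special member set (simplified types
for illustration, not the actual vec.h code):

#include <utility>

template<typename T>
struct vec_like { T *m_vec = nullptr; };   // stands in for the plain vec

template<typename T>
struct auto_vec_like : vec_like<T>
{
  auto_vec_like () = default;

  // The class's own move constructor and move assignment...
  auto_vec_like (auto_vec_like &&r) noexcept
  { std::swap (this->m_vec, r.m_vec); }
  auto_vec_like &operator= (auto_vec_like &&r) noexcept
  {
    if (this != &r)                        // tolerate self-move
      std::swap (this->m_vec, r.m_vec);
    return *this;
  }

  // ...plus overloads taking the plain vec type, which do not count as the
  // move constructor/assignment but allow assigning a vec to an auto_vec.
  auto_vec_like (vec_like<T> &&r) noexcept
  { std::swap (this->m_vec, r.m_vec); }
  auto_vec_like &operator= (vec_like<T> &&r) noexcept
  { std::swap (this->m_vec, r.m_vec); return *this; }

  // Copies must be requested explicitly via a copy () member.
  auto_vec_like (const auto_vec_like &) = delete;
  auto_vec_like &operator= (const auto_vec_like &) = delete;
};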
Signed-off-by: Trevor Saunders <tbsaunde@tbsaunde.org>
gcc/ChangeLog:
* vec.h (vl_ptr>::using_auto_storage): Handle null m_vec.
(auto_vec<T, 0>::auto_vec): Define move constructor, and delete copy
constructor.
(auto_vec<T, 0>::operator=): Define move assignment and delete copy
assignment.
These are debugging aids for help in debugging ranger based passes.
gcc/ChangeLog:
* gimple-range.cc (debug_seed_ranger): New.
(dump_ranger): New.
(debug_ranger): New.
This adds a simple reduction vectorization capability to the
non-loop vectorizer. Simple meaning it lacks any of the fancy
ways to generate the reduction epilogue but only supports
those we can handle via a direct internal function reducing
a vector to a scalar. One of the main reasons is to avoid
massive refactoring at this point but also that more complex
epilogue operations are hardly profitable.
Mixed sign reductions are fended off for now, and I'm not yet settled
on whether we want an explicit SLP node for the reduction epilogue
operation. Handling mixed signs could be done by multiplying with a
{ 1, -1, .. } vector. Also fended off are reductions with non-internal
operands (constants or register parameters, for example).
Costing is done by accounting the original participating scalar stmts
for the scalar cost, and log2 permutes and operations for the
vectorized epilogue.
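As an example of the kind of straight-line code this targets (whether it
is actually vectorized depends on target support for the reduction internal
function and on the cost model):

int
sum4 (const int *a)
{
  // Four lanes loaded as one vector, then reduced to a scalar in the
  // vectorized epilogue via a direct vector-to-scalar reduction.
  return a[0] + a[1] + a[2] + a[3];
}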
--
SPEC CPU 2017 FP with rate workload measurements (picking the fastest
of three runs) show regressions for 507.cactuBSSN_r (1.5%),
508.namd_r (2.5%), 511.povray_r (2.5%), 526.blender_r (0.5%) and
527.cam4_r (2.5%), and improvements for 510.parest_r (5%) and
538.imagick_r (1.5%). This is with -Ofast -march=znver2 on a Zen2.
Statistics on CPU 2017 show that the overwhelming majority of seeds
we find are reductions of two lanes (well - that's basically every
associative operation). That means we put quite high pressure
on the SLP discovery process this way.
In total we find 583218 seeds we put to SLP discovery out of which
66205 pass that and only 6185 of those make it through
code generation checks. 796 of those are discarded because the reduction
is part of a larger SLP instance. 4195 of the remaining
are deemed not profitable to vectorize and 1194 are finally
vectorized. That's a poor 0.2% rate.
Of the 583218 seeds 486826 (83%) have two lanes, 60912 have three (10%),
28181 four (5%), 4808 five, 909 six and there are instances up to 120
lanes.
There's a set of 54086 candidate seeds we reject because
they contain a constant or invariant (not implemented yet) but still
have two or more lanes that could be put to SLP discovery.
2021-06-16 Richard Biener <rguenther@suse.de>
PR tree-optimization/54400
* tree-vectorizer.h (enum slp_instance_kind): Add
slp_inst_kind_bb_reduc.
(reduction_fn_for_scalar_code): Declare.
* tree-vect-data-refs.c (vect_slp_analyze_instance_dependence):
Check SLP_INSTANCE_KIND instead of looking at the
representative.
(vect_slp_analyze_instance_alignment): Likewise.
* tree-vect-loop.c (reduction_fn_for_scalar_code): Export.
* tree-vect-slp.c (vect_slp_linearize_chain): Split out
chain linearization from vect_build_slp_tree_2 and generalize
for the use of BB reduction vectorization.
(vect_build_slp_tree_2): Adjust accordingly.
(vect_optimize_slp): Elide permutes at the root of BB reduction
instances.
(vectorizable_bb_reduc_epilogue): New function.
(vect_slp_prune_covered_roots): Likewise.
(vect_slp_analyze_operations): Use them.
(vect_slp_check_for_constructors): Recognize associatable
chains for BB reduction vectorization.
(vectorize_slp_instance_root_stmt): Generate code for the
BB reduction epilogue.
* gcc.dg/vect/bb-slp-pr54400.c: New testcase.
The case of an initializer with side effects for a zero-length array seems
extremely unlikely, but we should still return the right type in that case.
PR c++/101029
gcc/cp/ChangeLog:
* init.c (build_vec_init): Preserve the type of base.
The gori engine can calculate outgoing ranges for exported values. This
change allows 1st degree recomputation. If a name is not exported from a
block, but one of the ssa_names used directly in computing it is, then
we can recompute the ssa_name on the edge using the edge values for its
operands.
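A small example of the situation this covers (the SSA names and ranges in
the comments are purely illustrative):

int
example (int a)
{
  int b = a + 5;   // b_2 = a_1 + 5; b_2 is not exported from the block
  if (a > 10)      // but a_1 is: on the true edge a_1 is [11, +INF]
    return b;      // so b_2 can be recomputed here as [16, +INF]
  return 0;
}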
* gimple-range-gori.cc (gori_compute::has_edge_range_p): Check with
may_recompute_p.
(gori_compute::may_recompute_p): New.
(gori_compute::outgoing_edge_range_p): Perform recomputations.
* gimple-range-gori.h (class gori_compute): Add prototype.
Range_on_edge was implemented in the cache to always return a range, but
only returned true when the edge actually changed the range.
Now return true for any range that can be calculated.
* gimple-range-cache.cc (ranger_cache::range_on_edge): Always return
true when a range can be calculated.
* gimple-range.cc (gimple_ranger::dump_bb): Check has_edge_range_p.
After my patch for PR91706, or before that with the qualified call,
tsubst_baselink returned a BASELINK with BASELINK_BINFO indicating a base of
a still-dependent derived class. We need to look up the relevant base binfo
in the substituted class.
PR c++/101078
PR c++/91706
gcc/cp/ChangeLog:
* pt.c (tsubst_baselink): Update binfos in non-dependent case.
gcc/testsuite/ChangeLog:
* g++.dg/template/access39.C: New test.
Enable_new_values takes a boolean and returns the old value. The constructor
for ranger_cache initialized the m_new_value_p field by calling this routine
and ignoring the result. This potentially reads the old value uninitialized.
* gimple-range-cache.cc (ranger_cache::ranger_cache): Initialize
m_new_value_p directly.
The patch for 96391 changed linemap_compare_locations to give up on
comparing locations from macro expansions if we don't have column
information. But in this testcase, the BOILERPLATE macro is multiple lines
long, so we do want to compare locations within the macro. So this patch
moves the LINE_MAP_MAX_LOCATION_WITH_COLS check inside the block, to use it
for failing gracefully.
PR c++/100796
PR preprocessor/96391
libcpp/ChangeLog:
* line-map.c (linemap_compare_locations): Only use comparison with
LINE_MAP_MAX_LOCATION_WITH_COLS to avoid abort.
gcc/testsuite/ChangeLog:
* g++.dg/plugin/location-overflow-test-pr100796.c: New test.
* g++.dg/plugin/plugin.exp: Run it.
In addition to V8QI permutations, several other missing permutations are
added for 64bit vector modes for TARGET_SSSE3 and TARGET_SSE4_1 targets.
2021-06-16 Uroš Bizjak <ubizjak@gmail.com>
gcc/
PR target/89021
* config/i386/i386-expand.c (expand_vec_perm_2perm_pblendv):
Handle 64bit modes for TARGET_SSE4_1.
(expand_vec_perm_pshufb2): Handle 64bit modes for TARGET_SSSE3.
(expand_vec_perm_even_odd_pack): Handle V4HI mode.
(expand_vec_perm_even_odd_1) <case E_V4HImode>: Expand via
expand_vec_perm_pshufb2 for TARGET_SSSE3 and via
expand_vec_perm_even_odd_pack for TARGET_SSE4_1.
* config/i386/mmx.md (mmx_packusdw): New insn pattern.
In r12-1486-gcb326a6442f09cb36b05ce556fc91e10bfeb0cf6 I changed
__decay_copy to be a function object of unnamed class type. This causes
problems when importing the library headers:
error: conflicting global module declaration 'constexpr const std::ranges::__cust_access::<unnamed struct> std::ranges::__cust_access::__decay_copy'
The fix is to use a named struct instead of an anonymous one.
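A minimal sketch of the kind of change involved (names below are
illustrative; the real definition lives in <bits/iterator_concepts.h>):

#include <type_traits>
#include <utility>

// Before, the function object's type was an unnamed struct; now it gets
// a name so the entity can be matched across module boundaries.
struct _DecayCopyLike
{
  template<typename _Tp>
    constexpr std::decay_t<_Tp>
    operator() (_Tp &&__t) const
    { return std::forward<_Tp> (__t); }
};

inline constexpr _DecayCopyLike __decay_copy_like{};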
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* include/bits/iterator_concepts.h (__decay_copy): Name type.
In r12-1489-g8b93548778a487f31f21e0c6afe7e0bde9711fc4 I made the
[range.access] CPO types final and non-addressable. Tim Song pointed out
this is wrong. Only the [range.iter.ops] functions should be final and
non-addressable. Revert the changes to the [range.access] objects.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* include/bits/ranges_base.h (ranges::begin, ranges::end)
(ranges::cbegin, ranges::cend, ranges::rbegin, ranges::rend)
(ranges::crbegin, ranges::crend, ranges::size, ranges::ssize)
(ranges::empty, ranges::data, ranges::cdata): Remove final
keywords and deleted operator& overloads.
* testsuite/24_iterators/customization_points/iter_move.cc: Use
new is_customization_point_object function.
* testsuite/24_iterators/customization_points/iter_swap.cc:
Likewise.
* testsuite/std/concepts/concepts.lang/concept.swappable/swap.cc:
Likewise.
* testsuite/std/ranges/access/begin.cc: Likewise.
* testsuite/std/ranges/access/cbegin.cc: Likewise.
* testsuite/std/ranges/access/cdata.cc: Likewise.
* testsuite/std/ranges/access/cend.cc: Likewise.
* testsuite/std/ranges/access/crbegin.cc: Likewise.
* testsuite/std/ranges/access/crend.cc: Likewise.
* testsuite/std/ranges/access/data.cc: Likewise.
* testsuite/std/ranges/access/empty.cc: Likewise.
* testsuite/std/ranges/access/end.cc: Likewise.
* testsuite/std/ranges/access/rbegin.cc: Likewise.
* testsuite/std/ranges/access/rend.cc: Likewise.
* testsuite/std/ranges/access/size.cc: Likewise.
* testsuite/std/ranges/access/ssize.cc: Likewise.
* testsuite/util/testsuite_iterators.h
(is_customization_point_object): New function.
Model the zero-high-half semantics of the narrowing arithmetic Neon
instructions in the aarch64_<sur><addsub>hn<mode> RTL pattern.
Modeling these semantics allows for better RTL combinations while
also removing some register allocation issues as the compiler now
knows that the operation is totally destructive.
Add new tests to narrow_zero_high_half.c to verify the benefit of
this change.
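The kind of pattern the new tests check, sketched with Neon intrinsics
(the function name is illustrative): once the zero-high-half semantics are
modeled, the explicit combine with a zero vector below is recognized as
already being produced by the narrowing instruction, so no extra combine
or move is emitted.

#include <arm_neon.h>

uint8x16_t
addhn_zero_high (uint16x8_t a, uint16x8_t b)
{
  // ADDHN writes the narrowed result to the low half of the destination
  // and zeroes the high half, which is exactly what this combine asks for.
  return vcombine_u8 (vaddhn_u16 (a, b), vdup_n_u8 (0));
}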
gcc/ChangeLog:
2021-06-14 Jonathan Wright <jonathan.wright@arm.com>
* config/aarch64/aarch64-simd.md (aarch64_<sur><addsub>hn<mode>):
Change to an expander that emits the correct instruction
depending on endianness.
(aarch64_<sur><addsub>hn<mode>_insn_le): Define.
(aarch64_<sur><addsub>hn<mode>_insn_be): Define.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/narrow_zero_high_half.c: Add new tests.
Split the aarch64_<su>qmovn<mode> pattern into separate scalar and
vector variants. Further split the vector RTL pattern into big/
little endian variants that model the zero-high-half semantics of the
underlying instruction. Modeling these semantics allows for better
RTL combinations while also removing some register allocation issues
as the compiler now knows that the operation is totally destructive.
Add new tests to narrow_zero_high_half.c to verify the benefit of
this change.
gcc/ChangeLog:
2021-06-14 Jonathan Wright <jonathan.wright@arm.com>
* config/aarch64/aarch64-simd-builtins.def: Split generator
for aarch64_<su>qmovn builtins into scalar and vector
variants.
* config/aarch64/aarch64-simd.md (aarch64_<su>qmovn<mode>_insn_le):
Define.
(aarch64_<su>qmovn<mode>_insn_be): Define.
(aarch64_<su>qmovn<mode>): Split into scalar and vector
variants. Change vector variant to an expander that emits the
correct instruction depending on endianness.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/narrow_zero_high_half.c: Add new tests.
Split the aarch64_sqmovun<mode> pattern into separate scalar and
vector variants. Further split the vector pattern into big/little
endian variants that model the zero-high-half semantics of the
underlying instruction. Modeling these semantics allows for better
RTL combinations while also removing some register allocation issues
as the compiler now knows that the operation is totally destructive.
Add new tests to narrow_zero_high_half.c to verify the benefit of
this change.
gcc/ChangeLog:
2021-06-14 Jonathan Wright <jonathan.wright@arm.com>
* config/aarch64/aarch64-simd-builtins.def: Split generator
for aarch64_sqmovun builtins into scalar and vector variants.
* config/aarch64/aarch64-simd.md (aarch64_sqmovun<mode>):
Split into scalar and vector variants. Change vector variant
to an expander that emits the correct instruction depending
on endianness.
(aarch64_sqmovun<mode>_insn_le): Define.
(aarch64_sqmovun<mode>_insn_be): Define.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/narrow_zero_high_half.c: Add new tests.
Modeling the zero-high-half semantics of the XTN narrowing
instruction in RTL indicates to the compiler that this is a totally
destructive operation. This enables more RTL simplifications and also
prevents some register allocation issues.
Add new tests to narrow_zero_high_half.c to verify the benefit of
this change.
gcc/ChangeLog:
2021-06-11 Jonathan Wright <jonathan.wright@arm.com>
* config/aarch64/aarch64-simd.md (aarch64_xtn<mode>_insn_le):
Define - modeling zero-high-half semantics.
(aarch64_xtn<mode>): Change to an expander that emits the
appropriate instruction depending on endianness.
(aarch64_xtn<mode>_insn_be): Define - modeling zero-high-half
semantics.
(aarch64_xtn2<mode>_le): Rename to...
(aarch64_xtn2<mode>_insn_le): This.
(aarch64_xtn2<mode>_be): Rename to...
(aarch64_xtn2<mode>_insn_be): This.
(vec_pack_trunc_<mode>): Emit truncation instruction instead
of aarch64_xtn.
* config/aarch64/iterators.md (Vnarrowd): Add Vnarrowd mode
attribute iterator.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/narrow_zero_high_half.c: Add new tests.
Add tests to verify that Neon narrowing-shift instructions clear the
top half of the result vector. It is sufficient to show that a
subsequent combine with a zero-vector is optimized away - leaving
just the narrowing-shift instruction.
gcc/testsuite/ChangeLog:
2021-06-15 Jonathan Wright <jonathan.wright@arm.com>
* gcc.target/aarch64/narrow_zero_high_half.c: New test.
When SRA transforms an assignment where the RHS is an aggregate decl
that it creates replacements for, the (least efficient) fallback
method of dealing with them is to store all the replacements back into
the original decl and then let the original assignment take its
course.
That of course should not need to be done for TREE_READONLY bases,
which cannot change contents. The SRA code handled this situation
only for DECL_IN_CONSTANT_POOL const decls; this patch modifies the
check so that it tests for TREE_READONLY. I also looked at all
other callers of generate_subtree_copies and added checks to another
one dealing with the exact same situation and to one which deals with
it in a non-assignment context.
gcc/ChangeLog:
2021-06-11 Martin Jambor <mjambor@suse.cz>
PR tree-optimization/100453
* tree-sra.c (create_access): Disqualify any const candidates
which are written to.
(sra_modify_expr): Do not store sub-replacements back to a const base.
(handle_unscalarized_data_in_subtree): Likewise.
(sra_modify_assign): Likewise. Earlier, use TREE_READONLY test
instead of constant_decl_p.
gcc/testsuite/ChangeLog:
2021-06-10 Martin Jambor <mjambor@suse.cz>
PR tree-optimization/100453
* gcc.dg/tree-ssa/pr100453.c: New test.
I've noticed that this test now sometimes FAILs and sometimes PASSes on
various arches (the line 12 test in particular).
The problem is that a = 0; initialization in the caller no longer happens
before the f(&a) call as what the argument points to is only used in
debug info.
Making the function noipa forces the caller to initialize it and still
tests what the test wants to test, namely that we don't consider *p as
valid location for the c variable at line 18 (after it has been overwritten
with *p = 1;).
2021-06-16 Jakub Jelinek <jakub@redhat.com>
* gcc.dg/guality/pr49888.c (f): Use noipa attribute instead of
noinline, noclone.
The following testcase is miscompiled on x86_64-linux: the bitfield store
is implemented as a 64-bit RMW operation at d+24 even though the d variable
has a size of only 28 bytes, and scheduling moves a store to a different
variable, which happens to be placed right after the d variable, in between
the R and W parts.
The reason for this is that we weren't creating
DECL_BIT_FIELD_REPRESENTATIVEs for bitfields in unions.
The following patch does create them, but treats all such bitfields as if
they were in a structure where the particular bitfield is the only field.
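For illustration, a shape similar to the problematic case (this is not the
actual testcase): a union whose bit-field lives in the last 4 bytes of a
28-byte object, where a 64-bit read-modify-write of the bit-field would
touch bytes past the end of the object.

union U
{
  struct
  {
    int pad[6];          // bytes 0..23
    unsigned bits : 7;   // lives in bytes 24..27
  } s;
  char raw[28];
};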
2021-06-16 Jakub Jelinek <jakub@redhat.com>
PR middle-end/101062
* stor-layout.c (finish_bitfield_representative): For fields in unions
assume nextf is always NULL.
(finish_bitfield_layout): Compute bit field representatives also in
unions, but handle it as if each bitfield was the only field in the
aggregate.
* gcc.dg/pr101062.c: New test.
When we face an sm_ord vs sm_unord for the same ref during
store sequence merging we assert that the ref is already marked
unsupported. But it can be that it will only be marked so
during the ongoing merging, so instead of asserting, mark it here.
Also apply some optimization so we do not waste resources searching
for already unsupported refs.
2021-06-16 Richard Biener <rguenther@suse.de>
PR tree-optimization/101088
* tree-ssa-loop-im.c (sm_seq_valid_bb): Only look for
supported refs on edges. Do not assert that same-ref but
different-kind stores are unsupported, but mark them so.
(hoist_memory_references): Only look for supported refs
on exits.
* gcc.dg/torture/pr101088.c: New testcase.
This patch tackles PR46235 to improve the code generated for bit tests
on x86_64 by making more use of the bt instruction. Currently, GCC emits
bt instructions when followed by conditional jumps (thanks to Uros' splitters).
This patch adds splitters in i386.md, to catch the cases where bt is followed
by a conditional move (as in the original report), or by a setc/setnc (as in
comment 5 of the Bugzilla PR).
With this patch, the function in the original PR
int foo(int a, int x, int y) {
if (a & (1 << x))
return a;
return 1;
}
which with -O2 on mainline generates:
foo: movl %edi, %eax
movl %esi, %ecx
sarl %cl, %eax
testb $1, %al
movl $1, %eax
cmovne %edi, %eax
ret
now generates:
foo: btl %esi, %edi
movl $1, %eax
cmovc %edi, %eax
ret
Likewise, IsBitSet1 and IsBitSet2 (from comment 5)
bool IsBitSet1(unsigned char byte, int index) {
return (byte & (1<<index)) != 0;
}
bool IsBitSet2(unsigned char byte, int index) {
return (byte >> index) & 1;
}
Before:
movzbl %dil, %eax
movl %esi, %ecx
sarl %cl, %eax
andl $1, %eax
ret
After:
movzbl %dil, %edi
btl %esi, %edi
setc %al
ret
According to Agner Fog, SAR/SHR r,cl takes 2 cycles on Skylake,
whereas BT r,r takes only one, so the performance improvements on
recent hardware may be more significant than implied by just
the reduced number of instructions. I've avoided transforming cases
(such as btsi_setcsi) where using bt sequences may not be a clear
win (over sarq/andl).
2021-06-15 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
PR rtl-optimization/46235
* config/i386/i386.md: New define_split for bt followed by cmov.
(*bt<mode>_setcqi): New define_insn_and_split for bt followed by setc.
(*bt<mode>_setncqi): New define_insn_and_split for bt then setnc.
(*bt<mode>_setnc<mode>): New define_insn_and_split for bt followed
by setnc with zero extension.
gcc/testsuite/ChangeLog
PR rtl-optimization/46235
* gcc.target/i386/bt-5.c: New test.
* gcc.target/i386/bt-6.c: New test.
* gcc.target/i386/bt-7.c: New test.
As the following testcase shows, libffi didn't properly handle
classify_argument of structures at byte offsets not divisible by
UNITS_PER_WORD. The following patch adjusts it to match what
config/i386/ classify_argument does for that and also ports the
PR38781 fix there (the second chunk).
This has been committed to upstream libffi already:
5651bea284
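A hypothetical layout of the kind the fix is about (not the new testcase):
the nested struct member b starts at byte offset 4, so its 8 bytes span two
eightbyte words, which is only accounted for when the classification
considers type->size + byte_offset rather than type->size alone.

struct inner { float x, y; };      // 8 bytes
struct outer
{
  float a;                         // bytes 0..3
  struct inner b;                  // bytes 4..11, crossing the 8-byte boundary
};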
2021-06-16 Jakub Jelinek <jakub@redhat.com>
* src/x86/ffi64.c (classify_argument): For FFI_TYPE_STRUCT set words
to number of words needed for type->size + byte_offset bytes rather
than just type->size bytes. Compute pos before the loop and check
total size of the structure.
* testsuite/libffi.call/nested_struct12.c: New test.
gcc/ada/
* sem_util.adb (Is_Volatile_Function): Follow the exact wording
of SPARK (regarding volatile functions) and Ada (regarding
protected functions).
gcc/ada/
* sem_util.adb (Is_OK_Volatile_Context): All references to
volatile objects are legal in preanalysis.
(Within_Volatile_Function): Previously it was wrongly called on
Empty entities; now it is only called on E_Return_Statement,
which allows the body to be greatly simplified.
gcc/ada/
* sem_res.adb (Set_Slice_Subtype): Revert special-case
introduced previously, which is not needed as Itypes created for
slices are precisely always used.
gcc/ada/
* urealp.adb (Scale): Change first parameter to Uint and adjust.
(Equivalent_Decimal_Exponent): Pass U.Den directly to Scale.
* libgnat/s-exponr.adb (Negative): Rename to...
(Safe_Negative): ...this and change its lower bound.
(Exponr): Adjust to above renaming and deal with Integer'First.
gcc/ada/
* sem_res.adb (Flag_Effectively_Volatile_Objects): Detect also
allocators within restricted contexts and not just entity names.
(Resolve_Actuals): Remove duplicated code for detecting
restricted contexts; it is now exclusively done in
Is_OK_Volatile_Context.
(Resolve_Entity_Name): Adapt to new parameter of
Is_OK_Volatile_Context.
* sem_util.ads, sem_util.adb (Is_OK_Volatile_Context): Adapt to
handle contexts both inside and outside of subprogram call
actual parameters.
(Within_Subprogram_Call): Remove; now handled by
Is_OK_Volatile_Context itself and its parameter.