OpenE2K/gcc - gcc - Expired Mentality Git

Author	SHA1	Message	Date
Jonathan Wakely	5f1db7627f	libstdc++: Improve types used as iterators in testsuite Signed-off-by: Jonathan Wakely <jwakely@redhat.com> libstdc++-v3/ChangeLog: * testsuite/25_algorithms/copy/34595.cc: Add missing operation for type used as an iterator. * testsuite/25_algorithms/unique_copy/check_type.cc: Likewise.	2021-09-28 20:22:51 +01:00
Jonathan Wakely	4000d722e6	libstdc++: Fix tests that use invalid types in ordered containers Types used in ordered containers need to be comparable, or the container needs to use a custom comparison function. These tests fail when _GLIBCXX_CONCEPT_CHECKS is defined, because the element types aren't comparable. Signed-off-by: Jonathan Wakely <jwakely@redhat.com> libstdc++-v3/ChangeLog: * testsuite/20_util/is_nothrow_swappable/value.h: Use custom comparison function for priority_queue of type with no relational operators. * testsuite/20_util/is_swappable/value.h: Likewise. * testsuite/24_iterators/output/concept.cc: Add operator< to type used in set.	2021-09-28 20:22:51 +01:00
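The requirement described in the commit above (an element type with no relational operators needs a custom comparison function) can be sketched as follows. This is a minimal illustration, not the code from the testsuite: `Job`, `JobLess`, and `highest_priority` are hypothetical names introduced here.

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Hypothetical element type with no relational operators
// (not taken from the libstdc++ testsuite).
struct Job {
    int priority;
};

// Custom strict weak ordering handed to the container, so that Job
// itself never needs an operator<.
struct JobLess {
    bool operator()(const Job& a, const Job& b) const {
        return a.priority < b.priority;
    }
};

// Returns the largest priority among `jobs`.
// Precondition: `jobs` is not empty.
int highest_priority(const std::vector<Job>& jobs) {
    assert(!jobs.empty());
    std::priority_queue<Job, std::vector<Job>, JobLess> q(JobLess{}, jobs);
    return q.top().priority;
}
```

Calling `highest_priority` with jobs of priority 3, 7, and 1 yields 7. The same queue declared as `std::priority_queue<Job>` would rely on `std::less<Job>`, which needs an `operator<` that `Job` does not provide.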
Jonathan Wakely	45a8cd2569	libstdc++: Fix _OutputIteratorConcept checks in algorithms The _OutputIteratorConcept should be checked using the correct value category. The std::move_backward and std::copy_backward algorithms should use _OutputIteratorConcept instead of _ConvertibleConcept. In order to use the correct value category, the concept should use a function that returns _ValueT instead of using an lvalue data member. Signed-off-by: Jonathan Wakely <jwakely@redhat.com> libstdc++-v3/ChangeLog: * include/bits/boost_concept_check.h (_OutputIteratorConcept): Use a function to preserve value category of the type. * include/bits/stl_algobase.h (copy, move, fill_n): Use a reference as the second argument for _OutputIteratorConcept. (copy_backward, move_backward): Use _OutputIteratorConcept instead of _ConvertibleConcept.	2021-09-28 20:22:50 +01:00
Jonathan Wakely	82626be2d6	libstdc++: Specialize std::pointer_traits<__normal_iterator<I,C>> This allows std::__to_address to be used with __normal_iterator in C++11/14/17 modes. Without the partial specialization the deduced pointer_traits::element_type is incorrect, and so the return type of __to_address is wrong. A similar partial specialization is probably needed for __gnu_debug::_Safe_iterator. Signed-off-by: Jonathan Wakely <jwakely@redhat.com> libstdc++-v3/ChangeLog: * include/bits/stl_iterator.h (pointer_traits): Define partial specialization for __normal_iterator. * testsuite/24_iterators/normal_iterator/to_address.cc: New test.	2021-09-28 20:22:50 +01:00
Iain Sandoe	b12d6e7989	Darwin, PPC : Fix R13 for PPC64. We have a somewhat unusual situation in that for PPC64, R13 is both reserved and callee-saved (it is used internally by the pthreads implementation to contain pthread_self). So add R13 to the fixed regs, but also keep it in the callee-saved set. gcc/ChangeLog: * config/rs6000/darwin.h (FIXED_R13): Add for PPC64. (FIRST_SAVED_GP_REGNO): Save from R13 even when it is one of the fixed regs.	2021-09-28 20:16:05 +01:00
Iain Sandoe	45f775f5f8	libgcc, X86, Darwin: Export cpu_model and indicator. These two symbols have been emitted since 4.8, but were not added to the Darwin exports, so we have been using the ones from libgcc.a. Added to libgcc_s now. Signed-off-by: Iain Sandoe <iain@sandoe.co.uk> libgcc/ChangeLog: * config/i386/libgcc-darwin.ver: Add Symbols for __cpu_model, __cpu_indicator_init.	2021-09-28 20:02:48 +01:00
Iain Sandoe	fae627162d	coroutines: Only set parm copy guard vars if we have exceptions [PR 102454]. For coroutines, we make copies of the original function arguments into the coroutine frame. Normally, these are destroyed on the proper exit from the coroutine when the frame is destroyed. However, if an exception is thrown before the first suspend point is reached, the cleanup has to happen in the ramp function. These cleanups are guarded such that they are only applied to any param copies actually made. The ICE is caused by an attempt to set the guard variable when there are no exceptions enabled (the guard var is not created in this case). Fixed by checking for flag_exceptions in this case too. While touching these code paths, also clean up the synthetic names used when a function parm is unnamed. Signed-off-by: Iain Sandoe <iain@sandoe.co.uk> PR c++/102454 gcc/cp/ChangeLog: * coroutines.cc (analyze_fn_parms): Clean up synthetic names for unnamed function params. (morph_fn_to_coro): Do not try to set a guard variable for param DTORs in the ramp, unless we have exceptions active. gcc/testsuite/ChangeLog: * g++.dg/coroutines/pr102454.C: New test.	2021-09-28 19:53:59 +01:00
Jonathan Wakely	a11052d98d	libstdc++: Improve std::forward static assert message The previous message told you something was wrong, but not why it happened or why it's bad. This changes it to explain that the function is being misused. Signed-off-by: Jonathan Wakely <jwakely@redhat.com> libstdc++-v3/ChangeLog: * include/bits/move.h (forward(remove_reference_t<T>&&)): Improve text of static_assert. * testsuite/20_util/forward/c_neg.cc: Adjust dg-error. * testsuite/20_util/forward/f_neg.cc: Likewise.	2021-09-28 17:30:05 +01:00
Jonathan Wakely	f2b7f56a15	libstdc++: Fix mismatched noexcept-specifiers in filesystem::path [PR102499] Signed-off-by: Jonathan Wakely <jwakely@redhat.com> libstdc++-v3/ChangeLog: PR libstdc++/102499 * include/bits/fs_path.h (path::begin, path::end): Add noexcept to declarations, to match definitions.	2021-09-28 17:30:05 +01:00
Kyrylo Tkachov	e159c0aa10	aarch64: Add command-line support for Armv8.7-a This patch adds support for -march=armv8.7-a in GCC. It adds the +ls64 extension that's included in this architecture revision. Currently this is just the command-line option and +ls64 allows the relevant instructions to be used in inline assembly. The ACLE defines some intrinsics for them but those can be added separately later (together with the appropriate __ARM_FEATURE_* predefine). 2021-09-28 Kyrylo Tkachov <kyrylo.tkachov@arm.com> * config/aarch64/aarch64.h (AARCH64_FL_LS64): Define (AARCH64_FL_V8_7): Likewise. (AARCH64_FL_FOR_ARCH8_7): Likewise. * config/aarch64/aarch64-arches.def (armv8.7-a): Define. * config/aarch64/aarch64-option-extensions.def (ls64): Define. * doc/invoke.texi: Document the above.	2021-09-28 16:13:26 +01:00
Aldy Hernandez	0400ca17f3	Improve jump threading dump output. In analyzing PR102511, it has become abundantly clear that we need better debugging aids for the jump threader solver. Currently debugging these issues is a nightmare if you're not intimately familiar with the code. This patch attempts to improve this. First, I'm enabling path solver dumps with TDF_THREADING. None of the available TDF_* flags are a good match, and using TDF_DETAILS would blow up the dump file, since both threaders continually call the solver to try out candidates. This will allow dumping path solver details without having to resort to hacking the source. I am also dumping the current registered_jump_thread dbg counter used by the registry, in the solver. That way narrowing down a problematic thread can then be examined by -fdump-*-threading and looking at the solver details surrounding the appropriate counter (which the dbgcnt also dumps to the dump file). You still need knowledge of the solver to debug these issues, but at least now it's not entirely opaque. Tested on x86-64 Linux. gcc/ChangeLog: * dbgcnt.c (dbg_cnt_counter): New. * dbgcnt.h (dbg_cnt_counter): New. * dumpfile.c (dump_options): Add entry for TDF_THREADING. * dumpfile.h (enum dump_flag): Add TDF_THREADING. * gimple-range-path.cc (DEBUG_SOLVER): Use TDF_THREADING. * tree-ssa-threadupdate.c (dump_jump_thread_path): Dump out debug counter.	2021-09-28 15:55:29 +02:00
Tobias Burnus	1f0a57bd54	libgomp: Only check for 2*sizeof(void*) int type with Fortran [PR96661] The depend type is a struct with two pointer members for C/C++ - but for Fortran OpenMP requires an integer type with kind = omp_depend_kind. Thus, libgomp's configure checks that an integer type/kind with size 2*sizeof(void*) is available. However, this integer type/kind is not needed when building without Fortran support. Thus, only check this when Fortran is enabled. libgomp/ PR libgomp/96661 * configure.ac: Only check for int-type = 2*size_t support when building with Fortran support. * configure: Regenerate.	2021-09-28 15:15:47 +02:00
Ilya Leoshkevich	92cdd338fd	reassoc: Test rank biasing Add both positive and negative tests. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/reassoc-46.c: New test. * gcc.dg/tree-ssa/reassoc-46.h: Common code for new tests. * gcc.dg/tree-ssa/reassoc-47.c: New test. * gcc.dg/tree-ssa/reassoc-48.c: New test. * gcc.dg/tree-ssa/reassoc-49.c: New test. * gcc.dg/tree-ssa/reassoc-50.c: New test. * gcc.dg/tree-ssa/reassoc-51.c: New test.	2021-09-28 14:58:23 +02:00
Aldy Hernandez	c32f7df917	Enable jump threading at -O1. My previous patch gating all jump threading by -fthread-jumps had the side effect of turning off DOM jump threading at -O1. This causes numerous -Wuninitialized false positives. This patch turns on jump threading at -O1 to minimize the disruption. gcc/ChangeLog: * cfgcleanup.c (pass_jump::execute): Check flag_expensive_optimizations. (pass_jump_after_combine::gate): Same. * doc/invoke.texi (-fthread-jumps): Enable for -O1. * opts.c (default_options_table): Enable -fthread-jumps at -O1. * tree-ssa-threadupdate.c (fwd_jt_path_registry::remove_jump_threads_including): Bail unless flag_thread_jumps. gcc/testsuite/ChangeLog: * gcc.dg/auto-init-uninit-1.c: Adjust. * gcc.dg/auto-init-uninit-15.c: Same. * gcc.dg/guality/example.c: Same. * gcc.dg/loop-8.c: Same. * gcc.dg/strlenopt-40.c: Same. * gcc.dg/tree-ssa/pr18133-2.c: Same. * gcc.dg/tree-ssa/pr18134.c: Same. * gcc.dg/uninit-1.c: Same. * gcc.dg/uninit-pr44547.c: Same. * gcc.dg/uninit-pr59970.c: Same.	2021-09-28 14:33:53 +02:00
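The link between jump threading and -Wuninitialized mentioned above can be pictured with a small sketch. This is an illustrative code shape only, under the assumption that it resembles the affected tests; the function name `read_if_present` is hypothetical, and no claim is made about which GCC versions or options warn on it.

```cpp
// `value` is assigned only when `have_value` is true and is read only under
// the same condition. Proving that the read is safe requires connecting the
// two tests of `have_value`; jump threading does this by routing each
// outcome of the first test directly to the matching arm of the second.
int read_if_present(bool have_value, int source) {
    int value;
    if (have_value)
        value = source;

    int result = 0;
    if (have_value)
        result = value + 1;  // never reached with `value` unassigned
    return result;
}
```

At run time the function is well defined for both inputs: `read_if_present(true, 41)` returns 42 and `read_if_present(false, 5)` returns 0.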
Thomas Schwinge	95540a6d1d	'gfortran.dg/assumed_rank_22_aux.c' messages printed vs. DejaGnu Print lower-case 'error: [...]' instead of upper-case 'ERROR: [...]', to not confuse the DejaGnu log processing harness into thinking these are DejaGnu harness ERRORs: Running /scratch/tschwing/build2-trusty-cs/gcc/build/submit-big/source-gcc/gcc/testsuite/gfortran.dg/dg.exp ... +ERROR: c_assumed num=100: x->dim[2].extent = -1 != 0 +ERROR: c_assumed num=100: x->dim[2].extent = -1 != 0 +ERROR: c_assumed num=100: x->dim[2].extent = -1 != 0 +ERROR: c_assumed num=100: x->dim[2].extent = -1 != 0 +ERROR: c_assumed num=100: x->dim[2].extent = -1 != 0 +ERROR: c_assumed num=100: x->dim[2].extent = -1 != 0 [...] Fix-up for recent commit 00f6de9c69119594f7dad3bd525937c94c8200d0 "Fortran: Fix assumed-size to assumed-rank passing [PR94070]". gcc/testsuite/ * gfortran.dg/assumed_rank_22_aux.c: Adjust messages printed.	2021-09-28 14:18:23 +02:00
Thomas Schwinge	a43ae03a05	Further test case adjustment re "Fortran: Fix assumed-size to assumed-rank passing" Fix-up for recent commit 00f6de9c69119594f7dad3bd525937c94c8200d0 "Fortran: Fix assumed-size to assumed-rank passing [PR94070]", and commit da1f6391b7c255e4e2eea983832120eff4f7d3df "libgomp.oacc-fortran/privatized-ref-2.f90: Fix dg-note". Due to use of '#if !ACC_MEM_SHARED' conditionals in 'libgomp.oacc-fortran/if-1.f90', 'target { ! openacc_host_selected }' needs some special care (ignoring the pre-existing mismatch of 'ACC_MEM_SHARED' vs. 'openacc_host_selected'). As seen with GCN offloading, we need to revert to another bit of the original code in 'libgomp.oacc-fortran/privatized-ref-2.f90'. libgomp/ * testsuite/libgomp.oacc-fortran/if-1.f90: Adjust. * testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Likewise.	2021-09-28 14:18:21 +02:00
Ilya Leoshkevich	dbed1c8693	reassoc: Propagate PHI_LOOP_BIAS along single uses PR tree-optimization/49749 introduced code that shortens dependency chains containing loop accumulators by placing them last on operand lists of associative operations. 456.hmmer benchmark on s390 could benefit from this, however, the code that needs it modifies loop accumulator before using it, and since only so-called loop-carried phis are treated as loop accumulators, the code in the present form doesn't really help. According to Bill Schmidt - the original author - such a conservative approach was chosen so as to avoid unnecessarily swapping operands, which might cause unpredictable effects. However, giving special treatment to forms of loop accumulators is acceptable. The definition of loop-carried phi is: it's a single-use phi, which is used in the same innermost loop it's defined in, at least one argument of which is defined in the same innermost loop as the phi itself. Given this, it seems natural to treat single uses of such phis as phis themselves. gcc/ChangeLog: * tree-ssa-reassoc.c (biased_names): New global. (propagate_bias_p): New function. (loop_carried_phi): Remove. (propagate_rank): Propagate bias along single uses. (get_rank): Update biased_names when needed.	2021-09-28 14:10:59 +02:00
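The situation described above, where a loop accumulator is modified before it enters an associative chain, can be sketched at the source level as follows. This is a hand-written illustration, not code from 456.hmmer or from GCC; the function names are hypothetical, and the second function shows by hand the operand order that placing the accumulator-derived value last would produce.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// As written: the loop-carried value `acc` is modified (t = acc - 1) before
// it is used, so the chain contains `t`, a single use of the accumulator,
// rather than the accumulator itself.
unsigned long chain_as_written(const std::vector<unsigned long>& a,
                               const std::vector<unsigned long>& b) {
    assert(a.size() == b.size());
    unsigned long acc = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        unsigned long t = acc - 1;  // depends on the previous iteration
        acc = t + a[i] + b[i];      // evaluated as (t + a[i]) + b[i]
    }
    return acc;
}

// Reassociated by hand: a[i] + b[i] does not depend on the previous
// iteration, so only one addition remains on the loop-carried chain.
unsigned long chain_reassociated(const std::vector<unsigned long>& a,
                                 const std::vector<unsigned long>& b) {
    assert(a.size() == b.size());
    unsigned long acc = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        unsigned long t = acc - 1;
        acc = (a[i] + b[i]) + t;    // accumulator-derived operand placed last
    }
    return acc;
}
```

Unsigned arithmetic is used so that both orders are well defined and give identical results; for inputs `{1, 2, 3}` and `{4, 5, 6}` both functions return 18.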
Ilya Leoshkevich	99c106e695	reassoc: Do not bias loop-carried PHIs early Biasing loop-carried PHIs during the 1st reassociation pass interferes with reduction chains and does not bring measurable benefits, so do it only during the 2nd reassociation pass. gcc/ChangeLog: * passes.def (pass_reassoc): Rename parameter to early_p. * tree-ssa-reassoc.c (reassoc_bias_loop_carried_phi_ranks_p): New variable. (phi_rank): Don't bias loop-carried phi ranks before vectorization pass. (execute_reassoc): Add bias_loop_carried_phi_ranks_p parameter. (pass_reassoc::pass_reassoc): Add bias_loop_carried_phi_ranks_p initializer. (pass_reassoc::set_param): Set bias_loop_carried_phi_ranks_p value. (pass_reassoc::execute): Pass bias_loop_carried_phi_ranks_p to execute_reassoc. (pass_reassoc::bias_loop_carried_phi_ranks_p): New member.	2021-09-28 14:10:13 +02:00
Jakub Jelinek	3b7041e834	i386: Don't emit fldpi etc. if -frounding-math [PR102498] i387 has instructions to store some transcendental numbers into the top of stack. The problem is that what exact bit in the last place one gets for those depends on the current rounding mode, the CPU knows the number with slightly higher precision. The compiler assumes rounding to nearest when comparing them against constants in the IL, but at runtime the rounding can be different and so some of these depending on rounding mode and the constant could be 1 ulp higher or smaller than expected. We only support changing the rounding mode at runtime if the non-default -frounding-math option is used, so the following patch just disables using those constants if that flag is on. 2021-09-28 Jakub Jelinek <jakub@redhat.com> PR target/102498 * config/i386/i386.c (standard_80387_constant_p): Don't recognize special 80387 instruction XFmode constants if flag_rounding_math. * gcc.target/i386/pr102498.c: New test.	2021-09-28 13:02:51 +02:00
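The underlying effect, that the last bit of a result depends on the rounding mode in force at run time, can be observed with a small sketch. It does not use the i387 constant-load instructions the commit is about; it only assumes a target where `FE_UPWARD` and `FE_DOWNWARD` are available, and the helper name `one_third` is hypothetical. The `volatile` qualifiers are there to keep the division from being folded at compile time.

```cpp
#include <cassert>
#include <cfenv>

// Computes 1/3 at run time under the requested rounding mode, then restores
// the mode that was active on entry.
double one_third(int rounding_mode) {
    volatile double one = 1.0;    // volatile: force a run-time division
    volatile double three = 3.0;
    const int saved = std::fegetround();
    const int rc = std::fesetround(rounding_mode);
    assert(rc == 0);              // the requested mode must be supported
    (void)rc;
    volatile double quotient = one / three;
    std::fesetround(saved);
    return quotient;
}
```

Since 1/3 is not exactly representable in binary floating point, `one_third(FE_UPWARD)` is expected to compare greater than `one_third(FE_DOWNWARD)`. Code that changes the rounding mode like this is what -frounding-math is meant for, which is why the patch keys off that flag.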
Richard Biener	34b1e44e16	tree-optimization/99793 - testcase for the PR This adds a testcase for the PR which was fixed with the fix for PR100112. 2021-09-28 Richard Biener <rguenther@suse.de> PR tree-optimization/99793 * gcc.dg/tree-ssa/pr99793.c: New testcase.	2021-09-28 12:50:29 +02:00
Richard Biener	5b8b1522e0	tree-optimization/100112 - VN last_vuse and redundant store elimination This avoids the last_vuse optimization hindering redundant store elimination by always also recording the original VUSE that was in effect on the load. In stage3 gcc/*.o we have 3182752 times recorded a single entry and 903409 times two entries (that's ~20% overhead). With just recording a single entry the number of hashtable lookups done when walking the vuse->vdef links to find an earlier access is 28961618. When recording the second entry this makes us find that earlier for downstream redundant accesses, reducing the number of hashtable lookups to 25401052 (that's a ~10% reduction). 2021-09-27 Richard Biener <rguenther@suse.de> PR tree-optimization/100112 * tree-ssa-sccvn.c (visit_reference_op_load): Record the reference into the hashtable twice in case last_vuse is different from the original vuse on the stmt. * gcc.dg/tree-ssa/ssa-fre-95.c: New testcase.	2021-09-28 12:31:46 +02:00
Jakub Jelinek	4f07769057	openmp: Don't call omp_finish_clause on implicitly added private clauses on simd [PR102492] The gimplifier adds implicit private clauses on SIMD constructs for local variables in the SIMD body if they are addressable to make sure they use the magic arrays with "omp simd array" attribute (such that each SIMD lane has its own copy), but we actually don't need to default privatize etc. those, the construction for them is done in the SIMD body and so is destruction. omp_finish_clause for C++ now requires default constructor (and dtor) for private, so that OpenMP 5.1 default(private) works, but that will never be needed on SIMD. So, this patch just doesn't call omp_finish_clause for private on simd. The C and Fortran langhooks don't do anything for private. 2021-09-28 Jakub Jelinek <jakub@redhat.com> PR middle-end/102492 * gimplify.c (gimplify_adjust_omp_clauses_1): Don't call the omp_finish_clause langhook on implicitly added OMP_CLAUSE_PRIVATE clauses on SIMD constructs. * g++.dg/gomp/simd-3.C: New test.	2021-09-28 11:38:03 +02:00
Aldy Hernandez	fb8b72ebb5	Return VARYING in range_on_path_entry if nothing found. The problem here is that the solver's code solving unknown SSAs on entry to a path was returning UNDEFINED if there were no incoming edges to the start of the path that were not the function entry block. This caused a cascade of pain down stream. Tested on x86-64 Linux. PR tree-optimization/102511 gcc/ChangeLog: * gimple-range-path.cc (path_range_query::range_on_path_entry): Return VARYING when nothing found. gcc/testsuite/ChangeLog: * gcc.dg/pr102511.c: New test. * gcc.dg/tree-ssa/ssa-dom-thread-14.c: Adjust.	2021-09-28 11:11:20 +02:00
Andrew Burgess	dc614a838e	top-level configure: setup target_configdirs based on repository The top-level configure script is shared between the gcc repository and the binutils-gdb repository. The target_configdirs variable in the configure.ac script, defines sub-directories that contain components that should be built for the target using the target tools. Some components, e.g. zlib, are built as both host and target libraries. This causes problems for binutils-gdb. If we run 'make all' in the binutils-gdb repository we end up trying to build a target version of the zlib library, which requires the target compiler be available. Often the target compiler isn't immediately available, and so the build fails. The problem with zlib impacted a previous attempt to synchronise the top-level configure scripts from gcc to binutils-gdb, see this thread: https://sourceware.org/pipermail/binutils/2019-May/107094.html And I'm in the process of importing libbacktrace in to binutils-gdb, which is also a host and target library, and triggers the same issues. I believe that for binutils-gdb, at least at the moment, there are no target libraries that we need to build. In the configure script we build three lists of things we want to build, $configdirs, $build_configdirs, and $target_configdirs, we also build two lists of things we don't want to build, $skipdirs and $noconfigdirs. We then remove anything that is in the lists of things not to build, from the list of things that should be built. My proposal is to add everything in target_configdirs into skipdirs, if the source tree doesn't contain a gcc/ sub-directory. The result is that for binutils-gdb no target tools or libraries will be built, while for the gcc repository, nothing should change. If a user builds a unified source tree, then the target tools and libraries should still be built as the gcc/ directory will be present. I've tested a build of gcc on x86-64, and the same set of target libraries still seem to get built. 
On binutils-gdb this change resolves the issues with 'make all'. ChangeLog: * configure: Regenerate. * configure.ac (skipdirs): Add the contents of target_configdirs if we are not building gcc.	2021-09-28 09:43:36 +01:00
Hongyu Wang	eea10afef7	AVX512FP16: Support basic 64/32bit vector type and operation. For 32bit target, V4HF vector is parsed same as __m64 type, V2HF is parsed by stack and returned from GPR since it is not specified by ABI. gcc/ChangeLog: PR target/102230 * config/i386/i386.h (VALID_AVX512FP16_REG_MODE): Add V2HF mode check. (VALID_SSE2_REG_VHF_MODE): Add V4HFmode and V2HFmode. (VALID_MMX_REG_MODE): Add V4HFmode. (SSE_REG_MODE_P): Replace VALID_AVX512FP16_REG_MODE with vector mode condition. * config/i386/i386.c (classify_argument): Parse V4HF/V2HF via sse regs. (function_arg_32): Add V4HFmode. (function_arg_advance_32): Likewise. * config/i386/i386.md (mode): Add V4HF/V2HF. (MODE_SIZE): Likewise. * config/i386/mmx.md (MMXMODE): Add V4HF mode. (V_32): Add V2HF mode. (VHF_32_64): New mode iterator. (mov<mode>_internal): Adjust sse alternatives to support V4HF mode move. (mov<mode>_internal): Adjust sse alternatives to support V2HF mode move. (<insn><mode>3): New define_insn for add/sub/mul/div. gcc/testsuite/ChangeLog: PR target/102230 * gcc.target/i386/avx512fp16-floatvnhf.c: Remove xfail. * gcc.target/i386/avx512fp16-trunc-extendvnhf.c: Ditto. * gcc.target/i386/avx512fp16-truncvnhf.c: Ditto. * gcc.target/i386/avx512fp16-64-32-vecop-1.c: New test. * gcc.target/i386/avx512fp16-64-32-vecop-2.c: Ditto. * gcc.target/i386/pr102230.c: Ditto.	2021-09-28 16:39:32 +08:00
Richard Biener	1dadd5110f	Fix gcc.target/i386/vect-pr97352.c for -m32 -march=cascadelake The easiest is to disable AVX2 and AVX512F explicitly. 2021-09-28 Richard Biener <rguenther@suse.de> * gcc.target/i386/vect-pr97352.c: Pass -mno-avx2 -mno-avx512f.	2021-09-28 10:05:22 +02:00
Tobias Burnus	ce450af508	gfortran.dg/include_15.f90: Add dg-prune-output [PR102500] gcc/testsuite/ PR fortran/102500 * gfortran.dg/include_15.f90: Add 'dg-prune-output' to prune -Wmissing-include-dirs output printed or not depending on how the testsuite is run.	2021-09-28 09:49:12 +02:00
Richard Biener	6fabd9e25d	Fix gcc.dg/vect/bb-slp-pr65935.c FAIL with AVX after recent change This avoids bigger than V2DF vectorization which disturbs the ability to consistently check for the vectorization result after us now also vectorizing the V2DF tail of a V4DF vectorization variant. 2021-09-28 Richard Biener <rguenther@suse.de> * gcc.dg/vect/bb-slp-pr65935.c: Prefer 128bit vectorization on x86.	2021-09-28 09:02:12 +02:00
Aldy Hernandez	e475ae9bbf	Control all jump threading passes with -fthread-jumps. Last year I mentioned that -fthread-jumps was being ignored by the majority of our jump threading passes, and Jeff said he'd be in favor of fixing this. This patch remedies the situation, but it does change existing behavior. Currently -fthread-jumps is only enabled for -O2, -O3, and -Os. This means that even if we restricted all jump threading passes with -fthread-jumps, DOM jump threading would still seep through since it runs at -O1. I propose this patch, but it does mean that DOM jump threading would have to be explicitly enabled with -O1 -fthread-jumps. gcc/ChangeLog: * tree-ssa-threadbackward.c (pass_thread_jumps::gate): Check flag_thread_jumps. (pass_early_thread_jumps::gate): Same. * tree-ssa-threadedge.c (jump_threader::thread_outgoing_edges): Return if !flag_thread_jumps. * tree-ssa-threadupdate.c (jt_path_registry::register_jump_thread): Assert that flag_thread_jumps is true. gcc/testsuite/ChangeLog: * gcc.dg/auto-init-uninit-1.c: Add -fthread-jumps. * gcc.dg/auto-init-uninit-15.c: Same. * gcc.dg/guality/example.c: Same. * gcc.dg/loop-8.c: Same. * gcc.dg/strlenopt-40.c: Same. * gcc.dg/tree-ssa/pr18133-2.c: Same. * gcc.dg/tree-ssa/pr18134.c: Same. * gcc.dg/uninit-1.c: Same. * gcc.dg/uninit-pr44547.c: Same. * gcc.dg/uninit-pr59970.c: Same.	2021-09-28 08:17:29 +02:00
liuhongt	9cfb95f9b9	Relax condition of (vec_concat:M(vec_select op0 idx0)(vec_select op0 idx1)) to allow different modes between op0 and M, but have same inner mode. This will enable optimization for below pattern. (set (reg:V2DF 87 [ xx ]) (vec_concat:V2DF (vec_select:DF (reg:V4DF 92) (parallel [ (const_int 2 [0x2]) ])) (vec_select:DF (reg:V4DF 92) (parallel [ (const_int 3 [0x3]) ])))) gcc/ChangeLog: * simplify-rtx.c (simplify_context::simplify_binary_operation_1): Relax condition of simplifying (vec_concat:M (vec_select op0 index0)(vec_select op1 index1)) to allow different modes between op0 and M, but have same inner mode. gcc/testsuite/ChangeLog: * gcc.target/i386/vect-rebuild.c: Adjust testcases. * gcc.target/i386/avx512f-vect-rebuild.c: New test.	2021-09-28 11:00:29 +08:00
liuhongt	3540429be7	Support 128/256/512-bit vector plus/smin/smax reduction for _Float16. gcc/ChangeLog: * config/i386/i386-expand.c (emit_reduc_half): Handle V8HF/V16HF/V32HFmode. * config/i386/sse.md (REDUC_SSE_PLUS_MODE): Add V8HF. (REDUC_SSE_SMINMAX_MODE): Ditto. (REDUC_PLUS_MODE): Add V16HF and V32HF. (REDUC_SMINMAX_MODE): Ditto. gcc/testsuite * gcc.target/i386/avx512fp16-reduce-op-2.c: New test. * gcc.target/i386/avx512fp16-reduce-op-3.c: New test.	2021-09-28 09:40:30 +08:00
GCC Administrator	cf966403d9	Daily bump.	2021-09-28 00:16:21 +00:00
Patrick Palka	51018dd139	c++: deduction guides and ttp rewriting [PR102479] The problem here is ultimately that rewrite_tparm_list when rewriting a TEMPLATE_TEMPLATE_PARM introduces a tree cycle in the rewritten ttp that structural_comptypes can't cope with. In particular the DECL_TEMPLATE_PARMS of a ttp's TEMPLATE_DECL normally captures an empty parameter list at its own level (and so the TEMPLATE_DECL doesn't appear in its own DECL_TEMPLATE_PARMS), but rewrite_tparm_list ends up giving it a complete parameter list. In the new testcase below, this causes infinite recursion from structural_comptypes when comparing Tmpl<char> with Tmpl<long> (where both 'Tmpl's are rewritten ttps). This patch fixes this by making rewrite_template_parm give a rewritten template template parm an empty parameter list at its own level, thereby avoiding the tree cycle. Testing the alias CTAD case revealed that we're not setting current_template_parms in alias_ctad_tweaks, which this patch also fixes. PR c++/102479 gcc/cp/ChangeLog: * pt.c (rewrite_template_parm): Handle single-level tsubst_args. Avoid a tree cycle when assigning the DECL_TEMPLATE_PARMS for a rewritten ttp. (alias_ctad_tweaks): Set current_template_parms accordingly. gcc/testsuite/ChangeLog: * g++.dg/cpp1z/class-deduction12.C: Also test alias CTAD in the same way. * g++.dg/cpp1z/class-deduction99.C: New test.	2021-09-27 16:01:10 -04:00
Aldy Hernandez	8366836860	Minor cleanups to solver. These are some minor cleanups and renames that surfaced after the hybrid_threader work. gcc/ChangeLog: * gimple-range-path.cc (path_range_query::precompute_ranges_in_block): Rename to... (path_range_query::compute_ranges_in_block): ...this. (path_range_query::precompute_ranges): Rename to... (path_range_query::compute_ranges): ...this. (path_range_query::precompute_relations): Rename to... (path_range_query::compute_relations): ...this. (path_range_query::precompute_phi_relations): Rename to... (path_range_query::compute_phi_relations): ...this. * gimple-range-path.h: Rename precompute* to compute*. * tree-ssa-threadbackward.c (back_threader::find_taken_edge_switch): Same. (back_threader::find_taken_edge_cond): Same. * tree-ssa-threadedge.c (hybrid_jt_simplifier::compute_ranges_from_state): Same. (hybrid_jt_state::register_equivs_stmt): Inline... * tree-ssa-threadedge.h: ...here.	2021-09-27 17:39:51 +02:00
Aldy Hernandez	4ef1e524fd	Remove old VRP jump threader code. There's a lot of code that melts away without the ASSERT_EXPR based jump threader. Also, I cleaned up the include files as part of the process. gcc/ChangeLog: * tree-vrp.c (lhs_of_dominating_assert): Remove. (class vrp_jt_state): Remove. (class vrp_jt_simplifier): Remove. (vrp_jt_simplifier::simplify): Remove. (class vrp_jump_threader): Remove. (vrp_jump_threader::vrp_jump_threader): Remove. (vrp_jump_threader::~vrp_jump_threader): Remove. (vrp_jump_threader::before_dom_children): Remove. (vrp_jump_threader::after_dom_children): Remove.	2021-09-27 17:39:51 +02:00
Aldy Hernandez	0288527f47	Replace VRP threader with a hybrid forward threader. This patch implements the new hybrid forward threader and replaces the embedded VRP threader with it. With all the pieces that have gone in, the implementation of the hybrid threader is straightforward: convert the current state into SSA imports that the solver will understand, and let the path solver precompute ranges and relations for the path. After this setup is done, we can use the range_query API to solve gimple statements in the threader. The forward threader is now engine agnostic so there are no changes to the threader per se. I have put the hybrid bits in tree-ssa-threadedge.*, instead of VRP, because they will also be used in the evrp removal of the DOM/threader, which is my next task. Most of the patch is actually test changes. I have gone through every single one and verified that we're correct. Most were trivial dump file name changes, but others required going through the IL and certifying that the different IL was expected. For example, in pr59597.c, we have one less thread because the ASSERT_EXPR was getting in the way, and making it seem like things were not crossing loops. The hybrid threader sees the correct representation of the IL, and avoids threading this one case. The final numbers are a 12.16% improvement in jump threads immediately after VRP, and a 0.82% improvement in overall jump threads. The performance drop is 0.6% (plus the 1.43% hit from moving the embedded threader into its own pass). As I've said, I'd prefer to keep the threader in its own pass, but if this is an issue, we can address this with a shared ranger when VRP is replaced with an evrp instance (upcoming). Note that these numbers are slightly different than what I originally posted. A few correctness tweaks, plus restricting loop threads, made the difference. That being said, I was aiming for par. A 12% gain is just gravy ;-). 
When we merge the threaders, we should see even better numbers-- and we'll have the benefit of an entire release stress testing the solver. As I mentioned in my introductory note, paths ending in MEM_REF conditional are missing. In reality, this didn't make a difference, as it was so rare. However, as a follow-up, I will distill a test and add a suitable PR to keep us honest. There is a one-line change to libgomp/team.c silencing a new used uninitialized warning. As my previous work with the threaders has shown, warnings flare up after each improvement to jump threading. I expect this to be no different. I've promised Jakub to investigate fully, so I will analyze and add the appropriate PR for the warning experts. Oh yeah, the new pass dump is called vrp-threader[12] to match each VRP[12] pass. However, there's no reason for it to either be named vrp-threader, or for it to live in tree-vrp.c. Tested on x86-64 Linux. OK? p.s. "Did I say 5 weeks? My bad, I meant 5 months." gcc/ChangeLog: * passes.def (pass_vrp_threader): New. * tree-pass.h (make_pass_vrp_threader): Add make_pass_vrp_threader. * tree-ssa-threadedge.c (hybrid_jt_state::register_equivs_stmt): New. (hybrid_jt_simplifier::hybrid_jt_simplifier): New. (hybrid_jt_simplifier::simplify): New. (hybrid_jt_simplifier::compute_ranges_from_state): New. * tree-ssa-threadedge.h (class hybrid_jt_state): New. (class hybrid_jt_simplifier): New. * tree-vrp.c (execute_vrp): Remove ASSERT_EXPR based jump threader. (class hybrid_threader): New. (hybrid_threader::hybrid_threader): New. (hybrid_threader::~hybrid_threader): New. (hybrid_threader::before_dom_children): New. (hybrid_threader::after_dom_children): New. (execute_vrp_threader): New. (class pass_vrp_threader): New. (make_pass_vrp_threader): New. libgomp/ChangeLog: * team.c: Initialize start_data. * testsuite/libgomp.graphite/force-parallel-4.c: Adjust. * testsuite/libgomp.graphite/force-parallel-8.c: Adjust. gcc/testsuite/ChangeLog: * gcc.dg/torture/pr55107.c: Adjust. 
* gcc.dg/tree-ssa/phi_on_compare-1.c: Adjust. * gcc.dg/tree-ssa/phi_on_compare-2.c: Adjust. * gcc.dg/tree-ssa/phi_on_compare-3.c: Adjust. * gcc.dg/tree-ssa/phi_on_compare-4.c: Adjust. * gcc.dg/tree-ssa/pr21559.c: Adjust. * gcc.dg/tree-ssa/pr59597.c: Adjust. * gcc.dg/tree-ssa/pr61839_1.c: Adjust. * gcc.dg/tree-ssa/pr61839_3.c: Adjust. * gcc.dg/tree-ssa/pr71437.c: Adjust. * gcc.dg/tree-ssa/ssa-dom-thread-11.c: Adjust. * gcc.dg/tree-ssa/ssa-dom-thread-16.c: Adjust. * gcc.dg/tree-ssa/ssa-dom-thread-18.c: Adjust. * gcc.dg/tree-ssa/ssa-dom-thread-2a.c: Adjust. * gcc.dg/tree-ssa/ssa-dom-thread-4.c: Adjust. * gcc.dg/tree-ssa/ssa-thread-14.c: Adjust. * gcc.dg/tree-ssa/ssa-vrp-thread-1.c: Adjust. * gcc.dg/tree-ssa/vrp106.c: Adjust. * gcc.dg/tree-ssa/vrp55.c: Adjust.	2021-09-27 17:39:51 +02:00
Martin Liska	dd11aab646	Come up with section_flag enum. gcc/ChangeLog: * output.h (enum section_flag): New. (SECTION_FORGET): Remove. (SECTION_ENTSIZE): Make it (1UL << 8) - 1. (SECTION_STYLE_MASK): Define it based on other enum values. * varasm.c (switch_to_section): Remove unused handling of SECTION_FORGET.	2021-09-27 16:59:38 +02:00
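The entry above describes moving section flags into an enum in which `SECTION_ENTSIZE` becomes the mask `(1UL << 8) - 1`. A minimal sketch of that general pattern follows, assuming the low 8 bits carry an entry size and the higher bits are boolean flags. Every `demo_`/`DEMO_` name and every bit position is hypothetical and chosen for illustration; none is taken from GCC's `output.h`.

```c
#include <assert.h>

/* Hypothetical layout for illustration only; these are NOT the values
   from GCC's output.h.  Bits 0..7 carry an entry size, higher bits are
   independent boolean flags.  */
enum demo_section_flag
{
  DEMO_ENTSIZE = (1UL << 8) - 1,  /* mask covering bits 0..7 */
  DEMO_CODE = 1UL << 8,
  DEMO_WRITE = 1UL << 9,
  DEMO_DEBUG = 1UL << 10
};

/* Combine boolean flags with an entry size.  The size must fit in the
   mask and the boolean flags must not overlap it.  */
static unsigned int
demo_make_flags (unsigned int bool_flags, unsigned int entsize)
{
  assert ((entsize & ~(unsigned int) DEMO_ENTSIZE) == 0);
  assert ((bool_flags & DEMO_ENTSIZE) == 0);
  return bool_flags | entsize;
}

/* Extract the entry size stored in the low bits.  */
static unsigned int
demo_entsize (unsigned int flags)
{
  return flags & DEMO_ENTSIZE;
}

/* Test a single boolean flag.  */
static int
demo_has_flag (unsigned int flags, enum demo_section_flag flag)
{
  return (flags & flag) != 0;
}
```

Writing the mask as `(1UL << 8) - 1` keeps it visibly tied to the first flag bit (`1UL << 8`), so the two cannot drift apart silently if the layout changes.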
Martin Liska	a64697d7a3	flag_complex_method: support optimize attribute gcc/c-family/ChangeLog: * c-opts.c (c_common_init_options_struct): Set also x_flag_default_complex_method. gcc/ChangeLog: * common.opt: Add new variable flag_default_complex_method. * opts.c (finish_options): Handle flags related to x_flag_complex_method. * toplev.c (process_options): Remove option handling related to flag_complex_method. gcc/go/ChangeLog: * go-lang.c (go_langhook_init_options_struct): Set also x_flag_default_complex_method. gcc/lto/ChangeLog: * lto-lang.c (lto_init_options_struct): Set also x_flag_default_complex_method. gcc/testsuite/ChangeLog: * gcc.c-torture/compile/attr-complex-method-2.c: New test. * gcc.c-torture/compile/attr-complex-method.c: New test.	2021-09-27 16:58:37 +02:00
Vincent Lefevre	3e6a511b94	Update pathname for IBM long double description. include/ * floatformat.h: Update pathname for IBM long double description.	2021-09-27 10:56:14 -04:00
Richard Biener	d06dc8a2c7	middle-end/102450 - avoid type_for_size for non-existing modes This avoids asking type_for_size for types with sizes for which no scalar integer mode exists. Instead the following uses int_mode_for_size to get the same result. 2021-09-27 Richard Biener <rguenther@suse.de> PR middle-end/102450 * gimple-fold.c (gimple_fold_builtin_memory_op): Avoid using type_for_size, instead use int_mode_for_size.	2021-09-27 15:04:32 +02:00
Tobias Burnus	da1f6391b7	libgomp.oacc-fortran/privatized-ref-2.f90: Fix dg-note In my last commit, r12-3897-g00f6de9c69119594f7dad3bd525937c94c8200d0, which inlined array-size code, I had to update the expected output. However, in doing so, I accidentally (copy'n'paste) changed dg-note into dg-message. libgomp/ * testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Change dg-message back to dg-note.	2021-09-27 14:33:39 +02:00
Tobias Burnus	00f6de9c69	Fortran: Fix assumed-size to assumed-rank passing [PR94070] This code inlines the size0 and size1 libgfortran calls, the former is still used by libgfortran itself (and by old code). Besides permitting more optimizations, it also permits to handle assumed-rank dummies better: If the dummy argument is a nonpointer/nonallocatable, an assumed-size actual arg is represented by having ubound == -1 for the last dimension. However, for allocatable/pointers, this value can also exist. Hence, the dummy arg attr has to be honored. For that reason, when calling an assumed-rank procedure with nonpointer, nonallocatable dummy arguments, the bounds have to be updated to avoid the case ubound == -1 for the last dimension. PR fortran/94070 gcc/fortran/ChangeLog: * trans-array.c (gfc_tree_array_size): New function to find size inline (whole array or one dimension). (array_parameter_size): Use it, take stmt_block as arg. (gfc_conv_array_parameter): Update call. * trans-array.h (gfc_tree_array_size): Add prototype. * trans-decl.c (gfor_fndecl_size0, gfor_fndecl_size1): Remove these global vars. (gfc_build_intrinsic_function_decls): Remove their initialization. * trans-expr.c (gfc_conv_procedure_call): Update bounds of pointer/allocatable actual args to nonallocatable/nonpointer dummies to be one based. * trans-intrinsic.c (gfc_conv_intrinsic_shape): Fix case for assumed rank with allocatable/pointer dummy. (gfc_conv_intrinsic_size): Update to use inline function. * trans.h (gfor_fndecl_size0, gfor_fndecl_size1): Remove var decl. libgfortran/ChangeLog: * intrinsics/size.c (size0, size1): Comment that now not used by newer compiler code. libgomp/ChangeLog: * testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Update expected dg-note output. gcc/testsuite/ChangeLog: * gfortran.dg/c-interop/cf-out-descriptor-6.f90: Remove xfail. * gfortran.dg/c-interop/size.f90: Remove xfail. * gfortran.dg/intrinsic_size_3.f90: Update scan-tree-dump-times. 
* gfortran.dg/transpose_optimization_2.f90: Likewise. * gfortran.dg/size_optional_dim_1.f90: Add scan-tree-dump-not. * gfortran.dg/assumed_rank_22.f90: New test. * gfortran.dg/assumed_rank_22_aux.c: New test.	2021-09-27 14:04:54 +02:00
Andrew Pinski	76773d3fea	Fix PR c/94726: ICE with __builtin_shuffle and changing of types The problem here is __builtin_shuffle when called with two arguments instead of 1, uses a SAVE_EXPR to put in for the 1st and 2nd operand of VEC_PERM_EXPR and when we go and gimplify the SAVE_EXPR, the type is now error_mark_node and that fails hard. This fixes the problem by adding a simple check for type of operand of SAVE_EXPR not to be error_mark_node. OK? Bootstrapped and tested on aarch64-linux-gnu with no regressions. gcc/ChangeLog: PR c/94726 * gimplify.c (gimplify_save_expr): Return early if the type of val is error_mark_node. gcc/testsuite/ChangeLog: PR c/94726 * gcc.dg/pr94726.c: New test.	2021-09-27 10:37:28 +00:00
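For readers unfamiliar with the two call forms mentioned above, here is a small sketch of valid `__builtin_shuffle` usage with GCC's vector extensions. It only illustrates the one-vector and two-vector forms and does not reproduce the ICE from the PR. The helper names (`permute_one`, `select_two`, `v4si_equal`, `shuffle_demo`) are hypothetical, and the code assumes GCC, since other compilers may not provide this builtin.

```c
#include <assert.h>

/* Four 32-bit ints in a 16-byte vector (GCC vector extension).  */
typedef int v4si __attribute__ ((vector_size (16)));

/* One-vector form: each mask element selects an element of A.  */
static v4si
permute_one (v4si a, v4si mask)
{
  return __builtin_shuffle (a, mask);
}

/* Two-vector form: mask values 0..3 select from A, 4..7 from B.  */
static v4si
select_two (v4si a, v4si b, v4si mask)
{
  return __builtin_shuffle (a, b, mask);
}

/* Element-wise comparison helper; returns 1 when all lanes match.  */
static int
v4si_equal (v4si x, v4si y)
{
  for (int i = 0; i < 4; i++)
    if (x[i] != y[i])
      return 0;
  return 1;
}

/* Exercise both forms with the values used in the GCC manual.  */
static void
shuffle_demo (void)
{
  v4si a = { 1, 2, 3, 4 };
  v4si b = { 5, 6, 7, 8 };
  v4si mask1 = { 0, 1, 1, 3 };
  v4si mask2 = { 0, 4, 2, 5 };
  v4si want1 = { 1, 2, 2, 4 };
  v4si want2 = { 1, 5, 3, 6 };

  assert (v4si_equal (permute_one (a, mask1), want1));
  assert (v4si_equal (select_two (a, b, mask2), want2));
}
```

Calling `shuffle_demo ()` from `main` runs both checks. The two-vector form is the one the commit message refers to as going through a SAVE_EXPR for the first and second operands.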
Aldy Hernandez	d5f8abe1d3	Use on-demand ranges in ssa_name_has_boolean_range before querying nonzero bits. The function ssa_name_has_boolean_range looks at the nonzero bits stored in SSA_NAME_RANGE_INFO. These are global in nature and are the result of a previous evrp/VRP run (technically other passes can also set them). However, we can do better if we use get_range_query. Doing so will use a ranger if enabled in a pass, or global ranges otherwise. The call to get_nonzero_bits remains, as there are passes that will set them independently of the global range info. Tested on x86-64 Linux with a regstrap as well as in a DOM environment using an on-demand ranger instead of evrp. gcc/ChangeLog: * tree-ssanames.c (ssa_name_has_boolean_range): Use get_range_query.	2021-09-27 12:23:59 +02:00
Aldy Hernandez	e1d01f4973	Convert some evrp uses in DOM to the range_query API. DOM is the last remaining user of the evrp engine. This patch converts a few uses of the engine and vr-values into the new API. There is one subtle change. The call to vr_value's op_with_constant_singleton_value_range can theoretically return non-constants, unlike the range_query API which only returns constants. In this particular case it doesn't matter because the symbolic stuff will have been handled by the const_and_copies/avail_exprs read in the SSA_NAME_VALUE copy immediately before. I have verified this is the case by asserting that all calls to op_with_constant_singleton_value_range at this point return either NULL or an INTEGER_CST. Tested on x86-64 Linux with a regstrap, as well as the aforementioned assert. gcc/ChangeLog: * gimple-ssa-evrp-analyze.h (class evrp_range_analyzer): Remove vrp_visit_cond_stmt. * tree-ssa-dom.c (cprop_operand): Convert to range_query API. (cprop_into_stmt): Same. (dom_opt_dom_walker::optimize_stmt): Same.	2021-09-27 11:43:19 +02:00
Richard Biener	6390c5047a	Allow different vector types for stmt groups This allows vectorization (in practice non-loop vectorization) to have a stmt participate in different vector type vectorizations. It allows us to remove vect_update_shared_vectype and replace it by pushing/popping STMT_VINFO_VECTYPE from SLP_TREE_VECTYPE around vect_analyze_stmt and vect_transform_stmt. For data-ref the situation is a bit more complicated since we analyze alignment info with a specific vector type in mind which doesn't play well when that changes. So the bulk of the change is passing down the actual vector type used for a vectorized access to the various accessors of alignment info, first and foremost dr_misalignment but also aligned_access_p, known_alignment_for_access_p, vect_known_alignment_in_bytes and vect_supportable_dr_alignment. I took the liberty to replace ALL_CAPS macro accessors with the lower-case function invocations. The actual changes to the behavior are in dr_misalignment which now is the place factoring in the negative step adjustment as well as handling alignment queries for a vector type with bigger alignment requirements than what we can (or have) analyze(d). vect_slp_analyze_node_alignment makes use of this and upon receiving a vector type with a bigger alignment desire re-analyzes the DR with respect to it but keeps an older more precise result if possible. In this context it might be possible to do the analysis just once but instead of analyzing with respect to a specific desired alignment look for the biggest alignment we can compute a not unknown alignment. The ChangeLog includes the functional changes but not the bulk due to the alignment accessor API changes - I hope that's something good. 2021-09-17 Richard Biener <rguenther@suse.de> PR tree-optimization/97351 PR tree-optimization/97352 PR tree-optimization/82426 * tree-vectorizer.h (dr_misalignment): Add vector type argument. (aligned_access_p): Likewise. (known_alignment_for_access_p): Likewise. 
(vect_supportable_dr_alignment): Likewise. (vect_known_alignment_in_bytes): Likewise. Refactor. (DR_MISALIGNMENT): Remove. (vect_update_shared_vectype): Likewise. * tree-vect-data-refs.c (dr_misalignment): Refactor, handle a vector type with larger alignment requirement and apply the negative step adjustment here. (vect_calculate_target_alignment): Remove. (vect_compute_data_ref_alignment): Get explicit vector type argument, do not apply a negative step alignment adjustment here. (vect_slp_analyze_node_alignment): Re-analyze alignment when we re-visit the DR with a bigger desired alignment but keep more precise results from smaller alignments. * tree-vect-slp.c (vect_update_shared_vectype): Remove. (vect_slp_analyze_node_operations_1): Do not update the shared vector type on stmts. * tree-vect-stmts.c (vect_analyze_stmt): Push/pop the vector type of an SLP node to the representative stmt-info. (vect_transform_stmt): Likewise. * gcc.target/i386/vect-pr82426.c: New testcase. * gcc.target/i386/vect-pr97352.c: Likewise.	2021-09-27 10:24:12 +02:00
liuhongt	e7b8d70200	Revert "Optimize v4sf reduction.". This reverts commit 8f323c712ea76cc4506b03895e9b991e4e4b2baf. PR target/102473 PR target/101059	2021-09-27 15:51:24 +08:00
GCC Administrator	1932e1169a	Daily bump.	2021-09-27 00:16:16 +00:00
Tobias Burnus	fe2771b291	Fortran: Fix associated intrinsic with assumed rank [PR101334] ASSOCIATED (ptr, tgt) takes as first argument also an assumed-rank array; however, using it together with a tgt (required to be non assumed rank) had issues for both scalar and nonscalar tgt. PR fortran/101334 gcc/fortran/ChangeLog: * trans-intrinsic.c (gfc_conv_associated): Support assumed-rank 'pointer' with scalar/array 'target' argument. libgfortran/ChangeLog: * intrinsics/associated.c (associated): Also check for same rank. gcc/testsuite/ChangeLog: * gfortran.dg/associated_assumed_rank.f90: New test.	2021-09-26 19:26:01 +02:00
liuhongt	e98e12c40b	Remove storage only description for _Float16 w/o avx512fp16. gcc/ChangeLog: * doc/extend.texi (Half-Precision): Remove storage only description for _Float16 w/o avx512fp16.	2021-09-26 09:04:41 +08:00

188347 Commits