OpenE2K/gcc - gcc - Expired Mentality Git

Author	SHA1	Message	Date
Bernhard Reutner-Fischer	b58c12f3cf	contrib: testsuite-management: Update to be python3 compatible contrib/ChangeLog: * testsuite-management/validate_failures.py: 2to3	2021-11-05 16:38:03 +01:00
Wilco Dijkstra	b33b267834	AArch64: Fix PR103085 The stack protector implementation hides symbols in a const unspec, which means movdi/movsi patterns must always support const on symbol operands and explicitly strip away the unspec. Do this for the recently added GOT alternatives. Add a test to ensure stack-protector tests GOT accesses as well. 2021-11-05 Wilco Dijkstra <wdijkstr@arm.com> PR target/103085 * config/aarch64/aarch64.c (aarch64_mov_operand_p): Strip the salt first. * config/aarch64/constraints.md: Support const in Usw. gcc/testsuite/ PR target/103085 * gcc.target/aarch64/pr103085.c: New test	2021-11-05 15:36:32 +00:00
John David Anglin	a505e1fae4	Move PREFERRED_DEBUGGING_TYPE define in pa64-hpux.h to pa.h This fixes D language build on hppa64-hpux11. 2021-11-05 John David Anglin <danglin@gcc.gnu.org> gcc/ChangeLog: * config/pa/pa.h (PREFERRED_DEBUGGING_TYPE): Define to DWARF2_DEBUG. * config/pa/pa64-hpux.h (PREFERRED_DEBUGGING_TYPE): Remove define.	2021-11-05 15:05:15 +00:00
Martin Liska	d8a62882b8	gcov-profile: Filter test only for some targets [PR102945] PR gcov-profile/102945 gcc/testsuite/ChangeLog: * gcc.dg/gcov-info-to-gcda.c: Filter supported targets.	2021-11-05 15:40:47 +01:00
Richard Biener	bcf4065c90	Split vector loop analysis into main and epilogue analysis As discussed this splits the analysis loop into two, first settling on a vector mode used for the main loop and only then analyzing the epilogue of that for possible vectorization. That makes it easier to put in support for unrolled main loops. On the way I've realized some cleanup opportunities, namely caching n_stmts in vec_info_shared (it's computed by dataref analysis) to avoid passing it around, and no longer setting/clearing loop->aux during analysis - try_vectorize_loop_1 will ultimately set it on those we vectorize. This also gets rid of the previously introduced callback in vect_analyze_loop_1 in favor of making that advance the mode iterator. I'm now pushing VOIDmode explicitly into the vector_modes array which makes the re-start on the epilogue side a bit more straight-forward. Note that will now use auto-detection of the vector mode in case the main loop used it and we want to try LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P and the first mode from the target array if not. I've added a comment that says we may want to make sure we don't try vectorizing the epilogue with a bigger vector size than the main loop but the situation isn't very likely to appear in practice I guess (and it was also present before this change). In principle this change should not change vectorization decisions but the way we handled re-analyzing epilogues as main loops makes me only 99% sure that it does. 2021-11-05 Richard Biener <rguenther@suse.de> * tree-vectorizer.h (vec_info_shared::n_stmts): Add. (LOOP_VINFO_N_STMTS): Likewise. (vec_info_for_bb): Remove unused function. * tree-vectorizer.c (vec_info_shared::vec_info_shared): Initialize n_stmts member. * tree-vect-loop.c: Remove INCLUDE_FUNCTIONAL. (vect_create_loop_vinfo): Do not set loop->aux. (vect_analyze_loop_2): Do not get n_stmts as argument, instead use LOOP_VINFO_N_STMTS. Set LOOP_VINFO_VECTORIZABLE_P here. 
(vect_analyze_loop_1): Remove callback, get the mode iterator and autodetected_vector_mode as argument, advancing the iterator and initializing autodetected_vector_mode here. (vect_analyze_loop): Split analysis loop into two, first processing main loops only and then epilogues.	2021-11-05 14:34:42 +01:00
Martin Jambor	ea42c80585	ipa: Do not require RECORD_TYPE for ancestor jump functions The check this patch removes has remained from times when ancestor jump functions were used only for devirtualization and also contained BINFOs. It is not necessary now and should have been removed a long time ago. gcc/ChangeLog: 2021-11-04 Martin Jambor <mjambor@suse.cz> * ipa-prop.c (compute_complex_assign_jump_func): Remove unnecessary check for RECORD_TYPE.	2021-11-05 14:29:31 +01:00
Jonathan Wakely	30b8ec68e2	libstdc++: Add xfail to pretty printer tests that fail in C++20 For some reason the type printer for std::string doesn't work in C++20 mode, so std::basic_string<char, char_traits<char>, allocator<char>> is printed out in full rather than being shown as std::string. It's probably related to the fact that the extern template declarations are disabled for C++20, but I don't know why that affects GDB. For now I'm just marking the relevant tests as XFAIL. That requires adding support for target selectors to individual GDB directives such as note-test and whatis-regexp-test. libstdc++-v3/ChangeLog: * testsuite/lib/gdb-test.exp: Add target selector support to the dg-final directives. * testsuite/libstdc++-prettyprinters/80276.cc: Add xfail for C++20. * testsuite/libstdc++-prettyprinters/libfundts.cc: Likewise. * testsuite/libstdc++-prettyprinters/prettyprinters.exp: Tweak comment.	2021-11-05 12:22:31 +00:00
Gerald Pfeifer	44d9d55c6d	include: Allow for our md5.h to defer to the system header This came up in the context of libsanitizer, where platform-specific support for FreeBSD relies on aspects provided by FreeBSD's own md5.h. Address this by allowing GCC's md5.h to pull in the system header instead, controlled by a new macro USE_SYSTEM_MD5. 2021-11-05 Gerald Pfeifer <gerald@pfeifer.com> Jakub Jelinek <jakub@redhat.com> include/ * md5.h (USE_SYSTEM_MD5): Introduce.	2021-11-05 13:06:34 +01:00
Gerald Pfeifer	84cbbb0a16	doc: No longer generate old.html Commit `431d26e1dd` removed doc/install-old.texi, alas we still tried to generate the associated web page old.html - which then turned out empty. Simply remove this from the list of pages to be generated. gcc: * doc/install.texi2html: Do not generate old.html any longer.	2021-11-05 13:06:03 +01:00
Martin Liska	14c7041a1f	Reset when -gtoggle is used in gcc_options. PR debug/102955 gcc/ChangeLog: * opts.c (finish_options): Reset flag_gtoggle when it is used. gcc/testsuite/ChangeLog: * g++.dg/pr102955.C: New test.	2021-11-05 13:01:01 +01:00
Jakub Jelinek	155f6b2be4	dwarf2out: Fix up CONST_WIDE_INT handling once more [PR103046] My last change to CONST_WIDE_INT handling in add_const_value_attribute broke handling of CONST_WIDE_INT constants like ((__uint128_t) 1 << 120). wi::min_precision (w1, UNSIGNED) in that case 121, but wide_int::from creates a wide_int that has 0 and 0xff00000000000000ULL in its elts and precision 121. When we output that, we output both elements and thus emit 0, 0xff00000000000000 instead of the desired 0, 0x0100000000000000. IMHO we should actually pass machine_mode to add_const_value_attribute from callers, so that we know exactly what precision we want. Because hypothetically, if say mode is OImode and the CONST_WIDE_INT value fits into 128 bits or 192 bits, we'd emit just those 128 or 192 bits but debug info users would expect 256 bits. On typedef unsigned __int128 U; int main () { U a = (U) 1 << 120; U b = 0xffffffffffffffffULL; U c = ((U) 0xffffffff00000000ULL) << 64; return 0; } vanilla gcc incorrectly emits 0, 0xff00000000000000 for a, 0xffffffffffffffff alone (DW_FORM_data8) for b and 0, 0xffffffff00000000 for c. gcc with the previously posted PR103046 patch emits 0, 0x0100000000000000 for a, 0xffffffffffffffff alone for b and 0, 0xffffffff00000000 for c. And with this patch we emit 0, 0x0100000000000000 for a, 0xffffffffffffffff, 0 for b and 0, 0xffffffff00000000 for c. So, the patch below certainly causes larger debug info (well, 128-bit integers are pretty rare), but in this case the question is if it isn't more correct, as debug info consumers generally will not know if they should sign or zero extend the value in DW_AT_const_value. The previous code assumes they will always zero extend it... 2021-11-05 Jakub Jelinek <jakub@redhat.com> PR debug/103046 * dwarf2out.c (add_const_value_attribute): Add MODE argument, use it in CONST_WIDE_INT handling. Adjust recursive calls. 
(add_location_or_const_value_attribute): Pass DECL_MODE (decl) to new add_const_value_attribute argument. (tree_add_const_value_attribute): Pass TYPE_MODE (type) to new add_const_value_attribute argument.	2021-11-05 10:20:10 +01:00
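The element values quoted in the PR103046 message above can be reproduced with a small model. The following Python sketch is not GCC code: it assumes a wide integer is stored as 64-bit elements, least significant first, sign-extended from its precision, and the helper name `wide_int_elts` is hypothetical. It only illustrates why the precision (121 bits versus the full 128-bit mode) changes the emitted high element.

```python
MASK64 = (1 << 64) - 1


def wide_int_elts(value, precision):
    """Hypothetical model: split `value` into 64-bit elements (low first),
    sign-extending from bit `precision - 1` before splitting."""
    value &= (1 << precision) - 1
    if (value >> (precision - 1)) & 1:
        # Top bit at this precision is set, so treat the value as negative.
        value -= 1 << precision
    count = (precision + 63) // 64
    return [(value >> (64 * i)) & MASK64 for i in range(count)]


a = 1 << 120
# Minimum unsigned precision (121): bit 120 is the top bit and gets smeared.
print([hex(e) for e in wide_int_elts(a, 121)])  # ['0x0', '0xff00000000000000']
# Full 128-bit mode precision: the high element keeps only bit 120.
print([hex(e) for e in wide_int_elts(a, 128)])  # ['0x0', '0x100000000000000']
```

Under the same assumptions, `wide_int_elts(0xffffffffffffffff, 128)` yields two elements, the second being zero, which corresponds to the "0xffffffffffffffff, 0" output described for `b`.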
Rasmus Villemoes	44d0243a24	gcc: vx-common.h: fix test for VxWorks7 The macro TARGET_VXWORKS7 is always defined (see vxworks-dummy.h). Thus we need to test its value, not its definedness. Fixes `aca124df` (define NO_DOT_IN_LABEL only in vxworks6). gcc/ChangeLog: * config/vx-common.h: Test value of TARGET_VXWORKS7 rather than definedness.	2021-11-05 09:42:21 +01:00
Richard Biener	33f1d03870	First refactor of vect_analyze_loop This refactors the main loop analysis part in vect_analyze_loop, re-purposing the existing vect_reanalyze_as_main_loop for this to reduce code duplication. Failure flow is a bit tricky since we want to extract info from the analyzed loop but I wanted to share the destruction part. Thus I add some std::function and lambda to funnel post-analysis for the case we want that (when analyzing from the main iteration but not when re-analyzing an epilogue as main). In addition I split vect_analyze_loop_form into analysis and vinfo creation so we can do the analysis only once, simplifying the new vect_analyze_loop_1. As discussed we probably want to change the loop over vector modes to first only analyze things as the main loop, picking the best (or simd VF) mode for the main loop and then analyze for a vectorized epilogue. The unroll would then integrate with the main loop vectorization. I think that currently we may fail to analyze the epilogue with the same mode as the main loop when using partial vectors since we increment mode_i before doing that. 2021-11-04 Richard Biener <rguenther@suse.de> * tree-vectorizer.h (struct vect_loop_form_info): New. (vect_analyze_loop_form): Adjust. (vect_create_loop_vinfo): New. * tree-parloops.c (gather_scalar_reductions): Adjust for vect_analyze_loop_form API change. * tree-vect-loop.c: Include <functional>. (vect_analyze_loop_form_1): Rename to vect_analyze_loop_form, take struct vect_loop_form_info as output parameter and adjust. (vect_analyze_loop_form): Rename to vect_create_loop_vinfo and split out call to the original vect_analyze_loop_form_1. (vect_reanalyze_as_main_loop): Rename to... (vect_analyze_loop_1): ... this, factor out the call to vect_analyze_loop_form and generalize to be able to use it twice ... (vect_analyze_loop): ... here. Perform vect_analyze_loop_form once only and here.	2021-11-05 09:03:11 +01:00
Xionghu Luo	614b39757b	rs6000: Fix incorrect fusion constraint [PR102991] gcc/ChangeLog: 2021-11-05 Xionghu Luo <luoxhu@linux.ibm.com> PR target/102991 * config/rs6000/fusion.md: Regenerate. * config/rs6000/genfusion.pl: Fix incorrect clobber constraint.	2021-11-04 20:19:59 -05:00
GCC Administrator	29a1af24ef	Daily bump.	2021-11-05 00:16:36 +00:00
Jonathan Wakely	a634928f5c	libstdc++: Fix pretty printing of std::unique_ptr [PR103086] Since std::tuple started using [[no_unique_address]] the tuple<T, D> member of std::unique_ptr<T, D> has two _M_head_impl subobjects, in different base classes. That means this printer code is ambiguous: tuple_head_type = tuple_impl_type.fields()[1].type # _Head_base head_field = tuple_head_type.fields()[0] if head_field.name == '_M_head_impl': self.pointer = tuple_member['_M_head_impl'] In older versions of GDB it happened to work by chance, because GDB returned the last _M_head_impl member and std::tuple's base classes are stored in reverse order, so the last one was the T element of the tuple. Since GDB 11 it returns the first _M_head_impl, which is the deleter element. The fix is for the printer to stop using an ambiguous field name and cast the tuple to the correct base class before accessing the _M_head_impl member. Instead of fixing this in both UniquePointerPrinter and StdPathPrinter a new unique_ptr_get function is defined to do it correctly. That is defined in terms of new tuple_get and _tuple_impl_get functions. It would be possible to reuse _tuple_impl_get to access each element in StdTuplePrinter._iterator.__next__, but that already does the correct casting, and wouldn't be much simpler anyway. libstdc++-v3/ChangeLog: PR libstdc++/103086 * python/libstdcxx/v6/printers.py (_tuple_impl_get): New helper for accessing the tuple element stored in a _Tuple_impl node. (tuple_get): New function for accessing a tuple element. (unique_ptr_get): New function for accessing a unique_ptr. (UniquePointerPrinter, StdPathPrinter): Use unique_ptr_get. * python/libstdcxx/v6/xmethods.py (UniquePtrGetWorker): Cast tuple to its base class before accessing _M_head_impl.	2021-11-04 22:50:02 +00:00
Jonathan Wakely	f4130a3eb5	libstdc++: Deprecate std::unexpected and handler functions These functions have been deprecated since C++11, and were removed in C++17. The proposal P0323 wants to reuse the name std::unexpected for a class template, so we will need to stop defining the current function for C++23 anyway. This marks them as deprecated for C++11 and up, to warn users they won't continue to be available. It disables them for C++17 and up, unless the _GLIBCXX_USE_DEPRECATED macro is defined. The <unwind-cxx.h> header uses std::unexpected_handler in the public API, but since that type is the same as std::terminate_handler we can just use that instead, to avoid warnings about it being deprecated. libstdc++-v3/ChangeLog: * doc/xml/manual/evolution.xml: Document deprecations. * doc/html/*: Regenerate. * libsupc++/exception (unexpected_handler, unexpected) (get_unexpected, set_unexpected): Add deprecated attribute. Do not define without _GLIBCXX_USE_DEPRECATED for C++17 and up. * libsupc++/eh_personality.cc (PERSONALITY_FUNCTION): Disable deprecated warnings. * libsupc++/eh_ptr.cc (std::rethrow_exception): Likewise. * libsupc++/eh_terminate.cc: Likewise. * libsupc++/eh_throw.cc (__cxa_init_primary_exception): Likewise. * libsupc++/unwind-cxx.h (struct __cxa_exception): Use terminate_handler instead of unexpected_handler. (struct __cxa_dependent_exception): Likewise. (__unexpected): Likewise. * testsuite/18_support/headers/exception/synopsis.cc: Add dg-warning for deprecated warning. * testsuite/18_support/exception_ptr/60612-unexpected.cc: Disable deprecated warnings. * testsuite/18_support/set_unexpected.cc: Likewise. * testsuite/18_support/unexpected_handler.cc: Likewise. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/lambda/lambda-eh2.C: Add dg-warning for new deprecation warnings. * g++.dg/cpp0x/noexcept06.C: Likewise. * g++.dg/cpp0x/noexcept07.C: Likewise. * g++.dg/eh/forced3.C: Likewise. * g++.dg/eh/unexpected1.C: Likewise. * g++.old-deja/g++.eh/spec1.C: Likewise. 
* g++.old-deja/g++.eh/spec2.C: Likewise. * g++.old-deja/g++.eh/spec3.C: Likewise. * g++.old-deja/g++.eh/spec4.C: Likewise. * g++.old-deja/g++.mike/eh33.C: Likewise. * g++.old-deja/g++.mike/eh34.C: Likewise. * g++.old-deja/g++.mike/eh50.C: Likewise. * g++.old-deja/g++.mike/eh51.C: Likewise.	2021-11-04 20:53:29 +00:00
Andreas Krebbel	79fe28d2c4	IBM Z: Define STACK_CHECK_MOVING_SP With -fstack-check the stack probes emitted access memory below the stack pointer. gcc/ChangeLog: * config/s390/s390.h (STACK_CHECK_MOVING_SP): New macro definition.	2021-11-04 19:40:33 +01:00
Jonathan Wakely	b57899f30f	libstdc++: Consolidate duplicate metaprogramming utilities Currently std::variant uses __index_of<T, Types...> to find the first occurrence of a type in a pack, and __exactly_once<T, Types...> to check that there is no other occurrence. We can reuse the __find_uniq_type_in_pack<T, Types...>() function for both tasks, and remove the recursive templates used to implement __index_of and __exactly_once. libstdc++-v3/ChangeLog: * include/bits/utility.h (__find_uniq_type_in_pack): Move definition to here, ... * include/std/tuple (__find_uniq_type_in_pack): ... from here. * include/std/variant (__detail::__variant::__index_of): Remove. (__detail::__variant::__exactly_once): Define using __find_uniq_type_in_pack instead of __index_of. (get<T>, get_if<T>, variant::__index_of): Likewise.	2021-11-04 18:14:50 +00:00
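How one helper can serve both tasks named in the message above can be sketched outside of C++. The Python model below is purely illustrative and is not the libstdc++ implementation: it assumes the helper returns the index of the type when it occurs exactly once and the pack size otherwise, and the function names are hypothetical stand-ins for the templates.

```python
def find_uniq_type_in_pack(t, types):
    """Assumed convention: index of `t` if it occurs exactly once in
    `types`, otherwise len(types) as an out-of-range sentinel."""
    matches = [i for i, u in enumerate(types) if u == t]
    return matches[0] if len(matches) == 1 else len(types)


def exactly_once(t, types):
    # "Occurs exactly once" is the same as "the helper found a valid index".
    return find_uniq_type_in_pack(t, types) < len(types)


def index_of(t, types):
    # The index lookup is a direct call; callers are expected to have
    # checked exactly_once() first.
    return find_uniq_type_in_pack(t, types)


pack = ["int", "char", "long"]
print(index_of("char", pack), exactly_once("char", pack))
print(exactly_once("int", ["int", "int"]))
```

A single linear scan replaces two separately recursive checks, which is the consolidation the commit describes.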
Jonathan Wakely	09aab7e699	libstdc++: Optimize std::tuple_element and std::tuple_size_v This reduces the number of class template instantiations needed for code using tuples, by reusing _Nth_type in tuple_element and specializing tuple_size_v for tuple, pair and array (and const-qualified versions of them). Also define the _Nth_type primary template as a complete type (but with no nested 'type' member). This avoids "invalid use of incomplete type" errors for out-of-range specializations of tuple_element. Those errors would probably be confusing and unhelpful for users. We already have a user-friendly static assert in tuple_element itself. Also ensure that tuple_size_v is available whenever tuple_size is (as proposed by LWG 3387). We already do that for tuple_element_t. libstdc++-v3/ChangeLog: * include/bits/stl_pair.h (tuple_size_v): Define partial specializations for std::pair. * include/bits/utility.h (_Nth_type): Move definition here and define primary template. (tuple_size_v): Move definition here. * include/std/array (tuple_size_v): Define partial specializations for std::array. * include/std/tuple (tuple_size_v): Move primary template to <bits/utility.h>. Define partial specializations for std::tuple. (tuple_element): Change definition to use _Nth_type. * include/std/variant (_Nth_type): Move to <bits/utility.h>. (variant_alternative, variant): Adjust qualification of _Nth_type. * testsuite/20_util/tuple/element_access/get_neg.cc: Prune additional errors from _Nth_type.	2021-11-04 18:14:50 +00:00
Tamar Christina	1b4a63593b	AArch64: Lower intrinsics shift to GIMPLE when possible. This lowers shifts to GIMPLE when the C interpretation of the shift operation matches that of AArch64. In C shifting right by BITSIZE is undefined, but the behavior is defined in AArch64. Additionally, negative left shifts are undefined in C, but the register variant of the instruction (SSHL, USHL) defines them as right shifts. Since we have a right shift by immediate I rewrite those cases into right shifts. So: int64x1_t foo3 (int64x1_t a) { return vshl_s64 (a, vdup_n_s64(-6)); } produces: foo3: sshr d0, d0, 6 ret instead of: foo3: mov x0, -6 fmov d1, x0 sshl d0, d0, d1 ret This behavior isn't specifically mentioned for a left shift by immediate, but I believe that is only the case because we do have a right shift by immediate but not a right shift by register. As such I do the same for left shift by immediate. gcc/ChangeLog: * config/aarch64/aarch64-builtins.c (aarch64_general_gimple_fold_builtin): Add ashl, sshl, ushl, ashr, ashr_simd, lshr, lshr_simd. * config/aarch64/aarch64-simd-builtins.def (lshr): Use USHIFTIMM. * config/aarch64/arm_neon.h (vshr_n_u8, vshr_n_u16, vshr_n_u32, vshrq_n_u8, vshrq_n_u16, vshrq_n_u32, vshrq_n_u64): Fix type hack. gcc/testsuite/ChangeLog: * gcc.target/aarch64/advsimd-intrinsics/vshl-opt-1.c: New test. * gcc.target/aarch64/advsimd-intrinsics/vshl-opt-2.c: New test. * gcc.target/aarch64/advsimd-intrinsics/vshl-opt-3.c: New test. * gcc.target/aarch64/advsimd-intrinsics/vshl-opt-4.c: New test. * gcc.target/aarch64/advsimd-intrinsics/vshl-opt-5.c: New test. * gcc.target/aarch64/advsimd-intrinsics/vshl-opt-6.c: New test. * gcc.target/aarch64/advsimd-intrinsics/vshl-opt-7.c: New test. * gcc.target/aarch64/advsimd-intrinsics/vshl-opt-8.c: New test. * gcc.target/aarch64/signbit-2.c: New test.	2021-11-04 17:36:08 +00:00
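The equivalence that the foo3 example above relies on can be checked with a simplified model. The Python sketch below is an assumption-laden illustration, not the instruction's full definition: it models a single 64-bit signed lane, takes the shift count as an ordinary Python integer, expects `value` to already be in the signed 64-bit range, and ignores the rounding and saturating variants. The names `sshl64` and `sshr64` are hypothetical.

```python
def to_i64(v):
    """Wrap an arbitrary Python int into the signed 64-bit range."""
    v &= (1 << 64) - 1
    return v - (1 << 64) if v >> 63 else v


def sshl64(value, shift):
    """Simplified register-shift model: a non-negative count shifts left,
    a negative count shifts right arithmetically by its magnitude."""
    if shift >= 0:
        return to_i64(value << shift)
    return value >> -shift


def sshr64(value, imm):
    """Arithmetic right shift by an immediate."""
    return value >> imm


# A left shift by -6 and a right shift by 6 agree on every sample.
for v in (-1000, -1, 0, 77, 1 << 40):
    assert sshl64(v, -6) == sshr64(v, 6)
print(sshl64(-1000, -6), sshr64(-1000, 6))
```

Because the two forms agree whenever the count is a known negative constant, the immediate form can be emitted directly, avoiding the extra `mov` and `fmov` needed to materialize the count in a vector register.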
Tamar Christina	d70720c238	middle-end: convert negate + right shift into compare greater. This turns an inversion of the sign bit + arithmetic right shift into a comparison with 0. i.e. void fun1(int32_t *x, int n) { for (int i = 0; i < (n & -16); i++) x[i] = (-x[i]) >> 31; } now generates: .L3: ldr q0, [x0] cmgt v0.4s, v0.4s, #0 str q0, [x0], 16 cmp x0, x1 bne .L3 instead of: .L3: ldr q0, [x0] neg v0.4s, v0.4s sshr v0.4s, v0.4s, 31 str q0, [x0], 16 cmp x0, x1 bne .L3 gcc/ChangeLog: * match.pd: New negate+shift pattern. gcc/testsuite/ChangeLog: * gcc.dg/signbit-2.c: New test. * gcc.dg/signbit-3.c: New test. * gcc.dg/signbit-4.c: New test. * gcc.dg/signbit-5.c: New test. * gcc.dg/signbit-6.c: New test. * gcc.target/aarch64/signbit-1.c: New test.	2021-11-04 17:32:09 +00:00
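The identity behind this pattern can be checked numerically. The Python sketch below emulates 32-bit signed arithmetic and is only an illustration, not the match.pd rule itself; the function names are hypothetical. It compares negate-then-shift against an "all ones if greater than zero" mask, which is what a vector compare-greater produces per lane.

```python
def to_i32(v):
    """Wrap an arbitrary Python int into the signed 32-bit range."""
    v &= 0xFFFFFFFF
    return v - (1 << 32) if v & 0x80000000 else v


def neg_then_sshr31(x):
    """(-x) >> 31 with a 32-bit arithmetic right shift."""
    return to_i32(-x) >> 31


def cmgt_zero(x):
    """Compare-greater-than-zero mask: all ones (-1) if x > 0, else 0."""
    return -1 if x > 0 else 0


# INT32_MIN is left out on purpose: negating it overflows in C, which is
# undefined behavior, so the two forms are not required to agree there.
for x in (-(2**31) + 1, -5, -1, 0, 1, 5, 2**31 - 1):
    assert neg_then_sshr31(x) == cmgt_zero(x)
print(neg_then_sshr31(5), neg_then_sshr31(0), neg_then_sshr31(-5))
```

Negating a positive value sets the sign bit, and the arithmetic shift by 31 smears that bit across the lane, so the result is non-zero exactly when the input was positive.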
Andrew MacLeod	004afb984b	Treat undefined operands as varying in GORI. If the LHS is UNDEFINED simply stop calculating. Treat op1 and op2 as VARYING if they are UNDEFINED. PR tree-optimization/103079 gcc/ * gimple-range-gori.cc (gimple_range_calc_op1): Treat undefined as varying. (gimple_range_calc_op2): Ditto. gcc/testsuite/ * gcc.dg/pr103079.c: New.	2021-11-04 13:15:36 -04:00
Martin Jambor	1ece90ffa9	ipa-sra: Improve debug info for removed parameters (PR 93385) In spring I added code eliminating any statements using parameters removed by IPA passes (to fix PR 93385). That patch fixed issues such as divisions by zero that such code could perform but it only reset all affected debug bind statements, this one updates them with expressions which can allow the debugger to print the removed value - see the added test-case for an example. Even though I originally did not want to create DEBUG_EXPR_DECLs for intermediate values, I ended up doing so, because otherwise the code started creating statements like # DEBUG __aD.198693 => &MEM[(const struct _Alloc_nodeD.171110 *)D#195]._M_tD.184726->_M_implD.171154 which not only is a bit scary but also gimple-fold ICEs on it. Therefore I decided they are probably quite necessary. The patch simply notes each removed SSA name present in a debug statement and then works from it backwards, looking if it can reconstruct the expression it represents (which can fail if a non-degenerate PHI node is in the way). If it can, it populates two hash maps with those expressions so that 1) removed assignments are replaced with a debug bind defining a new intermediate debug_decl_expr and 2) existing debug binds that refer to SSA names that are being removed now refer to corresponding debug_decl_exprs. If a removed parameter is passed to another function, the debugging information still cannot describe its value there - see the xfailed test in the testcase. I sort of know what needs to be done but that needs a little bit more of IPA infrastructure on top of this patch and so I would like to get this patch reviewed first. Bootstrapped and tested on x86_64-linux, i686-linux and (long time ago) on aarch64-linux. Also LTO-bootstrapped and on x86_64-linux. Perhaps it is good to go to trunk? 
Thanks, Martin gcc/ChangeLog: 2021-03-29 Martin Jambor <mjambor@suse.cz> PR ipa/93385 * ipa-param-manipulation.h (class ipa_param_body_adjustments): New members remap_with_debug_expressions, m_dead_ssa_debug_equiv, m_dead_stmt_debug_equiv and prepare_debug_expressions. Added parameter to mark_dead_statements. * ipa-param-manipulation.c: Include tree-phinodes.h and cfgexpand.h. (ipa_param_body_adjustments::mark_dead_statements): New parameter debugstack, push into it all SSA names used in debug statements, produce m_dead_ssa_debug_equiv mapping for the removed param. (replace_with_mapped_expr): New function. (ipa_param_body_adjustments::remap_with_debug_expressions): Likewise. (ipa_param_body_adjustments::prepare_debug_expressions): Likewise. (ipa_param_body_adjustments::common_initialization): Gather and process SSA which will be removed but are in debug statements. Simplify. (ipa_param_body_adjustments::ipa_param_body_adjustments): Initialize new members. * tree-inline.c (remap_gimple_stmt): Create a debug bind when possible when avoiding a copy of an unnecessary statement. Remap removed SSA names in existing debug statements. (tree_function_versioning): Do not create DEBUG_EXPR_DECL for removed parameters if we have already done so. gcc/testsuite/ChangeLog: 2021-03-29 Martin Jambor <mjambor@suse.cz> PR ipa/93385 * gcc.dg/guality/ipa-sra-1.c: New test.	2021-11-04 18:08:29 +01:00
Sandra Loosemore	7237c5b698	Fortran manual: Remove old docs for never-implemented extensions. 2021-11-01 Sandra Loosemore <sandra@codesourcery.com> gcc/fortran/ * gfortran.texi (Projects): Add bullet for helping with incomplete standards compliance. (Proposed Extensions): Delete section.	2021-11-04 09:53:02 -07:00
Sandra Loosemore	b96fdc0fca	Fortran manual: Update miscellaneous references to old standard versions. 2021-11-01 Sandra Loosemore <sandra@codesourcery.com> gcc/fortran/ * intrinsic.texi (Introduction to Intrinsics): Genericize references to standard versions. * invoke.texi (-fall-intrinsics): Likewise. (-fmax-identifier-length=): Likewise.	2021-11-04 09:53:02 -07:00
Sandra Loosemore	a0db59bc5f	Fortran manual: Update section on Interoperability with C 2021-11-01 Sandra Loosemore <sandra@codesourcery.com> gcc/fortran/ * gfortran.texi (Interoperability with C): Copy-editing. Add more index entries. (Intrinsic Types): Likewise. (Derived Types and struct): Likewise. (Interoperable Global Variables): Likewise. (Interoperable Subroutines and Functions): Likewise. (Working with C Pointers): Likewise. (Further Interoperability of Fortran with C): Likewise. Rewrite to reflect that this is now fully supported by gfortran.	2021-11-04 09:53:02 -07:00
Sandra Loosemore	227e010036	Fortran manual: Revise introductory chapter. Fix various bit-rot in the discussion of standards conformance, remove material that is only of historical interest, copy-editing. Also move discussion of preprocessing out of the introductory chapter. 2021-11-01 Sandra Loosemore <sandra@codesourcery.com> gcc/fortran/ * gfortran.texi (About GNU Fortran): Consolidate material formerly in other sections. Copy-editing. (Preprocessing and conditional compilation): Delete, moving most material to invoke.texi. (GNU Fortran and G77): Delete. (Project Status): Delete. (Standards): Update. (Fortran 95 status): Mention conditional compilation here. (Fortran 2003 status): Rewrite to mention the 1 missing feature instead of all the ones implemented. (Fortran 2008 status): Similarly for the 2 missing features. (Fortran 2018 status): Rewrite to reflect completion of TS29113 feature support. * invoke.texi (Preprocessing Options): Move material formerly in introductory chapter here.	2021-11-04 09:53:02 -07:00
Sandra Loosemore	2b1c757d83	Fortran manual: Combine standard conformance docs in one place. Discussion of conformance with various revisions of the Fortran standard was split between two separate parts of the manual. This patch moves it all to the introductory chapter. 2021-11-01 Sandra Loosemore <sandra@codesourcery.com> gcc/fortran/ * gfortran.texi (Standards): Move discussion of specific standard versions here.... (Fortran standards status): ...from here, and delete this node.	2021-11-04 09:53:02 -07:00
Jan Hubicka	d3f7a2fa64	Workaround ICE in gimple_call_static_chain_flags gcc/ChangeLog: 2021-11-04 Jan Hubicka <hubicka@ucw.cz> PR ipa/103058 * gimple.c (gimple_call_static_chain_flags): Handle case when nested function does not bind locally.	2021-11-04 17:10:47 +01:00
Jason Merrill	fae00a0ac0	c++: use range-for more gcc/cp/ChangeLog: * call.c (build_array_conv): Use range-for. (build_complex_conv): Likewise. * constexpr.c (clear_no_implicit_zero) (reduced_constant_expression_p): Likewise. * decl.c (cp_complete_array_type): Likewise. * decl2.c (mark_vtable_entries): Likewise. * pt.c (iterative_hash_template_arg): (invalid_tparm_referent_p, unify) (type_dependent_expression_p): Likewise. * typeck.c (build_ptrmemfunc_access_expr): Likewise.	2021-11-04 11:35:54 -04:00
Jonathan Wright	eb04ccf4bf	aarch64: Pass and return Neon vector-tuple types without a parallel Neon vector-tuple types can be passed in registers on function call and return - there is no need to generate a parallel rtx. This patch adds cases to detect vector-tuple modes and generates an appropriate register rtx. This change greatly improves code generated when passing Neon vector- tuple types between functions; many new test cases are added to defend these improvements. gcc/ChangeLog: 2021-10-07 Jonathan Wright <jonathan.wright@arm.com> * config/aarch64/aarch64.c (aarch64_function_value): Generate a register rtx for Neon vector-tuple modes. (aarch64_layout_arg): Likewise. gcc/testsuite/ChangeLog: * gcc.target/aarch64/vector_structure_intrinsics.c: New code generation tests.	2021-11-04 14:55:44 +00:00
Jonathan Wright	511245325a	gcc/lower_subreg.c: Prevent decomposition if modes are not tieable Preventing decomposition if modes are not tieable is necessary to stop AArch64 partial Neon structure modes being treated as packed in registers. This is a necessary prerequisite for a future AArch64 PCS change to maintain good code generation. gcc/ChangeLog: 2021-10-14 Jonathan Wright <jonathan.wright@arm.com> * lower-subreg.c (simple_move): Prevent decomposition if modes are not tieable.	2021-11-04 14:55:44 +00:00
Jonathan Wright	66f206b853	aarch64: Add machine modes for Neon vector-tuple types Until now, GCC has used large integer machine modes (OI, CI and XI) to model Neon vector-tuple types. This is suboptimal for many reasons, the most notable are: 1) Large integer modes are opaque and modifying one vector in the tuple requires a lot of inefficient set/get gymnastics. The result is a lot of superfluous move instructions. 2) Large integer modes do not map well to types that are tuples of 64-bit vectors - we need additional zero-padding which again results in superfluous move instructions. This patch adds new machine modes that better model the C-level Neon vector-tuple types. The approach is somewhat similar to that already used for SVE vector-tuple types. All of the AArch64 backend patterns and builtins that manipulate Neon vector tuples are updated to use the new machine modes. This has the effect of significantly reducing the amount of boiler-plate code in the arm_neon.h header. While this patch increases the quality of code generated in many instances, there is still room for significant improvement - which will be attempted in subsequent patches. gcc/ChangeLog: 2021-08-09 Jonathan Wright <jonathan.wright@arm.com> Richard Sandiford <richard.sandiford@arm.com> * config/aarch64/aarch64-builtins.c (v2x8qi_UP): Define. (v2x4hi_UP): Likewise. (v2x4hf_UP): Likewise. (v2x4bf_UP): Likewise. (v2x2si_UP): Likewise. (v2x2sf_UP): Likewise. (v2x1di_UP): Likewise. (v2x1df_UP): Likewise. (v2x16qi_UP): Likewise. (v2x8hi_UP): Likewise. (v2x8hf_UP): Likewise. (v2x8bf_UP): Likewise. (v2x4si_UP): Likewise. (v2x4sf_UP): Likewise. (v2x2di_UP): Likewise. (v2x2df_UP): Likewise. (v3x8qi_UP): Likewise. (v3x4hi_UP): Likewise. (v3x4hf_UP): Likewise. (v3x4bf_UP): Likewise. (v3x2si_UP): Likewise. (v3x2sf_UP): Likewise. (v3x1di_UP): Likewise. (v3x1df_UP): Likewise. (v3x16qi_UP): Likewise. (v3x8hi_UP): Likewise. (v3x8hf_UP): Likewise. (v3x8bf_UP): Likewise. (v3x4si_UP): Likewise. 
(v3x4sf_UP): Likewise. (v3x2di_UP): Likewise. (v3x2df_UP): Likewise. (v4x8qi_UP): Likewise. (v4x4hi_UP): Likewise. (v4x4hf_UP): Likewise. (v4x4bf_UP): Likewise. (v4x2si_UP): Likewise. (v4x2sf_UP): Likewise. (v4x1di_UP): Likewise. (v4x1df_UP): Likewise. (v4x16qi_UP): Likewise. (v4x8hi_UP): Likewise. (v4x8hf_UP): Likewise. (v4x8bf_UP): Likewise. (v4x4si_UP): Likewise. (v4x4sf_UP): Likewise. (v4x2di_UP): Likewise. (v4x2df_UP): Likewise. (TYPES_GETREGP): Delete. (TYPES_SETREGP): Likewise. (TYPES_LOADSTRUCT_U): Define. (TYPES_LOADSTRUCT_P): Likewise. (TYPES_LOADSTRUCT_LANE_U): Likewise. (TYPES_LOADSTRUCT_LANE_P): Likewise. (TYPES_STORE1P): Move for consistency. (TYPES_STORESTRUCT_U): Define. (TYPES_STORESTRUCT_P): Likewise. (TYPES_STORESTRUCT_LANE_U): Likewise. (TYPES_STORESTRUCT_LANE_P): Likewise. (aarch64_simd_tuple_types): Define. (aarch64_lookup_simd_builtin_type): Handle tuple type lookup. (aarch64_init_simd_builtin_functions): Update frontend lookup for builtin functions after handling arm_neon.h pragma. (register_tuple_type): Manually set modes of single-integer tuple types. Record tuple types. * config/aarch64/aarch64-modes.def (ADV_SIMD_D_REG_STRUCT_MODES): Define D-register tuple modes. (ADV_SIMD_Q_REG_STRUCT_MODES): Define Q-register tuple modes. (SVE_MODES): Give single-vector modes priority over vector- tuple modes. (VECTOR_MODES_WITH_PREFIX): Set partial-vector mode order to be after all single-vector modes. * config/aarch64/aarch64-simd-builtins.def: Update builtin generator macros to reflect modifications to the backend patterns. * config/aarch64/aarch64-simd.md (aarch64_simd_ld2<mode>): Use vector-tuple mode iterator and rename to... (aarch64_simd_ld2<vstruct_elt>): This. (aarch64_simd_ld2r<mode>): Use vector-tuple mode iterator and rename to... (aarch64_simd_ld2r<vstruct_elt>): This. (aarch64_vec_load_lanesoi_lane<mode>): Use vector-tuple mode iterator and rename to... (aarch64_vec_load_lanes<mode>_lane<vstruct_elt>): This. 
(vec_load_lanesoi<mode>): Use vector-tuple mode iterator and rename to... (vec_load_lanes<mode><vstruct_elt>): This. (aarch64_simd_st2<mode>): Use vector-tuple mode iterator and rename to... (aarch64_simd_st2<vstruct_elt>): This. (aarch64_vec_store_lanesoi_lane<mode>): Use vector-tuple mode iterator and rename to... (aarch64_vec_store_lanes<mode>_lane<vstruct_elt>): This. (vec_store_lanesoi<mode>): Use vector-tuple mode iterator and rename to... (vec_store_lanes<mode><vstruct_elt>): This. (aarch64_simd_ld3<mode>): Use vector-tuple mode iterator and rename to... (aarch64_simd_ld3<vstruct_elt>): This. (aarch64_simd_ld3r<mode>): Use vector-tuple mode iterator and rename to... (aarch64_simd_ld3r<vstruct_elt>): This. (aarch64_vec_load_lanesci_lane<mode>): Use vector-tuple mode iterator and rename to... (vec_load_lanesci<mode>): This. (aarch64_simd_st3<mode>): Use vector-tuple mode iterator and rename to... (aarch64_simd_st3<vstruct_elt>): This. (aarch64_vec_store_lanesci_lane<mode>): Use vector-tuple mode iterator and rename to... (vec_store_lanesci<mode>): This. (aarch64_simd_ld4<mode>): Use vector-tuple mode iterator and rename to... (aarch64_simd_ld4<vstruct_elt>): This. (aarch64_simd_ld4r<mode>): Use vector-tuple mode iterator and rename to... (aarch64_simd_ld4r<vstruct_elt>): This. (aarch64_vec_load_lanesxi_lane<mode>): Use vector-tuple mode iterator and rename to... (vec_load_lanesxi<mode>): This. (aarch64_simd_st4<mode>): Use vector-tuple mode iterator and rename to... (aarch64_simd_st4<vstruct_elt>): This. (aarch64_vec_store_lanesxi_lane<mode>): Use vector-tuple mode iterator and rename to... (vec_store_lanesxi<mode>): This. (mov<mode>): Define for Neon vector-tuple modes. (aarch64_ld1x3<VALLDIF:mode>): Use vector-tuple mode iterator and rename to... (aarch64_ld1x3<vstruct_elt>): This. (aarch64_ld1_x3_<mode>): Use vector-tuple mode iterator and rename to... (aarch64_ld1_x3_<vstruct_elt>): This. 
(aarch64_ld1x4<VALLDIF:mode>): Use vector-tuple mode iterator and rename to... (aarch64_ld1x4<vstruct_elt>): This. (aarch64_ld1_x4_<mode>): Use vector-tuple mode iterator and rename to... (aarch64_ld1_x4_<vstruct_elt>): This. (aarch64_st1x2<VALLDIF:mode>): Use vector-tuple mode iterator and rename to... (aarch64_st1x2<vstruct_elt>): This. (aarch64_st1_x2_<mode>): Use vector-tuple mode iterator and rename to... (aarch64_st1_x2_<vstruct_elt>): This. (aarch64_st1x3<VALLDIF:mode>): Use vector-tuple mode iterator and rename to... (aarch64_st1x3<vstruct_elt>): This. (aarch64_st1_x3_<mode>): Use vector-tuple mode iterator and rename to... (aarch64_st1_x3_<vstruct_elt>): This. (aarch64_st1x4<VALLDIF:mode>): Use vector-tuple mode iterator and rename to... (aarch64_st1x4<vstruct_elt>): This. (aarch64_st1_x4_<mode>): Use vector-tuple mode iterator and rename to... (aarch64_st1_x4_<vstruct_elt>): This. (aarch64_mov<mode>): Define for vector-tuple modes. (aarch64_be_mov<mode>): Likewise. (aarch64_ld<VSTRUCT:nregs>r<VALLDIF:mode>): Use vector-tuple mode iterator and rename to... (aarch64_ld<nregs>r<vstruct_elt>): This. (aarch64_ld2<mode>_dreg): Use vector-tuple mode iterator and rename to... (aarch64_ld2<vstruct_elt>_dreg): This. (aarch64_ld3<mode>_dreg): Use vector-tuple mode iterator and rename to... (aarch64_ld3<vstruct_elt>_dreg): This. (aarch64_ld4<mode>_dreg): Use vector-tuple mode iterator and rename to... (aarch64_ld4<vstruct_elt>_dreg): This. (aarch64_ld<VSTRUCT:nregs><VDC:mode>): Use vector-tuple mode iterator and rename to... (aarch64_ld<nregs><vstruct_elt>): Use vector-tuple mode iterator and rename to... (aarch64_ld<VSTRUCT:nregs><VQ:mode>): Use vector-tuple mode (aarch64_ld1x2<VQ:mode>): Delete. (aarch64_ld1x2<VDC:mode>): Use vector-tuple mode iterator and rename to... (aarch64_ld1x2<vstruct_elt>): This. (aarch64_ld<VSTRUCT:nregs>_lane<VALLDIF:mode>): Use vector- tuple mode iterator and rename to... (aarch64_ld<nregs>_lane<vstruct_elt>): This. 
(aarch64_get_dreg<VSTRUCT:mode><VDC:mode>): Delete. (aarch64_get_qreg<VSTRUCT:mode><VQ:mode>): Likewise. (aarch64_st2<mode>_dreg): Use vector-tuple mode iterator and rename to... (aarch64_st2<vstruct_elt>_dreg): This. (aarch64_st3<mode>_dreg): Use vector-tuple mode iterator and rename to... (aarch64_st3<vstruct_elt>_dreg): This. (aarch64_st4<mode>_dreg): Use vector-tuple mode iterator and rename to... (aarch64_st4<vstruct_elt>_dreg): This. (aarch64_st<VSTRUCT:nregs><VDC:mode>): Use vector-tuple mode iterator and rename to... (aarch64_st<nregs><vstruct_elt>): This. (aarch64_st<VSTRUCT:nregs><VQ:mode>): Use vector-tuple mode iterator and rename to aarch64_st<nregs><vstruct_elt>. (aarch64_st<VSTRUCT:nregs>_lane<VALLDIF:mode>): Use vector- tuple mode iterator and rename to... (aarch64_st<nregs>_lane<vstruct_elt>): This. (aarch64_set_qreg<VSTRUCT:mode><VQ:mode>): Delete. (aarch64_simd_ld1<mode>_x2): Use vector-tuple mode iterator and rename to... (aarch64_simd_ld1<vstruct_elt>_x2): This. * config/aarch64/aarch64.c (aarch64_advsimd_struct_mode_p): Refactor to include new vector-tuple modes. (aarch64_classify_vector_mode): Add cases for new vector- tuple modes. (aarch64_advsimd_partial_struct_mode_p): Define. (aarch64_advsimd_full_struct_mode_p): Likewise. (aarch64_advsimd_vector_array_mode): Likewise. (aarch64_sve_data_mode): Change location in file. (aarch64_array_mode): Handle case of Neon vector-tuple modes. (aarch64_hard_regno_nregs): Handle case of partial Neon vector structures. (aarch64_classify_address): Refactor to include handling of Neon vector-tuple modes. (aarch64_print_operand): Print "d" for "%R" for a partial Neon vector structure. (aarch64_expand_vec_perm_1): Use new vector-tuple mode. (aarch64_modes_tieable_p): Prevent tieing Neon partial struct modes with scalar machines modes larger than 8 bytes. (aarch64_can_change_mode_class): Don't allow changes between partial and full Neon vector-structure modes. 
* config/aarch64/arm_neon.h (vst2_lane_f16): Use updated builtin and remove boiler-plate code for opaque mode. (vst2_lane_f32): Likewise. (vst2_lane_f64): Likewise. (vst2_lane_p8): Likewise. (vst2_lane_p16): Likewise. (vst2_lane_p64): Likewise. (vst2_lane_s8): Likewise. (vst2_lane_s16): Likewise. (vst2_lane_s32): Likewise. (vst2_lane_s64): Likewise. (vst2_lane_u8): Likewise. (vst2_lane_u16): Likewise. (vst2_lane_u32): Likewise. (vst2_lane_u64): Likewise. (vst2q_lane_f16): Likewise. (vst2q_lane_f32): Likewise. (vst2q_lane_f64): Likewise. (vst2q_lane_p8): Likewise. (vst2q_lane_p16): Likewise. (vst2q_lane_p64): Likewise. (vst2q_lane_s8): Likewise. (vst2q_lane_s16): Likewise. (vst2q_lane_s32): Likewise. (vst2q_lane_s64): Likewise. (vst2q_lane_u8): Likewise. (vst2q_lane_u16): Likewise. (vst2q_lane_u32): Likewise. (vst2q_lane_u64): Likewise. (vst3_lane_f16): Likewise. (vst3_lane_f32): Likewise. (vst3_lane_f64): Likewise. (vst3_lane_p8): Likewise. (vst3_lane_p16): Likewise. (vst3_lane_p64): Likewise. (vst3_lane_s8): Likewise. (vst3_lane_s16): Likewise. (vst3_lane_s32): Likewise. (vst3_lane_s64): Likewise. (vst3_lane_u8): Likewise. (vst3_lane_u16): Likewise. (vst3_lane_u32): Likewise. (vst3_lane_u64): Likewise. (vst3q_lane_f16): Likewise. (vst3q_lane_f32): Likewise. (vst3q_lane_f64): Likewise. (vst3q_lane_p8): Likewise. (vst3q_lane_p16): Likewise. (vst3q_lane_p64): Likewise. (vst3q_lane_s8): Likewise. (vst3q_lane_s16): Likewise. (vst3q_lane_s32): Likewise. (vst3q_lane_s64): Likewise. (vst3q_lane_u8): Likewise. (vst3q_lane_u16): Likewise. (vst3q_lane_u32): Likewise. (vst3q_lane_u64): Likewise. (vst4_lane_f16): Likewise. (vst4_lane_f32): Likewise. (vst4_lane_f64): Likewise. (vst4_lane_p8): Likewise. (vst4_lane_p16): Likewise. (vst4_lane_p64): Likewise. (vst4_lane_s8): Likewise. (vst4_lane_s16): Likewise. (vst4_lane_s32): Likewise. (vst4_lane_s64): Likewise. (vst4_lane_u8): Likewise. (vst4_lane_u16): Likewise. (vst4_lane_u32): Likewise. (vst4_lane_u64): Likewise. 
(vst4q_lane_f16): Likewise. (vst4q_lane_f32): Likewise. (vst4q_lane_f64): Likewise. (vst4q_lane_p8): Likewise. (vst4q_lane_p16): Likewise. (vst4q_lane_p64): Likewise. (vst4q_lane_s8): Likewise. (vst4q_lane_s16): Likewise. (vst4q_lane_s32): Likewise. (vst4q_lane_s64): Likewise. (vst4q_lane_u8): Likewise. (vst4q_lane_u16): Likewise. (vst4q_lane_u32): Likewise. (vst4q_lane_u64): Likewise. (vtbl3_s8): Likewise. (vtbl3_u8): Likewise. (vtbl3_p8): Likewise. (vtbl4_s8): Likewise. (vtbl4_u8): Likewise. (vtbl4_p8): Likewise. (vld1_u8_x3): Likewise. (vld1_s8_x3): Likewise. (vld1_u16_x3): Likewise. (vld1_s16_x3): Likewise. (vld1_u32_x3): Likewise. (vld1_s32_x3): Likewise. (vld1_u64_x3): Likewise. (vld1_s64_x3): Likewise. (vld1_f16_x3): Likewise. (vld1_f32_x3): Likewise. (vld1_f64_x3): Likewise. (vld1_p8_x3): Likewise. (vld1_p16_x3): Likewise. (vld1_p64_x3): Likewise. (vld1q_u8_x3): Likewise. (vld1q_s8_x3): Likewise. (vld1q_u16_x3): Likewise. (vld1q_s16_x3): Likewise. (vld1q_u32_x3): Likewise. (vld1q_s32_x3): Likewise. (vld1q_u64_x3): Likewise. (vld1q_s64_x3): Likewise. (vld1q_f16_x3): Likewise. (vld1q_f32_x3): Likewise. (vld1q_f64_x3): Likewise. (vld1q_p8_x3): Likewise. (vld1q_p16_x3): Likewise. (vld1q_p64_x3): Likewise. (vld1_u8_x2): Likewise. (vld1_s8_x2): Likewise. (vld1_u16_x2): Likewise. (vld1_s16_x2): Likewise. (vld1_u32_x2): Likewise. (vld1_s32_x2): Likewise. (vld1_u64_x2): Likewise. (vld1_s64_x2): Likewise. (vld1_f16_x2): Likewise. (vld1_f32_x2): Likewise. (vld1_f64_x2): Likewise. (vld1_p8_x2): Likewise. (vld1_p16_x2): Likewise. (vld1_p64_x2): Likewise. (vld1q_u8_x2): Likewise. (vld1q_s8_x2): Likewise. (vld1q_u16_x2): Likewise. (vld1q_s16_x2): Likewise. (vld1q_u32_x2): Likewise. (vld1q_s32_x2): Likewise. (vld1q_u64_x2): Likewise. (vld1q_s64_x2): Likewise. (vld1q_f16_x2): Likewise. (vld1q_f32_x2): Likewise. (vld1q_f64_x2): Likewise. (vld1q_p8_x2): Likewise. (vld1q_p16_x2): Likewise. (vld1q_p64_x2): Likewise. (vld1_s8_x4): Likewise. (vld1q_s8_x4): Likewise. 
(vld1_s16_x4): Likewise. (vld1q_s16_x4): Likewise. (vld1_s32_x4): Likewise. (vld1q_s32_x4): Likewise. (vld1_u8_x4): Likewise. (vld1q_u8_x4): Likewise. (vld1_u16_x4): Likewise. (vld1q_u16_x4): Likewise. (vld1_u32_x4): Likewise. (vld1q_u32_x4): Likewise. (vld1_f16_x4): Likewise. (vld1q_f16_x4): Likewise. (vld1_f32_x4): Likewise. (vld1q_f32_x4): Likewise. (vld1_p8_x4): Likewise. (vld1q_p8_x4): Likewise. (vld1_p16_x4): Likewise. (vld1q_p16_x4): Likewise. (vld1_s64_x4): Likewise. (vld1_u64_x4): Likewise. (vld1_p64_x4): Likewise. (vld1q_s64_x4): Likewise. (vld1q_u64_x4): Likewise. (vld1q_p64_x4): Likewise. (vld1_f64_x4): Likewise. (vld1q_f64_x4): Likewise. (vld2_s64): Likewise. (vld2_u64): Likewise. (vld2_f64): Likewise. (vld2_s8): Likewise. (vld2_p8): Likewise. (vld2_p64): Likewise. (vld2_s16): Likewise. (vld2_p16): Likewise. (vld2_s32): Likewise. (vld2_u8): Likewise. (vld2_u16): Likewise. (vld2_u32): Likewise. (vld2_f16): Likewise. (vld2_f32): Likewise. (vld2q_s8): Likewise. (vld2q_p8): Likewise. (vld2q_s16): Likewise. (vld2q_p16): Likewise. (vld2q_p64): Likewise. (vld2q_s32): Likewise. (vld2q_s64): Likewise. (vld2q_u8): Likewise. (vld2q_u16): Likewise. (vld2q_u32): Likewise. (vld2q_u64): Likewise. (vld2q_f16): Likewise. (vld2q_f32): Likewise. (vld2q_f64): Likewise. (vld3_s64): Likewise. (vld3_u64): Likewise. (vld3_f64): Likewise. (vld3_s8): Likewise. (vld3_p8): Likewise. (vld3_s16): Likewise. (vld3_p16): Likewise. (vld3_s32): Likewise. (vld3_u8): Likewise. (vld3_u16): Likewise. (vld3_u32): Likewise. (vld3_f16): Likewise. (vld3_f32): Likewise. (vld3_p64): Likewise. (vld3q_s8): Likewise. (vld3q_p8): Likewise. (vld3q_s16): Likewise. (vld3q_p16): Likewise. (vld3q_s32): Likewise. (vld3q_s64): Likewise. (vld3q_u8): Likewise. (vld3q_u16): Likewise. (vld3q_u32): Likewise. (vld3q_u64): Likewise. (vld3q_f16): Likewise. (vld3q_f32): Likewise. (vld3q_f64): Likewise. (vld3q_p64): Likewise. (vld4_s64): Likewise. (vld4_u64): Likewise. (vld4_f64): Likewise. (vld4_s8): Likewise. 
(vld4_p8): Likewise. (vld4_s16): Likewise. (vld4_p16): Likewise. (vld4_s32): Likewise. (vld4_u8): Likewise. (vld4_u16): Likewise. (vld4_u32): Likewise. (vld4_f16): Likewise. (vld4_f32): Likewise. (vld4_p64): Likewise. (vld4q_s8): Likewise. (vld4q_p8): Likewise. (vld4q_s16): Likewise. (vld4q_p16): Likewise. (vld4q_s32): Likewise. (vld4q_s64): Likewise. (vld4q_u8): Likewise. (vld4q_u16): Likewise. (vld4q_u32): Likewise. (vld4q_u64): Likewise. (vld4q_f16): Likewise. (vld4q_f32): Likewise. (vld4q_f64): Likewise. (vld4q_p64): Likewise. (vld2_dup_s8): Likewise. (vld2_dup_s16): Likewise. (vld2_dup_s32): Likewise. (vld2_dup_f16): Likewise. (vld2_dup_f32): Likewise. (vld2_dup_f64): Likewise. (vld2_dup_u8): Likewise. (vld2_dup_u16): Likewise. (vld2_dup_u32): Likewise. (vld2_dup_p8): Likewise. (vld2_dup_p16): Likewise. (vld2_dup_p64): Likewise. (vld2_dup_s64): Likewise. (vld2_dup_u64): Likewise. (vld2q_dup_s8): Likewise. (vld2q_dup_p8): Likewise. (vld2q_dup_s16): Likewise. (vld2q_dup_p16): Likewise. (vld2q_dup_s32): Likewise. (vld2q_dup_s64): Likewise. (vld2q_dup_u8): Likewise. (vld2q_dup_u16): Likewise. (vld2q_dup_u32): Likewise. (vld2q_dup_u64): Likewise. (vld2q_dup_f16): Likewise. (vld2q_dup_f32): Likewise. (vld2q_dup_f64): Likewise. (vld2q_dup_p64): Likewise. (vld3_dup_s64): Likewise. (vld3_dup_u64): Likewise. (vld3_dup_f64): Likewise. (vld3_dup_s8): Likewise. (vld3_dup_p8): Likewise. (vld3_dup_s16): Likewise. (vld3_dup_p16): Likewise. (vld3_dup_s32): Likewise. (vld3_dup_u8): Likewise. (vld3_dup_u16): Likewise. (vld3_dup_u32): Likewise. (vld3_dup_f16): Likewise. (vld3_dup_f32): Likewise. (vld3_dup_p64): Likewise. (vld3q_dup_s8): Likewise. (vld3q_dup_p8): Likewise. (vld3q_dup_s16): Likewise. (vld3q_dup_p16): Likewise. (vld3q_dup_s32): Likewise. (vld3q_dup_s64): Likewise. (vld3q_dup_u8): Likewise. (vld3q_dup_u16): Likewise. (vld3q_dup_u32): Likewise. (vld3q_dup_u64): Likewise. (vld3q_dup_f16): Likewise. (vld3q_dup_f32): Likewise. (vld3q_dup_f64): Likewise. 
(vld3q_dup_p64): Likewise. (vld4_dup_s64): Likewise. (vld4_dup_u64): Likewise. (vld4_dup_f64): Likewise. (vld4_dup_s8): Likewise. (vld4_dup_p8): Likewise. (vld4_dup_s16): Likewise. (vld4_dup_p16): Likewise. (vld4_dup_s32): Likewise. (vld4_dup_u8): Likewise. (vld4_dup_u16): Likewise. (vld4_dup_u32): Likewise. (vld4_dup_f16): Likewise. (vld4_dup_f32): Likewise. (vld4_dup_p64): Likewise. (vld4q_dup_s8): Likewise. (vld4q_dup_p8): Likewise. (vld4q_dup_s16): Likewise. (vld4q_dup_p16): Likewise. (vld4q_dup_s32): Likewise. (vld4q_dup_s64): Likewise. (vld4q_dup_u8): Likewise. (vld4q_dup_u16): Likewise. (vld4q_dup_u32): Likewise. (vld4q_dup_u64): Likewise. (vld4q_dup_f16): Likewise. (vld4q_dup_f32): Likewise. (vld4q_dup_f64): Likewise. (vld4q_dup_p64): Likewise. (vld2_lane_u8): Likewise. (vld2_lane_u16): Likewise. (vld2_lane_u32): Likewise. (vld2_lane_u64): Likewise. (vld2_lane_s8): Likewise. (vld2_lane_s16): Likewise. (vld2_lane_s32): Likewise. (vld2_lane_s64): Likewise. (vld2_lane_f16): Likewise. (vld2_lane_f32): Likewise. (vld2_lane_f64): Likewise. (vld2_lane_p8): Likewise. (vld2_lane_p16): Likewise. (vld2_lane_p64): Likewise. (vld2q_lane_u8): Likewise. (vld2q_lane_u16): Likewise. (vld2q_lane_u32): Likewise. (vld2q_lane_u64): Likewise. (vld2q_lane_s8): Likewise. (vld2q_lane_s16): Likewise. (vld2q_lane_s32): Likewise. (vld2q_lane_s64): Likewise. (vld2q_lane_f16): Likewise. (vld2q_lane_f32): Likewise. (vld2q_lane_f64): Likewise. (vld2q_lane_p8): Likewise. (vld2q_lane_p16): Likewise. (vld2q_lane_p64): Likewise. (vld3_lane_u8): Likewise. (vld3_lane_u16): Likewise. (vld3_lane_u32): Likewise. (vld3_lane_u64): Likewise. (vld3_lane_s8): Likewise. (vld3_lane_s16): Likewise. (vld3_lane_s32): Likewise. (vld3_lane_s64): Likewise. (vld3_lane_f16): Likewise. (vld3_lane_f32): Likewise. (vld3_lane_f64): Likewise. (vld3_lane_p8): Likewise. (vld3_lane_p16): Likewise. (vld3_lane_p64): Likewise. (vld3q_lane_u8): Likewise. (vld3q_lane_u16): Likewise. (vld3q_lane_u32): Likewise. 
(vld3q_lane_u64): Likewise. (vld3q_lane_s8): Likewise. (vld3q_lane_s16): Likewise. (vld3q_lane_s32): Likewise. (vld3q_lane_s64): Likewise. (vld3q_lane_f16): Likewise. (vld3q_lane_f32): Likewise. (vld3q_lane_f64): Likewise. (vld3q_lane_p8): Likewise. (vld3q_lane_p16): Likewise. (vld3q_lane_p64): Likewise. (vld4_lane_u8): Likewise. (vld4_lane_u16): Likewise. (vld4_lane_u32): Likewise. (vld4_lane_u64): Likewise. (vld4_lane_s8): Likewise. (vld4_lane_s16): Likewise. (vld4_lane_s32): Likewise. (vld4_lane_s64): Likewise. (vld4_lane_f16): Likewise. (vld4_lane_f32): Likewise. (vld4_lane_f64): Likewise. (vld4_lane_p8): Likewise. (vld4_lane_p16): Likewise. (vld4_lane_p64): Likewise. (vld4q_lane_u8): Likewise. (vld4q_lane_u16): Likewise. (vld4q_lane_u32): Likewise. (vld4q_lane_u64): Likewise. (vld4q_lane_s8): Likewise. (vld4q_lane_s16): Likewise. (vld4q_lane_s32): Likewise. (vld4q_lane_s64): Likewise. (vld4q_lane_f16): Likewise. (vld4q_lane_f32): Likewise. (vld4q_lane_f64): Likewise. (vld4q_lane_p8): Likewise. (vld4q_lane_p16): Likewise. (vld4q_lane_p64): Likewise. (vqtbl2_s8): Likewise. (vqtbl2_u8): Likewise. (vqtbl2_p8): Likewise. (vqtbl2q_s8): Likewise. (vqtbl2q_u8): Likewise. (vqtbl2q_p8): Likewise. (vqtbl3_s8): Likewise. (vqtbl3_u8): Likewise. (vqtbl3_p8): Likewise. (vqtbl3q_s8): Likewise. (vqtbl3q_u8): Likewise. (vqtbl3q_p8): Likewise. (vqtbl4_s8): Likewise. (vqtbl4_u8): Likewise. (vqtbl4_p8): Likewise. (vqtbl4q_s8): Likewise. (vqtbl4q_u8): Likewise. (vqtbl4q_p8): Likewise. (vqtbx2_s8): Likewise. (vqtbx2_u8): Likewise. (vqtbx2_p8): Likewise. (vqtbx2q_s8): Likewise. (vqtbx2q_u8): Likewise. (vqtbx2q_p8): Likewise. (vqtbx3_s8): Likewise. (vqtbx3_u8): Likewise. (vqtbx3_p8): Likewise. (vqtbx3q_s8): Likewise. (vqtbx3q_u8): Likewise. (vqtbx3q_p8): Likewise. (vqtbx4_s8): Likewise. (vqtbx4_u8): Likewise. (vqtbx4_p8): Likewise. (vqtbx4q_s8): Likewise. (vqtbx4q_u8): Likewise. (vqtbx4q_p8): Likewise. (vst1_s64_x2): Likewise. (vst1_u64_x2): Likewise. (vst1_f64_x2): Likewise. 
(vst1_s8_x2): Likewise. (vst1_p8_x2): Likewise. (vst1_s16_x2): Likewise. (vst1_p16_x2): Likewise. (vst1_s32_x2): Likewise. (vst1_u8_x2): Likewise. (vst1_u16_x2): Likewise. (vst1_u32_x2): Likewise. (vst1_f16_x2): Likewise. (vst1_f32_x2): Likewise. (vst1_p64_x2): Likewise. (vst1q_s8_x2): Likewise. (vst1q_p8_x2): Likewise. (vst1q_s16_x2): Likewise. (vst1q_p16_x2): Likewise. (vst1q_s32_x2): Likewise. (vst1q_s64_x2): Likewise. (vst1q_u8_x2): Likewise. (vst1q_u16_x2): Likewise. (vst1q_u32_x2): Likewise. (vst1q_u64_x2): Likewise. (vst1q_f16_x2): Likewise. (vst1q_f32_x2): Likewise. (vst1q_f64_x2): Likewise. (vst1q_p64_x2): Likewise. (vst1_s64_x3): Likewise. (vst1_u64_x3): Likewise. (vst1_f64_x3): Likewise. (vst1_s8_x3): Likewise. (vst1_p8_x3): Likewise. (vst1_s16_x3): Likewise. (vst1_p16_x3): Likewise. (vst1_s32_x3): Likewise. (vst1_u8_x3): Likewise. (vst1_u16_x3): Likewise. (vst1_u32_x3): Likewise. (vst1_f16_x3): Likewise. (vst1_f32_x3): Likewise. (vst1_p64_x3): Likewise. (vst1q_s8_x3): Likewise. (vst1q_p8_x3): Likewise. (vst1q_s16_x3): Likewise. (vst1q_p16_x3): Likewise. (vst1q_s32_x3): Likewise. (vst1q_s64_x3): Likewise. (vst1q_u8_x3): Likewise. (vst1q_u16_x3): Likewise. (vst1q_u32_x3): Likewise. (vst1q_u64_x3): Likewise. (vst1q_f16_x3): Likewise. (vst1q_f32_x3): Likewise. (vst1q_f64_x3): Likewise. (vst1q_p64_x3): Likewise. (vst1_s8_x4): Likewise. (vst1q_s8_x4): Likewise. (vst1_s16_x4): Likewise. (vst1q_s16_x4): Likewise. (vst1_s32_x4): Likewise. (vst1q_s32_x4): Likewise. (vst1_u8_x4): Likewise. (vst1q_u8_x4): Likewise. (vst1_u16_x4): Likewise. (vst1q_u16_x4): Likewise. (vst1_u32_x4): Likewise. (vst1q_u32_x4): Likewise. (vst1_f16_x4): Likewise. (vst1q_f16_x4): Likewise. (vst1_f32_x4): Likewise. (vst1q_f32_x4): Likewise. (vst1_p8_x4): Likewise. (vst1q_p8_x4): Likewise. (vst1_p16_x4): Likewise. (vst1q_p16_x4): Likewise. (vst1_s64_x4): Likewise. (vst1_u64_x4): Likewise. (vst1_p64_x4): Likewise. (vst1q_s64_x4): Likewise. (vst1q_u64_x4): Likewise. (vst1q_p64_x4): Likewise. 
(vst1_f64_x4): Likewise. (vst1q_f64_x4): Likewise. (vst2_s64): Likewise. (vst2_u64): Likewise. (vst2_f64): Likewise. (vst2_s8): Likewise. (vst2_p8): Likewise. (vst2_s16): Likewise. (vst2_p16): Likewise. (vst2_s32): Likewise. (vst2_u8): Likewise. (vst2_u16): Likewise. (vst2_u32): Likewise. (vst2_f16): Likewise. (vst2_f32): Likewise. (vst2_p64): Likewise. (vst2q_s8): Likewise. (vst2q_p8): Likewise. (vst2q_s16): Likewise. (vst2q_p16): Likewise. (vst2q_s32): Likewise. (vst2q_s64): Likewise. (vst2q_u8): Likewise. (vst2q_u16): Likewise. (vst2q_u32): Likewise. (vst2q_u64): Likewise. (vst2q_f16): Likewise. (vst2q_f32): Likewise. (vst2q_f64): Likewise. (vst2q_p64): Likewise. (vst3_s64): Likewise. (vst3_u64): Likewise. (vst3_f64): Likewise. (vst3_s8): Likewise. (vst3_p8): Likewise. (vst3_s16): Likewise. (vst3_p16): Likewise. (vst3_s32): Likewise. (vst3_u8): Likewise. (vst3_u16): Likewise. (vst3_u32): Likewise. (vst3_f16): Likewise. (vst3_f32): Likewise. (vst3_p64): Likewise. (vst3q_s8): Likewise. (vst3q_p8): Likewise. (vst3q_s16): Likewise. (vst3q_p16): Likewise. (vst3q_s32): Likewise. (vst3q_s64): Likewise. (vst3q_u8): Likewise. (vst3q_u16): Likewise. (vst3q_u32): Likewise. (vst3q_u64): Likewise. (vst3q_f16): Likewise. (vst3q_f32): Likewise. (vst3q_f64): Likewise. (vst3q_p64): Likewise. (vst4_s64): Likewise. (vst4_u64): Likewise. (vst4_f64): Likewise. (vst4_s8): Likewise. (vst4_p8): Likewise. (vst4_s16): Likewise. (vst4_p16): Likewise. (vst4_s32): Likewise. (vst4_u8): Likewise. (vst4_u16): Likewise. (vst4_u32): Likewise. (vst4_f16): Likewise. (vst4_f32): Likewise. (vst4_p64): Likewise. (vst4q_s8): Likewise. (vst4q_p8): Likewise. (vst4q_s16): Likewise. (vst4q_p16): Likewise. (vst4q_s32): Likewise. (vst4q_s64): Likewise. (vst4q_u8): Likewise. (vst4q_u16): Likewise. (vst4q_u32): Likewise. (vst4q_u64): Likewise. (vst4q_f16): Likewise. (vst4q_f32): Likewise. (vst4q_f64): Likewise. (vst4q_p64): Likewise. (vtbx4_s8): Likewise. (vtbx4_u8): Likewise. (vtbx4_p8): Likewise. 
(vld1_bf16_x2): Likewise. (vld1q_bf16_x2): Likewise. (vld1_bf16_x3): Likewise. (vld1q_bf16_x3): Likewise. (vld1_bf16_x4): Likewise. (vld1q_bf16_x4): Likewise. (vld2_bf16): Likewise. (vld2q_bf16): Likewise. (vld2_dup_bf16): Likewise. (vld2q_dup_bf16): Likewise. (vld3_bf16): Likewise. (vld3q_bf16): Likewise. (vld3_dup_bf16): Likewise. (vld3q_dup_bf16): Likewise. (vld4_bf16): Likewise. (vld4q_bf16): Likewise. (vld4_dup_bf16): Likewise. (vld4q_dup_bf16): Likewise. (vst1_bf16_x2): Likewise. (vst1q_bf16_x2): Likewise. (vst1_bf16_x3): Likewise. (vst1q_bf16_x3): Likewise. (vst1_bf16_x4): Likewise. (vst1q_bf16_x4): Likewise. (vst2_bf16): Likewise. (vst2q_bf16): Likewise. (vst3_bf16): Likewise. (vst3q_bf16): Likewise. (vst4_bf16): Likewise. (vst4q_bf16): Likewise. (vld2_lane_bf16): Likewise. (vld2q_lane_bf16): Likewise. (vld3_lane_bf16): Likewise. (vld3q_lane_bf16): Likewise. (vld4_lane_bf16): Likewise. (vld4q_lane_bf16): Likewise. (vst2_lane_bf16): Likewise. (vst2q_lane_bf16): Likewise. (vst3_lane_bf16): Likewise. (vst3q_lane_bf16): Likewise. (vst4_lane_bf16): Likewise. (vst4q_lane_bf16): Likewise. * config/aarch64/geniterators.sh: Modify iterator regex to match new vector-tuple modes. * config/aarch64/iterators.md (insn_count): Extend mode attribute with vector-tuple type information. (nregs): Likewise. (Vendreg): Likewise. (Vetype): Likewise. (Vtype): Likewise. (VSTRUCT_2D): New mode iterator. (VSTRUCT_2DNX): Likewise. (VSTRUCT_2DX): Likewise. (VSTRUCT_2Q): Likewise. (VSTRUCT_2QD): Likewise. (VSTRUCT_3D): Likewise. (VSTRUCT_3DNX): Likewise. (VSTRUCT_3DX): Likewise. (VSTRUCT_3Q): Likewise. (VSTRUCT_3QD): Likewise. (VSTRUCT_4D): Likewise. (VSTRUCT_4DNX): Likewise. (VSTRUCT_4DX): Likewise. (VSTRUCT_4Q): Likewise. (VSTRUCT_4QD): Likewise. (VSTRUCT_D): Likewise. (VSTRUCT_Q): Likewise. (VSTRUCT_QD): Likewise. (VSTRUCT_ELT): New mode attribute. (vstruct_elt): Likewise. * genmodes.c (VECTOR_MODE): Add default prefix and order parameters. (VECTOR_MODE_WITH_PREFIX): Define. 
(make_vector_mode): Add mode prefix and order parameters. gcc/testsuite/ChangeLog: * gcc.target/aarch64/advsimd-intrinsics/bf16_vldN_lane_2.c: Relax incorrect register number requirement. * gcc.target/aarch64/sve/pcs/struct_3_256.c: Accept equivalent codegen with fmov.	2021-11-04 14:54:36 +00:00
Jonathan Wright	4e5929e457	gcc/expmed.c: Ensure vector modes are tieable before extraction Extracting a bitfield from a vector can be achieved by casting the vector to a new type whose elements are the same size as the desired bitfield, before generating a subreg. However, this is only an optimization if the original vector can be accessed in the new machine mode without first being copied - a condition denoted by the TARGET_MODES_TIEABLE_P hook. This patch adds a check to make sure that the vector modes are tieable before attempting to generate a subreg. This is a necessary prerequisite for a subsequent patch that will introduce new machine modes for Arm Neon vector-tuple types. gcc/ChangeLog: 2021-10-11 Jonathan Wright <jonathan.wright@arm.com> * expmed.c (extract_bit_field_1): Ensure modes are tieable.	2021-11-04 14:51:09 +00:00
Jonathan Wright	2fc2026061	gcc/expr.c: Remove historic workaround for broken SIMD subreg A long time ago, using a parallel to take a subreg of a SIMD register was broken. This temporary fix[1] (from 2003) spilled these registers to memory and reloaded the appropriate part to obtain the subreg. The fix initially existed for the benefit of the PowerPC E500 - a platform for which GCC removed support a number of years ago. Regardless, a proper mechanism for taking a subreg of a SIMD register exists now anyway. This patch removes the workaround thus preventing SIMD registers being dumped to memory unnecessarily - which sometimes can't be fixed by later passes. [1] https://gcc.gnu.org/pipermail/gcc-patches/2003-April/102099.html gcc/ChangeLog: 2021-10-11 Jonathan Wright <jonathan.wright@arm.com> * expr.c (emit_group_load_1): Remove historic workaround.	2021-11-04 14:50:55 +00:00
Jonathan Wright	8197ab94b4	aarch64: Move Neon vector-tuple type declaration into the compiler Declare the Neon vector-tuple types inside the compiler instead of in the arm_neon.h header. This is a necessary first step before adding corresponding machine modes to the AArch64 backend. The vector-tuple types are implemented using a #pragma. This means initialization of builtin functions that have vector-tuple types as arguments or return values has to be delayed until the #pragma is handled. gcc/ChangeLog: 2021-09-10 Jonathan Wright <jonathan.wright@arm.com> * config/aarch64/aarch64-builtins.c (aarch64_init_simd_builtins): Factor out main loop to... (aarch64_init_simd_builtin_functions): This new function. (register_tuple_type): Define. (aarch64_scalar_builtin_type_p): Define. (handle_arm_neon_h): Define. * config/aarch64/aarch64-c.c (aarch64_pragma_aarch64): Handle pragma for arm_neon.h. * config/aarch64/aarch64-protos.h (aarch64_advsimd_struct_mode_p): Declare. (handle_arm_neon_h): Likewise. * config/aarch64/aarch64.c (aarch64_advsimd_struct_mode_p): Remove static modifier. * config/aarch64/arm_neon.h (target): Remove Neon vector structure type definitions.	2021-11-04 14:50:40 +00:00
H.J. Lu	fbe58ba97a	x86: Check leal/addl gcc.target/i386/amxtile-3.c for x32 Check leal and addl for x32 to fix: FAIL: gcc.target/i386/amxtile-3.c scan-assembler addq[ \\t]+\\$12 FAIL: gcc.target/i386/amxtile-3.c scan-assembler leaq[ \\t]+4 FAIL: gcc.target/i386/amxtile-3.c scan-assembler leaq[ \\t]+8 * gcc.target/i386/amxtile-3.c: Check leal/addl for x32.	2021-11-04 07:41:52 -07:00
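The failures quoted in the commit above come from `scan-assembler` patterns that only name the 64-bit mnemonics. A minimal sketch of why they miss on x32 follows; the assembly snippets are invented for illustration (not real compiler output), and `to_x32` and `all_match` are hypothetical helpers, not part of the GCC testsuite.

```python
import re

# The three patterns from the FAIL lines, with the Tcl/DejaGnu escaping removed.
LP64_PATTERNS = [r"addq[ \t]+\$12", r"leaq[ \t]+4", r"leaq[ \t]+8"]


def to_x32(pattern):
    """Hypothetical helper: swap a q-suffixed mnemonic for its l-suffixed form."""
    return re.sub(r"^(add|lea)q", r"\1l", pattern)


X32_PATTERNS = [to_x32(p) for p in LP64_PATTERNS]

# Invented assembly snippets, assumed only for this illustration.
lp64_asm = "\tleaq\t4(%rdi), %rax\n\taddq\t$12, %rdi\n\tleaq\t8(%rdi), %rdx\n"
x32_asm = "\tleal\t4(%rdi), %eax\n\taddl\t$12, %edi\n\tleal\t8(%rdi), %edx\n"


def all_match(patterns, asm):
    """Return True when every pattern is found somewhere in the assembly text."""
    return all(re.search(p, asm) is not None for p in patterns)


print(all_match(LP64_PATTERNS, lp64_asm))  # 64-bit patterns against 64-bit output
print(all_match(LP64_PATTERNS, x32_asm))   # the same patterns against x32 output
print(all_match(X32_PATTERNS, x32_asm))    # l-suffixed patterns against x32 output
```

How the test file itself selects between the two sets of patterns (for example with a DejaGnu target selector) is not shown in the commit message, so it is not reproduced here.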
Aldy Hernandez	6a9678f0b3	path solver: Prefer range_of_expr instead of range_on_edge. The range_of_expr method provides better caching than range_on_edge. If we have a statement, we can just use it and avoid the range_on_edge dance. Plus we can use all the range_of_expr fanciness. Tested on x86-64 and ppc64le Linux with the usual regstrap. I also verified that the before and after number of threads was the same or greater in a suite of .ii files from a bootstrap. gcc/ChangeLog: PR tree-optimization/102943 * gimple-range-path.cc (path_range_query::range_on_path_entry): Prefer range_of_expr unless there are no statements in the BB.	2021-11-04 15:39:03 +01:00
Aldy Hernandez	e441162269	Avoid repeating calculations in threader. We already attempt to resolve the current path on entry to find_paths_to_names(), so there's no need to do so again for each exported range since nothing has changed. Removing this redundant calculation avoids 22% of calls into the path solver. Tested on x86-64 and ppc64le Linux with the usual regstrap. I also verified that the before and after number of threads was the same in a suite of .ii files from a bootstrap. gcc/ChangeLog: PR tree-optimization/102943 * tree-ssa-threadbackward.c (back_threader::find_paths_to_names): Avoid duplicate calculation of paths.	2021-11-04 15:37:35 +01:00
Aldy Hernandez	5ea1ce43b6	path solver: Only compute relations for imports. We are currently calculating implicit PHI relations for all PHI arguments. This creates unnecessary work, as we only care about SSA names in the import bitmap. Similarly for inter-path relationals. We can avoid things not in the bitmap. Tested on x86-64 and ppc64le Linux with the usual regstrap. I also verified that the before and after number of threads was the same in a suite of .ii files from a bootstrap. gcc/ChangeLog: PR tree-optimization/102943 * gimple-range-path.cc (path_range_query::compute_phi_relations): Only compute relations for SSA names in the import list. (path_range_query::compute_outgoing_relations): Same. * gimple-range-path.h (path_range_query::import_p): New.	2021-11-04 15:37:35 +01:00
H.J. Lu	333efaea63	libffi: Add --enable-cet to configure When --enable-cet is used to configure GCC, enable Intel CET in libffi. * Makefile.am (AM_CFLAGS): Add $(CET_FLAGS). (AM_CCASFLAGS): Likewise. * configure.ac (CET_FLAGS): Add GCC_CET_FLAGS and AC_SUBST. * Makefile.in: Regenerate. * aclocal.m4: Likewise. * configure: Likewise. * include/Makefile.in: Likewise. * man/Makefile.in: Likewise. * testsuite/Makefile.in: Likewise.	2021-11-04 07:19:22 -07:00
Martin Liska	af1bfcc04c	Add -v option for git_check_commit.py. Doing so, one can see: $ git gcc-verify a50914d2111c72d2cd5cb8cf474133f4f85a25f6 -v Checking a50914d2111c72d2cd5cb8cf474133f4f85a25f6: FAILED ERR: unchanged file mentioned in a ChangeLog: "gcc/common.opt" ERR: unchanged file mentioned in a ChangeLog (did you mean "gcc/testsuite/g++.dg/pr102955.C"?): "gcc/testsuite/gcc.dg/pr102955.c" - gcc/testsuite/gcc.dg/pr102955.c ? ^^ ^ + gcc/testsuite/g++.dg/pr102955.C ? ^^ ^ contrib/ChangeLog: * gcc-changelog/git_check_commit.py: Add -v option. * gcc-changelog/git_commit.py: Print verbose diff for wrong filename.	2021-11-04 15:01:52 +01:00
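The "did you mean" suggestion and the `-`/`?`/`+` lines in the verbose output above have the shape of Python's `difflib` output. The sketch below shows how a message of that kind can be produced with the standard library. It is an illustration under assumptions, not the code in `git_commit.py`: the `suggest` function and the `changed` list are hypothetical.

```python
import difflib


def suggest(mentioned, changed_files):
    """Return the closest changed file (or None) and an ndiff against it."""
    candidates = difflib.get_close_matches(mentioned, changed_files, n=1)
    if not candidates:
        return None, []
    # ndiff marks the character-level differences with "?" guide lines.
    diff = [line.rstrip("\n")
            for line in difflib.ndiff([mentioned], [candidates[0]])]
    return candidates[0], diff


# Hypothetical list of files actually touched by the commit being checked.
changed = ["gcc/testsuite/g++.dg/pr102955.C", "gcc/ChangeLog"]

best, diff = suggest("gcc/testsuite/gcc.dg/pr102955.c", changed)
print(f'did you mean "{best}"?')
print("\n".join(diff))
```

`get_close_matches` uses a default similarity cutoff of 0.6, so a ChangeLog path with no similar counterpart among the changed files yields no suggestion at all.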
Tamar Christina	5914a7b5c6	testsuite: Add more guards to complex tests This test hopefully fixes all the remaining target specific test issues by 1: Unrolling all add testcases by 16 using pragma GCC unroll 2. On armhf use Adv.SIMD instead of MVE to test. MVE's autovec is too incomplete to be a general test target. 3. Add appropriate vect_<type> and float<size> guards on testcases. gcc/testsuite/ChangeLog: PR testsuite/103042 * gcc.dg/vect/complex/bb-slp-complex-add-pattern-int.c: Update guards. * gcc.dg/vect/complex/bb-slp-complex-add-pattern-long.c: Likewise. * gcc.dg/vect/complex/bb-slp-complex-add-pattern-short.c: Likewise. * gcc.dg/vect/complex/bb-slp-complex-add-pattern-unsigned-int.c: Likewise. * gcc.dg/vect/complex/bb-slp-complex-add-pattern-unsigned-long.c: Likewise. * gcc.dg/vect/complex/bb-slp-complex-add-pattern-unsigned-short.c: Likewise. * gcc.dg/vect/complex/complex-add-pattern-template.c: Likewise. * gcc.dg/vect/complex/complex-add-template.c: Likewise. * gcc.dg/vect/complex/complex-operations-run.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-add-double.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-add-float.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-add-half-float.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-add-pattern-double.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-add-pattern-float.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-add-pattern-half-float.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-double.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-float.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-half-float.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-double.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-float.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-half-float.c: Likewise. 
* gcc.dg/vect/complex/fast-math-bb-slp-complex-mul-double.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-mul-float.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-mul-half-float.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-add-double.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-add-float.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-add-half-float.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-add-pattern-double.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-add-pattern-float.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-add-pattern-half-float.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-mla-double.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-mla-float.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-mla-half-float.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-mls-double.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-mls-float.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-mls-half-float.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-mul-double.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-mul-float.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-mul-half-float.c: Likewise. * gcc.dg/vect/complex/vect-complex-add-pattern-byte.c: Likewise. * gcc.dg/vect/complex/vect-complex-add-pattern-int.c: Likewise. * gcc.dg/vect/complex/vect-complex-add-pattern-long.c: Likewise. * gcc.dg/vect/complex/vect-complex-add-pattern-short.c: Likewise. * gcc.dg/vect/complex/vect-complex-add-pattern-unsigned-byte.c: Likewise. * gcc.dg/vect/complex/vect-complex-add-pattern-unsigned-int.c: Likewise. * gcc.dg/vect/complex/vect-complex-add-pattern-unsigned-long.c: Likewise. * gcc.dg/vect/complex/vect-complex-add-pattern-unsigned-short.c: Likewise.	2021-11-04 13:43:36 +00:00
David Malcolm	347682ea46	analyzer: fix ICE in sm_state_map::dump when dumping trees gcc/analyzer/ChangeLog: * program-state.cc (sm_state_map::dump): Use default_tree_printer as format decoder. Signed-off-by: David Malcolm <dmalcolm@redhat.com>	2021-11-04 09:32:05 -04:00
Richard Biener	d136035016	rtl-optimization/103075 - avoid ICEing on unfolded int-to-float converts The following avoids asserting in exact_int_to_float_conversion_p that the argument is not constant which it in fact can be with -frounding-math and inexact int-to-float conversions. Say so. 2021-11-04 Richard Biener <rguenther@suse.de> PR rtl-optimization/103075 * simplify-rtx.c (exact_int_to_float_conversion_p): Return false for a VOIDmode operand. * gcc.dg/pr103075.c: New testcase.	2021-11-04 13:33:19 +01:00
Richard Sandiford	d43fc1df73	aarch64: Move more code into aarch64_vector_costs This patch moves more code into aarch64_vector_costs and reuses some of the information that is now available in the base class. I'm planning to significantly rework this code, with more hooks into the vectoriser, but this seemed worth doing as a first step. gcc/ * config/aarch64/aarch64.c (aarch64_vector_costs): Make member variables private and add "m_" to their names. Remove is_loop. (aarch64_record_potential_advsimd_unrolling): Replace with... (aarch64_vector_costs::record_potential_advsimd_unrolling): ...this. (aarch64_analyze_loop_vinfo): Replace with... (aarch64_vector_costs::analyze_loop_vinfo): ...this. Move initialization of (m_)vec_flags to add_stmt_cost. (aarch64_analyze_bb_vinfo): Delete. (aarch64_count_ops): Replace with... (aarch64_vector_costs::count_ops): ...this. (aarch64_vector_costs::add_stmt_cost): Set m_vec_flags, using m_costing_for_scalar to test whether we're costing scalar or vector code. (aarch64_adjust_body_cost_sve): Replace with... (aarch64_vector_costs::adjust_body_cost_sve): ...this. (aarch64_adjust_body_cost): Replace with... (aarch64_vector_costs::adjust_body_cost): ...this. (aarch64_vector_costs::finish_cost): Use m_vinfo instead of is_loop.	2021-11-04 12:31:17 +00:00
Richard Sandiford	6239dd0512	vect: Convert cost hooks to classes The current vector cost interface has quite a bit of redundancy built in. Each target that defines its own hooks has to replicate the basic unsigned[3] management. Currently each target also duplicates the cost adjustment for inner loops. This patch instead defines a vector_costs class for holding the scalar or vector cost and allows targets to subclass it. There is then only one costing hook: to create a new costs structure of the appropriate type. Everything else can be virtual functions, with common concepts implemented in the base class rather than in each target's derivation. This might seem like excess C++-ification, but it shaves ~100 LOC. I've also got some follow-on changes that become significantly easier with this patch. Maybe it could help with things like weighting blocks based on frequency too. This will clash with Andre's unrolling patches. His patches have priority so this patch should queue behind them. The x86 and rs6000 parts fully convert to a self-contained class. The equivalent aarch64 changes are more complex, so this patch just does the bare minimum. A later patch will rework the aarch64 bits. gcc/ * target.def (targetm.vectorize.init_cost): Replace with... (targetm.vectorize.create_costs): ...this. (targetm.vectorize.add_stmt_cost): Delete. (targetm.vectorize.finish_cost): Likewise. (targetm.vectorize.destroy_cost_data): Likewise. * doc/tm.texi.in (TARGET_VECTORIZE_INIT_COST): Replace with... (TARGET_VECTORIZE_CREATE_COSTS): ...this. (TARGET_VECTORIZE_ADD_STMT_COST): Delete. (TARGET_VECTORIZE_FINISH_COST): Likewise. (TARGET_VECTORIZE_DESTROY_COST_DATA): Likewise. * doc/tm.texi: Regenerate. * tree-vectorizer.h (vec_info::vec_info): Remove target_cost_data parameter. (vec_info::target_cost_data): Change from a void * to a vector_costs *. (vector_costs): New class. (init_cost): Take a vec_info and return a vector_costs. (dump_stmt_cost): Remove data parameter. 
(add_stmt_cost): Replace vinfo and data parameters with a vector_costs. (add_stmt_costs): Likewise. (finish_cost): Replace data parameter with a vector_costs. (destroy_cost_data): Delete. tree-vectorizer.c (dump_stmt_cost): Remove data argument and don't print it. (vec_info::vec_info): Remove the target_cost_data parameter and initialize the member variable to null instead. (vec_info::~vec_info): Delete target_cost_data instead of calling destroy_cost_data. (vector_costs::add_stmt_cost): New function. (vector_costs::finish_cost): Likewise. (vector_costs::record_stmt_cost): Likewise. (vector_costs::adjust_cost_for_freq): Likewise. * tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Update call to vec_info::vec_info. (vect_compute_single_scalar_iteration_cost): Update after above changes to costing interface. (vect_analyze_loop_operations): Likewise. (vect_estimate_min_profitable_iters): Likewise. (vect_analyze_loop_2): Initialize LOOP_VINFO_TARGET_COST_DATA at the start_over point, where it needs to be recreated after trying without slp. Update retry code accordingly. * tree-vect-slp.c (_bb_vec_info::_bb_vec_info): Update call to vec_info::vec_info. (vect_slp_analyze_operation): Update after above changes to costing interface. (vect_bb_vectorization_profitable_p): Likewise. * targhooks.h (default_init_cost): Replace with... (default_vectorize_create_costs): ...this. (default_add_stmt_cost): Delete. (default_finish_cost, default_destroy_cost_data): Likewise. * targhooks.c (default_init_cost): Replace with... (default_vectorize_create_costs): ...this. (default_add_stmt_cost): Delete, moving logic to vector_costs instead. (default_finish_cost, default_destroy_cost_data): Delete. * config/aarch64/aarch64.c (aarch64_vector_costs): Inherit from vector_costs. Add a constructor. (aarch64_init_cost): Replace with... (aarch64_vectorize_create_costs): ...this. (aarch64_add_stmt_cost): Replace with... (aarch64_vector_costs::add_stmt_cost): ...this. 
Use record_stmt_cost to adjust the cost for inner loops. (aarch64_finish_cost): Replace with... (aarch64_vector_costs::finish_cost): ...this. (aarch64_destroy_cost_data): Delete. (TARGET_VECTORIZE_INIT_COST): Replace with... (TARGET_VECTORIZE_CREATE_COSTS): ...this. (TARGET_VECTORIZE_ADD_STMT_COST): Delete. (TARGET_VECTORIZE_FINISH_COST): Likewise. (TARGET_VECTORIZE_DESTROY_COST_DATA): Likewise. * config/i386/i386.c (ix86_vector_costs): New structure. (ix86_init_cost): Replace with... (ix86_vectorize_create_costs): ...this. (ix86_add_stmt_cost): Replace with... (ix86_vector_costs::add_stmt_cost): ...this. Use adjust_cost_for_freq to adjust the cost for inner loops. (ix86_finish_cost, ix86_destroy_cost_data): Delete. (TARGET_VECTORIZE_INIT_COST): Replace with... (TARGET_VECTORIZE_CREATE_COSTS): ...this. (TARGET_VECTORIZE_ADD_STMT_COST): Delete. (TARGET_VECTORIZE_FINISH_COST): Likewise. (TARGET_VECTORIZE_DESTROY_COST_DATA): Likewise. * config/rs6000/rs6000.c (TARGET_VECTORIZE_INIT_COST): Replace with... (TARGET_VECTORIZE_CREATE_COSTS): ...this. (TARGET_VECTORIZE_ADD_STMT_COST): Delete. (TARGET_VECTORIZE_FINISH_COST): Likewise. (TARGET_VECTORIZE_DESTROY_COST_DATA): Likewise. (rs6000_cost_data): Inherit from vector_costs. Add a constructor. Drop loop_info, cost and costing_for_scalar in favor of the corresponding vector_costs member variables. Add "m_" to the names of the remaining member variables and initialize them. (rs6000_density_test): Replace with... (rs6000_cost_data::density_test): ...this. (rs6000_init_cost): Replace with... (rs6000_vectorize_create_costs): ...this. (rs6000_update_target_cost_per_stmt): Replace with... (rs6000_cost_data::update_target_cost_per_stmt): ...this. (rs6000_add_stmt_cost): Replace with... (rs6000_cost_data::add_stmt_cost): ...this. Use adjust_cost_for_freq to adjust the cost for inner loops. (rs6000_adjust_vect_cost_per_loop): Replace with... (rs6000_cost_data::adjust_vect_cost_per_loop): ...this. 
(rs6000_finish_cost): Replace with... (rs6000_cost_data::finish_cost): ...this. Group loop code into a single if statement and pass the loop_vinfo down to subroutines. (rs6000_destroy_cost_data): Delete.	2021-11-04 12:31:17 +00:00
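The design described in the commit above (one factory hook, three cost buckets managed by a base class, per-target overrides) can be sketched as follows. This is a language-neutral illustration in Python, not GCC's C++ code: the class names, the `Where` enum, the inner-loop factor of 50 and the "one extra unit per vector statement" rule are all hypothetical. Only the method names `add_stmt_cost`, `record_stmt_cost`, `adjust_cost_for_freq`, `finish_cost` and the `create_costs` hook are taken from the commit message.

```python
from enum import Enum


class Where(Enum):
    PROLOGUE = 0
    BODY = 1
    EPILOGUE = 2


class VectorCosts:
    """Hypothetical base class: owns the three cost buckets once, for all targets."""

    INNER_LOOP_FACTOR = 50  # illustrative weight only, an assumption

    def __init__(self, costing_for_scalar=False):
        self.costing_for_scalar = costing_for_scalar
        self.costs = {w: 0 for w in Where}

    def adjust_cost_for_freq(self, in_inner_loop, where, cost):
        # Shared inner-loop weighting, so targets need not duplicate it.
        if where is Where.BODY and in_inner_loop:
            return cost * self.INNER_LOOP_FACTOR
        return cost

    def record_stmt_cost(self, in_inner_loop, where, cost):
        cost = self.adjust_cost_for_freq(in_inner_loop, where, cost)
        self.costs[where] += cost
        return cost

    def add_stmt_cost(self, count, unit_cost, where, in_inner_loop=False):
        return self.record_stmt_cost(in_inner_loop, where, count * unit_cost)

    def finish_cost(self):
        return tuple(self.costs[w] for w in Where)


class ToyTargetCosts(VectorCosts):
    """Hypothetical target subclass overriding a single method."""

    def add_stmt_cost(self, count, unit_cost, where, in_inner_loop=False):
        # Made-up target rule: each vector statement costs one extra unit.
        extra = 0 if self.costing_for_scalar else count
        return self.record_stmt_cost(in_inner_loop, where, count * unit_cost + extra)


def create_costs(costing_for_scalar):
    """The single remaining hook: build a costs object of the target's type."""
    return ToyTargetCosts(costing_for_scalar)


costs = create_costs(False)
costs.add_stmt_cost(2, 3, Where.PROLOGUE)
costs.add_stmt_cost(4, 1, Where.BODY, in_inner_loop=True)
result = costs.finish_cost()
print(result)
```

Because the bucket bookkeeping and frequency adjustment live in the base class, a target only overrides the parts where its cost model actually differs.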
Martin Liska	af976d90fa	libsanitizer: update LOCAL_PATCHES libsanitizer/ChangeLog: * LOCAL_PATCHES: Update git revision.	2021-11-04 13:26:58 +01:00
H.J. Lu	65ade6a34c	libsanitizer: Apply local patches	2021-11-04 13:26:17 +01:00

189352 Commits