OpenE2K/gcc - gcc - Expired Mentality Git

Author	SHA1	Message	Date
Mihailo Stojanovic	cc2fda1328	aarch64: Prevent use of SIMD fcvtz[su] instruction variant with "nosimd" Currently, SF->SI and DF->DI conversions on Aarch64 with the "nosimd" flag provided sometimes cause the emitting of a vector variant of the fcvtz[su] instruction (e.g. fcvtzu s0, s0). This modifies the corresponding pattern to only select the vector variant of the instruction when generating code with SIMD enabled. gcc/ChangeLog: * config/aarch64/aarch64.md (<optab>_trunc<fcvt_target><GPI:mode>2): Set the "arch" attribute to disambiguate between SIMD and FP variants of the instruction. gcc/testsuite/ChangeLog: * gcc.target/aarch64/fcvt_nosimd.c: New test.	2021-03-30 11:42:49 +01:00
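The conversions named in the entry above are ordinary scalar float-to-integer truncations. A minimal sketch of source code that exercises them follows; the function names are hypothetical, and this is not the content of fcvt_nosimd.c, which the log does not show.

```cpp
#include <cstdint>

// SF -> SI: truncate a single-precision float to a 32-bit integer.
std::int32_t float_to_s32(float x) { return static_cast<std::int32_t>(x); }
std::uint32_t float_to_u32(float x) { return static_cast<std::uint32_t>(x); }

// DF -> DI: truncate a double-precision float to a 64-bit integer.
std::int64_t double_to_s64(double x) { return static_cast<std::int64_t>(x); }
std::uint64_t double_to_u64(double x) { return static_cast<std::uint64_t>(x); }
```

Compiling functions like these for AArch64 with SIMD disabled and inspecting the assembly is one way to see which fcvtz[su] variant is selected.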
GCC Administrator	65374af219	Daily bump.	2021-03-30 00:16:29 +00:00
Joseph Myers	8aac913adf	Update cpplib sr.po. * sr.po: Update.	2021-03-29 22:53:22 +00:00
Joseph Myers	318074f335	Update gcc sv.po. * sv.po: Update.	2021-03-29 22:51:16 +00:00
Eric Botcazou	471babd886	Fix wrong assignment of aggregate to full-access component This is a regression present on the mainline: the compiler (front-end) fails to assign an aggregate to a full-access component (i.e. Atomic or VFA) as a whole if the type of the component is not full access itself. gcc/ada/ PR ada/99802 * freeze.adb (Is_Full_Access_Aggregate): Call Is_Full_Access_Object on the name of an N_Assignment_Statement to spot full access.	2021-03-30 00:45:38 +02:00
Martin Sebor	af739c8797	PR tree-optimization/61869 - Spurious uninitialized warning gcc/testsuite/ChangeLog: PR tree-optimization/61869 * gcc.dg/uninit-pr61869.c: New test.	2021-03-29 15:58:01 -06:00
Martin Sebor	fecc835e21	PR tree-optimization/61677 - False positive with -Wmaybe-uninitialized gcc/testsuite/ChangeLog: PR tree-optimization/61677 * gcc.dg/uninit-pr61677.c: New test.	2021-03-29 15:23:03 -06:00
Michael Meissner	645bfc1619	Require GLIBC 2.32 for Decimal/_Float128 conversions. In the patch that I applied on March 2nd, I had code to provide support for Decimal/_Float128 conversions if the user did not use at least GLIBC 2.32. It did this by using __ibm128 as an intermediate type. The trouble is __ibm128 cannot represent all of the numbers that _Float128 can, and you lose if you do this conversion. This patch removes this support. The dfp-bit.c functions now call the __sprintfieee128 and __strtoieee128 functions to do the conversion. If the user does not have GLIBC, they will get a linker error that these functions do not exist. The float128 support functions are only built into the static libgcc, so there isn't an issue with having references to __strtoieee128 and __sprintfieee128 with older GLIBC libraries. As an added bonus, this patch eliminates the __sprintfkf function which included stdio.h to get a definition for the sprintf library function. This allows for building cross compilers without having to have a target stdio.h available. libgcc/ 2021-03-29 Michael Meissner <meissner@linux.ibm.com> * config/rs6000/t-float128 (fp128_decstr_funcs): Delete. (fp128_ppc_funcs): Do not add $(fp128_decstr_funcs). (fp128_decstr_objs): Delete. * dfp-bit.h: Call __sprintfieee128 to do conversions from _Float128 to a Decimal type. Call __strtoieee128 to do conversions from a Decimal type to _Float128. * config/rs6000/_sprintfkf.c: Delete file. * config/rs6000/_sprintfkf.h: Delete file. * config/rs6000/_strtokf.c: Delete file. * config/rs6000/_strtokf.h: Delete file.	2021-03-29 16:43:14 -04:00
Martin Sebor	77093a75ca	PR tree-optimization/61112 - repeated conditional triggers false positive -Wmaybe-uninitialized gcc/testsuite/ChangeLog: PR tree-optimization/61112 * gcc.dg/uninit-pr61112.c: New test.	2021-03-29 13:52:53 -06:00
Jan Hubicka	7b6ca93b2d	Fix pr99751.c testcase PR ipa/99751 * gcc.c-torture/compile/pr99751.c: Rename from ... * gcc.c-torture/execute/pr99751.c: ... to this.	2021-03-29 20:59:42 +02:00
Jan Hubicka	dd64aaafe6	Fix typo in merge_call_lhs_flags gcc/ChangeLog: 2021-03-29 Jan Hubicka <hubicka@ucw.cz> * ipa-modref.c (merge_call_lhs_flags): Correct handling of deref. (analyze_ssa_name_flags): Fix typo in comment. gcc/testsuite/ChangeLog: 2021-03-29 Jan Hubicka <hubicka@ucw.cz> * gcc.c-torture/compile/pr99751.c: New test.	2021-03-29 20:09:35 +02:00
Jonathan Wakely	864caa158f	Fix PR number in ChangeLog	2021-03-29 17:08:38 +01:00
Jakub Jelinek	afa8c67eb9	testsuite: Expect a warning on aarch64 for declare-simd-coarray-lib.f90 [PR93660] aarch64 currently doesn't support declare simd where the return value and arguments have different sizes and warns about that case. This change adds a dg-warning for that case like various other tests have already. 2021-03-29 Jakub Jelinek <jakub@redhat.com> PR fortran/93660 * gfortran.dg/gomp/declare-simd-coarray-lib.f90: Expect a mixed size declare simd warning on aarch64.	2021-03-29 17:05:47 +02:00
Jonathan Wakely	e19afa0645	libstdc++: Adjust link to PSTL upstream (again) The LLVM project renamed their default branch to 'main'. libstdc++-v3/ChangeLog: * doc/xml/manual/status_cxx2017.xml: Adjust link for PSTL. * doc/html/manual/status.html: Regenerate.	2021-03-29 14:14:00 +01:00
Alex Coplan	e4005cf871	aarch64: Fix SVE ACLE builtins with LTO [PR99216] As discussed in the PR, we currently have two different numbering schemes for SVE builtins: one for C, and one for C++. This is problematic for LTO, where we end up getting confused about which intrinsic we're talking about. This patch inserts placeholders into the registered_functions vector to ensure that there is a consistent numbering scheme for both C and C++. We use integer_zero_node as a placeholder node instead of building a function decl. This is safe because the node is only returned by the TARGET_BUILTIN_DECL hook, which (on AArch64) is only used for validation when builtin decls are streamed into lto1. gcc/ChangeLog: PR target/99216 * config/aarch64/aarch64-sve-builtins.cc (function_builder::add_function): Add placeholder_p argument, use placeholder decls if this is set. (function_builder::add_unique_function): Instead of conditionally adding direct overloads, unconditionally add either a direct overload or a placeholder. (function_builder::add_overloaded_function): Set placeholder_p if we're using C++ overloads. Use the obstack for string storage instead of relying on the tree nodes. (function_builder::add_overloaded_functions): Don't return early for m_direct_overloads: we need to add placeholders. * config/aarch64/aarch64-sve-builtins.h (function_builder::add_function): Add placeholder_p argument. gcc/testsuite/ChangeLog: PR target/99216 * g++.target/aarch64/sve/pr99216.C: New test.	2021-03-29 12:18:19 +01:00
Richard Biener	8cf2812cfc	tree-optimization/99807 - avoid bogus assert with permute SLP node This avoids asserting anything on the SLP_TREE_REPRESENTATIVE of an SLP permute node (which shouldn't be there). 2021-03-29 Richard Biener <rguenther@suse.de> PR tree-optimization/99807 * tree-vect-slp.c (vect_slp_analyze_node_operations_1): Move assert below VEC_PERM handling. * gfortran.dg/vect/pr99807.f90: New testcase.	2021-03-29 13:13:10 +02:00
Kyrylo Tkachov	37d9074e12	aarch64: PR target/99037 Fix RTL representation in move_lo_quad patterns This patch fixes the RTL representation of the move_lo_quad patterns to use aarch64_simd_or_scalar_imm_zero for the zero part rather than a vec_duplicate of zero or a const_int 0. The expander that generates them is also adjusted so that we use and match the correct const_vector forms throughout. Co-Authored-By: Jakub Jelinek <jakub@redhat.com> gcc/ChangeLog: PR target/99037 * config/aarch64/aarch64-simd.md (move_lo_quad_internal_<mode>): Use aarch64_simd_or_scalar_imm_zero to match zeroes. Remove pattern matching const_int 0. (move_lo_quad_internal_be_<mode>): Likewise. (move_lo_quad_<mode>): Update for the above. * config/aarch64/iterators.md (VQ_2E): Delete. gcc/testsuite/ChangeLog: PR target/99808 * gcc.target/aarch64/pr99808.c: New test.	2021-03-29 11:54:57 +01:00
Jakub Jelinek	25e515d219	fold-const: Fix ICE in extract_muldiv_1 [PR99777] extract_muldiv{,_1} is apparently only prepared to handle scalar integer operations, the callers ensure it by only calling it if the divisor or one of the multiplicands is INTEGER_CST and because neither multiplication nor division nor modulo are really supported e.g. for pointer types, nullptr type etc. But the CASE_CONVERT handling doesn't really check if it isn't a cast from some other type kind, so on the testcase we end up trying to build MULT_EXPR in POINTER_TYPE which ICEs. A few years ago Marek has added ANY_INTEGRAL_TYPE_P checks to two spots, but the code uses TYPE_PRECISION which means something completely different for vector types, etc. So IMNSHO we should just punt on conversions from non-integrals or non-scalar integrals. 2021-03-29 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/99777 * fold-const.c (extract_muldiv_1): For conversions, punt on casts from types other than scalar integral types. * g++.dg/torture/pr99777.C: New test.	2021-03-29 12:35:32 +02:00
Tobias Burnus	d579e2e76f	libgomp: Fix on_device_arch.c aux-file handling [PR99555] libgomp/ChangeLog: PR target/99555 * testsuite/lib/on_device_arch.c: Move to ... * testsuite/libgomp.c-c++-common/on_device_arch.h: ... here. * testsuite/libgomp.fortran/on_device_arch.c: New file; #include on_device_arch.h. * testsuite/libgomp.c-c++-common/task-detach-6.c: #include on_device_arch.h instead of using dg-additional-source. * testsuite/libgomp.c/pr99555-1.c: Likewise. * testsuite/libgomp.fortran/task-detach-6.f90: Update to use on_device_arch.c without relative paths.	2021-03-29 10:40:38 +02:00
GCC Administrator	c411011287	Daily bump.	2021-03-29 00:16:20 +00:00
David Edelsohn	499fa254ae	aix: TLS DWARF symbol decorations. GCC currently emits TLS relocation decorations on symbols in DWARF sections. Recent changes to the AIX linker cause it to reject such symbols. This patch removes the decorations (@ie, @le, @m) and emits only the qualified symbol name. gcc/ChangeLog: * config/rs6000/rs6000.c (rs6000_output_dwarf_dtprel): Do not add XCOFF TLS reloc decorations.	2021-03-28 17:57:33 -04:00
Gerald Pfeifer	d15db0c5f5	doc: Update link to "Memory Model" paper gcc/ChangeLog: * doc/analyzer.texi (Analyzer Internals): Update link to "A Memory Model for Static Analysis of C Programs".	2021-03-28 23:34:35 +02:00
François Dumont	d04c246cae	libstdc++: _GLIBCXX_DEBUG Fix allocator-extended move constructor libstdc++-v3/ChangeLog: * include/debug/forward_list (forward_list(forward_list&&, const allocator_type&)): Add noexcept qualification. * include/debug/list (list(list&&, const allocator_type&)): Likewise and add call to safe container allocator aware move constructor. * include/debug/vector (vector(vector&&, const allocator_type&)): Fix noexcept qualification. * testsuite/23_containers/forward_list/cons/noexcept_move_construct.cc: Add allocator-extended move constructor noexcept qualification check. * testsuite/23_containers/list/cons/noexcept_move_construct.cc: Likewise.	2021-03-28 22:06:33 +02:00
Christophe Lyon	46720db72c	testsuite/arm: Improve scan-assembler in pr96770.c I'm seeing random scan-assembler-times failures in pr96770.c when LTO is used. I suspect this is because the \\+4 string matches the LTO sections, sometimes. This small patch avoids the issue, by matching arr\\+4 instead of \\+4. 2021-03-28 Christophe Lyon <christophe.lyon@linaro.org> gcc/testsuite/ PR target/96770 * gcc.target/arm/pure-code/pr96770.c: Improve scan-assembler-times.	2021-03-28 19:01:24 +00:00
Paul Thomas	297363774e	Fortran: Fix problem with runtime pointer check [PR99602]. 2021-03-28 Paul Thomas <pault@gcc.gnu.org> gcc/fortran/ChangeLog PR fortran/99602 * trans-expr.c (gfc_conv_procedure_call): Use the _data attrs for class expressions and detect proc pointer evaluations by the non-null actual argument list. gcc/testsuite/ChangeLog PR fortran/99602 * gfortran.dg/pr99602.f90: New test. * gfortran.dg/pr99602a.f90: New test. * gfortran.dg/pr99602b.f90: New test. * gfortran.dg/pr99602c.f90: New test. * gfortran.dg/pr99602d.f90: New test.	2021-03-28 19:39:50 +01:00
Iain Buclaw	5a5d23010a	d: Predefine the D_PIE version condition when flag_pie is set. Same as the D_PIC version condition, which is set by flag_pic. gcc/d/ChangeLog: * d-builtins.cc (d_init_versions): Predefine D_PIE if flag_pie is set.	2021-03-28 17:46:36 +02:00
Iain Buclaw	be080b1727	d: Don't create gdc.test symlink in the gdc testsuite directory Instead, tests are copied from the source tree (i.e: $srcdir/compilable) into the test base directory ($base_dir/compilable). A dejagnu test file with all translated test directives is created in a path that follows DejaGnu naming conventions ($base_dir/gdc.test/compilable), which is then passed to `dg-test'. Before invoking the compiler, the gdc.test prefix is trimmed from the test program in `gdc-dg-test' so that all copied test files are picked up with the correct path names. gcc/testsuite/ChangeLog: * lib/gdc-utils.exp (gdc-copy-extra): Rename to... (gdc-copy-file): ... this. Use file copy instead of open/close. (gdc-convert-test): Save translated dejagnu test to gdc.test directory, only write dejagnu directives to the test file. (gdc-do-test): Don't create gdc.test symlink.	2021-03-28 14:47:36 +02:00
Iain Buclaw	0907036f45	d: Define language hook for LANG_HOOKS_ENUM_UNDERLYING_BASE_TYPE The underlying base type for enumerals is always present in TREE_TYPE. gcc/d/ChangeLog: * d-lang.cc (d_enum_underlying_base_type): New function. (LANG_HOOKS_ENUM_UNDERLYING_BASE_TYPE): Set as d_enum_underlying_base_type.	2021-03-28 14:47:36 +02:00
Iain Buclaw	d3ae0f515d	d: Use COMPILER_FOR_BUILD to build all D front-end generator programs This means the correct config headers are included when building the D front-end in a Canadian cross configuration. gcc/d/ChangeLog: * Make-lang.in (DMDGEN_COMPILE): Remove. (d/%.dmdgen.o): Use COMPILER_FOR_BUILD and BUILD_COMPILERFLAGS to build all D generator programs. (D_SYSTEM_H): New macro. (d/idgen.dmdgen.o): Add dependencies to build. (d/impcnvgen.dmdgen.o): Likewise. * d-system.h: Include bconfig.h if GENERATOR_FILE is defined.	2021-03-28 14:47:35 +02:00
Iain Buclaw	65c001bfaf	d: Don't generate per-module wrapper for calling DSO constructor/destructor. The static constructor/destructor list only ever has one function to call in it, so mark the gdc.dso_ctor and gdc.dso_dtor functions as static ctor/dtor directly instead. gcc/d/ChangeLog: * config-lang.in (gtfiles): Remove modules.cc. * modules.cc (struct module_info): Remove GTY marker. (static_ctor_list): Remove variable. (static_dtor_list): Remove variable. (register_moduleinfo): Directly set DECL_STATIC_CONSTRUCTOR on dso_ctor, and DECL_STATIC_DESTRUCTOR on dso_dtor. (d_finish_compilation): Remove static ctor/dtor handling. gcc/testsuite/ChangeLog: * gdc.dg/gdc270a.d: Removed. * gdc.dg/gdc270b.d: Removed.	2021-03-28 14:47:35 +02:00
GCC Administrator	d21001c793	Daily bump.	2021-03-28 00:16:17 +00:00
Steve Kargl	01685676a9	fortran: Fix off-by-one in buffer sizes. gcc/fortran/ChangeLog: * misc.c (gfc_typename): Fix off-by-one in buffer sizes.	2021-03-27 15:02:16 -07:00
GCC Administrator	651684b462	Daily bump.	2021-03-27 00:16:27 +00:00
David Edelsohn	42a21b4cb5	aix: ABI struct alignment (PR99557) The AIX power alignment rules apply the natural alignment of the "first member" if it is of a floating-point data type (or is an aggregate whose recursively "first" member or element is such a type). The alignment associated with these types for subsequent members use an alignment value where the floating-point data type is considered to have 4-byte alignment. GCC had been stripping array type but had not recursively looked within structs and unions. This also applies to classes and subclasses and, therefore, becomes more prominent with C++. For example, struct A { double x[2]; int y; }; struct B { int i; struct A a; }; struct A has double-word alignment for the bare type, but word alignment and offset within struct B despite the alignment of struct A. If struct A were the first member of struct B, struct B would have double-word alignment. One must search for the innermost first member to increase the alignment if double and then search for the innermost first member to reduce the alignment if the TYPE had double-word alignment solely because the innermost first member was double. This patch recursively looks through the first member to apply the double-word alignment to the struct / union as a whole and to apply the word alignment to the struct or union as a member within a struct or union. This is an ABI change for GCC on AIX, but GCC on AIX had not correctly implemented the AIX ABI and had not been compatible with the IBM XL compiler. Bootstrapped on powerpc-ibm-aix7.2.3.0. gcc/ChangeLog: * config/rs6000/aix.h (ADJUST_FIELD_ALIGN): Call function. * config/rs6000/rs6000-protos.h (rs6000_special_adjust_field_align): Declare. * config/rs6000/rs6000.c (rs6000_special_adjust_field_align): New. (rs6000_special_round_type_align): Recursively check innermost first field. gcc/testsuite/ChangeLog: * gcc.target/powerpc/pr99557.c: New.	2021-03-26 19:56:12 -04:00
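The struct A / struct B layout described in the entry above can be inspected on any host with a small program. This is a sketch only: print_layout is a hypothetical helper, and the values it prints depend on the target ABI. Per the commit message, AIX power alignment should give double-word alignment for bare struct A but word alignment and offset for the member a inside struct B; other ABIs will generally differ.

```cpp
#include <cstddef>
#include <cstdio>

// The two structs quoted in the commit message.
struct A { double x[2]; int y; };
struct B { int i; struct A a; };

// Print the layout as the host compiler and ABI see it.
void print_layout() {
    std::printf("alignof(A)     = %zu\n", alignof(A));
    std::printf("alignof(B)     = %zu\n", alignof(B));
    std::printf("offsetof(B, a) = %zu\n", offsetof(B, a));
}
```

Comparing the output of GCC builds from before and after this commit on AIX would show the ABI change the message describes.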
Jakub Jelinek	1cdfc98a99	dwarf2cfi: Defer queued register saves some more [PR99334] On the testcase in the PR with -fno-tree-sink -O3 -fPIC -fomit-frame-pointer -fno-strict-aliasing -mstackrealign we have prologue: 0000000000000000 <_func_with_dwarf_issue_>: 0: 4c 8d 54 24 08 lea 0x8(%rsp),%r10 5: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp 9: 41 ff 72 f8 pushq -0x8(%r10) d: 55 push %rbp e: 48 89 e5 mov %rsp,%rbp 11: 41 57 push %r15 13: 41 56 push %r14 15: 41 55 push %r13 17: 41 54 push %r12 19: 41 52 push %r10 1b: 53 push %rbx 1c: 48 83 ec 20 sub $0x20,%rsp and emit 00000000 0000000000000014 00000000 CIE Version: 1 Augmentation: "zR" Code alignment factor: 1 Data alignment factor: -8 Return address column: 16 Augmentation data: 1b DW_CFA_def_cfa: r7 (rsp) ofs 8 DW_CFA_offset: r16 (rip) at cfa-8 DW_CFA_nop DW_CFA_nop 00000018 0000000000000044 0000001c FDE cie=00000000 pc=0000000000000000..00000000000001d5 DW_CFA_advance_loc: 5 to 0000000000000005 DW_CFA_def_cfa: r10 (r10) ofs 0 DW_CFA_advance_loc: 9 to 000000000000000e DW_CFA_expression: r6 (rbp) (DW_OP_breg6 (rbp): 0) DW_CFA_advance_loc: 13 to 000000000000001b DW_CFA_def_cfa_expression (DW_OP_breg6 (rbp): -40; DW_OP_deref) DW_CFA_expression: r15 (r15) (DW_OP_breg6 (rbp): -8) DW_CFA_expression: r14 (r14) (DW_OP_breg6 (rbp): -16) DW_CFA_expression: r13 (r13) (DW_OP_breg6 (rbp): -24) DW_CFA_expression: r12 (r12) (DW_OP_breg6 (rbp): -32) ... unwind info for that. The problem is when async signal (or stepping through in the debugger) stops after the pushq %rbp instruction and before movq %rsp, %rbp, the unwind info says that caller's %rbp is saved there at %rbp, but that is not true, caller's %rbp is either still available in the %rbp register, or in %rsp, only after executing the next instruction - movq %rsp, %rbp - the location for %rbp is correct. 
So, either we'd need to temporarily say: DW_CFA_advance_loc: 9 to 000000000000000e DW_CFA_expression: r6 (rbp) (DW_OP_breg7 (rsp): 0) DW_CFA_advance_loc: 3 to 0000000000000011 DW_CFA_expression: r6 (rbp) (DW_OP_breg6 (rbp): 0) DW_CFA_advance_loc: 10 to 000000000000001b or to me it seems more compact to just say: DW_CFA_advance_loc: 12 to 0000000000000011 DW_CFA_expression: r6 (rbp) (DW_OP_breg6 (rbp): 0) DW_CFA_advance_loc: 10 to 000000000000001b I've tried instead to deal with it through REG_FRAME_RELATED_EXPR from the backend, but that failed miserably as explained in the PR, dwarf2cfi.c has some rules (Rule 16 to Rule 19) that are specific to the dynamic stack realignment using drap register that only the i386 backend does right now, and by using REG_FRAME_RELATED_EXPR or REG_CFA* notes we can't emulate those rules. The following patch instead does the deferring of the hard frame pointer save rule in dwarf2cfi.c Rule 18 handling and emits it on the (set hfp sp) assignment that must appear shortly after it and adds assertion that it is the case. The difference before/after the patch on the assembly is: --- pr99334.s~ 2021-03-26 15:42:40.881749380 +0100 +++ pr99334.s 2021-03-26 17:38:05.729161910 +0100 @@ -11,8 +11,8 @@ _func_with_dwarf_issue_: andq $-16, %rsp pushq -8(%r10) pushq %rbp - .cfi_escape 0x10,0x6,0x2,0x76,0 movq %rsp, %rbp + .cfi_escape 0x10,0x6,0x2,0x76,0 pushq %r15 pushq %r14 pushq %r13 i.e. does just what we IMHO need, after pushq %rbp %rbp still contains parent's frame value and so the save rule doesn't need to be overridden there, ditto at the start of the next insn before the side-effect took effect, and we override it only after it when %rbp already has the right value. If some other target adds dynamic stack realignment in the future and the offset 0 case wouldn't be true there, the code can be adjusted so that it works on all the drap architectures, I'm pretty sure the code would need other adjustments too. 
For the rule 18 and for the (set hfp sp) after it we already have asserts for the drap cases that check whether the code looks the way i?86/x86_64 emit it currently. 2021-03-26 Jakub Jelinek <jakub@redhat.com> PR debug/99334 * dwarf2out.h (struct dw_fde_node): Add rule18 member. * dwarf2cfi.c (dwarf2out_frame_debug_expr): When handling (set hfp sp) assignment with drap_reg active, queue reg save for hfp with offset 0 and flush queued reg saves. When handling a push with rule18, defer queueing reg save for hfp and just assert the offset is 0. (scan_trace): Assert that fde->rule18 is false.	2021-03-27 00:20:42 +01:00
Martin Sebor	980b12cc81	PR tree-optimization/59970 - Bogus -Wmaybe-uninitialized at low optimization levels PR tree-optimization/59970 * gcc.dg/uninit-pr59970.c: New test.	2021-03-26 16:40:39 -06:00
Marek Polacek	c453a81712	c++: ICE on invalid with NSDMI in C++98 [PR98352] NSDMIs are a C++11 thing, and here we ICE with them on the non-C++11 path. Fortunately all we need is a small tweak to my recent r11-7835 patch. gcc/cp/ChangeLog: PR c++/98352 * method.c (implicitly_declare_fn): Pass &raises to synthesized_method_walk. gcc/testsuite/ChangeLog: PR c++/98352 * g++.dg/cpp0x/inh-ctor37.C: Remove dg-error. * g++.dg/cpp0x/nsdmi17.C: New test.	2021-03-26 16:12:08 -04:00
Jonathan Wakely	5f070ba298	libstdc++: Add PRNG fallback to std::random_device This makes std::random_device usable on VxWorks when running on older x86 hardware. Since the r10-728 fix for PR libstdc++/85494 the library will use the new code unconditionally on x86, but the cpuid checks for RDSEED and RDRAND can fail at runtime, depending on the hardware where the code is executing. If the OS does not provide /dev/urandom then this means the std::random_device constructor always fails. In previous releases if /dev/urandom is unavailable then std::mt19937 was used unconditionally. This patch adds a fallback for the case where the runtime cpuid checks for x86 hardware instructions fail, and no /dev/urandom is available. When this happens a std::linear_congruential_engine object will be used, with a seed based on hashing the engine's address and the current time. Distinct std::random_device objects will use different seeds, unless an object is created and destroyed and a new object created at the same memory location within the clock tick. This is not great, but is better than always throwing from the constructor, and better than always using std::mt19937 with the same seed (as GCC 9 and earlier do). libstdc++-v3/ChangeLog: * src/c++11/random.cc (USE_LCG): Define when a pseudo-random fallback is needed. [USE_LCG] (bad_seed, construct_lcg_at, destroy_lcg_at, __lcg): New helper functions and callback. (random_device::_M_init): Add 'prng' and 'all' enumerators. Replace switch with fallthrough with a series of 'if' statements. [USE_LCG]: Construct an lcg_type engine and use __lcg when cpuid checks fail. (random_device::_M_init_pretr1) [USE_MT19937]: Accept "prng" token. (random_device::_M_getval): Check for callback unconditionally and always pass _M_file pointer. * testsuite/26_numerics/random/random_device/85494.cc: Remove effective-target check. Use new random_device_available helper. * testsuite/26_numerics/random/random_device/94087.cc: Likewise. 
* testsuite/26_numerics/random/random_device/cons/default-cow.cc: Remove effective-target check. * testsuite/26_numerics/random/random_device/cons/default.cc: Likewise. * testsuite/26_numerics/random/random_device/cons/token.cc: Use new random_device_available helper. Test "prng" token. * testsuite/util/testsuite_random.h (random_device_available): New helper function.	2021-03-26 19:12:12 +00:00
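The fallback described in the entry above (a std::linear_congruential_engine seeded from a hash of the engine's address and the current time) can be sketched roughly as follows. Everything here is illustrative: fallback_device is a hypothetical name, the choice of std::minstd_rand and std::chrono::steady_clock is an assumption, and the hash mixing is not taken from src/c++11/random.cc.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <random>

// Hypothetical stand-in for the pseudo-random fallback; not the library's code.
struct fallback_device {
    std::minstd_rand engine;  // a std::linear_congruential_engine specialization

    fallback_device() {
        // Mix this object's address with the current clock reading so that
        // distinct objects, or objects created at different times, get
        // different seeds.
        auto addr = reinterpret_cast<std::uintptr_t>(this);
        auto now = static_cast<long long>(
            std::chrono::steady_clock::now().time_since_epoch().count());
        std::size_t h = std::hash<std::uintptr_t>{}(addr);
        h ^= std::hash<long long>{}(now) + 0x9e3779b9u + (h << 6) + (h >> 2);
        engine.seed(static_cast<std::minstd_rand::result_type>(h));
    }

    // Return the next pseudo-random value from the engine.
    std::minstd_rand::result_type operator()() { return engine(); }
};
```

As the message notes, such a seed is predictable and two objects created at the same address within one clock tick would collide, so this is only a last resort when neither the hardware instructions nor /dev/urandom are usable.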
Nathan Sidwell	d82797420c	c++: imported templates and alias-template changes [PR 99283] During development of modules, I had difficulty deciding whether the module flags of a template should live on the decl_template_result, the template_decl, or both. I chose the latter, and require them to be consistent. This and a few other defects show how hard that consistency is. Hence this patch moves to holding the flags on the template-decl-result decl. That's the entity various bits of the parser have at the appropriate time. One needs STRIP_TEMPLATE in a bunch of places, which this patch adds. Also a check that we never give a TEMPLATE_DECL to the module flag accessors. This left a problem with how I was handling template aliases. These were in two parts -- separating the TEMPLATE_DECL from the TYPE_DECL. That seemed somewhat funky, but development showed it necessary. Of course, that causes problems if the TEMPLATE_DECL cannot contain 'am imported' information. Investigating now shows that we do not need to treat them separately. By reverting a bit of template instantiation machinery that caused the problem, we're back on course. I think what has happened is that between then and now, other typedef fixes have corrected the underlying problem this separation was working around. It allows a bunch of cleanup in the decl streamer, as we no longer have to handle a null TEMPLATE_DECL_RESULT. PR c++/99283 gcc/cp/ * cp-tree.h (DECL_MODULE_CHECK): Ban TEMPLATE_DECL. (SET_TYPE_TEMPLATE_INFO): Restore Alias template setting. * decl.c (duplicate_decls): Remove template_decl module flag propagation. * module.cc (merge_kind_name): Add alias tmpl spec as a thing. (dumper::impl::nested_name): Adjust for template-decl module flag change. (trees_in::assert_definition): Likewise. (trees_in::install_entity): Likewise. (trees_out::decl_value): Likewise. Remove alias template separation of template and type_decl. (trees_in::decl_value): Likewise. 
(trees_out::key_mergeable): Likewise, (trees_in::key_mergeable): Likewise. (trees_out::decl_node): Adjust for template-decl module flag change. (depset::hash::make_dependency): Likewise. (get_originating_module, module_may_redeclare): Likewise. (set_instantiating_module, set_defining_module): Likewise. * name-lookup.c (name_lookup::search_adl): Likewise. (do_pushdecl): Likewise. * pt.c (build_template_decl): Likewise. (lookup_template_class_1): Remove special alias_template handling of DECL_TI_TEMPLATE. (tsubst_template_decl): Likewise. gcc/testsuite/ * g++.dg/modules/pr99283-2_a.H: New. * g++.dg/modules/pr99283-2_b.H: New. * g++.dg/modules/pr99283-2_c.H: New. * g++.dg/modules/pr99283-3_a.H: New. * g++.dg/modules/pr99283-3_b.H: New. * g++.dg/modules/pr99283-4.H: New. * g++.dg/modules/tpl-alias-1_a.H: Adjust scans. * g++.dg/modules/tpl-alias-1_b.C: Adjust scans.	2021-03-26 12:08:42 -07:00
Dimitar Dimitrov	c314741a53	MAINTAINERS: Add myself as pru port maintainer ChangeLog: * MAINTAINERS: Add myself as pru port maintainer.	2021-03-26 19:52:24 +02:00
Vladimir Makarov	0d37e2d3ea	[PR99766] Consider relaxed memory associated more with memory instead of special memory. Relaxed memory should be considered more like memory than special memory. gcc/ChangeLog: PR target/99766 * ira-costs.c (record_reg_classes): Put case with CT_RELAXED_MEMORY adjacent to one with CT_MEMORY. * ira.c (ira_setup_alts): Ditto. * lra-constraints.c (process_alt_operands): Ditto. * recog.c (asm_operand_ok): Ditto. * reload.c (find_reloads): Ditto. gcc/testsuite/ChangeLog: PR target/99766 * g++.target/aarch64/sve/pr99766.C: New.	2021-03-26 17:11:30 +00:00
Richard Sandiford	6b8b0c8e24	aarch64: Add costs for LD[34] and ST[34] postincrements Most postincrements are cheap on Neoverse V1, but it's generally better to avoid them on LD[34] and ST[34] instructions. This patch adds separate address costs fields for these cases. Other CPUs continue to use the same costs for all postincrements. gcc/ * config/aarch64/aarch64-protos.h (cpu_addrcost_table::post_modify_ld3_st3): New member variable. (cpu_addrcost_table::post_modify_ld4_st4): Likewise. * config/aarch64/aarch64.c (generic_addrcost_table): Update accordingly, using the same costs as for post_modify. (exynosm1_addrcost_table, xgene1_addrcost_table): Likewise. (thunderx2t99_addrcost_table, thunderx3t110_addrcost_table): (tsv110_addrcost_table, qdf24xx_addrcost_table): Likewise. (a64fx_addrcost_table): Likewise. (neoversev1_addrcost_table): New. (neoversev1_tunings): Use neoversev1_addrcost_table. (aarch64_address_cost): Use the new post_modify costs for CImode and XImode.	2021-03-26 16:08:38 +00:00
Richard Sandiford	1205a8cadb	aarch64: Take issue rate into account for vector loop costs When SVE is enabled, GCC needs to do a three-way comparison between scalar, Advanced SIMD and SVE code. The normal costs tend to be latency-based, which is well-suited to SLP. However, comparing sums of latency costs means that we effectively treat the code as executing sequentially. This can hide the effect of pipeline bubbles or resource contention that in practice are quite important for loop vectorisation. This is particularly true for loops that involve reductions. This patch therefore tries to estimate how quickly each piece of code could issue, using a very (very) simplistic model. It then uses this to adjust the loop vector costs up or down as appropriate. Part of the Advanced SIMD vs. SVE adjustment is opt-in and is not enabled by default even for use_new_vector_costs. Like with the previous patches, this one only becomes active if a CPU selects use_new_vector_costs. It should therefore have a very low impact on other CPUs. The code also mostly ignores CPUs that have no issue information, even if use_new_vector_costs is enabled for some reason. gcc/ * config/aarch64/aarch64.opt (-param=aarch64-loop-vect-issue-rate-niters=): New parameter. * doc/invoke.texi: Document it. * config/aarch64/aarch64-protos.h (aarch64_base_vec_issue_info) (aarch64_scalar_vec_issue_info, aarch64_simd_vec_issue_info) (aarch64_advsimd_vec_issue_info, aarch64_sve_vec_issue_info) (aarch64_vec_issue_info): New structures. (cpu_vector_cost): Write comments above the variables rather than to the side. (cpu_vector_cost::issue_info): New member variable. * config/aarch64/aarch64.c: Include gimple-pretty-print.h and tree-ssa-loop-niter.h. (generic_vector_cost, a64fx_vector_cost, qdf24xx_vector_cost) (thunderx_vector_cost, tsv110_vector_cost, cortexa57_vector_cost) (exynosm1_vector_cost, xgene1_vector_cost, thunderx2t99_vector_cost) (thunderx3t110_vector_cost): Initialize issue_info to null. 
(neoversev1_scalar_issue_info, neoversev1_advsimd_issue_info) (neoversev1_sve_issue_info, neoversev1_vec_issue_info): New structures. (neoversev1_vector_cost): Use them. (aarch64_vec_op_count, aarch64_sve_op_count): New structures. (aarch64_vector_costs::saw_sve_only_op): New member variable. (aarch64_vector_costs::num_vector_iterations): Likewise. (aarch64_vector_costs::scalar_ops): Likewise. (aarch64_vector_costs::advsimd_ops): Likewise. (aarch64_vector_costs::sve_ops): Likewise. (aarch64_vector_costs::seen_loads): Likewise. (aarch64_simd_vec_costs_for_flags): New function. (aarch64_analyze_loop_vinfo): Initialize num_vector_iterations. Count the number of predicate operations required by SVE WHILE instructions. (aarch64_comparison_type, aarch64_multiply_add_p): New functions. (aarch64_sve_only_stmt_p, aarch64_in_loop_reduction_latency): Likewise. (aarch64_count_ops): Likewise. (aarch64_add_stmt_cost): Record whether we see an SVE operation that cannot currently be implemented using Advanced SIMD. Record issue information about the scalar, Advanced SIMD and (where relevant) SVE versions of a loop. (aarch64_vec_op_count::dump): New function. (aarch64_sve_op_count::dump): Likewise. (aarch64_estimate_min_cycles_per_iter): Likewise. (aarch64_adjust_body_cost): If issue information is available, try to compare the issue rates of the various loop implementations and increase or decrease the vector body cost accordingly.	2021-03-26 16:08:38 +00:00
Richard Sandiford	e4180ab2fe	aarch64: Ignore inductions when costing vector code In practice it seems to be better not to cost a vector induction. The scalar code generally needs the same induction but doesn't cost it, making an apples-for-apples comparison harder. Most inductions also have a low latency and their cost usually gets hidden by other operations. Like with the previous patches, this one only becomes active if a CPU selects use_new_vector_costs. It should therefore have a very low impact on other CPUs. gcc/ * config/aarch64/aarch64.c (aarch64_detect_vector_stmt_subtype): Assume a zero cost for induction phis.	2021-03-26 16:08:37 +00:00
Richard Sandiford	99f94ae501	aarch64: Cost comparisons embedded in COND_EXPRs So far the costing of COND_EXPRs hasn't distinguished between cases in which the condition is calculated separately or is built into the COND_EXPR itself. This patch adds the cost of any embedded comparison. Like with the previous patches, this one only becomes active if a CPU selects use_new_vector_costs. It should therefore have a very low impact on other CPUs. gcc/ * config/aarch64/aarch64.c (aarch64_embedded_comparison_type): New function. (aarch64_adjust_stmt_cost): Add the costs of embedded scalar and vector comparisons.	2021-03-26 16:08:36 +00:00
Richard Sandiford	ed17ad5ea1	aarch64: Detect scalar extending loads If the scalar code does an integer load followed by an integer extension, we've tended to cost that as two separate operations, even though the extension is probably going to be free in practice. This patch treats the extension as having zero cost, like we already do for extending SVE loads. Like with previous patches, this one only becomes active if a CPU selects use_new_vector_costs. It should therefore have a very low impact on other CPUs. gcc/ * config/aarch64/aarch64.c (aarch64_detect_scalar_stmt_subtype): New function. (aarch64_add_stmt_cost): Call it.	2021-03-26 16:08:35 +00:00
Richard Sandiford	3b924b0d7c	aarch64: Try to detect when Advanced SIMD code would be completely unrolled GCC usually costs the SVE and Advanced SIMD versions of a loop and picks the one with the lowest cost. By default it will choose SVE over Advanced SIMD in the event of a tie. This is normally the correct behaviour, not least because SVE can handle every scalar iteration count whereas Advanced SIMD can only handle full vectors. However, there is one important exception that GCC failed to consider: we can completely unroll Advanced SIMD code at compile time, but we can't do the same for SVE. This patch therefore adds an opt-in heuristic to guess whether the Advanced SIMD version of a loop is likely to be unrolled. This will only be suitable for some CPUs, so it is not enabled by default and is controlled separately from use_new_vector_costs. Like with previous patches, this one only becomes active if a CPU selects both of the new tuning parameters. It should therefore have a very low impact on other CPUs. gcc/ * config/aarch64/aarch64-tuning-flags.def (matched_vector_throughput): New tuning parameter. * config/aarch64/aarch64.c (neoversev1_tunings): Use it. (aarch64_estimated_sve_vq): New function. (aarch64_vector_costs::analyzed_vinfo): New member variable. (aarch64_vector_costs::is_loop): Likewise. (aarch64_vector_costs::unrolled_advsimd_niters): Likewise. (aarch64_vector_costs::unrolled_advsimd_stmts): Likewise. (aarch64_record_potential_advsimd_unrolling): New function. (aarch64_analyze_loop_vinfo, aarch64_analyze_bb_vinfo): Likewise. (aarch64_add_stmt_cost): Call aarch64_analyze_loop_vinfo or aarch64_analyze_bb_vinfo on the first use of a costs structure. Detect whether we're vectorizing a loop for SVE that might be completely unrolled if it used Advanced SIMD instead. (aarch64_adjust_body_cost_for_latency): New function. (aarch64_finish_cost): Call it.	2021-03-26 16:08:34 +00:00
Richard Sandiford	50a525b50c	aarch64: Use an aarch64-specific structure for vector costing This patch makes the AArch64 vector code use its own vector costs structure, rather than just using the default unsigned[3]. Unfortunately, it's not easy to make this change specific to use_new_vector_costs, so this part is one that affects all CPUs. The change is relatively mechanical though. gcc/ * config/aarch64/aarch64.c (aarch64_vector_costs): New structure. (aarch64_init_cost): New function. (aarch64_add_stmt_cost): Use aarch64_vector_costs instead of the default unsigned[3]. (aarch64_finish_cost, aarch64_destroy_cost_data): New functions. (TARGET_VECTORIZE_INIT_COST): Override. (TARGET_VECTORIZE_FINISH_COST): Likewise. (TARGET_VECTORIZE_DESTROY_COST_DATA): Likewise.	2021-03-26 16:08:34 +00:00
Richard Sandiford	14bd21c2c5	aarch64: Add a CPU-specific cost table for Neoverse V1 This patch adds dedicated vector costs for Neoverse V1. Previously we just used the Cortex-A57 costs, which isn't ideal given that Cortex-A57 doesn't support SVE. gcc/ * config/aarch64/aarch64.c (neoversev1_advsimd_vector_cost) (neoversev1_sve_vector_cost): New cost structures. (neoversev1_vector_cost): Likewise. (neoversev1_tunings): Use them. Enable use_new_vector_costs.	2021-03-26 16:08:33 +00:00
Richard Sandiford	7c679969ba	aarch64: Add costs for one element of a scatter store Currently each element in a gather load is costed as a scalar_load and each element in a scatter store is costed as a scalar_store. The load side seems to work pretty well in practice, since many CPU-specific costs give loads quite a high cost relative to arithmetic operations. However, stores usually have a cost of just 1, which means that scatters tend to appear too cheap. This patch adds a separate cost for one element in a scatter store. Like with the previous patches, this one only becomes active if a CPU selects use_new_vector_costs. It should therefore have a very low impact on other CPUs. gcc/ * config/aarch64/aarch64-protos.h (sve_vec_cost::scatter_store_elt_cost): New member variable. * config/aarch64/aarch64.c (generic_sve_vector_cost): Update accordingly, taking the cost from the cost of a scalar_store. (a64fx_sve_vector_cost): Likewise. (aarch64_detect_vector_stmt_subtype): Detect scatter stores.	2021-03-26 16:08:32 +00:00


184206 Commits