187243 Commits

Author SHA1 Message Date
Tamar Christina
9fcb8ec603 [testsuite] Fix trapping access in test PR101750
I believe PR101750 to be a testism. Fix it by giving the class a name.

gcc/testsuite/ChangeLog:

	PR tree-optimization/101750
	* g++.dg/vect/pr99149.cc: Name class.
2021-08-04 14:36:26 +01:00
Richard Biener
31855ba6b1 Add emulated gather capability to the vectorizer
This adds a gather vectorization capability to the vectorizer
without target support by decomposing the offset vector, doing
scalar loads and then building a vector from the result.  This
is aimed mainly at cases where vectorizing the rest of the loop
offsets the cost of vectorizing the gather.

Note it's difficult to avoid vectorizing the offset load, but in
some cases later passes can turn the vector load + extract into
scalar loads, see the followup patch.

On SPEC CPU 2017 510.parest_r this improves runtime from 250s
to 219s on a Zen2 CPU which has its native gather instructions
disabled (using those, the runtime instead increases to 254s),
using -Ofast -march=znver2 [-flto].  It turns out the critical
loops in this benchmark all perform gather operations.
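
A typical source pattern that benefits is an indirect (gather) load in the
innermost loop, e.g. (a hedged illustration, not the parest kernel itself):

double gathered_sum (const double *data, const int *idx, int n)
{
  double sum = 0.0;
  for (int i = 0; i < n; ++i)
    sum += data[idx[i]];   // load from data at the offsets held in idx
  return sum;
}

Without native gather support the vectorizer now extracts the lanes of the
vectorized idx load, performs scalar loads from data and builds a vector
from the results.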

2021-07-30  Richard Biener  <rguenther@suse.de>

	* tree-vect-data-refs.c (vect_check_gather_scatter):
	Include widening conversions only when the result is
	still handled by native gather or the current offset
	size does not already match the data size.
	Also succeed analysis in case there's no native support,
	noted by an IFN_LAST ifn and a NULL decl.
	(vect_analyze_data_refs): Always consider gathers.
	* tree-vect-patterns.c (vect_recog_gather_scatter_pattern):
	Test for no IFN gather rather than decl gather.
	* tree-vect-stmts.c (vect_model_load_cost): Pass in the
	gather-scatter info and cost emulated gathers accordingly.
	(vect_truncate_gather_scatter_offset): Properly test for
	no IFN gather.
	(vect_use_strided_gather_scatters_p): Likewise.
	(get_load_store_type): Handle emulated gathers and their
	restrictions.
	(vectorizable_load): Likewise.  Emulate them by extracting
	scalar offsets, doing scalar loads and a vector construct.

	* gcc.target/i386/vect-gather-1.c: New testcase.
	* gfortran.dg/vect/vect-8.f90: Adjust.
2021-08-04 15:28:07 +02:00
H.J. Lu
f2e5d2717d by_pieces: Pass MAX_PIECES to op_by_pieces_d
Pass MAX_PIECES to op_by_pieces_d::op_by_pieces_d for move, store and
compare.

	PR target/101742
	* expr.c (op_by_pieces_d::op_by_pieces_d): Add a max_pieces
	argument to set m_max_size.
	(move_by_pieces_d): Pass MOVE_MAX_PIECES to op_by_pieces_d.
	(store_by_pieces_d): Pass STORE_MAX_PIECES to op_by_pieces_d.
	(compare_by_pieces_d): Pass COMPARE_MAX_PIECES to op_by_pieces_d.
2021-08-04 06:24:46 -07:00
Roger Sayle
96146e61cd Fold (X<<C1)^(X<<C2) to a multiplication when possible.
The easiest way to motivate these additions to match.pd is with the
following example:

unsigned int foo(unsigned char i) {
  return i | (i<<8) | (i<<16) | (i<<24);
}

which mainline with -O2 on x86_64 currently generates:
foo:	movzbl  %dil, %edi
	movl    %edi, %eax
	movl    %edi, %edx
	sall    $8, %eax
	sall    $16, %edx
	orl     %edx, %eax
	orl     %edi, %eax
	sall    $24, %edi
	orl     %edi, %eax
	ret

but with this patch now becomes:
foo:	movzbl  %dil, %eax
        imull   $16843009, %eax, %eax
        ret

Interestingly, this transformation is already applied when using
addition, allowing synth_mult to select an optimal sequence, but
not when using the equivalent bit-wise ior or xor operators.

The solution is to use tree_nonzero_bits to check that the
potentially non-zero bits of each operand don't overlap, which
ensures that BIT_IOR_EXPR and BIT_XOR_EXPR produce the same
results as PLUS_EXPR, which effectively generalizes the old
fold_plusminus_mult_expr.  Technically, the transformation
is to canonicalize (X*C1)|(X*C2) and (X*C1)^(X*C2) to
X*(C1+C2) where X and X<<C are considered special cases.

2021-08-04  Roger Sayle  <roger@nextmovesoftware.com>
	    Marc Glisse  <marc.glisse@inria.fr>

gcc/ChangeLog
	* match.pd (bit_ior, bit_xor): Canonicalize (X*C1)|(X*C2) and
	(X*C1)^(X*C2) as X*(C1+C2), and related variants, using
	tree_nonzero_bits to ensure that operands are bit-wise disjoint.

gcc/testsuite/ChangeLog
	* gcc.dg/fold-ior-4.c: New test.
2021-08-04 14:22:51 +01:00
Jonathan Wakely
0d04fe4923 libstdc++: Add [[nodiscard]] to sequence containers
... and container adaptors.

This adds the [[nodiscard]] attribute to functions with no side-effects
for the sequence containers and their iterators, the debug versions of
those containers, and the container adaptors.
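
For example, with these annotations a discarded call such as the following
now warns under -Wunused-result (assuming size() is among the annotated
members):

#include <vector>

void example (const std::vector<int> &v)
{
  v.size();          // result ignored: now diagnosed
  (void) v.size();   // casting to void suppresses the warning, as done in the adjusted tests
}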

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++-v3/ChangeLog:

	* include/bits/forward_list.h: Add [[nodiscard]] to functions
	with no side-effects.
	* include/bits/stl_bvector.h: Likewise.
	* include/bits/stl_deque.h: Likewise.
	* include/bits/stl_list.h: Likewise.
	* include/bits/stl_queue.h: Likewise.
	* include/bits/stl_stack.h: Likewise.
	* include/bits/stl_vector.h: Likewise.
	* include/debug/deque: Likewise.
	* include/debug/forward_list: Likewise.
	* include/debug/list: Likewise.
	* include/debug/safe_iterator.h: Likewise.
	* include/debug/vector: Likewise.
	* include/std/array: Likewise.
	* testsuite/23_containers/array/creation/3_neg.cc: Use
	-Wno-unused-result.
	* testsuite/23_containers/array/debug/back1_neg.cc: Cast result
	to void.
	* testsuite/23_containers/array/debug/back2_neg.cc: Likewise.
	* testsuite/23_containers/array/debug/front1_neg.cc: Likewise.
	* testsuite/23_containers/array/debug/front2_neg.cc: Likewise.
	* testsuite/23_containers/array/debug/square_brackets_operator1_neg.cc:
	Likewise.
	* testsuite/23_containers/array/debug/square_brackets_operator2_neg.cc:
	Likewise.
	* testsuite/23_containers/array/tuple_interface/get_neg.cc:
	Adjust dg-error line numbers.
	* testsuite/23_containers/deque/cons/clear_allocator.cc: Cast
	result to void.
	* testsuite/23_containers/deque/debug/invalidation/4.cc:
	Likewise.
	* testsuite/23_containers/deque/types/1.cc: Use
	-Wno-unused-result.
	* testsuite/23_containers/list/types/1.cc: Cast result to void.
	* testsuite/23_containers/priority_queue/members/7161.cc:
	Likewise.
	* testsuite/23_containers/queue/members/7157.cc: Likewise.
	* testsuite/23_containers/vector/59829.cc: Likewise.
	* testsuite/23_containers/vector/ext_pointer/types/1.cc:
	Likewise.
	* testsuite/23_containers/vector/ext_pointer/types/2.cc:
	Likewise.
	* testsuite/23_containers/vector/types/1.cc: Use
	-Wno-unused-result.
2021-08-04 12:54:29 +01:00
Jonathan Wakely
240b01b021 libstdc++: Add [[nodiscard]] to iterators and related utilities
This adds [[nodiscard]] throughout <iterator>, as proposed by P2377R0
(with some minor corrections).

The attribute is added for all modes from C++11 up, using
[[__nodiscard__]] or _GLIBCXX_NODISCARD where C++17 [[nodiscard]] can't
be used directly.
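
For example (a hedged illustration; distance and next are among the
functions listed in the ChangeLog below):

#include <iterator>
#include <vector>

void example (const std::vector<int> &v)
{
  std::next (v.begin ());                        // result ignored: now diagnosed
  (void) std::distance (v.begin (), v.end ());   // cast to void keeps existing code quiet
}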

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++-v3/ChangeLog:

	* include/bits/iterator_concepts.h (iter_move): Add
	[[nodiscard]].
	* include/bits/range_access.h (begin, end, cbegin, cend)
	(rbegin, rend, crbegin, crend, size, data, ssize): Likewise.
	* include/bits/ranges_base.h (ranges::begin, ranges::end)
	(ranges::cbegin, ranges::cend, ranges::rbegin, ranges::rend)
	(ranges::crbegin, ranges::crend, ranges::size, ranges::ssize)
	(ranges::empty, ranges::data, ranges::cdata): Likewise.
	* include/bits/stl_iterator.h (reverse_iterator, __normal_iterator)
	(back_insert_iterator, front_insert_iterator, insert_iterator)
	(move_iterator, move_sentinel, common_iterator)
	(counted_iterator): Likewise.
	* include/bits/stl_iterator_base_funcs.h (distance, next, prev):
	Likewise.
	* include/bits/stream_iterator.h (istream_iterator)
	(ostream_iterator): Likewise.
	* include/bits/streambuf_iterator.h (istreambuf_iterator)
	(ostreambuf_iterator): Likewise.
	* include/std/ranges (views::single, views::iota, views::all)
	(views::filter, views::transform, views::take, views::take_while)
	(views::drop, views::drop_while, views::join, views::lazy_split)
	(views::split, views::counted, views::common, views::reverse)
	(views::elements): Likewise.
	* testsuite/20_util/rel_ops.cc: Use -Wno-unused-result.
	* testsuite/24_iterators/move_iterator/greedy_ops.cc: Likewise.
	* testsuite/24_iterators/normal_iterator/greedy_ops.cc:
	Likewise.
	* testsuite/24_iterators/reverse_iterator/2.cc: Likewise.
	* testsuite/24_iterators/reverse_iterator/greedy_ops.cc:
	Likewise.
	* testsuite/21_strings/basic_string/range_access/char/1.cc:
	Cast result to void.
	* testsuite/21_strings/basic_string/range_access/wchar_t/1.cc:
	Likewise.
	* testsuite/21_strings/basic_string_view/range_access/char/1.cc:
	Likewise.
	* testsuite/21_strings/basic_string_view/range_access/wchar_t/1.cc:
	Likewise.
	* testsuite/23_containers/array/range_access.cc: Likewise.
	* testsuite/23_containers/deque/range_access.cc: Likewise.
	* testsuite/23_containers/forward_list/range_access.cc:
	Likewise.
	* testsuite/23_containers/list/range_access.cc: Likewise.
	* testsuite/23_containers/map/range_access.cc: Likewise.
	* testsuite/23_containers/multimap/range_access.cc: Likewise.
	* testsuite/23_containers/multiset/range_access.cc: Likewise.
	* testsuite/23_containers/set/range_access.cc: Likewise.
	* testsuite/23_containers/unordered_map/range_access.cc:
	Likewise.
	* testsuite/23_containers/unordered_multimap/range_access.cc:
	Likewise.
	* testsuite/23_containers/unordered_multiset/range_access.cc:
	Likewise.
	* testsuite/23_containers/unordered_set/range_access.cc:
	Likewise.
	* testsuite/23_containers/vector/range_access.cc: Likewise.
	* testsuite/24_iterators/customization_points/iter_move.cc:
	Likewise.
	* testsuite/24_iterators/istream_iterator/sentinel.cc:
	Likewise.
	* testsuite/24_iterators/istreambuf_iterator/sentinel.cc:
	Likewise.
	* testsuite/24_iterators/move_iterator/dr2061.cc: Likewise.
	* testsuite/24_iterators/operations/prev_neg.cc: Likewise.
	* testsuite/24_iterators/ostreambuf_iterator/2.cc: Likewise.
	* testsuite/24_iterators/range_access/range_access.cc:
	Likewise.
	* testsuite/24_iterators/range_operations/100768.cc: Likewise.
	* testsuite/26_numerics/valarray/range_access2.cc: Likewise.
	* testsuite/28_regex/range_access.cc: Likewise.
	* testsuite/experimental/string_view/range_access/char/1.cc:
	Likewise.
	* testsuite/experimental/string_view/range_access/wchar_t/1.cc:
	Likewise.
	* testsuite/ext/vstring/range_access.cc: Likewise.
	* testsuite/std/ranges/adaptors/take.cc: Likewise.
	* testsuite/std/ranges/p2259.cc: Likewise.
2021-08-04 12:54:28 +01:00
Richard Biener
2724d1bba6 Rewrite more vector loads to scalar loads
This teaches forwprop to rewrite more vector loads that are only
used in BIT_FIELD_REFs as scalar loads.  This provides the
remaining uplift to SPEC CPU 2017 510.parest_r on Zen 2 which
has CPU gathers disabled.

In particular vector load + vec_unpack + bit-field-ref is turned
into (extending) scalar loads, which avoids costly XMM/GPR
transitions.  To not conflict with the matching of vector load +
bit-field-ref + vector constructor to vector load + shuffle, the
extended transform is only done after vector lowering.
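
A hedged sketch of the kind of code affected, using GCC's vector extensions
(not the actual parest loop):

typedef double v2df __attribute__ ((vector_size (16)));

double sum_lanes (const v2df *p)
{
  v2df v = *p;          // vector load ...
  return v[0] + v[1];   // ... used only through lane extracts (BIT_FIELD_REFs)
}

Such a vector load whose only uses are lane extracts can now be rewritten
into individual scalar loads.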

2021-07-30  Richard Biener  <rguenther@suse.de>

	* tree-ssa-forwprop.c (pass_forwprop::execute): Split
	out code to decompose vector loads ...
	(optimize_vector_load): ... here.  Generalize it to
	handle intermediate widening and TARGET_MEM_REF loads
	and apply it to loads with a supported vector mode as well.
2021-08-04 12:38:03 +02:00
Richard Biener
87a0b607e4 tree-optimization/101756 - avoid vectorizing boolean MAX reductions
The following avoids vectorizing MIN/MAX reductions on bools which,
when ending up as vector(2) <signed-boolean:64> would need to be
adjusted because of the sign change.  The fix instead avoids any
reduction vectorization where the result isn't compatible
with the original scalar type, since we don't compensate for that
either.
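
A hedged sketch of the kind of reduction involved (not the committed
testcase):

bool max_of4 (const bool *p)
{
  bool r = p[0];
  r = r > p[1] ? r : p[1];
  r = r > p[2] ? r : p[2];
  r = r > p[3] ? r : p[3];   // a MAX reduction over boolean values
  return r;
}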

2021-08-04  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/101756
	* tree-vect-slp.c (vectorizable_bb_reduc_epilogue): Make sure
	the result of the reduction epilogue is compatible with the original
	scalar result.

	* gcc.dg/vect/bb-slp-pr101756.c: New testcase.
2021-08-04 12:33:23 +02:00
Jakub Jelinek
af31cab047 c++: Fix up #pragma omp declare {simd,variant} and acc routine parsing
When parsing default arguments, we need to temporarily clear parser->omp_declare_simd
and parser->oacc_routine; otherwise they can clash with further declarations
inside of e.g. lambdas inside of those default arguments.
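
A hypothetical reproducer along these lines (compiled with -fopenmp; not the
committed test):

#pragma omp declare simd
int f (int x, int y = [] { return 1; } ());

int use () { return f (2); }

Previously the pragma state was still active while the lambda in the default
argument was being parsed, and clashed with the declarations inside it.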

2021-08-04  Jakub Jelinek  <jakub@redhat.com>

	PR c++/101759
	* parser.c (cp_parser_default_argument): Temporarily override
	parser->omp_declare_simd and parser->oacc_routine to NULL.

	* g++.dg/gomp/pr101759.C: New test.
	* g++.dg/goacc/pr101759.C: New test.
2021-08-04 11:53:48 +02:00
Jakub Jelinek
8aa14fa7d9 testsuite: Fix duplicated content of gcc.c-torture/execute/ieee/pr29302-1.x
The file has two identical halves; it looks like the patch was applied twice.

2021-08-04  Jakub Jelinek  <jakub@redhat.com>

	* gcc.c-torture/execute/ieee/pr29302-1.x: Undo doubly applied patch.
2021-08-04 11:44:45 +02:00
liuhongt
9f26640f7b Refine predicate of peephole2 to general_reg_operand. [PR target/101743]
The define_peephole2 added by r12-2640-gf7bf03cf69ccb7dc should only
work on general registers.  Since x86 also supports mov instructions
between general, SSE and mask registers, limit the peephole2 predicate
to general_reg_operand.

gcc/ChangeLog:

	PR target/101743
	* config/i386/i386.md (peephole2): Refine predicate from
	register_operand to general_reg_operand.
2021-08-04 17:43:17 +08:00
Jakub Jelinek
7195fa03e7 libgcc: Fix duplicated content of config/t-slibgcc-fuchsia
The file has two identical halves; it looks like the patch was applied twice.

2021-08-04  Jakub Jelinek  <jakub@redhat.com>

	* config/t-slibgcc-fuchsia: Undo doubly applied patch.
2021-08-04 11:40:52 +02:00
Aldy Hernandez
9db0bcd9fd Mark path_range_query::dump as override.
gcc/ChangeLog:

	* gimple-range-path.h (path_range_query::dump): Mark override.
2021-08-04 10:57:11 +02:00
Richard Biener
4d56259101 tree-optimization/101769 - tail recursion creates possibly infinite loop
This makes tail recursion optimization produce a loop structure
manually rather than relying on loop fixup.  That also allows the
loop to be marked as finite (it would eventually blow the stack
if it were not).
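
A hedged illustration of the transformation (not the committed testcase):

unsigned gcd (unsigned a, unsigned b)
{
  if (b == 0)
    return a;
  return gcd (b, a % b);   // self tail call: becomes the back edge of a new loop
}

The self-recursive tail call is rewritten into a jump back to the start of
the function, and the loop created for it can be marked finite because the
original recursion would otherwise exhaust the stack.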

2021-08-04  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/101769
	* tree-tailcall.c (eliminate_tail_call): Add the created loop
	for the first recursion and return it via the new output parameter.
	(optimize_tail_call): Pass through new output param.
	(tree_optimize_tail_calls_1): After creating all latches,
	add the created loop to the loop tree.  Do not mark loops for fixup.

	* g++.dg/tree-ssa/pr101769.C: New testcase.
2021-08-04 10:35:27 +02:00
Martin Liska
5c73b94fdc docs: document threader-mode param
gcc/ChangeLog:

	* doc/invoke.texi: Document threader-mode param.
2021-08-04 09:48:05 +02:00
liuhongt
3ae1468e26 Add dg-require-effective-target for testcases.
gcc/testsuite/ChangeLog:

	* gcc.target/i386/cond_op_addsubmul_d-2.c: Add
	dg-require-effective-target for avx512.
	* gcc.target/i386/cond_op_addsubmul_q-2.c: Ditto.
	* gcc.target/i386/cond_op_addsubmul_w-2.c: Ditto.
	* gcc.target/i386/cond_op_addsubmuldiv_double-2.c: Ditto.
	* gcc.target/i386/cond_op_addsubmuldiv_float-2.c: Ditto.
	* gcc.target/i386/cond_op_fma_double-2.c: Ditto.
	* gcc.target/i386/cond_op_fma_float-2.c: Ditto.
2021-08-04 13:25:46 +08:00
liuhongt
2fc2e3917f Support cond_{fma,fms,fnma,fnms} for vector float/double under AVX512.
gcc/ChangeLog:

	* config/i386/sse.md (cond_fma<mode>): New expander.
	(cond_fms<mode>): Ditto.
	(cond_fnma<mode>): Ditto.
	(cond_fnms<mode>): Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/cond_op_fma_double-1.c: New test.
	* gcc.target/i386/cond_op_fma_double-2.c: New test.
	* gcc.target/i386/cond_op_fma_float-1.c: New test.
	* gcc.target/i386/cond_op_fma_float-2.c: New test.
2021-08-04 12:58:01 +08:00
Cherry Mui
22e40cc7fe compiler: support new language constructs in escape analysis
Previous CLs add new language constructs in Go 1.17, specifically,
unsafe.Add, unsafe.Slice, and conversion from a slice to a pointer
to an array. This CL handles them in the escape analysis.

At the point of the escape analysis, unsafe.Add and unsafe.Slice
are still builtin calls, so just handle them in data flow.
Conversion from a slice to a pointer to an array has already been
lowered to a combination of compound expression, conditional
expression and slice info expressions, so handle them in the
escape analysis.

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/339671
2021-08-03 18:32:07 -07:00
GCC Administrator
fa1407c761 Daily bump. 2021-08-04 00:16:51 +00:00
Ian Lance Taylor
e435e72ad7 compile, runtime: make selectnbrecv return two values
The only difference between selectnbrecv and selectnbrecv2 is that the
latter sets the input pointer value from the second return value of chanrecv.

So by making selectnbrecv return two values from chanrecv, we can get
rid of selectnbrecv2; the compiler can now call only selectnbrecv and
generate simpler code.

This is the gofrontend version of https://golang.org/cl/292890.

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/339529
2021-08-03 16:40:00 -07:00
Ian Lance Taylor
cbbd439a33 compiler: check slice to pointer-to-array conversion element type
When checking a slice to pointer-to-array conversion, I forgot to
verify that the element types are identical.

For golang/go#395

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/339329
2021-08-03 16:36:20 -07:00
Segher Boessenkool
3a7794b469 rs6000: Replace & by &&
2021-08-03  Segher Boessenkool  <segher@kernel.crashing.org>

	* config/rs6000/vsx.md (*vsx_le_perm_store_<mode>): Use && instead of &.
2021-08-03 22:33:52 +00:00
Segher Boessenkool
ebff536cf4 rs6000: "e" is not a free constraint letter
It is the prefix of the "es" and "eI" constraints.

2021-08-03  Segher Boessenkool  <segher@kernel.crashing.org>

	* config/rs6000/constraints.md: Remove "e" from the list of available
	constraint characters.
2021-08-03 22:26:40 +00:00
Eugene Rozenfeld
285aa6895d Fix indirect call inlining with AutoFDO
The histogram value for indirect calls was incorrectly set up.
That is fixed now.

With this change the tree-prof tests checking indirect call inlining with AutoFDO
in gcc.dg and g++.dg are passing.

Resolves:
PR gcov-profile/71672 - inlining indirect calls does not work with autofdo

gcc/ChangeLog:
	PR gcov-profile/71672
	* auto-profile.c (afdo_indirect_call): Fix setup of the histogram value for indirect calls.
2021-08-03 14:36:33 -07:00
Eugene Rozenfeld
9265b37853 Fixes for AutoFDO testing
* create_gcov tool doesn't currently support dwarf 5 so I made a change in profopt.exp
  to pass -gdwarf-4 when compiling the binary to profile.

* I updated the invocation of create_gcov in profopt.exp to pass -gcov_version=2.
  I recently made a change to create_gcov to support version 2:
  https://github.com/google/autofdo/pull/117 .

* I removed useless -o perf.data from the invocation of gcc-auto-profile in
  target-supports.exp.

These changes contribute to fixing PR gcov-profile/71672.

gcc/testsuite/ChangeLog:

	* lib/profopt.exp: Pass -gdwarf-4 when compiling the test to profile; pass -gcov_version=2.
	* lib/target-supports.exp: Remove unnecessary -o perf.data passed to gcc-auto-profile.
2021-08-03 14:28:42 -07:00
Eugene Rozenfeld
0ed093c7c3 Fix indir-call-prof-2.c with AutoFDO
indir-call-prof-2.c has -fno-early-inlining but AutoFDO can't work without
early inlining (it needs to match the inlining of the profiled binary).
I changed profopt.exp to always pass -fearly-inlining for AutoFDO.
With that change the indirect call inlining in indir-call-prof-2.c happens in the early inliner
so I changed the dg-final-use-autofdo.

Contributes to fixing PR gcov-profile/71672

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-prof/indir-call-prof-2.c: Fix dg-final-use-autofdo.
	* lib/profopt.exp: Pass -fearly-inlining when compiling with AutoFDO.
2021-08-03 14:26:27 -07:00
Eugene Rozenfeld
f9ad3d5339 Fixes for AutoFDO tests
* Changed several tests to use -fdump-ipa-afdo-optimized instead of -fdump-ipa-afdo
in dg-options so that the expected output can be found

* Increased the number of iterations in several tests so that perf can have
enough sampling events

Contributes to fixing PR gcov-profile/71672.

gcc/testsuite/ChangeLog:

	* g++.dg/tree-prof/indir-call-prof.C: Fix options, increase the number of iterations.
	* g++.dg/tree-prof/morefunc.C: Fix options, increase the number of iterations.
	* g++.dg/tree-prof/reorder.C: Fix options, increase the number of iterations.
	* gcc.dg/tree-prof/indir-call-prof-2.c: Fix options, increase the number of iterations.
	* gcc.dg/tree-prof/indir-call-prof.c: Fix options.
2021-08-03 14:25:47 -07:00
Martin Sebor
aabf07cd5d Disable a test case in ILP32 [PR101688].
Resolves:
PR testsuite/101688 - g++.dg/warn/Wstringop-overflow-4.C fails on 32-bit archs with new jump threader

gcc/testsuite:
	PR testsuite/101688
	* g++.dg/warn/Wstringop-overflow-4.C: Disable a test case in ILP32.
2021-08-03 13:56:56 -06:00
Paul A. Clarke
0f44b09732 rs6000: Add test for _mm_minpos_epu16
Copy the test for _mm_minpos_epu16 from
gcc/testsuite/gcc.target/i386/sse4_1-phminposuw.c, with
a few adjustments:

- Adjust the dejagnu directives for powerpc platform.
- Make the data not be monotonically increasing,
  such that some of the returned values are not
  always the first value (index 0).
- Create a list of input data testing various scenarios
  including more than one minimum value and different
  orders and indices of the minimum value.
- Fix a masking issue where the index was being truncated
  to 2 bits instead of 3 bits, which wasn't found because
  all of the returned indices were 0 with the original
  generated data.
- Support big-endian.

2021-08-03  Paul A. Clarke  <pc@us.ibm.com>

gcc/testsuite
	* gcc.target/powerpc/sse4_1-phminposuw.c: Copy from
	gcc/testsuite/gcc.target/i386, adjust dg directives to suit,
	make more robust.
2021-08-03 13:58:41 -05:00
Paul A. Clarke
eaa93a0f3d rs6000: Add support for _mm_minpos_epu16
Add a naive implementation of the subject x86 intrinsic to
ease porting.
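
For reference, the intrinsic's semantics boil down to the following scalar
loop (a sketch of the semantics, not the rs6000 implementation itself):

#include <stdint.h>

// Return the minimum of the eight unsigned 16-bit lanes in bits 15:0 and
// its (3-bit) index in bits 18:16; all remaining result bits are zero.
static uint32_t minpos_epu16_ref (const uint16_t in[8])
{
  uint16_t min = in[0], index = 0;
  for (int i = 1; i < 8; ++i)
    if (in[i] < min)
      {
	min = in[i];
	index = i;
      }
  return (uint32_t) index << 16 | min;
}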

2021-08-03  Paul A. Clarke  <pc@us.ibm.com>

gcc
	* config/rs6000/smmintrin.h (_mm_minpos_epu16): New.
2021-08-03 13:58:31 -05:00
Jonathan Wakely
a77a46d9ae libstdc++: Suppress redundant definitions of inline variables
In C++17 the out-of-class definitions for static constexpr variables are
redundant, because they are implicitly inline. This change avoids
"redundant redeclaration" warnings from -Wsystem-headers -Wdeprecated.

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++-v3/ChangeLog:

	* include/bits/random.tcc (linear_congruential_engine): Do not
	define static constexpr members when they are implicitly inline.
	* include/std/ratio (ratio, __ratio_multiply, __ratio_divide)
	(__ratio_add, __ratio_subtract): Likewise.
	* include/std/type_traits (integral_constant): Likewise.
	* testsuite/26_numerics/random/pr60037-neg.cc: Adjust dg-error
	line number.
2021-08-03 15:41:11 +01:00
Jonathan Wakely
5c6759e416 libstdc++: Replace TR1 components with C++11 ones in test utils
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++-v3/ChangeLog:

	* testsuite/util/testsuite_common_types.h: Replace uses of
	tr1::unordered_map and tr1::unordered_set with their C++11
	equivalents.
	* testsuite/29_atomics/atomic/cons/assign_neg.cc: Adjust
	dg-error line number.
	* testsuite/29_atomics/atomic/cons/copy_neg.cc: Likewise.
	* testsuite/29_atomics/atomic_integral/cons/assign_neg.cc:
	Likewise.
	* testsuite/29_atomics/atomic_integral/cons/copy_neg.cc:
	Likewise.
	* testsuite/29_atomics/atomic_integral/operators/bitwise_neg.cc:
	Likewise.
	* testsuite/29_atomics/atomic_integral/operators/decrement_neg.cc:
	Likewise.
	* testsuite/29_atomics/atomic_integral/operators/increment_neg.cc:
	Likewise.
2021-08-03 15:40:42 +01:00
Jonathan Wakely
13a1ac9f6f libstdc++: Specialize allocator_traits<pmr::polymorphic_allocator<T>>
This adds a partial specialization of allocator_traits, similar to what
was already done for std::allocator. This means that most uses of
polymorphic_allocator via the traits can avoid the metaprogramming
overhead needed to deduce the properties from polymorphic_allocator.

In addition, I'm changing polymorphic_allocator::delete_object to invoke
the destructor (or pseudo-destructor) directly, rather than calling
allocator_traits::destroy, which calls polymorphic_allocator::destroy
(which is deprecated). This is observable if a user has specialized
allocator_traits<polymorphic_allocator<Foo>> and expects to see its
destroy member function called. I consider explicit specializations of
allocator_traits to be wrong-headed, and this use case seems unnecessary
to support. So delete_object just invokes the destructor directly.
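
Usage is unchanged; only what delete_object does internally differs.  A
hedged example (requires C++20):

#include <memory_resource>
#include <string>

int main ()
{
  std::pmr::polymorphic_allocator<> a;                 // default memory resource
  std::string *s = a.new_object<std::string> ("hi");   // allocate and construct
  a.delete_object (s);   // now runs ~basic_string() directly, then deallocates
}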

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++-v3/ChangeLog:

	* include/std/memory_resource (polymorphic_allocator::delete_object):
	Call destructor directly instead of using destroy.
	(allocator_traits<polymorphic_allocator<T>>): Define partial
	specialization.
2021-08-03 15:30:36 +01:00
Jonathan Wakely
9bd87e3887 libstdc++: Remove trailing whitespace in some tests
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++-v3/ChangeLog:

	* testsuite/20_util/function_objects/binders/3113.cc: Remove
	trailing whitespace.
	* testsuite/20_util/shared_ptr/assign/auto_ptr.cc: Likewise.
	* testsuite/20_util/shared_ptr/assign/auto_ptr_neg.cc: Likewise.
	* testsuite/20_util/shared_ptr/assign/auto_ptr_rvalue.cc:
	Likewise.
	* testsuite/20_util/shared_ptr/creation/dr925.cc: Likewise.
	* testsuite/25_algorithms/headers/algorithm/synopsis.cc:
	Likewise.
	* testsuite/25_algorithms/random_shuffle/requirements/explicit_instantiation/2.cc:
	Likewise.
	* testsuite/25_algorithms/random_shuffle/requirements/explicit_instantiation/pod.cc:
	Likewise.
2021-08-03 15:30:36 +01:00
Jonathan Wakely
7f2f4b8791 libstdc++: Deprecate std::random_shuffle for C++14
The std::random_shuffle algorithm was deprecated in C++14 and removed in
C++17, but libstdc++ did not previously mark it as deprecated.  This adds
the deprecated attribute for C++14 and later, so
that users are warned they should not be using it in those dialects.

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++-v3/ChangeLog:

	* doc/xml/manual/evolution.xml: Document deprecation.
	* doc/html/*: Regenerate.
	* include/bits/c++config (_GLIBCXX14_DEPRECATED): Define.
	(_GLIBCXX14_DEPRECATED_SUGGEST): Define.
	* include/bits/stl_algo.h (random_shuffle): Deprecate for C++14
	and later.
	* testsuite/25_algorithms/headers/algorithm/synopsis.cc: Adjust
	for C++11 and C++14 changes to std::random_shuffle and
	std::shuffle.
	* testsuite/25_algorithms/random_shuffle/1.cc: Add options to
	use deprecated algorithms.
	* testsuite/25_algorithms/random_shuffle/59603.cc: Likewise.
	* testsuite/25_algorithms/random_shuffle/moveable.cc: Likewise.
	* testsuite/25_algorithms/random_shuffle/requirements/explicit_instantiation/2.cc:
	Likewise.
	* testsuite/25_algorithms/random_shuffle/requirements/explicit_instantiation/pod.cc:
	Likewise.
2021-08-03 15:30:35 +01:00
Jonathan Wakely
07b70dfc4e libstdc++: Add testsuite proc for testing deprecated features
This change adds options to tests that explicitly use deprecated
features, so that -D_GLIBCXX_USE_DEPRECATED=0 can be used to run the
rest of the testsuite. The tests that explicitly/intentionally use
deprecated features will still be able to use them, but they can be
disabled for the majority of tests.

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++-v3/ChangeLog:

	* testsuite/23_containers/forward_list/operations/3.cc:
	Use lambda instead of std::bind2nd.
	* testsuite/20_util/function_objects/binders/3113.cc: Add
	options for testing deprecated features.
	* testsuite/20_util/pair/cons/99957.cc: Likewise.
	* testsuite/20_util/shared_ptr/assign/auto_ptr.cc: Likewise.
	* testsuite/20_util/shared_ptr/assign/auto_ptr_neg.cc: Likewise.
	* testsuite/20_util/shared_ptr/assign/auto_ptr_rvalue.cc:
	Likewise.
	* testsuite/20_util/shared_ptr/cons/43820_neg.cc: Likewise.
	* testsuite/20_util/shared_ptr/cons/auto_ptr.cc: Likewise.
	* testsuite/20_util/shared_ptr/cons/auto_ptr_neg.cc: Likewise.
	* testsuite/20_util/shared_ptr/creation/dr925.cc: Likewise.
	* testsuite/20_util/unique_ptr/cons/auto_ptr.cc: Likewise.
	* testsuite/20_util/unique_ptr/cons/auto_ptr_neg.cc: Likewise.
	* testsuite/ext/pb_ds/example/priority_queue_erase_if.cc:
	Likewise.
	* testsuite/ext/pb_ds/example/priority_queue_split_join.cc:
	Likewise.
	* testsuite/lib/dg-options.exp (dg_add_options_using-deprecated):
	New proc.
2021-08-03 15:30:17 +01:00
Jonathan Wakely
e9f64fff64 libstdc++: Reduce header dependencies in <regex>
This reduces the size of <regex> a little. This is one of the largest
and slowest headers in the library.

By using <bits/stl_algobase.h> and <bits/stl_algo.h> instead of
<algorithm> we don't need to parse all the parallel algorithms and
std::ranges:: algorithms that are not needed by <regex>. Similarly, by
using <bits/stl_tree.h> and <bits/stl_map.h> instead of <map> we don't
need to parse the definition of std::multimap.

The _State_info type is not movable or copyable, so it doesn't need to use
std::unique_ptr<bool[]> to manage a bitset; we can just delete it in the
destructor.  It would use a lot less space if we used a bitset instead,
but that would be an ABI break. We could do it for the versioned
namespace, but this patch doesn't do so. For future reference, using
vector<bool> would work, but would increase sizeof(_State_info) by two
pointers, because it's three times as large as unique_ptr<bool[]>. We
can't use std::bitset because the length isn't constant. We want a
bitset with a non-constant but fixed length.

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++-v3/ChangeLog:

	* include/bits/regex_executor.h (_State_info): Replace
	unique_ptr<bool[]> with array of bool.
	* include/bits/regex_executor.tcc: Likewise.
	* include/bits/regex_scanner.tcc: Replace std::strchr with
	__builtin_strchr.
	* include/std/regex: Replace standard headers with smaller
	internal ones.
	* testsuite/28_regex/traits/char/lookup_classname.cc: Include
	<string.h> for strlen.
	* testsuite/28_regex/traits/char/lookup_collatename.cc:
	Likewise.
2021-08-03 15:24:52 +01:00
H.J. Lu
98d7f305d5 x86: Use XMM31 for scratch SSE register
In 64-bit mode, use XMM31 for scratch SSE register to avoid vzeroupper
if possible.

gcc/

	* config/i386/i386.c (ix86_gen_scratch_sse_rtx): In 64-bit mode,
	try XMM31 to avoid vzeroupper.

gcc/testsuite/

	* gcc.target/i386/avx-vzeroupper-14.c: Pass -mno-avx512f to
	disable XMM31.
	* gcc.target/i386/avx-vzeroupper-15.c: Likewise.
	* gcc.target/i386/pr82941-1.c: Updated.  Check for vzeroupper.
	* gcc.target/i386/pr82942-1.c: Likewise.
	* gcc.target/i386/pr82990-1.c: Likewise.
	* gcc.target/i386/pr82990-3.c: Likewise.
	* gcc.target/i386/pr82990-5.c: Likewise.
	* gcc.target/i386/pr100865-4b.c: Likewise.
	* gcc.target/i386/pr100865-6b.c: Likewise.
	* gcc.target/i386/pr100865-7b.c: Likewise.
	* gcc.target/i386/pr100865-10b.c: Likewise.
	* gcc.target/i386/pr100865-8b.c: Updated.
	* gcc.target/i386/pr100865-9b.c: Likewise.
	* gcc.target/i386/pr100865-11b.c: Likewise.
	* gcc.target/i386/pr100865-12b.c: Likewise.
2021-08-03 07:11:58 -07:00
Jonathan Wakely
a1a2654cdc libstdc++: Avoid using std::unique_ptr in <locale>
std::wstring_convert and std::wbuffer_convert types are not copyable or
movable, and store a plain pointer without a deleter. That means a much
simpler type that just uses delete in its destructor can be used instead
of std::unique_ptr.

That avoids including and parsing all of <bits/unique_ptr.h> in every
header that includes <locale>. It also avoids instantiating
unique_ptr<C> and std::tuple<C*, default_delete<C>> when the conversion
utilities are used.
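
A minimal sketch of such a helper (assumed shape and hypothetical name, not
the exact libstdc++ _Scoped_ptr definition):

template<typename T>
struct scoped_ptr
{
  explicit scoped_ptr (T *p) : ptr (p) { }
  ~scoped_ptr () { delete ptr; }
  scoped_ptr (const scoped_ptr &) = delete;
  scoped_ptr &operator= (const scoped_ptr &) = delete;
  T *operator-> () const { return ptr; }
  T *ptr;
};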

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++-v3/ChangeLog:

	* include/bits/locale_conv.h (__detail::_Scoped_ptr): Define new
	RAII class template.
	(wstring_convert, wbuffer_convert): Use __detail::_Scoped_ptr
	instead of unique_ptr.
2021-08-03 15:06:56 +01:00
Richard Sandiford
048039c49b aarch64: Add -mtune=neoverse-512tvb
This patch adds an option to tune for Neoverse cores that have
a total vector bandwidth of 512 bits (4x128 for Advanced SIMD
and a vector-length-dependent equivalent for SVE).  This is intended
to be a compromise between tuning aggressively for a single core like
Neoverse V1 (which can be too narrow) and tuning for AArch64 cores
in general (which can be too wide).

-mcpu=neoverse-512tvb is equivalent to -mcpu=neoverse-v1
-mtune=neoverse-512tvb.

gcc/
	* doc/invoke.texi: Document -mtune=neoverse-512tvb and
	-mcpu=neoverse-512tvb.
	* config/aarch64/aarch64-cores.def (neoverse-512tvb): New entry.
	* config/aarch64/aarch64-tune.md: Regenerate.
	* config/aarch64/aarch64.c (neoverse512tvb_sve_vector_cost)
	(neoverse512tvb_sve_issue_info, neoverse512tvb_vec_issue_info)
	(neoverse512tvb_vector_cost, neoverse512tvb_tunings): New structures.
	(aarch64_adjust_body_cost_sve): Handle -mtune=neoverse-512tvb.
	(aarch64_adjust_body_cost): Likewise.
2021-08-03 13:00:49 +01:00
Richard Sandiford
9690309baf aarch64: Restrict issue heuristics to inner vector loop
The AArch64 vector costs try to take issue rates into account.
However, when vectorising an outer loop, we lumped the inner
and outer operations together, which is somewhat meaningless.
This patch restricts the heuristic to the inner loop.

gcc/
	* config/aarch64/aarch64.c (aarch64_add_stmt_cost): Only
	record issue information for operations that occur in the
	innermost loop.
2021-08-03 13:00:48 +01:00
Richard Sandiford
028059b46e aarch64: Tweak MLA vector costs
The issue-based vector costs currently assume that a multiply-add
sequence can be implemented using a single instruction.  This is
generally true for scalars (which have a 4-operand instruction)
and SVE (which allows the output to be tied to any input).
However, for Advanced SIMD, multiplying two values and adding
an invariant will end up being a move and an MLA.

The only target to use the issue-based vector costs is Neoverse V1,
which would generally prefer SVE in this case anyway.  I therefore
don't have a self-contained testcase.  However, the distinction
becomes more important with a later patch.

gcc/
	* config/aarch64/aarch64.c (aarch64_multiply_add_p): Add a vec_flags
	parameter.  Detect cases in which an Advanced SIMD MLA would almost
	certainly require a MOV.
	(aarch64_count_ops): Update accordingly.
2021-08-03 13:00:47 +01:00
Richard Sandiford
537afb0857 aarch64: Tweak the cost of elementwise stores
When the vectoriser scalarises a strided store, it counts one
scalar_store for each element plus one vec_to_scalar extraction
for each element.  However, extracting element 0 is free on AArch64,
so it should have zero cost.

I don't have a testcase that requires this for existing -mtune
options, but it becomes more important with a later patch.

gcc/
	* config/aarch64/aarch64.c (aarch64_is_store_elt_extraction): New
	function, split out from...
	(aarch64_detect_vector_stmt_subtype): ...here.
	(aarch64_add_stmt_cost): Treat extracting element 0 as free.
2021-08-03 13:00:46 +01:00
Richard Sandiford
78770e0e5d aarch64: Add gather_load_xNN_cost tuning fields
This patch adds tuning fields for the total cost of a gather load
instruction.  Until now, we've costed them as one scalar load
per element instead.  Those scalar_load-based values are also
what the patch uses to fill in the new fields for existing
cost structures.

gcc/
	* config/aarch64/aarch64-protos.h (sve_vec_cost):
	Add gather_load_x32_cost and gather_load_x64_cost.
	* config/aarch64/aarch64.c (generic_sve_vector_cost)
	(a64fx_sve_vector_cost, neoversev1_sve_vector_cost): Update
	accordingly, using the values given by the scalar_load * number
	of elements calculation that we used previously.
	(aarch64_detect_vector_stmt_subtype): Use the new fields.
2021-08-03 13:00:45 +01:00
Richard Sandiford
b585f0112f aarch64: Split out aarch64_adjust_body_cost_sve
This patch splits the SVE-specific part of aarch64_adjust_body_cost
out into its own subroutine, so that a future patch can call it
more than once.  I wondered about using a lambda to avoid having
to pass all the arguments, but in the end this way seemed clearer.

gcc/
	* config/aarch64/aarch64.c (aarch64_adjust_body_cost_sve): New
	function, split out from...
	(aarch64_adjust_body_cost): ...here.
2021-08-03 13:00:45 +01:00
Richard Sandiford
83d796d3e5 aarch64: Add a simple fixed-point class for costing
This patch adds a simple fixed-point class for holding fractional
cost values.  It can exactly represent the reciprocal of any
single-vector SVE element count (including the non-power-of-2 ones).
This means that it can also hold 1/N for all N in [1, 16], which should
be enough for the various *_per_cycle fields.

For now the assumption is that the number of possible reciprocals
is fixed at compile time and so the class should always be able
to hold an exact value.

The class uses a uint64_t to hold the fixed-point value, which means
that it can hold any scaled uint32_t cost.  Normally we don't worry
about overflow when manipulating raw uint32_t costs, but just to be
on the safe side, the class uses saturating arithmetic for all
operations.

As far as the changes to the cost routines themselves go:

- The changes to aarch64_add_stmt_cost and its subroutines are
  just laying groundwork for future patches; no functional change
  intended.

- The changes to aarch64_adjust_body_cost mean that we now
  take fractional differences into account.
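
A minimal sketch of the idea (hypothetical name and scale; the real class
lives in the new config/aarch64/fractional-cost.h):

#include <cstdint>

class frac_cost
{
  // 720720 is the least common multiple of 1..16, so 1/N is exact for every
  // N in that range.  The real class may use a different representation.
  static constexpr uint64_t scale = 720720;
  uint64_t m_value;   // cost * scale

  // Saturating addition: any scaled uint32_t cost fits, but be safe anyway.
  static uint64_t sat_add (uint64_t a, uint64_t b)
  { return b <= ~a ? a + b : ~uint64_t (0); }

public:
  constexpr frac_cost (uint32_t whole = 0) : m_value (uint64_t (whole) * scale) {}

  // Exact reciprocal for n in [1, 16].
  static constexpr frac_cost reciprocal (uint32_t n)
  { frac_cost r; r.m_value = scale / n; return r; }

  frac_cost &operator+= (frac_cost other)
  { m_value = sat_add (m_value, other.m_value); return *this; }

  bool operator< (frac_cost other) const { return m_value < other.m_value; }

  // Round up to a whole cost when handing a value back to generic code.
  uint64_t ceil () const { return m_value / scale + (m_value % scale != 0); }
};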

gcc/
	* config/aarch64/fractional-cost.h: New file.
	* config/aarch64/aarch64.c: Include <algorithm> (indirectly)
	and fractional-cost.h.
	(vec_cost_fraction): New typedef.
	(aarch64_detect_scalar_stmt_subtype): Use it for statement costs.
	(aarch64_detect_vector_stmt_subtype): Likewise.
	(aarch64_sve_adjust_stmt_cost, aarch64_adjust_stmt_cost): Likewise.
	(aarch64_estimate_min_cycles_per_iter): Use vec_cost_fraction
	for cycle counts.
	(aarch64_adjust_body_cost): Likewise.
	(aarch64_test_cost_fraction): New function.
	(aarch64_run_selftests): Call it.
2021-08-03 13:00:44 +01:00
Richard Sandiford
fa3ca6151c aarch64: Turn sve_width tuning field into a bitmask
The tuning structures have an sve_width field that specifies the
number of bits in an SVE vector (or SVE_NOT_IMPLEMENTED if not
applicable).  This patch turns the field into a bitmask so that
it can specify multiple widths at the same time.  For now we
always treat the minimum width as the likely width.

An alternative would have been to add extra fields, which would
have coped correctly with non-power-of-2 widths.  However,
we're very far from supporting constant non-power-of-2 vectors
in GCC, so I think the non-power-of-2 case will in reality always
have to be hidden behind VLA.
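
For example, the minimum/likely and maximum widths can be read back from the
bitmask like this (hedged illustration with a hypothetical variable, not the
GCC code):

unsigned int widths = 128 | 256;                     // a core that can run SVE at 128 or 256 bits
unsigned int min_width = widths & -widths;           // least significant set bit
unsigned int max_width = 1u << (31 - __builtin_clz (widths));   // most significant set bit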

gcc/
	* config/aarch64/aarch64-protos.h (tune_params::sve_width): Turn
	into a bitmask.
	* config/aarch64/aarch64.c (aarch64_cmp_autovec_modes): Update
	accordingly.
	(aarch64_estimated_poly_value): Likewise.  Use the least significant
	set bit for the minimum and likely values.  Use the most significant
	set bit for the maximum value.
2021-08-03 13:00:43 +01:00
liuhongt
d0b952edd3 Add cond_add/sub/mul for vector integer modes.
gcc/ChangeLog:

	* config/i386/sse.md (cond_<insn><mode>): New expander.
	(cond_mul<mode>): Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/cond_op_addsubmul_d-1.c: New test.
	* gcc.target/i386/cond_op_addsubmul_d-2.c: New test.
	* gcc.target/i386/cond_op_addsubmul_q-1.c: New test.
	* gcc.target/i386/cond_op_addsubmul_q-2.c: New test.
	* gcc.target/i386/cond_op_addsubmul_w-1.c: New test.
	* gcc.target/i386/cond_op_addsubmul_w-2.c: New test.
2021-08-03 19:27:52 +08:00
Mosè Giordano
759f3854f0 Fix bashism in `libsanitizer/configure.tgt'
Appending to a string variable with `+=' is a bashism and does not work in
strict POSIX shells like dash.  This results in the extra compilation flags not
being set correctly.  This patch replaces the `+=' syntax with a simple string
interpolation to append to the `EXTRA_CXXFLAGS' variable.

libsanitizer/ChangeLog

	PR sanitizer/101111
	* configure.tgt: Fix bashism in setting of `EXTRA_CXXFLAGS'.
2021-08-03 13:24:47 +02:00
Jakub Jelinek
1a830c0636 analyzer: Fix ICE on MD builtin [PR101721]
The following testcase ICEs because DECL_FUNCTION_CODE asserts the builtin
is BUILT_IN_NORMAL, but it sees a backend (MD) builtin instead.
The FE, normal and MD builtin numbers overlap, so one should always
check what kind of builtin it is before looking at specific codes.

On the other side, region-model.cc has:
      if (fndecl_built_in_p (callee_fndecl, BUILT_IN_NORMAL)
          && gimple_builtin_call_types_compatible_p (call, callee_fndecl))
        switch (DECL_UNCHECKED_FUNCTION_CODE (callee_fndecl))
which IMO should use DECL_FUNCTION_CODE instead, since it has already
checked that it is a normal builtin...

2021-08-03  Jakub Jelinek  <jakub@redhat.com>

	PR analyzer/101721
	* sm-malloc.cc (known_allocator_p): Only check DECL_FUNCTION_CODE on
	BUILT_IN_NORMAL builtins.

	* gcc.dg/analyzer/pr101721.c: New test.
2021-08-03 12:44:17 +02:00