OpenE2K/gcc - gcc - Expired Mentality Git

Commit Graph

Author	SHA1	Message	Date
François Dumont	90bf60c3c2	libstdc++: _Rb_tree code cleanup, remove lambdas Use new template parameters to replace usage of lambdas to move or not tree values on copy. libstdc++-v3/ChangeLog: * include/bits/move.h (_GLIBCXX_FWDREF): New. * include/bits/stl_tree.h: Adapt to use latter. (_Rb_tree<>::_M_clone_node): Add _MoveValue template parameter. (_Rb_tree<>::_M_mbegin): New. (_Rb_tree<>::_M_begin): Use latter. (_Rb_tree<>::_M_copy): Add _MoveValues template parameter. * testsuite/23_containers/map/allocator/move_cons.cc: New test. * testsuite/23_containers/multimap/allocator/move_cons.cc: New test. * testsuite/23_containers/multiset/allocator/move_cons.cc: New test. * testsuite/23_containers/set/allocator/move_cons.cc: New test.	2020-11-20 20:56:10 +01:00
Jan Hubicka	d1081010a1	Improve hashing of decls in ipa-icf-gimple Another remaining case is that we end up comparing calls with mismatching number of parameters or with different permutations of them. This is because we hash decls to nothing. This patch improves that by hashing decls by their code and parm decls by indexes that are stable. Also for default defs in SSA_NAMEs we can add the corresponding decl (that is usually parm decls). Still we could improve on this by hashing ssa names by their definit parameters and possibly making maps of other decls and assigning them stable function local IDs. * ipa-icf-gimple.c (func_checker::hash_operand): Improve hashing of decls.	2020-11-20 20:36:14 +01:00
Jan Hubicka	4c3b16f3c1	Only compare sizes of automatic variables One of the common remaining reasons for ICF to fail after loading in the function body is a mismatched type of an automatic variable. This is because compatible_types_p resorts to checking TYPE_MAIN_VARIANTS for equivalence, which prevents merging many TBAA compatible cases. (And thus is also not reflected by the hash extended by alias sets of accesses.) Since in gimple automatic variables are just blocks of memory I think we should check only their size. All accesses are matched when comparing the actual loads/stores. I am not sure if we need to match types of other DECLs but I decided I can try to be safe here: for PARM_DECL/RESULT_DECL we match them anyway to be sure that functions are ABI compatible. For CONST_DECL and readonly global VAR_DECLs they are matched when comparing their constructors. * ipa-icf-gimple.c (func_checker::compare_decl): Do not compare types of local variables.	2020-11-20 20:33:04 +01:00
Andrew MacLeod	6585462630	re: FAIL: gcc.dg/pr97515.c Adjust testcase to check in CCP not EVRP. gcc/testuite/ * gcc.dg/pr97515.c: Check in ccp2, not evrp.	2020-11-20 11:08:43 -05:00
Andrea Corallo	f671b3d79f	PR target/97727 aarch64: [testcase] fix bf16_vstN_lane_2.c for big endian targets gcc/testsuite/ChangeLog 2020-11-09 Andrea Corallo <andrea.corallo@arm.com> PR target/97727 * gcc.target/aarch64/advsimd-intrinsics/bf16_vstN_lane_2.c: Relax regexps.	2020-11-20 16:37:12 +01:00
Nathan Sidwell	bf0a3968f5	doc: Fixup a couple of formatting nits I noticed a couple of places we used @code{program} instead of @command{program}. gcc/ * doc/invoke.texi: Replace a couple of @code with @command	2020-11-20 10:14:13 -05:00
Andrea Corallo	86706296b7	[PR target/97726] arm: [testsuite] fix some simd tests on armbe 2020-11-10 Andrea Corallo <andrea.corallo@arm.com> PR target/97726 * gcc.target/arm/simd/bf16_vldn_1.c: Relax regexps not to fail on big endian. * gcc.target/arm/simd/vldn_lane_bf16_1.c: Likewise * gcc.target/arm/simd/vmmla_1.c: Add -mfloat-abi=hard flag.	2020-11-20 16:03:51 +01:00
Tamar Christina	ad318e3f1d	SLP: Have vectorizable_slp_permutation set type on invariants This modifies vectorizable_slp_permutation to update the type of the children of a perm node before trying to permute them. This allows us to be able to permute invariant nodes. This will be covered by test from the SLP pattern matcher. gcc/ChangeLog: * tree-vect-slp.c (vectorizable_slp_permutation): Update types on nodes when needed.	2020-11-20 13:32:32 +00:00
Jonathan Wakely	640ebeb336	libstdc++: Remove <memory_resource> dependency from <regex> [PR 92546] Unlike the other headers that declare alias templates in namespace pmr, <regex> includes <memory_resource>. That was done because the pmr::string::const_iterator typedef requires pmr::string to be complete, which requires pmr::polymorphic_allocator<char> to be complete. By using __normal_iterator<const char*, pmr::string> instead of the const_iterator typedef we can avoid the completeness requirement. This makes <regex> smaller, by not requiring <memory_resource> and its <shared_mutex> dependency, which depends on <chrono>. Backporting this will also help with PR 97876, where <stop_token> ends up being needed by <regex> via <memory_resource>. libstdc++-v3/ChangeLog: PR libstdc++/92546 * include/std/regex (pmr::smatch, pmr::wsmatch): Declare using underlying __normal_iterator type, not nested typedef basic_string::const_iterator.	2020-11-20 13:06:48 +00:00
Richard Biener	4405edb496	Deal with (pattern) SLP consumed stmts in hybrid discovery This makes hybrid SLP discovery deal with stmts indirectly consumed by SLP, for example via patterns. This means that all uses of a stmt end up in SLP vectorized stmts. This helps my prototype patches for PR97832 where I make SLP discovery re-associate chains to make operands match. This ends up building SLP computation nodes without 1:1 representatives in the scalar IL and thus no scalar lane defs in SLP_TREE_SCALAR_STMTS. Nevertheless all of the original scalar stmts are consumed so this represents another kind of SLP pattern for the computation chain result. 2020-11-20 Richard Biener <rguenther@suse.de> * tree-vect-slp.c (maybe_push_to_hybrid_worklist): New function. (vect_detect_hybrid_slp): Use it. Perform a backward walk over the IL.	2020-11-20 13:06:58 +01:00
Richard Biener	6e820b943b	dump SLP_TREE_REPRESENTATIVE It always annoyed me to see those empty SLP nodes in dumpfiles: t.c:16:3: note: node 0x3a2a280 (max_nunits=1, refcnt=1) t.c:16:3: note: { } t.c:16:3: note: children 0x3a29db0 0x3a29e90 resulting from two-operator handling. The following makes sure to also dump the operation template or VEC_PERM_EXPR. 2020-11-20 Richard Biener <rguenther@suse.de> * tree-vect-slp.c (vect_print_slp_tree): Also dump SLP_TREE_REPRESENTATIVE.	2020-11-20 13:05:42 +01:00
Jakub Jelinek	1bea0d0aa5	c++: Add __builtin_clear_padding builtin - C++20 P0528R3 compiler side [PR88101] The following patch implements __builtin_clear_padding builtin that clears the padding bits in object representation (but preserves value representation). Inside of unions it clears only those padding bits that are padding for all the union members (so that it never alters value representation). It handles trailing padding, padding in the middle of structs including bitfields (PDP11 unhandled, I've never figured out how those bitfields work), VLAs (doesn't handle variable length structures, but I think almost nobody uses them and it isn't worth the extra complexity). For VLAs and sufficiently large arrays it uses runtime clearing loop instead of emitting straight-line code (unless arrays are inside of a union). The way I think this can be used for atomics is e.g. if the structures are power of two sized and small enough that we use the hw atomics for say compare_exchange __builtin_clear_padding could be called first on the address of expected and desired arguments (for desired only if we want to ensure that most of the time the atomic memory will have padding bits cleared), then perform the weak cmpxchg and if that fails, we got the value from the atomic memory; we can call __builtin_clear_padding on a copy of that and then compare it with expected, and if it is the same with the padding bits masked off, we can use the original with whatever random padding bits in it as the new expected for next cmpxchg. __builtin_clear_padding itself is not atomic and therefore it shouldn't be called on the atomic memory itself, but compare_exchange's expected argument is a reference and normally the implementation may store there the current value from memory, so padding bits can be cleared in that, and desired is passed by value rather than reference, so clearing is fine too. 
When using libatomic, we can use it either that way, or add new libatomic APIs that accept another argument, pointer to the padding bit bitmask, and construct that in the template as alignas (_T) unsigned char _mask[sizeof (_T)]; std::memset (_mask, ~0, sizeof (_mask)); __builtin_clear_padding ((_T *) _mask); which will have bits cleared for padding bits and set for bits taking part in the value representation. Then libatomic could internally instead of using memcmp compare for (i = 0; i < N; i++) if ((val1[i] & mask[i]) != (val2[i] & mask[i])) 2020-11-20 Jakub Jelinek <jakub@redhat.com> PR libstdc++/88101 gcc/ * builtins.def (BUILT_IN_CLEAR_PADDING): New built-in function. * gimplify.c (gimplify_call_expr): Rewrite single argument BUILT_IN_CLEAR_PADDING into two-argument variant. * gimple-fold.c (clear_padding_unit, clear_padding_buf_size): New const variables. (struct clear_padding_struct): New type. (clear_padding_flush, clear_padding_add_padding, clear_padding_emit_loop, clear_padding_type, clear_padding_union, clear_padding_real_needs_padding_p, clear_padding_type_may_have_padding_p, gimple_fold_builtin_clear_padding): New functions. (gimple_fold_builtin): Handle BUILT_IN_CLEAR_PADDING. * doc/extend.texi (__builtin_clear_padding): Document. gcc/c-family/ * c-common.c (check_builtin_function_arguments): Handle BUILT_IN_CLEAR_PADDING. gcc/testsuite/ * c-c++-common/builtin-clear-padding-1.c: New test. * c-c++-common/torture/builtin-clear-padding-1.c: New test. * c-c++-common/torture/builtin-clear-padding-2.c: New test. * c-c++-common/torture/builtin-clear-padding-3.c: New test. * c-c++-common/torture/builtin-clear-padding-4.c: New test. * c-c++-common/torture/builtin-clear-padding-5.c: New test. * g++.dg/torture/builtin-clear-padding-1.C: New test. * g++.dg/torture/builtin-clear-padding-2.C: New test. * gcc.dg/builtin-clear-padding-1.c: New test.	2020-11-20 12:28:34 +01:00
Jakub Jelinek	410b8f6f41	arm: Fix up neon_vector_mem_operand [PR97528] The documentation for POST_MODIFY says: Currently, the compiler can only handle second operands of the form (plus (reg) (reg)) and (plus (reg) (const_int)), where the first operand of the PLUS has to be the same register as the first operand of the *_MODIFY. The following testcase ICEs, because combine just attempts to simplify things and ends up with (post_modify (reg1) (plus (mult (reg2) (const_int 4)) (reg1))) but the target predicates accept it, because they only verify that POST_MODIFY's second operand is PLUS and the second operand of the PLUS is a REG. The following patch fixes this by performing further verification that the POST_MODIFY is in the form it should be. 2020-11-20 Jakub Jelinek <jakub@redhat.com> PR target/97528 * config/arm/arm.c (neon_vector_mem_operand): For POST_MODIFY, require first POST_MODIFY operand is a REG and is equal to the first operand of PLUS. * gcc.target/arm/pr97528.c: New test.	2020-11-20 12:26:58 +01:00
Eric Botcazou	1b3c981367	Plug loophole in string store merging There is a loophole in new string store merging support added recently: it does not check that the stores are consecutive, which is obviously required if you want to concatenate them... Simple fix attached, the nice thing being that it can fall back to the regular processing if any hole is detected in the series of stores, thanks to the handling of STRING_CST by native_encode_expr. gcc/ChangeLog: * gimple-ssa-store-merging.c (struct merged_store_group): Add new 'consecutive' field. (merged_store_group): Set it to true. (do_merge): Set it to false if the store is not consecutive and set string_concatenation to false in this case. (merge_into): Call do_merge on entry. (merge_overlapping): Likewise. gcc/testsuite/ChangeLog: * gnat.dg/opt90a.adb: New test. * gnat.dg/opt90b.adb: Likewise. * gnat.dg/opt90c.adb: Likewise. * gnat.dg/opt90d.adb: Likewise. * gnat.dg/opt90e.adb: Likewise. * gnat.dg/opt90a_pkg.ads: New helper. * gnat.dg/opt90b_pkg.ads: Likewise. * gnat.dg/opt90c_pkg.ads: Likewise. * gnat.dg/opt90d_pkg.ads: Likewise. * gnat.dg/opt90e_pkg.ads: Likewise.	2020-11-20 12:24:08 +01:00
Jan Hubicka	cd287abe8c	Fix comment in ipa-icf-gimple.c * ipa-icf-gimple.c (func_checker::operand_equal_p): Fix comment.	2020-11-20 11:13:02 +01:00
Jan Hubicka	8e39410125	Fix comparison of {CLOBBER} in icf After fixing a few issues I got to a stage where 1.4M icf mismatches are due to comparing two gimple clobbers. The problem is that operand_equal_p matches clobbers in case CONSTRUCTOR: /* In GIMPLE empty constructors are allowed in initializers of aggregates. */ return !CONSTRUCTOR_NELTS (arg0) && !CONSTRUCTOR_NELTS (arg1); But this happens too late, after comparing their types (which are not very relevant for a memory store). In the context of ipa-icf we do not really need to match the RHS of gimple clobbers: it is enough to know that the LHS stores can be considered equivalent. I thus added logic to hash them all the same way and compare using the TREE_CLOBBER_P flag. I see another option in extending operand_equal_p in fold-const to handle them more generously, or making stmt hash and compare skip comparing/hashing the RHS of gimple_clobber_p. * ipa-icf-gimple.c (func_checker::hash_operand): Hash gimple clobber. (func_checker::operand_equal_p): Special case gimple clobber.	2020-11-20 11:06:48 +01:00
Uros Bizjak	fdace75840	i386: Optimize abs expansion [PR97873] The patch introduces absM named pattern to generate optimal insn sequence for CMOVE_TARGET targets. Currently, the expansion goes through neg+max optabs, and the following code is generated: movl %edi, %eax negl %eax cmpl %edi, %eax cmovl %edi, %eax This sequence is unoptimal in two ways. a) The compare instruction is not needed, since NEG insn sets the sign flag based on the result. The CMOV can use sign flag to select between negated and original value: movl %edi, %eax negl %eax cmovs %edi, %eax b) On some targets, CMOV is undesirable due to its performance issues. In addition to TARGET_EXPAND_ABS bypass, the patch introduces STV conversion of abs RTX to use PABS SSE insn: vmovd %edi, %xmm0 vpabsd %xmm0, %xmm0 vmovd %xmm0, %eax The patch changes compare mode of NEG instruction to CCGOCmode, which is the same mode as the mode of SUB instruction. IOW, sign bit becomes usable. Also, the mode iterator of <maxmin:code><mode>3 pattern is changed to SWI48x instead of SWI248. The purpose of maxmin expander is to prepare max/min RTX for STV to eventually convert them to SSE PMAX/PMIN instructions, in order to avoid CMOV insns with general registers. 2020-11-20 Uroš Bizjak <ubizjak@gmail.com> gcc/ PR target/97873 * config/i386/i386.md (neg<mode>2_2): Rename from "neg<mode>2_cmpz". Use CCGOCmode instead of CCZmode. (negsi2_zext): Rename from negsi2_cmpz_zext. Use CCGOCmode instead of CCZmode. (neg<mode>_ccc_1): New insn pattern. (neg<dwi>2_doubleword): Use neg<mode>_ccc_1. (abs<mode>2): Add FLAGS_REG clobber. Use TARGET_CMOVE insn predicate. (abs<mode>2_1): New insn_and_split pattern. (absdi2_doubleword): Ditto. (<maxmin:code><mode>3): Use SWI48x mode iterator. (<maxmin:code><mode>3): Use SWI48 mode iterator. * config/i386/i386-features.c (general_scalar_chain::compute_convert_gain): Handle ABS code. (general_scalar_chain::convert_insn): Ditto. (general_scalar_to_vector_candidate_p): Ditto. 
gcc/testsuite/ PR target/97873 * gcc.target/i386/pr97873.c: New test. * gcc.target/i386/pr97873-1.c: New test.	2020-11-20 10:29:35 +01:00
Jakub Jelinek	a774a6a2fb	configury: Fix up --enable-link-serialization support Eric reported that the --enable-link-serialization changes seemed to cause the binaries to be always relinked, for example from the gcc/ directory of the build tree: make [relink of gnat1, brig1, cc1plus, d21, f951, go1, lto1, ...] make [relink of gnat1, brig1, cc1plus, d21, f951, go1, lto1, ...] Furthermore as reported in PR, it can cause problems during make install where make install rebuilds the binaries again. The problem is that for make .PHONY targets are just "rebuilt" always, so it is very much undesirable for the cc1plus$(exeext) etc. dependencies to include .PHONY targets, but I was using them - cc1plus.prev which would depend on some .serial and e.g. cc1.serial depending on c and c depending on cc1$(exeext). The following patch rewrites this so that .serial and .prev aren't .PHONY targets, but instead just make variables. I was worried that the order in which the language makefile fragments are included (which is quite random, what order we get from the filesystem matching /config-lang.in) would be a problem but it seems to work fine - as it uses make = rather than := variables, later definitions are just fine for earlier uses as long as the uses aren't needed during the makefile parsing, but only in the dependencies of make targets and in their commands. 2020-11-20 Jakub Jelinek <jakub@redhat.com> PR other/97911 gcc/ * configure.ac: In SERIAL_LIST use lang words without .serial suffix. Change $lang.prev from a target to variable and instead of depending on .serial expand to the .serial variable if the word is in the SERIAL_LIST at all, otherwise to nothing. * configure: Regenerated. gcc/c/ * Make-lang.in (c.serial): Change from goal to a variable. (.PHONY): Drop c.serial. gcc/ada/ * gcc-interface/Make-lang.in (ada.serial): Change from goal to a variable. (.PHONY): Drop ada.serial and ada.prev. (gnat1$(exeext)): Depend on $(ada.serial) rather than ada.serial. 
gcc/brig/ * Make-lang.in (brig.serial): Change from goal to a variable. (.PHONY): Drop brig.serial and brig.prev. (brig1$(exeext)): Depend on $(brig.serial) rather than brig.serial. gcc/cp/ * Make-lang.in (c++.serial): Change from goal to a variable. (.PHONY): Drop c++.serial and c++.prev. (cc1plus$(exeext)): Depend on $(c++.serial) rather than c++.serial. gcc/d/ * Make-lang.in (d.serial): Change from goal to a variable. (.PHONY): Drop d.serial and d.prev. (d21$(exeext)): Depend on $(d.serial) rather than d.serial. gcc/fortran/ * Make-lang.in (fortran.serial): Change from goal to a variable. (.PHONY): Drop fortran.serial and fortran.prev. (f951$(exeext)): Depend on $(fortran.serial) rather than fortran.serial. gcc/go/ * Make-lang.in (go.serial): Change from goal to a variable. (.PHONY): Drop go.serial and go.prev. (go1$(exeext)): Depend on $(go.serial) rather than go.serial. gcc/jit/ * Make-lang.in (jit.serial): Change from goal to a variable. (.PHONY): Drop jit.serial and jit.prev. ($(LIBGCCJIT_FILENAME)): Depend on $(jit.serial) rather than jit.serial. gcc/lto/ * Make-lang.in (lto1.serial, lto2.serial): Change from goals to variables. (.PHONY): Drop lto1.serial, lto2.serial, lto1.prev and lto2.prev. ($(LTO_EXE)): Depend on $(lto1.serial) rather than lto1.serial. ($(LTO_DUMP_EXE)): Depend on $(lto2.serial) rather than lto2.serial. gcc/objc/ * Make-lang.in (objc.serial): Change from goal to a variable. (.PHONY): Drop objc.serial and objc.prev. (cc1obj$(exeext)): Depend on $(objc.serial) rather than objc.serial. gcc/objcp/ * Make-lang.in (obj-c++.serial): Change from goal to a variable. (.PHONY): Drop obj-c++.serial and obj-c++.prev. (cc1objplus$(exeext)): Depend on $(obj-c++.serial) rather than obj-c++.serial.	2020-11-20 08:45:11 +01:00
Kewen Lin	02109ea268	rs6000: Fix p8_mtvsrd_df's insn type This patch is to fix insn type of p8_mtvsrd_df from mfvsr to mtvsr, in order to align with the other places using mtvsrd. gcc/ChangeLog: * config/rs6000/rs6000.md (p8_mtvsrd_df): Fix insn type.	2020-11-20 00:41:03 -06:00
Martin Uecker	32934a4f45	C: Drop qualifiers during lvalue conversion [PR97702] 2020-11-20 Martin Uecker <muecker@gwdg.de> gcc/ * gimplify.c (gimplify_modify_expr_rhs): Optimize NOP_EXPRs that contain compound literals. gcc/c/ * c-typeck.c (convert_lvalue_to_rvalue): Drop qualifiers. gcc/testsuite/ * gcc.dg/cond-constqual-1.c: Adapt test. * gcc.dg/lvalue-11.c: New test. * gcc.dg/pr60195.c: Add warning.	2020-11-20 07:34:11 +01:00
GCC Administrator	d62586ee56	Daily bump.	2020-11-20 00:16:40 +00:00
Jakub Jelinek	d3f2933487	ranger: Improve a % b operand ranges [PR91029] As mentioned in the PR, the previous PR91029 patch was testing op2 >= 0, which is unnecessary: even negative op2 values will work the same. Furthermore, from a % b > 0 we can deduce a > 0 rather than just a >= 0 (0 % b would be 0), and it is actually valid even for constants other than 0: a % b > 5 means a > 5 (a % b has the same sign as a, and a in [0, 5] would result in a % b in [0, 5]). Also, we can deduce a range for the other operand: if we know a % b >= 20, then b must be (in absolute value for signed modulo) > 20, since for a % [0, 20] the result would be [0, 19]. 2020-11-19 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/91029 * range-op.cc (operator_trunc_mod::op1_range): Don't require signed types, nor require that op2 >= 0. Implement (a % b) >= x && x > 0 implies a >= x and (a % b) <= x && x < 0 implies a <= x. (operator_trunc_mod::op2_range): New method. * gcc.dg/tree-ssa/pr91029-1.c: New test. * gcc.dg/tree-ssa/pr91029-2.c: New test.	2020-11-20 00:02:21 +01:00
Andrew MacLeod	d0d8b5d836	Process only valid shift ranges. When shifting outside the valid range of [0, precision-1], we can choose to process just the valid ones since the rest is undefined. this allows us to produce results for x << [0,2][+INF, +INF] by discarding the invalid ranges and processing just [0,2]. gcc/ PR tree-optimization/93781 * range-op.cc (get_shift_range): Rename from undefined_shift_range_check and now return valid shift ranges. (operator_lshift::fold_range): Use result from get_shift_range. (operator_rshift::fold_range): Ditto. gcc/testsuite/ * gcc.dg/tree-ssa/pr93781-1.c: New. * gcc.dg/tree-ssa/pr93781-2.c: New. * gcc.dg/tree-ssa/pr93781-3.c: New.	2020-11-19 17:41:30 -05:00
Nathan Sidwell	5bba2215c2	c++: Template hash access This exposes the template specialization table, so the modules machinery may access it. The hashed entity (tmpl, args & spec) is available, along with a hash table walker. We also need a way of finding or inserting entries, along with some bookkeeping fns to deal with the instantiation and (partial) specialization lists. gcc/cp/ * cp-tree.h (struct spec_entry): Moved from pt.c. (walk_specializations, match_mergeable_specialization) (get_mergeable_specialization_flags) (add_mergeable_specialization): Declare. * pt.c (struct spec_entry): Moved to cp-tree.h. (walk_specializations, match_mergeable_specialization) (get_mergeable_specialization_flags) (add_mergeable_specialization): New.	2020-11-19 13:25:00 -08:00
Jonathan Wakely	08b4d32571	libstdc++: Avoid calling undefined __gthread_self weak symbol [PR 95989] Since glibc 2.27 the pthread_self symbol has been defined in libc rather than libpthread. Because we only call pthread_self through a weak alias it's possible for statically linked executables to end up without a definition of pthread_self. This crashes when trying to call an undefined weak symbol. We can use the __GLIBC_PREREQ version check to detect the version of glibc where pthread_self is no longer in libpthread, and call it directly rather than through the weak reference. It would be better to check for pthread_self in libc during configure instead of hardcoding the __GLIBC_PREREQ check. That would be complicated by the fact that prior to glibc 2.27 libc.a didn't have the pthread_self symbol, but libc.so.6 did. The configure checks would need to try to link both statically and dynamically, and the result would depend on whether the static libc.a happens to be installed during configure (which could vary between different systems using the same version of glibc). Doing it properly is left for a future date, as that will be needed anyway after glibc moves all pthread symbols from libpthread to libc. When that happens we should revisit the whole approach of using weak symbols for pthread symbols. For the purposes of std::this_thread::get_id() we call pthread_self() directly when using glibc 2.27 or later. Otherwise, if __gthread_active_p() is true then we know the libpthread symbol is available so we call that. Otherwise, we are single-threaded and just use ((__gthread_t)1) as the thread ID. An undesirable consequence of this change is that code compiled prior to the change might inline the old definition of this_thread::get_id() which always returns (__gthread_t)1 in a program that isn't linked to libpthread. Code compiled after the change will use pthread_self() and so get a real TID. 
That could result in the main thread having different thread::id values in different translation units. This seems acceptable, as there are not expected to be many uses of thread::id in programs that aren't linked to libpthread. An earlier version of this patch also changed __gthread_self() to use __GLIBC_PREREQ(2, 27) and only use the weak symbol for older glibc. That might still make sense to do, but isn't needed by libstdc++ now. libstdc++-v3/ChangeLog: PR libstdc++/95989 * config/os/gnu-linux/os_defines.h (_GLIBCXX_NATIVE_THREAD_ID): Define new macro to get reliable thread ID. * include/bits/std_thread.h (this_thread::get_id): Use new macro if it's defined. * testsuite/30_threads/jthread/95989.cc: New test. * testsuite/30_threads/this_thread/95989.cc: New test.	2020-11-19 21:07:06 +00:00
Nathan Sidwell	bfc139e2b1	c++: Expose constexpr hash table This patch exposes the constexpr hash table so that the modules machinery can save and load constexpr bodies. While there I noticed that we could do a little constification of the hasher and comparator functions. Also combine the saving machinery to a single function returning void -- nothing ever looked at its return value. gcc/cp/ * cp-tree.h (struct constexpr_fundef): Moved from constexpr.c. (maybe_save_constexpr_fundef): Declare. (register_constexpr_fundef): Take constexpr_fundef object, return void. * decl.c (mabe_save_function_definition): Delete, functionality moved to maybe_save_constexpr_fundef. (emit_coro_helper, finish_function): Adjust. * constexpr.c (struct constexpr_fundef): Moved to cp-tree.h. (constexpr_fundef_hasher::equal): Constify. (constexpr_fundef_hasher::hash): Constify. (retrieve_constexpr_fundef): Make non-static. (maybe_save_constexpr_fundef): Break out checking and duplication from ... (register_constexpr_fundef): ... here. Just register the constexpr.	2020-11-19 12:21:31 -08:00
Jan Hubicka	0862d007b5	Fix two bugs in operand_equal_p * fold-const.c (operand_compare::operand_equal_p): Fix thinko in COMPONENT_REF handling and guard types_same_for_odr by virtual_method_call_p. (operand_compare::hash_operand): Likewise.	2020-11-19 20:16:26 +01:00
Jakub Jelinek	8156cfaa4c	c, tree: Fix ICE from get_parm_array_spec [PR97860] The C and C++ FEs handle zero sized arrays differently, C uses NULL TYPE_MAX_VALUE on non-NULL TYPE_DOMAIN on complete ARRAY_TYPEs with bitsize_zero_node TYPE_SIZE, while C++ FE likes to set TYPE_MAX_VALUE to the largest value (and min to the lowest). Martin has used array_type_nelts in get_parm_array_spec where the function on the C form of [0] arrays returns error_mark_node and the code crashes soon afterwards. The following patch teaches array_type_nelts about this (e.g. dwarf2out already handles that as [0]). While it will change what is_empty_type returns for certain types (e.g. struct S { int a[0]; };), as those types occupy zero bits in C, it should make an ABI difference. So, the tree.c change makes the c-decl.c code handle the [0] arrays like any other constant extents, and the c-decl.c change just makes sure that if we'd run into error_mark_node e.g. from the VLA expressions, we don't crash on those. 2020-11-19 Jakub Jelinek <jakub@redhat.com> PR c/97860 * tree.c (array_type_nelts): For complete arrays with zero min and NULL max and zero size return -1. * c-decl.c (get_parm_array_spec): Bail out of nelts is error_operand_p. * gcc.dg/pr97860.c: New test.	2020-11-19 20:09:55 +01:00
Marek Polacek	ae48b74ca0	c++: Fix array new with value-initialization [PR97523] Since my r11-3092 the following is rejected with -std=c++20: struct T { explicit T(); }; void fn(int n) { new T[1](); } with "would use explicit constructor 'T::T()'". It is because since that change we go into the P1009 block in build_new (array_p is false, but nelts is non-null and we're in C++20). Since we only have (), we build a {} and continue to build_new_1, which then calls build_vec_init and then we error because the {} isn't CONSTRUCTOR_IS_DIRECT_INIT. For (), which is value-initializing, we want to do what we were doing before: pass empty init and let build_value_init take care of it. For various reasons I wanted to dig a little bit deeper into this, and as a result, I'm adding a test for [expr.new]/24 (and checked that our current behavior matches clang++). gcc/cp/ChangeLog: PR c++/97523 * init.c (build_new): When value-initializing an array new, leave the INIT as an empty vector. gcc/testsuite/ChangeLog: PR c++/97523 * g++.dg/expr/anew5.C: New test. * g++.dg/expr/anew6.C: New test.	2020-11-19 14:00:41 -05:00
Marek Polacek	25056bdf94	c++: Fix crash with broken deduction from {} [PR97895] Unfortunately, the otherwise beautiful for (constructor_elt &elt : CONSTRUCTOR_ELTS (init)) is not immune to an empty constructor, so we have to check CONSTRUCTOR_ELTS first. gcc/cp/ChangeLog: PR c++/97895 * pt.c (do_auto_deduction): Don't crash when the constructor has zero elements. gcc/testsuite/ChangeLog: PR c++/97895 * g++.dg/cpp0x/auto54.C: New test.	2020-11-19 13:14:41 -05:00
Nathan Sidwell	e1f07131e2	config: Add tests for modules-desired features this adds configure tests for features that modules can take advantage of -- and if they are not present has reduced or fallback functionality. gcc/ * configure.ac: Add tests for fstatat, sighandler_t, O_CLOEXEC, unix-domain and ipv6 sockets. * config.in: Rebuilt. * configure: Rebuilt.	2020-11-19 09:56:30 -08:00
Nathan Sidwell	255483e5b7	c++: Relax new assert [PR 97905] It turns out there are legitimate cases for the new decl to not have lang-specific. PR c++/97905 gcc/cp/ * decl.c (duplicate_decls): Relax new assert. gcc/testsuite/ * g++.dg/lookup/pr97905.C: New.	2020-11-19 09:56:30 -08:00
Dimitar Dimitrov	5ace1776b8	pru: Add builtins for HALT and LMBD Add builtins for HALT and LMBD, per Texas Instruments document SPRUHV7C. Use the new LMBD pattern to define an expand for clz. Binutils [1] and sim [2] support for LMBD instruction are merged now. [1] https://sourceware.org/pipermail/binutils/2020-October/113901.html [2] https://sourceware.org/pipermail/gdb-patches/2020-November/173141.html gcc/ChangeLog: * config/pru/alu-zext.md: Add lmbd patterns for zero_extend variants. * config/pru/pru.c (enum pru_builtin): Add HALT and LMBD. (pru_init_builtins): Ditto. (pru_builtin_decl): Ditto. (pru_expand_builtin): Ditto. * config/pru/pru.h (CLZ_DEFINED_VALUE_AT_ZERO): Define PRU value for CLZ with zero value parameter. * config/pru/pru.md: Add halt, lmbd and clz patterns. * doc/extend.texi: Document PRU builtins. gcc/testsuite/ChangeLog: * gcc.target/pru/halt.c: New test. * gcc.target/pru/lmbd.c: New test.	2020-11-19 19:39:49 +02:00
Richard Sandiford	0b0061f4d8	vect: Add a “very cheap” cost model Currently we have three vector cost models: cheap, dynamic and unlimited. -O2 -ftree-vectorize uses “cheap” by default, but that's still relatively aggressive about peeling and aliasing checks, and can lead to significant code size growth. This patch adds an even more conservative choice, which for lack of imagination I've called “very cheap”. It only allows vectorisation if the vector code entirely replaces the scalar code. It also requires one iteration of the vector loop to pay for itself, regardless of how often the loop iterates. (If the vector loop needs multiple iterations to be beneficial then things are probably too close to call, and the conservative thing would be to stick with the scalar code.) The idea is that this should be suitable for -O2, although the patch doesn't change any defaults itself. I tested this by building and running a bunch of workloads for SVE, with three options: (1) -O2 (2) -O2 -ftree-vectorize -fvect-cost-model=very-cheap (3) -O2 -ftree-vectorize [-fvect-cost-model=cheap] All three builds used the default -msve-vector-bits=scalable and ran with the minimum vector length of 128 bits, which should give a worst-case bound for the performance impact. The workloads included a mixture of microbenchmarks and full applications. Because it's quite an eclectic mix, there's not much point giving exact figures. The aim was more to get a general impression. Code size growth with (2) was much lower than with (3). Only a handful of tests increased by more than 5%, and all of them were microbenchmarks. In terms of performance, (2) was significantly faster than (1) on microbenchmarks (as expected) but also on some full apps. Again, performance only regressed on a handful of tests. As expected, the performance of (3) vs. (1) and (3) vs. (2) is more of a mixed bag. There are several significant improvements with (3) over (2), but also some (smaller) regressions. 
That seems to be in line with -O2 -ftree-vectorize being a kind of -O2.5. The patch reorders vect_cost_model so that values are in order of increasing aggressiveness, which makes it possible to use range checks. The value 0 still represents “unlimited”, so “if (flag_vect_cost_model)” is still a meaningful check. gcc/ * doc/invoke.texi (-fvect-cost-model): Add a very-cheap model. * common.opt (fvect-cost-model=): Add very-cheap as a possible option. (fsimd-cost-model=): Likewise. (vect_cost_model): Add very-cheap. * flag-types.h (vect_cost_model): Add VECT_COST_MODEL_VERY_CHEAP. Put the values in order of increasing aggressiveness. * tree-vect-data-refs.c (vect_enhance_data_refs_alignment): Use range checks when comparing against VECT_COST_MODEL_CHEAP. (vect_prune_runtime_alias_test_list): Do not allow any alias checks for the very-cheap cost model. * tree-vect-loop.c (vect_analyze_loop_costing): Do not allow any peeling for the very-cheap cost model. Also require one iteration of the vector loop to pay for itself. gcc/testsuite/ * gcc.dg/vect/vect-cost-model-1.c: New test. * gcc.dg/vect/vect-cost-model-2.c: Likewise. * gcc.dg/vect/vect-cost-model-3.c: Likewise. * gcc.dg/vect/vect-cost-model-4.c: Likewise. * gcc.dg/vect/vect-cost-model-5.c: Likewise. * gcc.dg/vect/vect-cost-model-6.c: Likewise.	2020-11-19 16:49:37 +00:00
Jonathan Wakely	5e6a43158d	libstdc++: Add missing header to some tests These tests use std::this_thread::sleep_for without including <thread>. libstdc++-v3/ChangeLog: * testsuite/30_threads/async/async.cc: Include <thread>. * testsuite/30_threads/future/members/93456.cc: Likewise.	2020-11-19 16:17:33 +00:00
Wilco Dijkstra	5c5a67e61b	AArch64: Add cost table for Cortex-A76 Add an initial cost table for Cortex-A76 - this is copied from cortexa57_extra_costs but updated based on the Optimization Guide. Use the new cost table on all Neoverse tunings and ensure the tunings are consistent for all. As a result more compact code is generated with more combined shift+alu operations. Eg. -mcpu=cortex-a76 will now merge the shifts in: int f(int x, int y) { return (x & y << 3) * (x | y << 3); } and w2, w0, w1, lsl 3 orr w0, w0, w1, lsl 3 mul w0, w2, w0 ret SPEC2017 codesize improves by 0.02% and SPECINT2017 shows 0.24% gain. 2020-11-18 Wilco Dijkstra <wdijkstr@arm.com> gcc/ * config/aarch64/aarch64.c (neoversen1_tunings): Use new cortexa76_extra_costs. (neoversev1_tunings): Likewise. (neoversen2_tunings): Likewise. * config/arm/aarch-cost-tables.h (cortexa76_extra_costs): add new costs.	2020-11-19 16:14:11 +00:00
Wilco Dijkstra	1d77928fc4	AArch64: Improve inline memcpy expansion Improve the inline memcpy expansion. Use integer load/store for copies <= 24 bytes instead of SIMD. Set the maximum copy to expand to 256 by default, except that -Os or no Neon expands up to 128 bytes. When using LDP/STP of Q-registers, also use Q-register accesses for the unaligned tail, saving 2 instructions (eg. all sizes up to 48 bytes emit exactly 4 instructions). Cleanup code and comments. The codesize gain vs the GCC10 expansion is 0.05% on SPECINT2017. 2020-11-03 Wilco Dijkstra <wdijkstr@arm.com> gcc/ * config/aarch64/aarch64.c (aarch64_expand_cpymem): Cleanup code and comments, tweak expansion decisions and improve tail expansion.	2020-11-19 16:05:33 +00:00
Eric Botcazou	2729378d09	Fix PR ada/97805 We need to include limits.h (or <climits>) in adaint.c because of LLONG_MIN. gcc/ada/ChangeLog: PR ada/97805 * adaint.c: Include climits in C++ and limits.h otherwise.	2020-11-19 16:41:34 +01:00
Nathan Sidwell	9844497a93	preprocessor: main file searching This adds the capability to locate the main file on the user or system include paths. That's extremely useful to users building header units. Searching has to be requested (plain header-unit compilation will not search). Also, to make include_next work as expected when building a header unit, we add a mechanism to retrofit a non-searched source file as one on the include path. libcpp/ * include/cpplib.h (enum cpp_main_search): New. (struct cpp_options): Add main_search field. (cpp_main_loc): Declare. (cpp_retrofit_as_include): Declare. * internal.h (struct cpp_reader): Add main_loc field. (_cpp_in_main_source_file): Not main if main is a header. * init.c (cpp_read_main_file): Use main_search option to locate main file. Set main_loc. * files.c (cpp_retrofit_as_include): New.	2020-11-19 07:05:08 -08:00
Jonathan Wakely	b204d7722d	libstdc++: Move std::thread to a new header This makes it possible to use std::thread without including the whole of <thread>. It also makes this_thread::get_id() and this_thread::yield() available even when there is no gthreads support (e.g. when GCC is built with --disable-threads or --enable-threads=single). In order for the std::thread::id return type of this_thread::get_id() to be defined, std::thread itself is defined unconditionally. However the constructor that creates new threads is not defined for single-threaded builds. The thread::join() and thread::detach() member functions are defined inline for single-threaded builds and just throw an exception (because we know the thread cannot be joinable if the constructor that creates joinable threads doesn't exist). The thread::hardware_concurrency() member function is also defined inline and returns 0 (as suggested by the standard when the value "is not computable or well-defined"). The main benefit for most targets is that other headers such as <future> do not need to include the whole of <thread> just to be able to create a std::thread. That avoids including <stop_token> and std::jthread where not required. This is another partial fix for PR 92546. This also means we can use this_thread::get_id() and this_thread::yield() in <stop_token> instead of using the gthread functions directly. This removes some preprocessor conditionals, simplifying the code. libstdc++-v3/ChangeLog: PR libstdc++/92546 * include/Makefile.am: Add new <bits/std_thread.h> header. * include/Makefile.in: Regenerate. * include/std/future: Include new header instead of <thread>. * include/std/stop_token: Include new header instead of <bits/gthr.h>. (stop_token::_S_yield()): Use this_thread::yield(). (_Stop_state_t::_M_requester): Change type to std::thread::id. (_Stop_state_t::_M_request_stop()): Use this_thread::get_id(). (_Stop_state_t::_M_remove_callback(_Stop_cb)): Likewise. 
Use __is_single_threaded() to decide whether to synchronize. * include/std/thread (thread, operator==, this_thread::get_id) (this_thread::yield): Move to new header. (operator<=>, operator!=, operator<, operator<=, operator>) (operator>=, hash<thread::id>, operator<<): Define even when gthreads not available. * src/c++11/thread.cc: Include <memory>. * include/bits/std_thread.h: New file. (thread, operator==, this_thread::get_id, this_thread::yield): Define even when gthreads not available. [!_GLIBCXX_HAS_GTHREADS] (thread::join, thread::detach) (thread::hardware_concurrency): Define inline.	2020-11-19 13:36:15 +00:00
Jonathan Wakely	b108faa940	libstdc++: Fix overflow checks to use the correct "time_t" [PR 93456] I recently added overflow checks to src/c++11/futex.cc for PR 93456, but then changed the type of the timespec for PR 93421. This meant the overflow checks were no longer using the right range, because the variable being written to might be smaller than time_t. This introduces new typedef that corresponds to the tv_sec member of the struct being passed to the syscall, and uses that typedef in the range checks. libstdc++-v3/ChangeLog: PR libstdc++/93421 PR libstdc++/93456 * src/c++11/futex.cc (syscall_time_t): New typedef for the type of the syscall_timespec::tv_sec member. (relative_timespec, _M_futex_wait_until) (_M_futex_wait_until_steady): Use syscall_time_t in overflow checks, not time_t.	2020-11-19 13:33:11 +00:00
Nathan Sidwell	bf425849f1	preprocessor: main-file cleanup In preparing module patch 7 I realized there was a cleanup I could make to simplify it. This is that cleanup. Also, when doing the cleanup I noticed some macros had been turned into inline functions, but not renamed to the preprocessor's internal namespace (_cpp_$INTERNAL rather than cpp_$USER). Thus, this renames those functions, deletes an internal field of the file structure, and determines whether we're in the main file by comparing to pfile->main_file, the _cpp_file of the main file. libcpp/ * internal.h (cpp_in_system_header): Rename to ... (_cpp_in_system_header): ... here. (cpp_in_primary_file): Rename to ... (_cpp_in_main_source_file): ... here. Compare main_file equality and check main_search value. * lex.c (maybe_va_opt_error, _cpp_lex_direct): Adjust for rename. * macro.c (_cpp_builtin_macro_text): Likewise. (replace_args): Likewise. * directives.c (do_include_next): Likewise. (do_pragma_once, do_pragma_system_header): Likewise. * files.c (struct _cpp_file): Delete main_file field. (pch_open): Check pfile->main_file equality. (make_cpp_file): Drop cpp_reader parm, don't set main_file. (_cpp_find_file): Adjust. (_cpp_stack_file): Check pfile->main_file equality. (struct report_missing_guard_data): Add cpp_reader field. (report_missing_guard): Check pfile->main_file equality. (_cpp_report_missing_guards): Adjust.	2020-11-19 04:47:00 -08:00
Richard Biener	d84ba819fe	Fix bootstrap This fixes a typo in the TREE_CODE compare which should compare against TYPE_DECL, not TYPE_NAME. 2020-11-19 Richard Biener <rguenther@suse.de> * fold-const.c (operand_compare::hash_operand): Fix typo.	2020-11-19 13:42:11 +01:00
Richard Biener	717e22dcd4	Fix gcc.dg/pr97897.c This adds dg-options "" to avoid the pedantic error on _Complex int. 2020-11-19 Richard Biener <rguenther@suse.de> * gcc.dg/pr97897.c: Add dg-options.	2020-11-19 13:27:55 +01:00
Richard Biener	b08e0ee301	refactor reassocs get_rank This refactors things so assigned ranks are dumped and the cache is consistently used also for PHIs. 2020-11-19 Richard Biener <rguenther@suse.de> * tree-ssa-reassoc.c (get_rank): Refactor to consistently use the cache and dump ranks assigned.	2020-11-19 13:27:55 +01:00
Jan Hubicka	d8cf897674	Fix operand_equal_p hash and compare of ODR_TYPE_REF * fold-const.c (operand_compare::operand_equal_p): Move OBJ_TYPE_REF matching to correct place; drop OEP_ADDRESS_OF for TOKEN, OBJECT and class. (operand_compare::hash_operand): Hash ODR type for OBJ_TYPE_REF.	2020-11-19 13:08:29 +01:00
Joel Hutton	27842e2a1e	[3/3] [AArch64][vect] vec_widen_lshift pattern Add aarch64 vec_widen_lshift_lo/hi patterns and fix bug it triggers in mid-end. This pattern takes one vector with N elements of size S, shifts each element left by the element width and stores the results as N elements of size 2S (in 2 result vectors). The aarch64 backend implements this with the shll,shll2 instruction pair. gcc/ChangeLog: * config/aarch64/aarch64-simd.md: Add vec_widen_lshift_hi/lo<mode> patterns. * tree-vect-stmts.c (vectorizable_conversion): Fix for widen_lshift case. gcc/testsuite/ChangeLog: * gcc.target/aarch64/vect-widen-lshift.c: New test.	2020-11-19 11:49:59 +00:00
Joel Hutton	9fc9573f9a	[2/3] [vect] Add widening add, subtract patterns Add widening add, subtract patterns to tree-vect-patterns. Update the widened code of patterns that detect PLUS_EXPR to also detect WIDEN_PLUS_EXPR. These patterns take 2 vectors with N elements of size S and perform an add/subtract on the elements, storing the results as N elements of size 2S (in 2 result vectors). This is implemented in the aarch64 backend as addl,addl2 and subl,subl2 respectively. Add aarch64 tests for patterns. gcc/ChangeLog: * doc/generic.texi: Document new widen_plus/minus_lo/hi tree codes. * doc/md.texi: Document new widening add/subtract hi/lo optabs. * expr.c (expand_expr_real_2): Add widen_add, widen_subtract cases. * optabs-tree.c (optab_for_tree_code): Add case for widening optabs. * optabs.def (OPTAB_D): Define vectorized widen add, subtracts. * tree-cfg.c (verify_gimple_assign_binary): Add case for widening adds, subtracts. * tree-inline.c (estimate_operator_cost): Add case for widening adds, subtracts. * tree-vect-generic.c (expand_vector_operations_1): Add case for widening adds, subtracts. * tree-vect-patterns.c (vect_recog_widen_add_pattern): New recog pattern. (vect_recog_widen_sub_pattern): New recog pattern. (vect_recog_average_pattern): Update widened add code. (vect_recog_average_pattern): Update widened add code. * tree-vect-stmts.c (vectorizable_conversion): Add case for widened add, subtract. (supportable_widening_operation): Add case for widened add, subtract. * tree.def (WIDEN_PLUS_EXPR): New tree code. (WIDEN_MINUS_EXPR): New tree code. (VEC_WIDEN_PLUS_HI_EXPR): New tree code. (VEC_WIDEN_PLUS_LO_EXPR): New tree code. (VEC_WIDEN_MINUS_HI_EXPR): New tree code. (VEC_WIDEN_MINUS_LO_EXPR): New tree code. gcc/testsuite/ChangeLog: * gcc.target/aarch64/vect-widen-add.c: New test. * gcc.target/aarch64/vect-widen-sub.c: New test.	2020-11-19 11:49:59 +00:00
Joel Hutton	ec46904edf	[1/3][aarch64] Add vec_widen patterns to aarch64 Add widening add and subtract patterns to the aarch64 backend. These allow taking vectors of N elements of size S and performing an add/subtract on the high or low half, widening the resulting elements and storing N/2 elements of size 2S. These correspond to the addl,addl2,subl,subl2 instructions. gcc/ChangeLog: * config/aarch64/aarch64-simd.md: New patterns vec_widen_saddl_lo/hi_<mode>.	2020-11-19 11:47:43 +00:00
Richard Biener	ec383f0bdb	tree-optimization/97901 - ICE propagating out LC PHIs We need to fold the stmt to canonicalize MEM_REFs which means we're back to using replace_uses_by. Which means we need dominators to not require a CFG cleanup upthread. 2020-11-19 Richard Biener <rguenther@suse.de> PR tree-optimization/97901 * tree-ssa-propagate.c (clean_up_loop_closed_phi): Compute dominators and use replace_uses_by. * gcc.dg/torture/pr97901.c: New testcase.	2020-11-19 11:35:45 +01:00

181359 Commits