OpenE2K/gcc - gcc - Expired Mentality Git

Commit Graph

Author	SHA1	Message	Date
Jakub Jelinek	4f18893fbd	sparc: Preserve ORIGINAL_REGNO in epilogue_renumber [PR105257] The following testcase ICEs, because the pic register is (reg:DI 24 %i0 [109]) and is used in the delay slot of a return. We invoke epilogue_renumber and that changes it to (reg:DI 8 %o0) which no longer satisfies sparc_pic_register_p predicate, so we don't recognize the insn anymore. The following patch fixes that by preserving ORIGINAL_REGNO if specified, so we get (reg:DI 8 %o0 [109]) instead. 2022-04-19 Jakub Jelinek <jakub@redhat.com> PR target/105257 * config/sparc/sparc.c (epilogue_renumber): If ORIGINAL_REGNO, use gen_raw_REG instead of gen_rtx_REG and copy over also ORIGINAL_REGNO. Use return 0; instead of /* fallthrough */. * gcc.dg/pr105257.c: New test. (cherry picked from commit `eeca2b8bd0`)	2022-05-11 08:17:59 +02:00
Jakub Jelinek	30895a25ea	c++: Fix up CONSTRUCTOR_PLACEHOLDER_BOUNDARY handling [PR105256] The CONSTRUCTOR_PLACEHOLDER_BOUNDARY bit is supposed to separate PLACEHOLDER_EXPRs that should be replaced by one object or subobjects of it (variable, TARGET_EXPR slot, ...) from other PLACEHOLDER_EXPRs that should be replaced by different objects or subobjects. The bit is set when finding PLACEHOLDER_EXPRs inside of a CONSTRUCTOR, not looking into nested CONSTRUCTOR_PLACEHOLDER_BOUNDARY ctors, and we prevent elision of TARGET_EXPRs (through TARGET_EXPR_NO_ELIDE) whose initializer is a CONSTRUCTOR_PLACEHOLDER_BOUNDARY ctor. The following testcase ICEs though, we don't replace the placeholders in there at all, because CONSTRUCTOR_PLACEHOLDER_BOUNDARY isn't set on the TARGET_EXPR_INITIAL ctor, but on a ctor nested in such a ctor. replace_placeholders should be run on the whole TARGET_EXPR slot. So, the following patch fixes it by moving the CONSTRUCTOR_PLACEHOLDER_BOUNDARY bit from nested CONSTRUCTORs to the CONSTRUCTOR containing those (but only if it is closely nested, if there is some other tree sandwiched in between, it doesn't do it). 2022-04-19 Jakub Jelinek <jakub@redhat.com> PR c++/105256 * typeck2.c (process_init_constructor_array, process_init_constructor_record, process_init_constructor_union): Move CONSTRUCTOR_PLACEHOLDER_BOUNDARY flag from CONSTRUCTOR elements to the containing CONSTRUCTOR. * g++.dg/cpp0x/pr105256.C: New test. (cherry picked from commit `eb03e42459`)	2022-05-11 08:17:59 +02:00
Jakub Jelinek	14407aba9f	i386: Fix ICE caused by ix86_emit_i387_log1p [PR105214] The following testcase ICEs, because ix86_emit_i387_log1p attempts to emit something like if (cond) some_code1; else some_code2; and emits a conditional jump using emit_jump_insn (standard way in the file) and an unconditional jump using emit_jump. The problem with that is that if there is pending stack adjustment, it isn't emitted before the conditional jump, but is before the unconditional jump and therefore stack is adjusted only conditionally (at the end of some_code1 above), which makes dwarf2 pass unhappy about it but is a serious wrong-code even if it doesn't ICE. This can be fixed either by emitting pending stack adjust before the conditional jump as the following patch does, or by not using emit_jump (label2); and instead hand inlining what that function does except for the pending stack adjustment, like: emit_jump_insn (targetm.gen_jump (label2)); emit_barrier (); In that case there will be no stack adjustment in the sequence and it will be done later on somewhere else. 2022-04-12 Jakub Jelinek <jakub@redhat.com> PR target/105214 * config/i386/i386.c (ix86_emit_i387_log1p): Call do_pending_stack_adjust. * gcc.dg/asan/pr105214.c: New test. (cherry picked from commit `d481d13786`)	2022-05-11 08:17:58 +02:00
Jakub Jelinek	fd68b02744	builtins: Fix up expand_builtin_int_roundingfn_2 [PR105211] The expansion of __builtin_iround{,f,l} etc. builtins in some cases emits calls to a different fallback builtin. To locate the right builtin it uses mathfn_built_in_1 with the type of the first argument. If its TYPE_MAIN_VARIANT is {float,double,long_double}_type_node, all is fine, but on the following testcase, because GIMPLE considers scalar float conversions between types with the same mode as useless, TYPE_MAIN_VARIANT of the arg's type is float32_type_node and because there isn't __builtin_lroundf32 returns NULL and we ICE. This patch will first try the type of the first argument of the builtin's prototype (so that say on sizeof(double)==sizeof(long double) target it honors whether it was a l or non-l call; though even that can't be 100% trusted, user could incorrectly prototype it) and as fallback the type argument. If neither works, doesn't fallback. 2022-04-11 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/105211 * builtins.c (expand_builtin_int_roundingfn_2): If mathfn_built_in_1 fails for TREE_TYPE (arg), retry it with TREE_VALUE (TYPE_ARG_TYPES (TREE_TYPE (fndecl))) and if even that fails, emit call normally. * gcc.dg/pr105211.c: New test. (cherry picked from commit `91a38e8a84`)	2022-05-11 08:17:58 +02:00
Jakub Jelinek	6547662a2a	c-family: Initialize ridpointers for __int128 etc. [PR105186] The following testcase ICEs with C++ and is incorrectly rejected with C. The reason is that both FEs use ridpointers identifiers for CPP_KEYWORD and value or u.value for CPP_NAME e.g. when parsing attributes or OpenMP directives etc., like: /* Save away the identifier that indicates which attribute this is. */ identifier = (token->type == CPP_KEYWORD) /* For keywords, use the canonical spelling, not the parsed identifier. */ ? ridpointers[(int) token->keyword] : id_token->u.value; identifier = canonicalize_attr_name (identifier); I've tried to change those to use ridpointers only if non-NULL and otherwise use the value/u.value even for CPP_KEYWORDS, but that was a large 10 hunks patch. The following patch instead just initializes ridpointers for the __intNN keywords. It can't be done earlier before we record_builtin_type as there are 2 different spellings and if we initialize those ridpointers early, the second record_builtin_type fails miserably. 2022-04-11 Jakub Jelinek <jakub@redhat.com> PR c++/105186 * c-common.c (c_common_nodes_and_builtins): After registering __int%d and __int%d__ builtin types, initialize corresponding ridpointers entry. * c-c++-common/pr105186.c: New test. (cherry picked from commit `083e8e66d2`)	2022-05-11 08:17:58 +02:00
Jakub Jelinek	cddca3b79f	fold-const: Fix up make_range_step [PR105189] The following testcase is miscompiled, because fold_truth_andor incorrectly folds (unsigned) foo () >= 0U && 1 into foo () >= 0 For the unsigned comparison (which is useless in this case, as >= 0U is always true, but hasn't been folded yet), previous make_range_step derives exp (unsigned) foo () and +[0U, -] range for it. Next we process the NOP_EXPR. We have special code for unsigned to signed casts, already earlier punt if low or high aren't representable in arg0_type or if it is a narrowing conversion. For the signed to unsigned casts, I think if high is specified we are still fine, as we punt for non-representable values in arg0_type, n_high is then still representable and so was smaller or equal to signed maximum and either low is not present (equivalent to 0U), or low must be smaller or equal to high and so for unsigned exp +[low, high] the signed exp +[n_low, n_high] will be correct. Similarly, if both low and high aren't specified (always true or always false), it is ok too. But if we have for unsigned exp +[low, -] or -[low, -], using +[n_low, -] or -[n_high, -] is incorrect. Because low is smaller or equal to signed maximum and high is unspecified (i.e. unsigned maximum), when signed that range is a union of +[n_low, -] and +[-, -1] which is equivalent to -[0, n_low-1], unless low is 0, in that case we can treat it as [-, -]. 2022-04-08 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/105189 * fold-const.c (make_range_step): Fix up handling of (unsigned) x +[low, -] ranges for signed x if low fits into typeof (x). * g++.dg/torture/pr105189.C: New test. (cherry picked from commit `5e6597064b`)	2022-05-11 08:17:58 +02:00
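The range reasoning in the make_range_step commit above can be checked with a small C sketch. This is only an illustration, not the fold-const.c code: the function names are hypothetical, and it assumes the usual two's complement conversion between int and unsigned. It compares the original test `(unsigned) x >= low` with the incorrect signed rewrite `x >= low` and with the rewrite the message derives, namely that x lies outside [0, low-1].

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* The test as written in the source: (unsigned) x >= low.
   As in the commit message, low is assumed to fit into int.  */
static bool unsigned_test(int x, unsigned low)
{
    assert(low <= (unsigned) INT_MAX);
    return (unsigned) x >= low;
}

/* The incorrect rewrite: treats +[low, -] on the unsigned value
   as +[n_low, -] on the signed value, losing all negative x.  */
static bool wrong_signed_test(int x, unsigned low)
{
    return x >= (int) low;
}

/* The rewrite described in the message: the union of +[n_low, -]
   and +[-, -1], i.e. -[0, n_low-1]; always true when low is 0.  */
static bool right_signed_test(int x, unsigned low)
{
    if (low == 0)
        return true;
    return !(x >= 0 && x <= (int) low - 1);
}
```

For x = -1 and low = 0 the first and third functions return true while the second returns false, which matches the miscompilation of `(unsigned) foo () >= 0U` described in the message.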
Jakub Jelinek	5169f5756e	combine: Don't record for UNDO_MODE pointers into regno_reg_rtx array [PR104985] The testcase in the PR fails under valgrind on mips64 (but only Martin can reproduce, I couldn't). But the problem reported there is that SUBST_MODE remembers addresses into the regno_reg_rtx array, then some splitter needs a new pseudo and calls gen_reg_rtx, which reallocates the regno_reg_rtx array and finally undo operation is done and dereferences the old regno_reg_rtx entry. The rtx values stored in regno_reg_rtx array seems to be created by gen_reg_rtx only and since then aren't modified, all we do for it is adjusting its fields (e.g. adjust_reg_mode that SUBST_MODE does). So, I think it is useless to use where.r for UNDO_MODE and store &regno_reg_rtx[regno] in struct undo, we can store just regno_reg_rtx[regno] (i.e. pointer to the REG itself instead of pointer to pointer to REG) or could also store just the regno. The following patch does the latter, and because SUBST_MODE no longer needs to be a macro, changes all SUBST_MODE uses to subst_mode. 2022-04-06 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/104985 * combine.c (struct undo): Add where.regno member. (do_SUBST_MODE): Rename to ... (subst_mode): ... this. Change first argument from rtx * into int, operate on regno_reg_rtx[regno] and save regno into where.regno. (SUBST_MODE): Remove. (try_combine): Use subst_mode instead of SUBST_MODE, change first argument from regno_reg_rtx[whatever] to whatever. For UNDO_MODE, use regno_reg_rtx[undo->where.regno] instead of undo->where.r. (undo_to_marker): For UNDO_MODE, use regno_reg_rtx[undo->where.regno] instead of undo->where.r. (simplify_set): Use subst_mode instead of SUBST_MODE, change first argument from regno_reg_rtx[whatever] to whatever. (cherry picked from commit `61bee6aed2`)	2022-05-11 08:17:58 +02:00
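The hazard described in the combine commit above, and the shape of its fix, can be sketched in plain C. This is an analogy only and none of it is GCC code: `struct table`, `struct undo_entry` and the three functions are hypothetical names. The undo record stores an index into a growable array rather than a pointer into it, so a reallocation between the change and the undo cannot leave the record dangling.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable array standing in for regno_reg_rtx.  */
struct table {
    int *slots;
    size_t len;
};

/* Undo record: remembers the slot by index, not by address.  */
struct undo_entry {
    size_t index;
    int old_value;
};

/* Grow the array; realloc may move the storage, which is what
   invalidates any pointer previously taken into it.  */
static void table_grow(struct table *t, size_t new_len)
{
    int *p = realloc(t->slots, new_len * sizeof *p);
    assert(p != NULL);
    for (size_t i = t->len; i < new_len; i++)
        p[i] = 0;
    t->slots = p;
    t->len = new_len;
}

/* Change a slot and return the record needed to revert it.  */
static struct undo_entry table_set(struct table *t, size_t index, int value)
{
    assert(index < t->len);
    struct undo_entry u = { index, t->slots[index] };
    t->slots[index] = value;
    return u;
}

/* Revert a change by looking the slot up again through the index.  */
static void table_undo(struct table *t, struct undo_entry u)
{
    assert(u.index < t->len);
    t->slots[u.index] = u.old_value;
}
```

Had `struct undo_entry` held an `int *` obtained before `table_grow`, `table_undo` would write through a stale pointer, which corresponds to the valgrind report mentioned in the message.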
Jakub Jelinek	a78199c69f	i386: Fix up ix86_expand_vector_init_general [PR105123] The following testcase is miscompiled on ia32. The problem is that at -O0 we end up with: vector(4) short unsigned int _1; short unsigned int u.0_3; ... _1 = {u.0_3, u.0_3, u.0_3, u.0_3}; statement (dead) which is wrongly expanded. elt is (subreg:HI (reg:SI 83 [ u.0_3 ]) 0), tmp_mode SImode, so after convert_mode we start with word (reg:SI 83 [ u.0_3 ]). The intent is to manually broadcast that value to 2 SImode parts, but because we pass word as target to expand_simple_binop, it will overwrite (reg:SI 83 [ u.0_3 ]) and we end up with 0: 10: {r83:SI=r83:SI<<0x10;clobber flags:CC;} 11: {r83:SI=r83:SI|r83:SI;clobber flags:CC;} 12: {r83:SI=r83:SI<<0x10;clobber flags:CC;} 13: {r83:SI=r83:SI|r83:SI;clobber flags:CC;} 14: clobber r110:V4HI 15: r110:V4HI#0=r83:SI 16: r110:V4HI#4=r83:SI as the two ors do nothing and two shifts each by 16 left shift it all away. The following patch fixes that by using NULL_RTX target, so we expand it as 10: {r110:SI=r83:SI<<0x10;clobber flags:CC;} 11: {r111:SI=r110:SI|r83:SI;clobber flags:CC;} 12: {r112:SI=r83:SI<<0x10;clobber flags:CC;} 13: {r113:SI=r112:SI|r83:SI;clobber flags:CC;} 14: clobber r114:V4HI 15: r114:V4HI#0=r111:SI 16: r114:V4HI#4=r113:SI instead. Another possibility would be to pass NULL_RTX only when word == elt and word otherwise, where word would necessarily be a pseudo from the first shift after passing NULL_RTX there once or pass NULL_RTX for the shift and word for ior. 2022-04-03 Jakub Jelinek <jakub@redhat.com> PR target/105123 * config/i386/i386.c (ix86_expand_vector_init_general): Avoid using word as target for expand_simple_binop when doing ASHIFT and IOR. * gcc.target/i386/pr105123.c: New test. (cherry picked from commit `e1a74058b7`)	2022-05-11 08:17:57 +02:00
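The two instruction sequences quoted in the commit above can be replayed on ordinary integers. The following C sketch is an illustration under that assumption, with hypothetical function names. It models r83 as one 32-bit variable and shows why reusing it as the target of every shift and or yields 0, while fresh temporaries yield the intended broadcast.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors insns 10-13 of the buggy expansion: each result is
   written back into the register that also holds the source.  */
static uint32_t broadcast_clobbering(uint16_t elt)
{
    uint32_t word = elt;
    word = word << 16;   /* source value is now only in the high half */
    word = word | word;  /* or with itself changes nothing */
    word = word << 16;   /* shifts the remaining bits out */
    word = word | word;  /* still zero */
    return word;
}

/* Mirrors the fixed expansion: the shift goes to a new temporary,
   so the original value is still available for the or.  */
static uint32_t broadcast_fresh(uint16_t elt)
{
    uint32_t word = elt;
    uint32_t shifted = word << 16;
    return shifted | word;
}
```

With `elt = 0x1234`, the first function returns 0 and the second returns `0x12341234`.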
Jakub Jelinek	bee22b8bc1	ubsan: Fix ICE due to -fsanitize=object-size [PR105093] The following testcase ICEs, because for a volatile X & RESULT_DECL ubsan wants to take address of that reference. instrument_object_size is called with x, so the base is equal to the access and the var is automatic, so there is no risk of an out of bounds access for it. Normally we wouldn't instrument those because we fold address of the t - address of inner to 0, add constant size of the decl and it is equal to what __builtin_object_size computes. But the volatile results in the subtraction not being folded. The first hunk fixes it by punting if we access the whole automatic decl, so that even volatile won't cause a problem. The second hunk (not strictly needed for this testcase) is similar to what has been added to asan.cc recently, if we actually take address of a decl and keep it in the IL, we better mark it addressable. 2022-03-30 Jakub Jelinek <jakub@redhat.com> PR sanitizer/105093 * ubsan.c (instrument_object_size): If t is equal to inner and is a decl other than global var, punt. When emitting call to UBSAN_OBJECT_SIZE ifn, make sure base is addressable. * g++.dg/ubsan/pr105093.C: New test. (cherry picked from commit `e3e68fa59e`)	2022-05-11 08:17:57 +02:00
Jakub Jelinek	3bbd4ce93b	c++: Fix up __builtin_convertvector parsing Jonathan reported on IRC that we don't parse __builtin_bit_cast (type, val).field etc. The problem is that for these 2 builtins we return from cp_parser_postfix_expression instead of setting postfix_expression to the cp_build_* value and falling through into the postfix expression suffix handling loop. 2022-03-26 Jakub Jelinek <jakub@redhat.com> * parser.c (cp_parser_postfix_expression) <case RID_BUILTIN_CONVERTVECTOR>: Don't return cp_build_vec_convert result right away, instead set postfix_expression to it and break. * c-c++-common/builtin-convertvector-3.c: New test. (cherry picked from commit `1806829e08`)	2022-05-11 08:17:57 +02:00
Jakub Jelinek	54bccc8e05	c++: extern thread_local declarations in constexpr [PR104994] C++14 to C++20 apparently should allow extern thread_local declarations in constexpr functions, however useless they are there (because accessing such vars is not valid in a constant expression, perhaps sizeof/decltype). P2242 changed that for C++23 to passing through declaration but https://cplusplus.github.io/CWG/issues/2552.html has been filed for it yesterday. 2022-03-24 Jakub Jelinek <jakub@redhat.com> PR c++/104994 * constexpr.c (potential_constant_expression_1): Don't diagnose extern thread_local declarations. * decl.c (start_decl): Likewise. * g++.dg/cpp2a/constexpr-nonlit7.C: New test. (cherry picked from commit `72124f487c`)	2022-05-11 08:17:57 +02:00
Jakub Jelinek	c1a8261b70	i386: Don't emit pushf;pop for __builtin_ia32_readeflags_u* with unused lhs [PR104971] __builtin_ia32_readeflags_u* aren't marked const or pure I think intentionally, so that they aren't CSEd from different regions of a function etc. because we don't and can't easily track all dependencies between it and surrounding code (if somebody looks at the condition flags, it is dependent on the vast majority of instructions). But the builtin itself doesn't have any side-effects, so if we ignore the result of the builtin, there is no point to emit anything. There is a LRA bug that miscompiles the testcase which this patch makes latent, which is certainly worth fixing too, but IMHO this change (and maybe ix86_gimple_fold_builtin too which would fold it even earlier when it loses lhs) is worth it as well. 2022-03-19 Jakub Jelinek <jakub@redhat.com> PR middle-end/104971 * config/i386/i386.c (ix86_expand_builtin) <case IX86_BUILTIN_READ_FLAGS>: If ignore, don't push/pop anything and just return const0_rtx. * gcc.target/i386/pr104971.c: New test. (cherry picked from commit `b60bc913cc`)	2022-05-11 08:17:57 +02:00
Jakub Jelinek	0e02b8468b	c, c++, c-family: -Wshift-negative-value and -Wshift-overflow* tweaks for -fwrapv and C++20+ [PR104711] As mentioned in the PR, different standards have different definition on what is an UB left shift. They all agree on out of bounds (including negative) shift count. The rules used by ubsan are: C99-C2x: ((unsigned) x >> (uprecm1 - y)) != 0 then UB; C++11-C++17: x < 0 || ((unsigned) x >> (uprecm1 - y)) > 1 then UB; C++20 and later: everything is well defined. Now, for C++20, I've in the P1236R1 implementation added an early exit for -Wshift-overflow* warning so that it never warns, but apparently -Wshift-negative-value remained as is. As it is well defined in C++20, the following patch doesn't enable -Wshift-negative-value from -Wextra anymore for C++20 and later, if users want for compatibility with C++17 and earlier get the warning, they still can by using -Wshift-negative-value explicitly. Another thing is -fwrapv, that is an extension to the standards, so it is up to us how exactly we define that case. Our ubsan code treats TYPE_OVERFLOW_WRAPS (type0) and cxx_dialect >= cxx20 the same as only diagnosing out of bounds shift count and nothing else and IMHO it is most sensical to treat -fwrapv signed left shifts the same as C++20 treats them, https://eel.is/c++draft/expr.shift#2 "The value of E1 << E2 is the unique value congruent to E1×2^E2 modulo 2^N, where N is the width of the type of the result. [Note 1: E1 is left-shifted E2 bit positions; vacated bits are zero-filled. — end note]" with no UB dependent on the E1 values. The UB is only "The behavior is undefined if the right operand is negative, or greater than or equal to the width of the promoted left operand." Under the hood (except for FEs and ubsan from FEs) GCC middle-end doesn't consider UB in left shifts dependent on the first operand's value, only the out of bounds shifts.
While this change isn't a regression, I'd think it is useful for GCC 12, it doesn't add new warnings, but just removes warnings that aren't appropriate. 2022-03-09 Jakub Jelinek <jakub@redhat.com> PR c/104711 gcc/ * doc/invoke.texi (-Wextra): Document that -Wshift-negative-value is enabled by it only for C++11 to C++17 rather than for C++03 or later. (-Wshift-negative-value): Similarly (except here we stated that it is enabled for C++11 or later). gcc/c-family/ * c-opts.c (c_common_post_options): Don't enable -Wshift-negative-value from -Wextra for C++20 or later. * c-ubsan.c (ubsan_instrument_shift): Adjust comments. * c-warn.c (maybe_warn_shift_overflow): Use TYPE_OVERFLOW_WRAPS instead of TYPE_UNSIGNED. gcc/c/ * c-fold.c (c_fully_fold_internal): Don't emit -Wshift-negative-value warning if TYPE_OVERFLOW_WRAPS. * c-typeck.c (build_binary_op): Likewise. gcc/cp/ * constexpr.c (cxx_eval_check_shift_p): Use TYPE_OVERFLOW_WRAPS instead of TYPE_UNSIGNED. * typeck.c (cp_build_binary_op): Don't emit -Wshift-negative-value warning if TYPE_OVERFLOW_WRAPS. gcc/testsuite/ * c-c++-common/Wshift-negative-value-1.c: Remove dg-additional-options, instead in target selectors of each diagnostic check for exact C++ versions where it should be diagnosed. * c-c++-common/Wshift-negative-value-2.c: Likewise. * c-c++-common/Wshift-negative-value-3.c: Likewise. * c-c++-common/Wshift-negative-value-4.c: Likewise. * c-c++-common/Wshift-negative-value-7.c: New test. * c-c++-common/Wshift-negative-value-8.c: New test. * c-c++-common/Wshift-negative-value-9.c: New test. * c-c++-common/Wshift-negative-value-10.c: New test. * c-c++-common/Wshift-overflow-1.c: Remove dg-additional-options, instead in target selectors of each diagnostic check for exact C++ versions where it should be diagnosed. * c-c++-common/Wshift-overflow-2.c: Likewise. * c-c++-common/Wshift-overflow-5.c: Likewise. * c-c++-common/Wshift-overflow-6.c: Likewise. * c-c++-common/Wshift-overflow-7.c: Likewise. 
* c-c++-common/Wshift-overflow-8.c: New test. * c-c++-common/Wshift-overflow-9.c: New test. * c-c++-common/Wshift-overflow-10.c: New test. * c-c++-common/Wshift-overflow-11.c: New test. * c-c++-common/Wshift-overflow-12.c: New test. (cherry picked from commit `d76511138d`)	2022-05-11 08:17:57 +02:00
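The ubsan rules listed in the -Wshift-negative-value commit above can be written out as C predicates. This is a minimal sketch, not the c-ubsan.c implementation: the function names are hypothetical, the operand type is fixed to int, and int is assumed to have no padding bits, so `UPRECM1` (the message's uprecm1, the precision minus one) is 31 on common targets. Each predicate assumes the shift count is already in range, since every standard treats an out of range count as undefined.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Precision of int minus one, assuming no padding bits.  */
#define UPRECM1 ((int) (sizeof (int) * CHAR_BIT) - 1)

/* All standards agree that the count must be in [0, UPRECM1].  */
static bool shift_count_ok(int y)
{
    return y >= 0 && y <= UPRECM1;
}

/* C99-C2x: UB if any bit would reach or pass the sign bit,
   which also covers every negative x.  */
static bool c99_left_shift_is_ub(int x, int y)
{
    assert(shift_count_ok(y));
    return ((unsigned) x >> (UPRECM1 - y)) != 0;
}

/* C++11-C++17: negative x is UB, but a one may be shifted
   into the sign bit.  */
static bool cxx11_left_shift_is_ub(int x, int y)
{
    assert(shift_count_ok(y));
    return x < 0 || ((unsigned) x >> (UPRECM1 - y)) > 1;
}

/* C++20 and later (and, per the message, -fwrapv): with an
   in-range count nothing depends on the value of x.  */
static bool cxx20_left_shift_is_ub(int x, int y)
{
    assert(shift_count_ok(y));
    (void) x;
    return false;
}
```

For example, `1 << UPRECM1` is flagged by the C99 rule but not by the C++11 rule, while `-1 << 1` is flagged by both and by neither under the C++20 rule.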
Jakub Jelinek	2a829a4e85	c++: Don't suggest cdtor or conversion op identifiers in spelling hints [PR104806] On the following testcase, we emit "did you mean '__dt '?" in the error message. "__dt " shows there because it is dtor_identifier, but we shouldn't suggest those to the user, they are purely internal and can't be really typed by the user because of the final space in it. 2022-03-08 Jakub Jelinek <jakub@redhat.com> PR c++/104806 * search.c (lookup_field_fuzzy_info::fuzzy_lookup_field): Ignore identifiers with space at the end. * g++.dg/spellcheck-pr104806.C: New test. (cherry picked from commit `e480c3c06d`)	2022-05-11 08:17:57 +02:00
Jakub Jelinek	e763af00a2	s390: Fix up cmp_and_trap_unsigned_int<mode> constraints [PR104775] The following testcase fails to assemble due to clgte %r6,0(%r1,%r10) insn not being accepted by assembler. My rough understanding is that in the RSY-b insn format the spot in other formats used for index registers is used instead for M3 what kind of comparison it is, so this patch follows what other similar instructions use for constraint (i.e. one without index register). 2022-03-07 Jakub Jelinek <jakub@redhat.com> PR target/104775 * config/s390/s390.md (cmp_and_trap_unsigned_int<mode>): Use S constraint instead of T in the last alternative. * gcc.target/s390/pr104775.c: New test. (cherry picked from commit `2472dcaa8c`)	2022-05-11 08:17:56 +02:00
Jakub Jelinek	870a9a8e82	match.pd: Further complex simplification fixes [PR104675] Mark mentioned in the PR further 2 simplifications that also ICE with complex types. For these, eventually (but IMO GCC 13 material) we could support it for vector types if it would be uniform vector constants. Currently integer_pow2p is true only for INTEGER_CSTs and COMPLEX_CSTs and we can't use bit_and etc. for complex type. 2022-02-25 Jakub Jelinek <jakub@redhat.com> Marc Glisse <marc.glisse@inria.fr> PR tree-optimization/104675 * match.pd (t * 2U / 2 -> t & (~0 / 2), t / 2U * 2 -> t & ~1): Restrict simplifications to INTEGRAL_TYPE_P. * gcc.dg/pr104675-3.c: New test. (cherry picked from commit `f62115c9b7`)	2022-05-11 08:17:56 +02:00
Jakub Jelinek	5c742d9a7e	rs6000: Use rs6000_emit_move in movmisalign<mode> expander [PR104681] The following testcase ICEs, because for some strange reason it decides to use movmisaligntf during expansion where the destination is MEM and source is CONST_DOUBLE. For normal mov<mode> expanders the rs6000 backend uses rs6000_emit_move to ensure that if one operand is a MEM, the other is a REG and a few other things, but for movmisalign<mode> nothing enforced this. The middle-end documents that movmisalign<mode> shouldn't fail, so we can't force that through predicates or condition on the expander. 2022-02-25 Jakub Jelinek <jakub@redhat.com> PR target/104681 * config/rs6000/vector.md (movmisalign<mode>): Use rs6000_emit_move. * g++.dg/opt/pr104681.C: New test. (cherry picked from commit `3885a122f8`)	2022-05-11 08:17:56 +02:00
Jakub Jelinek	ff9fe8ef03	match.pd: Don't create BIT_NOT_EXPRs for COMPLEX_TYPE [PR104675] We don't support BIT_{AND,IOR,XOR,NOT}_EXPR on complex types, &/|/^ are just rejected for them, and ~ is parsed as CONJ_EXPR. So, we should avoid simplifications which turn valid complex type expressions into something that will ICE during expansion. 2022-02-25 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/104675 * match.pd (-A - 1 -> ~A, -1 - A -> ~A): Don't simplify for COMPLEX_TYPE. * gcc.dg/pr104675-1.c: New test. * gcc.dg/pr104675-2.c: New test. (cherry picked from commit `758671b88b`)	2022-05-11 08:17:56 +02:00
Jakub Jelinek	2692cbbc12	libiberty: Fix up debug.temp.o creation if .o has 64K+ sections [PR104617] On #define A(n) int foo1##n(void) { return 1##n; } #define B(n) A(n##0) A(n##1) A(n##2) A(n##3) A(n##4) A(n##5) A(n##6) A(n##7) A(n##8) A(n##9) #define C(n) B(n##0) B(n##1) B(n##2) B(n##3) B(n##4) B(n##5) B(n##6) B(n##7) B(n##8) B(n##9) #define D(n) C(n##0) C(n##1) C(n##2) C(n##3) C(n##4) C(n##5) C(n##6) C(n##7) C(n##8) C(n##9) #define E(n) D(n##0) D(n##1) D(n##2) D(n##3) D(n##4) D(n##5) D(n##6) D(n##7) D(n##8) D(n##9) E(0) E(1) E(2) D(30) D(31) C(320) C(321) C(322) C(323) C(324) C(325) B(3260) B(3261) B(3262) B(3263) A(32640) A(32641) A(32642) testcase with ./xgcc -B ./ -c -g -fpic -ffat-lto-objects -flto -O0 -o foo1.o foo1.c -ffunction-sections ./xgcc -B ./ -shared -g -fpic -flto -O0 -o foo1.so foo1.o /tmp/ccTW8mBm.debug.temp.o: file not recognized: file format not recognized (testcase too slow to be included into testsuite). The problem is clearly reported by readelf: readelf: foo1.o.debug.temp.o: Warning: Section 2 has an out of range sh_link value of 65321 readelf: foo1.o.debug.temp.o: Warning: Section 5 has an out of range sh_link value of 65321 readelf: foo1.o.debug.temp.o: Warning: Section 10 has an out of range sh_link value of 65323 readelf: foo1.o.debug.temp.o: Warning: [ 2]: Link field (65321) should index a symtab section. readelf: foo1.o.debug.temp.o: Warning: [ 5]: Link field (65321) should index a symtab section. readelf: foo1.o.debug.temp.o: Warning: [10]: Link field (65323) should index a string section. because simple_object_elf_copy_lto_debug_sections doesn't adjust sh_info and sh_link fields in ElfNN_Shdr if they are in between SHN_{LO,HI}RESERVE inclusive. 
Not adjusting those is incorrect though, SHN_{LO,HI}RESERVE range is only relevant to the 16-bit fields, mainly st_shndx in ElfNN_Sym where if one needs >= SHN_LORESERVE section number, SHN_XINDEX should be used instead and .symtab_shndx section should contain the real section index, and in ElfNN_Ehdr e_shnum and e_shstrndx fields, where if >= SHN_LORESERVE value is needed it should put those into Shdr[0].sh_{size,link}. But, sh_{link,info} are 32-bit fields which can contain any section index. Note, as simple-object-elf.c mentions, binutils from 2.12 to 2.18 (so before 2011) used to mishandle the > 63.75K sections case and assumed there is a hole in between the sections, but what simple_object_elf_copy_lto_debug_sections does wouldn't help in that case for the debug temp object creation, we'd need to detect the case also in that routine and take it into account in the remapping etc. I think it is not worth it given that it is over 10 years, if somebody needs 63.75K or more sections, better use more recent binutils. 2022-02-22 Jakub Jelinek <jakub@redhat.com> PR lto/104617 * simple-object-elf.c (simple_object_elf_match): Fix up URL in comment. (simple_object_elf_copy_lto_debug_sections): Remap sh_info and sh_link even if they are in the SHN_LORESERVE .. SHN_HIRESERVE range (inclusive). (cherry picked from commit `2f59f06761`)	2022-05-11 08:17:56 +02:00
Jakub Jelinek	eca81c14d5	valtrack: Avoid creating raw SUBREGs with VOIDmode argument [PR104557] After the recent r12-7240 simplify_immed_subreg changes, we bail on more simplify_subreg calls than before, e.g. apparently for decimal modes in the NaN representations we almost never preserve anything except the canonical {q,s}NaNs. simplify_gen_subreg will punt in such cases because a SUBREG with VOIDmode is not valid, but debug_lowpart_subreg wants to attempt even harder, even if e.g. target indicates certain mode combinations aren't valid for the backend, dwarf2out can still handle them. But a SUBREG from a VOIDmode operand is just too much, the inner mode is lost there. We'd need some new rtx that would be able to represent those cases. For now, just punt in those cases. 2022-02-17 Jakub Jelinek <jakub@redhat.com> PR debug/104557 * valtrack.c (debug_lowpart_subreg): Don't call gen_rtx_raw_SUBREG if expr has VOIDmode. * gcc.dg/dfp/pr104557.c: New test. (cherry picked from commit `1c2b44b523`)	2022-05-11 08:17:56 +02:00
Jakub Jelinek	b65f562b8f	c-family: Fix up shorten_compare for decimal vs. non-decimal float comparison [PR104510] The comment in shorten_compare says: /* If either arg is decimal float and the other is float, fail. */ but the callers of shorten_compare don't expect anything like failure as a possibility from the function, callers require that the function promotes the operands to the same type, whether the original selected restype_ptr one or some shortened. So, if we choose not to shorten, we should still promote to the original restype_ptr. 2022-02-16 Jakub Jelinek <jakub@redhat.com> PR c/104510 * c-common.c (shorten_compare): Convert original arguments to the original restype_ptr when mixing binary and decimal float. * gcc.dg/dfp/pr104510.c: New test. (cherry picked from commit `6e74122f0d`)	2022-05-11 08:17:55 +02:00
Jakub Jelinek	3aeecb1945	sanitizer: Use glibc _thread_db_sizeof_pthread symbol if present I've cherry-picked following fix from llvm-project. Recent glibcs have _thread_db_sizeof_pthread symbol variable which contains the size of struct pthread, so that sanitizers don't need to guess that and risk that it will change again. 2022-02-15 Jakub Jelinek <jakub@redhat.com> * sanitizer_common/sanitizer_linux_libcdep.cc: Cherry-pick llvm-project revision ef14b78d9a144ba81ba02083fe21eb286a88732b. (cherry picked from commit `c4c0aa6089`)	2022-05-11 08:16:53 +02:00
Jakub Jelinek	57e0795a44	openmp: Make finalize_task_copyfn order reproduceable [PR104517] The following testcase fails -fcompare-debug, because finalize_task_copyfn was invoked from splay tree destruction, whose order can in some cases depend on -g/-g0. The fix is to queue the task stmts that need copyfn in a vector and run finalize_task_copyfn on elements of that vector. 2022-02-15 Jakub Jelinek <jakub@redhat.com> PR debug/104517 * omp-low.c (task_cpyfns): New variable. (delete_omp_context): Don't call finalize_task_copyfn from here. (create_task_copyfn): Push task_stmt into task_cpyfns. (execute_lower_omp): Call finalize_task_copyfn here on entries from task_cpyfns vector and release the vector. (cherry picked from commit `6a0d6e7ca9`)	2022-05-11 07:58:40 +02:00
Jakub Jelinek	87cd4bc02f	c++: Don't reject GOTO_EXPRs to cdtor_label in potential_constant_expression_1 [PR104513] return in ctors on targetm.cxx.cdtor_returns_this () target like arm is emitted as GOTO_EXPR cdtor_label where at cdtor_label it emits RETURN_EXPR with the this. Similarly, in all dtors regardless of targetm.cxx.cdtor_returns_this () a return is emitted similarly. potential_constant_expression_1 was rejecting these gotos and so we incorrectly rejected these testcases, but actual cxx_eval* is apparently handling these just fine. I was a little bit worried that for the destruction of bases we wouldn't evaluate something we should, but as the testcase shows, that is evaluated through try ... finally and there is nothing after the cdtor_label. For arm there is RETURN_EXPR this; but we don't really care about the return value from ctors and dtors during the constexpr evaluation. I must say I don't see much the point of cdtor_labels at all, I'd think that with try ... finally around it for non-arm we could just RETURN_EXPR instead of the GOTO_EXPR and the try/finally gimplification would DTRT, and we could just add the right return value for the arm case. 2022-02-14 Jakub Jelinek <jakub@redhat.com> PR c++/104513 * constexpr.c (potential_constant_expression_1) <case GOTO_EXPR>: Don't punt if returns (target). * g++.dg/cpp1y/constexpr-104513.C: New test. (cherry picked from commit `02a981a8e5`)	2022-05-11 07:58:40 +02:00
Jakub Jelinek	77ee9b906d	asan: Fix up address sanitizer instrumentation of __builtin_alloca* if it can throw [PR104449] With -fstack-check* __builtin_alloca* can throw and the asan instrumentation of this builtin wasn't prepared for that case. The following patch fixes that by replacing the builtin with the replacement builtin and emitting any further insns on the fallthru edge. I haven't touched the hwasan code which most likely suffers from the same problem. 2022-02-12 Jakub Jelinek <jakub@redhat.com> PR sanitizer/104449 * asan.c: Include tree-eh.h. (handle_builtin_alloca): Handle the case when __builtin_alloca or __builtin_alloca_with_align can throw. * gcc.dg/asan/pr104449.c: New test. * g++.dg/asan/pr104449.C: New test. (cherry picked from commit `f0c7367b88`)	2022-05-11 07:58:39 +02:00
Jakub Jelinek	cb412e0e88	i386: Fix up cvtsd2ss splitter [PR104502] The following testcase ICEs, because AVX512F is enabled, AVX512VL is not, and the cvtsd2ss insn has %xmm0-15 as output operand and %xmm16-31 as input operand. For output operand %xmm16+ the splitter just gives up in such case, but for such input it just emits vmovddup which requires AVX512VL if either operand is EXT_REX_SSE_REG_P (when it is 128-bit). The following patch fixes it by treating that case like the pre-SSE3 output != input case - move the input to output and do everything on the output reg which is known to be < %xmm16. 2022-02-12 Jakub Jelinek <jakub@redhat.com> PR target/104502 * config/i386/i386.md (cvtsd2ss splitter): If operands[1] is xmm16+ and AVX512VL isn't available, move operands[1] to operands[0] first. * gcc.target/i386/pr104502.c: New test. (cherry picked from commit `0538d42cdd`)	2022-05-11 07:58:39 +02:00
Jakub Jelinek	c7e7ca915d	c++: Fix up constant expression __builtin_convertvector folding [PR104472] The following testcase ICEs because, due to -frounding-math, fold_const_call fails, that is, it returns NULL, and returning NULL from cxx_eval* is wrong: all the callers rely on them to return either the folded value or the original with non_constant_p = true. The following patch does that, and additionally falls through into the default case where there are diagnostics for the !ctx->quiet case too. 2022-02-11 Jakub Jelinek <jakub@redhat.com> PR c++/104472 * constexpr.c (cxx_eval_internal_function) <case IFN_VEC_CONVERT>: Only return fold_const_call result if it is non-NULL. Otherwise fall through into the default: case to return t, set non_constant_p and emit diagnostics if needed. * g++.dg/cpp0x/constexpr-104472.C: New test. (cherry picked from commit `84993d94e1`)	2022-05-11 07:58:39 +02:00
Jakub Jelinek	ffbe41f14f	combine: Fix ICE with substitution of CONST_INT into PRE_DEC argument [PR104446] The following testcase ICEs, because combine substitutes (insn 10 9 11 2 (set (reg/v:SI 7 sp [ a ]) (const_int 0 [0])) "pr104446.c":9:5 81 {movsi_internal} (nil)) (insn 13 11 14 2 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [0 S4 A32]) (reg:SI 85)) "pr104446.c":10:3 56 {pushsi2} (expr_list:REG_DEAD (reg:SI 85) (expr_list:REG_ARGS_SIZE (const_int 16 [0x10]) (nil)))) forming (insn 13 11 14 2 (set (mem/f:SI (pre_dec:SI (const_int 0 [0])) [0 S4 A32]) (reg:SI 85)) "pr104446.c":10:3 56 {pushsi2} (expr_list:REG_DEAD (reg:SI 85) (expr_list:REG_ARGS_SIZE (const_int 16 [0x10]) (nil)))) which is invalid RTL (pre_dec's argument must be a REG). I know substitution creates various forms of invalid RTL and hopes that invalid RTL just won't recog. But unfortunately in this case we ICE before we get to recog, as try_combine does: if (n_auto_inc) { int new_n_auto_inc = 0; for_each_inc_dec (newpat, count_auto_inc, &new_n_auto_inc); if (n_auto_inc != new_n_auto_inc) { if (dump_file && (dump_flags & TDF_DETAILS)) fprintf (dump_file, "Number of auto_inc expressions changed\n"); undo_all (); return 0; } } and for_each_inc_dec under the hood will do e.g. for the PRE_DEC case: case PRE_DEC: case POST_DEC: { poly_int64 size = GET_MODE_SIZE (GET_MODE (mem)); rtx r1 = XEXP (x, 0); rtx c = gen_int_mode (-size, GET_MODE (r1)); return fn (mem, x, r1, r1, c, data); } and that code rightfully expects that the PRE_DEC operand has non-VOIDmode (as it needs to be a REG) - gen_int_mode for VOIDmode results in ICE. I think it is better not to emit the clearly invalid RTL during substitution like we do for other cases, than to adding workarounds for invalid IL created by combine to rtlanal.cc and perhaps elsewhere. As for the testcase, of course it is UB at runtime to modify sp that way, but if such code is never reached, we must compile it, not to ICE on it. 
And I don't see why on other targets which use the autoinc rtxes much more it couldn't happen with other registers. 2022-02-11 Jakub Jelinek <jakub@redhat.com> PR middle-end/104446 * combine.c (subst): Don't substitute CONST_INTs into RTX_AUTOINC operands. * gcc.target/i386/pr104446.c: New test. (cherry picked from commit `fb76c0ad35`)	2022-05-11 07:58:38 +02:00
Jakub Jelinek	8c9f4bafe5	rs6000: Fix up vspltis_shifted [PR102140] The following testcase ICEs, because (const_vector:V4SI [ (const_int 0 [0]) repeated x3 (const_int -2147483648 [0xffffffff80000000]) ]) is recognized as valid easy_vector_constant in between split1 pass and end of RA. The problem is that such constants need to be split, and the only splitter for that is: (define_split [(set (match_operand:VM 0 "altivec_register_operand") (match_operand:VM 1 "easy_vector_constant_vsldoi"))] "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode) && can_create_pseudo_p ()" There is only a single splitting pass before RA, so after that finishes, if something gets matched in between that and end of RA (after that can_create_pseudo_p () would be no longer true), it will never be successfully split and we ICE at final.cc time or earlier. The i386 backend (and a few others) already use (cfun->curr_properties & PROP_rtl_split_insns) as a test for split1 pass finished, so that some insns that should be split during split1 and shouldn't be matched afterwards are properly guarded. So, the following patch does that for vspltis_shifted too. 2022-02-08 Jakub Jelinek <jakub@redhat.com> PR target/102140 * config/rs6000/rs6000.c (vspltis_shifted): Return false also if split1 pass has finished already. * gcc.dg/pr102140.c: New test. (cherry picked from commit `0c3e491a4e`)	2022-05-11 07:58:37 +02:00
Jakub Jelinek	b7e9d466fc	libgomp: Fix segfault with posthumous orphan tasks [PR104385] The following patch fixes crashes with posthumous orphan tasks. When a parent task finishes, gomp_clear_parent clears the parent pointers of its children tasks present in the parent->children_queue. But children that are still waiting for dependencies aren't in that queue yet, they will be added there only when the sibling they are waiting for exits. Unfortunately we were adding those tasks into the queues with the original task->parent which then causes crashes because that task is gone and freed. The following patch fixes that by clearing the parent field when we schedule such task for running by adding it into the queues and we know that the sibling task which is about to finish has NULL parent. 2022-02-08 Jakub Jelinek <jakub@redhat.com> PR libgomp/104385 * task.c (gomp_task_run_post_handle_dependers): If parent is NULL, clear task->parent. * testsuite/libgomp.c/pr104385.c: New test. (cherry picked from commit `0af7ef050a`)	2022-05-11 07:58:37 +02:00
Jakub Jelinek	7157e07447	libcpp: Fix up padding handling in funlike_invocation_p [PR104147] As mentioned in the PR, in some cases we preprocess incorrectly when we encounter an identifier which is defined as function-like macro, followed by at least 2 CPP_PADDING tokens and then some other identifier. On the following testcase, the problem is in the 3rd funlike_invocation_p, the tokens are CPP_NAME Y, CPP_PADDING (the pfile->avoid_paste shared token), CPP_PADDING (one created with padding_token, val.source is non-NULL and val.source->flags & PREV_WHITE is non-zero) and then another CPP_NAME. funlike_invocation_p remembers there was a padding token, but remembers the first one because of its condition, then the next token is the CPP_NAME, which is not CPP_OPEN_PAREN, so the CPP_NAME token is backed up, but as we can't easily backup more tokens, it pushes into a new context the padding token (the pfile->avoid_paste one). The net effect is that when Y is not defined as fun-like macro, we read Y, avoid_paste, padding_token, Y, while if Y is fun-like macro, we read Y, avoid_paste, avoid_paste, Y (the second avoid_paste is because that is how we handle end of a context). Now, for stringify_arg that is unfortunately a significant difference, which handles CPP_PADDING tokens with: if (token->type == CPP_PADDING) { if (source == NULL || (!(source->flags & PREV_WHITE) && token->val.source == NULL)) source = token->val.source; continue; } and later on /* Leading white space? */ if (dest - 1 != BUFF_FRONT (pfile->u_buff)) { if (source == NULL) source = token; if (source->flags & PREV_WHITE) *dest++ = ' '; } source = NULL; (and c-ppoutput.cc has similar code). So, when Y is not fun-like macro, ' ' is added because padding_token's val.source->flags & PREV_WHITE is non-zero, while when it is fun-like macro, we don't add ' ' in between, because source is NULL and so used from the next token (CPP_NAME Y), which doesn't have PREV_WHITE set.
Now, the funlike_invocation_p condition if (padding == NULL || (!(padding->flags & PREV_WHITE) && token->val.source == NULL)) padding = token; looks very similar to that in stringify_arg/c-ppoutput.cc, so I assume the intent was to do the same thing and pick the right padding. But there are significant differences. Both stringify_arg and c-ppoutput.cc don't remember the CPP_PADDING token, but its val.source instead, while in funlike_invocation_p we want to remember the padding token that has the significant information for stringify_arg/c-ppoutput.cc. So, IMHO we want to overwrite padding if: 1) padding == NULL (remember that there was any padding at all) 2) padding->val.source == NULL (this matches the source == NULL case in stringify_arg) 3) !(padding->val.source->flags & PREV_WHITE) && token->val.source == NULL (this matches the !(source->flags & PREV_WHITE) && token->val.source == NULL case in stringify_arg) 2022-02-01 Jakub Jelinek <jakub@redhat.com> PR preprocessor/104147 * macro.c (funlike_invocation_p): For padding prefer a token with val.source non-NULL especially if it has PREV_WHITE set on val.source->flags. Add gcc_assert that CPP_PADDING tokens don't have PREV_WHITE set in flags. * c-c++-common/cpp/pr104147.c: New test. (cherry picked from commit `95ac563540`)	2022-05-11 07:58:36 +02:00
Jakub Jelinek	7e05b86bcd	libcpp: Avoid PREV_WHITE and other random content on CPP_PADDING tokens The funlike_invocation_p macro never triggered, the other asserts did on some tests, see below for a full list. This seems to be caused by #pragma/_Pragma handling. do_pragma does: pfile->directive_result.src_loc = pragma_token_virt_loc; pfile->directive_result.type = CPP_PRAGMA; pfile->directive_result.flags = pragma_token->flags; pfile->directive_result.val.pragma = p->u.ident; when it sees a pragma, while start_directive does: pfile->directive_result.type = CPP_PADDING; and so does _cpp_do__Pragma. Now, for #pragma lex.cc will just ignore directive_result if it has CPP_PADDING type: if (_cpp_handle_directive (pfile, result->flags & PREV_WHITE)) { if (pfile->directive_result.type == CPP_PADDING) continue; result = &pfile->directive_result; } but destringize_and_run does not: if (pfile->directive_result.type == CPP_PRAGMA) { ... } else { count = 1; toks = XNEW (cpp_token); toks[0] = pfile->directive_result; and from there it will copy type member of CPP_PADDING, but all the other members from the last CPP_PRAGMA before it. Small testcase for it with no option (at least no -fopenmp or -fopenmp-simd). #pragma GCC push_options #pragma GCC ignored "-Wformat" #pragma GCC pop_options void foo () { _Pragma ("omp simd") for (int i = 0; i < 64; i++) ; } Here is a patch that replaces those toks = XNEW (cpp_token); toks[0] = pfile->directive_result; lines with toks = &pfile->avoid_paste; 2022-02-01 Jakub Jelinek <jakub@redhat.com> * directives.c (destringize_and_run): Push &pfile->avoid_paste instead of a copy of pfile->directive_result for the CPP_PADDING case. (cherry picked from commit `efc46b550f`)	2022-05-11 07:58:36 +02:00
Jakub Jelinek	02da8ea286	optabs: Don't create pseudos in prepare_cmp_insn when not allowed [PR102478] cond traps can be created during ce3 after reload (and e.g. PR103028 recently fixed some ce3 cond trap related bug, so I think often that works fine and we shouldn't disable cond traps after RA altogether), but it calls prepare_cmp_insn. This function can fail, so I don't see why we couldn't make it work after RA (in most cases it already just works). The first hunk is just an optimization which doesn't make sense after RA, so I've guarded it with can_create_pseudo_p. The second hunk is just a theoretical case, I don't have a testcase for it. prepare_cmp_insn has some other spots that can create pseudos, like when both operands have VOIDmode, or when it is BLKmode comparison, or not OPTAB_DIRECT, but I think none of that applies to ce3, we punt on BLKmode earlier, use OPTAB_DIRECT and shouldn't be comparing two VOIDmode CONST_INTs. 2022-01-21 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/102478 * optabs.c (prepare_cmp_insn): If !can_create_pseudo_p (), don't force_reg constants and for -fnon-call-exceptions fail if copy_to_reg would be needed. * gcc.dg/pr102478.c: New test. (cherry picked from commit `c2d9159717`)	2022-05-11 07:58:36 +02:00
Jakub Jelinek	95f6eb7ae7	match.pd, optabs: Avoid vectorization of {FLOOR,CEIL,ROUND}_{DIV,MOD}_EXPR [PR102860] power10 has modv4si3 expander and so vectorizes the following testcase where Fortran modulo is FLOOR_MOD_EXPR. optab_for_tree_code indicates that the optab for all the *_MOD_EXPR variants is umod_optab or smod_optab, but that isn't true, that optab actually expands just TRUNC_MOD_EXPR. For the other tree codes expmed.cc has code how to adjust the TRUNC_MOD_EXPR into those by emitting some extra comparisons and conditional updates. Similarly for *_DIV_EXPR, except in that case it actually needs both division and modulo. While it would be possible to handle it in expmed.cc for vectors as well, we'd need to be sure all the vector operations we need for that are available, and furthermore we wouldn't account for that in the costing. So, IMHO it is better to stop pretending those non-truncating (and non-exact) div/mod operations have an optab. For GCC 13, we should IMHO pattern match these in tree-vect-patterns.cc and transform them to truncating div/mod with follow-up adjustments and let the vectorizer vectorize that. As written in the PR, for signed operands: r = x %[fl] y; is r = x % y; if (r && (x ^ y) < 0) r += y; and d = x /[fl] y; is r = x % y; d = x / y; if (r && (x ^ y) < 0) --d; and r = x %[cl] y; is r = x % y; if (r && (x ^ y) >= 0) r -= y; and d = x /[cl] y; is r = x % y; d = x / y; if (r && (x ^ y) >= 0) ++d; (too lazy to figure out rounding div/mod now). I'll create a PR for that. The patch also extends a match.pd optimization that floor_mod on unsigned operands is actually trunc_mod. 2022-01-19 Jakub Jelinek <jakub@redhat.com> PR middle-end/102860 * match.pd (x %[fl] y -> x % y): New simplification for unsigned integral types. * optabs-tree.c (optab_for_tree_code): Return unknown_optab for {CEIL,FLOOR,ROUND}_{DIV,MOD}_EXPR with VECTOR_TYPE. * gfortran.dg/pr102860.f90: New test. (cherry picked from commit `ffc7f200ad`)	2022-05-11 07:58:36 +02:00
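The signed floor/ceil identities quoted in the commit above can be tried out in plain C, where `/` and `%` truncate toward zero. This is only a scalar sketch of those formulas, not code from GCC; the function names `floor_mod`, `floor_div`, `ceil_mod` and `ceil_div` are made up for illustration, and callers are assumed to avoid `y == 0` and the `INT_MIN / -1` overflow case.

```c
#include <assert.h>

/* r = x %[fl] y: remainder whose sign follows the divisor.  */
int floor_mod(int x, int y)
{
    int r = x % y;              /* truncating remainder */
    if (r && (x ^ y) < 0)       /* inexact and operands differ in sign */
        r += y;
    return r;
}

/* d = x /[fl] y: quotient rounded toward negative infinity.  */
int floor_div(int x, int y)
{
    int r = x % y;
    int d = x / y;              /* truncating quotient */
    if (r && (x ^ y) < 0)
        --d;
    return d;
}

/* r = x %[cl] y: remainder matching the ceiling quotient.  */
int ceil_mod(int x, int y)
{
    int r = x % y;
    if (r && (x ^ y) >= 0)      /* inexact and operands share a sign */
        r -= y;
    return r;
}

/* d = x /[cl] y: quotient rounded toward positive infinity.  */
int ceil_div(int x, int y)
{
    int r = x % y;
    int d = x / y;
    if (r && (x ^ y) >= 0)
        ++d;
    return d;
}
```

In each pair the invariant `d * y + r == x` still holds, which is a quick way to sanity-check the adjustments for any chosen inputs.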
Jakub Jelinek	e875dc9f97	ifcvt: Check for asm goto at the end of then_bb/else_bb in ifcvt [PR103908] On the following testcase, RTL ifcvt sees then_bb (note 7 6 8 3 [bb 3] NOTE_INSN_BASIC_BLOCK) (insn 8 7 9 3 (set (mem/c:SI (symbol_ref:DI ("b") [flags 0x2] <var_decl 0x7fdccf5b0cf0 b>) [1 b+0 S4 A32]) (const_int 1 [0x1])) "pr103908.c":6:7 81 {movsi_internal} (nil)) (jump_insn 9 8 13 3 (parallel [ (asm_operands/v ("# insn 1") ("") 0 [] [] [ (label_ref:DI 21) ] pr103908.c:7) (clobber (reg:CC 17 flags)) ]) "pr103908.c":7:5 -1 (expr_list:REG_UNUSED (reg:CC 17 flags) (nil)) -> 21) and similarly else_bb (just with a different asm_operands template). It checks that those basic blocks have a single successor and uses last_active_insn which intentionally skips over JUMP_INSNs, sees both basic blocks contain the same set and merges them (or if the sets are different, attempts some other noce optimization). But we can't assume that the jump, even when it has only a single successor, has no side-effects. The following patch fixes it by punting if test_bb ends with a JUMP_INSN that isn't onlyjump_p. 2022-01-06 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/103908 * ifcvt.c (bb_valid_for_noce_process_p): Punt on bbs ending with asm goto. * gcc.target/i386/pr103908.c: New test. (cherry picked from commit `80ad67e2af`)	2022-05-11 07:58:35 +02:00
Jakub Jelinek	86b98701bf	libcpp: Fix up ##__VA_OPT__ handling [PR89971] In the following testcase we incorrectly error about pasting / token with padding token (which is a result of __VA_OPT__); instead we should like e.g. for ##arg where arg is empty macro argument clear PASTE_LEFT flag of the previous token if __VA_OPT__ doesn't add any real tokens (which can happen either because the macro doesn't have any tokens passed to ... (i.e. __VA_ARGS__ expands to empty) or when __VA_OPT__ doesn't have any tokens in between ()s). 2021-12-30 Jakub Jelinek <jakub@redhat.com> PR preprocessor/89971 libcpp/ * macro.c (replace_args): For ##__VA_OPT__, if __VA_OPT__ expands to no tokens at all, drop PASTE_LEFT flag from the previous token. gcc/testsuite/ * c-c++-common/cpp/va-opt-9.c: New test. (cherry picked from commit `5545d1edcb`)	2022-05-11 07:58:35 +02:00
Jakub Jelinek	5d96fb401e	shrink-wrapping: Fix up prologue block discovery [PR103860] The following testcase is miscompiled, because a prologue which contains subq $8, %rsp instruction is emitted at the start of a basic block which contains conditional jump that depends on flags register set in an earlier basic block, the prologue instruction then clobbers those flags. Normally this case is checked by can_get_prologue predicate, but this is done only at the start of the loop. If we update pro later in the loop (because some bb shouldn't be duplicated) and then don't push anything further into vec and the vec is already empty (this can happen when the new pro is already in bb_with bitmask and either has no successors (that is the case in the testcase where that bb ends with a trap) or all the successors are already in bb_with, then the loop doesn't iterate further and can_get_prologue will not be checked. The following simple patch makes sure we call can_get_prologue even after the last former iteration when vec is already empty and only break from the loop afterwards (and only if the updating of pro done because of !can_get_prologue didn't push anything into vec again). 2021-12-30 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/103860 * shrink-wrap.c (try_shrink_wrapping): Make sure can_get_prologue is called on pro even if nothing further is pushed into vec. * gcc.dg/pr103860.c: New test. (cherry picked from commit `1820137ba6`)	2022-05-11 07:58:35 +02:00
Jakub Jelinek	d7dbfed37a	loop-invariant: Fix -fcompare-debug failure [PR103837] In the following testcase we have a -fcompare-debug failure, because can_move_invariant_reg doesn't ignore DEBUG_INSNs in its decisions. In the testcase we have due to uninitialized variable: loop_header debug_insn using pseudo84 pseudo84 = invariant insn using pseudo84 end loop and with -g decide not to move the pseudo84 = invariant before the loop header; in this case not resetting the debug insns might be fine. But, we could have also: pseudo84 = whatever loop_header debug_insn using pseudo84 pseudo84 = invariant insn using pseudo84 end loop and in that case not resetting the debug insns would result in wrong-debug. And, we don't really have generally a good substitution on what pseudo84 contains, it could inherit various values from different paths. So, the following patch ignores DEBUG_INSNs in the decisions, and if there are any that previously prevented the optimization, resets them before return true. 2021-12-28 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/103837 * loop-invariant.c (can_move_invariant_reg): Ignore DEBUG_INSNs in the decisions whether to return false or continue and right before returning true reset those debug insns that previously caused returning false. * gcc.dg/pr103837.c: New test. (cherry picked from commit `3c5fd3616f`)	2022-05-11 07:58:34 +02:00
Jakub Jelinek	790b8d49eb	bswap: Fix UB in find_bswap_or_nop_finalize [PR103435] On gcc.c-torture/execute/pr103376.c in the following code we trigger UB in the compiler. n->range is 8 because it is 64-bit load and rsize is 0 because it is a bswap sequence with load and known to be 0: /* Find real size of result (highest non-zero byte). */ if (n->base_addr) for (tmpn = n->n, rsize = 0; tmpn; tmpn >>= BITS_PER_MARKER, rsize++); else rsize = n->range; The shifts then shift uint64_t by 64 bits. For this case mask is 0 and we want both cmpxchg and cmpnop as 0, the operation can be done as both nop and bswap and callers will prefer nop. 2021-11-27 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/103435 * gimple-ssa-store-merging.c (find_bswap_or_nop_finalize): Avoid UB if n->range - rsize == 8, just clear both cmpnop and cmpxchg in that case. (cherry picked from commit `567d5f3d62`)	2022-05-11 07:58:34 +02:00
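The undefined behaviour described in the commit above comes from a general C rule: shifting a 64-bit value by 64 or more bits is undefined, so the full-width case has to be handled separately. The sketch below is not the GCC fix itself; `drop_low_bytes` is a hypothetical helper that only illustrates guarding the shift count.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: discard the lowest BYTES bytes of V.
   A shift count equal to or larger than the width of uint64_t is
   undefined behaviour in C, so that case returns 0 explicitly
   instead of evaluating v >> 64.  */
uint64_t drop_low_bytes(uint64_t v, unsigned bytes)
{
    if (bytes >= 8)
        return 0;
    return v >> (bytes * 8);
}
```

The commit takes the same general approach at a higher level: when `n->range - rsize == 8` it skips the shifts and clears both `cmpnop` and `cmpxchg`.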
Jakub Jelinek	4a3c9aabb0	fortran, debug: Fix up DW_AT_rank [PR103315] For DW_AT_rank we were emitting .uleb128 0x4 # DW_AT_rank .byte 0x97 # DW_OP_push_object_address .byte 0x23 # DW_OP_plus_uconst .uleb128 0x1c .byte 0x6 # DW_OP_deref on 64-bit and .uleb128 0x4 # DW_AT_rank .byte 0x97 # DW_OP_push_object_address .byte 0x23 # DW_OP_plus_uconst .uleb128 0x10 .byte 0x6 # DW_OP_deref on 32-bit. I think this is wrong, as dtype.rank field in the descriptor has unsigned char type, not pointer type nor pointer sized integral. E.g. if we have a REAL :: a(..) dummy argument, which is passed as a reference to the function descriptor, we want to evaluate a->dtype.rank. The above DWARF expressions perform *(uintptr_t *)(a + 0x1c) and *(uintptr_t *)(a + 0x10) respectively. The following patch changes those to: .uleb128 0x5 # DW_AT_rank .byte 0x97 # DW_OP_push_object_address .byte 0x23 # DW_OP_plus_uconst .uleb128 0x1c .byte 0x94 # DW_OP_deref_size .byte 0x1 and .uleb128 0x5 # DW_AT_rank .byte 0x97 # DW_OP_push_object_address .byte 0x23 # DW_OP_plus_uconst .uleb128 0x10 .byte 0x94 # DW_OP_deref_size .byte 0x1 which perform *(unsigned char *)(a + 0x1c) and *(unsigned char *)(a + 0x10) respectively. 2021-11-21 Jakub Jelinek <jakub@redhat.com> PR debug/103315 * trans-types.c (gfc_get_array_descr_info): Use DW_OP_deref_size 1 instead of DW_OP_deref for DW_AT_rank. (cherry picked from commit `da17c304e2`)	2022-05-11 07:58:34 +02:00
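To see why the width of the load matters in the commit above, the two DWARF expressions can be mimicked in C on a byte buffer. This is an illustrative sketch only: `read_u8` and `read_word` are hypothetical helpers, and the buffer stands in for a descriptor whose one-byte rank field is surrounded by unrelated data.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* One-byte load at base + off, analogous to DW_OP_deref_size 1.  */
unsigned char read_u8(const unsigned char *base, size_t off)
{
    return base[off];
}

/* Pointer-sized load at base + off, analogous to plain DW_OP_deref.
   memcpy is used so the read is valid at any alignment.  */
uintptr_t read_word(const unsigned char *base, size_t off)
{
    uintptr_t v;
    memcpy(&v, base + off, sizeof v);
    return v;
}
```

With a buffer filled with 0xff bytes and a rank of 3 stored at some offset, `read_u8` yields 3, while `read_word` also pulls in the neighbouring bytes and so yields some other value on either byte order.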
Jakub Jelinek	547692808b	c++: Fix up -fstrong-eval-order handling of call arguments [PR70796] For -fstrong-eval-order (default for C++17 and later) we make sure to gimplify arguments in the right order, but as the following testcase shows that is not enough. The problem is that some lvalues can satisfy the is_gimple_val / fb_rvalue predicate used by gimplify_arg for is_gimple_reg_type typed expressions, or is_gimple_lvalue / fb_either used for other types. E.g. in foo we have: C::C (&p, ++i, ++i) before gimplification where i is an automatic int variable and without this patch gimplify that as: i = i + 1; i = i + 1; C::C (&p, i, i); which means that the ctor is called with the original i value incremented by 2 in both arguments, while because the call is CALL_EXPR_ORDERED_ARGS the first argument should be different. Similarly in qux we have: B::B (&p, TARGET_EXPR <D.2274, (const struct A &) A::operator++ (&i)>, TARGET_EXPR <D.2275, (const struct A &) A::operator++ (&i)>) and gimplify it as: _1 = A::operator++ (&i); _2 = A::operator++ (&i); B::B (&p, MEM[(const struct A &)_1], MEM[(const struct A &)_2]); but because A::operator++ returns the passed in argument, again we have the same value in both cases due to gimplify_arg doing: /* Also strip a TARGET_EXPR that would force an extra copy. */ if (TREE_CODE (arg_p) == TARGET_EXPR) { tree init = TARGET_EXPR_INITIAL (arg_p); if (init && !VOID_TYPE_P (TREE_TYPE (init))) arg_p = init; } which is perfectly fine optimization for calls with unordered arguments, but breaks the ordered ones. Lastly, in corge, we have before gimplification: D::foo (NON_LVALUE_EXPR <p>, 3, ++p) and gimplify it as p = p + 4; D::foo (p, 3, p); which is again wrong, because the this argument isn't before the side-effects but after it.
The following patch adds cp_gimplify_arg wrapper, which if ordered and is_gimple_reg_type forces non-SSA_NAME is_gimple_variable result into a temporary, and if ordered, not is_gimple_reg_type and argument is TARGET_EXPR bypasses the gimplify_arg optimization. So, in foo with this patch we gimplify it as: i = i + 1; i.0_1 = i; i = i + 1; C::C (&p, i.0_1, i); in qux as: _1 = A::operator++ (&i); D.2312 = MEM[(const struct A &)_1]; _2 = A::operator++ (&i); B::B (&p, D.2312, MEM[(const struct A &)_2]); where D.2312 is a temporary and in corge as: p.9_1 = p; p = p + 4; D::foo (p.9_1, 3, p); The is_gimple_reg_type forcing into a temporary should be really cheap (I think even at -O0 it should be optimized if there is no modification in between), the aggregate copies might be more expensive but I think e.g. SRA or FRE should be able to deal with those if there are no intervening changes. But still, the patch tries to avoid those when it is cheaply provable that nothing bad happens (if no argument following it in the strong evaluation order doesn't have TREE_SIDE_EFFECTS, then even VAR_DECLs etc. shouldn't be modified after it). There is also an optimization to avoid doing that for this or for arguments with reference types as nothing can modify the parameter values during evaluation of other argument's side-effects. I've tried if e.g. int i = 1; return i << ++i; doesn't suffer from this problem as well, but it doesn't, the FE uses SAVE_EXPR <i>, SAVE_EXPR <i> << ++i; in that case which gimplifies the way we want (temporary in the first operand). 2021-11-19 Jakub Jelinek <jakub@redhat.com> PR c++/70796 * cp-gimplify.c (cp_gimplify_arg): New function. (cp_gimplify_expr): Use cp_gimplify_arg instead of gimplify_arg, pass true as last argument to it if there are any following arguments in strong evaluation order with side-effects. * g++.dg/cpp1z/eval-order11.C: New test. (cherry picked from commit `a84177aff7`)	2022-05-11 07:58:33 +02:00
Jakub Jelinek	1f02d664cd	lim: Reset flow sensitive info even for pointers [PR103192] Since 2014, lim has been clearing SSA_NAME_RANGE_INFO for integral SSA_NAMEs when moving them from conditional contexts inside of a loop into unconditional ones before the loop, but as the miscompilation of gimplify.c shows, we need to treat pointers the same: for them too we need to reset whether the pointer can/can't be null and the recorded pointer alignment. This fixes -FAIL: libgomp.c/../libgomp.c-c++-common/target-in-reduction-2.c (internal compiler error) -FAIL: libgomp.c/../libgomp.c-c++-common/target-in-reduction-2.c (test for excess errors) -UNRESOLVED: libgomp.c/../libgomp.c-c++-common/target-in-reduction-2.c compilation failed to produce executable -FAIL: libgomp.c++/../libgomp.c-c++-common/target-in-reduction-2.c (internal compiler error) -FAIL: libgomp.c++/../libgomp.c-c++-common/target-in-reduction-2.c (test for excess errors) -UNRESOLVED: libgomp.c++/../libgomp.c-c++-common/target-in-reduction-2.c compilation failed to produce executable -FAIL: libgomp.c++/target-in-reduction-2.C (internal compiler error) -FAIL: libgomp.c++/target-in-reduction-2.C (test for excess errors) -UNRESOLVED: libgomp.c++/target-in-reduction-2.C compilation failed to produce executable on both x86_64 and i686. 2021-11-17 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/103192 * tree-ssa-loop-im.c (move_computations_worker): Use reset_flow_sensitive_info instead of manually clearing SSA_NAME_RANGE_INFO and do it for all SSA_NAMEs, not just ones with integral types. (cherry picked from commit `077425c890`)	2022-05-11 07:58:33 +02:00
Jakub Jelinek	294682dc23	i386: Fix up x86 atomic_bit_test* expanders for !TARGET_HIMODE_MATH [PR103205] With !TARGET_HIMODE_MATH, the OPTAB_DIRECT expand_simple_binop calls fail and so we ICE. We don't really care if they are done promoted in SImode instead. 2021-11-15 Jakub Jelinek <jakub@redhat.com> PR target/103205 * config/i386/sync.md (atomic_bit_test_and_set<mode>, atomic_bit_test_and_complement<mode>, atomic_bit_test_and_reset<mode>): Use OPTAB_WIDEN instead of OPTAB_DIRECT. * gcc.target/i386/pr103205.c: New test. (cherry picked from commit `625eef42e3`)	2022-05-11 07:58:33 +02:00
Jakub Jelinek	4b14a4af62	dwarf2out: Fix up field_byte_offset [PR101378] For PCC_BITFIELD_TYPE_MATTERS field_byte_offset has quite large code to deal with it since many years ago (see it e.g. in GCC 3.2, although it used to be on HOST_WIDE_INTs, then on double_ints, now on offset_ints). But that code apparently isn't able to cope with members with empty class types with [[no_unique_address]] attribute, because the empty classes have non-zero type size but zero decl size and so one can end up from the computation with negative offset or offset 1 byte smaller than it should be. For !PCC_BITFIELD_TYPE_MATTERS, we just use tree_result = byte_position (decl); which seems exactly right even for the empty classes or anything which is not a bitfield (and for which we don't add DW_AT_bit_offset attribute). So, instead of trying to handle those no_unique_address members in the current already very complicated code, this limits it to bitfields. stor-layout.c PCC_BITFIELD_TYPE_MATTERS handling also affects only bitfields, twice it checks DECL_BIT_FIELD and once DECL_BIT_FIELD_TYPE. As discussed, this patch uses DECL_BIT_FIELD_TYPE check, because DECL_BIT_FIELD might be cleared for some bitfields with bitsizes multiple of BITS_PER_UNIT and e.g. struct S { int e; int a : 1, b : 7, c : 8, d : 16; } s; struct T { int a : 1, b : 7; long long c : 8; int d : 16; } t; int main () { s.c = 0x55; s.d = 0xaaaa; t.c = 0x55; t.d = 0xaaaa; s.e++; } has different debug info with DECL_BIT_FIELD check. 2021-11-11 Jakub Jelinek <jakub@redhat.com> PR debug/101378 * dwarf2out.c (field_byte_offset): Do the PCC_BITFIELD_TYPE_MATTERS handling only for DECL_BIT_FIELD_TYPE decls. * g++.dg/debug/dwarf2/pr101378.C: New test. (cherry picked from commit `10db757301`)	2022-05-11 07:58:32 +02:00
Jakub Jelinek	a76e866211	openmp: For default(none) ignore variables created by ubsan_create_data [PR64888] We weren't ignoring the ubsan variables created by c-ubsan.c before gimplification (others are added later). One way to fix this would be to introduce further UBSAN_ internal functions and lower it later (sanopt pass) like other ifns, this patch instead recognizes those magic vars by name/name of type and DECL_ARTIFICIAL and TYPE_ARTIFICIAL. 2021-10-21 Jakub Jelinek <jakub@redhat.com> PR middle-end/64888 gcc/c-family/ * c-omp.c (c_omp_predefined_variable): Return true also for ubsan_create_data created artificial variables. gcc/testsuite/ * c-c++-common/ubsan/pr64888.c: New test. (cherry picked from commit `40dd9d839e`)	2022-05-11 07:58:30 +02:00
Jakub Jelinek	662de049d6	c++: Don't reject calls through PMF during constant evaluation [PR102786] The following testcase incorrectly rejects the c initializer, while in the s.*a case cxx_eval_* sees .__pfn reads etc., in the s.*&S::foo case get_member_function_from_ptrfunc creates expressions which use INTEGER_CSTs with type of pointer to METHOD_TYPE. And cxx_eval_constant_expression rejects any INTEGER_CSTs with pointer type if they aren't 0. Either we'd need to make sure we defer such folding till cp_fold but the function and pfn_from_ptrmemfunc is used from lots of places, or the following patch just tries to reject only non-zero INTEGER_CSTs with pointer types if they don't point to METHOD_TYPE in the hope that all such INTEGER_CSTs with POINTER_TYPE to METHOD_TYPE are result of folding valid pointer-to-member function expressions. I don't immediately see how one could create such INTEGER_CSTs otherwise, cast of integers to PMF is rejected and would have the PMF RECORD_TYPE anyway, etc. 2021-10-19 Jakub Jelinek <jakub@redhat.com> PR c++/102786 * constexpr.c (cxx_eval_constant_expression): Don't reject INTEGER_CSTs with type POINTER_TYPE to METHOD_TYPE. * g++.dg/cpp2a/constexpr-virtual19.C: New test. (cherry picked from commit `f45610a452`)	2022-05-11 07:58:30 +02:00
Jakub Jelinek	be2c01c634	openmp: Fix up handling of OMP_PLACES=threads(1) When writing the places-*.c tests, I've noticed that we mishandle threads abstract name with specified num-places if num-places isn't a multiple of number of hw threads in a core. It then happily ignores the maximum count and overwrites for the remaining hw threads in a core further places that haven't been allocated. 2021-10-15 Jakub Jelinek <jakub@redhat.com> * config/linux/affinity.c (gomp_affinity_init_level_1): For level 1 after creating count places clean up and return immediately. * testsuite/libgomp.c/places-6.c: New test. * testsuite/libgomp.c/places-7.c: New test. * testsuite/libgomp.c/places-8.c: New test. (cherry picked from commit `4764049dd6`)	2022-05-11 07:58:30 +02:00
Jakub Jelinek	ee221ea5cc	c++: Fix apply_identity_attributes [PR102548] The following testcase ICEs on x86_64-linux with -m32 due to a bug in apply_identity_attributes. The function is being smart and attempts not to duplicate the chain unnecessarily, if either there are no attributes that affect type identity or there is possibly empty set of attributes that do not affect type identity in the chain followed by attributes that do affect type identity, it reuses that attribute chain. The function mishandles the cases where in the chain an attribute affects type identity and is followed by one or more attributes that don't affect type identity (and then perhaps some further ones that do). There are two bugs. One is that when we notice first attribute that doesn't affect type identity after first attribute that does affect type identity (with perhaps some further such attributes in the chain after it), we want to put into the new chain just attributes starting from (inclusive) first_ident and up to (exclusive) the current attribute a, but the code puts into the chain all attributes starting with first_ident, including the ones that do not affect type identity and if e.g. we have doesn't0 affects1 doesn't2 affects3 affects4 sequence of attributes, the resulting sequence would have affects1 doesn't2 affects3 affects4 affects3 affects4 attributes, i.e. one attribute that shouldn't be there and two attributes duplicated. That is fixed by the a2 -> a2 != a change. The second one is that we ICE once we see second attribute that doesn't affect type identity after an attribute that affects it. That is because first_ident is set to error_mark_node after handling the first attribute that doesn't affect type identity (i.e. after we've copied the [first_ident, a) set of attributes to the new chain) to denote that from that time on, each attribute that affects type identity should be copied whenever it is seen (the if (as && as->affects_type_identity) code does that correctly). 
But if that condition is false and first_ident is error_mark_node, we enter else if (first_ident) and use TREE_PURPOSE/TREE_VALUE/TREE_CHAIN on error_mark_node, which ICEs. When first_ident is error_mark_node and a doesn't affect type identity, we want to do nothing. So that is the && first_ident != error_mark_node chunk. 2021-10-05 Jakub Jelinek <jakub@redhat.com> PR c++/102548 * tree.c (apply_identity_attributes): Fix handling of the case where an attribute in the list doesn't affect type identity but some attribute before it does. * g++.target/i386/pr102548.C: New test. (cherry picked from commit `737f95bab5`)	2022-05-11 07:58:30 +02:00
Jakub Jelinek	f806bea0a6	ubsan: Use -fno{,-}sanitize=float-divide-by-zero for float division by zero recovery [PR102515] We've been using -f{,no-}sanitize-recover=integer-divide-by-zero to decide on the float -fsanitize=float-divide-by-zero instrumentation _abort suffix. This patch fixes it to use -f{,no-}sanitize-recover=float-divide-by-zero for it instead. 2021-10-01 Jakub Jelinek <jakub@redhat.com> Richard Biener <rguenther@suse.de> PR sanitizer/102515 gcc/c-family/ * c-ubsan.c (ubsan_instrument_division): Check the right flag_sanitize_recover bit, depending on which sanitization is done. gcc/testsuite/ * c-c++-common/ubsan/float-div-by-zero-2.c: New test. (cherry picked from commit `9c1a633d96`)	2022-05-11 07:58:30 +02:00
Jakub Jelinek	8837138d29	i386: Don't emit fldpi etc. if -frounding-math [PR102498] i387 has instructions to store some transcendental numbers into the top of stack. The problem is that what exact bit in the last place one gets for those depends on the current rounding mode, the CPU knows the number with slightly higher precision. The compiler assumes rounding to nearest when comparing them against constants in the IL, but at runtime the rounding can be different and so some of these depending on rounding mode and the constant could be 1 ulp higher or smaller than expected. We only support changing the rounding mode at runtime if the non-default -frounding-math option is used, so the following patch just disables using those constants if that flag is on. 2021-09-28 Jakub Jelinek <jakub@redhat.com> PR target/102498 * config/i386/i386.c (standard_80387_constant_p): Don't recognize special 80387 instruction XFmode constants if flag_rounding_math. * gcc.target/i386/pr102498.c: New test. (cherry picked from commit `3b7041e834`)	2022-05-11 07:58:29 +02:00
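
The i386 entry (8837138d29) rests on the fact that the last bit of a floating-point result can depend on the rounding mode in effect at run time. The following is a minimal sketch of that effect, not code from the commit or its testcase: it uses an ordinary division instead of the 80387 constant-load instructions, and the helper name `divide_with_rounding` is hypothetical. It assumes a target where `FE_DOWNWARD` and `FE_UPWARD` are defined; `volatile` is used so the division is carried out at run time rather than folded by the compiler under its round-to-nearest assumption.

```cpp
#include <cfenv>

// Hypothetical helper (not part of GCC): computes num / den at run time
// with the given rounding mode in effect, then restores the previous mode.
double divide_with_rounding(int mode, double num, double den) {
  // volatile operands and result keep the division from being folded at
  // compile time or moved across the fesetround calls.
  volatile double n = num;
  volatile double d = den;
  const int saved = std::fegetround();
  std::fesetround(mode);
  volatile double q = n / d;
  std::fesetround(saved);
  return q;
}
```

Calling `divide_with_rounding(FE_DOWNWARD, 1.0, 3.0)` and `divide_with_rounding(FE_UPWARD, 1.0, 3.0)` should give two results that differ in the last place, since 1/3 is not exactly representable. Code that changes the rounding mode like this is only supported by GCC when built with `-frounding-math`, which is why the patch keys off that flag.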

