Commit Graph

171114 Commits

Author SHA1 Message Date
Jakub Jelinek 4f18893fbd sparc: Preserve ORIGINAL_REGNO in epilogue_renumber [PR105257]
The following testcase ICEs, because the pic register is
(reg:DI 24 %i0 [109]) and is used in the delay slot of a return.
We invoke epilogue_renumber and that changes it to
(reg:DI 8 %o0) which no longer satisfies sparc_pic_register_p
predicate, so we don't recognize the insn anymore.

The following patch fixes that by preserving ORIGINAL_REGNO if
specified, so we get (reg:DI 8 %o0 [109]) instead.

2022-04-19  Jakub Jelinek  <jakub@redhat.com>

	PR target/105257
	* config/sparc/sparc.c (epilogue_renumber): If ORIGINAL_REGNO,
	use gen_raw_REG instead of gen_rtx_REG and copy over also
	ORIGINAL_REGNO.  Use return 0; instead of /* fallthrough */.

	* gcc.dg/pr105257.c: New test.

(cherry picked from commit eeca2b8bd0)
2022-05-11 08:17:59 +02:00
Jakub Jelinek 30895a25ea c++: Fix up CONSTRUCTOR_PLACEHOLDER_BOUNDARY handling [PR105256]
The CONSTRUCTOR_PLACEHOLDER_BOUNDARY bit is supposed to separate
PLACEHOLDER_EXPRs that should be replaced by one object or subobjects of it
(variable, TARGET_EXPR slot, ...) from other PLACEHOLDER_EXPRs that should
be replaced by different objects or subobjects.
The bit is set when finding PLACEHOLDER_EXPRs inside of a CONSTRUCTOR, not
looking into nested CONSTRUCTOR_PLACEHOLDER_BOUNDARY ctors, and we prevent
elision of TARGET_EXPRs (through TARGET_EXPR_NO_ELIDE) whose initializer
is a CONSTRUCTOR_PLACEHOLDER_BOUNDARY ctor.  The following testcase ICEs
though, we don't replace the placeholders in there at all, because
CONSTRUCTOR_PLACEHOLDER_BOUNDARY isn't set on the TARGET_EXPR_INITIAL
ctor, but on a ctor nested in such a ctor.  replace_placeholders should be
run on the whole TARGET_EXPR slot.

So, the following patch fixes it by moving the CONSTRUCTOR_PLACEHOLDER_BOUNDARY
bit from nested CONSTRUCTORs to the CONSTRUCTOR containing those (but only
if it is closely nested, if there is some other tree sandwiched in between,
it doesn't do it).

2022-04-19  Jakub Jelinek  <jakub@redhat.com>

	PR c++/105256
	* typeck2.c (process_init_constructor_array,
	process_init_constructor_record, process_init_constructor_union): Move
	CONSTRUCTOR_PLACEHOLDER_BOUNDARY flag from CONSTRUCTOR elements to the
	containing CONSTRUCTOR.

	* g++.dg/cpp0x/pr105256.C: New test.

(cherry picked from commit eb03e42459)
2022-05-11 08:17:59 +02:00
Jakub Jelinek 14407aba9f i386: Fix ICE caused by ix86_emit_i387_log1p [PR105214]
The following testcase ICEs, because ix86_emit_i387_log1p attempts to
emit something like
  if (cond)
    some_code1;
  else
    some_code2;
and emits a conditional jump using emit_jump_insn (standard way in
the file) and an unconditional jump using emit_jump.
The problem with that is that if there is pending stack adjustment,
it isn't emitted before the conditional jump, but is before the
unconditional jump and therefore stack is adjusted only conditionally
(at the end of some_code1 above), which makes dwarf2 pass unhappy about it
but is a serious wrong-code even if it doesn't ICE.

This can be fixed either by emitting pending stack adjust before the
conditional jump as the following patch does, or by not using
  emit_jump (label2);
and instead hand inlining what that function does except for the
pending stack adjustment, like:
  emit_jump_insn (targetm.gen_jump (label2));
  emit_barrier ();
In that case there will be no stack adjustment in the sequence and
it will be done later on somewhere else.

2022-04-12  Jakub Jelinek  <jakub@redhat.com>

	PR target/105214
	* config/i386/i386.c (ix86_emit_i387_log1p): Call
	do_pending_stack_adjust.

	* gcc.dg/asan/pr105214.c: New test.

(cherry picked from commit d481d13786)
2022-05-11 08:17:58 +02:00
Jakub Jelinek fd68b02744 builtins: Fix up expand_builtin_int_roundingfn_2 [PR105211]
The expansion of __builtin_iround{,f,l} etc. builtins in some cases
emits calls to a different fallback builtin.  To locate the right builtin
it uses mathfn_built_in_1 with the type of the first argument.
If its TYPE_MAIN_VARIANT is {float,double,long_double}_type_node, all is
fine, but on the following testcase, because GIMPLE considers scalar
float conversions between types with the same mode as useless,
TYPE_MAIN_VARIANT of the arg's type is float32_type_node and because there
isn't __builtin_lroundf32 returns NULL and we ICE.

This patch will first try the type of the first argument of the builtin's
prototype (so that say on sizeof(double)==sizeof(long double) target it honors
whether it was a *l or non-*l call; though even that can't be 100% trusted,
user could incorrectly prototype it) and as fallback the type argument.
If neither works, doesn't fallback.

2022-04-11  Jakub Jelinek  <jakub@redhat.com>

	PR rtl-optimization/105211
	* builtins.c (expand_builtin_int_roundingfn_2): If mathfn_built_in_1
	fails for TREE_TYPE (arg), retry it with
	TREE_VALUE (TYPE_ARG_TYPES (TREE_TYPE (fndecl))) and if even that
	fails, emit call normally.

	* gcc.dg/pr105211.c: New test.

(cherry picked from commit 91a38e8a84)
2022-05-11 08:17:58 +02:00
Jakub Jelinek 6547662a2a c-family: Initialize ridpointers for __int128 etc. [PR105186]
The following testcase ICEs with C++ and is incorrectly rejected with C.
The reason is that both FEs use ridpointers identifiers for CPP_KEYWORD
and value or u.value for CPP_NAME e.g. when parsing attributes or OpenMP
directives etc., like:
         /* Save away the identifier that indicates which attribute
            this is.  */
         identifier = (token->type == CPP_KEYWORD)
           /* For keywords, use the canonical spelling, not the
              parsed identifier.  */
           ? ridpointers[(int) token->keyword]
           : id_token->u.value;

         identifier = canonicalize_attr_name (identifier);
I've tried to change those to use ridpointers only if non-NULL and otherwise
use the value/u.value even for CPP_KEYWORDS, but that was a large 10 hunks
patch.

The following patch instead just initializes ridpointers for the __intNN
keywords.  It can't be done earlier before we record_builtin_type as there
are 2 different spellings and if we initialize those ridpointers early, the
second record_builtin_type fails miserably.

2022-04-11  Jakub Jelinek  <jakub@redhat.com>

	PR c++/105186
	* c-common.c (c_common_nodes_and_builtins): After registering __int%d
	and __int%d__ builtin types, initialize corresponding ridpointers
	entry.

	* c-c++-common/pr105186.c: New test.

(cherry picked from commit 083e8e66d2)
2022-05-11 08:17:58 +02:00
Jakub Jelinek cddca3b79f fold-const: Fix up make_range_step [PR105189]
The following testcase is miscompiled, because fold_truth_andor
incorrectly folds
(unsigned) foo () >= 0U && 1
into
foo () >= 0
For the unsigned comparison (which is useless in this case,
as >= 0U is always true, but hasn't been folded yet), previous
make_range_step derives exp (unsigned) foo () and +[0U, -]
range for it.  Next we process the NOP_EXPR.  We have special code
for unsigned to signed casts, already earlier punt if low or high
aren't representable in arg0_type or if it is a narrowing conversion.
For the signed to unsigned casts, I think if high is specified we
are still fine, as we punt for non-representable values in arg0_type,
n_high is then still representable and so was smaller or equal to
signed maximum and either low is not present (equivalent to 0U), or
low must be smaller or equal to high and so for unsigned exp
+[low, high] the signed exp +[n_low, n_high] will be correct.
Similarly, if both low and high aren't specified (always true or
always false), it is ok too.
But if we have for unsigned exp +[low, -] or -[low, -], using
+[n_low, -] or -[n_high, -] is incorrect.  Because low is smaller
or equal to signed maximum and high is unspecified (i.e. unsigned
maximum), when signed that range is a union of +[n_low, -] and
+[-, -1] which is equivalent to -[0, n_low-1], unless low
is 0, in that case we can treat it as [-, -].

2022-04-08  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/105189
	* fold-const.c (make_range_step): Fix up handling of
	(unsigned) x +[low, -] ranges for signed x if low fits into
	typeof (x).

	* g++.dg/torture/pr105189.C: New test.

(cherry picked from commit 5e6597064b)
2022-05-11 08:17:58 +02:00
Jakub Jelinek 5169f5756e combine: Don't record for UNDO_MODE pointers into regno_reg_rtx array [PR104985]
The testcase in the PR fails under valgrind on mips64 (but only Martin
can reproduce, I couldn't).
But the problem reported there is that SUBST_MODE remembers addresses
into the regno_reg_rtx array, then some splitter needs a new pseudo
and calls gen_reg_rtx, which reallocates the regno_reg_rtx array
and finally undo operation is done and dereferences the old regno_reg_rtx
entry.
The rtx values stored in regno_reg_rtx array seems to be created
by gen_reg_rtx only and since then aren't modified, all we do for it
is adjusting its fields (e.g. adjust_reg_mode that SUBST_MODE does).

So, I think it is useless to use where.r for UNDO_MODE and store
&regno_reg_rtx[regno] in struct undo, we can store just
regno_reg_rtx[regno] (i.e. pointer to the REG itself instead of
pointer to pointer to REG) or could also store just the regno.

The following patch does the latter, and because SUBST_MODE no longer
needs to be a macro, changes all SUBST_MODE uses to subst_mode.

2022-04-06  Jakub Jelinek  <jakub@redhat.com>

	PR rtl-optimization/104985
	* combine.c (struct undo): Add where.regno member.
	(do_SUBST_MODE): Rename to ...
	(subst_mode): ... this.  Change first argument from rtx * into int,
	operate on regno_reg_rtx[regno] and save regno into where.regno.
	(SUBST_MODE): Remove.
	(try_combine): Use subst_mode instead of SUBST_MODE, change first
	argument from regno_reg_rtx[whatever] to whatever.  For UNDO_MODE, use
	regno_reg_rtx[undo->where.regno] instead of *undo->where.r.
	(undo_to_marker): For UNDO_MODE, use regno_reg_rtx[undo->where.regno]
	instead of *undo->where.r.
	(simplify_set): Use subst_mode instead of SUBST_MODE, change first
	argument from regno_reg_rtx[whatever] to whatever.

(cherry picked from commit 61bee6aed2)
2022-05-11 08:17:58 +02:00
Jakub Jelinek a78199c69f i386: Fix up ix86_expand_vector_init_general [PR105123]
The following testcase is miscompiled on ia32.
The problem is that at -O0 we end up with:
  vector(4) short unsigned int _1;
  short unsigned int u.0_3;
...
  _1 = {u.0_3, u.0_3, u.0_3, u.0_3};
statement (dead) which is wrongly expanded.
elt is (subreg:HI (reg:SI 83 [ u.0_3 ]) 0), tmp_mode SImode,
so after convert_mode we start with word (reg:SI 83 [ u.0_3 ]).
The intent is to manually broadcast that value to 2 SImode parts,
but because we pass word as target to expand_simple_binop, it will
overwrite (reg:SI 83 [ u.0_3 ]) and we end up with 0:
   10: {r83:SI=r83:SI<<0x10;clobber flags:CC;}
   11: {r83:SI=r83:SI|r83:SI;clobber flags:CC;}
   12: {r83:SI=r83:SI<<0x10;clobber flags:CC;}
   13: {r83:SI=r83:SI|r83:SI;clobber flags:CC;}
   14: clobber r110:V4HI
   15: r110:V4HI#0=r83:SI
   16: r110:V4HI#4=r83:SI
as the two ors do nothing and two shifts each by 16 left shift it all
away.
The following patch fixes that by using NULL_RTX target, so we expand it as
   10: {r110:SI=r83:SI<<0x10;clobber flags:CC;}
   11: {r111:SI=r110:SI|r83:SI;clobber flags:CC;}
   12: {r112:SI=r83:SI<<0x10;clobber flags:CC;}
   13: {r113:SI=r112:SI|r83:SI;clobber flags:CC;}
   14: clobber r114:V4HI
   15: r114:V4HI#0=r111:SI
   16: r114:V4HI#4=r113:SI
instead.

Another possibility would be to pass NULL_RTX only when word == elt
and word otherwise, where word would necessarily be a pseudo from the first
shift after passing NULL_RTX there once or pass NULL_RTX for the shift and
word for ior.

2022-04-03  Jakub Jelinek  <jakub@redhat.com>

	PR target/105123
	* config/i386/i386.c (ix86_expand_vector_init_general): Avoid
	using word as target for expand_simple_binop when doing ASHIFT and
	IOR.

	* gcc.target/i386/pr105123.c: New test.

(cherry picked from commit e1a74058b7)
2022-05-11 08:17:57 +02:00
Jakub Jelinek bee22b8bc1 ubsan: Fix ICE due to -fsanitize=object-size [PR105093]
The following testcase ICEs, because for a volatile X & RESULT_DECL
ubsan wants to take address of that reference.  instrument_object_size
is called with x, so the base is equal to the access and the var
is automatic, so there is no risk of an out of bounds access for it.
Normally we wouldn't instrument those because we fold address of the
t - address of inner to 0, add constant size of the decl and it is
equal to what __builtin_object_size computes.  But the volatile
results in the subtraction not being folded.

The first hunk fixes it by punting if we access the whole automatic
decl, so that even volatile won't cause a problem.
The second hunk (not strictly needed for this testcase) is similar
to what has been added to asan.cc recently, if we actually take
address of a decl and keep it in the IL, we better mark it addressable.

2022-03-30  Jakub Jelinek  <jakub@redhat.com>

	PR sanitizer/105093
	* ubsan.c (instrument_object_size): If t is equal to inner and
	is a decl other than global var, punt.  When emitting call to
	UBSAN_OBJECT_SIZE ifn, make sure base is addressable.

	* g++.dg/ubsan/pr105093.C: New test.

(cherry picked from commit e3e68fa59e)
2022-05-11 08:17:57 +02:00
Jakub Jelinek 3bbd4ce93b c++: Fix up __builtin_convertvector parsing
Jonathan reported on IRC that we don't parse
__builtin_bit_cast (type, val).field
etc.
The problem is that for these 2 builtins we return from
cp_parser_postfix_expression instead of setting postfix_expression
to the cp_build_* value and falling through into the postfix regression
suffix handling loop.

2022-03-26  Jakub Jelinek  <jakub@redhat.com>

	* parser.c (cp_parser_postfix_expression)
	<case RID_BILTIN_CONVERTVECTOR>: Don't
	return cp_build_vec_convert result right away, instead
	set postfix_expression to it and break.

	* c-c++-common/builtin-convertvector-3.c: New test.

(cherry picked from commit 1806829e08)
2022-05-11 08:17:57 +02:00
Jakub Jelinek 54bccc8e05 c++: extern thread_local declarations in constexpr [PR104994]
C++14 to C++20 apparently should allow extern thread_local declarations in
constexpr functions, however useless they are there (because accessing
such vars is not valid in a constant expression, perhaps sizeof/decltype).
P2242 changed that for C++23 to passing through declaration but
https://cplusplus.github.io/CWG/issues/2552.html
has been filed for it yesterday.

2022-03-24  Jakub Jelinek  <jakub@redhat.com>

	PR c++/104994
	* constexpr.c (potential_constant_expression_1): Don't diagnose extern
	thread_local declarations.
	* decl.c (start_decl): Likewise.

	* g++.dg/cpp2a/constexpr-nonlit7.C: New test.

(cherry picked from commit 72124f487c)
2022-05-11 08:17:57 +02:00
Jakub Jelinek c1a8261b70 i386: Don't emit pushf;pop for __builtin_ia32_readeflags_u* with unused lhs [PR104971]
__builtin_ia32_readeflags_u* aren't marked const or pure I think
intentionally, so that they aren't CSEd from different regions of a function
etc. because we don't and can't easily track all dependencies between
it and surrounding code (if somebody looks at the condition flags, it is
dependent on the vast majority of instructions).
But the builtin itself doesn't have any side-effects, so if we ignore the
result of the builtin, there is no point to emit anything.

There is a LRA bug that miscompiles the testcase which this patch makes
latent, which is certainly worth fixing too, but IMHO this change
(and maybe ix86_gimple_fold_builtin too which would fold it even earlier
when it looses lhs) is worth it as well.

2022-03-19  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/104971
	* config/i386/i386.c
	(ix86_expand_builtin) <case IX86_BUILTIN_READ_FLAGS>: If ignore,
	don't push/pop anything and just return const0_rtx.

	* gcc.target/i386/pr104971.c: New test.

(cherry picked from commit b60bc913cc)
2022-05-11 08:17:57 +02:00
Jakub Jelinek 0e02b8468b c, c++, c-family: -Wshift-negative-value and -Wshift-overflow* tweaks for -fwrapv and C++20+ [PR104711]
As mentioned in the PR, different standards have different definition
on what is an UB left shift.  They all agree on out of bounds (including
negative) shift count.
The rules used by ubsan are:
C99-C2x ((unsigned) x >> (uprecm1 - y)) != 0 then UB
C++11-C++17 x < 0 || ((unsigned) x >> (uprecm1 - y)) > 1 then UB
C++20 and later everything is well defined
Now, for C++20, I've in the P1236R1 implementation added an early
exit for -Wshift-overflow* warning so that it never warns, but apparently
-Wshift-negative-value remained as is.  As it is well defined in C++20,
the following patch doesn't enable -Wshift-negative-value from -Wextra
anymore for C++20 and later, if users want for compatibility with C++17
and earlier get the warning, they still can by using -Wshift-negative-value
explicitly.
Another thing is -fwrapv, that is an extension to the standards, so it is up
to us how exactly we define that case.  Our ubsan code treats
TYPE_OVERFLOW_WRAPS (type0) and cxx_dialect >= cxx20 the same as only
diagnosing out of bounds shift count and nothing else and IMHO it is most
sensical to treat -fwrapv signed left shifts the same as C++20 treats
them, https://eel.is/c++draft/expr.shift#2
"The value of E1 << E2 is the unique value congruent to E1×2^E2 modulo 2^N,
where N is the width of the type of the result.
[Note 1: E1 is left-shifted E2 bit positions; vacated bits are zero-filled.
— end note]"
with no UB dependent on the E1 values.  The UB is only
"The behavior is undefined if the right operand is negative, or greater
than or equal to the width of the promoted left operand."
Under the hood (except for FEs and ubsan from FEs) GCC middle-end doesn't
consider UB in left shifts dependent on the first operand's value, only
the out of bounds shifts.

While this change isn't a regression, I'd think it is useful for GCC 12,
it doesn't add new warnings, but just removes warnings that aren't
appropriate.

2022-03-09  Jakub Jelinek  <jakub@redhat.com>

	PR c/104711
gcc/
	* doc/invoke.texi (-Wextra): Document that -Wshift-negative-value
	is enabled by it only for C++11 to C++17 rather than for C++03 or
	later.
	(-Wshift-negative-value): Similarly (except here we stated
	that it is enabled for C++11 or later).
gcc/c-family/
	* c-opts.c (c_common_post_options): Don't enable
	-Wshift-negative-value from -Wextra for C++20 or later.
	* c-ubsan.c (ubsan_instrument_shift): Adjust comments.
	* c-warn.c (maybe_warn_shift_overflow): Use TYPE_OVERFLOW_WRAPS
	instead of TYPE_UNSIGNED.
gcc/c/
	* c-fold.c (c_fully_fold_internal): Don't emit
	-Wshift-negative-value warning if TYPE_OVERFLOW_WRAPS.
	* c-typeck.c (build_binary_op): Likewise.
gcc/cp/
	* constexpr.c (cxx_eval_check_shift_p): Use TYPE_OVERFLOW_WRAPS
	instead of TYPE_UNSIGNED.
	* typeck.c (cp_build_binary_op): Don't emit
	-Wshift-negative-value warning if TYPE_OVERFLOW_WRAPS.
gcc/testsuite/
	* c-c++-common/Wshift-negative-value-1.c: Remove
	dg-additional-options, instead in target selectors of each diagnostic
	check for exact C++ versions where it should be diagnosed.
	* c-c++-common/Wshift-negative-value-2.c: Likewise.
	* c-c++-common/Wshift-negative-value-3.c: Likewise.
	* c-c++-common/Wshift-negative-value-4.c: Likewise.
	* c-c++-common/Wshift-negative-value-7.c: New test.
	* c-c++-common/Wshift-negative-value-8.c: New test.
	* c-c++-common/Wshift-negative-value-9.c: New test.
	* c-c++-common/Wshift-negative-value-10.c: New test.
	* c-c++-common/Wshift-overflow-1.c: Remove
	dg-additional-options, instead in target selectors of each diagnostic
	check for exact C++ versions where it should be diagnosed.
	* c-c++-common/Wshift-overflow-2.c: Likewise.
	* c-c++-common/Wshift-overflow-5.c: Likewise.
	* c-c++-common/Wshift-overflow-6.c: Likewise.
	* c-c++-common/Wshift-overflow-7.c: Likewise.
	* c-c++-common/Wshift-overflow-8.c: New test.
	* c-c++-common/Wshift-overflow-9.c: New test.
	* c-c++-common/Wshift-overflow-10.c: New test.
	* c-c++-common/Wshift-overflow-11.c: New test.
	* c-c++-common/Wshift-overflow-12.c: New test.

(cherry picked from commit d76511138d)
2022-05-11 08:17:57 +02:00
Jakub Jelinek 2a829a4e85 c++: Don't suggest cdtor or conversion op identifiers in spelling hints [PR104806]
On the following testcase, we emit "did you mean '__dt '?" in the error
message.  "__dt " shows there because it is dtor_identifier, but we
shouldn't suggest those to the user, they are purely internal and can't
be really typed by the user because of the final space in it.

2022-03-08  Jakub Jelinek  <jakub@redhat.com>

	PR c++/104806
	* search.c (lookup_field_fuzzy_info::fuzzy_lookup_field): Ignore
	identifiers with space at the end.

	* g++.dg/spellcheck-pr104806.C: New test.

(cherry picked from commit e480c3c06d)
2022-05-11 08:17:57 +02:00
Jakub Jelinek e763af00a2 s390: Fix up *cmp_and_trap_unsigned_int<mode> constraints [PR104775]
The following testcase fails to assemble due to clgte %r6,0(%r1,%r10)
insn not being accepted by assembler.
My rough understanding is that in the RSY-b insn format the spot
in other formats used for index registers is used instead for M3 what
kind of comparison it is, so this patch follows what other similar
instructions use for constraint (i.e. one without index register).

2022-03-07  Jakub Jelinek  <jakub@redhat.com>

	PR target/104775
	* config/s390/s390.md (*cmp_and_trap_unsigned_int<mode>): Use
	S constraint instead of T in the last alternative.

	* gcc.target/s390/pr104775.c: New test.

(cherry picked from commit 2472dcaa8c)
2022-05-11 08:17:56 +02:00
Jakub Jelinek 870a9a8e82 match.pd: Further complex simplification fixes [PR104675]
Mark mentioned in the PR further 2 simplifications that also ICE
with complex types.
For these, eventually (but IMO GCC 13 materials) we could support it
for vector types if it would be uniform vector constants.
Currently integer_pow2p is true only for INTEGER_CSTs and COMPLEX_CSTs
and we can't use bit_and etc. for complex type.

2022-02-25  Jakub Jelinek  <jakub@redhat.com>
	    Marc Glisse  <marc.glisse@inria.fr>

	PR tree-optimization/104675
	* match.pd (t * 2U / 2 -> t & (~0 / 2), t / 2U * 2 -> t & ~1):
	Restrict simplifications to INTEGRAL_TYPE_P.

	* gcc.dg/pr104675-3.c : New test.

(cherry picked from commit f62115c9b7)
2022-05-11 08:17:56 +02:00
Jakub Jelinek 5c742d9a7e rs6000: Use rs6000_emit_move in movmisalign<mode> expander [PR104681]
The following testcase ICEs, because for some strange reason it decides to use
movmisaligntf during expansion where the destination is MEM and source is
CONST_DOUBLE.  For normal mov<mode> expanders the rs6000 backend uses
rs6000_emit_move to ensure that if one operand is a MEM, the other is a REG
and a few other things, but for movmisalign<mode> nothing enforced this.
The middle-end documents that movmisalign<mode> shouldn't fail, so we can't
force that through predicates or condition on the expander.

2022-02-25  Jakub Jelinek  <jakub@redhat.com>

	PR target/104681
	* config/rs6000/vector.md (movmisalign<mode>): Use rs6000_emit_move.

	* g++.dg/opt/pr104681.C: New test.

(cherry picked from commit 3885a122f8)
2022-05-11 08:17:56 +02:00
Jakub Jelinek ff9fe8ef03 match.pd: Don't create BIT_NOT_EXPRs for COMPLEX_TYPE [PR104675]
We don't support BIT_{AND,IOR,XOR,NOT}_EXPR on complex types,
&/|/^ are just rejected for them, and ~ is parsed as CONJ_EXPR.
So, we should avoid simplifications which turn valid complex type
expressions into something that will ICE during expansion.

2022-02-25  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/104675
	* match.pd (-A - 1 -> ~A, -1 - A -> ~A): Don't simplify for
	COMPLEX_TYPE.

	* gcc.dg/pr104675-1.c: New test.
	* gcc.dg/pr104675-2.c: New test.

(cherry picked from commit 758671b88b)
2022-05-11 08:17:56 +02:00
Jakub Jelinek 2692cbbc12 libiberty: Fix up debug.temp.o creation if *.o has 64K+ sections [PR104617]
On
 #define A(n) int foo1##n(void) { return 1##n; }
 #define B(n) A(n##0) A(n##1) A(n##2) A(n##3) A(n##4) A(n##5) A(n##6) A(n##7) A(n##8) A(n##9)
 #define C(n) B(n##0) B(n##1) B(n##2) B(n##3) B(n##4) B(n##5) B(n##6) B(n##7) B(n##8) B(n##9)
 #define D(n) C(n##0) C(n##1) C(n##2) C(n##3) C(n##4) C(n##5) C(n##6) C(n##7) C(n##8) C(n##9)
 #define E(n) D(n##0) D(n##1) D(n##2) D(n##3) D(n##4) D(n##5) D(n##6) D(n##7) D(n##8) D(n##9)
 E(0) E(1) E(2) D(30) D(31) C(320) C(321) C(322) C(323) C(324) C(325)
 B(3260) B(3261) B(3262) B(3263) A(32640) A(32641) A(32642)
testcase with
./xgcc -B ./ -c -g -fpic -ffat-lto-objects -flto  -O0 -o foo1.o foo1.c -ffunction-sections
./xgcc -B ./ -shared -g -fpic -flto -O0 -o foo1.so foo1.o
/tmp/ccTW8mBm.debug.temp.o: file not recognized: file format not recognized
(testcase too slow to be included into testsuite).
The problem is clearly reported by readelf:
readelf: foo1.o.debug.temp.o: Warning: Section 2 has an out of range sh_link value of 65321
readelf: foo1.o.debug.temp.o: Warning: Section 5 has an out of range sh_link value of 65321
readelf: foo1.o.debug.temp.o: Warning: Section 10 has an out of range sh_link value of 65323
readelf: foo1.o.debug.temp.o: Warning: [ 2]: Link field (65321) should index a symtab section.
readelf: foo1.o.debug.temp.o: Warning: [ 5]: Link field (65321) should index a symtab section.
readelf: foo1.o.debug.temp.o: Warning: [10]: Link field (65323) should index a string section.
because simple_object_elf_copy_lto_debug_sections doesn't adjust sh_info and
sh_link fields in ElfNN_Shdr if they are in between SHN_{LO,HI}RESERVE
inclusive.  Not adjusting those is incorrect though, SHN_{LO,HI}RESERVE
range is only relevant to the 16-bit fields, mainly st_shndx in ElfNN_Sym
where if one needs >= SHN_LORESERVE section number, SHN_XINDEX should be
used instead and .symtab_shndx section should contain the real section
index, and in ElfNN_Ehdr e_shnum and e_shstrndx fields, where if >=
SHN_LORESERVE value is needed it should put those into
Shdr[0].sh_{size,link}.  But, sh_{link,info} are 32-bit fields which can
contain any section index.

Note, as simple-object-elf.c mentions, binutils from 2.12 to 2.18 (so before
2011) used to mishandle the > 63.75K sections case and assumed there is a
hole in between the sections, but what
simple_object_elf_copy_lto_debug_sections does wouldn't help in that case
for the debug temp object creation, we'd need to detect the case also in
that routine and take it into account in the remapping etc.  I think
it is not worth it given that it is over 10 years, if somebody needs
63.75K or more sections, better use more recent binutils.

2022-02-22  Jakub Jelinek  <jakub@redhat.com>

	PR lto/104617
	* simple-object-elf.c (simple_object_elf_match): Fix up URL
	in comment.
	(simple_object_elf_copy_lto_debug_sections): Remap sh_info and
	sh_link even if they are in the SHN_LORESERVE .. SHN_HIRESERVE
	range (inclusive).

(cherry picked from commit 2f59f06761)
2022-05-11 08:17:56 +02:00
Jakub Jelinek eca81c14d5 valtrack: Avoid creating raw SUBREGs with VOIDmode argument [PR104557]
After the recent r12-7240 simplify_immed_subreg changes, we bail on more
simplify_subreg calls than before, e.g. apparently for decimal modes
in the NaN representations  we almost never preserve anything except the
canonical {q,s}NaNs.
simplify_gen_subreg will punt in such cases because a SUBREG with VOIDmode
is not valid, but debug_lowpart_subreg wants to attempt even harder, even
if e.g. target indicates certain mode combinations aren't valid for the
backend, dwarf2out can still handle them.  But a SUBREG from a VOIDmode
operand is just too much, the inner mode is lost there.  We'd need some
new rtx that would be able to represent those cases.
For now, just punt in those cases.

2022-02-17  Jakub Jelinek  <jakub@redhat.com>

	PR debug/104557
	* valtrack.c (debug_lowpart_subreg): Don't call gen_rtx_raw_SUBREG
	if expr has VOIDmode.

	* gcc.dg/dfp/pr104557.c: New test.

(cherry picked from commit 1c2b44b523)
2022-05-11 08:17:56 +02:00
Jakub Jelinek b65f562b8f c-family: Fix up shorten_compare for decimal vs. non-decimal float comparison [PR104510]
The comment in shorten_compare says:
  /* If either arg is decimal float and the other is float, fail.  */
but the callers of shorten_compare don't expect anything like failure
as a possibility from the function, callers require that the function
promotes the operands to the same type, whether the original selected
*restype_ptr one or some shortened.
So, if we choose not to shorten, we should still promote to the original
*restype_ptr.

2022-02-16  Jakub Jelinek  <jakub@redhat.com>

	PR c/104510
	* c-common.c (shorten_compare): Convert original arguments to
	the original *restype_ptr when mixing binary and decimal float.

	* gcc.dg/dfp/pr104510.c: New test.

(cherry picked from commit 6e74122f0d)
2022-05-11 08:17:55 +02:00
Jakub Jelinek 3aeecb1945 sanitizer: Use glibc _thread_db_sizeof_pthread symbol if present
I've cherry-picked following fix from llvm-project.  Recent glibcs
have _thread_db_sizeof_pthread symbol variable which contains the
size of struct pthread, so that sanitizers don't need to guess that
and risk that it will change again.

2022-02-15  Jakub Jelinek  <jakub@redhat.com>

	* sanitizer_common/sanitizer_linux_libcdep.cc: Cherry-pick
	llvm-project revision ef14b78d9a144ba81ba02083fe21eb286a88732b.

(cherry picked from commit c4c0aa6089)
2022-05-11 08:16:53 +02:00
Jakub Jelinek 57e0795a44 openmp: Make finalize_task_copyfn order reproduceable [PR104517]
The following testcase fails -fcompare-debug, because finalize_task_copyfn
was invoked from splay tree destruction, whose order can in some cases
depend on -g/-g0.  The fix is to queue the task stmts that need copyfn
in a vector and run finalize_task_copyfn on elements of that vector.

2022-02-15  Jakub Jelinek  <jakub@redhat.com>

	PR debug/104517
	* omp-low.c (task_cpyfns): New variable.
	(delete_omp_context): Don't call finalize_task_copyfn from here.
	(create_task_copyfn): Push task_stmt into task_cpyfns.
	(execute_lower_omp): Call finalize_task_copyfn here on entries from
	task_cpyfns vector and release the vector.

(cherry picked from commit 6a0d6e7ca9)
2022-05-11 07:58:40 +02:00
Jakub Jelinek 87cd4bc02f c++: Don't reject GOTO_EXPRs to cdtor_label in potential_constant_expression_1 [PR104513]
return in ctors on targetm.cxx.cdtor_returns_this () target like arm
is emitted as GOTO_EXPR cdtor_label where at cdtor_label it emits
RETURN_EXPR with the this.
Similarly, in all dtors regardless of targetm.cxx.cdtor_returns_this ()
a return is emitted similarly.

potential_constant_expression_1 was rejecting these gotos and so we
incorrectly rejected these testcases, but actual cxx_eval* is apparently
handling these just fine.  I was a little bit worried that for the
destruction of bases we wouldn't evaluate something we should, but as the
testcase shows, that is evaluated through try ... finally and there is
nothing after the cdtor_label.  For arm there is RETURN_EXPR this; but we
don't really care about the return value from ctors and dtors during the
constexpr evaluation.

I must say I don't see much the point of cdtor_labels at all, I'd think
that with try ... finally around it for non-arm we could just RETURN_EXPR
instead of the GOTO_EXPR and the try/finally gimplification would DTRT,
and we could just add the right return value for the arm case.

2022-02-14  Jakub Jelinek  <jakub@redhat.com>

	PR c++/104513
	* constexpr.c (potential_constant_expression_1) <case GOTO_EXPR>:
	Don't punt if returns (target).

	* g++.dg/cpp1y/constexpr-104513.C: New test.

(cherry picked from commit 02a981a8e5)
2022-05-11 07:58:40 +02:00
Jakub Jelinek 77ee9b906d asan: Fix up address sanitizer instrumentation of __builtin_alloca* if it can throw [PR104449]
With -fstack-check* __builtin_alloca* can throw and the asan
instrumentation of this builtin wasn't prepared for that case.
The following patch fixes that by replacing the builtin with the
replacement builtin and emitting any further insns on the fallthru
edge.

I haven't touched the hwasan code which most likely suffers from the
same problem.

2022-02-12  Jakub Jelinek  <jakub@redhat.com>

	PR sanitizer/104449
	* asan.c: Include tree-eh.h.
	(handle_builtin_alloca): Handle the case when __builtin_alloca or
	__builtin_alloca_with_align can throw.

	* gcc.dg/asan/pr104449.c: New test.
	* g++.dg/asan/pr104449.C: New test.

(cherry picked from commit f0c7367b88)
2022-05-11 07:58:39 +02:00
Jakub Jelinek cb412e0e88 i386: Fix up cvtsd2ss splitter [PR104502]
The following testcase ICEs, because AVX512F is enabled, AVX512VL is not,
and the cvtsd2ss insn has %xmm0-15 as output operand and %xmm16-31 as
input operand.  For output operand %xmm16+ the splitter just gives up
in such case, but for such input it just emits vmovddup which requires
AVX512VL if either operand is EXT_REX_SSE_REG_P (when it is 128-bit).

The following patch fixes it by treating that case like the pre-SSE3
output != input case - move the input to output and do everything on
the output reg which is known to be < %xmm16.

2022-02-12  Jakub Jelinek  <jakub@redhat.com>

	PR target/104502
	* config/i386/i386.md (cvtsd2ss splitter): If operands[1] is xmm16+
	and AVX512VL isn't available, move operands[1] to operands[0] first.

	* gcc.target/i386/pr104502.c: New test.

(cherry picked from commit 0538d42cdd)
2022-05-11 07:58:39 +02:00
Jakub Jelinek c7e7ca915d c++: Fix up constant expression __builtin_convertvector folding [PR104472]
The following testcase ICEs, because due to the -frounding-math
fold_const_call fails, which is it returns NULL, and returning NULL from
cxx_eval* is wrong, all the callers rely on them to either return folded
value or original with *non_constant_p = true.

The following patch does that, and additionally falls through into the
default case where there is diagnostics for the !ctx->quiet case too.

2022-02-11  Jakub Jelinek  <jakub@redhat.com>

	PR c++/104472
	* constexpr.c (cxx_eval_internal_function) <case IFN_VEC_CONVERT>:
	Only return fold_const_call result if it is non-NULL.  Otherwise
	fall through into the default: case to return t, set *non_constant_p
	and emit diagnostics if needed.

	* g++.dg/cpp0x/constexpr-104472.C: New test.

(cherry picked from commit 84993d94e1)
2022-05-11 07:58:39 +02:00
Jakub Jelinek ffbe41f14f combine: Fix ICE with substitution of CONST_INT into PRE_DEC argument [PR104446]
The following testcase ICEs, because combine substitutes
(insn 10 9 11 2 (set (reg/v:SI 7 sp [ a ])
        (const_int 0 [0])) "pr104446.c":9:5 81 {*movsi_internal}
     (nil))
(insn 13 11 14 2 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [0  S4 A32])
        (reg:SI 85)) "pr104446.c":10:3 56 {*pushsi2}
     (expr_list:REG_DEAD (reg:SI 85)
        (expr_list:REG_ARGS_SIZE (const_int 16 [0x10])
            (nil))))
forming
(insn 13 11 14 2 (set (mem/f:SI (pre_dec:SI (const_int 0 [0])) [0  S4 A32])
        (reg:SI 85)) "pr104446.c":10:3 56 {*pushsi2}
     (expr_list:REG_DEAD (reg:SI 85)
        (expr_list:REG_ARGS_SIZE (const_int 16 [0x10])
            (nil))))
which is invalid RTL (pre_dec's argument must be a REG).
I know substitution creates various forms of invalid RTL and hopes that
invalid RTL just won't recog.
But unfortunately in this case we ICE before we get to recog, as
try_combine does:
  if (n_auto_inc)
    {
      int new_n_auto_inc = 0;
      for_each_inc_dec (newpat, count_auto_inc, &new_n_auto_inc);

      if (n_auto_inc != new_n_auto_inc)
        {
          if (dump_file && (dump_flags & TDF_DETAILS))
            fprintf (dump_file, "Number of auto_inc expressions changed\n");
          undo_all ();
          return 0;
        }
    }
and for_each_inc_dec under the hood will do e.g. for the PRE_DEC case:
    case PRE_DEC:
    case POST_DEC:
      {
        poly_int64 size = GET_MODE_SIZE (GET_MODE (mem));
        rtx r1 = XEXP (x, 0);
        rtx c = gen_int_mode (-size, GET_MODE (r1));
        return fn (mem, x, r1, r1, c, data);
      }
and that code rightfully expects that the PRE_DEC operand has non-VOIDmode
(as it needs to be a REG) - gen_int_mode for VOIDmode results in ICE.
I think it is better not to emit the clearly invalid RTL during substitution
like we do for other cases, than to adding workarounds for invalid IL
created by combine to rtlanal.cc and perhaps elsewhere.
As for the testcase, of course it is UB at runtime to modify sp that way,
but if such code is never reached, we must compile it, not to ICE on it.
And I don't see why on other targets which use the autoinc rtxes much more
it couldn't happen with other registers.

2022-02-11  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/104446
	* combine.c (subst): Don't substitute CONST_INTs into RTX_AUTOINC
	operands.

	* gcc.target/i386/pr104446.c: New test.

(cherry picked from commit fb76c0ad35)
2022-05-11 07:58:38 +02:00
Jakub Jelinek 8c9f4bafe5 rs6000: Fix up vspltis_shifted [PR102140]
The following testcase ICEs, because
(const_vector:V4SI [
                (const_int 0 [0]) repeated x3
                (const_int -2147483648 [0xffffffff80000000])
            ])
is recognized as valid easy_vector_constant in between split1 pass and
end of RA.
The problem is that such constants need to be split, and the only
splitter for that is:
(define_split
  [(set (match_operand:VM 0 "altivec_register_operand")
        (match_operand:VM 1 "easy_vector_constant_vsldoi"))]
  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode) && can_create_pseudo_p ()"
There is only a single splitting pass before RA, so after that finishes,
if something gets matched in between that and end of RA (after that
can_create_pseudo_p () would be no longer true), it will never be
successfully split and we ICE at final.cc time or earlier.

The i386 backend (and a few others) already use
(cfun->curr_properties & PROP_rtl_split_insns)
as a test for split1 pass finished, so that some insns that should be split
during split1 and shouldn't be matched afterwards are properly guarded.

So, the following patch does that for vspltis_shifted too.

2022-02-08  Jakub Jelinek  <jakub@redhat.com>

	PR target/102140
	* config/rs6000/rs6000.c (vspltis_shifted): Return false also if
	split1 pass has finished already.

	* gcc.dg/pr102140.c: New test.

(cherry picked from commit 0c3e491a4e)
2022-05-11 07:58:37 +02:00
Jakub Jelinek b7e9d466fc libgomp: Fix segfault with posthumous orphan tasks [PR104385]
The following patch fixes crashes with posthumous orphan tasks.
When a parent task finishes, gomp_clear_parent clears the parent
pointers of its children tasks present in the parent->children_queue.
But children that are still waiting for dependencies aren't in that
queue yet, they will be added there only when the sibling they are
waiting for exits.  Unfortunately we were adding those tasks into
the queues with the original task->parent which then causes crashes
because that task is gone and freed.  The following patch fixes that
by clearing the parent field when we schedule such task for running
by adding it into the queues and we know that the sibling task which
is about to finish has NULL parent.

2022-02-08  Jakub Jelinek  <jakub@redhat.com>

	PR libgomp/104385
	* task.c (gomp_task_run_post_handle_dependers): If parent is NULL,
	clear task->parent.
	* testsuite/libgomp.c/pr104385.c: New test.

(cherry picked from commit 0af7ef050a)
2022-05-11 07:58:37 +02:00
Jakub Jelinek 7157e07447 libcpp: Fix up padding handling in funlike_invocation_p [PR104147]
As mentioned in the PR, in some cases we preprocess incorrectly when we
encounter an identifier which is defined as function-like macro, followed
by at least 2 CPP_PADDING tokens and then some other identifier.
On the following testcase, the problem is in the 3rd funlike_invocation_p,
the tokens are CPP_NAME Y, CPP_PADDING (the pfile->avoid_paste shared token),
CPP_PADDING (one created with padding_token, val.source is non-NULL and
val.source->flags & PREV_WHITE is non-zero) and then another CPP_NAME.
funlike_invocation_p remembers there was a padding token, but remembers the
first one because of its condition, then the next token is the CPP_NAME,
which is not CPP_OPEN_PAREN, so the CPP_NAME token is backed up, but as we
can't easily backup more tokens, it pushes into a new context the padding
token (the pfile->avoid_paste one).  The net effect is that when Y is not
defined as fun-like macro, we read Y, avoid_paste, padding_token, Y,
while if Y is fun-like macro, we read Y, avoid_paste, avoid_paste, Y
(the second avoid_paste is because that is how we handle end of a context).
Now, for stringify_arg that is unfortunately a significant difference,
which handles CPP_PADDING tokens with:
      if (token->type == CPP_PADDING)
        {
          if (source == NULL
              || (!(source->flags & PREV_WHITE)
                  && token->val.source == NULL))
            source = token->val.source;
          continue;
        }
and later on
      /* Leading white space?  */
      if (dest - 1 != BUFF_FRONT (pfile->u_buff))
        {
          if (source == NULL)
            source = token;
          if (source->flags & PREV_WHITE)
            *dest++ = ' ';
        }
      source = NULL;
(and c-ppoutput.cc has similar code).
So, when Y is not fun-like macro, ' ' is added because padding_token's
val.source->flags & PREV_WHITE is non-zero, while when it is fun-like
macro, we don't add ' ' in between, because source is NULL and so
used from the next token (CPP_NAME Y), which doesn't have PREV_WHITE set.

Now, the funlike_invocation_p condition
       if (padding == NULL
           || (!(padding->flags & PREV_WHITE) && token->val.source == NULL))
        padding = token;
looks very similar to that in stringify_arg/c-ppoutput.cc, so I assume
the intent was to prefer do the same thing and pick the right padding.
But there are significant differences.  Both stringify_arg and c-ppoutput.cc
don't remember the CPP_PADDING token, but its val.source instead, while
in funlike_invocation_p we want to remember the padding token that has the
significant information for stringify_arg/c-ppoutput.cc.
So, IMHO we want to overwrite padding if:
1) padding == NULL (remember that there was any padding at all)
2) padding->val.source == NULL (this matches the source == NULL
   case in stringify_arg)
3) !(padding->val.source->flags & PREV_WHITE) && token->val.source == NULL
   (this matches the !(source->flags & PREV_WHITE) && token->val.source == NULL
   case in stringify_arg)

2022-02-01  Jakub Jelinek  <jakub@redhat.com>

	PR preprocessor/104147
	* macro.c (funlike_invocation_p): For padding prefer a token
	with val.source non-NULL especially if it has PREV_WHITE set
	on val.source->flags.  Add gcc_assert that CPP_PADDING tokens
	don't have PREV_WHITE set in flags.

	* c-c++-common/cpp/pr104147.c: New test.

(cherry picked from commit 95ac563540)
2022-05-11 07:58:36 +02:00
Jakub Jelinek 7e05b86bcd libcpp: Avoid PREV_WHITE and other random content on CPP_PADDING tokens
The funlike_invocation_p macro never triggered, the other
asserts did on some tests, see below for a full list.
This seems to be caused by #pragma/_Pragma handling.
do_pragma does:
          pfile->directive_result.src_loc = pragma_token_virt_loc;
          pfile->directive_result.type = CPP_PRAGMA;
          pfile->directive_result.flags = pragma_token->flags;
          pfile->directive_result.val.pragma = p->u.ident;
when it sees a pragma, while start_directive does:
  pfile->directive_result.type = CPP_PADDING;
and so does _cpp_do__Pragma.
Now, for #pragma lex.cc will just ignore directive_result if
it has CPP_PADDING type:
              if (_cpp_handle_directive (pfile, result->flags & PREV_WHITE))
                {
                  if (pfile->directive_result.type == CPP_PADDING)
                    continue;
                  result = &pfile->directive_result;
                }
but destringize_and_run does not:
  if (pfile->directive_result.type == CPP_PRAGMA)
    {
...
    }
  else
    {
      count = 1;
      toks = XNEW (cpp_token);
      toks[0] = pfile->directive_result;
and from there it will copy type member of CPP_PADDING, but all the
other members from the last CPP_PRAGMA before it.
Small testcase for it with no option (at least no -fopenmp or -fopenmp-simd).
 #pragma GCC push_options
 #pragma GCC ignored "-Wformat"
 #pragma GCC pop_options
void
foo ()
{
  _Pragma ("omp simd")
  for (int i = 0; i < 64; i++)
    ;
}

Here is a patch that replaces those
      toks = XNEW (cpp_token);
      toks[0] = pfile->directive_result;
lines with
      toks = &pfile->avoid_paste;

2022-02-01  Jakub Jelinek  <jakub@redhat.com>

	* directives.c (destringize_and_run): Push &pfile->avoid_paste
	instead of a copy of pfile->directive_result for the CPP_PADDING
	case.

(cherry picked from commit efc46b550f)
2022-05-11 07:58:36 +02:00
Jakub Jelinek 02da8ea286 optabs: Don't create pseudos in prepare_cmp_insn when not allowed [PR102478]
cond traps can be created during ce3 after reload (and e.g. PR103028
recently fixed some ce3 cond trap related bug, so I think often that
works fine and we shouldn't disable cond traps after RA altogether),
but it calls prepare_cmp_insn.  This function can fail, so I don't
see why we couldn't make it work after RA (in most cases it already
just works).  The first hunk is just an optimization which doesn't
make sense after RA, so I've guarded it with can_create_pseudo_p.
The second hunk is just a theoretical case, I don't have a testcase for it.
prepare_cmp_insn has some other spots that can create pseudos, like when
both operands have VOIDmode, or when it is BLKmode comparison, or
not OPTAB_DIRECT, but I think none of that applies to ce3, we punt on
BLKmode earlier, use OPTAB_DIRECT and shouldn't be comparing two
VOIDmode CONST_INTs.

2022-01-21  Jakub Jelinek  <jakub@redhat.com>

	PR rtl-optimization/102478
	* optabs.c (prepare_cmp_insn): If !can_create_pseudo_p (), don't
	force_reg constants and for -fnon-call-exceptions fail if copy_to_reg
	would be needed.

	* gcc.dg/pr102478.c: New test.

(cherry picked from commit c2d9159717)
2022-05-11 07:58:36 +02:00
Jakub Jelinek 95f6eb7ae7 match.pd, optabs: Avoid vectorization of {FLOOR,CEIL,ROUND}_{DIV,MOD}_EXPR [PR102860]
power10 has modv4si3 expander and so vectorizes the following testcase
where Fortran modulo is FLOOR_MOD_EXPR.
optabs_for_tree_code indicates that the optab for all the *_MOD_EXPR
variants is umod_optab or smod_optab, but that isn't true, that optab
actually expands just TRUNC_MOD_EXPR.  For the other tree codes expmed.cc
has code how to adjust the TRUNC_MOD_EXPR into those by emitting some
extra comparisons and conditional updates.  Similarly for *_DIV_EXPR,
except in that case it actually needs both division and modulo.

While it would be possible to handle it in expmed.cc for vectors as well,
we'd need to be sure all the vector operations we need for that are
available, and furthermore we wouldn't account for that in the costing.

So, IMHO it is better to stop pretending those non-truncating (and
non-exact) div/mod operations have an optab.  For GCC 13, we should
IMHO pattern match these in tree-vect-patterns.cc and transform them
to truncating div/mod with follow-up adjustments and let the vectorizer
vectorize that.  As written in the PR, for signed operands:
r = x %[fl] y;
is
r = x % y; if (r && (x ^ y) < 0) r += y;
and
d = x /[fl] y;
is
r = x % y; d = x / y; if (r && (x ^ y) < 0) --d;
and
r = x %[cl] y;
is
r = x % y; if (r && (x ^ y) >= 0) r -= y;
and
d = /[cl] y;
is
r = x % y; d = x / y; if (r && (x ^ y) >= 0) ++d;
(too lazy to figure out rounding div/mod now).  I'll create a PR
for that.
The patch also extends a match.pd optimization that floor_mod on
unsigned operands is actually trunc_mod.

2022-01-19  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/102860
	* match.pd (x %[fl] y -> x % y): New simplification for
	unsigned integral types.
	* optabs-tree.c (optab_for_tree_code): Return unknown_optab
	for {CEIL,FLOOR,ROUND}_{DIV,MOD}_EXPR with VECTOR_TYPE.

	* gfortran.dg/pr102860.f90: New test.

(cherry picked from commit ffc7f200ad)
2022-05-11 07:58:36 +02:00
Jakub Jelinek e875dc9f97 ifcvt: Check for asm goto at the end of then_bb/else_bb in ifcvt [PR103908]
On the following testcase, RTL ifcvt sees then_bb
(note 7 6 8 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
(insn 8 7 9 3 (set (mem/c:SI (symbol_ref:DI ("b") [flags 0x2]  <var_decl 0x7fdccf5b0cf0 b>) [1 b+0 S4 A32])
        (const_int 1 [0x1])) "pr103908.c":6:7 81 {*movsi_internal}
     (nil))
(jump_insn 9 8 13 3 (parallel [
            (asm_operands/v ("# insn 1") ("") 0 []
                 []
                 [
                    (label_ref:DI 21)
                ] pr103908.c:7)
            (clobber (reg:CC 17 flags))
        ]) "pr103908.c":7:5 -1
     (expr_list:REG_UNUSED (reg:CC 17 flags)
        (nil))
 -> 21)
and similarly else_bb (just with a different asm_operands template).
It checks that those basic blocks have a single successor and
uses last_active_insn which intentionally skips over JUMP_INSNs, sees
both basic blocks contain the same set and merges them (or if the
sets are different, attempts some other noce optimization).
But we can't assume that the jump, even when it has only a single successor,
has no side-effects.

The following patch fixes it by punting if test_bb ends with a JUMP_INSN
that isn't onlyjump_p.

2022-01-06  Jakub Jelinek  <jakub@redhat.com>

	PR rtl-optimization/103908
	* ifcvt.c (bb_valid_for_noce_process_p): Punt on bbs ending with
	asm goto.

	* gcc.target/i386/pr103908.c: New test.

(cherry picked from commit 80ad67e2af)
2022-05-11 07:58:35 +02:00
Jakub Jelinek 86b98701bf libcpp: Fix up ##__VA_OPT__ handling [PR89971]
In the following testcase we incorrectly error about pasting / token
with padding token (which is a result of __VA_OPT__); instead we should
like e.g. for ##arg where arg is empty macro argument clear PASTE_LEFT
flag of the previous token if __VA_OPT__ doesn't add any real tokens
(which can happen either because the macro doesn't have any tokens
passed to ... (i.e. __VA_ARGS__ expands to empty) or when __VA_OPT__
doesn't have any tokens in between ()s).

2021-12-30  Jakub Jelinek  <jakub@redhat.com>

	PR preprocessor/89971
libcpp/
	* macro.c (replace_args): For ##__VA_OPT__, if __VA_OPT__ expands
	to no tokens at all, drop PASTE_LEFT flag from the previous token.
gcc/testsuite/
	* c-c++-common/cpp/va-opt-9.c: New test.

(cherry picked from commit 5545d1edcb)
2022-05-11 07:58:35 +02:00
Jakub Jelinek 5d96fb401e shrink-wrapping: Fix up prologue block discovery [PR103860]
The following testcase is miscompiled, because a prologue which
contains subq $8, %rsp instruction is emitted at the start of
a basic block which contains conditional jump that depends on
flags register set in an earlier basic block, the prologue instruction
then clobbers those flags.
Normally this case is checked by can_get_prologue predicate, but this
is done only at the start of the loop.  If we update pro later in the
loop (because some bb shouldn't be duplicated) and then don't push
anything further into vec and the vec is already empty (this can happen
when the new pro is already in bb_with bitmask and either has no successors
(that is the case in the testcase where that bb ends with a trap) or
all the successors are already in bb_with, then the loop doesn't iterate
further and can_get_prologue will not be checked.

The following simple patch makes sure we call can_get_prologue even after
the last former iteration when vec is already empty and only break from
the loop afterwards (and only if the updating of pro done because of
!can_get_prologue didn't push anything into vec again).

2021-12-30  Jakub Jelinek  <jakub@redhat.com>

	PR rtl-optimization/103860
	* shrink-wrap.c (try_shrink_wrapping): Make sure can_get_prologue is
	called on pro even if nothing further is pushed into vec.

	* gcc.dg/pr103860.c: New test.

(cherry picked from commit 1820137ba6)
2022-05-11 07:58:35 +02:00
Jakub Jelinek d7dbfed37a loop-invariant: Fix -fcompare-debug failure [PR103837]
In the following testcase we have a -fcompare-debug failure, because
can_move_invariant_reg doesn't ignore DEBUG_INSNs in its decisions.
In the testcase we have due to uninitialized variable:
  loop_header
    debug_insn using pseudo84
    pseudo84 = invariant
    insn using pseudo84
  end loop
and with -g decide not to move the pseudo84 = invariant before the
loop header; in this case not resetting the debug insns might be fine.
But, we could have also:
  pseudo84 = whatever
  loop_header
    debug_insn using pseudo84
    pseudo84 = invariant
    insn using pseudo84
  end loop
and in that case not resetting the debug insns would result in wrong-debug.
And, we don't really have generally a good substitution on what pseudo84
contains, it could inherit various values from different paths.
So, the following patch ignores DEBUG_INSNs in the decisions, and if there
are any that previously prevented the optimization, resets them before
return true.

2021-12-28  Jakub Jelinek  <jakub@redhat.com>

	PR rtl-optimization/103837
	* loop-invariant.c (can_move_invariant_reg): Ignore DEBUG_INSNs in
	the decisions whether to return false or continue and right before
	returning true reset those debug insns that previously caused
	returning false.

	* gcc.dg/pr103837.c: New test.

(cherry picked from commit 3c5fd3616f)
2022-05-11 07:58:34 +02:00
Jakub Jelinek 790b8d49eb bswap: Fix UB in find_bswap_or_nop_finalize [PR103435]
On gcc.c-torture/execute/pr103376.c in the following code we trigger UB
in the compiler.  n->range is 8 because it is 64-bit load and rsize is 0
because it is a bswap sequence with load and known to be 0:
  /* Find real size of result (highest non-zero byte).  */
  if (n->base_addr)
    for (tmpn = n->n, rsize = 0; tmpn; tmpn >>= BITS_PER_MARKER, rsize++);
  else
    rsize = n->range;
The shifts then shift uint64_t by 64 bits.  For this case mask is 0
and we want both *cmpxchg and *cmpnop as 0, the operation can be done as
both nop and bswap and callers will prefer nop.

2021-11-27  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/103435
	* gimple-ssa-store-merging.c (find_bswap_or_nop_finalize): Avoid UB if
	n->range - rsize == 8, just clear both *cmpnop and *cmpxchg in that
	case.

(cherry picked from commit 567d5f3d62)
2022-05-11 07:58:34 +02:00
Jakub Jelinek 4a3c9aabb0 fortran, debug: Fix up DW_AT_rank [PR103315]
For DW_AT_rank we were emitting
        .uleb128 0x4    # DW_AT_rank
        .byte   0x97    # DW_OP_push_object_address
        .byte   0x23    # DW_OP_plus_uconst
        .uleb128 0x1c
        .byte   0x6     # DW_OP_deref
on 64-bit and
        .uleb128 0x4    # DW_AT_rank
        .byte   0x97    # DW_OP_push_object_address
        .byte   0x23    # DW_OP_plus_uconst
        .uleb128 0x10
        .byte   0x6     # DW_OP_deref
on 32-bit.  I think this is wrong, as dtype.rank field in the descriptor
has unsigned char type, not pointer type nor pointer sized integral.
E.g. if we have a
    REAL :: a(..)
dummy argument, which is passed as a reference to the function descriptor,
we want to evaluate a->dtype.rank.  The above DWARF expressions perform
*(uintptr_t *)(a + 0x1c)
and
*(uintptr_t *)(a + 0x10)
respectively.  The following patch changes those to:
        .uleb128 0x5    # DW_AT_rank
        .byte   0x97    # DW_OP_push_object_address
        .byte   0x23    # DW_OP_plus_uconst
        .uleb128 0x1c
        .byte   0x94    # DW_OP_deref_size
        .byte   0x1
and
        .uleb128 0x5    # DW_AT_rank
        .byte   0x97    # DW_OP_push_object_address
        .byte   0x23    # DW_OP_plus_uconst
        .uleb128 0x10
        .byte   0x94    # DW_OP_deref_size
        .byte   0x1
which perform
*(unsigned char *)(a + 0x1c)
and
*(unsigned char *)(a + 0x10)
respectively.

2021-11-21  Jakub Jelinek  <jakub@redhat.com>

	PR debug/103315
	* trans-types.c (gfc_get_array_descr_info): Use DW_OP_deref_size 1
	instead of DW_OP_deref for DW_AT_rank.

(cherry picked from commit da17c304e2)
2022-05-11 07:58:34 +02:00
Jakub Jelinek 547692808b c++: Fix up -fstrong-eval-order handling of call arguments [PR70796]
For -fstrong-eval-order (default for C++17 and later) we make sure to
gimplify arguments in the right order, but as the following testcase
shows that is not enough.
The problem is that some lvalues can satisfy the is_gimple_val / fb_rvalue
predicate used by gimplify_arg for is_gimple_reg_type typed expressions,
or is_gimple_lvalue / fb_either used for other types.
E.g. in foo we have:
  C::C (&p,  ++i,  ++i)
before gimplification where i is an automatic int variable and without this
patch gimplify that as:
  i = i + 1;
  i = i + 1;
  C::C (&p, i, i);
which means that the ctor is called with the original i value incremented
by 2 in both arguments, while because the call is CALL_EXPR_ORDERED_ARGS
the first argument should be different.  Similarly in qux we have:
  B::B (&p, TARGET_EXPR <D.2274, *(const struct A &) A::operator++ (&i)>,
        TARGET_EXPR <D.2275, *(const struct A &) A::operator++ (&i)>)
and gimplify it as:
      _1 = A::operator++ (&i);
      _2 = A::operator++ (&i);
      B::B (&p, MEM[(const struct A &)_1], MEM[(const struct A &)_2]);
but because A::operator++ returns the passed in argument, again we have
the same value in both cases due to gimplify_arg doing:
      /* Also strip a TARGET_EXPR that would force an extra copy.  */
      if (TREE_CODE (*arg_p) == TARGET_EXPR)
        {
          tree init = TARGET_EXPR_INITIAL (*arg_p);
          if (init
              && !VOID_TYPE_P (TREE_TYPE (init)))
            *arg_p = init;
        }
which is perfectly fine optimization for calls with unordered arguments,
but breaks the ordered ones.
Lastly, in corge, we have before gimplification:
  D::foo (NON_LVALUE_EXPR <p>, 3,  ++p)
and gimplify it as
  p = p + 4;
  D::foo (p, 3, p);
which is again wrong, because the this argument isn't before the
side-effects but after it.
The following patch adds cp_gimplify_arg wrapper, which if ordered
and is_gimple_reg_type forces non-SSA_NAME is_gimple_variable
result into a temporary, and if ordered, not is_gimple_reg_type
and argument is TARGET_EXPR bypasses the gimplify_arg optimization.
So, in foo with this patch we gimplify it as:
  i = i + 1;
  i.0_1 = i;
  i = i + 1;
  C::C (&p, i.0_1, i);
in qux as:
      _1 = A::operator++ (&i);
      D.2312 = MEM[(const struct A &)_1];
      _2 = A::operator++ (&i);
      B::B (&p, D.2312, MEM[(const struct A &)_2]);
where D.2312 is a temporary and in corge as:
  p.9_1 = p;
  p = p + 4;
  D::foo (p.9_1, 3, p);
The is_gimple_reg_type forcing into a temporary should be really cheap
(I think even at -O0 it should be optimized if there is no modification in
between), the aggregate copies might be more expensive but I think e.g. SRA
or FRE should be able to deal with those if there are no intervening
changes.  But still, the patch tries to avoid those when it is cheaply
provable that nothing bad happens (if no argument following it in the
strong evaluation order doesn't have TREE_SIDE_EFFECTS, then even VAR_DECLs
etc. shouldn't be modified after it).  There is also an optimization to
avoid doing that for this or for arguments with reference types as nothing
can modify the parameter values during evaluation of other argument's
side-effects.

I've tried if e.g.
  int i = 1;
  return i << ++i;
doesn't suffer from this problem as well, but it doesn't, the FE uses
  SAVE_EXPR <i>, SAVE_EXPR <i> << ++i;
in that case which gimplifies the way we want (temporary in the first
operand).

2021-11-19  Jakub Jelinek  <jakub@redhat.com>

	PR c++/70796
	* cp-gimplify.c (cp_gimplify_arg): New function.
	(cp_gimplify_expr): Use cp_gimplify_arg instead of gimplify_arg,
	pass true as last argument to it if there are any following
	arguments in strong evaluation order with side-effects.

	* g++.dg/cpp1z/eval-order11.C: New test.

(cherry picked from commit a84177aff7)
2022-05-11 07:58:33 +02:00
Jakub Jelinek 1f02d664cd lim: Reset flow sensitive info even for pointers [PR103192]
Since 2014 is lim clearing SSA_NAME_RANGE_INFO for integral SSA_NAMEs
if moving them from conditional contexts inside of a loop into unconditional
before the loop, but as the miscompilation of gimplify.c shows, we need to
treat pointers the same, even for them we need to reset whether the pointer
can/can't be null or the recorded pointer alignment.

This fixes
-FAIL: libgomp.c/../libgomp.c-c++-common/target-in-reduction-2.c (internal compiler error)
-FAIL: libgomp.c/../libgomp.c-c++-common/target-in-reduction-2.c (test for excess errors)
-UNRESOLVED: libgomp.c/../libgomp.c-c++-common/target-in-reduction-2.c compilation failed to produce executable
-FAIL: libgomp.c++/../libgomp.c-c++-common/target-in-reduction-2.c (internal compiler error)
-FAIL: libgomp.c++/../libgomp.c-c++-common/target-in-reduction-2.c (test for excess errors)
-UNRESOLVED: libgomp.c++/../libgomp.c-c++-common/target-in-reduction-2.c compilation failed to produce executable
-FAIL: libgomp.c++/target-in-reduction-2.C (internal compiler error)
-FAIL: libgomp.c++/target-in-reduction-2.C (test for excess errors)
-UNRESOLVED: libgomp.c++/target-in-reduction-2.C compilation failed to produce executable
on both x86_64 and i686.

2021-11-17  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/103192
	* tree-ssa-loop-im.c (move_computations_worker): Use
	reset_flow_sensitive_info instead of manually clearing
	SSA_NAME_RANGE_INFO and do it for all SSA_NAMEs, not just ones
	with integral types.

(cherry picked from commit 077425c890)
2022-05-11 07:58:33 +02:00
Jakub Jelinek 294682dc23 i386: Fix up x86 atomic_bit_test* expanders for !TARGET_HIMODE_MATH [PR103205]
With !TARGET_HIMODE_MATH, the OPTAB_DIRECT expand_simple_binop fail and so
we ICE.  We don't really care if they are done promoted in SImode instead.

2021-11-15  Jakub Jelinek  <jakub@redhat.com>

	PR target/103205
	* config/i386/sync.md (atomic_bit_test_and_set<mode>,
	atomic_bit_test_and_complement<mode>,
	atomic_bit_test_and_reset<mode>): Use OPTAB_WIDEN instead of
	OPTAB_DIRECT.

	* gcc.target/i386/pr103205.c: New test.

(cherry picked from commit 625eef42e3)
2022-05-11 07:58:33 +02:00
Jakub Jelinek 4b14a4af62 dwarf2out: Fix up field_byte_offset [PR101378]
For PCC_BITFIELD_TYPE_MATTERS field_byte_offset has quite large code
to deal with it since many years ago (see it e.g. in GCC 3.2, although it
used to be on HOST_WIDE_INTs, then on double_ints, now on offset_ints).
But that code apparently isn't able to cope with members with empty class
types with [[no_unique_address]] attribute, because the empty classes have
non-zero type size but zero decl size and so one can end up from the
computation with negative offset or offset 1 byte smaller than it should be.
For !PCC_BITFIELD_TYPE_MATTERS, we just use
    tree_result = byte_position (decl);
which seems exactly right even for the empty classes or anything which is
not a bitfield (and for which we don't add DW_AT_bit_offset attribute).
So, instead of trying to handle those no_unique_address members in the
current already very complicated code, this limits it to bitfields.

stor-layout.c PCC_BITFIELD_TYPE_MATTERS handling also affects only
bitfields, twice it checks DECL_BIT_FIELD and once DECL_BIT_FIELD_TYPE.

As discussed, this patch uses DECL_BIT_FIELD_TYPE check, because
DECL_BIT_FIELD might be cleared for some bitfields with bitsizes
multiple of BITS_PER_UNIT and e.g.
struct S { int e; int a : 1, b : 7, c : 8, d : 16; } s;
struct T { int a : 1, b : 7; long long c : 8; int d : 16; } t;

int
main ()
{
  s.c = 0x55;
  s.d = 0xaaaa;
  t.c = 0x55;
  t.d = 0xaaaa;
  s.e++;
}
has different debug info with DECL_BIT_FIELD check.

2021-11-11  Jakub Jelinek  <jakub@redhat.com>

	PR debug/101378
	* dwarf2out.c (field_byte_offset): Do the PCC_BITFIELD_TYPE_MATTERS
	handling only for DECL_BIT_FIELD_TYPE decls.

	* g++.dg/debug/dwarf2/pr101378.C: New test.

(cherry picked from commit 10db757301)
2022-05-11 07:58:32 +02:00
Jakub Jelinek a76e866211 openmp: For default(none) ignore variables created by ubsan_create_data [PR64888]
We weren't ignoring the ubsan variables created by c-ubsan.c before gimplification
(others are added later).  One way to fix this would be to introduce further
UBSAN_ internal functions and lower it later (sanopt pass) like other ifns,
this patch instead recognizes those magic vars by name/name of type and DECL_ARTIFICIAL
and TYPE_ARTIFICIAL.

2021-10-21  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/64888
gcc/c-family/
	* c-omp.c (c_omp_predefined_variable): Return true also for
	ubsan_create_data created artificial variables.
gcc/testsuite/
	* c-c++-common/ubsan/pr64888.c: New test.

(cherry picked from commit 40dd9d839e)
2022-05-11 07:58:30 +02:00
Jakub Jelinek 662de049d6 c++: Don't reject calls through PMF during constant evaluation [PR102786]
The following testcase incorrectly rejects the c initializer,
while in the s.*a case cxx_eval_* sees .__pfn reads etc.,
in the s.*&S::foo case get_member_function_from_ptrfunc creates
expressions which use INTEGER_CSTs with type of pointer to METHOD_TYPE.
And cxx_eval_constant_expression rejects any INTEGER_CSTs with pointer
type if they aren't 0.
Either we'd need to make sure we defer such folding till cp_fold but the
function and pfn_from_ptrmemfunc is used from lots of places, or
the following patch just tries to reject only non-zero INTEGER_CSTs
with pointer types if they don't point to METHOD_TYPE in the hope that
all such INTEGER_CSTs with POINTER_TYPE to METHOD_TYPE are result of
folding valid pointer-to-member function expressions.
I don't immediately see how one could create such INTEGER_CSTs otherwise,
cast of integers to PMF is rejected and would have the PMF RECORD_TYPE
anyway, etc.

2021-10-19  Jakub Jelinek  <jakub@redhat.com>

	PR c++/102786
	* constexpr.c (cxx_eval_constant_expression): Don't reject
	INTEGER_CSTs with type POINTER_TYPE to METHOD_TYPE.

	* g++.dg/cpp2a/constexpr-virtual19.C: New test.

(cherry picked from commit f45610a452)
2022-05-11 07:58:30 +02:00
Jakub Jelinek be2c01c634 openmp: Fix up handling of OMP_PLACES=threads(1)
When writing the places-*.c tests, I've noticed that we mishandle threads
abstract name with specified num-places if num-places isn't a multiple of
number of hw threads in a core.  It then happily ignores the maximum count
and overwrites for the remaining hw threads in a core further places that
haven't been allocated.

2021-10-15  Jakub Jelinek  <jakub@redhat.com>

	* config/linux/affinity.c (gomp_affinity_init_level_1): For level 1
	after creating count places clean up and return immediately.
	* testsuite/libgomp.c/places-6.c: New test.
	* testsuite/libgomp.c/places-7.c: New test.
	* testsuite/libgomp.c/places-8.c: New test.

(cherry picked from commit 4764049dd6)
2022-05-11 07:58:30 +02:00
Jakub Jelinek ee221ea5cc c++: Fix apply_identity_attributes [PR102548]
The following testcase ICEs on x86_64-linux with -m32 due to a bug in
apply_identity_attributes.  The function is being smart and attempts not
to duplicate the chain unnecessarily, if either there are no attributes
that affect type identity or there is possibly empty set of attributes
that do not affect type identity in the chain followed by attributes
that do affect type identity, it reuses that attribute chain.

The function mishandles the cases where in the chain an attribute affects
type identity and is followed by one or more attributes that don't
affect type identity (and then perhaps some further ones that do).

There are two bugs.  One is that when we notice first attribute that
doesn't affect type identity after first attribute that does affect type
identity (with perhaps some further such attributes in the chain after it),
we want to put into the new chain just attributes starting from
(inclusive) first_ident and up to (exclusive) the current attribute a,
but the code puts into the chain all attributes starting with first_ident,
including the ones that do not affect type identity and if e.g. we have
doesn't0 affects1 doesn't2 affects3 affects4 sequence of attributes, the
resulting sequence would have
affects1 doesn't2 affects3 affects4 affects3 affects4
attributes, i.e. one attribute that shouldn't be there and two attributes
duplicated.  That is fixed by the a2 -> a2 != a change.

The second one is that we ICE once we see second attribute that doesn't
affect type identity after an attribute that affects it.  That is because
first_ident is set to error_mark_node after handling the first attribute
that doesn't affect type identity (i.e. after we've copied the
[first_ident, a) set of attributes to the new chain) to denote that from
that time on, each attribute that affects type identity should be copied
whenever it is seen (the if (as && as->affects_type_identity) code does
that correctly).  But that condition is false and first_ident is
error_mark_node, we enter else if (first_ident) and use TREE_PURPOSE
/TREE_VALUE/TREE_CHAIN on error_mark_node, which ICEs.  When
first_ident is error_mark_node and a doesn't affect type identity,
we want to do nothing.  So that is the && first_ident != error_mark_node
chunk.

2021-10-05  Jakub Jelinek  <jakub@redhat.com>

	PR c++/102548
	* tree.c (apply_identity_attributes): Fix handling of the
	case where an attribute in the list doesn't affect type
	identity but some attribute before it does.

	* g++.target/i386/pr102548.C: New test.

(cherry picked from commit 737f95bab5)
2022-05-11 07:58:30 +02:00
Jakub Jelinek f806bea0a6 ubsan: Use -fno{,-}sanitize=float-divide-by-zero for float division by zero recovery [PR102515]
We've been using
-f{,no-}sanitize-recover=integer-divide-by-zero to decide on the float
-fsanitize=float-divide-by-zero instrumentation _abort suffix.
This patch fixes it to use -f{,no-}sanitize-recover=float-divide-by-zero
for it instead.

2021-10-01  Jakub Jelinek  <jakub@redhat.com>
	    Richard Biener  <rguenther@suse.de>

	PR sanitizer/102515
gcc/c-family/
	* c-ubsan.c (ubsan_instrument_division): Check the right
	flag_sanitize_recover bit, depending on which sanitization
	is done.
gcc/testsuite/
	* c-c++-common/ubsan/float-div-by-zero-2.c: New test.

(cherry picked from commit 9c1a633d96)
2022-05-11 07:58:30 +02:00
Jakub Jelinek 8837138d29 i386: Don't emit fldpi etc. if -frounding-math [PR102498]
i387 has instructions to store some transcedental numbers into the top of
stack.  The problem is that what exact bit in the last place one gets for
those depends on the current rounding mode, the CPU knows the number with
slightly higher precision.  The compiler assumes rounding to nearest when
comparing them against constants in the IL, but at runtime the rounding
can be different and so some of these depending on rounding mode and the
constant could be 1 ulp higher or smaller than expected.
We only support changing the rounding mode at runtime if the non-default
-frounding-mode option is used, so the following patch just disables
using those constants if that flag is on.

2021-09-28  Jakub Jelinek  <jakub@redhat.com>

	PR target/102498
	* config/i386/i386.c (standard_80387_constant_p): Don't recognize
	special 80387 instruction XFmode constants if flag_rounding_math.

	* gcc.target/i386/pr102498.c: New test.

(cherry picked from commit 3b7041e834)
2022-05-11 07:58:29 +02:00