i387 has instructions to store some transcedental numbers into the top of
stack. The problem is that what exact bit in the last place one gets for
those depends on the current rounding mode, the CPU knows the number with
slightly higher precision. The compiler assumes rounding to nearest when
comparing them against constants in the IL, but at runtime the rounding
can be different and so some of these depending on rounding mode and the
constant could be 1 ulp higher or smaller than expected.
We only support changing the rounding mode at runtime if the non-default
-frounding-mode option is used, so the following patch just disables
using those constants if that flag is on.
2021-09-28 Jakub Jelinek <jakub@redhat.com>
PR target/102498
* config/i386/i386.c (standard_80387_constant_p): Don't recognize
special 80387 instruction XFmode constants if flag_rounding_math.
* gcc.target/i386/pr102498.c: New test.
(cherry picked from commit 3b7041e834)
The code sequence emitted uses CC internally.
gcc/ChangeLog:
* config/s390/tpf.md (prologue_tpf, epilogue_tpf): Add cc clobber.
(cherry picked from commit e1223ea2f4)
Avoid emitting a strict low part move if the insv target actually
affects the whole target reg.
gcc/ChangeLog:
PR target/102222
* config/s390/s390.c (s390_expand_insv): Emit a normal move if it
is actually a full copy of the source operand into the target.
Don't emit a strict low part move if source and target mode match.
gcc/testsuite/ChangeLog:
* gcc.target/s390/pr102222.c: New test.
(cherry picked from commit a9b3c451be)
There is one inconsistent bit-field streaming out and in.
On the side of streaming in:
bp_pack_value (&bp, info->inlinable, 1);
bp_pack_value (&bp, false, 1);
bp_pack_value (&bp, info->fp_expressions, 1);
while on the side of the streaming out:
info->inlinable = bp_unpack_value (&bp, 1);
info->fp_expressions = bp_unpack_value (&bp, 1)
The removal of Cilk Plus support r8-4956 missed to remove
the streaming out of the bit, instead just change the value
for streaming out to be always false.
By hacking fp_expression_p to always return true, I can see
it reads the wrong fp_expressions value (false) out in wpa.
GCC12 adopts commit 63c6446f77
which removes the inconsistent streaming out instead.
gcc/ChangeLog:
* ipa-fnsummary.c (inline_read_section): Unpack a dummy bit
to keep consistent with the side of streaming out.
We cannot use r12 here, it is already in use as the GEP (for sibling
calls).
2021-09-08 Segher Boessenkool <segher@kernel.crashing.org>
PR target/102107
* config/rs6000/rs6000-logue.c (rs6000_emit_epilogue): For ELFv2 use
r11 instead of r12 for restoring CR.
(cherry picked from commit 86e6268cff)
CR is saved and/or restored on some paths where GPR12 is already live
since it has a meaning in the calling convention in the ELFv2 ABI.
It is not completely clear to me that we can always use r11 here, but
it does seem save, there is checking code (to detect conflicts here),
and it is stage 1. So here goes.
2021-09-03 Segher Boessenkool <segher@kernel.crashing.org>
PR target/102107
* config/rs6000/rs6000-logue.c (rs6000_emit_prologue): On ELFv2 use r11
instead of r12 for CR save, in all cases.
(cherry picked from commit 2484f7a4b0)
gcc/fortran/ChangeLog:
PR fortran/102366
* trans-decl.c (gfc_finish_var_decl): Disable the warning message
for variables moved from stack to static storange if they are
declared in the main, but allow the move to happen.
gcc/testsuite/ChangeLog:
PR fortran/102366
* gfortran.dg/pr102366.f90: New test.
(cherry picked from commit 51166eb2c5)
The implementation of the no_fsanitize_address effective target was copied
from asan-dg.exp without realizing that it does not work outside of this
context (there is a comment explaining why). As a consequence, it always
returns 0, so for example the directive in gnat.dg/asan1.adb:
{ dg-skip-if "no address sanitizer" { no_fsanitize_address } }
does not work. This led some people to add the nonsensical:
{ dg-require-effective-target no_fsanitize_address }
to sanitizer tests, e.g. g++.dg/warn/uninit-pr93100.C, thus disabling them
everywhere instead of just for the problematic targets.
gcc/testsuite/
* lib/target-supports.exp (no_fsanitize_address): Add missing bits.
* gcc.dg/pr91441.c: Likewise.
* gcc.dg/pr96260.c: Likewise.
* gcc.dg/pr96307.c: Likewise.
* gnat.dg/asan1.adb: Likewise.
* g++.dg/abi/anon4.C: Likewise.
While OpenMP 5.1 and GCC 12 permits 'order(concurrent)' on distribute,
OpenMP 5.0 and GCC 11 don't. This patch for GCC 11 ensures the clause also
does not end up on 'distribute' when splitting combined directives.
gcc/fortran/ChangeLog:
* trans-openmp.c (gfc_split_omp_clauses): Don't put 'order(concurrent)'
on 'distribute' for combined directives, matching OpenMP 5.0
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/distribute-order-concurrent.f90: New test.
gcc/fortran/ChangeLog:
PR fortran/102287
* trans-expr.c (gfc_conv_procedure_call): Wrap deallocation of
allocatable components of optional allocatable derived type
procedure arguments with INTENT(OUT) into a presence check.
gcc/testsuite/ChangeLog:
PR fortran/102287
* gfortran.dg/intent_out_14.f90: New test.
(cherry picked from commit cfea7b86f2)
This is a duplication of volatile loads introduced during GCC 9 development
by the 2->2 mechanism of the RTL combiner. There is already a substantial
checking for volatile references in can_combine_p but it implicitly assumes
that the combination reduces the number of instructions, which is of course
not the case here. So the fix teaches try_combine to abort the combination
when it is about to make a copy of volatile references to preserve them.
gcc/
PR rtl-optimization/102306
* combine.c (try_combine): Abort the combination if we are about to
duplicate volatile references.
gcc/testsuite/
* gcc.target/sparc/20210917-1.c: New test.
gcc/fortran/ChangeLog:
PR fortran/85130
* expr.c (find_substring_ref): Handle given substring start and
end indices as signed integers, not unsigned.
gcc/testsuite/ChangeLog:
PR fortran/85130
* gfortran.dg/substr_6.f90: Revert commit r8-7574, adding again
test that was erroneously considered as illegal.
(cherry picked from commit 8d93ba93d3)
gcc/fortran/ChangeLog:
PR fortran/82314
* decl.c (add_init_expr_to_sym): For proper initialization of
array-valued named constants the array bounds need to be
simplified before adding the initializer.
gcc/testsuite/ChangeLog:
PR fortran/82314
* gfortran.dg/pr82314.f90: New test.
(cherry picked from commit 104c05c528)
The LEON5 can often dual issue instructions from the same 64-bit aligned
double word if there are no data dependencies. Add scheduling information
to avoid scheduling unpairable instructions back-to-back.
gcc/ChangeLog:
* config/sparc/sparc-opts.h (enum sparc_processor_type): Add LEON5
* config/sparc/sparc.c (struct processor_costs): Add LEON5 costs
(leon5_adjust_cost): Increase cost of store with data dependency
on ALU instruction and FPU anti-dependencies.
(sparc_option_override): Add LEON5 costs
(sparc_adjust_cost): Add LEON5 cost adjustments
* config/sparc/sparc.h: Add LEON5
* config/sparc/sparc.md: Include LEON5 scheduling information
* config/sparc/sparc.opt: Add LEON5
* doc/invoke.texi: Add LEON5
* config/sparc/leon5.md: New file.
This is needed to prevent the Store -> (Non-store or load) -> Store
sequence.
gcc/ChangeLog:
* config/sparc/sparc.md (stack_protect_setsi): Add NOP to prevent
sensitive sequence for B2BST errata workaround.
A call to the function might have a load instruction in the delay slot
and a load followed by an atomic function could cause a deadlock.
gcc/ChangeLog:
* config/sparc/sparc.c (sparc_do_work_around_errata): Do not begin
functions with atomic instruction in the UT700 errata workaround.
This version detects multiple empty assembly statements in a row and also
detects non-memory barrier empty assembly statements (__asm__("")). It
can be used instead of next_active_insn().
gcc/ChangeLog:
* config/sparc/sparc.c (next_active_non_empty_insn): New function
that returns next active non empty assembly instruction.
(sparc_do_work_around_errata): Use new function.
Check the attribute of instruction to determine if it performs a store
or load operation. This more generic approach sees the last instruction
in the GOTdata_op model as a potential load and treats the memory barrier
as a potential store instruction.
gcc/ChangeLog:
* config/sparc/sparc.c (store_insn_p): Add predicate for store
attributes.
(load_insn_p): Add predicate for load attributes.
(sparc_do_work_around_errata): Use new predicates.
The problem here is the aarch64_expand_setmem code did not check
STRICT_ALIGNMENT if it is creating an overlapping store.
This patch adds that check and the testcase works.
gcc/ChangeLog:
PR target/101934
* config/aarch64/aarch64.c (aarch64_expand_setmem):
Check STRICT_ALIGNMENT before creating an overlapping
store.
gcc/testsuite/ChangeLog:
PR target/101934
* gcc.target/aarch64/memset-strict-align-1.c: New test.
(cherry picked from commit a45786e9a3)
> > Note, if the flexible array member is initialized only with non-constant
> > initializers, we have a worse bug that this patch doesn't solve, the
> > splitting of initializers into constant and dynamic initialization removes
> > the initializer and we don't have just wrong DECL_*SIZE, but nothing is
> > emitted when emitting those vars into assembly either and so the dynamic
> > initialization clobbers other vars that may overlap the variable.
> > I think we need keep an empty CONSTRUCTOR elt in DECL_INITIAL for the
> > flexible array member in that case.
>
> Makes sense.
So, the following patch fixes that.
The typeck2.c change makes sure we keep those CONSTRUCTORs around (although
they should be empty because all their elts had side-effects/was
non-constant if it was removed earlier), and the varasm.c change is to avoid
ICEs on those as well as ICEs on other flex array members that had some
initializers without side-effects, but not on the last array element.
The code was already asserting that the (index of the last elt in the
CONSTRUCTOR + 1) times elt size is equal to TYPE_SIZE_UNIT of the local->val
type, which is true for C flex arrays or for C++ if they don't have any
side-effects or the last elt doesn't have side-effects, this patch changes
that to assertion that the TYPE_SIZE_UNIT is greater than equal to the
offset of the end of last element in the CONSTRUCTOR and uses TYPE_SIZE_UNIT
(int_size_in_bytes) in the code later on.
2021-09-15 Jakub Jelinek <jakub@redhat.com>
PR c++/88578
PR c++/102295
gcc/
* varasm.c (output_constructor_regular_field): Instead of assertion
that array_size_for_constructor result is equal to size of
TREE_TYPE (local->val) in bytes, assert that the type size is greater
or equal to array_size_for_constructor result and use type size as
fieldsize.
gcc/cp/
* typeck2.c (split_nonconstant_init_1): Don't throw away empty
initializers of flexible array members if they have non-zero type
size.
gcc/testsuite/
* g++.dg/ext/flexary39.C: New test.
* g++.dg/ext/flexary40.C: New test.
(cherry picked from commit e5d1af8a07)
The C FE updates DECL_*SIZE for vars which have initializers for flexible
array members for many years, but C++ FE kept DECL_*SIZE the same as the
type size (i.e. as if there were zero elements in the flexible array
member). This results e.g. in ELF symbol sizes being too small.
Note, if the flexible array member is initialized only with non-constant
initializers, we have a worse bug that this patch doesn't solve, the
splitting of initializers into constant and dynamic initialization removes
the initializer and we don't have just wrong DECL_*SIZE, but nothing is
emitted when emitting those vars into assembly either and so the dynamic
initialization clobbers other vars that may overlap the variable.
I think we need keep an empty CONSTRUCTOR elt in DECL_INITIAL for the
flexible array member in that case.
2021-09-14 Jakub Jelinek <jakub@redhat.com>
PR c++/102295
* decl.c (layout_var_decl): For aggregates ending with a flexible
array member, add the size of the initializer for that member to
DECL_SIZE and DECL_SIZE_UNIT.
* g++.target/i386/pr102295.C: New test.
(cherry picked from commit 818c505188)
is_xible_helper returns error_mark_node (i.e. false from the traits)
for abstract classes by testing ABSTRACT_CLASS_TYPE_P (to) early.
Unfortunately, as the testcase shows, that doesn't work on class templates
that haven't been instantiated yet, ABSTRACT_CLASS_TYPE_P for them is false
until it is instantiated, which is done when the routine later constructs
a dummy object with that type.
The following patch fixes this by calling complete_type first, so that
ABSTRACT_CLASS_TYPE_P test will work properly, while keeping the handling
of arrays with unknown bounds, or incomplete types where it is done
currently.
2021-09-14 Jakub Jelinek <jakub@redhat.com>
PR c++/102305
* method.c (is_xible_helper): Call complete_type on to.
* g++.dg/cpp0x/pr102305.C: New test.
(cherry picked from commit f008fd3a48)
The MMA build built-ins currently use individual lxv instructions to
load up the registers of a __vector_pair or __vector_quad. If the
memory addresses of the built-in operands are to adjacent locations,
then we can use an lxvp in some cases to load up two registers at once.
The patch below adds support for checking whether memory addresses are
adjacent and emitting an lxvp instead of two lxv instructions.
2021-07-14 Peter Bergner <bergner@linux.ibm.com>
gcc/
* config/rs6000/rs6000.c (adjacent_mem_locations): Return the lower
addressed memory rtx, if any.
(rs6000_split_multireg_move): Fix code formatting.
Handle MMA build built-ins with operands in adjacent memory locations.
gcc/testsuite/
* gcc.target/powerpc/mma-builtin-9.c: New test.
(cherry picked from commit 69feb7601e)
An upcoming change to rs6000_split_multireg_move requires it to be
moved later in the file to fix a declaration issue.
2021-07-14 Peter Bergner <bergner@linux.ibm.com>
gcc/
* config/rs6000/rs6000.c (rs6000_split_multireg_move): Move to later
in the file.
(cherry picked from commit 7d914777fc)
Backported from master:
2021-08-09 Pat Haugen <pthaugen@linux.ibm.com>
gcc/ChangeLog:
* config/rs6000/rs6000.c (is_load_insn1): Verify destination is a
register.
(is_store_insn1): Verify source is a register.
This is a regression present on the mainline and 11 branch in the form of an
ICE for an enumeration type with a full signed representation for its size.
gcc/ada/
PR ada/101970
* exp_attr.adb (Expand_N_Attribute_Reference) <Attribute_Enum_Rep>:
Use an unchecked conversion instead of a regular conversion in the
enumeration case and remove Conversion_OK flag in the integer case.
<Attribute_Pos>: Remove superfluous test.
gcc/testsuite/
* gnat.dg/enum_rep2.adb: New test.
The error is to be issued when objects of the type are declared instead.
gcc/ada/
* gcc-interface/decl.c (validate_size): Do not issue an error if the
old size has overflowed.
They should not be 0-based, unless the array type itself is.
gcc/ada/
* gcc-interface/decl.c (gnat_to_gnu_entity): For vector types, make
the representative array the debug type.
Recent compilers enforce more strictly the RM C.6(18) clause, which says
that volatile record types are by-reference types. This changes the typical
error message now given in these cases.
gcc/ada/
* gcc-interface/decl.c (gnat_to_gnu_entity) <is_type>: Declare new
constant. Adjust error message issued by validate_size in the case
of by-reference types.
(validate_size): Always use the error strings passed by the caller.