no-guess-branch-probability option requires profile_count scaling with
initialized_p guard, use multiply instead of apply_scale, which will do
the right thing to undefined probabilities and will not cause unnecesary
roundoff errors and precision info loss.
Also merge the missed part of r12-6086 of factor out function to avoid
duplicate code.
Regression testest pass on Power and X86.
gcc/ChangeLog:
PR tree-optimization/103793
* tree-ssa-loop-split.c (fix_loop_bb_probability): New function.
(split_loop): Use multiply to scale loop1's exit probability.
(do_split_loop_on_cond): Call fix_loop_bb_probability.
gcc/testsuite/ChangeLog:
PR tree-optimization/103793
* gcc.dg/pr103793.c: New test.
On non-ELF targets, the Makefile needs a newline inside the sed REPLACE
string. The way it is currently done fails with GNU Make < 4, but GCC
only requires "GNU make version 3.80 (or later)".
The portable solution is given in the autoconf manual:
https://www.gnu.org/software/autoconf/manual/autoconf-2.69/html_node/Newlines-in-Make-Rules.html
libbacktrace/ChangeLog:
PR libbacktrace/103822
* Makefile.am: Fix newline.
* Makefile.in: Regenerate.
Make the front-end emit the right type for CHARACTER(C_CHAR), VALUE
arguments to BIND(C) procedures. They are scalar integers of C type
char, and should be emitted as such. They are not strings or arrays,
and are not promoted to C int, either.
gcc/fortran/ChangeLog:
PR fortran/103828
* trans-decl.c (generate_local_decl): Do not call
gfc_conv_scalar_char_value(), but check the type tree.
* trans-expr.c (gfc_conv_scalar_char_value): Rename to
conv_scalar_char_value, do not alter type tree.
(gfc_conv_procedure_call): Adjust call to renamed
conv_scalar_char_value() function.
* trans-types.c (gfc_sym_type): Take care of
CHARACTER(C_CHAR), VALUE arguments.
* trans.h (gfc_conv_scalar_char_value): Remove prototype.
gcc/testsuite/ChangeLog:
PR fortran/103828
* gfortran.dg/c_char_tests_3.f90: New file.
* gfortran.dg/c_char_tests_3_c.c: New file.
* gfortran.dg/c_char_tests_4.f90: New file.
* gfortran.dg/c_char_tests_5.f90: New file.
BOOLEAN_TYPE also counts as integral, so verify_type should allow it.
PR c++/99968
gcc/ChangeLog:
* tree.c (verify_type): Allow enumerator with BOOLEAN_TYPE.
gcc/testsuite/ChangeLog:
* g++.dg/ext/is_enum2.C: New test.
Some time ago I've changed const_binop -> wide_int_binop, so that it punts
on shifts by negative count. fold_truth_andor_1 doesn't check the results
of const_binop (?SHIFT_EXPR, ) though and assumes they will be always
non-NULL, which is no longer the case.
2021-12-28 Jakub Jelinek <jakub@redhat.com>
PR middle-end/103813
* fold-const.c (fold_truth_andor_1): Punt of const_binop LSHIFT_EXPR
or RSHIFT_EXPR returns NULL. Formatting fix.
* gcc.c-torture/compile/pr103813.c: New test.
In the following testcase we have a -fcompare-debug failure, because
can_move_invariant_reg doesn't ignore DEBUG_INSNs in its decisions.
In the testcase we have due to uninitialized variable:
loop_header
debug_insn using pseudo84
pseudo84 = invariant
insn using pseudo84
end loop
and with -g decide not to move the pseudo84 = invariant before the
loop header; in this case not resetting the debug insns might be fine.
But, we could have also:
pseudo84 = whatever
loop_header
debug_insn using pseudo84
pseudo84 = invariant
insn using pseudo84
end loop
and in that case not resetting the debug insns would result in wrong-debug.
And, we don't really have generally a good substitution on what pseudo84
contains, it could inherit various values from different paths.
So, the following patch ignores DEBUG_INSNs in the decisions, and if there
are any that previously prevented the optimization, resets them before
return true.
2021-12-28 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/103837
* loop-invariant.c (can_move_invariant_reg): Ignore DEBUG_INSNs in
the decisions whether to return false or continue and right before
returning true reset those debug insns that previously caused
returning false.
* gcc.dg/pr103837.c: New test.
These two spots are meant to punt if the newly added code contains
any CALL_INSNs, because in that case having a large sequence of insns
that also calls something is undesirable, better have one call that
is optimized in itself well.
The functions do last = get_last_insn (); before emitting any insns
(and expand_binop as the ultimate caller uses delete_insns_since if
the expansion fails), but the checks were incorrect for 2 reasons:
1) it checked not just what follows after that last insn, but also
the last insn itself; so, if the division or modulo is immediately
preceded by a CALL_INSN, then we punt; this also causes -fcompare-debug
failures if the CALL_INSN is with -g followed by one or more DEBUG_INSNs
2) if get_last_insn () is NULL (i.e. emitting into a new sequence), then
we didn't check anything
2021-12-28 Jakub Jelinek <jakub@redhat.com>
PR debug/103838
* optabs.c (expand_doubleword_mod, expand_doubleword_divmod): Only
check newly added insns for CALL_P, not the last insn of previous
code.
* gcc.dg/pr103838.c: New test.
It happens that options are parsed and various diagnostics happen
in finish_options. That's a proper place as the function is also called
for optimize/target attributes (pragmas). However, it is possible that
target overwrites an option from command line and so the diagnostics
does not happen. That's fixed in the patch.
- options are parsed and finish_options is called:
if (opts->x_flag_unwind_tables
&& !targetm_common.unwind_tables_default
&& opts->x_flag_reorder_blocks_and_partition
&& (ui_except == UI_SJLJ || ui_except >= UI_TARGET))
{
if (opts_set->x_flag_reorder_blocks_and_partition)
inform (loc,
"%<-freorder-blocks-and-partition%> does not support "
"unwind info on this architecture");
opts->x_flag_reorder_blocks_and_partition = 0;
opts->x_flag_reorder_blocks = 1;
}
It's not triggered because of opts->x_flag_unwind_tables is false by default, but
the option is overwritten in target:
...
if (TARGET_64BIT_P (opts->x_ix86_isa_flags))
{
if (opts->x_optimize >= 1)
SET_OPTION_IF_UNSET (opts, opts_set, flag_omit_frame_pointer,
!USE_IX86_FRAME_POINTER);
if (opts->x_flag_asynchronous_unwind_tables
&& TARGET_64BIT_MS_ABI)
SET_OPTION_IF_UNSET (opts, opts_set, flag_unwind_tables, 1);
...
PR driver/103465
gcc/ChangeLog:
* opts.c (finish_options): More part of diagnostics to ...
(diagnose_options): ... here. Call the function from both
finish_options and process_options.
* opts.h (diagnose_options): Declare.
* toplev.c (process_options): Call diagnose_options.
register_operand predicate allows not just REGs, but also SUBREGs of REGs,
and for the latter lowpart_subreg might FAIL when trying to create paradoxical
SUBREG in some cases. For the input operand fixed by force_reg on it first,
for the output operand handled by always dividing into a fresh V4SFmode temporary
and emit_move_insn into the destination afterwards, that is also beneficial for
combine.
2021-12-28 Jakub Jelinek <jakub@redhat.com>
PR target/103842
* config/i386/mmx.md (divv2sf3): Use force_reg on op1. Always perform
divv4sf3 into a pseudo and emit_move_insn into operands[0].
* g++.dg/opt/pr103842.C: New test.
gcc/testsuite/ChangeLog:
* gcc.target/i386/amx-check.h (check_float_tile_register):
New check function for float to prevent precision loss.
* gcc.target/i386/amxbf16-dpbf16ps-2.c: Correct the type convert
and byte offset. Use the new check function.
The r12-6123 fix for SFINAE with p+N and incomplete type also fixed
the analogous issue with p[N].
PR c++/101239
gcc/testsuite/ChangeLog:
* g++.dg/template/sfinae32a.C: New test.
In pointer_int_sum when called from a SFINAE context, we need to avoid
calling size_in_bytes_loc on an incomplete pointed-to type since this
latter function isn't SFINAE-enabled and always emits an error on such
input.
PR c++/103700
gcc/c-family/ChangeLog:
* c-common.c (pointer_int_sum): When quiet, return
error_mark_node for an incomplete pointed-to type and don't
call size_in_bytes_loc.
gcc/testsuite/ChangeLog:
* g++.dg/template/sfinae32.C: New test.
The 'm' constraint is defined with define_memory_constraint which allows
LRA to convert the operand to the form '(mem (reg X))', where X is a
base register. To prevent LRA from generating '(mem (reg X))' from a
register:
1. Add a 'BM' constraint which is similar to the 'm' constraint, but
is defined with define_constraint.
2. Add a 'm' mode attribute which is mapped to the 'm' constraint for
general_operand and the 'BM' constraint for x86_64_general_operand.
3. Replace the 'm' constraint on <general_operand> with the '<m>'
constraint.
4. Replace the 'm' constraint on x86_64_general_operand with the 'BM'
constraint.
gcc/
PR target/103762
* config/i386/constraints.md (BM): New constraint.
* config/i386/i386.md (m): New mode attribute.
Replace the 'm' constraint on <general_operand> with the '<m>'
constraint.
Replace the 'm' constraint on x86_64_general_operand with the
'BM' constraint.
gcc/testsuite/
* gcc.target/i386/pr103762-1a.c: New test.
* gcc.target/i386/pr103762-1b.c: Likewise.
* gcc.target/i386/pr103762-1c.c: Likewise.
libgfortran/ChangeLog:
PR libfortran/98076
* runtime/string.c (itoa64, itoa64_pad19): New helper functions.
(gfc_itoa): On targets with 128-bit integers, call fast
64-bit functions to avoid many slow divisions.
gcc/testsuite/ChangeLog:
PR libfortran/98076
* gfortran.dg/pr98076.f90: New test.
libgfortran/ChangeLog:
PR libfortran/81986
PR libfortran/99191
* libgfortran.h: Remove gfc_xtoa(), adjust gfc_itoa() and
GFC_ITOA_BUF_SIZE.
* io/write.c (write_decimal): conversion parameter is always
gfc_itoa(), so remove it. Protect from overflow.
(xtoa): Move gfc_xtoa and update its name.
(xtoa_big): Renamed from ztoa_big for consistency.
(write_z): Adjust to new function names.
(write_i, write_integer): Remove last arg of write_decimal.
* runtime/backtrace.c (error_callback): Comment on the use of
gfc_itoa().
* runtime/error.c (gfc_xtoa): Move to io/write.c.
* runtime/string.c (gfc_itoa): Take an unsigned argument,
remove the handling of negative values.
As per title.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/ChangeLog:
* config/darwin.c (darwin_override_options): Make a comment
more inclusive.
The current rule was too strict and has not been required since Darwin11.
This relaxes the constraint to allow up to 2^28 alignment for non-common
entities. Common is still restricted to a maximum aligment of 2^15.
When the host is an older version of Darwin ( earlier that 11 ) then the
existing constraint is still applied. Note that this is a host constraint
not a target one (so that a compilation on 10.7 targeting 10.6 is allowed
to use a greater alignment than the tools on 10.6 support). This matches
the behaviour of clang.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/ChangeLog:
* config.gcc: Emit L2_MAX_OFILE_ALIGNMENT with suitable
values for the host.
* config/darwin.c (darwin_emit_common): Error for alignment
values > 32768.
* config/darwin.h (MAX_OFILE_ALIGNMENT): Rework to use the
configured L2_MAX_OFILE_ALIGNMENT.
gcc/testsuite/ChangeLog:
* gcc.dg/darwin-aligned-globals.c: New test.
* gcc.dg/darwin-comm-1.c: New test.
* gcc.dg/attr-aligned.c: Amend for new alignment values on
Darwin.
* gcc.target/i386/pr89261.c: Likewise.
We were checking whether the flag had been set by the user, but not if
it was set to true. Which means that the check fails in its intent when
the user puts -fno-reorder-and-partition.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/ChangeLog:
* config/darwin.c (darwin_override_options): When checking for the
flag-reorder-and-partition case, also check that it is set on.
There are places that we need to make different codegen depending
on the object format rather than on the arch. We already have
definitions for ELF, COFF etc. this adds one for MACHO.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/ChangeLog:
* config/darwin.h (OBJECT_FORMAT_MACHO): New.
This is a fix to PR target/103773 where -Oz shouldn't use push/pop
on x86 to shrink writing small integer constants to memory.
Instead clang uses "andl $0, mem" for writing zero, and "orl $-1, mem"
when writing -1 to memory when using -Oz. This patch implements this
via peephole2 where we can confirm that its ok to clobber the flags.
2021-12-23 Roger Sayle <roger@nextmovesoftware.com>
Uroš Bizjak <ubizjak@gmail.com>
gcc/ChangeLog
PR target/103773
* config/i386/i386.md (*mov<mode>_and): New define_insn for
writing a zero to memory using AND.
(*mov<mode>_or): Extend to allow memory destination and HImode.
(*movdi_internal): Remove -Oz push/pop optimization from here.
(*movsi_internal): Likewise.
(peephole2): Perform -Oz push/pop optimization here, only for
register destinations, values other than zero, and in functions
that don't used the red zone.
(peephole2): With -Oz, convert writes of 0 or -1 to memory into
their clobber forms, i.e. *mov<mode>_and and *mov<mode>_or resp.
gcc/testsuite/ChangeLog
PR target/103773
* gcc.target/i386/pr103773-2.c: New test case.
* gcc.target/i386/pr103773.c: New test case.
gcc/fortran/ChangeLog:
PR fortran/103778
* check.c (is_c_interoperable): A BOZ literal constant is not
interoperable.
gcc/testsuite/ChangeLog:
PR fortran/103778
* gfortran.dg/illegal_boz_arg_3.f90: New test.
gcc/fortran/ChangeLog:
PR fortran/103776
* match.c (match_case_selector): Reject expressions in CASE
selector which are not scalar.
gcc/testsuite/ChangeLog:
PR fortran/103776
* gfortran.dg/select_10.f90: New test.
Move the implementation of MVE ACLE types from arm_mve_types.h to
inside GCC via a new pragma, which replaces the prior type
definitions. This allows for the types to be used internally for
intrinsic function definitions.
gcc/ChangeLog:
* config.gcc (arm*-*-*): Add arm-mve-builtins.o to extra_objs.
* config/arm/arm-c.c (arm_pragma_arm): Handle "#pragma GCC arm".
(arm_register_target_pragmas): Register it.
* config/arm/arm-protos.h: (arm_mve::arm_handle_mve_types_h): New
prototype.
* config/arm/arm_mve_types.h: Replace MVE type definitions with
new pragma.
* config/arm/t-arm: (arm-mve-builtins.o): New target rule.
* config/arm/arm-mve-builtins.cc: New file.
* config/arm/arm-mve-builtins.def: New file.
* config/arm/arm-mve-builtins.h: New file.
gcc/testsuite/ChangeLog:
* gcc.target/arm/mve/mve.exp: Add new subdirectories.
* gcc.target/arm/mve/general-c/type_redef_1.c: New test.
* gcc.target/arm/mve/general/double_pragmas_1.c: New test.
* gcc.target/arm/mve/general/nomve_1.c: New test.
Move the arm_simd_type and arm_type_qualifiers enums, and
arm_simd_info struct from arm-builtins.c into arm-builtins.h header.
This is a first step towards internalising the type definitions for
MVE predicate, vector, and tuple types. By moving arm_simd_types into
a header, we allow future patches to use these type trees externally
to arm-builtins.c, which is a crucial step towards developing an MVE
intrinsics framework similar to the current SVE implementation.
gcc/ChangeLog:
* config/arm/arm-builtins.c (enum arm_type_qualifiers): Move to
arm_builtins.h.
(enum arm_simd_type): Move to arm-builtins.h.
(struct arm_simd_type_info): Move to arm-builtins.h.
* config/arm/arm-builtins.h (enum arm_simd_type): Move from
arm-builtins.c.
(enum arm_type_qualifiers): Move from arm-builtins.c.
(struct arm_simd_type_info): Move from arm-builtins.c.
The logic for detection of REAL(KIND=16) in kinds-override.h made
assumptions:
-- if real(kind=10) exists, i.e. if HAVE_GFC_REAL_10 is defined,
then it is necessarily the "long double" type
-- if real(kind=16) exists, then:
* if HAVE_GFC_REAL_10, real(kind=16) is "__float128"
* otherwise, real(kind=16) is "long double"
This may not always be true. Take the aarch64-apple-darwin port,
it has double == long double == binary64, and __float128 == binary128.
We already have more fine-grained logic in the mk-kinds-h.sh script,
where we actually check the Fortran kind corresponding to C’s long
double. So let's use it, and emit the GFC_REAL_16_IS_FLOAT128 /
GFC_REAL_16_IS_LONG_DOUBLE macros there.
libgfortran/ChangeLog:
* kinds-override.h: Move GFC_REAL_16_IS_* macros...
* mk-kinds-h.sh: ... here.
As well as checking for the existence of a GDC compiler, also validate
that it has also been built with libphobos, otherwise warn or fail with
the message that GDC is required to build d.
config/ChangeLog:
PR d/103528
* acx.m4 (ACX_PROG_GDC): Add check whether D compiler works.
ChangeLog:
* configure: Regenerate.