uninit diagnostics uses passing via reference and access attributes
but that iterates over function type arguments which can in some
cases appearantly outrun the actual arguments leading to ICEs.
The following simply ignores not present arguments.
2022-06-14 Richard Biener <rguenther@suse.de>
PR tree-optimization/105946
* tree-ssa-uninit.cc (maybe_warn_pass_by_reference):
Do not look at arguments not specified in the function call.
(cherry picked from commit e07a876c07)
The following avoids the need to massage the target optimization
node at WPA time when we fixup the optimization node, copying
FP related flags from callee to caller. The target is already
set up to fixup, but that only works when not switching between
functions. After fixing that the fixup is then done at LTRANS
time when materializing the function.
2022-07-01 Richard Biener <rguenthert@suse.de>
PR target/105459
* config/i386/i386-options.cc (ix86_set_current_function):
Rebuild the target optimization node whenever necessary,
not only when the optimization node didn't change.
* gcc.dg/lto/pr105459_0.c: New testcase.
(cherry picked from commit 4c94382a13)
gcc/fortran/ChangeLog:
PR fortran/104313
* trans-decl.cc (gfc_generate_return): Do not generate conflicting
fake results for functions with no result variable under -ff2c.
gcc/testsuite/ChangeLog:
PR fortran/104313
* gfortran.dg/pr104313.f: New test.
(cherry picked from commit 517fb1a781)
Testing has found that using load and store vector pair for block copies
can result in a slow down on power10. This patch disables using the
vector pair instructions for block copies if we are tuning for power10.
2022-06-11 Michael Meissner <meissner@linux.ibm.com>
gcc/
* config/rs6000/rs6000.cc (rs6000_option_override_internal): Do
not generate block copies with vector pair instructions if we are
tuning for power10. Back port from master branch.
In check_new_reg_p, the nregs of a du chain is computed by obtaining the
MODE of the first element in the chain, and then calling
hard_regno_nregs() with the MODE. But the first element of the chain can
be a DEBUG_INSN whose mode need not be the same as the rest of the
elements in the du chain. This was resulting in fcompare-debug failure
as check_new_reg_p was returning a different result with -g for the same
candidate register. We can instead obtain nregs from the du chain
itself.
2022-06-10 Surya Kumari Jangala <jskumari@linux.ibm.com>
gcc/
PR rtl-optimization/105041
* regrename.cc (check_new_reg_p): Use nregs value from du chain.
gcc/testsuite/
PR rtl-optimization/105041
* gcc.target/powerpc/pr105041.c: New test.
(cherry picked from commit 3e16b4359e)
As the testcase in PR 105860 shows, the code that tries to re-use the
handled_component chains in SRA can be horribly confused by unions,
where it thinks it has found a compatible structure under which it can
chain the references, but in fact it found the type it was looking
for elsewhere in a union and generated a write to a completely wrong
part of an aggregate.
I don't remember whether the plan was to support unions at all in
build_reconstructed_reference but it can work, to an extent, if we
make sure that we start the search only outside the outermost union,
which is what the patch does (and the extra testcase verifies).
Additionally, this commit also contains sqashed in it a backport of
b984b84cbe which fixes the testcase
gcc.dg/tree-ssa/alias-access-path-13.c for many 32-bit targets.
gcc/ChangeLog:
2022-07-01 Martin Jambor <mjambor@suse.cz>
PR tree-optimization/105860
* tree-sra.cc (build_reconstructed_reference): Start expr
traversal only just below the outermost union.
gcc/testsuite/ChangeLog:
2022-07-01 Martin Jambor <mjambor@suse.cz>
PR tree-optimization/105860
* gcc.dg/tree-ssa/alias-access-path-13.c: New test.
* gcc.dg/tree-ssa/pr105860.c: Likewise.
(cherry picked from commit b110e5283e)
(mult (sign_extend:DI rj:SI) (sign_extend:DI rk:SI)) should be
"mulw.d.w", not "mul.d".
gcc/ChangeLog:
* config/loongarch/loongarch.md (mulsidi3_64bit): Use mulw.d.w
instead of mul.d.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/mulw_d_w.c: New test.
* gcc.c-torture/execute/mul-sext.c: New test.
(cherry picked from commit 1fa42d6214)
This is a backport of the fix for PR target/105930 from mainline to the
gcc12 release branch.
2022-07-09 Roger Sayle <roger@nextmovesoftware.com>
Uroš Bizjak <ubizjak@gmail.com>
gcc/ChangeLog
PR target/105930
* config/i386/i386.md (*<any_or>di3_doubleword): Split after
reload. Use rtx_equal_p to avoid creating memory-to-memory moves,
and emit NOTE_INSN_DELETED if operand[2] is zero (i.e. with -O0).
Under the LA architecture, when the stack is dropped too far, the process
of dropping the stack is divided into two steps.
step1: After dropping the stack, save callee saved registers on the stack.
step2: The rest of it.
The stack drop operation is optimized when frame->total_size minus
frame->sp_fp_offset is an integer multiple of 4096, can reduce the number
of instructions required to drop the stack. However, this optimization is
not effective because of the original calculation method
The following case:
int main()
{
char buf[1024 * 12];
printf ("%p\n", buf);
return 0;
}
As you can see from the generated assembler, the old GCC has two more
instructions than the new GCC, lines 14 and line 24.
new old
10 main: | 11 main:
11 addi.d $r3,$r3,-16 | 12 lu12i.w $r13,-12288>>12
12 lu12i.w $r13,-12288>>12 | 13 addi.d $r3,$r3,-2032
13 lu12i.w $r5,-12288>>12 | 14 ori $r13,$r13,2016
14 lu12i.w $r12,12288>>12 | 15 lu12i.w $r5,-12288>>12
15 st.d $r1,$r3,8 | 16 lu12i.w $r12,12288>>12
16 add.d $r12,$r12,$r5 | 17 st.d $r1,$r3,2024
17 add.d $r3,$r3,$r13 | 18 add.d $r12,$r12,$r5
18 add.d $r5,$r12,$r3 | 19 add.d $r3,$r3,$r13
19 la.local $r4,.LC0 | 20 add.d $r5,$r12,$r3
20 bl %plt(printf) | 21 la.local $r4,.LC0
21 lu12i.w $r13,12288>>12 | 22 bl %plt(printf)
22 add.d $r3,$r3,$r13 | 23 lu12i.w $r13,8192>>12
23 ld.d $r1,$r3,8 | 24 ori $r13,$r13,2080
24 or $r4,$r0,$r0 | 25 add.d $r3,$r3,$r13
25 addi.d $r3,$r3,16 | 26 ld.d $r1,$r3,2024
26 jr $r1 | 27 or $r4,$r0,$r0
| 28 addi.d $r3,$r3,2032
| 29 jr $r1
gcc/ChangeLog:
* config/loongarch/loongarch.cc (loongarch_compute_frame_info):
Modify fp_sp_offset and gp_sp_offset's calculation method,
when frame->mask or frame->fmask is zero, don't minus UNITS_PER_WORD
or UNITS_PER_FP_REG.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/prolog-opt.c: New test.
(cherry picked from commit aa8fd7f656)
The ${host_builddir}/largefile-config.h header can't be written until
its parent directory has been created, so it needs to have the creation
of that directory as a prerequisite.
libstdc++-v3/ChangeLog:
PR libstdc++/106162
* include/Makefile.am (largefile-config.h): Add
stamp-${host_alias} prerequisite.
* include/Makefile.in: Regenerate.
(cherry picked from commit 8a6ee426c2)
Although these tests use filesystem::remove_all to clean up, that fails
because it uses recursive_directory_iterator which is intentionally
bodged by the custom readdir defined in the test.
Just use POSIX rmdir to clean up. We don't need to use _rmdir or _wrmdir
for Windows, because we'll never reach test02() on targets where the
custom readdir doesn't interpose the one from libc.
libstdc++-v3/ChangeLog:
* testsuite/27_io/filesystem/iterators/error_reporting.cc: Use
rmdir to remove directories.
* testsuite/experimental/filesystem/iterators/error_reporting.cc:
Likewise.
(cherry picked from commit 7c1c7e120c)
The <https://gcc.gnu.org/pipermail/gcc/2022-May/238679.html> thread
seems to have concluded that -Wformat shouldn't warn about
printf((const char*) u8"test %d\n", 1);
saying "format string is not an array of type 'char'". This code
is not an aliasing violation, and there are no I/O functions for u8
strings, so the const char * cast is OK and shouldn't be disregarded.
PR c++/105626
gcc/c-family/ChangeLog:
* c-format.cc (check_format_arg): Don't emit -Wformat warnings with
u8 strings.
gcc/testsuite/ChangeLog:
* g++.dg/warn/Wformat-char8_t-1.C: New test.
(cherry picked from commit 543828e79b)
The routine fold_using_range::relation_fold_and_or needs to verify that both
operands of 2 stmts are the same, and uses GORIs dependency cache for this.
This cache cannot be counted on to reflect the current contents of a
stmt, expecially in the presence of an IL changing pass. Instead, look at the
statement operands.
PR tree-optimization/106114
gcc/
* gimple-range-fold.cc (fold_using_range::relation_fold_and_or): Check
statement operands instead of GORI cache.
gcc/testsuite/
* gcc.dg/pr106114.c: New.
Casting from vector to static array is permitted, and the frontend
generates a reinterpret cast, but casting back the other way resulted in
an error. This has been fixed to be properly handled in the code
generation pass of VectorExp, and the conversion for lvalue and rvalue
handling done in convert_expr and convert_for_rvalue respectively.
PR d/106139
gcc/d/ChangeLog:
* d-convert.cc (convert_expr): Handle casting from array to vector.
(convert_for_rvalue): Rewrite vector to array casts of the same
element type into a constructor.
(convert_for_assignment): Return calling convert_for_rvalue.
* expr.cc (ExprVisitor::visit (VectorExp *)): Handle generating a
vector expression from a static array.
* toir.cc (IRVisitor::visit (ReturnStatement *)): Call
convert_for_rvalue on return value.
gcc/testsuite/ChangeLog:
* gdc.dg/pr106139a.d: New test.
* gdc.dg/pr106139b.d: New test.
* gdc.dg/pr106139c.d: New test.
* gdc.dg/pr106139d.d: New test.
(cherry picked from commit 329bef49da)
This patch addresses PR target/105991 where a change to prefer representing
shifts and adds at the tree-level as multiplications, causes problems for
the rldimi patterns in the powerpc backend. The issue is that rs6000.md
models this pattern using IOR, and some variants that have the equivalent
PLUS or XOR in the RTL fail to match some *rotl<mode>4_insert patterns.
This is fixed in this patch by adding a define_insn_and_split to locally
canonicalize the PLUS and XOR forms to the backend's preferred IOR form.
Backported from master.
2022-07-04 Roger Sayle <roger@nextmovesoftware.com>
Marek Polacek <polacek@redhat.com>
Segher Boessenkool <segher@kernel.crashing.org>
Kewen Lin <linkw@linux.ibm.com>
gcc/ChangeLog
PR target/105991
* config/rs6000/rs6000.md (rotl<mode>3_insert_3): Check that
exact_log2 doesn't return -1 (or zero).
(plus_xor): New code iterator.
(*rotl<mode>3_insert_3_<code>): New define_insn_and_split.
gcc/testsuite/ChangeLog
PR target/105991
* gcc.target/powerpc/pr105991.c: New test case.
Integer division by zero is undefined behavior anyway, and there are
already many platforms where neither the GCC port and the hardware do
anything to trap on division by zero. So any portable program shall not
rely on SIGFPE on division by zero, in both theory and practice. As the
result, there is no real reason to cost two additional instructions just
for the trap on division by zero with a new ISA.
One remaining reason to trap on division by zero may be debugging,
especially while -fsanitize=integer-divide-by-zero is not implemented
for LoongArch yet. To make debugging easier, keep -mcheck-zero-division
as the default for -O0 and -Og, but use -mno-check-zero-division as the
default for all other optimization levels.
Backport this behavior change for 12.2, so 12.1 will be the only release
with a different default.
Co-authored-by: Lulu Cheng <chenglulu@loongson.cn>
gcc/ChangeLog:
* config/loongarch/loongarch.cc (loongarch_check_zero_div_p):
New static function.
(loongarch_idiv_insns): Use loongarch_check_zero_div_p instead
of TARGET_CHECK_ZERO_DIV.
(loongarch_output_division): Likewise.
* common/config/loongarch/loongarch-common.cc
(TARGET_DEFAULT_TARGET_FLAGS): Remove unneeded hook.
* doc/invoke.texi: Update to match the new behavior.
gcc/testsuite/ChangeLog:
* gcc.c-torture/execute/20101011-1.c (dg-additional-options):
add -mcheck-zero-division for LoongArch targets.
(cherry picked from commit f150dc1bd1)
gcc/fortran/ChangeLog:
PR fortran/103137
PR fortran/103138
PR fortran/103693
PR fortran/105243
* decl.cc (gfc_match_data_decl): Reject CLASS entity declaration
when it is given the PARAMETER attribute.
gcc/testsuite/ChangeLog:
PR fortran/103137
PR fortran/103138
PR fortran/103693
PR fortran/105243
* gfortran.dg/class_58.f90: Fix test.
* gfortran.dg/class_73.f90: New test.
Co-authored-by: Steven G. Kargl <kargl@gcc.gnu.org>
(cherry picked from commit 4c233cabbe)
gcc/fortran/ChangeLog:
PR fortran/106121
* simplify.cc (gfc_simplify_extends_type_of): Do not attempt to
simplify when one of the arguments is a CLASS variable that was
not properly declared.
gcc/testsuite/ChangeLog:
PR fortran/106121
* gfortran.dg/extends_type_of_4.f90: New test.
Co-authored-by: Steven G. Kargl <kargl@gcc.gnu.org>
(cherry picked from commit b8f284d367)
When optimizing for size with -Oz, setting a register can be minimized by
pushing an immediate value to the stack and popping it to the destination.
Alas the one general register that shouldn't be updated via the stack is
the stack pointer itself, where "pop %esp" can't be represented in GCC's
RTL ("use of a register mentioned in pre_inc, pre_dec, post_inc or
post_dec is not permitted within the same instruction"). This patch
fixes PR target/106122 by explicitly checking for SP_REG in the
problematic peephole2.
2022-07-O3 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
PR target/106122
* config/i386/i386.md (peephole2): Avoid generating pop %esp
when optimizing for size.
gcc/testsuite/ChangeLog
PR target/106122
* gcc.target/i386/pr106122.c: New test case.
On musl <pthread.h> uses calloc() (via <sched.h>). jit/ includes
it directly and exposes use of poisoned calloc():
/build/build/./prev-gcc/xg++ ... ../../gcc-13-20220626/gcc/jit/jit-playback.cc
make[3]: *** [Makefile:1143: jit/libgccjit.o] Error 1
make[3]: *** Waiting for unfinished jobs....
In file included from /<<NIX>>/musl-1.2.3-dev/include/pthread.h:30,
from ../../gcc-13-20220626/gcc/jit/jit-playback.cc:44:
/<<NIX>>/musl-1.2.3-dev/include/sched.h:84:7: error: attempt to use poisoned "calloc"
84 | void *calloc(size_t, size_t);
| ^
/<<NIX>>/musl-1.2.3-dev/include/sched.h:124:36: error: attempt to use poisoned "calloc"
124 | #define CPU_ALLOC(n) ((cpu_set_t *)calloc(1,CPU_ALLOC_SIZE(n)))
| ^
The change moves <pthread.h> inclusion to "system.h" under new
INCLUDE_PTHREAD_H guard and uses this mechanism in libgccjit.
gcc/
PR c++/106102
* system.h: Introduce INCLUDE_PTHREAD_H macros to include <pthread.h>.
gcc/jit/
PR c++/106102
* jit-playback.cc: Include <pthread.h> via "system.h" to avoid calloc()
poisoning.
* jit-recording.cc: Ditto.
* libgccjit.cc: Ditto.
(cherry picked from commit 49d508065b)
On musl <pthread.h> uses calloc() (via <sched.h>). <memory> includes
it indirectly and exposes use of poisoned calloc() when module code
is built:
/build/build/./prev-gcc/xg++ ... ../../gcc-13-20220626/gcc/cp/mapper-resolver.cc
In file included from /<<NIX>>/musl-1.2.3-dev/include/pthread.h:30,
from /build/build/prev-x86_64-unknown-linux-musl/libstdc++-v3/include/x86_64-unknown-linux-musl/bits/gthr-default.h:35,
....
from /build/build/prev-x86_64-unknown-linux-musl/libstdc++-v3/include/memory:77,
from ../../gcc-13-20220626/gcc/../libcody/cody.hh:24,
from ../../gcc-13-20220626/gcc/cp/../../c++tools/resolver.h:25,
from ../../gcc-13-20220626/gcc/cp/../../c++tools/resolver.cc:23,
from ../../gcc-13-20220626/gcc/cp/mapper-resolver.cc:32:
/<<NIX>>/musl-1.2.3-dev/include/sched.h:84:7: error: attempt to use poisoned "calloc"
84 | void *calloc(size_t, size_t);
| ^
/<<NIX>>/musl-1.2.3-dev/include/sched.h:124:36: error: attempt to use poisoned "calloc"
124 | #define CPU_ALLOC(n) ((cpu_set_t *)calloc(1,CPU_ALLOC_SIZE(n)))
| ^
gcc/cp/
PR c++/106102
* mapper-client.cc: Include <memory> via "system.h".
* mapper-resolver.cc: Ditto.
* module.cc: Ditto.
libcc1/
PR c++/106102
* libcc1plugin.cc: Include <memory> via "system.h".
* libcp1plugin.cc: Ditto.
(cherry picked from commit 3b21c21f3f)
Actually, for release branches let's just avoid the lookup for the lambdas
that are the problematic case and only make the bigger change on trunk.
PR c++/106024
gcc/cp/ChangeLog:
* parser.cc (cp_parser_lookup_name): Limit previous change to
lambdas.
gcc/
PR target/103722
* config/sh/sh.cc (sh_register_move_cost): Avoid cost "2" (which
is special) for various scenarios.
(cherry picked from commit ce1580252e)
Since the patch for PR103408, the template parameters for the lambda in this
test have level 1 instead of 2, and we were treating null template args as 1
level of arguments, so tsubst_template_parms decided it had nothing to do.
Fixed by distinguishing between <> and no args at all, which is what we have
in our "substitution" in a requires-expression.
PR c++/105541
gcc/cp/ChangeLog:
* cp-tree.h (TMPL_ARGS_DEPTH): 0 for null args.
* parser.cc (cp_parser_enclosed_template_argument_list):
Use 0-length TREE_VEC for <>.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/lambda-requires1.C: New test.
We were wrongly looking up the generic lambda op() in a dependent scope, and
then trying to look up its instantiation at substitution time, but lambdas
aren't instantiated, so we crashed. The fix is to not look into dependent
class scopes.
But this created trouble with wrongly trying to use a template from the
enclosing scope when we aren't actually looking at a template-argument-list,
in template/lookup18.C, so let's avoid that.
PR c++/106024
gcc/cp/ChangeLog:
* parser.cc (missing_template_diag): Factor out...
(cp_parser_id_expression): ...from here.
(cp_parser_lookup_name): Don't look in dependent object_type.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/lambda-generic10.C: New test.
gcc/fortran/ChangeLog:
PR fortran/105691
* simplify.cc (gfc_simplify_index): Replace old simplification
code by the equivalent of the runtime library implementation. Use
HOST_WIDE_INT instead of int for string index, length variables.
gcc/testsuite/ChangeLog:
PR fortran/105691
* gfortran.dg/index_6.f90: New test.
(cherry picked from commit ff35dbc020)
gcc/fortran/ChangeLog:
PR fortran/105813
* check.cc (gfc_check_unpack): Try to simplify MASK argument to
UNPACK so that checking of the VECTOR argument can work when MASK
is a variable.
gcc/testsuite/ChangeLog:
PR fortran/105813
* gfortran.dg/unpack_vector_1.f90: New test.
(cherry picked from commit f21f17f95c)