The tests are disabled for historical reasons only.
gcc/testsuite/ChangeLog:
* lib/target-supports.exp (check_effective_target_global_constructor):
Remove amdgcn.
The test FAILs on 32-bit targets, because when unsigned long
is 32-bit, (unsigned long) -1 isn't 0xffffffffffffffff.
The options to fix this would be either using -1UL, or switching
to unsigned long long and using -1ULL. I chose the latter because
the test then FAILs in r13-1242 even on 32-bit targets.
And while at it, some deobfuscation and formatting tweaks.
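A minimal sketch of the width issue, assuming hypothetical helper names
that do not appear in pr106070.c; it only illustrates why the 64-bit
constant cannot be compared against an unsigned long portably:

```cpp
#include <cassert>
#include <climits>

// Hypothetical helpers for illustration; not taken from pr106070.c.

// All-ones value of unsigned long: 0xffffffff when long is 32-bit,
// 0xffffffffffffffff when it is 64-bit.
constexpr unsigned long all_ones_ul() { return -1UL; }

// unsigned long long is at least 64 bits wide, so its low 64 bits
// are always all ones.
constexpr unsigned long long all_ones_ull() { return -1ULL; }

// True when unsigned long is narrower than 64 bits (e.g. ILP32 targets).
constexpr bool ul_is_narrow()
{
  return sizeof(unsigned long) * CHAR_BIT < 64;
}

static_assert(!ul_is_narrow() || all_ones_ul() != 0xffffffffffffffffULL,
              "a narrow unsigned long cannot hold the 64-bit constant");
```

With -1ULL the comparison against 0xffffffffffffffff holds regardless of
the width of long.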
2022-06-27 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/106070
* gcc.dg/torture/pr106070.c: Use unsigned long long instead of
unsigned long and -1ULL instead of 0xffffffffffffffff, deobfuscate
and improve formatting.
In case we need to supplement the C standard library with additional
definitions for float and long double, the declarations expected to be
in the C headers may not be there. Rely on the cmath overloads
instead.
for libstdc++-v3/ChangeLog
* testsuite/20_util/to_chars/long_double.cc: Use cmath
long double overloads for nexttoward and ldexp.
simd_math.h assumes declarations for many C99 functions to be present,
that libstdc++ doesn't add to target systems that don't have them in
the C library.
Add the C99 math requirement to tests for simd features, so that they
don't fail because of limitations of the target C library.
for libstdc++-v3/ChangeLog
* testsuite/experimental/simd/standard_abi_usable.cc: Require
cmath support.
* testsuite/experimental/simd/standard_abi_usable_2.cc:
Likewise.
The template version of complex::proj returns its argument without
testing for infinities, and that's all we have when neither C99
complex nor C99 math functions are available, and it seems too hard to
do better without isinf and copysign.
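For reference, a minimal sketch of what a correct proj has to do, assuming
a hypothetical helper name (proj_sketch); this is not the libstdc++
implementation, it only shows where isinf and copysign come in:

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <limits>

// Hypothetical helper, not the libstdc++ implementation: projects z onto
// the Riemann sphere, which needs isinf and copysign from the C99 math set.
inline std::complex<double> proj_sketch(const std::complex<double>& z)
{
  if (std::isinf(z.real()) || std::isinf(z.imag()))
    // Any infinity maps to (+inf, +/-0), keeping the sign of the
    // imaginary part.
    return std::complex<double>(std::numeric_limits<double>::infinity(),
                                std::copysign(0.0, z.imag()));
  return z;  // finite values are returned unchanged
}
```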
I suppose just calling them and expecting users will supply
specializations as needed has been ruled out, and so has refraining
from defining it when it can't be implemented correctly.
It's pointless to run the proj.cc test under these circumstances, so
arrange for it to be skipped. This is done in an unusual way, because
I tried to introduce dg-require tests for ccomplex-or-cmath and found
their results to be misleading due to variations across -std=* versions.
for libstdc++-v3/ChangeLog
* testsuite/26_numerics/complex/proj.cc: Skip test in the
circumstances in which the implementation of proj is known to
be broken.
Systems without preemptive multi-threading require sched_yield calls
to be placed at points in which a context switch might be needed to
enable the test to complete.
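A minimal sketch of the kind of wait loop this is about, assuming a
hypothetical helper name (wait_until_set) that is not in the listed tests:

```cpp
#include <atomic>
#include <cassert>
#include <sched.h>   // POSIX sched_yield

// Hypothetical spin-wait helper, not code from the listed tests.
inline void wait_until_set(const std::atomic<bool>& flag)
{
  while (!flag.load())
    sched_yield();   // give the thread that will set the flag a chance to run
}
```

Without the yield, a cooperative scheduler might never switch to the
thread that sets the flag, and the loop would spin forever.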
for gcc/testsuite/ChangeLog
* gcc.dg/atomic/c11-atomic-exec-4.c: Call sched_yield.
* gcc.dg/atomic/c11-atomic-exec-5.c: Likewise.
* gcc.dg/atomic/pr80640-2.c: Likewise.
* gcc.dg/atomic/pr80640.c: Likewise.
* gcc.dg/atomic/pr81316.c: Likewise.
* gcc.dg/di-sync-multithread.c: Likewise.
In the recent patch to check for openat, I missed an occurrence of
dirfd in std::filesystem.
for libstdc++-v3/ChangeLog
* src/c++17/fs_dir.cc (dir_and_pathname): Use dirfd if
_GLIBCXX_HAVE_OPENAT.
In the recent patch that introduced NO_SYMLINKS, I missed one of the
testcases that created symlinks.
for libstdc++-v3/ChangeLog
* testsuite/27_io/filesystem/iterators/recursive_directory_iterator.cc
(test06): Don't create symlinks when NO_SYMLINKS is defined.
Some net/timer/waitable tests fail on rtems because poll() is not
available.
The above, as well as net/internet/resolver/ops tests and
net/timer/waitable/cons.cc, will fail early at runtime unless mkfifo
is enabled in the RTEMS configuration, because the io_context ctor
throws when pipe() fails.
However, even after enabling pipes and adjusting the net_ts link command
to use --gc-sections for -lbsd as recommended, both
net/internet/resolver/ops tests still fail at runtime.
for libstdc++-v3/ChangeLog
* testsuite/lib/dg-options.exp (add_options_for_net_ts):
Add -Wl,--gc-sections for RTEMS targets.
* testsuite/experimental/net/timer/waitable/dest.cc: Link-time
xfail on RTEMS.
* testsuite/experimental/net/timer/waitable/ops.cc: Likewise.
* testsuite/experimental/net/internet/resolver/ops/lookup.cc:
Execution-time xfail on RTEMS.
* testsuite/experimental/net/internet/resolver/ops/reverse.cc:
Likewise.
This nonsense is no longer required, now that the minimum supported
assembler version is LLVM 13.0.1.
gcc/ChangeLog:
* config/gcn/gcn.md (*movbi): Remove assembler bug workarounds.
(jump): Likewise.
(movdi_symbol_save_scc): Likewise.
We have noticed that, on RTEMS, a small number of testcases are
failing because two calls to this method return the same filename.
This happens for instance in 27_io/filesystem/operations/copy_file.cc
where it does:
auto from = __gnu_test::nonexistent_path();
auto to = __gnu_test::nonexistent_path();
We tracked this issue down to the fact that the implementation of
mkstemp on that system appears to use a very predictable algorithm
for choosing the name of the temporary file, where the same filenames
appear to be tried in the same order, regardless of past calls.
So, as long as the file gets deleted after a call to mkstemp (something
we do here in our nonexistent_path method), the next call to mkstemp
ends up returning the same filename, causing the collision we see above.
This commit enhances the __gnu_test::nonexistent_path method to
introduce in the filename being returned a counter which gets
incremented at every call of this method.
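The idea can be sketched as follows, assuming hypothetical names
(predictable_temp_name, unique_name_sketch) that are not in
testsuite_fs.h; the real change is in __gnu_test::nonexistent_path:

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a mkstemp-like generator that, as described
// above, hands back the same name whenever the previous file was deleted.
inline std::string predictable_temp_name() { return "tmpAAAAAA"; }

// Append a counter that is bumped on every call, so two consecutive
// results differ even when the base name repeats.
inline std::string unique_name_sketch()
{
  static unsigned counter = 0;
  return predictable_temp_name() + "-" + std::to_string(counter++);
}
```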
Co-authored-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* testsuite/util/testsuite_fs.h (__gnu_test::nonexistent_path):
Always include a counter in the filename returned.
This is not suitable to backport, as it affects the ABI of std::variant
and so isn't appropriate for a release branch.
libstdc++-v3/ChangeLog:
* include/bits/move_only_function.h (_Never_valueless_alt):
Define partial specialization for std::move_only_function.
libstdc++-v3/ChangeLog:
* include/std/variant (_Build_FUN::_S_fun): Define fallback case
as deleted.
(__accepted_index, _Extra_visit_slot_needed): Replace class
templates with variable templates.
This redefines std::is_clock in terms of std::is_clock_v, instead of the
other way around. This avoids instantiating the class template for code
that only uses the variable template.
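The pattern can be sketched with a hypothetical trait (has_rep), used only
for illustration; it is not the definition of is_clock in chrono.h and
assumes C++17:

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical trait has_rep, used only to show the pattern; this is not
// the libstdc++ definition of is_clock.

// Primary variable template: false unless a specialization matches.
template<typename T, typename = void>
  inline constexpr bool has_rep_v = false;

// Partial specialization for the true case.
template<typename T>
  inline constexpr bool has_rep_v<T, std::void_t<typename T::rep>> = true;

// The class template is derived from the variable template, so code that
// only uses has_rep_v never instantiates has_rep.
template<typename T>
  struct has_rep : std::bool_constant<has_rep_v<T>> { };

// Sample type with a nested rep, for trying the trait out.
struct with_rep { using rep = int; };
```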
libstdc++-v3/ChangeLog:
* include/bits/chrono.h (is_clock_v): Define to false.
(is_clock_v<T>): Define partial specialization for true cases.
(is_clock): Define in terms of is_clock_v.
When building gdbserver with -fsanitize=thread (added to CFLAGS/CXXFLAGS) we
run into:
...
ld: ../libiberty/libiberty.a(safe-ctype.o): warning: relocation against \
`__tsan_init' in read-only section `.text'
ld: ../libiberty/libiberty.a(safe-ctype.o): relocation R_X86_64_PC32 \
against symbol `__tsan_init' can not be used when making a shared object; \
recompile with -fPIC
ld: final link failed: bad value
collect2: error: ld returned 1 exit status
make[1]: *** [libinproctrace.so] Error 1
...
which looks similar to what is described in commit 78e49486944 ("[gdb/build]
Fix gdbserver build with -fsanitize=address").
The gdbserver component builds a shared library libinproctrace.so, which uses
libiberty and therefore requires the pic variant. The gdbserver Makefile is
set up to use this variant, if available, but it's not there.
Fix this by listing gdbserver in the toplevel configure alongside libcc1, as a
component that needs the libiberty pic variant, setting:
...
extra_host_libiberty_configure_flags=--enable-shared
...
Tested on x86_64-linux.
ChangeLog:
2022-06-27 Tom de Vries <tdevries@suse.de>
* configure.ac: Build libiberty pic variant for gdbserver.
* configure: Regenerate.
This patch is a follow-up improvement to my recent patch for
PR rtl-optimization/7061. That patch added the test case
gcc.target/i386/pr7061-2.c:
float im(float _Complex a) { return __imag__ a; }
For which GCC on x86_64 currently generates:
movq %xmm0, %rax
shrq $32, %rax
movd %eax, %xmm0
ret
but with this patch we now generate (the same as LLVM):
shufps $85, %xmm0, %xmm0
ret
This is achieved by providing a define_insn_and_split that allows
truncated lshiftrt:DI by 32 to be performed on either SSE or general
regs, where if the register allocator prefers to use SSE, we split
to a shufps_v4si, or if not, we use a regular shrq.
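In source terms, the operation being matched is roughly the following,
assuming a hypothetical helper name (high_part) and assuming the imaginary
part sits in the upper 32 bits of the 64-bit value, as the shrq by 32 above
suggests:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: the upper 32 bits of a 64-bit value, i.e. a logical
// right shift by 32 followed by truncation to 32 bits.
constexpr std::uint32_t high_part(std::uint64_t x)
{
  return static_cast<std::uint32_t>(x >> 32);
}
```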
2022-06-27 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
PR rtl-optimization/7061
* config/i386/i386.md (*highpartdisi2): New define_insn_and_split.
gcc/testsuite/ChangeLog
PR rtl-optimization/7061
* gcc.target/i386/pr7061-2.c: Update to look for shufps.
This patch implements the missed optimization described in PR 94026,
where the shift can be eliminated from the sequence of a shift,
followed by a bit-wise AND, followed by an equality/inequality test.
Specifically, it transforms
((X << C1) & C2) cmp C3 into (X & (C2 >> C1)) cmp (C3 >> C1)
and likewise ((X >> C1) & C2) cmp C3 into (X & (C2 << C1)) cmp (C3 << C1)
where cmp is == or !=, and C1, C2 and C3 are integer constants.
The example in the subject line is taken from the hot function
self_atari from the Go program Leela (in SPEC CPU 2017).
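A small worked instance of the left-shift form, with illustrative constants
that are not taken from the PR (C1 = 2, C2 = 0xf0, C3 = 0x40) and
hypothetical function names:

```cpp
#include <cassert>
#include <cstdint>

// Illustrative constants (not from the PR): C1 = 2, C2 = 0xf0, C3 = 0x40.
// The low C1 bits of C2 and C3 are zero, so shifting them right loses
// nothing.

// Original form: shift, mask, compare.
constexpr bool with_shift(std::uint32_t x)
{
  return ((x << 2) & 0xf0u) == 0x40u;
}

// Rewritten form: (X & (C2 >> C1)) == (C3 >> C1), no shift of X needed.
constexpr bool without_shift(std::uint32_t x)
{
  return (x & 0x3cu) == 0x10u;
}

// Compare the two forms over a small range of inputs.
inline bool forms_agree()
{
  for (std::uint32_t x = 0; x < 4096; ++x)
    if (with_shift(x) != without_shift(x))
      return false;
  return true;
}
```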
2022-06-27 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
PR tree-optimization/94026
* match.pd (((X << C1) & C2) eq/ne C3): New simplification.
(((X >> C1) & C2) eq/ne C3): Likewise.
gcc/testsuite/ChangeLog
PR tree-optimization/94026
* gcc.dg/pr94026.c: New test case.
Such constants are often subject to the constant synthesis:
int test(int a) {
return a - 31999;
}
test:
movi a3, 1
addmi a3, a3, -0x7d00
add a2, a2, a3
ret
This patch optimizes such case as follows:
test:
addi a2, a2, 1
addmi a2, a2, -0x7d00
ret
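The arithmetic behind the shorter sequence, as a sketch with hypothetical
function names (0x7d00 is 32000, so adding 1 and then -0x7d00 subtracts
31999):

```cpp
#include <cassert>

static_assert(0x7d00 == 32000, "the addmi immediate is 32000");

// What the source asks for.
constexpr int direct(int a) { return a - 31999; }

// What the two-instruction sequence computes: addi adds 1, then addmi
// adds -0x7d00, both in place on the same register.
constexpr int folded(int a) { return (a + 1) + (-0x7d00); }
```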
gcc/ChangeLog:
* config/xtensa/xtensa.md:
Suppress unnecessary emitting nop insn in the split patterns for
integer/FP constant synthesis, and add new peephole2 pattern that
folds such synthesized additions.
gcc/fortran/ChangeLog:
PR fortran/105691
* simplify.cc (gfc_simplify_index): Replace old simplification
code by the equivalent of the runtime library implementation. Use
HOST_WIDE_INT instead of int for string index, length variables.
gcc/testsuite/ChangeLog:
PR fortran/105691
* gfortran.dg/index_6.f90: New test.
These tests validate fp conversions with various rounding modes which
would not work on soft-float ABIs.
On -march=rv64imac/-mabi=lp64 this reduces 5 unique failures (overall 35
due to multi flag combination builds).
gcc/testsuite/ChangeLog:
* gcc.dg/torture/fp-double-convert-float-1.c: Add
dg-require-effective-target hard_float.
* gcc.dg/torture/fp-int-convert-timode-3.c: Ditto.
* gcc.dg/torture/fp-int-convert-timode-4.c: Ditto.
* gcc.dg/torture/fp-uint64-convert-double-1.c: Ditto.
* gcc.dg/torture/fp-uint64-convert-double-2.c: Ditto.
Add
AC_CONFIG_MACRO_DIRS([../config])
So that just running:
$ autoreconf -vf
... does the right thing (no need to specify -I ../config).
Note: I don't have access to the gcc repo, so if this patch is approved,
can somebody push it there on my behalf? I can push it to binutils-gdb.
libiberty/ChangeLog:
* configure.ac: Add AC_CONFIG_MACRO_DIRS call.
* configure: Re-generate.
The procedure detailed in contrib/unicode/README was followed with nothing
notable coming up. The glibc scripts did not require any update, so the
only change was retrieving new versions of the Unicode data files and
rerunning gen_wcwidth.py.
contrib/ChangeLog:
* unicode/EastAsianWidth.txt: Update to Unicode 14.0.0.
* unicode/PropList.txt: Likewise.
* unicode/README: Likewise.
* unicode/UnicodeData.txt: Likewise.
libcpp/ChangeLog:
* generated_cpp_wcwidth.h: Generated from updated Unicode data files.
If the target packs structures by default, the bitfield offset which the
test validates must be adjusted to not include padding.
gcc/testsuite/ChangeLog:
* gcc.dg/debug/btf/btf-bitfields-1.c: Adjust the checked offsets
for targets which pack structures by default.
Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
As per the explanation in the test, and in the DOM conversion to
ranger patch, this is a known regression. I had mentioned I would
XFAIL this test, but forgot to do so. There is an analysis in the
test itself as to what is going on.
Tested on x86-64 Linux.
gcc/testsuite/ChangeLog:
* g++.dg/warn/Wstringop-overflow-4.C: XFAIL a test.
We do, of course, mean $host not $target in this case. Corrected thus.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
ChangeLog:
* configure: Regenerate.
* configure.ac: Correct use of $host.
[Jeff, this is the same patch I sent you last week with minor tweaks
to the commit message.]
[Despite the verbosity of the message, this is actually a pretty
straightforward patch. It should've gone in last cycle, but there
was a nagging regression I couldn't get to until after stage1
had closed.]
There are 3 uses of EVRP in DOM that must be converted.
Unfortunately, they need to be converted in one go, so further
splitting of this patch would be problematic.
There's nothing here earth shattering. It's all pretty obvious in
retrospect, but I've added a short description of each use to aid in
reviewing:
* Convert evrp use in cprop to ranger.
This is easy, as cprop in DOM was converted to the ranger API last
cycle, so this is just a matter of using a ranger instead of an
evrp_range_analyzer.
* Convert evrp use in threader to ranger.
The idea here is to use the hybrid approach we used for the initial
VRP threader conversion last cycle. The DOM threader will continue
using the forward threader infrastructure while continuing to query
DOM data structures, and only if the conditional does not resolve,
using the ranger. This gives us the best of both worlds, and is a
proven approach.
Furthermore, as frange and prange come live in the next cycle, we
can move away from the forward threader altogether, and just add
another backward threader. This will not only remove the last use
of the forward threader, but will allow us to remove at least 1 or 2
threader instances.
* Convert conditional folding to use the method used by the ranger and
evrp. Previously DOM was calling into the guts of
simplify_using_ranges::vrp_visit_cond_stmt. The blessed way now is
using fold_cond() which rewrites the conditional and edges
automatically.
When legacy is removed, simplify_using_ranges will be further
cleaned up, and there will only be one entry point into simplifying
a statement.
* DOM was setting global ranges determined from unreachable edges as a
side-effect of using the evrp engine. We must handle these cases
before nuking evrp, and DOM seems like a good fit. I've just moved
the snippet to DOM, but it could live anywhere else we do a DOM
walk.
For the record, this is the case *vrp handled:
<bb C>:
...
if (c_5(D) != 5)
goto <bb N>;
else
goto <bb M>;
<bb N>:
__builtin_unreachable ();
<bb M>:
If M dominates all uses of c_5, we can set the global range of c_5
to [5,5].
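For illustration, source along these lines would be expected to produce
that shape; the function name is hypothetical, and whether the fold
actually happens depends on the optimization level:

```cpp
#include <cassert>

// Hypothetical function; a sketch of source that lowers to the shape above.
inline int use_c(int c)
{
  if (c != 5)
    __builtin_unreachable();  // GCC/Clang builtin: c can only be 5 past here
  return c * 2;               // with a global range of [5,5] this can fold to 10
}
```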
I have tested on x86-64, ppc64le, and aarch64 Linux.
I also ran threading benchmarks as well as performance benchmarks.
DOM threads 1.56% more paths which ultimately yields a minuscule total
increase of 0.03%.
The conversion to ranger brings a 7.87% performance drop in DOM, which
is a wash in overall compilation. This is in line with other
replacements of legacy evrp with ranger. We handle a lot more cases.
It's not free.
There is a regression on Wstringop-overflow-4.C which I'm planning
on XFAILing. It's another variant of the usual middle-end false
positives: having no ranges produces no warnings, but slightly refined
ranges, or worse-- isolating specific problematic cases in the
threader causes flare-ups.
As an aside, as Richi has suggested, I think we should discuss
restricting the threader's ability to thread highly unlikely paths.
These cause no end of pain for middle-end warnings. However,
I don't know if this would conflict with path isolation for
things like null dereferencing. ISTR you were interested in this.
BTW, I think the Wstringop-overflow-4.C test is problematic and I've
attached my analysis. Basically the regression is caused by a bad
interaction with the rounding/alignment that placement new has inlined
into the IL. This happens for int16_r[] which the test is testing.
Ranger can glean some range info, which causes DOM threading to
isolate a path which causes a warning.
OK for trunk?
gcc/ChangeLog:
* tree-ssa-dom.cc (dom_jt_state): Pass ranger to constructor
instead of evrp.
(dom_jt_state::push): Remove m_evrp.
(dom_jt_state::pop): Same.
(dom_jt_state::record_ranges_from_stmt): Remove.
(dom_jt_state::register_equiv): Remove updating of evrp ranges.
(class dom_jt_simplifier): Pass ranger to constructor.
Inherit from hybrid_jt_simplifier.
(dom_jt_simplifier::simplify): Convert to ranger.
(pass_dominator::execute): Same.
(all_uses_feed_or_dominated_by_stmt): New.
(dom_opt_dom_walker::set_global_ranges_from_unreachable_edges): New.
(dom_opt_dom_walker::before_dom_children): Call
set_global_ranges_from_unreachable_edges.
Do not call record_ranges_from_stmt.
(dom_opt_dom_walker::after_dom_children): Remove evrp use.
(cprop_operand): Use int_range<> instead of value_range.
(dom_opt_dom_walker::fold_cond): New.
(dom_opt_dom_walker::optimize_stmt): Pass ranger to
cprop_into_stmt.
Use fold_cond() instead of vrp_visit_cond_stmt().
* tree-ssa-threadedge.cc (jt_state::register_equivs_stmt): Do not
pass state to simplifier.
* vr-values.h (class vr_values): Make fold_cond public.
gcc/testsuite/ChangeLog:
* gcc.dg/sancov/cmp0.c: Adjust for conversion to ranger.
* gcc.dg/tree-ssa/ssa-dom-branch-1.c: Same.
* gcc.dg/tree-ssa/ssa-dom-thread-7.c: Same.
* gcc.dg/vect/bb-slp-pr81635-2.c: Same.
* gcc.dg/vect/bb-slp-pr81635-4.c: Same.
* g++.dg/warn/Wstringop-overflow-4.C: Likewise.
* gcc.target/mips/data-sym-multi-pool.c: Likewise.
* gcc.target/mips/mips.exp: Likewise.
When amending the allowed alignment size to accommodate the larger values
permitted by newer tools, we retained the object file limit of 2^15 for
Darwin versions <= 10, since that is what the native tools expect there.
This triggers a different diagnostic path with a distinct error message,
which is checked in the revised test here.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/testsuite/ChangeLog:
* gcc.dg/darwin-comm-1.c: Check for the correct error message for
Darwin <= 10.
This middle-end patch proposes the "hard register constant propagation"
pass be performed up to three times on each basic block (up from the
current two times) if the second pass successfully made changes.
The motivation for three passes is to handle the "swap idiom" (i.e.
"t = x; x = y; y = t;" sequences) that get generated by register allocation
(reload).
Consider the x86_64 test case for __int128 addition recently discussed
on gcc-patches. With that proposed patch, the input to the cprop_hardreg
pass looks like:
movq %rdi, %r8
movq %rsi, %rdi
movq %r8, %rsi
movq %rdx, %rax
movq %rcx, %rdx
addq %rsi, %rax
adcq %rdi, %rdx
ret
where the first three instructions effectively swap %rsi and %rdi.
On the first pass of cprop_hardreg, we notice that the third insn,
%rsi := %r8, is redundant and can be eliminated/propagated to produce:
movq %rdi, %r8
movq %rsi, %rdi
movq %rdx, %rax
movq %rcx, %rdx
addq %r8, %rax
adcq %rdi, %rdx
ret
Because a successful propagation was found, cprop_hardreg then runs
a second pass/sweep on affected basic blocks (using a worklist), and
on this second pass notices that the second instruction, %rdi := %rsi,
may now be propagated (%rsi was killed before the first transform),
and after a second pass, we now end up with:
movq %rdi, %r8
movq %rdx, %rax
movq %rcx, %rdx
addq %r8, %rax
adcq %rsi, %rdx
ret
which is the current behaviour on mainline. However, a third and final
pass would now notice that the first insn, "%r8 := %rdi" is also now
eliminable, and a third iteration would produce optimal code:
movq %rdx, %rax
movq %rcx, %rdx
addq %rdi, %rax
adcq %rsi, %rdx
ret
The patch below creates two worklists, and alternates between them on
successive passes, populating NEXT with the basic block id's of blocks
that were updated during the current pass over the CURR worklist.
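The alternation can be sketched as below. This is a simplified model with
hypothetical names (run_passes, pending), not the code in regcprop.cc:
each block just carries a count of propagations still available, and one
pass over a block performs at most one of them.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical model, not regcprop.cc: pending[bb] is the number of
// propagations still available in block bb.  Returns the number of
// passes actually run (capped at max_passes).
inline int run_passes(std::vector<int>& pending, int max_passes)
{
  std::vector<int> curr, next;
  for (int bb = 0; bb < static_cast<int>(pending.size()); ++bb)
    curr.push_back(bb);            // the first pass visits every block

  int passes = 0;
  while (!curr.empty() && passes < max_passes)
    {
      ++passes;
      next.clear();
      for (int bb : curr)
        if (pending[bb] > 0)
          {
            --pending[bb];         // a change was made in this block...
            next.push_back(bb);    // ...so revisit it on the next pass
          }
      std::swap(curr, next);       // alternate between the two worklists
    }
  return passes;
}
```

In this model a block holding the swap idiom starts with three pending
propagations, so a cap of two passes leaves one behind and a cap of three
clears it.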
It should be noted that this is a regression fix; GCC 4.8 generated
optimal code with two moves (whereas GCC 12 required 5 moves, up
from GCC 11's 4 moves).
2022-06-25 Roger Sayle <roger@nextmovesoftware.com>
Richard Biener <rguenther@suse.de>
gcc/ChangeLog
* regcprop.cc (pass_cprop_hardreg::execute): Perform a third
iteration over each basic block that was updated by the second
iteration.
fgrep has been deprecated in favor of grep -F for a long time, and the
next grep release (3.8 or 4.0) will print a warning if fgrep is used.
And, the fgrep command in exgettext is no longer useful after we
migrated from SVN to Git. Remove the fgrep command so we won't see the
warning.
gcc/po/ChangeLog:
* exgettext: Remove unneeded fgrep command.
This seems like a good warning to have in -Wall, as requested. But as
pointed out in PR20423, some users want a warning only when a derived
function doesn't override any base function. So let's put that lesser
version in -Wall (and -Woverloaded-virtual=1) while leaving the semantics
for the existing option the same.
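A sketch of the two situations, with hypothetical class names; which level
reports which case is inferred from the description above rather than
copied from the new tests:

```cpp
#include <cassert>

struct A {
  virtual ~A() = default;
  virtual int f() { return 1; }
  virtual int f(int) { return 2; }
};

// B::f(char) overrides neither A::f overload and hides both: going by the
// description above, this is what the lesser level (=1, in -Wall) reports.
struct B : A {
  int f(char) { return 3; }
};

// C::f(int) does override A::f(int) but still hides A::f(): going by the
// description above, only the existing full-strength option reports this.
struct C : A {
  int f(int) override { return 4; }
};
```

Adding a using-declaration such as "using A::f;" to the derived class
un-hides the base overloads in either case.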
PR c++/87729
PR c++/20423
gcc/c-family/ChangeLog:
* c.opt (Woverloaded-virtual): Add levels, include in -Wall.
gcc/ChangeLog:
* doc/invoke.texi: Document changes.
gcc/cp/ChangeLog:
* class.cc (warn_hidden): Handle -Woverloaded-virtual=1.
gcc/testsuite/ChangeLog:
* g++.dg/warn/Woverloaded-virt1.C: New test.
* g++.dg/warn/Woverloaded-virt2.C: New test.
This test spuriously fails on AVR with:
error: width of 'bitfield_c' exceeds its type
8-bit and 16-bit microcontrollers do not seem to be the target audience
for BTF file format. So the least intrusive fix is to simply skip the
test for them.
gcc/testsuite/ChangeLog:
* gcc.dg/debug/btf/btf-bitfields-1.c: Skip if int is less than
32 bits.
Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
gcc/fortran/ChangeLog:
PR fortran/105813
* check.cc (gfc_check_unpack): Try to simplify MASK argument to
UNPACK so that checking of the VECTOR argument can work when MASK
is a variable.
gcc/testsuite/ChangeLog:
PR fortran/105813
* gfortran.dg/unpack_vector_1.f90: New test.
The gcc.dg/builtin-object-size-20.c test case assumes that the target
inserts padding between structure members. Obviously it fails for
targets which pack structures by default.
Split the cases into two tests, so that the ones requiring structure
padding can be skipped for default_packed targets.
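The layout difference can be sketched as follows, with hypothetical type
names that are not taken from the test; the packed attribute is a GCC
extension used here to imitate a default_packed target:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical types for illustration; not taken from the test.
struct Padded { char c; int i; };   // usual ABIs insert padding after c
struct __attribute__((packed)) Packed { char c; int i; };  // no padding

// Bytes of padding between the two members in each layout.  On a
// default_packed target, Padded would be laid out like Packed.
constexpr std::size_t padded_gap = offsetof(Padded, i) - sizeof(char);
constexpr std::size_t packed_gap = offsetof(Packed, i) - sizeof(char);
```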
gcc/testsuite/ChangeLog:
* gcc.dg/builtin-object-size-20.c: Remove cases which
work on default_packed targets.
* gcc.dg/builtin-object-size-22.c: New test with the cases
removed above.
Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
Epiphany, PRU, ARC and NDS32 may predefine __big_endian__ and
__little_endian__ macros. This leads to spurious warnings like:
gcc.dg/sso/memcpy-1.c:7: warning: "__little_endian__" redefined
Fix by renaming the macros in the test.
gcc/testsuite/ChangeLog:
* gcc.dg/sso/memcpy-1.c (__big_endian__, __little_endian__):
Rename macros to avoid conflicts with predefined ones.
Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
Some embedded targets do not pass any argv arguments. When argc is
zero, this causes spurious failures for lto/pr101868_0.c. Fix by
following the strategy in r0-114701-g2c49569ecea56d. Use a volatile
variable instead of argc to inject a runtime value into the test.
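The technique can be sketched as below, with a hypothetical helper (pick)
that is not the contents of pr101868_0.c; only the name of the volatile
variable follows the ChangeLog entry:

```cpp
#include <cassert>

// Read at run time, so the compiler cannot treat it as a known constant.
volatile int zero = 0;

// Hypothetical helper: index with the volatile value rather than with
// something derived from argc, which may be 0 on some embedded targets.
inline int pick(const int *table)
{
  return table[zero];
}
```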
I validated the following:
- No changes in testresults for x86_64-pc-linux-gnu.
- The spurious failures are fixed for PRU target.
- lto/pr101868_0.c still fails on x86_64-pc-linux-gnu, if
the PR/101868 fix (r12-2254-gfedcf3c476aff7) is reverted.
PR tree-optimization/101868
gcc/testsuite/ChangeLog:
* gcc.dg/lto/pr101868_0.c (zero): New volatile variable.
(main): Use it instead of argc.
Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>