When compiling test-case gcc.dg/atomic/c11-atomic-exec-1.c, we run into
these ptxas errors:
...
line 100; error: Rounding modifier required for instruction 'cvt'
line 105; error: Rounding modifier required for instruction 'cvt'
...
The problem is that this move:
...
//(insn 13 11 14 2
// (set (reg:DF 28 [ _9 ])
// (subreg:DF (reg:TI 22 [ _1 ]) 0)) 9 {*movdf_insn}
// (nil))
cvt.f64.u64 %r28, %r22$0;
...
is emitted as cvt.f64.u64, while it should be a mov.b64 instead.
Fix this by handling this case in nvptx_output_mov_insn.
Tested on nvptx.
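The difference between the two instructions can be sketched in C++. This is only an analogy, not the nvptx back-end code: a subreg move keeps the same 64 bits, as mov.b64 does, while cvt.f64.u64 converts a numeric value. The helper names are hypothetical and an IEEE-754 64-bit double is assumed.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Analogous to mov.b64: copy the 64 bits unchanged into a double.
double move_bits(std::uint64_t bits) {
  static_assert(sizeof(double) == sizeof(std::uint64_t),
                "sketch assumes a 64-bit double");
  double d;
  std::memcpy(&d, &bits, sizeof d);
  return d;
}

// Analogous to cvt.f64.u64: convert the integer value to a double.
double convert_value(std::uint64_t value) {
  return static_cast<double>(value);
}
```

With these helpers, move_bits(0x3FF0000000000000) yields 1.0 because that is the bit pattern of 1.0, whereas convert_value of the same argument yields a very large number.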
gcc/ChangeLog:
PR target/97158
* config/nvptx/nvptx.c (nvptx_output_mov_insn): Handle move from
DF subreg to DF reg.
This patch replaces a sequence of dyn_cast to different gimple stmt
types in exploded_node::on_stmt with a switch on the gimple_code. This
makes clearer which kinds of stmt are currently treated as no-ops, as a
precursor to handling them properly.
No functional change intended.
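The shape of the change can be sketched as follows. The enum, struct and strings are hypothetical stand-ins rather than GCC's gimple API; the point is only that a switch names every statement kind, including the ones that are currently no-ops.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for gimple codes; not GCC's real enum.
enum class stmt_code { assign, call, ret, asm_stmt, nop };

struct stmt { stmt_code code; };

// A switch over the code lists each kind explicitly, so the cases that are
// treated as no-ops are visible instead of falling off a chain of casts.
std::string on_stmt(const stmt &s) {
  switch (s.code) {
    case stmt_code::assign:   return "handle assignment";
    case stmt_code::call:     return "handle call";
    case stmt_code::ret:      return "handle return";
    case stmt_code::asm_stmt: // not handled yet: treated as a no-op
    case stmt_code::nop:      return "no-op";
  }
  return "no-op";  // unreachable for valid codes; keeps compilers quiet
}
```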
gcc/analyzer/ChangeLog:
* engine.cc (exploded_node::on_stmt): Replace sequence of dyn_cast
with switch.
In the testcase below, the dependent specializations iter_reference_t<F>
and iter_reference_t<Out> share the same tree due to specialization
caching. So when find_template_parameters walks through the
requires-expression (as part of normalization), it sees and includes the
out-of-scope template parameter F in the list of template parameters
it found within the requires-expression (along with Out and N).
From a correctness perspective this is harmless since the parameter mapping
routines only care about the level and index of each parameter, so F is
no different from Out in that sense. And it's also harmless that two
parameters in the parameter mapping have the same level and index.
But having both Out and F in the parameter mapping means extra work for
hash_atomic_constraint, tsubst_parameter_mapping and get_mapped_args; and
it also means we print this irrelevant template parameter in the
testcase's diagnostics (via pp_cxx_parameter_mapping):
in requirements with ‘Out o’ [with N = (const int&)&a; F = const int*; Out = const int*]
This patch makes keep_template_parm return only in-scope template
parameters by looking into ctx_parms for the corresponding in-scope
one, through a new helper function corresponding_template_parameter.
(That we sometimes print irrelevant template parameters in diagnostics
is also the subject of PR99 and PR66968, so the above diagnostic issue
could likely be fixed in a more general way, but this targeted fix to
keep_template_parm is perhaps worthwhile on its own.)
gcc/cp/ChangeLog:
PR c++/95310
* pt.c (corresponding_template_parameter): Define.
(keep_template_parm): Use it to adjust the given template
parameter to the corresponding in-scope one from ctx_parms.
gcc/testsuite/ChangeLog:
PR c++/95310
* g++.dg/concepts/diagnostic15.C: New test.
The remaining use of xref_tag_from_type was also suspicious. It turns
out to be an error path. At parse time we diagnose that a class
definition cannot appear, but we swallow the definition. This code
was attempting to push it into the global scope (or find a conflict).
This seems needless; just return error_mark_node. This was a
simpler fix than going through the parser and figuring out how to get
it to put in error_mark_node at the right point.
gcc/cp/
* cp-tree.h (xref_tag_from_type): Don't declare.
* decl.c (xref_tag_from_type): Delete.
* pt.c (lookup_template_class_1): Erroneously located class
definitions just give error_mark, don't try and inject it into the
namespace.
These two builtin calls are already added during parsing before pointer
subtractions or comparisons. Normally they perform runtime verification
of whether the pointers point to the same object or different objects,
but during constant expression evaluation we don't really need those
builtins for anything.
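A minimal sketch of the kind of code affected, assuming a build with -fsanitize=address together with -fsanitize=pointer-subtract or -fsanitize=pointer-compare. The function and array names are illustrative and not taken from the new test.

```cpp
#include <cassert>
#include <cstddef>

// Pointer subtraction inside a function usable in constant expressions.
// With the sanitizer options above, the front end guards this operation
// with a checking builtin that constant evaluation has to skip.
constexpr std::ptrdiff_t ptr_distance(const int *first, const int *last) {
  return last - first;
}

// Pointer comparison, guarded the same way.
constexpr bool ptr_before(const int *a, const int *b) {
  return a < b;
}

constexpr int values[4] = {1, 2, 3, 4};
static_assert(ptr_distance(values, values + 4) == 4, "constant-evaluated");
static_assert(ptr_before(values, values + 1), "constant-evaluated");
```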
2020-09-22 Jakub Jelinek <jakub@redhat.com>
PR c++/97145
* constexpr.c (cxx_eval_builtin_function_call): Return void_node for
calls to __sanitize_ptr_{sub,cmp} builtins.
* g++.dg/asan/pr97145.C: New test.
libstdc++-v3/ChangeLog:
PR libstdc++/97167
* src/c++17/fs_path.cc (path::_Parser::root_path()): Check
for empty string before inspecting the first character.
* testsuite/27_io/filesystem/path/append/source.cc: Append
empty string_view to path.
The 'mod' and 'div' operators in eBPF are unsigned, with no signed
counterpart. xBPF adds two new ALU operations, sdiv and smod, for
signed division and modulus, respectively. Update bpf.md with
'define_insn' blocks for signed div and mod to use them when targeting
xBPF, and add new tests to ensure they are used appropriately.
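Why signed variants are needed can be seen from a small host-side sketch. The helper names are hypothetical; they only model the arithmetic and are not BPF code.

```cpp
#include <cassert>
#include <cstdint>

// Signed operations, as signed div/mod instructions would compute them.
std::int32_t signed_div(std::int32_t a, std::int32_t b) { return a / b; }
std::int32_t signed_mod(std::int32_t a, std::int32_t b) { return a % b; }

// The same bit patterns treated as unsigned, as unsigned div/mod would.
std::uint32_t unsigned_div(std::uint32_t a, std::uint32_t b) { return a / b; }
std::uint32_t unsigned_mod(std::uint32_t a, std::uint32_t b) { return a % b; }
```

For -7 and 2 the signed helpers give -3 and -1, while the unsigned helpers applied to the same bits give 2147483644 and 1.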
2020-09-17 David Faust <david.faust@oracle.com>
gcc/
* config/bpf/bpf.md: Add defines for signed div and mod operators.
gcc/testsuite/
* gcc.target/bpf/diag-sdiv.c: New test.
* gcc.target/bpf/diag-smod.c: New test.
* gcc.target/bpf/xbpf-sdiv-1.c: New test.
* gcc.target/bpf/xbpf-smod-1.c: New test.
In working on fixing hiddenness, I discovered some suspicious code in
template instantiation. I suspect it dates from when we didn't do the
hidden friend injection thing at all. The xreftag finds the same
class, but makes it visible to name lookup. Which is wrong.
Hurrah, fixing a bug by deleting code!
gcc/cp/
* pt.c (instantiate_class_template_1): Do not repush and unhide
injected friend.
gcc/testsuite/
* g++.old-deja/g++.pt/friend34.C: Check injected friend is still
invisible.
This introduces two new headers:
<bits/ranges_base.h> defines the minimal components needed
for using C++20 ranges (customization point objects such as
std::ranges::begin, concepts such as std::ranges::range, etc.)
<bits/ranges_util.h> includes <bits/ranges_base.h> and additionally
defines subrange, which is needed by <bits/ranges_algo.h>.
Most of the content of <bits/ranges_base.h> was previously defined in
<bits/range_access.h>, but a few pieces were only defined in <ranges>.
This meant the entire <ranges> header was needed in <algorithm> and
<memory>, even though they don't use all the range adaptors.
By moving the ranges components out of <bits/range_access.h> that file
is left defining just the contents of [iterator.range] i.e. std::begin,
std::end, std::size etc. and not C++20 ranges components.
For consistency with other C++20 ranges headers, <bits/range_cmp.h> is
renamed to <bits/ranges_cmp.h>.
libstdc++-v3/ChangeLog:
* include/Makefile.am: Add new headers and adjust for renamed
header.
* include/Makefile.in: Regenerate.
* include/bits/iterator_concepts.h: Adjust for renamed header.
* include/bits/range_access.h (ranges::*): Move to new
<bits/ranges_base.h> header.
* include/bits/ranges_algobase.h: Include new <bits/ranges_base.h>
header instead of <ranges>.
* include/bits/ranges_algo.h: Include new <bits/ranges_util.h>
header.
* include/bits/range_cmp.h: Moved to...
* include/bits/ranges_cmp.h: ...here.
* include/bits/ranges_base.h: New header.
* include/bits/ranges_util.h: New header.
* include/experimental/string_view: Include new
<bits/ranges_base.h> header.
* include/std/functional: Adjust for renamed header.
* include/std/ranges (ranges::view_base, ranges::enable_view)
(ranges::dangling, ranges::borrowed_iterator_t): Move to new
<bits/ranges_base.h> header.
(ranges::view_interface, ranges::subrange)
(ranges::borrowed_subrange_t): Move to new <bits/ranges_util.h>
header.
* include/std/span: Include new <bits/ranges_base.h> header.
* include/std/string_view: Likewise.
* testsuite/24_iterators/back_insert_iterator/pr93884.cc: Add
missing <ranges> header.
* testsuite/24_iterators/front_insert_iterator/pr93884.cc:
Likewise.
This patch enables a peephole2 optimization which transforms a load of
constant zero into a temporary register, followed by a comparison of
that register against a floating-point register of interest, into a
single load and test instruction. However, the optimization is only
applied if both registers are dead afterwards and if we test for
(in)equality only. This is relaxed in case of fast math.
This is a follow up to PR88856.
gcc/ChangeLog:
* config/s390/s390.md ("*cmp<mode>_ccs_0", "*cmp<mode>_ccz_0",
"*cmp<mode>_ccs_0_fastmath"): Basically change "*cmp<mode>_ccs_0" into
"*cmp<mode>_ccz_0" and for fast math add "*cmp<mode>_ccs_0_fastmath".
gcc/testsuite/ChangeLog:
* gcc.target/s390/load-and-test-fp-1.c: Change test to include all
possible combinations of dead/live registers and comparisons (equality,
relational).
* gcc.target/s390/load-and-test-fp-2.c: Same as load-and-test-fp-1.c
but for fast math.
* gcc.target/s390/load-and-test-fp.h: New test included by
load-and-test-fp-{1,2}.c.
By running libgomp test-case libgomp.c/target-28.c with GOMP_NVPTX_PTXRW=w
(using a maintenance patch that adds support for this env var), we dump the
ptx in target-28.exe to file. By editing one ptx file to rename
gomp_nvptx_main to gomp_nvptx_main2 in both declaration and call, and
running with GOMP_NVPTX_PTXRW=r, we trigger a link error:
...
$ GOMP_NVPTX_PTXRW=r ./target-28.exe
libgomp: cuLinkComplete error: unknown error
...
The error is somewhat uninformative.
Fix this by dumping the error log returned by the failing cuda call, such
that we have instead:
...
$ GOMP_NVPTX_PTXRW=r ./target-28.exe
libgomp: Link error log error : \
Undefined reference to 'gomp_nvptx_main2' in ''
libgomp: cuLinkComplete error: unknown error
...
Built on x86_64 with nvptx accelerator, tested libgomp.
libgomp/ChangeLog:
* plugin/plugin-nvptx.c (link_ptx): Print elog if cuLinkComplete call
fails.
This patch implements some missing intrinsics that perform a CLS on unsigned SIMD types.
Bootstrapped and tested on aarch64-none-linux-gnu.
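For reference, a portable scalar model of what CLS computes on one 8-bit lane: the number of consecutive bits below the top bit that are equal to the top bit. The helper name is hypothetical and this is a sketch, not the arm_neon.h implementation.

```cpp
#include <cassert>
#include <cstdint>

// Count leading sign bits of one 8-bit lane. The unsigned input is read
// bit-for-bit, with bit 7 playing the role of the sign bit.
int cls_u8_lane(std::uint8_t x) {
  const int sign = (x >> 7) & 1;
  int count = 0;
  for (int bit = 6; bit >= 0; --bit) {
    if (((x >> bit) & 1) != sign)
      break;
    ++count;
  }
  return count;
}
```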
gcc/
PR target/71233
* config/aarch64/arm_neon.h (vcls_u8, vcls_u16, vcls_u32,
vclsq_u8, vclsq_u16, vclsq_u32): Define.
gcc/testsuite/
PR target/71233
* gcc.target/aarch64/simd/vcls_unsigned_1.c: New test.
This patch implements some missing vceq* intrinsics on poly types.
The behaviour is to produce the appropriate CMEQ instruction as for the unsigned types.
Bootstrapped and tested on aarch64-none-linux-gnu.
gcc/
PR target/71233
* config/aarch64/arm_neon.h (vceqq_p64, vceqz_p64, vceqzq_p64): Define.
gcc/testsuite/
PR target/71233
* gcc.target/aarch64/simd/vceq_poly_1.c: New test.
This implements the vadd[p]_p* intrinsics.
In terms of functionality they are aliases of veor operations on the relevant unsigned types.
Bootstrapped and tested on aarch64-none-linux-gnu.
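The equivalence with veor follows from polynomial addition over GF(2) having no carries. A scalar sketch for one 8-bit lane, using a hypothetical helper name:

```cpp
#include <cassert>
#include <cstdint>

// Adding two polynomials with coefficients in GF(2) adds each coefficient
// modulo 2, which is a bitwise exclusive OR of the lanes.
std::uint8_t poly8_add(std::uint8_t a, std::uint8_t b) {
  return static_cast<std::uint8_t>(a ^ b);
}
```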
gcc/
PR target/71233
* config/aarch64/arm_neon.h (vadd_p8, vadd_p16, vadd_p64, vaddq_p8,
vaddq_p16, vaddq_p64, vaddq_p128): Define.
gcc/testsuite/
PR target/71233
* gcc.target/aarch64/simd/vadd_poly_1.c: New test.
Before this change, gcc did not correctly stream TOPN counters
that belonged to a non-local shared object.
As a result, the zero-section optimization generated TOPN sections
in a form not recognizable by '__gcov_merge_topn'.
The problem occurs because, with multiple shared objects, the
'__gcov_merge_topn' function is present in the address space multiple
times (once per object).
The fix is to never rely on the function address and instead predicate
on the TOPN counter types.
libgcc/ChangeLog:
PR gcov-profile/96913
* libgcov-driver.c (write_one_data): Avoid function pointer
comparison in TOP streaming decision.
This fixes
FAIL: compiler driver --help=common option(s): "^ +-.*[^:.]$" absent from output: " --param=modref-max-tests= Maximum number of tests perofmed by modref query"
FAIL: compiler driver --help=optimizers option(s): "^ +-.*[^:.]$" absent from output: " -fipa-modref Perform interprocedural modref analysis"
2020-09-22 Jakub Jelinek <jakub@redhat.com>
* common.opt (-fipa-modref): Add dot at the end of option help.
* params.opt (--param=modref-max-tests=): Likewise.
While backporting 5494edae83 I noticed
that it's still not correct. I made the allocator-extended constructor
use the right type for the uses-allocator construction detection, but I
used an rvalue when it should be a const lvalue.
This should fix it properly this time.
libstdc++-v3/ChangeLog:
PR libstdc++/96803
* include/std/tuple
(_Tuple_impl(allocator_arg_t, Alloc, const _Tuple_impl<U...>&)):
Use correct value category in __use_alloc call.
* testsuite/20_util/tuple/cons/96803.cc: Check with constructors
that require correct value category to be used.
For a span with statically empty extent, we currently model the
preconditions of front(), back(), and operator[] as if they are
mandates, by using a static_assert to verify that extent != 0. This
causes us to reject valid programs that instantiate these member
functions but never call them at runtime.
Since they are already followed by more general runtime asserts, this
patch just removes these static_asserts altogether.
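A minimal sketch of the kind of valid program in question. tiny_span and first_or are hypothetical types written for illustration, not libstdc++'s std::span.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical minimal view type with a static extent.
template <typename T, std::size_t Extent>
struct tiny_span {
  T *ptr;

  constexpr bool empty() const { return Extent == 0; }

  // A static_assert(Extent != 0) here would reject every instantiation with
  // Extent == 0, even when the call is unreachable. The runtime assert only
  // fires if front() is really called on an empty span.
  T &front() const {
    assert(Extent != 0 && "front() called on an empty span");
    return *ptr;
  }
};

// Generic code that instantiates front() for every Extent but guards the call.
template <typename T, std::size_t Extent>
T first_or(tiny_span<T, Extent> s, T fallback) {
  return s.empty() ? fallback : s.front();
}
```

Calling first_or with a tiny_span<int, 0> instantiates front() but never executes it, which is why a compile-time rejection would be too strict.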
libstdc++-v3/ChangeLog:
* include/std/span (span::front): Remove static_assert.
(span::back): Likewise.
(span::operator[]): Likewise.
* testsuite/23_containers/span/back_neg.cc: Rewrite to verify
that we check the preconditions of back() only when it's called.
* testsuite/23_containers/span/front_neg.cc: Likewise for
front().
* testsuite/23_containers/span/index_op_neg.cc: Likewise for
operator[].
This fixes a division by zero in the selection-sampling std::__sample
overload when the input range is empty (and hence __unsampled_sz is 0).
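A small usage sketch that exercises the empty-input case through the public std::sample interface (C++17). The wrapper name is hypothetical, and it is assumed that a forward-iterator input with a plain output iterator selects the selection-sampling overload.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <random>
#include <vector>

// Sample up to n elements from input into a new vector.
std::vector<int> sample_from(const std::vector<int> &input, std::size_t n) {
  std::vector<int> out;
  std::mt19937 gen(42);  // fixed seed so the sketch is reproducible
  std::sample(input.begin(), input.end(), std::back_inserter(out), n, gen);
  return out;
}
```

With the fix, sample_from({}, 3) simply returns an empty vector.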
libstdc++-v3/ChangeLog:
* include/bits/stl_algo.h (__sample): Exit early when the
input range is empty.
* testsuite/25_algorithms/sample/3.cc: New test.
As per P0202.
libstdc++-v3/ChangeLog:
* include/bits/stl_algo.h (for_each_n): Mark constexpr for C++20.
(search): Likewise for the overload that takes a searcher.
* testsuite/25_algorithms/for_each/constexpr.cc: Test constexpr
std::for_each_n.
* testsuite/25_algorithms/search/constexpr.cc: Test constexpr
std::search overload that takes a searcher.
Verify that arguments are pointers before calling handling code
that calls deref_rvalue on them.
gcc/analyzer/ChangeLog:
PR analyzer/97130
* region-model-impl-calls.cc (call_details::get_arg_type): New.
* region-model.cc (region_model::on_call_pre): Check that the
initial arg is a pointer before calling impl_call_memset and
impl_call_strlen.
* region-model.h (call_details::get_arg_type): New decl.
gcc/testsuite/ChangeLog:
PR analyzer/97130
* gcc.dg/analyzer/pr97130.c: New test.
Whilst debugging the remaining state explosion in PR analyzer/93355
I noticed that half of the states at an exploding program point had:
'malloc': {'&buf': 'non-heap'}
whereas the other half didn't, presumably depending on whether the path
to each enode had used this local buffer:
char buf[400];
This patch tweaks malloc_state_machine::get_default_state to be smarter
about this, so that we can implicitly treat pointers to decls as
non-heap, preventing pointless differences between sm_state_map
instances. With that, all of the states in question have equal (empty)
malloc sm-state - though the state explosion continues for other reasons.
gcc/analyzer/ChangeLog:
PR analyzer/93355
* sm-malloc.cc (malloc_state_machine::get_default_state): Look at
the base region when considering pointers. Treat pointers to
decls as being non-heap.
libstdc++-v3/ChangeLog:
* include/bits/c++config (__replacement_assert): Add noreturn
attribute.
(__glibcxx_assert_impl): Use __builtin_expect to hint that the
assertion is expected to pass.
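The two techniques can be sketched as below. SKETCH_ASSERT and sketch_assert_fail are illustrative names, not the libstdc++ macro or function; __builtin_expect is the GCC/Clang builtin.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// Hypothetical failure handler. [[noreturn]] tells the compiler that control
// never comes back from the failure path.
[[noreturn]] inline void sketch_assert_fail(const char *file, int line,
                                            const char *cond) {
  std::fprintf(stderr, "%s:%d: assertion '%s' failed\n", file, line, cond);
  std::abort();
}

// __builtin_expect hints that the failing branch is unlikely to be taken.
#define SKETCH_ASSERT(cond)                              \
  do {                                                   \
    if (__builtin_expect(!(cond), 0))                    \
      sketch_assert_fail(__FILE__, __LINE__, #cond);     \
  } while (false)

int checked_div(int a, int b) {
  SKETCH_ASSERT(b != 0);
  return a / b;
}
```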
libstdc++-v3/ChangeLog:
* include/std/ranges (drop_view::begin()): Adjust constraints
to match the correct condition for O(1) ranges::next (LWG 3482).
* testsuite/std/ranges/adaptors/drop.cc: Check that iterator is
cached for non-sized_range.
DR 1722 clarifies that the conversion function from lambda to pointer to
function should be noexcept(true).
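A short sketch of how the property can be observed from user code. The names are illustrative and not taken from the new test; the flag is only computed, since its value depends on whether the compiler implements the DR.

```cpp
#include <cassert>

using fn_ptr = int (*)(int);

// A captureless lambda converts implicitly to a function pointer.
auto add_one = [](int x) { return x + 1; };

// Expected to be true on compilers that implement DR 1722, where the
// conversion function is declared noexcept.
const bool conversion_is_noexcept = noexcept(static_cast<fn_ptr>(add_one));

int call_through_pointer(int x) {
  fn_ptr f = add_one;  // uses the lambda's conversion function
  return f(x);
}
```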
gcc/cp/ChangeLog:
PR c++/90583
DR 1722
* lambda.c (maybe_add_lambda_conv_op): Mark the conversion function
as noexcept.
gcc/testsuite/ChangeLog:
PR c++/90583
DR 1722
* g++.dg/cpp0x/lambda/lambda-conv14.C: New test.
I noticed that clang++ has this CTAD warning and thought that it might
be useful to have it. From clang++: "Some style guides want to allow
using CTAD only on types that "opt-in"; i.e. on types that are designed
to support it and not just types that *happen* to work with it."
So this warning warns when CTAD deduced a type, but the type does not
define any deduction guides. In that case CTAD worked only because the
compiler synthesized the implicit deduction guides. That might not be
intended.
It can be suppressed by adding a deduction guide that will never be
considered:
struct allow_ctad_t;
template <typename T> struct S { S(T) {} };
S(allow_ctad_t) -> S<void>;
This warning is off by default. It doesn't warn when the type comes
from a system header unless -Wsystem-headers is given.
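Putting both cases side by side, as a sketch under the assumption that the code is compiled as C++17 with -Wctad-maybe-unsupported. Plain, OptedIn and deduce_both are illustrative names.

```cpp
#include <cassert>

// No deduction guides: CTAD works only through the implicit guides, so the
// warning would fire where Plain is deduced.
template <typename T> struct Plain {
  T value;
  Plain(T v) : value(v) {}
};

// Opted in through a guide that is never viable for real arguments.
struct allow_ctad_t;  // deliberately left incomplete
template <typename T> struct OptedIn {
  T value;
  OptedIn(T v) : value(v) {}
};
OptedIn(allow_ctad_t) -> OptedIn<void>;

int deduce_both() {
  Plain p(1);    // would warn with -Wctad-maybe-unsupported
  OptedIn o(2);  // no warning: the type declares a deduction guide
  return p.value + o.value;
}
```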
gcc/c-family/ChangeLog:
* c.opt (Wctad-maybe-unsupported): New option.
gcc/cp/ChangeLog:
* pt.c (deduction_guides_for): Add a bool parameter. Set it.
(do_class_deduction): Warn when CTAD succeeds but the type doesn't
have any explicit deduction guides.
gcc/ChangeLog:
* doc/invoke.texi: Document -Wctad-maybe-unsupported.
gcc/testsuite/ChangeLog:
* g++.dg/warn/Wctad-maybe-unsupported.C: New test.
* g++.dg/warn/Wctad-maybe-unsupported2.C: New test.
* g++.dg/warn/Wctad-maybe-unsupported3.C: New test.
* g++.dg/warn/Wctad-maybe-unsupported.h: New file.