The name __deref is defined as a macro by Windows headers.
This renames the __deref() helper function to __ref. It doesn't actually
dereference an iterator. it just has the same type as the iterator's
reference type.
libstdc++-v3/ChangeLog:
PR libstdc++/97362
* doc/html/manual/source_code_style.html: Regenerate.
* doc/xml/manual/appendix_contributing.xml: Add __deref to
BADNAMES.
* include/debug/functions.h (_Irreflexive_checker::__deref):
Rename to __ref.
* testsuite/17_intro/badnames.cc: Check __deref.
This ensures that intermediate results are done in uint32_t values,
meeting the requirement for operations to be done modulo 2^32.
If the target doesn't define __UINT32_TYPE__ then substitute uint32_t
with a class type that uses uint_least32_t and masks the value to
UINT32_MAX.
I've also split the first loop that goes from k=0 to k<m into three
loops, for k=0, [1,s] and [s+1,m). This avoids branching for those three
cases in the body of the loop, and also avoids the concerns in PR 94823
regarding the k-1 index when k==0.
libstdc++-v3/ChangeLog:
PR libstdc++/97311
* include/bits/random.tcc (seed_seq::generate): Use uint32_t for
calculations. Also split the first loop into three loops to
avoid branching on k on every iteration, resolving PR 94823.
* testsuite/26_numerics/random/seed_seq/97311.cc: New test.
* testsuite/26_numerics/random/pr60037-neg.cc: Adjust dg-erro
line number.
We usually export variables in recipes this way. I'm not sure it's
necessary, but it's consistent.
libstdc++-v3/ChangeLog:
* testsuite/Makefile.am: Set and export variable separately.
* testsuite/Makefile.in: Regenerate.
It looks like our check-performance target runs completely unoptimized,
which is a bit silly. This exports the CXXFLAGS from the parent make
process to the check_performance script.
libstdc++-v3/ChangeLog:
* scripts/check_performance: Use gnu++11 instead of gnu++0x.
* testsuite/Makefile.am (check-performance): Export CXXFLAGS to
child process.
* testsuite/Makefile.in: Regenerate.
This tests std::uniform_int_distribution with various parameters and
engines.
libstdc++-v3/ChangeLog:
* testsuite/performance/26_numerics/random_dist.cc: New test.
This rewrites ranges::construct_at in terms of std::construct_at so
that we can piggyback on the compiler's existing support for
intercepting placement new within std::construct_at during constexpr
evaluation, instead of having to additionally teach the compiler about
ranges::construct_at.
While we're making changes to ranges::construct_at, this patch also
declares it conditionally noexcept and qualifies the calls to declval in
its requires-clause.
libstdc++-v3/ChangeLog:
PR libstdc++/95788
* include/bits/ranges_uninitialized.h:
(__construct_at_fn::operator()): Rewrite in terms of
std::construct_at. Declare it conditionally noexcept. Qualify
calls to declval in its requires-clause.
* testsuite/20_util/specialized_algorithms/construct_at/95788.cc:
New test.
These three distributions all require 0 < S where S is the sum of the
weights. When the sum is zero there's an undefined FP division by zero.
Add assertions to help users diagnose the problem.
libstdc++-v3/ChangeLog:
PR libstdc++/82584
* include/bits/random.tcc
(discrete_distribution::param_type::_M_initialize)
(piecewise_constant_distribution::param_type::_M_initialize)
(piecewise_linear_distribution::param_type::_M_initialize):
Add assertions for positive sums..
* testsuite/26_numerics/random/pr60037-neg.cc: Adjust dg-error
line.
The new constructors that C++11 added to std::ios_base::failure were
missing for the old ABI. This adds them, but just ignores the
std::error_code argument (because there's nowhere to store it).
This also adds a code() member, which should be provided by the
std::system_error base class, but that base class isn't present in the
old ABI.
This allows the old ios::failure to be used in code that expects the new
API, although with reduced functionality.
libstdc++-v3/ChangeLog:
* include/bits/ios_base.h (ios_base::failure): Add constructors
takeing error_code argument. Add code() member function.
* testsuite/27_io/ios_base/failure/cxx11.cc: Allow test to
run for the old ABI but do not check for derivation from
std::system_error.
* testsuite/27_io/ios_base/failure/error_code.cc: New test.
My previous attempt to fix this only worked when m is a power of two.
There is still a bug when a=00 and !has_single_bit(m).
Instead of trying to make _Mod work for a==0 this change ensures that we
never instantiate it with a==0. For C++17 we can use if-constexpr, but
otherwise we need to use a different multipler. It doesn't matter what
we use, as it won't actually be called, only instantiated.
libstdc++-v3/ChangeLog:
* include/bits/random.h (__detail::_Mod): Revert last change.
(__detail::__mod): Do not use _Mod for a==0 case.
* testsuite/26_numerics/random/linear_congruential_engine/operators/call.cc:
Check other cases with a==0. Also check runtime results.
* testsuite/26_numerics/random/pr60037-neg.cc: Adjust dg-error
line.
My recent changes to std::exception_ptr moved some members to be inline
in the header but didn't replace the variable names with reserved names.
The "tmp" variable must be fixed. The "other" parameter is actually a
reserved name because of std::allocator<T>::rebind<U>::other but should
be fixed anyway.
There are also some bad uses of "ForwardIterator" in <ranges>.
There's also a "il" parameter in a std::seed_seq constructor in <random>
which is only reserved since C++14.
libstdc++-v3/ChangeLog:
* include/bits/random.h (seed_seq(initializer_list<T>)): Rename
parameter to use reserved name.
* include/bits/ranges_algo.h (shift_left, shift_right): Rename
template parameters to use reserved name.
* libsupc++/exception_ptr.h (exception_ptr): Likewise for
parameters and local variables.
* testsuite/17_intro/names.cc: Check "il". Do not check "d" and
"y" in C++20 mode.
This inlines most members of std::exception_ptr so that all operations
on a null exception_ptr can be optimized away. This benefits code like
std::future and coroutines where an exception_ptr object is present to
cope with exceptional cases, but is usually not used and remains null.
Since those functions were previously non-inline we have to continue to
export them from the library, for objects that were compiled against the
old headers and expect to find definitions in the library.
In order to inline the copy constructor and destructor we need to export
the _M_addref() and _M_release() members that increment/decrement the
reference count when copying/destroying a non-null exception_ptr. The
copy ctor and dtor check for null and don't call _M_addref and
_M_release unless they need to. The checks for null pointers in
_M_addref and _M_release are still needed because old code might call
them without checking for null first. But we can use __builtin_expect to
predict that they are usually called for the non-null case.
libstdc++-v3/ChangeLog:
PR libstdc++/90295
* config/abi/pre/gnu.ver (CXXABI_1.3.13): New symbol version.
(exception_ptr::_M_addref(), exception_ptr::_M_release()):
Export symbols.
* libsupc++/eh_ptr.cc (exception_ptr::exception_ptr()):
Remove out-of-line definition.
(exception_ptr::exception_ptr(const exception_ptr&)):
Likewise.
(exception_ptr::~exception_ptr()): Likewise.
(exception_ptr::operator=(const exception_ptr&)):
Likewise.
(exception_ptr::swap(exception_ptr&)): Likewise.
(exception_ptr::_M_addref()): Add branch prediction.
* libsupc++/exception_ptr.h (exception_ptr::operator bool):
Add noexcept.
[!_GLIBCXX_EH_PTR_COMPAT] (operator==, operator!=): Define
inline as hidden friends. Remove declarations at namespace
scope.
(exception_ptr::exception_ptr()): Define inline.
(exception_ptr::exception_ptr(const exception_ptr&)):
Likewise.
(exception_ptr::~exception_ptr()): Likewise.
(exception_ptr::operator=(const exception_ptr&)):
Likewise.
(exception_ptr::swap(exception_ptr&)): Likewise.
* testsuite/util/testsuite_abi.cc: Add CXXABI_1.3.13.
* testsuite/18_support/exception_ptr/90295.cc: New test.
In commit ef275d1f20 I implemented the
wrong resolution of LWG 3474. This removes the deduction guide and
alters the views::join factory to create the right type explicitly.
libstdc++-v3/ChangeLog:
* include/std/ranges (join_view): Remove deduction guide.
(views::join): Add explicit template argument list to prevent
deducing the wrong type.
* testsuite/std/ranges/adaptors/join.cc: Move test for LWG 3474
here, from ...
* testsuite/std/ranges/adaptors/join_lwg3474.cc: Removed.
This avoids unnecessary instantiations of std::numeric_limits or
inclusion of <limits> when a more lightweight alternative would work.
Some uses can be replaced with __gnu_cxx::__int_traits and some can just
use size_t(-1) directly where SIZE_MAX is needed.
libstdc++-v3/ChangeLog:
* include/bits/regex.h: Use __int_traits<int> instead of
std::numeric_limits<int>.
* include/bits/uniform_int_dist.h: Use __int_traits<T>::__max
instead of std::numeric_limits<T>::max().
* include/bits/hashtable_policy.h: Use size_t(-1) instead of
std::numeric_limits<size_t>::max().
* include/std/regex: Include <ext/numeric_traits.h>.
* include/std/string_view: Use typedef for __int_traits<int>.
* src/c++11/hashtable_c++0x.cc: Use size_t(-1) instead of
std::numeric_limits<size_t>::max().
* testsuite/std/ranges/iota/96042.cc: Include <limits>.
* testsuite/std/ranges/iota/difference_type.cc: Likewise.
* testsuite/std/ranges/subrange/96042.cc: Likewise.
When adding new features to <numeric> I included the required headers
adjacent to the new code. This cleans it up by moving all the includes
to the start of the file.
libstdc++-v3/ChangeLog:
* include/std/numeric: Move all #include directives to the top
of the header.
* testsuite/26_numerics/gcd/gcd_neg.cc: Adjust dg-error line
numbers.
* testsuite/26_numerics/lcm/lcm_neg.cc: Likewise.
std::allocator and std::pmr::polymorphic_allocator should throw
std::bad_array_new_length from their allocate member functions if the
number of bytes required cannot be represented in std::size_t.
libstdc++-v3/ChangeLog:
* config/abi/pre/gnu.ver: Add new symbol.
* include/bits/functexcept.h (__throw_bad_array_new_length):
Declare new function.
* include/ext/malloc_allocator.h (malloc_allocator::allocate):
Throw bad_array_new_length for impossible sizes (LWG 3190).
* include/ext/new_allocator.h (new_allocator::allocate):
Likewise.
* include/std/memory_resource (polymorphic_allocator::allocate)
(polymorphic_allocator::allocate_object): Use new function,
__throw_bad_array_new_length.
* src/c++11/functexcept.cc (__throw_bad_array_new_length):
Define.
* testsuite/20_util/allocator/lwg3190.cc: New test.
As Jonathan Wakely pointed out[1], my change in commit
f9ddb696a2 should have been rounding to
the target clock duration type rather than the input clock duration type
in __atomic_futex_unsigned::_M_load_when_equal_until just as (e.g.)
condition_variable does.
As well as fixing this, let's create a rather contrived test that fails
with the previous code, but unfortunately only when run on a machine
with an uptime of over 208.5 days, and even then not always.
[1] https://gcc.gnu.org/pipermail/libstdc++/2020-September/051004.html
libstdc++-v3/ChangeLog:
PR libstdc++/91486
* include/bits/atomic_futex.h:
(__atomic_futex_unsigned::_M_load_when_equal_until): Use target
clock duration type when rounding.
* testsuite/30_threads/async/async.cc (test_pr91486_wait_for):
Rename from test_pr91486.
(float_steady_clock): New class for test.
(test_pr91486_wait_until): New test.
Commit 53ad6b1979 split the implementation
of std::chrono::__detail::ceil so that when compiling for C++17 and
later std::chrono::ceil is used but when compiling for earlier versions
a separate implementation is used to comply with C++11's limited
constexpr rules. Let's run the equivalent of the existing
std::chrono::ceil test cases on std::chrono::__detail::ceil too to make
sure that it doesn't get broken.
libstdc++-v3/ChangeLog:
* testsuite/20_util/duration_cast/rounding_c++11.cc: Copy
rounding.cc and alter to support compilation for C++11 and to
test std::chrono::__detail::ceil.
This fixes a linker error for older ARM cores without 64-bit atomics.
I think the { dg-add-options libatomic } is no longer needed, but it's
harmless to keep it there.
libstdc++-v3/ChangeLog:
* testsuite/29_atomics/atomic_float/value_init.cc: Use float
instead of double so that __atomic_load_8 isn't needed.
This test was supposed to verify that when __libc_single_threaded is
available we successfully detect recursive static initialization even
when linked to libpthread. But I forgot to that when recursive init is
detected, we terminate, and so the test fails.
This adds a terminate handler that exits cleanly, so the test passes
when recursive init is detected.
libstdc++-v3/ChangeLog:
* testsuite/18_support/96817.cc: Use terminate handler that
calls _Exit(0).
I noticed that the following changes from this paper were not yet
implemented.
libstdc++-v3/ChangeLog:
* include/bits/stl_iterator.h (reverse_iterator::iter_move):
Define for C++20 as per P0896.
(reverse_iterator::iter_swap): Likewise.
(move_iterator::operator*): Apply P0896 changes for C++20.
(move_iterator::operator[]): Likewise.
* testsuite/24_iterators/reverse_iterator/cust.cc: New test.
Since the standard range adaptors are specified to derive from the empty
class view_base, having their first data member store the underlying
view is suboptimal, for if the underlying view also derives from
view_base then the two view_base subobjects will be adjacent; this
prevents the compiler from applying the empty base optimization to elide
away the storage for these two empty bases.
This patch improves the situation by declaring the _M_base data member
last instead of first in each range adaptor that has more than one data
member, so that the empty base optimization can apply in more cases.
libstdc++-v3/ChangeLog:
* include/std/ranges (filter_view): Declare the data member
_M_base last instead of first, and adjust constructors' member
initializer lists accordingly.
(transform_view): Likewise.
(take_view): Likewise.
(take_while_view): Likewise.
(drop_view): Likewise.
(drop_while_view): Likewise.
(join_view): Likewise.
(split_view): Likewise (and tweak nearby formatting).
(reverse_view): Likewise.
* testsuite/std/ranges/adaptors/sizeof.cc: Update expected
sizes.
libstdc++-v3/ChangeLog:
* include/bits/ranges_util.h (subrange::_M_end): Give it
[[no_unique_address]].
* testsuite/std/ranges/subrange/sizeof.cc: New test.
libstdc++-v3/ChangeLog:
* include/std/ranges (iota_view::_M_bound): Give it
[[no_unique_address]].
* testsuite/std/ranges/iota/iota_view.cc: Check that an
unbounded iota_view has minimal size.
Glibc 2.32 adds a global variable that says whether the process is
single-threaded. We can use this to decide whether to elide atomic
operations, as a more precise and reliable indicator than
__gthread_active_p.
This means that guard variables for statics and reference counting in
shared_ptr can use less expensive, non-atomic ops even in processes that
are linked to libpthread, as long as no threads have been created yet.
It also means that we switch to using atomics if libpthread gets loaded
later via dlopen (this still isn't supported in general, for other
reasons).
We can't use __libc_single_threaded to replace __gthread_active_p
everywhere. If we replaced the uses of __gthread_active_p in std::mutex
then we would elide the pthread_mutex_lock in the code below, but not
the pthread_mutex_unlock:
std::mutex m;
m.lock(); // pthread_mutex_lock
std::thread t([]{}); // __libc_single_threaded = false
t.join();
m.unlock(); // pthread_mutex_unlock
We need the lock and unlock to use the same "is threading enabled"
predicate, and similarly for init/destroy pairs for mutexes and
condition variables, so that we don't try to release resources that were
never acquired.
There are other places that could use __libc_single_threaded, such as
_Sp_locker in src/c++11/shared_ptr.cc and locale init functions, but
they can be changed later.
libstdc++-v3/ChangeLog:
PR libstdc++/96817
* include/ext/atomicity.h (__gnu_cxx::__is_single_threaded()):
New function wrapping __libc_single_threaded if available.
(__exchange_and_add_dispatch, __atomic_add_dispatch): Use it.
* libsupc++/guard.cc (__cxa_guard_acquire, __cxa_guard_abort)
(__cxa_guard_release): Likewise.
* testsuite/18_support/96817.cc: New test.
libstdc++-v3/ChangeLog:
PR libstdc++/71579
* include/std/type_traits (invoke_result, is_invocable)
(is_invocable_r, is_nothrow_invocable, is_nothrow_invocable_r):
Add static_asserts to make sure that the arguments of the type
traits are not misused with incomplete types.
* testsuite/20_util/invoke_result/incomplete_args_neg.cc: New test.
* testsuite/20_util/is_invocable/incomplete_args_neg.cc: New test.
* testsuite/20_util/is_invocable/incomplete_neg.cc: New test.
* testsuite/20_util/is_nothrow_invocable/incomplete_args_neg.cc:
New test.
* testsuite/20_util/is_nothrow_invocable/incomplete_neg.cc: Check
for error on incomplete type usage in trait.
The class template semiregular-box<T> defined in [range.semi.wrap] is
used by a number of views to accomodate non-semiregular subobjects
while ensuring that the overall view remains semiregular. It provides
a stand-in default constructor, copy assignment operator and move
assignment operator whenever the underlying type lacks them. The
wrapper derives from std::optional<T> to support default construction
when T is not default constructible.
It would be nice for this wrapper to essentially be a no-op when the
underlying type is already semiregular, but this is currently not the
case due to its use of std::optional<T>, which incurs space overhead
compared to storing just T.
To that end, this patch specializes the semiregular wrapper for
semiregular T. Compared to the primary template, this specialization
uses less space, and it allows [[no_unique_address]] to optimize away
wrapped data members whose underlying type is empty and semiregular
(e.g. a non-capturing lambda). This patch also applies
[[no_unique_address]] to the five data members that use the wrapper.
libstdc++-v3/ChangeLog:
* include/std/ranges (__detail::__boxable): Split out the
associated constraints of __box into here.
(__detail::__box): Use the __boxable concept. Define a leaner
partial specialization for semiregular types.
(single_view::_M_value): Give it [[no_unique_address]].
(filter_view::_M_pred): Likewise.
(transform_view::_M_fun): Likewise.
(take_while_view::_M_pred): Likewise.
(drop_while_view::_M_pred):: Likewise.
* testsuite/std/ranges/adaptors/detail/semiregular_box.cc: New
test.
libstdc++-v3/ChangeLog:
PR libstdc++/97167
* src/c++17/fs_path.cc (path::_Parser::root_path()): Check
for empty string before inspecting the first character.
* testsuite/27_io/filesystem/path/append/source.cc: Append
empty string_view to path.
This introduces two new headers:
<bits/ranges_base.h> defines the minimal components needed
for using C++20 ranges (customization point objects such as
std::ranges::begin, concepts such as std::ranges::range, etc.)
<bits/ranges_util.h> includes <bits/ranges_base.h> and additionally
defines subrange, which is needed by <bits/ranges_algo.h>.
Most of the content of <bits/ranges_base.h> was previously defined in
<bits/range_access.h>, but a few pieces were only defined in <ranges>.
This meant the entire <ranges> header was needed in <algorithm> and
<memory>, even though they don't use all the range adaptors.
By moving the ranges components out of <bits/range_access.h> that file
is left defining just the contents of [iterator.range] i.e. std::begin,
std::end, std::size etc. and not C++20 ranges components.
For consistency with other C++20 ranges headers, <bits/range_cmp.h> is
renamed to <bits/ranges_cmp.h>.
libstdc++-v3/ChangeLog:
* include/Makefile.am: Add new headers and adjust for renamed
header.
* include/Makefile.in: Regenerate.
* include/bits/iterator_concepts.h: Adjust for renamed header.
* include/bits/range_access.h (ranges::*): Move to new
<bits/ranges_base.h> header.
* include/bits/ranges_algobase.h: Include new <bits/ranges_base.h>
header instead of <ranges>.
* include/bits/ranges_algo.h: Include new <bits/ranges_util.h>
header.
* include/bits/range_cmp.h: Moved to...
* include/bits/ranges_cmp.h: ...here.
* include/bits/ranges_base.h: New header.
* include/bits/ranges_util.h: New header.
* include/experimental/string_view: Include new
<bits/ranges_base.h> header.
* include/std/functional: Adjust for renamed header.
* include/std/ranges (ranges::view_base, ranges::enable_view)
(ranges::dangling, ranges::borrowed_iterator_t): Move to new
<bits/ranges_base.h> header.
(ranges::view_interface, ranges::subrange)
(ranges::borrowed_subrange_t): Move to new <bits/ranges_util.h>
header.
* include/std/span: Include new <bits/ranges_base.h> header.
* include/std/string_view: Likewise.
* testsuite/24_iterators/back_insert_iterator/pr93884.cc: Add
missing <ranges> header.
* testsuite/24_iterators/front_insert_iterator/pr93884.cc:
Likewise.
While backporting 5494edae83 I noticed
that it's still not correct. I made the allocator-extended constructor
use the right type for the uses-allocator construction detection, but I
used an rvalue when it should be a const lvalue.
This should fix it properly this time.
libstdc++-v3/ChangeLog:
PR libstdc++/96803
* include/std/tuple
(_Tuple_impl(allocator_arg_t, Alloc, const _Tuple_impl<U...>&)):
Use correct value category in __use_alloc call.
* testsuite/20_util/tuple/cons/96803.cc: Check with constructors
that require correct value category to be used.
For a span with statically empty extent, we currently model the
preconditions of front(), back(), and operator[] as if they are
mandates, by using a static_assert to verify that extent != 0. This
causes us to reject valid programs that would instantiate these member
functions and at runtime never call them.
Since they are already followed by more general runtime asserts, this
patch just removes these static_asserts altogether,
libstdc++-v3/ChangeLog:
* include/std/span (span::front): Remove static_assert.
(span::back): Likewise.
(span::operator[]): Likewise.
* testsuite/23_containers/span/back_neg.cc: Rewrite to verify
that we check the preconditions of back() only when it's called.
* testsuite/23_containers/span/front_neg.cc: Likewise for
front().
* testsuite/23_containers/span/index_op_neg.cc: Likewise for
operator[].
This fixes a division by zero in the selection-sampling std::__sample
overload when the input range is empty (and hence __unsampled_sz is 0).
libstdc++-v3/ChangeLog:
* include/bits/stl_algo.h (__sample): Exit early when the
input range is empty.
* testsuite/25_algorithms/sample/3.cc: New test.
As per P0202.
libstdc++-v3/ChangeLog:
* include/bits/stl_algo.h (for_each_n): Mark constexpr for C++20.
(search): Likewise for the overload that takes a searcher.
* testsuite/25_algorithms/for_each/constexpr.cc: Test constexpr
std::for_each_n.
* testsuite/25_algorithms/search/constexpr.cc: Test constexpr
std::search overload that takes a searcher.
libstdc++-v3/ChangeLog:
* include/std/ranges (drop_view::begin()): Adjust constraints
to match the correct condition for O(1) ranges::next (LWG 3482).
* testsuite/std/ranges/adaptors/drop.cc: Check that iterator is
cached for non-sized_range.
libstdc++-v3/ChangeLog:
* include/std/ranges (transform_view, elements_view): Relax
constraints on operator- for iterators, as per LWG 3483.
* testsuite/std/ranges/adaptors/elements.cc: Check that we
can take the difference of two iterators from a non-random
access range.
* testsuite/std/ranges/adaptors/transform.cc: Likewise.
The cast from void* to T* in std::assume_aligned is not valid in a
constexpr function. The optimization hint is redundant during constant
evaluation anyway (the compiler can see the object and knows its
alignment). Simply return the original pointer without applying the
__builtin_assume_aligned hint to it when doing constant evaluation.
This change also removes the preprocessor branch that works around
uintptr_t not being available. We already assume that type is present
elsewhere in the library.
libstdc++-v3/ChangeLog:
PR libstdc++/97132
* include/bits/align.h (align) [!_GLIBCXX_USE_C99_STDINT_TR1]:
Remove unused code.
(assume_aligned): Do not use __builtin_assume_aligned during
constant evaluation.
* testsuite/20_util/assume_aligned/1.cc: Improve test.
* testsuite/20_util/assume_aligned/97132.cc: New test.
libstdc++-v3/ChangeLog:
PR libstdc++/97101
* include/std/functional (bind_front): Fix order of parameters
in is_nothrow_constructible_v specialization.
* testsuite/20_util/function_objects/bind_front/97101.cc: New test.
The fix for PR68519 in 83fd5e73b3 only
applied to condition_variable::wait_for. This problem can also apply to
condition_variable::wait_until but only if the custom clock is using a
more recent epoch so that a small enough delta can be calculated. let's
use the newly-added chrono::__detail::ceil to fix this and also make use
of that function to simplify the previous wait_for fixes.
Also, simplify the existing test case for PR68519 a little and make its
variables local so we can add a new test case for the above problem.
Unfortunately, the test would have only started failing if sufficient
time has passed since the chrono::steady_clock epoch had passed anyway,
but it's better than nothing.
libstdc++-v3/ChangeLog:
* include/std/condition_variable (condition_variable::wait_until):
Convert delta to steady_clock duration before adding to current
steady_clock time to avoid rounding errors described in PR68519.
(condition_variable::wait_for): Simplify calculation of absolute
time by using chrono::__detail::ceil in both overloads.
* testsuite/30_threads/condition_variable/members/68519.cc:
(test_wait_for): Renamed from test01. Replace unassigned val
variable with constant false. Reduce scope of mx and cv
variables to just test_wait_for function.
(test_wait_until): Add new test case.
Convert the specified duration to the target clock's duration type
before adding it to the current time in
__atomic_futex_unsigned::_M_load_when_equal_for and
_M_load_when_equal_until. This removes the risk of the timeout being
rounded down to the current time resulting in there being no wait at all
when the duration type lacks sufficient precision to hold the
steady_clock current time.
Rather than using the style of fix from PR68519, let's expose the C++17
std::chrono::ceil function as std::chrono::__detail::ceil so that it can
be used in code compiled with earlier standards versions and simplify
the fix. This was suggested by John Salmon in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91486#c5 .
This problem has become considerably less likely to trigger since I
switched the __atomic__futex_unsigned::__clock_t reference clock from
system_clock to steady_clock and added the loop, but the consequences of
triggering it have changed too.
By my calculations it takes just over 194 days from the epoch for the
current time not to be representable in a float. This means that
system_clock is always subject to the problem (with the standard 1970
epoch) whereas steady_clock with float duration only runs out of
resolution machine has been running for that long (assuming the Linux
implementation of CLOCK_MONOTONIC.)
The recently-added loop in
__atomic_futex_unsigned::_M_load_when_equal_until turns this scenario
into a busy wait.
Unfortunately the combination of both of these things means that it's
not possible to write a test case for this occurring in
_M_load_when_equal_until as it stands.
libstdc++-v3/ChangeLog:
PR libstdc++/91486
* include/bits/atomic_futex.h
(__atomic_futex_unsigned::_M_load_when_equal_for)
(__atomic_futex_unsigned::_M_load_when_equal_until): Use
__detail::ceil to convert delta to the reference clock
duration type to avoid resolution problems.
* include/std/chrono (__detail::ceil): Move implementation
of std::chrono::ceil into private namespace so that it's
available to pre-C++17 code.
* testsuite/30_threads/async/async.cc (test_pr91486):
Test __atomic_futex_unsigned::_M_load_when_equal_for.
If std::future::wait_until is passed a time point measured against a
clock that is neither std::chrono::steady_clock nor
std::chrono::system_clock then the generic implementation of
__atomic_futex_unsigned::_M_load_when_equal_until is called which
calculates the timeout based on __clock_t and calls the
_M_load_when_equal_until method for that clock to perform the actual
wait.
There's no guarantee that __clock_t is running at the same speed as the
caller's clock, so if the underlying wait times out timeout we need to
check the timeout against the caller's clock again before potentially
looping.
Also add two extra tests to the testsuite's async.cc:
* run test03 with steady_clock_copy, which behaves identically to
chrono::steady_clock, but isn't chrono::steady_clock. This causes
the overload of __atomic_futex_unsigned::_M_load_when_equal_until
that takes an arbitrary clock to be called.
* invent test04 which uses a deliberately slow running clock in order
to exercise the looping behaviour of
__atomic_futex_unsigned::_M_load_when_equal_until described above.
libstdc++-v3/ChangeLog:
* include/bits/atomic_futex.h
(__atomic_futex_unsigned::_M_load_when_equal_until): Add
loop on generic _Clock to check the timeout against _Clock
again after _M_load_when_equal_until returns indicating a
timeout.
* testsuite/30_threads/async/async.cc: Invent slow_clock
that runs at an eleventh of steady_clock's speed. Use it
to test the user-supplied-clock variant of
__atomic_futex_unsigned::_M_load_when_equal_until works
generally with test03 and loops correctly when the timeout
time hasn't been reached in test04.
Add tests for waiting for the future using both chrono::steady_clock and
chrono::system_clock in preparation for dealing with those clocks
properly in futex.cc.
libstdc++-v3/ChangeLog:
* testsuite/30_threads/async/async.cc (test02): Test steady_clock
with std::future::wait_until.
(test03): Add new test templated on clock type waiting for future
associated with async to resolve.
(main): Call test03 to test both system_clock and steady_clock.
When a pool resource is constructed with max_blocks_per_chunk=1 it ends
up creating a pool with blocks_per_chunk=0 which means it never
allocates anything. Instead it returns null pointers, which should be
impossible.
To avoid this problem, round the max_blocks_per_chunk value to a
multiple of four, so it's never smaller than four.
libstdc++-v3/ChangeLog:
PR libstdc++/94160
* src/c++17/memory_resource.cc (munge_options): Round
max_blocks_per_chunk to a multiple of four.
(__pool_resource::_M_alloc_pools()): Simplify slightly.
* testsuite/20_util/unsynchronized_pool_resource/allocate.cc:
Check that valid pointers are returned when small values are
used for max_blocks_per_chunk.
The primary reason for this change is to reduce the size of buffers
allocated by std::pmr::monotonic_buffer_resource. Previously, a new
buffer would always add the size of the linked list node (11 bytes) and
then round up to the next power of two. This results in a huge increase
if the expected size of the next buffer is already a power of two. For
example, if the resource is constructed with a desired initial size of
4096 the first buffer it allocates will be std::bit_ceil(4096+11) which
is 8192. If the user has carefully selected the initial size to match
their expected memory requirements then allocating double that amount
wastes a lot of memory.
After this patch the allocated size will be rounded up to a 64-byte
boundary, instead of to a power of two. This means for an initial size
of 4096 only 4160 bytes get allocated.
Previously only the base-2 logarithm of the size was stored, which could
be stored in a single 8-bit integer. Now that the size isn't always a
power of two we need to use more bits to store it. As the size is always
a multiple of 64 the low six bits are not needed, and so we can use the
same approach that the pool resources already use of storing the base-2
logarithm of the alignment in the low bits that are not used for the
size. To avoid code duplication, a new aligned_size<N> helper class is
introduced by this patch, which is then used by both the pool resources'
big_block type and the monotonic_buffer_resource::_Chunk type.
Originally the big_block type used two bit-fields to store the size and
alignment in the space of a single size_t member. The aligned_size type
uses a single size_t member and uses masks and bitwise operations to
manipulate the size and alignment values. This results in better code
than the old version, because the bit-fields weren't optimally ordered
for little endian architectures, so the alignment was actually stored in
the high bits, not the unused low bits, requiring additional shifts to
calculate the values. Using bitwise operations directly avoids needing
to reorder the bit-fields depending on the endianness.
While adapting the _Chunk and big_block types to use aligned_size<N> I
also added checks for size overflows (technically, unsigned wraparound).
The memory resources now ensure that when they require an allocation
that is too large to represent in size_t they will request SIZE_MAX
bytes from the upstream resource, rather than requesting a small value
that results from wrapround. The testsuite is enhanced to verify this.
libstdc++-v3/ChangeLog:
PR libstdc++/96942
* include/std/memory_resource (monotonic_buffer_resource::do_allocate):
Use __builtin_expect when checking if a new buffer needs to be
allocated from the upstream resource, and for checks for edge
cases like zero sized buffers and allocations.
* src/c++17/memory_resource.cc (aligned_size): New class template.
(aligned_ceil): New helper function to round up to a given
alignment.
(monotonic_buffer_resource::chunk): Replace _M_size and _M_align
with an aligned_size member. Remove _M_canary member. Change _M_next
to pointer instead of unaligned buffer.
(monotonic_buffer_resource::chunk::allocate): Round up to multiple
of 64 instead of to power of two. Check for size overflow. Remove
redundant check for minimum required alignment.
(monotonic_buffer_resource::chunk::release): Adjust for changes
to data members.
(monotonic_buffer_resource::_M_new_buffer): Use aligned_ceil.
(big_block): Replace _M_size and _M_align with aligned_size
member.
(big_block::big_block): Check for size overflow.
(big_block::size, big_block::align): Adjust to use aligned_size.
(big_block::alloc_size): Use aligned_ceil.
(munge_options): Use aligned_ceil.
(__pool_resource::allocate): Use big_block::align for alignment.
* testsuite/20_util/monotonic_buffer_resource/allocate.cc: Check
upstream resource gets expected values for impossible sizes.
* testsuite/20_util/unsynchronized_pool_resource/allocate.cc:
Likewise. Adjust checks for expected alignment in existing test.
This "fix" makes no sense, but it avoids an error from G++ about
std::is_constructible being incomplete. The real problem is elsewhere,
but this "fixes" the regression for now.
libstdc++-v3/ChangeLog:
PR libstdc++/96592
* include/std/tuple (_TupleConstraints<true, T...>): Use
alternative is_constructible instead of std::is_constructible.
* testsuite/20_util/tuple/cons/96592.cc: New test.
The current std::gcd and std::chrono::duration::_S_gcd algorithms are
both recursive. This is potentially expensive to evaluate in constant
expressions, because each level of recursion makes a new copy of the
function to evaluate. The maximum number of steps is bounded
(proportional to the number of decimal digits in the smaller value) and
so unlikely to exceed the limit for constexpr nesting, but the memory
usage is still suboptimal. By using an iterative algorithm we avoid
that compile-time cost. Because looping in constexpr functions is not
allowed until C++14, we need to keep the recursive implementation in
duration::_S_gcd for C++11 mode.
For std::gcd we can also optimise runtime performance by using the
binary GCD algorithm.
libstdc++-v3/ChangeLog:
* include/std/chrono (duration::_S_gcd): Use iterative algorithm
for C++14 and later.
* include/std/numeric (__detail::__gcd): Replace recursive
Euclidean algorithm with iterative version of binary GCD algorithm.
* testsuite/26_numerics/gcd/1.cc: Test additional inputs.
* testsuite/26_numerics/gcd/gcd_neg.cc: Adjust dg-error lines.
* testsuite/26_numerics/lcm/lcm_neg.cc: Likewise.
* testsuite/experimental/numeric/gcd.cc: Test additional inputs.
* testsuite/26_numerics/gcd/2.cc: New test.
This was copied from a test for std::lcm but I forgot to change one of
the calls to use the experimental version of the function.
libstdc++-v3/ChangeLog:
PR libstdc++/92978
* testsuite/experimental/numeric/92978.cc: Use experimental::lcm
not std::lcm.