Prefer to overload __to_address to partially specialize std::pointer_traits because
std::pointer_traits would be mostly useless. Moreover partial specialization of
pointer_traits<__normal_iterator<P, C>> fails to rebind C, so you get incorrect types
like __normal_iterator<long*, vector<int>>. In the case of __gnu_debug::_Safe_iterator
the to_pointer method is impossible to implement correctly because we are missing
the parent container to associate the iterator to.
libstdc++-v3/ChangeLog:
* include/bits/stl_iterator.h
(std::pointer_traits<__gnu_cxx::__normal_iterator<>>): Remove.
(std::__to_address(const __gnu_cxx::__normal_iterator<>&)): New for C++11 to C++17.
* include/debug/safe_iterator.h
(std::__to_address(const __gnu_debug::_Safe_iterator<__gnu_cxx::__normal_iterator<>,
_Sequence>&)): New for C++11 to C++17.
* testsuite/24_iterators/normal_iterator/to_address.cc: Add check on std::vector::iterator
to validate both __gnu_cxx::__normal_iterator<> __to_address overload in normal mode and
__gnu_debug::_Safe_iterator in _GLIBCXX_DEBUG mode.
This patch uses the same not completely correct case insensitive comparisons
as used elsewhere in the same header. Proper comparisons that would handle
even multi-byte characters would be harder, but I don't see them implemented
in __ctype's methods.
2021-12-15 Jakub Jelinek <jakub@redhat.com>
PR libstdc++/71557
* include/bits/locale_facets_nonio.tcc (_M_extract_via_format):
Compare characters other than format specifiers and whitespace
case insensitively.
(_M_extract_name): Compare characters case insensitively.
* testsuite/22_locale/time_get/get/char/71557.cc: New test.
* testsuite/22_locale/time_get/get/wchar_t/71557.cc: New test.
This checks whether the locale data for en_HK includes %p and adjusts
the string being tested accordingly. To account for Jakub's fix to make
%I parse "12" as 0 instead of 12, we need to change the expected value
for the case where the locale format doesn't include %p. Also change the
time from 12:00:00 to 12:02:01 so we can tell if the minutes and seconds
get mixed up.
libstdc++-v3/ChangeLog:
PR libstdc++/103687
* testsuite/22_locale/time_get/get_date/wchar_t/4.cc: Restore
original locale before returning.
* testsuite/22_locale/time_get/get_time/char/2.cc: Check for %p
in locale's T_FMT and adjust accordingly.
* testsuite/22_locale/time_get/get_time/wchar_t/2.cc: Likewise.
std::regex currently allows invalid bracket ranges such as [\w-a] which
are only allowed by ECMAScript when in web browser compatibility mode.
It should be an error, because the start of the range is a character
class, not a single character. The current implementation of
_Compiler::_M_expression_term does not provide a way to reject this,
because we only remember a previous character, not whether we just
processed a character class (or collating symbol etc.)
This patch replaces the pair<bool, CharT> used to emulate
optional<CharT> with a custom class closer to pair<tribool,CharT>. That
allows us to track three states, so that we can tell when we've just
seen a character class.
With this additional state the code in _M_expression_term for processing
the _S_token_bracket_dash can be improved to correctly reject the [\w-a]
case, without regressing for valid cases such as [\w-] and [----].
libstdc++-v3/ChangeLog:
PR libstdc++/102447
* include/bits/regex_compiler.h (_Compiler::_BracketState): New
class.
(_Compiler::_BrackeyMatcher): New alias template.
(_Compiler::_M_expression_term): Change pair<bool, CharT>
parameter to _BracketState. Process first character for
ECMAScript syntax as well as POSIX.
* include/bits/regex_compiler.tcc
(_Compiler::_M_insert_bracket_matcher): Pass _BracketState.
(_Compiler::_M_expression_term): Use _BracketState to store
state between calls. Improve handling of dashes in ranges.
* testsuite/28_regex/algorithms/regex_match/cstring_bracket_01.cc:
Add more tests for ranges containing dashes. Check invalid
ranges with character class at the beginning.
This removes the __syntax_option and __match_flag enumeration types,
which are only used to define enumerators with successive values that
are then used to initialize the std::regex_constants global variables.
By defining enumerators in the syntax_option_type and match_flag_type
enumeration types with the correct values for the globals we get rid of
two useless enumeration types that just count from 0 to N, and we
improve the debugging experience. Because the enumeration types now have
enumerators defined, GDB will print values in terms of those enumerators
e.g.
$6 = (std::regex_constants::_S_ECMAScript | std::regex_constants::_S_multiline)
Previously this would have been shown as simply 0x810 because there were
no enumerators of that type.
This changes the type and value of enumerators such as _S_grep, but
users should never be referring to them directly anyway.
libstdc++-v3/ChangeLog:
* include/bits/regex_constants.h (__syntax_option, __match_flag):
Remove.
(syntax_option_type, match_flag_type): Define enumerators.
Use to initialize globals. Add constexpr to compound assignment
operators.
* include/bits/regex_error.h (error_type): Add comment.
* testsuite/28_regex/constants/constexpr.cc: Remove comment.
* testsuite/28_regex/constants/error_type.cc: Improve comment.
* testsuite/28_regex/constants/match_flag_type.cc: Check bitmask
requirements.
* testsuite/28_regex/constants/syntax_option_type.cc: Likewise.
libstdc++-v3/ChangeLog:
* include/bits/regex_compiler.tcc (_Compiler::_M_match_token):
Use reserved name for parameter.
* testsuite/17_intro/names.cc: Check "token".
The scripts/make_exports.pl script used for darwin only replaces '*'
wildcards in globs, it doesn't handle '?'. This means the recent changes
to std::__timepunct exports broke darwin.
Rather than use mangled names in the linker script, this adds support
for '?' to the perl script.
This also removes some unnecessary escaping of the replacement strings
in s// substitutions.
libstdc++-v3/ChangeLog:
* scripts/make_exports.pl: Replace '?' with '.' when turning
a glob into a regex.
Passing IncompleteType(&)[] to ranges::begin produces an error outside
the immediate context, which is fine for ranges::begin, but it means
that we fail to enforce the SFINAE-able constraints for ranges::size and
ranges::size. They should not be callable for any array of unknown
bound, whether the type is complete or not. Because we don't enforce
that in their constraints, we get a hard error when they try to use
ranges::begin.
This simply adds explicit checks for arrays of unknown bound to the
constraints for ranges::size and ranges::empty. We only need to check it
for the __sentinel_size and __eq_iter_empty concepts, because those are
the ones that are relevant to arrays, and which try to use
ranges::begin.
libstdc++-v3/ChangeLog:
* include/bits/ranges_base.h (ranges::size, ranges::empty): Add
explicit check for unbounded arrays before using ranges::begin.
* testsuite/std/ranges/access/empty.cc: Check handling of unbounded
arrays.
* testsuite/std/ranges/access/size.cc: Likewise.
The overload of std::regex_replace that takes a std::basic_string as the
fmt argument (for the replacement string) is implemented in terms of the
one taking a const C*, which uses std::char_traits to find the length.
That means it stops at a null character, even though the basic_string
might have additional characters beyond that.
Rather than duplicate the implementation of the const C* one for the
std::basic_string case, this moves that implementation to a new
__regex_replace function which takes a const C* and a length. Then both
the std::basic_string and const C* overloads can call that (with the
latter using char_traits to find the length to pass to the new
function).
libstdc++-v3/ChangeLog:
PR libstdc++/103664
* include/bits/regex.h (__regex_replace): Declare.
(regex_replace): Use it.
* include/bits/regex.tcc (__regex_replace): Replace regex_replace
definition with __regex_replace.
* testsuite/28_regex/algorithms/regex_replace/char/103664.cc: New test.
In the testcase for 103534 we get a warning about append leading to memcpy
of a very large number of bytes overflowing the buffer. This turns out to
be because we weren't calling _M_check_length for string append. Rather
than do that directly, let's go through the public pointer append that calls
it.
PR c++/103534
libstdc++-v3/ChangeLog:
* include/bits/basic_string.h (append (basic_string)): Call pointer
append instead of _M_append directly.
gcc/testsuite/ChangeLog:
* g++.dg/warn/Wstringop-overflow-8.C: New test.
This incremental patch adds std::time_get %r support (%p was added already
in the previous patch). The _M_am_fm_format method previously in the header
unfortunately had wrong arguments and so was useless, so the largest
complication in this patch is exporting a new symbol in the right symbol
version.
2021-12-10 Jakub Jelinek <jakub@redhat.com>
PR libstdc++/71367
* config/locale/dragonfly/time_members.cc (_M_initialize_timepunct):
Initialize "C" _M_am_pm_format to %I:%M:%S %p rather than empty
string.
* config/locale/gnu/time_members.cc (_M_initialize_timepunct):
Likewise.
* config/locale/generic/time_members.cc (_M_initialize_timepunct):
Likewise.
* include/bits/locale_facets_nonio.h (_M_am_pm_format): New method.
* include/bits/locale_facets_nonio.tcc (_M_extract_via_format): Handle
%r.
* config/abi/pre/gnu.ver (GLIBCXX_3.4.30): Export _M_am_pm_format
with const _CharT** argument, ensure it isn't exported in GLIBCXX_3.4.
* testsuite/22_locale/time_get/get/char/71367.cc: New test.
* testsuite/22_locale/time_get/get/wchar_t/71367.cc: New test.
The following patch is an attempt to fix various time_get related issues.
Sorry, it is long...
One of them is PR78714. It seems _M_extract_via_format has been written
with how strftime behaves in mind rather than how strptime behaves.
There is a significant difference between the two, for strftime %a and %A
behave differently etc., one emits an abbreviated name, the other full name.
For strptime both should behave the same and accept both the full or
abbreviated names. This needed large changes in _M_extract_name, which
was assuming the names are unique and names aren't prefixes of other names.
The _M_extract_name changes allow to deal with those cases. As can be
seen in the new testcase, e.g. for %b and english locales we need to
accept both Apr and April. If we see Apr in the input, the code looks
at whether there is end right after those 3 chars or if the next
character doesn't match characters in the longer names; in that case
it accepts the abbreviated name. Otherwise, if the input has Apri, it
commits to a longer name and fails if it isn't April. This behavior is
different from strptime, which for %bix and Aprix accepts it, but for
an input iterator I'm afraid we can't do better, we can't go back (peek
more than the current character).
Another case is that %d and %e in strptime should work the same, while
previously the code was hardcoding that %d would be 01 to 31 and %e
1 to 31 (with leading 0 replaced by space).
strptime POSIX 2009 documentation seems to suggest for numbers it should
accept up to the specified number of digits rather than exactly that number
of digits:
The pattern "[x,y]" indicates that the value shall fall within the range
given (both bounds being inclusive), and the maximum number of characters scanned
shall be the maximum required to represent any value in the range without leading
zeros.
so by my reading "1:" is valid for "%H:".
The glibc strptime implementation actually skips any amount of whitespace
in all the cases where a number is read, my current patch skips a single
space at the start of %d/%e but not the others, but doesn't subtract the
space length from the len characters.
One option would be to do the leading whitespace skipping in _M_extract_num
but take it into account how many digits can be read.
This matters for " 12:" and "%H:", but not for " 12:" and " %H:"
as in the latter case the space in the format string results in all the
whitespace at the start to be consumed.
Note, the allowing of a single digit rather than 2 changes a behavior in
other ways, e.g. when seeing 40 in a number for range [1, 31] we reject
it as before, but previously we'd keep *ret == '4' because it was assuming
it has to be 2 digits and 40 isn't valid, so we know error already on the
4, but now we accept the 4 as value and fail iff the next format string
doesn't match the 0.
Also, previously it wasn't really checking the number was in the right
range, it would accept 00 for [1, 31] numbers, or would accept 39.
Another thing is that %I was parsing 12 as tm_hour 12 rather than as tm_hour 0
like e.g. glibc does.
Another thing is that %t was matching a single tab and %n a single newline,
while strptime docs say it skips over whitespace (again, zero or more).
Another thing is that %p wasn't handled at all, I think this was the main
cause of
FAIL: 22_locale/time_get/get_time/char/2.cc execution test
FAIL: 22_locale/time_get/get_time/char/wrapped_env.cc execution test
FAIL: 22_locale/time_get/get_time/char/wrapped_locale.cc execution test
FAIL: 22_locale/time_get/get_time/wchar_t/2.cc execution test
FAIL: 22_locale/time_get/get_time/wchar_t/wrapped_env.cc execution test
FAIL: 22_locale/time_get/get_time/wchar_t/wrapped_locale.cc execution test
before this patch, because en_HK* locales do use %I and %p in it.
The patch handles %p only if it follows %I (i.e. when the hour is parsed
first), which is the more usual case (in glibc):
grep '%I' localedata/locales/* | grep '%I.*%p' | wc -l
282
grep '%I' localedata/locales/* | grep -v '%I.*%p' | wc -l
44
grep '%I' localedata/locales/* | grep -v '%p' | wc -l
17
The last case use %P instead of %p in t_fmt_ampm, not sure if that one
is never used by strptime because %P isn't handled by strptime.
Anyway, the right thing to handle even %p%I would be to pass some state
around through all the _M_extract_via_format calls like glibc passes
struct __strptime_state
{
unsigned int have_I : 1;
unsigned int have_wday : 1;
unsigned int have_yday : 1;
unsigned int have_mon : 1;
unsigned int have_mday : 1;
unsigned int have_uweek : 1;
unsigned int have_wweek : 1;
unsigned int is_pm : 1;
unsigned int want_century : 1;
unsigned int want_era : 1;
unsigned int want_xday : 1;
enum ptime_locale_status decided : 2;
signed char week_no;
signed char century;
int era_cnt;
} s;
around. That is for the %p case used like:
if (s.have_I && s.is_pm)
tm->tm_hour += 12;
during finalization, but handles tons of other cases which it is unclear
if libstdc++ needs or doesn't need to handle, e.g. strptime if one
specifies year and yday computes wday/mon/day from it, etc. basically for
the redundant fields computes them from other fields if those have been
parsed and are sufficient to determine it.
To do this we'd need to change ABI for the _M_extract_via_format,
though sure, we could add a wrapper around the new one with the old
arguments that would just use a dummy state. And we'd need a new
_M_whatever finalizer that would do those post parsing tweaks.
Also, %% wasn't handled.
For a whitespace in the strings there was inconsistent behavior,
_M_extract_via_format would require exactly that whitespace char (say
matching space, or matching tab), while the caller follows what
https://eel.is/c++draft/locale.time.get#members-8.5 says, that
when encountering whitespace it skips whitespace in the format and
then whitespace in the input if any. I've changed _M_extract_via_format
to skip whitespace in the input (looping over format isn't IMHO necessary,
because next iteration of the loop will handle that too).
Tested on x86_64-linux by make check-target-libstdc++-v3, ok for trunk
if it passes full bootstrap/regtest?
For the new 3.cc testcases, I have included hopefully correctly
corresponding C testcase using strptime in an attachment, and to the
extent where it can be compared (e.g. strptime on failure just
returns NULL, doesn't tell where it exactly stopped) I think the
only difference is that
str = "Novembur";
format = "%bembur";
ret = strptime (str, format, &time);
case where strptime accepts it but there is no way to do it with input
operator.
I admit I don't have libc++ or other STL libraries around to be able to
check how much the new 3.cc matches or disagrees with other implementations.
Now, the things not handled by this patch but which should be fixed (I
probably need to go back to compiler work) or at least looked at:
1) seems %j, %r, %U, %w and %W aren't handled (not sure if all of them
are already in POSIX 2009 or some are later)
2) I haven't touched the %y/%Y/%C and year handling stuff, that is
definitely not matching what POSIX 2009 says:
C All but the last two digits of the year {2}; leading zeros shall be permitted but shall not be required. A leading '+' or '−' character shall be permitted before
any leading zeros but shall not be required.
y The last two digits of the year. When format contains neither a C conversion specifier nor a Y conversion specifier, values in the range [69,99] shall refer to
years 1969 to 1999 inclusive and values in the range [00,68] shall refer to years 2000 to 2068 inclusive; leading zeros shall be permitted but shall not be re‐
quired. A leading '+' or '−' character shall be permitted before any leading zeros but shall not be required.
Note: It is expected that in a future version of this standard the default century inferred from a 2-digit year will change. (This would apply to all commands
accepting a 2-digit year as input.)
Y The full year {4}; leading zeros shall be permitted but shall not be required. A leading '+' or '−' character shall be permitted before any leading zeros but
shall not be required.
I've tried to avoid making changes to _M_extract_num for these as well
to keep current status quo (the __len == 4 cases). One thing is what
to do for things with %C %y and/or %Y in the formats, another thing
is what to do in the methods that directly perform _M_extract_num
for year
3) the above question what to do for leading whitespace of any numbers
being parsed
4) the %p%I issue mentioned above and generally what to do if we
pass state and have finalizers at the end of parsing
5) _M_extract_via_format is also inconsistent with its callers on handling
the non-whitespace characters in between format specifiers, the caller
follows https://eel.is/c++draft/locale.time.get#members-8.6 and does
case insensitive comparison:
// TODO real case-insensitive comparison
else if (__ctype.tolower(*__s) == __ctype.tolower(*__fmt) ||
__ctype.toupper(*__s) == __ctype.toupper(*__fmt))
while _M_extract_via_format only compares exact characters:
// Verify format and input match, extract and discard.
if (__format[__i] == *__beg)
++__beg;
(another question is if there is a better way how to do real
case-insensitive comparison of 2 characters and whether we e.g. need
to handle the Turkish i/İ and ı/I which have different number of bytes
in UTF-8)
6) _M_extract_name does something weird for case-sensitivity,
// NB: Some of the locale data is in the form of all lowercase
// names, and some is in the form of initially-capitalized
// names. Look for both.
if (__beg != __end)
and
if (__c == __names[__i1][0]
|| __c == __ctype.toupper(__names[__i1][0]))
for the first letter while just
__name[__pos] == *__beg
on all the following letters. strptime says:
In case a text string (such as the name of a day of the week or a month
name) is to be matched, the comparison is case insensitive.
so supposedly all the _M_extract_name comparisons should be case
insensitive.
2021-12-10 Jakub Jelinek <jakub@redhat.com>
PR libstdc++/78714
* include/bits/locale_facets_nonio.tcc (_M_extract_via_format):
Mention in function comment it interprets strptime format string
rather than strftime. Handle %a and %A the same by accepting both
full and abbreviated names. Similarly handle %h, %b and %B the same.
Handle %d and %e the same by accepting possibly optional single space
and 1 or 2 digits. For %I store tm_hour 0 instead of tm_hour 12. For
%t and %n skip any whitespace. Handle %p and %%. For whitespace in
the string skip any whitespace.
(_M_extract_num): For __len == 2 accept 1 or 2 digits rather than
always 2. Don't punt early if __value * __mult is larget than __max
or smaller than __min - __mult, instead punt if __value > __max.
At the end verify __value is in between __min and __max and punt
otherwise.
(_M_extract_name): Allow non-unique names or names which are prefixes
of other names. Don't recompute lengths of names for every character.
* testsuite/22_locale/time_get/get/char/3.cc: New test.
* testsuite/22_locale/time_get/get/wchar_t/3.cc: New test.
* testsuite/22_locale/time_get/get_date/char/12791.cc (test01): Use
62 instead 60 and expect 6 to be accepted and thus *ret01 == '2'.
* testsuite/22_locale/time_get/get_date/wchar_t/12791.cc (test01):
Similarly.
* testsuite/22_locale/time_get/get_time/char/2.cc (test02): Add " PM"
to the string.
* testsuite/22_locale/time_get/get_time/char/5.cc (test01): Expect
tm_hour 1 rather than 0.
* testsuite/22_locale/time_get/get_time/wchar_t/2.cc (test02): Add
" PM" to the string.
* testsuite/22_locale/time_get/get_time/wchar_t/5.cc (test01): Expect
tm_hour 1 rather than 0.
A mutex and condition variable is used for timed waits on atomics if
there is no "platform wait" (e.g. futex) supported. But the use of those
types wasn't guarded by the _GLIBCXX_HAS_GTHREADS macro, causing errors
for --disable-threads builds. This fix allows <atomic> to work on
targets with futexes but no gthreads.
libstdc++-v3/ChangeLog:
PR libstdc++/103638
* include/bits/atomic_timed_wait.h: Check _GLIBCXX_HAS_GTHREADS
before using std::mutex and std::__condvar.
If no OS function to sleep (e.g. nanosleep, usleep, Win32 Sleep etc.) is
available then configure defines the macro NO_SLEEP. But this will not
get prefixed with "_GLIBCXX_" because include/Makefile.am only does that
for macros beginning with "HAVE_". The configure script should define
_GLIBCXX_NO_SLEEP instead (which is what the code actually checks for).
libstdc++-v3/ChangeLog:
* acinclude.m4 (GLIBCXX_ENABLE_LIBSTDCXX_TIME): Add _GLIBCXX_
prefix to NO_SLEEP macro.
* config.h.in: Regenerate.
* configure: Regenerate.
This was an oversight in the original commit adding wait/notify
to atomic<T>.
libstdc++-v3/ChangeLog:
PR libstdc++/102994
* include/bits/atomic_base.h (__atomic_base<_PTp*>::wait()):
Add const qualifier.
* include/std/atomic (atomic<_Tp*>::wait(), atomic_wait()):
Likewise.
* testsuite/29_atomics/atomic/wait_notify/102994.cc:
New test.
Since r11-1571 (c++: Refinements to "more constrained") was changed in
the front end, the following comment from stl_iterator.h stopped being
true:
// These extra overloads are not needed in C++20, because the ones above
// are constrained with a requires-clause and so overload resolution will
// prefer them to greedy unconstrained function templates.
The requires-clause is no longer considered when comparing unrelated
function templates. That means that the constrained operator== specified
in the standard is no longer more constrained than the pathological
comparison operators defined in the testsuite_greedy_ops.h header. This
was causing several tests to FAIL in C++20 mode:
FAIL: 23_containers/deque/types/1.cc (test for excess errors)
FAIL: 23_containers/vector/types/1.cc (test for excess errors)
FAIL: 24_iterators/move_iterator/greedy_ops.cc (test for excess errors)
FAIL: 24_iterators/normal_iterator/greedy_ops.cc (test for excess errors)
FAIL: 24_iterators/reverse_iterator/greedy_ops.cc (test for excess errors)
The solution is to restore some of the non-standard comparison operators
that are more specialized than the greedy operators in the testsuite.
libstdc++-v3/ChangeLog:
* include/bits/stl_iterator.h (operator==, operator<=>): Define
overloads for homogeneous specializations of reverse_iterator,
__normal_iterator and move_iterator.
This test no longer has additional errors for C++20 mode, so remove the
dg-error that is now failing, and the unnecessary dg-prune-output.
libstdc++-v3/ChangeLog:
* testsuite/20_util/scoped_allocator/69293_neg.cc: Remove
dg-error for c++20.
This allows std::make_exception_ptr to be used in a translation unit
compiled with -fno-exceptions. This works because the new implementation
added for PR 68297 doesn't need to throw or catch anything. The catch is
there to handle exceptions from the constructor of the exception object,
which we can assume won't happen in a -fno-exceptions TU and so use the
__catch macro instead. If the constructor does throw (because it's
defined in a different TU which was compiled with exceptions enabled)
then that exception will propagate to the make_exception_ptr caller.
That seems acceptable for a program that is trying to mix & match TUs
compiled with and without exceptions, and using types that throw when
constructed. That should be rare, and can't reasonably be expected to
have sensible behaviour.
This also enables the new implementation for targets that use a
non-standard calling convention for the exceptionDestructor callback
(specifically, mingw, which uses __thiscall). All we need to do is mark
the __dest_thunk function template with the right calling convention.
Finally, the useless no-op definition of make_exception_ptr (which is
only used if both RTTI and exceptions are disabled) is marked
always_inline, to ensure that the linker won't keep that definition and
discard the functional ones when both definitions of the function are
present in the link. An alternative would be to add the abi_tag
attribute to the useless definition, but making it always_inline should
work, and it's small enough to always be inlined reliably.
libstdc++-v3/ChangeLog:
PR libstdc++/85813
* libsupc++/exception_ptr.h (__dest_thunk): Add macro for
destructor calling convention.
(make_exception_ptr): Enable non-throwing implementation for
-fno-exceptions and for non-standard calling conventions. Use
always_inline attribute on the useless no-rtti no-exceptions
definition.
* testsuite/18_support/exception_ptr/64241.cc: Add -fno-rtti so
the no-op implementation is still used.
This restores support for std::make_exception_ptr<E&> and for using
std::exception_ptr in C++98.
Because the new non-throwing implementation needs to use std::decay to
handle references the original throwing implementation is used for
C++98.
We also need to change the typeid expression so it doesn't yield the
dynamic type when the function parameter is a reference to a polymorphic
type. Otherwise the new exception object could be caught by any handler
matching the dynamic type, even though the actual exception object is
only a copy of the base class, sliced to the static type.
libstdc++-v3/ChangeLog:
PR libstdc++/103630
* libsupc++/exception_ptr.h (exception_ptr): Fix exception
specifications on inline definitions.
(make_exception_ptr): Decay the template parameter. Use typeid
of the static type.
* testsuite/18_support/exception_ptr/103630.cc: New test.
This implements my P2467R0 proposal to support opening an fstream in
exclusive mode. The new constant is also supported pre-C++23 as
std::ios_base::__noreplace.
This proposal hasn't been approved for C++23 yet, but I am confident it
will be, as this is restoring a feture found in pre-ISO C++ iostreams
implementations (and still present in the MSVC library as _Noreplace).
If the proposal fails for C++23 we can remove the ios::noreplace
name and just keep ios::__noreplace as an extension.
libstdc++-v3/ChangeLog:
PR libstdc++/59769
* config/io/basic_file_stdio.cc (fopen_mode): Add support for
exclusive mode.
* include/bits/ios_base.h (_S_noreplace): Define new enumerator.
(ios_base::__noreplace): Define.
(ios_base::noreplace): Define for C++23.
* include/std/version (__cpp_lib_ios_noreplace): Define.
* testsuite/27_io/basic_ofstream/open/char/noreplace.cc: New test.
* testsuite/27_io/basic_ofstream/open/wchar_t/noreplace.cc: New test.
std::condition_variable::wait(unique_lock<mutex>&) is incorrectly marked
noexcept, which means that the __forced_unwind exception used by NPTL
cancellation will terminate the process. It should allow exceptions to
pass through, so that a thread can be cleanly cancelled when waiting on
a condition variable.
The new behaviour is exported as a new version of the symbol, to avoid
an ABI break for existing code linked to the non-throwing definition of
the function. Code linked against older releases will have a reference
to the @GLIBCXX_3.4.11 version, andcode compiled against the new
libstdc++ will get a reference to the @@GLIBCXX_3.4.30 version.
libstdc++-v3/ChangeLog:
PR libstdc++/103382
* config/abi/pre/gnu.ver (GLIBCXX_3.4.11): Do not export old
symbol if .symver renaming is supported.
(GLIBCXX_3.4.30): Export new symbol if .symver renaming is
supported.
* doc/xml/manual/evolution.xml: Document change.
* doc/html/manual/api.html: Regenerate.
* include/bits/std_mutex.h (__condvar::wait, __condvar::wait_until):
Remove noexcept.
* include/std/condition_variable (condition_variable::wait):
Likewise.
* src/c++11/condition_variable.cc (condition_variable::wait):
Likewise.
* src/c++11/compatibility-condvar.cc (__nothrow_wait_cv::wait):
Define nothrow wrapper around std::condition_variable::wait and
export the old symbol as an alias to it.
* testsuite/30_threads/condition_variable/members/103382.cc: New test.
Inserting a pair<Key, Value> into a map<Key, Value> will allocate a new
node and construct a pair<const Key, Value> in the node, then check if
the Key is already present in the map. That is because pair<Key, Value>
is not the same type as the map's value_type. But it only differs in the
const-qualification on the Key, and so we should be able to do the
lookup directly, without allocating a new node. This avoids allocating
and then deallocating a node for the case where the key is already found
and nothing gets inserted.
We can take this optimization further and lookup the key directly for a
pair<Key, X>, pair<const Key, X>, pair<Key&, X> etc. for any X. A strict
reading of the standard says we can only do this when we know the
allocator won't do anything funky with the value when constructing a
pair<const Key, Value> from a slightly different type. Inserting that
type only requires the value_type to be Cpp17EmplaceInsertable into the
container, and that doesn't have any requirement that the value is
unchanged (unlike Cpp17CopyInsertable and Cpp17MoveInsertable). For that
reason, the optimization is only done for maps using std::allocator.
A similar optimization can be done for map.emplace(key, value) where the
first argument is similar to the key_type and so can be looked up
without allocating a new node and constructing a key_type.
Finally, both of the insert and emplace cases can use the same
optimization when key_type is a scalar type and some other scalar is
being passed as the insert/emplace argument. Converting from one scalar
type to another won't have surprising value-altering behaviour, and has
no side effects (unlike e.g. constructing a std::string from a const
char* argument, which might allocate).
We don't need to do this for std::multimap, because we always insert the
new node even if the key is already present. So there's no benefit to
doing the lookup before allocating the new node.
libstdc++-v3/ChangeLog:
PR libstdc++/92300
* include/bits/stl_map.h (insert(Pair&&), emplace(Args&&...)):
Check whether the arguments can be looked up directly without
constructing a temporary node first.
* include/bits/stl_pair.h (__is_pair): Move to here, from ...
* include/bits/uses_allocator_args.h (__is_pair): ... here.
* testsuite/23_containers/map/modifiers/emplace/92300.cc: New test.
* testsuite/23_containers/map/modifiers/insert/92300.cc: New test.
When non-const references, pointers or iterators are obtained to the
contents of a COW std::basic_string, the implementation has to assume it
could result in a write to the contents. If the string was previously
shared, it does the "copy-on-write" step of creating a new copy of the
data that is not shared by another object. It also marks the string as
"leaked", so that no future copies of it will share ownership either.
However, if the string is empty then the only character in the sequence
is the terminating null, and modifying that is undefined behaviour. This
means that non-const references/pointers/iterators to an empty string
are effectively const. Since no direct modification is possible, there
is no need to "leak" the string, it can be safely shared with other
objects. This avoids unnecessary allocations to create new copies of
empty strings that can't be modified anyway.
We already did this optimization for strings that share ownership of the
static _S_empty_rep() object, but not for strings that have non-zero
capacity, and not for fully-dynamic-strings (where the _S_empty_rep()
object is never used).
With this change we avoid two allocations in the return statement:
std::string s;
s.reserve(1); // allocate
std::string s2 = s;
std::string s3 = s;
return s[0] + s2[0] + s3[0]; // leak+allocate twice
libstdc++-v3/ChangeLog:
* include/bits/cow_string.h (basic_string::_M_leak_hard): Do not
reallocate an empty string.
These warnings are triggered by perfectly valid code using std::string.
They're particularly bad when --enable-fully-dynamic-string is used,
because even std::string().begin() will give a warning.
Use pragmas to stop the troublesome warnings for copies done by
std::char_traits.
libstdc++-v3/ChangeLog:
PR libstdc++/103332
PR libstdc++/102958
PR libstdc++/103483
* include/bits/char_traits.h: Suppress stringop and array-bounds
warnings.
The possible base classes of std::allocator are new_allocator and
malloc_allocator, which both cause a non-reserved name to be declared in
every program that includes the definition of std::allocator. This is
non-conforming.
This change replaces __gnu_cxx::new_allocator with std::__new_allocator
which is identical except for using a reserved name. The non-standard
extension __gnu_cxx::new_allocator is preserved as a thin wrapper over
std::__new_allocator. There is no problem with the extension using a
non-reserved name now that it's not included by default in other
headers.
The same change could be done to __gnu_cxx::malloc_allocator but as it's
not the default configuration it can wait.
libstdc++-v3/ChangeLog:
PR libstdc++/64135
* config/allocator/new_allocator_base.h: Include
<bits/new_allocator.h> instead of <ext/new_allocator.h>.
(__allocator_base): Use std::__new_allocator instead of
__gnu_cxx::new_allocator.
* doc/xml/manual/allocator.xml: Document new default base class
for std::allocator.
* doc/xml/manual/evolution.xml: Likewise.
* doc/html/*: Regenerate.
* include/Makefile.am: Add bits/new_allocator.h.
* include/Makefile.in: Regenerate.
* include/experimental/memory_resource (new_delete_resource):
Use std::__new_allocator instead of __gnu_cxx::new_allocator.
* include/ext/new_allocator.h (new_allocator): Derive from
std::__new_allocator. Move implementation to ...
* include/bits/new_allocator.h: New file.
* testsuite/20_util/allocator/64135.cc: New test.
The check for _Atomic_word being 32-bit is just a normal runtime
condition for C++11 and C++14, because it doesn't use if-constexpr. That
means the 1LL << (CHAR_BIT * sizeof(_Atomic_word)) expression expands to
1LL << 64 on Solaris, which is ill-formed.
This adds another indirection so that the shift width is zero if the
code is unreachable.
libstdc++-v3/ChangeLog:
* include/bits/shared_ptr_base.h (_Sp_counted_base::_M_release()):
Make shift width conditional on __double_word condition.
This rewrites _Sp_counted_base::_M_release to skip the two atomic
instructions that decrement each of the use count and the weak count
when both are 1.
Benefits: Save the cost of the last atomic decrements of each of the use
count and the weak count in _Sp_counted_base. Atomic instructions are
significantly slower than regular loads and stores across major
architectures.
How current code works: _M_release() atomically decrements the use
count, checks if it was 1, if so calls _M_dispose(), atomically
decrements the weak count, checks if it was 1, and if so calls
_M_destroy().
How the proposed algorithm works: _M_release() loads both use count and
weak count together atomically (assuming suitable alignment, discussed
later), checks if the value corresponds to a 0x1 value in the individual
count members, and if so calls _M_dispose() and _M_destroy().
Otherwise, it follows the original algorithm.
Why it works: When the current thread executing _M_release() finds each
of the counts is equal to 1, then no other threads could possibly hold
use or weak references to this control block. That is, no other threads
could possibly access the counts or the protected object.
There are two crucial high-level issues that I'd like to point out first:
- Atomicity of access to the counts together
- Proper alignment of the counts together
The patch is intended to apply the proposed algorithm only to the case of
64-bit mode, 4-byte counts, and 8-byte aligned _Sp_counted_base.
** Atomicity **
- The proposed algorithm depends on the mutual atomicity among 8-byte
atomic operations and 4-byte atomic operations on each of the 4-byte halves
of the 8-byte aligned 8-byte block.
- The standard does not guarantee atomicity of 8-byte operations on a pair
of 8-byte aligned 4-byte objects.
- To my knowledge this works in practice on systems that guarantee native
implementation of 4-byte and 8-byte atomic operations.
- __atomic_always_lock_free is used to check for native atomic operations.
** Alignment **
- _Sp_counted_base is an internal base class, with a virtual destructor,
so it has a vptr at the beginning of the class, and will be aligned to
alignof(void*) i.e. 8 bytes.
- The first members of the class are the 4-byte use count and 4-byte
weak count, which will occupy 8 contiguous bytes immediately after the
vptr, i.e. they form an 8-byte aligned 8 byte range.
Other points:
- The proposed algorithm can interact correctly with the current algorithm.
That is, multiple threads using different versions of the code with and
without the patch operating on the same objects should always interact
correctly. The intent for the patch is to be ABI compatible with the
current implementation.
- The proposed patch involves a performance trade-off between saving the
costs of atomic instructions when the counts are both 1 vs adding the cost
of loading the 8-byte combined counts and comparison with {0x1, 0x1}.
- I noticed a big difference between the code generated by GCC vs LLVM. GCC
seems to generate noticeably more code and what seems to be redundant null
checks and branches.
- The patch has been in use (built using LLVM) in a large environment for
many months. The performance gains outweigh the losses (roughly 10 to 1)
across a large variety of workloads.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
Co-authored-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* include/bits/c++config (_GLIBCXX_TSAN): Define macro
indicating that TSan is in use.
* include/bits/shared_ptr_base.h (_Sp_counted_base::_M_release):
Replace definition in primary template with explicit
specializations for _S_mutex and _S_atomic policies.
(_Sp_counted_base<_S_mutex>::_M_release): New specialization.
(_Sp_counted_base<_S_atomic>::_M_release): New specialization,
using a single atomic load to access both reference counts at
once.
(_Sp_counted_base::_M_release_last_use): New member function.
Newlib has reverted the commit that caused us to require a
workaround. As such we can now revert the workaround.
This reverts commit 0e510ab534.
libstdc++-v3/ChangeLog:
PR libstdc++/103305
* config/os/newlib/ctype_base.h (upper, lower, alpha, digit, xdigit,
space, print, graph, cntrl, punct, alnum, blank): Revert.
This fixes a -Wuninitialized warning for std::cmatch m1, m2; m1=m2;
Also name the template parameters in the forward declaration, to get rid
of the <template-parameter-1-1> noise in diagnostics.
libstdc++-v3/ChangeLog:
PR libstdc++/103549
* include/bits/regex.h (match_results): Give names to template
parameters in first declaration.
(match_results::_M_begin): Add default member-initializer.
This introduces a new RAII type to simplify the emplace members which
currently use try-catch blocks to deallocate a node if an exception is
thrown by the comparisons done during insertion. The new type is created
on the stack and manages the allocation of a new node and deallocates it
in the destructor if it wasn't inserted into the tree. It also provides
helper functions for doing the insertion, releasing ownership of the
node to the tree.
Also, we don't need to use long qualified names if we put the return
type after the nested-name-specifier.
libstdc++-v3/ChangeLog:
* include/bits/stl_tree.h (_Rb_tree::_Auto_node): Define new
RAII helper for creating and inserting new nodes.
(_Rb_tree::_M_insert_node): Use trailing-return-type to simplify
out-of-line definition.
(_Rb_tree::_M_insert_lower_node): Likewise.
(_Rb_tree::_M_insert_equal_lower_node): Likewise.
(_Rb_tree::_M_emplace_unique): Likewise. Use _Auto_node.
(_Rb_tree::_M_emplace_equal): Likewise.
(_Rb_tree::_M_emplace_hint_unique): Likewise.
(_Rb_tree::_M_emplace_hint_equal): Likewise.
The move constructor for the fully-dynamic std::basic_string was not
noexcept until recently, so the std::logic_error and std::runtime_error
move constructors were defined to make non-throwing copies of their
string members, instead of potentially-throwing moves.
Now that move construction is always noexecpt, the exception classes can
always move the string. The fully-dynamic string move assignment was
always noexcept, so I don't know why I special-cased the move assignment
operators of the exception classes. That can be changed too.
libstdc++-v3/ChangeLog:
* src/c++11/cow-stdexcept.cc [_GLIBCXX_FULY_DYNAMIC_STRING]
(logic_error, runtime_error): Remove custom definitions.
The bitmap_allocator, __mt_alloc and __pool_alloc extensions are no
longer suitable for use as the base class of std::allocator, because
they have not been updated to meet the C++20 requirements. There is a
patch attached to PR 103340 which addresses that, but more work would be
needed to solve the linking errors that occur when the library is
configured to use them.
Using --enable-libstdcxx-allocator=bitmap wouldn't even bootstrap for
the past few years, and I can't find any gcc-testresults reports using
any of these allocators. This patch removes the configure option to use
these as the std::allocator base class. The allocators are still in the
tree and can be used directly, you just can't configure the library to
use one of them as the base class of std::allocator.
libstdc++-v3/ChangeLog:
PR libstdc++/103340
PR libstdc++/103400
PR libstdc++/103381
* acinclude.m4 (GLIBCXX_ENABLE_ALLOCATOR): Remove mt, bitmap
and pool options.
* configure: Regenerate.
* config/allocator/bitmap_allocator_base.h: Removed.
* config/allocator/mt_allocator_base.h: Removed.
* config/allocator/pool_allocator_base.h: Removed.
* doc/xml/manual/allocator.xml: Update.
* doc/xml/manual/configure.xml: Update.
* doc/xml/manual/evolution.xml: Document removal.
* doc/xml/manual/mt_allocator.xml: Editorial tweaks.
* doc/html/manual/*: Regenerate.