gcc/libstdc++-v3/testsuite/28_regex
Jonathan Wakely 7ce3c230ed libstdc++: Fix handling of invalid ranges in std::regex [PR102447]
std::regex currently allows invalid bracket ranges such as [\w-a], which
are only allowed by ECMAScript when in web browser compatibility mode.
Such a range should be an error, because the start of the range is a
character class, not a single character. The current implementation of
_Compiler::_M_expression_term does not provide a way to reject this,
because we only remember a previous character, not whether we just
processed a character class (or collating symbol etc.).

This patch replaces the pair<bool, CharT> used to emulate
optional<CharT> with a custom class closer to pair<tribool,CharT>. That
allows us to track three states, so that we can tell when we've just
seen a character class.
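
As a rough illustration of the three-state idea described above, here is a
minimal sketch. The names BracketState, Kind, set_char, set_class and
can_start_range are hypothetical and chosen for this example only; they are
not the identifiers used in regex_compiler.h.

```cpp
#include <cassert>

// Hypothetical sketch of a tri-state "previous item" tracker.
// None  : nothing pending (start of bracket, or a range was just completed)
// Char  : the previous item was a single character (may start a range)
// Class : the previous item was a character class such as \w (may not)
template<typename CharT>
class BracketState
{
public:
    enum class Kind : char { None, Char, Class };

    void set_char(CharT c) noexcept { kind_ = Kind::Char; ch_ = c; }
    void set_class() noexcept { kind_ = Kind::Class; }
    void reset() noexcept { kind_ = Kind::None; }

    bool is_none() const noexcept { return kind_ == Kind::None; }
    bool is_char() const noexcept { return kind_ == Kind::Char; }
    bool is_class() const noexcept { return kind_ == Kind::Class; }

    // Only meaningful when a single character is stored.
    CharT get() const noexcept { assert(is_char()); return ch_; }

private:
    Kind kind_ = Kind::None;
    CharT ch_ = CharT();
};

// A dash can begin a range only if the previous item was a single character.
template<typename CharT>
bool can_start_range(const BracketState<CharT>& s) noexcept
{
    return s.is_char();
}
```

With a pair<bool, CharT> the "just saw a character class" case is
indistinguishable from "nothing pending"; the extra enumerator is what makes
the distinction possible in this sketch.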

With this additional state the code in _M_expression_term for processing
the _S_token_bracket_dash can be improved to correctly reject the [\w-a]
case, without regressing for valid cases such as [\w-] and [----].
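
The cases named above can be probed from user code with a small helper. This
is only a sketch: compiles and check_valid_dash_ranges are hypothetical names
for this example, and whether [\w-a] is rejected depends on whether the
standard library in use includes this fix, so the code asserts only the valid
cases.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns true if std::regex accepts the pattern (default ECMAScript grammar),
// false if construction throws std::regex_error.
bool compiles(const std::string& pattern)
{
    try {
        std::regex re(pattern);
        return true;
    } catch (const std::regex_error&) {
        return false;
    }
}

// The valid cases from the commit message: a trailing dash after a character
// class is a literal dash, and [----] is the range '-' to '-' plus a literal dash.
void check_valid_dash_ranges()
{
    assert(compiles("[\\w-]"));
    assert(std::regex_match("-", std::regex("[\\w-]")));
    assert(std::regex_match("a", std::regex("[\\w-]")));
    assert(std::regex_match("-", std::regex("[----]")));
}
```

Calling compiles("[\\w-a]") should return false on a library that contains
this change, and may return true on older releases.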

libstdc++-v3/ChangeLog:

	PR libstdc++/102447
	* include/bits/regex_compiler.h (_Compiler::_BracketState): New
	class.
	(_Compiler::_BracketMatcher): New alias template.
	(_Compiler::_M_expression_term): Change pair<bool, CharT>
	parameter to _BracketState. Process first character for
	ECMAScript syntax as well as POSIX.
	* include/bits/regex_compiler.tcc
	(_Compiler::_M_insert_bracket_matcher): Pass _BracketState.
	(_Compiler::_M_expression_term): Use _BracketState to store
	state between calls. Improve handling of dashes in ranges.
	* testsuite/28_regex/algorithms/regex_match/cstring_bracket_01.cc:
	Add more tests for ranges containing dashes. Check invalid
	ranges with character class at the beginning.
2021-12-14 21:45:46 +00:00
algorithms libstdc++: Fix handling of invalid ranges in std::regex [PR102447] 2021-12-14 21:45:46 +00:00
basic_regex libstdc++: Fix 28_regex/basic_regex/84110.cc on Solaris 2021-10-26 14:07:57 +02:00
constants libstdc++: Simplify definition of std::regex_constants variables 2021-12-14 21:45:45 +00:00
headers/regex
iterators
match_results libstdc++: Replace hyphens in effective target keywords 2021-11-24 13:20:26 +00:00
regex_error
requirements
sub_match
traits libstdc++: Reduce header dependencies in <regex> 2021-08-03 15:24:52 +01:00
init-list.cc
range_access.cc libstdc++: Add [[nodiscard]] to iterators and related utilities 2021-08-04 12:54:28 +01:00
regression.cc
simple_c++11.cc