Commit Graph

105 Commits

Author SHA1 Message Date
Andreas Schwab a445af0bc7 Fix buffer overrun in regexp matcher 2013-02-12 09:30:34 +01:00
Joseph Myers 568035b787 Update copyright notices with scripts/update-copyrights. 2013-01-02 19:05:09 +00:00
Paul Eggert 59ba27a63a Replace FSF snail mail address with URLs. 2012-02-09 23:18:22 +00:00
Andreas Schwab f3a6cc0a56 Fix access after end of search string in regex matcher 2011-11-30 11:03:20 +01:00
Ulrich Drepper 8887a920a4 Fix unnecessary overallocation due to incomplete character
When incomplete characters are found at the end of a string the
code ran amok and allocated lots of memory.  Stricter limits
are now in place.
2011-05-28 17:14:30 -04:00
Jim Meyering 2543fef229 Fix infloop on persistent failing calloc in regex. 2010-12-27 18:19:56 -05:00
Andreas Schwab d84acf388e Fix lookup of collation sequence value during regexp matching 2010-05-05 09:59:25 -07:00
Paul Eggert aef699dce1 regexec.c: avoid overflow in realloc buffer length computation 2010-01-22 12:41:12 -08:00
Paul Eggert 74bc9f14db regexec.c: avoid leaks on out-of-memory failure paths 2010-01-22 12:33:58 -08:00
Paul Eggert 42a2c9b5c3 regexec.c: avoid overflow in computing sum of lengths 2010-01-22 12:22:18 -08:00
Paul Eggert eadc09f22c re_search_internal: Avoid overflow in computing re_malloc buffer size 2010-01-22 12:15:53 -08:00
Paul Eggert 4cd028677b prune_impossible_nodes: Avoid overflow in computing re_malloc buffer size 2010-01-22 12:03:56 -08:00
Paul Eggert daa8454919 regexec.c: avoid arithmetic overflow in buffer size calculation 2010-01-22 10:52:38 -08:00
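The overflow fixes of 2010-01-22 listed above all guard a buffer-size computation before it reaches the allocator. A minimal sketch of that kind of check follows; `checked_array_alloc` is a hypothetical helper written for illustration and is not the code from these commits.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not glibc's code): allocate an array of NMEMB
   elements of SIZE bytes each.  Returns NULL instead of letting the
   byte count wrap around.  */
void *
checked_array_alloc (size_t nmemb, size_t size)
{
  assert (size > 0);		/* Precondition of this sketch.  */

  /* nmemb * size exceeds SIZE_MAX exactly when nmemb > SIZE_MAX / size,
     so test that before multiplying.  */
  if (nmemb > SIZE_MAX / size)
    return NULL;

  return malloc (nmemb * size);
}
```

A caller would treat a NULL result the same way as an ordinary out-of-memory failure.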
Paul Eggert d044d844dd regexec.c: simplify re_search_2_stub 2010-01-22 10:39:59 -08:00
Ulrich Drepper 2da42bc065 Fix a few more cases of ignored return values in regex. 2010-01-15 12:03:16 -08:00
Ulrich Drepper 76c7f2cd8a [BZ 697]
* posix/regexec.c (prune_impossible_nodes): Handle sifted_states[0]
	being NULL also if there are no backreferences.
	* posix/rxspencer/tests: Add testcases.
2009-01-08 00:47:30 +00:00
Ulrich Drepper b7d1c5fa30 * posix/fnmatch_loop.c: Take rule index returned as part of
findidx return value into account when accessing weights.
	* posix/regcomp.c: Likewise.
	* posix/regexec.c: Likewise.
2007-10-12 17:47:19 +00:00
Ulrich Drepper 1ba81cea27 * posix/regexec.c: Finish prototyping of static functions.
* posix/regex_internal.c: Likewise.
2005-10-15 15:23:33 +00:00
Ulrich Drepper 513bbb254d [BZ #1373]
2005-10-13  Ulrich Drepper  <drepper@redhat.com>
	[BZ #1373]
	* argp/argp.h: Remove __NTH for __argp_usage inline function.
2005-10-14 05:54:47 +00:00
Ulrich Drepper e2f5526407 [BZ #1231]
2005-08-23  Paul Eggert  <eggert@cs.ucla.edu>
	[BZ #1231]
	* posix/regex_internal.c (re_string_skip_chars, register_state,
	calc_state_hash): Remove forward decls.
	* posix/regexec.c (acquire_init_state_context, check_halt_node_context,
	proceed_next_node, pop_fail_stack, sub_epsilon_src_nodes,
	clean_state_log_if_needed): Likewise.

	* posix/regex.c: No need to use K&R definitions for static functions.
	* posix/regex_internal.c: Likewise.
2005-10-13 20:08:58 +00:00
Ulrich Drepper c0c9615cbb * posix/regexec.c (update_cur_sifted_state, check_arrival,
check_arrival_add_next_nodes): Avoid using uninitialized variable.

	* malloc/memusage.c (dest): Fix a bunch of warnings on 32-bit arches.

	* sysdeps/i386/fpu/libm-test-ulps: Update for GCC 4.0.x.
2005-09-30 15:46:19 +00:00
Ulrich Drepper 2c05d33f90 [BZ #1302]
2005-09-06  Paul Eggert  <eggert@cs.ucla.edu>
            Ulrich Drepper  <drepper@redhat.com>

	[BZ #1302]
	Change bitset word type from unsigned int to unsigned long int,
	as this has better performance on typical 64-bit hosts.  Change
	bitset type name to bitset_t.
	* posix/regcomp.c (build_equiv_class, build_charclass):
	(build_range_exp, build_collating_symbol):
	Prefer bitset_t to re_bitset_ptr_t in prototypes, when the actual
	argument is a bitset.  This is merely a style issue, but it makes
	it clearer that an entire array is expected.
	(re_compile_fastmap_iter, init_dfa, init_word_char, optimize_subexps,
	lower_subexp): Adjust for new bitset_t definition.
	(lower_subexp, parse_bracket_exp, built_charclass_op): Likewise.
	* posix/regex_internal.h (bitset_set, bitset_clear, bitset_contain,
	bitset_not, bitset_merge, bitset_set_all, bitset_mask): Likewise.
	* posix/regexec.c (check_dst_limits_calc_pos_1,
	check_subexp_matching_top, build_trtable, group_nodes_into_DFAstates):
	Likewise.
	* posix/regcomp.c (utf8_sb_map): Don't assume initializer
	== 0xffffffff.
	* posix/regex_internal.h (BITSET_WORD_BITS): Renamed from UINT_BITS.
	All uses changed.
	(BITSET_WORDS): Renamed from BITSET_UINTS.  All uses changed.
	(bitset_word_t): New type, replacing 'unsigned int' for bitset uses.
	All uses changed.
	(BITSET_WORD_MAX): New macro.
	(bitset_set, bitset_clear, bitset_contain, bitset_empty,
	bitset_set_all, bitset_copy): Adjust for bitset_t change.
	(bitset_empty, bitset_copy):
	Prefer sizeof (bitset_t) to multiplying it out ourselves.
	(bitset_not_merge): Remove; unused.
	(bitset_contain): Return bool, not unsigned int with one bit on.
	All callers changed.
	* posix/regexec.c (build_trtable): Don't assume bitset_t has no
	stricter alignment than re_node_set; do this by defining a new
	internal type struct dests_alloc and using it to allocate memory.
2005-09-28 17:33:18 +00:00
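The BZ #1302 entry above describes a bitset built from `unsigned long int` words, with `bitset_contain` returning a bool. The following is a rough sketch of such a layout. The names follow the log entry, but the bodies and the `SBC_MAX` value of 256 are assumptions made for illustration, not the glibc source.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <string.h>

/* Assumed number of single-byte character codes covered by a bitset.  */
#define SBC_MAX 256

typedef unsigned long int bitset_word_t;
#define BITSET_WORD_BITS (sizeof (bitset_word_t) * CHAR_BIT)
#define BITSET_WORDS ((SBC_MAX + BITSET_WORD_BITS - 1) / BITSET_WORD_BITS)
typedef bitset_word_t bitset_t[BITSET_WORDS];

/* Clear every bit; sizeof (bitset_t) avoids multiplying the size out by hand.  */
void
bitset_empty (bitset_t set)
{
  memset (set, 0, sizeof (bitset_t));
}

/* Turn on bit I.  */
void
bitset_set (bitset_t set, unsigned int i)
{
  assert (i < SBC_MAX);
  set[i / BITSET_WORD_BITS] |= (bitset_word_t) 1 << (i % BITSET_WORD_BITS);
}

/* Report whether bit I is on, as a bool rather than a masked word.  */
bool
bitset_contain (const bitset_t set, unsigned int i)
{
  assert (i < SBC_MAX);
  return (set[i / BITSET_WORD_BITS] >> (i % BITSET_WORD_BITS)) & 1;
}
```

On a typical 64-bit host this sketch needs four words per set instead of eight, which is the performance motivation the entry gives.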
Ulrich Drepper 997470b3e1 [BZ #281]
* posix/regex.h: Define RE_TRANSLATE_TYPE as unsigned char *.
	* posix/regcomp.c: Remove unnecessary uses of
	unsigned RE_TRANSLATE_TYPE.
	* posix/regex_internal.h: Likewise.
	* posix/regex_internal.c: Likewise.
	* posix/regexec.c: Likewise.
	Based on a patch by Stepan Kasal <kasal@ucw.cz>.
2005-09-23 06:11:29 +00:00
Ulrich Drepper 76b864c8e0 (update_cur_sifted_state): Likewise.
(re_search_internal): Likewise.
	(prune_impossible_nodes): Likewise.
	(acquire_init_state_context): Likewise.
	(proceed_next_node): Likewise.
	(set_regs): Likewise.
	(free_fail_stack_return): Likewise.
	(check_subexp_limits): Likewise.
	(sub_epsilon_src_nodes):  Likewise.
	(add_epsilon_src_nodes):  Likewise.
	(merge_state_array): Likewise.
	(update_regs): Likewise.
	(build_trtable): Likewise.
	(sift_states_backward): Mark MCTX parameter as const.
	(build_sifted_states): Likewise.
	(update_cur_sifted_state): Likewise.
	(sift_states_mkref): Likewise.
	(check_dst_limits_calc_pos_1): Likewise.
	* posix/regex_internal.h (re_match_context_t): Make dfa a const
	pointer.
2005-09-07 16:15:23 +00:00
Ulrich Drepper 6efbd82c5c (transit_state_bkref): Make DFA a const pointer.
(get_subexp): Likewise.
	(check_arrival): Likewise.
	(check_arrival_expand_ecl): Mark DFA parameter as const.
	(check_arrival_expand_ecl_sub): Likewise.
	(check_arrival_expand_ecl): Mark eclosure as const.
2005-09-07 15:26:18 +00:00
Ulrich Drepper 1878e9af92 * posix/regexec.c (find_recover_state): Remove unnecessary
initialization.
2005-09-07 07:16:24 +00:00
Ulrich Drepper c42b4152ab * posix/regexec.c (merge_state_with_log): Define dfa as const pointer.
(transit_state_sb): Likewise.
	(transit_state_mb): Likewise.
	(sift_states_iter_mb): Likewise.
	(check_arrival_add_next_nodes): Likewise.
	(check_node_accept_bytes): Change first parameter to pointer-to-const.
	[_LIBC] (re_search_2_stub): Use mempcpy.
2005-09-07 05:41:42 +00:00
Ulrich Drepper 01ed6ceb7c * posix/regex_internal.c (re_string_reconstruct): Avoid calling
mbrtowc for very simple UTF-8 case.

2005-09-01  Paul Eggert  <eggert@cs.ucla.edu>

	* posix/regex_internal.c (build_wcs_upper_buffer): Fix portability
	bugs in int versus size_t comparisons.

2005-09-06  Ulrich Drepper  <drepper@redhat.com>

	* posix/regex_internal.c (re_acquire_state): Make DFA pointer arg
	a pointer-to-const.
	(re_acquire_state_context): Likewise.
	* posix/regex_internal.h: Adjust prototypes.

2005-08-31  Jim Meyering  <jim@meyering.net>

	* posix/regcomp.c (search_duplicated_node): Make first pointer arg
	a pointer-to-const.
	* posix/regex_internal.c (create_ci_newstate, create_cd_newstate,
	register_state): Likewise.
	* posix/regexec.c (search_cur_bkref_entry, check_dst_limits):
	(check_dst_limits_calc_pos_1, check_dst_limits_calc_pos):
	(group_nodes_into_DFAstates): Likewise.

	* posix/regexec.c (re_search_internal): Simplify update of
	rm_so and rm_eo by replacing "if (A == B) A += C - B;"
	with the equivalent of "if (A == B) A = C;".

2005-09-06  Ulrich Drepper  <drepper@redhat.com>

	* posix/regcomp.c (re_compile_internal): Change third parameter type
	to size_t.
	(init_dfa): Likewise.  Make sure that arithmetic on pat_len doesn't
	overflow.
	* posix/regex_internal.h (struct re_dfa_t): Change type of nodes_alloc
	and nodes_len to size_t.
	* posix/regex_internal.c (re_dfa_add_node): Use size_t as type for
	new_nodes_alloc.  Check for overflow.

2005-08-31  Paul Eggert  <eggert@cs.ucla.edu>

	* posix/regcomp.c (re_compile_fastmap_iter, init_dfa, init_word_char):
	(optimize_subexps, lower_subexp):
	Don't assume 1<<31 has defined behavior on hosts with 32-bit int,
	since the signed shift might overflow.  Use 1u<<31 instead.
	* posix/regex_internal.h (bitset_set, bitset_clear, bitset_contain):
	Likewise.
	* posix/regexec.c (check_dst_limits_calc_pos_1): Likewise.
	(check_subexp_matching_top): Likewise.
	* posix/regcomp.c (optimize_subexps, lower_subexp):
	Use CHAR_BIT rather than 8, for clarity.
	* posix/regexec.c (check_dst_limits_calc_pos_1):
	(check_subexp_matching_top): Likewise.
	* posix/regcomp.c (init_dfa): Make table_size unsigned, so that we
	don't have to worry about portability issues when shifting it left.
	Remove no-longer-needed test for table_size > 0.
	* posix/regcomp.c (parse_sub_exp): Do not shift more bits than there
	are in a word, as the resulting behavior is undefined.
	* posix/regexec.c (check_dst_limits_calc_pos_1): Likewise;
	in one case, a <= should have been an <, and in another case the
	whole test was missing.
	* posix/regex_internal.h (BYTE_BITS): Remove.  All uses changed to
	the standard name CHAR_BIT.
2005-09-07 01:15:33 +00:00
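The 2005-08-31 entry above replaces `1<<31` with `1u<<31` and adds tests so that no shift count reaches the word width. A small sketch of both points follows; `bit_at`, `mask_has_bit` and `UINT_WIDTH_BITS` are hypothetical names for illustration and do not come from regexec.c.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define UINT_WIDTH_BITS (sizeof (unsigned int) * CHAR_BIT)

/* Return the mask with only bit IDX set.  The literal is 1u, not 1,
   because shifting a signed 1 into the sign bit overflows.  */
unsigned int
bit_at (unsigned int idx)
{
  assert (idx < UINT_WIDTH_BITS);	/* Larger counts are undefined.  */
  return 1u << idx;
}

/* Test bit IDX of MASK, treating out-of-range indexes as "not set".
   The guard rejects idx == width too, matching the "<" versus "<="
   correction mentioned in the entry.  */
bool
mask_has_bit (unsigned int mask, unsigned int idx)
{
  if (idx >= UINT_WIDTH_BITS)
    return false;
  return (mask & bit_at (idx)) != 0;
}
```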
Ulrich Drepper 2d87db5b53 * posix/regex_internal.h (re_sub_match_top_t): Remove unused member
next_last_offset.
	(struct re_dfa_t): Remove unused member states_alloc.
	* posix/regcomp.c (init_dfa): Don't initialize unused members.

2005-08-25  Paul Eggert  <eggert@cs.ucla.edu>

	* posix/regexec.c (set_regs): Don't alloca with an unbounded size.

	alloca modernization/simplification for regex.
	* posix/regex.c: Remove portability cruft for alloca.  This no longer
	needs to be at the start of the file, and can be moved into
	regex_internal.h and simplified.
	* posix/regex_internal.h: Include <alloca.h>.
	(__libc_use_alloca) [!defined _LIBC]: New macro.
	* posix/regexec.c (build_trtable): Remove "#ifdef _LIBC",
	since the code now works outside glibc.

2005-09-06  Ulrich Drepper  <drepper@redhat.com>

	* include/regex.h: Remove use of _RE_ARGS.

2005-08-25  Paul Eggert  <eggert@cs.ucla.edu>

	* posix/regexec.c (find_recover_state): Change "err" to "*err".

2005-08-24  Paul Eggert  <eggert@cs.ucla.edu>

	* posix/regcomp.c (regerror): Pointer args are 'restrict',
	as per POSIX.
	* posix/regex.h (regerror): Likewise.
	* manual/pattern.texi (POSIX Regexp Compilation): Likewise.
	Similarly for regcomp and regexec.  Also, first 2 args of regexec
	and 2nd arg of regerror are const.

	* posix/regex.c: Do not include <sys/types.h>, as POSIX no longer
	requires this.  (The code never needed it.)

2005-08-20  Paul Eggert  <eggert@cs.ucla.edu>

	* posix/regexec.c (sift_states_bkref): re_node_set_insert returns
	int, not reg_errcode_t.

	* posix/regex_internal.c (calc_state_hash): Put 'inline' before type,
	since some broken compilers warn about it otherwise.

	* posix/regcomp.c (create_initial_state): Remove duplicate decl.

2005-08-20  Paul Eggert  <eggert@cs.ucla.edu>

	* posix/regex.h (_RE_ARGS): Remove.  No longer needed, since we assume
	C89 or better.  All uses removed.

2005-09-06  Ulrich Drepper  <drepper@redhat.com>

	* posix/regex.c: Prevent using C++ compilers.

2005-08-19  Paul Eggert  <eggert@cs.ucla.edu>

	* posix/regcomp.c (duplicate_node): Return new index, not an error
	code, and let the caller return REG_ESPACE if out of space.  This
	removes an uninitialized-variable warning with GCC 4.0.1, and also
	avoids taking the address of a local variable.  All callers
	changed.

2005-09-06  Ulrich Drepper  <drepper@redhat.com>

	* include/time.h (__strptime_internal): Rename parameter to avoid
	bogus compiler warning.

2005-08-19  Jim Meyering  <jim@meyering.net>

	* posix/regexec.c (proceed_next_node): Redo local variables to
	avoid GCC shadowing warnings.

2005-09-06  Ulrich Drepper  <drepper@redhat.com>

	* posix/regex_internal.c (re_acquire_state): Minor code rearrangement.
	(re_acquire_state_context): Likewise.

2005-08-19  Paul Eggert  <eggert@cs.ucla.edu>

	* posix/regex_internal.c (re_string_realloc_buffers):
	(re_node_set_insert, re_node_set_insert_last, re_dfa_add_node):
	Rename local variables to avoid GCC shadowing warnings.

2005-07-08  Eric Blake  <ebb9@byu.net>
            Paul Eggert  <eggert@cs.ucla.edu>

	* posix/regcomp.c (init_dfa): Store __btowc value in wint_t, not
	wchar_t.  Remove now-unnecessary cast.
	(build_range_exp): Likewise.
2005-09-06 21:15:13 +00:00
Ulrich Drepper 7b91899345 [BZ #934]
2005-05-06  Jakub Jelinek  <jakub@redhat.com>
	[BZ #934]
	* posix/regex_internal.h: Include bits/libc-lock.h or define dummy
	__libc_lock_* macros if not _LIBC.
	(struct re_dfa_t): Add lock.
	* posix/regcomp.c (re_compile_internal): Add __libc_lock_init.
	* posix/regexec.c (regexec, re_search_stub): Add locking.
2005-05-06 23:34:44 +00:00
Ulrich Drepper 1c99f950d1 * posix/regexec.c (check_node_accept_bytes): Correct cast to avoid
warning.
	* posix/regex_internal.c (re_string_reconstruct): Add cast to
	avoid warning.
	(build_wcs_upper_buffer): Change type of buf to plain char.
	* locale/weightwc.h (findidx): Add casts to avoid warnings.
	* time/mktime.c (ranged_convert): Initialize tm to make the
	compiler happy.
	* wcsmbs/mbsrtowcs_l.c (__mbsrtowcs_l): Add casts to avoid warnings.
	* wcsmbs/wcsnrtombs.c (__wcsnrtombs): Add casts to avoid warnings.
	* wcsmbs/mbsnrtowcs.c: Add casts to avoid warnings.
	* wcsmbs/wcsrtombs.c (__wcsrtombs): Add casts to avoid warnings.
	* wcsmbs/wcrtomb.c (__wcrtomb): Add casts to avoid warnings.
	* wcsmbs/mbrtowc.c (__mbrtowc): Use unsigned char for outbuf.
	* posix/regex_internal.c [_LIBC] (build_wcs_buffer): Avoid using
	dynamically sized array.
	(build_wcs_upper_buffer): Likewise.
2005-03-06 07:27:56 +00:00
Ulrich Drepper 963d8d782f [BZ #558]
Update.
2005-01-27  Paolo Bonzini  <bonzini@gnu.org>

	[BZ #558]
	* posix/regcomp.c (calc_inveclosure): Return reg_errcode_t.
	Initialize the node sets in dfa->inveclosures.
	(analyze): Initialize inveclosures only if it is needed.
	Check errors from calc_inveclosure.
	* posix/regex_internal.c (re_dfa_add_node): Do not initialize
	the inveclosure node set.
	* posix/regexec.c (re_search_internal): If nmatch includes unused
	subexpressions, reset them to { rm_so: -1, rm_eo: -1 } here.

	* posix/regcomp.c (parse_bracket_exp) [!RE_ENABLE_I18N]:
	Do build a SIMPLE_BRACKET token.

	* posix/regexec.c (transit_state_mb): Do not examine nodes
	where ACCEPT_MB is not set.
2005-01-27 19:08:10 +00:00
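The BZ #558 entry above notes that when nmatch covers more slots than the pattern has subexpressions, the extra slots are reset to { rm_so: -1, rm_eo: -1 }. A sketch of how a caller might observe this through the public POSIX interface follows; `match_groups` is a hypothetical wrapper written for illustration.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: compile the extended regex PATTERN, match it
   against TEXT, and fill PMATCH[0..NMATCH-1].  Returns 0 on a match,
   otherwise the regcomp or regexec error code.  */
int
match_groups (const char *pattern, const char *text,
	      size_t nmatch, regmatch_t pmatch[])
{
  regex_t re;
  int rc;

  assert (pattern != NULL && text != NULL);

  rc = regcomp (&re, pattern, REG_EXTENDED);
  if (rc != 0)
    return rc;

  rc = regexec (&re, text, nmatch, pmatch, 0);
  regfree (&re);
  return rc;
}
```

With the pattern "(a)b" and four slots, slot 0 holds the whole match, slot 1 holds the group, and slots 2 and 3 should both read -1.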
Ulrich Drepper 02f3550c8b [BZ #605, BZ #611]
Update.
2004-12-13  Paolo Bonzini  <bonzini@gnu.org>

	Separate parsing and creation of the NFA.  Avoided recursion on
	the (very unbalanced) parse tree.
	[BZ #611]
	* posix/regcomp.c (struct subexp_optimize, analyze_tree, calc_epsdest,
	re_dfa_add_tree_node, mark_opt_subexp_iter): Removed.
	(optimize_subexps, duplicate_tree, calc_first, calc_next,
	mark_opt_subexp): Rewritten.
	(preorder, postorder, lower_subexps, lower_subexp, link_nfa_nodes,
	create_token_tree, free_tree, free_token): New.
	(analyze): Accept a regex_t *.  Invoke the passes via the preorder and
	postorder generic visitors.  Do not initialize the fields in the
	re_dfa_t that represent the transitions.
	(free_dfa_content): Use free_token.
	(re_compile_internal): Analyze before UTF-8 optimizations.  Do not
	include optimization of subexpressions.
	(create_initial_state): Fetch the DFA node index from the first node's
	bin_tree_t *.
	(optimize_utf8): Abort on unexpected nodes, including OP_DUP_QUESTION.
	Return on COMPLEX_BRACKET.
	(duplicate_node_closure): Fix comment.
	(duplicate_node): Do not initialize the fields in the
	re_dfa_t that represent the transitions.
	(calc_eclosure, calc_inveclosure): Do not handle OP_DELETED_SUBEXP.
	(create_tree): Remove final argument.  All callers adjusted.  Rewritten
	to use create_token_tree.
	(parse_reg_exp, parse_branch, parse_expression, parse_bracket_exp,
	build_charclass_op): Use create_tree or create_token_tree instead
	of re_dfa_add_tree_node.
	(parse_dup_op): Likewise.  Also free the tree using free_tree for
	"<re>{0}", and lower OP_DUP_QUESTION to OP_ALT: "a?" is equivalent
	to "a|".  Adjust invocation of mark_opt_subexp.
	(parse_sub_exp): Create a single SUBEXP node.
	* posix/regex_internal.c (re_dfa_add_node): Remove last parameter,
	always perform as if it was 1.  Do not initialize OPT_SUBEXP and
	DUPLICATED, and initialize the DFA fields representing the transitions.
	* posix/regex_internal.h (re_dfa_add_node): Adjust prototype.
	(re_token_type_t): Move OP_DUP_PLUS and OP_DUP_QUESTION to the tokens
	section.  Add a tree-only code SUBEXP.  Remove OP_DELETED_SUBEXP.
	(bin_tree_t): Include a full re_token_t for TOKEN.  Turn FIRST and
	NEXT into pointers to trees.  Remove ECLOSURE.

2004-12-28  Paolo Bonzini  <bonzini@gnu.org>

	[BZ #605]
	* posix/regcomp.c (parse_bracket_exp): Do not modify DFA nodes
	that were already created.
	* posix/regex_internal.c (re_dfa_add_node): Set accept_mb field
	in the token if needed.
	(create_ci_newstate, create_cd_newstate): Set accept_mb field
	from the tokens' field.
	* posix/regex_internal.h (re_token_t): Add accept_mb field.
	(ACCEPT_MB_NODE): Removed.
	* posix/regexec.c (proceed_next_node, transit_state_mb,
	build_sifted_states, check_arrival_add_next_nodes): Use
	accept_mb instead of ACCEPT_MB_NODE.
2005-01-26 22:42:49 +00:00
Ulrich Drepper ab4b89fe8a Update.
2004-04-27  Paolo Bonzini  <bonzini@gnu.org>

	* posix/regex_internal.h (struct re_dfastate_t): Make
	word_trtable a pointer to the 512-item transition table.
	* posix/regexec.c (build_trtable): Fill in either state->trtable
	or state->word_trtable.  Return a boolean indicating success.
	(transit_state): Expect state->trtable to be a 256-item
	transition table.  Reorganize code to have fewer tests in
	the common case, and to save an indentation level.
2004-12-27 16:44:39 +00:00
Ulrich Drepper a334319f65 (CFLAGS-tst-align.c): Add -mpreferred-stack-boundary=4. 2004-12-22 20:10:10 +00:00
Jakub Jelinek 0ecb606cb6 2.5-18.1 2007-07-12 18:26:36 +00:00
Ulrich Drepper 5cf1ec5256 Update.
2004-12-07  Paolo Bonzini  <bonzini@gnu.org>

	* posix/regexec.c (proceed_next_node): Simplify treatment of epsilon
	nodes.  Pass the pushed node to push_fail_stack.
	(push_fail_stack): Accept a single node rather than an array
	of two epsilon destinations.
	(build_sifted_states): Only walk non-epsilon nodes.
	(check_arrival): Don't pass epsilon nodes to
	check_arrival_add_next_nodes.
	(check_arrival_add_next_nodes) [DEBUG]: Abort if an epsilon node is
	found.
	(check_node_accept): Do expensive checks later.
	(add_epsilon_src_nodes): Cache result of merging the inveclosures.
	* posix/regex_internal.h (re_dfastate_t): Add non_eps_nodes and
	inveclosure.
	(re_string_elem_size_at, re_string_char_size_at, re_string_wchar_at,
	re_string_context_at, re_string_peek_byte_case,
	re_string_fetch_byte_case, re_node_set_compare, re_node_set_contains):
	Declare as pure.
	* posix/regex_internal.c (create_newstate_common): Remove.
	(register_state): Move part of it here.  Initialize non_eps_nodes.
	(free_state): Free inveclosure and non_eps_nodes.
	(create_cd_newstate, create_ci_newstate): Allocate the new
	re_dfastate_t here.
2004-12-10 04:37:58 +00:00
Ulrich Drepper d8f73de86a Update.
2004-12-01  Paolo Bonzini  <bonzini@gnu.org>

	* posix/regcomp.c (free_dfa_content, init_dfa): Remove
	references to re_dfa_t's subexps field.
	(parse_sub_exp, parse_expression): Do not use it.  Use
	completed_bkref_map instead.
	(create_initial_state, peek_token): Store a backreference \N
	with opr.idx = N-1.
	* posix/regexec.c (proceed_next_node, check_dst_limits, get_subexp):
	Likewise.
	(check_subexp_limits): Remove useless condition.
	* posix/regex_internal.h (re_subexp_t): Remove.
	(re_dfa_t): Remove subexps and subexps_alloc field, add
	completed_bkref_map.
2004-12-06 03:03:01 +00:00
Ulrich Drepper c06a6956a4 [BZ #544]
Update.
2004-11-18  Jakub Jelinek  <jakub@redhat.com>

	[BZ #544]
	* posix/regex.h (RE_NO_SUB): New define.
	* posix/regex_internal.h (OP_DELETED_SUBEXP): New.
	(re_dfa_t): Add subexp_map.
	* posix/regcomp.c (struct subexp_optimize): New type.
	(optimize_subexps): New routine.
	(re_compile_internal): Call it.
	(re_compile_pattern): Set preg->no_sub to 1 if RE_NO_SUB.
	(free_dfa_content): Free subexp_map.
	(calc_inveclosure, calc_eclosure): Skip OP_DELETED_SUBEXP
	nodes.
	* posix/regexec.c (re_search_internal): If subexp_map
	is not NULL, duplicate registers as needed.
	* posix/Makefile: Add rules to build and run tst-regex2.
	* posix/tst-regex2.c: New test.
	* posix/rxspencer/tests: Fix last two tests (\0 -> \1).
	Add some new tests for nested subexpressions.
2004-11-18 23:57:34 +00:00
Ulrich Drepper 7db612081a Update.
2004-11-12  Ulrich Drepper  <drepper@redhat.com>

	* posix/Makefile (tests): Add bug-regex24.
	* posix/bug-regex24.c: New file.

2004-11-12  Paolo Bonzini  <bonzini@gnu.org>

	* posix/regexec.c (check_dst_limits_calc_pos_1): Use the map to
	cut recursive paths.  Make exit condition more precise.
	(match_ctx_add_entry): Initialize the map.
	* posix/regex_internal.h (struct re_backref_cache_entry): Add a map of
	reachable subexpression nodes from each backreference cache entry.
2004-11-12 09:45:05 +00:00
Ulrich Drepper cb265fec1b Update.
* posix/regexec.c (match_ctx_free_subtops): Remove, merge into...
	(match_ctx_clean): ... this function.
	(match_ctx_free): Call match_ctx_clean.
2004-11-10 18:51:26 +00:00
Ulrich Drepper bb677c9581 Update.
2004-11-09  Paolo Bonzini  <bonzini@gnu.org>

	* posix/regexec.c (transit_state): Remove the check for
	out-of-bounds buffers.
	(check_matching): Check here for out-of-bounds buffers.
	(re_search_internal): Store into match_kind a set of bits
	indicating which incantation of fastmap scanning must be
	used.  Use a switch statement instead of multiple ifs.
	Exit the final "for (;;)" with goto free_return unless
	the match succeeded, thus simplifying some conditionals.

	* posix/regex_internal.c (re_string_reconstruct,
	re_string_context_at): Add several branch predictions for
	case-sensitive matching and no transition table being used.

2004-11-10  Ulrich Drepper  <drepper@redhat.com>

	* posix/tst-waitid.c: Don't use error to print error message, they
	won't end up in the .out file.

	* nscd/nscd_getgr_r.c: Likewise.  Make map externally visible.
	* nscd/nscd_gethst_r.c: Likewise.
2004-11-10 15:48:06 +00:00
Ulrich Drepper e40a38b383 Update.
2004-11-08  Ulrich Drepper  <drepper@redhat.com>

	* posix/regcomp.c (utf8_sb_map): Define.
	(free_dfa_content): Don't free dfa->sb_char if it's a pointer to
	utf8_sb_map.
	(init_dfa): Use utf8_sb_map instead of initializing memory when the
	encoding is UTF-8.

	* posix/regcomp.c (init_dfa): Get the codeset name outside glibc as
	well.  Check if it is spelled UTF8 as well as UTF-8, and check
	case-insensitively.  Set dfa->map_notascii manually when outside
	glibc.
	* posix/regex_internal.c (build_wcs_upper_buffer) [!_LIBC]: Enable
	optimizations based on map_notascii.
	* posix/regex_internal.h [HAVE_LANGINFO_H || HAVE_LANGINFO_CODESET
	|| _LIBC]: Include langinfo.h.

	* posix/regex_internal.h (struct re_backref_cache_entry): Add "more"
	field.
	* posix/regexec.c (check_dst_limits): Hoist computation of the source
	and destination bkref_idx out of the loop.  Pass it to
	check_dst_limits_calc_pos.
	(check_dst_limits_calc_pos_1): New function, containing the recursive
	loop of check_dst_limits_calc_pos; uses the "more" field of
	struct re_backref_cache to control the loop.
	(check_dst_limits_calc_pos): Store into "boundaries" the position
	relative to lim's start and end positions.  Do not accept eclosures,
	accept bkref_idx instead.  Call check_dst_limits_calc_pos_1 to do the
	work.
	(sift_states_bkref): Use the "more" field of struct re_backref_cache
	to control the loop.  A big "if" was turned into a continue and the
	function was reindented.
	(get_subexp): Use the "more" field of struct re_backref_cache
	to control the loop.
	(match_ctx_add_entry): Initialize the bkref_ents' "more" field.
	(search_cur_bkref_entry): Return -1 if out of bounds.

	* posix/regexec.c (empty_set): Remove.
	(sift_states_backward): Remove cur_src variable.  Move inner loop
	to build_sifted_states.
	(build_sifted_states): Extract from sift_states_backward.  Do not
	use empty_set.
	(update_cur_sifted_state): Do not use empty_set.  Special case
	dest_nodes->nelem == 0.
2004-11-08 22:49:44 +00:00
Ulrich Drepper d2c38eb3fa Update.
2004-11-03  Paolo Bonzini  <bonzini@gnu.org>

	* posix/regex_internal.h (struct re_backref_cache_entry): Remove flag
	field.
	(struct re_sift_context_t): Remove cur_bkref, cls_subexp_idx,
	check_subexp fields.  Move limits last.
	* posix/regexec.c (match_ctx_clear_flag): Remove.
	(sift_ctx_init): Remove check_subexp parameter.  Do not set removed
	fields.  Callers adjusted.
	(expand_bkref_cache): Remove last_str parameter.  Callers adjusted.
	(re_search_internal): Remove fast_translate variable.
	(update_cur_sifted_state): Pass candidates as the final parameter
	to sift_states_bkref.
	(sift_states_bkref): Change last unused parameter to be "candidates",
	do not fetch candidates into a local variable.
	Remove dead test for "node == sctx->bkref", and the cur_bkref_idx
	variable.
	Remove loops that set/reset the flag field of backref cache entries.
	(check_arrival_add_next_nodes): Use a signed int to hold the return
	value of re_node_set_insert.
	(group_nodes_into_DFAstates): Likewise.
	(match_ctx_add_entry): Do not set the flag field of the new entry.
2004-11-08 16:07:55 +00:00
Ulrich Drepper 78678039a4 Update.
2004-03-10  Richard Henderson  <rth@redhat.com>

	* sysdeps/generic/errno.c: Disable versioning for rtld.

	* sysdeps/generic/Makefile (elf/shared): Add unwind-pe.
	* sysdeps/generic/unwind-pe.c: New file.
	* sysdeps/generic/unwind-pe.h: Only prototypes for _LIBC without
	_LIBC_DEFINITIONS.

	* posix/regexec.c: Likewise.
2004-03-10 10:04:19 +00:00
Ulrich Drepper 3f2fb22342 [BZ #16]
Update.
2004-03-09  Ulrich Drepper  <drepper@redhat.com>

	* stdlib/qsort.c (_quicksort): Initialize first stack element [BZ #16].

2004-03-05  Jakub Jelinek  <jakub@redhat.com>

	* posix/regexec.c (regexec): Return with error on unknown eflags.
	Replace weak_alias with versioned_symbol.
	(__compat_regexec): New.
	* posix/Versions (libc): Add regexec@GLIBC_2.3.4.
2004-03-10 06:46:51 +00:00
Ulrich Drepper 58845a7030 Update.
* include/wctype.h: Add libc_hidden_proto for __towctrans.
	* wctype/towctrans.c: Add libc_hidden_def.

	* libio/memstream.c (open_memstream): Use _IO_init with INTUSE.

	* posix/regexec.c (transit_state): Remove unused variable
	next_state.

	* posix/regcomp.c (init_dfa): Use __btowc instead of btowc.
2004-03-05 10:54:16 +00:00
Ulrich Drepper 6fefb4e0b1 Update.
2004-01-15  Paolo Bonzini  <bonzini@gnu.org>

	* posix/regex.h (REG_STARTEND): Define.
	* posix/regexec.c (regexec): Check for REG_STARTEND.
2004-03-04 23:37:01 +00:00
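The entry above adds REG_STARTEND, an extension beyond POSIX that lets the caller delimit the searched part of the string through pmatch[0]. A minimal usage sketch follows, assuming a C library that defines REG_STARTEND (glibc does); `search_range` is a hypothetical wrapper written for illustration.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: search bytes START..END-1 of TEXT for the
   extended regex PATTERN.  On success *M holds the match bounds.
   Returns 0 on a match, otherwise the regcomp or regexec error code.  */
int
search_range (const char *pattern, const char *text,
	      regoff_t start, regoff_t end, regmatch_t *m)
{
  regex_t re;
  int rc;

  assert (start >= 0 && start <= end);

  rc = regcomp (&re, pattern, REG_EXTENDED);
  if (rc != 0)
    return rc;

  /* With REG_STARTEND, regexec reads the range from pmatch[0] on entry.  */
  m->rm_so = start;
  m->rm_eo = end;
  rc = regexec (&re, text, 1, m, REG_STARTEND);
  regfree (&re);
  return rc;
}
```

This is useful for searching inside a buffer that is not NUL-terminated at the intended end, or for resuming a search partway through a string without copying it.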
Ulrich Drepper 4c595adb60 Update.
2004-02-29  Paolo Bonzini  <bonzini@gnu.org>

	* posix/regexec.c (transit_state): Don't handle state == NULL.
	Move state log and backreference management...
	(merge_state_with_log): ... to this function.
	(find_recover_state): New function.
	(check_matching): Use find_recover_state to get a non-NULL
	state when an invalid state is reached.  Compute the amount
	of initial characters to be skipped less conservatively when
	multi-byte character sets are in use.  Do not check
	dfa->nbackref if the state log is NULL.  Initialize err.
	(acquire_init_state_context): Expect err to be initialized.
	Fix spacing.

2004-03-05  Jakub Jelinek  <jakub@redhat.com>

	* sysdeps/sparc/sparc32/elf/start.S: Handle PIEs.
	* sysdeps/sparc/sparc64/elf/start.S: Likewise.
2004-03-04 23:28:06 +00:00
Ulrich Drepper bf14fb7c60 [BZ #6]
Update.
	* posix/regexec.c (transit_state): Fix typo in commented-out code
	[BZ #6].
2004-02-16 19:27:53 +00:00