regcomp brings in references to wcscoll, which isn't in all the
standards that contain regcomp. In turn, wcscoll brings in references
to wcscmp, also not in all those standards. This patch fixes this by
making those functions into weak aliases of __wcscoll and __wcscmp and
calling those names instead as needed.
Tested for x86_64 and x86 (testsuite, and that disassembly of
installed shared libraries is unchanged by the patch).
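The renaming scheme can be sketched outside glibc as follows. This is a minimal illustration, assuming GCC or Clang on an ELF target; my_wcscmp_impl, my_wcscmp and internal_compare are hypothetical stand-ins (for __wcscmp, wcscmp and a caller such as regcomp), and glibc itself uses its weak_alias macro rather than the raw attribute shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical internal name, playing the role of __wcscmp.  */
int
my_wcscmp_impl (const wchar_t *s1, const wchar_t *s2)
{
  assert (s1 != NULL && s2 != NULL);
  while (*s1 != L'\0' && *s1 == *s2)
    {
      ++s1;
      ++s2;
    }
  return (*s1 > *s2) - (*s1 < *s2);
}

/* Hypothetical public name, playing the role of wcscmp.  Because the
   alias is weak, a program that defines its own my_wcscmp overrides
   it without disturbing internal callers.  */
extern __typeof__ (my_wcscmp_impl) my_wcscmp
  __attribute__ ((weak, alias ("my_wcscmp_impl")));

/* Internal code calls the implementation name only, so it never
   drags the public name into the link.  */
int
internal_compare (const wchar_t *a, const wchar_t *b)
{
  return my_wcscmp_impl (a, b);
}
```

Both names resolve to the same code unless the application supplies its own definition of the public one.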
[BZ #18497]
* wcsmbs/wcscmp.c [!WCSCMP] (WCSCMP): Define as __wcscmp instead
of wcscmp.
(wcscmp): Define as weak alias of WCSCMP.
* wcsmbs/wcscoll.c (STRCOLL): Define as __wcscoll instead of
wcscoll.
(USE_HIDDEN_DEF): Define.
[!USE_IN_EXTENDED_LOCALE_MODEL] (wcscoll): Define as weak alias of
__wcscoll. Don't use libc_hidden_weak.
* wcsmbs/wcscoll_l.c (STRCMP): Define as __wcscmp instead of
wcscmp.
* sysdeps/i386/i686/multiarch/wcscmp-c.c
[SHARED] (libc_hidden_def): Define __GI___wcscmp instead of
__GI_wcscmp.
(weak_alias): Undefine and redefine.
* sysdeps/i386/i686/multiarch/wcscmp.S (wcscmp): Rename to
__wcscmp and define as weak alias of __wcscmp.
* sysdeps/x86_64/wcscmp.S (wcscmp): Likewise.
* include/wchar.h (__wcscmp): Declare. Use libc_hidden_proto.
(__wcscoll): Likewise.
(wcscmp): Don't use libc_hidden_proto.
(wcscoll): Likewise.
* posix/regcomp.c (build_range_exp): Call __wcscoll instead of
wcscoll.
* posix/regexec.c (check_node_accept_bytes): Likewise.
* conform/Makefile (test-xfail-XPG3/regex.h/linknamespace): Remove
variable.
(test-xfail-XPG4/regex.h/linknamespace): Likewise.
(test-xfail-POSIX/regex.h/linknamespace): Likewise.
* posix/regexec.c (prune_impossible_nodes): Handle sifted_states[0]
being NULL also if there are no backreferences.
* posix/rxspencer/tests: Add testcases.
check_arrival_add_next_nodes): Avoid using uninitialized variable.
* malloc/memusage.c (dest): Fix a bunch of warnings on 32-bit arches.
* sysdeps/i386/fpu/libm-test-ulps: Update for GCC 4.0.x.
2005-09-06 Paul Eggert <eggert@cs.ucla.edu>
Ulrich Drepper <drepper@redhat.com>
[BZ #1302]
Change bitset word type from unsigned int to unsigned long int,
as this has better performance on typical 64-bit hosts. Change
bitset type name to bitset_t.
* posix/regcomp.c (build_equiv_class, build_charclass):
(build_range_exp, build_collating_symbol):
Prefer bitset_t to re_bitset_ptr_t in prototypes, when the actual
argument is a bitset. This is merely a style issue, but it makes
it clearer that an entire array is expected.
(re_compile_fastmap_iter, init_dfa, init_word_char, optimize_subexps,
lower_subexp): Adjust for new bitset_t definition.
(lower_subexp, parse_bracket_exp, built_charclass_op): Likewise.
* posix/regex_internal.h (bitset_set, bitset_clear, bitset_contain,
bitset_not, bitset_merge, bitset_set_all, bitset_mask): Likewise.
* posix/regexec.c (check_dst_limits_calc_pos_1,
check_subexp_matching_top, build_trtable, group_nodes_into_DFAstates):
Likewise.
* posix/regcomp.c (utf8_sb_map): Don't assume initializer
== 0xffffffff.
* posix/regex_internal.h (BITSET_WORD_BITS): Renamed from UINT_BITS.
All uses changed.
(BITSET_WORDS): Renamed from BITSET_UINTS. All uses changed.
(bitset_word_t): New type, replacing 'unsigned int' for bitset uses.
All uses changed.
(BITSET_WORD_MAX): New macro.
(bitset_set, bitset_clear, bitset_contain, bitset_empty,
bitset_set_all, bitset_copy): Adjust for bitset_t change.
(bitset_empty, bitset_copy):
Prefer sizeof (bitset_t) to multiplying it out ourselves.
(bitset_not_merge): Remove; unused.
(bitset_contain): Return bool, not unsigned int with one bit on.
All callers changed.
* posix/regexec.c (build_trtable): Don't assume bitset_t has no
stricter alignment than re_node_set; do this by defining a new
internal type struct dests_alloc and using it to allocate memory.
(get_subexp): Likewise.
(check_arrival): Likewise.
(check_arrival_expand_ecl): Mark DFA parameter as const.
(check_arrival_expand_ecl_sub): Likewise.
(check_arrival_expand_ecl): Mark eclosure as const.
mbrtowc for very simple UTF-8 case.
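The bitset changes in the entry above can be illustrated with a small sketch. The type and macro names bitset_word_t, bitset_t, BITSET_WORD_BITS and BITSET_WORDS come from the entry; SBC_MAX and all function bodies are assumptions written for illustration, not the code in regex_internal.h.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <string.h>

/* Assumption: one bit per possible single-byte character.  */
#define SBC_MAX 256u

typedef unsigned long int bitset_word_t;
#define BITSET_WORD_BITS (sizeof (bitset_word_t) * CHAR_BIT)
#define BITSET_WORDS ((SBC_MAX + BITSET_WORD_BITS - 1) / BITSET_WORD_BITS)
typedef bitset_word_t bitset_t[BITSET_WORDS];

static inline void
bitset_empty (bitset_t set)
{
  /* sizeof (bitset_t) rather than multiplying the size out by hand.  */
  memset (set, 0, sizeof (bitset_t));
}

static inline void
bitset_set (bitset_t set, unsigned int i)
{
  assert (i < SBC_MAX);
  set[i / BITSET_WORD_BITS] |= (bitset_word_t) 1 << (i % BITSET_WORD_BITS);
}

static inline void
bitset_clear (bitset_t set, unsigned int i)
{
  assert (i < SBC_MAX);
  set[i / BITSET_WORD_BITS] &= ~((bitset_word_t) 1 << (i % BITSET_WORD_BITS));
}

/* Returns bool, not a word with one bit on.  */
static inline bool
bitset_contain (const bitset_t set, unsigned int i)
{
  assert (i < SBC_MAX);
  return (set[i / BITSET_WORD_BITS] >> (i % BITSET_WORD_BITS)) & 1;
}
```

With a 64-bit unsigned long the set occupies four words instead of eight, which is the performance point the entry makes.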
2005-09-01 Paul Eggert <eggert@cs.ucla.edu>
* posix/regex_internal.c (build_wcs_upper_buffer): Fix portability
bugs in int versus size_t comparisons.
2005-09-06 Ulrich Drepper <drepper@redhat.com>
* posix/regex_internal.c (re_acquire_state): Make DFA pointer arg
a pointer-to-const.
(re_acquire_state_context): Likewise.
* posix/regex_internal.h: Adjust prototypes.
2005-08-31 Jim Meyering <jim@meyering.net>
* posix/regcomp.c (search_duplicated_node): Make first pointer arg
a pointer-to-const.
* posix/regex_internal.c (create_ci_newstate, create_cd_newstate,
register_state): Likewise.
* posix/regexec.c (search_cur_bkref_entry, check_dst_limits):
(check_dst_limits_calc_pos_1, check_dst_limits_calc_pos):
(group_nodes_into_DFAstates): Likewise.
* posix/regexec.c (re_search_internal): Simplify update of
rm_so and rm_eo by replacing "if (A == B) A += C - B;"
with the equivalent of "if (A == B) A = C;".
2005-09-06 Ulrich Drepper <drepper@redhat.com>
* posix/regcomp.c (re_compile_internal): Change third parameter type
to size_t.
(init_dfa): Likewise. Make sure that arithmetic on pat_len doesn't
overflow.
* posix/regex_internal.h (struct re_dfa_t): Change type of nodes_alloc
and nodes_len to size_t.
* posix/regex_internal.c (re_dfa_add_node): Use size_t as type for
new_nodes_alloc. Check for overflow.
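The overflow check mentioned for re_dfa_add_node can be sketched as below. This is an assumed illustration only: grow_array is a hypothetical helper, not a glibc function, and the doubling policy is chosen for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: double the capacity *ALLOC of ARRAY, whose
   elements are ELEM_SIZE bytes each.  Returns the new block, or NULL
   (leaving ARRAY and *ALLOC untouched) if the new byte count would
   overflow size_t or the allocation fails.  */
void *
grow_array (void *array, size_t *alloc, size_t elem_size)
{
  assert (alloc != NULL && elem_size != 0);

  /* 2 * *alloc * elem_size must fit in size_t.  */
  if (*alloc > SIZE_MAX / 2 / elem_size)
    return NULL;

  size_t new_alloc = *alloc != 0 ? *alloc * 2 : 1;
  void *new_array = realloc (array, new_alloc * elem_size);
  if (new_array == NULL)
    return NULL;

  *alloc = new_alloc;
  return new_array;
}
```

Doing the comparison with a division before multiplying is what keeps the check itself from overflowing.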
2005-08-31 Paul Eggert <eggert@cs.ucla.edu>
* posix/regcomp.c (re_compile_fastmap_iter, init_dfa, init_word_char):
(optimize_subexps, lower_subexp):
Don't assume 1<<31 has defined behavior on hosts with 32-bit int,
since the signed shift might overflow. Use 1u<<31 instead.
* posix/regex_internal.h (bitset_set, bitset_clear, bitset_contain):
Likewise.
* posix/regexec.c (check_dst_limits_calc_pos_1): Likewise.
(check_subexp_matching_top): Likewise.
* posix/regcomp.c (optimize_subexps, lower_subexp):
Use CHAR_BIT rather than 8, for clarity.
* posix/regexec.c (check_dst_limits_calc_pos_1):
(check_subexp_matching_top): Likewise.
* posix/regcomp.c (init_dfa): Make table_size unsigned, so that we
don't have to worry about portability issues when shifting it left.
Remove no-longer-needed test for table_size > 0.
* posix/regcomp.c (parse_sub_exp): Do not shift more bits than there
are in a word, as the resulting behavior is undefined.
* posix/regexec.c (check_dst_limits_calc_pos_1): Likewise;
in one case, a <= should have been an <, and in another case the
whole test was missing.
* posix/regex_internal.h (BYTE_BITS): Remove. All uses changed to
the standard name CHAR_BIT.
next_last_offset.
(struct re_dfa_t): Remove unused member states_alloc.
* posix/regcomp.c (init_dfa): Don't initialize unused members.
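The shift fixes in the entry above follow a common pattern, sketched here under stated assumptions. bit_mask, bit_is_set and UINT_WIDTH_BITS are hypothetical names for the example and do not appear in the regex sources.

```c
#include <assert.h>
#include <limits.h>

/* Number of bits in an unsigned int, using CHAR_BIT rather than 8.  */
#define UINT_WIDTH_BITS (sizeof (unsigned int) * CHAR_BIT)

/* Return a word with only bit IDX set.  The operand is 1u, not 1:
   with a 32-bit int, 1 << 31 overflows a signed type and is
   undefined, while 1u << 31 is well defined.  */
unsigned int
bit_mask (unsigned int idx)
{
  assert (idx < UINT_WIDTH_BITS);
  return 1u << idx;
}

/* Test bit IDX of MAP.  An index at or beyond the word width is
   reported as unset instead of being shifted, since shifting by the
   full width or more is undefined.  Note the strict <, not <=.  */
int
bit_is_set (unsigned int map, unsigned int idx)
{
  return idx < UINT_WIDTH_BITS && (map & (1u << idx)) != 0;
}
```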
2005-08-25 Paul Eggert <eggert@cs.ucla.edu>
* posix/regexec.c (set_regs): Don't alloca with an unbounded size.
alloca modernization/simplification for regex.
* posix/regex.c: Remove portability cruft for alloca. This no longer
needs to be at the start of the file, and can be moved into
regex_internal.h and simplified.
* posix/regex_internal.h: Include <alloca.h>.
(__libc_use_alloca) [!defined _LIBC]: New macro.
* posix/regexec.c (build_trtable): Remove "#ifdef _LIBC",
since the code now works outside glibc.
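The bounded-alloca idea in this entry can be sketched as follows. This is an illustration only, assuming a system that provides <alloca.h>; MAX_STACK_BYTES and sum_copy are hypothetical, and glibc's own __libc_use_alloca is not reproduced here.

```c
#include <alloca.h>
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical threshold above which the heap is used instead.  */
#define MAX_STACK_BYTES 4096

/* Copy N ints from SRC into scratch storage and store their sum in
   *OUT.  Small requests use alloca; large ones fall back to malloc,
   so stack use stays bounded.  Returns false on overflow or
   allocation failure.  */
bool
sum_copy (const int *src, size_t n, long *out)
{
  assert (src != NULL && out != NULL);
  if (n > SIZE_MAX / sizeof (int))
    return false;

  size_t bytes = n * sizeof (int);
  bool use_stack = bytes <= MAX_STACK_BYTES;
  int *tmp;
  if (use_stack)
    tmp = alloca (bytes);
  else
    {
      tmp = malloc (bytes);
      if (tmp == NULL)
        return false;
    }

  memcpy (tmp, src, bytes);
  long sum = 0;
  for (size_t i = 0; i < n; ++i)
    sum += tmp[i];

  /* Only heap storage needs releasing; alloca storage goes away
     when the function returns.  */
  if (!use_stack)
    free (tmp);
  *out = sum;
  return true;
}
```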
2005-09-06 Ulrich Drepper <drepper@redhat.com>
* include/regex.h: Remove use of _RE_ARGS.
2005-08-25 Paul Eggert <eggert@cs.ucla.edu>
* posix/regexec.c (find_recover_state): Change "err" to "*err".
2005-08-24 Paul Eggert <eggert@cs.ucla.edu>
* posix/regcomp.c (regerror): Pointer args are 'restrict',
as per POSIX.
* posix/regex.h (regerror): Likewise.
* manual/pattern.texi (POSIX Regexp Compilation): Likewise.
Similarly for regcomp and regexec. Also, first 2 args of regexec
and 2nd arg of regerror are const.
* posix/regex.c: Do not include <sys/types.h>, as POSIX no longer
requires this. (The code never needed it.)
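For reference, a caller of the three interfaces touched by this entry might look like the sketch below. regex_matches is a hypothetical wrapper written for the example; only regcomp, regexec, regerror and regfree are the POSIX functions. Note that <sys/types.h> is not included.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical wrapper: return 1 if TEXT matches the extended regex
   PATTERN, 0 if it does not, and -1 if PATTERN fails to compile, in
   which case ERRBUF receives the message from regerror.  */
int
regex_matches (const char *pattern, const char *text,
               char *errbuf, size_t errbuf_size)
{
  assert (pattern != NULL && text != NULL);

  regex_t re;
  int rc = regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB);
  if (rc != 0)
    {
      regerror (rc, &re, errbuf, errbuf_size);
      return -1;
    }

  rc = regexec (&re, text, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}
```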
2005-08-20 Paul Eggert <eggert@cs.ucla.edu>
* posix/regexec.c (sift_states_bkref): re_node_set_insert returns
int, not reg_errcode_t.
* posix/regex_internal.c (calc_state_hash): Put 'inline' before type,
since some broken compilers warn about it otherwise.
* posix/regcomp.c (create_initial_state): Remove duplicate decl.
2005-08-20 Paul Eggert <eggert@cs.ucla.edu>
* posix/regex.h (_RE_ARGS): Remove. No longer needed, since we assume
C89 or better. All uses removed.
2005-09-06 Ulrich Drepper <drepper@redhat.com>
* posix/regex.c: Prevent using C++ compilers.
2005-08-19 Paul Eggert <eggert@cs.ucla.edu>
* posix/regcomp.c (duplicate_node): Return new index, not an error
code, and let the caller return REG_ESPACE if out of space. This
removes an uninitialized-variable warning with GCC 4.0.1, and also
avoids taking the address of a local variable. All callers
changed.
2005-09-06 Ulrich Drepper <drepper@redhat.com>
* include/time.h (__strptime_internal): Rename parameter to avoid
bogus compiler warning.
2005-08-19 Jim Meyering <jim@meyering.net>
* posix/regexec.c (proceed_next_node): Redo local variables to
avoid GCC shadowing warnings.
2005-09-06 Ulrich Drepper <drepper@redhat.com>
* posix/regex_internal.c (re_acquire_state): Minor code rearrangement.
(re_acquire_state_context): Likewise.
2005-08-19 Paul Eggert <eggert@cs.ucla.edu>
* posix/regex_internal.c (re_string_realloc_buffers):
(re_node_set_insert, re_node_set_insert_last, re_dfa_add_node):
Rename local variables to avoid GCC shadowing warnings.
2005-07-08 Eric Blake <ebb9@byu.net>
Paul Eggert <eggert@cs.ucla.edu>
* posix/regcomp.c (init_dfa): Store __btowc value in wint_t, not
wchar_t. Remove now-unnecessary cast.
(build_range_exp): Likewise.
Update.
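The reason for keeping the btowc result in a wint_t can be shown with a short sketch. byte_to_wide is a hypothetical helper and uses the public btowc rather than glibc's internal __btowc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: convert one byte to a wide character in the
   current locale.  The result of btowc is held in a wint_t so that
   WEOF stays distinguishable; wchar_t is not guaranteed to be able
   to represent it.  Returns false if BYTE is not a valid
   single-byte character.  */
bool
byte_to_wide (unsigned char byte, wchar_t *out)
{
  assert (out != NULL);

  wint_t wc = btowc (byte);
  if (wc == WEOF)
    return false;

  *out = (wchar_t) wc;
  return true;
}
```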
2005-01-27 Paolo Bonzini <bonzini@gnu.org>
[BZ #558]
* posix/regcomp.c (calc_inveclosure): Return reg_errcode_t.
Initialize the node sets in dfa->inveclosures.
(analyze): Initialize inveclosures only if it is needed.
Check errors from calc_inveclosure.
* posix/regex_internal.c (re_dfa_add_node): Do not initialize
the inveclosure node set.
* posix/regexec.c (re_search_internal): If nmatch includes unused
subexpressions, reset them to { rm_so: -1, rm_eo: -1 } here.
* posix/regcomp.c (parse_bracket_exp) [!RE_ENABLE_I18N]:
Do build a SIMPLE_BRACKET token.
* posix/regexec.c (transit_state_mb): Do not examine nodes
where ACCEPT_MB is not set.
Update.
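The handling of unused subexpression slots is visible through the public API. The sketch below assumes a hypothetical wrapper, run_match; the -1 offsets for unused entries are what POSIX specifies for regexec.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical wrapper: compile the extended regex PATTERN, match it
   against TEXT and fill up to NMATCH entries of PMATCH.  Returns 0 on
   a match, otherwise the regcomp or regexec error code.  */
int
run_match (const char *pattern, const char *text,
           regmatch_t *pmatch, size_t nmatch)
{
  assert (pattern != NULL && text != NULL);
  assert (pmatch != NULL || nmatch == 0);

  regex_t re;
  int rc = regcomp (&re, pattern, REG_EXTENDED);
  if (rc != 0)
    return rc;

  rc = regexec (&re, text, nmatch, pmatch, 0);
  regfree (&re);
  return rc;
}
```

Matching "(a)|(b)" against "b" with four slots leaves slot 1 (the group that did not participate) and slot 3 (beyond the last group) with both offsets set to -1.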
2004-12-13 Paolo Bonzini <bonzini@gnu.org>
Separate parsing and creation of the NFA. Avoided recursion on
the (very unbalanced) parse tree.
[BZ #611]
* posix/regcomp.c (struct subexp_optimize, analyze_tree, calc_epsdest,
re_dfa_add_tree_node, mark_opt_subexp_iter): Removed.
(optimize_subexps, duplicate_tree, calc_first, calc_next,
mark_opt_subexp): Rewritten.
(preorder, postorder, lower_subexps, lower_subexp, link_nfa_nodes,
create_token_tree, free_tree, free_token): New.
(analyze): Accept a regex_t *. Invoke the passes via the preorder and
postorder generic visitors. Do not initialize the fields in the
re_dfa_t that represent the transitions.
(free_dfa_content): Use free_token.
(re_compile_internal): Analyze before UTF-8 optimizations. Do not
include optimization of subexpressions.
(create_initial_state): Fetch the DFA node index from the first node's
bin_tree_t *.
(optimize_utf8): Abort on unexpected nodes, including OP_DUP_QUESTION.
Return on COMPLEX_BRACKET.
(duplicate_node_closure): Fix comment.
(duplicate_node): Do not initialize the fields in the
re_dfa_t that represent the transitions.
(calc_eclosure, calc_inveclosure): Do not handle OP_DELETED_SUBEXP.
(create_tree): Remove final argument. All callers adjusted. Rewritten
to use create_token_tree.
(parse_reg_exp, parse_branch, parse_expression, parse_bracket_exp,
build_charclass_op): Use create_tree or create_token_tree instead
of re_dfa_add_tree_node.
(parse_dup_op): Likewise. Also free the tree using free_tree for
"<re>{0}", and lower OP_DUP_QUESTION to OP_ALT: "a?" is equivalent
to "a|". Adjust invocation of mark_opt_subexp.
(parse_sub_exp): Create a single SUBEXP node.
* posix/regex_internal.c (re_dfa_add_node): Remove last parameter,
always perform as if it was 1. Do not initialize OPT_SUBEXP and
DUPLICATED, and initialize the DFA fields representing the transitions.
* posix/regex_internal.h (re_dfa_add_node): Adjust prototype.
(re_token_type_t): Move OP_DUP_PLUS and OP_DUP_QUESTION to the tokens
section. Add a tree-only code SUBEXP. Remove OP_DELETED_SUBEXP.
(bin_tree_t): Include a full re_token_t for TOKEN. Turn FIRST and
NEXT into pointers to trees. Remove ECLOSURE.
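A generic visitor that avoids recursion on a deep, unbalanced tree can be written with parent pointers. The sketch below is an assumed illustration of that technique, not the code in regcomp.c; struct tree_node, postorder_visit and stamp_order are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type with a parent link.  */
struct tree_node
{
  struct tree_node *parent, *left, *right;
  int visited_order;
};

typedef int (*visit_fn) (void *extra, struct tree_node *node);

/* Call FN on every node under ROOT in postorder, using parent links
   instead of recursion.  Stops at and returns the first nonzero
   value FN returns; otherwise returns 0.  */
int
postorder_visit (struct tree_node *root, visit_fn fn, void *extra)
{
  assert (root != NULL && fn != NULL);

  struct tree_node *node = root;
  for (;;)
    {
      /* Descend to a leaf, preferring the left child.  */
      while (node->left != NULL || node->right != NULL)
        node = node->left != NULL ? node->left : node->right;

      for (;;)
        {
          int err = fn (extra, node);
          if (err != 0)
            return err;
          if (node == root)
            return 0;

          struct tree_node *prev = node;
          node = node->parent;
          /* Coming up from the left with an unvisited right subtree:
             walk that subtree before visiting this node.  */
          if (node->right != prev && node->right != NULL)
            {
              node = node->right;
              break;
            }
        }
    }
}

/* Example callback: stamp each node with its visiting order.  */
int
stamp_order (void *extra, struct tree_node *node)
{
  int *counter = extra;
  node->visited_order = (*counter)++;
  return 0;
}
```

Stack use is constant no matter how deep the tree is, which matters for the long left-leaning chains a regex parser tends to build.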
2004-12-28 Paolo Bonzini <bonzini@gnu.org>
[BZ #605]
* posix/regcomp.c (parse_bracket_exp): Do not modify DFA nodes
that were already created.
* posix/regex_internal.c (re_dfa_add_node): Set accept_mb field
in the token if needed.
(create_ci_newstate, create_cd_newstate): Set accept_mb field
from the tokens' field.
* posix/regex_internal.h (re_token_t): Add accept_mb field.
(ACCEPT_MB_NODE): Removed.
* posix/regexec.c (proceed_next_node, transit_states_mb,
build_sifted_states, check_arrival_add_next_nodes): Use
accept_mb instead of ACCEPT_MB_NODE.
2004-04-27 Paolo Bonzini <bonzini@gnu.org>
* posix/regex_internal.h (struct re_dfastate_t): Make
word_trtable a pointer to the 512-item transition table.
* posix/regexec.c (build_trtable): Fill in either state->trtable
or state->word_trtable. Return a boolean indicating success.
(transit_state): Expect state->trtable to be a 256-item
transition table. Reorganize code to have fewer tests in
the common case, and to save an indentation level.
2004-12-07 Paolo Bonzini <bonzini@gnu.org>
* posix/regexec.c (proceed_next_node): Simplify treatment of epsilon
nodes. Pass the pushed node to push_fail_stack.
(push_fail_stack): Accept a single node rather than an array
of two epsilon destinations.
(build_sifted_states): Only walk non-epsilon nodes.
(check_arrival): Don't pass epsilon nodes to
check_arrival_add_next_nodes.
(check_arrival_add_next_nodes) [DEBUG]: Abort if an epsilon node is
found.
(check_node_accept): Do expensive checks later.
(add_epsilon_src_nodes): Cache result of merging the inveclosures.
* posix/regex_internal.h (re_dfastate_t): Add non_eps_nodes and
inveclosure.
(re_string_elem_size_at, re_string_char_size_at, re_string_wchar_at,
re_string_context_at, re_string_peek_byte_case,
re_string_fetch_byte_case, re_node_set_compare, re_node_set_contains):
Declare as pure.
* posix/regex_internal.c (create_newstate_common): Remove.
(register_state): Move part of it here. Initialize non_eps_nodes.
(free_state): Free inveclosure and non_eps_nodes.
(create_cd_newstate, create_ci_newstate): Allocate the new
re_dfastate_t here.
2004-12-01 Paolo Bonzini <bonzini@gnu.org>
* posix/regcomp.c (free_dfa_content, init_dfa): Remove
references to re_dfa_t's subexps field.
(parse_sub_exp, parse_expression): Do not use it. Use
completed_bkref_map instead.
(create_initial_state, peek_token): Store a backreference \N
with opr.idx = N-1.
* posix/regexec.c (proceed_next_node, check_dst_limits, get_subexp):
Likewise.
(check_subexp_limits): Remove useless condition.
* posix/regex_internal.h (re_subexp_t): Remove.
(re_dfa_t): Remove subexps and subexps_alloc field, add
completed_bkref_map.
Update.
2004-11-18 Jakub Jelinek <jakub@redhat.com>
[BZ #544]
* posix/regex.h (RE_NO_SUB): New define.
* posix/regex_internal.h (OP_DELETED_SUBEXP): New.
(re_dfa_t): Add subexp_map.
* posix/regcomp.c (struct subexp_optimize): New type.
(optimize_subexps): New routine.
(re_compile_internal): Call it.
(re_compile_pattern): Set preg->no_sub to 1 if RE_NO_SUB.
(free_dfa_content): Free subexp_map.
(calc_inveclosure, calc_eclosure): Skip OP_DELETED_SUBEXP
nodes.
* posix/regexec.c (re_search_internal): If subexp_map
is not NULL, duplicate registers as needed.
* posix/Makefile: Add rules to build and run tst-regex2.
* posix/tst-regex2.c: New test.
* posix/rxspencer/tests: Fix last two tests (\0 -> \1).
Add some new tests for nested subexpressions.
2004-11-12 Ulrich Drepper <drepper@redhat.com>
* posix/Makefile (tests): Add bug-regex24.
* posix/bug-regex24.c: New file.
2004-11-12 Paolo Bonzini <bonzini@gnu.org>
* posix/regexec.c (check_dst_limits_calc_pos_1): Use the map to
cut recursive paths. Make exit condition more precise.
(match_ctx_add_entry): Initialize the map.
* posix/regex_internal.h (struct re_backref_cache_entry): Add a map of
reachable subexpression nodes from each backreference cache entry.
2004-11-09 Paolo Bonzini <bonzini@gnu.org>
* posix/regexec.c (transit_state): Remove the check for
out-of-bounds buffers.
(check_matching): Check here for out-of-bounds buffers.
(re_search_internal): Store into match_kind a set of bits
indicating which incantation of fastmap scanning must be
used. Use a switch statement instead of multiple ifs.
Exit the final "for (;;)" with goto free_return unless
the match succeeded, thus simplifying some conditionals.
* posix/regex_internal.c (re_string_reconstruct,
re_string_context_at): Add several branch predictions for
case-sensitive matching and no transition table being used.
2004-11-10 Ulrich Drepper <drepper@redhat.com>
* posix/tst-waitid.c: Don't use error to print error messages; they
won't end up in the .out file.
* nscd/nscd_getgr_r.c: Likewise. Make map externally visible.
* nscd/nscd_gethst_r.c: Likewise.
2004-11-08 Ulrich Drepper <drepper@redhat.com>
* posix/regcomp.c (utf8_sb_map): Define.
(free_dfa_content): Don't free dfa->sb_char if it's a pointer to
utf8_sb_map.
(init_dfa): Use utf8_sb_map instead of initializing memory when the
encoding is UTF-8.
* posix/regcomp.c (init_dfa): Get the codeset name outside glibc as
well. Check if it is spelled UTF8 as well as UTF-8, and check
case-insensitively. Set dfa->map_notascii manually when outside
glibc.
* posix/regex_internal.c (build_wcs_upper_buffer) [!_LIBC]: Enable
optimizations based on map_notascii.
* posix/regex_internal.h [HAVE_LANGINFO_H || HAVE_LANGINFO_CODESET
|| _LIBC]: Include langinfo.h.
* posix/regex_internal.h (struct re_backref_cache_entry): Add "more"
field.
* posix/regexec.c (check_dst_limits): Hoist computation of the source
and destination bkref_idx out of the loop. Pass it to
check_dst_limits_calc_pos.
(check_dst_limits_calc_pos_1): New function, containing the recursive
loop of check_dst_limits_calc_pos; uses the "more" field of
struct re_backref_cache to control the loop.
(check_dst_limits_calc_pos): Store into "boundaries" the position
relative to lim's start and end positions. Do not accept eclosures,
accept bkref_idx instead. Call check_dst_limits_calc_pos_1 to do the
work.
(sift_states_bkref): Use the "more" field of struct re_backref_cache
to control the loop. A big "if" was turned into a continue and the
function was reindented.
(get_subexp): Use the "more" field of struct re_backref_cache
to control the loop.
(match_ctx_add_entry): Initialize the bkref_ents' "more" field.
(search_cur_bkref_entry): Return -1 if out of bounds.
* posix/regexec.c (empty_set): Remove.
(sift_states_backward): Remove cur_src variable. Move inner loop
to build_sifted_states.
(build_sifted_states): Extract from sift_states_backward. Do not
use empty_set.
(update_cur_sifted_state): Do not use empty_set. Special case
dest_nodes->nelem == 0.
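The codeset test described for init_dfa in this entry (accepting UTF8 as well as UTF-8, in any letter case) can be sketched as below. codeset_is_utf8 is a hypothetical helper, not the glibc code; strncasecmp is the POSIX function from <strings.h>.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <strings.h>

/* Hypothetical helper: return true if CODESET names UTF-8, accepting
   both "UTF-8" and "UTF8" regardless of letter case.  */
bool
codeset_is_utf8 (const char *codeset)
{
  assert (codeset != NULL);

  if (strncasecmp (codeset, "UTF", 3) != 0)
    return false;

  /* Skip the optional hyphen, then require exactly "8".  */
  const char *rest = codeset + 3;
  if (*rest == '-')
    ++rest;
  return rest[0] == '8' && rest[1] == '\0';
}
```

In real use the argument would typically come from nl_langinfo (CODESET).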
2004-11-03 Paolo Bonzini <bonzini@gnu.org>
* posix/regex_internal.h (struct re_backref_cache_entry): Remove flag
field.
(struct re_sift_context_t): Remove cur_bkref, cls_subexp_idx,
check_subexp fields. Move limits last.
* posix/regexec.c (match_ctx_clear_flag): Remove.
(sift_ctx_init): Remove check_subexp parameter. Do not set removed
fields. Callers adjusted.
(expand_bkref_cache): Remove last_str parameter. Callers adjusted.
(re_search_internal): Remove fast_translate variable.
(update_cur_sifted_state): Pass candidates as the final parameter
to sift_states_bkref.
(sift_states_bkref): Change last unused parameter to be "candidates",
do not fetch candidates into a local variable.
Remove dead test for "node == sctx->bkref", and the cur_bkref_idx
variable.
Remove loops that set/reset the flag field of backref cache entries.
(check_arrival_add_next_nodes): Use a signed int to hold the return
value of re_node_set_insert.
(group_nodes_into_DFAstates): Likewise.
(match_ctx_add_entry): Do not set the flag field of the new entry.
2004-03-10 Richard Henderson <rth@redhat.com>
* sysdeps/generic/errno.c: Disable versioning for rtld.
* sysdeps/generic/Makefile (elf/shared): Add unwind-pe.
* sysdeps/generic/unwind-pe.c: New file.
* sysdeps/generic/unwind-pe.h: Only prototypes for _LIBC without
_LIBC_DEFINITIONS.
* posix/regexec.c: Likewise.