147 lines
3.1 KiB
HTML
147 lines
3.1 KiB
HTML
<HTML>
|
|
<HEAD>
|
|
<H1>
|
|
Notes on the ctype implementation.
|
|
</H1>
|
|
</HEAD>
|
|
<I>
|
|
prepared by Benjamin Kosnik (bkoz@redhat.com) on August 30, 2000
|
|
</I>
|
|
|
|
<P>
|
|
<H2>
|
|
1. Abstract
|
|
</H2>
|
|
<P>
|
|
Woe is me.
|
|
</P>
|
|
|
|
<P>
|
|
<H2>
|
|
2. What the standard says
|
|
</H2>
|
|
|
|
|
|
<P>
|
|
<H2>
|
|
3. Problems with "C" ctype : global locales, termination.
|
|
</H2>
|
|
|
|
<P>
|
|
For the required specialization codecvt<wchar_t, char, mbstate_t> ,
|
|
conversions are made between the internal character set (always UCS4
|
|
on GNU/Linux) and whatever the currently selected locale for the
|
|
LC_CTYPE category implements.
|
|
|
|
<P>
|
|
<H2>
|
|
4. Design
|
|
</H2>
|
|
The two required specializations are implemented as follows:
|
|
|
|
<P>
|
|
<TT>
|
|
ctype<char>
|
|
</TT>
|
|
<P>
|
|
This is simple specialization. Implementing this was a piece of cake.
|
|
|
|
<P>
|
|
<TT>
|
|
ctype<wchar_t>
|
|
</TT>
|
|
<P>
|
|
This specialization, by specifying all the template parameters, pretty
|
|
much ties the hands of implementors. As such, the implementation is
|
|
straightforward, involving mcsrtombs for the conversions between char
|
|
to wchar_t and wcsrtombs for conversions between wchar_t and char.
|
|
|
|
<P>
|
|
Neither of these two required specializations deals with Unicode
|
|
characters. As such, libstdc++-v3 implements
|
|
|
|
|
|
|
|
<P>
|
|
<H2>
|
|
5. Examples
|
|
</H2>
|
|
|
|
<pre>
|
|
typedef ctype<char> cctype;
|
|
</pre>
|
|
|
|
More information can be found in the following testcases:
|
|
<UL>
|
|
<LI> testsuite/22_locale/ctype_char_members.cc
|
|
<LI> testsuite/22_locale/ctype_wchar_t_members.cc
|
|
</UL>
|
|
|
|
<P>
|
|
<H2>
|
|
6. Unresolved Issues
|
|
</H2>
|
|
|
|
<UL>
|
|
<LI> how to deal with the global locale issue?
|
|
|
|
<LI> how to deal with different types than char, wchar_t?
|
|
|
|
<LI> codecvt/ctype overlap: narrow/widen
|
|
|
|
<LI> mask typedef in codecvt_base, argument types in codecvt.
|
|
what is know about this type?
|
|
|
|
<LI> why mask* argument in codecvt?
|
|
|
|
<LI> can this be made (more) generic? is there a simple way to
|
|
straighten out the configure-time mess that is a by-product of
|
|
this class?
|
|
|
|
<LI> get the ctype<wchar_t>::mask stuff under control. Need to
|
|
make some kind of static table, and not do lookup evertime
|
|
somebody hits the do_is... functions. Too bad we can't just
|
|
redefine mask for ctype<wchar_t>
|
|
|
|
<LI> rename abstract base class. See if just smash-overriding
|
|
is a better approach. Clarify, add sanity to naming.
|
|
|
|
</UL>
|
|
|
|
|
|
<P>
|
|
<H2>
|
|
7. Acknowledgments
|
|
</H2>
|
|
Ulrich Drepper for patient answering of late-night questions, skeletal
|
|
examples, and C language expertise.
|
|
|
|
<P>
|
|
<H2>
|
|
8. Bibliography / Referenced Documents
|
|
</H2>
|
|
|
|
Drepper, Ulrich, GNU libc (glibc) 2.2 manual. In particular, Chapters "6. Character Set Handling" and "7 Locales and Internationalization"
|
|
|
|
<P>
|
|
Drepper, Ulrich, Numerous, late-night email correspondence
|
|
|
|
<P>
|
|
ISO/IEC 14882:1998 Programming languages - C++
|
|
|
|
<P>
|
|
ISO/IEC 9899:1999 Programming languages - C
|
|
|
|
<P>
|
|
Langer, Angelika and Klaus Kreft, Standard C++ IOStreams and Locales, Advanced Programmer's Guide and Reference, Addison Wesley Longman, Inc. 2000
|
|
|
|
<P>
|
|
Stroustrup, Bjarne, Appendix D, The C++ Programming Language, Special Edition, Addison Wesley, Inc. 2000
|
|
|
|
<P>
|
|
System Interface Definitions, Issue 6 (IEEE Std. 1003.1-200x)
|
|
The Open Group/The Institute of Electrical and Electronics Engineers, Inc.
|
|
http://www.opennc.org/austin/docreg.html
|
|
|
|
|