<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<meta name="AUTHOR" content="pme@gcc.gnu.org (Phil Edwards)">
<meta name="KEYWORDS" content="HOWTO, libstdc++, GCC, g++, libg++, STL">
<meta name="DESCRIPTION" content="HOWTO for the libstdc++ chapter 22.">
<meta name="GENERATOR" content="vi and eight fingers">
<title>libstdc++-v3 HOWTO: Chapter 22</title>
<link rel="StyleSheet" href="../lib3styles.css">
</head>
<body>
<h1 class="centered"><a name="top">Chapter 22: Localization</a></h1>
<p>Chapter 22 deals with the C++ localization facilities.
</p>
<!-- I wanted to write that sentence in something requiring an exotic font,
like Cyrillic or Kanji. Probably more work than such cuteness is worth,
but I still think it'd be funny.
-->
<!-- ####################################################### -->
<hr>
<h1>Contents</h1>
<ul>
<li><a href="#1">class locale</a>
<li><a href="#2">class codecvt</a>
<li><a href="#3">class ctype</a>
<li><a href="#4">class messages</a>
<li><a href="#5">Bjarne Stroustrup on Locales</a>
<li><a href="#6">Nathan Myers on Locales</a>
<li><a href="#7">Correct Transformations</a>
</ul>
<!-- ####################################################### -->
<hr>
<h2><a name="1">class locale</a></h2>
<p>Notes made during the implementation of locales can be found
<a href="locale.html">here</a>.
</p>
<hr>
<h2><a name="2">class codecvt</a></h2>
<p>Notes made during the implementation of codecvt can be found
<a href="codecvt.html">here</a>.
</p>
<p>The following is the abstract from the implementation notes:
</p>
<blockquote>
The standard class codecvt attempts to address conversions between
different character encoding schemes. In particular, the standard
attempts to detail conversions between the implementation-defined
wide characters (hereafter referred to as wchar_t) and the standard
type char that is so beloved in classic &quot;C&quot; (which can
now be referred to as narrow characters). This document attempts
to describe how the GNU libstdc++-v3 implementation deals with the
conversion between wide and narrow characters, and also presents a
framework for dealing with the huge number of other encodings that
iconv can convert, including Unicode and UTF-8. Design issues and
requirements are addressed, and examples of correct usage for both
the required specializations for wide and narrow characters and the
implementation-provided extended functionality are given.
</blockquote>
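<p>As a rough illustration of the required wide/narrow specialization
   mentioned in the abstract, the following sketch widens a narrow string
   through the <code>codecvt</code> facet of a locale. The helper name
   <code>widen_with_codecvt</code> is hypothetical and is not part of
   libstdc++; only the standard <code>std::codecvt::in</code> interface is
   assumed.
</p>

```cpp
#include <cwchar>
#include <locale>
#include <string>

// Hypothetical helper (not part of the library): convert a narrow string
// to a wide string using the codecvt facet installed in loc.
std::wstring widen_with_codecvt(const std::string& narrow,
                                const std::locale& loc)
{
    typedef std::codecvt<wchar_t, char, std::mbstate_t> facet_type;
    const facet_type& cvt = std::use_facet<facet_type>(loc);

    if (narrow.empty())
        return std::wstring();

    // One wide character per narrow character is an upper bound,
    // because multibyte sequences only ever shrink when widened.
    std::wstring wide(narrow.size(), L'\0');

    std::mbstate_t state = std::mbstate_t();   // initial shift state
    const char* from_next = 0;
    wchar_t* to_next = 0;

    std::codecvt_base::result r =
        cvt.in(state,
               narrow.data(), narrow.data() + narrow.size(), from_next,
               &wide[0], &wide[0] + wide.size(), to_next);

    if (r == std::codecvt_base::error)
        return std::wstring();                 // give up on invalid input

    // Keep only the characters that were actually produced.
    wide.resize(to_next - &wide[0]);
    return wide;
}
```

<p>Returning an empty string on error is a simplification made for this
   sketch; real code would more likely report how far the conversion got
   by inspecting <code>from_next</code>.
</p>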
<hr>
<h2><a name="3">class ctype</a></h2>
<p>Notes made during the implementation of ctype can be found
<a href="ctype.html">here</a>.
</p>
<hr>
<h2><a name="4">class messages</a></h2>
<p>Notes made during the implementation of messages can be found
<a href="messages.html">here</a>.
</p>
<hr>
<h2><a name="5">Bjarne Stroustrup on Locales</a></h2>
<p>Dr. Bjarne Stroustrup has released a
<a href="http://www.research.att.com/~bs/3rd_loc0.html">pointer</a>
to Appendix D of his book,
<a href="http://www.research.att.com/~bs/3rd.html">The C++
Programming Language (3rd Edition)</a>. It is a detailed
description of locales and how to use them.
</p>
<p>He also writes:
</p>
<blockquote><em>
Please note that I still consider this detailed description of
locales beyond the needs of most C++ programmers. It is written
with experienced programmers in mind and novices will do best to
avoid it.
</em></blockquote>
<hr>
<h2><a name="6">Nathan Myers on Locales</a></h2>
<p>Nathan Myers's article &quot;The Standard C++ Locale&quot; was
published in Dr. Dobb's Journal and can be found
<a href="http://www.cantrip.org/locale.html">here</a>.
</p>
<hr>
<h2><a name="7">Correct Transformations</a></h2>
<!-- Jumping directly to here from chapter 21. -->
<p>A very common question on newsgroups and mailing lists is, &quot;How
do I do &lt;foo&gt; to a character string?&quot; where &lt;foo&gt; is
a task such as changing all the letters to uppercase, to lowercase,
testing for digits, etc. A skilled and conscientious programmer
will follow the question with another, &quot;And how do I make the
code portable?&quot;
</p>
<p>(Poor innocent programmer, you have no idea the depths of trouble
you are getting yourself into. 'Twould be best for your sanity if
you dropped the whole idea and took up basket weaving instead. No?
Fine, you asked for it...)
</p>
<p>The task of changing the case of a letter or classifying a character
as numeric, graphical, etc., depends on the cultural context of the
program at runtime. So, first you must take the portability question
into account. Only once you have localized the program to a particular
natural language can you perform the specific task.
Unfortunately, specializing a function for a human language is not
as simple as declaring
<code> extern &quot;Danish&quot; int tolower (int); </code>.
</p>
<p>The C++ code to do all this proceeds in the same way. First, a locale
is created. Then the facilities of that locale are used to
perform the individual tasks. Continuing the example from Chapter 21, we wish
to use the following convenience functions:
</p>
<pre>
   namespace std {
     template &lt;class charT&gt;
       charT
       toupper (charT c, const locale&amp; loc);

     template &lt;class charT&gt;
       charT
       tolower (charT c, const locale&amp; loc);
   }</pre>
<p>
Each of these functions extracts the appropriate &quot;facet&quot;
(here <code>std::ctype&lt;charT&gt;</code>) from the
locale <em>loc</em> and calls the corresponding member function of that
facet, passing <em>c</em> as its argument. The resulting character
is returned.
</p>
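<p>The facet lookup described above can be sketched roughly as follows.
   The name <code>toupper_via_facet</code> is hypothetical and serves only
   to illustrate the mechanism; the library's own implementation may differ
   in detail.
</p>

```cpp
#include <locale>

// Hypothetical helper: approximately what the two-argument
// std::toupper(c, loc) convenience function does.
template <class charT>
charT toupper_via_facet(charT c, const std::locale& loc)
{
    // Look up the character classification facet installed in loc...
    const std::ctype<charT>& ct = std::use_facet<std::ctype<charT> >(loc);
    // ...and let that facet perform the locale-specific conversion.
    return ct.toupper(c);
}
```

<p>A call such as <code>toupper_via_facet('a', std::locale::classic())</code>
   should give the same result as
   <code>std::toupper('a', std::locale::classic())</code>.
</p>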
<p>For the C/POSIX locale, the results are the same as calling the
classic C <code>toupper</code>/<code>tolower</code> functions that were used in previous
examples. For other locales, the code should Do The Right Thing.
</p>
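<p>The claim about the C/POSIX locale can be checked with a small sketch
   like the one below. The function name
   <code>c_locale_matches_classic_toupper</code> is hypothetical, the check
   is limited to 7-bit ASCII, and it assumes the program has not changed
   the global C locale with <code>setlocale</code>.
</p>

```cpp
#include <cctype>
#include <locale>

// Hypothetical check: does the two-argument std::toupper agree with the
// classic one-argument C function for every 7-bit ASCII character?
bool c_locale_matches_classic_toupper()
{
    const std::locale& c_loc = std::locale::classic();
    for (int i = 0; i < 128; ++i)
    {
        char c = static_cast<char>(i);
        // The C function expects an int holding an unsigned char value.
        char expected = static_cast<char>(
            std::toupper(static_cast<unsigned char>(c)));
        if (std::toupper(c, c_loc) != expected)
            return false;
    }
    return true;
}
```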
<p>Of course, these functions take a second argument, while the
operation passed to <code>std::transform</code> is called with only a single
parameter. So we write simple wrapper structs that hold the locale.
</p>
<p>The next-to-final version of the code started in Chapter 21 looks like:
</p>
<pre>
   #include &lt;iterator&gt;    // for back_inserter
   #include &lt;locale&gt;
   #include &lt;string&gt;
   #include &lt;algorithm&gt;
   #include &lt;cctype&gt;      // old &lt;ctype.h&gt;

   // Each wrapper stores a reference to a locale, which must outlive it.
   struct Toupper
   {
     Toupper(std::locale const&amp; l) : loc(l) {}
     char operator() (char c) const { return std::toupper(c,loc); }
   private:
     std::locale const&amp; loc;
   };

   struct Tolower
   {
     Tolower(std::locale const&amp; l) : loc(l) {}
     char operator() (char c) const { return std::tolower(c,loc); }
   private:
     std::locale const&amp; loc;
   };

   int main ()
   {
     std::string s("Some Kind Of Initial Input Goes Here");
     std::locale loc_c("C");
     Toupper up(loc_c);
     Tolower down(loc_c);

     // Change everything into upper case.
     std::transform(s.begin(), s.end(), s.begin(), up);

     // Change everything into lower case.
     std::transform(s.begin(), s.end(), s.begin(), down);

     // Change everything back into upper case, but store the
     // result in a different string.
     std::string capital_s;
     std::transform(s.begin(), s.end(), std::back_inserter(capital_s), up);
   }</pre>
<p>The final version of the code uses <code>bind2nd</code> to eliminate
the wrapper structs, but the resulting code is tricky. I have not
shown it here because no compilers currently available to me will
handle it.
</p>
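<p>The wrapper structs can also be avoided without <code>bind2nd</code>,
   given a compiler that supports C++11 lambda expressions. That is an
   assumption well beyond the compilers this chapter was written for, and
   it is a different technique from the <code>bind2nd</code> version
   mentioned above. The helper name <code>to_upper_copy</code> is
   hypothetical.
</p>

```cpp
#include <algorithm>
#include <iterator>   // for back_inserter
#include <locale>
#include <string>

// Hypothetical helper: return an upper-case copy of s according to loc.
std::string to_upper_copy(const std::string& s, const std::locale& loc)
{
    std::string result;
    result.reserve(s.size());
    // The lambda captures the locale by reference and so plays the role
    // of the Toupper wrapper struct.
    std::transform(s.begin(), s.end(), std::back_inserter(result),
                   [&loc](char c) { return std::toupper(c, loc); });
    return result;
}
```

<p>The lambda holds a reference to the locale, so the locale must stay
   alive for the duration of the call, just as with the wrapper structs.
</p>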
<!-- ####################################################### -->
<hr>
<p class="fineprint"><em>
See <a href="../17_intro/license.html">license.html</a> for copying conditions.
Comments and suggestions are welcome, and may be sent to
<a href="mailto:libstdc++@gcc.gnu.org">the libstdc++ mailing list</a>.
</em></p>
</body>
</html>