<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"
"http://www.w3.org/TR/REC-html40/loose.dtd">
<html>

<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta http-equiv="Content-Language" content="en-us">
<meta name="GENERATOR" content="Microsoft FrontPage 4.0">
<meta name="ProgId" content="FrontPage.Editor.Document">
<link rel="stylesheet" href="http://www.unicode.org/unicode.css" type="text/css">
<title>Unicode Character Database</title>
</head>

<body>

<h1>UNICODE CHARACTER DATABASE<br>
Version 3.0.0</h1>
<table border="1" cellspacing="2" cellpadding="0" height="87" width="100%">
<tr>
<td valign="TOP" width="144">Revision</td>
<td valign="TOP">3.0.0</td>
</tr>
<tr>
<td valign="TOP" width="144">Authors</td>
<td valign="TOP">Mark Davis and Ken Whistler</td>
</tr>
<tr>
<td valign="TOP" width="144">Date</td>
<td valign="TOP">1999-09-11</td>
</tr>
<tr>
<td valign="TOP" width="144">This Version</td>
<td valign="TOP"><a href="ftp://ftp.unicode.org/Public/3.0-Update/UnicodeCharacterDatabase-3.0.0.html">ftp://ftp.unicode.org/Public/3.0-Update/UnicodeCharacterDatabase-3.0.0.html</a></td>
</tr>
<tr>
<td valign="TOP" width="144">Previous Version</td>
<td valign="TOP">n/a</td>
</tr>
<tr>
<td valign="TOP" width="144">Latest Version</td>
<td valign="TOP"><a href="ftp://ftp.unicode.org/Public/3.0-Update/UnicodeCharacterDatabase-3.0.0.html">ftp://ftp.unicode.org/Public/3.0-Update/UnicodeCharacterDatabase-3.0.0.html</a></td>
</tr>
</table>
<p align="center">Copyright © 1995-1999 Unicode, Inc. All Rights reserved.</p>
<h2>Disclaimer</h2>
<p>The Unicode Character Database is provided as is by Unicode, Inc. No claims
are made as to fitness for any particular purpose. No warranties of any kind are
expressed or implied. The recipient agrees to determine applicability of
information provided. If this file has been purchased on magnetic or optical
media from Unicode, Inc., the sole remedy for any claim will be exchange of
defective media within 90 days of receipt.</p>
<p>This disclaimer is applicable for all other data files accompanying the
Unicode Character Database, some of which have been compiled by the Unicode
Consortium, and some of which have been supplied by other sources.</p>
<h2>Limitations on Rights to Redistribute This Data</h2>
<p>Recipient is granted the right to make copies in any form for internal
distribution and to freely use the information supplied in the creation of
products supporting the Unicode<sup>TM</sup> Standard. The files in the Unicode
Character Database can be redistributed to third parties or other organizations
(whether for profit or not) as long as this notice and the disclaimer notice are
retained. Information can be extracted from these files and used in
documentation or programs, as long as there is an accompanying notice indicating
the source.</p>
<h2>Introduction</h2>
<p>The Unicode Character Database is a set of files that define the Unicode
character properties and internal mappings. For more information about character
properties and mappings, see <i><a href="http://www.unicode.org/unicode/uni2book/u2.html">The
Unicode Standard</a></i>.</p>
<p>The Unicode Character Database has been updated to reflect Version 3.0 of the
Unicode Standard, with many characters added to those published in Version 2.0.
A number of corrections have also been made to case mappings and to other errors
in the database noted since the publication of Version 2.0. Normative
bidirectional properties have also been modified to reflect decisions of the
Unicode Technical Committee.</p>
<p>For more information on versions of the Unicode Standard and how to reference
them, see <a href="http://www.unicode.org/unicode/standard/versions/">http://www.unicode.org/unicode/standard/versions/</a>.</p>
<h2>Conformance</h2>
<p>Character properties may be either normative or informative. <i>Normative</i>
means that implementations that claim conformance to the Unicode Standard (at a
particular version) and that make use of a particular property or field must
follow the specifications of the standard for that property or field in order to
be conformant. The term <i>normative</i>, when applied to a property or field of
the Unicode Character Database, does <i>not</i> mean that the value of that
field will never change. Corrections and extensions to the standard in the
future may require minor changes to normative values, even though the Unicode
Technical Committee strives to minimize such changes. An <i>informative</i> property
or field is strongly recommended, but a conformant implementation is free to use
or change such values as it may require while still being conformant to the
standard. Particular implementations may choose to override the properties and
mappings that are not normative. In that case, it is up to the implementer to
establish a protocol to convey that information.</p>
<h2>Files</h2>
<p>The following summarizes the files in the Unicode Character Database. For
more information about these files, see the referenced technical report or
section of the Unicode Standard, Version 3.0.</p>
<p><b>UnicodeData.txt (Chapter 4)</b>
<ul>
<li>The main file in the Unicode Character Database.</li>
<li>For detailed information on the format, see <a href="UnicodeData.html">UnicodeData.html</a>.
This file also characterizes which properties are normative and which are
informative.</li>
</ul>
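<p>As a rough illustration of how a program might read this file, the sketch
below splits one record into its semicolon-delimited fields and decodes a few
of them. It assumes the 15-field layout described in UnicodeData.html; the
names <code>CharRecord</code> and <code>parse_unicode_data_line</code> are
hypothetical and not part of the database.</p>

```python
from typing import NamedTuple, Optional


class CharRecord(NamedTuple):
    """A few fields from one UnicodeData.txt record (hypothetical type)."""
    code_point: int
    name: str
    general_category: str
    simple_uppercase: Optional[int]
    simple_lowercase: Optional[int]


def parse_unicode_data_line(line: str) -> CharRecord:
    """Split one semicolon-delimited record and decode selected fields.

    Assumes the 15-field layout: field 0 is the code point in hex, field 1
    the character name, field 2 the general category, and fields 12 and 13
    the simple uppercase and lowercase mappings (empty when absent).
    """
    fields = line.rstrip("\n").split(";")
    if len(fields) != 15:
        raise ValueError(f"expected 15 fields, got {len(fields)}")

    def hex_or_none(text: str) -> Optional[int]:
        # Mapping fields are hex code points, or empty if there is no mapping.
        return int(text, 16) if text else None

    return CharRecord(
        code_point=int(fields[0], 16),
        name=fields[1],
        general_category=fields[2],
        simple_uppercase=hex_or_none(fields[12]),
        simple_lowercase=hex_or_none(fields[13]),
    )


record = parse_unicode_data_line(
    "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;"
)
print(record.name, record.general_category, hex(record.simple_lowercase))
```

<p>Only five of the fields are decoded here; a real loader would also need the
combining class, bidirectional category, and decomposition fields.</p>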
<p><b>PropList.txt (Chapter 4)</b>
<ul>
<li>List of additional informative properties: <i>Alphabetic, Ideographic,</i>
and <i>Mathematical</i>, among others.</li>
</ul>
<p><b>SpecialCasing.txt (Chapter 4)</b>
<ul>
<li>List of informative special casing properties, including one-to-many
mappings such as SHARP S =&gt; "SS", and locale-specific mappings,
such as for Turkish <i>dotless i</i>.</li>
</ul>
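<p>The sketch below shows one way an implementation might layer these special
cases over ordinary one-to-one case mapping. The two tables are a tiny
hand-written excerpt for illustration, not data read from SpecialCasing.txt,
and the names <code>SPECIAL_UPPER</code>, <code>LOCALE_UPPER</code>, and
<code>to_upper</code> are hypothetical.</p>

```python
# Hypothetical excerpt of special-casing data.
# Unconditional one-to-many mapping: SHARP S uppercases to two letters.
SPECIAL_UPPER = {"\u00DF": "SS"}
# Locale-specific mapping: in Turkish, "i" uppercases to
# U+0130 LATIN CAPITAL LETTER I WITH DOT ABOVE.
LOCALE_UPPER = {"tr": {"i": "\u0130"}}


def to_upper(text: str, locale: str = "") -> str:
    """Uppercase text, consulting locale and special tables first."""
    locale_map = LOCALE_UPPER.get(locale, {})
    out = []
    for ch in text:
        if ch in locale_map:
            out.append(locale_map[ch])       # locale-specific rule wins
        elif ch in SPECIAL_UPPER:
            out.append(SPECIAL_UPPER[ch])    # one-to-many mapping
        else:
            out.append(ch.upper())           # fall back to the built-in mapping
    return "".join(out)


print(to_upper("stra\u00DFe"))        # STRASSE
print(to_upper("istanbul", "tr"))     # starts with U+0130
```

<p>Because the result can be longer than the input, code that uppercases in
place or assumes a fixed length per character cannot apply these mappings.</p>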
<p><b>Blocks.txt (Chapter 14)</b>
<ul>
<li>List of normative block names.</li>
</ul>
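<p>A block lookup can be sketched as a sorted list of ranges searched by
bisection. The example assumes each record has the form
<code>start; end; name</code> with hexadecimal code points and
<code>#</code> comments; check the header of the actual file before relying
on this layout. The sample records and the names <code>parse_blocks</code>
and <code>block_name</code> are illustrative only.</p>

```python
import bisect

# Illustrative sample in the assumed "start; end; name" layout.
SAMPLE_BLOCKS = """\
# Start Code; End Code; Block Name
0000; 007F; Basic Latin
0080; 00FF; Latin-1 Supplement
0370; 03FF; Greek
"""


def parse_blocks(text):
    """Return a sorted list of (start, end, name) tuples."""
    blocks = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and blank lines
        if not line:
            continue
        start, end, name = (part.strip() for part in line.split(";"))
        blocks.append((int(start, 16), int(end, 16), name))
    blocks.sort()
    return blocks


def block_name(blocks, code_point):
    """Return the name of the block containing code_point, or None."""
    starts = [start for start, _end, _name in blocks]
    index = bisect.bisect_right(starts, code_point) - 1
    if index >= 0:
        _start, end, name = blocks[index]
        if code_point <= end:
            return name
    return None


blocks = parse_blocks(SAMPLE_BLOCKS)
print(block_name(blocks, 0x03A9))
```

<p>Code points that fall in a gap between blocks, such as U+0100 in this
three-record sample, yield <code>None</code>.</p>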
<p><b>Jamo.txt (Chapter 4)</b>
<ul>
<li>List of normative Jamo short names, used in deriving HANGUL SYLLABLE names
algorithmically.</li>
</ul>
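<p>The derivation can be sketched as follows: the syllable's offset from
U+AC00 is decomposed arithmetically into leading-consonant, vowel, and
trailing-consonant indices, and the corresponding short names are
concatenated. The short-name tables below are typed in by hand as they appear
in recent releases of Jamo.txt; treat them as an assumption and compare them
with the copy of the file you are using. The function name
<code>hangul_syllable_name</code> is hypothetical.</p>

```python
S_BASE = 0xAC00                       # first precomposed Hangul syllable
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28
N_COUNT = V_COUNT * T_COUNT           # syllables per leading consonant (588)
S_COUNT = L_COUNT * N_COUNT           # total syllables (11172)

# Jamo short names, indexed by position (assumed values, see lead-in).
JAMO_L = ["G", "GG", "N", "D", "DD", "R", "M", "B", "BB", "S",
          "SS", "", "J", "JJ", "C", "K", "T", "P", "H"]
JAMO_V = ["A", "AE", "YA", "YAE", "EO", "E", "YEO", "YE", "O", "WA",
          "WAE", "OE", "YO", "U", "WEO", "WE", "WI", "YU", "EU", "YI", "I"]
JAMO_T = ["", "G", "GG", "GS", "N", "NJ", "NH", "D", "L", "LG",
          "LM", "LB", "LS", "LT", "LP", "LH", "M", "B", "BS", "S",
          "SS", "NG", "J", "C", "K", "T", "P", "H"]


def hangul_syllable_name(code_point):
    """Derive the character name of a precomposed Hangul syllable."""
    s_index = code_point - S_BASE
    if not 0 <= s_index < S_COUNT:
        raise ValueError("not a precomposed Hangul syllable")
    l_index = s_index // N_COUNT               # leading consonant
    v_index = (s_index % N_COUNT) // T_COUNT   # vowel
    t_index = s_index % T_COUNT                # trailing consonant (0 = none)
    return ("HANGUL SYLLABLE "
            + JAMO_L[l_index] + JAMO_V[v_index] + JAMO_T[t_index])


print(hangul_syllable_name(0xD4DB))
```

<p>Deriving the names this way avoids storing 11,172 separate name strings
for the syllable range.</p>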
<p><b>ArabicShaping.txt (Section 8.2)</b>
<ul>
<li>Basic Arabic and Syriac character shaping properties, such as initial,
medial and final shapes. These properties are normative for minimal shaping
of Arabic and Syriac.</li>
</ul>
<p><b>NamesList.txt (Chapter 14)</b>
<ul>
<li>This file duplicates some of the material in the UnicodeData file, and
adds informative annotations used in the character charts, as printed in the
Unicode Standard.</li>
<li><b>Note: </b>The information in NamesList.txt and Index.txt files matches
the appropriate version of the book. Changes in the Unicode Character
Database since then may not be reflected in these files, since they are
primarily of archival interest.</li>
</ul>
<p><b>Index.txt (Chapter 14)</b>
<ul>
<li>Informative index to Unicode characters, as printed in the Unicode
Standard.</li>
<li><b>Note: </b>The information in NamesList.txt and Index.txt files matches
the appropriate version of the book. Changes in the Unicode Character
Database since then may not be reflected in these files, since they are
primarily of archival interest.</li>
</ul>
<p><b>CompositionExclusions.txt (<a href="http://www.unicode.org/unicode/reports/tr15/">UTR
#15: Unicode Normalization Forms</a>)</b>
<ul>
<li>Normative properties for normalization.</li>
</ul>
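<p>The effect of a composition exclusion can be observed with Python's
standard <code>unicodedata</code> module, as a small sketch. Note that this
module ships a later version of the database than 3.0.0, so it illustrates
the mechanism rather than this exact release. The helper
<code>code_points</code> is hypothetical.</p>

```python
import unicodedata


def code_points(text):
    """Format a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


# An ordinary pair composes under NFC: A + COMBINING RING ABOVE.
composed = unicodedata.normalize("NFC", "A\u030A")

# U+0958 DEVANAGARI LETTER QA is excluded from composition, so NFC
# leaves it in its decomposed form instead of recomposing it.
excluded = unicodedata.normalize("NFC", "\u0958")

print(code_points(composed))
print(code_points(excluded))
```

<p>Without the exclusion list, a normalizer would recombine the second
sequence back into the single code point U+0958.</p>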
<p><b>LineBreak.txt (<a href="http://www.unicode.org/unicode/reports/tr14/">UTR
#14: Line Breaking Properties</a>)</b>
<ul>
<li>Normative and informative properties for line breaking. To see which
properties are informative and which are normative, consult UTR #14.</li>
</ul>
<p><b>EastAsianWidth.txt (<a href="http://www.unicode.org/unicode/reports/tr11/">UTR
#11: East Asian Character Width</a>)</b>
<ul>
<li>Informative properties for determining the choice of wide vs. narrow
glyphs in East Asian contexts.</li>
</ul>
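<p>One common use of this property is estimating how many columns a string
occupies in a fixed-width display. The sketch below uses Python's standard
<code>unicodedata.east_asian_width</code>, which reflects a later database
version than 3.0.0. Counting wide and fullwidth characters as two columns and
everything else as one is a simplifying assumption: it ignores combining
marks and treats ambiguous-width characters as narrow. The function name
<code>display_columns</code> is hypothetical.</p>

```python
import unicodedata


def display_columns(text):
    """Estimate the column count of text in a fixed-width display."""
    columns = 0
    for ch in text:
        width = unicodedata.east_asian_width(ch)
        # "W" (wide) and "F" (fullwidth) occupy two columns; all other
        # classes are counted as one column in this simplified model.
        columns += 2 if width in ("W", "F") else 1
    return columns


print(display_columns("abc"))
print(display_columns("\u65E5\u672C\u8A9E"))   # three CJK ideographs
```

<p>Since the property is informative, an implementation is free to override
these values, for example to treat ambiguous characters as wide in a legacy
East Asian context.</p>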
<p><b>diffXvY.txt</b>
<ul>
<li>Mechanically generated informative files containing accumulated
differences between successive versions of UnicodeData.txt.</li>
</ul>

</body>

</html>