45 lines
2.0 KiB
Plaintext
45 lines
2.0 KiB
Plaintext
|
This directory contains a mechanism for GCC to have its own internal
|
||
|
implementation of wcwidth functionality. (cpp_wcwidth () in libcpp/charset.c).
|
||
|
|
||
|
The idea is to produce the necessary lookup table
|
||
|
(../../libcpp/generated_cpp_wcwidth.h) in a reproducible way, starting from the
|
||
|
following files that are distributed by the Unicode Consortium:
|
||
|
|
||
|
ftp://ftp.unicode.org/Public/UNIDATA/UnicodeData.txt
|
||
|
ftp://ftp.unicode.org/Public/UNIDATA/EastAsianWidth.txt
|
||
|
ftp://ftp.unicode.org/Public/UNIDATA/PropList.txt
|
||
|
|
||
|
These three files have been added to source control in this directory;
|
||
|
please see unicode-license.txt for the relevant copyright information.
|
||
|
|
||
|
In order to keep in sync with glibc's wcwidth as much as possible, it is
|
||
|
desirable for the logic that processes the Unicode data to be the same as
|
||
|
glibc's. To that end, we also put in this directory, in the from_glibc/
|
||
|
directory, the glibc python code that implements their logic. This code was
|
||
|
copied verbatim from glibc, and it can be updated at any time from the glibc
|
||
|
source code repository. The files copied from that respository are:
|
||
|
|
||
|
localedata/unicode-gen/unicode_utils.py
|
||
|
localedata/unicode-gen/utf8_gen.py
|
||
|
|
||
|
And the most recent versions added to GCC are from glibc git commit:
|
||
|
2a764c6ee848dfe92cb2921ed3b14085f15d9e79
|
||
|
|
||
|
Finally, the script gen_wcwidth.py found here contains the GCC-specific code to
|
||
|
map glibc's output to the lookup tables we require. This script should not need
|
||
|
to change, unless there are structural changes to the Unicode data files or to
|
||
|
the glibc code.
|
||
|
|
||
|
The procedure to update GCC's wcwidth tables is the following:
|
||
|
|
||
|
1. Update the three Unicode data files from the above URLs.
|
||
|
|
||
|
2. Update the two glibc files in from_glibc/ from glibc's git. Update
|
||
|
the commit number above in this README.
|
||
|
|
||
|
3. Run ./gen_wcwidth.py X.Y > ../../libcpp/generated_cpp_wcwidth.h
|
||
|
(where X.Y is the version of the Unicode standard corresponding to the
|
||
|
Unicode data files being used, most recently, 12.1).
|
||
|
|
||
|
After that, GCC's wcwidth will match the most recent glibc.
|