497c9f8d4d
generated_cpp_wcwidth.h was regenerated using Unicode 13.0.0 data files. No material changes to the parsing scripts (either GCC- or glibc-sourced) were necessary; glibc's utf8_gen.py was tweaked slightly by glibc and matched here. contrib/ChangeLog: * unicode/EastAsianWidth.txt: Update to Unicode 13.0.0. * unicode/PropList.txt: Likewise. * unicode/README: Likewise. * unicode/UnicodeData.txt: Likewise. * unicode/from_glibc/unicode_utils.py: Update to latest glibc version. * unicode/from_glibc/utf8_gen.py: Likewise. libcpp/ChangeLog: * generated_cpp_wcwidth.h: Regenerated from Unicode 13.0.0 data.
45 lines
2.0 KiB
Plaintext
45 lines
2.0 KiB
Plaintext
This directory contains a mechanism for GCC to have its own internal
|
|
implementation of wcwidth functionality. (cpp_wcwidth () in libcpp/charset.c).
|
|
|
|
The idea is to produce the necessary lookup table
|
|
(../../libcpp/generated_cpp_wcwidth.h) in a reproducible way, starting from the
|
|
following files that are distributed by the Unicode Consortium:
|
|
|
|
ftp://ftp.unicode.org/Public/UNIDATA/UnicodeData.txt
|
|
ftp://ftp.unicode.org/Public/UNIDATA/EastAsianWidth.txt
|
|
ftp://ftp.unicode.org/Public/UNIDATA/PropList.txt
|
|
|
|
These three files have been added to source control in this directory;
|
|
please see unicode-license.txt for the relevant copyright information.
|
|
|
|
In order to keep in sync with glibc's wcwidth as much as possible, it is
|
|
desirable for the logic that processes the Unicode data to be the same as
|
|
glibc's. To that end, we also put in this directory, in the from_glibc/
|
|
directory, the glibc python code that implements their logic. This code was
|
|
copied verbatim from glibc, and it can be updated at any time from the glibc
|
|
source code repository. The files copied from that respository are:
|
|
|
|
localedata/unicode-gen/unicode_utils.py
|
|
localedata/unicode-gen/utf8_gen.py
|
|
|
|
And the most recent versions added to GCC are from glibc git commit:
|
|
f6032247061fb37d59565f2e9667e242c8a98e76
|
|
|
|
Finally, the script gen_wcwidth.py found here contains the GCC-specific code to
|
|
map glibc's output to the lookup tables we require. This script should not need
|
|
to change, unless there are structural changes to the Unicode data files or to
|
|
the glibc code.
|
|
|
|
The procedure to update GCC's wcwidth tables is the following:
|
|
|
|
1. Update the three Unicode data files from the above URLs.
|
|
|
|
2. Update the two glibc files in from_glibc/ from glibc's git. Update
|
|
the commit number above in this README.
|
|
|
|
3. Run ./gen_wcwidth.py X.Y > ../../libcpp/generated_cpp_wcwidth.h
|
|
(where X.Y is the version of the Unicode standard corresponding to the
|
|
Unicode data files being used, most recently, 13.0.0).
|
|
|
|
After that, GCC's wcwidth will match the most recent glibc.
|