2019-12-09 21:03:47 +01:00
|
|
|
This directory contains a mechanism for GCC to have its own internal
|
|
|
|
implementation of wcwidth functionality. (cpp_wcwidth () in libcpp/charset.c).
|
|
|
|
|
|
|
|
The idea is to produce the necessary lookup table
|
|
|
|
(../../libcpp/generated_cpp_wcwidth.h) in a reproducible way, starting from the
|
|
|
|
following files that are distributed by the Unicode Consortium:
|
|
|
|
|
|
|
|
ftp://ftp.unicode.org/Public/UNIDATA/UnicodeData.txt
|
|
|
|
ftp://ftp.unicode.org/Public/UNIDATA/EastAsianWidth.txt
|
|
|
|
ftp://ftp.unicode.org/Public/UNIDATA/PropList.txt
|
|
|
|
|
|
|
|
These three files have been added to source control in this directory;
|
|
|
|
please see unicode-license.txt for the relevant copyright information.
|
|
|
|
|
|
|
|
In order to keep in sync with glibc's wcwidth as much as possible, it is
|
|
|
|
desirable for the logic that processes the Unicode data to be the same as
|
|
|
|
glibc's. To that end, we also put in this directory, in the from_glibc/
|
|
|
|
directory, the glibc python code that implements their logic. This code was
|
|
|
|
copied verbatim from glibc, and it can be updated at any time from the glibc
|
|
|
|
source code repository. The files copied from that respository are:
|
|
|
|
|
|
|
|
localedata/unicode-gen/unicode_utils.py
|
|
|
|
localedata/unicode-gen/utf8_gen.py
|
|
|
|
|
|
|
|
And the most recent versions added to GCC are from glibc git commit:
|
2020-11-07 15:36:42 +01:00
|
|
|
f6032247061fb37d59565f2e9667e242c8a98e76
|
2019-12-09 21:03:47 +01:00
|
|
|
|
|
|
|
Finally, the script gen_wcwidth.py found here contains the GCC-specific code to
|
|
|
|
map glibc's output to the lookup tables we require. This script should not need
|
|
|
|
to change, unless there are structural changes to the Unicode data files or to
|
|
|
|
the glibc code.
|
|
|
|
|
|
|
|
The procedure to update GCC's wcwidth tables is the following:
|
|
|
|
|
|
|
|
1. Update the three Unicode data files from the above URLs.
|
|
|
|
|
|
|
|
2. Update the two glibc files in from_glibc/ from glibc's git. Update
|
|
|
|
the commit number above in this README.
|
|
|
|
|
|
|
|
3. Run ./gen_wcwidth.py X.Y > ../../libcpp/generated_cpp_wcwidth.h
|
|
|
|
(where X.Y is the version of the Unicode standard corresponding to the
|
2020-11-07 15:36:42 +01:00
|
|
|
Unicode data files being used, most recently, 13.0.0).
|
2019-12-09 21:03:47 +01:00
|
|
|
|
|
|
|
After that, GCC's wcwidth will match the most recent glibc.
|