glibc

Commit Graph

Author	SHA1	Message	Date
Rafal Luzynski	1bb3653925	localedata: Once again correct and regenerate i18n_ctype. Following the previous work by Carlos O'Donell the category of LC_CTYPE is correctly set to "i18n:2012" rather than "unicode:2014" and the i18n_ctype file is once again regenerated from scratch to make sure it does not contain any manual additions except the copyright message. Reviewed-by: Carlos O'Donell <carlos@redhat.com> * localedata/unicode-gen/gen_unicode_ctype.py (output_head): category of LC_CTYPE set to "i18n:2012". * localedata/locales/i18n_ctype: Regenerate.	2017-10-31 23:54:47 +01:00
Carlos O'Donell	337ff3c501	localedata: Fix unicode-gen check target. After the transition to generating a distinct file for Unicode ctype information e.g. i18n_ctype, the check target was left with the wrong target name. This patch fixes the check target and regenerates the files with more information than previously used, filling in the LC_IDENTIFICATION data. Tested on x86_64 by regenerating from Unicode source files, and running checks. Tested by subsequently rebuilding all locales. No regressions in testsuite. Signed-off-by: Carlos O'Donell <carlos@redhat.com> Reported-by: Rafal Luzynski <digitalfreak@lingonborough.com>	2017-10-25 09:17:46 -07:00
Carlos O'Donell	8dc8be75d2	localedata: Reorganize Unicode LC_CTYPE inclusion. The commit does the following things: * Move non-transliteration Unicode generated data to i18n_ctype. * Copy the i18n_ctype data into i18n and add transliteration. In the future, any locale which needs Unicode LC_CTYPE data can also just use `copy i18n_ctype` and get the base character classes and maps without transliteration. Tested by compiling all the locales and my prototype C.UTF-8 which uses it. Signed-off-by: Carlos O'Donell <carlos@redhat.com>	2017-10-13 22:29:52 -07:00
Mike FABIAN	2ae5be041d	Improve utf8_gen.py to set the width for characters with Prepended_Concatenation_Mark property to 1 [BZ #22070] * localedata/unicode-gen/utf8_gen.py: Set the width for characters with Prepended_Concatenation_Mark property to 1 * localedata/charmaps/UTF-8: Updated using the improved script.	2017-09-06 12:39:49 +02:00
Mike FABIAN	af83ed5c46	Write all ranges of neighbouring characters with the same width using the range notation in charmaps/UTF-8 Writing ranges of neighbouring characters with the same width like this <U000E0100>...<U000E01EF> 0 in charmaps/UTF-8 is more efficient than writing many single character lines like: <U000E0100> 0 <U000E0101> 0 ... [BZ #21750] * unicode-gen/utf8_gen.py: Write all ranges of neighbouring characters with the same width using the range notation in charmaps/UTF-8.	2017-09-06 12:37:49 +02:00
Thorsten Glaser	267ee5d7ab	Resolve some historically special cases of ambiguous width [BZ #21750] * unicode-gen/utf8_gen.py (U+00AD): Set width to 1. * unicode-gen/utf8_gen.py (U+1160..U+11FF): Set width to 0. * unicode-gen/utf8_gen.py (U+3248..U+324F): Set width to 2. * unicode-gen/utf8_gen.py (U+4DC0..U+4DFF): Likewise.	2017-08-17 11:06:08 +02:00
Thorsten Glaser	41b6f0ce85	Handle more cases of combining characters [BZ #21750] * unicode-gen/utf8_gen.py: Treat category Me and Mn as combining.	2017-08-17 11:06:08 +02:00
Thorsten Glaser	580be3035d	UnicodeData has precedence over EastAsianWidth [BZ #19852] [BZ #21750] * unicode-gen/utf8_gen.py: Process EastAsianWidth lines before UnicodeData lines so the latter have precedence; remove hack to group output by EastAsianWidth ranges.	2017-08-17 11:06:08 +02:00
Mike FABIAN	925fac7793	Bug 21533: Update to Unicode 10.0.0 * Unicode 10.0.0 Support: Character encoding, character type info, and transliteration tables are all updated to Unicode 10.0.0, using generator scripts contributed by Mike FABIAN (Red Hat).	2017-06-22 17:02:55 +02:00
Mike FABIAN	0b38d66a4e	Bug 20313: Update to Unicode 9.0.0 * Unicode 9.0.0 Support: Character encoding, character type info, and transliteration tables are all updated to Unicode 9.0.0, using generator scripts contributed by Mike FABIAN (Red Hat).	2017-02-21 06:30:38 -05:00
Joseph Myers	bfff8b1bec	Update copyright dates with scripts/update-copyrights.	2017-01-01 00:14:16 +00:00
Mike Frysinger	277da2ab88	unicode-gen: include standard comment file header We deployed this header to all the locale files, so make sure we include it in the generated ones too so we don't lose it.	2016-06-11 02:10:52 -04:00
Joseph Myers	f7a9f785e5	Update copyright dates with scripts/update-copyrights.	2016-01-04 16:05:18 +00:00
Joseph Myers	85bafe6f3d	Automate LC_CTYPE generation for tr_TR, update to Unicode 8.0.0 (bug 18491). This patch makes the automation of Unicode LC_CTYPE generation also support generating the modified LC_CTYPE used for Turkish (where case conversions of 'i' and 'I' differ from ASCII conventions), so allowing that to be more readily kept in sync for future Unicode updates. The patch includes the locale update generated by the scripts. Tested for x86_64. [BZ #18491] * unicode-gen/unicode_utils.py (to_upper_turkish): New function. (to_lower_turkish): Likewise. * unicode-gen/gen_unicode_ctype.py (output_tables): Support producing output with Turkish case conversions. (--turkish): New command-line option. * unicode-gen/Makefile (GENERATED): Add tr_TR. (tr_TR): New rule. * locales/tr_TR: Regenerate LC_CTYPE.	2015-12-11 12:45:19 +00:00
Mike FABIAN	23256f5ed8	Update to Unicode 8.0.0. Update __STDC_ISO_10646__ to 201505L for Unicode 8.0.0. Update character encoding, ctype, and transliteration tables. New scripts autogenerate transliteration tables.	2015-12-10 00:33:48 -05:00
Carlos O'Donell	dd8e8e5476	Update transliteration support to Unicode 7.0.0. The transliteration files are now autogenerated from upstream Unicode data.	2015-12-09 22:52:13 -05:00
Alexandre Oliva	7b1ec6a05c	Amendments to Unicode 7 update. for ChangeLog * include/stdc-predef.h (__STDC_ISO_10646__): Update to 201304L, for Unicode 7. for localedata/ChangeLog * unicode-gen/ctype_compatibility.py: Use date ranges in copyright notice. * unicode-gen/ctype_compatibility_test_cases.py: Likewise. * unicode-gen/gen_unicode_ctype.py: Likewise. * unicode-gen/utf8_compatibility.py: Likewise. * unicode-gen/utf8_gen.py: Likewise. Use upper case for global variables, use tuples for global constant arrays. From Mike FABIAN. Suggested by Mike Frysinger <vapier@gentoo.org>.	2015-02-23 11:35:24 -03:00
Alexandre Oliva	4a4839c94a	Unicode 7.0.0 update; added generator scripts. for localedata/ChangeLog [BZ #17588] [BZ #13064] [BZ #14094] [BZ #17998] * unicode-gen/Makefile: New. * unicode-gen/unicode-license.txt: New, from Unicode. * unicode-gen/UnicodeData.txt: New, from Unicode. * unicode-gen/DerivedCoreProperties.txt: New, from Unicode. * unicode-gen/EastAsianWidth.txt: New, from Unicode. * unicode-gen/gen_unicode_ctype.py: New generator, from Mike FABIAN <mfabian@redhat.com>. * unicode-gen/ctype_compatibility.py: New verifier, from Pravin Satpute <psatpute@redhat.com> and Mike FABIAN. * unicode-gen/ctype_compatibility_test_cases.py: New verifier module, from Mike FABIAN. * unicode-gen/utf8_gen.py: New generator, from Pravin Satpute and Mike FABIAN. * unicode-gen/utf8_compatibility.py: New verifier, from Pravin Satpute and Mike FABIAN. * charmaps/UTF-8: Update. * locales/i18n: Update. * gen-unicode-ctype.c: Remove. * tst-ctype-de_DE.ISO-8859-1.in: Adjust, islower now returns true for ordinal indicators.	2015-02-20 20:14:59 -02:00

18 Commits
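
The range notation described in commit af83ed5c46 can be illustrated with a minimal sketch. This is not the code from utf8_gen.py. The function names `ucs_symbol` and `format_width_ranges`, the input dictionary, and the single-space separator are assumptions made for illustration, modelled only on the `<U000E0100>...<U000E01EF> 0` example quoted in the commit message.

```python
def ucs_symbol(code_point):
    """Format a code point as <UXXXX>, or <UXXXXXXXX> above the BMP.

    Hypothetical helper: the digit counts follow the example in the
    commit message, not a verified copy of the glibc script.
    """
    if code_point < 0x10000:
        return '<U{:04X}>'.format(code_point)
    return '<U{:08X}>'.format(code_point)


def _emit(start, end, width):
    """Return one output line for a single character or a range."""
    if start == end:
        return '{} {}'.format(ucs_symbol(start), width)
    return '{}...{} {}'.format(ucs_symbol(start), ucs_symbol(end), width)


def format_width_ranges(widths):
    """Collapse runs of neighbouring code points that share a width.

    `widths` maps code point (int) to width (int). Consecutive code
    points with the same width become one range line.
    """
    lines = []
    start = prev = None
    for cp in sorted(widths):
        # Extend the current run if cp is adjacent and has the same width.
        if start is not None and cp == prev + 1 and widths[cp] == widths[start]:
            prev = cp
            continue
        # Otherwise flush the finished run and begin a new one.
        if start is not None:
            lines.append(_emit(start, prev, widths[start]))
        start = prev = cp
    if start is not None:
        lines.append(_emit(start, prev, widths[start]))
    return lines


if __name__ == '__main__':
    # 240 variation selectors with width 0, plus one isolated character.
    sample = {cp: 0 for cp in range(0xE0100, 0xE01F0)}
    sample[0x00AD] = 1
    for line in format_width_ranges(sample):
        print(line)
```

With the sample input, the 240 separate width-0 entries collapse into a single range line, which is the efficiency gain the commit message refers to.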