contrib/ChangeLog
2019-12-09 Lewis Hyatt <lhyatt@gmail.com>
PR preprocessor/49973
* unicode/from_glibc/unicode_utils.py: Support script from
glibc (commit 464cd3) to extract character widths from Unicode data
files.
* unicode/from_glibc/utf8_gen.py: Likewise.
* unicode/UnicodeData.txt: Unicode v. 12.1.0 data file.
* unicode/EastAsianWidth.txt: Likewise.
* unicode/PropList.txt: Likewise.
* unicode/gen_wcwidth.py: New utility to generate
libcpp/generated_cpp_wcwidth.h with help from the glibc support
scripts and the Unicode data files.
* unicode/unicode-license.txt: Added.
* unicode/README: New explanatory file.
libcpp/ChangeLog
2019-12-09 Lewis Hyatt <lhyatt@gmail.com>
PR preprocessor/49973
* generated_cpp_wcwidth.h: New file generated by
../contrib/unicode/gen_wcwidth.py, supports new cpp_wcwidth function.
* charset.c (compute_next_display_width): New function to help
implement display columns.
(cpp_byte_column_to_display_column): Likewise.
(cpp_display_column_to_byte_column): Likewise.
(cpp_wcwidth): Likewise.
* include/cpplib.h (cpp_byte_column_to_display_column): Declare.
(cpp_display_column_to_byte_column): Declare.
(cpp_wcwidth): Declare.
(cpp_display_width): New function.
gcc/ChangeLog
2019-12-09 Lewis Hyatt <lhyatt@gmail.com>
PR preprocessor/49973
* input.c (location_compute_display_column): New function to help with
multibyte awareness in diagnostics.
(test_cpp_utf8): New self-test.
(input_c_tests): Call the new test.
* input.h (location_compute_display_column): Declare.
* diagnostic-show-locus.c: Pervasive changes to add multibyte awareness
to all classes and functions.
(enum column_unit): New enum.
(class exploc_with_display_col): New class.
(class layout_point): Convert m_column member to array m_columns[2].
(layout_range::contains_point): Add col_unit argument.
(test_layout_range_for_single_point): Pass new argument.
(test_layout_range_for_single_line): Likewise.
(test_layout_range_for_multiple_lines): Likewise.
(line_bounds::convert_to_display_cols): New function.
(layout::get_state_at_point): Add col_unit argument.
(make_range): Use empty filename rather than dummy filename.
(get_line_width_without_trailing_whitespace): Rename to...
(get_line_bytes_without_trailing_whitespace): ...this.
(test_get_line_width_without_trailing_whitespace): Rename to...
(test_get_line_bytes_without_trailing_whitespace): ...this.
(class layout): m_exploc changed to exploc_with_display_col from
plain expanded_location.
(layout::get_linenum_width): New accessor member function.
(layout::get_x_offset_display): Likewise.
(layout::calculate_linenum_width): New subroutine for the constuctor.
(layout::calculate_x_offset_display): Likewise.
(layout::layout): Use the new subroutines. Add multibyte awareness.
(layout::print_source_line): Add multibyte awareness.
(layout::print_line): Likewise.
(layout::print_annotation_line): Likewise.
(line_label::line_label): Likewise.
(layout::print_any_labels): Likewise.
(layout::annotation_line_showed_range_p): Likewise.
(get_printed_columns): Likewise.
(class line_label): Rename m_length to m_display_width.
(get_affected_columns): Rename to...
(get_affected_range): ...this; add col_unit argument and multibyte
awareness.
(class correction): Add m_affected_bytes and m_display_cols
members. Rename m_len to m_byte_length for clarity. Add multibyte
awareness throughout.
(correction::insertion_p): Add multibyte awareness.
(correction::compute_display_cols): New function.
(correction::ensure_terminated): Use new member name m_byte_length.
(line_corrections::add_hint): Add multibyte awareness.
(layout::print_trailing_fixits): Likewise.
(layout::get_x_bound_for_row): Likewise.
(test_one_liner_simple_caret_utf8): New self-test analogous to the one
with _utf8 suffix removed, testing multibyte awareness.
(test_one_liner_caret_and_range_utf8): Likewise.
(test_one_liner_multiple_carets_and_ranges_utf8): Likewise.
(test_one_liner_fixit_insert_before_utf8): Likewise.
(test_one_liner_fixit_insert_after_utf8): Likewise.
(test_one_liner_fixit_remove_utf8): Likewise.
(test_one_liner_fixit_replace_utf8): Likewise.
(test_one_liner_fixit_replace_non_equal_range_utf8): Likewise.
(test_one_liner_fixit_replace_equal_secondary_range_utf8): Likewise.
(test_one_liner_fixit_validation_adhoc_locations_utf8): Likewise.
(test_one_liner_many_fixits_1_utf8): Likewise.
(test_one_liner_many_fixits_2_utf8): Likewise.
(test_one_liner_labels_utf8): Likewise.
(test_diagnostic_show_locus_one_liner_utf8): Likewise.
(test_overlapped_fixit_printing_utf8): Likewise.
(test_overlapped_fixit_printing): Adapt for changes to
get_affected_columns, get_printed_columns and class corrections.
(test_overlapped_fixit_printing_2): Likewise.
(test_linenum_sep): New constant.
(test_left_margin): Likewise.
(test_offset_impl): Helper function for new test.
(test_layout_x_offset_display_utf8): New test.
(diagnostic_show_locus_c_tests): Call new tests.
gcc/testsuite/ChangeLog:
2019-12-09 Lewis Hyatt <lhyatt@gmail.com>
PR preprocessor/49973
* gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
(test_show_locus): Tweak so that expected output is the same as
before the diagnostic-show-locus.c changes.
* gcc.dg/cpp/pr66415-1.c: Likewise.
From-SVN: r279137
C2x adds u8'' character constants to C. This patch adds the
corresponding GCC support.
Most of the support was already present for C++ and just needed
enabling for C2x. However, in C2x these constants have type unsigned
char, which required corresponding adjustments in the compiler and the
preprocessor to give them that type for C.
For C, it seems clear to me that having type unsigned char means the
constants are unsigned in the preprocessor (and thus treated as having
type uintmax_t in #if conditionals), so this patch implements that. I
included a conditional in the libcpp change to avoid affecting
signedness for C++, but I'm not sure if in fact these constants should
also be unsigned in the preprocessor for C++ in which case that
!CPP_OPTION (pfile, cplusplus) conditional would not be needed.
Bootstrapped with no regressions on x86_64-pc-linux-gnu.
gcc/c:
* c-parser.c (c_parser_postfix_expression)
(c_parser_check_literal_zero): Handle CPP_UTF8CHAR.
* gimple-parser.c (c_parser_gimple_postfix_expression): Likewise.
gcc/c-family:
* c-lex.c (lex_charconst): Make CPP_UTF8CHAR constants unsigned
char for C.
gcc/testsuite:
* gcc.dg/c11-utf8char-1.c, gcc.dg/c2x-utf8char-1.c,
gcc.dg/c2x-utf8char-2.c, gcc.dg/c2x-utf8char-3.c,
gcc.dg/gnu2x-utf8char-1.c: New tests.
libcpp:
* charset.c (narrow_str_to_charconst): Make CPP_UTF8CHAR constants
unsigned for C.
* init.c (lang_defaults): Set utf8_char_literals for GNUC2X and
STDC2X.
From-SVN: r278265
PR c++/91370 - Implement P1041R4 and P1139R2 - Stronger Unicode reqs
* charset.c (narrow_str_to_charconst): Add TYPE argument. For
CPP_UTF8CHAR diagnose whenever number of chars is > 1, using
CPP_DL_ERROR instead of CPP_DL_WARNING.
(wide_str_to_charconst): For CPP_CHAR16 or CPP_CHAR32, use
CPP_DL_ERROR instead of CPP_DL_WARNING when multiple char16_t
or char32_t chars are needed.
(cpp_interpret_charconst): Adjust narrow_str_to_charconst caller.
* g++.dg/cpp1z/utf8-neg.C: Expect errors rather than -Wmultichar
warnings.
* g++.dg/ext/utf16-4.C: Expect errors rather than warnings.
* g++.dg/ext/utf32-4.C: Likewise.
* g++.dg/cpp2a/ucn2.C: New test.
From-SVN: r277929
* charset.c (UCS_LIMIT): New macro.
(ucn_valid_in_identifier): Use it instead of a hardcoded constant.
(_cpp_valid_ucn): Issue a pedantic warning for UCNs larger than
UCS_LIMIT outside of identifiers in C and in C++2a or later.
From-SVN: r276167
libcpp/ChangeLog
2019-09-19 Lewis Hyatt <lhyatt@gmail.com>
PR c/67224
* charset.c (_cpp_valid_utf8): New function to help lex UTF-8 tokens.
* internal.h (_cpp_valid_utf8): Declare.
* lex.c (forms_identifier_p): Use it to recognize UTF-8 identifiers.
(_cpp_lex_direct): Handle UTF-8 in identifiers and CPP_OTHER tokens.
Do all work in "default" case to avoid slowing down typical code paths.
Also handle $ and UCN in the default case for consistency.
gcc/Changelog
2019-09-19 Lewis Hyatt <lhyatt@gmail.com>
PR c/67224
* doc/cpp.texi: Document support for extended characters in
identifiers.
* doc/cppopts.texi: Likewise.
gcc/testsuite/ChangeLog
2019-09-19 Lewis Hyatt <lhyatt@gmail.com>
PR c/67224
* c-c++-common/cpp/ucnid-2011-1-utf8.c: New test.
* g++.dg/cpp/ucnid-1-utf8.C: New test.
* g++.dg/cpp/ucnid-2-utf8.C: New test.
* g++.dg/cpp/ucnid-3-utf8.C: New test.
* g++.dg/cpp/ucnid-4-utf8.C: New test.
* g++.dg/other/ucnid-1-utf8.C: New test.
* gcc.dg/cpp/ucnid-1-utf8.c: New test.
* gcc.dg/cpp/ucnid-10-utf8.c: New test.
* gcc.dg/cpp/ucnid-11-utf8.c: New test.
* gcc.dg/cpp/ucnid-12-utf8.c: New test.
* gcc.dg/cpp/ucnid-13-utf8.c: New test.
* gcc.dg/cpp/ucnid-14-utf8.c: New test.
* gcc.dg/cpp/ucnid-15-utf8.c: New test.
* gcc.dg/cpp/ucnid-2-utf8.c: New test.
* gcc.dg/cpp/ucnid-3-utf8.c: New test.
* gcc.dg/cpp/ucnid-4-utf8.c: New test.
* gcc.dg/cpp/ucnid-6-utf8.c: New test.
* gcc.dg/cpp/ucnid-7-utf8.c: New test.
* gcc.dg/cpp/ucnid-9-utf8.c: New test.
* gcc.dg/ucnid-1-utf8.c: New test.
* gcc.dg/ucnid-10-utf8.c: New test.
* gcc.dg/ucnid-11-utf8.c: New test.
* gcc.dg/ucnid-12-utf8.c: New test.
* gcc.dg/ucnid-13-utf8.c: New test.
* gcc.dg/ucnid-14-utf8.c: New test.
* gcc.dg/ucnid-15-utf8.c: New test.
* gcc.dg/ucnid-16-utf8.c: New test.
* gcc.dg/ucnid-2-utf8.c: New test.
* gcc.dg/ucnid-3-utf8.c: New test.
* gcc.dg/ucnid-4-utf8.c: New test.
* gcc.dg/ucnid-5-utf8.c: New test.
* gcc.dg/ucnid-6-utf8.c: New test.
* gcc.dg/ucnid-7-utf8.c: New test.
* gcc.dg/ucnid-8-utf8.c: New test.
* gcc.dg/ucnid-9-utf8.c: New test.
From-SVN: r275979
This patch renames the "error" callback within libcpp
to "diagnostic", and uses the pair of enums in cpplib.h, rather
than passing two different kinds of "int" around.
gcc/c-family/ChangeLog:
* c-common.c (c_option_controlling_cpp_error): Rename to...
(c_option_controlling_cpp_diagnostic): ...this, and convert
"reason" from int to enum.
(c_cpp_error): Rename to...
(c_cpp_diagnostic): ...this, converting level and reason to enums.
* c-common.h (c_cpp_error): Rename to...
(c_cpp_diagnostic): ...this, converting level and reason to enums.
* c-opts.c (c_common_init_options): Update for renaming.
gcc/fortran/ChangeLog:
* cpp.c (gfc_cpp_init_0): Update for renamings.
(cb_cpp_error): Rename to...
(cb_cpp_diagnostic): ...this, converting level and reason to
enums.
gcc/ChangeLog:
* genmatch.c (error_cb): Rename to...
(diagnostic_cb): ...this, converting int params to enums.
(fatal_at): Update for renaming.
(warning_at): Likewise.
(main): Likewise.
* input.c (selftest::ebcdic_execution_charset::apply):
Update for renaming of...
(selftest::ebcdic_execution_charset::on_error): ...this, renaming
to...
(selftest::ebcdic_execution_charset::on_diagnostic): ...this,
converting level and reason to enums.
(class selftest::lexer_error_sink): Rename to...
(class selftest::lexer_test_options): ...this, renaming field
"m_errors" to "m_diagnostics".
(selftest::lexer_test_options::apply): Update for renaming of...
(selftest::lexer_test_options::on_error): ...this, renaming to...
(selftest::lexer_test_options::on_diagnostic): ...this
converting level and reason to enums.
(selftest::test_lexer_string_locations_raw_string_unterminated):
Update for renamings.
* opth-gen.awk (struct cpp_reason_option_codes_t): Use enum for
"reason".
libcpp/ChangeLog:
* charset.c (noop_error_cb): Rename to...
(noop_diagnostic_cb): ...this, converting params to enums.
(cpp_interpret_string_ranges): Update for renaming and enums.
* directives.c (check_eol_1): Convert reason to enum.
(do_diagnostic): Convert code and reason to enum.
(do_error): Use CPP_W_NONE rather than 0.
(do_pragma_dependency): Likewise.
* errors.c (cpp_diagnostic_at): Convert level and reason to enums.
Update for renaming.
(cpp_diagnostic): Convert level and reason to enums.
(cpp_error): Convert level to enum.
(cpp_warning): Convert reason to enums.
(cpp_pedwarning): Likewise.
(cpp_warning_syshdr): Likewise.
(cpp_diagnostic_with_line): Convert level and reason to enums.
Update for renaming.
(cpp_error_with_line): Convert level to enum.
(cpp_warning_with_line): Convert reason to enums.
(cpp_pedwarning_with_line): Likewise.
(cpp_warning_with_line_syshdr): Likewise.
(cpp_error_at): Convert level to enum.
(cpp_errno): Likewise.
(cpp_errno_filename): Likewise.
* include/cpplib.h (enum cpp_diagnostic_level): Name this enum,
and move to before struct cpp_callbacks.
(enum cpp_warning_reason): Likewise.
(cpp_callbacks::diagnostic): Convert params from int to enums.
(cpp_error): Convert int param to enum cpp_diagnostic_level.
(cpp_warning): Convert int param to enum cpp_warning_reason.
(cpp_pedwarning): Likewise.
(cpp_warning_syshdr): Likewise.
(cpp_errno): Convert int param to enum cpp_diagnostic_level.
(cpp_errno_filename): Likewise.
(cpp_error_with_line): Likewise.
(cpp_warning_with_line): Convert int param to enum
cpp_warning_reason.
(cpp_pedwarning_with_line): Likewise.
(cpp_warning_with_line_syshdr): Likewise.
(cpp_error_at): Convert int param to enum cpp_diagnostic_level.
* macro.c (create_iso_definition): Convert int to enum.
(_cpp_create_definition): Likewise.
From-SVN: r264999
Whilst investigating PR preprocessor/78324 I noticed that the
substring location code currently doesn't handle raw strings
correctly, by not skipping the 'R', opening quote, delimiter
and opening parenthesis.
For example, an attempt to underline chars 4-7 with caret at 6 of
this raw string yields this erroneous output:
__emit_string_literal_range (R"foo(0123456789)foo",
~~^~
With the patch, the correct range/caret is printed:
__emit_string_literal_range (R"foo(0123456789)foo",
~~^~
gcc/ChangeLog:
* input.c (selftest::test_lexer_string_locations_long_line): New
function.
(selftest::test_lexer_string_locations_raw_string_multiline): New
function.
(selftest::input_c_tests): Call the new functions, via
for_each_line_table_case.
gcc/testsuite/ChangeLog:
* gcc.dg/plugin/diagnostic-test-string-literals-1.c
(test_raw_string_one_liner): New function.
(test_raw_string_multiline): New function.
libcpp/ChangeLog:
* charset.c (cpp_interpret_string_1): Skip locations from
loc_reader when advancing 'p' when handling raw strings.
From-SVN: r242552
substring_loc::get_location currently fails for the final terminator
character in a STRING_CST from the C frontend, so that format_warning_va
falls back to using the location of the string as a whole.
This patch tweaks things [1] so that we use the final closing quote
as the location of the terminator character, as requested in
PR preprocessor/77672.
[1] specifically, cpp_interpret_string_1.
gcc/ChangeLog:
PR preprocessor/77672
* input.c (selftest::test_lexer_string_locations_simple): Update
test to expect location information of the terminator character
at the location of the final closing quote.
(selftest::test_lexer_string_locations_hex): Likewise.
(selftest::test_lexer_string_locations_oct): Likewise.
(selftest::test_lexer_string_locations_letter_escape_1): Likewise.
(selftest::test_lexer_string_locations_letter_escape_2): Likewise.
(selftest::test_lexer_string_locations_ucn4): Likewise.
(selftest::test_lexer_string_locations_ucn8): Likewise.
(selftest::test_lexer_string_locations_u8): Likewise.
(selftest::test_lexer_string_locations_utf8_source): Likewise.
(selftest::test_lexer_string_locations_concatenation_1): Likewise.
(selftest::test_lexer_string_locations_concatenation_2): Likewise.
(selftest::test_lexer_string_locations_concatenation_3): Likewise.
(selftest::test_lexer_string_locations_macro): Likewise.
(selftest::test_lexer_string_locations_long_line): Likewise.
gcc/testsuite/ChangeLog:
PR preprocessor/77672
* gcc.dg/plugin/diagnostic-test-string-literals-1.c
(test_terminator_location): New function.
libcpp/ChangeLog:
PR preprocessor/77672
* charset.c (cpp_interpret_string_1): Add a source_range for the
NUL-terminator, using the location of the trailing quote of the
final string.
From-SVN: r240434
libcpp/ChangeLog:
PR bootstrap/72823
* charset.c (_cpp_valid_ucn): Replace overzealous assert with one
that allows for char_range to be non-NULL when loc_reader is NULL.
From-SVN: r239211
gcc/c-family/ChangeLog:
* c-common.c: Include "substring-locations.h".
(get_cpp_ttype_from_string_type): New function.
(g_string_concat_db): New global.
(substring_loc::get_range): New method.
* c-common.h (g_string_concat_db): New declaration.
(class substring_loc): New class.
* c-lex.c (lex_string): When concatenating strings, capture the
locations of all tokens using a new obstack, and record the
concatenation locations within g_string_concat_db.
* c-opts.c (c_common_init_options): Construct g_string_concat_db
on the ggc-heap.
gcc/ChangeLog:
* input.c (string_concat::string_concat): New constructor.
(string_concat_db::string_concat_db): New constructor.
(string_concat_db::record_string_concatenation): New method.
(string_concat_db::get_string_concatenation): New method.
(string_concat_db::get_key_loc): New method.
(class auto_cpp_string_vec): New class.
(get_substring_ranges_for_loc): New function.
(get_source_range_for_substring): New function.
(get_num_source_ranges_for_substring): New function.
(class selftest::lexer_test_options): New class.
(struct selftest::lexer_test): New struct.
(class selftest::ebcdic_execution_charset): New class.
(selftest::ebcdic_execution_charset::s_singleton): New variable.
(selftest::lexer_test::lexer_test): New constructor.
(selftest::lexer_test::~lexer_test): New destructor.
(selftest::lexer_test::get_token): New method.
(selftest::assert_char_at_range): New function.
(ASSERT_CHAR_AT_RANGE): New macro.
(selftest::assert_num_substring_ranges): New function.
(ASSERT_NUM_SUBSTRING_RANGES): New macro.
(selftest::assert_has_no_substring_ranges): New function.
(ASSERT_HAS_NO_SUBSTRING_RANGES): New macro.
(selftest::test_lexer_string_locations_simple): New function.
(selftest::test_lexer_string_locations_ebcdic): New function.
(selftest::test_lexer_string_locations_hex): New function.
(selftest::test_lexer_string_locations_oct): New function.
(selftest::test_lexer_string_locations_letter_escape_1): New function.
(selftest::test_lexer_string_locations_letter_escape_2): New function.
(selftest::test_lexer_string_locations_ucn4): New function.
(selftest::test_lexer_string_locations_ucn8): New function.
(selftest::uint32_from_big_endian): New function.
(selftest::test_lexer_string_locations_wide_string): New function.
(selftest::uint16_from_big_endian): New function.
(selftest::test_lexer_string_locations_string16): New function.
(selftest::test_lexer_string_locations_string32): New function.
(selftest::test_lexer_string_locations_u8): New function.
(selftest::test_lexer_string_locations_utf8_source): New function.
(selftest::test_lexer_string_locations_concatenation_1): New
function.
(selftest::test_lexer_string_locations_concatenation_2): New
function.
(selftest::test_lexer_string_locations_concatenation_3): New
function.
(selftest::test_lexer_string_locations_macro): New function.
(selftest::test_lexer_string_locations_stringified_macro_argument):
New function.
(selftest::test_lexer_string_locations_non_string): New function.
(selftest::test_lexer_string_locations_long_line): New function.
(selftest::test_lexer_char_constants): New function.
(selftest::input_c_tests): Call the new test functions once per
case within the line_table test matrix.
* input.h (struct string_concat): New struct.
(struct location_hash): New struct.
(class string_concat_db): New class.
* substring-locations.h: New header.
gcc/testsuite/ChangeLog:
* gcc.dg/plugin/diagnostic-test-string-literals-1.c: New file.
* gcc.dg/plugin/diagnostic-test-string-literals-2.c: New file.
* gcc.dg/plugin/diagnostic_plugin_test_string_literals.c: New file.
* gcc.dg/plugin/plugin.exp (plugin_test_list): Add the above new files.
libcpp/ChangeLog:
* charset.c (cpp_substring_ranges::cpp_substring_ranges): New
constructor.
(cpp_substring_ranges::~cpp_substring_ranges): New destructor.
(cpp_substring_ranges::add_range): New method.
(cpp_substring_ranges::add_n_ranges): New method.
(_cpp_valid_ucn): Add "char_range" and "loc_reader" params; if
they are non-NULL, read position information from *loc_reader
and update char_range->m_finish accordingly.
(convert_ucn): Add "char_range", "loc_reader", and "ranges"
params. If loc_reader is non-NULL, read location information from
it, and update *ranges accordingly, using char_range.
Conditionalize the conversion into tbuf on tbuf being non-NULL.
(convert_hex): Likewise, conditionalizing the call to
emit_numeric_escape on tbuf.
(convert_oct): Likewise.
(convert_escape): Add params "loc_reader" and "ranges". If
loc_reader is non-NULL, read location information from it, and
update *ranges accordingly. Conditionalize the conversion into
tbuf on tbuf being non-NULL.
(cpp_interpret_string): Rename to...
(cpp_interpret_string_1): ...this, adding params "loc_readers" and
"out". Use "to" to conditionalize the initialization and usage of
"tbuf", such as running the converter. If "loc_readers" is
non-NULL, use the instances within it, reading location
information from them, and passing them to convert_escape; likewise
write to "out" if loc_readers is non-NULL. Check for leading
quote and issue an error if it is not present. Update boundary
check from "== limit" to ">= limit" to protect against erroneous
location values to calls that are not parsing string literals.
(cpp_interpret_string): Reimplement in terms to
cpp_interpret_string_1.
(noop_error_cb): New function.
(cpp_interpret_string_ranges): New function.
(cpp_string_location_reader::cpp_string_location_reader): New
constructor.
(cpp_string_location_reader::get_next): New method.
* include/cpplib.h (class cpp_string_location_reader): New class.
(class cpp_substring_ranges): New class.
(cpp_interpret_string_ranges): New prototype.
* internal.h (_cpp_valid_ucn): Add params "char_range" and
"loc_reader".
* lex.c (forms_identifier_p): Pass NULL for new params to
_cpp_valid_ucn.
From-SVN: r239175
PR c++/69628
* charset.c (cpp_interpret_charconst): Clear *PCHARS_SEEN
and *UNSIGNEDP if bailing out early due to errors.
* g++.dg/parse/pr69628.C: New test.
From-SVN: r233186
libcpp:
2014-11-29 John Schmerge <jbschmerge@gmail.com>
PR preprocessor/41698
* charset.c (one_utf8_to_utf16): Do not produce surrogate pairs
for 0xffff.
gcc/testsuite:
2014-11-29 Joseph Myers <joseph@codesourcery.com>
PR preprocessor/41698
* gcc/testsuite/g++.dg/cpp/utf16-pr41698-1.C: New test.
From-SVN: r218179
2014-10-02 Bernd Edlinger <bernd.edlinger@hotmail.de>
Jeff Law <law@redhat.com>
* charset.c (convert_no_conversion): Reallocate memory with 25%
headroom.
Co-Authored-By: Jeff Law <law@redhat.com>
From-SVN: r215785
* charset.c (conversion): Rename to ...
(cpp_conversion): ... this one; update.
* files.c (file_hash_entry): Rename to ...
(cpp_file_hash_entry): ... this one ; update.
From-SVN: r215482
gcc/testsuite:
* c-c++-common/cpp/ucnid-2011-1.c: New test.
libcpp:
* ucnid.tab: Add C11 and C11NOSTART data.
* makeucnid.c (digit): Rename enum value to N99.
(C11, N11, all_languages): New enum values.
(NUM_CODE_POINTS, MAX_CODE_POINT): New macros.
(flags, decomp, combining_value): Use NUM_CODE_POINTS as array
size.
(decomp): Use unsigned int as element type.
(all_decomp): New array.
(read_ucnid): Handle C11 and C11NOSTART. Use MAX_CODE_POINT.
(read_table): Use MAX_CODE_POINT. Store all decompositions in
all_decomp.
(read_derived): Use MAX_CODE_POINT.
(write_table): Use NUM_CODE_POINTS. Print N99, C11 and N11
flags. Print whole array variable declaration rather than just
array contents.
(char_id_valid, write_context_switch): New functions.
(main): Call write_context_switch.
* ucnid.h: Regenerate.
* include/cpplib.h (struct cpp_options): Add c11_identifiers.
* init.c (struct lang_flags): Add c11_identifiers.
(cpp_set_lang): Set c11_identifiers option from selected language.
* internal.h (struct normalize_state): Document "previous" as
previous starter character.
(NORMALIZE_STATE_UPDATE_IDNUM): Take character as argument.
* charset.c (DIG): Rename enum value to N99.
(C11, N11): New enum values.
(struct ucnrange): Give name to struct. Use short for flags and
unsigned int for end of range. Include ucnid.h for whole variable
declaration.
(ucn_valid_in_identifier): Allow for characters up to 0x10FFFF.
Allow for C11 in determining valid characters and valid start
characters. Use check_nfc for non-Hangul context-dependent
checks. Only store starter characters in nst->previous.
(_cpp_valid_ucn): Pass new argument to
NORMALIZE_STATE_UPDATE_IDNUM.
* lex.c (lex_identifier): Pass new argument to
NORMALIZE_STATE_UPDATE_IDNUM. Call NORMALIZE_STATE_UPDATE_IDNUM
after initial non-UCN part of identifier.
(lex_number): Pass new argument to NORMALIZE_STATE_UPDATE_IDNUM.
From-SVN: r204886
PR bootstrap/55380
PR other/54691
* files.c (read_file_guts): Allocate extra 16 bytes instead of
1 byte at the end of buf. Pass size + 16 instead of size
to _cpp_convert_input.
* charset.c (_cpp_convert_input): Reallocate if there aren't
at least 16 bytes beyond to.len in the buffer. Clear 16 bytes
at to.text + to.len.
From-SVN: r194102
More N3077 raw string changes
* charset.c (cpp_interpret_string): Don't transform UCNs in raw
strings.
* lex.c (bufring_append): Split out from...
(lex_raw_string): ...here. Undo trigraph and line splicing
transformations. Do process line notes in multi-line literals.
(_cpp_process_line_notes): Ignore notes that were already handled.
From-SVN: r157804
Some raw string changes from N3077
* charset.c (cpp_interpret_string): Change inner delimiters to ().
* lex.c (lex_raw_string): Likewise. Also disallow '\' in delimiter.
From-SVN: r157797
2008-06-12 H.J. Lu <hongjiu.lu@intel.com>
PR preprocessor/36479
* charset.c (cpp_interpret_string_notranslate): Also set
narrow_cset_desc.width.
From-SVN: r136714
libcpp/ChangeLog:
2008-04-14 Kris Van Hees <kris.van.hees@oracle.com>
* include/cpp-id-data.h (UC): Was U, conflicts with U... literal.
* include/cpplib.h (CHAR16, CHAR32, STRING16, STRING32): New tokens.
(struct cpp_options): Added uliterals.
(cpp_interpret_string): Update prototype.
(cpp_interpret_string_notranslate): Idem.
* charset.c (init_iconv_desc): New width member in cset_converter.
(cpp_init_iconv): Add support for char{16,32}_cset_desc.
(convert_ucn): Idem.
(emit_numeric_escape): Idem.
(convert_hex): Idem.
(convert_oct): Idem.
(convert_escape): Idem.
(converter_for_type): New function.
(cpp_interpret_string): Use converter_for_type, support u and U prefix.
(cpp_interpret_string_notranslate): Match changed prototype.
(wide_str_to_charconst): Use converter_for_type.
(cpp_interpret_charconst): Add support for CPP_CHAR{16,32}.
* directives.c (linemarker_dir): Macro U changed to UC.
(parse_include): Idem.
(register_pragma_1): Idem.
(restore_registered_pragmas): Idem.
(get__Pragma_string): Support CPP_STRING{16,32}.
* expr.c (eval_token): Support CPP_CHAR{16,32}.
* init.c (struct lang_flags): Added uliterals.
(lang_defaults): Idem.
* internal.h (struct cset_converter) <width>: New field.
(struct cpp_reader) <char16_cset_desc>: Idem.
(struct cpp_reader) <char32_cset_desc>: Idem.
* lex.c (digraph_spellings): Macro U changed to UC.
(OP, TK): Idem.
(lex_string): Add support for u'...', U'...', u... and U....
(_cpp_lex_direct): Idem.
* macro.c (_cpp_builtin_macro_text): Macro U changed to UC.
(stringify_arg): Support CPP_CHAR{16,32} and CPP_STRING{16,32}.
gcc/ChangeLog:
2008-04-14 Kris Van Hees <kris.van.hees@oracle.com>
* c-common.c (CHAR16_TYPE, CHAR32_TYPE): New macros.
(fname_as_string): Match updated cpp_interpret_string prototype.
(fix_string_type): Support char16_t* and char32_t*.
(c_common_nodes_and_builtins): Add char16_t and char32_t (and
derivative) nodes. Register as builtin if C++0x.
(c_parse_error): Support CPP_CHAR{16,32}.
* c-common.h (RID_CHAR16, RID_CHAR32): New elements.
(enum c_tree_index) <CTI_CHAR16_TYPE, CTI_SIGNED_CHAR16_TYPE,
CTI_UNSIGNED_CHAR16_TYPE, CTI_CHAR32_TYPE, CTI_SIGNED_CHAR32_TYPE,
CTI_UNSIGNED_CHAR32_TYPE, CTI_CHAR16_ARRAY_TYPE,
CTI_CHAR32_ARRAY_TYPE>: New elements.
(char16_type_node, signed_char16_type_node, unsigned_char16_type_node,
char32_type_node, signed_char32_type_node, char16_array_type_node,
char32_array_type_node): New defines.
* c-lex.c (cb_ident): Match updated cpp_interpret_string prototype.
(c_lex_with_flags): Support CPP_CHAR{16,32} and CPP_STRING{16,32}.
(lex_string): Support CPP_STRING{16,32}, match updated
cpp_interpret_string and cpp_interpret_string_notranslate prototypes.
(lex_charconst): Support CPP_CHAR{16,32}.
* c-parser.c (c_parser_postfix_expression): Support CPP_CHAR{16,32}
and CPP_STRING{16,32}.
gcc/cp/ChangeLog:
2008-04-14 Kris Van Hees <kris.van.hees@oracle.com>
* cvt.c (type_promotes_to): Support char16_t and char32_t.
* decl.c (grokdeclarator): Disallow signed/unsigned/short/long on
char16_t and char32_t.
* lex.c (reswords): Add char16_t and char32_t (for c++0x).
* mangle.c (write_builtin_type): Mangle char16_t/char32_t as vendor
extended builtin type u8char32_t.
* parser.c (cp_lexer_next_token_is_decl_specifier_keyword): Support
RID_CHAR{16,32}.
(cp_lexer_print_token): Support CPP_STRING{16,32}.
(cp_parser_is_string_literal): Idem.
(cp_parser_string_literal): Idem.
(cp_parser_primary_expression): Support CPP_CHAR{16,32} and
CPP_STRING{16,32}.
(cp_parser_simple_type_specifier): Support RID_CHAR{16,32}.
* tree.c (char_type_p): Support char16_t and char32_t as char types.
* typeck.c (string_conv_p): Support char16_t and char32_t.
gcc/testsuite/ChangeLog:
2008-04-14 Kris Van Hees <kris.van.hees@oracle.com>
Tests for char16_t and char32_t support.
* g++.dg/ext/utf-cvt.C: New
* g++.dg/ext/utf-cxx0x.C: New
* g++.dg/ext/utf-cxx98.C: New
* g++.dg/ext/utf-dflt.C: New
* g++.dg/ext/utf-gnuxx0x.C: New
* g++.dg/ext/utf-gnuxx98.C: New
* g++.dg/ext/utf-mangle.C: New
* g++.dg/ext/utf-typedef-cxx0x.C: New
* g++.dg/ext/utf-typedef-
* g++.dg/ext/utf-typespec.C: New
* g++.dg/ext/utf16-1.C: New
* g++.dg/ext/utf16-2.C: New
* g++.dg/ext/utf16-3.C: New
* g++.dg/ext/utf16-4.C: New
* g++.dg/ext/utf32-1.C: New
* g++.dg/ext/utf32-2.C: New
* g++.dg/ext/utf32-3.C: New
* g++.dg/ext/utf32-4.C: New
* gcc.dg/utf-cvt.c: New
* gcc.dg/utf-dflt.c: New
* gcc.dg/utf16-1.c: New
* gcc.dg/utf16-2.c: New
* gcc.dg/utf16-3.c: New
* gcc.dg/utf16-4.c: New
* gcc.dg/utf32-1.c: New
* gcc.dg/utf32-2.c: New
* gcc.dg/utf32-3.c: New
* gcc.dg/utf32-4.c: New
libiberty/ChangeLog:
2008-04-14 Kris Van Hees <kris.van.hees@oracle.com>
* testsuite/demangle-expected: Added tests for char16_t and char32_t.
From-SVN: r134438
* configure.ac: Check declarations for asprintf and vasprintf.
* config.in: Regenerate.
* configure: Likewise.
* charset.c (conversion_loop): Use XRESIZEVEC.
(convert_no_conversion): Likewise.
(convert_using_iconv): Likewise.
(init_iconv_desc): Cast return value of alloca.
(cpp_host_to_exec_charset): Use XNEWVEC.
(emit_numeric_escape): Use XRESIZEVEC.
(cpp_interpret_string): Use XNEWVEC.
(cpp_interpret_string): Use XRESIZEVEC.
(_cpp_interpret_identifier): Cast return value of alloca.
(_cpp_convert_input): Use XNEWVEC and XRESIZEVEC.
* directives.c (glue_header_name): Use XNEWVEC and XRESIZEVEC.
(parse_include): Use XNEWVEC.
(insert_pragma_entry): Rename local variable "new" to
"new_entry".
(save_registered_pragmas): Cast return value of xmemdup.
(destringize_and_run): Same for alloca.
(parse_assertion): Likewise.
(do_assert): Cast allocated storage to proper type.
(cpp_define): Likewise.
(_cpp_define_builtin): Likewise.
(cpp_undef): Likewise.
(handle_assertion): Likewise.
(cpp_push_buffer): Rename local variable "new" to "new_buffer".
* expr.c (CPP_UPLUS): Cast value to type cpp_ttype.
(CPP_UMINUS): Likewise.
(struct cpp_operator): Rename from struct operator.
(_cpp_expand_op_stack): Use XRESIZEVEC.
* files.c (pch_open_file): Use XNEWVEC.
(pch_open_file): Use XRESIZEVEC.
(read_file_guts): Use XNEWVEC and XRESIZEVEC.
(dir_name_of_file): Use XNEWVEC.
(make_cpp_file): Use XCNEW.
(make_cpp_dir): Likewise.
(allocate_file_hash_entries): USE XNEWVEC.
(cpp_included): Cast return value of htab_find_with_hash.
(append_file_to_dir): Use XNEWVEC.
(read_filename_string): Likewise. Use XRESIZEVEC too.
(read_name_map): Cast return value of alloca. Use XRESIZEVEC.
(remap_filename): Use XNEWVEC.
(struct pchf_entry): Move definition out of struct pchf_data.
(_cpp_save_file_entries): Use XCNEWVAR.
(_cpp_read_file_entries): Use XNEWVAR.
* identifiers.c (alloc_node): Use XOBNEW.
* init.c (cpp_create_reader): Use XCNEW.
(cpp_init_builtins): Cast of b->value to enum builtin_type.
(read_original_directory): Cast return value of alloca.
* lex.c (add_line_note): Use XRESIZEVEC.
(warn_about_normalization): Use XNEWVEC.
(_cpp_lex_direct): Cast node->directive_index to (enum cpp_ttype).
(new_buff): Use XNEWVEC.
* line-map.c (linemap_add): Use XRESIZEVEC.
* macro.c (builtin_macro): Cast return value of alloca.
(paste_tokens): Likewise.
(expand_arg): Use XNEWVEC and XRESIZEVEC.
(_cpp_save_parameter): Use XRESIZEVEC.
(create_iso_definition): Cast allocated storage to proper type.
(_cpp_create_definition): Likewise.
(cpp_macro_definition): Use XRESIZEVEC.
* makedepend.c (add_clm): Use XNEW.
(add_dir): Likewise.
* mkdeps.c (munge): Use XNEWVEC.
(deps_init): Use XCNEW.
(deps_add_target): Use XRESIZEVEC.
(deps_add_default_target): Cast return value of alloca.
(deps_add_dep): Use XRESIZEVEC.
(deps_add_vpath): Likewise. Use XNEWVEC too.
(deps_restore): Likewise.
* pch.c (save_idents): Use XNEW and XNEWVEC.
(cpp_save_state): Use XNEW.
(count_defs): Cast return value of htab_find.
(write_defs): Likewise.
(cpp_write_pch_deps): Use XNEWVEC.
(collect_ht_nodes): Use XRESIZEVEC.
(cpp_valid_state): Use XNEWVEC.
(save_macros): Use XRESIZEVEC. Cast return value of xmemdup.
* symtab.c (ht_create): Use XCNEW.
(ht_lookup_with_hash): Cast return value of obstack_copy0.
(ht_expand): Use XCNEWVEC.
* system.h (HAVE_DESIGNATED_INITIALIZERS): False if __cplusplus.
(bool): Do not define if __cplusplus.
From-SVN: r100295