64 Commits

Author SHA1 Message Date
David Malcolm
bd5e882cf6 diagnostics: escape non-ASCII source bytes for certain diagnostics
This patch adds support to GCC's diagnostic subsystem for escaping certain
bytes and Unicode characters when quoting source code.

Specifically, this patch adds a new flag rich_location::m_escape_on_output
which is a hint from a diagnostic that non-ASCII bytes in the pertinent
lines of the user's source code should be escaped when printed.

The patch sets this for the following diagnostics:
- when complaining about stray bytes in the program (when these
are non-printable)
- when complaining about "null character(s) ignored");
- for -Wnormalized= (and generate source ranges for such warnings)

The escaping is controlled by a new option:
  -fdiagnostics-escape-format=[unicode|bytes]

For example, consider a diagnostic involing a source line containing the
string "before" followed by the Unicode character U+03C0 ("GREEK SMALL
LETTER PI", with UTF-8 encoding 0xCF 0x80) followed by the byte 0xBF
(a stray UTF-8 trailing byte), followed by the string "after", where the
diagnostic highlights the U+03C0 character.

By default, this line will be printed verbatim to the user when
reporting a diagnostic at it, as:

 beforeπXafter
       ^

(using X for the stray byte to avoid putting invalid UTF-8 in this
commit message)

If the diagnostic sets the "escape" flag, it will be printed as:

 before<U+03C0><BF>after
       ^~~~~~~~

with -fdiagnostics-escape-format=unicode (the default), or as:

  before<CF><80><BF>after
        ^~~~~~~~

if the user supplies -fdiagnostics-escape-format=bytes.

This only affects how the source is printed; it does not affect
how column numbers that are printed (as per -fdiagnostics-column-unit=
and -fdiagnostics-column-origin=).

gcc/c-family/ChangeLog:
	* c-lex.c (c_lex_with_flags): When complaining about non-printable
	CPP_OTHER tokens, set the "escape on output" flag.

gcc/ChangeLog:
	* common.opt (fdiagnostics-escape-format=): New.
	(diagnostics_escape_format): New enum.
	(DIAGNOSTICS_ESCAPE_FORMAT_UNICODE): New enum value.
	(DIAGNOSTICS_ESCAPE_FORMAT_BYTES): Likewise.
	* diagnostic-format-json.cc (json_end_diagnostic): Add
	"escape-source" attribute.
	* diagnostic-show-locus.c
	(exploc_with_display_col::exploc_with_display_col): Replace
	"tabstop" param with a cpp_char_column_policy and add an "aspect"
	param.  Use these to compute m_display_col accordingly.
	(struct char_display_policy): New struct.
	(layout::m_policy): New field.
	(layout::m_escape_on_output): New field.
	(def_policy): New function.
	(make_range): Update for changes to exploc_with_display_col ctor.
	(default_print_decoded_ch): New.
	(width_per_escaped_byte): New.
	(escape_as_bytes_width): New.
	(escape_as_bytes_print): New.
	(escape_as_unicode_width): New.
	(escape_as_unicode_print): New.
	(make_policy): New.
	(layout::layout): Initialize new fields.  Update m_exploc ctor
	call for above change to ctor.
	(layout::maybe_add_location_range): Update for changes to
	exploc_with_display_col ctor.
	(layout::calculate_x_offset_display): Update for change to
	cpp_display_width.
	(layout::print_source_line): Pass policy
	to cpp_display_width_computation. Capture cpp_decoded_char when
	calling process_next_codepoint.  Move printing of source code to
	m_policy.m_print_cb.
	(line_label::line_label): Pass in policy rather than context.
	(layout::print_any_labels): Update for change to line_label ctor.
	(get_affected_range): Pass in policy rather than context, updating
	calls to location_compute_display_column accordingly.
	(get_printed_columns): Likewise, also for cpp_display_width.
	(correction::correction): Pass in policy rather than tabstop.
	(correction::compute_display_cols): Pass m_policy rather than
	m_tabstop to cpp_display_width.
	(correction::m_tabstop): Replace with...
	(correction::m_policy): ...this.
	(line_corrections::line_corrections): Pass in policy rather than
	context.
	(line_corrections::m_context): Replace with...
	(line_corrections::m_policy): ...this.
	(line_corrections::add_hint): Update to use m_policy rather than
	m_context.
	(line_corrections::add_hint): Likewise.
	(layout::print_trailing_fixits): Likewise.
	(selftest::test_display_widths): New.
	(selftest::test_layout_x_offset_display_utf8): Update to use
	policy rather than tabstop.
	(selftest::test_one_liner_labels_utf8): Add test of escaping
	source lines.
	(selftest::test_diagnostic_show_locus_one_liner_utf8): Update to
	use policy rather than tabstop.
	(selftest::test_overlapped_fixit_printing): Likewise.
	(selftest::test_overlapped_fixit_printing_utf8): Likewise.
	(selftest::test_overlapped_fixit_printing_2): Likewise.
	(selftest::test_tab_expansion): Likewise.
	(selftest::test_escaping_bytes_1): New.
	(selftest::test_escaping_bytes_2): New.
	(selftest::diagnostic_show_locus_c_tests): Call the new tests.
	* diagnostic.c (diagnostic_initialize): Initialize
	context->escape_format.
	(convert_column_unit): Update to use default character width policy.
	(selftest::test_diagnostic_get_location_text): Likewise.
	* diagnostic.h (enum diagnostics_escape_format): New enum.
	(diagnostic_context::escape_format): New field.
	* doc/invoke.texi (-fdiagnostics-escape-format=): New option.
	(-fdiagnostics-format=): Add "escape-source" attribute to examples
	of JSON output, and document it.
	* input.c (location_compute_display_column): Pass in "policy"
	rather than "tabstop", passing to
	cpp_byte_column_to_display_column.
	(selftest::test_cpp_utf8): Update to use cpp_char_column_policy.
	* input.h (class cpp_char_column_policy): New forward decl.
	(location_compute_display_column): Pass in "policy" rather than
	"tabstop".
	* opts.c (common_handle_option): Handle
	OPT_fdiagnostics_escape_format_.
	* selftest.c (temp_source_file::temp_source_file): New ctor
	overload taking a size_t.
	* selftest.h (temp_source_file::temp_source_file): Likewise.

gcc/testsuite/ChangeLog:
	* c-c++-common/diagnostic-format-json-1.c: Add regexp to consume
	"escape-source" attribute.
	* c-c++-common/diagnostic-format-json-2.c: Likewise.
	* c-c++-common/diagnostic-format-json-3.c: Likewise.
	* c-c++-common/diagnostic-format-json-4.c: Likewise, twice.
	* c-c++-common/diagnostic-format-json-5.c: Likewise.
	* gcc.dg/cpp/warn-normalized-4-bytes.c: New test.
	* gcc.dg/cpp/warn-normalized-4-unicode.c: New test.
	* gcc.dg/encoding-issues-bytes.c: New test.
	* gcc.dg/encoding-issues-unicode.c: New test.
	* gfortran.dg/diagnostic-format-json-1.F90: Add regexp to consume
	"escape-source" attribute.
	* gfortran.dg/diagnostic-format-json-2.F90: Likewise.
	* gfortran.dg/diagnostic-format-json-3.F90: Likewise.

libcpp/ChangeLog:
	* charset.c (convert_escape): Use encoding_rich_location when
	complaining about nonprintable unknown escape sequences.
	(cpp_display_width_computation::::cpp_display_width_computation):
	Pass in policy rather than tabstop.
	(cpp_display_width_computation::process_next_codepoint): Add "out"
	param and populate *out if non-NULL.
	(cpp_display_width_computation::advance_display_cols): Pass NULL
	to process_next_codepoint.
	(cpp_byte_column_to_display_column): Pass in policy rather than
	tabstop.  Pass NULL to process_next_codepoint.
	(cpp_display_column_to_byte_column): Pass in policy rather than
	tabstop.
	* errors.c (cpp_diagnostic_get_current_location): New function,
	splitting out the logic from...
	(cpp_diagnostic): ...here.
	(cpp_warning_at): New function.
	(cpp_pedwarning_at): New function.
	* include/cpplib.h (cpp_warning_at): New decl for rich_location.
	(cpp_pedwarning_at): Likewise.
	(struct cpp_decoded_char): New.
	(struct cpp_char_column_policy): New.
	(cpp_display_width_computation::cpp_display_width_computation):
	Replace "tabstop" param with "policy".
	(cpp_display_width_computation::process_next_codepoint): Add "out"
	param.
	(cpp_display_width_computation::m_tabstop): Replace with...
	(cpp_display_width_computation::m_policy): ...this.
	(cpp_byte_column_to_display_column): Replace "tabstop" param with
	"policy".
	(cpp_display_width): Likewise.
	(cpp_display_column_to_byte_column): Likewise.
	* include/line-map.h (rich_location::escape_on_output_p): New.
	(rich_location::set_escape_on_output): New.
	(rich_location::m_escape_on_output): New.
	* internal.h (cpp_diagnostic_get_current_location): New decl.
	(class encoding_rich_location): New.
	* lex.c (skip_whitespace): Use encoding_rich_location when
	complaining about null characters.
	(warn_about_normalization): Generate a source range when
	complaining about improperly normalized tokens, rather than just a
	point, and use encoding_rich_location so that the source code
	is escaped on printing.
	* line-map.c (rich_location::rich_location): Initialize
	m_escape_on_output.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2021-11-01 09:35:46 -04:00
Martin Liska
724e27046b Remove unused but set variables.
Reported by clang13 -Wunused-but-set-variable:

gcc/ChangeLog:

	* dbgcnt.c (dbg_cnt_process_opt): Remove unused but set variable.
	* gcov.c (get_cycles_count): Likewise.
	* lto-compress.c (lto_compression_zlib): Likewise.
	(lto_uncompression_zlib): Likewise.
	* targhooks.c (default_pch_valid_p): Likewise.

libcpp/ChangeLog:

	* charset.c (convert_oct): Remove unused but set variable.
2021-10-18 10:16:46 +02:00
Jakub Jelinek
c4d6dcacfc libcpp: Implement C++23 P1949R7 - C++ Identifier Syntax using Unicode Standard Annex 31
The following patch implements the
P1949R7 - C++ Identifier Syntax using Unicode Standard Annex 31
paper.  We already allow UTF-8 characters in the source, so that part
is already implemented, so IMHO all we need to do is pedwarn instead of
just warn for the (default) -Wnormalize=nfc (or for -Wnormalize={id,nkfc})
if the character is not in NFC and to use the unicode XID_Start and
XID_Continue derived code properties to find out what characters are allowed
(the standard actually adds U+005F to XID_Start, but we are handling the
ASCII compatible characters differently already and they aren't allowed
in UCNs in identifiers).  Instead of hardcoding the large tables
in ucnid.tab, this patch makes makeucnid.c read them from the Unicode
tables (13.0.0 version at this point).

For non-pedantic mode, we accept as 2nd+ char in identifiers a union
of valid characters in all supported modes, but for the 1st char it
was actually pedantically requiring that it is not any of the characters
that may not appear in the currently chosen standard as the first character.
This patch changes it such that also what is allowed at the start of an
identifier is a union of characters valid at the start of an identifier
in any of the pedantic modes.

2021-09-01  Jakub Jelinek  <jakub@redhat.com>

	PR c++/100977
libcpp/
	* include/cpplib.h (struct cpp_options): Add cxx23_identifiers.
	* charset.c (CXX23, NXX23): New enumerators.
	(CID, NFC, NKC, CTX): Renumber.
	(ucn_valid_in_identifier): Implement P1949R7 - use CXX23 and
	NXX23 flags for cxx23_identifiers.  For start character in
	non-pedantic mode, allow characters that are allowed as start
	characters in any of the supported language modes, rather than
	disallowing characters allowed only as non-start characters in
	current mode but for characters from other language modes allowing
	them even if they are never allowed at start.
	* init.c (struct lang_flags): Add cxx23_identifiers.
	(lang_defaults): Add cxx23_identifiers column.
	(cpp_set_lang): Initialize CPP_OPTION (pfile, cxx23_identifiers).
	* lex.c (warn_about_normalization): If cxx23_identifiers, use
	cpp_pedwarning_with_line instead of cpp_warning_with_line for
	"is not in NFC" diagnostics.
	* makeucnid.c: Adjust usage comment.
	(CXX23, NXX23): New enumerators.
	(all_languages): Add CXX23.
	(not_NFC, not_NFKC, maybe_not_NFC): Renumber.
	(read_derivedcore): New function.
	(write_table): Print also CXX23 and NXX23 columns.
	(main): Require 5 arguments instead of 4, call read_derivedcore.
	* ucnid.h: Regenerated using Unicode 13.0.0 files.
gcc/testsuite/
	* g++.dg/cpp23/normalize1.C: New test.
	* g++.dg/cpp23/normalize2.C: New test.
	* g++.dg/cpp23/normalize3.C: New test.
	* g++.dg/cpp23/normalize4.C: New test.
	* g++.dg/cpp23/normalize5.C: New test.
	* g++.dg/cpp23/normalize6.C: New test.
	* g++.dg/cpp23/normalize7.C: New test.
	* g++.dg/cpp23/ucnid-1-utf8.C: New test.
	* g++.dg/cpp23/ucnid-2-utf8.C: New test.
	* gcc.dg/cpp/ucnid-4.c: Don't expect
	"not valid at the start of an identifier" errors.
	* gcc.dg/cpp/ucnid-4-utf8.c: Likewise.
	* gcc.dg/cpp/ucnid-5-utf8.c: New test.
2021-09-01 22:33:06 +02:00
Lewis Hyatt
3ac6b5cff1 diagnostics: Support for -finput-charset [PR93067]
Adds the logic to handle -finput-charset in layout_get_source_line(), so that
source lines are converted from their input encodings prior to being output by
diagnostics machinery. Also adds the ability to strip a UTF-8 BOM similarly.

gcc/c-family/ChangeLog:

	PR other/93067
	* c-opts.c (c_common_input_charset_cb): New function.
	(c_common_post_options): Call new function
	diagnostic_initialize_input_context().

gcc/d/ChangeLog:

	PR other/93067
	* d-lang.cc (d_input_charset_callback): New function.
	(d_init): Call new function
	diagnostic_initialize_input_context().

gcc/fortran/ChangeLog:

	PR other/93067
	* cpp.c (gfc_cpp_post_options): Call new function
	diagnostic_initialize_input_context().

gcc/ChangeLog:

	PR other/93067
	* coretypes.h (typedef diagnostic_input_charset_callback): Declare.
	* diagnostic.c (diagnostic_initialize_input_context): New function.
	* diagnostic.h (diagnostic_initialize_input_context): Declare.
	* input.c (default_charset_callback): New function.
	(file_cache::initialize_input_context): New function.
	(file_cache_slot::create): Added ability to convert the input
	according to the input context.
	(file_cache::file_cache): Initialize the new input context.
	(class file_cache_slot): Added new m_alloc_offset member.
	(file_cache_slot::file_cache_slot): Initialize the new member.
	(file_cache_slot::~file_cache_slot): Handle potentially offset buffer.
	(file_cache_slot::maybe_grow): Likewise.
	(file_cache_slot::needs_read_p): Handle NULL fp, which is now possible.
	(file_cache_slot::get_next_line): Likewise.
	* input.h (class file_cache): Added input context member.

libcpp/ChangeLog:

	PR other/93067
	* charset.c (init_iconv_desc): Adapt to permit PFILE argument to
	be NULL.
	(_cpp_convert_input): Likewise. Also move UTF-8 BOM logic to...
	(cpp_check_utf8_bom): ...here.  New function.
	(cpp_input_conversion_is_trivial): New function.
	* files.c (read_file_guts): Allow PFILE argument to be NULL.  Add
	INPUT_CHARSET argument as an alternate source of this information.
	(read_file): Pass the new argument to read_file_guts.
	(cpp_get_converted_source): New function.
	* include/cpplib.h (struct cpp_converted_source): Declare.
	(cpp_get_converted_source): Declare.
	(cpp_input_conversion_is_trivial): Declare.
	(cpp_check_utf8_bom): Declare.

gcc/testsuite/ChangeLog:

	PR other/93067
	* gcc.dg/diagnostic-input-charset-1.c: New test.
	* gcc.dg/diagnostic-input-utf8-bom.c: New test.
2021-08-25 11:15:28 -04:00
Jakub Jelinek
99dee82307 Update copyright years. 2021-01-04 10:26:59 +01:00
JeanHeyd Meneide
eccec86841 Feature: Macros for identifying the wide and narrow execution string literal encoding
gcc/c-family
	* c-cppbuiltin.c (c_cpp_builtins): Add predefined
	{__GNUC_EXECUTION_CHARSET_NAME} and
	_WIDE_EXECUTION_CHARSET_NAME} macros.

gcc/
	* doc/cpp.texi: Document new macros.

gcc/testsuite/
	* c-c++-common/cpp/wide-narrow-predef-macros.c: New test.

libcpp/
	* charset.c (init_iconv_desc): Initialize "to" and "from" fields.
	* directives.c (cpp_get_narrow_charset_name): New function.
	(cpp_get_wide_charset_name): Likewise.
	* include/cpplib.h (cpp_get_narrow_charset_name): Prototype.
	(cpp_get_wide_charset_name): Likewise.
	* internal.h (cset_converter): Add "to" and "from" fields.
2020-12-01 14:46:51 -07:00
Lewis Hyatt
004bb936d6 diagnostics: Support conversion of tabs to spaces [PR49973] [PR86904]
Supports conversion of tabs to spaces when outputting diagnostics. Also
adds -fdiagnostics-column-unit and -fdiagnostics-column-origin options to
control how the column number is output, thereby resolving the two PRs.

gcc/c-family/ChangeLog:

	PR other/86904
	* c-indentation.c (should_warn_for_misleading_indentation): Get
	global tabstop from the new source.
	* c-opts.c (c_common_handle_option): Remove handling of -ftabstop, which
	is now a common option.
	* c.opt: Likewise.

gcc/ChangeLog:

	PR preprocessor/49973
	PR other/86904
	* common.opt: Handle -ftabstop here instead of in c-family
	options.  Add -fdiagnostics-column-unit= and
	-fdiagnostics-column-origin= options.
	* opts.c (common_handle_option): Handle the new options.
	* diagnostic-format-json.cc (json_from_expanded_location): Add
	diagnostic_context argument.  Use it to convert column numbers as per
	the new options.
	(json_from_location_range): Likewise.
	(json_from_fixit_hint): Likewise.
	(json_end_diagnostic): Pass the new context argument to helper
	functions above.  Add "column-origin" field to the output.
	(test_unknown_location): Add the new context argument to calls to
	helper functions.
	(test_bad_endpoints): Likewise.
	* diagnostic-show-locus.c
	(exploc_with_display_col::exploc_with_display_col): Support
	tabstop parameter.
	(layout_point::layout_point): Make use of class
	exploc_with_display_col.
	(layout_range::layout_range): Likewise.
	(struct line_bounds): Clarify that the units are now always
	display columns.  Rename members accordingly.  Add constructor.
	(layout::print_source_line): Add support for tab expansion.
	(make_range): Adapt to class layout_range changes.
	(layout::maybe_add_location_range): Likewise.
	(layout::layout): Adapt to class exploc_with_display_col changes.
	(layout::calculate_x_offset_display): Support tabstop parameter.
	(layout::print_annotation_line): Adapt to struct line_bounds changes.
	(layout::print_line): Likewise.
	(line_label::line_label): Add diagnostic_context argument.
	(get_affected_range): Likewise.
	(get_printed_columns): Likewise.
	(layout::print_any_labels): Adapt to struct line_label changes.
	(class correction): Add m_tabstop member.
	(correction::correction): Add tabstop argument.
	(correction::compute_display_cols): Use m_tabstop.
	(class line_corrections): Add m_context member.
	(line_corrections::line_corrections): Add diagnostic_context argument.
	(line_corrections::add_hint): Use m_context to handle tabstops.
	(layout::print_trailing_fixits): Adapt to class line_corrections
	changes.
	(test_layout_x_offset_display_utf8): Support tabstop parameter.
	(test_layout_x_offset_display_tab): New selftest.
	(test_one_liner_colorized_utf8): Likewise.
	(test_tab_expansion): Likewise.
	(test_diagnostic_show_locus_one_liner_utf8): Call the new tests.
	(diagnostic_show_locus_c_tests): Likewise.
	(test_overlapped_fixit_printing): Adapt to helper class and
	function changes.
	(test_overlapped_fixit_printing_utf8): Likewise.
	(test_overlapped_fixit_printing_2): Likewise.
	* diagnostic.h (enum diagnostics_column_unit): New enum.
	(struct diagnostic_context): Add members for the new options.
	(diagnostic_converted_column): Declare.
	(json_from_expanded_location): Add new context argument.
	* diagnostic.c (diagnostic_initialize): Initialize new members.
	(diagnostic_converted_column): New function.
	(maybe_line_and_column): Be willing to output a column of 0.
	(diagnostic_get_location_text): Convert column number as per the new
	options.
	(diagnostic_report_current_module): Likewise.
	(assert_location_text): Add origin and column_unit arguments for
	testing the new functionality.
	(test_diagnostic_get_location_text): Test the new functionality.
	* doc/invoke.texi: Document the new options and behavior.
	* input.h (location_compute_display_column): Add tabstop argument.
	* input.c (location_compute_display_column): Likewise.
	(test_cpp_utf8): Add selftests for tab expansion.
	* tree-diagnostic-path.cc (default_tree_make_json_for_path): Pass the
	new context argument to json_from_expanded_location().

libcpp/ChangeLog:

	PR preprocessor/49973
	PR other/86904
	* include/cpplib.h (struct cpp_options):  Removed support for -ftabstop,
	which is now handled by diagnostic_context.
	(class cpp_display_width_computation): New class.
	(cpp_byte_column_to_display_column): Add optional tabstop argument.
	(cpp_display_width): Likewise.
	(cpp_display_column_to_byte_column): Likewise.
	* charset.c
	(cpp_display_width_computation::cpp_display_width_computation): New
	function.
	(cpp_display_width_computation::advance_display_cols): Likewise.
	(compute_next_display_width): Removed and implemented this
	functionality in a new function...
	(cpp_display_width_computation::process_next_codepoint): ...here.
	(cpp_byte_column_to_display_column): Added tabstop argument.
	Reimplemented in terms of class cpp_display_width_computation.
	(cpp_display_column_to_byte_column): Likewise.
	* init.c (cpp_create_reader): Remove handling of -ftabstop, which is now
	handled by diagnostic_context.

gcc/testsuite/ChangeLog:

	PR preprocessor/49973
	PR other/86904
	* c-c++-common/Wmisleading-indentation-3.c: Adjust expected output
	for new defaults.
	* c-c++-common/Wmisleading-indentation.c: Likewise.
	* c-c++-common/diagnostic-format-json-1.c: Likewise.
	* c-c++-common/diagnostic-format-json-2.c: Likewise.
	* c-c++-common/diagnostic-format-json-3.c: Likewise.
	* c-c++-common/diagnostic-format-json-4.c: Likewise.
	* c-c++-common/diagnostic-format-json-5.c: Likewise.
	* c-c++-common/missing-close-symbol.c: Likewise.
	* g++.dg/diagnostic/bad-binary-ops.C: Likewise.
	* g++.dg/parse/error4.C: Likewise.
	* g++.old-deja/g++.brendan/crash11.C: Likewise.
	* g++.old-deja/g++.pt/overload2.C: Likewise.
	* g++.old-deja/g++.robertl/eb109.C: Likewise.
	* gcc.dg/analyzer/malloc-paths-9.c: Likewise.
	* gcc.dg/bad-binary-ops.c: Likewise.
	* gcc.dg/format/branch-1.c: Likewise.
	* gcc.dg/format/pr79210.c: Likewise.
	* gcc.dg/plugin/diagnostic-test-expressions-1.c: Likewise.
	* gcc.dg/plugin/diagnostic-test-string-literals-1.c: Likewise.
	* gcc.dg/redecl-4.c: Likewise.
	* gfortran.dg/diagnostic-format-json-1.F90: Likewise.
	* gfortran.dg/diagnostic-format-json-2.F90: Likewise.
	* gfortran.dg/diagnostic-format-json-3.F90: Likewise.
	* go.dg/arrayclear.go: Add a comment explaining why adding a
	comment was necessary to work around a dejagnu bug.
	* c-c++-common/diagnostic-units-1.c: New test.
	* c-c++-common/diagnostic-units-2.c: New test.
	* c-c++-common/diagnostic-units-3.c: New test.
	* c-c++-common/diagnostic-units-4.c: New test.
	* c-c++-common/diagnostic-units-5.c: New test.
	* c-c++-common/diagnostic-units-6.c: New test.
	* c-c++-common/diagnostic-units-7.c: New test.
	* c-c++-common/diagnostic-units-8.c: New test.
2020-07-14 12:05:56 -04:00
Jason Merrill
b04445d4a8 c++: Replace "C++2a" with "C++20".
C++20 isn't final quite yet, but all that remains is formalities, so let's
go ahead and change all the references.

I think for the next C++ standard we can just call it C++23 rather than
C++2b, since the committee has been consistent about time-based releases
rather than feature-based.

gcc/c-family/ChangeLog
2020-05-13  Jason Merrill  <jason@redhat.com>

	* c.opt (std=c++20): Make c++2a the alias.
	(std=gnu++20): Likewise.
	* c-common.h (cxx_dialect): Change cxx2a to cxx20.
	* c-opts.c: Adjust.
	* c-cppbuiltin.c: Adjust.
	* c-ubsan.c: Adjust.
	* c-warn.c: Adjust.

gcc/cp/ChangeLog
2020-05-13  Jason Merrill  <jason@redhat.com>

	* call.c, class.c, constexpr.c, constraint.cc, decl.c, init.c,
	lambda.c, lex.c, method.c, name-lookup.c, parser.c, pt.c, tree.c,
	typeck2.c: Change cxx2a to cxx20.

libcpp/ChangeLog
2020-05-13  Jason Merrill  <jason@redhat.com>

	* include/cpplib.h (enum c_lang): Change CXX2A to CXX20.
	* init.c, lex.c: Adjust.
2020-05-13 15:16:49 -04:00
Jakub Jelinek
8d9254fc8a Update copyright years.
From-SVN: r279813
2020-01-01 12:51:42 +01:00
David Malcolm
6dd0c82021 Drop unused member from cpp_string_location_reader (PR preprocessor/92982)
libcpp/ChangeLog:
	PR preprocessor/92982
	* charset.c
	(cpp_string_location_reader::cpp_string_location_reader): Delete
	initialization of m_line_table.
	* include/cpplib.h (cpp_string_location_reader::m_line_table):
	Delete unused member.

From-SVN: r279541
2019-12-18 17:26:01 +00:00
Jakub Jelinek
937a778ea3 re PR preprocessor/92919 (invalid memory access in wide_str_to_charconst when running ucn2.C testcase (caught by hwasan))
PR preprocessor/92919
	* charset.c (wide_str_to_charconst): If str contains just the
	NUL terminator, punt quietly.

From-SVN: r279399
2019-12-14 23:18:53 +01:00
Lewis Hyatt
ee9256409f Byte vs column awareness for diagnostic-show-locus.c (PR 49973)
contrib/ChangeLog

2019-12-09  Lewis Hyatt  <lhyatt@gmail.com>

	PR preprocessor/49973
	* unicode/from_glibc/unicode_utils.py: Support script from
	glibc (commit 464cd3) to extract character widths from Unicode data
	files.
	* unicode/from_glibc/utf8_gen.py: Likewise.
	* unicode/UnicodeData.txt: Unicode v. 12.1.0 data file.
	* unicode/EastAsianWidth.txt: Likewise.
	* unicode/PropList.txt: Likewise.
	* unicode/gen_wcwidth.py: New utility to generate
	libcpp/generated_cpp_wcwidth.h with help from the glibc support
	scripts and the Unicode data files.
	* unicode/unicode-license.txt: Added.
	* unicode/README: New explanatory file.

libcpp/ChangeLog

2019-12-09  Lewis Hyatt  <lhyatt@gmail.com>

	PR preprocessor/49973
	* generated_cpp_wcwidth.h: New file generated by
	../contrib/unicode/gen_wcwidth.py, supports new cpp_wcwidth function.
	* charset.c (compute_next_display_width): New function to help
	implement display columns.
	(cpp_byte_column_to_display_column): Likewise.
	(cpp_display_column_to_byte_column): Likewise.
	(cpp_wcwidth): Likewise.
	* include/cpplib.h (cpp_byte_column_to_display_column): Declare.
	(cpp_display_column_to_byte_column): Declare.
	(cpp_wcwidth): Declare.
	(cpp_display_width): New function.

gcc/ChangeLog

2019-12-09  Lewis Hyatt  <lhyatt@gmail.com>

	PR preprocessor/49973
	* input.c (location_compute_display_column): New function to help with
	multibyte awareness in diagnostics.
	(test_cpp_utf8): New self-test.
	(input_c_tests): Call the new test.
	* input.h (location_compute_display_column): Declare.
	* diagnostic-show-locus.c: Pervasive changes to add multibyte awareness
	to all classes and functions.
	(enum column_unit): New enum.
	(class exploc_with_display_col): New class.
	(class layout_point): Convert m_column member to array m_columns[2].
	(layout_range::contains_point): Add col_unit argument.
	(test_layout_range_for_single_point): Pass new argument.
	(test_layout_range_for_single_line): Likewise.
	(test_layout_range_for_multiple_lines): Likewise.
	(line_bounds::convert_to_display_cols): New function.
	(layout::get_state_at_point): Add col_unit argument.
	(make_range): Use empty filename rather than dummy filename.
	(get_line_width_without_trailing_whitespace): Rename to...
	(get_line_bytes_without_trailing_whitespace): ...this.
	(test_get_line_width_without_trailing_whitespace): Rename to...
	(test_get_line_bytes_without_trailing_whitespace): ...this.
	(class layout): m_exploc changed to exploc_with_display_col from
	plain expanded_location.
	(layout::get_linenum_width): New accessor member function.
	(layout::get_x_offset_display): Likewise.
	(layout::calculate_linenum_width): New subroutine for the constuctor.
	(layout::calculate_x_offset_display): Likewise.
	(layout::layout): Use the new subroutines. Add multibyte awareness.
	(layout::print_source_line): Add multibyte awareness.
	(layout::print_line): Likewise.
	(layout::print_annotation_line): Likewise.
	(line_label::line_label): Likewise.
	(layout::print_any_labels): Likewise.
	(layout::annotation_line_showed_range_p): Likewise.
	(get_printed_columns): Likewise.
	(class line_label): Rename m_length to m_display_width.
	(get_affected_columns): Rename to...
	(get_affected_range): ...this; add col_unit argument and multibyte
	awareness.
	(class correction): Add m_affected_bytes and m_display_cols
	members.  Rename m_len to m_byte_length for clarity.  Add multibyte
	awareness throughout.
	(correction::insertion_p): Add multibyte awareness.
	(correction::compute_display_cols): New function.
	(correction::ensure_terminated): Use new member name m_byte_length.
	(line_corrections::add_hint): Add multibyte awareness.
	(layout::print_trailing_fixits): Likewise.
	(layout::get_x_bound_for_row): Likewise.
	(test_one_liner_simple_caret_utf8): New self-test analogous to the one
	with _utf8 suffix removed, testing multibyte awareness.
	(test_one_liner_caret_and_range_utf8): Likewise.
	(test_one_liner_multiple_carets_and_ranges_utf8): Likewise.
	(test_one_liner_fixit_insert_before_utf8): Likewise.
	(test_one_liner_fixit_insert_after_utf8): Likewise.
	(test_one_liner_fixit_remove_utf8): Likewise.
	(test_one_liner_fixit_replace_utf8): Likewise.
	(test_one_liner_fixit_replace_non_equal_range_utf8): Likewise.
	(test_one_liner_fixit_replace_equal_secondary_range_utf8): Likewise.
	(test_one_liner_fixit_validation_adhoc_locations_utf8): Likewise.
	(test_one_liner_many_fixits_1_utf8): Likewise.
	(test_one_liner_many_fixits_2_utf8): Likewise.
	(test_one_liner_labels_utf8): Likewise.
	(test_diagnostic_show_locus_one_liner_utf8): Likewise.
	(test_overlapped_fixit_printing_utf8): Likewise.
	(test_overlapped_fixit_printing): Adapt for changes to
	get_affected_columns, get_printed_columns and class corrections.
	(test_overlapped_fixit_printing_2): Likewise.
	(test_linenum_sep): New constant.
	(test_left_margin): Likewise.
	(test_offset_impl): Helper function for new test.
	(test_layout_x_offset_display_utf8): New test.
	(diagnostic_show_locus_c_tests): Call new tests.

gcc/testsuite/ChangeLog:

2019-12-09  Lewis Hyatt  <lhyatt@gmail.com>

	PR preprocessor/49973
	* gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
	(test_show_locus): Tweak so that expected output is the same as
	before the diagnostic-show-locus.c changes.
	* gcc.dg/cpp/pr66415-1.c: Likewise.

From-SVN: r279137
2019-12-09 20:03:47 +00:00
Joseph Myers
7c5890cc0a Support UTF-8 character constants for C2x.
C2x adds u8'' character constants to C.  This patch adds the
corresponding GCC support.

Most of the support was already present for C++ and just needed
enabling for C2x.  However, in C2x these constants have type unsigned
char, which required corresponding adjustments in the compiler and the
preprocessor to give them that type for C.

For C, it seems clear to me that having type unsigned char means the
constants are unsigned in the preprocessor (and thus treated as having
type uintmax_t in #if conditionals), so this patch implements that.  I
included a conditional in the libcpp change to avoid affecting
signedness for C++, but I'm not sure if in fact these constants should
also be unsigned in the preprocessor for C++ in which case that
!CPP_OPTION (pfile, cplusplus) conditional would not be needed.

Bootstrapped with no regressions on x86_64-pc-linux-gnu.

gcc/c:
	* c-parser.c (c_parser_postfix_expression)
	(c_parser_check_literal_zero): Handle CPP_UTF8CHAR.
	* gimple-parser.c (c_parser_gimple_postfix_expression): Likewise.

gcc/c-family:
	* c-lex.c (lex_charconst): Make CPP_UTF8CHAR constants unsigned
	char for C.

gcc/testsuite:
	* gcc.dg/c11-utf8char-1.c, gcc.dg/c2x-utf8char-1.c,
	gcc.dg/c2x-utf8char-2.c, gcc.dg/c2x-utf8char-3.c,
	gcc.dg/gnu2x-utf8char-1.c: New tests.

libcpp:
	* charset.c (narrow_str_to_charconst): Make CPP_UTF8CHAR constants
	unsigned for C.
	* init.c (lang_defaults): Set utf8_char_literals for GNUC2X and
	STDC2X.

From-SVN: r278265
2019-11-14 20:18:33 +00:00
Jakub Jelinek
2c03d73667 PR c++/91370 - Implement P1041R4 and P1139R2 - Stronger Unicode reqs
PR c++/91370 - Implement P1041R4 and P1139R2 - Stronger Unicode reqs
	* charset.c (narrow_str_to_charconst): Add TYPE argument.  For
	CPP_UTF8CHAR diagnose whenever number of chars is > 1, using
	CPP_DL_ERROR instead of CPP_DL_WARNING.
	(wide_str_to_charconst): For CPP_CHAR16 or CPP_CHAR32, use
	CPP_DL_ERROR instead of CPP_DL_WARNING when multiple char16_t
	or char32_t chars are needed.
	(cpp_interpret_charconst): Adjust narrow_str_to_charconst caller.

	* g++.dg/cpp1z/utf8-neg.C: Expect errors rather than -Wmultichar
	warnings.
	* g++.dg/ext/utf16-4.C: Expect errors rather than warnings.
	* g++.dg/ext/utf32-4.C: Likewise.
	* g++.dg/cpp2a/ucn2.C: New test.

From-SVN: r277929
2019-11-07 21:24:38 +01:00
Eric Botcazou
0900e29cdb charset.c (UCS_LIMIT): New macro.
* charset.c (UCS_LIMIT): New macro.
	(ucn_valid_in_identifier): Use it instead of a hardcoded constant.
	(_cpp_valid_ucn): Issue a pedantic warning for UCNs larger than
	UCS_LIMIT outside of identifiers in C and in C++2a or later.

From-SVN: r276167
2019-09-26 21:43:51 +00:00
Lewis Hyatt
7d112d6670 Support extended characters in C/C++ identifiers (PR c/67224)
libcpp/ChangeLog
2019-09-19  Lewis Hyatt  <lhyatt@gmail.com>

	PR c/67224
	* charset.c (_cpp_valid_utf8): New function to help lex UTF-8 tokens.
	* internal.h (_cpp_valid_utf8): Declare.
	* lex.c (forms_identifier_p): Use it to recognize UTF-8 identifiers.
	(_cpp_lex_direct): Handle UTF-8 in identifiers and CPP_OTHER tokens.
	Do all work in "default" case to avoid slowing down typical code paths.
	Also handle $ and UCN in the default case for consistency.

gcc/Changelog
2019-09-19  Lewis Hyatt  <lhyatt@gmail.com>

	PR c/67224
	* doc/cpp.texi: Document support for extended characters in
	identifiers.
	* doc/cppopts.texi: Likewise.

gcc/testsuite/ChangeLog
2019-09-19  Lewis Hyatt  <lhyatt@gmail.com>

	PR c/67224
	* c-c++-common/cpp/ucnid-2011-1-utf8.c: New test.
	* g++.dg/cpp/ucnid-1-utf8.C: New test.
	* g++.dg/cpp/ucnid-2-utf8.C: New test.
	* g++.dg/cpp/ucnid-3-utf8.C: New test.
	* g++.dg/cpp/ucnid-4-utf8.C: New test.
	* g++.dg/other/ucnid-1-utf8.C: New test.
	* gcc.dg/cpp/ucnid-1-utf8.c: New test.
	* gcc.dg/cpp/ucnid-10-utf8.c: New test.
	* gcc.dg/cpp/ucnid-11-utf8.c: New test.
	* gcc.dg/cpp/ucnid-12-utf8.c: New test.
	* gcc.dg/cpp/ucnid-13-utf8.c: New test.
	* gcc.dg/cpp/ucnid-14-utf8.c: New test.
	* gcc.dg/cpp/ucnid-15-utf8.c: New test.
	* gcc.dg/cpp/ucnid-2-utf8.c: New test.
	* gcc.dg/cpp/ucnid-3-utf8.c: New test.
	* gcc.dg/cpp/ucnid-4-utf8.c: New test.
	* gcc.dg/cpp/ucnid-6-utf8.c: New test.
	* gcc.dg/cpp/ucnid-7-utf8.c: New test.
	* gcc.dg/cpp/ucnid-9-utf8.c: New test.
	* gcc.dg/ucnid-1-utf8.c: New test.
	* gcc.dg/ucnid-10-utf8.c: New test.
	* gcc.dg/ucnid-11-utf8.c: New test.
	* gcc.dg/ucnid-12-utf8.c: New test.
	* gcc.dg/ucnid-13-utf8.c: New test.
	* gcc.dg/ucnid-14-utf8.c: New test.
	* gcc.dg/ucnid-15-utf8.c: New test.
	* gcc.dg/ucnid-16-utf8.c: New test.
	* gcc.dg/ucnid-2-utf8.c: New test.
	* gcc.dg/ucnid-3-utf8.c: New test.
	* gcc.dg/ucnid-4-utf8.c: New test.
	* gcc.dg/ucnid-5-utf8.c: New test.
	* gcc.dg/ucnid-6-utf8.c: New test.
	* gcc.dg/ucnid-7-utf8.c: New test.
	* gcc.dg/ucnid-8-utf8.c: New test.
	* gcc.dg/ucnid-9-utf8.c: New test.

From-SVN: r275979
2019-09-19 20:56:11 +01:00
Jakub Jelinek
a554497024 Update copyright years.
From-SVN: r267494
2019-01-01 13:31:55 +01:00
David Malcolm
620e594be5 Eliminate source_location in favor of location_t
Historically GCC used location_t, while libcpp used source_location.

This inconsistency has been annoying me for a while, so this patch
removes source_location in favor of location_t throughout
(as the latter is shorter).

gcc/ChangeLog:
	* builtins.c: Replace "source_location" with "location_t".
	* diagnostic-show-locus.c: Likewise.
	* diagnostic.c: Likewise.
	* dumpfile.c: Likewise.
	* gcc-rich-location.h: Likewise.
	* genmatch.c: Likewise.
	* gimple.h: Likewise.
	* gimplify.c: Likewise.
	* input.c: Likewise.
	* input.h: Likewise.  Eliminate the typedef.
	* omp-expand.c: Likewise.
	* selftest.h: Likewise.
	* substring-locations.h (get_source_location_for_substring):
	Rename to..
	(get_location_within_string): ...this.
	* tree-cfg.c: Replace "source_location" with "location_t".
	* tree-cfgcleanup.c: Likewise.
	* tree-diagnostic.c: Likewise.
	* tree-into-ssa.c: Likewise.
	* tree-outof-ssa.c: Likewise.
	* tree-parloops.c: Likewise.
	* tree-phinodes.c: Likewise.
	* tree-phinodes.h: Likewise.
	* tree-ssa-loop-ivopts.c: Likewise.
	* tree-ssa-loop-manip.c: Likewise.
	* tree-ssa-phiopt.c: Likewise.
	* tree-ssa-phiprop.c: Likewise.
	* tree-ssa-threadupdate.c: Likewise.
	* tree-ssa.c: Likewise.
	* tree-ssa.h: Likewise.
	* tree-vect-loop-manip.c: Likewise.

gcc/c-family/ChangeLog:
	* c-common.c (c_get_substring_location): Update for renaming of
	get_source_location_for_substring to get_location_within_string.
	* c-lex.c: Replace "source_location" with "location_t".
	* c-opts.c: Likewise.
	* c-ppoutput.c: Likewise.

gcc/c/ChangeLog:
	* c-decl.c: Replace "source_location" with "location_t".
	* c-tree.h: Likewise.
	* c-typeck.c: Likewise.
	* gimple-parser.c: Likewise.

gcc/cp/ChangeLog:
	* call.c: Replace "source_location" with "location_t".
	* cp-tree.h: Likewise.
	* cvt.c: Likewise.
	* name-lookup.c: Likewise.
	* parser.c: Likewise.
	* typeck.c: Likewise.

gcc/fortran/ChangeLog:
	* cpp.c: Replace "source_location" with "location_t".
	* gfortran.h: Likewise.

gcc/go/ChangeLog:
	* go-gcc-diagnostics.cc: Replace "source_location" with "location_t".
	* go-gcc.cc: Likewise.
	* go-linemap.cc: Likewise.
	* go-location.h: Likewise.
	* gofrontend/README: Likewise.

gcc/jit/ChangeLog:
	* jit-playback.c: Replace "source_location" with "location_t".

gcc/testsuite/ChangeLog:
	* g++.dg/plugin/comment_plugin.c: Replace "source_location" with
	"location_t".
	* gcc.dg/plugin/diagnostic_plugin_test_show_locus.c: Likewise.

libcc1/ChangeLog:
	* libcc1plugin.cc: Replace "source_location" with "location_t".
	(plugin_context::get_source_location): Rename to...
	(plugin_context::get_location_t): ...this.
	* libcp1plugin.cc: Likewise.

libcpp/ChangeLog:
	* charset.c: Replace "source_location" with "location_t".
	* directives-only.c: Likewise.
	* directives.c: Likewise.
	* errors.c: Likewise.
	* expr.c: Likewise.
	* files.c: Likewise.
	* include/cpplib.h: Likewise.  Rename MAX_SOURCE_LOCATION to
	MAX_LOCATION_T.
	* include/line-map.h: Likewise.
	* init.c: Likewise.
	* internal.h: Likewise.
	* lex.c: Likewise.
	* line-map.c: Likewise.
	* location-example.txt: Likewise.
	* macro.c: Likewise.
	* pch.c: Likewise.
	* traditional.c: Likewise.

From-SVN: r266085
2018-11-13 20:05:03 +00:00
David Malcolm
c24300baea Cleanup of libcpp diagnostic callbacks
This patch renames the "error" callback within libcpp
to "diagnostic", and uses the pair of enums in cpplib.h, rather
than passing two different kinds of "int" around.

gcc/c-family/ChangeLog:
	* c-common.c (c_option_controlling_cpp_error): Rename to...
	(c_option_controlling_cpp_diagnostic): ...this, and convert
	"reason" from int to enum.
	(c_cpp_error): Rename to...
	(c_cpp_diagnostic): ...this, converting level and reason to enums.
	* c-common.h (c_cpp_error): Rename to...
	(c_cpp_diagnostic): ...this, converting level and reason to enums.
	* c-opts.c (c_common_init_options): Update for renaming.

gcc/fortran/ChangeLog:
	* cpp.c (gfc_cpp_init_0): Update for renamings.
	(cb_cpp_error): Rename to...
	(cb_cpp_diagnostic): ...this, converting level and reason to
	enums.

gcc/ChangeLog:
	* genmatch.c (error_cb): Rename to...
	(diagnostic_cb): ...this, converting int params to enums.
	(fatal_at): Update for renaming.
	(warning_at): Likewise.
	(main): Likewise.
	* input.c (selftest::ebcdic_execution_charset::apply):
	Update for renaming of...
	(selftest::ebcdic_execution_charset::on_error): ...this, renaming
	to...
	(selftest::ebcdic_execution_charset::on_diagnostic): ...this,
	converting level and reason to enums.
	(class selftest::lexer_error_sink): Rename to...
	(class selftest::lexer_test_options): ...this, renaming field
	"m_errors" to "m_diagnostics".
	(selftest::lexer_test_options::apply): Update for renaming of...
	(selftest::lexer_test_options::on_error): ...this, renaming to...
	(selftest::lexer_test_options::on_diagnostic): ...this
	converting level and reason to enums.
	(selftest::test_lexer_string_locations_raw_string_unterminated):
	Update for renamings.
	* opth-gen.awk (struct cpp_reason_option_codes_t): Use enum for
	"reason".

libcpp/ChangeLog:
	* charset.c (noop_error_cb): Rename to...
	(noop_diagnostic_cb): ...this, converting params to enums.
	(cpp_interpret_string_ranges): Update for renaming and enums.
	* directives.c (check_eol_1): Convert reason to enum.
	(do_diagnostic): Convert code and reason to enum.
	(do_error): Use CPP_W_NONE rather than 0.
	(do_pragma_dependency): Likewise.
	* errors.c (cpp_diagnostic_at): Convert level and reason to enums.
	Update for renaming.
	(cpp_diagnostic): Convert level and reason to enums.
	(cpp_error): Convert level to enum.
	(cpp_warning): Convert reason to enums.
	(cpp_pedwarning): Likewise.
	(cpp_warning_syshdr): Likewise.
	(cpp_diagnostic_with_line): Convert level and reason to enums.
	Update for renaming.
	(cpp_error_with_line): Convert level to enum.
	(cpp_warning_with_line): Convert reason to enums.
	(cpp_pedwarning_with_line): Likewise.
	(cpp_warning_with_line_syshdr): Likewise.
	(cpp_error_at): Convert level to enum.
	(cpp_errno): Likewise.
	(cpp_errno_filename): Likewise.
	* include/cpplib.h (enum cpp_diagnostic_level): Name this enum,
	and move to before struct cpp_callbacks.
	(enum cpp_warning_reason): Likewise.
	(cpp_callbacks::diagnostic): Convert params from int to enums.
	(cpp_error): Convert int param to enum cpp_diagnostic_level.
	(cpp_warning): Convert int param to enum cpp_warning_reason.
	(cpp_pedwarning): Likewise.
	(cpp_warning_syshdr): Likewise.
	(cpp_errno): Convert int param to enum cpp_diagnostic_level.
	(cpp_errno_filename): Likewise.
	(cpp_error_with_line): Likewise.
	(cpp_warning_with_line): Convert int param to enum
	cpp_warning_reason.
	(cpp_pedwarning_with_line): Likewise.
	(cpp_warning_with_line_syshdr): Likewise.
	(cpp_error_at): Convert int param to enum cpp_diagnostic_level.
	* macro.c (create_iso_definition): Convert int to enum.
	(_cpp_create_definition): Likewise.

From-SVN: r264999
2018-10-09 23:37:19 +00:00
Jakub Jelinek
85ec4feb11 Update copyright years.
From-SVN: r256169
2018-01-03 11:03:58 +01:00
Jakub Jelinek
cbe34bb5ed Update copyright years.
From-SVN: r243994
2017-01-01 13:07:43 +01:00
David Malcolm
b8f564124e Fix locations within raw strings
Whilst investigating PR preprocessor/78324 I noticed that the
substring location code currently doesn't handle raw strings
correctly, by not skipping the 'R', opening quote, delimiter
and opening parenthesis.

For example, an attempt to underline chars 4-7 with caret at 6 of
this raw string yields this erroneous output:
   __emit_string_literal_range (R"foo(0123456789)foo",
                                    ~~^~

With the patch, the correct range/caret is printed:

   __emit_string_literal_range (R"foo(0123456789)foo",
                                          ~~^~

gcc/ChangeLog:
	* input.c (selftest::test_lexer_string_locations_long_line): New
	function.
	(selftest::test_lexer_string_locations_raw_string_multiline): New
	function.
	(selftest::input_c_tests): Call the new functions, via
	for_each_line_table_case.

gcc/testsuite/ChangeLog:
	* gcc.dg/plugin/diagnostic-test-string-literals-1.c
	(test_raw_string_one_liner): New function.
	(test_raw_string_multiline): New function.

libcpp/ChangeLog:
	* charset.c (cpp_interpret_string_1): Skip locations from
	loc_reader when advancing 'p' when handling raw strings.

From-SVN: r242552
2016-11-17 15:55:26 +00:00
David Malcolm
bbd6fcf320 Provide location information for terminator characters (PR preprocessor/77672)
substring_loc::get_location currently fails for the final terminator
character in a STRING_CST from the C frontend, so that format_warning_va
falls back to using the location of the string as a whole.

This patch tweaks things [1] so that we use the final closing quote
as the location of the terminator character, as requested in
PR preprocessor/77672.

[1] specifically, cpp_interpret_string_1.

gcc/ChangeLog:
	PR preprocessor/77672
	* input.c (selftest::test_lexer_string_locations_simple): Update
	test to expect location information of the terminator character
	at the location of the final closing quote.
	(selftest::test_lexer_string_locations_hex): Likewise.
	(selftest::test_lexer_string_locations_oct): Likewise.
	(selftest::test_lexer_string_locations_letter_escape_1): Likewise.
	(selftest::test_lexer_string_locations_letter_escape_2): Likewise.
	(selftest::test_lexer_string_locations_ucn4): Likewise.
	(selftest::test_lexer_string_locations_ucn8): Likewise.
	(selftest::test_lexer_string_locations_u8): Likewise.
	(selftest::test_lexer_string_locations_utf8_source): Likewise.
	(selftest::test_lexer_string_locations_concatenation_1): Likewise.
	(selftest::test_lexer_string_locations_concatenation_2): Likewise.
	(selftest::test_lexer_string_locations_concatenation_3): Likewise.
	(selftest::test_lexer_string_locations_macro): Likewise.
	(selftest::test_lexer_string_locations_long_line): Likewise.

gcc/testsuite/ChangeLog:
	PR preprocessor/77672
	* gcc.dg/plugin/diagnostic-test-string-literals-1.c
	(test_terminator_location): New function.

libcpp/ChangeLog:
	PR preprocessor/77672
	* charset.c (cpp_interpret_string_1): Add a source_range for the
	NUL-terminator, using the location of the trailing quote of the
	final string.

From-SVN: r240434
2016-09-23 14:14:52 +00:00
David Malcolm
e7864d68ee Fix crash in selftest::test_lexer_string_locations_ucn4 (PR bootstrap/72823)
libcpp/ChangeLog:
	PR bootstrap/72823
	* charset.c (_cpp_valid_ucn): Replace overzealous assert with one
	that allows for char_range to be non-NULL when loc_reader is NULL.

From-SVN: r239211
2016-08-06 18:06:30 +00:00
David Malcolm
88fa5555a3 On-demand locations within string-literals
gcc/c-family/ChangeLog:
	* c-common.c: Include "substring-locations.h".
	(get_cpp_ttype_from_string_type): New function.
	(g_string_concat_db): New global.
	(substring_loc::get_range): New method.
	* c-common.h (g_string_concat_db): New declaration.
	(class substring_loc): New class.
	* c-lex.c (lex_string): When concatenating strings, capture the
	locations of all tokens using a new obstack, and record the
	concatenation locations within g_string_concat_db.
	* c-opts.c (c_common_init_options): Construct g_string_concat_db
	on the ggc-heap.

gcc/ChangeLog:
	* input.c (string_concat::string_concat): New constructor.
	(string_concat_db::string_concat_db): New constructor.
	(string_concat_db::record_string_concatenation): New method.
	(string_concat_db::get_string_concatenation): New method.
	(string_concat_db::get_key_loc): New method.
	(class auto_cpp_string_vec): New class.
	(get_substring_ranges_for_loc): New function.
	(get_source_range_for_substring): New function.
	(get_num_source_ranges_for_substring): New function.
	(class selftest::lexer_test_options): New class.
	(struct selftest::lexer_test): New struct.
	(class selftest::ebcdic_execution_charset): New class.
	(selftest::ebcdic_execution_charset::s_singleton): New variable.
	(selftest::lexer_test::lexer_test): New constructor.
	(selftest::lexer_test::~lexer_test): New destructor.
	(selftest::lexer_test::get_token): New method.
	(selftest::assert_char_at_range): New function.
	(ASSERT_CHAR_AT_RANGE): New macro.
	(selftest::assert_num_substring_ranges): New function.
	(ASSERT_NUM_SUBSTRING_RANGES): New macro.
	(selftest::assert_has_no_substring_ranges): New function.
	(ASSERT_HAS_NO_SUBSTRING_RANGES): New macro.
	(selftest::test_lexer_string_locations_simple): New function.
	(selftest::test_lexer_string_locations_ebcdic): New function.
	(selftest::test_lexer_string_locations_hex): New function.
	(selftest::test_lexer_string_locations_oct): New function.
	(selftest::test_lexer_string_locations_letter_escape_1): New function.
	(selftest::test_lexer_string_locations_letter_escape_2): New function.
	(selftest::test_lexer_string_locations_ucn4): New function.
	(selftest::test_lexer_string_locations_ucn8): New function.
	(selftest::uint32_from_big_endian): New function.
	(selftest::test_lexer_string_locations_wide_string): New function.
	(selftest::uint16_from_big_endian): New function.
	(selftest::test_lexer_string_locations_string16): New function.
	(selftest::test_lexer_string_locations_string32): New function.
	(selftest::test_lexer_string_locations_u8): New function.
	(selftest::test_lexer_string_locations_utf8_source): New function.
	(selftest::test_lexer_string_locations_concatenation_1): New
	function.
	(selftest::test_lexer_string_locations_concatenation_2): New
	function.
	(selftest::test_lexer_string_locations_concatenation_3): New
	function.
	(selftest::test_lexer_string_locations_macro): New function.
	(selftest::test_lexer_string_locations_stringified_macro_argument):
	New function.
	(selftest::test_lexer_string_locations_non_string): New function.
	(selftest::test_lexer_string_locations_long_line): New function.
	(selftest::test_lexer_char_constants): New function.
	(selftest::input_c_tests): Call the new test functions once per
	case within the line_table test matrix.
	* input.h (struct string_concat): New struct.
	(struct location_hash): New struct.
	(class string_concat_db): New class.
	* substring-locations.h: New header.

gcc/testsuite/ChangeLog:
	* gcc.dg/plugin/diagnostic-test-string-literals-1.c: New file.
	* gcc.dg/plugin/diagnostic-test-string-literals-2.c: New file.
	* gcc.dg/plugin/diagnostic_plugin_test_string_literals.c: New file.
	* gcc.dg/plugin/plugin.exp (plugin_test_list): Add the above new files.

libcpp/ChangeLog:
	* charset.c (cpp_substring_ranges::cpp_substring_ranges): New
	constructor.
	(cpp_substring_ranges::~cpp_substring_ranges): New destructor.
	(cpp_substring_ranges::add_range): New method.
	(cpp_substring_ranges::add_n_ranges): New method.
	(_cpp_valid_ucn): Add "char_range" and "loc_reader" params; if
	they are non-NULL, read position information from *loc_reader
	and update char_range->m_finish accordingly.
	(convert_ucn): Add "char_range", "loc_reader", and "ranges"
	params.  If loc_reader is non-NULL, read location information from
	it, and update *ranges accordingly, using char_range.
	Conditionalize the conversion into tbuf on tbuf being non-NULL.
	(convert_hex): Likewise, conditionalizing the call to
	emit_numeric_escape on tbuf.
	(convert_oct): Likewise.
	(convert_escape): Add params "loc_reader" and "ranges".  If
	loc_reader is non-NULL, read location information from it, and
	update *ranges accordingly.  Conditionalize the conversion into
	tbuf on tbuf being non-NULL.
	(cpp_interpret_string): Rename to...
	(cpp_interpret_string_1): ...this, adding params "loc_readers" and
	"out".  Use "to" to conditionalize the initialization and usage of
	"tbuf", such as running the converter.  If "loc_readers" is
	non-NULL, use the instances within it, reading location
	information from them, and passing them to convert_escape; likewise
	write to "out" if loc_readers is non-NULL.  Check for leading
	quote and issue an error if it is not present.  Update boundary
	check from "== limit" to ">= limit" to protect against erroneous
	location values to calls that are not parsing string literals.
	(cpp_interpret_string): Reimplement in terms to
	cpp_interpret_string_1.
	(noop_error_cb): New function.
	(cpp_interpret_string_ranges): New function.
	(cpp_string_location_reader::cpp_string_location_reader): New
	constructor.
	(cpp_string_location_reader::get_next): New method.
	* include/cpplib.h (class cpp_string_location_reader): New class.
	(class cpp_substring_ranges): New class.
	(cpp_interpret_string_ranges): New prototype.
	* internal.h (_cpp_valid_ucn): Add params "char_range" and
	"loc_reader".
	* lex.c (forms_identifier_p): Pass NULL for new params to
	_cpp_valid_ucn.

From-SVN: r239175
2016-08-05 18:08:33 +00:00
Jakub Jelinek
b5c1c98852 re PR c++/69628 (Conditional jump or move depends on uninitialised value(s) in lex_charconst(cpp_token const*) (c-lex.c:1252))
PR c++/69628
	* charset.c (cpp_interpret_charconst): Clear *PCHARS_SEEN
	and *UNSIGNEDP if bailing out early due to errors.

	* g++.dg/parse/pr69628.C: New test.

From-SVN: r233186
2016-02-05 20:39:48 +01:00
Jakub Jelinek
818ab71a41 Update copyright years.
From-SVN: r232055
2016-01-04 15:30:50 +01:00
Paolo Carlini
fbb22910cf re PR preprocessor/53690 ([C++11] \u0000 and \U00000000 are wrongly encoded as U+0001.)
/libcpp
2015-07-02  Paolo Carlini  <paolo.carlini@oracle.com>

	PR c++/53690
	* charset.c (_cpp_valid_ucn): Add cppchar_t * parameter and change
	return type to bool.  Fix encoding of \u0000 and \U00000000 in C++.
	(convert_ucn): Adjust call.
	* lex.c (forms_identifier_p): Likewise.
	* internal.h (_cpp_valid_ucn): Adjust declaration.

/gcc/testsuite
2015-07-02  Paolo Carlini  <paolo.carlini@oracle.com>

	PR c++/53690
	* g++.dg/cpp/pr53690.C: New.

From-SVN: r225353
2015-07-02 18:54:41 +00:00
Edward Smith-Rowland
fe95b0366a Implement N4197 - Adding u8 character literals
libcpp:

2015-06-30  Edward Smith-Rowland  <3dw4rd@verizon.net>

	Implement N4197 - Adding u8 character literals
	* include/cpplib.h (UTF8CHAR, UTF8CHAR_USERDEF): New cpp tokens;
	(struct cpp_options): Add utf8_char_literals.
	* init.c (struct lang_flags): Add utf8_char_literals;
	(struct lang_flags lang_defaults): Add column for utf8_char_literals.
	* macro.c (stringify_arg()): Treat CPP_UTF8CHAR token; 
	* expr.c (cpp_userdef_char_remove_type(), cpp_userdef_char_add_type()):
	Treat CPP_UTF8CHAR_USERDEF, CPP_UTF8CHAR tokens;
	(cpp_userdef_char_p()): Treat CPP_UTF8CHAR_USERDEF token;
	(eval_token(), _cpp_parse_expr()): Treat CPP_UTF8CHAR token.
	* lex.c (lex_string(), _cpp_lex_direct()): Include CPP_UTF8CHAR tokens.
	* charset.c (converter_for_type(), cpp_interpret_charconst()):
	Treat CPP_UTF8CHAR token.


gcc/c-family:

2015-06-30  Edward Smith-Rowland  <3dw4rd@verizon.net>

	Implement N4197 - Adding u8 character literals
	* c-family/c-ada-spec.c (print_ada_macros()): Treat CPP_UTF8CHAR
	like CPP_CHAR.
	* c-family/c-common.c (c_parse_error()): print CPP_UTF8CHAR
	and CPP_UTF8CHAR_USERDEF tokens.
	* c-family/c-lex.c (c_lex_with_flags()): Treat CPP_UTF8CHAR_USERDEF
	and CPP_UTF8CHAR tokens; (lex_charconst()): Treat CPP_UTF8CHAR token.


gcc/cp:

2015-06-30  Edward Smith-Rowland  <3dw4rd@verizon.net>

	Implement N4197 - Adding u8 character literals
	* parser.c (cp_parser_primary_expression()): Treat CPP_UTF8CHAR
	and CPP_UTF8CHAR_USERDEF tokens;
	(cp_parser_parenthesized_expression_list()): Treat CPP_UTF8CHAR token.


gcc/testsuite:

2015-06-30  Edward Smith-Rowland  <3dw4rd@verizon.net>

	Implement N4197 - Adding u8 character literals
	* g++.dg/cpp1z/utf8.C: New.
	* g++.dg/cpp1z/utf8-neg.C: New.
	* g++.dg/cpp1z/udlit-utf8char.C: New.

From-SVN: r225185
2015-06-30 12:58:48 +00:00
Jakub Jelinek
5624e564d2 Update copyright years.
From-SVN: r219188
2015-01-05 13:33:28 +01:00
Joseph Myers
81fee4a708 Fix off-by-one bug in utf16 conversion (PR preprocessor/41698).
libcpp:
2014-11-29  John Schmerge  <jbschmerge@gmail.com>

	PR preprocessor/41698
	* charset.c (one_utf8_to_utf16): Do not produce surrogate pairs
	for 0xffff.

gcc/testsuite:
2014-11-29  Joseph Myers  <joseph@codesourcery.com>

	PR preprocessor/41698
	* gcc/testsuite/g++.dg/cpp/utf16-pr41698-1.C: New test.

From-SVN: r218179
2014-11-29 01:56:06 +00:00
Bernd Edlinger
dc257367bb charset.c (convert_no_conversion): Reallocate memory with 25% headroom.
2014-10-02  Bernd Edlinger  <bernd.edlinger@hotmail.de>
            Jeff Law  <law@redhat.com>

        * charset.c (convert_no_conversion): Reallocate memory with 25%
        headroom.

Co-Authored-By: Jeff Law <law@redhat.com>

From-SVN: r215785
2014-10-02 00:06:28 +00:00
Jan Hubicka
d87fc69983 charset.c (conversion): Rename to ...
* charset.c (conversion): Rename to ...
	(cpp_conversion): ... this one; update.
	* files.c (file_hash_entry): Rename to ...
	(cpp_file_hash_entry): ... this one ; update.

From-SVN: r215482
2014-09-22 19:43:02 +00:00
Marek Polacek
177cce463d c-opts.c (sanitize_cpp_opts): Make warn_long_long be set according to warn_c90_c99_compat.
gcc/c-family/
	* c-opts.c (sanitize_cpp_opts): Make warn_long_long be set according
	to warn_c90_c99_compat.
	* c.opt (Wc90-c99-compat, Wdeclaration-after-statement): Initialize
	to -1.
gcc/c/
	* c-decl.c (warn_variable_length_array): Pass OPT_Wvla unconditionally
	to pedwarn_c90.
	* c-errors.c: Include "opts.h".
	(pedwarn_c90): Rewrite to handle -Wno-c90-c99-compat better.
	* c-parser.c (disable_extension_diagnostics): Handle negative value
	of warn_c90_c99_compat, too.
	(restore_extension_diagnostics): Likewise.
	(c_parser_compound_statement_nostart): Pass
	OPT_Wdeclaration_after_statement unconditionally to pedwarn_c90.
gcc/testsuite/
	* gcc.dg/Wc90-c99-compat-4.c: Remove all dg-warnings.
	* gcc.dg/Wc90-c99-compat-5.c: Remove all dg-errors.
	* gcc.dg/Wc90-c99-compat-7.c: New test.
	* gcc.dg/Wc90-c99-compat-8.c: New test.
	* gcc.dg/Wdeclaration-after-statement-4.c: New test.
libcpp/
	* charset.c (_cpp_valid_ucn): Warn only if -Wc90-c99-compat.
	* lex.c (_cpp_lex_direct): Likewise.
	* macro.c (replace_args): Likewise.
	(parse_params): Likewise.
	* include/cpplib.h (cpp_options): Change cpp_warn_c90_c99_compat
	to char.

From-SVN: r214131
2014-08-19 05:34:31 +00:00
Marek Polacek
f3bede7188 re PR c/51849 (-Wc99-compat would be considered useful)
PR c/51849
gcc/
	* gcc/doc/invoke.texi: Document -Wc90-c99-compat.
gcc/c-family/
	* c-opts.c (sanitize_cpp_opts): Pass warn_c90_c99_compat to libcpp.
	* c.opt (Wc90-c99-compat): Add option.
gcc/c/
	* c-decl.c (build_array_declarator): Remove check for !flag_isoc99.
	Call pedwarn_c90 instead of pedwarn.
	(check_bitfield_type_and_width): Likewise.
	(declspecs_add_qual): Likewise.
	(declspecs_add_type): Likewise.
	(warn_variable_length_array): Unify function for -pedantic and -Wvla.
	Adjust to only call pedwarn_c90.
	(grokdeclarator): Remove pedantic && !flag_isoc99 check.  Call
	pedwarn_c90 instead of pedwarn.
	* c-errors.c (pedwarn_c90): Handle -Wc90-c99-compat.
	* c-parser.c (disable_extension_diagnostics): Handle
	warn_c90_c99_compat.
	(restore_extension_diagnostics): Likewise.
	(c_parser_enum_specifier): Remove check for !flag_isoc99.  Call
	pedwarn_c90 instead of pedwarn.
	(c_parser_initelt): Likewise.
	(c_parser_postfix_expression): Likewise.
	(c_parser_postfix_expression_after_paren_type): Likewise.
	(c_parser_compound_statement_nostart): Remove check for !flag_isoc99.
	* c-tree.h: Fix formatting.
	* c-typeck.c (build_array_ref): Remove check for !flag_isoc99.  Call
	pedwarn_c90 instead of pedwarn.
gcc/testsuite/
	* gcc.dg/Wc90-c99-compat-1.c: New test.
	* gcc.dg/Wc90-c99-compat-2.c: New test.
	* gcc.dg/Wc90-c99-compat-3.c: New test.
	* gcc.dg/Wc90-c99-compat-4.c: New test.
	* gcc.dg/Wc90-c99-compat-5.c: New test.
	* gcc.dg/Wc90-c99-compat-6.c: New test.
	* gcc.dg/wvla-1.c: Adjust dg-warning.
	* gcc.dg/wvla-2.c: Adjust dg-warning.
	* gcc.dg/wvla-4.c: Adjust dg-warning.
	* gcc.dg/wvla-6.c: Adjust dg-warning.
libcpp/
	* lex.c (_cpp_lex_direct): Warn when -Wc90-c99-compat is in effect.
	* charset.c (_cpp_valid_ucn): Likewise.
	* include/cpplib.h (cpp_options): Add cpp_warn_c90_c99_compat.
	* macro.c (replace_args): Warn when -Wc90-c99-compat is in effect.
	(parse_params): Likewise.

From-SVN: r213786
2014-08-10 06:10:49 +00:00
Richard Sandiford
35c3d610e3 Update copyright years in libcpp/
From-SVN: r206293
2014-01-02 22:24:45 +00:00
Joseph Myers
d3f4ff8b51 ucnid-2011-1.c: New test.
gcc/testsuite:
	* c-c++-common/cpp/ucnid-2011-1.c: New test.

libcpp:
	* ucnid.tab: Add C11 and C11NOSTART data.
	* makeucnid.c (digit): Rename enum value to N99.
	(C11, N11, all_languages): New enum values.
	(NUM_CODE_POINTS, MAX_CODE_POINT): New macros.
	(flags, decomp, combining_value): Use NUM_CODE_POINTS as array
	size.
	(decomp): Use unsigned int as element type.
	(all_decomp): New array.
	(read_ucnid): Handle C11 and C11NOSTART.  Use MAX_CODE_POINT.
	(read_table): Use MAX_CODE_POINT.  Store all decompositions in
	all_decomp.
	(read_derived): Use MAX_CODE_POINT.
	(write_table): Use NUM_CODE_POINTS.  Print N99, C11 and N11
	flags.  Print whole array variable declaration rather than just
	array contents.
	(char_id_valid, write_context_switch): New functions.
	(main): Call write_context_switch.
	* ucnid.h: Regenerate.
	* include/cpplib.h (struct cpp_options): Add c11_identifiers.
	* init.c (struct lang_flags): Add c11_identifiers.
	(cpp_set_lang): Set c11_identifiers option from selected language.
	* internal.h (struct normalize_state): Document "previous" as
	previous starter character.
	(NORMALIZE_STATE_UPDATE_IDNUM): Take character as argument.
	* charset.c (DIG): Rename enum value to N99.
	(C11, N11): New enum values.
	(struct ucnrange): Give name to struct.  Use short for flags and
	unsigned int for end of range.  Include ucnid.h for whole variable
	declaration.
	(ucn_valid_in_identifier): Allow for characters up to 0x10FFFF.
	Allow for C11 in determining valid characters and valid start
	characters.  Use check_nfc for non-Hangul context-dependent
	checks.  Only store starter characters in nst->previous.
	(_cpp_valid_ucn): Pass new argument to
	NORMALIZE_STATE_UPDATE_IDNUM.
	* lex.c (lex_identifier): Pass new argument to
	NORMALIZE_STATE_UPDATE_IDNUM.  Call NORMALIZE_STATE_UPDATE_IDNUM
	after initial non-UCN part of identifier.
	(lex_number): Pass new argument to NORMALIZE_STATE_UPDATE_IDNUM.

From-SVN: r204886
2013-11-16 00:05:08 +00:00
Richard Sandiford
500f3ed906 Update copyright years in libcpp.
From-SVN: r195162
2013-01-14 18:13:59 +00:00
Jakub Jelinek
f41e5bd19d re PR bootstrap/55380 (All search_line_fast implementations read beyond buffer)
PR bootstrap/55380
	PR other/54691
	* files.c (read_file_guts): Allocate extra 16 bytes instead of
	1 byte at the end of buf.  Pass size + 16 instead of size
	to _cpp_convert_input.
	* charset.c (_cpp_convert_input): Reallocate if there aren't
	at least 16 bytes beyond to.len in the buffer.  Clear 16 bytes
	at to.text + to.len.

From-SVN: r194102
2012-12-03 18:19:47 +01:00
Jakub Jelinek
d652f226fc Update Copyright years for files modified in 2010.
From-SVN: r168438
2011-01-03 21:52:22 +01:00
Simon Baldwin
87cf065171 diagnostic.h (diagnostic_override_option_index): New macro to set a diagnostic's option_index.
* diagnostic.h (diagnostic_override_option_index): New macro to
	set a diagnostic's option_index.
	* c-tree.h (c_cpp_error): Add warning reason argument.
	* opts.c (_warning_as_error_callback): New.
	(register_warning_as_error_callback): Store callback for
	warnings enabled via enable_warning_as_error.
	(enable_warning_as_error): Call callback, minor code tidy.
	* opts.h (register_warning_as_error_callback): Declare.
	* c-opts.c (warning_as_error_callback): New, set cpp_opts flag in
	response to -Werror=.
	(c_common_init_options): Register warning_as_error_callback in opts.c.
	* common.opt: Add -Wno-cpp option.
	* c-common.c (struct reason_option_codes_t): Map cpp warning
	reason codes to gcc option indexes.
	* (c_option_controlling_cpp_error): New function, lookup the gcc
	option index for a cpp warning reason code.
	* (c_cpp_error): Add warning reason argument, call
	c_option_controlling_cpp_error for diagnostic_override_option_index.
	* doc/invoke.texi: Document -Wno-cpp.

	* cpp.c (cb_cpp_error): Add warning reason argument, set a value
	for diagnostic_override_option_index if CPP_W_WARNING_DIRECTIVE.

	* directives.c (do_diagnostic): Add warning reason argument,
	call appropriate error reporting function for code.
	(directive_diagnostics): Call specific warning functions with
	warning reason where appropriate.
	(do_error, do_warning, do_pragma_dependency): Add warning reason
	argument to do_diagnostic calls.
	* macro.c (_cpp_warn_if_unused_macro, enter_macro_context,
	_cpp_create_definition): Call specific warning functions with
        warning reason where appropriate.
	* Makefile.in: Add new diagnostic functions to gettext translations.
	* include/cpplib.h (struct cpp_callbacks): Add warning reason code
	to error callback.
	(CPP_DL_WARNING, CPP_DL_WARNING_SYSHDR, CPP_DL_PEDWARN, CPP_DL_ERROR,
	CPP_DL_ICE, CPP_DL_NOTE, CPP_DL_FATAL): Replace macros with enums.
	(CPP_W_NONE, CPP_W_DEPRECATED, CPP_W_COMMENTS,
	CPP_W_MISSING_INCLUDE_DIRS, CPP_W_TRIGRAPHS, CPP_W_MULTICHAR,
	CPP_W_TRADITIONAL, CPP_W_LONG_LONG, CPP_W_ENDIF_LABELS,
	CPP_W_NUM_SIGN_CHANGE, CPP_W_VARIADIC_MACROS,
	CPP_W_BUILTIN_MACRO_REDEFINED, CPP_W_DOLLARS, CPP_W_UNDEF,
	CPP_W_UNUSED_MACROS, CPP_W_CXX_OPERATOR_NAMES, CPP_W_NORMALIZE,
	CPP_W_INVALID_PCH, CPP_W_WARNING_DIRECTIVE): New enums for cpp
	warning reason codes.
	(cpp_warning, cpp_pedwarning, cpp_warning_syshdr,
	cpp_warning_with_line, cpp_pedwarning_with_line,
	cpp_warning_with_line_syshdr): New specific error reporting functions.
	* pch.c (cpp_valid_state): Call specific warning functions with
        warning reason where appropriate.
	* errors.c (cpp_diagnostic, cpp_diagnostic_with_line): New central
	diagnostic handlers.
	(cpp_warning, cpp_pedwarning, cpp_warning_syshdr,
	cpp_warning_with_line, cpp_pedwarning_with_line,
	cpp_warning_with_line_syshdr): New specific error reporting functions.
	* expr.c (cpp_classify_number, eval_token, num_unary_op): Call
	specific warning functions with warning reason where appropriate.
	* lex.c (_cpp_process_line_notes, _cpp_skip_block_comment,
	warn_about_normalization, lex_identifier_intern, lex_identifier,
	_cpp_lex_direct): Ditto.
	* charset.c (_cpp_valid_ucn, convert_hex, convert_escape,
	narrow_str_to_charconst): Ditto.

	* gcc.dg/cpp/warn-undef-2.c: New.
	* gcc.dg/cpp/warn-traditional-2.c: New.
	* gcc.dg/cpp/warn-comments-2.c: New.
	* gcc.dg/cpp/warning-directive-1.c: New.
	* gcc.dg/cpp/warn-long-long.c: New.
	* gcc.dg/cpp/warn-traditional.c: New.
	* gcc.dg/cpp/warn-variadic-2.c: New.
	* gcc.dg/cpp/warn-undef.c: New.
	* gcc.dg/cpp/warn-normalized-1.c: New.
	* gcc.dg/cpp/warning-directive-2.c: New.
	* gcc.dg/cpp/warn-long-long-2.c: New.
	* gcc.dg/cpp/warn-variadic.c: New.
	* gcc.dg/cpp/warn-normalized-2.c: New.
	* gcc.dg/cpp/warning-directive-3.c: New.
	* gcc.dg/cpp/warn-deprecated-2.c: New.
	* gcc.dg/cpp/warn-trigraphs-1.c: New.
	* gcc.dg/cpp/warn-multichar-2.c: New.
	* gcc.dg/cpp/warn-normalized-3.c: New.
	* gcc.dg/cpp/warning-directive-4.c: New.
	* gcc.dg/cpp/warn-unused-macros.c: New.
	* gcc.dg/cpp/warn-trigraphs-2.c: New.
	* gcc.dg/cpp/warn-cxx-compat-2.c: New.
	* gcc.dg/cpp/warn-cxx-compat.c: New.
	* gcc.dg/cpp/warn-redefined.c: New.
	* gcc.dg/cpp/warn-trigraphs-3.c: New.
	* gcc.dg/cpp/warn-unused-macros-2.c: New.
	* gcc.dg/cpp/warn-deprecated.c: New.
	* gcc.dg/cpp/warn-trigraphs-4.c: New.
	* gcc.dg/cpp/warn-redefined-2.c: New.
	* gcc.dg/cpp/warn-comments.c: New.
	* gcc.dg/cpp/warn-multichar.c: New.
	* g++.dg/cpp/warning-directive-1.C: New.
	* g++.dg/cpp/warning-directive-2.C: New.
	* g++.dg/cpp/warning-directive-3.C: New.
	* g++.dg/cpp/warning-directive-4.C: New.
	* gfortran.dg/warning-directive-1.F90: New.
	* gfortran.dg/warning-directive-3.F90: New.
	* gfortran.dg/warning-directive-2.F90: New.
	* gfortran.dg/warning-directive-4.F90: New.

From-SVN: r158079
2010-04-07 17:18:10 +00:00
Jason Merrill
00a81b8b9d More N3077 raw string changes
More N3077 raw string changes
	* charset.c (cpp_interpret_string): Don't transform UCNs in raw
	strings.
	* lex.c (bufring_append): Split out from...
	(lex_raw_string): ...here.  Undo trigraph and line splicing
	transformations.  Do process line notes in multi-line literals.
	(_cpp_process_line_notes): Ignore notes that were already handled.

From-SVN: r157804
2010-03-29 16:07:29 -04:00
Jason Merrill
521506258f Some raw string changes from N3077
Some raw string changes from N3077
	* charset.c (cpp_interpret_string): Change inner delimiters to ().
	* lex.c (lex_raw_string): Likewise.  Also disallow '\' in delimiter.

From-SVN: r157797
2010-03-29 11:00:43 -04:00
Jakub Jelinek
2c6e3f5540 charset.c (cpp_init_iconv): Initialize utf8_cset_desc.
* charset.c (cpp_init_iconv): Initialize utf8_cset_desc.
	(_cpp_destroy_iconv): Destroy utf8_cset_desc, char16_cset_desc
	and char32_cset_desc.
	(converter_for_type): Handle CPP_UTF8STRING.
	(cpp_interpret_string): Handle CPP_UTF8STRING and raw-strings.
	* directives.c (get__Pragma_string): Handle CPP_UTF8STRING.
	(parse_include): Reject raw strings.
	* include/cpplib.h (CPP_UTF8STRING): New token type.
	* internal.h (struct cpp_reader): Add utf8_cset_desc field.
	* lex.c (lex_raw_string): New function.
	(lex_string): Handle u8 string literals, call lex_raw_string
	for raw string literals.
	(_cpp_lex_direct): Call lex_string even for u8" and {,u,U,L,u8}R"
	sequences.
	* macro.c (stringify_arg): Handle CPP_UTF8STRING.

	* c-common.c (c_parse_error): Handle CPP_UTF8STRING.
	* c-lex.c (c_lex_with_flags): Likewise.  Test C_LEX_STRING_NO_JOIN
	instead of C_LEX_RAW_STRINGS.
	(lex_string): Handle CPP_UTF8STRING.
	* c-parser.c (c_parser_postfix_expression): Likewise.
	* c-pragma.h (C_LEX_RAW_STRINGS): Rename to ...
	(C_LEX_STRING_NO_JOIN): ... this.

	* parser.c (cp_lexer_print_token, cp_parser_is_string_literal,
	cp_parser_string_literal, cp_parser_primary_expression): Likewise.
	(cp_lexer_get_preprocessor_token): Use C_LEX_STRING_JOIN instead
	of C_LEX_RAW_STRINGS.

	* gcc.dg/raw-string-1.c: New test.
	* gcc.dg/raw-string-2.c: New test.
	* gcc.dg/raw-string-3.c: New test.
	* gcc.dg/raw-string-4.c: New test.
	* gcc.dg/raw-string-5.c: New test.
	* gcc.dg/raw-string-6.c: New test.
	* gcc.dg/raw-string-7.c: New test.
	* gcc.dg/utf8-1.c: New test.
	* gcc.dg/utf8-2.c: New test.
	* gcc.dg/utf-badconcat2.c: New test.
	* gcc.dg/utf-dflt2.c: New test.
	* gcc.dg/cpp/include6.c: New test.
	* g++.dg/ext/raw-string-1.C: New test.
	* g++.dg/ext/raw-string-2.C: New test.
	* g++.dg/ext/raw-string-3.C: New test.
	* g++.dg/ext/raw-string-4.C: New test.
	* g++.dg/ext/raw-string-5.C: New test.
	* g++.dg/ext/raw-string-6.C: New test.
	* g++.dg/ext/raw-string-7.C: New test.
	* g++.dg/ext/utf8-1.C: New test.
	* g++.dg/ext/utf8-2.C: New test.
	* g++.dg/ext/utf-badconcat2.C: New test.
	* g++.dg/ext/utf-dflt2.C: New test.

From-SVN: r152995
2009-10-19 23:41:15 +02:00
Jason Merrill
30c99a9e19 * charset.c (_cpp_valid_ucn): Update C++0x restrictions.
From-SVN: r152614
2009-10-09 20:39:46 -04:00
Tom Tromey
709a22df79 re PR preprocessor/41067 (Inconsistency in warnings on invalid \-escapes)
PR preprocessor/41067:
	* charset.c (convert_escape): Add missing ":" to error text.

From-SVN: r150854
2009-08-17 17:34:53 +00:00
Joseph Myers
9e322bc1a5 charset.c (one_utf8_to_cppchar): Correct mask used for 5-byte UTF-8 sequences.
libcpp:
	* charset.c (one_utf8_to_cppchar): Correct mask used for 5-byte
	UTF-8 sequences.

gcc/testsuite:
	* gcc.dg/cpp/utf8-5byte-1.c: New test.

From-SVN: r147073
2009-05-03 12:59:26 +01:00
Jakub Jelinek
748086b7b2 Licensing changes to GPLv3 resp. GPLv3 with GCC Runtime Exception.
From-SVN: r145841
2009-04-09 17:00:19 +02:00
H.J. Lu
0b7c73cc04 re PR preprocessor/36479 (Short buffer in libcpp)
2008-06-12  H.J. Lu  <hongjiu.lu@intel.com>

	PR preprocessor/36479
	* charset.c (cpp_interpret_string_notranslate): Also set
	narrow_cset_desc.width.

From-SVN: r136714
2008-06-12 10:03:41 -07:00
Tom Tromey
688e7a5344 re PR preprocessor/33415 (Can't compile .cpp file with UTF-8 BOM.)
libcpp
	PR libcpp/33415:
	* charset.c (_cpp_convert_input): Add buffer_start argument.
	Ignore UTF-8 BOM if seen.
	* internal.h (_cpp_convert_input): Add argument.
	* files.c (struct _cpp_file) <buffer_start>: New field.
	(destroy_cpp_file): Free buffer_start, not buffer.
	(_cpp_pop_file_buffer): Likewise.
	(read_file_guts): Update.
gcc/testsuite
	PR libcpp/33415:
	* gcc.dg/cpp/pr33415.c: New file.

From-SVN: r134507
2008-04-21 14:02:00 +00:00