Commit Graph

166 Commits

Author SHA1 Message Date
Jakub Jelinek 55dfce4d5c libcpp: Fix up handling of deferred pragmas [PR102432]
The https://gcc.gnu.org/pipermail/gcc-patches/2020-November/557903.html
change broke the following testcases.  The problem is when a pragma
namespace allows expansion (i.e. p->is_nspace && p->allow_expansion),
e.g. the omp or acc namespaces do, then when parsing the second pragma
token we do it with pfile->state.in_directive set,
pfile->state.prevent_expansion clear and pfile->state.in_deferred_pragma
clear (the last one because we don't know yet if it will be a deferred
pragma or not).  If the pragma line contains only a single name followed
by a newline, and a function-like macro with the same name exists, the
preprocessor has to peek at the next token in funlike_invocation_p to
check whether it is a (, but in this case it sees a newline.
As pfile->state.in_directive is set, we don't read anything after the
newline, pfile->buffer->need_line is set and CPP_EOF is lexed, which
funlike_invocation_p doesn't push back.  Because name is a function-like
macro and on the pragma line there is no ( after the name, it isn't
expanded, and control flow returns to do_pragma.  If name is a valid
deferred pragma, we set pfile->state.in_deferred_pragma (and really
need it set so that e.g. end_directive later on doesn't eat all the
tokens from the pragma line).

Before Nathan's change (which unfortunately didn't contain rationale
on why it is better to do it like that), this wasn't a problem:
the next _cpp_lex_direct call, made when we want the next token, would
return CPP_PRAGMA_EOL when it saw buffer->need_line, which would turn off
pfile->state.in_deferred_pragma, and the following token fetch would
already read the next line.  But Nathan's patch replaced that with an
assertion failure that now triggers, and CPP_PRAGMA_EOL is produced only
when lexing the '\n'.  Except for this special case that works fine, but
in this case it doesn't, because when peeking at the token we didn't yet
know that it would be a deferred pragma.
I've tried to fix that up in do_pragma by detecting this and pushing
CPP_PRAGMA_EOL as lookahead, but that doesn't work because end_directive
still needs to see pfile->state.in_deferred_pragma set.

So, this patch effectively reverts part of Nathan's change: the
CPP_PRAGMA_EOL addition is no longer done only when parsing the '\n', but
in both places, in the first one instead of the assertion failure.
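
The control flow described above can be modelled in a few lines of C.  This
is a toy sketch, not libcpp code: toy_state, toy_token and toy_lex_direct are
hypothetical names that only mirror the flags named in this message
(in_directive, in_deferred_pragma, need_line).  It assumes the check sits at
the point where the lexer notices that the current line is used up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical token kinds; only the two end markers matter here.  */
enum toy_token { TOY_EOF, TOY_PRAGMA_EOL, TOY_OTHER };

/* Hypothetical subset of the lexer state named in the commit message.  */
struct toy_state
{
  bool in_directive;
  bool in_deferred_pragma;
  bool need_line;           /* The current line has been fully consumed.  */
};

/* Toy version of the check made when a new token is requested.  */
enum toy_token
toy_lex_direct (struct toy_state *s)
{
  assert (s != NULL);
  if (s->need_line)
    {
      if (s->in_deferred_pragma)
        {
          /* The line ended before we knew this was a deferred pragma:
             terminate the pragma here instead of asserting.  */
          s->in_deferred_pragma = false;
          return TOY_PRAGMA_EOL;
        }
      if (s->in_directive)
        return TOY_EOF;     /* Never read past the directive's line.  */
      s->need_line = false; /* Pretend the next line was fetched.  */
    }
  return TOY_OTHER;
}
```

In this model the peek done while in_deferred_pragma is still clear yields
TOY_EOF, and the first request after the flag has been set yields
TOY_PRAGMA_EOL and clears the flag again.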

2021-12-04  Jakub Jelinek  <jakub@redhat.com>

	PR preprocessor/102432
	* lex.c (_cpp_lex_direct): If buffer->need_line while
	pfile->state.in_deferred_pragma, return CPP_PRAGMA_EOL token instead
	of assertion failure.

	* c-c++-common/gomp/pr102432.c: New test.
	* c-c++-common/goacc/pr102432.c: New test.
2021-12-04 11:00:09 +01:00
Jakub Jelinek c264208e16 libcpp: Enable P1949R7 for C++98 too [PR100977]
On Mon, Nov 29, 2021 at 05:53:58PM -0500, Jason Merrill wrote:
> I'm inclined to go ahead and change C++98 as well; I doubt anyone is relying
> on the particular C++98 extended character set rules, and we already accept
> the union of the different sets when not pedantic.

Ok, here is an incremental patch to do that also for -std={c,gnu}++98.

2021-12-01  Jakub Jelinek  <jakub@redhat.com>

	PR c++/100977
	* init.c (struct lang_flags): Remove cxx23_identifiers.
	(lang_defaults): Remove cxx23_identifiers initializers.
	(cpp_set_lang): Don't copy cxx23_identifiers.
	* include/cpplib.h (struct cpp_options): Adjust comment about
	c11_identifiers.  Remove cxx23_identifiers field.
	* lex.c (warn_about_normalization): Use cplusplus instead of
	cxx23_identifiers.
	* charset.c (ucn_valid_in_identifier): Likewise.

	* g++.dg/cpp/ucnid-1.C: Adjust expected diagnostics.
	* g++.dg/cpp/ucnid-1-utf8.C: Likewise.
2021-12-01 10:21:20 +01:00
Marek Polacek 630686f93f libcpp: Use [[likely]] conditionally
Let's hide [[likely]] behind a macro, to suppress warnings if the
compiler doesn't support it.
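
A minimal sketch of the technique, assuming a feature test with
__has_cpp_attribute.  SKETCH_LIKELY and count_ascii are hypothetical names
used only for illustration; the actual ATTR_LIKELY definition in system.h
may be written differently.  The snippet also builds as plain C, where the
macro expands to nothing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the macro: use [[likely]] only when the
   compiler reports support for it, otherwise expand to nothing.  */
#if defined(__cplusplus) && defined(__has_cpp_attribute)
# if __has_cpp_attribute(likely)
#  define SKETCH_LIKELY [[likely]]
# endif
#endif
#ifndef SKETCH_LIKELY
# define SKETCH_LIKELY
#endif

/* Count the ASCII bytes in BUF; the ASCII branch is marked as likely.  */
size_t
count_ascii (const unsigned char *buf, size_t len)
{
  size_t n = 0;
  assert (buf != NULL || len == 0);
  for (size_t i = 0; i < len; i++)
    if (buf[i] < 0x80) SKETCH_LIKELY
      n++;
  return n;
}
```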

Co-authored-by: Jonathan Wakely <jwakely@redhat.com>

	PR preprocessor/103355

libcpp/ChangeLog:

	* lex.c: Use ATTR_LIKELY instead of [[likely]].
	* system.h (ATTR_LIKELY): Define.
2021-11-22 21:43:38 -05:00
David Malcolm bef32d4a28 libcpp: capture and underline ranges in -Wbidi-chars= [PR103026]
This patch converts the bidi::vec to use a struct so that we can
capture location_t values for the bidirectional control characters.

Before:

  Wbidi-chars-1.c: In function ‘main’:
  Wbidi-chars-1.c:6:43: warning: unpaired UTF-8 bidirectional control character detected [-Wbidi-chars=]
      6 |     /*<U+202E> } <U+2066>if (isAdmin)<U+2069> <U+2066> begin admins only */
        |                                                                           ^
  Wbidi-chars-1.c:9:28: warning: unpaired UTF-8 bidirectional control character detected [-Wbidi-chars=]
      9 |     /* end admins only <U+202E> { <U+2066>*/
        |                                            ^

After:

  Wbidi-chars-1.c: In function ‘main’:
  Wbidi-chars-1.c:6:43: warning: unpaired UTF-8 bidirectional control characters detected [-Wbidi-chars=]
      6 |     /*<U+202E> } <U+2066>if (isAdmin)<U+2069> <U+2066> begin admins only */
        |       ~~~~~~~~                                ~~~~~~~~                    ^
        |       |                                       |                           |
        |       |                                       |                           end of bidirectional context
        |       U+202E (RIGHT-TO-LEFT OVERRIDE)         U+2066 (LEFT-TO-RIGHT ISOLATE)
  Wbidi-chars-1.c:9:28: warning: unpaired UTF-8 bidirectional control characters detected [-Wbidi-chars=]
      9 |     /* end admins only <U+202E> { <U+2066>*/
        |                        ~~~~~~~~   ~~~~~~~~ ^
        |                        |          |        |
        |                        |          |        end of bidirectional context
        |                        |          U+2066 (LEFT-TO-RIGHT ISOLATE)
        |                        U+202E (RIGHT-TO-LEFT OVERRIDE)

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

gcc/testsuite/ChangeLog:
	PR preprocessor/103026
	* c-c++-common/Wbidi-chars-ranges.c: New test.

libcpp/ChangeLog:
	PR preprocessor/103026
	* lex.c (struct bidi::context): New.
	(bidi::vec): Convert to a vec of context rather than unsigned
	char.
	(bidi::ctx_at): Rename to...
	(bidi::pop_kind_at): ...this and reimplement for above change.
	(bidi::current_ctx): Update for change to vec.
	(bidi::current_ctx_ucn_p): Likewise.
	(bidi::current_ctx_loc): New.
	(bidi::on_char): Update for usage of context struct.  Add "loc"
	param and pass it when pushing contexts.
	(get_location_for_byte_range_in_cur_line): New.
	(get_bidi_utf8): Rename to...
	(get_bidi_utf8_1): ...this, reintroducing...
	(get_bidi_utf8): ...as a wrapper, setting *OUT when the result is
	not NONE.
	(get_bidi_ucn): Rename to...
	(get_bidi_ucn_1): ...this, reintroducing...
	(get_bidi_ucn): ...as a wrapper, setting *OUT when the result is
	not NONE.
	(class unpaired_bidi_rich_location): New.
	(maybe_warn_bidi_on_close): Use unpaired_bidi_rich_location when
	reporting on unpaired bidi chars.  Split into singular vs plural
	spellings.
	(maybe_warn_bidi_on_char): Pass in a location_t rather than a
	const uchar * and use it when emitting warnings, and when calling
	bidi::on_char.
	(_cpp_skip_block_comment): Capture location when kind is not NONE
	and pass it to maybe_warn_bidi_on_char.
	(skip_line_comment): Likewise.
	(forms_identifier_p): Likewise.
	(lex_raw_string): Likewise.
	(lex_string): Likewise.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2021-11-17 17:34:12 -05:00
David Malcolm 1a7f2c0774 libcpp: escape non-ASCII source bytes in -Wbidi-chars= [PR103026]
This flags rich_locations associated with -Wbidi-chars= so that
non-ASCII bytes will be escaped when printing the source lines
(using the diagnostics support I added in
r12-4825-gbd5e882cf6e0def3dd1bc106075d59a303fe0d1e).

In particular, this ensures that the printed source lines will
be pure ASCII, and thus the visual ordering of the characters
will be the same as the logical ordering.

Before:

  Wbidi-chars-1.c: In function ‘main’:
  Wbidi-chars-1.c:6:43: warning: unpaired UTF-8 bidirectional control character detected [-Wbidi-chars=]
      6 |     /*‮ } ⁦if (isAdmin)⁩ ⁦ begin admins only */
        |                                           ^
  Wbidi-chars-1.c:9:28: warning: unpaired UTF-8 bidirectional control character detected [-Wbidi-chars=]
      9 |     /* end admins only ‮ { ⁦*/
        |                            ^

  Wbidi-chars-11.c:6:15: warning: UTF-8 vs UCN mismatch when closing a context by "U+202C (POP DIRECTIONAL FORMATTING)" [-Wbidi-chars=]
      6 | int LRE_‪_PDF_\u202c;
        |               ^
  Wbidi-chars-11.c:8:19: warning: UTF-8 vs UCN mismatch when closing a context by "U+202C (POP DIRECTIONAL FORMATTING)" [-Wbidi-chars=]
      8 | int LRE_\u202a_PDF_‬_;
        |                   ^
  Wbidi-chars-11.c:10:28: warning: UTF-8 vs UCN mismatch when closing a context by "U+202C (POP DIRECTIONAL FORMATTING)" [-Wbidi-chars=]
     10 | const char *s1 = "LRE_‪_PDF_\u202c";
        |                            ^
  Wbidi-chars-11.c:12:33: warning: UTF-8 vs UCN mismatch when closing a context by "U+202C (POP DIRECTIONAL FORMATTING)" [-Wbidi-chars=]
     12 | const char *s2 = "LRE_\u202a_PDF_‬";
        |                                 ^

After:

  Wbidi-chars-1.c: In function ‘main’:
  Wbidi-chars-1.c:6:43: warning: unpaired UTF-8 bidirectional control character detected [-Wbidi-chars=]
      6 |     /*<U+202E> } <U+2066>if (isAdmin)<U+2069> <U+2066> begin admins only */
        |                                                                           ^
  Wbidi-chars-1.c:9:28: warning: unpaired UTF-8 bidirectional control character detected [-Wbidi-chars=]
      9 |     /* end admins only <U+202E> { <U+2066>*/
        |                                            ^

  Wbidi-chars-11.c:6:15: warning: UTF-8 vs UCN mismatch when closing a context by "U+202C (POP DIRECTIONAL FORMATTING)" [-Wbidi-chars=]
      6 | int LRE_<U+202A>_PDF_\u202c;
        |                       ^
  Wbidi-chars-11.c:8:19: warning: UTF-8 vs UCN mismatch when closing a context by "U+202C (POP DIRECTIONAL FORMATTING)" [-Wbidi-chars=]
      8 | int LRE_\u202a_PDF_<U+202C>_;
        |                   ^
  Wbidi-chars-11.c:10:28: warning: UTF-8 vs UCN mismatch when closing a context by "U+202C (POP DIRECTIONAL FORMATTING)" [-Wbidi-chars=]
     10 | const char *s1 = "LRE_<U+202A>_PDF_\u202c";
        |                                    ^
  Wbidi-chars-11.c:12:33: warning: UTF-8 vs UCN mismatch when closing a context by "U+202C (POP DIRECTIONAL FORMATTING)" [-Wbidi-chars=]
     12 | const char *s2 = "LRE_\u202a_PDF_<U+202C>";
        |                                 ^

libcpp/ChangeLog:
	PR preprocessor/103026
	* lex.c (maybe_warn_bidi_on_close): Use a rich_location
	and call set_escape_on_output (true) on it.
	(maybe_warn_bidi_on_char): Likewise.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2021-11-17 17:32:30 -05:00
Jakub Jelinek 049f0efeaa libcpp: Fix up handling of block comments in -fdirectives-only mode [PR103130]
Normal preprocessing, -fdirectives-only preprocessing before Nathan's
rewrite, and all other compilers I've tried on godbolt treat even \*/
as the end of a block comment, but the new -fdirectives-only handling doesn't.
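
The expected behaviour can be stated as a tiny scanner.  This is an
illustrative sketch only; block_comment_end is a hypothetical helper, not
the code in cpp_directive_only_process, and it ignores line splicing.

```c
#include <assert.h>
#include <stddef.h>

/* BODY points just past the opening delimiter of a block comment.
   Return the offset just past the closing delimiter, or -1 if the
   comment is unterminated.  A backslash has no special meaning here,
   so a backslash directly before the closing delimiter still ends
   the comment.  */
long
block_comment_end (const char *body)
{
  assert (body != NULL);
  for (size_t i = 0; body[i] != '\0'; i++)
    if (body[i] == '*' && body[i + 1] == '/')
      return (long) (i + 2);
  return -1;
}
```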

2021-11-17  Jakub Jelinek  <jakub@redhat.com>

	PR preprocessor/103130
	* lex.c (cpp_directive_only_process): Treat even \*/ as end of block
	comment.

	* c-c++-common/cpp/dir-only-9.c: New test.
2021-11-17 17:31:40 +01:00
Marek Polacek 51c500269b libcpp: Implement -Wbidi-chars for CVE-2021-42574 [PR103026]
From a link below:
"An issue was discovered in the Bidirectional Algorithm in the Unicode
Specification through 14.0. It permits the visual reordering of
characters via control sequences, which can be used to craft source code
that renders different logic than the logical ordering of tokens
ingested by compilers and interpreters. Adversaries can leverage this to
encode source code for compilers accepting Unicode such that targeted
vulnerabilities are introduced invisibly to human reviewers."

More info:
https://nvd.nist.gov/vuln/detail/CVE-2021-42574
https://trojansource.codes/

This is not a compiler bug.  However, to mitigate the problem, this patch
implements -Wbidi-chars=[none|unpaired|any] to warn about possibly
misleading Unicode bidirectional control characters the preprocessor may
encounter.

The default is =unpaired, which warns about improperly terminated
bidirectional control characters; e.g. a LRE without its corresponding PDF.
The level =any warns about any use of bidirectional control characters.

This patch handles both UCNs and UTF-8 characters.  UCNs designating
bidi characters in identifiers are accepted since r204886.  Then r217144
enabled -fextended-identifiers by default.  Extended characters in C/C++
identifiers have been accepted since r275979.  However, this patch still
warns about mixing UTF-8 and UCN bidi characters; there seems to be no
good reason to allow mixing them.

We warn in different contexts: comments (both C and C++-style), string
literals, character constants, and identifiers.  Expectedly, UCNs are ignored
in comments and raw string literals.  The bidirectional control characters
can nest so this patch handles that as well.
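
The nesting rule for UTF-8 input can be sketched with a small stack.  This
is a simplified model under stated assumptions, not the bidi namespace from
lex.c: bidi_kind, classify and count_unpaired_bidi are hypothetical names,
UCNs are not handled, and the stack depth is capped at 64.  It assumes that
PDF closes only an embedding or override on top of the stack, and that PDI
closes the most recent isolate together with anything opened after it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum bidi_kind
{
  BIDI_NONE, BIDI_EMBED_OR_OVERRIDE, BIDI_ISOLATE, BIDI_PDF, BIDI_PDI
};

/* Classify the 3-byte UTF-8 sequence at P, if there is one.  */
static enum bidi_kind
classify (const unsigned char *p, size_t left)
{
  if (left < 3 || p[0] != 0xE2)
    return BIDI_NONE;
  if (p[1] == 0x80)
    switch (p[2])
      {
      case 0xAA: case 0xAB:            /* U+202A LRE, U+202B RLE.  */
      case 0xAD: case 0xAE:            /* U+202D LRO, U+202E RLO.  */
        return BIDI_EMBED_OR_OVERRIDE;
      case 0xAC:                       /* U+202C PDF.  */
        return BIDI_PDF;
      default:
        return BIDI_NONE;
      }
  if (p[1] == 0x81)
    switch (p[2])
      {
      case 0xA6: case 0xA7: case 0xA8: /* U+2066 LRI, U+2067 RLI, U+2068 FSI.  */
        return BIDI_ISOLATE;
      case 0xA9:                       /* U+2069 PDI.  */
        return BIDI_PDI;
      default:
        return BIDI_NONE;
      }
  return BIDI_NONE;
}

/* Return how many bidi contexts are still open at the end of TEXT.  */
size_t
count_unpaired_bidi (const char *text)
{
  enum bidi_kind stack[64];
  size_t depth = 0;
  size_t len;
  const unsigned char *p = (const unsigned char *) text;

  assert (text != NULL);
  len = strlen (text);
  for (size_t i = 0; i < len; )
    {
      enum bidi_kind k = classify (p + i, len - i);
      if (k == BIDI_NONE)
        {
          i++;
          continue;
        }
      if (k == BIDI_EMBED_OR_OVERRIDE || k == BIDI_ISOLATE)
        {
          if (depth < 64)
            stack[depth++] = k;
        }
      else if (k == BIDI_PDF)
        {
          if (depth > 0 && stack[depth - 1] == BIDI_EMBED_OR_OVERRIDE)
            depth--;
        }
      else /* BIDI_PDI: pop back to the most recent isolate.  */
        {
          for (size_t j = depth; j > 0; j--)
            if (stack[j - 1] == BIDI_ISOLATE)
              {
                depth = j - 1;
                break;
              }
        }
      i += 3;
    }
  return depth;
}
```

A comment containing RLO, LRI, PDI and a second LRI leaves two contexts open
in this model: the RLO and the second LRI.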

I have not included nor tested this at all with Fortran (which also has
string literals and line comments).

Dave M. posted patches improving diagnostics involving Unicode characters.
This patch does not make use of this new infrastructure yet.

	PR preprocessor/103026

gcc/c-family/ChangeLog:

	* c.opt (Wbidi-chars, Wbidi-chars=): New option.

gcc/ChangeLog:

	* doc/invoke.texi: Document -Wbidi-chars.

libcpp/ChangeLog:

	* include/cpplib.h (enum cpp_bidirectional_level): New.
	(struct cpp_options): Add cpp_warn_bidirectional.
	(enum cpp_warning_reason): Add CPP_W_BIDIRECTIONAL.
	* internal.h (struct cpp_reader): Add warn_bidi_p member
	function.
	* init.c (cpp_create_reader): Set cpp_warn_bidirectional.
	* lex.c (bidi): New namespace.
	(get_bidi_utf8): New function.
	(get_bidi_ucn): Likewise.
	(maybe_warn_bidi_on_close): Likewise.
	(maybe_warn_bidi_on_char): Likewise.
	(_cpp_skip_block_comment): Implement warning about bidirectional
	control characters.
	(skip_line_comment): Likewise.
	(forms_identifier_p): Likewise.
	(lex_identifier): Likewise.
	(lex_string): Likewise.
	(lex_raw_string): Likewise.

gcc/testsuite/ChangeLog:

	* c-c++-common/Wbidi-chars-1.c: New test.
	* c-c++-common/Wbidi-chars-2.c: New test.
	* c-c++-common/Wbidi-chars-3.c: New test.
	* c-c++-common/Wbidi-chars-4.c: New test.
	* c-c++-common/Wbidi-chars-5.c: New test.
	* c-c++-common/Wbidi-chars-6.c: New test.
	* c-c++-common/Wbidi-chars-7.c: New test.
	* c-c++-common/Wbidi-chars-8.c: New test.
	* c-c++-common/Wbidi-chars-9.c: New test.
	* c-c++-common/Wbidi-chars-10.c: New test.
	* c-c++-common/Wbidi-chars-11.c: New test.
	* c-c++-common/Wbidi-chars-12.c: New test.
	* c-c++-common/Wbidi-chars-13.c: New test.
	* c-c++-common/Wbidi-chars-14.c: New test.
	* c-c++-common/Wbidi-chars-15.c: New test.
	* c-c++-common/Wbidi-chars-16.c: New test.
	* c-c++-common/Wbidi-chars-17.c: New test.
2021-11-16 21:56:16 -05:00
David Malcolm bd5e882cf6 diagnostics: escape non-ASCII source bytes for certain diagnostics
This patch adds support to GCC's diagnostic subsystem for escaping certain
bytes and Unicode characters when quoting source code.

Specifically, this patch adds a new flag rich_location::m_escape_on_output
which is a hint from a diagnostic that non-ASCII bytes in the pertinent
lines of the user's source code should be escaped when printed.

The patch sets this for the following diagnostics:
- when complaining about stray bytes in the program (when these
are non-printable)
- when complaining about "null character(s) ignored";
- for -Wnormalized= (and generate source ranges for such warnings)

The escaping is controlled by a new option:
  -fdiagnostics-escape-format=[unicode|bytes]

For example, consider a diagnostic involving a source line containing the
string "before" followed by the Unicode character U+03C0 ("GREEK SMALL
LETTER PI", with UTF-8 encoding 0xCF 0x80) followed by the byte 0xBF
(a stray UTF-8 trailing byte), followed by the string "after", where the
diagnostic highlights the U+03C0 character.

By default, this line will be printed verbatim to the user when
reporting a diagnostic at it, as:

 beforeπXafter
       ^

(using X for the stray byte to avoid putting invalid UTF-8 in this
commit message)

If the diagnostic sets the "escape" flag, it will be printed as:

 before<U+03C0><BF>after
       ^~~~~~~~

with -fdiagnostics-escape-format=unicode (the default), or as:

  before<CF><80><BF>after
        ^~~~~~~~

if the user supplies -fdiagnostics-escape-format=bytes.

This only affects how the source is printed; it does not affect
how column numbers are printed (as per -fdiagnostics-column-unit=
and -fdiagnostics-column-origin=).
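
The two escape formats can be illustrated with a small stand-alone sketch.
It is not the diagnostic-show-locus.c implementation: decode_utf8 and
escape_source_line are hypothetical helpers, the decoder is simplified
(overlong forms are not rejected), and the result lives in a static buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Decode the multi-byte UTF-8 sequence at P (NUL-terminated input).
   Return its length and store the code point in *CP, or return 0 if
   the bytes do not form a sequence.  */
static int
decode_utf8 (const unsigned char *p, unsigned long *cp)
{
  int len;
  if ((p[0] & 0xE0) == 0xC0)      { len = 2; *cp = p[0] & 0x1F; }
  else if ((p[0] & 0xF0) == 0xE0) { len = 3; *cp = p[0] & 0x0F; }
  else if ((p[0] & 0xF8) == 0xF0) { len = 4; *cp = p[0] & 0x07; }
  else return 0;
  for (int i = 1; i < len; i++)
    {
      if ((p[i] & 0xC0) != 0x80)  /* Also stops at the terminating NUL.  */
        return 0;
      *cp = (*cp << 6) | (p[i] & 0x3F);
    }
  return len;
}

/* Return LINE with non-ASCII bytes escaped.  The result is held in a
   static buffer, so it is overwritten by the next call.  */
const char *
escape_source_line (const char *line, int as_bytes)
{
  static char out[256];
  const unsigned char *p = (const unsigned char *) line;
  size_t used = 0;

  assert (line != NULL);
  out[0] = '\0';
  /* Each step writes at most 11 bytes, so 16 spare bytes are enough.  */
  while (*p != '\0' && used + 16 < sizeof out)
    {
      unsigned long cp = 0;
      int len = 0;
      if (*p < 0x80)
        {
          out[used++] = (char) *p++;
          out[used] = '\0';
        }
      else if (as_bytes || (len = decode_utf8 (p, &cp)) == 0)
        used += (size_t) sprintf (out + used, "<%02X>", (unsigned) *p++);
      else
        {
          used += (size_t) sprintf (out + used, "<U+%04lX>", cp);
          p += len;
        }
    }
  return out;
}
```

For the line from the example, passing 0 for as_bytes gives the "unicode"
spelling and passing 1 gives the "bytes" spelling.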

gcc/c-family/ChangeLog:
	* c-lex.c (c_lex_with_flags): When complaining about non-printable
	CPP_OTHER tokens, set the "escape on output" flag.

gcc/ChangeLog:
	* common.opt (fdiagnostics-escape-format=): New.
	(diagnostics_escape_format): New enum.
	(DIAGNOSTICS_ESCAPE_FORMAT_UNICODE): New enum value.
	(DIAGNOSTICS_ESCAPE_FORMAT_BYTES): Likewise.
	* diagnostic-format-json.cc (json_end_diagnostic): Add
	"escape-source" attribute.
	* diagnostic-show-locus.c
	(exploc_with_display_col::exploc_with_display_col): Replace
	"tabstop" param with a cpp_char_column_policy and add an "aspect"
	param.  Use these to compute m_display_col accordingly.
	(struct char_display_policy): New struct.
	(layout::m_policy): New field.
	(layout::m_escape_on_output): New field.
	(def_policy): New function.
	(make_range): Update for changes to exploc_with_display_col ctor.
	(default_print_decoded_ch): New.
	(width_per_escaped_byte): New.
	(escape_as_bytes_width): New.
	(escape_as_bytes_print): New.
	(escape_as_unicode_width): New.
	(escape_as_unicode_print): New.
	(make_policy): New.
	(layout::layout): Initialize new fields.  Update m_exploc ctor
	call for above change to ctor.
	(layout::maybe_add_location_range): Update for changes to
	exploc_with_display_col ctor.
	(layout::calculate_x_offset_display): Update for change to
	cpp_display_width.
	(layout::print_source_line): Pass policy
	to cpp_display_width_computation. Capture cpp_decoded_char when
	calling process_next_codepoint.  Move printing of source code to
	m_policy.m_print_cb.
	(line_label::line_label): Pass in policy rather than context.
	(layout::print_any_labels): Update for change to line_label ctor.
	(get_affected_range): Pass in policy rather than context, updating
	calls to location_compute_display_column accordingly.
	(get_printed_columns): Likewise, also for cpp_display_width.
	(correction::correction): Pass in policy rather than tabstop.
	(correction::compute_display_cols): Pass m_policy rather than
	m_tabstop to cpp_display_width.
	(correction::m_tabstop): Replace with...
	(correction::m_policy): ...this.
	(line_corrections::line_corrections): Pass in policy rather than
	context.
	(line_corrections::m_context): Replace with...
	(line_corrections::m_policy): ...this.
	(line_corrections::add_hint): Update to use m_policy rather than
	m_context.
	(line_corrections::add_hint): Likewise.
	(layout::print_trailing_fixits): Likewise.
	(selftest::test_display_widths): New.
	(selftest::test_layout_x_offset_display_utf8): Update to use
	policy rather than tabstop.
	(selftest::test_one_liner_labels_utf8): Add test of escaping
	source lines.
	(selftest::test_diagnostic_show_locus_one_liner_utf8): Update to
	use policy rather than tabstop.
	(selftest::test_overlapped_fixit_printing): Likewise.
	(selftest::test_overlapped_fixit_printing_utf8): Likewise.
	(selftest::test_overlapped_fixit_printing_2): Likewise.
	(selftest::test_tab_expansion): Likewise.
	(selftest::test_escaping_bytes_1): New.
	(selftest::test_escaping_bytes_2): New.
	(selftest::diagnostic_show_locus_c_tests): Call the new tests.
	* diagnostic.c (diagnostic_initialize): Initialize
	context->escape_format.
	(convert_column_unit): Update to use default character width policy.
	(selftest::test_diagnostic_get_location_text): Likewise.
	* diagnostic.h (enum diagnostics_escape_format): New enum.
	(diagnostic_context::escape_format): New field.
	* doc/invoke.texi (-fdiagnostics-escape-format=): New option.
	(-fdiagnostics-format=): Add "escape-source" attribute to examples
	of JSON output, and document it.
	* input.c (location_compute_display_column): Pass in "policy"
	rather than "tabstop", passing to
	cpp_byte_column_to_display_column.
	(selftest::test_cpp_utf8): Update to use cpp_char_column_policy.
	* input.h (class cpp_char_column_policy): New forward decl.
	(location_compute_display_column): Pass in "policy" rather than
	"tabstop".
	* opts.c (common_handle_option): Handle
	OPT_fdiagnostics_escape_format_.
	* selftest.c (temp_source_file::temp_source_file): New ctor
	overload taking a size_t.
	* selftest.h (temp_source_file::temp_source_file): Likewise.

gcc/testsuite/ChangeLog:
	* c-c++-common/diagnostic-format-json-1.c: Add regexp to consume
	"escape-source" attribute.
	* c-c++-common/diagnostic-format-json-2.c: Likewise.
	* c-c++-common/diagnostic-format-json-3.c: Likewise.
	* c-c++-common/diagnostic-format-json-4.c: Likewise, twice.
	* c-c++-common/diagnostic-format-json-5.c: Likewise.
	* gcc.dg/cpp/warn-normalized-4-bytes.c: New test.
	* gcc.dg/cpp/warn-normalized-4-unicode.c: New test.
	* gcc.dg/encoding-issues-bytes.c: New test.
	* gcc.dg/encoding-issues-unicode.c: New test.
	* gfortran.dg/diagnostic-format-json-1.F90: Add regexp to consume
	"escape-source" attribute.
	* gfortran.dg/diagnostic-format-json-2.F90: Likewise.
	* gfortran.dg/diagnostic-format-json-3.F90: Likewise.

libcpp/ChangeLog:
	* charset.c (convert_escape): Use encoding_rich_location when
	complaining about nonprintable unknown escape sequences.
	(cpp_display_width_computation::cpp_display_width_computation):
	Pass in policy rather than tabstop.
	(cpp_display_width_computation::process_next_codepoint): Add "out"
	param and populate *out if non-NULL.
	(cpp_display_width_computation::advance_display_cols): Pass NULL
	to process_next_codepoint.
	(cpp_byte_column_to_display_column): Pass in policy rather than
	tabstop.  Pass NULL to process_next_codepoint.
	(cpp_display_column_to_byte_column): Pass in policy rather than
	tabstop.
	* errors.c (cpp_diagnostic_get_current_location): New function,
	splitting out the logic from...
	(cpp_diagnostic): ...here.
	(cpp_warning_at): New function.
	(cpp_pedwarning_at): New function.
	* include/cpplib.h (cpp_warning_at): New decl for rich_location.
	(cpp_pedwarning_at): Likewise.
	(struct cpp_decoded_char): New.
	(struct cpp_char_column_policy): New.
	(cpp_display_width_computation::cpp_display_width_computation):
	Replace "tabstop" param with "policy".
	(cpp_display_width_computation::process_next_codepoint): Add "out"
	param.
	(cpp_display_width_computation::m_tabstop): Replace with...
	(cpp_display_width_computation::m_policy): ...this.
	(cpp_byte_column_to_display_column): Replace "tabstop" param with
	"policy".
	(cpp_display_width): Likewise.
	(cpp_display_column_to_byte_column): Likewise.
	* include/line-map.h (rich_location::escape_on_output_p): New.
	(rich_location::set_escape_on_output): New.
	(rich_location::m_escape_on_output): New.
	* internal.h (cpp_diagnostic_get_current_location): New decl.
	(class encoding_rich_location): New.
	* lex.c (skip_whitespace): Use encoding_rich_location when
	complaining about null characters.
	(warn_about_normalization): Generate a source range when
	complaining about improperly normalized tokens, rather than just a
	point, and use encoding_rich_location so that the source code
	is escaped on printing.
	* line-map.c (rich_location::rich_location): Initialize
	m_escape_on_output.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2021-11-01 09:35:46 -04:00
Jakub Jelinek c4d6dcacfc libcpp: Implement C++23 P1949R7 - C++ Identifier Syntax using Unicode Standard Annex 31
The following patch implements the
P1949R7 - C++ Identifier Syntax using Unicode Standard Annex 31
paper.  We already allow UTF-8 characters in the source, so that part
is already implemented, so IMHO all we need to do is pedwarn instead of
just warn for the (default) -Wnormalized=nfc (or for -Wnormalized={id,nfkc})
if the character is not in NFC and to use the Unicode XID_Start and
XID_Continue derived core properties to find out what characters are allowed
(the standard actually adds U+005F to XID_Start, but we are handling the
ASCII compatible characters differently already and they aren't allowed
in UCNs in identifiers).  Instead of hardcoding the large tables
in ucnid.tab, this patch makes makeucnid.c read them from the Unicode
tables (13.0.0 version at this point).

For non-pedantic mode, we accept as 2nd+ char in identifiers a union
of valid characters in all supported modes, but for the 1st char it
was actually pedantically requiring that it is not any of the characters
that may not appear in the currently chosen standard as the first character.
This patch changes it such that also what is allowed at the start of an
identifier is a union of characters valid at the start of an identifier
in any of the pedantic modes.

2021-09-01  Jakub Jelinek  <jakub@redhat.com>

	PR c++/100977
libcpp/
	* include/cpplib.h (struct cpp_options): Add cxx23_identifiers.
	* charset.c (CXX23, NXX23): New enumerators.
	(CID, NFC, NKC, CTX): Renumber.
	(ucn_valid_in_identifier): Implement P1949R7 - use CXX23 and
	NXX23 flags for cxx23_identifiers.  For start character in
	non-pedantic mode, allow characters that are allowed as start
	characters in any of the supported language modes, rather than
	disallowing characters allowed only as non-start characters in
	current mode but for characters from other language modes allowing
	them even if they are never allowed at start.
	* init.c (struct lang_flags): Add cxx23_identifiers.
	(lang_defaults): Add cxx23_identifiers column.
	(cpp_set_lang): Initialize CPP_OPTION (pfile, cxx23_identifiers).
	* lex.c (warn_about_normalization): If cxx23_identifiers, use
	cpp_pedwarning_with_line instead of cpp_warning_with_line for
	"is not in NFC" diagnostics.
	* makeucnid.c: Adjust usage comment.
	(CXX23, NXX23): New enumerators.
	(all_languages): Add CXX23.
	(not_NFC, not_NFKC, maybe_not_NFC): Renumber.
	(read_derivedcore): New function.
	(write_table): Print also CXX23 and NXX23 columns.
	(main): Require 5 arguments instead of 4, call read_derivedcore.
	* ucnid.h: Regenerated using Unicode 13.0.0 files.
gcc/testsuite/
	* g++.dg/cpp23/normalize1.C: New test.
	* g++.dg/cpp23/normalize2.C: New test.
	* g++.dg/cpp23/normalize3.C: New test.
	* g++.dg/cpp23/normalize4.C: New test.
	* g++.dg/cpp23/normalize5.C: New test.
	* g++.dg/cpp23/normalize6.C: New test.
	* g++.dg/cpp23/normalize7.C: New test.
	* g++.dg/cpp23/ucnid-1-utf8.C: New test.
	* g++.dg/cpp23/ucnid-2-utf8.C: New test.
	* gcc.dg/cpp/ucnid-4.c: Don't expect
	"not valid at the start of an identifier" errors.
	* gcc.dg/cpp/ucnid-4-utf8.c: Likewise.
	* gcc.dg/cpp/ucnid-5-utf8.c: New test.
2021-09-01 22:33:06 +02:00
Jakub Jelinek d15a2d261b libcpp: Fix up -fdirectives-only handling of // comments on last line not terminated with newline [PR100646]
As can be seen on the testcases, before the -fdirectives-only preprocessing
rewrite the preprocessor would assume // comments are terminated by the
end of file even when newline wasn't there, but now we error out.
The following patch restores the previous behavior.
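
The restored rule is that the end of the buffer terminates a // comment just
like a newline does.  A minimal sketch, with line_comment_length as a
hypothetical helper that ignores backslash-newline continuations:

```c
#include <assert.h>
#include <stddef.h>

/* TEXT points at the start of a // comment inside a buffer of LEN
   bytes.  Return the number of bytes in the comment, excluding any
   terminating newline.  Reaching the end of the buffer counts as a
   terminator too, rather than as an error.  */
size_t
line_comment_length (const char *text, size_t len)
{
  size_t i = 0;
  assert (text != NULL || len == 0);
  while (i < len && text[i] != '\n')
    i++;
  return i;
}
```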

2021-05-20  Jakub Jelinek  <jakub@redhat.com>

	PR preprocessor/100646
	* lex.c (cpp_directive_only_process): Treat end of file as termination
	for !is_block comments.

	* gcc.dg/cpp/pr100646-1.c: New test.
	* gcc.dg/cpp/pr100646-2.c: New test.
2021-05-20 09:09:07 +02:00
Jakub Jelinek c6b664e2c4 libcpp: Fix up -fdirectives-only preprocessing of includes not ending with newline [PR100392]
If a header doesn't end with a new-line, with -fdirectives-only we right now
preprocess it as
int i = 1;# 2 "pr100392.c" 2
i.e. the line directive isn't on the next line, which means we fail to parse
it when compiling.

GCC 10 and earlier libcpp/directives-only.c had for this:
  if (!pfile->state.skipping && cur != base)
    {
      /* If the file was not newline terminated, add rlimit, which is
         guaranteed to point to a newline, to the end of our range.  */
      if (cur[-1] != '\n')
        {
          cur++;
          CPP_INCREMENT_LINE (pfile, 0);
          lines++;
        }

      cb->print_lines (lines, base, cur - base);
    }
and we have the assertion
      /* Files always end in a newline or carriage return.  We rely on this for
         character peeking safety.  */
      gcc_assert (buffer->rlimit[0] == '\n' || buffer->rlimit[0] == '\r');
So, this patch just re-adds more or less the same thing, so that we emit
a newline after the include even when it wasn't there before.
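
The idea can be sketched outside libcpp as follows.  copy_with_final_newline
is a hypothetical helper that works on a plain character buffer; the real
code extends the printed range by buffer->rlimit[0] instead of copying.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy the LEN bytes of BUF to OUT, appending a newline if BUF does
   not already end with one, and NUL-terminate OUT.  Return the number
   of lines written.  OUT must have room for LEN + 2 bytes.  */
size_t
copy_with_final_newline (const char *buf, size_t len,
                         char *out, size_t out_size)
{
  size_t lines = 0;
  assert (buf != NULL && out != NULL && out_size >= len + 2);
  memcpy (out, buf, len);
  for (size_t i = 0; i < len; i++)
    if (buf[i] == '\n')
      lines++;
  if (len > 0 && buf[len - 1] != '\n')
    {
      out[len++] = '\n';  /* The header did not end in a newline.  */
      lines++;
    }
  out[len] = '\0';
  return lines;
}
```

With this, a header whose only line is "int i = 1;" is emitted with a
trailing newline, so a following line directive starts on its own line.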

2021-05-12  Jakub Jelinek  <jakub@redhat.com>

	PR preprocessor/100392
	* lex.c (cpp_directive_only_process): If buffer doesn't end with '\n',
	add buffer->rlimit[0] character to the printed range and
	CPP_INCREMENT_LINE and increment line_count.

	* gcc.dg/cpp/pr100392.c: New test.
	* gcc.dg/cpp/pr100392.h: New file.
2021-05-12 15:14:35 +02:00
Joseph Myers 3e3fdf3d52 preprocessor: Fix cpp_avoid_paste for digit separators
The libcpp function cpp_avoid_paste is used to insert whitespace in
preprocessed output where needed to avoid two consecutive
preprocessing tokens, that logically (e.g. when stringized) do not
have whitespace between them, from being incorrectly lexed as one when
the preprocessed input is reread by a compiler.

This fails to allow for digit separators, meaning that invalid
code, that has a CPP_NUMBER (from a macro expansion) followed by a
character literal, can result in preprocessed output with a valid use
of digit separators, so that required syntax errors do not occur when
compiling with -save-temps.  Fix this by handling that case in
cpp_avoid_paste (as with other cases in cpp_avoid_paste, this doesn't
try to check whether the language version in use supports digit
separators; it's always OK to have unnecessary whitespace in
preprocessed output).
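
A toy version of the decision, for illustration only.  toy_type and
toy_avoid_paste are hypothetical and model just a few rules; the real
cpp_avoid_paste covers many more token kinds.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_type { TOY_NUMBER, TOY_CHAR, TOY_NAME, TOY_OTHER };

/* Return true if printing PREV directly followed by NEXT could be
   re-lexed differently, so that a space must be emitted between them.  */
bool
toy_avoid_paste (enum toy_type prev, enum toy_type next)
{
  switch (prev)
    {
    case TOY_NUMBER:
      /* 1 followed by '2' would otherwise re-lex as the pp-number 1'2.  */
      return next == TOY_NUMBER || next == TOY_NAME || next == TOY_CHAR;
    case TOY_NAME:
      return next == TOY_NAME || next == TOY_NUMBER;
    default:
      return false;
    }
}
```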

Note: there are other cases, with various kinds of wide character or
string literal following a CPP_NUMBER, where spurious pasting of
preprocessing tokens can occur but the sequence of tokens remains
invalid both before and after that pasting.  Maybe cpp_avoid_paste
should also handle those cases (and similar cases after a CPP_NAME),
to ensure the sequence of preprocessing tokens in preprocessed output
is exactly right, whether or not it affects whether syntax errors
occur.  This patch only addresses the case with digit separators where
invalid code can fail to be diagnosed without the space inserted.

Bootstrapped with no regressions for x86_64-pc-linux-gnu.

libcpp/
	* lex.c (cpp_avoid_paste): Do not allow pasting CPP_NUMBER with
	CPP_CHAR.

gcc/testsuite/
	* g++.dg/cpp1y/digit-sep-paste.C, gcc.dg/c2x-digit-separators-3.c:
	New tests.
2021-05-11 18:54:32 +00:00
Jakub Jelinek 170c850e4b libcpp: Fix up pragma preprocessing [PR100450]
Since the r0-85991-ga25a8f3be322fe0f838947b679f73d6efc2a412c
https://gcc.gnu.org/legacy-ml/gcc-patches/2008-02/msg01329.html
changes, made so that we handle macros inside of pragmas that should expand
macros, we print those pragmas token by token during preprocessing,
with CPP_PRAGMA printed as
      fputs ("#pragma ", print.outf);
      if (space)
        fprintf (print.outf, "%s %s", space, name);
      else
        fprintf (print.outf, "%s", name);
where name is some identifier (so e.g. print
 #pragma omp parallel
or
 #pragma omp for
etc.).  Because it ends in an identifier, we need to handle it like
an identifier (i.e. CPP_NAME) for the decision whether a space needs
to be emitted in between that #pragma whatever or #pragma whatever whatever
and following token, otherwise the attached testcase is preprocessed as
 #pragma omp forreduction(+:red)
rather than
 #pragma omp for reduction(+:red)
The cpp_avoid_paste function is only called for this purpose.
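
A minimal sketch of the idea, assuming a toy token printer rather than
the real c-ppoutput.c code.  The names tok_kind, struct tok,
ends_like_identifier and print_tokens are hypothetical; the only point
shown is that a printed pragma header is treated like an identifier
when deciding whether the next token needs a leading space.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical token kinds for the illustration.  */
enum tok_kind { TOK_PRAGMA, TOK_NAME, TOK_PUNCT };

struct tok
{
  enum tok_kind kind;
  const char *text;	/* Spelling as it is printed.  */
};

/* A printed pragma header ends in an identifier, so it is handled
   like a name.  */
static bool
ends_like_identifier (enum tok_kind kind)
{
  return kind == TOK_NAME || kind == TOK_PRAGMA;
}

/* Print N tokens into OUT (OUTSZ bytes), adding a space only where two
   identifier-like spellings would otherwise run together.  Printing
   stops early if OUT is too small.  */
static void
print_tokens (const struct tok *toks, size_t n, char *out, size_t outsz)
{
  size_t used = 0;

  if (outsz == 0)
    return;
  out[0] = '\0';
  for (size_t i = 0; i < n; i++)
    {
      bool space = (i > 0
		    && ends_like_identifier (toks[i - 1].kind)
		    && toks[i].kind == TOK_NAME);
      int w = snprintf (out + used, outsz - used, "%s%s",
			space ? " " : "", toks[i].text);
      if (w < 0 || (size_t) w >= outsz - used)
	return;
      used += (size_t) w;
    }
}
```

Without the TOK_PRAGMA test in ends_like_identifier, the tokens
"#pragma omp for" and "reduction" would be printed with nothing
between them.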

2021-05-07  Jakub Jelinek  <jakub@redhat.com>

	PR c/100450
	* lex.c (cpp_avoid_paste): Handle token1 CPP_PRAGMA like CPP_NAME.

	* c-c++-common/gomp/pr100450.c: New test.
2021-05-07 17:48:37 +02:00
Joseph Myers 8f51cf38bb preprocessor: Fix pp-number lexing of digit separators [PR83873, PR97604]
When the preprocessor lexes preprocessing numbers in lex_number, it
accepts digit separators in more cases than actually permitted in
pp-numbers by the standard syntax.

One thing this accepts is adjacent digit separators; there is some
code to reject those later, but as noted in bug 83873 it fails to
cover the case of adjacent digit separators within a floating-point
exponent.  Accepting adjacent digit separators only results in a
missing diagnostic, not in valid code being rejected or being accepted
with incorrect semantics, because the correct lexing in such a case
would have '' start the following preprocessing tokens, and no valid
preprocessing token starts '' while ' isn't valid on its own as a
preprocessing token either.  So this patch fixes that case by moving
the error for adjacent digit separators to lex_number (allowing a more
specific diagnostic than if '' were excluded from the pp-number
completely).

Other cases inappropriately accepted involve digit separators before
'.', 'e+', 'e-', 'p+' or 'p-' (or corresponding uppercase variants).
In those cases, as shown by the test digit-sep-pp-number.C added, this
can result in valid code being wrongly rejected as a result of too
many characters being included in the pp-number.  So this case is
fixed by terminating the pp-number at the correct character according
to the standard.  That test also covers the case where a digit
separator was followed by an identifier-nondigit that is not a
nondigit (e.g. a UCN); that case was already handled correctly.
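
The termination rule can be illustrated with a small scanner that
follows the standard pp-number grammar directly.  This is a sketch and
not lex_number: pp_number_len and is_nondigit are hypothetical names,
UCNs and extended characters are ignored, and adjacent digit separators
simply end the pp-number here instead of being diagnosed.

```c
#include <ctype.h>
#include <stddef.h>

/* Letters and underscore; UCNs are left out of this sketch.  */
static int
is_nondigit (unsigned char c)
{
  return isalpha (c) || c == '_';
}

/* Return the length of the pp-number at the start of the
   NUL-terminated string S, or 0 if S does not start with one.  */
static size_t
pp_number_len (const char *s)
{
  const unsigned char *p = (const unsigned char *) s;
  size_t i;

  if (isdigit (p[0]))
    i = 1;
  else if (p[0] == '.' && isdigit (p[1]))
    i = 2;
  else
    return 0;

  for (;;)
    {
      unsigned char c = p[i];

      if (c == '\'')
	{
	  /* A separator must be followed by a digit or nondigit.  The
	     pair is consumed as one unit, so an 'e' directly after a
	     separator cannot start a signed exponent.  */
	  if (isdigit (p[i + 1]) || is_nondigit (p[i + 1]))
	    {
	      i += 2;
	      continue;
	    }
	  break;
	}
      if ((c == 'e' || c == 'E' || c == 'p' || c == 'P')
	  && (p[i + 1] == '+' || p[i + 1] == '-'))
	{
	  i += 2;
	  continue;
	}
      if (isdigit (c) || is_nondigit (c) || c == '.')
	{
	  i++;
	  continue;
	}
      break;
    }
  return i;
}
```

With this rule, 1'e+1 yields the pp-number 1'e followed by separate
tokens, and 1'.5 yields the pp-number 1 because a separator cannot
precede '.'.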

Bootstrapped with no regressions for x86_64-pc-linux-gnu.

libcpp/
	PR c++/83873
	PR preprocessor/97604
	* lex.c (lex_number): Reject adjacent digit separators here.  Do
	not allow digit separators before '.' or an exponent with sign.
	* expr.c (cpp_classify_number): Do not check for adjacent digit
	separators here.

gcc/testsuite/
	PR c++/83873
	PR preprocessor/97604
	* g++.dg/cpp1y/digit-sep-neg-2.C,
	g++.dg/cpp1y/digit-sep-pp-number.C: New tests.
	* g++.dg/cpp1y/digit-sep-line-neg.C, g++.dg/cpp1y/digit-sep-neg.C:
	Adjust expected messages.
2021-05-06 23:20:35 +00:00
Jakub Jelinek ac16f4327f libcpp: Fix up -fdirectives-only preprocessing [PR98882]
GCC 11 ICEs on all -fdirectives-only preprocessing when the files don't end
with a newline.

The problem is in the assertion, for empty TUs buffer->cur == buffer->rlimit
and so buffer->rlimit[-1] access triggers UB in the preprocessor, for
non-empty TUs it refers to the last character in the file, which can be
anything.
The preprocessor adds a '\n' character (or '\r', in particular if the
user file ends with '\r' then it adds another '\r' rather than '\n'), but
that is added after the limit, i.e. at buffer->rlimit[0].

Now, if the routine handles occasional bumping of pos to buffer->rlimit + 1,
I think it is just the assert that needs changing, usually we read from *pos
if pos < limit and then e.g. if it is '\r', look at the following character
(which could be one of those '\n' or '\r' at buffer->rlimit[0]).  There is
also the case where for '\\' before the limit we read following character
and if it is '\n', do one thing, if it is '\r' read another character.
But in that case if '\\' was the last char in the TU, the limit char will be
'\n', so we are ok.
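
To illustrate why the assertion has to look at rlimit[0], here is a
sketch with a hypothetical buffer type.  struct buf, make_buffer and
buffer_ok are assumptions made for the example and are not libcpp
names; the terminator choice follows the description above.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical buffer: valid text is [cur, rlimit), and one extra
   terminator character is stored at rlimit[0].  */
struct buf
{
  unsigned char *cur;
  unsigned char *rlimit;
};

/* Copy LEN bytes of TEXT and append the terminator just past the
   limit: '\r' if the text ends in '\r', otherwise '\n'.  */
static struct buf
make_buffer (const char *text, size_t len)
{
  struct buf b;
  unsigned char *mem = (unsigned char *) malloc (len + 1);

  if (mem == NULL)
    abort ();
  memcpy (mem, text, len);
  mem[len] = (len > 0 && text[len - 1] == '\r') ? '\r' : '\n';
  b.cur = mem;
  b.rlimit = mem + len;
  return b;
}

/* The check is valid even for an empty buffer, where rlimit[-1] would
   be an out-of-bounds read.  */
static int
buffer_ok (const struct buf *b)
{
  return b->rlimit[0] == '\n' || b->rlimit[0] == '\r';
}
```

For an empty translation unit cur equals rlimit, so only rlimit[0] is
guaranteed to be readable.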

2021-02-03  Jakub Jelinek  <jakub@redhat.com>

	PR preprocessor/98882
	* lex.c (cpp_directive_only_process): Don't assert that rlimit[-1]
	is a newline, instead assert that rlimit[0] is either newline or
	carriage return.  When seeing '\\' followed by '\r', check limit
	before accessing pos[1].

	* gcc.dg/cpp/pr98882.c: New test.
2021-02-03 23:18:05 +01:00
liuhongt 530b1d6887 Fix ICE for [PR target/98833].
And replace __builtin_ia32_pcmpeqb128 with operator == in libcpp.

gcc/ChangeLog:

	PR target/98833
	* config/i386/sse.md (sse2_gt<mode>3): Drop !TARGET_XOP in condition.
	(*sse2_eq<mode>3): Ditto.

gcc/testsuite/ChangeLog:

	PR target/98833
	* gcc.target/i386/pr98833.c: New test.

libcpp/

	PR target/98833
	* lex.c (search_line_sse2): Replace builtins with == operator.
2021-01-27 18:49:25 +08:00
Jakub Jelinek 99dee82307 Update copyright years. 2021-01-04 10:26:59 +01:00
Nathan Sidwell 13f93cf533 preprocessor: Add deferred macros
Deferred macros are needed for C++ modules.  Header units may export
macro definitions and undefinitions.  These are resolved lazily at the
point of (potential) use.  (The language specifies that, it's not just
a useful optimization.)  Thus, identifier nodes grow a 'deferred'
field, which fortunately doesn't expand the structure on 64-bit
systems as there was padding there.  This is non-zero on NT_MACRO
nodes, if the macro is deferred.  When such an identifier is lexed, it
is resolved via a callback that I added recently.  That will either
provide the macro definition, or discover that there was an overriding
undef.  Either way the identifier is no longer a deferred macro.
Notice it is now possible for NT_MACRO nodes to have a NULL macro
expansion.
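
The lazy resolution can be pictured roughly like this.  It is a sketch
under assumptions: struct node, resolve_fn, get_macro and
sample_resolver are hypothetical and much simpler than cpp_hashnode
and the real callback.

```c
#include <stddef.h>

enum node_type { NT_VOID, NT_MACRO };

/* Hypothetical identifier node.  */
struct node
{
  enum node_type type;
  unsigned deferred;		/* Nonzero while the definition is pending.  */
  const char *expansion;	/* May be NULL while deferred.  */
};

/* Callback that loads a deferred definition.  It returns NULL when an
   overriding undef was found.  */
typedef const char *(*resolve_fn) (unsigned deferred_id);

/* Return the expansion of N, resolving a deferred macro on first use.
   Afterwards N is no longer deferred, whatever the outcome.  */
static const char *
get_macro (struct node *n, resolve_fn resolve)
{
  if (n->type != NT_MACRO)
    return NULL;
  if (n->deferred)
    {
      const char *def = resolve (n->deferred);

      n->deferred = 0;
      if (def != NULL)
	n->expansion = def;
      else
	{
	  /* The macro turned out to be undefined.  */
	  n->type = NT_VOID;
	  n->expansion = NULL;
	}
    }
  return n->type == NT_MACRO ? n->expansion : NULL;
}

/* Sample resolver for the illustration: id 1 is defined as 42, every
   other id was undefined again.  */
static const char *
sample_resolver (unsigned deferred_id)
{
  return deferred_id == 1 ? "42" : NULL;
}
```

Callers that test for a macro, such as #ifdef handling, have to go
through the resolving path first because the answer is not known until
then.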

	libcpp/
	* include/cpplib.h (struct cpp_hashnode): Add deferred field.
	(cpp_set_deferred_macro): Define.
	(cpp_get_deferred_macro): Declare.
	(cpp_macro_definition): Reformat, add overload.
	(cpp_macro_definition_location): Deal with deferred macro.
	(cpp_alloc_token_string, cpp_compare_macro): Declare.
	* internal.h (_cpp_notify_macro_use): Return bool.
	(_cpp_maybe_notify_macro_use): Likewise.
	* directives.c (do_undef): Check macro is not undef before
	warning.
	(do_ifdef, do_ifndef): Deal with deferred macro.
	* expr.c (parse_defined): Likewise.
	* lex.c (cpp_allocate_token_string): Break out of ...
	(create_literal): ... here.  Call it.
	(cpp_maybe_module_directive): Deal with deferred macro.
	* macro.c (cpp_get_token_1): Deal with deferred macro.
	(warn_of_redefinition): Deal with deferred macro.
	(compare_macros): Rename to ...
	(cpp_compare_macro): ... here.  Make extern.
	(cpp_get_deferred_macro): New.
	(_cpp_notify_macro_use): Deal with deferred macro, return bool
	indicating definedness.
	(cpp_macro_definition): Deal with deferred macro.
2020-11-24 08:31:03 -08:00
Nathan Sidwell bf425849f1 preprocessor: main-file cleanup
In preparing module patch 7 I realized there was a cleanup I could
make to simplify it.  This is that cleanup.  Also, when doing the
cleanup I noticed some macros had been turned into inline functions,
but not renamed to the preprocessor's internal namespace
(_cpp_$INTERNAL rather than cpp_$USER).  Thus, this renames those
functions, deletes an internal field of the file structure, and
determines whether we're in the main file by comparing to
pfile->main_file, the _cpp_file of the main file.

	libcpp/
	* internal.h (cpp_in_system_header): Rename to ...
	(_cpp_in_system_header): ... here.
	(cpp_in_primary_file): Rename to ...
	(_cpp_in_main_source_file): ... here.  Compare main_file equality
	and check main_search value.
	* lex.c (maybe_va_opt_error, _cpp_lex_direct): Adjust for rename.
	* macro.c (_cpp_builtin_macro_text): Likewise.
	(replace_args): Likewise.
	* directives.c (do_include_next): Likewise.
	(do_pragma_once, do_pragma_system_header): Likewise.
	* files.c (struct _cpp_file): Delete main_file field.
	(pch_open): Check pfile->main_file equality.
	(make_cpp_file): Drop cpp_reader parm, don't set main_file.
	(_cpp_find_file): Adjust.
	(_cpp_stack_file): Check pfile->main_file equality.
	(struct report_missing_guard_data): Add cpp_reader field.
	(report_missing_guard): Check pfile->main_file equality.
	(_cpp_report_missing_guards): Adjust.
2020-11-19 04:47:00 -08:00
Nathan Sidwell c9c3d5f28a preprocessor: C++ module-directives
C++20 modules introduces a new kind of preprocessor directive -- a
module directive.  These are directives but without the leading '#'.
We have to detect them by sniffing the start of a logical line.  When
detected we replace the initial identifiers with unspellable tokens
and pass them through to the language parser the same way deferred
pragmas are.  There's a PRAGMA_EOL at the logical end of line too.

One additional complication is that we have to do header-name lexing
after the initial tokens, and that requires changes in the macro-aware
piece of the preprocessor.  The above sniffer sets a counter in the
lexer state, and that triggers at the appropriate point.  We then do
the same header-name lexing that occurs on a #include directive or
has_include pseudo-macro.  Except that the header name ends up in the
token stream.

A couple of token emitters need to deal with the new token possibility.

	gcc/c-family/
	* c-lex.c (c_lex_with_flags): CPP_HEADER_NAMEs can now be seen.
	libcpp/
	* include/cpplib.h (struct cpp_options): Add module_directives
	option.
	(NODE_MODULE): New node flag.
	(struct cpp_hashnode): Make rid-code a bitfield, increase bits in
	flags and swap with type field.
	* init.c (post_options): Create module-directive identifier nodes.
	* internal.h (struct lexer_state): Add directive_file_token &
	n_modules fields.  Add module node enumerator.
	* lex.c (cpp_maybe_module_directive): New.
	(_cpp_lex_token): Call it.
	(cpp_output_token): Add '"' around CPP_HEADER_NAME token.
	(do_peek_ident, do_peek_module): New.
	(cpp_directives_only): Detect module-directive lines.
	* macro.c (cpp_get_token_1): Deal with directive_file_token
	triggering.
2020-11-18 10:24:12 -08:00
Nathan Sidwell 8bd9a00f43 cpplib: EOF in pragmas
This patch moves the generation of PRAGMA_EOF earlier, to when we set
need_line, rather than when we try and get the next line.  It also
prevents peeking past a PRAGMA token.

	libcpp/
	* lex.c (cpp_peek_token): Do not peek past CPP_PRAGMA.
	(_cpp_lex_direct): Handle EOF in pragma when setting need_line,
	not when needing a line.
2020-11-03 10:07:20 -08:00
Nathan Sidwell 082a7b2390 cpplib: Fix off-by-one error
I noticed a fencepost error in the preprocessor.  We should be
checking if the next char is at the limit, not the current char (which
can't be, because we're looking at it).
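
The kind of fencepost check meant here can be sketched as below.  This
is not the _cpp_clean_line code; newline_len is a hypothetical helper
and the sketch assumes no readable sentinel at the limit.

```c
#include <assert.h>
#include <stddef.h>

/* Return the length of the line terminator starting at S (0 if there
   is none), never reading at or beyond LIMIT.  S itself must be below
   LIMIT, because it is the character being looked at.  */
static size_t
newline_len (const char *s, const char *limit)
{
  assert (s < limit);
  if (*s == '\n')
    return 1;
  if (*s == '\r')
    /* The bound that matters is on the next character, not on S.  */
    return (s + 1 < limit && s[1] == '\n') ? 2 : 1;
  return 0;
}
```

Testing s == limit at this point would be redundant, while skipping
the s + 1 test would read one character too far.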

	libcpp/
	* lex.c (_cpp_clean_line): Fix DOS off-by-one error.
2020-11-03 08:49:25 -08:00
Nathan Sidwell dbcc6b1577 preprocessor: Further fix for EOF in macro args [PR97471]
My previous attempt at fixing this was incorrect.  The problem occurs
earlier in that _cpp_lex_direct processes the unwinding EOF needs in
collect_args mode.  This patch changes it not to do that, in the same
way as directive parsing works.  Also collect_args shouldn't push_back
such fake EOFs, and neither should funlike_invocation_p.

	libcpp/
	* lex.c (_cpp_lex_direct): Do not complete EOF processing when
	parsing_args.
	* macro.c (collect_args): Do not unwind fake EOF.
	(funlike_invocation_p): Do not unwind fake EOF.
	(cpp_context): Replace abort with gcc_assert.
	gcc/testsuite/
	* gcc.dg/cpp/endif.c: Move to ...
	* c-c++-common/cpp/endif.c: ... here.
	* gcc.dg/cpp/endif.h: Move to ...
	* c-c++-common/cpp/endif.h: ... here.
	* c-c++-common/cpp/eof-2.c: Adjust diagnostic.
	* c-c++-common/cpp/eof-3.c: Adjust diagnostic.
2020-10-20 08:01:34 -07:00
Jakub Jelinek d00b1b023e powerpc, libcpp: Fix gcc build with clang on power8 [PR97163]
libcpp has two specialized altivec implementations of search_line_fast,
one for power8+ and the other one otherwise.
Both use __attribute__((altivec(vector))) and the GCC builtins rather than
altivec.h and the APIs from there, which is fine, but should be restricted
to when libcpp is built with GCC, so that it can be relied on.
The second elif is
and thus e.g. when built with clang it isn't picked, but the first one was
just guarded with
and so according to the bugreporter clang fails miserably on that.

The following patch fixes that by adding the same GCC_VERSION requirement
as the second version.  I don't know where the 4.5 in there comes from and
the exact version doesn't matter that much, as long as it is above 4.2 that
clang pretends to be and smaller or equal to 4.8 as the oldest gcc we
support as bootstrap compiler ATM.
Furthermore, the patch fixes the comment, the version it is talking about is
not pre-GCC 5, but actually the GCC 5+ one.

2020-09-26  Jakub Jelinek  <jakub@redhat.com>

	PR bootstrap/97163
	* lex.c (search_line_fast): Only use _ARCH_PWR8 Altivec version
	for GCC >= 4.5.
2020-09-26 10:07:41 +02:00
Jakub Jelinek ae49af9485 libcpp: Fix up raw string literal parsing error-recovery [PR96323]
For (invalid) newline inside of the raw string literal delimiter, doing
continue means we skip the needed processing of newlines.  Instead of
duplicating that, this patch just doesn't continue for those.

2020-07-28  Jakub Jelinek  <jakub@redhat.com>

	PR preprocessor/96323
	* lex.c (lex_raw_string): For c == '\n' don't continue after reporting
	a prefix delimiter error.

	* c-c++-common/cpp/pr96323.c: New test.
2020-07-28 15:40:15 +02:00
Nathan Sidwell ed63c387aa preprocessor: Reimplement raw string lexing [pr95149]
pr95149 is a false positive from a static analysis checker.  But it
encouraged me to fix raw string lexing, which does contain a
complicated macro and pointers to local variables.  The
reimplementation does away with that macro.  Part of the complication
is we need to undo some of the fresh line processing -- trigraph notes
and escaped line continuations.  But the undone characters need to go
through the raw string processing, as they can legitimately be part of
the prefix marker.  However, in this reformulation we only process one
line marker at a time[*], so there's a limited number of undone
characters.  We can arrange the buffering to make sure we don't split
such an append sequence, and then simply take the characters from the
append buffer.

The prefix scanner had a switch statement, which I discovered was not
optimized as well as an if of a bunch of explicit comparisons (pr
95208 filed).

Finally I adjusted the failure mode.  When we get a bad prefix, we lex
up until the next '"', thus often swallowing the whole raw string.
Previously we'd bail and then the lexer would usually generate stupid
tokens, particularly when meeting the ending '"'.

	libcpp/
	* lex.c (struct lit_accum): New.
	(bufring_append): Replace by lit_accum::append.
	(lex_raw_string): Reimplement, using fragments of the old version.
	(lex_string): Adjust lex_raw_string call.

	gcc/testsuite/
	* c-c++-common/raw-string-14.c: Adjust errors.
	* c-c++-common/raw-string-16.c: Likewise.
	* c-c++-common/raw-string-5.c: Likewise.
2020-05-19 11:39:15 -07:00
Jason Merrill b04445d4a8 c++: Replace "C++2a" with "C++20".
C++20 isn't final quite yet, but all that remains is formalities, so let's
go ahead and change all the references.

I think for the next C++ standard we can just call it C++23 rather than
C++2b, since the committee has been consistent about time-based releases
rather than feature-based.

gcc/c-family/ChangeLog
2020-05-13  Jason Merrill  <jason@redhat.com>

	* c.opt (std=c++20): Make c++2a the alias.
	(std=gnu++20): Likewise.
	* c-common.h (cxx_dialect): Change cxx2a to cxx20.
	* c-opts.c: Adjust.
	* c-cppbuiltin.c: Adjust.
	* c-ubsan.c: Adjust.
	* c-warn.c: Adjust.

gcc/cp/ChangeLog
2020-05-13  Jason Merrill  <jason@redhat.com>

	* call.c, class.c, constexpr.c, constraint.cc, decl.c, init.c,
	lambda.c, lex.c, method.c, name-lookup.c, parser.c, pt.c, tree.c,
	typeck2.c: Change cxx2a to cxx20.

libcpp/ChangeLog
2020-05-13  Jason Merrill  <jason@redhat.com>

	* include/cpplib.h (enum c_lang): Change CXX2A to CXX20.
	* init.c, lex.c: Adjust.
2020-05-13 15:16:49 -04:00
Nathan Sidwell 2a0225e478 preprocessor: EOF location is at end of file [PR95013]
My recent C++ parser change to pay attention to EOF location uncovered
a separate bug.  The preprocessor's EOF logic would set the EOF
location to be the beginning of the last line of text in the file --
not the 'line' after that, which contains no characters.  Mostly.
This fixes things so that when we attempt to read the last line of the
main file, we don't pop the buffer until the tokenizer has a chance to
create an EOF token with the correct location information.  It is then
responsible for popping the buffer.  As it happens, raw string literal
tokenizing contained a bug -- it would increment the line number
prematurely, because it cached buffer->cur in a local variable, but
checked buffer->cur before updating it to figure out if it was at end
of file.   We fix up that too.

The EOF token intentionally doesn't have a column number -- it's not a
position on a line, it's a non-existent line.

The testsuite churn is just correcting the EOF location diagnostics.

	libcpp/
	PR preprocessor/95013
	* lex.c (lex_raw_string): Process line notes before incrementing.
	Correct incrementing condition.  Adjust for new
	_cpp_get_fresh_line EOF behaviour.
	(_cpp_get_fresh_line): Do not pop buffer at EOF, increment line
	instead.
	(_cpp_lex_direct): Adjust for new _cpp_get_fresh_line behaviour.
	(cpp_directive_only_process): Assert we got a fresh line.
	* traditional.c (_cpp_read_logical_line_trad): Adjust for new
	_cpp_get_fresh_line behaviour.

	gcc/testsuite/
	* c-c++-common/goacc/pr79428-1.c: Adjust EOF diagnostic location.
	* c-c++-common/gomp/pr79428-2.c: Likewise.
	* g++.dg/cpp0x/decltype63.C: Likewise.
	* g++.dg/cpp0x/gen-attrs-64.C: Likewise.
	* g++.dg/cpp0x/pr68726.C: Likewise.
	* g++.dg/cpp0x/pr78341.C: Likewise.
	* g++.dg/cpp1y/pr65202.C: Likewise.
	* g++.dg/cpp1y/pr65340.C: Likewise.
	* g++.dg/cpp1y/pr68578.C: Likewise.
	* g++.dg/cpp1z/class-deduction44.C: Likewise.
	* g++.dg/diagnostic/unclosed-extern-c.C: Likewise.
	* g++.dg/diagnostic/unclosed-function.C: Likewise.
	* g++.dg/diagnostic/unclosed-namespace.C: Likewise.
	* g++.dg/diagnostic/unclosed-struct.C: Likewise.
	* g++.dg/ext/pr84598.C: Likewise.
	* g++.dg/other/switch4.C: Likewise.
	* g++.dg/parse/attr4.C: Likewise.
	* g++.dg/parse/cond4.C: Likewise.
	* g++.dg/parse/crash10.C: Likewise.
	* g++.dg/parse/crash18.C: Likewise.
	* g++.dg/parse/crash27.C: Likewise.
	* g++.dg/parse/crash34.C: Likewise.
	* g++.dg/parse/crash35.C: Likewise.
	* g++.dg/parse/crash52.C: Likewise.
	* g++.dg/parse/crash59.C: Likewise.
	* g++.dg/parse/crash61.C: Likewise.
	* g++.dg/parse/crash67.C: Likewise.
	* g++.dg/parse/error14.C: Likewise.
	* g++.dg/parse/error56.C: Likewise.
	* g++.dg/parse/invalid1.C: Likewise.
	* g++.dg/parse/parameter-declaration-1.C: Likewise.
	* g++.dg/parse/parser-pr28152-2.C: Likewise.
	* g++.dg/parse/parser-pr28152.C: Likewise.
	* g++.dg/parse/pr68722.C: Likewise.
	* g++.dg/pr46852.C: Likewise.
	* g++.dg/pr46868.C: Likewise.
	* g++.dg/template/crash115.C: Likewise.
	* g++.dg/template/crash43.C: Likewise.
	* g++.dg/template/crash90.C: Likewise.
	* g++.dg/template/error-recovery1.C: Likewise.
	* g++.dg/template/error57.C: Likewise.
	* g++.old-deja/g++.other/crash31.C: Likewise.
	* gcc.dg/empty-source-2.c: Likewise.
	* gcc.dg/empty-source-3.c: Likewise.
	* gcc.dg/noncompile/pr30552-3.c: Likewise.
	* gcc.dg/noncompile/pr35447-1.c: Likewise.
	* gcc.dg/pr20245-1.c: Likewise.
	* gcc.dg/pr28419.c: Likewise.
	* gcc.dg/rtl/truncated-rtl-file.c: Likewise.
	* gcc.dg/unclosed-init.c: Likewise.
	* obj-c++.dg/property/property-neg-6.mm: Likewise.
	* obj-c++.dg/syntax-error-10.mm: Likewise.
	* obj-c++.dg/syntax-error-8.mm: Likewise.
	* obj-c++.dg/syntax-error-9.mm: Likewise.
2020-05-12 13:40:29 -07:00
Nathan Sidwell b224c3763e preprocessor: Reimplement directives only processing, support raw literals.
The existing directives-only code (a) punched a hole through the
libcpp interface and (b) didn't support raw string literals.  This
reimplements this preprocessing mode.  I added a proper callback
interface, and adjusted c-ppoutput to use it.  Sadly I cannot get rid
of the libcpp/internal.h include for unrelated reasons.

The new scanner is in lex.c, and works by doing some backwards scanning
when it finds a character of interest.  This reduces the number of
cases one has to deal with in forward scanning.  It may have different
failure modes than forward scanning on bad tokenization.

Finally, I moved some cpp tests from the C-specific gcc.dg/cpp directory
to the c-c++-common/cpp shared directory.

	libcpp/
	* directives-only.c: Delete.
	* Makefile.in (libcpp_a_OBJS, libcpp_a_SOURCES): Remove it.
	* include/cpplib.h (enum CPP_DO_task): New enum.
	(cpp_directive_only_preprocess): Declare.
	* internal.h (_cpp_dir_only_callbacks): Delete.
	(_cpp_preprocess_dir_only): Delete.
	* lex.c (do_peek_backslash, do_peek_next, do_peek_prev): New.
	(cpp_directives_only_process): New implementation.

	gcc/c-family/
	Reimplement directives only processing.
	* c-ppoutput.c (token_streamer): New.
	(directives_only_cb): New.  Swallow ...
	(print_lines_directives_only): ... this.
	(scan_translation_unit_directives_only): Reimplement using the
	published interface.

	gcc/testsuite/
	* gcc.dg/cpp/counter-[23].c: Move to c-c++-common/cpp.
	* gcc.dg/cpp/dir-only-*: Likewise.
	* c-c++-common/cpp/dir-only-[78].c: New.
2020-05-08 11:13:29 -07:00
Jakub Jelinek 8d9254fc8a Update copyright years.
From-SVN: r279813
2020-01-01 12:51:42 +01:00
Jason Merrill b7689b962d Implement C++20 operator<=>.
There are three major pieces to this support: scalar operator<=>,
synthesis of comparison operators, and rewritten/reversed overload
resolution (e.g. a < b becomes 0 > b <=> a).

Unlike other defaulted functions, where we use synthesized_method_walk to
semi-simulate what the definition of the function will be like, this patch
determines the characteristics of a comparison operator by trying to define
it.

My handling of non-dependent rewritten operators in templates can still use
some work: build_min_non_dep_op_overload can't understand the rewrites and
crashes, so I'm avoiding it for now by clearing *overload.  This means we'll
do name lookup again at instantiation time, which can incorrectly mean a
different result.  I'll poke at this more in stage 3.

I'm leaving out a fourth section ("strong structural equality") even though
I've implemented it, because it seems likely to change radically tomorrow.

Thanks to Tim van Deurzen and Jakub for implementing lexing of the <=>
operator, and Jonathan for the initial <compare> header.

gcc/cp/
	* cp-tree.h (struct lang_decl_fn): Add maybe_deleted bitfield.
	(DECL_MAYBE_DELETED): New.
	(enum special_function_kind): Add sfk_comparison.
	(LOOKUP_REWRITTEN, LOOKUP_REVERSED): New.
	* call.c (struct z_candidate): Add rewritten and reversed methods.
	(add_builtin_candidate): Handle SPACESHIP_EXPR.
	(add_builtin_candidates): Likewise.
	(add_candidates): Don't add a reversed candidate if the parms are
	the same.
	(add_operator_candidates): Split out from build_new_op_1.  Handle
	rewritten and reversed candidates.
	(add_candidate): Swap conversions of reversed candidate.
	(build_new_op_1): Swap them back.  Build a second operation for
	rewritten candidates.
	(extract_call_expr): Handle rewritten calls.
	(same_fn_or_template): New.
	(joust): Handle rewritten and reversed candidates.
	* class.c (add_implicitly_declared_members): Add implicit op==.
	(classtype_has_op, classtype_has_defaulted_op): New.
	* constexpr.c (cxx_eval_binary_expression): Handle SPACESHIP_EXPR.
	(cxx_eval_constant_expression, potential_constant_expression_1):
	Likewise.
	* cp-gimplify.c (genericize_spaceship): New.
	(cp_genericize_r): Use it.
	* cp-objcp-common.c (cp_common_init_ts): Handle SPACESHIP_EXPR.
	* decl.c (finish_function): Handle deleted function.
	* decl2.c (grokfield): SET_DECL_FRIEND_CONTEXT on defaulted friend.
	(mark_used): Check DECL_MAYBE_DELETED.  Remove assumption that
	defaulted functions are non-static members.
	* error.c (dump_expr): Handle SPACESHIP_EXPR.
	* method.c (type_has_trivial_fn): False for sfk_comparison.
	(enum comp_cat_tag, struct comp_cat_info_t): New types.
	(comp_cat_cache): New array variable.
	(lookup_comparison_result, lookup_comparison_category)
	(is_cat, cat_tag_for, spaceship_comp_cat)
	(spaceship_type, genericize_spaceship)
	(common_comparison_type, early_check_defaulted_comparison)
	(comp_info, build_comparison_op): New.
	(synthesize_method): Handle sfk_comparison.  Handle deleted.
	(get_defaulted_eh_spec, maybe_explain_implicit_delete)
	(explain_implicit_non_constexpr, implicitly_declare_fn)
	(defaulted_late_check, defaultable_fn_check): Handle sfk_comparison.
	* name-lookup.c (get_std_name_hint): Add comparison categories.
	* tree.c (special_function_p): Add sfk_comparison.
	* typeck.c (cp_build_binary_op): Handle SPACESHIP_EXPR.

2019-11-05  Tim van Deurzen  <tim@kompiler.org>

	Add new tree code for the spaceship operator.
gcc/cp/
	* cp-tree.def: Add new tree code.
	* operators.def: New binary operator.
	* parser.c: Add new token and tree code.
libcpp/
	* cpplib.h: Add spaceship operator for C++.
	* lex.c: Implement conditional lexing of spaceship operator for C++20.

2019-11-05  Jonathan Wakely  <jwakely@redhat.com>

libstdc++-v3/
	* libsupc++/compare: New header.
	* libsupc++/Makefile.am (std_HEADERS): Add compare.
	* include/std/version: Define __cpp_lib_three_way_comparison.
	* include/std/functional: #include <compare>.

From-SVN: r277865
2019-11-05 18:56:18 -05:00
Joseph Myers 93313b94fe Handle :: tokens in C for C2x.
As part of adding [[]]-style attributes, C2x adds the token :: for use
in scoped attribute names.

This patch adds corresponding support for that token in C to GCC.  The
token is supported both for C2x and for older gnu* standards (on the
basis that extensions are normally supported in older gnu* versions;
people will expect to be able to use [[]] attributes, before C2x is
the default, without needing to use -std=gnu2x).

There are no cases in older C standards where the token : can be
followed by a token starting with : in syntactically valid sources;
the only cases the :: token could break in older standard C thus are
ones involving concatenation of pp-tokens where the result does not
end up as tokens (e.g., gets stringized).  In GNU C extensions, the
main case where :: might appear in existing sources is in asm
statements, and the C parser is thus made to handle it like two
consecutive : tokens, which the C++ parser already does.  A limited
test of various positionings of :: in asm statements is added to the
testsuite (in particular, to cover the syntax error when :: means too
many colons but a single : would be OK), but existing tests cover a
variety of styles there anyway.

Technically there are cases in Objective-C and OpenMP for which this
also changes how previously valid code is lexed: the objc-selector-arg
syntax allows multiple consecutive : tokens (although I don't think
they are particularly useful there), while OpenMP syntax includes
array section syntax such as [:] which, before :: was a token, could
also be written as [::> (there might be other OpenMP cases potentially
affected, I didn't check all the OpenMP syntax in detail).  I don't
think either of those cases affects the basis for supporting the ::
token in all -std=gnu* modes, or that there is any obvious need to
special-case handling of CPP_SCOPE tokens for those constructs the way
there is for asm statements.

cpp_avoid_paste, which determines when spaces need adding between
tokens in preprocessed output where there wouldn't otherwise be
whitespace between them (e.g. if stringized), already inserts space
between : and : unconditionally, rather than only for C++, so no
change is needed there (but a C2x test is added that such space is
indeed inserted).
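
The lexing change amounts to a conditional on a language option, which
can be sketched as follows.  The names tok, struct lexed and lex_colon
are hypothetical, and the real _cpp_lex_direct also handles the :>
digraph, which is left out here.

```c
#include <stdbool.h>
#include <stddef.h>

enum tok { TOK_COLON, TOK_SCOPE, TOK_OTHER };

struct lexed
{
  enum tok kind;
  size_t len;	/* Number of characters consumed.  */
};

/* Lex one token at the start of the NUL-terminated string S.  "::"
   becomes a single scope token only when SCOPE_ENABLED is true;
   otherwise the first ':' is a token of its own.  */
static struct lexed
lex_colon (const char *s, bool scope_enabled)
{
  struct lexed r;

  if (s[0] == ':')
    {
      if (scope_enabled && s[1] == ':')
	{
	  r.kind = TOK_SCOPE;
	  r.len = 2;
	}
      else
	{
	  r.kind = TOK_COLON;
	  r.len = 1;
	}
    }
  else
    {
      r.kind = TOK_OTHER;
      r.len = s[0] != '\0' ? 1 : 0;
    }
  return r;
}
```

A parser that wants the old behaviour in one construct, as the asm
statement handling does, can treat a scope token as two consecutive
colons.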

Bootstrapped with no regressions on x86-64-pc-linux-gnu.

gcc/c:
	* c-parser.c (c_parser_asm_statement): Handle CPP_SCOPE like two
	CPP_COLON tokens.

gcc/testsuite:
	* gcc.dg/asm-scope-1.c, gcc.dg/cpp/c11-scope-1.c,
	gcc.dg/cpp/c17-scope-1.c, gcc.dg/cpp/c2x-scope-1.c,
	gcc.dg/cpp/c2x-scope-2.c, gcc.dg/cpp/c90-scope-1.c,
	gcc.dg/cpp/c94-scope-1.c, gcc.dg/cpp/c99-scope-1.c,
	gcc.dg/cpp/gnu11-scope-1.c, gcc.dg/cpp/gnu17-scope-1.c,
	gcc.dg/cpp/gnu89-scope-1.c, gcc.dg/cpp/gnu99-scope-1.c: New tests.

libcpp:
	* include/cpplib.h (struct cpp_options): Add member scope.
	* init.c (struct lang_flags, lang_defaults): Likewise.
	(cpp_set_lang): Set scope member of pfile.
	* lex.c (_cpp_lex_direct): Test CPP_OPTION (pfile, scope) not
	CPP_OPTION (pfile, cplusplus) for creating CPP_SCOPE tokens.

From-SVN: r276434
2019-10-02 01:08:40 +01:00
Lewis Hyatt 7d112d6670 Support extended characters in C/C++ identifiers (PR c/67224)
libcpp/ChangeLog
2019-09-19  Lewis Hyatt  <lhyatt@gmail.com>

	PR c/67224
	* charset.c (_cpp_valid_utf8): New function to help lex UTF-8 tokens.
	* internal.h (_cpp_valid_utf8): Declare.
	* lex.c (forms_identifier_p): Use it to recognize UTF-8 identifiers.
	(_cpp_lex_direct): Handle UTF-8 in identifiers and CPP_OTHER tokens.
	Do all work in "default" case to avoid slowing down typical code paths.
	Also handle $ and UCN in the default case for consistency.
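
As an illustration of the UTF-8 side of this, a decoder for one
sequence might look as follows.  This is a sketch only: utf8_decode is
a hypothetical helper with a different interface from _cpp_valid_utf8,
and it does not check whether the code point is allowed in an
identifier, which needs the separate character tables.

```c
#include <stddef.h>

/* Decode one UTF-8 sequence from S, which has AVAIL readable bytes.
   On success store the code point in *CP and return the sequence
   length; return 0 for truncated, overlong, surrogate or out-of-range
   input.  */
static size_t
utf8_decode (const unsigned char *s, size_t avail, unsigned long *cp)
{
  unsigned long c, min;
  size_t n;

  if (avail == 0)
    return 0;
  if (s[0] < 0x80)
    {
      *cp = s[0];
      return 1;
    }
  else if ((s[0] & 0xE0) == 0xC0)
    n = 2, c = s[0] & 0x1F, min = 0x80;
  else if ((s[0] & 0xF0) == 0xE0)
    n = 3, c = s[0] & 0x0F, min = 0x800;
  else if ((s[0] & 0xF8) == 0xF0)
    n = 4, c = s[0] & 0x07, min = 0x10000;
  else
    return 0;	/* Stray continuation byte or invalid lead byte.  */

  if (avail < n)
    return 0;
  for (size_t i = 1; i < n; i++)
    {
      if ((s[i] & 0xC0) != 0x80)
	return 0;
      c = (c << 6) | (s[i] & 0x3F);
    }
  if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
    return 0;
  *cp = c;
  return n;
}
```

Keeping this work out of the common ASCII cases matches the note above
about doing it in the default case to avoid slowing typical code paths.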

gcc/ChangeLog
2019-09-19  Lewis Hyatt  <lhyatt@gmail.com>

	PR c/67224
	* doc/cpp.texi: Document support for extended characters in
	identifiers.
	* doc/cppopts.texi: Likewise.

gcc/testsuite/ChangeLog
2019-09-19  Lewis Hyatt  <lhyatt@gmail.com>

	PR c/67224
	* c-c++-common/cpp/ucnid-2011-1-utf8.c: New test.
	* g++.dg/cpp/ucnid-1-utf8.C: New test.
	* g++.dg/cpp/ucnid-2-utf8.C: New test.
	* g++.dg/cpp/ucnid-3-utf8.C: New test.
	* g++.dg/cpp/ucnid-4-utf8.C: New test.
	* g++.dg/other/ucnid-1-utf8.C: New test.
	* gcc.dg/cpp/ucnid-1-utf8.c: New test.
	* gcc.dg/cpp/ucnid-10-utf8.c: New test.
	* gcc.dg/cpp/ucnid-11-utf8.c: New test.
	* gcc.dg/cpp/ucnid-12-utf8.c: New test.
	* gcc.dg/cpp/ucnid-13-utf8.c: New test.
	* gcc.dg/cpp/ucnid-14-utf8.c: New test.
	* gcc.dg/cpp/ucnid-15-utf8.c: New test.
	* gcc.dg/cpp/ucnid-2-utf8.c: New test.
	* gcc.dg/cpp/ucnid-3-utf8.c: New test.
	* gcc.dg/cpp/ucnid-4-utf8.c: New test.
	* gcc.dg/cpp/ucnid-6-utf8.c: New test.
	* gcc.dg/cpp/ucnid-7-utf8.c: New test.
	* gcc.dg/cpp/ucnid-9-utf8.c: New test.
	* gcc.dg/ucnid-1-utf8.c: New test.
	* gcc.dg/ucnid-10-utf8.c: New test.
	* gcc.dg/ucnid-11-utf8.c: New test.
	* gcc.dg/ucnid-12-utf8.c: New test.
	* gcc.dg/ucnid-13-utf8.c: New test.
	* gcc.dg/ucnid-14-utf8.c: New test.
	* gcc.dg/ucnid-15-utf8.c: New test.
	* gcc.dg/ucnid-16-utf8.c: New test.
	* gcc.dg/ucnid-2-utf8.c: New test.
	* gcc.dg/ucnid-3-utf8.c: New test.
	* gcc.dg/ucnid-4-utf8.c: New test.
	* gcc.dg/ucnid-5-utf8.c: New test.
	* gcc.dg/ucnid-6-utf8.c: New test.
	* gcc.dg/ucnid-7-utf8.c: New test.
	* gcc.dg/ucnid-8-utf8.c: New test.
	* gcc.dg/ucnid-9-utf8.c: New test.

From-SVN: r275979
2019-09-19 20:56:11 +01:00
Nathan Sidwell 056f95ec95 [preprocessor/91639] #includes at EOF
https://gcc.gnu.org/ml/gcc-patches/2019-09/msg00280.html
	libcpp/
	PR preprocessor/91639
	* directives.c (do_include_common): Tell lexer we're a #include.
	* files.c (_cpp_stack_file): Lexer will have always incremented.
	* internal.h (struct cpp_context): Extend in_directive's
	semantics.
	* lex.c (_cpp_lex_direct): Increment line for final \n when lexing
	for an ISO #include.
	* line-map.c (linemap_line_start): Remember if we overflowed.

	gcc/testsuite/
	PR preprocessor/91639
	* c-c++-common/cpp/pr91639.c: New.
	* c-c++-common/cpp/pr91639-one.h: New.
	* c-c++-common/cpp/pr91639-two.h: New.

From-SVN: r275402
2019-09-05 11:23:48 +00:00
Andrew Pinski 3f23e487f3 [PATCH] Fix PR 81721: ICE with PCH and Pragma warning and C++ operator
libcpp/ChangeLog:
2019-05-19  Andrew Pinski  <apinski@marvell.com>

        PR pch/81721
        * lex.c (cpp_token_val_index <case SPELL_OPERATOR>): If tok->flags
        has NAMED_OP set, then return CPP_TOKEN_FLD_NODE.

gcc/testsuite/ChangeLog:
2019-05-19  Andrew Pinski  <apinski@marvell.com>

        PR pch/81721
        * g++.dg/pch/operator-1.C: New testcase.
        * g++.dg/pch/operator-1.Hs: New file.

From-SVN: r271395
2019-05-19 23:59:06 -07:00
Jakub Jelinek a554497024 Update copyright years.
From-SVN: r267494
2019-01-01 13:31:55 +01:00
David Malcolm 620e594be5 Eliminate source_location in favor of location_t
Historically GCC used location_t, while libcpp used source_location.

This inconsistency has been annoying me for a while, so this patch
removes source_location in favor of location_t throughout
(as the latter is shorter).

gcc/ChangeLog:
	* builtins.c: Replace "source_location" with "location_t".
	* diagnostic-show-locus.c: Likewise.
	* diagnostic.c: Likewise.
	* dumpfile.c: Likewise.
	* gcc-rich-location.h: Likewise.
	* genmatch.c: Likewise.
	* gimple.h: Likewise.
	* gimplify.c: Likewise.
	* input.c: Likewise.
	* input.h: Likewise.  Eliminate the typedef.
	* omp-expand.c: Likewise.
	* selftest.h: Likewise.
	* substring-locations.h (get_source_location_for_substring):
	Rename to..
	(get_location_within_string): ...this.
	* tree-cfg.c: Replace "source_location" with "location_t".
	* tree-cfgcleanup.c: Likewise.
	* tree-diagnostic.c: Likewise.
	* tree-into-ssa.c: Likewise.
	* tree-outof-ssa.c: Likewise.
	* tree-parloops.c: Likewise.
	* tree-phinodes.c: Likewise.
	* tree-phinodes.h: Likewise.
	* tree-ssa-loop-ivopts.c: Likewise.
	* tree-ssa-loop-manip.c: Likewise.
	* tree-ssa-phiopt.c: Likewise.
	* tree-ssa-phiprop.c: Likewise.
	* tree-ssa-threadupdate.c: Likewise.
	* tree-ssa.c: Likewise.
	* tree-ssa.h: Likewise.
	* tree-vect-loop-manip.c: Likewise.

gcc/c-family/ChangeLog:
	* c-common.c (c_get_substring_location): Update for renaming of
	get_source_location_for_substring to get_location_within_string.
	* c-lex.c: Replace "source_location" with "location_t".
	* c-opts.c: Likewise.
	* c-ppoutput.c: Likewise.

gcc/c/ChangeLog:
	* c-decl.c: Replace "source_location" with "location_t".
	* c-tree.h: Likewise.
	* c-typeck.c: Likewise.
	* gimple-parser.c: Likewise.

gcc/cp/ChangeLog:
	* call.c: Replace "source_location" with "location_t".
	* cp-tree.h: Likewise.
	* cvt.c: Likewise.
	* name-lookup.c: Likewise.
	* parser.c: Likewise.
	* typeck.c: Likewise.

gcc/fortran/ChangeLog:
	* cpp.c: Replace "source_location" with "location_t".
	* gfortran.h: Likewise.

gcc/go/ChangeLog:
	* go-gcc-diagnostics.cc: Replace "source_location" with "location_t".
	* go-gcc.cc: Likewise.
	* go-linemap.cc: Likewise.
	* go-location.h: Likewise.
	* gofrontend/README: Likewise.

gcc/jit/ChangeLog:
	* jit-playback.c: Replace "source_location" with "location_t".

gcc/testsuite/ChangeLog:
	* g++.dg/plugin/comment_plugin.c: Replace "source_location" with
	"location_t".
	* gcc.dg/plugin/diagnostic_plugin_test_show_locus.c: Likewise.

libcc1/ChangeLog:
	* libcc1plugin.cc: Replace "source_location" with "location_t".
	(plugin_context::get_source_location): Rename to...
	(plugin_context::get_location_t): ...this.
	* libcp1plugin.cc: Likewise.

libcpp/ChangeLog:
	* charset.c: Replace "source_location" with "location_t".
	* directives-only.c: Likewise.
	* directives.c: Likewise.
	* errors.c: Likewise.
	* expr.c: Likewise.
	* files.c: Likewise.
	* include/cpplib.h: Likewise.  Rename MAX_SOURCE_LOCATION to
	MAX_LOCATION_T.
	* include/line-map.h: Likewise.
	* init.c: Likewise.
	* internal.h: Likewise.
	* lex.c: Likewise.
	* line-map.c: Likewise.
	* location-example.txt: Likewise.
	* macro.c: Likewise.
	* pch.c: Likewise.
	* traditional.c: Likewise.

From-SVN: r266085
2018-11-13 20:05:03 +00:00
Nathan Sidwell f3f6029db2 [6/6] Preprocessor forced macro location
https://gcc.gnu.org/ml/gcc-patches/2018-10/msg02044.html
	libcpp/
	* internal.h (struct cpp_reader): Rename forced_token_location_p
	to forced_token_location and drop its pointerness.
	* include/cpplib.h (cpp_force_token_locations): Take location, not
	pointer to one.
	* init.c (cpp_create_reader): Adjust.
	* lex.c (cpp_read_main_file): 

	gcc/c-family/
	* c-opts.c (c_finish_options): Adjust cpp_force_token_locations call.

	gcc/fortran/
	* cpp.c (gfc_cpp_init): Adjust cpp_force_token_locations call.

From-SVN: r265692
2018-10-31 15:26:28 +00:00
Nathan Sidwell 10f04917ab [PATCH] Macro body is trailing array
https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01037.html
	* include/cpplib.h (enum cpp_macro_kind): New.
	(struct cpp_macro): Make body trailing array.  Add kind field,
	delete traditional flag.
	* internal.h (_cpp_new_macro): Declare.
	(_cpp_reserve_room): New inline.
	(_cpp_commit_buff): Declare.
	(_cpp_create_trad_definition): Return new macro.
	* lex.c (_cpp_commit_buff): New.
	* macro.c (macro_real_token_count): Count backwards.
	(replace_args): Pointer equality not orderedness.
	(_cpp_save_parameter): Use _cpp_reserve_room.
	(alloc_expansion_token): Delete.
	(lex_expansion_token): Return macro pointer.  Use _cpp_reserve_room.
	(create_iso_definition): Allocate macro itself.  Adjust for
	different allocation ordering.
	(_cpp_new_macro): New.
	(_cpp_create_definition): Adjust for API changes.
	* traditional.c (push_replacement_text): Don't set traditional
	flag.
	(save_replacement_text): Likewise.
	(_cpp_create_trad_definition): Allocate macro itself.  Adjust for
	different allocation ordering.

From-SVN: r263622
2018-08-17 16:07:19 +00:00
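The "body is trailing array" layout named in the commit above can be illustrated with a C flexible array member. This is only a minimal sketch: struct toy_macro, toy_macro_new and toy_macro_demo are hypothetical names, plain ints stand in for tokens, and the real struct cpp_macro and its allocator in libcpp differ.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a macro whose body is stored as a trailing
   array, so header and body live in one allocation.  */
struct toy_macro
{
  unsigned count;   /* Number of entries in body.  */
  int body[];       /* Flexible array member; must be last.  */
};

/* Allocate the header and COUNT body entries in a single block.  */
static struct toy_macro *
toy_macro_new (const int *tokens, unsigned count)
{
  struct toy_macro *m = malloc (sizeof *m + count * sizeof m->body[0]);
  if (m == NULL)
    return NULL;
  m->count = count;
  if (count != 0)
    memcpy (m->body, tokens, count * sizeof m->body[0]);
  return m;
}

/* Usage: build a three-token body and read it back.  */
static void
toy_macro_demo (void)
{
  const int tokens[] = { 10, 20, 30 };
  struct toy_macro *m = toy_macro_new (tokens, 3);
  assert (m != NULL);
  assert (m->count == 3);
  assert (m->body[0] == 10 && m->body[2] == 30);
  free (m);
}
```

One allocation per macro means a single free releases both the header and the body, at the cost of having to know the body length before the final size is committed.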
Nathan Sidwell 3f6677f418 [PATCH] CPP Macro predicates
https://gcc.gnu.org/ml/gcc-patches/2018-08/msg00897.html
	libcpp/
	* include/cpplib.h (cpp_user_macro_p, cpp_builtin_macro_p)
	(cpp_macro_p): New inlines.
	* directives.c (do_pragma_poison): Use cpp_macro_p.
	(do_ifdef, do_ifndef): Likewise.  Use _cpp_maybe_notify_macro_use.
	(cpp_pop_definition): Use cpp_macro_p.  Move _cpp_free_definition
	earlier.  Don't zap node directly.
	* expr.c (parse_defined): Use _cpp_maybe_notify_macro_use &
	cpp_macro_p.
	* files.c (should_stack_file): Use cpp_macro_p.
	* identifiers.c (cpp_defined): Likewise.
	* internal.h (_cpp_mark_macro): Use cpp_user_macro_p.
	(_cpp_notify_macro_use): Declare.
	(_cpp_maybe_notify_macro_use): New inline.
	* lex.c (is_macro): Use cpp_macro_p.
	* macro.c (_cpp_warn_if_unused_macro): Use cpp_user_macro_p.
	(enter_macro_context): Likewise.
	(_cpp_create_definition): Use cpp_builtin_macro_p,
	cpp_user_macro_p.  Move _cpp_free_definition earlier.
	(_cpp_notify_macro_use): New, broken out of multiple call sites.
	* traditional.c (fun_like_macro_p): Use cpp_builtin_macro_p.
	(maybe_start_funlike, _cpp_scan_out_logical_line)
	(push_replacement_text): Likewise.
	gcc/c-family/
	* c-ada-spec.c (count_ada_macro): Use cpp_user_macro_p.
	(store_ada_macro): Likewise.
	* c-ppoutput.c (cb_used_define, dump_macro): Likewise.
	* c-spellcheck.cc (should_suggest_as_macro_p): Likewise.
	gcc/
	* config/rs6000/rs6000-c.c (rs6000_macro_to_expand): Use cpp_macro_p.
	* config/powerpcspe/powerpcspe-c.c (rs6000_macro_to_expand): Likewise.
	gcc/cp/
	* name-lookup.c (lookup_name_fuzzy): Likewise.
	gcc/fortran/
	* cpp.c (dump_macro): Use cpp_user_macro_p.

From-SVN: r263587
2018-08-16 13:51:38 +00:00
Jakub Jelinek 0c86a39db2 lex.c (_cpp_lex_direct): Use CPP_DL_NOTE instead of CPP_DL_PEDWARN...
* lex.c (_cpp_lex_direct): Use CPP_DL_NOTE instead of CPP_DL_PEDWARN,
	CPP_DL_WARNING or CPP_DL_ERROR for note that diagnostics for C++ style
	comments are reported only once per file and guard those calls on the
	preceding cpp_error returning true.

	* gcc.dg/cpp/pr61854-c90.c (foo): Expect a note, rather than error.
	* gcc.dg/cpp/pr61854-c94.c (foo): Likewise.
	* gcc.dg/cpp/pr61854-4.c (foo): Likewise.
	* gcc.dg/cpp/pr61854-8.c: New test.

From-SVN: r262832
2018-07-17 20:10:57 +02:00
Jonathan Wakely b44f8ad8b2 PR preprocessor/84517 allow double-underscore macros after string literals
gcc/testsuite:

	PR preprocessor/84517
	* g++.dg/cpp0x/udlit-macros.C: Expect a warning for ""__FILE__.

libcpp:

	PR preprocessor/84517
	* lex.c (is_macro_not_literal_suffix): New function.
	(lex_raw_string, lex_string): Use is_macro_not_literal_suffix to
	decide when to issue -Wliteral-suffix warnings.

From-SVN: r258069
2018-02-28 15:27:17 +00:00
Kelvin Nilsen a3a821c903 rs6000-p8swap.c (rs6000_sum_of_two_registers_p): New function.
gcc/ChangeLog:

2018-01-10  Kelvin Nilsen  <kelvin@gcc.gnu.org>

	* config/rs6000/rs6000-p8swap.c (rs6000_sum_of_two_registers_p):
	New function.
	(rs6000_quadword_masked_address_p): Likewise.
	(quad_aligned_load_p): Likewise.
	(quad_aligned_store_p): Likewise.
	(const_load_sequence_p): Add comment to describe the outer-most loop.
	(mimic_memory_attributes_and_flags): New function.
	(rs6000_gen_stvx): Likewise.
	(replace_swapped_aligned_store): Likewise.
	(rs6000_gen_lvx): Likewise.
	(replace_swapped_aligned_load): Likewise.
	(replace_swapped_load_constant): Capitalize argument name in
	comment describing this function.
	(rs6000_analyze_swaps): Add a third pass to search for vector loads
	and stores that access quad-word aligned addresses and replace
	with stvx or lvx instructions when appropriate.
	* config/rs6000/rs6000-protos.h (rs6000_sum_of_two_registers_p):
	New function prototype.
	(rs6000_quadword_masked_address_p): Likewise.
	(rs6000_gen_lvx): Likewise.
	(rs6000_gen_stvx): Likewise.
	* config/rs6000/vsx.md (*vsx_le_perm_load_<mode>): For modes
	VSX_D (V2DF, V2DI), modify this split to select lvx instruction
	when memory address is aligned.
	(*vsx_le_perm_load_<mode>): For modes VSX_W (V4SF, V4SI), modify
	this split to select lvx instruction when memory address is aligned.
	(*vsx_le_perm_load_v8hi): Modify this split to select lvx
	instruction when memory address is aligned.
	(*vsx_le_perm_load_v16qi): Likewise.
	(four unnamed splitters): Modify to select the stvx instruction
	when memory is aligned.

gcc/testsuite/ChangeLog:

2018-01-10  Kelvin Nilsen  <kelvin@gcc.gnu.org>

	* gcc.target/powerpc/pr48857.c: Modify dejagnu directives to look
	for lvx and stvx instead of lxvd2x and stxvd2x and require
	little-endian target.  Add comments.
	* gcc.target/powerpc/swaps-p8-28.c: Add functions for more
	comprehensive testing.
	* gcc.target/powerpc/swaps-p8-29.c: Likewise.
	* gcc.target/powerpc/swaps-p8-30.c: Likewise.
	* gcc.target/powerpc/swaps-p8-31.c: Likewise.
	* gcc.target/powerpc/swaps-p8-32.c: Likewise.
	* gcc.target/powerpc/swaps-p8-33.c: Likewise.
	* gcc.target/powerpc/swaps-p8-34.c: Likewise.
	* gcc.target/powerpc/swaps-p8-35.c: Likewise.
	* gcc.target/powerpc/swaps-p8-36.c: Likewise.
	* gcc.target/powerpc/swaps-p8-37.c: Likewise.
	* gcc.target/powerpc/swaps-p8-38.c: Likewise.
	* gcc.target/powerpc/swaps-p8-39.c: Likewise.
	* gcc.target/powerpc/swaps-p8-40.c: Likewise.
	* gcc.target/powerpc/swaps-p8-41.c: Likewise.
	* gcc.target/powerpc/swaps-p8-42.c: Likewise.
	* gcc.target/powerpc/swaps-p8-43.c: Likewise.
	* gcc.target/powerpc/swaps-p8-44.c: Likewise.
	* gcc.target/powerpc/swaps-p8-45.c: Likewise.
	* gcc.target/powerpc/vec-extract-2.c: Add comment and remove
	scan-assembler-not directives that forbid lvx and xxpermdi.
	* gcc.target/powerpc/vec-extract-3.c: Likewise.
	* gcc.target/powerpc/vec-extract-5.c: Likewise.
	* gcc.target/powerpc/vec-extract-6.c: Likewise.
	* gcc.target/powerpc/vec-extract-7.c: Likewise.
	* gcc.target/powerpc/vec-extract-8.c: Likewise.
	* gcc.target/powerpc/vec-extract-9.c: Likewise.
	* gcc.target/powerpc/vsx-vector-6-le.c: Change
	scan-assembler-times directives to reflect different numbers of
	expected xxlnor, xxlor, xvcmpgtdp, and xxland instructions.

libcpp/ChangeLog:

2018-01-10  Kelvin Nilsen  <kelvin@gcc.gnu.org>

	* lex.c (search_line_fast): Remove illegal coercion of an
	unaligned pointer value to vector pointer type and replace with
	use of __builtin_vec_vsx_ld () built-in function, which operates
	on unaligned pointer values.

From-SVN: r256656
2018-01-14 05:19:29 +00:00
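The libcpp change above replaces a cast of an unaligned pointer to a vector pointer type with __builtin_vec_vsx_ld, which is PowerPC-specific. The same idea can be sketched in portable C by loading through memcpy instead of dereferencing a cast pointer. This is an analogue under stated assumptions, not the libcpp code: load_u64_unaligned, word_has_byte and unaligned_demo are hypothetical names, and a 64-bit integer stands in for the vector type.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Load eight bytes from P without assuming any alignment.  Casting P to
   (const uint64_t *) and dereferencing it would be undefined behaviour
   when P is misaligned; memcpy is well defined for any P.  */
static uint64_t
load_u64_unaligned (const unsigned char *p)
{
  uint64_t v;
  memcpy (&v, p, sizeof v);
  return v;
}

/* Return nonzero if any byte of V equals C.  XOR turns matching bytes
   into zero, then the usual zero-byte test detects them.  */
static int
word_has_byte (uint64_t v, unsigned char c)
{
  const uint64_t ones = UINT64_C (0x0101010101010101);
  const uint64_t highs = UINT64_C (0x8080808080808080);
  uint64_t x = v ^ (ones * c);
  return ((x - ones) & ~x & highs) != 0;
}

/* Usage: the newline sits at index 8, so only the window that starts
   at the odd address text + 1 covers it.  */
static void
unaligned_demo (void)
{
  static const unsigned char text[] = "abcdefgh\nxyz";
  assert (!word_has_byte (load_u64_unaligned (text), '\n'));
  assert (word_has_byte (load_u64_unaligned (text + 1), '\n'));
}
```

Compilers typically turn the fixed-size memcpy into a single load on targets that allow unaligned access, so the well-defined form usually costs nothing.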
Jakub Jelinek 85ec4feb11 Update copyright years.
From-SVN: r256169
2018-01-03 11:03:58 +01:00
Michael Weiser 35c4515b8b [PATCH, PR83492] Fix selection of aarch64 big-endian shift parameters based on __AARCH64EB__
2017-12-20  Michael Weiser  <michael.weiser@gmx.de>

	PR preprocessor/83492
	* lex.c (search_line_fast) [__ARM_NEON && __ARM_64BIT_STATE]:
	Fix selection of big-endian shift parameters by using
	__ARM_BIG_ENDIAN.

From-SVN: r255896
2017-12-20 15:07:01 +00:00
Tom Tromey fb771b9dad Implement __VA_OPT__
This implements __VA_OPT__, a new preprocessor feature added in C++2A.
The paper can be found here:

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0306r4.html

gcc/ChangeLog

        * doc/cpp.texi (Variadic Macros): Document __VA_OPT__.

gcc/testsuite/ChangeLog

        * c-c++-common/cpp/va-opt-pedantic.c: New file.
        * c-c++-common/cpp/va-opt.c: New file.
        * c-c++-common/cpp/va-opt-error.c: New file.

libcpp/ChangeLog

        * pch.c (cpp_read_state): Set n__VA_OPT__.
        * macro.c (vaopt_state): New class.
        (_cpp_arguments_ok): Check va_opt flag.
        (replace_args, create_iso_definition): Use vaopt_state.
        * lex.c (lex_identifier_intern): Possibly issue errors for
        __VA_OPT__.
        (lex_identifier): Likewise.
        (maybe_va_opt_error): New function.
        * internal.h (struct lexer_state) <va_args_ok>: Update comment.
        (struct spec_nodes) <n__VA_OPT__>: New field.
        * init.c (struct lang_flags) <va_opt>: New field.
        (lang_defaults): Add entries for C++2A.  Update all entries for
        va_opt.
        (cpp_set_lang): Initialize va_opt.
        * include/cpplib.h (struct cpp_options) <va_opt>: New field.
        * identifiers.c (_cpp_init_hashtable): Initialize n__VA_OPT__.

From-SVN: r254707
2017-11-13 20:17:42 +00:00
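As a rough illustration of the __VA_OPT__ feature implemented by the commit above: the tokens inside __VA_OPT__ (...) are emitted only when the variadic arguments are non-empty. FORMAT, HAS_ARGS and va_opt_demo below are hypothetical names chosen for this sketch, and it assumes a compiler that accepts __VA_OPT__ in C mode (GCC does so as an extension outside C++2A, possibly with a pedantic warning).

```c
#include <assert.h>
#include <stdio.h>

/* The comma before __VA_ARGS__ appears only when arguments follow FMT,
   so both FORMAT (b, n, "x") and FORMAT (b, n, "%d", 1) are valid.  */
#define FORMAT(buf, size, fmt, ...) \
  snprintf ((buf), (size), fmt __VA_OPT__(,) __VA_ARGS__)

/* Expands to (0) with no arguments and to (0 + 1) otherwise.  */
#define HAS_ARGS(...) (0 __VA_OPT__(+ 1))

/* Usage: exercise both the empty and the non-empty case.  */
static void
va_opt_demo (void)
{
  char buf[32];

  /* No variadic arguments: no stray comma is generated.  */
  assert (FORMAT (buf, sizeof buf, "plain") == 5);
  assert (buf[0] == 'p' && buf[5] == '\0');

  /* With variadic arguments: the comma is inserted.  */
  assert (FORMAT (buf, sizeof buf, "%d-%d", 1, 2) == 3);
  assert (buf[0] == '1' && buf[1] == '-' && buf[2] == '2');

  assert (HAS_ARGS () == 0);
  assert (HAS_ARGS (1, 2) == 1);
}
```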
Mukesh Kapoor 7d19c460ed re PR c++/80955 (Macros expanded in definition of user-defined literals)
/libcpp
2017-11-06  Mukesh Kapoor  <mukesh.kapoor@oracle.com>

	PR c++/80955
	* lex.c (lex_string): When checking for a valid macro for the
	warning related to -Wliteral-suffix (CPP_W_LITERAL_SUFFIX),
	check that the macro name does not start with an underscore
	before calling is_macro().

/gcc/testsuite
2017-11-06  Mukesh Kapoor  <mukesh.kapoor@oracle.com>

	PR c++/80955
	* g++.dg/cpp0x/udlit-macros.C: New.

From-SVN: r254443
2017-11-06 10:33:41 +00:00
Tom de Vries c830c7d5c7 [libcpp] Remove semicolon after do {} while (0) in BUF_APPEND
2017-11-05  Tom de Vries  <tom@codesourcery.com>

	PR other/82784
	* lex.c (BUF_APPEND): Remove semicolon after
	"do {} while (0)".

From-SVN: r254424
2017-11-05 09:58:16 +00:00
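Why the semicolon after "do {} while (0)" matters can be shown with a small stand-in macro. APPEND_TWICE, append_if_letter and append_demo are hypothetical names for this sketch; the real BUF_APPEND in lex.c has a different body.

```c
#include <assert.h>
#include <stddef.h>

/* Multi-statement macro wrapped in do { } while (0).  There is no
   trailing semicolon: the caller supplies it, so the expansion behaves
   like one ordinary statement.  */
#define APPEND_TWICE(buf, len, ch) \
  do                               \
    {                              \
      (buf)[(len)++] = (ch);       \
      (buf)[(len)++] = (ch);       \
    }                              \
  while (0)

/* Append CH twice if it is a lowercase letter, otherwise append '?'.
   If the macro ended in its own semicolon, the expansion here would be
   "while (0);;" and the second, empty statement would detach the else
   from the if, which is a syntax error.  */
static size_t
append_if_letter (char *buf, size_t len, char ch)
{
  assert (buf != NULL);
  if (ch >= 'a' && ch <= 'z')
    APPEND_TWICE (buf, len, ch);
  else
    buf[len++] = '?';
  return len;
}

/* Usage: one letter and one non-letter.  */
static void
append_demo (void)
{
  char buf[8];
  size_t len = append_if_letter (buf, 0, 'x');
  assert (len == 2 && buf[0] == 'x' && buf[1] == 'x');
  len = append_if_letter (buf, len, '7');
  assert (len == 3 && buf[2] == '?');
}
```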
David Malcolm 05945a1b83 libcpp: add callback for comment-handling
gcc/testsuite/ChangeLog:
	* g++.dg/plugin/comment_plugin.c: New test plugin.
	* g++.dg/plugin/comments-1.C: New test file.
	* g++.dg/plugin/plugin.exp (plugin_test_list): Add the above.

libcpp/ChangeLog:
	* include/cpplib.h (struct cpp_callbacks): Add "comment"
	callback.
	* lex.c (_cpp_lex_direct): Call the comment callback if non-NULL.

From-SVN: r248901
2017-06-05 20:53:06 +00:00
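The "call the comment callback if non-NULL" pattern from the commit above can be sketched as follows. Everything here is hypothetical (struct toy_callbacks, toy_lex_comment, count_comment, comment_demo and the callback signature); the actual callback in struct cpp_callbacks takes libcpp-specific arguments that are not shown in this log.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback table: a NULL member means "not interested".  */
struct toy_callbacks
{
  void (*comment) (void *user, const char *text, size_t len);
};

/* Called by a toy lexer whenever it has scanned a comment.  The
   callback is optional, so it is only invoked when set.  */
static void
toy_lex_comment (const struct toy_callbacks *cb, void *user,
                 const char *text, size_t len)
{
  if (cb->comment != NULL)
    cb->comment (user, text, len);
}

/* Example client: counts the comments it is told about.  */
static void
count_comment (void *user, const char *text, size_t len)
{
  (void) text;
  (void) len;
  ++*(unsigned *) user;
}

/* Usage: with no callback nothing happens; with one, it fires once per
   comment.  */
static void
comment_demo (void)
{
  unsigned seen = 0;
  struct toy_callbacks cb = { NULL };

  toy_lex_comment (&cb, &seen, "/* a */", 7);
  assert (seen == 0);

  cb.comment = count_comment;
  toy_lex_comment (&cb, &seen, "/* b */", 7);
  toy_lex_comment (&cb, &seen, "// c", 4);
  assert (seen == 2);
}
```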
Jonathan Wakely 5764ee3c84 Fix numerous typos in comments
gcc:

	* alias.c (base_alias_check): Fix typo in comment.
	* cgraph.h (class ipa_polymorphic_call_context): Likewise.
	* cgraphunit.c (symbol_table::compile): Likewise.
	* collect2.c (maybe_run_lto_and_relink): Likewise.
	* config/arm/arm.c (arm_thumb1_mi_thunk): Likewise.
	* config/avr/avr-arch.h (avr_arch_info_t): Likewise.
	* config/avr/avr.c (avr_map_op_t): Likewise.
	* config/cr16/cr16.h (DATA_ALIGNMENT): Likewise.
	* config/epiphany/epiphany.c (TARGET_ARG_PARTIAL_BYTES): Likewise.
	* config/epiphany/epiphany.md (movcc): Likewise.
	* config/i386/i386.c (legitimize_pe_coff_extern_decl): Likewise.
	* config/m68k/m68k.c (struct _sched_ib, m68k_sched_variable_issue):
	Likewise.
	* config/mips/mips.c (mips_save_restore_reg): Likewise.
	* config/rx/rx.c (rx_is_restricted_memory_address): Likewise.
	* config/s390/s390.c (Z10_EARLYLOAD_DISTANCE): Likewise.
	* config/sh/sh.c (sh_rtx_costs): Likewise.
	* fold-const.c (fold_truth_andor): Likewise.
	* genautomata.c (collapse_flag): Likewise.
	* gengtype.h (struct type::u::s): Likewise.
	* gensupport.c (has_subst_attribute, add_mnemonic_string): Likewise.
	* input.c (FORMAT_AMOUNT): Likewise.
	* ipa-cp.c (class ipcp_lattice, agg_replacements_to_vector)
	(known_aggs_to_agg_replacement_list): Likewise.
	* ipa-inline-analysis.c: Likewise.
	* ipa-inline.h (estimate_edge_time, estimate_edge_hints): Likewise.
	* ipa-polymorphic-call.c
	(ipa_polymorphic_call_context::restrict_to_inner_class): Likewise.
	* loop-unroll.c (analyze_insn_to_expand_var): Likewise.
	* lra.c (lra_optional_reload_pseudos, lra_subreg_reload_pseudos):
	Likewise.
	* modulo-sched.c (apply_reg_moves): Likewise.
	* omp-expand.c (build_omp_regions_1): Likewise.
	* trans-mem.c (struct tm_wrapper_hasher): Likewise.
	* tree-ssa-loop-ivopts.c (may_eliminate_iv): Likewise.
	* tree-ssa-loop-niter.c (maybe_lower_iteration_bound): Likewise.
	* tree-vect-data-refs.c (vect_enhance_data_refs_alignment): Likewise.
	* value-prof.c: Likewise.
	* var-tracking.c (val_reset): Likewise.

gcc/ada:

	* doc/gnat_ugn/gnat_and_program_execution.rst: Fix typo.
	* g-socket.adb (To_Host_Entry): Fix typo in comment.
	* gnat_ugn.texi: Fix typo.
	* raise.c (_gnat_builtin_longjmp): Fix capitalization in comment.
	* s-stposu.adb (Allocate_Any_Controlled): Fix typo in comment.
	* sem_ch3.adb (Build_Derived_Record_Type): Likewise.
	* sem_util.adb (Mark_Coextensions): Likewise.
	* sem_util.ads (Available_Full_View_Of_Component): Likewise.

gcc/c:

	* c-array-notation.c: Fix typo in comment.

gcc/c-family:

	* c-warn.c (do_warn_double_promotion): Fix typo in comment.

gcc/cp:

	* class.c (update_vtable_entry_for_fn): Fix typo in comment.
	* decl2.c (one_static_initialization_or_destruction): Likewise.
	* name-lookup.c (store_bindings): Likewise.
	* parser.c (make_call_declarator): Likewise.
	* pt.c (check_explicit_specialization): Likewise.

gcc/testsuite:

	* g++.old-deja/g++.benjamin/scope02.C: Fix typo in comment.
	* gcc.dg/20031012-1.c: Likewise.
	* gcc.dg/ipa/ipcp-1.c: Likewise.
	* gcc.dg/torture/matrix-3.c: Likewise.
	* gcc.target/powerpc/ppc-spe.c: Likewise.
	* gcc.target/rx/zero-width-bitfield.c: Likewise.

libcpp:

	* include/line-map.h (LINEMAPS_MACRO_MAPS): Fix typo in comment.
	* lex.c (search_line_fast): Likewise.
	* pch.h (cpp_valid_state): Likewise.

libdecnumber:

	* decCommon.c (decFloatFromPackedChecked): Fix typo in comment.
	* decNumber.c (decNumberPower, decMultiplyOp): Likewise.

libgcc:

	* config/c6x/pr-support.c (__gnu_unwind_execute): Fix typo in comment.

libitm:

	* libitm_i.h (struct gtm_thread): Fix typo in comment.

From-SVN: r246664
2017-04-03 23:30:56 +01:00