
1073 Commits

Author SHA1 Message Date
GCC Administrator 70e4cb66c1 Daily bump. 2021-12-05 00:16:28 +00:00
Jakub Jelinek 55dfce4d5c libcpp: Fix up handling of deferred pragmas [PR102432]
The https://gcc.gnu.org/pipermail/gcc-patches/2020-November/557903.html
change broke the following testcases.  The problem is when a pragma
namespace allows expansion (i.e. p->is_nspace && p->allow_expansion),
e.g. the omp or acc namespaces do, then when parsing the second pragma
token we do it with pfile->state.in_directive set,
pfile->state.prevent_expansion clear and pfile->state.in_deferred_pragma
clear (the last one because we don't know yet if it will be a deferred
pragma or not).  If the pragma line only contains a single name
and newline after it, and there exists a function-like macro with the
same name, the preprocessor needs to peek at the next token in
funlike_invocation_p to check whether it is (, but in this case it sees a newline.
As pfile->state.in_directive is set, we don't read anything after the
newline, pfile->buffer->need_line is set and CPP_EOF is lexed, which
funlike_invocation_p doesn't push back.  Because name is a function-like
macro and on the pragma line there is no ( after the name, it isn't
expanded, and control flow returns to do_pragma.  If name is a valid
deferred pragma, we set pfile->state.in_deferred_pragma (and really
need it set so that e.g. end_directive later on doesn't eat all the
tokens from the pragma line).

Before Nathan's change (which unfortunately didn't contain rationale
on why it is better to do it like that), this wasn't a problem,
the next _cpp_lex_direct call, made when we want the next token, would return
CPP_PRAGMA_EOL when it saw buffer->need_line, which would turn off
pfile->state.in_deferred_pragma, and the following token fetch would already
read the next line.  But Nathan's patch replaced it with an assertion
failure that now triggers and CPP_PRAGMA_EOL is done only when lexing
the '\n'.  Except for this special case that works fine, but in
this case it doesn't because when peeking the token we still didn't know
that it will be a deferred pragma.
I've tried to fix that up in do_pragma by detecting this and pushing
CPP_PRAGMA_EOL as lookahead, but that doesn't work because end_directive
still needs to see pfile->state.in_deferred_pragma set.

So, this patch effectively reverts part of Nathan's change: the CPP_PRAGMA_EOL
addition is no longer done only when parsing the '\n', but in both
places, in the first one instead of the assertion failure.
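
A minimal sketch of the kind of input described above, not the contents of the new pr102432.c tests: a function-like macro shares its name with an OpenMP directive, and that name is the only token on the pragma line.  The names critical(x) and bump are chosen purely for illustration.  Without -fopenmp the pragma is simply ignored, so the snippet builds either way.

```c
/* A function-like macro with the same name as an OpenMP directive.  */
#define critical(x)

int bump(int v)
{
  /* "critical" is followed by a newline rather than '(', so the macro
     is not expanded, but the lexer still has to peek past the name.  */
#pragma omp critical
  v += 1;
  return v;
}
```

Calling bump (41) should return 42 whether or not OpenMP is enabled.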

2021-12-04  Jakub Jelinek  <jakub@redhat.com>

	PR preprocessor/102432
	* lex.c (_cpp_lex_direct): If buffer->need_line while
	pfile->state.in_deferred_pragma, return CPP_PRAGMA_EOL token instead
	of assertion failure.

	* c-c++-common/gomp/pr102432.c: New test.
	* c-c++-common/goacc/pr102432.c: New test.
2021-12-04 11:00:09 +01:00
GCC Administrator 03a9bd059b Daily bump. 2021-12-04 00:16:46 +00:00
Jakub Jelinek fe7c3ecff1 pch: Add support for PCH for relocatable executables [PR71934]
So, if we want to make PCH work for PIEs, I'd say we can:
1) add a new GTY option, say callback, which would act like
   skip for non-PCH and for PCH would make us skip it but
   remember for address bias translation
2) drop the skip for tree_translation_unit_decl::language
3) change get_unnamed_section to have const char * as
   last argument instead of const void *, change
   unnamed_section::data also to const char * and update
   everything related to that
4) maybe add a host hook whether it is ok to support binaries
   changing addresses (the only thing I'm worried about is if
   some host that uses function descriptors allocates them
   dynamically instead of having them somewhere in the
   executable)
5) maybe add a gengtype warning if it sees in GTY tracked
   structure a function pointer without that new callback
   option

Here is 1), 2), 3) implemented.

Note, on stdc++.h.gch/O2g.gch there are just those 10 relocations without
the second patch, with it a few more, but nothing huge.  And for non-PIEs
there isn't really any extra work on the load side except freading two scalar
values and fseek.
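
The address bias translation from point 1) can be sketched roughly as follows.  This is an illustration under stated assumptions, not the code in ggc-common.c: apply_relocations, image, slots and the anchor parameters are hypothetical names, and the "image" is just an array of address-sized words.

```c
#include <stddef.h>
#include <stdint.h>

/* Adjust every recorded callback slot by the difference between the
   address a known function had when the image was saved and the address
   it has now.  Returns the number of slots that were adjusted.  */
size_t apply_relocations(uintptr_t *image, const size_t *slots,
                         size_t nslots, uintptr_t saved_anchor,
                         uintptr_t current_anchor)
{
  /* Unsigned arithmetic wraps, so a "negative" bias also works.  */
  uintptr_t bias = current_anchor - saved_anchor;

  if (bias == 0)
    return 0;   /* Non-PIE case: addresses did not move.  */

  for (size_t i = 0; i < nslots; i++)
    image[slots[i]] += bias;
  return nslots;
}
```

When the executable is loaded at the same address as before, the bias is zero and nothing is touched, which matches the remark that non-PIEs see almost no extra work on the load side.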

2021-12-03  Jakub Jelinek  <jakub@redhat.com>

	PR pch/71934
gcc/
	* ggc.h (gt_pch_note_callback): Declare.
	* gengtype.h (enum typekind): Add TYPE_CALLBACK.
	(callback_type): Declare.
	* gengtype.c (dbgprint_count_type_at): Handle TYPE_CALLBACK.
	(callback_type): New variable.
	(process_gc_options): Add CALLBACK argument, handle callback
	option.
	(set_gc_used_type): Adjust process_gc_options caller, if callback,
	set type to &callback_type.
	(output_mangled_typename): Handle TYPE_CALLBACK.
	(walk_type): Likewise.  Handle callback option.
	(write_types_process_field): Handle TYPE_CALLBACK.
	(write_types_local_user_process_field): Likewise.
	(write_types_local_process_field): Likewise.
	(write_root): Likewise.
	(dump_typekind): Likewise.
	(dump_type): Likewise.
	* gengtype-state.c (type_lineloc): Handle TYPE_CALLBACK.
	(state_writer::write_state_callback_type): New method.
	(state_writer::write_state_type): Handle TYPE_CALLBACK.
	(read_state_callback_type): New function.
	(read_state_type): Handle TYPE_CALLBACK.
	* ggc-common.c (callback_vec): New variable.
	(gt_pch_note_callback): New function.
	(gt_pch_save): Stream out gt_pch_save function address and relocation
	table.
	(gt_pch_restore): Stream in saved gt_pch_save function address and
	relocation table and apply relocations if needed.
	* doc/gty.texi (callback): Document new GTY option.
	* varasm.c (get_unnamed_section): Change callback argument's type and
	last argument's type from const void * to const char *.
	(output_section_asm_op): Change argument's type from const void *
	to const char *, remove unnecessary cast.
	* tree-core.h (struct tree_translation_unit_decl): Drop GTY((skip))
	from language member.
	* output.h (unnamed_section_callback): Change argument type from
	const void * to const char *.
	(struct unnamed_section): Use GTY((callback)) instead of GTY((skip))
	for callback member.  Change data member type from const void *
	to const char *.
	(struct noswitch_section): Use GTY((callback)) instead of GTY((skip))
	for callback member.
	(get_unnamed_section): Change callback argument's type and
	last argument's type from const void * to const char *.
	(output_section_asm_op): Change argument's type from const void *
	to const char *.
	* config/avr/avr.c (avr_output_progmem_section_asm_op): Likewise.
	Remove unneeded cast.
	* config/darwin.c (output_objc_section_asm_op): Change argument's type
	from const void * to const char *.
	* config/pa/pa.c (som_output_text_section_asm_op): Likewise.
	(som_output_comdat_data_section_asm_op): Likewise.
	* config/rs6000/rs6000.c (rs6000_elf_output_toc_section_asm_op):
	Likewise.
	(rs6000_xcoff_output_readonly_section_asm_op): Likewise.  Instead
	of dereferencing directive hardcode variable names and decide based on
	whether directive is NULL or not.
	(rs6000_xcoff_output_readwrite_section_asm_op): Change argument's type
	from const void * to const char *.
	(rs6000_xcoff_output_tls_section_asm_op): Likewise.  Instead
	of dereferencing directive hardcode variable names and decide based on
	whether directive is NULL or not.
	(rs6000_xcoff_output_toc_section_asm_op): Change argument's type
	from const void * to const char *.
	(rs6000_xcoff_asm_init_sections): Adjust get_unnamed_section callers.
gcc/c-family/
	* c-pch.c (struct c_pch_validity): Remove pch_init member.
	(pch_init): Don't initialize v.pch_init.
	(c_common_valid_pch): Don't warn and punt if .text addresses change.
libcpp/
	* include/line-map.h (class line_maps): Add GTY((callback)) to
	reallocator and round_alloc_size members.
2021-12-03 11:03:30 +01:00
GCC Administrator 40fa651e60 Daily bump. 2021-12-02 00:16:33 +00:00
Jakub Jelinek c264208e16 libcpp: Enable P1949R7 for C++98 too [PR100977]
On Mon, Nov 29, 2021 at 05:53:58PM -0500, Jason Merrill wrote:
> I'm inclined to go ahead and change C++98 as well; I doubt anyone is relying
> on the particular C++98 extended character set rules, and we already accept
> the union of the different sets when not pedantic.

Ok, here is an incremental patch to do that also for -std={c,gnu}++98.

2021-12-01  Jakub Jelinek  <jakub@redhat.com>

	PR c++/100977
	* init.c (struct lang_flags): Remove cxx23_identifiers.
	(lang_defaults): Remove cxx23_identifiers initializers.
	(cpp_set_lang): Don't copy cxx23_identifiers.
	* include/cpplib.h (struct cpp_options): Adjust comment about
	c11_identifiers.  Remove cxx23_identifiers field.
	* lex.c (warn_about_normalization): Use cplusplus instead of
	cxx23_identifiers.
	* charset.c (ucn_valid_in_identifier): Likewise.

	* g++.dg/cpp/ucnid-1.C: Adjust expected diagnostics.
	* g++.dg/cpp/ucnid-1-utf8.C: Likewise.
2021-12-01 10:21:20 +01:00
Jakub Jelinek ac5fd364f0 libcpp: Fix up #__VA_OPT__ handling [PR103415]
stringify_arg uses pfile->u_buff to create the string literal.
Unfortunately, paste_tokens -> _cpp_lex_direct -> lex_number -> _cpp_unaligned_alloc
can in some cases use pfile->u_buff too, which results in losing everything
prepared for the string literal until the token pasting.

The following patch fixes that by not calling paste_tokens during the
construction of the string literal, but doing that before.  All the tokens
we are processing have been pushed into a token buffer using
tokens_buff_add_token so it is fine if we paste some of them in that buffer
(successful pasting creates a new token in that buffer), move following
tokens if any to make it contiguous, pop (throw away) the extra tokens at
the end and then do stringify_arg.

Also, paste_tokens now copies over PREV_WHITE and PREV_FALLTHROUGH flags
from the original lhs token to the replacement token.  Copying the
PREV_WHITE flag that way is needed for the #__VA_OPT__ handling and copying
over PREV_FALLTHROUGH fixes the new Wimplicit-fallthrough-38.c test.
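
The "paste first, then stringify" ordering can be modelled with a toy token buffer.  This is only a sketch of the ordering, not libcpp code: struct tok, paste_pass, stringify and paste_then_stringify are hypothetical names, and tokens are plain strings.

```c
#include <stdio.h>
#include <string.h>

struct tok {
  char text[32];
  int paste_left;   /* token is the lhs of a ## operator */
  int prev_white;   /* token was preceded by whitespace */
};

/* Merge each lhs token with its successor in place and compact the
   buffer.  The merged token keeps the lhs token's prev_white flag.  */
static size_t paste_pass(struct tok *t, size_t n)
{
  size_t out = 0;
  for (size_t i = 0; i < n; i++) {
    if (out > 0 && t[out - 1].paste_left) {
      struct tok *lhs = &t[out - 1];
      strncat(lhs->text, t[i].text, sizeof lhs->text - strlen(lhs->text) - 1);
      lhs->paste_left = t[i].paste_left;
    } else {
      if (out != i)
        t[out] = t[i];
      out++;
    }
  }
  return out;
}

/* Build the string only after all pasting is finished.  CAP must be > 0.  */
static void stringify(const struct tok *t, size_t n, char *out, size_t cap)
{
  size_t len = 0;
  out[0] = '\0';
  for (size_t i = 0; i < n; i++) {
    int w = snprintf(out + len, cap - len, "%s%s",
                     (i > 0 && t[i].prev_white) ? " " : "", t[i].text);
    if (w < 0 || (size_t) w >= cap - len)
      break;
    len += (size_t) w;
  }
}

/* Returns the number of tokens left after pasting.  */
size_t paste_then_stringify(struct tok *t, size_t n, char *out, size_t cap)
{
  size_t left = paste_pass(t, n);
  stringify(t, left, out, cap);
  return left;
}
```

With the tokens a (lhs of ##), b, and c (preceded by whitespace), the sketch leaves two tokens and produces the string "ab c".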

2021-12-01  Jakub Jelinek  <jakub@redhat.com>

	PR preprocessor/103415
libcpp/
	* macro.c (stringify_arg): Remove va_opt argument and va_opt handling.
	(paste_tokens): On successful paste, OR the PREV_WHITE and
	PREV_FALLTHROUGH flags from the *plhs token into the new token.
	(replace_args): Adjust stringify_arg callers.  For #__VA_OPT__,
	perform token pasting in a separate loop before stringify_arg call.
gcc/testsuite/
	* c-c++-common/cpp/va-opt-8.c: New test.
	* c-c++-common/Wimplicit-fallthrough-38.c: New test.
2021-12-01 10:07:59 +01:00
GCC Administrator c177e80609 Daily bump. 2021-12-01 00:17:04 +00:00
Richard Biener fa01e206c8 Remove more stray returns and gcc_unreachable ()s
This removes more cases that appear when bootstrap with
-Wunreachable-code-return progresses.

2021-11-29  Richard Biener  <rguenther@suse.de>

	* config/i386/i386.c (ix86_shift_rotate_cost): Remove
	unreachable return.
	* tree-chrec.c (evolution_function_is_invariant_rec_p):
	Likewise.
	* tree-if-conv.c (if_convertible_stmt_p): Likewise.
	* tree-ssa-pre.c (fully_constant_expression): Likewise.
	* tree-vrp.c (operand_less_p): Likewise.
	* reload.c (reg_overlap_mentioned_for_reload_p): Remove
	unreachable gcc_unreachable ().
	* sel-sched-ir.h (bb_next_bb): Likewise.
	* varasm.c (compare_constant): Likewise.

gcc/cp/
	* logic.cc (cnf_size_r): Remove unreachable and inconsistently
	placed gcc_unreachable ()s.
	* pt.c (iterative_hash_template_arg): Remove unreachable
	gcc_unreachable and return.

gcc/fortran/
	* target-memory.c (gfc_element_size): Remove unreachable return.

gcc/objc/
	* objc-act.c (objc_build_setter_call): Remove unreachable
	return.

libcpp/
	* charset.c (convert_escape): Remove unreachable break.
2021-11-30 15:05:12 +01:00
Jakub Jelinek 7abcc9ca20 libcpp: Enable P1949R7 for C++11 and up as it was a DR [PR100977]
Jonathan mentioned on IRC that:
"Accept P1949R7 (C++ Identifier Syntax using Unicode Standard Annex 31) as
a Defect Report and apply the changes therein to the C++ working paper."
while I've actually implemented it only for -std={gnu,c}++{23,2b}.
As the C++98 rules were significantly different, I'm not trying to change
anything for C++98.

2021-11-30  Jakub Jelinek  <jakub@redhat.com>

	PR c++/100977
	* init.c (lang_defaults): Enable cxx23_identifiers for
	-std={gnu,c}++{11,14,17,20} too.

	* c-c++-common/cpp/ucnid-2011-1-utf8.c: Expect errors in C++.
	* c-c++-common/cpp/ucnid-2011-1.c: Likewise.
	* g++.dg/cpp/ucnid-4-utf8.C: Add missing space to dg-options.
	* g++.dg/cpp23/normalize3.C: Enable for c++11 rather than just c++23.
	* g++.dg/cpp23/normalize4.C: Likewise.
	* g++.dg/cpp23/normalize5.C: Likewise.
	* g++.dg/cpp23/normalize7.C: Expect errors rather than just warnings
	for c++11 and up rather than just c++23.
	* g++.dg/cpp23/ucnid-2-utf8.C: Expect errors even for c++11 .. c++20.
2021-11-30 09:50:52 +01:00
GCC Administrator 87cd82c81d Daily bump. 2021-11-30 00:16:44 +00:00
Eric Gallager 909b30a17e Make etags path used by build system configurable
This commit allows users to specify a path to their "etags"
executable for use when doing "make tags".
I based this patch off of this one from upstream automake:
https://git.savannah.gnu.org/cgit/automake.git/commit/m4?id=d2ccbd7eb38d6a4277d6f42b994eb5a29b1edf29
This means that I just supplied variables that the user can override
for the tags programs, rather than having the configure scripts
actually check for them. I handle etags and ctags separately because
the intl subdirectory has separate targets for them. This commit
only affects the subdirectories that use handwritten Makefiles; the
ones that use automake will have to wait until we update the version
of automake used to be 1.16.4 or newer before they'll be fixed.

Addresses #103021

gcc/ChangeLog:

	PR other/103021
	* Makefile.in: Substitute CTAGS, ETAGS, and CSCOPE
	variables. Use ETAGS variable in TAGS target.
	* configure: Regenerate.
	* configure.ac: Allow CTAGS, ETAGS, and CSCOPE
	variables to be overridden.

gcc/ada/ChangeLog:

	PR other/103021
	* gcc-interface/Make-lang.in: Use ETAGS variable in
	TAGS target.

gcc/c/ChangeLog:

	PR other/103021
	* Make-lang.in: Use ETAGS variable in TAGS target.

gcc/cp/ChangeLog:

	PR other/103021
	* Make-lang.in: Use ETAGS variable in TAGS target.

gcc/d/ChangeLog:

	PR other/103021
	* Make-lang.in: Use ETAGS variable in TAGS target.

gcc/fortran/ChangeLog:

	PR other/103021
	* Make-lang.in: Use ETAGS variable in TAGS target.

gcc/go/ChangeLog:

	PR other/103021
	* Make-lang.in: Use ETAGS variable in TAGS target.

gcc/objc/ChangeLog:

	PR other/103021
	* Make-lang.in: Use ETAGS variable in TAGS target.

gcc/objcp/ChangeLog:

	PR other/103021
	* Make-lang.in: Use ETAGS variable in TAGS target.

intl/ChangeLog:

	PR other/103021
	* Makefile.in: Use ETAGS variable in TAGS target,
	CTAGS variable in CTAGS target, and MKID variable
	in ID target.
	* configure: Regenerate.
	* configure.ac: Allow CTAGS, ETAGS, and MKID
	variables to be overridden.

libcpp/ChangeLog:

	PR other/103021
	* Makefile.in: Use ETAGS variable in TAGS target.
	* configure: Regenerate.
	* configure.ac: Allow ETAGS variable to be overridden.

libiberty/ChangeLog:

	PR other/103021
	* Makefile.in: Use ETAGS variable in TAGS target.
	* configure: Regenerate.
	* configure.ac: Allow ETAGS variable to be overridden.
2021-11-29 13:24:12 -05:00
GCC Administrator e1d4359264 Daily bump. 2021-11-24 00:16:29 +00:00
Christophe Lyon 46d3cfd29d libcpp: Fix ATTR_LIKELY definition PR preprocessor/103355
Fix the definition of ATTR_LIKELY when __has_cpp_attribute is not
defined, as is the case with old compilers such as gcc-4.8.5.

	libcpp/:
	PR preprocessor/103355
	* system.h (ATTR_LIKELY): Fix definition.
2021-11-23 16:06:42 +00:00
Marek Polacek 630686f93f libcpp: Use [[likely]] conditionally
Let's hide [[likely]] behind a macro, to suppress warnings if the
compiler doesn't support it.
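
A hedged sketch of what such a macro might look like; the actual definition in system.h may differ.  The __cplusplus check is an assumption added here so that the snippet also builds as plain C, and classify_char is a hypothetical user of the macro.

```c
/* Only query the attribute when the compiler knows the
   __has_cpp_attribute operator at all; otherwise the #if below
   would itself be a preprocessing error on old compilers.  */
#if defined(__cplusplus) && defined(__has_cpp_attribute)
# if __has_cpp_attribute(likely)
#  define ATTR_LIKELY [[likely]]
# endif
#endif

/* Fallback: expand to nothing.  */
#ifndef ATTR_LIKELY
# define ATTR_LIKELY
#endif

int classify_char(int c)
{
  if (c < 0x80) ATTR_LIKELY
    return 1;   /* common case: plain ASCII */
  return 0;
}
```

Where the attribute is unavailable the macro expands to nothing, so the code compiles unchanged and only the optimization hint is lost.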

Co-authored-by: Jonathan Wakely <jwakely@redhat.com>

	PR preprocessor/103355

libcpp/ChangeLog:

	* lex.c: Use ATTR_LIKELY instead of [[likely]].
	* system.h (ATTR_LIKELY): Define.
2021-11-22 21:43:38 -05:00
GCC Administrator 06be28f64a Daily bump. 2021-11-23 00:16:27 +00:00
Jakub Jelinek a6e0d59370 libcpp: Fix _Pragma stringification [PR103165]
As the testcases show, sometimes _Pragma is turned into CPP_PRAGMA
.. CPP_PRAGMA_EOL tokens, even when it might still need to be
stringized later on.  We are then ICEing because we don't handle
stringification of CPP_PRAGMA or CPP_PRAGMA_EOL, but trying to
reconstruct the exact tokens with exact spacing after it has been
lowered is very hard.  So, instead this patch ensures we don't
lower _Pragma during expand_arg calls, but only later when
cpp_get_token_1 is called outside of expand_arg.

2021-11-22  Jakub Jelinek  <jakub@redhat.com>
	    Tobias Burnus  <tobias@codesourcery.com>

	PR preprocessor/103165
libcpp/
	* internal.h (struct lexer_state): Add ignore__Pragma field.
	* macro.c (builtin_macro): Don't interpret _Pragma if
	pfile->state.ignore__Pragma.
	(expand_arg): Temporarily set pfile->state.ignore__Pragma to 1.
gcc/testsuite/
	* c-c++-common/gomp/pragma-3.c: New test.
	* c-c++-common/gomp/pragma-4.c: New test.
	* c-c++-common/gomp/pragma-5.c: New test.

Co-Authored-By: Tobias Burnus <tobias@codesourcery.com>
2021-11-22 22:29:20 +01:00
GCC Administrator 280d2838c1 Daily bump. 2021-11-18 00:16:34 +00:00
David Malcolm bef32d4a28 libcpp: capture and underline ranges in -Wbidi-chars= [PR103026]
This patch converts the bidi::vec to use a struct so that we can
capture location_t values for the bidirectional control characters.

Before:

  Wbidi-chars-1.c: In function ‘main’:
  Wbidi-chars-1.c:6:43: warning: unpaired UTF-8 bidirectional control character detected [-Wbidi-chars=]
      6 |     /*<U+202E> } <U+2066>if (isAdmin)<U+2069> <U+2066> begin admins only */
        |                                                                           ^
  Wbidi-chars-1.c:9:28: warning: unpaired UTF-8 bidirectional control character detected [-Wbidi-chars=]
      9 |     /* end admins only <U+202E> { <U+2066>*/
        |                                            ^

After:

  Wbidi-chars-1.c: In function ‘main’:
  Wbidi-chars-1.c:6:43: warning: unpaired UTF-8 bidirectional control characters detected [-Wbidi-chars=]
      6 |     /*<U+202E> } <U+2066>if (isAdmin)<U+2069> <U+2066> begin admins only */
        |       ~~~~~~~~                                ~~~~~~~~                    ^
        |       |                                       |                           |
        |       |                                       |                           end of bidirectional context
        |       U+202E (RIGHT-TO-LEFT OVERRIDE)         U+2066 (LEFT-TO-RIGHT ISOLATE)
  Wbidi-chars-1.c:9:28: warning: unpaired UTF-8 bidirectional control characters detected [-Wbidi-chars=]
      9 |     /* end admins only <U+202E> { <U+2066>*/
        |                        ~~~~~~~~   ~~~~~~~~ ^
        |                        |          |        |
        |                        |          |        end of bidirectional context
        |                        |          U+2066 (LEFT-TO-RIGHT ISOLATE)
        |                        U+202E (RIGHT-TO-LEFT OVERRIDE)

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

gcc/testsuite/ChangeLog:
	PR preprocessor/103026
	* c-c++-common/Wbidi-chars-ranges.c: New test.

libcpp/ChangeLog:
	PR preprocessor/103026
	* lex.c (struct bidi::context): New.
	(bidi::vec): Convert to a vec of context rather than unsigned
	char.
	(bidi::ctx_at): Rename to...
	(bidi::pop_kind_at): ...this and reimplement for above change.
	(bidi::current_ctx): Update for change to vec.
	(bidi::current_ctx_ucn_p): Likewise.
	(bidi::current_ctx_loc): New.
	(bidi::on_char): Update for usage of context struct.  Add "loc"
	param and pass it when pushing contexts.
	(get_location_for_byte_range_in_cur_line): New.
	(get_bidi_utf8): Rename to...
	(get_bidi_utf8_1): ...this, reintroducing...
	(get_bidi_utf8): ...as a wrapper, setting *OUT when the result is
	not NONE.
	(get_bidi_ucn): Rename to...
	(get_bidi_ucn_1): ...this, reintroducing...
	(get_bidi_ucn): ...as a wrapper, setting *OUT when the result is
	not NONE.
	(class unpaired_bidi_rich_location): New.
	(maybe_warn_bidi_on_close): Use unpaired_bidi_rich_location when
	reporting on unpaired bidi chars.  Split into singular vs plural
	spellings.
	(maybe_warn_bidi_on_char): Pass in a location_t rather than a
	const uchar * and use it when emitting warnings, and when calling
	bidi::on_char.
	(_cpp_skip_block_comment): Capture location when kind is not NONE
	and pass it to maybe_warn_bidi_on_char.
	(skip_line_comment): Likewise.
	(forms_identifier_p): Likewise.
	(lex_raw_string): Likewise.
	(lex_string): Likewise.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2021-11-17 17:34:12 -05:00
David Malcolm 1a7f2c0774 libcpp: escape non-ASCII source bytes in -Wbidi-chars= [PR103026]
This flags rich_locations associated with -Wbidi-chars= so that
non-ASCII bytes will be escaped when printing the source lines
(using the diagnostics support I added in
r12-4825-gbd5e882cf6e0def3dd1bc106075d59a303fe0d1e).

In particular, this ensures that the printed source lines will
be pure ASCII, and thus the visual ordering of the characters
will be the same as the logical ordering.

Before:

  Wbidi-chars-1.c: In function ‘main’:
  Wbidi-chars-1.c:6:43: warning: unpaired UTF-8 bidirectional control character detected [-Wbidi-chars=]
      6 |     /*‮ } ⁦if (isAdmin)⁩ ⁦ begin admins only */
        |                                           ^
  Wbidi-chars-1.c:9:28: warning: unpaired UTF-8 bidirectional control character detected [-Wbidi-chars=]
      9 |     /* end admins only ‮ { ⁦*/
        |                            ^

  Wbidi-chars-11.c:6:15: warning: UTF-8 vs UCN mismatch when closing a context by "U+202C (POP DIRECTIONAL FORMATTING)" [-Wbidi-chars=]
      6 | int LRE_‪_PDF_\u202c;
        |               ^
  Wbidi-chars-11.c:8:19: warning: UTF-8 vs UCN mismatch when closing a context by "U+202C (POP DIRECTIONAL FORMATTING)" [-Wbidi-chars=]
      8 | int LRE_\u202a_PDF_‬_;
        |                   ^
  Wbidi-chars-11.c:10:28: warning: UTF-8 vs UCN mismatch when closing a context by "U+202C (POP DIRECTIONAL FORMATTING)" [-Wbidi-chars=]
     10 | const char *s1 = "LRE_‪_PDF_\u202c";
        |                            ^
  Wbidi-chars-11.c:12:33: warning: UTF-8 vs UCN mismatch when closing a context by "U+202C (POP DIRECTIONAL FORMATTING)" [-Wbidi-chars=]
     12 | const char *s2 = "LRE_\u202a_PDF_‬";
        |                                 ^

After:

  Wbidi-chars-1.c: In function ‘main’:
  Wbidi-chars-1.c:6:43: warning: unpaired UTF-8 bidirectional control character detected [-Wbidi-chars=]
      6 |     /*<U+202E> } <U+2066>if (isAdmin)<U+2069> <U+2066> begin admins only */
        |                                                                           ^
  Wbidi-chars-1.c:9:28: warning: unpaired UTF-8 bidirectional control character detected [-Wbidi-chars=]
      9 |     /* end admins only <U+202E> { <U+2066>*/
        |                                            ^

  Wbidi-chars-11.c:6:15: warning: UTF-8 vs UCN mismatch when closing a context by "U+202C (POP DIRECTIONAL FORMATTING)" [-Wbidi-chars=]
      6 | int LRE_<U+202A>_PDF_\u202c;
        |                       ^
  Wbidi-chars-11.c:8:19: warning: UTF-8 vs UCN mismatch when closing a context by "U+202C (POP DIRECTIONAL FORMATTING)" [-Wbidi-chars=]
      8 | int LRE_\u202a_PDF_<U+202C>_;
        |                   ^
  Wbidi-chars-11.c:10:28: warning: UTF-8 vs UCN mismatch when closing a context by "U+202C (POP DIRECTIONAL FORMATTING)" [-Wbidi-chars=]
     10 | const char *s1 = "LRE_<U+202A>_PDF_\u202c";
        |                                    ^
  Wbidi-chars-11.c:12:33: warning: UTF-8 vs UCN mismatch when closing a context by "U+202C (POP DIRECTIONAL FORMATTING)" [-Wbidi-chars=]
     12 | const char *s2 = "LRE_\u202a_PDF_<U+202C>";
        |                                 ^

libcpp/ChangeLog:
	PR preprocessor/103026
	* lex.c (maybe_warn_bidi_on_close): Use a rich_location
	and call set_escape_on_output (true) on it.
	(maybe_warn_bidi_on_char): Likewise.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2021-11-17 17:32:30 -05:00
Jakub Jelinek 049f0efeaa libcpp: Fix up handling of block comments in -fdirectives-only mode [PR103130]
Normal preprocessing, -fdirectives-only preprocessing before Nathan's
rewrite, and all other compilers I've tried on godbolt treat even \*/
as end of a block comment, but the new -fdirectives-only handling doesn't.
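
For illustration (this is not the new dir-only-9.c test), here is a comment closed by \*/ followed by code that must still be seen; after_comment is a hypothetical name.

```c
/* This comment ends here, despite the backslash: \*/
int after_comment(void)
{
  return 42;
}
```

If \*/ did not close the comment, the function definition would be swallowed as comment text.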

2021-11-17  Jakub Jelinek  <jakub@redhat.com>

	PR preprocessor/103130
	* lex.c (cpp_directive_only_process): Treat even \*/ as end of block
	comment.

	* c-c++-common/cpp/dir-only-9.c: New test.
2021-11-17 17:31:40 +01:00
Marek Polacek 51c500269b libcpp: Implement -Wbidi-chars for CVE-2021-42574 [PR103026]
From a link below:
"An issue was discovered in the Bidirectional Algorithm in the Unicode
Specification through 14.0. It permits the visual reordering of
characters via control sequences, which can be used to craft source code
that renders different logic than the logical ordering of tokens
ingested by compilers and interpreters. Adversaries can leverage this to
encode source code for compilers accepting Unicode such that targeted
vulnerabilities are introduced invisibly to human reviewers."

More info:
https://nvd.nist.gov/vuln/detail/CVE-2021-42574
https://trojansource.codes/

This is not a compiler bug.  However, to mitigate the problem, this patch
implements -Wbidi-chars=[none|unpaired|any] to warn about possibly
misleading Unicode bidirectional control characters the preprocessor may
encounter.

The default is =unpaired, which warns about improperly terminated
bidirectional control characters; e.g. a LRE without its corresponding PDF.
The level =any warns about any use of bidirectional control characters.

This patch handles both UCNs and UTF-8 characters.  UCNs designating
bidi characters in identifiers are accepted since r204886.  Then r217144
enabled -fextended-identifiers by default.  Extended characters in C/C++
identifiers have been accepted since r275979.  However, this patch still
warns about mixing UTF-8 and UCN bidi characters; there seems to be no
good reason to allow mixing them.

We warn in different contexts: comments (both C and C++-style), string
literals, character constants, and identifiers.  Expectedly, UCNs are ignored
in comments and raw string literals.  The bidirectional control characters
can nest so this patch handles that as well.
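
The nesting of contexts can be sketched with a small stack, assuming a simplified model that is not the lex.c implementation: only UTF-8 encoded characters are recognized (no UCNs), only the embedding/override and isolate families are tracked, and the names classify_bidi and count_unpaired_bidi are hypothetical.

```c
#include <stddef.h>

enum bidi_kind {
  BIDI_NONE,
  BIDI_EMBED_OPEN,     /* U+202A LRE, U+202B RLE, U+202D LRO, U+202E RLO */
  BIDI_EMBED_CLOSE,    /* U+202C PDF */
  BIDI_ISOLATE_OPEN,   /* U+2066 LRI, U+2067 RLI, U+2068 FSI */
  BIDI_ISOLATE_CLOSE   /* U+2069 PDI */
};

/* Classify the three-byte UTF-8 sequence starting at P, if any.  */
static enum bidi_kind classify_bidi(const unsigned char *p, size_t left)
{
  if (left < 3 || p[0] != 0xE2)
    return BIDI_NONE;
  if (p[1] == 0x80) {
    if (p[2] == 0xAC)
      return BIDI_EMBED_CLOSE;
    if (p[2] >= 0xAA && p[2] <= 0xAE)
      return BIDI_EMBED_OPEN;
  } else if (p[1] == 0x81) {
    if (p[2] == 0xA9)
      return BIDI_ISOLATE_CLOSE;
    if (p[2] >= 0xA6 && p[2] <= 0xA8)
      return BIDI_ISOLATE_OPEN;
  }
  return BIDI_NONE;
}

/* Return how many contexts are still open at the end of TEXT.  */
int count_unpaired_bidi(const char *text, size_t len)
{
  const unsigned char *p = (const unsigned char *) text;
  char stack[64];   /* 'E' = embedding/override, 'I' = isolate */
  int depth = 0;

  for (size_t i = 0; i < len; i++) {
    switch (classify_bidi(p + i, len - i)) {
    case BIDI_EMBED_OPEN:
      if (depth < 64) stack[depth++] = 'E';
      break;
    case BIDI_ISOLATE_OPEN:
      if (depth < 64) stack[depth++] = 'I';
      break;
    case BIDI_EMBED_CLOSE:
      /* PDF only closes an embedding that is on top.  */
      if (depth > 0 && stack[depth - 1] == 'E') depth--;
      break;
    case BIDI_ISOLATE_CLOSE: {
      /* PDI closes the innermost isolate and anything nested in it.  */
      int j = depth;
      while (j > 0 && stack[j - 1] != 'I') j--;
      if (j > 0) depth = j - 1;
      break;
    }
    case BIDI_NONE:
      break;
    }
  }
  return depth;
}
```

In this model, a non-zero result at the end of a comment or string literal corresponds to what the =unpaired level warns about.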

I have not included nor tested this at all with Fortran (which also has
string literals and line comments).

Dave M. posted patches improving diagnostics involving Unicode characters.
This patch does not make use of this new infrastructure yet.

	PR preprocessor/103026

gcc/c-family/ChangeLog:

	* c.opt (Wbidi-chars, Wbidi-chars=): New option.

gcc/ChangeLog:

	* doc/invoke.texi: Document -Wbidi-chars.

libcpp/ChangeLog:

	* include/cpplib.h (enum cpp_bidirectional_level): New.
	(struct cpp_options): Add cpp_warn_bidirectional.
	(enum cpp_warning_reason): Add CPP_W_BIDIRECTIONAL.
	* internal.h (struct cpp_reader): Add warn_bidi_p member
	function.
	* init.c (cpp_create_reader): Set cpp_warn_bidirectional.
	* lex.c (bidi): New namespace.
	(get_bidi_utf8): New function.
	(get_bidi_ucn): Likewise.
	(maybe_warn_bidi_on_close): Likewise.
	(maybe_warn_bidi_on_char): Likewise.
	(_cpp_skip_block_comment): Implement warning about bidirectional
	control characters.
	(skip_line_comment): Likewise.
	(forms_identifier_p): Likewise.
	(lex_identifier): Likewise.
	(lex_string): Likewise.
	(lex_raw_string): Likewise.

gcc/testsuite/ChangeLog:

	* c-c++-common/Wbidi-chars-1.c: New test.
	* c-c++-common/Wbidi-chars-2.c: New test.
	* c-c++-common/Wbidi-chars-3.c: New test.
	* c-c++-common/Wbidi-chars-4.c: New test.
	* c-c++-common/Wbidi-chars-5.c: New test.
	* c-c++-common/Wbidi-chars-6.c: New test.
	* c-c++-common/Wbidi-chars-7.c: New test.
	* c-c++-common/Wbidi-chars-8.c: New test.
	* c-c++-common/Wbidi-chars-9.c: New test.
	* c-c++-common/Wbidi-chars-10.c: New test.
	* c-c++-common/Wbidi-chars-11.c: New test.
	* c-c++-common/Wbidi-chars-12.c: New test.
	* c-c++-common/Wbidi-chars-13.c: New test.
	* c-c++-common/Wbidi-chars-14.c: New test.
	* c-c++-common/Wbidi-chars-15.c: New test.
	* c-c++-common/Wbidi-chars-16.c: New test.
	* c-c++-common/Wbidi-chars-17.c: New test.
2021-11-16 21:56:16 -05:00
GCC Administrator cf82e8d964 Daily bump. 2021-11-02 00:16:32 +00:00
David Malcolm bd5e882cf6 diagnostics: escape non-ASCII source bytes for certain diagnostics
This patch adds support to GCC's diagnostic subsystem for escaping certain
bytes and Unicode characters when quoting source code.

Specifically, this patch adds a new flag rich_location::m_escape_on_output
which is a hint from a diagnostic that non-ASCII bytes in the pertinent
lines of the user's source code should be escaped when printed.

The patch sets this for the following diagnostics:
- when complaining about stray bytes in the program (when these
are non-printable)
- when complaining about "null character(s) ignored"
- for -Wnormalized= (and generate source ranges for such warnings)

The escaping is controlled by a new option:
  -fdiagnostics-escape-format=[unicode|bytes]

For example, consider a diagnostic involving a source line containing the
string "before" followed by the Unicode character U+03C0 ("GREEK SMALL
LETTER PI", with UTF-8 encoding 0xCF 0x80) followed by the byte 0xBF
(a stray UTF-8 trailing byte), followed by the string "after", where the
diagnostic highlights the U+03C0 character.

By default, this line will be printed verbatim to the user when
reporting a diagnostic at it, as:

 beforeπXafter
       ^

(using X for the stray byte to avoid putting invalid UTF-8 in this
commit message)

If the diagnostic sets the "escape" flag, it will be printed as:

 before<U+03C0><BF>after
       ^~~~~~~~

with -fdiagnostics-escape-format=unicode (the default), or as:

  before<CF><80><BF>after
        ^~~~~~~~

if the user supplies -fdiagnostics-escape-format=bytes.

This only affects how the source is printed; it does not affect
how column numbers are printed (as per -fdiagnostics-column-unit=
and -fdiagnostics-column-origin=).
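
The two escape formats can be sketched as below.  This is a simplified illustration, not the code in diagnostic-show-locus.c: escape_source and decode_utf8 are hypothetical names, and the decoder does not reject overlong or surrogate encodings.

```c
#include <stdio.h>
#include <string.h>

/* Decode one UTF-8 sequence at P.  Returns its length in bytes (1..4)
   and stores the code point in *CP, or returns 0 for an invalid byte.  */
static int decode_utf8(const unsigned char *p, size_t left, unsigned long *cp)
{
  int need;
  unsigned long c;

  if (p[0] < 0x80) { *cp = p[0]; return 1; }
  else if ((p[0] & 0xE0) == 0xC0) { need = 1; c = p[0] & 0x1F; }
  else if ((p[0] & 0xF0) == 0xE0) { need = 2; c = p[0] & 0x0F; }
  else if ((p[0] & 0xF8) == 0xF0) { need = 3; c = p[0] & 0x07; }
  else return 0;

  if ((size_t) need >= left)
    return 0;
  for (int i = 1; i <= need; i++) {
    if ((p[i] & 0xC0) != 0x80)
      return 0;
    c = (c << 6) | (p[i] & 0x3F);
  }
  *cp = c;
  return need + 1;
}

/* Copy SRC to OUT, escaping non-ASCII content as <U+XXXX> (or as <XX>
   per byte when AS_BYTES is non-zero).  CAP must be greater than zero.  */
void escape_source(const char *src, size_t len, int as_bytes,
                   char *out, size_t cap)
{
  const unsigned char *p = (const unsigned char *) src;
  size_t o = 0;

  out[0] = '\0';
  for (size_t i = 0; i < len; ) {
    unsigned long cp = 0;
    int n = decode_utf8(p + i, len - i, &cp);
    int w;

    if (n == 1)
      w = snprintf(out + o, cap - o, "%c", (int) p[i]);
    else if (n > 1 && !as_bytes)
      w = snprintf(out + o, cap - o, "<U+%04lX>", cp);
    else {
      /* Invalid byte, or bytes mode: escape a single byte.  */
      w = snprintf(out + o, cap - o, "<%02X>", (unsigned) p[i]);
      n = 1;
    }
    if (w < 0 || (size_t) w >= cap - o)
      break;
    o += (size_t) w;
    i += (size_t) n;
  }
}
```

For the example line above, the sketch yields before<U+03C0><BF>after in the Unicode format and before<CF><80><BF>after in the bytes format.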

gcc/c-family/ChangeLog:
	* c-lex.c (c_lex_with_flags): When complaining about non-printable
	CPP_OTHER tokens, set the "escape on output" flag.

gcc/ChangeLog:
	* common.opt (fdiagnostics-escape-format=): New.
	(diagnostics_escape_format): New enum.
	(DIAGNOSTICS_ESCAPE_FORMAT_UNICODE): New enum value.
	(DIAGNOSTICS_ESCAPE_FORMAT_BYTES): Likewise.
	* diagnostic-format-json.cc (json_end_diagnostic): Add
	"escape-source" attribute.
	* diagnostic-show-locus.c
	(exploc_with_display_col::exploc_with_display_col): Replace
	"tabstop" param with a cpp_char_column_policy and add an "aspect"
	param.  Use these to compute m_display_col accordingly.
	(struct char_display_policy): New struct.
	(layout::m_policy): New field.
	(layout::m_escape_on_output): New field.
	(def_policy): New function.
	(make_range): Update for changes to exploc_with_display_col ctor.
	(default_print_decoded_ch): New.
	(width_per_escaped_byte): New.
	(escape_as_bytes_width): New.
	(escape_as_bytes_print): New.
	(escape_as_unicode_width): New.
	(escape_as_unicode_print): New.
	(make_policy): New.
	(layout::layout): Initialize new fields.  Update m_exploc ctor
	call for above change to ctor.
	(layout::maybe_add_location_range): Update for changes to
	exploc_with_display_col ctor.
	(layout::calculate_x_offset_display): Update for change to
	cpp_display_width.
	(layout::print_source_line): Pass policy
	to cpp_display_width_computation. Capture cpp_decoded_char when
	calling process_next_codepoint.  Move printing of source code to
	m_policy.m_print_cb.
	(line_label::line_label): Pass in policy rather than context.
	(layout::print_any_labels): Update for change to line_label ctor.
	(get_affected_range): Pass in policy rather than context, updating
	calls to location_compute_display_column accordingly.
	(get_printed_columns): Likewise, also for cpp_display_width.
	(correction::correction): Pass in policy rather than tabstop.
	(correction::compute_display_cols): Pass m_policy rather than
	m_tabstop to cpp_display_width.
	(correction::m_tabstop): Replace with...
	(correction::m_policy): ...this.
	(line_corrections::line_corrections): Pass in policy rather than
	context.
	(line_corrections::m_context): Replace with...
	(line_corrections::m_policy): ...this.
	(line_corrections::add_hint): Update to use m_policy rather than
	m_context.
	(line_corrections::add_hint): Likewise.
	(layout::print_trailing_fixits): Likewise.
	(selftest::test_display_widths): New.
	(selftest::test_layout_x_offset_display_utf8): Update to use
	policy rather than tabstop.
	(selftest::test_one_liner_labels_utf8): Add test of escaping
	source lines.
	(selftest::test_diagnostic_show_locus_one_liner_utf8): Update to
	use policy rather than tabstop.
	(selftest::test_overlapped_fixit_printing): Likewise.
	(selftest::test_overlapped_fixit_printing_utf8): Likewise.
	(selftest::test_overlapped_fixit_printing_2): Likewise.
	(selftest::test_tab_expansion): Likewise.
	(selftest::test_escaping_bytes_1): New.
	(selftest::test_escaping_bytes_2): New.
	(selftest::diagnostic_show_locus_c_tests): Call the new tests.
	* diagnostic.c (diagnostic_initialize): Initialize
	context->escape_format.
	(convert_column_unit): Update to use default character width policy.
	(selftest::test_diagnostic_get_location_text): Likewise.
	* diagnostic.h (enum diagnostics_escape_format): New enum.
	(diagnostic_context::escape_format): New field.
	* doc/invoke.texi (-fdiagnostics-escape-format=): New option.
	(-fdiagnostics-format=): Add "escape-source" attribute to examples
	of JSON output, and document it.
	* input.c (location_compute_display_column): Pass in "policy"
	rather than "tabstop", passing to
	cpp_byte_column_to_display_column.
	(selftest::test_cpp_utf8): Update to use cpp_char_column_policy.
	* input.h (class cpp_char_column_policy): New forward decl.
	(location_compute_display_column): Pass in "policy" rather than
	"tabstop".
	* opts.c (common_handle_option): Handle
	OPT_fdiagnostics_escape_format_.
	* selftest.c (temp_source_file::temp_source_file): New ctor
	overload taking a size_t.
	* selftest.h (temp_source_file::temp_source_file): Likewise.

gcc/testsuite/ChangeLog:
	* c-c++-common/diagnostic-format-json-1.c: Add regexp to consume
	"escape-source" attribute.
	* c-c++-common/diagnostic-format-json-2.c: Likewise.
	* c-c++-common/diagnostic-format-json-3.c: Likewise.
	* c-c++-common/diagnostic-format-json-4.c: Likewise, twice.
	* c-c++-common/diagnostic-format-json-5.c: Likewise.
	* gcc.dg/cpp/warn-normalized-4-bytes.c: New test.
	* gcc.dg/cpp/warn-normalized-4-unicode.c: New test.
	* gcc.dg/encoding-issues-bytes.c: New test.
	* gcc.dg/encoding-issues-unicode.c: New test.
	* gfortran.dg/diagnostic-format-json-1.F90: Add regexp to consume
	"escape-source" attribute.
	* gfortran.dg/diagnostic-format-json-2.F90: Likewise.
	* gfortran.dg/diagnostic-format-json-3.F90: Likewise.

libcpp/ChangeLog:
	* charset.c (convert_escape): Use encoding_rich_location when
	complaining about nonprintable unknown escape sequences.
	(cpp_display_width_computation::cpp_display_width_computation):
	Pass in policy rather than tabstop.
	(cpp_display_width_computation::process_next_codepoint): Add "out"
	param and populate *out if non-NULL.
	(cpp_display_width_computation::advance_display_cols): Pass NULL
	to process_next_codepoint.
	(cpp_byte_column_to_display_column): Pass in policy rather than
	tabstop.  Pass NULL to process_next_codepoint.
	(cpp_display_column_to_byte_column): Pass in policy rather than
	tabstop.
	* errors.c (cpp_diagnostic_get_current_location): New function,
	splitting out the logic from...
	(cpp_diagnostic): ...here.
	(cpp_warning_at): New function.
	(cpp_pedwarning_at): New function.
	* include/cpplib.h (cpp_warning_at): New decl for rich_location.
	(cpp_pedwarning_at): Likewise.
	(struct cpp_decoded_char): New.
	(struct cpp_char_column_policy): New.
	(cpp_display_width_computation::cpp_display_width_computation):
	Replace "tabstop" param with "policy".
	(cpp_display_width_computation::process_next_codepoint): Add "out"
	param.
	(cpp_display_width_computation::m_tabstop): Replace with...
	(cpp_display_width_computation::m_policy): ...this.
	(cpp_byte_column_to_display_column): Replace "tabstop" param with
	"policy".
	(cpp_display_width): Likewise.
	(cpp_display_column_to_byte_column): Likewise.
	* include/line-map.h (rich_location::escape_on_output_p): New.
	(rich_location::set_escape_on_output): New.
	(rich_location::m_escape_on_output): New.
	* internal.h (cpp_diagnostic_get_current_location): New decl.
	(class encoding_rich_location): New.
	* lex.c (skip_whitespace): Use encoding_rich_location when
	complaining about null characters.
	(warn_about_normalization): Generate a source range when
	complaining about improperly normalized tokens, rather than just a
	point, and use encoding_rich_location so that the source code
	is escaped on printing.
	* line-map.c (rich_location::rich_location): Initialize
	m_escape_on_output.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2021-11-01 09:35:46 -04:00
GCC Administrator 4c61300f2b Daily bump. 2021-10-30 00:16:25 +00:00
Tobias Burnus 0078a058a5 libcpp: Fix _Pragma expansion [PR102409]
Both #pragma and _Pragma ended up as CPP_PRAGMA. Presumably since
r131819 (2008, GCC 4.3) for PR34692, pragmas are not expanded in
macro arguments but are output as is ahead of the expansion. From the old bug report,
that was to fix usage like
  FOO (
    #pragma GCC diagnostic
  )
However, that change also affected _Pragma such that
  BAR (
    "1";
    _Pragma("omp ..."); )
yielded
  #pragma omp ...
followed by what BAR expanded to, possibly including '"1";'.

This commit adds a flag, PRAGMA_OP, to tokens to make the two
distinguishable - and to include _Pragma in the expanded arguments again.

libcpp/ChangeLog:

	PR c++/102409
	* directives.c (destringize_and_run): Add PRAGMA_OP to the
	CPP_PRAGMA token's flags to mark it as coming from _Pragma.
	* include/cpplib.h (PRAGMA_OP): #define, to be used with token flags.
	* macro.c (collect_args): Only handle CPP_PRAGMA special if PRAGMA_OP
	is set.

gcc/testsuite/ChangeLog:

	* c-c++-common/gomp/pragma-1.c: New test.
	* c-c++-common/gomp/pragma-2.c: New test.
2021-10-29 22:55:32 +02:00
GCC Administrator c2bd5d8a30 Daily bump. 2021-10-23 00:16:26 +00:00
Eric Gallager c3e80a16af Add install-dvi Makefile targets.
Closes #102663

ChangeLog:

	PR other/102663
	* Makefile.def: Handle install-dvi target.
	* Makefile.tpl: Likewise.
	* Makefile.in: Regenerate.

c++tools/ChangeLog:

	PR other/102663
	* Makefile.in: Add dummy install-dvi target.

gcc/ChangeLog:

	PR other/102663
	* Makefile.in: Handle dvidir and install-dvi target.
	* configure: Regenerate.
	* configure.ac: Add install-dvi to target_list.

gcc/ada/ChangeLog:

	PR other/102663
	* gcc-interface/Make-lang.in: Allow dvi-formatted
	documentation to be installed.

gcc/c/ChangeLog:

	PR other/102663
	* Make-lang.in: Add dummy c.install-dvi target.

gcc/cp/ChangeLog:

	PR other/102663
	* Make-lang.in: Add dummy c++.install-dvi target.

gcc/d/ChangeLog:

	PR other/102663
	* Make-lang.in: Allow dvi-formatted documentation
	to be installed.

gcc/fortran/ChangeLog:

	PR other/102663
	* Make-lang.in: Allow dvi-formatted documentation
	to be installed.

gcc/lto/ChangeLog:

	PR other/102663
	* Make-lang.in: Add dummy lto.install-dvi target.

gcc/objc/ChangeLog:

	PR other/102663
	* Make-lang.in: Add dummy objc.install-dvi target.

gcc/objcp/ChangeLog:

	PR other/102663
	* Make-lang.in: Add dummy objc++.install-dvi target.

gnattools/ChangeLog:

	PR other/102663
	* Makefile.in: Add dummy install-dvi target.

libada/ChangeLog:

	PR other/102663
	* Makefile.in: Add dummy install-dvi target.

libcpp/ChangeLog:

	PR other/102663
	* Makefile.in: Add dummy install-dvi target.

libdecnumber/ChangeLog:

	PR other/102663
	* Makefile.in: Add dummy install-dvi target.

libiberty/ChangeLog:

	PR other/102663
	* Makefile.in: Allow dvi-formatted documentation
	to be installed.
2021-10-22 15:43:50 -07:00
GCC Administrator ce4d1f632f Daily bump. 2021-10-19 00:16:23 +00:00
Martin Liska 724e27046b Remove unused but set variables.
Reported by clang13 -Wunused-but-set-variable:

gcc/ChangeLog:

	* dbgcnt.c (dbg_cnt_process_opt): Remove unused but set variable.
	* gcov.c (get_cycles_count): Likewise.
	* lto-compress.c (lto_compression_zlib): Likewise.
	(lto_uncompression_zlib): Likewise.
	* targhooks.c (default_pch_valid_p): Likewise.

libcpp/ChangeLog:

	* charset.c (convert_oct): Remove unused but set variable.
2021-10-18 10:16:46 +02:00
GCC Administrator 57c7ec62ee Daily bump. 2021-10-07 00:16:24 +00:00
Jakub Jelinek f43eb7707c libcpp: Implement C++23 P2334R1 - #elifdef/#elifndef
This patch implements C++23 P2334R1, which is easy because Joseph has done
all the hard work for C2X already.
Unlike the C N2645 paper, the C++ P2334R1 contains one important addition
(but not in the normative text):
"While this is a new preprocessor feature and cannot be treated as a defect
report, implementations that support older versions of the standard are
encouraged to implement this feature in the older language modes as well
as C++23."
so there are different ways to implement it.
One is ignoring that sentence and only implementing it
for -std=c++23/-std=gnu++23 like it is only implemented for -std=c2x.
Another option would be to implement it also in the older GNU modes but
not in the C/CXX modes (but it would be strange if we did that just for
C++ and not for C).
Yet another option is to enable it unconditionally.
And yet another option would be to enable it unconditionally but emit
a warning (or pedwarn) when it is seen.
Note, when it is enabled for the older language modes, as Joseph wrote
in the c11-elifdef-1.c testcase, it can result e.g. in rejecting previously
valid code:
 #define A
 #undef B
 #if 0
 #elifdef A
 #error "#elifdef A applied"
 #endif
 #if 0
 #elifndef B
 #error "#elifndef B applied"
 #endif
Note, it seems clang chose to enable it unconditionally in all standard
versions of both C and C++, with no warnings whatsoever, so it
essentially treated it as a DR that changed the behavior of e.g. the above code.
After feedback, this option enables #elifdef/#elifndef for -std=c2x
and -std=c++2{b,3} and enables it also for -std=gnu*, but for GNU modes
older than C2X or C++23 if -pedantic it emits a pedwarn on the directives
that either would be rejected in the corresponding -std=c* modes, e.g.
  #if 1
  #elifdef A // pedwarn if -pedantic
  #endif
or when the directives would be silently accepted, but when they are
recognized it changes behavior, so e.g.
  #define A
  #if 0
  #elifdef A // pedwarn if -pedantic
  #define M 1
  #endif
It won't pedwarn if the directives would be silently ignored and wouldn't
change anything, like:
  #define A
  #if 0
  #elifndef A
  #define M 1
  #endif
or
  #undef B
  #if 0
  #elifdef B
  #define M 1
  #endif

2021-10-06  Jakub Jelinek  <jakub@redhat.com>

libcpp/
	* init.c (lang_defaults): Implement P2334R1, enable elifdef for
	-std=c++23 and -std=gnu++23.
	* directives.c (_cpp_handle_directive): Support elifdef/elifndef if
	either CPP_OPTION (pfile, elifdef) or !CPP_OPTION (pfile, std).
	(do_elif): For older non-std modes if pedantic pedwarn about
	#elifdef/#elifndef directives that change behavior.
gcc/testsuite/
	* gcc.dg/cpp/gnu11-elifdef-1.c: New test.
	* gcc.dg/cpp/gnu11-elifdef-2.c: New test.
	* gcc.dg/cpp/gnu11-elifdef-3.c: New test.
	* gcc.dg/cpp/gnu11-elifdef-4.c: New test.
	* g++.dg/cpp/elifdef-1.C: New test.
	* g++.dg/cpp/elifdef-2.C: New test.
	* g++.dg/cpp/elifdef-3.C: New test.
	* g++.dg/cpp/elifdef-4.C: New test.
	* g++.dg/cpp/elifdef-5.C: New test.
	* g++.dg/cpp/elifdef-6.C: New test.
	* g++.dg/cpp/elifdef-7.C: New test.
2021-10-06 10:13:51 +02:00
GCC Administrator e11c6046f9 Daily bump. 2021-09-02 00:16:59 +00:00
Jakub Jelinek c4d6dcacfc libcpp: Implement C++23 P1949R7 - C++ Identifier Syntax using Unicode Standard Annex 31
The following patch implements the
P1949R7 - C++ Identifier Syntax using Unicode Standard Annex 31
paper.  We already allow UTF-8 characters in the source, so that part
is already implemented, so IMHO all we need to do is pedwarn instead of
just warn for the (default) -Wnormalize=nfc (or for -Wnormalize={id,nfkc})
if the character is not in NFC and to use the Unicode XID_Start and
XID_Continue derived code properties to find out what characters are allowed
(the standard actually adds U+005F to XID_Start, but we are handling the
ASCII compatible characters differently already and they aren't allowed
in UCNs in identifiers).  Instead of hardcoding the large tables
in ucnid.tab, this patch makes makeucnid.c read them from the Unicode
tables (13.0.0 version at this point).

For non-pedantic mode, we accept as 2nd+ char in identifiers a union
of valid characters in all supported modes, but for the 1st char it
was actually pedantically requiring that it is not any of the characters
that may not appear in the currently chosen standard as the first character.
This patch changes it such that also what is allowed at the start of an
identifier is a union of characters valid at the start of an identifier
in any of the pedantic modes.

2021-09-01  Jakub Jelinek  <jakub@redhat.com>

	PR c++/100977
libcpp/
	* include/cpplib.h (struct cpp_options): Add cxx23_identifiers.
	* charset.c (CXX23, NXX23): New enumerators.
	(CID, NFC, NKC, CTX): Renumber.
	(ucn_valid_in_identifier): Implement P1949R7 - use CXX23 and
	NXX23 flags for cxx23_identifiers.  For start character in
	non-pedantic mode, allow characters that are allowed as start
	characters in any of the supported language modes, rather than
	disallowing characters allowed only as non-start characters in
	current mode but for characters from other language modes allowing
	them even if they are never allowed at start.
	* init.c (struct lang_flags): Add cxx23_identifiers.
	(lang_defaults): Add cxx23_identifiers column.
	(cpp_set_lang): Initialize CPP_OPTION (pfile, cxx23_identifiers).
	* lex.c (warn_about_normalization): If cxx23_identifiers, use
	cpp_pedwarning_with_line instead of cpp_warning_with_line for
	"is not in NFC" diagnostics.
	* makeucnid.c: Adjust usage comment.
	(CXX23, NXX23): New enumerators.
	(all_languages): Add CXX23.
	(not_NFC, not_NFKC, maybe_not_NFC): Renumber.
	(read_derivedcore): New function.
	(write_table): Print also CXX23 and NXX23 columns.
	(main): Require 5 arguments instead of 4, call read_derivedcore.
	* ucnid.h: Regenerated using Unicode 13.0.0 files.
gcc/testsuite/
	* g++.dg/cpp23/normalize1.C: New test.
	* g++.dg/cpp23/normalize2.C: New test.
	* g++.dg/cpp23/normalize3.C: New test.
	* g++.dg/cpp23/normalize4.C: New test.
	* g++.dg/cpp23/normalize5.C: New test.
	* g++.dg/cpp23/normalize6.C: New test.
	* g++.dg/cpp23/normalize7.C: New test.
	* g++.dg/cpp23/ucnid-1-utf8.C: New test.
	* g++.dg/cpp23/ucnid-2-utf8.C: New test.
	* gcc.dg/cpp/ucnid-4.c: Don't expect
	"not valid at the start of an identifier" errors.
	* gcc.dg/cpp/ucnid-4-utf8.c: Likewise.
	* gcc.dg/cpp/ucnid-5-utf8.c: New test.
2021-09-01 22:33:06 +02:00
Jason Merrill ac6e77aacf libcpp: __VA_OPT__ tweak
> We want to remove the latter <placemarker> but not the former one, and
> the patch adds the vaopt_padding_tokens counter for it to control
> how many placemarkers are removed on vaopt_state::END.
> As can be seen in #c1 and #c2 of the PR, I've tried various approaches,
> but neither worked out for all the cases except the posted one.

I notice that the second placemarker you mention is avoid_paste, which seems
relevant.  This seems to also work, at least it doesn't seem to break any of
the va_opt tests.

2021-09-01  Jason Merrill  <jason@redhat.com>

	* macro.c (replace_args): When __VA_OPT__ is on the LHS of ##,
	remove trailing avoid_paste tokens.
2021-09-01 21:33:30 +02:00
Jakub Jelinek e928cf47f3 libcpp: __VA_OPT__ p1042r1 placemarker changes [PR101488]
So, besides the missing #__VA_OPT__ support, for which I posted a patch last week,
P1042R1 introduced some placemarker changes for __VA_OPT__, most notably
the addition of before "removal of placemarker tokens," rescanning ...
and the
 #define H4(X, ...) __VA_OPT__(a X ## X) ## b
H4(, 1)  // replaced by a b
example mentioned there where we replace it currently with ab

The following patch contains the minimum changes (except for the
__builtin_expect) that achieve the same preprocessing between current
clang++ and patched gcc on all the testcases I've tried (i.e. gcc __VA_OPT__
testsuite in c-c++-common/cpp/va-opt* including the new test and the clang
clang/test/Preprocessor/macro_va_opt* testcases).

At one point I was trying to implement the __VA_OPT__(args) case as if
for non-empty __VA_ARGS__ it expanded as if __VA_OPT__( and ) were missing,
but from the tests it seems that is not how it should work, in particular
if after (or before) we have some macro argument and it is not followed
(or preceded) by ##, then it should be macro expanded even when __VA_OPT__
is after ## or ) is followed by ##.  And it seems that not removing any
padding tokens isn't possible either, because the expansion of the arguments
typically has a padding token at the start and end and those at least
according to the testsuite need to go.  It is unclear if it would be enough
to remove just one or if all padding tokens should be removed.
Anyway, e.g. the previous removal of all padding tokens at the end of
__VA_OPT__ is undesirable, as it e.g. eats also the padding tokens needed
for the H4 example from the paper.

2021-09-01  Jakub Jelinek  <jakub@redhat.com>

	PR preprocessor/101488
	* macro.c (replace_args): Fix up handling of CPP_PADDING tokens at the
	start or end of __VA_OPT__ arguments when preceded or followed by ##.

	* c-c++-common/cpp/va-opt-3.c: Adjust expected output.
	* c-c++-common/cpp/va-opt-7.c: New test.
2021-09-01 21:31:25 +02:00
GCC Administrator 6d51ee4321 Daily bump. 2021-09-01 00:16:58 +00:00
Martin Sebor e4d2305adf Disable gcc_rich_location copying and assignment.
gcc/cp/ChangeLog:

	* parser.c (cp_parser_selection_statement): Use direct initialization
	instead of copy.

gcc/ChangeLog:

	* gcc-rich-location.h (gcc_rich_location): Make ctor explicit.

libcpp/ChangeLog:

	* include/line-map.h (class rich_location): Disable copying and
	assignment.
2021-08-31 11:15:21 -06:00
GCC Administrator 85d77ac474 Daily bump. 2021-08-26 00:17:03 +00:00
Lewis Hyatt 3ac6b5cff1 diagnostics: Support for -finput-charset [PR93067]
Adds the logic to handle -finput-charset in layout_get_source_line(), so that
source lines are converted from their input encodings prior to being output by
diagnostics machinery. Also adds the ability to strip a UTF-8 BOM similarly.
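
For illustration, stripping a UTF-8 BOM amounts to skipping the three
bytes EF BB BF when they start the buffer.  The sketch below is not the
libcpp code: sketch_utf8_bom_length is a hypothetical name, and the
signature of the real cpp_check_utf8_bom is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return how many leading bytes of DATA (LEN bytes
   long) form a UTF-8 byte order mark, i.e. 3 if the buffer starts with
   EF BB BF and 0 otherwise.  */
size_t
sketch_utf8_bom_length (const char *data, size_t len)
{
  static const unsigned char bom[3] = { 0xEF, 0xBB, 0xBF };
  assert (data != NULL || len == 0);
  /* The length check comes first, so DATA is never read when it is too
     short to hold a BOM.  */
  if (len >= sizeof bom && memcmp (data, bom, sizeof bom) == 0)
    return sizeof bom;
  return 0;
}
```

A caller would advance its read pointer by the returned count before
handing the line to the diagnostics printer.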

gcc/c-family/ChangeLog:

	PR other/93067
	* c-opts.c (c_common_input_charset_cb): New function.
	(c_common_post_options): Call new function
	diagnostic_initialize_input_context().

gcc/d/ChangeLog:

	PR other/93067
	* d-lang.cc (d_input_charset_callback): New function.
	(d_init): Call new function
	diagnostic_initialize_input_context().

gcc/fortran/ChangeLog:

	PR other/93067
	* cpp.c (gfc_cpp_post_options): Call new function
	diagnostic_initialize_input_context().

gcc/ChangeLog:

	PR other/93067
	* coretypes.h (typedef diagnostic_input_charset_callback): Declare.
	* diagnostic.c (diagnostic_initialize_input_context): New function.
	* diagnostic.h (diagnostic_initialize_input_context): Declare.
	* input.c (default_charset_callback): New function.
	(file_cache::initialize_input_context): New function.
	(file_cache_slot::create): Added ability to convert the input
	according to the input context.
	(file_cache::file_cache): Initialize the new input context.
	(class file_cache_slot): Added new m_alloc_offset member.
	(file_cache_slot::file_cache_slot): Initialize the new member.
	(file_cache_slot::~file_cache_slot): Handle potentially offset buffer.
	(file_cache_slot::maybe_grow): Likewise.
	(file_cache_slot::needs_read_p): Handle NULL fp, which is now possible.
	(file_cache_slot::get_next_line): Likewise.
	* input.h (class file_cache): Added input context member.

libcpp/ChangeLog:

	PR other/93067
	* charset.c (init_iconv_desc): Adapt to permit PFILE argument to
	be NULL.
	(_cpp_convert_input): Likewise. Also move UTF-8 BOM logic to...
	(cpp_check_utf8_bom): ...here.  New function.
	(cpp_input_conversion_is_trivial): New function.
	* files.c (read_file_guts): Allow PFILE argument to be NULL.  Add
	INPUT_CHARSET argument as an alternate source of this information.
	(read_file): Pass the new argument to read_file_guts.
	(cpp_get_converted_source): New function.
	* include/cpplib.h (struct cpp_converted_source): Declare.
	(cpp_get_converted_source): Declare.
	(cpp_input_conversion_is_trivial): Declare.
	(cpp_check_utf8_bom): Declare.

gcc/testsuite/ChangeLog:

	PR other/93067
	* gcc.dg/diagnostic-input-charset-1.c: New test.
	* gcc.dg/diagnostic-input-utf8-bom.c: New test.
2021-08-25 11:15:28 -04:00
GCC Administrator 2d14d64bf2 Daily bump. 2021-08-18 00:16:48 +00:00
Jakub Jelinek d565999792 c++: Add C++20 #__VA_OPT__ support
The following patch implements C++20 # __VA_OPT__ (...) support.
Testcases cover what I came up with myself and what LLVM has for #__VA_OPT__
in its testsuite and the string literals are identical between the two
compilers on the va-opt-5.c testcase.

2021-08-17  Jakub Jelinek  <jakub@redhat.com>

libcpp/
	* macro.c (vaopt_state): Add m_stringify member.
	(vaopt_state::vaopt_state): Initialize it.
	(vaopt_state::update): Overwrite it.
	(vaopt_state::stringify): New method.
	(stringify_arg): Replace arg argument with first, count arguments
	and add va_opt argument.  Use first instead of arg->first and
	count instead of arg->count, for va_opt add paste_tokens handling.
	(paste_tokens): Fix up len calculation.  Don't spell rhs twice,
	instead use %.*s to supply lhs and rhs spelling lengths.  Don't call
	_cpp_backup_tokens here.
	(paste_all_tokens): Call it here instead.
	(replace_args): Adjust stringify_arg caller.  For vaopt_state::END
	if stringify is true handle __VA_OPT__ stringification.
	(create_iso_definition): Handle # __VA_OPT__ similarly to # macro_arg.
gcc/testsuite/
	* c-c++-common/cpp/va-opt-5.c: New test.
	* c-c++-common/cpp/va-opt-6.c: New test.
2021-08-17 09:27:57 +02:00
GCC Administrator 9d1d9fc8b4 Daily bump. 2021-08-17 00:16:32 +00:00
Joseph Myers 58608f64a7 Update cpplib de.po
* de.po: Update.
2021-08-16 19:16:23 +00:00
GCC Administrator 72be20e202 Daily bump. 2021-08-13 00:16:43 +00:00
Jakub Jelinek 408d88af60 libcpp: Fix ICE with -Wtraditional preprocessing [PR101638]
The following testcase ICEs in cpp_sys_macro_p, because cpp_sys_macro_p
is called for a builtin macro which doesn't use node->value.macro union
member but a different one and so dereferencing it ICEs.
As the testcase is distilled from contemporary glibc headers, it means
basically -Wtraditional now ICEs on almost everything.

The fix can be either the patch below, return true for builtin macros,
or we could instead return false for builtin macros, or the fix could
be also (untested):
--- libcpp/expr.c       2021-05-07 10:34:46.345122608 +0200
+++ libcpp/expr.c       2021-08-12 09:54:01.837556365 +0200
@@ -783,13 +783,13 @@ cpp_classify_number (cpp_reader *pfile,

       /* Traditional C only accepted the 'L' suffix.
          Suppress warning about 'LL' with -Wno-long-long.  */
-      if (CPP_WTRADITIONAL (pfile) && ! cpp_sys_macro_p (pfile))
+      if (CPP_WTRADITIONAL (pfile))
        {
          int u_or_i = (result & (CPP_N_UNSIGNED|CPP_N_IMAGINARY));
          int large = (result & CPP_N_WIDTH) == CPP_N_LARGE
                       && CPP_OPTION (pfile, cpp_warn_long_long);

-         if (u_or_i || large)
+         if ((u_or_i || large) && ! cpp_sys_macro_p (pfile))
            cpp_warning_with_line (pfile, large ? CPP_W_LONG_LONG : CPP_W_TRADITIONAL,
                                   virtual_location, 0,
                                   "traditional C rejects the \"%.*s\" suffix",
The builtin macros at least currently don't add any suffixes
or numbers -Wtraditional would like to warn about.  For floating
point suffixes, -Wtraditional calls cpp_sys_macro_p only right
before emitting the warning, but in the above case the ICE
is because cpp_sys_macro_p is called even if the number doesn't
have any suffixes (that is I think always for builtin macros
right now).
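
The underlying hazard is reading the wrong member of a tagged union.
A minimal sketch of the guard follows; all type and function names are
hypothetical stand-ins and do not match libcpp's cpp_hashnode or
cpp_macro layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative types only; libcpp's real structures differ.  */
enum sketch_node_type { SKETCH_USER_MACRO, SKETCH_BUILTIN_MACRO };

struct sketch_macro { bool from_system_header; };

struct sketch_node
{
  enum sketch_node_type type;
  union
  {
    struct sketch_macro *macro;   /* valid only for SKETCH_USER_MACRO */
    int builtin_code;             /* valid only for SKETCH_BUILTIN_MACRO */
  } value;
};

/* Return true if NODE should be treated as a system macro.  The tag is
   checked before the union is touched, so a builtin node never has its
   builtin_code misread as a pointer.  */
bool
sketch_sys_macro_p (const struct sketch_node *node)
{
  if (node == NULL)
    return false;
  if (node->type == SKETCH_BUILTIN_MACRO)
    return true;
  assert (node->value.macro != NULL);
  return node->value.macro->from_system_header;
}
```

Returning true for the builtin case mirrors the choice described in the
ChangeLog entry below; returning false would be equally safe with
respect to the crash.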

2021-08-12  Jakub Jelinek  <jakub@redhat.com>

	PR preprocessor/101638
	* macro.c (cpp_sys_macro_p): Return true instead of
	crashing on builtin macros.

	* gcc.dg/cpp/pr101638.c: New test.
2021-08-12 22:40:11 +02:00
GCC Administrator 8ebf4fb54a Daily bump. 2021-08-06 00:16:29 +00:00
Jakub Jelinek 4739344d36 libcpp: Regenerate ucnid.h using Unicode 13.0.0 files [PR100977]
The following patch (incremental to the makeucnid.c fix) regenerates
ucnid.h with https://www.unicode.org/Public/13.0.0/ucd/ files.

2021-08-05  Jakub Jelinek  <jakub@redhat.com>

	PR c++/100977
	* ucnid.h: Regenerated using Unicode 13.0.0 files.
2021-08-05 17:35:20 +02:00
Jakub Jelinek 4805b92a32 libcpp: Fix makeucnid bug with combining values [PR100977]
I've noticed in ucnid.h two adjacent lines that had all flags and combine
values identical and as such were supposed to be merged.

This is due to a bug in makeucnid.c, which records last_flag,
last_combine and really_safe of what has just been printed, but
because of a typo mishandles it for last_combine, always comparing against
combining_value[0], which is 0.

This has two effects on the table, one is that often the table is
unnecessarily large, as for non-zero .combine every character has its own
record instead of adjacent characters with the same flags and combine
being merged.  This means larger tables.
The other is that sometimes the last char that has combine set doesn't
actually have it in the tables, because the code is printing entries only
upon seeing the next character and if that character does have
combining_value of 0 and flags are otherwise the same as previously printed,
it will not print anything.
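
The intended merging can be sketched as follows.  This is a simplified
model and not makeucnid.c itself: the record type and function name are
hypothetical, and it only counts rows instead of printing them.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative record; makeucnid.c keeps flags and combining values in
   separate arrays.  */
struct sketch_char_info { unsigned flags; unsigned combine; };

/* Count the table rows needed when adjacent code points with identical
   (flags, combine) pairs share a single row.  */
size_t
sketch_count_rows (const struct sketch_char_info *info, size_t n)
{
  assert (info != NULL || n == 0);
  if (n == 0)
    return 0;

  size_t rows = 1;
  unsigned last_flags = info[0].flags;
  unsigned last_combine = info[0].combine;
  for (size_t i = 1; i < n; i++)
    {
      if (info[i].flags != last_flags || info[i].combine != last_combine)
        {
          rows++;
          last_flags = info[i].flags;
          /* Remember the combine value of the row just started, not a
             fixed element, so later comparisons see the right value.  */
          last_combine = info[i].combine;
        }
    }
  return rows;
}
```

With combine values 0, 230, 230, 230, 0 and equal flags, this gives
three rows: the run of 230s collapses into one and the final 0 still
starts its own row.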

The following patch fixes that.  To make clear exactly what it affects,
I've regenerated ucnid.h with the same Unicode files that were used the
last time it was regenerated.

2021-08-05  Jakub Jelinek  <jakub@redhat.com>

	PR c++/100977
	* makeucnid.c (write_table): Fix computation of last_combine.
	* ucnid.h: Regenerated using Unicode 6.3.0 files.
2021-08-05 17:34:16 +02:00
GCC Administrator 1a7febe943 Daily bump. 2021-07-27 00:16:27 +00:00