For deferred macros we also need a new field on the macro itself, so
that the module machinery can determine the macro was imported. Also
the documentation for the hashnode's deferred field was incomplete.
libcpp/
* include/cpplib.h (struct cpp_macro): Add imported_p field.
(struct cpp_hashnode): Tweak deferred field documentation.
* macro.c (_cpp_new_macro): Clear new field.
(cpp_get_deferred_macro, get_deferred_or_lazy_macro): Assert
more.
The preprocessor check for overflow (of linenum_type = unsigned int)
when reading the line number in a #line directive is incomplete; it
checks "reg < reg_prev" which doesn't cover all cases where
multiplying by 10 overflowed. Fix this by checking for overflow
before rather than after it occurs (using essentially the same logic
as used by e.g. glibc printf when reading width and precision values
from strings).
Bootstrapped with no regressions for x86_64-pc-linux-gnu.
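To illustrate why checking "reg < reg_prev" after the multiplication is
incomplete, here is a minimal sketch. It is not the libcpp code: the
sk_* names are hypothetical, and linenum_type is assumed to be a 32-bit
unsigned type (uint32_t here). The first function checks after the
fact, the second checks before the multiply and before the add.

```c
#include <stdbool.h>
#include <stdint.h>

/* Check AFTER the arithmetic: only notices a wrap when the new value
   happens to be smaller than the previous one.  */
bool sk_parse_check_after(const char *s, uint32_t *out)
{
    uint32_t reg = 0;
    for (; *s >= '0' && *s <= '9'; s++) {
        uint32_t prev = reg;
        reg = reg * 10u + (uint32_t)(*s - '0');
        if (reg < prev)
            return false;               /* overflow noticed */
    }
    *out = reg;
    return true;
}

/* Check BEFORE the arithmetic: refuse the multiply or the add whenever
   the result would not fit, so nothing ever wraps.  */
bool sk_parse_check_before(const char *s, uint32_t *out)
{
    uint32_t reg = 0;
    for (; *s >= '0' && *s <= '9'; s++) {
        uint32_t digit = (uint32_t)(*s - '0');
        if (reg > UINT32_MAX / 10u)
            return false;               /* reg * 10 would overflow */
        reg *= 10u;
        if (reg > UINT32_MAX - digit)
            return false;               /* reg + digit would overflow */
        reg += digit;
    }
    *out = reg;
    return true;
}
```

With a 32-bit type, "5000000000" slips through the first function:
500000000 * 10 wraps to 705032704, which is larger than 500000000, so
"reg < reg_prev" never fires. The second function rejects it.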
libcpp/
2020-11-27 Joseph Myers <joseph@codesourcery.com>
PR preprocessor/97602
* directives.c (strtolinenum): Check for overflow before it
occurs. Correct comment.
gcc/testsuite/
2020-11-27 Joseph Myers <joseph@codesourcery.com>
PR preprocessor/97602
* gcc.dg/cpp/line9.c, gcc.dg/cpp/line10.c: New tests.
Deferred macros are needed for C++ modules. Header units may export
macro definitions and undefinitions. These are resolved lazily at the
point of (potential) use. (The language specifies that, it's not just
a useful optimization.) Thus, identifier nodes grow a 'deferred'
field, which fortunately doesn't expand the structure on 64-bit
systems as there was padding there. This is non-zero on NT_MACRO
nodes, if the macro is deferred. When such an identifier is lexed, it
is resolved via a callback that I added recently. That will either
provide the macro definition, or discover that there was an overriding
undef. Either way the identifier is no longer a deferred macro.
Notice it is now possible for NT_MACRO nodes to have a NULL macro
expansion.
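The lazy resolution described above can be sketched roughly as
follows. This is a simplified model, not the cpplib data structures:
every sk_* type, field and function is hypothetical, and the resolver
callback stands in for the hook the module machinery would provide.

```c
#include <stddef.h>

enum sk_node_type { SK_NT_VOID, SK_NT_MACRO };

struct sk_macro { const char *body; };

struct sk_node {
    const char *name;
    enum sk_node_type type;
    unsigned deferred;          /* non-zero while the macro is unresolved */
    struct sk_macro *macro;     /* may be NULL while deferred */
};

/* Callback that either supplies the definition or returns NULL when an
   overriding undef was found.  */
typedef struct sk_macro *(*sk_resolver)(struct sk_node *);

int sk_resolve_calls;           /* counts resolver invocations */
static struct sk_macro sk_imported = { "42" };

struct sk_macro *sk_resolve_defined(struct sk_node *n)
{
    (void)n;
    sk_resolve_calls++;
    return &sk_imported;
}

struct sk_macro *sk_resolve_undef(struct sk_node *n)
{
    (void)n;
    sk_resolve_calls++;
    return NULL;
}

/* Resolve a deferred macro at the point of use.  Either way the node
   is no longer deferred afterwards.  */
struct sk_macro *sk_get_macro(struct sk_node *n, sk_resolver resolve)
{
    if (n->type != SK_NT_MACRO)
        return NULL;
    if (n->deferred) {
        n->macro = resolve(n);
        n->deferred = 0;
        if (n->macro == NULL)
            n->type = SK_NT_VOID;   /* the undef won */
    }
    return n->macro;
}
```

The second lookup of the same identifier does not call the resolver
again, which is the point of clearing the deferred field.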
libcpp/
* include/cpplib.h (struct cpp_hashnode): Add deferred field.
(cpp_set_deferred_macro): Define.
(cpp_get_deferred_macro): Declare.
(cpp_macro_definition): Reformat, add overload.
(cpp_macro_definition_location): Deal with deferred macro.
(cpp_alloc_token_string, cpp_compare_macro): Declare.
* internal.h (_cpp_notify_macro_use): Return bool.
(_cpp_maybe_notify_macro_use): Likewise.
* directives.c (do_undef): Check macro is not undef before
warning.
(do_ifdef, do_ifndef): Deal with deferred macro.
* expr.c (parse_defined): Likewise.
* lex.c (cpp_allocate_token_string): Break out of ...
(create_literal): ... here. Call it.
(cpp_maybe_module_directive): Deal with deferred macro.
* macro.c (cpp_get_token_1): Deal with deferred macro.
(warn_of_redefinition): Deal with deferred macro.
(compare_macros): Rename to ...
(cpp_compare_macro): ... here. Make extern.
(cpp_get_deferred_macro): New.
(_cpp_notify_macro_use): Deal with deferred macro, return bool
indicating definedness.
(cpp_macro_definition): Deal with deferred macro.
This adds the capability to locate the main file on the user or system
include paths. That's extremely useful to users building header
units. Searching has to be requested (plain header-unit compilation
will not search). Also, to make include_next work as expected when
building a header unit, we add a mechanism to retrofit a non-searched
source file as one on the include path.
libcpp/
* include/cpplib.h (enum cpp_main_search): New.
(struct cpp_options): Add main_search field.
(cpp_main_loc): Declare.
(cpp_retrofit_as_include): Declare.
* internal.h (struct cpp_reader): Add main_loc field.
(_cpp_in_main_source_file): Not main if main is a header.
* init.c (cpp_read_main_file): Use main_search option to locate
main file. Set main_loc.
* files.c (cpp_retrofit_as_include): New.
In preparing module patch 7 I realized there was a cleanup I could
make to simplify it. This is that cleanup. Also, when doing the
cleanup I noticed some macros had been turned into inline functions,
but not renamed to the preprocessor's internal namespace
(_cpp_$INTERNAL rather than cpp_$USER). Thus, this renames those
functions, deletes an internal field of the file structure, and
determines whether we're in the main file by comparing to
pfile->main_file, the _cpp_file of the main file.
libcpp/
* internal.h (cpp_in_system_header): Rename to ...
(_cpp_in_system_header): ... here.
(cpp_in_primary_file): Rename to ...
(_cpp_in_main_source_file): ... here. Compare main_file equality
and check main_search value.
* lex.c (maybe_va_opt_error, _cpp_lex_direct): Adjust for rename.
* macro.c (_cpp_builtin_macro_text): Likewise.
(replace_args): Likewise.
* directives.c (do_include_next): Likewise.
(do_pragma_once, do_pragma_system_header): Likewise.
* files.c (struct _cpp_file): Delete main_file field.
(pch_open): Check pfile->main_file equality.
(make_cpp_file): Drop cpp_reader parm, don't set main_file.
(_cpp_find_file): Adjust.
(_cpp_stack_file): Check pfile->main_file equality.
(struct report_missing_guard_data): Add cpp_reader field.
(report_missing_guard): Check pfile->main_file equality.
(_cpp_report_missing_guards): Adjust.
C++20 modules introduce a new kind of preprocessor directive -- a
module directive. These are directives but without the leading '#'.
We have to detect them by sniffing the start of a logical line. When
detected we replace the initial identifiers with unspellable tokens
and pass them through to the language parser the same way deferred
pragmas are. There's a PRAGMA_EOL at the logical end of line too.
One additional complication is that we have to do header-name lexing
after the initial tokens, and that requires changes in the macro-aware
piece of the preprocessor. The above sniffer sets a counter in the
lexer state, and that triggers at the appropriate point. We then do
the same header-name lexing that occurs on a #include directive or
has_include pseudo-macro. Except that the header name ends up in the
token stream.
A couple of token emitters need to deal with the new token possibility.
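As a rough illustration of sniffing the start of a logical line, the
sketch below works on a plain string rather than on cpplib tokens. The
sk_* helpers are hypothetical and the acceptance rules are a deliberate
simplification (an optional 'export', then 'module' or 'import', then a
plausible following character); the real detection peeks at lexed
tokens and is more careful.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

static const char *sk_skip_ws(const char *p)
{
    while (*p == ' ' || *p == '\t')
        p++;
    return p;
}

static bool sk_ident_char(char c)
{
    return isalnum((unsigned char)c) || c == '_';
}

/* If the identifier at P is exactly WORD, return a pointer just past
   it, otherwise NULL.  */
static const char *sk_match_ident(const char *p, const char *word)
{
    size_t n = strlen(word);
    if (strncmp(p, word, n) != 0 || sk_ident_char(p[n]))
        return NULL;
    return p + n;
}

/* Simplified test for a module-directive line.  */
bool sk_looks_like_module_directive(const char *line)
{
    const char *p = sk_skip_ws(line);
    const char *q = sk_match_ident(p, "export");
    if (q != NULL)
        p = sk_skip_ws(q);

    if ((q = sk_match_ident(p, "module")) != NULL) {
        q = sk_skip_ws(q);
        return *q == ';' || *q == ':'
               || isalpha((unsigned char)*q) || *q == '_';
    }
    if ((q = sk_match_ident(p, "import")) != NULL) {
        q = sk_skip_ws(q);
        /* '<' or '"' here is where header-name lexing would kick in. */
        return *q == '<' || *q == '"' || *q == ':'
               || isalpha((unsigned char)*q) || *q == '_';
    }
    return false;
}
```

Lines such as "import <vector>;" and "export module foo;" are accepted,
while ordinary code like "int import = 0;" or "import = 3;" is not.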
gcc/c-family/
* c-lex.c (c_lex_with_flags): CPP_HEADER_NAMEs can now be seen.
libcpp/
* include/cpplib.h (struct cpp_options): Add module_directives
option.
(NODE_MODULE): New node flag.
(struct cpp_hashnode): Make rid-code a bitfield, increase bits in
flags and swap with type field.
* init.c (post_options): Create module-directive identifier nodes.
* internal.h (struct lexer_state): Add directive_file_token &
n_modules fields. Add module node enumerator.
* lex.c (cpp_maybe_module_directive): New.
(_cpp_lex_token): Call it.
(cpp_output_token): Add '"' around CPP_HEADER_NAME token.
(do_peek_ident, do_peek_module): New.
(cpp_directives_only): Detect module-directive lines.
* macro.c (cpp_get_token_1): Deal with directive_file_token
triggering.
This is slightly different to the original patch I posted. This adds
separate module target and dependency functions (rather than a single
bi-modal function).
libcpp/
* include/cpplib.h (struct cpp_options): Add modules to
dep-options.
* include/mkdeps.h (deps_add_module_target): Declare.
(deps_add_module_dep): Declare.
* mkdeps.c (class mkdeps): Add modules, module_name, cmi_name,
is_header_unit fields. Adjust cdtors.
(deps_add_module_target, deps_add_module_dep): New.
(make_write): Write module dependencies, if enabled.
These two callbacks are needed for C++ modules. The first is for
handling macros from header-units. These are resolved lazily. The
second is for include-translation -- whether a #include gets turned
into a header-unit import.
libcpp/
* include/cpplib.h (struct cpp_callbacks): Add
user_deferred_macro & translate_include.
This patch adds LC_MODULE as a map kind, used to indicate a C++
module. Unlike a regular source file, it only contains a single
location, and the source locations in that module are represented by
ordinary locations whose 'included_from' location is the module.
It also exposes some entry points that modules will use to create
blocks of line maps.
In the original posting, I'd missed the deletion of the
linemap_enter_macro from internal.h. That's included here.
libcpp/
* include/line-map.h (enum lc_reason): Add LC_MODULE.
(MAP_MODULE_P): New.
(line_map_new_raw): Declare.
(linemap_enter_macro): Move declaration from internal.h.
(linemap_module_loc, linemap_module_reparent)
(linemap_module_restore): Declare.
(linemap_lookup_macro_index): Declare.
* internal.h (linemap_enter_macro): Moved to line-map.h.
* line-map.c (line_map_new_raw): New, broken out of ...
(new_linemap): ... here. Call it.
(LAST_SOURCE_LINE_LOCATION): New.
(linemap_module_loc, linemap_module_reparent)
(linemap_module_restore): New.
(linemap_lookup_macro_index): New, broken out of ...
(linemap_macro_map_lookup): ... here. Call it.
(linemap_dump): Add module dump.
As Jakub points out, we only ever pass a single variadic parm (if at
all), so just an optional arg is fine.
PR preprocessor/97858
libcpp/
* mkdeps.c (munge): Drop variadic args, we only ever use one.
C2x adds binary integer constants (approved at the last WG14 meeting,
though not yet added to the working draft in git). Configure libcpp
to consider these a standard feature in C2x mode, with appropriate
updates to diagnostics including support for diagnosing them with
-std=c2x -Wc11-c2x-compat.
Bootstrapped with no regressions for x86_64-pc-linux-gnu.
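For reference, a small sketch of binary integer constants in both
preprocessor and ordinary expressions. It assumes a compiler that
accepts them (as a C2x feature or as an extension in earlier modes);
the SK_/sk_ names are made up for the example.

```c
/* The #if expression is evaluated by the preprocessor, so this
   exercises number classification there as well.  */
#if 0b101 == 5
# define SK_PP_BINARY_OK 1
#else
# define SK_PP_BINARY_OK 0
#endif

/* Keep only the low four bits of VALUE, using a binary mask.  */
unsigned sk_low_nibble(unsigned value)
{
    return value & 0b1111u;
}

int sk_pp_binary_ok(void)
{
    return SK_PP_BINARY_OK;
}
```

Whether such constants are diagnosed depends on the options in use, for
example -std=c2x -Wc11-c2x-compat as mentioned above.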
gcc/testsuite/
2020-11-13 Joseph Myers <joseph@codesourcery.com>
* gcc.dg/binary-constants-2.c, gcc.dg/binary-constants-3.c,
gcc.dg/system-binary-constants-1.c: Update expected diagnostics.
* gcc.dg/c11-binary-constants-1.c,
gcc.dg/c11-binary-constants-2.c, gcc.dg/c2x-binary-constants-1.c,
gcc.dg/c2x-binary-constants-2.c, gcc.dg/c2x-binary-constants-3.c:
New tests.
libcpp/
2020-11-13 Joseph Myers <joseph@codesourcery.com>
* expr.c (cpp_classify_number): Update diagnostic for binary
constants for C. Also diagnose binary constants for
-Wc11-c2x-compat.
* init.c (lang_defaults): Enable binary constants for GNUC2X and
STDC2X.
C2x adds the __has_c_attribute preprocessor operator, similar to C++
__has_cpp_attribute.
GCC implements __has_cpp_attribute as exactly equivalent to
__has_attribute. (The documentation says they differ regarding the
values returned for standard attributes, but that's actually only a
matter of the particular nonzero value returned not being specified in
the documentation for __has_attribute; the implementation makes no
distinction between the two.)
I don't think having them exactly equivalent is actually correct,
either for __has_cpp_attribute or for __has_c_attribute.
Specifically, I think it is only correct for __has_cpp_attribute or
__has_c_attribute to return nonzero if the given attribute is
supported, with the particular pp-tokens passed to __has_cpp_attribute
or __has_c_attribute, with [[]] syntax, not if it's only accepted in
__attribute__ or with gnu:: added in [[]]. For example, they should
return nonzero for gnu::packed, but zero for plain packed, because
[[gnu::packed]] is accepted but [[packed]] is ignored as not a
standard attribute.
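A hedged sketch of how a program might probe this. The SK_/sk_ names
are hypothetical, and the probe is only attempted when the operator
exists and the translation unit is compiled in a mode newer than C17;
otherwise both results fall back to 0.

```c
#if defined(__has_c_attribute) && defined(__STDC_VERSION__) \
    && __STDC_VERSION__ > 201710L
# if __has_c_attribute(gnu::packed)
#  define SK_HAS_GNU_PACKED 1     /* [[gnu::packed]] is accepted */
# else
#  define SK_HAS_GNU_PACKED 0
# endif
# if __has_c_attribute(packed)
#  define SK_HAS_PLAIN_PACKED 1   /* would mean [[packed]] is accepted */
# else
#  define SK_HAS_PLAIN_PACKED 0
# endif
#else
# define SK_HAS_GNU_PACKED 0      /* operator unavailable: assume no */
# define SK_HAS_PLAIN_PACKED 0
#endif

int sk_has_gnu_packed(void)
{
    return SK_HAS_GNU_PACKED;
}

int sk_has_plain_packed(void)
{
    return SK_HAS_PLAIN_PACKED;
}
```

Under the semantics argued for above, the scoped query can be nonzero
while the unscoped one stays zero.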
This patch implements that for __has_c_attribute, leaving any changes
to __has_cpp_attribute for the C++ maintainers. A new
BT_HAS_STD_ATTRIBUTE is added for __has_c_attribute (which I think,
based on the above, would actually be correct to use for
__has_cpp_attribute as well). The code in c_common_has_attribute that
deals with scopes has its C++ conditional removed; instead, whether
the language is C or C++ is used only to determine the numeric values
returned for standard attributes (and which standard attributes are
handled there at all). A new argument is passed to
c_common_has_attribute to distinguish BT_HAS_STD_ATTRIBUTE from
BT_HAS_ATTRIBUTE, and that argument is used to stop attributes with no
scope specified from being accepted with __has_c_attribute unless they
are one of the known standard attributes and so handled specially.
Although the standard specifies constants ending with 'L' as the values
for the standard attributes, there is no correctness issue with the
lack of code in GCC to add that 'L' to the expansion:
__has_c_attribute and __has_cpp_attribute are expanded in #if after
other macro expansion has occurred, with no semantics being specified
if they occur outside #if, so there is no way for a conforming program
to inspect the exact text of the expansion of those macros, only to
use the resulting pp-number in a #if expression, where long and int
have the same set of values.
Bootstrapped with no regressions for x86_64-pc-linux-gnu.
gcc/
2020-11-12 Joseph Myers <joseph@codesourcery.com>
* doc/cpp.texi (__has_attribute): Document when scopes are allowed
for C.
(__has_c_attribute): New.
gcc/c-family/
2020-11-12 Joseph Myers <joseph@codesourcery.com>
* c-lex.c (c_common_has_attribute): Take argument std_syntax.
Allow scope for C. Handle standard attributes for C. Do not
accept unscoped attributes if std_syntax and not handled as
standard attributes.
* c-common.h (c_common_has_attribute): Update prototype.
gcc/testsuite/
2020-11-12 Joseph Myers <joseph@codesourcery.com>
* gcc.dg/c2x-has-c-attribute-1.c, gcc.dg/c2x-has-c-attribute-2.c,
gcc.dg/c2x-has-c-attribute-3.c, gcc.dg/c2x-has-c-attribute-4.c:
New tests.
libcpp/
2020-11-12 Joseph Myers <joseph@codesourcery.com>
* include/cpplib.h (struct cpp_callbacks): Add bool argument to
has_attribute.
(enum cpp_builtin_type): Add BT_HAS_STD_ATTRIBUTE.
* init.c (builtin_array): Add __has_c_attribute.
(cpp_init_special_builtins): Handle BT_HAS_STD_ATTRIBUTE.
* macro.c (_cpp_builtin_macro_text): Handle BT_HAS_STD_ATTRIBUTE.
Update call to has_attribute for BT_HAS_ATTRIBUTE.
* traditional.c (fun_like_macro): Handle BT_HAS_STD_ATTRIBUTE.
gcc/c-family/
PR pch/86674
* c-pch.c (c_common_valid_pch): Use cpp_warning with CPP_W_INVALID_PCH
reason to fix -Werror=invalid-pch and -Wno-error=invalid-pch switches.
libcpp/
PR pch/86674
* files.c (_cpp_find_file): Use CPP_DL_NOTE not CPP_DL_ERROR in call to
cpp_error.
generated_cpp_wcwidth.h was regenerated using Unicode 13.0.0 data files. No
material changes to the parsing scripts (either GCC- or glibc-sourced) were
necessary; glibc's utf8_gen.py was tweaked slightly by glibc and matched here.
contrib/ChangeLog:
* unicode/EastAsianWidth.txt: Update to Unicode 13.0.0.
* unicode/PropList.txt: Likewise.
* unicode/README: Likewise.
* unicode/UnicodeData.txt: Likewise.
* unicode/from_glibc/unicode_utils.py: Update to latest glibc version.
* unicode/from_glibc/utf8_gen.py: Likewise.
libcpp/ChangeLog:
* generated_cpp_wcwidth.h: Regenerated from Unicode 13.0.0 data.
Joseph pointed me at cb_get_source_date_epoch, which allows repeatable
builds and solves a FIXME I had on the modules branch. Unfortunately
it's used exclusively to generate __DATE__ and __TIME__ values, which
fall back to using a time(2) call. It'd be nicer if the preprocessor
made whatever time value it determined available to the rest of the
compiler. So this patch adds a new cpp_get_date function, which
abstracts the call to the get_source_date_epoch hook, or uses time
directly. The value is cached. Thus the timestamp I end up putting
on CMI files matches __DATE__ and __TIME__ expansions. That seems
worthwhile.
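The caching idea can be sketched as below. This is not cpp_get_date
itself: the sk_* names are hypothetical, and reading the
SOURCE_DATE_EPOCH environment variable directly is an assumption
standing in for the get_source_date_epoch hook.

```c
#include <stdlib.h>
#include <time.h>

enum sk_time_kind { SK_TIME_UNKNOWN, SK_TIME_FIXED, SK_TIME_DYNAMIC };

static enum sk_time_kind sk_kind = SK_TIME_UNKNOWN;
static time_t sk_stamp;

/* Determine the timestamp once and hand back the cached value on every
   later call, so all users see the same time.  */
enum sk_time_kind sk_get_date(time_t *out)
{
    if (sk_kind == SK_TIME_UNKNOWN) {
        const char *epoch = getenv("SOURCE_DATE_EPOCH");
        char *end = NULL;
        long long value = epoch ? strtoll(epoch, &end, 10) : 0;

        if (epoch && end != epoch && *end == '\0' && value >= 0) {
            sk_stamp = (time_t)value;   /* repeatable build */
            sk_kind = SK_TIME_FIXED;
        } else {
            sk_stamp = time(NULL);      /* fall back to the clock */
            sk_kind = SK_TIME_DYNAMIC;
        }
    }
    *out = sk_stamp;
    return sk_kind;
}
```

Two callers, say one expanding __DATE__ and one stamping an output
file, would both receive the same value from such a function.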
libcpp/
* include/cpplib.h (enum class CPP_time_kind): New.
(cpp_get_date): Declare.
* internal.h (struct cpp_reader): Replace source_date_epoch with
time_stamp and time_stamp_kind.
* init.c (cpp_create_reader): Initialize them.
* macro.c (_cpp_builtin_macro_text): Use cpp_get_date.
(cpp_get_date): Broken out from _cpp_builtin_macro_text and
genericized.
This patch moves the generation of PRAGMA_EOL earlier, to when we set
need_line, rather than when we try to get the next line. It also
prevents peeking past a PRAGMA token.
libcpp/
* lex.c (cpp_peek_token): Do not peek past CPP_PRAGMA.
(_cpp_lex_direct): Handle EOF in pragma when setting need_line,
not when needing a line.
I noticed a fencepost error in the preprocessor. We should be
checking if the next char is at the limit, not the current char (which
can't be, because we're looking at it).
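A generic illustration of this kind of fencepost check, not the actual
_cpp_clean_line code: when looking for a DOS "\r\n" pair, it is the
position after the current character that has to be compared with the
limit before it is read. sk_terminator_len is a hypothetical helper,
and the buffer is assumed to have no sentinel at the limit.

```c
#include <stddef.h>

/* Return the length of the line terminator starting at S: 2 for
   "\r\n", 1 for a lone '\n' or '\r', 0 if S is not at a terminator.
   LIMIT points one past the last valid character.  */
size_t sk_terminator_len(const char *s, const char *limit)
{
    if (s == limit)
        return 0;
    if (*s == '\n')
        return 1;
    if (*s == '\r') {
        /* Check the NEXT position against the limit before reading. */
        if (s + 1 != limit && s[1] == '\n')
            return 2;
        return 1;
    }
    return 0;
}
```

A buffer that ends in a bare '\r' is the case that an off-by-one in the
limit test gets wrong.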
libcpp/
* lex.c (_cpp_clean_line): Fix DOS off-by-one error.
This patch cleans up the interface to the dependency generation a
little. We now only check the option in one place, and the
cpp_get_deps function returns nullptr if there are no dependencies. I
also reworded the -MT and -MQ help text to be make-agnostic -- as
there are ideas about emitting, say, JSON.
libcpp/
* include/mkdeps.h: Include cpplib.h.
(deps_write): Adjust first parm type.
* mkdeps.c: Include internal.h.
(make_write): Adjust first parm type. Check phony option
directly.
(deps_write): Adjust first parm type.
* init.c (cpp_read_main_file): Use get_deps.
* directives.c (cpp_get_deps): Check option before initializing.
gcc/c-family/
* c.opt (MQ,MT): Reword description to be make-agnostic.
gcc/fortran/
* cpp.c (gfc_cpp_add_dep): Only add dependency if we're recording
them.
(gfc_cpp_init): Likewise for target.
Our macro use hook passes a location, but doesn't receive it from the
using location. This patch adds the extra location_t parameter and
passes it through.
A second cleanup is breaking out the macro comparison code from the
redefinition warning. That'll turn out useful for modules.
Finally, there's a filename comparison needed for the location
optimization of rewinding from line 2 (occurs during the emission of
builtin macros).
libcpp/
* internal.h (_cpp_notify_macro_use): Add location parm.
(_cpp_maybe_notify_macro_use): Likewise.
* directives.c (_cpp_do_file_change): Check we've not changed file
when optimizing a rewind.
(do_ifdef): Pass location to _cpp_maybe_notify_macro_use.
(do_ifndef): Likewise. Delete obsolete comment about powerpc.
* expr.c (parse_defined): Pass location to
_cpp_maybe_notify_macro_use.
* macro.c (enter_macro_context): Likewise.
(warn_of_redefinition): Break out helper function. Call it.
(compare_macros): New function broken out of warn_of_redefinition.
(_cpp_new_macro): Zero all fields.
(_cpp_notify_macro_use): Add location parameter.
My previous attempt at fixing this was incorrect. The problem occurs
earlier, in that _cpp_lex_direct performs the unwinding that EOF needs
while in collect_args mode. This patch changes it not to do that, in
the same way as directive parsing works. Also, collect_args shouldn't
push_back such fake EOFs, and neither should funlike_invocation_p.
libcpp/
* lex.c (_cpp_lex_direct): Do not complete EOF processing when
parsing_args.
* macro.c (collect_args): Do not unwind fake EOF.
(funlike_invocation_p): Do not unwind fake EOF.
(cpp_context): Replace abort with gcc_assert.
gcc/testsuite/
* gcc.dg/cpp/endif.c: Move to ...
* c-c++-common/cpp/endif.c: ... here.
* gcc.dg/cpp/endif.h: Move to ...
* c-c++-common/cpp/endif.h: ... here.
* c-c++-common/cpp/eof-2.c: Adjust diagnostic.
* c-c++-common/cpp/eof-3.c: Adjust diagnostic.
We inject EOF tokens between macro argument lists, but had
confused/stale logic in the non-fn invocation. Renamed the magic
'eof' token, as it's now only used for macro argument termination.
Always rewind the non-OPEN_PAREN token.
libcpp/
* internal.h (struct cpp_reader): Rename 'eof' field to 'endarg'.
* init.c (cpp_create_reader): Adjust.
* macro.c (collect_args): Use endarg for separator. Always rewind
in the not-fn case.
gcc/testsuite/
* c-c++-common/cpp/pr97471.c: New.
Using the tokenizer to sniff for an initial line marker for
preprocessed input is a little brittle, particularly with
-fdirectives-only. If there is no marker we'll happily munch initial
comments. This patch directly sniffs the buffer. This is safe
because the initial line marker was machine generated and must be
right at the beginning of the file. Anything else is not such a line
marker. The same is true for the initial directory marker. For that
purpose, tokenizing the string is simplest, but at that point it's either a
regular line marker or a directory marker. If it's a regular marker,
unwinding tokens is fine.
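A minimal sketch of sniffing the buffer directly instead of tokenizing.
The exact prefix tested here (# 1 followed by a space and a double
quote) is an assumption about what a machine-generated initial line
marker looks like, and sk_starts_with_line_marker is a hypothetical
helper, not a libcpp function.

```c
#include <stdbool.h>
#include <string.h>

/* True only if the marker sits at the very start of the buffer.
   Leading comments or blank lines mean it is not an initial marker,
   and nothing gets consumed while finding that out.  */
bool sk_starts_with_line_marker(const char *buf, size_t len)
{
    static const char prefix[] = "# 1 \"";
    size_t n = sizeof prefix - 1;

    return len >= n && memcmp(buf, prefix, n) == 0;
}
```

Because the check reads bytes and never advances the lexer, a file that
starts with a comment is left untouched for normal processing.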
libcpp/
* internal.h (enum include_type): Rename IT_MAIN_INJECT to
IT_PRE_MAIN.
* init.c (cpp_read_main_file): If there is no line marker, adjust
the initial line marker.
(read_original_filename): Return bool, peek the buffer directly
before trying to tokenize.
(read_original_directory): Likewise. Directly prod the string
literal.
* files.c (_cpp_stack_file): Adjust for IT_PRE_MAIN change.