Consider the example code mentionned in this PR:
$ cat -n test.c
1 #define C(a, b) a ## b
2 #define L(x) C(L, x)
3 #define M(a) goto L(__LINE__); __LINE__; L(__LINE__):
4 M(a /* --> this is the line of the expansion point of M. */
5 ); /* --> this is the line of the end of the invocation of M. */
$
"cc1 -quiet -E test.c" yields:
goto L5; 5; L4:
;
Notice how we have a 'L4' there, where it should be L5. That is the issue.
My understanding is that during the *second* expansion of __LINE__
(the one between the two L(__LINE__)), builtin_macro() is called by
enter_macro_context() with the location of the expansion point of M
(which is at line 4). Then _cpp_builtin_macro_text() expands __LINE__
into the line number of the location of the last token that has been
lexed, which is the location of the closing parenthesis of the
invocation of M, at line 5. So that invocation of __LINE__ is
expanded into 5.
Now let's see why the last invocation of __LINE__ is expanded into 4.
In builtin_macro(), we have this code at some point:
/* Set pfile->cur_token as required by _cpp_lex_direct. */
pfile->cur_token = _cpp_temp_token (pfile);
cpp_token *token = _cpp_lex_direct (pfile);
/* We should point to the expansion point of the builtin macro. */
token->src_loc = loc;
The first two statements insert a new token in the stream of lexed
token and pfile->cur_token[-1], is the "new" last token that has been
lexed. But the location of pfile->cur_token[-1] is the same location
as the location of the "previous" pfile->cur_token[-1], by courtesy of
_cpp_temp_token(). So normally, in subsequent invocations of
builtin_macro(), the location of pfile->cur_token[-1] should always be
the location of the closing parenthesis of the invocation of M at line
5. Except that that code in master now has the statement
"token->src_loc = loc;" on the next line. That statement actually
sets the location of pfile->cur_token[-1] to 'loc'. Which is the
location of the expansion point of M, which is on line 4.
So in the subsequent call to builtin_macro() (for the last expansion
of __LINE__ in L(__LINE__)), for _cpp_builtin_macro_text(),
pfile->cur_token[-1].src_loc is going to have a line number of 4.
I think the core issue here is that the location that is passed to
builtin_macro() from enter_macro_context() is not correct when we are
in presence of a top-most function-like macro invocation; in that
case, that location should be the location of the closing parenthesis
of the macro invocation. Otherwise, if we are in presence of a a
top-most object-like macro invocation then the location passed down
to builtin_macro should be the location of the expansion point of the
macro.
That way, in the particular case of the input code above, the location
received by builtin_macro() will always have line number 5.
Boostrapped and tested on x86_64-unknown-linux-gnu against trunk.
libcpp/ChangeLog:
* internal.h (cpp_reader::top_most_macro_node): New data member.
* macro.c (enter_macro_context): Pass the location of the end of
the top-most invocation of the function-like macro, or the
location of the expansion point of the top-most object-like macro.
(cpp_get_token_1): Store the top-most macro node in the new
pfile->top_most_macro_node data member.
(_cpp_pop_context): Clear the new cpp_reader::top_most_macro_node
data member.
gcc/testsuite/ChangeLog:
* gcc.dg/cpp/builtin-macro-1.c: New test case.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
From-SVN: r220367
DR#412
PR preprocessor/60570
* directives.c (do_elif): Don't evaluate #elif conditionals
when they don't need to be.
* gcc.dg/cpp/pr36320.c: Turn dg-error into dg-bogus.
* gcc.dg/cpp/pr60570.c: New test.
From-SVN: r220035
libcpp/ChangeLog:
2014-12-05 Manuel López-Ibáñez <manu@gcc.gnu.org>
* line-map.c (linemap_position_for_loc_and_offset): Add new
linemap_assert_fails.
gcc/fortran/ChangeLog:
2014-12-05 Manuel López-Ibáñez <manu@gcc.gnu.org>
* scanner.c (gfc_next_char_literal): Use gfc_warning_now.
(load_file): Use the line length as the column hint for
linemap_line_start. Reserve a location for the highest column of
the line.
From-SVN: r218407
libcpp:
2014-11-29 John Schmerge <jbschmerge@gmail.com>
PR preprocessor/41698
* charset.c (one_utf8_to_utf16): Do not produce surrogate pairs
for 0xffff.
gcc/testsuite:
2014-11-29 Joseph Myers <joseph@codesourcery.com>
PR preprocessor/41698
* gcc/testsuite/g++.dg/cpp/utf16-pr41698-1.C: New test.
From-SVN: r218179
PR preprocessor/60436
* line-map.c (linemap_line_start): If highest is above 0x60000000
and we are still tracking columns or highest is above 0x70000000,
force add_map.
From-SVN: r218042
libcpp:
2014-11-10 Edward Smith-Rowland <3dw4rd@verizon.net>
* include/cpplib.h (cpp_callbacks): Add has_attribute.
* internal.h (lexer_state): Add in__has_attribute__.
* directives.c (lex_macro_node): Prevent use of __has_attribute__
as a macro.
* expr.c (parse_has_attribute): New function; (eval_token): Look for
__has_attribute__ and route to parse_has_attribute.
* identifiers.c (_cpp_init_hashtable): Initialize n__has_attribute__.
* pch.c (cpp_read_state): Initialize n__has_attribute__.
* traditional.c (enum ls): Add ls_has_attribute, ls_has_attribute_close;
(_cpp_scan_out_logical_line): Attend to __has_attribute__.
gcc/c-family:
2014-11-10 Edward Smith-Rowland <3dw4rd@verizon.net>
* c-cppbuiltin.c (__has_attribute, __has_cpp_attribute): New macros;
(__cpp_rtti, __cpp_exceptions): New macros for C++98;
(__cpp_range_based_for, __cpp_initializer_lists,
__cpp_delegating_constructors, __cpp_nsdmi,
__cpp_inheriting_constructors, __cpp_ref_qualifiers): New macros
for C++11; (__cpp_attribute_deprecated): Remove in favor of
__has_cpp_attribute.
* c-lex.c (cb_has_attribute): New callback CPP function;
(init_c_lex): Set has_attribute callback.
gcc/testsuite:
2014-11-10 Edward Smith-Rowland <3dw4rd@verizon.net>
* g++.dg/cpp1y/feat-cxx11.C: Test new feature macros for C++98
and C++11; Test existence of __has_cpp_attribute; Test C++11
attributes.
* g++.dg/cpp1y/feat-cxx11-neg.C: Ditto.
* g++.dg/cpp1y/feat-cxx14.C: Ditto and test for C++14 attributes.
* g++.dg/cpp1y/feat-cxx98.C: Test new feature macros for C++98.
* g++.dg/cpp1y/feat-cxx98-neg.C: Ditto.
* g++.dg/cpp1y/feat-neg.C: Test that __cpp_rtti, _cpp_exceptions
will be undefined for -fno-rtti -fno-exceptions.
From-SVN: r217292
This patch makes cpplib track the original spellings of extended
identifiers, as well as the canonical UTF-8 version, in order to
follow standard semantics properly without needing a convoluted and
undocumented canonicalization in translation phase 1 (see bug 9449
comments 39-46 regarding such a canonicalization).
The spelling is tracked in cpp_identifier and cpp_macro_arg without
making cpp_token any larger. The original spelling is used for checks
of duplicate macro definitions, stringizing (see the C++ tests added;
this case is only an issue for C++ not C because C makes it
implementation-defined whether a \ is inserted before the \ of a UCN
in a string or character constant when stringizing, while C++ does
not), pasting (relevant when the result is then stringized for C++)
and when macro definitions are output as text (e.g. for -d options).
Once a macro has been defined, only the original spelling of the
argument names needs keeping in the argument list. While it is being
defined, however, both spellings are needed: the original one for
subsequent saving for checks of duplicate macro definitions, and the
canonical one which is the node marked specially to generate macro
argument tokens rather than normal identifier tokens. The buffer that
is used to save the original values of the identifier tokens is
changed so that it stores both those original values and a pointer to
the canonical hash nodes, so that those canonical nodes can be found
when their values need restoring after the macro definition has been
parsed.
I believe this covers the known standards issues in extended
identifiers support (the remaining unimplemented C99 areas in GCC all
being floating-point-related), except for C++ translation of extended
characters to UCNs in phase 1 (which I have no plans to work on).
There are however probably issues left with handling of extended
identifiers in other places, as listed in
<https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00337.html> (those
issues are generally the sort of thing that could be addressed as bugs
outside development stage 1). (The bulk of the potential issues Zack
was concerned about in 2003-5, that resulted in extended identifiers
being disabled in the absence of -fextended-identifiers, were
effectively eliminated by the audit and fixes I did in 2009, however;
that todo list reflects what was left over after that audit.)
Bootstrapped with no regressions on x86_64-unknown-linux-gnu.
libcpp:
* include/cpp-id-data.h (struct cpp_macro): Update comment
regarding parameters.
* include/cpplib.h (struct cpp_macro_arg, struct cpp_identifier):
Add spelling fields.
(struct cpp_token): Update comment on macro_arg.
* internal.h (_cpp_save_parameter): Add extra argument.
(_cpp_spell_ident_ucns): New declaration.
* lex.c (lex_identifier): Add SPELLING argument. Set *SPELLING to
original spelling of identifier.
(_cpp_lex_direct): Update calls to lex_identifier.
(_cpp_spell_ident_ucns): New function, factored out of
cpp_spell_token.
(cpp_spell_token): Adjust FORSTRING argument semantics to return
original spelling of identifiers. Use _cpp_spell_ident_ucns in
!FORSTRING case.
(_cpp_equiv_tokens): Check spellings of identifiers and macro
arguments are identical.
* macro.c (macro_arg_saved_data): New structure.
(paste_tokens): Use original spellings of identifiers from
cpp_spell_token.
(_cpp_save_parameter): Add argument SPELLING. Save both canonical
node and its value.
(parse_params): Update calls to _cpp_save_parameter.
(lex_expansion_token): Save spelling of macro argument tokens.
(_cpp_create_definition): Extract canonical node from saved data.
(cpp_macro_definition): Use UCNs in spelling of macro name. Use
original spellings of macro argument tokens and identifiers.
* traditional.c (scan_parameters): Update call to
_cpp_save_parameter.
gcc:
* doc/invoke.texi (-std=c99, -std=c11): Don't refer to corner
cases of extended identifiers.
gcc/testsuite:
* g++.dg/cpp/ucnid-2.C, g++.dg/cpp/ucnid-3.C,
gcc.dg/cpp/ucnid-11.c, gcc.dg/cpp/ucnid-12.c,
gcc.dg/cpp/ucnid-13.c, gcc.dg/cpp/ucnid-14.c,
gcc.dg/cpp/ucnid-15.c: New tests.
From-SVN: r217202
As proposed at <https://gcc.gnu.org/ml/gcc/2014-11/msg00014.html>,
this patch enables -fextended-identifiers by default for all standard
versions including this feature (all C++ versions, C99 and above for
C, but not C90 / C94 / gnu89 / preprocessing assembler). It adds a
couple of tests for areas where I previously noted testsuite coverage
for extended identifiers was lacking, removes -fextended-identifiers
from existing tests, adds -g to various such tests to verify that
extended identifiers don't break debug info generation and removes the
test that was only there to verify that the feature was off by
default.
The current state of the feature may not correspond exactly to any
particular checklist from 2004/5 (see bug 9449) of what was wanted
before enabling the feature by default, but I don't think it's any
worse than plenty of other features supported by default before every
corner case is fully functional, and think problems can readily be
fixed incrementally.
The following aspects of extended identifiers could still do with more
work (and should be straightforward):
* C -aux-info (output should use UCNs).
* ObjC -gen-decls (output should use UCNs; associated diagnostics from
the ObjC front end should use extended characters or UCNs as
appropriate to the locale, via using %qE or identifier_to_locale).
* Use DW_AT_use_UTF8 in DWARF-3 debug info for compilation units built
with extended identifiers enabled (or unconditionally).
* cpplib diagnostics (outputting characters or UCNs as appropriate
depending on the locale, as done for identifiers in non-cpplib
diagnostics).
* C++ test for UCN linking with C and extern "C".
* Check GDB support / file issues for support if needed.
* Actual UTF-8 in identifiers (?). (Be careful about not affecting
performance for the normal fast path of lexing identifiers, if
possible.)
The following may be trickier:
* cpplib spelling preservation (required to diagnose macro
redefinition with different spellings of the same identifier in the
definition or argument names; different spellings of the name of the
macro itself are OK, however; also required for correct handling of
multiple stringizing in C++); correct output for -d (UCNs), DWARF
debug info for macros (UCNs), PCH and PCH tests. (Spelling
preservation is the issue that needs fixing to remove references to
corner cases in the documentation of -std=c99 and -std=c11 and in
c99status.html.) The idea would be to add a second pointer to
cpp_identifier that stores the original spelling (whether for
extended identifiers only, or for all identifiers); this does not
enlarge cpp_token because the resulting larger cpp_identifier
structure is no bigger than cpp_string.
* C++ translation of extended characters (including $@` and various
control characters) to UCNs in phase 1 (note diagnostics thus
needed, but not for C++11, for control characters in strings /
character constants as those UCNs invalid); a likely implementation
approach is to do translation when identifiers / strings / character
constants are lexed, together with errors for stray $@` / control
characters in program as not being valid UCNs in identifiers ($ only
if not accepted in identifiers); note that this translation should
not take place inside raw string literals.
Bootstrapped with no regressions on x86_64-unknown-linux-gnu.
libcpp:
PR preprocessor/9449
* init.c (lang_defaults): Enable extended identifiers for C++ and
C99-based standards.
gcc:
PR preprocessor/9449
* doc/cpp.texi (Character sets, Tokenization)
(Implementation-defined behavior): Don't refer to UCNs in
identifiers requiring -fextended-identifiers.
* doc/cppopts.texi (-fextended-identifiers): Document as enabled
by default for C99 and later and C++.
* doc/invoke.texi (-std=c99, -std=c11): Don't refer to extended
identifiers needing -fextended-identifiers.
gcc/testsuite:
PR preprocessor/9449
* lib/target-supports.exp (check_effective_target_ucn_nocache):
Don't use -fextended-identifiers.
* c-c++-common/cpp/normalize-3.c, c-c++-common/cpp/ucnid-2011-1.c,
g++.dg/cpp/ucn-1.C, g++.dg/cpp/ucnid-1.C, g++.dg/other/ucnid-1.C,
gcc.dg/cpp/normalize-1.c, gcc.dg/cpp/normalize-2.c,
gcc.dg/cpp/normalize-4.c: Don't use -fextended-identifiers.
* gcc.dg/cpp/ucnid-1.c: Don't use -fextended-identifiers. Use
-g3.
* gcc.dg/cpp/ucnid-10.c, gcc.dg/cpp/ucnid-2.c,
gcc.dg/cpp/ucnid-3.c, gcc.dg/cpp/ucnid-4.c, gcc.dg/cpp/ucnid-5.c,
gcc.dg/cpp/ucnid-7.c, gcc.dg/cpp/ucnid-9.c,
gcc.dg/cpp/warn-normalized-1.c, gcc.dg/cpp/warn-normalized-2.c,
gcc.dg/cpp/warn-normalized-3.c: Don't use -fextended-identifiers.
* gcc.dg/ucnid-1.c, gcc.dg/ucnid-2.c, gcc.dg/ucnid-3.c,
gcc.dg/ucnid-4.c, gcc.dg/ucnid-5.c, gcc.dg/ucnid-6.c: Don't use
-fextended-identifiers. Use -g.
* gcc.dg/ucnid-7.c, gcc.dg/ucnid-8.c: Don't use
-fextended-identifiers.
* gcc.dg/ucnid-9.c: Don't use -fextended-identifiers. Use -g.
* gcc.dg/ucnid-10.c: Don't use -fextended-identifiers.
* gcc.dg/ucnid-11.c, gcc.dg/ucnid-12.c: Don't use
-fextended-identifiers. Use -g.
* gcc.dg/ucnid-13.c: Don't use -fextended-identifiers.
* gcc.dg/cpp/ucnid-8.c: Remove test.
* gcc.dg/cpp/ucnid-10.c, gcc.dg/ucnid-14.c: New tests.
From-SVN: r217144
2014-10-03 Bill Schmidt <wschmidt@linux.vnet.ibm.com>
* lex.c (search_line_fast): Add new version to be used for Power8
and later targets when Altivec is enabled. Restrict the existing
Altivec version to big-endian systems so that lvsr is not used on
little endian, where it is deprecated. Remove LE-specific code
from the now-BE-only version.
From-SVN: r215873
2014-10-02 Bernd Edlinger <bernd.edlinger@hotmail.de>
Jeff Law <law@redhat.com>
* charset.c (convert_no_conversion): Reallocate memory with 25%
headroom.
Co-Authored-By: Jeff Law <law@redhat.com>
From-SVN: r215785
* charset.c (conversion): Rename to ...
(cpp_conversion): ... this one; update.
* files.c (file_hash_entry): Rename to ...
(cpp_file_hash_entry): ... this one ; update.
From-SVN: r215482
gcc/ChangeLog:
2014-09-04 Manuel López-Ibáñez <manu@gcc.gnu.org>
* doc/options.texi: Document that Var and Init are required if CPP
is given.
* optc-gen.awk: Require Var and Init if CPP is given.
* common.opt (Wpedantic): Use Init.
libcpp/ChangeLog:
2014-09-04 Manuel López-Ibáñez <manu@gcc.gnu.org>
* macro.c (replace_args): Use cpp_pedwarning, cpp_warning and
CPP_W flags.
* include/cpplib.h: Add CPP_W_C90_C99_COMPAT and CPP_W_PEDANTIC.
* init.c (cpp_create_reader): Do not init to -1 here.
* expr.c (num_binary_op): Use cpp_pedwarning.
gcc/c-family/ChangeLog:
2014-09-04 Manuel López-Ibáñez <manu@gcc.gnu.org>
* c.opt (Wc90-c99-compat,Wc++-compat,Wcomment,Wendif-labels,
Winvalid-pch,Wlong-long,Wmissing-include-dirs,Wmultichar,Wpedantic,
(Wdate-time,Wtraditional,Wundef,Wvariadic-macros): Add CPP, Var
and Init.
* c-opts.c (c_common_handle_option): Do not handle here.
(sanitize_cpp_opts): Likewise.
* c-common.c (struct reason_option_codes_t): Handle
CPP_W_C90_C99_COMPAT and CPP_W_PEDANTIC.
gcc/testsuite/ChangeLog:
2014-09-04 Manuel López-Ibáñez <manu@gcc.gnu.org>
* gcc.dg/cpp/endif-pedantic2.c: More general options do not
override specific ones, but specific ones do.
From-SVN: r214904
libcpp/ChangeLog:
2014-08-29 Manuel López-Ibáñez <manu@gcc.gnu.org>
* macro.c (warn_of_redefinition): Suppress warnings for builtins
that lack the NODE_WARN flag, unless Wbuiltin-macro-redefined.
(_cpp_create_definition): Use Wbuiltin-macro-redefined for
builtins that lack the NODE_WARN flag.
* directives.c (do_undef): Likewise.
* init.c (cpp_init_special_builtins): Do not change flags
depending on Wbuiltin-macro-redefined.
gcc/c-family/ChangeLog:
2014-08-29 Manuel López-Ibáñez <manu@gcc.gnu.org>
* c.opt (Wbuiltin-macro-redefined): Use CPP, Var and Init.
* c-opts.c (c_common_handle_option): Do not handle here.
From-SVN: r214730
libcpp/
2014-08-27 Edward Smith-Rowland <3dw4rd@verizon.net>
PR cpp/23827 - standard C++ should not have hex float preprocessor
tokens
* libcpp/init.c (lang_flags): Change CXX98 flag for extended numbers
from 1 to 0.
* libcpp/expr.c (cpp_classify_number): Weite error message for improper
use of hex floating literal.
gcc/testsuite/
2014-08-27 Edward Smith-Rowland <3dw4rd@verizon.net>
PR cpp/23827 - standard C++ should not have hex float preprocessor
tokens
* g++.dg/cpp/pr23827_cxx11.C: New.
* g++.dg/cpp/pr23827_cxx98.C: New.
* g++.dg/cpp/pr23827_cxx98_neg.C: New.
* gcc.dg/cpp/pr23827_c90.c: New.
* gcc.dg/cpp/pr23827_c90_neg.c: New.
* gcc.dg/cpp/pr23827_c99.c: New.
From-SVN: r214616
PR c/61861
* macro.c (builtin_macro): Add location parameter. Set
location of builtin macro to the expansion point.
(enter_macro_context): Pass location to builtin_macro.
* gcc.dg/pr61861.c: New test.
From-SVN: r213102
When a built-in macro is expanded, the location of the token in the
epansion list is the location of the expansion point of the built-in
macro.
This patch creates a virtual location for that token instead,
effectively tracking locations of tokens resulting from built-in macro
tokens.
libcpp/
* include/line-map.h (line_maps::builtin_location): New data
member.
(line_map_init): Add a new parameter to initialize the new
line_maps::builtin_location data member.
* line-map.c (linemap_init): Initialize the
line_maps::builtin_location data member.
* macro.c (builtin_macro): Create a macro map and track the token
resulting from the expansion of a built-in macro.
gcc/
* input.h (is_location_from_builtin_token): New function
declaration.
* input.c (is_location_from_builtin_token): New function
definition.
* toplev.c (general_init): Tell libcpp what the pre-defined
spelling location for built-in tokens is.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
From-SVN: r212637
2014-07-10 Edward Smith-Rowland <3dw4rd@verizon.net>
Jonathan Wakely <jwakely@redhat.com>
PR CPP/61389
* macro.c (_cpp_arguments_ok, parse_params, create_iso_definition):
Warning messages mention C++11 in c++ mode and C99 in c mode.
* lex.c (lex_identifier_intern, lex_identifier): Ditto
Co-Authored-By: Jonathan Wakely <jwakely@redhat.com>
From-SVN: r212441
libcpp/
2014-07-09 Edward Smith-Rowland <3dw4rd@verizon.net>
PR c++/58155 - -Wliteral-suffix warns about tokens which are skipped
by preprocessor
* lex.c (lex_raw_string ()): Do not warn about invalid suffix
if skipping. (lex_string ()): Ditto.
gcc/testsuite/
2014-07-09 Edward Smith-Rowland <3dw4rd@verizon.net>
PR c++/58155 - -Wliteral-suffix warns about tokens which are skipped
g++.dg/cpp0x/pr58155.C: New.
From-SVN: r212392
PR c++/61038
I was asked to combine the escape logic for regular chars and strings
with the escape logic for user-defined literals chars and strings.
I just forgot the first time.
I forgot the ChangeLog!
From-SVN: r211267
PR c++/61038
I was asked to combine the escape logic for regular chars and strings
with the escape logic for user-defined literals chars and strings.
I just forgot the first time.
From-SVN: r211266
2014-05-26 Richard Biener <rguenther@suse.de>
libcpp/
* configure.ac: Remove long long and __int64 type checks,
add check for uint64_t and fail if that wasn't found.
* include/cpplib.h (cpp_num_part): Use uint64_t.
* config.in: Regenerate.
* configure: Likewise.
gcc/
* configure.ac: Drop __int64 type check. Insist that we
found uint64_t and int64_t.
* hwint.h (HOST_BITS_PER___INT64): Remove.
(HOST_BITS_PER_WIDE_INT): Define to 64 and remove
__int64 case.
(HOST_WIDE_INT_PRINT_*): Remove 32bit case.
(HOST_WIDEST_INT*): Define to HOST_WIDE_INT*.
(HOST_WIDEST_FAST_INT): Remove __int64 case.
* vmsdbg.h (struct _DST_SRC_COMMAND): Use int64_t
for dst_q_src_df_rms_cdt.
* configure: Regenerate.
* config.in: Likewise.
From-SVN: r210928