Fix PR65261
Running bootstrap-ubsan on ppc64le shows many instances of:
libcpp/lex.c:552:30: runtime error: load of misaligned address
0x01001f31d37a for type 'const uchar', which requires 16 byte alignment
But the unaligned vector loads are intended in this case, because they
are preferable to forced-alignment on POWER8. So just silence the ubsan
errors.
2015-03-02 Markus Trippelsdorf <markus@trippelsdorf.de>
include/
PR target/65261
* ansidecl.h (ATTRIBUTE_NO_SANITIZE_UNDEFINED): New macro.
libcpp/
PR target/65261
* lex.c (search_line_fast): Silence ubsan errors.
From-SVN: r221190
This patch makes cpplib track the original spellings of extended
identifiers, as well as the canonical UTF-8 version, in order to
follow standard semantics properly without needing a convoluted and
undocumented canonicalization in translation phase 1 (see bug 9449
comments 39-46 regarding such a canonicalization).
The spelling is tracked in cpp_identifier and cpp_macro_arg without
making cpp_token any larger. The original spelling is used for checks
of duplicate macro definitions, stringizing (see the C++ tests added;
this case is only an issue for C++ not C because C makes it
implementation-defined whether a \ is inserted before the \ of a UCN
in a string or character constant when stringizing, while C++ does
not), pasting (relevant when the result is then stringized for C++)
and when macro definitions are output as text (e.g. for -d options).
Once a macro has been defined, only the original spelling of the
argument names needs keeping in the argument list. While it is being
defined, however, both spellings are needed: the original one for
subsequent saving for checks of duplicate macro definitions, and the
canonical one which is the node marked specially to generate macro
argument tokens rather than normal identifier tokens. The buffer that
is used to save the original values of the identifier tokens is
changed so that it stores both those original values and a pointer to
the canonical hash nodes, so that those canonical nodes can be found
when their values need restoring after the macro definition has been
parsed.
I believe this covers the known standards issues in extended
identifiers support (the remaining unimplemented C99 areas in GCC all
being floating-point-related), except for C++ translation of extended
characters to UCNs in phase 1 (which I have no plans to work on).
There are however probably issues left with handling of extended
identifiers in other places, as listed in
<https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00337.html> (those
issues are generally the sort of thing that could be addressed as bugs
outside development stage 1). (The bulk of the potential issues Zack
was concerned about in 2003-5, that resulted in extended identifiers
being disabled in the absence of -fextended-identifiers, were
effectively eliminated by the audit and fixes I did in 2009, however;
that todo list reflects what was left over after that audit.)
Bootstrapped with no regressions on x86_64-unknown-linux-gnu.
libcpp:
* include/cpp-id-data.h (struct cpp_macro): Update comment
regarding parameters.
* include/cpplib.h (struct cpp_macro_arg, struct cpp_identifier):
Add spelling fields.
(struct cpp_token): Update comment on macro_arg.
* internal.h (_cpp_save_parameter): Add extra argument.
(_cpp_spell_ident_ucns): New declaration.
* lex.c (lex_identifier): Add SPELLING argument. Set *SPELLING to
original spelling of identifier.
(_cpp_lex_direct): Update calls to lex_identifier.
(_cpp_spell_ident_ucns): New function, factored out of
cpp_spell_token.
(cpp_spell_token): Adjust FORSTRING argument semantics to return
original spelling of identifiers. Use _cpp_spell_ident_ucns in
!FORSTRING case.
(_cpp_equiv_tokens): Check spellings of identifiers and macro
arguments are identical.
* macro.c (macro_arg_saved_data): New structure.
(paste_tokens): Use original spellings of identifiers from
cpp_spell_token.
(_cpp_save_parameter): Add argument SPELLING. Save both canonical
node and its value.
(parse_params): Update calls to _cpp_save_parameter.
(lex_expansion_token): Save spelling of macro argument tokens.
(_cpp_create_definition): Extract canonical node from saved data.
(cpp_macro_definition): Use UCNs in spelling of macro name. Use
original spellings of macro argument tokens and identifiers.
* traditional.c (scan_parameters): Update call to
_cpp_save_parameter.
gcc:
* doc/invoke.texi (-std=c99, -std=c11): Don't refer to corner
cases of extended identifiers.
gcc/testsuite:
* g++.dg/cpp/ucnid-2.C, g++.dg/cpp/ucnid-3.C,
gcc.dg/cpp/ucnid-11.c, gcc.dg/cpp/ucnid-12.c,
gcc.dg/cpp/ucnid-13.c, gcc.dg/cpp/ucnid-14.c,
gcc.dg/cpp/ucnid-15.c: New tests.
From-SVN: r217202
2014-10-03 Bill Schmidt <wschmidt@linux.vnet.ibm.com>
* lex.c (search_line_fast): Add new version to be used for Power8
and later targets when Altivec is enabled. Restrict the existing
Altivec version to big-endian systems so that lvsr is not used on
little endian, where it is deprecated. Remove LE-specific code
from the now-BE-only version.
From-SVN: r215873
2014-07-10 Edward Smith-Rowland <3dw4rd@verizon.net>
Jonathan Wakely <jwakely@redhat.com>
PR CPP/61389
* macro.c (_cpp_arguments_ok, parse_params, create_iso_definition):
Warning messages mention C++11 in c++ mode and C99 in c mode.
* lex.c (lex_identifier_intern, lex_identifier): Ditto
Co-Authored-By: Jonathan Wakely <jwakely@redhat.com>
From-SVN: r212441
libcpp/
2014-07-09 Edward Smith-Rowland <3dw4rd@verizon.net>
PR c++/58155 - -Wliteral-suffix warns about tokens which are skipped
by preprocessor
* lex.c (lex_raw_string ()): Do not warn about invalid suffix
if skipping. (lex_string ()): Ditto.
gcc/testsuite/
2014-07-09 Edward Smith-Rowland <3dw4rd@verizon.net>
PR c++/58155 - -Wliteral-suffix warns about tokens which are skipped
g++.dg/cpp0x/pr58155.C: New.
From-SVN: r212392
gcc/testsuite:
* c-c++-common/cpp/ucnid-2011-1.c: New test.
libcpp:
* ucnid.tab: Add C11 and C11NOSTART data.
* makeucnid.c (digit): Rename enum value to N99.
(C11, N11, all_languages): New enum values.
(NUM_CODE_POINTS, MAX_CODE_POINT): New macros.
(flags, decomp, combining_value): Use NUM_CODE_POINTS as array
size.
(decomp): Use unsigned int as element type.
(all_decomp): New array.
(read_ucnid): Handle C11 and C11NOSTART. Use MAX_CODE_POINT.
(read_table): Use MAX_CODE_POINT. Store all decompositions in
all_decomp.
(read_derived): Use MAX_CODE_POINT.
(write_table): Use NUM_CODE_POINTS. Print N99, C11 and N11
flags. Print whole array variable declaration rather than just
array contents.
(char_id_valid, write_context_switch): New functions.
(main): Call write_context_switch.
* ucnid.h: Regenerate.
* include/cpplib.h (struct cpp_options): Add c11_identifiers.
* init.c (struct lang_flags): Add c11_identifiers.
(cpp_set_lang): Set c11_identifiers option from selected language.
* internal.h (struct normalize_state): Document "previous" as
previous starter character.
(NORMALIZE_STATE_UPDATE_IDNUM): Take character as argument.
* charset.c (DIG): Rename enum value to N99.
(C11, N11): New enum values.
(struct ucnrange): Give name to struct. Use short for flags and
unsigned int for end of range. Include ucnid.h for whole variable
declaration.
(ucn_valid_in_identifier): Allow for characters up to 0x10FFFF.
Allow for C11 in determining valid characters and valid start
characters. Use check_nfc for non-Hangul context-dependent
checks. Only store starter characters in nst->previous.
(_cpp_valid_ucn): Pass new argument to
NORMALIZE_STATE_UPDATE_IDNUM.
* lex.c (lex_identifier): Pass new argument to
NORMALIZE_STATE_UPDATE_IDNUM. Call NORMALIZE_STATE_UPDATE_IDNUM
after initial non-UCN part of identifier.
(lex_number): Pass new argument to NORMALIZE_STATE_UPDATE_IDNUM.
From-SVN: r204886
PR preprocessor/57620
* lex.c (lex_raw_string): Undo phase1 and phase2 transformations
between R" and final " rather than only in between R"del( and )del".
* c-c++-common/raw-string-2.c (s12, u12, U12, L12): Remove.
(main): Don't test {s,u,U,L}12.
* c-c++-common/raw-string-13.c: New test.
* c-c++-common/raw-string-14.c: New test.
* c-c++-common/raw-string-15.c: New test.
* c-c++-common/raw-string-16.c: New test.
From-SVN: r201091
PR preprocessor/57824
* lex.c (lex_raw_string): Allow reading new-lines if
in_deferred_pragma or if parsing_args and there is still
data in the current buffer.
* c-c++-common/raw-string-17.c: New test.
* c-c++-common/gomp/pr57824.c: New test.
From-SVN: r200879
* c-ppoutput.c (scan_translation_unit): Call account_for_newlines
for all CPP_TOKEN_FLD_STR tokens, not just CPP_COMMENT.
* include/cpplib.h (cpp_token_val_index): Change parameter type to
const cpp_token *.
* lex.c (cpp_token_val_index): Likewise.
* c-c++-common/raw-string-18.c: New test.
* c-c++-common/raw-string-19.c: New test.
From-SVN: r200878
PR preprocessor/57757
* lex.c (cpp_avoid_paste): Avoid pasting CPP_{,W,UTF8}STRING
or CPP_STRING{16,32} with CPP_NAME or SPELL_LITERAL token that
starts if a-zA-Z_.
* g++.dg/cpp/paste1.C: New test.
* g++.dg/cpp/paste2.C: New test.
From-SVN: r200875
/libcpp
2013-04-24 Paolo Carlini <paolo.carlini@oracle.com>
* include/cpplib.h (enum c_lang): Add CLK_GNUCXX1Y and CLK_CXX1Y.
* init.c (lang_defaults): Add defaults for the latter.
(cpp_init_builtins): Define __cplusplus as 201300L for the latter.
* lex.c (_cpp_lex_direct): Update.
/gcc/c-family
2013-04-24 Paolo Carlini <paolo.carlini@oracle.com>
* c-opts.c (set_std_cxx11): Use CLK_CXX1Y and CLK_GNUCXX1Y.
/gcc/testsuite
2013-04-24 Paolo Carlini <paolo.carlini@oracle.com>
* g++.dg/cpp1y/cplusplus.C: New.
From-SVN: r198261
* configure.ac: Don't define ENABLE_CHECKING whenever
--enable-checking is seen, instead use similar --enable-checking=yes
vs. --enable-checking=release default as gcc/ subdir has and
define ENABLE_CHECKING if ENABLE_CHECKING is defined in gcc/.
Define ENABLE_VALGRIND_CHECKING if requested.
* lex.c (new_buff): If ENABLE_VALGRIND_CHECKING, put _cpp_buff
struct first in the allocated buffer and result->base after it.
(_cpp_free_buff): If ENABLE_VALGRIND_CHECKING, free buff itself
instead of buff->base.
* config.in: Regenerated.
* configure: Regenerated.
From-SVN: r196333
I noticed that the file lex.c had C++ style comments, which I believe
is against the coding standards of the project.
Fixed, tested and applied to master as per the obvious rule.
libcpp/
* lex.c (lex_raw_string): Change C++ style comments into C
style comments.
(lex_string): Likewise.
From-SVN: r186946
This option, which is enabled by default, causes the preprocessor to warn
when a string or character literal is followed by a ud-suffix which does
not begin with an underscore. According to [lex.ext]p10, this is
ill-formed.
Also modifies the preprocessor to treat such ill-formed suffixes as separate
preprocessing tokens. This is consistent with the Clang front end (see
http://llvm.org/viewvc/llvm-project?view=rev&revision=152287), and enables
backwards compatibility with code that uses formatting macros from
<inttypes.h>, as in the following code block:
int main() {
int64_t i64 = 123;
printf("My int64: %"PRId64"\n", i64);
}
Google ref b/6377711.
2012-04-27 Ollie Wild <aaw@google.com>
PR c++/52538
* gcc/c-family/c-common.c: Add CPP_W_LITERAL_SUFFIX mapping.
* gcc/c-family/c-opts.c (c_common_handle_option): Handle
OPT_Wliteral_suffix.
* gcc/c-family/c.opt: Add Wliteral-suffix.
* gcc/doc/invoke.texi (Wliteral-suffix): Document new option.
* gcc/testsuite/g++.dg/cpp0x/Wliteral-suffix.c: New test.
* libcpp/include/cpplib.h (struct cpp_options): Add new field,
warn_literal_suffix.
(CPP_W_LITERAL_SUFFIX): New enum.
* libcpp/init.c (cpp_create_reader): Default initialization of
warn_literal_suffix.
* libcpp/lex.c (lex_raw_string): Treat user-defined literals which
don't begin with '_' as separate tokens and produce a warning.
(lex_string): Ditto.
From-SVN: r186909
libcpp/
* include/internal.h (_cpp_remaining_tokens_num_in_context): Take the
context to act upon.
* lex.c (_cpp_remaining_tokens_num_in_context): Likewise. Update
comment.
(cpp_token_from_context_at): Likewise.
(cpp_peek_token): Use the context to peek tokens from.
From-SVN: r180328
This second instalment uses the infrastructure of the previous patch
to allocate a macro map for each macro expansion and assign a virtual
location to each token resulting from the expansion.
To date when cpp_get_token comes across a token that happens to be a
macro, the macro expander kicks in, expands the macro, pushes the
resulting tokens onto a "token context" and returns a dummy padding
token. The next call to cpp_get_token goes look into the token context
for the next token [which is going to result from the previous macro
expansion] and returns it. If the token is a macro, the macro expander
kicks in and you know the story.
This patch piggy-backs on that macro expansion process, so to speak.
First it modifies the macro expander to make it create a macro map for
each macro expansion. It then allocates a virtual location for each
resulting token. Virtual locations of tokens resulting from macro
expansions are then stored on a special kind of context called an
"expanded tokens context". In other words, in an expanded tokens
context, there are tokens resulting from macro expansion and their
associated virtual locations. cpp_get_token_with_location is modified
to return the virtual location of tokens resulting from macro
expansion. Note that once all tokens from an expanded token context have
been consumed and the context and is freed, the memory used to store the
virtual locations of the tokens held in that context is freed as well.
This helps reducing the overall peak memory consumption.
The client code that was getting macro expansion point location from
cpp_get_token_with_location now gets virtual location from it. Those
virtual locations can in turn be resolved into the different
interesting physical locations thanks to the linemap API exposed by
the previous patch.
Expensive progress. Possibly. So this whole virtual location
allocation business is switched off by default. So by default no
extended token is created. No extended token context is created
either. One has to use -ftrack-macro-expansion to switch this on. This
complicates the code but I believe it can be useful as some of our
friends found out at http://llvm.org/bugs/show_bug.cgi?id=5610
The patch tries to reduce the memory consumption by freeing some token
context memory that was being reused before. I didn't notice any
compilation slow down due to this immediate freeing on my GNU/Linux
system.
As no client code tries to resolve virtual locations to anything but
what was being done before, no new test case has been added.
Co-Authored-By: Dodji Seketeli <dodji@redhat.com>
From-SVN: r180082
Use it to force BUILTINS_LOCATION when declaring builtins instead of creating a <built-in> entry in the line_table which is wrong.
* c-opts.c (c_finish_options): Force BUILTINS_LOCATION for tokens
defined in cpp_init_builtins and c_cpp_builtins.
gcc/fortran/ChangeLog
* cpp.c (gfc_cpp_init): Force BUILTINS_LOCATION for tokens
defined in cpp_define_builtins.
libcpp/ChangeLog
* init.c (cpp_create_reader): Inititalize forced_token_location_p.
* internal.h (struct cpp_reader): Add field forced_token_location_p.
* lex.c (_cpp_lex_direct): Use forced_token_location_p.
(cpp_force_token_locations): New.
(cpp_stop_forcing_token_locations): New.
From-SVN: r177973
LINEMAP_POSITION_FOR_COLUMN had the exact same effect as
linemap_position_for_column, removed it and updated users
to use linemap_position_for_column instead
libcpp/ChangeLog
* include/line-map.h (LINEMAP_POSITION_FOR_COLUMN): Remove.
Update all users to use linemap_position_for_column instead.
gcc/go/ChangeLog
* gofrontend/lex.cc (Lex::location): Update to use
linemap_position_for_column instead.
(Lex::earlier_location): Likewise.
From-SVN: r177768
PR target/49104
* config/i386/cpuid.h (bit_MMXEXT): New define.
libcpp/ChangeLog:
2011-05-22 Uros Bizjak <ubizjak@gmail.com>
PR target/49104
* lex.c (init_vectorized_lexer): Do not set "minimum" when __3dNOW_A__
is defined. Check bit_MMXEXT and bit_CMOV to use search_line_mmx.
From-SVN: r174032
PR preprocessor/48740
* lex.c (lex_raw_string): When raw string ends with
??) followed by raw prefix and ", ensure it is preprocessed
with ??) rather than ??].
* c-c++-common/raw-string-11.c: New test.
From-SVN: r172903