This patch combines:
[PATCH 05/10] Add ranges to libcpp tokens (via ad-hoc data, unoptimized)
[PATCH 06/10] Track expression ranges in C frontend
[PATCH 07/10] Add plugin to recursively dump the source-ranges in a tree (v2)
[PATCH 08/10] Wire things up so that libcpp users get token underlines
[PATCH 09/10] Delay some resolution of ad-hoc locations, preserving ranges
[PATCH 10/10] Compress short ranges into source_location
[PATCH] libcpp: add examples to source_location description
along with fixes for the nits identified during review.
gcc/ChangeLog:
* Makefile.in (OBJS): Add gcc-rich-location.o.
* diagnostic.c (diagnostic_append_note): Pass line_table to
rich_location ctor.
(emit_diagnostic): Likewise.
(inform): Likewise.
(inform_n): Likewise.
(warning): Likewise.
(warning_at): Likewise.
(warning_n): Likewise.
(pedwarn): Likewise.
(permerror): Likewise.
(error): Likewise.
(error_n): Likewise.
(error_at): Likewise.
(sorry): Likewise.
(fatal_error): Likewise.
(internal_error): Likewise.
(internal_error_no_backtrace): Likewise.
(source_range::debug): Likewise.
* gcc-rich-location.c: New file.
* gcc-rich-location.h: New file.
* genmatch.c (fatal_at): Pass line_table to rich_location ctor.
(warning_at): Likewise.
* gimple.h (gimple_set_block): Use set_block function.
* input.c (dump_line_table_statistics): Dump stats on how many
ranges were optimized vs how many needed ad-hoc table.
(write_digit_row): Add "map" param; use its range_bits
to calculate the per-character offset.
(dump_location_info): Print the range and column bits for each
ordinary map. Use the range bits to calculate the per-character
offset. Pass the map as a new param to the various calls to
write_digit_row. Eliminate uses of
ORDINARY_MAP_NUMBER_OF_COLUMN_BITS.
* print-tree.c (print_node): Print any source range information.
* rtl-error.c (diagnostic_for_asm): Likewise.
* toplev.c (general_init): Initialize line_table's
default_range_bits.
* tree-cfg.c (move_block_to_fn): Likewise.
(move_block_to_fn): Likewise.
* tree-inline.c (copy_phis_for_bb): Likewise.
* tree.c (tree_set_block): Likewise.
(get_pure_location): New function.
(set_source_range): New functions.
(set_block): New function.
(set_source_range): New functions.
* tree.h (CAN_HAVE_RANGE_P): New.
(EXPR_LOCATION_RANGE): New.
(EXPR_HAS_RANGE): New.
(get_expr_source_range): New inline function.
(DECL_LOCATION_RANGE): New.
(set_source_range): New decls.
(get_decl_source_range): New inline function.
gcc/ada/ChangeLog:
* gcc-interface/trans.c (Sloc_to_locus): Add line_table param when
calling linemap_position_for_line_and_column.
gcc/c-family/ChangeLog:
* c-common.c (c_fully_fold_internal): Capture existing souce_range,
and store it on the result.
* c-opts.c (c_common_init_options): Set
global_dc->colorize_source_p.
gcc/c/ChangeLog:
* c-decl.c (warn_defaults_to): Pass line_table to
rich_location ctor.
* c-errors.c (pedwarn_c99): Likewise.
(pedwarn_c90): Likewise.
* c-parser.c (set_c_expr_source_range): New functions.
(c_token::get_range): New method.
(c_token::get_finish): New method.
(c_parser_expr_no_commas): Call set_c_expr_source_range on the ret
based on the range from the start of the LHS to the end of the
RHS.
(c_parser_conditional_expression): Likewise, based on the range
from the start of the cond.value to the end of exp2.value.
(c_parser_binary_expression): Call set_c_expr_source_range on
the stack values for TRUTH_ANDIF_EXPR and TRUTH_ORIF_EXPR.
(c_parser_cast_expression): Call set_c_expr_source_range on ret
based on the cast_loc through to the end of the expr.
(c_parser_unary_expression): Likewise, based on the
op_loc through to the end of op.
(c_parser_sizeof_expression) Likewise, based on the start of the
sizeof token through to either the closing paren or the end of
expr.
(c_parser_postfix_expression): Likewise, using the token range,
or from the open paren through to the close paren for
parenthesized expressions.
(c_parser_postfix_expression_after_primary): Likewise, for
various kinds of expression.
* c-tree.h (struct c_expr): Add field "src_range".
(c_expr::get_start): New method.
(c_expr::get_finish): New method.
(set_c_expr_source_range): New decls.
* c-typeck.c (parser_build_unary_op): Call set_c_expr_source_range
on ret for prefix unary ops.
(parser_build_binary_op): Likewise, running from the start of
arg1.value through to the end of arg2.value.
gcc/cp/ChangeLog:
* error.c (pedwarn_cxx98): Pass line_table to rich_location ctor.
gcc/fortran/ChangeLog:
* error.c (gfc_warning): Pass line_table to rich_location ctor.
(gfc_warning_now_at): Likewise.
(gfc_warning_now): Likewise.
(gfc_error_now): Likewise.
(gfc_fatal_error): Likewise.
(gfc_error): Likewise.
(gfc_internal_error): Likewise.
gcc/testsuite/ChangeLog:
* gcc.dg/diagnostic-token-ranges.c: New file.
* gcc.dg/diagnostic-tree-expr-ranges-2.c: New file.
* gcc.dg/plugin/diagnostic-test-expressions-1.c: New file.
* gcc.dg/plugin/diagnostic-test-show-trees-1.c: New file.
* gcc.dg/plugin/diagnostic_plugin_show_trees.c: New file.
* gcc.dg/plugin/diagnostic_plugin_test_show_locus.c (get_loc): Add
line_table param when calling
linemap_position_for_line_and_column.
(test_show_locus): Pass line_table to rich_location ctors.
(plugin_init): Remove setting of global_dc->colorize_source_p.
* gcc.dg/plugin/diagnostic_plugin_test_tree_expression_range.c:
New file.
* gcc.dg/plugin/plugin.exp (plugin_test_list): Add
diagnostic_plugin_test_tree_expression_range.c,
diagnostic-test-expressions-1.c, diagnostic_plugin_show_trees.c,
and diagnostic-test-show-trees-1.c.
libcpp/ChangeLog:
* errors.c (cpp_diagnostic): Pass pfile->line_table to
rich_location ctor.
(cpp_diagnostic_with_line): Likewise.
* include/cpplib.h (struct cpp_token): Update comment for src_loc
to indicate that the range of the token is "baked into" the
source_location.
* include/line-map.h (source_location): Update the descriptive
comment to reflect the packing scheme for short ranges, adding
worked examples of location encoding.
(struct line_map_ordinary): Drop field "column_bits" in favor
of field "m_column_and_range_bits"; add field "m_range_bits".
(ORDINARY_MAP_NUMBER_OF_COLUMN_BITS): Delete.
(location_adhoc_data): Add source_range field.
(struct line_maps): Add fields "default_range_bits",
"num_optimized_ranges" and "num_unoptimized_ranges".
(get_combined_adhoc_loc): Add source_range param.
(get_range_from_loc): New declaration.
(pure_location_p): New prototype.
(COMBINE_LOCATION_DATA): Add source_range param.
(SOURCE_LINE): Update for renaming of column_bits.
(SOURCE_COLUMN): Likewise. Shift the column right by the map's
range_bits.
(LAST_SOURCE_LINE_LOCATION): Update for renaming of column_bits.
(linemap_position_for_line_and_column): Add line_maps * params.
(rich_location::rich_location): Likewise.
* lex.c (_cpp_lex_direct): Capture the range of the token, baking
it into token->src_loc via a call to COMBINE_LOCATION_DATA.
* line-map.c (LINE_MAP_MAX_COLUMN_NUMBER): Reduce from 1U << 17 to
1U << 12.
(location_adhoc_data_hash): Add the src_range into
the hash value.
(location_adhoc_data_eq): Require equality of the src_range
values.
(can_be_stored_compactly_p): New function.
(get_combined_adhoc_loc): Add src_range param, and store it,
via a bit-packing scheme for short ranges, otherwise within the
lookaside table. Remove the requirement that data is non-NULL.
(get_range_from_adhoc_loc): New function.
(get_range_from_loc): New function.
(pure_location_p): New function.
(linemap_add): Ensure that start_location has zero for the
range_bits, unless we're past LINE_MAP_MAX_LOCATION_WITH_COLS.
Initialize range_bits to zero. Assert that the start_location
is "pure".
(linemap_line_start): Assert that the
column_and_range_bits >= range_bits.
Update determinination of whether we need to start a new map
using the effective column bits, without the range bits.
Use the set's default_range_bits in new maps, apart from
those with column_bits == 0, which should also have 0 range_bits.
Increase the column bits for new maps by the range bits.
When adding lines to an existing map, use set->highest_line
directly rather than offsetting highest by SOURCE_COLUMN.
Add assertions to sanity-check the return value.
(linemap_position_for_column): Offset to_column by range_bits.
Update set->highest_location if necessary.
(linemap_position_for_line_and_column): Add line_maps * param.
Update the calculation to offset the column by range_bits, and
conditionalize it on being <= LINE_MAP_MAX_LOCATION_WITH_COLS.
Bound it by LINEMAPS_MACRO_LOWEST_LOCATION. Update
set->highest_location if necessary.
(linemap_position_for_loc_and_offset): Handle ad-hoc locations;
pass "set" to linemap_position_for_line_and_column.
(linemap_macro_map_loc_unwind_toward_spelling): Add line_maps
param. Handle ad-hoc locations.
(linemap_location_in_system_header_p): Pass on "set" to call to
linemap_macro_map_loc_unwind_toward_spelling.
(linemap_macro_loc_to_spelling_point): Retain ad-hoc locations.
Pass on "set" to call to
linemap_macro_map_loc_unwind_toward_spelling.
(linemap_resolve_location): Retain ad-hoc locations. Pass on
"set" to call to linemap_macro_map_loc_unwind_toward_spelling.
(linemap_unwind_toward_expansion): Pass on "set" to call to
linemap_macro_map_loc_unwind_toward_spelling.
(linemap_expand_location): Extract the data pointer before
extracting the location.
(rich_location::rich_location): Add line_maps param; use it to
extract the range from the source_location.
* location-example.txt: Regenerate, showing new representation.
From-SVN: r230331
PR preprocessor/61977
* lex.c (cpp_peek_token): If peektok is CPP_EOF, back it up
with all tokens peeked by the current function.
* gcc.dg/cpp/pr61977.c: New test.
From-SVN: r221882
libcpp/
2015-03-16 Edward Smith-Rowland <3dw4rd@verizon.net>
PR c++/64626
* lex.c (lex_number): If a number ends with digit-seps (') skip back
and let lex_string take them.
gcc/testsuite/
2015-03-16 Edward Smith-Rowland <3dw4rd@verizon.net>
PR c++/64626
g++.dg/cpp1y/pr64626-1.C: New.
g++.dg/cpp1y/pr64626-2.C: New.
g++.dg/cpp1y/digit-sep-neg.C: Adjust errors and warnings.
From-SVN: r221470
Fix PR65261
Running bootstrap-ubsan on ppc64le shows many instances of:
libcpp/lex.c:552:30: runtime error: load of misaligned address
0x01001f31d37a for type 'const uchar', which requires 16 byte alignment
But the unaligned vector loads are intended in this case, because they
are preferable to forced-alignment on POWER8. So just silence the ubsan
errors.
2015-03-02 Markus Trippelsdorf <markus@trippelsdorf.de>
include/
PR target/65261
* ansidecl.h (ATTRIBUTE_NO_SANITIZE_UNDEFINED): New macro.
libcpp/
PR target/65261
* lex.c (search_line_fast): Silence ubsan errors.
From-SVN: r221190
This patch makes cpplib track the original spellings of extended
identifiers, as well as the canonical UTF-8 version, in order to
follow standard semantics properly without needing a convoluted and
undocumented canonicalization in translation phase 1 (see bug 9449
comments 39-46 regarding such a canonicalization).
The spelling is tracked in cpp_identifier and cpp_macro_arg without
making cpp_token any larger. The original spelling is used for checks
of duplicate macro definitions, stringizing (see the C++ tests added;
this case is only an issue for C++ not C because C makes it
implementation-defined whether a \ is inserted before the \ of a UCN
in a string or character constant when stringizing, while C++ does
not), pasting (relevant when the result is then stringized for C++)
and when macro definitions are output as text (e.g. for -d options).
Once a macro has been defined, only the original spelling of the
argument names needs keeping in the argument list. While it is being
defined, however, both spellings are needed: the original one for
subsequent saving for checks of duplicate macro definitions, and the
canonical one which is the node marked specially to generate macro
argument tokens rather than normal identifier tokens. The buffer that
is used to save the original values of the identifier tokens is
changed so that it stores both those original values and a pointer to
the canonical hash nodes, so that those canonical nodes can be found
when their values need restoring after the macro definition has been
parsed.
I believe this covers the known standards issues in extended
identifiers support (the remaining unimplemented C99 areas in GCC all
being floating-point-related), except for C++ translation of extended
characters to UCNs in phase 1 (which I have no plans to work on).
There are however probably issues left with handling of extended
identifiers in other places, as listed in
<https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00337.html> (those
issues are generally the sort of thing that could be addressed as bugs
outside development stage 1). (The bulk of the potential issues Zack
was concerned about in 2003-5, that resulted in extended identifiers
being disabled in the absence of -fextended-identifiers, were
effectively eliminated by the audit and fixes I did in 2009, however;
that todo list reflects what was left over after that audit.)
Bootstrapped with no regressions on x86_64-unknown-linux-gnu.
libcpp:
* include/cpp-id-data.h (struct cpp_macro): Update comment
regarding parameters.
* include/cpplib.h (struct cpp_macro_arg, struct cpp_identifier):
Add spelling fields.
(struct cpp_token): Update comment on macro_arg.
* internal.h (_cpp_save_parameter): Add extra argument.
(_cpp_spell_ident_ucns): New declaration.
* lex.c (lex_identifier): Add SPELLING argument. Set *SPELLING to
original spelling of identifier.
(_cpp_lex_direct): Update calls to lex_identifier.
(_cpp_spell_ident_ucns): New function, factored out of
cpp_spell_token.
(cpp_spell_token): Adjust FORSTRING argument semantics to return
original spelling of identifiers. Use _cpp_spell_ident_ucns in
!FORSTRING case.
(_cpp_equiv_tokens): Check spellings of identifiers and macro
arguments are identical.
* macro.c (macro_arg_saved_data): New structure.
(paste_tokens): Use original spellings of identifiers from
cpp_spell_token.
(_cpp_save_parameter): Add argument SPELLING. Save both canonical
node and its value.
(parse_params): Update calls to _cpp_save_parameter.
(lex_expansion_token): Save spelling of macro argument tokens.
(_cpp_create_definition): Extract canonical node from saved data.
(cpp_macro_definition): Use UCNs in spelling of macro name. Use
original spellings of macro argument tokens and identifiers.
* traditional.c (scan_parameters): Update call to
_cpp_save_parameter.
gcc:
* doc/invoke.texi (-std=c99, -std=c11): Don't refer to corner
cases of extended identifiers.
gcc/testsuite:
* g++.dg/cpp/ucnid-2.C, g++.dg/cpp/ucnid-3.C,
gcc.dg/cpp/ucnid-11.c, gcc.dg/cpp/ucnid-12.c,
gcc.dg/cpp/ucnid-13.c, gcc.dg/cpp/ucnid-14.c,
gcc.dg/cpp/ucnid-15.c: New tests.
From-SVN: r217202
2014-10-03 Bill Schmidt <wschmidt@linux.vnet.ibm.com>
* lex.c (search_line_fast): Add new version to be used for Power8
and later targets when Altivec is enabled. Restrict the existing
Altivec version to big-endian systems so that lvsr is not used on
little endian, where it is deprecated. Remove LE-specific code
from the now-BE-only version.
From-SVN: r215873
2014-07-10 Edward Smith-Rowland <3dw4rd@verizon.net>
Jonathan Wakely <jwakely@redhat.com>
PR CPP/61389
* macro.c (_cpp_arguments_ok, parse_params, create_iso_definition):
Warning messages mention C++11 in c++ mode and C99 in c mode.
* lex.c (lex_identifier_intern, lex_identifier): Ditto
Co-Authored-By: Jonathan Wakely <jwakely@redhat.com>
From-SVN: r212441
libcpp/
2014-07-09 Edward Smith-Rowland <3dw4rd@verizon.net>
PR c++/58155 - -Wliteral-suffix warns about tokens which are skipped
by preprocessor
* lex.c (lex_raw_string ()): Do not warn about invalid suffix
if skipping. (lex_string ()): Ditto.
gcc/testsuite/
2014-07-09 Edward Smith-Rowland <3dw4rd@verizon.net>
PR c++/58155 - -Wliteral-suffix warns about tokens which are skipped
g++.dg/cpp0x/pr58155.C: New.
From-SVN: r212392
gcc/testsuite:
* c-c++-common/cpp/ucnid-2011-1.c: New test.
libcpp:
* ucnid.tab: Add C11 and C11NOSTART data.
* makeucnid.c (digit): Rename enum value to N99.
(C11, N11, all_languages): New enum values.
(NUM_CODE_POINTS, MAX_CODE_POINT): New macros.
(flags, decomp, combining_value): Use NUM_CODE_POINTS as array
size.
(decomp): Use unsigned int as element type.
(all_decomp): New array.
(read_ucnid): Handle C11 and C11NOSTART. Use MAX_CODE_POINT.
(read_table): Use MAX_CODE_POINT. Store all decompositions in
all_decomp.
(read_derived): Use MAX_CODE_POINT.
(write_table): Use NUM_CODE_POINTS. Print N99, C11 and N11
flags. Print whole array variable declaration rather than just
array contents.
(char_id_valid, write_context_switch): New functions.
(main): Call write_context_switch.
* ucnid.h: Regenerate.
* include/cpplib.h (struct cpp_options): Add c11_identifiers.
* init.c (struct lang_flags): Add c11_identifiers.
(cpp_set_lang): Set c11_identifiers option from selected language.
* internal.h (struct normalize_state): Document "previous" as
previous starter character.
(NORMALIZE_STATE_UPDATE_IDNUM): Take character as argument.
* charset.c (DIG): Rename enum value to N99.
(C11, N11): New enum values.
(struct ucnrange): Give name to struct. Use short for flags and
unsigned int for end of range. Include ucnid.h for whole variable
declaration.
(ucn_valid_in_identifier): Allow for characters up to 0x10FFFF.
Allow for C11 in determining valid characters and valid start
characters. Use check_nfc for non-Hangul context-dependent
checks. Only store starter characters in nst->previous.
(_cpp_valid_ucn): Pass new argument to
NORMALIZE_STATE_UPDATE_IDNUM.
* lex.c (lex_identifier): Pass new argument to
NORMALIZE_STATE_UPDATE_IDNUM. Call NORMALIZE_STATE_UPDATE_IDNUM
after initial non-UCN part of identifier.
(lex_number): Pass new argument to NORMALIZE_STATE_UPDATE_IDNUM.
From-SVN: r204886
PR preprocessor/57620
* lex.c (lex_raw_string): Undo phase1 and phase2 transformations
between R" and final " rather than only in between R"del( and )del".
* c-c++-common/raw-string-2.c (s12, u12, U12, L12): Remove.
(main): Don't test {s,u,U,L}12.
* c-c++-common/raw-string-13.c: New test.
* c-c++-common/raw-string-14.c: New test.
* c-c++-common/raw-string-15.c: New test.
* c-c++-common/raw-string-16.c: New test.
From-SVN: r201091
PR preprocessor/57824
* lex.c (lex_raw_string): Allow reading new-lines if
in_deferred_pragma or if parsing_args and there is still
data in the current buffer.
* c-c++-common/raw-string-17.c: New test.
* c-c++-common/gomp/pr57824.c: New test.
From-SVN: r200879
* c-ppoutput.c (scan_translation_unit): Call account_for_newlines
for all CPP_TOKEN_FLD_STR tokens, not just CPP_COMMENT.
* include/cpplib.h (cpp_token_val_index): Change parameter type to
const cpp_token *.
* lex.c (cpp_token_val_index): Likewise.
* c-c++-common/raw-string-18.c: New test.
* c-c++-common/raw-string-19.c: New test.
From-SVN: r200878
PR preprocessor/57757
* lex.c (cpp_avoid_paste): Avoid pasting CPP_{,W,UTF8}STRING
or CPP_STRING{16,32} with CPP_NAME or SPELL_LITERAL token that
starts if a-zA-Z_.
* g++.dg/cpp/paste1.C: New test.
* g++.dg/cpp/paste2.C: New test.
From-SVN: r200875
/libcpp
2013-04-24 Paolo Carlini <paolo.carlini@oracle.com>
* include/cpplib.h (enum c_lang): Add CLK_GNUCXX1Y and CLK_CXX1Y.
* init.c (lang_defaults): Add defaults for the latter.
(cpp_init_builtins): Define __cplusplus as 201300L for the latter.
* lex.c (_cpp_lex_direct): Update.
/gcc/c-family
2013-04-24 Paolo Carlini <paolo.carlini@oracle.com>
* c-opts.c (set_std_cxx11): Use CLK_CXX1Y and CLK_GNUCXX1Y.
/gcc/testsuite
2013-04-24 Paolo Carlini <paolo.carlini@oracle.com>
* g++.dg/cpp1y/cplusplus.C: New.
From-SVN: r198261
* configure.ac: Don't define ENABLE_CHECKING whenever
--enable-checking is seen, instead use similar --enable-checking=yes
vs. --enable-checking=release default as gcc/ subdir has and
define ENABLE_CHECKING if ENABLE_CHECKING is defined in gcc/.
Define ENABLE_VALGRIND_CHECKING if requested.
* lex.c (new_buff): If ENABLE_VALGRIND_CHECKING, put _cpp_buff
struct first in the allocated buffer and result->base after it.
(_cpp_free_buff): If ENABLE_VALGRIND_CHECKING, free buff itself
instead of buff->base.
* config.in: Regenerated.
* configure: Regenerated.
From-SVN: r196333
I noticed that the file lex.c had C++ style comments, which I believe
is against the coding standards of the project.
Fixed, tested and applied to master as per the obvious rule.
libcpp/
* lex.c (lex_raw_string): Change C++ style comments into C
style comments.
(lex_string): Likewise.
From-SVN: r186946
This option, which is enabled by default, causes the preprocessor to warn
when a string or character literal is followed by a ud-suffix which does
not begin with an underscore. According to [lex.ext]p10, this is
ill-formed.
Also modifies the preprocessor to treat such ill-formed suffixes as separate
preprocessing tokens. This is consistent with the Clang front end (see
http://llvm.org/viewvc/llvm-project?view=rev&revision=152287), and enables
backwards compatibility with code that uses formatting macros from
<inttypes.h>, as in the following code block:
int main() {
int64_t i64 = 123;
printf("My int64: %"PRId64"\n", i64);
}
Google ref b/6377711.
2012-04-27 Ollie Wild <aaw@google.com>
PR c++/52538
* gcc/c-family/c-common.c: Add CPP_W_LITERAL_SUFFIX mapping.
* gcc/c-family/c-opts.c (c_common_handle_option): Handle
OPT_Wliteral_suffix.
* gcc/c-family/c.opt: Add Wliteral-suffix.
* gcc/doc/invoke.texi (Wliteral-suffix): Document new option.
* gcc/testsuite/g++.dg/cpp0x/Wliteral-suffix.c: New test.
* libcpp/include/cpplib.h (struct cpp_options): Add new field,
warn_literal_suffix.
(CPP_W_LITERAL_SUFFIX): New enum.
* libcpp/init.c (cpp_create_reader): Default initialization of
warn_literal_suffix.
* libcpp/lex.c (lex_raw_string): Treat user-defined literals which
don't begin with '_' as separate tokens and produce a warning.
(lex_string): Ditto.
From-SVN: r186909