In this problem report, the compiler is fed a (bogus) translation unit
in which some literals contain bytes whose value is zero. The
preprocessor detects that and proceeds to emit diagnostics for that
king of bogus literals. But then when the diagnostics machinery
re-reads the input file again to display the bogus literals with a
caret, it attempts to calculate the length of each of the lines it got
using fgets. The line length calculation is done using strlen. But
that doesn't work well when the content of the line can have several
zero bytes. The result is that the read_line never sees the end of
the line because strlen repeatedly reports that the line ends before
the end-of-line character; so read_line thinks its buffer for reading
the line is too small; it thus increases the buffer, leading to a huge
memory consumption and disaster.
Here is what this patch does.
location_get_source_line is modified to return the length of a source
line that can now contain bytes with zero value.
diagnostic_show_locus() is then modified to consider that a line can
have characters of value zero, and so just shows a white space when
instructed to display one of these characters.
Additionally location_get_source_line is modified to avoid re-reading
each and every line from the beginning of the file until it reaches
the line number N that it is instructed to get; this was leading to
annoying quadratic behaviour when reading adjacent lines near the end
of (big) files. So a cache is now associated to the file opened in
text mode. When the content of the file is read, that content is
stashed in the file cache. That file cache is searched for line
delimiters. A number of line positions are saved in the cache and a
number of file caches are kept in memory. That way when
location_get_source_line is asked to read line N + 1, it just has to
start reading from line N that it has already read.
libcpp/ChangeLog:
* include/line-map.h (linemap_get_file_highest_location): Declare
new function.
* line-map.c (linemap_get_file_highest_location): Define it.
gcc/ChangeLog:
* input.h (location_get_source_line): Take an additional line_size
parameter.
(void diagnostics_file_cache_fini): Declare new function.
* input.c (struct fcache): New type.
(fcache_tab_size, fcache_buffer_size, fcache_line_record_size):
New static constants.
(diagnostic_file_cache_init, total_lines_num)
(lookup_file_in_cache_tab, evicted_cache_tab_entry)
(add_file_to_cache_tab, lookup_or_add_file_to_cache_tab)
(needs_read, needs_grow, maybe_grow, read_data, maybe_read_data)
(get_next_line, read_next_line, goto_next_line, read_line_num):
New static function definitions.
(diagnostic_file_cache_fini): New function.
(location_get_source_line): Take an additional output line_len
parameter. Re-write using lookup_or_add_file_to_cache_tab and
read_line_num.
* diagnostic.c (diagnostic_finish): Call
diagnostic_file_cache_fini.
(adjust_line): Take an additional input parameter for the length
of the line, rather than calculating it with strlen.
(diagnostic_show_locus): Adjust the use of
location_get_source_line and adjust_line with respect to their new
signature. While displaying a line now, do not stop at the first
null byte. Rather, display the zero byte as a space and keep
going until we reach the size of the line.
* Makefile.in: Add vec.o to OBJS-libcommon
gcc/testsuite/ChangeLog:
* c-c++-common/cpp/warning-zero-in-literals-1.c: New test file.
Signed-off-by: Dodji Seketeli <dodji@seketeli.org>
From-SVN: r206957
gcc:
2012-07-27 Laurynas Biveinis <laurynas.biveinis@gmail.com>
Steven Bosscher <steven@gcc.gnu.org>
* gengtype.c (adjust_field_type): Diagnose duplicate "length"
option applications and option being applied to arrays of atomic
types.
(walk_type): Allow "atomic" option on strings too.
* dwarf2out.h (struct dw_vec_struct): Use the "atomic" GTY option
for the array field.
* vec.h: Describe the atomic object "A" type of the macros in
the header comment.
(VEC_T_GTY_ATOMIC, DEF_VEC_A, DEF_VEC_ALLOC_A): Define.
* emit-rtl.c (locations_locators_vals): use the atomic object
vector.
* doc/gty.texi: Clarify that GTY option "length" is only for
arrays of non-atomic objects. Fix typo in the description of the
"atomic" option.
gcc/java:
2012-07-24 Laurynas Biveinis <laurynas.biveinis@gmail.com>
* jcf.h (CPool): Use the "atomic" GTY option for the tags field.
(bootstrap_method): Likewise for the bootstrap_arguments field.
libcpp:
2012-07-24 Laurynas Biveinis <laurynas.biveinis@gmail.com>
* include/line-map.h (line_map_macro): Use the "atomic" GTY option
for the macro_locations field.
Co-Authored-By: Steven Bosscher <steven@gcc.gnu.org>
From-SVN: r189951
Now that diagnostics for tokens coming from macro expansions point to
the spelling location of the relevant token (and then displays the
context of the expansion), some ugly (not so seldom) corner cases can
happen.
When the relevant token is a built-in token (which means the location
of that token is BUILTINS_LOCATION) the location prefix displayed to
the user in the diagnostic line is the "<built-in>:0:0" string. For
instance:
<built-in>:0:0: warning: conversion to 'float' alters 'int' constant value
For the user, I think this is surprising and useless.
A more user-friendly approach would be to refer to the first location
that (in the reported macro expansion context) is for a location in
real source code, like what is shown in the new test case
gcc/testsuite/g++.dg/warn/Wconversion-real-integer2.C accompanying
this patch.
To do this, I am making the line-map module provide a new
linemap_unwind_to_first_non_reserved_loc function that resolves a
virtual location to the first spelling location that is in real source
code.
I am then using that facility in the diagnostics printing module and
in the macro unwinder to avoid printing diagnostics lines that refer
to the locations for built-ins or more generally for reserved
locations. Note that when I start the dance of skipping a built-in
location I also skip locations that are in system headers, because it
turned out that a lot of those built-ins are actually used in system
headers (e.g, "#define INT_MAX __INT_MAX__" where __INT_MAX__ is a
built-in).
Besides the user-friendliness gain, this patch allows a number of
regression tests to PASS unchanged with and without
-ftrack-macro-expansion.
Tested and bootstrapped on x86_64-unknown-linux-gnu against trunk.
Note that the bootstrap with -ftrack-macro-expansion exhibits other
separate issues that are addressed in subsequent patches. This patch
just fixes one class of problems.
The patch does pass bootstrap with -ftrack-macro-expansion turned off,
though.
libcpp/
* include/line-map.h (linemap_unwind_toward_expansion): Fix typo
in comment.
(linemap_unwind_to_first_non_reserved_loc): Declare new function.
* line-map.c (linemap_unwind_to_first_non_reserved_loc): Define
new function.
gcc/
* input.c (expand_location_1): When expanding to spelling location
in a context of a macro expansion, skip reserved system header
locations. Update comments. * tree-diagnostic.c
(maybe_unwind_expanded_macro_loc): Likewise.
gcc/testsuite/
* g++.dg/warn/Wconversion-real-integer2.C: New test.
* g++.dg/warn/Wconversion-real-integer-3.C: Likewise.
* g++.dg/warn/conversion-real-integer-3.h: New header used by the
new test above.
From-SVN: r186970
libcpp/
* include/line-map.h (linemap_expand_location): Take a line table
parameter. Update comment.
(linemap_resolve_location): Update comment.
(linemap_expand_location_full): Remove.
* line-map.c (linemap_resolve_location): Handle reserved
locations; return a NULL map in those cases.
(linemap_expand_location): If location is reserved, return a
zeroed expanded location. Update comment. Take a line table to
assert that the function takes non-virtual locations only.
(linemap_expand_location_full): remove.
(linemap_dump_location): Handle the fact that
linemap_resolve_location can return NULL line maps when the
location resolves to a reserved location.
gcc/
* input.c (expand_location): Rewrite using
linemap_resolve_location and linemap_expand_location. Add a
comment.
From-SVN: r180426
libcpp/
* include/line-map.h (struct linemap_stats): Change the type of
the members from size_t to long.
* macro.c (macro_arg_token_iter_init): Unconditionally initialize
iter->location_ptr.
gcc/c-family/
* c-lex.c (fe_file_change): Use LINEMAP_SYSP when
!NO_IMPLICIT_EXTERN_C.
gcc/
* input.c (dump_line_table_statistics): Use long, not size_t.
From-SVN: r180124
This patch basically arranges for the allocation size of line_map
buffers to be as close as possible to a power of two. This
*significantly* decreases peak memory consumption as (macro) maps are
numerous and stay live during all the compilation.
The patch adds a new ggc_round_alloc_size interface to the ggc
allocator. In each of the two main allocator implementations ('page'
and 'zone') the function has been extracted from the main allocation
function code and returns the actual size of the allocated memory
region, thus giving a chance to the caller to maximize the amount of
memory it actually uses from the allocated memory region. In the
'none' allocator implementation (that uses xmalloc) the
ggc_round_alloc_size just returns the requested allocation size.
Co-Authored-By: Dodji Seketeli <dodji@redhat.com>
From-SVN: r180086
This patch adds statistics about line maps' memory consumption and
macro expansion to the output of -fmem-report. It has been useful in
trying to reduce the memory consumption of the macro maps support.
Co-Authored-By: Dodji Seketeli <dodji@redhat.com>
From-SVN: r180085
This patch adds -fdebug-cpp option. When used with -E this dumps the
relevant macro map before every single token. This clutters the output
a lot but has proved to be invaluable in tracking some bugs during the
development of the virtual location support.
Co-Authored-By: Dodji Seketeli <dodji@redhat.com>
From-SVN: r180084
This second instalment uses the infrastructure of the previous patch
to allocate a macro map for each macro expansion and assign a virtual
location to each token resulting from the expansion.
To date when cpp_get_token comes across a token that happens to be a
macro, the macro expander kicks in, expands the macro, pushes the
resulting tokens onto a "token context" and returns a dummy padding
token. The next call to cpp_get_token goes look into the token context
for the next token [which is going to result from the previous macro
expansion] and returns it. If the token is a macro, the macro expander
kicks in and you know the story.
This patch piggy-backs on that macro expansion process, so to speak.
First it modifies the macro expander to make it create a macro map for
each macro expansion. It then allocates a virtual location for each
resulting token. Virtual locations of tokens resulting from macro
expansions are then stored on a special kind of context called an
"expanded tokens context". In other words, in an expanded tokens
context, there are tokens resulting from macro expansion and their
associated virtual locations. cpp_get_token_with_location is modified
to return the virtual location of tokens resulting from macro
expansion. Note that once all tokens from an expanded token context have
been consumed and the context and is freed, the memory used to store the
virtual locations of the tokens held in that context is freed as well.
This helps reducing the overall peak memory consumption.
The client code that was getting macro expansion point location from
cpp_get_token_with_location now gets virtual location from it. Those
virtual locations can in turn be resolved into the different
interesting physical locations thanks to the linemap API exposed by
the previous patch.
Expensive progress. Possibly. So this whole virtual location
allocation business is switched off by default. So by default no
extended token is created. No extended token context is created
either. One has to use -ftrack-macro-expansion to switch this on. This
complicates the code but I believe it can be useful as some of our
friends found out at http://llvm.org/bugs/show_bug.cgi?id=5610
The patch tries to reduce the memory consumption by freeing some token
context memory that was being reused before. I didn't notice any
compilation slow down due to this immediate freeing on my GNU/Linux
system.
As no client code tries to resolve virtual locations to anything but
what was being done before, no new test case has been added.
Co-Authored-By: Dodji Seketeli <dodji@redhat.com>
From-SVN: r180082
This is the first instalment of a set which goal is to track locations
of tokens across macro expansions. Tom Tromey did the original work
and attached the patch to PR preprocessor/7263. This opus is a
derivative of that original work.
This patch modifies the linemap module of libcpp to add virtual
locations support.
A virtual location is a mapped location that can resolve to several
different physical locations. It can always resolve to the spelling
location of a token. For tokens resulting from macro expansion it can
resolve to:
- either the location of the expansion point of the macro.
- or the location of the token in the definition of the
macro
- or, if the token is an argument of a function-like macro,
the location of the use of the matching macro parameter in
the definition of the macro
The patch creates a new type of line map called a macro map. For every
single macro expansion, there is a macro map that generates a virtual
location for every single resulting token of the expansion.
The good old type of line map we all know is now called an ordinary
map. That one still encodes spelling locations as it has always had.
As a result linemap_lookup as been extended to return a macro map when
given a virtual location resulting from a macro expansion. The layout
of structs line_map has changed to support this new type of map. So
did the layout of struct line_maps. Accessor macros have been
introduced to avoid messing with the implementation details of these
datastructures directly. This helped already as we have been testing
different ways of arranging these datastructure. Having to constantly
adjust client code that is too tied with the internals of line_map and
line_maps would have been even more painful.
Of course, many new public functions have been added to the linemap
module to handle the resolution of virtual locations.
This patch introduces the infrastructure but no part of the compiler
uses virtual locations yet.
However the client code of the linemap data structures has been
adjusted as per the changes. E.g, it's not anymore reliable for a
client code to manipulate struct line_map directly if it just wants to
deal with spelling locations, because struct line_map can now
represent a macro map as well. In that case, it's better to use the
convenient API to resolve the initial (possibly virtual) location to a
spelling location (or to an ordinary map) and use that.
This is the reason why the patch adjusts the Java, Ada and Fortran
front ends.
Also, note that virtual locations are not supposed to be ordered for
relations '<' and '>' anymore. To test if a virtual location appears
"before" another one, one has to use a new operator exposed by the
line map interface. The patch updates the only spot (in the
diagnostics module) I have found that was making the assumption that
locations were ordered for these relations. This is the only change
that introduces a use of the new line map API in this patch, so I am
adding a regression test for it only.
From-SVN: r180081
LINEMAP_POSITION_FOR_COLUMN had the exact same effect as
linemap_position_for_column, removed it and updated users
to use linemap_position_for_column instead
libcpp/ChangeLog
* include/line-map.h (LINEMAP_POSITION_FOR_COLUMN): Remove.
Update all users to use linemap_position_for_column instead.
gcc/go/ChangeLog
* gofrontend/lex.cc (Lex::location): Update to use
linemap_position_for_column instead.
(Lex::earlier_location): Likewise.
From-SVN: r177768
2009-07-17 Jerry Quinn <jlquinn@optonline.net>
* directives.c (do_linemarker, do_line): Use CPP_STRING for
ignored enum value.
* files.c (find_file_in_dir): Add cast from void* to char*.
* symtab.c (ht_lookup_with_hash): Add cast from void* to char*.
* Makefile.in: (WARN_CFLAGS): Use general and C-specific
warnings.
(CXX, CXXFLAGS, WARN_CXXFLAGS, ALL_CXXFLAGS,
ENABLE_BUILD_WITH_CXX, CCDEPMODE, CXXDEPMODE, COMPILER,
COMPILER_FLAGS): New.
(DEPMODE): Set from CCDEPMODE or CXXDEPMODE.
(COMPILE.base): Use COMPILER instead of CC. Use COMPILER_FLAGS
instead of ALL_CFLAGS.
* configure.ac: Invoke AC_PROG_CXX. Separate C-specific warnings
from other warnings. Add -Wc++-compat to C-specific warnings.
Check for --enable-build-with-cxx. Set and substitute
ENABLE_BUILD_WITH_CXX. Invoke ZW_PROG_COMPILER_DEPENDENCIES
according to ENABLE_BUILD_WITH_CXX. Invoke AC_LANG before
AC_CHECK_HEADERS.
* configure: Rebuild.
* include/cpp-id-data.h: Remove extern "C".
* include/line-map.h: Likewise.
* include/mkdeps.h: Likewise.
* include/symtab.h: Likewise.
* internal.h: Likewise.
From-SVN: r149763
2008-07-21 Manuel Lopez-Ibanez <manu@gcc.gnu.org>
* include/line-map.h (linenum_type): New typedef.
(struct line_map): Use it.
(SOURCE_LINE): Second arguments is a LOCATION not a LINE.
(SOURCE_COLUMN): Likewise.
* macro.c (_cpp_builtin_macro_text): Use linenum_type. Don't store
source_location values in a variable of type linenum_type.
* directives.c (struct if_stack): Use linenum_type.
(strtoul_for_line): Rename as strtolinenum.
(do_line): Use linenum_type.
(do_linemarker): Use linenum_type and strtolinenum.
(_cpp_do_file_change): Use linenum_t.
* line-map.c (linemap_add): Likewise.
(linemap_line_start): Likewise.
* traditional.c (struct fun_macro): 'line' is a source_location.
* errors.c (print_location): Use linenum_type.
* directives-only.c (_cpp_preprocess_dir_only): Likewise.
* internal.h (CPP_INCREMENT_LINE): Likewise.
* lex.c (_cpp_skip_block_comment): Use source_location.
From-SVN: r138026
ChangeLog:
2004-05-23 Paolo Bonzini <bonzini@gnu.org>
* Makefile.def (host_modules): add libcpp.
* Makefile.tpl: Add dependencies on and for libcpp.
* Makefile.in: Regenerate.
* configure.in: Add libcpp host module.
* configure: Regenerate.
config/ChangeLog:
2004-05-23 Paolo Bonzini <bonzini@gnu.org>
* acx.m4 (ACX_HEADER_STDBOOL, ACX_HEADER_STRING):
From gcc.
gcc/ChangeLog:
2004-05-23 Paolo Bonzini <bonzini@gnu.org>
Move libcpp to the toplevel.
* Makefile.in: Remove references to libcpp files,
use CPPLIBS instead of libcpp.a. Define SYMTAB_H
and change hashtable.h to that.
* aclocal.m4 (gcc_AC_HEADER_STDBOOL,
gcc_AC_HEADER_STRING, gcc_AC_C__BOOL): Remove.
* configure.ac (gcc_AC_C__BOOL, HAVE_UCHAR): Remove tests.
* configure: Regenerate.
* config.in: Regenerate.
* c-ppoutput.c: Include ../libcpp/internal.h instead of cpphash.h.
* cppcharset.c: Removed.
* cpperror.c: Removed.
* cppexp.c: Removed.
* cppfiles.c: Removed.
* cpphash.c: Removed.
* cpphash.h: Removed.
* cppinit.c: Removed.
* cpplex.c: Removed.
* cpplib.c: Removed.
* cpplib.h: Removed.
* cppmacro.c: Removed.
* cpppch.c: Removed.
* cpptrad.c: Removed.
* cppucnid.h: Removed.
* cppucnid.pl: Removed.
* cppucnid.tab: Removed.
* hashtable.c: Removed.
* hashtable.h: Removed.
* line-map.c: Removed.
* line-map.h: Removed.
* mkdeps.c: Removed.
* mkdeps.h: Removed.
* stringpool.h: Include symtab.h instead of hashtable.h.
* tree.h: Include symtab.h instead of hashtable.h.
* system.h (O_NONBLOCK, O_NOCTTY): Do not define.
gcc/cp/ChangeLog:
2004-05-23 Paolo Bonzini <bonzini@gnu.org>
* Make-lang.in: No need to specify $(LIBCPP).
gcc/java/ChangeLog:
2004-05-23 Paolo Bonzini <bonzini@gnu.org>
* Make-lang.in: Link in $(LIBCPP) instead of mkdeps.o.
libcpp/ChangeLog:
2004-05-23 Paolo Bonzini <bonzini@gnu.org>
Moved libcpp from the gcc subdirectory to the toplevel.
* Makefile.am: New file.
* Makefile.in: Regenerate.
* configure.ac: New file.
* configure: Regenerate.
* config.in: Regenerate.
* charset.c: Moved from gcc/cppcharset.c. Add note about
brokenness of input charset detection. Adjust for change
in name of cppucnid.h.
* errors.c: Moved from gcc/cpperror.c. Do not include intl.h.
* expr.c: Moved from gcc/cppexp.c.
* files.c: Moved from gcc/cppfiles.c. Do not include intl.h.
Remove #define of O_BINARY, it is in system.h.
* identifiers.c: Moved from gcc/cpphash.c.
* internal.h: Moved from gcc/cpphash.h. Change header
guard name. All other files adjusted to match name change.
* init.c: Moved from gcc/cppinit.c.
(init_library) [ENABLE_NLS]: Call bindtextdomain.
* lex.c: Moved from gcc/cpplex.c.
* directives.c: Moved from gcc/cpplib.c.
* macro.c: Moved from gcc/cppmacro.c.
* pch.c: Moved from gcc/cpppch.c. Do not include intl.h.
* traditional.c: Moved from gcc/cpptrad.c.
* ucnid.h: Moved from gcc/cppucnid.h. Change header
guard name.
* ucnid.pl: Moved from gcc/cppucnid.pl.
* ucnid.tab: Moved from gcc/cppucnid.tab. Change header
guard name.
* symtab.c: Moved from gcc/hashtable.c.
* line-map.c: Moved from gcc. Do not include intl.h.
* mkdeps.c: Moved from gcc.
* system.h: New file.
libcpp/include/ChangeLog:
2004-05-23 Paolo Bonzini <bonzini@gnu.org>
* cpplib.h: Moved from gcc. Change header guard name.
* line-map.h: Moved from gcc. Change header guard name.
* mkdeps.h: Moved from gcc. Change header guard name.
* symtab.h: Moved from gcc/hashtable.h. Change header
guard name.
libcpp/po/ChangeLog:
2004-05-23 Paolo Bonzini <bonzini@gnu.org>
* be.po: Extracted from gcc/po/be.po.
* ca.po: Extracted from gcc/po/ca.po.
* da.po: Extracted from gcc/po/da.po.
* de.po: Extracted from gcc/po/de.po.
* el.po: Extracted from gcc/po/el.po.
* es.po: Extracted from gcc/po/es.po.
* fr.po: Extracted from gcc/po/fr.po.
* ja.po: Extracted from gcc/po/ja.po.
* nl.po: Extracted from gcc/po/nl.po.
* sv.po: Extracted from gcc/po/sv.po.
* tr.po: Extracted from gcc/po/tr.po.
From-SVN: r82199