binutils-gdb/gdb/filename-seen-cache.h

/* Filename-seen cache for the GNU debugger, GDB.

   Copyright (C) 1986-2019 Free Software Foundation, Inc.

   This file is part of GDB.

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation; either version 3 of the License, or
   (at your option) any later version.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */

#ifndef FILENAME_SEEN_CACHE_H
#define FILENAME_SEEN_CACHE_H

#include "defs.h"
#include "common/function-view.h"

/* Cache to watch for file names already seen.  */

class filename_seen_cache
{
public:
  filename_seen_cache ();
  ~filename_seen_cache ();

  DISABLE_COPY_AND_ASSIGN (filename_seen_cache);

  /* Empty the cache, but do not delete it.  */
  void clear ();

  /* If FILE is not already in the table of files in CACHE, add it and
     return false; otherwise return true.

     NOTE: We don't manage space for FILE, we assume FILE lives as
     long as the caller needs.  */
  bool seen (const char *file);

  /* Traverse all cache entries, calling CALLBACK on each.  The
     filename is passed as argument to CALLBACK.  */
  void traverse (gdb::function_view<void (const char *filename)> callback)
  {
    auto erased_cb = [] (void **slot, void *info) -> int
      {
	auto filename = (const char *) *slot;
	auto restored_cb = (decltype (callback) *) info;
	(*restored_cb) (filename);
	return 1;
      };

    htab_traverse_noresize (m_tab, erased_cb, &callback);
  }

private:
  /* Table of files seen so far.  */
  htab_t m_tab;
};

#endif /* FILENAME_SEEN_CACHE_H */
Fix TAB-completion + .gdb_index slowness (generalize filename_seen_cache) Tab completion when debugging a program binary that uses GDB index is surprisingly much slower than when GDB uses psymtabs instead. Around 1.5x/3x slower. That's surprising, because the whole point of GDB index is to speed things up... For example, with: set pagination off set $count = 0 while $count < 400 complete b string_prin # matches gdb's string_printf printf "count = %d\n", $count set $count = $count + 1 end $ time ./gdb --batch -q ./gdb-with-index -ex "source script.cmd" real 0m11.042s user 0m10.920s sys 0m0.042s $ time ./gdb --batch -q ./gdb-without-index -ex "source script.cmd" real 0m4.635s user 0m4.590s sys 0m0.037s Same but with: - complete b string_prin + complete b zzzzzz to exercise the no-matches worst case, master currently gets you something like: with index without index real 0m11.971s 0m8.413s user 0m11.912s 0m8.355s sys 0m0.035s 0m0.035s Running gdb under perf shows 80% spent inside maybe_add_partial_symtab_filename, and 20% spent in the lbasename inside that. The problem that tab completion walks over all compunit symtabs, and for each, walks the contained file symtabs. And there a huge number of file symtabs (each included system header, etc.) that appear in each compunit symtab's file symtab list. As in, when debugging GDB, I have 367381 symtabs iterated, when of those only 5371 filenames are unique... This was a regression from the earlier (nice) split of symtabs in compunit symtabs + file symtabs. The fix here is to add a cache of unique filenames per objfile so that the walk / uniquing is only done once. There's already a abstraction for this in symtab.c; this patch moves that code out to a separate file and C++ifies it bit. This makes the worst-case scenario above consistently drop to ~2.5s (1.5s for the "string_prin" hit case), making it over 3.3x times faster than psymtabs in this use case (7x in the "string_prin" hit case). gdb/ChangeLog: 2017-07-17 Pedro Alves <palves@redhat.com> * Makefile.in (COMMON_OBS): Add filename-seen-cache.o. * dwarf2read.c: Include "filename-seen-cache.h". * dwarf2read.c (dwarf2_per_objfile) <filenames_cache>: New field. (dw2_map_symbol_filenames): Build and use a filenames_seen_cache. * filename-seen-cache.c: New file. * filename-seen-cache.h: New file. * symtab.c: Include "filename-seen-cache.h". (struct filename_seen_cache, INITIAL_FILENAME_SEEN_CACHE_SIZE) (create_filename_seen_cache, clear_filename_seen_cache) (delete_filename_seen_cache, filename_seen): Delete, parts moved to filename-seen-cache.h/filename-seen-cache.c. (output_source_filename, sources_info) (maybe_add_partial_symtab_filename) (make_source_files_completion_list): Adjust to use filename_seen_cache. 2017-07-17 12:28:33 +02:00			`/* Filename-seen cache for the GNU debugger, GDB.`

Update copyright year range in all GDB files. This commit applies all changes made after running the gdb/copyright.py script. Note that one file was flagged by the script, due to an invalid copyright header (gdb/unittests/basic_string_view/element_access/char/empty.cc). As the file was copied from GCC's libstdc++-v3 testsuite, this commit leaves this file untouched for the time being; a patch to fix the header was sent to gcc-patches first. gdb/ChangeLog: Update copyright year range in all GDB files. 2019-01-01 07:01:51 +01:00			`Copyright (C) 1986-2019 Free Software Foundation, Inc.`
Fix TAB-completion + .gdb_index slowness (generalize filename_seen_cache) Tab completion when debugging a program binary that uses GDB index is surprisingly much slower than when GDB uses psymtabs instead. Around 1.5x/3x slower. That's surprising, because the whole point of GDB index is to speed things up... For example, with: set pagination off set $count = 0 while $count < 400 complete b string_prin # matches gdb's string_printf printf "count = %d\n", $count set $count = $count + 1 end $ time ./gdb --batch -q ./gdb-with-index -ex "source script.cmd" real 0m11.042s user 0m10.920s sys 0m0.042s $ time ./gdb --batch -q ./gdb-without-index -ex "source script.cmd" real 0m4.635s user 0m4.590s sys 0m0.037s Same but with: - complete b string_prin + complete b zzzzzz to exercise the no-matches worst case, master currently gets you something like: with index without index real 0m11.971s 0m8.413s user 0m11.912s 0m8.355s sys 0m0.035s 0m0.035s Running gdb under perf shows 80% spent inside maybe_add_partial_symtab_filename, and 20% spent in the lbasename inside that. The problem that tab completion walks over all compunit symtabs, and for each, walks the contained file symtabs. And there a huge number of file symtabs (each included system header, etc.) that appear in each compunit symtab's file symtab list. As in, when debugging GDB, I have 367381 symtabs iterated, when of those only 5371 filenames are unique... This was a regression from the earlier (nice) split of symtabs in compunit symtabs + file symtabs. The fix here is to add a cache of unique filenames per objfile so that the walk / uniquing is only done once. There's already a abstraction for this in symtab.c; this patch moves that code out to a separate file and C++ifies it bit. This makes the worst-case scenario above consistently drop to ~2.5s (1.5s for the "string_prin" hit case), making it over 3.3x times faster than psymtabs in this use case (7x in the "string_prin" hit case). gdb/ChangeLog: 2017-07-17 Pedro Alves <palves@redhat.com> * Makefile.in (COMMON_OBS): Add filename-seen-cache.o. * dwarf2read.c: Include "filename-seen-cache.h". * dwarf2read.c (dwarf2_per_objfile) <filenames_cache>: New field. (dw2_map_symbol_filenames): Build and use a filenames_seen_cache. * filename-seen-cache.c: New file. * filename-seen-cache.h: New file. * symtab.c: Include "filename-seen-cache.h". (struct filename_seen_cache, INITIAL_FILENAME_SEEN_CACHE_SIZE) (create_filename_seen_cache, clear_filename_seen_cache) (delete_filename_seen_cache, filename_seen): Delete, parts moved to filename-seen-cache.h/filename-seen-cache.c. (output_source_filename, sources_info) (maybe_add_partial_symtab_filename) (make_source_files_completion_list): Adjust to use filename_seen_cache. 2017-07-17 12:28:33 +02:00
			`This file is part of GDB.`

			`This program is free software; you can redistribute it and/or modify`
			`it under the terms of the GNU General Public License as published by`
			`the Free Software Foundation; either version 3 of the License, or`
			`(at your option) any later version.`

			`This program is distributed in the hope that it will be useful,`
			`but WITHOUT ANY WARRANTY; without even the implied warranty of`
			`MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the`
			`GNU General Public License for more details.`

			`You should have received a copy of the GNU General Public License`
			`along with this program. If not, see <http://www.gnu.org/licenses/>. */`

Add include guard to filename-seen-cache.h While moving things around, I stumbled on filename_seen_cache being re-defined, because filename-seen-cache.h doesn't have an include guard. gdb/ChangeLog: * filename-seen-cache.h: Add include guard. 2018-03-26 21:31:10 +02:00			`#ifndef FILENAME_SEEN_CACHE_H`
			`#define FILENAME_SEEN_CACHE_H`

Fix TAB-completion + .gdb_index slowness (generalize filename_seen_cache) Tab completion when debugging a program binary that uses GDB index is surprisingly much slower than when GDB uses psymtabs instead. Around 1.5x/3x slower. That's surprising, because the whole point of GDB index is to speed things up... For example, with: set pagination off set $count = 0 while $count < 400 complete b string_prin # matches gdb's string_printf printf "count = %d\n", $count set $count = $count + 1 end $ time ./gdb --batch -q ./gdb-with-index -ex "source script.cmd" real 0m11.042s user 0m10.920s sys 0m0.042s $ time ./gdb --batch -q ./gdb-without-index -ex "source script.cmd" real 0m4.635s user 0m4.590s sys 0m0.037s Same but with: - complete b string_prin + complete b zzzzzz to exercise the no-matches worst case, master currently gets you something like: with index without index real 0m11.971s 0m8.413s user 0m11.912s 0m8.355s sys 0m0.035s 0m0.035s Running gdb under perf shows 80% spent inside maybe_add_partial_symtab_filename, and 20% spent in the lbasename inside that. The problem that tab completion walks over all compunit symtabs, and for each, walks the contained file symtabs. And there a huge number of file symtabs (each included system header, etc.) that appear in each compunit symtab's file symtab list. As in, when debugging GDB, I have 367381 symtabs iterated, when of those only 5371 filenames are unique... This was a regression from the earlier (nice) split of symtabs in compunit symtabs + file symtabs. The fix here is to add a cache of unique filenames per objfile so that the walk / uniquing is only done once. There's already a abstraction for this in symtab.c; this patch moves that code out to a separate file and C++ifies it bit. This makes the worst-case scenario above consistently drop to ~2.5s (1.5s for the "string_prin" hit case), making it over 3.3x times faster than psymtabs in this use case (7x in the "string_prin" hit case). gdb/ChangeLog: 2017-07-17 Pedro Alves <palves@redhat.com> * Makefile.in (COMMON_OBS): Add filename-seen-cache.o. * dwarf2read.c: Include "filename-seen-cache.h". * dwarf2read.c (dwarf2_per_objfile) <filenames_cache>: New field. (dw2_map_symbol_filenames): Build and use a filenames_seen_cache. * filename-seen-cache.c: New file. * filename-seen-cache.h: New file. * symtab.c: Include "filename-seen-cache.h". (struct filename_seen_cache, INITIAL_FILENAME_SEEN_CACHE_SIZE) (create_filename_seen_cache, clear_filename_seen_cache) (delete_filename_seen_cache, filename_seen): Delete, parts moved to filename-seen-cache.h/filename-seen-cache.c. (output_source_filename, sources_info) (maybe_add_partial_symtab_filename) (make_source_files_completion_list): Adjust to use filename_seen_cache. 2017-07-17 12:28:33 +02:00			`#include "defs.h"`
			`#include "common/function-view.h"`

			`/* Cache to watch for file names already seen. */`

			`class filename_seen_cache`
			`{`
			`public:`
			`filename_seen_cache ();`
			`~filename_seen_cache ();`

Use DISABLE_COPY_AND_ASSIGN We have many classes that copy cotr and assignment operator are deleted, so this patch replaces these existing mechanical code with macro DISABLE_COPY_AND_ASSIGN. gdb: 2017-09-19 Yao Qi <yao.qi@linaro.org> * annotate.h (struct annotate_arg_emitter): Use DISABLE_COPY_AND_ASSIGN. * common/refcounted-object.h (refcounted_object): Likewise. * completer.h (struct completion_result): Likewise. * dwarf2read.c (struct dwarf2_per_objfile): Likewise. * filename-seen-cache.h (filename_seen_cache): Likewise. * gdbcore.h (thread_section_name): Likewise. * gdb_regex.h (compiled_regex): Likewise. * gdbthread.h (scoped_restore_current_thread): Likewise. * inferior.h (scoped_restore_current_inferior): Likewise. * jit.c (jit_reader): Likewise. * linespec.h (struct linespec_result): Likewise. * mi/mi-parse.h (struct mi_parse): Likewise. * nat/fork-inferior.c (execv_argv): Likewise. * progspace.h (scoped_restore_current_program_space): Likewise. * python/python-internal.h (class gdbpy_enter): Likewise. * regcache.h (regcache): Likewise. * target-descriptions.c (struct tdesc_reg): Likewise. (struct tdesc_type): Likewise. (struct tdesc_feature): Likewise. * ui-out.h (ui_out_emit_type): Likewise. 2017-09-19 11:10:03 +02:00			`DISABLE_COPY_AND_ASSIGN (filename_seen_cache);`
Fix TAB-completion + .gdb_index slowness (generalize filename_seen_cache) Tab completion when debugging a program binary that uses GDB index is surprisingly much slower than when GDB uses psymtabs instead. Around 1.5x/3x slower. That's surprising, because the whole point of GDB index is to speed things up... For example, with: set pagination off set $count = 0 while $count < 400 complete b string_prin # matches gdb's string_printf printf "count = %d\n", $count set $count = $count + 1 end $ time ./gdb --batch -q ./gdb-with-index -ex "source script.cmd" real 0m11.042s user 0m10.920s sys 0m0.042s $ time ./gdb --batch -q ./gdb-without-index -ex "source script.cmd" real 0m4.635s user 0m4.590s sys 0m0.037s Same but with: - complete b string_prin + complete b zzzzzz to exercise the no-matches worst case, master currently gets you something like: with index without index real 0m11.971s 0m8.413s user 0m11.912s 0m8.355s sys 0m0.035s 0m0.035s Running gdb under perf shows 80% spent inside maybe_add_partial_symtab_filename, and 20% spent in the lbasename inside that. The problem that tab completion walks over all compunit symtabs, and for each, walks the contained file symtabs. And there a huge number of file symtabs (each included system header, etc.) that appear in each compunit symtab's file symtab list. As in, when debugging GDB, I have 367381 symtabs iterated, when of those only 5371 filenames are unique... This was a regression from the earlier (nice) split of symtabs in compunit symtabs + file symtabs. The fix here is to add a cache of unique filenames per objfile so that the walk / uniquing is only done once. There's already a abstraction for this in symtab.c; this patch moves that code out to a separate file and C++ifies it bit. This makes the worst-case scenario above consistently drop to ~2.5s (1.5s for the "string_prin" hit case), making it over 3.3x times faster than psymtabs in this use case (7x in the "string_prin" hit case). gdb/ChangeLog: 2017-07-17 Pedro Alves <palves@redhat.com> * Makefile.in (COMMON_OBS): Add filename-seen-cache.o. * dwarf2read.c: Include "filename-seen-cache.h". * dwarf2read.c (dwarf2_per_objfile) <filenames_cache>: New field. (dw2_map_symbol_filenames): Build and use a filenames_seen_cache. * filename-seen-cache.c: New file. * filename-seen-cache.h: New file. * symtab.c: Include "filename-seen-cache.h". (struct filename_seen_cache, INITIAL_FILENAME_SEEN_CACHE_SIZE) (create_filename_seen_cache, clear_filename_seen_cache) (delete_filename_seen_cache, filename_seen): Delete, parts moved to filename-seen-cache.h/filename-seen-cache.c. (output_source_filename, sources_info) (maybe_add_partial_symtab_filename) (make_source_files_completion_list): Adjust to use filename_seen_cache. 2017-07-17 12:28:33 +02:00
			`/* Empty the cache, but do not delete it. */`
			`void clear ();`

			`/* If FILE is not already in the table of files in CACHE, add it and`
			`return false; otherwise return true.`

			`NOTE: We don't manage space for FILE, we assume FILE lives as`
			`long as the caller needs. */`
			`bool seen (const char *file);`

			`/* Traverse all cache entries, calling CALLBACK on each. The`
			`filename is passed as argument to CALLBACK. */`
			`void traverse (gdb::function_view<void (const char *filename)> callback)`
			`{`
			`auto erased_cb = [] (void *slot, void info) -> int`
			`{`
			`auto filename = (const char ) slot;`
			`auto restored_cb = (decltype (callback) *) info;`
			`(*restored_cb) (filename);`
			`return 1;`
			`};`

			`htab_traverse_noresize (m_tab, erased_cb, &callback);`
			`}`

			`private:`
			`/* Table of files seen so far. */`
			`htab_t m_tab;`
			`};`
Add include guard to filename-seen-cache.h While moving things around, I stumbled on filename_seen_cache being re-defined, because filename-seen-cache.h doesn't have an include guard. gdb/ChangeLog: * filename-seen-cache.h: Add include guard. 2018-03-26 21:31:10 +02:00
			`#endif /* FILENAME_SEEN_CACHE_H */`