<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>Design</title><meta name="generator" content="DocBook XSL Stylesheets Vsnapshot" /><meta name="keywords" content="C++, library, profile" /><meta name="keywords" content="ISO C++, library" /><meta name="keywords" content="ISO C++, runtime, library" /><link rel="home" href="../index.html" title="The GNU C++ Library" /><link rel="up" href="profile_mode.html" title="Chapter 19. Profile Mode" /><link rel="prev" href="profile_mode.html" title="Chapter 19. Profile Mode" /><link rel="next" href="profile_mode_api.html" title="Extensions for Custom Containers" /></head><body><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="3" align="center">Design</th></tr><tr><td width="20%" align="left"><a accesskey="p" href="profile_mode.html">Prev</a> </td><th width="60%" align="center">Chapter 19. Profile Mode</th><td width="20%" align="right"> <a accesskey="n" href="profile_mode_api.html">Next</a></td></tr></table><hr /></div><div class="section"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a id="manual.ext.profile_mode.design"></a>Design</h2></div></div></div><p>
</p><div class="table"><a id="table.profile_code_loc"></a><p class="title"><strong>Table 19.1. Profile Code Location</strong></p><div class="table-contents"><table class="table" summary="Profile Code Location" border="1"><colgroup><col align="left" class="c1" /><col align="left" class="c2" /></colgroup><thead><tr><th align="left">Code Location</th><th align="left">Use</th></tr></thead><tbody><tr><td align="left"><code class="code">libstdc++-v3/include/std/*</code></td><td align="left">Preprocessor code to redirect to profile extension headers.</td></tr><tr><td align="left"><code class="code">libstdc++-v3/include/profile/*</code></td><td align="left">Profile extension public headers (map, vector, ...).</td></tr><tr><td align="left"><code class="code">libstdc++-v3/include/profile/impl/*</code></td><td align="left">Profile extension internals. Implementation files are
only included from <code class="code">impl/profiler.h</code>, which is the only
file included from the public headers.</td></tr></tbody></table></div></div><br class="table-break" /><p>
</p><div class="section"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.design.wrapper"></a>Wrapper Model</h3></div></div></div><p>
To have the instrumented version of the library included instead of the
release one, we use the same wrapper model as the debug mode:
we subclass entities from the release version. Wherever
<code class="code">_GLIBCXX_PROFILE</code> is defined, the release namespace is
<code class="code">std::__norm</code>, whereas the profile namespace is
<code class="code">std::__profile</code>. Using plain <code class="code">std</code> translates
into <code class="code">std::__profile</code>.
</p><p>
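As a minimal sketch of this wrapper model, the following uses a hypothetical
<code class="code">demo</code> namespace in place of <code class="code">std</code>, with
<code class="code">demo::norm</code> and <code class="code">demo::profile</code> standing in for the
release and profile namespaces. The inline namespace, the counter, and the
example function are assumptions made for illustration; they are not the
library's actual code.
</p>

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

namespace demo {

// Stand-in for the release namespace (std::__norm in the text above).
namespace norm {
template <typename T>
using vector = std::vector<T>;
}  // namespace norm

// Stand-in for the profile namespace (std::__profile). Declaring it inline
// is one way to make plain demo::vector resolve to demo::profile::vector.
inline namespace profile {

// Hypothetical counter standing in for a real diagnostic's state.
inline std::size_t& push_back_calls() {
  static std::size_t count = 0;
  return count;
}

// The instrumented container subclasses the release one and wraps only
// the public member it wants to observe.
template <typename T>
class vector : public norm::vector<T> {
  using base = norm::vector<T>;

 public:
  using base::base;  // inherit the release constructors

  void push_back(const T& value) {
    ++push_back_calls();     // instrumentation hook
    base::push_back(value);  // forward to the release implementation
  }
};

}  // namespace profile
}  // namespace demo

static_assert(
    std::is_same<demo::vector<int>, demo::profile::vector<int>>::value,
    "plain demo::vector names the profile wrapper");

// Usage: user code keeps writing demo::vector and gets the wrapper.
inline void wrapper_example() {
  demo::vector<int> v;
  v.push_back(1);
  v.push_back(2);
  assert(v.size() == 2);
  assert(demo::push_back_calls() == 2);
}
```

<p>
Because the wrapper derives from the release container, every member it does
not override behaves exactly as in the release version.
</p><p>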
Whenever possible, we try to wrap at the public interface level, e.g.,
in <code class="code">unordered_set</code> rather than in <code class="code">hashtable</code>,
so as not to depend on implementation details.
</p><p>
Mixing object files built with and without the profile mode must
not affect the program execution. However, there are no guarantees about
the accuracy of diagnostics when even a single object is not built with
<code class="code">-D_GLIBCXX_PROFILE</code>.
Currently, mixing the profile mode with debug and parallel extensions is
not allowed. Mixing them at compile time will result in preprocessor errors.
Mixing them at link time is undefined.
</p></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.design.instrumentation"></a>Instrumentation</h3></div></div></div><p>
Instead of instrumenting every public entry and exit point,
we chose to add instrumentation on demand, as needed
by individual diagnostics.
The main reason is that some diagnostics require us to extract bits of
internal state that are specific to that diagnostic.
We plan to formalize this later, after we learn more about the requirements
of several diagnostics.
</p><p>
All the instrumentation points can be switched on and off using
<code class="code">-D[_NO]_GLIBCXX_PROFILE_&lt;diagnostic&gt;</code> options.
With all the instrumentation calls off, there should be negligible
overhead over the release version. This property is needed to support
diagnostics based on timing of internal operations. For such diagnostics,
we anticipate turning most of the instrumentation off in order to prevent
profiling overhead from polluting time measurements, and thus diagnostics.
</p><p>
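The switching scheme can be sketched as follows. All
<code class="code">DEMO_</code> macro names, the counter, and the
<code class="code">grow</code> function are hypothetical and only mirror the
<code class="code">-D[_NO]_GLIBCXX_PROFILE_&lt;diagnostic&gt;</code> pattern; the sketch
also assumes that a diagnostic is on unless its <code class="code">_NO_</code> form is
given, which the text above does not state.
</p>

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical switch pair modeled on -D[_NO]_GLIBCXX_PROFILE_<diagnostic>.
// Assumption: a diagnostic is on unless its _NO_ form is given.
#if defined(DEMO_PROFILE_RESIZE) && defined(DEMO_NO_PROFILE_RESIZE)
#error "DEMO_PROFILE_RESIZE and DEMO_NO_PROFILE_RESIZE are mutually exclusive"
#endif
#if !defined(DEMO_PROFILE_RESIZE) && !defined(DEMO_NO_PROFILE_RESIZE)
#define DEMO_PROFILE_RESIZE 1
#endif

namespace demo {
// Hypothetical per-diagnostic event counter.
inline std::size_t& resize_events() {
  static std::size_t count = 0;
  return count;
}
}  // namespace demo

// When the diagnostic is off, the hook expands to a no-op, so the
// instrumented code path does no extra work over the release one.
#ifdef DEMO_PROFILE_RESIZE
#define DEMO_TRACE_RESIZE() (static_cast<void>(++demo::resize_events()))
#else
#define DEMO_TRACE_RESIZE() (static_cast<void>(0))
#endif

namespace demo {
// Stand-in for a library operation carrying an instrumentation point.
inline int grow(int capacity) {
  DEMO_TRACE_RESIZE();
  return capacity * 2;
}
}  // namespace demo

// Usage: the operation's result is the same either way; only the
// recorded event count depends on the switch.
inline void switch_example() {
  assert(demo::grow(4) == 8);
#ifdef DEMO_PROFILE_RESIZE
  assert(demo::resize_events() == 1);
#else
  assert(demo::resize_events() == 0);
#endif
}
```

<p>
Compiling the sketch with <code class="code">-DDEMO_NO_PROFILE_RESIZE</code> removes the
counter update from <code class="code">grow</code> entirely, which is the property that
timing-based diagnostics rely on.
</p><p>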
All the compile-time switches that turn instrumentation on and off live in
<code class="code">include/profile/profiler.h</code>.
</p></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.design.rtlib"></a>Run Time Behavior</h3></div></div></div><p>
For practical reasons, the instrumentation library processes the trace
partially rather than dumping it to disk in raw form. Each event is
processed when it occurs. It is usually assigned a cost and aggregated into
the database of a specific diagnostic class. The cost model
is based largely on the standard performance guarantees, but in some
cases we use knowledge about GCC's standard library implementation.
</p><p>
Information is indexed by (1) call stack and (2) instance id or address
to be able to understand and summarize precise creation-use-destruction
dynamic chains. Although the analysis is sensitive to dynamic instances,
the reports are only sensitive to call context. Whenever a dynamic instance
is destroyed, we add its effect to the entry for the
call stack of its constructor location.
</p><p>
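The indexing scheme above can be illustrated with a small sketch. The
<code class="code">demo::trace</code> class, its member names, and the use of a string as
a call stack are all hypothetical simplifications, not the instrumentation
library's interface.
</p>

```cpp
#include <cassert>
#include <map>
#include <string>
#include <unordered_map>

namespace demo {

// Stand-in for a captured call stack; a real one would hold return addresses.
using call_stack = std::string;

class trace {
 public:
  // Record where an instance was constructed.
  void on_construct(const void* instance, const call_stack& where) {
    live_[instance] = record{where, 0.0};
  }

  // Process an event as it occurs: attach its cost to the live instance.
  void on_event(const void* instance, double cost) {
    auto it = live_.find(instance);
    if (it != live_.end()) it->second.cost += cost;
  }

  // On destruction, fold the instance's total into the entry for the call
  // stack of its constructor, then forget the instance.
  void on_destruct(const void* instance) {
    auto it = live_.find(instance);
    if (it == live_.end()) return;
    report_[it->second.where] += it->second.cost;
    live_.erase(it);
  }

  // Accumulated cost reported for one construction context.
  double cost_at(const call_stack& where) const {
    auto it = report_.find(where);
    return it == report_.end() ? 0.0 : it->second;
  }

 private:
  struct record {
    call_stack where;
    double cost;
  };
  std::unordered_map<const void*, record> live_;  // indexed by instance address
  std::map<call_stack, double> report_;           // indexed by call stack
};

}  // namespace demo

// Usage: two instances built in the same call context share one report entry.
inline void trace_example() {
  demo::trace t;
  int a = 0, b = 0;
  t.on_construct(&a, "main > build");
  t.on_construct(&b, "main > build");
  t.on_event(&a, 2.0);
  t.on_event(&b, 3.0);
  assert(t.cost_at("main > build") == 0.0);  // nothing reported while alive
  t.on_destruct(&a);
  t.on_destruct(&b);
  assert(t.cost_at("main > build") == 5.0);
}
```

<p>
Keeping two tables separates the per-instance bookkeeping needed during the
run from the per-call-context totals that end up in the report.
</p><p>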
For details, see the
<a class="link" href="https://ieeexplore.ieee.org/document/4907670/" target="_top">paper presented at
CGO 2009</a>.
</p></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.design.analysis"></a>Analysis and Diagnostics</h3></div></div></div><p>
Final analysis takes place offline, and it is based entirely on the
generated trace and debugging info in the application binary.
See the Diagnostics section for a list of analysis types that we plan to support.
</p><p>
The input to the analysis is a table indexed by profile type and call stack.
The data type for each entry depends on the profile type.
</p></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.design.cost-model"></a>Cost Model</h3></div></div></div><p>
Cost models will likely become complex as the analysis grows more
sophisticated, but we will try to follow a simple set of rules
at the beginning.
</p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p><span class="emphasis"><em>Relative benefit estimation:</em></span>
The idea is to estimate or measure the cost of all operations
in the original scenario versus the scenario we advise switching to.
For instance, when advising to change a vector to a list, an occurrence
of the <code class="code">insert</code> method will generally count as a benefit.
Its magnitude depends on (1) the number of elements that get shifted
and (2) whether it triggers a reallocation.
</p></li><li class="listitem"><p><span class="emphasis"><em>Synthetic measurements:</em></span>
We will measure the relative difference between similar operations on
different containers. We plan to write a battery of small tests that
compare the execution times of similar methods on different
containers. The idea is to run these tests on the target machine.
If this training phase is very quick, we may decide to perform it at
library initialization time. The results can be cached on disk and reused
across runs.
</p></li><li class="listitem"><p><span class="emphasis"><em>Timers:</em></span>
We plan to use timers for operations of larger granularity, such as sort.
For instance, we can switch between different sort methods on the fly
and report the one that performs best for each call context.
</p></li><li class="listitem"><p><span class="emphasis"><em>Show stoppers:</em></span>
We may decide that the presence of an operation nullifies the advice.
For instance, when considering switching from <code class="code">set</code> to
<code class="code">unordered_set</code>, if we detect use of operator <code class="code">++</code>,
we will simply not issue the advice, since this could signal that the use
case requires a sorted container.</p></li></ul></div></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.design.reports"></a>Reports</h3></div></div></div><p>
There are two types of reports. First, if we recognize a pattern for which
we have a substitute that is likely to give better performance, we print
the advice and estimated performance gain. The advice is usually associated
with a code position and possibly a call stack.
</p><p>
Second, we report performance characteristics for which we do not have
a clear solution for improvement. For instance, we can point the user to
the top 10 <code class="code">multimap</code> locations
that have the worst data locality in actual traversals.
Although this does not offer a solution,
it helps the user focus on the key problems and ignore the uninteresting ones.
</p></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.design.testing"></a>Testing</h3></div></div></div><p>
First, we want to make sure we preserve the behavior of the release mode.
To check this, run <code class="code">make check-profile</code>, which
builds and runs the whole test suite in profile mode.
</p><p>
Second, we want to test the correctness of each diagnostic.
We created a <code class="code">profile</code> directory in the test suite.
Each diagnostic must come with at least two tests, one for false positives
and one for false negatives.
</p></div></div><div class="navfooter"><hr /><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="profile_mode.html">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="profile_mode.html">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="profile_mode_api.html">Next</a></td></tr><tr><td width="40%" align="left" valign="top">Chapter 19. Profile Mode </td><td width="20%" align="center"><a accesskey="h" href="../index.html">Home</a></td><td width="40%" align="right" valign="top"> Extensions for Custom Containers</td></tr></table></div></body></html>