55d2e499d6
* doc/xml/manual/backwards_compatibility.xml: Fix autoconf tests for C++11 compiler features and library headers. Add stable id attributes. Use <filename> element for headers and surround in angle brackets. Use <classname> for classes. * doc/html/*: Regenerate. From-SVN: r181047
123 lines
11 KiB
HTML
123 lines
11 KiB
HTML
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
|
||
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
|
||
<html xmlns="http://www.w3.org/1999/xhtml"><head><title>Design</title><meta name="generator" content="DocBook XSL-NS Stylesheets V1.76.1"/><meta name="keywords" content=" C++ , library , profile "/><meta name="keywords" content=" ISO C++ , library "/><meta name="keywords" content=" ISO C++ , runtime , library "/><link rel="home" href="../index.html" title="The GNU C++ Library"/><link rel="up" href="profile_mode.html" title="Chapter 19. Profile Mode"/><link rel="prev" href="profile_mode.html" title="Chapter 19. Profile Mode"/><link rel="next" href="bk01pt03ch19s03.html" title="Extensions for Custom Containers"/></head><body><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="3" align="center">Design</th></tr><tr><td align="left"><a accesskey="p" href="profile_mode.html">Prev</a> </td><th width="60%" align="center">Chapter 19. Profile Mode</th><td align="right"> <a accesskey="n" href="bk01pt03ch19s03.html">Next</a></td></tr></table><hr/></div><div class="section" title="Design"><div class="titlepage"><div><div><h2 class="title"><a id="manual.ext.profile_mode.design"/>Design</h2></div></div></div><p>
|
||
</p><div class="table"><a id="id568017"/><p class="title"><strong>Table 19.1. Profile Code Location</strong></p><div class="table-contents"><table summary="Profile Code Location" border="1"><colgroup><col style="text-align: left" class="c1"/><col style="text-align: left" class="c2"/></colgroup><thead><tr><th style="text-align: left">Code Location</th><th style="text-align: left">Use</th></tr></thead><tbody><tr><td style="text-align: left"><code class="code">libstdc++-v3/include/std/*</code></td><td style="text-align: left">Preprocessor code to redirect to profile extension headers.</td></tr><tr><td style="text-align: left"><code class="code">libstdc++-v3/include/profile/*</code></td><td style="text-align: left">Profile extension public headers (map, vector, ...).</td></tr><tr><td style="text-align: left"><code class="code">libstdc++-v3/include/profile/impl/*</code></td><td style="text-align: left">Profile extension internals. Implementation files are
|
||
only included from <code class="code">impl/profiler.h</code>, which is the only
|
||
file included from the public headers.</td></tr></tbody></table></div></div><br class="table-break"/><p>
|
||
</p><div class="section" title="Wrapper Model"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.design.wrapper"/>Wrapper Model</h3></div></div></div><p>
|
||
In order to get our instrumented library version included instead of the
|
||
release one,
|
||
we use the same wrapper model as the debug mode.
|
||
We subclass entities from the release version. Wherever
|
||
<code class="code">_GLIBCXX_PROFILE</code> is defined, the release namespace is
|
||
<code class="code">std::__norm</code>, whereas the profile namespace is
|
||
<code class="code">std::__profile</code>. Using plain <code class="code">std</code> translates
|
||
into <code class="code">std::__profile</code>.
|
||
</p><p>
|
||
Whenever possible, we try to wrap at the public interface level, e.g.,
|
||
in <code class="code">unordered_set</code> rather than in <code class="code">hashtable</code>,
|
||
in order not to depend on implementation.
|
||
</p><p>
|
||
Mixing object files built with and without the profile mode must
|
||
not affect the program execution. However, there are no guarantees to
|
||
the accuracy of diagnostics when using even a single object not built with
|
||
<code class="code">-D_GLIBCXX_PROFILE</code>.
|
||
Currently, mixing the profile mode with debug and parallel extensions is
|
||
not allowed. Mixing them at compile time will result in preprocessor errors.
|
||
Mixing them at link time is undefined.
|
||
</p></div><div class="section" title="Instrumentation"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.design.instrumentation"/>Instrumentation</h3></div></div></div><p>
|
||
Instead of instrumenting every public entry and exit point,
|
||
we chose to add instrumentation on demand, as needed
|
||
by individual diagnostics.
|
||
The main reason is that some diagnostics require us to extract bits of
|
||
internal state that are particular only to that diagnostic.
|
||
We plan to formalize this later, after we learn more about the requirements
|
||
of several diagnostics.
|
||
</p><p>
|
||
All the instrumentation points can be switched on and off using
|
||
<code class="code">-D[_NO]_GLIBCXX_PROFILE_<diagnostic></code> options.
|
||
With all the instrumentation calls off, there should be negligible
|
||
overhead over the release version. This property is needed to support
|
||
diagnostics based on timing of internal operations. For such diagnostics,
|
||
we anticipate turning most of the instrumentation off in order to prevent
|
||
profiling overhead from polluting time measurements, and thus diagnostics.
|
||
</p><p>
|
||
All the instrumentation on/off compile time switches live in
|
||
<code class="code">include/profile/profiler.h</code>.
|
||
</p></div><div class="section" title="Run Time Behavior"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.design.rtlib"/>Run Time Behavior</h3></div></div></div><p>
|
||
For practical reasons, the instrumentation library processes the trace
|
||
partially
|
||
rather than dumping it to disk in raw form. Each event is processed when
|
||
it occurs. It is usually attached a cost and it is aggregated into
|
||
the database of a specific diagnostic class. The cost model
|
||
is based largely on the standard performance guarantees, but in some
|
||
cases we use knowledge about GCC's standard library implementation.
|
||
</p><p>
|
||
Information is indexed by (1) call stack and (2) instance id or address
|
||
to be able to understand and summarize precise creation-use-destruction
|
||
dynamic chains. Although the analysis is sensitive to dynamic instances,
|
||
the reports are only sensitive to call context. Whenever a dynamic instance
|
||
is destroyed, we accumulate its effect to the corresponding entry for the
|
||
call stack of its constructor location.
|
||
</p><p>
|
||
For details, see
|
||
<a class="link" href="http://dx.doi.org/10.1109/CGO.2009.36">paper presented at
|
||
CGO 2009</a>.
|
||
</p></div><div class="section" title="Analysis and Diagnostics"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.design.analysis"/>Analysis and Diagnostics</h3></div></div></div><p>
|
||
Final analysis takes place offline, and it is based entirely on the
|
||
generated trace and debugging info in the application binary.
|
||
See section Diagnostics for a list of analysis types that we plan to support.
|
||
</p><p>
|
||
The input to the analysis is a table indexed by profile type and call stack.
|
||
The data type for each entry depends on the profile type.
|
||
</p></div><div class="section" title="Cost Model"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.design.cost-model"/>Cost Model</h3></div></div></div><p>
|
||
While it is likely that cost models become complex as we get into
|
||
more sophisticated analysis, we will try to follow a simple set of rules
|
||
at the beginning.
|
||
</p><div class="itemizedlist"><ul class="itemizedlist"><li class="listitem"><p><span class="emphasis"><em>Relative benefit estimation:</em></span>
|
||
The idea is to estimate or measure the cost of all operations
|
||
in the original scenario versus the scenario we advise to switch to.
|
||
For instance, when advising to change a vector to a list, an occurrence
|
||
of the <code class="code">insert</code> method will generally count as a benefit.
|
||
Its magnitude depends on (1) the number of elements that get shifted
|
||
and (2) whether it triggers a reallocation.
|
||
</p></li><li class="listitem"><p><span class="emphasis"><em>Synthetic measurements:</em></span>
|
||
We will measure the relative difference between similar operations on
|
||
different containers. We plan to write a battery of small tests that
|
||
compare the times of the executions of similar methods on different
|
||
containers. The idea is to run these tests on the target machine.
|
||
If this training phase is very quick, we may decide to perform it at
|
||
library initialization time. The results can be cached on disk and reused
|
||
across runs.
|
||
</p></li><li class="listitem"><p><span class="emphasis"><em>Timers:</em></span>
|
||
We plan to use timers for operations of larger granularity, such as sort.
|
||
For instance, we can switch between different sort methods on the fly
|
||
and report the one that performs best for each call context.
|
||
</p></li><li class="listitem"><p><span class="emphasis"><em>Show stoppers:</em></span>
|
||
We may decide that the presence of an operation nullifies the advice.
|
||
For instance, when considering switching from <code class="code">set</code> to
|
||
<code class="code">unordered_set</code>, if we detect use of operator <code class="code">++</code>,
|
||
we will simply not issue the advice, since this could signal that the use
|
||
care require a sorted container.</p></li></ul></div></div><div class="section" title="Reports"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.design.reports"/>Reports</h3></div></div></div><p>
|
||
There are two types of reports. First, if we recognize a pattern for which
|
||
we have a substitute that is likely to give better performance, we print
|
||
the advice and estimated performance gain. The advice is usually associated
|
||
to a code position and possibly a call stack.
|
||
</p><p>
|
||
Second, we report performance characteristics for which we do not have
|
||
a clear solution for improvement. For instance, we can point to the user
|
||
the top 10 <code class="code">multimap</code> locations
|
||
which have the worst data locality in actual traversals.
|
||
Although this does not offer a solution,
|
||
it helps the user focus on the key problems and ignore the uninteresting ones.
|
||
</p></div><div class="section" title="Testing"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.design.testing"/>Testing</h3></div></div></div><p>
|
||
First, we want to make sure we preserve the behavior of the release mode.
|
||
You can just type <code class="code">"make check-profile"</code>, which
|
||
builds and runs the whole test suite in profile mode.
|
||
</p><p>
|
||
Second, we want to test the correctness of each diagnostic.
|
||
We created a <code class="code">profile</code> directory in the test suite.
|
||
Each diagnostic must come with at least two tests, one for false positives
|
||
and one for false negatives.
|
||
</p></div></div><div class="navfooter"><hr/><table width="100%" summary="Navigation footer"><tr><td align="left"><a accesskey="p" href="profile_mode.html">Prev</a> </td><td align="center"><a accesskey="u" href="profile_mode.html">Up</a></td><td align="right"> <a accesskey="n" href="bk01pt03ch19s03.html">Next</a></td></tr><tr><td align="left" valign="top">Chapter 19. Profile Mode </td><td align="center"><a accesskey="h" href="../index.html">Home</a></td><td align="right" valign="top"> Extensions for Custom Containers</td></tr></table></div></body></html>
|