[multiple changes]
2004-02-11 Stefan Olsson <stefan@xapa.se> * docs/html/ext/mt_allocator.html: New. 2004-02-11 Benjamin Kosnik <bkoz@redhat.com> * docs/html/20_util/allocator.html: New file, consolidate allocator information here. Revamp. * docs/html/documentation.html: Change links. * docs/html/20_util/howto.html: Same. * docs/html/ext/howto.html: Same. From-SVN: r77687
This commit is contained in:
parent
9288d11204
commit
1c86f39d32
@ -1,3 +1,15 @@
|
||||
2004-02-11 Stefan Olsson <stefan@xapa.se>
|
||||
|
||||
* docs/html/ext/mt_allocator.html: New.
|
||||
|
||||
2004-02-11 Benjamin Kosnik <bkoz@redhat.com>
|
||||
|
||||
* docs/html/20_util/allocator.html: New file, consolidate
|
||||
allocator information here. Revamp.
|
||||
* docs/html/documentation.html: Change links.
|
||||
* docs/html/20_util/howto.html: Same.
|
||||
* docs/html/ext/howto.html: Same.
|
||||
|
||||
2004-02-11 Paolo Carlini <pcarlini@suse.de>
|
||||
|
||||
PR libstdc++/13731 (first part: write)
|
||||
|
468
libstdc++-v3/docs/html/20_util/allocator.html
Normal file
468
libstdc++-v3/docs/html/20_util/allocator.html
Normal file
@ -0,0 +1,468 @@
|
||||
<?xml version="1.0" encoding="ISO-8859-1"?>
|
||||
<!DOCTYPE html
|
||||
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
|
||||
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
|
||||
|
||||
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
|
||||
<head>
|
||||
<meta name="AUTHOR" content="pme@gcc.gnu.org (Phil Edwards) and bkoz@gcc.gnu.org (Benjamin Kosnik)" />
|
||||
<meta name="KEYWORDS" content="c++, libstdc++, g++, allocator, memory" />
|
||||
<meta name="DESCRIPTION" content="Allocators and allocation" />
|
||||
<meta name="GENERATOR" content="emacs and ten fingers" />
|
||||
<title>Allocators and allocation</title>
|
||||
<link rel="StyleSheet" href="../lib3styles.css" type="text/css" />
|
||||
<link rel="Copyright" href="../17_intro/license.html" type="text/html" />
|
||||
</head>
|
||||
<body>
|
||||
|
||||
<h1 class="centered"><a name="top">Allocators and allocation</a></h1>
|
||||
|
||||
<p class="fineprint"><em>
|
||||
The latest version of this document is always available at
|
||||
<a href="http://gcc.gnu.org/onlinedocs/libstdc++/20_util/allocator.html">
|
||||
http://gcc.gnu.org/onlinedocs/libstdc++/20_util/allocator.html</a>.
|
||||
</em></p>
|
||||
|
||||
<p><em>
|
||||
To the <a href="http://gcc.gnu.org/libstdc++/">libstdc++-v3 homepage</a>.
|
||||
</em></p>
|
||||
|
||||
<!-- ####################################################### -->
|
||||
<hr />
|
||||
<p> The C++ Standard encapsulates memory management characteristics
|
||||
for strings, container classes, and parts of iostreams in a
|
||||
template class called <code>std::allocator</code>.
|
||||
</p>
|
||||
|
||||
<h3 class="left">
|
||||
<a name="standard requirements">Standard requirements</a>
|
||||
</h3>
|
||||
<p>The C++ standard only gives a few directives in this area:
|
||||
</p>
|
||||
<ul>
|
||||
<li>When you add elements to a container, and the container must allocate
|
||||
more memory to hold them, the container makes the request via its
|
||||
<code>Allocator</code> template parameter. This includes adding
|
||||
chars to the string class, which acts as a regular STL container
|
||||
in this respect.
|
||||
</li>
|
||||
<li>The default <code>Allocator</code> of every container-of-T is
|
||||
<code>std::allocator<T></code>.
|
||||
</li>
|
||||
<li>The interface of the <code>allocator<T></code> class is
|
||||
extremely simple. It has about 20 public declarations (nested
|
||||
typedefs, member functions, etc), but the two which concern us most
|
||||
are:
|
||||
<pre>
|
||||
T* allocate (size_type n, const void* hint = 0);
|
||||
void deallocate (T* p, size_type n);</pre>
|
||||
(This is a simplification; the real signatures use nested typedefs.)
|
||||
The <code>"n"</code> arguments in both those functions is a
|
||||
<em>count</em> of the number of T's to allocate space for,
|
||||
<em>not their total size</em>.
|
||||
</li>
|
||||
<li>"The storage is obtained by calling
|
||||
<code>::operator new(size_t)</code>, but it is unspecified when or
|
||||
how often this function is called. The use of <code>hint</code>
|
||||
is unspecified, but intended as an aid to locality if an
|
||||
implementation so desires." [20.4.1.1]/6
|
||||
</li>
|
||||
</ul>
|
||||
|
||||
<p> Complete details cam be found in the C++ standard, look in
|
||||
[20.4 Memory].
|
||||
</p>
|
||||
|
||||
<h3 class="left">
|
||||
<a name="probs possibilities">Problems and Possibilities</a>
|
||||
</h3>
|
||||
<p>The easiest way of fulfilling the requirements is to call operator new
|
||||
each time a container needs memory, and to call operator delete each
|
||||
time the container releases memory. <strong>BUT</strong>
|
||||
<a href="http://gcc.gnu.org/ml/libstdc++/2001-05/msg00105.html">this
|
||||
method is horribly slow</a>.
|
||||
</p>
|
||||
<p>Or we can keep old memory around, and reuse it in a pool to save time.
|
||||
The old libstdc++-v2 used a memory pool, and so do we. As of 3.0,
|
||||
<a href="http://gcc.gnu.org/ml/libstdc++/2001-05/msg00136.html">it's
|
||||
on by default</a>. The pool is shared among all the containers in the
|
||||
program: when your program's std::vector<int> gets cut in half
|
||||
and frees a bunch of its storage, that memory can be reused by the
|
||||
private std::list<WonkyWidget> brought in from a KDE library
|
||||
that you linked against. And we don't have to call operators new and
|
||||
delete to pass the memory on, either, which is a speed bonus.
|
||||
<strong>BUT</strong>...
|
||||
</p>
|
||||
<p>What about threads? No problem: in a threadsafe environment, the
|
||||
memory pool is manipulated atomically, so you can grow a container in
|
||||
one thread and shrink it in another, etc. <strong>BUT</strong> what
|
||||
if threads in libstdc++-v3 aren't set up properly?
|
||||
<a href="../faq/index.html#5_6">That's been answered already</a>.
|
||||
</p>
|
||||
<p><strong>BUT</strong> what if you want to use your own allocator? What
|
||||
if you plan on using a runtime-loadable version of malloc() which uses
|
||||
shared telepathic anonymous mmap'd sections serializable over a
|
||||
network, so that memory requests <em>should</em> go through malloc?
|
||||
And what if you need to debug it?
|
||||
</p>
|
||||
|
||||
<h3 class="left">
|
||||
<a name="stdallocator">Implementation details of <code>std::allocator</code></a>
|
||||
</h3>
|
||||
<p> The implementation of <code> std::allocator</code> has continued
|
||||
to evolve through successive releases. Here's a brief history.
|
||||
</p>
|
||||
|
||||
<h5 class="left">
|
||||
<a name="30allocator"> 3.0, 3.1, 3.2, 3.3 </a>
|
||||
</h5>
|
||||
<p> During this period, all allocators were written to the SGI
|
||||
style, and all STL containers expected this interface. This
|
||||
interface had a traits class called <code>_Alloc_traits</code> that
|
||||
attempted to provide more information for compile-time allocation
|
||||
selection and optimization. This traits class had another allocator
|
||||
wrapper, <code>__simple_alloc<T,A></code>, which was a
|
||||
wrapper around another allocator, A, which itself is an allocator
|
||||
for instances of T. But wait, there's more:
|
||||
<code>__allocator<T,A></code> is another adapter. Many of
|
||||
the provided allocator classes were SGI style: such classes can be
|
||||
changed to a conforming interface with this wrapper:
|
||||
<code>__allocator<T, __alloc></code> is thus the same as
|
||||
<code>allocator<T></code>.
|
||||
</p>
|
||||
|
||||
<p> The class <code>std::allocator</code> use the typedef
|
||||
<code>__alloc</code> to select an underlying allocator that
|
||||
satisfied memory allocation requests. The selection of this
|
||||
underlying allocator was not user-configurable.
|
||||
</p>
|
||||
|
||||
<h5 class="left">
|
||||
<a name="34allocator"> 3.4 </a>
|
||||
</h5>
|
||||
<p> For this and later releases, the only allocator interface that
|
||||
is support is the standard C++ interface. As such, all STL
|
||||
containers have been adjusted, and all external allocators have
|
||||
been modified to support this change. Because of this,
|
||||
<code>__simple_alloc, __allocator, __alloc, </code> and <code>
|
||||
_Alloc_traits</code> have all been removed.
|
||||
</p>
|
||||
|
||||
<p> The class <code>std::allocator</code> just has typedef,
|
||||
constructor, and rebind members. It inherits from one of the
|
||||
high-speed extension allocators, covered below. Thus, all
|
||||
allocation and deallocation depends on the base class.
|
||||
</p>
|
||||
|
||||
<p> The base class that <code>std::allocator</code> is derived from
|
||||
is not user-configurable.
|
||||
</p>
|
||||
|
||||
<h5 class="left">
|
||||
<a name="benchmarks"> How the default allocation strategy is selected.</a>
|
||||
</h5>
|
||||
<p> It's difficult to pick an allocation strategy that will provide
|
||||
maximum utility, without excessively penalizing some behavior. In
|
||||
fact, it's difficult just deciding which typical actions to measure
|
||||
for speed.
|
||||
</p>
|
||||
|
||||
<p> Three synthetic benchmarks have been created that provide data
|
||||
that is used to compare different C++ allocators. These tests are:
|
||||
</p>
|
||||
|
||||
<ul>
|
||||
<li>Insertion. Over multiple iterations, various STL container
|
||||
objects have elements inserted to some maximum amount. A variety
|
||||
of allocators are tested.
|
||||
Test source <a
|
||||
href="http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/libstdc%2b%2b-v3/testsuite/performance/20_util/allocator/insert.cc?only_with_tag=MAIN">here.</a>
|
||||
</li>
|
||||
|
||||
<li>Insertion, clear, and re-insertion in a multi-threaded
|
||||
environment. Over multiple iterations, several threads are
|
||||
started that insert elements into a STL container, then assign a
|
||||
null instance of the same type to clear memory, and then
|
||||
re-insert the same number of elements. Several STL containers and
|
||||
multiple allocators are tested. This test shows the ability of
|
||||
the allocator to reclaim memory on a pre-thread basis, as well as
|
||||
measuring thread contention for memory resources.
|
||||
Test source
|
||||
<a href="http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/libstdc%2b%2b-v3/testsuite/performance/20_util/allocator/insert_insert.cc">
|
||||
here.</a>
|
||||
</li>
|
||||
|
||||
<li>A threaded producer/consumer model.
|
||||
Test source
|
||||
<a href="http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/libstdc%2b%2b-v3/testsuite/performance/20_util/allocator/producer_consumer.cc">
|
||||
here.</a>
|
||||
</li>
|
||||
</ul>
|
||||
|
||||
<h5 class="left">
|
||||
<a name="forcenew"> Disabling memory caching.</a>
|
||||
</h5>
|
||||
<p> In use, <code>std::allocator</code> may allocate and deallocate
|
||||
using implementation-specified strategies and heuristics. Because of
|
||||
this, every call to an allocator object's <code> allocate</code>
|
||||
member function may not actually call the global operator new. This
|
||||
situation is also duplicated for calls to the <code>
|
||||
deallocate</code> member function.
|
||||
</p>
|
||||
|
||||
<p> This can be confusing.
|
||||
</p>
|
||||
|
||||
<p> In particular, this can make debugging memory errors more
|
||||
difficult, especially when using third party tools like valgrind or
|
||||
debug versions of <code> new</code>.
|
||||
</p>
|
||||
|
||||
<p> There are various ways to solve this problem. One would be to
|
||||
use a custom allocator that just called operators <code> new
|
||||
</code> and <code> delete</code> directly, for every
|
||||
allocation. (See include/ext/new_allocator.h, for instance.)
|
||||
However, that option would involve changing source code to use the a
|
||||
non-default allocator. Another option is to force the default
|
||||
allocator to remove caching and pools, and to directly allocate
|
||||
with every call of <code> allocate</code> and directly deallocate
|
||||
with every call of <code> deallocate</code>, regardless of
|
||||
efficiency. As it turns out, this last option is available,
|
||||
although the exact mechanism has evolved with time.
|
||||
</p>
|
||||
|
||||
<p> For GCC releases from 2.95 through the 3.1 series, defining
|
||||
<code>__USE_MALLOC</code> on the gcc command line would change the
|
||||
default allocation strategy to instead use <code> malloc</code> and
|
||||
<code> free</code>. See
|
||||
<a href="../23_containers/howto.html#3">this note</a>
|
||||
for details as to why this was something needing improvement.
|
||||
</p>
|
||||
|
||||
<p>Starting with GCC 3.2, and continued in the 3.3 series, to
|
||||
globally disable memory caching within the library for the
|
||||
default allocator, merely set GLIBCPP_FORCE_NEW (at this time,
|
||||
with any value) in the system's environment before running the
|
||||
program. If your program crashes with GLIBCPP_FORCE_NEW in the
|
||||
environment, it likely means that you linked against objects
|
||||
built against the older library. Code to support this extension
|
||||
is fully compatible with 3.2 code if GLIBCPP_FORCE_NEW is not in
|
||||
the environment.
|
||||
</p>
|
||||
|
||||
<p> As it turns out, the 3.4 code base continues to use this
|
||||
mechanism, only the environment variable has been changed to
|
||||
GLIBCXX_FORCE_NEW.
|
||||
</p>
|
||||
|
||||
<h3 class="left">
|
||||
<a name="ext allocators">Other allocators</a>
|
||||
</h3>
|
||||
<p> Several other allocators are provided as part of this
|
||||
implementation. The location of the extension allocators and their
|
||||
names have changed, but in all cases, functionality is
|
||||
equivalent. Starting with gcc-3.4, all extension allocators are
|
||||
standard style. Before this point, SGI style was the norm. Because of
|
||||
this, the number of template arguments also changed. Here's a simple
|
||||
chart to track the changes.
|
||||
</p>
|
||||
|
||||
<table title="extension allocators" border="1">
|
||||
<tr>
|
||||
<th>Allocator (3.4)</th>
|
||||
<th>Header (3.4)</th>
|
||||
<th>Allocator (3.[0-3])</th>
|
||||
<th>Header (3.[0-3])</th>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>__gnu_cxx::new_allocator<T></td>
|
||||
<td><ext/new_allocator.h></td>
|
||||
<td>std::__new_alloc</td>
|
||||
<td><memory></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>__gnu_cxx::malloc_allocator<T></td>
|
||||
<td><ext/malloc_allocator.h></td>
|
||||
<td>std::__malloc_alloc_template<int></td>
|
||||
<td><memory></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>__gnu_cxx::debug_allocator<T></td>
|
||||
<td><ext/debug_allocator.h></td>
|
||||
<td>std::debug_alloc<T></td>
|
||||
<td><memory></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>__gnu_cxx::__pool_alloc<bool, int></td>
|
||||
<td><ext/pool_allocator.h></td>
|
||||
<td>std::__default_alloc_template<bool,int></td>
|
||||
<td><memory></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>__gnu_cxx::__mt_alloc<T></td>
|
||||
<td><ext/mt_allocator.h></td>
|
||||
<td></td>
|
||||
<td></td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<p>More details on each of these allocators follows. </p>
|
||||
<ul>
|
||||
<li><code>new_allocator</code>
|
||||
<p>Simply wraps <code>::operator new</code>
|
||||
and <code>::operator delete</code>.
|
||||
</p>
|
||||
</li>
|
||||
<li><code>malloc_allocator</code>
|
||||
<p>Simply wraps
|
||||
<code>malloc</code> and <code>free</code>. There is also a hook
|
||||
for an out-of-memory handler (for new/delete this is taken care of
|
||||
elsewhere).
|
||||
</p>
|
||||
</li>
|
||||
<li><code>debug_allocator</code>
|
||||
<p> A wrapper around an
|
||||
arbitrary allocator A. It passes on slightly increased size
|
||||
requests to A, and uses the extra memory to store size information.
|
||||
When a pointer is passed to <code>deallocate()</code>, the stored
|
||||
size is checked, and assert() is used to guarantee they match.
|
||||
</p>
|
||||
</li>
|
||||
<li><code>__pool_alloc</code>
|
||||
<p> A high-performance, single pool allocator. The reusable
|
||||
memory is shared among identical instantiations of this type.
|
||||
It calls through <code>::operator new</code> to obtain new memory
|
||||
when its lists run out. If a client container requests a block
|
||||
larger than a certain threshold size, then the pool is bypassed,
|
||||
and the allocate/deallocate request is passed to
|
||||
<code>::operator new</code> directly. </p>
|
||||
|
||||
<p> This class take a boolean template parameter, called
|
||||
<code>thr</code>, and an integer template parameter, called
|
||||
<code>inst</code>.
|
||||
</p>
|
||||
<p>The <code>inst</code> number is used to track additional memory
|
||||
pools. The point of the number is to allow multiple
|
||||
instantiations of the classes without changing the semantics at
|
||||
all. All three of
|
||||
</p>
|
||||
|
||||
<pre>
|
||||
typedef __pool_alloc<true,0> normal;
|
||||
typedef __pool_alloc<true,1> private;
|
||||
typedef __pool_alloc<true,42> also_private;</pre>
|
||||
<p>behave exactly the same way. However, the memory pool for each type
|
||||
(and remember that different instantiations result in different types)
|
||||
remains separate.
|
||||
</p>
|
||||
<p>The library uses <strong>0</strong> in all its instantiations. If you
|
||||
wish to keep separate free lists for a particular purpose, use a
|
||||
different number.
|
||||
</p>
|
||||
<p>The <code>thr</code> boolean determines whether the pool should
|
||||
be manipulated atomically or not. When thr=true, the allocator
|
||||
is is threadsafe, while thr=false, and is slightly faster but
|
||||
unsafe for multiple threads.
|
||||
</p>
|
||||
<p>(Note that the GCC thread abstraction layer allows us to provide safe
|
||||
zero-overhead stubs for the threading routines, if threads were
|
||||
disabled at configuration time.)
|
||||
</p>
|
||||
|
||||
</li>
|
||||
|
||||
<li><code>__mt_alloc</code>
|
||||
<p>A high-performance
|
||||
fixed-size allocator. It has its own documentation, found <a
|
||||
href="../ext/mt_allocator.html">here</a>.
|
||||
</p>
|
||||
</li>
|
||||
</ul>
|
||||
|
||||
|
||||
<h3 class="left">
|
||||
<a name="using custom allocators">Using a specific allocator</a>
|
||||
</h3>
|
||||
<p>You can specify different memory management schemes on a
|
||||
per-container basis, by overriding the default
|
||||
<code>Allocator</code> template parameter. For example, an easy
|
||||
(but non-portable) method of specifying that only malloc/free
|
||||
should be used instead of the default node allocator is:
|
||||
</p>
|
||||
<pre>
|
||||
std::list <int, __gnu_cxx::malloc_allocator<int> > malloc_list;</pre>
|
||||
Likewise, a debugging form of whichever allocator is currently in use:
|
||||
<pre>
|
||||
std::deque <int, __gnu_cxx::debug_allocator<std::allocator<int> > > debug_deque;</pre>
|
||||
|
||||
|
||||
<h3 class="left">
|
||||
<a name="custom allocators">Writing custom allocators</a>
|
||||
</h3>
|
||||
<p> Writing a portable C++ allocator would dictate that the
|
||||
interface would look much like the one specified for <code>
|
||||
std::allocator</code>. Additional member functions, but not
|
||||
subtractions, would be permissible.
|
||||
</p>
|
||||
|
||||
<p> Probably the best place to start would be to copy one of the
|
||||
extension allocators already shipped with libstdc++: say, <code>
|
||||
new_allocator </code>.
|
||||
</p>
|
||||
|
||||
|
||||
<h3 class="left">
|
||||
<a name="biblio">Bibliography / Further Reading</a>
|
||||
</h3>
|
||||
<p>
|
||||
ISO/IEC 14882:1998 Programming languages - C++ [20.4 Memory]
|
||||
</p>
|
||||
|
||||
<p>
|
||||
Austern, Matt, C/C++ Users Journal.
|
||||
<a href="http://www.cuj.com/documents/s=8000/cujcexp1812austern/">The Standard Librarian: What Are Allocators Good
|
||||
For?</a>
|
||||
</p>
|
||||
|
||||
<p>
|
||||
Berger, Emery,
|
||||
<a href="http://www.cs.umass.edu/~emery/hoard/"> The Hoard memory allocator </a>
|
||||
</p>
|
||||
|
||||
<p>
|
||||
Berger, Emery with Ben Zorn & Kathryn McKinley, OOPSLA 2002
|
||||
<a href="http://www.cs.umass.edu/~emery/pubs/berger-oopsla2002.pdf">Reconsidering Custom Memory Allocation</a>
|
||||
</p>
|
||||
|
||||
<p>
|
||||
Kreft, Klaus and Angelika Langer, C++ Report, June 1998
|
||||
<a href="http://www.langer.camelot.de/Articles/C++Report/Allocators/Allocators.html">Allocator Types</a>
|
||||
</p>
|
||||
|
||||
<p>
|
||||
Stroustrup, Bjarne, 19.4 Allocators, The C++ Programming
|
||||
Language, Special Edition, Addison Wesley, Inc. 2000
|
||||
</p>
|
||||
|
||||
<p>
|
||||
Yen, Felix, <a href="http://home.earthlink.net/~brimar/yalloc/">Yalloc: A Recycling C++ Allocator</a>
|
||||
</p>
|
||||
|
||||
<hr />
|
||||
<p>Return <a href="#top">to the top of the page</a> or
|
||||
<a href="http://gcc.gnu.org/libstdc++/">to the libstdc++ homepage</a>.
|
||||
</p>
|
||||
|
||||
|
||||
<!-- ####################################################### -->
|
||||
|
||||
<hr />
|
||||
<p class="fineprint"><em>
|
||||
See <a href="../17_intro/license.html">license.html</a> for copying conditions.
|
||||
Comments and suggestions are welcome, and may be sent to
|
||||
<a href="mailto:libstdc++@gcc.gnu.org">the libstdc++ mailing list</a>.
|
||||
</em></p>
|
||||
|
||||
|
||||
</body>
|
||||
</html>
|
@ -219,7 +219,7 @@
|
||||
<hr />
|
||||
<h2><a name="5">Memory allocators</a></h2>
|
||||
<p>The available free store ("heap") management classes are
|
||||
described <a href="../ext/howto.html">here</a>.
|
||||
described <a href="allocator.html">here</a>.
|
||||
</p>
|
||||
<p>Return <a href="#top">to top of page</a> or
|
||||
<a href="../faq/index.html">to the FAQ</a>.
|
||||
|
@ -126,8 +126,8 @@
|
||||
<li><a href="18_support/howto.html#2">Implementation properties</a></li>
|
||||
<li><a href="18_support/howto.html#3">Start and Termination</a></li>
|
||||
<li><a href="18_support/howto.html#4">Verbose <code>terminate</code></a></li>
|
||||
<li><a href="18_support/howto.html#6">Dynamic memory management</a></li>
|
||||
<li><a href="18_support/howto.html#7">RTTI, the ABI, and demangling</a></li>
|
||||
<li><a href="18_support/howto.html#5">Dynamic memory management</a></li>
|
||||
<li><a href="18_support/howto.html#6">RTTI, the ABI, and demangling</a></li>
|
||||
</ul>
|
||||
</li>
|
||||
|
||||
@ -145,7 +145,7 @@
|
||||
<li><a href="20_util/howto.html#2"><code>auto_ptr</code> inside container classes</a></li>
|
||||
<li><a href="20_util/howto.html#3">Functors</a></li>
|
||||
<li><a href="20_util/howto.html#4">Pairs</a></li>
|
||||
<li><a href="20_util/howto.html#5">Memory allocators</a></li>
|
||||
<li><a href="20_util/allocator.html">Allocators and allocation</a></li>
|
||||
</ul>
|
||||
</li>
|
||||
|
||||
@ -226,8 +226,7 @@
|
||||
<ul>
|
||||
<li><a href="ext/howto.html#1">Ropes and trees and hashes, oh my!</a></li>
|
||||
<li><a href="ext/howto.html#2">Added members and types</a></li>
|
||||
<li><a href="ext/howto.html#3">Allocators (versions 3.0, 3.1, 3.2, 3.3)</a></li>
|
||||
<li><a href="ext/howto.html#6">Allocators (version 3.4)</a></li>
|
||||
<li><a href="ext/mt_allocator.html"><code>__mt_alloc</code> </a></li>
|
||||
<li><a href="ext/howto.html#4">Compile-time checks</a></li>
|
||||
<li><a href="ext/howto.html#5">LWG Issues</a></li>
|
||||
<li><a href="ext/../18_support/howto.html#5">Demangling</a></li>
|
||||
|
@ -48,8 +48,7 @@
|
||||
<ul>
|
||||
<li><a href="#1">Ropes and trees and hashes, oh my!</a></li>
|
||||
<li><a href="#2">Added members and types</a></li>
|
||||
<li><a href="#3">Allocators (versions 3.0, 3.1, 3.2, 3.3)</a></li>
|
||||
<li><a href="#6">Allocators (version 3.4)</a></li>
|
||||
<li><a href="mt_allocator.html"><code>__mt_alloc</code> </a></li>
|
||||
<li><a href="#4">Compile-time checks</a></li>
|
||||
<li><a href="#5">LWG Issues</a></li>
|
||||
<li><a href="../18_support/howto.html#5">Demangling</a></li>
|
||||
@ -140,247 +139,6 @@
|
||||
<a href="../faq/index.html">to the FAQ</a>.
|
||||
</p>
|
||||
|
||||
<hr />
|
||||
<h2><a name="3">Allocators (versions 3.0, 3.1, 3.2, 3.3)</a></h2>
|
||||
<p>Thread-safety, space efficiency, high speed, portability... this is a
|
||||
mess. Where to begin?
|
||||
</p>
|
||||
<h3>The Rules</h3>
|
||||
<p>The C++ standard only gives a few directives in this area:
|
||||
</p>
|
||||
<ul>
|
||||
<li>When you add elements to a container, and the container must allocate
|
||||
more memory to hold them, the container makes the request via its
|
||||
<code>Allocator</code> template parameter. This includes adding
|
||||
char's to the string class, which acts as a regular STL container
|
||||
in this respect.
|
||||
</li>
|
||||
<li>The default <code>Allocator</code> of every container-of-T is
|
||||
<code>std::allocator<T></code>.
|
||||
</li>
|
||||
<li>The interface of the <code>allocator<T></code> class is
|
||||
extremely simple. It has about 20 public declarations (nested
|
||||
typedefs, member functions, etc), but the two which concern us most
|
||||
are:
|
||||
<pre>
|
||||
T* allocate (size_type n, const void* hint = 0);
|
||||
void deallocate (T* p, size_type n);</pre>
|
||||
(This is a simplicifcation; the real signatures use nested typedefs.)
|
||||
The <code>"n"</code> arguments in both those functions is a
|
||||
<em>count</em> of the number of T's to allocate space for,
|
||||
<em>not their total size</em>.
|
||||
</li>
|
||||
<li>"The storage is obtained by calling
|
||||
<code>::operator new(size_t)</code>, but it is unspecified when or
|
||||
how often this function is called. The use of <code>hint</code>
|
||||
is unspecified, but intended as an aid to locality if an
|
||||
implementation so desires." [20.4.1.1]/6
|
||||
</li>
|
||||
</ul>
|
||||
<h3>Problems and Possibilities</h3>
|
||||
<p>The easiest way of fulfilling the requirements is to call operator new
|
||||
each time a container needs memory, and to call operator delete each
|
||||
time the container releases memory. <strong>BUT</strong>
|
||||
<a href="http://gcc.gnu.org/ml/libstdc++/2001-05/msg00105.html">this
|
||||
method is horribly slow</a>.
|
||||
</p>
|
||||
<p>Or we can keep old memory around, and reuse it in a pool to save time.
|
||||
The old libstdc++-v2 used a memory pool, and so do we. As of 3.0,
|
||||
<a href="http://gcc.gnu.org/ml/libstdc++/2001-05/msg00136.html">it's
|
||||
on by default</a>. The pool is shared among all the containers in the
|
||||
program: when your program's std::vector<int> gets cut in half
|
||||
and frees a bunch of its storage, that memory can be reused by the
|
||||
private std::list<WonkyWidget> brought in from a KDE library
|
||||
that you linked against. And we don't have to call operators new and
|
||||
delete to pass the memory on, either, which is a speed bonus.
|
||||
<strong>BUT</strong>...
|
||||
</p>
|
||||
<p>What about threads? No problem: in a threadsafe environment, the
|
||||
memory pool is manipulated atomically, so you can grow a container in
|
||||
one thread and shrink it in another, etc. <strong>BUT</strong> what
|
||||
if threads in libstdc++-v3 aren't set up properly?
|
||||
<a href="../faq/index.html#5_6">That's been answered already</a>.
|
||||
</p>
|
||||
<p><strong>BUT</strong> what if you want to use your own allocator? What
|
||||
if you plan on using a runtime-loadable version of malloc() which uses
|
||||
shared telepathic anonymous mmap'd sections serializable over a
|
||||
network, so that memory requests <em>should</em> go through malloc?
|
||||
And what if you need to debug it?
|
||||
</p>
|
||||
<p>Well then:
|
||||
</p>
|
||||
<h3>Available allocators in namespace std</h3>
|
||||
<p>First I'll describe the situation as it exists for the code which
|
||||
was released in GCC 3.1 and 3.2. Then I'll describe the differences
|
||||
for 3.0. The allocator classes also have source documentation,
|
||||
which is described <a href="../documentation.html#4">here</a> (you
|
||||
will need to retrieve the maintainer-level docs, as almost none of
|
||||
these entities are in the ISO standard).
|
||||
</p>
|
||||
<p>As a general rule of thumb, users are not allowed to use names which
|
||||
begin with an underscore. This means that to be portable between
|
||||
compilers, none of the following may be used in your program directly.
|
||||
(If you decide to be unportable, then you're free do do what you want,
|
||||
but it's not our fault if stuff breaks.) They are presented here for
|
||||
information for maintainers and contributors in addition to users.
|
||||
</p>
|
||||
<p>These classes are always available:
|
||||
</p>
|
||||
<ul>
|
||||
<li><code>__new_alloc</code> simply wraps <code>::operator new</code>
|
||||
and <code>::operator delete</code>.
|
||||
</li>
|
||||
<li><code>__malloc_alloc_template<int inst></code> simply wraps
|
||||
<code>malloc</code> and <code>free</code>. There is also a hook
|
||||
for an out-of-memory handler (for new/delete this is taken care of
|
||||
elsewhere). The <code>inst</code> parameter is described below.
|
||||
This class was called <code>malloc_alloc</code> in earlier versions.
|
||||
</li>
|
||||
<li><code>allocator<T></code> has already been described; it is
|
||||
The Standard Allocator for instances of T. It uses the internal
|
||||
<code>__alloc</code> typedef (see below) to satisy its requests.
|
||||
</li>
|
||||
<li><code>__simple_alloc<T,A></code> is a wrapper around another
|
||||
allocator, A, which itself is an allocator for instances of T.
|
||||
This is primarily used in an internal "allocator traits"
|
||||
class which helps encapsulate the different styles of allocators.
|
||||
</li>
|
||||
<li><code>__debug_alloc<A></code> is also a wrapper around an
|
||||
arbitrary allocator A. It passes on slightly increased size
|
||||
requests to A, and uses the extra memory to store size information.
|
||||
When a pointer is passed to <code>deallocate()</code>, the stored
|
||||
size is checked, and assert() is used to guarantee they match.
|
||||
</li>
|
||||
<li><code>__allocator<T,A></code> is an adaptor. Many of these
|
||||
allocator classes have a consistent yet non-standard interface.
|
||||
Such classes can be changed to a conforming interface with this
|
||||
wrapper: <code>__allocator<T, __alloc></code> is thus the
|
||||
same as <code>allocator<T></code>.
|
||||
</li>
|
||||
</ul>
|
||||
<p>Normally,
|
||||
<code> __default_alloc_template<bool thr, int inst> </code>
|
||||
is also available. This is the high-speed pool, called the default
|
||||
node allocator. The reusable memory is shared among identical
|
||||
instantiations of
|
||||
this type. It calls through <code>__new_alloc</code> to obtain
|
||||
new memory when its lists run out. If a client container requests a
|
||||
block larger than a certain threshold size, then the pool is bypassed,
|
||||
and the allocate/deallocate request is passed to
|
||||
<code>__new_alloc</code> directly.
|
||||
</p>
|
||||
<p>Its <code>inst</code> parameter is described below. The
|
||||
<code>thr</code> boolean determines whether the pool should be
|
||||
manipulated atomically or not. Two typedefs are provided:
|
||||
<code>__alloc</code> is defined as this node allocator with thr=true,
|
||||
and therefore is threadsafe, while <code>__single_client_alloc</code>
|
||||
defines thr=false, and is slightly faster but unsafe for multiple
|
||||
threads.
|
||||
</p>
|
||||
<p>(Note that the GCC thread abstraction layer allows us to provide safe
|
||||
zero-overhead stubs for the threading routines, if threads were
|
||||
disabled at configuration time. In this situation,
|
||||
<code>__alloc</code> should not be noticably slower than
|
||||
<code>__single_client_alloc</code>.)
|
||||
</p>
|
||||
<p>[Another threadsafe allocator where each thread keeps its own free
|
||||
list, so that no locking is needed, might be described here.]
|
||||
</p>
|
||||
<h3>A cannon to swat a fly:<code> __USE_MALLOC</code></h3>
|
||||
<p>If you've already read <a href="../23_containers/howto.html#3">this
|
||||
advice</a> but still think you remember how to use this macro from
|
||||
SGI STL days. We have removed it in gcc 3.3. See next section
|
||||
for the new way to get the same effect.
|
||||
</p>
|
||||
<h3>Globally disabling memory caching:<code> GLIBCXX_FORCE_NEW</code></h3>
|
||||
<p>Starting with gcc 3.3, if you want to globally disable memory
|
||||
caching within the library for the default allocator (i.e.
|
||||
the one you get for all library objects when you do not specify
|
||||
which one to use), merely set GLIBCXX_FORCE_NEW (at this time,
|
||||
with any value) into your environment before running the
|
||||
program. You will obtain a similar effect without having to
|
||||
recompile your entire program and the entire library (the new
|
||||
operator in gcc is a light wrapper around malloc). If your
|
||||
program crashes with GLIBCXX_FORCE_NEW in the environment,
|
||||
it likely means that you linked against objects built against
|
||||
the older library. Code to support this extension is fully
|
||||
compatible with 3.2 code if GLIBCXX_FORCE_NEW is not in the
|
||||
environment. Prior to GCC 3.4, this variable was spelt
|
||||
GLIBCPP_FORCE_NEW.
|
||||
</p>
|
||||
<h3>Writing your own allocators</h3>
|
||||
<p>Depending on your application (a specific program, a generic library,
|
||||
etc), allocator classes tend to be one of two styles: "SGI"
|
||||
or "standard". See the comments in stl_alloc.h for more
|
||||
information on this crucial difference.
|
||||
</p>
|
||||
<p>At the bottom of that header is a helper type,
|
||||
<code>_Alloc_traits</code>, and various specializations of it. This
|
||||
allows the container classes to make possible compile-time
|
||||
optimizations based on features of the allocator. You should provide
|
||||
a specialization of this type for your allocator (doing so takes only
|
||||
two or three statements).
|
||||
</p>
|
||||
<h3>Using non-default allocators</h3>
|
||||
<p>You can specify different memory management schemes on a per-container
|
||||
basis, by overriding the default <code>Allocator</code> template
|
||||
parameter. For example, an easy
|
||||
(but nonportable)
|
||||
method of specifying that only malloc/free should be used instead of
|
||||
the default node allocator is:
|
||||
</p>
|
||||
<pre>
|
||||
std::list <my_type, std::__malloc_alloc_template<0> > my_malloc_based_list;</pre>
|
||||
Likewise, a debugging form of whichever allocator is currently in use:
|
||||
<pre>
|
||||
std::deque <my_type, std::__debug_alloc<std::__alloc> > debug_deque;</pre>
|
||||
<h3><code>inst</code></h3>
|
||||
<p>The <code>__malloc_alloc_template</code> and
|
||||
<code>__default_alloc_template</code> classes take an integer parameter,
|
||||
called inst here. This number is completely unused.
|
||||
</p>
|
||||
<p>The point of the number is to allow multiple instantiations of the
|
||||
classes without changing the semantics at all. All three of
|
||||
</p>
|
||||
<pre>
|
||||
typedef __default_alloc_template<true,0> normal;
|
||||
typedef __default_alloc_template<true,1> private;
|
||||
typedef __default_alloc_template<true,42> also_private;</pre>
|
||||
<p>behave exactly the same way. However, the memory pool for each type
|
||||
(and remember that different instantiations result in different types)
|
||||
remains separate.
|
||||
</p>
|
||||
<p>The library uses <strong>0</strong> in all its instantiations. If you
|
||||
wish to keep separate free lists for a particular purpose, use a
|
||||
different number.
|
||||
</p>
|
||||
<h3>3.0.x</h3>
|
||||
<p>For 3.0.x, many of the names were incorrectly <em>not</em> prefixed
|
||||
with underscores. So symbols such as "std::single_client_alloc"
|
||||
are present. Be very careful to not depend on these names any more
|
||||
than you would depend on implementation-only names.
|
||||
</p>
|
||||
<p>Certain macros like <code>_NOTHREADS</code> and <code>__STL_THREADS</code>
|
||||
can affect the 3.0.x allocators. Do not use them. Those macros have
|
||||
been completely removed for 3.1.
|
||||
</p>
|
||||
<p>Return <a href="#top">to top of page</a> or
|
||||
<a href="../faq/index.html">to the FAQ</a>.
|
||||
</p>
|
||||
|
||||
<hr />
|
||||
<h2><a name="6">Allocators (version 3.4)</a></h2>
|
||||
<p>Changes are coming...
|
||||
</p>
|
||||
<p>If you plan on writing your own allocators,
|
||||
<a href="../documentation.html#4">source documentation</a> is
|
||||
available. You'll need to get the "maintainers" collection
|
||||
in order to see the helper classes and extra notes.
|
||||
</p>
|
||||
<p>Return <a href="#top">to top of page</a> or
|
||||
<a href="../faq/index.html">to the FAQ</a>.
|
||||
</p>
|
||||
|
||||
<hr />
|
||||
<h2><a name="4">Compile-time checks</a></h2>
|
||||
<p>Currently libstdc++-v3 uses the concept checkers from the Boost
|
||||
|
414
libstdc++-v3/docs/html/ext/mt_allocator.html
Normal file
414
libstdc++-v3/docs/html/ext/mt_allocator.html
Normal file
@ -0,0 +1,414 @@
|
||||
<?xml version="1.0" encoding="ISO-8859-1"?>
|
||||
<!DOCTYPE html
|
||||
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
|
||||
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
|
||||
|
||||
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
|
||||
<head>
|
||||
<meta name="AUTHOR" content="Stefan Olsson <stefan@xapa.se>" />
|
||||
<meta name="KEYWORDS" content="c++, libstdc++, g++, allocator, memory" />
|
||||
<meta name="DESCRIPTION" content="Allocators and allocation" />
|
||||
<meta name="GENERATOR" content="emacs and ten fingers" />
|
||||
<title>A fixed-size, multi-thread optimized allocator</title>
|
||||
<link rel="StyleSheet" href="../lib3styles.css" type="text/css" />
|
||||
<link rel="Copyright" href="../17_intro/license.html" type="text/html" />
|
||||
</head>
|
||||
<body>
|
||||
|
||||
<h1 class="centered"><a name="top">A fixed-size, multi-thread optimized allocator</a></h1>
|
||||
|
||||
<p class="fineprint"><em>
|
||||
The latest version of this document is always available at
|
||||
<a href="http://gcc.gnu.org/onlinedocs/libstdc++/ext/mt_allocator.html">
|
||||
http://gcc.gnu.org/onlinedocs/libstdc++/ext/mt_allocator.html</a>.
|
||||
</em></p>
|
||||
|
||||
<p><em>
|
||||
To the <a href="http://gcc.gnu.org/libstdc++/">libstdc++-v3 homepage</a>.
|
||||
</em></p>
|
||||
|
||||
<!-- ####################################################### -->
|
||||
<hr />
|
||||
<h3 class="left">
|
||||
<a name="intro">Introduction</a>
|
||||
</h3>
|
||||
|
||||
<p> The mt allocator [hereinafter referred to simply as "the
|
||||
allocator"] is a fixed size (power of two) allocator that was
|
||||
initially developed specifically to suit the needs of multi threaded
|
||||
applications [hereinafter referred to as an MT application]. Over time
|
||||
the allocator has evolved and been improved in many ways, one of the
|
||||
being that it now also does a good job in single threaded applications
|
||||
[hereinafter referred to as a ST application]. (Note: In this
|
||||
document, when referring to single threaded applications this also
|
||||
includes applications that are compiled with gcc without thread
|
||||
support enabled. This is accomplished using ifdef's on __GTHREADS)
|
||||
</p>
|
||||
|
||||
<p>
|
||||
The aim of this document is to describe - from a application point of
|
||||
view - the "inner workings" of the allocator.
|
||||
</p>
|
||||
|
||||
|
||||
<h3 class="left">
|
||||
<a name="init">Initialization</a>
|
||||
</h3>
|
||||
|
||||
<p>
|
||||
The static variables (pointers to freelists, tuning parameters etc)
|
||||
are initialized to their default values at file scope, i.e.:
|
||||
</p>
|
||||
|
||||
<pre>
|
||||
template<typename _Tp> size_t
|
||||
__mt_alloc<_Tp>::_S_freelist_headroom = 10;
|
||||
</pre>
|
||||
|
||||
<p>
|
||||
The very first allocate() call will always call the _S_init() function.
|
||||
In order to make sure that this function is called exactly once we make use
|
||||
of a __gthread_once (with _S_once_mt and _S_init as arguments) call in MT
|
||||
applications and check a static bool (_S_initialized) in ST applications.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
The _S_init() function:
|
||||
- If the GLIBCXX_FORCE_NEW environment variable is set, it sets the bool
|
||||
_S_force_new to true and then returns. This will cause subsequent calls to
|
||||
allocate() to return memory directly from a new() call, and deallocate will
|
||||
only do a delete() call.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
- If the GLIBCXX_FORCE_NEW environment variable is not set, both ST and MT
|
||||
applications will:
|
||||
- Calculate the number of bins needed. A bin is a specific power of two size
|
||||
of bytes. I.e., by default the allocator will deal with requests of up to
|
||||
128 bytes (or whatever the value of _S_max_bytes is when _S_init() is
|
||||
called). This means that there will be bins of the following sizes
|
||||
(in bytes): 1, 2, 4, 8, 16, 32, 64, 128.
|
||||
|
||||
- Create the _S_binmap array. All requests are rounded up to the next
|
||||
"large enough" bin. I.e., a request for 29 bytes will cause a block from
|
||||
the "32 byte bin" to be returned to the application. The purpose of
|
||||
_S_binmap is to speed up the process of finding out which bin to use.
|
||||
I.e., the value of _S_binmap[ 29 ] is initialized to 5 (bin 5 = 32 bytes).
|
||||
</p>
|
||||
<p>
|
||||
- Create the _S_bin array. This array consists of bin_records. There will be
|
||||
as many bin_records in this array as the number of bins that we calculated
|
||||
earlier. I.e., if _S_max_bytes = 128 there will be 8 entries.
|
||||
Each bin_record is then initialized:
|
||||
- bin_record->first = An array of pointers to block_records. There will be
|
||||
as many block_records pointers as there are maximum number of threads
|
||||
(in a ST application there is only 1 thread, in a MT application there
|
||||
are _S_max_threads).
|
||||
This holds the pointer to the first free block for each thread in this
|
||||
bin. I.e., if we would like to know where the first free block of size 32
|
||||
for thread number 3 is we would look this up by: _S_bin[ 5 ].first[ 3 ]
|
||||
- bin_record->last = See above, the only difference being that this points
|
||||
to the last record on the same freelist.
|
||||
|
||||
The above created block_record pointers members are now initialized to
|
||||
their initial values. I.e. _S_bin[ n ].first[ n ] = NULL;
|
||||
</p>
|
||||
|
||||
<p>
|
||||
- Additionally a MT application will:
|
||||
- Create a list of free thread id's. The pointer to the first entry
|
||||
is stored in _S_thread_freelist_first. The reason for this approach is
|
||||
that the __gthread_self() call will not return a value that corresponds to
|
||||
the maximum number of threads allowed but rather a process id number or
|
||||
something else. So what we do is that we create a list of thread_records.
|
||||
This list is _S_max_threads long and each entry holds a size_t thread_id
|
||||
which is initialized to 1, 2, 3, 4, 5 and so on up to _S_max_threads.
|
||||
Each time a thread calls allocate() or deallocate() we call
|
||||
_S_get_thread_id() which looks at the value of _S_thread_key which is a
|
||||
thread local storage pointer. If this is NULL we know that this is a newly
|
||||
created thread and we pop the first entry from this list and saves the
|
||||
pointer to this record in the _S_thread_key variable. The next time
|
||||
we will get the pointer to the thread_record back and we use the
|
||||
thread_record->thread_id as identification. I.e., the first thread that
|
||||
calls allocate will get the first record in this list and thus be thread
|
||||
number 1 and will then find the pointer to its first free 32 byte block
|
||||
in _S_bin[ 5 ].first[ 1 ]
|
||||
When we create the _S_thread_key we also define a destructor
|
||||
(_S_thread_key_destr) which means that when the thread dies, this
|
||||
thread_record is returned to the front of this list and the thread id
|
||||
can then be reused if a new thread is created.
|
||||
This list is protected by a mutex (_S_thread_freelist_mutex) which is only
|
||||
locked when records are removed/added to the list.
|
||||
</p>
|
||||
<p>
|
||||
- Initialize the free and used counters of each bin_record:
|
||||
- bin_record->free = An array of size_t. This keeps track of the number
|
||||
of blocks on a specific thread's freelist in each bin. I.e., if a thread
|
||||
has 12 32-byte blocks on it's freelists and allocates one of these, this
|
||||
counter would be decreased to 11.
|
||||
|
||||
- bin_record->used = An array of size_t. This keeps track of the number
|
||||
of blocks currently in use of this size by this thread. I.e., if a thread
|
||||
has made 678 requests (and no deallocations...) of 32-byte blocks this
|
||||
counter will read 678.
|
||||
|
||||
The above created arrays are now initialized with their initial values.
|
||||
I.e. _S_bin[ n ].free[ n ] = 0;
|
||||
</p>
|
||||
<p>
|
||||
- Initialize the mutex of each bin_record:
|
||||
The bin_record->mutex is used to protect the global freelist. This concept
|
||||
of a global freelist is explained in more detail in the section
|
||||
"A multi threaded example", but basically this mutex is locked whenever
|
||||
a block of memory is retrieved or returned to the global freelist for this
|
||||
specific bin. This only occurs when a number of blocks are grabbed from the
|
||||
global list to a thread specific list or when a thread decides to return
|
||||
some blocks to the global freelist.
|
||||
</p>
|
||||
|
||||
<h3 class="left">
|
||||
<a name="st_example">A single threaded example (and a primer for the multi threaded example!)</a>
|
||||
</h3>
|
||||
|
||||
<p>
|
||||
Let's start by describing how the data on a freelist is laid out in memory.
|
||||
This is the first two blocks in freelist for thread id 3 in bin 3 (8 bytes):
|
||||
</p>
|
||||
<pre>
|
||||
+----------------+
|
||||
| next* ---------|--+ (_S_bin[ 3 ].first[ 3 ] points here)
|
||||
| | |
|
||||
| | |
|
||||
| | |
|
||||
+----------------+ |
|
||||
| thread_id = 3 | |
|
||||
| | |
|
||||
| | |
|
||||
| | |
|
||||
+----------------+ |
|
||||
| DATA | | (A pointer to here is what is returned to the
|
||||
| | | the application when needed)
|
||||
| | |
|
||||
| | |
|
||||
| | |
|
||||
| | |
|
||||
| | |
|
||||
| | |
|
||||
+----------------+ |
|
||||
+----------------+ |
|
||||
| next* |<-+ (If next == NULL it's the last one on the list and
|
||||
| | then the _S_bin[ 3 ].last[ 3 ] pointer points to
|
||||
| | here as well)
|
||||
| |
|
||||
+----------------+
|
||||
| thread_id = 3 |
|
||||
| |
|
||||
| |
|
||||
| |
|
||||
+----------------+
|
||||
| DATA |
|
||||
| |
|
||||
| |
|
||||
| |
|
||||
| |
|
||||
| |
|
||||
| |
|
||||
| |
|
||||
+----------------+
|
||||
</pre>
|
||||
|
||||
<p>
|
||||
With this in mind we simplify things a bit for a while and say that there is
|
||||
only one thread (a ST application). In this case all operations are made to
|
||||
what is referred to as the global pool - thread id 0 (No thread may be
|
||||
assigned this id since they span from 1 to _S_max_threads in a MT application).
|
||||
</p>
|
||||
<p>
|
||||
When the application requests memory (calling allocate()) we first look at the
|
||||
requested size and if this is > _S_max_bytes we call new() directly and return.
|
||||
</p>
|
||||
<p>
|
||||
If the requested size is within limits we start by finding out from which
|
||||
bin we should serve this request by looking in _S_binmap.
|
||||
</p>
|
||||
<p>
|
||||
A quick look at _S_bin[ bin ].first[ 0 ] tells us if there are any blocks of
|
||||
this size on the freelist (0). If this is not NULL - fine, just remove the
|
||||
block that _S_bin[ bin ].first[ 0 ] points to from the list,
|
||||
update _S_bin[ bin ].first[ 0 ] and return a pointer to that blocks data.
|
||||
</p>
|
||||
<p>
|
||||
If the freelist is empty (the pointer is NULL) we must get memory from the
|
||||
system and build us a freelist within this memory. All requests for new memory
|
||||
is made in chunks of _S_chunk_size. Knowing the size of a block_record and
|
||||
the bytes that this bin stores we then calculate how many blocks we can create
|
||||
within this chunk, build the list, remove the first block, update the pointers
|
||||
(_S_bin[ bin ].first[ 0 ] and _S_bin[ bin ].last[ 0 ]) and return a pointer
|
||||
to that blocks data.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
Deallocation is equally simple; the pointer is casted back to a block_record
|
||||
pointer, lookup which bin to use based on the size, add the block to the end
|
||||
of the global freelist (with the next pointer set to NULL) and update the
|
||||
pointers as needed (_S_bin[ bin ].first[ 0 ] and _S_bin[ bin ].last[ 0 ]).
|
||||
</p>
|
||||
|
||||
<h3 class="left">
|
||||
<a name="mt_example">A multi threaded example</a>
|
||||
</h3>
|
||||
|
||||
<p>
|
||||
In the ST example we never used the thread_id variable present in each block.
|
||||
Let's start by explaining the purpose of this in a MT application.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
The concept of "ownership" was introduced since many MT applications
|
||||
allocate and deallocate memory to shared containers from different
|
||||
threads (such as a cache shared amongst all threads). This introduces
|
||||
a problem if the allocator only returns memory to the current threads
|
||||
freelist (I.e., there might be one thread doing all the allocation and
|
||||
thus obtaining ever more memory from the system and another thread
|
||||
that is getting a longer and longer freelist - this will in the end
|
||||
consume all available memory).
|
||||
</p>
|
||||
|
||||
<p>
|
||||
Each time a block is moved from the global list (where ownership is
|
||||
irrelevant), to a threads freelist (or when a new freelist is built
|
||||
from a chunk directly onto a threads freelist or when a deallocation
|
||||
occurs on a block which was not allocated by the same thread id as the
|
||||
one doing the deallocation) the thread id is set to the current one.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
What's the use? Well, when a deallocation occurs we can now look at
|
||||
the thread id and find out if it was allocated by another thread id
|
||||
and decrease the used counter of that thread instead, thus keeping the
|
||||
free and used counters correct. And keeping the free and used counters
|
||||
corrects is very important since the relationship between these two
|
||||
variables decides if memory should be returned to the global pool or
|
||||
not when a deallocation occurs.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
When the application requests memory (calling allocate()) we first
|
||||
look at the requested size and if this is > _S_max_bytes we call new()
|
||||
directly and return.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
If the requested size is within limits we start by finding out from which
|
||||
bin we should serve this request by looking in _S_binmap.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
A call to _S_get_thread_id() returns the thread id for the calling thread
|
||||
(and if no value has been set in _S_thread_key, a new id is assigned and
|
||||
returned).
|
||||
</p>
|
||||
|
||||
<p>
|
||||
A quick look at _S_bin[ bin ].first[ thread_id ] tells us if there are
|
||||
any blocks of this size on the current threads freelist. If this is
|
||||
not NULL - fine, just remove the block that _S_bin[ bin ].first[
|
||||
thread_id ] points to from the list, update _S_bin[ bin ].first[
|
||||
thread_id ], update the free and used counters and return a pointer to
|
||||
that blocks data.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
If the freelist is empty (the pointer is NULL) we start by looking at
|
||||
the global freelist (0). If there are blocks available on the global
|
||||
freelist we lock this bins mutex and move up to block_count (the
|
||||
number of blocks of this bins size that will fit into a _S_chunk_size)
|
||||
or until end of list - whatever comes first - to the current threads
|
||||
freelist and at the same time change the thread_id ownership and
|
||||
update the counters and pointers. When the bins mutex has been
|
||||
unlocked, we remove the block that _S_bin[ bin ].first[ thread_id ]
|
||||
points to from the list, update _S_bin[ bin ].first[ thread_id ],
|
||||
update the free and used counters, and return a pointer to that blocks
|
||||
data.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
The reason that the number of blocks moved to the current threads
|
||||
freelist is limited to block_count is to minimize the chance that a
|
||||
subsequent deallocate() call will return the excess blocks to the
|
||||
global freelist (based on the _S_freelist_headroom calculation, see
|
||||
below).
|
||||
</p>
|
||||
|
||||
<p>
|
||||
However if there isn't any memory on the global pool we need to get
|
||||
memory from the system - this is done in exactly the same way as in a
|
||||
single threaded application with one major difference; the list built
|
||||
in the newly allocated memory (of _S_chunk_size size) is added to the
|
||||
current threads freelist instead of to the global.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
The basic process of a deallocation call is simple: always add the
|
||||
block to the end of the current threads freelist and update the
|
||||
counters and pointers (as described earlier with the specific check of
|
||||
ownership that causes the used counter of the thread that originally
|
||||
allocated the block to be decreased instead of the current threads
|
||||
counter).
|
||||
</p>
|
||||
|
||||
<p>
|
||||
And here comes the free and used counters to service. Each time a
|
||||
deallocation() call is made, the length of the current threads
|
||||
freelist is compared to the amount memory in use by this thread.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
Let's go back to the example of an application that has one thread
|
||||
that does all the allocations and one that deallocates. Both these
|
||||
threads use say 516 32-byte blocks that was allocated during thread
|
||||
creation for example. Their used counters will both say 516 at this
|
||||
point. The allocation thread now grabs 1000 32-byte blocks and puts
|
||||
them in a shared container. The used counter for this thread is now
|
||||
1516.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
The deallocation thread now deallocates 500 of these blocks. For each
|
||||
deallocation made the used counter of the allocating thread is
|
||||
decreased and the freelist of the deallocation thread gets longer and
|
||||
longer. But the calculation made in deallocate() will limit the length
|
||||
of the freelist in the deallocation thread to _S_freelist_headroom %
|
||||
of it's used counter. In this case, when the freelist (given that the
|
||||
_S_freelist_headroom is at it's default value of 10%) exceeds 52
|
||||
(516/10) blocks will be returned to the global pool where the
|
||||
allocating thread may pick them up and reuse them.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
In order to reduce lock contention (since this requires this bins
|
||||
mutex to be locked) this operation is also made in chunks of blocks
|
||||
(just like when chunks of blocks are moved from the global freelist to
|
||||
a threads freelist mentioned above). The "formula" used can probably
|
||||
be improved to further reduce the risk of blocks being "bounced back
|
||||
and forth" between freelists.
|
||||
</p>
|
||||
|
||||
<hr />
|
||||
<p>Return <a href="#top">to the top of the page</a> or
|
||||
<a href="http://gcc.gnu.org/libstdc++/">to the libstdc++ homepage</a>.
|
||||
</p>
|
||||
|
||||
|
||||
<!-- ####################################################### -->
|
||||
|
||||
<hr />
|
||||
<p class="fineprint"><em>
|
||||
See <a href="../17_intro/license.html">license.html</a> for copying conditions.
|
||||
Comments and suggestions are welcome, and may be sent to
|
||||
<a href="mailto:libstdc++@gcc.gnu.org">the libstdc++ mailing list</a>.
|
||||
</em></p>
|
||||
|
||||
|
||||
</body>
|
||||
</html>
|
Loading…
Reference in New Issue
Block a user