<html xmlns="http://www.w3.org/1999/xhtml"><head><title>Implementation</title></head><body><div class="section" title="Implementation"><div class="titlepage"><div><div><h2 class="title"><a id="allocator.bitmap.impl"/>Implementation</h2></div></div></div><div class="section" title="Free List Store"><div class="titlepage"><div><div><h3 class="title"><a id="bitmap.impl.free_list_store"/>Free List Store</h3></div></div></div><p>
The Free List Store (referred to as FLS for the rest of this document) is
the global memory pool that is shared by all instances of the bitmapped
allocator, whatever type they are instantiated for. It keeps every free
memory block handed back to it by the bitmapped allocator in sorted order,
and it is also responsible for supplying memory to the bitmapped allocator
when the latter asks for more.
</p><p>
Internally, there is a Free List threshold, which is the maximum number of
free lists that the FLS can hold internally (as a cache). Currently, this
value is set to 64. So, if more than 64 free lists come in, some of them are
given back to the OS using operator delete, so that at any given time the
Free List's size never exceeds 64 entries. This is done because a binary
search is used to locate an entry in the free list when a request for
memory comes along. The run-time cost of that search grows with the number
of entries, but with at most 64 entries lg(64) == 6 comparisons are enough
to locate the correct free list, if it exists.
</p><p>
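As a rough illustration of that lookup (the names and the representation of
a free-list entry used here are assumptions, not the library's actual
internals), a sorted list of free blocks can be searched like this:
</p><pre class="programlisting">
#include &lt;algorithm&gt;
#include &lt;cstddef&gt;
#include &lt;vector&gt;

// Illustrative only: a free-list entry is modelled as a pointer to a block
// whose first size_t holds that block's size in bytes, and the vector is
// kept sorted on that size.
typedef std::vector&lt;std::size_t*&gt; free_list_type;

bool block_smaller(const std::size_t* block, std::size_t size)
{ return *block &lt; size; }

// Binary search over the sorted free list: with at most 64 entries this
// takes no more than lg(64) == 6 comparisons.
std::size_t* find_free_block(free_list_type&amp; fl, std::size_t required_size)
{
  free_list_type::iterator it
    = std::lower_bound(fl.begin(), fl.end(), required_size, block_smaller);
  if (it == fl.end())
    return 0;        // nothing big enough is cached; more memory is needed
  std::size_t* block = *it;
  fl.erase(it);      // the block is handed out, so it leaves the cache
  return block;
}
</pre><p>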
Suppose the free list size has reached its threshold. Then the largest
block, chosen from among those already in the list and the newly returned
block, is given back to the OS. This is done because it reduces external
fragmentation and allows the OS to reuse the larger blocks later in an
orderly fashion, possibly merging them. Also, on some systems, large blocks
are obtained via calls to mmap, so giving them back to free system
resources becomes all the more important.
</p><p>
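A minimal sketch of that policy, using the same illustrative representation
of a free-list entry as above:
</p><pre class="programlisting">
#include &lt;algorithm&gt;
#include &lt;cstddef&gt;
#include &lt;vector&gt;

const std::size_t max_free_lists = 64;   // the Free List threshold

bool block_less(const std::size_t* a, const std::size_t* b)
{ return *a &lt; *b; }

// A block comes back from the bitmapped allocator.  If the cache is already
// full, the largest block -- either the incoming one or the biggest cached
// one -- is released to the OS with operator delete; the rest stay cached,
// sorted by size.
void return_block(std::vector&lt;std::size_t*&gt;&amp; fl, std::size_t* block)
{
  if (fl.size() &gt;= max_free_lists)
    {
      if (*fl.back() &gt; *block)               // largest cached block loses
        {
          std::size_t* victim = fl.back();
          fl.pop_back();
          ::operator delete(static_cast&lt;void*&gt;(victim));
        }
      else                                   // the incoming block is largest
        {
          ::operator delete(static_cast&lt;void*&gt;(block));
          return;
        }
    }
  fl.insert(std::lower_bound(fl.begin(), fl.end(), block, block_less), block);
}
</pre><p>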
The function _S_should_i_give implements the policy that decides whether a
given block of memory should be handed to the allocator for the request it
has made. This is needed because we may not always have an exact fit for
the memory size that the allocator requests. We do this mainly to prevent
external fragmentation, at the cost of a little internal fragmentation; how
much internal fragmentation to tolerate is exactly what this function has
to decide. I can see 3 possibilities right now (one of them is sketched in
code below the list). Please add more as and when you find better
strategies.
</p><div class="orderedlist"><ol class="orderedlist"><li class="listitem"><p>Equal size check. Return true only when the two blocks are of equal
size.</p></li><li class="listitem"><p>Difference Threshold: Return true only when the _block_size is
greater than or equal to the _required_size and _BS exceeds _RS by a
difference of less than some THRESHOLD value; otherwise return
false.</p></li><li class="listitem"><p>Percentage Threshold: Return true only when the _block_size is
greater than or equal to the _required_size and _BS exceeds _RS by a
percentage of less than some THRESHOLD value; otherwise return
false.</p></li></ol></div><p>
Currently, (3) is being used, with a value of 36% maximum wastage per Super
Block.
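A rough sketch of such a check, using strategy (3) as the example (the body
shown is only illustrative; the library's actual _S_should_i_give may
differ in detail, including what the percentage is measured against):
</p><pre class="programlisting">
#include &lt;cstddef&gt;

// Strategy (3): accept a cached block for a request only when the wastage
// (block_size - required_size) stays under a fixed percentage -- 36% here,
// taken relative to the requested size in this sketch.
bool should_i_give(std::size_t block_size, std::size_t required_size)
{
  const std::size_t max_wastage_percentage = 36;
  if (block_size &lt; required_size)
    return false;
  return (block_size - required_size)
           &lt; (max_wastage_percentage * required_size) / 100;
}
</pre><p>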
</p></div><div class="section" title="Super Block"><div class="titlepage"><div><div><h3 class="title"><a id="bitmap.impl.super_block"/>Super Block</h3></div></div></div><p>
A super block is the block of memory acquired from the FLS from
which the bitmap allocator carves out memory for single objects
and satisfies the user's requests. These super blocks come in
sizes that are powers of 2 and multiples of 32
(_Bits_Per_Block). Yes both at the same time! That's because the
next super block acquired will be 2 times the previous one, and
also all super blocks have to be multiples of the _Bits_Per_Block
value.
</p><p>
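For illustration (taking value_type to be a 4-byte int, purely as an
example), the first few super block sizes run as follows:
</p><pre class="programlisting">
#include &lt;cstddef&gt;
#include &lt;iostream&gt;

int main()
{
  const std::size_t bits_per_block = 32;   // _Bits_Per_Block on this system
  const std::size_t k = sizeof(int);       // sizeof(value_type); 4 assumed
  // Each super block is twice the size of the previous one, and every size
  // in the sequence remains a multiple of _Bits_Per_Block objects:
  // 32, 64, 128, 256, ... objects.
  for (std::size_t objects = bits_per_block; objects &lt;= 256; objects *= 2)
    std::cout &lt;&lt; objects &lt;&lt; " objects == "
              &lt;&lt; objects * k &lt;&lt; " bytes\n";
  return 0;
}
</pre><p>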
How does it interact with the free list store?
</p><p>
The super block is contained in the FLS, and the FLS is responsible for
getting and returning Super Blocks to and from the OS, using operator new
as defined by the C++ standard.
</p></div><div class="section" title="Super Block Data Layout"><div class="titlepage"><div><div><h3 class="title"><a id="bitmap.impl.super_block_data"/>Super Block Data Layout</h3></div></div></div><p>
Each Super Block will be of some size that is a multiple of the
number of Bits Per Block. Typically, this value is chosen as
Bits_Per_Byte x sizeof(size_t). On an x86 system, this gives the
figure 8 x 4 = 32. Thus, each Super Block will be of size 32
x Some_Value. This Some_Value is sizeof(value_type). For now, let
it be called 'K'. Thus, finally, Super Block size is 32 x K bytes.
</p><p>
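In code, the figures above work out as follows (assuming CHAR_BIT == 8 and
a 4-byte size_t, as on the 32-bit x86 system just described):
</p><pre class="programlisting">
#include &lt;climits&gt;
#include &lt;cstddef&gt;
#include &lt;iostream&gt;

// Bits_Per_Block = Bits_Per_Byte x sizeof(size_t); 8 x 4 == 32 here.
const std::size_t bits_per_block = CHAR_BIT * sizeof(std::size_t);

// A super block for a value_type T covers bits_per_block objects, i.e.
// 32 x K bytes of user data, where K == sizeof(T).
template&lt;typename T&gt;
std::size_t super_block_bytes() { return bits_per_block * sizeof(T); }

int main()
{
  std::cout &lt;&lt; bits_per_block &lt;&lt; " bits per block, "
            &lt;&lt; super_block_bytes&lt;int&gt;()
            &lt;&lt; " bytes of int data per super block\n";
  return 0;
}
</pre><p>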
This value of 32 has been chosen because each size_t on such a system has
32 bits, and maximum use of each size_t can be made with this figure.
</p><p>
Consider a super block holding 64 ints, on a 32-bit system where size_t is
a 32-bit entity.
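</p><p>
As a rough sketch of how such a block can be pictured (the header field
shown is an assumption, not the library's definitive layout), the bitmap
words sit directly in front of the objects they keep track of:
</p><pre class="programlisting">
#include &lt;cstddef&gt;

// Rough sketch only: the field names, and the presence and width of the
// leading size field, are assumptions rather than the library's exact
// layout.  The point illustrated is that the two 32-bit bitmap words (one
// bit per int, 64 bits in all) live immediately in front of the 64 ints
// they keep track of.
struct super_block_of_64_ints
{
  std::size_t size;        // bookkeeping for the Free List Store
  std::size_t bitmap[2];   // 2 x 32 bits; in this sketch a set bit == free
  int         data[64];    // the objects actually handed out to clients
};
</pre><p>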
Another issue is whether to keep all the bitmaps in a separate area in
memory, or to keep them near the actual blocks that will be given out or
allocated for the client. After some testing, I've decided to keep these
bitmaps close to the actual blocks. This helps in 2 ways.
</p><div class="orderedlist"><ol class="orderedlist"><li class="listitem"><p>Constant time access for the bitmaps themselves, since no kind of
look up will be needed to find the correct bitmap list or its
equivalent.</p></li><li class="listitem"><p>And this would also preserve cache locality as far as possible.</p></li></ol></div><p>
So in effect, this kind of allocator might prove beneficial from a purely
cache point of view. But this allocator has been made to try and iron out
the defects of the node_allocator, wherein the nodes get scattered about in
memory if they are not returned in the exact reverse order, or in the same
order, in which they were allocated. Also, the new_allocator's bookkeeping
overhead is too much for small objects and single-object allocations,
though it preserves the locality of blocks very well when they are returned
to the allocator.
</p></div><div class="section" title="Overhead and Grow Policy"><div class="titlepage"><div><div><h3 class="title"><a id="bitmap.impl.grow_policy"/>Overhead and Grow Policy</h3></div></div></div><p>
Expected overhead per block would be 1 bit in memory. Also, once the
address of the free list has been found, the cost of
allocation/deallocation is negligible, and is supposed to be constant
time. For these very reasons, it is very important to minimize the linear
time costs, which include finding a free list with a free block while
allocating, and finding the corresponding free list for a block while
deallocating. Therefore, I have decided that the growth of the internal
pool for this allocator will be exponential, as compared to linear for the
node_allocator. There, linear growth works well, because we are mainly
concerned with the speed of allocation/deallocation and memory consumption,
whereas here the allocation/deallocation part does have some
linear/logarithmic complexity components in it. Thus, trying to minimize
them is worth a little bit of extra memory.
</p><p>
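To put that 1-bit figure in perspective, using the 64-int super block from
the previous section (and assuming 4-byte ints):
</p><pre class="programlisting">
#include &lt;cstddef&gt;
#include &lt;iostream&gt;

int main()
{
  const std::size_t objects      = 64;                    // ints per super block
  const std::size_t bitmap_bytes = objects / 8;           // 1 bit per object
  const std::size_t data_bytes   = objects * sizeof(int); // 256 with 4-byte int
  // roughly 8 / 256, i.e. about 3% overhead on the memory handed out
  std::cout &lt;&lt; bitmap_bytes * 100.0 / data_bytes &lt;&lt; "% bitmap overhead\n";
  return 0;
}
</pre><p>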
Another thing to note is that the pool size will double every time the
internal pool gets exhausted and all the free blocks have been given away.
The initial size of the pool is sizeof(size_t) x 8, which is the number of
bits in a size_t, so that one bitmap word fits exactly in a CPU register.
Hence the term exponential growth of the internal pool.
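</p><p>
A minimal sketch of that growth rule, with illustrative names only:
</p><pre class="programlisting">
#include &lt;cstddef&gt;

// The pool starts at sizeof(size_t) x 8 objects -- one full bitmap word --
// and doubles whenever the current pool is exhausted and no free blocks
// remain.  (Illustrative only; the allocator's real bookkeeping is more
// involved.)
struct pool_growth
{
  std::size_t objects;

  pool_growth() : objects(sizeof(std::size_t) * 8) { }

  std::size_t grow()     // called when the current pool has been used up
  {
    objects *= 2;
    return objects;      // size, in objects, of the next super block wanted
  }
};
</pre><p>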
</p></div></div></body></html>