Current allocate_stack logic for create stacks is to first mmap all
the required memory with the desirable memory and then mprotect the
guard area with PROT_NONE if required. Although it works as expected,
it pessimizes the allocation because it requires the kernel to actually
increase commit charge (it counts against the available physical/swap
memory available for the system).
The only issue is to actually check this change since side-effects are
really Linux specific and to actually account them it would require a
kernel specific tests to parse the system wide information. On the kernel
I checked /proc/self/statm does not show any meaningful difference for
vmm and/or rss before and after thread creation. I could only see
really meaningful information checking on system wide /proc/meminfo
between thread creation: MemFree, MemAvailable, and Committed_AS shows
large difference without the patch. I think trying to use these
kind of information on a testcase is fragile.
The BZ#18988 reports shows that the commit pages are easily seen with
mlockall (MCL_FUTURE) (with lock all pages that become mapped in the
process) however a more straighfoward testcase shows that pthread_create
could be faster using this patch:
--
static const int inner_count = 256;
static const int outer_count = 128;
static
void *thread1(void *arg)
{
return NULL;
}
static
void *sleeper(void *arg)
{
pthread_t ts[inner_count];
for (int i = 0; i < inner_count; i++)
pthread_create (&ts[i], &a, thread1, NULL);
for (int i = 0; i < inner_count; i++)
pthread_join (ts[i], NULL);
return NULL;
}
int main(void)
{
pthread_attr_init(&a);
pthread_attr_setguardsize(&a, 1<<20);
pthread_attr_setstacksize(&a, 1134592);
pthread_t ts[outer_count];
for (int i = 0; i < outer_count; i++)
pthread_create(&ts[i], &a, sleeper, NULL);
for (int i = 0; i < outer_count; i++)
pthread_join(ts[i], NULL);
assert(r == 0);
}
return 0;
}
--
On x86_64 (4.4.0-45-generic, gcc 5.4.0) running the small benchtests
I see:
$ time ./test
real 0m3.647s
user 0m0.080s
sys 0m11.836s
While with the patch I see:
$ time ./test
real 0m0.696s
user 0m0.040s
sys 0m1.152s
So I added a pthread_create benchtest (thread_create) which check
the thread creation latency. As for the simple benchtests, I saw
improvements in thread creation on all architectures I tested the
change.
Checked on x86_64-linux-gnu, i686-linux-gnu, aarch64-linux-gnu,
arm-linux-gnueabihf, powerpc64le-linux-gnu, sparc64-linux-gnu,
and sparcv9-linux-gnu.
[BZ #18988]
* benchtests/thread_create-inputs: New file.
* benchtests/thread_create-source.c: Likewise.
* support/xpthread_attr_setguardsize.c: Likewise.
* support/Makefile (libsupport-routines): Add
xpthread_attr_setguardsize object.
* support/xthread.h: Add xpthread_attr_setguardsize prototype.
* benchtests/Makefile (bench-pthread): Add thread_create.
* nptl/allocatestack.c (allocate_stack): Call mmap with PROT_NONE and
then mprotect the required area.
This directory contains the sources of the GNU C Library.
See the file "version.h" for what release version you have.
The GNU C Library is the standard system C library for all GNU systems,
and is an important part of what makes up a GNU system. It provides the
system API for all programs written in C and C-compatible languages such
as C++ and Objective C; the runtime facilities of other programming
languages use the C library to access the underlying operating system.
In GNU/Linux systems, the C library works with the Linux kernel to
implement the operating system behavior seen by user applications.
In GNU/Hurd systems, it works with a microkernel and Hurd servers.
The GNU C Library implements much of the POSIX.1 functionality in the
GNU/Hurd system, using configurations i[4567]86-*-gnu. The current
GNU/Hurd support requires out-of-tree patches that will eventually be
incorporated into an official GNU C Library release.
When working with Linux kernels, this version of the GNU C Library
requires Linux kernel version 3.2 or later.
Also note that the shared version of the libgcc_s library must be
installed for the pthread library to work correctly.
The GNU C Library supports these configurations for using Linux kernels:
aarch64*-*-linux-gnu
alpha*-*-linux-gnu
arm-*-linux-gnueabi
hppa-*-linux-gnu Not currently functional without patches.
i[4567]86-*-linux-gnu
x86_64-*-linux-gnu Can build either x86_64 or x32
ia64-*-linux-gnu
m68k-*-linux-gnu
microblaze*-*-linux-gnu
mips-*-linux-gnu
mips64-*-linux-gnu
powerpc-*-linux-gnu Hardware or software floating point, BE only.
powerpc64*-*-linux-gnu Big-endian and little-endian.
s390-*-linux-gnu
s390x-*-linux-gnu
sh[34]-*-linux-gnu
sparc*-*-linux-gnu
sparc64*-*-linux-gnu
tilegx-*-linux-gnu
tilepro-*-linux-gnu
If you are interested in doing a port, please contact the glibc
maintainers; see http://www.gnu.org/software/libc/ for more
information.
See the file INSTALL to find out how to configure, build, and install
the GNU C Library. You might also consider reading the WWW pages for
the C library at http://www.gnu.org/software/libc/.
The GNU C Library is (almost) completely documented by the Texinfo manual
found in the `manual/' subdirectory. The manual is still being updated
and contains some known errors and omissions; we regret that we do not
have the resources to work on the manual as much as we would like. For
corrections to the manual, please file a bug in the `manual' component,
following the bug-reporting instructions below. Please be sure to check
the manual in the current development sources to see if your problem has
already been corrected.
Please see http://www.gnu.org/software/libc/bugs.html for bug reporting
information. We are now using the Bugzilla system to track all bug reports.
This web page gives detailed information on how to report bugs properly.
The GNU C Library is free software. See the file COPYING.LIB for copying
conditions, and LICENSES for notices about a few contributions that require
these additional notices to be distributed. License copyright years may be
listed using range notation, e.g., 1996-2015, indicating that every year in
the range, inclusive, is a copyrightable year that would otherwise be listed
individually.