Commit Graph

28867 Commits

Author SHA1 Message Date
Mel Gorman c26efef979 malloc: Consistently apply trim_threshold to all heaps [BZ #17195]
Trimming heaps is a balance between saving memory and the system overhead
required to update page tables and discard allocated pages. The malloc
option M_TRIM_THRESHOLD is a tunable that users are meant to use to decide
where this balance point is but it is only applied to the main arena.

For scalability reasons, glibc malloc has per-thread heaps but these are
shrunk with madvise() if there is one page free at the top of the heap.
In some circumstances this can lead to high system overhead if a thread
has a control flow like

    while (data_to_process) {
        buf = malloc(large_size);
        do_stuff();
        free(buf);
    }

For a large size, the free() will call madvise (pagetable teardown, page
free and TLB flush) every time followed immediately by a malloc (fault,
kernel page alloc, zeroing and charge accounting). The kernel overhead
can dominate such a workload.

This patch allows the user to tune when madvise gets called by applying
the trim threshold to the per-thread heaps and using similar logic to the
main arena when deciding whether to shrink. Alternatively if the dynamic
brk/mmap threshold gets adjusted then the new values will be obeyed by
the per-thread heaps.

Bug 17195 was a test case motivated by a problem encountered in scientific
applications written in python that performance badly due to high page fault
overhead. The basic operation of such a program was posted by Julian Taylor
https://sourceware.org/ml/libc-alpha/2015-02/msg00373.html

With this patch applied, the overhead is eliminated. All numbers in this
report are in seconds and were recorded by running Julian's program 30
times.

pyarray
                                 glibc               madvise
                                  2.21                    v2
System  min             1.81 (  0.00%)        0.00 (100.00%)
System  mean            1.93 (  0.00%)        0.02 ( 99.20%)
System  stddev          0.06 (  0.00%)        0.01 ( 88.99%)
System  max             2.06 (  0.00%)        0.03 ( 98.54%)
Elapsed min             3.26 (  0.00%)        2.37 ( 27.30%)
Elapsed mean            3.39 (  0.00%)        2.41 ( 28.84%)
Elapsed stddev          0.14 (  0.00%)        0.02 ( 82.73%)
Elapsed max             4.05 (  0.00%)        2.47 ( 39.01%)

               glibc     madvise
                2.21          v2
User          141.86      142.28
System         57.94        0.60
Elapsed       102.02       72.66

Note that almost a minutes worth of system time is eliminted and the
program completes 28% faster on average.

To illustrate the problem without python this is a basic test-case for
the worst case scenario where every free is a madvise followed by a an alloc

/* gcc bench-free.c -lpthread -o bench-free */
static int num = 1024;

void __attribute__((noinline,noclone)) dostuff (void *p)
{
}

void *worker (void *data)
{
  int i;

  for (i = num; i--;)
    {
      void *m = malloc (48*4096);
      dostuff (m);
      free (m);
    }

  return NULL;
}

int main()
{
  int i;
  pthread_t t;
  void *ret;
  if (pthread_create (&t, NULL, worker, NULL))
    exit (2);
  if (pthread_join (t, &ret))
    exit (3);
  return 0;
}

Before the patch, this resulted in 1024 calls to madvise. With the patch applied,
madvise is called twice because the default trim threshold is high enough to avoid
this.

This a more complex case where there is a mix of frees. It's simply a different worker
function for the test case above

void *worker (void *data)
{
  int i;
  int j = 0;
  void *free_index[num];

  for (i = num; i--;)
    {
      void *m = malloc ((i % 58) *4096);
      dostuff (m);
      if (i % 2 == 0) {
        free (m);
      } else {
        free_index[j++] = m;
      }
    }
  for (; j >= 0; j--)
    {
      free(free_index[j]);
    }

  return NULL;
}

glibc 2.21 calls malloc 90305 times but with the patch applied, it's
called 13438. Increasing the trim threshold will decrease the number of
times it's called with the option of eliminating the overhead.

ebizzy is meant to generate a workload resembling common web application
server workloads. It is threaded with a large working set that at its core
has an allocation, do_stuff, free loop that also hits this case. The primary
metric of the benchmark is records processed per second. This is running on
my desktop which is a single socket machine with an I7-4770 and 8 cores.
Each thread count was run for 30 seconds. It was only run once as the
performance difference is so high that the variation is insignificant.

                glibc 2.21              patch
threads 1            10230              44114
threads 2            19153              84925
threads 4            34295             134569
threads 8            51007             183387

Note that the saving happens to be a concidence as the size allocated
by ebizzy was less than the default threshold. If a different number of
chunks were specified then it may also be necessary to tune the threshold
to compensate

This is roughly quadrupling the performance of this benchmark. The difference in
system CPU usage illustrates why.

ebizzy running 1 thread with glibc 2.21
10230 records/s 306904
real 30.00 s
user  7.47 s
sys  22.49 s

22.49 seconds was spent in the kernel for a workload runinng 30 seconds. With the
patch applied

ebizzy running 1 thread with patch applied
44126 records/s 1323792
real 30.00 s
user 29.97 s
sys   0.00 s

system CPU usage was zero with the patch applied. strace shows that glibc
running this workload calls madvise approximately 9000 times a second. With
the patch applied madvise was called twice during the workload (or 0.06
times per second).

2015-02-10  Mel Gorman  <mgorman@suse.de>

  [BZ #17195]
  * malloc/arena.c (free): Apply trim threshold to per-thread heaps
    as well as the main arena.
2015-04-02 12:14:14 +05:30
H.J. Lu a3d9ab5070 Limit threads sharing L2 cache to 2 for SLM/KNL
Silvermont and Knights Landing have a modular system design with two cores
sharing an L2 cache.  If more than 2 cores are detected to shared L2 cache,
it should be adjusted for Silvermont and Knights Landing.

	[BZ #18185]
	* sysdeps/x86_64/cacheinfo.c (init_cacheinfo): Limit threads
	sharing L2 cache to 2 for Silvermont/Knights Landing.
2015-03-31 13:18:10 -07:00
H.J. Lu 83569fb894 Add a testcase for copy reloc against protected data
Linkers in some versions of binutils 2.25 and 2.26 don't support protected
data symbol with error messsage like:

/usr/bin/ld: copy reloc against protected `bar' is invalid
/usr/bin/ld: failed to set dynamic section sizes: Bad value

We check if linker supports copy reloc against protected data symbol to
avoid running the test if linker is broken.

	[BZ #17711]
	* config.make.in (have-protected-data): New.
	* configure.ac: Check linker support for protected data symbol.
	* configure: Regenerated.
	* elf/Makefile (modules-names): Add tst-protected1moda and
	tst-protected1modb if $(have-protected-data) is yes.
	(tests): Add tst-protected1a and tst-protected1b if
	$(have-protected-data) is yes.
	($(objpfx)tst-protected1a): New.
	($(objpfx)tst-protected1b): Likewise.
	(tst-protected1modb.so-no-z-defs): Likewise.
	* elf/tst-protected1a.c: New file.
	* elf/tst-protected1b.c: Likewise.
	* elf/tst-protected1mod.h: Likewise.
	* elf/tst-protected1moda.c: Likewise.
	* elf/tst-protected1modb.c: Likewise.
2015-03-31 05:21:12 -07:00
H.J. Lu 62da1e3b00 Add ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA to x86
With copy relocation, address of protected data defined in the shared
library may be external.   When there is a relocation against the
protected data symbol within the shared library, we need to check if we
should skip the definition in the executable copied from the protected
data.  This patch adds ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA and defines
it for x86.  If ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA isn't 0, do_lookup_x
will skip the data definition in the executable from copy reloc.

	[BZ #17711]
	* elf/dl-lookup.c (do_lookup_x): When UNDEF_MAP is NULL, which
	indicates it is called from do_lookup_x on relocation against
	protected data, skip the data definion in the executable from
	copy reloc.
	(_dl_lookup_symbol_x): Pass ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA,
	instead of ELF_RTYPE_CLASS_PLT, to do_lookup_x for
	EXTERN_PROTECTED_DATA relocation against STT_OBJECT symbol.
	* sysdeps/generic/ldsodefs.h * (ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA):
	New.  Defined to 4 if DL_EXTERN_PROTECTED_DATA is defined,
	otherwise to 0.
	* sysdeps/i386/dl-lookupcfg.h (DL_EXTERN_PROTECTED_DATA): New.
	* sysdeps/i386/dl-machine.h (elf_machine_type_class): Set class
	to ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA for R_386_GLOB_DAT.
	* sysdeps/x86_64/dl-lookupcfg.h (DL_EXTERN_PROTECTED_DATA): New.
	* sysdeps/x86_64/dl-machine.h (elf_machine_type_class): Set class
	to ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA for R_X86_64_GLOB_DAT.
2015-03-31 05:16:57 -07:00
Martin Galvan 675ddb7184 NPTL: Remove duplicate definition of PTHREAD_ADAPTIVE_MUTEX_INITIALIZER_NP
The PTHREAD_ADAPTIVE_MUTEX_INITIALIZER_NP macro was defined twice with the
same values in pthread.h; this removes the second definition.
2015-03-28 01:50:12 -04:00
Martin Galvan 4d611e1261 NPTL: swap comments for THREAD_SETMEM and THREAD_SETMEM_NC for i386 and x86_64
The comments for THREAD_SETMEM and THREAD_SETMEM_NC were swapped for
i386 and x86_64; this patch fixes that.
2015-03-28 00:44:22 -04:00
Roland McGrath 7285eb535a Convert dlfcn/tststatic to use test-skeleton. 2015-03-27 12:55:25 -07:00
Alan Modra 19a6a3acd1 Harden powerpc64 elf_machine_fixup_plt
IFUNC is difficult to correctly implement on any target needing a GOT
to support position independent code, due to the dependency on order
of dynamic relocations.  ld.so should be changed to apply IFUNC
relocations last, globally, because without that it is actually
impossible to write an IFUNC resolver in C that works in all
situations.  Case in point, vfork in libpthread.so is an IFUNC with
the resolver returning &__libc_vfork.  (system and fork are similar.)
If another shared library, libA say, uses vfork then it is quite
possible that libpthread.so hasn't been dynamically relocated before
the unfortunate libA is dynamically relocated.  In that case the GOT
entry for &__libc_vfork is still zero, so the IFUNC resolver returns
NULL.  LD_BIND_NOW=1 results in libA PLT dynamic relocations being
applied using this NULL value and ld.so segfaults.

This patch hardens ld.so to not segfault on a NULL from an IFUNC
resolver.  It also fixes a problem with undefined weak.  If you leave
the plt entry as-is for undefined weak then if the entry is ever
called it will loop in ld.so rather than segfaulting.

	* sysdeps/powerpc/powerpc64/dl-machine.h (elf_machine_fixup_plt):
	Don't segfault if ifunc resolver returns a NULL.  Do set plt to
	zero for undefined weak.
	(elf_machine_plt_conflict): Similarly.
2015-03-26 12:30:45 +10:30
Joseph Myers efd5b641dd Add more tests of acosh, asinh and atanh.
This patch adds some randomly-generated tests of acosh, asinh and
atanh that are observed to increase ulps on x86_64.

Tested for x86_64 and x86 and ulps updated accordingly.

	* math/auto-libm-test-in: Add more tests of acosh, asinh and
	atanh.
	* math/auto-libm-test-out: Regenerated.
	* sysdeps/i386/fpu/libm-test-ulps: Update.
	* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
2015-03-25 22:21:20 +00:00
Joseph Myers e9b1015112 Add another test of asin.
This patch adds a randomly-generated test of asin that is observed to
increase ulps on x86_64.

Tested for x86_64 and x86 and ulps updated accordingly.

	* math/auto-libm-test-in: Add another test of asin.
	* math/auto-libm-test-out: Regenerated.
	* sysdeps/i386/fpu/libm-test-ulps: Update.
	* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
2015-03-25 21:57:04 +00:00
Joseph Myers 9fa55373e1 Remove unused macros from i386 lowlevellock.h.
In the course of the work on six-argument syscalls I noticed that the
i386 lowlevellock.h contained some unused macro definitions (already
unused before my patch).  This patch removes them.

Tested for x86 that installed stripped shared libraries are unchanged
by this patch.

	* sysdeps/unix/sysv/linux/i386/lowlevellock.h (LLL_EBX_LOAD):
	Remove macro.
	(LLL_EBX_REG): Likewise.
	(LLL_ENTER_KERNEL): Likewise.
2015-03-25 21:21:18 +00:00
Joseph Myers 38755f1421 Add more tests of asin.
This patch adds some randomly-generated tests of asin that are
observed to increase ulps on x86_64.

Tested for x86_64 and x86 and ulps updated accordingly.

	* math/auto-libm-test-in: Add more tests of asin.
	* math/auto-libm-test-out: Regenerated.
	* sysdeps/i386/fpu/libm-test-ulps: Update.
	* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
2015-03-25 17:53:58 +00:00
Joseph Myers a9fe4c5aa8 Support six-argument syscalls from C for 32-bit x86, use generic lowlevellock-futex.h (bug 18138).
This patch follows the approach outlined in
<https://sourceware.org/ml/libc-alpha/2015-03/msg00656.html> to
support six-argument syscalls from INTERNAL_SYSCALL for 32-bit x86,
making them call a function __libc_do_syscall that takes the syscall
number and three syscall arguments in the registers in which the
kernel expects them, along with a pointer to a structure containing
the other three arguments.

In turn, this allows the generic lowlevellock-futex.h to be used on
32-bit x86, so supporting lll_futex_timed_wait_bitset (and so allowing
FUTEX_CLOCK_REALTIME to be used in various cases, so fixing bug 18138
for 32-bit x86 and leaving hppa as the only architecture missing
lll_futex_timed_wait_bitset).  The change to lowlevellock.h's
definition of SYS_futex is because the generic lowlevelloc-futex.h
ends up bringing in bits/syscall.h which defines SYS_futex to
__NR_futex, so resulting in redefinition errors.  The revised
definition in lowlevellock.h is in line with what the x86_64 version
does.

__libc_do_syscall is only needed in libpthread at present (meaning
nothing special needs to be done to make it shared-only in most
libraries containing it, static in libc only, as on ARM).

Tested for 32-bit x86, with the glibc testsuite and with the test in
bug 18138.  The failures seen

FAIL: nptl/tst-cleanupx4
FAIL: rt/tst-cpuclock2

are pre-existing.

	[BZ #18138]
	* sysdeps/unix/sysv/linux/i386/sysdep.h (struct
	libc_do_syscall_args): New structure.
	(INTERNAL_SYSCALL_MAIN_0): New macro.
	(INTERNAL_SYSCALL_MAIN_1): Likewise.
	(INTERNAL_SYSCALL_MAIN_2): Likewise.
	(INTERNAL_SYSCALL_MAIN_3): Likewise.
	(INTERNAL_SYSCALL_MAIN_4): Likewise.
	(INTERNAL_SYSCALL_MAIN_5): Likewise.
	(INTERNAL_SYSCALL_MAIN_6): Likewise.  Call __libc_do_syscall.
	(INTERNAL_SYSCALL): Define to use INTERNAL_SYSCALL_MAIN_##nr.
	Replace conditional definitions by conditional definitions of ....
	(INTERNAL_SYSCALL_MAIN_INLINE): ... this.  New macro.
	* sysdeps/unix/sysv/linux/i386/libc-do-syscall.S: New file.
	* sysdeps/unix/sysv/linux/i386/Makefile [$(subdir) = nptl]
	(libpthread-sysdep_routines): Add libc-do-syscall.
	* sysdeps/unix/sysv/linux/i386/lowlevellock-futex.h: Remove file.
	* sysdeps/unix/sysv/linux/i386/lowlevellock.h (SYS_futex): Define
	to __NR_futex not 240.
2015-03-25 15:17:54 +00:00
Alan Modra afcd9480fe powerpc __tls_get_addr call optimization
This patch is glibc support for a PowerPC TLS optimization, inspired
by Alexandre Oliva's TLS optimization for other processors,
http://www.lsd.ic.unicamp.br/~oliva/writeups/TLS/RFC-TLSDESC-x86.txt

In essence, this optimization uses a zero module id in the tls_index
GOT entry to indicate that a TLS variable is allocated space in the
static TLS area.  A special plt call linker stub for __tls_get_addr
checks for such a tls_index and if found, returns the offset
immediately.  The linker communicates the fact that the special
__tls_get_addr stub is used by setting a bit in the dynamic tag
DT_PPC64_OPT/DT_PPC_OPT.  glibc communicates to the linker that this
optimization is available by the presence of __tls_get_addr_opt.

tst-tlsmod2.so is built with -Wl,--no-tls-get-addr-optimize for
tst-tls-dlinfo, which otherwise would fail since it tests that no
static tls is allocated.  The ld option --no-tls-get-addr-optimize has
been available since binutils-2.20 so doesn't need a configure test.

	* NEWS: Advertise TLS optimization.
	* elf/elf.h (R_PPC_TLSGD, R_PPC_TLSLD, DT_PPC_OPT, PPC_OPT_TLS): Define.
	(DT_PPC_NUM): Increment.
	* elf/dynamic-link.h (HAVE_STATIC_TLS): Define.
	(CHECK_STATIC_TLS): Use here.
	* sysdeps/powerpc/powerpc32/dl-machine.h (elf_machine_rela): Optimize
	TLS descriptors.
	* sysdeps/powerpc/powerpc64/dl-machine.h (elf_machine_rela): Likewise.
	* sysdeps/powerpc/dl-tls.c: New file.
	* sysdeps/powerpc/Versions: Add __tls_get_addr_opt.
	* sysdeps/powerpc/tst-tlsopt-powerpc.c: New tls test.
	* sysdeps/unix/sysv/linux/powerpc/Makefile: Add new test.
	Build tst-tlsmod2.so with --no-tls-get-addr-optimize.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/ld.abilist: Update.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/ld.abilist: Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/ld-le.abilist: Likewise.
2015-03-25 15:53:47 +10:30
Alan Modra da9f333410 powerpc64 configure message
This feature doesn't depend on the linker, as can be seen from the
actual test.  It's a compiler feature.

	* sysdeps/powerpc/powerpc64/configure.ac: Correct "linker support
	for overlapping .opd entries" to "support...".
	* sysdeps/powerpc/powerpc64/configure: Regenerate
2015-03-25 15:45:36 +10:30
Joseph Myers 8d6439712d Add more tests of acos.
This patch adds some randomly-generated tests of acos that are
observed to increase ulps on x86_64.

Tested for x86_64 and x86 and ulps updated accordingly.

	* math/auto-libm-test-in: Add more tests of acos.
	* math/auto-libm-test-out: Regenerated.
	* sysdeps/i386/fpu/libm-test-ulps: Update.
	* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
2015-03-25 00:30:10 +00:00
Joseph Myers bc899ea090 Add more tests of expm1.
This patch adds some randomly-generated tests of expm1 that are
observed to increase ulps on x86_64.

Tested for x86_64 and x86 and ulps updated accordingly.

	* math/auto-libm-test-in: Add more tests of expm1.
	* math/auto-libm-test-out: Regenerated.
	* sysdeps/i386/fpu/libm-test-ulps: Update.
	* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
2015-03-25 00:05:13 +00:00
Joseph Myers 239ed6f309 Add more tests of cosh, sinh.
This patch adds some randomly-generated tests of cosh and sinh that
are observed to increase ulps on x86_64.

Tested for x86_64 and x86 and ulps updated accordingly.

	* math/auto-libm-test-in: Add more tests of cosh and sinh.
	* math/auto-libm-test-out: Regenerated.
	* sysdeps/i386/fpu/libm-test-ulps: Update.
	* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
2015-03-24 23:48:04 +00:00
Joseph Myers a737e8263a Regenerate x86_64, x86 ulps from scratch.
The x86_64 and x86 libm-test-ulps files hadn't been regenerated from
scratch for some time, as evidenced by the presence of entries for
*_tonearest functions (those tests duplicated the
default-rounding-mode tests, and such duplicates are no longer run).
The aarch64, alpha, hppa, ia64, m68k, microblaze, powerpc, s390, sh,
sparc, tile files similarly could do with from-scratch regeneration as
evidenced by the presence of such entries.  (Truncate the existing
file then run "make regen-ulps" and move the resulting file into
place.)

This patch regenerates the x86_64 and x86 files from scratch.  It's
likely some of the reduced / removed ulps will need restoring because
they appear on processors or compiler versions other than the one I
tested on, but in such cases I'd like to first see if I can generate
new tests that show such ulps on the Intel processor I'm testing on,
to reduce the effects from different people using different processors
and compilers to regenerate the ulps.

	* sysdeps/i386/fpu/libm-test-ulps: Regenerated.
	* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
2015-03-24 23:30:19 +00:00
Joseph Myers 7c84a5042f Add more tests of log2.
In testing for x86_64 on an AMD processor, I observed libm test
failures of the form:

testing long double (without inline functions)
Failure: Test: log2_downward (0x2.b7e151628aed4p+0)
Result:
 is:          1.44269504088896356633e+00   0xb.8aa3b295c17f67600000p-3
 should be:   1.44269504088896356622e+00   0xb.8aa3b295c17f67500000p-3
 difference:  1.08420217248550443400e-19   0x8.00000000000000000000p-66
 ulp       :  1.0000
 max.ulp   :  0.0000
Maximal error of `log2_downward'
 is      : 1 ulp
 accepted: 0 ulp

These issues arise because the maximum ulps when regenerating on one
processor are not the same as on another processor, so regeneration on
several processors may be needed when updating libm-test-ulps to avoid
failures for some users testing glibc - but such regeneration on
multiple processors is inconvenient.  Causes can be: on x86 and, for
x86_64, for long double, variation in results of x87 instructions for
transcendental operations between processors; on x86, variation in
compiler excess precision between compiler versions and
configurations; on any processor where the compiler may contract
expressions using fused multiply-add, variation in what contraction
occurs.

Although it's hard to be sure libm-test-ulps covers all ulps that may
be seen in any configuration for the given architecture, in practice
it helps simply to add wider test coverage to make it more likely
that, when testing on one processor, the ulps seen are the biggest
that can be seen for that function on that processor, and hopefully
they are also the biggest that can be seen for that function in other
configurations for that architecture.  Thus, this patch adds some
tests of log2 that increase the ulps I see on x86_64 on an Intel
processor, so that hopefully future from-scratch regenerations on that
processor will produce ulps big enough not to have errors from testing
on AMD processors.  These tests were found by randomly generating
inputs and seeing what produced ulps larger than those currently in
libm-test-ulps.  Of course such increases also improve the accuracy of
the empirical table of known ulps generated from libm-test-ulps files
that goes in the manual.

Tested for x86_64 and x86 and ulps updated accordingly.

	* math/auto-libm-test-in: Add more tests of log2.
	* math/auto-libm-test-out: Regenerated.
	* sysdeps/i386/fpu/libm-test-ulps: Update.
	* sysdeps/x86_64/fpu/libm-test-ulps: Likewise.
2015-03-24 23:06:28 +00:00
Roland McGrath 7e9c7b9b68 Minor cleanups in libio/iofdopen.c 2015-03-23 13:46:36 -07:00
Florian Weimer 98734cc501 pthread_setaffinity (Linux variant): Rewrite to use VLA instead of alloca
extend_alloca was used to emulate VLA deallocation.  The new version
also handles the res == 0 corner case more explicitly, by returning 0
instead of the (potentially undefined, but usually zero) system call
error.
2015-03-23 16:34:48 +01:00
Florian Weimer 2b028564f1 Avoid SIGFPE in wordexp [BZ #18100]
Check for a zero divisor and integer overflow before performing
division in arithmetic expansion.
2015-03-23 16:12:38 +01:00
Alan Modra 59261ad3eb Remove HAVE_ASM_PPC_REL16 references
In bc0cdc498 the configure check for HAVE_ASM_PPC_REL16 was removed
on the grounds that the minimum binutils supports rel16 relocs.  This
is true, but not all references to HAVE_ASM_PPC_REL16 in the sources
were removed.

	* config.h.in: Remove HAVE_ASM_PPC_REL16.
	* sysdeps/powerpc/powerpc32/tls-macros.h: Remove HAVE_ASM_PPC_REL16
	and false branch of conditional.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/swapcontext-common.S:
	Likewise.
2015-03-23 15:33:59 +10:30
Samuel Thibault 9e70234cbd Add more exception to local headers list
* scripts/check-local-headers.sh (exclude): Add device/,
	hurd/hurd_types.h, hurd/ioctl_types.h, hurd/paths.h, hurd/ioctls.defs,
	cthreads.h.
2015-03-23 00:31:02 +01:00
Samuel Thibault 661a7dbad0 Fix visibility of EXTPROC macro
* bits/termios.h [!__USE_MISC] (EXTPROC): Do not define.
2015-03-22 17:29:14 +01:00
Joseph Myers 6a9350c854 Note old commit as having resolved bug 11505. 2015-03-21 17:50:13 +00:00
Samuel Thibault 868df0f9e9 Fix warnings
* sysdeps/mach/hurd/Makefile ($(common-objpfx)errnos.d): Depend on
	libc-modules.h
	* sysdeps/mach/hurd/i386/trampoline.c (_hurd_setup_sighandler): Remove
	unused declaration of _hurd_intr_rpc_msg_in_trap.
	* mach/mach_init.c (__mach_init): Test whether HAVE_HOST_PAGE_SIZE is
	defined instead of whether it is non-zero.
	* sysdeps/mach/hurd/i386/intr-msg.h (INTR_MSG_TRAP): Use "+m"
	input constraint instead of both input and output constraint.  Use ecx
	clobber instead of %ecx.
	* sysdeps/mach/hurd/malloc-machine.h (mutex_init, mutex_lock,
	mutex_unlock): Use a statement expression instead of an expression list.
	* sysdeps/mach/hurd/setitimer.c (_hurd_itimer_thread_stack_size): Set
	type to vm_size_t instead of vm_address_t.
	* sysdeps/mach/hurd/fork.c (__fork): Test whether STACK_GROWTH_UP is
	defined instead of whether it is non-zero.
	* hurd/hurd/ioctl.h (_hurd_locked_install_cttyid): New declaration.
	* sysdeps/mach/hurd/setsid.c: Include <hurd/ioctl.h>.
	* sysdeps/mach/hurd/mmap.c (__mmap): Use 0 instead of NULL for
	comparisons with mapaddr.
	* nscd/nscd-client.h: Include <time.h>.
	* sysdeps/mach/hurd/dl-sysdep.c (fmh): Pass vm_offset_t dummy
	9th parameter to __vm_region instead of int.
2015-03-21 04:49:44 +01:00
Samuel Thibault d583531a9e Add missing dependency
* sysdeps/mach/hurd/Makefile ($(common-objpfx)errnos.d): Depend on
	libc-modules.h
2015-03-21 01:19:23 +01:00
Roland McGrath 298e5d56dc ARM: Fix memcpy & memmove for [ARM_ALWAYS_BX] 2015-03-19 12:45:24 -07:00
Chris Metcalf becb26b84b linux-generic: add a README 2015-03-19 13:33:01 -04:00
Joseph Myers c2f5813ae0 Make sem_timedwait use FUTEX_CLOCK_REALTIME (bug 18138).
sem_timedwait converts absolute timeouts to relative to pass them to
the futex syscall.  (Before the recent reimplementation, on x86_64 it
used FUTEX_CLOCK_REALTIME, but not on other architectures.)

Correctly implementing POSIX requirements, however, requires use of
FUTEX_CLOCK_REALTIME; passing a relative timeout to the kernel does
not conform to POSIX.  The POSIX specification for sem_timedwait says
"The timeout shall be based on the CLOCK_REALTIME clock.".  The POSIX
specification for clock_settime says "If the value of the
CLOCK_REALTIME clock is set via clock_settime(), the new value of the
clock shall be used to determine the time of expiration for absolute
time services based upon the CLOCK_REALTIME clock. This applies to the
time at which armed absolute timers expire. If the absolute time
requested at the invocation of such a time service is before the new
value of the clock, the time service shall expire immediately as if
the clock had reached the requested time normally.".  If a relative
timeout is passed to the kernel, it is interpreted according to the
CLOCK_MONOTONIC clock, and so fails to meet that POSIX requirement in
the event of clock changes.

This patch makes sem_timedwait use lll_futex_timed_wait_bitset with
FUTEX_CLOCK_REALTIME when possible, as done in some other places in
NPTL.  FUTEX_CLOCK_REALTIME is always available for supported Linux
kernel versions; unavailability of lll_futex_timed_wait_bitset is only
an issue for hppa (an issue noted in
<https://sourceware.org/glibc/wiki/PortStatus>, and fixed by the
unreviewed
<https://sourceware.org/ml/libc-alpha/2014-12/msg00655.html> that
removes the hppa lowlevellock.h completely).

In the FUTEX_CLOCK_REALTIME case, the glibc code still needs to check
for negative tv_sec and handle that as timeout, because the Linux
kernel returns EINVAL not ETIMEDOUT for that case, so resulting in
failures of nptl/tst-abstime and nptl/tst-sem13 in the absence of that
check.  If we're trying to distinguish between Linux-specific and
generic-futex NPTL code, I suppose having this in an nptl/ file isn't
ideal, but there doesn't seem to be any better place at present.

It's not possible to add a testcase for this issue to the testsuite
because of the requirement to change the system clock as part of a
test (this is a case where testing would require some form of
container, with root in that container, and one whose CLOCK_REALTIME
is isolated from that of the host; I'm not sure what forms of
containers, short of a full virtual machine, provide that clock
isolation).

Tested for x86_64.  Also tested for powerpc with the testcase included
in the bug.

	[BZ #18138]
	* nptl/sem_waitcommon.c: Include <kernel-features.h>.
	(futex_abstimed_wait)
	[__ASSUME_FUTEX_CLOCK_REALTIME && lll_futex_timed_wait_bitset]:
	Use lll_futex_timed_wait_bitset with FUTEX_CLOCK_REALTIME instead
	of lll_futex_timed_wait.
2015-03-18 17:05:38 +00:00
Siddhesh Poyarekar 3382c079da Fix up NEWS merge goof-up 2015-03-18 15:04:57 +05:30
Brad Hubbard ed6b0fe710 Use calloc to allocate xports (BZ #17542)
If xports is NULL in xprt_register we malloc it but if sock >
_rpc_dtablesize() that memory does not get initialised and may in theory
contain any value. Later we make a conditional jump in svc_getreq_common
based on the uninitialised memory and this caused a general protection
fault in rpc.statd on an older version of glibc but this code has not
changed since that version.

Following is the valgrind warning.

==26802== Conditional jump or move depends on uninitialised value(s)
==26802==    at 0x5343A25: svc_getreq_common (in /lib64/libc-2.5.so)
==26802==    by 0x534357B: svc_getreqset (in /lib64/libc-2.5.so)
==26802==    by 0x10DE1F: ??? (in /sbin/rpc.statd)
==26802==    by 0x10D0EF: main (in /sbin/rpc.statd)
==26802==  Uninitialised value was created by a heap allocation
==26802==    at 0x4C2210C: malloc (vg_replace_malloc.c:195)
==26802==    by 0x53438BE: xprt_register (in /lib64/libc-2.5.so)
==26802==    by 0x53450DF: svcudp_bufcreate (in /lib64/libc-2.5.so)
==26802==    by 0x10FE32: ??? (in /sbin/rpc.statd)
==26802==    by 0x10D13E: main (in /sbin/rpc.statd)
2015-03-18 14:51:26 +05:30
Alexandre Oliva f8aeae3473 Fix DTV race, assert, DTV_SURPLUS Static TLS limit, and nptl_db garbage
for  ChangeLog

	[BZ #17090]
	[BZ #17620]
	[BZ #17621]
	[BZ #17628]
	* NEWS: Update.
	* elf/dl-tls.c (_dl_update_slotinfo): Clean up outdated DTV
	entries with Static TLS too.  Skip entries past the end of the
	allocated DTV, from Alan Modra.
	(tls_get_addr_tail): Update to glibc_likely/unlikely.  Move
	Static TLS DTV entry set up from...
	 (_dl_allocate_tls_init): ... here (fix modid assertion), ...
	* elf/dl-reloc.c (_dl_nothread_init_static_tls): ... here...
	* nptl/allocatestack.c (init_one_static_tls): ... and here...
	* elf/dlopen.c (dl_open_worker): Drop l_tls_modid upper bound
	for Static TLS.
	* elf/tlsdeschtab.h (map_generation): Return size_t.  Check
	that the slot we find is associated with the given map before
	using its generation count.
	* nptl_db/db_info.c: Include ldsodefs.h.
	(rtld_global, dtv_slotinfo_list, dtv_slotinfo): New typedefs.
	* nptl_db/structs.def (DB_RTLD_VARIABLE): New macro.
	(DB_MAIN_VARIABLE, DB_RTLD_GLOBAL_FIELD): Likewise.
	(link_map::l_tls_offset): New struct field.
	(dtv_t::counter): Likewise.
	(rtld_global): New struct.
	(_rtld_global): New rtld variable.
	(dl_tls_dtv_slotinfo_list): New rtld global field.
	(dtv_slotinfo_list): New struct.
	(dtv_slotinfo): Likewise.
	* nptl_db/td_symbol_list.c: Drop gnu/lib-names.h include.
	(td_lookup): Rename to...
	(td_mod_lookup): ... this.  Use new mod parameter instead of
	LIBPTHREAD_SO.
	* nptl_db/td_thr_tlsbase.c: Include link.h.
	(dtv_slotinfo_list, dtv_slotinfo): New functions.
	(td_thr_tlsbase): Check DTV generation.  Compute Static TLS
	addresses even if the DTV is out of date or missing them.
	* nptl_db/fetch-value.c (_td_locate_field): Do not refuse to
	index zero-length arrays.
	* nptl_db/thread_dbP.h: Include gnu/lib-names.h.
	(td_lookup): Make it a macro implemented in terms of...
	(td_mod_lookup): ... this declaration.
	* nptl_db/db-symbols.awk (DB_RTLD_VARIABLE): Override.
	(DB_MAIN_VARIABLE): Likewise.
2015-03-17 00:31:49 -03:00
H.J. Lu b97eb2bdb1 Preserve bound registers in _dl_runtime_resolve
We need to add a BND prefix before indirect branch at the end of
_dl_runtime_resolve to preserve bound registers.

	[BZ #18134]
	* sysdeps/x86_64/dl-trampoline.S (PRESERVE_BND_REGS_PREFIX): New.
	(_dl_runtime_resolve): Add a BND prefix before indirect branch.
2015-03-16 14:59:14 -07:00
Paul Eggert cb21929049 * stdlib/setenv.c (__add_to_environ): Revert previous change. 2015-03-15 17:06:21 -07:00
Andreas Schwab a3905fd9de m68k: fix 64-bit arithmetic in atomic operations (bug 18128) 2015-03-14 22:27:36 +01:00
Paul Eggert 2ecccaede9 * stdlib/setenv.c (__add_to_environ):
Dump core quickly if setenv (..., NULL, ...) is called.
2015-03-13 10:14:03 -07:00
Roland McGrath cdaf79d0af ARM: Rewrite sysdeps/arm/tls-macros.h 2015-03-13 10:10:09 -07:00
Carlos O'Donell cf9313e7d1 Enhance nscd's inotify support (Bug 14906).
In bug 14906 the user complains that the inotify support in nscd
is not sufficient when it comes to detecting changes in the
configurationfiles that should be watched for the various databases.

The current nscd implementation uses inotify to watch for changes in
the configuration files, but adds watches only for IN_DELETE_SELF and
IN_MODIFY. These watches are insufficient to cover even the most basic
uses by a system administrator. For example using emacs or vim to edit
a configuration file should trigger a reload but it might not if
the editors use move to atomically update the file. This atomic update
changes the inode and thus removes the notification on the file (as
inotify is based on inodes). Thus the inotify support in nscd for
configuration files is insufficient to account for the average use
cases of system administrators and users.

The inotify support is significantly enhanced and described here:
https://www.sourceware.org/ml/libc-alpha/2015-02/msg00504.html

Tested on x86_64 with and without inotify support.
2015-03-13 09:49:24 -04:00
Joseph Myers 7d67a196b6 soft-fp: Define and use _FP_STATIC_ASSERT.
This patch makes soft-fp use static assertions in place of conditional
calls to abort, in places where there are checks for conditions (on
the types for which a macro is used) that the code is not prepared to
handle.  The fallback definition of _FP_STATIC_ASSERT (for kernel use
only, as only relevant to compilers not supported for building glibc)
is as in misc/sys/cdefs.h.

This means that soft-fp only ever calls abort for _FP_UNREACHABLE
calls in builds with GCC versions before 4.5.  Thus, there is no need
for an abort declaration or <stdlib.h> include, since the kernel code
handles defining abort as a macro itself - and so this avoids any need
for an __KERNEL__ condition on the abort declaration to avoid it
breaking with the kernel's macro definition.  That is, this patch is
intended to make glibc's soft-fp code suitable for kernel use with no
kernel-local changes to the soft-fp code needed at all.

Tested for powerpc-nofpu that installed stripped shared libraries are
unchanged by the patch.  One explicit <stdlib.h> include had to be
added to a file that was relying on the include from soft-fp.h.

	* soft-fp/soft-fp.h (_FP_STATIC_ASSERT): New macro.
	[_LIBC]: Do not include <stdlib.h>.
	[!_LIBC] (abort): Remove declaration.
	* soft-fp/op-2.h (_FP_MUL_MEAT_2_120_240_double): Use
	_FP_STATIC_ASSERT instead of conditionally calling abort.
	* soft-fp/op-common.h (_FP_FROM_INT): Likewise.
	(_FP_EXTEND_CNAN): Likewise.
	(FP_TRUNC): Likewise.
	(__FP_CLZ): Likewise.
	* sysdeps/powerpc/nofpu/flt-rounds.c: Include <stdlib.h>.
2015-03-12 18:43:21 +00:00
Yaakov Selkowitz af85ebcdf7 manual: fix XPG basename prototype
* manual/string.texi (XPG basename): Fix prototype.
2015-03-12 17:56:46 +01:00
Stefan Liebler 2e807f2959 S/390: Fix setcontext/swapcontext which are not restoring sigmask. 2015-03-12 11:08:11 +01:00
Stefan Liebler 1b2bebe6b7 S/390: Regenerate ULPs 2015-03-12 11:04:13 +01:00
Aurelien Jarno 6a1cf708dd Fix ldconfig segmentation fault with corrupted cache (Bug 18093).
ldconfig is using an aux-cache to speed up the ld.so.cache update. It
is read by mmaping the file to a structure which contains data offsets
used as pointers. As they are not checked, it is not hard to get
ldconfig to segfault with a corrupted file. This happens for instance if
the file is truncated, which is common following a filesystem check
following a system crash.

This can be reproduced for example by truncating the file to roughly
half of it's size.

There is already some code in elf/cache.c (load_aux_cache) to check
for a corrupted aux cache, but it happens to be broken and not enough.
The test (aux_cache->nlibs >= aux_cache_size) compares the number of
libs entry with the cache size. It's a non sense, as it basically
assumes that each library entry is a 1 byte... Instead this commit
computes the theoretical cache size using the headers and compares it
to the real size.
2015-03-11 21:07:32 -04:00
Paul Pluzhnikov a2d4cf72c0 Fix BZ #18043 comment # 19: don't call undefined setenv(..., NULL, 1). 2015-03-11 08:55:50 -07:00
Adhemerval Zanella 5ca10a0c9a powerpc: Remove HAVE_ASM_GLOBAL_DOT_NAME define
With AIX port deprecated there is no need to check/define
HAVE_ASM_GLOBAL_DOT_NAME anymore since the current minimum binutils
supported (2.22) does not emit global symbol with dot.

This patch removes all the HAVE_ASM_GLOBAL_DOT_NAME definition and
checks for powerpc64 port.
2015-03-11 09:01:05 -04:00
Mike Frysinger b42e14ff3e hppa: update __O_SYNC fix with [BZ #18068] 2015-03-11 03:33:07 -04:00
Carlos O'Donell e4363cfb57 hppa: Fix feupdateenv and fesetexceptflag (Bug 18111).
The function feupdateenv has been fixed to correctly handle FE_DFL_ENV
and FE_NOMASK_ENV.

The fesetexceptflag function has been fixed to correctly handle setting
the new flags instead of just OR-ing the existing flags.

This fixes the test-fenv-return and test-fenvinline failures on hppa.
2015-03-11 02:48:59 -04:00