Commit Graph

34507 Commits

Author SHA1 Message Date
Wilco Dijkstra be3eaffd5a [AArch64] Improve integer memcpy
Further optimize integer memcpy.  Small cases now include copies up
to 32 bytes.  64-128 byte copies are split into two cases to improve
performance of 64-96 byte copies.  Comments have been rewritten.
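
A rough C-level sketch of the "small" (up to 32 bytes) path described
above, using two possibly overlapping halves; the real implementation is
hand-written AArch64 assembly, so everything beyond the size thresholds
here is illustrative:

    #include <stddef.h>
    #include <string.h>

    /* Copy n <= 32 bytes with two possibly overlapping moves.  */
    static void *
    small_copy_sketch (void *dst, const void *src, size_t n)
    {
      unsigned char *d = dst;
      const unsigned char *s = src;

      if (n >= 16)           /* 16..32 bytes: two 16-byte moves */
        {
          memcpy (d, s, 16);
          memcpy (d + n - 16, s + n - 16, 16);
        }
      else if (n >= 8)       /* 8..15 bytes: two 8-byte moves */
        {
          memcpy (d, s, 8);
          memcpy (d + n - 8, s + n - 8, 8);
        }
      else if (n >= 4)       /* 4..7 bytes: two 4-byte moves */
        {
          memcpy (d, s, 4);
          memcpy (d + n - 4, s + n - 4, 4);
        }
      else if (n > 0)        /* 1..3 bytes: first, middle, last byte */
        {
          d[0] = s[0];
          d[n / 2] = s[n / 2];
          d[n - 1] = s[n - 1];
        }
      return dst;
    }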

(cherry picked from commit 700065132744e0dfa6d4d9142d63f6e3a1934726)
2020-10-12 18:29:42 +01:00
Krzysztof Koch c969e84e0c aarch64: Increase small and medium cases for __memcpy_generic
Increase the upper bound on medium cases from 96 to 128 bytes.
Now, up to 128 bytes are copied unrolled.

Increase the upper bound on small cases from 16 to 32 bytes so that
copies of 17-32 bytes are not impacted by the larger medium case.

Benchmarking:
The attached figures show relative timing difference with respect
to 'memcpy_generic', which is the existing implementation.
'memcpy_med_128' denotes the version of memcpy_generic with
only the medium case enlarged. The 'memcpy_med_128_small_32' numbers
are for the version of memcpy_generic submitted in this patch, which
has both medium and small cases enlarged. The figures were generated
using the script from:
https://www.sourceware.org/ml/libc-alpha/2019-10/msg00563.html

Depending on the platform, the performance improvement in the
bench-memcpy-random.c benchmark ranges from 6% to 20% between
the original and final version of memcpy.S

Tested against GLIBC testsuite and randomized tests.

(cherry picked from commit b9f145df85145506f8e61bac38b792584a38d88f)
2020-10-12 18:29:42 +01:00
Wilco Dijkstra 53d501d6e9 AArch64: Rename IS_ARES to IS_NEOVERSE_N1
Rename IS_ARES to IS_NEOVERSE_N1 since that is a bit clearer.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit 0f6278a8793a5d04ea31878119eccf99f469a02d)
2020-10-12 18:29:38 +01:00
Wilco Dijkstra 64458aabeb AArch64: Improve backwards memmove performance
On some microarchitectures performance of the backwards memmove improves if
the stores use STR with decreasing addresses.  So change the memmove loop
in memcpy_advsimd.S to use 2x STR rather than STP.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit bd394d131c10c9ec22c6424197b79410042eed99)
2020-10-12 18:28:42 +01:00
Wilco Dijkstra 58c6a7ae53 AArch64: Add optimized Q-register memcpy
Add a new memcpy using 128-bit Q registers - this is faster on modern
cores and reduces codesize.  Similar to the generic memcpy, small cases
include copies up to 32 bytes.  64-128 byte copies are split into two
cases to improve performance of 64-96 byte copies.  Large copies align
the source rather than the destination.

bench-memcpy-random is ~9% faster than memcpy_falkor on Neoverse N1,
so make this memcpy the default on N1 (on Centriq it is 15% faster than
memcpy_falkor).

Passes GLIBC regression tests.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
(cherry picked from commit 4a733bf375238a6a595033b5785cea7f27d61307)
2020-10-12 18:28:34 +01:00
Wilco Dijkstra 2fb2098c24 AArch64: Align ENTRY to a cacheline
Given almost all uses of ENTRY are for string/memory functions,
align ENTRY to a cacheline to simplify things.
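
Roughly what this amounts to in the AArch64 sysdep.h (paraphrased, not the
exact source): every ENTRY now starts on a 64-byte boundary.

    #define ENTRY(name)  ENTRY_ALIGN (name, 6)   /* 2^6 = 64-byte cache line */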

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit 34f0d01d5e43c7dedd002ab47f6266dfb5b79c22)
2020-10-12 16:56:18 +01:00
H.J. Lu 83aaa17144 NEWS: Mention BZ 25933 fix 2020-07-04 09:59:10 -07:00
Sunil K Pandey 0a4d3eac67 Fix avx2 strncmp offset compare condition check [BZ #25933]
strcmp-avx2.S: In the AVX2 strncmp function, strings are compared in
chunks of four vector lengths (i.e. 32x4 = 128 bytes for AVX2). After
the first four-vector comparison, the code must check whether it has
already passed the given offset. This patch implements the AVX2 offset
check for strncmp when both strings compare equal over the first four
vector lengths.
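
A conceptual C sketch of that check (the real code is AVX2 assembly; the
structure below is illustrative only):

    #include <stddef.h>

    /* Compare in 128-byte rounds (4 x 32-byte "vectors") and stop as soon
       as the length limit n has been covered -- the condition this patch
       adds to the assembly version.  */
    static int
    strncmp_chunked_sketch (const char *s1, const char *s2, size_t n)
    {
      const size_t round = 4 * 32;
      size_t done = 0;

      while (done < n)
        {
          size_t len = n - done < round ? n - done : round;
          for (size_t i = 0; i < len; i++)
            {
              unsigned char c1 = s1[done + i], c2 = s2[done + i];
              if (c1 != c2)
                return c1 < c2 ? -1 : 1;
              if (c1 == '\0')
                return 0;
            }
          done += len;
          if (done >= n)     /* already passed the given offset: equal */
            return 0;
        }
      return 0;
    }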

(cherry picked from commit 75870237ff3bb363447b03f4b0af100227570910)
2020-07-04 09:53:28 -07:00
Florian Weimer b0d3f7858c nss_compat: internal_end*ent may clobber errno, hiding ERANGE [BZ #25976]
During cleanup, before returning from get*_r functions, the end*ent
calls must not change errno.  Otherwise, an ERANGE error from the
underlying implementation can be hidden, causing unexpected lookup
failures.  This commit introduces an internal_end*ent_noerror
function which saves and restores errno, and marks the original
internal_end*ent function as warn_unused_result, so that it is used
only in contexts where errors from it can be handled explicitly.
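
A minimal sketch of the wrapper pattern (the internal_endent below is a
stand-in for the real cleanup routine, which may legitimately set errno):

    #include <errno.h>

    static void
    internal_endent (void)
    {
      /* Stub standing in for the real cleanup; it may clobber errno.  */
    }

    static void
    internal_endent_noerror (void)
    {
      int saved_errno = errno;
      internal_endent ();
      errno = saved_errno;   /* do not hide an earlier ERANGE */
    }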

Reviewed-by: DJ Delorie <dj@redhat.com>
(cherry picked from commit 790b8dda4455865cb8c3a47801f4304c1a43baf6)
2020-05-19 16:33:04 +02:00
Andreas Schwab a318448f7a Fix array overflow in backtrace on PowerPC (bug 25423)
When unwinding through a signal frame the backtrace function on PowerPC
didn't check array bounds when storing the frame address.  Fixes commit
d400dcac5e ("PowerPC: fix backtrace to handle signal trampolines").

(cherry picked from commit d93769405996dfc11d216ddbe415946617b5a494)
2020-03-18 13:41:55 -04:00
Andreas Schwab 9aaebaf805 Fix use-after-free in glob when expanding ~user (bug 25414)
The value of `end_name' points into the value of `dirname', thus don't
deallocate the latter before the last use of the former.
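
The hazard in miniature (illustrative example, not the actual glob code):

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int
    main (void)
    {
      char *dirname = strdup ("~user/file");
      char *end_name = strchr (dirname, '/');   /* points into dirname */
      /* Everything derived from end_name must be computed before the
         buffer it points into is released.  */
      printf ("user name length: %td\n", end_name - dirname - 1);
      free (dirname);                           /* only now is it safe */
      return 0;
    }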

(cherry picked from commit ddc650e9b3dc916eab417ce9f79e67337b05035c)
2020-03-17 21:51:11 -04:00
Florian Weimer 8e5d591b10 math/test-sinl-pseudo: Use stack protector only if available
This fixes commit 9333498794cde1d5cca518badf79533a24114b6f ("Avoid ldbl-96 stack
corruption from range reduction of pseudo-zero (bug 25487).").

(cherry picked from commit c10acd40262486dac597001aecc20ad9d3bd0e4a)
2020-03-15 17:33:57 -04:00
Joseph Myers 0474cd5de6 Avoid ldbl-96 stack corruption from range reduction of pseudo-zero (bug 25487).
Bug 25487 reports stack corruption in ldbl-96 sinl on a pseudo-zero
argument (a representation where all the significand bits, including
the explicit high bit, are zero, but the exponent is not zero, which
is not a valid representation for the long double type).

Although this is not a valid long double representation, existing
practice in this area (see bug 4586, originally marked invalid but
subsequently fixed) is that we still seek to avoid invalid memory
accesses as a result, in case of programs that treat arbitrary binary
data as long double representations, although the invalid
representations of the ldbl-96 format do not need to be consistently
handled the same as any particular valid representation.

This patch makes the range reduction detect pseudo-zero and unnormal
representations that would otherwise go to __kernel_rem_pio2, and
returns a NaN for them instead of continuing with the range reduction
process.  (Pseudo-zero and unnormal representations whose unbiased
exponent is less than -1 have already been safely returned from the
function before this point without going through the rest of range
reduction.)  Pseudo-zero representations would previously result in
the value passed to __kernel_rem_pio2 being all-zero, which is
definitely unsafe; unnormal representations would previously result in
a value passed whose high bit is zero, which might well be unsafe
since that is not a form of input expected by __kernel_rem_pio2.
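
For reference, a sketch of how such encodings can be recognized from the
raw ldbl-96 fields (15-bit biased exponent plus a 64-bit significand with
an explicit integer bit); the field extraction and naming here are
illustrative, not taken from the glibc sources:

    #include <stdbool.h>
    #include <stdint.h>

    /* Finite value with a nonzero exponent, but the explicit integer bit
       (bit 63 of the significand) is clear: an "unnormal"; if the whole
       significand is zero it is a pseudo-zero.  */
    static bool
    is_pseudo_zero_or_unnormal (uint16_t biased_exp, uint64_t significand)
    {
      biased_exp &= 0x7fff;                      /* drop the sign bit */
      if (biased_exp == 0 || biased_exp == 0x7fff)
        return false;                            /* zero/subnormal, inf/NaN */
      return (significand >> 63) == 0;
    }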

Tested for x86_64.

(cherry picked from commit 9333498794cde1d5cca518badf79533a24114b6f)
2020-03-15 17:33:26 -04:00
Florian Weimer e0a0770bb4 riscv: Do not use __has_include__
The user-visible preprocessor construct is called __has_include.
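
For reference, the portable way to use it (the header name below is purely
illustrative):

    #if defined __has_include
    # if __has_include (<sys/cachectl.h>)
    #  include <sys/cachectl.h>
    # endif
    #endif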

(cherry picked from commit 28dd3939221ab26c6774097e9596e30d9753f758)
2020-01-21 13:42:47 +01:00
Florian Weimer ea6f2c3174 misc/test-errno-linux: Handle EINVAL from quotactl
In commit 3dd4d40b420846dd35869ccc8f8627feef2cff32 ("xfs: Sanity check
flags of Q_XQUOTARM call"), Linux 5.4 added checking for the flags
argument, causing the test to fail due to too restrictive test
expectations.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit 1f7525d924b608a3e43b10fcfb3d46b8a6e9e4f9)
2019-12-05 17:30:40 +01:00
Kamlesh Kumar 42786ee476 <string.h>: Define __CORRECT_ISO_CPP_STRING_H_PROTO for Clang [BZ #25232]
Without the asm redirects, strchr et al. are not const-correct.

libc++ has a wrapper header that works with and without
__CORRECT_ISO_CPP_STRING_H_PROTO (using a Clang extension).  But when
Clang is used with libstdc++ or just C headers, the overloaded functions
with the correct types are not declared.

This change does not impact current GCC (with libstdc++ or libc++).

(cherry picked from commit 953ceff17a4a15b10cfdd5edc3c8cae4884c8ec3)
2019-12-05 16:55:19 +01:00
Florian Weimer 0b4c3e1e0b x86: Assume --enable-cet if GCC defaults to CET [BZ #25225]
This links in CET support if GCC defaults to CET.  Otherwise, __CET__
is defined, yet CET functionality is not compiled and linked into the
dynamic loader, resulting in a linker failure due to undefined
references to _dl_cet_check and _dl_open_check.

(cherry picked from commit 9fb8139079ef0bb1aa33a4ae418cbb113b9b9da7)
2019-12-03 21:08:49 +01:00
Florian Weimer 44a61d4589 libio: Disable vtable validation for pre-2.1 interposed handles [BZ #25203]
Commit c402355dfa ("libio: Disable
vtable validation in case of interposition [BZ #23313]") only covered
the interposable glibc 2.1 handles, in libio/stdfiles.c.  The
parallel code in libio/oldstdfiles.c needs similar detection logic.

Fixes (again) commit db3476aff1
("libio: Implement vtable verification [BZ #20191]").

Change-Id: Ief6f9f17e91d1f7263421c56a7dc018f4f595c21
(cherry picked from commit cb61630ed712d033f54295f776967532d3f4b46a)
2019-11-28 14:17:27 +01:00
Florian Weimer 5422ac2d08 Update NEWS for CVE-2019-19126 2019-11-22 13:45:03 +01:00
Marcin Kościelnicki 2626b15e88 rtld: Check __libc_enable_secure before honoring LD_PREFER_MAP_32BIT_EXEC (CVE-2019-19126) [BZ #25204]
The problem was introduced in glibc 2.23, in commit b9eb92ab05
("Add Prefer_MAP_32BIT_EXEC to map executable pages with MAP_32BIT").

(cherry picked from commit d5dfad4326fc683c813df1e37bbf5cf920591c8e)
Change-Id: Ib782573b4623ee3edfa9f98ad62f69b9d8edcb27
2019-11-22 13:11:36 +01:00
Florian Weimer 845278f2c6 Linux: Use in-tree copy of SO_ constants for !__USE_MISC [BZ #24532]
The kernel changes for a 64-bit time_t on 32-bit architectures
resulted in <asm/socket.h> indirectly including <linux/posix_types.h>.
The latter is not namespace-clean for the POSIX version of
<sys/socket.h>.

This issue has persisted across several Linux releases, so this commit
creates our own copy of the SO_* definitions for !__USE_MISC mode.

The new test socket/tst-socket-consts ensures that the copy is
consistent with the kernel definitions (which vary across
architectures).  The test is tricky to get right because CPPFLAGS
includes include/libc-symbols.h, which in turn defines _GNU_SOURCE
unconditionally.

Tested with build-many-glibcs.py.  I verified that a discrepancy in
the definitions actually results in a failure of the
socket/tst-socket-consts test.

(cherry picked from commit 7854ebf8ed18180189c335f6f499fe9322458f0b)
2019-11-18 18:16:03 +01:00
DJ Delorie a683450f26 Base max_fast on alignment, not width, of bins (Bug 24903)
set_max_fast sets the "impossibly small" value based on,
eventually, MALLOC_ALIGNMENT.  The comparison for the smallest
chunk used is, eventually, against MIN_CHUNK_SIZE.  Note that i386
is the only platform where these are the same; elsewhere a smallest
chunk *would* be put in a no-fastbins fastbin.

This change calculates the "impossibly small" value
based on MIN_CHUNK_SIZE instead, so that we can know it will
always be impossibly small.

(cherry picked from commit ff12e0fb91b9072800f031cb21fb2651ee7b6251)
2019-11-18 16:10:51 +01:00
Florian Weimer 1b96d1d90b malloc: Various cleanups for malloc/tst-mxfast
(cherry picked from commit f9769a239784772453d595bc2f4bed8739810e06)
2019-11-18 16:10:11 +01:00
DJ Delorie 4618f1ffba Add glibc.malloc.mxfast tunable
* elf/dl-tunables.list: Add glibc.malloc.mxfast.
* manual/tunables.texi: Document it.
* malloc/malloc.c (do_set_mxfast): New.
(__libc_mallopt): Call it.
* malloc/arena.c: Add mxfast tunable.
* malloc/tst-mxfast.c: New.
* malloc/Makefile: Add it.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit c48d92b430c480de06762f80c104922239416826)
2019-11-18 16:08:44 +01:00
Niklas Hambüchen c6e4c3198b malloc: Fix missing accounting of top chunk in malloc_info [BZ #24026]
Fixes `<total type="rest" size="...">` incorrectly showing as 0 most
of the time.

The rest value being wrong is significant because to compute the
actual amount of memory handed out via malloc, the user must subtract
it from <system type="current" size="...">. That result being wrong
makes investigating memory fragmentation issues like
<https://bugzilla.redhat.com/show_bug.cgi?id=843478> close to
impossible.

(cherry picked from commit b6d2c4475d5abc05dd009575b90556bdd3c78ad0)
2019-11-18 16:04:03 +01:00
Florian Weimer a0a551d259 malloc: Remove unwanted leading whitespace in malloc_info [BZ #24867]
It was introduced in commit 6c8dbf00f5
("Reformat malloc to gnu style.").

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit b0f6679bcd738ea244a14acd879d974901e56c8e)
2019-11-18 16:02:10 +01:00
Wilco Dijkstra 0ad788face Small tcache improvements
Change the tcache->counts[] entries to uint16_t - this removes
the limit set by char and allows a larger tcache.  Remove a few
redundant asserts.
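
The shape of the per-thread cache after this change, in sketch form
(simplified; the double-free detection key added in glibc 2.29 and other
details are omitted, and TCACHE_MAX_BINS is shown with its usual value
of 64):

    #include <stdint.h>

    typedef struct tcache_entry
    {
      struct tcache_entry *next;      /* singly linked per-bin free list */
    } tcache_entry;

    typedef struct tcache_perthread_struct
    {
      uint16_t counts[64];            /* was char, which capped the usable count */
      tcache_entry *entries[64];
    } tcache_perthread_struct;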

bench-malloc-thread with 4 threads is ~15% faster on Cortex-A72.

Reviewed-by: DJ Delorie <dj@redhat.com>

	* malloc/malloc.c (MAX_TCACHE_COUNT): Increase to UINT16_MAX.
	(tcache_put): Remove redundant assert.
	(tcache_get): Remove redundant asserts.
	(__libc_malloc): Check tcache count is not zero.
	* manual/tunables.texi (glibc.malloc.tcache_count): Update maximum.

(cherry picked from commit 1f50f2ad854c84ead522bfc7331b46dbe6057d53)
2019-11-18 15:55:12 +01:00
Joseph Myers 9a3ff995bd Fix assertion in malloc.c:tcache_get.
One of the warnings that appears with -Wextra is "ordered comparison
of pointer with integer zero" in malloc.c:tcache_get, for the
assertion:

  assert (tcache->entries[tc_idx] > 0);

Indeed, a "> 0" comparison does not make sense for
tcache->entries[tc_idx], which is a pointer.  My guess is that
tcache->counts[tc_idx] is what's intended here, and this patch changes
the assertion accordingly.
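
The corrected line (counts[], not entries[], holds the per-bin count):

  assert (tcache->counts[tc_idx] > 0);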

Tested for x86_64.

	* malloc/malloc.c (tcache_get): Compare tcache->counts[tc_idx]
	with 0, not tcache->entries[tc_idx].

(cherry picked from commit 77dc0d8643aa99c92bf671352b0a8adde705896f)
2019-11-18 15:45:39 +01:00
Stefan Liebler 8646009efd Fix alignment of TLS variables for tls variant TLS_TCB_AT_TP [BZ #23403]
The alignment of TLS variables is wrong if accessed from within a thread
for architectures with tls variant TLS_TCB_AT_TP.
For the main thread the static tls data is properly aligned.
For other threads the alignment depends on the alignment of the thread
pointer as the static tls data is located relative to this pointer.

This patch adds this alignment for TLS_TCB_AT_TP variants in the same way
as it is already done for TLS_DTV_AT_TP. The thread pointer is also already
properly aligned if the user provides their own stack for the new thread.

This patch extends the testcase nptl/tst-tls1.c in order to check the
alignment of the tls variables and it adds a pthread_create invocation
with a user provided stack.
The test itself is migrated from test-skeleton.c to test-driver.c
and the missing support functions xpthread_attr_setstack and xposix_memalign
are added.

ChangeLog:

	[BZ #23403]
	* nptl/allocatestack.c (allocate_stack): Align pointer pd for
	TLS_TCB_AT_TP tls variant.
	* nptl/tst-tls1.c: Migrate to support/test-driver.c.
	Add alignment checks.
	* support/Makefile (libsupport-routines): Add xposix_memalign and
	xpthread_setstack.
	* support/support.h: Add xposix_memalign.
	* support/xthread.h: Add xpthread_attr_setstack.
	* support/xposix_memalign.c: New File.
	* support/xpthread_attr_setstack.c: Likewise.

(cherry picked from commit bc79db3fd487daea36e7c130f943cfb9826a41b4)
2019-11-05 14:36:16 -05:00
Dragan Mladjenovic a7646651f8 mips: Force RWX stack for hard-float builds that can run on pre-4.8 kernels
Linux/Mips kernels prior to 4.8 could potentially crash the user
process when doing FPU emulation while running on non-executable
user stack.

Currently, gcc doesn't emit .note.GNU-stack for mips, but that will
change in the future. To ensure that glibc can be used with such
future gcc, without silently resulting in binaries that might crash
in runtime, this patch forces RWX stack for all built objects if
configured to run against minimum kernel version less than 4.8.

	* sysdeps/unix/sysv/linux/mips/Makefile
	(test-xfail-check-execstack):
	Move under mips-has-gnustack != yes.
	(CFLAGS-.o*, ASFLAGS-.o*): New rules.
	Apply -Wa,-execstack if mips-force-execstack == yes.
	* sysdeps/unix/sysv/linux/mips/configure: Regenerated.
	* sysdeps/unix/sysv/linux/mips/configure.ac
	(mips-force-execstack): New var.
	Set to yes for hard-float builds with minimum_kernel < 4.8.0
	or minimum_kernel not set at all.
	(mips-has-gnustack): New var.
	Use value of libc_cv_as_noexecstack
	if mips-force-execstack != yes, otherwise set to no.

(cherry picked from commit 33bc9efd91de1b14354291fc8ebd5bce96379f12)
2019-11-05 14:49:11 -03:00
Florian Weimer 52a6381659 elf: Refuse to dlopen PIE objects [BZ #24323]
Another executable has already been mapped, so the dynamic linker
cannot perform relocations correctly for the second executable.

(cherry picked from commit 2c75b545de6fe3c44138799c68217a94bc669a88)
2019-10-31 19:29:35 -04:00
DJ Delorie f1f24cdeba nss_db: fix endent wrt NULL mappings [BZ #24695] [BZ #24696]
nss_db allows for getpwent et al to be called without a set*ent,
but it only works once.  After the last get*ent a set*ent is
required to restart, because the end*ent did not properly reset
the module.  Resetting it to NULL allows for a proper restart.

If the database doesn't exist, however, end*ent erroneously called
munmap, which set errno.
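
A sketch of the end*ent shape implied by the fix (the names and the
mapping structure below are illustrative): only unmap when a database was
actually mapped, and clear the pointer so a later set*ent/get*ent restarts
cleanly.

    #include <stddef.h>
    #include <sys/mman.h>

    static struct { void *addr; size_t len; } mapping;

    static void
    db_endent_sketch (void)
    {
      if (mapping.addr != NULL)
        {
          /* munmap is never reached for a missing database, so it cannot
             clobber errno in that case.  */
          munmap (mapping.addr, mapping.len);
          mapping.addr = NULL;
        }
    }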

The test case runs "makedb" inside the testroot, so needs selinux
DSOs installed.

(cherry picked from commit 99135114ba23c3110b7e4e650fabdc5e639746b7)
2019-10-31 18:12:21 -04:00
Adhemerval Zanella c1803823c6 support: Export bindir path on support_path
Checked on x86_64-linux-gnu.

	* support/Makefile (CFLAGS-support_paths.c): Add -DBINDIR_PATH.
	* support/support.h (support_bindir_prefix): New variable.
	* support/support_paths.c [BINDIR_PATH] (support_bindir_prefix):

Reviewed-by: DJ Delorie <dj@redhat.com>
(cherry picked from commit c7ac9caaae6f8d02d4e0c7618d4991324a084c66)
2019-10-31 18:10:53 -04:00
H.J. Lu 5e1548a6d9 Call _dl_open_check after relocation [BZ #24259]
This is a workaround for [BZ #20839] which doesn't remove the NODELETE
object when _dl_open_check throws an exception.  Move it after relocation
in dl_open_worker to avoid leaving the NODELETE object mapped without
relocation.

	[BZ #24259]
	* elf/dl-open.c (dl_open_worker): Call _dl_open_check after
	relocation.
	* sysdeps/x86/Makefile (tests): Add tst-cet-legacy-5a,
	tst-cet-legacy-5b, tst-cet-legacy-6a and tst-cet-legacy-6b.
	(modules-names): Add tst-cet-legacy-mod-5a, tst-cet-legacy-mod-5b,
	tst-cet-legacy-mod-5c, tst-cet-legacy-mod-6a, tst-cet-legacy-mod-6b
	and tst-cet-legacy-mod-6c.
	(CFLAGS-tst-cet-legacy-5a.c): New.
	(CFLAGS-tst-cet-legacy-5b.c): Likewise.
	(CFLAGS-tst-cet-legacy-mod-5a.c): Likewise.
	(CFLAGS-tst-cet-legacy-mod-5b.c): Likewise.
	(CFLAGS-tst-cet-legacy-mod-5c.c): Likewise.
	(CFLAGS-tst-cet-legacy-6a.c): Likewise.
	(CFLAGS-tst-cet-legacy-6b.c): Likewise.
	(CFLAGS-tst-cet-legacy-mod-6a.c): Likewise.
	(CFLAGS-tst-cet-legacy-mod-6b.c): Likewise.
	(CFLAGS-tst-cet-legacy-mod-6c.c): Likewise.
	($(objpfx)tst-cet-legacy-5a): Likewise.
	($(objpfx)tst-cet-legacy-5a.out): Likewise.
	($(objpfx)tst-cet-legacy-mod-5a.so): Likewise.
	($(objpfx)tst-cet-legacy-mod-5b.so): Likewise.
	($(objpfx)tst-cet-legacy-5b): Likewise.
	($(objpfx)tst-cet-legacy-5b.out): Likewise.
	(tst-cet-legacy-5b-ENV): Likewise.
	($(objpfx)tst-cet-legacy-6a): Likewise.
	($(objpfx)tst-cet-legacy-6a.out): Likewise.
	($(objpfx)tst-cet-legacy-mod-6a.so): Likewise.
	($(objpfx)tst-cet-legacy-mod-6b.so): Likewise.
	($(objpfx)tst-cet-legacy-6b): Likewise.
	($(objpfx)tst-cet-legacy-6b.out): Likewise.
	(tst-cet-legacy-6b-ENV): Likewise.
	* sysdeps/x86/tst-cet-legacy-5.c: New file.
	* sysdeps/x86/tst-cet-legacy-5a.c: Likewise.
	* sysdeps/x86/tst-cet-legacy-5b.c: Likewise.
	* sysdeps/x86/tst-cet-legacy-6.c: Likewise.
	* sysdeps/x86/tst-cet-legacy-6a.c: Likewise.
	* sysdeps/x86/tst-cet-legacy-6b.c: Likewise.
	* sysdeps/x86/tst-cet-legacy-mod-5.c: Likewise.
	* sysdeps/x86/tst-cet-legacy-mod-5a.c: Likewise.
	* sysdeps/x86/tst-cet-legacy-mod-5b.c: Likewise.
	* sysdeps/x86/tst-cet-legacy-mod-5c.c: Likewise.
	* sysdeps/x86/tst-cet-legacy-mod-6.c: Likewise.
	* sysdeps/x86/tst-cet-legacy-mod-6a.c: Likewise.
	* sysdeps/x86/tst-cet-legacy-mod-6b.c: Likewise.
	* sysdeps/x86/tst-cet-legacy-mod-6c.c: Likewise.

(cherry picked from commit d0093c5cefb7f7a4143f3bb03743633823229cc6)
2019-10-31 16:53:28 -04:00
Joseph Myers afbf970cae Fix RISC-V vfork build with Linux 5.3 kernel headers.
Building glibc for RISC-V with Linux 5.3 kernel headers fails because
<linux/sched.h>, included in vfork.S for CLONE_* constants, contains a
structure definition not safe for inclusion in assembly code.

All other architectures already avoid use of that header in vfork.S,
either defining the CLONE_* constants locally or embedding the
required values directly in the relevant instruction, where they
implement vfork using the clone syscall (see the implementations for
aarch64, ia64, mips and nios2).  This patch makes the RISC-V version
define the constants locally like the other architectures.
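
The two flags vfork needs, defined locally; the values below are the
standard Linux UAPI ones that other ports already hard-code:

    #define CLONE_VM     0x00000100  /* share the address space */
    #define CLONE_VFORK  0x00004000  /* suspend parent until child execs/exits */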

Tested build for all three RISC-V configurations in
build-many-glibcs.py with Linux 5.3 headers.

	* sysdeps/unix/sysv/linux/riscv/vfork.S: Do not include
	<linux/sched.h>.
	(CLONE_VM): New macro.
	(CLONE_VFORK): Likewise.

(cherry picked from commit 8cacbcf4a984ccac24efedb795d9c8a7f149d17b)
2019-09-20 21:30:04 +02:00
Aurelien Jarno a132a2c305 alpha: force old OSF1 syscalls for getegid, geteuid and getppid [BZ #24986]
On alpha, Linux kernel 5.1 added the standard getegid, geteuid and
getppid syscalls (commit ecf7e0a4ad15287). Up to now alpha was using
the corresponding OSF1 syscalls through:
 - sysdeps/unix/alpha/getegid.S
 - sysdeps/unix/alpha/geteuid.S
 - sysdeps/unix/alpha/getppid.S

When building against kernel headers >= 5.1, glibc now uses the new
syscalls through sysdeps/unix/sysv/linux/syscalls.list. When it is then
run on an older kernel, the corresponding 3 functions fail.

A quick fix is to move the OSF1 wrappers under the
sysdeps/unix/sysv/linux/alpha directory so they override the standard
linux ones. A better fix would be to try the new syscalls and fall back
to the old OSF1 ones in case the new ones fail. This can be implemented in
a later commit.

Changelog:
	[BZ #24986]
	* sysdeps/unix/alpha/getegid.S: Move to ...
	* sysdeps/unix/sysv/linux/alpha/getegid.S: ... here.
	* sysdeps/unix/alpha/geteuid.S: Move to ...
	* sysdeps/unix/sysv/linux/alpha/geteuid.S: ... here.
	* sysdeps/unix/alpha/getppid.S: Move to ...
	* sysdeps/unix/sysv/linux/alpha/getppid.S: ... here.
2019-09-14 20:12:11 +02:00
Wilco Dijkstra 91372f0001 Improve performance of memmem
This patch significantly improves performance of memmem using a novel
modified Horspool algorithm.  Needles up to size 256 use a bad-character
table indexed by hashed pairs of characters to quickly skip past mismatches.
Long needles use a self-adapting filtering step to avoid comparing the whole
needle repeatedly.

By limiting the needle length to 256, the shift table only requires 8 bits
per entry, lowering preprocessing overhead and minimizing cache effects.
This limit also implies worst-case performance is linear.
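
For orientation, a self-contained Horspool-style sketch; for brevity it
indexes the shift table by a single byte, whereas the glibc version hashes
character pairs into a 256-entry, 8-bit table and caps needles at 256
bytes:

    #include <stddef.h>
    #include <string.h>

    static const unsigned char *
    horspool_sketch (const unsigned char *hs, size_t hs_len,
                     const unsigned char *ne, size_t ne_len)
    {
      size_t shift[256];

      /* Bytes not in the needle allow skipping the whole needle.  */
      for (size_t i = 0; i < 256; i++)
        shift[i] = ne_len;
      /* Bytes in the needle (except its last position) record their
         distance from the end of the needle.  */
      for (size_t i = 0; i + 1 < ne_len; i++)
        shift[ne[i]] = ne_len - 1 - i;

      for (size_t j = 0; j + ne_len <= hs_len; )
        {
          if (memcmp (hs + j, ne, ne_len) == 0)
            return hs + j;
          /* Skip based on the last byte of the current window.  */
          j += shift[hs[j + ne_len - 1]];
        }
      return NULL;
    }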

Small needles up to size 2 use a dedicated linear search.  Very long needles
use the Two-Way algorithm (to avoid increasing stack size or slowing down
the common case, inlining is disabled).

The performance gain is 6.6 times on English text on AArch64 using random
needles with average size 8.

Tested against GLIBC testsuite and randomized tests.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>

	* string/memmem.c (__memmem): Rewrite to improve performance.

(cherry picked from commit 680942b0167715e123d934b609060cd382f8e39f)
2019-09-13 14:51:35 +01:00
Wilco Dijkstra 1ad15e008c Improve performance of strstr
This patch significantly improves performance of strstr using a novel
modified Horspool algorithm.  Needles up to size 256 use a bad-character
table indexed by hashed pairs of characters to quickly skip past mismatches.
Long needles use a self-adapting filtering step to avoid comparing the whole
needle repeatedly.

By limiting the needle length to 256, the shift table only requires 8 bits
per entry, lowering preprocessing overhead and minimizing cache effects.
This limit also implies worst-case performance is linear.

Small needles up to size 3 use a dedicated linear search.  Very long needles
use the Two-Way algorithm.
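
A reconstructed sketch of how the dedicated 2-byte needle path can work
(rolling two characters through an integer so only one comparison per
haystack byte is needed; not claimed to be the verbatim glibc code, and it
assumes the needle has exactly two non-NUL bytes):

    #include <stdint.h>

    static char *
    strstr2_sketch (const unsigned char *hs, const unsigned char *ne)
    {
      uint32_t h1 = ((uint32_t) ne[0] << 16) | ne[1];
      uint32_t h2 = 0;
      for (int c = hs[0]; h1 != h2 && c != 0; c = *++hs)
        h2 = (h2 << 16) | (unsigned char) c;
      return h1 == h2 ? (char *) hs - 2 : NULL;
    }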

The performance gain using the improved bench-strstr on Cortex-A72 is 5.8
times basic_strstr and 3.7 times twoway_strstr.

Tested against GLIBC testsuite, randomized tests and the GNULIB strstr test
(https://git.savannah.gnu.org/cgit/gnulib.git/tree/tests/test-strstr.c).

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>

	* string/str-two-way.h (two_way_short_needle): Add inline to avoid
	warning.
	(two_way_long_needle): Block inlining.
	* string/strstr.c (strstr2): Add new function.
	(strstr3): Likewise.
	(STRSTR): Completely rewrite strstr to improve performance.

(cherry picked from commit 5e0a7ecb6629461b28adc1a5aabcc0ede122f201)
2019-09-13 14:51:28 +01:00
Aurelien Jarno ef98313dd3 Update Alpha libm-test-ulps
Changelog:

	* sysdeps/alpha/fpu/libm-test-ulps: Regenerated using GCC 9.2.

(cherry picked from commit b5367a08ae810e3c648fb036f2e5766204f9d83f)
2019-09-03 21:43:22 +02:00
Adhemerval Zanella 6d8eaf4a25 hppa: Update libm-tests-ulps
The make regen-ulps was done on a PA8900 with 8.3.0.

	* sysdeps/hppa/fpu/libm-test-ulps: Update.

(cherry picked from commit 3175dcc1e67425ad471caddc3d3cfae357de26ff)
2019-08-18 11:38:26 +02:00
Richard Henderson 23ef51a50a alpha: Do not redefine __NR_shmat or __NR_osf_shmat
Fixes build using v5.1-rc1 headers.

The kernel has cleaned up how these are defined.  Previous behavior
was to define __NR_osf_shmat as 209 and not define __NR_shmat.
Current behavior is to define __NR_shmat as 209 and then define
__NR_osf_shmat as __NR_shmat.

	* sysdeps/unix/sysv/linux/alpha/kernel-features.h (__NR_shmat):
	Do not redefine.
	* sysdeps/unix/sysv/linux/alpha/sysdep.h (__NR_osf_shmat):
	Do not redefine.

(cherry picked from commit d5ecee822e72a2fd156338ab2be2f2e70a1da55a)
2019-08-15 19:52:22 +02:00
Adhemerval Zanella 2d3fefd7ce posix: Fix large mmap64 offset for mips64n32 (BZ#24699)
The fix for BZ#21270 (commit 158d5fa0e1) added a mask to prevent offsets
larger than 2^44 from being used with __NR_mmap2.  However, mips64n32 uses
__NR_mmap, like mips64n64, but still defines off_t as the old non-LFS type
(other ILP32 ABIs, such as x32, define off_t equal to off64_t).  This leads
to the mask meant only for __NR_mmap2 being applied to the __NR_mmap call as
well, thus limiting the maximum offset that can be used with mmap64.

This patch fixes it by setting the high mask only when __NR_mmap2 is used.
The posix/tst-mmap-offset.c test already exercises this and also fails for
mips64n32.  The patch also changes the test to check an arch-specific header
that defines the maximum supported offset.
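
In outline, the conditional definition this describes (paraphrased from
the commit, not the verbatim source; the 2^44 figure comes from passing
offsets in 4096-byte units through a 32-bit syscall argument):

    #include <stdint.h>

    #ifdef __NR_mmap2
    /* mmap2 takes the offset in page units, so byte offsets at or above
       2^44 cannot be represented and must be rejected.  */
    # define MMAP_OFF_HIGH_MASK  (~(((uint64_t) 1 << 44) - 1))
    #else
    /* Plain mmap takes the full 64-bit byte offset; nothing to mask.  */
    # define MMAP_OFF_HIGH_MASK  0
    #endif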

Checked on x86_64-linux-gnu and i686-linux-gnu; I also ran tst-mmap-offset
on QEMU-simulated mips64 with a 3.2.0 kernel for both mips-linux-gnu and
mips64-n32-linux-gnu.

	[BZ #24699]
	* posix/tst-mmap-offset.c: Mention BZ #24699.
	(do_test_bz21270): Rename to do_test_large_offset and use
	mmap64_maximum_offset to check for maximum expected offset value.
	* sysdeps/generic/mmap_info.h: New file.
	* sysdeps/unix/sysv/linux/mips/mmap_info.h: Likewise.
	* sysdeps/unix/sysv/linux/mmap64.c (MMAP_OFF_HIGH_MASK): Define iff
	__NR_mmap2 is used.

(cherry picked from commit a008c76b56e4f958cf5a0d6f67d29fade89421b7)
2019-07-15 09:26:39 -03:00
Szabolcs Nagy 4163c382f0 aarch64: handle STO_AARCH64_VARIANT_PCS
Backport of commit 82bc69c012838a381c4167c156a06f4598f34227
and commit 30ba0375464f34e4bf8129f3d3dc14d0c09add17
without using DT_AARCH64_VARIANT_PCS for optimizing the symbol table check.
This is needed so the internal ABI between ld.so and libc.so is unchanged.

Avoid lazy binding of symbols that may follow a variant PCS with different
register usage convention from the base PCS.

Currently the lazy binding entry code does not preserve all the registers
required for AdvSIMD and SVE vector calls.  Saving and restoring all
registers unconditionally may break existing binaries, even if they never
use vector calls, because of the larger stack requirement for lazy
resolution, which can be significant on an SVE system.

The solution is to mark all symbols in the symbol table that may follow
a variant PCS so the dynamic linker can handle them specially.  In this
patch such symbols are always resolved at load time, not lazily.

So currently LD_AUDIT for variant PCS symbols is not supported; for that,
the _dl_runtime_profile entry needs to be changed e.g. to unconditionally
save/restore all registers (but pass down arg and retval registers to
pltentry/exit callbacks according to the base PCS).

This patch also removes a __builtin_expect from the modified code because
the branch prediction hint did not seem useful.

	* sysdeps/aarch64/dl-machine.h (elf_machine_lazy_rel): Check
	STO_AARCH64_VARIANT_PCS and bind such symbols at load time.
2019-07-12 10:14:12 +01:00
Szabolcs Nagy 53f48f845c aarch64: add STO_AARCH64_VARIANT_PCS and DT_AARCH64_VARIANT_PCS
STO_AARCH64_VARIANT_PCS is a non-visibility st_other flag for marking
symbols that reference functions that may follow a variant PCS with
different register usage convention from the base PCS.

DT_AARCH64_VARIANT_PCS is a dynamic tag that marks ELF modules that
have R_*_JUMP_SLOT relocations for symbols marked with
STO_AARCH64_VARIANT_PCS (i.e. have variant PCS calls via a PLT).
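
The constants this adds to <elf.h>, with the values given by the AArch64
ELF ABI (shown here for reference):

    #define STO_AARCH64_VARIANT_PCS 0x80             /* st_other flag */
    #define DT_AARCH64_VARIANT_PCS  (DT_LOPROC + 5)  /* 0x70000005 */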

	* elf/elf.h (STO_AARCH64_VARIANT_PCS): Define.
	(DT_AARCH64_VARIANT_PCS): Define.
2019-07-09 11:53:59 +01:00
Florian Weimer a00dc2a18f NEWS: Add deprecated section heading 2019-07-09 10:34:07 +02:00
Florian Weimer da347f4aa3 io: Remove copy_file_range emulation [BZ #24744]
The kernel is evolving this interface (e.g., removal of the
restriction on cross-device copies), and keeping up with that
is difficult.  Applications which need the function should
run kernels which support the system call instead of relying on
the imperfect glibc emulation.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit 5a659ccc0ec217ab02a4c273a1f6d346a359560a)
2019-07-09 10:01:21 +02:00
Dmitry V. Levin 34fb5f61d3 libio: do not attempt to free wide buffers of legacy streams [BZ #24228]
Commit a601b74d31 aka glibc-2.23~693
("In preparation for fixing BZ#16734, fix failure in misc/tst-error1-mem
when _G_HAVE_MMAP is turned off.") introduced a regression:
_IO_unbuffer_all now invokes _IO_wsetb to free wide buffers of all
files, including legacy standard files which are small statically
allocated objects that do not have wide buffers and the _mode member,
causing memory corruption.

Another memory corruption in _IO_unbuffer_all happens when -1
is assigned to the _mode member of legacy standard files that
do not have it.

[BZ #24228]
* libio/genops.c (_IO_unbuffer_all)
[SHLIB_COMPAT (libc, GLIBC_2_0, GLIBC_2_1)]: Do not attempt to free wide
buffers and access _IO_FILE_complete members of legacy libio streams.
* libio/tst-bz24228.c: New file.
* libio/tst-bz24228.map: Likewise.
* libio/Makefile [build-shared] (tests): Add tst-bz24228.
[build-shared] (generated): Add tst-bz24228.mtrace and
tst-bz24228.check.
[run-built-tests && build-shared] (tests-special): Add
$(objpfx)tst-bz24228-mem.out.
(LDFLAGS-tst-bz24228, tst-bz24228-ENV): New variables.
($(objpfx)tst-bz24228-mem.out): New rule.

(cherry picked from commit 21cc130b78a4db9113fb6695e2b951e697662440)
2019-06-20 17:32:07 +00:00
Zack Weinberg 2ec0b166bf Use a proper C tokenizer to implement the obsolete typedefs test.
The test for obsolete typedefs in installed headers was implemented
using grep, and could therefore get false positives on e.g. “ulong”
in a comment.  It was also scanning all of the headers included by
our headers, and therefore testing headers we don’t control, e.g.
Linux kernel headers.

This patch splits the obsolete-typedef test from
scripts/check-installed-headers.sh to a separate program,
scripts/check-obsolete-constructs.py.  Being implemented in Python,
it is feasible to make it tokenize C accurately enough to avoid false
positives on the contents of comments and strings.  It also only
examines $(headers) in each subdirectory--all the headers we install,
but not any external dependencies of those headers.  Headers whose
installed name starts with finclude/ are ignored, on the assumption
that they contain Fortran.

It is also feasible to make the new test understand the difference
between _defining_ the obsolete typedefs and _using_ the obsolete
typedefs, which means posix/{bits,sys}/types.h no longer need to be
exempted.  This uncovered an actual bug in bits/types.h: __quad_t and
__u_quad_t were being used to define __S64_TYPE, __U64_TYPE,
__SQUAD_TYPE and __UQUAD_TYPE.  These are changed to __int64_t and
__uint64_t respectively.  This is a safe change, despite the comments
in bits/types.h claiming a difference between __quad_t and __int64_t,
because those comments are incorrect.  In all current ABIs, both
__quad_t and __int64_t are ‘long’ when ‘long’ is a 64-bit type, and
‘long long’ when ‘long’ is a 32-bit type, and similarly for __u_quad_t
and __uint64_t.  (Changing the types to be what the comments say they
are would be an ABI break, as it affects C++ name mangling.)  This
patch includes a minimal change to make the comments not completely
wrong.

sys/types.h was defining the legacy BSD u_intN_t typedefs using a
construct that was not necessarily consistent with how the C99 uintN_t
typedefs are defined, and is also too complicated for the new script to
understand (it lexes C relatively accurately, but it does not attempt
to expand preprocessor macros, nor does it do any actual parsing).
This patch cuts all of that out and uses bits/types.h's __uintN_t typedefs
to define u_intN_t instead.  This is verified to not change the ABI on
any supported architecture, via the c++-types test, which means u_intN_t
and uintN_t were, in fact, consistent on all supported architectures.
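
What the simplified <sys/types.h> definitions amount to, matching the
ChangeLog below (the __uintN_t types come from glibc's internal
<bits/types.h>):

    typedef __uint8_t  u_int8_t;
    typedef __uint16_t u_int16_t;
    typedef __uint32_t u_int32_t;
    typedef __uint64_t u_int64_t;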

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

	* scripts/check-obsolete-constructs.py: New test script.
	* scripts/check-installed-headers.sh: Remove tests for
	obsolete typedefs, superseded by check-obsolete-constructs.py.
	* Rules: Run scripts/check-obsolete-constructs.py over $(headers)
	as a special test.  Update commentary.
	* posix/bits/types.h (__SQUAD_TYPE, __S64_TYPE): Define as __int64_t.
	(__UQUAD_TYPE, __U64_TYPE): Define as __uint64_t.
	Update commentary.
	* posix/sys/types.h (__u_intN_t): Remove.
	(u_int8_t): Typedef using __uint8_t.
	(u_int16_t): Typedef using __uint16_t.
	(u_int32_t): Typedef using __uint32_t.
	(u_int64_t): Typedef using __uint64_t.

(cherry picked from commit 711a322a235d4c8177713f11aa59156603b94aeb)
2019-06-05 14:15:01 +02:00
Florian Weimer bd0a325b6a malloc: Fix warnings in tests with GCC 9
This is a partial backport of test changes in commit
9bf8e29ca136094f73f69f725f15c51facc97206 ("malloc: make malloc fail
with requests larger than PTRDIFF_MAX (BZ#23741)"), without the
actual functionality changes.
2019-06-05 14:02:05 +02:00
Wilco Dijkstra 95d66fecaa Fix tcache count maximum (BZ #24531)
The tcache counts[] array is a char, which has a very small range and thus
may overflow.  When setting the tcache_count tunable, there is no overflow
check.  However, the tunable must not be larger than the maximum value of
the tcache counts[] array, otherwise it can overflow when filling the tcache.
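
The clamp, in sketch form (mp_.tcache_count and MAX_TCACHE_COUNT are the
glibc-internal names from the ChangeLog; the body is paraphrased, not the
exact source):

    static inline int
    do_set_tcache_count (size_t value)
    {
      if (value <= MAX_TCACHE_COUNT)   /* ignore values counts[] cannot hold */
        mp_.tcache_count = value;
      return 1;
    }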

	[BZ #24531]
	* malloc/malloc.c (MAX_TCACHE_COUNT): New define.
	(do_set_tcache_count): Only update if count is small enough.
	* manual/tunables.texi (glibc.malloc.tcache_count): Document max value.

(cherry picked from commit 5ad533e8e65092be962e414e0417112c65d154fb)
2019-05-22 14:17:01 +01:00