Commit Graph

628 Commits

Author SHA1 Message Date
Ulrich Drepper 8d4f46c613 Move fma routines to right place 2011-10-20 21:55:41 -04:00
Ulrich Drepper 855d156018 Optimize x86-64 rawmemchr and add test 2011-10-19 22:22:29 -04:00
Ulrich Drepper d9a4d2ab27 Add optimized str{,n}casecmp for AVX on x86-64 2011-10-19 12:42:38 -04:00
Andreas Schwab 8f3b1ffefa Fix PLT use for feraiseexcept on x86_64 2011-10-19 13:03:31 +02:00
Ulrich Drepper d9a8d0abcc Use new internal libc_fe* interfaces in more functions 2011-10-18 15:11:31 -04:00
Ulrich Drepper 4855e3ddf5 Provide combined internal feholdexcept/fesetround interface 2011-10-18 09:59:04 -04:00
Ulrich Drepper 23ce562780 Pretty print last change to x86-64 mathinline.h 2011-10-18 09:38:47 -04:00
Ulrich Drepper 581d30e386 Add optimized nearbyint{,f} for x86-64 2011-10-18 09:13:23 -04:00
Ulrich Drepper d38f1dba00 Start optimizing the use of the fenv interfaces in libm itself 2011-10-18 09:00:46 -04:00
Andreas Schwab 83c7615c2d Fix last change 2011-10-18 14:11:29 +02:00
Andreas Schwab caa6c9d845 Fix linkage conflict with feraiseexcept 2011-10-18 11:46:51 +02:00
Ulrich Drepper 228a984d54 Relax asm requirements for recently added x86-64 math interfaces 2011-10-17 20:30:52 -04:00
Ulrich Drepper c8553a6a6f Make x86-64 math_private.h more robust 2011-10-17 16:00:39 -04:00
Ulrich Drepper ed22dcf691 Provide internal optimizations on x86-64 with SSE4.1
Provide macros so that the internal users can, if possible, directly use
the new instructions.

Also fix up the mathinline.h header when compiling with SSE4.1 enabled.
2011-10-17 11:23:40 -04:00
Ulrich Drepper b171c13768 Fix last x86-64 mathinline change
Use correct function names.
2011-10-17 10:37:00 -04:00
Ulrich Drepper ad0f5cad15 Use rounds{s,d} for x86 rint, ceil, floor 2011-10-16 20:58:17 -04:00
Ulrich Drepper 2d1f3a4db6 Fix WS 2011-10-15 11:11:12 -04:00
Liubov Dmitrieva be13f7bff6 Optimized memcmp and wmemcmp for x86-64 and x86-32 2011-10-15 11:10:08 -04:00
Andreas Schwab 6b1f68c91f Fix lost feraiseexcept symbol 2011-10-14 11:21:23 +02:00
Andreas Schwab 714fad23c6 Fix PLT use in feupdateenv on x86_64 2011-10-13 15:26:45 +02:00
Andreas Schwab 81dcc7fb74 Check for zero size in memrchr for x86_64 2011-10-13 13:34:41 +02:00
Ulrich Drepper 0ac5ae2335 Optimize libm
libm is now somewhat integrated with gcc's -ffinite-math-only option
and lots of the wrapper functions have been optimized.
2011-10-12 11:27:51 -04:00
Ulrich Drepper 7edb55ce06 Optimize use of isnan, isinf, finite 2011-10-08 10:18:26 -04:00
Ulrich Drepper 66fb11b1da Fix whitespace 2011-10-07 11:50:21 -04:00
Liubov Dmitrieva 093ecf9299 Improve 64 bit memchr, memrchr, rawmemchr with SSE2 2011-10-07 11:49:10 -04:00
Andreas Schwab 3a62d00d40 Don't call ifunc functions in trace mode 2011-10-05 14:35:40 +02:00
Andreas Schwab bf972c9dfc Fix parse error in bits/mathinline.h with --std=c99 2011-09-26 14:01:30 +02:00
Ulrich Drepper 4c1a1f71c0 Add fmax and fmin inlines for x86-64 2011-09-15 13:11:08 -04:00
Ulrich Drepper ee4d03150a Use correct section to allow merging 2011-09-14 13:43:24 -04:00
Ulrich Drepper cd20565401 Optimized lrint and llrint for x86-64 2011-09-14 12:58:43 -04:00
Andreas Schwab e529793b50 Avoid macro clash between <sys/select.h> and <linux/posix_types.h> 2011-09-13 15:16:38 +02:00
Ulrich Drepper 83cd142045 Remove --with-tls option, TLS support is required 2011-09-11 15:02:01 -04:00
Ulrich Drepper d063d16433 Remove support for !USE___THREAD 2011-09-10 16:50:28 -04:00
Petr Baudis 1248c1c415 Fix jn precision 2011-09-09 22:16:10 -04:00
H.J. Lu 08a300c956 Simplify AVX check 2011-09-07 21:38:23 -04:00
Ulrich Drepper ceaa0c5dc3 Move Atom-optimized code out of the way and together 2011-09-06 21:53:03 -04:00
Ulrich Drepper 8e1294e83f Remove now-wrong comment 2011-09-06 17:20:33 -04:00
Ulrich Drepper 6d18b67f4d Fix whitespaces 2011-09-05 21:42:12 -04:00
Liubov Dmitrieva a5f524e479 Add Atom-optimized strchr and strrchr for x86-64 2011-09-05 21:34:03 -04:00
Ulrich Drepper 49d42c37ba Add optimized x86-64 wcscmp 2011-09-05 14:08:23 -04:00
Ulrich Drepper 0276a718c0 Fix minor CFI problem in regular x86-64 trampoline 2011-08-20 08:58:44 -04:00
Ulrich Drepper c88f17668b Fix CFI info in x86-64 trampolines for non-AVX code 2011-08-20 08:56:30 -04:00
Ulrich Drepper 8e999d2962 Minor optimization of popcount in l10nflist 2011-08-11 14:07:04 -04:00
Andreas Schwab 8c1a459f9a Fix inline strncat/strncmp on x86 2011-08-04 14:59:25 -04:00
Ulrich Drepper bba33c289b One more typo in AVX test 2011-07-23 15:18:13 -04:00
Ulrich Drepper 2ee5518515 Merge branch 'master' of ssh://sourceware.org/git/glibc
Conflicts:
	ChangeLog
2011-07-23 00:04:15 -04:00
Ulrich Drepper 1aae088a8a One more change to XSAVE patch 2011-07-22 23:33:22 -04:00
Andreas Schwab 1d002f2539 Fix AVX check 2011-07-22 14:33:47 -04:00
Ulrich Drepper 21137f89c5 Fix overflow bug in optimized strncat for x86-64 2011-07-21 12:32:36 -04:00
Ulrich Drepper 5644ef5461 Fix check for AVX enablement
The AVX bit is set if the CPU supports AVX.  But this doesn't mean the
kernel does.  Add checks according to Intel's documentation.
2011-07-20 21:21:03 -04:00
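Note on 5644ef5461: the check described above can be sketched as follows. This is a minimal illustration in Python, not the glibc code. The bit positions are the ones from the x86 CPUID and XCR0 definitions; the function name and the `read_xcr0` callback are hypothetical stand-ins for executing CPUID and XGETBV.

```python
# Bit positions from the x86 CPUID leaf 1 and XCR0 definitions.
CPUID1_ECX_OSXSAVE = 1 << 27  # OS has enabled XSAVE, so XGETBV is usable
CPUID1_ECX_AVX = 1 << 28      # CPU implements AVX
XCR0_SSE_STATE = 1 << 1       # OS saves and restores XMM state
XCR0_AVX_STATE = 1 << 2       # OS saves and restores upper YMM state


def avx_usable(cpuid1_ecx, read_xcr0):
    """Return True only if both the CPU and the OS support AVX.

    cpuid1_ecx: ECX as returned by CPUID leaf 1 (hypothetical input).
    read_xcr0: hypothetical callable returning XCR0. It is only called
    when OSXSAVE is set, because XGETBV is not available otherwise.
    """
    needed = CPUID1_ECX_OSXSAVE | CPUID1_ECX_AVX
    if (cpuid1_ecx & needed) != needed:
        return False
    state = XCR0_SSE_STATE | XCR0_AVX_STATE
    return (read_xcr0() & state) == state
```

A CPU that reports the AVX bit under a kernel that has not enabled YMM state saving yields False here, which is the case the commit message describes.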
Ulrich Drepper 6986b98a18 Force La_x86_64_ymm to be 16-byte aligned 2011-07-20 14:20:00 -04:00
Ulrich Drepper 8002999481 Fix whitespaces 2011-07-19 17:27:09 -04:00
Liubov Dmitrieva 99710781cc Improve 64 bit strcat functions with SSE2/SSSE3 2011-07-19 17:11:54 -04:00
Ulrich Drepper ecaddd6699 Rebuild configure scripts 2011-07-06 21:29:02 -04:00
H.J. Lu 8912479f9e Improved st{r,p}{,n}cpy for SSE2 and SSSE3 on x86-64 2011-06-24 15:14:22 -04:00
H.J. Lu 0b1cbaaef5 Optimized st{r,p}{,n}cpy for SSE2/SSSE3 on x86-32 2011-06-24 14:15:32 -04:00
David S. Miller 42675c6ff0 Add an elf_ifunc_invoke interface so that architectures can implement
the ifunc resolver calls however they wish.
2011-06-20 19:56:40 -07:00
H.J. Lu 3d29045b5e Assume Intel Core i3/i5/i7 processor if AVX is available 2011-06-03 07:01:25 -04:00
H.J. Lu 8db736347c Fix typo in x86-64 powl 2011-05-18 19:50:48 -04:00
Mike Frysinger 4c559bcdf3 Fix static linking with checking x86/x86-64 memcpy. 2011-04-17 22:20:47 -04:00
Ulrich Drepper e6c6149412 Fix memory leak in TLS of loaded objects. 2011-04-10 22:43:01 -04:00
Ulrich Drepper dedc7c7b05 Fix typo in cache information table for x86-{32,64}. 2011-04-03 09:32:31 -04:00
H.J. Lu 0354e35501 Work around old buggy program which cannot cope with memcpy semantics. 2011-04-01 19:38:21 -04:00
Ulrich Drepper bb2420590c Last change caused infinite loops because of missing loop increment. 2011-03-22 01:52:43 -04:00
H.J. Lu c97a1282a4 Handle page boundaries in x86 SSE4.2 strncmp. 2011-03-21 05:35:38 -04:00
Ulrich Drepper 2a11560107 Implement x86 cpuid handling of leaf4 for cache information. 2011-03-20 08:14:30 -04:00
Harsha Jagasia 7e4ba49cd3 Enable SSE2 memset for AMD's upcoming Orochi processor.
This patch enables SSE2 memset for AMD's upcoming Orochi processor.
This patch also fixes the following bug:
For misaligned blocks larger than 144 bytes, memset branches into
the integer code path depending on the value of misalignment even if
the startup code chooses the SSE2 code path upfront, when multiarch
is enabled.
2011-03-04 23:30:08 -05:00
Ulrich Drepper baa6c69a57 Work around empty line at end file generated by autoconf. 2011-02-17 01:26:07 -05:00
Ulrich Drepper e943389325 Remove use of ranlib. 2011-02-15 14:52:29 -05:00
Roland McGrath a0bf67cca2 Fix some warning nits. 2011-02-04 10:53:51 -08:00
Ulrich Drepper f257bbd77d Clean up some bits/select.h headers. 2011-01-09 16:49:17 -05:00
Ryan S. Arnold 30950a5fd2 Make PowerPC64 default to nonexecutable stack 2010-12-19 22:49:01 -05:00
H.J. Lu 13b695749a Support Intel processor model 6 and model 0x2. 2010-11-12 03:48:52 -05:00
H.J. Lu 8ca52c6e3b Fix one exit path in x86-64 SSE4.2 str{,n}casecmp. 2010-11-10 03:05:37 -05:00
Ulrich Drepper 69da074d7a Fix warnings in __bswap_16. 2010-11-10 02:38:35 -05:00
H.J. Lu ff02d5280b Use IFUNC on x86-64 memset 2010-11-08 03:41:34 -05:00
Ulrich Drepper c0dde15b5d 32bit memset-sse2.S fails with uneven cache size
32bit memset-sse2.S assumes cache size is multiple of 128 bytes.  If
it isn't true, memset-sse2.S will fail.  For example, a processor can
have 24576 KB L3 cache and 20 cores. That is 2516582 byte per core. Half
of it is 1258291, which isn't helpful for vector instructions.  This
patch rounds cache sizes to multiple of 256 bytes and adds "raw" cache
sizes.
2010-11-05 07:57:46 -04:00
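Note on c0dde15b5d: the arithmetic above can be checked with a short sketch. It assumes the rounding is a round-down to the nearest multiple of 256 (the message only says "rounds"); the function name is hypothetical and this is not the glibc code.

```python
def round_cache_size(raw_bytes, multiple=256):
    """Round raw_bytes down to a multiple of `multiple` (a power of two).

    Assumption: round-down; the commit message does not state the direction.
    """
    return raw_bytes & ~(multiple - 1)


# Figures quoted in the commit message.
per_core = 2516582
half = per_core // 2            # 1258291, not a multiple of 128
rounded = round_cache_size(half)
```

With these inputs `half % 128` is nonzero, while `rounded` is a multiple of both 128 and 256, so a loop that consumes the size in 128-byte steps terminates exactly.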
Richard Li dbf3a06904 Fix x86-64 strchr propagation of search byte into all bytes of SSE register 2010-10-25 14:13:17 -04:00
Ulrich Drepper 18edac4857 Provide FP_FAST_FMA{,F,L} definitions for x86/x86-64. 2010-10-19 12:56:42 -04:00
Jakub Jelinek 5e908464b9 Implement accurate fma. 2010-10-13 22:27:03 -04:00
Jakub Jelinek 9ff8d36f27 Correct implementation of fmaf. 2010-10-11 09:27:05 -04:00
Ulrich Drepper 45db99c7d0 Fix handling of tail bytes of buffer in SSE2/SSSE3 x86-64 version strn{,case}cmp 2010-10-03 22:10:30 -04:00
Ulrich Drepper 015a4c6193 Re-enable all strncasecmp versions. 2010-09-20 20:18:00 -07:00
Ulrich Drepper 8ffcee4a04 Fix limit detection in x86-64 SSE2 strncasecmp. 2010-09-20 14:02:23 -07:00
Ulrich Drepper 0959ffc97b Update x86-64 mpn routines from GMP 5.0.1. 2010-09-02 23:36:25 -07:00
Ulrich Drepper 01d2601561 Fix typo in last commit. 2010-08-26 22:35:42 -07:00
Ulrich Drepper 9ea3de11f1 Move slow Atom code to separate section. 2010-08-26 22:17:03 -07:00
Ulrich Drepper 107b2fa56c Shorten x86-64 strlen a bit. 2010-08-26 22:12:16 -07:00
H.J. Lu 623aac7f84 Unroll x86-64 strlen 2010-08-26 22:09:34 -07:00
H.J. Lu b416a90085 Missing comma in last commit. 2010-08-26 13:18:46 -07:00
Roland McGrath 8b2b771538 Clean up warnings in new x86_64/multiarch code. 2010-08-25 12:13:08 -07:00
H.J. Lu e73015f2d6 Unroll 32bit SSE strlen and handle slow bsf 2010-08-25 10:07:37 -07:00
Ulrich Drepper 1cdfe7242f Add missing copyright year updated and pretty printing. 2010-08-24 11:42:19 -07:00
Richard Henderson 73f27d5e72 Clean up SSE variable shifts 2010-08-24 11:35:01 -07:00
Ulrich Drepper 9da4bb316f Fix two typos in x86-64 SSE4.2 strncasecmp implementation. 2010-08-19 09:20:44 -07:00
Ulrich Drepper 1feccb6caf Fix fourth parameter of SSE4.2 strcmp for x86-64. 2010-08-15 20:46:09 -07:00
Ulrich Drepper 28c90b2cf5 Use correct register for fourth parameter of x86-64 strncasecmp_l. 2010-08-15 17:42:12 -07:00
Ulrich Drepper 25244f174f Undo incorrect change. 2010-08-15 10:34:33 -07:00
Ulrich Drepper e9f82e0d1d Add optimized strncasecmp versions for x86-64. 2010-08-14 22:04:01 -07:00
Ulrich Drepper ca6bb004eb Fix x86-64 build without multiarch. 2010-08-14 14:56:32 -07:00
Andi Kleen d22e4cc939 x86: Add support for frame pointer less mcount 2010-08-07 21:24:05 -07:00
Ulrich Drepper 73507d3ae0 Add support for SSSE3 and SSE4.2 versions of strcasecmp on x86-64. 2010-07-31 21:41:09 -07:00
Ulrich Drepper 66f6765a47 Pretty printing x86-64 SSE4.2 strcmp. 2010-07-30 12:54:37 -07:00
Ulrich Drepper 42e08a5438 Implement optimized strcasecmp for x86-64. 2010-07-30 00:14:04 -07:00
Ulrich Drepper fe36dd025e Fix tolower operation in strcasestr. 2010-07-30 00:09:07 -07:00
Ulrich Drepper 880113d91e Avoid compiling unneeded file in ld.so. 2010-07-27 21:12:59 -07:00
Ulrich Drepper 24fb0f88ed Add optimized x86-64 implementation of strnlen.
While at it, beef up the test suite for strnlen and add performance
tests for it, too.
2010-07-26 08:37:08 -07:00
Ulrich Drepper 8e96b93aa7 Speed up x86-64 strcasestr a bit more.
Using the new SSE4.2 instructions is cool but not really the fastest.
Some older SSE instructions can do the trick faster.
2010-07-24 08:34:44 -07:00
Andreas Schwab f6a31e0eb6 Add strcasestr-nonascii to i386 build 2010-07-21 07:26:18 -07:00
Ulrich Drepper d02dc4ba08 Fix non-ASCII case of SSE4.2 strcasestr. 2010-07-16 16:00:22 -07:00
Ulrich Drepper cc9f2e47a0 Speed up SSE4.2 strcasestr by avoiding indirect function call. 2010-07-16 15:37:38 -07:00
H.J. Lu 6fb8cbcb58 Improve 64bit memcpy/memmove for Atom, Core 2 and Core i7
This patch includes optimized 64bit memcpy/memmove for Atom, Core 2 and
Core i7.  It improves memcpy by up to 3X on Atom, up to 4X on Core 2 and
up to 1X on Core i7.  It also improves memmove by up to 3X on Atom, up to
4X on Core 2 and up to 2X on Core i7.
2010-06-30 08:26:11 -07:00
H.J. Lu 3c88fe1e3a Incorrect x86 CPU family and model check. 2010-05-27 11:14:18 -07:00
Ulrich Drepper 94a27fabeb Whitespace fix. 2010-04-14 22:29:51 -07:00
H.J. Lu a11ec63713 Add x86-32 FMA support 2010-04-14 22:27:59 -07:00
H.J. Lu df87f54923 Check DATA_CACHE_SIZE_HALF 2010-04-14 22:18:27 -07:00
H.J. Lu dd37cd1a12 Optimize x86-64 SSE4 memcmp for unaligned data. 2010-04-14 17:53:44 -07:00
H.J. Lu 404a6e3201 x86-64 SSE4 optimized memcmp
This is 64bit SSE4 optimized memcmp. It improves memcmp by up to 3X
on Intel Core i7.
2010-04-14 00:12:53 -07:00
Ulrich Drepper bbbdd77809 Update x86-64 cpu multiarch selection header. 2010-04-13 19:17:10 -07:00
Ulrich Drepper 22f4f44b67 Fix concurrent handling of __cpu_features. 2010-04-04 00:25:46 -07:00
H.J. Lu 7d9335ecd7 Don't define __strpbrk_sse42 in static library 2010-03-24 12:16:24 -07:00
Richard Guenther e39acb1f16 Fix R_X86_64_PC32 overflow detection 2010-03-04 19:33:41 -08:00
Ulrich Drepper 4a1297d761 We can use the 64-bit register versions of the double functions. 2010-02-24 20:00:30 -08:00
Andreas Schwab 7eb22e757e Avoid PLT call to fegetenv on s390 2010-02-09 22:34:17 -08:00
Ulrich Drepper f69190e74a Prevent silent errors should x86-64 strncmp be needed outside libc. 2010-01-14 08:09:32 -08:00
H.J. Lu 5a7af22fbb Unroll the loop x86-64 SSE4.2 strlen. 2010-01-13 07:51:48 -08:00
H.J. Lu 3af48cbdfa Optimize 32bit memset/memcpy with SSE2/SSSE3. 2010-01-12 11:22:03 -08:00
H.J. Lu 2510d01ddb Define bit_SSE2 and index_SSE2. 2009-12-13 15:23:02 -08:00
H.J. Lu 51ddd2c01e Define bit_XXX and index_XXX.
This patch defines bit_XXX and index_XXX and use them to check processor
feature in assembly code.  It can prevent typos in processor feature
check.
2009-12-13 09:47:02 -08:00
Ulrich Drepper 823bc6da65 Fix whitespaces. 2009-10-22 22:50:00 -07:00
H.J. Lu 001659f4d5 Implement SSE4.2 optimized strchr and strrchr. 2009-10-22 22:47:12 -07:00
Roland McGrath b0f3a2e43f Clean up unnecessary libc_hidden_builtin_def fiddling in x86 multiarch definitions. 2009-10-06 20:01:23 -07:00
Roland McGrath 9d6982d5d2 Clean up x86 multiarch HAS_FOO macros. 2009-10-06 19:59:03 -07:00
Roland McGrath 7967983fd4 configure tweaks, support $libc_add_on_config_subdirs 2009-09-15 14:14:42 -07:00
Jakub Jelinek 22bb992d51 Fix strstr/strcasestr/fma/fmaf on x86_64. 2009-09-02 19:43:04 -07:00
Jakub Jelinek 240441038f Fix x86_64 bits/mathinline.h for -m32 compilation. 2009-09-01 15:30:12 -07:00
Andreas Schwab c2735e958a Fix parse error in bits/mathinline.h with --std=c99 2009-08-31 17:26:14 +02:00
H.J. Lu 5a4eb7282e Remove ENABLE_SSSE3_ON_ATOM.
It turns out that SSSE3 isn't slow on Atom. The problem is bsf. This patch
removes ENABLE_SSSE3_ON_ATOM.
2009-08-28 14:54:46 -07:00
Ulrich Drepper 65b14bcee2 Optimize out duplicated scalbln code for x86-64. 2009-08-25 16:46:34 -07:00
Ulrich Drepper 7423a3456a Optimized signbit{,f} for x86-64. 2009-08-25 14:54:12 -07:00
Ulrich Drepper 84088310ce Handle AVX saving on x86-64 in interrupted symbol lookups.
If a signal arrived during a symbol lookup and the signal handler also
required a symbol lookup, the end of the lookup in the signal handler reset
the flag whether restoring AVX/SSE registers is needed.  Resetting means
in this case that the tail part of the outer lookup code will try to
restore the registers and this can fail miserably.  We now restore to the
previous value which makes nesting calls possible.
2009-08-25 10:42:30 -07:00
Ulrich Drepper cf00cc00bc Add ceil implementation for 64-bit machines.
On 64-bit machines we should not split doubles into two 32 bit
integer and handle the words separately.  We have wide registers.
This patch implements a 64-bit ceil version.  Ideally all other
functions will be converted over time.
2009-08-24 18:05:48 -07:00
Ulrich Drepper 9a1ea1525e Optimize float construction/extraction on x86-64. 2009-08-24 14:52:49 -07:00
Ulrich Drepper ef72d5f1b9 Optimize x86-64 signbit{,f} a bit. 2009-08-24 10:20:58 -07:00
H.J. Lu 4e1e2f4247 Support mixed SSE/AVX audit and check AVX only once.
This patch fixes mixed SSE/AVX audit and checks AVX only once in
_dl_runtime_profile. When an AVX or SSE register value in pltenter is
modified, we have to make sure that the SSE part value is the same in both
lr_xmm and lr_vector fields so that pltexit will get the correct value
from either lr_xmm or lr_vector fields. AVX-enabled pltenter should
update both lr_xmm and lr_vector fields to support stacked AVX/SSE
pltenter functions.
2009-08-08 10:54:42 -07:00
Ulrich Drepper 8e436522e1 Move SSE4.2 functions together. 2009-08-08 09:38:32 -07:00
Ulrich Drepper 0fda545d5f Add SSSE3-optimized implementation of str{,n}cmp for x86-64. 2009-08-07 22:51:02 -07:00
Ulrich Drepper 57b378ac89 Avoid warning through fake initialization. 2009-08-07 16:19:54 -07:00
Ulrich Drepper 3aa2588d4a Fix whitespaces in last checkin. 2009-08-07 09:47:12 -07:00
H.J. Lu a546baa9cd Properly count number of logical processors on Intel CPUs.
The meaning of the 25-14 bits in EAX returned from cpuid with EAX = 4
has been changed from "the maximum number of threads sharing the cache"
to "the maximum number of addressable IDs for logical processors sharing
the cache" if cpuid takes EAX = 11.  We need to use results from both
EAX = 4 and EAX = 11 to get the number of threads sharing the cache.

The 25-14 bits in EAX on Core i7 is 15 although the number of logical
processors is 8.  Here is a white paper on this:

http://software.intel.com/en-us/articles/intel-64-architecture-processor-topology-enumeration/

This patch correctly counts number of logical processors on Intel CPUs
with EAX = 11 support on cpuid.  Tested on Dunnington, Core i7 and
Nehalem EX/EP.

It also fixed Pentium Ds workaround since EBX may not have the right
value returned from cpuid with EAX = 1.
2009-08-07 09:39:36 -07:00
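Note on a546baa9cd: the bit field discussed above can be extracted as in the sketch below. It covers only the leaf 4 field, not the combination with leaf 11 that the patch implements; the function names are hypothetical. The "+1" follows the CPUID definition, where the field stores the count minus one.

```python
def cache_sharing_field(eax):
    """Extract bits 25-14 of EAX as returned by CPUID with EAX = 4."""
    return (eax >> 14) & 0xFFF


def max_addressable_ids(eax):
    """The field is encoded as (count - 1), so add one."""
    return cache_sharing_field(eax) + 1


# Hypothetical EAX value whose bits 25-14 hold 15, as reported for Core i7.
# The low bits are arbitrary filler to show that the mask isolates the field.
example_eax = (15 << 14) | 0x121
```

For `example_eax` the field yields 16 addressable IDs, although the commit message notes that the processor has only 8 logical processors; this gap is why the leaf 11 data is needed as well.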
H.J. Lu 02cea47161 Add x86 32-bit SSE4.2 string functions.
This patch adds 32bit SSE4.2 string functions.  It uses -16L instead of
0xfffffffffffffff0L, which works for both 32bit and 64bit long.  Tested
on 32bit Core i7 and Core 2.
2009-08-04 12:13:43 -07:00
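Note on 02cea47161: the reason -16 serves as an alignment mask at either width can be sketched with two's-complement arithmetic. Python integers are unbounded, so the helper below (a hypothetical name) models a fixed-width register by masking; it is an illustration, not the assembly from the patch.

```python
def as_unsigned(value, bits):
    """Interpret `value` as an unsigned two's-complement number of `bits` bits."""
    return value & ((1 << bits) - 1)


mask32 = as_unsigned(-16, 32)   # all ones except the low four bits
mask64 = as_unsigned(-16, 64)

# Aligning an arbitrary example address down to a 16-byte boundary.
aligned = 0x1237 & mask32
```

The same source constant therefore produces the right mask for a 32-bit and a 64-bit long, whereas the 64-bit hexadecimal literal does not fit a 32-bit long.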
H.J. Lu 6f6f1215f6 Support multiarch for i686.
This patch adds multiarch support when configured for i686.  I modified
some x86-64 functions to support 32bit. I will contribute 32bit SSE string
and memory functions later.
2009-07-31 11:53:35 -07:00
Ulrich Drepper 98b1e6c866 ____longjmp_chk is now OS-specific.
We use sigaltstack internally which on some systems is a syscall
and should be used as such.  Move the x86-64 version to the Linux
specific directory and create in its place a file which always
causes compile errors.
2009-07-30 21:42:27 -07:00
Ulrich Drepper 8e80581787 Change code a bit to correct CFI. 2009-07-30 21:29:27 -07:00
Ulrich Drepper 07df809969 Optimize ____longjmp_chk for x86-64 a bit. 2009-07-30 20:09:30 -07:00
Ulrich Drepper 5ead9ce5c7 Fix x86-64 ____longjmp_chk to handle signal stacks.
The simple test previously used might trigger if the longjmp jumps
from the signal stack to the normal stack.  We now explicitly test
for this case.
2009-07-30 17:31:48 -07:00
Ulrich Drepper 78c4ef475d Add support for x86-64 fma instruction.
Use it to implement fma and fmaf, if possible.
2009-07-29 15:26:06 -07:00
Ulrich Drepper 9a1d2d4555 Prepare use of IFUNC functions outside libc.so.
We use a callback function into libc.so to get access to the data
structure with the information and have special versions of the test
macros which automatically use this function.
2009-07-29 15:22:28 -07:00
Ulrich Drepper 649bf13320 Improve CFI in x86-64 ld.so trampoline code. 2009-07-29 08:50:03 -07:00
H.J. Lu 09e0389eb1 Properly restore AVX registers on x86-64.
tst-audit4 and tst-audit5 fail under AVX emulator due to je instead of
jne. This patch fixes them.
2009-07-29 08:40:54 -07:00
Ulrich Drepper b48a267b8f Preserve SSE registers in runtime relocations on x86-64.
SSE registers are used for passing parameters and must be preserved
in runtime relocations.  This is inside ld.so enforced through the
tests in tst-xmmymm.sh.  But the malloc routines used after startup
come from libc.so and can be arbitrarily complex.  It's overkill
to save the SSE registers all the time because of that.  These calls
are rare.  Instead we save them on demand.  The new infrastructure
put in place in this patch makes this possible and efficient.
2009-07-29 08:33:03 -07:00
Ulrich Drepper e83c1a8a72 Refine testing for xmm/ymm register use in x86-64 ld.so.
The test now takes the callgraph into account.  Only code called
during runtime relocation is affected by the limitation.  We now
determine the affected object files as closely as possible from
the outside.  This allowed removing some of the specializations
for some of the string functions as they are only used in other
code paths.
2009-07-27 13:40:27 -07:00
Ulrich Drepper 009a69f0bc No need for special strcmp for rtld. 2009-07-27 06:55:04 -07:00
Ulrich Drepper 16d2ea4c82 Make sure no code in ld.so uses xmm/ymm registers on x86-64.
This patch introduces a test to make sure no function modifies the
xmm/ymm registers.  With the exception of the auditing functions.

The test is probably too pessimistic.  All code linked into ld.so
is checked.  Perhaps at some point the callgraph starting from
_dl_fixup and _dl_profile_fixup is checked and we can start using
faster SSE-using functions in parts of ld.so.
2009-07-26 16:10:00 -07:00
H.J. Lu 7956a3d27c Add SSE2 support to str{,n}cmp for x86-64. 2009-07-26 13:32:28 -07:00
H.J. Lu 4e5b5821bf Some optimizations for x86-64 strcmp. 2009-07-25 19:15:14 -07:00
Ulrich Drepper 29e92fa5cd Optimize x86-64 SSE4.2 strcmp.
The file contained some code which was never used.  Don't compile it
in.
2009-07-25 12:02:47 -07:00
Ulrich Drepper b2509a1e38 Avoid cpuid instructions in cache info discovery.
When multiarch is enabled we have this information stored.  Use it.
2009-07-23 14:03:53 -07:00
Ulrich Drepper 3e9099b4f6 Add more cache descriptors for L3 caches on x86 and x86-64.
The most recent AP 485 describes a few more cache descriptors for
L3 caches with 24-way associativity.
2009-07-23 13:42:46 -07:00
Ulrich Drepper d28797e426 Perform test for Atom x86-64 in central place and handle it.
There will be more than one function which, in multiarch mode, wants
to use SSSE3.  We should not test in each of them for Atoms with
slow SSSE3.  Instead, disable the SSSE3 bit in the startup code for
such machines.
2009-07-23 13:15:17 -07:00
Ulrich Drepper ae612b04cc Minor cleanups in x86-64 strstr. 2009-07-21 07:52:12 -07:00
Ulrich Drepper a8f895ebe1 Better check for optimization in new x86-64 strstr/strcasestr. 2009-07-20 21:18:28 -07:00
H.J. Lu 2b7a8664fa SSE4.2 strstr/strcasestr for x86-64.
This patch implements SSE4.2 strstr/strcasestr, using Knuth-Morris-Pratt
string searching algorithm.
2009-07-20 21:06:50 -07:00
Ulrich Drepper c8027cced1 Optimize restoring of ymm registers on x86-64.
The patch mainly reduces the code size but also avoids some jumps.
2009-07-16 07:15:15 -07:00
Ulrich Drepper 24a12a5a5f Fix up whitespaces in new memcmp for x86-64. 2009-07-16 07:02:27 -07:00
H.J. Lu e26c9b8415 memcmp implementation for x86-64 using SSE2. 2009-07-16 07:00:34 -07:00
Ulrich Drepper ca419225a3 Fix thinko in AVX audit patch.
Don't use AVX instructions too often.
2009-07-15 17:59:14 -07:00
Ulrich Drepper 47fc9b710b Fix typo in last change. 2009-07-15 17:51:11 -07:00
Ulrich Drepper d7bd7a8ae8 Secure AVX changes for auditing code.
The original AVX patch used a function pointer to handle the difference
between machines with and without AVX support.  This is insecure.  A
well-placed memory exploit could lead to redirection of the execution.
Using a variable and several tests is a bit slower but cannot be
exploited in this way.
2009-07-15 17:41:36 -07:00
H.J. Lu b0ecde3a63 Add AVX support to ld.so auditing for x86-64. 2009-07-10 12:04:14 -07:00
Ulrich Drepper cea4329592 Minor cleanups in recently added files. 2009-07-03 03:23:01 -07:00
Ulrich Drepper d6485c981b Align functions to 16-byte boundary.
Some of the new multi-arch string functions for x86-64 were
not aligned to 16 byte boundaries, possibly creating unnecessary
cache line misses and delays.
2009-07-03 03:01:57 -07:00
H.J. Lu 06e51c8f3d Add SSE4.2 support for strcspn, strpbrk, and strspn on x86-64. 2009-07-03 02:48:56 -07:00
H.J. Lu 167d5ed5de Fix handling of xmm6 in ld.so audit hooks on x86-64. 2009-07-02 04:33:12 -07:00
Ulrich Drepper af263b8154 Whitespace fixes in last patch. 2009-07-02 03:43:05 -07:00
H.J. Lu ab6a873fe0 SSSE3 strcpy/stpcpy for x86-64
This patch adds SSSE3 strcpy/stpcpy. I got up to 4X speed up on Core 2
and Core i7.  I disabled it on Atom since SSSE3 version is slower for
shorter (<64byte) data.
2009-07-02 03:39:03 -07:00
Ulrich Drepper e6bd12ddf7 Regenerated. 2009-06-30 05:33:52 -07:00
Ulrich Drepper b38a2e2e64 Fix little checkin problem in last patch. 2009-06-30 04:41:38 -07:00
H.J. Lu 0181291385 Determine and store processor family and model on x86-64. 2009-06-30 04:39:09 -07:00
Ulrich Drepper 059215ae21 Clean up whitespaces in last patch. 2009-06-22 20:39:37 -07:00
H.J. Lu 772f4e6a1b Add SSE4.2 support for strcmp and strncmp on x86-64. 2009-06-22 20:38:41 -07:00
Jakub Jelinek fab8238de6 Fix x86-64 memchr for large lengths. 2009-06-16 10:23:31 -07:00
Ulrich Drepper eb0b6cb6e1 Fix warnings when using <sys/select.h>.
gcc 4.4 is more picky.  And the x86-64 version of <bits/select.h>
contained a now unnecessary asm optimization.  Remove it.
2009-06-14 16:09:42 -07:00
Ulrich Drepper b77c932329 Add SSE4.2 optimized rawmemchr implementation for x86-64. 2009-06-05 16:54:50 -07:00
Ulrich Drepper 6f9eea15bf Forgot some more cleanups for the SSE4.2 strlen on x86-64. 2009-06-05 11:51:59 -07:00
Ulrich Drepper f85a9e72e2 Add missing cleanups from SSE4.2 x86-64 strlen. 2009-06-05 11:39:45 -07:00
Ulrich Drepper 3ab2d57a4d Optimize x86-64 strlen for SSE4.2.
The SSE4.2 implementation is used in the DSO only.  The patch also adds
some infrastructure to be used in similar code later on.
2009-06-05 11:32:00 -07:00
Ulrich Drepper 2f3f7b9da2 More small optimizations for x86-64 strlen. 2009-06-04 16:45:35 -07:00
Ulrich Drepper 747785f2b3 Tiny strlen for x86-64 optimization.
I didn't remove an instruction from a previous version in the final
version.
2009-06-04 10:54:29 -07:00
Ulrich Drepper fd96f06208 Small optimization of STT_GNU_IFUNC handling.
The test to call the indirect function now includes a subtest to
check whether the symbol is defined.  When coming to that point
this is almost always the case.  The test for STT_GNU_IFUNC on the
other hand rarely is true.  Moving it to the front means we don't have
to perform the second test unless really necessary.
2009-06-01 11:49:05 -07:00
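Note on fd96f06208: the reordering described above relies on short-circuit evaluation. The sketch below is a Python illustration with hypothetical names; only the value of STT_GNU_IFUNC (10) comes from the ELF definitions.

```python
STT_FUNC = 2        # ordinary function symbol
STT_GNU_IFUNC = 10  # indirect function symbol


def needs_ifunc_call(sym_type, is_defined):
    """Decide whether the resolver must be called for a symbol.

    The rarely-true type test comes first, so the second test
    (`is_defined`, a hypothetical callable) runs only for IFUNC symbols.
    """
    return sym_type == STT_GNU_IFUNC and is_defined()


# Count how often the second test actually runs.
calls = []


def defined_probe():
    calls.append(1)
    return True


ordinary = needs_ifunc_call(STT_FUNC, defined_probe)
indirect = needs_ifunc_call(STT_GNU_IFUNC, defined_probe)
```

After both calls the probe has run once: the ordinary symbol was rejected by the first test alone.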
Ulrich Drepper b7629ee33f Better error message for invalid relocation in static binary. 2009-06-01 11:39:24 -07:00
Ulrich Drepper 8ea2372936 Fix up sched_cpucount in x86-64.
Now that static executables can handle IFUNC functions don't exclude
optimization for sched_cpucount for !SHARED.
2009-05-31 23:46:42 -07:00
Ulrich Drepper 7441470835 Finish IFUNC support for x86 and x86-64.
Add support for the IRELATIVE relocation and IFUNC in static executables.
2009-05-31 23:45:33 -07:00
Ulrich Drepper 963cb6fcb4 Simplify CPUID value handling.
So far Intel and AMD use exactly the same bits meaning the same
things in CPUID index 1.  Simplify the code.  Should an architecture
come along which doesn't use the same semantics then it must use a
different index value than COMMON_CPUID_INDEX_1.
2009-05-31 17:52:05 -07:00
Ulrich Drepper 1de0c16183 Compact cache info data structure for x86/x86-64.
This saves about 1.5kB in the DSO.
2009-05-29 11:53:36 -07:00
H.J. Lu e7535de78f Add missing .text directives.
The ____longjmp_chk functions on x86 and x86-64 were placed in .rodata.str1.1.
2009-05-21 18:38:11 -07:00
Ulrich Drepper b50f8e42ba Check for valid stack frame in longjmp.
If longjmp restores the stack frame to an address which is beyond
the stack frame at the time of the longjmp call it would install
an uninitialized stack frame.  If compiled with _FORTIFY_SOURCE
defined, longjmp will now bail out in this situation.
2009-05-15 19:37:13 -07:00
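Note on b50f8e42ba: the condition described above can be sketched as a comparison of stack pointers. The sketch assumes a downward-growing stack, as on x86 and x86-64, and ignores alternate signal stacks; the function name and the example addresses are hypothetical.

```python
def longjmp_target_ok(current_sp, target_sp):
    """Return True if jumping to target_sp cannot expose an unused frame.

    Assumption: the stack grows towards lower addresses, so every live
    frame sits at or above the current stack pointer.
    """
    return target_sp >= current_sp


# Hypothetical stack pointers: a caller frame above, a stale frame below.
current = 0x7FFF_0000_1000
caller_frame = 0x7FFF_0000_1400
stale_frame = 0x7FFF_0000_0C00
```

A jump to `caller_frame` passes the check, while a jump to `stale_frame`, whose function has already returned, is the case in which a fortified longjmp would bail out.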
Ulrich Drepper deb84c43b1 * version.h (VERSION): Bump to 2.10.1.
* nss/getXXbyYY_r.c: If NO_COMPAT_NEEDED is defined don't define any
	compatibility functions.
	* nss/getXXent_r.c: Likewise.
	* gshadow/getsgent_r.c: Define NO_COMPAT_NEEDED.
	* gshadow/getsgnam_r.c: Likewise.
	* gshadow/Version: Remove duplicate entries.

	* sysdeps/x86_64/cacheinfo.c (intel_02_cache_info): Add missing entries
	for recent processor.
	* sysdeps/unix/sysv/linux/i386/sysconf.c (intel_02_cache_info):
	Likewise.
2009-05-10 18:38:52 +00:00
Ulrich Drepper 2221e33e5d * sysdeps/x86_64/memchr.S: Handle invalid buffer pointers when
count is zero.
2009-05-09 06:40:15 +00:00
Ulrich Drepper f0e3c47fd6 * sysdeps/ieee754/dbl-64/s_expm1.c: Set errno for overflow.
* sysdeps/ieee754/flt-32/s_expm1f.c: Likewise.
	* sysdeps/x86_64/fpu/s_expm1l.S: Likewise.
2009-04-27 05:31:37 +00:00
Ulrich Drepper 6cc8844f1d * sysdeps/unix/sysv/linux/dl-osinfo.h (dl_fatal): Remove inline
from definition.

	* sysdeps/x86_64/dl-machine.h (elf_machine_rela): Don't define
	label if it is not used.

	* elf/dl-profile.c (_dl_start_profile): Define real-type variant
	of gmon_hist_hdr and gmon_hdr structures and use them.

	* elf/dl-load.c (open_verify): Add temporary variable to avoid
	warning.

	* nscd/nscd_helper.c (get_mapping): Avoid casts to avoid warnings.

	* sunrpc/clnt_raw.c (clntraw_private_s): Use union in definition
	to avoid cast.

	* inet/rexec.c (rexec_af): Make sa2 a union to avoid warnings.
	* inet/rcmd.c (rcmd_af): Make from a union of the various needed types
	to avoid warnings.
	(iruserok_af): Use ss_family instead of casts.

	* gmon/gmon.c (write_hist): Define real-type variant of
	gmon_hist_hdr structure and use it.
	(write_gmon): Likewise for gmon_hdr.

	* sysdeps/unix/sysv/linux/readv.c: Avoid declaration of replacement
	function if we are not going to define it.
	* sysdeps/unix/sysv/linux/writev.c: Likewise.

	* inet/inet6_option.c (optin_alloc): Add temporary variable to
	avoid warning.

	* libio/strfile.h (struct _IO_streambuf): Use correct type and
	name of VTable element.
	* libio/iovsprintf.c: Avoid casts to avoid warnings.
	* libio/iovsscanf.c: Likewise.
	* libio/vasprintf.c: Likewise.
	* libio/vsnprintf.c: Likewise.
	* stdio-common/isoc99_vsscanf.c: Likewise.
	* stdlib/strfmon_l.c: Likewise.
	* debug/vasprintf_chk.c: Likewise.
	* debug/vsnprintf_chk.c: Likewise.
	* debug/vsprintf_chk.c: Likewise.
2009-04-26 20:12:37 +00:00
Ulrich Drepper 337c270829 * sysdeps/i386/fpu/s_tan.S: Set errno for ±Inf.
* sysdeps/i386/fpu/s_tanf.S: Likewise.
	* sysdeps/i386/fpu/s_tanl.S: Likewise.
	* sysdeps/ieee754/dbl-64/s_tan.c: Likewise.
	* sysdeps/ieee754/flt-32/s_tanf.c: Likewise.
	* sysdeps/x86_64/fpu/s_tanl.S: Likewise.
	* math/libm-test.inc: Add tests for errno after tan calls with
	±Inf.
2009-04-26 05:42:49 +00:00
Ulrich Drepper 0c59a1963e * sysdeps/i386/fpu/s_cos.S: Set errno for ±Inf.
* sysdeps/i386/fpu/s_cosf.S: Likewise.
	* sysdeps/i386/fpu/s_cosl.S: Likewise.
	* sysdeps/i386/fpu/s_sin.S: Likewise.
	* sysdeps/i386/fpu/s_sinf.S: Likewise.
	* sysdeps/i386/fpu/s_sinl.S: Likewise.
	* sysdeps/ieee754/dbl-64/s_sin.c: Likewise.
	* sysdeps/ieee754/flt-32/s_cosf.c: Likewise.
	* sysdeps/ieee754/flt-32/s_sinf.c: Likewise.
	* sysdeps/ieee754/ldbl-96/s_cosl.c: Likewise.
	* sysdeps/ieee754/ldbl-96/s_sinl.c: Likewise.
	* sysdeps/x86_64/fpu/s_cosl.S: Likewise.
	* sysdeps/x86_64/fpu/s_sinl.S: Likewise.
	* math/libm-test.inc: Add tests for errno after sin/cos calls with
	±Inf.
2009-04-26 01:04:54 +00:00
Ulrich Drepper ae650a41ef * sysdeps/x86_64/mp_clz_tab.c: New file. 2009-04-15 04:30:41 +00:00
Ulrich Drepper 893a5fd440 Optimizations from GMP.
* sysdeps/x86_64/add_n.S: New file.
	* sysdeps/x86_64/addmul_1.S: New file.
	* sysdeps/x86_64/lshift.S: New file.
	* sysdeps/x86_64/mul_1.S: New file.
	* sysdeps/x86_64/rshift.S: New file.
	* sysdeps/x86_64/sub_n.S: New file.
	* sysdeps/x86_64/submul_1.S: New file.
2009-04-14 22:26:05 +00:00
Ulrich Drepper 7fd23f1f3b mpn_add_n for x86-64. 2009-04-14 22:24:59 +00:00
Ulrich Drepper 84aa52d7e9 * sysdeps/x86_64/strrchr.S: New file. 2009-04-14 05:58:16 +00:00
Ulrich Drepper f140a0d53d * sysdeps/x86_64/rawmemchr.S: New file. 2009-04-10 07:57:20 +00:00
Ulrich Drepper 4c8b8cc332 * malloc/malloc.c (_int_realloc): Add parameter with old block
size.  Remove duplicated test.  Don't handle mmap'ed blocks here.
	Adjust all callers.
	* malloc/hooks.c (realloc_check): Adjust _int_realloc call.
2009-04-08 18:00:34 +00:00
Ulrich Drepper cd57745bd8 * sysdeps/x86_64/strchrnul.S: New file.
depending libcrypt on -lfreebl3.
2009-04-07 23:22:10 +00:00
Ulrich Drepper ddba0f1700 * string/stratcliff.c (do_test): Add memchr tests.
* sysdeps/x86_64/memchr.S: Fix handling of end of buffer after
	first read quad word.
2009-04-07 14:53:04 +00:00
Ulrich Drepper 322e23db24 * sysdeps/x86_64/memchr.S: New file. 2009-04-07 06:36:33 +00:00
Ulrich Drepper 1df6f9d808 * sysdeps/x86_64/strchr.S: Likewise. 2009-04-06 03:29:26 +00:00
Ulrich Drepper a152f366dc * sysdeps/x86_64/strlen.S: Optimize by using SSE2 instructions. 2009-04-05 18:49:28 +00:00
Ulrich Drepper 906dd40db3 [BZ #9881]
* inet/inet6_rth.c (inet6_rth_add): Add some error checking.
	Patch mostly by Yang Hongyang <yanghy@cn.fujitsu.com>.
	* inet/Makefile (tests): Add tst-inet6_rth.
	* inet/tst-inet6_rth.c: New file.

2009-03-15 19:16:16 +00:00
Ulrich Drepper a42ad61bae * elf/dl-runtime.c (reloc_offset): Define.
(reloc_index): Define.
	(_dl_fixup): Rename reloc_offset parameter to reloc_arg.
	(_dl_fixup_profile): Likewise.  Use reloc_index instead of
	computing index from reloc_offset.
	(_dl_call_pltexit): Likewise.
	* sysdeps/x86_64/dl-trampoline.S (_dl_runtime_resolve): Just pass
	the relocation index to _dl_fixup.
	(_dl_runtime_profile): Likewise for _dl_fixup_profile and
	_dl_call_pltexit.
	* sysdeps/x86_64/dl-runtime.c: New file.
2009-03-15 00:26:14 +00:00
Ulrich Drepper 1f7c90a722 [BZ #9893]
* sysdeps/x86_64/dl-trampoline.S (_dl_runtime_profile): Fix
	alignment of La_x86_64_regs.  Store xmm parameters.
	Patch mostly by Jiri Olsa <olsajiri@gmail.com>.
2009-03-14 23:57:33 +00:00
Ulrich Drepper 425ce2edb9 * config.h.in (USE_MULTIARCH): Define.
* configure.in: Handle --enable-multi-arch.
	* elf/dl-runtime.c (_dl_fixup): Handle STT_GNU_IFUNC.
	(_dl_fixup_profile): Likewise.
	* elf/do-lookup.c (dl_lookup_x): Likewise.
	* sysdeps/x86_64/dl-machine.h: Handle STT_GNU_IFUNC.
	* elf/elf.h (STT_GNU_IFUNC): Define.
	* include/libc-symbols.h (libc_ifunc): Define.
	* sysdeps/x86_64/cacheinfo.c: If USE_MULTIARCH is defined, use the
	framework in init-arch.h to get CPUID values.
	* sysdeps/x86_64/multiarch/Makefile: New file.
	* sysdeps/x86_64/multiarch/init-arch.c: New file.
	* sysdeps/x86_64/multiarch/init-arch.h: New file.
	* sysdeps/x86_64/multiarch/sched_cpucount.c: New file.

	* config.make.in (experimental-malloc): Define.
	* configure.in: Handle --enable-experimental-malloc.
	* malloc/Makefile: Handle experimental-malloc flag.
	* malloc/malloc.c: Implement PER_THREAD and ATOMIC_FASTBINS features.
	* malloc/arena.c: Likewise.
	* malloc/hooks.c: Likewise.
	* malloc/malloc.h: Define M_ARENA_TEST and M_ARENA_MAX.
2009-03-13 23:53:18 +00:00
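As a rough illustration of the M_ARENA_MAX parameter named in the malloc/malloc.h entry above, the sketch below caps the number of malloc arenas through glibc's `mallopt`. It assumes a glibc build in which the parameter is available through `<malloc.h>`; the wrapper name `limit_arenas` is hypothetical and not part of the commit.

```c
#include <malloc.h>

/* Hypothetical wrapper: ask glibc malloc to use at most max_arenas arenas.
   mallopt returns 1 on success and 0 on failure.  */
static int limit_arenas(int max_arenas)
{
    return mallopt(M_ARENA_MAX, max_arenas);
}
```

A program would normally call this once, early in startup and before creating threads, for example `limit_arenas(2)`.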
Ulrich Drepper e7f110cdbd * sysdeps/x86_64/dl-machine.h (elf_machine_rela): Add branch
prediction.  A few size optimizations.
2009-03-12 06:31:25 +00:00
Jakub Jelinek d82a27f841 * stdlib/monetary.h: Uglify function parameter names.
* sunrpc/rpc/pmap_clnt.h: Likewise. 
* sunrpc/rpc/svc.h: Likewise. 
* sunrpc/rpc/xdr.h: Likewise. 
* sunrpc/rpc/clnt.h: Likewise. 
* resolv/netdb.h: Likewise. 
* resolv/arpa/nameser.h: Likewise. 
* resolv/resolv.h: Likewise. 
* argp/argp.h: Likewise. 
* locale/langinfo.h: Likewise. 
* io/sys/stat.h: Likewise. 
* posix/spawn.h: Likewise. 
* nis/rpcsvc/nislib.h: Likewise. 
* malloc/obstack.h: Likewise. 
* sysdeps/ia64/bits/link.h: Likewise. 
* sysdeps/i386/bits/link.h: Likewise. 
* sysdeps/s390/bits/link.h: Likewise. 
* sysdeps/powerpc/bits/link.h: Likewise. 
* sysdeps/x86_64/bits/link.h: Likewise. 
* sysdeps/sparc/bits/link.h: Likewise. 
* sysdeps/sh/bits/link.h: Likewise. 
* sysdeps/unix/sysv/linux/i386/sys/io.h: Likewise. 
* sysdeps/unix/sysv/linux/x86_64/sys/io.h: Likewise. 
* sysdeps/unix/sysv/linux/sparc/sys/eventfd.h: Likewise. 
* sysdeps/unix/sysv/linux/sys/eventfd.h: Likewise.
2009-02-16 21:00:15 +00:00
Ulrich Drepper 6c03cd11e9 * include/atomic.h: Define catomic_and if not already defined.
* sysdeps/x86_64/bits/atomic.h: Define catomic_and.
	* sysdeps/i386/i486/bits/atomic.h: Likewise.
2009-02-08 23:50:23 +00:00
Ulrich Drepper ebc22416e4 * sysdeps/x86_64/cacheinfo.c (intel_02_known): Add new descriptors.
* sysdeps/unix/sysv/linux/i386/sysconf.c (intel_02_known): Likewise.
2009-02-01 18:13:41 +00:00
Ulrich Drepper fd537e535f [BZ #9750]
* nscd/mem.c (gc): Use alloca_account to get the real stack usage.
	* include/alloca.h (alloca_account): Define.
	* sysdeps/x86_64/stackinfo.h (stackinfo_get_sp): Define.
	(stackinfo_sub_sp): Define.
2009-01-29 00:17:57 +00:00
Ulrich Drepper 50e481ceeb * nscd/nscd_gethst_r.c (nscd_gethst_r): Don't use nscd if
LOCALDOMAIN is defined.
	* nscd/nscd_getai.c (__nscd_getai): Likewise.
2008-12-29 20:56:13 +00:00
Ulrich Drepper 217d45cd35 * sysdeps/x86_64/bits/select.h: New file. 2008-12-29 20:16:11 +00:00
Roland McGrath 187f9fbc46 2008-11-11 Roland McGrath <roland@redhat.com>
* sysdeps/x86_64/configure: New file.
2008-11-11 09:50:06 +00:00
Ulrich Drepper 62a1ffc6fa * sysdeps/x86_64/memset.S: Reduce size of tables for PIC. 2008-08-14 18:58:04 +00:00
Ulrich Drepper 9523fd2806 * sysdeps/i386/fpu/s_expm1l.S: Simply use exp implementation for large
parameters.
	* sysdeps/x86_64/fpu/s_expm1l.S: Likewise.
	Patch by Denys Vlasenko <dvlasenk@redhat.com>.

	* nscd/connections.c (nscd_init): Typo in preprocessor directive.
2008-08-05 22:08:42 +00:00
Ulrich Drepper 2f9a1be867 [BZ #6442]
* string/endian.h: Add macros for fixed-size endian conversion.
	* bits/byteswap.h: Allow inclusion from <endian.h>.
	* sysdeps/i386/bits/byteswap.h: Likewise.
	* sysdeps/ia64/bits/byteswap.h: Likewise.
	* sysdeps/s390/bits/byteswap.h: Likewise.
	* sysdeps/x86_64/bits/byteswap.h: Likewise.
	* string/Makefile (tests): Add tst-endian.
	* string/tst-endian.c: New file.
2008-05-15 02:54:33 +00:00
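The fixed-size endian conversion macros mentioned above are presumably the `htobe32`/`be32toh` family that glibc's `<endian.h>` exposes today; the commit message does not list the names, so treat that as an assumption. A minimal sketch of using them to read and write a big-endian 32-bit field follows. The helper names `put_be32` and `get_be32` are hypothetical.

```c
/* glibc exposes these macros under _DEFAULT_SOURCE (on by default with GNU dialects).  */
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif
#include <endian.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: store v into buf in big-endian byte order.  */
static void put_be32(unsigned char *buf, uint32_t v)
{
    uint32_t be = htobe32(v);       /* host order -> big endian */
    memcpy(buf, &be, sizeof be);    /* memcpy avoids unaligned access */
}

/* Hypothetical helper: read a big-endian 32-bit value from buf.  */
static uint32_t get_be32(const unsigned char *buf)
{
    uint32_t be;
    memcpy(&be, buf, sizeof be);
    return be32toh(be);             /* big endian -> host order */
}
```

After `put_be32(buf, 0x11223344)`, the most significant byte 0x11 sits in `buf[0]` regardless of host byte order, and `get_be32(buf)` returns the original value.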
Ulrich Drepper c9ff0187a6 Introduce TLS descriptors for i386 and x86_64.
* include/inline-hashtab.h: New file, copied from 2005's
	libiberty, with fix for memory leak imported afterwards by
	Glauber de Oliveira Costa.
	* elf/tlsdeschtab.h: New file.
	* elf/dl-reloc.c (_dl_try_allocate_static_tls): Extract from...
	(_dl_allocate_static_tls): ... here.  Rearrange failure path.
	(CHECK_STATIC_TLS): Move to...
	* elf/dynamic-link.h: ... this file.
	(TRY_STATIC_TLS): New macro.
	* elf/dl-conflict.c (CHECK_STATIC_TLS, TRY_STATIC_TLS): Override.
	* elf/elf.h (R_386_TLS_GOTDESC, R_386_TLS_DESC_CALL,
	R_386_TLS_DESC): Define.
	(R_X86_64_PC64, R_X86_64_GOTOFF64, R_X86_64_GOTPC32): Merge from
	binutils.
	(R_X86_64_GOTPC32_TLSDESC, R_X86_64_TLSDESC_CALL,
	R_X86_64_TLSDESC): Define.
	(R_386_NUM, R_X86_64_NUM): Adjust.
	* sysdeps/i386/Makefile (sysdep-dl-routines, sysdep_routines,
	sysdep-rtld-routines): Add tlsdesc and dl-tlsdesc for elf subdir.
	(gen-as-const-headers): Add tlsdesc.sym to csu subdir.
	* sysdeps/i386/dl-lookupcfg.h: New file.  Introduce _dl_unmap to
	release tlsdesc_table.
	* sysdeps/i386/dl-machine.h: Include dl-tlsdesc.h.
	(elf_machine_type_class): Mark R_386_TLS_DESC as PLT class.
	(elf_machine_rel): Handle R_386_TLS_DESC.
	(elf_machine_rela): Likewise.
	(elf_machine_lazy_rel): Likewise.
	(elf_machine_lazy_rela): Likewise.
	* sysdeps/i386/dl-tls.h (struct dl_tls_index): Name it.
	* sysdeps/i386/dl-tlsdesc.S: New file.
	* sysdeps/i386/dl-tlsdesc.h: New file.
	* sysdeps/i386/tlsdesc.c: New file.
	* sysdeps/i386/tlsdesc.sym: New file.
	* sysdeps/i386/bits/linkmap.h (struct link_map_machine): Add
	tlsdesc_table.
	* sysdeps/x86_64/Makefile (sysdep-dl-routines, sysdep_routines,
	sysdep-rtld-routines): Add tlsdesc and dl-tlsdesc for elf subdir.
	(gen-as-const-headers): Add tlsdesc.sym to csu subdir.
	* sysdeps/x86_64/dl-lookupcfg.h: New file.  Introduce _dl_unmap to
	release tlsdesc_table.
	* sysdeps/x86_64/dl-machine.h: Include dl-tlsdesc.h.
	(elf_machine_runtime_setup): Set up lazy TLSDESC GOT entry.
	(elf_machine_type_class): Mark R_X86_64_TLSDESC as PLT class.
	(elf_machine_rel): Handle R_X86_64_TLSDESC.
	(elf_machine_rela): Likewise.
	(elf_machine_lazy_rel): Likewise.
	* sysdeps/x86_64/dl-tls.h (struct dl_tls_index): Name it.
	(__tls_get_addr): Do not declare for non-shared compiles.
	* sysdeps/x86_64/dl-tlsdesc.S: New file.
	* sysdeps/x86_64/dl-tlsdesc.h: New file.
	* sysdeps/x86_64/tlsdesc.c: New file.
	* sysdeps/x86_64/tlsdesc.sym: New file.
	* sysdeps/x86_64/bits/linkmap.h (struct link_map_machine): Add
	tlsdesc_table for both 32- and 64-bit structs.
2008-05-13 05:41:30 +00:00
Ulrich Drepper 78c2bf0eb4 * sysdeps/x86_64/rtld-memset.c: New file.
2008-02-26  Harsha Jagasia  <harsha.jagasia@amd.com>

	* sysdeps/x86_64/cacheinfo.c (NOT_USED_RIGHT_NOW): Remove ifdef guards.

	* sysdeps/x86_64/memset.S: Rewrite non-SSE code path as tuned for AMD
	Barcelona machine.  Make default fall through branch of
	__x86_64_preferred_memory_instruction check as the integer code path.

2007-10-15  H.J. Lu  <hongjiu.lu@intel.com>

	* sysdeps/x86_64/cacheinfo.c
	(__x86_64_preferred_memory_instruction): New variable.
	(init_cacheinfo): Initialize __x86_64_preferred_memory_instruction.

	* sysdeps/x86_64/memset.S: Rewrite.

2008-01-08  Jakub Jelinek  <jakub@redhat.com>
	* malloc/malloc.c (public_cALLOc): For arenas other than
2008-03-07 17:55:11 +00:00
Ulrich Drepper ee269826ab (intel_02_known): New entry 0x3f. 2007-12-23 19:32:28 +00:00
Ulrich Drepper f6ed654cab * sysdeps/x86_64/memset.S: Add sfence after movnti. 2007-11-08 01:07:04 +00:00
Ulrich Drepper 41ff2a4999 * sysdeps/x86_64/dl-trampoline.S (_dl_runtime_profile): Make sure
stack is properly aligned for the target function.
	Correct unwind info.
2007-10-31 19:25:15 +00:00
Jakub Jelinek ed13ccf1f7 * sysdeps/x86_64/memset.S (bzero): Renamed to __bzero. Add
weak_alias.
2007-10-18 00:09:32 +00:00
Ulrich Drepper 406f28dbe5 * sysdeps/x86_64/cacheinfo.c: Comment out code added in support of
new memset.
	too high for the improvements.  Implement bzero unconditionally for
	use in libc.
2007-10-17 15:58:16 +00:00
Ulrich Drepper ac1cb5da08 * sysdeps/x86_64/memset.S: Revert to old version for now. The cost is
too high for the improvements.

2007-10-17  Ulrich Drepper  <drepper@redhat.com>
	    Jakub Jelinek  <jakub@redhat.com>
2007-10-17 15:44:30 +00:00
Ulrich Drepper 69819d9223 (__tzfile_read): Take extra memory requested by caller into account when copying TZ string.
2007-10-16  Ulrich Drepper  <drepper@redhat.com>

	* time/tzfile.c (__tzfile_read): Take extra memory requested by caller
	into account when copying TZ string.
2007-10-16 22:37:35 +00:00
Jakub Jelinek 8d137b6098 * sysdeps/x86_64/memset.S (memset): Fix sse2_nt_move
PIC indirect jump.
2007-10-16 09:23:09 +00:00
Jakub Jelinek 0308ad66c1 * sysdeps/x86_64/memset.S: Jump from bzero to memset using
a local label rather than HIDDEN_JUMPTARGET.
2007-10-16 08:54:19 +00:00