Faster memset on x64
This implementation speeds up memset in several ways. The first is avoiding an expensive computed jump. The second is exploiting the fact that the arguments of memset are aligned to 8 bytes most of the time. Benchmark results are available at: kam.mff.cuni.cz/~ondra/benchmark_string/memset_profile_result27_04_13.tar.bz2
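To make the first point concrete, here is a minimal C sketch of one common way to handle short and trailing lengths without a computed jump: a few size-range checks, each followed by two stores that may overlap. This is an illustration only and is not taken from memset.S. The name `sketch_memset` is hypothetical, and the use of overlapping stores is an assumption about the general technique, not a description of this commit.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical illustration, not glibc code: fill n bytes at dst with c
   using plain range checks instead of a jump table indexed by n.  */
static void *sketch_memset(void *dst, int c, size_t n)
{
    unsigned char *p = dst;
    /* Replicate the fill byte into all 8 bytes of a 64-bit word.  */
    uint64_t v = 0x0101010101010101ULL * (unsigned char)c;

    if (n >= 8) {
        /* Whole 8-byte stores from the front...  */
        for (size_t i = 0; i + 8 <= n; i += 8)
            memcpy(p + i, &v, 8);
        /* ...then one store ending exactly at p + n.  It may overlap
           bytes already written, which is harmless for a fill.  */
        memcpy(p + n - 8, &v, 8);
        return dst;
    }
    if (n >= 4) {
        /* 4..7 bytes: two 4-byte stores, overlapping when n < 8.  */
        uint32_t w = (uint32_t)v;
        memcpy(p, &w, 4);
        memcpy(p + n - 4, &w, 4);
        return dst;
    }
    if (n >= 2) {
        /* 2..3 bytes: two 2-byte stores, overlapping when n == 3.  */
        uint16_t h = (uint16_t)v;
        memcpy(p, &h, 2);
        memcpy(p + n - 2, &h, 2);
        return dst;
    }
    if (n == 1)
        p[0] = (unsigned char)c;
    return dst;
}
```

The fixed-size memcpy calls stand in for unaligned stores and are typically compiled to single store instructions. The sketch does nothing special for aligned destinations, so it does not illustrate the second point.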
parent 2d48b41c8f
commit b2b671b677
@@ -1,3 +1,9 @@
+2013-05-20 Ondřej Bílka <neleai@seznam.cz>
+
+	* sysdeps/x86_64/memset.S (memset): New implementation.
+	(__bzero): Likewise.
+	(__memset_tail): New function.
+
 2013-05-20 Ondřej Bílka <neleai@seznam.cz>
 
 	* sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S: New file.
File diff suppressed because it is too large