linux/arch/arm/lib
Kirill A. Shutemov dca230f00d ARM: 5701/1: ARM: copy_page.S: take into account the size of the cache line
The optimized version of copy_page() was written with the assumption that
the cache line size is 32 bytes. On Cortex-A8 the cache line size is 64 bytes.

This patch tries to generalize copy_page() to work with any cache line
size, provided the cache line size is a multiple of 16 bytes and the page
size is a multiple of twice the cache line size.
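
The two constraints above can be illustrated with a rough C sketch. This is
not the assembly from copy_page.S; the names PAGE_SZ, CACHE_LINE and
copy_page_sketch() are hypothetical and chosen only for illustration. It
assumes a loop that handles two cache lines per outer iteration and moves
16 bytes per inner step, which is why both divisibility conditions are needed.

```c
#include <assert.h>	/* static_assert (C11) */
#include <stddef.h>
#include <string.h>

/* Hypothetical values for illustration only. */
#define PAGE_SZ		4096u
#define CACHE_LINE	64u	/* e.g. Cortex-A8; 32 was the old assumption */

/* The two conditions stated in the commit message. */
static_assert(CACHE_LINE % 16 == 0,
	      "cache line size must be a multiple of 16 bytes");
static_assert(PAGE_SZ % (2 * CACHE_LINE) == 0,
	      "page size must be a multiple of two cache lines");

/*
 * Copy one page: each outer iteration covers two cache lines,
 * each inner step copies one 16-byte chunk.
 */
static void copy_page_sketch(void *to, const void *from)
{
	unsigned char *d = to;
	const unsigned char *s = from;
	size_t outer, inner;

	for (outer = 0; outer < PAGE_SZ / (2 * CACHE_LINE); outer++) {
		for (inner = 0; inner < 2 * CACHE_LINE / 16; inner++) {
			memcpy(d, s, 16);
			d += 16;
			s += 16;
		}
	}
}
```

With CACHE_LINE set to 32 the same sketch still covers the whole page; only
the split between outer and inner iterations changes.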

After this optimization we got a ~25% speedup on OMAP3 (tested in
userspace).

Here is a test for kernelspace which triggers copy-on-write after fork():

 #include <stdlib.h>
 #include <string.h>
 #include <sys/wait.h>
 #include <unistd.h>

 #define BUF_SIZE (10000*4096)
 #define NFORK 200

 int main(int argc, char **argv)
 {
         char *buf = malloc(BUF_SIZE);
         int i;

         if (!buf)
                 return 1;

         memset(buf, 0, BUF_SIZE);

         for (i = 0; i < NFORK; i++) {
                 if (fork()) {
                         wait(NULL);
                 } else {
                         int j;

                         for (j = 0; j < BUF_SIZE; j += 4096)
                                 buf[j] = (j & 0xFF) + 1;
                         break;
                 }
         }

         free(buf);
         return 0;
 }

Before the optimization this test takes ~66 seconds; after the optimization
it takes ~56 seconds.

Signed-off-by: Siarhei Siamashka <siarhei.siamashka@nokia.com>
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-09-15 22:07:02 +01:00
..
ashldi3.S Thumb-2: Implement the unified arch/arm/lib functions 2009-07-24 12:32:57 +01:00
ashrdi3.S Thumb-2: Implement the unified arch/arm/lib functions 2009-07-24 12:32:57 +01:00
backtrace.S Thumb-2: Implement the unified arch/arm/lib functions 2009-07-24 12:32:57 +01:00
bitops.h Complete irq tracing support for ARM 2009-08-13 20:34:37 +02:00
changebit.S
clear_user.S Thumb-2: Implement the unified arch/arm/lib functions 2009-07-24 12:32:57 +01:00
clearbit.S
copy_from_user.S Thumb-2: Implement the unified arch/arm/lib functions 2009-07-24 12:32:57 +01:00
copy_page.S ARM: 5701/1: ARM: copy_page.S: take into account the size of the cache line 2009-09-15 22:07:02 +01:00
copy_template.S Thumb-2: Implement the unified arch/arm/lib functions 2009-07-24 12:32:57 +01:00
copy_to_user.S Thumb-2: Implement the unified arch/arm/lib functions 2009-07-24 12:32:57 +01:00
csumipv6.S
csumpartial.S
csumpartialcopy.S
csumpartialcopygeneric.S
csumpartialcopyuser.S Thumb-2: Implement the unified arch/arm/lib functions 2009-07-24 12:32:57 +01:00
delay.S
div64.S Thumb-2: Implement the unified arch/arm/lib functions 2009-07-24 12:32:57 +01:00
ecard.S
findbit.S Thumb-2: Implement the unified arch/arm/lib functions 2009-07-24 12:32:57 +01:00
floppydma.S
getuser.S Thumb-2: Implement the unified arch/arm/lib functions 2009-07-24 12:32:57 +01:00
io-acorn.S
io-readsb.S
io-readsl.S
io-readsw-armv3.S
io-readsw-armv4.S
io-shark.c
io-writesb.S
io-writesl.S
io-writesw-armv3.S
io-writesw-armv4.S Thumb-2: Implement the unified arch/arm/lib functions 2009-07-24 12:32:57 +01:00
lib1funcs.S
lshrdi3.S Thumb-2: Implement the unified arch/arm/lib functions 2009-07-24 12:32:57 +01:00
Makefile
memchr.S
memcpy.S Thumb-2: Implement the unified arch/arm/lib functions 2009-07-24 12:32:57 +01:00
memmove.S Thumb-2: Implement the unified arch/arm/lib functions 2009-07-24 12:32:57 +01:00
memset.S
memzero.S
muldi3.S
putuser.S Thumb-2: Implement the unified arch/arm/lib functions 2009-07-24 12:32:57 +01:00
setbit.S
sha1.S Thumb-2: Add some .align statements to the .S files 2009-07-24 12:32:52 +01:00
strchr.S
strncpy_from_user.S Thumb-2: Implement the unified arch/arm/lib functions 2009-07-24 12:32:57 +01:00
strnlen_user.S Thumb-2: Implement the unified arch/arm/lib functions 2009-07-24 12:32:57 +01:00
strrchr.S
testchangebit.S
testclearbit.S
testsetbit.S
uaccess_with_memcpy.c [ARM] alternative copy_to_user: more precise fallback threshold 2009-05-30 01:10:15 -04:00
uaccess.S
ucmpdi2.S