linux/include/asm-i386
Hiro Yoshioka c22ce143d1 [PATCH] x86: cache pollution aware __copy_from_user_ll()
Use the x86 cache-bypassing copy instructions for copy_from_user().
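
As a rough user-space illustration of what a cache-bypassing copy looks like, the sketch below streams 32-bit words with non-temporal stores. It is not the kernel routine: copy_nocache_sketch() is a hypothetical name, and the use of MOVNTI (via the SSE2 intrinsic _mm_stream_si32) followed by SFENCE is an assumption, since the summary above only says "cache-bypassing copy instructions". On targets without SSE2 it falls back to a plain memcpy().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>  /* _mm_stream_si32(); also pulls in _mm_sfence() */
#endif

/*
 * Hypothetical helper for illustration only; it is NOT the kernel's
 * __copy_user_zeroing_intel_nocache() and does no fault handling.
 * Copies n bytes from src to dst, writing the bulk of the data with
 * non-temporal stores so the destination lines are not pulled into cache.
 */
static void copy_nocache_sketch(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

#if defined(__SSE2__)
    /* Copy leading bytes until the destination is 4-byte aligned. */
    while (n > 0 && ((uintptr_t)d & 3) != 0) {
        *d++ = *s++;
        n--;
    }

    /* Stream one 32-bit word at a time (assumed instruction: MOVNTI). */
    while (n >= 4) {
        int word;

        memcpy(&word, s, sizeof(word));  /* source may be unaligned */
        _mm_stream_si32((int *)d, word);
        d += 4;
        s += 4;
        n -= 4;
    }

    /* Non-temporal stores are weakly ordered; fence before returning. */
    _mm_sfence();
#endif

    /* Remaining tail bytes, or the whole buffer on non-SSE2 targets. */
    memcpy(d, s, n);
}
```

The point of the design is that data copied in from user space for a write is often not read again soon by the CPU, so keeping it out of the cache leaves room for data that is; the figures below measure that effect.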

Some performance data:

Total of GLOBAL_POWER_EVENTS (CPU cycle samples)

2.6.12.4.orig    1921587
2.6.12.4.nt      1599424
1599424/1921587=83.23% (16.77% reduction)

BSQ_CACHE_REFERENCE (L3 cache miss)
2.6.12.4.orig      57427
2.6.12.4.nt        20858
20858/57427=36.32% (63.68% reduction)

L3 cache miss reduction of __copy_from_user_ll
samples  %
37408    65.1412  vmlinux                  __copy_from_user_ll
23        0.1103  vmlinux                  __copy_user_zeroing_intel_nocache
23/37408=0.061% (99.94% reduction)

Top 5 of 2.6.12.4.nt
Counted GLOBAL_POWER_EVENTS events (time during which processor is not stopped) with a unit mask of 0x01 (mandatory) count 100000
samples  %        app name                 symbol name
128392    8.0274  vmlinux                  __copy_user_zeroing_intel_nocache
64206     4.0143  vmlinux                  journal_add_journal_head
59746     3.7355  vmlinux                  do_get_write_access
47674     2.9807  vmlinux                  journal_put_journal_head
46021     2.8774  vmlinux                  journal_dirty_metadata
pattern9-0-cpu4-0-09011728/summary.out

Counted BSQ_CACHE_REFERENCE events (cache references seen by the bus unit) with a unit mask of 0x3f (multiple flags) count 3000
samples  %        app name                 symbol name
69755     4.2861  vmlinux                  __copy_user_zeroing_intel_nocache
55685     3.4215  vmlinux                  journal_add_journal_head
52371     3.2179  vmlinux                  __find_get_block
45504     2.7960  vmlinux                  journal_put_journal_head
36005     2.2123  vmlinux                  journal_stop
pattern9-0-cpu4-0-09011744/summary.out

Counted BSQ_CACHE_REFERENCE events (cache references seen by the bus unit) with a unit mask of 0x200 (read 3rd level cache miss) count 3000
samples  %        app name                 symbol name
1147      5.4994  vmlinux                  journal_add_journal_head
881       4.2240  vmlinux                  journal_dirty_data
872       4.1809  vmlinux                  blk_rq_map_sg
734       3.5192  vmlinux                  journal_commit_transaction
617       2.9582  vmlinux                  radix_tree_delete
pattern9-0-cpu4-0-09011731/summary.out

iozone results:

original 2.6.12.4 CPU time = 207.768 sec
cache aware       CPU time = 184.783 sec
(total over three runs)
184.783/207.768=88.94% (11.06% reduction)

original:
pattern9-0-cpu4-0-08191720/iozone.out:  CPU Utilization: Wall time   45.997    CPU time   64.527    CPU utilization 140.28 %
pattern9-0-cpu4-0-08191741/iozone.out:  CPU Utilization: Wall time   46.878    CPU time   71.933    CPU utilization 153.45 %
pattern9-0-cpu4-0-08191743/iozone.out:  CPU Utilization: Wall time   45.152    CPU time   71.308    CPU utilization 157.93 %

cache aware:
pattern9-0-cpu4-0-09011728/iozone.out:  CPU Utilization: Wall time   44.842    CPU time   62.465    CPU utilization 139.30 %
pattern9-0-cpu4-0-09011731/iozone.out:  CPU Utilization: Wall time   44.718    CPU time   59.273    CPU utilization 132.55 %
pattern9-0-cpu4-0-09011744/iozone.out:  CPU Utilization: Wall time   44.367    CPU time   63.045    CPU utilization 142.10 %
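
The CPU time totals quoted in the iozone summary are the sums of the three per-run "CPU time" values listed above. A minimal sketch of that arithmetic follows; the array and function names are made up for illustration, and the numbers are copied from the iozone.out lines.

```c
#include <stddef.h>

/* Per-run CPU time in seconds, copied from the iozone.out lines above. */
static const double orig_cpu_sec[3] = { 64.527, 71.933, 71.308 };
static const double nt_cpu_sec[3]   = { 62.465, 59.273, 63.045 };

/* Sum the CPU time over n runs. */
static double sum_cpu_time(const double *runs, size_t n)
{
    double total = 0.0;

    for (size_t i = 0; i < n; i++)
        total += runs[i];
    return total;
}

/* Express "after" as a percentage of "before". */
static double percent_of(double after, double before)
{
    return 100.0 * after / before;
}
```

Calling sum_cpu_time() on each array gives the 207.768 sec and 184.783 sec totals, and percent_of() applied to those totals gives the 88.94% figure, which is the reported 11.06% reduction.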

Signed-off-by: Hiro Yoshioka <hyoshiok@miraclelinux.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:56 -07:00
..
mach-bigsmp [PATCH] x86: convert bigsmp to use flat physical mode 2006-01-06 08:33:37 -08:00
mach-default [PATCH] RTC: Fix up some RTC whitespace and style 2006-03-28 09:16:01 -08:00
mach-es7000 [PATCH] Compilation fix for ES7000 when no ACPI is specified in config (i386) 2006-03-23 07:38:04 -08:00
mach-generic
mach-numaq
mach-summit Don't include linux/config.h from anywhere else in include/ 2006-04-26 12:56:16 +01:00
mach-visws [PATCH] i386: fix uses of user_mode() vs. user_mode_vm() 2006-03-23 07:38:05 -08:00
mach-voyager [PATCH] i386: fix uses of user_mode() vs. user_mode_vm() 2006-03-23 07:38:05 -08:00
8253pit.h
a.out.h
acpi.h [PATCH] don't call check_acpi_pci() on x86 with ACPI disabled 2006-03-22 07:53:54 -08:00
agp.h
alternative.h [PATCH] x86: SMP alternatives 2006-03-23 07:38:04 -08:00
apic.h Don't include linux/config.h from anywhere else in include/ 2006-04-26 12:56:16 +01:00
apicdef.h [PATCH] i386 kdump timer vector lockup fix 2006-03-31 12:18:50 -08:00
arch_hooks.h [PATCH] x86: early printk handling fixes 2006-03-23 07:38:05 -08:00
atomic.h Don't include linux/config.h from anywhere else in include/ 2006-04-26 12:56:16 +01:00
auxvec.h
bitops.h Don't include linux/config.h from anywhere else in include/ 2006-04-26 12:56:16 +01:00
boot.h
bug.h Don't include linux/config.h from anywhere else in include/ 2006-04-26 12:56:16 +01:00
bugs.h Don't include linux/config.h from anywhere else in include/ 2006-04-26 12:56:16 +01:00
byteorder.h Don't include linux/config.h from anywhere else in include/ 2006-04-26 12:56:16 +01:00
cache.h Don't include linux/config.h from anywhere else in include/ 2006-04-26 12:56:16 +01:00
cacheflush.h [PATCH] x86/x86_64: mark rodata section read only: x86 parts 2006-01-06 08:33:36 -08:00
checksum.h
cpu.h
cpufeature.h [PATCH] i386/x86-64: Fix x87 information leak between processes 2006-04-20 07:58:11 -07:00
cputime.h
current.h [PATCH] mark several functions __always_inline 2006-01-14 18:27:15 -08:00
debugreg.h
delay.h
desc.h [PATCH] x86: fix broken SMP boot sequence 2006-02-24 14:31:38 -08:00
div64.h
dma-mapping.h [PATCH] i386: make pci_map_single/pci_map_sg warn for zero length. 2006-01-11 19:04:56 -08:00
dma.h Don't include linux/config.h from anywhere else in include/ 2006-04-26 12:56:16 +01:00
dmi.h [PATCH] x86_64: Implement early DMI scanning 2006-03-25 09:10:55 -08:00
e820.h [PATCH] x86_64: Introduce e820_all_mapped 2006-04-09 11:53:50 -07:00
edac.h [PATCH] EDAC: core EDAC support code 2006-01-18 19:20:31 -08:00
elf.h [PATCH] fix remaining missing includes 2005-11-07 07:53:41 -08:00
emergency-restart.h
errno.h
fcntl.h
fixmap.h Don't include linux/config.h from anywhere else in include/ 2006-04-26 12:56:16 +01:00
floppy.h [PATCH] Remove long dead i386 floppy asm code 2006-03-31 12:18:50 -08:00
futex.h [PATCH] lightweight robust futexes updates 2006-03-27 08:44:49 -08:00
genapic.h
hardirq.h Don't include linux/config.h from anywhere else in include/ 2006-04-26 12:56:16 +01:00
highmem.h Don't include linux/config.h from anywhere else in include/ 2006-04-26 12:56:16 +01:00
hpet.h Don't include linux/config.h from anywhere else in include/ 2006-04-26 12:56:16 +01:00
hw_irq.h Don't include linux/config.h from anywhere else in include/ 2006-04-26 12:56:16 +01:00
i387.h [PATCH] i386: fix broken FP exception handling 2006-04-29 14:13:16 -07:00
i8253.h
i8259.h
ide.h Don't include linux/config.h from anywhere else in include/ 2006-04-26 12:56:16 +01:00
io_apic.h Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 2006-05-24 09:22:21 +01:00
io.h Don't include linux/config.h from anywhere else in include/ 2006-04-26 12:56:16 +01:00
ioctl.h [PATCH] Generic ioctl.h 2006-01-10 08:01:34 -08:00
ioctls.h
ipc.h
ipcbuf.h
irq.h Don't include linux/config.h from anywhere else in include/ 2006-04-26 12:56:16 +01:00
ist.h
kdebug.h [PATCH] Notifier chain update: API changes 2006-03-27 08:44:50 -08:00
kexec.h [PATCH] Kdump: i386 compiler warning fix 2006-01-10 08:01:27 -08:00
kmap_types.h Don't include linux/config.h from anywhere else in include/ 2006-04-26 12:56:16 +01:00
kprobes.h [PATCH] x86: kprobes-booster 2006-03-26 08:57:04 -08:00
ldt.h
linkage.h
local.h [PATCH] make local_t signed 2006-03-31 12:18:55 -08:00
math_emu.h
mc146818rtc.h
mca_dma.h
mca.h
mman.h [PATCH] add asm-generic/mman.h 2006-02-15 15:32:22 -08:00
mmu_context.h Don't include linux/config.h from anywhere else in include/ 2006-04-26 12:56:16 +01:00
mmu.h
mmx.h
mmzone.h [PATCH] unify pfn_to_page: i386 pfn_to_page 2006-03-27 08:44:44 -08:00
module.h [PATCH] Base support for AMD Geode GX/LX processors 2006-01-06 08:33:38 -08:00
mpspec_def.h [PATCH] mpspec: remove unneeded packed attribute 2006-01-06 08:33:39 -08:00
mpspec.h [PATCH] mptspec: remove duplicate #include 2006-04-11 06:18:34 -07:00
msgbuf.h
msi.h [PATCH] PCI: cleanup unused variable about msi driver 2006-06-21 12:00:00 -07:00
msr.h
mtrr.h Don't include linux/config.h from anywhere else in include/ 2006-04-26 12:56:16 +01:00
mutex.h [PATCH] x86: SMP alternatives 2006-03-23 07:38:04 -08:00
namei.h
nmi.h
node.h
numa.h [PATCH] x86-64: Use ACPI PXM to parse PCI<->node assignments 2005-09-12 10:49:57 -07:00
numaq.h
page.h Exclude asm-generic/{page,memory_model}.h from user bits of i386/x86_64 page.h 2006-04-27 15:48:08 +01:00
param.h Don't include linux/config.h from anywhere else in include/ 2006-04-26 12:56:16 +01:00
parport.h
pci-direct.h
pci.h Don't include linux/config.h from anywhere else in include/ 2006-04-26 12:56:16 +01:00
percpu.h
pgalloc.h Don't include linux/config.h from anywhere else in include/ 2006-04-26 12:56:16 +01:00
pgtable-2level-defs.h
pgtable-2level.h [PATCH] x86/PAE: Fix pte_clear for the >4GB RAM case 2006-04-27 12:00:59 -07:00
pgtable-3level-defs.h
pgtable-3level.h [PATCH] x86/PAE: Fix pte_clear for the >4GB RAM case 2006-04-27 12:00:59 -07:00
pgtable.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 2006-04-29 01:42:26 +01:00
poll.h [PATCH] POLLRDHUP/EPOLLRDHUP handling for half-closed devices notifications 2006-03-25 08:22:56 -08:00
posix_types.h
processor.h Don't include linux/config.h from anywhere else in include/ 2006-04-26 12:56:16 +01:00
ptrace.h [PATCH] PTRACE_SYSEMU is only for i386 and clashes with other ptrace codes of other archs 2006-01-08 20:14:04 -08:00
resource.h
rtc.h
rwlock.h [PATCH] x86: SMP alternatives 2006-03-23 07:38:04 -08:00
rwsem.h [PATCH] add sem_is_read/write_locked() 2005-10-29 21:40:35 -07:00
scatterlist.h
seccomp.h
sections.h
segment.h [PATCH] x86: Pnp segments in segment h 2006-01-06 08:33:34 -08:00
semaphore.h [PATCH] x86: SMP alternatives 2006-03-23 07:38:04 -08:00
sembuf.h
serial.h Don't include linux/config.h from anywhere else in include/ 2006-04-26 12:56:16 +01:00
setup.h [PATCH] unify PFN_* macros 2006-03-27 08:44:48 -08:00
shmbuf.h
shmparam.h
sigcontext.h
siginfo.h
signal.h [PATCH] Handle TIF_RESTORE_SIGMASK for i386 2006-01-18 19:20:29 -08:00
smp.h Don't include linux/config.h from anywhere else in include/ 2006-04-26 12:56:16 +01:00
socket.h
sockios.h
sparsemem.h
spinlock_types.h
spinlock.h Don't include linux/config.h from anywhere else in include/ 2006-04-26 12:56:16 +01:00
srat.h
stat.h [PATCH] 2TB files: st_blocks is invalid when calling stat64 2006-03-26 08:57:00 -08:00
statfs.h
string.h Don't include linux/config.h from anywhere else in include/ 2006-04-26 12:56:16 +01:00
suspend.h
system.h Don't include linux/config.h from anywhere else in include/ 2006-04-26 12:56:16 +01:00
termbits.h
termios.h
thread_info.h Don't include linux/config.h from anywhere else in include/ 2006-04-26 12:56:16 +01:00
timer.h
timex.h Don't include linux/config.h from anywhere else in include/ 2006-04-26 12:56:16 +01:00
tlb.h
tlbflush.h Don't include linux/config.h from anywhere else in include/ 2006-04-26 12:56:16 +01:00
topology.h [PATCH] sched: new sched domain for representing multi-core 2006-03-27 08:44:43 -08:00
types.h Don't include linux/config.h from anywhere else in include/ 2006-04-26 12:56:16 +01:00
uaccess.h [PATCH] x86: cache pollution aware __copy_from_user_ll() 2006-06-23 07:42:56 -07:00
ucontext.h
unaligned.h
unistd.h [PATCH] sys_move_pages: 32bit support (i386, x86_64) 2006-06-23 07:42:53 -07:00
user.h
vga.h [PATCH] vgacon: make VGA_MAP_MEM take size, remove extra use 2006-06-22 15:05:58 -07:00
vic.h
vm86.h [PATCH] Make vm86 support optional 2006-01-08 20:14:11 -08:00
voyager.h
xor.h