linux/include/asm-generic
Jiri Olsa ea7145477a x86: Separate out entry text section
Put x86 entry code into a separate link section: .entry.text.

Separating the entry text section seems to have performance
benefits - caused by more efficient instruction cache usage.

Running hackbench with perf stat --repeat showed that the change
compresses the icache footprint. The icache load miss rate went
down by about 15%:

 before patch:
         19417627  L1-icache-load-misses      ( +-   0.147% )

 after patch:
         16490788  L1-icache-load-misses      ( +-   0.180% )

The motivation of the patch was to fix a particular kprobes
bug that relates to the entry text section, the performance
advantage was discovered accidentally.

Whole perf output follows:

 - results for current tip tree:

  Performance counter stats for './hackbench/hackbench 10' (500 runs):

         19417627  L1-icache-load-misses      ( +-   0.147% )
       2676914223  instructions             #      0.497 IPC     ( +- 0.079% )
       5389516026  cycles                     ( +-   0.144% )

      0.206267711  seconds time elapsed   ( +-   0.138% )

 - results for current tip tree with the patch applied:

  Performance counter stats for './hackbench/hackbench 10' (500 runs):

         16490788  L1-icache-load-misses      ( +-   0.180% )
       2717734941  instructions             #      0.502 IPC     ( +- 0.079% )
       5414756975  cycles                     ( +-   0.148% )

      0.206747566  seconds time elapsed   ( +-   0.137% )

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: masami.hiramatsu.pt@hitachi.com
Cc: ananth@in.ibm.com
Cc: davem@davemloft.net
Cc: 2nddept-manager@sdl.hitachi.co.jp
LKML-Reference: <20110307181039.GB15197@jolsa.redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-08 17:22:11 +01:00
..
bitops bitops: remove duplicated extern declarations 2010-10-09 21:51:45 +02:00
4level-fixup.h
atomic64.h
atomic-long.h
atomic.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic 2010-10-22 11:17:06 -07:00
audit_change_attr.h audit: make link()/linkat() match "attribute change" predicate 2010-10-30 08:45:43 -04:00
audit_dir_write.h
audit_read.h
audit_signal.h
audit_write.h
auxvec.h
bitops.h
bitsperlong.h
bug.h panic: Allow warnings to set different taint flags 2010-05-19 08:36:48 +01:00
bugs.h
cache.h
cacheflush.h
checksum.h
cmpxchg-local.h Fix IRQ flag handling naming 2010-10-07 14:08:55 +01:00
cmpxchg.h
cputime.h taskstats: use real microsecond granularity for CPU times 2010-10-27 18:03:17 -07:00
current.h
delay.h
device.h
div64.h
dma-coherent.h
dma-mapping-broken.h dma-mapping: remove dma_is_consistent API 2010-08-11 08:59:21 -07:00
dma-mapping-common.h dma-mapping: remove unnecessary sync_single_range_* in dma_map_ops 2010-05-27 09:12:52 -07:00
dma.h
emergency-restart.h
errno-base.h
errno.h
fb.h
fcntl.h asm-generic: fcntl: make exported headers use strict posix types 2010-10-09 21:51:43 +02:00
futex.h
getorder.h
gpio.h Revert "gpiolib: annotate gpio-intialization with __must_check" 2011-01-13 17:26:46 -08:00
hardirq.h Fix IRQ flag handling naming 2010-10-07 14:08:55 +01:00
hw_irq.h
ide_iops.h
int-l64.h
int-ll64.h
io.h asm-generic/io.h: add reads[bwl]/writes[bwl] helpers 2011-01-10 07:18:03 -05:00
ioctl.h
ioctls.h TTY: Add tty ioctl to figure device node of the system console. 2010-12-16 16:18:28 -08:00
iomap.h
ipcbuf.h
irq_regs.h core: Replace __get_cpu_var with __this_cpu_read if not used for an address. 2010-12-17 15:07:19 +01:00
irq.h
irqflags.h Fix IRQ flag handling naming 2010-10-07 14:08:55 +01:00
Kbuild include: replace unifdef-y with header-y 2010-08-14 22:26:51 +02:00
Kbuild.asm include: replace unifdef-y with header-y 2010-08-14 22:26:51 +02:00
kdebug.h asm-generic: kdebug.h: Checkpatch cleanup 2010-10-09 21:51:44 +02:00
kmap_types.h include/asm-generic/kmap_types.h: add helpful reminder 2010-05-25 08:07:03 -07:00
libata-portmap.h
linkage.h
local64.h arch: Implement local64_t 2010-06-09 11:12:36 +02:00
local.h local_t: Remove cpu_local_xx macros 2010-01-05 15:34:49 +09:00
memory_model.h
mm_hooks.h
mman-common.h thp: mm: define MADV_NOHUGEPAGE 2011-01-13 17:32:47 -08:00
mman.h
mmu_context.h
mmu.h
module.h
msgbuf.h
mutex-dec.h
mutex-null.h
mutex-xchg.h
mutex.h
page.h
param.h
parport.h
pci-dma-compat.h dma-mapping: pci: move pci_set_dma_mask and pci_set_consistent_dma_mask to pci-dma-compat.h 2010-03-12 15:52:42 -08:00
pci.h
percpu.h percpu: Optimize __get_cpu_var() 2010-09-10 10:56:51 +02:00
pgalloc.h
pgtable-nopmd.h
pgtable-nopud.h
pgtable.h mm: <asm-generic/pgtable.h> must include <linux/mm_types.h> 2011-02-28 17:46:49 -08:00
poll.h
posix_types.h
resource.h
rtc.h
scatterlist.h asm-generic: remove ARCH_HAS_SG_CHAIN in scatterlist.h 2010-05-27 09:12:54 -07:00
sections.h x86: Separate out entry text section 2011-03-08 17:22:11 +01:00
segment.h
sembuf.h
serial.h
setup.h
shmbuf.h
shmparam.h
siginfo.h
signal-defs.h
signal.h
socket.h
sockios.h
spinlock.h
stat.h asm-generic/stat.h: support 64-bit file time_t for stat() 2010-11-01 15:31:29 -04:00
statfs.h add f_flags to struct statfs(64) 2010-08-09 16:48:44 -04:00
string.h
swab.h
syscall.h
syscalls.h Fix the declaration of sys_execve() in asm-generic/syscalls.h 2010-08-18 12:12:38 -07:00
system.h asm-generic: cmpxchg does not handle non-long arguments 2010-10-09 21:51:27 +02:00
termbits.h tty: Add EXTPROC support for LINEMODE 2010-08-10 13:47:39 -07:00
termios-base.h
termios.h
timex.h
tlb.h
tlbflush.h
topology.h topology: alternate fix for ia64 tiger_defconfig build breakage 2010-08-09 20:44:57 -07:00
types.h
uaccess-unaligned.h
uaccess.h
ucontext.h
unaligned.h
unistd.h Add fanotify syscalls to <asm-generic/unistd.h>. 2010-08-13 08:08:37 -04:00
user.h
vga.h
vmlinux.lds.h x86: Separate out entry text section 2011-03-08 17:22:11 +01:00
xor.h