linux/arch/x86/mm
Shaohua Li 9329672021 x86: Spread tlb flush vector between nodes
Currently, flush tlb vector allocation is based on the equation below:
	sender = smp_processor_id() % 8
This isn't optimal: CPUs from different nodes can have the same vector, which
causes a lot of lock contention. Instead, we can assign the same vectors to
CPUs from the same node, while different nodes get different vectors. This has
the following advantages:
a. If there is lock contention, it is between CPUs from one
node. This should be much cheaper than contention between nodes.
b. Lock contention between nodes is avoided completely. This especially benefits
kswapd, which is the biggest user of tlb flush, since kswapd sets its affinity
to a specific node.
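
The difference between the two schemes can be sketched in userspace C. This is an illustration only, not the code in tlb.c: the names old_vector(), node_vector() and NUM_VECTORS are hypothetical, and the policy of giving each node a contiguous slice of the 8 vectors is an assumption about one way to keep vectors node-local.

```c
#include <assert.h>

#define NUM_VECTORS 8	/* matches the "% 8" in the old equation */

/* Old scheme: the vector depends only on the CPU number. */
static int old_vector(int cpu)
{
	return cpu % NUM_VECTORS;
}

/*
 * Node-aware scheme (sketch, assumed policy): each node owns a contiguous
 * slice of vectors and its CPUs rotate through that slice only.
 * cpu_index_in_node is the position of the CPU within its node (0, 1, ...).
 */
static int node_vector(int node, int cpu_index_in_node, int nr_nodes)
{
	int vecs_per_node;

	assert(nr_nodes > 0 && node >= 0 && cpu_index_in_node >= 0);

	/* With more nodes than vectors, each node gets one vector and some nodes share. */
	if (nr_nodes > NUM_VECTORS)
		vecs_per_node = 1;
	else
		vecs_per_node = NUM_VECTORS / nr_nodes;

	return (node % NUM_VECTORS) * vecs_per_node
		+ cpu_index_in_node % vecs_per_node;
}
```

With 4 nodes, each node gets 2 of the 8 vectors under this sketch, so CPUs on different nodes never share a vector. Under the old equation, and assuming CPUs are numbered contiguously per node, CPU 0 (node 0) and CPU 16 (node 1) both map to vector 0.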

In my test, this could reduce CPU overhead by more than 20% in an extreme case. The test
machine has 4 nodes and each node has 16 CPUs. I bound each node's kswapd
to the first CPU of the node, then ran a workload with 4 threads doing sequential
mmap file reads. The files are empty sparse files. This workload triggers a
lot of page reclaim and tlb flushes. Binding kswapd makes it easy to trigger the
extreme tlb flush lock contention, because otherwise kswapd keeps migrating
between the CPUs of a node and I can't get stable results. Of course, in real
workloads we won't always see such heavy tlb flush lock contention, but it's possible.

[ hpa: folded in fix from Eric Dumazet to use this_cpu_read() ]

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
LKML-Reference: <1287544023.4571.8.camel@sli10-conroe.sh.intel.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-10-20 14:44:42 -07:00
..
kmemcheck x86, kmemcheck: Remove double test 2010-08-30 09:19:28 +02:00
dump_pagetables.c x86, mm: Create symbolic index into address_markers array 2010-07-20 16:56:19 -07:00
extable.c
fault.c x86, mm: Fix incorrect data type in vmalloc_sync_all() 2010-10-20 12:54:04 -07:00
gup.c
highmem_32.c kmap_atomic: make kunmap_atomic() harder to misuse 2010-08-09 20:44:54 -07:00
hugetlbpage.c
init_32.c
init_64.c x86, mm: Hold mm->page_table_lock while doing vmalloc_sync 2010-10-19 13:57:08 -07:00
init.c
iomap_32.c
ioremap.c x86, iomap: Fix wrong page aligned size calculation in ioremapping code 2010-07-20 16:56:35 -07:00
k8topology_64.c
kmmio.c x86, kmmio/mmiotrace: Fix double free of kmmio_fault_pages 2010-06-18 11:30:09 +02:00
Makefile
memtest.c
mmap.c
mmio-mod.c
numa_32.c
numa_64.c numa: x86_64: use generic percpu var numa_node_id() implementation 2010-05-27 09:12:57 -07:00
numa.c x86/mm: Remove unused DBG() macro 2010-05-31 10:01:53 +02:00
pageattr-test.c
pageattr.c
pat_internal.h
pat_rbtree.c rbtree: Undo augmented trees performance damage and regression 2010-07-05 14:43:50 +02:00
pat.c Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2010-08-06 10:17:52 -07:00
pf_in.c x86,mmiotrace: Add support for tracing STOS instruction 2010-08-02 01:32:01 +02:00
pf_in.h
pgtable_32.c
pgtable.c x86, mm: Hold mm->page_table_lock while doing vmalloc_sync 2010-10-19 13:57:08 -07:00
physaddr.c
physaddr.h
setup_nx.c
srat_32.c
srat_64.c
testmmiotrace.c x86, kmmio/mmiotrace: Fix double free of kmmio_fault_pages 2010-06-18 11:30:09 +02:00
tlb.c x86: Spread tlb flush vector between nodes 2010-10-20 14:44:42 -07:00