linux/kernel/sched
Peter Zijlstra d153b15344 sched/core: Fix wake_affine() performance regression
Eric reported a sysbench regression against commit:

  3fed382b46 ("sched/numa: Implement NUMA node level wake_affine()")

Similarly, Rik was looking at the NAS-lu.C benchmark, which regressed
against his v3.10 enterprise kernel.

PRE (current tip/master):

 ivb-ep sysbench:

   2: [30 secs]     transactions:                        64110  (2136.94 per sec.)
   5: [30 secs]     transactions:                        143644 (4787.99 per sec.)
  10: [30 secs]     transactions:                        274298 (9142.93 per sec.)
  20: [30 secs]     transactions:                        418683 (13955.45 per sec.)
  40: [30 secs]     transactions:                        320731 (10690.15 per sec.)
  80: [30 secs]     transactions:                        355096 (11834.28 per sec.)

 hsw-ex NAS:

 OMP_PROC_BIND/lu.C.x_threads_144_run_1.log: Time in seconds =                    18.01
 OMP_PROC_BIND/lu.C.x_threads_144_run_2.log: Time in seconds =                    17.89
 OMP_PROC_BIND/lu.C.x_threads_144_run_3.log: Time in seconds =                    17.93
 lu.C.x_threads_144_run_1.log: Time in seconds =                   434.68
 lu.C.x_threads_144_run_2.log: Time in seconds =                   405.36
 lu.C.x_threads_144_run_3.log: Time in seconds =                   433.83

POST (+patch):

 ivb-ep sysbench:

   2: [30 secs]     transactions:                        64494  (2149.75 per sec.)
   5: [30 secs]     transactions:                        145114 (4836.99 per sec.)
  10: [30 secs]     transactions:                        278311 (9276.69 per sec.)
  20: [30 secs]     transactions:                        437169 (14571.60 per sec.)
  40: [30 secs]     transactions:                        669837 (22326.73 per sec.)
  80: [30 secs]     transactions:                        631739 (21055.88 per sec.)

 hsw-ex NAS:

 lu.C.x_threads_144_run_1.log: Time in seconds =                    23.36
 lu.C.x_threads_144_run_2.log: Time in seconds =                    22.96
 lu.C.x_threads_144_run_3.log: Time in seconds =                    22.52

This patch takes out all the shiny wake_affine() stuff and goes back to
utter basics. Between the two CPUs involved with the wakeup (the CPU
doing the wakeup and the CPU we ran on previously) pick the CPU we can
run on _now_.

This restores much of the regressions against the older kernels,
but leaves some ground in the overloaded case. The default-enabled
WA_WEIGHT (which will be introduced in the next patch) is an attempt
to address the overloaded situation.

Reported-by: Eric Farman <farman@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: jinpuwang@gmail.com
Cc: vcaputo@pengaru.com
Fixes: 3fed382b46 ("sched/numa: Implement NUMA node level wake_affine()")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-10-10 10:14:02 +02:00
..
Makefile membarrier: Provide expedited private command 2017-08-17 07:28:05 -07:00
autogroup.c sched/autogroup: Fix error reporting printk text in autogroup_create() 2017-08-10 17:06:03 +02:00
autogroup.h sched/headers: Prepare for new header dependencies before moving code to <linux/sched/autogroup.h> 2017-03-02 08:42:28 +01:00
clock.c sched/clock: Fix early boot preempt assumption in __set_sched_clock_stable() 2017-05-24 09:10:00 +02:00
completion.c Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-09-04 11:52:29 -07:00
core.c sched/debug: Ignore TASK_IDLE for SysRq-W 2017-09-29 11:02:57 +02:00
cpuacct.c sched/cputime: Convert kcpustat to nsecs 2017-02-01 09:13:47 +01:00
cpuacct.h sched/cpuacct: Simplify the cpuacct code 2016-03-21 11:00:28 +01:00
cpudeadline.c sched/deadline: Change return value of cpudl_find() 2017-08-10 12:18:17 +02:00
cpudeadline.h sched/deadline: Split cpudl_set() into cpudl_set() and cpudl_clear() 2016-09-05 13:29:43 +02:00
cpufreq.c cpufreq / sched: Pass flags to cpufreq_update_util() 2016-08-16 22:14:55 +02:00
cpufreq_schedutil.c Merge branch 'pm-cpufreq-sched' 2017-09-04 00:05:22 +02:00
cpupri.c sched/cpupri: Don't re-initialize 'struct cpupri' 2017-08-10 12:18:14 +02:00
cpupri.h sched/cpupri: Remove unnecessary definitions in cpupri.h 2014-11-16 10:58:59 +01:00
cputime.c sched/cputime: Don't use smp_processor_id() in preemptible context 2017-07-14 10:27:15 +02:00
deadline.c sched/deadline: replace earliest dl and rq leftmost caching 2017-09-08 18:26:49 -07:00
debug.c sched/debug: Remove unused variable 2017-09-29 10:09:09 +02:00
fair.c sched/core: Fix wake_affine() performance regression 2017-10-10 10:14:02 +02:00
features.h sched/core: Fix wake_affine() performance regression 2017-10-10 10:14:02 +02:00
idle.c PM / s2idle: Rename ->enter_freeze to ->enter_s2idle 2017-08-11 01:29:56 +02:00
idle_task.c sched/core: Add wrappers for lockdep_(un)pin_lock() 2017-01-14 11:29:30 +01:00
loadavg.c sched/loadavg: Generalize "_idle" naming to "_nohz" 2017-06-22 11:30:01 +02:00
membarrier.c membarrier: Provide expedited private command 2017-08-17 07:28:05 -07:00
rt.c sched: cpufreq: Allow remote cpufreq callbacks 2017-08-01 14:24:53 +02:00
sched-pelt.h sched/fair: Move the PELT constants into a generated header 2017-04-14 10:26:37 +02:00
sched.h Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-09-13 12:22:32 -07:00
stats.c sched: use %*pb[l] to print bitmaps including cpumasks and nodemasks 2015-02-13 21:21:37 -08:00
stats.h sched/headers: Move cputime functionality from <linux/sched.h> and <linux/cputime.h> into <linux/sched/cputime.h> 2017-03-03 01:45:22 +01:00
stop_task.c sched/core: Add wrappers for lockdep_(un)pin_lock() 2017-01-14 11:29:30 +01:00
swait.c sched/wait: Remove the lockless swait_active() check in swake_up*() 2017-08-10 12:28:53 +02:00
topology.c Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-09-13 12:22:32 -07:00
wait.c sched/wait: Introduce wakeup boomark in wake_up_page_bit 2017-09-14 09:56:18 -07:00
wait_bit.c sched/wait: Disambiguate wq_entry->task_list and wq_head->task_list naming 2017-06-20 12:19:14 +02:00