linux/kernel
Ingo Molnar f6cf891c4d sched: make the scheduler converge to the ideal latency
de-HZ-ification of the granularity defaults unearthed a pre-existing
property of CFS: while it correctly converges to the granularity goal,
it does not prevent run-time fluctuations in the range of
[-gran ... 0 ... +gran].

With the increase of the granularity due to the removal of HZ
dependencies, this becomes visible in chew-max output (with 5 tasks
running):

 out:  28 . 27. 32 | flu:  0 .  0 | ran:    9 .   13 | per:   37 .   40
 out:  27 . 27. 32 | flu:  0 .  0 | ran:   17 .   13 | per:   44 .   40
 out:  27 . 27. 32 | flu:  0 .  0 | ran:    9 .   13 | per:   36 .   40
 out:  29 . 27. 32 | flu:  2 .  0 | ran:   17 .   13 | per:   46 .   40
 out:  28 . 27. 32 | flu:  0 .  0 | ran:    9 .   13 | per:   37 .   40
 out:  29 . 27. 32 | flu:  0 .  0 | ran:   18 .   13 | per:   47 .   40
 out:  28 . 27. 32 | flu:  0 .  0 | ran:    9 .   13 | per:   37 .   40

average slice is the ideal 13 msecs and the period is picture-perfect 40
msecs. But the 'ran' field fluctuates around 13.33 msecs and there's no
mechanism in CFS to keep that from happening: it's a perfectly valid
solution that CFS finds.

to fix this we add a granularity/preemption rule that knows about
the "target latency", which makes tasks that run longer than the ideal
latency run a bit less. The simplest approach is to simply decrease the
preemption granularity when a task overruns its ideal latency. For this
we have to track how much the task executed since its last preemption.

( this adds a new field to task_struct, but we can eliminate that
  overhead in 2.6.24 by putting all the scheduler timestamps into an
  anonymous union. )

with this change in place, chew-max output is fluctuation-less all
around:

 out:  28 . 27. 39 | flu:  0 .  2 | ran:   13 .   13 | per:   41 .   40
 out:  28 . 27. 39 | flu:  0 .  2 | ran:   13 .   13 | per:   41 .   40
 out:  28 . 27. 39 | flu:  0 .  2 | ran:   13 .   13 | per:   41 .   40
 out:  28 . 27. 39 | flu:  0 .  2 | ran:   13 .   13 | per:   41 .   40
 out:  28 . 27. 39 | flu:  0 .  1 | ran:   13 .   13 | per:   41 .   40
 out:  28 . 27. 39 | flu:  0 .  1 | ran:   13 .   13 | per:   41 .   40

this patch has no impact on any fastpath or on any globally observable
scheduling property. (unless you have sharp enough eyes to see
millisecond-level ruckles in glxgears smoothness :-)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mike Galbraith <efault@gmx.de>
2007-08-28 12:53:24 +02:00
..
irq free_irq(): fix DEBUG_SHIRQ handling 2007-08-22 19:52:44 -07:00
power Hibernation: do not try to mark invalid PFNs as nosave 2007-08-11 15:47:40 -07:00
time timer: remove clockevents_unregister_notifier 2007-08-11 15:47:42 -07:00
.gitignore gitignore: ignore more generated files 2006-01-03 11:35:26 +01:00
Kconfig.hz [PATCH] HZ: 300Hz support 2006-12-07 08:39:36 -08:00
Kconfig.preempt [PATCH] sched: arch preempt notifier mechanism 2007-07-26 13:40:43 +02:00
Makefile user namespace: add the framework 2007-07-16 09:05:47 -07:00
acct.c Cleanup non-arch xtime uses, use get_seconds() or current_kernel_time(). 2007-07-25 10:09:20 -07:00
audit.c Freezer: make kernel threads nonfreezable by default 2007-07-17 10:23:02 -07:00
audit.h Audit: add TTY input auditing 2007-07-16 09:05:47 -07:00
auditfilter.c [PATCH] allow audit filtering on bit & operations 2007-07-22 09:57:02 -04:00
auditsc.c kernel/auditsc.c: fix an off-by-one 2007-08-22 19:52:44 -07:00
capability.c [PATCH] pid: replace do/while_each_task_pid with do/while_each_pid_task 2007-02-12 09:48:32 -08:00
compat.c signal/timer/event: timerfd compat code 2007-05-11 08:29:36 -07:00
configs.c use simple_read_from_buffer in kernel/ 2007-05-09 12:30:49 -07:00
cpu.c HOTPLUG: Add CPU_DYING notifier 2007-07-16 12:05:49 +03:00
cpuset.c usermodehelper: Tidy up waiting 2007-07-18 08:47:40 -07:00
delayacct.c sched: update delay-accounting to use CFS's precise stats 2007-07-09 18:52:00 +02:00
die_notifier.c move die notifier handling to common code 2007-05-08 11:15:04 -07:00
dma.c [PATCH] struct seq_operations and struct file_operations constification 2006-12-07 08:39:46 -08:00
exec_domain.c Remove obsolete #include <linux/config.h> 2006-06-30 19:25:36 +02:00
exit.c Kill some obsolete sub-thread-ptrace stuff 2007-08-03 15:06:33 -07:00
extable.c [PATCH] symbol_put_addr() locks kernel 2006-05-15 11:20:55 -07:00
fork.c mm: Remove slab destructors from kmem_cache_create(). 2007-07-20 10:11:58 +09:00
futex.c futex_unlock_pi() hurts my brain and may cause application deadlock 2007-08-22 19:52:44 -07:00
futex_compat.c Revert "futex_requeue_pi optimization" 2007-06-18 09:48:41 -07:00
hrtimer.c Cache xtime every call to update_wall_time 2007-07-25 10:17:44 -07:00
itimer.c The scheduled -EINVAL for invalid timevals in setitimer 2007-05-08 11:15:13 -07:00
kallsyms.c kallsyms: make KSYM_NAME_LEN include space for trailing '\0' 2007-07-17 10:23:03 -07:00
kexec.c kdump/kexec: calculate note size at compile time 2007-05-08 11:15:07 -07:00
kfifo.c is_power_of_2: kernel/kfifo.c 2007-07-16 09:05:50 -07:00
kmod.c kernel-doc fix for kmod.c 2007-07-26 11:33:06 -07:00
kprobes.c fix compilation with gcc 4.2 2007-08-11 15:47:42 -07:00
ksysfs.c FRV: Fix linkage problems 2007-07-20 12:01:34 -07:00
kthread.c kthread: silence bogus section mismatch warning 2007-07-31 15:39:42 -07:00
latency.c [PATCH] severing module.h->sched.h 2006-12-04 02:00:22 -05:00
lockdep.c lockdep debugging: give stacktrace for init_error 2007-07-19 10:04:49 -07:00
lockdep_internals.h [PATCH] lockdep: more chains 2006-12-07 08:39:43 -08:00
lockdep_proc.c Fix leak on /proc/lockdep_stats 2007-07-31 15:39:40 -07:00
module.c Fix Off-by-one in /sys/module/*/refcnt 2007-08-22 14:35:35 -07:00
mutex-debug.c [PATCH] remove many unneeded #includes of sched.h 2007-02-14 08:09:54 -08:00
mutex-debug.h [PATCH] lockdep: better lock debugging 2006-07-03 15:27:01 -07:00
mutex.c lockstat: measure lock bouncing 2007-07-19 10:04:49 -07:00
mutex.h [PATCH] lockdep: prove mutex locking correctness 2006-07-03 15:27:04 -07:00
nsproxy.c mm: Remove slab destructors from kmem_cache_create(). 2007-07-20 10:11:58 +09:00
panic.c Report that kernel is tainted if there was an OOPS 2007-07-17 10:23:02 -07:00
params.c modules: better error messages when modules fail to load due to a sysfs problem. 2007-07-30 14:25:23 -07:00
pid.c namespace: ensure clone_flags are always stored in an unsigned long 2007-07-16 09:05:48 -07:00
posix-cpu-timers.c sched: make posix-cpu-timers use CFS's accounting information 2007-07-09 18:51:58 +02:00
posix-timers.c posix-timers: fix creation race 2007-08-22 19:52:46 -07:00
printk.c fix - ensure we don't use bootconsoles after init has been released 2007-08-21 20:23:53 -07:00
profile.c fix compilation with gcc 4.2 2007-08-11 15:47:42 -07:00
ptrace.c coredump masking: reimplementation of dumpable using two flags 2007-07-19 10:04:46 -07:00
rcupdate.c Add suspend-related notifications for CPU hotplug 2007-05-09 12:30:56 -07:00
rcutorture.c Freezer: make kernel threads nonfreezable by default 2007-07-17 10:23:02 -07:00
relay.c Fix a use after free bug in kernel->userspace relay file support 2007-07-31 15:39:42 -07:00
resource.c libata/IDE: remove combined mode quirk 2007-04-28 14:15:59 -04:00
rtmutex-debug.c FUTEX: Tidy up the code 2007-07-16 09:05:49 -07:00
rtmutex-debug.h [PATCH] lockdep: better lock debugging 2006-07-03 15:27:01 -07:00
rtmutex-tester.c Freezer: make kernel threads nonfreezable by default 2007-07-17 10:23:02 -07:00
rtmutex.c FUTEX: Tidy up the code 2007-07-16 09:05:49 -07:00
rtmutex.h [PATCH] lockdep: better lock debugging 2006-07-03 15:27:01 -07:00
rtmutex_common.h FUTEX: Tidy up the code 2007-07-16 09:05:49 -07:00
rwsem.c lockstat: hook into spinlock_t, rwlock_t, rwsem and mutex 2007-07-19 10:04:49 -07:00
sched.c sched: make the scheduler converge to the ideal latency 2007-08-28 12:53:24 +02:00
sched_debug.c sched: sched_clock_idle_[sleep|wakeup]_event() 2007-08-23 15:18:02 +02:00
sched_fair.c sched: make the scheduler converge to the ideal latency 2007-08-28 12:53:24 +02:00
sched_idletask.c sched: remove the 'u64 now' parameter from ->put_prev_task() 2007-08-09 11:16:49 +02:00
sched_rt.c sched: optimize task_tick_rt() a bit 2007-08-24 20:39:10 +02:00
sched_stats.h [PATCH] sched: add schedstat_set() API 2007-08-02 17:41:40 +02:00
seccomp.c make seccomp zerocost in schedule 2007-07-16 09:05:50 -07:00
signal.c signalfd: fix interaction with posix-timers 2007-08-22 19:52:46 -07:00
softirq.c Freezer: make kernel threads nonfreezable by default 2007-07-17 10:23:02 -07:00
softlockup.c Freezer: make kernel threads nonfreezable by default 2007-07-17 10:23:02 -07:00
spinlock.c lockstat: hook into spinlock_t, rwlock_t, rwsem and mutex 2007-07-19 10:04:49 -07:00
srcu.c [PATCH] SRCU: report out-of-memory errors 2006-10-04 07:55:30 -07:00
stacktrace.c [PATCH] lockdep: stacktrace subsystem, core 2006-07-03 15:27:02 -07:00
stop_machine.c Fix stop_machine_run problem with naughty real time process 2007-07-16 09:05:41 -07:00
sys.c Replace CONFIG_SOFTWARE_SUSPEND with CONFIG_HIBERNATION 2007-07-29 16:45:38 -07:00
sys_ni.c diskquota: 32bit quota tools on 64bit architectures 2007-07-16 09:05:48 -07:00
sysctl.c sched: cleanup, sched_granularity -> sched_min_granularity 2007-08-25 18:41:53 +02:00
taskstats.c taskstats: add context-switch counters 2007-07-16 09:05:46 -07:00
time.c Cleanup non-arch xtime uses, use get_seconds() or current_kernel_time(). 2007-07-25 10:09:20 -07:00
timer.c Pull ia64-clocksource into release branch 2007-07-20 11:26:47 -07:00
tsacct.c Cleanup non-arch xtime uses, use get_seconds() or current_kernel_time(). 2007-07-25 10:09:20 -07:00
uid16.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
user.c mm: Remove slab destructors from kmem_cache_create(). 2007-07-20 10:11:58 +09:00
user_namespace.c fix create_new_namespaces() return value 2007-07-16 09:05:47 -07:00
utsname.c namespace: ensure clone_flags are always stored in an unsigned long 2007-07-16 09:05:48 -07:00
utsname_sysctl.c remove CONFIG_UTS_NS and CONFIG_IPC_NS 2007-07-16 09:05:47 -07:00
wait.c Fix occurrences of "the the " 2007-05-09 08:57:56 +02:00
workqueue.c fix bogus hotplug cpu warning 2007-08-27 10:27:48 -07:00