linux/kernel/sched
Ben Segall 54d27365ca sched/fair: Prevent throttling in early pick_next_task_fair()
The optimized task selection logic optimistically selects a new task
to run without first doing a full put_prev_task(). This is so that we
can avoid a put/set on the common ancestors of the old and new task.

Similarly, we should only call check_cfs_rq_runtime() to throttle
eligible groups if they're part of the common ancestry, otherwise it
is possible to end up with no eligible task in the simple task
selection.

Imagine:
		/root
	/prev		/next
	/A		/B

If our optimistic selection ends up throttling /next, we goto simple
and our put_prev_task() ends up throttling /prev, after which we're
going to bug out in set_next_entity() because there aren't any tasks
left.

Avoid this scenario by only throttling common ancestors.

Reported-by: Mohammed Naser <mnaser@vexxhost.com>
Reported-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Ben Segall <bsegall@google.com>
[ munged Changelog ]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Roman Gushchin <klamm@yandex-team.ru>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: pjt@google.com
Fixes: 678d5718d8 ("sched/fair: Optimize cgroup pick_next_task_fair()")
Link: http://lkml.kernel.org/r/xm26wq1oswoq.fsf@sword-of-the-dawn.mtv.corp.google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-07 15:57:44 +02:00
..
Makefile sched: Move the loadavg code to a more obvious location 2015-05-08 12:04:12 +02:00
auto_group.c sched, timer: Convert usages of ACCESS_ONCE() in the scheduler to READ_ONCE()/WRITE_ONCE() 2015-05-08 12:11:32 +02:00
auto_group.h sched, timer: Convert usages of ACCESS_ONCE() in the scheduler to READ_ONCE()/WRITE_ONCE() 2015-05-08 12:11:32 +02:00
clock.c kernel/sched/clock.c: add another clock for use with the soft lockup watchdog 2015-02-12 18:54:13 -08:00
completion.c sched/completion: Serialize completion_done() with complete() 2015-02-18 14:27:40 +01:00
core.c preempt: Use preempt_schedule_context() as the official tracing preemption point 2015-06-07 15:57:42 +02:00
cpuacct.c cgroup: rename cgroup_subsys->base_cftypes to ->legacy_cftypes 2014-07-15 11:05:09 -04:00
cpuacct.h
cpudeadline.c sched/deadline: Remove cpu_active_mask from cpudl_find() 2015-02-04 07:52:29 +01:00
cpudeadline.h sched/deadline: Modify cpudl::free_cpus to reflect rd->online 2015-01-30 19:39:16 +01:00
cpupri.c Merge commit '3cf2f34' into sched/core, to fix build error 2014-06-12 13:46:37 +02:00
cpupri.h sched/cpupri: Remove unnecessary definitions in cpupri.h 2014-11-16 10:58:59 +01:00
cputime.c sched, timer: Convert usages of ACCESS_ONCE() in the scheduler to READ_ONCE()/WRITE_ONCE() 2015-05-08 12:11:32 +02:00
deadline.c sched, timer: Convert usages of ACCESS_ONCE() in the scheduler to READ_ONCE()/WRITE_ONCE() 2015-05-08 12:11:32 +02:00
debug.c sched: Track group sched_entity usage contributions 2015-03-27 09:35:58 +01:00
fair.c sched/fair: Prevent throttling in early pick_next_task_fair() 2015-06-07 15:57:44 +02:00
features.h sched/rt: Use IPI to trigger RT task push migration instead of pulling 2015-03-23 10:55:22 +01:00
idle.c cpuidle: Run tick_broadcast_exit() with disabled interrupts 2015-04-29 15:19:21 +02:00
idle_task.c sched: Provide update_curr callbacks for stop/idle scheduling classes 2014-11-23 14:14:40 -08:00
loadavg.c sched: Move the loadavg code to a more obvious location 2015-05-08 12:04:12 +02:00
rt.c sched, timer: Convert usages of ACCESS_ONCE() in the scheduler to READ_ONCE()/WRITE_ONCE() 2015-05-08 12:11:32 +02:00
sched.h sched, timer: Convert usages of ACCESS_ONCE() in the scheduler to READ_ONCE()/WRITE_ONCE() 2015-05-08 12:11:32 +02:00
stats.c sched: use %*pb[l] to print bitmaps including cpumasks and nodemasks 2015-02-13 21:21:37 -08:00
stats.h sched, timer: Use the atomic task_cputime in thread_group_cputimer 2015-05-08 12:17:46 +02:00
stop_task.c sched: Provide update_curr callbacks for stop/idle scheduling classes 2014-11-23 14:14:40 -08:00
wait.c sched, timer: Convert usages of ACCESS_ONCE() in the scheduler to READ_ONCE()/WRITE_ONCE() 2015-05-08 12:11:32 +02:00