linux/kernel
Paul Menage 8bab8dded6 cgroups: add cgroup support for enabling controllers at boot time
The effects of cgroup_disable=foo are:

- foo isn't auto-mounted if you mount all cgroups in a single hierarchy
- foo isn't visible as an individually mountable subsystem

As a result there will only ever be one call to foo->create(), at init time;
all processes will stay in this group, and the group will never be mounted on
a visible hierarchy.  Any additional effects (e.g.  not allocating metadata)
are up to the foo subsystem.

This doesn't handle early_init subsystems (their "disabled" bit isn't set be,
but it could easily be extended to do so if any of the early_init systems
wanted it - I think it would just involve some nastier parameter processing
since it would occur before the command-line argument parser had been run.

Hugh said:

  Ballpark figures, I'm trying to get this question out rather than
  processing the exact numbers: CONFIG_CGROUP_MEM_RES_CTLR adds 15% overhead
  to the affected paths, booting with cgroup_disable=memory cuts that back to
  1% overhead (due to slightly bigger struct page).

  I'm no expert on distros, they may have no interest whatever in
  CONFIG_CGROUP_MEM_RES_CTLR=y; and the rest of us can easily build with or
  without it, or apply the cgroup_disable=memory patches.

Unix bench's execl test result on x86_64 was

== just after boot without mounting any cgroup fs.==
mem_cgorup=off : Execl Throughput       43.0     3150.1      732.6
mem_cgroup=on  : Execl Throughput       43.0     2932.6      682.0
==

[lizf@cn.fujitsu.com: fix boot option parsing]
Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Paul Menage <menage@google.com>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Sudhir Kumar <skumar@linux.vnet.ibm.com>
Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-04 14:46:26 -07:00
..
irq genirq: do not leave interupts enabled on free_irq 2008-02-19 10:43:58 +01:00
power Merge branches 'release' and 'doc' into release 2008-03-13 01:59:53 -04:00
time clocksource: revert: use init_timer_deferrable for clocksource_watchdog 2008-03-25 20:13:25 +01:00
.gitignore Update kernel/.gitignore with new auto-generated files 2008-02-09 23:27:01 -08:00
acct.c bsd_acct: using task_struct->tgid is not right in pid-namespaces 2008-03-24 19:22:20 -07:00
audit_tree.c Introduce path_put() 2008-02-14 21:13:33 -08:00
audit.c audit: silence two kerneldoc warnings in kernel/audit.c 2008-03-28 14:45:21 -07:00
audit.h [PATCH] audit: watching subtrees 2007-10-21 02:37:45 -04:00
auditfilter.c Introduce path_put() 2008-02-14 21:13:33 -08:00
auditsc.c [PATCH] Audit: Fix the format type for size_t variables 2008-03-01 07:16:06 -05:00
backtracetest.c x86: add a simple backtrace test module 2008-01-30 13:33:08 +01:00
capability.c Add 64-bit capability support to the kernel 2008-02-05 09:44:20 -08:00
cgroup_debug.c Task Control Groups: simple task cgroup debug info subsystem 2007-10-19 11:53:36 -07:00
cgroup.c cgroups: add cgroup support for enabling controllers at boot time 2008-04-04 14:46:26 -07:00
compat.c hrtimer: don't modify restart_block->fn in restart functions 2008-02-10 10:48:03 +01:00
configs.c
cpu.c cpu: fix section mismatch warnings for enable_nonboot_cpus 2008-02-08 09:22:41 -08:00
cpuset.c cpusets: fix obsolete comment 2008-03-05 17:53:33 -08:00
delayacct.c
dma.c
exec_domain.c
exit.c Fix waitid si_code regression 2008-03-08 11:54:00 -08:00
extable.c module: Don't report discarded init pages as kernel text. 2008-01-29 17:13:18 +11:00
fork.c memcgroup: fix spurious EBUSY on memory cgroup removal 2008-03-28 14:45:21 -07:00
futex_compat.c futex_compat __user annotation 2008-03-30 14:18:41 -07:00
futex.c NULL noise: fs/*, mm/*, kernel/* 2008-03-30 14:18:41 -07:00
hrtimer.c hrtimer: catch expired CLOCK_REALTIME timers early 2008-02-14 22:08:30 +01:00
itimer.c ITIMER_REAL: convert to use struct pid 2008-02-08 09:22:29 -08:00
kallsyms.c remove support for un-needed _extratext section 2008-02-06 10:41:01 -08:00
Kconfig.hz sched: high-res preemption tick 2008-01-25 21:08:29 +01:00
Kconfig.preempt rcu: move PREEMPT_RCU config option back under PREEMPT 2008-03-10 18:01:20 -07:00
kexec.c vmcoreinfo: add "VMCOREINFO_" to all the call for vmcoreinfo_append_str() 2008-02-07 08:42:25 -08:00
kfifo.c
kmod.c Dont touch fs_struct in usermodehelper 2008-02-14 21:13:32 -08:00
kprobes.c kprobes: fix a null pointer bug in register_kretprobe() 2008-03-04 16:35:19 -08:00
ksysfs.c Kobject: convert remaining kobject_unregister() to kobject_put() 2008-01-24 20:40:40 -08:00
kthread.c sched: fix, always create kernel threads with normal priority 2008-01-25 21:08:33 +01:00
latencytop.c sched: latencytop support 2008-01-25 21:08:34 +01:00
lockdep_internals.h
lockdep_proc.c
lockdep.c Subject: lockdep: include all lock classes in all_lock_classes 2008-02-25 23:03:02 +01:00
Makefile avoid overflows in kernel/time.c 2008-02-08 09:22:39 -08:00
marker.c markers: use synchronize_sched() 2008-04-02 15:28:19 -07:00
module.c modules: warn about suspicious return values from module's ->init() hook 2008-03-10 18:01:20 -07:00
mutex-debug.c kernel: remove fastcall in kernel/* 2008-02-08 09:22:31 -08:00
mutex-debug.h
mutex.c kernel: remove fastcall in kernel/* 2008-02-08 09:22:31 -08:00
mutex.h
notifier.c kernel/notifier.c should #include <linux/reboot.h> 2008-02-06 10:41:02 -08:00
ns_cgroup.c cgroups: implement namespace tracking subsystem 2007-10-19 11:53:37 -07:00
nsproxy.c namespaces: move the IPC namespace under IPC_NS option 2008-02-08 09:22:23 -08:00
panic.c ACPI: Taint kernel on ACPI table override (format corrected) 2008-02-06 22:07:51 -05:00
params.c Add new string functions strict_strto* and convert kernel params to use them 2008-02-08 09:22:41 -08:00
pid_namespace.c namespaces: cleanup the code managed with PID_NS option 2008-02-08 09:22:23 -08:00
pid.c kernel: remove fastcall in kernel/* 2008-02-08 09:22:31 -08:00
pm_qos_params.c pm qos infrastructure and interface 2008-02-05 09:44:22 -08:00
posix-cpu-timers.c Use find_task_by_vpid in posix timers 2008-02-08 09:22:41 -08:00
posix-timers.c hrtimer: check relative timeouts for overflow 2008-02-14 22:08:30 +01:00
printk.c Make printk() console semaphore accesses sensible 2008-03-24 19:25:08 -07:00
profile.c Nuke a duplicate include from profile.c 2008-02-08 09:22:34 -08:00
ptrace.c ptrace_check_attach: remove unneeded ->signal != NULL check 2008-02-08 09:22:26 -08:00
rcuclassic.c Preempt-RCU: implementation 2008-01-25 21:08:24 +01:00
rcupdate.c rcupdate: fix comment 2008-02-13 16:21:18 -08:00
rcupreempt_trace.c Preempt-RCU: implementation 2008-01-25 21:08:24 +01:00
rcupreempt.c rcupreempt: remove never-migrates assumption from rcu_process_callbacks() 2008-02-29 20:21:13 +01:00
rcutorture.c cpu-hotplug: replace lock_cpu_hotplug() with get_online_cpus() 2008-01-25 21:08:02 +01:00
relay.c relay: set an spd_release() hook for splice 2008-03-26 12:04:09 +01:00
res_counter.c Memory Resource Controller use strstrip while parsing arguments 2008-03-04 16:35:09 -08:00
resource.c [POWERPC] Add arch-specific walk_memory_remove() for 64-bit powerpc 2008-02-08 19:52:48 +11:00
rtmutex_common.h Don't operate with pid_t in rtmutex tester 2008-02-08 09:22:41 -08:00
rtmutex-debug.c Don't operate with pid_t in rtmutex tester 2008-02-08 09:22:41 -08:00
rtmutex-debug.h
rtmutex-tester.c Driver core: change sysdev classes to use dynamic kobject names 2008-01-24 20:40:40 -08:00
rtmutex.c hrtimer: more hrtimer_init_sleeper() fallout. 2008-02-13 15:45:36 +01:00
rtmutex.h
rwsem.c sched: mark rwsem functions as __sched for wchan/profiling 2007-12-18 15:21:13 +01:00
sched_debug.c sched: improve affine wakeups 2008-03-19 04:27:53 +01:00
sched_fair.c sched: cleanup old and rarely used 'debug' features. 2008-03-21 16:43:47 +01:00
sched_idletask.c sched: high-res preemption tick 2008-01-25 21:08:29 +01:00
sched_rt.c sched: balance RT task resched only on runqueue 2008-03-07 16:43:00 +01:00
sched_stats.h sched: clean up kernel/sched_stat.h 2007-11-28 15:52:56 +01:00
sched.c NOHZ: reevaluate idle sleep length after add_timer_on() 2008-03-26 08:28:55 +01:00
seccomp.c
signal.c freezer vs stopped or traced 2008-03-04 07:59:54 -08:00
softirq.c rcu: add support for dynamic ticks and preempt rcu 2008-02-29 18:46:50 +01:00
softlockup.c softlockup: fix task state setting 2008-02-29 18:46:53 +01:00
spinlock.c spinlock: lockbreak cleanup 2008-01-30 13:31:20 +01:00
srcu.c make srcu_readers_active() static 2008-02-06 10:41:02 -08:00
stacktrace.c
stop_machine.c stopmachine: semaphore to mutex 2008-02-06 10:41:08 -08:00
sys_ni.c timerfd: new timerfd API 2008-02-05 09:44:07 -08:00
sys.c Pidns: make full use of xxx_vnr() calls 2008-02-08 09:22:29 -08:00
sysctl_check.c constify tables in kernel/sysctl_check.c 2008-02-08 09:22:31 -08:00
sysctl.c sched: revert load_balance_monitor() changes 2008-03-04 17:54:06 +01:00
taskstats.c kernel/taskstats.c: fix bogus nlmsg_free() 2007-11-14 18:45:44 -08:00
test_kprobes.c kprobes: kretprobe user entry-handler 2008-02-06 10:41:11 -08:00
time.c avoid overflows in kernel/time.c 2008-02-08 09:22:39 -08:00
timeconst.pl timeconst.pl: correct reversal of USEC_TO_HZ and HZ_TO_USEC 2008-02-12 14:29:26 -08:00
timer.c NOHZ: reevaluate idle sleep length after add_timer_on() 2008-03-26 08:28:55 +01:00
tsacct.c
uid16.c
user_namespace.c namespaces: cleanup the code managed with the USER_NS option 2008-02-08 09:22:23 -08:00
user.c sched: rt-group: make rt groups scheduling configurable 2008-02-13 15:45:40 +01:00
utsname_sysctl.c Isolate the UTS namespace's domainname and hostname back 2007-11-29 09:24:53 -08:00
utsname.c
wait.c kernel: remove fastcall in kernel/* 2008-02-08 09:22:31 -08:00
workqueue.c workqueue: make delayed_work_timer_fn() static 2008-02-08 09:22:37 -08:00