linux/kernel
Oleg Nesterov 2f4e6e2a81 wait_task_zombie: fix 2/3 races vs forget_original_parent()
Two threads, T1 and T2.  T2 ptraces P, and P is not a child of ptracer's
thread group.  P exits and goes to TASK_ZOMBIE.

T1 does wait_task_zombie(P):

	P->exit_state = TASK_DEAD;
	...
	read_unlock(&tasklist_lock);

					T2 does exit(), takes tasklist,
					forget_original_parent() does
					__ptrace_unlink(P) but doesn't
					call do_notify_parent(P) because
					p->exit_state == EXIT_DEAD.

Now, P is not visible to our process: __ptrace_unlink() removed it from
->children. We should send notification to P->parent and release P if and
only if SIGCHLD is ignored.

And we have 3 bugs:

1. P->parent does do_wait() and gets -ECHILD (P is on ->parent->children,
   but its state is TASK_DEAD).

2. // wait_task_zombie() continues

	if (put_user(...)) {
		// TODO: is this safe?
		p->exit_state = EXIT_ZOMBIE;
		return;
	}

   we return without notification/release, task_struct leaked.

   Solution: ignore -EFAULT and proceed. It is an application's bug if
   we can't fill infop/stat_addr (in case of VM_FAULT_OOM we have much
   more problems).

3. // wait_task_zombie() continues

	if (p->real_parent != p->parent) {
		// Not taken, it was untraced'ed
		...
	}

	release_task(p);

   we released the task which we shouldn't.

   Solution: check ->real_parent != ->parent before, under tasklist_lock,
   but use ptrace_unlink() instead of __ptrace_unlink() to check ->ptrace.

This patch hopefully solves 2 and 3, the 1st bug will be fixed later, we need
some cleanups in forget_original_parent/reparent_thread.

However, the first race is very unlikely and not critical, so I hope it makes
sense to fix 1 and 2 for now.

4. Small cleanup: don't "restore" EXIT_ZOMBIE unless we know we are not going
   to realease the child.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:42:51 -07:00
..
irq Fix CONFIG_DEBUG_SHIRQ trigger on free_irq() 2007-10-17 08:42:49 -07:00
power hibernation doesn't even build on frv - tons of helpers are missing 2007-09-26 09:22:04 -07:00
time time: simplify smp_call_function_single() call sequence 2007-10-17 08:42:48 -07:00
.gitignore
Kconfig.hz
Kconfig.preempt [PATCH] sched: arch preempt notifier mechanism 2007-07-26 13:40:43 +02:00
Makefile user namespace: add the framework 2007-07-16 09:05:47 -07:00
acct.c Cleanup non-arch xtime uses, use get_seconds() or current_kernel_time(). 2007-07-25 10:09:20 -07:00
audit.c [NET]: make netlink user -> kernel interface synchronious 2007-10-10 21:15:29 -07:00
audit.h Audit: add TTY input auditing 2007-07-16 09:05:47 -07:00
auditfilter.c [PATCH] allow audit filtering on bit & operations 2007-07-22 09:57:02 -04:00
auditsc.c Clean up duplicate includes in kernel/ 2007-10-17 08:42:48 -07:00
capability.c [PATCH] pid: replace do/while_each_task_pid with do/while_each_pid_task 2007-02-12 09:48:32 -08:00
compat.c signal/timer/event: timerfd compat code 2007-05-11 08:29:36 -07:00
configs.c use simple_read_from_buffer in kernel/ 2007-05-09 12:30:49 -07:00
cpu.c PM: Fix dependencies of CONFIG_SUSPEND and CONFIG_HIBERNATION 2007-08-31 01:42:22 -07:00
cpuset.c oom: compare cpuset mems_allowed instead of exclusive ancestors 2007-10-17 08:42:46 -07:00
delayacct.c sched: clean up schedstats, cnt -> count 2007-10-15 17:00:12 +02:00
die_notifier.c move die notifier handling to common code 2007-05-08 11:15:04 -07:00
dma.c
exec_domain.c
exit.c wait_task_zombie: fix 2/3 races vs forget_original_parent() 2007-10-17 08:42:51 -07:00
extable.c
fork.c Slab API: remove useless ctor parameter and reorder parameters 2007-10-17 08:42:45 -07:00
futex.c robust futex thread exit race 2007-10-01 07:52:23 -07:00
futex_compat.c robust futex thread exit race 2007-10-01 07:52:23 -07:00
hrtimer.c [KTIME]: Introduce ktime_sub_ns and ktime_sub_us 2007-10-10 16:48:12 -07:00
itimer.c The scheduled -EINVAL for invalid timevals in setitimer 2007-05-08 11:15:13 -07:00
kallsyms.c kallsyms: make KSYM_NAME_LEN include space for trailing '\0' 2007-07-17 10:23:03 -07:00
kexec.c Clean up duplicate includes in kernel/ 2007-10-17 08:42:48 -07:00
kfifo.c is_power_of_2: kernel/kfifo.c 2007-07-16 09:05:50 -07:00
kmod.c Restore call_usermodehelper_pipe() behaviour 2007-09-11 17:21:20 -07:00
kprobes.c kprobes: support kretprobe blacklist 2007-10-16 09:43:10 -07:00
ksysfs.c sched: group scheduling, sysfs tunables 2007-10-15 17:00:14 +02:00
kthread.c kthread: silence bogus section mismatch warning 2007-07-31 15:39:42 -07:00
latency.c
lockdep.c lockdep: syscall exit check 2007-10-11 22:11:12 +02:00
lockdep_internals.h
lockdep_proc.c lockdep: Avoid /proc/lockdep & lock_stat infinite output 2007-10-11 22:11:11 +02:00
module.c Add /sys/module/name/notes 2007-10-17 08:42:50 -07:00
mutex-debug.c [PATCH] remove many unneeded #includes of sched.h 2007-02-14 08:09:54 -08:00
mutex-debug.h
mutex.c lockdep: fixup mutex annotations 2007-10-11 22:11:12 +02:00
mutex.h
nsproxy.c [NET]: Add network namespace clone & unshare support. 2007-10-10 16:52:46 -07:00
panic.c Report that kernel is tainted if there was an OOPS 2007-07-17 10:23:02 -07:00
params.c modules: better error messages when modules fail to load due to a sysfs problem. 2007-07-30 14:25:23 -07:00
pid.c namespace: ensure clone_flags are always stored in an unsigned long 2007-07-16 09:05:48 -07:00
posix-cpu-timers.c sched: make posix-cpu-timers use CFS's accounting information 2007-07-09 18:51:58 +02:00
posix-timers.c SLAB_PANIC more (proc, posix-timers, shmem) 2007-10-17 08:42:47 -07:00
printk.c printk: add interfaces for external access to the log buffer 2007-10-17 08:42:50 -07:00
profile.c Memoryless nodes: Allow profiling data to fall back to other nodes 2007-10-16 09:42:58 -07:00
ptrace.c m32r: convert to generic sys_ptrace 2007-10-16 09:43:04 -07:00
rcupdate.c Clean up duplicate includes in kernel/ 2007-10-17 08:42:48 -07:00
rcutorture.c Clean up duplicate includes in kernel/ 2007-10-17 08:42:48 -07:00
relay.c Fix a use after free bug in kernel->userspace relay file support 2007-07-31 15:39:42 -07:00
resource.c memory unplug: memory hotplug cleanup 2007-10-16 09:43:01 -07:00
rtmutex-debug.c kernel/rtmutex-debug.c: cleanups 2007-10-17 08:42:50 -07:00
rtmutex-debug.h
rtmutex-tester.c Freezer: make kernel threads nonfreezable by default 2007-07-17 10:23:02 -07:00
rtmutex.c FUTEX: Tidy up the code 2007-07-16 09:05:49 -07:00
rtmutex.h
rtmutex_common.h FUTEX: Tidy up the code 2007-07-16 09:05:49 -07:00
rwsem.c lockstat: hook into spinlock_t, rwlock_t, rwsem and mutex 2007-07-19 10:04:49 -07:00
sched.c cpuset: remove sched domain hooks from cpusets 2007-10-16 09:43:09 -07:00
sched_debug.c Make scheduler debug file operations const 2007-10-15 17:00:19 +02:00
sched_fair.c sched: reintroduce cache-hot affinity 2007-10-15 17:00:18 +02:00
sched_idletask.c sched: mark scheduling classes as const 2007-10-15 17:00:12 +02:00
sched_rt.c sched: tidy up SCHED_RR 2007-10-15 17:00:13 +02:00
sched_stats.h sched: clean up schedstats, cnt -> count 2007-10-15 17:00:12 +02:00
seccomp.c make seccomp zerocost in schedule 2007-07-16 09:05:50 -07:00
signal.c do_sigaction: remove now unneeded recalc_sigpending() 2007-10-17 08:42:51 -07:00
softirq.c [KERNEL]: Unexport raise_softirq_irqoff 2007-10-10 16:49:18 -07:00
softlockup.c softlockup: add a /proc tuning parameter 2007-10-17 08:42:47 -07:00
spinlock.c lockstat: hook into spinlock_t, rwlock_t, rwsem and mutex 2007-07-19 10:04:49 -07:00
srcu.c
stacktrace.c
stop_machine.c Fix stop_machine_run problem with naughty real time process 2007-07-16 09:05:41 -07:00
sys.c Fix SMP poweroff hangs 2007-10-01 07:52:23 -07:00
sys_ni.c diskquota: 32bit quota tools on 64bit architectures 2007-07-16 09:05:48 -07:00
sysctl.c softlockup: add a /proc tuning parameter 2007-10-17 08:42:47 -07:00
taskstats.c Clean up duplicate includes in kernel/ 2007-10-17 08:42:48 -07:00
time.c Clean up duplicate includes in kernel/ 2007-10-17 08:42:48 -07:00
timer.c Pull ia64-clocksource into release branch 2007-07-20 11:26:47 -07:00
tsacct.c Cleanup non-arch xtime uses, use get_seconds() or current_kernel_time(). 2007-07-25 10:09:20 -07:00
uid16.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
user.c sched: generate uevents for user creation/destruction 2007-10-15 17:00:18 +02:00
user_namespace.c Fix user namespace exiting OOPs 2007-09-19 11:24:18 -07:00
utsname.c Fix UTS corruption during clone(CLONE_NEWUTS) 2007-09-19 11:24:17 -07:00
utsname_sysctl.c remove CONFIG_UTS_NS and CONFIG_IPC_NS 2007-07-16 09:05:47 -07:00
wait.c Fix occurrences of "the the " 2007-05-09 08:57:56 +02:00
workqueue.c fix bogus hotplug cpu warning 2007-08-27 10:27:48 -07:00