d80e731eca
This patch is intentionally incomplete to simplify the review. It ignores ep_unregister_pollwait() which plays with the same wqh. See the next change. epoll assumes that the EPOLL_CTL_ADD'ed file controls everything f_op->poll() needs. In particular it assumes that the wait queue can't go away until eventpoll_release(). This is not true in case of signalfd, the task which does EPOLL_CTL_ADD uses its ->sighand which is not connected to the file. This patch adds the special event, POLLFREE, currently only for epoll. It expects that init_poll_funcptr()'ed hook should do the necessary cleanup. Perhaps it should be defined as EPOLLFREE in eventpoll. __cleanup_sighand() is changed to do wake_up_poll(POLLFREE) if ->signalfd_wqh is not empty, we add the new signalfd_cleanup() helper. ep_poll_callback(POLLFREE) simply does list_del_init(task_list). This make this poll entry inconsistent, but we don't care. If you share epoll fd which contains our sigfd with another process you should blame yourself. signalfd is "really special". I simply do not know how we can define the "right" semantics if it used with epoll. The main problem is, epoll calls signalfd_poll() once to establish the connection with the wait queue, after that signalfd_poll(NULL) returns the different/inconsistent results depending on who does EPOLL_CTL_MOD/signalfd_read/etc. IOW: apart from sigmask, signalfd has nothing to do with the file, it works with the current thread. In short: this patch is the hack which tries to fix the symptoms. It also assumes that nobody can take tasklist_lock under epoll locks, this seems to be true. Note: - we do not have wake_up_all_poll() but wake_up_poll() is fine, poll/epoll doesn't use WQ_FLAG_EXCLUSIVE. - signalfd_cleanup() uses POLLHUP along with POLLFREE, we need a couple of simple changes in eventpoll.c to make sure it can't be "lost". Reported-by: Maxime Bizon <mbizon@freebox.fr> Cc: <stable@kernel.org> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
---|---|---|
.. | ||
debug | ||
events | ||
gcov | ||
irq | ||
power | ||
sched | ||
time | ||
trace | ||
.gitignore | ||
acct.c | ||
async.c | ||
audit_tree.c | ||
audit_watch.c | ||
audit.c | ||
audit.h | ||
auditfilter.c | ||
auditsc.c | ||
backtracetest.c | ||
bounds.c | ||
capability.c | ||
cgroup_freezer.c | ||
cgroup.c | ||
compat.c | ||
configs.c | ||
cpu_pm.c | ||
cpu.c | ||
cpuset.c | ||
crash_dump.c | ||
cred.c | ||
delayacct.c | ||
dma.c | ||
elfcore.c | ||
exec_domain.c | ||
exit.c | ||
extable.c | ||
fork.c | ||
freezer.c | ||
futex_compat.c | ||
futex.c | ||
groups.c | ||
hrtimer.c | ||
hung_task.c | ||
irq_work.c | ||
itimer.c | ||
jump_label.c | ||
kallsyms.c | ||
Kconfig.freezer | ||
Kconfig.hz | ||
Kconfig.locks | ||
Kconfig.preempt | ||
kexec.c | ||
kfifo.c | ||
kmod.c | ||
kprobes.c | ||
ksysfs.c | ||
kthread.c | ||
latencytop.c | ||
lockdep_internals.h | ||
lockdep_proc.c | ||
lockdep_states.h | ||
lockdep.c | ||
Makefile | ||
module.c | ||
mutex-debug.c | ||
mutex-debug.h | ||
mutex.c | ||
mutex.h | ||
notifier.c | ||
nsproxy.c | ||
padata.c | ||
panic.c | ||
params.c | ||
pid_namespace.c | ||
pid.c | ||
posix-cpu-timers.c | ||
posix-timers.c | ||
printk.c | ||
profile.c | ||
ptrace.c | ||
range.c | ||
rcu.h | ||
rcupdate.c | ||
rcutiny_plugin.h | ||
rcutiny.c | ||
rcutorture.c | ||
rcutree_plugin.h | ||
rcutree_trace.c | ||
rcutree.c | ||
rcutree.h | ||
relay.c | ||
res_counter.c | ||
resource.c | ||
rtmutex_common.h | ||
rtmutex-debug.c | ||
rtmutex-debug.h | ||
rtmutex-tester.c | ||
rtmutex.c | ||
rtmutex.h | ||
rwsem.c | ||
seccomp.c | ||
semaphore.c | ||
signal.c | ||
smp.c | ||
softirq.c | ||
spinlock.c | ||
srcu.c | ||
stacktrace.c | ||
stop_machine.c | ||
sys_ni.c | ||
sys.c | ||
sysctl_binary.c | ||
sysctl_check.c | ||
sysctl.c | ||
taskstats.c | ||
test_kprobes.c | ||
time.c | ||
timeconst.pl | ||
timer.c | ||
tracepoint.c | ||
tsacct.c | ||
uid16.c | ||
up.c | ||
user_namespace.c | ||
user-return-notifier.c | ||
user.c | ||
utsname_sysctl.c | ||
utsname.c | ||
wait.c | ||
watchdog.c | ||
workqueue_sched.h | ||
workqueue.c |