posix: Do not use WNOHANG in waitpid call for Linux posix_spawn

As shown by some buildbot issues on aarch64 and powerpc, calling
clone (VFORK) followed by waitpid (WNOHANG) does not guarantee that
the child is ready to be collected.  This patch changes the waitpid
flags argument back to 0, as it was before the fe05e1cb6d fix.

This change can lead to scenario 4.3 described in that commit,
where the waitpid call can hang indefinitely.  However, this is a
very unlikely and also undefined situation, in which the caller is
trying to terminate a pid before posix_spawn returns and the pid
reuse race is triggered at the same time.  I don't see how to
correctly handle this specific situation within posix_spawn.

Checked on x86_64-linux-gnu, aarch64-linux-gnu and
powerpc64-linux-gnu.

	* sysdeps/unix/sysv/linux/spawni.c (__spawnix): Use 0 instead of
	WNOHANG in waitpid call.
Adhemerval Zanella 2017-10-21 11:33:27 -02:00
parent a2e0a7f12b
commit aa95a2414e
2 changed files with 10 additions and 5 deletions

@@ -1,3 +1,8 @@
+2017-10-23 Adhemerval Zanella <adhemerval.zanella@linaro.org>
+
+	* sysdeps/unix/sysv/linux/spawni.c (__spawnix): Use 0 instead of
+	WNOHANG in waitpid call.
+
 2017-10-23 Siddhesh Poyarekar <siddhesh@sourceware.org>
 
 	* manual/conf.texi (_SC_LEVEL1_DCACHE_LINESIZE,

@@ -374,12 +374,12 @@ __spawnix (pid_t * pid, const char *file,
       ec = args.err;
       if (ec > 0)
         /* There still an unlikely case where the child is cancelled after
-           setting args.err, due to a positive error value. Also due a
+           setting args.err, due to a positive error value. Also there is
            possible pid reuse race (where the kernel allocated the same pid
-           to unrelated process) we need not to undefinitely hang expecting
-           an invalid pid. In both cases an error is returned to the
-           caller. */
-        __waitpid (new_pid, NULL, WNOHANG);
+           to an unrelated process). Unfortunately due synchronization
+           issues where the kernel might not have the process collected
+           the waitpid below can not use WNOHANG. */
+        __waitpid (new_pid, NULL, 0);
     }
   else
     ec = -new_pid;
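
The difference between the two waitpid flag values can be sketched with a
small standalone program.  This is only an illustration under stated
assumptions, not the glibc code: it uses plain fork rather than
clone (CLONE_VFORK), and the helper names spawn_paused_child, poll_child
and kill_and_reap are hypothetical.  It shows that WNOHANG returns 0
without collecting anything when the child has not yet become waitable,
whereas a flags value of 0 blocks until the child can be reaped.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that sleeps in pause () until it
   is killed.  Returns the child pid in the parent, or -1 on error.  */
static pid_t
spawn_paused_child (void)
{
  pid_t pid = fork ();
  if (pid == 0)
    {
      for (;;)
        pause ();
    }
  return pid;
}

/* Non-blocking check.  Returns 0 while the child has not changed
   state, which means nothing was collected.  */
static pid_t
poll_child (pid_t pid)
{
  return waitpid (pid, NULL, WNOHANG);
}

/* Terminate the child and collect it with a blocking wait (flags 0),
   retrying if the call is interrupted by a signal.  Returns the pid
   that was reaped, or -1 on error.  */
static pid_t
kill_and_reap (pid_t pid)
{
  pid_t r;

  kill (pid, SIGKILL);
  do
    r = waitpid (pid, NULL, 0);
  while (r == -1 && errno == EINTR);
  return r;
}
```

A caller would invoke spawn_paused_child, observe that poll_child returns
0 while the child is still running, and then call kill_and_reap, which
returns the child pid once the kernel has made it collectable.  The
blocking form trades the risk of a hang (if the pid no longer refers to
the expected child) for the guarantee that the child is not left behind
as a zombie.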