blockjob: do not allow coroutine double entry or entry-after-completion

When block_job_sleep_ns() is called, the coroutine is scheduled for
future execution.  If we allow the job to be re-entered prior to the
scheduled time, we introduce a race condition in which a coroutine can
be entered recursively, or even entered after the coroutine is deleted.

The job->busy flag is used by blockjobs to indicate that a coroutine is
busy executing.  The function block_job_enter() obeys the busy flag,
and will not enter a coroutine if the flag is set.  If we sleep a job,
we need to leave the busy flag set, so that subsequent calls to
block_job_enter() are prevented from entering it.
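
The guard described above can be sketched with a toy model.  This is
only an illustration under stated assumptions: ToyJob, toy_job_enter(),
toy_job_sleep() and toy_demo() are hypothetical names, not QEMU APIs,
and the real block_job_enter() differs in detail.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a block job; this is not QEMU's BlockJob. */
typedef struct ToyJob {
    bool busy;      /* true while the coroutine runs or has a wakeup scheduled */
    int entries;    /* number of times the coroutine was actually entered */
} ToyJob;

/* Models the rule from the commit message: never enter a busy job. */
bool toy_job_enter(ToyJob *job)
{
    if (job->busy) {
        return false;   /* refused: the coroutine is running or sleeping */
    }
    job->busy = true;
    job->entries++;
    return true;
}

/* Models a sleep that leaves the busy flag set. */
void toy_job_sleep(ToyJob *job)
{
    /* Intentionally no "job->busy = false" here. */
    (void)job;
}

/* Returns the number of successful entries. */
int toy_demo(void)
{
    ToyJob job = { .busy = false, .entries = 0 };

    bool first = toy_job_enter(&job);   /* initial entry succeeds */
    toy_job_sleep(&job);                /* job sleeps, still marked busy */
    bool second = toy_job_enter(&job);  /* early re-entry is refused */

    assert(first && !second);
    return job.entries;
}
```

In this model, clearing the flag inside toy_job_sleep() would let the
second toy_job_enter() call succeed, which corresponds to the double
entry this commit is meant to prevent.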

This changes the prior behavior of block_job_cancel() being able to
immediately wake up and cancel a job; in practice, this should not be an
issue, as the coroutine sleep times are generally very small, and the
cancel will occur the next time the coroutine wakes up.

This fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1508708

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Jeff Cody 2017-11-17 22:26:16 -05:00
parent 7c3d1917fd
commit 4afeffc857
2 changed files with 7 additions and 3 deletions

@@ -797,11 +797,14 @@ void block_job_sleep_ns(BlockJob *job, QEMUClockType type, int64_t ns)
         return;
     }
-    job->busy = false;
+    /* We need to leave job->busy set here, because when we have
+     * put a coroutine to 'sleep', we have scheduled it to run in
+     * the future.  We cannot enter that same coroutine again before
+     * it wakes and runs, otherwise we risk double-entry or entry after
+     * completion. */
     if (!block_job_should_pause(job)) {
         co_aio_sleep_ns(blk_get_aio_context(job->blk), type, ns);
     }
-    job->busy = true;
     block_job_pause_point(job);
 }

@@ -143,7 +143,8 @@ void *block_job_create(const char *job_id, const BlockJobDriver *driver,
  * @ns: How many nanoseconds to stop for.
  *
  * Put the job to sleep (assuming that it wasn't canceled) for @ns
- * nanoseconds.  Canceling the job will interrupt the wait immediately.
+ * nanoseconds.  Canceling the job will not interrupt the wait, so the
+ * cancel will not process until the coroutine wakes up.
  */
 void block_job_sleep_ns(BlockJob *job, QEMUClockType type, int64_t ns);
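
The updated comment for block_job_sleep_ns() implies that a cancel
request made during a sleep takes effect only at the scheduled wakeup.
A minimal sketch of that timing follows.  ToySleeper and the toy_*
functions are hypothetical and only model the documented behavior; they
are not part of QEMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sleeping job; times are nanoseconds on a single clock. */
typedef struct ToySleeper {
    int64_t wake_ns;        /* when the scheduled sleep ends */
    bool cancel_requested;  /* set by a cancel request, checked on wakeup */
} ToySleeper;

/* A cancel request only records the intent; it does not wake the job. */
void toy_cancel(ToySleeper *s)
{
    s->cancel_requested = true;
}

/* Time at which a cancel requested at request_ns is acted on. */
int64_t toy_cancel_effective_ns(ToySleeper *s, int64_t request_ns)
{
    toy_cancel(s);
    /* Acted on at wakeup, or right away if the job is already awake. */
    return request_ns < s->wake_ns ? s->wake_ns : request_ns;
}

/* Returns the extra latency the cancel sees in this scenario. */
int64_t toy_cancel_demo(void)
{
    ToySleeper s = { .wake_ns = 100000, .cancel_requested = false };
    int64_t effective = toy_cancel_effective_ns(&s, 40000);

    assert(s.cancel_requested);
    return effective - 40000;
}
```

In this model the added delay is never more than the remaining sleep
time, which is why short sleep intervals keep the behavior change
tolerable.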