watchdog/diag288: avoid race condition on expired watchdog

When configured to inject an NMI, watchdog_perform_action() may cause
the BQL to be temporarily relinquished (inject_nmi() → ... →
s390_nmi() → s390_cpu_restart() → run_on_cpu()). When the guest issues
diag 288 again in response to the NMI, the diag 288 operation will
race against wdt_diag288_reset(). Depending on scheduler behaviour,
wdt_diag288_reset() may be run after the guest issued a diag 288
Init. As a result, we will cancel the timer the guest just set up. The
effect observed by the guest is that a second expiry does not trigger
the watchdog action and diag 288 Change operations fail.

Fix this by resetting the timer _before_ invoking the action.

Signed-off-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Acked-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Sascha Silbe 2016-01-29 15:51:45 +01:00 committed by Cornelia Huck
parent 8777f6abdb
commit fe345a3d5d
1 changed file with 8 additions and 4 deletions

@@ -51,15 +51,19 @@ static void diag288_reset(void *opaque)
 static void diag288_timer_expired(void *dev)
 {
     qemu_log_mask(CPU_LOG_RESET, "Watchdog timer expired.\n");
-    watchdog_perform_action();
-    /* Reset the watchdog only if the guest was notified about expiry. */
+    /* Reset the watchdog only if the guest gets notified about
+     * expiry. watchdog_perform_action() may temporarily relinquish
+     * the BQL; reset before triggering the action to avoid races with
+     * diag288 instructions. */
     switch (get_watchdog_action()) {
     case WDT_DEBUG:
     case WDT_NONE:
     case WDT_PAUSE:
-        return;
+        break;
+    default:
+        wdt_diag288_reset(dev);
     }
-    wdt_diag288_reset(dev);
+    watchdog_perform_action();
 }
 
 static int wdt_diag288_handle_timer(DIAG288State *diag288,
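
The ordering problem described in the commit message can be illustrated with a small, deterministic model. This is only a sketch and not QEMU code: `struct toy_wdt` and all `toy_*` and `expire_*` functions are hypothetical names invented for illustration. It assumes the worst case, in which the guest re-arms the watchdog (diag 288 Init) before the action call returns.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the watchdog device state (not DIAG288State). */
struct toy_wdt {
    bool armed; /* true while a timer is pending */
};

/* Models wdt_diag288_reset(): cancel any pending timer. */
static void toy_reset(struct toy_wdt *w)
{
    w->armed = false;
}

/* Models the guest issuing diag 288 Init from its NMI handler. */
static void toy_guest_init(struct toy_wdt *w)
{
    w->armed = true;
}

/*
 * Models watchdog_perform_action() with NMI injection. Assumption: the
 * BQL is dropped and the guest re-arms the timer before we return.
 */
static void toy_perform_action(struct toy_wdt *w)
{
    toy_guest_init(w);
}

/* Old ordering: action first, reset afterwards. */
static bool expire_action_then_reset(void)
{
    struct toy_wdt w = { .armed = true };

    toy_perform_action(&w); /* guest sets up a new timer */
    toy_reset(&w);          /* ...which is cancelled here */
    return w.armed;
}

/* New ordering: reset first, action afterwards. */
static bool expire_reset_then_action(void)
{
    struct toy_wdt w = { .armed = true };

    toy_reset(&w);          /* cancel the expired timer */
    toy_perform_action(&w); /* guest's new timer survives */
    return w.armed;
}
```

In this model, `expire_action_then_reset()` returns false because the guest's freshly armed timer is cancelled, while `expire_reset_then_action()` returns true. The real race depends on scheduling, so the old ordering fails only some of the time; the model fixes the interleaving to make the difference visible.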