spapr: Fix handling of unplugged devices during CAS and migration

We already detect if a device is being hot plugged before CAS to trigger
a CAS reboot and during migration to migrate the state of the associated
DRC. But hot unplugging a device is also an asynchronous operation that
requires the guest to take action. This means that if the guest is migrated
after the hot unplug event was sent but before it could release the device
with RTAS, the destination QEMU doesn't know about the pending unplug
operation and doesn't actually remove the device when the guest finally
releases it.

Similarly, if the unplug request is fired before CAS, the guest isn't
notified of the change, just like with hotplug. It ends up booting with
the device still present in the DT and configures it, as if it had
never been removed. Even weirder, since the event is still queued, it
will eventually be processed when some other unrelated event is posted
to the guest.

Enhance spapr_drc_transient() to also return true if an unplug request is
pending. This fixes the issue at CAS by requesting a CAS reboot and
causes the DRC state to be migrated. Some extra care is still needed to
inform the destination that an unplug request is pending: migrate the
unplug_requested field of the DRC in an optional subsection. This might
break backwards migration, but this is still better than ending up with
an inconsistent guest.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <158169248798.3465937.1108351365840514270.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Greg Kurz 2020-02-14 16:01:28 +01:00 committed by David Gibson
parent 4b63db1289
commit ab8584349c


@@ -456,6 +456,22 @@ void spapr_drc_reset(SpaprDrc *drc)
     }
 }

+static bool spapr_drc_unplug_requested_needed(void *opaque)
+{
+    return spapr_drc_unplug_requested(opaque);
+}
+
+static const VMStateDescription vmstate_spapr_drc_unplug_requested = {
+    .name = "spapr_drc/unplug_requested",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .needed = spapr_drc_unplug_requested_needed,
+    .fields = (VMStateField []) {
+        VMSTATE_BOOL(unplug_requested, SpaprDrc),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
 bool spapr_drc_transient(SpaprDrc *drc)
 {
     SpaprDrcClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
@@ -471,9 +487,10 @@ bool spapr_drc_transient(SpaprDrc *drc)
     /*
      * We need to reset the DRC at CAS or to migrate the DRC state if it's
      * not equal to the expected long-term state, which is the same as the
-     * coldplugged initial state.
+     * coldplugged initial state, or if an unplug request is pending.
      */
-    return (drc->state != drck->ready_state);
+    return drc->state != drck->ready_state ||
+        spapr_drc_unplug_requested(drc);
 }

 static bool spapr_drc_needed(void *opaque)
@@ -489,6 +506,10 @@ static const VMStateDescription vmstate_spapr_drc = {
     .fields = (VMStateField []) {
         VMSTATE_UINT32(state, SpaprDrc),
         VMSTATE_END_OF_LIST()
     },
+    .subsections = (const VMStateDescription * []) {
+        &vmstate_spapr_drc_unplug_requested,
+        NULL
+    }
 };
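
The effect of the change on the "transient" predicate can be illustrated with a small standalone model. This is only a sketch and not QEMU code: `ModelDrc` and the `model_*` functions are hypothetical names, `ready_state` is flattened into the struct (in QEMU it lives in `SpaprDrcClass`), and the model assumes that spapr_drc_unplug_requested() simply reports the `unplug_requested` flag.

```c
#include <stdbool.h>

/* Hypothetical stand-in for SpaprDrc; not the real QEMU definition. */
typedef struct {
    unsigned state;         /* current DRC state */
    unsigned ready_state;   /* expected long-term (coldplugged) state */
    bool unplug_requested;  /* unplug event sent, device not yet released */
} ModelDrc;

/* Before the patch: only a state mismatch makes the DRC transient. */
static bool model_transient_old(const ModelDrc *drc)
{
    return drc->state != drc->ready_state;
}

/* After the patch: a pending unplug request also makes it transient. */
static bool model_transient_new(const ModelDrc *drc)
{
    return drc->state != drc->ready_state || drc->unplug_requested;
}

/*
 * Model of the .needed hook of the new subsection: the extra field is
 * only put on the wire while an unplug request is pending.
 */
static bool model_subsection_needed(const ModelDrc *drc)
{
    return drc->unplug_requested;
}
```

With a DRC that is in its ready state but has `unplug_requested` set, the old predicate reports "not transient" (so no CAS reboot and no DRC state in the migration stream), whereas the new one reports "transient" and the subsection is marked as needed.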