Commit Graph

72 Commits

Author SHA1 Message Date
David Gibson 4445b1d27e spapr: Remove sPAPRConfigureConnectorState sub-structure
Most of the time, the state of a DRC object is contained in the single
'state' variable.  However, during the transition from UNISOLATE to
CONFIGURED state requires multiple calls to the ibm,configure-connector
RTAS call to retrieve the device tree for the attached device.  We need
some extra state to keep track of where we're up to in delivering the
device tree information to the guest.

Currently that extra state is in a sPAPRConfigureConnectorState
substructure which is only allocated when we're in the middle of the
configure connector process.  That sounds like a good idea, but the extra
state is only two integers - on many platforms that will take up the same
room as the (maybe NULL) ccs pointer even before malloc() overhead.  Plus
it's another object whose lifetime we need to manage.  In short, it's not
worth it.

So, fold the sPAPRConfigureConnectorState substructure directly into the
DRC object.

Previously the structure was allocated lazily when the configure-connector
call discovers it's not there.  Now, we need to initialize the subfields
pre-emptively, as soon as we enter UNISOLATE state.

Although it's not strictly necessary (the field values should only ever
be consulted when in UNISOLATE state), we try to keep them at -1 when in
other states, as a debugging aid.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Daniel Barboza <danielhb@linux.vnet.ibm.com>
Tested-by: Daniel Barboza <danielhb@linux.vnet.ibm.com>
2017-07-17 15:07:05 +10:00
David Gibson 9d4c0f4f0a spapr: Consolidate DRC state variables
Each DRC has three fields describing its state: isolation_state,
allocation_state and configured.  At first this seems like a reasonable
representation, since its based directly on the PAPR defined
isolation-state and allocation-state indicators.  However:
  * Only a few combinations of the two fields' values are permitted
  * allocation_state isn't used at all for physical DRCs
  * The indicators are write only so they don't really have a well
    defined current value independent of each other

This replaces these variables with a single state variable, whose names
and numbers are based on the diagram in LoPAPR section 13.4.  Along with
this we add code to check the current state on various operations and make
sure the requested transition is permitted.

Strictly speaking, this makes guest visible changes to behaviour (since we
probably allowed some transitions we shouldn't have before).  However, a
hypothetical guest broken by that wasn't PAPR compliant, and probably
wouldn't have worked under PowerVM.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Daniel Barboza <danielhb@linux.vnet.ibm.com>
Tested-by: Daniel Barboza <danielhb@linux.vnet.ibm.com>
2017-07-17 15:07:05 +10:00
David Gibson f1c52354e5 spapr: Cleanups relating to DRC awaiting_release field
'awaiting_release' indicates that the host has requested an unplug of the
device attached to the DRC, but the guest has not (yet) put the device
into a state where it is safe to complete removal.

1. Rename it to 'unplug_requested' which to me at least is clearer

2. Remove the ->release_pending() method used to check this from outside
spapr_drc.c.  The method only plausibly has one implementation, so use
a plain function (spapr_drc_unplug_requested()) instead.

3. Remove it from the migration stream.  Attempting to migrate mid-unplug
is broken not just for spapr - in general management has no good way to
determine if the device should be present on the destination or not.  So,
until that's fixed, there's no point adding extra things to the stream.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Daniel Barboza <danielhb@linux.vnet.ibm.com>
2017-07-17 15:07:05 +10:00
David Gibson a8dc47fd82 spapr: Refactor spapr_drc_detach()
This function has two unused parameters - remove them.

It also sets awaiting_release on all paths, except one.  On that path
setting it is harmless, since it will be immediately cleared by
spapr_drc_release().  So factor it out of the if statements.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Daniel Barboza <danielhb@linux.vnet.ibm.com>
2017-07-17 15:07:05 +10:00
David Gibson ba50822ff8 spapr: Abort on delete failure in spapr_drc_release()
We currently ignore errors from the object_property_del() in
spapr_drc_release().  But the only way that could fail is if the property
doesn't exist, in which case it's a bug that we're in spapr_drc_release()
at all.  So change from ignoring to abort()ing on errors.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-07-17 15:07:05 +10:00
David Gibson 82a93a1d30 spapr: Remove 'awaiting_allocation' DRC flag
The awaiting_allocation flag in the DRC was introduced by aab9913
"spapr_drc: Prevent detach racing against attach for CPU DR", allegedly to
prevent a guest crash on racing attach and detach.  Except.. information
from the BZ actually suggests a qemu crash, not a guest crash.  And there
shouldn't be a problem here anyway: if the guest has already moved the DRC
away from UNUSABLE state, the detach would already be deferred, and if it
hadn't it should be safe to detach it (the guest should fail gracefully
when it attempts to change the allocation state).

I think this was probably just a bandaid for some other problem in the
state management.  So, remove awaiting_allocation and associated code.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Greg Kurz <groug@kaod.org>
Tested-by: Daniel Barboza <danielhb@linux.vnet.ibm.com>
2017-07-17 15:07:05 +10:00
Laurent Vivier 94fd9cbaa3 spapr: Treat devices added before inbound migration as coldplugged
When migrating a guest which has already had devices hotplugged,
libvirt typically starts the destination qemu with -incoming defer,
adds those hotplugged devices with qmp, then initiates the incoming
migration.

This causes problems for the management of spapr DRC state.  Because
the device is treated as hotplugged, it goes into a DRC state for a
device immediately after it's plugged, but before the guest has
acknowledged its presence.  However, chances are the guest on the
source machine *has* acknowledged the device's presence and configured
it.

If the source has fully configured the device, then DRC state won't be
sent in the migration stream: for maximum migration compatibility with
earlier versions we don't migrate DRCs in coldplug-equivalent state.
That means that the DRC effectively changes state over the migrate,
causing problems later on.

In addition, logging hotplug events for these devices isn't what we
want because a) those events should already have been issued on the
source host and b) the event queue should get wiped out by the
incoming state anyway.

In short, what we really want is to treat devices added before an
incoming migration as if they were coldplugged.

To do this, we first add a spapr_drc_hotplugged() helper which
determines if the device is hotplugged in the sense relevant for DRC
state management.  We only send hotplug events when this is true.
Second, when we add a device which isn't hotplugged in this sense, we
force a reset of the DRC state - this ensures the DRC is in a
coldplug-equivalent state (there isn't usually a system reset between
these device adds and the incoming migration).

This is based on an earlier patch by Laurent Vivier, cleaned up and
extended.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Daniel Barboza <danielhb@linux.vnet.ibm.com>
2017-07-17 15:07:05 +10:00
David Gibson 3579d606a0 spapr: Remove unnecessary instance_size specifications from DRC subtypes
All the DRC subtypes explicitly list instance_size in TypeInfo (all as
sizeof(sPAPRDRConnector).  This isn't necessary, since if it's not listed
it will be derived from the parent type.

Worse, this is dangerous, because if a subtype is changed in future to
have a larger structure, then subtypes of that subtype also need to have
instance_size changed, or it will lead to hard to track memory corruption
bugs.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-07-17 15:06:08 +10:00
David Gibson 5c1da81215 spapr: Remove unnecessary differences between hotplug and coldplug paths
spapr_drc_attach() has a 'coldplug' parameter which sets the DRC into
configured state initially, instead of the usual ISOLATED/UNUSABLE state.
It turns out this is unnecessary: although coldplugged devices do need to
be in CONFIGURED state once the guest starts, that will already be
accomplished by the reset code which will move DRCs for already plugged
devices into a coldplug equivalent state.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
2017-07-11 11:04:01 +10:00
David Gibson 6b762f29a8 spapr: Add DRC release method
At the moment, spapr_drc_release() has an ugly switch on the DRC type to
call the right, device-specific release function.  This cleans it up by
doing that via a proper QOM method.

It's still arguably an abstraction violation for the DRC code to call into
the specific device code, but one mess at a time.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
2017-07-11 11:04:01 +10:00
David Gibson 6caf3ac613 spapr: Uniform DRC reset paths
DRC objects have a regular device reset method.  However, it only gets
called in the usual way for PCI DRCs.  Because of where CPU and LMB DRCs
are in the QOM tree, their device reset method isn't automatically called.
So, the machine manually registers reset handlers to call device_reset().

This patch removes the device reset method, and instead always explicitly
registers the reset handler from realize().  This means the callers don't
have to worry about the two cases, and we always get proper resets.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2017-07-11 11:04:01 +10:00
David Gibson f8dc29834c spapr: Leave DR-indicator management to the guest
The DR-indicator is essentially a "virtual LED" attached to a hotpluggable
device, which the guest can set to various states for the attention of
the operator or management layers.

It's mostly guest managed, except that we once-off set it to
ACTIVE/INACTIVE in the attach/detach path.  While that makes certain sense,
there's no indication in PAPR that the hypervisor should do this, and the
drmgr code on the guest side doesn't appear to need it (it will already set
the indicator to ACTIVE on hotplug, and INACTIVE on remove).

So, leave the DR-indicator entirely to the guest; the only thing we need
to do is ensure it's in a sane state on reset.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
2017-07-11 11:04:01 +10:00
David Gibson 0dfabd39d5 spapr: Clean up DRC set_isolation_state() path
There are substantial differences in the various paths through
set_isolation_state(), both for setting to ISOLATED versus UNISOLATED
state and for logical versus physical DRCs.

So, split the set_isolation_state() method into isolate() and unisolate()
methods, and give it different implementations for the two DRC types.

Factor some minimal common checks, including for valid indicator values
(which we weren't previously checking) into rtas_set_isolation_state().

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-06-30 14:03:32 +10:00
David Gibson 617367321e spapr: Clean up DRC set_allocation_state path
The allocation-state indicator should only actually be implemented for
"logical" DRCs, not physical ones.  Factor a check for this, and also for
valid indicator state values into rtas_set_allocation_state().  Because
they don't exist for physical DRCs, there's no reason that we'd ever want
more than one method implementation, so it can just be a plain function.

In addition, the setting to USABLE and setting to UNUSABLE paths in
set_allocation_state() don't actually have much in common.  So, split the
method separate functions for each parameter value (drc_set_usable()
and drc_set_unusable()).

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-06-30 14:03:32 +10:00
David Gibson 4f9242fc93 spapr: Make DRC reset force DRC into known state
The reset handler for DRCs attempts several state transitions which are
subject to various checks and restrictions.  But at reset time we know
there is no guest, so we can ignore most of the usual sequencing rules and
just set the DRC back to a known state.  In fact, it's safer to do so.

The existing code also has several redundant checks for
drc->awaiting_release inside a block which has already tested that.  This
patch removes those and sets the DRC to a fixed initial state based only
on whether a device is currently plugged or not.

With DRCs correctly reset to a state based on device presence, we don't
need to force state transitions as cold plugged devices are processed.
This allows us to remove all the callers of the set_*_state() methods from
outside spapr_drc.c.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-06-30 14:03:32 +10:00
David Gibson 9c914e5370 spapr: Split DRC release from DRC detach
spapr_drc_detach() is called when qemu generic code requests a device be
unplugged.  It makes a number of tests, which could well delay further
action until later, before actually detach the device from the DRC.

This splits out the part which actually removes the device from the DRC
into spapr_drc_release().  This will be useful for further cleanups.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-06-30 14:03:32 +10:00
David Gibson 307b7715d0 spapr: Eliminate DRC 'signalled' state variable
The 'signalled' field in the DRC appears to be entirely a torturous
workaround for the fact that PCI devices were started in UNISOLATED state
for unclear reasons.

1) 'signalled' is already meaningless for logical (so far, all non PCI)
DRCs.  It's always set to true (at least at any point it might be tested),
and can't be assigned any real meaning due to the way signalling works for
logical DRCs.

2) For PCI DRCs, the only time signalled would be false is when non-zero
functions of a multifunction device are hotplugged, followed by function
zero (the other way around is explicitly not permitted). In that case the
secondary function DRCs are attached, but the notification isn't sent to
the guest until function 0 is plugged.

3) signalled being false is used to allow a DRC detach to switch mode
back to ISOLATED state, which allows a secondary function to be hotplugged
then unplugged with function 0 never inserted.  Without this a secondary
function starting in UNISOLATED state couldn't be detached again without
function 0 being inserted, all the functions configured by the guest, then
sent back to ISOLATED state.

4) But now that PCI DRCs start in ISOLATED state, there's nothing to be
done.  If the guest doesn't get the notification, it won't switch the
device to UNISOLATED state, so nothing prevents it from being unplugged.
If the guest does move it to UNISOLATED state without the signal (due to
a manual drmgr call, for instance) then it really isn't safe to unplug it.

So, this patch removes the signalled variable and all code related to it.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-06-30 14:03:32 +10:00
David Gibson af8ad96bd0 spapr: Start hotplugged PCI devices in ISOLATED state
PCI DRCs, and only PCI DRCs, are immediately moved to UNISOLATED isolation
state once the device is attached.  This has been there from the initial
implementation, and it's not clear why.

The state diagram in PAPR 13.4 suggests PCI devices should start in
ISOLATED state until the guest moves them into UNISOLATED, and the code in
the guest-side drmgr tool seems to work that way too.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
2017-06-30 14:03:32 +10:00
Laurent Vivier 593080936a Revert "spapr: fix memory hot-unplugging"
This reverts commit fe6824d126.

Conflicts hw/ppc/spapr_drc.c, because get_index() has been renamed
spapr_get_index().

This didn't fix the problem. Once the hotplug has been started
some memory is allocated and some structures are allocated.
We don't free it when we ignore the unplug, and we can't because
they can be in use by the kernel.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Tested-by: Daniel Barboza <danielhb@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-06-09 12:35:46 +10:00
David Gibson 7980833619 spapr: Rework DRC name handling
DRC objects have a get_name method which returns the DRC name generated
when the DRC is created.  Replace that with a fixed spapr_drc_name()
function which generates the name on the fly from other information.  This
means:
  * We get rid of a method with only one implementation, and only local
    callers
  * We don't have to carry the name string around for the lifetime of the
    DRC
  * We use information added to the class structure to generate the name
    in standard format, so we don't need an explicit switch on drc type
    any more

We also eliminate the 'name' property; it's basically useless since the
only information in it can easily be deduced from other things.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Acked-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-06-08 14:38:27 +10:00
David Gibson 0be4e88621 spapr: Change DRC attach & detach methods to functions
DRC objects have attach & detach methods, but there's only one
implementation.  Although there are some differences in its behaviour for
different DRC types, the overall structure is the same, so while we might
want different method implementations for some parts, we're unlikely to
want them for the top-level functions.

So, replace them with direct function calls.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Acked-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-06-08 14:38:26 +10:00
David Gibson cd74d27e42 spapr: Clean up handling of DR-indicator
There are 3 types of "indicator" associated with hotplug in the PAPR spec
the "allocation state", "isolation state" and "DR-indicator".  The first
two are intimately tied to the various state transitions associated with
hotplug.  The DR-indicator, however, is different and simpler.

It's basically just a guest controlled variable which can be used by the
guest to flag state or problems associated with a device.  The idea is that
the hypervisor can use it to present information back on management
consoles (on some machines with PowerVM it may even control physical LEDs
on the machine case associated with the relevant device).

For that reason, there's only ever likely to be a single update
implementation so the set_indicator_state method isn't useful.  Replace it
with a direct function call.

While we're there, make some small associated cleanups:
  * PAPR doesn't use the term "indicator state", just "DR-indicator" and
the allocation state and isolation state are also considered "indicators".
Rename things to be less confusing
  * Fold set_indicator_state() and rtas_set_indicator_state() into a single
rtas_set_dr_indicator() function.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Acked-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-06-08 14:38:26 +10:00
David Gibson 7b7258f810 spapr: Clean up RTAS set-indicator
In theory the RTAS set-indicator call can be used for a number of
"indicators" defined by PAPR.  In practice the only ones we're ever likely
to implement are those used for Dynamic Reconfiguration (i.e. hotplug).
Because of this, the current implementation determines the associated DRC
object, before dispatching based on the type of indicator.

However, this means we also need a check that we're dealing with a DR
related indicator at all, which duplicates some of the logic from the
switch further down.

Even though it means a bit of code duplication, things work out cleaner if
we delegate the DRC lookup to the individual indicator type functions -
and it also allows some further cleanups.

While we're there, remove references to "sensor", a copy/paste artefact
from the related, but distinct "get-sensor" call.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Acked-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-06-08 14:38:26 +10:00
David Gibson f224d35be9 spapr: Clean up DR entity sense handling
DRC classes have an entity_sense method to determine (in a specific PAPR
sense) the presence or absence of a device plugged into a DRC.  However,
we only have one implementation of the method, which explicitly tests for
different DRC types.  This changes it to instead have different method
implementations for the two cases: "logical" and "physical" DRCs.

While we're at it, the entity sense method always returns RTAS_OUT_SUCCESS,
and the interesting value is returned via pass-by-reference.  Simplify this
to directly return the value we care about

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Acked-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-06-08 14:38:26 +10:00
David Gibson 91dcb1ffa6 spapr: Remove some non-useful properties on DRC objects
* 'connector_type' is easily derived from the 'index' property, so there's
   no point to it (it's also implicit in the QOM type of the DRC)
 * 'isolation-state', 'indicator-state' and 'allocation-state' are
   part of the transaction between qemu and guest during PAPR hotplug
   operations, and outside tools really have no business looking at it
   (especially not changing, and these were RW properties)
 * 'entity-sense' is basically just a weird PAPR encoding of whether there
   is a device connected to this DRC

Strictly speaking removing these properties is breaking the qemu interface.
However, I'm pretty sure no management tools have ever used these.  For
debugging there are better alternatives.  Therefore, I think removing these
broken interfaces is the better option.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Acked-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-06-06 09:24:25 +10:00
David Gibson 1693ea1685 spapr: Eliminate spapr_drc_get_type_str()
This function was used in generating the device tree.  However, now that
we have different QOM types for different DRC types we can easily store
the information we need in the class structure and avoid this specialized
lookup function.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Acked-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-06-06 09:24:21 +10:00
David Gibson b8fdd530be spapr: Move configure-connector state into DRC
Currently the sPAPRMachineState contains a list of sPAPRConfigureConnector
structures which store intermediate state for the ibm,configure-connector
RTAS call.

This was an attempt to separate this state from the core of the DRC state.
However the configure connector process is intimately tied to the DRC
model, so there's really no point trying to have two levels of interface
here.

Moving the configure-connector state into its corresponding DRC allows
removal of a number of helpers for maintaining the anciliary list.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Acked-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-06-06 09:24:17 +10:00
David Gibson fbf5539718 spapr: Clean up spapr_dr_connector_by_*()
* Change names to something less ludicrously verbose
 * Now that we have QOM subclasses for the different DRC types, use a QOM
   typename instead of a PAPR type value parameter

The latter allows removal of the get_type_shift() helper.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Acked-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-06-06 09:24:08 +10:00
David Gibson 2d33581899 spapr: Introduce DRC subclasses
Currently we only have a single QOM type for all DRCs, but lots of
places where we switch behaviour based on the DRC's PAPR defined type.
This is a poor use of our existing type system.

So, instead create QOM subclasses for each PAPR defined DRC type.  We
also introduce intermediate subclasses for physical and logical DRCs,
a division which will be useful later on.

Instead of being stored in the DRC object itself, the PAPR type is now
stored in the class structure.  There are still many places where we
switch directly on the PAPR type value, but this at least provides the
basis to start to remove those.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Acked-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-06-06 09:23:46 +10:00
Greg Kurz a32e900b8a spapr/drc: don't migrate DRC of cold-plugged CPUs and LMBs
As explained in commit 5c0139a8c2 ("spapr: fix default DRC state for
coldplugged LMBs"), guests expect cold-plugged LMBs to be pre-allocated
and unisolated. The same goes for cold-plugged CPUs.

While here, let's convert g_assert(false) to the better self documenting
g_assert_not_reached().

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-06-06 09:22:02 +10:00
David Gibson 0b55aa91c9 spapr: Make DRC get_index and get_type methods into plain functions
These two methods only have one implementation, and the spec they're
implementing means any other implementation is unlikely, verging on
impossible.

So replace them with simple functions.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Tested-by: Daniel Barboza <danielhb@linux.vnet.ibm.com>
2017-06-06 08:53:24 +10:00
David Gibson 4f65ce00ab spapr: Abolish DRC set_configured method
DRConnectorClass has a set_configured method, however:
  * There is only one implementation, and only ever likely to be one
  * There's exactly one caller, and that's (now) local
  * The implementation is very straightforward

So abolish the method entirely, and just open-code what we need.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Daniel Barboza <danielhb@linux.vnet.ibm.com>
2017-06-06 08:53:24 +10:00
David Gibson 88af6ea568 spapr: Abolish DRC get_fdt method
The DRConnectorClass includes a get_fdt method.  However
  * There's only one implementation, and there's only likely to ever be one
  * Both callers are local to spapr_drc
  * Each caller only uses one half of the actual implementation

So abolish get_fdt() entirely, and just open-code what we need.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Daniel Barboza <danielhb@linux.vnet.ibm.com>
2017-06-06 08:53:24 +10:00
David Gibson b89b3d3929 spapr: Move DRC RTAS calls into spapr_drc.c
Currently implementations of the RTAS calls related to DRCs are in
spapr_rtas.c.  They belong better in spapr_drc.c - that way they're closer
to related code, and we'll be able to make some more things local.

spapr_rtas.c was intended to contain the RTAS infrastructure and core calls
that don't belong anywhere else, not every RTAS implementation.

Code motion only.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Daniel Barboza <danielhb@linux.vnet.ibm.com>
2017-06-06 08:53:24 +10:00
Daniel Henrique Barboza a50919dddf hw/ppc: migrating the DRC state of hotplugged devices
In pseries, a firmware abstraction called Dynamic Reconfiguration
Connector (DRC) is used to assign a particular dynamic resource
to the guest and provide an interface to manage configuration/removal
of the resource associated with it. In other words, DRC is the
'plugged state' of a device.

Before this patch, DRC wasn't being migrated. This causes
post-migration problems due to DRC state mismatch between source and
target. The DRC state of a device X in the source might
change, while in the target the DRC state of X is still fresh. When
migrating the guest, X will not have the same hotplugged state as it
did in the source. This means that we can't hot unplug X in the
target after migration is completed because its DRC state is not consistent.
https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1677552 is one
bug that is caused by this DRC state mismatch between source and
target.

To migrate the DRC state, we defined the VMStateDescription struct for
spapr_drc to enable the transmission of spapr_drc state in migration.
Not all the elements in the DRC state are migrated - only those
that can be modified by guest actions or device add/remove
operations:

- 'isolation_state', 'allocation_state' and 'indicator_state'
are involved in the DR state transition diagram from
PAPR+ 2.7, 13.4;

- 'configured', 'signalled', 'awaiting_release' and 'awaiting_allocation'
are needed in attaching and detaching devices;

- 'indicator_state' provides users with hardware state information.

These are the DRC elements that are migrated.

In this patch the DRC state is migrated for PCI, LMB and CPU
connector types. At this moment there is no support to migrate
DRC for the PHB (PCI Host Bridge) type.

In the 'realize' function the DRC is registered using vmstate_register,
similar to what hw/ppc/spapr_iommu.c does in 'spapr_tce_table_realize'.
This approach works because  DRCs are bus-less and do not sit
on a BusClass that implements bc->get_dev_path, so as a fallback the
VMSD gets identified via "spapr_drc"/get_index(drc).

Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-25 11:31:33 +10:00
Daniel Henrique Barboza 318347234d hw/ppc: removing drc->detach_cb and drc->detach_cb_opaque
The pointer drc->detach_cb is being used as a way of informing
the detach() function inside spapr_drc.c which cb to execute. This
information can also be retrieved simply by checking drc->type and
choosing the right callback based on it. In this context, detach_cb
is redundant information that must be managed.

After the previous spapr_lmb_release change, no detach_cb_opaques
are being used by any of the three callbacks functions. This is
yet another information that is now unused and, on top of that, can't
be migrated either.

This patch makes the following changes:

- removal of detach_cb_opaque. the 'opaque' argument was removed from
the callbacks and from the detach() function of sPAPRConnectorClass. The
attribute detach_cb_opaque of sPAPRConnector was removed.

- removal of detach_cb from the detach() call. The function pointer
detach_cb of sPAPRConnector was removed. detach() now uses a
switch(drc->type) to execute the apropriate callback. To achieve this,
spapr_core_release, spapr_lmb_release and spapr_phb_remove_pci_device_cb
callbacks were made public to be visible inside detach().

Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-05-25 11:31:33 +10:00
Eduardo Habkost e90f2a8c3e qdev: Replace cannot_instantiate_with_device_add_yet with !user_creatable
cannot_instantiate_with_device_add_yet was introduced by commit
efec3dd631 to replace no_user. It was
supposed to be a temporary measure.

When it was introduced, we had 54
cannot_instantiate_with_device_add_yet=true lines in the code.
Today (3 years later) this number has not shrunk: we now have
57 cannot_instantiate_with_device_add_yet=true lines. I think it
is safe to say it is not a temporary measure, and we won't see
the flag go away soon.

Instead of a long field name that misleads people to believe it
is temporary, replace it a shorter and less misleading field:
user_creatable.

Except for code comments, changes were generated using the
following Coccinelle patch:

  @@
  expression DC;
  @@
  (
  -DC->cannot_instantiate_with_device_add_yet = false;
  +DC->user_creatable = true;
  |
  -DC->cannot_instantiate_with_device_add_yet = true;
  +DC->user_creatable = false;
  )

  @@
  typedef ObjectClass;
  expression dc;
  identifier class, data;
  @@
   static void device_class_init(ObjectClass *class, void *data)
   {
   ...
   dc->hotpluggable = true;
  +dc->user_creatable = true;
   ...
   }

  @@
  @@
   struct DeviceClass {
   ...
  -bool cannot_instantiate_with_device_add_yet;
  +bool user_creatable;
   ...
  }

  @@
  expression DC;
  @@
  (
  -!DC->cannot_instantiate_with_device_add_yet
  +DC->user_creatable
  |
  -DC->cannot_instantiate_with_device_add_yet
  +!DC->user_creatable
  )

Cc: Alistair Francis <alistair.francis@xilinx.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Thomas Huth <thuth@redhat.com>
Acked-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Acked-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170503203604.31462-2-ehabkost@redhat.com>
[ehabkost: kept "TODO remove once we're there" comment]
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-05-17 10:37:00 -03:00
Laurent Vivier fe6824d126 spapr: fix memory hot-unplugging
If, once the kernel has booted, we try to remove a memory
hotplugged while the kernel was not started, QEMU crashes on
an assert:

    qemu-system-ppc64: hw/virtio/vhost.c:651:
                       vhost_commit: Assertion `r >= 0' failed.
    ...
    #4  in vhost_commit
    #5  in memory_region_transaction_commit
    #6  in pc_dimm_memory_unplug
    #7  in spapr_memory_unplug
    #8  spapr_machine_device_unplug
    #9  in hotplug_handler_unplug
    #10 in spapr_lmb_release
    #11 in detach
    #12 in set_allocation_state
    #13 in rtas_set_indicator
    ...

If we take a closer look to the guest kernel log, we can see when
we try to unplug the memory:

    pseries-hotplug-mem: Attempting to hot-add 4 LMB(s)

What happens:

    1- The kernel has ignored the memory hotplug event because
       it was not started when it was generated.

    2- When we hot-unplug the memory,
       QEMU starts to remove the memory,
            generates an hot-unplug event,
        and signals the kernel of the incoming new event

    3- as the kernel is started, on the QEMU signal, it reads
       the event list, decodes the hotplug event and tries to
       finish the hotplugging.

    4- QEMU receive the the hotplug notification while it
       is trying to hot-unplug the memory. This moves the memory
       DRC to an invalid state

This patch prevents this by not allowing to set the allocation
state to USABLE while the DRC is awaiting release.

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1432382

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-29 11:35:16 +11:00
Markus Armbruster a4a1c70dc7 qapi: Make input visitors detect unvisited list tails
Fix the design flaw demonstrated in the previous commit: new method
check_list() lets input visitors report that unvisited input remains
for a list, exactly like check_struct() lets them report that
unvisited input remains for a struct or union.

Implement the method for the qobject input visitor (straightforward),
and the string input visitor (less so, due to the magic list syntax
there).  The opts visitor's list magic is even more impenetrable, and
all I can do there today is a stub with a FIXME comment.  No worse
than before.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1488544368-30622-26-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-05 09:14:20 +01:00
Stefan Weil b12227afb1 hw: Fix typos found by codespell
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Acked-by: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-01-24 23:26:52 +03:00
Bharata B Rao cf63246319 spapr: Memory hot-unplug support
Add support to hot remove pc-dimm memory devices.

Since we're introducing a machine-level unplug_request hook, we also
had handling for CPU unplug there as well to ensure CPU unplug
continues to work as it did before.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
* add hooks to CAS/cmdline enablement of hotplug ACR support
* add hook for CPU unplug
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 11:17:35 +11:00
Laurent Vivier 24ac7755d7 spapr_drc: convert to trace framework instead of DPRINTF
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-09-23 10:29:40 +10:00
Cédric Le Goater ce9863b797 hw/ppc: use error_report instead of fprintf
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-09-07 12:40:12 +10:00
Eric Blake 1158bb2a05 qapi: Add parameter to visit_end_*
Rather than making the dealloc visitor track of stack of pointers
remembered during visit_start_* in order to free them during
visit_end_*, it's a lot easier to just make all callers pass the
same pointer to visit_end_*.  The generated code has access to the
same pointer, while all other users are doing virtual walks and
can pass NULL.  The dealloc visitor is then greatly simplified.

All three visit_end_*() functions intentionally take a void**,
even though the visit_start_*() functions differ between void**,
GenericList**, and GenericAlternate**.  This is done for several
reasons: when doing a virtual walk, passing NULL doesn't care
what the type is, but when doing a generated walk, we already
have to cast the caller's specific FOO* to call visit_start,
while using void** lets us use visit_end without a cast. Also,
an upcoming patch will add a clone visitor that wants to use
the same implementation for all three visit_end callbacks,
which is made easier if all three share the same signature.

For visitors with already track per-object state (the QMP visitors
via a stack, and the string visitors which do not allow nesting),
add an assertion that the caller is indeed passing the same
pointer to paired calls.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1465490926-28625-4-git-send-email-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-07-06 10:52:04 +02:00
Bharata B Rao aab99135b6 spapr_drc: Prevent detach racing against attach for CPU DR
If a CPU is hot removed while hotplug of the same is still in progress,
the guest crashes. Prevent this by ensuring that detach is done only
after attach has completed.

The existing code already prevents such race for PCI hotplug. However
given that CPU is a logical DR unlike PCI and starts with ISOLATED
state, we need a logic that works for CPU too.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
               [Don't set awaiting_attach for PCI devices]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17 16:33:48 +10:00
Eric Blake d9f62dde13 qapi: Simplify semantics of visit_next_list()
The semantics of the list visit are somewhat baroque, with the
following pseudocode when FooList is used:

start()
for (prev = head; cur = next(prev); prev = &cur) {
    visit(&cur->value)
}

Note that these semantics (advance before visit) requires that
the first call to next() return the list head, while all other
calls return the next element of the list; that is, every visitor
implementation is required to track extra state to decide whether
to return the input as-is, or to advance.  It also requires an
argument of 'GenericList **' to next(), solely because the first
iteration might need to modify the caller's GenericList head, so
that all other calls have to do a layer of dereferencing.

Thankfully, we only have two uses of list visits in the entire
code base: one in spapr_drc (which completely avoids
visit_next_list(), feeding in integers from a different source
than uint8List), and one in qapi-visit.py.  That is, all other
list visitors are generated in qapi-visit.c, and share the same
paradigm based on a qapi FooList type, so we can refactor how
lists are laid out with minimal churn among clients.

We can greatly simplify things by hoisting the special case
into the start() routine, and flipping the order in the loop
to visit before advance:

start(head)
for (tail = *head; tail; tail = next(tail)) {
    visit(&tail->value)
}

With the simpler semantics, visitors have less state to track,
the argument to next() is reduced to 'GenericList *', and it
also becomes obvious whether an input visitor is allocating a
FooList during visit_start_list() (rather than the old way of
not knowing if an allocation happened until the first
visit_next_list()).  As a minor drawback, we now allocate in
two functions instead of one, and have to pass the size to
both functions (unless we were to tweak the input visitors to
cache the size to start_list for reuse during next_list, but
that defeats the goal of less visitor state).

The signature of visit_start_list() is chosen to match
visit_start_struct(), with the new parameters after 'name'.

The spapr_drc case is a virtual visit, done by passing NULL for
list, similarly to how NULL is passed to visit_start_struct()
when a qapi type is not used in those visits.  It was easy to
provide these semantics for qmp-output and dealloc visitors,
and a bit harder for qmp-input (several prerequisite patches
refactored things to make this patch straightforward).  But it
turned out that the string and opts visitors munge enough other
state during visit_next_list() to make it easier to just
document and require a GenericList visit for now; an assertion
will remind us to adjust things if we need the semantics in the
future.

Several pre-requisite cleanup patches made the reshuffling of
the various visitors easier; particularly the qmp input visitor.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1461879932-9020-24-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-05-12 09:47:55 +02:00
Eric Blake 15c2f669e3 qapi: Split visit_end_struct() into pieces
As mentioned in previous patches, we want to call visit_end_struct()
functions unconditionally, so that visitors can release resources
tied up since the matching visit_start_struct() without also having
to worry about error priority if more than one error occurs.

Even though error_propagate() can be safely used to ignore a second
error during cleanup caused by a first error, it is simpler if the
cleanup cannot set an error.  So, split out the error checking
portion (basically, input visitors checking for unvisited keys) into
a new function visit_check_struct(), which can be safely skipped if
any earlier errors are encountered, and leave the cleanup portion
(which never fails, but must be called unconditionally if
visit_start_struct() succeeded) in visit_end_struct().

Generated code in qapi-visit.c has diffs resembling:

|@@ -59,10 +59,12 @@ void visit_type_ACPIOSTInfo(Visitor *v,
|         goto out_obj;
|     }
|     visit_type_ACPIOSTInfo_members(v, obj, &err);
|-    error_propagate(errp, err);
|-    err = NULL;
|+    if (err) {
|+        goto out_obj;
|+    }
|+    visit_check_struct(v, &err);
| out_obj:
|-    visit_end_struct(v, &err);
|+    visit_end_struct(v);
| out:

and in qapi-event.c:

@@ -47,7 +47,10 @@ void qapi_event_send_acpi_device_ost(ACP
|         goto out;
|     }
|     visit_type_q_obj_ACPI_DEVICE_OST_arg_members(v, &param, &err);
|-    visit_end_struct(v, err ? NULL : &err);
|+    if (!err) {
|+        visit_check_struct(v, &err);
|+    }
|+    visit_end_struct(v);
|     if (err) {
|         goto out;

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1461879932-9020-20-git-send-email-eblake@redhat.com>
[Conflict with a doc fixup resolved]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-05-12 09:47:55 +02:00
Eric Blake a543a554cf spapr_drc: Expose 'null' in qom-get when there is no fdt
Now that the QMP output visitor supports an explicit null
output, we should utilize it to make it easier to diagnose
the difference between a missing fdt ('null') vs. a
present-but-empty one ('{}').

(Note that this reverts the behavior of commit ab8bf1d, taking
us back to the behavior of commit 6c2f9a1 [which in turn
stemmed from a crash fix in 1d10b44]; but that this time,
the change is intentional and not an accidental side-effect.)

Signed-off-by: Eric Blake <eblake@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <1461879932-9020-17-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-05-12 09:47:54 +02:00
Michael Roth df18b2db69 spapr_drc: fix aborts during DRC-count based hotplug
CPU/memory resources can be signalled en-masse via
spapr_hotplug_req_add_by_count(), and when doing so, actually change
the meaning of the 'drc' parameter passed to
spapr_hotplug_req_event() to be a count rather than an index.

f40eb92 added a hook in spapr_hotplug_req_event() to record when a
device had been 'signalled' to the guest, but that code assumes that
drc is always an index. In cases where it's a count, such as memory
hotplug, the DRC lookup will fail, leading to an assert.

Fix this by only explicitly setting the signalled state for cases where
we are doing PCI hotplug.

For other resources types, since we cannot selectively track whether a
resource has been signalled in cases where we signal attach as a count,
set the 'signalled' state to true immediately upon making the
resource available via drck->attach().

Reported-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
Cc: david@gibson.dropbear.id.au
Cc: qemu-ppc@nongnu.org
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-04-26 11:16:08 +10:00
Michael Roth f40eb921da spapr_drc: enable immediate detach for unsignalled devices
Currently spapr doesn't support "aborting" hotplug of PCI
devices by allowing device_del to immediately remove the
device if we haven't signalled the presence of the device
to the guest.

In the past this wasn't an issue, since we always immediately
signalled device attach and simply relied on full guest-aware
add->remove path for device removal. However, as of 788d259,
we now defer signalling for PCI functions until function 0
is attached, so now we need to deal with these "abort" operations
for cases where a user hotplugs a non-0 function, then opts to
remove it prior hotplugging function 0. Currently they'd have to
reboot before the unplug completed. PCIe multifunction hotplug
does not have this requirement however, so from a management
implementation perspective it would be good to address this within
the same release as 788d259.

We accomplish this by simply adding a 'signalled' flag to track
whether a device hotplug event has been sent to the guest. If it
hasn't, we allow immediate removal under the assumption that the
guest will not be using the device. Devices present at boot/reset
time are also assumed to be 'signalled'.

For CPU/memory/etc, signalling will still happen immediately
as part of device_add, so only PCI functions should be affected.

Cc: bharata@linux.vnet.ibm.com
Cc: david@gibson.dropbear.id.au
Cc: sbhat@linux.vnet.ibm.com
Cc: qemu-ppc@nongnu.org
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
[dwg: This fixes a regression where an incorrect hot-add of a non-zero
      function can no longer be backed out until function 0 is added]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-04-05 10:47:03 +10:00