Commit Graph

84617 Commits

Author SHA1 Message Date
Daniel P. Berrangé
aae12d4baa iotests: add support for capturing and matching QMP events
When using the _launch_qemu and _send_qemu_cmd functions from
common.qemu, any QMP events get mixed in with the output from
the commands and responses.

This makes it difficult to write a test case as the ordering
of events in the output is not stable.

This introduces a variable 'capture_events' which can be set
to a list of event names. Any events listed in this variable
will not be printed, instead collected in the $QEMU_EVENTS
environment variable.

A new '_wait_event' function can be invoked to collect events
at a fixed point in time. The function will first pull events
cached in $QEMU_EVENTS variable, and if none are found, will
then read more from QMP.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210204124834.774401-11-berrange@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-02-08 11:19:51 +00:00
Daniel P. Berrangé
bef7e9e2c7 migration: introduce a delete_snapshot wrapper
Make snapshot deletion consistent with the snapshot save
and load commands by using a wrapper around the blockdev
layer. The main difference is that we get upfront validation
of the passed in device list (if any).

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210204124834.774401-10-berrange@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-02-08 11:19:51 +00:00
Daniel P. Berrangé
f1a9fcdd01 migration: wire up support for snapshot device selection
Modify load_snapshot/save_snapshot to accept the device list and vmstate
node name parameters previously added to the block layer.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210204124834.774401-9-berrange@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-02-08 11:19:51 +00:00
Daniel P. Berrangé
f781f84189 migration: control whether snapshots are ovewritten
The traditional HMP "savevm" command will overwrite an existing snapshot
if it already exists with the requested name. This new flag allows this
to be controlled allowing for safer behaviour with a future QMP command.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210204124834.774401-8-berrange@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-02-08 11:19:51 +00:00
Daniel P. Berrangé
3d3e9b1f66 block: rename and alter bdrv_all_find_snapshot semantics
Currently bdrv_all_find_snapshot() will return 0 if it finds
a snapshot, -1 if an error occurs, or if it fails to find a
snapshot. New callers to be added want to distinguish between
the error scenario and failing to find a snapshot.

Rename it to bdrv_all_has_snapshot and make it return -1 on
error, 0 if no snapshot is found and 1 if snapshot is found.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210204124834.774401-7-berrange@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-02-08 11:19:51 +00:00
Daniel P. Berrangé
c22d644ca7 block: allow specifying name of block device for vmstate storage
Currently the vmstate will be stored in the first block device that
supports snapshots. Historically this would have usually been the
root device, but with UEFI it might be the variable store. There
needs to be a way to override the choice of block device to store
the state in.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210204124834.774401-6-berrange@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-02-08 11:19:51 +00:00
Daniel P. Berrangé
cf3a74c94f block: add ability to specify list of blockdevs during snapshot
When running snapshot operations, there are various rules for which
blockdevs are included/excluded. While this provides reasonable default
behaviour, there are scenarios that are not well handled by the default
logic. Some of the conditions do not have a single correct answer.

Thus there needs to be a way for the mgmt app to provide an explicit
list of blockdevs to perform snapshots across. This can be achieved
by passing a list of node names that should be used.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210204124834.774401-5-berrange@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-02-08 11:19:51 +00:00
Daniel P. Berrangé
f61fe11aa6 migration: stop returning errno from load_snapshot()
None of the callers care about the errno value since there is a full
Error object populated. This gives consistency with save_snapshot()
which already just returns a boolean value.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
[PMD: Return false/true instead of -1/0, document function]
Acked-by: Pavel Dovgalyuk <pavel.dovgalyuk@ispras.ru>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210204124834.774401-4-berrange@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-02-08 11:19:51 +00:00
Philippe Mathieu-Daudé
7ea14df230 migration: Make save_snapshot() return bool, not 0/-1
Just for consistency, following the example documented since
commit e3fe3988d7 ("error: Document Error API usage rules"),
return a boolean value indicating an error is set or not.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Acked-by: Pavel Dovgalyuk <pavel.dovgalyuk@ispras.ru>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210204124834.774401-3-berrange@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-02-08 11:19:51 +00:00
Daniel P. Berrangé
e26f98e209 block: push error reporting into bdrv_all_*_snapshot functions
The bdrv_all_*_snapshot functions return a BlockDriverState pointer
for the invalid backend, which the callers then use to report an
error message. In some cases multiple callers are reporting the
same error message, but with slightly different text. In the future
there will be more error scenarios for some of these methods, which
will benefit from fine grained error message reporting. So it is
helpful to push error reporting down a level.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
[PMD: Initialize variables]
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210204124834.774401-2-berrange@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-02-08 11:19:51 +00:00
Dr. David Alan Gilbert
a64aec725e migration: Display the migration blockers
Update 'info migrate' to display migration blocking information.
If the outbound migration is not blocked, there is no change, however
if it is blocked a message is displayed with a list of reasons why,
e.g.

qemu-system-x86_64 -nographic  -smp 4 -m 4G -M pc,usb=on \
 -chardev null,id=n -device usb-serial,chardev=n \
 -virtfs local,path=/home,mount_tag=fs,security_model=none \
 -drive if=virtio,file=myimage.qcow2

(qemu) info migrate
globals:
store-global-state: on
only-migratable: off
send-configuration: on
send-section-footer: on
decompress-error-check: on
clear-bitmap-shift: 18
Outgoing migration blocked:
  Migration is disabled when VirtFS export path '/home' is mounted in the guest using mount_tag 'fs'
  non-migratable device: 0000:00:01.2/1/usb-serial

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20210202135522.127380-3-dgilbert@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-02-08 11:19:51 +00:00
Dr. David Alan Gilbert
3af8554bd0 migration: Add blocker information
Modify query-migrate so that it has a flag indicating if outbound
migration is blocked, and if it is a list of reasons.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20210202135522.127380-2-dgilbert@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-02-08 11:19:51 +00:00
Markus Armbruster
54270c450a migration: Fix a few absurdly defective error messages
migrate_params_check() has a number of error messages of the form

    Parameter 'NAME' expects is invalid, it should be ...

Fix them to something like

    Parameter 'NAME' expects a ...

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20210202141734.2488076-5-armbru@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-02-08 11:19:51 +00:00
Markus Armbruster
7bfc47936e migration: Fix cache_init()'s "Failed to allocate" error messages
cache_init() attempts to handle allocation failure.  The two error
messages are garbage, as untested error messages commonly are:

    Parameter 'cache size' expects Failed to allocate cache
    Parameter 'cache size' expects Failed to allocate page cache

Fix them to just

    Failed to allocate cache
    Failed to allocate page cache

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20210202141734.2488076-4-armbru@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-02-08 11:19:51 +00:00
Markus Armbruster
8b9407a09f migration: Clean up signed vs. unsigned XBZRLE cache-size
73af8dd8d7 "migration: Make xbzrle_cache_size a migration
parameter" (v2.11.0) made the new parameter unsigned (QAPI type
'size', uint64_t in C).  It neglected to update existing code, which
continues to use int64_t.

migrate_xbzrle_cache_size() returns the new parameter.  Adjust its
return type.

QMP query-migrate-cache-size returns migrate_xbzrle_cache_size().
Adjust its return type.

migrate-set-parameters passes the new parameter to
xbzrle_cache_resize().  Adjust its parameter type.

xbzrle_cache_resize() passes it on to cache_init().  Adjust its
parameter type.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20210202141734.2488076-3-armbru@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-02-08 11:19:51 +00:00
Markus Armbruster
ec17de0ac0 migration: Fix migrate-set-parameters argument validation
Commit 741d4086c8 "migration: Use proper types in json" (v2.12.0)
switched MigrationParameters to narrower integer types, and removed
the simplified qmp_migrate_set_parameters()'s argument checking
accordingly.

Good idea, except qmp_migrate_set_parameters() takes
MigrateSetParameters, not MigrationParameters.  Its job is updating
migrate_get_current()->parameters (which *is* of type
MigrationParameters) according to its argument.  The integers now get
truncated silently.  Reproducer:

    ---> {'execute': 'query-migrate-parameters'}
    <--- {"return": {[...] "compress-threads": 8, [...]}}
    ---> {"execute": "migrate-set-parameters", "arguments": {"compress-threads": 257}}
    <--- {"return": {}}
    ---> {'execute': 'query-migrate-parameters'}
    <--- {"return": {[...] "compress-threads": 1, [...]}}

Fix by resynchronizing MigrateSetParameters with MigrationParameters.

Fixes: 741d4086c8
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20210202141734.2488076-2-armbru@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-02-08 11:19:51 +00:00
Andrey Gruzdev
c7243566d0 migration: introduce 'userfaultfd-wrlat.py' script
Add BCC/eBPF script to analyze userfaultfd write fault latency distribution.

Signed-off-by: Andrey Gruzdev <andrey.gruzdev@virtuozzo.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20210129101407.103458-6-andrey.gruzdev@virtuozzo.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-02-08 11:19:51 +00:00
Andrey Gruzdev
8518278a6a migration: implementation of background snapshot thread
Introducing implementation of 'background' snapshot thread
which in overall follows the logic of precopy migration
while internally utilizes completely different mechanism
to 'freeze' vmstate at the start of snapshot creation.

This mechanism is based on userfault_fd with wr-protection
support and is Linux-specific.

Signed-off-by: Andrey Gruzdev <andrey.gruzdev@virtuozzo.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20210129101407.103458-5-andrey.gruzdev@virtuozzo.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-02-08 11:19:51 +00:00
Andrey Gruzdev
278e2f551a migration: support UFFD write fault processing in ram_save_iterate()
In this particular implementation the same single migration
thread is responsible for both normal linear dirty page
migration and procesing UFFD page fault events.

Processing write faults includes reading UFFD file descriptor,
finding respective RAM block and saving faulting page to
the migration stream. After page has been saved, write protection
can be removed. Since asynchronous version of qemu_put_buffer()
is expected to be used to save pages, we also have to flush
migraion stream prior to un-protecting saved memory range.

Write protection is being removed for any previously protected
memory chunk that has hit the migration stream. That's valid
for pages from linear page scan along with write fault pages.

Signed-off-by: Andrey Gruzdev <andrey.gruzdev@virtuozzo.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20210129101407.103458-4-andrey.gruzdev@virtuozzo.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
  fixup pagefault.address cast for 32bit
2021-02-08 11:19:51 +00:00
Andrey Gruzdev
0e9b5cd6b2 migration: introduce UFFD-WP low-level interface helpers
Glue code to the userfaultfd kernel implementation.
Querying feature support, createing file descriptor, feature control,
memory region registration, IOCTLs on registered registered regions.

Signed-off-by: Andrey Gruzdev <andrey.gruzdev@virtuozzo.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20210129101407.103458-3-andrey.gruzdev@virtuozzo.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
  Fixed up range.start casting for 32bit
2021-02-08 11:19:51 +00:00
Andrey Gruzdev
6e8c25b4c6 migration: introduce 'background-snapshot' migration capability
Add new capability to 'qapi/migration.json' schema.
Update migrate_caps_check() to validate enabled capability set
against introduced one. Perform checks for required kernel features
and compatibility with guest memory backends.

Signed-off-by: Andrey Gruzdev <andrey.gruzdev@virtuozzo.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20210129101407.103458-2-andrey.gruzdev@virtuozzo.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-02-08 11:19:51 +00:00
Wainer dos Santos Moschetta
1dfafcbd39 migration/qemu-file: Fix maybe uninitialized on qemu_get_buffer_in_place()
Fixed error when compiling migration/qemu-file.c with -Werror=maybe-uninitialized
as shown here:

../migration/qemu-file.c: In function 'qemu_get_buffer_in_place':
../migration/qemu-file.c:604:18: error: 'src' may be used uninitialized in this function [-Werror=maybe-uninitialized]
  604 |             *buf = src;
      |             ~~~~~^~~~~
cc1: all warnings being treated as errors

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Message-Id: <20210128130625.569900-1-wainersm@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-02-08 11:19:51 +00:00
Jinhao Gao
39f633d429 savevm: Fix memory leak of vmstate_configuration
When VM migrate VMState of configuration, the fields(name and capabilities)
of configuration having a flag of VMS_ALLOC need to allocate memory. If the
src doesn't free memory of capabilities in SaveState after save VMState of
configuration, or the dst doesn't free memory of name and capabilities in post
load of configuration, it may result in memory leak of name and capabilities.
We free memory in configuration_post_save and configuration_post_load func,
which prevents memory leak.

Reported-by: Euler Robot <euler.robot@huawei.com>
Signed-off-by: Jinhao Gao <gaojinhao@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20201231061020.828-3-gaojinhao@huawei.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-02-08 11:19:51 +00:00
Jinhao Gao
e6ddad1fd5 spapr_pci: Fix memory leak of vmstate_spapr_pci
When VM migrate VMState of spapr_pci, the field(msi_devs) of spapr_pci
having a flag of VMS_ALLOC need to allocate memory. If the src doesn't free
memory of msi_devs in SaveStateEntry of spapr_pci after QEMUFile save
VMState of spapr_pci, it may result in memory leak of msi_devs. We add the
post_save func to free memory, which prevents memory leak.

Reported-by: Euler Robot <euler.robot@huawei.com>
Signed-off-by: Jinhao Gao <gaojinhao@huawei.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20201231061020.828-2-gaojinhao@huawei.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-02-08 11:19:51 +00:00
Peter Maydell
6f0e9c26db Generalize memory encryption models
A number of hardware platforms are implementing mechanisms whereby the
 hypervisor does not have unfettered access to guest memory, in order
 to mitigate the security impact of a compromised hypervisor.
 
 AMD's SEV implements this with in-cpu memory encryption, and Intel has
 its own memory encryption mechanism.  POWER has an upcoming mechanism
 to accomplish this in a different way, using a new memory protection
 level plus a small trusted ultravisor.  s390 also has a protected
 execution environment.
 
 The current code (committed or draft) for these features has each
 platform's version configured entirely differently.  That doesn't seem
 ideal for users, or particularly for management layers.
 
 AMD SEV introduces a notionally generic machine option
 "machine-encryption", but it doesn't actually cover any cases other
 than SEV.
 
 This series is a proposal to at least partially unify configuration
 for these mechanisms, by renaming and generalizing AMD's
 "memory-encryption" property.  It is replaced by a
 "confidential-guest-support" property pointing to a platform specific
 object which configures and manages the specific details.
 
 Note to Ram Pai: the documentation I've included for PEF is very
 minimal.  If you could send a patch expanding on that, it would be
 very helpful.
 
 Changes since v8:
  * Rebase
  * Fixed some cosmetic typos
 Changes since v7:
  * Tweaked and clarified meaning of the 'ready' flag
  * Polished the interface to the PEF internals
  * Shifted initialization for s390 PV later (I hope I've finally got
    this after apply_cpu_model() where it needs to be)
 Changes since v6:
  * Moved to using OBJECT_DECLARE_TYPE and OBJECT_DEFINE_TYPE macros
  * Assorted minor fixes
 Changes since v5:
  * Renamed from "securable guest memory" to "confidential guest
    support"
  * Simpler reworking of x86 boot time flash encryption
  * Added a bunch of documentation
  * Fixed some compile errors on POWER
 Changes since v4:
  * Renamed from "host trust limitation" to "securable guest memory",
    which I think is marginally more descriptive
  * Re-organized initialization, because the previous model called at
    kvm_init didn't work for s390
  * Assorted fixes to the s390 implementation; rudimentary testing
    (gitlab CI) only
 Changes since v3:
  * Rebased
  * Added first cut at handling of s390 protected virtualization
 Changes since RFCv2:
  * Rebased
  * Removed preliminary SEV cleanups (they've been merged)
  * Changed name to "host trust limitation"
  * Added migration blocker to the PEF code (based on SEV's version)
 Changes since RFCv1:
  * Rebased
  * Fixed some errors pointed out by Dave Gilbert
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEdfRlhq5hpmzETofcbDjKyiDZs5IFAmAg1R8ACgkQbDjKyiDZ
 s5KCVRAAgm/xlgEv2hDZ7z+MuOTNesCpR3uU4iX02xNktox96Qai7XlrA7bhDf1v
 y/0FLnOOL6Kn5OHeS2CiDPIgWIUfapSwDsTPooZ6GqfzCI+r0jIaSBu59IBhvJRh
 o3ZTfT2fsckY9Gy2YN29ssN87ovDTPNlvRAxGH/71mMKEGJcK6QWxGcsyJDmeKq4
 0/tOQaLMFRRagTpwqCT1eacMzyQwkoDcywQHfi0Is+Q4voWPKgDY0qPqLd1OG2XI
 cMQ8fagums3NkPpVbKAW7sIvDiHtH1HNDoHKTiwKtTUsN5LBz+LN87LoKAdBasV0
 AiRm8gi+CkF/NOA2RjwaFmThxt7sr8kTKVuIqTo5m8agqkhJr97+gBxUym49CxTx
 1Zjo9TWsprKXnXl8vfGtAIZ4pkYQzomMDT3AilEST3+zbpRuwTMGOJ5vLF7RrKtF
 AtF2XBiPGZ/NztpbmaukuG/R49wwW5we4dR1zySMcoTsAl1rIzxpfwBnYatOY0Hg
 sVc9gABwQ0kacsseVIX72c+30U02cR8f6uRfuqNAEUW13vdAo/5/PXxGVlevMkw5
 33MYr16CkGnYgtgJtORK+x8/vPlAYiBzZrn71Wym7yKCamf8LMbzPNXKjUaD/GT8
 TZG7abTV8vuS0m7V/hGgV8nTVaG/6VLEyAtO6YpjQ+1p+dO8xBc=
 =TTeT
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/dg-gitlab/tags/cgs-pull-request' into staging

Generalize memory encryption models

A number of hardware platforms are implementing mechanisms whereby the
hypervisor does not have unfettered access to guest memory, in order
to mitigate the security impact of a compromised hypervisor.

AMD's SEV implements this with in-cpu memory encryption, and Intel has
its own memory encryption mechanism.  POWER has an upcoming mechanism
to accomplish this in a different way, using a new memory protection
level plus a small trusted ultravisor.  s390 also has a protected
execution environment.

The current code (committed or draft) for these features has each
platform's version configured entirely differently.  That doesn't seem
ideal for users, or particularly for management layers.

AMD SEV introduces a notionally generic machine option
"machine-encryption", but it doesn't actually cover any cases other
than SEV.

This series is a proposal to at least partially unify configuration
for these mechanisms, by renaming and generalizing AMD's
"memory-encryption" property.  It is replaced by a
"confidential-guest-support" property pointing to a platform specific
object which configures and manages the specific details.

Note to Ram Pai: the documentation I've included for PEF is very
minimal.  If you could send a patch expanding on that, it would be
very helpful.

Changes since v8:
 * Rebase
 * Fixed some cosmetic typos
Changes since v7:
 * Tweaked and clarified meaning of the 'ready' flag
 * Polished the interface to the PEF internals
 * Shifted initialization for s390 PV later (I hope I've finally got
   this after apply_cpu_model() where it needs to be)
Changes since v6:
 * Moved to using OBJECT_DECLARE_TYPE and OBJECT_DEFINE_TYPE macros
 * Assorted minor fixes
Changes since v5:
 * Renamed from "securable guest memory" to "confidential guest
   support"
 * Simpler reworking of x86 boot time flash encryption
 * Added a bunch of documentation
 * Fixed some compile errors on POWER
Changes since v4:
 * Renamed from "host trust limitation" to "securable guest memory",
   which I think is marginally more descriptive
 * Re-organized initialization, because the previous model called at
   kvm_init didn't work for s390
 * Assorted fixes to the s390 implementation; rudimentary testing
   (gitlab CI) only
Changes since v3:
 * Rebased
 * Added first cut at handling of s390 protected virtualization
Changes since RFCv2:
 * Rebased
 * Removed preliminary SEV cleanups (they've been merged)
 * Changed name to "host trust limitation"
 * Added migration blocker to the PEF code (based on SEV's version)
Changes since RFCv1:
 * Rebased
 * Fixed some errors pointed out by Dave Gilbert

# gpg: Signature made Mon 08 Feb 2021 06:07:27 GMT
# gpg:                using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full]
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full]
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full]
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown]
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dg-gitlab/tags/cgs-pull-request:
  s390: Recognize confidential-guest-support option
  confidential guest support: Alter virtio default properties for protected guests
  spapr: PEF: prevent migration
  spapr: Add PEF based confidential guest support
  confidential guest support: Update documentation
  confidential guest support: Move SEV initialization into arch specific code
  confidential guest support: Introduce cgs "ready" flag
  sev: Add Error ** to sev_kvm_init()
  confidential guest support: Rework the "memory-encryption" property
  confidential guest support: Move side effect out of machine_set_memory_encryption()
  sev: Remove false abstraction of flash encryption
  confidential guest support: Introduce new confidential guest support class
  qom: Allow optional sugar props

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-02-08 11:11:26 +00:00
Alex Bennée
d994cc5449 docs/system: document an example booting the versatilepb machine
There is a bit more out there including Aurelien's excellent write up
and older Debian images here:

  https://www.aurel32.net/info/debian_arm_qemu.php
  https://people.debian.org/~aurel32/qemu/armel/

However the web is transitory and git is forever so lets add something
to the fine manual.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Cc: Anders Roxell <anders.roxell@linaro.org>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <20210202134001.25738-16-alex.bennee@linaro.org>
2021-02-08 10:55:20 +00:00
Alex Bennée
a5dbb17507 docs/system: document an example vexpress-a15 invocation
The wiki and the web are curiously absent of the right runes to boot a
vexpress model so I had to work from first principles to work it out.
Use the more modern -drive notation so alternative backends can be
used (unlike the hardwired -sd mode).

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Cc: Anders Roxell <anders.roxell@linaro.org>
Message-Id: <20210202134001.25738-15-alex.bennee@linaro.org>
2021-02-08 10:55:20 +00:00
Alex Bennée
c401c058a1 tests/Makefile.include: don't use TARGET_DIRS for check-tcg
TARGET_DIRS reflects what we wanted to configure which in the normal
case is all our targets. However once meson has pared-down our target
list due to missing features we need to check the final list of
ninja-targets. This prevents check-tcg barfing on a --disable-tcg
build.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210202134001.25738-14-alex.bennee@linaro.org>
2021-02-08 10:55:20 +00:00
Alex Bennée
47e3424ac9 scripts/mtest2make.py: export all-%s-targets variable and use it
There are some places where the conditional makefile support is the
simplest solution. Now we don't expose CONFIG_TCG as a variable create
a new one that can be checked for the check-help output.

As check-tcg is a PHONY target we re-use check-softfloat to gate that
as well.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210202134001.25738-13-alex.bennee@linaro.org>
2021-02-08 10:55:20 +00:00
Stefan Weil
2a86d66be1 tests/tcg: Replace /bin/true by true (required on macOS)
/bin/true is missing on macOS, but simply "true" is available as a shell builtin.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210128135627.2067003-1-sw@weilnetz.de>
Message-Id: <20210202134001.25738-12-alex.bennee@linaro.org>
2021-02-08 10:55:20 +00:00
Richard Henderson
6e3dd75717 gdbstub: Fix handle_query_xfer_auxv
The main problem was that we were treating a guest address
as a host address with a mere cast.

Use the correct interface for accessing guest memory.  Do not
allow offset == auxv_len, which would result in an empty packet.

Fixes: 51c623b0de ("gdbstub: add support to Xfer:auxv:read: packet")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210128201831.534033-1-richard.henderson@linaro.org>
Message-Id: <20210202134001.25738-11-alex.bennee@linaro.org>
2021-02-08 10:55:15 +00:00
Alex Bennée
46bae04a86 tests/tcg: don't silently skip the gdb tests
Otherwise people won't know what they are missing.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210202134001.25738-10-alex.bennee@linaro.org>
2021-02-08 09:41:00 +00:00
Alex Bennée
d6a66c811e configure: bump the minimum gdb version for check-tcg to 9.1
For SVE, currently the bulk of the GDB TCG tests, we need at least GDB
9.1 to support the "ieee_half" data type we report. This only affects
when GDB tests are run; users can still use lower versions of gdb as
long as they aren't talking to an SVE enabled model. The work around
is to either get a newer gdb or disable SVE for their CPU model.

Reported-by: Claudio Fontana <cfontana@suse.de>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Cc: Luis Machado <luis.machado@linaro.org>
Message-Id: <20210202134001.25738-9-alex.bennee@linaro.org>
2021-02-08 09:41:00 +00:00
Alex Bennée
2df52b9bfd configure: make version_ge more tolerant of shady version input
When checking GDB versions we have to tolerate all sorts of random
distro extensions to the version string. While we already attempt to
do some of that before we call version_ge is makes sense to try and
regularise the first input by stripping extraneous -'s. While we at it
convert the old-style shell quoting into a cleaner form t shut up my
editors linter lest it confuse me by underlining the whole line.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210202134001.25738-8-alex.bennee@linaro.org>
2021-02-08 09:41:00 +00:00
Alex Bennée
ddd5ed8331 tests/docker: add a docker-exec-copy-test
This provides test machinery for checking the QEMU copying logic works
properly. It takes considerably less time to run than starting a
debootstrap only for it to fail later. I considered adding a remove
command to docker.py but figured that might be gold plating given the
relative size of the containers compared to the ones with actual stuff
in them.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210202134001.25738-7-alex.bennee@linaro.org>
2021-02-08 09:41:00 +00:00
Alex Bennée
6147c2495d tests/docker: alias docker-help target for consistency
We have a bunch of -help targets so this will save some cognitive
dissonance. Keep the original for those with muscle memory.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210202134001.25738-6-alex.bennee@linaro.org>
2021-02-08 09:41:00 +00:00
Alex Bennée
3971c70f15 tests/docker: preserve original name when copying libs
While it is important we chase down the symlinks to copy the correct
data we can confuse the kernel by renaming the interpreter to what is
in the binary. Extend _copy_with_mkdir to preserve the original name
of the file when asked.

Fixes: 5e33f7fead ("tests/docker: better handle symlinked libs")
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210202134001.25738-5-alex.bennee@linaro.org>
2021-02-08 09:41:00 +00:00
Alex Bennée
dffccf3d34 tests/docker: make _copy_with_mkdir accept missing files
Depending on the linker/ldd setup we might get a file with no path.
Typically this is the psuedo library linux-vdso.so which doesn't
actually exist on the disk. Rather than try and catch these distro
specific edge cases just shout about it and try and continue.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210202134001.25738-4-alex.bennee@linaro.org>
2021-02-08 09:41:00 +00:00
Philippe Mathieu-Daudé
dc23bbc3df tests/docker: Fix typo in help message
To have the variable properly passed, we need to set it,
ie. NOUSER=1. Fix the message displayed by 'make docker'.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210119052120.522069-1-f4bug@amsat.org>
Message-Id: <20210202134001.25738-3-alex.bennee@linaro.org>
2021-02-08 09:41:00 +00:00
Philippe Mathieu-Daudé
4d8f630915 tests/docker: Fix _get_so_libs() for docker-binfmt-image
Fix a variable rename mistake from commit 5e33f7fead:

  Traceback (most recent call last):
    File "./tests/docker/docker.py", line 710, in <module>
      sys.exit(main())
    File "./tests/docker/docker.py", line 706, in main
      return args.cmdobj.run(args, argv)
    File "./tests/docker/docker.py", line 489, in run
      _copy_binary_with_libs(args.include_executable,
    File "./tests/docker/docker.py", line 149, in _copy_binary_with_libs
      libs = _get_so_libs(src)
    File "./tests/docker/docker.py", line 123, in _get_so_libs
      libs.append(s.group(1))
  NameError: name 's' is not defined

Fixes: 5e33f7fead ("tests/docker: better handle symlinked libs")
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210119050149.516910-1-f4bug@amsat.org>
Message-Id: <20210202134001.25738-2-alex.bennee@linaro.org>
2021-02-08 09:41:00 +00:00
Thomas Huth
36a7ab5f04 tests/acceptance: Increase the timeout in the replay tests
Our gitlab-CI just showed a failed test_ppc_mac99 since it was apparently
killed some few seconds before the test finished. Allow it some more
time to complete.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Acked-by: Pavel Dovgalyuk <pavel.dovgalyuk@ispras.ru>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210127065222.48650-1-thuth@redhat.com>
2021-02-08 09:40:51 +00:00
Peter Maydell
2766043345 qemu-sparc queue
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCgA8FiEEzGIauY6CIA2RXMnEW8LFb64PMh8FAmAgZQgeHG1hcmsuY2F2
 ZS1heWxhbmRAaWxhbmRlLmNvLnVrAAoJEFvCxW+uDzIfGowH/2JNIL3Uh0rLNMZ9
 wu5VPkZWPFHwXdbRiQBFZLi33JTYdkzMVl8cJ83KkUPi26hG+S9sszCRZmrPM76E
 vPPABm+jTqjSC5jQcVYcjaEhhLPCT4iq66m+F58Quw66C/StWY/c0W2LZNC6d6ul
 U6lrU8T/ycOo/IH9WrANRiDuAudbsPDC/riLZpyOUAe3muSirxAUFC0Mg40wdHMN
 vcDD4PoOruDFoUEn9vOvmuHYkKLSY8HZvmU6SVqx51ZJlPDlpp/z3GjjI81ftleL
 w6/FyEFuu9kzF4D7BJ2K8DnvUiMBXq+hC9bLfX4nQUXE2JHDIExphVurTtyPNrR+
 7V56TO4=
 =7Gso
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/mcayland/tags/qemu-sparc-20210207' into staging

qemu-sparc queue

# gpg: Signature made Sun 07 Feb 2021 22:09:12 GMT
# gpg:                using RSA key CC621AB98E82200D915CC9C45BC2C56FAE0F321F
# gpg:                issuer "mark.cave-ayland@ilande.co.uk"
# gpg: Good signature from "Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>" [full]
# Primary key fingerprint: CC62 1AB9 8E82 200D 915C  C9C4 5BC2 C56F AE0F 321F

* remotes/mcayland/tags/qemu-sparc-20210207:
  utils/fifo8: add VMSTATE_FIFO8_TEST macro
  utils/fifo8: change fatal errors from abort() to assert()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-02-08 09:23:53 +00:00
David Gibson
651615d92d s390: Recognize confidential-guest-support option
At least some s390 cpu models support "Protected Virtualization" (PV),
a mechanism to protect guests from eavesdropping by a compromised
hypervisor.

This is similar in function to other mechanisms like AMD's SEV and
POWER's PEF, which are controlled by the "confidential-guest-support"
machine option.  s390 is a slightly special case, because we already
supported PV, simply by using a CPU model with the required feature
(S390_FEAT_UNPACK).

To integrate this with the option used by other platforms, we
implement the following compromise:

 - When the confidential-guest-support option is set, s390 will
   recognize it, verify that the CPU can support PV (failing if not)
   and set virtio default options necessary for encrypted or protected
   guests, as on other platforms.  i.e. if confidential-guest-support
   is set, we will either create a guest capable of entering PV mode,
   or fail outright.

 - If confidential-guest-support is not set, guests might still be
   able to enter PV mode, if the CPU has the right model.  This may be
   a little surprising, but shouldn't actually be harmful.

To start a guest supporting Protected Virtualization using the new
option use the command line arguments:
    -object s390-pv-guest,id=pv0 -machine confidential-guest-support=pv0

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
2021-02-08 16:57:38 +11:00
David Gibson
9f88a7a3df confidential guest support: Alter virtio default properties for protected guests
The default behaviour for virtio devices is not to use the platforms normal
DMA paths, but instead to use the fact that it's running in a hypervisor
to directly access guest memory.  That doesn't work if the guest's memory
is protected from hypervisor access, such as with AMD's SEV or POWER's PEF.

So, if a confidential guest mechanism is enabled, then apply the
iommu_platform=on option so it will go through normal DMA mechanisms.
Those will presumably have some way of marking memory as shared with
the hypervisor or hardware so that DMA will work.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
2021-02-08 16:57:38 +11:00
David Gibson
6742eefc93 spapr: PEF: prevent migration
We haven't yet implemented the fairly involved handshaking that will be
needed to migrate PEF protected guests.  For now, just use a migration
blocker so we get a meaningful error if someone attempts this (this is the
same approach used by AMD SEV).

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
2021-02-08 16:57:38 +11:00
David Gibson
6c8ebe30ea spapr: Add PEF based confidential guest support
Some upcoming POWER machines have a system called PEF (Protected
Execution Facility) which uses a small ultravisor to allow guests to
run in a way that they can't be eavesdropped by the hypervisor.  The
effect is roughly similar to AMD SEV, although the mechanisms are
quite different.

Most of the work of this is done between the guest, KVM and the
ultravisor, with little need for involvement by qemu.  However qemu
does need to tell KVM to allow secure VMs.

Because the availability of secure mode is a guest visible difference
which depends on having the right hardware and firmware, we don't
enable this by default.  In order to run a secure guest you need to
create a "pef-guest" object and set the confidential-guest-support
property to point to it.

Note that this just *allows* secure guests, the architecture of PEF is
such that the guest still needs to talk to the ultravisor to enter
secure mode.  Qemu has no direct way of knowing if the guest is in
secure mode, and certainly can't know until well after machine
creation time.

To start a PEF-capable guest, use the command line options:
    -object pef-guest,id=pef0 -machine confidential-guest-support=pef0

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
2021-02-08 16:57:38 +11:00
David Gibson
64d19f3334 confidential guest support: Update documentation
Now that we've implemented a generic machine option for configuring various
confidential guest support mechanisms:
  1. Update docs/amd-memory-encryption.txt to reference this rather than
     the earlier SEV specific option
  2. Add a docs/confidential-guest-support.txt to cover the generalities of
     the confidential guest support scheme

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
2021-02-08 16:57:38 +11:00
David Gibson
ec78e2cda3 confidential guest support: Move SEV initialization into arch specific code
While we've abstracted some (potential) differences between mechanisms for
securing guest memory, the initialization is still specific to SEV.  Given
that, move it into x86's kvm_arch_init() code, rather than the generic
kvm_init() code.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
2021-02-08 16:57:38 +11:00
David Gibson
abc27d4241 confidential guest support: Introduce cgs "ready" flag
The platform specific details of mechanisms for implementing
confidential guest support may require setup at various points during
initialization.  Thus, it's not really feasible to have a single cgs
initialization hook, but instead each mechanism needs its own
initialization calls in arch or machine specific code.

However, to make it harder to have a bug where a mechanism isn't
properly initialized under some circumstances, we want to have a
common place, late in boot, where we verify that cgs has been
initialized if it was requested.

This patch introduces a ready flag to the ConfidentialGuestSupport
base type to accomplish this, which we verify in
qemu_machine_creation_done().

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
2021-02-08 16:57:38 +11:00
David Gibson
c9f5aaa6bc sev: Add Error ** to sev_kvm_init()
This allows failures to be reported richly and idiomatically.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
2021-02-08 16:57:38 +11:00