Commit Graph

84473 Commits

Author SHA1 Message Date
Dmitry Fomichev
62e8faa468 hw/block/nvme: Add Commands Supported and Effects log
Implementing this log page is necessary to allow checking for
Zone Append command support in the Zoned Namespace Command Set.

This commit adds the code to report this log page for NVM Command
Set only. The parts that are specific to zoned operation will be
added later in the series.

All incoming admin and i/o commands are now processed only if their
corresponding support bits are set in this log. This provides an
easy way to control which commands are supported, depending on the
CC.CSS setting.
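
The gating described above can be pictured with a minimal Python sketch. The opcode values follow the NVM command set in the NVMe specification; the names `CSUPP`, `build_effects_log` and `dispatch_io` are illustrative assumptions and are not the identifiers used in hw/block/nvme.

```python
# Illustrative model only: all names here are hypothetical, not QEMU's.
CSUPP = 1 << 0  # "Command Supported" bit of a log entry

# NVM command set opcodes (per the NVMe specification).
NVME_CMD_FLUSH = 0x00
NVME_CMD_WRITE = 0x01
NVME_CMD_READ = 0x02


def build_effects_log(supported_opcodes):
    """Return a 256-entry table with CSUPP set for each supported opcode."""
    log = [0] * 256
    for opcode in supported_opcodes:
        log[opcode] |= CSUPP
    return log


def dispatch_io(log, opcode):
    """Process a command only if its support bit is set in the log."""
    if not log[opcode] & CSUPP:
        return "INVALID_OPCODE"
    return "OK"
```

Swapping in a different table per command set is then enough to change which commands the controller accepts.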

Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08 20:58:32 +01:00
Dmitry Fomichev
3ec1d547a5 hw/block/nvme: Combine nvme_write_zeroes() and nvme_write()
Move write processing to nvme_do_write() that now handles both WRITE
and WRITE ZEROES. Both nvme_write() and nvme_write_zeroes() become
inline helper functions.
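
The shape of this refactoring can be sketched in Python as follows. This is a hypothetical model operating on a `bytearray`, not the QEMU code; only the idea of one shared write path with two thin wrappers is taken from the message above.

```python
# Hypothetical model of the refactoring; not the QEMU implementation.
def do_write(storage, offset, length, data=None, zeroes=False):
    """Shared write path: `zeroes` selects WRITE ZEROES behaviour."""
    payload = bytes(length) if zeroes else data
    if len(payload) != length:
        raise ValueError("payload length does not match request")
    storage[offset:offset + length] = payload


def write(storage, offset, data):
    """WRITE: thin wrapper around the shared path."""
    do_write(storage, offset, len(data), data=data)


def write_zeroes(storage, offset, length):
    """WRITE ZEROES: thin wrapper around the shared path."""
    do_write(storage, offset, length, zeroes=True)
```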

Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08 18:55:48 +01:00
Dmitry Fomichev
13a7b6539d hw/block/nvme: Separate read and write handlers
The majority of code in nvme_rw() is becoming read- or write-specific.
Move these parts to two separate handlers, nvme_read() and nvme_write()
to make the code more readable and to remove multiple is_write checks
that have been present in the i/o path.

This is a refactoring patch, no change in functionality.

Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08 18:55:48 +01:00
Dmitry Fomichev
b52f26cd1f hw/block/nvme: Generate namespace UUIDs
In NVMe 1.4, a namespace must report an ID descriptor of UUID type
if it doesn't support EUI64 or NGUID. Add a new namespace property,
"uuid", that provides the user the option to either specify the UUID
explicitly or have a UUID generated automatically every time a
namespace is initialized.
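
As a rough illustration of the "explicit or auto-generated" behaviour, here is a small Python sketch. It assumes an unset property is modelled as `None`; the function name `resolve_namespace_uuid` is hypothetical and the actual QEMU property handling is not shown.

```python
import uuid


def resolve_namespace_uuid(uuid_property=None):
    """Return the explicit UUID if one was given, else a random one."""
    if uuid_property is None:
        # No value supplied: generate a fresh random (version 4) UUID.
        return uuid.uuid4()
    # Value supplied: parse it, raising ValueError if it is malformed.
    return uuid.UUID(uuid_property)
```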

Suggested-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08 18:55:48 +01:00
Dmitry Fomichev
ba69f22481 hw/block/nvme: Process controller reset and shutdown differently
Controller reset and subsystem shutdown are handled very much the same
in the current code, but some of the steps should be different in these
two cases.

Introduce two new functions, nvme_reset_ctrl() and nvme_shutdown_ctrl(),
to separate some portions of the code from nvme_clear_ctrl(). The steps
that are made different between reset and shutdown are that BAR.CC is not
reset to zero upon the shutdown and namespace data is flushed to
backing storage as a part of shutdown handling, but not upon reset.
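
The two differences listed above can be modelled with a short Python sketch. The class and attribute names are assumptions made for illustration; only the behaviour (CC is cleared on reset but kept on shutdown, data is flushed on shutdown only) comes from this message.

```python
class Controller:
    """Toy model of the reset/shutdown split; not the QEMU data structure."""

    def __init__(self):
        self.cc = 0x1            # pretend the controller is enabled
        self.queues = ["sq0"]    # stand-in for queue state
        self.flushed = False

    def _clear(self):
        """Teardown common to both paths."""
        self.queues = []

    def reset(self):
        self._clear()
        self.cc = 0              # reset clears CC; nothing is flushed

    def shutdown(self):
        self._clear()
        self.flushed = True      # shutdown flushes namespace data
        # CC is deliberately left untouched on shutdown
```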

Suggested-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08 18:55:48 +01:00
Klaus Jensen
e1f81c1478 hw/block/nvme: fix bad clearing of CAP
Commit 37712e00b1 ("hw/block/nvme: factor out pmr setup") changed the
control flow such that the CAP register is erroneously cleared after
nvme_init_pmr() has configured it. Since the entire NvmeCtrl structure
is zero-filled initially, there is no need for the explicit clearing, so
just remove it.

Fixes: 37712e00b1 ("hw/block/nvme: factor out pmr setup")
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
2021-02-08 18:55:48 +01:00
Gollu Appalanaidu
0a384f923f hw/block/nvme: add compare command
Add the Compare command.

This implementation uses a bounce buffer to read in the data from
storage and then compare with the host supplied buffer.
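
A minimal Python sketch of the bounce-buffer approach, assuming the backing storage is a `bytearray` and using made-up status strings in place of NVMe status codes:

```python
def compare(storage, offset, host_buf):
    """Read the stored data into a bounce buffer, then compare it."""
    # Bounce buffer: a private copy of the data read from storage.
    bounce = bytes(storage[offset:offset + len(host_buf)])
    if bounce != bytes(host_buf):
        return "COMPARE_FAILURE"
    return "SUCCESS"
```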

Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
[k.jensen: rebased]
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-02-08 18:55:48 +01:00
Klaus Jensen
2605257a26 hw/block/nvme: add the dataset management command
Add support for the Dataset Management command and the Deallocate
attribute. Deallocation results in discards being sent to the underlying
block device. Whether or not the blocks are actually deallocated is
affected by the same factors as Write Zeroes (see previous commit).

     format | discard | dsm (512B)  dsm (4KiB)  dsm (64KiB)
    --------------------------------------------------------
      qcow2    ignore   n           n           n
      qcow2    unmap    n           n           y
      raw      ignore   n           n           n
      raw      unmap    n           y           y

Again, a raw format and 4KiB LBAs are preferable.

In order to set the Namespace Preferred Deallocate Granularity and
Alignment fields (NPDG and NPDA), choose a sane minimum discard
granularity of 4KiB. If we are using a passthru device supporting
discard at a 512B granularity, the user should set the discard_granularity
property explicitly. NPDG and NPDA will also account for the
cluster_size of the block driver if required (i.e. for QCOW2).
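
The granularity selection described in this paragraph can be sketched as below. This is an approximation under stated assumptions: an unset `discard_granularity` is modelled as `None`, a block driver without clusters reports `cluster_size=0`, and the result is given as a plain count of logical blocks. How the value is encoded in the Identify Namespace fields is not covered.

```python
MIN_DISCARD_GRANULARITY = 4 * 1024  # the "sane minimum" of 4KiB


def deallocate_granularity_blocks(lbs, discard_granularity=None,
                                  cluster_size=0):
    """Return the preferred deallocate granularity in logical blocks."""
    if discard_granularity is None:
        # No explicit setting: never go below 4KiB.
        discard_granularity = max(lbs, MIN_DISCARD_GRANULARITY)
    # A larger cluster size (e.g. qcow2) takes precedence.
    granularity = max(discard_granularity, cluster_size)
    return granularity // lbs
```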

See NVM Express 1.3d, Section 6.7 ("Dataset Management command").

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-02-08 18:55:48 +01:00
Klaus Jensen
6fd704a59a nvme: add namespace I/O optimization fields to shared header
This adds the NPWG, NPWA, NPDG, NPDA and NOWS family of fields to the
shared nvme.h header for use by later patches.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Fam Zheng <fam@euphon.net>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
2021-02-08 18:55:48 +01:00
Klaus Jensen
54064e51d1 hw/block/nvme: add dulbe support
Add support for reporting the Deallocated or Unwritten Logical Block
Error (DULBE).

Rely on the block status flags reported by the block layer and consider
any block with the BDRV_BLOCK_ZERO flag to be deallocated.

Multiple factors affect when a Write Zeroes command results in
deallocation of blocks.

  * the underlying file system block size
  * the blockdev format
  * the 'discard' and 'logical_block_size' parameters

     format | discard | wz (512B)  wz (4KiB)  wz (64KiB)
    -----------------------------------------------------
      qcow2    ignore   n          n          y
      qcow2    unmap    n          n          y
      raw      ignore   n          y          y
      raw      unmap    n          y          y

So, this works best with an image in raw format and 4KiB LBAs, since
holes can then be punched on a per-block basis (this assumes a file
system with a 4KiB block size, YMMV). A qcow2 image uses a cluster size
of 64KiB by default and blocks will only be marked deallocated if a full
cluster is zeroed or discarded. However, this *is* consistent with the
spec since Write Zeroes "should" deallocate the block if the Deallocate
attribute is set and "may" deallocate if the Deallocate attribute is not
set. Thus, we always try to deallocate (the BDRV_REQ_MAY_UNMAP flag is
always set).
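
The check itself can be pictured with a small Python sketch. It assumes a per-block list of status flags and a `dulbe_enabled` switch; `BLOCK_ZERO` is a stand-in value and the status strings are invented, so none of this reflects the real block-layer API.

```python
BLOCK_ZERO = 0x2  # stand-in for the block layer's "reads as zero" flag


def check_dulbe(block_flags, slba, nlb, dulbe_enabled=True):
    """Fail a read if any block in [slba, slba + nlb) is deallocated."""
    if not dulbe_enabled:
        return "SUCCESS"
    for flags in block_flags[slba:slba + nlb]:
        if flags & BLOCK_ZERO:
            return "DEALLOCATED_OR_UNWRITTEN"
    return "SUCCESS"
```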

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-02-08 18:55:48 +01:00
Klaus Jensen
54eea8d947 hw/block/nvme: pull aio error handling
Add a new function, nvme_aio_err, to handle errors resulting from AIOs
and use this from the callbacks.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2021-02-08 18:55:48 +01:00
Klaus Jensen
c519d9d55e hw/block/nvme: remove superfluous NvmeCtrl parameter
nvme_check_bounds has no use of the NvmeCtrl parameter; remove it.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
2021-02-08 18:55:47 +01:00
Peter Maydell
4f799257b3 QAPI patches for 2021-02-08

Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2021-02-08' into staging

QAPI patches for 2021-02-08

# gpg: Signature made Mon 08 Feb 2021 13:54:26 GMT
# gpg:                using RSA key 354BC8B3D7EB2A6B68674E5F3870B400EB918653
# gpg:                issuer "armbru@redhat.com"
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [full]
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>" [full]
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* remotes/armbru/tags/pull-qapi-2021-02-08:
  qapi: enable strict-optional checks
  qapi: type 'info' as Optional[QAPISourceInfo]
  qapi/gen: Drop support for QAPIGen without a file name
  qapi/commands: Simplify command registry generation
  qapi/gen: Support switching to another module temporarily
  qapi/gen: write _genc/_genh access shims
  qapi: centralize the built-in module name definition
  qapi/gen: Combine ._add_[user|system]_module
  qapi: use './builtin' as the built-in module name
  qapi: use explicitly internal module names
  qapi/gen: Replace ._begin_system_module()
  qapi: centralize is_[user|system|builtin]_module methods
  qapi/gen: inline _wrap_ifcond into end_if()
  qapi/main: handle theoretical None-return from re.match()
  qapi/events: fix visit_event typing
  qapi/commands: assert arg_type is not None

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-02-08 16:12:21 +00:00
John Snow
c51172667b qapi: enable strict-optional checks
In the modules that we are checking so far, we can be stricter about the
difference between Optional[T] and T types. Enable that check.
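
As a generic illustration (not taken from the QAPI scripts), this is the kind of code the stricter check affects. With strict-optional checking, mypy rejects calling `len(name)` on an `Optional[str]` until the `None` case has been ruled out; the function name is made up.

```python
from typing import Optional


def name_length(name: Optional[str]) -> int:
    """Return the length of `name`, treating None as empty."""
    if name is None:
        return 0
    # Here mypy has narrowed `name` from Optional[str] to str.
    return len(name)
```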

Enabling it now will assist review on further typing and cleanup work.

Signed-off-by: John Snow <jsnow@redhat.com>
Message-Id: <20210201193747.2169670-17-jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2021-02-08 14:15:58 +01:00
John Snow
4a82e468e7 qapi: type 'info' as Optional[QAPISourceInfo]
For everything typed so far, type this parameter as
Optional[QAPISourceInfo].

In the most generic case, QAPISchemaEntity's info field may be None to
represent types that come from built-in definitions. Although some
Entity types may not currently have any built-in definitions, it is not
easily possible to constrain the type except on an ad-hoc basis using
assertions.

It's easier and simpler, then, to just say it's always an Optional type.

Signed-off-by: John Snow <jsnow@redhat.com>
Message-Id: <20210201193747.2169670-16-jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2021-02-08 14:15:58 +01:00
Markus Armbruster
cc0747f6b7 qapi/gen: Drop support for QAPIGen without a file name
The previous commit removed the only user of QAPIGen(None).  Tighten
the type hint.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-Id: <20210201193747.2169670-15-jsnow@redhat.com>
2021-02-08 14:15:58 +01:00
Markus Armbruster
c6cd7e4151 qapi/commands: Simplify command registry generation
QAPISchemaGenCommandVisitor.visit_command() needs to generate the
marshalling function into the current module, and also generate its
registration into the ./init system module.  The latter is done
somewhat awkwardly: .__init__() creates a QAPIGenCCode that will not
be written out, each .visit_command() adds its registration to it, and
.visit_end() copies its contents into the ./init module it creates.

Instead provide the means to temporarily switch to another module.
Create the ./init module in .visit_begin(), and generate its initial
part.  Add registrations to it in .visit_command().  Finish it in
.visit_end().

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-Id: <20210201193747.2169670-14-jsnow@redhat.com>
2021-02-08 14:15:58 +01:00
Markus Armbruster
d921d27c1b qapi/gen: Support switching to another module temporarily
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-Id: <20210201193747.2169670-13-jsnow@redhat.com>
[Commit message tweaked]
2021-02-08 14:15:58 +01:00
John Snow
fd9b160384 qapi/gen: write _genc/_genh access shims
Many places assume they can access these fields without checking them
first to ensure they are defined. Eliminating the _genc and _genh fields
and replacing them with functional properties that check for correct
state can ease the typing overhead by eliminating the Optional[T] return
type.
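
The pattern can be sketched as follows. The class and the backing attribute `_current_genc` are hypothetical; the point is only that a property which checks state can return `str` rather than `Optional[str]`, so callers need no `None` handling.

```python
from typing import Optional


class Visitor:
    """Toy visitor showing a state-checking property shim."""

    def __init__(self) -> None:
        self._current_genc: Optional[str] = None

    @property
    def _genc(self) -> str:
        # Fail loudly if accessed before the state is set up.
        assert self._current_genc is not None
        return self._current_genc
```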

Signed-off-by: John Snow <jsnow@redhat.com>
Message-Id: <20210201193747.2169670-12-jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2021-02-08 14:15:58 +01:00
John Snow
39b2d838f1 qapi: centralize the built-in module name definition
Use a constant to make it obvious we're referring to a very specific thing.

Signed-off-by: John Snow <jsnow@redhat.com>
Message-Id: <20210201193747.2169670-11-jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2021-02-08 14:15:58 +01:00
Markus Armbruster
4ab0ff6da0 qapi/gen: Combine ._add_[user|system]_module
With callers to _add_system_module now explicitly using the './' prefix
to indicate a system module, there is no longer any reason to have
separate interfaces for adding system vs user modules; use a unified
interface that differentiates based on the name.
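
A minimal sketch of such a unified interface, assuming modules are tracked in a plain dictionary; the function names and the stored values are illustrative, not the generator's actual API.

```python
def is_system_module(name: str) -> bool:
    """System modules are identified by a './' prefix."""
    return name.startswith('./')


def add_module(modules: dict, name: str) -> None:
    """Single entry point: the kind is derived from the name."""
    modules[name] = 'system' if is_system_module(name) else 'user'
```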

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-Id: <20210201193747.2169670-10-jsnow@redhat.com>
2021-02-08 14:15:58 +01:00
John Snow
e2bbc4eaa7 qapi: use './builtin' as the built-in module name
Use './builtin' as the built-in module name instead of
None. Clarify the typing that this is now always a string.

Signed-off-by: John Snow <jsnow@redhat.com>
Message-Id: <20210201193747.2169670-9-jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2021-02-08 14:15:58 +01:00
John Snow
12893a8ea7 qapi: use explicitly internal module names
QAPISchemaModularCVisitor._add_system_module() prefixes './' to its name
argument to make it a module name.  Pass the module name instead.  This
will allow us to coalesce the methods to add modules later on.

Signed-off-by: John Snow <jsnow@redhat.com>
Message-Id: <20210201193747.2169670-8-jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Commit message reworded]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2021-02-08 14:15:58 +01:00
Markus Armbruster
f3a705928a qapi/gen: Replace ._begin_system_module()
QAPISchemaModularCVisitor._begin_system_module() is actually just for
the builtin module.  Rename it to ._begin_builtin_module() and drop
its useless @name parameter.

Clarify conditionals in visit_module to make this clear.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-Id: <20210201193747.2169670-7-jsnow@redhat.com>
2021-02-08 14:15:58 +01:00
John Snow
98967c248c qapi: centralize is_[user|system|builtin]_module methods
Define what a module is and define what kind of a module it is once and
for all, in one place.

Signed-off-by: John Snow <jsnow@redhat.com>
Message-Id: <20210201193747.2169670-6-jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2021-02-08 14:15:58 +01:00
John Snow
a253b3eb9a qapi/gen: inline _wrap_ifcond into end_if()
We assert _start_if is not None in end_if, but that's opaque to mypy.
By inlining _wrap_ifcond, that constraint becomes provable to mypy.

Signed-off-by: John Snow <jsnow@redhat.com>
Message-Id: <20210201193747.2169670-5-jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2021-02-08 14:15:58 +01:00
John Snow
ad1218086e qapi/main: handle theoretical None-return from re.match()
Mypy cannot understand that this match can never be None, so help it
along.
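
For illustration, one common way to help mypy in this situation is an explicit assertion. The sketch below uses a made-up pattern and function name and is not necessarily what this commit does.

```python
import re


def leading_word(text: str) -> str:
    """Return the leading run of word characters (possibly empty)."""
    match = re.match(r'\w*', text)
    # r'\w*' also matches the empty string, so this cannot be None at
    # runtime, but re.match() is typed as returning an Optional.
    assert match is not None
    return match.group(0)
```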

Signed-off-by: John Snow <jsnow@redhat.com>
Message-Id: <20210201193747.2169670-4-jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2021-02-08 14:15:58 +01:00
John Snow
3cc01c546b qapi/events: fix visit_event typing
Actually, the arg_type can indeed be Optional.

Signed-off-by: John Snow <jsnow@redhat.com>
Message-Id: <20210201193747.2169670-3-jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2021-02-08 14:15:58 +01:00
John Snow
ec9697ab3f qapi/commands: assert arg_type is not None
When boxed is True, expr.py asserts that we must have
arguments. Ultimately, this should mean that if boxed is True,
arg_type should be defined. Mypy cannot infer this, and does not support
'stateful' type inference, e.g.:

```
if x:
    assert y is not None

...

if x:
    y.etc()
```

does not work, because mypy does not statefully remember the conditional
assertion in the second block. Help mypy out by creating a new local
that it can track more easily.
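
A sketch of the "new local" workaround, with hypothetical names: the value that depends on the assertion is computed once, in the block where mypy can see the narrowing, and later code uses only the local.

```python
from typing import Optional


def gen_call(boxed: bool, arg_type: Optional[str]) -> str:
    """Build a call string; `argstr` carries the narrowed result."""
    argstr = ''
    if boxed:
        assert arg_type is not None
        argstr = arg_type + ' *arg, '
    # Later code no longer touches arg_type, so no second narrowing
    # of the Optional is needed.
    return 'call(' + argstr + 'errp)'
```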

Signed-off-by: John Snow <jsnow@redhat.com>
Message-Id: <20210201193747.2169670-2-jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2021-02-08 14:15:58 +01:00
Peter Maydell
8eef07b4d3 Testing, gdbstub and doc tweaks:

Merge remote-tracking branch 'remotes/stsquad/tags/pull-testing-gdbstub-docs-080221-1' into staging

Testing, gdbstub and doc tweaks:

  - increase timeout on replay kernel acceptance test
  - fixes for binfmt_misc docker images
  - better gdb version detection
  - don't silently skip gdb tests
  - fix for gdbstub auxv handling
  - cleaner handling of check-tcg on tcg disabled builds
  - expand vexpress/versatile docs with examples

# gpg: Signature made Mon 08 Feb 2021 11:12:03 GMT
# gpg:                using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [full]
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8  DF35 FBD0 DB09 5A9E 2A44

* remotes/stsquad/tags/pull-testing-gdbstub-docs-080221-1:
  docs/system: document an example booting the versatilepb machine
  docs/system: document an example vexpress-a15 invocation
  tests/Makefile.include: don't use TARGET_DIRS for check-tcg
  scripts/mtest2make.py: export all-%s-targets variable and use it
  tests/tcg: Replace /bin/true by true (required on macOS)
  gdbstub: Fix handle_query_xfer_auxv
  tests/tcg: don't silently skip the gdb tests
  configure: bump the minimum gdb version for check-tcg to 9.1
  configure: make version_ge more tolerant of shady version input
  tests/docker: add a docker-exec-copy-test
  tests/docker: alias docker-help target for consistency
  tests/docker: preserve original name when copying libs
  tests/docker: make _copy_with_mkdir accept missing files
  tests/docker: Fix typo in help message
  tests/docker: Fix _get_so_libs() for docker-binfmt-image
  tests/acceptance: Increase the timeout in the replay tests

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-02-08 13:00:54 +00:00
Peter Maydell
6f0e9c26db Generalize memory encryption models

Merge remote-tracking branch 'remotes/dg-gitlab/tags/cgs-pull-request' into staging

Generalize memory encryption models

A number of hardware platforms are implementing mechanisms whereby the
hypervisor does not have unfettered access to guest memory, in order
to mitigate the security impact of a compromised hypervisor.

AMD's SEV implements this with in-cpu memory encryption, and Intel has
its own memory encryption mechanism.  POWER has an upcoming mechanism
to accomplish this in a different way, using a new memory protection
level plus a small trusted ultravisor.  s390 also has a protected
execution environment.

The current code (committed or draft) for these features has each
platform's version configured entirely differently.  That doesn't seem
ideal for users, or particularly for management layers.

AMD SEV introduces a notionally generic machine option
"memory-encryption", but it doesn't actually cover any cases other
than SEV.

This series is a proposal to at least partially unify configuration
for these mechanisms, by renaming and generalizing AMD's
"memory-encryption" property.  It is replaced by a
"confidential-guest-support" property pointing to a platform specific
object which configures and manages the specific details.

Note to Ram Pai: the documentation I've included for PEF is very
minimal.  If you could send a patch expanding on that, it would be
very helpful.

Changes since v8:
 * Rebase
 * Fixed some cosmetic typos
Changes since v7:
 * Tweaked and clarified meaning of the 'ready' flag
 * Polished the interface to the PEF internals
 * Shifted initialization for s390 PV later (I hope I've finally got
   this after apply_cpu_model() where it needs to be)
Changes since v6:
 * Moved to using OBJECT_DECLARE_TYPE and OBJECT_DEFINE_TYPE macros
 * Assorted minor fixes
Changes since v5:
 * Renamed from "securable guest memory" to "confidential guest
   support"
 * Simpler reworking of x86 boot time flash encryption
 * Added a bunch of documentation
 * Fixed some compile errors on POWER
Changes since v4:
 * Renamed from "host trust limitation" to "securable guest memory",
   which I think is marginally more descriptive
 * Re-organized initialization, because the previous model called at
   kvm_init didn't work for s390
 * Assorted fixes to the s390 implementation; rudimentary testing
   (gitlab CI) only
Changes since v3:
 * Rebased
 * Added first cut at handling of s390 protected virtualization
Changes since RFCv2:
 * Rebased
 * Removed preliminary SEV cleanups (they've been merged)
 * Changed name to "host trust limitation"
 * Added migration blocker to the PEF code (based on SEV's version)
Changes since RFCv1:
 * Rebased
 * Fixed some errors pointed out by Dave Gilbert

# gpg: Signature made Mon 08 Feb 2021 06:07:27 GMT
# gpg:                using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full]
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full]
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full]
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown]
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dg-gitlab/tags/cgs-pull-request:
  s390: Recognize confidential-guest-support option
  confidential guest support: Alter virtio default properties for protected guests
  spapr: PEF: prevent migration
  spapr: Add PEF based confidential guest support
  confidential guest support: Update documentation
  confidential guest support: Move SEV initialization into arch specific code
  confidential guest support: Introduce cgs "ready" flag
  sev: Add Error ** to sev_kvm_init()
  confidential guest support: Rework the "memory-encryption" property
  confidential guest support: Move side effect out of machine_set_memory_encryption()
  sev: Remove false abstraction of flash encryption
  confidential guest support: Introduce new confidential guest support class
  qom: Allow optional sugar props

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-02-08 11:11:26 +00:00
Alex Bennée
d994cc5449 docs/system: document an example booting the versatilepb machine
There is a bit more out there including Aurelien's excellent write up
and older Debian images here:

  https://www.aurel32.net/info/debian_arm_qemu.php
  https://people.debian.org/~aurel32/qemu/armel/

However the web is transitory and git is forever so let's add something
to the fine manual.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Cc: Anders Roxell <anders.roxell@linaro.org>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <20210202134001.25738-16-alex.bennee@linaro.org>
2021-02-08 10:55:20 +00:00
Alex Bennée
a5dbb17507 docs/system: document an example vexpress-a15 invocation
The wiki and the web are curiously absent of the right runes to boot a
vexpress model so I had to work from first principles to work it out.
Use the more modern -drive notation so alternative backends can be
used (unlike the hardwired -sd mode).

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Cc: Anders Roxell <anders.roxell@linaro.org>
Message-Id: <20210202134001.25738-15-alex.bennee@linaro.org>
2021-02-08 10:55:20 +00:00
Alex Bennée
c401c058a1 tests/Makefile.include: don't use TARGET_DIRS for check-tcg
TARGET_DIRS reflects what we wanted to configure which in the normal
case is all our targets. However once meson has pared-down our target
list due to missing features we need to check the final list of
ninja-targets. This prevents check-tcg barfing on a --disable-tcg
build.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210202134001.25738-14-alex.bennee@linaro.org>
2021-02-08 10:55:20 +00:00
Alex Bennée
47e3424ac9 scripts/mtest2make.py: export all-%s-targets variable and use it
There are some places where the conditional makefile support is the
simplest solution. Now that we don't expose CONFIG_TCG as a variable,
create a new one that can be checked for the check-help output.

As check-tcg is a PHONY target we re-use check-softfloat to gate that
as well.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210202134001.25738-13-alex.bennee@linaro.org>
2021-02-08 10:55:20 +00:00
Stefan Weil
2a86d66be1 tests/tcg: Replace /bin/true by true (required on macOS)
/bin/true is missing on macOS, but simply "true" is available as a shell builtin.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210128135627.2067003-1-sw@weilnetz.de>
Message-Id: <20210202134001.25738-12-alex.bennee@linaro.org>
2021-02-08 10:55:20 +00:00
Richard Henderson
6e3dd75717 gdbstub: Fix handle_query_xfer_auxv
The main problem was that we were treating a guest address
as a host address with a mere cast.

Use the correct interface for accessing guest memory.  Do not
allow offset == auxv_len, which would result in an empty packet.

Fixes: 51c623b0de ("gdbstub: add support to Xfer:auxv:read: packet")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210128201831.534033-1-richard.henderson@linaro.org>
Message-Id: <20210202134001.25738-11-alex.bennee@linaro.org>
2021-02-08 10:55:15 +00:00
Alex Bennée
46bae04a86 tests/tcg: don't silently skip the gdb tests
Otherwise people won't know what they are missing.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210202134001.25738-10-alex.bennee@linaro.org>
2021-02-08 09:41:00 +00:00
Alex Bennée
d6a66c811e configure: bump the minimum gdb version for check-tcg to 9.1
For SVE, currently the bulk of the GDB TCG tests, we need at least GDB
9.1 to support the "ieee_half" data type we report. This only affects
when GDB tests are run; users can still use lower versions of gdb as
long as they aren't talking to an SVE-enabled model. The workaround
is to either get a newer gdb or disable SVE for their CPU model.

Reported-by: Claudio Fontana <cfontana@suse.de>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Cc: Luis Machado <luis.machado@linaro.org>
Message-Id: <20210202134001.25738-9-alex.bennee@linaro.org>
2021-02-08 09:41:00 +00:00
Alex Bennée
2df52b9bfd configure: make version_ge more tolerant of shady version input
When checking GDB versions we have to tolerate all sorts of random
distro extensions to the version string. While we already attempt to
do some of that before we call version_ge, it makes sense to try and
regularise the first input by stripping extraneous -'s. While we're at it,
convert the old-style shell quoting into a cleaner form to shut up my
editor's linter lest it confuse me by underlining the whole line.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210202134001.25738-8-alex.bennee@linaro.org>
2021-02-08 09:41:00 +00:00
Alex Bennée
ddd5ed8331 tests/docker: add a docker-exec-copy-test
This provides test machinery for checking the QEMU copying logic works
properly. It takes considerably less time to run than starting a
debootstrap only for it to fail later. I considered adding a remove
command to docker.py but figured that might be gold plating given the
relative size of the containers compared to the ones with actual stuff
in them.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210202134001.25738-7-alex.bennee@linaro.org>
2021-02-08 09:41:00 +00:00
Alex Bennée
6147c2495d tests/docker: alias docker-help target for consistency
We have a bunch of -help targets so this will save some cognitive
dissonance. Keep the original for those with muscle memory.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210202134001.25738-6-alex.bennee@linaro.org>
2021-02-08 09:41:00 +00:00
Alex Bennée
3971c70f15 tests/docker: preserve original name when copying libs
While it is important we chase down the symlinks to copy the correct
data we can confuse the kernel by renaming the interpreter to what is
in the binary. Extend _copy_with_mkdir to preserve the original name
of the file when asked.

Fixes: 5e33f7fead ("tests/docker: better handle symlinked libs")
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210202134001.25738-5-alex.bennee@linaro.org>
2021-02-08 09:41:00 +00:00
Alex Bennée
dffccf3d34 tests/docker: make _copy_with_mkdir accept missing files
Depending on the linker/ldd setup we might get a file with no path.
Typically this is the pseudo library linux-vdso.so which doesn't
actually exist on the disk. Rather than try and catch these distro
specific edge cases just shout about it and try and continue.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210202134001.25738-4-alex.bennee@linaro.org>
2021-02-08 09:41:00 +00:00
Philippe Mathieu-Daudé
dc23bbc3df tests/docker: Fix typo in help message
To have the variable properly passed, we need to set it,
i.e. NOUSER=1. Fix the message displayed by 'make docker'.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210119052120.522069-1-f4bug@amsat.org>
Message-Id: <20210202134001.25738-3-alex.bennee@linaro.org>
2021-02-08 09:41:00 +00:00
Philippe Mathieu-Daudé
4d8f630915 tests/docker: Fix _get_so_libs() for docker-binfmt-image
Fix a variable rename mistake from commit 5e33f7fead:

  Traceback (most recent call last):
    File "./tests/docker/docker.py", line 710, in <module>
      sys.exit(main())
    File "./tests/docker/docker.py", line 706, in main
      return args.cmdobj.run(args, argv)
    File "./tests/docker/docker.py", line 489, in run
      _copy_binary_with_libs(args.include_executable,
    File "./tests/docker/docker.py", line 149, in _copy_binary_with_libs
      libs = _get_so_libs(src)
    File "./tests/docker/docker.py", line 123, in _get_so_libs
      libs.append(s.group(1))
  NameError: name 's' is not defined

Fixes: 5e33f7fead ("tests/docker: better handle symlinked libs")
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210119050149.516910-1-f4bug@amsat.org>
Message-Id: <20210202134001.25738-2-alex.bennee@linaro.org>
2021-02-08 09:41:00 +00:00
Thomas Huth
36a7ab5f04 tests/acceptance: Increase the timeout in the replay tests
Our gitlab-CI just showed a failed test_ppc_mac99 since it was apparently
killed a few seconds before the test finished. Allow it some more
time to complete.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Acked-by: Pavel Dovgalyuk <pavel.dovgalyuk@ispras.ru>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210127065222.48650-1-thuth@redhat.com>
2021-02-08 09:40:51 +00:00
Peter Maydell
2766043345 qemu-sparc queue
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCgA8FiEEzGIauY6CIA2RXMnEW8LFb64PMh8FAmAgZQgeHG1hcmsuY2F2
 ZS1heWxhbmRAaWxhbmRlLmNvLnVrAAoJEFvCxW+uDzIfGowH/2JNIL3Uh0rLNMZ9
 wu5VPkZWPFHwXdbRiQBFZLi33JTYdkzMVl8cJ83KkUPi26hG+S9sszCRZmrPM76E
 vPPABm+jTqjSC5jQcVYcjaEhhLPCT4iq66m+F58Quw66C/StWY/c0W2LZNC6d6ul
 U6lrU8T/ycOo/IH9WrANRiDuAudbsPDC/riLZpyOUAe3muSirxAUFC0Mg40wdHMN
 vcDD4PoOruDFoUEn9vOvmuHYkKLSY8HZvmU6SVqx51ZJlPDlpp/z3GjjI81ftleL
 w6/FyEFuu9kzF4D7BJ2K8DnvUiMBXq+hC9bLfX4nQUXE2JHDIExphVurTtyPNrR+
 7V56TO4=
 =7Gso
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/mcayland/tags/qemu-sparc-20210207' into staging

qemu-sparc queue

# gpg: Signature made Sun 07 Feb 2021 22:09:12 GMT
# gpg:                using RSA key CC621AB98E82200D915CC9C45BC2C56FAE0F321F
# gpg:                issuer "mark.cave-ayland@ilande.co.uk"
# gpg: Good signature from "Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>" [full]
# Primary key fingerprint: CC62 1AB9 8E82 200D 915C  C9C4 5BC2 C56F AE0F 321F

* remotes/mcayland/tags/qemu-sparc-20210207:
  utils/fifo8: add VMSTATE_FIFO8_TEST macro
  utils/fifo8: change fatal errors from abort() to assert()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-02-08 09:23:53 +00:00
David Gibson
651615d92d s390: Recognize confidential-guest-support option
At least some s390 cpu models support "Protected Virtualization" (PV),
a mechanism to protect guests from eavesdropping by a compromised
hypervisor.

This is similar in function to other mechanisms like AMD's SEV and
POWER's PEF, which are controlled by the "confidential-guest-support"
machine option.  s390 is a slightly special case, because we already
supported PV, simply by using a CPU model with the required feature
(S390_FEAT_UNPACK).

To integrate this with the option used by other platforms, we
implement the following compromise:

 - When the confidential-guest-support option is set, s390 will
   recognize it, verify that the CPU can support PV (failing if not)
   and set virtio default options necessary for encrypted or protected
   guests, as on other platforms.  i.e. if confidential-guest-support
   is set, we will either create a guest capable of entering PV mode,
   or fail outright.

 - If confidential-guest-support is not set, guests might still be
   able to enter PV mode, if the CPU has the right model.  This may be
   a little surprising, but shouldn't actually be harmful.

To start a guest supporting Protected Virtualization using the new
option use the command line arguments:
    -object s390-pv-guest,id=pv0 -machine confidential-guest-support=pv0

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
2021-02-08 16:57:38 +11:00
David Gibson
9f88a7a3df confidential guest support: Alter virtio default properties for protected guests
The default behaviour for virtio devices is not to use the platform's normal
DMA paths, but instead to use the fact that it's running in a hypervisor
to directly access guest memory.  That doesn't work if the guest's memory
is protected from hypervisor access, such as with AMD's SEV or POWER's PEF.

So, if a confidential guest mechanism is enabled, then apply the
iommu_platform=on option so it will go through normal DMA mechanisms.
Those will presumably have some way of marking memory as shared with
the hypervisor or hardware so that DMA will work.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
2021-02-08 16:57:38 +11:00