Commit Graph

68206 Commits

Author SHA1 Message Date
Peter Maydell 3b5b6e9b51 pci, pc, virtio: features, fixes, cleanups
intel-iommu scalable option
 pcie acs emulation
 beginning for vhost-user-blk reconnect and of vhost-user backend work
 misc fixes and cleanups
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJciHBSAAoJECgfDbjSjVRpoxkH/2NvGGZo+fSAIjVcEOe9BKZx
 XeI4X51QnqOqur3GktoHQzpMYCGxYy653AE69aoO1JVOXsoJS2py0SKw5VIa9bnh
 BeZwXGmf1/rySC+iFc5oSNxHv7vS2o40ccwrkeKoqbbzrnLPIYQs/yyfJG/m0HtS
 xj0zSN6rTY8xxiJYVQftav3ylqInIr3d14WoJcIP3ksiOVtuQ1yjDJnJdKCZvLMk
 4dtFuQJpownQrOZ0jfXXvpWu2VUC2ZuBd4ylTK3IiqBRjfaU4/wIq6ySMsU1evLy
 chcAykqY0jt5nz339K2HgquUtcuE3LsKi3igqTZMKi2vb3SLQFnPBO0DUyjXvGg=
 =gusE
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

pci, pc, virtio: features, fixes, cleanups

intel-iommu scalable option
pcie acs emulation
beginning for vhost-user-blk reconnect and of vhost-user backend work
misc fixes and cleanups

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Wed 13 Mar 2019 02:52:02 GMT
# gpg:                using RSA key 281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream: (26 commits)
  i386, acpi: check acpi_memory_hotplug capacity in pre_plug
  gen_pcie_root_port: Add ACS (Access Control Services) capability
  pcie: Add a simple PCIe ACS (Access Control Services) helper function
  vhost-user-blk: Add support to get/set inflight buffer
  libvhost-user: Support tracking inflight I/O in shared memory
  libvhost-user: Introduce vu_queue_map_desc()
  libvhost-user: Remove unnecessary FD flag check for event file descriptors
  vhost-user: Support transferring inflight buffer between qemu and backend
  nvdimm: use NVDIMM_ACPI_IO_LEN for the proper IO size
  nvdimm: use *function* directly instead of allocating it again
  nvdimm: fix typo in nvdimm_build_nvdimm_devices argument
  intel_iommu: add scalable-mode option to make scalable mode work
  intel_iommu: add 256 bits qi_desc support
  intel_iommu: scalable mode emulation
  libvhost-user: add vu_queue_unpop()
  libvhost-user-glib: export vug_source_new()
  vhost-user: split vhost_user_read()
  vhost-user: wrap some read/write with retry handling
  libvhost-user: exit by default on VHOST_USER_NONE
  vhost-user: simplify vhost_user_init/vhost_user_cleanup
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-03-13 19:10:40 +00:00
Peter Maydell 523a2a42c3 Pull request
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+ber27ys35W+dsvQfe+BBqr8OQ4FAlyIFSwACgkQfe+BBqr8
 OQ7wghAAm16eCEr57oTO7QXR3y8uVFsKqXBn9cNH6nbrFp2PUQSglwMDKBls1Z5m
 olF23X/JaqSlSmkL9BBuzDZ6Up+kkHKuxPq4/5RKXfiDI0pr3R0eqts0COAlaN9q
 Bew3ipj99m8gzMi2093AW4+Ob0N3658fuDTGLe1M1Uoy7CEg1QJ7rVOBBEui7vIl
 RbZ8l/Zmb4ldNpB3lnE4Nh9ue8fy0RAj3Nai161nCnNeXNF/VzD3Ye8bojSBbnux
 PIMX6/RWmykX4feIf9QP8apDpxX4HkyuPq5EdwT9PD8PwdyXPAXZtsYUNCuNtQuk
 n5VKFVgFYgqUclBeVHmrMYPU4K4iCFQp4/Fua7wzPEC0iG05NiiDv91oVkEJCp3L
 ManHeuGfNLCcXaIntKZhuJl1cK8yMM3yDww6/pPTehrPjcyvKa0NOqhQBExektcD
 R6q7maJRzFaxSxdcs+Zzuog9zESvH1mlJxQCKzeYhAP0kkxInyTELE/Vbx37xuqR
 RFfZYyVQ6x87Q/sxHx4EMiV97WUM8elZOQdSEC/okt5WUUNpgIu0WF9nSQ1VKZ8C
 CZmv5xh9ogfwvB/kOm6IVwNkLvVagJQcLwddORI5LLXLbSIUcuwVSuyMp/7iDtQ/
 hnHkGs2mIJ2JUYbSSNsSJNs6oTurn8eTFCeGoYKJgd9l4QxaThU=
 =ekU+
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/jnsnow/tags/bitmaps-pull-request' into staging

Pull request

# gpg: Signature made Tue 12 Mar 2019 20:23:08 GMT
# gpg:                using RSA key F9B7ABDBBCACDF95BE76CBD07DEF8106AAFC390E
# gpg: Good signature from "John Snow (John Huston) <jsnow@redhat.com>" [full]
# Primary key fingerprint: FAEB 9711 A12C F475 812F  18F2 88A9 064D 1835 61EB
#      Subkey fingerprint: F9B7 ABDB BCAC DF95 BE76  CBD0 7DEF 8106 AAFC 390E

* remotes/jnsnow/tags/bitmaps-pull-request: (22 commits)
  tests/qemu-iotests: add bitmap resize test 246
  block/qcow2-bitmap: Allow resizes with persistent bitmaps
  block/qcow2-bitmap: Don't check size for IN_USE bitmap
  docs/interop/qcow2: Improve bitmap flag in_use specification
  bitmaps: Fix typo in function name
  block/dirty-bitmaps: implement inconsistent bit
  block/dirty-bitmaps: disallow busy bitmaps as merge source
  block/dirty-bitmaps: prohibit removing readonly bitmaps
  block/dirty-bitmaps: prohibit readonly bitmaps for backups
  block/dirty-bitmaps: add block_dirty_bitmap_check function
  block/dirty-bitmap: add inconsistent status
  block/dirty-bitmaps: add inconsistent bit
  iotests: add busy/recording bit test to 124
  blockdev: remove unused paio parameter documentation
  block/dirty-bitmaps: move comment block
  block/dirty-bitmaps: unify qmp_locked and user_locked calls
  block/dirty-bitmap: explicitly lock bitmaps with successors
  nbd: change error checking order for bitmaps
  block/dirty-bitmap: change semantics of enabled predicate
  block/dirty-bitmap: remove set/reset assertions against enabled bit
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

# Conflicts:
#	tests/qemu-iotests/group
2019-03-13 17:30:34 +00:00
Peter Maydell 36fe770966 Block layer patches:
- file-posix: Make auto-read-only dynamic
 - Add x-blockdev-reopen QMP command
 - Finalize block-latency-histogram QMP command
 - gluster: Build fixes for newer lib version
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJciAjXAAoJEH8JsnLIjy/WwMcP/3VawpEBO4I94gp7KNUO1yZa
 rW7Rk0VO0gcvjk4fyTpQ1I3U2dfX6NwZLFMrk4oUS382QcKL9ky/TXVeeCaWxYxi
 51OH6+wHQKu1MuAjM9acXRD59pfOwmI6wbKAgrungeFzHF3TvCYcLD0rY9Mhz1wp
 Q7Oqkk2au6cFrmqZChCF2S5guZc0JOuwzd+LdDshRNDek2Px8a3etVq37VBUuxzK
 WDvIws1IZkFI5y2WE3T8kn7YJ8NgMZ1p47tgkymDX7fkn3V766tec8ZYBy1Qz9ab
 +I3UlzijuXB8vq+egEtzQfJvvTyoPrb65VFjW94ITu9onuclYo1oV5XVgx2c/NiR
 WnUagbu9nft1E4+zmSrVB3Y4I7Pbwi+At/2L2dMQXIrrebK50Cqg8GW2fthhq/KM
 5NavsqgdH14gOGS1yUGu06J0HO87XiJUKta4Th9M6iKvcGJqZ+F1WZGSiVEhhk1W
 w0FSmWdB/XdUwWoWdQnfx8d43OK34Q3spqsAe59DvKKyORw8Uug1i43yWvSepwSf
 SILmRYLOpMIrKUihh9NPIxB6QCzv/Mt7WjSEtaif+EXgQIGZhoQBjWpde0h5Dwh5
 DrVz6NqDNnz0VDARPq2hgWOs4RlNpMvx5eZFJcK66RsOfLom3CL7mpnVaa+jd6e6
 Nf0HPh/ubFB0IeEYbV0n
 =dQt+
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block layer patches:

- file-posix: Make auto-read-only dynamic
- Add x-blockdev-reopen QMP command
- Finalize block-latency-histogram QMP command
- gluster: Build fixes for newer lib version

# gpg: Signature made Tue 12 Mar 2019 19:30:31 GMT
# gpg:                using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream: (28 commits)
  qemu-iotests: Test the x-blockdev-reopen QMP command
  block: Add an 'x-blockdev-reopen' QMP command
  block: Remove the AioContext parameter from bdrv_reopen_multiple()
  block: Add bdrv_reset_options_allowed()
  block: Add a 'mutable_opts' field to BlockDriver
  block: Allow changing the backing file on reopen
  block: Allow omitting the 'backing' option in certain cases
  block: Handle child references in bdrv_reopen_queue()
  block: Add 'keep_old_opts' parameter to bdrv_reopen_queue()
  block: Freeze the backing chain for the duration of the stream job
  block: Freeze the backing chain for the duration of the mirror job
  block: Freeze the backing chain for the duration of the commit job
  block: Allow freezing BdrvChild links
  nvme: fix write zeroes offset and count
  file-posix: Make auto-read-only dynamic
  file-posix: Prepare permission code for fd switching
  file-posix: Lock new fd in raw_reopen_prepare()
  file-posix: Store BDRVRawState.reopen_state during reopen
  file-posix: Factor out raw_reconfigure_getfd()
  file-posix: Fix bdrv_open_flags() for snapshot=on
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-03-13 14:44:28 +00:00
Peter Maydell f39901d59f Break out documentation to docs/devel/.
Add support for pattern groups.
 Other misc cleanups for multiple decode functions.
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJch+V5AAoJEGTfOOivfiFfsFsH/1KW6UWAiieZ1+HPYEp24Ku8
 hCNxlfj0iKe1ZEuC8qp2U27GzePi71IlIJ7p5AuAhiTQBBWz8bPzJJUALm3EliaI
 V4/13fLnTYALnPWoUJclU5frdHBhpIWxFUtnLdB50dSU1cTbFFyS+63JsW3wSSXt
 UqntlhSsAmAQr3ULnKufwDZQJgQoft/8G4YzvMOm/7E0ZeV3B9mARAkn6m/30gLx
 nXgLI2OQrA1ATLeTfzNRup9G+YjLx0nW2LRhAseIWcQAW8PyfJsfW6tJeou93+bf
 fK6BkLMgor74QH37Y3u7KVJGJ04u2Gtu0p2JzBA9MU/0l07WihWPA0eJGnP396I=
 =BxBC
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/rth/tags/pull-dt-20190312' into staging

Break out documentation to docs/devel/.
Add support for pattern groups.
Other misc cleanups for multiple decode functions.

# gpg: Signature made Tue 12 Mar 2019 16:59:37 GMT
# gpg:                using RSA key 64DF38E8AF7E215F
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full]
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A  05C0 64DF 38E8 AF7E 215F

* remotes/rth/tags/pull-dt-20190312:
  decodetree: Properly diagnose fields overflowing an insn
  decodetree: Prefix extract function names with decode_function
  decodetree: Allow +- to begin a number initializing a field
  decodetree: Produce clean output for an empty input file
  decodetree: Add --static-decode option
  test/decode: Add tests for PatternGroups
  decodetree: Allow grouping of overlapping patterns
  decodetree: Do not unconditionaly return from Pattern.output_code
  decodetree: Ensure build_tree does not include values outside insnmask
  decodetree: Document the usefulness of argument sets
  decodetree: Move documentation to docs/devel/decodetree.rst
  MAINTAINERS: Add scripts/decodetree.py to the TCG section

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-03-13 13:09:38 +00:00
Peter Maydell 9298a4e8a6 Misc fixes affecting HP-UX 10.20.
-----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJch9tgAAoJEGTfOOivfiFfKtoH/RzN42QeTfH60VrqsxIcOQ3t
 6vuDzO071zCPUMWLpIizhGH2FDRbUS2H4oihtoaoZza0IllewdhjXr4EabmXBlbz
 GD1louz73+AvOJ88dmfCC8tvEVW0Ga+MgljDBGwNAPPTEXlYUZGYGq8r4Mv9sxHu
 TGNkb4Q/nSsIXeaSaXUHECVJNJ43o+naCgoO6odTgjcN5Dr99t5Uk8GK2wgfLMaE
 iz+c9EsEDPtPnJ7vmwbS/sJuS5D6fqBX2Ag0UNvNlJObImZfkxDi6g4h/VKDHrpa
 Ycxoog+EaCBV1iVkHhl8GV8hwzNFJ0EMFcK95Bf0KxcOVbL6jh59uINR2zVjS7M=
 =pjDE
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/rth/tags/pull-hppa-20190312' into staging

Misc fixes affecting HP-UX 10.20.

# gpg: Signature made Tue 12 Mar 2019 16:16:32 GMT
# gpg:                using RSA key 64DF38E8AF7E215F
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full]
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A  05C0 64DF 38E8 AF7E 215F

* remotes/rth/tags/pull-hppa-20190312:
  target/hppa: exit TB if either Data or Instruction TLB changes
  target/hppa: add TLB protection id check
  target/hppa: allow multiple itlbp without itlba
  target/hppa: fix b,gate instruction
  target/hppa: ignore DIAG opcode
  target/hppa: remove PSW I/R/Q bit check
  target/hppa: add TLB trace events
  target/hppa: report ITLB_EXCP_MISS for ITLB misses
  target/hppa: fix TLB handling for page 0
  target/hppa: fix overwriting source reg in addb
  target/hppa: Check for page crossings in use_goto_tb

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-03-13 09:33:41 +00:00
Wei Yang 9040e6dfa8 i386, acpi: check acpi_memory_hotplug capacity in pre_plug
Currently we do device realization like below:

   hotplug_handler_pre_plug()
   dc->realize()
   hotplug_handler_plug()

Before we do device realization and plug, we should allocate necessary
resources and check if memory-hotplug-support property is enabled.

At the piix4 and ich9, the memory-hotplug-support property is checked at
plug stage. This means that device has been realized and mapped into guest
address space 'pc_dimm_plug()' by the time acpi plug handler is called,
where it might fail and crash QEMU due to reaching g_assert_not_reached()
(piix4) or error_abort (ich9).

Fix it by checking if memory hotplug is enabled at pre_plug stage
where we can gracefully abort hotplug request.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
CC: Igor Mammedov <imammedo@redhat.com>
CC: Eric Blake <eblake@redhat.com>
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>

Message-Id: <20190301033548.6691-1-richardw.yang@linux.intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-12 22:31:21 -04:00
Knut Omang e07fb4b50b gen_pcie_root_port: Add ACS (Access Control Services) capability
Claim ACS support in the generic PCIe root port to allow
passthrough of individual functions of a device to different
guests (in a nested virt.setting) with VFIO.
Without this patch, all functions of a device, such as all VFs of
an SR/IOV device, will end up in the same IOMMU group.
A similar situation occurs on Windows with Hyper-V.

In the single function device case, it also has a small cosmetic
benefit in that the root port itself is not grouped with
the device. VFIO handles that situation in that binding rules
only apply to endpoints, so it does not limit passthrough in
those cases.

Signed-off-by: Knut Omang <knut.omang@oracle.com>
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Message-Id: <319460b483f566dd57487eb3dd340ed4c10aa53c.1550768238.git-series.knut.omang@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
2019-03-12 22:31:21 -04:00
Knut Omang db891a9ba3 pcie: Add a simple PCIe ACS (Access Control Services) helper function
Implementing an ACS capability on downstream ports and multifunction
endpoints indicates isolation and IOMMU visibility to a finer
granularity. This creates smaller IOMMU groups in the guest and thus
more flexibility in assigning endpoints to guest userspace or an L2
guest.

Signed-off-by: Knut Omang <knut.omang@oracle.com>
Message-Id: <07489975121696f5573b0a92baaf3486ef51e35d.1550768238.git-series.knut.omang@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
2019-03-12 22:31:21 -04:00
Xie Yongji a1fe0b8f27 vhost-user-blk: Add support to get/set inflight buffer
This patch adds support for vhost-user-blk device to get/set
inflight buffer from/to backend.

Signed-off-by: Xie Yongji <xieyongji@baidu.com>
Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
Message-Id: <20190228085355.9614-6-xieyongji@baidu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-12 22:31:21 -04:00
Xie Yongji 5f9ff1eff3 libvhost-user: Support tracking inflight I/O in shared memory
This patch adds support for VHOST_USER_GET_INFLIGHT_FD and
VHOST_USER_SET_INFLIGHT_FD message to set/get shared buffer
to/from qemu. Then backend can track inflight I/O in this buffer.

Signed-off-by: Xie Yongji <xieyongji@baidu.com>
Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
Message-Id: <20190228085355.9614-5-xieyongji@baidu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-12 22:31:21 -04:00
Xie Yongji f7671f3d49 libvhost-user: Introduce vu_queue_map_desc()
Introduce vu_queue_map_desc() which should be
independent with vu_queue_pop();

Signed-off-by: Xie Yongji <xieyongji@baidu.com>
Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20190228085355.9614-4-xieyongji@baidu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-12 22:31:21 -04:00
Xie Yongji 792468cc6f libvhost-user: Remove unnecessary FD flag check for event file descriptors
The vu_check_queue_msg_file() has checked the FD flag. So let's
delete the redundant check after it.

Signed-off-by: Xie Yongji <xieyongji@baidu.com>
Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
Message-Id: <20190228085355.9614-3-xieyongji@baidu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-12 22:31:21 -04:00
Xie Yongji 5ad204bf2a vhost-user: Support transferring inflight buffer between qemu and backend
This patch introduces two new messages VHOST_USER_GET_INFLIGHT_FD
and VHOST_USER_SET_INFLIGHT_FD to support transferring a shared
buffer between qemu and backend.

Firstly, qemu uses VHOST_USER_GET_INFLIGHT_FD to get the
shared buffer from backend. Then qemu should send it back
through VHOST_USER_SET_INFLIGHT_FD each time we start vhost-user.

This shared buffer is used to track inflight I/O by backend.
Qemu should retrieve a new one when vm reset.

Signed-off-by: Xie Yongji <xieyongji@baidu.com>
Signed-off-by: Chai Wen <chaiwen@baidu.com>
Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
Message-Id: <20190228085355.9614-2-xieyongji@baidu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-12 22:31:21 -04:00
Wei Yang 1b8fff5758 nvdimm: use NVDIMM_ACPI_IO_LEN for the proper IO size
The IO range is defined to 4 bytes with NVDIMM_ACPI_IO_LEN, so it is
more proper to use this macro instead of calculating it by sizeof.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <20190227075101.6263-4-richardw.yang@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2019-03-12 22:31:21 -04:00
Wei Yang ac265cacdd nvdimm: use *function* directly instead of allocating it again
At the beginning or nvdimm_build_common_dsm(), variable *function* is
already allocated for Arg2.

This patch reuse variable *function* instead of allocating it again.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <20190227075101.6263-3-richardw.yang@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2019-03-12 22:31:21 -04:00
Wei Yang b096c11458 nvdimm: fix typo in nvdimm_build_nvdimm_devices argument
>From dsm_dma_arrea to dsm_dma_area.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <20190227075101.6263-2-richardw.yang@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2019-03-12 22:31:21 -04:00
Yi Sun 4a4f219e8a intel_iommu: add scalable-mode option to make scalable mode work
This patch adds an option to provide flexibility for user to expose
Scalable Mode to guest. User could expose Scalable Mode to guest by
the config as below:

"-device intel-iommu,caching-mode=on,scalable-mode=on"

The Linux iommu driver has supported scalable mode. Please refer below
patch set:

    https://www.spinics.net/lists/kernel/msg2985279.html

Signed-off-by: Liu, Yi L <yi.l.liu@intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Message-Id: <1551753295-30167-4-git-send-email-yi.y.sun@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-12 22:31:21 -04:00
Liu, Yi L c0c1d35184 intel_iommu: add 256 bits qi_desc support
Per Intel(R) VT-d 3.0, the qi_desc is 256 bits in Scalable
Mode. This patch adds emulation of 256bits qi_desc.

Signed-off-by: Liu, Yi L <yi.l.liu@intel.com>
[Yi Sun is co-developer to rebase and refine the patch.]
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <1551753295-30167-3-git-send-email-yi.y.sun@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-12 22:31:21 -04:00
Liu, Yi L fb43cf739e intel_iommu: scalable mode emulation
Intel(R) VT-d 3.0 spec introduces scalable mode address translation to
replace extended context mode. This patch extends current emulator to
support Scalable Mode which includes root table, context table and new
pasid table format change. Now intel_iommu emulates both legacy mode
and scalable mode (with legacy-equivalent capability set).

The key points are below:
1. Extend root table operations to support both legacy mode and scalable
   mode.
2. Extend context table operations to support both legacy mode and
   scalable mode.
3. Add pasid tabled operations to support scalable mode.

Signed-off-by: Liu, Yi L <yi.l.liu@intel.com>
[Yi Sun is co-developer to contribute much to refine the whole commit.]
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Message-Id: <1551753295-30167-2-git-send-email-yi.y.sun@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2019-03-12 22:31:21 -04:00
Marc-André Lureau b13919ab64 libvhost-user: add vu_queue_unpop()
vhost-user-input will make use of this function to undo some queue pop
in case the virtio queue does not have enough room.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20190308140454.32437-11-marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-12 22:31:21 -04:00
Marc-André Lureau 922ef483ec libvhost-user-glib: export vug_source_new()
Simplify the creation of FD sources for other users. This is just
convenience to avoid duplicating similar code elsewhere.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20190308140454.32437-10-marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-12 22:31:21 -04:00
Marc-André Lureau 9af84c02e2 vhost-user: split vhost_user_read()
Split vhost_user_read(), so only header can be read with
vhost_user_read_header().

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20190308140454.32437-8-marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-12 22:31:21 -04:00
Marc-André Lureau 917d7dd72a vhost-user: wrap some read/write with retry handling
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20190308140454.32437-6-marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-12 21:22:31 -04:00
Marc-André Lureau 6079865526 libvhost-user: exit by default on VHOST_USER_NONE
Since commit 2566378d6d, libvhost-user
no longer panics on disconnect (rc == 0), and instead silently ignores
an invalid VHOST_USER_NONE message.

Without extra work from the API user, this will simply busy-loop on
HUP events. The obvious thing to do is to exit(0) instead, while
additional or different work can be done by overriding
iface->process_msg().

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
Message-Id: <20190308140454.32437-5-marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-12 21:22:31 -04:00
Marc-André Lureau 0b99f22461 vhost-user: simplify vhost_user_init/vhost_user_cleanup
Take a VhostUserState* that can be pre-allocated, and initialize it
with the associated chardev.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
Message-Id: <20190308140454.32437-4-marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-12 21:22:31 -04:00
Marc-André Lureau 482580a658 vhost-user: define conventions for vhost-user backends
As discussed during "[PATCH v4 00/29] vhost-user for input & GPU"
review, let's define a common set of backend conventions to help with
management layer implementation, and interoperability.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20190308140454.32437-3-marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-12 21:22:31 -04:00
Marc-André Lureau ba275e9d28 libvhost-user: fix clang enum-conversion warning
Now that the VhostUserMsg.request field is used for both master &
slave requests, since commit d84599f56c820d8c1ac9928a76500dcdfbbf194d:

contrib/libvhost-user/libvhost-user.c:953:20: error: implicit conversion from enumeration type 'enum VhostUserSlaveRequest' to different enumeration type 'VhostUserRequest' (aka 'enum VhostUserRequest') [-Werror,-Wenum-conversion]
        .request = VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG,
                   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20190308140454.32437-2-marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-12 21:22:31 -04:00
David Gibson 596546fe9e virtio-balloon: Restore MADV_WILLNEED hint on balloon deflate
Prior to f6deb6d9 "virtio-balloon: Remove unnecessary MADV_WILLNEED on
deflate", the balloon device issued an madvise() MADV_WILLNEED on
pages removed from the balloon.  That would hint to the host kernel
that the pages were likely to be needed by the guest in the near
future.

It's unclear if this is actually valuable or not, and so f6deb6d9
removed this, essentially ignoring balloon deflate requests.  However,
concerns have been raised that this might cause a performance
regression by causing extra latency for the guest in certain
configurations.

So, until we can get actual benchmark data to see if that's the case,
this restores the old behaviour, issuing a MADV_WILLNEED when a page is
removed from the balloon.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190306030601.21986-4-david@gibson.dropbear.id.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-12 21:22:31 -04:00
David Gibson b27b323914 virtio-balloon: Fix possible guest memory corruption with inflates & deflates
This fixes a balloon bug with a nasty consequence - potentially
corrupting guest memory - but which is extremely unlikely to be
triggered in practice.

The balloon always works in 4kiB units, but the host could have a
larger page size on certain platforms.  Since ed48c59 "virtio-balloon:
Safely handle BALLOON_PAGE_SIZE < host page size" we've handled this
by accumulating requests to balloon 4kiB subpages until they formed a
full host page.  Since f6deb6d "virtio-balloon: Remove unnecessary
MADV_WILLNEED on deflate" we essentially ignore deflate requests.

Suppose we have a host with 8kiB pages, and one host page has subpages
A & B.  If we get this sequence of events -
	inflate A
	deflate A
	inflate B
- the current logic will discard the whole host page.  That's
incorrect because the guest has deflated subpage A, and could have
written important data to it.

This patch fixes the problem by adjusting our state information about
partially ballooned host pages when deflate requests are received.

Fixes: ed48c59 "virtio-balloon: Safely handle BALLOON_PAGE_SIZE < host page size"

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190306030601.21986-3-david@gibson.dropbear.id.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
2019-03-12 21:22:31 -04:00
David Gibson 301cf2a8dd virtio-balloon: Don't mismatch g_malloc()/free (CID 1399146)
ed48c59875 "virtio-balloon: Safely handle BALLOON_PAGE_SIZE < host
page size" introduced a new temporary data structure which tracks 4kiB
chunks which have been inserted into the balloon by the guest but
don't yet form a full host page which we can discard.

Unfortunately, I had a thinko and allocated that structure with
g_malloc0() but freed it with a plain free() rather than g_free().
This corrects the problem.

Fixes: ed48c59875
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190306030601.21986-2-david@gibson.dropbear.id.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
2019-03-12 21:22:31 -04:00
Wei Wang ae440bd14c virtio-balloon: fix a use-after-free case
The elem could theorically contain both outbuf and inbufs. We move the
free operation to the end of this function to avoid using elem->in_sg
while elem has been freed.

Fixes: c13c4153f7
("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT")
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
CC: Michael S. Tsirkin <mst@redhat.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Peter Xu <peterx@redhat.com>
Message-Id: <1552383280-4122-1-git-send-email-wei.w.wang@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-12 21:22:31 -04:00
Peter Maydell 3f3bbfc7ce - qtest patches
- One SD patch (with Reviewed-by from the maintainer)
 - One license fix patch
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJch3X+AAoJEC7Z13T+cC21gY4P/iaLcbYHWYxghqNooA7W+sAQ
 JRMzYgMZ/7W6P5WrDUhv6wtPMU2BiALEaHwcPQUtASpsfEHnEPZ/9xPbG7aPRXet
 xE9x5Xpuc0CiHOGNBjfzhlRWQgsVuDn7uiv7aj48ZJmXYb4SI/MT2FKGByXHb2ie
 E6E92NdjLvY0qJnm7A4TLMKyPdIKG8zLcqQbPz+fpD3bqI7Do4XtCN11i35yKsy9
 Co2v+mRxUiZho7f5QuKYD/pU9DVaHO1Ra4fAZhE+LajUeFwpEWZPkZameB8Bkyzk
 l3CE2XgYvrTWspb1N1VX5M+3WHyozfwF15lPnkm3ANxX32bg6TNc0JSs0udtr+WW
 RnJLAc1G68qRAor1bE8SxS1yQ8p3VDbiQMVc/ogBxpbJEefyb5n0CcsAgi0pkVNt
 Uf+CMk2qc9sqXsGFBJCsXBTNnb/fJ/MX41s3SJOxesXpsS9uqXJcK1sa4inY7Waw
 366MliFaYYhJyANvzuQVi+onENUq8kw3tCHQb7dieKe7A73f84CqBhtKbn88JUae
 3ZpcqpjVwLMO3gLVsDRUrRJy23RId4FeqJZAhVbC3K+eggTDh2SWQEhgryfvcY/O
 BRc1L3pifblyzxI74wGlgdSpNFqjE2YeTM5HkBiNrIsU0gNmRGoorZJE1TOYp/HY
 /EzPnw2s1YhrEO+RylvR
 =aFrM
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/huth-gitlab/tags/pull-request-2019-03-12' into staging

- qtest patches
- One SD patch (with Reviewed-by from the maintainer)
- One license fix patch

# gpg: Signature made Tue 12 Mar 2019 09:03:58 GMT
# gpg:                using RSA key 2ED9D774FE702DB5
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg:                 aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg:                 aka "Thomas Huth <huth@tuxfamily.org>" [full]
# gpg:                 aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3  EAB9 2ED9 D774 FE70 2DB5

* remotes/huth-gitlab/tags/pull-request-2019-03-12:
  scripts/qemugdb: re-license timers.py to GPLv2 or later
  hw/sd/sdhci: Move PCI-related code into a separate file
  ahci-test: Drop dependence on global_qtest
  tests: test-announce-self: fix memory leak

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-03-12 21:06:26 +00:00
Alberto Garcia bf3e50f623 qemu-iotests: Test the x-blockdev-reopen QMP command
This patch adds several tests for the x-blockdev-reopen QMP command.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-03-12 20:30:14 +01:00
Alberto Garcia 1479c730c7 block: Add an 'x-blockdev-reopen' QMP command
This command allows reopening an arbitrary BlockDriverState with a
new set of options. Some options (e.g node-name) cannot be changed
and some block drivers don't allow reopening, but otherwise this
command is modelled after 'blockdev-add' and the state of the reopened
BlockDriverState should generally be the same as if it had just been
added by 'blockdev-add' with the same set of options.

One notable exception is the 'backing' option: 'x-blockdev-reopen'
requires that it is always present unless the BlockDriverState in
question doesn't have a current or default backing file.

This command allows reconfiguring the graph by using the appropriate
options to change the children of a node. At the moment it's possible
to change a backing file by setting the 'backing' option to the name
of the new node, but it should also be possible to add a similar
functionality to other block drivers (e.g. Quorum, blkverify).

Although the API is unlikely to change, this command is marked
experimental for the time being so there's room to see if the
semantics need changes.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-03-12 20:30:14 +01:00
Alberto Garcia 5019aece2a block: Remove the AioContext parameter from bdrv_reopen_multiple()
This parameter has been unused since 1a63a90750

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-03-12 20:30:14 +01:00
Alberto Garcia faf116b438 block: Add bdrv_reset_options_allowed()
bdrv_reopen_prepare() receives a BDRVReopenState with (among other
things) a new set of options to be applied to that BlockDriverState.

If an option is missing then it means that we want to reset it to its
default value rather than keep the previous one. This way the state
of the block device after being reopened is comparable to that of a
device added with "blockdev-add" using the same set of options.

Not all options from all drivers can be changed this way, however.
If the user attempts to reset an immutable option to its default value
using this method then we must forbid it.

This new function takes a BlockDriverState and a new set of options
and checks if there's any option that was previously set but is
missing from the new set of options.

If the option is present in both sets we don't need to check that they
have the same value. The loop at the end of bdrv_reopen_prepare()
already takes care of that.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-03-12 20:30:14 +01:00
Alberto Garcia 8a2ce0bc1e block: Add a 'mutable_opts' field to BlockDriver
If we reopen a BlockDriverState and there is an option that is present
in bs->options but missing from the new set of options then we have to
return an error unless the driver is able to reset it to its default
value.

This patch adds a new 'mutable_opts' field to BlockDriver. This is
a list of runtime options that can be modified during reopen. If an
option in this list is unspecified on reopen then it must be reset (or
return an error).

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-03-12 20:30:14 +01:00
Alberto Garcia cb828c31de block: Allow changing the backing file on reopen
This patch allows the user to change the backing file of an image that
is being reopened. Here's what it does:

 - In bdrv_reopen_prepare(): check that the value of 'backing' points
   to an existing node or is null. If it points to an existing node it
   also needs to make sure that replacing the backing file will not
   create a cycle in the node graph (i.e. you cannot reach the parent
   from the new backing file).

 - In bdrv_reopen_commit(): perform the actual node replacement by
   calling bdrv_set_backing_hd().

There may be temporary implicit nodes between a BDS and its backing
file (e.g. a commit filter node). In these cases bdrv_reopen_prepare()
looks for the real (non-implicit) backing file and requires that the
'backing' option points to it. Replacing or detaching a backing file
is forbidden if there are implicit nodes in the middle.

Although x-blockdev-reopen is meant to be used like blockdev-add,
there's an important thing that must be taken into account: the only
way to set a new backing file is by using a reference to an existing
node (previously added with e.g. blockdev-add).  If 'backing' contains
a dictionary with a new set of options ({"driver": "qcow2", "file": {
... }}) then it is interpreted that the _existing_ backing file must
be reopened with those options.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-03-12 20:30:14 +01:00
Alberto Garcia bacd9b87c4 block: Allow omitting the 'backing' option in certain cases
Of all options of type BlockdevRef used to specify children in
BlockdevOptions, 'backing' is the only one that is optional.

For "x-blockdev-reopen" we want that if an option is omitted then it
must be reset to its default value. The default value of 'backing'
means that QEMU opens the backing file specified in the image
metadata, but this is not something that we want to support for the
reopen operation.

Because of this the 'backing' option has to be specified during
reopen, pointing to the existing backing file if we want to keep it,
or pointing to a different one (or NULL) if we want to replace it (to
be implemented in a subsequent patch).

In order to simplify things a bit and not to require that the user
passes the 'backing' option to every single block device even when
it's clearly not necessary, this patch allows omitting this option if
the block device being reopened doesn't have a backing file attached
_and_ no default backing file is specified in the image metadata.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-03-12 20:30:14 +01:00
Alberto Garcia 8546632e61 block: Handle child references in bdrv_reopen_queue()
Children in QMP are specified with BlockdevRef / BlockdevRefOrNull,
which can contain a set of child options, a child reference, or
NULL. In optional attributes like "backing" it can also be missing.

Only the first case (set of child options) is being handled properly
by bdrv_reopen_queue(). This patch deals with all the others.

Here's how these cases should be handled when bdrv_reopen_queue() is
deciding what to do with each child of a BlockDriverState:

   1) Set of child options: if the child was implicitly created (i.e
      inherits_from points to the parent) then the options are removed
      from the parent's options QDict and are passed to the child with
      a recursive bdrv_reopen_queue() call. This case was already
      working fine.

   2) Child reference: there's two possibilites here.

      2a) Reference to the current child: if the child was implicitly
          created then it is put in the reopen queue, keeping its
          current set of options (since this was a child reference
          there was no way to specify a different set of options).
          If the child is not implicit then it keeps its current set
          of options but it is not reopened (and therefore does not
          inherit any new option from the parent).

      2b) Reference to a different BDS: the current child is not put
          in the reopen queue at all. Passing a reference to a
          different BDS can be used to replace a child, although at
          the moment no driver implements this, so it results in an
          error. In any case, the current child is not going to be
          reopened (and might in fact disappear if it's replaced)

   3) NULL: This is similar to (2b). Although no driver allows this
      yet it can be used to detach the current child so it should not
      be put in the reopen queue.

   4) Missing option: at the moment "backing" is the only case where
      this can happen. With "blockdev-add", leaving "backing" out
      means that the default backing file is opened. We don't want to
      open a new image during reopen, so we require that "backing" is
      always present. We'll relax this requirement a bit in the next
      patch. If keep_old_opts is true and "backing" is missing then
      this behaves like 2a (the current child is reopened).

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-03-12 20:30:14 +01:00
Alberto Garcia 077e8e2018 block: Add 'keep_old_opts' parameter to bdrv_reopen_queue()
The bdrv_reopen_queue() function is used to create a queue with
the BDSs that are going to be reopened and their new options. Once
the queue is ready bdrv_reopen_multiple() is called to perform the
operation.

The original options from each one of the BDSs are kept, with the new
options passed to bdrv_reopen_queue() applied on top of them.

For "x-blockdev-reopen" we want a function that behaves much like
"blockdev-add". We want to ignore the previous set of options so that
only the ones actually specified by the user are applied, with the
rest having their default values.

One of the things that we need is a way to tell bdrv_reopen_queue()
whether we want to keep the old set of options or not, and that's what
this patch does. All current callers are setting this new parameter to
true and x-blockdev-reopen will set it to false.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-03-12 20:30:14 +01:00
Alberto Garcia 6585493369 block: Freeze the backing chain for the duration of the stream job
Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-03-12 20:30:14 +01:00
Alberto Garcia ef53dc09ed block: Freeze the backing chain for the duration of the mirror job
Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-03-12 20:30:14 +01:00
Alberto Garcia df827336ab block: Freeze the backing chain for the duration of the commit job
Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-03-12 20:30:14 +01:00
Alberto Garcia 2cad1ebe70 block: Allow freezing BdrvChild links
Our permission system is useful to define what operations are allowed
on a certain block node and includes things like BLK_PERM_WRITE or
BLK_PERM_RESIZE among others.

One of the permissions is BLK_PERM_GRAPH_MOD which allows "changing
the node that this BdrvChild points to". The exact meaning of this has
never been very clear, but it can be understood as "change any of the
links connected to the node". This can be used to prevent changing a
backing link, but it's too coarse.

This patch adds a new 'frozen' attribute to BdrvChild, which forbids
detaching the link from the node it points to, and new API to freeze
and unfreeze a backing chain.

After this change a few functions can fail, so they need additional
checks.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-03-12 20:30:14 +01:00
Keith Busch 9d6459d21a nvme: fix write zeroes offset and count
The implementation used blocks units rather than the expected bytes.

Fixes: c03e7ef12a ("nvme: Implement Write Zeroes")
Reported-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-03-12 20:30:14 +01:00
Kevin Wolf 23dece19da file-posix: Make auto-read-only dynamic
Until now, with auto-read-only=on we tried to open the file read-write
first and if that failed, read-only was tried. This is actually not good
enough for libvirt, which gives QEMU SELinux permissions for read-write
only as soon as it actually intends to write to the image. So we need to
be able to switch between read-only and read-write at runtime.

This patch makes auto-read-only dynamic, i.e. the file is opened
read-only as long as no user of the node has requested write
permissions, but it is automatically reopened read-write as soon as the
first writer is attached. Conversely, if the last writer goes away, the
file is reopened read-only again.

bs->read_only is no longer set for auto-read-only=on files even if the
file descriptor is opened read-only because it will be transparently
upgraded as soon as a writer is attached. This changes the output of
qemu-iotests 232.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-03-12 20:30:14 +01:00
Kevin Wolf 6ceabe6f77 file-posix: Prepare permission code for fd switching
In order to be able to dynamically reopen the file read-only or
read-write, depending on the users that are attached, we need to be able
to switch to a different file descriptor during the permission change.

This interacts with reopen, which also creates a new file descriptor and
performs permission changes internally. In this case, the permission
change code must reuse the reopen file descriptor instead of creating a
third one.

In turn, reopen can drop its code to copy file locks to the new file
descriptor because that is now done when applying the new permissions.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-03-12 20:30:14 +01:00
Kevin Wolf a6aeca0ca5 file-posix: Lock new fd in raw_reopen_prepare()
There is no reason why we can take locks on the new file descriptor only
in raw_reopen_commit() where error handling isn't possible any more.
Instead, we can already do this in raw_reopen_prepare().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-03-12 20:30:14 +01:00
Kevin Wolf e0c9cf3a48 file-posix: Store BDRVRawState.reopen_state during reopen
We'll want to access the file descriptor in the reopen_state while
processing permission changes in the context of the repoen.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-03-12 20:30:14 +01:00