Commit Graph

45814 Commits

Author SHA1 Message Date
Alexey Kardashevskiy
d78c19b5cf memory: Fix IOMMU replay base address
Since a788f227 "memory: Allow replay of IOMMU mapping notifications"
when new VFIO listener is added, all existing IOMMU mappings are
replayed. However there is a problem that the base address of
an IOMMU memory region (IOMMU MR) is ignored which is not a problem
for the existing user (which is pseries) with its default 32bit DMA
window starting at 0 but it is if there is another DMA window.

This stores the IOMMU's offset_within_address_space and adjusts
the IOVA before calling vfio_dma_map/vfio_dma_unmap.

As the IOMMU notifier expects IOVA offset rather than the absolute
address, this also adjusts IOVA in sPAPR H_PUT_TCE handler before
calling notifier(s).

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-05-26 11:12:08 -06:00
Alexey Kardashevskiy
7a057b4fb9 vfio: Fix 128 bit handling when deleting region
7532d3cbf "vfio: Fix 128 bit handling" added support for 64bit IOMMU
memory regions when those are added to VFIO address space; however
removing code cannot cope with these as int128_get64() will fail on
1<<64.

This copies 128bit handling from region_add() to region_del().

Since the only machine type which is actually going to use 64bit IOMMU
is pseries and it never really removes them (instead it will dynamically
add/remove subregions), this should cause no behavioral change.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-05-26 11:12:07 -06:00
Alex Williamson
0eb7342417 vfio/pci: Add IGD documentation
Document the usage modes, host primary graphics considerations, usage,
and fw_cfg ABI required for IGD assignment with vfio.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
2016-05-26 11:12:05 -06:00
Alex Williamson
6ced0bba70 vfio/pci: Add a separate option for IGD OpRegion support
The IGD OpRegion is enabled automatically when running in legacy mode,
but it can sometimes be useful in universal passthrough mode as well.
Without an OpRegion, output spigots don't work, and even though Intel
doesn't officially support physical outputs in UPT mode, it's a
useful feature.  Note that if an OpRegion is enabled but a monitor is
not connected, some graphics features will be disabled in the guest
versus a headless system without an OpRegion, where they would work.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
2016-05-26 11:12:03 -06:00
Alex Williamson
c4c45e943e vfio/pci: Intel graphics legacy mode assignment
Enable quirks to support SandyBridge and newer IGD devices as primary
VM graphics.  This requires new vfio-pci device specific regions added
in kernel v4.6 to expose the IGD OpRegion, the shadow ROM, and config
space access to the PCI host bridge and LPC/ISA bridge.  VM firmware
support, SeaBIOS only so far, is also required for reserving memory
regions for IGD specific use.  In order to enable this mode, IGD must
be assigned to the VM at PCI bus address 00:02.0, it must have a ROM,
it must be able to enable VGA, it must have or be able to create on
its own an LPC/ISA bridge of the proper type at PCI bus address
00:1f.0 (sorry, not compatible with Q35 yet), and it must have the
above noted vfio-pci kernel features and BIOS.  The intention is that
to enable this mode, a user simply needs to assign 00:02.0 from the
host to 00:02.0 in the VM:

  -device vfio-pci,host=0000:00:02.0,bus=pci.0,addr=02.0

and everything either happens automatically or it doesn't.  In the
case that it doesn't, we leave error reports, but assume the device
will operate in universal passthrough mode (UPT), which doesn't
require any of this, but has a much more narrow window of supported
devices, supported use cases, and supported guest drivers.

When using IGD in this mode, the VM firmware is required to reserve
some VM RAM for the OpRegion (on the order or several 4k pages) and
stolen memory for the GTT (up to 8MB for the latest GPUs).  An
additional option, x-igd-gms allows the user to specify some amount
of additional memory (value is number of 32MB chunks up to 512MB) that
is pre-allocated for graphics use.  TBH, I don't know of anything that
requires this or makes use of this memory, which is why we don't
allocate any by default, but the specification suggests this is not
actually a valid combination, so the option exists as a workaround.
Please report if it's actually necessary in some environment.

See code comments for further discussion about the actual operation
of the quirks necessary to assign these devices.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
2016-05-26 11:12:01 -06:00
Alex Williamson
581406e0e3 vfio/pci: Setup BAR quirks after capabilities probing
Capability probing modifies wmask, which quirks may be interested in
changing themselves.  Apply our BAR quirks after the capability scan
to make this possible.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
2016-05-26 11:12:00 -06:00
Alex Williamson
182bca4592 vfio/pci: Consolidate VGA setup
Combine VGA discovery and registration.  Quirks can have dependencies
on BARs, so the quirks push out until after we've scanned the BARs.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
2016-05-26 11:11:58 -06:00
Alex Williamson
4225f2b670 vfio/pci: Fix return of vfio_populate_vga()
This function returns success if either we setup the VGA region or
the host vfio doesn't return enough regions to support the VGA index.
This latter case doesn't make any sense.  If we're asked to populate
VGA, fail if it doesn't exist and let the caller decide if that's
important.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
2016-05-26 11:11:56 -06:00
Alex Williamson
e61a424f05 vfio: Create device specific region info helper
Given a device specific region type and sub-type, find it.  Also
cleanup return point on error in vfio_get_region_info() so that we
always return 0 with a valid pointer or -errno and NULL.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
2016-05-26 11:04:50 -06:00
Alex Williamson
b53b0f696b vfio: Enable sparse mmap capability
The sparse mmap capability in a vfio region info allows vfio to tell
us which sub-areas of a region may be mmap'd.  Thus rather than
assuming a single mmap covers the entire region and later frobbing it
ourselves for things like the PCI MSI-X vector table, we can read that
directly from vfio.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
2016-05-26 09:43:20 -06:00
Peter Maydell
2c56d06baf Block layer patches
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJXReG4AAoJEH8JsnLIjy/WKv0P/0Bevm3ahvJuStRdl+u0MgZ7
 9jM69CN3wxH8SVwCJ3zBndWVOrsAbXB+SFTu/0Dr+tq+dw901AN8lspzf1TFiYnL
 5M8B1syjOXUxuOcwl18ytuDu9aQhxbbCIC9pzWqxizm8N2nXP9r/VlNiztm4T2nn
 RV0DyD5XVoV2v3rpFPerv1EgOtO0lmfgSABwk31a5ftULdy1QzDOliLm/l7W7Oke
 xMFJ93OOBd/p0jmLrULM3V/tipRpxcnvijfVoLJb3895/0G7mJ1PCRYl3uokd+cy
 DxmWw9Fil/kfpwmQhpvpWeOeSkh/NXVNDH5fLfl7V1dPyYcs2xZOJz3mNtuMDU2N
 kBxkWtOQivr/q9cIK6sEacDRtTjouRi/DoccR6S8IXs9nY1fJoLBvXufo2mQrwsy
 5heUhrvZ+AD/eZVjQXQyIc9jCLltU12lW1Z+KrVO7HICcT/Z+UGqeROrRrhzjRRl
 iuO2iHXI7I4KSB2Xgrx5WMYZu8GbxHJmfe20O2bupJ5PHRHLF0ySZ0MW3Gbs1W5E
 5e5tU4D4x5JYJ1KYsMg2FsuejqMBh25coqywT7I3XOEOnN8opOPVKoHb86gZOjXD
 wckrSGHkX9rfvRgy+vG/d3RqQOvxtaH1kbWCHikVjHpnVPzz6uPPD1H8/T5s2IGB
 49PdfL0SIOQsl9U/3WPk
 =g40O
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block layer patches

# gpg: Signature made Wed 25 May 2016 18:32:40 BST using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"

* remotes/kevin/tags/for-upstream: (31 commits)
  blockjob: Remove BlockJob.bs
  commit: Use BlockBackend for I/O
  backup: Use BlockBackend for I/O
  backup: Remove bs parameter from backup_do_cow()
  backup: Pack Notifier within BackupBlockJob
  backup: Don't leak BackupBlockJob in error path
  mirror: Use BlockBackend for I/O
  mirror: Allow target that already has a BlockBackend
  stream: Use BlockBackend for I/O
  block: Make blk_co_preadv/pwritev() public
  block: Convert block job core to BlockBackend
  block: Default to enabled write cache in blk_new()
  block: Cancel jobs first in bdrv_close_all()
  block: keep a list of block jobs
  block: Rename blk_write_zeroes()
  dma-helpers: change BlockBackend to opaque value in DMAIOFunc
  dma-helpers: change interface to byte-based
  block: Propagate .drained_begin/end callbacks
  block: Fix reconfiguring graph with drained nodes
  block: Make bdrv_drain() use bdrv_drained_begin/end()
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-26 14:29:30 +01:00
Andreas Färber
a62c89117f qdev: Start disentangling bus from device
Move bus type and related APIs to a separate file bus.c.
This is a first step in breaking up qdev.c into more manageable chunks.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[AF: Rebased onto osdep.h]
Signed-off-by: Andreas Färber <afaerber@suse.de>
[PMM: added bus.o to link line for test-qdev-global-props]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-26 14:06:41 +01:00
Sergey Fedorov
c88c67e58b cpu-exec: Fix direct jump to TB spanning page
It is not safe to make a direct jump to a TB spanning two pages in
system emulation because the mapping for the second page can get changed
but we don't take care of direct jumps in this case.

However in user mode emulation, this is not the case because there's
only static address translation and TBs are always invalidated properly.

Fixes: 5b053a4a28 ("tcg: Clean up direct block chaining safety checks")

Reported-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Tested-by: Max Filippov <jcmvbkbc@gmail.com>
Message-id: 1463404380-29302-1-git-send-email-sergey.fedorov@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-26 13:14:29 +01:00
Peter Maydell
0533d3de60 Andreas stepping down from most maintainer positions
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJXRcqJAAoJEPou0S0+fgE/a00QAIWVINu7rN6Pv9RrbkmaNoan
 geKyY1urljB6v8vkGeuYGAkmWXxLcvrduXTHh+b8MD8MxokzulL11ftxd1/5Oldj
 v9I+4XSbO5I/cw8OxFxf8vsxBRox+XckeFdm3Ft0pEFXd2lZsK4WRnqZ0lG2p0D2
 j2TXzngwiIpvtoL6Gm14dBJc0bvDxJ+JLMFNvViMMqABwrLb/kMHYm0zx6OUrIO7
 K5NCfCZCkIYJ4eRRcH/o0YKotvGGabfeqOWQ+4aPmnYp/DlunfzAb6ex7GujzSvE
 DjsD+hI63FkOTs8ZtV/YFqbTN1Rp139WnVD7f0HYPCP2DK8a1OMNlAJ4ZRl08jS4
 UOmM6NJJ+op1XESfwqJ8aXsOL6Gm4dhErHQ8yPJFWvzrO2IxYU5J4NnRNDrqJZBR
 U2+X67ZIYYviBCCAYZyssaIcCS7tTG8wo6sTQXEeXpuc+94e2X1hQJEXqFn3N89X
 0DpOFz0+FE63Xdj3nOHeDTxYZI6j6v0L2eQ8YPEEm5RQJSaOGGk/u128+6IjP6VC
 63T1bwLlY5ZKffNgS8XaywRGAKSQJp0woNOxy6a5zc7R8Ns6QP0IGEKFzs6kOdiI
 NYB4nK/NAZHu8uBNrcKXiqkwS9LDOqJflXynboyWKYmbkgu8pqj03aN5L1pBQCis
 8tL2eL+340zkqhN31PTQ
 =AMGu
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/afaerber/tags/maintainers-for-peter' into staging

Andreas stepping down from most maintainer positions

# gpg: Signature made Wed 25 May 2016 16:53:45 BST using RSA key ID 3E7E013F
# gpg: Good signature from "Andreas Färber <afaerber@suse.de>"
# gpg:                 aka "Andreas Färber <afaerber@suse.com>"

* remotes/afaerber/tags/maintainers-for-peter:
  MAINTAINERS: Drop Andreas as CPU maintainer
  MAINTAINERS: Drop Andreas as 0.15 maintainer
  MAINTAINERS: Drop Andreas as PReP maintainer
  MAINTAINERS: Drop Andreas as Cocoa maintainer

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-26 12:41:12 +01:00
Kevin Wolf
b75536c9fa blockjob: Remove BlockJob.bs
There is a single remaining user in qemu-img, and another one in a test
case, both of which can be trivially converted to using BlockJob.blk
instead.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-05-25 19:04:21 +02:00
Kevin Wolf
4653456a5f commit: Use BlockBackend for I/O
This changes the commit block job to use the job's BlockBackend for
performing its I/O. job->bs isn't used by the commit code any more
afterwards.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-05-25 19:04:21 +02:00
Kevin Wolf
5c438bc68c backup: Use BlockBackend for I/O
This changes the backup block job to use the job's BlockBackend for
performing its I/O. job->bs isn't used by the backup code any more
afterwards.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-05-25 19:04:21 +02:00
Kevin Wolf
8543c27414 backup: Remove bs parameter from backup_do_cow()
Now that we pass the job to the function, bs is implied by that.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
2016-05-25 19:04:21 +02:00
John Snow
12b3e52e48 backup: Pack Notifier within BackupBlockJob
Instead of relying on peeking at bs->job, we want to explicitly get
a reference to the job that was involved in this notifier callback.

Pack the Notifier inside of the BackupBlockJob so we can use
container_of to get a reference back to the BackupBlockJob object.

This cuts out one more case where we rely unnecessarily on bs->job.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-05-25 19:04:21 +02:00
Kevin Wolf
91ab688379 backup: Don't leak BackupBlockJob in error path
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
2016-05-25 19:04:21 +02:00
Kevin Wolf
e253f4b897 mirror: Use BlockBackend for I/O
This changes the mirror block job to use the job's BlockBackend for
performing its I/O. job->bs isn't used by the mirroring code any more
afterwards.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-05-25 19:04:21 +02:00
Kevin Wolf
b880481579 mirror: Allow target that already has a BlockBackend
We had to forbid mirroring to a target BDS that already had a BB
attached because the node swapping at job completion would add a second
BB and we didn't support multiple BBs on a single BDS at the time. Now
we do, so we can lift the restriction.

As we allow additional BlockBackends for the target, we must expect
other users to be sending requests. There may no requests be in flight
during the graph modification, so we have to drain those users now.

The core part of this patch is a revert of commit 40365552.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-05-25 19:04:21 +02:00
Kevin Wolf
03e35d820d stream: Use BlockBackend for I/O
This changes the streaming block job to use the job's BlockBackend for
performing the COR reads. job->bs isn't used by the streaming code any
more afterwards.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-05-25 19:04:21 +02:00
Kevin Wolf
1e98fefd95 block: Make blk_co_preadv/pwritev() public
Also add trace points now that the function can be directly called.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
2016-05-25 19:04:21 +02:00
Kevin Wolf
b6d2e59995 block: Convert block job core to BlockBackend
This adds a new BlockBackend field to the BlockJob struct, which
coexists with the BlockDriverState while converting the individual jobs.

When creating a block job, a new BlockBackend is created on top of the
given BlockDriverState, and it is destroyed when the BlockJob ends. The
reference to the BDS is now held by the BlockBackend instead of calling
bdrv_ref/unref manually.

We have to be careful when we use bdrv_replace_in_backing_chain() in
block jobs because this changes the BDS that job->blk points to. At the
moment block jobs are too tightly coupled with their BDS, so that moving
a job to another BDS isn't easily possible; therefore, we need to just
manually undo this change afterwards.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-05-25 19:04:21 +02:00
Kevin Wolf
0c3169dffa block: Default to enabled write cache in blk_new()
The existing users of the function are:

1. blk_new_open(), which already enabled the write cache
2. Some test cases that don't care about the setting
3. blockdev_init() for empty drives, where the cache mode is overridden
   with the value from the options when a medium is inserted

Therefore, this patch doesn't change the current behaviour. It will be
convenient, however, for additional users of blk_new() (like block
jobs) if the most sensible WCE setting is the default.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
2016-05-25 19:04:21 +02:00
Kevin Wolf
a1a2af0756 block: Cancel jobs first in bdrv_close_all()
So far, bdrv_close_all() first removed all root BlockDriverStates of
BlockBackends and monitor owned BDSes, and then assumed that the
remaining BDSes must be related to jobs and cancelled these jobs.

This order doesn't work that well any more when block jobs use
BlockBackends internally because then they will lose their BDS before
being cancelled.

This patch changes bdrv_close_all() to first cancel all jobs and then
remove all root BDSes from the remaining BBs.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-05-25 19:04:21 +02:00
Alberto Garcia
a7112795c1 block: keep a list of block jobs
The current way to obtain the list of existing block jobs is to
iterate over all root nodes and check which ones own a job.

Since we want to be able to support block jobs in other nodes as well,
this patch keeps a list of jobs that is updated every time one is
created or destroyed.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-05-25 19:04:21 +02:00
Eric Blake
d004bd52aa block: Rename blk_write_zeroes()
Commit 983a1600 changed the semantics of blk_write_zeroes() to
be byte-based rather than sector-based, but did not change the
name, which is an open invitation for other code to misuse the
function.  Renaming to pwrite_zeroes() makes it more in line
with other byte-based interfaces, and will help make it easier
to track which remaining write_zeroes interfaces still need
conversion.

Reported-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-05-25 19:04:21 +02:00
Paolo Bonzini
8a8e63ebdd dma-helpers: change BlockBackend to opaque value in DMAIOFunc
Callers of dma_blk_io have no way to pass extra data to the DMAIOFunc,
because the original callback and opaque are gone by the time DMAIOFunc
is called.  On the other hand, the BlockBackend is usually derived
from those extra data that you could pass to the DMAIOFunc (in the
next patch, that would be the SCSIRequest).

So change DMAIOFunc's prototype, decoupling it from blk_aio_readv
and blk_aio_writev's.  The new prototype loses the BlockBackend
and gains an extra opaque value which, in the case of dma_blk_readv
and dma_blk_writev, is of course used for the BlockBackend.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-25 19:04:11 +02:00
Paolo Bonzini
cbe0ed6247 dma-helpers: change interface to byte-based
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-25 19:04:11 +02:00
Kevin Wolf
20018e12cf block: Propagate .drained_begin/end callbacks
When draining intermediate nodes (i.e. nodes that aren't the root node
for at least one of their parents; with node references, the user can
always configure the graph to create this situation), we need to
propagate the .drained_begin/end callbacks all the way up to the root
for the drain to be effective.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-25 19:04:11 +02:00
Kevin Wolf
36fe13317b block: Fix reconfiguring graph with drained nodes
When changing the BlockDriverState that a BdrvChild points to while the
node is currently drained, we must call the .drained_end() parent
callback. Conversely, when this means attaching a new node that is
already drained, we need to call .drained_begin().

bdrv_root_attach_child() takes now an opaque parameter, which is needed
because the callbacks must also be called if we're attaching a new child
to the BlockBackend when the root node is already drained, and they need
a way to identify the BlockBackend. Previously, child->opaque was set
too late and the callbacks would still see it as NULL.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-25 19:04:10 +02:00
Kevin Wolf
6820643fdb block: Make bdrv_drain() use bdrv_drained_begin/end()
Until now, bdrv_drained_begin() used bdrv_drain() internally to drain
the queue. This is kind of backwards and caused quiescing code to be
duplicated because bdrv_drained_begin() had to ensure that no new
requests come in even after bdrv_drain() returns, whereas bdrv_drain()
had to have them because it could be called from other places.

Instead move the bdrv_drain() code to bdrv_drained_begin() and make
bdrv_drain() a simple wrapper around bdrv_drained_begin/end().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-25 19:04:10 +02:00
Kevin Wolf
e9740bc6d4 block: Introduce bdrv_replace_child()
This adds a common function that is called when attaching a new child to
a parent, removing a child from a parent and when reconfiguring the
graph so that an existing child points to a different node now.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-25 19:04:10 +02:00
Max Reitz
109525ad6a block: Drop errp parameter from blk_new()
blk_new() cannot fail so its Error ** parameter has become superfluous.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-25 19:04:10 +02:00
Max Reitz
6b574e09b3 block: Drop bdrv_parent_cb_...() from bdrv_close()
bdrv_close() now asserts that the BDS's refcount is 0, therefore it
cannot have any parents and the bdrv_parent_cb_change_media() call is a
no-op.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-25 19:04:10 +02:00
Max Reitz
30f55fb81f block: Assert !bs->refcnt in bdrv_close()
The only caller of bdrv_close() left is bdrv_delete(). We may as well
assert that, in a way (there are some things in bdrv_close() that make
more sense under that assumption, such as the call to
bdrv_release_all_dirty_bitmaps() which in turn assumes that no frozen
bitmaps are attached to the BDS).

In addition, being called only in bdrv_delete() means that we can drop
bdrv_close()'s forward declaration at the top of block.c.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-25 19:04:10 +02:00
Max Reitz
5b3639371c block: Make bdrv_open() return a BDS
There are no callers to bdrv_open() or bdrv_open_inherit() left that
pass a pointer to a non-NULL BDS pointer as the first argument of these
functions, so we can finally drop that parameter and just make them
return the new BDS.

Generally, the following pattern is applied:

    bs = NULL;
    ret = bdrv_open(&bs, ..., &local_err);
    if (ret < 0) {
        error_propagate(errp, local_err);
        ...
    }

by

    bs = bdrv_open(..., errp);
    if (!bs) {
        ret = -EINVAL;
        ...
    }

Of course, there are only a few instances where the pattern is really
pure.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-25 19:04:10 +02:00
Max Reitz
9bddf75979 block: Drop bdrv_new_root()
It is unused now, so we may just as well drop it.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-25 19:04:10 +02:00
Max Reitz
28eb9b12f7 block: Drop blk_new_with_bs()
Its only caller is blk_new_open(), so we can just inline it there.

The bdrv_new_root() call is dropped in the process because we can just
let bdrv_open() create the BDS.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-25 19:04:10 +02:00
Max Reitz
21a699afc8 tests: Drop BDS from test-throttle.c
Now that throttling has been moved to the BlockBackend level, we do not
need to create a BDS along with the BB in the I/O throttling test.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-25 19:04:10 +02:00
Max Reitz
668361898e block: Let bdrv_open_inherit() return the snapshot
If bdrv_open_inherit() creates a snapshot BDS and *pbs is NULL, that
snapshot BDS should be returned instead of the BDS under it.

This has worked so far because (nearly) all users of BDRV_O_SNAPSHOT use
blk_new_open() to create the BDS tree. bdrv_append() (which is called by
bdrv_append_temp_snapshot()) redirects pointers from parents (i.e. the
BB in this case) to the newly appended child (i.e. the overlay),
therefore, while bdrv_open_inherit() did not return the root BDS, the BB
still pointed to it.

The only instance where BDRV_O_SNAPSHOT is used but blk_new_open() is
not is in blockdev_init() if no BDS tree is created, and instead
blk_new() is used and the flags are stored in the BB root state.
However, qmp_blockdev_change_medium() filters the BDRV_O_SNAPSHOT flag
before invoking bdrv_open(), so it will not have any effect.

In any case, it would be nicer if bdrv_open_inherit() could just always
return the root of the BDS tree that has been created.

To this end, bdrv_append_temp_snapshot() now returns the snapshot BDS
instead of just appending it on top of the snapshotted BDS. Also, it
calls bdrv_ref() before bdrv_append() (which bdrv_open_inherit() has to
undo if not returning the overlay).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-25 19:04:10 +02:00
Max Reitz
506f8709ce block: Drop useless bdrv_new() call
bdrv_append_temp_snapshot() uses bdrv_new() to create an empty BDS
before invoking bdrv_open() on that BDS. This is probably a relict from
when it used to do some modifications on that empty BDS, but now that is
unnecessary, so we can just set bs_snapshot to NULL and let bdrv_open()
do the rest.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-25 19:04:10 +02:00
Kevin Wolf
88be7b4be4 block: Fix bdrv_next() memory leak
The bdrv_next() users all leaked the BdrvNextIterator after completing
the iteration. Simply changing bdrv_next() to free the iterator before
returning NULL at the end of list doesn't work because some callers exit
the loop before looking at all BDSes.

This patch moves the BdrvNextIterator from the heap to the stack of
the caller and switches to a bdrv_first()/bdrv_next() interface for
initialising the iterator.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-25 19:04:10 +02:00
Andreas Färber
12b0e69cd7 MAINTAINERS: Drop Andreas as CPU maintainer
Signed-off-by: Andreas Färber <afaerber@suse.de>
2016-05-25 17:44:15 +02:00
Andreas Färber
211b76d1db MAINTAINERS: Drop Andreas as 0.15 maintainer
Downgrade to orphan status, like all other remaining stable entries.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2016-05-25 17:44:15 +02:00
Andreas Färber
9f38774da2 MAINTAINERS: Drop Andreas as PReP maintainer
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
2016-05-25 17:44:15 +02:00
Andreas Färber
aa373a1ec8 MAINTAINERS: Drop Andreas as Cocoa maintainer
Peter has taken over Cocoa maintainership.

Signed-off-by: Andreas Färber <andreas.faerber@web.de>
2016-05-25 17:14:56 +02:00
Peter Maydell
287db79df8 X86 queue, 2016-05-23
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCAAGBQJXQ4i7AAoJECgHk2+YTcWmkhsQAIELcp/4Kj/9TBqurmicXrBm
 sEfLGThc94NZv6tDNUg4d96x83263RJPUsJqOP7TGSavO5xWD8J4A1yf8x6tI71l
 /kr064fZzYZGgKyaIvFBqmNT8uy3CVZ1+5algaYh4pOBD3y0hrTBmwB2vIHZeKDC
 6ljLuOVV2/n3SlhthK4Me9DiNPTwmMolfq2EG+5jBHmFXnbXRmApXaiX4a3qNvI+
 Aqz9btyJgTReepPcpuxF3o/h+Dx0JgC1gT4bPkIDV0wx00adUWufhRqc3D1QaOBr
 AREMSpeepuHNrkPpc7ITvyMK9q+bW8OzB0koXQW6Q50DtM7DRsxkqsr3TDvkSGG+
 ZyxsL6UdDoSKwJaDcbkhRL82P6U+MvfMBLMbo4V+S1728maUVRx74Ah+axRVX6wb
 hWtEYPvFUKp03lY82hXDoZJC+WLu+mhuAqS6a74/OG47lV2V61X8i3zrZ54RVHMh
 1fkr4MXkPvrRTH+ifjJlH6leWg5JA6OCPPMqO0fujUIxlMGA9QAHK6Fs0ReKEUyy
 WNXQ/CMfHcZv3WIJWH2NEBTJHfQwP4KWM6nIoO8Z4WT7tFpqUGULabJ9jt5t3NB5
 lW586UT6pCnqUwh89yTYLg0+Q+rmgmIlAwjVL6q0tRXXL98kosng2jgzlu7gpSBJ
 2lXFSzoqr+Ux3GbNF6E4
 =Rh0Q
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/ehabkost/tags/x86-pull-request' into staging

X86 queue, 2016-05-23

# gpg: Signature made Mon 23 May 2016 23:48:27 BST using RSA key ID 984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"

* remotes/ehabkost/tags/x86-pull-request:
  target-i386: kvm: Eliminate kvm_msr_entry_set()
  target-i386: kvm: Simplify MSR setting functions
  target-i386: kvm: Simplify MSR array construction
  target-i386: kvm: Increase MSR_BUF_SIZE
  target-i386: kvm: Allocate kvm_msrs struct once per VCPU
  target-i386: Call cpu_exec_init() on realize
  target-i386: Move TCG initialization to realize time
  target-i386: Move TCG initialization check to tcg_x86_init()
  cpu: Eliminate cpudef_init(), cpudef_setup()
  target-i386: Set constant model_id for qemu64/qemu32/athlon
  pc: Set CPU model-id on compat_props for pc <= 2.4
  osdep: Move default qemu_hw_version() value to a macro
  target-i386: kvm: Use X86XSaveArea struct for xsave save/load
  target-i386: Use xsave structs for ext_save_area
  target-i386: Define structs for layout of xsave area

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-24 13:06:33 +01:00