This adds support for VHDX v0 logs, as specified in Microsoft's
VHDX Specification Format v1.00:
https://www.microsoft.com/en-us/download/details.aspx?id=34750
The following support is added:
* Log parsing, and validation - validate that an existing log
is correct.
* Log search - search through an existing log, to find any valid
sequence of entries.
* Log replay and flush - replay an existing log, and flush/clear
the log when complete.
The VHDX log is a circular buffer, with elements (sectors) of 4KB.
A log entry is a variably-length number of sectors, that is
comprised of a header and 'descriptors', that describe each sector.
A log may contain multiple entries, know as a log sequence. In a log
sequence, each log entry immediately follows the previous entry, with an
incrementing sequence number. There can only ever be one active and
valid sequence in the log.
Each log entry must match the file log GUID in order to be valid (along
with other criteria). Once we have flushed all valid log entries, we
marked the file log GUID to be zero, which indicates a buffer with no
valid entries.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Allow tracking of first file write in the VHDX image, as well as
the ability to update the GUID in the header. This is in preparation
for log support.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This moves the endian translation functions out from the vhdx.c source,
into a separate source file. In addition to the previously defined
endian functions, new endian translation functions for log support are
added as well.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This adds some magic number defines, and internal structure definitions
for VHDX log replay support. The struct VHDXLogEntries does not reflect
an on-disk data structure, and thus does not need to be packed.
Some minor code style fixes are applied as well.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
In preparation for VHDX log support, move these structures to the
header.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This adds the ability to update the headers in a VHDX image, including
generating a new MS-compatible GUID.
As VHDX depends on uuid.h, VHDX is now a configurable build option. If
VHDX support is enabled, that will also enable uuid as well. The
default is to have VHDX enabled.
To enable/disable VHDX: --enable-vhdx, --disable-vhdx
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Just a couple of minor comments to help note where allocated
buffers are freed, and a typo fix.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The below patch is needed to compile qemu trunk on FreeBSD with gcc48,
clang will fail.... ;). Host x84_64-freebsd.
Signed-off-by: Andreas Tobler <andreast@FreeBSD.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Replace the legacy cpu_to_be64wu() with stq_be_p().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1383669517-25598-9-git-send-email-peter.maydell@linaro.org
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
iQIcBAABAgAGBQJScnrnAAoJEH8JsnLIjy/WhhsP/2No1yEGNzfhw0WLDsEGBJI7
zjG+QkRMO4q2t256SxNr84KBFJlYKBvGrx+W8xC66AdvR1feL5hmWdXAMTJovx6Z
3Qt59RI9iISZ2OEtc9FhdsC+dSdM/3qie17XuuSCqifsi4xLjIZK/s18+RnLa0t/
nRObYP4prRl0c3o1gKaUvNz2wkIqctQAIe8UQkn6R1vPC6D60m/H9dDj4Kj68HO0
ICsF4AXBR/V2a8gU36/PGexBVyfgC4HOeuN0qNSTgYOKxLuNR+SrlzzhHE+jZTs5
GASm3vg/vUgBOO1759X5T8hveO6yu8XL82l+/d5nIK4gYGORIQZT74dyV5JgQIlF
Y47d0cF28+C/fuL1jh7c+2HY5WmmJQosMi9CaCBj0lvH0k5caEjqwPeHtRBmEyu3
1wAcLQJowZrWB5ez9MjezsaL4sPCymvB/4F443xdz5V19mE41bLZGW2EIT7MXHY7
IcwLU/opx76GMOFfWVMA7jeQkjiPaqGeaQHJzdnGUzIthqyiTigQMfi5P3nXGDic
uQi+KrqP9lNpJlZk4xGQnFovHNmKZrnLhUvqOIPk7/wKMvlU6ewdzp5Fnwzqw4MW
uJ/6eBJYolMyY+q37AH3Q6ZUkwTJi9O1drCPA0Ogr/dJiCyAiOoKuL0N74VabpcD
AahXw+yYV0qh6H4YjOzW
=wGCx
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'kwolf/tags/for-anthony' into staging
Block patches for 1.7.0-rc0 (v2)
# gpg: Signature made Thu 31 Oct 2013 04:44:39 PM CET using RSA key ID C88F2FD6
# gpg: Can't check signature: public key not found
* kwolf/tags/for-anthony: (30 commits)
vmdk: Implment bdrv_get_specific_info
qapi: Add optional field 'compressed' to ImageInfo
qemu-iotests: prefill some data to test image
sheepdog: check simultaneous create in resend_aioreq
sheepdog: cancel aio requests if possible
sheepdog: make add_aio_request and send_aioreq void functions
sheepdog: try to reconnect to sheepdog after network error
coroutine: add co_aio_sleep_ns() to allow sleep in block drivers
sheepdog: reload inode outside of resend_aioreq
sheepdog: handle vdi objects in resend_aio_req
sheepdog: check return values of qemu_co_recv/send correctly
qemu-iotests: Test case for backing file deletion
qemu-iotests: drop duplicated "create_image"
qemu-iotests: Fix 051 reference output
block: Avoid unecessary drv->bdrv_getlength() calls
block: Disable BDRV_O_COPY_ON_READ for the backing file
ahci: fix win7 hang on boot
sheepdog: pass copy_policy in the request
sheepdog: explicitly set copies as type uint8_t
block: Don't copy backing file name on error
...
Message-id: 1383064269-27720-1-git-send-email-kwolf@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
Implement .bdrv_get_specific_info to return the extent information.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
After reconnection happens, all the inflight requests are moved to the
failed request list. As a result, sd_co_rw_vector() can send another
create request before resend_aioreq() resends a create request from
the failed list.
This patch adds a helper function check_simultaneous_create() and
checks simultaneous create requests more strictly in resend_aioreq().
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Tested-by: Liu Yuan <namei.unix@gmail.com>
Reviewed-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This patch tries to cancel aio requests in pending queue and failed
queue. When the sheepdog driver cannot cancel the requests, it waits
for them to be completed.
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Tested-by: Liu Yuan <namei.unix@gmail.com>
Reviewed-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
These functions no longer return errors. We can make them void
functions and simplify the codes.
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Tested-by: Liu Yuan <namei.unix@gmail.com>
Reviewed-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This introduces a failed request queue and links all the inflight
requests to the list after network error happens. After QEMU
reconnects to the sheepdog server successfully, the sheepdog block
driver will retry all the requests in the failed queue.
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Tested-by: Liu Yuan <namei.unix@gmail.com>
Reviewed-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This prepares for using resend_aioreq() after reconnecting to the
sheepdog server.
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Tested-by: Liu Yuan <namei.unix@gmail.com>
Reviewed-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The current resend_aio_req() doesn't work when the request is against
vdi objects. This fixes the problem.
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Tested-by: Liu Yuan <namei.unix@gmail.com>
Reviewed-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If qemu_co_recv/send doesn't return the specified length, it means
that an error happened.
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Tested-by: Liu Yuan <namei.unix@gmail.com>
Reviewed-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The block layer generally keeps the size of an image cached in
bs->total_sectors so that it doesn't have to perform expensive
operations to get the size whenever it needs it.
This doesn't work however when using a backend that can change its size
without qemu being aware of it, i.e. passthrough of removable media like
CD-ROMs or floppy disks. For this reason, the caching is disabled when a
removable device is used.
It is obvious that checking whether the _guest_ device has removable
media isn't the right thing to do when we want to know whether the size
of the host backend can change. To make things worse, non-top-level
BlockDriverStates never have any device attached, which makes qemu
assume they are removable, so drv->bdrv_getlength() is always called on
the protocol layer. In the case of raw-posix, this causes unnecessary
lseek() system calls, which turned out to be rather expensive.
This patch completely changes the logic and disables bs->total_sectors
caching only for certain block driver types, for which a size change is
expected: host_cdrom and host_floppy on POSIX, host_device on win32; also
the raw format in case it sits on top of one of these protocols, but in
the common case the nested bdrv_getlength() call on the protocol driver
will use the cache again and avoid an expensive drv->bdrv_getlength()
call.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Currently copy_policy isn't used. Recent sheepdog supports erasure coding, which
make use of copy_policy internally, but require client explicitly passing
copy_policy from base inode to newly creately inode for snapshot related
operations.
If connected sheep daemon doesn't utilize copy_policy, passing it to sheep
daemon is just one extra null effect operation. So no compatibility problem.
With this patch, sheepdog can provide erasure coded volume for QEMU VM.
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <namei.unix@gmail.com>
Acked-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
'copies' is actually uint8_t since day one, but request headers and some helper
functions parameterize it as uint32_t for unknown reasons and effectively
reserve 24 bytes for possible future use. This patch explicitly set the correct
for copies and reserve the left bytes.
This is a preparation patch that allow passing copy_policy in request header.
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <namei.unix@gmail.com>
Acked-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Opening the qcow2 image with BDRV_O_NO_FLUSH prevents any flushes during
the image creation. This means that the image has not yet been flushed
to disk when qemu-img create exits. This flush is delayed until the next
operation on the image involving opening it without BDRV_O_NO_FLUSH and
closing (or directly flushing) it. For large images and/or images with a
small cluster size and preallocated metadata, this flush may take a
significant amount of time and may occur unexpectedly.
Reopening the image without BDRV_O_NO_FLUSH right before the end of
qcow2_create2() results in hoisting the potentially costly flush into
the image creation, which is expected to take some time (whereas
successive image operations may be not).
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
compatiblity -> compatibility
continously -> continuously
existance -> existence
usefull -> useful
shoudl -> should
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
this adds a check that a dynamic VHD file has not been
accidently truncated (e.g. during transfer or upload).
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Saving the VM state is done using bdrv_pwrite. This function may perform
a read-modify-write, which in this case results in data being read from
beyond the end of the virtual disk. Since we are actually trying to
access an area which is not a part of the virtual disk, zero_beyond_eof
has to be set to false before performing the partial write, otherwise
the VM state may become corrupted.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Since df2a6f29a5, bdrv_co_do_writev increases the total_sectors value of
a growable block devices on writes after the current end. This leads to
the virtual disk apparently growing in qcow2_save_vmstate, which in turn
affects the disk size captured by the internal snapshot taken directly
afterwards through e.g. the HMP savevm command. Such a "grown" snapshot
cannot be loaded after reopening the qcow2 image, since its disk size
differs from the actual virtual disk size (writing a VM state does not
actually increase the virtual disk size).
Fix this by restoring total_sectors at the end of qcow2_save_vmstate.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The VMFS extent line in description file doesn't have start offset as
FLAT lines does, and it should be defaulted to 0. The flat_offset
variable is initialized to -1, so we need to set it in this case.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Previously cid of parent is parsed from image file for every IO request.
We already have L1/L2 cache and don't have assumption that parent image
can be updated behind us, so remove this to get more efficiency.
The parent CID is checked only for once after opening.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
On one occasion, hdev_open() returned -1 in case of an unknown error
instead of a proper -errno value. Adjust this to match the behavior of
raw_open() (in raw-win32), which is to return -EINVAL in this case.
Also, change the call to error_setg*() to match the one in raw_open() as
well.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
An extra 'p++' after while loop when *p == '\n' will move p to unknown
data position, risking parsing junk data or memory access violation.
Cc: qemu-stable@nongnu.org
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Convert "fprintf(stderr,..." and standardize error messages:
Remove a few local_error's and use errp.
Remove "VMDK:" or "Vmdk:" prefixes in error message and fix to upper
case.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Make use of the error parameter in the opening and creating functions in
block/raw-posix.c.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Make use of the error parameter in the opening and creating functions in
block/raw-win32.c.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Propagate errors in raw_create rather than directly reporting and
afterwards discarding them.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Evaluate the runtime overlap check options and set
BDRVQcowState.overlap_check appropriately.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Introduces the macros QCOW2_OL_CONSTANT and QCOW2_OL_ALL in addition to
the already existing QCOW2_OL_CACHED, signifying all metadata overlap
checks that can be performed in constant time (regardless of image size
etc.) and truly all available overlap checks, respectively.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add an array which assigns the option string to its corresponding
overlap check bit.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add runtime options to tune the overlap checks to be performed before
write accesses.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Replace the QCOW2_OL_DEFAULT macro by a variable overlap_check in
BDRVQcowState.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In qcow2_check_metadata_overlap and qcow2_pre_write_overlap_check,
change the parameter signifying the checks to perform from its current
positive form to a negative one, i.e., it will no longer explicitly
specify every check to perform but rather a mask of checks not to
perform.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When trying to find a new snapshot ID, the existing ones are converted
to integers using strtoul. This function returns an unsigned long,
therefore its result should be saved in an unsigned long as well.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If the new snapshot table could not be written in qcow2_snapshot_create,
the old snapshot table has to be restored in memory and the new one
released. This should include restoration of the old snapshot count as
well, which is added by this patch.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In qcow2_write_compressed, if the compression fails, a normal cluster is
written to disk. This is done through bdrv_write on the qcow2 BDS
itself (using the guest offset), thus it is wrong to do a metadata
overlap check before.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The error message in qcow2_downgrade about an unsupported refcount
order is missing a space. This patch adds it.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This field is used by blkverify to disable external snapshots creation.
It will also be used by block filters like quorum to disable external
snapshot creation.
Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
if a raw device like an iscsi target or host device is used
the current implementation makes a second call out to get
the block status of bs->file.
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
qcow2_write_snapshots relies on the length of every snapshot ID and name
fitting into an unsigned 16 bit integer. This is currently ensured by
QEMU through generally only allowing 128 byte IDs and 256 byte names.
However, if this should change in the future, the length written to the
image file should not be silently truncated (though the name itself
would be written completely).
Since this is currently not an issue but might require attention due to
internal QEMU changes in the future, an assert ensuring sanity is enough
for now.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If an error occurs during qcow2_write_snapshots, the newly allocated
snapshot table clusters are leaked and should thus be freed.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
qcow2_write_snapshots does contain a fail label and there is no reason
not to use it on some errors; therefore, we should always jump there on
error.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In qcow2_free_any_clusters, preallocated zero clusters should be freed
just as normal clusters are.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Currently, qcow2_check_metadata_overlap uses bdrv_read to read inactive
L1 tables from disk. The number of sectors to read is calculated through
a truncating integer division, therefore, if the L1 table size is not a
multiple of the sector size, the final entries will not be read and
their entries in memory remain undefined (from the g_malloc).
Using bdrv_pread fixes this.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add a new ImageInfoSpecificQCow2 type as a subtype of ImageInfoSpecific.
This contains the compatibility level as a string and an optional
lazy_refcounts boolean (optional means mandatory for compat >= 1.1 and
not available for compat == 0.10).
Also, add qcow2_get_specific_info, which returns this information.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add a function for generically dumping the ImageInfoSpecific information
in a human-readable format to block/qapi.c.
Use this function in bdrv_image_info_dump and qemu-io-cmds.c:info_f to
allow qemu-img info resp. qemu-io -c info to print that format specific
information.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add a function for retrieving an ImageInfoSpecific object from a block
driver.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Switch the string to enum type BlockJobType in BlockJobDriver.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We will use BlockJobType as the enum type name of block jobs in QAPI,
rename current BlockJobType to BlockJobDriver, which will eventually
become a set of operations, similar to block drivers.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
# By Asias He (1) and Peter Lieven (1)
# Via Paolo Bonzini
* bonzini/scsi-next:
scsi: Allocate SCSITargetReq r->buf dynamically [CVE-2013-4344]
block/iscsi: reenable iscsi_co_get_block_status
Message-id: 1381332391-8781-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
Commit f35c934a accidently disabled iscsi_co_get_block_status for all
libiscsi versions. Its not possible to check for enumeration constants
in the C preprocessor. This patch changes the check to the preprocessor
constant LIBISCSI_FEATURE_IOVECTOR which was introduced shortly after
get_lba_status support was added to libiscsi.
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If an error occurs in l2_allocate, the allocated (but unused) L2 cluster
should be freed.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Switching the L1 table in memory should be an atomic operation, as far
as possible. Calling qcow2_free_clusters on the old L1 table on disk is
not a good idea when the old L1 table is no longer valid and the address
to the new one hasn't yet been written into the corresponding
BDRVQcowState field. To be more specific, this can lead to segfaults due
to qcow2_check_metadata_overlap trying to access the L1 table during the
free operation.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This blocks migration for VHDX image files, until the
functionality can be supported.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
CHECK_OFLAG_COPIED as a parameter to check_refcounts_l1 and
check_refcounts_l2 is obselete now, since the OFLAG_COPIED consistency
check is actually no longer performed by these functions (but by
check_oflag_copied).
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
If an inactive L1 table is loaded from disk, its entries are in big
endian and have to be converted to host byte order before using them.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
All callers pass start = 0, and it's doubtful if any other value would
actually do what you expect. Remove the parameter.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Compressed clusters can never be contiguous, therefore the corresponding
flag does not need to be given explicitly to count_contiguous_clusters.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The function is not intended to be used on compressed clusters and will
not work correctly, if used anyway, since L2E_OFFSET_MASK is not the
right mask for determining the offset of compressed clusters. Therefore,
assert that the first cluster is not compressed and always include the
compression flag in the mask of significant flags, i.e., stop the search
as soon as a compressed cluster occurs.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In expand_zero_clusters_in_l1, a new cluster is only allocated if it was
not already preallocated. On error, such preallocated clusters should
not be freed, but only the newly allocated ones.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Just returning -errno in some cases prevents
trace_qcow2_l2_allocate_done from being executed (and, in one case, also
the unused allocated L2 table from being freed). Always going down the
error path fixes this.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In l2_allocate, the fail path is executed if qcow2_cache_flush fails.
However, the L2 table has not yet been fetched from the L2 table cache.
The qcow2_cache_put in the fail path therefore basically gives an
undefined argument as the L2 table address (in this case).
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Since the expanded_clusters bitmap is addressed using host offsets in
the underlying image file, the correct size to use for allocating the
bitmap is not determined by the guest disk image but by the underlying
host image file.
Furthermore, this size may change during the expansion due to cluster
allocations on growable image files. In this case, the bitmap needs to
be resized as well to reflect the growth.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If qcow2_alloc_cluster_link_l2 is called with a QCowL2Meta describing a
request crossing L2 boundaries, a buffer overflow will occur. This is
impossible right now since such requests are never generated (every
request is shortened to L2 boundaries before) and probably also
completely unintended (considering the name "QCowL2Meta"), however, it
is still worth an assertion.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
QEDHeader is read, and written, directly from on-disk images
via bdrv_pread()/write(). To avoid any unintentional padding,
these structs should be packed.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
QCowHeader and QCowExtension are structs that reside in the on-disk
image format, and are read and written directly via bdrv_pread()/write(),
and as such should be packed to avoid any unintentional struct padding.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The VHD footer and header structs (vhd_footer and vhd_dyndisk_header)
are on-disk structures for the image format, and as such should be
packed.
Go ahead and make these typedefs as well, with the preferred QEMU
naming convention, so that the packed attribute is used consistently
with the struct.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The header struct VdiHeader is an on-disk structure for the image
format, and as such should be packed.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When there are no snapshots qemu_rbd_snap_list() returns 0 and the
snapshot table pointer is NULL. Don't forget to free the snaps buffer
we allocated for librbd rbd_snap_list().
When the function succeeds don't forget to free the snaps buffer after
calling rbd_snap_list_end().
Cc: qemu-stable@nongnu.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The patch fixes a warning from gcc (Debian 4.6.3-14+rpi1) 4.6.3:
block/stream.c:141:22: error:
‘copy’ may be used uninitialized in this function [-Werror=uninitialized]
This is not a real bug - a better compiler would not complain.
Now 'copy' has always a defined value, so the check for ret >= 0
can be removed.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Some drivers will have driver specifics options but no filename.
This new bool allow the block layer to treat them correctly.
The .bdrv_needs_filename is set in drivers not having .bdrv_parse_filename and
not having .bdrv_open.
The first exception to this rule will be the quorum driver.
Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We use the extent size as cluster size for flat extents (where no L1/L2
table is allocated so it's safe) reuse sector calculating code with
sparse extents.
Don't pass in the cluster size for adding flat extent, just set it to
sectors later, then the cluster size checking will not fail.
The cluster_sectors is changed to int64_t to allow big flat extent.
Without this, flat extent opening is broken:
# qemu-img create -f vmdk -o subformat=monolithicFlat /tmp/a.vmdk 100G
Formatting '/tmp/a.vmdk', fmt=vmdk size=107374182400 compat6=off subformat='monolithicFlat' zeroed_grain=off
# qemu-img info /tmp/a.vmdk
image: /tmp/a.vmdk
file format: raw
virtual size: 0 (0 bytes)
disk size: 4.0K
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When trying to update the refcounts for a snapshot, the return value of
update_refcount on a compressed cluster was pretty much ignored,
cancelling the update on error but returning 0. This is caused by an
inner "ret" variable shadowing the outer one (the latter is used in the
return statement).
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
# By Stefan Hajnoczi (4) and others
# Via Stefan Hajnoczi
* stefanha/block:
virtio-blk: do not relay a previous driver's WCE configuration to the current
blockdev: do not default cache.no-flush to true
block: don't lose data from last incomplete sector
qcow2: Correct snapshots size for overlap check
coroutine: fix /perf/nesting coroutine benchmark
coroutine: add qemu_coroutine_yield benchmark
qemu-timer: do not take the lock in timer_pending
qemu-timer: make qemu_timer_mod_ns() and qemu_timer_del() thread-safe
qemu-timer: drop outdated signal safety comments
osdep: warn if open(O_DIRECT) on fails with EINVAL
libcacard: link against qemu-error.o for error_report()
Message-id: 1379698931-946-1-git-send-email-stefanha@redhat.com
# By Hervé Poussineau (5) and Stefan Weil (1)
# Via Paolo Bonzini
* bonzini/scsi-next:
block/iscsi: Drop iscsi_co_get_block_status for older versions of libiscsi
lsi: add 53C810 variant
lsi: remove todo
lsi: ignore write accesses to CTEST0 registers
lsi: check ssid versus sdid only if ssid is valid
lsi: use constant name instead of its value
Using s->snapshots_size instead of snapshots_size for the metadata
overlap check in qcow2_write_snapshots leads to the detection of an
overlap with the main qcow2 image header when deleting the last
snapshot, since s->snapshots_size has not yet been updated and is
therefore non-zero. However, the offset returned by qcow2_alloc_clusters
will be zero since snapshots_size is zero. Therefore, an overlap is
detected albeit no such will occur.
This patch fixes this by replacing s->snapshots_size by snapshots_size
when calling qcow2_pre_write_overlap_check.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Debian wheezy includes libiscsi-dev 1.4.0 which does not provide
SCSI_PROVISIONING_TYPE_DEALLOCATED. Drop iscsi_co_get_block_status
in this case to allow compilation without errors.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
# By Max Reitz (16) and others
# Via Kevin Wolf
* kwolf/for-anthony: (33 commits)
qemu-iotests: Fix test 038
block: Assert validity of BdrvActionOps
qemu-iotests: Cleanup test image in test number 007
qemu-img: fix invalid JSON
coroutine: add ./configure --disable-coroutine-pool
qemu-iotests: Adjustments due to error propagation
qcow2: Use Error parameter
qemu-img create: Emit filename on error
block: Error parameter for create functions
block: Error parameter for open functions
bdrv: Use "Error" for creating images
bdrv: Use "Error" for opening images
qemu-iotests: add 057 internal snapshot for block device test case
hmp: add interface hmp_snapshot_delete_blkdev_internal
hmp: add interface hmp_snapshot_blkdev_internal
qmp: add interface blockdev-snapshot-delete-internal-sync
qmp: add interface blockdev-snapshot-internal-sync
qmp: add internal snapshot support in qmp_transaction
snapshot: distinguish id and name in snapshot delete
snapshot: new function bdrv_snapshot_find_by_id_and_name()
...
Message-id: 1379073063-14963-1-git-send-email-kwolf@redhat.com
Replace .bdrv_aio_discard with .bdrv_co_discard so that discard
requests can be split in multiple parts, each for a small amount
of sectors.
This is useful because we expose a generic API with no limit
on the amount of sectors that can be unmapped in one request.
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add an Error ** parameter to bdrv_create and its associated functions to
allow more specific error messages.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Add an Error ** parameter to bdrv_open, bdrv_file_open and associated
functions to allow more specific error messages.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Add an Error ** parameter to BlockDriver.bdrv_open and
BlockDriver.bdrv_file_open to allow more specific error messages.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Snapshot creation actually already distinguish id and name since it take
a structured parameter *sn, but delete can't. Later an accurate delete
is needed in qmp_transaction abort and blockdev-snapshot-delete-sync,
so change its prototype. Also *errp is added to tip error, but return
value is kepted to let caller check what kind of error happens. Existing
caller for it are savevm, delvm and qemu-img, they are not impacted by
introducing a new function bdrv_snapshot_delete_by_id_or_name(), which
check the return value and do the operation again.
Before this patch:
For qcow2, it search id first then name to find the one to delete.
For rbd, it search name.
For sheepdog, it does nothing.
After this patch:
For qcow2, logic is the same by call it twice in caller.
For rbd, it always fails in delete with id, but still search for name
in second try, no change to user.
Some code for *errp is based on Pavel's patch.
Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
To make it clear about id and name in searching, add this API
to distinguish them. Caller can choose to search by id or name,
*errp will be set only for exception.
Some code are modified based on Pavel's patch.
Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Implement bdrv_amend_options for compat, size, backing_file, backing_fmt
and lazy_refcounts.
Downgrading images from compat=1.1 to compat=0.10 is achieved through
handling all incompatible flags accordingly, clearing all compatible and
autoclear flags and expanding all zero clusters.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Save the image refcount order in BDRVQcowState. This will be relevant
for future code supporting different refcount orders than four and also
for code that needs to verify a certain refcount order for an opened
image.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>