Commit Graph

2120 Commits

Author SHA1 Message Date
Eric Blake 6a8f9661dc block: Convert to new qapi union layout
We have two issues with our qapi union layout:
1) Even though the QMP wire format spells the tag 'type', the
C code spells it 'kind', requiring some hacks in the generator.
2) The C struct uses an anonymous union, which places all tag
values in the same namespace as all non-variant members. This
leads to spurious collisions if a tag value matches a non-variant
member's name.

Make the conversion to the new layout for block-related code.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1445898903-12082-16-git-send-email-eblake@redhat.com>
[Commit message tweaked slightly]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2015-11-02 08:30:27 +01:00
Kevin Wolf 37a639a7fb block: Consider all child nodes in bdrv_requests_pending()
The function manually recursed into bs->file and bs->backing to check
whether there were any requests pending, but it ignored other children.

There's no need to special case file and backing here, so just replace
these two explicit recursions by a loop recursing for all child nodes.

Reported-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1446029211-27148-1-git-send-email-kwolf@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-10-29 17:59:27 +00:00
Fam Zheng 51288d7917 block: Introduce "drained begin/end" API
The semantics is that after bdrv_drained_begin(bs), bs will not get new external
requests until the matching bdrv_drained_end(bs).

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-10-23 18:18:24 +02:00
Fam Zheng dca21ef23b aio: Add "is_external" flag for event handlers
All callers pass in false, and the real external ones will switch to
true in coming patches.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-10-23 18:18:23 +02:00
Alberto Garcia d87d01e16a throttle: Remove throttle_group_lock/unlock()
The group throttling code was always meant to handle its locking
internally. However, bdrv_swap() was touching the ThrottleGroup
structure directly and therefore needed an API for that.

Now that bdrv_swap() no longer exists there's no need for the
throttle_group_lock() API anymore.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-10-23 18:18:23 +02:00
Max Reitz 5433c24f0f block: Prepare for NULL BDS
blk_bs() will not necessarily return a non-NULL value any more (unless
blk_is_available() is true or it can be assumed to otherwise, e.g.
because it is called immediately after a successful blk_new_with_bs() or
blk_new_open()).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-10-23 18:18:23 +02:00
Max Reitz 0c3c36d651 block: Add blk_insert_bs()
This function associates the given BlockDriverState with the given
BlockBackend.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-10-23 18:18:23 +02:00
Max Reitz a46fc9c950 block: Prepare remaining BB functions for NULL BDS
There are several BlockBackend functions which, in theory, cannot fail.
This patch makes them cope with the BlockDriverState pointer being NULL
by making them fall back to some default action like ignoring the value
in setters and returning the default in getters.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-10-23 18:18:23 +02:00
Max Reitz c09ba36c9a block: Fail requests to empty BlockBackend
If there is no BlockDriverState in a BlockBackend or if the tray of the
guest device is open, fail all requests (where that is possible) with
-ENOMEDIUM.

The reason the status of the guest device is taken into account is
because once the guest device's tray is opened, any request on the same
BlockBackend as the guest uses should fail. If the BDS tree is supposed
to be usable even after ejecting it from the guest, a different
BlockBackend must be used.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-10-23 18:18:23 +02:00
Max Reitz 061959e8da block: Make some BB functions fall back to BBRS
If there is no BDS tree attached to a BlockBackend, functions that can
do so should fall back to the BlockBackendRootState structure.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-10-23 18:18:23 +02:00
Max Reitz 281d22d86c block: Add BlockBackendRootState
This structure will store some of the state of the root BDS if the BDS
tree is removed, so that state can be restored once a new BDS tree is
inserted.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-10-23 18:18:23 +02:00
Max Reitz 973f2ddf7b block/throttle-groups: Make incref/decref public
Throttle groups are not necessarily referenced by BDSs alone; a later
patch will essentially allow BBs to reference them, too. Make the
ref/unref functions public so that reference can be properly accounted
for.

Their interface is slightly adjusted in that they return and take a
ThrottleState pointer, respectively, instead of a ThrottleGroup pointer.
Functionally, they are equivalent, but since ThrottleGroup is not meant
to be used outside of block/throttle-groups.c, ThrottleState is easier
to handle.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-10-23 18:18:23 +02:00
Max Reitz 373340b26c block: Move I/O status and error actions into BB
These options are only relevant for the user of a whole BDS tree (like a
guest device or a block job) and should thus be moved into the
BlockBackend.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-10-23 18:18:23 +02:00
Max Reitz 7f0e9da6f1 block: Move BlockAcctStats into BlockBackend
As the comment above bdrv_get_stats() says, BlockAcctStats is something
which belongs to the device instead of each BlockDriverState. This patch
therefore moves it into the BlockBackend.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-10-23 18:18:23 +02:00
Max Reitz 53d8f9d8fb block: Remove wr_highest_sector from BlockAcctStats
BlockAcctStats contains statistics about the data transferred from and
to the device; wr_highest_sector does not fit in with the rest.

Furthermore, those statistics are supposed to be specific for a certain
device and not necessarily for a BDS (see the comment above
bdrv_get_stats()); on the other hand, wr_highest_sector may be a rather
important information to know for each BDS. When BlockAcctStats is
finally removed from the BDS, we will want to keep wr_highest_sector in
the BDS.

Finally, wr_highest_sector is renamed to wr_highest_offset and given the
appropriate meaning. Externally, it is represented as an offset so there
is no point in doing something different internally. Its definition is
changed to match that in qapi/block-core.json which is "the offset after
the greatest byte written to". Doing so should not cause any harm since
if external programs tried to calculate the volume usage by
(wr_highest_offset + 512) / volume_size, after this patch they will just
assume the volume to be full slightly earlier than before.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-10-23 18:18:23 +02:00
Max Reitz 68e9ec017b block: Move guest_block_size into BlockBackend
guest_block_size is a guest device property so it should be moved into
the interface between block layer and guest devices, which is the
BlockBackend.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-10-23 18:18:23 +02:00
Max Reitz 4981bdec0d block: Fix BB AIOCB AioContext without BDS
Fix the BlockBackend's AIOCB AioContext for aborting AIO in case there
is no BDS. If there is no implementation of AIOCBInfo::get_aio_context()
the AioContext is derived from the BDS the AIOCB belongs to. If that BDS
is NULL (because it has been removed from the BB) this will not work.

This patch makes blk_get_aio_context() fall back to the main loop
context if the BDS pointer is NULL and implements
AIOCBInfo::get_aio_context() (blk_aiocb_get_aio_context()) which invokes
blk_get_aio_context().

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-10-23 18:18:23 +02:00
Max Reitz 1354c47378 block/raw_bsd: Drop raw_is_inserted()
With the new automatically-recursive implementation of
bdrv_is_inserted() checking by default whether all the children of a BDS
are inserted, we can drop raw's own implementation.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-10-23 18:18:23 +02:00
Max Reitz db0284f86a block: Add blk_is_available()
blk_is_available() returns true iff the BDS is inserted (which means
blk_bs() is not NULL and bdrv_is_inserted() returns true) and if the
tray of the guest device is closed.

blk_is_inserted() is changed to return true only if blk_bs() is not
NULL.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-10-23 18:18:22 +02:00
Max Reitz e031f75048 block: Make bdrv_is_inserted() return a bool
Make bdrv_is_inserted(), blk_is_inserted(), and the callback
BlockDriver.bdrv_is_inserted() return a bool.

Suggested-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-10-23 18:18:22 +02:00
Max Reitz f709623b3d block: Remove host floppy support
It has been deprecated as of 2.3, so we can now remove it.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-10-23 18:18:22 +02:00
Daniel P. Berrange 10817bf09d coroutine: move into libqemuutil.a library
The coroutine files are currently referenced by the block-obj-y
variable. The coroutine functionality though is already used by
more than just the block code. eg migration code uses coroutine
yield. In the future the I/O channel code will also use the
coroutine yield functionality. Since the coroutine code is nicely
self-contained it can be easily built as part of the libqemuutil.a
library, making it widely available.

The headers are also moved into include/qemu, instead of the
include/block directory, since they are now part of the util
codebase, and the impl was never in the block/ directory
either.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2015-10-20 14:59:04 +01:00
Fam Zheng 6b826af7b0 blkdebug: Don't confuse image as backing file
The word "backing file" nowadays refers to the backing_hd in the
external snapshot sense (i.e. bs->backing_hd), instead of the file sense
(bs->file). Correct the comment to use the right term.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-10-16 15:35:48 +02:00
Kevin Wolf e394621fbd qcow2: Remove forward declaration of QCowAIOCB
This struct doesn't exist any more since commit 3fc48d09 in August 2011,
it's about time to remove its forward declaration.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2015-10-16 15:34:30 +02:00
Stefan Hajnoczi 1501ecc1d8 raw-posix: warn about BDRV_O_NATIVE_AIO if libaio is unavailable
raw-posix.c silently ignores BDRV_O_NATIVE_AIO if libaio is unavailable.
It is confusing when aio=native performance is identical to aio=threads
because the binary was accidentally built without libaio.

Print a deprecation warning if -drive aio=native is used with a binary
that does not support libaio.  There are probably users using aio=native
who would be inconvenienced if QEMU suddenly refused to start their
guests.  In the future this will become an error.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-10-16 15:34:30 +02:00
Kevin Wolf 7e39d3a2dd blkverify: Fix BDS leak in .bdrv_open error path
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
2015-10-16 15:34:30 +02:00
Kevin Wolf 8e419aefa0 block: Remove bdrv_swap()
bdrv_swap() is unused now. Remove it and all functions that have
no other users than bdrv_swap(). In particular, this removes the
.bdrv_rebind callbacks from block drivers.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-10-16 15:34:30 +02:00
Kevin Wolf 3f09bfbc7b block: Add and use bdrv_replace_in_backing_chain()
This cleans up the mess we left behind in the mirror code after the
previous patch. Instead of using bdrv_swap(), just change pointers.

The interface change of the mirror job that callers must consider is
that after job completion, their local BDS pointers still point to the
same node now. qemu-img must change its code accordingly (which makes it
easier to understand); the other callers stays unchanged because after
completion they don't do anything with the BDS, but just with the job,
and the job is still owned by the source BDS.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-10-16 15:34:30 +02:00
Kevin Wolf 8ccb9569a9 blockjob: Store device name at job creation
Some block jobs change the block device graph on completion. This means
that the device that owns the job and originally was addressed with its
device name may no longer be what the corresponding BlockBackend points
to.

Previously, the effects of bdrv_swap() ensured that the job was (at
least partially) transferred to the target image. Events that contain
the device name could still use bdrv_get_device_name(job->bs) and get
the same result.

After removing bdrv_swap(), this won't work any more. Instead, save the
device name at job creation and use that copy for QMP events and
anything else identifying the job.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-10-16 15:34:30 +02:00
Kevin Wolf a2d6190048 block-backend: Add blk_set_bs()
It allows changing the BlockDriverState that a BlockBackend points to.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-10-16 15:34:29 +02:00
Kevin Wolf 439db28cf9 block/io: Make bdrv_requests_pending() public
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-10-16 15:34:29 +02:00
Kevin Wolf 5db15a5769 block: Manage backing file references in bdrv_set_backing_hd()
This simplifies the code somewhat, especially when dropping whole
backing file subchains.

The exception is the mirroring code that does adventurous things with
bdrv_swap() and in order to keep it working, I had to duplicate most of
bdrv_set_backing_hd() locally. We'll get rid again of this ugliness
shortly.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-10-16 15:34:29 +02:00
Kevin Wolf 760e006384 block: Convert bs->backing_hd to BdrvChild
This is the final step in converting all of the BlockDriverState
pointers that block drivers use to BdrvChild.

After this patch, bs->children contains the full list of child nodes
that are referenced by a given BDS, and these children are only
referenced through BdrvChild, so that updating the pointer in there is
enough for changing edges in the graph.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-10-16 15:34:29 +02:00
Kevin Wolf 9a4f4c3156 block: Convert bs->file to BdrvChild
This patch removes the temporary duplication between bs->file and
bs->file_child by converting everything to BdrvChild.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-10-16 15:34:29 +02:00
Kevin Wolf 0bd6e91a7e quorum: Convert to BdrvChild
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-10-16 15:34:29 +02:00
Kevin Wolf 3e586be0b2 blkverify: Convert s->test_file to BdrvChild
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-10-16 15:34:29 +02:00
Kevin Wolf 24bc15d1f6 vmdk: Use BdrvChild instead of BDS for references to extents
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-10-16 15:34:29 +02:00
Paolo Bonzini c84b31926f block: switch from g_slice allocator to malloc
Simplify memory allocation by sticking with a single API.  GSlice
is not that fast anyway (tcmalloc/jemalloc are better).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-10-12 11:17:45 +01:00
Paolo Bonzini eab2ac9d3c block/ssh: remove dead code
The "err" label cannot be reached with qp != NULL.  Remove the free-ing
of qp and avoid future regressions by removing the initializer.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
ACKed-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2015-10-08 19:46:01 +03:00
Richard W.M. Jones 73ba05d936 block/raw-posix: Open file descriptor O_RDWR to work around glibc posix_fallocate emulation issue.
https://bugzilla.redhat.com/show_bug.cgi?id=1265196

The following command fails on an NFS mountpoint:

  $ qemu-img create -f qcow2 -o preallocation=falloc disk.img 262144
  Formatting 'disk.img', fmt=qcow2 size=262144 encryption=off cluster_size=65536 preallocation='falloc' lazy_refcounts=off
  qemu-img: disk.img: Could not preallocate data for the new file: Bad file descriptor

The reason turns out to be because NFS doesn't support the
posix_fallocate call.  glibc emulates it instead.  However glibc's
emulation involves using the pread(2) syscall.  The pread syscall
fails with EBADF if the file descriptor is opened without the read
open-flag (ie. open (..., O_WRONLY)).

I contacted glibc upstream about this, and their response is here:

  https://bugzilla.redhat.com/show_bug.cgi?id=1265196#c9

There are two possible fixes: Use Linux fallocate directly, or (this
fix) work around the problem in qemu by opening the file with O_RDWR
instead of O_WRONLY.

Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1265196
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-10-02 13:48:29 +02:00
Kevin Wolf 5d555030ba raw-win32: Fix write request error handling
aio_worker() wrote the return code to the wrong variable.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Guangmu Zhu <guangmuzhu@gmail.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2015-10-02 13:48:29 +02:00
Jeff Cody 5279efebcf block: mirror - fix full sync mode when target does not support zero init
During mirror, if the target device does not support zero init, a
mirror may result in a corrupted image for sync="full" mode.

This is due to how the initial dirty bitmap is set up prior to copying
data - we did not mark sectors as dirty that are unallocated.  This
means those unallocated sectors are skipped over on the target, and for
a device without zero init, invalid data may reside in those holes.

If both of the following conditions are true, then we will explicitly
mark all sectors as dirty:

    1.) sync = "full"
    2.) bdrv_has_zero_init(target) == false

If the target does support zero init, but a target image is passed in
with data already present (i.e. an "existing" image), it is assumed the
data present in the existing image is valid data for those sectors.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 91ed4bc5bda7e2b09eb508b07c83f4071fe0b3c9.1443705220.git.jcody@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2015-10-01 15:02:21 -04:00
Peter Maydell 9e071429e6 * First batch of MAINTAINERS updates
* IOAPIC fixes (to pass kvm-unit-tests with -machine kernel_irqchip=off)
 * NBD API upgrades from Daniel
 * strtosz fixes from Marc-André
 * improved support for readonly=on on scsi-generic devices
 * new "info ioapic" and "info lapic" monitor commands
 * Peter Crosthwaite's ELF_MACHINE cleanups
 * docs patches from Thomas and Daniel
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABCAAGBQJWBSAEAAoJEL/70l94x66DeL4H/21YR4GWCqo30f+W5kx24ZNo
 by8H2kdZmWKRr/La1JlAReki9GCP1U8Q0cYC8V885gHLKcahWS/75UKwNbw0OSyg
 2jj4uREc645TTFAvV5kQ+uAw9F/dchvkXylrVgOoUPipfmYibXY8JLu9AcVnZi6H
 X5Rvpqo4Uhp2cbRG7rYWrwgpNL+VZmKc8LDdqdlXrkjjanhuAYO2E9NBKaE+xJQQ
 FHcpkV92iSZFEZ0CB535BTIdNdDM/ae6bw1As27EF10YBTfneCQNazSeh13pLO2n
 lHit2GZr2VeTSBrPkPsItToY/Gw38duVZK4QM5/wSkHBzyeUJY0ltQrf53veYfk=
 =uc+I
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* First batch of MAINTAINERS updates
* IOAPIC fixes (to pass kvm-unit-tests with -machine kernel_irqchip=off)
* NBD API upgrades from Daniel
* strtosz fixes from Marc-André
* improved support for readonly=on on scsi-generic devices
* new "info ioapic" and "info lapic" monitor commands
* Peter Crosthwaite's ELF_MACHINE cleanups
* docs patches from Thomas and Daniel

# gpg: Signature made Fri 25 Sep 2015 11:20:52 BST using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"

* remotes/bonzini/tags/for-upstream: (52 commits)
  doc: Refresh URLs in the qemu-tech documentation
  docs: describe the QEMU build system structure / design
  typedef: add typedef for QemuOpts
  i386: interrupt poll processing
  i386: partial revert of interrupt poll fix
  ppc: Rename ELF_MACHINE to be PPC specific
  i386: Rename ELF_MACHINE to be x86 specific
  alpha: Remove ELF_MACHINE from cpu.h
  mips: Remove ELF_MACHINE from cpu.h
  sparc: Remove ELF_MACHINE from cpu.h
  s390: Remove ELF_MACHINE from cpu.h
  sh4: Remove ELF_MACHINE from cpu.h
  xtensa: Remove ELF_MACHINE from cpu.h
  tricore: Remove ELF_MACHINE from cpu.h
  or32: Remove ELF_MACHINE from cpu.h
  lm32: Remove ELF_MACHINE from cpu.h
  unicore: Remove ELF_MACHINE from cpu.h
  moxie: Remove ELF_MACHINE from cpu.h
  cris: Remove ELF_MACHINE from cpu.h
  m68k: Remove ELF_MACHINE from cpu.h
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-09-25 21:52:30 +01:00
Hitoshi Mitake e6fd57ea29 sheepdog: refine discard support
This patch refines discard support of the sheepdog driver. The
existing discard mechanism was implemented on SD_OP_DISCARD_OBJ, which
was introduced before fine grained reference counting on newer
sheepdog. It doesn't care about relations of snapshots and clones and
discards objects unconditionally.

With this patch, the driver just updates an inode object for updating
reference. Removing the object is done in sheep process side.

Cc: Teruaki Ishizaki <ishizaki.teruaki@lab.ntt.co.jp>
Cc: Vasiliy Tolstov <v.tolstov@selfip.ru>
Cc: Jeff Cody <jcody@redhat.com>
Signed-off-by: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>
Tested-by: Vasiliy Tolstov <v.tolstov@selfip.ru>
Message-id: 1441076590-8015-3-git-send-email-mitake.hitoshi@lab.ntt.co.jp
Signed-off-by: Jeff Cody <jcody@redhat.com>
2015-09-25 10:25:19 -04:00
Hitoshi Mitake 498f21405a sheepdog: use per AIOCB dirty indexes for non overlapping requests
In the commit 96b14ff85acf, requests for overlapping areas are
serialized. However, it cannot handle a case of non overlapping
requests. In such a case, min_dirty_data_idx and max_dirty_data_idx
can be overwritten by the requests and invalid inode update can
happen e.g. a case like create(1, 2) and create(3, 4) are issued in
parallel.

This patch lets SheepdogAIOCB have dirty data indexes instead of
BDRVSheepdogState for avoiding the above situation.

This patch also does trivial renaming for better description:
overwrapping -> overlapping

Cc: Teruaki Ishizaki <ishizaki.teruaki@lab.ntt.co.jp>
Cc: Vasiliy Tolstov <v.tolstov@selfip.ru>
Cc: Jeff Cody <jcody@redhat.com>
Signed-off-by: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>
Tested-by: Vasiliy Tolstov <v.tolstov@selfip.ru>
Message-id: 1441076590-8015-2-git-send-email-mitake.hitoshi@lab.ntt.co.jp
Signed-off-by: Jeff Cody <jcody@redhat.com>
2015-09-25 10:25:19 -04:00
Wen Congyang 06c3916b35 Backup: don't do copy-on-read in before_write_notifier
We will copy data in before_write_notifier to do backup.
It is a nested I/O request, so we cannot do copy-on-read.

The steps to reproduce it:
1. -drive copy-on-read=on,...  // qemu option
2. drive_backup -f disk0 /path_to_backup.img // monitor command

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Tested-by: Jeff Cody <jcody@redhat.com>
Message-id: 1441682913-14320-3-git-send-email-wency@cn.fujitsu.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2015-09-25 08:37:07 -04:00
Wen Congyang 9568b511c9 block: Introduce a new API bdrv_co_no_copy_on_readv()
In some cases, we need to disable copy-on-read, and just
read the data.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Message-id: 1441682913-14320-2-git-send-email-wency@cn.fujitsu.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2015-09-25 08:37:07 -04:00
Liu Yuan 4da65c8092 sheepdog: add reopen support
With reopen supported, block-commit (and offline commit) is now supported for
image files whose base image uses the Sheepdog protocol driver.

Cc: qemu-devel@nongnu.org
Cc: Jeff Cody <jcody@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <liuyuan@cmss.chinamobile.com>
Message-id: 1440730438-24676-1-git-send-email-namei.unix@gmail.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2015-09-25 08:37:07 -04:00
Peter Lieven 18a8056e0b block/nfs: cache allocated filesize for read-only files
If the file is readonly its not expected to grow so
save the blocking call to nfs_fstat_async and use
the value saved at connection time. Also important
the monitor (and thus the main loop) will not hang
if block device info is queried and the NFS share
is unresponsive.

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1440671441-7978-1-git-send-email-pl@kamp.de
Signed-off-by: Jeff Cody <jcody@redhat.com>
2015-09-25 08:37:07 -04:00
Peter Lieven 055c6f912c block/nfs: fix calculation of allocated file size
st.st_blocks is always counted in 512 byte units. Do not
use st.st_blksize as multiplicator which may be larger.

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1440067607-14547-1-git-send-email-pl@kamp.de
Signed-off-by: Jeff Cody <jcody@redhat.com>
2015-09-25 08:37:07 -04:00