qemu-e2k

Author	SHA1	Message	Date
Paolo Bonzini	402a47411b	mirror: support more than one in-flight AIO operation With AIO support in place, we can start copying more than one chunk in parallel. This patch introduces the required infrastructure for this: the buffer is split into multiple granularity-sized chunks, and there is a free list to access them. Because of copy-on-write, a single operation may already require multiple chunks to be available on the free list. In addition, two different iterations on the HBitmap may want to copy the same cluster. We avoid this by keeping a bitmap of in-flight I/O operations, and blocking until the previous iteration completes. This should be a pretty rare occurrence, though; as long as there is no overlap the next iteration can start before the previous one finishes. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-01-25 18:18:35 +01:00
Paolo Bonzini	08e4ed6cde	mirror: add buf-size argument to drive-mirror This makes sense when the next commit starts using the extra buffer space to perform many I/O operations asynchronously. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-01-25 18:18:34 +01:00
Paolo Bonzini	bd48bde8f0	mirror: switch mirror_iteration to AIO There is really no change in the behavior of the job here, since there is still a maximum of one in-flight I/O operation between the source and the target. However, this patch already introduces the AIO callbacks (which are unmodified in the next patch) and some of the logic to count in-flight operations and only complete the job when there is none. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-01-25 18:18:34 +01:00
Paolo Bonzini	eee13dfe30	mirror: allow customizing the granularity The desired granularity may be very different depending on the kind of operation (e.g. continuous replication vs. collapse-to-raw) and whether the VM is expected to perform lots of I/O while mirroring is in progress. Allow the user to customize it, while providing a sane default so that in general there will be no extra allocated space in the target compared to the source. Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-01-25 18:18:34 +01:00
Paolo Bonzini	50717e941b	block: allow customizing the granularity of the dirty bitmap Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-01-25 18:18:34 +01:00
Paolo Bonzini	acc906c6c5	block: return count of dirty sectors, not chunks Reviewed-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-01-25 18:18:33 +01:00
Paolo Bonzini	b812f6719c	mirror: perform COW if the cluster size is bigger than the granularity When mirroring runs, the backing files for the target may not yet be ready. However, this means that a copy-on-write operation on the target would fill the missing sectors with zeros. Copy-on-write only happens if the granularity of the dirty bitmap is smaller than the cluster size (and only for clusters that are allocated in the source after the job has started copying). So far, the granularity was fixed to 1MB; to avoid the problem we detected the situation and required the backing files to be available in that case only. However, we want to lower the granularity for efficiency, so we need a better solution. The solution is to always copy a whole cluster the first time it is touched. The code keeps a bitmap of clusters that have already been allocated by the mirroring job, and only does "manual" copy-on-write if the chunk being copied is zero in the bitmap. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-01-25 18:18:33 +01:00
Paolo Bonzini	8f0720ecbc	block: implement dirty bitmap using HBitmap This actually uses the dirty bitmap in the block layer, and converts mirroring to use an HBitmapIter. Reviewed-by: Laszlo Ersek <lersek@redhat.com> (except block/mirror.c parts) Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-01-25 18:18:33 +01:00
Peter Lieven	7371d56fb2	iscsi: add support for iovectors This patch adds support for directly passing the iovec array from QEMUIOVector if libiscsi supports it (1.8.0 or newer). Signed-off-by: Peter Lieven <pl@kamp.de> [Preserve the improvements from commit `4cc841b`, iscsi: partly avoid iovec linearization in iscsi_aio_writev, 2012-11-19 - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-01-24 15:37:55 +01:00
Paolo Bonzini	4790b03d30	iscsi: do not leak acb->buf when commands are aborted acb->buf is freed in the WRITE(16) callback, but this may not get called at all when commands are aborted. Add another free in the ABORT TASK callback, which requires setting acb->buf to NULL everywhere. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-01-24 15:37:55 +01:00
Anthony Liguori	177f7fc688	Merge remote-tracking branch 'bonzini/scsi-next' into staging # By Peter Lieven (3) and others # Via Paolo Bonzini * bonzini/scsi-next: scsi: Drop useless null test in scsi_unit_attention() lsi: use qbus_reset_all to reset SCSI bus scsi: fix segfault with 0-byte disk iscsi: add support for iSCSI NOPs [v2] iscsi: partly avoid iovec linearization in iscsi_aio_writev iscsi: add iscsi_create support	2013-01-23 09:08:54 -06:00
Peter Lieven	5b5d34ec98	iscsi: add support for iSCSI NOPs [v2] This patch will send NOP-Out PDUs every 5 seconds to the iSCSI target. If a consecutive number of NOP-In replies fail a reconnect is initiated. iSCSI NOPs help to ensure that the connection to the target is still operational. This should not, but in reality may be the case even if the TCP connection is still alive if there are bugs in either the target or the initiator implementation. v2: - track the NOPs inside libiscsi so libiscsi can reset the counter in case it initiates a reconnect. Signed-off-by: Peter Lieven <pl@kamp.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-01-22 15:07:03 +01:00
Peter Lieven	4cc841b57c	iscsi: partly avoid iovec linearization in iscsi_aio_writev libiscsi expects all write16 data in a linear buffer. If the iovec only contains one buffer we can skip the linearization step as well as the additional malloc/free and pass the buffer directly. Reported-by: Ronnie Sahlberg <ronniesahlberg@gmail.com> Signed-off-by: Peter Lieven <pl@kamp.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-01-22 15:07:03 +01:00
Peter Lieven	de8864e5ae	iscsi: add iscsi_create support This patch adds support for bdrv_create. This allows e.g. to use qemu-img to convert from any supported device to an iscsi backed storage as destination. Signed-off-by: Peter Lieven <pl@kamp.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-01-22 15:07:03 +01:00
Anthony Liguori	8b17ed4caa	Merge remote-tracking branch 'stefanha/block' into staging # By Kevin Wolf (4) and others # Via Stefan Hajnoczi * stefanha/block: dataplane: support viostor virtio-pci status bit setting dataplane: avoid reentrancy during virtio_blk_data_plane_stop() win32-aio: use iov utility functions instead of open-coding them win32-aio: Fix memory leak win32-aio: Fix vectored reads aio: Fix return value of aio_poll() ide: Remove wrong assertion block: fix null-pointer bug on error case in block commit	2013-01-20 11:01:10 -06:00
Andreas Färber	c36dd8a09f	block/raw-posix: Make hdev_aio_discard() available outside Linux Fixes the build on OpenBSD among others. Suggested-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Andreas Färber <andreas.faerber@web.de> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2013-01-19 14:35:02 +00:00
Michael Tokarev	3249dbe661	win32-aio: use iov utility functions instead of open-coding them We have iov_from_buf() and iov_to_buf(), use them instead of open-coding these in block/win32-aio.c Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-01-18 09:57:51 +01:00
Kevin Wolf	e8bccad5ac	win32-aio: Fix memory leak The buffer is allocated for both reads and writes, and obviously it should be freed even if an error occurs. Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-01-17 10:58:09 +01:00
Kevin Wolf	bcbbd234d4	win32-aio: Fix vectored reads Copying data in the right direction really helps a lot! Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-01-17 10:57:13 +01:00
Jeff Cody	6d759117d3	block: fix null-pointer bug on error case in block commit This is a bug that was caught by a coverity run by Markus. In the error case when we errored out to exit_restore_open early in the function, 'overlay_bs' was still NULL at that point, although it is used to look up flags and perform a bdrv_reopen(). Move the overlay_bs lookup to where it is needed, and check for NULL before restoring the flags. Also get rid of the unneeded parameter initialization. Reported-By: Markus Armbruster <armbru@redhat.com> Signed-off-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-01-17 10:51:11 +01:00
Markus Armbruster	7191bf311e	block: Fix how mirror_run() frees its buffer It allocates with qemu_blockalign(), therefore it must free with qemu_vfree(), not g_free(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-01-15 17:28:55 +01:00
Markus Armbruster	7479acdbce	win32-aio: Fix how win32_aio_process_completion() frees buffer win32_aio_submit() allocates it with qemu_blockalign(), therefore it must be freed with qemu_vfree(), not g_free(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-01-15 16:47:45 +01:00
Liu Yuan	f700f8e346	sheepdog: clean up sd_aio_setup() The last two parameters of sd_aio_setup() are never used, so remove them. Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Liu Yuan <tailai.ly@taobao.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-01-15 13:40:10 +01:00
Liu Yuan	4778307278	sheepdog: multiplex the rw FD to flush cache This will reduce sockfds connected to the sheep server to one, which simplifies future hacks. Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Liu Yuan <tailai.ly@taobao.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-01-15 11:18:49 +01:00
Paolo Bonzini	8238010b26	block: make discard asynchronous This is easy with the thread pool, because we can use s->is_xfs and s->has_discard from the worker function. QEMU has a widespread assumption that each I/O operation writes less than 2^32 bytes. This patch doesn't fix it throughout of course, but it starts correcting struct RawPosixAIOData so that there is no regression with respect to the synchronous discard implementation. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-01-15 10:03:47 +01:00
Paolo Bonzini	fcd9d45552	raw: support discard on block devices Block devices use a ioctl instead of fallocate, so add a separate implementation. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-01-15 10:03:47 +01:00
Paolo Bonzini	c85191e5c9	raw-posix: remember whether discard failed Avoid sending system calls repeatedly if they shall fail. This does not apply to XFS: if the filesystem-specific ioctl fails, something weird is happening. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-01-15 10:03:47 +01:00
Kusanagi Kouichi	3d4fa43e64	raw-posix: support discard on more filesystems Linux 2.6.38 introduced the filesystem independent interface to deallocate part of a file. As of Linux 3.7, btrfs, ext4, ocfs2, tmpfs and xfs support it. Even though the system calls here are in practice issued on Linux, the code is structured to allow plugging in alternatives for other Unix variants. EOPNOTSUPP is used unconditionally in this patch, but it is supported in both OpenBSD and Mac OS X since forever (see for example http://lists.debian.org/debian-glibc/2006/02/msg00337.html). Signed-off-by: Kusanagi Kouichi <slash@ac.auone-net.jp> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-01-15 10:03:47 +01:00
Kevin Wolf	8d2497c355	qcow2: Fix segfault on zero-length write One of the recent refactoring patches (commit `f50f88b9`) didn't take care to initialise l2meta properly, so with zero-length writes, which don't even enter the write loop, qemu just segfaulted. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-01-15 09:08:55 +01:00
Anthony Liguori	da758bd7a3	Merge remote-tracking branch 'kwolf/for-anthony' into staging * kwolf/for-anthony: dataplane: handle misaligned virtio-blk requests dataplane: extract virtio-blk read/write processing into do_rdwr_cmd() block: make qiov_is_aligned() public raw-posix: fix bdrv_aio_ioctl sheepdog: implement direct write semantics block: do not probe zero-sized disks Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2013-01-14 10:26:26 -06:00
Stefan Hajnoczi	c53b1c5114	block: make qiov_is_aligned() public The qiov_is_aligned() function checks whether a QEMUIOVector meets a BlockDriverState's alignment requirements. This is needed by virtio-blk-data-plane so: 1. Move the function from block/raw-posix.c to block/block.c. 2. Make it public in block/block.h. 3. Rename to bdrv_qiov_is_aligned(). 4. Change return type from int to bool. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-01-14 10:06:56 +01:00
Paolo Bonzini	b608c8dc02	raw-posix: fix bdrv_aio_ioctl When the raw-posix aio=thread code was moved from posix-aio-compat.c to block/raw-posix.c, there was an unintended change to the ioctl code. The code used to return the ioctl command, which posix_aio_read() would later morph into a zero. This hack is not necessary anymore, and in fact breaks scsi-generic (which expects a zero return code). Remove it. Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-01-14 10:06:56 +01:00
Liu Yuan	0e7106d8b5	sheepdog: implement direct write semantics Sheepdog supports both writeback and writethrough writes but has not yet supported DIRECTIO semantics, which bypass the cache completely even if the Sheepdog daemon is set up with cache enabled. Supposing the cache is enabled on the Sheepdog daemon side, the new cache control is cache=writeback # enable the writeback semantics for write cache=writethrough # enable the emulated writethrough semantics for write cache=directsync # disable cache completely Guest WCE toggling at run time to switch between writeback/writethrough is also supported. Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Stefan Hajnoczi <stefanha@gmail.com> Signed-off-by: Liu Yuan <tailai.ly@taobao.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-01-14 10:06:56 +01:00
Paolo Bonzini	4d4545743f	qemu-option: move standard option definitions out of qemu-config.c Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-01-12 17:17:53 +01:00
Stefan Weil	eb7ff6fb0b	Replace remaining gmtime, localtime by gmtime_r, localtime_r This allows removing of MinGW specific code and improves reentrancy for POSIX hosts. [Removed unused ret variable in qemu_get_timedate() to fix warning: vl.c: In function ‘qemu_get_timedate’: vl.c:451:16: error: variable ‘ret’ set but not used [-Werror=unused-but-set-variable] -- Stefan Hajnoczi] Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-01-11 09:44:37 +01:00
Liu Yuan	d6b1ef89a1	sheepdog: pass oid directly to send_pending_req() Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Cc: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Liu Yuan <tailai.ly@taobao.com> Reviewed-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-01-02 16:09:00 +01:00
Liu Yuan	bd751f2204	sheepdog: don't update inode when create_and_write fails For the error case such as SD_RES_NO_SPACE, we shouldn't update the inode bitmap to avoid the scenario that the object is allocated but wasn't created at the server side. This will result in VM's IO error on the failed object. Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Cc: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Liu Yuan <tailai.ly@taobao.com> Reviewed-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-01-02 16:08:58 +01:00
Stefan Weil	fccedc624c	block/raw-win32: Fix compiler warnings (wrong format specifiers) Commit `fbcad04d6b` added fprintf statements with wrong format specifiers. GetLastError() returns a DWORD which is unsigned long, so %lu must be used. Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-01-02 16:08:57 +01:00
Stefan Hajnoczi	4065742ac0	raw-posix: add raw_get_aio_fd() for virtio-blk-data-plane The raw_get_aio_fd() function allows virtio-blk-data-plane to get the file descriptor of a raw image file with Linux AIO enabled. This interface is really a layering violation that can be resolved once the block layer is able to run outside the global mutex - at that point virtio-blk-data-plane will switch from custom Linux AIO code to using the block layer. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-01-02 15:31:39 +01:00
Paolo Bonzini	9c17d615a6	softmmu: move include files to include/sysemu/ Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-12-19 08:32:45 +01:00
Paolo Bonzini	1de7afc984	misc: move include files to include/qemu/ Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-12-19 08:32:39 +01:00
Paolo Bonzini	caf71f86a3	migration: move include files to include/migration/ Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-12-19 08:31:32 +01:00
Paolo Bonzini	737e150e89	block: move include files to include/block/ Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-12-19 08:31:31 +01:00
Paolo Bonzini	7b1b5d1913	qapi: move include files to include/qobject/ Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-12-19 08:31:31 +01:00
Paolo Bonzini	f8fe796407	janitor: do not include qemu-char everywhere Touching char/char.h basically causes the whole of QEMU to be rebuilt. Avoid this, it is usually unnecessary. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-12-19 08:29:59 +01:00
Paolo Bonzini	077805fa92	janitor: do not rely on indirect inclusions of or from qemu-char.h Various header files rely on qemu-char.h including qemu-config.h or main-loop.h, but they really do not need qemu-char.h at all (particularly interesting is the case of the block layer!). Clean this up, and also add missing inclusions of qemu-char.h itself. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-12-19 08:29:52 +01:00
Paolo Bonzini	525877c999	build: move rules from Makefile to */Makefile.objs Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-12-19 08:29:06 +01:00
Kevin Wolf	226c3c26b9	qcow2: Factor out handle_dependencies() Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-12-13 15:37:59 +01:00
Kevin Wolf	4e95314e2b	qcow2: Execute run_dependent_requests() without lock There's no reason for run_dependent_requests() to hold s->lock, and a later patch will require that in fact the lock is not held. Also, before this patch, run_dependent_requests() not only does what its name suggests, but also removes the l2meta from the list of in-flight requests. When changing this, it becomes a one-liner, so just inline it completely. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-12-13 15:37:59 +01:00
Kevin Wolf	280d373579	qcow2: Enable dirty flag in qcow2_alloc_cluster_link_l2 This is closer to where the dirty flag is really needed, and it avoids having checks for special cases related to cluster allocation directly in the writev loop. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-12-13 15:37:59 +01:00
Kevin Wolf	f50f88b9fe	qcow2: Allocate l2meta only for cluster allocations Even for writes to already allocated clusters, an l2meta is allocated, though it stays effectively unused. After this patch, only allocating requests still have one. Each l2meta now describes an in-flight request that writes to clusters that are not yet hooked up in the L2 table. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-12-13 15:37:59 +01:00
Kevin Wolf	060bee8943	qcow2: Drop l2meta.cluster_offset There's no real reason to have an l2meta for normal requests that don't allocate anything. Before we can get rid of it, we must return the host cluster offset in a different way. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-12-13 15:37:59 +01:00
Kevin Wolf	cf5c1a231e	qcow2: Allocate l2meta dynamically As soon as delayed COW is introduced, the l2meta struct is needed even after completion of the request, so it can't live on the stack. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-12-13 15:37:59 +01:00
Kevin Wolf	593fb83cac	qcow2: Introduce Qcow2COWRegion This makes it easier to address the areas for which a COW must be performed. As a nice side effect, the COW code in qcow2_alloc_cluster_link_l2 becomes really trivial. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-12-13 15:37:59 +01:00
Kevin Wolf	1d3afd649b	qcow2: Round QCowL2Meta.offset down to cluster boundary The offset within the cluster is already present as n_start and this is what the code uses. QCowL2Meta.offset is only needed at a cluster granularity. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-12-13 15:37:59 +01:00
Kevin Wolf	67a7a0ebe5	qcow2: Move BLKDBG_EVENT out of the lock We want to use these events to suspend requests for testing concurrent AIO requests. Suspending requests while they are holding the CoMutex is rather boring for this purpose. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-12-12 12:33:48 +01:00
Kevin Wolf	3c90c65d7a	blkdebug: Implement suspend/resume of AIO requests This allows more systematic AIO testing. The patch adds three new operations to blkdebug: * Setting a "breakpoint" on a blkdebug event. The next request that triggers this breakpoint is suspended and is tagged with a name. The breakpoint is removed after a request has triggered it. * A suspended request (identified by its tag) can be resumed. * It's possible to check whether a suspended request with a given tag exists. This can be used for waiting for an event. Ideally, we would instead tag requests right when they are created and set breakpoints for individual requests. However, at this point the block layer doesn't allow this easily, and breakpoints that trigger for any request already allow a lot of useful testing. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-12-12 12:33:48 +01:00
Kevin Wolf	9e35542b0f	blkdebug: Factor out remove_rule() The cleanup work to remove a rule depends on the type of the rule. It's easy for the existing rules as there is no data that must be cleaned up and is specific to a type yet, but the next patch will change this. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-12-12 12:33:48 +01:00
Kevin Wolf	312a2ba0eb	blkdebug: Allow usage without config file As soon as new rules can be set during runtime, as introduced by the next patch, blkdebug makes sense even without a config file. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-12-12 12:33:48 +01:00
Fabien Chouteau	fbcad04d6b	Fix error code checking for SetFilePointer() call An error has occurred if the return value is INVALID_SET_FILE_POINTER and GetLastError() doesn't return NO_ERROR. Signed-off-by: Fabien Chouteau <chouteau@adacore.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2012-12-11 11:36:57 +01:00
Stefan Priebe	473c7f0255	rbd: Fix race between aio completion and aio cancel This one fixes a race, which qemu also had in the iscsi block driver, between cancellation and I/O completion. qemu_rbd_aio_cancel was not synchronously waiting for the end of the command. To achieve this it introduces a new status flag which uses -EINPROGRESS. Signed-off-by: Stefan Priebe <s.priebe@profihost.ag> Reviewed-by: Stefan Hajnoczi <stefanha@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-12-11 11:05:11 +01:00
Paolo Bonzini	c208e8c2d8	raw-posix: inline paio_ioctl into hdev_aio_ioctl clang now warns about an unused function: CC block/raw-posix.o block/raw-posix.c:707:26: warning: unused function paio_ioctl [-Wunused-function] static BlockDriverAIOCB paio_ioctl(BlockDriverState bs, int fd, ^ 1 warning generated. because the only use of paio_ioctl() is inside a #if defined(__linux__) guard and it is static now. Reported-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2012-12-11 11:04:26 +01:00
Charles Arnold	258d2edbcd	block: vpc support for ~2 TB disks The VHD specification allows for up to a 2 TB disk size. The current implementation in qemu emulates EIDE and ATA-2 hardware which only allows for up to 127 GB. This disk size limitation can be overridden by allowing up to 255 heads instead of the normal 4 bit limitation of 16. Doing so allows disk images to be created of up to nearly 2 TB. This change does not violate the VHD format specification nor does it change how smaller disks (ie, <=127GB) are defined. [Charles Arnold also writes: "In analyzing a 160 GB VHD fixed disk image created on Windows 2008 R2, it appears that MS is also ignoring the CHS values in the footer geometry field in whatever driver they use for accessing the image. The CHS values are set at 65535,16,255 which obviously doesn't represent an image size of 160 GB." -- Stefan] Signed-off-by: Charles Arnold <carnold@suse.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2012-12-11 11:04:26 +01:00
Charles Arnold	1fe1fa510a	block: vpc initialize the uuid footer field Initialize the uuid field in the footer with a generated uuid. Signed-off-by: Charles Arnold <carnold@suse.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2012-12-11 11:04:25 +01:00
Kevin Wolf	c57b6656c3	aio: Get rid of qemu_aio_flush() There are no remaining users, and new users should probably be using bdrv_drain_all() in the first place. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-12-11 11:04:25 +01:00
Peter Lieven	f807ecd574	iscsi: do not assume device is zero initialized Without any complex checks we can't assume that an iscsi target is initialized to zero. Signed-off-by: Peter Lieven <pl@kamp.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-11-28 12:51:58 +01:00
Peter Lieven	e829b0bb05	iscsi: fix deadlock during login If the connection is interrupted before the first login is successfully completed qemu-kvm is waiting forever in qemu_aio_wait(). This is fixed by performing an sync login to the target. If the connection breaks after the first successful login errors are handled internally by libiscsi. Signed-off-by: Peter Lieven <pl@kamp.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-11-28 12:50:56 +01:00
Peter Lieven	8da1e18b0c	iscsi: fix segfault in url parsing If an invalid URL is specified iscsi_get_error(iscsi) is called with iscsi == NULL. Signed-off-by: Peter Lieven <pl@kamp.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-11-28 12:46:13 +01:00
Stefan Priebe	08448d5195	use int64_t for return values from rbd instead of int rbd / rados tends to return pretty often length of writes or discarded blocks. These values might be bigger than int. The steps to reproduce are: mkfs.xfs -f a whole device bigger than int in bytes. mkfs.xfs sends a discard. Important is that you use scsi-hd and set discard_granularity=512. Otherwise rbd disabled discard support. Signed-off-by: Stefan Priebe <s.priebe@profihost.ag> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2012-11-21 09:43:23 +01:00
Stefan Hajnoczi	8ba2aae32c	vdi: don't override libuuid symbols It's poor symbol hygiene to provide global symbols that collide with a common library like libuuid. If QEMU links against a shared library that depends on uuid_generate() it can end up calling our stub version of the function. This exact scenario happened with GlusterFS libgfapi.so, which depends on libglusterfs.so's uuid_generate(). Scope the uuid stubs for vdi.c only and avoid affecting other shared objects. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>	2012-11-21 09:40:29 +01:00
Jeff Cody	1bc6b705ee	block: add bdrv_reopen() support for raw hdev, floppy, and cdrom For hdev, floppy, and cdrom, the reopen() handlers are the same as for the file reopen handler. For floppy and cdrom types, however, we keep O_NONBLOCK, as in the _open function. Signed-off-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2012-11-21 09:40:29 +01:00
Gerhard Wiesinger	b1649fae49	vmdk: Fix data corruption bug in WRITE and READ handling Fixed a MAJOR BUG in VMDK files on file boundaries on reads and ALSO ON WRITES WHICH MIGHT CORRUPT THE IMAGE AND DATA!!!!!! Triggered for example with the following VMDK file (partly listed): RW 4193792 FLAT "XP-W1-f001.vmdk" 0 RW 2097664 FLAT "XP-W1-f002.vmdk" 0 RW 4193792 FLAT "XP-W1-f003.vmdk" 0 RW 512 FLAT "XP-W1-f004.vmdk" 0 RW 4193792 FLAT "XP-W1-f005.vmdk" 0 RW 2097664 FLAT "XP-W1-f006.vmdk" 0 RW 4193792 FLAT "XP-W1-f007.vmdk" 0 RW 512 FLAT "XP-W1-f008.vmdk" 0 Patch includes: 1.) Patch fixes wrong calculation on extent boundaries. Especially it fixes the relativeness of the sector number to the current extent. Verified correctness with: 1.) Converted either with Virtualbox to VDI and then with qemu-img and then with qemu-img only: VBoxManage clonehd --format vdi /VM/XP-W/new/XP-W1.vmdk ~/.VirtualBox/Harddisks/XP-W1-new-test.vdi ./qemu-img convert -O raw ~/.VirtualBox/Harddisks/XP-W1-new-test.vdi /root/QEMU/VM-XP-W1/XP-W1-via-VBOX.img md5sum /root/QEMU/VM-XP-W/XP-W1-direct.img md5sum /root/QEMU/VM-XP-W/XP-W1-via-VBOX.img => same MD5 hash 2.) Verified debug log files 3.) Run Windows XP successfully 4.) chkdsk run successfully without any errors Signed-off-by: Gerhard Wiesinger <lists@wiesinger.com> Acked-by: Fam Zheng <famcool@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-11-14 18:19:23 +01:00
Stefan Hajnoczi	d7331bed11	aio: rename AIOPool to AIOCBInfo Now that AIOPool no longer keeps a freelist, it isn't really a "pool" anymore. Rename it to AIOCBInfo and make it const since it no longer needs to be modified. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-11-14 18:19:21 +01:00
Stefan Weil	cee40d2d2d	block: Workaround for older versions of MinGW gcc Versions before gcc-4.6 don't support unnamed fields in initializers (see http://gcc.gnu.org/bugzilla/show_bug.cgi?id=10676). Offset and OffsetHigh belong to an unnamed struct which is part of an unnamed union. Therefore the original code does not work with older versions of gcc. Signed-off-by: Stefan Weil <sw@weilnetz.de> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-11-14 18:19:21 +01:00
Kevin Wolf	a354807706	qcow2: Fix refcount table size calculation A missing factor for the refcount table entry size in the calculation could mean that too little memory was allocated for the in-memory representation of the table, resulting in a buffer overflow. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Michael Tokarev <mjt@tls.msk.ru> Tested-by: Michael Tokarev <mjt@tls.msk.ru>	2012-11-14 18:19:21 +01:00
Paolo Bonzini	1d7d2a9d21	nbd: accept URIs The URI syntax is consistent with the Gluster syntax. Export names are specified in the path, preceded by one or more (otherwise unused) slashes. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-11-12 14:38:28 +01:00
Paolo Bonzini	d04b0bbbc9	nbd: accept relative path to Unix socket Adding the "is_unix" member now will simplify the parsing of NBD URIs. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-11-12 11:33:29 +01:00
Paolo Bonzini	f563a5d7a8	Merge remote-tracking branch 'origin/master' into threadpool Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-10-31 10:42:51 +01:00
Paolo Bonzini	a27365265c	raw-win32: implement native asynchronous I/O With the new support for EventNotifiers in the AIO event loop, we can hook a completion port to every opened file and use asynchronous I/O on them. Wine's support is extremely inefficient, also because it really does the I/O synchronously on regular files. (!) But it works, and it is good to keep the Win32 and POSIX ports as similar as possible. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-10-31 10:38:13 +01:00
Paolo Bonzini	10fb6e0682	raw-posix: move linux-aio.c to block/ Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-10-31 10:38:13 +01:00
Paolo Bonzini	fc4edb84bf	raw-win32: add emulated AIO support Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-10-31 10:38:13 +01:00
Paolo Bonzini	9f8540ecef	raw-posix: rename raw-posix-aio.h, hide unavailable prototypes Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-10-31 10:38:12 +01:00
Paolo Bonzini	de81a16936	raw: merge posix-aio-compat.c into block/raw-posix.c Making the qemu_paiocb specific to raw devices will let us access members of the BDRVRawState arbitrarily. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-10-31 10:38:12 +01:00
Paolo Bonzini	47e6b251a5	block: switch posix-aio-compat to threadpool This is not meant for portability, but to remove code duplication. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-10-31 10:38:12 +01:00
Paolo Bonzini	f42b22077b	aio: add Win32 implementation The Win32 implementation will only accept EventNotifiers, thus a few drivers are disabled under Windows. EventNotifiers are a good match for the GSource implementation, too, because the Win32 port of glib allows to place their HANDLEs in a GPollFD. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-10-30 09:30:53 +01:00
Paolo Bonzini	b952b5589a	mirror: add support for on-source-error/on-target-error Error management is important for mirroring; otherwise, an error on the target (even something as "innocent" as ENOSPC) requires to start again with a full copy. Similar to on_read_error/on_write_error, two separate knobs are provided for on_source_error (reads) and on_target_error (writes). The default is 'report' for both. The 'ignore' policy will leave the sector dirty, so that it will be retried later. Thus, it will not cause corruption. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-10-24 10:26:22 +02:00
Paolo Bonzini	d63ffd87ac	mirror: implement completion Switching to the target of the migration is done mostly asynchronously, and reported to management via the BLOCK_JOB_COMPLETED event; the only synchronous phase is opening the backing files. bdrv_open_backing_file can always be done, even for migration of the full image (aka sync: 'full'). In this case, qmp_drive_mirror will create the target disk with no backing file at all, and bdrv_open_backing_file will be a no-op. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-10-24 10:26:22 +02:00
Paolo Bonzini	893f7ebafe	mirror: introduce mirror job This patch adds the implementation of a new job that mirrors a disk to a new image while letting the guest continue using the old image. The target is treated as a "black box" and data is copied from the source to the target in the background. This can be used for several purposes, including storage migration, continuous replication, and observation of the guest I/O in an external program. It is also a first step in replacing the inefficient block migration code that is part of QEMU. The job is possibly never-ending, but it is logically structured into two phases: 1) copy all data as fast as possible until the target first gets in sync with the source; 2) keep target in sync and ensure that reopening to the target gets a correct (full) copy of the source data. The second phase is indicated by the progress in "info block-jobs" reporting the current offset to be equal to the length of the file. When the job is cancelled in the second phase, QEMU will run the job until the source is clean and quiescent, then it will report successful completion of the job. In other words, the BLOCK_JOB_CANCELLED event means that the target may _not_ be consistent with a past state of the source; the BLOCK_JOB_COMPLETED event means that the target is consistent with a past state of the source. (Note that it could already happen that management lost the race against QEMU and got a completion event instead of cancellation). It is not yet possible to complete the job and switch over to the target disk. The next patches will fix this and add many refinements to the basic idea introduced here. These include improved error management, some tunable knobs and performance optimizations. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-10-24 10:26:19 +02:00
Paolo Bonzini	65f4632243	block: rename block_job_complete to block_job_completed The imperative will be used for the QMP command. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-10-24 10:26:19 +02:00
Jeff Cody	d5208c45be	block: in commit, determine base image from the top image This simplifies some code and error checking, and also fixes a bug. bdrv_find_backing_image() should only be passed absolute filenames, or filenames relative to the chain. In the QMP message handler for block commit, when looking up the base do so from the determined top image, so we know it is reachable from top. Some of the error messages put out by block-commit have changed slightly, which causes 2 tests cases for block-commit to fail. This patch updates the test cases to look for the correct error output. Signed-off-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-10-24 10:26:19 +02:00
MORITA Kazutaka	2f5368017f	sheepdog: use bool for boolean variables This improves readability. Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2012-10-12 10:47:35 +02:00
Aurelien Jarno	048d3612a5	Merge branch 'trivial-patches' of git://github.com/stefanha/qemu * 'trivial-patches' of git://github.com/stefanha/qemu: versatilepb: Use symbolic indices for ARM PIC qdev: kill bogus comment qemu-barrier: Fix compiler version check for future gcc versions hw: Add missing 'static' attribute for QEMUMachine cleanup useless return sentence qemu-sockets: Fix compiler warning (regression for MinGW) vnc: Fix spelling (hellmen -> hellman) in comment slirp: Fix spelling in comment (enought -> enough, insure -> ensure) tcg/arm: Use tcg_out_mov_reg rather than inline equivalent code cpu: Add missing 'static' attribute to qemu_global_mutex configure: Support empty target list (--target-list=) hw: Fix return value check for bdrv_read, bdrv_write	2012-10-06 18:54:14 +02:00
Amos Kong	4d5b97da35	cleanup useless return sentence This patch cleans up return sentences in the end of void functions. Reported-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Amos Kong <akong@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>	2012-10-05 15:10:21 +02:00
Jim Meyering	00ea188125	qcow2: mark this file's sole strncpy use as justified Acked-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Jim Meyering <meyering@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2012-10-05 07:58:38 -05:00
Jim Meyering	d66f8e7bd3	vmdk: relative_path: use pstrcpy in place of strncpy Avoid strncpy+manual-NUL-terminate. Use pstrcpy instead. Acked-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Jim Meyering <meyering@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2012-10-05 07:58:36 -05:00
Jim Meyering	3178e2755e	sheepdog: avoid a few buffer overruns * parse_vdiname: Use pstrcpy, not strncpy, when the destination buffer must be NUL-terminated. * sd_open: Likewise, avoid buffer overrun. * do_sd_create: Likewise. Leave the preceding memset, since pstrcpy does not NUL-fill, and filename needs that. * sd_snapshot_create: Add a comment/question. * find_vdi_name: Remove a useless memset. * sd_snapshot_goto: Remove a useless memset. Use pstrcpy to NUL-terminate, because find_vdi_name requires that its vdi arg (filename parameter) be NUL-terminated. It seems ok not to NUL-fill the buffer. Do the same for snapid: remove useless memset-0 (instead, zero tag[0]). Use pstrcpy, not strncpy. * sd_snapshot_list: Use pstrcpy, not strncpy to write into the ->name member. Each must be NUL-terminated. Acked-by: Kevin Wolf <kwolf@redhat.com> Acked-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Jim Meyering <meyering@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2012-10-05 07:58:36 -05:00
Paolo Bonzini	8f96b5be92	blkdebug: process all set_state rules in the old state Currently it is impossible to write a blkdebug script that ping-pongs between two states, because the second set-state rule will use the state that is set in the first. If you have [set-state] event = "..." state = "1" new_state = "2" [set-state] event = "..." state = "2" new_state = "1" for example the state will remain locked at 1. This can be fixed by first processing all rules, and then setting the state. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-09-28 19:40:56 +02:00
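The ping-pong problem described in the blkdebug commit above can be sketched in a few lines. This is a minimal illustration in Python, not the blkdebug code itself; the `SetStateRule` class and both function names are hypothetical. The buggy variant lets a later rule see the state written by an earlier rule, while the fixed variant matches every rule against the state that was current when the event fired.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SetStateRule:
    """Hypothetical stand-in for one [set-state] section."""
    event: str
    state: int
    new_state: int


def process_event_buggy(rules, event, state):
    # Each rule is matched against the *current* state, so a second
    # rule can immediately undo what the first rule just did.
    for rule in rules:
        if rule.event == event and rule.state == state:
            state = rule.new_state
    return state


def process_event_fixed(rules, event, state):
    # Match every rule against the old state, and only then let the
    # new state take effect.
    old_state = state
    for rule in rules:
        if rule.event == event and rule.state == old_state:
            state = rule.new_state
    return state


# The two rules from the commit message: 1 -> 2 and 2 -> 1.
PING_PONG = [
    SetStateRule(event="ev", state=1, new_state=2),
    SetStateRule(event="ev", state=2, new_state=1),
]
```

With `PING_PONG`, the buggy variant returns 1 when starting from state 1 (the state stays locked), whereas the fixed variant alternates between 1 and 2 on successive events.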
Paolo Bonzini	1d809098aa	stream: add on-error argument This patch adds support for error management to streaming. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-09-28 19:40:56 +02:00
Paolo Bonzini	92aa5c6d77	iostatus: move BlockdevOnError declaration to QAPI This will let block-stream reuse the enum. Places that used the enums are renamed accordingly. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-09-28 19:40:26 +02:00
Paolo Bonzini	2f0c9fe64c	block: move job APIs to separate files Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-09-28 19:14:26 +02:00
Jeff Cody	747ff60263	block: add live block commit functionality This adds the live commit coroutine. This iteration focuses on the commit only below the active layer, and not the active layer itself. The behaviour is similar to block streaming; the sectors are walked through, and anything that exists above 'base' is committed back down into base. At the end, intermediate images are deleted, and the chain stitched together. Images are restored to their original open flags upon completion. Signed-off-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-09-28 18:23:12 +02:00
Bharata B Rao	8d6d89cb63	block: Support GlusterFS as a QEMU block backend. This patch adds gluster as the new block backend in QEMU. This gives QEMU the ability to boot VM images from gluster volumes. It's already possible to boot from VM images on gluster volumes using FUSE mount, but this patchset provides the ability to boot VM images from gluster volumes by bypassing the FUSE layer in gluster. This is made possible by using libgfapi routines to perform IO on gluster volumes directly. VM Image on gluster volume is specified like this: file=gluster[+transport]://[server[:port]]/volname/image[?socket=...] 'gluster' is the protocol. 'transport' specifies the transport type used to connect to gluster management daemon (glusterd). Valid transport types are tcp, unix and rdma. If a transport type isn't specified, then tcp type is assumed. 'server' specifies the server where the volume file specification for the given volume resides. This can be either hostname, ipv4 address or ipv6 address. ipv6 address needs to be within square brackets [ ]. If transport type is 'unix', then 'server' field should not be specified. The 'socket' field needs to be populated with the path to unix domain socket. 'port' is the port number on which glusterd is listening. This is optional and if not specified, QEMU will send 0 which will make gluster use the default port. If the transport type is unix, then 'port' should not be specified. 'volname' is the name of the gluster volume which contains the VM image. 'image' is the path to the actual VM image that resides on gluster volume. 
Examples: file=gluster://1.2.3.4/testvol/a.img file=gluster+tcp://1.2.3.4/testvol/a.img file=gluster+tcp://1.2.3.4:24007/testvol/dir/a.img file=gluster+tcp://[1:2:3:4:5:6:7:8]/testvol/dir/a.img file=gluster+tcp://[1:2:3:4:5:6:7:8]:24007/testvol/dir/a.img file=gluster+tcp://server.domain.com:24007/testvol/dir/a.img file=gluster+unix:///testvol/dir/a.img?socket=/tmp/glusterd.socket file=gluster+rdma://1.2.3.4:24007/testvol/a.img Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-09-28 17:58:12 +02:00
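The URI rules spelled out in the GlusterFS commit above can be checked with a small parser. The following is a rough Python sketch built on the standard `urllib.parse` module, not QEMU's own parser; the function name `parse_gluster_uri` and the returned dictionary layout are assumptions made for illustration. It applies the rules stated in the message: tcp is the default transport, port 0 stands for "use the default port", and the unix transport takes a `socket` parameter instead of a server and port.

```python
from urllib.parse import parse_qs, urlsplit

VALID_TRANSPORTS = {"tcp", "unix", "rdma"}


def parse_gluster_uri(uri):
    """Split gluster[+transport]://[server[:port]]/volname/image[?socket=...]."""
    parts = urlsplit(uri)
    scheme, _, transport = parts.scheme.partition("+")
    if scheme != "gluster":
        raise ValueError("not a gluster URI: %r" % uri)
    transport = transport or "tcp"  # tcp is assumed when none is given
    if transport not in VALID_TRANSPORTS:
        raise ValueError("unknown transport: %r" % transport)

    # The first path component is the volume, the rest is the image path.
    volname, _, image = parts.path.lstrip("/").partition("/")
    if not volname or not image:
        raise ValueError("volname and image are both required")

    socket = parse_qs(parts.query).get("socket", [None])[0]
    if transport == "unix":
        # unix transport: no server or port, but a socket path is needed.
        if parts.hostname or parts.port is not None or not socket:
            raise ValueError("unix transport needs only ?socket=...")

    return {
        "transport": transport,
        "server": parts.hostname,   # IPv6 brackets are stripped by urlsplit
        "port": parts.port or 0,    # 0 means "let gluster pick the default"
        "volname": volname,
        "image": image,
        "socket": socket,
    }
```

Calling `parse_gluster_uri` on the example URIs listed in the commit message yields the transport, server, port, volume and image fields separately; whether QEMU rejects the same malformed inputs as this sketch is not stated in the log.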
Anthony Liguori	444dbc381b	Merge remote-tracking branch 'kwolf/for-anthony' into staging * kwolf/for-anthony: block: remove keep_read_only flag from BlockDriverState struct block: convert bdrv_commit() to use bdrv_reopen() block: vpc image file reopen block: vdi image file reopen block: vmdk image file reopen block: qcow image file reopen block: qcow2 image file reopen block: qed image file reopen block: raw image file reopen block: raw-posix image file reopen block: purge s->aligned_buf and s->aligned_buf_size from raw-posix.c block: use BDRV_O_NOCACHE instead of s->aligned_buf in raw-posix.c block: do not parse BDRV_O_CACHE_WB in block drivers block: move open flag parsing in raw block drivers to helper functions block: move aio initialization into a helper function block: Framework for reopening files safely block: make bdrv_set_enable_write_cache() modify open_flags block: correctly set the keep_read_only flag blockdev: preserve readonly and snapshot states across media changes	2012-09-25 16:06:16 -05:00
Jeff Cody	3fe4b70008	block: vpc image file reopen There is currently nothing that needs to be done for VPC image file reopen. Signed-off-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-09-24 15:15:12 +02:00
Jeff Cody	ecfe2bbabb	block: vdi image file reopen There is currently nothing that needs to be done for VDI reopen. Signed-off-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-09-24 15:15:12 +02:00
Jeff Cody	3897575f1c	block: vmdk image file reopen This patch supports reopen for VMDK image files. VMDK extents are added to the existing reopen queue, so that the transactional model of reopen is maintained with multiple image files. Signed-off-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-09-24 15:15:12 +02:00
Jeff Cody	d177692ede	block: qcow image file reopen These are the stubs for the file reopen drivers for the qcow format. There is currently nothing that needs to be done by the qcow driver in reopen. Signed-off-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-09-24 15:15:12 +02:00
Jeff Cody	21d82ac95f	block: qcow2 image file reopen These are the stubs for the file reopen drivers for the qcow2 format. There is currently nothing that needs to be done by the qcow2 driver in reopen. Signed-off-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-09-24 15:15:12 +02:00
Jeff Cody	f9cb20f167	block: qed image file reopen These are the stubs for the file reopen drivers for the qed format. There is currently nothing that needs to be done by the qed driver in reopen. Signed-off-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-09-24 15:15:12 +02:00
Jeff Cody	01bdddb5aa	block: raw image file reopen These are the stubs for the file reopen drivers for the raw format. There is currently nothing that needs to be done by the raw driver in reopen. Signed-off-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-09-24 15:15:12 +02:00
Jeff Cody	eeb6b45d48	block: raw-posix image file reopen This is derived from the Supriya Kannery's reopen patches. This contains the raw-posix driver changes for the bdrv_reopen_* functions. All changes are staged into a temporary scratch buffer during the prepare() stage, and copied over to the live structure during commit(). Upon abort(), all changes are abandoned, and the live structures are unmodified. The _prepare() will create an extra fd - either by means of a dup, if possible, or opening a new fd if not (for instance, access control changes). Upon _commit(), the original fd is closed and the new fd is used. Upon _abort(), the duplicate/new fd is closed. Signed-off-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-09-24 15:15:12 +02:00
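The prepare/commit/abort handling of file descriptors described in the raw-posix reopen commit above follows a common transactional pattern. The sketch below shows that pattern in Python with `os.dup`; it is only an illustration under simplifying assumptions (the `RawFile` class and its method names are hypothetical, and the case where a fresh open is needed instead of a dup is left out).

```python
import os


class RawFile:
    """Hypothetical holder of a live file descriptor."""

    def __init__(self, fd):
        self.fd = fd             # descriptor currently in use
        self._staged_fd = None   # descriptor created by prepare, if any

    def reopen_prepare(self):
        # Stage a second descriptor; the live one is left untouched.
        self._staged_fd = os.dup(self.fd)

    def reopen_commit(self):
        # Switch to the staged descriptor and close the original.
        old_fd, self.fd = self.fd, self._staged_fd
        self._staged_fd = None
        os.close(old_fd)

    def reopen_abort(self):
        # Drop the staged descriptor; the live one stays as it was.
        if self._staged_fd is not None:
            os.close(self._staged_fd)
            self._staged_fd = None
```

Because nothing on the live object changes until `reopen_commit`, a failure in any other file of the same transaction can be handled by calling `reopen_abort` on each prepared file.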
Jeff Cody	3d1807ac67	block: purge s->aligned_buf and s->aligned_buf_size from raw-posix.c The aligned_buf pointer and aligned_buf size are no longer used in raw_posix.c, so remove all references to them. Signed-off-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-09-24 15:15:12 +02:00
Jeff Cody	9acc5a06d4	block: use BDRV_O_NOCACHE instead of s->aligned_buf in raw-posix.c Rather than check for a non-NULL aligned_buf to determine if raw_aio_submit needs to check for alignment, check for the presence of BDRV_O_NOCACHE in the bs->open_flags. Signed-off-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-09-24 15:15:11 +02:00
Jeff Cody	39c9fb9565	block: do not parse BDRV_O_CACHE_WB in block drivers Block drivers should ignore BDRV_O_CACHE_WB in .bdrv_open flags, and in the bs->open_flags. This patch removes the code, leaving the behaviour behind as if BDRV_O_CACHE_WB was set. Signed-off-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-09-24 15:15:11 +02:00
Jeff Cody	6a8dc0422e	block: move open flag parsing in raw block drivers to helper functions Code motion, to move parsing of open flags into a helper function. Signed-off-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-09-24 15:15:11 +02:00
Jeff Cody	fc32a72dc1	block: move aio initialization into a helper function Move AIO initialization for raw-posix block driver into a helper function. In addition to just code motion, the aio_ctx pointer is checked for NULL, prior to calling laio_init(), to make sure laio_init() is only run once. Signed-off-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-09-24 15:15:11 +02:00
Ronnie Sahlberg	40a13ca8d2	iSCSI: We don't need to explicitly call qemu_notify_event() any more We no longer need to explicitly call qemu_notify_event(), since this is now done automatically any time the filehandles we listen to change. Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-09-21 16:12:33 +02:00
Ronnie Sahlberg	f1a12821d7	iSCSI: We need to support SG_IO also from iscsi_ioctl() We need to support SG_IO from the synchronous iscsi_ioctl() since scsi-block uses this to do an INQ to the device to discover its properties This patch makes scsi-block work with iscsi. Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-09-21 16:05:51 +02:00
Stefan Weil	514f21a5d4	vdi: Fix warning from clang ccc-analyzer reports these warnings: block/vdi.c:704:13: warning: Dereference of null pointer bmap[i] = VDI_UNALLOCATED; ^ block/vdi.c:702:13: warning: Dereference of null pointer bmap[i] = i; ^ Moving some code into the if block fixes this. It also avoids calling function write with 0 bytes of data. Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-09-12 15:50:09 +02:00
Stefan Weil	45724d6d02	block/curl: Fix wrong free statement Report from smatch: block/curl.c:546 curl_close(21) info: redundant null check on s->url calling free() The check was redundant, and free was also wrong because the memory was allocated using g_strdup. Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-09-12 15:50:09 +02:00
MORITA Kazutaka	1f7a48de44	sheepdog: fix savevm and loadvm This patch sets data to be sent to Sheepdog correctly and fixes savevm and loadvm operations on a Sheepdog image. Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-09-12 15:50:09 +02:00
Anthony Liguori	cdedd9d867	Merge remote-tracking branch 'kwolf/for-anthony' into staging * kwolf/for-anthony: qemu-iotests: add backing file smaller than image test case stream: complete early if end of backing file is reached qed: refuse unaligned zero writes with a backing file	2012-08-31 10:04:18 -05:00
Stefan Hajnoczi	571cd9dcc7	stream: complete early if end of backing file is reached It is possible to create an image that is larger than its backing file. Reading beyond the end of the backing file produces zeroes if no writes have been made to those sectors in the image file. This patch finishes streaming early when the end of the backing file is reached. Without this patch the block job hangs and continually tries to stream the first sectors beyond the end of the backing file. To reproduce the hung block job bug: $ qemu-img create -f qcow2 backing.qcow2 128M $ qemu-img create -f qcow2 -o backing_file=backing.qcow2 image.qcow2 6G $ qemu -drive if=virtio,cache=none,file=image.qcow2 (qemu) block_stream virtio0 (qemu) info block-jobs The qemu-iotests 030 streaming test still passes. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-08-29 15:23:35 +02:00
Stefan Hajnoczi	ef72f76e58	qed: refuse unaligned zero writes with a backing file Zero writes have cluster granularity in QED. Therefore they can only be used to zero entire clusters. If the zero write request leaves sectors untouched, zeroing the entire cluster would obscure the backing file. Instead return -ENOTSUP, which is handled by block.c:bdrv_co_do_write_zeroes() and falls back to a regular write. The qemu-iotests 034 test cases covers this scenario. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-08-29 15:23:35 +02:00
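The alignment rule in the QED commit above amounts to a simple check. The following Python sketch illustrates it; the function name and parameters are hypothetical, and the behaviour without a backing file (accepting the request) is an assumption drawn from the commit subject rather than from the driver code.

```python
import errno


def check_zero_write(offset, length, cluster_size, has_backing_file):
    """Return 0 if a zero write may proceed, or -ENOTSUP to request
    a fallback to a regular write (all values in bytes)."""
    aligned = offset % cluster_size == 0 and length % cluster_size == 0
    if has_backing_file and not aligned:
        # Zeroing whole clusters would hide backing-file data in the
        # sectors that the request does not cover.
        return -errno.ENOTSUP
    return 0
```

A caller that receives `-ENOTSUP` would then issue an ordinary write of zeroes for the requested range, which is the fallback the commit message attributes to `bdrv_co_do_write_zeroes()`.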
Ronnie Sahlberg	135b908878	iscsi: Set number of blocks to 0 for blank CDROM devices The number of blocks of the device is used to compute the device size in bdrv_getlength()/iscsi_getlength(). For MMC devices, the ReturnedLogicalBlockAddress in the READCAPACITY10 has a special meaning when it is 0. In this case it does not mean that LBA 0 is the last accessible LBA, and thus the device has 1 readable block, but instead it means that the disc is blank and there are no readable blocks. This change ensures that when the iSCSI LUN is loaded with a blank DVD-R disk or similar that bdrv_getlength() will return the correct size of the device as 0 bytes. Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>	2012-08-28 14:50:08 +02:00
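The special case for blank media described in the iSCSI commit above can be written as a short calculation. This is a Python sketch of the rule as stated in the message, not the code in block/iscsi.c; the function names and the `is_mmc` flag are hypothetical.

```python
def num_blocks_from_readcapacity10(returned_lba, is_mmc):
    """Number of readable blocks implied by a READCAPACITY10 reply."""
    if is_mmc and returned_lba == 0:
        # For MMC devices, a returned LBA of 0 means a blank disc,
        # not a device whose only readable block is LBA 0.
        return 0
    # Otherwise the reply names the last accessible LBA.
    return returned_lba + 1


def device_length(returned_lba, block_size, is_mmc):
    """Device size in bytes, as a getlength-style helper would report it."""
    return num_blocks_from_readcapacity10(returned_lba, is_mmc) * block_size
```

For a blank DVD-R the helper reports a length of 0 bytes, while a non-MMC device returning LBA 0 is still treated as having one readable block.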
Anthony Liguori	a9b670b139	Merge remote-tracking branch 'bonzini/scsi-next' into staging * bonzini/scsi-next: virtio-scsi: add backwards-compatibility properties for 1.1 and earlier machines iscsi: fix races between task completion and abort iscsi: simplify iscsi_schedule_bh iscsi: move iscsi_schedule_bh and iscsi_readv_writev_bh_cb Revert "iscsi: Fix NULL dereferences / races between task completion and abort"	2012-08-22 13:31:17 -05:00
Anthony Liguori	7b2f89c435	Merge remote-tracking branch 'kwolf/for-anthony' into staging * kwolf/for-anthony: virtio-blk: hide VIRTIO_BLK_F_CONFIG_WCE from old machine types Documentation: Warn against qemu-img on active image vmdk: Read footer for streamOptimized images vmdk: Fix header structure Conflicts: hw/virtio-blk.c	2012-08-22 13:01:05 -05:00
Jim Meyering	a7e47d4bfc	sheepdog: don't leak socket file descriptor upon connection failure Signed-off-by: Jim Meyering <meyering@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2012-08-22 10:47:14 -05:00
Paolo Bonzini	1bd075f29e	iscsi: fix races between task completion and abort This patch fixes two main issues with block/iscsi.c: 1) iscsi_task_mgmt_abort_task_async calls iscsi_scsi_task_cancel which was also directly called in iscsi_aio_cancel 2) a race between task completion and task abortion could happen because the scsi_free_scsi_task calls were done before iscsi_schedule_bh had finished. To fix this, all the freeing of IscsiTasks and releasing of the AIOCBs is centralized in iscsi_bh_cb, independent of whether the SCSI command has completed or was cancelled. 3) iscsi_aio_cancel was not synchronously waiting for the end of the command. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-08-20 15:58:47 +02:00
Paolo Bonzini	cfb3f5064a	iscsi: simplify iscsi_schedule_bh It is always used with the same callback, remove the argument. And its return value is never used, assume allocation succeeds. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-08-20 15:58:47 +02:00
Paolo Bonzini	27cbd828c6	iscsi: move iscsi_schedule_bh and iscsi_readv_writev_bh_cb Put these functions at the beginning, to avoid forward references in the next patches. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-08-20 15:58:47 +02:00
Paolo Bonzini	b209091957	Revert "iscsi: Fix NULL dereferences / races between task completion and abort" This reverts commit `64e69e8092`. The commit returned immediately from iscsi_aio_cancel, risking corruption in case the following happens: guest qemu target ========================================================================= send write 1 --------> send write 1 --------> cancel write 1 ------> cancel write 1 ------> <------------------ cancellation processed send write 2 --------> send write 2 --------> <---------------- completed write 2 <------------------ completed write 2 <---------------- completed write 1 <---------------- cancellation not done Here, the guest would see write 2 superseding write 1, when in fact the outcome could have been the opposite. The right behavior is to return only after the target says whether the cancellation was done or not, and it will be implemented by the next three patches. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-08-20 15:50:45 +02:00
Kevin Wolf	65bd155c73	vmdk: Read footer for streamOptimized images The footer takes precedence over the header when it exists. It contains the real grain directory offset that is missing in the header. Without this patch, streamOptimized images with a footer cannot be read. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Reviewed-by: Jeff Cody <jcody@redhat.com>	2012-08-17 13:27:02 +02:00
Kevin Wolf	7a736bfa4e	vmdk: Fix header structure Commit `bb45ded9` swapped gd_offset and rgd_offset. This is wrong. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-08-17 11:14:19 +02:00
Stefan Priebe	64e69e8092	iscsi: Fix NULL dereferences / races between task completion and abort Signed-off-by: Stefan Priebe <s.priebe@profihost.ag> Acked-by: Ronnie Sahlberg <ronniesahlberg@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-08-15 13:16:22 +02:00
Corey Bryant	2e1e79dae7	block: Convert close calls to qemu_close This patch converts all block layer close calls, that correspond to qemu_open calls, to qemu_close. Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-08-15 10:48:57 +02:00
Corey Bryant	6165f4d85d	block: Convert open calls to qemu_open This patch converts all block layer open calls to qemu_open. Note that this adds the O_CLOEXEC flag to the changed open paths when the O_CLOEXEC macro is defined. Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-08-15 10:48:57 +02:00
Corey Bryant	e174082835	block: Prevent detection of /dev/fdset/ as floppy Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-08-15 10:48:57 +02:00
Anthony Liguori	53810bab3a	Merge remote-tracking branch 'kwolf/for-anthony' into staging * kwolf/for-anthony: qemu-iotests: skip 039 with ./check -nocache block: add BLOCK_O_CHECK for qemu-img check qcow2: mark image clean after repair succeeds qed: mark image clean after repair succeeds blockdev: flip default cache mode from writethrough to writeback virtio-blk: disable write cache if not negotiated virtio-blk: support VIRTIO_BLK_F_CONFIG_WCE qemu-iotests: Save some sed processes ahci: Fix sglist memleak in ahci_dma_rw_buf() ahci: Fix ahci cdrom read corruptions for reads > 128k virtio-blk: fix use-after-free while handling scsi commands	2012-08-11 19:48:50 -05:00
Anthony Liguori	312942619a	Merge remote-tracking branch 'bonzini/scsi-next' into staging * bonzini/scsi-next: scsi-disk: add support for the UNMAP command scsi-disk: improve out-of-range LBA detection for WRITE SAME scsi-disk: more assertions and resets for aiocb virtio-scsi: do not compare 32-bit QEMU tags against 64-bit virtio-scsi tags iscsi: Pick default initiator-name based on the name of the VM iscsi: reorganize code for parse_initiator_name iscsi: do not leak initiator_name	2012-08-11 17:11:23 -05:00
Stefan Hajnoczi	058f8f16db	block: add BLOCK_O_CHECK for qemu-img check Image formats with a dirty bit, like qed and qcow2, repair dirty image files upon open with BDRV_O_RDWR. Performing automatic repair when qemu-img check runs is not ideal because the bdrv_open() call repairs the image before the actual bdrv_check() call from qemu-img.c. Fix this "double repair" since it leads to confusing output from qemu-img check. Tell the block driver that this image is being opened just for bdrv_check(). This skips automatic repair and qemu-img.c can invoke it manually with bdrv_check(). Update the golden output for qemu-iotests 039 to reflect the new qemu-img check output. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-08-10 10:25:12 +02:00
Stefan Hajnoczi	acbe59829e	qcow2: mark image clean after repair succeeds The dirty bit is cleared after image repair succeeds in qcow2_open(). Move this into qcow2_check() so that all callers benefit from this behavior when fix mode is enabled. This is necessary so qemu-img check can call .bdrv_check() and mark the image clean. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-08-10 10:25:12 +02:00
Stefan Hajnoczi	b10170aca0	qed: mark image clean after repair succeeds The dirty bit is cleared after image repair succeeds in qed_open(). Move this into qed_check() so that all callers benefit from this behavior when fix=true. This is necessary so qemu-img check can call .bdrv_check() and mark the image clean. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-08-10 10:25:12 +02:00
Ronnie Sahlberg	31459f463a	iscsi: Pick default initiator-name based on the name of the VM This patch updates the iscsi layer to automatically pick a 'unique' initiator-name based on the name of the vm in case the user has not set an explicit iqn-name to use. Create a new function qemu_get_vm_name() that returns the name of the VM, if specified. This way we can thus create default names to use as the initiator name based on the guest session. If the VM is not named via the '-name' command line argument, the iscsi initiator-name used will simply be iqn.2008-11.org.linux-kvm If a name for the VM was specified with the '-name' option, iscsi will use a default initiatorname of iqn.2008-11.org.linux-kvm:<name> These names are just the default iscsi initiator name that qemu will generate/use only when the user has not set an explicit initiator name via the commandlines or config files. Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>	2012-08-09 15:04:09 +02:00
Paolo Bonzini	f2ef4a6dd9	iscsi: reorganize code for parse_initiator_name Merge the occurrences of the "iqn.2008-11.org.linux-kvm" string to avoid duplication. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-08-08 14:51:59 +02:00
Paolo Bonzini	b93c94f7ec	iscsi: do not leak initiator_name The argument of iscsi_create_context is never freed by libiscsi, which in fact calls strdup on it. Avoid a leak. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-08-08 14:51:59 +02:00
Stefan Hajnoczi	bfe8043e92	qcow2: implement lazy refcounts Lazy refcounts is a performance optimization for qcow2 that postpones refcount metadata updates and instead marks the image dirty. In the case of crash or power failure the image will be left in a dirty state and repaired next time it is opened. Reducing metadata I/O is important for cache=writethrough and cache=directsync because these modes guarantee that data is on disk after each write (hence we cannot take advantage of caching updates in RAM). Refcount metadata is not needed for guest->file block address translation and therefore does not need to be on-disk at the time of write completion - this is the motivation behind the lazy refcount optimization. The lazy refcount optimization must be enabled at image creation time: qemu-img create -f qcow2 -o compat=1.1,lazy_refcounts=on a.qcow2 10G qemu-system-x86_64 -drive if=virtio,file=a.qcow2,cache=writethrough Update qemu-iotests 031 and 036 since the extension header size changes when we add feature bit table entries. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-08-06 22:39:14 +02:00
Stefan Hajnoczi	c61d0004bc	qcow2: introduce dirty bit This patch adds an incompatible feature bit to mark images that have not been closed cleanly. When a dirty image file is opened a consistency check and repair is performed. Update qemu-iotests 031 and 036 since the extension header size changes when we add feature bit table entries. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-08-06 22:39:14 +02:00
Markus Armbruster	4480e0f924	vvfat: Do not clobber the user's geometry vvfat creates a virtual VFAT filesystem with a certain logical geometry that depends on its options. It sets the "geometry hint" to this geometry. It is the only block driver to do this. The geometry hint is about physical geometry, and used only by certain hard disk device models. vvfat's hint is normally invisible for device models, because bdrv_open() puts a raw format on top of vvfat's fat protocol. That raw format is where drive_init() puts the user's geometry (if any), and where the device model gets it from. Nobody complained, because the default physical geometry is the same as vvfat's logical geometry: opts LCHS def. PCHS 1024,16,63 same :32: 1024,16,63 same :16: 1024,16,63 same :12: 64,16,63 same Except when you specify :floppy: opts LCHS def. PCHS :floppy: 80, 2,36 5,16,63 :32:floppy: 80, 2,36 5,16,63 :16:floppy: 80, 2,36 5,16,63 :12:floppy: 80, 2,18 2,16,63 Silly thing to do for use with a hard disk. However, the "raw" format can be suppressed by adding a redundant-looking "format=vvfat" to "file=fat:FOO". Then, vvfat's hint clobbers the user's geometry, i.e. -drive options cyls, heads, secs get silently ignored. Don't do that. No change without format=vvfat. With it, the user's hard disk geometry (-drive options cyls, heads, secs) is now obeyed, and the default hard disk geometry with :floppy: now matches the one without format=vvfat. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-07-17 16:48:30 +02:00
Markus Armbruster	f91cbefe2d	vvfat: Fix partition table Unless parameter ":floppy:" is given, vvfat creates a virtual image with DOS MBR defining a single partition which holds the FAT file system. The size of the virtual image depends on the width of the FAT: 32 MiB (CHS 64, 16, 63) for 12 bit FAT, 504 MiB (CHS 1024, 16, 63) for 16 and 32 bit FAT, leaving (64\*16-1)\*63 = 64449 and (1024\*16-1)\*63 = 1032129 sectors for the partition. However, it screws up the end of the partition in the MBR: FAT width param. start CHS end CHS start LBA size :32: 0,1,1 1023,14,63 63 1032065 :16: 0,1,1 1023,14,55 63 1032057 :12: 0,1,1 63,14,55 63 64377 The actual FAT file system nevertheless assumes the partition has 1032129 or 64449 sectors. Oops. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-07-17 16:48:30 +02:00
Christoph Hellwig	19db9b9042	sheepdog: do not blindly memset all read buffers Only buffers that map to unallocated blocks need to be zeroed. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-07-17 16:48:29 +02:00
MORITA Kazutaka	cddd4ac7a2	sheepdog: always use coroutine-based network functions This reduces some code duplication. Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-07-17 16:48:29 +02:00
Anthony Liguori	23797df3d9	Merge remote-tracking branch 'mjt/mjt-iov2' into staging * mjt/mjt-iov2: rewrite iov_send_recv() and move it to iov.c cleanup qemu_co_sendv(), qemu_co_recvv() and friends export iov_send_recv() and use it in iov_send() and iov_recv() rename qemu_sendv to iov_send, change proto and move declarations to iov.h change qemu_iovec_to_buf() to match other to,from_buf functions consolidate qemu_iovec_copy() and qemu_iovec_concat() and make them consistent allow qemu_iovec_from_buffer() to specify offset from which to start copying consolidate qemu_iovec_memset{,_skip}() into single function and use existing iov_memset() rewrite iov_* functions change iov_* function prototypes to be more appropriate virtio-serial-bus: use correct lengths in control_out() message Conflicts: tests/Makefile Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2012-07-09 12:35:06 -05:00
Anthony Liguori	715cc00ce1	Merge remote-tracking branch 'kwolf/for-anthony' into staging * kwolf/for-anthony: (24 commits) block: Factor bdrv_read_unthrottled() out of guess_disk_lchs() qtest: Tidy up temporary files properly fdc: Drop broken code for user-defined floppy geometry fdc_test: introduce test_sense_interrupt fdc_test: update media_change test fdc: fix interrupt handling fdc: rewrite seek and DSKCHG bit handling block: introduce bdrv_swap, implement bdrv_append on top of it block: copy over job and dirty bitmap fields in bdrv_append raw: hook into blkdebug blkdebug: optionally tie errors to a specific sector blkdebug: store list of active rules blkdebug: pass getlength to underlying file blkdebug: tiny cleanup blkdebug: remove sync i/o events sheepdog: traverse pending_list from the first for each time sheepdog: split outstanding list into inflight and pending sheepdog: make sure we don't free aiocb before sending all requests sheepdog: use coroutine based socket functions in coroutine context sheepdog: restart I/O when socket becomes ready in do_co_req() ...	2012-07-09 10:29:40 -05:00
Paolo Bonzini	5c171afa4c	raw: hook into blkdebug Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-07-09 15:53:02 +02:00
Paolo Bonzini	e4780db429	blkdebug: optionally tie errors to a specific sector This makes blkdebug scripts more powerful, and independent of the exact sequence of operations performed by streaming. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-07-09 15:53:02 +02:00
Paolo Bonzini	571cd43e57	blkdebug: store list of active rules This prepares for the next patch, where some active rules may actually not trigger depending on input to readv/writev. Store the active rules in a SIMPLEQ (so that it can be emptied easily with QSIMPLEQ_INIT), and fetch the errno/once/immediately arguments from there. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-07-09 15:53:02 +02:00
Paolo Bonzini	e130225587	blkdebug: pass getlength to underlying file This is required when using blkdebug with raw format. Unlike qcow2/QED, raw asks blkdebug for the length of the file, it doesn't get it from a header. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-07-09 15:53:02 +02:00
Paolo Bonzini	368e8dd10a	blkdebug: tiny cleanup Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-07-09 15:53:02 +02:00
Paolo Bonzini	820100fd15	blkdebug: remove sync i/o events These are unused, except (by mistake more or less) in QED. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-07-09 15:53:02 +02:00
MORITA Kazutaka	7dc1cde05b	sheepdog: traverse pending_list from the first for each time The pending list can be modified in other coroutine context sd_co_rw_vector, so we need to traverse the list from the first again after we send the pending request. Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-07-09 15:53:02 +02:00
MORITA Kazutaka	c292ee6a67	sheepdog: split outstanding list into inflight and pending outstanding_list_head is used for both pending and inflight requests. This patch splits it and improves readability. Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-07-09 15:53:02 +02:00
MORITA Kazutaka	1d732d7d7c	sheepdog: make sure we don't free aiocb before sending all requests This patch increments the pending counter before sending requests, and makes sure that aiocb is not freed while sending them. Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-07-09 15:53:01 +02:00
MORITA Kazutaka	b97564f4c5	sheepdog: use coroutine based socket functions in coroutine context This removes blocking network I/Os in coroutine context. Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-07-09 15:53:01 +02:00
MORITA Kazutaka	2dfcca3b68	sheepdog: restart I/O when socket becomes ready in do_co_req() Currently, no one reenters the yielded coroutine. This fixes it. Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-07-09 15:53:01 +02:00
MORITA Kazutaka	1b6ac9985a	sheepdog: fix dprintf format strings This fixes warnings about dprintf format in debug mode. Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-07-09 15:53:01 +02:00
Stefan Hajnoczi	206e6d8551	qcow2: preserve free_byte_offset when qcow2_alloc_bytes() fails When qcow2_alloc_clusters() error handling code was introduced in commit `5d757b563d`, the value of free_byte_offset was clobbered in the error case. This patch keeps free_byte_offset at 0 so we will try to allocate clusters again next time this function is called. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-07-09 15:53:01 +02:00
Stefan Hajnoczi	b35278f754	qcow2: fix #ifdef'd qcow2_check_refcounts() callers The DEBUG_ALLOC qcow2.h macro enables additional consistency checks throughout the code. This makes it easier to spot corruptions that are introduced during development. Since consistency check is an expensive operation the DEBUG_ALLOC macro is used to compile checks out in normal builds and qcow2_check_refcounts() calls missed the addition of a new function argument. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-07-09 15:53:01 +02:00
Ronnie Sahlberg	622695a458	ISCSI: force use of sg for SMC and SSC devices If the device we open is a SMC or SSC device, then force the use of sg. We don't have any medium changer or tape emulation so only passthrough via real sg or scsi-generic via iscsi would work anyway. Forcing sg also makes qemu skip trying to read from the device to guess the image format by reading from the device (find_image_format()). SMC devices do not implement READ6/10/12/16 so it is not possible to read from them (SSC have different CDBs). With this patch I can successfully manage a SMC device with iscsi in passthrough mode. Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com> [Added TYPE_TAPE handling - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-07-02 10:18:41 +02:00
Ronnie Sahlberg	983924532f	ISCSI: Add SCSI passthrough via scsi-generic to libiscsi Update iscsi to allow passthrough of SG_IO scsi commands when the iscsi device is forced to be scsi-generic. Implement both bdrv_ioctl() and bdrv_aio_ioctl() in the iscsi backend, emulate the SG_IO ioctl and pass the SCSI commands across to the iscsi target. This allows end-to-end passthrough of SCSI all the way from the guest, to qemu, via scsi-generic, then libiscsi all the way to the iscsi target. To activate this you need to specify that the iscsi lun should be treated as a scsi-generic device. Example: -device lsi -device scsi-generic,drive=MyISCSI \ -drive file=iscsi://10.1.1.125/iqn.ronnie.test/1,if=none,id=MyISCSI Note, you can currently not boot a qemu guest from a scsi device. Note, This only works when the host is linux, since the emulation relies on definitions of SG_IO from the scsi-generic implementation in the linux kernel. It should be fairly easy to re-implement some structures similar enough for non-linux hosts to do the same style of passthrough via a fake scsi generic layer and libiscsi if need be. Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-07-02 10:18:41 +02:00
Kevin Wolf	94282e7146	raw-posix: Fix build without is_allocated support Move the declaration of s into the #ifdef sections that actually make use of it. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2012-06-24 01:04:45 +02:00
Stefan Hajnoczi	af7b708db2	qcow2: fix autoclear image header update The autoclear feature bits can be used for qcow2 file format features that are safe to "drop" by old programs that do not understand the feature. Upon opening the image file unknown autoclear feature bits are cleared and the image file header is rewritten, but this was happening too early in the code when critical header fields were not yet loaded. Process autoclear feature bits after all necessary header information has been loaded. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-06-15 14:03:43 +02:00
Kevin Wolf	b7ab0fea37	qcow2: Fix avail_sectors in cluster allocation code avail_sectors should really be the number of sectors from the start of the allocation, not from the start of the write request. We're lucky enough that this mistake didn't cause any real bug. avail_sectors is only used in the initialiser of QCowL2Meta: .nb_available = MIN(requested_sectors, avail_sectors), m->nb_available in turn is only used for COW at the end of the allocation. A COW occurs only if the request wasn't cluster aligned, which in turn would imply that requested_sectors was less than avail_sectors (both in the original and in the fixed version). In this case avail_sectors is ignored and therefore the mistake doesn't cause any misbehaviour. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-06-15 14:03:43 +02:00
Kevin Wolf	cdba7fee1d	qcow2: Simplify calculation for COW area at the end copy_sectors() always uses the sum (cluster_offset + n_start) or (start_sect + n_start), so if some value is added to both cluster_offset and start_sect, and subtracted from n_start, it's cancelled out anyway. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-06-15 14:03:43 +02:00
Paolo Bonzini	6af4e9ead4	qcow2: always operate caches in writeback mode Writethrough does not need special-casing anymore in the qcow2 caches. The block layer adds flushes after every guest-initiated data write, and these will also flush the qcow2 caches to the OS. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-06-15 14:03:43 +02:00
MORITA Kazutaka	e0d93a89b9	sheepdog: add coroutine_fn markers to coroutine functions Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-06-15 14:03:42 +02:00
Josh Durgin	b11f38fcdf	rbd: hook up cache options Writeback caching was added in Ceph 0.46, and writethrough will be in 0.47. These are controlled by general config options, so there's no need to check for librbd version. Signed-off-by: Josh Durgin <josh.durgin@inktank.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-06-15 14:03:42 +02:00
Kevin Wolf	166acf546f	qcow2: Support for fixing refcount inconsistencies Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-06-15 14:03:42 +02:00
Kevin Wolf	ccf34716ee	qemu-img check: Print fixed clusters and recheck When any inconsistencies have been fixed, print the statistics and run another check to make sure everything is correct now. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-06-15 14:03:42 +02:00
Kevin Wolf	4534ff5426	qemu-img check -r for repairing images The QED block driver already provides the functionality to not only detect inconsistencies in images, but also fix them. However, this functionality cannot be manually invoked with qemu-img, but the check happens only automatically during bdrv_open(). This adds a -r switch to qemu-img check that allows manual invocation of an image repair. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-06-15 14:03:42 +02:00
Paolo Bonzini	6ef228fc0d	stream: move rate limiting to a separate header file Make the code reusable. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-06-15 14:03:42 +02:00
Paolo Bonzini	188a7bbf94	stream: move is_allocated_above to block.c Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-06-15 14:03:42 +02:00
Paolo Bonzini	f9749f28b7	stream: tweak usage of bdrv_co_is_allocated is_allocated_base has complex semantics that are not really usable outside streaming. Split the check in two parts, where the allocated state for the top bs is moved to the caller. The resulting function is more generally useful. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-06-15 14:03:42 +02:00
Paolo Bonzini	5500316ded	block: implement is_allocated for raw Either FIEMAP, or SEEK_DATA+SEEK_HOLE can be used to implement the is_allocated callback for raw files. On Linux ext4, btrfs and XFS all support it. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-06-15 14:03:42 +02:00
Zhi Yong Wu	87267753a3	qcow2: fix endianness conversion Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-06-15 14:03:42 +02:00
Zhi Yong Wu	833e40858c	qcow2: remove a line of unnecessary code Commit `3948d1d4` removed the pointer argument we filled in with l2_offset but forgot to remove the unnecessary l2_offset assignment. Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-06-15 14:03:42 +02:00
Kevin Wolf	1417d7e40e	qcow2: Silence false warning Some gcc versions seem not to be able to figure out that the switch statement covers all possible values and that c is therefore always initialised. Add a default branch for them. Reported-by: malc <av1474@comtv.ru> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: malc <av1474@comtv.ru>	2012-06-15 15:52:45 +04:00
Michael Tokarev	2fc8ae1dd7	cleanup qemu_co_sendv(), qemu_co_recvv() and friends The same as for non-coroutine versions in previous patches: rename arguments to be more obvious, change type of arguments from int to size_t where appropriate, and use common code for send and receive paths (with one extra argument) since these are exactly the same. Use common iov_send_recv() directly. qemu_co_sendv(), qemu_co_recvv(), and qemu_co_recv() are now trivial #define's merely adding one extra arg. qemu_co_sendv() and qemu_co_recvv() callers are converted to different argument order and extra `iov_cnt' argument. Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2012-06-11 23:12:11 +04:00
Michael Tokarev	d5e6b1619c	change qemu_iovec_to_buf() to match other to,from_buf functions It now allows specifying offset within qiov to start from and amount of bytes to copy. Actual implementation is just a call to iov_to_buf(). Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2012-06-11 23:12:11 +04:00
Michael Tokarev	1b093c480a	consolidate qemu_iovec_copy() and qemu_iovec_concat() and make them consistent qemu_iovec_concat() is currently a wrapper for qemu_iovec_copy(), use the former (with extra "0" arg) in a few places where it is used. Change skip argument of qemu_iovec_copy() from uint64_t to size_t, since size of qiov itself is size_t, so there's no way to skip larger sizes. Rename it to soffset, to make it clear that the offset is applied to src. Also change the only usage of uint64_t in hw/9pfs/virtio-9p.c, in v9fs_init_qiov_from_pdu() - all callers of it actually use size_t too, not uint64_t. One added restriction: as for all other iovec-related functions, soffset must point inside src. Order of arguments is already good: qemu_iovec_memset(QEMUIOVector *qiov, size_t offset, int c, size_t bytes) vs: qemu_iovec_concat(QEMUIOVector *dst, QEMUIOVector *src, size_t soffset, size_t sbytes) (note soffset is after _src_ not dst, since it applies to src; for memset it applies to qiov). Note that in many places where this function is used, the previous call is qemu_iovec_reset(), which means many callers actually want copy (replacing dst content), not concat. So we may want to add a wrapper like qemu_iovec_copy() with the same arguments but which calls qemu_iovec_reset() before _concat(). Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2012-06-11 23:12:11 +04:00
Michael Tokarev	03396148bc	allow qemu_iovec_from_buffer() to specify offset from which to start copying Similar to qemu_iovec_memset(QEMUIOVector *qiov, size_t offset, int c, size_t bytes); the new prototype is: qemu_iovec_from_buf(QEMUIOVector *qiov, size_t offset, const void *buf, size_t bytes); The processing starts at offset bytes within qiov. This way, we may copy a bounce buffer directly to a middle of qiov. This is exactly the same function as iov_from_buf() from iov.c, so use the existing implementation and rename it to qemu_iovec_from_buf() to be shorter and to match the utility function. As with utility implementation, we now assert that the offset is inside actual iovec. Nothing changed for current callers, because `offset' parameter is new. While at it, stop using "bounce-qiov" in block/qcow2.c and copy decrypted data directly from cluster_data instead of recreating a temp qiov for doing that. Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2012-06-11 23:12:11 +04:00
Michael Tokarev	3d9b49254f	consolidate qemu_iovec_memset{,_skip}() into single function and use existing iov_memset() This patch combines two functions into one, and replaces the implementation with already existing iov_memset() from iov.c. The new prototype of qemu_iovec_memset(): size_t qemu_iovec_memset(qiov, size_t offset, int fillc, size_t bytes) It is different from former qemu_iovec_memset_skip(), and I want to make other functions to be consistent with it too: first how much to skip, second what, and 3rd how many of it. It also returns actual number of bytes filled in, which may be less than the requested `bytes' if qiov is smaller than offset+bytes, in the same way iov_memset() does. While at it, use utility function iov_memset() from iov.h in posix-aio-compat.c, where qiov was used. Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2012-06-11 23:07:44 +04:00
Paolo Bonzini	7456e4ce8d	build: move block/ objects to nested Makefile.objs Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-06-07 09:21:13 +02:00
Jim Meyering	c2d76497b6	block: prevent snapshot mode $TMPDIR symlink attack In snapshot mode, bdrv_open creates an empty temporary file without checking for mkstemp or close failure, and ignoring the possibility of a buffer overrun given a surprisingly long $TMPDIR. Change the get_tmp_filename function to return int (not void), so that it can inform its two callers of those failures. Also avoid the risk of buffer overrun and do not ignore mkstemp or close failure. Update both callers (in block.c and vvfat.c) to propagate temp-file-creation failure to their callers. get_tmp_filename creates and closes an empty file, while its callers later open that presumed-existing file with O_CREAT. The problem was that a malicious user could provoke mkstemp failure and race to create a symlink with the selected temporary file name, thus causing the qemu process (usually root owned) to open through the symlink, overwriting an attacker-chosen file. This addresses CVE-2012-2652. http://bugzilla.redhat.com/CVE-2012-2652 Signed-off-by: Jim Meyering <meyering@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-05-30 10:18:20 +02:00
MORITA Kazutaka	6f3c714eb7	sheepdog: fix return value of do_load_save_vm_state bdrv_save_vmstate and bdrv_load_vmstate should return the vmstate size on success, and -errno on error. Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-05-30 09:58:39 +02:00
Anthony Liguori	306761537f	Merge remote-tracking branch 'kwolf/for-anthony' into staging * kwolf/for-anthony: fdc-test: introduced qtest no_media_on_start and cmos qtest for floppy fdc: fix media detection fdc: floppy drive should be visible after start without media qemu-iotests: mark 035 qcow2-only qcow2: Check qcow2_alloc_clusters_at() return value sheepdog: use heap instead of stack for BDRVSheepdogState sheepdog: return -errno on error sheepdog: mark image as snapshot when tag is specified qemu-img: Explain how rebase operation can be used to perform a 'diff' operation. qcow2: don't leak buffer for unexpected qcow_version in header	2012-05-29 04:30:49 -05:00
Ronnie Sahlberg	f4dfa67f04	ISCSI: Switch to using READ16/WRITE16 for I/O to the LUN This allows using LUNs bigger than 2TB. Keep using READ10 for other device types such as MMC. Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>	2012-05-28 14:04:16 +02:00
Ronnie Sahlberg	6bcd1346bb	ISCSI: Only call READCAPACITY16 for SBC devices, use READCAPACITY10 for MMC Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>	2012-05-28 14:04:15 +02:00
Ronnie Sahlberg	dbfff6d776	ISCSI: get device type at connection time This is needed to avoid READ CAPACITY(16) for MMC devices. Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-05-28 14:04:14 +02:00
Paolo Bonzini	c7b4a95202	ISCSI: change num_blocks to 64-bit Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-05-28 14:04:14 +02:00
Ronnie Sahlberg	c9b9f6824f	ISCSI: redo how we set up the events Call qemu_notify_event() after updating events. Otherwise, if we add an event for -is-writeable but the socket is already writeable there may be a delay before the event callback is actually triggered. Those delays would in particular hurt performance during BIOS boot and when the GRUB bootloader reads the kernel and initrd. But first call out to the socket write functions directly, and only set up the write event if the socket is full. This will happen very rarely and this improves performance. Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>	2012-05-28 14:04:06 +02:00
Kevin Wolf	df02179189	qcow2: Check qcow2_alloc_clusters_at() return value When using qcow2_alloc_clusters_at(), the cluster allocation code checked the wrong variable for an error code. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-05-25 18:12:54 +02:00
MORITA Kazutaka	b6fc8245e9	sheepdog: use heap instead of stack for BDRVSheepdogState bdrv_create() is called in coroutine context now, so we cannot use more stack than 1 MB in the function if we use ucontext coroutine. This patch allocates BDRVSheepdogState, whose size is 4 MB, on the heap in sd_create(). Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-05-25 18:12:54 +02:00
MORITA Kazutaka	cb595887cc	sheepdog: return -errno on error On error, BlockDriver APIs should return -errno instead of -1. Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-05-25 18:12:54 +02:00
MORITA Kazutaka	622b6057be	sheepdog: mark image as snapshot when tag is specified When a snapshot tag is specified in the filename, the opened image is a snapshot. Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-05-25 18:12:54 +02:00
Jim Meyering	b6c147622d	qcow2: don't leak buffer for unexpected qcow_version in header Signed-off-by: Jim Meyering <meyering@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-05-25 18:12:54 +02:00
Kevin Wolf	c44bfe4637	qcow2: Don't ignore failure to clear autoclear flags Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-05-14 17:02:19 +02:00
Anthony Liguori	04120e3bb0	block: fix warning introduced in `efcc7a23` Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2012-05-10 09:10:42 -05:00
Paolo Bonzini	efcc7a2324	stream: do not copy unallocated sectors from the base Unallocated sectors should really never be accessed by the guest, so there's no need to copy them during the streaming process. If they are read by the guest during streaming, guest-initiated copy-on-read will copy them (we're in the base == NULL case, which enables copy on read). If they are read after we disconnect the image from the base, they will read as zeroes anyway. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-05-10 11:01:59 +02:00
Paolo Bonzini	b21d677ee9	stream: fix ratelimiting corner case This fixes inability to make progress in streaming if the quota is set to less than the amount of data that an I/O operation has to write. In this case, limit->dispatched + n will always be above the quota and, due to the "goto retry" to recheck cancellation and allocation, streaming will livelock. This can be reproduced with "block_job_set_speed ide0-hd0 1b". Of course, with this patch the requested limit will not be obeyed. That could be done with another patch that caps is_allocated's n argument by the slice quota. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-05-10 11:01:59 +02:00
Paolo Bonzini	f6133def92	stream: pass new base image format to bdrv_change_backing_file When an image is modified to point to the new backing file, the backing file format is set to NULL, which means auto-probe. This is wrong, in fact it is a small security problem. Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-05-10 11:01:59 +02:00
Paolo Bonzini	fa4478d5c8	block: wait for job callback in block_job_cancel_sync The limitation on not having I/O after cancellation cannot really be kept. Even streaming has a very small race window where you could cancel a job and have it report completion. If this window is hit, bdrv_change_backing_file() will yield and possibly cause accesses to dangling pointers etc. So, let's just assume that we cannot know exactly what will happen after the coroutine has set busy to false. We can set a very lax condition: - if we cancel the job, the coroutine won't set it to false again (and hence will not call co_sleep_ns again). - block_job_cancel_sync will wait for the coroutine to exit, which pretty much ensures no race. Instead, we track the coroutine that executes the job and put very strict conditions on what to do while it is quiescent (busy = false). First of all, the coroutine must never set busy = false while the job has been cancelled. Second, the coroutine can be reentered arbitrarily while it is quiescent, so you cannot really do anything but co_sleep_ns at that time. This condition is obeyed by the block_job_sleep_ns function. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-05-10 10:32:12 +02:00
Paolo Bonzini	4513eafe92	block: add block_job_sleep_ns This function abstracts the pretty complex semantics of the "busy" member of BlockJob. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-05-10 10:32:12 +02:00
Paolo Bonzini	e023b2e244	block: fix snapshot on QED QED's opaque data includes a pointer back to the BlockDriverState. This breaks when bdrv_append shuffles data between bs_new and bs_top. To avoid this, add a "rebind" function that tells the driver about the new relationship between the BlockDriverState and its opaque. The patch also adds rebind to VVFAT for completeness, even though it is not used with live snapshots. Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-05-10 10:32:12 +02:00
Paolo Bonzini	469ef350e1	block: update in-memory backing file and format These are needed to print "info block" output correctly. QCOW2 does this because it needs it to write the header, but QED does not, and common code is the right place to do it. Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-05-10 10:32:11 +02:00
Paolo Bonzini	5f3777945d	block: push bdrv_change_backing_file error checking up from drivers This check applies to all drivers, but QED lacks it. Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-05-10 10:32:11 +02:00
Anthony Liguori	7c652c1eaf	Merge remote-tracking branch 'kwolf/for-anthony' into staging * kwolf/for-anthony: fdc: simplify media change handling qcow2: lock on prealloc block: make bdrv_create adopt coroutine qcow2: Limit COW to where it's needed sheepdog: switch to writethrough mode if cluster doesn't support flush	2012-05-08 09:38:41 -05:00
Zhi Yong Wu	15552c4ad3	qcow2: lock on prealloc preallocate() will be locked. This is required because qcow2_alloc_cluster_link_l2() assumes that it runs under a lock that it can drop while COW is being performed. Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-05-07 19:33:18 +02:00
Kevin Wolf	54e6814360	qcow2: Limit COW to where it's needed This fixes a regression introduced in commit `250196f1`. The bug leads to data corruption, found during an Autotest run with a Fedora 8 guest. Consider a write request whose first part is covered by an already allocated cluster, but additional clusters need to be newly allocated. When counting the number of clusters to allocate, the qcow2 code would decide to do COW for all remaining clusters of the write request, even if some of them are already allocated. If during this COW operation another write request is issued that touches the same cluster, it will still refer to the old cluster. When the COW completes, the first request will update the L2 table and the second write request will be lost. Note that the requests need not overlap, it's enough for them to touch the same cluster. This patch ensures that only clusters that really require COW are considered for allocation. In this case any other request writing to the same cluster will be an allocating write and gets serialised. Reported-by: Marcelo Tosatti <mtosatti@redhat.com> Tested-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-05-07 19:33:18 +02:00
MORITA Kazutaka	115c2b5a68	sheepdog: switch to writethrough mode if cluster doesn't support flush This is necessary for qemu to work with the older version of Sheepdog which doesn't support SD_OP_FLUSH_VDI. Signed-off-by: MORITA Kazutaka <morita.kazutaka@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-05-07 19:33:18 +02:00
Ronnie Sahlberg	fa6acb0c2f	ISCSI: Add support for thin-provisioning via discard/UNMAP and bigger LUNs Update the configure test for libiscsi support to detect version 1.3 or later. Version 1.3 of libiscsi provides both READCAPACITY16 as well as UNMAP commands. Update the iscsi block layer to use READCAPACITY16 to detect the size of the LUN instead of READCAPACITY10. This allows support for LUNs larger than 2TB. Update to implement bdrv_aio_discard() using the UNMAP command. This allows us to use thin-provisioned LUNs from TGTD and other iSCSI targets that support thin-provisioning. Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com> [squashed in subsequent patch from Ronnie to fix off-by-one in LBA count] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-05-04 10:39:18 +02:00
Josh Durgin	787f31330e	rbd: add discard support Change the write flag to an operation type in RBDAIOCB, and make the buffer optional since discard doesn't use it. Discard is first included in librbd 0.1.2 (which is in Ceph 0.46). If librbd is too old, leave out qemu_rbd_aio_discard entirely, so the old behavior is preserved. Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-05-02 18:41:42 +02:00
Zhi Yong Wu	647cc47223	qcow2: fix the return value -ENOENT -> -EEXIST Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-05-02 18:39:39 +02:00
Kevin Wolf	7242411460	qcow2: Don't hold cache references across yield If cache references are held while the coroutine has yielded, the cache may get used up and abort() when it can't find a free entry. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-05-02 18:39:39 +02:00
Kevin Wolf	60651f901a	qcow2: Remove unused parameter in do_alloc_cluster_offset Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-05-02 18:39:39 +02:00
Stefan Weil	b9531b6eed	block/qcow2: Add missing GCC_FMT_ATTR to function report_unsupported() Cc: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-05-02 18:39:39 +02:00
Pavel Borzenkov	83affaa622	raw-posix: Do not use CONFIG_COCOA macro Use __APPLE__ and __MACH__ macros instead of CONFIG_COCOA to detect Mac OS X host. The patch is based on Ben Leslie's patch: http://patchwork.ozlabs.org/patch/97859/ Signed-off-by: Ben Leslie <benno@benno.id.au> Signed-off-by: Pavel Borzenkov <pavel.borzenkov@gmail.com> Acked-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Andreas Färber <andreas.faerber@web.de>	2012-05-01 00:16:58 +02:00
Anthony Liguori	a8b69b8e24	Merge remote-tracking branch 'qmp/queue/qmp' into staging * qmp/queue/qmp: qapi: fix qmp_balloon() conversion qemu-iotests: add block-stream speed value test case block: add 'speed' optional parameter to block-stream block: change block-job-set-speed argument from 'value' to 'speed' block: use Error mechanism instead of -errno for block_job_set_speed() block: use Error mechanism instead of -errno for block_job_create()	2012-04-27 12:00:06 -05:00
Stefan Hajnoczi	c83c66c3b5	block: add 'speed' optional parameter to block-stream Allow streaming operations to be started with an initial speed limit. This eliminates the window of time between starting streaming and issuing block-job-set-speed. Users should use the new optional 'speed' parameter instead so that speed limits are in effect immediately when the job starts. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Acked-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>	2012-04-27 11:44:50 -03:00
Stefan Hajnoczi	882ec7ce53	block: change block-job-set-speed argument from 'value' to 'speed' Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Acked-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>	2012-04-27 11:44:50 -03:00
Stefan Hajnoczi	9e6636c72d	block: use Error mechanism instead of -errno for block_job_set_speed() There are at least two different errors that can occur in block_job_set_speed(): the job might not support setting speeds or the value might be invalid. Use the Error mechanism to report the error where it occurs. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Acked-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>	2012-04-27 11:44:50 -03:00
Stefan Hajnoczi	fd7f8c6537	block: use Error mechanism instead of -errno for block_job_create() The block job API uses -errno return values internally and we convert these to Error in the QMP functions. This is ugly because the Error should be created at the point where we still have all the relevant information. More importantly, it is hard to add new error cases to this case since we quickly run out of -errno values without losing information. Go ahead and use Error directly and don't convert later. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Acked-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>	2012-04-27 11:44:50 -03:00
Kevin Wolf	b3adf53a3a	nbd: Fix uninitialised use of s->sock s->sock is assigned only afterwards, so we're really registering an aio_fd_handler for file descriptor 0 here. Not exactly what we intended. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-04-26 17:54:22 +02:00
Anthony Liguori	1f8bcac09a	Merge remote-tracking branch 'kwolf/for-anthony' into staging * kwolf/for-anthony: (38 commits) qemu-iotests: Fix test 031 for qcow2 v3 support qemu-iotests: Add -o and make v3 the default for qcow2 qcow2: Zero write support qemu-iotests: Test backing file COW with zero clusters qemu-iotests: add a simple test for write_zeroes qcow2: Support for feature table header extension qcow2: Support reading zero clusters qcow2: Version 3 images qcow2: Ignore reserved bits in check_refcounts qcow2: Ignore reserved bits in refcount table entries qcow2: Simplify count_cow_clusters qcow2: Refactor qcow2_free_any_clusters qcow2: Ignore reserved bits in L1/L2 entries qcow2: Fail write_compressed when overwriting data qcow2: Ignore reserved bits in count_contiguous_clusters() qcow2: Ignore reserved bits in get_cluster_offset qcow2: Save disk size in snapshot header Specification for qcow2 version 3 qcow2: Fix refcount block allocation during qcow2_alloc_cluster_at() iotests: Resolve test failures caused by hostname ...	2012-04-23 14:27:04 -05:00
Kevin Wolf	621f058940	qcow2: Zero write support Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-04-20 15:57:30 +02:00
Kevin Wolf	cfcc4c62ff	qcow2: Support for feature table header extension Instead of printing an ugly bitmask, qemu can now print a more helpful string even for yet unknown features. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-04-20 15:57:29 +02:00
Kevin Wolf	6377af48b0	qcow2: Support reading zero clusters This adds support for reading zero clusters in version 3 images. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-04-20 15:57:29 +02:00
Kevin Wolf	6744cbab8c	qcow2: Version 3 images This adds the basic infrastructure to qcow2 to handle version 3 images. It includes code to create v3 images, allow header updates for v3 images and checks feature bits. It still misses support for zero clusters, so this is not a fully compliant implementation of v3 yet. The default for creating new images stays at v2 for now. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-04-20 15:57:29 +02:00
Kevin Wolf	afdf0abe77	qcow2: Ignore reserved bits in check_refcounts Also don't infer the cluster type directly from the L2 entries, but use qcow2_get_cluster_type() to keep everything in a single place. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-04-20 15:57:29 +02:00
Kevin Wolf	76dc9e0c8f	qcow2: Ignore reserved bits in refcount table entries Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-04-20 15:57:29 +02:00
Kevin Wolf	143550a83e	qcow2: Simplify count_cow_clusters count_cow_clusters() tries to reuse existing functions, and all it achieves is to make things much more complicated than they really are: Everything needs COW, unless it's a normal cluster with refcount 1. This patch implements the obvious way of doing this, and by using qcow2_get_cluster_type() it gets rid of all flag magic. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-04-20 15:57:28 +02:00
Kevin Wolf	c7a4c37a0f	qcow2: Refactor qcow2_free_any_clusters Zero clusters will add another cluster type. Refactor the open-coded cluster type detection into a switch of QCOW2_CLUSTER_* options so that the detection is in a single place. This makes it easier to add new cluster types. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-04-20 15:57:28 +02:00
Kevin Wolf	8e37f681d5	qcow2: Ignore reserved bits in L1/L2 entries This changes the still existing places that assume that the only flags are QCOW_OFLAG_COPIED and QCOW_OFLAG_COMPRESSED to properly mask out reserved bits. It does not convert bdrv_check yet. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-04-20 15:57:28 +02:00
Kevin Wolf	b0b6862e5e	qcow2: Fail write_compressed when overwriting data qcow2_alloc_compressed_cluster_offset() already fails if the copied flag is set, because qcow2_write_compressed() doesn't perform COW as it would have to do to allow this. However, what we really want to check here is whether the cluster is allocated or not. With internal snapshots the copied flag may not be set on allocated clusters. Check the cluster offset instead. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-04-20 15:57:27 +02:00
Kevin Wolf	2bfcc4a0a0	qcow2: Ignore reserved bits in count_contiguous_clusters() Until now, count_contiguous_clusters() has an argument that allowed to specify flags that should be ignored in the comparison, i.e. that are allowed to change between contiguous clusters. This patch changes the function so that it ignores all flags by default now and you need to pass the flags on which it should stop. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-04-20 15:57:27 +02:00
Kevin Wolf	68d000a390	qcow2: Ignore reserved bits in get_cluster_offset With this change, reading from a qcow2 image ignores all reserved bits that are set in an L1 or L2 table entry. Now get_cluster_offset() assigns *cluster_offset only the offset without any other flags. The cluster type is no longer encoded in the offset, but a positive return value in case of success. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-04-20 15:57:27 +02:00
Kevin Wolf	90b277593d	qcow2: Save disk size in snapshot header This allows that different snapshots of an image can have different sizes, which is a requirement for enabling image resizing even with images that have internal snapshots. We don't do the actual support for it now, but make sure that the additional field is present and not completely ignored in all version 3 images. When trying to load a snapshot of different size, it returns an error. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-04-20 15:57:27 +02:00
Kevin Wolf	f24423bd90	qcow2: Fix refcount block allocation during qcow2_alloc_cluster_at() Refcount block allocation and refcount table growth rely on s->free_cluster_index pointing to somewhere after the current allocation. Change qcow2_alloc_cluster_at() to fulfill this assumption. Without this change it could happen that a newly allocated refcount block and the allocated data block point to the same area in the image file, causing data corruption in the long run. This fixes a bug that became first visible after commit `250196f1`. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-04-20 15:56:19 +02:00
Paolo Bonzini	bafbd6a1c6	aio: remove process_queue callback and qemu_aio_process_queue Both unused after the previous patch. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-04-19 16:37:53 +02:00
Paolo Bonzini	7fe7b68b32	nbd: do not block in nbd_wr_sync if no data at all is available Right now, nbd_wr_sync will hang if no data at all is available on the socket and the other side is not going to provide any. Relax this by making it loop only for writes or partial reads. This fixes a race where one thread is executing qemu_aio_wait() and another is executing main_loop_wait(). Then, the select() call in main_loop_wait() can return stale data and call the "readable" callback with no data in the socket. Reported-by: Laurent Vivier <laurent@vivier.eu> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-04-19 16:36:43 +02:00

995 Commits