this patch adds iscsi_truncate which effectively allows for
online resizing of iscsi volumes. for this to work you have
to resize the volume on your storage and then call
block_resize command in qemu which will issue a
readcapacity16 to update the capacity.
v4:
- factor out complete readcapacity logic into a separate function
- handle capacity change check condition in readcapacity function
(this happens if the block_resize cmd is the first iscsi task
executed after a resize on the storage)
v3:
- remove switch statement in iscsi_open
- create separate patch for brdv_drain_all() in bdrv_truncate()
v2:
- add a general bdrv_drain_all() before bdrv_truncate() to avoid
in-flight AIOs while the device is truncated
- since no AIOs are in flight we can use a sync libiscsi call
to re-read the capacity
- factor out the readcapacity16 logic as it is redundant
to iscsi_open() and iscsi_truncate().
Signed-off-by: Peter Lieven <pl@kamp.de>
[allow any type of unit attention check condition in iscsi_readcapacity_sync(),
as in Message-ID: <51263A2A.6070304@dlhnet.de> - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
the storage might return a check condition status for various reasons.
(e.g. bus reset, capacity change, thin-provisioning info etc.)
currently all these informative status responses lead to an I/O error
which is populated to the guest. this patch introduces a retry mechanism
to avoid this.
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch adds support for a unix domain socket for a connection
between qemu and local sheepdog server. You can use the unix domain
socket with the following syntax:
$ qemu sheepdog+unix:///<vdiname>?socket=<socket path>[#snapid]
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This uses the form "<host>:<port>" for the representation of the
sheepdog server to use inet_connect.
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The URI syntax is consistent with the NBD and Gluster syntax. The
syntax is
sheepdog[+tcp]://[host:port]/vdiname[#snapid|#tag]
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The qemu-img check command can display fragmentation statistics:
* Total number of clusters in virtual disk
* Number of allocated clusters
* Number of fragmented clusters
This patch adds fragmentation statistics support to qcow2.
Compressed and normal clusters count as allocated. Zero clusters are
not counted as allocated unless their L2 entry has a non-zero offset
(e.g. preallocation).
Only the current L1 table counts towards the statistics - snapshots are
ignored.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The check_refcounts_l1/l2() functions have a check_copied argument to
check that the QCOW_O_COPIED flag is consistent with refcount == 1.
This should be a bool, not an int.
However, the next patch introduces qcow2 fragmentation statistics and
also needs to pass an option to check_refcounts_l1/l2(). This is a good
opportunity to use an int flags field.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This patch adds the support for reporting the image end offset (in
bytes). This is particularly useful after a conversion (or a rebase)
where the destination is a block device in order to find the first
unused byte at the end of the image.
Signed-off-by: Federico Simoncelli <fsimonce@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The curl_easy_setopt(state->curl, CURLOPT_PROTOCOLS, ...) interface was
introduced in libcurl 7.19.4. Therefore we cannot protect against
CVE-2013-0249 when linking against an older libcurl.
This fixes the build failure introduced by
fb6d1bbd24.
Reported-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Andreas Färber <andreas.faeber@web.de>
Message-id: 1360743934-8337-1-git-send-email-stefanha@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This reverts commit f880defbb0.
Jeff Cody's testing revealed that the interpretation of size differs
even between VirtualPC and HyperV. Revert this so there is time to
consider the impact of any backwards incompatible behavior this change
creates.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Linux block devices can be set read-only with "blockdev --setro
<device>". The same thing can be done for LVM volumes using "lvchange
--permission r <volume>". This read-only setting is independent of
device node permissions. Therefore the device can still be opened
O_RDWR but actual writes will fail.
This results in odd behavior for QEMU. bdrv_open() is supposed to fail
if a read-only image is being opened with BDRV_O_RDWR. By not failing
for Linux block devices, the guest boots up but every write produces an
I/O error.
This patch checks whether the block device is read-only so that Linux
block devices behave like regular files.
Reported-by: Sibiao Luo <sluo@redhat.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The size calculated from the CHS values is not the real image (disk) size,
but usually a smaller value. This is caused by rounding effects.
Only older operating systems use CHS. Such guests won't be able to use
the whole disk. All modern operating systems use the real size.
This patch fixes https://bugs.launchpad.net/qemu/+bug/1105670/.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Message-id: 1360265212-22037-1-git-send-email-sw@weilnetz.de
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
There is a buffer overflow in libcurl POP3/SMTP/IMAP. The workaround is
simple: disable extra protocols so that they cannot be exploited. Full
details here:
http://curl.haxx.se/docs/adv_20130206.html
QEMU only cares about HTTP, HTTPS, FTP, FTPS, and TFTP. I have tested
that this fix prevents the exploit on my host with
libcurl-7.27.0-5.fc18.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Commit eeb6b45d48 (block: raw-posix image
file reopen) broke the build on OpenIndiana.
illumos has no O_ASYNC. Exclude it from flags to be compared
and instead assert that it is not set where defined.
Cf. e61ab1da7e for qemu-ga.
Cc: qemu-stable@nongnu.org (1.3.x)
Cc: Jeff Cody <jcody@redhat.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The previous scanf() format string stopped parsing the file name on the
first white white space, which seems to be allowed at least by VMware
Workstation.
Change the format string to collect everything between the first and
second quote as the file name, disallowing line breaks.
Signed-off-by: Philipp Hahn <hahn@univention.de>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Return -errno instead of -1 on errors. Hey, no memory leak to fix here
while we're touching it!
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The buffers are allocated with g_(re)alloc, so use g_free to free them.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Return -errno instead of -1 on errors and add error checks in some
places that didn't have one. Passing things by reference requires more
correct typing, replaced a few off_ts therefore - with a 32-bit off_t
this is even a fix for truncation bugs.
While touching the code, fix even some more memory leaks than in the
other drivers...
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Return -errno instead of -1 on errors. While touching the
code, fix a memory leak.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Return -errno instead of -1 on errors. While touching the
code, fix a memory leak.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Return -errno instead of -1 on errors. While touching the
code, fix a memory leak.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Sheep daemon needs vdi_id to identify which vdi is closed to release resources
such as object cache.
Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <tailai.ly@taobao.com>
Reviewed-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Introduce a new option "adapter_type" when converting to vmdk images.
It can be one of the following: ide (default), buslogic, lsilogic
or legacyESX (according to the vmdk spec from vmware).
In case of a non-ide adapter, heads is set to 255 instead of the 16.
The latter is used for "ide".
Also see LP#545089
Signed-off-by: Othmar Pasteka <pasteka@kabsi.at>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Once upon a time, it was decided that qemu_malloc(0) should abort.
Switching to glib retired that bright idea. Some code that was added
to cope with it (e.g. in commits 702ef63, b76b6e9) is still around.
Bury it.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
On a zero-sized disk we need to break out of the job successfully
before bdrv_dirty_iter_init is called, otherwise you will get an
assertion failure with the next patch.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
vdi_open did not check for a bad signature.
This check was only in vdi_probe.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
vdi_open returned -1 in case of any error, but it should return an
error code (negative value of errno or -EMEDIUMTYPE).
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The signature is a 32 bit value and needs up to 8 hex digits for printing.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This improves error reports for bochs, cow, qcow, qcow2, qed and vmdk
when a file with the wrong format is selected.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Yet another optimization is to extend the mirroring iteration to include more
adjacent dirty blocks. This limits the number of I/O operations and makes
mirroring efficient even with a small granularity. Most of the infrastructure
is already in place; we only need to put a loop around the computation of
the origin and sector count of the iteration.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
With AIO support in place, we can start copying more than one chunk
in parallel. This patch introduces the required infrastructure for
this: the buffer is split into multiple granularity-sized chunks,
and there is a free list to access them.
Because of copy-on-write, a single operation may already require
multiple chunks to be available on the free list.
In addition, two different iterations on the HBitmap may want to
copy the same cluster. We avoid this by keeping a bitmap of in-flight
I/O operations, and blocking until the previous iteration completes.
This should be a pretty rare occurrence, though; as long as there is
no overlap the next iteration can start before the previous one finishes.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This makes sense when the next commit starts using the extra buffer space
to perform many I/O operations asynchronously.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
There is really no change in the behavior of the job here, since
there is still a maximum of one in-flight I/O operation between
the source and the target. However, this patch already introduces
the AIO callbacks (which are unmodified in the next patch)
and some of the logic to count in-flight operations and only
complete the job when there is none.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The desired granularity may be very different depending on the kind of
operation (e.g. continuous replication vs. collapse-to-raw) and whether
the VM is expected to perform lots of I/O while mirroring is in progress.
Allow the user to customize it, while providing a sane default so that
in general there will be no extra allocated space in the target compared
to the source.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When mirroring runs, the backing files for the target may not yet be
ready. However, this means that a copy-on-write operation on the target
would fill the missing sectors with zeros. Copy-on-write only happens
if the granularity of the dirty bitmap is smaller than the cluster size
(and only for clusters that are allocated in the source after the job
has started copying). So far, the granularity was fixed to 1MB; to avoid
the problem we detected the situation and required the backing files to
be available in that case only.
However, we want to lower the granularity for efficiency, so we need
a better solution. The solution is to always copy a whole cluster the
first time it is touched. The code keeps a bitmap of clusters that
have already been allocated by the mirroring job, and only does "manual"
copy-on-write if the chunk being copied is zero in the bitmap.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This actually uses the dirty bitmap in the block layer, and converts
mirroring to use an HBitmapIter.
Reviewed-by: Laszlo Ersek <lersek@redhat.com> (except block/mirror.c parts)
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This patch adds support for directly passing the iovec
array from QEMUIOVector if libiscsi supports it (1.8.0
or newer).
Signed-off-by: Peter Lieven <pl@kamp.de>
[Preserve the improvements from commit 4cc841b, iscsi: partly
avoid iovec linearization in iscsi_aio_writev, 2012-11-19 - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
acb->buf is freed in the WRITE(16) callback, but this may not
get called at all when commands are aborted. Add another
free in the ABORT TASK callback, which requires setting acb->buf
to NULL everywhere.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
# By Peter Lieven (3) and others
# Via Paolo Bonzini
* bonzini/scsi-next:
scsi: Drop useless null test in scsi_unit_attention()
lsi: use qbus_reset_all to reset SCSI bus
scsi: fix segfault with 0-byte disk
iscsi: add support for iSCSI NOPs [v2]
iscsi: partly avoid iovec linearization in iscsi_aio_writev
iscsi: add iscsi_create support
This patch will send NOP-Out PDUs every 5 seconds to the iSCSI target.
If a consecutive number of NOP-In replies fail a reconnect is initiated.
iSCSI NOPs help to ensure that the connection to the target is still operational.
This should not, but in reality may be the case even if the TCP connection is still
alive if there are bugs in either the target or the initiator implementation.
v2:
- track the NOPs inside libiscsi so libiscsi can reset the counter
in case it initiates a reconnect.
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
libiscsi expects all write16 data in a linear buffer. If the
iovec only contains one buffer we can skip the linearization
step as well as the additional malloc/free and pass the
buffer directly.
Reported-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch adds support for bdrv_create. This allows e.g.
to use qemu-img to convert from any supported device to
an iscsi backed storage as destination.
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
# By Kevin Wolf (4) and others
# Via Stefan Hajnoczi
* stefanha/block:
dataplane: support viostor virtio-pci status bit setting
dataplane: avoid reentrancy during virtio_blk_data_plane_stop()
win32-aio: use iov utility functions instead of open-coding them
win32-aio: Fix memory leak
win32-aio: Fix vectored reads
aio: Fix return value of aio_poll()
ide: Remove wrong assertion
block: fix null-pointer bug on error case in block commit
Fixes the build on OpenBSD among others.
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
We have iov_from_buf() and iov_to_buf(), use them instead of
open-coding these in block/win32-aio.c
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>