Commit Graph

1018 Commits

Author SHA1 Message Date
Peter Lieven a23fdf3559 block/raw: add .bdrv_get_info
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-07-19 15:27:37 +08:00
Peter Lieven 8bf9344ad6 block/raw: add bdrv_co_write_zeroes
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-07-19 12:29:21 +08:00
Fam Zheng 78f27bd02c block: fix vvfat error path for enable_write_target
s->qcow and s->qcow_filename are allocated but not freed on error. Fix the
possible leaks, remove unnecessary check for bdrv_new(), propagate ret code of
bdrv_create() and also the one of enable_write_target().

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-07-19 12:29:21 +08:00
Bharata B Rao 0c14fb47ec gluster: Add discard support for GlusterFS block driver.
Implement bdrv_aio_discard for gluster.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-07-19 12:29:21 +08:00
Peter Lieven 0777b5dde4 iscsi: factor out sector conversions
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-17 17:01:41 +02:00
Peter Lieven 91bea4e2bb iscsi: assert that sectors are aligned to LUN blocksize
if the blocksize of an iSCSI LUN is bigger than the BDRV_SECTOR_SIZE
it is possible that sector_num or nb_sectors are not correctly
aligned.

to avoid corruption we fail requests which are misaligned.

Signed-off-by: Peter Lieven <pl@kamp.de>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-17 17:01:41 +02:00
Peter Lieven 7e4d5a9f94 iscsi: remove support for misaligned nb_sectors in aio_readv
this hask is not working (anymore). support for misaligned offsets should
be handled at the block layer.

Signed-off-by: Peter Lieven <pl@kamp.de>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-17 17:01:41 +02:00
Peter Lieven d3bda7bc16 iscsi: fix -ENOSPC in iscsi_create()
the -ENOPSC case did not work due to the missing goto.

Reported-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-17 17:00:28 +02:00
Ronnie Sahlberg 0a53f01074 Fix iSCSI crash on SG_IO with an iovector
Don't assume that SG_IO is always invoked with a simple buffer,
check the iovec_count and if it is >= 1 then we need to pass an array
of iovectors to libiscsi instead of just a plain buffer.

Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-17 17:00:26 +02:00
Kevin Wolf 98289620e0 block: Don't parse protocol from file.filename
One of the major reasons for doing something new for -blockdev and
blockdev-add was that the old block layer code parses filenames instead
of just taking them literally. So we should really leave it untouched
when it's passing using the new interfaces (like -drive
file.filename=...).

This allows opening relative file names that contain a colon.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-07-15 09:49:00 +02:00
Fam Zheng 3494d65027 curl: refuse to open URL from HTTP server without range support
CURL driver requests partial data from server on guest IO req. For HTTP
and HTTPS, it uses "Range: ***" in requests, and this will not work if
server not accepting range. This patch does this check when open.

 * Removed curl_size_cb, which is not used: On one hand it's registered to
   libcurl as CURLOPT_WRITEFUNCTION, instead of CURLOPT_HEADERFUNCTION,
   which will get called with *data*, not *header*. On the other hand the
   s->len is assigned unconditionally later.

   In this gone function, the sscanf for "Content-Length: %zd", on
   (void *)ptr, which is not guaranteed to be zero-terminated, is
   potentially a security bug. So this patch fixes it as a side-effect. The
   bug is reported as: https://bugs.launchpad.net/qemu/+bug/1188943
   (Note the bug is marked "private" so you might not be able to see it)

 * Introduced curl_header_cb, which is used to parse header and mark the
   server as accepting range if "Accept-Ranges: bytes" line is seen from
   response header. If protocol is HTTP or HTTPS, but server response has
   no not this support, refuse to open this URL.

Note that python builtin module SimpleHTTPServer is an example of not
supporting range, if you need to test this driver, get a better server
or use internet URLs.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-07-05 09:40:18 +02:00
Fam Zheng da7a50f938 vmdk: Implement .bdrv_has_zero_init
Depending on the subformat, has_zero_init queries underlying storage for
flat extent. If it has a flat extent and its underlying storage doesn't
have zero init, return 0. Otherwise return 1.

Aligns the operator assignments.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-07-05 09:40:18 +02:00
Peter Lieven 3ac216270a block: change default of .has_zero_init to 0
.has_zero_init defaults to 1 for all formats and protocols.

this is a dangerous default since this means that all
new added drivers need to manually overwrite it to 0 if
they do not ensure that a device is zero initialized
after bdrv_create().

if a driver needs to explicitly set this value to
1 its easier to verify the correctness in the review process.

during review of the existing drivers it turned out
that ssh and gluster had a wrong default of 1.
both protocols support host_devices as backend
which are not by default zero initialized. this
wrong assumption will lead to possible corruption
if qemu-img convert is used to write to such a backend.

vpc and vmdk also defaulted to 1 altough they support
fixed respectively flat extends. this has to be addresses
in separate patches. both formats as well as the mentioned
ssh and gluster are turned to the default of 0 with this
patch for safety.

a similar problem with the wrong default existed for
iscsi most likely because the driver developer did
oversee the default value of 1.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-06-28 13:52:35 +02:00
Kevin Wolf 72c6cc94da vpc: Implement .bdrv_has_zero_init
Depending on the subformat, has_zero_init on VHD must behave like raw
and query the underlying storage (fixed) or like other sparse formats
that can always return 1 (dynamic, differencing).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-06-28 10:21:00 +02:00
Fam Zheng 8ed610a1c9 vmdk: remove wrong calculation of relative path
When creating image with backing file, the driver tries to calculate the
relative path from created image file to backing file, but the path
computation is incorrect. e.g.:

    $ qemu-img create -f vmdk -b vmdk-data-disk.vmdk vmdk-data-snapshot1
    Formatting 'vmdk-data-snapshot1', fmt=vmdk size=10737418240
    backing_file='vmdk-data-disk.vmdk' compat6=off zeroed_grain=off

    $ qemu-img info vmdk-data-snapshot1
    image: vmdk-data-snapshot1
    file format: vmdk
    virtual size: 10G (10737418240 bytes)
    disk size: 12K
->  backing file: disk.vmdk

The common part in file names, "vmdk-data-", is incorrectly forgotten by
relative_path(). As the VMDK specification has no restriction on
parentNameHint to be relative path, we simply remove this by using the
backing_file option.

Cc: qemu-stable@nongnu.org
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-06-28 09:20:27 +02:00
Kevin Wolf 8ab6feec2c gluster: Return bdrv_has_zero_init = 0
GlusterFS volumes can be backed by block devices, in which case
bdrv_create() doesn't make sure that the image is zeroed out. It is
currently not possibly to detect whether a given image is backed by a
file or a block device, and incorrectly assuming that it is zeroed
corrupts images during qemu-img convert, so let's err on the side of
caution and always return 0.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-06-28 09:20:27 +02:00
Richard W.M. Jones 0b3f21e6a9 block/ssh: Set bdrv_has_zero_init according to the file type.
If the remote is a regular file, set it to true (ie. reads of
uninitialized areas in a newly created file will return zeroes).
If we can't prove that, return false (a safe default).

Tested by adding a debugging print statement [not part of this commit]
and creating a remote file and a remote block device:

  $ ./qemu-img create ssh://localhost/tmp/new 100M
  Formatting 'ssh://localhost/tmp/new', fmt=raw size=104857600
  filename ssh://localhost/tmp/new: has_zero_init = 1
  $ sudo lvcreate -L 1G -n tmp /dev/fedora
    Logical volume "tmp" created
  $ ./qemu-img create ssh://localhost/dev/fedora/tmp 1G
  Formatting 'ssh://localhost/dev/fedora/tmp', fmt=raw size=1073741824
  filename ssh://localhost/dev/fedora/tmp: has_zero_init = 0

Cc: Kevin Wolf <kwolf@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-06-28 09:20:27 +02:00
Kevin Wolf f59fee8d50 block: Make BlockJobTypes const
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-06-28 09:20:27 +02:00
Dietmar Maurer 98d2c6f2cd block: add basic backup support to block driver
backup_start() creates a block job that copies a point-in-time snapshot
of a block device to a target block device.

We call backup_do_cow() for each write during backup. That function
reads the original data from the block device before it gets
overwritten.  The data is then written to the target device.

Currently backup cluster size is hardcoded to 65536 bytes.

[I made a number of changes to Dietmar's original patch and folded them
in to make code review easy.  Here is the full list:

 * Drop BackupDumpFunc interface in favor of a target block device
 * Detect zero clusters with buffer_is_zero() and use bdrv_co_write_zeroes()
 * Use 0 delay instead of 1us, like other block jobs
 * Unify creation/start functions into backup_start()
 * Simplify cleanup, free bitmap in backup_run() instead of cb
 * function
 * Use HBitmap to avoid duplicating bitmap code
 * Use bdrv_getlength() instead of accessing ->total_sectors
 * directly
 * Delete the backup.h header file, it is no longer necessary
 * Move ./backup.c to block/backup.c
 * Remove #ifdefed out code
 * Coding style and whitespace cleanups
 * Use bdrv_add_before_write_notifier() instead of blockjob-specific hooks
 * Keep our own in-flight CowRequest list instead of using block.c
   tracked requests.  This means a little code duplication but is much
   simpler than trying to share the tracked requests list and use the
   backup block size.
 * Add on_source_error and on_target_error error handling.
 * Use trace events instead of DPRINTF()

-- stefanha]

Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-06-28 09:20:26 +02:00
Kevin Wolf a5c5ea3f60 raw-posix: Fix /dev/cdrom magic on OS X
The raw-posix driver has code to provide a /dev/cdrom on OS X even
though it doesn't really exist. However, since commit c66a6157 the real
filename is dismissed after finding it, so opening /dev/cdrom fails.
Put the filename back into the options QDict to make this work again.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-06-28 09:20:26 +02:00
Fam Zheng 96c51eb5e4 vmdk: refuse to open higher version than supported
Refuse to open higher version for safety.

Although we try to be compatible with published VMDK spec, VMware has
newer version from ESXi 5.1 exported OVF/OVA, which we have no knowledge
what's changed in it. And it is very likely to have more new versions in
the future, so it's not safe to open them blindly.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-06-24 10:25:43 +02:00
Kevin Wolf 0b919fae31 qcow2: Batch discards
This optimises the discard operation for freed clusters by batching
discard requests (both snapshot deletion and bdrv_discard end up
updating the refcounts cluster by cluster).

Note that we don't discard asynchronously, but keep s->lock held. This
is to avoid that a freed cluster is reallocated and written to while the
discard is still in flight.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-06-24 10:25:17 +02:00
Kevin Wolf 67af674e47 qcow2: Options to enable discard for freed clusters
Deleted snapshots are discarded in the image file by default, discard
requests take their default from the -drive discard=... option and other
places that free clusters must always be enabled explicitly.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-06-24 10:25:17 +02:00
Kevin Wolf 6cfcb9b8b9 qcow2: Add refcount update reason to all callers
This adds a refcount update reason to all callers of update_refcounts(),
so that a follow-up patch can use this information to decide whether
clusters that reach a refcount of 0 should be discarded in the image
file.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-06-24 10:25:17 +02:00
Anthony Liguori 3ed8a8430a Merge remote-tracking branch 'bonzini/scsi-next' into staging
# By Paolo Bonzini (3) and others
# Via Paolo Bonzini
* bonzini/scsi-next:
  iscsi: reorganize iscsi_readcapacity_sync
  iscsi: simplify freeing of tasks
  vhost-scsi: fix k->set_guest_notifiers() NULL dereference
  scsi-disk: scsi-block device for scsi pass-through should not be removable
  scsi-generic: check the return value of bdrv_aio_ioctl in execute_command
  scsi-generic: fix sign extension of READ CAPACITY(10) data
  scsi: reset cdrom tray statuses on scsi_disk_reset

Message-id: 1371565016-2643-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-06-18 10:06:47 -05:00
Paolo Bonzini 1288844e7c iscsi: reorganize iscsi_readcapacity_sync
Avoid the goto, and use the same retry logic for the 10- and 16-
byte versions.

Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-18 12:43:03 +02:00
Paolo Bonzini f0d2a4d4d6 iscsi: simplify freeing of tasks
Always free them in the iscsi_aio_*_acb functions and remove the
checks in their callers.  Remove ifs when the task struct was
previously dereferenced (spotted by Coverity).

Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-18 12:43:03 +02:00
Ján Tomko 2330790879 nbd: strip braces from literal IPv6 address in URI
Otherwise they would get passed to getaddrinfo and fail with:
address resolution failed for [::1]🔢 Name or service not known

(Broken by commit v1.4.0-736-gf17c90b)

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-18 11:43:00 +02:00
Anthony Liguori 21a885a7e2 Merge remote-tracking branch 'luiz/queue/qmp' into staging
# By Luiz Capitulino
# Via Luiz Capitulino
* luiz/queue/qmp:
  qerror: drop QERR_OPEN_FILE_FAILED macro
  block: bdrv_reopen_prepare(): don't use QERR_OPEN_FILE_FAILED
  savevm: qmp_xen_save_devices_state(): use error_setg_file_open()
  dump: qmp_dump_guest_memory(): use error_setg_file_open()
  cpus: use error_setg_file_open()
  blockdev: use error_setg_file_open()
  block: mirror_complete(): use error_setg_file_open()
  rng-random: use error_setg_file_open()
  error: add error_setg_file_open() helper

Message-id: 1371484631-29510-1-git-send-email-lcapitulino@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-06-17 13:14:46 -05:00
Evgeny Budilovsky 0bed087df2 vmdk: Allow reading variable size descriptor files
the hard-coded 2k buffer on the stack won't allow reading big descriptor
files which can be generated when storing big images. For example 500G
vmdk splitted to 2G chunks.

Signed-off-by: Evgeny Budilovsky <evgeny.budilovsky@ravellosystems.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-06-17 17:47:59 +02:00
Richard W.M. Jones 8da1aa15db curl: Don't set curl options on the handle just before it's going to be deleted.
(Found by Kamil Dudka)

Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Cc: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-06-17 17:47:59 +02:00
Stefan Hajnoczi 5a394b9e96 vmdk: byteswap VMDK4Header.desc_offset field
Remember to byteswap VMDK4Header.desc_offset on big-endian machines.

Cc: qemu-stable@nongnu.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-06-17 17:47:59 +02:00
Richard W.M. Jones a7cea2ba47 block/curl.c: Refuse to open the handle for writes.
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-06-17 17:47:59 +02:00
Liu Yuan cede621ffc sheepdog: support 'qemu-img snapshot -a'
Just call sd_create_branch() in the snapshot_goto to rollback the image is good
enough. With this patch, 'loadvm' process for sheepdog is modified:

Suppose we have a snapshot chain A --> B --> C, we do 'loadvm A' so as to get
a new chain,

A --> B
|
V
C1

in the old code:

1 reload inode of A (in snapshot_goto)
2 read vmstate via A's vdi_id (loadvm_state)
3 delete C and create C1, reload inode of C1 (sd_create_branch on write)

with this patch applied:

1 reload inode of A, delete C and create C1  (in snapshot_goto)
2 read vmstate via C1's parent, that is A's vdi_id (loadvm_state)

This will fix the possible bug that QEMU exit between 2 and 3 in the old code

Cc: qemu-devel@nongnu.org
Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-06-17 17:47:58 +02:00
Liu Yuan b579ffb3fd sheepdog: fix snapshot tag initialization
This is an old and obvious bug. We should pass snapshot_id to the
tag. Or simple command like 'qemu-img snapshot -a tag sheepdog:image' will fail

Cc: qemu-devel@nongnu.org
Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-06-17 17:47:58 +02:00
Luiz Capitulino dacc26aae5 block: mirror_complete(): use error_setg_file_open()
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Kevin Wolf <kwolf@redhat.com>
2013-06-17 11:01:14 -04:00
Richard W.M. Jones 9e5e2b23d3 curl: Whitespace only changes.
Trivial patch to remove odd whitespace.

Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-06-11 23:45:43 +04:00
Wenchao Xia 553a7e8718 qmp: add ImageInfo in BlockDeviceInfo used by query-block
Now image info will be retrieved as an embbed json object inside
BlockDeviceInfo, backing chain info and all related internal snapshot
info can be got in the enhanced recursive structure of ImageInfo. New
recursive member *backing-image is added to reflect the backing chain
status.

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-06-07 13:37:45 +02:00
Wenchao Xia 43526ec8d1 block: add image info query function bdrv_query_image_info()
This patch adds function bdrv_query_image_info(), which will
retrieve image info in qmp object format. The implementation is
based on the code moved from qemu-img.c, but uses block layer
function to get snapshot info.

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-06-07 13:37:45 +02:00
Wenchao Xia fb0ed4539c block: add snapshot info query function bdrv_query_snapshot_info_list()
This patch adds function bdrv_query_snapshot_info_list(), which will
retrieve snapshot info of an image in qmp object format. The implementation
is based on the code moved from qemu-img.c with modification to fit more
for qmp based block layer API.

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-06-07 13:37:45 +02:00
Kevin Wolf bf736fe34c blkdebug: Add BLKDBG_FLUSH_TO_OS/DISK events
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-06-06 11:27:22 +02:00
Wenchao Xia 5b91704469 block: dump snapshot and image info to specified output
bdrv_snapshot_dump() and bdrv_image_info_dump() do not dump to a buffer now,
some internal buffers are still used for format control, which have no
chance to be truncated. As a result, these two functions have no more issue
of truncation, and they can be used by both qemu and qemu-img with correct
parameter specified.

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-06-04 13:56:30 +02:00
Wenchao Xia f364ec65b5 block: move qmp and info dump related code to block/qapi.c
This patch is a pure code move patch, except following modification:
1 get_human_readable_size() is changed to static function.
2 dump_human_image_info() is renamed to bdrv_image_info_dump().
3 in qmp_query_block() and qmp_query_blockstats, use bdrv_next(bs)
instead of direct traverse of global array 'bdrv_states'.
4 collect_snapshots() and collect_image_info() are renamed, unused parameter
*fmt in collect_image_info() is removed.
5 code style fix.

To avoid conflict and tip better, macro in header file is BLOCK_QAPI_H
instead of QAPI_H. Now block.h and snapshot.h are at the same level in
include path, block_int.h and qapi.h will both include them.

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-06-04 13:56:30 +02:00
Wenchao Xia de08c606f9 block: move snapshot code in block.c to block/snapshot.c
All snapshot related code, except bdrv_snapshot_dump() and
bdrv_is_snapshot(), is moved to block/snapshot.c. bdrv_snapshot_dump()
will be moved to another file later. bdrv_is_snapshot() is not related
with internal snapshot. It also fixes small code style errors reported
by check script.

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-06-04 13:56:30 +02:00
Qiao Nuohan ce3a4718fe Remove twice include of qemu-common.h
This patch is used to remove twice include of "qemu-common.h" in
block/win32-aio.c

Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-05-18 16:35:11 +04:00
Kevin Wolf 2cf7cfa1cd qcow2: Catch some L1 table index overflows
This catches the situation that is described in the bug report at
https://bugs.launchpad.net/qemu/+bug/865518 and goes like this:

    $ qemu-img create -f qcow2 huge.qcow2 $((1024*1024))T
    Formatting 'huge.qcow2', fmt=qcow2 size=1152921504606846976 encryption=off cluster_size=65536 lazy_refcounts=off
    $ qemu-io /tmp/huge.qcow2 -c "write $((1024*1024*1024*1024*1024*1024 - 1024)) 512"
    Segmentation fault

With this patch applied the segfault will be avoided, however the case
will still fail, though gracefully:

    $ qemu-img create -f qcow2 /tmp/huge.qcow2 $((1024*1024))T
    Formatting 'huge.qcow2', fmt=qcow2 size=1152921504606846976 encryption=off cluster_size=65536 lazy_refcounts=off
    qemu-img: The image size is too large for file format 'qcow2'

Note that even long before these overflow checks kick in, you get
insanely high memory usage (up to INT_MAX * sizeof(uint64_t) = 16 GB for
the L1 table), so with somewhat smaller image sizes you'll probably see
qemu aborting for a failed g_malloc().

If you need huge image sizes, you should increase the cluster size to
the maximum of 2 MB in order to get higher limits.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-05-14 16:44:33 +02:00
Dong Xu Wang c7e775e4dd remove double semicolons
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-05-12 13:25:55 +04:00
Fam Zheng cdeaf1f159 vmdk: add bdrv_co_write_zeroes
Use special offset to write zeroes efficiently, when zeroed-grain GTE is
available. If zero-write an allocated cluster, cluster is leaked because
its offset pointer is overwritten by "0x1".

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-05-03 10:33:49 +02:00
Fam Zheng e304e8e5a0 vmdk: store fields of VmdkMetaData in cpu endian
Previously VmdkMetaData.offset is stored little endian while other
fields are cpu endian. This changes offset to cpu endian and convert
before writing to image.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-05-03 10:33:46 +02:00
Fam Zheng 95b0aa4231 vmdk: change magic number to macro
Two hard coded flag bits are changed to macros.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-05-03 10:33:43 +02:00
Fam Zheng 69e0b6dfa4 vmdk: Add option to create zeroed-grain image
Add image create option "zeroed-grain" to enable zeroed-grain GTE
feature of vmdk sparse extents. When this option is on, header version
of newly created extent will be 2 and VMDK4_FLAG_ZERO_GRAIN flag bit
will be set.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-05-03 10:33:41 +02:00
Fam Zheng 14ead646fe vmdk: add support for “zeroed‐grain” GTE
Introduced support for zeroed-grain GTE, as specified in Virtual Disk
Format 5.0[1].

    Recent VMware hosted platform products support a new “zeroed‐grain”
    grain table entry (GTE). The zeroed‐grain GTE returns all zeros on
    read.  In other words, the zeroed‐grain GTE indicates that a grain
    in the child disk is zero‐filled but does not actually occupy space
    in storage.  A sparse extent with zeroed‐grain GTE has the following
    in its header:

     * SparseExtentHeader.version = 2
     * SparseExtentHeader.flags has bit 2 set

    Other than the new flag and the possibly zeroed‐grain GTE, version 2
    sparse extents are identical to version 1.  Also, a zeroed‐grain GTE
    has value 0x1 in the GT table.

[1] Virtual Disk Format 5.0, http://www.vmware.com/support/developer/vddk/vmdk_50_technote.pdf?src=vmdk
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-05-03 10:33:38 +02:00
Fam Zheng 65f7472577 vmdk: named return code.
Internal routines in vmdk.c previously return -1 on error and 0 on
success. More return values are useful for future changes such as
zeroed-grain GTE. Change all the magic `return 0` and `return -1` to
macro names:

 * VMDK_OK      0
 * VMDK_ERROR   (-1)
 * VMDK_UNALLOC (-2)
 * VMDK_ZEROED  (-3)

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-05-03 10:33:35 +02:00
Jeff Cody 059e2fbbca block: add read-only support to VHDX image format.
This adds in read-only support to the VHDX image format.  This supports
reads for fixed-size, and dynamic sized VHDX images.

Differencing files are still unsupported.

The image must be opened without BDRV_O_RDWR set, because we do not
yet update the headers.  I.e., pass 'readonly=on' in the drive image
options from the QEMU commandline.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-05-03 10:31:58 +02:00
Jeff Cody e8d4e5ffdb block: initial VHDX driver support framework - supports open and probe
This is the initial block driver framework for VHDX image support
(i.e. Hyper-V image file formats), that supports opening VHDX files, and
parsing the headers.

This commit does not yet enable:
    - reading
    - writing
    - updating the header
    - differencing files (images with parents)
    - log replay / dirty logs (only clean images)

This is based on Microsoft's VHDX specification:
    "VHDX Format Specification v0.95", published 4/12/2012
    https://www.microsoft.com/en-us/download/details.aspx?id=29681

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-05-03 10:31:58 +02:00
Jeff Cody 203cdba3bc block: vhdx header for the QEMU support of VHDX images
This is based on Microsoft's VHDX specification:
    "VHDX Format Specification v0.95", published 4/12/2012
    https://www.microsoft.com/en-us/download/details.aspx?id=29681

These structures define the various header, metadata, and other
block structures defined in the VHDX specification.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-05-03 10:31:58 +02:00
Liu Yuan 859e5553a4 sheepdog: fix loadvm operation
Currently the 'loadvm' opertaion works as following:
1. switch to the snapshot
2. mark current working VDI as a snapshot
3. rely on sd_create_branch to create a new working VDI based on the snapshot

This works not the same as other format as QCOW2. For e.g,

qemu > savevm # get a live snapshot snap1
qemu > savevm # snap2
qemu > loadvm 1 # This will steally create snap3 of the working VDI

Which will result in following snapshot chain:

base <-- snap1 <-- snap2 <-- snap3
          ^
          |
      working VDI

snap3 was unnecessarily created and might be annoying users.

This patch discard the unnecessary 'snap3' creation. and implement
rollback(loadvm) operation to the specified snapshot by
1. switch to the snapshot
2. delete working VDI
3. rely on sd_create_branch to create a new working VDI based on the snapshot

The snapshot chain for above example will be:

base <-- snap1 <-- snap2
          ^
          |
      working VDI

Cc: qemu-devel@nongnu.org
Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <tailai.ly@taobao.com>
Reviewed-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-26 13:37:51 +02:00
MORITA Kazutaka 13c31de2fd sheepdog: resend write requests when SD_RES_READONLY is received
When a snapshot is taken from out side of qemu (e.g. qemu-img
snapshot), write requests to the current vdi return SD_RES_READONLY.
In this case, the sheepdog block driver needs to update the current
inode to the latest one and resend the write requests.

Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-26 13:26:28 +02:00
MORITA Kazutaka 9ff53a0eb8 sheepdog: add helper function to reload inode
This adds a helper function to update the current inode state with the
specified vdi object.

Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-26 13:26:28 +02:00
MORITA Kazutaka 6a0b549033 sheepdog: add SD_RES_READONLY result code
Sheepdog returns SD_RES_READONLY when qemu sends write requests to the
snapshot vdi.  This adds the result code and makes sd_strerror() print
its error reason.

Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-26 13:26:28 +02:00
MORITA Kazutaka 982dcbf4cb sheepdog: cleanup find_vdi_name
This makes 'filename' and 'tag' constant variables, and renames
'for_snapshot' to 'lock' to clear how it works.

Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-26 13:26:28 +02:00
Kevin Wolf c3ca988d2b rbd: Fix use after free in rbd_open()
Commit a9ccedc3 frees the QemuOpts for the driver-specific options
immediately, even though it still needs the filename string that is
contained there. This doesn't work. Move the deletion of the QemuOpts to
the end of the function where its content isn't needed any more.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-26 13:26:28 +02:00
Liu Yuan 8d71c63137 sheepdog: implement .bdrv_co_is_allocated()
Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <tailai.ly@taobao.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-26 13:26:28 +02:00
Liu Yuan e8bfaa2fae sheepdog: use BDRV_SECTOR_SIZE
Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <tailai.ly@taobao.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-26 13:26:28 +02:00
Liu Yuan cac8f4a60f sheepdog: add discard/trim support for sheepdog
The 'TRIM' command from VM that is to release underlying data storage for
better thin-provision is already supported by the Sheepdog.

This patch adds the TRIM support at QEMU part.

For older Sheepdog that doesn't support it, we return 0(success) to upper layer.

Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Liu Yuan <tailai.ly@taobao.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-26 13:26:27 +02:00
Anthony Liguori f1ab7a5acf Merge remote-tracking branch 'kwolf/for-anthony' into staging
# By Kevin Wolf (16) and Stefan Hajnoczi (4)
# Via Kevin Wolf
* kwolf/for-anthony:
  qemu-iotests: add 053 unaligned compressed image size test
  block: Allow overriding backing.file.filename
  block: Remove filename parameter from .bdrv_file_open()
  vvfat: Use bdrv_open options instead of filename
  sheepdog: Use bdrv_open options instead of filename
  rbd: Use bdrv_open options instead of filename
  iscsi: Use bdrv_open options instead of filename
  gluster: Use bdrv_open options instead of filename
  curl: Use bdrv_open options instead of filename
  blkverify: Use bdrv_open options instead of filename
  blkdebug: Use bdrv_open options instead of filename
  raw-win32: Use bdrv_open options instead of filename
  raw-posix: Use bdrv_open options instead of filename
  block: Enable filename option
  block: Add driver-specific options for backing files
  block: Fail gracefully when using a format driver on protocol level
  qemu-iotests: Fix _filter_qemu
  qemu-img: do not zero-pad the compressed write buffer
  qcow: allow sub-cluster compressed write to last cluster
  qcow2: allow sub-cluster compressed write to last cluster

Message-id: 1366630294-18984-1-git-send-email-kwolf@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-04-22 08:08:22 -05:00
Anthony Liguori 25690739f1 Merge remote-tracking branch 'bonzini/nbd-next' into staging
# By Stefan Hajnoczi
# Via Paolo Bonzini
* bonzini/nbd-next:
  nbd: set TCP_NODELAY
  nbd: use TCP_CORK in nbd_co_send_request()
  nbd: unlock mutex in nbd_co_send_request() error path

Message-id: 1366381830-11267-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-04-22 08:05:14 -05:00
Kevin Wolf 56d1b4d21d block: Remove filename parameter from .bdrv_file_open()
It is unused now in all block drivers.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-04-22 11:34:35 +02:00
Kevin Wolf 7ad9be64e8 vvfat: Use bdrv_open options instead of filename
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-04-22 10:27:59 +02:00
Kevin Wolf c8c96350e0 sheepdog: Use bdrv_open options instead of filename
This is only to convert the internal interface that is used for passing
the "filename" to be parsed, but converting to actual fine grained
options is left for another day, as it doesn't look trivial.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-04-22 10:27:59 +02:00
Kevin Wolf a9ccedc3da rbd: Use bdrv_open options instead of filename
This is only to convert the internal interface that is used for passing
the "filename" to be parsed, but converting to actual fine grained
options is left for another day, as it doesn't look trivial.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-04-22 10:27:59 +02:00
Kevin Wolf 60beb3412d iscsi: Use bdrv_open options instead of filename
This is only to convert the internal interface that is used for passing
the "filename" to be parsed, but converting to actual fine grained
options is left for another day, as it doesn't look trivial.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-04-22 10:27:59 +02:00
Kevin Wolf b489477653 gluster: Use bdrv_open options instead of filename
This is only to convert the internal interface that is used for passing
the "filename" to be parsed, but converting to actual fine grained
options is left for another day, as it doesn't look trivial.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-04-22 10:27:59 +02:00
Kevin Wolf 8e6d58cd5b curl: Use bdrv_open options instead of filename
As a bonus, going through the QemuOpts QEMU_OPT_SIZE parser for the
readahead option gives us proper error reporting that the previous use
of atoi() lacked.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-04-22 10:27:59 +02:00
Kevin Wolf 16c790926b blkverify: Use bdrv_open options instead of filename
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-04-22 10:27:59 +02:00
Kevin Wolf f468121290 blkdebug: Use bdrv_open options instead of filename
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-04-22 10:27:59 +02:00
Kevin Wolf 8a79380b8e raw-win32: Use bdrv_open options instead of filename
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-04-22 10:27:59 +02:00
Kevin Wolf c66a615723 raw-posix: Use bdrv_open options instead of filename
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-04-22 10:27:59 +02:00
Kevin Wolf 31ca6d077c block: Add driver-specific options for backing files
Options starting in "backing." are passed to the backing file now. If
you don't need to specify the filename for the backing file, you can add
it on the command line instead of in the image file:

$ qemu-nbd -t /tmp/test.img
$ qemu-img create -f qcow2 empty.qcow2 1G
$ qemu-system-x86_64 -drive file=empty.qcow2,backing.file.driver=nbd,\
    backing.file.host=localhost

Note that this doesn't override the backing filename from the image. If
the image has one, this will fail because NBD doesn't want the options
and a filename at the same time.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-04-22 10:27:59 +02:00
Stefan Hajnoczi 16b3c5cd9f qcow: allow sub-cluster compressed write to last cluster
Compression in qcow requires image length to be a multiple of the
cluster size.  Lift this requirement by zero-padding the final cluster
when necessary.  The virtual disk size is still not cluster-aligned, so
the guest cannot access the zero sectors.

Note that this is almost identical to the qcow2 version of this code.
qcow2's compression code is drawn from qcow.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-04-22 10:27:58 +02:00
Stefan Hajnoczi f4d38bef7c qcow2: allow sub-cluster compressed write to last cluster
Compression in qcow2 requires image length to be a multiple of the
cluster size.  Lift this requirement by zero-padding the final cluster
when necessary.  The virtual disk size is still not cluster-aligned, so
the guest cannot access the zero sectors.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-04-22 10:27:58 +02:00
Richard W.M. Jones c7a101f529 ssh: Remove unnecessary use of strlen function.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-19 11:45:38 +02:00
Stefan Weil 6ae7d660a0 block/ssh: Add missing gcc format attributes
Now gcc will check whether format string and variable arguments match.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-19 11:44:16 +02:00
Stefan Hajnoczi 97ebbab0e3 nbd: set TCP_NODELAY
Disable the Nagle algorithm to reduce latency.  Note this means we must
also use TCP_CORK when sending header followed by payload to avoid
fragmenting lots of little packets.  The previous patch took care of
that.

Suggested-by: Nick Thomas <nick@bytemark.co.uk>
Tested-by: Nick Thomas <nick@bytemark.co.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-04-15 16:35:17 +02:00
Stefan Hajnoczi 0fcece25c0 nbd: use TCP_CORK in nbd_co_send_request()
Use TCP_CORK to defer packet transmission until both the header and the
payload have been written.

Suggested-by: Nick Thomas <nick@bytemark.co.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-04-15 16:30:40 +02:00
Stefan Hajnoczi 6760c47aa4 nbd: unlock mutex in nbd_co_send_request() error path
Cc: qemu-stable@nongnu.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-04-15 16:29:46 +02:00
Josh Durgin dc7588c1eb rbd: add an asynchronous flush
The existing bdrv_co_flush_to_disk implementation uses rbd_flush(),
which is sychronous and causes the main qemu thread to block until it
is complete. This results in unresponsiveness and extra latency for
the guest.

Fix this by using an asynchronous version of flush.  This was added to
librbd with a special #define to indicate its presence, since it will
be backported to stable versions. Thus, there is no need to check the
version of librbd.

Implement this as bdrv_aio_flush, since it matches other aio functions
in the rbd block driver, and leave out bdrv_co_flush_to_disk when the
asynchronous version is available.

Reported-by: Oliver Francke <oliver@filoo.de>
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-15 10:18:05 +02:00
Richard W.M. Jones 9a2d462e7b block: ssh: Use libssh2_sftp_fsync (if supported by libssh2) to flush to disk.
libssh2_sftp_fsync is an extension to libssh2 to support fsync(2) over
sftp, which is itself an extension of OpenSSH.

If both libssh2 and the ssh daemon support it, this will allow
bdrv_flush_to_disk to commit changes through to disk on the remote
server.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-15 10:18:05 +02:00
Richard W.M. Jones 0a12ec87a5 block: Add support for Secure Shell (ssh) block device.
qemu-system-x86_64 -drive file=ssh://hostname/some/image

QEMU will ssh into 'hostname' and open '/some/image' which is made
available as a standard block device.

You can specify a username (ssh://user@host/...) and/or a port number
(ssh://host:port/...).  You can also use an alternate syntax using
properties (file.user, file.host, file.port, file.path).

Current limitations:

- Authentication must be done without passwords or passphrases, using
  ssh-agent.  Other authentication methods are not supported.

- Uses a single connection, instead of concurrent AIO with multiple
  SSH connections.

This is implemented using libssh2 on the client side.  The server just
requires a regular ssh daemon with sftp-server support.  Most ssh
daemons on Unix/Linux systems will work out of the box.

Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Cc: Stefan Hajnoczi <stefanha@gmail.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-15 10:18:05 +02:00
Kevin Wolf 8d3b1a2d0b block: Introduce bdrv_pwritev() for qcow2_save_vmstate
Directly pass the QEMUIOVector on instead of linearising it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-15 08:26:18 +02:00
Kevin Wolf cf8074b382 block: Introduce bdrv_writev_vmstate
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-15 08:26:18 +02:00
Aurelien Jarno 753d9b82c5 aes: move aes.h from include/block to include/qemu
Move aes.h from include/block to include/qemu to show it can be reused
by other subsystems.

Cc: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2013-04-13 13:51:57 +02:00
Paolo Bonzini 0d09e41a51 hw: move headers to include/
Many of these should be cleaned up with proper qdev-/QOM-ification.
Right now there are many catch-all headers in include/hw/ARCH depending
on cpu.h, and this makes it necessary to compile these files per-target.
However, fixing this does not belong in these patches.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-04-08 18:13:10 +02:00
Kevin Wolf c2b6ff51e4 qcow2: Fix L1 write error handling in qcow2_update_snapshot_refcount
It ignored the error code, and at least the 'goto fail' is obvious
nonsense as it creates an endless loop (if the next attempt doesn't
magically succeed) and leaves the in-memory L1 table in big-endian
instead of converting it back.

In error cases, there's no point in writing an updated L1 table, so
skip this part for them.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-04-05 18:58:05 +02:00
Kevin Wolf c2bc78b6a9 qcow2: Return real error in qcow2_update_snapshot_refcount
This fixes the error message triggered by the following script:

    cat > /tmp/blkdebug.cfg <<EOF
    [inject-error]
    event = "cluster_free"
    errno = "28"
    immediately = "off"
    EOF

    $qemu_img create -f qcow2 test.qcow2 10G
    $qemu_img snapshot -c snap test.qcow2
    $qemu_img snapshot -d snap blkdebug:/tmp/blkdebug.cfg:test.qcow2

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-04-05 18:58:05 +02:00
Stefan Hajnoczi f9e8cacc55 oslib-posix: rename socket_set_nonblock() to qemu_set_nonblock()
The fcntl(fd, F_SETFL, O_NONBLOCK) flag is not specific to sockets.
Rename to qemu_set_nonblock() just like qemu_set_cloexec().

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2013-04-02 11:47:37 -04:00
Kevin Wolf ecdd5333ab qcow2: Gather clusters in a looping loop
Instead of just checking once in exactly this order if there are
dependendies, non-COW clusters and new allocation, this starts looping
around these. This way we can, for example, gather non-COW clusters after
new allocations as long as the host cluster offsets stay contiguous.

Once handle_dependencies() is extended so that COW areas of in-flight
allocations can be overwritten, this allows to continue with gathering
other clusters (we wouldn't be able to do that without this change
because we would have missed a possible second dependency in one of the
next clusters).

This means that in the typical sequential write case, we can combine the
COW overwrite of one cluster with the allocation of the next cluster as
soon as something like Delayed COW gets actually implemented. It is only
by avoiding splitting requests this way that Delayed COW actually starts
improving performance noticably.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:44 +01:00
Kevin Wolf 2c3b32d256 qcow2: Move cluster gathering to a non-looping loop
This patch is mainly to separate the indentation change from the
semantic changes. All that really changes here is that everything moves
into a while loop, all 'goto done' become 'break' and at the end of the
loop a new 'break is inserted.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:44 +01:00
Kevin Wolf 88c6588c51 qcow2: Allow requests with multiple l2metas
Instead of expecting a single l2meta, have a list of them. This allows
to still have a single I/O request for the guest data, even though
multiple l2meta may be needed in order to describe both a COW overwrite
and a new cluster allocation (typical sequential write case).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:44 +01:00
Kevin Wolf 710c2496d8 qcow2: Use byte granularity in qcow2_alloc_cluster_offset()
This gets rid of the nb_clusters and keep_clusters and the associated
complicated calculations. Just advance the number of bytes that have
been processed and everything is fine.

This patch advances the variables even after the last operation even
though they aren't used any more afterwards to make things look more
uniform. A later patch will turn the whole thing into a loop and then
it actually starts making sense.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:44 +01:00