Commit Graph

2043 Commits

Author SHA1 Message Date
Kevin Wolf e424aff5f3 mirror: Fix coroutine reentrance
This fixes a regression introduced by commit dcfb3beb ("mirror: Do zero
write on target if sectors not allocated"), which was reported to cause
aborts with the message "Co-routine re-entered recursively".

The cause for this bug is the following code in mirror_iteration_done():

    if (s->common.busy) {
        qemu_coroutine_enter(s->common.co, NULL);
    }

This has always been ugly because - unlike most places that reenter - it
doesn't have a specific yield that it pairs with, but is more
uncontrolled.  What we really mean here is "reenter the coroutine if
it's in one of the four explicit yields in mirror.c".

This used to be equivalent with s->common.busy because neither
mirror_run() nor mirror_iteration() call any function that could yield.
However since commit dcfb3beb this doesn't hold true any more:
bdrv_get_block_status_above() can yield.

So what happens is that bdrv_get_block_status_above() wants to take a
lock that is already held, so it adds itself to the queue of waiting
coroutines and yields. Instead of being woken up by the unlock function,
however, it gets woken up by mirror_iteration_done(), which is obviously
wrong.

In most cases the code actually happens to cope fairly well with such
cases, but in this specific case, the unlock must already have scheduled
the coroutine for wakeup when mirror_iteration_done() reentered it. And
then the coroutine happened to process the scheduled restarts and tried
to reenter itself recursively.

This patch fixes the problem by pairing the reenter in
mirror_iteration_done() with specific yields instead of abusing
s->common.busy.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1439455310-11263-1-git-send-email-kwolf@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2015-08-14 09:51:31 -04:00
Stefan Hajnoczi cae98cb87d block/mirror: limit qiov to IOV_MAX elements
If mirror has more free buffers than IOV_MAX, preadv(2)/pwritev(2)
EINVAL failures may be encountered.

It is possible to trigger this by setting granularity to a low value
like 8192.

This patch stops appending chunks once IOV_MAX is reached.

The spurious EINVAL failure can be reproduced with a qcow2 image file
and the following QMP invocation:

  qmp.command('drive-mirror', device='virtio0', target='/tmp/r7.s1',
              granularity=8192, sync='full', mode='absolute-paths',
              format='raw')

While the guest is running dd if=/dev/zero of=/var/tmp/foo oflag=direct
bs=4k.

Cc: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1435761950-26714-1-git-send-email-stefanha@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2015-08-06 04:41:09 -04:00
Sascha Silbe e94867ed5f block: don't register quorum driver if SHA256 support is unavailable
Commit 488981a4 [block: convert quorum blockdrv to use crypto APIs]
broke qemu-iotest 041 on hosts with GnuTLS < 2.10.0. It converted a
compile-time check to a run-time check at device open time. The result
is that we now advertise a feature (the quorum block driver) that will
never work (on those hosts). There's no way (short of parsing
human-readable error messages) for qemu-iotests or any other API
consumer to recognise that the quorum block driver isn't _actually_
available and shouldn't be used or tested.

Move the run-time check to bdrv_quorum_init() to avoid registering the
quorum block driver if we know it cannot work. This way API consumers
can recognise it's unavailable.

Fixes: 488981a4af
Signed-off-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 1438699705-21761-1-git-send-email-silbe@linux.vnet.ibm.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-08-05 15:19:32 +01:00
Peter Maydell 9f8c5b69c2 -----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
 
 iQIcBAABAgAGBQJVtwOFAAoJEL2+eyfA3jBXeN8QAJAuQL5Wvwe1epGhiOIAVadX
 Zx0Yx8DnYbHH184tsEOyuliHUxSN6X6bBlO1rTaI/tpEOeZEBY9EBUBDdcty//Y4
 ZHCKDppYSP773wJDLIvHn7gw83Z+zqgRw1WXcyd4piBKiilJ5c2wvpDrTNqwFh0I
 1mqTFoVOARK0OVlxuXdhaH+hBI823MaPaSRckOWYgZjRZN2HlCAN8sDigdQtqBGp
 Udugw4TijlMi5/JmLUmvDHLaVZz2EEKlH7fHjInU98Z2p5UE/dkTI2QIj2smTSEk
 XpUkmukHgLHyIxDnE2pAhdQG12RTd8HrnkmohauHXsvVzQTvW6tOBfR57sqVSy1g
 WPxEiAgvs5kWnF7LLmR4AO7xd+r9/EFrNsMpdinKjA68RAnxEfFcCriNMPtPtv2h
 g/Nw60uqrSiv98/JL7ApcqwcevJOteBY50D//hkpyevlyPkh4hg0vISXfaAaDFrp
 YC0byq32UtkWZJkgAAj+ZkGJOdeRGl+Ma8+2FwxHAaZepCMGJC2xRAQ28CEHUEfP
 WAM/VRirgmxV4kG2JTnotlWqj6z0iYdjiL+AV+Mq0cywkboOQoyaOoIGC30kovGZ
 XMcEDLTr1wke/ODU7JegOX+SPLU+Nf4zXz6QJMaIRlAGO+EpQiJoHjH0PWvjAoqV
 xcWr+wXm9H0ejX5Rx1is
 =9QFT
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/cody/tags/jtc-for-upstream-pull-request' into staging

# gpg: Signature made Tue Jul 28 05:22:29 2015 BST using RSA key ID C0DE3057
# gpg: Good signature from "Jeffrey Cody <jcody@redhat.com>"
# gpg:                 aka "Jeffrey Cody <jeff@codyprime.org>"
# gpg:                 aka "Jeffrey Cody <codyprime@gmail.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 9957 4B4D 3474 90E7 9D98  D624 BDBE 7B27 C0DE 3057

* remotes/cody/tags/jtc-for-upstream-pull-request:
  block/ssh: Avoid segfault if inet_connect doesn't set errno.
  sheepdog: serialize requests to overwrapping area

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-07-28 13:22:57 +01:00
Richard W.M. Jones 325e390421 block/ssh: Avoid segfault if inet_connect doesn't set errno.
On some (but not all) systems:

  $ qemu-img create -f qcow2 overlay -b ssh://xen/
  Segmentation fault

It turns out this happens when inet_connect returns -1 in the
following code, but errno == 0.

  s->sock = inet_connect(s->hostport, errp);
  if (s->sock < 0) {
      ret = -errno;
      goto err;
  }

In the test case above, no host called "xen" exists, so getaddrinfo fails.

On Fedora 22, getaddrinfo happens to set errno = ENOENT (although it
is *not* documented to do that), so it doesn't segfault.

On RHEL 7, errno is not set by the failing getaddrinfo, so ret =
-errno = 0, so the caller doesn't know there was an error and
continues with a half-initialized BDRVSSHState struct, and everything
goes south from there, eventually resulting in a segfault.

Fix this by setting ret to -EIO (same as block/nbd.c and
block/sheepdog.c).  The real error is saved in the Error** errp
struct, so it is printed correctly:

  $ ./qemu-img create -f qcow2 overlay -b ssh://xen/
  qemu-img: overlay: address resolution failed for xen:22: No address associated with hostname

Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Reported-by: Jun Li
BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1147343
Signed-off-by: Jeff Cody <jcody@redhat.com>
2015-07-28 00:19:05 -04:00
Hitoshi Mitake 6a55c82cec sheepdog: serialize requests to overwrapping area
Current sheepdog driver only serializes create requests in oid
unit. This mechanism isn't enough for handling requests to
overwrapping area spanning multiple oids, so it can result bugs like
below:
https://bugs.launchpad.net/sheepdog-project/+bug/1456421

This patch adds a new serialization mechanism for the problem. The
difference from the old one is:
1. serialize entire aiocb if their targetting areas overwrap
2. serialize all requests (read, write, and discard), not only creates

This patch also removes the old mechanism because the new one can be
an alternative.

Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Teruaki Ishizaki <ishizaki.teruaki@lab.ntt.co.jp>
Cc: Vasiliy Tolstov <v.tolstov@selfip.ru>
Signed-off-by: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>
Tested-by: Vasiliy Tolstov <v.tolstov@selfip.ru>
Signed-off-by: Jeff Cody <jcody@redhat.com>
2015-07-28 00:16:57 -04:00
Jeff Cody b15deac795 block: vpc - prevent overflow if max_table_entries >= 0x40000000
When we allocate the pagetable based on max_table_entries, we multiply
the max table entry value by 4 to accomodate a table of 32-bit integers.
However, max_table_entries is a uint32_t, and the VPC driver accepts
ranges for that entry over 0x40000000.  So during this allocation:

s->pagetable = qemu_try_blockalign(bs->file, s->max_table_entries * 4);

The size arg overflows, allocating significantly less memory than
expected.

Since qemu_try_blockalign() size argument is size_t, cast the
multiplication correctly to prevent overflow.

The value of "max_table_entries * 4" is used elsewhere in the code as
well, so store the correct value for use in all those cases.

We also check the Max Tables Entries value, to make sure that it is <
SIZE_MAX / 4, so we know the pagetable size will fit in size_t.

Cc: qemu-stable@nongnu.org
Reported-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-07-27 17:19:06 +02:00
Fam Zheng 9990069758 mirror: Speed up bitmap initial scanning
Limiting to sectors_per_chunk for each bdrv_is_allocated_above is slow,
because the underlying protocol driver would issue much more queries
than necessary. We should coalesce the query.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: <1436413678-7114-4-git-send-email-famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-07-22 11:14:21 +01:00
Richard W.M. Jones 796a060bc0 block/curl: Don't lose original error when a connection fails.
Currently if qemu is connected to a curl source (eg. web server), and
the web server fails / times out / dies, you always see a bogus EIO
"Input/output error".

For example, choose a large file located on any local webserver which
you control:

  $ qemu-img convert -p http://example.com/large.iso /tmp/test

Once it starts copying the file, stop the webserver and you will see
qemu-img fail with:

  qemu-img: error while reading sector 61440: Input/output error

This patch does two things: Firstly print the actual error from curl
so it doesn't get lost.  Secondly, change EIO to EPROTO.  EPROTO is a
POSIX.1 compatible errno which more accurately reflects that there was
a protocol error, rather than some kind of hardware failure.

After this patch is applied, the error changes to:

  $ qemu-img convert -p http://example.com/large.iso /tmp/test
  qemu-img: curl: transfer closed with 469989 bytes remaining to read
  qemu-img: error while reading sector 16384: Protocol error

Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
2015-07-14 21:50:13 -04:00
Wen Congyang 48ac0a4df8 mirror: correct buf_size
If bus_size is less than 0, the command fails.
If buf_size is 0, use DEFAULT_MIRROR_BUF_SIZE.
If buf_size % granularity is not 0, mirror_free_init() will
do dangerous things.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 5555A588.3080907@cn.fujitsu.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2015-07-14 21:50:13 -04:00
Stefan Hajnoczi 17d9716d7b block: keep bitmap if incremental backup job is cancelled
Reclaim the dirty bitmap if an incremental backup block job is
cancelled.  The ret variable may be 0 when the job is cancelled so it's
not enough to check ret < 0.

Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1434380534-7680-1-git-send-email-stefanha@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2015-07-14 21:50:13 -04:00
Fam Zheng 4c0cbd6fec block/mirror: Sleep periodically during bitmap scanning
Before, we only yield after initializing dirty bitmap, where the QMP
command would return. That may take very long, and guest IO will be
blocked.

Add sleep points like the later mirror iterations.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Wen Congyang <wency@cn.fujitsu.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1431486673-19280-1-git-send-email-famz@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2015-07-14 21:50:13 -04:00
Josh Durgin e34d8f297d rbd: fix ceph settings precedence
Apply the ceph settings from a config file before any ceph settings
from the command line. Since the ceph config file location may be
specified on the command line, parse it once to read the config file,
and do a second pass to apply the rest of the command line ceph
options.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-07-14 17:15:23 +02:00
Josh Durgin 99a3c89d5d rbd: make qemu's cache setting override any ceph setting
To be safe, when cache=none is used ceph settings should not be able
to override it to turn on caching. This was previously possible with
rbd_cache=true in the rbd device configuration or a ceph configuration
file. Similarly, rbd settings could have turned off caching when qemu
requested it, although this would just be a performance problem.

Fix this by changing rbd's cache setting to match qemu after all other
ceph settings have been applied.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-07-14 17:15:23 +02:00
Josh Durgin 3dbf00e058 rbd: remove unused constants and fields
RBDAIOCB.status was only used for cancel, which was removed in
7691e24dbe.

RBDAIOCB.sector_num was never used.

RADOSCB.done and rcbid were never used.

RBD_FD* are obsolete since the pipe was removed in
e04fb07fd1.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-07-14 17:15:23 +02:00
Peter Maydell acf7b7fdf3 Bugfixes and Daniel Berrange's crypto library.
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABCAAGBQJVnQWdAAoJEL/70l94x66D6OgIAKJlzQfmy5w7Q9WD4vCMhD76
 JrpLSsn7Gx/Bws0Nu9nLQlqun5z4hiUxyG2kP/WqD9+tV3cpSMSyrG6ImVdqKnQ5
 +Z8WJZuREkQv0aqDUjQVST+eIDZuh2LWJXAjhgsCXUHY77eWb/7WmKT79xJOa+5C
 5xB1qxudqX5IsTvpiKKPbmUGYkAcvRX1dUSaFwRIMO0UyKn59B9WfM9a5slIbLW7
 XfI8+wEJshTVLuQkkTfdidWQc5M5DwlmO7ESUNR/BRPCPFeyjcDqgQY5pBM5XVo9
 C+S0R3zIt3Ew0fhCtLRyjlIT0bGfwjbU5HRiHcyldBKhNUZZjSUoOWJnYRHXUDY=
 =H8wA
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

Bugfixes and Daniel Berrange's crypto library.

# gpg: Signature made Wed Jul  8 12:12:29 2015 BST using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream:
  ossaudio: fix memory leak
  ui: convert VNC to use generic cipher API
  block: convert qcow/qcow2 to use generic cipher API
  ui: convert VNC websockets to use crypto APIs
  block: convert quorum blockdrv to use crypto APIs
  crypto: add a nettle cipher implementation
  crypto: add a gcrypt cipher implementation
  crypto: introduce generic cipher API & built-in implementation
  crypto: move built-in D3DES implementation into crypto/
  crypto: move built-in AES implementation into crypto/
  crypto: introduce new module for computing hash digests
  vl: move rom_load_all after machine init done

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-07-08 20:46:35 +01:00
Daniel P. Berrange f6fa64f6d2 block: convert qcow/qcow2 to use generic cipher API
Switch the qcow/qcow2 block driver over to use the generic cipher
API, this allows it to use the pluggable AES implementations,
instead of being hardcoded to use QEMU's built-in impl.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1435770638-25715-10-git-send-email-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-07-08 13:11:01 +02:00
Daniel P. Berrange 488981a4af block: convert quorum blockdrv to use crypto APIs
Get rid of direct use of gnutls APIs in quorum blockdrv in
favour of using the crypto APIs. This avoids the need to
do conditional compilation of the quorum driver. It can
simply report an error at file open file instead if the
required hash algorithm isn't supported by QEMU.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1435770638-25715-8-git-send-email-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-07-08 13:11:01 +02:00
Ting Wang 970311646a blockjob: add block_job_release function
There is job resource leak in function mirror_start_job,
although bdrv_create_dirty_bitmap is unlikely failed.
Add block_job_release for each release when needed.

Signed-off-by: Ting Wang <kathy.wangting@huawei.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1435311455-56048-1-git-send-email-kathy.wangting@huawei.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-07-07 14:27:14 +01:00
Richard W.M. Jones 25d9747b64 block/raw-posix: Don't think /dev/fd/<NN> is a floppy drive.
In libguestfs we use /dev/fd/<NN> to pass pre-opened file descriptors
to qemu-img.  Lately I've discovered that although this works, qemu
believes that these are floppy disk images.  That in itself isn't much
of a problem, but now qemu prints a warning about host floppy
pass-thru being deprecated.

Extend the existing test so that it ignores /dev/fd/ as well as
/dev/fdset/

A simple test of this, if you are using the bash shell, is:

  qemu-img info <( cat /dev/null )

without this patch:

  $ qemu-img info <( cat /dev/null )
  qemu-img: Host floppy pass-through is deprecated
  Support for it will be removed in a future release.
  qemu-img: Could not open '/dev/fd/63': Could not refresh total sector count: Illegal seek

with this patch:

  $ qemu-img info <( cat /dev/null )
  qemu-img: Could not open '/dev/fd/63': Could not refresh total sector count: Illegal seek

Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1435761614-31358-1-git-send-email-rjones@redhat.com
Fixes: https://bugs.launchpad.net/qemu/+bug/1470536
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-07-07 14:27:14 +01:00
Fam Zheng 53ec73e264 block: Use bdrv_drain to replace uncessary bdrv_drain_all
There callers work on a single BlockDriverState subtree, where using
bdrv_drain() is more accurate.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-07-07 14:27:14 +01:00
Daniel P. Berrange 6f2945cde6 crypto: move built-in AES implementation into crypto/
To prepare for a generic internal cipher API, move the
built-in AES implementation into the crypto/ directory

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1435770638-25715-3-git-send-email-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-07-07 12:04:13 +02:00
Stefan Hajnoczi 7a63f3cdc4 block: update bdrv_drain_all()/bdrv_drain() comments
The doc comments for bdrv_drain_all() and bdrv_drain() are outdated:

 * The bdrv_drain() comment is a poor man's bdrv_lock()/bdrv_unlock()
   which Fam Zheng is currently developing.  Unfortunately this warning
   was never really enough because devices keep submitting I/O and op
   blockers don't prevent that.

 * The bdrv_drain_all() comment is still partially correct but reflects
   the nature of the implementation rather than API documentation.

Do make it clear that bdrv_drain() is only appropriate within an
AioContext.  For anything spanning AioContexts you need
bdrv_drain_all().

Cc: Markus Armbruster <armbru@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1435854281-6078-1-git-send-email-stefanha@redhat.com
2015-07-07 10:31:08 +01:00
Alberto Garcia 1bd84ee717 qcow2: remove unnecessary check
The value of 'i' is guaranteed to be >= 0

Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 1435824371-2660-1-git-send-email-berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-07-07 10:31:04 +01:00
Alberto Garcia 764ba3ae51 block: remove redundant check before g_slist_find()
An empty GSList is represented by a NULL pointer, therefore it's a
perfectly valid argument for g_slist_find() and there's no need to
make any additional check.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 1435583533-5758-1-git-send-email-berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-07-02 10:06:23 +01:00
Peter Lieven 29c838cdc9 block/nfs: limit maximum readahead size to 1MB
a malicious caller could otherwise specify a very
large value via the URI and force libnfs to allocate
a large amount of memory for the readahead buffer.

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Lieven <pl@kamp.de>
Message-id: 1435317241-25585-1-git-send-email-pl@kamp.de
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-07-02 10:06:23 +01:00
Peter Lieven 9049736ec7 block/iscsi: restore compatiblity with libiscsi 1.9.0
RHEL7 and others are stuck with libiscsi 1.9.0 since there
unfortunately was an ABI breakage after that release.

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1435313881-19366-1-git-send-email-pl@kamp.de
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-07-02 10:06:23 +01:00
Fam Zheng 508249952c block: Fix dirty bitmap in bdrv_co_discard
Unsetting dirty globally with discard is not very correct. The discard may zero
out sectors (depending on can_write_zeroes_with_unmap), we should replicate
this change to destination side to make sure that the guest sees the same data.

Calling bdrv_reset_dirty also troubles mirror job because the hbitmap iterator
doesn't expect unsetting of bits after current position.

So let's do it the opposite way which fixes both problems: set the dirty bits
if we are to discard it.

Reported-by: wangxiaolong@ucloud.cn
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-07-02 10:06:23 +01:00
Fam Zheng dcfb3beb51 mirror: Do zero write on target if sectors not allocated
If guest discards a source cluster, mirroring with bdrv_aio_readv is overkill.
Some protocols do zero upon discard, where it's best to use
bdrv_aio_write_zeroes, otherwise, bdrv_aio_discard will be enough.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-07-02 10:06:23 +01:00
Fam Zheng 0fc9f8ea28 qmp: Add optional bool "unmap" to drive-mirror
If specified as "true", it allows discarding on target sectors where source is
not allocated.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-07-02 10:06:23 +01:00
Fam Zheng ba3f0e2545 block: Add bdrv_get_block_status_above
Like bdrv_is_allocated_above, this function follows the backing chain until seeing
BDRV_BLOCK_ALLOCATED.  Base is not included.

Reimplement bdrv_is_allocated on top.

[Initialized bdrv_co_get_block_status_above() ret to 0 to silence
mingw64 compiler warning about the unitialized variable.  assert(bs !=
base) prevents that case but I suppose the program could be compiled
with -DNDEBUG.
--Stefan]

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-07-02 10:03:50 +01:00
John Snow 4b80ab2b7d qapi: Rename 'dirty-bitmap' mode to 'incremental'
If we wish to make differential backups a feature that's easy to access,
it might be pertinent to rename the "dirty-bitmap" mode to "incremental"
to make it clear what /type/ of backup the dirty-bitmap is helping us
perform.

This is an API breaking change, but 2.4 has not yet gone live,
so we have this flexibility.

Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1433463642-21840-2-git-send-email-jsnow@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-07-02 09:20:18 +01:00
Jindřich Makovička 3e5feb6202 qcow2: Handle EAGAIN returned from update_refcount
Fixes a crash during image compression

Signed-off-by: Jindřich Makovička <makovick@gmail.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-07-02 09:20:18 +01:00
Peter Lieven 5dd7a535b7 block/iscsi: add support for request timeouts
libiscsi starting with 1.15 will properly support timeout of iscsi
commands. The default will remain no timeout, but this can
be changed via cmdline parameters, e.g.:

qemu -iscsi timeout=30 -drive file=iscsi://...

If a timeout occurs a reconnect is scheduled and the timed out command
will be requeued for processing after a successful reconnect.

The required API call iscsi_set_timeout is present since libiscsi
1.10 which was released in October 2013. However, due to some bugs
in the libiscsi code the use is not recommended before version 1.15.

Please note that this patch bumps the libiscsi requirement to 1.10
to have all function and macros defined. The patch fixes also a
off-by-one error in the NOP timeout calculation which was fixed
while touching these code parts.

Signed-off-by: Peter Lieven <pl@kamp.de>
Message-id: 1434455107-19328-1-git-send-email-pl@kamp.de
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-07-02 09:20:18 +01:00
Dimitris Aragiorgis 3307ed7b3f raw-posix: Introduce hdev_is_sg()
Until now, an SG device was identified only by checking if its path
started with "/dev/sg". Then, hdev_open() would set the bs->sg flag
accordingly. The patch relies on the actual properties of the device
instead of the specified file path.

To this end, test for an SG device (e.g. /dev/sg0) by ensuring that
all of the following holds:

 - The specified file name corresponds to a character device
 - The device supports the SG_GET_VERSION_NUM ioctl
 - The device supports the SG_GET_SCSI_ID ioctl

Signed-off-by: Dimitris Aragiorgis <dimara@arrikto.com>
Message-id: 1435056300-14924-6-git-send-email-dimara@arrikto.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-06-23 15:08:52 +01:00
Dimitris Aragiorgis a93a3982a6 raw-posix: Use DPRINTF for DEBUG_FLOPPY
Get rid of several #ifdef DEBUG_FLOPPY and substitute them with
DPRINTF.

Signed-off-by: Dimitris Aragiorgis <dimara@arrikto.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1435056300-14924-5-git-send-email-dimara@arrikto.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-06-23 15:08:52 +01:00
Dimitris Aragiorgis bcb225550d raw-posix: DPRINTF instead of DEBUG_BLOCK_PRINT
Building the QEMU tools fails if we #define DEBUG_BLOCK inside
block/raw-posix.c. Here instead of adding qemu-log.o in block-obj-y
so that DEBUG_BLOCK_PRINT can be used, we substitute the latter with
a simple DPRINTF() (that does not cause bit-rot).

Signed-off-by: Dimitris Aragiorgis <dimara@arrikto.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1435056300-14924-4-git-send-email-dimara@arrikto.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-06-23 15:08:52 +01:00
Dimitris Aragiorgis 1b6bc94d5d Fix migration in case of scsi-generic
During migration, QEMU uses fsync()/fdatasync() on the open file
descriptor for read-write block devices to flush data just before
stopping the VM.

However, fsync() on a scsi-generic device returns -EINVAL which
causes the migration to fail. This patch skips flushing data in case
of an SG device, since submitting SCSI commands directly via an SG
character device (e.g. /dev/sg0) bypasses the page cache completely,
anyway.

Note that fsync() not only flushes the page cache but also the disk
cache. The scsi-generic device never sends flushes, and for
migration it assumes that the same SCSI device is used by the
destination host, so it does not issue any SCSI SYNCHRONIZE CACHE
(10) command.

Finally, remove the bdrv_is_sg() test from iscsi_co_flush() since
this is now redundant (we flush the underlying protocol at the end
of bdrv_co_flush() which, with this patch, we never reach).

Signed-off-by: Dimitris Aragiorgis <dimara@arrikto.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1435056300-14924-3-git-send-email-dimara@arrikto.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-06-23 15:08:52 +01:00
Dimitris Aragiorgis b192af8acc block: Use bdrv_is_sg() everywhere
Instead of checking bs->sg use bdrv_is_sg() consistently throughout
the code.

Signed-off-by: Dimitris Aragiorgis <dimara@arrikto.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1435056300-14924-2-git-send-email-dimara@arrikto.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-06-23 15:08:52 +01:00
Wolfgang Bumiller d5941ddae8 vvfat: add a label option
Until now the vvfat volume label was hardcoded to be
"QEMU VVFAT", now you can pass a file.label=labelname option
to the -drive to change it.

The FAT structure defines the volume label to be limited to
11 bytes and is filled up spaces when shorter than that. The
trailing spaces however aren't exposed to the user by
operating systems.

[Added missing comment '#' characters in block-core.json to fix build
errors.
--Stefan]

Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Message-id: 1434706529-13895-2-git-send-email-w.bumiller@proxmox.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-06-23 15:06:17 +01:00
Alexander Yarygin 97b0385a34 block-backend: Introduce blk_drain()
This patch introduces the blk_drain() function which allows to replace
blk_drain_all() when only one BlockDriverState needs to be drained.

Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1434537440-28236-2-git-send-email-yarygin@linux.vnet.ibm.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-06-23 15:06:16 +01:00
Alberto Garcia 2f388b93a1 throttle: Check current timers before updating any_timer_armed[]
Calling throttle_group_config() cancels all timers from a particular
BlockDriverState, so any_timer_armed[] should be updated accordingly.

However, with the current code it may happen that a timer is armed in
a different BlockDriverState from the same group, so any_timer_armed[]
would be set to false in a situation where there is still a timer
armed.

The consequence is that we might end up with two timers armed. This
should not have any noticeable impact however, since all accesses to
the ThrottleGroup are protected by a lock, and the situation would
become normal again shortly thereafter as soon as all timers have been
fired.

The correct way to solve this is to check that we're actually
cancelling a timer before updating any_timer_armed[].

Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 1434382875-3998-1-git-send-email-berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-06-23 15:06:16 +01:00
Alexander Yarygin f406c03c09 block: Let bdrv_drain_all() to call aio_poll() for each AioContext
After the commit 9b536adc ("block: acquire AioContext in
bdrv_drain_all()") the aio_poll() function got called for every
BlockDriverState, in assumption that every device may have its own
AioContext. If we have thousands of disks attached, there are a lot of
BlockDriverStates but only a few AioContexts, leading to tons of
unnecessary aio_poll() calls.

This patch changes the bdrv_drain_all() function allowing it find shared
AioContexts and to call aio_poll() only for unique ones.

Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-id: 1433936297-7098-4-git-send-email-yarygin@linux.vnet.ibm.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-06-23 15:06:16 +01:00
Markus Armbruster cc7a8ea740 Include qapi/qmp/qerror.h exactly where needed
In particular, don't include it into headers.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
2015-06-22 18:20:41 +02:00
Markus Armbruster d49b683644 qerror: Move #include out of qerror.h
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
2015-06-22 18:20:40 +02:00
Markus Armbruster 4629ed1e98 qerror: Finally unused, clean up
Remove it except for two things in qerror.h:

* Two #include to be cleaned up separately to avoid cluttering this
  patch.

* The QERR_ macros.  Mark as obsolete.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
2015-06-22 18:20:40 +02:00
Markus Armbruster c6bd8c706a qerror: Clean up QERR_ macros to expand into a single string
These macros expand into error class enumeration constant, comma,
string.  Unclean.  Has been that way since commit 13f59ae.

The error class is always ERROR_CLASS_GENERIC_ERROR since the previous
commit.

Clean up as follows:

* Prepend every use of a QERR_ macro by ERROR_CLASS_GENERIC_ERROR, and
  delete it from the QERR_ macro.  No change after preprocessing.

* Rewrite error_set(ERROR_CLASS_GENERIC_ERROR, ...) into
  error_setg(...).  Again, no change after preprocessing.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
2015-06-22 18:20:40 +02:00
Eric Blake fc48ffc39e qobject: Use 'bool' for qbool
We require a C99 compiler, so let's use 'bool' instead of 'int'
when dealing with boolean values.  There are few enough clients
to fix them all in one pass.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Acked-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2015-06-22 17:40:00 +02:00
Peter Maydell f3e3b083d4 Block layer core and image format patches
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJVevYFAAoJEH8JsnLIjy/W3jEP/0hiQ3rCRZ/he8s5maTdT+TR
 YSeHkB5rKpz0Uopn1DMn1QrIbUVzX7dyb+uf9zQ0/xRQIzf6k8uxqU/NWrdoF3NK
 qx91dGWedwnG+TEBIMbcR7nMrw4dP6kH7uPz/VWMXDHVLz0HIcD95qhKgs0mSY6J
 dWqex6ACjXM68zJU5IioagU9evV80WZE1S8z7zfixxtTBx5hCaTVbwalkaCxcrXw
 PbZle55rjI8B10+OzgBw0fq10nias+NTndU9CwNBboxmEtAjq8/mQ663vcWlmiFo
 9a/hkda27Z5ut/0Tqk1v4uLHauylp++rrAabPBAuCFMKes6cdkddP15Q/r52aJ29
 5meodQtbet1rGrM+Aq4vuSuWId71PGypEI/3URDdNfYFNISoeLLsk4lcQUu7VrDD
 sRX3Jt8SI3nkIgOnhPyi7NDPmafxFt8yRt5vM8MyR5ynF8NS/2hiAc3wqnbXGjUj
 a5GqDCefb1yM0R5HvksuFFt3OnXlKJQ3J+ksXNUJf9DSAZPauqWD696pcTeg8wyy
 3PIGkczgUuKTVfFWd3THZxJLAo7ZuqvXBHHV8o1SeMBDxwh4FhTd8Kjvm3rUNFfl
 VDox4qwZ1AcxLrxgqazKU7sD9iWBDHURRcpOoUsBys7oxQZnhcmMp1fRlEkTOyrD
 HNiSNByqBrtkfeVzHlSe
 =QgYk
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block layer core and image format patches

# gpg: Signature made Fri Jun 12 16:08:53 2015 BST using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"

* remotes/kevin/tags/for-upstream: (25 commits)
  block: Fix reopen flag inheritance
  block: Add BlockDriverState.inherits_from
  block: Add list of children to BlockDriverState
  queue.h: Add QLIST_FIX_HEAD_PTR()
  block: Drain requests before swapping nodes in bdrv_swap()
  block: Move flag inheritance to bdrv_open_inherit()
  block: Use QemuOpts in bdrv_open_common()
  block: Use macro for cache option names
  vmdk: Use bdrv_open_image()
  quorum: Use bdrv_open_image()
  check-qdict: Test cases for new functions
  qdict: Add qdict_{set,copy}_default()
  qdict: Add qdict_array_entries()
  iotests: Add tests for overriding BDRV_O_PROTOCOL
  block: driver should override flags in bdrv_open()
  block: Change bitmap truncate conditional to assertion
  block: record new size in bdrv_dirty_bitmap_truncate
  raw-posix: Fix .bdrv_co_get_block_status() for unaligned image size
  vmdk: Use vmdk_find_index_in_cluster everywhere
  vmdk: Fix index_in_cluster calculation in vmdk_co_get_block_status
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-06-15 10:43:06 +01:00
Kevin Wolf 67251a3113 block: Fix reopen flag inheritance
When reopening an image, the block layer already takes care to reopen
bs->file as well with recalculated inherited flags. The same must happen
for any other child (most notably missing before this patch: backing
files).

If bs->file (or any other child) didn't originally inherit from bs, e.g.
because it was created separately and then only referenced, it must not
inherit flags on reopen either, so check the inherited_from field before
propagation the reopen down.

VMDK already reopened its extents manually; this code can now be
dropped.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2015-06-12 17:04:59 +02:00