Commit Graph

3261 Commits

Author SHA1 Message Date
Paolo Bonzini 850d54a2a9 block: access io_plugged with atomic ops
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170605123908.18777-7-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-06-16 07:55:00 +08:00
Paolo Bonzini e2a6ae7fe5 block: access wakeup with atomic ops
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170605123908.18777-6-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-06-16 07:55:00 +08:00
Paolo Bonzini 20fc71b25c block: access serialising_in_flight with atomic ops
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170605123908.18777-5-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-06-16 07:55:00 +08:00
Paolo Bonzini d993b85804 block: access io_limits_disabled with atomic ops
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170605123908.18777-4-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-06-16 07:55:00 +08:00
Paolo Bonzini 414c2ec358 block: access quiesce_counter with atomic ops
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170605123908.18777-3-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-06-16 07:55:00 +08:00
Paolo Bonzini d3faa13e5f block: access copy_on_read with atomic ops
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170605123908.18777-2-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-06-16 07:55:00 +08:00
Vladimir Sementsov-Ogievskiy d1fdf257d5 nbd: rename read_sync and friends
Rename
  nbd_wr_syncv -> nbd_rwv
  read_sync -> nbd_read
  read_sync_eof -> nbd_read_eof
  write_sync -> nbd_write
  drop_sync -> nbd_drop

1. nbd_ prefix
   read_sync and write_sync are already shared, so it is good to have a
   namespace prefix. drop_sync will be shared, and read_sync_eof is
   related to read_sync, so let's rename them all.

2. _sync suffix
   _sync is related to the fact that nbd_wr_syncv doesn't return if a
   write to socket returns EAGAIN. The first implementation of
   nbd_wr_syncv (was wr_sync in 7a5ca8648b) just loops while getting
   EAGAIN, the current implementation yields in this case.
   Why we want to get rid of it:
   - it is normal for r/w functions to be synchronous, so having an
     additional suffix for it looks redundant (contrariwise, we have
     _aio suffix for async functions)
   - _sync suffix in block layer is used when function does flush (so
     using it for other thing is confusing a bit)
   - keep function names short after adding nbd_ prefix

3. for nbd_wr_syncv let's use more common notation 'rw'

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20170602150150.258222-2-vsementsov@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-15 11:04:06 +02:00
Jeff Cody 5c3ad1a6a8 block/iscsi: enable filename option and parsing
When enabling option parsing and blockdev-add for iscsi, we removed the
'filename' option.  Unfortunately, this was a bit optimistic, as
previous versions of QEMU allowed the use of the option in backing
filenames via json.  This means that without parsing this option, we
cannot open existing images that used to work fine.

See bug: https://bugzilla.redhat.com/show_bug.cgi?id=1457088

Tested-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Message-id: 0789ab6c32814ab4b6896707d378804bd4424c65.1497444637.git.jcody@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-06-14 17:39:46 -04:00
Jeff Cody 91589d9e5c block/rbd: enable filename option and parsing
When enabling option parsing and blockdev-add for rbd, we removed the
'filename' option.  Unfortunately, this was a bit optimistic, as
previous versions of QEMU allowed the use of the option in backing
filenames via json.  This means that without parsing this option, we
cannot open existing images that used to work fine.

See bug: https://bugzilla.redhat.com/show_bug.cgi?id=1457088

Tested-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Message-id: 937dc9fde348d13311eb8e23444df3bc3190b612.1497444637.git.jcody@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-06-14 17:39:46 -04:00
Peter Maydell e0b4891ae6 -----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJZOpeXAAoJEL2+eyfA3jBXKqYP/jXv/ehVADW3X30CXSPoZsGo
 rdsdd8pqNwuEYHpFrGclMNbsDWxUsiv8t3MleyVEtl+N0pBVnlZisR3PCONShxxj
 w0S0/UniAib+j+Y3D7xSHKKrin6CjNVDltrJdX7VOAZsOTj93pVDhd8SlPc3YDSt
 DrUQoJI96DOwtQofXLTzZkzytbd/9ZN7VzF3j4wJ/72do7thCBybizRLUBEiKM+w
 RQgcb3Xvob1H9pKSjaNnmH8CKqrISJRfzpXCAzowQy+X8RTTBedn/ZHtiHFYb2XZ
 6J/OYM1p4W2X69KMUVJ8wAZsNVNlVWaXsD+lKcBcLtlYIUac7CB/Hcg4mrJDqhzq
 752KXvencRKP52HfVLaySXBE2JguTvUrkBDw3/u9W2qUZw3tUexPhH/CDH6wf8XU
 JhbZk7fm722pfCs/Gmt+9Muz7YkGbT91EWPWZYxbZ1avebmTlq6mCarqTTz9DmXZ
 psJe7a6D0M0bcQL0//rOH3USbOSUcFascJvuRoll28H+z3oCfC9ILbBZE8xQ9uOM
 W5mmCL8SoljY+cSIb9NSy0/vG4gqvy4TigaIRu8VT8EJG3Yel8y9YXb4kX5NFV6Z
 ZuEcgykUnBNzBmiXC+M57r9Qa1J/9GciSOVfH8PdOzhuwWimBANg5WD+B6XcrknD
 OCOVaV58DMh3ApwfVCkj
 =XJ/8
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/cody/tags/block-pull-request' into staging

# gpg: Signature made Fri 09 Jun 2017 13:41:59 BST
# gpg:                using RSA key 0xBDBE7B27C0DE3057
# gpg: Good signature from "Jeffrey Cody <jcody@redhat.com>"
# gpg:                 aka "Jeffrey Cody <jeff@codyprime.org>"
# gpg:                 aka "Jeffrey Cody <codyprime@gmail.com>"
# Primary key fingerprint: 9957 4B4D 3474 90E7 9D98  D624 BDBE 7B27 C0DE 3057

* remotes/cody/tags/block-pull-request:
  block/gluster.c: Handle qdict_array_entries() failure

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-13 12:55:47 +01:00
Peter Maydell 475df9d809 Block layer patches
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJZOorTAAoJEH8JsnLIjy/W+MAP/iOH2GtgtUfH8CLxqWqZUJ5p
 rmlLLTrooBHUF09BUSCkwD5E0s1El9phvldt9t2l6RaxxSlMwZl1nB0ww9q+z2Aq
 NY5VBYUov28JnqFqRkOPiJHMo7oRkzgyGVn3RCbevwWudEK3DcSRT5UftDK5UPH3
 HwxYY09uxpDcMkXajC1pRG/RJFrfsmbM3pXFYMsOgj8QT5nNdqebaqcYnuTKX9F4
 5Hn5O0tRdZZWWvP1QXRAPszmihdh4GN4I1EEfAgjV8Y3TqBApZS4vdVOJXm07Vlx
 FB6FKEWvK8zMU/0kS8jH5v4A03lRtokESL9Io5K4mwuxiTfNjvb4ZlftGVgUzuFi
 X5Yn6pPYU8t+rvRNSHF/8UYxX6lcuGL2DlVn/a+47phmQrW46fd8srp1A1c+dtzS
 jjMvw1tJNLYGP29QWzhaROPVn3Uk4wPbZIU472HB6ZTSC76g+jvGzfLVspnZeUjY
 96l3721vv8oa7ZtFvRNT9zTYxAoGXEcgFR7f+E9yVWdISZ0AKdONNIUV0w9AHKNc
 rDM4XdnG7ouiVaGeT3cdDNjjbb/v3+sR9yUzf9DUd41purmkrRPtGcetd8esf8Jr
 plIiBvf+mCRDACSYbIF3JKEHGYFasgs+X/OkhBVeeIBJUkl89mFpca5Qs4AF1d6v
 l7IW3y56umBa00qDgl/R
 =ey0C
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block layer patches

# gpg: Signature made Fri 09 Jun 2017 12:47:31 BST
# gpg:                using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream:
  block: fix external snapshot abort permission error
  block/qcow.c: Fix memory leak in qcow_create()
  qemu-iotests: Test automatic commit job cancel on hot unplug
  commit: Fix use after free in completion
  qemu-iotests: Block migration test
  migration/block: Clean up BBs in block_save_complete()
  migration: Inactivate images after .save_live_complete_precopy()
  block: Fix anonymous BBs in blk_root_inactivate()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-12 10:43:32 +01:00
Peter Maydell 56faeb9bb6 block/gluster.c: Handle qdict_array_entries() failure
In qemu_gluster_parse_json(), the call to qdict_array_entries()
could return a negative error code, which we were ignoring
because we assigned the result to an unsigned variable.
Fix this by using the 'int' type instead, which matches the
return type of qdict_array_entries() and also the type
we use for the loop enumeration variable 'i'.

(Spotted by Coverity, CID 1360960.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1496682098-1540-1-git-send-email-peter.maydell@linaro.org
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-06-09 08:41:29 -04:00
Peter Maydell 272545cf21 block/qcow.c: Fix memory leak in qcow_create()
Coverity points out that the code path in qcow_create() for
the magic "fat:" backing file name leaks the memory used to
store the filename (CID 1307771). Free the memory before
we overwrite the pointer.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-06-09 13:46:20 +02:00
Kevin Wolf 19ebd13ed4 commit: Fix use after free in completion
The final bdrv_set_backing_hd() could be working on already freed nodes
because the commit job drops its references (through BlockBackends) to
both overlay_bs and top already a bit earlier.

One way to trigger the bug is hot unplugging a disk for which
blockdev_mark_auto_del() cancels the block job.

Fix this by taking BDS-level references while we're still using the
nodes.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
2017-06-09 13:46:13 +02:00
Kevin Wolf 93c26503e0 block: Fix anonymous BBs in blk_root_inactivate()
blk->name isn't an array, but a pointer that can be NULL. Checking for
an anonymous BB must involve a NULL check first, otherwise we get
crashes.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
2017-06-09 11:45:03 +02:00
Paolo Bonzini 6bdcc018a6 nbd: make it thread-safe, fix qcow2 over nbd
NBD is not thread safe, because it accesses s->in_flight without
a CoMutex.  Fixing this will be required for multiqueue.
CoQueue doesn't have spurious wakeups but, when another coroutine can
run between qemu_co_queue_next's wakeup and qemu_co_queue_wait's
re-locking of the mutex, the wait condition can become false and
a loop is necessary.

In fact, it turns out that the loop is necessary even without this
multi-threaded scenario.  A particular sequence of coroutine wakeups
is happening ~80% of the time when starting a guest with qcow2 image
served over NBD (i.e. qemu-nbd --format=raw, and QEMU's -drive option
has -format=qcow2).  This patch fixes that issue too.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-07 18:22:02 +02:00
Vladimir Sementsov-Ogievskiy be41c100c0 nbd/client.c: use errp instead of LOG
Move to modern errp scheme from just LOGging errors.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20170526110913.89098-1-vsementsov@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-06 20:18:36 +02:00
Vladimir Sementsov-Ogievskiy f260956536 nbd: add errp parameter to nbd_wr_syncv()
Will be used in following patch to provide actual error message in
some cases.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20170516094533.6160-4-vsementsov@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-06 20:18:36 +02:00
Niels de Vos df3a429ae8 gluster: add support for PREALLOC_MODE_FALLOC
Add missing support for "preallocation=falloc" to the Gluster block
driver. This change bases its logic on that of block/file-posix.c and
removed the gluster_supports_zerofill() and qemu_gluster_zerofill()
functions in favour of #ifdef checks in an easy to read
switch-statement.

Both glfs_zerofill() and glfs_fallocate() have been introduced with
GlusterFS 3.5.0 (pkg-config glusterfs-api = 6). A #define for the
availability of glfs_fallocate() has been added to ./configure.

Reported-by: Satheesaran Sundaramoorthi <sasundar@redhat.com>
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Message-id: 20170528063114.28691-1-ndevos@redhat.com
URL: https://bugzilla.redhat.com/1450759
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-06-02 10:51:47 -04:00
Stefan Hajnoczi 0748b3526e Block layer patches
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJZLDGTAAoJEH8JsnLIjy/WjY0P/jtXF+SQKs9H695Uy+U+EUr5
 9w+lYTfjHXTUx9F5AKT4OAMww/uWSG5ywG8FZD8AJZyt1B8piDrSA7sW5YYgWCV4
 XD1HMzN6pasVGrchzgfBSW/K9BgUlvBEjqFrLCNVii0osa68J7vX3uJsLyRA+yPo
 ijFseJaB/b0KnAPG9JqHXc4DVQgU2IhZH3ZsgE6Utb+KUrm8/veiORhWQboBbfCW
 eyTF+YWee/rI+up4nKn9+Le57EUXWr3Y50BBxFZw1/4wQbv0WEhOIbl/Z4ggxNRR
 VoyXGMqOIZFlGQ7bGxDawuEwGKfcOFFEHj3rhPrs7RPnod/6X8Vxm0rAr2GNZoOW
 9tLMQKe4HLo3kVwX65AhzWd/WMukAd0MC/0oaULOjiAEdlWjflyEhx9H9uYefl5x
 kn77ZqQqIEL4aCYRBLJn9oJV3CIZbSd6cuKC9UPuXAwD8XDWTXUzT/aDMJUnfij7
 vhS1qks1Wyk37wIjTuuERwrr0K6y8e3De30NeU6LqQuzXvJ1MYSp49BA1FCgHfmT
 x5RWbTSjLdjZiIo7biE2dSwdOtsxsnuxX19utqfGqOmegNU7LykRtu6mFh9kJhNe
 5ladt1Mu//puZLQ/q4RH/J0pwkuS/OVrS3LBobcggPGHhUIiHhdNUI73bb7BK0+9
 ME5DGLm4/c/REB3Lidr9
 =kkU0
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'kwolf/tags/for-upstream' into staging

Block layer patches

# gpg: Signature made Mon 29 May 2017 03:34:59 PM BST
# gpg:                using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* kwolf/tags/for-upstream:
  block/file-*: *_parse_filename() and colons
  block: Fix backing paths for filenames with colons
  block: Tweak error message related to qemu-img amend
  qemu-img: Fix leakage of options on error
  qemu-img: copy *key-secret opts when opening newly created files
  qemu-img: introduce --target-image-opts for 'convert' command
  qemu-img: fix --image-opts usage with dd command
  qemu-img: add support for --object with 'dd' command
  qemu-img: Fix documentation of convert
  qcow2: remove extra local_error variable
  mirror: Drop permissions on s->target on completion
  nvme: Add support for Controller Memory Buffers
  iotests: 147: Don't test inet6 if not available
  qemu-iotests: Test streaming with missing job ID
  stream: fix crash in stream_start() when block_job_create() fails

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-30 14:15:15 +01:00
Max Reitz 03c320d803 block/file-*: *_parse_filename() and colons
The file drivers' *_parse_filename() implementations just strip the
optional protocol prefix off the filename. However, for e.g.
"file:foo:bar", this would lead to "foo:bar" being stored as the BDS's
filename which looks like it should be managed using the "foo" protocol.
This is especially troublesome if you then try to resolve a backing
filename based on "foo:bar".

This issue can only occur if the stripped part is a relative filename
("file:/foo:bar" will be shortened to "/foo:bar" and having a slash
before the first colon means that "/foo" is not recognized as a protocol
part). Therefore, we can easily fix it by prepending "./" to such
filenames.

Before this patch:
$ ./qemu-img create -f qcow2 backing.qcow2 64M
Formatting 'backing.qcow2', fmt=qcow2 size=67108864 encryption=off
    cluster_size=65536 lazy_refcounts=off refcount_bits=16
$ ./qemu-img create -f qcow2 -b backing.qcow2 file🔝image.qcow2
Formatting 'file🔝image.qcow2', fmt=qcow2 size=67108864
    backing_file=backing.qcow2 encryption=off cluster_size=65536
    lazy_refcounts=off refcount_bits=16
$ ./qemu-io file🔝image.qcow2
can't open device file🔝image.qcow2: Could not open backing file:
    Unknown protocol 'top'

After this patch:
$ ./qemu-io file🔝image.qcow2
[no error]

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170522195217.12991-3-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-29 15:39:54 +02:00
Eric Blake bcb07dba92 block: Tweak error message related to qemu-img amend
When converting a 1.1 image down to 0.10, qemu-iotests 060 forces
a contrived failure where allocating a cluster used to replace a
zero cluster reads unaligned data.  Since it is a zero cluster
rather than a data cluster being converted, changing the error
message to match our earlier change in 'qcow2: Make distinction
between zero cluster types obvious' is worthwhile.

Suggested-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-id: 20170508171302.17805-1-eblake@redhat.com
[mreitz: Commit message fixes]
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-29 15:39:54 +02:00
Alberto Garcia a7a6a2bffc qcow2: remove extra local_error variable
Commit d7086422b1 added a local_err
variable global to the qcow2_amend_options() function, so there's no
need to have this other one.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 20170511150337.21470-1-berto@igalia.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-29 15:39:53 +02:00
Kevin Wolf 63c8ef2890 mirror: Drop permissions on s->target on completion
This fixes an assertion failure that was triggered by qemu-iotests 129
on some CI host, while the same test case didn't seem to fail on other
hosts.

Essentially the problem is that the blk_unref(s->target) in
mirror_exit() doesn't necessarily mean that the BlockBackend goes away
immediately. It is possible that the job completion was triggered nested
in mirror_drain(), which looks like this:

    BlockBackend *target = s->target;
    blk_ref(target);
    blk_drain(target);
    blk_unref(target);

In this case, the write permissions for s->target are retained until
after blk_drain(), which makes removing mirror_top_bs fail for the
active commit case (can't have a writable backing file in the chain
without the filter driver).

Explicitly dropping the permissions first means that the additional
reference doesn't hurt and the job can complete successfully even if
called from the nested blk_drain().

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2017-05-29 15:37:26 +02:00
Alberto Garcia 525989a50a stream: fix crash in stream_start() when block_job_create() fails
The code that tries to reopen a BlockDriverState in stream_start()
when the creation of a new block job fails crashes because it attempts
to dereference a pointer that is known to be NULL.

This is a regression introduced in a170a91fd3,
likely because the code was copied from stream_complete().

Cc: qemu-stable@nongnu.org
Reported-by: Kashyap Chamarthy <kchamart@redhat.com>
Signed-off-by: Alberto Garcia <berto@igalia.com>
Tested-by: Kashyap Chamarthy <kchamart@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-26 16:48:21 +02:00
Jeff Cody 223a23c198 block/gluster: glfs_lseek() workaround
On current released versions of glusterfs, glfs_lseek() will sometimes
return invalid values for SEEK_DATA or SEEK_HOLE.  For SEEK_DATA and
SEEK_HOLE, the returned value should be >= the passed offset, or < 0 in
the case of error:

LSEEK(2):

    off_t lseek(int fd, off_t offset, int whence);

    [...]

    SEEK_HOLE
              Adjust  the file offset to the next hole in the file greater
              than or equal to offset.  If offset points into the middle of
              a hole, then the file offset is set to offset.  If there is no
              hole past offset, then the file offset is adjusted to the end
              of the file (i.e., there is  an implicit hole at the end of
              any file).

    [...]

    RETURN VALUE
              Upon  successful  completion,  lseek()  returns  the resulting
              offset location as measured in bytes from the beginning of the
              file.  On error, the value (off_t) -1 is returned and errno is
              set to indicate the error

However, occasionally glfs_lseek() for SEEK_HOLE/DATA will return a
value less than the passed offset, yet greater than zero.

For instance, here are example values observed from this call:

    offs = glfs_lseek(s->fd, start, SEEK_HOLE);
    if (offs < 0) {
        return -errno;          /* D1 and (H3 or H4) */
    }

start == 7608336384
offs == 7607877632

This causes QEMU to abort on the assert test.  When this value is
returned, errno is also 0.

This is a reported and known bug to glusterfs:
https://bugzilla.redhat.com/show_bug.cgi?id=1425293

Although this is being fixed in gluster, we still should work around it
in QEMU, given that multiple released versions of gluster behave this
way.

This patch treats the return case of (offs < start) the same as if an
error value other than ENXIO is returned; we will assume we learned
nothing, and there are no holes in the file.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Message-id: 87c0140e9407c08f6e74b04131b610f2e27c014c.1495560397.git.jcody@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-05-24 16:44:46 -04:00
Paolo Bonzini f321dcb57f blockjob: introduce block_job_pause/resume_all
Remove use of block_job_pause/resume from outside blockjob.c, thus
making them static.  The new functions are used by the block layer,
so place them in blockjob_int.h.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 20170508141310.8674-5-pbonzini@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-05-24 16:38:51 -04:00
Paolo Bonzini 05b0d8e3b8 blockjob: introduce block_job_early_fail
Outside blockjob.c, block_job_unref is only used when a block job fails
to start, and block_job_ref is not used at all.  The reference counting
thus is pretty well hidden.  Introduce a separate function to be used
by block jobs; because block_job_ref and block_job_unref now become
static, move them earlier in blockjob.c.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 20170508141310.8674-4-pbonzini@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-05-24 16:38:51 -04:00
Juan Quintela 68ba3b0743 migration: migration.h was not needed
This files don't use any function from migration.h, so drop it.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-05-18 19:20:59 +02:00
Stefan Hajnoczi 2ccbd47c1d migration/next for 20170517
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJZHCoMAAoJEPSH7xhYctcjyOcQAN82GDYgXj93k40rU/SmZTP7
 blelisGsY5UNo33bLZq07fVwwdk1vIR0OIZvjMyGVWptAX49QJ6BVwX2E5zmb9LW
 AT3rVeyqz8nnC6OwWBxN9bu+sPJ13ibGs1l2j5Kn9jZ6a9rJCC7LOKdo4Dxbs3Uk
 Obw4f7swsozTQPxeHfrsBgFIvcB8qXLjdxsVhj+IWkmp1KDKVg+TWfNFJx30dK0G
 ktVsV0Xu6exEzcnzpTf93Bcv8vt49JRrCka9N5YryPTZmFuGgW291lqviPWiZg/W
 39F3cga5QfDzcs4Z6Lrz3Qeo/q+2n5G5O23UmrJccZ//UQMdeW9sd5udj211aMeq
 I7UdrarIHWRCCVTVdVL7AGJ8xmMIKHsvKRWstw7FEMHQ+lD/sFSfpWBtYdGhAotF
 mf/yncMKb52QbNyIuanoKi8UjU+RCvuslCac87U3fPqz/qYGvhnmO145S/wai1mR
 +FQQXORJOhdsWDqRRz9q8/uXqPwm173+rHHzMgFa3P1X9u1jfLhjJk0g9sDFtyAb
 If4IzOwfuCLJyelcuzzy9SSOzDsGu1LcrBoRgqTugX+MSJXFjWOKKfA1wxnAKkPf
 T2fQIqny2N7VCfpDB1iaCfxnkizIwrYEI3YRkMuJpYU3489x/BJQIILoLo1yEj4G
 vNhq+qJ9V/Uj8X+X5/cL
 =A5DU
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'quintela/tags/migration/20170517' into staging

migration/next for 20170517

# gpg: Signature made Wed 17 May 2017 11:46:36 AM BST
# gpg:                using RSA key 0xF487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>"
# gpg:                 aka "Juan Quintela <quintela@trasno.org>"
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03  4B82 F487 EF18 5872 D723

* quintela/tags/migration/20170517:
  migration: Move check_migratable() into qdev.c
  migration: Move postcopy stuff to postcopy-ram.c
  migration: Move page_cache.c to migration/
  migration: Create migration/blocker.h
  ram: Rename RAM_SAVE_FLAG_COMPRESS to RAM_SAVE_FLAG_ZERO
  migration: Pass Error ** argument to {save,load}_vmstate
  migration: Fix regression with compression threads

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-18 10:05:52 +01:00
Stefan Hajnoczi fefb28a471 -----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJZGx79AAoJEL2+eyfA3jBXYmwP/0hyCtpaXU8eLFNarnedv6M+
 iDEFULUuYtiY2Jd3brxoVifIkP5Z4P6XEtCwoPKUZhu7yq+OySs/f/ewgskQOr0F
 YRYX2CSiVFIuT816mnYnpkrUwOzHPzewOF+99CgQD6IAEg1cXVgzscOpHnWW7Eaz
 vkrvT0C0maxm3ClCLcRFJWPwUsFAZR+2e3lf/1DWjxgkPx5KJW6tMatdQeuBJCMe
 5y/A4+bMOZyRHDppp1YMh2ijeJ230mvRnjNadUhfXtIKKvHDpulFVZFaw/Bh/3+C
 FSZYKNrgQw4lzBc2Bao2q5nP69fDlfZiKABjumbyib4yfkgOcaCRnseB1xHdVYsz
 WHKeLufFvwsnXgDVHz/DoYB5wyJG9I7oHNj4VE6BHWsszR6UhlN+XsbtK6Sl5CKG
 wv8z3D8U3e0ki/jufta2+Yh4bvMwrEofjzPh+g1T1Gme+BVubDtAV16w0UcNbAAl
 O3quUnQA1PWA2jPIdqE28Z2gdAywun7nLEV8fxOL5NNnWrHDEH4EZvHb1l/Lcjgn
 W1Dx/vRDpd1cJFwfh3NP+n+zIhaT36ZJ2rnpJfWvsJTJn346gW5XPcgYChqqEMle
 dOnjPqXceOZHFBsvWSBPFX0qfNPOsTBtwtTqOggZsF5W9aCkvUPi3SRLe8fSlUkH
 8p06ejEr4VTZkeE2s5WM
 =joxq
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'jtc/tags/block-pull-request' into staging

# gpg: Signature made Tue 16 May 2017 04:47:09 PM BST
# gpg:                using RSA key 0xBDBE7B27C0DE3057
# gpg: Good signature from "Jeffrey Cody <jcody@redhat.com>"
# gpg:                 aka "Jeffrey Cody <jeff@codyprime.org>"
# gpg:                 aka "Jeffrey Cody <codyprime@gmail.com>"
# Primary key fingerprint: 9957 4B4D 3474 90E7 9D98  D624 BDBE 7B27 C0DE 3057

* jtc/tags/block-pull-request:
  curl: do not do aio_poll when waiting for a free CURLState
  curl: convert readv to coroutines
  curl: convert CURLAIOCB to byte values
  curl: split curl_find_state/curl_init_state
  curl: avoid recursive locking of BDRVCURLState mutex
  curl: never invoke callbacks with s->mutex held
  curl: strengthen assertion in curl_clean_state
  block: curl: Allow passing cookies via QCryptoSecret

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-17 13:52:07 +01:00
Juan Quintela 795c40b8bd migration: Create migration/blocker.h
This allows us to remove lots of includes of migration/migration.h

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-05-17 12:04:59 +02:00
Paolo Bonzini 2bb5c936c5 curl: do not do aio_poll when waiting for a free CURLState
Instead, put the CURLAIOCB on a wait list and yield; curl_clean_state will
wake the corresponding coroutine.

Because of CURL's callback-based structure, we cannot easily convert
everything to CoMutex/CoQueue; keeping the QemuMutex is simpler.  However,
CoQueue is a simple wrapper around a linked list, so we can easily
use QSIMPLEQ and open-code a CoQueue, protected by the BDRVCURLState
QemuMutex instead of a CoMutex.

Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170515100059.15795-8-pbonzini@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-05-16 10:34:50 -04:00
Paolo Bonzini 28256d8246 curl: convert readv to coroutines
This is pretty simple.  The bottom half goes away because, unlike
bdrv_aio_readv, coroutine-based read can return immediately without
yielding.  However, for simplicity I kept the former bottom half
handler in a separate function.

Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170515100059.15795-7-pbonzini@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-05-16 10:34:50 -04:00
Paolo Bonzini 2125e5ea6e curl: convert CURLAIOCB to byte values
This is in preparation for the conversion from bdrv_aio_readv to
bdrv_co_preadv, and it also requires changing some of the size_t values
to uint64_t.  This was broken before for disks > 2TB, but now it would
break at 4GB.

Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170515100059.15795-6-pbonzini@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-05-16 10:34:50 -04:00
Paolo Bonzini 3ce6a729b5 curl: split curl_find_state/curl_init_state
If curl_easy_init fails, a CURLState is left with s->in_use = 1.  Split
curl_init_state in two, so that we can distinguish the two failures and
call curl_clean_state if needed.

While at it, simplify curl_find_state, removing a dummy loop.  The
aio_poll loop is moved to the sole caller that needs it.

Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170515100059.15795-5-pbonzini@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-05-16 10:34:45 -04:00
Gerd Hoffmann cdece0467c block/win32: fix 'ret not initialized' warning
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20170516074256.24731-1-kraxel@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-16 15:34:18 +01:00
Paolo Bonzini 456af34629 curl: avoid recursive locking of BDRVCURLState mutex
The curl driver has a ugly hack where, if it cannot find an empty CURLState,
it just uses aio_poll to wait for one to be empty.  This is probably
buggy when used together with dataplane, and the simplest way to fix it
is to use coroutines instead.

A more immediate effect of the bug however is that it can cause a
recursive call to curl_readv_bh_cb and recursively taking the
BDRVCURLState mutex.  This causes a deadlock.

The fix is to unlock the mutex around aio_poll, but for cleanliness we
should also take the mutex around all calls to curl_init_state, even if
reaching the unlock/lock pair is impossible.  The same is true for
curl_clean_state.

Reported-by: Kun Wei <kuwei@redhat.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20170515100059.15795-4-pbonzini@redhat.com
Cc: qemu-stable@nongnu.org
Cc: Jeff Cody <jcody@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-05-16 10:34:17 -04:00
Paolo Bonzini 34db05e7ff curl: never invoke callbacks with s->mutex held
All curl callbacks go through curl_multi_do, and hence are called with
s->mutex held.  Note that with comments, and make curl_read_cb drop the
lock before invoking the callback.

Likewise for curl_find_buf, where the callback can be invoked by the
caller.

Cc: qemu-stable@nongnu.org
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170515100059.15795-3-pbonzini@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-05-16 10:34:17 -04:00
Paolo Bonzini 675a775633 curl: strengthen assertion in curl_clean_state
curl_clean_state should only be called after all AIOCBs have been
completed.  This is not so obvious for the call from curl_detach_aio_context,
so assert that.

Cc: qemu-stable@nongnu.org
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170515100059.15795-2-pbonzini@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-05-16 10:34:03 -04:00
Peter Krempa 327c8ebd70 block: curl: Allow passing cookies via QCryptoSecret
Since cookies can contain sensitive data (session ID, etc ...) it is
desired to hide them from the prying eyes of users. Add a possibility to
pass them via the secret infrastructure.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1447413

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: f4a22cdebdd0bca6a13a43a2a6deead7f2ec4bb3.1493906281.git.pkrempa@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-05-16 10:31:08 -04:00
Stefan Hajnoczi b54933eed5 -----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJZFciYAAoJEJykq7OBq3PINCgH/1IXhfLR6AtPKAa0b2oABWEW
 lD90HiHIxeqYIDiVnppD6CEnIZ6VBmImSyRr6V9Fk7OTkCucVRLOXFZF63d7ETDo
 PBF8+FJ8YgdClJ80f3HkmPu6vzvnRzcs5mUBrrKhZPAy2adxNmIpfir/78p+Aomd
 HxCYl3jF6aVUtFEtzTwAmDjXU2NUNPQABAAKiGT5jwSUcksxogg7LSIM/ZlFzNhc
 ZMI9h6doRP0v3kLyh429qVEgXpCh7VaD9Swv6H3/sSHJdCBP5MILrTADrK+45tmL
 nIxHruOyt63Jgd8acGDdvYz1oHjPXb2Cn/87SiONsqwoyFJmkqbVXvOg4gqPuJ8=
 =yXgr
 -----END PGP SIGNATURE-----

Merge tag 'block-pull-request' into staging

# gpg: Signature made Fri 12 May 2017 10:37:12 AM EDT
# gpg:                using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* tag 'block-pull-request':
  aio: add missing aio_notify() to aio_enable_external()
  block: Simplify BDRV_BLOCK_RAW recursion
  coroutine: remove GThread implementation

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-12 10:39:23 -04:00
Stefan Hajnoczi 3753e255da Block layer patches
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJZFHXJAAoJEH8JsnLIjy/WvdsP/3mTGDZJO4M0HXWBWYQBwdGE
 EuMyGxn8Y0IUIBNx8sogAY9yQ/aQUh1b3bLWg7MAvZvQpi2S448Ak/2agaRFTWXJ
 vVv1nP0uigfgElSZZKTyj2SyhghrGkBou98hyYOd05YAApXmxoBKvTgC1di0Ec5K
 86Mc91Cr3udh4Q5w5Ss2IxiN+fNoHhHdOOm8gkWM4iQl6fmDJ8lBBxoiU+K2uXX+
 /flAVRgogFglKb0EAq8eSdy4qFRyp1G43yzOJKNiM3BuYeXpipnyO5mY3ynrEv5Y
 3213hJ/rriY15xHcKtkPxEqd32iX2qCRGFXWMMiq4r527JG/B40SEJQaxfHu33U2
 Et3+0MxoDHFseXE2sSjs8c0DKQiU5/hf7RkzqOMPMNluIX4ykrSyOdKzj4EFEowr
 GGLobm4Hr/rv8YDhRhllyKZ6Xzrjihy5WH5T9+HVfOlf35AxjJGZYqfCouhCiP15
 0Nxz3qWfDH/U4zvhY+PIfQ51kvGl02oMZF6uu48kvzwdbIKgnImKg5HbG5R85mYQ
 BFqCjaEWjSUWxrqbZ/uRBUqavCLJm9C3mEWLCzGBmXNNs4Th8kXftuo1br+QHtr5
 aT7xoxUJs/nmBoCKYhlCSrxjqCa0clenvNeiO4Q44UxWmOFQi9SM5myeSuCIaa+S
 qnYHZLZ74gVZpQaoPkjE
 =SA0O
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'kwolf/tags/for-upstream' into staging

Block layer patches

# gpg: Signature made Thu 11 May 2017 10:31:37 AM EDT
# gpg:                using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* kwolf/tags/for-upstream: (58 commits)
  MAINTAINERS: Add qemu-progress to the block layer
  qcow2: Discard/zero clusters by byte count
  qcow2: Assert that cluster operations are aligned
  qcow2: Optimize write zero of unaligned tail cluster
  iotests: Add test 179 to cover write zeroes with unmap
  iotests: Improve _filter_qemu_img_map
  qcow2: Optimize zero_single_l2() to minimize L2 churn
  qcow2: Make distinction between zero cluster types obvious
  qcow2: Name typedef for cluster type
  qcow2: Correctly report status of preallocated zero clusters
  block: Update comments on BDRV_BLOCK_* meanings
  qcow2: Use consistent switch indentation
  qcow2: Nicer variable names in qcow2_update_snapshot_refcount()
  tests: Add coverage for recent block geometry fixes
  blkdebug: Add ability to override unmap geometries
  blkdebug: Simplify override logic
  blkdebug: Add pass-through write_zero and discard support
  blkdebug: Refactor error injection
  blkdebug: Sanity check block layer guarantees
  qemu-io: Switch 'map' output to byte-based reporting
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-12 10:39:08 -04:00
Eric Blake ee29d6adef block: Simplify BDRV_BLOCK_RAW recursion
Since we are already in coroutine context during the body of
bdrv_co_get_block_status(), we can shave off a few layers of
wrappers when recursing to query the protocol when a format driver
returned BDRV_BLOCK_RAW.

Note that we are already using the correct recursion later on in
the same function, when probing whether the protocol layer is sparse
in order to find out if we can add BDRV_BLOCK_ZERO to an existing
BDRV_BLOCK_DATA|BDRV_BLOCK_OFFSET_VALID.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20170504173745.27414-1-eblake@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-12 10:36:46 -04:00
Eric Blake d2cb36af2b qcow2: Discard/zero clusters by byte count
Passing a byte offset, but sector count, when we ultimately
want to operate on cluster granularity, is madness.  Clean up
the external interfaces to take both offset and count as bytes,
while still keeping the assertion added previously that the
caller must align the values to a cluster.  Then rename things
to make sure backports don't get confused by changed units:
instead of qcow2_discard_clusters() and qcow2_zero_clusters(),
we now have qcow2_cluster_discard() and qcow2_cluster_zeroize().

The internal functions still operate on clusters at a time, and
return an int for number of cleared clusters; but on an image
with 2M clusters, a single L2 table holds 256k entries that each
represent a 2M cluster, totalling well over INT_MAX bytes if we
ever had a request for that many bytes at once.  All our callers
currently limit themselves to 32-bit bytes (and therefore fewer
clusters), but by making this function 64-bit clean, we have one
less place to clean up if we later improve the block layer to
support 64-bit bytes through all operations (with the block layer
auto-fragmenting on behalf of more-limited drivers), rather than
the current state where some interfaces are artificially limited
to INT_MAX at a time.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170507000552.20847-13-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-11 14:28:07 +02:00
Eric Blake f10ee139ad qcow2: Assert that cluster operations are aligned
We already audited (in commit 0c1bd469) that qcow2_discard_clusters()
is only passed cluster-aligned start values; but we can further
tighten the assertion that the only unaligned end value is at EOF.

Recent commits have taken advantage of an unaligned tail cluster,
for both discard and write zeroes.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170507000552.20847-12-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-11 14:28:07 +02:00
Eric Blake fbaa6bb3d3 qcow2: Optimize write zero of unaligned tail cluster
We've already improved discards to operate efficiently on the tail
of an unaligned qcow2 image; it's time to make a similar improvement
to write zeroes.  The special case is only valid at the tail
cluster of a file, where we must recognize that any sectors beyond
the image end would implicitly read as zero, and therefore should
not penalize our logic for widening a partial cluster into writing
the whole cluster as zero.

However, note that for now, the special case of end-of-file is only
recognized if there is no backing file, or if the backing file has
the same length; that's because when the backing file is shorter
than the active layer, we don't have code in place to recognize
that reads of a sector unallocated at the top and beyond the backing
end-of-file are implicitly zero.  It's not much of a real loss,
because most people don't use images that aren't cluster-aligned,
or where the active layer is a different size than the backing
layer (especially where the difference falls within a single cluster).

Update test 154 to cover the new scenarios, using two images of
intentionally differing length.

While at it, fix the test to gracefully skip when run as
./check -qcow2 -o compat=0.10 154
since the older format lacks zero clusters already required earlier
in the test.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170507000552.20847-11-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-11 14:28:07 +02:00
Eric Blake 06cc5e2b2d qcow2: Optimize zero_single_l2() to minimize L2 churn
Similar to discard_single_l2(), we should try to avoid dirtying
the L2 cache when the cluster we are changing already has the
right characteristics.

Note that by the time we get to zero_single_l2(), BDRV_REQ_MAY_UNMAP
is a requirement to unallocate a cluster (this is because the block
layer clears that flag if discard.* flags during open requested that
we never punch holes - see the conversation around commit 170f4b2e,
https://lists.gnu.org/archive/html/qemu-devel/2016-09/msg07306.html).
Therefore, this patch can only reuse a zero cluster as-is if either
unmapping is not requested, or if the zero cluster was not associated
with an allocation.

Technically, there are some cases where an unallocated cluster
already reads as all zeroes (namely, when there is no backing file
[easy: check bs->backing], or when the backing file also reads as
zeroes [harder: we can't check bdrv_get_block_status since we are
already holding the lock]), where the guest would not immediately see
a difference if we left that cluster unallocated.  But if the user
did not request unmapping, leaving an unallocated cluster is wrong;
and even if the user DID request unmapping, keeping a cluster
unallocated risks a subtle semantic change of guest-visible contents
if a backing file is later added, and it is not worth auditing
whether all internal uses such as mirror properly avoid an unmap
request.  Thus, this patch is intentionally limited to just clusters
that are already marked as zero.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170507000552.20847-8-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-11 14:28:07 +02:00
Eric Blake fdfab37dfe qcow2: Make distinction between zero cluster types obvious
Treat plain zero clusters differently from allocated ones, so that
we can simplify the logic of checking whether an offset is present.
Do this by splitting QCOW2_CLUSTER_ZERO into two new enums,
QCOW2_CLUSTER_ZERO_PLAIN and QCOW2_CLUSTER_ZERO_ALLOC.

I tried to arrange the enum so that we could use
'ret <= QCOW2_CLUSTER_ZERO_PLAIN' for all unallocated types, and
'ret >= QCOW2_CLUSTER_ZERO_ALLOC' for allocated types, although
I didn't actually end up taking advantage of the layout.

In many cases, this leads to simpler code, by properly combining
cases (sometimes, both zero types pair together, other times,
plain zero is more like unallocated while allocated zero is more
like normal).

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-id: 20170507000552.20847-7-eblake@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-11 14:28:07 +02:00
Eric Blake 3ef9521893 qcow2: Name typedef for cluster type
Although it doesn't add all that much type safety (this is C, after
all), it does add a bit of legibility to use the name QCow2ClusterType
instead of a plain int.

In particular, qcow2_get_cluster_offset() has an overloaded return
type; a QCow2ClusterType on success, and -errno on failure; keeping
the cluster type in a separate variable makes it slightly easier for
the next patch to make further computations based on the type.

Suggested-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-id: 20170507000552.20847-6-eblake@redhat.com
[mreitz: Use the new type in two more places (one of them pulled from
         the next patch)]
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-11 14:28:06 +02:00
Eric Blake 4341df8a83 qcow2: Correctly report status of preallocated zero clusters
We were throwing away the preallocation information associated with
zero clusters.  But we should be matching the well-defined semantics
in bdrv_get_block_status(), where (BDRV_BLOCK_ZERO |
BDRV_BLOCK_OFFSET_VALID) informs the user which offset is reserved,
while still reminding the user that reading from that offset is
likely to read garbage.

count_contiguous_clusters_by_type() is now used only for unallocated
cluster runs, hence it gets renamed and tightened.

Making this change lets us see which portions of an image are zero
but preallocated, when using qemu-img map --output=json.  The
--output=human side intentionally ignores all zero clusters, whether
or not they are preallocated.

The fact that there is no change to qemu-iotests './check -qcow2'
merely means that we aren't yet testing this aspect of qemu-img;
a later patch will add a test.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170507000552.20847-5-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-11 14:28:06 +02:00
Eric Blake bbd995d830 qcow2: Use consistent switch indentation
Fix a couple of inconsistent indentations, before an upcoming
patch further tweaks the switch statements.
(best viewed with 'git diff -b').

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170507000552.20847-3-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-11 14:28:06 +02:00
Eric Blake b32cbae111 qcow2: Nicer variable names in qcow2_update_snapshot_refcount()
In order to keep checkpatch happy when the next patch changes
indentation, we first have to shorten some long lines.  The easiest
approach is to use a new variable in place of
'offset & L2E_OFFSET_MASK', except that 'offset' is the best name
for that variable.  Change '[old_]offset' to '[old_]entry' to
make room.

While touching things, also fix checkpatch warnings about unusual
'for' statements.

Suggested by Max Reitz <mreitz@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-id: 20170507000552.20847-2-eblake@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-11 14:28:06 +02:00
Eric Blake 430b26a82d blkdebug: Add ability to override unmap geometries
Make it easier to simulate various unusual hardware setups (for
example, recent commits 3482b9b and b8d0a98 affect the Dell
Equallogic iSCSI with its 15M preferred and maximum unmap and
write zero sizing, or b2f95fe deals with the Linux loopback
block device having a max_transfer of 64k), by allowing blkdebug
to wrap any other device with further restrictions on various
alignments.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170429191419.30051-9-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-11 14:28:06 +02:00
Eric Blake 3dc834f879 blkdebug: Simplify override logic
Rather than store into a local variable, then copy to the struct
if the value is valid, then reporting errors otherwise, it is
simpler to just store into the struct and report errors if the
value is invalid.  This however requires that the struct store
a 64-bit number, rather than a narrower type.  Likewise, setting
a sane errno value in ret prior to the sequence of parsing and
jumping to out: on error makes it easier for the next patch to
add a chain of similar checks.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-id: 20170429191419.30051-8-eblake@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-11 14:28:06 +02:00
Eric Blake 63188c2450 blkdebug: Add pass-through write_zero and discard support
In order to test the effects of artificial geometry constraints
on operations like write zero or discard, we first need blkdebug
to manage these actions.  It also allows us to inject errors on
those operations, just like we can for read/write/flush.

We can also test the contract promised by the block layer; namely,
if a device has specified limits on alignment or maximum size,
then those limits must be obeyed (for now, the blkdebug driver
merely inherits limits from whatever it is wrapping, but the next
patch will further enhance it to allow specific limit overrides).

This patch intentionally refuses to service requests smaller than
the requested alignments; this is because an upcoming patch adds
a qemu-iotest to prove that the block layer is correctly handling
fragmentation, but the test only works if there is a way to tell
the difference at artificial alignment boundaries when blkdebug is
using a larger-than-default alignment.  If we let the blkdebug
layer always defer to the underlying layer, which potentially has
a smaller granularity, the iotest will be thwarted.

Tested by setting up an NBD server with export 'foo', then invoking:
$ ./qemu-io
qemu-io> open -o driver=blkdebug blkdebug::nbd://localhost:10809/foo
qemu-io> d 0 15M
qemu-io> w -z 0 15M

Pre-patch, the server never sees the discard (it was silently
eaten by the block layer); post-patch it is passed across the
wire.  Likewise, pre-patch the write is always passed with
NBD_WRITE (with 15M of zeroes on the wire), while post-patch
it can utilize NBD_WRITE_ZEROES (for less traffic).

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170429191419.30051-7-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-11 14:28:06 +02:00
Eric Blake d157ed5f72 blkdebug: Refactor error injection
Rather than repeat the logic at each caller of checking if a Rule
exists that warrants an error injection, fold that logic into
inject_error(); and rename it to rule_check() for legibility.
This will help the next patch, which adds two more callers that
need to check rules for the potential of injecting errors.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170429191419.30051-6-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-11 14:28:06 +02:00
Eric Blake e0ef439588 blkdebug: Sanity check block layer guarantees
Commits 04ed95f4 and 1a62d0ac updated the block layer to auto-fragment
any I/O to fit within device boundaries. Additionally, when using a
minimum alignment of 4k, we want to ensure the block layer does proper
read-modify-write rather than requesting I/O on a slice of a sector.
Let's enforce that the contract is obeyed when using blkdebug.  For
now, blkdebug only allows alignment overrides, and just inherits other
limits from whatever device it is wrapping, but a future patch will
further enhance things.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170429191419.30051-5-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-05-11 14:28:06 +02:00
Kevin Wolf 22d5cd82e9 file-posix: Remove .bdrv_inactivate/invalidate_cache
Now that the block layer takes care to request a lot less permissions
for inactive nodes, the special-casing in file-posix isn't necessary any
more.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-05-11 12:08:24 +02:00
Kevin Wolf cfa1a5723f block: Drop permissions when migration completes
With image locking, permissions affect other qemu processes as well. We
want to be sure that the destination can run, so let's drop permissions
on the source when migration completes.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-05-11 12:08:24 +02:00
Kevin Wolf 4417ab7adf block: New BdrvChildRole.activate() for blk_resume_after_migration()
Instead of manually calling blk_resume_after_migration() in migration
code after doing bdrv_invalidate_cache_all(), integrate the BlockBackend
activation with cache invalidation into a single function. This is
achieved with a new callback in BdrvChildRole that is called by
bdrv_invalidate_cache_all().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-05-11 12:08:24 +02:00
Max Reitz 293073a56c qcow2: Discard preallocated zero clusters
In discard_single_l2(), we completely discard normal clusters instead of
simply turning them into preallocated zero clusters. That means we
should probably do the same with such preallocated zero clusters:
Discard them instead of keeping them allocated.

Reported-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-11 12:08:24 +02:00
Max Reitz 564a6b6938 qcow2: Reuse preallocated zero clusters
Instead of just freeing preallocated zero clusters and completely
allocating them from scratch, reuse them.

We cannot do this in handle_copied(), however, since this is a COW
operation. Therefore, we have to add the new logic to handle_alloc() and
simply return the existing offset if it exists. The only catch is that
we have to convince qcow2_alloc_cluster_link_l2() not to free the old
clusters (because we have reused them).

Reported-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-11 12:08:24 +02:00
Max Reitz 92413c16be qcow2: Fix preallocation size formula
When calculating the number of reftable entries, we should actually use
the number of refblocks and not (wrongly[1]) re-calculate it.

[1] "Wrongly" means: Dividing the number of clusters by the number of
    entries per refblock and rounding down instead of up.

Reported-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-11 12:08:24 +02:00
Fam Zheng 244a566810 file-posix: Add image locking to perm operations
This extends the permission bits of op blocker API to external using
Linux OFD locks.

Each permission in @perm and @shared_perm is represented by a locked
byte in the image file.  Requesting a permission in @perm is translated
to a shared lock of the corresponding byte; rejecting to share the same
permission is translated to a shared lock of a separate byte. With that,
we use 2x number of bytes of distinct permission types.

virtlockd in libvirt locks the first byte, so we do locking from a
higher offset.

Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-11 11:15:32 +02:00
Fam Zheng 1c3a555c35 file-win32: Error out if locking=on
We share the same set of QAPI options with file-posix, but locking is
not supported here. So error out if it is specified as 'on' for now.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-11 11:08:41 +02:00
Fam Zheng 16b48d5d66 file-posix: Add 'locking' option
Making this option available even before implementing it will let
converting tests easier: in coming patches they can specify the option
already when necessary, before we actually write code to lock the
images.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-11 11:08:40 +02:00
Stefan Hajnoczi f465706e59 trivial patches for 2017-05-10
-----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEe3O61ovnosKJMUsicBtPaxppPlkFAlkSvwIPHG1qdEB0bHMu
 bXNrLnJ1AAoJEHAbT2saaT5Zys4IAMZLWicv1c7O3m1ajmmg7iGfRbsajcx9FSBi
 NxdrqG3zgV10gz8/R7goMYGkeFs8MAoDfagbBkXgwFgA31M+ecOe93XyoOQLpe9/
 43fx2u8exVdruIb60F5yDEd51RLwK2C4Iz7SVNRoVWMqDcMOCuC+WBog+AbTB0V+
 19RjhKStMyXMXPYVO0bLhQIcH+ixFLUljbpwDvz5FKor5NqGG+FzHjmwYciiTbr3
 o7Z3OIMWT7rDr9V5/553miiNP9ufG3fJreMyXDrTkFRVmDZaqRBp+tvdrYcb77ed
 /DDxC5vafgCRzwsrmCIsIQXV0janFGDQiqbR+hzBMBG1RTRoBiM=
 =AAfU
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'mjt/tags/trivial-patches-fetch' into staging

trivial patches for 2017-05-10

# gpg: Signature made Wed 10 May 2017 03:19:30 AM EDT
# gpg:                using RSA key 0x701B4F6B1A693E59
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>"
# gpg:                 aka "Michael Tokarev <mjt@corpit.ru>"
# gpg:                 aka "Michael Tokarev <mjt@debian.org>"
# Primary key fingerprint: 6EE1 95D1 886E 8FFB 810D  4324 457C E0A0 8044 65C5
#      Subkey fingerprint: 7B73 BAD6 8BE7 A2C2 8931  4B22 701B 4F6B 1A69 3E59

* mjt/tags/trivial-patches-fetch: (23 commits)
  tests: Remove redundant assignment
  MAINTAINERS: Update paths for AioContext implementation
  MAINTAINERS: Update paths for main loop
  jazz_led: fix bad snprintf
  tests: Ignore another built executable (test-hmp)
  scripts: Switch to more portable Perl shebang
  scripts/qemu-binfmt-conf.sh: Fix shell portability issue
  virtfs: allow a device id to be specified in the -virtfs option
  hw/core/generic-loader: Fix crash when running without CPU
  virtio-blk: Remove useless condition around g_free()
  qemu-doc: Fix broken URLs of amnhltm.zip and dosidle210.zip
  use _Static_assert in QEMU_BUILD_BUG_ON
  channel-file: fix wrong parameter comments
  block: Make 'replication_state' an enum
  util: Use g_malloc/g_free in envlist.c
  qga: fix compiler warnings (clang 5)
  device_tree: fix compiler warnings (clang 5)
  usb-ccid: make ccid_write_data_block() cope with null buffers
  tests: Ignore more test executables
  Add 'none' as type for drive's if option
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-10 12:31:19 -04:00
Stefan Hajnoczi 1effe6ad5e Merge qcrypto 2017/05/09 v1
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJZEceTAAoJEL6G67QVEE/fNHUQAJ4iSzs2SsHSk/4TXensnC8s
 ySRNRrn13wx7NmUdus8zYeGL2nSEs4D4z5uu7xLJ+5TlBHKhWoekO+twnIj3y82P
 VbjDWiB26JnY/nkhZE1UzqE5Ix3iDBKuTJybGeql0059RGu34Itof76rNRq9Tyhz
 xeoNXd/6v5plk093B9ZJjXTcCUpDHxdhvpE3C7Dc1D6Gihx4yPn6VbM68ROCqFc/
 /1exUFrhGdQvo3GpeGztU6nLruJqT5AtHNohxtNMBrWOQDinSwJl75BSh8UmhzuG
 pBFqWMMHaEGq2DhHuFqmTJQxvfz/NKH7NaWRwv5aCZxXB1qd7HweGHgc+XRSqA88
 /2O3+jwNb4UC1BZjvAOWa4t87FhRxnqLs2byZtGt3hYLUJ7vDuZ+OTGwSezY9xy9
 B9ruwY0RF090wnLgx52xOE9QoLNKX6KPZic2gK74QKkMlpiUZMFuAU6KQhhMfQ4H
 IsxhsTsPeQJ672FtprnvNq6yWQYDFsO8vSl00GzJ9kIcwv5Kak/bTSSN1yLRkFzr
 Dc+ymeAKqqBoTWl5qN9Rn/yagjInxPsdnGjVP3TktIaBzW7D2mhnP6vckoAw1gJF
 2mOh4K5CiWombX1e4sxBzzrUb9x6FmkK0GqPpExlapm2rZfGfbTWUha698eNRsoY
 eXHSXmLNuZ6R7nyo5wBH
 =R5f2
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'danpb/tags/pull-qcrypto-2017-05-09-1' into staging

Merge qcrypto 2017/05/09 v1

# gpg: Signature made Tue 09 May 2017 09:43:47 AM EDT
# gpg:                using RSA key 0xBE86EBB415104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>"
# gpg:                 aka "Daniel P. Berrange <berrange@redhat.com>"
# Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E  8E3F BE86 EBB4 1510 4FDF

* danpb/tags/pull-qcrypto-2017-05-09-1:
  crypto: qcrypto_random_bytes() now works on windows w/o any other crypto libs
  crypto: move 'opaque' parameter to (nearly) the end of parameter list
  List SASL config file under the cryptography maintainer's realm
  Default to GSSAPI (Kerberos) instead of DIGEST-MD5 for SASL

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-10 11:22:13 -04:00
Daniel P. Berrange e4a3507e86 crypto: move 'opaque' parameter to (nearly) the end of parameter list
Previous commit moved 'opaque' to be the 2nd parameter in the list:

  commit 375092332e
  Author: Fam Zheng <famz@redhat.com>
  Date:   Fri Apr 21 20:27:02 2017 +0800

    crypto: Make errp the last parameter of functions

    Move opaque to 2nd instead of the 2nd to last, so that compilers help
    check with the conversion.

this puts it back to the 2nd to last position.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2017-05-09 14:41:47 +01:00
Markus Armbruster bd269ebc82 sockets: Limit SocketAddressLegacy to external interfaces
SocketAddressLegacy is a simple union, and simple unions are awkward:
they have their variant members wrapped in a "data" object on the
wire, and require additional indirections in C.  SocketAddress is the
equivalent flat union.  Convert all users of SocketAddressLegacy to
SocketAddress, except for existing external interfaces.

See also commit fce5d53..9445673 and 85a82e8..c5f1ae3.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1493192202-3184-7-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[Minor editing accident fixed, commit message and a comment tweaked]

Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-05-09 09:14:40 +02:00
Markus Armbruster 62cf396b5d sockets: Rename SocketAddressFlat to SocketAddress
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1493192202-3184-6-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2017-05-09 09:14:40 +02:00
Markus Armbruster dfd100f242 sockets: Rename SocketAddress to SocketAddressLegacy
The next commit will rename SocketAddressFlat to SocketAddress, and
the commit after that will replace most uses of SocketAddressLegacy by
SocketAddress, replacing most of this commit's renames right back.

Note that checkpatch emits a few "line over 80 characters" warnings.
The long lines are all temporary; the SocketAddressLegacy replacement
will shorten them again.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1493192202-3184-5-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-05-09 09:14:40 +02:00
Markus Armbruster 0785bd7a7c sockets: Prepare inet_parse() for flattened SocketAddress
I'm going to flatten SocketAddress: rename SocketAddress to
SocketAddressLegacy, SocketAddressFlat to SocketAddress, eliminate
SocketAddressLegacy except in external interfaces.

inet_parse() returns a newly allocated InetSocketAddress.  Lift the
allocation from inet_parse() into its caller socket_parse() to prepare
for flattening SocketAddress.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1493192202-3184-3-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[Straightforward rebase]
2017-05-09 09:14:40 +02:00
Eric Blake 46f5ac205a qobject: Use simpler QDict/QList scalar insertion macros
We now have macros in place to make it less verbose to add a scalar
to QDict and QList, so use them.

Patch created mechanically via:
  spatch --sp-file scripts/coccinelle/qobject.cocci \
    --macro-file scripts/cocci-macro-file.h --dir . --in-place
then touched up manually to fix a couple of '?:' back to original
spacing, as well as avoiding a long line in monitor.c.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20170427215821.19397-7-eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-05-09 09:13:51 +02:00
Eric Blake de6e7951fe qobject: Drop useless QObject casts
We have macros in place to make it less verbose to add a subtype
of QObject to both QDict and QList. While we have made cleanups
like this in the past (see commit fcfcd8ffc, for example), having
it be automated by Coccinelle makes it easier to maintain.

Patch created mechanically via:
  spatch --sp-file scripts/coccinelle/qobject.cocci \
    --macro-file scripts/cocci-macro-file.h --dir . --in-place
then I verified that no manual touchups were required.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20170427215821.19397-5-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-05-08 20:32:14 +02:00
Fam Zheng 3c76c606da block: Make 'replication_state' an enum
BDRVReplicationState.replication_state is a name with a bit of
duplication, plus it could be an enum like BDRVReplicationState.mode,
which is more readable and also more straightforward in a debugger.

Rename it, and improve the type while at it.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-05-07 09:57:51 +03:00
Eric Blake 048c5fd1bf qcow2: Allow discard of final unaligned cluster
As mentioned in commit 0c1bd46, we ignored requests to
discard the trailing cluster of an unaligned image.  While
discard is an advisory operation from the guest standpoint,
(and we are therefore free to ignore any request), our
qcow2 implementation exploits the fact that a discarded
cluster reads back as 0.  As long as we discard on cluster
boundaries, we are fine; but that means we could observe
non-zero data leaked at the tail of an unaligned image.

Enhance iotest 66 to cover this case, and fix the implementation
to honor a discard request on the final partial cluster.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-id: 20170407013709.18440-1-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-28 16:02:03 +02:00
Max Reitz f59adb3256 block: Add .bdrv_truncate() error messages
Add missing error messages for the block driver implementations of
.bdrv_truncate(); drop the generic one from block.c's bdrv_truncate().

Since one of these changes touches a mis-indented block in
block/file-posix.c, this patch fixes that coding style issue along the
way.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170328205129.15138-5-mreitz@redhat.com
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-28 16:02:03 +02:00
Max Reitz 4bff28b81a block: Add errp to BD.bdrv_truncate()
Add an Error parameter to the block drivers' bdrv_truncate() interface.
If a block driver does not set this in case of an error, the generic
bdrv_truncate() implementation will do so.

Where it is obvious, this patch also makes some block drivers set this
value.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170328205129.15138-4-mreitz@redhat.com
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-28 16:02:03 +02:00
Max Reitz ed3d2ec98a block: Add errp to b{lk,drv}_truncate()
For one thing, this allows us to drop the error message generation from
qemu-img.c and blockdev.c and instead have it unified in
bdrv_truncate().

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170328205129.15138-3-mreitz@redhat.com
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-28 16:02:02 +02:00
Max Reitz 55b9392b98 block/vhdx: Make vhdx_create() always set errp
This patch makes vhdx_create() always set errp in case of an error. It
also adds errp parameters to vhdx_create_bat() and
vhdx_create_new_region_table() so we can pass on the error object
generated by blk_truncate() as of a future commit.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 20170328205129.15138-2-mreitz@redhat.com
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-28 16:02:02 +02:00
Denis V. Lunev f13ce1be35 block: fix alignment calculations in bdrv_co_do_zero_pwritev
tail_padding_bytes is calculated wrong. F.e. for
    offset = 0
    bytes = 2048
    align = 512
we will have tail_padding_bytes = 512 which is definitely wrong. The patch
fixes that arithmetics.

Fortunately this problem is harmless, we will have 1 extra allocation and
free thus there is no need to put this into stable. The problem is here
from the very beginning.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-27 16:24:01 +02:00
Max Reitz de234897b6 block: Do not unref bs->file on error in BD's open
The block layer takes care of removing the bs->file child if the block
driver's bdrv_open()/bdrv_file_open() implementation fails. The block
driver therefore does not need to do so, and indeed should not unless it
sets bs->file to NULL afterwards -- because if this is not done, the
bdrv_unref_child() in bdrv_open_inherit() will dereference the freed
memory block at bs->file afterwards, which is not good.

We can now decide whether to add a "bs->file = NULL;" after each of the
offending bdrv_unref_child() invocations, or just drop them altogether.
The latter is simpler, so let's do that.

Cc: qemu-stable <qemu-stable@nongnu.org>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-27 16:12:13 +02:00
Fam Zheng e914404efb block: Remove NULL check in bdrv_co_flush
Reported by Coverity. We already use bs in bdrv_inc_in_flight before
checking for NULL. It is unnecessary as all callers pass non-NULL bs, so
drop it.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-27 15:39:50 +02:00
Max Reitz 362b3786eb Revert "block/io: Comment out permission assertions"
This reverts commit e3e0003a8f.

This commit was necessary for the 2.9 release because we were unable to
fix the underlying issue(s) in time. However, we will be for 2.10.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-27 15:39:49 +02:00
Kevin Wolf 0d5e0bb2f7 file-win32: Remove unnecessary include
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-27 15:39:49 +02:00
Kevin Wolf ad02b7af0c file-posix: Remove unnecessary includes
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-27 15:39:49 +02:00
Krzysztof Kozlowski 0731a50feb block: Constify data passed by pointer to blk_name
blk_name() is not modifying data passed to it through pointer and it
returns also a pointer to const so the argument can be made const for
code safeness.

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-27 15:39:49 +02:00
Jeff Cody 56e7cf8df0 block/rbd: Add support for reopen()
This adds support for reopen in rbd, for changing between r/w and r/o.

Note, that this is only a flag change, but we will block a change from
r/o to r/w if we are using an RBD internal snapshot.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: d4e87539167ec6527d44c97b164eabcccf96e4f3.1491597120.git.jcody@redhat.com
2017-04-24 15:09:33 -04:00
Jeff Cody 80b61a27c6 block/rbd - update variable names to more apt names
Update 'clientname' to be 'user', which tracks better with both
the QAPI and rados variable naming.

Update 'name' to be 'image_name', as it indicates the rbd image.
Naming it 'image' would have been ideal, but we are using that for
the rados_image_t value returned by rbd_open().

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: b7ec1fb2e1cf36f9b6911631447a5b0422590b7d.1491597120.git.jcody@redhat.com
2017-04-24 15:09:33 -04:00
Jeff Cody e2b8247a32 block: do not set BDS read_only if copy_on_read enabled
A few block drivers will set the BDS read_only flag from their
.bdrv_open() function.  This means the bs->read_only flag could
be set after we enable copy_on_read, as the BDRV_O_COPY_ON_READ
flag check occurs prior to the call to bdrv->bdrv_open().

This adds an error return to bdrv_set_read_only(), and an error will be
return if we try to set the BDS to read_only while copy_on_read is
enabled.

This patch also changes the behavior of vvfat.  Before, vvfat could
override the drive 'readonly' flag with its own, internal 'rw' flag.

For instance, this -drive parameter would result in a writable image:

"-drive format=vvfat,dir=/tmp/vvfat,rw,if=virtio,readonly=on"

This is not correct.  Now, attempting to use the above -drive parameter
will result in an error (i.e., 'rw' is incompatible with 'readonly=on').

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 0c5b4c1cc2c651471b131f21376dfd5ea24d2196.1491597120.git.jcody@redhat.com
2017-04-24 15:09:33 -04:00
Jeff Cody fe5241bfe3 block: add bdrv_set_read_only() helper function
We have a helper wrapper for checking for the BDS read_only flag,
add a helper wrapper to set the read_only flag as well.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 9b18972d05f5fa2ac16c014f0af98d680553048d.1491597120.git.jcody@redhat.com
2017-04-24 15:09:33 -04:00
Ashish Mittal da92c3ff60 block/vxhs.c: Add support for a new block device type called "vxhs"
Source code for the qnio library that this code loads can be downloaded from:
https://github.com/VeritasHyperScale/libqnio.git

Sample command line using JSON syntax:
./x86_64-softmmu/qemu-system-x86_64 -name instance-00000008 -S -vnc 0.0.0.0:0
-k en-us -vga cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5
-msg timestamp=on
'json:{"driver":"vxhs","vdisk-id":"c3e9095a-a5ee-4dce-afeb-2a59fb387410",
"server":{"host":"172.172.17.4","port":"9999"}}'

Sample command line using URI syntax:
qemu-img convert -f raw -O raw -n
/var/lib/nova/instances/_base/0c5eacd5ebea5ed914b6a3e7b18f1ce734c386ad
vxhs://192.168.0.1:9999/c6718f6b-0401-441d-a8c3-1f0064d75ee0

Sample command line using TLS credentials (run in secure mode):
./qemu-io --object
tls-creds-x509,id=tls0,dir=/etc/pki/qemu/vxhs,endpoint=client -c 'read
-v 66000 2.5k' 'json:{"server.host": "127.0.0.1", "server.port": "9999",
"vdisk-id": "/test.raw", "driver": "vxhs", "tls-creds":"tls0"}'

[Jeff: Modified trace-events with the correct string formatting]

Signed-off-by: Ashish Mittal <Ashish.Mittal@veritas.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Message-id: 1491277689-24949-2-git-send-email-Ashish.Mittal@veritas.com
2017-04-24 15:08:42 -04:00
Fam Zheng cb8d4bf677 nfs: Make errp the last parameter of nfs_client_open
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170421122710.15373-10-famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24 09:13:44 +02:00
Fam Zheng 78bbd910bb block: Make errp the last parameter of commit_active_start
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170421122710.15373-9-famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24 09:13:44 +02:00
Fam Zheng 51ccfa2dbf mirror: Make errp the last parameter of mirror_start_job
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170421122710.15373-8-famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24 09:13:44 +02:00
Fam Zheng 375092332e crypto: Make errp the last parameter of functions
Move opaque to 2nd instead of the 2nd to last, so that compilers help
check with the conversion.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170421122710.15373-7-famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[Commit message typo corrected]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24 09:13:22 +02:00
Fam Zheng 6dffc1f670 socket: Make errp the last parameter of inet_connect_saddr
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170421122710.15373-3-famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24 09:12:59 +02:00
Fam Zheng 226799cec5 socket: Make errp the last parameter of socket_connect
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170421122710.15373-2-famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24 09:12:59 +02:00
Fam Zheng 178bd438af block: Walk bs->children carefully in bdrv_drain_recurse
The recursive bdrv_drain_recurse may run a block job completion BH that
drops nodes. The coming changes will make that more likely and use-after-free
would happen without this patch

Stash the bs pointer and use bdrv_ref/bdrv_unref in addition to
QLIST_FOREACH_SAFE to prevent such a case from happening.

Since bdrv_unref accesses global state that is not protected by the AioContext
lock, we cannot use bdrv_ref/bdrv_unref unconditionally.  Fortunately the
protection is not needed in IOThread because only main loop can modify a graph
with the AioContext lock held.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170418143044.12187-2-famz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Tested-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-04-18 22:56:28 +08:00
Max Reitz e3e0003a8f block/io: Comment out permission assertions
In case of block migration, there may be writes to BlockBackends that do
not have the write permission taken. Before this issue is fixed (which
is not going to happen in 2.9), we therefore cannot assert that this is
the case.

Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 20170411145050.31290-1-mreitz@redhat.com
Tested-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-11 16:09:31 +01:00
Kevin Wolf 5eceb01adf sheepdog: Fix crash in co_read_response()
This fixes a regression introduced in commit 9d456654.

aio_co_wake() can only be used to reenter a coroutine that was already
previously entered, otherwise co->ctx is uninitialised and we access
garbage. Using it immediately after qemu_coroutine_create() like in
co_read_response() is wrong and causes segfaults.

Replace the call with aio_co_enter(), which gets an explicit AioContext
parameter and works even for new coroutines.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Kashyap Chamarthy <kchamart@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1491919733-21065-1-git-send-email-kwolf@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-11 16:08:29 +01:00
Fam Zheng 2ec9a782d1 iscsi: Fix iscsi_create
Since d5895fcb (iscsi: Split URL into individual options), creating
qcow2 image on an iscsi LUN fails:

    qemu-img create -f qcow2 iscsi://$SERVER/$IQN/0 1G
    qemu-img: iscsi://$SERVER/$IQN/0: Could not create image: Invalid
        argument

The problem is iscsi_open now expects that transport_name, portal and
target are already parsed into structured options by
iscsi_parse_filename, but it is not called in iscsi_create.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 20170410075451.21329-1-famz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
[mreitz: Dropped now superfluous
         qdict_put(bs_options, "filename", ...)]
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-11 15:33:00 +02:00
Eric Blake 1606e4cf8a throttle: Remove block from group on hot-unplug
When a block device that is part of a throttle group is hot-unplugged,
we forgot to remove it from the throttle group. This leaves stale
memory around, and causes an easily reproducible crash:

$ ./x86_64-softmmu/qemu-system-x86_64 -nodefaults -nographic -qmp stdio \
-device virtio-scsi-pci,bus=pci.0 -drive \
id=drive_image2,if=none,format=raw,file=file2,bps=512000,iops=100,group=foo \
-device scsi-hd,id=image2,drive=drive_image2 -drive \
id=drive_image3,if=none,format=raw,file=file3,bps=512000,iops=100,group=foo \
-device scsi-hd,id=image3,drive=drive_image3
{'execute':'qmp_capabilities'}
{'execute':'device_del','arguments':{'id':'image3'}}
{'execute':'system_reset'}

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1428810

Suggested-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-id: 20170406190847.29347-1-eblake@redhat.com
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-11 15:33:00 +02:00
Dong Jia Shi 7a9e51198c block: pass the right options for BlockDriver.bdrv_open()
raw_open() expects the caller always passing in the right actual
@options parameter. But when trying to applying snapshot on a RBD
image, bdrv_snapshot_goto() calls raw_open() (by calling the
bdrv_open callback on the BlockDriver) with a NULL @options, and
that will result in a Segmentation fault.

For the other non-raw format drivers, it also makes sense to passing
in the actual options, althought they don't trigger the problem so
far.

Let's prepare a @options by adding the "file" key-value pair to a
copy of the actual options that were given for the node (i.e.
bs->options), and pass it to the callback.

BlockDriver.bdrv_open() expects bs->file to be NULL and just
overwrites it with the result from bdrv_open_child(). That means we
should actually make sure it's NULL because otherwise the child BDS
will have a reference count that is 1 too high. So we unconditionally
invoke bdrv_unref_child() before calling BlockDriver.bdrv_open(), and
we wrap everything in bdrv_ref()/bdrv_unref() so the BDS isn't
deleted in the meantime.

Suggested-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Message-id: 20170405091909.36357-2-bjsdjshi@linux.vnet.ibm.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-11 15:33:00 +02:00
Fam Zheng 76296dff97 sheepdog: Use bdrv_coroutine_enter before BDRV_POLL_WHILE
When called from main thread, the coroutine should run in the context of
bs. Use bdrv_coroutine_enter to ensure that.

Signed-off-by: Fam Zheng <famz@redhat.com>
2017-04-11 20:07:15 +08:00
Fam Zheng 49ca625913 block: Fix bdrv_co_flush early return
bdrv_inc_in_flight and bdrv_dec_in_flight are mandatory for
BDRV_POLL_WHILE to work, even for the shortcut case where flush is
unnecessary. Move the if block to below bdrv_dec_in_flight, and BTW fix
the variable declaration position.

Signed-off-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2017-04-11 20:07:15 +08:00
Fam Zheng e92f0e1910 block: Use bdrv_coroutine_enter to start I/O coroutines
BDRV_POLL_WHILE waits for the started I/O by releasing bs's ctx then polling
the main context, which relies on the yielded coroutine continuing on bs->ctx
before notifying qemu_aio_context with bdrv_wakeup().

Thus, using qemu_coroutine_enter to start I/O is wrong because if the coroutine
is entered from main loop, co->ctx will be qemu_aio_context, as a result of the
"release, poll, acquire" loop of BDRV_POLL_WHILE, race conditions happen when
both main thread and the iothread access the same BDS:

  main loop                                iothread
-----------------------------------------------------------------------
  blockdev_snapshot
    aio_context_acquire(bs->ctx)
                                           virtio_scsi_data_plane_handle_cmd
    bdrv_drained_begin(bs->ctx)
    bdrv_flush(bs)
      bdrv_co_flush(bs)                      aio_context_acquire(bs->ctx).enter
        ...
        qemu_coroutine_yield(co)
      BDRV_POLL_WHILE()
        aio_context_release(bs->ctx)
                                             aio_context_acquire(bs->ctx).return
                                               ...
                                                 aio_co_wake(co)
        aio_poll(qemu_aio_context)               ...
          co_schedule_bh_cb()                    ...
            qemu_coroutine_enter(co)             ...

              /* (A) bdrv_co_flush(bs)           /* (B) I/O on bs */
                      continues... */
                                             aio_context_release(bs->ctx)
        aio_context_acquire(bs->ctx)

Note that in above case, bdrv_drained_begin() doesn't do the "release,
poll, acquire" in BDRV_POLL_WHILE, because bs->in_flight == 0.

Fix this by using bdrv_coroutine_enter and enter coroutine in the right
context.

iotests 109 output is updated because the coroutine reenter flow during
mirror job complete is different (now through co_queue_wakeup, instead
of the unconditional qemu_coroutine_switch before), making the end job
len different.

Signed-off-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
2017-04-11 20:07:15 +08:00
Fam Zheng 14e9559f46 block: Make bdrv_parent_drained_begin/end public
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
2017-04-11 20:07:15 +08:00
Fam Zheng 19dd29e8a7 mirror: Fix aio context of mirror_top_bs
It should be moved to the same context as source, before inserting to the
graph.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-07 14:44:06 +02:00
Kevin Wolf 1bf03e66fd block: Don't check permissions for copy on read
The assertion is currently failing. We can't require callers to have
write permissions when all they are doing is a read, so comment it out.
Add a FIXME comment in the code so that the check is re-enabled when
copy on read is refactored into its own filter driver.

Reported-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
2017-04-07 14:44:06 +02:00
Max Reitz 7a25fcd056 block/mirror: Fix use-after-free
If @bs does not have any parents, the only reference to @mirror_top_bs
will be held by the BlockJob object after the bdrv_unref() following
block_job_create(). However, if block_job_create() fails, this reference
will not exist and @mirror_top_bs will have been deleted when we
goto fail.

The issue comes back at all later entries to the fail label: We delete
the BlockJob object before rolling back our changes to the node graph.
This means that we will delete @mirror_top_bs in the process.

All in all, whenever @bs does not have any parents and we go down the
fail path we will dereference @mirror_top_bs after it has been deleted.

Fix this by invoking bdrv_unref() only when block_job_create() was
successful and by bdrv_ref()'ing @mirror_top_bs in the fail path before
deleting the BlockJob object. Finally, bdrv_unref() it at the end of the
fail path after we actually no longer need it.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-07 14:44:05 +02:00
Kevin Wolf 0d0676a104 commit: Set commit_top_bs->total_sectors
Like in the mirror filter driver, we also need to set the image size for
the commit filter driver. This is less likely to be a problem in
practice than for the mirror because we're not at the active layer here,
but attaching new parents to a node in the middle of the chain is
possible, so the size needs to be correct anyway.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
2017-04-07 14:44:05 +02:00
Kevin Wolf 02be4aeb93 commit: Set commit_top_bs->aio_context
The filter driver that is inserted by the commit job needs to use the
same AioContext as its parent and child nodes.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
2017-04-07 14:44:05 +02:00
Kevin Wolf d35ff5e6b3 block: Ignore guest dev permissions during incoming migration
Usually guest devices don't like other writers to the same image, so
they use blk_set_perm() to prevent this from happening. In the migration
phase before the VM is actually running, though, they don't have a
problem with writes to the image. On the other hand, storage migration
needs to be able to write to the image in this phase, so the restrictive
blk_set_perm() call of qdev devices breaks it.

This patch flags all BlockBackends with a qdev device as
blk->disable_perm during incoming migration, which means that the
requested permissions are stored in the BlockBackend, but not actually
applied to its root node yet.

Once migration has finished and the VM should be resumed, the
permissions are applied. If they cannot be applied (e.g. because the NBD
server used for block migration hasn't been shut down), resuming the VM
fails.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Kashyap Chamarthy <kchamart@redhat.com>
2017-04-07 14:44:05 +02:00
Peter Maydell 87cc4c6102 * MemoryRegionCache revert
* glib optimization workaround
 * fix "info lapic" segfault on isapc
 * fix QIOChannel memory leak
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQExBAABCAAbBQJY4oOMFBxwYm9uemluaUByZWRoYXQuY29tAAoJEL/70l94x66D
 AsIH/i52nJw41utJCs5AevnQyqNs9RnyMkZLHiVoi6a+pdJqX+0mCw8gV/5FsbPZ
 dtyt1tEuYBSu72adr+/ExE4aIEjwzeyRmnUdOkB+iYPxirHKuf4K/JTuLuvMtaQQ
 Tqj+FU5tx3wx0jlGOm5A7pzjZ680JUex+oaz3d1bZziv3zCyFCIgiZ2m2UAaaPQe
 fsd3fksJvc0gKOUKmdLUpu2m/xP3hAQAfQ4P/ozOfbVh9V2CVNaQ/cl935tNtdFK
 aYN3KleW3/ovb+YSexeNoW7QQH/3ZsjronCW5OmbF4FgHoeoV8MUROfNgu1S2bRU
 Bne9K/6boPzhD8NDEuSy8SXvf7s=
 =EdXr
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* MemoryRegionCache revert
* glib optimization workaround
* fix "info lapic" segfault on isapc
* fix QIOChannel memory leak

# gpg: Signature made Mon 03 Apr 2017 18:17:00 BST
# gpg:                using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream:
  main-loop: Acquire main_context lock around os_host_main_loop_wait.
  exec: revert MemoryRegionCache
  nbd: fix memory leak on socket_connect failed
  ipmi: Fix macro issues
  target-i386: fix "info lapic" segfault on isapc
  iscsi: drop unused IscsiAIOCB.qiov field

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-04 11:40:55 +01:00
Max Reitz 86d1bd7098 block/parallels: Avoid overflows
Change the types of variables in allocate_clusters() to int64_t so we do
not have to worry about potential overflows.

Add an assertion that our accesses to s->bat[] do not result in a buffer
overflow and that the implicit conversion performed when invoking
bat_entry_off() does not result in an integer overflow.

Coverity-id: 1307776
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170331170512.10381-1-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-03 17:11:40 +02:00
Eric Blake 0c1bd4692f qcow2: Discard unaligned tail when wiping image
There is a subtle difference between the fast (qcow2v3 with no
extra data) and slow path (qcow2v2 format [aka 0.10], or when a
snapshot is present) of qcow2_make_empty().  The slow path fails
to discard the final (partial) cluster of an unaligned image.

The problem stems from the fact that qcow2_discard_clusters() was
silently ignoring sub-cluster head and tail on unaligned requests.
A quick audit of all callers shows that qcow2_snapshot_create() has
always passed a cluster-aligned request since the call was added
in commit 1ebf561; qcow2_co_pdiscard() has passed a cluster-aligned
request since commit ecdbead taught the block layer about preferred
discard alignment; and qcow2_make_empty() was fixed to pass an
aligned start (but not necessarily end) in commit a3e1505.

Asserting that the start is always aligned also points out that we
now have a dead check: rounding the end offset down can never result
in a value less than the aligned start offset (the check was rendered
dead with commit ecdbead).  Meanwhile, we do not want to round the
end cluster down in the one case of the end offset matching the
(unaligned) file size - that final partial cluster should still be
discarded.

With those fixes in place, the fast and slow paths are back in sync
at discarding an entire image; the next patch will update
qemu-iotests to ensure we don't regress.

Note that bdrv_co_pdiscard ignores ALL partial cluster requests,
including the partial cluster at the end of an image; it can be
argued that the partial cluster at the end should be special-cased
so that a guest issuing discard requests at proper alignments
everywhere else can likewise empty the entire image.  But that
optimization is left for another day.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-id: 20170331185356.2479-3-eblake@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-03 17:11:40 +02:00
Markus Armbruster d1c136885b sheepdog: Fix blockdev-add
Commit 831acdc "sheepdog: Implement bdrv_parse_filename()" and commit
d282f34 "sheepdog: Support blockdev-add" have different ideas on how
the QemuOpts parameters for the server address are named.  Fix that.
While there, rename BlockdevOptionsSheepdog member addr to server, for
consistency with BlockdevOptionsSsh, BlockdevOptionsGluster,
BlockdevOptionsNbd.

Commit 831acdc's example becomes

    --drive driver=sheepdog,server.type=inet,server.host=fido,server.port=7000,vdi=dolly

instead of

    --drive driver=sheepdog,host=fido,vdi=dolly

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Kashyap Chamarthy <kchamart@redhat.com>
Message-id: 1490895797-29094-10-git-send-email-armbru@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-03 17:11:39 +02:00
Markus Armbruster 9445673ea6 nbd: Tidy up blockdev-add interface
SocketAddress is a simple union, and simple unions are awkward: they
have their variant members wrapped in a "data" object on the wire, and
require additional indirections in C.  I intend to limit its use to
existing external interfaces, and convert all internal interfaces to
SocketAddressFlat.

BlockdevOptionsNbd is an external interface using SocketAddress.  We
already use SocketAddressFlat elsewhere in blockdev-add.  Replace it
by SocketAddressFlat while we can (it's new in 2.9) for simplicity and
consistency.  For example,

    { "execute": "blockdev-add",
      "arguments": { "node-name": "foo", "driver": "nbd",
                     "server": { "type": "inet",
		                 "data": { "host": "localhost",
				           "port": "12345" } } } }

becomes

    { "execute": "blockdev-add",
      "arguments": { "node-name": "foo", "driver": "nbd",
                     "server": { "type": "inet",
		                 "host": "localhost", "port": "12345" } } }

Since the internal interfaces still take SocketAddress, this requires
conversion function socket_address_crumple().  It'll go away when I
update the interfaces.

Unfortunately, SocketAddress is also visible in -drive since 2.8:

    -drive if=none,driver=nbd,server.type=inet,server.data.host=127.0.0.1,server.data.port=12345

Nobody should be using it, as it's fairly new and has never been
documented, so adding still more compatibility gunk to keep it working
isn't worth the trouble.  You now have to use

    -drive if=none,driver=nbd,server.type=inet,server.host=127.0.0.1,server.port=12345

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1490895797-29094-9-git-send-email-armbru@redhat.com

[mreitz: Change iotest 147 accordingly]

Because of this interface change, iotest 147 has to be adapted.
Unfortunately, we cannot just flatten all of the addresses because
nbd-server-start still takes a plain SocketAddress. Therefore, we need
both and this is most easily achieved by writing the SocketAddress into
the code and flattening it where necessary.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170330221243.17333-1-mreitz@redhat.com

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-03 17:11:39 +02:00
Markus Armbruster 8bc0673f6d qapi-schema: SocketAddressFlat variants 'vsock' and 'fd'
Note that the new variants are impossible in qemu_gluster_glfs_init(),
because the gconf->server can only come from qemu_gluster_parse_uri()
or qemu_gluster_parse_json(), and neither can create anything but
'inet' or 'unix'.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1490895797-29094-7-git-send-email-armbru@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-03 17:11:39 +02:00
Markus Armbruster fce5d5386d gluster: Prepare for SocketAddressFlat extension
qemu_gluster_glfs_init() and qemu_gluster_parse_json() rely on the
fact that SocketAddressFlatType has only two members
SOCKET_ADDRESS_FLAT_TYPE_INET and SOCKET_ADDRESS_FLAT_TYPE_UNIX.
Correct, but won't stay correct.  Make them more robust.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1490895797-29094-6-git-send-email-armbru@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-03 17:11:39 +02:00
Markus Armbruster 129c7d1c53 block: Document -drive problematic code and bugs
-blockdev and blockdev_add convert their arguments via QObject to
BlockdevOptions for qmp_blockdev_add(), which converts them back to
QObject, then to a flattened QDict.  The QDict's members are typed
according to the QAPI schema.

-drive converts its argument via QemuOpts to a (flat) QDict.  This
QDict's members are all QString.

Thus, the QType of a flat QDict member depends on whether it comes
from -drive or -blockdev/blockdev_add, except when the QAPI type maps
to QString, which is the case for 'str' and enumeration types.

The block layer core extracts generic configuration from the flat
QDict, and the block driver extracts driver-specific configuration.

Both commonly do so by converting (parts of) the flat QDict to
QemuOpts, which turns all values into strings.  Not exactly elegant,
but correct.

However, A few places access the flat QDict directly:

* Most of them access members that are always QString.  Correct.

* bdrv_open_inherit() accesses a boolean, carefully.  Correct.

* nfs_config() uses a QObject input visitor.  Correct only because the
  visited type contains nothing but QStrings.

* nbd_config() and ssh_config() use a QObject input visitor, and the
  visited types contain non-QStrings: InetSocketAddress members
  @numeric, @to, @ipv4, @ipv6.  -drive works as long as you don't try
  to use them (they're all optional).  @to is ignored anyway.

  Reproducer:
  -drive driver=ssh,server.host=h,server.port=22,server.ipv4,path=p
  -drive driver=nbd,server.type=inet,server.data.host=h,server.data.port=22,server.data.ipv4
  both fail with "Invalid parameter type for 'data.ipv4', expected: boolean"

Add suitable comments to all these places.  Mark the buggy ones FIXME.

"Fortunately", -drive's driver-specific options are entirely
undocumented.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1490895797-29094-5-git-send-email-armbru@redhat.com
[mreitz: Fixed two typos]
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-03 17:11:39 +02:00
Markus Armbruster ca0b64e5ed nbd sockets vnc: Mark problematic address family tests TODO
Certain features make sense only with certain address families.  For
instance, passing file descriptors requires AF_UNIX.  Testing
SocketAddress's saddr->type == SOCKET_ADDRESS_KIND_UNIX is obvious,
but problematic: it can't recognize AF_UNIX when type ==
SOCKET_ADDRESS_KIND_FD.

Mark such tests of saddr->type TODO.  We may want to check the address
family with getsockname() there.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1490895797-29094-2-git-send-email-armbru@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-03 17:11:39 +02:00
yaolujing cc1e139139 nbd: fix memory leak on socket_connect failed
When TCP connection fails between nbd server and client,
the local var, sioc, memory leak.

This patch fixes the memory leak.

Signed-off-by: yaolujing <yaolujing@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1491005709-29989-1-git-send-email-yaolujing@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-04-02 21:18:21 +02:00
Stefan Hajnoczi b9afaba891 iscsi: drop unused IscsiAIOCB.qiov field
The IscsiAIOCB.qiov field has been unused since commit
063c3378a9 ("block/iscsi: introduce
bdrv_co_{readv, writev, flush_to_disk}") back in 2013.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20170327165005.22038-1-stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-04-02 21:17:47 +02:00
Max Reitz 34634ca286 block/curl: Check protocol prefix
If the user has explicitly specified a block driver and thus a protocol,
we have to make sure the URL's protocol prefix matches. Otherwise the
latter will silently override the former which might catch some users by
surprise.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20170331120431.1767-3-mreitz@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-31 15:53:22 -04:00
Eric Blake e98c6961c8 rbd: Fix regression in legacy key/values containing escaped :
Commit c7cacb3 accidentally broke legacy key-value parsing through
pseudo-filename parsing of -drive file=rbd://..., for any key that
contains an escaped ':'.  Such a key is surprisingly common, thanks
to mon_host specifying a 'host:port' string.  The break happens
because passing things from QDict through QemuOpts back to another
QDict requires that we pack our parsed key/value pairs into a string,
and then reparse that string, but the intermediate string that we
created ("key1=value1:key2=value2") lost the \: escaping that was
present in the original, so that we could no longer see which : were
used as separators vs. those used as part of the original input.

Fix it by collecting the key/value pairs through a QList, and
sending that list on a round trip through a JSON QString (as in
'["key1","value1","key2","value2"]') on its way through QemuOpts,
rather than hand-rolling our own string.  Since the string is only
handled internally, this was faster than creating a full-blown
struct of '[{"key1":"value1"},{"key2":"value2"}]', and safer at
guaranteeing order compared to '{"key1":"value1","key2":"value2"}'.

It would be nicer if we didn't have to round-trip through QemuOpts
in the first place, but that's a much bigger task for later.

Reproducer:
./x86_64-softmmu/qemu-system-x86_64 -nodefaults -nographic -qmp stdio \
-drive 'file=rbd:volumes/volume-ea141b5c-cdb3-4765-910d-e7008b209a70'\
':id=compute:key=AQAVkvxXAAAAABAA9ZxWFYdRmV+DSwKr7BKKXg=='\
':auth_supported=cephx\;none:mon_host=192.168.1.2\:6789'\
',format=raw,if=none,id=drive-virtio-disk0,'\
'serial=ea141b5c-cdb3-4765-910d-e7008b209a70,cache=writeback'

Even without an RBD setup, this serves a test of whether we get
the incorrect parser error of:
qemu-system-x86_64: -drive file=rbd:...cache=writeback: conf option 6789 has no value
or the correct behavior of hanging while trying to connect to
the requested mon_host of 192.168.1.2:6789.

Reported-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170331152730.12514-1-eblake@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-31 13:53:57 -04:00
Markus Armbruster 2836284db6 rbd: Fix bugs around -drive parameter "server"
qemu_rbd_open() takes option parameters as a flattened QDict, with
keys of the form server.%d.host, server.%d.port, where %d counts up
from zero.

qemu_rbd_array_opts() extracts these values as follows.  First, it
calls qdict_array_entries() to find the list's length.  For each list
element, it formats the list's key prefix (e.g. "server.0."), then
creates a new QDict holding the options with that key prefix, then
converts that to a QemuOpts, so it can finally get the member values
from there.

If there's one surefire way to make code using QDict more awkward,
it's creating more of them and mixing in QemuOpts for good measure.

The extraction of keys starting with server.%d into another QDict
makes us ignore parameters like server.0.neither-host-nor-port
silently.

The conversion to QemuOpts abuses runtime_opts, as described a few
commits ago.

Rewrite to simply get the values straight from the options QDict.

Fixes -drive not to crash when server.*.* are present, but
server.*.host is absent.

Fixes -drive to reject invalid server.*.*.

Permits cleaning up runtime_opts.  Do that, and fix -drive to reject
bogus parameters host and port instead of silently ignoring them.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1490691368-32099-11-git-send-email-armbru@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-28 10:01:21 -04:00
Markus Armbruster 464444fcc1 rbd: Revert -blockdev and -drive parameter auth-supported
This reverts half of commit 0a55679.  We're having second thoughts on
the QAPI schema (and thus the external interface), and haven't reached
consensus, yet.  Issues include:

* The implementation uses deprecated rados_conf_set() key
  "auth_supported".  No biggie.

* The implementation makes -drive silently ignore invalid parameters
  "auth" and "auth-supported.*.X" where X isn't "auth".  Fixable (in
  fact I'm going to fix similar bugs around parameter server), so
  again no biggie.

* BlockdevOptionsRbd member @password-secret applies only to
  authentication method cephx.  Should it be a variant member of
  RbdAuthMethod?

* BlockdevOptionsRbd member @user could apply to both methods cephx
  and none, but I'm not sure it's actually used with none.  If it
  isn't, should it be a variant member of RbdAuthMethod?

* The client offers a *set* of authentication methods, not a list.
  Should the methods be optional members of BlockdevOptionsRbd instead
  of members of list @auth-supported?  The latter begs the question
  what multiple entries for the same method mean.  Trivial question
  now that RbdAuthMethod contains nothing but @type, but less so when
  RbdAuthMethod acquires other members, such the ones discussed above.

* How BlockdevOptionsRbd member @auth-supported interacts with
  settings from a configuration file specified with @conf is
  undocumented.  I suspect it's untested, too.

Let's avoid painting ourselves into a corner now, and revert the
feature for 2.9.

Note that users can still configure authentication methods with a
configuration file.  They probably do that anyway if they use Ceph
outside QEMU as well.

Further note that this doesn't affect use of key "auth-supported" in
-drive file=rbd:...:key=value.

qemu_rbd_array_opts()'s parameter @type now must be RBD_MON_HOST,
which is silly.  This will be cleaned up shortly.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1490691368-32099-9-git-send-email-armbru@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-28 10:01:21 -04:00
Markus Armbruster 078463977a rbd: Clean up qemu_rbd_create()'s detour through QemuOpts
The conversion from QDict to QemuOpts is pointless.  Simply get the
stuff straight from the QDict.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1490691368-32099-8-git-send-email-armbru@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-28 10:00:57 -04:00
Markus Armbruster cbf036b4f3 rbd: Clean up runtime_opts, fix -drive to reject filename
runtime_opts is used for three different purposes:

* qemu_rbd_open() uses it to accept options it recognizes, such as
  "pool" and "image".  Other .bdrv_open() methods do it similarly.

* qemu_rbd_open() accepts additional list-valued options
  auth-supported and server, with the help of qemu_rbd_array_opts().
  The list elements are again dictionaries.  qemu_rbd_array_opts()
  uses runtime_opts to accept their members.  Thus, runtime_opts
  contains recognized sub-sub-options "auth", "host", "port" in
  addition to recognized options.  No other block driver does that.

* qemu_rbd_create() uses it to convert the QDict produced by
  qemu_rbd_parse_filename() to QemuOpts.  No other block driver does
  that.  The keys produced by qemu_rbd_parse_filename() are "pool",
  "image", "snapshot", "conf", "user" and "keyvalue-pairs".
  qemu_rbd_open() accepts these, so no additional ones here.

This is a confusing mess.  Dates back to commit 0f9d252.  First step
to clean it up is documenting runtime_opts.desc[]:

* Reorder entries to match the QAPI schema, like we do in other block
  drivers.

* Document why the schema's "server" and "auth-supported" aren't in
  .desc[].

* Document why "keyvalue-pairs", "host", "port" and "auth" are in
  .desc[], but not the schema.

* Delete "filename", because none of the three users actually uses it.
  This fixes -drive to reject parameter filename instead of silently
  ignoring it.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1490691368-32099-7-git-send-email-armbru@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-28 09:53:16 -04:00
Markus Armbruster 82f20e8547 rbd: Don't accept -drive driver=rbd, keyvalue-pairs=...
The way we communicate extra key-value pairs from
qemu_rbd_parse_filename() to qemu_rbd_open() exposes option parameter
"keyvalue-pairs" on the command line.  It's not wanted there.  Hack:
rename the parameter to "=keyvalue-pairs" to make it inaccessible.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1490691368-32099-6-git-send-email-armbru@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-28 09:53:16 -04:00
Markus Armbruster 8efb339dd4 rbd: Clean up after the previous commit
This code in qemu_rbd_parse_filename()

    found_str = qemu_rbd_next_tok(p, '\0', &p);
    p = found_str;

has no effect.  Drop it, and simplify qemu_rbd_next_tok().

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1490691368-32099-5-git-send-email-armbru@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-28 09:53:16 -04:00
Markus Armbruster 730b00bbfd rbd: Don't limit length of parameter values
We laboriously enforce that parameter values are between one and some
arbitrary limit in length.  Only RBD_MAX_IMAGE_NAME_SIZE comes from
librbd.h, and I'm not sure it applies.  Where the other limits come
from is unclear.

Drop the length checking.  The limits librbd actually imposes must be
checked by librbd anyway.

There's one minor complication: BDRVRBDState member name is a
fixed-size array.  Depends on the length limit.  Make it a pointer to
a dynamically allocated string.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1490691368-32099-4-git-send-email-armbru@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-28 09:53:16 -04:00
Markus Armbruster f51c363c2b rbd: Fix to cleanly reject -drive without pool or image
qemu_rbd_open() neglects to check pool and image are present.  Missing
image is caught by rbd_open(), but missing pool crashes.  Reproducer:

    $ qemu-system-x86_64 -nodefaults -drive driver=rbd,id=rbd,image=i,...
    terminate called after throwing an instance of 'std::logic_error'
      what():  basic_string::_M_construct null not valid
    Aborted (core dumped)

where ... is a working server.0.{host,port} configuration.

Doesn't affect -drive with file=..., because qemu_rbd_parse_filename()
always sets both pool and image.

Doesn't affect -blockdev, because pool and image are mandatory in the
QAPI schema.

Fix by adding the missing checks.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1490691368-32099-3-git-send-email-armbru@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-28 09:53:16 -04:00
Denis V. Lunev dc62da88b5 parallels: wrong call to bdrv_truncate
Parallels driver should not call bdrv_truncate if the image was opened
in the read-only mode. Without the patch
    qemu-img check harddisk.hds
asserts with
    bdrv_truncate: Assertion `child->perm & BLK_PERM_RESIZE' failed.

Parameters used on the write path are not needed if the image is opened
in the read-only mode.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reported-by: Edgar Kaziahmedov <edos@virtuozzo.mipt.ru>
Message-id: 1490625488-7980-1-git-send-email-den@openvz.org
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-03-28 11:06:00 +01:00
Peter Maydell eb06c9e2d3 * MTTCG fix for win32
* virtio-scsi assertion failure
 * mem-prealloc coverity fix
 * x86 migration revert which requires more thought
 * x86 instruction limit (avoids >2 page translation blocks)
 * nbd dead code cleanup
 * small memory.c logic fix
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQExBAABCAAbBQJY2Te4FBxwYm9uemluaUByZWRoYXQuY29tAAoJEL/70l94x66D
 GyIH/jMpl0w5cdW2hxzEba5alqALKx8fz8LMFy47lSndifyr74Nbk7fq9u89m9/6
 3dz92sOq4ixUt8+eWEHcy0lJqucrStdMWcA7LsSIioXfgbBN39e9NfJFshXKTSQU
 RSL3M5f5XvYHZqHWhk/GjzlkA2l+Dq2v7FM+DT4HISnP0fjcmGXEfadfUZi6KLao
 94xXGs73pTkln9jm8N1pwn3JuJ4+FbEatrvok01nmTbA7VrrBz0zVbTZjhWz7Tu/
 sqBuIBAnPNKhYZFhF8GnNrXUaIciCbw13QdT047JSfpdkSQ7IUfGt7mW48X0+q9z
 JCHTiTZ35d7/lqeMojgl9ANUDpk=
 =iED8
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* MTTCG fix for win32
* virtio-scsi assertion failure
* mem-prealloc coverity fix
* x86 migration revert which requires more thought
* x86 instruction limit (avoids >2 page translation blocks)
* nbd dead code cleanup
* small memory.c logic fix

# gpg: Signature made Mon 27 Mar 2017 17:03:04 BST
# gpg:                using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream:
  scsi-generic: Fill in opt_xfer_len in INQUIRY reply if it is zero
  Revert "apic: save apic_delivered flag"
  nbd: drop unused NBDClientSession.is_unix field
  win32: replace custom mutex and condition variable with native primitives
  mem-prealloc: fix sysconf(_SC_NPROCESSORS_ONLN) failure case.
  tcg/i386: Check the size of instruction being translated
  virtio-scsi: Fix acquire/release in dataplane handlers
  virtio-scsi: Make virtio_scsi_acquire/release public
  clear pending status before calling memory commit

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-27 17:34:50 +01:00
Peter Maydell 700f9ce0f9 block/file-posix.c: Fix unused variable warning on OpenBSD
On OpenBSD none of the ioctls probe_logical_blocksize() tries
exist, so the variable sector_size is unused. Refactor the
code to avoid this (and reduce the duplicated code).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1490279788-12995-1-git-send-email-peter.maydell@linaro.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-03-27 17:28:34 +02:00
Kevin Wolf e5bcf967fb file-posix: Make bdrv_flush() failure permanent without O_DIRECT
Success for bdrv_flush() means that all previously written data is safe
on disk. For fdatasync(), the best semantics we can hope for on Linux
(without O_DIRECT) is that all data that was written since the last call
was successfully written back. Therefore, and because we can't redo all
writes after a flush failure, we have to give up after a single
fdatasync() failure. After this failure, we would never be able to make
the promise that a successful bdrv_flush() makes.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 20170322210005.16533-1-kwolf@redhat.com
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-03-27 16:53:42 +02:00
Paolo Bonzini a12a712a7d nbd-client: fix handling of hungup connections
After the switch to reading replies in a coroutine, nothing is
reentering pending receive coroutines if the connection hangs.
Move nbd_recv_coroutines_enter_all to the reply read coroutine,
which is the place where hangups are detected.  nbd_teardown_connection
can simply wait for the reply read coroutine to detect the hangup
and clean up after itself.

This wouldn't be enough though because nbd_receive_reply returns 0
(rather than -EPIPE or similar) when reading from a hung connection.
Fix the return value check in nbd_read_reply_entry.

This fixes qemu-iotests 083.

Reported-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20170314111157.14464-1-pbonzini@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-03-27 16:50:36 +02:00
Stefan Hajnoczi e4548bb640 nbd: drop unused NBDClientSession.is_unix field
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20170327123223.1199-1-stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-27 14:41:01 +02:00
Eric Blake 67adf4b398 trace: Fix backwards mirror_yield parameters
block/trace-events lists the parameters for mirror_yield
consistently with other mirror events (cnt just after s, like in
mirror_before_sleep; in_flight last, like in mirror_yield_in_flight).
But the callers were passing parameters in the wrong order, leading
to poor trace messages, including type truncation when there are
more than 4G dirty sectors involved.  Broken since its introduction
in commit bd48bde.

While touching this, ensure that all callers use the same type
(uint64_t) for cnt, as a later patch will enable the compiler to do
stricter type-checking.

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-03-24 09:21:42 +00:00
Peter Maydell e6ebebc204 -----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJY0rRYAAoJEL2+eyfA3jBXEvkQAJSNErQOEdqBoX/gSjeYSX85
 PGp+fUrIux0HIYeKySShnsJ3Z1AuIHCogtcfEyHzTo8cDljZssgS4BRKy41ZnNaM
 91Q91MgVyAEtwzApg5WNwWhTB7QDkbz7J75mTk74KPN6y9uKNbjSBRSnH4ZbIH/p
 L3tk6eGpHWf3N0UvoifoKpExlq0A+AYkisuZn7D9C+bBDEnEUWYRcvfEk3sKrZD/
 XikclGwNSPKmdBeYenzlLHFA8WyGe85pkys6QRPeRL1l8dDBBPt1so2y4PLzaEBO
 fImh+ivrHHbKI5TD0RoRVsY4qi9bbH8dK1gDp0oT8uZpwIsO4EWRHA1GZRq6lVIw
 j7a+p/ZFBiVa2WFvWpicZppRwnkuuswqqm4NVsC1djSMoDvPeO2T24YlcRPYeYrp
 FVlY04HpP195mw3e7VVWlirRQY+Jo5IwJkSOUKM4xOZpKY/prS2kqT+KQq2bYK5a
 t3MTKwT04q/7eBtPFoJFf3gwI4q8hyizPtf4X0AN5/YREwJh7J4azQSLEJSjlo2F
 37TbMqGVNQPBtwXWnfK2mi12NIHCaP/clh8hqqrQE6EdjFQQcdD5j5df5syLalTK
 qy+IbxpvoyNt0niXstXI62RnKDbfwsz8YtYYjVIUfv9VkpyQU1gHak7VeodRyHjz
 zuINtr0Jrmr47n8d9qTj
 =Gi6q
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/cody/tags/block-pull-request' into staging

# gpg: Signature made Wed 22 Mar 2017 17:28:56 GMT
# gpg:                using RSA key 0xBDBE7B27C0DE3057
# gpg: Good signature from "Jeffrey Cody <jcody@redhat.com>"
# gpg:                 aka "Jeffrey Cody <jeff@codyprime.org>"
# gpg:                 aka "Jeffrey Cody <codyprime@gmail.com>"
# Primary key fingerprint: 9957 4B4D 3474 90E7 9D98  D624 BDBE 7B27 C0DE 3057

* remotes/cody/tags/block-pull-request:
  blockjob: add devops to blockjob backends
  block-backend: add drained_begin / drained_end ops
  blockjob: add block_job_start_shim
  blockjob: avoid recursive AioContext locking

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-23 11:39:53 +00:00
John Snow f4d9cc88ee block-backend: add drained_begin / drained_end ops
Allow block backends to forward drain requests to their devices/users.
The initial intended purpose for this patch is to allow BBs to forward
requests along to BlockJobs, which will want to pause if their associated
BB has entered a drained region.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 20170316212351.13797-3-jsnow@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-22 13:26:27 -04:00
Edgar Kaziahmedov ff5bbe56c6 parallels: fix default options parsing
parallels block driver is completely broken since commit
    commit 75cdcd1553
    Author: Markus Armbruster <armbru@redhat.com>
    Date:   Tue Feb 21 21:14:08 2017 +0100
    option: Fix checking of sizes for overflow and trailing crap
Right now even simple
    qemu-io -c "read 512 64k" 1.hds
ends up with
    Unexpected error in parse_option_size() at util/qemu-option.c:188:
    Parameter 'prealloc-size' expects a non-negative number below 2^64
    Aborted (core dumped)
The cure is simple - we should use 'M' as a suffix in default option value
instead of 'MiB'.

Signed-off-by: Edgar Kaziahmedov <edos@virtuozzo.mipt.ru>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Message-id: 1490002022-22653-1-git-send-email-den@openvz.org
CC: Markus Armbruster <armbru@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-03-21 10:02:36 +00:00
Paolo Bonzini eb048026aa curl: fix compilation on OpenBSD
EPROTO is not found in OpenBSD.   We usually use EIO when no better
errno is available, do that here too.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20170317152412.8472-1-pbonzini@redhat.com
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-17 18:27:14 +00:00
Fam Zheng c1cef67251 block: Always call bdrv_child_check_perm first
bdrv_child_set_perm alone is not very usable because the caller must
call bdrv_child_check_perm first. This is already encapsulated
conveniently in bdrv_child_try_set_perm, so remove the other prototypes
from the header and fix the one wrong caller, block/mirror.c.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-17 12:54:06 +01:00
Fam Zheng fed414df9d file-posix: Don't leak fd in hdev_get_max_segments
This fixes a leaked fd introduced in commit 9103f1ce.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-17 12:54:06 +01:00
Changlong Xie 37a9051cc7 replication: clarify permissions
Even if hidden_disk, secondary_disk are backing files, they all need
write permissions in replication scenario. Otherwise we will encouter
below exceptions on secondary side during adding nbd server:

{'execute': 'nbd-server-add', 'arguments': {'device': 'colo-disk', 'writable': true } }
{"error": {"class": "GenericError", "desc": "Conflicts with use by hidden-qcow2-driver as 'backing', which does not allow 'write' on sec-qcow2-driver-for-nbd"}}

CC: Zhang Hailiang <zhang.zhanghailiang@huawei.com>
CC: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
CC: Wen Congyang <wencongyang2@huawei.com>
Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-17 12:54:06 +01:00
Stefan Hajnoczi 6958349085 file-posix: clean up max_segments buffer termination
The following pattern is unsafe:

  char buf[32];
  ret = read(fd, buf, sizeof(buf));
  ...
  buf[ret] = 0;

If read(2) returns 32 then a byte beyond the end of the buffer is
zeroed.

In practice this buffer overflow does not occur because the sysfs
max_segments file only contains an unsigned short + '\n'.  The string is
always shorter than 32 bytes.

Regardless, avoid this pattern because static analysis tools might
complain and it could lead to real buffer overflows if copy-pasted
elsewhere in the codebase.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-17 12:54:06 +01:00
Kevin Wolf dcbf37ce41 commit: Implement .bdrv_refresh_filename
We want query-block to return the right filename, even if a commit job
put a bdrv_commit_top on top of the actual image format driver. Let
bdrv_commit_top.bdrv_refresh_filename get the filename from its backing
file.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-13 12:49:33 +01:00
Kevin Wolf fd4a6493bb mirror: Implement .bdrv_refresh_filename
We want query-block to return the right filename, even if a mirror job
put a bdrv_mirror_top on top of the actual image format driver. Let
bdrv_mirror_top.bdrv_refresh_filename get the filename from its backing
file.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-13 12:49:33 +01:00
Kevin Wolf 9196565866 commit: Implement bdrv_commit_top.bdrv_co_get_block_status
In some cases, bdrv_co_get_block_status() is called recursively for the
whole backing chain. The automatically inserted bdrv_commit_top filter
driver must not stop the recursion, so implement a callback that simply
forwards the request to bs->backing.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-13 12:49:33 +01:00
Kevin Wolf b64aa44195 block: Request block status from *file for BDRV_BLOCK_RAW
This fixes bdrv_co_get_block_status() for the bdrv_mirror_top block
driver, which must fall through to bs->backing instead of bs->file.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-13 12:49:33 +01:00
Eric Blake 6f712ee080 vvfat: React to bdrv_is_allocated() errors
If bdrv_is_allocated() fails, we should react to that failure.
For 2 of the 3 callers, reporting the error was easy.  But in
cluster_was_modified() and its lone caller
get_cluster_count_for_direntry(), it's rather invasive to update
the logic to pass the error back; so there, I went with merely
documenting the issue by changing the return type to bool (in
all likelihood, treating the cluster as modified will then
trigger a read which will also fail, and eventually get to an
error - but given the appalling number of abort() calls in this
code, I'm not making it any worse).

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-13 12:49:33 +01:00
Eric Blake 666a9543fa backup: React to bdrv_is_allocated() errors
If bdrv_is_allocated() fails, we should immediately do the backup
error action, rather than attempting backup_do_cow() (although
that will likely fail too).

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-13 12:49:33 +01:00
Eric Blake e32ccbc6e9 block: Drop unmaintained 'archipelago' driver
The driver has failed to build since commit da34e65, in qemu 2.6,
due to a missing include of qapi/error.h for error_setg().
Since no one has complained in three releases, it is easier to
remove the dead code than to keep it around, especially since it
is not being built by default and therefore prone to bitrot.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-13 12:49:33 +01:00
Fam Zheng 9103f1ceb4 file-posix: Consider max_segments for BlockLimits.max_transfer
BlockLimits.max_transfer can be too high without this fix, guest will
encounter I/O error or even get paused with werror=stop or rerror=stop. The
cause is explained below.

Linux has a separate limit, /sys/block/.../queue/max_segments, which in
the worst case can be more restrictive than the BLKSECTGET which we
already consider (note that they are two different things). So, the
failure scenario before this patch is:

1) host device has max_sectors_kb = 4096 and max_segments = 64;
2) guest learns max_sectors_kb limit from QEMU, but doesn't know
   max_segments;
3) guest issues e.g. a 512KB request thinking it's okay, but actually
   it's not, because it will be passed through to host device as an
   SG_IO req that has niov > 64;
4) host kernel doesn't like the segmenting of the request, and returns
   -EINVAL;

This patch checks the max_segments sysfs entry for the host device and
calculates a "conservative" bytes limit using the page size, which is
then merged into the existing max_transfer limit. Guest will discover
this from the usual virtual block device interfaces. (In the case of
scsi-generic, it will be done in the INQUIRY reply interception in
device model.)

The other possibility is to actually propagate it as a separate limit,
but it's not better. On the one hand, there is a big complication: the
limit is per-LUN in QEMU PoV (because we can attach LUNs from different
host HBAs to the same virtio-scsi bus), but the channel to communicate
it in a per-LUN manner is missing down the stack; on the other hand,
two limits versus one doesn't change much about the valid size of I/O
(because guest has no control over host segmenting).

Also, the idea to fall back to bounce buffering in QEMU, upon -EINVAL,
was explored. Unfortunately there is no neat way to ensure the bounce
buffer is less segmented (in terms of DMA addr) than the guest buffer.

Practically, this bug is not very common. It is only reported on a
Emulex (lpfc), so it's okay to get it fixed in the easier way.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-13 12:49:33 +01:00
Vladimir Sementsov-Ogievskiy a410a7f1af backup: allow target without .bdrv_get_info
Currently backup to nbd target is broken, as nbd doesn't have
.bdrv_get_info realization.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-13 12:49:33 +01:00
Fam Zheng b69f00dde4 commit: Don't use error_abort in commit_start
bdrv_set_backing_hd failure needn't be abort. Since we already have
error parameter, use it.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07 14:53:29 +01:00
Fam Zheng 50bfbe93b2 block: Don't use error_abort in blk_new_open
We have an errp and bdrv_root_attach_child can fail permission check,
error_abort is not the best choice here.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07 14:53:29 +01:00
Markus Armbruster c5f1ae3ae7 qapi-schema: Rename SocketAddressFlat's variant tcp to inet
QAPI type SocketAddressFlat differs from SocketAddress pointlessly:
the discriminator value for variant InetSocketAddress is 'tcp' instead
of 'inet'.  Rename.

The type is so far only used by the Gluster block drivers.  Take care
to keep 'tcp' working in things like -drive's file.server.0.type=tcp.
The "gluster+tcp" URI scheme in pseudo-filenames stays the same.
blockdev-add changes, but it has changed incompatibly since 2.8
already.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07 14:53:29 +01:00
Markus Armbruster 2b733709d7 qapi-schema: Rename GlusterServer to SocketAddressFlat
As its documentation says, it's not specific to Gluster.  Rename it,
as I'm going to use it for something else.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07 14:53:29 +01:00
Markus Armbruster 85a82e852d gluster: Plug memory leaks in qemu_gluster_parse_json()
To reproduce, run

    $ valgrind qemu-system-x86_64 --nodefaults -S --drive driver=gluster,volume=testvol,path=/a/b/c,server.0.type=xxx

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07 14:53:29 +01:00
Markus Armbruster e152ef7a98 gluster: Don't duplicate qapi-util.c's qapi_enum_parse()
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07 14:53:29 +01:00
Markus Armbruster fc29458dee gluster: Drop assumptions on SocketTransport names
qemu_gluster_glfs_init() passes the names of QAPI enumeration type
SocketTransport to glfs_set_volfile_server().  Works, because they
were chosen to match.  But the coupling is artificial.  Use the
appropriate literal strings instead.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07 14:53:28 +01:00
Markus Armbruster 831acdc95e sheepdog: Implement bdrv_parse_filename()
This permits configuration with driver-specific options in addition to
pseudo-filename parsed as URI.  For instance,

    --drive driver=sheepdog,host=fido,vdi=dolly

instead of

    --drive driver=sheepdog,file=sheepdog://fido/dolly

It's also a first step towards supporting blockdev-add.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07 14:53:28 +01:00
Markus Armbruster 8ecc2f9eab sheepdog: Use SocketAddress and socket_connect()
sd_parse_uri() builds a string from host and port parts for
inet_connect().  inet_connect() parses it into host, port and options.
Whether this gets exactly the same host, port and no options for all
inputs is not obvious.

Cut out the string middleman and build a SocketAddress for
socket_connect() instead.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07 14:53:28 +01:00
Markus Armbruster 36bcac16fd sheepdog: Report errors in pseudo-filename more usefully
Errors in the pseudo-filename are all reported with the same laconic
"Can't parse filename" message.

Add real error reporting, such as:

    $ qemu-system-x86_64 --drive driver=sheepdog,filename=sheepdog:///
    qemu-system-x86_64: --drive driver=sheepdog,filename=sheepdog:///: missing file path in URI
    $ qemu-system-x86_64 --drive driver=sheepdog,filename=sheepgod:///vdi
    qemu-system-x86_64: --drive driver=sheepdog,filename=sheepgod:///vdi: URI scheme must be 'sheepdog', 'sheepdog+tcp', or 'sheepdog+unix'
    $ qemu-system-x86_64 --drive driver=sheepdog,filename=sheepdog+unix:///vdi?socke=sheepdog.sock
    qemu-system-x86_64: --drive driver=sheepdog,filename=sheepdog+unix:///vdi?socke=sheepdog.sock: unexpected query parameters

The code to translate legacy syntax to URI fails to escape URI
meta-characters.  The new error messages are misleading then.  Replace
them by the old "Can't parse filename" message.  "Internal error"
would be more honest.  Anyway, no worse than before.  Also add a FIXME
comment.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07 14:53:28 +01:00
Markus Armbruster daa0b0d4b1 sheepdog: Don't truncate long VDI name in _open(), _create()
sd_parse_uri() truncates long VDI names silently.  Reject them
instead.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07 14:53:28 +01:00
Markus Armbruster 89e2a31d33 sheepdog: Fix snapshot ID parsing in _open(), _create, _goto()
sd_parse_uri() and sd_snapshot_goto() screw up error checking after
strtoul(), and truncate long tag names silently.  Fix by replacing
those parts by new sd_parse_snapid_or_tag(), which checks more
carefully.

sd_snapshot_delete() also parses snapshot IDs, but is currently too
broken for me to touch.  Mark TODO.

Two calls of strtol() without error checking remain in
parse_redundancy().  Mark them FIXME.

More silent truncation of configuration strings remains elsewhere.
Not marked.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07 14:53:28 +01:00
Markus Armbruster a0dc0e2bfe sheepdog: Mark sd_snapshot_delete() lossage FIXME
sd_snapshot_delete() should delete the snapshot whose ID matches
@snapshot_id and whose name matches @name.  But that's not what it
does.  If @snapshot_id is a valid ID, it deletes the snapshot with
that ID, else it deletes the snapshot with that name.  It doesn't use
@name at all.  Add suitable FIXME comments, so someone who actually
knows Sheepdog can fix it.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07 14:53:28 +01:00
Markus Armbruster 48d7c4af06 sheepdog: Fix error handling sd_create()
As a bdrv_create() method, sd_create() must set an error and return
negative errno on failure.  It prints the error instead of setting it
when connect_to_sdog() fails.  Fix that.

While there, return the value of connect_to_sdog() like we do
elsewhere, instead of -EIO.  No functional change, as
connect_to_sdog() returns no other error code.

Many more suspicious uses of error_report() and error_report_err()
remain in other functions.  Left for another day.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07 14:53:28 +01:00
Markus Armbruster e25cad6921 sheepdog: Fix error handling in sd_snapshot_delete()
As a bdrv_snapshot_delete() method, sd_snapshot_delete() must set an
error and return negative errno on failure.  It sometimes returns -1,
and sometimes neglects to set an error.  It also prints error messages
with error_report().  Fix all that.

Moreover, its handling of an attempt to delete a nonexistent snapshot
is wrong: it error_report()s and succeeds.  Fix it to set an error and
return -ENOENT instead.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07 14:53:28 +01:00
Markus Armbruster cbc488ee2a sheepdog: Defuse time bomb in sd_open() error handling
When qemu_opts_absorb_qdict() fails, sd_open() closes stdin, because
sd->fd is still zero.  Fortunately, qemu_opts_absorb_qdict() can't
fail, because:

1. it only fails when qemu_opt_parse() fails, and
2. the only member of runtime_opts.desc[] is a QEMU_OPT_STRING, and
3. qemu_opt_parse() can't fail for QEMU_OPT_STRING.

Defuse this ticking time bomb by jumping behind the file descriptor
cleanup on error.

Also do that for the error paths where sd->fd is still -1.  The file
descriptor cleanup happens to do nothing then, but let's not rely on
that here.

While there, rename label out to err, because it's on the error path,
not the normal path out of the function.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07 14:53:28 +01:00
Kevin Wolf 5fe31c25cc block: Fix error handling in bdrv_replace_in_backing_chain()
When adding an Error parameter, bdrv_replace_in_backing_chain() would
become nothing more than a wrapper around change_parent_backing_link().
So make the latter public, renamed as bdrv_replace_node(), and remove
bdrv_replace_in_backing_chain().

Most of the callers just remove a node from the graph that they just
inserted, so they can use &error_abort, but completion of a mirror job
with 'replaces' set can actually fail.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-07 14:53:28 +01:00
Kevin Wolf 88f9d1b3d2 mirror: Fix error path for dirty bitmap creation
mirror_top_bs must be removed from the graph again when creating the
dirty bitmap fails.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-07 14:53:28 +01:00
Kevin Wolf 0bf74767ff mirror: Fix permissions for removing mirror_top_bs
mirror_top_bs takes write permissions on its backing file, which can
make it impossible to attach that backing file node to another parent.
However, this is exactly what needs to be done in order to remove
mirror_top_bs from the backing chain. So give up the write permission
first.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-07 14:53:28 +01:00
Kevin Wolf 7d9fcb391c mirror: Fix permission problem with 'replaces'
The 'replaces' option of drive-mirror can be used to mirror a Quorum
node to a new image and then let the target image replace one of the
Quorum children. In order for this graph modification to succeed, the
mirror job needs to lift its restrictions on the target node first
before actually replacing the child.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-07 14:53:28 +01:00
Kevin Wolf b247767aac commit: Fix error handling
Apparently some kind of mismerge happened in commit 8dfba279, which
broke the error handling without any real reason by removing the
assignment of the return value to ret in a blk_insert_bs() call.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-03-07 14:53:28 +01:00
Markus Armbruster 048abb7b20 qapi: Drop unused non-strict qobject input visitor
The split between tests/test-qobject-input-visitor.c and
tests/test-qobject-input-strict.c now makes less sense than ever.  The
next commit will take care of that.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1488544368-30622-20-git-send-email-armbru@redhat.com>
2017-03-05 09:14:19 +01:00
Paolo Bonzini f6eb0b319e iscsi: fix missing unlock
Reported by Coverity.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-03 16:41:20 +01:00
Peter Maydell 54639aed97 -----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJYt5TDAAoJEL2+eyfA3jBXL7UP/ROAc7tc0e4F2+5lPE88bLfJ
 i4dBI092Qt4LG73POoQDbgpYahs5ywLYE7MAJP2GmgiiUOj4XFN/K/d6wJluqrGj
 mLiOTlMcYmr2Kbi5CMfZBWlkd27GD/g1knCzpqtAOpM/zIxAwnRr4Mq8RXAHSgio
 tTvgtNr5JG9tYU088COwY7otr6WDhIMwESN5KQcLiOxmo4GtNkpBmpwfMdsgij9X
 mxfjQSdWzutITgsA28JiOMfWMbCiLYkwUgo/KCVIG4aB9mNKgEOx6YTIuGXspx2E
 8DaGLW54KagsuAhEHFCKFecmYCRjaDCZhwcMEEdif+mn1k1bJGtjK+roNFvn2vEy
 8qdKDTn4B7f6uR0fSGNviti04NNpCZTlPmhWpupqRWetY9vCdYNoRjtxRgh6t7wd
 rziBkN7FW/yxvr2GEXb5lLtH0YN+YTOw6FIXoRS8zKoNNtNuJaKxymfQ57VHxF0V
 BG9A+XMAIPgCudoqtBIpyaNxlByqhFKm6Z7aMNzzdZoohANtQxsAO6mAxFKk0yTp
 3zhOywglHd8wtmTVnZEtQTpdqk/Swd9/qCig6hHq+3U3QydyrHM7ZZbB/Z22xiaO
 53DDiLvPr8VbwCZQcpjXg0YOcQXTEMY/xzxDn5AQg10nyWYBr7HKKTxfmT1aew/N
 CRQi8CycU0TVneRnPMsg
 =LvuA
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/cody/tags/block-pull-request' into staging

# gpg: Signature made Thu 02 Mar 2017 03:42:59 GMT
# gpg:                using RSA key 0xBDBE7B27C0DE3057
# gpg: Good signature from "Jeffrey Cody <jcody@redhat.com>"
# gpg:                 aka "Jeffrey Cody <jeff@codyprime.org>"
# gpg:                 aka "Jeffrey Cody <codyprime@gmail.com>"
# Primary key fingerprint: 9957 4B4D 3474 90E7 9D98  D624 BDBE 7B27 C0DE 3057

* remotes/cody/tags/block-pull-request:
  block/rbd: add support for 'mon_host', 'auth_supported' via QAPI
  block/rbd: add blockdev-add support
  block/rbd: parse all options via bdrv_parse_filename
  block/rbd: add all the currently supported runtime_opts
  block/rbd: don't copy strings in qemu_rbd_next_tok()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-02 23:20:37 +00:00
Jeff Cody 0a55679b4a block/rbd: add support for 'mon_host', 'auth_supported' via QAPI
This adds support for three additional options that may be specified
by QAPI in blockdev-add:

    server: host, port
    auth method: either 'cephx' or 'none'

The "server" and "auth-supported" QAPI parameters are arrays.  To conform
with the rados API, the array items are join as a single string with a ';'
character as a delimiter when setting the configuration values.

Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-01 22:39:25 -05:00
Kevin Wolf b2c2832c61 block: Add Error parameter to bdrv_append()
Aborting on error in bdrv_append() isn't correct. This patch fixes it
and lets the callers handle failures.

Test case 085 needs a reference output update. This is caused by the
reversed order of bdrv_set_backing_hd() and change_parent_backing_link()
in bdrv_append(): When the backing file of the new node is set, the
parent nodes are still pointing to the old top, so the backing blocker
is now initialised with the node name rather than the BlockBackend name.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2017-02-28 20:47:51 +01:00
Kevin Wolf 12fa4af61f block: Add Error parameter to bdrv_set_backing_hd()
Not all callers of bdrv_set_backing_hd() know for sure that attaching
the backing file will be allowed by the permission system. Return the
error from the function rather than aborting.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2017-02-28 20:47:51 +01:00
Kevin Wolf c8f6d58edb block: Assertions for resize permission
This adds an assertion that ensures that the necessary resize permission
has been granted before bdrv_truncate() is called.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
2017-02-28 20:47:50 +01:00
Kevin Wolf afa4b29323 block: Assertions for write permissions
This adds assertions that ensure that the necessary write permissions
have been granted before someone attempts to write to a node.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2017-02-28 20:47:50 +01:00
Kevin Wolf 85c97ca7a1 block: Pass BdrvChild to bdrv_aligned_preadv/pwritev and copy-on-read
This is where we want to check the permissions, so we need to have the
BdrvChild around where they are stored.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2017-02-28 20:47:50 +01:00
Kevin Wolf 887354bd13 hmp: Request permissions in qemu-io
The HMP command 'qemu-io' is a bit tricky because it wants to work on
the original BlockBackend, but additional permissions could be required.
The details are explained in a comment in the code, but in summary, just
request whatever permissions the current qemu-io command needs.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
2017-02-28 20:47:50 +01:00
Kevin Wolf 0db832f42e commit: Add filter-node-name to block-commit
Management tools need to be able to know about every node in the graph
and need a way to address them. Changing the graph structure was okay
because libvirt doesn't really manage the node level yet, but future
libvirt versions need to deal with both new and old version of qemu.

This new option to blockdev-commit allows the client to set a node-name
for the automatically inserted filter driver, and at the same time
serves as a witness for a future libvirt that this version of qemu does
automatically insert a filter driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2017-02-28 20:47:50 +01:00
Kevin Wolf 6cdbceb12c mirror: Add filter-node-name to blockdev-mirror
Management tools need to be able to know about every node in the graph
and need a way to address them. Changing the graph structure was okay
because libvirt doesn't really manage the node level yet, but future
libvirt versions need to deal with both new and old version of qemu.

This new option to blockdev-mirror allows the client to set a node-name
for the automatically inserted filter driver, and at the same time
serves as a witness for a future libvirt that this version of qemu does
automatically insert a filter driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
2017-02-28 20:47:50 +01:00
Kevin Wolf a170a91fd3 stream: Use real permissions in streaming block job
The correct permissions are relatively obvious here (and explained in
code comments). For intermediate streaming, we need to reopen the top
node read-write before creating the job now because the permissions
system catches attempts to get the BLK_PERM_WRITE_UNCHANGED permission
on a read-only node.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
2017-02-28 20:47:50 +01:00
Kevin Wolf 4ef85a9c23 mirror: Use real permissions in mirror/active commit block job
The mirror block job is mainly used for two different scenarios:
Mirroring to an otherwise unused, independent target node, or for active
commit where the target node is part of the backing chain of the source.

Similarly to the commit block job patch, we need to insert a new filter
node to keep the permissions correct during active commit.

Note that one change this implies is that job->blk points to
mirror_top_bs as its root now, and mirror_top_bs (rather than the actual
source node) contains the bs->job pointer. This requires qemu-img commit
to get the job by name now rather than just taking bs->job.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
Acked-by: Max Reitz <mreitz@redhat.com>
2017-02-28 20:47:46 +01:00
Kevin Wolf 4e9e4323d5 backup: Use real permissions in backup block job
The backup block job doesn't have very complicated requirements: It
needs to read from the source and write to the target, but it's fine
with either side being changed. The only restriction is that we can't
resize the image because the job uses a cached value.

qemu-iotests 055 needs to be changed because it used a target which was
already attached to a virtio-blk device. The permission system correctly
forbids this (virtio-blk can't accept another writer with its default
share-rw=off).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
2017-02-28 20:40:37 +01:00
Kevin Wolf d3f0675922 commit: Use real permissions for HMP 'commit'
This is a little simpler than the commit block job because it's
synchronous and only commits into the immediate backing file, but
otherwise doing more or less the same.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
2017-02-28 20:40:37 +01:00
Kevin Wolf 8dfba27977 commit: Use real permissions in commit block job
This is probably one of the most interesting conversions to the new
op blocker system because a commit block job intentionally leaves some
intermediate block nodes in the backing chain that aren't valid on their
own any more; only the whole chain together results in a valid view.

In order to provide the 'consistent read' permission to the parents of
the 'top' node of the commit job, a new filter block driver is inserted
above 'top' which doesn't require 'consistent read' on its backing
chain. Subsequently, the commit job can block 'consistent read' on all
intermediate nodes without causing a conflict.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2017-02-28 20:40:37 +01:00
Kevin Wolf 76d554e20b blockjob: Add permissions to block_job_add_bdrv()
Block jobs don't actually do I/O through the the reference they create
with block_job_add_bdrv(), but they might want to use the permisssion
system to express what the block job does to intermediate nodes. This
adds permissions to block_job_add_bdrv() to provide the means to request
permissions.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
2017-02-28 20:40:37 +01:00