Commit Graph

2995 Commits

Author SHA1 Message Date
Markus Armbruster e152ef7a98 gluster: Don't duplicate qapi-util.c's qapi_enum_parse()
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07 14:53:29 +01:00
Markus Armbruster fc29458dee gluster: Drop assumptions on SocketTransport names
qemu_gluster_glfs_init() passes the names of QAPI enumeration type
SocketTransport to glfs_set_volfile_server().  Works, because they
were chosen to match.  But the coupling is artificial.  Use the
appropriate literal strings instead.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07 14:53:28 +01:00
Markus Armbruster 831acdc95e sheepdog: Implement bdrv_parse_filename()
This permits configuration with driver-specific options in addition to
pseudo-filename parsed as URI.  For instance,

    --drive driver=sheepdog,host=fido,vdi=dolly

instead of

    --drive driver=sheepdog,file=sheepdog://fido/dolly

It's also a first step towards supporting blockdev-add.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07 14:53:28 +01:00
Markus Armbruster 8ecc2f9eab sheepdog: Use SocketAddress and socket_connect()
sd_parse_uri() builds a string from host and port parts for
inet_connect().  inet_connect() parses it into host, port and options.
Whether this gets exactly the same host, port and no options for all
inputs is not obvious.

Cut out the string middleman and build a SocketAddress for
socket_connect() instead.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07 14:53:28 +01:00
Markus Armbruster 36bcac16fd sheepdog: Report errors in pseudo-filename more usefully
Errors in the pseudo-filename are all reported with the same laconic
"Can't parse filename" message.

Add real error reporting, such as:

    $ qemu-system-x86_64 --drive driver=sheepdog,filename=sheepdog:///
    qemu-system-x86_64: --drive driver=sheepdog,filename=sheepdog:///: missing file path in URI
    $ qemu-system-x86_64 --drive driver=sheepdog,filename=sheepgod:///vdi
    qemu-system-x86_64: --drive driver=sheepdog,filename=sheepgod:///vdi: URI scheme must be 'sheepdog', 'sheepdog+tcp', or 'sheepdog+unix'
    $ qemu-system-x86_64 --drive driver=sheepdog,filename=sheepdog+unix:///vdi?socke=sheepdog.sock
    qemu-system-x86_64: --drive driver=sheepdog,filename=sheepdog+unix:///vdi?socke=sheepdog.sock: unexpected query parameters

The code to translate legacy syntax to URI fails to escape URI
meta-characters.  The new error messages are misleading then.  Replace
them by the old "Can't parse filename" message.  "Internal error"
would be more honest.  Anyway, no worse than before.  Also add a FIXME
comment.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07 14:53:28 +01:00
Markus Armbruster daa0b0d4b1 sheepdog: Don't truncate long VDI name in _open(), _create()
sd_parse_uri() truncates long VDI names silently.  Reject them
instead.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07 14:53:28 +01:00
Markus Armbruster 89e2a31d33 sheepdog: Fix snapshot ID parsing in _open(), _create, _goto()
sd_parse_uri() and sd_snapshot_goto() screw up error checking after
strtoul(), and truncate long tag names silently.  Fix by replacing
those parts by new sd_parse_snapid_or_tag(), which checks more
carefully.

sd_snapshot_delete() also parses snapshot IDs, but is currently too
broken for me to touch.  Mark TODO.

Two calls of strtol() without error checking remain in
parse_redundancy().  Mark them FIXME.

More silent truncation of configuration strings remains elsewhere.
Not marked.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07 14:53:28 +01:00
Markus Armbruster a0dc0e2bfe sheepdog: Mark sd_snapshot_delete() lossage FIXME
sd_snapshot_delete() should delete the snapshot whose ID matches
@snapshot_id and whose name matches @name.  But that's not what it
does.  If @snapshot_id is a valid ID, it deletes the snapshot with
that ID, else it deletes the snapshot with that name.  It doesn't use
@name at all.  Add suitable FIXME comments, so someone who actually
knows Sheepdog can fix it.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07 14:53:28 +01:00
Markus Armbruster 48d7c4af06 sheepdog: Fix error handling sd_create()
As a bdrv_create() method, sd_create() must set an error and return
negative errno on failure.  It prints the error instead of setting it
when connect_to_sdog() fails.  Fix that.

While there, return the value of connect_to_sdog() like we do
elsewhere, instead of -EIO.  No functional change, as
connect_to_sdog() returns no other error code.

Many more suspicious uses of error_report() and error_report_err()
remain in other functions.  Left for another day.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07 14:53:28 +01:00
Markus Armbruster e25cad6921 sheepdog: Fix error handling in sd_snapshot_delete()
As a bdrv_snapshot_delete() method, sd_snapshot_delete() must set an
error and return negative errno on failure.  It sometimes returns -1,
and sometimes neglects to set an error.  It also prints error messages
with error_report().  Fix all that.

Moreover, its handling of an attempt to delete a nonexistent snapshot
is wrong: it error_report()s and succeeds.  Fix it to set an error and
return -ENOENT instead.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07 14:53:28 +01:00
Markus Armbruster cbc488ee2a sheepdog: Defuse time bomb in sd_open() error handling
When qemu_opts_absorb_qdict() fails, sd_open() closes stdin, because
sd->fd is still zero.  Fortunately, qemu_opts_absorb_qdict() can't
fail, because:

1. it only fails when qemu_opt_parse() fails, and
2. the only member of runtime_opts.desc[] is a QEMU_OPT_STRING, and
3. qemu_opt_parse() can't fail for QEMU_OPT_STRING.

Defuse this ticking time bomb by jumping behind the file descriptor
cleanup on error.

Also do that for the error paths where sd->fd is still -1.  The file
descriptor cleanup happens to do nothing then, but let's not rely on
that here.

While there, rename label out to err, because it's on the error path,
not the normal path out of the function.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07 14:53:28 +01:00
Kevin Wolf 5fe31c25cc block: Fix error handling in bdrv_replace_in_backing_chain()
When adding an Error parameter, bdrv_replace_in_backing_chain() would
become nothing more than a wrapper around change_parent_backing_link().
So make the latter public, renamed as bdrv_replace_node(), and remove
bdrv_replace_in_backing_chain().

Most of the callers just remove a node from the graph that they just
inserted, so they can use &error_abort, but completion of a mirror job
with 'replaces' set can actually fail.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-07 14:53:28 +01:00
Kevin Wolf 88f9d1b3d2 mirror: Fix error path for dirty bitmap creation
mirror_top_bs must be removed from the graph again when creating the
dirty bitmap fails.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-07 14:53:28 +01:00
Kevin Wolf 0bf74767ff mirror: Fix permissions for removing mirror_top_bs
mirror_top_bs takes write permissions on its backing file, which can
make it impossible to attach that backing file node to another parent.
However, this is exactly what needs to be done in order to remove
mirror_top_bs from the backing chain. So give up the write permission
first.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-07 14:53:28 +01:00
Kevin Wolf 7d9fcb391c mirror: Fix permission problem with 'replaces'
The 'replaces' option of drive-mirror can be used to mirror a Quorum
node to a new image and then let the target image replace one of the
Quorum children. In order for this graph modification to succeed, the
mirror job needs to lift its restrictions on the target node first
before actually replacing the child.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-07 14:53:28 +01:00
Kevin Wolf b247767aac commit: Fix error handling
Apparently some kind of mismerge happened in commit 8dfba279, which
broke the error handling without any real reason by removing the
assignment of the return value to ret in a blk_insert_bs() call.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-03-07 14:53:28 +01:00
Markus Armbruster 048abb7b20 qapi: Drop unused non-strict qobject input visitor
The split between tests/test-qobject-input-visitor.c and
tests/test-qobject-input-strict.c now makes less sense than ever.  The
next commit will take care of that.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1488544368-30622-20-git-send-email-armbru@redhat.com>
2017-03-05 09:14:19 +01:00
Paolo Bonzini f6eb0b319e iscsi: fix missing unlock
Reported by Coverity.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-03 16:41:20 +01:00
Peter Maydell 54639aed97 -----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJYt5TDAAoJEL2+eyfA3jBXL7UP/ROAc7tc0e4F2+5lPE88bLfJ
 i4dBI092Qt4LG73POoQDbgpYahs5ywLYE7MAJP2GmgiiUOj4XFN/K/d6wJluqrGj
 mLiOTlMcYmr2Kbi5CMfZBWlkd27GD/g1knCzpqtAOpM/zIxAwnRr4Mq8RXAHSgio
 tTvgtNr5JG9tYU088COwY7otr6WDhIMwESN5KQcLiOxmo4GtNkpBmpwfMdsgij9X
 mxfjQSdWzutITgsA28JiOMfWMbCiLYkwUgo/KCVIG4aB9mNKgEOx6YTIuGXspx2E
 8DaGLW54KagsuAhEHFCKFecmYCRjaDCZhwcMEEdif+mn1k1bJGtjK+roNFvn2vEy
 8qdKDTn4B7f6uR0fSGNviti04NNpCZTlPmhWpupqRWetY9vCdYNoRjtxRgh6t7wd
 rziBkN7FW/yxvr2GEXb5lLtH0YN+YTOw6FIXoRS8zKoNNtNuJaKxymfQ57VHxF0V
 BG9A+XMAIPgCudoqtBIpyaNxlByqhFKm6Z7aMNzzdZoohANtQxsAO6mAxFKk0yTp
 3zhOywglHd8wtmTVnZEtQTpdqk/Swd9/qCig6hHq+3U3QydyrHM7ZZbB/Z22xiaO
 53DDiLvPr8VbwCZQcpjXg0YOcQXTEMY/xzxDn5AQg10nyWYBr7HKKTxfmT1aew/N
 CRQi8CycU0TVneRnPMsg
 =LvuA
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/cody/tags/block-pull-request' into staging

# gpg: Signature made Thu 02 Mar 2017 03:42:59 GMT
# gpg:                using RSA key 0xBDBE7B27C0DE3057
# gpg: Good signature from "Jeffrey Cody <jcody@redhat.com>"
# gpg:                 aka "Jeffrey Cody <jeff@codyprime.org>"
# gpg:                 aka "Jeffrey Cody <codyprime@gmail.com>"
# Primary key fingerprint: 9957 4B4D 3474 90E7 9D98  D624 BDBE 7B27 C0DE 3057

* remotes/cody/tags/block-pull-request:
  block/rbd: add support for 'mon_host', 'auth_supported' via QAPI
  block/rbd: add blockdev-add support
  block/rbd: parse all options via bdrv_parse_filename
  block/rbd: add all the currently supported runtime_opts
  block/rbd: don't copy strings in qemu_rbd_next_tok()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-02 23:20:37 +00:00
Jeff Cody 0a55679b4a block/rbd: add support for 'mon_host', 'auth_supported' via QAPI
This adds support for three additional options that may be specified
by QAPI in blockdev-add:

    server: host, port
    auth method: either 'cephx' or 'none'

The "server" and "auth-supported" QAPI parameters are arrays.  To conform
with the rados API, the array items are join as a single string with a ';'
character as a delimiter when setting the configuration values.

Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-01 22:39:25 -05:00
Kevin Wolf b2c2832c61 block: Add Error parameter to bdrv_append()
Aborting on error in bdrv_append() isn't correct. This patch fixes it
and lets the callers handle failures.

Test case 085 needs a reference output update. This is caused by the
reversed order of bdrv_set_backing_hd() and change_parent_backing_link()
in bdrv_append(): When the backing file of the new node is set, the
parent nodes are still pointing to the old top, so the backing blocker
is now initialised with the node name rather than the BlockBackend name.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2017-02-28 20:47:51 +01:00
Kevin Wolf 12fa4af61f block: Add Error parameter to bdrv_set_backing_hd()
Not all callers of bdrv_set_backing_hd() know for sure that attaching
the backing file will be allowed by the permission system. Return the
error from the function rather than aborting.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2017-02-28 20:47:51 +01:00
Kevin Wolf c8f6d58edb block: Assertions for resize permission
This adds an assertion that ensures that the necessary resize permission
has been granted before bdrv_truncate() is called.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
2017-02-28 20:47:50 +01:00
Kevin Wolf afa4b29323 block: Assertions for write permissions
This adds assertions that ensure that the necessary write permissions
have been granted before someone attempts to write to a node.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2017-02-28 20:47:50 +01:00
Kevin Wolf 85c97ca7a1 block: Pass BdrvChild to bdrv_aligned_preadv/pwritev and copy-on-read
This is where we want to check the permissions, so we need to have the
BdrvChild around where they are stored.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2017-02-28 20:47:50 +01:00
Kevin Wolf 887354bd13 hmp: Request permissions in qemu-io
The HMP command 'qemu-io' is a bit tricky because it wants to work on
the original BlockBackend, but additional permissions could be required.
The details are explained in a comment in the code, but in summary, just
request whatever permissions the current qemu-io command needs.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
2017-02-28 20:47:50 +01:00
Kevin Wolf 0db832f42e commit: Add filter-node-name to block-commit
Management tools need to be able to know about every node in the graph
and need a way to address them. Changing the graph structure was okay
because libvirt doesn't really manage the node level yet, but future
libvirt versions need to deal with both new and old version of qemu.

This new option to blockdev-commit allows the client to set a node-name
for the automatically inserted filter driver, and at the same time
serves as a witness for a future libvirt that this version of qemu does
automatically insert a filter driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2017-02-28 20:47:50 +01:00
Kevin Wolf 6cdbceb12c mirror: Add filter-node-name to blockdev-mirror
Management tools need to be able to know about every node in the graph
and need a way to address them. Changing the graph structure was okay
because libvirt doesn't really manage the node level yet, but future
libvirt versions need to deal with both new and old version of qemu.

This new option to blockdev-mirror allows the client to set a node-name
for the automatically inserted filter driver, and at the same time
serves as a witness for a future libvirt that this version of qemu does
automatically insert a filter driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
2017-02-28 20:47:50 +01:00
Kevin Wolf a170a91fd3 stream: Use real permissions in streaming block job
The correct permissions are relatively obvious here (and explained in
code comments). For intermediate streaming, we need to reopen the top
node read-write before creating the job now because the permissions
system catches attempts to get the BLK_PERM_WRITE_UNCHANGED permission
on a read-only node.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
2017-02-28 20:47:50 +01:00
Kevin Wolf 4ef85a9c23 mirror: Use real permissions in mirror/active commit block job
The mirror block job is mainly used for two different scenarios:
Mirroring to an otherwise unused, independent target node, or for active
commit where the target node is part of the backing chain of the source.

Similarly to the commit block job patch, we need to insert a new filter
node to keep the permissions correct during active commit.

Note that one change this implies is that job->blk points to
mirror_top_bs as its root now, and mirror_top_bs (rather than the actual
source node) contains the bs->job pointer. This requires qemu-img commit
to get the job by name now rather than just taking bs->job.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
Acked-by: Max Reitz <mreitz@redhat.com>
2017-02-28 20:47:46 +01:00
Kevin Wolf 4e9e4323d5 backup: Use real permissions in backup block job
The backup block job doesn't have very complicated requirements: It
needs to read from the source and write to the target, but it's fine
with either side being changed. The only restriction is that we can't
resize the image because the job uses a cached value.

qemu-iotests 055 needs to be changed because it used a target which was
already attached to a virtio-blk device. The permission system correctly
forbids this (virtio-blk can't accept another writer with its default
share-rw=off).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
2017-02-28 20:40:37 +01:00
Kevin Wolf d3f0675922 commit: Use real permissions for HMP 'commit'
This is a little simpler than the commit block job because it's
synchronous and only commits into the immediate backing file, but
otherwise doing more or less the same.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
2017-02-28 20:40:37 +01:00
Kevin Wolf 8dfba27977 commit: Use real permissions in commit block job
This is probably one of the most interesting conversions to the new
op blocker system because a commit block job intentionally leaves some
intermediate block nodes in the backing chain that aren't valid on their
own any more; only the whole chain together results in a valid view.

In order to provide the 'consistent read' permission to the parents of
the 'top' node of the commit job, a new filter block driver is inserted
above 'top' which doesn't require 'consistent read' on its backing
chain. Subsequently, the commit job can block 'consistent read' on all
intermediate nodes without causing a conflict.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2017-02-28 20:40:37 +01:00
Kevin Wolf 76d554e20b blockjob: Add permissions to block_job_add_bdrv()
Block jobs don't actually do I/O through the the reference they create
with block_job_add_bdrv(), but they might want to use the permisssion
system to express what the block job does to intermediate nodes. This
adds permissions to block_job_add_bdrv() to provide the means to request
permissions.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
2017-02-28 20:40:37 +01:00
Kevin Wolf b541155587 block: Add BdrvChildRole.get_parent_desc()
For meaningful error messages in the permission system, we need to get
some human-readable description of the parent of a BdrvChild.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
2017-02-28 20:40:37 +01:00
Kevin Wolf c6cc12bfa7 blockjob: Add permissions to block_job_create()
This functions creates a BlockBackend internally, so the block jobs need
to tell it what they want to do with the BB.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
2017-02-28 20:40:37 +01:00
Kevin Wolf 39829a01ae block: Allow error return in BlockDevOps.change_media_cb()
Some devices allow a media change between read-only and read-write
media. They need to adapt the permissions in their .change_media_cb()
implementation, which can fail. So add an Error parameter to the
function.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
2017-02-28 20:40:36 +01:00
Kevin Wolf c62d32f503 block: Request real permissions in blk_new_open()
We can figure out the necessary permissions from the flags that the
caller passed.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
2017-02-28 20:40:36 +01:00
Kevin Wolf 55880601d8 block: Add BDRV_O_RESIZE for blk_new_open()
blk_new_open() is a convenience function that processes flags rather
than QDict options as a simple way to just open an image file.

In order to keep it convenient in the future, it must automatically
request the necessary permissions. This can easily be inferred from the
flags for read and write, but we need another flag that tells us whether
to get the resize permission.

We can't just always request it because that means that no block jobs
can run on the resulting BlockBackend (which is something that e.g.
qemu-img commit wants to do), but we also can't request it never because
most of the .bdrv_create() implementations call blk_truncate().

The solution is to introduce another flag that is passed by all users
that want to resize the image.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2017-02-28 20:40:36 +01:00
Kevin Wolf d7086422b1 block: Add error parameter to blk_insert_bs()
Now that blk_insert_bs() requests the BlockBackend permissions for the
node it attaches to, it can fail. Instead of aborting, pass the errors
to the callers.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
2017-02-28 20:40:36 +01:00
Kevin Wolf 6d0eb64d5c block: Add permissions to blk_new()
We want every user to be specific about the permissions it needs, so
we'll pass the initial permissions as parameters to blk_new(). A user
only needs to call blk_set_perm() if it wants to change the permissions
after the fact.

The permissions are stored in the BlockBackend and applied whenever a
BlockDriverState should be attached in blk_insert_bs().

This does not include actually choosing the right set of permissions
everywhere yet. Instead, the usual FIXME comment is added to each place
and will be addressed in individual patches.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
2017-02-28 20:40:36 +01:00
Kevin Wolf 981776b348 block: Add permissions to BlockBackend
The BlockBackend can now store the permissions that its user requires.
This is necessary because nodes can be ejected from or inserted into a
BlockBackend and all of these operations must make sure that the user
still gets what it requested initially.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
2017-02-28 20:40:36 +01:00
Kevin Wolf 91ef38257a vvfat: Implement .bdrv_child_perm()
vvfat is the last remaining driver that can have children, but doesn't
implement .bdrv_child_perm() yet. The default handlers aren't suitable
here, so let's implement a very simple driver-specific one that protects
the internal child from being used by other users as good as our
permissions permit.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
2017-02-28 20:40:36 +01:00
Kevin Wolf 862f215fab block: Request child permissions in format drivers
This makes use of the .bdrv_child_perm() implementation for formats that
we just added. All format drivers expose the permissions they actually
need nows, so that they can be set accordingly and updated when parents
are attached or detached.

The only format not included here is raw, which was already converted
with the other filter drivers.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
2017-02-28 20:40:36 +01:00
Kevin Wolf d7010dfb68 block: Request child permissions in filter drivers
All callers will have to request permissions for all of their child
nodes. Block drivers that act as simply filters can use the default
implementation of .bdrv_child_perm().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
2017-02-28 20:40:36 +01:00
Kevin Wolf d5e6f437c5 block: Let callers request permissions when attaching a child node
When attaching a node as a child to a new parent, the required and
shared permissions for this parent are checked against all other parents
of the node now, and an error is returned if there is a conflict.

This allows error returns to a function that previously always
succeeded, and the same is true for quite a few callers and their
callers. Converting all of them within the same patch would be too much,
so for now everyone tells that they don't need any permissions and allow
everyone else to do anything. This way we can use &error_abort initially
and convert caller by caller to pass actual permission requirements and
implement error handling.

All these places are marked with FIXME comments and it will be the job
of the next patches to clean them up again.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
2017-02-28 20:40:36 +01:00
Kevin Wolf 8b2ff5291f block: Add Error argument to bdrv_attach_child()
It will have to return an error soon, so prepare the callers for it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
2017-02-28 20:40:35 +01:00
Jeff Cody c7cacb3e7a block/rbd: parse all options via bdrv_parse_filename
Get rid of qemu_rbd_parsename in favor of bdrv_parse_filename.
This simplifies a lot of the parsing as well, as we can treat everything
a bit simpler since nonexistent options are simply NULL pointers instead
of empty strings.

An important item to note:

Ceph has many extra option values that can be specified as key/value
pairs.  This was handled previously in the driver by extracting the
values that the QEMU driver cared about, and then blindly passing all
extra options to rbd after splitting them into key/value pairs, and
cleaning up any special character escaping.

The practice is continued in this patch; there is an option
"keyvalue-pairs" that is populated with all the key/value pairs that the
QEMU driver does not care about.  These key/value pairs will override
any settings in the 'conf' configuration file, just as they did before.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-02-28 11:32:31 -05:00
Jeff Cody 0f9d252de4 block/rbd: add all the currently supported runtime_opts
This adds all the currently supported runtime opts, which
are the options as parsed from the filename.  All of these
options are explicitly checked for during during runtime,
with an exception to the "keyvalue-pairs" option.

This option contains all the key/value pairs that the QEMU rbd
driver merely unescapes, and passes along blindly to rados.  This
option is a "legacy" option, and will not be exposed in the QAPI
or available for introspection.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-02-28 11:32:25 -05:00
Jeff Cody 7830f90998 block/rbd: don't copy strings in qemu_rbd_next_tok()
This patch is prep work for parsing options for .bdrv_parse_filename,
and using QDict options.

The function qemu_rbd_next_tok() searched for various key/value pairs,
and copied them into buffers.  This will soon be an unnecessary extra
step, so we will now return found strings by reference only, and
offload the responsibility for safely handling/coping these strings to
the caller.

This also cleans up error handling some, as the callers now rely on
the Error object to determine if there is a parse error.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-02-28 11:32:06 -05:00
Peter Maydell c8c0a1a784 -----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJYtP3aAAoJEL2+eyfA3jBXBBIQAJ+Cuv7NTZYbRXTe/soZ5Lz5
 rO6fNsfHj7b/xx5Sti+vSMvq7EpVpQ51qcf0XrVNh+BGCElW43st2567tA5PGil/
 brqWgVCP4MygFVGm9Fbk6txVwnotvnwkw417TVKcaeYeq8hfTo7CyjQK8+hg+C0A
 0B//4ARg8mM0YYg9NL4ihjNP/3E6Js9oAneuADv42M7SwB89wiIwh3tHC0pXDJbt
 IuJCqdR7/yaJyK8ccbDuUkY/qMEDK5VUQDVloPvvWrCcZKVWt/kFh7+5bTIzCu6D
 4s4k1j2L5kV00JnnzH6FNAWp3VY2CFdHDGrFV6rFiOiSo5tcGv1K4uINiq7KeEPL
 kIefFhlTgIKbSAgqk1fDSh4y96Eo3Vez0bJ5KnPvMx4z5EmN5HjiCW0qibmbCM5z
 uhTKZbZAI90KUAPjkGgJf8URkJjAGxdJQD7W3d7jx4xn010bKOQYjXmuX3qzoYiP
 vENak0sG9wN5QU2KIDymUd+xE4BgBo2cq+7CpOu3CyO+9O8ESm820vLZA5cdtr7N
 5iqeDsBMJjL3LX8o3cuqw+DLKF3vRqZdw4PhuZSaM8ccqCuGSyyywvtoZ4hOYkwS
 ZGa1nABmiFaMmxl3FHj1/0MF4X4yAwj2Ae5ublHoIlGpiNpZartlVu+D/sGtQ3SC
 +MOqFgUhc2oRRKqflsn4
 =tgky
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/cody/tags/block-pull-request' into staging

# gpg: Signature made Tue 28 Feb 2017 04:34:34 GMT
# gpg:                using RSA key 0xBDBE7B27C0DE3057
# gpg: Good signature from "Jeffrey Cody <jcody@redhat.com>"
# gpg:                 aka "Jeffrey Cody <jeff@codyprime.org>"
# gpg:                 aka "Jeffrey Cody <codyprime@gmail.com>"
# Primary key fingerprint: 9957 4B4D 3474 90E7 9D98  D624 BDBE 7B27 C0DE 3057

* remotes/cody/tags/block-pull-request:
  iscsi: add missing colons to the qapi docs
  block/mirror: fix broken sparseness detection

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-02-28 13:41:03 +00:00
John Snow 39c11580f3 block/mirror: fix broken sparseness detection
int64_t is in all likelihood the actual scalar type we want.
Yep, really.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1219541

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-02-27 14:02:31 -05:00
Paolo Bonzini d045c466d9 iscsi: do not use aio_context_acquire/release
Now that all bottom halves and callbacks take care of taking the
AioContext lock, we can migrate some users away from it and to a
specific QemuMutex or CoMutex.

Protect libiscsi calls with a QemuMutex.  Callbacks are invoked
using bottom halves, so we don't even have to drop it around
callback invocations.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20170222180725.28611-4-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-27 13:58:58 +00:00
Paolo Bonzini 37d1e4d9bf nfs: do not use aio_context_acquire/release
Now that all bottom halves and callbacks take care of taking the
AioContext lock, we can migrate some users away from it and to a
specific QemuMutex or CoMutex.

Protect libnfs calls with a QemuMutex.  Callbacks are invoked
using bottom halves, so we don't even have to drop it around
callback invocations.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20170222180725.28611-3-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-27 13:58:53 +00:00
Paolo Bonzini ba3186c4e4 curl: do not use aio_context_acquire/release
Now that all bottom halves and callbacks take care of taking the
AioContext lock, we can migrate some users away from it and to a
specific QemuMutex or CoMutex.

Protect BDRVCURLState access with a QemuMutex.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20170222180725.28611-2-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-27 13:33:24 +00:00
Peter Maydell 6b4e463ff3 Block layer patches
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJYsHaaAAoJEH8JsnLIjy/WvC4P/iw/WC6XySNzPMOzICrJr9fc
 IHz447+0MqoIBWqGRFLEU8z4t6k8HGt2YnWKFuS1N+2gU5fSgOdp20nAA+pQvxRb
 RTyQL7BNJPvFNzrEQPTpdvBt79veYsoNe5dNfSq01z/PdgxMQR4PkS96Jm8w4Jdu
 20ZPfULws8+Wq1Snsxl4PdnFpY2n97OOGQRCjl50h5ypol5+fXDDzAvp/GPf6Q8B
 09OTWmQ4UIr4OZqT83T1kDdRZRRMIxiRP5qTTKdh0BlJEQdBjrJ9fTClY4bMcIgl
 VP/3kzkmdoqSW+4D+0L88q7vyvM3Kwc6n2PFDaRf6Lgy9ueYoPKuW9AF5vzcxhED
 AI0SN9boM+4z1P75kalBaHg/BiRL7UDwpHgdanPVcENoP8IywsKzOIvdjOpqINmX
 uLU3QfK+qK6x7gflhVXiX2sFzTLzZQlWu5KPjW6QfuMzUuRAeksfMDXh2GXNFVG1
 igGnORyqB2nOGNAsudtMrOvjHQRSbwGsnvWcwvaX0swrxOb3oAwc3X+E4nxE/K5/
 wJQLrCM9DfD3CKKlKiu+1wQXx6zqaGYIsNxLcrSylJ3TPsGuxfkx3b9GVCCQ0Y+e
 YSdwWSAIS9LxR4qISJ5ehlnYDA0sRtdco0xGW6ACNE5IrgREYUcH4EJOXbxZ4NHs
 BK/DVdYhFtNZ60zHeEn4
 =OYgF
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block layer patches

# gpg: Signature made Fri 24 Feb 2017 18:08:26 GMT
# gpg:                using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream:
  tests: Use opened block node for block job tests
  vvfat: Use opened node as backing file
  block: Add bdrv_new_open_driver()
  block: Factor out bdrv_open_driver()
  block: Use BlockBackend for image probing
  block: Factor out bdrv_open_child_bs()
  block: Attach bs->file only during .bdrv_open()
  block: Pass BdrvChild to bdrv_truncate()
  mirror: Resize active commit base in mirror_run()
  qcow2: Use BB for resizing in qcow2_amend_options()
  blockdev: Use BlockBackend to resize in qmp_block_resize()
  iotests: Fix another race in 030
  qemu-img: Improve documentation for PREALLOC_MODE_FALLOC
  qemu-img: Truncate before full preallocation
  qemu-img: Add tests for raw image preallocation
  qemu-img: Do not truncate before preallocation
  qemu-iotests: redirect nbd server stdout to /dev/null
  qemu-iotests: add ability to exclude certain protocols from tests
  qemu-iotests: Test 137 only supports 'file' protocol

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-02-26 12:26:37 +00:00
tianqing 1d393bdeae RBD: Add support readv,writev for rbd
Rbd can do readv and writev directly, so wo do not need to transform
iov to buf or vice versa any more.

Signed-off-by: tianqing <tianqing@unitedstack.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-02-24 12:43:01 -05:00
Peter Lieven ef503a8417 block/nfs: try to avoid the bounce buffer in pwritev
if the passed qiov contains exactly one iov we can
pass the buffer directly.

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1487349541-10201-3-git-send-email-pl@kamp.de
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-02-24 12:38:35 -05:00
Peter Lieven 69785a229d block/nfs: convert to preadv / pwritev
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1487349541-10201-2-git-send-email-pl@kamp.de
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-02-24 12:38:35 -05:00
Kevin Wolf a8a4d15c1c vvfat: Use opened node as backing file
We should not try to assign a not yet opened node as the backing file,
because as soon as the permission system is added it will fail.  The
just added bdrv_new_open_driver() function is the right tool to open a
file with an internal driver, use it.

In case anyone wonders whether that magic fake backing file to trigger a
special action on 'commit' actually works today: No, not for me. One
reason is that we've been adding a raw format driver on top for several
years now and raw doesn't support commit. Other reasons include that the
backing file isn't writable and the driver doesn't support reopen, and
it's also size 0 and the driver doesn't support bdrv_truncate. All of
these are easily fixable, but then 'commit' ended up in an infinite loop
deep in the vvfat code for me, so I thought I'd best leave it alone. I'm
not really sure what it was supposed to do anyway.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2017-02-24 16:09:23 +01:00
Kevin Wolf 4e4bf5c42c block: Attach bs->file only during .bdrv_open()
The way that attaching bs->file worked was a bit unusual in that it was
the only child that would be attached to a node which is not opened yet.
Because of this, the block layer couldn't know yet which permissions the
driver would eventually need.

This patch moves the point where bs->file is attached to the beginning
of the individual .bdrv_open() implementations, so drivers already know
what they are going to do with the child. This is also more consistent
with how driver-specific children work.

For a moment, bdrv_open() gets its own BdrvChild to perform image
probing, but instead of directly assigning this BdrvChild to the BDS, it
becomes a temporary one and the node name is passed as an option to the
drivers, so that they can simply use bdrv_open_child() to create another
reference for their own use.

This duplicated child for (the not opened yet) bs is not the final
state, a follow-up patch will change the image probing code to use a
BlockBackend, which is completely independent of bs.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2017-02-24 16:09:23 +01:00
Kevin Wolf 52cdbc5869 block: Pass BdrvChild to bdrv_truncate()
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2017-02-24 16:09:23 +01:00
Kevin Wolf becc347e1c mirror: Resize active commit base in mirror_run()
This is more consistent with the commit block job, and it moves the code
to a place where we already have the necessary BlockBackends to resize
the base image when bdrv_truncate() is changed to require a BdrvChild.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2017-02-24 16:09:23 +01:00
Kevin Wolf 70b27f3643 qcow2: Use BB for resizing in qcow2_amend_options()
In order to able to convert bdrv_truncate() to take a BdrvChild and
later to correctly check the resize permission here, we need to use a
BlockBackend for resizing the image.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2017-02-24 16:09:23 +01:00
Nir Soffer c6ccc2c5e6 qemu-img: Improve documentation for PREALLOC_MODE_FALLOC
Now that we are truncating the file in both PREALLOC_MODE_FULL and
PREALLOC_MODE_OFF, not truncating in PREALLOC_MODE_FALLOC looks odd.
Add a comment explaining why we do not truncate in this case.

Signed-off-by: Nir Soffer <nirsof@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-02-24 16:09:23 +01:00
Nir Soffer 5a1dad9d5a qemu-img: Truncate before full preallocation
In a previous commit (qemu-img: Do not truncate before preallocation) we
moved truncate to the PREALLOC_MODE_OFF branch to avoid slowdown in
posix_fallocate().

However this change is not optimal when using PREALLOC_MODE_FULL, since
knowing the final size from the beginning could allow the file system
driver to do less allocations and possibly avoid fragmentation of the
file.

Now we truncate also before doing full preallocation.

Signed-off-by: Nir Soffer <nirsof@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-02-24 16:09:23 +01:00
Nir Soffer f6a7240442 qemu-img: Do not truncate before preallocation
When using file system that does not support fallocate() (e.g. NFS <
4.2), truncating the file only when preallocation=OFF speeds up creating
raw file.

Here is example run, tested on Fedora 24 machine, creating raw file on
NFS version 3 server.

$ time ./qemu-img-master create -f raw -o preallocation=falloc mnt/test 1g
Formatting 'mnt/test', fmt=raw size=1073741824 preallocation=falloc

real	0m21.185s
user	0m0.022s
sys	0m0.574s

$ time ./qemu-img-fix create -f raw -o preallocation=falloc mnt/test 1g
Formatting 'mnt/test', fmt=raw size=1073741824 preallocation=falloc

real	0m11.601s
user	0m0.016s
sys	0m0.525s

$ time dd if=/dev/zero of=mnt/test bs=1M count=1024 oflag=direct
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 15.6627 s, 68.6 MB/s

real	0m16.104s
user	0m0.009s
sys	0m0.220s

Running with strace we can see that without this change we do one
pread() and one pwrite() for each block. With this change, we do only
one pwrite() per block.

$ strace ./qemu-img-master create -f raw -o preallocation=falloc mnt/test 8192
...
pread64(9, "\0", 1, 4095)               = 1
pwrite64(9, "\0", 1, 4095)              = 1
pread64(9, "\0", 1, 8191)               = 1
pwrite64(9, "\0", 1, 8191)              = 1

$ strace ./qemu-img-fix create -f raw -o preallocation=falloc mnt/test 8192
...
pwrite64(9, "\0", 1, 4095)              = 1
pwrite64(9, "\0", 1, 8191)              = 1

This happens because posix_fallocate is checking if each block is
allocated before writing a byte to the block, and when truncating the
file before preallocation, all blocks are unallocated.

Signed-off-by: Nir Soffer <nirsof@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-02-24 16:09:22 +01:00
Markus Armbruster 7c81e4e9db block: Don't bother asserting type of output visitor's output
After a visit of a complex QAPI type FOO

    ov = qobject_output_visitor_new(&foo);
    visit_type_FOO(ov, NULL, expr, &error_abort);
    visit_complete(ov, &foo);

we can safely assume qobject_type(foo) is QTYPE_QDICT.  We do in many
places, but occasionally assert qobject_type(obj) == QTYPE_QDICT.
Don't.  The appropriate place to check such fundamental properties of
QAPI visitors is the test suite.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1487363905-9480-15-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-02-22 19:52:20 +01:00
Anton Nefedov 90ab48eb07 mirror: do not increase offset during initial zero_or_discard phase
If explicit zeroing out before mirroring is required for the target image,
it moves the block job offset counter to EOF, then offset and len counters
count the image size twice. There is no harm but stats are confusing,
specifically the progress of the operation is always reported as 99% by
management tools.

The patch skips offset increase for the first "technical" pass over the
image. This should not cause any further harm.

Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1486045515-8009-1-git-send-email-den@openvz.org
CC: Jeff Cody <jcody@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Max Reitz <mreitz@redhat.com>
CC: Eric Blake <eblake@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-02-21 10:38:00 -05:00
Kevin Wolf 31eb1202d3 iscsi: Add blockdev-add support
This adds blockdev-add support for iscsi devices.

Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-02-21 10:37:34 -05:00
Kevin Wolf 1d56010482 iscsi: Add timeout option
This was previously only available with -iscsi. Again, after this patch,
the -iscsi option only takes effect if an URL is given. New users are
supposed to use the new driver-specific option.

All -iscsi options have a corresponding driver-specific option for the
iscsi block driver now.

Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-02-21 10:37:26 -05:00
Kevin Wolf 81aa2a0fb5 iscsi: Add header-digest option
This was previously only available with -iscsi. Again, after this patch,
the -iscsi option only takes effect if an URL is given. New users are
supposed to use the new driver-specific option.

Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-02-21 10:37:16 -05:00
Kevin Wolf d4e799292c iscsi: Add initiator-name option
This was previously only available with -iscsi. Again, after this patch,
the -iscsi option only takes effect if an URL is given. New users are
supposed to use the new driver-specific option.

Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-02-21 10:37:08 -05:00
Kevin Wolf 4317142020 iscsi: Handle -iscsi user/password in bdrv_parse_filename()
This splits the logic in the old parse_chap() function into a part that
parses the -iscsi options into the new driver-specific options, and
another part that actually applies those options (called apply_chap()
now).

Note that this means that username and password specified with -iscsi
only take effect when a URL is provided. This is intentional, -iscsi is
a legacy interface only supported for compatibility, new users should
use the proper driver-specific options.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-02-21 10:36:57 -05:00
Kevin Wolf d5895fcb1d iscsi: Split URL into individual options
This introduces a .bdrv_parse_filename handler for iscsi which parses an
URL if given and translates it to individual options.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-02-21 10:36:34 -05:00
Paolo Bonzini 1ace7ceac5 coroutine-lock: add mutex argument to CoQueue APIs
All that CoQueue needs in order to become thread-safe is help
from an external mutex.  Add this to the API.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20170213181244.16297-6-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-21 11:39:40 +00:00
Paolo Bonzini b9e413dd37 block: explicitly acquire aiocontext in aio callbacks that need it
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170213135235.12274-16-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-21 11:39:39 +00:00
Paolo Bonzini 1919631e6b block: explicitly acquire aiocontext in bottom halves that need it
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170213135235.12274-15-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-21 11:39:39 +00:00
Paolo Bonzini 9d45665448 block: explicitly acquire aiocontext in callbacks that need it
This covers both file descriptor callbacks and polling callbacks,
since they execute related code.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170213135235.12274-14-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-21 11:39:36 +00:00
Paolo Bonzini 2f47da5f7f block: explicitly acquire aiocontext in timers that need it
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170213135235.12274-13-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-21 11:14:08 +00:00
Paolo Bonzini b20123a28b qed: introduce qed_aio_start_io and qed_aio_next_io_cb
qed_aio_start_io and qed_aio_next_io will not have to acquire/release
the AioContext, while qed_aio_next_io_cb will.  Split the functionality
and gain a little type-safety in the process.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170213135235.12274-11-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-21 11:14:08 +00:00
Paolo Bonzini e5c67ab552 blkdebug: reschedule coroutine on the AioContext it is running on
Keep the coroutine on the same AioContext.  Without this change,
there would be a race between yielding the coroutine and reentering it.
While the race cannot happen now, because the code only runs from a single
AioContext, this will change with multiqueue support in the block layer.

While doing the change, replace custom bottom half with aio_co_schedule.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170213135235.12274-10-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-21 11:14:08 +00:00
Paolo Bonzini ff82911cd3 nbd: convert to use qio_channel_yield
In the client, read the reply headers from a coroutine, switching the
read side between the "read header" coroutine and the I/O coroutine that
reads the body of the reply.

In the server, if the server can read more requests it will create a new
"read request" coroutine as soon as a request has been read.  Otherwise,
the new coroutine is created in nbd_request_put.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170213135235.12274-8-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-21 11:14:08 +00:00
Paolo Bonzini 35f106e684 block-backend: allow blk_prw from coroutine context
qcow2_create2 calls this.  Do not run a nested event loop, as that
breaks when aio_co_wake tries to queue the coroutine on the co_queue_wakeup
list of the currently running one.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20170213135235.12274-4-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-21 11:14:07 +00:00
Paolo Bonzini c2b38b277a block: move AioContext, QEMUTimer, main-loop to libqemuutil
AioContext is fairly self contained, the only dependency is QEMUTimer but
that in turn doesn't need anything else.  So move them out of block-obj-y
to avoid introducing a dependency from io/ to block-obj-y.

main-loop and its dependency iohandler also need to be moved, because
later in this series io/ will call iohandler_get_aio_context.

[Changed copyright "the QEMU team" to "other QEMU contributors" as
suggested by Daniel Berrange and agreed by Paolo.
--Stefan]

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20170213135235.12274-2-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-21 11:14:07 +00:00
Alberto Garcia 7061a07898 qcow2: Optimize the refcount-block overlap check
The metadata overlap checks introduced in a40f1c2add help detect
corruption in the qcow2 image by verifying that data writes don't
overlap with existing metadata sections.

The 'refcount-block' check in particular iterates over the refcount
table in order to get the addresses of all refcount blocks and check
that none of them overlap with the region where we want to write.

The problem with the refcount table is that since it always occupies
complete clusters its size is usually very big. With the default
values of cluster_size=64KB and refcount_bits=16 this table holds 8192
entries, each one of them enough to map 2GB worth of host clusters.

So unless we're using images with several TB of allocated data this
table is going to be mostly empty, and iterating over it is a waste of
CPU. If the storage backend is fast enough this can have an effect on
I/O performance.

This patch keeps the index of the last used (i.e. non-zero) entry in
the refcount table and updates it every time the table changes. The
refcount-block overlap check then uses that index instead of reading
the whole table.

In my tests with a 4GB qcow2 file stored in RAM this doubles the
amount of write IOPS.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 20170201123828.4815-1-berto@igalia.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-02-12 00:47:43 +01:00
Peter Lieven f67409a5bb block/nfs: fix naming of runtime opts
commit 94d6a7a accidentally left the naming of runtime opts and QAPI
scheme inconsistent. As one consequence passing of parameters in the
URI is broken. Sync the naming of the runtime opts to the QAPI
scheme.

Please note that this is technically backwards incompatible with the 2.8
release, but the 2.8 release is the only version that had the wrong naming.
Furthermore release 2.8 suffered from a NULL pointer dereference during
URI parsing.

Fixes: 94d6a7a76e
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Lieven <pl@kamp.de>
Message-id: 1485942829-10756-3-git-send-email-pl@kamp.de
[mreitz: Fixed commit message]
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-02-12 00:47:42 +01:00
Peter Lieven 8d20abe87a block/nfs: fix NULL pointer dereference in URI parsing
parse_uint_full wants to put the parsed value into the
variable passed via its second argument which is NULL.

Fixes: 94d6a7a76e
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1485942829-10756-2-git-send-email-pl@kamp.de
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-02-12 00:47:42 +01:00
Dou Liyang a6baa60807 block/qapi: reduce the execution time of qmp_query_blockstats
In order to reduce the execution time, this patch optimize
the qmp_query_blockstats():
Remove the next_query_bds function.
Remove the bdrv_query_stats function.
Remove some judgement sentence.

The original qmp_query_blockstats calls next_query_bds to get
the next objects in each loops. In the next_query_bds, it checks
the query_nodes and blk. It also call bdrv_query_stats to get
the stats, In the bdrv_query_stats, it checks blk and bs each
times. This waste more times, which may stall the main loop a
bit. And if the disk is too many and donot use the dataplane
feature, this may affect the performance in main loop thread.

This patch removes that two functions, and makes the structure
clearly.

Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Message-id: 1484467275-27919-3-git-send-email-douly.fnst@cn.fujitsu.com
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[mreitz: Removed duplicate info->value assignment]
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-02-12 00:47:42 +01:00
Dou Liyang 20a6d768f5 block/qapi: reduce the coupling between the bdrv_query_stats and bdrv_query_bds_stats
The bdrv_query_stats and bdrv_query_bds_stats functions need to call
each other, that increases the coupling. it also makes the program
complicated and makes some unnecessary tests.

Remove the call from bdrv_query_bds_stats to bdrv_query_stats, just
take some recursion to make it clearly.

Avoid testing whether the blk is NULL during querying the bds stats.
It is unnecessary.

Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Message-id: 1484467275-27919-2-git-send-email-douly.fnst@cn.fujitsu.com
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-02-12 00:47:42 +01:00
QingFeng Hao 4545d4f4af block/vmdk: Fix the endian problem of buf_len and lba
The problem was triggered by qemu-iotests case 055. It failed when it
was comparing the compressed vmdk image with original test.img.

The cause is that buf_len in vmdk_write_extent wasn't converted to
little-endian before it was stored to disk. But later vmdk_read_extent
read it and converted it from little-endian to cpu endian.
If the cpu is big-endian like s390, the problem will happen and
the data length read by vmdk_read_extent will become invalid!
The fix is to add the conversion in vmdk_write_extent, meanwhile,
repair the endianness problem of lba field which shall also be converted
to little-endian before storing to disk.

Cc: qemu-stable@nongnu.org
Signed-off-by: QingFeng Hao <haoqf@linux.vnet.ibm.com>
Signed-off-by: Jing Liu <liujbjl@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20161216052040.53067-2-haoqf@linux.vnet.ibm.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-02-12 00:47:42 +01:00
Fam Zheng 9adceb0213 qapi: Tweak error message of bdrv_query_image_info
@bs doesn't always have a device name, such as when it comes from
"qemu-img info". Report file name instead.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 20170119130759.28319-2-famz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-02-12 00:47:41 +01:00
Peter Maydell 4e9f5244e1 -----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJYkeZAAAoJEJykq7OBq3PI6oUH/3qlRvQrWmhWLR+XCtwU0gON
 HRApL57Of+B1YbqJzb8wzjLMLfzZQYLoT7kf3FDRON751Iwpv2Qyl6j79kbmOQwy
 txvtgUTtPZrOZ9HMk6M1VboiKrkM1t0I1QiRYy/af2f1gD3KTqIt8YN1ic3xatKD
 Fgmx+oD+6EkrNilthemvDyaXtGsdTl4GC9ZbGcJB2VJzzWkksRUfeZWysIu9p2zP
 l6viegW/1+o5wYgBt6DxMalfNGbEiuBgXgx6PVFPbkw0xNURC52qDHhQ91xTSWt1
 pvFrIhYWR/ETN0twJh+jtmCjkawKWSsx2nrLlrSh4H0EpwFoRfFqH/ZrOFSg0wg=
 =QnCX
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging

# gpg: Signature made Wed 01 Feb 2017 13:44:32 GMT
# gpg:                using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/tracing-pull-request:
  trace: clean up trace-events files
  qapi: add missing trace_visit_type_enum() call
  trace: improve error reporting when parsing simpletrace header
  trace: update docs to reflect new code generation approach
  trace: switch to modular code generation for sub-directories
  trace: move setting of group name into Makefiles
  trace: move hw/i386/xen events to correct subdir
  trace: move hw/xen events to correct subdir
  trace: move hw/block/dataplane events to correct subdir
  make: move top level dir to end of include search path

# Conflicts:
#	Makefile

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-02-02 16:08:28 +00:00
Paolo Bonzini acf6e5f096 sheepdog: reorganize check for overlapping requests
Wrap the code that was copied repeatedly in the two functions,
sd_aio_setup and sd_aio_complete.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20161129113245.32724-6-pbonzini@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-02-01 00:17:20 -05:00
Paolo Bonzini c4080e9391 sheepdog: simplify inflight_aio_head management
Add to the list in add_aio_request and, indirectly, resend_aioreq.  Inline
free_aio_req in the caller, it does not simply undo alloc_aio_req's job.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20161129113245.32724-5-pbonzini@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-02-01 00:17:20 -05:00
Paolo Bonzini 28ddd08cd6 sheepdog: do not use BlockAIOCB
Sheepdog's AIOCB are completely internal entities for a group of
requests and do not need dynamic allocation.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20161129113245.32724-4-pbonzini@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-02-01 00:17:20 -05:00
Paolo Bonzini e80ab33dc0 sheepdog: reorganize coroutine flow
Delimit co_recv's lifetime clearly in aio_read_response.

Do a simple qemu_coroutine_enter in aio_read_response, letting
sd_co_writev call sd_write_done.

Handle nr_pending in the same way in sd_co_rw_vector,
sd_write_done and sd_co_flush_to_disk.

Remove sd_co_rw_vector's return value; just leave with no
pending requests.

[Jeff: added missing 'return' back, spotted by Paolo after
       series was applied.]

Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-02-01 00:17:20 -05:00
Paolo Bonzini a71264f9f5 sheepdog: remove unused cancellation support
SheepdogAIOCB is internal to sheepdog.c, hence it is never canceled.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20161129113245.32724-2-pbonzini@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-02-01 00:17:20 -05:00
Stefan Hajnoczi 7f4076c1bb trace: clean up trace-events files
There are a number of unused trace events that
scripts/cleanup-trace-events.pl finds.  The "hw/vfio/pci-quirks.c"
filename was typoed and "qapi/qapi-visit-core.c" was missing the qapi/
directory prefix.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20170126171613.1399-3-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-01-31 17:12:15 +00:00
Peter Lieven 2deb63c2da block/iscsi: statically link qemu_iscsi_opts
commit f57b4b5f moved qemu_iscsi_opts into vl.c. This
made them invisible for qemu-img, qemu-nbd etc.

Fixes: f57b4b5fb1
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Lieven <pl@kamp.de>
Message-Id: <1485262161-18543-1-git-send-email-pl@kamp.de>
[Drop useless #ifdef. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-27 18:07:58 +01:00