Commit Graph

3147 Commits

Author SHA1 Message Date
Kevin Wolf
7076309aef qed: Make qed_write_header() synchronous
Note that this code is generally not running in coroutine context, so
this is an actual blocking synchronous operation. We'll fix this in a
moment.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26 14:51:14 +02:00
Kevin Wolf
b4ac32f34f qed: Remove callback from qed_copy_from_backing_file()
With this change, qed_aio_write_prefill() and qed_aio_write_postfill()
collapse into a single function. This is reflected by a rename of the
combined function to qed_aio_write_cow().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26 14:51:14 +02:00
Kevin Wolf
0f7aa24d2c qed: Make qed_copy_from_backing_file() synchronous
Note that this code is generally not running in coroutine context, so
this is an actual blocking synchronous operation. We'll fix this in a
moment.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26 14:51:14 +02:00
Kevin Wolf
e85c528142 qed: Make qed_read_backing_file() synchronous
Note that this code is generally not running in coroutine context, so
this is an actual blocking synchronous operation. We'll fix this in a
moment.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26 14:51:14 +02:00
Kevin Wolf
0f21b7a1b7 qed: Remove callback from qed_find_cluster()
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26 14:51:14 +02:00
Kevin Wolf
a8165d2d66 qed: Remove callback from qed_read_l2_table()
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26 14:51:14 +02:00
Kevin Wolf
f6513529c6 qed: Remove callback from qed_read_table()
Instead of passing the return value to a callback, return it to the
caller so that the callback can be inlined there.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26 14:51:14 +02:00
Kevin Wolf
11273076e9 qed: Make qed_read_table() synchronous
Note that this code is generally not running in coroutine context, so
this is an actual blocking synchronous operation. We'll fix this in a
moment.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26 14:51:14 +02:00
Kevin Wolf
3b7cd9fd8f qed: Use bottom half to resume waiting requests
The qed driver serialises allocating write requests. When the active
allocation is finished, the AIO callback is called, but after this, the
next allocating request is immediately processed instead of leaving the
coroutine. Resuming another allocation request in the same request
coroutine means that the request now runs in the wrong coroutine.

The following is one of the possible effects of this: The completed
request will generally reenter its request coroutine in a bottom half,
expecting that it completes the request in bdrv_driver_pwritev().
However, if the second request actually yielded before leaving the
coroutine, the reused request coroutine is in an entirely different
place and is reentered prematurely. Not a good idea.

Let's make sure that we exit the coroutine after completing the first
request by resuming the next allocating request only with a bottom
half.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26 14:51:14 +02:00
Alberto Garcia
24990c5b95 qcow2: Use offset_into_cluster() and offset_to_l2_index()
We already have functions for doing these calculations, so let's use
them instead of doing everything by hand. This makes the code a bit
more readable.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-06-26 14:51:13 +02:00
Alberto Garcia
ee22a9d869 qcow2: Merge the writing of the COW regions with the guest data
If the guest tries to write data that results on the allocation of a
new cluster, instead of writing the guest data first and then the data
from the COW regions, write everything together using one single I/O
operation.

This can improve the write performance by 25% or more, depending on
several factors such as the media type, the cluster size and the I/O
request size.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-06-26 14:51:13 +02:00
Alberto Garcia
86b862c431 qcow2: Pass a QEMUIOVector to do_perform_cow_{read,write}()
Instead of passing a single buffer pointer to do_perform_cow_write(),
pass a QEMUIOVector. This will allow us to merge the write requests
for the COW regions and the actual data into a single one.

Although do_perform_cow_read() does not strictly need to change its
API, we're doing it here as well for consistency.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-06-26 14:51:13 +02:00
Alberto Garcia
b3cf1c7cf8 qcow2: Allow reading both COW regions with only one request
Reading both COW regions requires two separate requests, but it's
perfectly possible to merge them and perform only one. This generally
improves performance, particularly on rotating disk drives. The
downside is that the data in the middle region is read but discarded.

This patch takes a conservative approach and only merges reads when
the size of the middle region is <= 16KB.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-06-26 14:51:13 +02:00
Alberto Garcia
672f0f2c4b qcow2: Split do_perform_cow() into _read(), _encrypt() and _write()
This patch splits do_perform_cow() into three separate functions to
read, encrypt and write the COW regions.

perform_cow() can now read both regions first, then encrypt them and
finally write them to disk. The memory allocation is also done in
this function now, using one single buffer large enough to hold both
regions.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-06-26 14:51:13 +02:00
Alberto Garcia
99450c6fb9 qcow2: Make perform_cow() call do_perform_cow() twice
Instead of calling perform_cow() twice with a different COW region
each time, call it just once and make perform_cow() handle both
regions.

This patch simply moves code around. The next one will do the actual
reordering of the COW operations.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-06-26 14:51:13 +02:00
Alberto Garcia
e034f5bcbc qcow2: Use unsigned int for both members of Qcow2COWRegion
Qcow2COWRegion has two attributes:

- The offset of the COW region from the start of the first cluster
  touched by the I/O request. Since it's always going to be positive
  and the maximum request size is at most INT_MAX, we can use a
  regular unsigned int to store this offset.

- The size of the COW region in bytes. This is guaranteed to be >= 0,
  so we should use an unsigned type instead.

In x86_64 this reduces the size of Qcow2COWRegion from 16 to 8 bytes.
It will also help keep some assertions simpler now that we know that
there are no negative numbers.

The prototype of do_perform_cow() is also updated to reflect these
changes.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-06-26 14:51:13 +02:00
Alberto Garcia
026ac1586b qcow2: Remove unused Error variable in do_perform_cow()
We are using the return value of qcow2_encrypt_sectors() to detect
problems but we are throwing away the returned Error since we have no
way to report it to the user. Therefore we can simply get rid of the
local Error variable and pass NULL instead.

Alternatively we could try to figure out a way to pass the original
error instead of simply returning -EIO, but that would be more
invasive, so let's keep the current approach.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-06-26 14:51:13 +02:00
Alberto Garcia
0d2fac8ede throttle: Update throttle-groups.c documentation
There used to be throttle_timers_{detach,attach}_aio_context() calls
in bdrv_set_aio_context(), but since 7ca7f0f6db
they are now in blk_set_aio_context().

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-06-26 14:51:13 +02:00
Stefan Hajnoczi
ea17c9d20d block: use BDRV_POLL_WHILE() in bdrv_rw_vmstate()
Calling aio_poll() directly may have been fine previously, but this is
the future, man!  The difference between an aio_poll() loop and
BDRV_POLL_WHILE() is that BDRV_POLL_WHILE() releases the AioContext
around aio_poll().

This allows the IOThread to run fd handlers or BHs to complete the
request.  Failure to release the AioContext causes deadlocks.

Using BDRV_POLL_WHILE() partially fixes a 'savevm' hang with -object
iothread.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-06-26 14:51:13 +02:00
Stefan Hajnoczi
dc88a467ec block: count bdrv_co_rw_vmstate() requests
Call bdrv_inc/dec_in_flight() for vmstate reads/writes.  This seems
unnecessary at first glance because vmstate reads/writes are done
synchronously while the guest is stopped.  But we need the bdrv_wakeup()
in bdrv_dec_in_flight() so the main loop sees request completion.
Besides, it's cleaner to count vmstate reads/writes like ordinary
read/write requests.

The bdrv_wakeup() partially fixes a 'savevm' hang with -object iothread.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-06-26 14:51:13 +02:00
Kevin Wolf
4f78a16fee commit: Fix completion with extra reference
commit_complete() can't assume that after its block_job_completed() the
job is actually immediately freed; someone else may still be holding
references. In this case, the op blockers on the intermediate nodes make
the graph reconfiguration in the completion code fail.

Call block_job_remove_all_bdrv() manually so that we know for sure that
any blockers on intermediate nodes are given up.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2017-06-26 14:51:12 +02:00
Peter Maydell
84e3d0725b QAPI patches for 2017-06-09
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJZSRWrAAoJEDhwtADrkYZTVOQP/RK8br2A1Cn7LeVG6jnKz5hJ
 OqyII77x8I2RachWvnwxQeHPMEVDuz3y2WjL+80s6peXlpR6y/13w7oX0f6aiJuo
 +T9khTqMv2I7HsM5UCsXAJpFPHT7r90b4x8nstY80YLGe7lA7L6yk6PGyCxHThwA
 mOiTKDw6/Xb/yZGrS2Favrun7juNpAs0Ec1IAkaA8xsEgVkd6tDv281rmHqvibl/
 //90VfJp3nHFZ12FCQ1HzA42Eigtmo/fIk9LnAzBoYG0zw0cnzjuv0BNzs/JwuUZ
 /VskeD1cViQ4yzFnPpjOavjYjTN854/JTJzm7gZ7dTQ6/l3ykoY6NDE8p1BLuHlC
 p2RKkg20EeZlpOEtMQ4g6iyG6EUxaKcEiXmQ31LqN/LJwxTYbo5B5nCHMjrt4gxe
 MqFBJQSNsJ7QjZ7Qa7pADMCi/G0m7/0dN8vBqSr4vcbLVvdbw/yb/9s33wXGrUj1
 PyXM2ymi+vvSqcXtNXKshsJLxJSJxO1tm2tRIANDTabQ00yxs8dOYnQnbQFR94fp
 6nrE2PnjZqgqk69aNDJEbngj6Tgx44nyTr1+Q17juZf9nTCE5QmBE1J0IRoykCJn
 E8+T63ZxtIxVV2yLi5xBjmZaZtPyJRGGeUXunA10SuWrHzupEcBuhFhFYd2MFM5L
 fsojALN2K3Gdx2+CmAo2
 =O9Vv
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2017-06-09-v2' into staging

QAPI patches for 2017-06-09

# gpg: Signature made Tue 20 Jun 2017 13:31:39 BST
# gpg:                using RSA key 0x3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* remotes/armbru/tags/pull-qapi-2017-06-09-v2: (41 commits)
  tests/qdict: check more get_try_int() cases
  console: use get_uint() for "head" property
  i386/cpu: use get_uint() for "min-level"/"min-xlevel" properties
  numa: use get_uint() for "size" property
  pnv-core: use get_uint() for "core-pir" property
  pvpanic: use get_uint() for "ioport" property
  auxbus: use get_uint() for "addr" property
  arm: use get_uint() for "mp-affinity" property
  xen: use get_uint() for "max-ram-below-4g" property
  pc: use get_uint() for "hpet-intcap" property
  pc: use get_uint() for "apic-id" property
  pc: use get_uint() for "iobase" property
  acpi: use get_uint() for "pci-hole*" properties
  acpi: use get_uint() for various acpi properties
  acpi: use get_uint() for "acpi-pcihp-io*" properties
  platform-bus: use get_uint() for "addr" property
  bcm2835_fb: use {get, set}_uint() for "vcram-size" and "vcram-base"
  aspeed: use {set, get}_uint() for "ram-size" property
  pcihp: use get_uint() for "bsel" property
  pc-dimm: make "size" property uint64
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-22 11:34:39 +01:00
Peter Maydell
65a0e3e842 -----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
 
 iQEtBAABCAAXBQJZQyPmEBxmYW16QHJlZGhhdC5jb20ACgkQyjViTGqRccaWrgf/
 SCAHpi4gzWbr7AN03jP16Qy/kqNik6F7LTNSqrRbvBPb3TNchDd4z44SAghK5m/r
 +IlYQc20sBZ60tRHIHAUSF2WNcea2pj1v3ZVgjrI7hiJ3DXPiqqt/dAR/W/BLIDO
 tAHAVF6Pnrjm9DC4d2zATLDHvcHMzWOsnePh7XcOm44REbwUr3GDg6bf2+j+5yfS
 9ewmXfh8z4w1IvSn+f5B+IeCvGvJNA1D55dqcGo8Ivlg9PnElziXFaXO2s7UiLIM
 mF3eTSIbJQNNN+E+0lpRpnqQiq+Txxggu61Q4f8bOTBhEOPa3etj1ydnXMVbvX25
 6SUuBfGh51tyOIZOJz3GtA==
 =9b+J
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/famz/tags/docker-and-block-pull-request' into staging

# gpg: Signature made Fri 16 Jun 2017 01:18:46 BST
# gpg:                using RSA key 0xCA35624C6A9171C6
# gpg: Good signature from "Fam Zheng <famz@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 5003 7CB7 9706 0F76 F021  AD56 CA35 624C 6A91 71C6

* remotes/famz/tags/docker-and-block-pull-request: (23 commits)
  block: make accounting thread-safe
  block: split BlockAcctStats creation and setup
  block: introduce block_account_one_io
  block: protect modification of dirty bitmaps with a mutex
  migration/block: reset dirty bitmap before reading
  block: introduce dirty_bitmap_mutex
  block: protect tracked_requests and flush_queue with reqs_lock
  block: access write_gen with atomics
  block: use Stat64 for wr_highest_offset
  util: add stats64 module
  throttle-groups: protect throttled requests with a CoMutex
  throttle-groups: do not use qemu_co_enter_next
  throttle-groups: only start one coroutine from drained_begin
  block: access io_plugged with atomic ops
  block: access wakeup with atomic ops
  block: access serialising_in_flight with atomic ops
  block: access io_limits_disabled with atomic ops
  block: access quiesce_counter with atomic ops
  block: access copy_on_read with atomic ops
  docker: Add flex and bison to centos6 image
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-20 16:01:15 +01:00
Peter Maydell
7e56accdaf * nbd and qemu-nbd fixes (Eric, Max)
* nbd refactoring (Vladimir)
 * vhost-user-scsi, take N+1 (Felipe)
 * replace memory_region_set_fd with memory_region_init_ram_from_fd (Marc-André)
 * docs/ movement (Paolo)
 * megasas TOCTOU fixes (Paolo)
 * make async_safe_run_on_cpu work on kvm/hax accelerators (Paolo)
 * Build system and poison.h improvements (Thomas)
 * -accel thread=xxx fix (Thomas)
 * move files to accel/ (Yang Zhong)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJZQli7AAoJEL/70l94x66DYAUH/icBoGSyY70G033BxbWW+jPW
 pjqobQw3YnFmiHSkXy1Xd4LKa0dyrVdmVECOAvB0embiN6pIk2UpeT4aQNHffG3f
 Y9wMRKnp6RMLpzwRYbde4rAlitn52ecGIANtFUC6RTnL7wKCuh3beWruKmkudG8V
 ZAsmX30pPFJxtQYJ0/q3c2jB6V87h2GocRLAmy1lF5/X1pXQ2fiKOYOJvV3HDY1U
 UQ+Ixuf4tTbyBwTm4Lx2tD3+olkVTUvcATqxhI+1JKu6lE0Dgb58urYDOfvVPFDl
 98s6bU53f0gpOcb1/wKOAXKdZ2tvqRtwsP8IoPDnUpk8HGxbNWdofoxsO37vVIU=
 =zeKT
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* nbd and qemu-nbd fixes (Eric, Max)
* nbd refactoring (Vladimir)
* vhost-user-scsi, take N+1 (Felipe)
* replace memory_region_set_fd with memory_region_init_ram_from_fd (Marc-André)
* docs/ movement (Paolo)
* megasas TOCTOU fixes (Paolo)
* make async_safe_run_on_cpu work on kvm/hax accelerators (Paolo)
* Build system and poison.h improvements (Thomas)
* -accel thread=xxx fix (Thomas)
* move files to accel/ (Yang Zhong)

# gpg: Signature made Thu 15 Jun 2017 10:51:55 BST
# gpg:                using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (41 commits)
  vhost-user-scsi: Introduce a vhost-user-scsi sample application
  vhost-user-scsi: Introduce vhost-user-scsi host device
  qemu-doc: include version number
  docs: create interop/ subdirectory
  include/exec/poison: Mark some CONFIG defines as poisoned, too
  include/exec/poison: Add missing TARGET defines
  nbd/server: refactor nbd_trip
  nbd/server: rename rc to ret
  nbd/server: get rid of fail: return rc
  nbd/server: nbd_negotiate: fix error path
  nbd/server: remove NBDClientNewData
  nbd/server: refactor nbd_co_receive_request
  nbd/server: get rid of EAGAIN dead code
  nbd/server: refactor nbd_co_send_reply
  nbd/server: get rid of ssize_t
  nbd/server: get rid of nbd_negotiate_read and friends
  nbd: make nbd_drop public
  nbd: rename read_sync and friends
  accel: move kvm related accelerator files into accel/
  tcg: move tcg backend files into accel/tcg/
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-20 14:20:34 +01:00
Marc-André Lureau
01b2ffcedd qapi: merge QInt and QFloat in QNum
We would like to use a same QObject type to represent numbers, whether
they are int, uint, or floats. Getters will allow some compatibility
between the various types if the number fits other representations.

Add a few more tests while at it.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20170607163635.17635-7-marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[parse_stats_intervals() simplified a bit, comment in
test_visitor_in_int_overflow() tidied up, suppress bogus warnings]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-06-20 14:31:31 +02:00
Paolo Bonzini
5b50bf77ce block: make accounting thread-safe
I'm not trying too hard yet.  Later, with multiqueue support,
this may cause mutex contention or cacheline bouncing.

Cc: Alberto Garcia <berto@igalia.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170605123908.18777-20-pbonzini@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-06-16 07:55:00 +08:00
Paolo Bonzini
9caa6f3dbe block: split BlockAcctStats creation and setup
block_acct_destroy is called unconditionally in blk_delete, but there is
no BlockAcctStats function that is called unconditionally in blk_new.
Split block_acct_init in two, so that it will be possible to create a
QemuMutex in block_acct_init and destroy it in block_acct_cleanup.

Cc: Alberto Garcia <berto@igalia.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170605123908.18777-19-pbonzini@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-06-16 07:55:00 +08:00
Paolo Bonzini
39c1b4254e block: introduce block_account_one_io
This is the common code to account operations that produced actual I/O.

Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170605123908.18777-18-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-06-16 07:55:00 +08:00
Paolo Bonzini
b64bd51efa block: protect modification of dirty bitmaps with a mutex
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170605123908.18777-17-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-06-16 07:55:00 +08:00
Paolo Bonzini
2119882c7e block: introduce dirty_bitmap_mutex
It protects only the list of dirty bitmaps; in the next patch we will
also protect their content.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170605123908.18777-15-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-06-16 07:55:00 +08:00
Paolo Bonzini
3783fa3dd3 block: protect tracked_requests and flush_queue with reqs_lock
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170605123908.18777-14-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-06-16 07:55:00 +08:00
Paolo Bonzini
47fec59941 block: access write_gen with atomics
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170605123908.18777-13-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-06-16 07:55:00 +08:00
Paolo Bonzini
f7946da274 block: use Stat64 for wr_highest_offset
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170605123908.18777-12-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-06-16 07:55:00 +08:00
Paolo Bonzini
93001e9d87 throttle-groups: protect throttled requests with a CoMutex
Another possibility is to use tg->lock, which we're holding anyway in
both schedule_next_request and throttle_group_co_io_limits_intercept.
This would require open-coding the CoQueue however, so I've chosen this
alternative.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170605123908.18777-10-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-06-16 07:55:00 +08:00
Paolo Bonzini
3b170dc867 throttle-groups: do not use qemu_co_enter_next
Prepare for removing this function; always restart throttled requests
from coroutine context.  This will matter when restarting throttled
requests will have to acquire a CoMutex.

Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170605123908.18777-9-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-06-16 07:55:00 +08:00
Paolo Bonzini
7258ed930c throttle-groups: only start one coroutine from drained_begin
Starting all waiting coroutines from bdrv_drain_all is unnecessary;
throttle_group_co_io_limits_intercept calls schedule_next_request as
soon as the coroutine restarts, which in turn will restart the next
request if possible.

If we only start the first request and let the coroutines dance from
there the code is simpler and there is more reuse between
throttle_group_config, throttle_group_restart_blk and timer_cb.  The
next patch will benefit from this.

We also stop accessing from throttle_group_restart_blk the
blkp->throttled_reqs CoQueues even when there was no
attached throttling group.  This worked but is not pretty.

The only thing that can interrupt the dance is the QEMU_CLOCK_VIRTUAL
timer when switching from one block device to the next, because the
timer is set to "now + 1" but QEMU_CLOCK_VIRTUAL might not be running.
Set that timer to point in the present ("now") rather than the future
and things work.

Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170605123908.18777-8-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-06-16 07:55:00 +08:00
Paolo Bonzini
850d54a2a9 block: access io_plugged with atomic ops
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170605123908.18777-7-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-06-16 07:55:00 +08:00
Paolo Bonzini
e2a6ae7fe5 block: access wakeup with atomic ops
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170605123908.18777-6-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-06-16 07:55:00 +08:00
Paolo Bonzini
20fc71b25c block: access serialising_in_flight with atomic ops
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170605123908.18777-5-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-06-16 07:55:00 +08:00
Paolo Bonzini
d993b85804 block: access io_limits_disabled with atomic ops
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170605123908.18777-4-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-06-16 07:55:00 +08:00
Paolo Bonzini
414c2ec358 block: access quiesce_counter with atomic ops
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170605123908.18777-3-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-06-16 07:55:00 +08:00
Paolo Bonzini
d3faa13e5f block: access copy_on_read with atomic ops
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170605123908.18777-2-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-06-16 07:55:00 +08:00
Vladimir Sementsov-Ogievskiy
d1fdf257d5 nbd: rename read_sync and friends
Rename
  nbd_wr_syncv -> nbd_rwv
  read_sync -> nbd_read
  read_sync_eof -> nbd_read_eof
  write_sync -> nbd_write
  drop_sync -> nbd_drop

1. nbd_ prefix
   read_sync and write_sync are already shared, so it is good to have a
   namespace prefix. drop_sync will be shared, and read_sync_eof is
   related to read_sync, so let's rename them all.

2. _sync suffix
   _sync is related to the fact that nbd_wr_syncv doesn't return if a
   write to socket returns EAGAIN. The first implementation of
   nbd_wr_syncv (was wr_sync in 7a5ca8648b) just loops while getting
   EAGAIN, the current implementation yields in this case.
   Why we want to get rid of it:
   - it is normal for r/w functions to be synchronous, so having an
     additional suffix for it looks redundant (contrariwise, we have
     _aio suffix for async functions)
   - _sync suffix in block layer is used when function does flush (so
     using it for other thing is confusing a bit)
   - keep function names short after adding nbd_ prefix

3. for nbd_wr_syncv let's use more common notation 'rw'

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20170602150150.258222-2-vsementsov@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-15 11:04:06 +02:00
Jeff Cody
5c3ad1a6a8 block/iscsi: enable filename option and parsing
When enabling option parsing and blockdev-add for iscsi, we removed the
'filename' option.  Unfortunately, this was a bit optimistic, as
previous versions of QEMU allowed the use of the option in backing
filenames via json.  This means that without parsing this option, we
cannot open existing images that used to work fine.

See bug: https://bugzilla.redhat.com/show_bug.cgi?id=1457088

Tested-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Message-id: 0789ab6c32814ab4b6896707d378804bd4424c65.1497444637.git.jcody@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-06-14 17:39:46 -04:00
Jeff Cody
91589d9e5c block/rbd: enable filename option and parsing
When enabling option parsing and blockdev-add for rbd, we removed the
'filename' option.  Unfortunately, this was a bit optimistic, as
previous versions of QEMU allowed the use of the option in backing
filenames via json.  This means that without parsing this option, we
cannot open existing images that used to work fine.

See bug: https://bugzilla.redhat.com/show_bug.cgi?id=1457088

Tested-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Message-id: 937dc9fde348d13311eb8e23444df3bc3190b612.1497444637.git.jcody@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-06-14 17:39:46 -04:00
Peter Maydell
e0b4891ae6 -----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJZOpeXAAoJEL2+eyfA3jBXKqYP/jXv/ehVADW3X30CXSPoZsGo
 rdsdd8pqNwuEYHpFrGclMNbsDWxUsiv8t3MleyVEtl+N0pBVnlZisR3PCONShxxj
 w0S0/UniAib+j+Y3D7xSHKKrin6CjNVDltrJdX7VOAZsOTj93pVDhd8SlPc3YDSt
 DrUQoJI96DOwtQofXLTzZkzytbd/9ZN7VzF3j4wJ/72do7thCBybizRLUBEiKM+w
 RQgcb3Xvob1H9pKSjaNnmH8CKqrISJRfzpXCAzowQy+X8RTTBedn/ZHtiHFYb2XZ
 6J/OYM1p4W2X69KMUVJ8wAZsNVNlVWaXsD+lKcBcLtlYIUac7CB/Hcg4mrJDqhzq
 752KXvencRKP52HfVLaySXBE2JguTvUrkBDw3/u9W2qUZw3tUexPhH/CDH6wf8XU
 JhbZk7fm722pfCs/Gmt+9Muz7YkGbT91EWPWZYxbZ1avebmTlq6mCarqTTz9DmXZ
 psJe7a6D0M0bcQL0//rOH3USbOSUcFascJvuRoll28H+z3oCfC9ILbBZE8xQ9uOM
 W5mmCL8SoljY+cSIb9NSy0/vG4gqvy4TigaIRu8VT8EJG3Yel8y9YXb4kX5NFV6Z
 ZuEcgykUnBNzBmiXC+M57r9Qa1J/9GciSOVfH8PdOzhuwWimBANg5WD+B6XcrknD
 OCOVaV58DMh3ApwfVCkj
 =XJ/8
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/cody/tags/block-pull-request' into staging

# gpg: Signature made Fri 09 Jun 2017 13:41:59 BST
# gpg:                using RSA key 0xBDBE7B27C0DE3057
# gpg: Good signature from "Jeffrey Cody <jcody@redhat.com>"
# gpg:                 aka "Jeffrey Cody <jeff@codyprime.org>"
# gpg:                 aka "Jeffrey Cody <codyprime@gmail.com>"
# Primary key fingerprint: 9957 4B4D 3474 90E7 9D98  D624 BDBE 7B27 C0DE 3057

* remotes/cody/tags/block-pull-request:
  block/gluster.c: Handle qdict_array_entries() failure

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-13 12:55:47 +01:00
Peter Maydell
475df9d809 Block layer patches
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJZOorTAAoJEH8JsnLIjy/W+MAP/iOH2GtgtUfH8CLxqWqZUJ5p
 rmlLLTrooBHUF09BUSCkwD5E0s1El9phvldt9t2l6RaxxSlMwZl1nB0ww9q+z2Aq
 NY5VBYUov28JnqFqRkOPiJHMo7oRkzgyGVn3RCbevwWudEK3DcSRT5UftDK5UPH3
 HwxYY09uxpDcMkXajC1pRG/RJFrfsmbM3pXFYMsOgj8QT5nNdqebaqcYnuTKX9F4
 5Hn5O0tRdZZWWvP1QXRAPszmihdh4GN4I1EEfAgjV8Y3TqBApZS4vdVOJXm07Vlx
 FB6FKEWvK8zMU/0kS8jH5v4A03lRtokESL9Io5K4mwuxiTfNjvb4ZlftGVgUzuFi
 X5Yn6pPYU8t+rvRNSHF/8UYxX6lcuGL2DlVn/a+47phmQrW46fd8srp1A1c+dtzS
 jjMvw1tJNLYGP29QWzhaROPVn3Uk4wPbZIU472HB6ZTSC76g+jvGzfLVspnZeUjY
 96l3721vv8oa7ZtFvRNT9zTYxAoGXEcgFR7f+E9yVWdISZ0AKdONNIUV0w9AHKNc
 rDM4XdnG7ouiVaGeT3cdDNjjbb/v3+sR9yUzf9DUd41purmkrRPtGcetd8esf8Jr
 plIiBvf+mCRDACSYbIF3JKEHGYFasgs+X/OkhBVeeIBJUkl89mFpca5Qs4AF1d6v
 l7IW3y56umBa00qDgl/R
 =ey0C
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block layer patches

# gpg: Signature made Fri 09 Jun 2017 12:47:31 BST
# gpg:                using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream:
  block: fix external snapshot abort permission error
  block/qcow.c: Fix memory leak in qcow_create()
  qemu-iotests: Test automatic commit job cancel on hot unplug
  commit: Fix use after free in completion
  qemu-iotests: Block migration test
  migration/block: Clean up BBs in block_save_complete()
  migration: Inactivate images after .save_live_complete_precopy()
  block: Fix anonymous BBs in blk_root_inactivate()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-12 10:43:32 +01:00
Peter Maydell
56faeb9bb6 block/gluster.c: Handle qdict_array_entries() failure
In qemu_gluster_parse_json(), the call to qdict_array_entries()
could return a negative error code, which we were ignoring
because we assigned the result to an unsigned variable.
Fix this by using the 'int' type instead, which matches the
return type of qdict_array_entries() and also the type
we use for the loop enumeration variable 'i'.

(Spotted by Coverity, CID 1360960.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1496682098-1540-1-git-send-email-peter.maydell@linaro.org
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-06-09 08:41:29 -04:00
Peter Maydell
272545cf21 block/qcow.c: Fix memory leak in qcow_create()
Coverity points out that the code path in qcow_create() for
the magic "fat:" backing file name leaks the memory used to
store the filename (CID 1307771). Free the memory before
we overwrite the pointer.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-06-09 13:46:20 +02:00
Kevin Wolf
19ebd13ed4 commit: Fix use after free in completion
The final bdrv_set_backing_hd() could be working on already freed nodes
because the commit job drops its references (through BlockBackends) to
both overlay_bs and top already a bit earlier.

One way to trigger the bug is hot unplugging a disk for which
blockdev_mark_auto_del() cancels the block job.

Fix this by taking BDS-level references while we're still using the
nodes.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
2017-06-09 13:46:13 +02:00