QA output created by 223

=== Create partially sparse image, then add dirty bitmaps ===

Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=4194304
wrote 2097152/2097152 bytes at offset 1048576
2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
Testing:
QMP_VERSION
{"return": {}}
{"return": {}}
{"return": {}}
{"return": {}}
{"return": {}}
{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false, "reason": "host-qmp-quit"}}

=== Write part of the file under active bitmap ===

wrote 512/512 bytes at offset 512
512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
wrote 2097152/2097152 bytes at offset 2097152
2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)

=== End dirty bitmaps, and start serving image over NBD ===

{"return": {}}
{"return": {}}
{"return": {}}
{"error": {"class": "GenericError", "desc": "NBD server not running"}}
{"return": {}}
{"error": {"class": "GenericError", "desc": "NBD server already running"}}
exports available: 0
{"return": {}}
{"error": {"class": "GenericError", "desc": "Cannot find device=nosuch nor node_name=nosuch"}}
{"error": {"class": "GenericError", "desc": "NBD server already has export named 'n'"}}

nbd: Only require disabled bitmap for read-only exports

Our initial implementation of x-nbd-server-add-bitmap put
in a restriction because of incremental backups: in that
usage, we are exporting one qcow2 file (the temporary overlay
target of a blockdev-backup sync:none job) and a dirty bitmap
owned by a second qcow2 file (the source of the
blockdev-backup, which is the backing file of the temporary).
While both qcow2 files are still writable (the target in
order to capture copy-on-write of old contents, and the
source in order to track live guest writes in the meantime),
the NBD client expects to see constant data, including the
dirty bitmap. An enabled bitmap in the source would be
modified by guest writes, which is at odds with the NBD
export being a read-only constant view, hence the initial
code choice of enforcing a disabled bitmap (the intent is
that the exposed bitmap was disabled in the same transaction
that started the blockdev-backup job, although we don't want
to track enough state to actually enforce that).
However, consider the case of a bitmap contained in a read-only
node (including when the bitmap is found in a backing layer of
the active image). Because the node can't be modified, the
bitmap won't change due to writes, regardless of whether it is
still enabled. Forbidding the export unless the bitmap is
disabled is awkward, particularly since we can't change the
bitmap to be disabled (because the node is read-only).
Alternatively, consider the case of live storage migration,
where management directs the destination to create a writable
NBD server, then performs a drive-mirror from the source to
the target, prior to doing the rest of the live migration.
Since storage migration can be time-consuming, it may be wise
to let the destination include a dirty bitmap to track which
portions it has already received, where even if the migration
is interrupted and restarted, the source can query the
destination block status in order to potentially minimize
re-sending data that has not changed in the meantime on a
second attempt. Such code has not been written, and might not
be trivial (after all, a cluster being marked dirty in the
bitmap does not necessarily guarantee it has the desired
contents), but it makes sense that letting an active dirty
bitmap be exposed and changing alongside writes may prove
useful in the future.
Solve both issues by gating the restriction against a
disabled bitmap to only happen when the caller has requested
a read-only export, and where the BDS that owns the bitmap
(whether or not it is the BDS handed to nbd_export_new() or
from its backing chain) is still writable. We could drop
the check altogether (if management apps are prepared to
deal with a changing bitmap even on a read-only image), but
for now keeping a check for the read-only case still stands
a chance of preventing management errors.
Update iotest 223 to show the looser behavior by leaving
a bitmap enabled the whole run; note that we have to tear
down and re-export a node when handling an error.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20190111194720.15671-4-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
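
For orientation, a minimal QMP sketch of the pull-mode backup arrangement
described above (node and bitmap names are invented; the bitmap-export
command shown is the 3.1-era experimental one that a later patch replaces
with a stable interface): back up 'src' into a temporary overlay 'tmp'
without copying data, export the overlay, and expose a bitmap that lives in
the backing source:
  {"execute": "blockdev-backup",
   "arguments": {"device": "src", "target": "tmp", "sync": "none"}}
  {"execute": "nbd-server-add", "arguments": {"device": "tmp"}}
  {"execute": "x-nbd-server-add-bitmap",
   "arguments": {"name": "tmp", "bitmap": "b0"}}
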
{"error": {"class": "GenericError", "desc": "Enabled bitmap 'b2' incompatible with readonly export"}}

nbd: Allow bitmap export during QMP nbd-server-add

With the experimental x-nbd-server-add-bitmap command, there was
a window of time where an NBD client could see the export but not
the associated dirty bitmap, which can cause a client that planned
on using the dirty bitmap to be forced to treat the entire image
as dirty as a safety fallback. Furthermore, if the QMP client
successfully exports a disk but then fails to add the bitmap, it
has to take on the burden of removing the export. Since we don't
allow changing the exposed dirty bitmap (whether to a different
bitmap, or removing advertisement of the bitmap), it is nicer to
make the bitmap tied to the export at the time the export is
created, with automatic failure to export if the bitmap is not
available.
The experimental command included an optional 'bitmap-export-name'
field for remapping the name exposed over NBD to be different from
the bitmap name stored on disk. However, my libvirt demo code
for implementing differential backups on top of persistent bitmaps
did not need to take advantage of that feature (it is instead
possible to create a new temporary bitmap with the desired name,
use block-dirty-bitmap-merge to merge one or more persistent
bitmaps into the temporary, then associate the temporary with the
NBD export, if control is needed over the exported bitmap name).
Hence, I'm not copying that part of the experiment over to the
stable addition. For more details on the libvirt demo, see
https://www.redhat.com/archives/libvir-list/2018-October/msg01254.html,
https://kvmforum2018.sched.com/event/FzuB/facilitating-incremental-backup-eric-blake-red-hat
This patch focuses on the user interface, and reduces (but does
not completely eliminate) the window where an NBD client can see
the export but not the dirty bitmap, with less work to clean up
after errors. Later patches will add further cleanups now that
this interface is declared stable via a single QMP command,
including removing the race window.
Update test 223 to use the new interface.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20190111194720.15671-6-eblake@redhat.com>
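
As a rough sketch of the stable interface described here (device, export
and bitmap names are invented), the bitmap is now tied to the export in a
single command:
  {"execute": "nbd-server-add",
   "arguments": {"device": "drive0", "name": "n", "bitmap": "b"}}
Alternatively, if control over the exported bitmap name is needed, the
workaround suggested above is to merge into a temporary bitmap carrying the
desired name and export that instead:
  {"execute": "block-dirty-bitmap-add",
   "arguments": {"node": "drive0", "name": "wanted-name"}}
  {"execute": "block-dirty-bitmap-merge",
   "arguments": {"node": "drive0", "target": "wanted-name", "bitmaps": ["b"]}}
  {"execute": "nbd-server-add",
   "arguments": {"device": "drive0", "name": "n", "bitmap": "wanted-name"}}
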
{"error": {"class": "GenericError", "desc": "Bitmap 'b3' is not found"}}
{"return": {}}
exports available: 2
export: 'n'
  size: 4194304

nbd: Advertise multi-conn for shared read-only connections

The NBD specification defines NBD_FLAG_CAN_MULTI_CONN, which can be
advertised when the server promises cache consistency between
simultaneous clients (basically, rules that determine what FUA and
flush from one client are able to guarantee for reads from another
client). When we don't permit simultaneous clients (such as qemu-nbd
without -e), the bit makes no sense; and for writable images, we
probably have a lot more work before we can declare that actions from
one client are cache-consistent with actions from another. But for
read-only images, where flush isn't changing any data, we might as
well advertise multi-conn support. What's more, advertisement of the
bit makes it easier for clients to determine if 'qemu-nbd -e' was in
use, where a second connection will succeed rather than hang until the
first client goes away.
This patch affects qemu as server in advertising the bit. We may want
to consider patches to qemu as client to attempt parallel connections
for higher throughput by spreading the load over those connections
when a server advertises multi-conn, but for now sticking to one
connection per nbd:// BDS is okay.
See also: https://bugzilla.redhat.com/1708300
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20190815185024.7010-1-eblake@redhat.com>
[eblake: tweak blockdev-nbd.c to not request shared when writable,
fix iotest 233]
Reviewed-by: John Snow <jsnow@redhat.com>
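
A quick way to see the effect (socket path and image name are only
illustrative): serve an image read-only with sharing enabled, then list the
export, where 'multi' should now appear on the flags line:
  $ qemu-nbd --read-only --shared=5 --persistent --format=qcow2 \
      --socket=/tmp/nbd.sock disk.qcow2 &
  $ qemu-nbd --list --socket=/tmp/nbd.sock
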
  flags: 0x5ef ( readonly flush fua trim zeroes df multi cache )

nbd/server: Advertise actual minimum block size

Both NBD_CMD_BLOCK_STATUS and structured NBD_CMD_READ will split their
reply according to bdrv_block_status() boundaries. If the block device
has a request_alignment smaller than 512, but we advertise a block
alignment of 512 to the client, then this can result in the server
reply violating client expectations by reporting a smaller region of
the export than what the client is permitted to address (although this
is less of an issue for qemu 4.0 clients, given recent client patches
to overlook our non-compliance at EOF). Since it's always better to
be strict in what we send, it is worth advertising the actual minimum
block limit rather than blindly rounding it up to 512.
Note that this patch is not foolproof - it is still possible to
provoke non-compliant server behavior using:
$ qemu-nbd --image-opts driver=blkdebug,align=512,image.driver=file,image.filename=/path/to/non-aligned-file
That is arguably a bug in the blkdebug driver (it should never pass
back block status smaller than its alignment, even if it has to make
multiple bdrv_get_status calls and determine the
least-common-denominator status among the group to return). It may
also be possible to observe issues with a backing layer with smaller
alignment than the active layer, although so far I have been unable to
write a reliable iotest for that scenario (but again, an issue like
that could be argued to be a bug in the block layer, or something
where we need a flag to bdrv_block_status() to state whether the
result must be aligned to the current layer's limits or can be
subdivided for accuracy when chasing backing files).
Anyway, as blkdebug is not normally used, and as this patch makes our
server more interoperable with qemu 3.1 clients, it is worth applying
now, even while we still work on a larger patch series for the 4.1
timeframe to have byte-accurate file lengths.
Note that the iotests output changes - for 223 and 233, we can see the
server's better granularity advertisement; and for 241, the three test
cases have the following effects:
- natural alignment: the server's smaller alignment is now advertised,
and the hole reported at EOF is now the right result; we've gotten rid
of the server's non-compliance
- forced server alignment: the server still advertises 512 bytes, but
still sends a mid-sector hole. This is still a server compliance bug,
which needs to be fixed in the block layer in a later patch; output
does not change because the client is already being tolerant of the
non-compliance
- forced client alignment: the server's smaller alignment means that
the client now sees the server's status change mid-sector without any
protocol violations, but the fact that the map shows an unaligned
mid-sector hole is evidence of the block layer problems with aligned
block status, to be fixed in a later patch
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20190329042750.14704-7-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
[eblake: rebase to enhanced iotest 241 coverage]
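
The 'min block' lines below come from listing the export from the client
side; as a hedged illustration, a raw file served without O_DIRECT has a
byte-granular request alignment, which the server now advertises as-is
(socket path and image name are made up):
  $ qemu-nbd --persistent --format=raw --socket=/tmp/nbd.sock data.img &
  $ qemu-nbd --list --socket=/tmp/nbd.sock | grep 'min block'
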
  min block: 1
  opt block: 4096
  max block: 33554432
  available meta contexts: 2
   base:allocation
   qemu:dirty-bitmap:b
export: 'n2'
  size: 4194304
  flags: 0x4ed ( flush fua trim zeroes df cache )
  min block: 1
  opt block: 4096
  max block: 33554432
  available meta contexts: 2
   base:allocation
   qemu:dirty-bitmap:b2

=== Contrast normal status to large granularity dirty-bitmap ===

read 512/512 bytes at offset 512
512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
read 524288/524288 bytes at offset 524288
512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
read 1048576/1048576 bytes at offset 1048576
1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
read 2097152/2097152 bytes at offset 2097152
2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
[{ "start": 0, "length": 4096, "depth": 0, "zero": false, "data": true, "offset": OFFSET},
{ "start": 4096, "length": 1044480, "depth": 0, "zero": true, "data": false, "offset": OFFSET},
{ "start": 1048576, "length": 3145728, "depth": 0, "zero": false, "data": true, "offset": OFFSET}]
[{ "start": 0, "length": 65536, "depth": 0, "zero": false, "data": false},
{ "start": 65536, "length": 2031616, "depth": 0, "zero": false, "data": true, "offset": OFFSET},
{ "start": 2097152, "length": 2097152, "depth": 0, "zero": false, "data": false}]

=== Contrast to small granularity dirty-bitmap ===

[{ "start": 0, "length": 512, "depth": 0, "zero": false, "data": true, "offset": OFFSET},
{ "start": 512, "length": 512, "depth": 0, "zero": false, "data": false},
{ "start": 1024, "length": 2096128, "depth": 0, "zero": false, "data": true, "offset": OFFSET},
{ "start": 2097152, "length": 2097152, "depth": 0, "zero": false, "data": false}]

=== End qemu NBD server ===

{"return": {}}
{"return": {}}
{"error": {"class": "GenericError", "desc": "Export 'n2' is not found"}}
{"return": {}}
{"error": {"class": "GenericError", "desc": "NBD server not running"}}
{"return": {}}

iotests: Wait for qemu to end in 223

When iotest 223 was first written, it didn't matter if we waited for
the qemu process to clean up. But with the introduction of a later
qemu-nbd process trying to reuse the same file, there is a race where
even though the asynchronous qemu process has responded to "quit", it
has not yet had time to unlock the file and exit, resulting in:
-[{ "start": 0, "length": 65536, "depth": 0, "zero": false, "data": false},
-{ "start": 65536, "length": 2031616, "depth": 0, "zero": false, "data": true},
-{ "start": 2097152, "length": 2097152, "depth": 0, "zero": false, "data": false}]
+qemu-nbd: Failed to blk_new_open 'tests/qemu-iotests/scratch/t.qcow2': Failed to get shared "write" lock
+Is another process using the image [tests/qemu-iotests/scratch/t.qcow2]?
+qemu-img: Could not open 'driver=nbd,server.type=unix,server.path=tests/qemu-iotests/scratch/qemu-nbd.sock,x-dirty-bitmap=qemu:dirty-bitmap:b': Failed to connect socket tests/qemu-iotests/scratch/qemu-nbd.sock: Connection refused
+./common.nbd: line 33: kill: (11122) - No such process
Fixes: ddd09448
Reported-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20190305182908.13557-1-eblake@redhat.com>
Tested-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
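
The shape of the fix, sketched with plain shell rather than the actual
iotests helpers (QEMU_PID, SOCK and TEST_IMG are placeholders here): after
sending 'quit', reap the background qemu so its image lock is released
before qemu-nbd opens the same file:
  $ wait $QEMU_PID
  $ qemu-nbd --persistent --format=qcow2 --socket="$SOCK" "$TEST_IMG" &
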
{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false, "reason": "host-qmp-quit"}}

=== Use qemu-nbd as server ===

[{ "start": 0, "length": 65536, "depth": 0, "zero": false, "data": false},
{ "start": 65536, "length": 2031616, "depth": 0, "zero": false, "data": true, "offset": OFFSET},
{ "start": 2097152, "length": 2097152, "depth": 0, "zero": false, "data": false}]
[{ "start": 0, "length": 512, "depth": 0, "zero": false, "data": true, "offset": OFFSET},
{ "start": 512, "length": 512, "depth": 0, "zero": false, "data": false},
{ "start": 1024, "length": 2096128, "depth": 0, "zero": false, "data": true, "offset": OFFSET},
{ "start": 2097152, "length": 2097152, "depth": 0, "zero": false, "data": false}]
*** done