From clang-13:
nbd/server.c:976:22: error: variable 'bitmaps' set but not used \
[-Werror,-Wunused-but-set-variable]
which is incorrect; see //bugs.llvm.org/show_bug.cgi?id=3888.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We'll need a possibility of non-blocking nbd_co_establish_connection(),
so that it returns immediately, and it returns success only if a
connections was previously established in background.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-30-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
block/nbd doesn't need underlying sioc channel anymore. So, we can
update nbd/client-connection interface to return only one top-most io
channel, which is more straight forward.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-27-vsementsov@virtuozzo.com>
[eblake: squash in Vladimir's fixes for uninit usage caught by clang]
Signed-off-by: Eric Blake <eblake@redhat.com>
Now, when a thread can do negotiation and retry, it may run relatively
long. We need a mechanism to stop it, when the user is not interested
in a result any more. So, on nbd_client_connection_release() let's
shutdown the socket, and do not retry connection if thread is detached.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210610100802.5888-22-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Add an option for a thread to retry connecting until it succeeds. We'll
use nbd/client-connection both for reconnect and for initial connection
in nbd_open(), so we need a possibility to use same NBDClientConnection
instance to connect once in nbd_open() and then use retry semantics for
reconnect.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210610100802.5888-21-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: grammar tweak]
Signed-off-by: Eric Blake <eblake@redhat.com>
Add arguments and logic to support nbd negotiation in the same thread
after successful connection.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210610100802.5888-20-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
We don't update connect_thread_func() to use QEMU_LOCK_GUARD, as it
will get more complex critical sections logic in further commit, where
QEMU_LOCK_GUARD doesn't help.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210610100802.5888-19-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
We now have bs-independent connection API, which consists of four
functions:
nbd_client_connection_new()
nbd_client_connection_release()
nbd_co_establish_connection()
nbd_co_establish_connection_cancel()
Move them to a separate file together with NBDClientConnection
structure which becomes private to the new API.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210610100802.5888-18-vsementsov@virtuozzo.com>
[eblake: comment tweaks]
Signed-off-by: Eric Blake <eblake@redhat.com>
- NBD server: Fix crashes related to switching between AioContexts
- file-posix: Workaround for discard/write_zeroes on buggy filesystems
- Follow-up fixes for the reopen vs. permission changes
- quorum: Fix error handling for flush
- block-copy: Refactor copy_range handling
- docs: Describe how to use 'null-co' block driver
-----BEGIN PGP SIGNATURE-----
iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmC3iy8RHGt3b2xmQHJl
ZGhhdC5jb20ACgkQfwmycsiPL9ZP0hAAuh07CFWLzHCcRC7PBzekSPfzRYYBLDSW
EObJ1Ov4mvz8UZoP6BDJ5QVzLPhel6hXkxTd83B1D7t/Dq+yJYR0z8Kv3USpaVJ4
2U26SsoGQM8BmtVDL1Q8tQ5eDWQ4ykxNx6F2/lKBe1EH1lfaun04Xj1rNh7jpilo
nNmKMDDI1UOkH0lKDR3tqfEV0XQE7o+ZKfPlIbvYMjXk9ZPKUHfjNPGVdCLQVnqH
VJI01hF7eEx1ykSMdlC+TzNoVGG+mCBokGuW0JlUvOpX6FcGnAlxXQx8u53c1I8O
lggZV8b2IbrNBUVwXQHLLrXxjOo+u54Ct9y/gXUTAj8qai+9jVRp60Y1pnCyeIeu
DzFx10xwy04PGleb7AAZ4dT8du2+PTkuyQ9KmlvQ2U4IUcgW124CrDeO7XYr1aif
hCTJPeEDrC9YNU6AQ8rLXrYUtkumSm2zUzU5nZ//i5WH41475/vsmgP5A+Jr457A
Xu0yiI2Gqkr9CNsP9ZzMkNj03oIBhPFuGxiwibLQsy/6UVnaDYS0+rQ8FXYnF5+K
iEpgXe3vKTWxM097kzJMBTDVRMXRa75NtK7KWXMDgVpHTbcv1t1otsn+6dfv+B55
ULJM1ETsyYS0T6BqNglvdytsraSt7JgSF+ZLHbYk3KVDshwnq/0ksgSqHNNA14ca
kYTzHhMgo5w=
=gSq/
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches
- NBD server: Fix crashes related to switching between AioContexts
- file-posix: Workaround for discard/write_zeroes on buggy filesystems
- Follow-up fixes for the reopen vs. permission changes
- quorum: Fix error handling for flush
- block-copy: Refactor copy_range handling
- docs: Describe how to use 'null-co' block driver
# gpg: Signature made Wed 02 Jun 2021 14:44:15 BST
# gpg: using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6
# gpg: issuer "kwolf@redhat.com"
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream:
docs/secure-coding-practices: Describe how to use 'null-co' block driver
block-copy: refactor copy_range handling
block-copy: fix block_copy_task_entry() progress update
nbd/server: Use drained block ops to quiesce the server
block-backend: add drained_poll
block: improve permission conflict error message
block: simplify bdrv_child_user_desc()
block/vvfat: inherit child_vvfat_qcow from child_of_bds
block: improve bdrv_child_get_parent_desc()
block-backend: improve blk_root_get_parent_desc()
block: document child argument of bdrv_attach_child_common()
block/file-posix: Try other fallbacks after invalid FALLOC_FL_ZERO_RANGE
block/file-posix: Fix problem with fallocate(PUNCH_HOLE) on GPFS
block: drop BlockBackendRootState::read_only
block: drop BlockDriverState::read_only
block: consistently use bdrv_is_read_only()
block/vvfat: fix vvfat_child_perm crash
block/vvfat: child_vvfat_qcow: add .get_parent_aio_context, fix crash
qemu-io-cmds: assert that we don't have .perm requested in no-blk case
block/quorum: Provide .bdrv_co_flush instead of .bdrv_co_flush_to_disk
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Before switching between AioContexts we need to make sure that we're
fully quiesced ("nb_requests == 0" for every client) when entering the
drained section.
To do this, we set "quiescing = true" for every client on
".drained_begin" to prevent new coroutines from being created, and
check if "nb_requests == 0" on ".drained_poll". Finally, once we're
exiting the drained section, on ".drained_end" we set "quiescing =
false" and call "nbd_client_receive_next_request()" to resume the
processing of new requests.
With these changes, "blk_aio_attach()" and "blk_aio_detach()" can be
reverted to be as simple as they were before f148ae7d36.
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1960137
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Sergio Lopez <slp@redhat.com>
Message-Id: <20210602060552.17433-3-slp@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Commit e50caf4a5c ("tracing: convert documentation to rST")
converted docs/devel/tracing.txt to docs/devel/tracing.rst.
We still have several references to the old file, so let's fix them
with the following command:
sed -i s/tracing.txt/tracing.rst/ $(git grep -l docs/devel/tracing.txt)
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210517151702.109066-2-sgarzare@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
When querying image extents for raw image, qemu-nbd reports holes as
zero:
$ qemu-nbd -t -r -f raw empty-6g.raw
$ qemu-img map --output json nbd://localhost
[{ "start": 0, "length": 6442450944, "depth": 0, "zero": true, "data": true, "offset": 0}]
$ qemu-img map --output json empty-6g.raw
[{ "start": 0, "length": 6442450944, "depth": 0, "zero": true, "data": false, "offset": 0}]
Turns out that qemu-img map reports a hole based on BDRV_BLOCK_DATA, but
nbd server reports a hole based on BDRV_BLOCK_ALLOCATED.
The NBD protocol says:
NBD_STATE_HOLE (bit 0): if set, the block represents a hole (and
future writes to that area may cause fragmentation or encounter an
NBD_ENOSPC error); if clear, the block is allocated or the server
could not otherwise determine its status.
qemu-img manual says:
whether the sectors contain actual data or not (boolean field data;
if false, the sectors are either unallocated or stored as
optimized all-zero clusters);
To me, data=false looks compatible with NBD_STATE_HOLE. From user point
of view, getting same results from qemu-nbd and qemu-img is more
important than being more correct about allocation status.
Changing nbd server to report holes using BDRV_BLOCK_DATA makes qemu-nbd
results compatible with qemu-img map:
$ qemu-img map --output json nbd://localhost
[{ "start": 0, "length": 6442450944, "depth": 0, "zero": true, "data": false, "offset": 0}]
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Message-Id: <20210219160752.1826830-1-nsoffer@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
When switching between AIO contexts we need to me make sure that both
recv_coroutine and send_coroutine are not scheduled to run. Otherwise,
QEMU may crash while attaching the new context with an error like
this one:
aio_co_schedule: Co-routine was already scheduled in 'aio_co_schedule'
To achieve this we need a local implementation of
'qio_channel_readv_all_eof' named 'nbd_read_eof' (a trick already done
by 'nbd/client.c') that allows us to interrupt the operation and to
know when recv_coroutine is yielding.
With this in place, we delegate detaching the AIO context to the
owning context with a BH ('nbd_aio_detach_bh') scheduled using
'aio_wait_bh_oneshot'. This BH signals that we need to quiesce the
channel by setting 'client->quiescing' to 'true', and either waits for
the coroutine to finish using AIO_WAIT_WHILE or, if it's yielding in
'nbd_read_eof', actively enters the coroutine to interrupt it.
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1900326
Signed-off-by: Sergio Lopez <slp@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20201214170519.223781-4-slp@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
* Update NetBSD VM to version 9.1
* Misc fixes (e.g. categorize some devices)
-----BEGIN PGP SIGNATURE-----
iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAl+zld8RHHRodXRoQHJl
ZGhhdC5jb20ACgkQLtnXdP5wLbWb8Q/+IRvGUGjGcPfbTvwoOqJVy4Hm7huW5i1s
wHe/6nitNtpvaAqcxbQHBIvWX9xTzppWcFiEkIs8rPwLOUFKh5xJ+NbEdf4acQaJ
m4G2mEY5bYt/o5e6p7ZK1RgS2EjD1eQ6BwMWQKeUHET7MTv0UabKtvWmBWpMqFxA
vl/3SbVWsSwGB9gOA5oksYhKY5ZRcVaDxsGk89f7iwgaStcxWNxVFEXddbBmqhfW
Q4ZPt0K7yod7NDBOaGEoc2hOjIfr0TvovHojDuAxt+2tKdYi1vwtnwKbFqTWp7Ca
7ttzoQUSsteiOjAhHRpa2PEbfrNs+loIm9fem5fQ9i7POlbS/Ozv2RnPCZm1X8pW
n7Jvsh25V066AFnHat7PnjcBVBRFfmR3xtA61PqvAMGEKW8tortbZbpqXO18Pv5p
6P/GG9G3QE0v2rEsU5BNFWp/aD7fiWy/VPu3dGFUkI9/S3biatocldHn/9eyXz94
k75Xzhe5x6n5Jf8QYFQ/6BO0qSoidNbAVg1W8+QyRXIJJhWRnvW9eYa7tSx5ezJg
5+oCo4oh6Qd9nvrl5pIwvX6QMDf2kPxzp7PsHeemqt7+QNmXErAVsIi1HUVsLWRP
Qb/BbKyKNeWJwvWWLAm/2kXVmNQfjLVNCwg04xa8tkQemhIDekVrCpMoX3cNHjWf
EWa1vEtbq9k=
=A3/B
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/huth-gitlab/tags/pull-request-2020-11-17' into staging
* Fixes for compiling on Haiku, and add Haiku VM for compile-testing
* Update NetBSD VM to version 9.1
* Misc fixes (e.g. categorize some devices)
# gpg: Signature made Tue 17 Nov 2020 09:20:31 GMT
# gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg: issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg: aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg: aka "Thomas Huth <huth@tuxfamily.org>" [full]
# gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5
* remotes/huth-gitlab/tags/pull-request-2020-11-17:
max111x: put it into the 'misc' category
nand: put it into the 'storage' category
ads7846: put it into the 'input' category
ssd0323: put it into the 'display' category
gitlab-ci: Use $CI_REGISTRY instead of hard-coding registry.gitlab.com
target/microblaze: Fix possible array out of bounds in mmu_write()
tests/vm: update NetBSD to 9.1
tests/vm: Add Haiku test based on their vagrant images
configure: Add a proper check for sys/ioccom.h and use it in tpm_ioctl.h
configure: Do not build pc-bios/optionrom on Haiku
configure: Fix the _BSD_SOURCE define for the Haiku build
qemu/bswap: Remove unused qemu_bswap_len()
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
On Solaris and Haiku, the _IO() macros are defined in <sys/ioccom.h>.
Add a proper check for this header to our build system, and make sure
to include the header in tpm_ioctl.h to fix a build failure on Solaris
and Haiku.
Message-Id: <20201115152317.42752-1-thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Coverity noticed (CID 1436125) that we check the return value of
nbd_extent_array_add in most places, but not at the end of
bitmap_to_extents(). The return value exists to break loops before a
future iteration, so there is nothing to check if we are already done
iterating. Adding a cast to void, plus a comment why, pacifies
Coverity.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20201111163510.713855-1-eblake@redhat.com>
[eblake: Prefer cast to void over odd && usage]
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Allow the server to expose an additional metacontext to be requested
by savvy clients. qemu-nbd adds a new option -A to expose the
qemu:allocation-depth metacontext through NBD_CMD_BLOCK_STATUS; this
can also be set via QMP when using block-export-add.
qemu as client is hacked into viewing the key aspects of this new
context by abusing the already-experimental x-dirty-bitmap option to
collapse all depths greater than 2, which results in a tri-state value
visible in the output of 'qemu-img map --output=json' (yes, that means
x-dirty-bitmap is now a bit of a misnomer, but I didn't feel like
renaming it as it would introduce a needless break of back-compat,
even though we make no compat guarantees with x- members):
unallocated (depth 0) => "zero":false, "data":true
local (depth 1) => "zero":false, "data":false
backing (depth 2+) => "zero":true, "data":true
libnbd as client is probably a nicer way to get at the information
without having to decipher such hacks in qemu as client. ;)
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20201027050556.269064-11-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
'qemu-img map' provides a way to determine which extents of an image
come from the top layer vs. inherited from a backing chain. This is
useful information worth exposing over NBD. There is a proposal to
add a QMP command block-dirty-bitmap-populate which can create a dirty
bitmap that reflects allocation information, at which point the
qemu:dirty-bitmap:NAME metadata context can expose that information
via the creation of a temporary bitmap, but we can shorten the effort
by adding a new qemu:allocation-depth metadata context that does the
same thing without an intermediate bitmap (this patch does not
eliminate the need for that proposal, as it will have other uses as
well).
While documenting things, remember that although the NBD protocol has
NBD_OPT_SET_META_CONTEXT, the rest of its documentation refers to
'metadata context', which is a more apt description of what is
actually being used by NBD_CMD_BLOCK_STATUS: the user is requesting
metadata by passing one or more context names. So I also touched up
some existing wording to prefer the term 'metadata context' where it
makes sense.
Note that this patch does not actually enable any way to request a
server to enable this context; that will come in the next patch.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20201027050556.269064-10-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
With this, 'qemu-nbd -B b0 -B b1 -f qcow2 img.qcow2' can let you sniff
out multiple bitmaps from one server. qemu-img as client can still
only read one bitmap per client connection, but other NBD clients
(hello libnbd) can now read multiple bitmaps in a single pass.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20201027050556.269064-8-eblake@redhat.com>
Rather than open-code the count of negotiated contexts at several
sites, embed it directly into the struct. This will make it easier
for upcoming commits to support even more simultaneous contexts.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20201027050556.269064-7-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Each dirty bitmap already knows its name; by reducing the scope of the
places where we construct "qemu:dirty-bitmap:NAME" strings, tracking
the name is more localized, and there are fewer per-export fields to
worry about. This in turn will make it easier for an upcoming patch
to export more than one bitmap at once.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20201027050556.269064-6-eblake@redhat.com>
Since 'block-export-add' is new to 5.2, we can still tweak the
interface; there, allowing 'bitmaps':['str'] is nicer than
'bitmap':'str'. This wires up the qapi and qemu-nbd changes to permit
passing multiple bitmaps as distinct metadata contexts that the NBD
client may request, but the actual support for more than one will
require a further patch to the server.
Note that there are no changes made to the existing deprecated
'nbd-server-add' command; this required splitting the QAPI type
BlockExportOptionsNbd, which fortunately does not affect QMP
introspection.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20201027050556.269064-5-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Make it possible to specify the iothread where the export will run. By
default the block node can be moved to other AioContexts later and the
export will follow. The fixed-iothread option forces strict behavior
that prevents changing AioContext while the export is active. See the
QAPI docs for details.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20200929125516.186715-5-stefanha@redhat.com
[Fix stray '#' character in block-export.json and add missing "(since:
5.2)" as suggested by Eric Blake.
--Stefan]
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Block exports are used by softmmu, qemu-storage-daemon, and qemu-nbd.
They are not used by other programs and are not otherwise needed in
libblock.
Undo the recent move of blockdev-nbd.c from blockdev_ss into block_ss.
Since bdrv_close_all() (libblock) calls blk_exp_close_all()
(libblockdev) a stub function is required..
Make qemu-nbd.c use signal handling utility functions instead of
duplicating the code. This helps because os-posix.c is in libblockdev
and it depends on a qemu_system_killed() symbol that qemu-nbd.c lacks.
Once we use the signal handling utility functions we also end up
providing the necessary symbol.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20200929125516.186715-4-stefanha@redhat.com
[Fixed s/ndb/nbd/ typo in commit description as suggested by Eric Blake
--Stefan]
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
We had a premature optimization of trying to read as little from the
wire as possible while handling NBD_OPT_SET_META_CONTEXT in phases.
But in reality, we HAVE to read the entire string from the client
before we can get to the next command, and it is easier to just read
it all at once than it is to read it in pieces. And once we do that,
several functions end up no longer performing I/O, so they can drop
length and errp parameters, and just return a bool instead of
modifying through a pointer.
Our iotests still pass; I also checked that libnbd's testsuite (which
covers more corner cases of odd meta context requests) still passes.
There are cases where the sequence of trace messages produced differs
(for example, when no bitmap is exported, a query for "qemu:" now
produces two trace lines instead of one), but trace points are for
debug and have no effect on what the client sees.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200930121105.667049-4-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
[eblake: enhance commit message]
Signed-off-by: Eric Blake <eblake@redhat.com>
The NBD spec is clear that any string sent from the client must not
contain embedded NUL characters. If the client passes "a\0", we
should reject that option request rather than act on "a".
Testing this is not possible with a compliant client, but I was able
to use gdb to coerce libnbd into temporarily behaving as such a
client.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200930121105.667049-3-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
gcc 10 from Fedora 32 gives me:
Compiling C object libblock.fa.p/nbd_server.c.o
../nbd/server.c: In function ‘nbd_co_client_start’:
../nbd/server.c:625:14: error: ‘namelen’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
625 | rc = nbd_negotiate_send_info(client, NBD_INFO_NAME, namelen, name,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
626 | errp);
| ~~~~~
../nbd/server.c:564:14: note: ‘namelen’ was declared here
564 | uint32_t namelen;
| ^~~~~~~
cc1: all warnings being treated as errors
As I cannot see how this can happen, let uns silence the warning.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <20200930155859.303148-3-borntraeger@de.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
There is no real reason any more why nbd_export_new() and
nbd_export_create() should be separate functions. The latter only
performs a few checks before it calls the former.
What makes the current state stand out is that it's the only function in
BlockExportDriver that is not a static function inside nbd/server.c, but
a small wrapper in blockdev-nbd.c that then calls back into nbd/server.c
for the real functionality.
Move all the checks to nbd/server.c and make the resulting function
static to improve readability.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20200924152717.287415-27-kwolf@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The 'writable' option is a basic option that will probably be applicable
to most if not all export types that we will implement. Move it from NBD
to the generic BlockExport layer.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20200924152717.287415-26-kwolf@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Every export type will need a BlockBackend, so creating it centrally in
blk_exp_add() instead of the .create driver callback avoids duplication.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20200924152717.287415-24-kwolf@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Every block export has a BlockBackend representing the disk that is
exported. It should live in BlockExport therefore.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20200924152717.287415-23-kwolf@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Implement a new QMP command block-export-del and make nbd-server-remove
a wrapper around it.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20200924152717.287415-21-kwolf@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The reference owned by the user/monitor that is created when adding the
export and dropped when removing it was tied to the 'exports' list in
nbd/server.c. Every block export will have a user reference, so move it
to the block export level and tie it to the 'block_exports' list in
block/export/export.c instead. This is necessary for introducing a QMP
command for removing exports.
Note that exports are present in block_exports even after the user has
requested shutdown. This is different from NBD's exports where exports
are immediately removed on a shutdown request, even if they are still in
the process of shutting down. In order to avoid that the user still
interacts with an export that is shutting down (and possibly removes it
a second time), we need to remember if the user actually still owns it.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20200924152717.287415-20-kwolf@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This adds a function to shut down all block exports, and another one to
shut down the block exports of a single type. The latter is used for now
when stopping the NBD server. As soon as we implement support for
multiple NBD servers, we'll need a per-server list of exports and it
will be replaced by a function using that.
As a side effect, the BlockExport layer has a list tracking all existing
exports now. closed_exports loses its only user and can go away.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20200924152717.287415-18-kwolf@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Instead of letting the driver allocate and return the BlockExport
object, allocate it already in blk_exp_add() and pass it. This allows us
to initialise the generic part before calling into the driver so that
the driver can just use these values instead of having to parse the
options a second time.
For symmetry, move freeing the BlockExport to blk_exp_unref().
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20200924152717.287415-17-kwolf@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20200924152717.287415-15-kwolf@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Having a refcount makes sense for all types of block exports. It is also
a prerequisite for keeping a list of all exports at the BlockExport
level.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20200924152717.287415-14-kwolf@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Closing export is somewhat convoluted because nbd_export_close() and
nbd_export_put() call each other and the ways they actually end up being
nested is not necessarily obvious.
However, it is not really necessary to call nbd_export_close() from
nbd_export_put() when putting the last reference because it only does
three things:
1. Close all clients. We're going to refcount 0 and all clients hold a
reference, so we know there is no active client any more.
2. Close the user reference (represented by exp->name being non-NULL).
The same argument applies: If the export were still named, we would
still have a reference.
3. Freeing exp->description. This is really cleanup work to be done when
the export is finally freed. There is no reason to already clear it
while clients are still in the process of shutting down.
So after moving the cleanup of exp->description, the code can be
simplified so that only nbd_export_close() calls nbd_export_put(), but
never the other way around.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200924152717.287415-13-kwolf@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The export close callback is unused by the built-in NBD server. qemu-nbd
uses it only during shutdown to wait for the unrefed export to actually
go away. It can just use nbd_export_close_all() instead and do without
the callback.
This removes the close callback from nbd_export_new() and makes both
callers of it more similar.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200924152717.287415-11-kwolf@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
nbd-server-add tries to be convenient and adds two questionable
features that we don't want to share in block-export-add, even for NBD
exports:
1. When requesting a writable export of a read-only device, the export
is silently downgraded to read-only. This should be an error in the
context of block-export-add.
2. When using a BlockBackend name, unplugging the device from the guest
will automatically stop the NBD server, too. This may sometimes be
what you want, but it could also be very surprising. Let's keep
things explicit with block-export-add. If the user wants to stop the
export, they should tell us so.
Move these things into the nbd-server-add QMP command handler so that
they apply only there.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20200924152717.287415-8-kwolf@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Instead of implementing qemu-nbd --offset in the NBD code, just put a
raw block node with the requested offset on top of the user image and
rely on that doing the job.
This does not only simplify the nbd_export_new() interface and bring it
closer to the set of options that the nbd-server-add QMP command offers,
but in fact it also eliminates a potential source for bugs in the NBD
code which previously had to add the offset manually in all relevant
places.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200924152717.287415-7-kwolf@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We want to have a common set of commands for all types of block exports.
Currently, this is only NBD, but we're going to add more types.
This patch adds the basic BlockExport and BlockExportDriver structs and
a QMP command block-export-add that creates a new export based on the
given BlockExportOptions.
qmp_nbd_server_add() becomes a wrapper around qmp_block_export_add().
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20200924152717.287415-5-kwolf@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200924152717.287415-2-kwolf@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When looking for a dirty bitmap to share, we should handle filters by
just including them in the search (so they do not break backing chains).
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Meson doesn't enjoy the same flexibility we have with Make in choosing
the include path. In particular the tracing headers are using
$(build_root)/$(<D).
In order to keep the include directives unchanged,
the simplest solution is to generate headers with patterns like
"trace/trace-audio.h" and place forwarding headers in the source tree
such that for example "audio/trace.h" includes "trace/trace-audio.h".
This patch is too ugly to be applied to the Makefiles now. It's only
a way to separate the changes to the tracing header files from the
Meson rewrite of the tracing logic.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Although qemu as NBD client limits requests to <2G, the NBD protocol
allows clients to send requests almost all the way up to 4G. But
because our block layer is not yet 64-bit clean, we accidentally wrap
such requests into a negative size, and fail with EIO instead of
performing the intended operation.
The bug is visible in modern systems with something as simple as:
$ qemu-img create -f qcow2 /tmp/image.img 5G
$ sudo qemu-nbd --connect=/dev/nbd0 /tmp/image.img
$ sudo blkdiscard /dev/nbd0
or with user-space only:
$ truncate --size=3G file
$ qemu-nbd -f raw file
$ nbdsh -u nbd://localhost:10809 -c 'h.trim(3*1024*1024*1024,0)'
Although both blk_co_pdiscard and blk_pwrite_zeroes currently return 0
on success, this is also a good time to fix our code to a more robust
paradigm that treats all non-negative values as success.
Alas, our iotests do not currently make it easy to add external
dependencies on blkdiscard or nbdsh, so we have to rely on manual
testing for now.
This patch can be reverted when we later improve the overall block
layer to be 64-bit clean, but for now, a minimal fix was deemed less
risky prior to release.
CC: qemu-stable@nongnu.org
Fixes: 1f4d6d18ed
Fixes: 1c6c4bb7f0
Fixes: https://github.com/systemd/systemd/issues/16242
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200722212231.535072-1-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
[eblake: rework success tests to use >=0]
Consider nbd_export_close_all(). The call-stack looks like this:
nbd_export_close_all() -> nbd_export_close -> call client_close() for
each client.
client_close() doesn't guarantee that client is closed: nbd_trip()
keeps reference to it. So, nbd_export_close_all() just reduce
reference counter on export and removes it from the list, but doesn't
guarantee that nbd_trip() finished neither export actually removed.
Let's wait for all exports actually removed.
Without this fix, the following crash is possible:
- export bitmap through internal Qemu NBD server
- connect a client
- shutdown Qemu
On shutdown nbd_export_close_all is called, but it actually don't wait
for nbd_trip() to finish and to release its references. So, export is
not release, and exported bitmap remains busy, and on try to remove the
bitmap (which is part of bdrv_close()) the assertion fails:
bdrv_release_dirty_bitmap_locked: Assertion `!bdrv_dirty_bitmap_busy(bitmap)' failed
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200714162234.13113-2-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If we want to check error after errp-function call, we need to
introduce local_err and then propagate it to errp. Instead, use
the ERRP_GUARD() macro, benefits are:
1. No need of explicit error_propagate call
2. No need of explicit local_err variable: use errp directly
3. ERRP_GUARD() leaves errp as is if it's not NULL or
&error_fatal, this means that we don't break error_abort
(we'll abort on error_set, not on error_propagate)
If we want to add some info to errp (by error_prepend() or
error_append_hint()), we must use the ERRP_GUARD() macro.
Otherwise, this info will not be added when errp == &error_fatal
(the program will exit prior to the error_append_hint() or
error_prepend() call). Fix several such cases, e.g. in nbd_read().
This commit is generated by command
sed -n '/^Network Block Device (NBD)$/,/^$/{s/^F: //p}' \
MAINTAINERS | \
xargs git ls-files | grep '\.[hc]$' | \
xargs spatch \
--sp-file scripts/coccinelle/errp-guard.cocci \
--macro-file scripts/cocci-macro-file.h \
--in-place --no-show-diff --max-width 80
Reported-by: Kevin Wolf <kwolf@redhat.com>
Reported-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Commit message tweaked]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20200707165037.1026246-8-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[ERRP_AUTO_PROPAGATE() renamed to ERRP_GUARD(), and
auto-propagated-errp.cocci to errp-guard.cocci. Commit message
tweaked again.]
Ever since commit 36683283 (v2.8), the server code asserts that error
strings sent to the client are well-formed per the protocol by not
exceeding the maximum string length of 4096. At the time the server
first started sending error messages, the assertion could not be
triggered, because messages were completely under our control.
However, over the years, we have added latent scenarios where a client
could trigger the server to attempt an error message that would
include the client's information if it passed other checks first:
- requesting NBD_OPT_INFO/GO on an export name that is not present
(commit 0cfae925 in v2.12 echoes the name)
- requesting NBD_OPT_LIST/SET_META_CONTEXT on an export name that is
not present (commit e7b1948d in v2.12 echoes the name)
At the time, those were still safe because we flagged names larger
than 256 bytes with a different message; but that changed in commit
93676c88 (v4.2) when we raised the name limit to 4096 to match the NBD
string limit. (That commit also failed to change the magic number
4096 in nbd_negotiate_send_rep_err to the just-introduced named
constant.) So with that commit, long client names appended to server
text can now trigger the assertion, and thus be used as a denial of
service attack against a server. As a mitigating factor, if the
server requires TLS, the client cannot trigger the problematic paths
unless it first supplies TLS credentials, and such trusted clients are
less likely to try to intentionally crash the server.
We may later want to further sanitize the user-supplied strings we
place into our error messages, such as scrubbing out control
characters, but that is less important to the CVE fix, so it can be a
later patch to the new nbd_sanitize_name.
Consideration was given to changing the assertion in
nbd_negotiate_send_rep_verr to instead merely log a server error and
truncate the message, to avoid leaving a latent path that could
trigger a future CVE DoS on any new error message. However, this
merely complicates the code for something that is already (correctly)
flagging coding errors, and now that we are aware of the long message
pitfall, we are less likely to introduce such errors in the future,
which would make such error handling dead code.
Reported-by: Xueqiang Wei <xuwei@redhat.com>
CC: qemu-stable@nongnu.org
Fixes: https://bugzilla.redhat.com/1843684 CVE-2020-10761
Fixes: 93676c88d7
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200610163741.3745251-2-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>