qemu-e2k

Commit Graph

Author	SHA1	Message	Date
Eric Blake	58a6fdcc9e	nbd/server: Allow MULTI_CONN for shared writable exports According to the NBD spec, a server that advertises NBD_FLAG_CAN_MULTI_CONN promises that multiple client connections will not see any cache inconsistencies: when properly separated by a single flush, actions performed by one client will be visible to another client, regardless of which client did the flush. We always satisfy these conditions in qemu - even when we support multiple clients, ALL clients go through a single point of reference into the block layer, with no local caching. The effect of one client is instantly visible to the next client. Even if our backend were a network device, we argue that any multi-path caching effects that would cause inconsistencies in back-to-back actions not seeing the effect of previous actions would be a bug in that backend, and not the fault of caching in qemu. As such, it is safe to unconditionally advertise CAN_MULTI_CONN for any qemu NBD server situation that supports parallel clients. Note, however, that we don't want to advertise CAN_MULTI_CONN when we know that a second client cannot connect (for historical reasons, qemu-nbd defaults to a single connection while nbd-server-add and QMP commands default to unlimited connections; but we already have existing means to let either style of NBD server creation alter those defaults). This is visible by no longer advertising MULTI_CONN for 'qemu-nbd -r' without -e, as in the iotest nbd-qemu-allocation. The harder part of this patch is setting up an iotest to demonstrate behavior of multiple NBD clients to a single server. It might be possible with parallel qemu-io processes, but I found it easier to do in python with the help of libnbd, and help from Nir and Vladimir in writing the test. Signed-off-by: Eric Blake <eblake@redhat.com> Suggested-by: Nir Soffer <nsoffer@redhat.com> Suggested-by: Vladimir Sementsov-Ogievskiy <v.sementsov-og@mail.ru> Message-Id: <20220512004924.417153-3-eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2022-05-12 13:10:52 +02:00
Vladimir Sementsov-Ogievskiy	e5fb29d5d0	qapi: nbd-export: allow select bitmaps by node/name pair Hi all! Current logic of relying on search through backing chain is not safe neither convenient. Sometimes it leads to necessity of extra bitmap copying. Also, we are going to add "snapshot-access" driver, to access some snapshot state through NBD. And this driver is not formally a filter, and of course it's not a COW format driver. So, searching through backing chain will not work. Instead of widening the workaround of bitmap searching, let's extend the interface so that user can select bitmap precisely. Note, that checking for bitmap active status is not copied to the new API, I don't see a reason for it, user should understand the risks. And anyway, bitmap from other node is unrelated to this export being read-only or read-write. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org> Message-Id: <20220314213226.362217-3-v.sementsov-og@mail.ru> [eblake: Adjust S-o-b to Vladimir's new email, with permission] Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2022-04-26 13:15:19 -05:00
Marc-André Lureau	e0e7fe07e1	Remove trailing ; after G_DEFINE_AUTO macro The macro doesn't need it. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>	2022-03-22 14:46:18 +04:00
Marc-André Lureau	9edc6313da	Replace GCC_FMT_ATTR with G_GNUC_PRINTF One less qemu-specific macro. It also helps to make some headers/units only depend on glib, and thus moved in standalone projects eventually. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Richard W.M. Jones <rjones@redhat.com>	2022-03-22 14:40:51 +04:00
Peter Maydell	fdee2c9692	nbd patches for 2022-03-07 - Dan Berrange: Allow qemu-nbd to support TLS over Unix sockets - Eric Blake: Minor cleanups related to 64-bit block operations -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEccLMIrHEYCkn0vOqp6FrSiUnQ2oFAmImtE8ACgkQp6FrSiUn Q2ovmgf/aksDqf2eNcahs++fez+8Qi9ll5OY/qGyjnzBgsatYKjrK+xF7OnjoJox eRX026lh81Q4EQK7oZBUnr2UCY4bncDBTI7MTLh603EV/tId5ZLwx007ERhzvtC1 mIsQHXNuO9X25LQG2eWnfunY9YztQpiT5r/g3khD2yPBqJWIvBfblzPLx6FkF7px /WM8xEKCihmGr1Wr3b+zGYL083YkaBWCvHoR8mJt3tEFUj+Qie8XcdV0OVyI0XUj 5goIFRcpVwBE8P2nLtfUKNzEXz22cmdonOJUX7E5IvGO21k5F/HrWlHdo8JnuSUZ t0w5L9yCxBrRpY1burz30b77J0WMCw== =C8Dd -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/ericb/tags/pull-nbd-2022-03-07' into staging nbd patches for 2022-03-07 - Dan Berrange: Allow qemu-nbd to support TLS over Unix sockets - Eric Blake: Minor cleanups related to 64-bit block operations # gpg: Signature made Tue 08 Mar 2022 01:41:35 GMT # gpg: using RSA key 71C2CC22B1C4602927D2F3AAA7A16B4A2527436A # gpg: Good signature from "Eric Blake <eblake@redhat.com>" [full] # gpg: aka "Eric Blake (Free Software Programmer) <ebb9@byu.net>" [full] # gpg: aka "[jpeg image of size 6874]" [full] # Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2 F3AA A7A1 6B4A 2527 436A * remotes/ericb/tags/pull-nbd-2022-03-07: qemu-io: Allow larger write zeroes under no fallback qemu-io: Utilize 64-bit status during map nbd/server: Minor cleanups tests/qemu-iotests: validate NBD TLS with UNIX sockets and PSK tests/qemu-iotests: validate NBD TLS with UNIX sockets tests/qemu-iotests: validate NBD TLS with hostname mismatch tests/qemu-iotests: convert NBD TLS test to use standard filters tests/qemu-iotests: introduce filter for qemu-nbd export list tests/qemu-iotests: expand _filter_nbd rules tests/qemu-iotests: add QEMU_IOTESTS_REGEN=1 to update reference file block/nbd: don't restrict TLS usage to IP sockets qemu-nbd: add --tls-hostname option for TLS certificate validation block/nbd: support override of hostname for TLS certificate validation block: pass desired TLS hostname through from block driver client crypto: mandate a hostname when checking x509 creds on a client Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2022-03-09 11:38:29 +00:00
Eric Blake	314b902621	nbd/server: Minor cleanups Spelling fixes, grammar improvements and consistent spacing, noticed while preparing other patches in this file. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20211203231539.3900865-2-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>	2022-03-07 19:23:25 -06:00
Peter Maydell	5df022cf2e	osdep: Move memalign-related functions to their own header Move the various memalign-related functions out of osdep.h and into their own header, which we include only where they are used. While we're doing this, add some brief documentation comments. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 20220226180723.1706285-10-peter.maydell@linaro.org	2022-03-07 13:16:49 +00:00
Nir Soffer	523f5a9971	nbd/server.c: Remove unused field NBDRequestData struct has unused QSIMPLEQ_ENTRY field. It seems that this field exists since the first git commit and was never used. Signed-off-by: Nir Soffer <nsoffer@redhat.com> Message-Id: <20220111194313.581486-1-nsoffer@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Fixes: `d9a73806` ("qemu-nbd: introduce NBDRequest", v1.1) Signed-off-by: Eric Blake <eblake@redhat.com>	2022-01-28 16:48:28 -06:00
Eric Blake	e35574226a	nbd/server: Simplify zero and trim Now that the block layer supports 64-bit operations (see commit `2800637a` and friends, new to v6.2), we no longer have to self-fragment requests larger than 2G, reverting the workaround added in `890cbccb08` ("nbd: Fix large trim/zero requests", v5.1.0). Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20211117170230.1128262-3-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>	2021-11-22 07:37:15 -06:00
Eric Blake	1644cccea5	nbd/server: Don't complain on certain client disconnects When a client disconnects abruptly, but did not have any pending requests (for example, when using nbdsh without calling h.shutdown), we used to output the following message: $ qemu-nbd -f raw file $ nbdsh -u 'nbd://localhost:10809' -c 'h.trim(1,0)' qemu-nbd: Disconnect client, due to: Failed to read request: Unexpected end-of-file before all bytes were read Then in commit `f148ae7`, we refactored nbd_receive_request() to use nbd_read_eof(); when this returns 0, we regressed into tracing uninitialized memory (if tracing is enabled) and reporting a less-specific: qemu-nbd: Disconnect client, due to: Request handling failed in intermediate state Note that with Unix sockets, we have yet another error message, unchanged by the 6.0 regression: $ qemu-nbd -k /tmp/sock -f raw file $ nbdsh -u 'nbd+unix:///?socket=/tmp/sock' -c 'h.trim(1,0)' qemu-nbd: Disconnect client, due to: Failed to send reply: Unable to write to socket: Broken pipe But in all cases, the error message goes away if the client performs a soft shutdown by using NBD_CMD_DISC, rather than a hard shutdown by abrupt disconnect: $ nbdsh -u 'nbd://localhost:10809' -c 'h.trim(1,0)' -c 'h.shutdown()' This patch fixes things to avoid uninitialized memory, and in general avoids warning about a client that does a hard shutdown when not in the middle of a packet. A client that aborts mid-request, or which does not read the full server's reply, can still result in warnings, but those are indeed much more unusual situations. CC: qemu-stable@nongnu.org Fixes: `f148ae7d36` ("nbd/server: Quiesce coroutines on context switch", v6.0.0) Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> [eblake: defer unrelated typo fixes to later patch] Message-Id: <20211117170230.1128262-2-eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-11-22 07:37:14 -06:00
Eric Blake	76df2b8d69	nbd/server: Silence clang sanitizer warning clang's sanitizer is picky: memset(NULL, x, 0) is technically undefined behavior, even though no sane implementation of memset() deferences the NULL. Caught by the nbd-qemu-allocation iotest. The alternative to checking before each memset is to instead force an allocation of 1 element instead of g_new0(type, 0)'s behavior of returning NULL for a 0-length array. Reported-by: Peter Maydell <peter.maydell@linaro.org> Fixes: `3b1f244c59` (nbd: Allow export of multiple bitmaps for one device) Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20211115223943.626416-1-eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>	2021-11-16 08:45:33 -06:00
Eric Blake	da24597dd3	nbd/server: Allow LIST_META_CONTEXT without STRUCTURED_REPLY The NBD protocol just relaxed the requirements on NBD_OPT_LIST_META_CONTEXT: https://github.com/NetworkBlockDevice/nbd/commit/13a4e33a87 Since listing is not stateful (unlike SET_META_CONTEXT), we don't care if a client asks for meta contexts without first requesting structured replies. Well-behaved clients will still ask for structured reply first (if for no other reason than for back-compat to older servers), but that's no reason to avoid this change. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20210907173505.1499709-1-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>	2021-09-29 13:46:32 -05:00
Richard Henderson	cd1675f8d7	nbd/server: Mark variable unused in nbd_negotiate_meta_queries From clang-13: nbd/server.c:976:22: error: variable 'bitmaps' set but not used \ [-Werror,-Wunused-but-set-variable] which is incorrect; see //bugs.llvm.org/show_bug.cgi?id=3888. Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-07-26 07:06:25 -10:00
Sergio Lopez	fd6afc501a	nbd/server: Use drained block ops to quiesce the server Before switching between AioContexts we need to make sure that we're fully quiesced ("nb_requests == 0" for every client) when entering the drained section. To do this, we set "quiescing = true" for every client on ".drained_begin" to prevent new coroutines from being created, and check if "nb_requests == 0" on ".drained_poll". Finally, once we're exiting the drained section, on ".drained_end" we set "quiescing = false" and call "nbd_client_receive_next_request()" to resume the processing of new requests. With these changes, "blk_aio_attach()" and "blk_aio_detach()" can be reverted to be as simple as they were before `f148ae7d36`. RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1960137 Suggested-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Sergio Lopez <slp@redhat.com> Message-Id: <20210602060552.17433-3-slp@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2021-06-02 14:23:20 +02:00
Nir Soffer	0da9856851	nbd: server: Report holes for raw images When querying image extents for raw image, qemu-nbd reports holes as zero: $ qemu-nbd -t -r -f raw empty-6g.raw $ qemu-img map --output json nbd://localhost [{ "start": 0, "length": 6442450944, "depth": 0, "zero": true, "data": true, "offset": 0}] $ qemu-img map --output json empty-6g.raw [{ "start": 0, "length": 6442450944, "depth": 0, "zero": true, "data": false, "offset": 0}] Turns out that qemu-img map reports a hole based on BDRV_BLOCK_DATA, but nbd server reports a hole based on BDRV_BLOCK_ALLOCATED. The NBD protocol says: NBD_STATE_HOLE (bit 0): if set, the block represents a hole (and future writes to that area may cause fragmentation or encounter an NBD_ENOSPC error); if clear, the block is allocated or the server could not otherwise determine its status. qemu-img manual says: whether the sectors contain actual data or not (boolean field data; if false, the sectors are either unallocated or stored as optimized all-zero clusters); To me, data=false looks compatible with NBD_STATE_HOLE. From user point of view, getting same results from qemu-nbd and qemu-img is more important than being more correct about allocation status. Changing nbd server to report holes using BDRV_BLOCK_DATA makes qemu-nbd results compatible with qemu-img map: $ qemu-img map --output json nbd://localhost [{ "start": 0, "length": 6442450944, "depth": 0, "zero": true, "data": false, "offset": 0}] Signed-off-by: Nir Soffer <nsoffer@redhat.com> Message-Id: <20210219160752.1826830-1-nsoffer@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-03-08 13:08:45 -06:00
Sergio Lopez	f148ae7d36	nbd/server: Quiesce coroutines on context switch When switching between AIO contexts we need to me make sure that both recv_coroutine and send_coroutine are not scheduled to run. Otherwise, QEMU may crash while attaching the new context with an error like this one: aio_co_schedule: Co-routine was already scheduled in 'aio_co_schedule' To achieve this we need a local implementation of 'qio_channel_readv_all_eof' named 'nbd_read_eof' (a trick already done by 'nbd/client.c') that allows us to interrupt the operation and to know when recv_coroutine is yielding. With this in place, we delegate detaching the AIO context to the owning context with a BH ('nbd_aio_detach_bh') scheduled using 'aio_wait_bh_oneshot'. This BH signals that we need to quiesce the channel by setting 'client->quiescing' to 'true', and either waits for the coroutine to finish using AIO_WAIT_WHILE or, if it's yielding in 'nbd_read_eof', actively enters the coroutine to interrupt it. RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1900326 Signed-off-by: Sergio Lopez <slp@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20201214170519.223781-4-slp@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-01-20 14:48:54 -06:00
Eric Blake	c0b21f2e22	nbd: Silence Coverity false positive Coverity noticed (CID 1436125) that we check the return value of nbd_extent_array_add in most places, but not at the end of bitmap_to_extents(). The return value exists to break loops before a future iteration, so there is nothing to check if we are already done iterating. Adding a cast to void, plus a comment why, pacifies Coverity. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20201111163510.713855-1-eblake@redhat.com> [eblake: Prefer cast to void over odd && usage] Reviewed-by: Richard Henderson <richard.henderson@linaro.org>	2020-11-16 14:51:12 -06:00
Eric Blake	dbc7b01492	nbd: Add 'qemu-nbd -A' to expose allocation depth Allow the server to expose an additional metacontext to be requested by savvy clients. qemu-nbd adds a new option -A to expose the qemu:allocation-depth metacontext through NBD_CMD_BLOCK_STATUS; this can also be set via QMP when using block-export-add. qemu as client is hacked into viewing the key aspects of this new context by abusing the already-experimental x-dirty-bitmap option to collapse all depths greater than 2, which results in a tri-state value visible in the output of 'qemu-img map --output=json' (yes, that means x-dirty-bitmap is now a bit of a misnomer, but I didn't feel like renaming it as it would introduce a needless break of back-compat, even though we make no compat guarantees with x- members): unallocated (depth 0) => "zero":false, "data":true local (depth 1) => "zero":false, "data":false backing (depth 2+) => "zero":true, "data":true libnbd as client is probably a nicer way to get at the information without having to decipher such hacks in qemu as client. ;) Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20201027050556.269064-11-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>	2020-10-30 15:22:00 -05:00
Eric Blake	71719cd57f	nbd: Add new qemu:allocation-depth metadata context 'qemu-img map' provides a way to determine which extents of an image come from the top layer vs. inherited from a backing chain. This is useful information worth exposing over NBD. There is a proposal to add a QMP command block-dirty-bitmap-populate which can create a dirty bitmap that reflects allocation information, at which point the qemu:dirty-bitmap:NAME metadata context can expose that information via the creation of a temporary bitmap, but we can shorten the effort by adding a new qemu:allocation-depth metadata context that does the same thing without an intermediate bitmap (this patch does not eliminate the need for that proposal, as it will have other uses as well). While documenting things, remember that although the NBD protocol has NBD_OPT_SET_META_CONTEXT, the rest of its documentation refers to 'metadata context', which is a more apt description of what is actually being used by NBD_CMD_BLOCK_STATUS: the user is requesting metadata by passing one or more context names. So I also touched up some existing wording to prefer the term 'metadata context' where it makes sense. Note that this patch does not actually enable any way to request a server to enable this context; that will come in the next patch. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20201027050556.269064-10-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>	2020-10-30 15:22:00 -05:00
Eric Blake	3b1f244c59	nbd: Allow export of multiple bitmaps for one device With this, 'qemu-nbd -B b0 -B b1 -f qcow2 img.qcow2' can let you sniff out multiple bitmaps from one server. qemu-img as client can still only read one bitmap per client connection, but other NBD clients (hello libnbd) can now read multiple bitmaps in a single pass. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20201027050556.269064-8-eblake@redhat.com>	2020-10-30 15:10:15 -05:00
Eric Blake	47ec485e8d	nbd: Refactor counting of metadata contexts Rather than open-code the count of negotiated contexts at several sites, embed it directly into the struct. This will make it easier for upcoming commits to support even more simultaneous contexts. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20201027050556.269064-7-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>	2020-10-30 15:10:15 -05:00
Eric Blake	02e87e3b1c	nbd: Simplify qemu bitmap context name Each dirty bitmap already knows its name; by reducing the scope of the places where we construct "qemu:dirty-bitmap:NAME" strings, tracking the name is more localized, and there are fewer per-export fields to worry about. This in turn will make it easier for an upcoming patch to export more than one bitmap at once. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20201027050556.269064-6-eblake@redhat.com>	2020-10-30 15:10:15 -05:00
Eric Blake	cbad81cef8	nbd: Update qapi to support exporting multiple bitmaps Since 'block-export-add' is new to 5.2, we can still tweak the interface; there, allowing 'bitmaps':['str'] is nicer than 'bitmap':'str'. This wires up the qapi and qemu-nbd changes to permit passing multiple bitmaps as distinct metadata contexts that the NBD client may request, but the actual support for more than one will require a further patch to the server. Note that there are no changes made to the existing deprecated 'nbd-server-add' command; this required splitting the QAPI type BlockExportOptionsNbd, which fortunately does not affect QMP introspection. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20201027050556.269064-5-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>	2020-10-30 15:10:15 -05:00
Stefan Hajnoczi	f51d23c80a	block/export: add iothread and fixed-iothread options Make it possible to specify the iothread where the export will run. By default the block node can be moved to other AioContexts later and the export will follow. The fixed-iothread option forces strict behavior that prevents changing AioContext while the export is active. See the QAPI docs for details. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20200929125516.186715-5-stefanha@redhat.com [Fix stray '#' character in block-export.json and add missing "(since: 5.2)" as suggested by Eric Blake. --Stefan] Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2020-10-23 13:42:16 +01:00
Eric Blake	ebd57062a1	nbd: Simplify meta-context parsing We had a premature optimization of trying to read as little from the wire as possible while handling NBD_OPT_SET_META_CONTEXT in phases. But in reality, we HAVE to read the entire string from the client before we can get to the next command, and it is easier to just read it all at once than it is to read it in pieces. And once we do that, several functions end up no longer performing I/O, so they can drop length and errp parameters, and just return a bool instead of modifying through a pointer. Our iotests still pass; I also checked that libnbd's testsuite (which covers more corner cases of odd meta context requests) still passes. There are cases where the sequence of trace messages produced differs (for example, when no bitmap is exported, a query for "qemu:" now produces two trace lines instead of one), but trace points are for debug and have no effect on what the client sees. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20200930121105.667049-4-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> [eblake: enhance commit message] Signed-off-by: Eric Blake <eblake@redhat.com>	2020-10-09 15:05:08 -05:00
Eric Blake	d1e2c3e7bd	nbd/server: Reject embedded NUL in NBD strings The NBD spec is clear that any string sent from the client must not contain embedded NUL characters. If the client passes "a\0", we should reject that option request rather than act on "a". Testing this is not possible with a compliant client, but I was able to use gdb to coerce libnbd into temporarily behaving as such a client. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20200930121105.667049-3-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2020-10-09 15:05:08 -05:00
Christian Borntraeger	bbc35fc20e	nbd: silence maybe-uninitialized warnings gcc 10 from Fedora 32 gives me: Compiling C object libblock.fa.p/nbd_server.c.o ../nbd/server.c: In function ‘nbd_co_client_start’: ../nbd/server.c:625:14: error: ‘namelen’ may be used uninitialized in this function [-Werror=maybe-uninitialized] 625 \| rc = nbd_negotiate_send_info(client, NBD_INFO_NAME, namelen, name, \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 626 \| errp); \| ~~~~~ ../nbd/server.c:564:14: note: ‘namelen’ was declared here 564 \| uint32_t namelen; \| ^~~~~~~ cc1: all warnings being treated as errors As I cannot see how this can happen, let uns silence the warning. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Message-Id: <20200930155859.303148-3-borntraeger@de.ibm.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2020-10-09 15:04:32 -05:00
Kevin Wolf	5b1cb49704	nbd: Merge nbd_export_new() and nbd_export_create() There is no real reason any more why nbd_export_new() and nbd_export_create() should be separate functions. The latter only performs a few checks before it calls the former. What makes the current state stand out is that it's the only function in BlockExportDriver that is not a static function inside nbd/server.c, but a small wrapper in blockdev-nbd.c that then calls back into nbd/server.c for the real functionality. Move all the checks to nbd/server.c and make the resulting function static to improve readability. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200924152717.287415-27-kwolf@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-10-02 15:46:40 +02:00
Kevin Wolf	30dbc81d31	block/export: Move writable to BlockExportOptions The 'writable' option is a basic option that will probably be applicable to most if not all export types that we will implement. Move it from NBD to the generic BlockExport layer. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200924152717.287415-26-kwolf@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-10-02 15:46:40 +02:00
Kevin Wolf	331170e073	block/export: Create BlockBackend in blk_exp_add() Every export type will need a BlockBackend, so creating it centrally in blk_exp_add() instead of the .create driver callback avoids duplication. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200924152717.287415-24-kwolf@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-10-02 15:46:40 +02:00
Kevin Wolf	37a4f70cea	block/export: Move blk to BlockExport Every block export has a BlockBackend representing the disk that is exported. It should live in BlockExport therefore. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200924152717.287415-23-kwolf@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-10-02 15:46:40 +02:00
Kevin Wolf	3c3bc462ad	block/export: Add block-export-del Implement a new QMP command block-export-del and make nbd-server-remove a wrapper around it. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200924152717.287415-21-kwolf@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-10-02 15:46:40 +02:00
Kevin Wolf	3859ad36f0	block/export: Move strong user reference to block_exports The reference owned by the user/monitor that is created when adding the export and dropped when removing it was tied to the 'exports' list in nbd/server.c. Every block export will have a user reference, so move it to the block export level and tie it to the 'block_exports' list in block/export/export.c instead. This is necessary for introducing a QMP command for removing exports. Note that exports are present in block_exports even after the user has requested shutdown. This is different from NBD's exports where exports are immediately removed on a shutdown request, even if they are still in the process of shutting down. In order to avoid that the user still interacts with an export that is shutting down (and possibly removes it a second time), we need to remember if the user actually still owns it. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200924152717.287415-20-kwolf@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-10-02 15:46:40 +02:00
Kevin Wolf	bc4ee65b8c	block/export: Add blk_exp_close_all(_type) This adds a function to shut down all block exports, and another one to shut down the block exports of a single type. The latter is used for now when stopping the NBD server. As soon as we implement support for multiple NBD servers, we'll need a per-server list of exports and it will be replaced by a function using that. As a side effect, the BlockExport layer has a list tracking all existing exports now. closed_exports loses its only user and can go away. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200924152717.287415-18-kwolf@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-10-02 15:46:40 +02:00
Kevin Wolf	a6ff798966	block/export: Allocate BlockExport in blk_exp_add() Instead of letting the driver allocate and return the BlockExport object, allocate it already in blk_exp_add() and pass it. This allows us to initialise the generic part before calling into the driver so that the driver can just use these values instead of having to parse the options a second time. For symmetry, move freeing the BlockExport to blk_exp_unref(). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200924152717.287415-17-kwolf@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-10-02 15:46:40 +02:00
Kevin Wolf	8612c68673	block/export: Move AioContext from NBDExport to BlockExport Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200924152717.287415-15-kwolf@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-10-02 15:46:40 +02:00
Kevin Wolf	c69de1bef5	block/export: Move refcount from NBDExport to BlockExport Having a refcount makes sense for all types of block exports. It is also a prerequisite for keeping a list of all exports at the BlockExport level. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200924152717.287415-14-kwolf@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-10-02 15:46:40 +02:00
Kevin Wolf	dbc9e94a23	nbd/server: Simplify export shutdown Closing export is somewhat convoluted because nbd_export_close() and nbd_export_put() call each other and the ways they actually end up being nested is not necessarily obvious. However, it is not really necessary to call nbd_export_close() from nbd_export_put() when putting the last reference because it only does three things: 1. Close all clients. We're going to refcount 0 and all clients hold a reference, so we know there is no active client any more. 2. Close the user reference (represented by exp->name being non-NULL). The same argument applies: If the export were still named, we would still have a reference. 3. Freeing exp->description. This is really cleanup work to be done when the export is finally freed. There is no reason to already clear it while clients are still in the process of shutting down. So after moving the cleanup of exp->description, the code can be simplified so that only nbd_export_close() calls nbd_export_put(), but never the other way around. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200924152717.287415-13-kwolf@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-10-02 15:46:40 +02:00
Kevin Wolf	d794f7f372	nbd: Remove NBDExport.close callback The export close callback is unused by the built-in NBD server. qemu-nbd uses it only during shutdown to wait for the unrefed export to actually go away. It can just use nbd_export_close_all() instead and do without the callback. This removes the close callback from nbd_export_new() and makes both callers of it more similar. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200924152717.287415-11-kwolf@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-10-02 15:46:40 +02:00
Kevin Wolf	9b562c646b	block/export: Remove magic from block-export-add nbd-server-add tries to be convenient and adds two questionable features that we don't want to share in block-export-add, even for NBD exports: 1. When requesting a writable export of a read-only device, the export is silently downgraded to read-only. This should be an error in the context of block-export-add. 2. When using a BlockBackend name, unplugging the device from the guest will automatically stop the NBD server, too. This may sometimes be what you want, but it could also be very surprising. Let's keep things explicit with block-export-add. If the user wants to stop the export, they should tell us so. Move these things into the nbd-server-add QMP command handler so that they apply only there. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200924152717.287415-8-kwolf@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-10-02 15:46:40 +02:00
Kevin Wolf	b57e4de079	qemu-nbd: Use raw block driver for --offset Instead of implementing qemu-nbd --offset in the NBD code, just put a raw block node with the requested offset on top of the user image and rely on that doing the job. This does not only simplify the nbd_export_new() interface and bring it closer to the set of options that the nbd-server-add QMP command offers, but in fact it also eliminates a potential source for bugs in the NBD code which previously had to add the offset manually in all relevant places. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200924152717.287415-7-kwolf@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-10-02 15:46:40 +02:00
Kevin Wolf	56ee86261e	block/export: Add BlockExport infrastructure and block-export-add We want to have a common set of commands for all types of block exports. Currently, this is only NBD, but we're going to add more types. This patch adds the basic BlockExport and BlockExportDriver structs and a QMP command block-export-add that creates a new export based on the given BlockExportOptions. qmp_nbd_server_add() becomes a wrapper around qmp_block_export_add(). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200924152717.287415-5-kwolf@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-10-02 15:46:40 +02:00
Kevin Wolf	8760366cdb	nbd: Remove unused nbd_export_get_blockdev() Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200924152717.287415-2-kwolf@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-10-02 15:46:40 +02:00
Max Reitz	ee2f94ca27	nbd: Use CAF when looking for dirty bitmap When looking for a dirty bitmap to share, we should handle filters by just including them in the search (so they do not break backing chains). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>	2020-09-07 12:31:31 +02:00
Eric Blake	890cbccb08	nbd: Fix large trim/zero requests Although qemu as NBD client limits requests to <2G, the NBD protocol allows clients to send requests almost all the way up to 4G. But because our block layer is not yet 64-bit clean, we accidentally wrap such requests into a negative size, and fail with EIO instead of performing the intended operation. The bug is visible in modern systems with something as simple as: $ qemu-img create -f qcow2 /tmp/image.img 5G $ sudo qemu-nbd --connect=/dev/nbd0 /tmp/image.img $ sudo blkdiscard /dev/nbd0 or with user-space only: $ truncate --size=3G file $ qemu-nbd -f raw file $ nbdsh -u nbd://localhost:10809 -c 'h.trim(310241024*1024,0)' Although both blk_co_pdiscard and blk_pwrite_zeroes currently return 0 on success, this is also a good time to fix our code to a more robust paradigm that treats all non-negative values as success. Alas, our iotests do not currently make it easy to add external dependencies on blkdiscard or nbdsh, so we have to rely on manual testing for now. This patch can be reverted when we later improve the overall block layer to be 64-bit clean, but for now, a minimal fix was deemed less risky prior to release. CC: qemu-stable@nongnu.org Fixes: `1f4d6d18ed` Fixes: `1c6c4bb7f0` Fixes: https://github.com/systemd/systemd/issues/16242 Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20200722212231.535072-1-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> [eblake: rework success tests to use >=0]	2020-07-28 08:49:29 -05:00
Vladimir Sementsov-Ogievskiy	453cc6be0a	nbd: make nbd_export_close_all() synchronous Consider nbd_export_close_all(). The call-stack looks like this: nbd_export_close_all() -> nbd_export_close -> call client_close() for each client. client_close() doesn't guarantee that client is closed: nbd_trip() keeps reference to it. So, nbd_export_close_all() just reduce reference counter on export and removes it from the list, but doesn't guarantee that nbd_trip() finished neither export actually removed. Let's wait for all exports actually removed. Without this fix, the following crash is possible: - export bitmap through internal Qemu NBD server - connect a client - shutdown Qemu On shutdown nbd_export_close_all is called, but it actually don't wait for nbd_trip() to finish and to release its references. So, export is not release, and exported bitmap remains busy, and on try to remove the bitmap (which is part of bdrv_close()) the assertion fails: bdrv_release_dirty_bitmap_locked: Assertion `!bdrv_dirty_bitmap_busy(bitmap)' failed Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200714162234.13113-2-vsementsov@virtuozzo.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-07-17 14:20:57 +02:00
Vladimir Sementsov-Ogievskiy	795d946d07	nbd: Use ERRP_GUARD() If we want to check error after errp-function call, we need to introduce local_err and then propagate it to errp. Instead, use the ERRP_GUARD() macro, benefits are: 1. No need of explicit error_propagate call 2. No need of explicit local_err variable: use errp directly 3. ERRP_GUARD() leaves errp as is if it's not NULL or &error_fatal, this means that we don't break error_abort (we'll abort on error_set, not on error_propagate) If we want to add some info to errp (by error_prepend() or error_append_hint()), we must use the ERRP_GUARD() macro. Otherwise, this info will not be added when errp == &error_fatal (the program will exit prior to the error_append_hint() or error_prepend() call). Fix several such cases, e.g. in nbd_read(). This commit is generated by command sed -n '/^Network Block Device (NBD)$/,/^$/{s/^F: //p}' \ MAINTAINERS \| \ xargs git ls-files \| grep '\.[hc]$' \| \ xargs spatch \ --sp-file scripts/coccinelle/errp-guard.cocci \ --macro-file scripts/cocci-macro-file.h \ --in-place --no-show-diff --max-width 80 Reported-by: Kevin Wolf <kwolf@redhat.com> Reported-by: Greg Kurz <groug@kaod.org> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> [Commit message tweaked] Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20200707165037.1026246-8-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> [ERRP_AUTO_PROPAGATE() renamed to ERRP_GUARD(), and auto-propagated-errp.cocci to errp-guard.cocci. Commit message tweaked again.]	2020-07-10 15:18:09 +02:00
Eric Blake	5c4fe018c0	nbd/server: Avoid long error message assertions CVE-2020-10761 Ever since commit `36683283` (v2.8), the server code asserts that error strings sent to the client are well-formed per the protocol by not exceeding the maximum string length of 4096. At the time the server first started sending error messages, the assertion could not be triggered, because messages were completely under our control. However, over the years, we have added latent scenarios where a client could trigger the server to attempt an error message that would include the client's information if it passed other checks first: - requesting NBD_OPT_INFO/GO on an export name that is not present (commit `0cfae925` in v2.12 echoes the name) - requesting NBD_OPT_LIST/SET_META_CONTEXT on an export name that is not present (commit `e7b1948d` in v2.12 echoes the name) At the time, those were still safe because we flagged names larger than 256 bytes with a different message; but that changed in commit `93676c88` (v4.2) when we raised the name limit to 4096 to match the NBD string limit. (That commit also failed to change the magic number 4096 in nbd_negotiate_send_rep_err to the just-introduced named constant.) So with that commit, long client names appended to server text can now trigger the assertion, and thus be used as a denial of service attack against a server. As a mitigating factor, if the server requires TLS, the client cannot trigger the problematic paths unless it first supplies TLS credentials, and such trusted clients are less likely to try to intentionally crash the server. We may later want to further sanitize the user-supplied strings we place into our error messages, such as scrubbing out control characters, but that is less important to the CVE fix, so it can be a later patch to the new nbd_sanitize_name. Consideration was given to changing the assertion in nbd_negotiate_send_rep_verr to instead merely log a server error and truncate the message, to avoid leaving a latent path that could trigger a future CVE DoS on any new error message. However, this merely complicates the code for something that is already (correctly) flagging coding errors, and now that we are aware of the long message pitfall, we are less likely to introduce such errors in the future, which would make such error handling dead code. Reported-by: Xueqiang Wei <xuwei@redhat.com> CC: qemu-stable@nongnu.org Fixes: https://bugzilla.redhat.com/1843684 CVE-2020-10761 Fixes: `93676c88d7` Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20200610163741.3745251-2-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>	2020-06-10 12:58:59 -05:00
Vladimir Sementsov-Ogievskiy	dacbb6eb8a	nbd/server: use bdrv_dirty_bitmap_next_dirty_area Use bdrv_dirty_bitmap_next_dirty_area for bitmap_to_extents. Since bdrv_dirty_bitmap_next_dirty_area is very accurate in its interface, we'll never exceed requested region with last chunk. So, we don't need dont_fragment, and bitmap_to_extents() interface becomes clean enough to not require any comment. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20200205112041.6003-10-vsementsov@virtuozzo.com Signed-off-by: John Snow <jsnow@redhat.com>	2020-03-18 14:03:46 -04:00
Vladimir Sementsov-Ogievskiy	89cbc7e308	nbd/server: introduce NBDExtentArray Introduce NBDExtentArray class, to handle extents list creation in more controlled way and with fewer OUT parameters in functions. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20200205112041.6003-9-vsementsov@virtuozzo.com Signed-off-by: John Snow <jsnow@redhat.com>	2020-03-18 14:03:46 -04:00

1 2 3 4 5

227 Commits