QEMU supports the NBD protocol, and has an internal NBD client (see
block/nbd.c), an internal NBD server (see blockdev-nbd.c), and an
external NBD server tool (see qemu-nbd.c). The common code is placed
in nbd/*.

The NBD protocol is specified here:
https://github.com/NetworkBlockDevice/nbd/blob/master/doc/proto.md

The following paragraphs describe some specific properties of the NBD
protocol implementation in QEMU.

= Metadata namespaces =

QEMU supports the "base:allocation" metadata context as defined in the
NBD protocol specification, and also defines an additional metadata
namespace "qemu".

== "qemu" namespace ==

The "qemu" namespace currently contains two available metadata context
types. The first is related to exposing the contents of a dirty
bitmap alongside the associated disk contents. That metadata context
is named with the following form:

    qemu:dirty-bitmap:<dirty-bitmap-export-name>

Each dirty-bitmap metadata context defines only one flag for extents
in replies to NBD_CMD_BLOCK_STATUS:

    bit 0: NBD_STATE_DIRTY, set when the extent is "dirty"
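
For example, a client can request this context during negotiation and
then inspect the NBD_STATE_DIRTY bit of each extent returned by
NBD_CMD_BLOCK_STATUS. The sketch below uses the libnbd Python bindings
(an external client library, not part of QEMU); the server URI and the
bitmap name "bitmap0" are placeholders for a bitmap actually exported,
e.g. via qemu-nbd --bitmap or the QMP block-export-add command:

    import nbd

    DIRTY_CTX = "qemu:dirty-bitmap:bitmap0"   # placeholder bitmap name
    NBD_STATE_DIRTY = 1                       # bit 0, as described above

    h = nbd.NBD()
    h.add_meta_context(DIRTY_CTX)       # request the context while negotiating
    h.connect_uri("nbd://localhost")    # placeholder server address

    def show(metacontext, offset, entries, err):
        # Extent entries alternate: length, flags, length, flags, ...
        if metacontext == DIRTY_CTX:
            for i in range(0, len(entries), 2):
                dirty = bool(entries[i + 1] & NBD_STATE_DIRTY)
                print("offset %d, length %d, dirty %s"
                      % (offset, entries[i], dirty))
                offset += entries[i]
        return 0

    # NBD_CMD_BLOCK_STATUS over the start of the export; a real client
    # would loop over the whole image in suitably sized chunks.
    h.block_status(min(h.get_size(), 1 << 30), 0, show)
    h.shutdown()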

The second is related to exposing the source of various extents within
the image, with a single metadata context named:

    qemu:allocation-depth

In the allocation depth context, the entire 32-bit value represents
the depth of the layer in a thin-provisioned backing chain that
provided the data (0 for unallocated, 1 for the active layer, 2 for
the first backing layer, and so forth).
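
Decoding that value is mechanical; the same per-extent flags word shown
in the dirty-bitmap example above simply carries a depth instead of
bit flags. A small illustration in Python (the helper name is ours,
not part of QEMU or the NBD protocol):

    def describe_allocation_depth(depth):
        # Interpret one 32-bit value from a "qemu:allocation-depth"
        # extent, following the rules above.
        if depth == 0:
            return "unallocated"
        if depth == 1:
            return "active layer"
        return "backing layer %d" % (depth - 1)

    assert describe_allocation_depth(0) == "unallocated"
    assert describe_allocation_depth(1) == "active layer"
    assert describe_allocation_depth(3) == "backing layer 2"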

For NBD_OPT_LIST_META_CONTEXT the following queries are supported
in addition to the specific "qemu:allocation-depth" and
"qemu:dirty-bitmap:<dirty-bitmap-export-name>":

* "qemu:" - returns a list of all available metadata contexts in the
  namespace.
* "qemu:dirty-bitmap:" - returns a list of all available dirty-bitmap
  metadata contexts.
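
As an illustration, libnbd's option-mode API can send such an
NBD_OPT_LIST_META_CONTEXT query; the sketch below assumes a reasonably
recent libnbd and uses a placeholder server address:

    import nbd

    h = nbd.NBD()
    h.set_opt_mode(True)                # stay in the negotiation phase
    h.add_meta_context("qemu:")         # wildcard query for the namespace
    h.connect_uri("nbd://localhost")    # placeholder server address

    def collect(name):
        print(name)                     # e.g. "qemu:allocation-depth"
        return 0

    h.opt_list_meta_context(collect)    # sends NBD_OPT_LIST_META_CONTEXT
    h.opt_abort()                       # finish without entering transmission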

= Features by version =

The following list documents which qemu version first implemented
various features (both as a server exposing the feature, and as a
client taking advantage of the feature when present), to make it
easier to plan for cross-version interoperability. Note that in
several cases, the initial release containing a feature may require
additional patches from the corresponding stable branch to fix bugs in
the operation of that feature.

* 2.6: NBD_OPT_STARTTLS with TLS X.509 Certificates
* 2.8: NBD_CMD_WRITE_ZEROES
* 2.10: NBD_OPT_GO, NBD_INFO_BLOCK
* 2.11: NBD_OPT_STRUCTURED_REPLY
* 2.12: NBD_CMD_BLOCK_STATUS for "base:allocation"
* 3.0: NBD_OPT_STARTTLS with TLS Pre-Shared Keys (PSK),
  NBD_CMD_BLOCK_STATUS for "qemu:dirty-bitmap:", NBD_CMD_CACHE
* 4.2: NBD_FLAG_CAN_MULTI_CONN for shareable read-only exports,
  NBD_CMD_FLAG_FAST_ZERO
* 5.2: NBD_CMD_BLOCK_STATUS for "qemu:allocation-depth"
* 7.1: NBD_FLAG_CAN_MULTI_CONN for shareable writable exports
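
When planning for cross-version interoperability, a client can also
probe at run time which of the features above a given server actually
negotiated, rather than relying on version numbers alone. A sketch
with the libnbd Python bindings (the server address is a placeholder):

    import nbd

    h = nbd.NBD()
    # Ask for the contexts we may want, then see what was negotiated.
    h.add_meta_context("base:allocation")
    h.add_meta_context("qemu:allocation-depth")
    h.connect_uri("nbd://localhost")    # placeholder server address

    print("structured replies:", h.get_structured_replies_negotiated())
    print("fast zero:", h.can_fast_zero())            # NBD_CMD_FLAG_FAST_ZERO
    print("multi-conn:", h.can_multi_conn())          # NBD_FLAG_CAN_MULTI_CONN
    print("allocation depth:", h.can_meta_context("qemu:allocation-depth"))
    h.shutdown()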