qemu-e2k

Author	SHA1	Message	Date
Roy Shterman	e0ae49871a	block/iscsi: Adding new iSER transport layer option iSER is a new transport layer supported in Libiscsi; it provides a zero-copy, RDMA-capable interface that can improve performance. To use the new iSER transport, one needs RDMA-capable hardware and must choose iser as the protocol name in the Libiscsi URI. For now, iSER memory buffers are pre-allocated and pre-registered; hence, to work with iSER from QEMU, one needs to set the MEMLOCK attribute of the VM large enough for all iSER buffers and RDMA resources. Signed-off-by: Roy Shterman <roysh@mellanox.com> Message-Id: <1476000896-18632-3-git-send-email-roysh@mellanox.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-10-24 11:30:55 +02:00
Roy Shterman	583ec22e23	block/iscsi: Introducing new zero-copy API A new API to deploy zero-copy command submission. The new API takes I/O vectors list and number of I/O vectors to submit as input parameters when initiating the command. New API must be used if working with iSER transport option. Signed-off-by: Roy Shterman <roysh@mellanox.com> Message-Id: <1476000896-18632-2-git-send-email-roysh@mellanox.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-10-24 11:30:55 +02:00
Paolo Bonzini	fffb6e1223	block: use aio_bh_schedule_oneshot This simplifies bottom half handlers by removing calls to qemu_bh_delete and thus removing the need to stash the bottom half pointer in the opaque datum. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-10-07 13:34:07 +02:00
Peter Maydell	4c892756fd	Merge remote-tracking branch 'remotes/famz/tags/various-pull-request' into staging # gpg: Signature made Fri 23 Sep 2016 05:58:28 BST # gpg: using RSA key 0xCA35624C6A9171C6 # gpg: Good signature from "Fam Zheng <famz@redhat.com>" # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: 5003 7CB7 9706 0F76 F021 AD56 CA35 624C 6A91 71C6 * remotes/famz/tags/various-pull-request: (23 commits) docker: exec $CMD docker: Terminate instances at SIGTERM and SIGHUP docker: Support showing environment information docker: Print used options before doing configure docker: Flatten default target list in test-quick docker: Update fedora image to latest docker: Generate /packages.txt in ubuntu image docker: Generate /packages.txt in fedora image docker: Generate /packages.txt in centos6 image tests: Ignore test-uuid Add UUID files to MAINTAINERS tests: Add uuid tests uuid: Tighten uuid parse vl: Switch qemu_uuid to QemuUUID configure: Remove detection code for UUID tests: No longer dependent on CONFIG_UUID crypto: Switch to QEMU UUID API vpc: Use QEMU UUID API vdi: Use QEMU UUID API vhdx: Use QEMU UUID API ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org> # Conflicts: # tests/Makefile.include	2016-09-23 13:10:43 +01:00
Peter Maydell	6de68ffd7c	Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging * More KVM LAPIC fixes * fix divide-by-zero regression on libiscsi SG devices * fix qemu-char segfault * add scripts/show-fixed-bugs.sh # gpg: Signature made Thu 22 Sep 2016 19:20:57 BST # gpg: using RSA key 0xBFFBD25F78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: kvm: fix events.flags (KVM_VCPUEVENT_VALID_SMM) overwritten by 0 scripts: Add a script to check for bug URLs in the git log msmouse: Fix segfault caused by free the chr before chardev cleanup. iscsi: Fix divide-by-zero regression on raw SG devices kvm: apic: set APIC base as part of kvm_apic_put target-i386: introduce kvm_put_one_msr Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2016-09-23 12:12:55 +01:00
Fam Zheng	cea25275a3	util: Add UUID API A number of different places across the code base use CONFIG_UUID. Some of them are soft dependencies, some are not built if libuuid is not available, some come with a dummy fallback, and some throw a runtime error. This is hard to maintain, and hard for users to reason about. Since UUID is a simple standard with only a small number of operations, it is cleaner to have central support in libqemuutil. This patch adds qemu_uuid_* functions that all uuid users in the code base can rely on. Except for qemu_uuid_generate, which is new code, all other functions are just copied from existing fallbacks in other files. Note that qemu_uuid_parse is moved without updating the function signature to use QemuUUID, to keep this patch simple. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Message-Id: <1474432046-325-2-git-send-email-famz@redhat.com>	2016-09-23 11:42:52 +08:00
Eric Blake	95eaa78537	iscsi: Fix divide-by-zero regression on raw SG devices When qemu uses iscsi devices in sg mode, iscsilun->block_size is left at 0. Prior to commits `cf081fca` and similar, when block limits were tracked in sectors, this did not matter: various block limits were just left at 0. But when we started scaling by block size, this caused SIGFPE. Then, in a later patch, commit `a5b8dd2c` added an assertion to bdrv_open_common() that request_alignment is always non-zero; which was not true for SG mode. Rather than relax that assertion, we can just provide a sane value (we don't know of any SG device with a block size smaller than qemu's default sizing of 512 bytes). One possible solution for SG mode is to just blindly skip ALL of iscsi_refresh_limits(), since we already short circuit so many other things in sg mode. But this patch takes a slightly more conservative approach, and merely guarantees that scaling will succeed, while still using multiples of the original size where possible. Resulting limits may still be zero in SG mode (that is, we mostly only fix block_size used as a denominator or which affect assertions, not all uses). Reported-by: Holger Schranz <holger@fam-schranz.de> Signed-off-by: Eric Blake <eblake@redhat.com> CC: qemu-stable@nongnu.org Message-Id: <1473283640-15756-1-git-send-email-eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-09-22 20:20:51 +02:00
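The "sane value" idea in the entry above can be sketched in C as follows. This is only an illustration under stated assumptions, not the code from block/iscsi.c: the names `DEFAULT_SECTOR_SIZE`, `effective_block_size` and `bytes_to_blocks` are hypothetical, and a block size of 0 stands for the unprobed value seen in SG mode.

```c
#include <stdint.h>

/* Hypothetical stand-in for qemu's default 512-byte sizing. */
#define DEFAULT_SECTOR_SIZE 512u

/*
 * Return a block size that is safe to divide by. A LUN block size of 0
 * (assumed here to mean "left unset in SG mode") falls back to 512;
 * larger block sizes are kept unchanged.
 */
static uint32_t effective_block_size(uint32_t lun_block_size)
{
    return lun_block_size > DEFAULT_SECTOR_SIZE ? lun_block_size
                                                : DEFAULT_SECTOR_SIZE;
}

/* Scale a byte count to blocks without risking a division by zero. */
static uint64_t bytes_to_blocks(uint64_t bytes, uint32_t lun_block_size)
{
    return bytes / effective_block_size(lun_block_size);
}
```

With this guard, `bytes_to_blocks(4096, 0)` divides by 512 instead of raising SIGFPE.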
Colin Lord	f57b4b5fb1	blockdev: prepare iSCSI block driver for dynamic loading This commit moves the initialization of the QemuOptsList qemu_iscsi_opts struct out of block/iscsi.c in order to allow the iscsi module to be dynamically loaded. Signed-off-by: Colin Lord <clord@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1471008424-16465-2-git-send-email-clord@redhat.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2016-09-20 22:10:57 +02:00
Eric Blake	97c7e85cfe	iscsi: Switch .bdrv_co_discard() to byte-based Another step towards killing off sector-based block APIs. Unlike write_zeroes, where we can be handed unaligned requests and must fail gracefully with -ENOTSUP for a fallback, we are guaranteed that discard requests are always aligned because the block layer already ignored unaligned head/tail. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1468624988-423-13-git-send-email-eblake@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-20 14:24:25 +01:00
Eric Blake	6bd01f14db	iscsi: Rely on block layer to break up large requests Now that the block layer honors max_request, we don't need to bother with an EINVAL on overlarge requests, but can instead assert that requests are well-behaved. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1468607524-19021-7-git-send-email-eblake@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-20 14:11:54 +01:00
Peter Lieven	e1123a3b40	block/iscsi: allow caching of the allocation map Until now the allocation map was used only as a hint whether a cluster is allocated or not. If a block was not allocated (or QEMU had no info about the allocation status), a get_block_status call was issued to check the allocation status and possibly avoid a subsequent read of unallocated sectors. If a block was known to be allocated, the get_block_status call was omitted. In the other case, a get_block_status call was issued before every read to avoid the necessity for a consistent allocation map. To avoid the potential overhead of calling get_block_status for each and every read request, this took place only for the bigger requests. This patch enhances this mechanism to cache the allocation status and avoid calling get_block_status for blocks whose allocation status has been queried before. This allows bypassing the read request even for smaller requests, and additionally omits calling get_block_status for blocks known to be unallocated. Signed-off-by: Peter Lieven <pl@kamp.de> Message-Id: <1468831940-15556-3-git-send-email-pl@kamp.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-07-19 08:34:53 +02:00
Peter Lieven	eb36b953e0	block/iscsi: fix rounding in iscsi_allocationmap_set When setting clusters as allocated, the boundaries have to be expanded. As Paolo pointed out, the calculation of the number of clusters is wrong: Suppose cluster_sectors is 2, sector_num = 1, nb_sectors = 6: In the "mark allocated" case, you want to set 0..8, i.e. cluster_num=0, nb_clusters=4. 0--.--2--.--4--.--6--.--8 <--\|_________________\|--> (<--> = expanded) Instead you are setting nb_clusters=3, so that 6..8 is not marked. 0--.--2--.--4--.--6--.--8 <--\|______________\|!!! (! = wrong) Cc: qemu-stable@nongnu.org Reported-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Peter Lieven <pl@kamp.de> Message-Id: <1468831940-15556-2-git-send-email-pl@kamp.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-07-19 08:34:53 +02:00
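The expansion rule from the entry above (cluster_sectors = 2, sector_num = 1, nb_sectors = 6 must yield cluster_num = 0, nb_clusters = 4) can be checked with a small C sketch. The type `ClusterRange` and the functions `div_round_up` and `expand_to_clusters` are hypothetical names chosen for illustration; this is not the code of iscsi_allocationmap_set itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: first cluster and number of clusters. */
typedef struct {
    int64_t cluster_num;
    int64_t nb_clusters;
} ClusterRange;

/* Round a / b up to the next integer (a >= 0, b > 0). */
static int64_t div_round_up(int64_t a, int64_t b)
{
    return (a + b - 1) / b;
}

/*
 * Expand the sector range [sector_num, sector_num + nb_sectors) outward
 * to whole clusters: round the start down and the end up.
 */
static ClusterRange expand_to_clusters(int64_t sector_num, int64_t nb_sectors,
                                       int64_t cluster_sectors)
{
    ClusterRange r;

    assert(sector_num >= 0 && nb_sectors >= 0 && cluster_sectors > 0);
    r.cluster_num = sector_num / cluster_sectors;
    r.nb_clusters = div_round_up(sector_num + nb_sectors, cluster_sectors)
                    - r.cluster_num;
    return r;
}
```

Rounding only nb_sectors up (6 / 2 = 3) ignores the unaligned start, which is how the cluster covering sectors 6..8 was missed.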
Paolo Bonzini	0b8b8753e4	coroutine: move entry argument to qemu_coroutine_create In practice the entry argument is always known at creation time, and it is confusing that sometimes qemu_coroutine_enter is used with a non-NULL argument to re-enter a coroutine (this happens in block/sheepdog.c and tests/test-coroutine.c). So pass the opaque value at creation time, for consistency with e.g. aio_bh_new. Mostly done with the following semantic patch: @ entry1 @ expression entry, arg, co; @@ - co = qemu_coroutine_create(entry); + co = qemu_coroutine_create(entry, arg); ... - qemu_coroutine_enter(co, arg); + qemu_coroutine_enter(co); @ entry2 @ expression entry, arg; identifier co; @@ - Coroutine co = qemu_coroutine_create(entry); + Coroutine co = qemu_coroutine_create(entry, arg); ... - qemu_coroutine_enter(co, arg); + qemu_coroutine_enter(co); @ entry3 @ expression entry, arg; @@ - qemu_coroutine_enter(qemu_coroutine_create(entry), arg); + qemu_coroutine_enter(qemu_coroutine_create(entry, arg)); @ reentry @ expression co; @@ - qemu_coroutine_enter(co, NULL); + qemu_coroutine_enter(co); except for the aforementioned few places where the semantic patch stumbled (as expected) and for test_co_queue, which would otherwise produce an uninitialized variable warning. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-13 13:26:02 +02:00
Markus Armbruster	a9c94277f0	Use #include "..." for our own headers, <...> for others Tracked down with an ugly, brittle and probably buggy Perl script. Also move includes converted to <...> up so they get included before ours where that's obviously okay. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Tested-by: Eric Blake <eblake@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net>	2016-07-12 16:19:16 +02:00
Eric Blake	5411541270	block: Use bool as appropriate for BDS members Using int for values that are only used as booleans is confusing. While at it, rearrange a couple of members so that all the bools are contiguous. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:26 +02:00
Eric Blake	a5b8dd2ce8	block: Move request_alignment into BlockLimit It makes more sense to have ALL block size limit constraints in the same struct. Improve the documentation while at it. Simplify a couple of conditionals, now that we have audited and documented that request_alignment is always non-zero. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:26 +02:00
Eric Blake	b9f7855a50	block: Switch discard length bounds to byte-based Sector-based limits are awkward to think about; in our on-going quest to move to byte-based interfaces, convert max_discard and discard_alignment. Rename them, using 'pdiscard' as an aid to track which remaining discard interfaces need conversion, and so that the compiler will help us catch the change in semantics across any rebased code. The BlockLimits type is now completely byte-based; and in iscsi.c, sector_limits_lun2qemu() is no longer needed. pdiscard_alignment is made unsigned (we use power-of-2 alignments as bitmasks, where unsigned is easier to think about) while leaving max_pdiscard signed (since we still have an 'int' interface); this is comparable to what commit `cf081fc` did for write zeroes limits. We may later want to make everything an unsigned 64-bit limit - but that requires a bigger code audit. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:25 +02:00
Eric Blake	5def6b80e1	block: Switch transfer length bounds to byte-based Sector-based limits are awkward to think about; in our on-going quest to move to byte-based interfaces, convert max_transfer_length and opt_transfer_length. Rename them (dropping the _length suffix) so that the compiler will help us catch the change in semantics across any rebased code, and improve the documentation. Use unsigned values, so that we don't have to worry about negative values and so that bit-twiddling is easier; however, we are still constrained by 2^31 of signed int in most APIs. When a value comes from an external source (iscsi and raw-posix), sanitize the results to ensure that opt_transfer is a power of 2. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:25 +02:00
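One way to sanitize an externally supplied value into a power of 2, as the entry above describes for opt_transfer, is to round it down. The sketch below assumes rounding down is acceptable (the commit message does not say which direction is used), and `pow2_floor` is a hypothetical helper name.

```c
#include <stdint.h>

/*
 * Return the largest power of two that is <= value, or 0 when value is 0
 * (0 is assumed here to mean "no preference advertised").
 */
static uint32_t pow2_floor(uint32_t value)
{
    uint32_t p = 1;

    if (value == 0) {
        return 0;
    }
    /* Compare against value / 2 so that p * 2 can never overflow. */
    while (p <= value / 2) {
        p *= 2;
    }
    return p;
}
```

For example, a device reporting an optimal transfer size of 3000 bytes would be sanitized to 2048.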
Eric Blake	c8b3b998e2	iscsi: Set request_alignment during .bdrv_refresh_limits() We want to eventually stick request_alignment alongside other BlockLimits, but first, we must ensure it is populated at the same time as all other limits, rather than being a special case that is set only when a block is first opened. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:25 +02:00
Eric Blake	f9e95af0a6	iscsi: Advertise realistic limits to block layer The function sector_limits_lun2qemu() returns a value in units of the block layer's 512-byte sector, and can be as large as 0x40000000, which is much larger than the block layer's inherent limit of BDRV_REQUEST_MAX_SECTORS. The block layer already handles '0' as a synonym to the inherent limit, and it is nicer to return this value than it is to calculate an arbitrary maximum, for two reasons: we want to ensure that the block layer continues to special-case '0' as 'no limit beyond the inherent limits'; and we want to be able to someday expand the block layer to allow 64-bit limits, where auditing for uses of BDRV_REQUEST_MAX_SECTORS will help us make sure we aren't artificially constraining iscsi to old block layer limits. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:25 +02:00
Peter Lieven	0ead93120e	iscsi: fix assertion in is_sector_request_lun_aligned Commit `94d047a` added an assertion to the request alignment check. This introduced 2 issues: a) An off-by-one error, since a request of BDRV_REQUEST_MAX_SECTORS is actually allowed. b) The bdrv_get_block_status call in the read path to check the allocation status requests up to INT_MAX sectors, which triggers the assertion. Fixes: `94d047a35b` Signed-off-by: Peter Lieven <pl@kamp.de> Message-Id: <1466414680-18383-1-git-send-email-pl@kamp.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-06-29 14:03:47 +02:00
Eric Blake	94d047a35b	iscsi: Convert to bdrv_co_pwrite_zeroes() Another step on our continuing quest to switch to byte-based interfaces. As this is the first byte-based iscsi interface, convert is_request_lun_aligned() into two versions, one for sectors and one for bytes. Also, change from outright -EINVAL failure on an unaligned request, to instead failing with -ENOTSUP to trigger a read-modify-write fallback, particularly since the block layer should be honoring bs->request_alignment to avoid -EINVAL on read/write requests. Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-06-08 10:21:08 +02:00
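The byte-based alignment check and the switch from -EINVAL to -ENOTSUP described in the entry above might look roughly like this. It is a simplified sketch: `is_byte_request_aligned` and `check_write_zeroes` are hypothetical names, and the real driver works on its own LUN structure rather than a bare block size.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* True when both the offset and the length are multiples of the block size. */
static bool is_byte_request_aligned(int64_t offset, int64_t count,
                                    uint32_t block_size)
{
    return block_size != 0 &&
           offset % block_size == 0 &&
           count % block_size == 0;
}

/*
 * Return 0 if a zero-write may be sent to the target as is, or -ENOTSUP
 * so that the caller can fall back to read-modify-write.
 */
static int check_write_zeroes(int64_t offset, int64_t count,
                              uint32_t block_size)
{
    if (!is_byte_request_aligned(offset, count, block_size)) {
        return -ENOTSUP;
    }
    return 0;
}
```

Returning -ENOTSUP rather than -EINVAL marks the request as valid but not directly supported, which is what lets an upper layer try a fallback.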
Eric Blake	cf081fca4e	block: Track write zero limits in bytes Another step towards removing sector-based interfaces: convert the maximum write and minimum alignment values from sectors to bytes. Rename the variables to let the compiler check that all users are converted to the new semantics. The maximum remains an int as long as BDRV_REQUEST_MAX_SECTORS is constrained by INT_MAX (this means that we can't even support a 2G write_zeroes, but just under it) - changing operation lengths to unsigned or to 64-bits is a much bigger audit, and debatable if we even want to do it (since at the core, a 32-bit platform will still have ssize_t as its underlying limit on write()). Meanwhile, alignment is changed to 'uint32_t', since it makes no sense to have an alignment larger than the maximum write, and less painful to use an unsigned type with well-defined behavior in bit operations than to have to worry about what happens if a driver mistakenly supplies a negative alignment. Add an assert that no one was trying to use sectors to get a write zeroes larger than 2G, and therefore that a later conversion to bytes won't be impacted by keeping the limit at 32 bits. Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-06-08 10:21:08 +02:00
Eric Blake	8b18474451	iscsi: Use block size as minimum zero/discard alignment If hardware does not advertise a minimum zero/discard alignment, we still want to guarantee that the block layer will align requests to our blocks, rather than the arbitrary 512-byte BDRV sector size. Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-06-08 10:21:08 +02:00
Peter Lieven	a6b3167fa0	block/iscsi: avoid potential overflow of acb->task->cdb at least in the path via virtio-blk the maximum size is not restricted. Cc: qemu-stable@nongnu.org Signed-off-by: Peter Lieven <pl@kamp.de> Message-Id: <1464080368-29584-1-git-send-email-pl@kamp.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-05-29 09:11:11 +02:00
Vadim Rozenfeld	644c6869d3	iscsi: pass SCSI status back for SG_IO Signed-off-by: Vadim Rozenfeld <vrozenfe@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-05-23 16:53:46 +02:00
Eric Blake	465fe887cc	block: Honor BDRV_REQ_FUA during write_zeroes The block layer has a couple of cases where it can lose Force Unit Access semantics when writing a large block of zeroes, such that the request returns before the zeroes have been guaranteed to land on underlying media. SCSI does not support FUA during WRITESAME(10/16); FUA is only supported if it falls back to WRITE(10/16). But where the underlying device is new enough to not need a fallback, it means that any upper layer request with FUA semantics was silently ignoring BDRV_REQ_FUA. Conversely, NBD has situations where it can support FUA but not ZERO_WRITE; when that happens, the generic block layer fallback to bdrv_driver_pwritev() (or the older bdrv_co_writev() in qemu 2.6) was losing the FUA flag. The problem of losing flags unrelated to ZERO_WRITE has been latent in bdrv_co_do_write_zeroes() since commit `aa7bfbff`, but back then, it did not matter because there was no FUA flag. It became observable when commit `93f5e6d8` paved the way for flags that can impact correctness, when we should have been using bdrv_co_writev_flags() with modified flags. Compare to commit `9eeb6dd`, which got flag manipulation right in bdrv_co_do_zero_pwritev(). Symptoms: I tested with qemu-io with default writethrough cache (which is supposed to use FUA semantics on every write), and targeted an NBD client connected to a server that intentionally did not advertise NBD_FLAG_SEND_FUA. When doing 'write 0 512', the NBD client sent two operations (NBD_CMD_WRITE then NBD_CMD_FLUSH) to get the fallback FUA semantics; but when doing 'write -z 0 512', the NBD client sent only NBD_CMD_WRITE. The fix is to do a cleanup bdrv_co_flush() at the end of the operation if any step in the middle relied on a BDS that does not natively support FUA for that step (note that we don't need to flush after every operation, if the operation is broken into chunks based on bounce-buffer sizing).
Each BDS gains a new flag .supported_zero_flags, which parallels the use of .supported_write_flags but only when accessing a zero write operation (the flags MUST be different, because of SCSI having different semantics based on WRITE vs. WRITESAME; and also because BDRV_REQ_MAY_UNMAP only makes sense on zero writes). Also fix some documentation to describe -ENOTSUP semantics, particularly since iscsi depends on those semantics. Down the road, we may want to add a driver where its .bdrv_co_pwritev() honors all three of BDRV_REQ_FUA, BDRV_REQ_ZERO_WRITE, and BDRV_REQ_MAY_UNMAP, and advertise this via bs->supported_write_flags for blocks opened by that driver; such a driver should NOT supply .bdrv_co_write_zeroes nor .supported_zero_flags. But none of the drivers touched in this patch want to do that (the act of writing zeroes is different enough from normal writes to deserve a second callback). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-05-12 15:22:09 +02:00
Eric Blake	4df863f336	block: Make supported_write_flags a per-bds property Pre-patch, .supported_write_flags lives at the driver level, which means we are blindly declaring that all block devices using a given driver will either equally support FUA, or that we need a fallback at the block layer. But there are drivers where FUA support is a per-block decision: the NBD block driver is dependent on the remote server advertising NBD_FLAG_SEND_FUA (and has fallback code to duplicate the flush that the block layer would do if NBD had not set .supported_write_flags); and the iscsi block driver is dependent on the mode sense bits advertised by the underlying device (and is currently silently ignoring FUA requests if the underlying device does not support FUA). The fix is to make supported flags as a per-BDS option, set during .bdrv_open(). This patch moves the variable and fixes NBD and iscsi to set it only conditionally; later patches will then further simplify the NBD driver to quit duplicating work done at the block layer, as well as tackle the fact that SCSI does not support FUA semantics on WRITESAME(10/16) but only on WRITE(10/16). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-05-12 15:22:09 +02:00
Kevin Wolf	78a07294d5	block: Introduce bdrv_driver_pwritev() This is a function that simply calls into the block driver for doing a write, providing the byte granularity interface we want to eventually have everywhere, and using whatever interface that driver supports. This one is a bit more interesting than the version for reads: It adds support for .bdrv_co_writev_flags() everywhere, so that drivers implementing this function can drop .bdrv_co_writev() now. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com>	2016-05-12 15:22:07 +02:00
Kevin Wolf	9f0eb9e129	iscsi: Support BDRV_REQ_FUA This replaces the existing hack in the iscsi driver that sent the FUA bit in writethrough mode and ignored the following flush in order to optimise the number of roundtrips (see commit `73b5394e`). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2016-03-30 12:16:02 +02:00
Kevin Wolf	bfd18d1e0b	block: Move enable_write_cache to BB level Whether a write cache is used or not is a decision that concerns the user (e.g. the guest device) rather than the backend. It was already logically part of the BB level as bdrv_move_feature_fields() always kept it on top of the BDS tree; with this patch, the core of it (the actual flag and the additional flushes) is also implemented there. Direct callers of bdrv_open() must pass BDRV_O_CACHE_WB now if bs doesn't have a BlockBackend attached. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2016-03-30 12:16:02 +02:00
Daniel P. Berrange	b189346eb1	iscsi: add support for getting CHAP password via QCryptoSecret API The iSCSI driver currently accepts the CHAP password in plain text as a block driver property. This change adds a new "password-secret" property that accepts the ID of a QCryptoSecret instance. $QEMU \ -object secret,id=sec0,filename=/home/berrange/example.pw \ -drive driver=iscsi,url=iscsi://example.com/target-foo/lun1,\ user=dan,password-secret=sec0 Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 1453385961-10718-4-git-send-email-berrange@redhat.com Signed-off-by: Jeff Cody <jcody@redhat.com>	2016-02-29 14:54:31 -05:00
Fam Zheng	3399833f14	iscsi: Assign bs to file in iscsi_co_get_block_status Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com> Message-id: 1453780743-16806-6-git-send-email-famz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2016-02-02 17:50:47 +01:00
Fam Zheng	67a0fd2a9b	block: Add "file" output parameter to block status query functions The added parameter can be used to return the BDS pointer which the valid offset is referring to. Its value should be ignored unless BDRV_BLOCK_OFFSET_VALID in ret is set. Until block drivers fill in the right value, let's clear it explicitly right before calling .bdrv_get_block_status. The "bs->file" condition in bdrv_co_get_block_status is kept now to keep iotest case 102 passing, and will be fixed once all drivers return the right file pointer. Signed-off-by: Fam Zheng <famz@redhat.com> Message-id: 1453780743-16806-2-git-send-email-famz@redhat.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2016-02-02 17:50:47 +01:00
Peter Maydell	80c71a241a	block: Clean up includes Clean up includes so that osdep.h is included first and headers which it implies are not included manually. This commit was created with scripts/clean-includes. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-01-20 13:36:23 +01:00
Zhu Lingshan	1cb6d137ff	iscsi: send readcapacity10 when readcapacity16 failed When playing with a Dell MD3000 target, which is certainly a TYPE_DISK, readcapacity16 would fail, but we found that readcapacity10 succeeded. It looks like the target only supports readcapacity10 even though it is a TYPE_DISK, or it has some TYPE_ROM characteristics. This patch gives a chance to send readcapacity10 when readcapacity16 failed. This patch is not harmful to the original paths. Signed-off-by: Zhu Lingshan <lszhu@suse.com> Message-Id: <1451359934-9236-1-git-send-email-lszhu@suse.com> [Don't fall through on UNIT ATTENTION. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-01-15 18:58:01 +01:00
Zhu Lingshan	240125bc49	iscsi: fix readcapacity error message The error message for readcapacity 16 incorrectly mentioned a readcapacity 10 failure; fix the error message. Signed-off-by: Zhu Lingshan <lszhu@suse.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2016-01-11 11:39:28 +03:00
Fam Zheng	83c98d7b92	block: Drop BlockDriver.bdrv_ioctl Now the callback is not used any more, drop the field along with all implementations in block drivers, which are iscsi and raw. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-id: 1447064214-29930-8-git-send-email-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-11-12 16:22:43 +01:00
Fam Zheng	4bb17ab51a	iscsi: Emulate commands in iscsi_aio_ioctl as iscsi_ioctl iscsi_ioctl emulates SG_GET_VERSION_NUM and SG_GET_SCSI_ID. Now that bdrv_ioctl() will be emulated with .bdrv_aio_ioctl, replicate the logic into iscsi_aio_ioctl to make them consistent. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-id: 1447064214-29930-5-git-send-email-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-11-12 16:22:42 +01:00
Fam Zheng	e01dd3da5c	iscsi: Translate scsi sense into error code Previously we return -EIO blindly when anything goes wrong. Add a helper function to parse sense fields and try to make the return code more meaningful. This also fixes the default werror configuration (enospc) when we're using qcow2 on an iscsi lun. The old -EIO not being treated as out of space error failed to trigger vm stop. Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <1446699609-11376-1-git-send-email-famz@redhat.com> [libiscsi 1.9 compatibility - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-11-05 14:42:19 +01:00
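The idea of translating sense data into an errno, instead of always returning -EIO, can be sketched as below. This is a deliberately reduced illustration, not the helper added by the commit: `sense_to_errno` is a hypothetical name, and the single mapping shown (additional sense code 27h with qualifier 07h, read here as "space allocation failed", mapped to ENOSPC) is an assumption about which code signals an out-of-space condition.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Map an additional sense code / qualifier pair to a positive errno value.
 * Only one case is shown; everything else keeps the old blanket EIO.
 */
static int sense_to_errno(uint8_t asc, uint8_t ascq)
{
    uint16_t code = (uint16_t)((asc << 8) | ascq);

    switch (code) {
    case 0x2707: /* assumed: space allocation failed (thin-provisioned LUN full) */
        return ENOSPC;
    default:
        return EIO;
    }
}
```

A caller would return the negated value, so that a werror=enospc policy can recognize -ENOSPC and stop the VM rather than report a generic I/O error.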
Fam Zheng	dca21ef23b	aio: Add "is_external" flag for event handlers All callers pass in false, and the real external ones will switch to true in coming patches. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-10-23 18:18:23 +02:00
Peter Lieven	6d1f252d8c	block/iscsi: validate block size returned from target It has been reported that at least tgtd returns a block size of 0 for LUN 0. To avoid running into divide by zero later on and protect against other problematic block sizes validate the block size right at connection time. Cc: qemu-stable@nongnu.org Reported-by: Andrey Korolyov <andrey@xdel.ru> Signed-off-by: Peter Lieven <pl@kamp.de> Message-Id: <1439552016-8557-1-git-send-email-pl@kamp.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-09-07 18:14:03 +02:00
Peter Lieven	9049736ec7	block/iscsi: restore compatibility with libiscsi 1.9.0 RHEL7 and others are stuck with libiscsi 1.9.0 since there unfortunately was an ABI breakage after that release. Signed-off-by: Peter Lieven <pl@kamp.de> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1435313881-19366-1-git-send-email-pl@kamp.de Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-07-02 10:06:23 +01:00
Peter Lieven	5dd7a535b7	block/iscsi: add support for request timeouts libiscsi starting with 1.15 will properly support timeout of iscsi commands. The default will remain no timeout, but this can be changed via cmdline parameters, e.g.: qemu -iscsi timeout=30 -drive file=iscsi://... If a timeout occurs a reconnect is scheduled and the timed out command will be requeued for processing after a successful reconnect. The required API call iscsi_set_timeout is present since libiscsi 1.10 which was released in October 2013. However, due to some bugs in the libiscsi code the use is not recommended before version 1.15. Please note that this patch bumps the libiscsi requirement to 1.10 to have all function and macros defined. The patch fixes also a off-by-one error in the NOP timeout calculation which was fixed while touching these code parts. Signed-off-by: Peter Lieven <pl@kamp.de> Message-id: 1434455107-19328-1-git-send-email-pl@kamp.de Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-07-02 09:20:18 +01:00
Dimitris Aragiorgis	1b6bc94d5d	Fix migration in case of scsi-generic During migration, QEMU uses fsync()/fdatasync() on the open file descriptor for read-write block devices to flush data just before stopping the VM. However, fsync() on a scsi-generic device returns -EINVAL which causes the migration to fail. This patch skips flushing data in case of an SG device, since submitting SCSI commands directly via an SG character device (e.g. /dev/sg0) bypasses the page cache completely, anyway. Note that fsync() not only flushes the page cache but also the disk cache. The scsi-generic device never sends flushes, and for migration it assumes that the same SCSI device is used by the destination host, so it does not issue any SCSI SYNCHRONIZE CACHE (10) command. Finally, remove the bdrv_is_sg() test from iscsi_co_flush() since this is now redundant (we flush the underlying protocol at the end of bdrv_co_flush() which, with this patch, we never reach). Signed-off-by: Dimitris Aragiorgis <dimara@arrikto.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1435056300-14924-3-git-send-email-dimara@arrikto.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-06-23 15:08:52 +01:00
Dimitris Aragiorgis	b192af8acc	block: Use bdrv_is_sg() everywhere Instead of checking bs->sg use bdrv_is_sg() consistently throughout the code. Signed-off-by: Dimitris Aragiorgis <dimara@arrikto.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1435056300-14924-2-git-send-email-dimara@arrikto.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-06-23 15:08:52 +01:00
Markus Armbruster	d49b683644	qerror: Move #include out of qerror.h Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>	2015-06-22 18:20:40 +02:00
Fam Zheng	44f192f364	iscsi: Remove pointless runtime check of macro value raw_bsd already has QEMU_BUILD_BUG_ON(BDRV_SECTOR_SIZE != 512), so iscsi should relax. Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2015-06-03 14:21:23 +03:00
Peter Lieven	9eac3622a2	block/iscsi: use the allocationmap also if cache.direct=on the allocationmap has only a hint character. The driver always double checks that blocks marked unallocated in the cache are still unallocated before taking the fast path and return zeroes. So using the allocationmap is migration safe and can also be enabled with cache.direct=on. Signed-off-by: Peter Lieven <pl@kamp.de> Message-id: 1429193313-4263-10-git-send-email-pl@kamp.de Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-04-28 15:36:10 +02:00
Peter Lieven	03e40fef46	block/iscsi: bump year in copyright notice Signed-off-by: Peter Lieven <pl@kamp.de> Message-id: 1429193313-4263-9-git-send-email-pl@kamp.de Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-04-28 15:36:10 +02:00

207 Commits