qemu-e2k

Author	SHA1	Message	Date
Evgeny Yakovlev	3ff2f67a7c	block: ignore flush requests when storage is clean Some guests (win2008 server for example) do a lot of unnecessary flushing when underlying media has not changed. This adds additional overhead on host when calling fsync/fdatasync. This change introduces a write generation scheme in BlockDriverState. Current write generation is checked against last flushed generation to avoid unnessesary flushes. The problem with excessive flushing was found by a performance test which does parallel directory tree creation (from 2 processes). Results improved from 0.424 loops/sec to 0.432 loops/sec. Each loop creates 10^3 directories with 10 files in each. This affected some blkdebug testcases that were expecting error logs from failure-injected flushes which are now skipped entirely (tests 026 071 089). This also affects the performance of block jobs and thus BLOCK_JOB_READY events for driver-mirror and active block-commit commands now arrives faster, before QMP send successfully returns to caller (tests 141 144). Signed-off-by: Evgeny Yakovlev <eyakovlev@virtuozzo.com> Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1468870792-7411-5-git-send-email-den@openvz.org CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Fam Zheng <famz@redhat.com> CC: John Snow <jsnow@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com>	2016-07-18 18:19:01 -04:00
Roman Pen	5e1b34a3fa	linux-aio: prevent submitting more than MAX_EVENTS Invoking io_setup(MAX_EVENTS) we ask kernel to create ring buffer for us with specified number of events. But kernel ring buffer allocation logic is a bit tricky (ring buffer is page size aligned + some percpu allocation are required) so eventually more than requested events number is allocated. From a userspace side we have to follow the convention and should not try to io_submit() more or logic, which consumes completed events, should be changed accordingly. The pitfall is in the following sequence: MAX_EVENTS = 128 io_setup(MAX_EVENTS) io_submit(MAX_EVENTS) io_submit(MAX_EVENTS) /* now 256 events are in-flight / io_getevents(MAX_EVENTS) = 128 / we can handle only 128 events at once, to be sure * that nothing is pended the io_getevents(MAX_EVENTS) * call must be invoked once more or hang will happen. */ To prevent the hang or reiteration of io_getevents() call this patch restricts the number of in-flights, which is now limited to MAX_EVENTS. Signed-off-by: Roman Pen <roman.penyaev@profitbricks.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1468415004-31755-1-git-send-email-roman.penyaev@profitbricks.com Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: qemu-devel@nongnu.org Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-18 15:10:52 +01:00
Paolo Bonzini	0187f5c9cb	linux-aio: share one LinuxAioState within an AioContext This has better performance because it executes fewer system calls and does not use a bottom half per disk. Originally proposed by Ming Lei. [Changed #include "raw-aio.h" to "block/raw-aio.h" in win32-aio.c to fix build error as reported by Peter Maydell <peter.maydell@linaro.org>. --Stefan] Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1467650000-51385-1-git-send-email-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> squash! linux-aio: share one LinuxAioState within an AioContext	2016-07-18 15:09:31 +01:00
Max Reitz	c4b48bfdc5	vvfat: Fix qcow write target driver specification First, bdrv_open_child() expects all options for the child to be prefixed by the child's name (and a separating dot). Second, bdrv_open_child() does not take ownership of the QDict passed to it but only extracts all options for the child, so if a QDict is created for the sole purpose of passing it to bdrv_open_child(), it needs to be freed afterwards. This patch makes vvfat adhere to both of these rules. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20160711135452.11304-1-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2016-07-13 13:41:39 +02:00
Reda Sallahi	524089bce4	vmdk: fix metadata write regression Commit "cdeaf1f vmdk: add bdrv_co_write_zeroes" causes a regression on writes. It writes metadata after every write instead of doing it only once for each cluster. vmdk_pwritev() writes metadata whenever m_data is set as valid so this patch sets m_data as valid only when we have a new cluster which hasn't been allocated before or a zero grain. Signed-off-by: Reda Sallahi <fullmanet@gmail.com> Message-id: 20160707084249.29084-1-fullmanet@gmail.com Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2016-07-13 13:41:39 +02:00
Sascha Silbe	f14a39ccb9	Improve block job rate limiting for small bandwidth values ratelimit_calculate_delay() previously reset the accounting every time slice, no matter how much data had been processed before. This had (at least) two consequences: 1. The minimum speed is rather large, e.g. 5 MiB/s for commit and stream. Not sure if there are real-world use cases where this would be a problem. Mirroring and backup over a slow link (e.g. DSL) would come to mind, though. 2. Tests for block job operations (e.g. cancel) were rather racy All block jobs currently use a time slice of 100ms. That's a reasonable value to get smooth output during regular operation. However this also meant that the state of block jobs changed every 100ms, no matter how low the configured limit was. On busy hosts, qemu often transferred additional chunks until the test case had a chance to cancel the job. Fix the block job rate limit code to delay for more than one time slice to address the above issues. To make it easier to handle oversized chunks we switch the semantics from returning a delay _before_ the current request to a delay _after_ the current request. If necessary, this delay consists of multiple time slice units. Since the mirror job sends multiple chunks in one go even if the rate limit was exceeded in between, we need to keep track of the start of the current time slice so we can correctly re-compute the delay for the updated amount of data. The minimum bandwidth now is 1 data unit per time slice. The block jobs are currently passing the amount of data transferred in sectors and using 100ms time slices, so this translates to 5120 bytes/second. With chunk sizes usually being O(512KiB), tests have plenty of time (O(100s)) to operate on block jobs. The chance of a race condition now is fairly remote, except possibly on insanely loaded systems. Signed-off-by: Sascha Silbe <silbe@linux.vnet.ibm.com> Message-id: 1467127721-9564-2-git-send-email-silbe@linux.vnet.ibm.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2016-07-13 13:41:38 +02:00
Max Reitz	c834cba905	qcow2: Fix qcow2_get_cluster_offset() Recently, qcow2_get_cluster_offset() has been changed to work with bytes instead of sectors. This invalidated some assertions and introduced a possible integer multiplication overflow. This could be reproduced using e.g. $ qemu-img create -f qcow2 -o cluster_size=1M blub.qcow2 8G Formatting 'foo.qcow2', fmt=qcow2 size=8589934592 encryption=off cluster_size=1048576 lazy_refcounts=off refcount_bits=16 $ qemu-io -c map blub.qcow2 qemu-io: qemu/block/qcow2-cluster.c:504: qcow2_get_cluster_offset: Assertion `bytes_needed <= INT_MAX' failed. [1] 20775 abort (core dumped) qemu-io -c map foo.qcow2 This patch removes the now wrong assertion, adding comments and more assertions to prove its correctness (and fixing the overflow which would become apparent with the original assertion removed). Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20160620142623.24471-3-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2016-07-13 13:41:38 +02:00
Max Reitz	84c26520d3	qcow2: Avoid making the L1 table too big We refuse to open images whose L1 table we deem "too big". Consequently, we should not produce such images ourselves. Cc: qemu-stable@nongnu.org Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20160615153630.2116-3-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> [mreitz: Added QEMU_BUILD_BUG_ON()] Signed-off-by: Max Reitz <mreitz@redhat.com>	2016-07-13 13:41:38 +02:00
Kevin Wolf	8c39825218	block/qdev: Allow configuring rerror/werror with qdev properties The rerror/werror policies are implemented in the devices, so that's where they should be configured. In comparison to the old options in -drive, the qdev properties are only added to those devices that actually support them. If the option isn't given (or "auto" is specified), the setting of the BlockBackend is used for compatibility with the old options. For block jobs, "auto" is the same as "enospc". Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2016-07-13 13:32:27 +02:00
Kevin Wolf	1e8fb7f1ee	commit: Fix use of error handling policy Commit implemented the 'enospc' policy as 'ignore' if the error was not ENOSPC. The QAPI documentation promises that it's treated as 'stop'. Using the common block job error handling function fixes this and also adds the missing QMP event. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2016-07-13 13:32:27 +02:00
Paolo Bonzini	0b8b8753e4	coroutine: move entry argument to qemu_coroutine_create In practice the entry argument is always known at creation time, and it is confusing that sometimes qemu_coroutine_enter is used with a non-NULL argument to re-enter a coroutine (this happens in block/sheepdog.c and tests/test-coroutine.c). So pass the opaque value at creation time, for consistency with e.g. aio_bh_new. Mostly done with the following semantic patch: @ entry1 @ expression entry, arg, co; @@ - co = qemu_coroutine_create(entry); + co = qemu_coroutine_create(entry, arg); ... - qemu_coroutine_enter(co, arg); + qemu_coroutine_enter(co); @ entry2 @ expression entry, arg; identifier co; @@ - Coroutine co = qemu_coroutine_create(entry); + Coroutine co = qemu_coroutine_create(entry, arg); ... - qemu_coroutine_enter(co, arg); + qemu_coroutine_enter(co); @ entry3 @ expression entry, arg; @@ - qemu_coroutine_enter(qemu_coroutine_create(entry), arg); + qemu_coroutine_enter(qemu_coroutine_create(entry, arg)); @ reentry @ expression co; @@ - qemu_coroutine_enter(co, NULL); + qemu_coroutine_enter(co); except for the aforementioned few places where the semantic patch stumbled (as expected) and for test_co_queue, which would otherwise produce an uninitialized variable warning. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-13 13:26:02 +02:00
Fam Zheng	5af7045bd0	raw-posix: Use qemu_dup Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-13 13:26:02 +02:00
Alberto Garcia	fd62c609ed	commit: Add 'job-id' parameter to 'block-commit' This patch adds a new optional 'job-id' parameter to 'block-commit', allowing the user to specify the ID of the block job to be created. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-13 13:26:02 +02:00
Alberto Garcia	2323322ed0	stream: Add 'job-id' parameter to 'block-stream' This patch adds a new optional 'job-id' parameter to 'block-stream', allowing the user to specify the ID of the block job to be created. The HMP 'block_stream' command remains unchanged. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-13 13:26:02 +02:00
Alberto Garcia	70559d499c	backup: Add 'job-id' parameter to 'blockdev-backup' and 'drive-backup' This patch adds a new optional 'job-id' parameter to 'blockdev-backup' and 'drive-backup', allowing the user to specify the ID of the block job to be created. The HMP 'drive_backup' command remains unchanged. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-13 13:26:02 +02:00
Alberto Garcia	71aa98678c	mirror: Add 'job-id' parameter to 'blockdev-mirror' and 'drive-mirror' This patch adds a new optional 'job-id' parameter to 'blockdev-mirror' and 'drive-mirror', allowing the user to specify the ID of the block job to be created. The HMP 'drive_mirror' command remains unchanged. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-13 13:26:02 +02:00
Alberto Garcia	7f0317cfc8	blockjob: Add 'job_id' parameter to block_job_create() When a new job is created, the job ID is taken from the device name of the BDS. This patch adds a new 'job_id' parameter to let the caller provide one instead. This patch also verifies that the ID is always unique and well-formed. This causes problems in a couple of places where no ID is being set, because the BDS does not have a device name. In the case of test_block_job_start() (from test-blockjob-txn.c) we can simply use this new 'job_id' parameter to set the missing ID. In the case of img_commit() (from qemu-img.c) we still don't have the API to make commit_active_start() set the job ID, so we solve it by setting a default value. We'll get rid of this as soon as we extend the API. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-13 13:26:02 +02:00
Alberto Garcia	9df229c3ca	blockjob: Update description of the 'id' field The 'id' field of the BlockJob structure will be able to hold any ID, not only a device name. This patch updates the description of that field and the error messages where it is being used. Soon we'll add the ability to set an arbitrary ID when creating a block job. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-13 13:26:02 +02:00
Markus Armbruster	a9c94277f0	Use #include "..." for our own headers, <...> for others Tracked down with an ugly, brittle and probably buggy Perl script. Also move includes converted to <...> up so they get included before ours where that's obviously okay. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Tested-by: Eric Blake <eblake@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net>	2016-07-12 16:19:16 +02:00
Peter Maydell	975b1c3ac6	QAPI patches for 2016-07-06 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJXfMjDAAoJEDhwtADrkYZTyKwP/i5Lva3qhDk9AJhvPNSbnps/ sxVGVkUTeUM0UoXdKmDChkk2zScv2AOVxL3oibb22PmU1MvDU+XoKYokae/VucVS WL16t/Z77QxSJ1yKhXT3i0P7+9sR9Gq2eDHchSSITLreVO8/jNb7u1JjnV4nPyQP Scg6sCCnfgSw8W4EOz0wDhwXHbRv7obSY0lGUD1R6FifETHQ98rTpII0BS1AyvbF HIJMcGpirAe62BGNQeJv933mrD1EgRfBytQAkEL/l6DDlz/xk9AE26bsjdP5Xa9A MwEBje9CPm0ICleqq1ZVKNQQqng2PVfXXXdJYKQKVnIHL+REzRDKbTGTlS4njdVi v2i4E7MwDPIf86zRUXJ6eUkWDHgld4of2WZmoxNPZmWWmHMb31iqGcb0//mYmP06 QOPd/uxpxg9pze4JeQMhzHCbFum2/s43cyqaH24lrwhTw3EsI45t4sgJQTG1NIFt ZdPXVwNBfa6ur50dsRWHYonQimpofW94YYupWLggiVGKTzvfSWVQJGc9Ho65P23y ZJrkbN3bWNnwVFzIbWFI5TKnE/TsnWvTPwk+9cGSP4MKf1sy6t5nIycoTiP3vC9x PesENYYKy5ZkOkKQpC6lGwjkMl24ii3lVl5g4vWXXjtZHwGNmq/7/UJ7Dm4htqIk B5gm5gQ/2Tul4Dihlteb =BgpV -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2016-07-06' into staging QAPI patches for 2016-07-06 # gpg: Signature made Wed 06 Jul 2016 10:00:51 BST # gpg: using RSA key 0x3870B400EB918653 # gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" # gpg: aka "Markus Armbruster <armbru@pond.sub.org>" # Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653 * remotes/armbru/tags/pull-qapi-2016-07-06: replay: Use new QAPI cloning sockets: Use new QAPI cloning qapi: Add new clone visitor qapi: Add new visit_complete() function tests: Factor out common code in qapi output tests tests: Clean up test-string-output-visitor qmp-output-visitor: Favor new visit_free() function string-output-visitor: Favor new visit_free() function qmp-input-visitor: Favor new visit_free() function string-input-visitor: Favor new visit_free() function opts-visitor: Favor new visit_free() function qapi: Add new visit_free() function qapi: Add parameter to visit_end_* qemu-img: Don't leak errors when outputting JSON qapi: Improve use of qmp/types.h Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2016-07-06 11:38:09 +01:00
Eric Blake	3b098d5697	qapi: Add new visit_complete() function Making each output visitor provide its own output collection function was the only remaining reason for exposing visitor sub-types to the rest of the code base. Add a polymorphic visit_complete() function which is a no-op for input visitors, and which populates an opaque pointer for output visitors. For maximum type-safety, also add a parameter to the output visitor constructors with a type-correct version of the output pointer, and assert that the two uses match. This approach was considered superior to either passing the output parameter only during construction (action at a distance during visit_free() feels awkward) or only during visit_complete() (defeating type safety makes it easier to use incorrectly). Most callers were function-local, and therefore a mechanical conversion; the testsuite was a bit trickier, but the previous cleanup patch minimized the churn here. The visit_complete() function may be called at most once; doing so lets us use transfer semantics rather than duplication or ref-count semantics to get the just-built output back to the caller, even though it means our behavior is not idempotent. Generated code is simplified as follows for events: \|@@ -26,7 +26,7 @@ void qapi_event_send_acpi_device_ost(ACP \| QDict qmp; \| Error err = NULL; \| QMPEventFuncEmit emit; \|- QmpOutputVisitor qov; \|+ QObject obj; \| Visitor v; \| q_obj_ACPI_DEVICE_OST_arg param = { \| info \|@@ -39,8 +39,7 @@ void qapi_event_send_acpi_device_ost(ACP \| \| qmp = qmp_event_build_dict("ACPI_DEVICE_OST"); \| \|- qov = qmp_output_visitor_new(); \|- v = qmp_output_get_visitor(qov); \|+ v = qmp_output_visitor_new(&obj); \| \| visit_start_struct(v, "ACPI_DEVICE_OST", NULL, 0, &err); \| if (err) { \|@@ -55,7 +54,8 @@ void qapi_event_send_acpi_device_ost(ACP \| goto out; \| } \| \|- qdict_put_obj(qmp, "data", qmp_output_get_qobject(qov)); \|+ visit_complete(v, &obj); \|+ qdict_put_obj(qmp, "data", obj); \| emit(QAPI_EVENT_ACPI_DEVICE_OST, qmp, &err); and for commands: \| { \| Error err = NULL; \|- QmpOutputVisitor qov = qmp_output_visitor_new(); \| Visitor v; \| \|- v = qmp_output_get_visitor(qov); \|+ v = qmp_output_visitor_new(ret_out); \| visit_type_AddfdInfo(v, "unused", &ret_in, &err); \|- if (err) { \|- goto out; \|+ if (!err) { \|+ visit_complete(v, ret_out); \| } \|- *ret_out = qmp_output_get_qobject(qov); \|- \|-out: \| error_propagate(errp, err); Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <1465490926-28625-13-git-send-email-eblake@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2016-07-06 10:52:04 +02:00
Eric Blake	1830f22a67	qmp-output-visitor: Favor new visit_free() function Now that we have a polymorphic visit_free(), we no longer need qmp_output_visitor_cleanup(); however, we still need to expose the subtype for qmp_output_get_qobject(). Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <1465490926-28625-10-git-send-email-eblake@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2016-07-06 10:52:04 +02:00
Eric Blake	09204eac9b	opts-visitor: Favor new visit_free() function Now that we have a polymorphic visit_free(), we no longer need opts_visitor_cleanup(); which in turn means we no longer need to return a subtype from opts_visitor_new() nor a public upcast function. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <1465490926-28625-6-git-send-email-eblake@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2016-07-06 10:52:04 +02:00
Eric Blake	1158bb2a05	qapi: Add parameter to visit_end_* Rather than making the dealloc visitor track of stack of pointers remembered during visit_start_* in order to free them during visit_end_, it's a lot easier to just make all callers pass the same pointer to visit_end_. The generated code has access to the same pointer, while all other users are doing virtual walks and can pass NULL. The dealloc visitor is then greatly simplified. All three visit_end_() functions intentionally take a void, even though the visit_start_() functions differ between void, GenericList, and GenericAlternate*. This is done for several reasons: when doing a virtual walk, passing NULL doesn't care what the type is, but when doing a generated walk, we already have to cast the caller's specific FOO to call visit_start, while using void** lets us use visit_end without a cast. Also, an upcoming patch will add a clone visitor that wants to use the same implementation for all three visit_end callbacks, which is made easier if all three share the same signature. For visitors with already track per-object state (the QMP visitors via a stack, and the string visitors which do not allow nesting), add an assertion that the caller is indeed passing the same pointer to paired calls. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <1465490926-28625-4-git-send-email-eblake@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2016-07-06 10:52:04 +02:00
Peter Maydell	f1f7a1ddf3	block/qcow2: Don't use cpu_to_w() Don't use the cpu_to_w() functions, which we are trying to deprecate. Instead either just use cpu_to_() to do the byteswap, or use st_be_p() if we need to do the store somewhere other than to a variable that's already the correct type. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 1466093177-17890-1-git-send-email-peter.maydell@linaro.org Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2016-07-05 16:54:04 +02:00
Kevin Wolf	a03ef88f77	block: Convert bdrv_co_preadv/pwritev to BdrvChild This is the final patch for converting the common I/O path to take a BdrvChild parameter instead of BlockDriverState. The completion of this conversion means that all users that perform I/O on an image need to actually hold a reference (in the form of BdrvChild, possible as part of a BlockBackend) to that image. This also protects against inconsistent use of BlockBackend vs. BlockDriverState functions because direct use of a BlockDriverState isn't possible any more and blk->root is private for block-backends.c. In addition, we can now distinguish different users in the I/O path, and the future op blockers work is going to add assertions based on permissions stored in BdrvChild. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:27 +02:00
Kevin Wolf	e293b7a3df	block: Convert bdrv_prwv_co() to BdrvChild Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:27 +02:00
Kevin Wolf	720ff280e7	block: Convert bdrv_pwrite_zeroes() to BdrvChild Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:27 +02:00
Kevin Wolf	d9ca2ea2e2	block: Convert bdrv_pwrite(v/_sync) to BdrvChild Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:27 +02:00
Kevin Wolf	cf2ab8fc34	block: Convert bdrv_pread(v) to BdrvChild Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:27 +02:00
Kevin Wolf	18d51c4bac	block: Convert bdrv_write() to BdrvChild Signed-off-by: Kevin Wolf <kwolf@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:27 +02:00
Kevin Wolf	fbcbbf4e80	block: Convert bdrv_read() to BdrvChild Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:27 +02:00
Kevin Wolf	f8e2bd538d	block: Use BlockBackend for I/O in bdrv_commit() Just like block jobs, the HMP commit command should use its own BlockBackend for doing I/O on BlockDriverStates. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:27 +02:00
Kevin Wolf	83fd6dd3e7	block: Move bdrv_commit() to block/commit.c No code changes, just moved from one file to another. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:27 +02:00
Kevin Wolf	adad6496c5	block: Convert bdrv_co_do_readv/writev to BdrvChild Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:27 +02:00
Kevin Wolf	0d1049c7d1	block: Convert bdrv_aio_writev() to BdrvChild Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:26 +02:00
Kevin Wolf	ebb7af2173	block: Convert bdrv_aio_readv() to BdrvChild Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:26 +02:00
Kevin Wolf	25ec177d90	block: Convert bdrv_co_writev() to BdrvChild Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:26 +02:00
Kevin Wolf	28b04a8f65	block: Convert bdrv_co_readv() to BdrvChild Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:26 +02:00
Kevin Wolf	db1e80ee2e	vhdx: Some more BlockBackend use in vhdx_create() This does some easy conversions from bdrv_* to blk_* functions in vhdx_create(). We should avoid bypassing the BlockBackend layer whenever possible. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:26 +02:00
Kevin Wolf	e858a9705c	blkreplay: Convert to byte-based I/O The blkreplay driver only forwards the requests it gets, so converting it to byte granularity is trivial. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:26 +02:00
Kevin Wolf	eecc77473b	vvfat: Use BdrvChild for s->qcow vvfat uses a temporary qcow file to cache written data in read-write mode. In order to do things properly, this should show up in the BDS graph and I/O should go through BdrvChild like for every other node. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:26 +02:00
Denis V. Lunev	1c42f149dd	block: fix return code for partial write for Linux AIO Partial write most likely means that there is not space rather than "something wrong happens". Thus it would be more natural to return ENOSPC rather than EINVAL. The problem actually happens with NBD server, which has reported EINVAL rather then ENOSPC on the first error using its protocol, which makes report to the user wrong. Signed-off-by: Denis V. Lunev <den@openvz.org> CC: Pavel Borzenkov <pborzenkov@virtuozzo.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:26 +02:00
Eric Blake	5411541270	block: Use bool as appropriate for BDS members Using int for values that are only used as booleans is confusing. While at it, rearrange a couple of members so that all the bools are contiguous. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:26 +02:00
Eric Blake	8cc9c6e92b	block: Fix error message style error_setg() is not supposed to be used for multi-sentence messages; tweak the message to append a hint instead. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:26 +02:00
Eric Blake	a5b8dd2ce8	block: Move request_alignment into BlockLimit It makes more sense to have ALL block size limit constraints in the same struct. Improve the documentation while at it. Simplify a couple of conditionals, now that we have audited and documented that request_alignment is always non-zero. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:26 +02:00
Eric Blake	d9e0dfa246	block: Split bdrv_merge_limits() from bdrv_refresh_limits() During bdrv_merge_limits(), we were computing initial limits based on another BDS in two places. At first glance, the two computations are not identical (one is doing straight copying, the other is doing merging towards or away from zero) - but when you realize that the first round is starting with all-0 memory, all of the merging happens to work. Factoring out the merging makes it easier to track how two BDS limits are merged, in case we have future reasons to merge in even more limits. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:26 +02:00
Eric Blake	ad82be2f4f	block: Drop raw_refresh_limits() The raw block driver was blindly copying all limits from bs->file, even though: 1. the main bdrv_refresh_limits() already does this for many of the limits, and 2. blindly copying from the children can weaken any stricter limits that were already inherited from the backing chain during the main bdrv_refresh_limits(). Also, a future patch is about to move .request_alignment into BlockLimits, and that is a limit that should NOT be copied from other layers in the BDS chain. Thus, we can completely drop raw_refresh_limits(), and rely on the block layer setting up the proper limits. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:25 +02:00
Eric Blake	b9f7855a50	block: Switch discard length bounds to byte-based Sector-based limits are awkward to think about; in our on-going quest to move to byte-based interfaces, convert max_discard and discard_alignment. Rename them, using 'pdiscard' as an aid to track which remaining discard interfaces need conversion, and so that the compiler will help us catch the change in semantics across any rebased code. The BlockLimits type is now completely byte-based; and in iscsi.c, sector_limits_lun2qemu() is no longer needed. pdiscard_alignment is made unsigned (we use power-of-2 alignments as bitmasks, where unsigned is easier to think about) while leaving max_pdiscard signed (since we still have an 'int' interface); this is comparable to what commit `cf081fc` did for write zeroes limits. We may later want to make everything an unsigned 64-bit limit - but that requires a bigger code audit. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:25 +02:00
Eric Blake	5def6b80e1	block: Switch transfer length bounds to byte-based Sector-based limits are awkward to think about; in our on-going quest to move to byte-based interfaces, convert max_transfer_length and opt_transfer_length. Rename them (dropping the _length suffix) so that the compiler will help us catch the change in semantics across any rebased code, and improve the documentation. Use unsigned values, so that we don't have to worry about negative values and so that bit-twiddling is easier; however, we are still constrained by 2^31 of signed int in most APIs. When a value comes from an external source (iscsi and raw-posix), sanitize the results to ensure that opt_transfer is a power of 2. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:25 +02:00
Eric Blake	79ba8c986a	block: Set default request_alignment during bdrv_refresh_limits() We want to eventually stick request_alignment alongside other BlockLimits, but first, we must ensure it is populated at the same time as all other limits, rather than being a special case that is set only when a block is first opened. Now that all drivers have been updated to supply an override of request_alignment during their .bdrv_refresh_limits(), as needed, the block layer itself can defer setting the default alignment until part of the overall bdrv_refresh_limits(). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:25 +02:00
Eric Blake	a65064816d	block: Set request_alignment during .bdrv_refresh_limits() We want to eventually stick request_alignment alongside other BlockLimits, but first, we must ensure it is populated at the same time as all other limits, rather than being a special case that is set only when a block is first opened. Add a .bdrv_refresh_limits() to all four of our legacy devices that will always be sector-only (bochs, cloop, dmg, vvfat), in spite of their recent conversion to expose a byte interface. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:25 +02:00
Eric Blake	2914a1de99	raw-win32: Set request_alignment during .bdrv_refresh_limits() We want to eventually stick request_alignment alongside other BlockLimits, but first, we must ensure it is populated at the same time as all other limits, rather than being a special case that is set only when a block is first opened. In this case, raw_probe_alignment() already did what we needed, so just fix its signature and wire it in correctly. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:25 +02:00
Eric Blake	a84178ccff	qcow2: Set request_alignment during .bdrv_refresh_limits() We want to eventually stick request_alignment alongside other BlockLimits, but first, we must ensure it is populated at the same time as all other limits, rather than being a special case that is set only when a block is first opened. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:25 +02:00
Eric Blake	c8b3b998e2	iscsi: Set request_alignment during .bdrv_refresh_limits() We want to eventually stick request_alignment alongside other BlockLimits, but first, we must ensure it is populated at the same time as all other limits, rather than being a special case that is set only when a block is first opened. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:25 +02:00
Eric Blake	835db3ee7b	blkdebug: Set request_alignment during .bdrv_refresh_limits() We want to eventually stick request_alignment alongside other BlockLimits, but first, we must ensure it is populated at the same time as all other limits, rather than being a special case that is set only when a block is first opened. Note that when the user does not provide "align", then we were defaulting to bs->request_alignment - but at this stage in the initialization, that was always 512. We were also rejecting an explicit "align":0 from the user; this patch now allows that, as an explicit request for the default alignment (which may not always be 512 in the future). qemu-iotests 77 is particularly sensitive to the fact that we can specify an artificial alignment override in blkdebug, and that override must continue to work even when limits are refreshed on an already open device. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:25 +02:00
Eric Blake	24ce9a2026	block: Give nonzero result to blk_get_max_transfer_length() Making all callers special-case 0 as unlimited is awkward, and we DO have a hard maximum of BDRV_REQUEST_MAX_SECTORS given our current block layer API limits. In the case of scsi, this means that we now always advertise a limit to the guest, even in cases where the underlying layers previously use 0 for no inherent limit beyond the block layer. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:25 +02:00
Eric Blake	f9e95af0a6	iscsi: Advertise realistic limits to block layer The function sector_limits_lun2qemu() returns a value in units of the block layer's 512-byte sector, and can be as large as 0x40000000, which is much larger than the block layer's inherent limit of BDRV_REQUEST_MAX_SECTORS. The block layer already handles '0' as a synonym to the inherent limit, and it is nicer to return this value than it is to calculate an arbitrary maximum, for two reasons: we want to ensure that the block layer continues to special-case '0' as 'no limit beyond the inherent limits'; and we want to be able to someday expand the block layer to allow 64-bit limits, where auditing for uses of BDRV_REQUEST_MAX_SECTORS will help us make sure we aren't artificially constraining iscsi to old block layer limits. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:25 +02:00
Eric Blake	202204717a	nbd: Advertise realistic limits to block layer We were basing the advertisement of maximum discard and transfer length off of UINT32_MAX, but since the rest of the block layer has signed int limits on a transaction, nothing could ever reach that maximum, and we risk overflowing an int once things are converted to byte-based rather than sector-based limits. What's more, we DO have a much smaller limit: both the current kernel and qemu-nbd have a hard limit of 32M on a read or write transaction, and while they may also permit up to a full 32 bits on a discard transaction, the upstream NBD protocol is proposing wording that without any explicit advertisement otherwise, clients should limit ALL requests to the same limits as read and write, even though the other requests do not actually require as many bytes across the wire. So the better limit to tell the block layer is 32M for both values. Behavior doesn't actually change with this patch (the block layer is currently ignoring the max_transfer advertisements); but when that problem is fixed in a later series, this patch will prevent the exposure of a latent bug. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:24 +02:00
Eric Blake	476b923c32	nbd: Allow larger requests The NBD layer was breaking up request at a limit of 2040 sectors (just under 1M) to cater to old qemu-nbd. But the server limit was raised to 32M in commit `2d8214885` to match the kernel, more than three years ago; and the upstream NBD Protocol is proposing documentation that without any explicit communication to state otherwise, a client should be able to safely assume that a 32M transaction will work. It is time to rely on the larger sizing, and any downstream distro that cares about maximum interoperability to older qemu-nbd servers can just tweak the value of #define NBD_MAX_SECTORS. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Cc: qemu-stable@nongnu.org Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:24 +02:00
Eric Blake	82524274ea	block: Fix harmless off-by-one in bdrv_aligned_preadv() If the amount of data to read ends exactly on the total size of the bs, then we were wasting time creating a local qiov to read the data in preparation for what would normally be appending zeroes beyond the end, even though this corner case has nothing further to do. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:24 +02:00
Eric Blake	a604fa2ba5	block: Document supported flags during bdrv_aligned_preadv() We don't pass any flags on to drivers to handle. Tighten an assert to explain why we pass 0 to bdrv_driver_preadv(), and add some comments on things to be aware of if we want to turn on per-BDS BDRV_REQ_FUA support during reads in the future. Also, document that we may want to consider using unmap during copy-on-read operations where the read is all zeroes. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:24 +02:00
Eric Blake	cff86b38ac	block: Tighter assertions on bdrv_aligned_pwritev() For symmetry with bdrv_aligned_preadv(), assert that the caller really has aligned things properly. This requires adding an align parameter, which is used now only in the new asserts, but will come in handy in a later patch that adds auto-fragmentation to the max transfer size, since that value need not always be a multiple of the alignment, and therefore must be rounded down. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:24 +02:00
Peter Maydell	1ec20c2a3a	* serial port fixes (Paolo) * Q35 modeling improvements (Paolo, Vasily) * chardev cleanup improvements (Marc-André) * iscsi bugfix (Peter L.) * cpu_exec patch from multi-arch patches (Peter C.) * pci-assign tweak (Lin Ma) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJXc+GeAAoJEL/70l94x66DtIAH/3+eUBqSxVJ3SMUxJep2Op07 lIWqw1GHAdw1gWQDG4HzokKWrVVp/+NFYQjRFcNMfF8L+/Xm6hHAYc7Y4DMkDxSw zHX2BT93gPcaFJRz3Md8n2anzFHaWePx7LucPjaoas2OzrbVKXC8JT6n3GGnKQzZ 0CxDoyW4keI4ZVAOy9SOKsLPxdSvG8uLvaZU98l/YS/TuiGzpv8IWcdHR+k1hua+ FIenzj7jD9+JFoLEUWkU0pYs33J6yYKPiZn7HgGL9RNWKPFR88+CtMdYXgfOPo7z i05L9RTmL4SpahmStPN2r72MC0T0ub0czk/+qxBNms4r/2gBwaSyldmcTfAXM9o= =DA8v -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging * serial port fixes (Paolo) * Q35 modeling improvements (Paolo, Vasily) * chardev cleanup improvements (Marc-André) * iscsi bugfix (Peter L.) * cpu_exec patch from multi-arch patches (Peter C.) * pci-assign tweak (Lin Ma) # gpg: Signature made Wed 29 Jun 2016 15:56:30 BST # gpg: using RSA key 0xBFFBD25F78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: (35 commits) socket: unlink unix socket on remove socket: add listen feature char: clean up remaining chardevs when leaving vhost-user: disable chardev handlers on close vhost-user-test: fix g_cond_wait_until compat implementation vl: smp_parse: fix regression ich9: implement SCI_IRQ_SEL register ich9: implement ACPI_EN register serial: reinstate watch after migration serial: remove watch on reset char: change qemu_chr_fe_add_watch to return unsigned serial: separate serial_xmit and serial_watch_cb serial: simplify tsr_retry reset serial: make tsr_retry unsigned iscsi: fix assertion in is_sector_request_lun_aligned target-*: Don't redefine cpu_exec() pci-assign: Move "Invalid ROM" error message to pci-assign-load-rom.c vnc: generalize "VNC server running on ..." message scsi: esp: fix migration MC146818 RTC: add GPIO access to output IRQ ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2016-06-29 19:14:48 +01:00
Peter Lieven	0ead93120e	iscsi: fix assertion in is_sector_request_lun_aligned Commit `94d047a` added an assertion the the request alignment check. This introduced 2 issues: a) A off-by-one error since a request of BDRV_REQUEST_MAX_SECTORS is actually allowed. b) The bdrv_get_block_status call in the read path to check the allocation status requests up to INT_MAX sectors which triggers the assertion. Fixes: `94d047a35b` Signed-off-by: Peter Lieven <pl@kamp.de> Message-Id: <1466414680-18383-1-git-send-email-pl@kamp.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-06-29 14:03:47 +02:00
Changlong Xie	15d6729850	mirror: fix misleading comments s/target bs/to_replace/, also we check to_replace bs is not blocked in qmp_drive_mirror() not here Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Message-id: 1466672241-22485-3-git-send-email-xiecl.fnst@cn.fujitsu.com Signed-off-by: Jeff Cody <jcody@redhat.com>	2016-06-28 23:08:25 -04:00
Changlong Xie	b48100cf07	blockjob: assert(cb) when create job Callback for block job should always exist Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Suggested-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Message-id: 1466672241-22485-2-git-send-email-xiecl.fnst@cn.fujitsu.com Signed-off-by: Jeff Cody <jcody@redhat.com>	2016-06-28 23:08:13 -04:00
John Snow	e4808881cb	mirror: limit niov to IOV_MAX elements, again During the refactor of mirror_iteration in `e5b43573`, we regressed the fix introduced in `cae98cb8`. This patch re-adds IOV_MAX checking to cases where we aren't checking alignment (and size) already. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 1466625064-11280-3-git-send-email-jsnow@redhat.com Signed-off-by: Jeff Cody <jcody@redhat.com>	2016-06-28 22:53:03 -04:00
John Snow	176129552f	mirror: clarify mirror_do_read return code mirror_do_read intends to return the number of sectors processed after the starting sector, without regard to how many sectors were processed before the starting sector due to alignment. Clean up the comments and code to hopefully illustrate this more clearly. This also fixes an issue in initialization where if the mirror buffer size is initialized to smaller than the number of sectors being requested for transfer, we report back an incorrectly large number to the caller. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 1466625064-11280-2-git-send-email-jsnow@redhat.com Signed-off-by: Jeff Cody <jcody@redhat.com>	2016-06-28 22:53:03 -04:00
Jeff Cody	7eac868a50	block/gluster: add support for selecting debug logging level This adds commandline support for the logging level of the gluster protocol driver, output to stdout. The option is 'debug', e.g.: -drive filename=gluster://192.168.15.180/gv2/test.qcow2,debug=9 Debug levels are 0-9, with 9 being the most verbose, and 0 representing no debugging output. The default is the same as it was before, which is a level of 4. The current logging levels defined in the gluster source are: 0 - None 1 - Emergency 2 - Alert 3 - Critical 4 - Error 5 - Warning 6 - Notice 7 - Info 8 - Debug 9 - Trace (From: glusterfs/logging.h) Reviewed-by: Niels de Vos <ndevos@redhat.com> Signed-off-by: Jeff Cody <jcody@redhat.com>	2016-06-28 22:52:45 -04:00
Denis V. Lunev	ff04198bf5	mirror: fix trace_mirror_yield_in_flight usage in mirror_iteration() trace_mirror_yield_in_flight accepts 2nd arguments in sectors while here we pass chunks instead. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 1466518157-27140-1-git-send-email-den@openvz.org CC: Jeff Cody <jcody@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> Signed-off-by: Jeff Cody <jcody@redhat.com>	2016-06-28 22:52:45 -04:00
Peter Lieven	d99b26c42a	block/nfs: add support for libnfs pagecache upcoming libnfs will have support for a read cache that can significantly help to speed up requests since libnfs by design circumvents the kernel cache. Example: qemu -cdrom nfs://127.0.0.1/iso/my.iso?pagecache=1024 The pagecache parameters takes the maximum amount of pages to cache. A page in libnfs is always the NFS_BLKSIZE which is 4KB. Signed-off-by: Peter Lieven <pl@kamp.de> Reviewed-by: Jeff Cody <jcody@redhat.com> Message-id: 1463662083-20814-3-git-send-email-pl@kamp.de Signed-off-by: Jeff Cody <jcody@redhat.com>	2016-06-28 22:52:45 -04:00
Peter Lieven	38f8d5e025	block/nfs: refuse readahead if cache.direct is on if we open a NFS export with disabled cache we should refuse the readahead feature as it will cache data inside libnfs. If a export was opened with readahead enabled it should futher not be allowed to disable the cache while running. Cc: qemu-stable@nongnu.org Signed-off-by: Peter Lieven <pl@kamp.de> Reviewed-by: Jeff Cody <jcody@redhat.com> Message-id: 1463662083-20814-2-git-send-email-pl@kamp.de Signed-off-by: Jeff Cody <jcody@redhat.com>	2016-06-28 22:52:45 -04:00
Niels de Vos	947eb2030e	block/gluster: add support for SEEK_DATA/SEEK_HOLE GlusterFS 3.8 contains support for SEEK_DATA and SEEK_HOLE. This makes it possible to detect sparse areas in files. Signed-off-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com>	2016-06-28 22:52:45 -04:00
Peter Maydell	b0ad00b8c9	-----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJXaFInAAoJEJykq7OBq3PI6VsH/0Sfgbdo1RksYuQwb/y92sCW EN+lxUZ+OLfgrc8PYgNZwfSM3rsfYhznL0MAXOeEe7Ahabi07w7DhGR8WvwfAOlI G96FRuvrIPfv5u6U6fwS4CvG3TIHVLxfHKCsTpPUmH8U5CNx/x/tpjNiWN1dj6t+ sXybSjYHfZfiZy2tI9MFIFWCdxnF/pl0QAPhbRqc8Y/RQTDrPKRjLpz+nitN/u96 5TS7KlELyQuP91YMmLceYSmIkHbxW703h+iE2n4hov0uZCP8Jil+2Jsd3ziQSRlL j6LqexQ2ViBGdDSfiZGYES2VPlsHOCwb4G+IgWBStfZg1ppaXENvcDzPrgrB+L4= =eUnF -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging # gpg: Signature made Mon 20 Jun 2016 21:29:27 BST # gpg: using RSA key 0x9CA4ABB381AB73C8 # gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" # gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" # Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8 * remotes/stefanha/tags/tracing-pull-request: (42 commits) trace: split out trace events for linux-user/ directory trace: split out trace events for qom/ directory trace: split out trace events for target-ppc/ directory trace: split out trace events for target-s390x/ directory trace: split out trace events for target-sparc/ directory trace: split out trace events for net/ directory trace: split out trace events for audio/ directory trace: split out trace events for ui/ directory trace: split out trace events for hw/alpha/ directory trace: split out trace events for hw/arm/ directory trace: split out trace events for hw/acpi/ directory trace: split out trace events for hw/vfio/ directory trace: split out trace events for hw/s390x/ directory trace: split out trace events for hw/pci/ directory trace: split out trace events for hw/ppc/ directory trace: split out trace events for hw/9pfs/ directory trace: split out trace events for hw/i386/ directory trace: split out trace events for hw/isa/ directory trace: split out trace events for hw/sd/ directory trace: split out trace events for hw/sparc/ directory ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2016-06-20 22:30:34 +01:00
Daniel P. Berrange	b54ca48e40	trace: split out trace events for block/ directory Move all trace-events for files in the block/ directory to their own file. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 1466066426-16657-7-git-send-email-berrange@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-06-20 17:22:14 +01:00
Peter Maydell	7fa124b273	Error reporting patches for 2016-06-20 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIbBAABAgAGBQJXaAQPAAoJEDhwtADrkYZTbPMP+OmXlf1IwcUONhaHqD0SXAdC Q5NNHheNHbIdzrsOmgj2GAxCeGG08REPrdYLb8N9lI3Fev8ggQ0v3d9qiStL6Nvz 2B7DeTxXba9Qmc2gY86RR4f7T6VK+8k1Kj9f3YcAoXHucdjDLRmtUozwES3Kk8mn oeBqd07yuTAO93xQhF6BDv3mfagPZV+kp15Se04ufWPpT9eG3Cp9Pl4/Yad/wJUr 2B4pIvYnhISVlZIOvpuzdydBaCb/U0YefjWBDkawLwRBtM/kpNKoDdqr2UA0yChQ Xulgpt+UQLZ4aOYGLkzqBT7v3kDeEnYUFDdE03xZBLbHddOU05NEhLgzRVUEAdUS EAKWvWu692USDh9a2r0AYLqZW6J0+Fyi4fZ34gSuP69oi74CIvjJHDAEwsSdt6bm WHCchvfIk9kinzGiFA3hkAbVWAbI0EpB0J7k3mgCVVVelBpF9b9U4M2ydx9ZPufX t4ULXxQO9Qyxr7r0uiznugQAjIRwLMDOPPd1F5Vi+LX0wF3Pcrtc/F+3ehoLBjBD 6zZA5lZVlUEISdzzytfjeX3ch5vRFwlujrTd2anxRuduuBV1/ztvE1v0Nsiii6Sw 15BaHQJvBnMnAIO5LfhnAaVpIBk0jpQpwVyRcNUpeZ1VU2qlqVS4zxYZ0FDQgyer IZa1qYJ0Sz+oMkq9RXU= =TptL -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/armbru/tags/pull-error-2016-06-20' into staging Error reporting patches for 2016-06-20 # gpg: Signature made Mon 20 Jun 2016 15:56:15 BST # gpg: using RSA key 0x3870B400EB918653 # gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" # gpg: aka "Markus Armbruster <armbru@pond.sub.org>" # Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653 * remotes/armbru/tags/pull-error-2016-06-20: log: Fix qemu_set_log_filename() error handling log: Fix qemu_set_dfilter_ranges() error reporting log: Plug memory leak on multiple -dfilter coccinelle: Remove unnecessary variables for function return value error: Remove unnecessary local_err variables error: Remove NULL checks on error_propagate() calls vl: Error messages need to go to stderr, fix some Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2016-06-20 16:19:18 +01:00
Eduardo Habkost	9be385980d	coccinelle: Remove unnecessary variables for function return value Use Coccinelle script to replace 'ret = E; return ret' with 'return E'. The script will do the substitution only when the function return type and variable type are the same. Manual fixups: * audio/audio.c: coding style of "read (...)" and "write (...)" * block/qcow2-cluster.c: wrap line to make it shorter * block/qcow2-refcount.c: change indentation of wrapped line * target-tricore/op_helper.c: fix coding style of "remainder\|quotient" * target-mips/dsp_helper.c: reverted changes because I don't want to argue about checkpatch.pl * ui/qemu-pixman.c: fix line indentation * block/rbd.c: restore blank line between declarations and statements Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <1465855078-19435-4-git-send-email-ehabkost@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> [Unused Coccinelle rule name dropped along with a redundant comment; whitespace touched up in block/qcow2-cluster.c; stale commit message paragraph deleted] Signed-off-by: Markus Armbruster <armbru@redhat.com>	2016-06-20 16:38:13 +02:00
Eduardo Habkost	6b62d96137	error: Remove unnecessary local_err variables This patch simplifies code that uses a local_err variable just to immediately use it for an error_propagate() call. Coccinelle patch used to perform the changes added to scripts/coccinelle/remove_local_err.cocci. Reviewed-by: Eric Blake <eblake@redhat.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <1465855078-19435-3-git-send-email-ehabkost@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> [Blank line in s390-virtio-ccw.c restored] Signed-off-by: Markus Armbruster <armbru@redhat.com>	2016-06-20 16:38:13 +02:00
Eduardo Habkost	621ff94d50	error: Remove NULL checks on error_propagate() calls error_propagate() already ignores local_err==NULL, so there's no need to check it before calling. Coccinelle patch used to perform the changes added to scripts/coccinelle/error_propagate_null.cocci. Reviewed-by: Eric Blake <eblake@redhat.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <1465855078-19435-2-git-send-email-ehabkost@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2016-06-20 16:38:13 +02:00
Stefan Hajnoczi	5ab4b69ce2	backup: follow AioContext change gracefully Move s->target to the new AioContext when there is an AioContext change. The backup_run() coroutine does not use asynchronous I/O so there is no need to wait for in-flight requests in a BlockJobDriver->pause() callback. Guest writes are intercepted by the backup job. Treat them as guest activity and do it even while the job is paused. This is necessary since the only alternative would be to fail a job that experienced guest writes during pause once the job is resumed. In practice the guest writes don't interfere with AioContext switching since bdrv_drain() is used by bdrv_set_aio_context(). Loops already contain pause points because of block_job_sleep_ns() calls in the yield_and_check() helper function. It is necessary to convert a raw qemu_coroutine_yield() to block_job_yield() so the MIRROR_SYNC_MODE_NONE case can pause. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 1466096189-6477-9-git-send-email-stefanha@redhat.com	2016-06-20 14:25:41 +01:00
Stefan Hajnoczi	565ac01f8d	mirror: follow AioContext change gracefully Add block_job_pause_point() calls to mark quiescent points and make sure to complete in-flight requests when switching AioContexts. This patch solves undefined behavior in the mirror block job when the BDS AioContext is changed by dataplane. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 1466096189-6477-8-git-send-email-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-06-20 14:25:41 +01:00
Denis V. Lunev	ec050f77a5	block: process before_write_notifiers in bdrv_co_discard This is mandatory for correct backup creation. In the other case the content under this area would be lost. Dirty bits are set exactly like in bdrv_aligned_pwritev, i.e. they are set even if notifier has returned a error. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Vladimir Sementsov-Ogievskiy<vsementsov@virtuozzo.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 1466093381-6120-4-git-send-email-den@openvz.org CC: Fam Zheng <famz@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-06-20 11:44:12 +01:00
Denis V. Lunev	968d8b0627	block: fix race in bdrv_co_discard with drive-mirror Actually we must set dirty bitmap dirty after we have written all our zeroes for correct processing in drive mirror code. In the other case we can face not zeroes in this area in mirror_iteration. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Vladimir Sementsov-Ogievskiy<vsementsov@virtuozzo.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 1466093381-6120-3-git-send-email-den@openvz.org CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-06-20 11:44:12 +01:00
Denis V. Lunev	3a36e474f2	block: fixed BdrvTrackedRequest filling in bdrv_co_discard The request area is specified in bytes, not in sectors. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Vladimir Sementsov-Ogievskiy<vsementsov@virtuozzo.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 1466093381-6120-2-git-send-email-den@openvz.org CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-06-20 11:44:12 +01:00
Paolo Bonzini	02d0e09503	os-posix: include sys/mman.h qemu/osdep.h checks whether MAP_ANONYMOUS is defined, but this check is bogus without a previous inclusion of sys/mman.h. Include it in sysemu/os-posix.h and remove it from everywhere else. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-06-16 18:39:03 +02:00
Max Reitz	67882b1535	block/null: Implement bdrv_refresh_filename() The null block driver ignores any filename used for creating its BDSs, which allows creating such BDSs even without any filename at all. In that case, we currently construct a JSON filename when queried instead of a plain "null-co://" or "null-aio://". This patch implements bdrv_refresh_filename() to remedy this behavior. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20160610185750.30956-4-mreitz@redhat.com [mreitz@redhat.com: Added commit message] Signed-off-by: Max Reitz <mreitz@redhat.com>	2016-06-16 15:20:37 +02:00
Max Reitz	274fccee2b	block/mirror: Fix target backing BDS Currently, we are trying to move the backing BDS from the source to the target in bdrv_replace_in_backing_chain() which is called from mirror_exit(). However, mirror_complete() already tries to open the target's backing chain with a call to bdrv_open_backing_file(). First, we should only set the target's backing BDS once. Second, the mirroring block job has a better idea of what to set it to than the generic code in bdrv_replace_in_backing_chain() (in fact, the latter's conditions on when to move the backing BDS from source to target are not really correct). Therefore, remove that code from bdrv_replace_in_backing_chain() and leave it to mirror_complete(). Depending on what kind of mirroring is performed, we furthermore want to use different strategies to open the target's backing chain: - If blockdev-mirror is used, we can assume the user made sure that the target already has the correct backing chain. In particular, we should not try to open a backing file if the target does not have any yet. - If drive-mirror with mode=absolute-paths is used, we can and should reuse the already existing chain of nodes that the source BDS is in. In case of sync=full, no backing BDS is required; with sync=top, we just link the source's backing BDS to the target, and with sync=none, we use the source BDS as the target's backing BDS. We should not try to open these backing files anew because this would lead to two BDSs existing per physical file in the backing chain, and we would like to avoid such concurrent access. - If drive-mirror with mode=existing is used, we have to use the information provided in the physical image file which means opening the target's backing chain completely anew, just as it has been done already. If the target's backing chain shares images with the source, this may lead to multiple BDSs per physical image file. But since we cannot reliably ascertain this case, there is nothing we can do about it. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20160610185750.30956-3-mreitz@redhat.com Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2016-06-16 15:20:37 +02:00
Vikhyat Umrao	87cd3d20e1	rbd:change error_setg() to error_setg_errno() Ceph RBD block driver does not use error_setg_errno() where it is possible to use. This patch replaces error_setg() from error_setg_errno(). Signed-off-by: Vikhyat Umrao <vumrao@redhat.com> Message-id: 1462780319-5796-1-git-send-email-vumrao@redhat.com Reviewed-by: Josh Durgin <jdurgin@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2016-06-16 15:20:37 +02:00
Alberto Garcia	834fe28ddf	block: Create the commit block job before reopening any image If the base or overlay images need to be reopened in read-write mode but the block_job_create() call fails then no one will put those images back in read-only mode. We can solve this problem easily by calling block_job_create() first. Signed-off-by: Alberto Garcia <berto@igalia.com> Message-id: aa495045770a6f1a7cc5d408397a17c75097fdd8.1464346103.git.berto@igalia.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2016-06-16 15:20:37 +02:00
Alberto Garcia	eb1364ceac	block: use the block job list in bdrv_drain_all() bdrv_drain_all() pauses all block jobs by using bdrv_next() to iterate over all top-level BlockDriverStates. Therefore the code is unable to find block jobs in other nodes. This patch uses block_job_next() to iterate over all block jobs. Signed-off-by: Alberto Garcia <berto@igalia.com> Message-id: 55ee7d7d4a65c28aa1a1b28823897ef326f328e2.1464346103.git.berto@igalia.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2016-06-16 15:20:37 +02:00
Kevin Wolf	c9d20029f4	block: Remove bs->zero_beyond_eof It is always true for open images now. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-06-16 15:19:56 +02:00
Kevin Wolf	734a77584a	qcow2: Let vmstate call qcow2_co_preadv/pwrite directly We don't really want to go through the block layer in order to read from or write to the vmstate in a qcow2 image. Doing so required a few ugly hacks like saving and restoring the old image size (because writing to vmstate offsets would increase the image size) or disabling the "reads after EOF = zeroes" logic. When calling the right functions directly, these hacks aren't necessary any more. Note that .bdrv_vmstate_load/save() return 0 instead of the number of bytes in case of success now. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-06-16 15:19:56 +02:00
Kevin Wolf	1a8ae82217	block: Make bdrv_load/save_vmstate coroutine_fns This allows drivers to share code between normal I/O and vmstate accesses. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-06-16 15:19:56 +02:00
Kevin Wolf	b433d9424d	block: Allow .bdrv_load/save_vmstate() to return 0/-errno The return value of .bdrv_load/save_vmstate() can be any non-negative number in case of success now. It used to be bytes/-errno. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-06-16 15:19:55 +02:00
Kevin Wolf	5ddda0b8f0	block: Make .bdrv_load_vmstate() vectored This brings it in line with .bdrv_save_vmstate(). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-06-16 15:19:55 +02:00
Kevin Wolf	f1e8474115	block: Introduce bdrv_preadv() We already have a byte-based bdrv_pwritev(), but the read counterpart was still missing. This commit adds it. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-06-16 15:19:55 +02:00
Kevin Wolf	ccb9dc1012	linux-aio: Cancel BH if not needed linux-aio uses a BH in order to make sure that the remaining completions are processed even in nested event loops of completion callbacks in order to avoid deadlocks. There is no need, however, to have the BH overhead for the first call into qemu_laio_completion_bh() or after all pending completions have already been processed. Therefore, this patch calls directly into qemu_laio_completion_bh() in qemu_laio_completion_cb() and cancels the BH after qemu_laio_completion_bh() has processed all pending completions. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>	2016-06-16 15:19:55 +02:00
Kevin Wolf	23b0d9fb1d	block: Don't enforce 512 byte minimum alignment If block drivers say that they can do an alignment < 512 bytes, let's just suppose they mean it. raw-posix used to be an offender with respect to this, but it can actually deal with byte-aligned requests now. The default is still 512 bytes for any drivers that only implement sector-based interfaces, but it is 1 now for drivers that implement .bdrv_co_preadv. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-06-16 15:19:55 +02:00
Kevin Wolf	9d52aa3c38	raw-posix: Implement .bdrv_co_preadv/pwritev The raw-posix block driver actually supports byte-aligned requests now on non-O_DIRECT images, like it already (and previously incorrectly) claimed in bs->request_alignment. For some block drivers this means that a RMW cycle can be avoided when they write sub-sector metadata e.g. for cluster allocation. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-06-16 15:19:55 +02:00

1 2 3 4 5 ...

2642 Commits