qemu-e2k

Commit Graph

Author	SHA1	Message	Date
John Snow	35d6b368f2	blockjobs: ensure abort is called for cancelled jobs Presently, even if a job is canceled post-completion as a result of a failing peer in a transaction, it will still call .commit because nothing has updated or changed its return code. The reason why this does not cause problems currently is because backup's implementation of .commit checks for cancellation itself. I'd like to simplify this contract: (1) Abort is called if the job/transaction fails (2) Commit is called if the job/transaction succeeds To this end: A job's return code, if 0, will be forcibly set as -ECANCELED if that job has already concluded. Remove the now redundant check in the backup job implementation. We need to check for cancellation in both block_job_completed AND block_job_completed_single, because jobs may be cancelled between those two calls; for instance in transactions. This also necessitates an ABORTING -> ABORTING transition to be allowed. The check in block_job_completed could be removed, but there's no point in starting to attempt to succeed a transaction that we know in advance will fail. This does NOT affect mirror jobs that are "canceled" during their synchronous phase. The mirror job itself forcibly sets the canceled property to false prior to ceding control, so such cases will invoke the "commit" callback. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-19 12:01:24 +01:00
John Snow	75f710599f	blockjobs: add block_job_dismiss For jobs that have reached their CONCLUDED state, prior to having their last reference put down (meaning jobs that have completed successfully, unsuccessfully, or have been canceled), allow the user to dismiss the job's lingering status report via block-job-dismiss. This gives management APIs the chance to conclusively determine if a job failed or succeeded, even if the event broadcast was missed. Note: block_job_do_dismiss and block_job_decommission happen to do exactly the same thing, but they're called from different semantic contexts, so both aliases are kept to improve readability. Note 2: Don't worry about the 0x04 flag definition for AUTO_DISMISS, she has a friend coming in a future patch to fill the hole where 0x02 is. Verbs: Dismiss: operates on CONCLUDED jobs only. Signed-off-by: John Snow <jsnow@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-19 12:01:24 +01:00
John Snow	0ec4dfb8d6	blockjobs: add block_job_verb permission table Which commands ("verbs") are appropriate for jobs in which state is also somewhat burdensome to keep track of. As of this commit, it looks rather useless, but begins to look more interesting the more states we add to the STM table. A recurring theme is that no verb will apply to an 'undefined' job. Further, it's not presently possible to restrict the "pause" or "resume" verbs any more than they are in this commit because of the asynchronous nature of how jobs enter the PAUSED state; justifications for some seemingly erroneous applications are given below. ===== Verbs ===== Cancel: Any state except undefined. Pause: Any state except undefined; 'created': Requests that the job pauses as it starts. 'running': Normal usage. (PAUSED) 'paused': The job may be paused for internal reasons, but the user may wish to force an indefinite user-pause, so this is allowed. 'ready': Normal usage. (STANDBY) 'standby': Same logic as above. Resume: Any state except undefined; 'created': Will lift a user's pause-on-start request. 'running': Will lift a pause request before it takes effect. 'paused': Normal usage. 'ready': Will lift a pause request before it takes effect. 'standby': Normal usage. Set-speed: Any state except undefined, though ready may not be meaningful. Complete: Only a 'ready' job may accept a complete request. ======= Changes ======= (1) To facilitate "nice" error checking, all five major block-job verb interfaces in blockjob.c now support an errp parameter: - block_job_user_cancel is added as a new interface. - block_job_user_pause gains an errp paramter - block_job_user_resume gains an errp parameter - block_job_set_speed already had an errp parameter. - block_job_complete already had an errp parameter. (2) block-job-pause and block-job-resume will no longer no-op when trying to pause an already paused job, or trying to resume a job that isn't paused. These functions will now report that they did not perform the action requested because it was not possible. iotests have been adjusted to address this new behavior. (3) block-job-complete doesn't worry about checking !block_job_started, because the permission table guards against this. (4) test-bdrv-drain's job implementation needs to announce that it is 'ready' now, in order to be completed. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-19 12:01:24 +01:00
John Snow	c9de40505f	blockjobs: add state transition table The state transition table has mostly been implied. We're about to make it a bit more complex, so let's make the STM explicit instead. Perform state transitions with a function that for now just asserts the transition is appropriate. Transitions: Undefined -> Created: During job initialization. Created -> Running: Once the job is started. Jobs cannot transition from "Created" to "Paused" directly, but will instead synchronously transition to running to paused immediately. Running -> Paused: Normal workflow for pauses. Running -> Ready: Normal workflow for jobs reaching their sync point. (e.g. mirror) Ready -> Standby: Normal workflow for pausing ready jobs. Paused -> Running: Normal resume. Standby -> Ready: Resume of a Standby job. +---------+ \|UNDEFINED\| +--+------+ \| +--v----+ \|CREATED\| +--+----+ \| +--v----+ +------+ \|RUNNING<----->PAUSED\| +--+----+ +------+ \| +--v--+ +-------+ \|READY<------->STANDBY\| +-----+ +-------+ Notably, there is no state presently defined as of this commit that deals with a job after the "running" or "ready" states, so this table will be adjusted alongside the commits that introduce those states. Signed-off-by: John Snow <jsnow@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-19 12:01:24 +01:00
John Snow	75859b9420	blockjobs: model single jobs as transactions model all independent jobs as single job transactions. It's one less case we have to worry about when we add more states to the transition machine. This way, we can just treat all job lifetimes exactly the same. This helps tighten assertions of the STM graph and removes some conditionals that would have been needed in the coming commits adding a more explicit job lifetime management API. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-19 12:01:24 +01:00
Peter Maydell	9cc7d0cf6a	-----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJaqD6PAAoJEH3vgQaq/DkO9FIP/3pAW3xJUDGYsONiebX1IbhA VpoQCcjks3cHD18AUoVHufayJBUVfed1LhYPP8xoDuSRmKs1xU1O9FknxMQaL+Dw kbliBY7GjN8A2EcCjW+ZwyNT/KpjyXXwuZ2PSnOSSiN3JK6wrLCzeZyKyOYewLCS u9fKscnqWkg+awbCfDlVs92AaBAKoOP9loOq6e2J/jVY8HSDGb2owRnsxaWg8gJ8 J9BlnXENQ14jEwickD3sluPfWkhu9xh7cCocH8cfgXL5veGUELz0Ugx4RHcsAF9Q SVDg/EhRRN11cvOkLnlggETaLbGtEE64AL4HhjxzCLraHsnEazPDwFgetB9mOhhF Nqu8HuGcVvRgn89au89mxAvTSWX9KFq4oF8Vi+FZZHkLilRx6NJnMpUpd9zkSJDq yjR2/BV0A9Ep1gvWX/rhpPrN5dALYHcaxoiSB497Yj4SI2ZSyzfrneteYdPv4EEc 3CSJ3l6NCGAE2dNXuVZTVqHyXOSl7mJQQmT53dtsSNipCMEsVr0mOx3DPNY26LIc DUdnX6JOyZPU0wzOj8xjFNV72/gBEkqVZ5p9UJ+lrIYwOsTobpzfDtYquu4asda8 IN44mcbRCZRFIiZZOGEdnwf34vIpQKMiZAtszAaan9KXwTXV9LbipaomBEN88vUD IgI5XsZTfiD2uIjnREWv =ISfR -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/jnsnow/tags/bitmaps-pull-request' into staging # gpg: Signature made Tue 13 Mar 2018 21:11:43 GMT # gpg: using RSA key 7DEF8106AAFC390E # gpg: Good signature from "John Snow (John Huston) <jsnow@redhat.com>" # Primary key fingerprint: FAEB 9711 A12C F475 812F 18F2 88A9 064D 1835 61EB # Subkey fingerprint: F9B7 ABDB BCAC DF95 BE76 CBD0 7DEF 8106 AAFC 390E * remotes/jnsnow/tags/bitmaps-pull-request: iotests: add dirty bitmap postcopy test iotests: add dirty bitmap migration test migration: add postcopy migration of dirty bitmaps migration: allow qmp command migrate-start-postcopy for any postcopy migration: add is_active_iterate handler migration/qemu-file: add qemu_put_counted_string() migration: include migrate_dirty_bitmaps in migrate_postcopy qapi: add dirty-bitmaps migration capability migration: introduce postcopy-only pending dirty-bitmap: add locked state block/dirty-bitmap: add _locked version of bdrv_reclaim_dirty_bitmap block/dirty-bitmap: fix locking in bdrv_reclaim_dirty_bitmap block/dirty-bitmap: add bdrv_dirty_bitmap_enable_successor() Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2018-03-16 14:15:18 +00:00
Peter Maydell	475fe4576f	nbd patches for 2018-03-13 - Eric Blake: iotests: Fix stuck NBD process on 33 - Vladimir Sementsov-Ogievskiy: 0/5 nbd server fixing and refactoring before BLOCK_STATUS - Eric Blake: nbd/server: Honor FUA request on NBD_CMD_TRIM - Stefan Hajnoczi: 0/2 block: fix nbd-server-stop crash after blockdev-snapshot-sync - Vladimir Sementsov-Ogievskiy: nbd block status base:allocation -----BEGIN PGP SIGNATURE----- Comment: Public key at http://people.redhat.com/eblake/eblake.gpg iQEcBAABCAAGBQJaqDklAAoJEKeha0olJ0NqRsAH+waQGLA8YPwxlnpRW2kulLfC dZXv/ocl2vGgxRrDLEL46xh1RUpapERHADk/Qun8reQpqLicd6p8VCuoOZFEj6QN Xo98JHrKKL6AZ1rVhWUzD8G6qwgL6FGq6Eb5ty/kanf2/0igwtHmu86nOgGyc9dz zelGPdIxyxIEjCNiLIN49iEFs+gk1hr8qp1TNMbnHlQh8moOYqdCJWNisOQowoEE soCJ4NLnvKBtnmrxDrvtkppQKW8ukDOG/q5BkSTvAIEBH/v0ioohFUNTFkC8vmjO 8YAwlXAz6EpQuKxpEfl7vxaT19edrNIo55JO1/Gwzk50g4/Mt+AH2JM/msFcMKQ= =i+nR -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/ericb/tags/pull-nbd-2018-03-13-v2' into staging nbd patches for 2018-03-13 - Eric Blake: iotests: Fix stuck NBD process on 33 - Vladimir Sementsov-Ogievskiy: 0/5 nbd server fixing and refactoring before BLOCK_STATUS - Eric Blake: nbd/server: Honor FUA request on NBD_CMD_TRIM - Stefan Hajnoczi: 0/2 block: fix nbd-server-stop crash after blockdev-snapshot-sync - Vladimir Sementsov-Ogievskiy: nbd block status base:allocation # gpg: Signature made Tue 13 Mar 2018 20:48:37 GMT # gpg: using RSA key A7A16B4A2527436A # gpg: Good signature from "Eric Blake <eblake@redhat.com>" # gpg: aka "Eric Blake (Free Software Programmer) <ebb9@byu.net>" # gpg: aka "[jpeg image of size 6874]" # Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2 F3AA A7A1 6B4A 2527 436A * remotes/ericb/tags/pull-nbd-2018-03-13-v2: iotests: new test 209 for NBD BLOCK_STATUS iotests: add file_path helper iotests.py: tiny refactor: move system imports up nbd: BLOCK_STATUS for standard get_block_status function: client part block/nbd-client: save first fatal error in nbd_iter_error nbd: BLOCK_STATUS for standard get_block_status function: server part nbd/server: add nbd_read_opt_name helper nbd/server: add nbd_opt_invalid helper iotests: add 208 nbd-server + blockdev-snapshot-sync test case block: let blk_add/remove_aio_context_notifier() tolerate BDS changes nbd/server: Honor FUA request on NBD_CMD_TRIM nbd/server: refactor nbd_trip: split out nbd_handle_request nbd/server: refactor nbd_trip: cmd_read and generic reply nbd/server: fix: check client->closing before sending reply nbd/server: fix sparse read nbd/server: move nbd_co_send_structured_error up iotests: Fix stuck NBD process on 33 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2018-03-16 13:14:07 +00:00
Vladimir Sementsov-Ogievskiy	4f43e9535b	dirty-bitmap: add locked state Add special state, when qmp operations on the bitmap are disabled. It is needed during bitmap migration. "Frozen" state is not appropriate here, because it looks like bitmap is unchanged. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20180207155837.92351-5-vsementsov@virtuozzo.com Signed-off-by: John Snow <jsnow@redhat.com>	2018-03-13 17:05:00 -04:00
Vladimir Sementsov-Ogievskiy	044ee8e143	block/dirty-bitmap: add _locked version of bdrv_reclaim_dirty_bitmap Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20180207155837.92351-4-vsementsov@virtuozzo.com Signed-off-by: John Snow <jsnow@redhat.com>	2018-03-13 17:04:54 -04:00
Vladimir Sementsov-Ogievskiy	604ab74bb5	block/dirty-bitmap: fix locking in bdrv_reclaim_dirty_bitmap Like other setters here these functions should take a lock. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20180207155837.92351-3-vsementsov@virtuozzo.com Signed-off-by: John Snow <jsnow@redhat.com>	2018-03-13 17:04:48 -04:00
Vladimir Sementsov-Ogievskiy	78a33ab587	nbd: BLOCK_STATUS for standard get_block_status function: client part Minimal realization: only one extent in server answer is supported. Flag NBD_CMD_FLAG_REQ_ONE is used to force this behavior. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20180312152126.286890-6-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> [eblake: grammar tweaks, fix min_block check and 32-bit cap, use -1 instead of errno on failure in nbd_negotiate_simple_meta_context, ensure that block status makes progress on success] Signed-off-by: Eric Blake <eblake@redhat.com>	2018-03-13 15:43:48 -05:00
Vladimir Sementsov-Ogievskiy	1e98efc029	block/nbd-client: save first fatal error in nbd_iter_error It is ok, that fatal error hides previous not fatal, but hiding first fatal error is a bad feature. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20180312152126.286890-5-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2018-03-13 15:38:56 -05:00
Stefan Hajnoczi	d03654eacd	block: let blk_add/remove_aio_context_notifier() tolerate BDS changes Commit `2019ba0a01` ("block: Add AioContextNotifier functions to BB") added blk_add/remove_aio_context_notifier() and implemented them by passing through the bdrv_() equivalent. This doesn't work across bdrv_append(), which detaches child->bs and re-attaches it to a new BlockDriverState. When blk_remove_aio_context_notifier() is called we will access the new BDS instead of the one where the notifier was added! >From the point of view of the blk_() API user, changes to the root BDS should be transparent. This patch maintains a list of AioContext notifiers in BlockBackend and adds/removes them from the BlockDriverState as needed. Reported-by: Stefano Panella <spanella@gmail.com> Cc: Max Reitz <mreitz@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20180306204819.11266-2-stefanha@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2018-03-13 15:38:55 -05:00
Vladimir Sementsov-Ogievskiy	e73a265e9f	block/dirty-bitmap: add bdrv_dirty_bitmap_enable_successor() Enabling bitmap successor is necessary to enable successors of bitmaps being migrated before target vm start. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 20180207155837.92351-2-vsementsov@virtuozzo.com Signed-off-by: John Snow <jsnow@redhat.com>	2018-03-13 15:33:59 -04:00
Daniel P. Berrangé	44acd46f60	block: include original filename when reporting invalid URIs Consider passing a JSON based block driver to "qemu-img commit" $ qemu-img commit 'json:{"driver":"qcow2","file":{"driver":"gluster",\ "volume":"gv0","path":"sn1.qcow2", "server":[{"type":\ "tcp","host":"10.73.199.197","port":"24007"}]},}' Currently it will commit the content and then report an incredibly useless error message when trying to re-open the committed image: qemu-img: invalid URI Usage: file=gluster[+transport]://[host[:port]]volume/path[?socket=...][,file.debug=N][,file.logfile=/path/filename.log] With this fix we get: qemu-img: invalid URI json:{"server.0.host": "10.73.199.197", "driver": "gluster", "path": "luks.qcow2", "server.0.type": "tcp", "server.0.port": "24007", "volume": "gv0"} Of course the root cause problem still exists, but now we know what actually needs fixing. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20180206105204.14817-1-berrange@redhat.com Signed-off-by: Jeff Cody <jcody@redhat.com>	2018-03-13 08:06:55 -04:00
Peter Maydell	12c06d6f96	Block layer patches -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJaoqOgAAoJEH8JsnLIjy/W2MMP/1gj7CJgtSG9wIzyBHjSWQMy ofXEgRJO9t/smfUMlH2NdrW8P2LvYcmqOsEBkLJzCtl48fPexwtI/cunzVjutXcf VlpqKz/8uN4C9D6m8FN/5kKf65l+tnVqnCoJgwafY5uT7jmoC8LF1xO2jo8a+lJd 0Dv6RxJUQq/tDR6OvO6aW4EzbOUcD4wkLvi/uz8+ZjV1BLSLlpdudejr6W9TnJY/ EGFedbxqjPV7fIvMbodbFp0Ie8Aw0WEL8ttERboeR4jbA/o+PZVGpPtHsr/4V6QO Pgh6vH2rGavxFzwuCWEGhlLKGx66CGqqdTknm6lNJchepCvcfoYxjOPZv9FCaMUs enC/x43xSkCmkwBwKKxpXqu1vS5nGdMebAwRjstSIplypjv2YOwS1AiU5snaDwuk t9Gjkw0Wka5nySuYi43H2RPXmlWbh4T8DfQ6pOyJGvXGjm8t+f5BTaMtSWn6Iq2W F6r1UezQJBDnUbpFgsRg4AP+htPGDHgsOg7KzCCd/lBHwbjX7dkQlAYbBZZ2OBF+ wQN5olDR6jsKIy2IlARNgNweZHW5UQa1cc+7HlVNNE5tqtkjo7aWPk/LhEzBCIHg sWG3VH2y3lQlaMzYh1v+jnGrFoq1ZJU4sbjaxvQX8czjmaQvPtbzKuZAovQ4pGwa g0SrWP6p9yLo0LXLuXBP =WDF4 -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging Block layer patches # gpg: Signature made Fri 09 Mar 2018 15:09:20 GMT # gpg: using RSA key 7F09B272C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" # Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6 * remotes/kevin/tags/for-upstream: (56 commits) qemu-iotests: fix 203 migration completion race iotests: Tweak 030 in order to trigger a race condition with parallel jobs iotests: Skip test for ENOMEM error iotests: Mark all tests executable iotests: Test creating overlay when guest running qemu-iotests: Test ssh image creation over QMP qemu-iotests: Test qcow2 over file image creation with QMP block: Fail bdrv_truncate() with negative size file-posix: Fix no-op bdrv_truncate() with falloc preallocation ssh: Support .bdrv_co_create ssh: Pass BlockdevOptionsSsh to connect_to_ssh() ssh: QAPIfy host-key-check option ssh: Use QAPI BlockdevOptionsSsh object sheepdog: Support .bdrv_co_create sheepdog: QAPIfy "redundancy" create option nfs: Support .bdrv_co_create nfs: Use QAPI options in nfs_client_open() rbd: Use qemu_rbd_connect() in qemu_rbd_do_create() rbd: Assign s->snap/image_name in qemu_rbd_open() rbd: Support .bdrv_co_create ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2018-03-12 10:08:09 +00:00
Kevin Wolf	89b259eeaa	file-posix: Fix no-op bdrv_truncate() with falloc preallocation If bdrv_truncate() is called, but the requested size is the same as before, don't call posix_fallocate(), which returns -EINVAL for length zero and would therefore make bdrv_truncate() fail. The problem can be triggered by creating a zero-sized raw image with 'falloc' preallocation mode. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2018-03-09 15:17:48 +01:00
Kevin Wolf	4906da7e4d	ssh: Support .bdrv_co_create This adds the .bdrv_co_create driver callback to ssh, which enables image creation over QMP. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2018-03-09 15:17:48 +01:00
Kevin Wolf	375f0b9266	ssh: Pass BlockdevOptionsSsh to connect_to_ssh() Move the parsing of the QDict options up to the callers, in preparation for the .bdrv_co_create implementation that directly gets a QAPI type. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2018-03-09 15:17:47 +01:00
Kevin Wolf	ec2f54182c	ssh: QAPIfy host-key-check option This makes the host-key-check option available in blockdev-add. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2018-03-09 15:17:47 +01:00
Kevin Wolf	16e4bdb1bf	ssh: Use QAPI BlockdevOptionsSsh object Create a BlockdevOptionsSsh object in connect_to_ssh() and take the options from there. 'host_key_check' is still processed separately because it's not in the schema yet. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2018-03-09 15:17:47 +01:00
Kevin Wolf	63fd65a0a5	sheepdog: Support .bdrv_co_create This adds the .bdrv_co_create driver callback to sheepdog, which enables image creation over QMP. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2018-03-09 15:17:47 +01:00
Kevin Wolf	a595e4bcca	sheepdog: QAPIfy "redundancy" create option The "redundancy" option for Sheepdog image creation is currently a string that can encode one or two integers depending on its format, which at the same time implicitly selects a mode. This patch turns it into a QAPI union and converts the string into such a QAPI object before interpreting the values. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2018-03-09 15:17:47 +01:00
Kevin Wolf	a1a42af422	nfs: Support .bdrv_co_create This adds the .bdrv_co_create driver callback to nfs, which enables image creation over QMP. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2018-03-09 15:17:47 +01:00
Kevin Wolf	c22a034545	nfs: Use QAPI options in nfs_client_open() Using the QAPI visitor to turn all options into QAPI BlockdevOptionsNfs simplifies the code a lot. It will also be useful for implementing the QAPI based .bdrv_co_create callback. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2018-03-09 15:17:47 +01:00
Kevin Wolf	aa045c2d37	rbd: Use qemu_rbd_connect() in qemu_rbd_do_create() This is almost exactly the same code. The differences are that qemu_rbd_connect() supports BlockdevOptionsRbd.server and that the cache mode is set explicitly. Supporting 'server' is a welcome new feature for image creation. Caching is disabled by default, so leave it that way. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2018-03-09 15:17:47 +01:00
Kevin Wolf	d41a55885d	rbd: Assign s->snap/image_name in qemu_rbd_open() Now that the options are already available in qemu_rbd_open() and not only parsed in qemu_rbd_connect(), we can assign s->snap and s->image_name there instead of passing the fields by reference to qemu_rbd_connect(). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2018-03-09 15:17:47 +01:00
Kevin Wolf	1bebea37b4	rbd: Support .bdrv_co_create This adds the .bdrv_co_create driver callback to rbd, which enables image creation over QMP. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2018-03-09 15:17:47 +01:00
Kevin Wolf	4bfb274165	rbd: Pass BlockdevOptionsRbd to qemu_rbd_connect() With the conversion to a QAPI options object, the function is now prepared to be used in a .bdrv_co_create implementation. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2018-03-09 15:17:47 +01:00
Kevin Wolf	4ff4504974	rbd: Remove non-schema options from runtime_opts Instead of the QemuOpts in qemu_rbd_connect(), we want to use QAPI objects. As a preparation, fetch those options directly from the QDict that .bdrv_open() supports in the rbd driver and that are not in the schema. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2018-03-09 15:17:47 +01:00
Kevin Wolf	3d9136f972	rbd: Factor out qemu_rbd_connect() The code to establish an RBD connection is duplicated between open and create. In order to be able to share the code, factor out the code from qemu_rbd_open() as a first step. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2018-03-09 15:17:47 +01:00
Kevin Wolf	71c87815f9	rbd: Fix use after free in qemu_rbd_set_keypairs() error path If we want to include the invalid option name in the error message, we can't free the string earlier than that. Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2018-03-09 15:17:47 +01:00
Kevin Wolf	ab8bda76a0	gluster: Support .bdrv_co_create This adds the .bdrv_co_create driver callback to gluster, which enables image creation over QMP. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2018-03-09 15:17:47 +01:00
Kevin Wolf	3766ef579c	file-win32: Support .bdrv_co_create This adds the .bdrv_co_create driver callback to file-win32, which enables image creation over QMP. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2018-03-09 15:17:47 +01:00
Kevin Wolf	927f11e131	file-posix: Support .bdrv_co_create This adds the .bdrv_co_create driver callback to file, which enables image creation over QMP. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2018-03-09 15:17:47 +01:00
Kevin Wolf	b0292b851b	block: x-blockdev-create QMP command This adds a synchronous x-blockdev-create QMP command that can create qcow2 images on a given node name. We don't want to block while creating an image, so this is not the final interface in all aspects, but BlockdevCreateOptionsQcow2 and .bdrv_co_create() are what they actually might look like in the end. In any case, this should be good enough to test whether we interpret BlockdevCreateOptions as we should. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2018-03-09 15:17:47 +01:00
Kevin Wolf	b76b4f6045	qcow2: Use visitor for options in qcow2_create() Instead of manually creating the BlockdevCreateOptions object, use a visitor to parse the given options into the QAPI object. This involves translation from the old command line syntax to the syntax mandated by the QAPI schema. Option names are still checked against qcow2_create_opts, so only the old option names are allowed on the command line, even if they are translated in qcow2_create(). In contrast, new option values are optionally recognised besides the old values: 'compat' accepts 'v2'/'v3' as an alias for '0.10'/'1.1', and 'encrypt.format' accepts 'qcow' as an alias for 'aes' now. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2018-03-09 15:17:47 +01:00
Kevin Wolf	e4b5dad826	qcow2: Handle full/falloc preallocation in qcow2_co_create() Once qcow2_co_create() can be called directly on an already existing node, we must provide the 'full' and 'falloc' preallocation modes outside of creating the image on the protocol layer. Fortunately, we have preallocated truncate now which can provide this functionality. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2018-03-09 15:17:47 +01:00
Kevin Wolf	60900b7bd9	qcow2: Use QCryptoBlockCreateOptions in qcow2_co_create() Instead of passing the encryption format name and the QemuOpts down, use the QCryptoBlockCreateOptions contained in BlockdevCreateOptions. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2018-03-09 15:17:47 +01:00
Kevin Wolf	e1d74bc6c6	qcow2: Use BlockdevRef in qcow2_co_create() Instead of passing a separate BlockDriverState* into qcow2_co_create(), make use of the BlockdevRef that is included in BlockdevCreateOptions. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2018-03-09 15:17:47 +01:00
Kevin Wolf	29ca9e450d	qcow2: Pass BlockdevCreateOptions to qcow2_co_create() All of the simple options are now passed to qcow2_co_create() in a BlockdevCreateOptions object. Still missing: node-name and the encryption options. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2018-03-09 15:17:47 +01:00
Kevin Wolf	cbf2b7c404	qcow2: Let qcow2_create() handle protocol layer Currently, qcow2_create() only parses the QemuOpts and then calls qcow2_co_create() for the actual image creation, which includes both the creation of the actual file on the file system and writing a valid empty qcow2 image into that file. The plan is that qcow2_co_create() becomes the function that implements the functionality for a future 'blockdev-create' QMP command, which only creates the qcow2 layer on an already opened file node. This is a first step towards that goal: Let's move out anything that deals with the protocol layer from qcow2_co_create() into qcow2_create(). This means that qcow2_co_create() doesn't need a file name any more. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2018-03-09 15:17:47 +01:00
Kevin Wolf	8ca5bfdc46	qcow2: Rename qcow2_co_create2() to qcow2_co_create() The functions originally known as qcow2_create() and qcow2_create2() are now called qcow2_co_create_opts() and qcow2_co_create(), which matches the names of the BlockDriver callbacks that they will implement at the end of this patch series. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2018-03-09 15:17:47 +01:00
Alberto Garcia	0c2ada8136	qcow2: Make qemu-img check detect corrupted L1 tables in snapshots 'qemu-img check' cannot detect if a snapshot's L1 table is corrupted. This patch checks the table's offset and size and reports corruption if the values are not valid. This patch doesn't add code to fix that corruption yet, only to detect and report it. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-09 15:17:47 +01:00
Alberto Garcia	db5794f1f1	qcow2: Check snapshot L1 table in qcow2_snapshot_delete() This function deletes a snapshot from disk, removing its entry from the snapshot table, freeing its L1 table and decreasing the refcounts of all clusters. The L1 table offset and size are however not validated. If we use invalid values in this function we'll probably corrupt the image even more, so we should return an error instead. We now have a function to take care of this, so let's use it. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-09 15:17:47 +01:00
Alberto Garcia	a8475d7573	qcow2: Check snapshot L1 table in qcow2_snapshot_goto() This function copies a snapshot's L1 table into the active one without validating it first. We now have a function to take care of this, so let's use it. Cc: Eric Blake <eblake@redhat.com> Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-09 15:17:47 +01:00
Alberto Garcia	c7a9d81d70	qcow2: Check snapshot L1 tables in qcow2_check_metadata_overlap() The inactive-l2 overlap check iterates uses the L1 tables from all snapshots, but it does not validate them first. We now have a function to take care of this, so let's use it. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-09 15:17:47 +01:00
Alberto Garcia	c9a442e450	qcow2: Check L1 table parameters in qcow2_expand_zero_clusters() This function iterates over all snapshots of a qcow2 file in order to expand all zero clusters, but it does not validate the snapshots' L1 tables first. We now have a function to take care of this, so let's use it. We can also take the opportunity to replace the sector-based bdrv_read() with bdrv_pread(). Cc: Eric Blake <eblake@redhat.com> Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-09 15:17:47 +01:00
Alberto Garcia	314e8d3928	qcow2: Check L1 table offset in qcow2_snapshot_load_tmp() This function checks that the size of a snapshot's L1 table is not too large, but it doesn't validate the offset. We now have a function to take care of this, so let's use it. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-09 15:17:47 +01:00
Alberto Garcia	0cf0e5980b	qcow2: Generalize validate_table_offset() into qcow2_validate_table() This function checks that the offset and size of a table are valid. While the offset checks are fine, the size check is too generic, since it only verifies that the total size in bytes fits in a 64-bit integer. In practice all tables used in qcow2 have much smaller size limits, so the size needs to be checked again for each table using its actual limit. This patch generalizes this function by allowing the caller to specify the maximum size for that table. In addition to that it allows passing an Error variable. The function is also renamed and made public since we're going to use it in other parts of the code. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-09 15:17:47 +01:00
Paolo Bonzini	2fd6163884	block: convert bdrv_check callback to coroutine_fn Suggested-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1516279431-30424-8-git-send-email-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-09 15:17:47 +01:00
Paolo Bonzini	2b148f392b	block: convert bdrv_invalidate_cache callback to coroutine_fn QED's bdrv_invalidate_cache implementation would like to reuse functions that acquire/release the metadata locks. Call it from coroutine context to simplify the logic. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1516279431-30424-6-git-send-email-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-09 15:17:47 +01:00
Paolo Bonzini	9fb4dfc570	qed: make bdrv_qed_do_open a coroutine_fn It is called from qed_invalidate_cache in coroutine context (incoming migration runs in a coroutine), so it's cleaner if metadata is always loaded from a coroutine. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1516279431-30424-5-git-send-email-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-09 15:17:47 +01:00
Paolo Bonzini	1fafcd9368	qcow2: make qcow2_do_open a coroutine_fn It is called from qcow2_invalidate_cache in coroutine context (incoming migration runs in a coroutine), so it's cleaner if metadata is always loaded from a coroutine. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1516279431-30424-4-git-send-email-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-09 15:17:47 +01:00
Paolo Bonzini	1a0c2bfb03	qcow2: fix flushing after dirty bitmap metadata writes update_header_sync itself does not need to flush the caches to disk. The only paths that allocate clusters are: - bitmap_list_store with in_place=false, called by update_ext_header_and_dir - store_bitmap_data, called by store_bitmap - store_bitmap, called by qcow2_store_persistent_dirty_bitmaps and followed by update_ext_header_and_dir So in the end the central place where we need to flush the caches is update_ext_header_and_dir. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-09 15:17:47 +01:00
Paolo Bonzini	8b220eb7c8	qcow2: introduce qcow2_write_caches and qcow2_flush_caches They will be used to avoid recursively taking s->lock during bdrv_open or bdrv_check. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1516279431-30424-7-git-send-email-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-09 15:17:47 +01:00
Daniel P. Berrange	f87e08f9e2	block: implement the bdrv_reopen_prepare helper for LUKS driver If the bdrv_reopen_prepare helper isn't provided, the qemu-img commit command fails to re-open the base layer after committing changes into it. Provide a no-op implementation for the LUKS driver, since there is not any custom work that needs doing to re-open it. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-09 15:17:47 +01:00
Deepa Srinivasan	c060332c76	block: Fix qemu crash when using scsi-block Starting qemu with the following arguments causes qemu to segfault: ... -device lsi,id=lsi0 -drive file=iscsi:<...>,format=raw,if=none,node-name= iscsi1 -device scsi-block,bus=lsi0.0,id=<...>,drive=iscsi1 This patch fixes blk_aio_ioctl() so it does not pass stack addresses to blk_aio_ioctl_entry() which may be invoked after blk_aio_ioctl() returns. More details about the bug follow. blk_aio_ioctl() invokes blk_aio_prwv() with blk_aio_ioctl_entry as the coroutine parameter. blk_aio_prwv() ultimately calls aio_co_enter(). When blk_aio_ioctl() is executed from within a coroutine context (e.g. iscsi_bh_cb()), aio_co_enter() adds the coroutine (blk_aio_ioctl_entry) to the current coroutine's wakeup queue. blk_aio_ioctl() then returns. When blk_aio_ioctl_entry() executes later, it accesses an invalid pointer: .... BlkRwCo *rwco = &acb->rwco; rwco->ret = blk_co_ioctl(rwco->blk, rwco->offset, rwco->qiov->iov[0].iov_base); <--- qiov is invalid here ... In the case when blk_aio_ioctl() is called from a non-coroutine context, blk_aio_ioctl_entry() executes immediately. But if bdrv_co_ioctl() calls qemu_coroutine_yield(), blk_aio_ioctl() will return. When the coroutine execution is complete, control returns to blk_aio_ioctl_entry() after the call to blk_co_ioctl(). There is no invalid reference after this point, but the function is still holding on to invalid pointers. The fix is to change blk_aio_prwv() to accept a void pointer for the IO buffer rather than a QEMUIOVector. blk_aio_prwv() passes this through in BlkRwCo and the coroutine function casts it to QEMUIOVector or uses the void pointer directly. Signed-off-by: Deepa Srinivasan <deepa.srinivasan@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Mark Kanda <mark.kanda@oracle.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2018-03-08 15:43:11 +00:00
Peter Maydell	58e2e17dba	Block layer patches -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJanYJPAAoJEH8JsnLIjy/WxjUQAJA+DTOmGXvaNpMs65BrU79K /r/iGVrzHv/RMLmrWMnqj96W9SnpMuiAP9hVLNsekqClY9q4ME4DpGcXhWfhSvF5 FC51ehvFJdfo8cPorsevcqNj60iWebjcx3lFfUq2606UOyYih3oijYxr6gSwWbRc GAgdGMqsvGYpzgqAQVEWHUhaX0La49/OzY42aR+E+LCBNfTYvlydvyoc+tUTdIpW 1eM/ASGndGsN0Cf2vxlbKgJ0/P6v+cRZuuIDhKZqre+YG+yM+pq7yZb+o7nf/P36 TPR93BsT7FSVAizRK7VFRuPIynHpiaxYygrJERCXF0sxsV4OlKjpmt/uUPamWFh+ 46Jx2NK1AuAx87BdErgmA119ObO3oAPxK0+2p981obb6SphTbbPxDj6SOlYCt4mJ mhff4JtIiwCmDSckAwd2mkBI1Tvl9qqcELrpyd2t2eU4ec2vf7fPd85EsK/Mq6Kr dbfqFvjNaaMxChoqFgkHAveYJ7zYqRFI2IY5o9c1QyZehCGPWjScxHXZZYdpDl59 YF9DkYQDOyvEX2jmMECaO1r/0nnO+BqQHu5ItJuTte9rjP9Q0do3iBISiIefewtf yji6/QNn2hFrnr1HPAwLFFC3kPgc8Mq8mIUb53j8vG/01KhVRCcnJm2K6D4IUwLZ S6ZnQJB97eE4y7YR5dNt =2axz -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging Block layer patches # gpg: Signature made Mon 05 Mar 2018 17:45:51 GMT # gpg: using RSA key 7F09B272C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" # Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6 * remotes/kevin/tags/for-upstream: (38 commits) block: Fix NULL dereference on empty drive error qcow2: Replace align_offset() with ROUND_UP() block/ssh: Add basic .bdrv_truncate() block/ssh: Make ssh_grow_file() blocking block/ssh: Pull ssh_grow_file() from ssh_create() qemu-img: Make resize error message more general qcow2: make qcow2_co_create2() a coroutine_fn block: rename .bdrv_create() to .bdrv_co_create_opts() Revert "IDE: Do not flush empty CDROM drives" block: test blk_aio_flush() with blk->root == NULL block: add BlockBackend->in_flight counter block: extract AIO_WAIT_WHILE() from BlockDriverState aio: rename aio_context_in_iothread() to in_aio_context_home_thread() docs: document how to use the l2-cache-entry-size parameter specs/qcow2: Fix documentation of the compressed cluster descriptor iotest 033: add misaligned write-zeroes test via truncate block: fix write with zero flag set and iovector provided block: Drop unused .bdrv_co_get_block_status() vvfat: Switch to .bdrv_co_block_status() vpc: Switch to .bdrv_co_block_status() ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org> # Conflicts: # include/block/block.h	2018-03-06 11:20:44 +00:00
Kevin Wolf	bfe1a14c18	block: Fix NULL dereference on empty drive error blk_error_action() sends a BLOCK_IO_ERROR QMP event which includes the node name of its root node. If the BlockBackend represents an empty drive, there is no root node, so we should not try to access its node name. Make the field optional in the event and include it only when the BlockBackend isn't empty. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2018-03-05 18:45:32 +01:00
Markus Armbruster	112ed241f5	qapi: Empty out qapi-schema.json The previous commit improved compile time by including less of the generated QAPI headers. This is impossible for stuff defined directly in qapi-schema.json, because that ends up in headers that that pull in everything. Move everything but include directives from qapi-schema.json to new sub-module qapi/misc.json, then include just the "misc" shard where possible. It's possible everywhere, except: * monitor.c needs qmp-command.h to get qmp_init_marshal() * monitor.c, ui/vnc.c and the generated qapi-event-FOO.c need qapi-event.h to get enum QAPIEvent Perhaps we'll get rid of those some other day. Adding a type to qapi/migration.json now recompiles some 120 instead of 2300 out of 5100 objects. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20180211093607.27351-25-armbru@redhat.com> [eblake: rebase to master] Signed-off-by: Eric Blake <eblake@redhat.com>	2018-03-02 13:45:50 -06:00
Markus Armbruster	9af2398977	Include less of the generated modular QAPI headers In my "build everything" tree, a change to the types in qapi-schema.json triggers a recompile of about 4800 out of 5100 objects. The previous commit split up qmp-commands.h, qmp-event.h, qmp-visit.h, qapi-types.h. Each of these headers still includes all its shards. Reduce compile time by including just the shards we actually need. To illustrate the benefits: adding a type to qapi/migration.json now recompiles some 2300 instead of 4800 objects. The next commit will improve it further. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20180211093607.27351-24-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> [eblake: rebase to master] Signed-off-by: Eric Blake <eblake@redhat.com>	2018-03-02 13:45:50 -06:00
Markus Armbruster	0dd13589b0	Include qapi/qmp/qerror.h exactly where needed Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20180211093607.27351-2-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2018-03-02 13:14:08 -06:00
Alberto Garcia	9e029689e1	qcow2: Replace align_offset() with ROUND_UP() The align_offset() function is equivalent to the ROUND_UP() macro so there's no need to use the former. The ROUND_UP() name is also a bit more explicit. This patch uses ROUND_UP() instead of the slower QEMU_ALIGN_UP() because align_offset() already requires that the second parameter is a power of two. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 20180215131008.5153-1-berto@igalia.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2018-03-02 18:39:56 +01:00
Max Reitz	624f3006b8	block/ssh: Add basic .bdrv_truncate() libssh2 does not seem to offer real truncation support, so we can only grow files -- but that is better than nothing. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20180214204915.7980-4-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2018-03-02 18:39:56 +01:00
Max Reitz	bd8e0e32da	block/ssh: Make ssh_grow_file() blocking At runtime (that is, during a future ssh_truncate()), the SSH session is non-blocking. However, ssh_truncate() (or rather, bdrv_truncate() in general) is not a coroutine, so this resize operation needs to block. For ssh_create(), that is fine, too; the session is never set to non-blocking anyway. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20180214204915.7980-3-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2018-03-02 18:39:56 +01:00
Max Reitz	2b12a756ac	block/ssh: Pull ssh_grow_file() from ssh_create() If we ever want to offer even rudimentary truncation functionality for ssh, we should put the respective code into a reusable function. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20180214204915.7980-2-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2018-03-02 18:39:56 +01:00
Stefan Hajnoczi	c274393a3e	qcow2: make qcow2_co_create2() a coroutine_fn qcow2_create2() calls qemu_co_mutex_lock(). Only a coroutine_fn may call another coroutine_fn. In fact, qcow2_create2 is always called from coroutine context. Rename the function to add the "co" moniker and add coroutine_fn. Reported-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20170705102231.20711-3-stefanha@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-02 18:39:07 +01:00
Stefan Hajnoczi	efc75e2a4c	block: rename .bdrv_create() to .bdrv_co_create_opts() BlockDriver->bdrv_create() has been called from coroutine context since commit `5b7e1542cf` ("block: make bdrv_create adopt coroutine"). Make this explicit by renaming to .bdrv_co_create_opts() and add the coroutine_fn annotation. This makes it obvious to block driver authors that they may yield, use CoMutex, or other coroutine_fn APIs. bdrv_co_create is reserved for the QAPI-based version that Kevin is working on. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20170705102231.20711-2-stefanha@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-02 18:39:07 +01:00
Stefan Hajnoczi	33f2a75777	block: add BlockBackend->in_flight counter BlockBackend currently relies on BlockDriverState->in_flight to track requests for blk_drain(). There is a corner case where BlockDriverState->in_flight cannot be used though: blk->root can be NULL when there is no medium. This results in a segfault when the NULL pointer is dereferenced. Introduce a BlockBackend->in_flight counter for aio requests so it works even when blk->root == NULL. Based on a patch by Kevin Wolf <kwolf@redhat.com>. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-02 18:39:07 +01:00
Stefan Hajnoczi	7719f3c968	block: extract AIO_WAIT_WHILE() from BlockDriverState BlockDriverState has the BDRV_POLL_WHILE() macro to wait on event loop activity while a condition evaluates to true. This is used to implement synchronous operations where it acts as a condvar between the IOThread running the operation and the main loop waiting for the operation. It can also be called from the thread that owns the AioContext and in that case it's just a nested event loop. BlockBackend needs this behavior but doesn't always have a BlockDriverState it can use. This patch extracts BDRV_POLL_WHILE() into the AioWait abstraction, which can be used with AioContext and isn't tied to BlockDriverState anymore. This feature could be built directly into AioContext but then all users would kick the event loop even if they signal different conditions. Imagine an AioContext with many BlockDriverStates, each time a request completes any waiter would wake up and re-check their condition. It's nicer to keep a separate AioWait object for each condition instead. Please see "block/aio-wait.h" for details on the API. The name AIO_WAIT_WHILE() avoids the confusion between AIO_POLL_WHILE() and AioContext polling. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-02 18:39:07 +01:00
Anton Nefedov	18a59f03c3	block: fix write with zero flag set and iovector provided The normal bdrv_co_pwritev() use is either - BDRV_REQ_ZERO_WRITE clear and iovector provided - BDRV_REQ_ZERO_WRITE set and iovector == NULL while - the flag clear and iovector == NULL is an assertion failure in bdrv_co_do_zero_pwritev() - the flag set and iovector provided is in fact allowed (the flag prevails and zeroes are written) However the alignment logic does not support the latter case so the padding areas get overwritten with zeroes. Currently, general functions like bdrv_rw_co() do provide iovector regardless of flags. So, keep it supported and use bdrv_co_do_zero_pwritev() alignment for it which also makes the code a bit more obvious anyway. Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-02 18:39:07 +01:00
Eric Blake	636cb51258	block: Drop unused .bdrv_co_get_block_status() We are gradually moving away from sector-based interfaces, towards byte-based. Now that all drivers have been updated to provide the byte-based .bdrv_co_block_status(), we can delete the sector-based interface. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-02 18:39:07 +01:00
Eric Blake	fba3998dae	vvfat: Switch to .bdrv_co_block_status() We are gradually moving away from sector-based interfaces, towards byte-based. Update the vvfat driver accordingly. Note that we can rely on the block driver having already clamped limits to our block size, and simplify accordingly. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-02 18:39:07 +01:00
Eric Blake	2f83673b57	vpc: Switch to .bdrv_co_block_status() We are gradually moving away from sector-based interfaces, towards byte-based. Update the vpc driver accordingly. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-02 18:39:07 +01:00
Eric Blake	c72080b9b8	vmdk: Switch to .bdrv_co_block_status() We are gradually moving away from sector-based interfaces, towards byte-based. Update the vmdk driver accordingly. Drop the now-unused vmdk_find_index_in_cluster(). Also, fix a pre-existing bug: if find_extent() fails (unlikely, since the block layer did a bounds check), then we must return a failure, rather than 0. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-02 18:39:07 +01:00
Eric Blake	67635f6abe	vdi: Switch to .bdrv_co_block_status() We are gradually moving away from sector-based interfaces, towards byte-based. Update the vdi driver accordingly. Note that the TODO is already covered (the block layer guarantees bounds of its requests), and that we can remove the now-unused s->block_sectors. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-02 18:39:07 +01:00
Eric Blake	b80666bf84	vdi: Avoid bitrot of debugging code Rework the debug define so that we always get -Wformat checking, even when debugging is disabled. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Weil <sw@weilnetz.de> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-02 18:39:07 +01:00
Eric Blake	47943e9865	sheepdog: Switch to .bdrv_co_block_status() We are gradually moving away from sector-based interfaces, towards byte-based. Update the sheepdog driver accordingly. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-02 18:39:07 +01:00
Eric Blake	d41aa7e36f	raw: Switch to .bdrv_co_block_status() We are gradually moving away from sector-based interfaces, towards byte-based. Update the raw driver accordingly. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-02 18:39:07 +01:00
Eric Blake	b8d739fd6f	qed: Switch to .bdrv_co_block_status() We are gradually moving away from sector-based interfaces, towards byte-based. Update the qed driver accordingly, taking the opportunity to inline qed_is_allocated_cb() into its lone caller (the callback used to be important, until we switched qed to coroutines). There is no intent to optimize based on the want_zero flag for this format. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-02 18:39:07 +01:00
Eric Blake	a320fb04b6	qcow2: Switch to .bdrv_co_block_status() We are gradually moving away from sector-based interfaces, towards byte-based. Update the qcow2 driver accordingly. For now, we are ignoring the 'want_zero' hint. However, it should be relatively straightforward to honor the hint as a way to return larger *pnum values when we have consecutive clusters with the same data/zero status but which differ only in having non-consecutive mappings. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-02 18:39:07 +01:00
Eric Blake	d63b4c93e3	qcow: Switch to .bdrv_co_block_status() We are gradually moving away from sector-based interfaces, towards byte-based. Update the qcow driver accordingly. There is no intent to optimize based on the want_zero flag for this format. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-02 18:39:07 +01:00
Eric Blake	8e0cf59d02	parallels: Switch to .bdrv_co_block_status() We are gradually moving away from sector-based interfaces, towards byte-based. Update the parallels driver accordingly. Note that the internal function block_status() is still sector-based, because it is still in use by other sector-based functions; but that's okay because request_alignment is 512 as a result of those functions. For now, no optimizations are added based on the mapping hint. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-02 18:39:07 +01:00
Eric Blake	05c33f1021	null: Switch to .bdrv_co_block_status() We are gradually moving away from sector-based interfaces, towards byte-based. Update the null driver accordingly. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-02 18:39:07 +01:00
Eric Blake	92809c3600	iscsi: Switch to .bdrv_co_block_status() We are gradually moving away from sector-based interfaces, towards byte-based. Update the iscsi driver accordingly. In this case, it is handy to teach iscsi_co_block_status() to handle a NULL map and file parameter, even though the block layer passes non-NULL values, because we also call the function directly. For now, there are no optimizations done based on the want_zero flag. We can also make the simplification of asserting that the block layer passed in aligned values. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-02 18:39:07 +01:00
Eric Blake	04a408fbff	iscsi: Switch iscsi_allocmap_update() to byte-based We are gradually converting to byte-based interfaces, as they are easier to reason about than sector-based. Convert all uses of the allocmap (no semantic change). Callers that already had bytes available are simpler, and callers that now scale to bytes will be easier to switch to byte-based in the future. Signed-off-by: Eric Blake <eblake@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-02 18:39:07 +01:00
Eric Blake	ba059e7b17	iscsi: Switch cluster_sectors to byte-based We are gradually converting to byte-based interfaces, as they are easier to reason about than sector-based. Convert all uses of the cluster size in sectors, along with adding assertions that we are not dividing by zero. Improve some comment grammar while in the area. Signed-off-by: Eric Blake <eblake@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-02 18:39:07 +01:00
Eric Blake	08c9e7735e	gluster: Switch to .bdrv_co_block_status() We are gradually moving away from sector-based interfaces, towards byte-based. Update the gluster driver accordingly. In want_zero mode, we continue to report fine-grained hole information (the caller wants as much mapping detail as possible); but when not in that mode, the caller prefers larger *pnum and merely cares about what offsets are allocated at this layer, rather than where the holes live. Since holes still read as zeroes at this layer (rather than deferring to a backing layer), we can take the shortcut of skipping find_allocation(), and merely state that all bytes are allocated. We can also drop redundant bounds checks that are already guaranteed by the block layer. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-02 18:39:07 +01:00
Eric Blake	a290f08590	file-posix: Switch to .bdrv_co_block_status() We are gradually moving away from sector-based interfaces, towards byte-based. Update the file protocol driver accordingly. In want_zero mode, we continue to report fine-grained hole information (the caller wants as much mapping detail as possible); but when not in that mode, the caller prefers larger *pnum and merely cares about what offsets are allocated at this layer, rather than where the holes live. Since holes still read as zeroes at this layer (rather than deferring to a backing layer), we can take the shortcut of skipping lseek(), and merely state that all bytes are allocated. We can also drop redundant bounds checks that are already guaranteed by the block layer. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-02 18:39:07 +01:00
Eric Blake	3e4d0e72b7	block: Switch passthrough drivers to .bdrv_co_block_status() We are gradually moving away from sector-based interfaces, towards byte-based. Update the generic helpers, and all passthrough clients (blkdebug, commit, mirror, throttle) accordingly. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-02 18:39:07 +01:00
Eric Blake	e3efee828b	nvme: Drop pointless .bdrv_co_get_block_status() Commit `bdd6a90` has a bug: drivers should never directly set BDRV_BLOCK_ALLOCATED, but only io.c should do that (as needed). Instead, drivers should report BDRV_BLOCK_DATA if it knows that data comes from this BDS. But let's look at the bigger picture: semantically, the nvme driver is similar to the nbd, null, and raw drivers (no backing file, all data comes from this BDS). But while two of those other drivers have to supply the callback (null because it can special-case BDRV_BLOCK_ZERO, raw because it can special-case a different offset), in this case the block layer defaults are good enough without the callback at all (similar to nbd). So, fix the bug by deletion ;) Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-02 18:39:07 +01:00
Eric Blake	86a3d5c688	block: Add .bdrv_co_block_status() callback We are gradually moving away from sector-based interfaces, towards byte-based. Now that the block layer exposes byte-based allocation, it's time to tackle the drivers. Add a new callback that operates on as small as byte boundaries. Subsequent patches will then update individual drivers, then finally remove .bdrv_co_get_block_status(). The new code also passes through the 'want_zero' hint, which will allow subsequent patches to further optimize callers that only care about how much of the image is allocated (want_zero is false), rather than full details about runs of zeroes and which offsets the allocation actually maps to (want_zero is true). As part of this effort, fix another part of the documentation: the claim in commit `4c41cb4` that BDRV_BLOCK_ALLOCATED is short for 'DATA \|\| ZERO' is a lie at the block layer (see commit `e88ae2264`), even though it is how the bit is computed from the driver layer. After all, there are intentionally cases where we return ZERO but not ALLOCATED at the block layer, when we know that a read sees zero because the backing file is too short. Note that the driver interface is thus slightly different than the public interface with regards to which bits will be set, and what guarantees are provided on input. We also add an assertion that any driver using the new callback will make progress (the only time pnum will be 0 is if the block layer already handled an out-of-bounds request, or if there is an error); the old driver interface did not provide this guarantee, which could lead to some inf-loops in drastic corner-case failures. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-02 18:39:07 +01:00
Eric Blake	fd8d372dd3	nbd: Honor server's advertised minimum block size Commit `79ba8c98` (v2.7) changed the setting of request_alignment to occur only during bdrv_refresh_limits(), rather than at at bdrv_open() time; but at the time, NBD was unaffected, because it still used sector-based callbacks, so the block layer defaulted NBD to use 512 request_alignment. Later, commit `70c4fb26` (also v2.7) changed NBD to use byte-based callbacks, without setting request_alignment. This resulted in NBD using request_alignment of 1, which works great when the server supports it (as is the case for qemu-nbd), but falls apart miserably if the server requires alignment (but only if qemu actually sends a sub-sector request; qemu-io can do it, but most qemu operations still perform on sectors or larger). Even later, the NBD protocol was updated to document that clients should learn the server's minimum alignment during NBD_OPT_GO; and recommended that clients should assume a minimum size of 512 unless the server understands NBD_OPT_GO and replied with a smaller size. Commit `081dd1fe` (v2.10) attempted to do that, by assigning request_alignment to whatever was learned from the server; but it has two flaws: the assignment is done during bdrv_open() so it gets unconditionally wiped out back to 1 during any later bdrv_refresh_limits(); and the code is not using a default of 512 when the server did not report a minimum size. Fix these issues by moving the assignment to request_alignment to the right function, and by using a sane default when the server does not advertise a minimum size. CC: qemu-stable@nongnu.org Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20180215032905.27146-1-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy<vsementsov@virtuozzo.com>	2018-03-01 14:02:32 -06:00
Paolo Bonzini	78d8c99e29	block/nvme: fix Coverity reports 1) string not null terminated in sysfs_find_group_file 2) NULL pointer dereference and dead local variable in nvme_init. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <20180213015240.9352-1-famz@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>	2018-03-01 15:21:46 +08:00
Alberto Garcia	1221fe6f63	qcow2: Allow configuring the L2 slice size Now that the code is ready to handle L2 slices we can finally add an option to allow configuring their size. An L2 slice is the portion of an L2 table that is read by the qcow2 cache. Until now the cache was always reading full L2 tables, and since the L2 table size is equal to the cluster size this was not very efficient with large clusters. Here's a more detailed explanation of why it makes sense to have smaller cache entries in order to load L2 data: https://lists.gnu.org/archive/html/qemu-block/2017-09/msg00635.html This patch introduces a new command-line option to the qcow2 driver named l2-cache-entry-size (cf. l2-cache-size). The cache entry size has the same restrictions as the cluster size: it must be a power of two and it has the same range of allowed values, with the additional requirement that it must not be larger than the cluster size. The L2 cache entry size (L2 slice size) remains equal to the cluster size for now by default, so this feature must be explicitly enabled. Although my tests show that 4KB slices consistently improve performance and give the best results, let's wait and make more tests with different cluster sizes before deciding on an optimal default. Now that the cache entry size is not necessarily equal to the cluster size we need to reflect that in the MIN_L2_CACHE_SIZE documentation. That minimum value is a requirement of the COW algorithm: we need to read two L2 slices (and not two L2 tables) in order to do COW, see l2_allocate() for the actual code. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: c73e5611ff4a9ec5d20de68a6c289553a13d2354.1517840877.git.berto@igalia.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2018-02-13 17:00:00 +01:00
Alberto Garcia	dd32c88108	qcow2: Rename l2_table in count_cow_clusters() This function doesn't need any changes to support L2 slices, but since it's now dealing with slices intead of full tables, the l2_table variable is renamed for clarity. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 6107001fc79e6739242f1de7d191375e4f130aac.1517840877.git.berto@igalia.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2018-02-13 17:00:00 +01:00
Alberto Garcia	c26f10ba89	qcow2: Rename l2_table in count_contiguous_clusters_unallocated() This function doesn't need any changes to support L2 slices, but since it's now dealing with slices instead of full tables, the l2_table variable is renamed for clarity. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 78bcc54bc632574dd0b900a77a00a1b6ffc359e6.1517840877.git.berto@igalia.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2018-02-13 17:00:00 +01:00
Alberto Garcia	13f893c47d	qcow2: Rename l2_table in count_contiguous_clusters() This function doesn't need any changes to support L2 slices, but since it's now dealing with slices intead of full tables, the l2_table variable is renamed for clarity. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 812b0c3505bb1687e51285dccf1a94f0cecb1f74.1517840877.git.berto@igalia.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2018-02-13 17:00:00 +01:00
Alberto Garcia	e4e7254829	qcow2: Rename l2_table in qcow2_alloc_compressed_cluster_offset() This function doesn't need any changes to support L2 slices, but since it's now dealing with slices instead of full tables, the l2_table variable is renamed for clarity. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 0c5d4b9bf163aa3b49ec19cc512a50d83563f2ad.1517840877.git.berto@igalia.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2018-02-13 17:00:00 +01:00

1 2 3 4 5 ...

3737 Commits