block_mig_state.total_time is currently the sum of the read request
latencies. This is not very accurate because block migration uses aio and
so several requests can be submitted at once. Bandwidth should be computed
with wall-clock time, not by adding the latencies. In this case,
"total_time" has a higher value than it should, and so the computed
bandwidth is lower than it is in reality. This means that migration can
take longer than it needs to.
However, we don't want to use pure wall-clock time here. We are computing
bandwidth in the asynchronous phase, where the migration repeatedly wakes
up and sends some aio requests. The computed bandwidth will be used for
synchronous transfer.
Signed-off-by: Avishay Traeger <avishay@il.ibm.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
block_mig_state.reads is an int, and multiplying by BLOCK_SIZE yielded a
negative number, resulting in a negative bandwidth (running on a 32-bit
machine). Change order to avoid.
Signed-off-by: Avishay Traeger <avishay@il.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Set block device in use during block migration, disallow drive_del and
bdrv_truncate for in use devices.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
So that ejection of attached device by guest does not free data
in use by block migration instance.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
CC: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
b02bea3a85 added a check on the return
value of bdrv_write and aborts migration when it fails. However, if the
size of the block device to migrate is not a multiple of BLOCK_SIZE
(currently 1 MB), the last bdrv_write will fail with -EIO.
Fixed by calling bdrv_write with the correct size of the last block.
Signed-off-by: Pierre Riteau <Pierre.Riteau@irisa.fr>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When block migration is requested and no read-write block device is
present, a divide by zero exception is triggered because
total_sector_sum equals zero.
Signed-off-by: Pierre Riteau <Pierre.Riteau@irisa.fr>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
An old version of this patch was applied to master, so this contains the
differences between v1 and v2.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Block migration can submit multiple AIO reads for the same sector/chunk, but
completion of such reads can happen out of order:
migration guest
- get_dirty(N)
- aio_read(N)
- clear_dirty(N)
write(N)
set_dirty(N)
- get_dirty(N)
- aio_read(N)
If the first aio_read completes after the second, stale data will be
migrated to the destination.
Fix by not allowing multiple AIOs inflight for the same sector.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Currently block_load() doesn't check return value of bdrv_write(), and
even the destination weren't prepared to execute block migration, it
proceeds and guest boots on the target. This patch fix this issue.
Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When there is no block driver associate with BlockDriverState bdrv_getlength
returns -ENOMEDIUM that cause block migration to fail
Signed-off-by: Shahar Havivi <shaharh@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When available, we'd like to be able to access the DeviceState
when registering a savevm. For buses with a get_dev_path()
function, this will allow us to create more unique savevm
id strings.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
init_blk_migration_it() skips drives with type hint BDRV_TYPE_CDROM.
The intention is to skip read-only drives. However, BDRV_TYPE_CDROM
is only a hint. It is currently sufficent for read-only. But it's
not necessary, and it may not remain sufficient.
Use bdrv_is_read_only() instead.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The bdrv_first linked list of BlockDriverStates is currently extern so
that block migration can iterate the list. However, since there is
already a bdrv_iterate() function there is no need to expose bdrv_first.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Move to stage3 only when remaining work can be done below max downtime.
Use qemu_get_clock_ns for measuring read performance.
Signed-off-by: Liran Schour <lirans@il.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Start transfer dirty blocks during the iterative stage. That will
reduce the time that the guest will be suspended
Signed-off-by: Liran Schour <lirans@il.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
blk_mig_save_bulked_block is never called with sync flag. Remove the sync
flag. Calculate bulk completion during blk_mig_save_bulked_block.
Remove unused constants.
Signed-off-by: Liran Schour <lirans@il.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
No need to migrate emptiness (risking divide by zero later on).
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Inject progress report in percentage into the block live stream. This
can be read out and displayed easily on restore.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Report progress of an outgoing live migration to the monitor instead of
stdout.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
In order to allow proper progress reporting to the monitor that
initiated the migration, forward the monitor reference through the
migration layer down to SaveLiveStateHandler.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
So far progress reporting only works for the first block device. Fix
this by keeping an overall sum of sectors to be migratated, calculating
the sum of all processed sectors, and finally basing the progress
display on those values.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Introduce qemu_savevm_state_cancel and inject a stage -1 to cancel a
live migration. This gives the involved subsystems a chance to clean up
dynamically allocated resources. Namely, the block migration layer can
now free its device descriptors and pending blocks.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Based on the original patch by Pierre Riteau: Use a common blk_send
function to transmit a block.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Both functions share a lot of code, so make them one.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
We already save total_sectors in BlkMigDevState, let's use this value
during the migration and avoid to recalculate it needlessly.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
In case we restart a migration, submitted, read_done, transferred, and
print_completion need to be reinitialized to 0.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
No need to push block_mig_state to the heap and, thus, establish an
indirection.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Move a potentially large buffer from stack to heap.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Instead of duplicating the definition of constants or introducing
trivial retrieval functions move the SECTOR constants into the public
block API. This also obsoletes sector_per_block in BlkMigState.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch introduces block migration called during live migration. Block
are being copied to the destination in an async way. First the code will
transfer the whole disk and then transfer all dirty blocks accumulted during
the migration.
Still need to improve transition from the iterative phase of migration to the
end phase. For now transition will take place when all blocks transfered once,
all the dirty blocks will be transfered during the end phase (guest is
suspended).
Changes from v4:
- Global variabels moved to a global state structure allocated dynamically.
- Minor coding style issues.
- Poll block.c for tracking of dirty blocks instead of manage it here.
Signed-off-by: Liran Schour <lirans@il.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>