qemu-e2k

Commit Graph

Author	SHA1	Message	Date
Andrey Gruzdev	82ea3e3b99	migration: Rename 'bs' to 'block' in background snapshot code Rename 'bs' to commonly used 'block' in migration/ram.c background snapshot code. Signed-off-by: Andrey Gruzdev <andrey.gruzdev@virtuozzo.com> Reported-by: David Hildenbrand <david@redhat.com> Message-Id: <20210401092226.102804-5-andrey.gruzdev@virtuozzo.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-04-07 18:37:56 +01:00
Andrey Gruzdev	eeccb99c9d	migration: Pre-fault memory before starting background snasphot This commit solves the issue with userfault_fd WP feature that background snapshot is based on. For any never poluated or discarded memory page, the UFFDIO_WRITEPROTECT ioctl() would skip updating PTE for that page, thereby loosing WP setting for it. So we need to pre-fault pages for each RAM block to be protected before making a userfault_fd wr-protect ioctl(). Fixes: `278e2f551a` (migration: support UFFD write fault processing in ram_save_iterate()) Signed-off-by: Andrey Gruzdev <andrey.gruzdev@virtuozzo.com> Reported-by: David Hildenbrand <david@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Message-Id: <20210401092226.102804-4-andrey.gruzdev@virtuozzo.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> dgilbert: Bodged ifdef __linux__ on ram_write_tracking_prepare, should really go in a stub	2021-04-07 18:37:28 +01:00
Daniel P. Berrangé	cbde7be900	migrate: remove QMP/HMP commands for speed, downtime and cache size The generic 'migrate_set_parameters' command handle all types of param. Only the QMP commands were documented in the deprecations page, but the rationale for deprecating applies equally to HMP, and the replacements exist. Furthermore the HMP commands are just shims to the QMP commands, so removing the latter breaks the former unless they get re-implemented. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2021-03-18 09:22:55 +00:00
Mahmoud Mandour	373969507a	migration: Replaced qemu_mutex_lock calls with QEMU_LOCK_GUARD Replaced various qemu_mutex_lock calls and their respective qemu_mutex_unlock calls with QEMU_LOCK_GUARD macro. This simplifies the code by eliminating the respective qemu_mutex_unlock calls. Signed-off-by: Mahmoud Mandour <ma.mandourr@gmail.com> Message-Id: <20210311031538.5325-7-ma.mandourr@gmail.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-03-15 20:01:55 +00:00
Stefan Reiter	e846b74650	migration: only check page size match if RAM postcopy is enabled Postcopy may also be advised for dirty-bitmap migration only, in which case the remote page size will not be available and we'll instead read bogus data, blocking migration with a mismatch error if the VM uses hugepages. Fixes: `58110f0acb` ("migration: split common postcopy out of ram postcopy") Signed-off-by: Stefan Reiter <s.reiter@proxmox.com> Message-Id: <20210204163522.13291-1-s.reiter@proxmox.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-02-08 11:19:52 +00:00
Markus Armbruster	8b9407a09f	migration: Clean up signed vs. unsigned XBZRLE cache-size `73af8dd8d7` "migration: Make xbzrle_cache_size a migration parameter" (v2.11.0) made the new parameter unsigned (QAPI type 'size', uint64_t in C). It neglected to update existing code, which continues to use int64_t. migrate_xbzrle_cache_size() returns the new parameter. Adjust its return type. QMP query-migrate-cache-size returns migrate_xbzrle_cache_size(). Adjust its return type. migrate-set-parameters passes the new parameter to xbzrle_cache_resize(). Adjust its parameter type. xbzrle_cache_resize() passes it on to cache_init(). Adjust its parameter type. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20210202141734.2488076-3-armbru@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-02-08 11:19:51 +00:00
Andrey Gruzdev	278e2f551a	migration: support UFFD write fault processing in ram_save_iterate() In this particular implementation the same single migration thread is responsible for both normal linear dirty page migration and procesing UFFD page fault events. Processing write faults includes reading UFFD file descriptor, finding respective RAM block and saving faulting page to the migration stream. After page has been saved, write protection can be removed. Since asynchronous version of qemu_put_buffer() is expected to be used to save pages, we also have to flush migraion stream prior to un-protecting saved memory range. Write protection is being removed for any previously protected memory chunk that has hit the migration stream. That's valid for pages from linear page scan along with write fault pages. Signed-off-by: Andrey Gruzdev <andrey.gruzdev@virtuozzo.com> Acked-by: Peter Xu <peterx@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20210129101407.103458-4-andrey.gruzdev@virtuozzo.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> fixup pagefault.address cast for 32bit	2021-02-08 11:19:51 +00:00
Andrey Gruzdev	6e8c25b4c6	migration: introduce 'background-snapshot' migration capability Add new capability to 'qapi/migration.json' schema. Update migrate_caps_check() to validate enabled capability set against introduced one. Perform checks for required kernel features and compatibility with guest memory backends. Signed-off-by: Andrey Gruzdev <andrey.gruzdev@virtuozzo.com> Reviewed-by: Peter Xu <peterx@redhat.com> Acked-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20210129101407.103458-2-andrey.gruzdev@virtuozzo.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-02-08 11:19:51 +00:00
Philippe Mathieu-Daudé	af3bbbe984	migration/ram: Fix hexadecimal format string specifier The '%u' conversion specifier is for decimal notation. When prefixing a format with '0x', we want the hexadecimal specifier ('%x'). Inspired-by: Dov Murik <dovmurik@linux.vnet.ibm.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20201103112558.2554390-5-philmd@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-11-12 14:02:41 +00:00
Rao, Lei	b70cb3b485	Reduce the time of checkpoint for COLO we should set ram_bulk_stage to false after ram_state_init, otherwise the bitmap will be unused in migration_bitmap_find_dirty. all pages in ram cache will be flushed to the ram of secondary guest for each checkpoint. Signed-off-by: Lei Rao <lei.rao@intel.com> Signed-off-by: Derek Su <dereksu@qnap.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Li Zhijian <lizhijian@cn.fujitsu.com> Reviewed-by: Zhang Chen <chen.zhang@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2020-11-11 16:52:23 +08:00
Bihong Yu	49324e939c	migration: Do not initialise statics and globals to 0 or NULL Signed-off-by: Bihong Yu <yubihong@huawei.com> Reviewed-by: Chuan Zheng <zhengchuan@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <1603163448-27122-7-git-send-email-yubihong@huawei.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-10-26 16:15:04 +00:00
Bihong Yu	f4c51a6bfd	migration: Add braces {} for if statement Signed-off-by: Bihong Yu <yubihong@huawei.com> Reviewed-by: Chuan Zheng <zhengchuan@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <1603163448-27122-6-git-send-email-yubihong@huawei.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-10-26 16:15:04 +00:00
Bihong Yu	395cb45009	migration: Add spaces around operator Signed-off-by: Bihong Yu <yubihong@huawei.com> Reviewed-by: Chuan Zheng <zhengchuan@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <1603163448-27122-4-git-send-email-yubihong@huawei.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-10-26 16:15:04 +00:00
Bihong Yu	29fccade10	migration: Don't use '#' flag of printf format Signed-off-by: Bihong Yu <yubihong@huawei.com> Reviewed-by: Chuan Zheng <zhengchuan@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <1603163448-27122-3-git-send-email-yubihong@huawei.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-10-26 16:15:04 +00:00
Chuan Zheng	3ded54b1bd	migration/dirtyrate: move RAMBLOCK_FOREACH_MIGRATABLE into ram.h RAMBLOCK_FOREACH_MIGRATABLE is need in dirtyrate measure, move the existing definition up into migration/ram.h Signed-off-by: Chuan Zheng <zhengchuan@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: David Edmondson <david.edmondson@oracle.com> Reviewed-by: Li Qiang <liq3ea@gmail.com> Message-Id: <1600237327-33618-6-git-send-email-zhengchuan@huawei.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-09-25 12:45:57 +01:00
zhaolichang	3a4452d896	migration/: fix some comment spelling errors I found that there are many spelling errors in the comments of qemu, so I used the spellcheck tool to check the spelling errors and finally found some spelling errors in the migration folder. Signed-off-by: zhaolichang <zhaolichang@huawei.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <20200917075029.313-3-zhaolichang@huawei.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2020-09-17 20:36:32 +02:00
Claudio Fontana	b0c3cf9407	cpu-throttle: new module, extracted from cpus.c move the vcpu throttling functionality into its own module. This functionality is not specific to any accelerator, and it is used currently by migration to slow down guests to try to have migrations converge, and by the cocoa MacOS UI to throttle speed. cpu-throttle contains the controls to adjust and inspect throttle settings, start (set) and stop vcpu throttling, and the throttling function itself that is run periodically on vcpus to make them take a nap. Execution of the throttling function on all vcpus is triggered by a timer, registered at module initialization. No functionality change. Signed-off-by: Claudio Fontana <cfontana@suse.de> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Message-Id: <20200629093504.3228-3-cfontana@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-07-10 18:04:49 -04:00
Keqian Zhu	fb6135807f	migration: Count new_dirty instead of real_dirty real_dirty_pages becomes equal to total ram size after dirty log sync in ram_init_bitmaps, the reason is that the bitmap of ramblock is initialized to be all set, so old path counts them as "real dirty" at beginning. This causes wrong dirty rate and false positive throttling. Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com> Message-Id: <20200622032037.31112-1-zhukeqian1@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-07-03 16:23:05 +01:00
Wei Wang	9227140217	migration: fix xbzrle encoding rate calculation It's reported an error of implicit conversion from "unsigned long" to "double" when compiling with Clang 10. Simply make the encoding rate 0 when the encoded_size is 0. Fixes: `e460a4b1a4` Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reported-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Wei Wang <wei.w.wang@intel.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200617201309.1640952-3-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2020-06-18 10:26:02 +01:00
Lukas Straub	24fa16f8cc	migration/colo.c: Flush ram cache only after receiving device state If we suceed in receiving ram state, but fail receiving the device state, there will be a mismatch between the two. Fix this by flushing the ram cache only after the vmstate has been received. Signed-off-by: Lukas Straub <lukasstraub2@web.de> Message-Id: <3289d007d494cb0e2f05b1cf4ae6a78d300fede3.1589193382.git.lukasstraub2@web.de> Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-06-01 18:44:27 +01:00
Wei Wang	e460a4b1a4	migration/xbzrle: add encoding rate Users may need to check the xbzrle encoding rate to know if the guest memory is xbzrle encoding-friendly, and dynamically turn off the encoding if the encoding rate is low. Signed-off-by: Yi Sun <yi.y.sun@intel.com> Signed-off-by: Wei Wang <wei.w.wang@intel.com> Message-Id: <1588208375-19556-1-git-send-email-wei.w.wang@intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-05-07 17:40:24 +01:00
David Hildenbrand	ddf35bdf0a	migration/ram: Consolidate variable reset after placement in ram_load_postcopy() Let's consolidate resetting the variables. Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com> Cc: Juan Quintela <quintela@redhat.com> Cc: Peter Xu <peterx@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20200421085300.7734-10-david@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Fixup for context conflicts with `91ba442`	2020-05-07 17:40:24 +01:00
Keqian Zhu	cbbf818224	migration/throttle: Add cpu-throttle-tailslow migration parameter At the tail stage of throttling, the Guest is very sensitive to CPU percentage while the @cpu-throttle-increment is excessive usually at tail stage. If this parameter is true, we will compute the ideal CPU percentage used by the Guest, which may exactly make the dirty rate match the dirty rate threshold. Then we will choose a smaller throttle increment between the one specified by @cpu-throttle-increment and the one generated by ideal CPU percentage. Therefore, it is compatible to traditional throttling, meanwhile the throttle increment won't be excessive at tail stage. This may make migration time longer, and is disabled by default. Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com> Message-Id: <20200413101508.54793-1-zhukeqian1@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-05-07 17:40:24 +01:00
Peter Maydell	a2261b2754	trivial patches (20200504) Silent static analyzer warning Remove dead assignments Support -chardev serial on macOS Update MAINTAINERS Some cosmetic changes -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEEzS913cjjpNwuT1Fz8ww4vT8vvjwFAl6wOI4SHGxhdXJlbnRA dml2aWVyLmV1AAoJEPMMOL0/L748p7UQAIFSNN0FrDV+K7i8qqq0X+JrS+dNOHNm DSpOf8IaGm/BezzL6XirXBVpFxg9iB5DQVLsjP1kUggO7rbBO0blx5H5eOPhnXZj xg60kLN16ty7NZ/WPS1G9jF4nDsjz0ZUtCXb0OXsuGJIOrsmN2r/lxdJwcjHZaqJ RzbcCSFXlvL0g7mOakJinMJH5r/nWCiUoEYsikhP10DcvuSBoCnjr+LYV6Ef02G0 Y5lgKN2G0EAMgWTJaL3gIF27zS8QLDNll+eO+PIU5K4yo75/wRCKr4e3PpErZlf6 B+hCAAPnXCpDKw+8sK2z+9OZXUGe1hQ8LHNgNNM921C66f+vLLXpIDTAECihM4K4 0wThYlFDwT4j+PMHFNlzIobGMtb33ui8m40lepMt/YOVFqY4tr8u3MLhHkVDo2+8 sNuOOWLXAoFOYyRqgTeVJvZvMUFQqtDiftghw1BR55TyIpDWjvLYRqae5CI+MGXs 6YylZVHGzVjMVptxvivvIQ735Nq8LaKq7N8Cb7uvcbRaCki39BsxXVPZx4p6NdwN dMndUOz/y75dNlRMDjK8l/oRFPJa/p1Yz8mZhl0uVOO6JeJhBwYmk+WkQ7g/GHZb Rx15HnVWRu6C/Icbw4kqZYyqrgl5lykS8aAWURePdpjzKY77rY1H71FesMhjifRN ZGgfUdWI88M4 =ibgH -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/vivier2/tags/trivial-branch-for-5.1-pull-request' into staging trivial patches (20200504) Silent static analyzer warning Remove dead assignments Support -chardev serial on macOS Update MAINTAINERS Some cosmetic changes # gpg: Signature made Mon 04 May 2020 16:45:18 BST # gpg: using RSA key CD2F75DDC8E3A4DC2E4F5173F30C38BD3F2FBE3C # gpg: issuer "laurent@vivier.eu" # gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>" [full] # gpg: aka "Laurent Vivier <laurent@vivier.eu>" [full] # gpg: aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>" [full] # Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F 5173 F30C 38BD 3F2F BE3C * remotes/vivier2/tags/trivial-branch-for-5.1-pull-request: hw/timer/pxa2xx_timer: Add assertion to silent static analyzer warning hw/timer/stm32f2xx_timer: Remove dead assignment hw/gpio/aspeed_gpio: Remove dead assignment hw/isa/i82378: Remove dead assignment hw/ide/sii3112: Remove dead assignment hw/input/adb-kbd: Remove dead assignment hw/i2c/pm_smbus: Remove dead assignment blockdev: Remove dead assignment block: Avoid dead assignment Compress lines for immediate return chardev: Add macOS to list of OSes that support -chardev serial MAINTAINERS: Update Keith Busch's email address elf_ops: Don't try to g_mapped_file_unref(NULL) hw/mem/pc-dimm: Fix line over 80 characters warning hw/mem/pc-dimm: Print slot number on error at pc_dimm_pre_plug() MAINTAINERS: Mark the LatticeMico32 target as orphan timer/exynos4210_mct: Remove redundant statement in exynos4210_mct_write() display/blizzard: use extract16() for fix clang analyzer warning in blizzard_draw_line16_32() scsi/esp-pci: add g_assert() for fix clang analyzer warning in esp_pci_io_write() Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2020-05-05 14:03:28 +01:00
Daniel Brodsky	6e8a355de6	lockable: replaced locks with lock guard macros where appropriate - ran regexp "qemu_mutex_lock$.$.\n.*if" to find targets - replaced result with QEMU_LOCK_GUARD if all unlocks at function end - replaced result with WITH_QEMU_LOCK_GUARD if unlock not at end Signed-off-by: Daniel Brodsky <dnbrdsky@gmail.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-id: 20200404042108.389635-3-dnbrdsky@gmail.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2020-05-04 16:07:43 +01:00
Simran Singhal	b3ac2b94cd	Compress lines for immediate return Compress two lines into a single line if immediate return statement is found. It also remove variables progress, val, data, ret and sock as they are no longer needed. Remove space between function "mixer_load" and '(' to fix the checkpatch.pl error:- ERROR: space prohibited between function name and open parenthesis '(' Done using following coccinelle script: @@ local idexpression ret; expression e; @@ -ret = +return e; -return ret; Signed-off-by: Simran Singhal <singhalsimran0@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20200401165314.GA3213@simran-Inspiron-5558> [lv: in handle_aiocb_write_zeroes_unmap() move "int ret" inside the #ifdef] Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2020-05-04 14:43:22 +02:00
Vladimir Sementsov-Ogievskiy	b4a1733c5e	migration/ram: fix use after free of local_err local_err is used again in migration_bitmap_sync_precopy() after precopy_notify(), so we must zero it. Otherwise try to set non-NULL local_err will crash. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200324153630.11882-6-vsementsov@virtuozzo.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-03-25 12:31:38 +00:00
zhanghailiang	8af66371ed	ram/colo: only record bitmap of dirty pages in COLO stage It is only need to record bitmap of dirty pages while goes into COLO stage. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Message-Id: <20200224065414.36524-6-zhang.zhanghailiang@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-03-13 09:36:30 +00:00
zhanghailiang	0393031a16	COLO: Optimize memory back-up process This patch will reduce the downtime of VM for the initial process, Previously, we copied all these memory in preparing stage of COLO while we need to stop VM, which is a time-consuming process. Here we optimize it by a trick, back-up every page while in migration process while COLO is enabled, though it affects the speed of the migration, but it obviously reduce the downtime of back-up all SVM'S memory in COLO preparing stage. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Message-Id: <20200224065414.36524-5-zhang.zhanghailiang@huawei.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> minor typo fixes	2020-03-13 09:36:30 +00:00
Keqian Zhu	dc14a47076	migration/throttle: Add throttle-trig-thres migration parameter Currently, if the bytes_dirty_period is more than the 50% of bytes_xfer_period, we start or increase throttling. If we make this percentage higher, then we can tolerate higher dirty rate during migration, which means less impact on guest. The side effect of higher percentage is longer migration time. We can make this parameter configurable to switch between mig- ration time first or guest performance first. The default value is 50 and valid range is 1 to 100. Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com> Message-Id: <20200224023142.39360-1-zhukeqian1@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-03-13 09:36:30 +00:00
Juan Quintela	87dc6f5f66	multifd: Add zstd compression multifd support Signed-off-by: Juan Quintela <quintela@redhat.com> Acked-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-02-28 09:25:49 +01:00
Juan Quintela	ab7cbb0b9a	multifd: Make no compression operations into its own structure It will be used later. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> --- No comp value needs to be zero.	2020-02-28 09:24:43 +01:00
Juan Quintela	d32ca5ad79	multifd: Split multifd code into its own file Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-01-29 11:28:59 +01:00
Juan Quintela	b673eab4e2	multifd: Make multifd_load_setup() get an Error parameter We need to change the full chain to pass the Error parameter. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-01-29 11:28:59 +01:00
Juan Quintela	00f4b572e6	multifd: Make multifd_save_setup() get an Error parameter Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-01-29 11:28:59 +01:00
Juan Quintela	857a4bbb86	migration: Make checkpatch happy with comments Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-01-29 11:28:59 +01:00
Juan Quintela	a6703e4d33	multifd: Use qemu_target_page_size() We will make it cpu independent. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-01-29 11:28:59 +01:00
Juan Quintela	99f2c6fb46	multifd: multifd_send_sync_main only needs the qemufile Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-01-29 11:28:59 +01:00
Juan Quintela	67a4c8910c	multifd: multifd_queue_page only needs the qemufile Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-01-29 11:28:59 +01:00
Juan Quintela	df94d32bb1	multifd: multifd_send_pages only needs the qemufile Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-01-29 11:28:59 +01:00
Zhimin Feng	9c4d333c09	migration/multifd: fix nullptr access in multifd_send_terminate_threads If the multifd_send_threads is not created when migration is failed, multifd_save_cleanup would be called twice. In this senario, the multifd_send_state is accessed after it has been released, the result is that the source VM is crashing down. Here is the coredump stack: Program received signal SIGSEGV, Segmentation fault. 0x00005629333a78ef in multifd_send_terminate_threads (err=err@entry=0x0) at migration/ram.c:1012 1012 MultiFDSendParams *p = &multifd_send_state->params[i]; #0 0x00005629333a78ef in multifd_send_terminate_threads (err=err@entry=0x0) at migration/ram.c:1012 #1 0x00005629333ab8a9 in multifd_save_cleanup () at migration/ram.c:1028 #2 0x00005629333abaea in multifd_new_send_channel_async (task=0x562935450e70, opaque=<optimized out>) at migration/ram.c:1202 #3 0x000056293373a562 in qio_task_complete (task=task@entry=0x562935450e70) at io/task.c:196 #4 0x000056293373a6e0 in qio_task_thread_result (opaque=0x562935450e70) at io/task.c:111 #5 0x00007f475d4d75a7 in g_idle_dispatch () from /usr/lib64/libglib-2.0.so.0 #6 0x00007f475d4da9a9 in g_main_context_dispatch () from /usr/lib64/libglib-2.0.so.0 #7 0x0000562933785b33 in glib_pollfds_poll () at util/main-loop.c:219 #8 os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:242 #9 main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:518 #10 0x00005629334c5acf in main_loop () at vl.c:1810 #11 0x000056293334d7bb in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4471 If the multifd_send_threads is not created when migration is failed. In this senario, we don't call multifd_save_cleanup in multifd_new_send_channel_async. Signed-off-by: Zhimin Feng <fengzhimin1@huawei.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-01-29 11:28:59 +01:00
Juan Quintela	b69a0227a8	migration: Don't send data if we have stopped If we do a cancel, we got out without one error, but we can't do the rest of the output as in a normal situation. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-01-29 11:28:59 +01:00
Juan Quintela	3d4095b222	multifd: Make sure that we don't do any IO after an error Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-01-29 11:28:59 +01:00
Juan Quintela	ddac5cb2d9	multifd: Be consistent about using uint64_t We transmit ram_addr_t always as uint64_t. Be consistent in its use (on 64bit system, it is always uint64_t problem is 32bits). Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-01-20 09:17:07 +01:00
Alexey Romko	8bba004cca	Bug #1829242 correction. Added type conversions to ram_addr_t before all left shifts of page indexes to TARGET_PAGE_BITS, to correct overflows when the page address was 4Gb and more. Signed-off-by: Alexey Romko <nevilad@yahoo.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-01-20 09:10:23 +01:00
Jiahui Cen	9560a48ecc	migration/multifd: fix destroyed mutex access in terminating multifd threads One multifd will lock all the other multifds' IOChannel mutex to inform them to quit by setting p->quit or shutting down p->c. In this senario, if some multifds had already been terminated and multifd_load_cleanup/multifd_save_cleanup had destroyed their mutex, it could cause destroyed mutex access when trying lock their mutex. Here is the coredump stack: #0 0x00007f81a2794437 in raise () from /usr/lib64/libc.so.6 #1 0x00007f81a2795b28 in abort () from /usr/lib64/libc.so.6 #2 0x00007f81a278d1b6 in __assert_fail_base () from /usr/lib64/libc.so.6 #3 0x00007f81a278d262 in __assert_fail () from /usr/lib64/libc.so.6 #4 0x000055eb1bfadbd3 in qemu_mutex_lock_impl (mutex=0x55eb1e2d1988, file=<optimized out>, line=<optimized out>) at util/qemu-thread-posix.c:64 #5 0x000055eb1bb4564a in multifd_send_terminate_threads (err=<optimized out>) at migration/ram.c:1015 #6 0x000055eb1bb4bb7f in multifd_send_thread (opaque=0x55eb1e2d19f8) at migration/ram.c:1171 #7 0x000055eb1bfad628 in qemu_thread_start (args=0x55eb1e170450) at util/qemu-thread-posix.c:502 #8 0x00007f81a2b36df5 in start_thread () from /usr/lib64/libpthread.so.0 #9 0x00007f81a286048d in clone () from /usr/lib64/libc.so.6 To fix it up, let's destroy the mutex after all the other multifd threads had been terminated. Signed-off-by: Jiahui Cen <cenjiahui@huawei.com> Signed-off-by: Ying Fang <fangying1@huawei.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-01-20 09:10:23 +01:00
Jiahui Cen	f76e32eb05	migration/multifd: fix nullptr access in terminating multifd threads One multifd channel will shutdown all the other multifd's IOChannel when it fails to receive an IOChannel. In this senario, if some multifds had not received its IOChannel yet, it would try to shutdown its IOChannel which could cause nullptr access at qio_channel_shutdown. Here is the coredump stack: #0 object_get_class (obj=obj@entry=0x0) at qom/object.c:908 #1 0x00005563fdbb8f4a in qio_channel_shutdown (ioc=0x0, how=QIO_CHANNEL_SHUTDOWN_BOTH, errp=0x0) at io/channel.c:355 #2 0x00005563fd7b4c5f in multifd_recv_terminate_threads (err=<optimized out>) at migration/ram.c:1280 #3 0x00005563fd7bc019 in multifd_recv_new_channel (ioc=ioc@entry=0x556400255610, errp=errp@entry=0x7ffec07dce00) at migration/ram.c:1478 #4 0x00005563fda82177 in migration_ioc_process_incoming (ioc=ioc@entry=0x556400255610, errp=errp@entry=0x7ffec07dce30) at migration/migration.c:605 #5 0x00005563fda8567d in migration_channel_process_incoming (ioc=0x556400255610) at migration/channel.c:44 #6 0x00005563fda83ee0 in socket_accept_incoming_migration (listener=0x5563fff6b920, cioc=0x556400255610, opaque=<optimized out>) at migration/socket.c:166 #7 0x00005563fdbc25cd in qio_net_listener_channel_func (ioc=<optimized out>, condition=<optimized out>, opaque=<optimized out>) at io/net-listener.c:54 #8 0x00007f895b6fe9a9 in g_main_context_dispatch () from /usr/lib64/libglib-2.0.so.0 #9 0x00005563fdc18136 in glib_pollfds_poll () at util/main-loop.c:218 #10 0x00005563fdc181b5 in os_host_main_loop_wait (timeout=1000000000) at util/main-loop.c:241 #11 0x00005563fdc183a2 in main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:517 #12 0x00005563fd8edb37 in main_loop () at vl.c:1791 #13 0x00005563fd74fd45 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4473 To fix it up, let's check p->c before calling qio_channel_shutdown. Signed-off-by: Jiahui Cen <cenjiahui@huawei.com> Signed-off-by: Ying Fang <fangying1@huawei.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-01-20 09:10:23 +01:00
Wei Yang	c6b3a2e0c4	migration/multifd: not use multifd during postcopy We don't support multifd during postcopy, but user still could enable both multifd and postcopy. This leads to migration failure. Skip multifd during postcopy. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-01-20 09:10:23 +01:00
Wei Yang	eab54aa78f	migration/multifd: clean pages after filling packet This is a preparation for the next patch: not use multifd during postcopy. Without enabling postcopy, everything looks good. While after enabling postcopy, migration may fail even not use multifd during postcopy. The reason is the pages is not properly cleared and old target page will continue to be transferred. After clean pages, migration succeeds. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-01-20 09:10:23 +01:00
Wei Yang	644acf99b8	migration/postcopy: enable compress during postcopy postcopy requires to place a whole host page, while migration thread migrate memory in target page size. This makes postcopy need to collect all target pages in one host page before placing via userfaultfd. To enable compress during postcopy, there are two problems to solve: 1. Random order for target page arrival 2. Target pages in one host page arrives without interrupt by target page from other host page The first one is handled by previous cleanup patch. This patch handles the second one by: 1. Flush compress thread for each host page 2. Wait for decompress thread for before placing host page Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-01-20 09:10:23 +01:00
Wei Yang	91ba442f5c	migration/postcopy: enable random order target page arrival After using number of target page received to track one host page, we could have the capability to handle random order target page arrival in one host page. This is a preparation for enabling compress during postcopy. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-01-20 09:10:23 +01:00
Wei Yang	e5e73b0f90	migration/postcopy: set all_zero to true on the first target page For the first target page, all_zero is set to true for this round check. After target_pages introduced, we could leverage this variable instead of checking the address offset. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-01-20 09:10:23 +01:00
Wei Yang	4cbb3c63c1	migration/postcopy: count target page number to decide the place_needed In postcopy, it requires to place whole host page instead of target page. Currently, it relies on the page offset to decide whether this is the last target page. We also can count the target page number during the iteration. When the number of target page equals (host page size / target page size), this means it is the last target page in the host page. This is a preparation for non-ordered target page transmission. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-01-20 09:10:23 +01:00
Wei Yang	ca1a6b708b	migration/postcopy: wait for decompress thread in precopy Compress is not supported with postcopy, it is safe to wait for decompress thread just in precopy. This is a preparation for later patch. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-01-20 09:10:23 +01:00
Wei Yang	2e36bc1b88	migration/postcopy: reduce memset when it is zero page and matches_target_page_size In this case, page_buffer content would not be used. Skip this to save some time. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-01-20 09:10:23 +01:00
Yury Kotov	e65cec5e5d	migration/ram: Yield periodically to the main loop Usually, incoming migration coroutine yields to the main loop while its IO-channel is waiting for data to receive. But there is a case when RAM migration and data receive have the same speed: VM with huge zeroed RAM. In this case, IO-channel won't read and thus the main loop is stuck and for instance, it doesn't respond to QMP commands. For this case, yield periodically, but not too often, so as not to affect the speed of migration. Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-01-20 09:10:22 +01:00
Dr. David Alan Gilbert	97e1e06780	migration: Rate limit inside host pages When using hugepages, rate limiting is necessary within each huge page, since a 1G huge page can take a significant time to send, so you end up with bursty behaviour. Fixes: `4c011c37ec` ("postcopy: Send whole huge pages") Reported-by: Lin Ma <LMa@suse.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-01-20 09:10:22 +01:00
Daniel Henrique Barboza	03acb4e94d	ram.c: remove unneeded labels ram_save_queue_pages() has an 'err' label that can be replaced by 'return -1' instead. Same thing with ram_discard_range(), and in this case we can also get rid of the 'ret' variable and return either '-1' on error or the result of ram_block_discard_range(). CC: Juan Quintela <quintela@redhat.com> CC: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-01-20 09:10:22 +01:00
Juan Quintela	4d65a6216b	migration: Make sure that we don't call write() in case of error If we are exiting due to an error/finish/.... Just don't try to even touch the channel with one IO operation. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-01-20 09:10:22 +01:00
Juan Quintela	d069bcca6c	multifd: Initialize local variable Fill everything with zero, so the padding fields are also initialized. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>	2020-01-20 09:08:53 +01:00
Paolo Bonzini	44901b5aff	colo: fix return without releasing RCU Use WITH_RCU_READ_LOCK_GUARD to avoid exiting colo_init_ram_cache without releasing RCU. Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-12-17 19:33:52 +01:00
Marc-André Lureau	e4f1bea2a8	migration: fix maybe-uninitialized warning ../migration/ram.c: In function ‘multifd_recv_thread’: /home/elmarco/src/qq/include/qapi/error.h:165:5: error: ‘block’ may be used uninitialized in this function [-Werror=maybe-uninitialized] 165 \| error_setg_internal((errp), __FILE__, __LINE__, __func__, \ \| ^~~~~~~~~~~~~~~~~~~ ../migration/ram.c:818:15: note: ‘block’ was declared here 818 \| RAMBlock *block; \| ^~~~~ Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-12-17 19:32:47 +01:00
Beata Michalska	bd108a44bc	migration: ram: Switch to ram block writeback Switch to ram block writeback for pmem migration. Signed-off-by: Beata Michalska <beata.michalska@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-id: 20191121000843.24844-4-beata.michalska@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-12-16 10:46:35 +00:00
Wei Yang	aff66d2ef0	migration/multifd: pages->used would be cleared when attach to multifd_send_state When we found an available channel in multifd_send_pages(), its pages->used is cleared and then attached to multifd_send_state. It is not necessary to do this twice. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20191011085050.17622-5-richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-10-11 15:02:06 +01:00
Wei Yang	9985e1f48d	migration/multifd: initialize packet->magic/version once at setup stage MultiFDPacket_t's magic and version field never changes during migration, so move these two fields in setup stage. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20191011085050.17622-4-richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-10-11 15:02:00 +01:00
Wei Yang	f2148c4c79	migration/multifd: use pages->allocated instead of the static max multifd_send_fill_packet() prepares meta data for following pages to transfer. It would be more proper to fill pages->allocated instead of static max value, especially we want to support flexible packet size. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20191011085050.17622-3-richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-10-11 15:01:54 +01:00
Wei Yang	d884e77bfe	migration/multifd: fix a typo in comment of multifd_recv_unfill_packet() Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20191011085050.17622-2-richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-10-11 15:01:44 +01:00
Wei Yang	3414322a83	migration/postcopy: allocate tmp_page in setup stage During migration, a tmp page is allocated so that we could place a whole host page during postcopy. Currently the page is allocated during load stage, this is a little bit late. And more important, if we failed to allocate it, the error is not checked properly. Even it is NULL, we would still use it. This patch moves the allocation to setup stage and if failed error message would be printed and caller would notice it. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-10-11 14:28:19 +01:00
Dr. David Alan Gilbert	89ac5a1d2a	migration: Use automatic rcu_read unlock in ram.c Use the automatic read unlocker in migration/ram.c Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20191007143642.301445-4-dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-10-11 14:20:00 +01:00
Dr. David Alan Gilbert	0e6ebd4877	migration: Fix missing rcu_read_unlock Use the automatic rcu_read unlocker to fix a missing unlock. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20191007143642.301445-3-dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-10-11 14:19:59 +01:00
Wei Yang	64737606e8	migration: remove sent parameter in get_queued_page_not_dirty This is a cleanup for previous removal of unsentmap. The sent parameter is not necessary now. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190819061843.28642-4-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-09-25 15:51:19 +01:00
Wei Yang	1e7cf8c323	migration/postcopy: unsentmap is not necessary for postcopy Commit `f3f491fcd6` ('Postcopy: Maintain unsentmap') introduced unsentmap to track not yet sent pages. This is not necessary since: * unsentmap is a sub-set of bmap before postcopy start * unsentmap is the summation of bmap and unsentmap after canonicalizing This patch just removes it. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190819061843.28642-3-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-09-25 15:51:19 +01:00
Wei Yang	8324ef86f0	migration/postcopy: not necessary to do discard when canonicalizing bitmap All pages, either partially sent or partially dirty, will be discarded in postcopy_send_discard_bm_ram(), since we update the unsentmap to be unsentmap = unsentmap \| dirty in ram_postcopy_send_discard_bitmap(). This is not necessary to do discard when canonicalizing bitmap. And by doing so, we separate the page discard into two individual steps: * canonicalize bitmap * discard page Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190819061843.28642-2-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-09-25 15:51:19 +01:00
Dr. David Alan Gilbert	ce62df5378	migration: register_savevm_live doesn't need dev Commit `78dd48df3` removed the last caller of register_savevm_live for an instantiable device (rather than a single system wide device); so trim out the parameter. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190822115433.12070-1-dgilbert@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-09-12 11:15:03 +01:00
Ivan Ren	2f4aefd320	migration: multifd_send_thread always post p->sem_sync when error happen When encounter error, multifd_send_thread should always notify who pay attention to it before exit. Otherwise it may block migration_thread at multifd_send_sync_main forever. Error as follow: ------------------------------------------------------------------------------- (gdb) bt #0 0x00007f4d669dfa0b in do_futex_wait.constprop.1 () from /lib64/libpthread.so.0 #1 0x00007f4d669dfa9f in __new_sem_wait_slow.constprop.0 () from /lib64/libpthread.so.0 #2 0x00007f4d669dfb3b in sem_wait@@GLIBC_2.2.5 () from /lib64/libpthread.so.0 #3 0x0000562ccf0a5614 in qemu_sem_wait (sem=sem@entry=0x562cd1b698e8) at util/qemu-thread-posix.c:319 #4 0x0000562ccecb4752 in multifd_send_sync_main (rs=<optimized out>) at /qemu/migration/ram.c:1099 #5 0x0000562ccecb95f4 in ram_save_iterate (f=0x562cd0ecc000, opaque=<optimized out>) at /qemu/migration/ram.c:3550 #6 0x0000562ccef43c23 in qemu_savevm_state_iterate (f=0x562cd0ecc000, postcopy=false) at migration/savevm.c:1189 #7 0x0000562ccef3dcf3 in migration_iteration_run (s=0x562cd09fabf0) at migration/migration.c:3131 #8 migration_thread (opaque=opaque@entry=0x562cd09fabf0) at migration/migration.c:3258 #9 0x0000562ccf0a4c26 in qemu_thread_start (args=<optimized out>) at util/qemu-thread-posix.c:502 #10 0x00007f4d669d9e25 in start_thread () from /lib64/libpthread.so.0 #11 0x00007f4d6670635d in clone () from /lib64/libc.so.6 (gdb) f 4 #4 0x0000562ccecb4752 in multifd_send_sync_main (rs=<optimized out>) at /qemu/migration/ram.c:1099 1099 qemu_sem_wait(&p->sem_sync); (gdb) list 1094 } 1095 for (i = 0; i < migrate_multifd_channels(); i++) { 1096 MultiFDSendParams *p = &multifd_send_state->params[i]; 1097 1098 trace_multifd_send_sync_main_wait(p->id); 1099 qemu_sem_wait(&p->sem_sync); 1100 } 1101 trace_multifd_send_sync_main(multifd_send_state->packet_num); 1102 } 1103 (gdb) p i $1 = 0 (gdb) p multifd_send_state->params[0].pending_job $2 = 2 //It means the job before MULTIFD_FLAG_SYNC has already fail (gdb) p multifd_send_state->params[0].quit $3 = true Signed-off-by: Ivan Ren <ivanren@tencent.com> Message-Id: <1567044996-2362-1-git-send-email-ivanren@tencent.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-09-12 10:53:33 +01:00
Paolo Bonzini	9458a9a1df	memory: fix race between TCG and accesses to dirty bitmap There is a race between TCG and accesses to the dirty log: vCPU thread reader thread ----------------------- ----------------------- TLB check -> slow path notdirty_mem_write write to RAM set dirty flag clear dirty flag TLB check -> fast path read memory write to RAM Fortunately, in order to fix it, no change is required to the vCPU thread. However, the reader thread must delay the read after the vCPU thread has finished the write. This can be approximated conservatively by run_on_cpu, which waits for the end of the current translation block. A similar technique is used by KVM, which has to do a synchronous TLB flush after doing a test-and-clear of the dirty-page flags. Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-08-20 17:26:20 +02:00
Juan Quintela	7dd59d01dd	migration: add some multifd traces Signed-off-by: Juan Quintela <quintela@redhat.com> Message-Id: <20190814020218.1868-6-quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Juan Quintela	18cdcea371	migration: Make global sem_sync semaphore by channel This makes easy to debug things because when you want for all threads to arrive at that semaphore, you know which one your are waiting for. Signed-off-by: Juan Quintela <quintela@redhat.com> Message-Id: <20190814020218.1868-3-quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Juan Quintela	5558c91ae8	migration: Add traces for multifd terminate threads Signed-off-by: Juan Quintela <quintela@redhat.com> Message-Id: <20190814020218.1868-2-quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Wei Yang	7a3e957177	migration: rename migration_bitmap_sync_range to ramblock_sync_dirty_bitmap Rename for better understanding of the code. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190808033155.30162-1-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Ivan Ren	81507f6b7e	migration: update ram_counters for multifd sync packet Multifd sync will send MULTIFD_FLAG_SYNC flag info to destination, add these bytes to ram_counters record. Signed-off-by: Ivan Ren <ivanren@tencent.com> Suggested-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <1564464816-21804-4-git-send-email-ivanren@tencent.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Ivan Ren	1b81c974cc	migration: add speed limit for multifd migration Limit the speed of multifd migration through common speed limitation qemu file. Signed-off-by: Ivan Ren <ivanren@tencent.com> Message-Id: <1564464816-21804-3-git-send-email-ivanren@tencent.com> Reviewed-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Wei Yang	9dec3cc3f4	migration/postcopy: use QEMU_IS_ALIGNED to replace host_offset Use QEMU_IS_ALIGNED for the check, it would be more consistent with other align calculations. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190806004648.8659-3-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Wei Yang	dad45ab2be	migration/postcopy: simplify calculation of run_start and fixup_start_addr The purpose of the calculation is to find a HostPage which is partially dirty. * fixup_start_addr points to the start of the HostPage to discard * run_start points to the next HostPage to check While in the middle stage, there would two cases for run_start: * aligned with HostPage means this is not partially dirty * not aligned means this is partially dirty When it is aligned, no work and calculation is necessary. run_start already points to the start of next HostPage and is ready to continue. When it is not aligned, the calculation could be simplified with: * fixup_start_addr = QEMU_ALIGN_DOWN(run_start, host_ratio) * run_start = QEMU_ALIGN_UP(run_start, host_ratio) By doing so, run_start always points to the next HostPage to check. fixup_start_addr always points to the HostPage to discard. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190806004648.8659-2-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Wei Yang	810cf2bbd4	migration/postcopy: make PostcopyDiscardState a static variable In postcopy-ram.c, we provide three functions to discard certain RAMBlock range: * postcopy_discard_send_init() * postcopy_discard_send_range() * postcopy_discard_send_finish() Currently, we allocate/deallocate PostcopyDiscardState for each RAMBlock on sending discard information to destination. This is not necessary and the same data area could be reused for each RAMBlock. This patch defines PostcopyDiscardState a static variable. By doing so: 1) avoid memory allocation and deallocation to the system 2) avoid potential failure of memory allocation 3) hide some details for their users Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190724010721.2146-1-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Wei Yang	10da4a3689	migration: extract ram_load_precopy After cleanup, it would be clear to audience there are two cases ram_load: * precopy * postcopy And it is not necessary to check postcopy_running on each iteration for precopy. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190725002023.2335-3-richardw.yang@linux.intel.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Wei Yang	be4a1a1b6f	migration: return -EINVAL directly when version_id mismatch It is not reasonable to continue when version_id mismatch. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190722075339.25121-2-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Wei Yang	5d0980a459	migration: just pass RAMBlock is enough RAMBlock->used_length is always passed to migration_bitmap_sync_range(), which could be retrieved from RAMBlock. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190718012547.16373-1-richardw.yang@linux.intel.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Wei Yang	8996604fe6	migration/postcopy: do_fixup is true when host_offset is non-zero This means it is not necessary to spare an extra variable to hold this condition. Use host_offset directly is fine. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190710050814.31344-3-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Wei Yang	e927a03317	migration/postcopy: reduce one operation to calculate fixup_start_addr Use the same way for run_end to calculate run_start, which saves one operation. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190710050814.31344-2-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Wei Yang	a162b572e9	migration/postcopy: discard_length must not be 0 Since we break the loop when there is no more page to discard, we are sure the following process would find some page to discard. It is not necessary to check it again. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190627020822.15485-4-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Wei Yang	33a5cb6202	migration/postcopy: break the loop when there is no more page to discard When one is equal or bigger then end, it means there is no page to discard. Just break the loop in this case instead of processing it. No functional change, just refactor it a little. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190627020822.15485-3-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Wei Yang	0abfff9ea7	migration/postcopy: the valid condition is one less then end If one equals end, it means we have gone through the whole bitmap. Use a more restrict check to skip a unnecessary condition. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190627020822.15485-2-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Ivan Ren	f193bc0c53	migration: fix migrate_cancel multifd migration leads destination hung forever When migrate_cancel a multifd migration, if run sequence like this: [source] [destination] multifd_send_sync_main[finish] multifd_recv_thread wait &p->sem_sync shutdown to_dst_file detect error from_src_file send RAM_SAVE_FLAG_EOS[fail] [no chance to run multifd_recv_sync_main] multifd_load_cleanup join multifd receive thread forever will lead destination qemu hung at following stack: pthread_join qemu_thread_join multifd_load_cleanup process_incoming_migration_co coroutine_trampoline Signed-off-by: Ivan Ren <ivanren@tencent.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <1561468699-9819-4-git-send-email-ivanren@tencent.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-07-24 14:47:21 +02:00
Juan Quintela	3c3ca25d1f	migration: Make explicit that we are quitting multifd We add a bool to indicate that. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-07-24 14:47:12 +02:00
Ivan Ren	a3ec6b7d23	migration: fix migrate_cancel leads live_migration thread hung forever When we 'migrate_cancel' a multifd migration, live_migration thread may hung forever at some points, because of multifd_send_thread has already exit for socket error: 1. multifd_send_pages may hung at qemu_sem_wait(&multifd_send_state-> channels_ready) 2. multifd_send_sync_main my hung at qemu_sem_wait(&multifd_send_state-> sem_sync) Signed-off-by: Ivan Ren <ivanren@tencent.com> Message-Id: <1561468699-9819-3-git-send-email-ivanren@tencent.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> --- Remove spurious not needed bits	2019-07-24 14:47:02 +02:00
Ivan Ren	713f762a31	migration: fix migrate_cancel leads live_migration thread endless loop When we 'migrate_cancel' a multifd migration, live_migration thread may go into endless loop in multifd_send_pages functions. Reproduce steps: (qemu) migrate_set_capability multifd on (qemu) migrate -d url (qemu) [wait a while] (qemu) migrate_cancel Then may get live_migration 100% cpu usage in following stack: pthread_mutex_lock qemu_mutex_lock_impl multifd_send_pages multifd_queue_page ram_save_multifd_page ram_save_target_page ram_save_host_page ram_find_and_save_block ram_find_and_save_block ram_save_iterate qemu_savevm_state_iterate migration_iteration_run migration_thread qemu_thread_start start_thread clone Signed-off-by: Ivan Ren <ivanren@tencent.com> Message-Id: <1561468699-9819-2-git-send-email-ivanren@tencent.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-07-24 14:46:51 +02:00
Ivan Ren	40c4d4a835	migration: always initial RAMBlock.bmap to 1 for new migration Reproduce the problem: migrate migrate_cancel migrate Error happen for memory migration The reason as follows: 1. qemu start, ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION] all set to 1 by a series of cpu_physical_memory_set_dirty_range 2. migration start:ram_init_bitmaps - memory_global_dirty_log_start: begin log diry - memory_global_dirty_log_sync: sync dirty bitmap to ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION] - migration_bitmap_sync_range: sync ram_list. dirty_memory[DIRTY_MEMORY_MIGRATION] to RAMBlock.bmap and ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION] is set to zero 3. migration data... 4. migrate_cancel, will stop log dirty 5. migration start:ram_init_bitmaps - memory_global_dirty_log_start: begin log diry - memory_global_dirty_log_sync: sync dirty bitmap to ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION] - migration_bitmap_sync_range: sync ram_list. dirty_memory[DIRTY_MEMORY_MIGRATION] to RAMBlock.bmap and ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION] is set to zero Here RAMBlock.bmap only have new logged dirty pages, don't contain the whole guest pages. Signed-off-by: Ivan Ren <ivanren@tencent.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Message-Id: <1563115879-2715-1-git-send-email-ivanren@tencent.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-07-15 15:47:47 +02:00
Wei Yang	89dab31b27	migration/postcopy: fix document of postcopy_send_discard_bm_ram() Commit `6b6712efcc` ('ram: Split dirty bitmap by RAMBlock') changes the parameter of postcopy_send_discard_bm_ram(), while left the document part untouched. This patch correct the document and fix two typo by hand. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20190715020549.15018-1-richardw.yang@linux.intel.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-07-15 15:45:22 +02:00
Peng Tao	b17fbbe55c	migration: allow private destination ram with x-ignore-shared By removing the share ram check, qemu is able to migrate to private destination ram when x-ignore-shared capability is on. Then we can create multiple destination VMs based on the same source VM. This changes the x-ignore-shared migration capability to work similar to Lai's original bypass-shared-memory work(https://lists.gnu.org/archive/html/qemu-devel/2018-04/msg00003.html) which enables kata containers (https://katacontainers.io) to implement the VM templating feature. An example usage in kata containers(https://katacontainers.io): 1. Start the source VM: qemu-system-x86 -m 2G \ -object memory-backend-file,id=mem0,size=2G,share=on,mem-path=/tmpfs/template-memory \ -numa node,memdev=mem0 2. Stop the template VM, set migration x-ignore-shared capability, migrate "exec:cat>/tmpfs/state", quit it 3. Start target VM: qemu-system-x86 -m 2G \ -object memory-backend-file,id=mem0,size=2G,share=off,mem-path=/tmpfs/template-memory \ -numa node,memdev=mem0 \ -incoming defer 4. connect to target VM qmp, set migration x-ignore-shared capability, migrate_incoming "exec:cat /tmpfs/state" 5. create more target VMs repeating 3 and 4 Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: Yury Kotov <yury-kotov@yandex-team.ru> Cc: Jiangshan Lai <laijs@hyper.sh> Cc: Xu Wang <xu@hyper.sh> Signed-off-by: Peng Tao <tao.peng@linux.alibaba.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <1560494113-1141-1-git-send-email-tao.peng@linux.alibaba.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-07-15 15:39:03 +02:00

1 2 3 4 5 ...

449 Commits