qemu-e2k

Author	SHA1	Message	Date
Joao Martins	f80929f3af	exec/ram_addr: Return number of dirty pages in cpu_physical_memory_set_dirty_lebitmap() In preparation for including the number of dirty pages in the vfio_get_dirty_bitmap() tracepoint, return the number of dirty pages in cpu_physical_memory_set_dirty_lebitmap() similar to cpu_physical_memory_sync_dirty_bitmap(). To avoid counting twice when GLOBAL_DIRTY_RATE is enabled, stash the number of bits set per bitmap quad in a variable (@nbits) and reuse it there. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20230530180556.24441-2-joao.m.martins@oracle.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>	2023-06-13 11:28:58 +02:00
Alexander Graf	4b870dc4d0	hostmem-file: add offset option Add an option for hostmem-file to start the memory object at an offset into the target file. This is useful if multiple memory objects reside inside the same target file, such as a device node. In particular, it's useful to map guest memory directly into /dev/mem for experimentation. To make this work consistently, also fix up all places in QEMU that expect fd offsets to be 0. Signed-off-by: Alexander Graf <graf@amazon.com> Message-Id: <20230403221421.60877-1-graf@amazon.com> Acked-by: Markus Armbruster <armbru@redhat.com> Acked-by: Peter Xu <peterx@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>	2023-05-23 16:47:03 +02:00
Peter Xu	cedb70eafb	migration: Use non-atomic ops for clear log bitmap Since we already have bitmap_mutex to protect either the dirty bitmap or the clear log bitmap, we don't need atomic operations to set/clear/test on the clear log bitmap. Switching all ops from atomic to non-atomic versions, meanwhile touch up the comments to show which lock is in charge. Introduced non-atomic version of bitmap_test_and_clear_atomic(), mostly the same as the atomic version but simplified a few places, e.g. dropped the "old_bits" variable, and also the explicit memory barriers. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2022-11-21 11:58:10 +01:00
Richard Henderson	65cd34e8c4	accel/tcg: Unify declarations of tb_invalidate_phys_range We missed this function when we introduced tb_page_addr_t. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2022-10-26 11:11:28 +10:00
Marc-André Lureau	8e3b0cbb72	Replace qemu_real_host_page variables with inlined functions Replace the global variables with inlined helper functions. getpagesize() is very likely annotated with a "const" function attribute (at least with glibc), and thus optimization should apply even better. This avoids the need for a constructor initialization too. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20220323155743.1585078-12-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-04-06 10:50:38 +02:00
Hyman Huang(黄勇)	4998a37e4b	memory: introduce total_dirty_pages to stat dirty pages introduce global var total_dirty_pages to stat dirty pages along with memory_global_dirty_log_sync. Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2021-11-01 22:56:44 +01:00
Hyman Huang(é»„å‹‡)	63b41db4bc	memory: make global_dirty_tracking a bitmask since dirty ring has been introduced, there are two methods to track dirty pages of vm. it seems that "logging" has a hint on the method, so rename the global_dirty_log to global_dirty_tracking would make description more accurate. dirty rate measurement may start or stop dirty tracking during calculation. this conflict with migration because stop dirty tracking make migration leave dirty pages out then that'll be a problem. make global_dirty_tracking a bitmask can let both migration and dirty rate measurement work fine. introduce GLOBAL_DIRTY_MIGRATION and GLOBAL_DIRTY_DIRTY_RATE to distinguish what current dirty tracking aims for, migration or dirty rate. Signed-off-by: Hyman Huang(é»„å‹‡) <huangy81@chinatelecom.cn> Message-Id: <9c9388657cfa0301bd2c1cfa36e7cf6da4aeca19.1624040308.git.huangy81@chinatelecom.cn> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2021-11-01 22:56:43 +01:00
David Hildenbrand	8dbe22c686	memory: Introduce RAM_NORESERVE and wire it up in qemu_ram_mmap() Let's introduce RAM_NORESERVE, allowing mmap'ing with MAP_NORESERVE. The new flag has the following semantics: " RAM is mmap-ed with MAP_NORESERVE. When set, reserving swap space (or huge pages if applicable) is skipped: will bail out if not supported. When not set, the OS will do the reservation, if supported for the memory type. " Allow passing it into: - memory_region_init_ram_nomigrate() - memory_region_init_resizeable_ram() - memory_region_init_ram_from_file() ... and teach qemu_ram_mmap() and qemu_anon_ram_alloc() about the flag. Bail out if the flag is not supported, which is the case right now for both, POSIX and win32. We will add Linux support next and allow specifying RAM_NORESERVE via memory backends. The target use case is virtio-mem, which dynamically exposes memory inside a large, sparse memory area to the VM. Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20210510114328.21835-9-david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-06-15 20:27:38 +02:00
David Hildenbrand	ebef62d0e5	softmmu/memory: Pass ram_flags to qemu_ram_alloc() and qemu_ram_alloc_internal() Let's pass ram_flags to qemu_ram_alloc() and qemu_ram_alloc_internal(), preparing for passing additional flags. Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20210510114328.21835-7-david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-06-15 20:27:38 +02:00
David Hildenbrand	d5015b8013	softmmu/memory: Pass ram_flags to qemu_ram_alloc_from_fd() Let's pass in ram flags just like we do with qemu_ram_alloc_from_file(), to clean up and prepare for more flags. Simplify the documentation of passed ram flags: Looking at our documentation of RAM_SHARED and RAM_PMEM is sufficient, no need to be repetitive. Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20210510114328.21835-5-david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-06-15 20:27:38 +02:00
Jagannathan Raman	44a4ff31c0	memory: alloc RAM from file at offset Allow RAM MemoryRegion to be created from an offset in a file, instead of allocating at offset of 0 by default. This is needed to synchronize RAM between QEMU & remote process. Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Signed-off-by: John G Johnson <john.g.johnson@oracle.com> Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 609996697ad8617e3b01df38accc5c208c24d74e.1611938319.git.jag.raman@oracle.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2021-02-09 20:53:56 +00:00
Stefan Hajnoczi	369d6dc4de	memory: add readonly support to memory_region_init_ram_from_file() There is currently no way to open(O_RDONLY) and mmap(PROT_READ) when creating a memory region from a file. This functionality is needed since the underlying host file may not allow writing. Add a bool readonly argument to memory_region_init_ram_from_file() and the APIs it calls. Extend memory_region_init_ram_from_file() rather than introducing a memory_region_init_rom_from_file() API so that callers can easily make a choice between read/write and read-only at runtime without calling different APIs. No new RAMBlock flag is introduced for read-only because it's unclear whether RAMBlocks need to know that they are read-only. Pass a bool readonly argument instead. Both of these design decisions can be changed in the future. It just seemed like the simplest approach to me. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20210104171320.575838-2-stefanha@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2021-02-01 17:07:34 -05:00
Stefan Hajnoczi	d73415a315	qemu/atomic.h: rename atomic_ to qatomic_ clang's C11 atomic_fetch_() functions only take a C11 atomic type pointer argument. QEMU uses direct types (int, etc) and this causes a compiler error when a QEMU code calls these functions in a source file that also included <stdatomic.h> via a system header file: $ CC=clang CXX=clang++ ./configure ... && make ../util/async.c:79:17: error: address argument to atomic operation must be a pointer to _Atomic type ('unsigned int ' invalid) Avoid using atomic_*() names in QEMU's atomic.h since that namespace is used by <stdatomic.h>. Prefix QEMU's APIs with 'q' so that atomic.h and <stdatomic.h> can co-exist. I checked /usr/include on my machine and searched GitHub for existing "qatomic_" users but there seem to be none. This patch was generated using: $ git grep -h -o '\<atomic$64$\?_[a-z0-9_]\+' include/qemu/atomic.h \| \ sort -u >/tmp/changed_identifiers $ for identifier in $(</tmp/changed_identifiers); do sed -i "s%\<$identifier\>%q$identifier%g" \ $(git grep -I -l "\<$identifier\>") done I manually fixed line-wrap issues and misaligned rST tables. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200923105646.47864-1-stefanha@redhat.com>	2020-09-23 16:07:44 +01:00
Keqian Zhu	fb6135807f	migration: Count new_dirty instead of real_dirty real_dirty_pages becomes equal to total ram size after dirty log sync in ram_init_bitmaps, the reason is that the bitmap of ramblock is initialized to be all set, so old path counts them as "real dirty" at beginning. This causes wrong dirty rate and false positive throttling. Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com> Message-Id: <20200622032037.31112-1-zhukeqian1@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-07-03 16:23:05 +01:00
Philippe Mathieu-Daudé	da278d58a0	accel: Move Xen accelerator code under accel/xen/ This code is not related to hardware emulation. Move it under accel/ with the other hypervisors. Reviewed-by: Paul Durrant <paul@xen.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200508100222.7112-1-philmd@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-06-10 12:09:56 -04:00
Philippe Mathieu-Daudé	ab7e41e667	exec: Rename qemu_ram_writeback() as qemu_ram_msync() Rename qemu_ram_writeback() as qemu_ram_msync() to better match what it does. Suggested-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 20200508062456.23344-5-philmd@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2020-06-05 09:54:48 +01:00
Juan Quintela	41aa4e9fd8	ram_addr: Split RAMBlock definition We need some of the fields without having to poison everything else. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-01-29 11:28:59 +01:00
Beata Michalska	61c490e25e	Memory: Enable writeback for given memory region Add an option to trigger memory writeback to sync given memory region with the corresponding backing store, case one is available. This extends the support for persistent memory, allowing syncing on-demand. Signed-off-by: Beata Michalska <beata.michalska@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20191121000843.24844-3-beata.michalska@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-12-16 10:46:35 +00:00
Wei Yang	038adc2f58	core: replace getpagesize() with qemu_real_host_page_size There are three page size in qemu: real host page size host page size target page size All of them have dedicate variable to represent. For the last two, we use the same form in the whole qemu project, while for the first one we use two forms: qemu_real_host_page_size and getpagesize(). qemu_real_host_page_size is defined to be a replacement of getpagesize(), so let it serve the role. [Note] Not fully tested for some arch or device. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20191013021145.16011-3-richardw.yang@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-10-26 15:38:06 +02:00
Dr. David Alan Gilbert	694ea274d9	rcu: Use automatic rc_read unlock in core memory/exec code Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20191007143642.301445-6-dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-10-11 14:20:02 +01:00
Wei Yang	1e7cf8c323	migration/postcopy: unsentmap is not necessary for postcopy Commit `f3f491fcd6` ('Postcopy: Maintain unsentmap') introduced unsentmap to track not yet sent pages. This is not necessary since: * unsentmap is a sub-set of bmap before postcopy start * unsentmap is the summation of bmap and unsentmap after canonicalizing This patch just removes it. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190819061843.28642-3-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-09-25 15:51:19 +01:00
Markus Armbruster	ec150c7e09	include: Make headers more self-contained Back in 2016, we discussed[1] rules for headers, and these were generally liked: 1. Have a carefully curated header that's included everywhere first. We got that already thanks to Peter: osdep.h. 2. Headers should normally include everything they need beyond osdep.h. If exceptions are needed for some reason, they must be documented in the header. If all that's needed from a header is typedefs, put those into qemu/typedefs.h instead of including the header. 3. Cyclic inclusion is forbidden. This patch gets include/ closer to obeying 2. It's actually extracted from my "[RFC] Baby steps towards saner headers" series[2], which demonstrates a possible path towards checking 2 automatically. It passes the RFC test there. [1] Message-ID: <87h9g8j57d.fsf@blackfin.pond.sub.org> https://lists.nongnu.org/archive/html/qemu-devel/2016-03/msg03345.html [2] Message-Id: <20190711122827.18970-1-armbru@redhat.com> https://lists.nongnu.org/archive/html/qemu-devel/2019-07/msg02715.html Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-Id: <20190812052359.30071-2-armbru@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>	2019-08-16 13:31:51 +02:00
Peter Xu	002cad6b16	migration: Split log_clear() into smaller chunks Currently we are doing log_clear() right after log_sync() which mostly keeps the old behavior when log_clear() was still part of log_sync(). This patch tries to further optimize the migration log_clear() code path to split huge log_clear()s into smaller chunks. We do this by spliting the whole guest memory region into memory chunks, whose size is decided by MigrationState.clear_bitmap_shift (an example will be given below). With that, we don't do the dirty bitmap clear operation on the remote node (e.g., KVM) when we fetch the dirty bitmap, instead we explicitly clear the dirty bitmap for the memory chunk for each of the first time we send a page in that chunk. Here comes an example. Assuming the guest has 64G memory, then before this patch the KVM ioctl KVM_CLEAR_DIRTY_LOG will be a single one covering 64G memory. If after the patch, let's assume when the clear bitmap shift is 18, then the memory chunk size on x86_64 will be 1UL<<18 * 4K = 1GB. Then instead of sending a big 64G ioctl, we'll send 64 small ioctls, each of the ioctl will cover 1G of the guest memory. For each of the 64 small ioctls, we'll only send if any of the page in that small chunk was going to be sent right away. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190603065056.25211-12-peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-07-15 15:39:03 +02:00
Peter Xu	077874e01f	memory: Introduce memory listener hook log_clear() Introduce a new memory region listener hook log_clear() to allow the listeners to hook onto the points where the dirty bitmap is cleared by the bitmap users. Previously log_sync() contains two operations: - dirty bitmap collection, and, - dirty bitmap clear on remote site. Let's take KVM as example - log_sync() for KVM will first copy the kernel dirty bitmap to userspace, and at the same time we'll clear the dirty bitmap there along with re-protecting all the guest pages again. We add this new log_clear() interface only to split the old log_sync() into two separated procedures: - use log_sync() to collect the collection only, and, - use log_clear() to clear the remote dirty bitmap. With the new interface, the memory listener users will still be able to decide how to implement the log synchronization procedure, e.g., they can still only provide log_sync() method only and put all the two procedures within log_sync() (that's how the old KVM works before KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 is introduced). However with this new interface the memory listener users will start to have a chance to postpone the log clear operation explicitly if the module supports. That can really benefit users like KVM at least for host kernels that support KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2. There are three places that can clear dirty bits in any one of the dirty bitmap in the ram_list.dirty_memory[3] array: cpu_physical_memory_snapshot_and_clear_dirty cpu_physical_memory_test_and_clear_dirty cpu_physical_memory_sync_dirty_bitmap Currently we hook directly into each of the functions to notify about the log_clear(). Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20190603065056.25211-7-peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-07-15 15:39:02 +02:00
Peter Xu	5dea4079ad	memory: Pass mr into snapshot_and_clear_dirty Also we change the 2nd parameter of it to be the relative offset within the memory region. This is to be used in follow up patches. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20190603065056.25211-6-peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-07-15 15:39:02 +02:00
Peter Xu	ae7a2bca8a	memory: Don't set migration bitmap when without migration Similar to `9460dee4b2` ("memory: do not touch code dirty bitmap unless TCG is enabled", 2015-06-05) but for the migration bitmap - we can skip the MIGRATION bitmap update if migration not enabled. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20190603065056.25211-4-peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-07-15 15:39:02 +02:00
Peter Xu	267691b65c	migration: No need to take rcu during sync_dirty_bitmap cpu_physical_memory_sync_dirty_bitmap() has one RAMBlock* as parameter, which means that it must be with RCU read lock held already. Taking it again inside seems redundant. Removing it. Instead comment on the functions about the RCU read lock. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20190603065056.25211-2-peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-07-15 15:39:02 +02:00
Markus Armbruster	14a48c1d0d	qemu-common: Move tcg_enabled() etc. to sysemu/tcg.h Other accelerators have their own headers: sysemu/hax.h, sysemu/hvf.h, sysemu/kvm.h, sysemu/whpx.h. Only tcg_enabled() & friends sit in qemu-common.h. This necessitates inclusion of qemu-common.h into headers, which is against the rules spelled out in qemu-common.h's file comment. Move tcg_enabled() & friends into their own header sysemu/tcg.h, and adjust #include directives. Cc: Richard Henderson <rth@twiddle.net> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190523143508.25387-2-armbru@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> [Rebased with conflicts resolved automatically, except for accel/tcg/tcg-all.c]	2019-06-11 20:22:09 +02:00
David Hildenbrand	905b7ee4d6	exec: Introduce qemu_maxrampagesize() and rename qemu_getrampagesize() Rename qemu_getrampagesize() to qemu_minrampagesize(). While at it, properly rename find_max_supported_pagesize() to find_min_backend_pagesize(). s390x is actually interested into the maximum ram pagesize, so introduce and use qemu_maxrampagesize(). Add a TODO, indicating that looking at any mapped memory backends is not 100% correct in some cases. Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20190417113143.5551-3-david@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>	2019-04-25 13:47:27 +02:00
Zhang Chen	13af18f222	COLO: Load dirty pages into SVM's RAM cache firstly We should not load PVM's state directly into SVM, because there maybe some errors happen when SVM is receving data, which will break SVM. We need to ensure receving all data before load the state into SVM. We use an extra memory to cache these data (PVM's ram). The ram cache in secondary side is initially the same as SVM/PVM's memory. And in the process of checkpoint, we cache the dirty pages of PVM into this ram cache firstly, so this ram cache always the same as PVM's memory at every checkpoint, then we flush this cached ram to SVM after we receive all PVM's state. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2018-10-19 11:15:03 +08:00
Junyan He	a4de8552b2	hostmem-file: add the 'pmem' option When QEMU emulates vNVDIMM labels and migrates vNVDIMM devices, it needs to know whether the backend storage is a real persistent memory, in order to decide whether special operations should be performed to ensure the data persistence. This boolean option 'pmem' allows users to specify whether the backend storage of memory-backend-file is a real persistent memory. If 'pmem=on', QEMU will set the flag RAM_PMEM in the RAM block of the corresponding memory region. If 'pmem' is set while lack of libpmem support, a error is generated. Signed-off-by: Junyan He <junyan.he@intel.com> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2018-08-10 13:29:39 +03:00
Junyan He	cbfc017103	memory, exec: switch file ram allocation functions to 'flags' parameters As more flag parameters besides the existing 'share' are going to be added to following functions memory_region_init_ram_from_file qemu_ram_alloc_from_fd qemu_ram_alloc_from_file let's switch them to use the 'flags' parameters so as to ease future flag additions. The existing 'share' flag is converted to the RAM_SHARED bit in ram_flags, and other flag bits are ignored by above functions right now. Signed-off-by: Junyan He <junyan.he@intel.com> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>	2018-08-10 13:29:39 +03:00
Paolo Bonzini	8bca9a03ec	move public invalidate APIs out of translate-all.{c,h}, clean up Place them in exec.c, exec-all.h and ram_addr.h. This removes knowledge of translate-all.h (which is an internal header) from several files outside accel/tcg and removes knowledge of AddressSpace from translate-all.c (as it only operates on ram_addr_t). Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-06-28 19:05:30 +02:00
David Hildenbrand	c136180c90	postcopy: drop ram_pages parameter from postcopy_ram_incoming_init() Not needed. Don't expose last_ram_page(). Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20180620202736.21399-1-david@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2018-06-27 13:28:31 +02:00
Marcel Apfelbaum	06329ccecf	mem: add share parameter to memory-backend-ram Currently only file backed memory backend can be created with a "share" flag in order to allow sharing guest RAM with other processes in the host. Add the "share" flag also to RAM Memory Backend in order to allow remapping parts of the guest RAM to different host virtual addresses. This is needed by the RDMA devices in order to remap non-contiguous QEMU virtual addresses to a contiguous virtual address range. Moved the "share" flag to the Host Memory base class, modified phys_mem_alloc to include the new parameter and a new interface memory_region_init_ram_shared_nomigrate. There are no functional changes if the new flag is not used. Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>	2018-02-19 13:03:24 +02:00
Dr. David Alan Gilbert	aa777e297c	cpu_physical_memory_sync_dirty_bitmap: Another alignment fix This code has an optimised, word aligned version, and a boring unaligned version. My commit `f70d345` fixed one alignment issue, but there's another. The optimised version operates on 'longs' dealing with (typically) 64 pages at a time, replacing the whole long by a 0 and counting the bits. If the Ramblock is less than 64bits in length that long can contain bits representing two different RAMBlocks, but the code will update the bmap belinging to the 1st RAMBlock only while having updated the total dirty page count for both. This probably didn't matter prior to `6b6712ef` which split the dirty bitmap by RAMBlock, but now they're separate RAMBlocks we end up with a count that doesn't match the state in the bitmaps. Symptom: Migration showing a few dirty pages left to be sent constantly Seen on aarch64 and x86 with x86+ovmf Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reported-by: Wei Huang <wei@redhat.com> Fixes: `6b6712efcc` Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-01-16 14:54:52 +01:00
Alexey Perevalov	f949461489	migration: add bitmap for received page This patch adds ability to track down already received pages, it's necessary for calculation vCPU block time in postcopy migration feature, and for recovery after postcopy migration failure. Also it's necessary to solve shared memory issue in postcopy livemigration. Information about received pages will be transferred to the software virtual bridge (e.g. OVS-VSWITCHD), to avoid fallocate (unmap) for already received pages. fallocate syscall is required for remmaped shared memory, due to remmaping itself blocks ioctl(UFFDIO_COPY, ioctl in this case will end with EEXIT error (struct page is exists after remmap). Bitmap is placed into RAMBlock as another postcopy/precopy related bitmaps. Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2017-10-23 18:03:41 +02:00
Dr. David Alan Gilbert	f70d3451fe	cpu_physical_memory_sync_dirty_bitmap: Fix alignment check This code has an optimised, word aligned version, and a boring unaligned version. Recently `084140bd49` fixed a missing offset addition from the core of both versions. However, the offset isn't necessarily aligned and thus the choice between the two versions needs fixing up to also include the offset. Symptom: A few stuck unsent pages during migration; not normally noticed unless under very low bandwidth in which case the migration may get stuck never ending and never performing a 2nd sync; noticed by a hanging postcopy-test on a very heavily loaded system. Fixes: `084140bd49` Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reported-by: Alex BenneÃ© <alex.benee@linaro.org> Tested-by: Alex BenneÃ© <alex.benee@linaro.org> -- v2 Move 'page' inside the if (Comment from Paolo) Message-Id: <20170724165125.29887-1-dgilbert@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-08-01 17:27:33 +02:00
Haozhong Zhang	084140bd49	exec: fix access to ram_list.dirty_memory when sync dirty bitmap In cpu_physical_memory_sync_dirty_bitmap(rb, start, ...), the 2nd argument 'start' is relative to the start of the ramblock 'rb'. When it's used to access the dirty memory bitmap of ram_list (i.e. ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION]->blocks[]), an offset to the start of all RAM (i.e. rb->offset) should be added to it, which has however been missed since c/s `6b6712efcc`. For a ramblock of host memory backend whose offset is not zero, cpu_physical_memory_sync_dirty_bitmap() synchronizes the incorrect part of the dirty memory bitmap of ram_list to the per ramblock dirty bitmap. As a result, a guest with host memory backend may crash after migration. Fix it by adding the offset of ramblock when accessing the dirty memory bitmap of ram_list in cpu_physical_memory_sync_dirty_bitmap(). Reported-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> Message-Id: <20170628083704.24997-1-haozhong.zhang@intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Tested-by: Juan Quintela <quintela@redhat.com> Tested-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2017-06-28 12:23:58 +02:00
Marc-André Lureau	38b3362dd1	exec: split qemu_ram_alloc_from_file() Add qemu_ram_alloc_from_fd(), which can be use to allocate ramblock from fd only. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20170602141229.15326-4-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-06-15 11:04:04 +02:00
Juan Quintela	6b6712efcc	ram: Split dirty bitmap by RAMBlock Both the ram bitmap and the unsent bitmap are split by RAMBlock. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Reviewed-by: Peter Xu <peterx@redhat.com> -- Fix compilation when DEBUG_POSTCOPY is enabled (thanks Hailiang)	2017-05-04 10:00:38 +02:00
Peter Maydell	52e94ea5de	Xen 2017/04/21 + fix -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJY/5EdAAoJEIlPj0hw4a6QHj8P/1D3iZq8vKyLvnkioYC00ao8 dhuCFV1Dk0dl/QXvDXxNAHFqmHp2PzHeHF2V19bTbUh9QnY3WH9aSsMQ7YVPDzLb 6ZezbxXd1l8+IOrWh+ch+x6jiUpRAeHGjhSpFxQEmH5THBmPtLHBFhAeeGpVq6CT MPFlN0mr1snyMMbK2aKCVTHgh5Wip83/t/kijIq4RmPwXdJbwxx55PL8jxC/NZHq /NbAZzAT149fmItfBNnsRTfNoDs7fSmgM9VelYsnJNHs6qkYYfewxys4vudePG8n ZcjCXMv9vdCSTfXfYY9nxhfGCgkkRSvBWrzARRIUPxTXGAmEU52LJrccpG9AEFRZ Kgz1JYr0y+u/g+RsCJvAglvmciawXgZH8GIR4sFl2iv4u6cx8PxNRgjTdt7YX4Je V7+scmOLB0U5wkI7PUcPV1v2fwKJHFfMdyik257otfMCJiTCnWGJfCZGR4JAdy3C NHuG4eIj2XLjZ7IQIo3mg+EfdCWdfVMKtXj0MAFsP6Kcr6sY/ef5sx/wB61k3NU1 Y4ctI/LAOONsjlC2yzJ4aZR4LgyFcVcwx8FKOK7niCcCgVLYGWpyiHU3kFKYlmKM FKIlGPPOK8WuRV9NUGDI9A8XENyFXGyiR2xGCotfgHj+FSSqRzhKH4ecfgNimJVY wjTzZmBa68pLHdXaJ+SV =OfeB -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/sstabellini/tags/xen-20170421-v2-tag' into staging Xen 2017/04/21 + fix # gpg: Signature made Tue 25 Apr 2017 19:10:37 BST # gpg: using RSA key 0x894F8F4870E1AE90 # gpg: Good signature from "Stefano Stabellini <stefano.stabellini@eu.citrix.com>" # gpg: aka "Stefano Stabellini <sstabellini@kernel.org>" # Primary key fingerprint: D04E 33AB A51F 67BA 07D3 0AEA 894F 8F48 70E1 AE90 * remotes/sstabellini/tags/xen-20170421-v2-tag: (21 commits) move xen-mapcache.c to hw/i386/xen/ move xen-hvm.c to hw/i386/xen/ move xen-common.c to hw/xen/ add xen-9p-backend to MAINTAINERS under Xen xen/9pfs: build and register Xen 9pfs backend xen/9pfs: send responses back to the frontend xen/9pfs: implement in/out_iov_from_pdu and vmarshal/vunmarshal xen/9pfs: receive requests from the frontend xen/9pfs: connect to the frontend xen/9pfs: introduce Xen 9pfs backend 9p: introduce a type for the 9p header xen: import ring.h from xen configure: use pkg-config for obtaining xen version xen: additionally restrict xenforeignmemory operations xen: use libxendevice model to restrict operations xen: use 5 digit xen versions xen: use libxendevicemodel when available configure: detect presence of libxendevicemodel xen: create wrappers for all other uses of xc_hvm_XXX() functions xen: rename xen_modified_memory() to xen_hvm_modified_memory() ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-04-26 10:22:31 +01:00
Gerd Hoffmann	8deaf12ca1	memory: add support getting and using a dirty bitmap copy. This patch adds support for getting and using a local copy of the dirty bitmap. memory_region_snapshot_and_clear_dirty() will create a snapshot of the dirty bitmap for the specified range, clear the dirty bitmap and return the copy. The returned bitmap can be a bit larger than requested, the range is expanded so the code can copy unsigned longs from the bitmap and avoid atomic bit update operations. memory_region_snapshot_get_dirty() will return the dirty status of pages, pretty much like memory_region_get_dirty(), but using the copy returned by memory_region_copy_and_clear_dirty(). Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Message-id: 20170421091632.30900-3-kraxel@redhat.com Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>	2017-04-24 10:12:28 +02:00
Juan Quintela	66103a5796	ram: Remove migration_bitmap_extend() We have disabled memory hotplug, so we don't need to handle migration_bitamp there. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>	2017-04-21 12:25:40 +02:00
Juan Quintela	b8c4899398	ram: rename last_ram_offset() last_ram_pages() We always use it as pages anyways. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2017-04-21 12:25:40 +02:00
Juan Quintela	15440dd5a0	ram: Pass RAMBlock to bitmap_sync We change the meaning of start to be the offset from the beggining of the block. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2017-04-21 12:25:39 +02:00
Juan Quintela	68908ed665	ram: Change num_dirty_pages_period type to uint64_t Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com>	2017-04-21 12:25:36 +02:00
Paul Durrant	5100afb5f5	xen: rename xen_modified_memory() to xen_hvm_modified_memory() This patch is a purely cosmetic change that avoids a name collision in a subsequent patch. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Anthony Perard <anthony.perard@citrix.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>	2017-03-22 11:47:39 -07:00
Chao Fan	1ffb5dfd35	Change the method to calculate dirty-pages-rate In function cpu_physical_memory_sync_dirty_bitmap, file include/exec/ram_addr.h: if (src[idx][offset]) { unsigned long bits = atomic_xchg(&src[idx][offset], 0); unsigned long new_dirty; new_dirty = ~dest[k]; dest[k] \|= bits; new_dirty &= bits; num_dirty += ctpopl(new_dirty); } After these codes executed, only the pages not dirtied in bitmap(dest), but dirtied in dirty_memory[DIRTY_MEMORY_MIGRATION] will be calculated. For example: When ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION] = 0b00001111, and atomic_rcu_read(&migration_bitmap_rcu)->bmap = 0b00000011, the new_dirty will be 0b00001100, and this function will return 2 but not 4 which is expected. the dirty pages in dirty_memory[DIRTY_MEMORY_MIGRATION] are all new, so these should be calculated also. Signed-off-by: Chao Fan <fanc.fnst@cn.fujitsu.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2017-03-16 08:55:56 +01:00
Alexey Kardashevskiy	9c60766887	exec, kvm, target-ppc: Move getrampagesize() to common code getrampagesize() returns the largest supported page size and mainly used to know if huge pages are enabled. However is implemented in target-ppc/kvm.c and not available in TCG or other architectures. This renames and moves gethugepagesize() to mmap-alloc.c where fd-based analog of it is already implemented. This renames and moves getrampagesize() to exec.c as it seems to be the common place for helpers like this. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-03-03 11:30:59 +11:00

1 2

91 Commits