When option -mem-prealloc is used with one or more memory-backend
objects, created backends may not obey configured bind policy or
creation may fail after kernel attempts to move pages according
to bind policy.
Reason is in file_ram_alloc(), which will pre-allocate
any descriptor based RAM if global mem_prealloc != 0 and that
happens way before bind policy is applied to memory range.
One way to fix it would be to extend memory_region_foo() API
and add more invariants that could broken later due implicit
dependencies that's hard to track.
Another approach is to drop adhoc main RAM allocation and
consolidate it around memory-backend. That allows to have
single place that allocates guest RAM (main and memdev)
in the same way and then global mem_prealloc could be
replaced by backend's property[s] that will affect created
memory-backend objects but only in correct order this time.
With main RAM now converted to hostmem backends, there is no
point in keeping global mem_prealloc around, so alias
-mem-prealloc to "memory-backend.prealloc=on"
machine compat[*] property and make mem_prealloc a local
variable to only stir registration of compat property.
*) currently user accessible -global works only with DEVICE
based objects and extra work is needed to make it work
with hostmem backends. But that is convenience option
and out of scope of this already huge refactoring.
Hence machine compat properties were used.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20200219160953.13771-78-imammedo@redhat.com>
Allow machine to opt in for hostmem backend based initial RAM
even if user uses old -mem-path/prealloc options by providing
MachineClass::default_ram_id
Follow up patches will incrementally convert machines to new API,
by dropping memory_region_allocate_system_memory() and setting
default_ram_id that board used to use before conversion to keep
migration stream the same.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20200219160953.13771-4-imammedo@redhat.com>
Neither stat(2) nor lseek(2) report the size of Linux devdax pmem
character device nodes. Commit 314aec4a6e
("hostmem-file: reject invalid pmem file sizes") added code to
hostmem-file.c to fetch the size from sysfs and compare against the
user-provided size=NUM parameter:
if (backend->size > size) {
error_setg(errp, "size property %" PRIu64 " is larger than "
"pmem file \"%s\" size %" PRIu64, backend->size,
fb->mem_path, size);
return;
}
It turns out that exec.c:qemu_ram_alloc_from_fd() already has an
equivalent size check but it skips devdax pmem character devices because
lseek(2) returns 0:
if (file_size > 0 && file_size < size) {
error_setg(errp, "backing store %s size 0x%" PRIx64
" does not match 'size' option 0x" RAM_ADDR_FMT,
mem_path, file_size, size);
return NULL;
}
This patch moves the devdax pmem file size code into get_file_size() so
that we check the memory size in a single place:
qemu_ram_alloc_from_fd(). This simplifies the code and makes it more
general.
This also fixes the problem that hostmem-file only checks the devdax
pmem file size when the pmem=on parameter is given. An unchecked
size=NUM parameter can lead to SIGBUS in QEMU so we must always fetch
the file size for Linux devdax pmem character device nodes.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20190830093056.12572-1-stefanha@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 314aec4a6e ("hostmem-file: reject
invalid pmem file sizes") added a file size check that verifies the
hostmem object's size parameter against the actual devdax pmem file.
This is useful because getting the size wrong results in confusing
errors inside the guest.
However, the code doesn't work properly for files where struct
stat::st_size is zero. Hostmem-file's ->alloc() function returns early
without setting an Error, causing the following assertion failure:
qemu/memory.c:2215: memory_region_get_ram_ptr: Assertion `mr->ram_block' failed.
This patch handles the case where qemu_get_pmem_size() returns 0 but
there is no error.
Fixes: 314aec4a6e
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20190823135632.25010-1-stefanha@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Guests started with NVDIMMs larger than the underlying host file produce
confusing errors inside the guest. This happens because the guest
accesses pages beyond the end of the file.
Check the pmem file size on startup and print a clear error message if
the size is invalid.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1669053
Cc: Wei Yang <richardw.yang@linux.intel.com>
Cc: Zhang Yi <yi.z.zhang@linux.intel.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20190214031004.32522-3-stefanha@redhat.com>
Reviewed-by: Wei Yang <richardw.yang@linux.intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Pankaj Gupta <pagupta@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
cleanup file_backend_memory_alloc() by using one CONFIG_POSIX ifdef
instead of several ones within the function to make it simpler to follow.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Suggested-by: Wei Yang <richardw.yang@linux.intel.com>
Reviewed-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <20190213123858.24620-1-imammedo@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20190214031004.32522-2-stefanha@redhat.com>
[lv: s/hostmem/hostmem-file/]
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
When there are multiple memory backends in use, including the object type
and property name in the error message can help users to locate the error.
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Signed-off-by: Zhang Yi <yi.z.zhang@linux.intel.com>
Message-Id: <97d9193875747d8378c05b9e3b3cb39c1b7d2b4e.1546399191.git.yi.z.zhang@linux.intel.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
[ehabkost: reword commit message]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
hostmem-file and hostmem-memfd use the whole object path for the
memory region name, and hostname-ram uses only the path component (the
object id, or canonical path basename):
qemu -m 1024 -object memory-backend-file,id=mem,size=1G,mem-path=/tmp/foo -numa node,memdev=mem -monitor stdio
(qemu) info ramblock
Block Name PSize Offset Used Total
/objects/mem 4 KiB 0x0000000000000000 0x0000000040000000 0x0000000040000000
qemu -m 1024 -object memory-backend-memfd,id=mem,size=1G -numa node,memdev=mem -monitor stdio
(qemu) info ramblock
Block Name PSize Offset Used Total
/objects/mem 4 KiB 0x0000000000000000 0x0000000040000000 0x0000000040000000
qemu -m 1024 -object memory-backend-ram,id=mem,size=1G -numa node,memdev=mem -monitor stdio
(qemu) info ramblock
Block Name PSize Offset Used Total
mem 4 KiB 0x0000000000000000 0x0000000040000000 0x0000000040000000
For consistency, change to use object id for -file and -memfd as well
with >= 4.0.
Having a consistent naming allows to migrate to different hostmem
backends.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
We will never get the canonical path from the object
before object_property_add_child.
Signed-off-by: Zhang Yi <yi.z.zhang@linux.intel.com>
Message-Id: <a6491f996827f4039c1a52198ed5dcc7727cb0f9.1540389255.git.yi.z.zhang@linux.intel.com>
[ehabkost: reword commit message]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
memfd_backend_memory_alloc/file_backend_memory_alloc both needlessly
are are calling host_memory_backend_mr_inited() which creates an
illusion that alloc could be called multiple times but it isn't, it's
called once from UserCreatable complete().
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
object_get_canonical_path_component() returns a string which
must be freed using g_free().
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Zhang Yi <yi.z.zhang@linux.intel.com>
Message-Id: <7328fb16c394eaf5d65437d11c2a9343647b6d3d.1535471899.git.yi.z.zhang@linux.intel.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Before this change, memory-backend-file object is valid for Linux hosts
only because hostmem-file.c is compiled only on Linux hosts.
However, other POSIX-based hosts (such as macOS) can support
memory-backend-file object in the same way as on Linux hosts.
This patch makes hostmem-file.c and related functions to be compiled on
all POSIX-based hosts to make available memory-backend-file on them.
Signed-off-by: Hikaru Nishida <hikarupsp@gmail.com>
Message-Id: <20180924123205.29651-1-hikarupsp@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When QEMU emulates vNVDIMM labels and migrates vNVDIMM devices, it
needs to know whether the backend storage is a real persistent memory,
in order to decide whether special operations should be performed to
ensure the data persistence.
This boolean option 'pmem' allows users to specify whether the backend
storage of memory-backend-file is a real persistent memory. If
'pmem=on', QEMU will set the flag RAM_PMEM in the RAM block of the
corresponding memory region. If 'pmem' is set while lack of libpmem
support, a error is generated.
Signed-off-by: Junyan He <junyan.he@intel.com>
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
As more flag parameters besides the existing 'share' are going to be
added to following functions
memory_region_init_ram_from_file
qemu_ram_alloc_from_fd
qemu_ram_alloc_from_file
let's switch them to use the 'flags' parameters so as to ease future
flag additions.
The existing 'share' flag is converted to the RAM_SHARED bit in ram_flags,
and other flag bits are ignored by above functions right now.
Signed-off-by: Junyan He <junyan.he@intel.com>
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Currently only file backed memory backend can
be created with a "share" flag in order to allow
sharing guest RAM with other processes in the host.
Add the "share" flag also to RAM Memory Backend
in order to allow remapping parts of the guest RAM
to different host virtual addresses. This is needed
by the RDMA devices in order to remap non-contiguous
QEMU virtual addresses to a contiguous virtual address range.
Moved the "share" flag to the Host Memory base class,
modified phys_mem_alloc to include the new parameter
and a new interface memory_region_init_ram_shared_nomigrate.
There are no functional changes if the new flag is not used.
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
When mmap(2) the backend files, QEMU uses the host page size
(getpagesize(2)) by default as the alignment of mapping address.
However, some backends may require alignments different than the page
size. For example, mmap a device DAX (e.g., /dev/dax0.0) on Linux
kernel 4.13 to an address, which is 4K-aligned but not 2M-aligned,
fails with a kernel message like
[617494.969768] dax dax0.0: qemu-system-x86: dax_mmap: fail, unaligned vma (0x7fa37c579000 - 0x7fa43c579000, 0x1fffff)
Because there is no common approach to get such alignment requirement,
we add the 'align' option to 'memory-backend-file', so that users or
management utils, which have enough knowledge about the backend, can
specify a proper alignment via this option.
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Message-Id: <20171211072806.2812-2-haozhong.zhang@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[ehabkost: fixed typo, fixed error_setg() format string]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The new option can be used to indicate that the file contents can
be destroyed and don't need to be flushed to disk when QEMU exits
or when the memory backend object is removed.
Internally, it will trigger a madvise(MADV_REMOVE) call when the
memory backend is removed.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170824192315.5897-4-ehabkost@redhat.com>
[ehabkost: fixup: improved documentation]
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Tested-by: Zack Cornelius <zack.cornelius@kove.net>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Use the new interface to boost readability.
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1489151370-15453-3-git-send-email-peterx@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
To do the conversion, the file_backend_class_init() was moved
after the getter/setter functions. The old
file_backend_instance_init() function was removed because it is
not needed anymore.
The NULL errp arguments on the property registration calls were
changed to &error_abort.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Commit 57cb38b included qapi/error.h into qemu/osdep.h to get the
Error typedef. Since then, we've moved to include qemu/osdep.h
everywhere. Its file comment explains: "To avoid getting into
possible circular include dependencies, this file should not include
any other QEMU headers, with the exceptions of config-host.h,
compiler.h, os-posix.h and os-win32.h, all of which are doing a
similar job to this file and are under similar constraints."
qapi/error.h doesn't do a similar job, and it doesn't adhere to
similar constraints: it includes qapi-types.h. That's in excess of
100KiB of crap most .c files don't actually need.
Add the typedef to qemu/typedefs.h, and include that instead of
qapi/error.h. Include qapi/error.h in .c files that need it and don't
get it now. Include qapi-types.h in qom/object.h for uint16List.
Update scripts/clean-includes accordingly. Update it further to match
reality: replace config.h by config-target.h, add sysemu/os-posix.h,
sysemu/os-win32.h. Update the list of includes in the qemu/osdep.h
comment quoted above similarly.
This reduces the number of objects depending on qapi/error.h from "all
of them" to less than a third. Unfortunately, the number depending on
qapi-types.h shrinks only a little. More work is needed for that one.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
[Fix compilation without the spice devel packages. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.
This commit was created with scripts/clean-includes.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1454089805-5470-5-git-send-email-peter.maydell@linaro.org
The free() and g_free() functions both happily accept
NULL on any platform QEMU builds on. As such putting a
conditional 'if (foo)' check before calls to 'free(foo)'
merely serves to bloat the lines of code.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The subtle difference between "property not found" and "property not
set" is already confusing enough.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
A new "share" property can be used with the "memory-file" backend to
map memory with MAP_SHARED instead of MAP_PRIVATE.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
And allow preallocation of file-based memory even without -mem-prealloc.
Some care is necessary because -mem-prealloc does not allow disabling
preallocation for hostmem-file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
MST: comment tweak