Commit Graph

94 Commits

Author SHA1 Message Date
David Hildenbrand
e3213eb5ec memory-device: rewrite address assignment using ranges
Let's rewrite it properly using ranges. This fixes certain overflows that
are right now possible. E.g.

qemu-system-x86_64 -m 4G,slots=20,maxmem=40G -M pc \
    -object memory-backend-file,id=mem1,share,mem-path=/dev/zero,size=2G
    -device pc-dimm,memdev=mem1,id=dimm1,addr=-0x40000000

Now properly errors out instead of succeeding. (Note that qapi
parsing of huge uint64_t values is broken and fixes are on the way)

"can't add memory device [0xffffffffa0000000:0x80000000], usable range for
memory devices [0x140000000:0xe00000000]"

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20181214131043.25071-3-david@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-01-09 22:09:31 -02:00
David Hildenbrand
5e6aa26723 memory-device: avoid overflows on very huge devices
Should not be a problem right now, but it could theoretically happen
in the future.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20181023152306.3123-7-david@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-12-11 15:45:22 -02:00
David Hildenbrand
3e18dbbb13 memory-device: use QEMU_IS_ALIGNED
Shorter and easier to read.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20181023152306.3123-6-david@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-12-11 15:45:22 -02:00
Marc-André Lureau
640713d8a1 nvdimm: set non-volatile on the memory region
qemu-system-x86_64 -machine pc,nvdimm -m 2G,slots=4,maxmem=16G -enable-kvm -monitor stdio -object memory-backend-file,id=mem1,share=on,mem-path=/tmp/foo,size=1G -device nvdimm,id=nvdimm1,memdev=mem1

HMP info mtree command reflects the flag with "nv-" prefix on memory type:

(qemu) info mtree
0000000100000000-000000013fffffff (prio 0, nv-i/o): alias nvdimm-memory @/objects/mem1 0000000000000000-000000003fffffff

(qemu) info mtree -f
0000000100000000-000000013fffffff (prio 0, nv-ram): /objects/mem1

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20181003114454.5662-3-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-11-06 21:35:05 +01:00
David Hildenbrand
005feccf62 memory-device: trace when pre_plugging/plugging/unplugging
Let's trace the address and the id of a memory device when
pre_plugging/plugging/unplugging succeeded.

Trace it when pre_plugging as well as when plugging, so we really know
when a specific address is actually used.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20181005092024.14344-17-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-10-24 06:44:59 -03:00
David Hildenbrand
8288590d23 memory-device: complete factoring out unplug handling
With the new memory device functions in place, we can factor out
unplugging of memory devices completely.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20181005092024.14344-16-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-10-24 06:44:59 -03:00
David Hildenbrand
55d67a0492 memory-device: complete factoring out plug handling
With the new memory device functions in place, we can factor out
plugging of memory devices completely.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20181005092024.14344-15-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-10-24 06:44:59 -03:00
David Hildenbrand
6ef2c0f2c1 memory-device: complete factoring out pre_plug handling
With all required memory device class functions in place, we can factor
out pre_plug handling of memory devices. Take proper care of errors. We
still have to carry along legacy_align required for pc compatibility
handling.

We will factor out tracing of the address separately in a follow-up
patch.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20181005092024.14344-14-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-10-24 06:44:59 -03:00
David Hildenbrand
c331d3e136 memory-device: add device class function set_addr()
To be able to factor out address assignment of memory devices, we will
have to read (get_addr()) and write (set_addr()) the address.

We can't use properties for this purpose, as properties are device
specific. E.g. while the address property for a DIMM is called "addr", it
might be called differently (e.g. "memaddr") for other devices.

Especially virtio based memory devices cannot use "addr" as that is already
reserved and used for the address on the bus (for the proxy device).

Also, it might be possible to have memory devices without address
properties (e.g. internal DIMM-like thingies).

In contrast to get_addr(), we expect that set_addr() can fail.

Keep it simple for now for pc-dimm and simply set the static property, that
will fail once realized.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20181005092024.14344-13-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-10-24 06:44:59 -03:00
David Hildenbrand
af39002747 memory-device: drop get_region_size()
There are no remaining users of get_region_size() except
memory_device_get_region_size() itself. We can make
memory_device_get_region_size() work directly on get_memory_region()
instead and drop get_region_size().

In addition, we can now use memory_device_get_region_size() in pc-dimm
code to implement get_plugged_size()"

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20181005092024.14344-12-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-10-24 06:44:59 -03:00
David Hildenbrand
3a0a2b0a2b memory-device: factor out get_memory_region() from pc-dimm
The memory region is necessary for plugging/unplugging a memory device.
The region size (via get_region_size()) is no longer sufficient, as
besides the alignment, also the region itself is required in order to
add it to the device memory region of the machine via
- memory_region_add_subregion
- memory_region_del_subregion

So, to factor out plugging/unplugging of memory devices from pc-dimm
code, we have to factor out access to the memory region first.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20181005092024.14344-11-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-10-24 06:44:59 -03:00
David Hildenbrand
946d6154ab memory-device: add and use memory_device_get_region_size()
We will factor out get_memory_region() from pc-dimm to memory device code
soon. Once that is done, get_region_size() can be implemented
generically and essentially be replaced by
memory_device_get_region_size (and work only on get_memory_region()).

We have some users of get_memory_region() (spapr and pc-dimm code) that are
only interested in the size. So let's rework them to use
memory_device_get_region_size() first, then we can factor out
get_memory_region() and eventually remove get_region_size() without
touching the same code multiple times.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20181005092024.14344-10-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-10-24 06:44:59 -03:00
David Hildenbrand
e40c5b6b3f memory-device: forward errors in get_region_size()/get_plugged_size()
Let's properly forward the errors, so errors from get_region_size() /
get_plugged_size() can be handled.

Users right now call both functions after the device has been realized,
which is will never fail, so it is fine to continue using error_abort.

While at it, remove a leftover error check (suggested by Igor).

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20181005092024.14344-8-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-10-24 06:44:59 -03:00
David Hildenbrand
15cea5ae81 memory-device: introduce separate config option
Some architectures might support memory devices, while they don't
support DIMM/NVDIMM. So let's
- Rename CONFIG_MEM_HOTPLUG to CONFIG_MEM_DEVICE
- Introduce CONFIG_DIMM and use it similarly to CONFIG NVDIMM

CONFIG_DIMM and CONFIG_NVDIMM require CONFIG_MEM_DEVICE.

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20181005092024.14344-7-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-10-24 06:44:59 -03:00
David Hildenbrand
26b1d1fd64 memory-device: use memory device terminology in error messages
While we rephrased most error messages, we missed these.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20181005092024.14344-6-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-10-24 06:44:59 -03:00
David Hildenbrand
fd3416f5eb pc-dimm: pass PCDIMMDevice to pc_dimm_.*plug
We're plugging/unplugging a PCDIMMDevice, so directly pass this type
instead of a more generic DeviceState.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20181005092024.14344-5-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-10-24 06:44:59 -03:00
David Hildenbrand
f99d84b1fc memory-device: improve "range conflicts" error message
Handle id==NULL better and indicate that we are dealing with memory
devices.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20181005092024.14344-4-david@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-10-24 06:44:59 -03:00
David Hildenbrand
ac1b337588 memory-device: fix error message when hinted address is too small
The "at" should actually be a "before".
    if (new_addr < address_space_start)
     -> "can't add memory ... before... $address_space_start"

So it looks similar to the other check
    } else if ((new_addr + size) > address_space_end)
     -> "can't add memory ... beyond..."

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20181005092024.14344-3-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-10-24 06:44:59 -03:00
David Hildenbrand
7c63ba2055 memory-device: fix alignment error message
We're missing "x" after the leading 0.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20181005092024.14344-2-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-10-24 06:44:59 -03:00
David Hildenbrand
b0e624435b pc-dimm: assign and verify the "addr" property during pre_plug
We can assign and verify the address before realizing and trying to plug.
reading/writing the address property should never fail for DIMMs, so let's
reduce error handling a bit by using &error_abort. Getting access to the
memory region now might however fail. So forward errors from
get_memory_region() properly.

As all memory devices should use the alignment of the underlying memory
region for guest physical address asignment, do detection of the
alignment in pc_dimm_pre_plug(), but allow pc.c to overwrite the
alignment for compatibility handling.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180801133444.11269-5-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-08-23 18:46:25 +02:00
David Hildenbrand
8f1ffe5be8 pc-dimm: assign and verify the "slot" property during pre_plug
We can assign and verify the slot before realizing and trying to plug.
reading/writing the slot property should never fail, so let's reduce
error handling a bit by using &error_abort.

To do this during pre_plug, add and use (x86, ppc) pc_dimm_pre_plug().

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180801133444.11269-2-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-08-23 18:46:25 +02:00
Junyan He
faf8a13d80 mem/nvdimm: ensure write persistence to PMEM in label emulation
Guest writes to vNVDIMM labels are intercepted and performed on the
backend by QEMU. When the backend is a real persistent memort, QEMU
needs to take proper operations to ensure its write persistence on the
persistent memory. Otherwise, a host power failure may result in the
loss of guest label configurations.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2018-08-10 13:29:39 +03:00
David Hildenbrand
f0b7bca64d pc-dimm: get_memory_region() will not fail after realize
Let's try to reduce error handling a bit. In the plug/unplug case, the
device was realized and therefore we can assume that getting access to
the memory region will not fail.

For get_vmstate_memory_region() this is already handled that way.
Document both cases.

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180619134141.29478-13-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-28 19:05:34 +02:00
David Hildenbrand
a4659a8ef4 nvdimm: make get_memory_region() perform checks and initialization
We might get a call to get_memory_region() before the device has been
realized. We should return a consistent value, as the return value
will e.g. later on be used in the pre_plug handler.

To avoid duplicating too much code, factor the initialization and checks
out into a helper function.

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180619134141.29478-12-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-28 19:05:34 +02:00
David Hildenbrand
eb7fd4d0f6 nvdimm: convert nvdimm_mr into a pointer
This way we can easily check if the region has already been inititalized
without having to rely on the size of an uninitialized region being 0.

Free the region in nvdimm_finalize() and not in unrealize() as we will
allow to create the region before realization in following patches.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180619134141.29478-11-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-28 19:05:34 +02:00
David Hildenbrand
5d10a0e12b nvdimm: convert "unarmed" into a static property
We don't allow to modify it after realization. So we can simply turn
it into a static property.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180619134141.29478-10-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-28 19:05:33 +02:00
David Hildenbrand
a57d191122 pc-dimm: merge get_(vmstate_)memory_region()
Importantly, get_vmstate_memory_region() should also fail with a proper
error if called before the device is realized. For a PCDIMM, both functions
are to return the same thing, so share the implementation.

All current users are called after the device has been realized, so we
can expect the calls to succeed.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180619134141.29478-9-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-28 19:05:33 +02:00
David Hildenbrand
7943e97b85 hostmem: drop error variable from host_memory_backend_get_memory()
Unused, so let's remove it.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180619134141.29478-8-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-28 19:05:33 +02:00
David Hildenbrand
4ab56d04ed nvdimm: no need to overwrite get_vmstate_memory_region()
Our parent class (PC_DIMM) provides exactly the same function.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180619134141.29478-7-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-28 19:05:33 +02:00
David Hildenbrand
9995c75951 pc-dimm: remove pc_dimm_get_free_slot() from header
Not used outside of pc-dimm.c and there shouldn't be other users. If
other devices (e.g. memory devices) ever have to also use slots, then we
will have to factor this out.

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180619134141.29478-5-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-28 19:05:33 +02:00
David Hildenbrand
284878ee98 pc-dimm: rename pc_dimm_memory_* to pc_dimm_*
Let's rename it to make it look more consistent.

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180619134141.29478-4-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-28 19:05:33 +02:00
David Hildenbrand
1e695fd7c3 pc-dimm: remove leftover "struct pc_dimms_capacity"
Not needed anymore, let's drop it.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180619134141.29478-2-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-28 19:05:32 +02:00
David Hildenbrand
4d8938a05d memory-device: turn alignment assert into check
The start of the address space indicates which maximum alignment is
supported by our machine (e.g. ppc, x86 1GB). This is helpful to
catch fragmenting guest physical memory in strange fashions.

Right now we can crash QEMU by e.g. (there might be easier examples)

qemu-system-x86_64 -m 256M,maxmem=20G,slots=2 \
 -object memory-backend-file,id=mem0,size=8192M,mem-path=/dev/zero,align=8192M \
 -device pc-dimm,id=dimm1,memdev=mem0

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180607154705.6316-2-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-28 19:05:31 +02:00
Ross Zwisler
1a97a478e6 nvdimm: fix typo in label-size definition
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Fixes: commit da6789c27c ("nvdimm: add a macro for property "label-size"")
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Cc: Haozhong Zhang <haozhong.zhang@intel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-05-23 17:02:03 +03:00
David Hildenbrand
3ff333effa pc-dimm: fix error messages if no slots were defined
If no slots were defined we try to allocate an empty bitmap, which
fails.

Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20180427120515.24067-1-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-05-11 14:33:40 +02:00
David Hildenbrand
18d11dc910 pc-dimm: move actual plug/unplug of a memory region to MemoryDevice
Registering the memory region for migration has do be done by the owner.
There could be cases, where we don't want to migrate the memory.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180423165126.15441-8-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-05-07 10:00:02 -03:00
David Hildenbrand
1b6d6af21b pc-dimm: factor out capacity and slot checks into MemoryDevice
Move the checks into memory_device_get_free_addr(). This will check
before doing any calculations if we have KVM/vhost slots left and if
the total region size would be exceeded.

Of course, while at it, make it independent of pc-dimm code.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180423165126.15441-7-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-05-07 10:00:02 -03:00
David Hildenbrand
bb0831bdf4 pc-dimm: factor out address search into MemoryDevice code
This mainly moves code, but does a handfull of optimizations:
- We pass the machine instead of the address space properties
- We check the hinted address directly and handle fragmented memory
  better
- We make the search independent of pc-dimm

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180423165126.15441-6-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-05-07 10:00:02 -03:00
David Hildenbrand
bd6c3e4a49 pc-dimm: pass in the machine and to the MemoryHotplugState
We use the machine internally either way, so let's just pass it in then.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180423165126.15441-5-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-05-07 10:00:02 -03:00
David Hildenbrand
acc7fa17e6 pc-dimm: no need to pass the memory region
We can just query it ourselves. When unplugging, we should always be
able to the region (as it was previously plugged). E.g. PPC already
assumed that and used &error_abort.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180423165126.15441-4-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-05-07 10:00:02 -03:00
David Hildenbrand
2cc0e2e814 pc-dimm: factor out MemoryDevice interface
On the qmp level, we already have the concept of memory devices:
    "query-memory-devices"
Right now, we only support NVDIMM and PCDIMM.

We want to map other devices later into the address space of the guest.
Such device could e.g. be virtio devices. These devices will have a
guest memory range assigned but won't be exposed via e.g. ACPI. We want
to make them look like memory device, but not glued to pc-dimm.

Especially, it will not always be possible to have TYPE_PC_DIMM as a parent
class (e.g. virtio devices). Let's use an interface instead. As a first
part, convert handling of
- qmp_pc_dimm_device_list
- get_plugged_memory_size
to our new model. plug/unplug stuff etc. will follow later.

A memory device will have to provide the following functions:
- get_addr(): Necessary, as the property "addr" can e.g. not be used for
              virtio devices (already defined).
- get_plugged_size(): The amount this device offers to the guest as of
                      now.
- get_region_size(): Because this can later on be bigger than the
                     plugged size.
- fill_device_info(): Fill MemoryDeviceInfo, e.g. for qmp.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180423165126.15441-2-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-05-07 10:00:02 -03:00
Haozhong Zhang
6388e18de9 qmp: distinguish PC-DIMM and NVDIMM in MemoryDeviceInfoList
It may need to treat PC-DIMM and NVDIMM differently, e.g., when
deciding the necessity of non-volatile flag bit in SRAT memory
affinity structures.

A new field 'nvdimm' is added to the union type MemoryDeviceInfo for
such purpose. Its type is currently PCDIMMDeviceInfo and will be
updated when necessary in the future.

It also fixes "info memory-devices"/query-memory-devices which
currently show nvdimm devices as dimm devices since
object_dynamic_cast(obj, TYPE_PC_DIMM) happily cast nvdimm to
TYPE_PC_DIMM which it's been inherited from.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 03:34:52 +02:00
Haozhong Zhang
52c95cae4e pc-dimm: make qmp_pc_dimm_device_list() sort devices by address
Make qmp_pc_dimm_device_list() return sorted by start address
list of devices so that it could be reused in places that
would need sorted list*. Reuse existing pc_dimm_built_list()
to get sorted list.

While at it hide recursive callbacks from callers, so that:

  qmp_pc_dimm_device_list(qdev_get_machine(), &list);

could be replaced with simpler:

  list = qmp_pc_dimm_device_list();

* follow up patch will use it in build_srat()

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au> for ppc part
Reviewed-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 03:34:52 +02:00
Markus Armbruster
9af2398977 Include less of the generated modular QAPI headers
In my "build everything" tree, a change to the types in
qapi-schema.json triggers a recompile of about 4800 out of 5100
objects.

The previous commit split up qmp-commands.h, qmp-event.h, qmp-visit.h,
qapi-types.h.  Each of these headers still includes all its shards.
Reduce compile time by including just the shards we actually need.

To illustrate the benefits: adding a type to qapi/migration.json now
recompiles some 2300 instead of 4800 objects.  The next commit will
improve it further.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180211093607.27351-24-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
[eblake: rebase to master]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-02 13:45:50 -06:00
Haozhong Zhang
cb836434cd nvdimm: add 'unarmed' option
Currently the only vNVDIMM backend can guarantee the guest write
persistence is device DAX on Linux, because no host-side kernel cache
is involved in the guest access to it. The approach to detect whether
the backend is device DAX needs to access sysfs, which may not work
with SELinux.

Instead, we add the 'unarmed' option to device 'nvdimm', so that users
or management utils, which have enough knowledge about the backend,
can control the unarmed flag in guest ACPI NFIT via this option. The
guest Linux NVDIMM driver, for example, will mark the corresponding
vNVDIMM device read-only if the unarmed flag in guest NFIT is set.

The default value of 'unarmed' option is 'off' in order to keep the
backwards compatibility.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Message-Id: <20171211072806.2812-4-haozhong.zhang@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-01-19 11:18:51 -02:00
Haozhong Zhang
da6789c27c nvdimm: add a macro for property "label-size"
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20171211072806.2812-3-haozhong.zhang@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-01-19 11:18:51 -02:00
Igor Mammedov
f47bd1c839 spapr: replace numa_get_node() with lookup in pc-dimm list
SPAPR is the last user of numa_get_node() and a bunch of
supporting code to maintain numa_info[x].addr list.

Get LMB node id from pc-dimm list, which allows to
remove ~80LOC maintaining dynamic address range
lookup list.

It also removes pc-dimm dependency on numa_[un]set_mem_node_id()
and makes pc-dimms a sole source of information about which
node it belongs to and removes duplicate data from global
numa_info.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-12-15 09:49:24 +11:00
Vadim Galitsyn
9aa3397f19 qmp: introduce query-memory-size-summary command
Add a new query-memory-size-summary command which provides the
following memory information in bytes:

  * base-memory - size of "base" memory specified with command line option -m.

  * plugged-memory - amount of memory that was hot-plugged.
    If target does not have CONFIG_MEM_HOTPLUG enabled, no
    value is reported.

Signed-off-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
Signed-off-by: Mohammed Gamal <mohammed.gamal@profitbricks.com>
Signed-off-by: Eduardo Otubo <eduardo.otubo@profitbricks.com>
Signed-off-by: Vadim Galitsyn <vadim.galitsyn@profitbricks.com>
Reviewed-by: Eugene Crosser <evgenii.cherkashin@profitbricks.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: qemu-devel@nongnu.org
Message-Id: <20170829153022.27004-3-vadim.galitsyn@profitbricks.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
  Fixup comments as per Igor's review
  Added 'of' from Vadim's reply
2017-09-14 15:52:10 +01:00
Thomas Huth
0479097859 hw/ppc/spapr: Fix segfault when instantiating a 'pc-dimm' without 'memdev'
QEMU currently crashes when trying to use a 'pc-dimm' on the pseries
machine without specifying its 'memdev' property. This happens because
pc_dimm_get_memory_region() does not check whether the 'memdev' property
has properly been set by the user. Looking closer at this function, it's
also obvious that it is using &error_abort to call another function - and
this is bad in a function that is used in the hot-plugging calling chain
since this can also cause QEMU to exit unexpectedly.

So let's fix these issues in a proper way now: Add a "Error **errp"
parameter to pc_dimm_get_memory_region() which we use in case the 'memdev'
property has not been set by the user, and which we can use instead of
the &error_abort, and change the callers of get_memory_region() to make
use of this "errp" parameter for proper error checking.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-08-22 21:26:46 +10:00
Philippe Mathieu-Daudé
87e0331c5a docs: fix broken paths to docs/devel/tracing.txt
With the move of some docs/ to docs/devel/ on ac06724a71,
no references were updated.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-07-31 13:12:53 +03:00