If a memory-backend is configured with mode
HOST_MEM_POLICY_PREFERRED then
host_memory_backend_memory_complete() calls mbind() as:
mbind(..., MPOL_PREFERRED, nodemask, ...);
Here, 'nodemask' is a bitmap of host NUMA nodes and corresponds
to the .host-nodes attribute. Therefore, there can be multiple
nodes specified. However, the documentation to MPOL_PREFERRED
says:
MPOL_PREFERRED
This mode sets the preferred node for allocation. ...
If nodemask specifies more than one node ID, the first node
in the mask will be selected as the preferred node.
Therefore, only the first node is honored and the rest is
silently ignored. Well, with recent changes to the kernel and
numactl we can do better.
The Linux kernel added in v5.15 via commit cfcaa66f8032
("mm/hugetlb: add support for mempolicy MPOL_PREFERRED_MANY")
support for MPOL_PREFERRED_MANY, which accepts multiple preferred
NUMA nodes instead.
Then, numa_has_preferred_many() API was introduced to numactl
(v2.0.15~26) allowing applications to query kernel support.
Wiring this all together, we can pass MPOL_PREFERRED_MANY to the
mbind() call instead and stop ignoring multiple nodes, silently.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Message-Id: <a0b4adce1af5bd2344c2218eb4a04b3ff7bcfdb4.1671097918.git.mprivozn@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
The has_FOO for pointer-valued FOO are redundant, except for arrays.
They are also a nuisance to work with. Recent commit "qapi: Start to
elide redundant has_FOO in generated C" provided the means to elide
them step by step. This is the step for qapi/tpm.json.
Said commit explains the transformation in more detail. The invariant
violations mentioned there do not occur here.
Cc: Stefan Berger <stefanb@linux.vnet.ibm.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Message-Id: <20221104160712.3005652-26-armbru@redhat.com>
Commit 02b61f38d3 ("hw/virtio: incorporate backend features in features")
properly negotiates VHOST_USER_F_PROTOCOL_FEATURES with the vhost-user
backend, but we forgot to enable vrings as specified in
docs/interop/vhost-user.rst:
If ``VHOST_USER_F_PROTOCOL_FEATURES`` has not been negotiated, the
ring starts directly in the enabled state.
If ``VHOST_USER_F_PROTOCOL_FEATURES`` has been negotiated, the ring is
initialized in a disabled state and is enabled by
``VHOST_USER_SET_VRING_ENABLE`` with parameter 1.
Some vhost-user front-ends already did this by calling
vhost_ops.vhost_set_vring_enable() directly:
- backends/cryptodev-vhost.c
- hw/net/virtio-net.c
- hw/virtio/vhost-user-gpio.c
But most didn't do that, so we would leave the vrings disabled and some
backends would not work. We observed this issue with the rust version of
virtiofsd [1], which uses the event loop [2] provided by the
vhost-user-backend crate where requests are not processed if vring is
not enabled.
Let's fix this issue by enabling the vrings in vhost_dev_start() for
vhost-user front-ends that don't already do this directly. Same thing
also in vhost_dev_stop() where we disable vrings.
[1] https://gitlab.com/virtio-fs/virtiofsd
[2] https://github.com/rust-vmm/vhost/blob/240fc2966/crates/vhost-user-backend/src/event_loop.rs#L217
Fixes: 02b61f38d3 ("hw/virtio: incorporate backend features in features")
Reported-by: German Maglione <gmaglione@redhat.com>
Tested-by: German Maglione <gmaglione@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <20221123131630.52020-1-sgarzare@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20221130112439.2527228-3-alex.bennee@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
cryptodev: Added a new type of backend named lkcf-backend for
cryptodev. This backend upload asymmetric keys to linux kernel,
and let kernel do the accelerations if possible.
The lkcf stands for Linux Kernel Cryptography Framework.
Signed-off-by: lei he <helei.sig11@bytedance.com>
Message-Id: <20221008085030.70212-5-helei.sig11@bytedance.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
virtio-crypto: Modify the current interface of virtio-crypto
device to support asynchronous mode.
Signed-off-by: lei he <helei.sig11@bytedance.com>
Message-Id: <20221008085030.70212-2-helei.sig11@bytedance.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Let's allow for specifying a thread context via the "prealloc-context"
property. When set, preallcoation threads will be crated via the
thread context -- inheriting the same CPU affinity as the thread
context.
Pinning preallcoation threads to CPUs can heavily increase performance
in NUMA setups, because, preallocation from a CPU close to the target
NUMA node(s) is faster then preallocation from a CPU further remote,
simply because of memory bandwidth for initializing memory with zeroes.
This is especially relevant for very large VMs backed by huge/gigantic
pages, whereby preallocation is mandatory.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Message-Id: <20221014134720.168738-7-david@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
... and implement it under POSIX. When a ThreadContext is provided,
create new threads via the context such that these new threads obtain a
properly configured CPU affinity.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Message-Id: <20221014134720.168738-6-david@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Let's
* give the function a "qemu_*" style name
* make sure the parameters in the implementation match the prototype
* rename smp_cpus to max_threads, which makes the semantics of that
parameter clearer
... and add a function documentation.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Message-Id: <20221014134720.168738-2-david@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
When you try to set virtio-rng property "filename" after the backend
has been completed with user_creatable_complete(), the error message
blames "insufficient permission":
$ qemu-system-x86_64 -S -display none -nodefaults -monitor stdio -object rng-random,id=rng0 -device virtio-rng,id=vrng0,rng=rng0
QEMU 7.1.50 monitor - type 'help' for more information
(qemu) qom-set /objects/rng0 filename /dev/random
Error: Insufficient permission to perform this operation
This implies it could work with "sufficient permission". It can't.
Change the error message to:
Error: Property 'filename' can no longer be set
Same for cryptodev-vhost-user property "chardev", rng-egd property
"chardev", and vhost-user-backend property "chardev".
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20221012153801.2604340-3-armbru@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
[Commit message tidied up]
Swtpm may release the lock once the last one of its state blobs has been
migrated out. In case of VM migration failure QEMU now needs to notify
swtpm that it should again take the lock, which it can otherwise only do
once it has received the first TPM command from the VM.
Only try to send the lock command if swtpm supports it. It will not have
released the lock (and support shared storage setups) if it doesn't
support the locking command since the functionality of releasing the lock
upon state blob reception and the lock command were added to swtpm
'together'.
If QEMU sends the lock command and the storage has already been locked
no error is reported.
If swtpm does not receive the lock command (from older version of QEMU),
it will lock the storage once the first TPM command has been received. So
sending the lock command is an optimization.
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20220912174741.1542330-3-stefanb@linux.ibm.com
Use the latest tpm_ioctl.h from upstream swtpm project.
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20220912174741.1542330-2-stefanb@linux.ibm.com
When resuming after a migration, the backend sends CMD_INIT to the
emulator from the startup callback, then it sends the migration state
from the vmstate to the emulator, then it sends CMD_INIT again. Skip the
first CMD_INIT during a migration to avoid initializing the TPM twice.
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Tested-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
* Two preparation patches for the upcoming removal of the slirp submodule
* Some other small test fixes (typos, etc.)
-----BEGIN PGP SIGNATURE-----
iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmMOVvcRHHRodXRoQHJl
ZGhhdC5jb20ACgkQLtnXdP5wLbUcOA//d4MU0VWbMRXHLLTvaIO+0I1JhiMP5PbU
qgDwGfIu7uY58hXvjDCSmoV5Lj35I/VdsmWYcC4dwQcIr9BwZq3H+jthB4sxMDIJ
UAnowmO22x5iTZr4PBY3GuYKRRUaf7EuqqOwmNAtrvDV+3BVn2sQFLtjWhqnyhqR
syonfyVhlFhqnFXPs6fXTXQxiuziuMmmHGSQMNRGuBudkivvOTQzElb3gxTp7pRe
FfIoAUVohUXptd26U+5Zr2KPxQQ/eZ2Elhnhjc6/r4u4JpbyfCQrGTFAMSuvq4HM
z/kKr/JA0v6vmX5ARjbCL0RhoNOM/DcOooxzX6YO3VkZTrQAHZxAsk25mihURRX3
UgGLDlagNuPSTl1fkUuumH86fFQ54bFBFFOV3yJWQF5UDuWKoy3bPlSf5L0/bwRp
z5gYnf0lJxMG3kGgmaOnW4gj0Z0amn9AzI33BQDIldVNTHnp8/hNpscrsq5Voi2j
ot1G/aZt9OH+DeqAB8TJfbsHE8mtTgioihZ2QQOMAKVkF25UImFjNWliX8SUHG2h
E3ro9QLugV2FgIggJwRyN9w394hEn7BR8DMyiPCRemcjnT4Fuy9IoEBEkJ2gj3n4
QiDPdrr/1dw8uApGBts3YyRbSmajqKUegXCuOYXjpU90f4Kno0WN2/jkTx8pvfcE
bJvG21nzrdY=
=MCyJ
-----END PGP SIGNATURE-----
Merge tag 'testing-pull-request-2022-08-30' of https://gitlab.com/thuth/qemu into staging
* First batch of patches to get qtests adapted for Windows
* Two preparation patches for the upcoming removal of the slirp submodule
* Some other small test fixes (typos, etc.)
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmMOVvcRHHRodXRoQHJl
# ZGhhdC5jb20ACgkQLtnXdP5wLbUcOA//d4MU0VWbMRXHLLTvaIO+0I1JhiMP5PbU
# qgDwGfIu7uY58hXvjDCSmoV5Lj35I/VdsmWYcC4dwQcIr9BwZq3H+jthB4sxMDIJ
# UAnowmO22x5iTZr4PBY3GuYKRRUaf7EuqqOwmNAtrvDV+3BVn2sQFLtjWhqnyhqR
# syonfyVhlFhqnFXPs6fXTXQxiuziuMmmHGSQMNRGuBudkivvOTQzElb3gxTp7pRe
# FfIoAUVohUXptd26U+5Zr2KPxQQ/eZ2Elhnhjc6/r4u4JpbyfCQrGTFAMSuvq4HM
# z/kKr/JA0v6vmX5ARjbCL0RhoNOM/DcOooxzX6YO3VkZTrQAHZxAsk25mihURRX3
# UgGLDlagNuPSTl1fkUuumH86fFQ54bFBFFOV3yJWQF5UDuWKoy3bPlSf5L0/bwRp
# z5gYnf0lJxMG3kGgmaOnW4gj0Z0amn9AzI33BQDIldVNTHnp8/hNpscrsq5Voi2j
# ot1G/aZt9OH+DeqAB8TJfbsHE8mtTgioihZ2QQOMAKVkF25UImFjNWliX8SUHG2h
# E3ro9QLugV2FgIggJwRyN9w394hEn7BR8DMyiPCRemcjnT4Fuy9IoEBEkJ2gj3n4
# QiDPdrr/1dw8uApGBts3YyRbSmajqKUegXCuOYXjpU90f4Kno0WN2/jkTx8pvfcE
# bJvG21nzrdY=
# =MCyJ
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 30 Aug 2022 14:29:11 EDT
# gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg: issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg: aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg: aka "Thomas Huth <huth@tuxfamily.org>" [full]
# gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5
* tag 'testing-pull-request-2022-08-30' of https://gitlab.com/thuth/qemu: (23 commits)
tests/avocado/migration: Get find_free_port() from the ports
tests/qtest/ac97-test: Correct reference to driver
gitlab-ci: Only use one process in Windows jobs for compilation
docs/devel/testing: fix minor typo
tests/avocado: Fix trivial typo
tests/avocado: Do not run tests that require libslirp if it is not available
tests/vm: Add libslirp to the VM tests
tests/qtest: prom-env-test: Use double quotes to pass the prom-env option
tests/qtest: npcm7xx_emc-test: Skip running test_{tx, rx} on win32
tests/qtest: machine-none-test: Use double quotes to pass the cpu option
tests/qtest: device-plug-test: Reverse the usage of double/single quotes
tests/qtest: libqos: Rename malloc.h to libqos-malloc.h
tests/qtest: libqos: Drop inclusion of <sys/wait.h>
tests/qtest: migration-test: Skip running test_migrate_fd_proto on win32
tests/qtest: i440fx-test: Skip running request_{bios, pflash} for win32
tests/qtest: Build cases that use memory-backend-file for posix only
tests/qtest: Build e1000e-test for posix only
tests/qtest: Adapt {m48t59,rtc}-test cases for win32
backends/tpm: Exclude headers and macros that don't exist on win32
tests/qtest: migration-test: Handle link() for win32
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
It is currently not possible yet to use "memory-backend-memfd" on s390x
with hugepages enabled. This problem is caused by qemu_maxrampagesize()
not taking memory-backend-memfd objects into account yet, so the code
in s390_memory_init() fails to enable the huge page support there via
s390_set_max_pagesize(). Fix it by generalizing the code, so that it
looks at qemu_ram_pagesize(memdev->mr.ram_block) instead of re-trying
to get the information from the filesystem.
Suggested-by: David Hildenbrand <david@redhat.com>
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2116496
Message-Id: <20220810125720.3849835-2-thuth@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
These headers and macros do not exist on Windows. Exclude them.
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Message-Id: <20220824094029.1634519-15-bmeng.cn@gmail.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The purpose of dbus_get_proxies to construct the proxies corresponding to the
IDs registered to dbus-vmstate.
Currenty, this function returns an error in case there is any failure
while instantiating proxy for "all" the names on dbus.
Ideally this function should error out only if it is not able to find and
validate the proxies registered to the backend otherwise any offending
process(for eg: the process purposefully may not export its Id property on
the dbus) may connect to the dbus and can lead to migration failures.
This commit ensures that dbus_get_proxies returns an error if it is not
able to find and validate the proxies of interest(the IDs registered
during the dbus-vmstate instantiation).
Signed-off-by: Priyankar Jain <priyankar.jain@nutanix.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1637936117-37977-1-git-send-email-priyankar.jain@nutanix.com>
There are two parts in this patch:
1, support akcipher service by cryptodev-builtin driver
2, virtio-crypto driver supports akcipher service
In principle, we should separate this into two patches, to avoid
compiling error, merge them into one.
Then virtio-crypto gets request from guest side, and forwards the
request to builtin driver to handle it.
Test with a guest linux:
1, The self-test framework of crypto layer works fine in guest kernel
2, Test with Linux guest(with asym support), the following script
test(note that pkey_XXX is supported only in a newer version of keyutils):
- both public key & private key
- create/close session
- encrypt/decrypt/sign/verify basic driver operation
- also test with kernel crypto layer(pkey add/query)
All the cases work fine.
Run script in guest:
rm -rf *.der *.pem *.pfx
modprobe pkcs8_key_parser # if CONFIG_PKCS8_PRIVATE_KEY_PARSER=m
rm -rf /tmp/data
dd if=/dev/random of=/tmp/data count=1 bs=20
openssl req -nodes -x509 -newkey rsa:2048 -keyout key.pem -out cert.pem -subj "/C=CN/ST=BJ/L=HD/O=qemu/OU=dev/CN=qemu/emailAddress=qemu@qemu.org"
openssl pkcs8 -in key.pem -topk8 -nocrypt -outform DER -out key.der
openssl x509 -in cert.pem -inform PEM -outform DER -out cert.der
PRIV_KEY_ID=`cat key.der | keyctl padd asymmetric test_priv_key @s`
echo "priv key id = "$PRIV_KEY_ID
PUB_KEY_ID=`cat cert.der | keyctl padd asymmetric test_pub_key @s`
echo "pub key id = "$PUB_KEY_ID
keyctl pkey_query $PRIV_KEY_ID 0
keyctl pkey_query $PUB_KEY_ID 0
echo "Enc with priv key..."
keyctl pkey_encrypt $PRIV_KEY_ID 0 /tmp/data enc=pkcs1 >/tmp/enc.priv
echo "Dec with pub key..."
keyctl pkey_decrypt $PRIV_KEY_ID 0 /tmp/enc.priv enc=pkcs1 >/tmp/dec
cmp /tmp/data /tmp/dec
echo "Sign with priv key..."
keyctl pkey_sign $PRIV_KEY_ID 0 /tmp/data enc=pkcs1 hash=sha1 > /tmp/sig
echo "Verify with pub key..."
keyctl pkey_verify $PRIV_KEY_ID 0 /tmp/data /tmp/sig enc=pkcs1 hash=sha1
echo "Enc with pub key..."
keyctl pkey_encrypt $PUB_KEY_ID 0 /tmp/data enc=pkcs1 >/tmp/enc.pub
echo "Dec with priv key..."
keyctl pkey_decrypt $PRIV_KEY_ID 0 /tmp/enc.pub enc=pkcs1 >/tmp/dec
cmp /tmp/data /tmp/dec
echo "Verify with pub key..."
keyctl pkey_verify $PUB_KEY_ID 0 /tmp/data /tmp/sig enc=pkcs1 hash=sha1
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: lei he <helei.sig11@bytedance.com
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20220611064243.24535-2-pizhenwei@bytedance.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Prior to the introduction of the prealloc-threads property, the amount
of threads used to preallocate memory was derived from the value of
smp-cpus passed to qemu, the amount of physical cpus of the host
and a hardcoded maximum value. When the prealloc-threads property
was introduced, it included a default of 1 in backends/hostmem.c and
a default of smp-cpus using the sugar API for the property itself. The
latter default is not used when the property is not specified on qemu's
command line, so guests that were not adjusted for this change suddenly
started to use the default of 1 thread to preallocate memory, which
resulted in observable slowdowns in guest boots for guests with large
memory (e.g. when using libvirt <8.2.0 or managing guests manually).
This commit restores the original behavior for these cases while not
impacting guests started with the prealloc-threads property in any way.
Fixes: 220c1fd864e9d ("hostmem: introduce "prealloc-threads" property")
Signed-off-by: Jaroslav Jindrak <dzejrou@gmail.com>
Message-Id: <20220517123858.7933-1-dzejrou@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The ``opened=on`` option in the command line or QMP ``object-add`` either had
no effect (if ``opened`` was the last option) or caused errors. The property
is therefore useless and was deprecated in 6.0; make it read-only now.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Replace the global variables with inlined helper functions. getpagesize() is very
likely annotated with a "const" function attribute (at least with glibc), and thus
optimization should apply even better.
This avoids the need for a constructor initialization too.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20220323155743.1585078-12-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
g_new(T, n) is neater than g_malloc(sizeof(T) * n). It's also safer,
for two reasons. One, it catches multiplication overflowing size_t.
Two, it returns T * rather than void *, which lets the compiler catch
more type errors.
This commit only touches allocations with size arguments of the form
sizeof(T).
Patch created mechanically with:
$ spatch --in-place --sp-file scripts/coccinelle/use-g_new-etc.cocci \
--macro-file scripts/cocci-macro-file.h FILES...
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20220315144156.1595462-4-armbru@redhat.com>
Reviewed-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>
The function qemu_madvise() and the QEMU_MADV_* constants associated
with it are used in only 10 files. Move them out of osdep.h to a new
qemu/madvise.h header that is included where it is needed.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220208200856.3558249-2-peter.maydell@linaro.org
Use the source XML document as single reference, importing its
documentation via the dbus-doc directive.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Although not used by the backend itself, use a common location for
documentation and sharing purposes.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
EPC (Enclave Page Cahe) is a specialized type of memory used by Intel
SGX (Software Guard Extensions). The SDM desribes EPC as:
The Enclave Page Cache (EPC) is the secure storage used to store
enclave pages when they are a part of an executing enclave. For an
EPC page, hardware performs additional access control checks to
restrict access to the page. After the current page access checks
and translations are performed, the hardware checks that the EPC
page is accessible to the program currently executing. Generally an
EPC page is only accessed by the owner of the executing enclave or
an instruction which is setting up an EPC page.
Because of its unique requirements, Linux manages EPC separately from
normal memory. Similar to memfd, the device /dev/sgx_vepc can be
opened to obtain a file descriptor which can in turn be used to mmap()
EPC memory.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20210719112136.57018-3-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Simple unions predate flat unions. Having both complicates the QAPI
schema language and the QAPI generator. We haven't been using simple
unions in new code for a long time, because they are less flexible and
somewhat awkward on the wire.
To prepare for their removal, convert simple union TpmTypeOptions to
an equivalent flat one, with existing enum TpmType replacing implicit
enum TpmTypeOptionsKind. Adds some boilerplate to the schema, which
is a bit ugly, but a lot easier to maintain than the simple union
feature.
Cc: Stefan Berger <stefanb@linux.vnet.ibm.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210917143134.412106-6-armbru@redhat.com>
[Indentation tidied up]
Most callers check the return value. Some check whether it set an
error. Functionally equivalent, but the former tends to be easier on
the eyes, so do that everywhere.
Prior art: commit c6ecec43b2 "qemu-option: Check return value instead
of @err where convenient".
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20210720125408.387910-10-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
This allows callers to return better error messages instead of making
one up while the real error ends up on stderr. Most callers can
immediately make use of this because they already have an Error
parameter themselves. The others just keep printing the error with
error_report_err().
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210609154658.350308-2-kwolf@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Let's provide a way to control the use of RAM_NORESERVE via memory
backends using the "reserve" property which defaults to true (old
behavior).
Only Linux currently supports clearing the flag (and support is checked at
runtime, depending on the setting of "/proc/sys/vm/overcommit_memory").
Windows and other POSIX systems will bail out with "reserve=false".
The target use case is virtio-mem, which dynamically exposes memory
inside a large, sparse memory area to the VM. This essentially allows
avoiding to set "/proc/sys/vm/overcommit_memory == 0") when using
virtio-mem and also supporting hugetlbfs in the future.
As really only Linux implements RAM_NORESERVE right now, let's expose
the property only with CONFIG_LINUX. Setting the property to "false"
will then only fail in corner cases -- for example on very old kernels
or when memory overcommit was completely disabled by the admin.
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210510114328.21835-11-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Let's pass in ram flags just like we do with qemu_ram_alloc_from_file(),
to clean up and prepare for more flags.
Simplify the documentation of passed ram flags: Looking at our
documentation of RAM_SHARED and RAM_PMEM is sufficient, no need to be
repetitive.
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210510114328.21835-5-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit e50caf4a5c ("tracing: convert documentation to rST")
converted docs/devel/tracing.txt to docs/devel/tracing.rst.
We still have several references to the old file, so let's fix them
with the following command:
sed -i s/tracing.txt/tracing.rst/ $(git grep -l docs/devel/tracing.txt)
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210517151702.109066-2-sgarzare@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Simplify the tpm_emulator_ctrlcmd() handler by replacing a pair of
qemu_mutex_lock/qemu_mutex_unlock calls by the WITH_QEMU_LOCK_GUARD
macro.
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Christophe de Dinechin <dinechin@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210512070713.3286188-1-philmd@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Stop including sysemu/sysemu.h in files that don't need it.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210416171314.2074665-2-thuth@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
This commit fixes an issue where migration is failing in the load phase
because of a false alarm about data unavailability.
Following is the error received when the amount of data to be transferred
exceeds the default buffer size setup by G_BUFFERED_INPUT_STREAM(4KiB),
even when the maximum data size supported by this backend is 1MiB
(DBUS_VMSTATE_SIZE_LIMIT):
dbus_vmstate_post_load: Invalid vmstate size: 4364
qemu-kvm: error while loading state for instance 0x0 of device 'dbus-vmstate/dbus-vmstate'
This commit sets the size of the input stream buffer used during load to
DBUS_VMSTATE_SIZE_LIMIT which is the maximum amount of data a helper can
send during save phase.
Secondly, this commit makes sure that the input stream buffer is loaded before
checking the size of the data available in it, rectifying the false alarm about
data unavailability.
Fixes: 5010cec2bc ("Add dbus-vmstate object")
Signed-off-by: Priyankar Jain <priyankar.jain@nutanix.com>
Message-Id: <cdaad4718e62bf22fd5e93ef3e252de20da5c17c.1612273156.git.priyankar.jain@nutanix.com>
[ Modified printf format for gsize ]
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
When dbus_vmstate_post_load() fails, it complains to stderr. Except
on short read, where it checks with g_return_val_if_fail(). This
fails silently if G_DISABLE_CHECKS is undefined (it should be), or
else pads the short read with uninitialized bytes.
Replace g_return_val_if_fail() by a proper error check.
Fixes: 5010cec2bc
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20210126124240.2081959-2-armbru@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
When management applications (like Libvirt) want to check whether
memory-backend-file.pmem is supported they can list object
properties using 'qom-list-properties'. However, 'pmem' is
declared always (and thus reported always) and only at runtime
QEMU errors out if it was built without libpmem (and thus can not
guarantee write persistence). This is suboptimal since we have
ability to declare attributes at compile time.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1915216
Message-Id: <dfcc5dc7e2efc0283bc38e3036da2c0323621cdb.1611647111.git.mprivozn@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Allow RAM MemoryRegion to be created from an offset in a file, instead
of allocating at offset of 0 by default. This is needed to synchronize
RAM between QEMU & remote process.
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 609996697ad8617e3b01df38accc5c208c24d74e.1611938319.git.jag.raman@oracle.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This patch enables using rng-builtin with record/replay
by making the callbacks deterministic.
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>
Message-Id: <161233201286.170686.7858208964037376305.stgit@pasha-ThinkPad-X280>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add documentation for '-machine memory-backend' CLI option and
how to use it.
And document that x-use-canonical-path-for-ramblock-id,
is considered to be stable to make sure it won't go away by accident.
x- was intended for unstable/iternal properties, and not supposed to
be stable option. However it's too late to rename (drop x-)
it as it would mean that users will have to mantain both
x-use-canonical-path-for-ramblock-id (for QEMU 5.0-5.2) versions
and prefix-less for later versions.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20210121161504.1007247-1-imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Several architectures have mechanisms which are designed to protect
guest memory from interference or eavesdropping by a compromised
hypervisor. AMD SEV does this with in-chip memory encryption and
Intel's TDX can do similar things. POWER's Protected Execution
Framework (PEF) accomplishes a similar goal using an ultravisor and
new memory protection features, instead of encryption.
To (partially) unify handling for these, this introduces a new
ConfidentialGuestSupport QOM base class. "Confidential" is kind of vague,
but "confidential computing" seems to be the buzzword about these schemes,
and "secure" or "protected" are often used in connection to unrelated
things (such as hypervisor-from-guest or guest-from-guest security).
The "support" in the name is significant because in at least some of the
cases it requires the guest to take specific actions in order to protect
itself from hypervisor eavesdropping.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Let -object memory-backend-file work on read-only files when the
readonly=on option is given. This can be used to share the contents of a
file between multiple guests while preventing them from consuming
Copy-on-Write memory if guests dirty the pages, for example.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20210104171320.575838-3-stefanha@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
There is currently no way to open(O_RDONLY) and mmap(PROT_READ) when
creating a memory region from a file. This functionality is needed since
the underlying host file may not allow writing.
Add a bool readonly argument to memory_region_init_ram_from_file() and
the APIs it calls.
Extend memory_region_init_ram_from_file() rather than introducing a
memory_region_init_rom_from_file() API so that callers can easily make a
choice between read/write and read-only at runtime without calling
different APIs.
No new RAMBlock flag is introduced for read-only because it's unclear
whether RAMBlocks need to know that they are read-only. Pass a bool
readonly argument instead.
Both of these design decisions can be changed in the future. It just
seemed like the simplest approach to me.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20210104171320.575838-2-stefanha@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The easiest spots to use QAPI_LIST_APPEND are where we already have an
obvious pointer to the tail of a list. While at it, consistently use
the variable name 'tail' for that purpose.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20210113221013.390592-5-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>