qemu-e2k

Author	SHA1	Message	Date
Stefan Hajnoczi	369d6dc4de	memory: add readonly support to memory_region_init_ram_from_file() There is currently no way to open(O_RDONLY) and mmap(PROT_READ) when creating a memory region from a file. This functionality is needed since the underlying host file may not allow writing. Add a bool readonly argument to memory_region_init_ram_from_file() and the APIs it calls. Extend memory_region_init_ram_from_file() rather than introducing a memory_region_init_rom_from_file() API so that callers can easily make a choice between read/write and read-only at runtime without calling different APIs. No new RAMBlock flag is introduced for read-only because it's unclear whether RAMBlocks need to know that they are read-only. Pass a bool readonly argument instead. Both of these design decisions can be changed in the future. It just seemed like the simplest approach to me. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20210104171320.575838-2-stefanha@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2021-02-01 17:07:34 -05:00
Philippe Mathieu-Daudé	ed6f53f9ca	util/oslib: Assert qemu_try_memalign() alignment is a power of 2 qemu_try_memalign() expects a power of 2 alignment: - posix_memalign(3): The address of the allocated memory will be a multiple of alignment, which must be a power of two and a multiple of sizeof(void *). - _aligned_malloc() The alignment value, which must be an integer power of 2. Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20201021173803.2619054-3-philmd@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:06 -10:00
Daniele Buono	c905a3680d	cfi: Initial support for cfi-icall in QEMU LLVM/Clang, supports runtime checks for forward-edge Control-Flow Integrity (CFI). CFI on indirect function calls (cfi-icall) ensures that, in indirect function calls, the function called is of the right signature for the pointer type defined at compile time. For this check to work, the code must always respect the function signature when using function pointer, the function must be defined at compile time, and be compiled with link-time optimization. This rules out, for example, shared libraries that are dynamically loaded (given that functions are not known at compile time), and code that is dynamically generated at run-time. This patch: 1) Introduces the CONFIG_CFI flag to support cfi in QEMU 2) Introduces a decorator to allow the definition of "sensitive" functions, where a non-instrumented function may be called at runtime through a pointer. The decorator will take care of disabling cfi-icall checks on such functions, when cfi is enabled. 3) Marks functions currently in QEMU that exhibit such behavior, in particular: - The function in TCG that calls pre-compiled TBs - The function in TCI that interprets instructions - Functions in the plugin infrastructures that jump to callbacks - Functions in util that directly call a signal handler Signed-off-by: Daniele Buono <dbuono@linux.vnet.ibm.com> Acked-by: Alex Bennée <alex.bennee@linaro.org Message-Id: <20201204230615.2392-3-dbuono@linux.vnet.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-01-02 21:03:35 +01:00
Paolo Bonzini	fcb4f59c87	oslib-posix: relocate path to /var Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-09-30 19:11:36 +02:00
Paolo Bonzini	9386a4a715	oslib-posix: default exec_dir to bindir If the exec_dir cannot be retrieved, just assume it's the installation directory that was specified at configure time. This makes it simpler to reason about what the callers will do if they get back an empty path. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-09-30 19:11:36 +02:00
Paolo Bonzini	a4c13869f9	oslib: do not call g_strdup from qemu_get_exec_dir Just return the directory without requiring the caller to free it. This also removes a bogus check for NULL in os_find_datadir and module_load_one; g_strdup of a static variable cannot return NULL. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-09-30 19:11:36 +02:00
Daniel P. Berrangé	448058aa99	util: rename qemu_open() to qemu_open_old() We want to introduce a new version of qemu_open() that uses an Error object for reporting problems and make this it the preferred interface. Rename the existing method to release the namespace for the new impl. Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2020-09-16 10:33:48 +01:00
Alex Bennée	ad06ef0efb	util: add qemu_get_host_physmem utility function This will be used in a future patch. For POSIX systems _SC_PHYS_PAGES isn't standardised but at least appears in the man pages for Open/FreeBSD. The result is advisory so any users of it shouldn't just fail if we can't work it out. The win32 stub currently returns 0 until someone with a Windows system can develop and test a patch. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Cc: BALATON Zoltan <balaton@eik.bme.hu> Cc: Christian Ehrhardt <christian.ehrhardt@canonical.com> Message-Id: <20200724064509.331-5-alex.bennee@linaro.org>	2020-07-27 09:40:12 +01:00
David CARLIER	8edbca515c	util: Implement qemu_get_thread_id() for OpenBSD Implement qemu_get_thread_id() for OpenBSD hosts, using getthrid(). Signed-off-by: David Carlier <devnexen@gmail.com> Reviewed-by: Brad Smith <brad@comstyle.com> Message-id: CA+XhMqxD6gQDBaj8tX0CMEj3si7qYKsM8u1km47e_-U7MC37Pg@mail.gmail.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> [PMM: tidied up commit message] Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2020-07-20 11:35:17 +01:00
Laurent Vivier	894022e616	net: check if the file descriptor is valid before using it qemu_set_nonblock() checks that the file descriptor can be used and, if not, crashes QEMU. An assert() is used for that. The use of assert() is used to detect programming error and the coredump will allow to debug the problem. But in the case of the tap device, this assert() can be triggered by a misconfiguration by the user. At startup, it's not a real problem, but it can also happen during the hot-plug of a new device, and here it's a problem because we can crash a perfectly healthy system. For instance: # ip link add link virbr0 name macvtap0 type macvtap mode bridge # ip link set macvtap0 up # TAP=/dev/tap$(ip -o link show macvtap0 \| cut -d: -f1) # qemu-system-x86_64 -machine q35 -device pcie-root-port,id=pcie-root-port-0 -monitor stdio 9<> $TAP (qemu) netdev_add type=tap,id=hostnet0,vhost=on,fd=9 (qemu) device_add driver=virtio-net-pci,netdev=hostnet0,id=net0,bus=pcie-root-port-0 (qemu) device_del net0 (qemu) netdev_del hostnet0 (qemu) netdev_add type=tap,id=hostnet1,vhost=on,fd=9 qemu-system-x86_64: .../util/oslib-posix.c:247: qemu_set_nonblock: Assertion `f != -1' failed. Aborted (core dumped) To avoid that, add a function, qemu_try_set_nonblock(), that allows to report the problem without crashing. In the same way, we also update the function for vhostfd in net_init_tap_one() and for fd in net_init_socket() (both descriptors are provided by the user and can be wrong). Signed-off-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2020-07-15 21:00:13 +08:00
Michal Privoznik	e47f4765af	util: Introduce qemu_get_host_name() This function offers operating system agnostic way to fetch host name. It is implemented for both POSIX-like and Windows systems. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Cc: qemu-stable@nongnu.org Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2020-07-13 17:44:58 -05:00
David CARLIER	2b9b9e7010	util/oslib-posix.c: Implement qemu_init_exec_dir() for Haiku The qemu_init_exec_dir() function is inherently non-portable; provide an implementation for Haiku hosts. Signed-off-by: David Carlier <devnexen@gmail.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 20200703145614.16684-9-peter.maydell@linaro.org [PMM: Expanded commit message] Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2020-07-13 14:36:10 +01:00
David CARLIER	2a4b472c3c	osdep.h: Always include <sys/signal.h> if it exists Regularize our handling of <sys/signal.h>: currently we include it in osdep.h, but only for OpenBSD, and we include it without an ifdef guard in a couple of C files. This causes problems for Haiku, which doesn't have that header. Instead, check in configure whether sys/signal.h exists, and if it does then always include it from osdep.h. Signed-off-by: David Carlier <devnexen@gmail.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 20200703145614.16684-5-peter.maydell@linaro.org [PMM: Expanded commit message; rename to HAVE_SYS_SIGNAL_H] Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2020-07-13 14:36:09 +01:00
David CARLIER	2032e243a5	util/oslib-posix : qemu_init_exec_dir implementation for Mac From 3025a0ce3fdf7d3559fc35a52c659f635f5c750c Mon Sep 17 00:00:00 2001 From: David Carlier <devnexen@gmail.com> Date: Tue, 26 May 2020 21:35:27 +0100 Subject: [PATCH] util/oslib-posix : qemu_init_exec_dir implementation for Mac Using dyld API to get the full path of the current process. Signed-off-by: David Carlier <devnexen@gmail.com> Message-id: CA+XhMqxwC10XHVs4Z-JfE0-WLAU3ztDuU9QKVi31mjr59HWCxg@mail.gmail.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2020-06-23 11:39:46 +01:00
David Carlier	9548a89173	util/oslib: Returns the real thread identifier on FreeBSD and NetBSD getpid is good enough in a mono thread context, however thr_self/_lwp_self reflects the real current thread identifier from a given process. Signed-off-by: David Carlier <devnexen@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: David Carlier <devnexen@gmail.com>	2020-06-10 12:10:48 -04:00
Bauerchen	278fb16273	oslib-posix: take lock before qemu_cond_broadcast In touch_all_pages, if the mutex is not taken around qemu_cond_broadcast, qemu_cond_broadcast may be called before all touch page threads enter qemu_cond_wait. In this case, the touch page threads wait forever for the main thread to wake them up, causing a deadlock. Signed-off-by: Bauerchen <bauerchen@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-04-11 08:49:20 -04:00
Paolo Bonzini	78b3f67acd	oslib-posix: initialize mutex and condition variable The mutex and condition variable were never initialized, causing -mem-prealloc to abort with an assertion failure. Fixes: `037fb5eb39` Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com> Cc: bauerchen <bauerchen@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-16 23:02:22 +01:00
bauerchen	037fb5eb39	mem-prealloc: optimize large guest startup [desc]: Large memory VM starts slowly when using -mem-prealloc, and there are some areas to optimize in current method; 1、mmap will be used to alloc threads stack during create page clearing threads, and it will attempt mm->mmap_sem for write lock, but clearing threads have hold read lock, this competition will cause threads createion very slow; 2、methods of calcuating pages for per threads is not well;if we use 64 threads to split 160 hugepage,63 threads clear 2page,1 thread clear 34 page,so the entire speed is very slow; to solve the first problem,we add a mutex in thread function,and start all threads when all threads finished createion; and the second problem, we spread remainder to other threads,in situation that 160 hugepage and 64 threads, there are 32 threads clear 3 pages,and 32 threads clear 2 pages. [test]: 320G 84c VM start time can be reduced to 10s 680G 84c VM start time can be reduced to 18s Signed-off-by: bauerchen <bauerchen@tencent.com> Reviewed-by: Pan Rui <ruippan@tencent.com> Reviewed-by: Ivan Ren <ivanren@tencent.com> [Simplify computation of the number of pages per thread. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-02-25 09:18:01 +01:00
Wei Yang	038adc2f58	core: replace getpagesize() with qemu_real_host_page_size There are three page size in qemu: real host page size host page size target page size All of them have dedicate variable to represent. For the last two, we use the same form in the whole qemu project, while for the first one we use two forms: qemu_real_host_page_size and getpagesize(). qemu_real_host_page_size is defined to be a replacement of getpagesize(), so let it serve the role. [Note] Not fully tested for some arch or device. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20191013021145.16011-3-richardw.yang@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-10-26 15:38:06 +02:00
Stefan Hajnoczi	72d41eb4b8	memory: fetch pmem size in get_file_size() Neither stat(2) nor lseek(2) report the size of Linux devdax pmem character device nodes. Commit `314aec4a6e` ("hostmem-file: reject invalid pmem file sizes") added code to hostmem-file.c to fetch the size from sysfs and compare against the user-provided size=NUM parameter: if (backend->size > size) { error_setg(errp, "size property %" PRIu64 " is larger than " "pmem file \"%s\" size %" PRIu64, backend->size, fb->mem_path, size); return; } It turns out that exec.c:qemu_ram_alloc_from_fd() already has an equivalent size check but it skips devdax pmem character devices because lseek(2) returns 0: if (file_size > 0 && file_size < size) { error_setg(errp, "backing store %s size 0x%" PRIx64 " does not match 'size' option 0x" RAM_ADDR_FMT, mem_path, file_size, size); return NULL; } This patch moves the devdax pmem file size code into get_file_size() so that we check the memory size in a single place: qemu_ram_alloc_from_fd(). This simplifies the code and makes it more general. This also fixes the problem that hostmem-file only checks the devdax pmem file size when the pmem=on parameter is given. An unchecked size=NUM parameter can lead to SIGBUS in QEMU so we must always fetch the file size for Linux devdax pmem character device nodes. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20190830093056.12572-1-stefanha@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-09-16 12:32:21 +02:00
Markus Armbruster	db72581598	Include qemu/main-loop.h less In my "build everything" tree, changing qemu/main-loop.h triggers a recompile of some 5600 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). It includes block/aio.h, which in turn includes qemu/event_notifier.h, qemu/notify.h, qemu/processor.h, qemu/qsp.h, qemu/queue.h, qemu/thread-posix.h, qemu/thread.h, qemu/timer.h, and a few more. Include qemu/main-loop.h only where it's needed. Touching it now recompiles only some 1700 objects. For block/aio.h and qemu/event_notifier.h, these numbers drop from 5600 to 2800. For the others, they shrink only slightly. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190812052359.30071-21-armbru@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>	2019-08-16 13:31:52 +02:00
Markus Armbruster	a8d2532645	Include qemu-common.h exactly where needed No header includes qemu-common.h after this commit, as prescribed by qemu-common.h's file comment. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190523143508.25387-5-armbru@redhat.com> [Rebased with conflicts resolved automatically, except for include/hw/arm/xlnx-zynqmp.h hw/arm/nrf51_soc.c hw/arm/msf2-soc.c block/qcow2-refcount.c block/qcow2-cluster.c block/qcow2-cache.c target/arm/cpu.h target/lm32/cpu.h target/m68k/cpu.h target/mips/cpu.h target/moxie/cpu.h target/nios2/cpu.h target/openrisc/cpu.h target/riscv/cpu.h target/tilegx/cpu.h target/tricore/cpu.h target/unicore32/cpu.h target/xtensa/cpu.h; bsd-user/main.c and net/tap-bsd.c fixed up]	2019-06-12 13:20:20 +02:00
Zhang Yi	2ac0f1621c	util/mmap-alloc: Add a 'is_pmem' parameter to qemu_ram_mmap besides the existing 'shared' flags, we are going to add 'is_pmem' to qemu_ram_mmap(), which indicated the memory backend file is a persist memory. Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> Signed-off-by: Zhang Yi <yi.z.zhang@linux.intel.com> Reviewed-by: Pankaj Gupta <pagupta@redhat.com> Message-Id: <786c46862cfeb253ee0ea2f44d62ffe76edb7fa4.1549555521.git.yi.z.zhang@linux.intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Pankaj Gupta <pagupta@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2019-04-25 14:17:36 -03:00
Peter Maydell	2cb73afa6a	Machine queue, 2019-03-11 * memfd fixes (Ilya Maximets) * Move nvdimms state into struct MachineState (Eric Auger) * hostmem-file: reject invalid pmem file sizes (Stefan Hajnoczi) -----BEGIN PGP SIGNATURE----- iQIcBAABCAAGBQJchwQFAAoJECgHk2+YTcWmhkMP/iyHjvM7eTXcbs+5xidkQpX8 mc9ElHmX/W2ZK1TUeopz2hUuOG12qkt3G4bOKEKgD07h/O5J7HPXSRvT1TU7UbA/ ZkNQiF/TpuyB8JtxIgbYtgh4ZDFIGFy5o/phjCEuejyHMxZXVL8PNKCm9ZUPKgfG XYH1Q7Y+uHH7qQDhLRPdfs5/v8hOKdmHK/SuUn/dq2CqA4GoNjnC9IfxnuvIpDU6 F2Hj2YhPC35zFgR3bIh2Fqz4qv37u50a1L4VPKaCQpPY5YNGj6jPaOVPQbMrviFI 1/yaNr5RGdNrS7aQLcDKKVeclSuFHC7x3uo27JF1RbP8p4tAQi0M89E/RLyBV5lY Y7a9fInmJbxJQifgct6dv8yzTiNoniX5yph81RMXk0CzV74sP+yeKkwkIK2dWAsn 2zsM6qCHFvIv3F7iIy+ONl6TJ/RALvyP4F3Vhd3lT2Y+nwnQOvUdrX6eL4yeYGfZ 4OPCEHIn+xhb3ApYbG+4OrDBYZrPVpr6yYcqc8Ob9paeR08DgaghDX3E23bASwSl e9Cz19nvnIse/zHIAYoWhPFMfSTkWgREzCs+VA07bqPCb1/PNHBQmxv2mvdpB8Rw r/FjZyptCNyXRSfU28HEImAA7dsB9VtZAVK9oVRXaIOk2G6W5bFfAmQmAPETBRaA K9ZExT9oQhQdjKIaya0l =6nAH -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/ehabkost/tags/machine-next-pull-request' into staging Machine queue, 2019-03-11 * memfd fixes (Ilya Maximets) * Move nvdimms state into struct MachineState (Eric Auger) * hostmem-file: reject invalid pmem file sizes (Stefan Hajnoczi) # gpg: Signature made Tue 12 Mar 2019 00:57:41 GMT # gpg: using RSA key 2807936F984DC5A6 # gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>" [full] # Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6 * remotes/ehabkost/tags/machine-next-pull-request: memfd: improve error messages memfd: set up correct errno if not supported memfd: always check for MFD_CLOEXEC hostmem-memfd: disable for systems without sealing support machine: Move nvdimms state into struct MachineState nvdimm: Rename AcpiNVDIMMState into NVDIMMState hostmem-file: reject invalid pmem file sizes Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-03-12 15:25:46 +00:00
Philippe Mathieu-Daudé	02cdcc96be	oslib-posix: Ignore fcntl("/dev/null", F_SETFL, O_NONBLOCK) failure Previous to OpenBSD 6.3 [1], fcntl(F_SETFL) is not permitted on memory devices. Trying this call sets errno to ENODEV ("not a memory device"): 19 ENODEV Operation not supported by device. An attempt was made to apply an inappropriate function to a device, for example, trying to read a write-only device such as a printer. Do not assert fcntl failures in this specific case (errno set to ENODEV) on OpenBSD. This fixes: $ lm32-softmmu/qemu-system-lm32 assertion "f != -1" failed: file "util/oslib-posix.c", line 247, function "qemu_set_nonblock" Abort trap (core dumped) [1] The fix seems https://github.com/openbsd/src/commit/c2a35b387f9d3c "fcntl(F_SETFL) invokes the FIONBIO and FIOASYNC ioctls internally, so the memory devices (/dev/null, /dev/zero, etc) need to permit them." Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20190307142822.8531-2-philmd@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-03-11 16:33:49 +01:00
Stefan Hajnoczi	314aec4a6e	hostmem-file: reject invalid pmem file sizes Guests started with NVDIMMs larger than the underlying host file produce confusing errors inside the guest. This happens because the guest accesses pages beyond the end of the file. Check the pmem file size on startup and print a clear error message if the size is invalid. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1669053 Cc: Wei Yang <richardw.yang@linux.intel.com> Cc: Zhang Yi <yi.z.zhang@linux.intel.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20190214031004.32522-3-stefanha@redhat.com> Reviewed-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Pankaj Gupta <pagupta@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2019-03-11 10:44:19 -03:00
Murilo Opsfelder Araujo	53adb9d43e	mmap-alloc: fix hugetlbfs misaligned length in ppc64 The commit `7197fb4058` ("util/mmap-alloc: fix hugetlb support on ppc64") fixed Huge TLB mappings on ppc64. However, we still need to consider the underlying huge page size during munmap() because it requires that both address and length be a multiple of the underlying huge page size for Huge TLB mappings. Quote from "Huge page (Huge TLB) mappings" paragraph under NOTES section of the munmap(2) manual: "For munmap(), addr and length must both be a multiple of the underlying huge page size." On ppc64, the munmap() in qemu_ram_munmap() does not work for Huge TLB mappings because the mapped segment can be aligned with the underlying huge page size, not aligned with the native system page size, as returned by getpagesize(). This has the side effect of not releasing huge pages back to the pool after a hugetlbfs file-backed memory device is hot-unplugged. This patch fixes the situation in qemu_ram_mmap() and qemu_ram_munmap() by considering the underlying page size on ppc64. After this patch, memory hot-unplug releases huge pages back to the pool. Fixes: `7197fb4058` Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-02-04 18:44:20 +11:00
Li Qiang	da93b82079	util: check the return value of fcntl in qemu_set_{block, nonblock} Assert that the return value is not an error. This is like commit `7e6478e7d4` for qemu_set_cloexec. Signed-off-by: Li Qiang <liq3ea@163.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2019-01-14 19:31:04 -05:00
Brad Smith	fc3d1bad1e	oslib-posix: Use MAP_STACK in qemu_alloc_stack() on OpenBSD Use MAP_STACK in qemu_alloc_stack() on OpenBSD. Added to our 6.4 release. MAP_STACK Indicate that the mapping is used as a stack. This flag must be used in combination with MAP_ANON and MAP_PRIVATE. Implement MAP_STACK option for mmap(). Synchronous faults (pagefault and syscall) confirm the stack register points at MAP_STACK memory, otherwise SIGSEGV is delivered. sigaltstack() and pthread_attr_setstack() are modified to create a MAP_STACK sub-region which satisfies alignment requirements. Observe that MAP_STACK can only be set/cleared by mmap(), which zeroes the contents of the region -- there is no mprotect() equivalent operation, so there is no MAP_STACK-adding gadget. Signed-off-by: Brad Smith <brad@comstyle.com> Reviewed-by: Kamil Rytarowski <n54@gmx.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 20181019125239.GA13884@humpty.home.comstyle.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2018-11-06 10:52:23 +00:00
Marc-André Lureau	35f7f3fb5c	util: use fcntl() for qemu_write_pidfile() locking Daniel BerrangÃ© suggested to use fcntl() locks rather than lockf(). 'man lockf': On Linux, lockf() is just an interface on top of fcntl(2) locking. Many other systems implement lockf() in this way, but note that POSIX.1 leaves the relationship between lockf() and fcntl(2) locks unspecified. A portable application should probably avoid mixing calls to these interfaces. IOW, if its just a shim around fcntl() on many systems, it is clearer if we just use fcntl() directly, as we then know how fcntl() locks will behave if they're on a network filesystem like NFS. Suggested-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20180831145314.14736-3-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-10-02 18:47:55 +02:00
Marc-André Lureau	9e6bdef224	util: add qemu_write_pidfile() There are variants of qemu_create_pidfile() in qemu-pr-helper and qemu-ga. Let's have a common implementation in libqemuutil. The code is initially based from pr-helper write_pidfile(), with various improvements and suggestions from Daniel BerrangÃ©: QEMU will leave the pidfile existing on disk when it exits which initially made me think it avoids the deletion race. The app managing QEMU, however, may well delete the pidfile after it has seen QEMU exit, and even if the app locks the pidfile before deleting it, there is still a race. eg consider the following sequence QEMU 1 libvirtd QEMU 2 1. lock(pidfile) 2. exit() 3. open(pidfile) 4. lock(pidfile) 5. open(pidfile) 6. unlink(pidfile) 7. close(pidfile) 8. lock(pidfile) IOW, at step 8 the new QEMU has successfully acquired the lock, but the pidfile no longer exists on disk because it was deleted after the original QEMU exited. While we could just say no external app should ever delete the pidfile, I don't think that is satisfactory as people don't read docs, and admins don't like stale pidfiles being left around on disk. To make this robust, I think we might want to copy libvirt's approach to pidfile acquisition which runs in a loop and checks that the file on disk /after/ acquiring the lock matches the file that was locked. Then we could in fact safely let QEMU delete its own pidfiles on clean exit.. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20180831145314.14736-2-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-10-02 18:47:55 +02:00
Marcel Apfelbaum	06329ccecf	mem: add share parameter to memory-backend-ram Currently only file backed memory backend can be created with a "share" flag in order to allow sharing guest RAM with other processes in the host. Add the "share" flag also to RAM Memory Backend in order to allow remapping parts of the guest RAM to different host virtual addresses. This is needed by the RDMA devices in order to remap non-contiguous QEMU virtual addresses to a contiguous virtual address range. Moved the "share" flag to the Host Memory base class, modified phys_mem_alloc to include the new parameter and a new interface memory_region_init_ram_shared_nomigrate. There are no functional changes if the new flag is not used. Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>	2018-02-19 13:03:24 +02:00
Andreas Gustafsson	9bc5a7193f	oslib-posix: check for posix_memalign in configure script Check for the presence of posix_memalign() in the configure script, not using "defined(_POSIX_C_SOURCE) && !defined(__sun__)". This lets qemu use posix_memalign() on NetBSD versions that have it, instead of falling back to valloc() which is wasteful when the required alignment is smaller than a page. Signed-off-by: Andreas Gustafsson <gson@gson.org> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> Reviewed-by: Kamil Rytarowski <n54@gmx.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>	2018-02-10 10:21:50 +03:00
Kamil Rytarowski	094611b426	oslib-posix: Use sysctl(2) call to resolve exec_dir on NetBSD NetBSD 8.0(beta) ships with KERN_PROC_PATHNAME in sysctl(2). Older NetBSD versions can use argv[0] parsing fallback. This code section is partly shared with FreeBSD. Signed-off-by: Kamil Rytarowski <n54@gmx.com> Message-id: 20171028194833.23858-1-n54@gmx.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-11-02 16:19:34 +00:00
Stefan Weil	e947d47da0	oslib-posix: Fix compiler warning and some data types gcc warning: /qemu/util/oslib-posix.c:304:11: error: variable ‘addr’ might be clobbered by ‘longjmp’ or ‘vfork’ [-Werror=clobbered] Fix also some related data types: numpages, hpagesize are used as pointer offset. Always use size_t for them and also for the derived numpages_per_thread and size_per_thread. Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Stefan Weil <sw@weilnetz.de> Message-id: 20171016202912.1117-1-sw@weilnetz.de Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-10-20 11:16:27 +02:00
Eduardo Habkost	e916a6e88a	oslib-posix: Print errors before aborting on qemu_alloc_stack() If QEMU is running on a system that's out of memory and mmap() fails, QEMU aborts with no error message at all, making it hard to debug the reason for the failure. Add perror() calls that will print error information before aborting. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 20170829212053.6003-1-ehabkost@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-08-30 09:33:49 +01:00
Peter Maydell	02ffa034fb	util/oslib-posix.c: Avoid warning on NetBSD On NetBSD the compiler warns: util/oslib-posix.c: In function 'sigaction_invoke': util/oslib-posix.c:589:5: warning: missing braces around initializer [-Wmissing-braces] siginfo_t si = { 0 }; ^ util/oslib-posix.c:589:5: warning: (near initialization for 'si.si_pad') [-Wmissing-braces] because on this platform siginfo_t is defined as typedef union siginfo { char si_pad[128]; /* Total size; for future expansion */ struct _ksiginfo _info; } siginfo_t; Avoid this warning by initializing the struct with {} instead; this is a GCC extension but we use it all over the codebase already. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 1500568341-8389-1-git-send-email-peter.maydell@linaro.org	2017-07-21 10:32:19 +01:00
Daniel P. Berrange	788cf9f8c8	block: rip out all traces of password prompting Now that qcow & qcow2 are wired up to get encryption keys via the QCryptoSecret object, nothing is relying on the interactive prompting for passwords. All the code related to password prompting can thus be ripped out. Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 20170623162419.26068-17-berrange@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:56 +02:00
Peter Maydell	2a8469aaab	-----BEGIN PGP SIGNATURE----- iQEcBAABAgAGBQJZOEC7AAoJEJykq7OBq3PIs4wIAMWtKbkZQkgqpHnSt4ZzpjkQ 1lRLmM6HOqMO1SarK0PduSRafwgUD2q/24gk6UeqUbArZXK+7U3SnfsfEHkOqCdB +ELRPZ+/b77AEZzhCSZ7uHHvrISO5MtLW/z4Av8GB5KYARbO5aZgIyeg7Na29SDk vQoANktrtLgLHu0vZfSUTTmPMRJqcC/DMm/EukXapXVW+cEt23V+nohchFmw8VtF Uni17u26B7CJGZOGduD11CIIvQ9QX+acyDlknkCIqfwd3Xxle0Mu0S3IV4KH7zjn MIcF3hGQWeln5AZzQe998EC0Ko+pmPRooZaeUbrolp+KjK9OcRnNKr7O2jjvGkA= =9+yN -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging # gpg: Signature made Wed 07 Jun 2017 19:06:51 BST # gpg: using RSA key 0x9CA4ABB381AB73C8 # gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" # gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" # Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8 * remotes/stefanha/tags/block-pull-request: configure: split c and cxx extra flags coroutine-lock: do not touch coroutine after another one has been entered .gdbinit: load QEMU sub-commands when gdb starts coccinelle: fix typo in comment oslib: strip trailing '\n' from error_setg() string argument Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-06-12 14:14:42 +01:00
Philippe Mathieu-Daudé	462e5d5065	oslib: strip trailing '\n' from error_setg() string argument spotted by Coccinelle script scripts/coccinelle/err-bad-newline.cocci Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-06-07 14:38:44 +01:00
Stefano Stabellini	7e6478e7d4	Check the return value of fcntl in qemu_set_cloexec Assert that the return value is not an error. This issue was found by Coverity. CID: 1374831 Signed-off-by: Stefano Stabellini <sstabellini@kernel.org> CC: groug@kaod.org CC: pbonzini@redhat.com CC: Eric Blake <eblake@redhat.com> Message-Id: <1494356693-13190-2-git-send-email-sstabellini@kernel.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-06-06 20:18:35 +02:00
Greg Kurz	fcdcf1eed2	util: drop old utimensat() compat code Now that 9pfs and virtfs-proxy-helper have been converted to utimensat(), we don't need to keep qemu_utimens() anymore. Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Eric Blake <eblake@redhat.com>	2017-05-25 10:30:14 +02:00
Jitendra Kolhe	dfd0dcc717	mem-prealloc: fix sysconf(_SC_NPROCESSORS_ONLN) failure case. This was spotted by Coverity, in case where sysconf(_SC_NPROCESSORS_ONLN) fails and returns -1. This results in memset_num_threads getting set to -1. Which we then pass to g_new0(). The patch replaces MAX_MEM_PREALLOC_THREAD_COUNT macro with a function call get_memset_num_threads() to handle sysconf() failure gracefully. In case sysconf() fails, we fall back to single threaded. (Spotted by Coverity, CID 1372465.) Signed-off-by: Jitendra Kolhe <jitendra.kolhe@hpe.com> Message-Id: <1490079006-32495-1-git-send-email-jitendra.kolhe@hpe.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-03-24 11:50:11 +01:00
Paolo Bonzini	6b3cca76dd	oslib-posix: fix compilation on OpenBSD si_band is not found in OpenBSD. It is marked as obsolescent in POSIX, so we can delete it without any remorse. Reported-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 20170317152214.6148-1-pbonzini@redhat.com Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Tested-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-03-17 18:27:49 +00:00
Daniel P. Berrange	9dc44aa582	os: don't corrupt pre-existing memory-backend data with prealloc When using a memory-backend object with prealloc turned on, QEMU will memset() the first byte in every memory page to zero. While this might have been acceptable for memory backends associated with RAM, this corrupts application data for NVDIMMs. Instead of setting every page to zero, read the current byte value and then just write that same value back, so we are not corrupting the original data. Directly write the value instead of memset()ing it, since there's no benefit to memset for a single byte write. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Andrea Arcangeli <aarcange@redhat.com> Message-id: 20170303113255.28262-1-berrange@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-03-15 11:55:41 +08:00
Jitendra Kolhe	1e356fc14b	mem-prealloc: reduce large guest start-up and migration time. Using "-mem-prealloc" option for a large guest leads to higher guest start-up and migration time. This is because with "-mem-prealloc" option qemu tries to map every guest page (create address translations), and make sure the pages are available during runtime. virsh/libvirt by default, seems to use "-mem-prealloc" option in case the guest is configured to use huge pages. The patch tries to map all guest pages simultaneously by spawning multiple threads. Currently limiting the change to QEMU library functions on POSIX compliant host only, as we are not sure if the problem exists on win32. Below are some stats with "-mem-prealloc" option for guest configured to use huge pages. ------------------------------------------------------------------------ Idle Guest \| Start-up time \| Migration time ------------------------------------------------------------------------ Guest stats with 2M HugePage usage - single threaded (existing code) ------------------------------------------------------------------------ 64 Core - 4TB \| 54m11.796s \| 75m43.843s 64 Core - 1TB \| 8m56.576s \| 14m29.049s 64 Core - 256GB \| 2m11.245s \| 3m26.598s ------------------------------------------------------------------------ Guest stats with 2M HugePage usage - map guest pages using 8 threads ------------------------------------------------------------------------ 64 Core - 4TB \| 5m1.027s \| 34m10.565s 64 Core - 1TB \| 1m10.366s \| 8m28.188s 64 Core - 256GB \| 0m19.040s \| 2m10.148s ----------------------------------------------------------------------- Guest stats with 2M HugePage usage - map guest pages using 16 threads ----------------------------------------------------------------------- 64 Core - 4TB \| 1m58.970s \| 31m43.400s 64 Core - 1TB \| 0m39.885s \| 7m55.289s 64 Core - 256GB \| 0m11.960s \| 2m0.135s ----------------------------------------------------------------------- Changed in v2: - modify number of memset threads spawned to min(smp_cpus, 16). - removed 64GB memory restriction for spawning memset threads. Changed in v3: - limit number of threads spawned based on min(sysconf(_SC_NPROCESSORS_ONLN), 16, smp_cpus) - implement memset thread specific siglongjmp in SIGBUS signal_handler. Changed in v4 - remove sigsetjmp/siglongjmp and SIGBUS unblock/block for main thread as main thread no longer touches any pages. - simplify code my returning memset_thread_failed status from touch_all_pages. Signed-off-by: Jitendra Kolhe <jitendra.kolhe@hpe.com> Message-Id: <1487907103-32350-1-git-send-email-jitendra.kolhe@hpe.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-03-14 13:26:36 +01:00
Paolo Bonzini	d98d407234	cpus: remove ugly cast on sigbus_handler The cast is there because sigbus_handler is invoked via sigfd_handler. But it feels just wrong to use struct qemu_signalfd_siginfo in the prototype of a function that is passed to sigaction. Instead, do a simple-minded conversion of qemu_signalfd_siginfo to siginfo_t. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-03-03 16:40:02 +01:00
Ed Maste	a7764f1548	Fix FreeBSD (10.x) build after `7dc9ae43` Include sys/user.h for declaration of 'struct kinfo_proc'. Add -lutil to qemu-ga link for kinfo_getproc. Signed-off-by: Ed Maste <emaste@freebsd.org> Message-id: 1479778365-11315-1-git-send-email-emaste@freebsd.org Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-11-22 10:56:01 +00:00
Anand J	814bb12a56	clean-up: removed duplicate #includes Some files contain multiple #includes of the same header file. Removed most of those unnecessary duplicate entries using scripts/clean-includes. Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Anand J <anand.indukala@gmail.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2016-10-28 18:17:24 +03:00
Peter Maydell	86e121ae75	* Thread Sanitizer fixes (Alex) * Coverity fixes (David) * test-qht fixes (Emilio) * QOM interface for info irq/info pic (Hervé) * -rtc clock=rt fix (Junlian) * mux chardev fixes (Marc-André) * nicer report on death by signal (Michal) * qemu-tech TLC (Paolo) * MSI support for edu device (Peter) * qemu-nbd --offset fix (Tomáš) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQExBAABCAAbBQJX98xmFBxwYm9uemluaUByZWRoYXQuY29tAAoJEL/70l94x66D IXsH/idLNlBzbrGhcuZOXEAd4fCyCyhXGMuOAGJXLHgv+EfiqrJ9z4HTn44czdh7 rJuQDYeDrfl36zc0n8weY7JSEsorCq+JBDomFUFodmCrFUIue2jXYOK6pt5LUrQM OTyruQMKHD316SnJFOK8Tkxi5DrAHNRs+ynDcm+IoB65KE9YgBcBWuEJ03mF9cHi 5sb/SBEqfL49gVlnFXBDTRgXXwA5axS7xKd4+7CWtbVFvJxurImjywGqKI5G/dmC TJyP+Dty4iNjFP1E0VvfL6ETovncZlfe4Hx1b971pll/ec88jGL0brqQMPjACrWh TyLXLN9oTbEKuDxx1Nh23xRFh+c= =sgtZ -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging * Thread Sanitizer fixes (Alex) * Coverity fixes (David) * test-qht fixes (Emilio) * QOM interface for info irq/info pic (Hervé) * -rtc clock=rt fix (Junlian) * mux chardev fixes (Marc-André) * nicer report on death by signal (Michal) * qemu-tech TLC (Paolo) * MSI support for edu device (Peter) * qemu-nbd --offset fix (Tomáš) # gpg: Signature made Fri 07 Oct 2016 17:25:10 BST # gpg: using RSA key 0xBFFBD25F78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: (39 commits) qemu-doc: merge qemu-tech and qemu-doc qemu-tech: rewrite some parts qemu-tech: reorganize content qemu-tech: move TCG test documentation to tests/tcg/README qemu-tech: move user mode emulation features from qemu-tech qemu-tech: document lazy condition code evaluation in cpu.h qemu-tech: move text from qemu-tech to tcg/README qemu-doc: drop installation and compilation notes qemu-doc: replace introduction with the one from the internals manual qemu-tech: drop index test-qht: perform lookups under rcu_read_lock qht: fix unlock-after-free segfault upon resizing qht: simplify qht_reset_size qemu-nbd: Shrink image size by specified offset qemu_kill_report: Report PID name too util: Introduce qemu_get_pid_name char: update read handler in all cases char: use a fixed idx for child muxed chr i8259: give ISA device when registering ISA ioports .travis.yml: add gcc sanitizer build ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2016-10-10 10:39:29 +01:00

1 2

93 Commits