qemu-e2k

Author	SHA1	Message	Date
Fam Zheng	ba9e75ceef	coroutine: Extract qemu_aio_coroutine_enter It's a variant of qemu_coroutine_enter with an explicit AioContext parameter. Signed-off-by: Fam Zheng <famz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>	2017-04-11 20:07:15 +08:00
Peter Maydell	87cc4c6102	* MemoryRegionCache revert * glib optimization workaround * fix "info lapic" segfault on isapc * fix QIOChannel memory leak -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQExBAABCAAbBQJY4oOMFBxwYm9uemluaUByZWRoYXQuY29tAAoJEL/70l94x66D AsIH/i52nJw41utJCs5AevnQyqNs9RnyMkZLHiVoi6a+pdJqX+0mCw8gV/5FsbPZ dtyt1tEuYBSu72adr+/ExE4aIEjwzeyRmnUdOkB+iYPxirHKuf4K/JTuLuvMtaQQ Tqj+FU5tx3wx0jlGOm5A7pzjZ680JUex+oaz3d1bZziv3zCyFCIgiZ2m2UAaaPQe fsd3fksJvc0gKOUKmdLUpu2m/xP3hAQAfQ4P/ozOfbVh9V2CVNaQ/cl935tNtdFK aYN3KleW3/ovb+YSexeNoW7QQH/3ZsjronCW5OmbF4FgHoeoV8MUROfNgu1S2bRU Bne9K/6boPzhD8NDEuSy8SXvf7s= =EdXr -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging * MemoryRegionCache revert * glib optimization workaround * fix "info lapic" segfault on isapc * fix QIOChannel memory leak # gpg: Signature made Mon 03 Apr 2017 18:17:00 BST # gpg: using RSA key 0xBFFBD25F78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: main-loop: Acquire main_context lock around os_host_main_loop_wait. exec: revert MemoryRegionCache nbd: fix memory leak on socket_connect failed ipmi: Fix macro issues target-i386: fix "info lapic" segfault on isapc iscsi: drop unused IscsiAIOCB.qiov field Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-04-04 11:40:55 +01:00
Richard W.M. Jones	ecbddbb106	main-loop: Acquire main_context lock around os_host_main_loop_wait. When running virt-rescue the serial console hangs from time to time. Virt-rescue runs an ordinary Linux kernel "appliance", but there is only a single idle process running inside, so the qemu main loop is largely idle. With virt-rescue >= 1.37 you may be able to observe the hang by doing: $ virt-rescue -e ^] --scratch ><rescue> while true; do ls -l /usr/bin; done The hang in virt-rescue can be resolved by pressing a key on the serial console. Possibly with the same root cause, we also observed hangs during very early boot of regular Linux VMs with a serial console. Those hangs are extremely rare, but you may be able to observe them by running this command on baremetal for a sufficiently long time: $ while libguestfs-test-tool -t 60 >& /tmp/log ; do echo -n . ; done (Check in /tmp/log that the failure was caused by a hang during early boot, and not some other reason) During investigation of this bug, Paolo Bonzini wrote: > glib is expecting QEMU to use g_main_context_acquire around accesses to > GMainContext. However QEMU is not doing that, instead it is taking its > own mutex. So we should add g_main_context_acquire and > g_main_context_release in the two implementations of > os_host_main_loop_wait; these should undo the effect of Frediano's > glib patch. This patch exactly implements Paolo's suggestion in that paragraph. This fixes the serial console hang in my testing, across 3 different physical machines (AMD, Intel Core i7 and Intel Xeon), over many hours of automated testing. I wasn't able to reproduce the early boot hangs (but as noted above, these are extremely rare in any case). Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1435432 Reported-by: Richard W.M. Jones <rjones@redhat.com> Tested-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Richard W.M. Jones <rjones@redhat.com> Message-Id: <20170331205133.23906-1-rjones@redhat.com> [Paolo: this is actually a glib bug: recent glib versions are also expecting g_main_context_acquire around g_poll---but that is not documented and probably not even intended]. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-04-03 19:13:12 +02:00
Markus Armbruster	216411b839	sockets: New helper socket_address_crumple() SocketAddress is a simple union, and simple unions are awkward: they have their variant members wrapped in a "data" object on the wire, and require additional indirections in C. I intend to limit its use to existing external interfaces. New ones should use SocketAddressFlat. I further intend to convert all internal interfaces to SocketAddressFlat. This helper should go away then. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-id: 1490895797-29094-8-git-send-email-armbru@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-04-03 17:11:39 +02:00
Markus Armbruster	a6c76285f2	io vnc sockets: Clean up SocketAddressKind switches We have quite a few switches over SocketAddressKind. Some have case labels for all enumeration values, others rely on a default label. Some abort when the value isn't a valid SocketAddressKind, others report an error then. Unify as follows. Always provide case labels for all enumeration values, to clarify intent. Abort when the value isn't a valid SocketAddressKind, because the program state is messed up then. Improve a few error messages while there. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 1490895797-29094-4-git-send-email-armbru@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-04-03 17:11:39 +02:00
Markus Armbruster	ca0b64e5ed	nbd sockets vnc: Mark problematic address family tests TODO Certain features make sense only with certain address families. For instance, passing file descriptors requires AF_UNIX. Testing SocketAddress's saddr->type == SOCKET_ADDRESS_KIND_UNIX is obvious, but problematic: it can't recognize AF_UNIX when type == SOCKET_ADDRESS_KIND_FD. Mark such tests of saddr->type TODO. We may want to check the address family with getsockname() there. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Gerd Hoffmann <kraxel@redhat.com> Cc: Daniel P. Berrange <berrange@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 1490895797-29094-2-git-send-email-armbru@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-04-03 17:11:39 +02:00
Halil Pasic	aa26292859	event_notifier: prevent accidental use after close Let's set the handles to the underlying facilities to their extremal value so no accidental misuse can happen, and to make it obvious that the notifier is dysfunctional. E.g. if we just close an fd but do not touch the int holding the fd eventually a read/write could succeed again when the fd gets reused, and corrupt the file addressed by the fd. Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2017-03-29 02:35:23 +03:00
Markus Armbruster	44fdc76455	sockets: Fix socket_address_to_string() hostname truncation We first snprintf() to a fixed buffer, then g_strdup() the result boggle. Worse, the size of the fixed buffer INET6_ADDRSTRLEN + 5 + 4 is bogus: the 4 correctly accounts for '[', ']', ':' and '\0', but INET6_ADDRSTRLEN is not a suitable limit for inet->host, and 5 is not one for inet->port! They are for host and port in numeric form (exploiting that INET6_ADDRSTRLEN > INET_ADDRSTRLEN), but inet->host can also be a hostname, and inet->port can be a service name, to be resolved with getaddrinfo(). Fortunately, the only user so far is the "socket" network backend's net_socket_connected(), which uses it to initialize a NetSocketState's info_str[]. info_str[] has considerable more space: 256 instead of 55. So the bug's impact appears to be limited to truncated "info networks" with the "socket" network backend. The fix is obvious: use g_strdup_printf(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1490268208-23368-1-git-send-email-armbru@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>	2017-03-28 18:50:38 +02:00
Andrey Shedel	12f8def0e0	win32: replace custom mutex and condition variable with native primitives The multithreaded TCG implementation exposed deadlocks in the win32 condition variables: as implemented, qemu_cond_broadcast waited on receivers, whereas the pthreads API it was intended to emulate does not. This was causing a deadlock because broadcast was called while holding the IO lock, as well as all possible waiters blocked on the same lock. This patch replaces all the custom synchronisation code for mutexes and condition variables with native Windows primitives (SRWlocks and condition variables) with the same semantics as their POSIX equivalents. To enable that, it requires a Windows Vista or newer host OS. Signed-off-by: Andrey Shedel <ashedel@microsoft.com> [AB: edited commit message] Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com> Message-Id: <20170324220141.10104-1-Andrew.Baumann@microsoft.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-03-27 14:41:01 +02:00
Jitendra Kolhe	dfd0dcc717	mem-prealloc: fix sysconf(_SC_NPROCESSORS_ONLN) failure case. This was spotted by Coverity, in case where sysconf(_SC_NPROCESSORS_ONLN) fails and returns -1. This results in memset_num_threads getting set to -1. Which we then pass to g_new0(). The patch replaces MAX_MEM_PREALLOC_THREAD_COUNT macro with a function call get_memset_num_threads() to handle sysconf() failure gracefully. In case sysconf() fails, we fall back to single threaded. (Spotted by Coverity, CID 1372465.) Signed-off-by: Jitendra Kolhe <jitendra.kolhe@hpe.com> Message-Id: <1490079006-32495-1-git-send-email-jitendra.kolhe@hpe.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-03-24 11:50:11 +01:00
Markus Armbruster	0ee9ae7c8c	keyval: Document issues with 'any' and alternate types Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1490014548-15083-5-git-send-email-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2017-03-21 10:42:09 +01:00
Markus Armbruster	fae425d74f	keyval: Improve some comments Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1490014548-15083-3-git-send-email-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2017-03-21 10:41:54 +01:00
Paolo Bonzini	53fabd4b86	qemu-ga: obey LISTEN_PID when using systemd socket activation qemu-ga's socket activation support was not obeying the LISTEN_PID environment variable, which avoids that a process uses a socket-activation file descriptor meant for its parent. Mess can for example ensue if a process forks a children before consuming the socket-activation file descriptor and therefore setting O_CLOEXEC on it. Luckily, qemu-nbd also got socket activation code, and its copy does support LISTEN_PID. Some extra fixups are needed to ensure that the code can be used for both, but that's what this patch does. The main change is to replace get_listen_fds's "consume" argument with the FIRST_SOCKET_ACTIVATION_FD macro from the qemu-nbd code. Cc: "Richard W.M. Jones" <rjones@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Daniel P. Berrange <berrange@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-03-19 11:12:12 +01:00
Paolo Bonzini	6b3cca76dd	oslib-posix: fix compilation on OpenBSD si_band is not found in OpenBSD. It is marked as obsolescent in POSIX, so we can delete it without any remorse. Reported-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 20170317152214.6148-1-pbonzini@redhat.com Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Tested-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-03-17 18:27:49 +00:00
Peter Lieven	b7a745dc33	thread-pool: add missing qemu_bh_cancel in completion function commit `3c80ca15` fixed a deadlock scenarion with nested aio_poll invocations. However, the rescheduling of the completion BH introcuded unnecessary spinning in the main-loop. On very fast file backends this can even lead to the "WARNING: I/O thread spun for 1000 iterations" message popping up. Callgrind reports about 3-4% less instructions with this patch running qemu-img bench on a ramdisk based VMDK file. Fixes: `3c80ca158c` Cc: qemu-stable@nongnu.org Signed-off-by: Peter Lieven <pl@kamp.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-03-17 12:54:21 +01:00
Daniel P. Berrange	9dc44aa582	os: don't corrupt pre-existing memory-backend data with prealloc When using a memory-backend object with prealloc turned on, QEMU will memset() the first byte in every memory page to zero. While this might have been acceptable for memory backends associated with RAM, this corrupts application data for NVDIMMs. Instead of setting every page to zero, read the current byte value and then just write that same value back, so we are not corrupting the original data. Directly write the value instead of memset()ing it, since there's no benefit to memset for a single byte write. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Andrea Arcangeli <aarcange@redhat.com> Message-id: 20170303113255.28262-1-berrange@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-03-15 11:55:41 +08:00
Paolo Bonzini	6b8f0187a4	icount: process QEMU_CLOCK_VIRTUAL timers in vCPU thread icount has become much slower after tcg_cpu_exec has stopped using the BQL. There is also a latent bug that is masked by the slowness. The slowness happens because every occurrence of a QEMU_CLOCK_VIRTUAL timer now has to wake up the I/O thread and wait for it. The rendez-vous is mediated by the BQL QemuMutex: - handle_icount_deadline wakes up the I/O thread with BQL taken - the I/O thread wakes up and waits on the BQL - the VCPU thread releases the BQL a little later - the I/O thread raises an interrupt, which calls qemu_cpu_kick - the VCPU thread notices the interrupt, takes the BQL to process it and waits on it All this back and forth is extremely expensive, causing a 6 to 8-fold slowdown when icount is turned on. One may think that the issue is that the VCPU thread is too dependent on the BQL, but then the latent bug comes in. I first tried removing the BQL completely from the x86 cpu_exec, only to see everything break. The only way to fix it (and make everything slow again) was to add a dummy BQL lock/unlock pair. This is because in -icount mode you really have to process the events before the CPU restarts executing the next instruction. Therefore, this series moves the processing of QEMU_CLOCK_VIRTUAL timers straight in the vCPU thread when running in icount mode. The required changes include: - make the timer notification callback wake up TCG's single vCPU thread when run from another thread. By using async_run_on_cpu, the callback can override all_cpu_threads_idle() when the CPU is halted. - move handle_icount_deadline after qemu_tcg_wait_io_event, so that the timer notification callback is invoked after the dummy work item wakes up the vCPU thread - make handle_icount_deadline run the timers instead of just waking the I/O thread. - stop processing the timers in the main loop Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-03-14 13:51:34 +01:00
Paolo Bonzini	3f53bc61a4	cpus: define QEMUTimerListNotifyCB for QEMU system emulation There is no change for now, because the callback just invokes qemu_notify_event. Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-03-14 13:28:29 +01:00
Paolo Bonzini	d2528bdc19	qemu-timer: do not include sysemu/cpus.h from util/qemu-timer.h This dependency is the wrong way, and we will need util/qemu-timer.h from sysemu/cpus.h in the next patch. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-03-14 13:28:18 +01:00
Paolo Bonzini	33bef0b994	qemu-timer: fix off-by-one If the first timer is exactly at the current value of the clock, the deadline is met and the timer should fire. This fixes itself on the next iteration of the loop without icount; with icount, however, execution of instructions will stop exactly at the deadline and won't proceed. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-03-14 13:26:42 +01:00
Suramya Shah	bd5d983fa8	util: Removed unneeded header from path.c Signed-off-by: Suramya Shah <shah.suramya@gmail.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20170310163948.7567-1-shah.suramya@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-03-14 13:26:37 +01:00
Jitendra Kolhe	1e356fc14b	mem-prealloc: reduce large guest start-up and migration time. Using "-mem-prealloc" option for a large guest leads to higher guest start-up and migration time. This is because with "-mem-prealloc" option qemu tries to map every guest page (create address translations), and make sure the pages are available during runtime. virsh/libvirt by default, seems to use "-mem-prealloc" option in case the guest is configured to use huge pages. The patch tries to map all guest pages simultaneously by spawning multiple threads. Currently limiting the change to QEMU library functions on POSIX compliant host only, as we are not sure if the problem exists on win32. Below are some stats with "-mem-prealloc" option for guest configured to use huge pages. ------------------------------------------------------------------------ Idle Guest \| Start-up time \| Migration time ------------------------------------------------------------------------ Guest stats with 2M HugePage usage - single threaded (existing code) ------------------------------------------------------------------------ 64 Core - 4TB \| 54m11.796s \| 75m43.843s 64 Core - 1TB \| 8m56.576s \| 14m29.049s 64 Core - 256GB \| 2m11.245s \| 3m26.598s ------------------------------------------------------------------------ Guest stats with 2M HugePage usage - map guest pages using 8 threads ------------------------------------------------------------------------ 64 Core - 4TB \| 5m1.027s \| 34m10.565s 64 Core - 1TB \| 1m10.366s \| 8m28.188s 64 Core - 256GB \| 0m19.040s \| 2m10.148s ----------------------------------------------------------------------- Guest stats with 2M HugePage usage - map guest pages using 16 threads ----------------------------------------------------------------------- 64 Core - 4TB \| 1m58.970s \| 31m43.400s 64 Core - 1TB \| 0m39.885s \| 7m55.289s 64 Core - 256GB \| 0m11.960s \| 2m0.135s ----------------------------------------------------------------------- Changed in v2: - modify number of memset threads spawned to min(smp_cpus, 16). - removed 64GB memory restriction for spawning memset threads. Changed in v3: - limit number of threads spawned based on min(sysconf(_SC_NPROCESSORS_ONLN), 16, smp_cpus) - implement memset thread specific siglongjmp in SIGBUS signal_handler. Changed in v4 - remove sigsetjmp/siglongjmp and SIGBUS unblock/block for main thread as main thread no longer touches any pages. - simplify code my returning memset_thread_failed status from touch_all_pages. Signed-off-by: Jitendra Kolhe <jitendra.kolhe@hpe.com> Message-Id: <1487907103-32350-1-git-send-email-jitendra.kolhe@hpe.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-03-14 13:26:36 +01:00
Markus Armbruster	0b2c1beea4	keyval: Support lists Additionally permit non-negative integers as key components. A dictionary's keys must either be all integers or none. If all keys are integers, convert the dictionary to a list. The set of keys must be [0,N]. Examples: * list.1=goner,list.0=null,list.1=eins,list.2=zwei is equivalent to JSON [ "null", "eins", "zwei" ] * a.b.c=1,a.b.0=2 is inconsistent: a.b.c clashes with a.b.0 * list.0=null,list.2=eins,list.2=zwei has a hole: list.1 is missing Similar design flaw as for objects: there is no way to denote an empty list. While interpreting "key absent" as empty list seems natural (removing a list member from the input string works when there are multiple ones, so why not when there's just one), it doesn't work: "key absent" already means "optional list absent", which isn't the same as "empty list present". Update the keyval object visitor to use this a.0 syntax in error messages rather than the usual a[0]. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1488317230-26248-25-git-send-email-armbru@redhat.com> [Off-by-one fix squashed in, as per Kevin's review] Reviewed-by: Kevin Wolf <kwolf@redhat.com>	2017-03-07 16:07:48 +01:00
Markus Armbruster	f740048323	keyval: Restrict key components to valid QAPI names Until now, key components are separated by '.'. This leaves little room for evolving the syntax, and is incompatible with the __RFQDN_ prefix convention for downstream extensions. Since key components will be commonly used as QAPI member names by the QObject input visitor, we can just as well borrow the QAPI naming rules here: letters, digits, hyphen and period starting with a letter, with an optional __RFQDN_ prefix for downstream extensions. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <1488317230-26248-20-git-send-email-armbru@redhat.com>	2017-03-07 16:07:47 +01:00
Markus Armbruster	d454dbe0ee	keyval: New keyval_parse() keyval_parse() parses KEY=VALUE,... into a QDict. Works like qemu_opts_parse(), except: * Returns a QDict instead of a QemuOpts (d'oh). * Supports nesting, unlike QemuOpts: a KEY is split into key fragments at '.' (dotted key convention; the block layer does something similar on top of QemuOpts). The key fragments are QDict keys, and the last one's value is updated to VALUE. * Each key fragment may be up to 127 bytes long. qemu_opts_parse() limits the entire key to 127 bytes. * Overlong key fragments are rejected. qemu_opts_parse() silently truncates them. * Empty key fragments are rejected. qemu_opts_parse() happily accepts empty keys. * It does not store the returned value. qemu_opts_parse() stores it in the QemuOptsList. * It does not treat parameter "id" specially. qemu_opts_parse() ignores all but the first "id", and fails when its value isn't id_wellformed(), or duplicate (a QemuOpts with the same ID is already stored). It also screws up when a value contains ",id=". * Implied value is not supported. qemu_opts_parse() desugars "foo" to "foo=on", and "nofoo" to "foo=off". * An implied key's value can't be empty, and can't contain ','. I intend to grow this into a saner replacement for QemuOpts. It'll take time, though. Note: keyval_parse() provides no way to do lists, and its key syntax is incompatible with the __RFQDN_ prefix convention for downstream extensions, because it blindly splits at '.', even in __RFQDN_. Both issues will be addressed later in the series. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1488317230-26248-4-git-send-email-armbru@redhat.com>	2017-03-07 16:07:46 +01:00
Peter Maydell	17783ac828	ppc patch queuye for 2017-03-03 This will probably be my last pull request before the hard freeze. It has some new work, but that has all been posted in draft before the soft freeze, so I think it's reasonable to include in qemu-2.9. This batch has: * A substantial amount of POWER9 work * Implements the legacy (hash) MMU for POWER9 * Some more preliminaries for implementing the POWER9 radix MMU * POWER9 has_work * Basic POWER9 compatibility mode handling * Removal of some premature tests * Some cleanups and fixes to the existing MMU code to make the POWER9 work simpler * A bugfix for TCG multiply adds on power * Allow pseries guests to access PCIe extended config space This also includes a code-motion not strictly in ppc code - moving getrampagesize() from ppc code to exec.c. This will make some future VFIO improvements easier, Paolo said it was ok to merge via my tree. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCAAGBQJYuOEEAAoJEGw4ysog2bOSHD0P/jBg/qr/4KnsB1KhnlVrB2sP vy2d3bGGlUWr9Z+CK/PMCRB8ekFgQLjidLIXji6mviUocv6m3WsVrnbLF/oOL/IT NPMVAffw7q804YVu1Ns9R82d6CIqHTy//bpg69tFMcJmhL9fqPan3wTZZ9JeiyAm SikqkAHBSW4SxKqg8ApaSqx5L2QTqyfkClR0sLmgM0JtmfJrbobpQ6bMtdPjUZ9L n2gnpO2vaWCa1SEQrRrdELqvcD8PHkSJapWOBXOkpGWxoeov/PYxOgkpdDUW4qYY lVLtp1Vd3OB/h3Unqfw32DNiHA5p89hWPX5UybKMgRVL9Cv2/lyY47pcY8XTeNzn bv84YRbFJeI+GgoEnghmtq+IM8XiW/cr9rWm9wATKfKGcmmFauumALrsffUpHVCM 4hSNgBv5t2V9ptZ+MDlM/Ku+zk9GoqwQ+hemdpVtiyhOtGUPGFBn5YLE4c2DHFxV +L9JtBnFn8obnssNoz0wL+QvZchT1qUHMhH5CWAanjw9CTDp/YwQ2P01zK+00s9d 4cB7fUG3WNto5eXXEGMaXeDsUEz8z//hTe3j5sVbnHsXi0R3dhv7iryifmx4bUKU H9EwAc+uNUHbvBy7u6IWg0I8P2n00CCO6JqXijQ92zELJ5j0XhzHUI2dOXn+zyEo 3FZu56LFnSSUBEXuTjq4 =PcNw -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.9-20170303' into staging ppc patch queuye for 2017-03-03 This will probably be my last pull request before the hard freeze. It has some new work, but that has all been posted in draft before the soft freeze, so I think it's reasonable to include in qemu-2.9. This batch has: * A substantial amount of POWER9 work * Implements the legacy (hash) MMU for POWER9 * Some more preliminaries for implementing the POWER9 radix MMU * POWER9 has_work * Basic POWER9 compatibility mode handling * Removal of some premature tests * Some cleanups and fixes to the existing MMU code to make the POWER9 work simpler * A bugfix for TCG multiply adds on power * Allow pseries guests to access PCIe extended config space This also includes a code-motion not strictly in ppc code - moving getrampagesize() from ppc code to exec.c. This will make some future VFIO improvements easier, Paolo said it was ok to merge via my tree. # gpg: Signature made Fri 03 Mar 2017 03:20:36 GMT # gpg: using RSA key 0x6C38CACA20D9B392 # gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" # gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" # gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" # gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" # Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392 * remotes/dgibson/tags/ppc-for-2.9-20170303: target/ppc: rewrite f[n]m[add,sub] using float64_muladd spapr: Small cleanup of PPC MMU enums spapr_pci: Advertise access to PCIe extended config space target/ppc: Rework hash mmu page fault code and add defines for clarity target/ppc: Move no-execute and guarded page checking into new function target/ppc: Add execute permission checking to access authority check target/ppc: Add Instruction Authority Mask Register Check hw/ppc/spapr: Add POWER9 to pseries cpu models target/ppc/POWER9: Add cpu_has_work function for POWER9 target/ppc/POWER9: Add POWER9 pa-features definition target/ppc/POWER9: Add POWER9 mmu fault handler target/ppc: Don't gen an SDR1 on POWER9 and rework register creation target/ppc: Add patb_entry to sPAPRMachineState target/ppc/POWER9: Add POWERPC_MMU_V3 bit powernv: Don't test POWER9 CPU yet exec, kvm, target-ppc: Move getrampagesize() to common code target/ppc: Add POWER9/ISAv3.00 to compat_table Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-03-04 16:31:14 +00:00
Paolo Bonzini	d98d407234	cpus: remove ugly cast on sigbus_handler The cast is there because sigbus_handler is invoked via sigfd_handler. But it feels just wrong to use struct qemu_signalfd_siginfo in the prototype of a function that is passed to sigaction. Instead, do a simple-minded conversion of qemu_signalfd_siginfo to siginfo_t. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-03-03 16:40:02 +01:00
Alexey Kardashevskiy	9c60766887	exec, kvm, target-ppc: Move getrampagesize() to common code getrampagesize() returns the largest supported page size and mainly used to know if huge pages are enabled. However is implemented in target-ppc/kvm.c and not available in TCG or other architectures. This renames and moves gethugepagesize() to mmap-alloc.c where fd-based analog of it is already implemented. This renames and moves getrampagesize() to exec.c as it seems to be the common place for helpers like this. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-03-03 11:30:59 +11:00
Peter Maydell	c9fc677a35	-----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJYto49AAoJENro4Ql1lpzljb8P/RutuV92h/e/RvIQ00gIE1Tk Tdma/9HpobMtutScp7xCbSjDCaFBHtk8YFmdbPwcwLpNIg9RGUs5cCTVEPvT8nq3 XRrUFtCzZua4U4CKLOAbo53bDN2x36r6rHSpX7FaYvC51o47MNFS2fpKJ5R8Rg4s v9i9zM6bWGYzNbBnK4XDi251/7r8yK2g3okBEZao5TDwBUCtimbwfXn9E/iB21XD HJJRZGIkgCFXebCbPzLfob8NxQi0whuJW6N9XscJOy1AKDTfGT4QUsUUkPUZNRVe lbGjoGJnq5XYMO+9OWJzaSf6ecFFQT/Seu5P6xLb3SRgdT9IXISouPt9sOsYlM7R U/Sw34gPKb2979gASLs2NPbrTPHBQp0hdL8E01LRFjdGVHuvgZcGr0OdxpbUSkEf nJgtBFpLVQdL+qtdQtxgSBZz0/hWMIRcXB++MCPfpWpP9uBUErr4/XQN2rfqLLUM UgA/doDMX7Hc9m81mZ+h+Pji+9O8zWjt7NKJXgPeZPsUsMbUe05VzAl3Fp4khEWM 05hrFB5biQguXkHXBIKO/fWwaftRaatqy0qAvXaiKidcRzQjF9ohlZgWd/CJyAWz V4GocO2oZL3rxQXwW6kh3HKlMCWUj563FhT4V5eN8VQE5PWLQQ+ObBkRHWfyayRE eEFFYApPQRVibAo9U7mC =09RR -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/elmarco/tags/leak-pull-request' into staging # gpg: Signature made Wed 01 Mar 2017 09:02:53 GMT # gpg: using RSA key 0xDAE8E10975969CE5 # gpg: Good signature from "Marc-André Lureau <marcandre.lureau@redhat.com>" # gpg: aka "Marc-André Lureau <marcandre.lureau@gmail.com>" # gpg: WARNING: This key is not certified with sufficiently trusted signatures! # gpg: It is not certain that the signature belongs to the owner. # Primary key fingerprint: 87A9 BD93 3F87 C606 D276 F62D DAE8 E109 7596 9CE5 * remotes/elmarco/tags/leak-pull-request: (28 commits) tests: fix virtio-blk-test leaks tests: add specialized device_find function tests: fix usb-test leaks tests: allows to run single test in usb-hcd-ehci-test usb: release the created buses bus: do not unref hotplug handler tests: fix virtio-9p-test leaks tests: fix virtio-scsi-test leak tests: fix e1000e leaks tests: fix i440fx-test leaks tests: fix e1000-test leak tests: fix tco-test leaks tests: fix eepro100-test leak pc: pcihp: avoid adding ACPI_PCIHP_PROP_BSEL twice tests: fix ipmi-bt-test leak tests: fix ipmi-kcs-test leak tests: fix bios-tables-test leak tests: fix hd-geo-test leaks tests: fix ide-test leaks tests: fix vhost-user-test leaks ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-03-02 15:25:37 +00:00
Marc-André Lureau	e703dcbaee	timer: use an inline function for free Similarly to allocation, do it from an inline function. This allows tests to only use the headers for allocation/free of timer. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>	2017-03-01 00:09:28 +04:00
Markus Armbruster	9e19ad4e49	option: Tweak invalid size error message and unbreak iotest 049 Commit `75cdcd1` neglected to update tests/qemu-iotests/049.out, and made the error message for negative size worse. Fix that. Reported-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-02-28 20:40:31 +01:00
Markus Armbruster	75cdcd1553	option: Fix checking of sizes for overflow and trailing crap parse_option_size()'s checking for overflow and trailing crap is wrong. Has always been that way. qemu_strtosz() gets it right, so use that. This adds support for size suffixes 'P', 'E', and ignores case for all suffixes, not just 'k'. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <1487708048-2131-25-git-send-email-armbru@redhat.com>	2017-02-23 20:35:36 +01:00
Markus Armbruster	f46bfdbfc8	util/cutils: Change qemu_strtosz*() from int64_t to uint64_t This will permit its use in parse_option_size(). Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> (maintainer:X86) Cc: Kevin Wolf <kwolf@redhat.com> (supporter:Block layer core) Cc: Max Reitz <mreitz@redhat.com> (supporter:Block layer core) Cc: qemu-block@nongnu.org (open list:Block layer core) Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <1487708048-2131-24-git-send-email-armbru@redhat.com>	2017-02-23 20:35:36 +01:00
Markus Armbruster	f17fd4fdf0	util/cutils: Return qemu_strtosz*() error and value separately This makes qemu_strtosz(), qemu_strtosz_mebi() and qemu_strtosz_metric() similar to qemu_strtoi64(), except negative values are rejected. Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> (maintainer:X86) Cc: Kevin Wolf <kwolf@redhat.com> (supporter:Block layer core) Cc: Max Reitz <mreitz@redhat.com> (supporter:Block layer core) Cc: qemu-block@nongnu.org (open list:Block layer core) Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <1487708048-2131-23-git-send-email-armbru@redhat.com>	2017-02-23 20:35:36 +01:00
Markus Armbruster	4fcdf65ae2	util/cutils: Let qemu_strtosz*() optionally reject trailing crap Change the qemu_strtosz() & friends to return -EINVAL when @endptr is null and the conversion doesn't consume the string completely. Matches how qemu_strtol() & friends work. Only test_qemu_strtosz_simple() passes a null @endptr. No functional change there, because its conversion consumes the string. Simplify callers that use @endptr only to fail when it doesn't point to '\0' to pass a null @endptr instead. Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> (maintainer:X86) Cc: Kevin Wolf <kwolf@redhat.com> (supporter:Block layer core) Cc: Max Reitz <mreitz@redhat.com> (supporter:Block layer core) Cc: qemu-block@nongnu.org (open list:Block layer core) Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <1487708048-2131-22-git-send-email-armbru@redhat.com>	2017-02-23 20:35:36 +01:00
Markus Armbruster	17f942560e	util/cutils: Drop QEMU_STRTOSZ_DEFSUFFIX_* macros Writing QEMU_STRTOSZ_DEFSUFFIX_* instead of '*' gains nothing. Get rid of these eyesores. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <1487708048-2131-18-git-send-email-armbru@redhat.com>	2017-02-23 20:35:36 +01:00
Markus Armbruster	466dea14e6	util/cutils: New qemu_strtosz() Most callers of qemu_strtosz_suffix() pass QEMU_STRTOSZ_DEFSUFFIX_B. Capture the pattern in new qemu_strtosz(). Inline qemu_strtosz_suffix() into its only remaining caller. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <1487708048-2131-17-git-send-email-armbru@redhat.com>	2017-02-23 20:35:36 +01:00
Markus Armbruster	e591591b32	util/cutils: Rename qemu_strtosz() to qemu_strtosz_MiB() With qemu_strtosz(), no suffix means mebibytes. It's used rarely. I'm going to add a similar function where no suffix means bytes. Rename qemu_strtosz() to qemu_strtosz_MiB() to make the name qemu_strtosz() available for the new function. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <1487708048-2131-16-git-send-email-armbru@redhat.com>	2017-02-23 20:35:36 +01:00
Markus Armbruster	d2734d2629	util/cutils: New qemu_strtosz_metric() To parse numbers with metric suffixes, we use qemu_strtosz_suffix_unit(nptr, &eptr, QEMU_STRTOSZ_DEFSUFFIX_B, 1000) Capture this in a new function for legibility: qemu_strtosz_metric(nptr, &eptr) Replace test_qemu_strtosz_suffix_unit() by test_qemu_strtosz_metric(). Rename qemu_strtosz_suffix_unit() to do_strtosz() and give it internal linkage. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <1487708048-2131-15-git-send-email-armbru@redhat.com>	2017-02-23 20:35:36 +01:00
Markus Armbruster	3403e5eb88	option: Fix to reject invalid and overflowing numbers parse_option_number() fails to check for these errors after strtoull(). Has always been broken. Fix that. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <1487708048-2131-10-git-send-email-armbru@redhat.com>	2017-02-23 20:35:35 +01:00
Markus Armbruster	4baef2679e	util/cutils: Clean up control flow around qemu_strtol() a bit Reorder check_strtox_error() to make it obvious that we always store through a non-null @endptr. Transform if (some error) { error case ... err = value for error case; } else { normal case ... err = value for normal case; } return err; to if (some error) { error case ... return value for error case; } normal case ... return value for normal case; Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <1487708048-2131-9-git-send-email-armbru@redhat.com>	2017-02-23 20:35:35 +01:00
Markus Armbruster	717adf9609	util/cutils: Clean up variable names around qemu_strtol() Name same things the same, different things differently. * qemu_strtol()'s parameter @nptr is called @p in check_strtox_error(). Rename the latter. * qemu_strtol()'s parameter @endptr is called @next in check_strtox_error(). Rename the latter. * qemu_strtol()'s variable @p is called @endptr in check_strtox_error(). Rename both to @ep. * qemu_strtol()'s variable @err is negative errno, check_strtox_error()'s parameter @err is positive. Rename the latter to @libc_errno. Same for qemu_strtoul(), qemu_strtoi64(), qemu_strtou64(), of course. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <1487708048-2131-8-git-send-email-armbru@redhat.com>	2017-02-23 20:35:35 +01:00
Markus Armbruster	b30d188677	util/cutils: Rename qemu_strtoll(), qemu_strtoull() The name qemu_strtoll() suggests conversion to long long, but it actually converts to int64_t. Rename to qemu_strtoi64(). The name qemu_strtoull() suggests conversion to unsigned long long, but it actually converts to uint64_t. Rename to qemu_strtou64(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <1487708048-2131-7-git-send-email-armbru@redhat.com>	2017-02-23 20:35:35 +01:00
Markus Armbruster	4295f879be	util/cutils: Rewrite documentation of qemu_strtol() & friends Fixes the following documentation bugs: * Fails to document that null @nptr is safe. * Fails to document that we return -EINVAL when no conversion could be performed (commit `47d4be1`). * Confuses long long with int64_t, and unsigned long long with uint64_t. * Claims the unsigned conversions can underflow. They can't. While there, mark problematic assumptions that int64_t is long long, and uint64_t is unsigned long long with FIXME comments. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <1487708048-2131-6-git-send-email-armbru@redhat.com>	2017-02-23 20:35:35 +01:00
Markus Armbruster	8ee8409eff	option: Assert value string isn't null Plenty of code relies on QemuOpt member @str not being null, including qemu_opts_print(), qemu_opts_to_qdict(), and callbacks passed to qemu_opt_foreach(). Begs the question whether it can be null. Only opt_set() creates QemuOpt. It sets member @str to its argument @value. Passing null for @value would plant a time bomb. Callers: * opts_do_parse() can't pass null. * qemu_opt_set() passes its argument @value. Callers: - qemu_opts_from_qdict_1() can't pass null - qemu_opts_set() passes its argument @value, but none of its callers pass null. - Many more outside qemu-option.c, but they shouldn't pass null, either. Assert member @str isn't null, so that misuse is caught right away. Simplify parse_option_bool(), parse_option_number() and parse_option_size() accordingly. Best viewed with whitespace changes ignored. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <1487708048-2131-3-git-send-email-armbru@redhat.com>	2017-02-23 20:35:35 +01:00
Paolo Bonzini	a7b91d35ba	coroutine-lock: make CoRwlock thread-safe and fair This adds a CoMutex around the existing CoQueue. Because the write-side can just take CoMutex, the old "writer" field is not necessary anymore. Instead of removing it altogether, count the number of pending writers during a read-side critical section and forbid further readers from entering. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 20170213181244.16297-7-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-02-21 11:39:40 +00:00
Paolo Bonzini	1ace7ceac5	coroutine-lock: add mutex argument to CoQueue APIs All that CoQueue needs in order to become thread-safe is help from an external mutex. Add this to the API. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 20170213181244.16297-6-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-02-21 11:39:40 +00:00
Paolo Bonzini	480cff6322	coroutine-lock: add limited spinning to CoMutex Running a very small critical section on pthread_mutex_t and CoMutex shows that pthread_mutex_t is much faster because it doesn't actually go to sleep. What happens is that the critical section is shorter than the latency of entering the kernel and thus FUTEX_WAIT always fails. With CoMutex there is no such latency but you still want to avoid wait and wakeup. So introduce it artificially. This only works with one waiters; because CoMutex is fair, it will always have more waits and wakeups than a pthread_mutex_t. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 20170213181244.16297-3-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-02-21 11:39:40 +00:00
Paolo Bonzini	fed20a70e3	coroutine-lock: make CoMutex thread-safe This uses the lock-free mutex described in the paper '"Blocking without Locking", or LFTHREADS: A lock-free thread library' by Gidenstam and Papatriantafilou. The same technique is used in OSv, and in fact the code is essentially a conversion to C of OSv's code. [Added missing coroutine_fn in tests/test-aio-multithread.c. --Stefan] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 20170213181244.16297-2-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-02-21 11:39:40 +00:00
Paolo Bonzini	bd451435c0	async: remove unnecessary inc/dec pairs Pull the increment/decrement pair out of aio_bh_poll and into the callers. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 20170213135235.12274-18-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-02-21 11:39:40 +00:00

1 2 3 4 5 ...

633 Commits