Signals are slow and do not exist on Win32. The previous patches
have done most of the legwork to introduce memory barriers (some
of them were even there already for the sake of Windows!) and
we can now set the flags directly in the iothread.
qemu_cpu_kick_thread is not used anymore on TCG, since the TCG thread is
never outside usermode while the CPU is running (not halted). Instead run
the content of the signal handler (now in qemu_cpu_kick_no_halt) directly.
qemu_cpu_kick_no_halt is also used in qemu_mutex_lock_iothread to avoid
the overhead of qemu_cond_broadcast.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use the same API to trigger interruption of a CPU, no matter if
under TCG or KVM. There is no difference: these calls come from
the CPU thread, so the qemu_cpu_kick calls will send a signal
to the running thread and it will be processed synchronously,
just like a call to cpu_exit. The only difference is in the
overhead, but neither call to cpu_exit (now qemu_cpu_kick)
is in a hot path.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Synchronize the remaining pair of accesses in cpu_signal. These should
be necessary on Windows as well, at least in theory. Probably
SuspendProcess and ResumeProcess introduce some implicit memory
barrier.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This is already useful on Windows in order to remove tls.h, because
accesses to current_cpu are done from a different thread on that
platform. It will be used on POSIX platforms as soon TCG stops using
signals to interrupt the execution of translated code.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When QEMU starts the RCU thread executes qemu_mutex_lock_thread
causing error "qemu:qemu_cpu_kick_thread: No such process" and exits.
This isn't occur frequently but in glibc the thread id can exist and
this not guarantee that the thread is on active/running state. If is
inserted a sleep(1) after newthread assignment [1] the issue appears.
So not make assumption that thread exist if first_cpu->thread is set
then change the validation of cpu to created that is set into cpu
threads (kvm, tcg, dummy).
[1] https://sourceware.org/git/?p=glibc.git;a=blob;f=nptl/pthread_create.c;h=d10f4ea8004e1d8f3a268b95cc0f8d93b8d89867;hb=HEAD#l621
Cc: qemu-stable@nongnu.org
Signed-off-by: Aníbal Limón <anibal.limon@linux.intel.com>
Message-Id: <1441313313-3040-1-git-send-email-anibal.limon@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
After commit 626cf8f (icount: set can_do_io outside TB execution,
2014-12-08), can_do_io is set to 1 if not executing code. It is
no longer necessary to make this assumption in cpu_can_do_io.
It is also possible to remove the use_icount test, simply by
never setting cpu->can_do_io to 0 unless use_icount is true.
With these changes cpu_can_do_io boils down to a read of
cpu->can_do_io.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Remove un-needed usages of ENV_GET_CPU() by converting the APIs to use
CPUState pointers and retrieving the env_ptr as minimally needed.
Scripted conversion for target-* change:
for I in target-*/cpu.h; do
sed -i \
's/\(^int cpu_[^_]*_exec(\)[^ ][^ ]* \*s);$/\1CPUState *cpu);/' \
$I;
done
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
The sole caller of this function navigates the cpu->env_ptr only for
this function to take it back the cpu pointer straight away. Pass in
cpu pointer instead and grab the env pointer locally in the function.
Removes a core code usage of ENV_GET_CPU().
Reviewed-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
This function will be used to avoid recursive locking of the iothread lock
whenever address_space_rw/ld*/st* are called with the BQL held, which is
almost always the case.
Tracking whether the iothread is owned is very cheap (just use a TLS
variable) but requires some care because now the lock must always be
taken with qemu_mutex_lock_iothread(). Previously this wasn't the case.
Outside TCG mode this is not a problem. In TCG mode, we need to be
careful and avoid the "prod out of compiled code" step if already
in a VCPU thread. This is easily done with a check on current_cpu,
i.e. qemu_in_vcpu_thread().
Hopefully, multithreaded TCG will get rid of the whole logic to kick
VCPUs whenever an I/O event occurs!
Cc: Frederic Konrad <fred.konrad@greensocs.com>
Message-Id: <1434646046-27150-3-git-send-email-pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The next patch will require the BQL to be always taken with
qemu_mutex_lock_iothread(), while right now this isn't the case.
Outside TCG mode this is not a problem. In TCG mode, we need to be
careful and avoid the "prod out of compiled code" step if already
in a VCPU thread. This is easily done with a check on current_cpu,
i.e. qemu_in_vcpu_thread().
Hopefully, multithreaded TCG will get rid of the whole logic to kick
VCPUs whenever an I/O event occurs!
Cc: Frederic Konrad <fred.konrad@greensocs.com>
Message-Id: <1434646046-27150-2-git-send-email-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These macros expand into error class enumeration constant, comma,
string. Unclean. Has been that way since commit 13f59ae.
The error class is always ERROR_CLASS_GENERIC_ERROR since the previous
commit.
Clean up as follows:
* Prepend every use of a QERR_ macro by ERROR_CLASS_GENERIC_ERROR, and
delete it from the QERR_ macro. No change after preprocessing.
* Rewrite error_set(ERROR_CLASS_GENERIC_ERROR, ...) into
error_setg(...). Again, no change after preprocessing.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
We create optional sections with this patch. But we already have
optional subsections. Instead of having two mechanism that do the
same, we can just generalize it.
For subsections we just change:
- Add a needed function to VMStateDescription
- Remove VMStateSubsection (after removal of the needed function
it is just a VMStateDescription)
- Adjust the whole tree, moving the needed function to the corresponding
VMStateDescription
Signed-off-by: Juan Quintela <quintela@redhat.com>
While qemu is running in sleep=no mode, a warning will be printed
when no timer deadline is set.
As this mode is intended for getting deterministic virtual time, if no
timer is set on the virtual clock this determinism is broken.
Signed-off-by: Victor CLEMENT <victor.clement@openwide.fr>
Message-Id: <1432912446-9811-4-git-send-email-victor.clement@openwide.fr>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The 'sleep' parameter sets the icount_sleep mode, which is enabled by
default. To disable it, add the 'sleep=no' parameter (or 'nosleep') to the
qemu -icount option.
Signed-off-by: Victor CLEMENT <victor.clement@openwide.fr>
Message-Id: <1432912446-9811-3-git-send-email-victor.clement@openwide.fr>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When the icount_sleep mode is disabled, the QEMU_VIRTUAL_CLOCK runs at the
maximum possible speed by warping the sleep times of the virtual cpu to the
soonest clock deadline. The virtual clock will be updated only according
the instruction counter.
Signed-off-by: Victor CLEMENT <victor.clement@openwide.fr>
Message-Id: <1432912446-9811-2-git-send-email-victor.clement@openwide.fr>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This will allow clients to query additional information directly using
qom-get on the CPU objects.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
following a464982499, it's now possible for
there to be attempts to take the BQL before CPUs have been realized in
cases where a machine model inits peripherals before the first CPU.
BQL lock aquisition kicks the first_cpu, leading to a segfault if this
happens pre-realize. Guard the CPU kick routine to perform no action for
a CPU that doesn't exist or doesn't have a thread yet.
There was a fix to this with commit
6b49809c59, but the check there misses
the case where the CPU has been inited and not realized. Strengthen the
check to make sure that the first_cpu has a thread (i.e. it is
realized) before allowing the kick.
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Message-Id: <1427107689-6946-1-git-send-email-peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2ed1ebcf6 "timer: replace time() with QEMU_CLOCK_HOST" broke compile
when configured with --enable-profiler. Turned out the profiler has been
broken for a while.
This does s/qemu_time/tcg_time/ as the profiler only works in a TCG mode.
This also fixes the compile error.
This changes profile_getclock() to return nanoseconds rather than
CPU ticks as the "profile" HMP command prints seconds and there is no
platform-independent way to get ticks-per-second rate.
Since TCG is quite slow and get_clock() returns nanoseconds (fine
enough), this should not affect precision much.
This removes unused qemu_time_start and tlb_flush_time.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <1426478258-29961-1-git-send-email-aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When requesting a size which cannot be read, the error message shows
a different address which is misleading to the user and it looks like
something's wrong with the address parsing. This is because the input
@addr variable is incremented in the memory dumping loop:
(qemu) memsave 0xffffffff8418069c 0xb00000 mem
Invalid addr 0xffffffff849ffe9c specified
Fix that by saving the original address and size and use them in the
error message:
(qemu) memsave 0xffffffff8418069c 0xb00000 mem
Invalid addr 0xffffffff8418069c/size 11534336 specified
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
For good measure, ensure that the following sequence:
thread 1 calls qemu_mutex_lock_iothread
thread 2 calls qemu_mutex_lock_iothread
VCPU thread are created
VCPU thread enters execution loop
results in the VCPU threads letting the other two threads run
and obeying iothread_requesting_mutex even if the VCPUs are
not halted. To do this, check iothread_requesting_mutex
before execution starts.
Tested-by: Leon Alrae <leon.alrae@imgtec.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When two threads (other than the low-priority TCG VCPU thread)
are competing for the iothread lock, a deadlock can happen. This
is because iothread_requesting_mutex is set to false by the first
thread that gets the mutex, and then the VCPU thread might never
yield from the execution loop. If iothread_requesting_mutex is
changed from a bool to a counter, the deadlock is fixed.
However, there is another bug in qemu_mutex_lock_iothread that
can be triggered by the new call_rcu thread. The bug happens
if qemu_mutex_lock_iothread is called before the CPUs are
created. In that case, first_cpu is NULL and the caller
segfaults in qemu_mutex_lock_iothread. To fix this, just
do not do the kick if first_cpu is NULL.
Reported-by: Leon Alrae <leon.alrae@imgtec.com>
Reported-by: Andreas Gustafsson <gson@gson.org>
Tested-by: Leon Alrae <leon.alrae@imgtec.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
- RCU: fix MemoryRegion lifetime issues in PCI; document the rules;
convert of AddressSpaceDispatch and RAMList
- KVM: add kvm_exit reasons for aarch64
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJU4hugAAoJEL/70l94x66DZXEH/i72tOgvKZfAjfq2xmHXNEsr
roCfTFIIjKK7feyW6YgwT5pgex6I5umFsO+uIyI/wbu8nDl/3NYEQBT4fR2cGfli
GKeJOEu8kf+Zt8U+fbxyVQclbuU5S0Ujsg1fX4QXC4swB5fGLT2cRWJ5qd6hKBQs
GflBuLa7h4eOzcTtOPpqRIwZ8mQE0uxv/hKq9kYLKHXJN2aWsiOls8KQ2CXj2yAl
p6bMS5f0H0S/1hvQcQV9EazX7owlPIEet3AmSL1TC2sjJ8hrNGMBoFPtUys1uqjc
B3CwuGi0JtWIduFYV9vZ/Ze4G7Y2iZlqc5vDxIl94d+iFmoHymDOi3mFUZ3H8XQ=
=Lk9p
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
- vhost-scsi: add bootindex property
- RCU: fix MemoryRegion lifetime issues in PCI; document the rules;
convert of AddressSpaceDispatch and RAMList
- KVM: add kvm_exit reasons for aarch64
# gpg: Signature made Mon Feb 16 16:32:32 2015 GMT using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream: (21 commits)
Convert ram_list to RCU
exec: convert ram_list to QLIST
cosmetic changes preparing for the following patches
exec: protect mru_block with RCU
rcu: add g_free_rcu
rcu: introduce RCU-enabled QLIST
exec: RCUify AddressSpaceDispatch
exec: make iotlb RCU-friendly
exec: introduce cpu_reload_memory_map
docs: clarify memory region lifecycle
pci: split shpc_cleanup and shpc_free
pcie: remove mmconfig memory leak and wrap mmconfig update with transaction
memory: keep the owner of the AddressSpace alive until do_address_space_destroy
rcu: run RCU callbacks under the BQL
rcu: do not let RCU callbacks pile up indefinitely
vhost-scsi: set the bootable value of channel/target/lun
vhost-scsi: add a property for booting
vhost-scsi: expose the TYPE_FW_PATH_PROVIDER interface
vhost-scsi: add bootindex property
qdev: support to get a device firmware path directly
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Note that even after this patch, most callers of address_space_*
functions must still be under the big QEMU lock, otherwise the memory
region returned by address_space_translate can disappear as soon as
address_space_translate returns. This will be fixed in the next part
of this series.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
qemu_clock_run_timers() only takes care of main_loop_tlg, we shouldn't
forget aio timer list groups.
Currently, the qemu_clock_deadline_ns_all (a few lines above) counts all
the timergroups of this clock type, including aio tlg, but we don't fire
them, so they are never cleared, which makes a dead loop.
For example, this function hangs when trying to drive throttled block
request queue with qtest clock_step.
Signed-off-by: Fam Zheng <famz@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1421661103-29153-1-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
With the introduction of QEMU_CLOCK_VIRTUAL_RT, the computation of
sc->diff_clk can be simplified nicely:
qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) -
qemu_clock_get_ns(QEMU_CLOCK_REALTIME) +
cpu_get_clock_offset()
= qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) -
(qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - cpu_get_clock_offset())
= qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) -
(qemu_clock_get_ns(QEMU_CLOCK_REALTIME) + timers_state.cpu_clock_offset)
= qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) -
qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL_RT)
Cc: Sebastian Tanase <sebastian.tanase@openwide.fr>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Fix mismatch between timer_new_ms and timer_mod.
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch makes icount warp use the new QEMU_CLOCK_VIRTUAL_RT clock.
This way, icount's QEMU_CLOCK_VIRTUAL will never count time during which
the virtual machine is stopped.
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Separate accessing the instruction counter from the compensation for
speed and halting that are introduced by qemu_icount_bias. This
introduces new infrastructure used by the record/replay patches.
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch sets can_do_io function to allow reading icount
within cpu-exec, but outside TB execution.
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Exception index is reset at every entry at every entry into cpu_exec()
function. This may cause missing the exceptions while replaying them.
This patch moves exception_index reset to the locations where they are
processed.
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Ticks and clock offset used by CPU timers have to be saved in vmstate.
But vmstate for these fields registered only in icount mode.
Missing registration leads to breaking the continuity when vmstate is loaded.
This patch introduces new initialization function which fixes this.
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This implements an NMI interface for s390 and s390-ccw machines.
This removes #ifdef s390 branch in qmp_inject_nmi so new s390's
nmi_monitor_handler() callback is going to be used for NMI.
Since nmi_monitor_handler()-calling code is platform independent,
CPUState::cpu_index is used instead of S390CPU::env.cpu_num.
There should not be any change in behaviour as both @cpu_index and
@cpu_num are global CPU numbers.
Note that s390_cpu_restart() already takes care of the specified cpu,
so we don't need to schedule via async_run_on_cpu().
Since the only error s390_cpu_restart() can return is ENOSYS, convert
it to QERR_UNSUPPORTED.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This introduces an NMI (Non Maskable Interrupt) interface with
a single nmi_monitor_handler() method. A machine or a device can
implement it. This searches for an QOM object with this interface
and if it is implemented, calls it. The callback implements an action
required to cause debug crash dump on in-kernel debugger invocation.
The callback returns Error**.
This adds a nmi_monitor_handle() helper which walks through
all objects to find the interface. The interface method is called
for all found instances.
This adds support for it in qmp_inject_nmi(). Since no architecture
supports it at the moment, there is no change in behaviour.
This changes inject-nmi command description for HMP and QMP.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Show in 'info jit' the current delay between the host clock
and the guest clock. In addition, print the maximum advance
and delay of the guest compared to the host.
Signed-off-by: Sebastian Tanase <sebastian.tanase@openwide.fr>
Tested-by: Camille Bégué <camille.begue@openwide.fr>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The goal is to sleep qemu whenever the guest clock
is in advance compared to the host clock (we use
the monotonic clocks). The amount of time to sleep
is calculated in the execution loop in cpu_exec.
At first, we tried to approximate at each for loop the real time elapsed
while searching for a TB (generating or retrieving from cache) and
executing it. We would then approximate the virtual time corresponding
to the number of virtual instructions executed. The difference between
these 2 values would allow us to know if the guest is in advance or delayed.
However, the function used for measuring the real time
(qemu_clock_get_ns(QEMU_CLOCK_REALTIME)) proved to be very expensive.
We had an added overhead of 13% of the total run time.
Therefore, we modified the algorithm and only take into account the
difference between the 2 clocks at the begining of the cpu_exec function.
During the for loop we try to reduce the advance of the guest only by
computing the virtual time elapsed and sleeping if necessary. The overhead
is thus reduced to 3%. Even though this method still has a noticeable
overhead, it no longer is a bottleneck in trying to achieve a better
guest frequency for which the guest clock is faster than the host one.
As for the the alignement of the 2 clocks, with the first algorithm
the guest clock was oscillating between -1 and 1ms compared to the host clock.
Using the second algorithm we notice that the guest is 5ms behind the host, which
is still acceptable for our use case.
The tests where conducted using fio and stress. The host machine in an i5 CPU at
3.10GHz running Debian Jessie (kernel 3.12). The guest machine is an arm versatile-pb
built with buildroot.
Currently, on our test machine, the lowest icount we can achieve that is suitable for
aligning the 2 clocks is 6. However, we observe that the IO tests (using fio) are
slower than the cpu tests (using stress).
Signed-off-by: Sebastian Tanase <sebastian.tanase@openwide.fr>
Tested-by: Camille Bégué <camille.begue@openwide.fr>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The align option is used for activating the align algorithm
in order to synchronise the host clock and the guest clock.
Signed-off-by: Sebastian Tanase <sebastian.tanase@openwide.fr>
Tested-by: Camille Bégué <camille.begue@openwide.fr>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Make icount parameter use QemuOpts style options in order
to easily add other suboptions.
Signed-off-by: Sebastian Tanase <sebastian.tanase@openwide.fr>
Tested-by: Camille Bégué <camille.begue@openwide.fr>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When using the icount option on ARM, the virtual
clock starts counting at realtime clock but it
should start at 0.
The reason why the virtual clock starts at realtime clock
is because the first time we call qemu_clock_warp (which
calls icount_warp_rt) in tcg_exec_all, qemu_icount_bias
(which is part of the virtual time computation mechanism)
will increment by realtime - vm_clock_warp_start, with
vm_clock_warp_start being 0 (see icount_warp_rt in cpus.c).
By changing the value of vm_clock_warp_start from 0 to -1,
the first time we call qemu_clock_warp which calls
icount_warp_rt, we will return immediatly because
icount_warp_rt first checks if vm_clock_warp_start is -1
and if it's the case it returns. Therefore, qemu_icount_bias
will first be incremented by the value of a virtual timer
deadline when the virtual cpu goes from active to inactive.
The virtual time will start at 0 and increment based
on the instruction counter when the vcpu is active or
the qemu_icount_bias value when inactive.
Signed-off-by: Sebastian Tanase <sebastian.tanase@openwide.fr>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This adds cpu_icount_to_ns function which is needed for reverse execution.
It returns the time for a specific instruction.
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This fixes a bug where qemu_icount and qemu_icount_bias are not migrated.
It adds a subsection "timer/icount" to vmstate_timers so icount is migrated only
when needed.
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This puts qemu_icount and qemu_icount_bias into TimerState structure to allow
them to be migrated.
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There patch protects vmstop_requested with a lock and introduces
qemu_system_vmstop_request_prepare.
Together with the new call to qemu_vmstop_requested in vm_start,
qemu_system_vmstop_request_prepare avoids a race where the VM could remain
stopped even though the iostatus of a block device has already been set
(for example).
qemu_system_vmstop_request_prepare however also lets the caller thread
delay observation of the state change until it has itself communicated
that change to the user. This delay avoids any possibility of a wrong
reordering of the BLOCK_IO_ERROR event and the subsequent STOP event.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
MST: comment tweaks
Use dedicated qemu_soonest_timeout() instead of MIN().
Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
After previous Peter patch, they are redundant. This way we don't
assign them except when needed. Once there, there were lots of case
where the ".fields" indentation was wrong:
.fields = (VMStateField []) {
and
.fields = (VMStateField []) {
Change all the combinations to:
.fields = (VMStateField[]){
The biggest problem (appart from aesthetics) was that checkpatch complained
when we copy&pasted the code from one place to another.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
These functions don't need type casts (as does cpu_physical_memory_rw)
and also make the code better readable.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Default to false.
Tidy variable naming and inline cast uses while at it.
Tested-by: Jia Liu <proljc@gmail.com> (or32)
Signed-off-by: Andreas Färber <afaerber@suse.de>
If enabled, set the thread name at creation (on GNU systems with
pthread_set_np)
Fix up all the callers with a thread name
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
This motion is preparing for refactoring vCPU APIC subsequently.
Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Stop/cont commands are broken with -icount due to a deadlock. The
real problem is that the computation of timers_state.cpu_ticks_offset
makes no sense with -icount enabled: we set it to an icount clock value
in cpu_disable_ticks, and subtract a TSC (or similar, whatever
cpu_get_real_ticks happens to return) value in cpu_enable_ticks.
The fix is simple. timers_state.cpu_ticks_offset is only used
together with cpu_get_real_ticks, so we can use cpu_get_real_ticks
in cpu_disable_ticks. There is no need to update cpu_ticks_prev
at the time cpu_disable_ticks is called; instead, we can do it
the next time cpu_get_ticks is called.
The change to cpu_disable_ticks is the important part of the patch.
The rest modifies the code to always check timers_state.cpu_ticks_prev,
even when the ticks are not advancing (i.e. the VM is stopped). It also
makes a similar change to cpu_get_clock_locked, so that the code remains
similar for cpu_get_ticks and cpu_get_clock_locked.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1382977938-13844-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
When we translate the virtual address to physical check for error.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Computing the deadline of all vm_clocks is somewhat expensive and calls
out to qemu-timer.c; two reasons not to do it in the seqlock's write-side
critical section. This however opens the door for races in setting and
reading vm_clock_warp_start.
To plug them, we need to cover the case where a new deadline slips in
between the call to qemu_clock_deadline_ns_all and the actual modification
of the icount_warp_timer. Restrict changes to vm_clock_warp_start and
the icount_warp_timer's expiration time, to only move them back (which
would simply cause an early wakeup).
If a vm_clock timer is cancelled while CPUs are idle, this might cause the
icount_warp_timer to fire unnecessarily. This is not a problem, after it
fires the timer becomes inactive and the next call to timer_mod_anticipate
will be precise.
In addition to this, we must deactivate the icount_warp_timer _before_
checking whether CPUs are idle. This way, if the "last" CPU becomes idle
during the call to timer_del we will still set up the icount_warp_timer.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
To prepare for future code changes, move the increment of qemu_icount_bias
outside the "if" statement.
Also, hoist outside the if the check for timers that expired due to the
"warping". The check is redundant when !runstate_is_running(), but
doing it this way helps because the code that increments qemu_icount_bias
will be a critical section.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This will help later when we will have to place these calls in
a critical section, and thus call a version of cpu_get_icount()
that does not take the lock.
Reviewed-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
QEMU_CLOCK_VIRTUAL may be read outside BQL. This will make its
foundation, i.e. cpu_clock_offset exposed to race condition.
Using private lock to protect it.
After this patch, reading QEMU_CLOCK_VIRTUAL is thread safe
unless use_icount is true, in which case the existing callers
still rely on the BQL.
Lock rule: private lock innermost, ie BQL->"this lock"
Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It was introduced to loop over CPUs from target-independent code, but
since commit 182735efaf target-independent
CPUState is used.
A loop can be considered more efficient than function calls in a loop,
and CPU_FOREACH() hides implementation details just as well, so use that
instead.
Suggested-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
There is the 'nmi' command that is used to trigger a guest dump via kdump feature on x86.
s390 uses RESTART interrupt to trigger kdump.
So, this patch provides a mean to use 'nmi' command on s390 to raise RESTART interrupt.
The CPU to receive the RESTART interrupt is the "default" one.
There is an infrastructure to select the "default" CPU using 'cpu' command.
The 'info cpus' command can be used to see which one is the "default".
In order to wire up the RESTART to 'nmi' command we had to:
1. implement the kvm_s390_cpu_restart function by exporting the existing code
2. implement s390_cpu_restart function as kvm-aware wrapper
3. modify the qmp_inject_nmi function to enable (for s390) the scan for
"default" CPU and call s390_cpu_restart for it;
3. fix some messages.
Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Alexander Graf <agraf@suse.de>
Rearrange timer.h so it is in order by function type.
Make legacy functions call non-legacy functions rather than vice-versa.
Convert cpus.c to use new API.
Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Notify all timerlists derived from vm_clock in icount warp
calculations.
When calculating timer delay based on vm_clock deadline, use
all timerlists.
For compatibility, maintain an apparent bug where when using
icount, if no vm_clock timer was set, qemu_clock_deadline
would return INT32_MAX and always set an icount clock expiry
about 2 seconds ahead.
NB: thread safety - when different timerlists sit on different
threads, this will need some locking.
Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
It makes more sense and will make things simpler later.
Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Prepares for changing cpu_single_step() argument to CPUState.
Acked-by: Michael Walle <michael@walle.cc> (for lm32)
Signed-off-by: Andreas Färber <afaerber@suse.de>
Even if the VM is already stopped, we cannot assume that all data has
already been successfully flushed to disk. The flush during the previous
vm_stop() could have failed.
Run bdrv_flush_all() unconditionally so that we get an error each time
if the block device isn't really flushed.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
# By Chegu Vinod
# Via Juan Quintela
* quintela/migration.next:
Force auto-convegence of live migration
Add 'auto-converge' migration capability
Introduce async_run_on_cpu()
Message-id: 1373664508-5404-1-git-send-email-quintela@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
If flushing the block devices fails, return an error. The VM is stopped
anyway.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Introduce an asynchronous version of run_on_cpu() i.e. the caller
doesn't have to block till the call back routine finishes execution
on the target vcpu.
Signed-off-by: Chegu Vinod <chegu_vinod@hp.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Move next_cpu from CPU_COMMON to CPUState.
Move first_cpu variable to qom/cpu.h.
gdbstub needs to use CPUState::env_ptr for now.
cpu_copy() no longer needs to save and restore cpu_next.
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
[AF: Rebased, simplified cpu_copy()]
Signed-off-by: Andreas Färber <afaerber@suse.de>
On PPC, we don't support MP state. So far it's not necessary and I'm
not convinced yet that we really need to support it ever.
However, the current idle logic in QEMU assumes that an in-kernel PIC
also means we support MP state. This assumption is not true anymore.
Let's split up the two cases into two different variables. That way
PPC can expose an in-kernel PIC, while not implementing MP state.
Signed-off-by: Alexander Graf <agraf@suse.de>
CC: Jan Kiszka <jan.kiszka@siemens.com>
This allows to move the call into CPUState's realizefn.
Therefore move the stub into libqemustub.a.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Pass it to qemu_dummy_cpu_thread_fn().
Use CPUState::env_ptr for cpu_single_env.
Prepares for changing qemu_init_vcpu() argument to CPUState.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Pass it on to qemu_kvm_cpu_thread_fn().
Prepares for changing qemu_init_vcpu() argument to CPUState.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andreas Färber <afaerber@suse.de>
CPUArchState is no longer needed.
Prepares for changing qemu_kvm_cpu_thread_fn() opaque to CPUState.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Use CPUState::env_ptr for now.
Prepares for changing cpu_handle_guest_debug() argument to CPUState.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andreas Färber <afaerber@suse.de>
It no longer uses CPUArchState.
Prepares for changing qemu_kvm_cpu_thread_fn() opaque to CPUState.
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Make cpustats monitor command available unconditionally.
Prepares for changing kvm_handle_internal_error() and kvm_cpu_exec()
arguments to CPUState.
Signed-off-by: Andreas Färber <afaerber@suse.de>
CPUArchState is no longer needed.
Prepares for changing qemu_kvm_cpu_thread_fn() opaque to CPUState.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andreas Färber <afaerber@suse.de>
CPUArchState is no longer needed.
Prepares for changing qemu_kvm_init_cpu_signals() argument to CPUState.
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andreas Färber <afaerber@suse.de>
It no longer uses CPUArchState.
Prepares for changing qemu_kvm_cpu_thread_fn() opaque to CPUState.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andreas Färber <afaerber@suse.de>
It no longer needs CPUArchState.
Prepares for changing all_cpu_threads_idle() CPU loop to CPUState and
needed for changing qemu_kvm_wait_io_event() argument to CPUState.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andreas Färber <afaerber@suse.de>
It no longer depends on CPUArchState, so move it to qom/cpu.c.
Prepares for changing GDBState::c_cpu to CPUState.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Change Monitor::mon_cpu to CPUState as well.
Reviewed-by: liguang <lig.fnst@cn.fujitsu.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Due to a preceding while loop, no CPU would've been put into stopped
state. Reinitialize the variable.
This fixes commit d798e97456 (Allow to use
pause_all_vcpus from VCPU context) for non-KVM case.
While at it, change a 0 to false, amending commit
4fdeee7cd4 (cpu: Move stop field to
CPUState).
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Replaces an open-coded loop and hides unused CPUArchState.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Also add a stub for it, to make possible to use it in qom/cpu.c,
which is shared with user emulators.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
GetLastError() returns a DWORD value which is unsigned long,
so the correct format specifier is %lu.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
... so it could be called without requiring CPUArchState.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
On multi-core systems, SuspendThread does not guaranty immediate thread
suspension. We add busy loop to wait for effective thread suspension
after call to ThreadSuspend().
Signed-off-by: Fabien Chouteau <chouteau@adacore.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Move it to qom/cpu.h to avoid issues with include order.
Change pc_acpi_smi_interrupt() opaque to X86CPU.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Both fields are used in VMState, thus need to be moved together.
Explicitly zero them on reset since they were located before
breakpoints.
Pass PowerPCCPU to kvmppc_handle_halt().
Signed-off-by: Andreas Färber <afaerber@suse.de>
No functional change, just less usages of first_cpu and next_cpu fields.
env is passed to cpu_memory_rw_debug(), which in turn passes it to
target-specific cpu_get_phys_page_debug(). Changing both would be a
larger refactoring, so defer that by using env_ptr for now.
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
The set_cpu_log() function in cpus.c is a fairly simple wrapper
which is only called from one location. Just inline the code
into vl.c, since there is no need to indirect it via cpus.c
and the handling of the error case is more appropriate to vl.c.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Rename the public-facing function cpu_set_log to qemu_set_log. This
requires us to rename the internal-only qemu_set_log() to
do_qemu_set_log().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Rename cpu_str_to_log_mask() to qemu_str_to_log_mask(), since
the qemu_log functionality is no longer restricted to TCG CPU
debug logging.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Abstract out the "print a human readable list of all the
valid log categories" functionality which is currently duplicated
in three separate places. (We leave the monitor.c help_cmd()
implementation as-is since it wants to send the message to
the monitor and add its own information.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
The qemu_log() functionality is no longer specific to TCG CPU debug logs.
Rename cpu_set_log_filename() to qemu_set_log_filename() and drop the
pointless wrapper set_cpu_log_filename().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Since commit 20d695a925 (kvm: Pass
CPUState to kvm_arch_*) CPUArchState is no longer needed.
Allows to change qemu_kvm_eat_signals() argument as well.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Note that target-alpha accesses this field from TCG, now using a
negative offset. Therefore the field is placed last in CPUState.
Pass PowerPCCPU to [kvm]ppc_fixup_cpu() to facilitate this change.
Move common parts of mips cpu_state_reset() to mips_cpu_reset().
Acked-by: Richard Henderson <rth@twiddle.net> (for alpha)
[AF: Rebased onto ppc CPU subclasses and openpic changes]
Signed-off-by: Andreas Färber <afaerber@suse.de>
To facilitate the field movements, pass MIPSCPU to malta_mips_config();
avoid that for mips_cpu_map_tc() since callers only access MIPS Thread
Contexts, inside TCG helpers.
Signed-off-by: Andreas Färber <afaerber@suse.de>
For target-mips also change the return type to bool.
Make include paths for cpu-qom.h consistent for alpha and unicore32.
Signed-off-by: Andreas Färber <afaerber@suse.de>
[AF: Updated new target-openrisc function accordingly]
Acked-by: Richard Henderson <rth@twiddle.net> (for alpha)
CPUArchState is no longer needed except for iterating the CPUs.
Needed for qemu_tcg_init_vcpu().
KVM and dummy threads still need CPUArchState for cpu_single_env.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Change return type to bool, move to include/qemu/cpu.h and
add documentation.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
[AF: Updated new caller qemu_in_vcpu_thread()]
Old code used !io_thread to know if a thread was an vcpu or not. That
fails when we introduce the iothread.
Signed-off-by: Juan Quintela <quintela@redhat.com>
* 'target-arm.for-upstream' of git://git.linaro.org/people/pmaydell/qemu-arm:
target-arm: Drop unused DECODE_CPREG_CRN macro
target-arm: use deposit instead of hardcoded version
target-arm: mark a few integer helpers const and pure
target-arm: convert sar, shl and shr helpers to TCG
target-arm: convert add_cc and sub_cc helpers to TCG
target-arm: use globals for CC flags
target-arm: Reinstate display of VFP registers in cpu_dump_state
cpu_dump_state: move DUMP_FPU and DUMP_CCOP flags from x86-only to generic
Move the DUMP_FPU and DUMP_CCOP flags for cpu_dump_state() from being
x86-specific flags to being generic ones. This allows us to drop some
TARGET_I386 ifdefs in various places, and means that we can (potentially)
be more consistent across architectures about which monitor commands or
debug abort printouts include FPU register contents and info about
QEMU's condition-code optimisations.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Contrary to its name, 'qemu_global_mutex' is only used locally
in cpus.c.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Since the only user of the extended cpu_list_id() format
was the x86 ?model/?dump/?cpuid output, we can drop it
completely.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
On x86 userspace delivers interrupts to the kernel asynchronously
(and therefore VCPU idle management is done in the kernel) if and
only if there is an in-kernel irqchip. On other architectures this
isn't necessarily true (they may always send interrupts
asynchronously), so define a new kvm_async_interrupts_enabled()
function instead of misusing kvm_irqchip_in_kernel().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Avi Kivity <avi@redhat.com>
CPU_COMMON_THREAD was only used for Windows, adding an hThread field
to CPU_COMMON.
Move the field into QOM CPUState and change its type to HANDLE,
which it is assigned from. This requires Windows headers, pulled in
through qemu-thread.h.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Commit 946fb27c1 moved all the uses of all_cpu_threads_idle()
into cpus.c. This means we can mark the function 'static'
(again), if we shuffle it a bit earlier in the source file.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
This patch combines qtest and -icount together to turn the vm_clock
into a source that can be fully managed by the client. To this end new
commands clock_step and clock_set are added. Hooking them with libqtest
is left as an exercise to the reader.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
The idea behind qtest is pretty simple. Instead of executing a CPU via TCG or
KVM, rely on an external process to send events to the device model that the CPU
would normally generate.
qtest presents itself as an accelerator. In addition, a new option is added to
establish a qtest server (-qtest) that takes a character device. This is what
allows the external process to send CPU events to the device model.
qtest uses a simple line based protocol to send the events. Documentation of
that protocol is in qtest.c.
I considered reusing the monitor for this job. Adding interrupts would be a bit
difficult. In addition, logging would also be difficult.
qtest has extensive logging support. All protocol commands are logged with
time stamps using a new command line option (-qtest-log). Logging is important
since ultimately, this is a feature for debugging.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Scripted conversion:
for file in *.[hc] hw/*.[hc] hw/kvm/*.[hc] linux-user/*.[hc] linux-user/m68k/*.[hc] bsd-user/*.[hc] darwin-user/*.[hc] tcg/*/*.[hc] target-*/cpu.h; do
sed -i "s/CPUState/CPUArchState/g" $file
done
All occurrences of CPUArchState are expected to be replaced by QOM CPUState,
once all targets are QOM'ified and common fields have been extracted.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
In order to perform critical manipulations on the VM state in the
context of a VCPU, specifically code patching, stopping and resuming of
all VCPUs may be necessary. resume_all_vcpus is already compatible, now
enable pause_all_vcpus for this use case by stopping the calling context
before starting to wait for the whole gang.
CC: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
When the TCG thread is started but not yet the machine, we wait in
qemu_tcg_cpu_thread_fn on tcg_halt_cond. To allow run_on_cpu already at
this time, we need to process pending request in that loop.
CC: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
As we have thread-local cpu_single_env now and KVM uses exactly one
thread per VCPU, we can drop the cpu_single_env updates from the loop
and initialize this variable only once during setup.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
On real hardware, NMI button events are injected via the LINT1 line of
the APICs. E.g. kdump expect this wiring and gets upset if the per-APIC
LINT1 mask is not respected, i.e. if NMIs are injected to VCPUs that
should not receive them. Change the APIC emulation code to reflect this.
Based on qemu-kvm patch by Lai Jiangshan.
CC: Lai Jiangshan <laijs@cn.fujitsu.com>
Reported-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
These two blocks of code are exactly the same, remove one.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
On Windows, cpus.c needs access to the hThread. Add a Windows-specific
function to grab it. This requires changing the CPU threads to
joinable. There is no substantial change because the threads run
in an infinite loop.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Split from Jan's original qemu-thread-posix.c patch. No semantic change,
just introduce the new API that POSIX and Win32 implementations will
conform to.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Please, note that the QMP command has a new 'cpu-index' parameter.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Double semicolons should be single.
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Many places in QEMU call qemu_aio_flush() to complete all pending
asynchronous I/O. Most of these places actually want to drain all block
requests but there is no block layer API to do so.
This patch introduces the bdrv_drain_all() API to wait for requests
across all BlockDriverStates to complete. As a bonus we perform checks
after qemu_aio_wait() to ensure that requests really have finished.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We disable vm_clock when pausing all vcpus, but we forget to
reenable it when resuming all vcpus. It will cause that the
guest can not be rebooted.
Tested-by: Zhi Yong Wu <zwu.kernel@gmai.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
After the removal of the non-threaded mode cpu_exec_all is now only used
by TCG. Refactor it accordingly, also dropping its unused return value.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
It should be a matter of allowing the transition POSTMIGRATE ->
FINISH_MIGRATE, but it turns out that the VM won't do the
transition the second time because it's already stopped.
So this commit also adds vm_stop_force_state() which performs
the transition even if the VM is already stopped.
While there also allow other states to migrate.
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Now that iothread is always compiled sending a signal seems only an
additional step. This patch also avoid writing to two pipe (one from signal
and one in qemu_service_io).
Work with kvm enabled or disabled. strace output is more readable (less syscalls).
[ kwolf: Merged build fix by Paolo Bonzini ]
Signed-off-by: Frediano Ziglio <freddy77@gmail.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Currently, only vm_start() and vm_stop() change the VM state.
That's, the state is only changed when starting or stopping the VM.
This commit adds the runstate_set() function, which makes it possible
to also do state transitions when the VM is stopped or running.
Additional states are also added and the current state is stored.
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Today, when notifying a VM state change with vm_state_notify(),
we pass a VMSTOP macro as the 'reason' argument. This is not ideal
because the VMSTOP macros tell why qemu stopped and not exactly
what the current VM state is.
One example to demonstrate this problem is that vm_start() calls
vm_state_notify() with reason=0, which turns out to be VMSTOP_USER.
This commit fixes that by replacing the VMSTOP macros with a proper
state type called RunState.
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Enabling the I/O thread by default seems like an important part of declaring
1.0. Besides allowing true SMP support with KVM, the I/O thread means that the
TCG VCPU doesn't have to multiplex itself with the I/O dispatch routines which
currently requires a (racey) signal based alarm system.
I know there have been concerns about performance. I think so far the ones that
have come up (virtio-net) are most likely due to secondary reasons like
decreased batching.
I think we ought to force enabling I/O thread early in 1.0 development and
commit to resolving any lingering issues.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
We can express the VCPU thread wakeup with the stop mechanism, saving
both qemu_system_ready and the qemu_system_cond. For KVM threads, we can
just enter the main loop as long as the thread is stopped. The central
TCG thread is better held back before the loop as there can be side
effects of the services called even when all CPUs are stopped.
Creating VCPUs in stopped state will also be required for proper CPU
hotplugging support.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
In TCG mode, iothread and vcpus run in lock-step. So it's pointless to
send a signal from qemu_cpu_kick to the vcpu thread - if we got here,
the receiver already left the vcpu loop.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This conveys the intention better, and scales to more than >1
threads contending the mutex with the iothread (as long as all
of them have a "quiescent point" like the TCG thread has).
Also, on Mac OS X the fair_mutex somehow didn't work as intended
and deadlocked.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Both the signal thread (via sigwait()) and the cpu thread (via
a normal signal handler) were attempting to catch SIG_IPI.
This resulted in random freezes under Darwin.
This patch separates SIG_IPI from the rest of the signals handled
by the signal thread, because it is independently caught by the cpu
thread.
Signed-off-by: Alexandre Raymond <cerbere@gmail.com>
Acked-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Changes since v1:
- take pthread_sigmask() out of the ifdef as it is now common
to both parts.
This fix effectively blocks, in the main thread, the signals handled
by signalfd or the compatibility signal thread.
This way, such signals are received synchronously in the main thread
through sigfd_handler() instead of triggering the signal handler
directly, asynchronously.
Signed-off-by: Alexandre Raymond <cerbere@gmail.com>
Acked-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
sigset_t, used by that header, is not available in mingw32 environments.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Add command line support for logging to a location other than /tmp/qemu.log.
With logging enabled (command line option -d), the log is written to
the hard-coded path /tmp/qemu.log. This patch adds support for writing
the log to a different location by passing the -D option.
Signed-off-by: Matthew Fernandez <matthew.fernandez@gmail.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
It is purely for icount-based virtual timers. And now that we got the
code right, rename the function to clarify the intended scope.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
The previous patch however is not enough, because if the virtual CPU
goes to sleep waiting for a future timer interrupt to wake it up, qemu
deadlocks. The timer interrupt never comes because time is driven by
icount, but the vCPU doesn't run any insns.
You could say that VCPUs should never go to sleep in icount
mode if there is a pending vm_clock timer; rather time should
just warp to the next vm_clock event with no sleep ever taking place.
Even better, you can sleep for some time related to the
time left until the next event, to avoid that the warps are too visible
externally; for example, you could be sending network packets continously
instead of every 100ms.
This is what this patch implements. qemu_clock_warp is called: 1)
whenever a vm_clock timer is adjusted, to ensure the warp_timer is
synchronized; 2) at strategic points in the CPU thread, to make sure
the insn counter is synchronized before the CPU starts running.
In any case, the warp_timer is disabled while the CPU is running,
because the insn counter will then be making progress on its own.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
The correct fix for -icount is to consider the biggest difference
between iothread and non-iothread modes. In the traditional model,
CPUs run _before_ the iothread calls select (or WaitForMultipleObjects
for Win32). In the iothread model, CPUs run while the iothread
isn't holding the mutex, i.e. _during_ those same calls.
So, the iothread should always block as long as possible to let
the CPUs run smoothly---the timeout might as well be infinite---and
either the OS or the CPU thread itself will let the iothread know
when something happens. At this point, the iothread wakes up and
interrupts the CPU.
This is exactly the approach that this patch takes: when cpu_exec_all
returns in -icount mode, and it is because a vm_clock deadline has
been met, it wakes up the iothread to process the timers. This is
really the "bulk" of fixing icount.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Here the int values fds[0], sigfd, s, sock and fd are converted
to void pointers which are later converted back to an int value.
These conversions should always use intptr_t instead of unsigned long.
They are needed for environments where sizeof(long) != sizeof(void *).
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>