Commit Graph

68 Commits

Author SHA1 Message Date
Jan Kiszka 39853bbc49 kvm: Introduce kvm_irqchip_add/remove_irqfd
Add services to associate an eventfd file descriptor as input with an
IRQ line as output. Such a line can be an input pin of an in-kernel
irqchip or a virtual line returned by kvm_irqchip_add_route.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-05-21 19:22:50 +03:00
Jan Kiszka e7b2030862 kvm: Make kvm_irqchip_commit_routes an internal service
Automatically commit route changes after kvm_add_routing_entry and
kvm_irqchip_release_virq. There is no performance relevant use case for
which collecting multiple route changes is beneficial. This makes
kvm_irqchip_commit_routes an internal service which assert()s that the
corresponding IOCTL will always succeed.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-05-21 19:22:49 +03:00
Jan Kiszka 1e2aa8be09 kvm: Publicize kvm_irqchip_release_virq
This allows to drop routes created by kvm_irqchip_add_irq/msi_route
again.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-05-21 19:22:49 +03:00
Jan Kiszka 92b4e48982 kvm: Introduce kvm_irqchip_add_msi_route
Add a service that establishes a static route from a virtual IRQ line to
an MSI message. Will be used for IRQFD and device assignment. As we will
use this service outside of CONFIG_KVM protected code, stub it properly.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-05-21 19:22:49 +03:00
Jan Kiszka 1df186df35 kvm: Rename kvm_irqchip_add_route to kvm_irqchip_add_irq_route
We will add kvm_irqchip_add_msi_route, so let's make the difference
clearer.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-05-21 19:22:49 +03:00
Jan Kiszka 04fa27f5ae kvm: Introduce basic MSI support for in-kernel irqchips
This patch basically adds kvm_irqchip_send_msi, a service for sending
arbitrary MSI messages to KVM's in-kernel irqchip models.

As the original KVM API requires us to establish a static route from a
pseudo GSI to the target MSI message and inject the MSI via toggling
that virtual IRQ, we need to play some tricks to make this interface
transparent. We create those routes on demand and keep them in a hash
table. Succeeding messages can then search for an existing route in the
table first and reuse it whenever possible. If we should run out of
limited GSIs, we simply flush the table and rebuild it as messages are
sent.

This approach is rather simple and could be optimized further. However,
latest kernels contains a more efficient MSI injection interface that
will obsolete the GSI-based dynamic injection.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-05-16 18:04:44 -03:00
Jan Kiszka c73b00973b kvm: Drop unused kvm_pit_in_kernel
This is now implied by kvm_irqchip_in_kernel.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-04-12 19:01:41 -03:00
Michael S. Tsirkin 4b8f1c88e9 kvm: allow arbitrarily sized mmio ioeventfd
We use a 2 byte ioeventfd for virtio memory,
add support for this.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-04-12 19:01:41 -03:00
Andreas Färber 9349b4f9fd Rename CPUState -> CPUArchState
Scripted conversion:
  for file in *.[hc] hw/*.[hc] hw/kvm/*.[hc] linux-user/*.[hc] linux-user/m68k/*.[hc] bsd-user/*.[hc] darwin-user/*.[hc] tcg/*/*.[hc] target-*/cpu.h; do
    sed -i "s/CPUState/CPUArchState/g" $file
  done

All occurrences of CPUArchState are expected to be replaced by QOM CPUState,
once all targets are QOM'ified and common fields have been extracted.

Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
2012-03-14 22:20:27 +01:00
Jan Kiszka 8a7c73932e kvm: Add kvm_has_pit_state2 helper
To be used for in-kernel PIT emulation.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-03-07 12:27:42 +02:00
Jan Kiszka 3d4b26494f kvm: Implement kvm_irqchip_in_kernel like kvm_enabled
To both avoid that kvm_irqchip_in_kernel always has to be paired with
kvm_enabled and that the former ends up in a function call, implement it
like the latter. This means keeping the state in a global variable and
defining kvm_irqchip_in_kernel as a preprocessor macro.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-02-08 15:57:50 -02:00
Jan Kiszka 680c1c6fd7 kvm: x86: Add user space part for in-kernel APIC
This introduces the alternative APIC device which makes use of KVM's
in-kernel device model. External NMI injection via LINT1 is emulated by
checking the current state of the in-kernel APIC, only injecting a NMI
into the VCPU if LINT1 is unmasked and configured to DM_NMI.

MSI is not yet supported, so we disable this when the in-kernel model is
in use.

CC: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
2012-01-19 12:14:42 +01:00
Jan Kiszka 9b5b76d449 kvm: x86: Establish IRQ0 override control
KVM is forced to disable the IRQ0 override when we run with in-kernel
irqchip but without IRQ routing support of the kernel. Set the fwcfg
value correspondingly. This aligns us with qemu-kvm.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
2012-01-19 12:14:42 +01:00
Jan Kiszka 84b058d7df kvm: Introduce core services for in-kernel irqchip support
Add the basic infrastructure to active in-kernel irqchip support, inject
interrupts into these models, and maintain IRQ routes.

Routing is optional and depends on the host arch supporting
KVM_CAP_IRQ_ROUTING. When it's not available on x86, we looe the HPET as
we can't route GSI0 to IOAPIC pin 2.

In-kernel irqchip support will once be controlled by the machine
property 'kernel_irqchip', but this is not yet wired up.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
2012-01-19 12:14:42 +01:00
Avi Kivity 9f213ed92c kvm: switch kvm slots to use host virtual address instead of ram_addr_t
This simplifies a later switch to the memory API in slot management.

Signed-off-by: Avi Kivity <avi@redhat.com>
2011-12-20 14:14:07 +02:00
Jan Kiszka ba9bc59e1f kvm: x86: Pass KVMState to kvm_arch_get_supported_cpuid
kvm_arch_get_supported_cpuid checks for global cpuid restrictions, it
does not require any CPUState reference. Changing its interface allows
to call it before any VCPU is initialized.

CC: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-06-20 15:24:00 -03:00
Jan Kiszka f2574737f6 kvm: x86: Push kvm_arch_debug to kvm_arch_handle_exit
There are no generic bits remaining in the handling of KVM_EXIT_DEBUG.
So push its logic completely into arch hands, i.e. only x86 so far.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-16 17:11:06 -03:00
Jan Kiszka 990368650f kvm: Rename kvm_arch_process_irqchip_events to async_events
We will broaden the scope of this function on x86 beyond irqchip events.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-15 01:19:05 -03:00
Jan Kiszka 6a7af8cb04 kvm: Make kvm_state globally available
KVM-assisted devices need access to it but we have no clean channel to
distribute a reference. As a workaround until there is a better
solution, export kvm_state for global use, though use should remain
restricted to the mentioned scenario.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-02-14 12:39:47 -02:00
Anthony PERARD e5896b12e2 Introduce log_start/log_stop in CPUPhysMemoryClient
In order to use log_start/log_stop with Xen as well in the vga code,
this two operations have been put in CPUPhysMemoryClient.

The two new functions cpu_physical_log_start,cpu_physical_log_stop are
used in hw/vga.c and replace the kvm_log_start/stop. With this, vga does
no longer depends on kvm header.

[ Jan: rebasing and style fixlets ]

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-02-14 12:39:47 -02:00
Jan Kiszka 7a39fe5882 kvm: Drop return values from kvm_arch_pre/post_run
We do not check them, and the only arch with non-empty implementations
always returns 0 (this is also true for qemu-kvm).

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
CC: Alexander Graf <agraf@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-02-14 12:39:46 -02:00
Jan Kiszka a1b87fe046 kvm: Provide sigbus services arch-independently
Provide arch-independent kvm_on_sigbus* stubs to remove the #ifdef'ery
from cpus.c. This patch also fixes --disable-kvm build by providing the
missing kvm_on_sigbus_vcpu kvm-stub.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-02-14 12:39:45 -02:00
Jan Kiszka 94a8d39afd kvm: Consolidate must-have capability checks
Instead of splattering the code with #ifdefs and runtime checks for
capabilities we cannot work without anyway, provide central test
infrastructure for verifying their availability both at build and
runtime.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:22 -02:00
Jan Kiszka cad1e2827b kvm: Drop smp_cpus argument from init functions
No longer used.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-23 02:27:21 -02:00
Stefan Hajnoczi d2f2b8a740 kvm: test for ioeventfd support on old kernels
There used to be a limit of 6 KVM io bus devices in the kernel.
On such a kernel, we can't use many ioeventfds for host notification
since the limit is reached too easily.

Add an API to test for this condition.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2011-01-10 14:44:16 +02:00
Marcelo Tosatti c0532a76b4 MCE: Relay UCR MCE to guest
Port qemu-kvm's

commit 4b62fff1101a7ad77553147717a8bd3bf79df7ef
Author: Huang Ying <ying.huang@intel.com>
Date:   Mon Sep 21 10:43:25 2009 +0800

    MCE: Relay UCR MCE to guest

    UCR (uncorrected recovery) MCE is supported in recent Intel CPUs,
    where some hardware error such as some memory error can be reported
    without PCC (processor context corrupted). To recover from such MCE,
    the corresponding memory will be unmapped, and all processes accessing
    the memory will be killed via SIGBUS.

    For KVM, if QEMU/KVM is killed, all guest processes will be killed
    too. So we relay SIGBUS from host OS to guest system via a UCR MCE
    injection. Then guest OS can isolate corresponding memory and kill
    necessary guest processes only. SIGBUS sent to main thread (not VCPU
    threads) will be broadcast to all VCPU threads as UCR MCE.

aliguori: fix build

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-10-20 16:15:04 -05:00
Huang Ying 983dfc3b13 Add RAM -> physical addr mapping in MCE simulation
In QEMU-KVM, physical address != RAM address. While MCE simulation
needs physical address instead of RAM address. So
kvm_physical_memory_addr_from_ram() is implemented to do the
conversion, and it is invoked before being filled in the IA32_MCi_ADDR
MSR.

Reported-by: Dean Nelson <dnelson@redhat.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-20 16:15:04 -05:00
Cam Macdonell 44f1a3d876 Add function to assign ioeventfd to MMIO.
Signed-off-by: Cam Macdonell <cam@cs.ualberta.ca>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-08-10 16:25:15 -05:00
Sheng Yang f1665b21f1 kvm: Enable XSAVE live migration support
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-06-28 13:06:03 -03:00
Sheng Yang c958a8bd9b kvm: Extend kvm_arch_get_supported_cpuid() to support index
Would use it later for XSAVE related CPUID.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-06-28 13:06:03 -03:00
Paul Brook 11165820d1 Move stdbool.h
Move inclusion of stdbool.h to common header files, instead of including
in an ad-hoc manner.

Signed-off-by: Paul Brook <paul@codesourcery.com>
2010-06-13 19:00:50 +01:00
Gleb Natapov 4513d9232b Do not stop VM if emulation failed in userspace.
Continue vcpu execution in case emulation failure happened while vcpu
was in userspace. In this case #UD will be injected into the guest
allowing guest OS to kill offending process and continue.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-11 14:03:44 -03:00
Marcelo Tosatti 0af691d779 kvm: enable smp > 1
Process INIT/SIPI requests and enable -smp > 1.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-11 14:02:22 -03:00
Jan Kiszka ff44f1a373 KVM: x86: Add debug register saving and restoring
Make use of the new KVM_GET/SET_DEBUGREGS to save/restore the x86 debug
registers.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-04-26 11:28:35 -03:00
Paolo Bonzini 98c8573eb3 provide a stub version of kvm-all.c if !CONFIG_KVM
This allows limited use of kvm functions (which will return ENOSYS)
even in once-compiled modules.  The patch also improves a bit the error
messages for KVM initialization.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[blauwirbel@gmail.com: fixed Win32 build]
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-04-19 18:59:30 +00:00
Paolo Bonzini 00a1555e0c move around definitions in kvm.h that do not require CPUState
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2010-04-09 18:55:55 +02:00
Michael S. Tsirkin ca82180603 kvm: add API to set ioeventfd
Comment on kvm usage: rather than require users to do if (kvm_enabled())
and/or ifdefs, this patch adds an API that, internally, is defined to
stub function on non-kvm build, and checks kvm_enabled for non-kvm
run.

While rest of qemu code still uses if (kvm_enabled()), I think this
approach is cleaner, and we should convert rest of code to it
long term.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-04-01 13:56:43 -05:00
Blue Swirl 1c14f162dd Allow various header files to be included from non-CPU code
Allow balloon.h, gdbstub.h and kvm.h to be included from
non-CPU code.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-03-29 19:23:47 +00:00
Paul Brook b3755a915e Disable phsyical memory handling in userspace emulation.
Code to handle physical memory access is not meaningful in usrmode emulation,
so disable it.

Signed-off-by: Paul Brook <paul@codesourcery.com>
2010-03-12 18:34:25 +00:00
Jan Kiszka ea375f9ab8 KVM: Rework VCPU state writeback API
This grand cleanup drops all reset and vmsave/load related
synchronization points in favor of four(!) generic hooks:

- cpu_synchronize_all_states in qemu_savevm_state_complete
  (initial sync from kernel before vmsave)
- cpu_synchronize_all_post_init in qemu_loadvm_state
  (writeback after vmload)
- cpu_synchronize_all_post_init in main after machine init
- cpu_synchronize_all_post_reset in qemu_system_reset
  (writeback after system reset)

These writeback points + the existing one of VCPU exec after
cpu_synchronize_state map on three levels of writeback:

- KVM_PUT_RUNTIME_STATE (during runtime, other VCPUs continue to run)
- KVM_PUT_RESET_STATE   (on synchronous system reset, all VCPUs stopped)
- KVM_PUT_FULL_STATE    (on init or vmload, all VCPUs stopped as well)

This level is passed to the arch-specific VCPU state writing function
that will decide which concrete substates need to be written. That way,
no writer of load, save or reset functions that interact with in-kernel
KVM states will ever have to worry about synchronization again. That
also means that a lot of reasons for races, segfaults and deadlocks are
eliminated.

cpu_synchronize_state remains untouched, just as Anthony suggested. We
continue to need it before reading or writing of VCPU states that are
also tracked by in-kernel KVM subsystems.

Consequently, this patch removes many cpu_synchronize_state calls that
are now redundant, just like remaining explicit register syncs.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-04 00:29:28 -03:00
Jan Kiszka b0b1d69079 KVM: Rework of guest debug state writing
So far we synchronized any dirty VCPU state back into the kernel before
updating the guest debug state. This was a tribute to a deficite in x86
kernels before 2.6.33. But as this is an arch-dependent issue, it is
better handle in the x86 part of KVM and remove the writeback point for
generic code. This also avoids overwriting the flushed state later on if
user space decides to change some more registers before resuming the
guest.

We furthermore need to reinject guest exceptions via the appropriate
mechanism. That is KVM_SET_GUEST_DEBUG for older kernels and
KVM_SET_VCPU_EVENTS for recent ones. Using both mechanisms at the same
time will cause state corruptions.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-04 00:29:26 -03:00
Blue Swirl 20c205269d Fix mingw32 build
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-02-23 21:46:28 +00:00
Marcelo Tosatti cc84de9570 kvm: consume internal signal with sigtimedwait
Change the way the internal qemu signal, used for communication between
iothread and vcpus, is handled.

Block and consume it with sigtimedwait on the outer vcpu loop, which
allows more precise timing control.

Change from standard signal (SIGUSR1) to real-time one, so multiple
signals are not collapsed.

Set the signal number on KVM's in-kernel allowed sigmask.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-02-22 10:58:33 +02:00
Michael S. Tsirkin 7b8f3b7834 kvm: move kvm to use memory notifiers
remove direct kvm calls from exec.c, make
kvm use memory notifiers framework instead.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-02-09 16:56:13 -06:00
Sheng Yang 62a2744ca0 kvm: Flush coalesced MMIO buffer periodly
The default action of coalesced MMIO is, cache the writing in buffer, until:
1. The buffer is full.
2. Or the exit to QEmu due to other reasons.

But this would result in a very late writing in some condition.
1. The each time write to MMIO content is small.
2. The writing interval is big.
3. No need for input or accessing other devices frequently.

This issue was observed in a experimental embbed system. The test image
simply print "test" every 1 seconds. The output in QEmu meets expectation,
but the output in KVM is delayed for seconds.

Per Avi's suggestion, I hooked flushing coalesced MMIO buffer in VGA update
handler. By this way, We don't need vcpu explicit exit to QEmu to
handle this issue.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-02-03 19:47:33 -02:00
Jan Kiszka a0fb002c64 kvm: x86: Add support for VCPU event states
This patch extends the qemu-kvm state sync logic with support for
KVM_GET/SET_VCPU_EVENTS, giving access to yet missing exception,
interrupt and NMI states.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-12-03 15:25:57 -06:00
Jan Kiszka caa5af0ff3 kvm: Add arch reset handler
Will be required by succeeding changes.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-11-17 08:49:37 -06:00
Hollis Blanchard 9bdbe550f0 kvm: Move KVM mp_state accessors to i386-specific code
Unbreaks PowerPC and S390 KVM builds.

Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-11-12 11:23:55 -06:00
Anthony Liguori c227f0995e Revert "Get rid of _t suffix"
In the very least, a change like this requires discussion on the list.

The naming convention is goofy and it causes a massive merge problem.  Something
like this _must_ be presented on the list first so people can provide input
and cope with it.

This reverts commit 99a0949b72.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-10-01 16:12:16 -05:00
malc 99a0949b72 Get rid of _t suffix
Some not so obvious bits, slirp and Xen were left alone for the time
being.

Signed-off-by: malc <av1474@comtv.ru>
2009-10-01 22:45:02 +04:00