Commit Graph

224 Commits

Author SHA1 Message Date
Sheng Yang 5fdbcb9dd1 KVM: VMX: Fix undefined beaviour of EPT after reload kvm-intel.ko
As well as move set base/mask ptes to vmx_init().

Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-27 11:34:10 +03:00
Sheng Yang 5ec5726a16 KVM: VMX: Fix bypass_guest_pf enabling when disable EPT in module parameter
Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-27 11:34:10 +03:00
Marcelo Tosatti c93cd3a588 KVM: task switch: translate guest segment limit to virt-extension byte granular field
If 'g' is one then limit is 4kb granular.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-27 11:34:10 +03:00
Avi Kivity 577bdc4966 KVM: Avoid instruction emulation when event delivery is pending
When an event (such as an interrupt) is injected, and the stack is
shadowed (and therefore write protected), the guest will exit.  The
current code will see that the stack is shadowed and emulate a few
instructions, each time postponing the injection.  Eventually the
injection may succeed, but at that time the guest may be unwilling
to accept the interrupt (for example, the TPR may have changed).

This occurs every once in a while during a Windows 2008 boot.

Fix by unshadowing the fault address if the fault was due to an event
injection.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-27 11:34:10 +03:00
Marcelo Tosatti 34198bf842 KVM: task switch: use seg regs provided by subarch instead of reading from GDT
There is no guarantee that the old TSS descriptor in the GDT contains
the proper base address. This is the case for Windows installation's
reboot-via-triplefault.

Use guest registers instead. Also translate the address properly.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-27 11:34:09 +03:00
Marcelo Tosatti 98899aa0e0 KVM: task switch: segment base is linear address
The segment base is always a linear address, so translate before
accessing guest memory.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-27 11:34:09 +03:00
Joerg Roedel 5f4cb662a0 KVM: SVM: allow enabling/disabling NPT by reloading only the architecture module
If NPT is enabled after loading both KVM modules on AMD and it should be
disabled, both KVM modules must be reloaded. If only the architecture module is
reloaded the behavior is undefined. With this patch it is possible to disable
NPT only by reloading the kvm_amd module.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-27 11:34:09 +03:00
Avi Kivity 722c05f219 KVM: MMU: Fix potential race setting upper shadow ptes on nonpae hosts
The direct mapped shadow code (used for real mode and two dimensional paging)
sets upper-level ptes using direct assignment rather than calling
set_shadow_pte().  A nonpae host will split this into two writes, which opens
up a race if another vcpu accesses the same memory area.

Fix by calling set_shadow_pte() instead of assigning directly.

Noticed by Izik Eidus.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:40 +03:00
Glauber Costa 2a7c5b8b55 KVM: x86 emulator: emulate clflush
If the guest issues a clflush in a mmio address, the instruction
can trap into the hypervisor. Currently, we do not decode clflush
properly, causing the guest to hang. This patch fixes this emulating
clflush (opcode 0f ae).

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:40 +03:00
Marcelo Tosatti 376c53c2b3 KVM: MMU: improve invalid shadow root page handling
Harden kvm_mmu_zap_page() against invalid root pages that
had been shadowed from memslots that are gone.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:40 +03:00
Marcelo Tosatti 34d4cb8fca KVM: MMU: nuke shadowed pgtable pages and ptes on memslot destruction
Flush the shadow mmu before removing regions to avoid stale entries.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:40 +03:00
Avi Kivity d6e88aec07 KVM: Prefix some x86 low level function with kvm_, to avoid namespace issues
Fixes compilation with CONFIG_VMI enabled.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:39 +03:00
Ben-Ami Yassour c65bbfa1d6 KVM: check injected pic irq within valid pic irqs
Check that an injected pic irq is between 0 and 15.

Signed-off-by: Ben-Ami Yassour <benami@il.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:39 +03:00
Mohammed Gamal 19fdfa0d13 KVM: x86 emulator: Fix HLT instruction
This patch fixes issue encountered with HLT instruction
under FreeDOS's HIMEM XMS Driver.

The HLT instruction jumped directly to the done label and
skips updating the EIP value, therefore causing the guest
to spin endlessly on the same instruction.

The patch changes the instruction so that it writes back
the updated EIP value.

Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:38 +03:00
Avi Kivity ac9f6dc0db KVM: Apply the kernel sigmask to vcpus blocked due to being uninitialized
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:38 +03:00
Sheng Yang 4e1096d27f KVM: VMX: Add ept_sync_context in flush_tlb
Fix a potention issue caused by kvm_mmu_slot_remove_write_access(). The
old behavior don't sync EPT TLB with modified EPT entry, which result
in inconsistent content of EPT TLB and EPT table.

Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:38 +03:00
Marcelo Tosatti 5a4c928804 KVM: mmu_shrink: kvm_mmu_zap_page requires slots_lock to be held
kvm_mmu_zap_page() needs slots lock held (rmap_remove->gfn_to_memslot,
for example).

Since kvm_lock spinlock is held in mmu_shrink(), do a non-blocking
down_read_trylock().

Untested.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:38 +03:00
Joerg Roedel 0da1db75a2 KVM: SVM: fix suspend/resume support
On suspend the svm_hardware_disable function is called which frees all svm_data
variables. On resume they are not re-allocated. This patch removes the
deallocation of svm_data from the hardware_disable function to the
hardware_unsetup function which is not called on suspend.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:37 +03:00
Marcelo Tosatti f8b78fa3d4 KVM: move slots_lock acquision down to vapic_exit
There is no need to grab slots_lock if the vapic_page will not
be touched.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:36 +03:00
Chris Lalancette efa67e0d1f KVM: VMX: Fake emulate Intel perfctr MSRs
Older linux guests (in this case, 2.6.9) can attempt to
access the performance counter MSRs without a fixup section, and injecting
a GPF kills the guest.  Work around by allowing the guest to write those MSRs.

Tested by me on RHEL-4 i386 and x86_64 guests, as well as F-9 guests.

Signed-off-by: Chris Lalancette <clalance@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:36 +03:00
Sheng Yang 65267ea1b3 KVM: VMX: Fix a wrong usage of vmcs_config
The function ept_update_paging_mode_cr0() write to
CPU_BASED_VM_EXEC_CONTROL based on vmcs_config.cpu_based_exec_ctrl. That's
wrong because the variable may not consistent with the content in the
CPU_BASE_VM_EXEC_CONTROL MSR.

Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:36 +03:00
Avi Kivity db475c39ec KVM: MMU: Fix printk format
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:35 +03:00
Avi Kivity 6ada8cca79 KVM: MMU: When debug is enabled, make it a run-time parameter
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:35 +03:00
Avi Kivity 7a5b56dfd3 KVM: x86 emulator: lazily evaluate segment registers
Instead of prefetching all segment bases before emulation, read them at the
last moment.  Since most of them are unneeded, we save some cycles on
Intel machines where this is a bit expensive.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:35 +03:00
Avi Kivity 0adc8675d6 KVM: x86 emulator: avoid segment base adjust for lea
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:34 +03:00
Avi Kivity f5b4edcd52 KVM: x86 emulator: simplify rip relative decoding
rip relative decoding is relative to the instruction pointer of the next
instruction; by moving address adjustment until after decoding is complete,
we remove the need to determine the instruction size.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:34 +03:00
Avi Kivity 84411d85da KVM: x86 emulator: simplify r/m decoding
Consolidate the duplicated code when not in any special case.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:33 +03:00
Avi Kivity dc71d0f162 KVM: x86 emulator: simplify sib decoding
Instead of using sparse switches, use simpler if/else sequences.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:33 +03:00
Avi Kivity 8684c0af0b KVM: x86 emulator: handle undecoded rex.b with r/m = 5 in certain cases
x86_64 does not decode rex.b in certain cases, where the r/m field = 5.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:33 +03:00
Mohammed Gamal b13354f8f0 KVM: x86 emulator: emulate nop and xchg reg, acc (opcodes 0x90 - 0x97)
Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:33 +03:00
Avi Kivity f76c710d75 KVM: Use printk_rlimit() instead of reporting emulation failures just once
Emulation failure reports are useful, so allow more than one per the lifetime
of the module.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:32 +03:00
Glauber Costa 25be46080f KVM: Do not calculate linear rip in emulation failure report
If we're not gonna do anything (case in which failure is already
reported), we do not need to even bother with calculating the linear rip.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:32 +03:00
Marcelo Tosatti 622395a9e6 KVM: only abort guest entry if timer count goes from 0->1
Only abort guest entry if the timer count went from 0->1, since for 1->2
or larger the bit will either be set already or a timer irq will have
been injected.

Using atomic_inc_and_test() for it also introduces an SMP barrier
to the LAPIC version (thought it was unecessary because of timer
migration, but guest can be scheduled to a different pCPU between exit
and kvm_vcpu_block(), so there is the possibility for a race).

Noticed by Avi.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:32 +03:00
Laurent Vivier 542472b53e KVM: Add coalesced MMIO support (x86 part)
This patch enables coalesced MMIO for x86 architecture.
It defines KVM_MMIO_PAGE_OFFSET and KVM_CAP_COALESCED_MMIO.
It enables the compilation of coalesced_mmio.c.

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:31 +03:00
Laurent Vivier 92760499d0 KVM: kvm_io_device: extend in_range() to manage len and write attribute
Modify member in_range() of structure kvm_io_device to pass length and the type
of the I/O (write or read).

This modification allows to use kvm_io_device with coalesced MMIO.

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:30 +03:00
Avi Kivity 131d82791b KVM: MMU: Avoid page prefetch on SVM
SVM cannot benefit from page prefetching since guest page fault bypass
cannot by made to work there.  Avoid accessing the guest page table in
this case.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:30 +03:00
Avi Kivity d761a501cf KVM: MMU: Move nonpaging_prefetch_page()
In preparation for next patch. No code change.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:30 +03:00
Avi Kivity 91ed7a0e15 KVM: x86 emulator: implement 'push imm' (opcode 0x68)
Encountered in FC6 boot sequence, now that we don't force ss.rpl = 0 during
the protected mode transition.  Not really necessary, but nice to have.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:29 +03:00
Avi Kivity 19e43636b5 KVM: x86 emulator: simplify push imm8 emulation
Instead of fetching the data explicitly, use SrcImmByte.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:29 +03:00
Avi Kivity eab9f71feb KVM: MMU: Optimize prefetch_page()
Instead of reading each pte individually, read 256 bytes worth of ptes and
batch process them.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:28 +03:00
Guillaume Thouvenin 38d5bc6d50 KVM: x86 emulator: Add support for mov r, sreg (0x8c) instruction
Add support for mov r, sreg (0x8c) instruction

Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net>
Signed-off-by: Laurent Vivier <laurent.vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:28 +03:00
Guillaume Thouvenin 4257198ae2 KVM: x86 emulator: Add support for mov seg, r (0x8e) instruction
Add support for mov r, sreg (0x8c) instruction.

[avi: drop the sreg decoding table in favor of 1:1 encoding]

Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net>
Signed-off-by: Laurent Vivier <laurent.vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:28 +03:00
Guillaume Thouvenin 615ac12561 KVM: x86 emulator: adds support to mov r,imm (opcode 0xb8) instruction
Add support to mov r, imm (0xb8) instruction.

Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net>
Signed-off-by: Laurent Vivier <laurent.vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:27 +03:00
Guillaume Thouvenin 954cd36f76 KVM: x86 emulator: add support for jmp far 0xea
Add support for jmp far (opcode 0xea) instruction.

Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net>
Signed-off-by: Laurent Vivier <laurent.vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:27 +03:00
Guillaume Thouvenin 89c696383d KVM: x86 emulator: Update c->dst.bytes in decode instruction
Update c->dst.bytes in decode instruction instead of instruction
itself.  It's needed because if c->dst.bytes is equal to 0, the
instruction is not emulated.

Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net>
Signed-off-by: Laurent Vivier <laurent.vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:27 +03:00
Guillaume Thouvenin 3e6e0aab1b KVM: Prefixes segment functions that will be exported with "kvm_"
Prefixes functions that will be exported with kvm_.
We also prefixed set_segment() even if it still static
to be coherent.

signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net>
Signed-off-by: Laurent Vivier <laurent.vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:27 +03:00
Avi Kivity 9ba075a664 KVM: MTRR support
Add emulation for the memory type range registers, needed by VMware esx 3.5,
and by pci device assignment.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:26 +03:00
Sheng Yang f08864b42a KVM: VMX: Enable NMI with in-kernel irqchip
Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:26 +03:00
Sheng Yang 3419ffc8e4 KVM: IOAPIC/LAPIC: Enable NMI support
[avi: fix ia64 build breakage]

Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:25 +03:00
Avi Kivity 50d40d7fb9 KVM: Remove unnecessary ->decache_regs() call
Since we aren't modifying any register, there's no need to decache
the register state.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:25 +03:00