linux

Commit Graph

Author	SHA1	Message	Date
Avi Kivity	1961d276c8	KVM: Add guest mode signal mask Allow a special signal mask to be used while executing in guest mode. This allows signals to be used to interrupt a vcpu without requiring signal delivery to a userspace handler, which is quite expensive. Userspace still receives -EINTR and can get the signal via sigwait(). Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-05-03 10:52:24 +03:00
Avi Kivity	6722c51c51	KVM: Initialize the apic_base msr on svm too Older userspace didn't care, but newer userspace (with the cpuid changes) does. Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-05-03 10:52:24 +03:00
Avi Kivity	1b19f3e61d	KVM: Add a special exit reason when exiting due to an interrupt This is redundant, as we also return -EINTR from the ioctl, but it allows us to examine the exit_reason field on resume without seeing old data. Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-05-03 10:52:24 +03:00
Avi Kivity	8eb7d334bd	KVM: Fold kvm_run::exit_type into kvm_run::exit_reason Currently, userspace is told about the nature of the last exit from the guest using two fields, exit_type and exit_reason, where exit_type has just two enumerations (and no need for more). So fold exit_type into exit_reason, reducing the complexity of determining what really happened. Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-05-03 10:52:24 +03:00
Avi Kivity	b4e63f560b	KVM: Allow userspace to process hypercalls which have no kernel handler This is useful for paravirtualized graphics devices, for example. Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-05-03 10:52:24 +03:00
Avi Kivity	5d308f4550	KVM: Add method to check for backwards-compatible API extensions Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-05-03 10:52:24 +03:00
Avi Kivity	106b552b43	KVM: Remove the 'emulated' field from the userspace interface We no longer emulate single instructions in userspace. Instead, we service mmio or pio requests. Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-05-03 10:52:23 +03:00
Avi Kivity	06465c5a3a	KVM: Handle cpuid in the kernel instead of punting to userspace KVM used to handle cpuid by letting userspace decide what values to return to the guest. We now handle cpuid completely in the kernel. We still let userspace decide which values the guest will see by having userspace set up the value table beforehand (this is necessary to allow management software to set the cpu features to the least common denominator, so that live migration can work). The motivation for the change is that kvm kernel code can be impacted by cpuid features, for example the x86 emulator. Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-05-03 10:52:23 +03:00
Avi Kivity	46fc147788	KVM: Do not communicate to userspace through cpu registers during PIO Currently when passing the a PIO emulation request to userspace, we rely on userspace updating %rax (on 'in' instructions) and %rsi/%rdi/%rcx (on string instructions). This (a) requires two extra ioctls for getting and setting the registers and (b) is unfriendly to non-x86 archs, when they get kvm ports. So fix by doing the register fixups in the kernel and passing to userspace only an abstract description of the PIO to be done. Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-05-03 10:52:23 +03:00
Avi Kivity	9a2bb7f486	KVM: Use a shared page for kernel/user communication when runing a vcpu Instead of passing a 'struct kvm_run' back and forth between the kernel and userspace, allocate a page and allow the user to mmap() it. This reduces needless copying and makes the interface expandable by providing lots of free space. Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-05-03 10:52:23 +03:00
Avi Kivity	1ea252afcd	KVM: Fix bogus sign extension in mmu mapping audit When auditing a 32-bit guest on a 64-bit host, sign extension of the page table directory pointer table index caused bogus addresses to be shown on audit errors. Fix by declaring the index unsigned. Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-05-03 10:52:23 +03:00
Avi Kivity	bbe4432e66	KVM: Use own minor number Use the minor number (232) allocated to kvm by lanana. Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-05-03 10:52:22 +03:00
Dor Laor	510043da85	KVM: Use the generic skip_emulated_instruction() in hypercall code Instead of twiddling the rip registers directly, use the skip_emulated_instruction() function to do that for us. Signed-off-by: Dor Laor <dor.laor@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-05-03 10:52:22 +03:00
Dor Laor	9b22bf5783	KVM: Fix guest register corruption on paravirt hypercall The hypercall code mixes up the ->cache_regs() and ->decache_regs() callbacks, resulting in guest register corruption. Signed-off-by: Dor Laor <dor.laor@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-05-03 10:52:22 +03:00
Avi Kivity	6b8d0f9b18	KVM: Fix off-by-one when writing to a nonpae guest pde Nonpae guest pdes are shadowed by two pae ptes, so we double the offset twice: once to account for the pte size difference, and once because we need to shadow pdes for a single guest pde. But when writing to the upper guest pde we also need to truncate the lower bits, otherwise the multiply shifts these bits into the pde index and causes an access to the wrong shadow pde. If we're at the end of the page (accessing the very last guest pde) we can even overflow into the next host page and oops. Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-04-19 18:39:26 +03:00
Ingo Molnar	6d9658df07	KVM: always reload segment selectors failed VM entry on VMX might still change %fs or %gs, thus make sure that KVM always reloads the segment selectors. This is crutial on both x86 and x86_64: x86 has __KERNEL_PDA in %fs on which things like 'current' depends and x86_64 has 0 there and needs MSR_GS_BASE to work. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-03-27 17:55:48 +02:00
Avi Kivity	6af11b9e82	KVM: Prevent system selectors leaking into guest on real->protected mode transition on vmx Intel virtualization extensions do not support virtualizing real mode. So kvm uses virtualized vm86 mode to run real mode code. Unfortunately, this virtualized vm86 mode does not support the so called "big real" mode, where the segment selector and base do not agree with each other according to the real mode rules (base == selector << 4). To work around this, kvm checks whether a selector/base pair violates the virtualized vm86 rules, and if so, forces it into conformance. On a transition back to protected mode, if we see that the guest did not touch a forced segment, we restore it back to the original protected mode value. This pile of hacks breaks down if the gdt has changed in real mode, as it can cause a segment selector to point to a system descriptor instead of a normal data segment. In fact, this happens with the Windows bootloader and the qemu acpi bios, where a protected mode memcpy routine issues an innocent 'pop %es' and traps on an attempt to load a system descriptor. "Fix" by checking if the to-be-restored selector points at a system segment, and if so, coercing it into a normal data segment. The long term solution, of course, is to abandon vm86 mode and use emulation for big real mode. Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-03-27 17:54:38 +02:00
Avi Kivity	27aba76615	KVM: MMU: Fix host memory corruption on i386 with >= 4GB ram PAGE_MASK is an unsigned long, so using it to mask physical addresses on i386 (which are 64-bit wide) leads to truncation. This can result in page->private of unrelated memory pages being modified, with disasterous results. Fix by not using PAGE_MASK for physical addresses; instead calculate the correct value directly from PAGE_SIZE. Also fix a similar BUG_ON(). Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-03-18 10:49:09 +02:00
Avi Kivity	ac1b714e78	KVM: MMU: Fix guest writes to nonpae pde KVM shadow page tables are always in pae mode, regardless of the guest setting. This means that a guest pde (mapping 4MB of memory) is mapped to two shadow pdes (mapping 2MB each). When the guest writes to a pte or pde, we intercept the write and emulate it. We also remove any shadowed mappings corresponding to the write. Since the mmu did not account for the doubling in the number of pdes, it removed the wrong entry, resulting in a mismatch between shadow page tables and guest page tables, followed shortly by guest memory corruption. This patch fixes the problem by detecting the special case of writing to a non-pae pde and adjusting the address and number of shadow pdes zapped accordingly. Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-03-18 10:49:09 +02:00
Avi Kivity	f5b42c3324	KVM: Fix guest sysenter on vmx The vmx code currently treats the guest's sysenter support msrs as 32-bit values, which breaks 32-bit compat mode userspace on 64-bit guests. Fix by using the native word width of the machine. Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-03-18 10:49:06 +02:00
Avi Kivity	ca45aaae1e	KVM: Unset kvm_arch_ops if arch module loading failed Otherwise, the core module thinks the arch module is loaded, and won't let you reload it after you've fixed the bug. Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-03-18 10:49:06 +02:00
Andrew Morton	e9cdb1e330	KVM: Move kvmfs magic number to <linux/magic.h> Use the standard magic.h for kvmfs. Cc: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-03-04 11:12:43 +02:00
Avi Kivity	58e690e6fd	KVM: Fix bogus failure in kvm.ko module initialization A bogus 'return r' can cause an otherwise successful module load to fail. This both denies users the use of kvm, and it also denies them the use of their machine, as it leaves a filesystem registered with its callbacks pointing into now-freed module memory. Fix by returning a zero like a good module. Thanks to Richard Lucassen <mailinglists@lucassen.org> (?) for reporting the problem and for providing access to a machine which exhibited it. Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-03-04 11:12:43 +02:00
Uri Lublin	ff990d5952	KVM: Remove write access permissions when dirty-page-logging is enabled Enabling dirty page logging is done using KVM_SET_MEMORY_REGION ioctl. If the memory region already exists, we need to remove write accesses, so writes will be caught, and dirty pages will be logged. Signed-off-by: Uri Lublin <uril@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-03-04 11:12:43 +02:00
Uri Lublin	02b27c1f80	kvm: move do_remove_write_access() up To be called from kvm_vm_ioctl_set_memory_region() Signed-off-by: Uri Lublin <uril@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-03-04 11:12:43 +02:00
Uri Lublin	cd1a4a982a	KVM: Fix dirty page log bitmap size/access calculation Since dirty_bitmap is an unsigned long array, the alignment and size need to take that into account. Signed-off-by: Uri Lublin <uril@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-03-04 11:12:42 +02:00
Uri Lublin	ab51a434c5	KVM: Add missing calls to mark_page_dirty() A few places where we modify guest memory fail to call mark_page_dirty(), causing live migration to fail. This adds the missing calls. Signed-off-by: Uri Lublin <uril@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-03-04 11:12:42 +02:00
Avi Kivity	bccf2150fe	KVM: Per-vcpu inodes Allocate a distinct inode for every vcpu in a VM. This has the following benefits: - the filp cachelines are no longer bounced when f_count is incremented on every ioctl() - the API and internal code are distinctly clearer; for example, on the KVM_GET_REGS ioctl, there is no need to copy the vcpu number from userspace and then copy the registers back; the vcpu identity is derived from the fd used to make the call Right now the performance benefits are completely theoretical since (a) we don't support more than one vcpu per VM and (b) virtualization hardware inefficiencies completely everwhelm any cacheline bouncing effects. But both of these will change, and we need to prepare the API today. Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-03-04 11:12:42 +02:00
Avi Kivity	c5ea766006	KVM: Move kvm_vm_ioctl_create_vcpu() around In preparation of some hacking. Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-03-04 11:12:42 +02:00
Avi Kivity	2c6f5df979	KVM: Rename some kvm_dev_ioctl_() functions to kvm_vm_ioctl_() This reflects the changed scope, from device-wide to single vm (previously every device open created a virtual machine). Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-03-04 11:12:42 +02:00
Avi Kivity	f17abe9a44	KVM: Create an inode per virtual machine This avoids having filp->f_op and the corresponding inode->i_fop different, which is a little unorthodox. The ioctl list is split into two: global kvm ioctls and per-vm ioctls. A new ioctl, KVM_CREATE_VM, is used to create VMs and return the VM fd. Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-03-04 11:12:42 +02:00
Avi Kivity	37e29d906c	KVM: Add internal filesystem for generating inodes The kvmfs inodes will represent virtual machines and vcpus, as necessary, reducing cacheline bouncing due to inodes and filps being shared. Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-03-04 11:12:41 +02:00
Avi Kivity	19d1408dfd	KVM: More 0 -> NULL conversions Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-03-04 11:12:41 +02:00
Joerg Roedel	0152527b76	KVM: SVM: intercept SMI to handle it at host level This patch changes the SVM code to intercept SMIs and handle it outside the guest. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-03-04 11:12:41 +02:00
Avi Kivity	cd205625e9	KVM: svm: init cr0 with the wp bit set Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-03-04 11:12:41 +02:00
Avi Kivity	270fd9b96f	KVM: Wire up hypercall handlers to a central arch-independent location Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-03-04 11:12:41 +02:00
Avi Kivity	02e235bc8e	KVM: Add hypercall host support for svm Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-03-04 11:12:41 +02:00
Ingo Molnar	c21415e843	KVM: Add host hypercall support for vmx Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-03-04 11:12:40 +02:00
Ingo Molnar	102d8325a1	KVM: add MSR based hypercall API This adds a special MSR based hypercall API to KVM. This is to be used by paravirtual kernels and virtual drivers. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-03-04 11:12:40 +02:00
Markus Rechberger	5972e9535e	KVM: Use page_private()/set_page_private() apis Besides using an established api, this allows using kvm in older kernels. Signed-off-by: Markus Rechberger <markus.rechberger@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-03-04 11:12:39 +02:00
Ahmed S. Darwish	9d8f549dc6	KVM: Use ARRAY_SIZE macro instead of manual calculation. Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com> Signed-off-by: Dor Laor <dor.laor@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-03-04 11:12:39 +02:00
Joerg Roedel	de979caacc	KVM: vmx: hack set_cr0_no_modeswitch() to actually do modeswitch The whole thing is rotten, but this allows vmx to boot with the guest reboot fix. Signed-off-by: Markus Rechberger <markus.rechberger@amd.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-03-04 11:12:39 +02:00
Avi Kivity	d27d4aca18	KVM: Cosmetics Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-03-04 11:12:39 +02:00
Jeremy Katz	43934a38d7	KVM: Move virtualization deactivation from CPU_DEAD state to CPU_DOWN_PREPARE This gives it more chances of surviving suspend. Signed-off-by: Jeremy Katz <katzj@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-03-04 11:12:39 +02:00
Avi Kivity	bf3f8e86c2	KVM: mmu: add missing dirty page tracking cases We fail to mark a page dirty in three cases: - setting the accessed bit in a pte - setting the dirty bit in a pte - emulating a write into a pagetable This fix adds the missing cases. Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-03-04 11:12:39 +02:00
Jeremy Fitzhardinge	464d1a78fb	[PATCH] i386: Convert i386 PDA code to use %fs Convert the PDA code to use %fs rather than %gs as the segment for per-processor data. This is because some processors show a small but measurable performance gain for reloading a NULL segment selector (as %fs generally is in user-space) versus a non-NULL one (as %gs generally is). On modern processors the difference is very small, perhaps undetectable. Some old AMD "K6 3D+" processors are noticably slower when %fs is used rather than %gs; I have no idea why this might be, but I think they're sufficiently rare that it doesn't matter much. This patch also fixes the math emulator, which had not been adjusted to match the changed struct pt_regs. [frederik.deweerdt@gmail.com: fixit with gdb] [mingo@elte.hu: Fix KVM too] Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Ian Campbell <Ian.Campbell@XenSource.com> Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Zachary Amsden <zach@vmware.com> Cc: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Frederik Deweerdt <frederik.deweerdt@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org>	2007-02-13 13:26:20 +01:00
Avi Kivity	59ae6c6b87	[PATCH] KVM: Host suspend/resume support Add the necessary callbacks to suspend and resume a host running kvm. This is just a repeat of the cpu hotplug/unplug work. Signed-off-by: Avi Kivity <avi@qumranet.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-12 09:48:41 -08:00
Avi Kivity	774c47f1d7	[PATCH] KVM: cpu hotplug support On hotplug, we execute the hardware extension enable sequence. On unplug, we decache any vcpus that last ran on the exiting cpu, and execute the hardware extension disable sequence. Signed-off-by: Avi Kivity <avi@qumranet.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-12 09:48:41 -08:00
Avi Kivity	8d0be2b3bf	[PATCH] KVM: VMX: add vcpu_clear() Like the inline code it replaces, this function decaches the vmcs from the cpu it last executed on. in addition: - vcpu_clear() works if the last cpu is also the cpu we're running on - it is faster on larger smps by virtue of using smp_call_function_single() Includes fix from Ingo Molnar. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Avi Kivity <avi@qumranet.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-12 09:48:41 -08:00
Avi Kivity	133de9021d	[PATCH] KVM: Add a global list of all virtual machines This will allow us to iterate over all vcpus and see which cpus they are running on. [akpm@osdl.org: use standard (ugly) initialisers] Signed-off-by: Avi Kivity <avi@qumranet.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-12 09:48:40 -08:00
Ingo Molnar	1e8ba6fba5	[PATCH] kvm: fix vcpu freeing bug vcpu_load() can return NULL and it sometimes does in failure paths (for example when the userspace ABI version is too old) - causing a preemption count underflow in the ->vcpu_free() later on. So check for NULL. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-12 09:48:40 -08:00
Avi Kivity	26bb83a755	[PATCH] kvm: VMX: Reload ds and es even in 64-bit mode Or 32-bit userspace will get confused. Signed-off-by: Avi Kivity <avi@qumranet.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-12 09:48:40 -08:00
Dor Laor	54810342f1	[PATCH] kvm: Two-way apic tpr synchronization We report the value of cr8 to userspace on an exit. Also let userspace change cr8 when we re-enter the guest. The lets 64-bit guest code maintain the tpr correctly. Thanks for Yaniv Kamay for the idea. Signed-off-by: Dor Laor <dor.laor@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-12 09:48:40 -08:00
Avi Kivity	d92899a001	[PATCH] kvm: SVM: Hack initial cpu csbase to be consistent with intel This allows us to run the mmu testsuite on amd. Signed-off-by: Avi Kivity <avi@qumranet.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-12 09:48:40 -08:00
Avi Kivity	ac6c2bc592	[PATCH] kvm: Fix mmu going crazy of guest sets cr0.wp == 0 The kvm mmu relies on cr0.wp being set even if the guest does not set it. The vmx code correctly forces cr0.wp at all times, the svm code does not, so it can't boot solaris without this patch. Signed-off-by: Avi Kivity <avi@qumranet.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-12 09:48:40 -08:00
Avi Kivity	988ad74ff6	[PATCH] kvm: vmx: handle triple faults by returning EXIT_REASON_SHUTDOWN to userspace Just like svm. Signed-off-by: Avi Kivity <avi@qumranet.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-12 09:48:40 -08:00
Avi Kivity	e119d117a1	[PATCH] kvm: Fix gva_to_gpa() gva_to_gpa() needs to be updated to the new walk_addr() calling convention, otherwise it may oops under some circumstances. Use the opportunity to remove all the code duplication in gva_to_gpa(), which essentially repeats the calculations in walk_addr(). Signed-off-by: Avi Kivity <avi@qumranet.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-12 09:48:40 -08:00
S.Caglar Onur	a0610ddf6b	[PATCH] kvm: Fix asm constraint for lldt instruction lldt does not accept immediate operands, which "g" allows. Signed-off-by: S.Caglar Onur <caglar@pardus.org.tr> Signed-off-by: Avi Kivity <avi@qumranet.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-12 09:48:40 -08:00
Ingo Molnar	96958231ce	[PATCH] kvm: optimize inline assembly Forms like "0(%rsp)" generate an instruction with an unnecessary one byte displacement under certain circumstances. replace with the equivalent "(%rsp)". Signed-off-by: Avi Kivity <avi@qumranet.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-12 09:48:40 -08:00
Al Viro	11718b4d6b	[PATCH] misc NULL noise removal Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-09 09:14:07 -08:00
Al Viro	8b6d44c7bd	[PATCH] kvm: NULL noise removal Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-09 09:14:07 -08:00
Al Viro	2f36698799	[PATCH] kvm: __user annotations Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-09 09:14:07 -08:00
Avi Kivity	432bd6cbf9	[PATCH] KVM: fix lockup on 32-bit intel hosts with nx disabled in the bios Intel hosts, without long mode, and with nx support disabled in the bios have an efer that is readable but not writable. This causes a lockup on switch to guest mode (even though it should exit with reason 34 according to the documentation). Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-01 16:22:41 -08:00
Robert P. J. Day	49b14f24cc	[PATCH] Fix "CONFIG_X86_64_" typo in drivers/kvm/svm.c Fix what looks like an obvious typo in the file drivers/kvm/svm.c. Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Acked-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-01-30 08:26:45 -08:00
Joerg Roedel	46fe4ddd9d	[PATCH] KVM: SVM: Propagate cpu shutdown events to userspace This patch implements forwarding of SHUTDOWN intercepts from the guest on to userspace on AMD SVM. A SHUTDOWN event occurs when the guest produces a triple fault (e.g. on reboot). This also fixes the bug that a guest reboot actually causes a host reboot under some circumstances. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-01-26 13:50:57 -08:00
Avi Kivity	73b1087e61	[PATCH] KVM: MMU: Report nx faults to the guest With the recent guest page fault change, we perform access checks on our own instead of relying on the cpu. This means we have to perform the nx checks as well. Software like the google toolbar on windows appears to rely on this somehow. Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-01-26 13:50:57 -08:00
Avi Kivity	7993ba43db	[PATCH] KVM: MMU: Perform access checks in walk_addr() Check pte permission bits in walk_addr(), instead of scattering the checks all over the code. This has the following benefits: 1. We no longer set the accessed bit for accessed which fail permission checks. 2. Setting the accessed bit is simplified. 3. Under some circumstances, we used to pretend a page fault was fixed when it would actually fail the access checks. This caused an unnecessary vmexit. 4. The error code for guest page faults is now correct. The fix helps netbsd further along booting, and allows kvm to pass the new mmu testsuite. Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-01-26 13:50:57 -08:00
Avi Kivity	6f00e68f21	[PATCH] KVM: Emulate IA32_MISC_ENABLE msr This allows netbsd 3.1 i386 to get further along installing. Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-01-26 13:50:57 -08:00
Leonard Norrgard	bce66ca4a2	[PATCH] KVM: SVM: Fix SVM idt confusion There's an obvious typo in svm_{get,set}_idt, causing it to access the ldt instead. Because these functions are only called for save/load on AMD, the bug does not impact normal operation. With the fix, save/load works as expected on AMD hosts. Signed-off-by: Uri Lublin <uril@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-01-26 13:50:57 -08:00
Avi Kivity	fc3dffe121	[PATCH] KVM: fix bogus pagefault on writable pages If a page is marked as dirty in the guest pte, set_pte_common() can set the writable bit on newly-instantiated shadow pte. This optimization avoids a write fault after the initial read fault. However, if a write fault instantiates the pte, fix_write_pf() incorrectly reports the fault as a guest page fault, and the guest oopses on what appears to be a correctly-mapped page. Fix is to detect the condition and only report a guest page fault on a user access to a kernel page. With the fix, a kvm guest can survive a whole night of running the kernel hacker's screensaver (make -j9 in a loop). Signed-off-by: Avi Kivity <avi@qumranet.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-01-23 07:52:06 -08:00
Avi Kivity	038e51de2e	[PATCH] KVM: x86 emulator: fix bit string instructions The various bit string instructions (bts, btc, etc.) fail to adjust the address correctly if the bit address is beyond BITS_PER_LONG. This bug creeped in as the emulator originally relied on cr2 to contain the memory address; however we now decode it from the mod r/m bits, and must adjust the offset to account for large bit indices. The patch is rather large because it switches src and dst decoding around, so that the bit index is available when decoding the memory address. This fixes workloads like the FC5 installer. Signed-off-by: Avi Kivity <avi@qumranet.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-01-23 07:52:06 -08:00
Avi Kivity	cccf748b81	[PATCH] KVM: fix race between mmio reads and injected interrupts The kvm mmio read path looks like: 1. guest read faults 2. kvm emulates read, calls emulator_read_emulated() 3. fails as a read requires userspace help 4. exit to userspace 5. userspace emulates read, kvm sets vcpu->mmio_read_completed 6. re-enter guest, fault again 7. kvm emulates read, calls emulator_read_emulated() 8. succeeds as vcpu->mmio_read_emulated is set 9. instruction completes and guest is resumed A problem surfaces if the userspace exit (step 5) also requests an interrupt injection. In that case, the guest does not re-execute the original instruction, but the interrupt handler. The next time an mmio read is exectued (likely for a different address), step 3 will find vcpu->mmio_read_completed set and return the value read for the original instruction. The problem manifested itself in a few annoying ways: - little squares appear randomly on console when switching virtual terminals - ne2000 fails under nfs read load - rtl8139 complains about "pci errors" even though the device model is incapable of issuing them. Fix by skipping interrupt injection if an mmio read is pending. A better fix is to avoid re-entry into the guest, and re-emulating immediately instead. However that's a bit more complex. Signed-off-by: Avi Kivity <avi@qumranet.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-01-23 07:52:06 -08:00
Avi Kivity	084384754e	[PATCH] KVM: make sure there is a vcpu context loaded when destroying the mmu This makes the vmwrite errors on vm shutdown go away. Signed-off-by: Avi Kivity <avi@qumranet.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-01-23 07:52:06 -08:00
Herbert Xu	e001548911	[PATCH] vmx: Fix register constraint in launch code Both "=r" and "=g" breaks my build on i386: $ make CC [M] drivers/kvm/vmx.o {standard input}: Assembler messages: {standard input}:3318: Error: bad register name `%sil' make[1]: * [drivers/kvm/vmx.o] Error 1 make: * [_module_drivers/kvm] Error 2 The reason is that setbe requires an 8-bit register but "=r" does not constrain the target register to be one that has an 8-bit version on i386. According to http://gcc.gnu.org/bugzilla/show_bug.cgi?id=10153 the correct constraint is "=q". Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-01-22 19:27:02 -08:00
Ingo Molnar	07031e14c1	[PATCH] KVM: add VM-exit profiling This adds the profile=kvm boot option, which enables KVM to profile VM exits. Use: "readprofile -m ./System.map \| sort -n" to see the resulting output: [...] 18246 serial_out 148.3415 18945 native_flush_tlb 378.9000 23618 serial_in 212.7748 29279 __spin_unlock_irq 622.9574 43447 native_apic_write 2068.9048 52702 enable_8259A_irq 742.2817 54250 vgacon_scroll 89.3740 67394 ide_inb 6126.7273 79514 copy_page_range 98.1654 84868 do_wp_page 86.6000 140266 pit_read 783.6089 151436 ide_outb 25239.3333 152668 native_io_delay 21809.7143 174783 mask_and_ack_8259A 783.7803 362404 native_set_pte_at 36240.4000 1688747 total 0.5009 Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-11 18:18:21 -08:00
Dor Laor	022a93080c	[PATCH] KVM: Simplify test for interrupt window No need to test for rflags.if as both VT and SVM specs assure us that on exit caused from interrupt window opening, 'if' is set. Signed-off-by: Dor Laor <dor.laor@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:28 -08:00
Ingo Molnar	68a99f6d37	[PATCH] KVM: Simplify mmu_alloc_roots() Small optimization/cleanup: page == page_header(page->page_hpa) Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:28 -08:00
Ingo Molnar	d21225ee2b	[PATCH] KVM: Make loading cr3 more robust Prevent the guest's loading of a corrupt cr3 (pointing at no guest phsyical page) from crashing the host. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:28 -08:00
Avi Kivity	760db773fb	[PATCH] KVM: MMU: Add missing dirty bit If we emulate a write, we fail to set the dirty bit on the guest pte, leading the guest to believe the page is clean, and thus lose data. Bad. Fix by setting the guest pte dirty bit under such conditions. Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:28 -08:00
Avi Kivity	4db9c47c05	[PATCH] KVM: Don't set guest cr3 from vmx_vcpu_setup() It overwrites the right cr3 set from mmu setup. Happens only with the test harness. Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:28 -08:00
Avi Kivity	cc1d8955cb	[PATCH] KVM: Add missing 'break' Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:28 -08:00
Ingo Molnar	7f7417d67e	[PATCH] KVM: Avoid oom on cr3 switch Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:28 -08:00
Avi Kivity	86a2b42e81	[PATCH] KVM: Initialize vcpu->kvm a little earlier Fixes oops on early close of /dev/kvm. Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:28 -08:00
Avi Kivity	e52de1b8cf	[PATCH] KVM: Improve reporting of vmwrite errors This will allow us to see the root cause when a vmwrite error happens. Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:27 -08:00
Avi Kivity	37a7d8b046	[PATCH] KVM: MMU: add audit code to check mappings, etc are correct Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:27 -08:00
Avi Kivity	9ede74e0af	[PATCH] KVM: MMU: Destroy mmu while we still have a vcpu left mmu_destroy flushes the guest tlb (indirectly), which needs a valid vcpu. Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:27 -08:00
Avi Kivity	40907d5768	[PATCH] KVM: MMU: Flush guest tlb when reducing permissions on a pte If we reduce permissions on a pte, we must flush the cached copy of the pte from the guest's tlb. This is implemented at the moment by flushing the entire guest tlb, and can be improved by flushing just the relevant virtual address, if it is known. Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:27 -08:00
Avi Kivity	e2dec939db	[PATCH] KVM: MMU: Detect oom conditions and propagate error to userspace Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:27 -08:00
Avi Kivity	714b93da1a	[PATCH] KVM: MMU: Replace atomic allocations by preallocated objects The mmu sometimes needs memory for reverse mapping and parent pte chains. however, we can't allocate from within the mmu because of the atomic context. So, move the allocations to a central place that can be executed before the main mmu machinery, where we can bail out on failure before any damage is done. (error handling is deffered for now, but the basic structure is there) Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:27 -08:00
Avi Kivity	f51234c2cd	[PATCH] KVM: MMU: Free pages on kvm destruction Because mmu pages have attached rmap and parent pte chain structures, we need to zap them before freeing so the attached structures are freed. Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:27 -08:00
Avi Kivity	143646567f	[PATCH] KVM: MMU: Treat user-mode faults as a hint that a page is no longer a page table Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:26 -08:00
Avi Kivity	32b3562735	[PATCH] KVM: MMU: Fix cmpxchg8b emulation cmpxchg8b uses edx:eax as the compare operand, not edi:eax. cmpxchg8b is used by 32-bit pae guests to set page table entries atomically, and this is emulated touching shadowed guest page tables. Also, implement it for 32-bit hosts. Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:26 -08:00
Avi Kivity	3bb65a22a4	[PATCH] KVM: MMU: Never free a shadow page actively serving as a root We always need cr3 to point to something valid, so if we detect that we're freeing a root page, simply push it back to the top of the active list. Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:26 -08:00
Avi Kivity	86a5ba025d	[PATCH] KVM: MMU: Page table write flood protection In fork() (or when we protect a page that is no longer a page table), we can experience floods of writes to a page, which have to be emulated. This is expensive. So, if we detect such a flood, zap the page so subsequent writes can proceed natively. Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:26 -08:00
Avi Kivity	139bdb2d9e	[PATCH] KVM: MMU: If an empty shadow page is not empty, report more info Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:26 -08:00
Avi Kivity	5f1e0b6abc	[PATCH] KVM: MMU: Ensure freed shadow pages are clean Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:26 -08:00
Avi Kivity	260746c03d	[PATCH] KVM: MMU: <ove is_empty_shadow_page() above kvm_mmu_free_page() Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:26 -08:00
Avi Kivity	0e7bc4b961	[PATCH] KVM: MMU: Handle misaligned accesses to write protected guest page tables A misaligned access affects two shadow ptes instead of just one. Since a misaligned access is unlikely to occur on a real page table, just zap the page out of existence, avoiding further trouble. Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:26 -08:00
Avi Kivity	73f7198e73	[PATCH] KVM: MMU: Remove release_pt_page_64() Unused. Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:26 -08:00
Avi Kivity	5f015a5b28	[PATCH] KVM: MMU: Remove invlpg interception Since we write protect shadowed guest page tables, there is no need to trap page invalidations (the guest will always change the mapping before issuing the invlpg instruction). Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:25 -08:00

1 2 3 4 5

201 Commits