qemu-e2k

Author	SHA1	Message	Date
David Woodhouse	18e83f28bf	hw/xen: select kernel mode for per-vCPU event channel upcall vector A guest which has configured the per-vCPU upcall vector may set the HVM_PARAM_CALLBACK_IRQ param to fairly much anything other than zero. For example, Linux v6.0+ after commit b1c3497e604 ("x86/xen: Add support for HVMOP_set_evtchn_upcall_vector") will just do this after setting the vector: /* Trick toolstack to think we are enlightened. / if (!cpu) rc = xen_set_callback_via(1); That's explicitly setting the delivery to GSI#1, but it's supposed to be overridden by the per-vCPU vector setting. This mostly works in Qemu except* for the logic to enable the in-kernel handling of event channels, which falsely determines that the kernel cannot accelerate GSI delivery in this case. Add a kvm_xen_has_vcpu_callback_vector() to report whether vCPU#0 has the vector set, and use that in xen_evtchn_set_callback_param() to enable the kernel acceleration features even when the param appears to be set to target a GSI. Preserve the Xen behaviour that when HVM_PARAM_CALLBACK_IRQ is set to zero the event channel delivery is disabled completely. (Which is what that bizarre guest behaviour is working round in the first place.) Cc: qemu-stable@nongnu.org Fixes: `91cce75617` ("hw/xen: Add xen_evtchn device for event channel emulation") Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-11-06 10:03:45 +00:00
David Woodhouse	e16aff4cc2	kvm/i386: Add xen-evtchn-max-pirq property The default number of PIRQs is set to 256 to avoid issues with 32-bit MSI devices. Allow it to be increased if the user desires. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 09:09:22 +00:00
David Woodhouse	8b57d5c523	i386/xen: Reserve Xen special pages for console, xenstore rings Xen has eight frames at 0xfeff8000 for this; we only really need two for now and KVM puts the identity map at 0xfeffc000, so limit ourselves to four. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 09:07:52 +00:00
David Woodhouse	6f43f2ee49	kvm/i386: Add xen-gnttab-max-frames property Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 09:07:52 +00:00
David Woodhouse	ddf0fd9ae1	hw/xen: Support HVM_PARAM_CALLBACK_TYPE_GSI callback The GSI callback (and later PCI_INTX) is a level triggered interrupt. It is asserted when an event channel is delivered to vCPU0, and is supposed to be cleared when the vcpu_info->evtchn_upcall_pending field for vCPU0 is cleared again. Thankfully, Xen does not assert the GSI if the guest sets its own evtchn_upcall_pending field; we only need to assert the GSI when we have delivered an event for ourselves. So that's the easy part, kind of. There's a slight complexity in that we need to hold the BQL before we can call qemu_set_irq(), and we definitely can't do that while holding our own port_lock (because we'll need to take that from the qemu-side functions that the PV backend drivers will call). So if we end up wanting to set the IRQ in a context where we don't already hold the BQL, defer to a BH. However, we do need to poll for the evtchn_upcall_pending flag being cleared. In an ideal world we would poll that when the EOI happens on the PIC/IOAPIC. That's how it works in the kernel with the VFIO eventfd pairs — one is used to trigger the interrupt, and the other works in the other direction to 'resample' on EOI, and trigger the first eventfd again if the line is still active. However, QEMU doesn't seem to do that. Even VFIO level interrupts seem to be supported by temporarily unmapping the device's BARs from the guest when an interrupt happens, then trapping all MMIO to the device and sending the 'resample' event on every MMIO access until the IRQ is cleared! Maybe in future we'll plumb the 'resample' concept through QEMU's irq framework but for now we'll do what Xen itself does: just check the flag on every vmexit if the upcall GSI is known to be asserted. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 09:06:44 +00:00
David Woodhouse	c723d4c15e	hw/xen: Implement EVTCHNOP_bind_virq Add the array of virq ports to each vCPU so that we can deliver timers, debug ports, etc. Global virqs are allocated against vCPU 0 initially, but can be migrated to other vCPUs (when we implement that). The kernel needs to know about VIRQ_TIMER in order to accelerate timers, so tell it via KVM_XEN_VCPU_ATTR_TYPE_TIMER. Also save/restore the value of the singleshot timer across migration, as the kernel will handle the hypercalls automatically now. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:50 +00:00
David Woodhouse	27d4075dd8	i386/xen: Add support for Xen event channel delivery to vCPU The kvm_xen_inject_vcpu_callback_vector() function will either deliver the per-vCPU local APIC vector (as an MSI), or just kick the vCPU out of the kernel to trigger KVM's automatic delivery of the global vector. Support for asserting the GSI/PCI_INTX callbacks will come later. Also add kvm_xen_get_vcpu_info_hva() which returns the vcpu_info of a given vCPU. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:50 +00:00
David Woodhouse	d40ddd5290	hw/xen: Add xen_overlay device for emulating shared xenheap pages For the shared info page and for grant tables, Xen shares its own pages from the "Xen heap" to the guest. The guest requests that a given page from a certain address space (XENMAPSPACE_shared_info, etc.) be mapped to a given GPA using the XENMEM_add_to_physmap hypercall. To support that in qemu when emulating Xen, create a memory region (migratable) and allow it to be mapped as an overlay when requested. Xen theoretically allows the same page to be mapped multiple times into the guest, but that's hard to track and reinstate over migration, so we automatically unmap any previous mapping when creating a new one. This approach has been used in production with.... a non-trivial number of guests expecting true Xen, without any problems yet being noticed. This adds just the shared info page for now. The grant tables will be a larger region, and will need to be overlaid one page at a time. I think that means I need to create separate aliases for each page of the overall grant_frames region, so that they can be mapped individually. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:49 +00:00
Joao Martins	79b7067dc6	i386/xen: implement HYPERVISOR_sched_op, SCHEDOP_shutdown It allows to shutdown itself via hypercall with any of the 3 reasons: 1) self-reboot 2) shutdown 3) crash Implementing SCHEDOP_shutdown sub op let us handle crashes gracefully rather than leading to triple faults if it remains unimplemented. In addition, the SHUTDOWN_soft_reset reason is used for kexec, to reset Xen shared pages and other enlightenments and leave a clean slate for the new kernel without the hypervisor helpfully writing information at unexpected addresses. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> [dwmw2: Ditch sched_op_compat which was never available for HVM guests, Add SCHEDOP_soft_reset] Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:49 +00:00
David Woodhouse	61491cf441	i386/kvm: Add xen-version KVM accelerator property and init KVM Xen support This just initializes the basic Xen support in KVM for now. Only permitted on TYPE_PC_MACHINE because that's where the sysbus devices for Xen heap overlay, event channel, grant tables and other stuff will exist. There's no point having the basic hypercall support if nothing else works. Provide sysemu/kvm_xen.h and a kvm_xen_get_caps() which will be used later by support devices. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-03-01 08:22:49 +00:00

10 Commits