Commit Graph

23 Commits

Author SHA1 Message Date
Alexey Kardashevskiy 3df9d74806 memory/iommu: QOM'fy IOMMU MemoryRegion
This defines new QOM object - IOMMUMemoryRegion - with MemoryRegion
as a parent.

This moves IOMMU-related fields from MR to IOMMU MR. However to avoid
dymanic QOM casting in fast path (address_space_translate, etc),
this adds an @is_iommu boolean flag to MR and provides new helper to
do simple cast to IOMMU MR - memory_region_get_iommu. The flag
is set in the instance init callback. This defines
memory_region_is_iommu as memory_region_get_iommu()!=NULL.

This switches MemoryRegion to IOMMUMemoryRegion in most places except
the ones where MemoryRegion may be an alias.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20170711035620.4232-2-aik@ozlabs.ru>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-14 12:04:41 +02:00
Peter Xu dd4d607e40 intel_iommu: enable remote IOTLB
This patch is based on Aviv Ben-David (<bd.aviv@gmail.com>)'s patch
upstream:

  "IOMMU: enable intel_iommu map and unmap notifiers"
  https://lists.gnu.org/archive/html/qemu-devel/2016-11/msg01453.html

However I removed/fixed some content, and added my own codes.

Instead of translate() every page for iotlb invalidations (which is
slower), we walk the pages when needed and notify in a hook function.

This patch enables vfio devices for VT-d emulation.

And, since we already have vhost DMAR support via device-iotlb, a
natural benefit that this patch brings is that vt-d enabled vhost can
live even without ATS capability now. Though more tests are needed.

Signed-off-by: Aviv Ben-David <bdaviv@cs.technion.ac.il>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: \"Michael S. Tsirkin\" <mst@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1491562755-23867-10-git-send-email-peterx@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Peter Xu 558e0024a4 intel_iommu: allow dynamic switch of IOMMU region
This is preparation work to finally enabled dynamic switching ON/OFF for
VT-d protection. The old VT-d codes is using static IOMMU address space,
and that won't satisfy vfio-pci device listeners.

Let me explain.

vfio-pci devices depend on the memory region listener and IOMMU replay
mechanism to make sure the device mapping is coherent with the guest
even if there are domain switches. And there are two kinds of domain
switches:

  (1) switch from domain A -> B
  (2) switch from domain A -> no domain (e.g., turn DMAR off)

Case (1) is handled by the context entry invalidation handling by the
VT-d replay logic. What the replay function should do here is to replay
the existing page mappings in domain B.

However for case (2), we don't want to replay any domain mappings - we
just need the default GPA->HPA mappings (the address_space_memory
mapping). And this patch helps on case (2) to build up the mapping
automatically by leveraging the vfio-pci memory listeners.

Another important thing that this patch does is to seperate
IR (Interrupt Remapping) from DMAR (DMA Remapping). IR region should not
depend on the DMAR region (like before this patch). It should be a
standalone region, and it should be able to be activated without
DMAR (which is a common behavior of Linux kernel - by default it enables
IR while disabled DMAR).

Reviewed-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: \"Michael S. Tsirkin\" <mst@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1491562755-23867-9-git-send-email-peterx@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Aviv Ben-David 3b40f0e53c intel_iommu: add "caching-mode" option
This capability asks the guest to invalidate cache before each map operation.
We can use this invalidation to trap map operations in the hypervisor.

Signed-off-by: Aviv Ben-David <bd.aviv@gmail.com>
[peterx: using "caching-mode" instead of "cache-mode" to align with spec]
[peterx: re-write the subject to make it short and clear]
Reviewed-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Aviv Ben-David <bd.aviv@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-02-17 21:52:31 +02:00
Peter Xu 1a43713b02 intel_iommu: fix several incorrect endianess and bit fields
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-11-15 17:20:36 +02:00
Radim Krčmář fb506e701e intel_iommu: reject broken EIM
Cluster x2APIC cannot work without KVM's x2apic API when the maximal
APIC ID is greater than 8 and only KVM's LAPIC can support x2APIC, so we
forbid other APICs and also the old KVM case with less than 9, to
simplify the code.

There is no point in enabling EIM in forbidden APICs, so we keep it
enabled only for the KVM APIC;  unconditionally, because making the
option depend on KVM version would be a maintanance burden.

Old QEMUs would enable eim whenever intremap was on, which would trick
guests into thinking that they can enable cluster x2APIC even if any
interrupt destination would get clamped to 8 bits.
Depending on your configuration, QEMU could notice that the destination
LAPIC is not present and report it with a very non-obvious:

  KVM: injection failed, MSI lost (Operation not permitted)

Or the guest could say something about unexpected interrupts, because
clamping leads to aliasing so interrupts were being delivered to
incorrect VCPUs.

KVM_X2APIC_API is the feature that allows us to enable EIM for KVM.

QEMU 2.7 allowed EIM whenever interrupt remapping was enabled.  In order
to keep backward compatibility, we again allow guests to misbehave in
non-obvious ways, and make it the default for old machine types.

A user can enable the buggy mode it with "x-buggy-eim=on".

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-17 15:44:49 -02:00
Radim Krčmář e6b6af0560 intel_iommu: add OnOffAuto intr_eim as "eim" property
The default (auto) emulates the current behavior.
A user can now control EIM like
  -device intel-iommu,intremap=on,eim=off

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-17 15:44:49 -02:00
Michael S. Tsirkin bc38ee10fc intel_iommu: avoid unnamed fields
Also avoid unnamed fields for portability.
Also, rename VTD_IRTE to VTD_IR_TableEntry for coding
style compliance.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-21 20:44:20 +03:00
Peter Xu ede9c94acf intel_iommu: add SID validation for IR
This patch enables SID validation. Invalid interrupts will be dropped.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-21 20:44:16 +03:00
Jan Kiszka 28589311b3 intel_iommu: Add support for Extended Interrupt Mode
As neither QEMU nor KVM support more than 255 CPUs so far, this is
simple: we only need to switch the destination ID translation in
vtd_remap_irq_get if EIME is set.

Once CFI support is there, it will have to take EIM into account as
well. So far, nothing to do for this.

This patch allows to use x2APIC in split irqchip mode of KVM.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
[use le32_to_cpu() to retrieve dest_id]
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-21 20:43:49 +03:00
Peter Xu 8b5ed7dffa intel_iommu: add support for split irqchip
In split irqchip mode, IOAPIC is working in user space, only update
kernel irq routes when entry changed. When IR is enabled, we directly
update the kernel with translated messages. It works just like a kernel
cache for the remapping entries.

Since KVM irqfd is using kernel gsi routes to deliver interrupts, as
long as we can support split irqchip, we will support irqfd as
well. Also, since kernel gsi routes will cache translated interrupts,
irqfd delivery will not suffer from any performance impact due to IR.

And, since we supported irqfd, vhost devices will be able to work
seamlessly with IR now. Logically this should contain both vhost-net and
vhost-user case.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[move trace-events lines into target-i386/trace-events]
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-21 20:43:49 +03:00
Peter Xu 651e4cefee intel_iommu: Add support for PCI MSI remap
This patch enables interrupt remapping for PCI devices.

To play the trick, one memory region "iommu_ir" is added as child region
of the original iommu memory region, covering range 0xfeeXXXXX (which is
the address range for APIC). All the writes to this range will be taken
as MSI, and translation is carried out only when IR is enabled.

Idea suggested by Paolo Bonzini.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-20 19:31:04 +03:00
Peter Xu 1f91acee17 intel_iommu: define several structs for IOMMU IR
Several data structs are defined to better support the rest of the
patches: IRTE to parse remapping table entries, and IOAPIC/MSI related
structure bits to parse interrupt entries to be filled in by guest
kernel.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-20 19:30:27 +03:00
Peter Xu a58614391d intel_iommu: define interrupt remap table addr register
Defined Interrupt Remap Table Address register to store IR table
pointer. Also, do proper handling on global command register writes to
store table pointer and its size.

One more debug flag "DEBUG_IR" is added for interrupt remapping.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-20 19:30:27 +03:00
Peter Xu d46114f9ec acpi: enable INTR for DMAR report structure
In ACPI DMA remapping report structure, enable INTR flag when specified.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-20 19:30:27 +03:00
Peter Xu 04af0e18bc intel_iommu: rename VTD_PCI_DEVFN_MAX to x86-iommu
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-20 19:30:27 +03:00
Peter Xu 1c7955c450 x86-iommu: introduce parent class
Introducing parent class for intel-iommu devices named "x86-iommu". This
is preparation work to abstract shared functionalities out from Intel
and AMD IOMMUs. Currently, only the parent class is introduced. It does
nothing yet.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-20 19:30:27 +03:00
Jason Wang d66b969b0d intel_iommu: large page support
Current intel_iommu only supports 4K page which may not be sufficient
to cover guest working set. This patch tries to enable 2M and 1G mapping
for intel_iommu. This is also useful for future device IOTLB
implementation to have a better hit rate.

Major work is adding a page mask field on IOTLB entry to make it
support large page. And also use the slpte level as key to do IOTLB
lookup. MAMV was increased to 18 to support direct invalidation for 1G
mapping.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06 20:44:10 +02:00
Knut Omang 7df953bd45 intel_iommu: Add support for translation for devices behind bridges
- Use a hash table indexed on bus pointers to store information about buses
  instead of using the bus numbers.
  Bus pointers are stored in a new VTDBus struct together with the vector
  of device address space pointers indexed by devfn.
- The bus number is still used for lookup for selective SID based invalidate,
  in which case the bus number is lazily resolved from the bus hash table and
  cached in a separate index.

Signed-off-by: Knut Omang <knut.omang@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-10-18 10:05:43 +03:00
Michael S. Tsirkin 1e06f131fd intel_iommu: fix VTD_SID_TO_BUS
(((sid) >> 8) && 0xff)  makes no sense
(((sid) >> 8) & 0xff) seems to be what was meant.

https://bugs.launchpad.net/qemu/+bug/1382477

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-11-02 12:03:04 +02:00
Le Tan b5a280c008 intel-iommu: add IOTLB using hash table
Add IOTLB to cache information about the translation of input-addresses. IOTLB
use a GHashTable as cache. The key of the hash table is the logical-OR of gfn
and source id after left-shifting.

Signed-off-by: Le Tan <tamlokveer@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-28 23:10:22 +02:00
Le Tan d92fa2dc6e intel-iommu: add context-cache to cache context-entry
Add context-cache to cache context-entry encountered on a page-walk. Each
VTDAddressSpace has a member of VTDContextCacheEntry which represents an entry
in the context-cache. Since devices with different bus_num and devfn have their
respective VTDAddressSpace, this will be a good way to reference the cached
entries.
Each VTDContextCacheEntry will have a context_cache_gen and the cached entry
is valid only when context_cache_gen equals IntelIOMMUState.context_cache_gen.

Signed-off-by: Le Tan <tamlokveer@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-28 23:10:22 +02:00
Le Tan 1da12ec4c8 intel-iommu: introduce Intel IOMMU (VT-d) emulation
Add support for emulating Intel IOMMU according to the VT-d specification for
the q35 chipset machine. Implement the logics for DMAR (DMA remapping) without
PASID support. The emulation supports register-based invalidation and primary
fault logging.

Signed-off-by: Le Tan <tamlokveer@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-28 23:10:22 +02:00