linux

Commit Graph

Author	SHA1	Message	Date
Tejun Heo	a1c36b166b	vfio: convert to idr_alloc() Convert to the much saner new idr interface. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-02-27 19:10:19 -08:00
Kees Cook	d65530fbc7	drivers/vfio: remove depends on CONFIG_EXPERIMENTAL The CONFIG_EXPERIMENTAL config item has not carried much meaning for a while now and is almost always enabled by default. As agreed during the Linux kernel summit, remove it from any "depends on" lines in Kconfigs. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2013-02-24 09:59:44 -07:00
Alex Williamson	84237a826b	vfio-pci: Add support for VGA region access PCI defines display class VGA regions at I/O port address 0x3b0, 0x3c0 and MMIO address 0xa0000. As these are non-overlapping, we can ignore the I/O port vs MMIO difference and expose them both in a single region. We make use of the VGA arbiter around each access to configure chipset access as necessary. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2013-02-18 10:11:13 -07:00
Alex Williamson	2dd1194833	vfio-pci: Manage user power state transitions We give the user access to change the power state of the device but certain transitions result in an uninitialized state which the user cannot resolve. To fix this we need to mark the PowerState field of the PMCSR register read-only and effect the requested change on behalf of the user. This has the added benefit that pdev->current_state remains accurate while controlled by the user. The primary example of this bug is a QEMU guest doing a reboot where the device it put into D3 on shutdown and becomes unusable on the next boot because the device did a soft reset on D3->D0 (NoSoftRst-). Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2013-02-18 10:10:33 -07:00
Alex Williamson	2b489a45f6	vfio: whitelist pcieport pcieport does nice things like manage AER and we know it doesn't do DMA or expose any user accessible devices on the host. It also keeps the Memory, I/O, and Busmaster bits enabled, which is pretty handy when trying to use anyting below it. Devices owned by pcieport cannot be given to users via vfio, but we can tolerate them not being owned by vfio-pci. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2013-02-14 14:02:13 -07:00
Alex Williamson	e014e9444a	vfio: Protect vfio_dev_present against device_del vfio_dev_present is meant to give us a wait_event callback so that we can block removing a device from vfio until it becomes unused. The root of this check depends on being able to get the iommu group from the device. Unfortunately if the BUS_NOTIFY_DEL_DEVICE notifier has fired then the device-group reference is no longer searchable and we fail the lookup. We don't need to go to such extents for this though. We have a reference to the device, from which we can acquire a reference to the group. We can then use the group reference to search for the device and properly block removal. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2013-02-14 14:02:13 -07:00
Alex Williamson	906ee99dd2	vfio-pci: Cleanup BAR access We can actually handle MMIO and I/O port from the same access function since PCI already does abstraction of this. The ROM BAR only requires a minor difference, so it gets included too. vfio_pci_config_readwrite gets renamed for consistency. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2013-02-14 14:02:12 -07:00
Alex Williamson	5b279a11d3	vfio-pci: Cleanup read/write functions The read and write functions are nearly identical, combine them and convert to a switch statement. This also makes it easy to narrow the scope of when we use the io/mem accessors in case new regions are added. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2013-02-14 14:02:12 -07:00
Alex Williamson	5641ade41f	vfio-pci: Enable PCIe extended capabilities on v1 Even PCIe 1.x had extended config space. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2013-02-14 10:45:31 -07:00
Alex Williamson	ec1287e511	vfio-pci: Fix buffer overfill A read from a range hidden from the user (ex. MSI-X vector table) attempts to fill the user buffer up to the end of the excluded range instead of up to the requested count. Fix it. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Cc: stable@vger.kernel.org	2013-01-15 10:45:26 -07:00
Alex Williamson	9a92c5091a	vfio-pci: Enable device before attempting reset Devices making use of PM reset are getting incorrectly identified as not supporting reset because pci_pm_reset() fails unless the device is in D0 power state. When first attached to vfio_pci devices are typically in an unknown power state. We can fix this by explicitly setting the power state or simply calling pci_enable_device() before attempting a pci_reset_function(). We need to enable the device anyway, so move this up in our vfio_pci_enable() function, which also simplifies the error path a bit. Note that pci_disable_device() does not explicitly set the power state, so there's no need to re-order vfio_pci_disable(). Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2012-12-07 13:43:51 -07:00
Jiang Liu	05bf3aac93	VFIO: fix out of order labels for error recovery in vfio_pci_init() The two labels for error recovery in function vfio_pci_init() is out of order, so fix it. Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2012-12-07 13:43:51 -07:00
Jiang Liu	de2b3eeafb	VFIO: use ACCESS_ONCE() to guard access to dev->driver Comments from dev_driver_string(), /* dev->driver can change to NULL underneath us because of unbinding, * so be careful about accessing it. */ So use ACCESS_ONCE() to guard access to dev->driver field. Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2012-12-07 13:43:50 -07:00
Jiang Liu	9df7b25ab7	VFIO: unregister IOMMU notifier on error recovery path On error recovery path in function vfio_create_group(), it should unregister the IOMMU notifier for the new VFIO group. Otherwise it may cause invalid memory access later when handling bus notifications. Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2012-12-07 13:43:50 -07:00
Alex Williamson	2007722a60	vfio-pci: Re-order device reset Move the device reset to the end of our disable path, the device should already be stopped from pci_disable_device(). This also allows us to manipulate the save/restore to avoid the save/reset/restore + save/restore that we had before. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2012-12-07 13:43:50 -07:00
Fengguang Wu	3a1f7041dd	vfio: simplify kmalloc+copy_from_user to memdup_user Generated by: coccinelle/api/memdup_user.cocci Acked-by: Julia Lawall <julia.lawall@lip6.fr> Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2012-12-07 13:43:49 -07:00
Alex Williamson	899649b7d4	vfio: Fix PCI INTx disable consistency The virq_disabled flag tracks the userspace view of INTx masking across interrupt mode changes, but we're not consistently applying this to the interrupt and masking handler notion of the device. Currently if the user sets DisINTx while in MSI or MSIX mode, then returns to INTx mode (ex. rebooting a qemu guest), the hardware has DisINTx+, but the management of INTx thinks it's enabled, making it impossible to actually clear DisINTx. Fix this by updating the handler state when INTx is re-enabled. Cc: stable@vger.kernel.org Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2012-10-10 09:10:32 -06:00
Alex Williamson	9dbdfd23b7	vfio: Move PCI INTx eventfd setting earlier We need to be ready to recieve an interrupt as soon as we call request_irq, so our eventfd context setting needs to be moved earlier. Without this, an interrupt from our device or one sharing the interrupt line can pass a NULL into eventfd_signal and oops. Cc: stable@vger.kernel.org Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2012-10-10 09:10:32 -06:00
Alex Williamson	34002f54d2	vfio: Fix PCI mmap after `b3b9c293` Our mmap path mistakely relied on vma->vm_pgoff to get set in remap_pfn_range. After `b3b9c293`, that path only applies to copy-on-write mappings. Set it in our own code. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2012-10-10 09:10:31 -06:00
Linus Torvalds	547b1e81af	Fix staging driver use of VM_RESERVED The VM_RESERVED flag was killed off in commit `314e51b985` ("mm: kill vma flag VM_RESERVED and mm->reserved_vm counter"), and replaced by the proper semantic flags (eg "don't core-dump" etc). But there was a new use of VM_RESERVED that got missed by the merge. Fix the remaining use of VM_RESERVED in the vfio_pci driver, replacing the VM_RESERVED flag with VM_DONTEXPAND \| VM_DONTDUMP. Signed-off-by: Linus Torvalds <torvalds@linux-foundation,org>	2012-10-09 21:06:41 +09:00
Al Viro	2903ff019b	switch simple cases of fget_light to fdget Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-09-26 22:20:08 -04:00
Al Viro	1d3653a79c	switch vfio_group_set_container() to fget_light() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-09-26 21:10:09 -04:00
Alex Williamson	b68e7fa879	vfio: Fix virqfd release race vfoi-pci supports a mechanism like KVM's irqfd for unmasking an interrupt through an eventfd. There are two ways to shutdown this interface: 1) close the eventfd, 2) ioctl (such as disabling the interrupt). Both of these do the release through a workqueue, which can result in a segfault if two jobs get queued for the same virqfd. Fix this by protecting the pointer to these virqfds by a spinlock. The vfio pci device will therefore no longer have a reference to it once the release job is queued under lock. On the ioctl side, we still flush the workqueue to ensure that any outstanding releases are completed. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2012-09-21 10:48:28 -06:00
Al Viro	31605debdf	vfio: grab vfio_device reference before exposing the sucker via fd_install() It's not critical (anymore) since another thread closing the file will block on ->device_lock before it gets to dropping the final reference, but it's definitely cleaner that way... Acked-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-08-22 10:26:42 -04:00
Al Viro	90b1253e41	vfio: get rid of vfio_device_put()/vfio_group_get_device* races we really need to make sure that dropping the last reference happens under the group->device_lock; otherwise a loop (under device_lock) might find vfio_device instance that is being freed right now, has already dropped the last reference and waits on device_lock to exclude the sucker from the list. Acked-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-08-22 10:26:13 -04:00
Al Viro	6d2cd3ce81	vfio: get rid of open-coding kref_put_mutex Acked-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-08-22 10:25:19 -04:00
Al Viro	934ad4c235	vfio: don't dereference after kfree... Acked-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-08-22 10:23:04 -04:00
Alex Williamson	89e1f7d4c6	vfio: Add PCI device driver Add PCI device support for VFIO. PCI devices expose regions for accessing config space, I/O port space, and MMIO areas of the device. PCI config access is virtualized in the kernel, allowing us to ensure the integrity of the system, by preventing various accesses while reducing duplicate support across various userspace drivers. I/O port supports read/write access while MMIO also supports mmap of sufficiently sized regions. Support for INTx, MSI, and MSI-X interrupts are provided using eventfds to userspace. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2012-07-31 08:16:24 -06:00
Alex Williamson	73fa0d10d0	vfio: Type1 IOMMU implementation This VFIO IOMMU backend is designed primarily for AMD-Vi and Intel VT-d hardware, but is potentially usable by anything supporting similar mapping functionality. We arbitrarily call this a Type1 backend for lack of a better name. This backend has no IOVA or host memory mapping restrictions for the user and is optimized for relatively static mappings. Mapped areas are pinned into system memory. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2012-07-31 08:16:23 -06:00
Alex Williamson	cba3345cc4	vfio: VFIO core VFIO is a secure user level driver for use with both virtual machines and user level drivers. VFIO makes use of IOMMU groups to ensure the isolation of devices in use, allowing unprivileged user access. It's intended that VFIO will replace KVM device assignment and UIO drivers (in cases where the target platform includes a sufficiently capable IOMMU). New in this version of VFIO is support for IOMMU groups managed through the IOMMU core as well as a rework of the API, removing the group merge interface. We now go back to a model more similar to original VFIO with UIOMMU support where the file descriptor obtained from /dev/vfio/vfio allows access to the IOMMU, but only after a group is added, avoiding the previous privilege issues with this type of model. IOMMU support is also now fully modular as IOMMUs have vastly different interface requirements on different platforms. VFIO users are able to query and initialize the IOMMU model of their choice. Please see the follow-on Documentation commit for further description and usage example. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2012-07-31 08:16:22 -06:00

30 Commits