PCI SR/IOV EMULATION SUPPORT
============================

Description
===========
SR/IOV (Single Root I/O Virtualization) is an optional extended capability
of a PCI Express device. It allows a single physical function (PF) to appear
as multiple virtual functions (VFs) for the main purpose of eliminating
software overhead in I/O from virtual machines.

QEMU now implements the basic common functionality to enable an emulated device
to support SR/IOV. No fully implemented device exists in QEMU yet, but a
proof-of-concept hack of the Intel igb can be found here:

   git://github.com/knuto/qemu.git sriov_patches_v5

Implementation
==============
Implementing emulation of an SR/IOV capable device typically consists of
implementing support for two device classes: the "normal" physical device
(PF) and the virtual device (VF). From QEMU's perspective, the VFs are just
like other devices, except that some of their properties are derived from
the PF.

A virtual function differs from a physical function in that the BAR
space for all VFs is defined by the BAR registers in the PF's SR/IOV
capability. All VFs have the same BARs and BAR sizes.

The address of an access to one of these virtual BARs is then computed as

   <VF BAR start> + <VF number> * <BAR sz> + <offset>

From our emulation perspective this means that there is a separate call for
setting up a BAR for a VF.

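As a minimal sketch of the address computation above, the following C helper
evaluates the formula directly. The name vf_bar_addr and its parameters are
hypothetical and not part of QEMU; the sketch assumes that VF numbers start
at 0, as the formula implies.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a QEMU API: address of <offset> within the BAR
 * of VF number vf_nr (assumed 0-based), given the start of the VF BAR
 * region from the PF's SR/IOV capability and the per-VF BAR size.
 */
static uint64_t vf_bar_addr(uint64_t vf_bar_start, unsigned vf_nr,
                            uint64_t bar_sz, uint64_t offset)
{
    /* The offset must stay inside the slice that belongs to one VF. */
    assert(offset < bar_sz);
    return vf_bar_start + (uint64_t)vf_nr * bar_sz + offset;
}
```

For example, with a VF BAR region starting at 0xfe000000 and a BAR size of
0x4000, offset 0x10 in the BAR of VF 2 maps to 0xfe008010.
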
1) To enable SR/IOV support in the PF, the PF must be a PCI Express device,
   so you need to add a PCI Express capability to the normal PCI
   capability list. You might also want to add an ARI (Alternative
   Routing-ID Interpretation) capability to indicate that your device
   supports functions beyond its "own" function space (0-7). This is
   necessary to support more than 7 functions, or if functions extend
   beyond offset 7 because they are placed at an offset > 1 or have a
   stride > 1.

   ...
   #include "hw/pci/pcie.h"
   #include "hw/pci/pcie_sriov.h"

   pci_your_pf_dev_realize( ... )
   {
      ...
      int ret = pcie_endpoint_cap_init(d, 0x70);
      ...
      pcie_ari_init(d, 0x100, 1);
      ...

      /* Add and initialize the SR/IOV capability */
      pcie_sriov_pf_init(d, 0x200, "your_virtual_dev",
                         vf_devid, initial_vfs, total_vfs,
                         fun_offset, stride);

      /* Set up individual VF BARs (parameters as for normal BARs) */
      pcie_sriov_pf_init_vf_bar( ... )
      ...
   }

   For cleanup, you simply call:

      pcie_sriov_pf_exit(device);

   which will delete all the virtual functions and associated resources.

2) Similarly, in the implementation of the virtual function, you need to
   make it a PCI Express device and add a similar set of capabilities,
   except for the SR/IOV capability. Then you need to set up the VF BARs as
   subregions of the PF's SR/IOV VF BARs by calling
   pcie_sriov_vf_register_bar() instead of the normal pci_register_bar() call:

   pci_your_vf_dev_realize( ... )
   {
      ...
      int ret = pcie_endpoint_cap_init(d, 0x60);
      ...
      pcie_ari_init(d, 0x100, 1);
      ...
      memory_region_init(mr, ... )
      pcie_sriov_vf_register_bar(d, bar_nr, mr);
      ...
   }

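The ARI remark in step 1 comes down to arithmetic on function numbers. The
sketch below is an illustration only: the helper names are hypothetical, not
QEMU APIs, and it assumes that VF number i (counting from 0) is placed at the
PF's function number plus fun_offset plus i * stride.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: function number of VF vf_nr (assumed 0-based),
 * relative to the device the PF sits on, for a given offset and stride.
 */
static unsigned vf_function_number(unsigned pf_fn, unsigned fun_offset,
                                   unsigned stride, unsigned vf_nr)
{
    return pf_fn + fun_offset + vf_nr * stride;
}

/*
 * Hypothetical helper: true if at least one of num_vfs VFs lands outside
 * the PF's "own" function space (0-7), which is the case where ARI is
 * needed. The last VF always has the highest function number.
 */
static bool vfs_need_ari(unsigned pf_fn, unsigned fun_offset,
                         unsigned stride, unsigned num_vfs)
{
    assert(pf_fn <= 7);
    if (num_vfs == 0) {
        return false;
    }
    return vf_function_number(pf_fn, fun_offset, stride, num_vfs - 1) > 7;
}
```

With the PF at function 0, fun_offset = 1 and stride = 1, seven VFs occupy
functions 1-7 and fit without ARI, whereas an eighth VF would land on
function 8 and require it.
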
Testing on Linux guest
======================
The easiest approach is to use a device driver that supports sysfs-based
SR/IOV enabling. Support for this was added in kernel v3.8, so not all
drivers support it yet.

To enable 4 VFs for a device at 01:00.0:

   modprobe yourdriver
   echo 4 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs

You should now see 4 VFs with lspci.
To turn SR/IOV off again (the standard requires you to turn it off before you
can enable another VF count, and the emulation enforces this):

   echo 0 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs

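The sysfs sequence above can also be driven from a small C helper, for
instance in a guest-side test program. This is a sketch under assumptions:
the function names are hypothetical, and it relies only on the behaviour
described above, namely that the VF count must be set to 0 before a
different count is written.

```c
#include <assert.h>
#include <stdio.h>

/* Write a single decimal value to a sysfs-style attribute file. */
static int write_attr(const char *path, unsigned value)
{
    FILE *f = fopen(path, "w");
    if (!f) {
        return -1;
    }
    int ok = fprintf(f, "%u\n", value) > 0;
    /* Buffered writes are flushed on close, so check fclose() as well. */
    if (fclose(f) != 0) {
        ok = 0;
    }
    return ok ? 0 : -1;
}

/*
 * Hypothetical helper: change the VF count through an sriov_numvfs file.
 * SR/IOV is turned off first (write 0), then the new count is written.
 */
static int sriov_set_numvfs(const char *numvfs_path, unsigned num_vfs)
{
    assert(numvfs_path != NULL);
    if (write_attr(numvfs_path, 0) != 0) {
        return -1;
    }
    return num_vfs ? write_attr(numvfs_path, num_vfs) : 0;
}
```

Calling sriov_set_numvfs() with the path
/sys/bus/pci/devices/0000:01:00.0/sriov_numvfs and a count of 4 would write
0 and then 4 to that file. It needs the same privileges as the echo commands.
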
Older drivers typically provide a max_vfs module parameter
to enable it at load time:

   modprobe yourdriver max_vfs=4

To disable the VFs again, simply unload the driver:

   rmmod yourdriver