2022-02-17 18:44:51 +01:00
|
|
|
PCI SR/IOV EMULATION SUPPORT
|
|
|
|
============================
|
|
|
|
|
|
|
|
Description
|
|
|
|
===========
|
|
|
|
SR/IOV (Single Root I/O Virtualization) is an optional extended capability
|
|
|
|
of a PCI Express device. It allows a single physical function (PF) to appear as multiple
|
|
|
|
virtual functions (VFs) for the main purpose of eliminating software
|
|
|
|
overhead in I/O from virtual machines.
|
|
|
|
|
2022-04-22 10:30:07 +02:00
|
|
|
QEMU now implements the basic common functionality to enable an emulated device
|
2023-04-14 11:04:41 +02:00
|
|
|
to support SR/IOV.
|
2022-02-17 18:44:51 +01:00
|
|
|
|
|
|
|
Implementation
|
|
|
|
==============
|
|
|
|
Implementing emulation of an SR/IOV capable device typically consists of
|
|
|
|
implementing support for two types of device classes; the "normal" physical device
|
2022-04-22 10:30:07 +02:00
|
|
|
(PF) and the virtual device (VF). From QEMU's perspective, the VFs are just
|
2022-02-17 18:44:51 +01:00
|
|
|
like other devices, except that some of their properties are derived from
|
|
|
|
the PF.
|
|
|
|
|
|
|
|
A virtual function is different from a physical function in that the BAR
|
|
|
|
space for all VFs are defined by the BAR registers in the PFs SR/IOV
|
|
|
|
capability. All VFs have the same BARs and BAR sizes.
|
|
|
|
|
|
|
|
Accesses to these virtual BARs then is computed as
|
|
|
|
|
|
|
|
<VF BAR start> + <VF number> * <BAR sz> + <offset>
|
|
|
|
|
|
|
|
From our emulation perspective this means that there is a separate call for
|
|
|
|
setting up a BAR for a VF.
|
|
|
|
|
|
|
|
1) To enable SR/IOV support in the PF, it must be a PCI Express device so
|
|
|
|
you would need to add a PCI Express capability in the normal PCI
|
|
|
|
capability list. You might also want to add an ARI (Alternative
|
|
|
|
Routing-ID Interpretation) capability to indicate that your device
|
|
|
|
supports functions beyond it's "own" function space (0-7),
|
|
|
|
which is necessary to support more than 7 functions, or
|
|
|
|
if functions extends beyond offset 7 because they are placed at an
|
|
|
|
offset > 1 or have stride > 1.
|
|
|
|
|
|
|
|
...
|
|
|
|
#include "hw/pci/pcie.h"
|
|
|
|
#include "hw/pci/pcie_sriov.h"
|
|
|
|
|
|
|
|
pci_your_pf_dev_realize( ... )
|
|
|
|
{
|
|
|
|
...
|
|
|
|
int ret = pcie_endpoint_cap_init(d, 0x70);
|
|
|
|
...
|
2023-07-10 17:38:35 +02:00
|
|
|
pcie_ari_init(d, 0x100);
|
2022-02-17 18:44:51 +01:00
|
|
|
...
|
|
|
|
|
|
|
|
/* Add and initialize the SR/IOV capability */
|
|
|
|
pcie_sriov_pf_init(d, 0x200, "your_virtual_dev",
|
|
|
|
vf_devid, initial_vfs, total_vfs,
|
|
|
|
fun_offset, stride);
|
|
|
|
|
|
|
|
/* Set up individual VF BARs (parameters as for normal BARs) */
|
|
|
|
pcie_sriov_pf_init_vf_bar( ... )
|
|
|
|
...
|
|
|
|
}
|
|
|
|
|
|
|
|
For cleanup, you simply call:
|
|
|
|
|
|
|
|
pcie_sriov_pf_exit(device);
|
|
|
|
|
|
|
|
which will delete all the virtual functions and associated resources.
|
|
|
|
|
|
|
|
2) Similarly in the implementation of the virtual function, you need to
|
|
|
|
make it a PCI Express device and add a similar set of capabilities
|
|
|
|
except for the SR/IOV capability. Then you need to set up the VF BARs as
|
|
|
|
subregions of the PFs SR/IOV VF BARs by calling
|
|
|
|
pcie_sriov_vf_register_bar() instead of the normal pci_register_bar() call:
|
|
|
|
|
|
|
|
pci_your_vf_dev_realize( ... )
|
|
|
|
{
|
|
|
|
...
|
|
|
|
int ret = pcie_endpoint_cap_init(d, 0x60);
|
|
|
|
...
|
2023-07-10 17:38:35 +02:00
|
|
|
pcie_ari_init(d, 0x100);
|
2022-02-17 18:44:51 +01:00
|
|
|
...
|
|
|
|
memory_region_init(mr, ... )
|
|
|
|
pcie_sriov_vf_register_bar(d, bar_nr, mr);
|
|
|
|
...
|
|
|
|
}
|
|
|
|
|
|
|
|
Testing on Linux guest
|
|
|
|
======================
|
|
|
|
The easiest is if your device driver supports sysfs based SR/IOV
|
|
|
|
enabling. Support for this was added in kernel v.3.8, so not all drivers
|
|
|
|
support it yet.
|
|
|
|
|
|
|
|
To enable 4 VFs for a device at 01:00.0:
|
|
|
|
|
|
|
|
modprobe yourdriver
|
|
|
|
echo 4 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs
|
|
|
|
|
|
|
|
You should now see 4 VFs with lspci.
|
|
|
|
To turn SR/IOV off again - the standard requires you to turn it off before you can enable
|
|
|
|
another VF count, and the emulation enforces this:
|
|
|
|
|
|
|
|
echo 0 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs
|
|
|
|
|
|
|
|
Older drivers typically provide a max_vfs module parameter
|
|
|
|
to enable it at load time:
|
|
|
|
|
|
|
|
modprobe yourdriver max_vfs=4
|
|
|
|
|
|
|
|
To disable the VFs again then, you simply have to unload the driver:
|
|
|
|
|
|
|
|
rmmod yourdriver
|