ivshmem: Split ivshmem-plain, ivshmem-doorbell off ivshmem

ivshmem can be configured with and without interrupt capability
(a.k.a. "doorbell").  The two configurations have largely disjoint
options, which makes for a confusing (and badly checked) user
interface.  Moreover, the device can't tell the guest whether its
doorbell is enabled.

Create two new device models ivshmem-plain and ivshmem-doorbell, and
deprecate the old one.

Changes from ivshmem:

* PCI revision is 1 instead of 0.  The new revision is fully backwards
  compatible for guests.  Guests may elect to require at least
  revision 1 to make sure they're not exposed to the funny "no shared
  memory, yet" state.

* Property "role" replaced by "master".  role=master becomes
  master=on, role=peer becomes master=off.  Default is off instead of
  auto.

* Property "use64" is gone.  The new devices always have 64 bit BARs.

Changes from ivshmem to ivshmem-plain:

* The Interrupt Pin register in PCI config space is zero (does not use
  an interrupt pin) instead of one (uses INTA).

* Property "x-memdev" is renamed to "memdev".

* Properties "shm" and "size" are gone.  Use property "memdev"
  instead.

* Property "msi" is gone.  The new device can't have MSI-X capability.
  It can't interrupt anyway.

* Properties "ioeventfd" and "vectors" are gone.  They're meaningless
  without interrupts anyway.

Changes from ivshmem to ivshmem-doorbell:

* Property "msi" is gone.  The new device always has MSI-X capability.

* Property "ioeventfd" defaults to on instead of off.

* Property "size" is gone.  The new device can only map all the shared
  memory received from the server.

Guests can easily find out whether the device is configured for
interrupts by checking for MSI-X capability.

Note: some code added in sub-optimal places to make the diff easier to
review.  The next commit will move it to more sensible places.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-37-git-send-email-armbru@redhat.com>
This commit is contained in:
Markus Armbruster 2016-03-15 19:34:51 +01:00
parent 2a845da736
commit 5400c02b90
4 changed files with 308 additions and 140 deletions

View File

@ -17,9 +17,10 @@ get interrupted by its peers.
There are two basic configurations:
- Just shared memory: -device ivshmem,shm=NAME,...
- Just shared memory: -device ivshmem-plain,memdev=HMB,...
This uses shared memory object NAME.
This uses host memory backend HMB. It should have option "share"
set.
- Shared memory plus interrupts: -device ivshmem,chardev=CHR,vectors=N,...
@ -30,9 +31,8 @@ There are two basic configurations:
Each peer gets assigned a unique ID by the server. IDs must be
between 0 and 65535.
Interrupts are message-signaled by default (MSI-X). With msi=off
the device has no MSI-X capability, and uses legacy INTx instead.
vectors=N configures the number of vectors to use.
Interrupts are message-signaled (MSI-X). vectors=N configures the
number of vectors to use.
For more details on ivshmem device properties, see The QEMU Emulator
User Documentation (qemu-doc.*).
@ -40,14 +40,15 @@ User Documentation (qemu-doc.*).
== The ivshmem PCI device's guest interface ==
The device has vendor ID 1af4, device ID 1110, revision 0.
The device has vendor ID 1af4, device ID 1110, revision 1. Before
QEMU 2.6.0, it had revision 0.
=== PCI BARs ===
The ivshmem PCI device has two or three BARs:
- BAR0 holds device registers (256 Byte MMIO)
- BAR1 holds MSI-X table and PBA (only when using MSI-X)
- BAR1 holds MSI-X table and PBA (only ivshmem-doorbell)
- BAR2 maps the shared memory object
There are two ways to use this device:
@ -58,18 +59,19 @@ There are two ways to use this device:
user space (see http://dpdk.org/browse/memnic).
- If you additionally need the capability for peers to interrupt each
other, you need BAR0 and, if using MSI-X, BAR1. You will most
likely want to write a kernel driver to handle interrupts. Requires
the device to be configured for interrupts, obviously.
other, you need BAR0 and BAR1. You will most likely want to write a
kernel driver to handle interrupts. Requires the device to be
configured for interrupts, obviously.
Before QEMU 2.6.0, BAR2 can initially be invalid if the device is
configured for interrupts. It becomes safely accessible only after
the ivshmem server provided the shared memory. Guest software should
wait for the IVPosition register (described below) to become
non-negative before accessing BAR2.
the ivshmem server provided the shared memory. These devices have PCI
revision 0 rather than 1. Guest software should wait for the
IVPosition register (described below) to become non-negative before
accessing BAR2.
The device is not capable to tell guest software whether it is
configured for interrupts.
Revision 0 of the device is not capable to tell guest software whether
it is configured for interrupts.
=== PCI device registers ===
@ -77,10 +79,12 @@ BAR 0 contains the following registers:
Offset Size Access On reset Function
0 4 read/write 0 Interrupt Mask
bit 0: peer interrupt
bit 0: peer interrupt (rev 0)
reserved (rev 1)
bit 1..31: reserved
4 4 read/write 0 Interrupt Status
bit 0: peer interrupt
bit 0: peer interrupt (rev 0)
reserved (rev 1)
bit 1..31: reserved
8 4 read-only 0 or ID IVPosition
12 4 write-only N/A Doorbell
@ -92,18 +96,18 @@ Software should only access the registers as specified in column
"Access". Reserved bits should be ignored on read, and preserved on
write.
Interrupt Status and Mask Register together control the legacy INTx
interrupt when the device has no MSI-X capability: INTx is asserted
when the bit-wise AND of Status and Mask is non-zero and the device
has no MSI-X capability. Interrupt Status Register bit 0 becomes 1
when an interrupt request from a peer is received. Reading the
register clears it.
In revision 0 of the device, Interrupt Status and Mask Register
together control the legacy INTx interrupt when the device has no
MSI-X capability: INTx is asserted when the bit-wise AND of Status and
Mask is non-zero and the device has no MSI-X capability. Interrupt
Status Register bit 0 becomes 1 when an interrupt request from a peer
is received. Reading the register clears it.
IVPosition Register: if the device is not configured for interrupts,
this is zero. Else, it is the device's ID (between 0 and 65535).
Before QEMU 2.6.0, the register may read -1 for a short while after
reset.
reset. These devices have PCI revision 0 rather than 1.
There is no good way for software to find out whether the device is
configured for interrupts. A positive IVPosition means interrupts,
@ -124,14 +128,14 @@ interrupt vectors connected, the write is ignored. The device is not
capable to tell guest software what peers are connected, or how many
interrupt vectors are connected.
If the peer doesn't use MSI-X, its Interrupt Status register is set to
1. This asserts INTx unless masked by the Interrupt Mask register.
The device is not capable to communicate the interrupt vector to guest
software then.
The peer's interrupt for this vector then becomes pending. There is
no way for software to clear the pending bit, and a polling mode of
operation is therefore impossible.
If the peer uses MSI-X, the interrupt for this vector becomes pending.
There is no way for software to clear the pending bit, and a polling
mode of operation is therefore impossible with MSI-X.
If the peer is a revision 0 device without MSI-X capability, its
Interrupt Status register is set to 1. This asserts INTx unless
masked by the Interrupt Mask register. The device is not capable to
communicate the interrupt vector to guest software then.
With multiple MSI-X vectors, different vectors can be used to indicate
different events have occurred. The semantics of interrupt vectors

View File

@ -29,6 +29,7 @@
#include "qom/object_interfaces.h"
#include "sysemu/char.h"
#include "sysemu/hostmem.h"
#include "sysemu/qtest.h"
#include "qapi/visitor.h"
#include "exec/ram_addr.h"
@ -53,6 +54,18 @@
} \
} while (0)
#define TYPE_IVSHMEM_COMMON "ivshmem-common"
#define IVSHMEM_COMMON(obj) \
OBJECT_CHECK(IVShmemState, (obj), TYPE_IVSHMEM_COMMON)
#define TYPE_IVSHMEM_PLAIN "ivshmem-plain"
#define IVSHMEM_PLAIN(obj) \
OBJECT_CHECK(IVShmemState, (obj), TYPE_IVSHMEM_PLAIN)
#define TYPE_IVSHMEM_DOORBELL "ivshmem-doorbell"
#define IVSHMEM_DOORBELL(obj) \
OBJECT_CHECK(IVShmemState, (obj), TYPE_IVSHMEM_DOORBELL)
#define TYPE_IVSHMEM "ivshmem"
#define IVSHMEM(obj) \
OBJECT_CHECK(IVShmemState, (obj), TYPE_IVSHMEM)
@ -81,8 +94,6 @@ typedef struct IVShmemState {
MemoryRegion *ivshmem_bar2; /* BAR 2 (shared memory) */
MemoryRegion server_bar2; /* used with server_chr */
size_t ivshmem_size; /* size of shared memory region */
uint32_t ivshmem_64bit;
Peer *peers;
int nb_peers; /* space in @peers[] */
@ -97,9 +108,12 @@ typedef struct IVShmemState {
OnOffAuto master;
Error *migration_blocker;
char * shmobj;
char * sizearg;
char * role;
/* legacy cruft */
char *role;
char *shmobj;
char *sizearg;
size_t legacy_size;
uint32_t not_legacy_32bit;
} IVShmemState;
/* registers for the Inter-VM shared memory device */
@ -126,8 +140,21 @@ static void ivshmem_update_irq(IVShmemState *s)
PCIDevice *d = PCI_DEVICE(s);
uint32_t isr = s->intrstatus & s->intrmask;
/* No INTx with msi=on, whether the guest enabled MSI-X or not */
if (ivshmem_has_feature(s, IVSHMEM_MSI)) {
/*
* Do nothing unless the device actually uses INTx. Here's how
* the device variants signal interrupts, what they put in PCI
* config space:
* Device variant Interrupt Interrupt Pin MSI-X cap.
* ivshmem-plain none 0 no
* ivshmem-doorbell MSI-X 1 yes(1)
* ivshmem,msi=off INTx 1 no
* ivshmem,msi=on MSI-X 1(2) yes(1)
* (1) if guest enabled MSI-X
* (2) the device lies
* Leads to the condition for doing nothing:
*/
if (ivshmem_has_feature(s, IVSHMEM_MSI)
|| !d->config[PCI_INTERRUPT_PIN]) {
return;
}
@ -259,7 +286,7 @@ static void ivshmem_vector_notify(void *opaque)
{
MSIVector *entry = opaque;
PCIDevice *pdev = entry->pdev;
IVShmemState *s = IVSHMEM(pdev);
IVShmemState *s = IVSHMEM_COMMON(pdev);
int vector = entry - s->msi_vectors;
EventNotifier *n = &s->peers[s->vm_id].eventfds[vector];
@ -280,7 +307,7 @@ static void ivshmem_vector_notify(void *opaque)
static int ivshmem_vector_unmask(PCIDevice *dev, unsigned vector,
MSIMessage msg)
{
IVShmemState *s = IVSHMEM(dev);
IVShmemState *s = IVSHMEM_COMMON(dev);
EventNotifier *n = &s->peers[s->vm_id].eventfds[vector];
MSIVector *v = &s->msi_vectors[vector];
int ret;
@ -297,7 +324,7 @@ static int ivshmem_vector_unmask(PCIDevice *dev, unsigned vector,
static void ivshmem_vector_mask(PCIDevice *dev, unsigned vector)
{
IVShmemState *s = IVSHMEM(dev);
IVShmemState *s = IVSHMEM_COMMON(dev);
EventNotifier *n = &s->peers[s->vm_id].eventfds[vector];
int ret;
@ -314,7 +341,7 @@ static void ivshmem_vector_poll(PCIDevice *dev,
unsigned int vector_start,
unsigned int vector_end)
{
IVShmemState *s = IVSHMEM(dev);
IVShmemState *s = IVSHMEM_COMMON(dev);
unsigned int vector;
IVSHMEM_DPRINTF("vector poll %p %d-%d\n", dev, vector_start, vector_end);
@ -461,6 +488,7 @@ static void setup_interrupt(IVShmemState *s, int vector, Error **errp)
static void process_msg_shmem(IVShmemState *s, int fd, Error **errp)
{
struct stat buf;
size_t size;
void *ptr;
if (s->ivshmem_bar2) {
@ -476,22 +504,28 @@ static void process_msg_shmem(IVShmemState *s, int fd, Error **errp)
return;
}
if (s->ivshmem_size > buf.st_size) {
error_setg(errp, "server sent only %zd bytes of shared memory",
(size_t)buf.st_size);
close(fd);
return;
size = buf.st_size;
/* Legacy cruft */
if (s->legacy_size != SIZE_MAX) {
if (size < s->legacy_size) {
error_setg(errp, "server sent only %zd bytes of shared memory",
(size_t)buf.st_size);
close(fd);
return;
}
size = s->legacy_size;
}
/* mmap the region and map into the BAR2 */
ptr = mmap(0, s->ivshmem_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
ptr = mmap(0, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
if (ptr == MAP_FAILED) {
error_setg_errno(errp, errno, "Failed to mmap shared memory");
close(fd);
return;
}
memory_region_init_ram_ptr(&s->server_bar2, OBJECT(s),
"ivshmem.bar2", s->ivshmem_size, ptr);
"ivshmem.bar2", size, ptr);
qemu_set_ram_fd(memory_region_get_ram_addr(&s->server_bar2), fd);
s->ivshmem_bar2 = &s->server_bar2;
}
@ -703,7 +737,7 @@ static void ivshmem_msix_vector_use(IVShmemState *s)
static void ivshmem_reset(DeviceState *d)
{
IVShmemState *s = IVSHMEM(d);
IVShmemState *s = IVSHMEM_COMMON(d);
s->intrstatus = 0;
s->intrmask = 0;
@ -781,7 +815,7 @@ static void ivshmem_disable_irqfd(IVShmemState *s)
static void ivshmem_write_config(PCIDevice *pdev, uint32_t address,
uint32_t val, int len)
{
IVShmemState *s = IVSHMEM(pdev);
IVShmemState *s = IVSHMEM_COMMON(pdev);
int is_enabled, was_enabled = msix_enabled(pdev);
pci_default_write_config(pdev, address, val, len);
@ -805,7 +839,7 @@ static void desugar_shm(IVShmemState *s)
path = g_strdup_printf("/dev/shm/%s", s->shmobj);
object_property_set_str(obj, path, "mem-path", &error_abort);
g_free(path);
object_property_set_int(obj, s->ivshmem_size, "size", &error_abort);
object_property_set_int(obj, s->legacy_size, "size", &error_abort);
object_property_set_bool(obj, true, "share", &error_abort);
object_property_add_child(OBJECT(s), "internal-shm-backend", obj,
&error_abort);
@ -813,42 +847,14 @@ static void desugar_shm(IVShmemState *s)
s->hostmem = MEMORY_BACKEND(obj);
}
static void pci_ivshmem_realize(PCIDevice *dev, Error **errp)
static void ivshmem_common_realize(PCIDevice *dev, Error **errp)
{
IVShmemState *s = IVSHMEM(dev);
IVShmemState *s = IVSHMEM_COMMON(dev);
Error *err = NULL;
uint8_t *pci_conf;
uint8_t attr = PCI_BASE_ADDRESS_SPACE_MEMORY |
PCI_BASE_ADDRESS_MEM_PREFETCH;
if (!!s->server_chr + !!s->shmobj + !!s->hostmem != 1) {
error_setg(errp,
"You must specify either 'shm', 'chardev' or 'x-memdev'");
return;
}
if (s->hostmem) {
MemoryRegion *mr;
if (s->sizearg) {
g_warning("size argument ignored with hostmem");
}
mr = host_memory_backend_get_memory(s->hostmem, &error_abort);
s->ivshmem_size = memory_region_size(mr);
} else if (s->sizearg == NULL) {
s->ivshmem_size = 4 << 20; /* 4 MB default */
} else {
char *end;
int64_t size = qemu_strtosz(s->sizearg, &end);
if (size < 0 || (size_t)size != size || *end != '\0'
|| !is_power_of_2(size)) {
error_setg(errp, "Invalid size %s", s->sizearg);
return;
}
s->ivshmem_size = size;
}
/* IRQFD requires MSI */
if (ivshmem_has_feature(s, IVSHMEM_IOEVENTFD) &&
!ivshmem_has_feature(s, IVSHMEM_MSI)) {
@ -856,29 +862,9 @@ static void pci_ivshmem_realize(PCIDevice *dev, Error **errp)
return;
}
/* check that role is reasonable */
if (s->role) {
if (strncmp(s->role, "peer", 5) == 0) {
s->master = ON_OFF_AUTO_OFF;
} else if (strncmp(s->role, "master", 7) == 0) {
s->master = ON_OFF_AUTO_ON;
} else {
error_setg(errp, "'role' must be 'peer' or 'master'");
return;
}
} else {
s->master = ON_OFF_AUTO_AUTO;
}
pci_conf = dev->config;
pci_conf[PCI_COMMAND] = PCI_COMMAND_IO | PCI_COMMAND_MEMORY;
/*
* Note: we don't use INTx with IVSHMEM_MSI at all, so this is a
* bald-faced lie then. But it's a backwards compatible lie.
*/
pci_config_set_interrupt_pin(pci_conf, 1);
memory_region_init_io(&s->ivshmem_mmio, OBJECT(s), &ivshmem_mmio_ops, s,
"ivshmem-mmio", IVSHMEM_REG_BAR_SIZE);
@ -886,14 +872,10 @@ static void pci_ivshmem_realize(PCIDevice *dev, Error **errp)
pci_register_bar(dev, 0, PCI_BASE_ADDRESS_SPACE_MEMORY,
&s->ivshmem_mmio);
if (s->ivshmem_64bit) {
if (!s->not_legacy_32bit) {
attr |= PCI_BASE_ADDRESS_MEM_TYPE_64;
}
if (s->shmobj) {
desugar_shm(s);
}
if (s->hostmem != NULL) {
IVSHMEM_DPRINTF("using hostmem\n");
@ -940,9 +922,68 @@ static void pci_ivshmem_realize(PCIDevice *dev, Error **errp)
}
}
static void pci_ivshmem_exit(PCIDevice *dev)
static void ivshmem_realize(PCIDevice *dev, Error **errp)
{
IVShmemState *s = IVSHMEM(dev);
IVShmemState *s = IVSHMEM_COMMON(dev);
if (!qtest_enabled()) {
error_report("ivshmem is deprecated, please use ivshmem-plain"
" or ivshmem-doorbell instead");
}
if (!!s->server_chr + !!s->shmobj + !!s->hostmem != 1) {
error_setg(errp,
"You must specify either 'shm', 'chardev' or 'x-memdev'");
return;
}
if (s->hostmem) {
if (s->sizearg) {
g_warning("size argument ignored with hostmem");
}
} else if (s->sizearg == NULL) {
s->legacy_size = 4 << 20; /* 4 MB default */
} else {
char *end;
int64_t size = qemu_strtosz(s->sizearg, &end);
if (size < 0 || (size_t)size != size || *end != '\0'
|| !is_power_of_2(size)) {
error_setg(errp, "Invalid size %s", s->sizearg);
return;
}
s->legacy_size = size;
}
/* check that role is reasonable */
if (s->role) {
if (strncmp(s->role, "peer", 5) == 0) {
s->master = ON_OFF_AUTO_OFF;
} else if (strncmp(s->role, "master", 7) == 0) {
s->master = ON_OFF_AUTO_ON;
} else {
error_setg(errp, "'role' must be 'peer' or 'master'");
return;
}
} else {
s->master = ON_OFF_AUTO_AUTO;
}
if (s->shmobj) {
desugar_shm(s);
}
/*
* Note: we don't use INTx with IVSHMEM_MSI at all, so this is a
* bald-faced lie then. But it's a backwards compatible lie.
*/
pci_config_set_interrupt_pin(dev->config, 1);
ivshmem_common_realize(dev, errp);
}
static void ivshmem_exit(PCIDevice *dev)
{
IVShmemState *s = IVSHMEM_COMMON(dev);
int i;
if (s->migration_blocker) {
@ -955,7 +996,7 @@ static void pci_ivshmem_exit(PCIDevice *dev)
void *addr = memory_region_get_ram_ptr(s->ivshmem_bar2);
int fd;
if (munmap(addr, s->ivshmem_size) == -1) {
if (munmap(addr, memory_region_size(s->ivshmem_bar2) == -1)) {
error_report("Failed to munmap shared memory %s",
strerror(errno));
}
@ -1075,26 +1116,37 @@ static Property ivshmem_properties[] = {
DEFINE_PROP_BIT("msi", IVShmemState, features, IVSHMEM_MSI, true),
DEFINE_PROP_STRING("shm", IVShmemState, shmobj),
DEFINE_PROP_STRING("role", IVShmemState, role),
DEFINE_PROP_UINT32("use64", IVShmemState, ivshmem_64bit, 1),
DEFINE_PROP_UINT32("use64", IVShmemState, not_legacy_32bit, 1),
DEFINE_PROP_END_OF_LIST(),
};
static void ivshmem_common_class_init(ObjectClass *klass, void *data)
{
DeviceClass *dc = DEVICE_CLASS(klass);
PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
k->realize = ivshmem_common_realize;
k->exit = ivshmem_exit;
k->config_write = ivshmem_write_config;
k->vendor_id = PCI_VENDOR_ID_IVSHMEM;
k->device_id = PCI_DEVICE_ID_IVSHMEM;
k->class_id = PCI_CLASS_MEMORY_RAM;
k->revision = 1;
dc->reset = ivshmem_reset;
set_bit(DEVICE_CATEGORY_MISC, dc->categories);
dc->desc = "Inter-VM shared memory";
}
static void ivshmem_class_init(ObjectClass *klass, void *data)
{
DeviceClass *dc = DEVICE_CLASS(klass);
PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
k->realize = pci_ivshmem_realize;
k->exit = pci_ivshmem_exit;
k->config_write = ivshmem_write_config;
k->vendor_id = PCI_VENDOR_ID_IVSHMEM;
k->device_id = PCI_DEVICE_ID_IVSHMEM;
k->class_id = PCI_CLASS_MEMORY_RAM;
dc->reset = ivshmem_reset;
k->realize = ivshmem_realize;
k->revision = 0;
dc->desc = "Inter-VM shared memory (legacy)";
dc->props = ivshmem_properties;
dc->vmsd = &ivshmem_vmsd;
set_bit(DEVICE_CATEGORY_MISC, dc->categories);
dc->desc = "Inter-VM shared memory";
}
static void ivshmem_check_memdev_is_busy(Object *obj, const char *name,
@ -1123,16 +1175,121 @@ static void ivshmem_init(Object *obj)
&error_abort);
}
static const TypeInfo ivshmem_common_info = {
.name = TYPE_IVSHMEM_COMMON,
.parent = TYPE_PCI_DEVICE,
.instance_size = sizeof(IVShmemState),
.abstract = true,
.class_init = ivshmem_common_class_init,
};
static const TypeInfo ivshmem_info = {
.name = TYPE_IVSHMEM,
.parent = TYPE_PCI_DEVICE,
.parent = TYPE_IVSHMEM_COMMON,
.instance_size = sizeof(IVShmemState),
.instance_init = ivshmem_init,
.class_init = ivshmem_class_init,
};
static const VMStateDescription ivshmem_plain_vmsd = {
.name = TYPE_IVSHMEM_PLAIN,
.version_id = 0,
.minimum_version_id = 0,
.pre_load = ivshmem_pre_load,
.post_load = ivshmem_post_load,
.fields = (VMStateField[]) {
VMSTATE_PCI_DEVICE(parent_obj, IVShmemState),
VMSTATE_UINT32(intrstatus, IVShmemState),
VMSTATE_UINT32(intrmask, IVShmemState),
VMSTATE_END_OF_LIST()
},
};
static Property ivshmem_plain_properties[] = {
DEFINE_PROP_ON_OFF_AUTO("master", IVShmemState, master, ON_OFF_AUTO_OFF),
DEFINE_PROP_END_OF_LIST(),
};
static void ivshmem_plain_init(Object *obj)
{
IVShmemState *s = IVSHMEM_PLAIN(obj);
object_property_add_link(obj, "memdev", TYPE_MEMORY_BACKEND,
(Object **)&s->hostmem,
ivshmem_check_memdev_is_busy,
OBJ_PROP_LINK_UNREF_ON_RELEASE,
&error_abort);
}
static void ivshmem_plain_class_init(ObjectClass *klass, void *data)
{
DeviceClass *dc = DEVICE_CLASS(klass);
dc->props = ivshmem_plain_properties;
dc->vmsd = &ivshmem_plain_vmsd;
}
static const TypeInfo ivshmem_plain_info = {
.name = TYPE_IVSHMEM_PLAIN,
.parent = TYPE_IVSHMEM_COMMON,
.instance_size = sizeof(IVShmemState),
.instance_init = ivshmem_plain_init,
.class_init = ivshmem_plain_class_init,
};
static const VMStateDescription ivshmem_doorbell_vmsd = {
.name = TYPE_IVSHMEM_DOORBELL,
.version_id = 0,
.minimum_version_id = 0,
.pre_load = ivshmem_pre_load,
.post_load = ivshmem_post_load,
.fields = (VMStateField[]) {
VMSTATE_PCI_DEVICE(parent_obj, IVShmemState),
VMSTATE_MSIX(parent_obj, IVShmemState),
VMSTATE_UINT32(intrstatus, IVShmemState),
VMSTATE_UINT32(intrmask, IVShmemState),
VMSTATE_END_OF_LIST()
},
};
static Property ivshmem_doorbell_properties[] = {
DEFINE_PROP_CHR("chardev", IVShmemState, server_chr),
DEFINE_PROP_UINT32("vectors", IVShmemState, vectors, 1),
DEFINE_PROP_BIT("ioeventfd", IVShmemState, features, IVSHMEM_IOEVENTFD,
true),
DEFINE_PROP_ON_OFF_AUTO("master", IVShmemState, master, ON_OFF_AUTO_OFF),
DEFINE_PROP_END_OF_LIST(),
};
static void ivshmem_doorbell_init(Object *obj)
{
IVShmemState *s = IVSHMEM_DOORBELL(obj);
s->features |= (1 << IVSHMEM_MSI);
s->legacy_size = SIZE_MAX; /* whatever the server sends */
}
static void ivshmem_doorbell_class_init(ObjectClass *klass, void *data)
{
DeviceClass *dc = DEVICE_CLASS(klass);
dc->props = ivshmem_doorbell_properties;
dc->vmsd = &ivshmem_doorbell_vmsd;
}
static const TypeInfo ivshmem_doorbell_info = {
.name = TYPE_IVSHMEM_DOORBELL,
.parent = TYPE_IVSHMEM_COMMON,
.instance_size = sizeof(IVShmemState),
.instance_init = ivshmem_doorbell_init,
.class_init = ivshmem_doorbell_class_init,
};
static void ivshmem_register_types(void)
{
type_register_static(&ivshmem_common_info);
type_register_static(&ivshmem_plain_info);
type_register_static(&ivshmem_doorbell_info);
type_register_static(&ivshmem_info);
}

View File

@ -1262,13 +1262,18 @@ basic example.
@subsection Inter-VM Shared Memory device
With KVM enabled on a Linux host, a shared memory device is available. Guests
map a POSIX shared memory region into the guest as a PCI device that enables
zero-copy communication to the application level of the guests. The basic
syntax is:
On Linux hosts, a shared memory device is available. The basic syntax
is:
@example
qemu-system-i386 -device ivshmem,size=@var{size},shm=@var{shm-name}
qemu-system-x86_64 -device ivshmem-plain,memdev=@var{hostmem}
@end example
where @var{hostmem} names a host memory backend. For a POSIX shared
memory backend, use something like
@example
-object memory-backend-file,size=1M,share,mem-path=/dev/shm/ivshmem,id=@var{hostmem}
@end example
If desired, interrupts can be sent between guest VMs accessing the same shared
@ -1282,8 +1287,7 @@ memory server is:
ivshmem-server -p @var{pidfile} -S @var{path} -m @var{shm-name} -l @var{shm-size} -n @var{vectors}
# Then start your qemu instances with matching arguments
qemu-system-i386 -device ivshmem,size=@var{shm-size},vectors=@var{vectors},chardev=@var{id}
[,msi=on][,ioeventfd=on][,role=peer|master]
qemu-system-x86_64 -device ivshmem-doorbell,vectors=@var{vectors},chardev=@var{id}
-chardev socket,path=@var{path},id=@var{id}
@end example
@ -1291,12 +1295,11 @@ When using the server, the guest will be assigned a VM ID (>=0) that allows gues
using the same server to communicate via interrupts. Guests can read their
VM ID from a device register (see ivshmem-spec.txt).
The @option{role} argument can be set to either master or peer and will affect
how the shared memory is migrated. With @option{role=master}, the guest will
copy the shared memory on migration to the destination host. With
@option{role=peer}, the guest will not be able to migrate with the device attached.
With the @option{peer} case, the device should be detached and then reattached
after migration using the PCI hotplug support.
With device property @option{master=on}, the guest will copy the shared
memory on migration to the destination host. With @option{master=off},
the guest will not be able to migrate with the device attached. In the
latter case, the device should be detached and then reattached after
migration using the PCI hotplug support.
@subsubsection ivshmem and hugepages
@ -1304,8 +1307,8 @@ Instead of specifying the <shm size> using POSIX shm, you may specify
a memory backend that has hugepage support:
@example
qemu-system-i386 -object memory-backend-file,size=1G,mem-path=/dev/hugepages/my-shmem-file,share,id=mb1
-device ivshmem,x-memdev=mb1
qemu-system-x86_64 -object memory-backend-file,size=1G,mem-path=/dev/hugepages/my-shmem-file,share,id=mb1
-device ivshmem-plain,memdev=mb1
@end example
ivshmem-server also supports hugepages mount points with the

View File

@ -127,7 +127,9 @@ static void setup_vm_cmd(IVState *s, const char *cmd, bool msix)
static void setup_vm(IVState *s)
{
char *cmd = g_strdup_printf("-device ivshmem,shm=%s,size=1M", tmpshm);
char *cmd = g_strdup_printf("-object memory-backend-file"
",id=mb1,size=1M,share,mem-path=/dev/shm%s"
" -device ivshmem-plain,memdev=mb1", tmpshm);
setup_vm_cmd(s, cmd, false);
@ -284,8 +286,10 @@ static void *server_thread(void *data)
static void setup_vm_with_server(IVState *s, int nvectors, bool msi)
{
char *cmd = g_strdup_printf("-chardev socket,id=chr0,path=%s,nowait "
"-device ivshmem,size=1M,chardev=chr0,vectors=%d,msi=%s",
tmpserver, nvectors, msi ? "true" : "false");
"-device ivshmem%s,chardev=chr0,vectors=%d",
tmpserver,
msi ? "-doorbell" : ",size=1M,msi=off",
nvectors);
setup_vm_cmd(s, cmd, msi);
@ -412,7 +416,7 @@ static void test_ivshmem_memdev(void)
/* just for the sake of checking memory-backend property */
setup_vm_cmd(&state, "-object memory-backend-ram,size=1M,id=mb1"
" -device ivshmem,x-memdev=mb1", false);
" -device ivshmem-plain,memdev=mb1", false);
cleanup_vm(&state);
}