Linux doesn't generate these, but it's perfectly valid according to
a close reading of the spec. I opened virtio spec bug VIRTIO-134 to
make this clearer there, too.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
As a demonstration, the lguest launcher is pretty strict, trying to
catch badly behaved drivers. Document this precisely.
A good implementation would *NOT* crash the guest when these happened!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There are some (optional) parts we don't implement, but this quotes all
the device requirements from the spec (csd 03, but it should be the same
across all released versions).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The next patch will insert many quotes from the virtio 1.0 spec; they
make most sense if we copy the spec.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The example launcher doesn't reset the queue_enable like the spec says
we have to. Plus, we should reset the size in case they negotiated
a different (smaller) one.
This is easy to test by unloading and reloading a virtio module.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The VIRTIO_PCI_CAP_PCI_CFG in the PCI virtio 1.0 spec allows access to
the BAR registers without mapping them. This is a compulsory feature,
and we implement it here.
There are some subtleties involving access widths which we should
note:
4.1.4.7.1 Device Requirements: PCI configuration access capability
...
Upon detecting driver write access to pci_cfg_data, the device MUST
execute a write access at offset cap.offset at BAR selected by
cap.bar using the first cap.length bytes from pci_cfg_data.
Upon detecting driver read access to pci_cfg_data, the device MUST
execute a read access of length cap.length at offset cap.offset at
BAR selected by cap.bar and store the first cap.length bytes in
pci_cfg_data.
So, for a write, we copy into the pci_cfg_data window, then write from
there out to the BAR. This works correctly if cap.length != width of
write. Similarly, for a read, we read into window from the BAR then
read the value from there.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is a magic register which causes a character to be outputted: it can
be used even before the device is configured.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The only real change here (other than using the PCI bus) is that we
didn't negotiate VIRTIO_NET_F_MRG_RXBUF before, so the format of the
packet header changed with virtio 1.0; we need TUNSETVNETHDRSZ on the
tun fd to tell it about the extra two bytes.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We remove SCSI support (which was removed for 1.0) and VIRTIO_BLK_F_FLUSH
feature flag (removed too, since it's compulsory for 1.0).
The rest is mainly mechanical.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We want to use the local kernel headers, but -I../../include/uapi leads us into
a world of hurt. Instead we create a dummy include/ dir with symlinks.
If we just use #include "../../include/uapi/linux/virtio_blk.h" we get:
../../include/uapi/linux/virtio_blk.h:31:32: fatal error: linux/virtio_types.h: No such file or directory
#include <linux/virtio_types.h>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
For each device, We need to include the vendor capabilities to demark
where virtio common, notification and ISR regions are (we put them
all in BAR0).
We need to handle the switching of the virtqueues using the accessors.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This handles ioport 0xCF8 and 0xCFC accesses, which are used to
read/write PCI device config space.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We don't do anything with them yet (emulate_mmio_write and
emulate_mmio_read are stubs), but we decode the instructions and
search for the device they're hitting.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is where we point our PCI BARs, so that we can intercept MMIO
accesses. We tell the kernel about it so any faults in this area are
directed to us.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
While hacking on getting I/O out to the lguest launcher, I noticed
that returning 0xFF for the PS/2 keyboard status made it spin for a
while thinking there was a key pending. Fix this by returning 1
instead of 0xFF.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We copy 7 bytes at eip for userspace's instruction decode; we have to
carefully handle the case where eip is at the end of a page. We can't
leave this to userspace since kernel has all the page table decode
logic.
The decode logic moves to userspace, basically unchanged.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Somehow a naked u16 slipped into the glibc headers on my Ubuntu machine
(i386 2.17-0ubuntu5), breaking compile:
In file included from lguest.c:46:0:
/usr/include/linux/virtio_net.h:188:2: error: unknown type name ‘u16’
We use the kernel-style types anyway, just define them before the includes.
Also remove the advice on adding missing headers: that no longer works.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Lguest guests are UP, but the host is probably SMP, so real barriers are
required in case the device thread and the guest are on different CPUs.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The virtio spec was missing a barrier in example code, so I went back
to look at the lguest code. Indeed, we need one.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
After commit 07fe997, lguest tool has already moved from
Documentation/virtual/lguest/ to tools/lguest/.
Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The CONFIG_EXPERIMENTAL config item has not carried much meaning for a
while now and is almost always enabled by default. As agreed during the
Linux kernel summit, remove it from any "depends on" lines in Kconfigs.
CC: Rusty Russell <rusty@rustcorp.com.au>
CC: Davidlohr Bueso <dave@gnu.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
virtio requests are scatter-gather-style descriptors, but no
assumptions should be made about the layout. lguest was lazy here,
but saved by the fact that the network device hands all requests to
tun (which does it correctly) and console and random devices simply
use readv and writev.
Block devices, however, are broken: we convert to iovecs internally,
just make sure we handle the correctly.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We usually got away with ->next on the final entry being NULL, but it
finally bit me.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@kernel.org
This is a better location instead of having it in Documentation.
Signed-off-by: Davidlohr Bueso <dave@gnu.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (fixed compile)