Commit Graph

495463 Commits

Author SHA1 Message Date
Rusty Russell 1e1c17a7a2 tools/lguest: use common error macros in the example launcher.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-13 17:15:53 +10:30
Rusty Russell 17c56d6de8 tools/lguest: give virtqueues names for better error messages
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-13 17:15:53 +10:30
Rusty Russell d39a6785f4 tools/lguest: more documentation and checking of virtio 1.0 compliance.
This is from all the non-PCI parts of the spec.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-13 17:15:52 +10:30
Rusty Russell 55c2d7884e lguest: don't look in console features to find emerg_wr.
The 1.0 spec clearly states that you must set the ACKNOWLEDGE and
DRIVER status bits before accessing the feature bits.  This is a
problem for the early console code, which doesn't really want to
acknowledge the device (the spec specifically excepts writing to the
console's emerg_wr from the usual ordering constrains).

Instead, we check that the *size* of the device configuration is
sufficient to hold emerg_wr: at worst (if the device doesn't support
the VIRTIO_CONSOLE_F_EMERG_WRITE feature), it will ignore the
writes.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-13 17:15:51 +10:30
Rusty Russell d761b03291 tools/lguest: don't start devices until DRIVER_OK status set.
We were activating them with the virtqueues, and that's not allowed.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-13 17:15:51 +10:30
Rusty Russell 3afe3e0f8d tools/lguest: handle indirect partway through chain.
Linux doesn't generate these, but it's perfectly valid according to
a close reading of the spec.  I opened virtio spec bug VIRTIO-134 to
make this clearer there, too.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-13 17:15:50 +10:30
Rusty Russell c97eb679ef tools/lguest: insert driver references from the 1.0 spec (4.1 Virtio Over PCI)
As a demonstration, the lguest launcher is pretty strict, trying to
catch badly behaved drivers.  Document this precisely.

A good implementation would *NOT* crash the guest when these happened!

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-13 17:15:49 +10:30
Rusty Russell 8dc425ffdd tools/lguest: insert device references from the 1.0 spec (4.1 Virtio Over PCI)
There are some (optional) parts we don't implement, but this quotes all
the device requirements from the spec (csd 03, but it should be the same
across all released versions).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-13 17:15:47 +10:30
Rusty Russell b2ce1ea442 tools/lguest: rename virtio_pci_cfg_cap field to match spec.
The next patch will insert many quotes from the virtio 1.0 spec; they
make most sense if we copy the spec.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-13 17:15:47 +10:30
Rusty Russell 53aceb49f9 tools/lguest: fix features_accepted logic in example launcher.
We were clearing the lower bits when setting the upper bits.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-13 17:15:46 +10:30
Rusty Russell d2dbdac336 tools/lguest: handle device reset correctly in example launcher.
The example launcher doesn't reset the queue_enable like the spec says
we have to.  Plus, we should reset the size in case they negotiated
a different (smaller) one.

This is easy to test by unloading and reloading a virtio module.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-13 17:15:45 +10:30
Luis R. Rodriguez a2e1999157 virtual: Documentation: simplify and generalize paravirt_ops.txt
The general documentation we have for pv_ops is currenty present
on the IA64 docs, but since this documentation covers IA64 xen
enablement and IA64 Xen support got ripped out a while ago
through commit d52eefb47 present since v3.14-rc1 lets just
simplify, generalize and move the pv_ops documentation to a
shared place.

Cc: Isaku Yamahata <yamahata@valinux.co.jp>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: virtualization@lists.linux-foundation.org
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: xen-devel@lists.xenproject.org
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-13 17:15:44 +10:30
Rusty Russell d9bab50aa4 lguest: remove NOTIFY call and eventfd facility.
Disappointing, as this was kind of neat (especially getting to use RCU
to manage the address -> eventfd mapping).  But now the devices are PCI
handled in userspace, we get rid of both the NOTIFY hypercall and
the interface to connect an eventfd.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-11 16:47:46 +10:30
Rusty Russell 00f8d54651 lguest: remove NOTIFY facility from demonstration launcher.
This was only used for early console, now we can get rid of it altogether.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-11 16:47:45 +10:30
Rusty Russell a561adfaec lguest: use the PCI console device's emerg_wr for early boot messages.
This involves manually checking the console device (which is always in
slot 1 of bus 0) and using the window in VIRTIO_PCI_CAP_PCI_CFG to
program it (as we can't map the BAR yet).

We could in fact do this much earlier, but we wait for the first
write from the virtio_cons_early_init() facility.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-11 16:47:44 +10:30
Rusty Russell 713e3f7224 lguest: always put console in PCI slot #1.
This simplifies the early probe.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-11 16:47:44 +10:30
Rusty Russell 59eba788db lguest: support backdoor window.
The VIRTIO_PCI_CAP_PCI_CFG in the PCI virtio 1.0 spec allows access to
the BAR registers without mapping them.  This is a compulsory feature,
and we implement it here.

There are some subtleties involving access widths which we should
note:

4.1.4.7.1 Device Requirements: PCI configuration access capability

...
   Upon detecting driver write access to pci_cfg_data, the device MUST
   execute a write access at offset cap.offset at BAR selected by
   cap.bar using the first cap.length bytes from pci_cfg_data.

   Upon detecting driver read access to pci_cfg_data, the device MUST
   execute a read access of length cap.length at offset cap.offset at
   BAR selected by cap.bar and store the first cap.length bytes in
   pci_cfg_data.

So, for a write, we copy into the pci_cfg_data window, then write from
there out to the BAR.  This works correctly if cap.length != width of
write.  Similarly, for a read, we read into window from the BAR then
read the value from there.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-11 16:47:43 +10:30
Rusty Russell e8330d9bc1 lguest: support emerg_wr in console device in example launcher.
This is a magic register which causes a character to be outputted: it can
be used even before the device is configured.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-11 16:47:43 +10:30
Rusty Russell b3e28b65de lguest: remove lguest bus definitions from header.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-11 16:47:42 +10:30
Rusty Russell d9028eda7b lguest: remove support for lguest bus in demonstration launcher.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-11 16:47:42 +10:30
Rusty Russell e68ccd1f9d lguest: remove support for lguest bus.
The demonstration launcher now uses PCI entirely.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-11 16:47:41 +10:30
Rusty Russell eb39f83372 lguest: define VIRTIO_CONFIG_NO_LEGACY in example launcher.
We only support virtio 1.0 now

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-11 16:47:40 +10:30
Rusty Russell ebff01137a lguest: Convert console device to virtio 1.0 PCI.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-11 16:47:40 +10:30
Rusty Russell 0d5b5d399f lguest: Convert entropy device to virtio 1.0 PCI.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-11 16:47:39 +10:30
Rusty Russell bf6d40344d lguest: Convert net device to virtio 1.0 PCI.
The only real change here (other than using the PCI bus) is that we
didn't negotiate VIRTIO_NET_F_MRG_RXBUF before, so the format of the
packet header changed with virtio 1.0; we need TUNSETVNETHDRSZ on the
tun fd to tell it about the extra two bytes.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-11 16:47:39 +10:30
Rusty Russell 5051654764 lguest: Convert block device to virtio 1.0 PCI.
We remove SCSI support (which was removed for 1.0) and VIRTIO_BLK_F_FLUSH
feature flag (removed too, since it's compulsory for 1.0).

The rest is mainly mechanical.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-11 16:47:38 +10:30
Rusty Russell 8e70946943 lguest: add a dummy PCI host bridge.
Otherwise Linux fails to find the bus.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-11 16:47:38 +10:30
Rusty Russell 3e0e5f2640 lguest: fix failure to find linux/virtio_types.h
We want to use the local kernel headers, but -I../../include/uapi leads us into
a world of hurt.  Instead we create a dummy include/ dir with symlinks.

If we just use #include "../../include/uapi/linux/virtio_blk.h" we get:

	../../include/uapi/linux/virtio_blk.h:31:32: fatal error: linux/virtio_types.h: No such file or directory
	 #include <linux/virtio_types.h>

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-11 16:47:37 +10:30
Rusty Russell 9315307710 lguest: implement virtio-PCI MMIO accesses.
For each device, We need to include the vendor capabilities to demark
where virtio common, notification and ISR regions are (we put them
all in BAR0).

We need to handle the switching of the virtqueues using the accessors.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-11 16:47:36 +10:30
Rusty Russell d7fbf6e95e lguest: add PCI config space emulation to example launcher.
This handles ioport 0xCF8 and 0xCFC accesses, which are used to
read/write PCI device config space.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-11 16:47:36 +10:30
Rusty Russell 6a54f9ab0d lguest: decode mmio accesses for PCI in example launcher.
We don't do anything with them yet (emulate_mmio_write and
emulate_mmio_read are stubs), but we decode the instructions and
search for the device they're hitting.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-11 16:47:35 +10:30
Rusty Russell 0a6bcc183f lguest: add MMIO region allocator in example launcher.
This is where we point our PCI BARs, so that we can intercept MMIO
accesses.  We tell the kernel about it so any faults in this area are
directed to us.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-11 16:47:35 +10:30
Rusty Russell e1b83e2788 lguest: Override pcibios_enable_irq/pcibios_disable_irq to our stupid PIC
This lets us deliver interrupts for our emulated PCI devices using our
dumb PIC, and not emulate an 8259 and PCI irq mapping tables or whatever.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-11 16:47:34 +10:30
Rusty Russell ee72576c14 lguest: disable ACPI explicitly.
Once we add PCI, it starts trying to manage our interrupts.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-11 16:47:34 +10:30
Rusty Russell 7313d5217e lguest: add iomem region, where guest page faults get sent to userspace.
This lets us implement PCI.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-11 16:47:33 +10:30
Rusty Russell d1c29465b8 lguest: don't disable iospace.
This no longer speeds up boot (IDE got better, I guess), but it does stop
us probing for a PCI bus.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-11 16:47:32 +10:30
Rusty Russell 48fd6b71d6 lguest: suppress PS/2 keyboard polling.
While hacking on getting I/O out to the lguest launcher, I noticed
that returning 0xFF for the PS/2 keyboard status made it spin for a
while thinking there was a key pending.  Fix this by returning 1
instead of 0xFF.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-11 16:47:32 +10:30
Rusty Russell c565650b10 lguest: send trap 13 through to userspace.
We copy 7 bytes at eip for userspace's instruction decode; we have to
carefully handle the case where eip is at the end of a page.  We can't
leave this to userspace since kernel has all the page table decode
logic.

The decode logic moves to userspace, basically unchanged.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-11 16:47:31 +10:30
Rusty Russell c9e433e4b8 lguest: add infrastructure to check mappings.
We normally abort the guest unconditionally when it gives us a bad address,
but in the next patch we want to copy some bytes which may not be mapped.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-11 16:47:31 +10:30
Rusty Russell 8ed313001a lguest: add infrastructure for userspace to deliver a trap to the guest.
This is required for instruction emulation to move to userspace.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-11 16:47:30 +10:30
Rusty Russell 69a09dc174 lguest: write more information to userspace about pending traps.
This is preparation for userspace handling MMIO and ioport accesses.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-11 16:47:30 +10:30
Rusty Russell 18c137371b lguest: add operations to get/set a register from the Launcher.
We use the ptrace API struct, and we currently don't let them set
anything but the normal registers (we'd have to filter the others).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-11 16:47:29 +10:30
Rusty Russell a454bb36ca lguest: have --rng read from /dev/urandom not /dev/random.
Theoretical debates aside, now it boots.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-11 16:47:28 +10:30
Rusty Russell be8ff5952a virtio: don't require a config space on the console device.
Strictly, it's only needed when we have features (size or multiport).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-11 15:03:17 +10:30
Rusty Russell 7abb568dbb virtio_pci: use 16-bit accessor for queue_enable.
Since PCI is little endian, 8-bit access might work, but the spec section
is very clear on this:

  4.1.3.1 Driver Requirements: PCI Device Layout

  The driver MUST access each field using the “natural” access method,
  i.e. 32-bit accesses for 32-bit fields, 16-bit accesses for 16-bit
  fields and 8-bit accesses for 8-bit fields.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
2015-02-11 15:03:16 +10:30
Rusty Russell 6d96ee98b1 virtio: Don't expose legacy config features when VIRTIO_CONFIG_NO_LEGACY defined.
The VIRTIO_F_ANY_LAYOUT and VIRTIO_F_NOTIFY_ON_EMPTY features are pre-1.0
only.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
2015-02-11 15:03:16 +10:30
Rusty Russell 527100a4ee virtio: Don't expose legacy block features when VIRTIO_BLK_NO_LEGACY defined.
This allows modern implementations to ensure they don't use legacy
feature bits or SCSI commands (which are not used in v1.0 non-legacy).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
2015-02-11 15:03:15 +10:30
Rusty Russell e6a02746e0 virtio: define VIRTIO_PCI_CAP_PCI_CFG in header.
This provides backdoor access to the device MMIOs, and every device should
have one.  From the virtio 1.0 spec (CS03):

  4.1.4.7.1 Device Requirements: PCI configuration access capability

  The device MUST present at least one VIRTIO_PCI_CAP_PCI_CFG capability.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
2015-02-11 15:03:15 +10:30
Tetsuo Handa 5e05bf5833 virtio: Avoid possible kernel panic if DEBUG is enabled.
The virtqueue_add() calls START_USE() upon entry. The virtqueue_kick() is
called if vq->num_added == (1 << 16) - 1 before calling END_USE().
The virtqueue_kick_prepare() called via virtqueue_kick() calls START_USE()
upon entry, and will call panic() if DEBUG is enabled.
Move this virtqueue_kick() call to after END_USE() call.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-11 15:03:14 +10:30
Pawel Moll 1862ee22ce virtio-mmio: Update the device to OASIS spec version
This patch add a support for second version of the virtio-mmio device,
which follows OASIS "Virtual I/O Device (VIRTIO) Version 1.0"
specification.

Main changes:

1. The control register symbolic names use the new device/driver
   nomenclature rather than the old guest/host one.

2. The driver detect the device version (version 1 is the pre-OASIS
   spec, version 2 is compatible with fist revision of the OASIS spec)
   and drives the device accordingly.

3. New version uses direct addressing (64 bit address split into two
   low/high register) instead of the guest page size based one,
   and addresses each part of the queue (descriptors, available, used)
   separately.

4. The device activity is now explicitly triggered by writing to the
   "queue ready" register.

5. Whole 64 bit features are properly handled now (both ways).

Signed-off-by: Pawel Moll <pawel.moll@arm.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-01-23 14:57:10 +10:30