The following sequence happens:
- the SeaBIOS virtio-blk driver does not support the WCE feature, which
causes QEMU to disable writeback caching
- the Linux virtio-blk driver resets the device, finds WCE is available
but writeback caching is disabled; tells block layer to not send cache
flush commands
- the Linux virtio-blk driver sets the DRIVER_OK bit, which causes
writeback caching to be re-enabled, but the Linux virtio-blk driver does
not know of this side effect and cache flushes remain disabled
The bug is at the third step. If the guest does know about CONFIG_WCE,
QEMU should ignore the WCE feature's state. The guest will control the
cache mode solely using configuration space. This change makes Linux
do flushes correctly, but Linux will keep SeaBIOS's writethrough mode.
Hence, whenever the guest is reset, the cache mode of the disk should
be reset to whatever was specified in the "-drive" option. With this
change, the Linux virtio-blk driver finds that writeback caching is
enabled, and tells the block layer to send cache flush commands
appropriately.
Reported-by: Rusty Russell <rusty@au1.ibm.com
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Add an Error ** parameter to bdrv_open, bdrv_file_open and associated
functions to allow more specific error messages.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Avoid trying to setup dataplane again if dataplane setup is already in
progress. This may happen if an eventfd is triggered during setup.
I saw this occasionally with an experimental s390 irqfd implementation:
virtio_blk_handle_output
-> virtio_blk_data_plane_start
-> virtio_ccw_set_host_notifier
...
-> virtio_queue_set_host_notifier_fd_handler
-> virtio_queue_host_notifier_read
-> virtio_queue_notify_vq
-> virtio_blk_handle_output
-> virtio_blk_data_plane_start
-> vring_setup
-> hostmem_init
-> memory_listener_register
-> BOOM
As virtio-ccw tries to follow what virtio-pci does, it might be triggerable
for other platforms as well.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
We call bdrv_attach_dev when initializing whether or not bs is created
locally, so call bdrv_detach_dev and let the refcnt handle the
lifecycle.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Manage BlockDriverState lifecycle with refcnt, so bdrv_delete() is no
longer public and should be called by bdrv_unref() if refcnt is
decreased to 0.
This is an identical change because effectively, there's no multiple
reference of BDS now: no caller of bdrv_ref() yet, only bdrv_new() sets
bs->refcnt to 1, so all bdrv_unref() now actually delete the BDS.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
If PFLASH_DEBUG is enabled then we have some build errors:
hw/block/pflash_cfi02.c: In function ‘pflash_timer’:
hw/block/pflash_cfi02.c:128:5: error: expected ‘)’ before string constant
hw/block/pflash_cfi02.c:128:5: error: too few arguments to function ‘fprintf’
This patch fixes the problem.
Signed-off-by: Antony Pavlov <antonynpavlov@gmail.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This is an autogenerated patch using scripts/switch-timer-api.
Switch the entire code base to using the new timer API.
Note this patch may introduce some line length issues.
Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The .io_flush() handler no longer exists and has no users. Drop the
io_flush argument to aio_set_fd_handler() and related functions.
The AioFlushEventNotifierHandler and AioFlushHandler typedefs are no
longer used and are dropped too.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Check exit conditions before entering blocking aio_poll(). This is
mainly for consistency since it's unlikely that we are stopping in the
first event loop iteration.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Move the code to hw/i386, the sole remaining property is available
as !pci_enabled.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1376069702-22330-4-git-send-email-aliguori@us.ibm.com
Rebased.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
With the new semantics of pc_sysfw (no -pflash implies "old-style" ROM setup,
-pflash implies "new-style" ROM setup), there is no need anymore for a compat
property. Old machines simply will never use -pflash, and thus will always
use old-style setup.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1376069702-22330-3-git-send-email-aliguori@us.ibm.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
The variable is not written anymore.
This cleans up after 9e1c2ec (which accidentally left variable
pc_sysfw_flash_vs_rom_bug_compatible behind, value always zero), and
buries dead code from commit dafb82e (which resurrected the pc_sysfw
code for pc_sysfw_flash_vs_rom_bug_compatible by mistake).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1376069702-22330-2-git-send-email-aliguori@us.ibm.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Introduce a type constant, use QOM casts and rename the parent field and
prepare for QOM realize.
Reviewed-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Since commit dd3be74207 SUNW,fdtwo's
initfn (realizefn since 940194c236)
was using SYSBUS_FDC() cast. This uses type sysbus-fdc rather than
SUNW,fdtwo.
Fix this by letting SUNW,fdtwo and sysbus-fdc both inherit from an
abstract type base-sysbus-fdc.
This allows to consolidate realizefns by using instance_init functions.
Clean up variable names and variable order while at it.
Reported-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Cc: Hu Tao <hutao@cn.fujitsu.com>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Andreas Färber <afaerber@suse.de>
# By Stefan Hajnoczi (4) and others
# Via Stefan Hajnoczi
* stefanha/block:
dataplane: refuse to start if device is already in use
dataplane: enable virtio-blk x-data-plane=on live migration
migration: fix spice migration
migration: notify migration state before starting thread
block: Repair the throttling code.
gluster: Add image resize support
Message-id: 1375112172-24863-1-git-send-email-stefanha@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
The category will be used to sort the devices displayed in
the command line help.
Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Message-id: 1375107465-25767-4-git-send-email-marcel.a@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Dataplane must check whether a block device is in use before launching
the dataplane thread. This is necessary since the thread does not
synchronize with the main loop and I/O requests could cause corruption.
One example is when a drive is added and a block job is started before
hotplugging the virtio-blk-pci adapter. In this case we must not use
dataplane mode.
Cc: qemu-stable@nongnu.org
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Although the dataplane thread does not cooperate with dirty memory
logging yet it's fairly easy to temporarily disable dataplane during
live migration. This way virtio-blk can live migrate when
x-data-plane=on.
The dataplane thread will restart after migration is cancelled or if the
guest resuming virtio-blk operation after migration completes.
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Support backend option "direct-io-safe". This is documented as
follows in the Xen backend specification:
* direct-io-safe
* Values: 0/1 (boolean)
* Default Value: 0
*
* The underlying storage is not affected by the direct IO memory
* lifetime bug. See:
* http://lists.xen.org/archives/html/xen-devel/2012-12/msg01154.html
*
* Therefore this option gives the backend permission to use
* O_DIRECT, notwithstanding that bug.
*
* That is, if this option is enabled, use of O_DIRECT is safe,
* in circumstances where we would normally have avoided it as a
* workaround for that bug. This option is not relevant for all
* backends, and even not necessarily supported for those for
* which it is relevant. A backend which knows that it is not
* affected by the bug can ignore this option.
*
* This option doesn't require a backend to use O_DIRECT, so it
* should not be used to try to control the caching behaviour.
Also, BDRV_O_NATIVE_AIO is ignored if BDRV_O_NOCACHE, so clarify the
default flags passed to the qemu block layer.
The original proposal for a "cache" backend option has been dropped
because it was believed too wide, especially considering that at the
moment the backend doesn't have a way to tell the toolstack that it is
capable of supporting it.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
The firmware commonly used with MIPS Malta boards (YAMON) reads the
status of the pflash with a 32bit memory access. On real hardware
this results in the status byte being mirrored in the upper 16 bits
of the read value. For example if the status byte is represented by
SS then the hardware reads 0x00SS00SS. The YAMON firmware compares the
status against 32bit values expecting the mirrored value and fails
without it.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Rename fdctrl_init_common() to fdctrl_realize_common() and let
fdctrl_connect_drives() propagate an Error through it.
Reviewed-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Introduce type constant and replace FROM_SYSBUS().
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
[AF: Renamed parent field]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Introduce type constant and replace FROM_SYSBUS().
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
[AF: Renamed parent field]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Introduce type constant and avoid DO_UPCAST(), container_of(),
and use DEVICE() to avoid accessing parent qdev directly.
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
[AF: Renamed parent field and avoided repeated SYS_BUS_DEVICE() casts]
Signed-off-by: Andreas Färber <afaerber@suse.de>
# By Michael Tokarev (2) and others
# Via Michael Tokarev
* mjt/trivial-patches:
doc: monitor multiplexing rewording
block/m25p80: Update Micron entries
Fix command example in qemu.sasl
slirp: remove mbuf(m_hdr,m_dat) indirection
linux-user: declare sys_futex to have 6 arguments
Message-id: 1374225073-12959-1-git-send-email-mjt@msgid.tls.msk.ru
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
- Split 32Mb and 256Mb parts into a11 and a13 variants.
- Add the 4K sector flag to the 128Mb parts. (These entries were taken from
the Linux kernel list, which is missing the flag.)
- Fill out the table of sizes with entries for 64Mb parts.
Prodded by Peter Crosthwaite.
Signed-off-by: Ed Maste <emaste@freebsd.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Load the virtio.c state into vring.c when we start dataplane mode and
vice versa when stopping dataplane mode. This patch makes it possible
to start and stop dataplane any time while the guest is running.
This will eventually allow us to go back to QEMU main loop for
bdrv_drain_all() and live migration. In the meantime, this patch makes
the dataplane lifecycle more robust but should make no visible
difference. It may be useful in the virtio-net dataplane effort.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Nand chips are not sysbus devices - they do not have any sense of MMIO,
nor interrupts. Re-parent to TYPE_DEVICE accordingly.
Cc: afaerber@suse.de
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The prescribed transition from Sysbus::init function to a
Device::realize.
Cc: afaerber@suse.de
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Define and use standard QOM cast macro. Remove usages of DO_UPCAST and
direct -> style casting.
Cc: afaerber@suse.de
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Make this code closer to passing checkpatch. Mostly missing braces, but
a few rogue tabs in there as well.
Cc: qemu-trivial@nongnu.org
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The DMAContext is a simple pointer to an AddressSpace that is now always
already available. Make everyone hold the address space directly,
and clean up the DMA API to use the AddressSpace directly.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Initial commit for emulated Non-Volatile-Memory Express (NVMe) pci
storage device.
NVMe is an open, industry driven storage specification defining
an optimized register and command set designed to deliver the full
capabilities of non-volatile memory on PCIe SSDs. Further information
may be found on the organizations website at:
http://www.nvmexpress.org/
This commit implements the minimum from the specification to work with
existing drivers.
Cc: Keith Busch <keith.busch@gmail.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Drop ISADeviceClass::init and the resulting no-op initfn and let
children implement their own realizefn. Adapt error handling.
Split off an instance_init where sensible.
Signed-off-by: Andreas Färber <afaerber@suse.de>
We may want to include a driver in the whitelist for read only tasks
such as diagnosing or exporting guest data (with libguestfs as a good
example). This patch introduces a readonly whitelist option, and for
backward compatibility, the old configure option --block-drv-whitelist
is now an alias to rw whitelist.
Drivers in readonly list is only permitted to open file readonly, and
returns -ENOTSUP for RW opening.
E.g. To include vmdk readonly, and others read+write:
./configure --target-list=x86_64-softmmu \
--block-drv-rw-whitelist=qcow2,raw,file,qed \
--block-drv-ro-whitelist=vmdk
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When pc-sysfw.rom_only == 0, flash memory will be
usable with kvm. In order to enable flash memory mode,
a pflash device must be created. (For example, by
using the -pflash command line parameter.)
Usage of a flash memory device with kvm requires
KVM_CAP_READONLY_MEM, and kvm will abort if
a flash device is used with an older kvm which does
not support this capability.
If a flash device is not used, then qemu/kvm will
operate in the original rom-mode.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1369816047-16384-5-git-send-email-jordan.l.justen@intel.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
The isapc machine with seabios currently requires the BIOS region
to be read/write memory rather than read-only memory.
KVM currently cannot support the BIOS as a ROM region, but qemu
in non-KVM mode can. Based on this, isapc machine currently only
works with KVM.
To work-around this isapc issue, this change avoids marking the
BIOS as readonly for isapc.
This change also will allow KVM to start supporting ROM mode
via KVM_CAP_READONLY_MEM.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1369816047-16384-2-git-send-email-jordan.l.justen@intel.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
"Readable" is a very unfortunate name for this flag because even a
rom_device region will always be readable from the guest POV. What
differs is the mapping, just like the comments had to explain already.
Also, readable could currently be understood as being a generic region
flag, but it only applies to rom_device regions.
So rename the flag and the function to modify it after the original term
"ROMD" which could also be interpreted as "ROM direct", i.e. ROM mode
with direct access. In any case, the scope of the flag is clearer now.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This reverts commit 9953f8822c.
While Markus's analysis is entirely correct, there are 1.6 patches
that fix the bug for real and without requiring machine type hacks.
Let's think of the children who will have to read this code, and
avoid a complicated mess of semantics that differ between <1.5,
1.5, and >1.5.
Conflicts:
hw/i386/pc_piix.c
hw/i386/pc_q35.c
include/hw/i386/pc.h
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Anthony Liguori <aliguori@us.ibm.com>
Message-id: 1368189483-7915-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Add new devices for various manufacturers, and re-sort Spansion list to
match the order in Linux, which requires chips with a non-zero extended ID
to come first.
With this commit the outstanding differences to Linux rev 55bf75b are:
- Erase size flag differences in s25sl032p, s25sl064p, s25fl016k, s25fl064k
(These devices have only some blocks that support small erase sizes.)
- Linux lacks n25q128
- Devices without a Jedec ID have been excluded
Signed-off-by: Ed Maste <emaste@freebsd.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Introduce type constant and cast macro to obsolete DO_UPCAST().
Reuse type constant for PC machine compatibility settings.
Prepares for ISA realizefn.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Message-id: 1367093935-29091-4-git-send-email-afaerber@suse.de
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Use of a flash memory device for the BIOS was added in series "[PATCH
v10 0/8] PC system flash support", commit 4732dca..1b89faf, v1.1.
Flash vs. ROM is a guest-visible difference. Thus, flash use had to
be suppressed for machine types pc-1.0 and older. This was
accomplished by adding a dummy device "pc-sysfw" with property
"rom_only":
* Non-zero rom_only means "use ROM". Default for pc-1.0 and older.
* Zero rom_only means "maybe use flash". Default for newer machines.
Not only is the dummy device ugly, it was also retroactively added to
the older machine types! Fortunately, it's not guest-visible (thus no
immediate guest ABI breakage), and has no vmstate (thus no immediate
migration breakage). Breakage occurs only if the user unwisely
enables flash by setting rom_only to zero. Patch review FAIL #1.
Why "maybe use flash"? Flash didn't (and still doesn't) work with
KVM. Therefore, rom_only=0 really means "use flash, except when KVM
is enabled, use ROM". This is a Bad Idea, because it makes enabling/
disabling KVM guest-visible. Patch review FAIL #2.
Aside: it also precludes migrating between KVM on and off, but that's
not possible for other reasons anyway.
Fix as follows:
1. Change the meaning of rom_only=0 to mean "use flash, no ifs, buts,
or maybes" for pc-i440fx-1.5 and pc-q35-1.5. Don't change anything
for older machines (to remain bug-compatible).
2. Change the default value from 0 to 1 for these machines.
Necessary, because 0 doesn't work with KVM. Once it does, we can flip
the default back to 0.
3. Don't revert the retroactive addition of device "pc-sysfw" to older
machine types. Seems not worth the trouble.
4. Add a TODO comment asking for device "pc-sysfw" to be dropped once
flash works with KVM.
Net effect is that you get a BIOS ROM again even when KVM is disabled,
just like for machines predating the introduction of flash.
To get flash instead, use "--global pc-sysfw.rom_only=0".
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1365780303-26398-4-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>