Commit Graph

59 Commits

Author SHA1 Message Date
Luiz Capitulino
2582bfedd2 block: BLOCK_IO_ERROR QMP event
This commit introduces the bdrv_mon_event() function, which
should be called by block subsystems (eg. IDE) when a I/O
error occurs, so that an QMP event is emitted.

The following information is currently provided in the event:

- device name
- operation (ie. "read" or "write")
- action taken (eg. "stop")

Event example:

{ "event": "BLOCK_IO_ERROR",
    "data": { "device": "ide0-hd1",
              "operation": "write",
              "action": "stop" },
    "timestamp": { "seconds": 1265044230, "microseconds": 450486 } }

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-02-10 11:57:03 -06:00
Liran Schour
aaa0eb75e2 Count dirty blocks and expose an API to get dirty count
This will manage dirty counter for each device and will allow to get the
dirty counter from above.

Signed-off-by: Liran Schour <lirans@il.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-02-09 16:56:14 -06:00
Christoph Hellwig
9a2d77ad0d block: kill BDRV_O_CREAT
The BDRV_O_CREAT option is unused inside qemu and partially duplicates
the bdrv_create method.  Remove it, and the -C option to qemu-io which
isn't used in qemu-iotests anyway.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-26 15:42:02 -06:00
Naphtali Sprei
37226ad946 No need anymoe for bdrv_set_read_only
Signed-off-by: Naphtali Sprei <nsprei@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-26 15:42:01 -06:00
Naphtali Sprei
f5edb014ed Clean-up a little bit the RW related bits of BDRV_O_FLAGS. BDRV_O_RDONLY gone (and so is BDRV_O_ACCESS). Default value for bdrv_flags (0/zero) is READ-ONLY. Need to explicitly request READ-WRITE.
Instead of using the field 'readonly' of the BlockDriverState struct for passing the request,
pass the request in the flags parameter to the function.

Signed-off-by: Naphtali Sprei <nsprei@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-20 08:25:22 -06:00
Kevin Wolf
756e6736a1 block: Add bdrv_change_backing_file
Introduce the functions needed to change the backing file of an image. The
function is implemented for qcow2.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-13 17:14:15 -06:00
Kevin Wolf
b783e409bf block: Introduce BDRV_O_NO_BACKING
If an image references a backing file that doesn't exist, qemu-img info fails
to open this image. Exactly in this case the info would be valuable, though:
the user might want to find out which file is missing.

This patch introduces a BDRV_O_NO_BACKING flag to ignore the backing file when
opening the image. qemu-img info is the first user and provides info now even
if the backing file is invalid.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-13 17:14:15 -06:00
Luiz Capitulino
218a536a7a block: Convert bdrv_info_stats() to QObject
Each device statistic information is stored in a QDict and
the returned QObject is a QList of all devices.

This commit should not change user output.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-12-12 07:59:49 -06:00
Luiz Capitulino
d15e546567 block: Convert bdrv_info() to QObject
Each block device information is stored in a QDict and the
returned QObject is a QList of all devices.

This commit should not change user output.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-12-12 07:59:49 -06:00
Jan Kiszka
23bd90d2f9 block migration: Increase dirty chunk size to 1M
4K is too small for efficiently saving and restoring multi-GB block
devices.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-12-03 10:48:54 -06:00
Jan Kiszka
6ea44308b0 block migration: Rework constants API
Instead of duplicating the definition of constants or introducing
trivial retrieval functions move the SECTOR constants into the public
block API. This also obsoletes sector_per_block in BlkMigState.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-12-03 10:48:52 -06:00
Jan Kiszka
a55eb92c22 block migration: Fix coding style and whitespaces
No functional changes.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-12-03 10:48:52 -06:00
lirans@il.ibm.com
7cd1e32a86 Expose a mechanism to trace block writes
To support live migration without shared storage we need to be able to trace
writes to disk while migrating. This Patch expose dirty block tracking per
device to be polled from upper layer.

Changes from v4:
- Register dirty tracking for each block device.
- Minor coding style issues.
- Block.c will now manage a dirty bitmap per device once
  bdrv_set_dirty_tracking() is called. Bitmap is polled by the upper
  layer (block-migration.c).

Signed-off-by: Liran Schour <lirans@il.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-11-17 08:03:31 -06:00
Markus Armbruster
eb852011ab Configurable block format whitelist
We have code for a quite a few block formats.  While I trust that all
of these formats are useful at least for some people in some
circumstances, some of them are of a kind that friends don't let
friends use in production.

This patch provides an optional block format whitelist, default off.
If a whitelist is configured with --block-drv-whitelist, QEMU proper
can use only whitelisted formats.  Other programs, like qemu-img, are
not affected.

Drivers for formats off the whitelist still participate in format
probing, to ensure all programs probe exactly the same.  Without that,
QEMU proper would be prone to treat images with a format off the
whitelist as raw when the image's format is probed.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-11-09 08:43:02 -06:00
Naphtali Sprei
59f2689d90 Added readonly flag to -drive command
This is a slightly revised patch for adding readonly flag to the -drive command.
Even though this patch is "stand-alone", it assumes a previous related patch (in Anthony staging tree), that passes
the readonly attribute of the drive to the guest OS, applied first.

This enables sharing same image between guests, with readonly access.
Implementaion mark the drive as read_only and changes the flags when actually opening the file.
The readonly attribute of a qcow also passed to it's base file.
For ide that cannot pass the readonly attribute to the guest OS, disallow the readonly flag.

Also, return error code from bdrv_truncate for readonly drive.

Signed-off-by: Naphtali Sprei <nsprei@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-11-09 08:43:01 -06:00
Christoph Hellwig
b2e12bc6e3 block: add aio_flush operation
Instead stalling the VCPU while serving a cache flush try to do it
asynchronously.  Use our good old helper thread pool to issue an
asynchronous fdatasync for raw-posix.  Note that while Linux AIO
implements a fdatasync operation it is not useful for us because
it isn't actually implement in asynchronous fashion.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-09-11 10:19:46 -05:00
Christoph Hellwig
e900a7b748 block: add enable_write_cache flag
Add a enable_write_cache flag in the block driver state, and use it to
decide if we claim to have a volatile write cache that needs controlled
flushing from the guest.  The flag is off if cache=writethrough is
defined because O_DSYNC guarantees that every write goes to stable
storage, and it is on for cache=none and cache=writeback.

Both scsi-disk and ide now use the new flage, changing from their
defaults of always off (ide) or always on (scsi-disk).

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-09-11 10:19:46 -05:00
Kevin Wolf
40b4f53967 Add bdrv_aio_multiwrite
One performance problem of qcow2 during the initial image growth are
sequential writes that are not cluster aligned. In this case, when a first
requests requires to allocate a new cluster but writes only to the first
couple of sectors in that cluster, the rest of the cluster is zeroed - just
to be overwritten by the following second request that fills up the cluster.

Let's try to merge sequential write requests to the same cluster, so we can
avoid to write the zero padding to the disk in the first place.

As a nice side effect, also other formats take advantage of dealing with less
and larger requests.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-09-11 10:18:06 -05:00
Christoph Hellwig
5c6c3a6c54 raw-posix: add Linux native AIO support
Now that do have a nicer interface to work against we can add Linux native
AIO support.  It's an extremly thing layer just setting up an iocb for
the io_submit system call in the submission path, and registering an
eventfd with the qemu poll handler to do complete the iocbs directly
from there.

This started out based on Anthony's earlier AIO patch, but after
estimated 42,000 rewrites and just as many build system changes
there's not much left of it.

To enable native kernel aio use the aio=native sub-command on the
drive command line.  I have also added an option to qemu-io to
test the aio support without needing a guest.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-08-27 20:30:22 -05:00
Christoph Hellwig
45566e9c99 replace bdrv_{get, put}_buffer with bdrv_{load, save}_vmstate
The VM state offset is a concept internal to the image format.  Replace
the old bdrv_{get,put}_buffer method that require an index into the
image file that is constructed from the VM state offset and an offset
into the vmstate with the bdrv_{load,save}_vmstate that just take an
offset into the VM state.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-07-16 08:28:13 -05:00
Anthony Liguori
1cec71e359 Revert "support colon in filenames"
This reverts commit 707c0dbc97.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-07-09 16:06:38 -05:00
Kevin Wolf
0aa217e461 qcow2: Make cache=writethrough default
The performance of qcow2 has improved meanwhile, so we don't need to
special-case it any more. Switch the default to write-through caching
like all other block drivers.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-07-09 16:06:37 -05:00
Ram Pai
707c0dbc97 support colon in filenames
Problem: It is impossible to feed filenames with the character colon because
qemu interprets such names as a protocol. For example filename scsi:0, is
interpreted as a protocol by name "scsi".

This patch allows user to espace colon characters. For example the above
filename can now be expressed either as 'scsi\:0' or as file:scsi:0

anything following the "file:" tag is interpreted verbatin. However if "file:"
tag is omitted then any colon characters in the string must be escaped using
backslash.

Here are couple of examples:

scsi\:0\:abc is a local file scsi:0:abc
http\://myweb is a local file by name http://myweb
file:scsi:0:abc is a local file scsi:0:abc
file:http://myweb is a local file by name http://myweb

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-06-29 13:50:05 -05:00
Mark McLoughlin
aea2a33c73 Prevent CD-ROM media eject while device is locked
Section 10.8.25 ("START/STOP UNIT Command") of SFF-8020i states that
if the device is locked we should refuse to eject if the device is
locked.

ASC_MEDIA_REMOVAL_PREVENTED is the appropriate return in this case.

In order to stop itself from ejecting the media it is running from,
Fedora's installer (anaconda) requires the CDROMEJECT ioctl() to fail
if the drive has been previously locked.

See also https://bugzilla.redhat.com/501412

Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-06-16 15:52:37 -05:00
Kevin Wolf
0e7e1989f7 Convert all block drivers to new bdrv_create
Now we can make use of the newly introduced option structures. Instead of
having bdrv_create carry more and more parameters (which are format specific in
most cases), just pass a option structure as defined by the driver itself.

bdrv_create2() contains an emulation of the old interface to simplify the
transition.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-05-22 10:50:31 -05:00
Anthony Liguori
5efa9d5a8b Convert block infrastructure to use new module init functionality
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-05-14 16:13:41 -05:00
aliguori
e97fc193e1 Introduce bdrv_check (Kevin Wolf)
From: Kevin Wolf <kwolf@redhat.com>

Introduce a new bdrv_check function pointer for block drivers. Modify qcow2 to
return an error status in check_refcounts(), so it can implement bdrv_check.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>


git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@7214 c046a42c-6fe2-441c-8c8c-71466251a162
2009-04-21 23:11:50 +00:00
aliguori
c87c067293 remove bdrv_aio_read/bdrv_aio_write (Christoph Hellwig)
Always use the vectored APIs to reduce code churn once we switch the BlockDriver
API to be vectored.


Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>


git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@7019 c046a42c-6fe2-441c-8c8c-71466251a162
2009-04-07 18:43:20 +00:00
aliguori
178e08a58f Fix savevm after BDRV_FILE size enforcement
We now enforce that you cannot write beyond the end of a non-growable file.
qcow2 files are not growable but we rely on them being growable to do
savevm/loadvm.  Temporarily allow them to be growable by introducing a new
API specifically for savevm read/write operations.

Reported-by: malc
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>


git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6994 c046a42c-6fe2-441c-8c8c-71466251a162
2009-04-05 19:10:55 +00:00
aliguori
5eb456396d block: support known backing format for image create and open (Uri Lublin)
Added a backing_format field to BlockDriverState.
Added bdrv_create2 and drv->bdrv_create2 to create an image with
a known backing file format.
Upon bdrv_open2 if backing format is known use it, instead of
probing the (backing) image.

Signed-off-by: Uri Lublin <uril@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>


git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6908 c046a42c-6fe2-441c-8c8c-71466251a162
2009-03-28 17:55:10 +00:00
aliguori
221f715d90 new scsi-generic abstraction, use SG_IO (Christoph Hellwig)
Okay, I started looking into how to handle scsi-generic I/O in the
new world order.

I think the best is to use the SG_IO ioctl instead of the read/write
interface as that allows us to support scsi passthrough on disk/cdrom
devices, too.  See Hannes patch on the kvm list from August for an
example.

Now that we always do ioctls we don't need another abstraction than
bdrv_ioctl for the synchronous requests for now, and for asynchronous
requests I've added a aio_ioctl abstraction keeping it simple.

Long-term we might want to move the ops to a higher-level abstraction
and let the low-level code fill out the request header, but I'm lazy
enough to leave that to the people trying to support scsi-passthrough
on a non-Linux OS.

Tested lightly by issuing various sg_ commands from sg3-utils in a guest
to a host CDROM device.


Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>


git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6895 c046a42c-6fe2-441c-8c8c-71466251a162
2009-03-28 17:28:41 +00:00
aliguori
7d78066926 Add specialized block driver scsi generic API (Avi Kivity)
When a scsi device is backed by a scsi generic device instead of an
ordinary host block device, the block API is abused in a couple of annoying
ways:

 - nb_sectors is negative, and specifies a byte count instead of a sector count
 - offset is ignored, since scsi-generic is essentially a packet protocol

This overloading makes hacking the block layer difficult.  Remove it by
introducing a new explicit API for scsi-generic devices.  The new API
is still backed by the old implementation, but at least the users are
insulated.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>


git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6822 c046a42c-6fe2-441c-8c8c-71466251a162
2009-03-12 19:57:08 +00:00
aliguori
b7ea8c2636 Revert r6405
This series is broken by design as it requires expensive IO operations at
open time causing very long delays when starting a virtual machine for the
first time.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>


git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6815 c046a42c-6fe2-441c-8c8c-71466251a162
2009-03-11 20:05:33 +00:00
aliguori
70240ca680 Revert r6407
This series is broken by design as it requires expensive IO operations at
open time causing very long delays when starting a virtual machine for the
first time.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>


git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6813 c046a42c-6fe2-441c-8c8c-71466251a162
2009-03-11 20:05:25 +00:00
aliguori
376253ece4 monitor: Rework API (Jan Kiszka)
Refactor the monitor API and prepare it for decoupled terminals:
term_print functions are renamed to monitor_* and all monitor services
gain a new parameter (mon) that will once refer to the monitor instance
the output is supposed to appear on. However, the argument remains
unused for now. All monitor command callbacks are also extended by a mon
parameter so that command handlers are able to pass an appropriate
reference to monitor output services.

For the case that monitor outputs so far happen without clearly
identifiable context, the global variable cur_mon is introduced that
shall once provide a pointer either to the current active monitor (while
processing commands) or to the default one. On the mid or long term,
those use case will be obsoleted so that this variable can be removed
again.

Due to the broad usage of the monitor interface, this patch mostly deals
with converting users of the monitor API. A few of them are already
extended to pass 'mon' from the command handler further down to internal
functions that invoke monitor_printf.

At this chance, monitor-related prototypes are moved from console.h to
a new monitor.h. The same is done for the readline API.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>


git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6711 c046a42c-6fe2-441c-8c8c-71466251a162
2009-03-05 23:01:23 +00:00
aliguori
c0f4ce7751 monitor: Rework early disk password inquiry (Jan Kiszka)
Reading the passwords for encrypted hard disks during early startup is
broken (I guess for quiet a while now):
 - No monitor terminal is ready for input at this point
 - Forcing all mux'ed terminals into monitor mode can confuse other
   users of that channels

To overcome these issues and to lay the ground for a clean decoupling of
monitor terminals, this patch changes the initial password inquiry as
follows:
 - Prevent autostart if there is some encrypted disk
 - Once the user tries to resume the VM, prompt for all missing
   passwords
 - Only resume if all passwords were accepted

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>


git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6707 c046a42c-6fe2-441c-8c8c-71466251a162
2009-03-05 23:01:01 +00:00
aliguori
045df33021 block: Introduce bdrv_get_encrypted_filename (Jan Kiszka)
Introduce bdrv_get_encrypted_filename service to allow more informative
password prompting.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>


git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6704 c046a42c-6fe2-441c-8c8c-71466251a162
2009-03-05 23:00:48 +00:00
aliguori
51de97605b block: Improve bdrv_iterate (Jan Kiszka)
Make bdrv_iterate more useful by passing the BlockDriverState to the
iterator instead of the device name.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>


git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6703 c046a42c-6fe2-441c-8c8c-71466251a162
2009-03-05 23:00:43 +00:00
aliguori
1987530fe0 qcow2 format: keep 'num_free_bytes', and show it upon 'info blockstats' (Uri Lublin)
'num_free_bytes' is the number of non-allocated bytes below highest-allocation.
It's useful, together with the highest-allocation, to figure out how
fragmented the image is, and how likely it will run out-of-space soon.

For example when the highest allocation is high (almost end-of-disk), but 
many bytes (clusters) are free, and can be re-allocated when neeeded, than
we know it's probably not going to reach end-of-disk-space soon.

Added bookkeeping to block-qcow2.c
Export it using BlockDeviceInfo
Show it upon 'info blockstats' if BlockDeviceInfo exists

Signed-off-by: Uri Lublin <uril@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>


git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6407 c046a42c-6fe2-441c-8c8c-71466251a162
2009-01-22 18:57:34 +00:00
aliguori
c421820580 block-qcow2: export highest_allocated through BlockDriverInfo and get_info() (Uri Lublin)
Signed-off-by: Uri Lublin <uril@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>


git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6405 c046a42c-6fe2-441c-8c8c-71466251a162
2009-01-22 18:57:26 +00:00
aliguori
3b69e4b9ad Vectored block device API (Avi Kivity)
Most devices that are capable of DMA are also capable of scatter-gather.
With the memory mapping API, this means that the device code needs to be
able to access discontiguous host memory regions.

For block devices, this translates to vectored I/O.  This patch implements
an aynchronous vectored interface for the qemu block devices.  At the moment
all I/O is bounced and submitted through the non-vectored API; in the future
we will convert block devices to natively support vectored I/O wherever
possible.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>


git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6397 c046a42c-6fe2-441c-8c8c-71466251a162
2009-01-22 16:59:24 +00:00
aliguori
4dc822d726 Use writeback caching by default with qcow2
qcow2 writes a cluster reference count on every cluster update.  This causes
performance to crater when using anything but cache=writeback.  This is most
noticeable when using savevm.  Right now, qcow2 isn't a reliable format
regardless of the type of cache your using because metadata is not updated in
the correct order.  Considering this, I think it's somewhat reasonable to use
writeback caching by default with qcow2 files.

It at least avoids the massive performance regression for users until we sort
out the issues in qcow2. 

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>



git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5879 c046a42c-6fe2-441c-8c8c-71466251a162
2008-12-04 21:39:21 +00:00
aliguori
f3d54fc494 Abstract out geometry detection code from IDE for reuse
Virtio will want to use the geometry detection code.  It doesn't belong 
in ide.c anyway.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>



git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5797 c046a42c-6fe2-441c-8c8c-71466251a162
2008-11-25 21:50:24 +00:00
aliguori
4fc9af53d8 Use an option rom instead of boot sector for -kernel
Generate an option rom instead of using a hijacked boot sector for kernel
booting.  This just requires adding a small option ROM header and a few more
instructions to the boot sector to take over the int19 vector and run our
boot code.

A disk is no longer needed when using -kernel on x86.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>



git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5650 c046a42c-6fe2-441c-8c8c-71466251a162
2008-11-08 16:27:07 +00:00
aliguori
9f7965c7e9 Expand cache= option and use write-through caching by default
This patch changes the cache= option to accept none, writeback, or writethough
to control the host page cache behavior.  By default, writethrough caching is
now used which internally is implemented by using O_DSYNC to open the disk
images.  When using -snapshot, writeback is used by default since data integrity
it not at all an issue.

cache=none has the same behavior as cache=off previously.  The later syntax is
still supported by now deprecated.  I also cleaned up the O_DIRECT
implementation to avoid many of the #ifdefs.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>



git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5485 c046a42c-6fe2-441c-8c8c-71466251a162
2008-10-14 14:42:54 +00:00
aliguori
c6ca28d636 Add bdrv_flush_all()
This patch adds a bdrv_flush_all() function.  It's necessary to ensure that all
IO operations have been flushed to disk before completely a live migration.

N.B. we don't actually use this now.  We really should flush the block drivers
using an live savevm callback to avoid unnecessary guest down time.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>



git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5432 c046a42c-6fe2-441c-8c8c-71466251a162
2008-10-06 13:55:43 +00:00
aliguori
a76bab4952 Refactor AIO to allow multiple AIO implementations
This patch refactors the AIO layer to allow multiple AIO implementations.  It's
only possible because of the recent signalfd() patch.  

Right now, the AIO infrastructure is pretty specific to the block raw backend.
For other block devices to implement AIO, the qemu_aio_wait function must
support registration.  This patch introduces a new function,
qemu_aio_set_fd_handler, which can be used to register a file descriptor to be
called back.  qemu_aio_wait() now polls a set of file descriptors registered
with this function until one becomes readable or writable.

This patch should allow the implementation of alternative AIO backends (via a
thread pool or linux-aio) and AIO backends in non-traditional block devices
(like NBD).

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>



git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5297 c046a42c-6fe2-441c-8c8c-71466251a162
2008-09-22 19:17:18 +00:00
aliguori
03ff3ca30f Use common objects for qemu-img and qemu-nbd
Right now, we sprinkle #if defined(QEMU_IMG) && defined(QEMU_NBD) all over the
code.  It's ugly and causes us to have to build multiple object files for
linking against qemu and the tools.

This patch introduces a new file, qemu-tool.c which contains enough for
qemu-img, qemu-nbd, and QEMU to all share the same objects.

This also required getting qemu-nbd to be a bit more Windows friendly.  I also
changed the Windows block-raw to use normal IO instead of overlapping IO since
we don't actually do AIO yet on Windows.  I changed the various #if 0's to
 #if WIN32_AIO to make it easier for someone to eventually fix AIO on Windows.

After this patch, there are no longer any #ifdef's related to qemu-img and
qemu-nbd.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>



git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5226 c046a42c-6fe2-441c-8c8c-71466251a162
2008-09-15 15:51:35 +00:00
aliguori
baf35cb902 Use signalfd() to work around signal/select race
This patch introduces signalfd() to work around the signal/select race in
checking for AIO completions.  For platforms that don't support signalfd(), we
emulate it with threads.

There was a long discussion about this approach.  I don't believe there are any
fundamental problems with this approach and I believe eliminating the use of
signals is a good thing.

I've tested Windows and Linux using Windows and Linux guests.  I've also checked
for disk IO performance regressions.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>



git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5187 c046a42c-6fe2-441c-8c8c-71466251a162
2008-09-10 15:45:19 +00:00
ths
75818250ba Allow QEMU to connect directly to an NBD server, by Laurent Vivier.
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@4838 c046a42c-6fe2-441c-8c8c-71466251a162
2008-07-03 13:41:03 +00:00