qemu-e2k

Author	SHA1	Message	Date
Markus Armbruster	1f69c2b022	fdc: Drop broken code for user-defined floppy geometry bdrv_get_floppy_geometry_hint() fails to store through its parameter drive when bs has a geometry hint. Makes fd_revalidate() assign random crap to drv->drive. Has been broken that way for ages. Harmless, because: * The only way to set a geometry hint is -drive if=none,cyls=... Since commit `c219331e`, probably unintentional. * The only use of drv->drive is as argument to another bdrv_get_floppy_geometry_hint(). Which doesn't use it, since the geometry hint is still there. Drop the broken code, ignore -drive parameter cyls, heads and secs for floppies even with if=none, just like before commit `c219331e`. Matches -help, which explains cyls, heads, secs as "hard disk physical geometry". Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-07-09 15:53:03 +02:00
Paolo Bonzini	4ddc07cac2	block: introduce bdrv_swap, implement bdrv_append on top of it The new function can be made a bit nicer than bdrv_append. It swaps the whole contents, and then swaps back (using the usual t=a;a=b;b=t idiom) the fields that need to stay on top. Thus, it does not need explicit bdrv_detach_dev, bdrv_iostatus_disable, etc. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-07-09 15:53:02 +02:00
Paolo Bonzini	a9fc4408e3	block: copy over job and dirty bitmap fields in bdrv_append While these should not be in use at the time a transaction is started, a command in the prepare phase of a transaction might have added them, so they need to be brought over. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-07-09 15:53:02 +02:00
Markus Armbruster	f8d6bba1c1	block: Replace bdrv_get_format() by bdrv_get_format_name() So callers don't need to know anything about maximum name length. Returning a pointer is safe, because the name string lives as long as the block driver it names, and block drivers don't die. Requested by Peter Maydell. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-06-15 14:03:43 +02:00
Paolo Bonzini	e1e9b0aca0	block: always open drivers in writeback mode Formats are entirely in charge of flushes for metadata writes. For guest-initiated writes, a writethrough cache is faked in the block layer. So we can always open in writeback mode. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-06-15 14:03:43 +02:00
Paolo Bonzini	425b01487a	block: add bdrv_set_enable_write_cache Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-06-15 14:03:43 +02:00
Paolo Bonzini	c4a248a138	block: copy enable_write_cache in bdrv_append Because the guest will be able to flip enable_write_cache, the actual state may not match what is used to open the new snapshot. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-06-15 14:03:43 +02:00
Paolo Bonzini	f05fa4ad03	block: flush in writethrough mode after writes We want to make the formats handle their own flushes autonomously, while keeping for guests the ability to use a writethrough cache. Since formats will write metadata via bs->file, bdrv_co_do_writev is the only place where we need to add a flush. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-06-15 14:03:43 +02:00
Markus Armbruster	c843328783	block: New bdrv_get_flags() Signed-off-by: Markus Armbruster <armbru@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-06-15 14:03:43 +02:00
Kevin Wolf	4534ff5426	qemu-img check -r for repairing images The QED block driver already provides the functionality to not only detect inconsistencies in images, but also fix them. However, this functionality cannot be manually invoked with qemu-img, but the check happens only automatically during bdrv_open(). This adds a -r switch to qemu-img check that allows manual invocation of an image repair. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-06-15 14:03:42 +02:00
Paolo Bonzini	188a7bbf94	stream: move is_allocated_above to block.c Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-06-15 14:03:42 +02:00
Michael Tokarev	d5e6b1619c	change qemu_iovec_to_buf() to match other to,from_buf functions It now allows specifying offset within qiov to start from and amount of bytes to copy. Actual implementation is just a call to iov_to_buf(). Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2012-06-11 23:12:11 +04:00
Michael Tokarev	1b093c480a	consolidate qemu_iovec_copy() and qemu_iovec_concat() and make them consistent qemu_iovec_concat() is currently a wrapper for qemu_iovec_copy(), use the former (with extra "0" arg) in a few places where it is used. Change skip argument of qemu_iovec_copy() from uint64_t to size_t, since size of qiov itself is size_t, so there's no way to skip larger sizes. Rename it to soffset, to make it clear that the offset is applied to src. Also change the only usage of uint64_t in hw/9pfs/virtio-9p.c, in v9fs_init_qiov_from_pdu() - all callers of it actually uses size_t too, not uint64_t. One added restriction: as for all other iovec-related functions, soffset must point inside src. Order of argumens is already good: qemu_iovec_memset(QEMUIOVector qiov, size_t offset, int c, size_t bytes) vs: qemu_iovec_concat(QEMUIOVector dst, QEMUIOVector *src, size_t soffset, size_t sbytes) (note soffset is after _src_ not dst, since it applies to src; for memset it applies to qiov). Note that in many places where this function is used, the previous call is qemu_iovec_reset(), which means many callers actually want copy (replacing dst content), not concat. So we may want to add a wrapper like qemu_iovec_copy() with the same arguments but which calls qemu_iovec_reset() before _concat(). Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2012-06-11 23:12:11 +04:00
Michael Tokarev	03396148bc	allow qemu_iovec_from_buffer() to specify offset from which to start copying Similar to qemu_iovec_memset(QEMUIOVector qiov, size_t offset, int c, size_t bytes); the new prototype is: qemu_iovec_from_buf(QEMUIOVector qiov, size_t offset, const void *buf, size_t bytes); The processing starts at offset bytes within qiov. This way, we may copy a bounce buffer directly to a middle of qiov. This is exactly the same function as iov_from_buf() from iov.c, so use the existing implementation and rename it to qemu_iovec_from_buf() to be shorter and to match the utility function. As with utility implementation, we now assert that the offset is inside actual iovec. Nothing changed for current callers, because `offset' parameter is new. While at it, stop using "bounce-qiov" in block/qcow2.c and copy decrypted data directly from cluster_data instead of recreating a temp qiov for doing that. Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2012-06-11 23:12:11 +04:00
Jim Meyering	eba25057b9	block: prevent snapshot mode $TMPDIR symlink attack In snapshot mode, bdrv_open creates an empty temporary file without checking for mkstemp or close failure, and ignoring the possibility of a buffer overrun given a surprisingly long $TMPDIR. Change the get_tmp_filename function to return int (not void), so that it can inform its two callers of those failures. Also avoid the risk of buffer overrun and do not ignore mkstemp or close failure. Update both callers (in block.c and vvfat.c) to propagate temp-file-creation failure to their callers. get_tmp_filename creates and closes an empty file, while its callers later open that presumed-existing file with O_CREAT. The problem was that a malicious user could provoke mkstemp failure and race to create a symlink with the selected temporary file name, thus causing the qemu process (usually root owned) to open through the symlink, overwriting an attacker-chosen file. This addresses CVE-2012-2652. http://bugzilla.redhat.com/CVE-2012-2652 Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Jim Meyering <meyering@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2012-05-30 14:48:40 +08:00
Paolo Bonzini	dc5a137125	qemu-img: make "info" backing file output correct and easier to use qemu-img info should use the same logic as qemu when printing the backing file path, or debugging becomes quite tricky. We can also simplify the output in case the backing file has an absolute path or a protocol. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-05-10 10:32:12 +02:00
Paolo Bonzini	6405875cdd	block: move field reset from bdrv_open_common to bdrv_close bdrv_close should leave fields in the same state as bdrv_new. It is not up to bdrv_open_common to fix the mess. Also, backing_format was not being re-initialized. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-05-10 10:32:12 +02:00
Paolo Bonzini	947995c09e	block: protect path_has_protocol from filenames with colons path_has_protocol will erroneously return "true" if the colon is part of a filename. These names are common with stable device names produced by udev. We cannot fully protect against this in case the filename does not have a path component (e.g. if the current directory is /dev/disk/by-path), but in the common case there will be a slash before and path_has_protocol can easily detect that and return false. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-05-10 10:32:12 +02:00
Paolo Bonzini	f53f4da9c6	block: simplify path_is_absolute On Windows, all the logic is already in is_windows_drive and is_windows_drive_prefix. On POSIX, there is no need to look out for colons. The win32 code changes the behaviour in some cases, we could have something like "d:foo.img". The old code would treat it as relative path, the new one as absolute. Now the path is absolute, because to go from c:/program files/blah to d:foo.img you cannot say c:/program files/blah/d:foo.img. You have to say d:foo.img. But you could also say it's relative because (I think, at least it was like that in DOS 15 years ago) d:foo.img is relative to the current path of drive D. Considering how path_is_absolute is used by path_combine, I think it's better to treat it as absolute. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-05-10 10:32:12 +02:00
Paolo Bonzini	fa4478d5c8	block: wait for job callback in block_job_cancel_sync The limitation on not having I/O after cancellation cannot really be kept. Even streaming has a very small race window where you could cancel a job and have it report completion. If this window is hit, bdrv_change_backing_file() will yield and possibly cause accesses to dangling pointers etc. So, let's just assume that we cannot know exactly what will happen after the coroutine has set busy to false. We can set a very lax condition: - if we cancel the job, the coroutine won't set it to false again (and hence will not call co_sleep_ns again). - block_job_cancel_sync will wait for the coroutine to exit, which pretty much ensures no race. Instead, we track the coroutine that executes the job and put very strict conditions on what to do while it is quiescent (busy = false). First of all, the coroutine must never set busy = false while the job has been cancelled. Second, the coroutine can be reentered arbitrarily while it is quiescent, so you cannot really do anything but co_sleep_ns at that time. This condition is obeyed by the block_job_sleep_ns function. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-05-10 10:32:12 +02:00
Paolo Bonzini	4513eafe92	block: add block_job_sleep_ns This function abstracts the pretty complex semantics of the "busy" member of BlockJob. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-05-10 10:32:12 +02:00
Paolo Bonzini	0ac9377d04	block: fully delete bs->file when closing We are reusing bs->file across close/open, which may not cause any known bugs but is a recipe for trouble. Prefer bdrv_delete, and enjoy the new invariant in the implementation of bdrv_delete. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-05-10 10:32:12 +02:00
Paolo Bonzini	a275fa42fa	block: do not reuse the backing file across bdrv_close/bdrv_open This is another bug caused by not doing a full cleanup of the BDS across close/open. This was found with mirroring by Shaolong Hu, but it can probably be reproduced also with eject or change. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-05-10 10:32:12 +02:00
Paolo Bonzini	3a389e7926	block: another bdrv_append fix bdrv_append must also copy open_flags to the top, because the snapshot has BDRV_O_NO_BACKING set. This causes interesting results if you later use drive-reopen (not upstream) to reopen the image, and lose the backing file in the process. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-05-10 10:32:12 +02:00
Paolo Bonzini	e023b2e244	block: fix snapshot on QED QED's opaque data includes a pointer back to the BlockDriverState. This breaks when bdrv_append shuffles data between bs_new and bs_top. To avoid this, add a "rebind" function that tells the driver about the new relationship between the BlockDriverState and its opaque. The patch also adds rebind to VVFAT for completeness, even though it is not used with live snapshots. Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-05-10 10:32:12 +02:00
Paolo Bonzini	71df14fcbe	block: fix allocation size for dirty bitmap Also reuse elsewhere the new constant for sizeof(unsigned long) * 8. The dirty bitmap is allocated in bits but declared as unsigned long. Thus, its memory block is accessed beyond its end unless the image is a multiple of 64 chunks (i.e. a multiple of 64 MB). Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-05-10 10:32:11 +02:00
Paolo Bonzini	63090dac3a	block: open backing file as read-only when probing for size bdrv_img_create will temporarily open the backing file to probe its size. However, this could be done with a read-write open if the wrong flags are passed to bdrv_img_create. Since there is really no documentation on what flags can be passed, assume that bdrv_img_create receives the flags with which the new image will be opened; sanitize them when opening the backing file. Reported-by: Eric Blake <eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-05-10 10:32:11 +02:00
Paolo Bonzini	469ef350e1	block: update in-memory backing file and format These are needed to print "info block" output correctly. QCOW2 does this because it needs it to write the header, but QED does not, and common code is the right place to do it. Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-05-10 10:32:11 +02:00
Paolo Bonzini	5f3777945d	block: push bdrv_change_backing_file error checking up from drivers This check applies to all drivers, but QED lacks it. Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-05-10 10:32:11 +02:00
Zhi Yong Wu	4c355d53c6	block: add the support to drain throttled requests Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> [ Iterate until all block devices have processed all requests, add comments. - Paolo ] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-05-10 10:32:11 +02:00
Zhi Yong Wu	5b7e1542cf	block: make bdrv_create adopt coroutine The current qemu.git introduces failure with preallocation and some sizes: qemu-img create -f qcow2 new.img 976563K -o preallocation=metadata qemu-img: qemu-coroutine-lock.c:111: qemu_co_mutex_unlock: Assertion `mutex->locked == 1' failed. And lock needs to work in coroutine context. So to fix this issue, we need to make bdrv_create adopt coroutine at first. Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-05-07 19:33:18 +02:00
Stefan Hajnoczi	c83c66c3b5	block: add 'speed' optional parameter to block-stream Allow streaming operations to be started with an initial speed limit. This eliminates the window of time between starting streaming and issuing block-job-set-speed. Users should use the new optional 'speed' parameter instead so that speed limits are in effect immediately when the job starts. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Acked-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>	2012-04-27 11:44:50 -03:00
Stefan Hajnoczi	882ec7ce53	block: change block-job-set-speed argument from 'value' to 'speed' Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Acked-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>	2012-04-27 11:44:50 -03:00
Stefan Hajnoczi	9e6636c72d	block: use Error mechanism instead of -errno for block_job_set_speed() There are at least two different errors that can occur in block_job_set_speed(): the job might not support setting speeds or the value might be invalid. Use the Error mechanism to report the error where it occurs. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Acked-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>	2012-04-27 11:44:50 -03:00
Stefan Hajnoczi	fd7f8c6537	block: use Error mechanism instead of -errno for block_job_create() The block job API uses -errno return values internally and we convert these to Error in the QMP functions. This is ugly because the Error should be created at the point where we still have all the relevant information. More importantly, it is hard to add new error cases to this case since we quickly run out of -errno values without losing information. Go ahead and use Error directly and don't convert later. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Acked-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>	2012-04-27 11:44:50 -03:00
Kevin Wolf	621f058940	qcow2: Zero write support Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-04-20 15:57:30 +02:00
Liu Yuan	80ccf93b88	qemu-img: let 'qemu-img convert' flush data The 'qemu-img convert -h' advertise that the default cache mode is 'writeback', while in fact it is 'unsafe'. This patch 1) fix the help manual and 2) let bdrv_close() call bdrv_flush() 2) is needed because some backend storage doesn't have a self-flush mechanism(for e.g., sheepdog), so we need to call bdrv_flush() to make sure the image is really writen to the storage instead of hanging around writeback cache forever. Signed-off-by: Liu Yuan <tailai.ly@taobao.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-04-20 11:42:41 +02:00
Kevin Wolf	7094f12f86	block: Drain requests in bdrv_close If an AIO request is in flight that refers to a BlockDriverState that has been closed and possibly even freed, more or less anything could happen. I have seen segfaults, -EBADF return values and qcow2 sometimes actually catches the situation in bdrv_close() and abort()s. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>	2012-04-19 15:48:52 +02:00
Benoît Canet	077892696b	block: add a function to clear incoming live migration flags This function will clear all BDRV_O_INCOMING flags. Signed-off-by: Benoit Canet <benoit.canet@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-04-05 16:27:56 +02:00
Jeff Cody	f6801b83d0	block: bdrv_append() fixes A few fixups for bdrv_append(): The new bs (bs_new) passed into bdrv_append() should be anonymous. Rather than call bdrv_make_anon() to enforce this, use an assert to catch when a caller is passing in a bs_new that is not anonymous. Also, the new top layer should have its backing_format reflect the original top's format. And last, after the swap of bs contents, the device_name will have been copied down. This needs to be cleared to reflect the anonymity of the bs that was pushed down. Signed-off-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-04-05 14:54:41 +02:00
Paolo Bonzini	9f25eccc1c	block: set job->speed in block_set_speed There is no need to do this in every implementation of set_speed (even though there is only one right now). Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-04-05 14:54:40 +02:00
Paolo Bonzini	3e914655f2	block: fix streaming/closing race Streaming can issue I/O while qcow2_close is running. This causes the L2 caches to become very confused or, alternatively, could cause a segfault when the streaming coroutine is reentered after closing its block device. The fix is to cancel streaming jobs when closing their underlying device. The cancellation must be synchronous, on the other hand qemu_aio_wait will not restart a coroutine that is sleeping in co_sleep. So add a flag saying whether streaming has in-flight I/O. If the busy flag is false, the coroutine is quiescent and, when cancelled, will not issue any new I/O. This protects streaming against closing, but not against deleting. We have a reference count protecting us against concurrent deletion, but I still added an assertion to ensure nothing bad happens. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-04-05 14:54:40 +02:00
Zhi Yong Wu	498e386c58	block: disable I/O throttling on sync api Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-04-05 14:54:40 +02:00
Paolo Bonzini	29cdb2513c	block: push recursive flushing up from drivers Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-04-05 14:54:39 +02:00
Stefan Hajnoczi	e88774971c	block: handle -EBUSY in bdrv_commit_all() Monitor operations that manipulate image files must not execute while a background job (like image streaming) is in progress. This prevents corruptions from happening when two pieces of code are manipulating the image file without knowledge of each other. The monitor "commit" command raises QERR_DEVICE_IN_USE when bdrv_commit() returns -EBUSY but "commit all" has no error handling. This is easy to fix, although note that we do not deliver a detailed error about which device was busy in the "commit all" case. Suggested-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-03-12 15:14:06 +01:00
Jeff Cody	8802d1fdd4	qapi: Introduce blockdev-group-snapshot-sync command This is a QAPI/QMP only command to take a snapshot of a group of devices. This is similar to the blockdev-snapshot-sync command, except blockdev-group-snapshot-sync accepts a list devices, filenames, and formats. It is attempted to keep the snapshot of the group atomic; if the creation or open of any of the new snapshots fails, then all of the new snapshots are abandoned, and the name of the snapshot image that failed is returned. The failure case should not interrupt any operations. Rather than use bdrv_close() along with a subsequent bdrv_open() to perform the pivot, the original image is never closed and the new image is placed 'in front' of the original image via manipulation of the BlockDriverState fields. Thus, once the new snapshot image has been successfully created, there are no more failure points before pivoting to the new snapshot. This allows the group of disks to remain consistent with each other, even across snapshot failures. Signed-off-by: Jeff Cody <jcody@redhat.com> Acked-by: Luiz Capitulino <lcapitulino@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-02-29 15:48:33 +01:00
Paolo Bonzini	b6a127a156	block: drop aio_multiwrite in BlockDriver These were never used. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-02-29 12:48:47 +01:00
Hervé Poussineau	f8d3d12857	block: add a transfer rate for floppy types Floppies must be read at a specific transfer rate, depending of its own format. Update floppy description table to include required transfer rate. Signed-off-by: Hervé Poussineau <hpoussin@reactos.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-02-29 12:48:46 +01:00
Luiz Capitulino	6f382ed226	qmp: add DEVICE_TRAY_MOVED event It's emitted whenever the tray is moved by the guest or by HMP/QMP commands. Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Acked-by: Kevin Wolf <kwolf@redhat.com>	2012-02-22 17:23:50 -02:00
Luiz Capitulino	f36f394952	block: bdrv_eject(): Make eject_flag a real bool Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Acked-by: Kevin Wolf <kwolf@redhat.com>	2012-02-22 17:23:05 -02:00
Luiz Capitulino	329c0a48a9	block: Rename bdrv_mon_event() & BlockMonEventAction They are QMP events, not monitor events. Rename them accordingly. Also, move bdrv_emit_qmp_error_event() up in the file. A new event will be added soon and it's good to have them next each other. Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Acked-by: Kevin Wolf <kwolf@redhat.com>	2012-02-22 17:22:35 -02:00
Stefan Hajnoczi	79c053bde9	block: perform zero-detection during copy-on-read Copy-on-Read populates the image file with data read from a backing image. In order to avoid bloating the image file when all zeroes are read we should scan the buffer and perform an optimized zero write operation. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-02-09 16:17:50 +01:00
Stefan Hajnoczi	f08f2ddae0	block: add .bdrv_co_write_zeroes() interface The ability to zero regions of an image file is a useful primitive for higher-level features such as image streaming or zero write detection. Image formats may support an optimized metadata representation instead of writing zeroes into the image file. This allows zero writes to be potentially faster than regular write operations and also preserve sparseness of the image file. The .bdrv_co_write_zeroes() interface should be implemented by block drivers that wish to provide efficient zeroing. Note that this operation is different from the discard operation, which may leave the contents of the region indeterminate. That means discarded blocks are not guaranteed to contain zeroes and may contain junk data instead. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-02-09 16:17:50 +01:00
Marcelo Tosatti	e8a6bb9caa	block: add bdrv_find_backing_image Add bdrv_find_backing_image: given a BlockDriverState pointer, and an id, traverse the backing image chain to locate the id. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-01-26 14:49:18 +01:00
Stefan Hajnoczi	eeec61f291	block: add BlockJob interface for long-running operations Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-01-26 11:45:26 +01:00
Stefan Hajnoczi	470c05047a	block: make copy-on-read a per-request flag Previously copy-on-read could only be enabled for all requests to a block device. This means requests coming from the guest as well as QEMU's internal requests would perform copy-on-read when enabled. For image streaming we want to support finer-grained behavior than just populating the image file from its backing image. Image streaming supports partial streaming where a common backing image is preserved. In this case guest requests should not perform copy-on-read because they would indiscriminately copy data which should be left in a backing image from the backing chain. Introduce a per-request flag for copy-on-read so that a block device can process both regular and copy-on-read requests. Overlapping reads and writes still need to be serialized for correctness when copy-on-read is happening, so add an in-flight reference count to track this. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-01-26 11:45:26 +01:00
Stefan Hajnoczi	2d3735d3bf	block: check bdrv_in_use() before blockdev operations Long-running block operations like block migration and image streaming must have continual access to their block device. It is not safe to perform operations like hotplug, eject, change, resize, commit, or external snapshot while a long-running operation is in progress. This patch adds the missing bdrv_in_use() checks so that block migration and image streaming never have the rug pulled out from underneath them. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-01-26 11:45:26 +01:00
Paolo Bonzini	3f3aace830	block: avoid useless checks on acb->bh Coverity is confused by this "if" and reports leaks on acb->bh. The bottom half is always deleted before releasing the AIOCB, in either bdrv_aio_cancel_em or bdrv_aio_bh_cb. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-12-15 12:40:08 +01:00
Paolo Bonzini	df9309fb43	block: simplify failure handling for bdrv_aio_multiwrite Now that early failure of bdrv_aio_writev is not possible anymore, mcb->num_requests can be set before the loop starts. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-12-15 12:40:07 +01:00
Paolo Bonzini	ad54ae80c7	block: bdrv_aio_* do not return NULL Initially done with the following semantic patch: @ rule1 @ expression E; statement S; @@ E = ( bdrv_aio_readv \| bdrv_aio_writev \| bdrv_aio_flush \| bdrv_aio_discard \| bdrv_aio_ioctl ) (...); ( - if (E == NULL) { ... } \| - if (E) { <... S ...> } ) which however missed the occurrence in block/blkverify.c (as it should have done), and left behind some unused variables. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-12-15 12:40:07 +01:00
Stefan Hajnoczi	922453bca6	block: convert qemu_aio_flush() calls to bdrv_drain_all() Many places in QEMU call qemu_aio_flush() to complete all pending asynchronous I/O. Most of these places actually want to drain all block requests but there is no block layer API to do so. This patch introduces the bdrv_drain_all() API to wait for requests across all BlockDriverStates to complete. As a bonus we perform checks after qemu_aio_wait() to ensure that requests really have finished. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-12-05 14:56:06 +01:00
Stefan Hajnoczi	5f8b6491f2	block: wait_for_overlapping_requests() deadlock detection Debugging a reentrant request deadlock was fun but in the future we need a quick and obvious way of detecting such bugs. Add an assert that checks we are not about to deadlock when waiting for another request. Suggested-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-12-05 14:52:34 +01:00
Stefan Hajnoczi	bd9533e36e	block: implement bdrv_co_is_allocated() boundary cases Cases beyond the end of the disk image are only implemented for block drivers that do not provide .bdrv_co_is_allocated(). It's worth making these cases generic so that block drivers that do implement .bdrv_co_is_allocated() also get them for free. Suggested-by: Mark Wu <wudxw@linux.vnet.ibm.com> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-12-05 14:51:39 +01:00
Stefan Hajnoczi	ab1859218a	block: core copy-on-read logic Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-12-05 14:51:38 +01:00
Stefan Hajnoczi	d83947ac6d	block: request overlap detection Detect overlapping requests and remember to align to cluster boundaries if the image format uses them. This assumes that allocating I/O is performed in cluster granularity - which is true for qcow2, qed, etc. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-12-05 14:51:38 +01:00
Stefan Hajnoczi	f4658285f9	block: wait for overlapping requests When copy-on-read is enabled it is necessary to wait for overlapping requests before issuing new requests. This prevents races between the copy-on-read and a write request. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-12-05 14:51:38 +01:00
Stefan Hajnoczi	53fec9d3fd	block: add interface to toggle copy-on-read The bdrv_enable_copy_on_read()/bdrv_disable_copy_on_read() functions can be used to programmatically enable or disable copy-on-read for a block device. Later patches add the actual copy-on-read logic. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-12-05 14:51:38 +01:00
Stefan Hajnoczi	dbffbdcfff	block: add request tracking The block layer does not know about pending requests. This information is necessary for copy-on-read since overlapping requests must be serialized to prevent races that corrupt the image. The BlockDriverState gets a new tracked_request list field which contains all pending requests. Each request is a BdrvTrackedRequest record with sector_num, nb_sectors, and is_write fields. Note that request tracking is always enabled but hopefully this extra work is so small that it doesn't justify adding an enable/disable flag. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-12-05 14:51:38 +01:00
Stefan Hajnoczi	060f51c9de	block: add bdrv_co_is_allocated() interface This patch introduces the public bdrv_co_is_allocated() interface which can be used to query image allocation status while the VM is running. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-12-05 14:51:37 +01:00
Stefan Hajnoczi	6aebab140d	block: drop .bdrv_is_allocated() interface Now that all block drivers have been converted to .bdrv_co_is_allocated() we can drop .bdrv_is_allocated(). Note that the public bdrv_is_allocated() interface is still available but is in fact a synchronous wrapper around .bdrv_co_is_allocated(). Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-12-05 14:51:37 +01:00
Stefan Hajnoczi	376ae3f1cb	block: add .bdrv_co_is_allocated() This patch adds the .bdrv_co_is_allocated() interface which is identical to .bdrv_is_allocated() but runs in coroutine context. Running in coroutine context implies that other coroutines might be performing I/O at the same time. Therefore it must be safe to run while the following BlockDriver functions are in-flight: .bdrv_co_readv() .bdrv_co_writev() .bdrv_co_flush() .bdrv_co_is_allocated() The new .bdrv_co_is_allocated() interface is useful because it can be used when a VM is running, whereas .bdrv_is_allocated() is a synchronous interface that does not cope with parallel requests. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-12-05 14:51:36 +01:00
Stefan Hajnoczi	05c4af54c6	block: use public bdrv_is_allocated() interface There is no need for bdrv_commit() to use the BlockDriver .bdrv_is_allocated() interface directly. Converting to the public interface gives us the freedom to drop .bdrv_is_allocated() entirely in favor of a new .bdrv_co_is_allocated() in the future. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-12-05 14:51:36 +01:00
Zhi Yong Wu	727f005e6a	hmp/qmp: add block_set_io_throttle Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-12-05 14:51:35 +01:00
Zhi Yong Wu	98f90dba5e	block: add I/O throttling algorithm Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-12-05 14:51:35 +01:00
Zhi Yong Wu	0563e19151	block: add the blockio limits command line support Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-12-05 14:51:35 +01:00
Anthony Liguori	0f15423c32	block: allow migration to work with image files (v3) Image files have two types of data: immutable data that describes things like image size, backing files, etc. and mutable data that includes offset and reference count tables. Today, image formats aggressively cache mutable data to improve performance. In some cases, this happens before a guest even starts. When dealing with live migration, since a file is open on two machines, the caching of meta data can lead to data corruption. This patch addresses this by introducing a mechanism to invalidate any cached mutable data a block driver may have which is then used by the live migration code. NB, this still requires coherent shared storage. Addressing migration without coherent shared storage (i.e. NFS) requires additional work. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2011-11-21 14:58:48 -06:00
Kevin Wolf	ca716364f0	block: Make cache=unsafe flush to the OS cache=unsafe completely ignored bdrv_flush, because flushing the host disk costs a lot of performance. However, this means that qcow2 images (and potentially any other format) can lose data even after the guest has issued a flush if the qemu process crashes/is killed. In case of a host crash, data loss is certainly expected with cache=unsafe, but if just the qemu process dies this is a bit too unsafe. Now that we have two separate flush functions, we can choose to flush everythign to the OS, but don't enforce that it's physically written to the disk. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-11-11 14:02:59 +01:00
Kevin Wolf	eb489bb1ec	block: Introduce bdrv_co_flush_to_os qcow2 has a writeback metadata cache, so flushing a qcow2 image actually consists of writing back that cache to the protocol and only then flushes the protocol in order to get everything stable on disk. This introduces a separate bdrv_co_flush_to_os to reflect the split. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-11-11 14:02:59 +01:00
Kevin Wolf	c68b89acd6	block: Rename bdrv_co_flush to bdrv_co_flush_to_disk There are two different types of flush that you can do: Flushing one level up to the OS (i.e. writing data to the host page cache) or flushing it all the way down to the disk. The existing functions flush to the disk, reflect this in the function name. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-11-11 14:02:59 +01:00
Paolo Bonzini	025ccaa7f9	block: add eject request callback Recent versions of udev always keep the tray locked so that the kernel can observe "eject request" events (aka tray button presses) even on discs that aren't mounted. Add support for these events in the ATAPI and SCSI cd drive device models. To let management cope with the behavior of udev, an event should also be added for "tray opened/closed". This way, after issuing an "eject" command, management can poll until the guests actually reacts to the command. They can then issue the "change" command after the tray has been opened, or try with "eject -f" after a (configurable?) timeout. However, with this patch and the corresponding support in the device models, at least it is possible to do a manual two-step eject+change sequence. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-11-11 14:02:57 +01:00
Anthony Liguori	8494a397b6	Merge remote-tracking branch 'kwolf/for-anthony' into staging Conflicts: block/vmdk.c	2011-10-31 11:09:00 -05:00
Stefan Hajnoczi	03f541bd6e	block: reinitialize across bdrv_close()/bdrv_open() Several BlockDriverState fields are not being reinitialized across bdrv_close()/bdrv_open(). Make sure they are reset to their default values. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-10-28 19:25:50 +02:00
Stefan Hajnoczi	e7c637967e	block: set bs->read_only before .bdrv_open() Several block drivers set bs->read_only in .bdrv_open() but block.c:bdrv_open_common() clobbers its value. Additionally, QED uses bdrv_is_read_only() in .bdrv_open() to decide whether to perform consistency checks. The correct ordering is to initialize bs->read_only from the open flags before calling .bdrv_open(). This way block drivers can override it if necessary and can use bdrv_is_read_only() in .bdrv_open(). Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-10-28 19:25:49 +02:00
Kevin Wolf	2b5728164f	block: Fix bdrv_open use after free tmp_filename was used outside the block it was defined in, i.e. after it went out of scope. Move its declaration to the top level. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-10-28 19:25:49 +02:00
Kevin Wolf	3574c60819	block: Remove dead code Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-10-28 19:25:49 +02:00
Luiz Capitulino	f795e743bd	Drop qemu-objects.h from modules that don't require it Previous commits dropped most qobjects usage from qemu modules (now they are a low level interface used by the QAPI). However, some modules still include the qemu-objects.h header file. This commit drops qemu-objects.h from some of those modules and includes qjson.h instead, which is what they actually need. Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>	2011-10-27 11:48:47 -02:00
Luiz Capitulino	f11f57e405	qapi: Convert query-blockstats Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>	2011-10-27 11:48:47 -02:00
Luiz Capitulino	b202381800	qapi: Convert query-block Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>	2011-10-27 11:48:47 -02:00
Luiz Capitulino	58e21ef5ab	block: Rename the BlockIOStatus enum values The biggest change is to rename its prefix from BDRV_IOS to BLOCK_DEVICE_IO_STATUS. Next commit will convert the query-block command to the QAPI and that's how the enumeration is going to be generated. Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>	2011-10-27 11:48:47 -02:00
Luiz Capitulino	d6bf279e7a	block: iostatus: Drop BDRV_IOS_INVAL A future commit will convert bdrv_info() to the QAPI and it won't provide IOS_INVAL. Luckily all we have to do is to add a new 'iostatus_enabled' member to BlockDriverState and use it instead. Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>	2011-10-27 11:48:47 -02:00
Paolo Bonzini	6db39ae2e2	block: change discard to co_discard Since coroutine operation is now mandatory, convert both bdrv_discard implementations to coroutines. For qcow2, this means taking the lock around the operation. raw-posix remains synchronous. The bdrv_discard callback is then unused and can be eliminated. Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-10-21 17:34:14 +02:00
Paolo Bonzini	8b94ff8573	block: change flush to co_flush Since coroutine operation is now mandatory, convert all bdrv_flush implementations to coroutines. For qcow2, this means taking the lock. Other implementations are simpler and just forward bdrv_flush to the underlying protocol, so they can avoid the lock. The bdrv_flush callback is then unused and can be eliminated. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-10-21 17:34:14 +02:00
Paolo Bonzini	4265d620c5	block: add bdrv_co_discard and bdrv_aio_discard support This similarly adds support for coroutine and asynchronous discard. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-10-21 17:34:13 +02:00
Paolo Bonzini	07f0761574	block: unify flush implementations Add coroutine support for flush and apply the same emulation that we already do for read/write. bdrv_aio_flush is simplified to always go through a coroutine. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-10-21 17:34:13 +02:00
Paolo Bonzini	35246a6825	block: rename bdrv_co_rw_bh Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-10-21 17:34:12 +02:00
Stefan Hajnoczi	09f085d59d	block: drop bdrv_has_async_rw() Commit cd74d83345e0e3b708330ab8c4cd9111bb82cda6 ("block: switch bdrv_read()/bdrv_write() to coroutines") removed the bdrv_has_async_rw() callers. This patch removes bdrv_has_async_rw() since it is no longer used. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-10-14 17:31:22 +02:00
Stefan Hajnoczi	f8c35c1d59	block: drop .bdrv_read()/.bdrv_write() emulation There is no need to emulate .bdrv_read()/.bdrv_write() since these interfaces are only called if aio and coroutine interfaces are not present. All valid BlockDrivers must implement either sync, aio, or coroutine interfaces. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-10-14 17:31:22 +02:00
Stefan Hajnoczi	8c5873d697	block: drop emulation functions that use coroutines Block drivers that implement coroutine functions used to get sync and aio wrappers. This is no longer necessary since all request processing now happens in a coroutine. If a block driver implements the coroutine interface then none of the other interfaces will be invoked. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-10-14 17:31:22 +02:00
Stefan Hajnoczi	1a6e115b19	block: switch bdrv_aio_writev() to coroutines More sync, aio, and coroutine unification. Make bdrv_aio_writev() go through coroutine request processing. Remove the dirty block callback mechanism which was needed only for aio processing and can be done more naturally in coroutine context. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-10-13 15:02:54 +02:00
Stefan Hajnoczi	6b7cb2479b	block: mark blocks dirty on coroutine write completion The aio write operation marks blocks dirty when the write operation completes. The coroutine write operation marks blocks dirty before issuing the write operation. It seems safest to mark the block dirty when the operation completes so that anything tracking dirty blocks will not act before the change has been made to the image file. Make the coroutine write operation dirty blocks on write completion. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-10-13 15:02:54 +02:00
Stefan Hajnoczi	b2a6137166	block: switch bdrv_aio_readv() to coroutines More sync, aio, and coroutine unification. Make bdrv_aio_readv() go through coroutine request processing. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-10-13 15:02:54 +02:00
Stefan Hajnoczi	1c9805a398	block: switch bdrv_read()/bdrv_write() to coroutines The bdrv_read()/bdrv_write() functions call .bdrv_read()/.bdrv_write(). They should go through bdrv_co_do_readv() and bdrv_co_do_writev() instead in order to unify request processing code across sync, aio, and coroutine interfaces. This is also an important step towards removing BlockDriverState .bdrv_read()/.bdrv_write() in the future. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-10-13 15:02:53 +02:00
Stefan Hajnoczi	c5fbe57111	block: split out bdrv_co_do_readv() and bdrv_co_do_writev() The public interface for I/O in coroutine context is bdrv_co_readv() and bdrv_co_writev(). Split out the request processing code into bdrv_co_do_readv() and bdrv_co_writev() so that it can be called internally when we refactor all request processing to use coroutines. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-10-13 15:02:53 +02:00
Stefan Hajnoczi	1ed20acf2f	block: directly invoke .bdrv_* from emulation functions The emulation functions which supply default BlockDriver .bdrv_() functions given another implemented .bdrv_() function should not use public bdrv_() interfaces. This patch ensures they invoke .bdrv_() directly to avoid adding an extra layer of coroutine request processing and possibly entering an infinite loop. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-10-13 15:02:53 +02:00
Stefan Hajnoczi	a652d16025	block: directly invoke .bdrv_aio_() in bdrv_co_io_em() We will unify block layer request processing across sync, aio, and coroutines and this means a .bdrv_co_() emulation function should not call back into the public interface. There's no need here, just call .bdrv_aio_() directly. The gory details: bdrv_co_io_em() cannot call back into the public bdrv_aio_() interface since that will be handled using coroutines, which causes us to call into bdrv_co_io_em() again in an infinite loop :). Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-10-13 15:02:27 +02:00
Luiz Capitulino	d2078cc238	HMP: Print 'io-status' information Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-10-11 09:42:45 +02:00
Luiz Capitulino	f04ef60100	QMP: query-status: Add 'io-status' key Contains the I/O status for the given device. The key is only present if the device supports it and the VM is configured to stop on errors. Please, check the documentation being added in this commit for more information. Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-10-11 09:42:45 +02:00
Luiz Capitulino	28a7282a5d	block: Keep track of devices' I/O status This commit adds support to the BlockDriverState type to keep track of devices' I/O status. There are three possible status: BDRV_IOS_OK (no error), BDRV_IOS_ENOSPC (no space error) and BDRV_IOS_FAILED (any other error). The distinction between no space and other errors is important because a management application may want to watch for no space in order to extend the space assigned to the VM and put it to run again. Qemu devices supporting the I/O status feature have to enable it explicitly by calling bdrv_iostatus_enable() _and_ have to be configured to stop the VM on errors (ie. werror=stop\|enospc or rerror=stop). In case of multiple errors being triggered in sequence only the first one is stored. The I/O status is always reset to BDRV_IOS_OK when the 'cont' command is issued. Next commits will add support to some devices and extend the query-block/info block commands to return the I/O status information. Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-10-11 09:41:47 +02:00
Stefan Hajnoczi	59370aaa56	trace: add arguments to bdrv_co_io_em() trace event It is useful to know the BlockDriverState as well as the sector_num/nb_sectors of an emulated .bdrv_co_*() request. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>	2011-10-03 10:56:27 +01:00
Stefan Hajnoczi	28dcee10c5	trace: trace bdrv_open_common() bdrv_open_common() is a useful point to trace since it reveals the filename and block driver for a given BlockDriverState. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>	2011-10-03 10:55:50 +01:00
Markus Armbruster	7d4b4ba5c2	block: New change_media_cb() parameter load To let device models distinguish between eject and load. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-09-12 15:17:22 +02:00
Markus Armbruster	29e05f2022	block: Reset buffer alignment on detach BlockDriverState member buffer_alignment is initially 512. The device model may set them, with bdrv_set_buffer_alignment(). If the device model gets detached (hot unplug), the device's alignment is left behind. Only okay because device hot unplug automatically destroys the BlockDriverState. But that's a questionable feature, best not to rely on it. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-09-12 15:17:22 +02:00
Markus Armbruster	7b6f9300d5	block: New bdrv_set_buffer_alignment() Device models should be able to set it without an unclean include of block_int.h. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-09-12 15:17:22 +02:00
Markus Armbruster	e4def80b36	block: Show whether the virtual tray is open in info block Need to ask the device, so this requires new BlockDevOps member is_tray_open(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-09-12 15:17:21 +02:00
Markus Armbruster	9e6a4c9177	block: Drop BlockDriverState member removable It's a confused mess (see previous commit). No users remain. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-09-12 15:17:21 +02:00
Markus Armbruster	2c6942fa7b	block: Clean up remaining users of "removable" BlockDriverState member removable is a confused mess. It is true when an ide-cd, scsi-cd or floppy qdev is attached, or when the BlockDriverState was created with -drive if={floppy,sd} or -drive if={ide,scsi,xen,none},media=cdrom ("created removable"), except when an ide-hd, scsi-hd, scsi-generic or virtio-blk qdev is attached. Three users remain: 1. eject_device(), via bdrv_is_removable() uses it to determine whether a block device can eject media. 2. bdrv_info() is monitor command "info block". QMP documentation says "true if the device is removable, false otherwise". From the monitor user's point of view, the only sensible interpretation of "is removable" is "can eject media with monitor commands eject and change". A block device can eject media unless a device is attached that doesn't support it. Switch the two users over to new bdrv_dev_has_removable_media() that returns exactly that. 3. bdrv_getlength() uses to suppress its length cache when media can change (see commit `46a4e4e6`). Media change is either monitor command change (updates the length cache), monitor command eject (doesn't update the length cache, easily fixable), or physical media change (invalidates length cache, not so easily fixable). I'm refraining from improving anything here, because this series is long enough already. Instead, I simply switch it over to bdrv_dev_has_removable_media() as well. This changes the behavior of the length cache and of monitor commands eject and change in two cases: a. drive not created removable, no device attached The commit makes the drive removable, and defeats the length cache. Example: -drive if=none b. drive created removable, but the attached drive is non-removable, and doesn't call bdrv_set_removable(..., 0) (most devices don't) The commit makes the drive non-removable, and enables the length cache. Example: -drive if=xen,media=cdrom -M xenpv The other non-removable devices that don't call bdrv_set_removable() can't currently use a drive created removable, either because they aren't qdevified, or because they lack a drive property. Won't stay that way. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-09-12 15:17:21 +02:00
Markus Armbruster	025e849a50	block: Rename bdrv_set_locked() to bdrv_lock_medium() While there, make the locked parameter bool. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-09-12 15:17:20 +02:00
Markus Armbruster	f107639a6f	block: Drop medium lock tracking, ask device models instead Requires new BlockDevOps member is_medium_locked(). Implement for IDE and SCSI CD-ROMs. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-09-12 15:17:20 +02:00
Markus Armbruster	fdec4404dd	block: Leave enforcing tray lock to device models The device model knows best when to accept the guest's eject command. No need to detour through the block layer. bdrv_eject() can't fail anymore. Make it void. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-09-12 15:17:20 +02:00
Markus Armbruster	22cf56c4d8	block: Drop tray status tracking, no longer used Commit `4be9762a` is now completely redone. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-09-12 15:17:20 +02:00
Markus Armbruster	a1aff5bf67	block: Revert entanglement of bdrv_is_inserted() with tray status Commit `4be9762a` changed bdrv_is_inserted() to fail when the tray is open. Unfortunately, there are two different kinds of users, with conflicting needs. 1. Device models using bdrv_eject(), currently ide-cd and scsi-cd. They expect bdrv_is_inserted() to reflect the tray status. Commit `4be9762a` makes them happy. 2. Code that wants to know whether a BlockDriverState has media, such as find_image_format(), bdrv_flush_all(). Commit `4be9762a` makes them unhappy. In particular, it breaks flush on VM stop for media ejected by the guest. Revert the change to bdrv_is_inserted(). Check the tray status in the device models instead. Note on IDE: Since only ATAPI devices have a tray, and they don't accept ATA commands since the recent commit "ide: Reject ATA commands specific to drive kinds", checking in atapi.c suffices. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-09-12 15:17:20 +02:00
Markus Armbruster	07b70bfbb3	savevm: Include writable devices with removable media savevm and loadvm silently ignore block devices with removable media, such as floppies and SD cards. Rolling back a VM to a previous checkpoint will not roll back writes to block devices with removable media. Moreover, bdrv_is_removable() is a confused mess, and wrong in at least one case: it considers "-drive if=xen,media=cdrom -M xenpv" removable. It'll be cleaned up later in this series. Read-only block devices are also ignored, but that's okay. Fix by ignoring only read-only block devices and empty block devices. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-09-06 11:24:07 +02:00
Markus Armbruster	c602a489f9	block: Clean up bdrv_flush_all() Change (!bdrv_is_removable(bs) \|\| bdrv_is_inserted(bs)) to just bdrv_is_inserted(). Rationale: The value of bdrv_is_removable(bs) matters only when bdrv_is_inserted(bs) is false. bdrv_is_inserted(bs) is true when bs is open (bs->drv != NULL) and not an empty host drive (CD-ROM or floppy). Therefore, bdrv_is_removable(bs) matters only when: 1. bs is not open old: may call bdrv_flush(bs), which does nothing new: won't call 2. bs is an empty host drive old: may call bdrv_flush(bs), which calls driver method raw_flush(), which calls fdatasync() or equivalent, which can't do anything useful while the drive is empty new: won't call Result is bs->drv && !bdrv_is_read_only(bs) && bdrv_is_inserted(bs). bdrv_is_inserted(bs) implies bs->drv. Drop the redundant test. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-09-06 11:24:07 +02:00
Markus Armbruster	8e49ca4624	block: Leave tracking media change to device models hw/fdc.c is the only one that cares. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-09-06 11:24:06 +02:00
Markus Armbruster	145feb176f	block: Split change_cb() into change_media_cb(), resize_cb() Multiplexing callbacks complicates matters needlessly. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-09-06 11:23:51 +02:00
Markus Armbruster	0e49de5232	block: Generalize change_cb() to BlockDevOps So we can more easily add device model callbacks. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-09-06 11:23:51 +02:00
Markus Armbruster	fa879d62eb	block: Attach non-qdev devices as well For now, this just protects against programming errors like having the same drive back multiple non-qdev devices, or untimely bdrv_delete(). Later commits will add other interesting uses. While there, rename BlockDriverState member peer to dev, bdrv_attach() to bdrv_attach_dev(), bdrv_detach() to bdrv_detach_dev(), and bdrv_get_attached() to bdrv_get_attached_dev(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-09-06 11:23:51 +02:00
Stefan Weil	541dc0d47f	Use new macro QEMU_PACKED for packed structures Most changes were made using these commands: git grep -la '__attribute__((packed))'\|xargs perl -pi -e 's/__attribute__$\(packed$\)/QEMU_PACKED/' git grep -la '__attribute__ ((packed))'\|xargs perl -pi -e 's/__attribute__ $\(packed$\)/QEMU_PACKED/' git grep -la '__attribute__((__packed__))'\|xargs perl -pi -e 's/__attribute__$\(__packed__$\)/QEMU_PACKED/' git grep -la '__attribute__ ((__packed__))'\|xargs perl -pi -e 's/__attribute__ $\(__packed__$\)/QEMU_PACKED/' git grep -la '__attribute((packed))'\|xargs perl -pi -e 's/__attribute$\(packed$\)/QEMU_PACKED/' Whitespace in linux-user/syscall_defs.h was fixed manually to avoid warnings from scripts/checkpatch.pl. Manual changes were also applied to hw/pc.c. I did not fix indentation with tabs in block/vvfat.c. The patch will show 4 errors with scripts/checkpatch.pl. Signed-off-by: Stefan Weil <weil@mail.berlios.de> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2011-09-03 10:45:59 +00:00
Christoph Hellwig	c488c7f649	block: latency accounting Account the total latency for read/write/flush requests. This allows management tools to average it based on a snapshot of the nr ops counters and allow checking for SLAs or provide statistics. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-08-26 18:18:38 +02:00
Christoph Hellwig	a597e79ce1	block: explicit I/O accounting Decouple the I/O accounting from bdrv_aio_readv/writev/flush and make the hardware models call directly into the accounting helpers. This means: - we do not count internal requests from image formats in addition to guest originating I/O - we do not double count I/O ops if the device model handles it chunk wise - we only account I/O once it actuall is done - can extent I/O accounting to synchronous or coroutine I/O easily - implement I/O latency tracking easily (see the next patch) I've conveted the existing device model callers to the new model, device models that are using synchronous I/O and weren't accounted before haven't been updated yet. Also scsi hasn't been converted to the end-to-end accounting as I want to defer that after the pending scsi layer overhaul. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-08-25 18:18:42 +02:00
Christoph Hellwig	e8045d6726	block: include flush requests in info blockstats Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-08-23 17:41:14 +02:00
Stefan Hajnoczi	92196b2f56	block: add cache=directsync parameter to -drive This patch adds -drive cache=directsync for O_DIRECT \| O_SYNC host file I/O with no disk write cache presented to the guest. This mode is useful when guests may not be sending flushes when appropriate and therefore leave data at risk in case of power failure. When cache=directsync is used, write operations are only completed to the guest when data is safely on disk. This new mode is like cache=writethrough but it bypasses the host page cache. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-08-23 14:15:17 +02:00
Stefan Hajnoczi	c3993cdca3	block: parse cache mode flags in a single place This patch introduces bdrv_parse_cache_flags() which sets open flags given a cache mode. Previously this was duplicated in blockdev.c and qemu-img.c. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-08-23 14:15:17 +02:00
Robert Wang	d62b5dea30	fix code format Fix code format to make checkpatch.pl happy. Signed-off-by: Robert Wang <wdongxu@linux.vnet.ibm.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2011-08-22 10:17:52 -05:00
Anthony Liguori	7267c0947d	Use glib memory allocation and free functions qemu_malloc/qemu_free no longer exist after this commit. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2011-08-20 23:01:08 -05:00
Kevin Wolf	e7a8a7837a	block: Use bdrv_co_* instead of synchronous versions in coroutines If we're already in a coroutine, there is no reason to use the synchronous version of block layer functions when a coroutine one exists. This makes bdrv_read/write/flush use bdrv_co_* when used inside a coroutine. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-08-04 11:27:15 +02:00
Kevin Wolf	384acbf46b	async: Remove AsyncContext The purpose of AsyncContexts was to protect qcow and qcow2 against reentrancy during an emulated bdrv_read/write (which includes a qemu_aio_wait() call and can run AIO callbacks of different requests if it weren't for AsyncContexts). Now both qcow and qcow2 are protected by CoMutexes and AsyncContexts can be removed. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-08-02 15:53:41 +02:00
Kevin Wolf	f9f05dc58c	block: Add bdrv_co_readv/writev emulation In order to be able to call bdrv_co_readv/writev for drivers that don't implement the functions natively, add an emulation that uses the AIO functions to implement them. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-08-02 15:53:40 +02:00
Kevin Wolf	6848542018	block: Emulate AIO functions with bdrv_co_readv/writev Use the bdrv_co_readv/writev callbacks to implement bdrv_aio_readv/writev and bdrv_read/write if a driver provides the coroutine version instead of the synchronous or AIO version. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-08-02 15:53:40 +02:00
Kevin Wolf	da1fa91d6c	block: Add bdrv_co_readv/writev Add new block driver callbacks bdrv_co_readv/writev, which work on a QEMUIOVector like bdrv_aio_*, but don't need a callback. The function may only be called inside a coroutine, so a block driver implementing this interface can yield instead of blocking during I/O. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-08-02 15:53:40 +02:00
Frediano Ziglio	5bf3f8e4f7	block: Removed unused function bdrv_write_sync Signed-off-by: Frediano Ziglio <freddy77@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-08-01 12:10:29 +02:00
Markus Armbruster	49aa46bb4b	block: Don't let locked flag prevent medium load Commit `aea2a33c` made bdrv_eject() obey the locked flag. Correct for medium eject (eject_flag set), incorrect for medium load (eject_flag clear). See MMC-5 Table 341 "Actions for Lock/Unlock/Eject". Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-08-01 12:10:28 +02:00
Markus Armbruster	822e1cd17e	block: Make BlockDriver method bdrv_eject() return void Callees always return 0, except for FreeBSD's cdrom_eject(), which returns -ENOTSUP when the device is in a terminally wedged state. The only caller is bdrv_eject(), and it maps -ENOTSUP to 0 since commit `4be9762a`. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-08-01 12:10:28 +02:00
Markus Armbruster	a19712b0db	block: Reset device model callbacks on detach BlockDriverState members change_cb and change_opaque are initially null. The device model may set them, with bdrv_set_change_cb(). If the device model gets detached (hot unplug), they're left dangling. Only safe because device hot unplug automatically destroys the BlockDriverState. But that's a questionable feature, best not to rely on it. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-08-01 12:09:11 +02:00
Fam Zheng	4a1d5e1fde	block: add bdrv_get_allocated_file_size() operation qemu-img.c wants to count allocated file size of image. Previously it counts a single bs->file by 'stat' or Window API. As VMDK introduces multiple file support, the operation becomes format specific with platform specific meanwhile. The functions are moved to block/raw-{posix,win32}.c and qemu-img.c calls bdrv_get_allocated_file_size to count the bs. And also added VMDK code to count his own extents. Signed-off-by: Fam Zheng <famcool@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-07-19 15:39:08 +02:00
Kevin Wolf	d220894e02	bdrv_img_create: Fix segfault Block drivers that don't support creating images don't have a size option. Fail gracefully instead of segfaulting when trying to access the option's value. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-06-08 11:56:40 +02:00
Christoph Hellwig	a659979328	block: clarify the meaning of BDRV_O_NOCACHE Change BDRV_O_NOCACHE to only imply bypassing the host OS file cache, but no writeback semantics. All existing callers are changed to also specify BDRV_O_CACHE_WB to give them writeback semantics. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-06-08 10:39:32 +02:00
Markus Armbruster	8d278467ff	block: Remove type hint, it's guest matter, doesn't belong here No users of bdrv_get_type_hint() left. bdrv_set_type_hint() can make the media removable by side effect. Make that explicit. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-05-19 10:26:23 +02:00
Markus Armbruster	d8aeeb31d5	block QMP: Deprecate query-block's "type", drop info block's "type=" query-block's specification documents response member "type" with values "hd", "cdrom", "floppy", "unknown". Its value is unreliable: a block device used as floppy has type "floppy" if created with if=floppy, but type "hd" if created with if=none. That's because with if=none, the type is at best a declaration of intent: the drive can be connected to any guest device. Its type is really the guest device's business. Reporting it here is wrong. No known user of QMP uses "type". It's unlikely that any unknown users exist, because its value is useless unless you know how the block device was created. But then you also know the true value. Fixing the broken value risks breaking (hypothetical!) clients that somehow rely on the current behavior. Not fixing the value risks breaking (hypothetical!) clients that rely on the value to be accurate. Can't entirely avoid hypothetical lossage. Change the value to be always "unknown". This makes "info block" always report "type=unknown". Pointless. Change it to not report the type. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-05-19 10:26:19 +02:00
Stefan Weil	a1c7273b82	Fix typos in comments and code (occured -> occurred and related) The code changed here is an unused data type name (evt_flush_occurred). Signed-off-by: Stefan Weil <weil@mail.berlios.de> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>	2011-05-08 10:02:18 +01:00
Stefan Weil	ebabb67a17	Fix typo in code and comments Replace writeable -> writable Signed-off-by: Stefan Weil <weil@mail.berlios.de> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>	2011-05-06 08:19:25 +01:00
Stefan Hajnoczi	46a4e4e608	block: Do not cache device size for removable media The block layer caches the device size to avoid doing lseek(fd, 0, SEEK_END) every time this value is needed. For removable media the device size becomes stale if a new medium is inserted. This patch simply prevents device size caching for removable media. A smarter solution is to update the cached device size when a new medium is inserted. Given that there are currently bugs with CD-ROM media change I do not want to implement that approach until we've gotten things correct first. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-04-07 13:51:47 +02:00
Stefan Hajnoczi	b8c6d09589	trace: Trace bdrv_set_locked() It can be handy to know when the guest locks/unlocks the CD-ROM tray. This trace event makes that possible. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-04-07 13:51:47 +02:00
Ryan Harper	d22b2f41c4	Do not delete BlockDriverState when deleting the drive When removing a drive from the host-side via drive_del we currently have the following path: drive_del qemu_aio_flush() bdrv_close() // zaps bs->drv, which makes any subsequent I/O get // dropped. Works as designed drive_uninit() bdrv_delete() // frees the bs. Since the device is still connected to // bs, any subsequent I/O is a use-after-free. The value of bs->drv becomes unpredictable on free. As long as it remains null, I/O still gets dropped, however it could become non-null at any point after the free resulting SEGVs or other QEMU state corruption. To resolve this issue as simply as possible, we can chose to not actually delete the BlockDriverState pointer. Since bdrv_close() handles setting the drv pointer to NULL, we just need to remove the BlockDriverState from the QLIST that is used to enumerate the block devices. This is currently handled within bdrv_delete, so move this into its own function, bdrv_make_anon(). The result is that we can now invoke drive_del, this closes the file descriptors and sets BlockDriverState->drv to NULL which prevents futher IO to the device, and since we do not free BlockDriverState, we don't have to worry about the copy retained in the block devices. We also don't attempt to remove the qdev property since we are no longer deleting the BlockDriverState on drives with associated drives. This also allows for removing Drives with no devices associated either. Reported-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Ryan Harper <ryanh@us.ibm.com> Acked-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-04-07 13:51:47 +02:00
Ryan Harper	301db7c2dd	Don't allow multiwrites against a block device without underlying medium If the block device has been closed, we no longer have a medium to submit IO against, check for this before submitting io. This prevents a segfault further in the code where we dereference elements of the block driver. Signed-off-by: Ryan Harper <ryanh@us.ibm.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-03-15 13:21:14 +01:00
Stefan Hajnoczi	a13aac04e1	trace: Trace bdrv_aio_flush() Add a trace event for bdrv_aio_flush() to complement the existing bdrv_aio_readv() and bdrv_aio_writev() events. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>	2011-03-07 15:34:42 +00:00
Blue Swirl	5bbdbb4676	fdc: move floppy geometry guessing to block.c Other geometry guessing functions already reside in block.c. Remove some unused or debugging only fields. Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2011-02-20 09:33:17 +00:00
Marcelo Tosatti	8591675f44	block: enable in_use flag Set block device in use during block migration, disallow drive_del and bdrv_truncate for in use devices. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-02-07 12:51:19 +01:00
Marcelo Tosatti	db593f2565	Add flag to indicate external users to block device Certain operations such as drive_del or resize cannot be performed while external users (eg. block migration) reference the block device. Add a flag to indicate that. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-02-07 12:51:19 +01:00
Christoph Hellwig	db97ee6a97	block: tell drivers about an image resize Extend the change_cb callback with a reason argument, and use it to tell drivers about size changes. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-01-31 10:03:00 +01:00
Stefan Hajnoczi	96df67d1c3	block: Use backing format driver during image creation The backing format should be honored during image creation. For some reason we currently use the image format to open the backing file. This fails when the backing file has a different format than the image being created. Keep the image and backing format drivers completely separate. Also print the backing filename if there is an error opening the backing file instead of the image filename. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Acked-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-01-24 16:49:50 +01:00
Blue Swirl	71df0eeb98	block: delete a write-only variable Avoid a warning with GCC 4.6.0: /src/qemu/block.c: In function 'bdrv_img_create': /src/qemu/block.c:2862:25: error: variable 'fmt' set but not used [-Werror=unused-but-set-variable] CC: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2011-01-06 18:25:37 +00:00
Christoph Hellwig	bb8bf76fb1	block: add discard support Add a new bdrv_discard method to free blocks in a mapping image, and a new drive property to set the granularity for these discard. If no discard granularity support is set discard support is disabled. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-12-17 16:11:03 +01:00
Jes Sorensen	4f70f249ca	bdrv_img_create() use proper errno return values Kevin suggested to have bdrv_img_create() return proper -errno values on error. Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-12-17 16:11:03 +01:00
Jes Sorensen	792da93a63	Prevent creating an image with the same filename as backing file Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-12-17 16:11:03 +01:00
Jes Sorensen	f88e1a4201	qemu-img.c: Re-factor img_create() This patch re-factors img_create() moving the code doing the actual work into block.c where it can be shared with QEMU. This is needed to be able to create images from QEMU to be used for live snapshots. Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-12-17 16:11:03 +01:00
Stefan Hajnoczi	df2dbb4a50	block: Fix the use of protocols in backing files Backing filenames may contain a protocol. The code currently doesn't consider this case and produces filenames that embed "<protocol>:". Don't combine filenames if the backing filename contains a protocol. Based on an earlier patch by Anthony Liguori <aliguori@us.ibm.com>. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-12-17 16:10:59 +01:00
Stefan Hajnoczi	9e0b22f4f2	block: Introduce path_has_protocol() function The bdrv_find_protocol() function returns NULL if an unknown protocol name is given. It returns the "file" protocol when the filename contains no protocol at all. This makes it difficult to distinguish between paths which contain a protocol and those which do not. Factor out a helper function that tests whether or not a filename has a protocol. The next patch makes use of this function. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-12-17 16:10:59 +01:00
Stefan Hajnoczi	16905d7175	block: Make bdrv_create_file() ':' handling consistent Filenames may start with "<protocol>:" to explicitly use a protocol like nbd. Filenames with unknown protocols are rejected in most of QEMU except for bdrv_create_file(). Even if a file with an invalid filename can be created, QEMU cannot use it since all the other relevant functions reject such paths. Make bdrv_create_file() consistent. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-12-14 15:44:21 +01:00
Marcelo Tosatti	4dcafbb1eb	block: set sector dirty on AIO write completion Sectors are marked dirty in the bitmap on AIO submission. This is wrong since data has not reached storage. Set a given sector as dirty in the dirty bitmap on AIO completion, so that reading a sector marked as dirty is guaranteed to return uptodate data. Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2010-11-21 09:16:56 -06:00
Marcelo Tosatti	6d59fec11e	block: fix shift in dirty bitmap calculation Otherwise upper 32 bits of bitmap entries are not correctly calculated. Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2010-11-21 09:16:56 -06:00
Kevin Wolf	205ef7961f	block: Allow bdrv_flush to return errors This changes bdrv_flush to return 0 on success and -errno in case of failure. It's a requirement for implementing proper error handle in users of bdrv_flush. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>	2010-11-04 12:52:16 +01:00
edison	51ef67270b	Copy snapshots out of QCOW2 disk In order to backup snapshots, created from QCOW2 iamge, we want to copy snapshots out of QCOW2 disk to a seperate storage. The following patch adds a new option in "qemu-img": qemu-img convert -f qcow2 -O qcow2 -s snapshot_name src_img bck_img. Right now, it only supports to copy the full snapshot, delta snapshot is on the way. Changes from V1: all the comments from Kevin are addressed: Add read-only checking Fix coding style Change the name from bdrv_snapshot_load to bdrv_snapshot_load_tmp Signed-off-by: Disheng Su <edison@cloud.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-10-22 14:49:35 +02:00
Stefan Hajnoczi	bbf0a44081	trace: Trace bdrv_aio_{readv,writev} Observing block layer aio readv/writev operations is useful for debugging image formats or understanding guest disk I/O patterns. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2010-10-09 08:17:03 +00:00
Stefan Hajnoczi	6d519a5f95	trace: Trace virtio-blk, multiwrite, and paio_submit This patch adds trace events that make it possible to observe virtio-blk. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>	2010-09-09 16:22:45 -05:00
Anthony Liguori	8b33d9eeba	Revert "Make default invocation of block drivers safer (v3)" This reverts commit `79368c81bf`. Conflicts: block.c I haven't been able to come up with a solution yet for the corruption caused by unaligned requests from the IDE disk so revert until a solution can be written. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2010-09-08 17:09:15 -05:00
Kevin Wolf	ee1811965f	block: Fix image re-open in bdrv_commit Arguably we should re-open the backing file with the backing file format and not with the format of the snapshot image. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-08-30 18:29:22 +02:00
Markus Armbruster	4be9762adb	block: Change bdrv_eject() not to drop the image bdrv_eject() gets called when a device model opens or closes the tray. If the block driver implements method bdrv_eject(), that method gets called. Drivers host_cdrom implements it, and it opens and closes the physical tray, and nothing else. When a device model opens, then closes the tray, media changes only if the user actively changes the physical media while the tray is open. This is matches how physical hardware behaves. If the block driver doesn't implement method bdrv_eject(), we do something quite different: opening the tray severs the connection to the image by calling bdrv_close(), and closing the tray does nothing. When the device model opens, then closes the tray, media is gone, unless the user actively inserts another one while the tray is open, with a suitable change command in the monitor. This isn't how physical hardware behaves. Rather inconvenient when programs "helpfully" eject media to give you a chance to change it. The way bdrv_eject() behaves here turns that chance into a must, which is not what these programs or their users expect. Change the default action not to call bdrv_close(). Instead, note the tray status in new BlockDriverState member tray_open. Use it in bdrv_is_inserted(). Arguably, the device models should keep track of tray status themselves. But this is less invasive. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-08-03 15:57:22 +02:00
Kevin Wolf	336c1c1255	block: Fix bdrv_has_zero_init Assuming that any image on a block device is not properly zero-initialized is actually wrong: Only raw images have this problem. Any other image format shouldn't care about it, they initialize everything properly themselves. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-08-03 15:57:22 +02:00
Kevin Wolf	8a4266144e	block: Change bdrv_commit to handle multiple sectors at once bdrv_commit copies the image to its backing file sector by sector, which is (surprise!) relatively slow. Let's take a larger buffer and handle more sectors at once if possible. With a 1G qcow2 file, this brought the time bdrv_commit takes down from 5:06 min to 1:14 min for me. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-08-03 15:57:22 +02:00
Blue Swirl	199630b62e	Fix -snapshot deleting images on disk change Block device change command did not copy BDRV_O_SNAPSHOT flag. Thus the new image did not have this flag and the file got deleted during opening. Fix by copying BDRV_O_SNAPSHOT flag. Signed-off-by: Blue Swirl <blauwirbel@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-07-26 13:39:40 +02:00
Stefan Weil	c98ac35d87	block: Use error codes from lower levels for error message "No such file or directory" is a misleading error message when a user tries to open a file with wrong permissions. Cc: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Weil <weil@mail.berlios.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-07-26 13:39:40 +02:00
Anthony Liguori	79368c81bf	Make default invocation of block drivers safer (v3) CVE-2008-2004 described a vulnerability in QEMU whereas a malicious user could trick the block probing code into accessing arbitrary files in a guest. To mitigate this, we added an explicit format parameter to -drive which disabling block probing. Fast forward to today, and the vast majority of users do not use this parameter. libvirt does not use this by default nor does virt-manager. Most users want block probing so we should try to make it safer. This patch adds some logic to the raw device which attempts to detect a write operation to the beginning of a raw device. If the first 4 bytes happen to match an image file that has a backing file that we support, it scrubs the signature to all zeros. If a user specifies an explicit format parameter, this behavior is disabled. I contend that while a legitimate guest could write such a signature to the header, we would behave incorrectly anyway upon the next invocation of QEMU. This simply changes the incorrect behavior to not involve a security vulnerability. I've tested this pretty extensively both in the positive and negative case. I'm not 100% confident in the block layer's ability to deal with zero sized writes particularly with respect to the aio functions so some additional eyes would be appreciated. Even in the case of a single sector write, we have to make sure to invoked the completion from a bottom half so just removing the zero sized write is not an option. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2010-07-15 08:17:06 -05:00
Kevin Wolf	9ac228e02c	qcow2/vdi: Change check to distinguish error cases This distinguishes between harmless leaks and real corruption. Hopefully users better understand what qemu-img check wants to tell them. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-07-06 17:05:49 +02:00
Kevin Wolf	e076f3383b	qemu-img check: Distinguish different kinds of errors People think that their images are corrupted when in fact there are just some leaked clusters. Differentiating several error cases should make the messages more comprehensible. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-07-06 17:05:48 +02:00
Kevin Wolf	de189a1b4a	block: Handle multiwrite errors only when all requests have completed Don't try to be clever by freeing all temporary data and calling all callbacks when the return value (an error) is certain. Doing so has at least two important problems: * The temporary data that is freed (qiov, possibly zero buffer) is still used by the requests that have not yet completed. * Calling the callbacks for all requests in the multiwrite means for the caller that it may free buffers etc. which are still in use. Just remember the error value and do the cleanup when all requests have completed. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-07-02 15:44:12 +02:00
Kevin Wolf	453f9a1652	block: Fix early failure in multiwrite bdrv_aio_writev may call the callback immediately (and it will commonly do so in error cases). Current code doesn't consider this. For details see the comment added by this patch. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-07-02 15:44:12 +02:00
Markus Armbruster	7d0d69509a	block: Fix virtual media change for if=none BlockDriverState member removable controls whether virtual media change (monitor commands change, eject) is allowed. It is set when the "type hint" is BDRV_TYPE_CDROM or BDRV_TYPE_FLOPPY. The type hint is only set by drive_init(). It sets BDRV_TYPE_FLOPPY for if=floppy. It sets BDRV_TYPE_CDROM for media=cdrom and if=ide, scsi, xen, or none. if=ide and if=scsi work, because the type hint makes it a CD-ROM. if=xen likewise, I think. For the same reason, if=none works when it's used by ide-drive or scsi-disk. For other guest devices, there are problems: * fdc: you can't change virtual media $ qemu [...] -drive if=none,id=foo,... -global isa-fdc.driveA=foo QEMU 0.12.50 monitor - type 'help' for more information (qemu) eject foo Device 'foo' is not removable unless you add media=cdrom, but that makes it readonly. * virtio: if you add media=cdrom, you can change virtual media. If you eject, the guest gets I/O errors. If you change, the guest sees the drive's contents suddenly change. * scsi-generic: if you add media=cdrom, you can change virtual media. I didn't test what that does to the guest or the physical device, but it can't be pretty. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-07-02 13:18:02 +02:00
Markus Armbruster	3ac906f771	block: Clean up bdrv_snapshots() Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-07-02 13:18:02 +02:00
Markus Armbruster	f9092b108f	savevm: Survive hot-unplug of snapshot device savevm.c keeps a pointer to the snapshot block device. If you manage to get that device deleted, the pointer dangles, and the next snapshot operation will crash & burn. Unplugging a guest device that uses it does the trick: $ MALLOC_PERTURB_=234 qemu-system-x86_64 [...] QEMU 0.12.50 monitor - type 'help' for more information (qemu) info snapshots No available block device supports snapshots (qemu) drive_add auto if=none,file=tmp.qcow2 OK (qemu) device_add usb-storage,id=foo,drive=none1 (qemu) info snapshots Snapshot devices: none1 Snapshot list (from none1): ID TAG VM SIZE DATE VM CLOCK (qemu) device_del foo (qemu) info snapshots Snapshot devices: Segmentation fault (core dumped) Move management of that pointer to block.c, and zap it when the device it points becomes unusable. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-07-02 13:18:02 +02:00
Markus Armbruster	18846dee1a	block: Catch attempt to attach multiple devices to a blockdev For instance, -device scsi-disk,drive=foo -device scsi-disk,drive=foo happily creates two SCSI disks connected to the same block device. It's all downhill from there. Device usb-storage deliberately attaches twice to the same blockdev, which fails with the fix in place. Detach before the second attach there. Also catch attempt to delete while a guest device model is attached. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-07-02 13:18:02 +02:00
Ryan Harper	15c7733bb2	Don't reset bs->is_temporary in bdrv_open_common To fix https://bugs.launchpad.net/qemu/+bug/597402 where qemu fails to call unlink() on temporary snapshots due to bs->is_temporary getting clobbered in bdrv_open_common() after being set in bdrv_open() which calls the former. We don't need to initialize bs->is_temporary in bdrv_open_common(). Signed-off-by: Ryan Harper <ryanh@us.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-07-02 13:18:01 +02:00
Christoph Hellwig	39508e7adb	block: allow filenames with colons again for host devices Before the raw/file split we used to allow filenames with colons for host device only. While this was more by accident than by design people rely on it, so we need to bring it back. So move the host device probing to be before the protocol detection again. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-07-02 13:18:01 +02:00
Kevin Wolf	f08145fe16	block: Add bdrv_(p)write_sync Add new functions that write and flush the written data to disk immediately. This is what needs to be used for image format metadata to maintain integrity for cache=... modes that don't use O_DSYNC. (Actually, we only need barriers, and therefore the functions are defined as such, but flushes is what is implemented in this patch - we can try to change that later) Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-06-22 14:38:02 +02:00
Blue Swirl	5ffbbc67b5	block: fix a warning and possible truncation Fix a warning from OpenBSD gcc (3.3.5 (propolice)): /src/qemu/block.c: In function `bdrv_info_stats_bs': /src/qemu/block.c:1548: warning: long long int format, long unsigned int arg (arg 6) There may be also truncation effects. Signed-off-by: Blue Swirl <blauwirbel@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-06-15 09:42:30 +02:00
Markus Armbruster	2f399b0aad	block: New bdrv_next() This is a more flexible alternative to bdrv_iterate(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-06-15 09:41:59 +02:00
Markus Armbruster	6ab4b5ab8f	block: Decouple block device "commit all" from DriveInfo do_commit() and mux_proc_byte() iterate over the list of drives defined with drive_init(). This misses host block devices defined by other means. Such means don't exist now, but will be introduced later in this series. Change them to use new bdrv_commit_all(), which iterates over all host block devices. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-06-15 09:41:59 +02:00
Markus Armbruster	abd7f68d08	block: Move error actions from DriveInfo to BlockDriverState That's where they belong semantically (block device host part), even though the actions are actually executed by guest device code. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-06-15 09:41:59 +02:00
Miguel Di Ciurcio Filho	feeee5aca7	savevm: Really verify if a drive supports snapshots Both bdrv_can_snapshot() and bdrv_has_snapshot() does not work as advertized. First issue: Their names implies different porpouses, but they do the same thing and have exactly the same code. Maybe copied and pasted and forgotten? bdrv_has_snapshot() is called in various places for actually checking if there is snapshots or not. Second issue: the way bdrv_can_snapshot() verifies if a block driver supports or not snapshots does not catch all cases. E.g.: a raw image. So when do_savevm() is called, first thing it does is to set a global BlockDriverState to save the VM memory state calling get_bs_snapshots(). static BlockDriverState get_bs_snapshots(void) { BlockDriverState bs; DriveInfo dinfo; if (bs_snapshots) return bs_snapshots; QTAILQ_FOREACH(dinfo, &drives, next) { bs = dinfo->bdrv; if (bdrv_can_snapshot(bs)) goto ok; } return NULL; ok: bs_snapshots = bs; return bs; } bdrv_can_snapshot() may return a BlockDriverState that does not support snapshots and do_savevm() goes on. Later on in do_savevm(), we find: QTAILQ_FOREACH(dinfo, &drives, next) { bs1 = dinfo->bdrv; if (bdrv_has_snapshot(bs1)) { / Write VM state size only to the image that contains the state */ sn->vm_state_size = (bs == bs1 ? vm_state_size : 0); ret = bdrv_snapshot_create(bs1, sn); if (ret < 0) { monitor_printf(mon, "Error while creating snapshot on '%s'\n", bdrv_get_device_name(bs1)); } } } bdrv_has_snapshot(bs1) is not checking if the device does support or has snapshots as explained above. Only in bdrv_snapshot_create() the device is actually checked for snapshot support. So, in cases where the first device supports snapshots, and the second does not, the snapshot on the first will happen anyways. I believe this is not a good behavior. It should be an all or nothing process. This patch addresses these issues by making bdrv_can_snapshot() actually do what it must do and enforces better tests to avoid errors in the middle of do_savevm(). bdrv_has_snapshot() is removed and replaced by bdrv_can_snapshot() where appropriate. bdrv_can_snapshot() was moved from savevm.c to block.c. It makes more sense to me. The loadvm_state() function was updated too to enforce that when loading a VM at least all writable devices must support snapshots too. Signed-off-by: Miguel Di Ciurcio Filho <miguel.filho@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-06-15 09:41:58 +02:00
MORITA Kazutaka	7cdb1f6d30	block: call the snapshot handlers of the protocol drivers When snapshot handlers are not defined in the format driver, it is better to call the ones of the protocol driver. This enables us to implement snapshot support in the protocol driver. We need to call bdrv_close() and bdrv_open() handlers of the format driver before and after bdrv_snapshot_goto() call of the protocol. It is because the contents of the block driver state may need to be changed after loading vmstate. Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2010-06-04 11:43:40 +02:00

... 2 3 4 5 6 ...

551 Commits