qemu-e2k

Commit Graph

Author	SHA1	Message	Date
Kevin Wolf	4c346e0bb9	MAINTAINERS: Add header files to Block Layer Core section Suggested-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-05-22 17:08:09 +02:00
Daniel P. Berrange	f7ac119cfa	tests: add test case for encrypted qcow2 read/write Add a simple test case for qemu-iotests that covers read/write with encrypted qcow2 files. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-05-22 17:08:09 +02:00
Daniel P. Berrange	8caf02127e	qemu-io: prompt for encryption keys when required The qemu-io tool does not check if the image is encrypted so historically would silently corrupt the sectors by writing plain text data into them instead of cipher text. The earlier commit turns this mistake into a fatal abort, so check for encryption and prompt for key when required. This enables us to add unit tests to ensure we don't break the ability of qemu-img to convert existing encrypted qcow2 files into a non-encrypted format. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-05-22 17:08:01 +02:00
Daniel P. Berrange	6a11d5183f	util: allow \n to terminate password input The qemu_read_password() method looks for \r to terminate the reading of the a password. This is what will be seen when reading the password from a TTY. When scripting though, it is useful to be able to send the password via a pipe, in which case we must look for \n to terminate password input. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-05-22 17:08:01 +02:00
Daniel P. Berrange	d57e4e482e	util: move read_password method out of qemu-img into osdep/oslib The qemu-img.c file has a read_password() method impl that is used to prompt for passwords on the console, with impls for POSIX and Windows. This will be needed by qemu-io.c too, so move it into the QEMU osdep/oslib files where it can be shared without code duplication Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-05-22 17:08:01 +02:00
Daniel P. Berrange	8336aafae1	qcow2/qcow: protect against uninitialized encryption key When a qcow[2] file is opened, if the header reports an encryption method, this is used to set the 'crypt_method_header' field on the BDRVQcow[2]State struct, and the 'encrypted' flag in the BDRVState struct. When doing I/O operations, the 'crypt_method' field on the BDRVQcow[2]State struct is checked to determine if encryption needs to be applied. The crypt_method_header value is copied into crypt_method when the bdrv_set_key() method is called. The QEMU code which opens a block device is expected to always do a check if (bdrv_is_encrypted(bs)) { bdrv_set_key(bs, ....key...); } If code forgets to do this, then 'crypt_method' is never set and so when I/O is performed, QEMU writes plain text data into a sector which is expected to contain cipher text, or when reading, will return cipher text instead of plain text. Change the qcow[2] code to consult bs->encrypted when deciding whether encryption is required, and assert(s->crypt_method) to protect against cases where the caller forgets to set the encryption key. Also put an assert in the set_key methods to protect against the case where the caller sets an encryption key on a block device that does not have encryption Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-05-22 17:08:01 +02:00
Fam Zheng	aa4f592a1d	qemu-iotests: Make debugging python tests easier Adding "-d" option. The output goes to "tee" so it appears in your console. Also, raise the verbosity of unnitest runner. When testing a topic branch, it's possible that a bug introduced by a code change makes the python test case hang, with debug output, it is much easier to locate the problem. This can also be helpful if you want to watch the progress of a python test, it offers you a way to sense the speed of each test case method you're writing. Note: because there is no easy way to get both the verbose output and the output expected by ./check comparison, the case would always fail with an "output mismatch". The sole purpose of using this option is giving developers a quick way to debug when things go wrong. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-05-22 17:08:01 +02:00
Fam Zheng	b93bbf4ee9	qemu-iotests: qemu-img info on afl VMDK image with a huge capacity The image is contributed by Richard W.M. Jones. Cc: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-05-22 17:08:01 +02:00
Fam Zheng	4a9c9ea0d3	block: Detect multiplication overflow in bdrv_getlength Bogus image may have a large total_sectors that will overflow the multiplication. For cleanness, fix the return code so the error message will be meaningful. Reported-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-05-22 17:08:01 +02:00
Eric Blake	b062ad86dc	qemu-io: Use getopt() correctly POSIX says getopt() returns -1 on completion. While Linux happens to define EOF as -1, this definition is not required by POSIX, and there may be platforms where checking for EOF instead of -1 would lead to an infinite loop. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-05-22 17:08:01 +02:00
Alberto Garcia	d1b4efe5c4	qcow2: style fixes in qcow2-cache.c Fix pointer declaration to make it consistent with the rest of the code. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-05-22 17:08:01 +02:00
Alberto Garcia	a3f1afb43a	qcow2: make qcow2_cache_put() a void function This function never receives an invalid table pointer, so we can make it void and remove all the error checking code. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-05-22 17:08:01 +02:00
Alberto Garcia	812e4082ca	qcow2: use a hash to look for entries in the L2 cache The current cache algorithm traverses the array starting always from the beginning, so the average number of comparisons needed to perform a lookup is proportional to the size of the array. By using a hash of the offset as the starting point, lookups are faster and independent from the array size. The hash is computed using the cluster number of the table, multiplied by 4 to make it perform better when there are collisions. In my tests, using a cache with 2048 entries, this reduces the average number of comparisons per lookup from 430 to 2.5. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-05-22 17:08:01 +02:00
Alberto Garcia	fdfbca82a0	qcow2: remove qcow2_cache_find_entry_to_replace() A cache miss means that the whole array was traversed and the entry we were looking for was not found, so there's no need to traverse it again in order to select an entry to replace. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-05-22 17:08:01 +02:00
Alberto Garcia	2693310ecc	qcow2: use an LRU algorithm to replace entries from the L2 cache The current algorithm to evict entries from the cache gives always preference to those in the lowest positions. As the size of the cache increases, the chances of the later elements of being removed decrease exponentially. In a scenario with random I/O and lots of cache misses, entries in positions 8 and higher are rarely (if ever) evicted. This can be seen even with the default cache size, but with larger caches the problem becomes more obvious. Using an LRU algorithm makes the chances of being removed from the cache independent from the position. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-05-22 17:08:01 +02:00
Alberto Garcia	baf07d60f5	qcow2: simplify qcow2_cache_put() and qcow2_cache_entry_mark_dirty() Since all tables are now stored together, it is possible to obtain the position of a particular table directly from its address, so the operation becomes O(1). Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-05-22 17:08:01 +02:00
Alberto Garcia	72e80b8901	qcow2: use one single memory block for the L2/refcount cache tables The qcow2 L2/refcount cache contains one separate table for each cache entry. Doing one allocation per table adds unnecessary overhead and it also requires us to store the address of each table separately. Since the size of the cache is constant during its lifetime, it's better to have an array that contains all the tables using one single allocation. In my tests measuring freshly created caches with sizes 128MB (L2) and 32MB (refcount) this uses around 10MB of RAM less. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-05-22 17:08:01 +02:00
Fam Zheng	13c4941cdd	vmdk: Fix overflow if l1_size is 0x20000000 Richard Jones caught this bug with afl fuzzer. In fact, that's the only possible value to overflow (extent->l1_size = 0x20000000) l1_size: l1_size = extent->l1_size * sizeof(long) => 0x80000000; g_try_malloc returns NULL because l1_size is interpreted as negative during type casting from 'int' to 'gsize', which yields a enormous value. Hence, by coincidence, we get a "not too bad" behavior: qemu-img: Could not open '/tmp/afl6.img': Could not open '/tmp/afl6.img': Cannot allocate memory Values larger than 0x20000000 will be refused by the validation in vmdk_add_extent. Values smaller than 0x20000000 will not overflow l1_size. Cc: qemu-stable@nongnu.org Reported-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Tested-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-05-22 17:08:01 +02:00
Fam Zheng	5e82a31eb9	vmdk: Fix next_cluster_sector for compressed write This fixes the bug introduced by commit `c6ac36e` (vmdk: Optimize cluster allocation). Sometimes, write_len could be larger than cluster size, because it contains both data and marker. We must advance next_cluster_sector in this case, otherwise the image gets corrupted. Cc: qemu-stable@nongnu.org Reported-by: Antoni Villalonga <qemu-list@friki.cat> Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-05-22 17:08:00 +02:00
Christoph Hellwig	aacd5650c6	nvme: support NVME_VOLATILE_WRITE_CACHE feature The SCSI emulation in the Linux NVMe driver really wants to know if a device has a volatile write cache. Given that qemu has moved away from a model where we report the backing store WCE bit to one where the WCE bit is supposed to be part of the migratable guest-visible state we always return 1 here. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-05-22 17:08:00 +02:00
Kevin Wolf	ecbda7a225	qcow2: Flush pending discards before allocating cluster Before a freed cluster can be reused, pending discards for this cluster must be processed. The original assumption was that this was not a problem because discards are only cached during discard/write zeroes operations, which are synchronous so that no concurrent write requests can cause cluster allocations. However, the discard/write zeroes operation itself can allocate a new L2 table (and it has to in order to put zero flags there), so make sure we can cope with the situation. This fixes https://bugs.launchpad.net/bugs/1349972. Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2015-05-22 17:08:00 +02:00
Peter Maydell	8b6db32a4e	-----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJVXvBFAAoJEJykq7OBq3PIBXwH/1b3aLGQnkFn7h2RylhERDts 9YLApEyF5nHX1hi2LSgXC6AuuZ0S4vV3kwiqYrrsnz7e8Yn5tJ3SEfuS8dkgtHb4 dyB/Fy/ZuOQ/DUcWXUD1gpxCAlJXf8Kk4Z7SB9OAH0TDc9KExAe3hmrfYE/eRvMd Yxh9W3FAfN8yFC+wOtfsua+066zd6vH0fCi2sn7PHElLZGST3jtWN/I2Rdd3j3rp cffTHsJDT8e3iWao7qdWtYDIYDvECbROJorevb7CaMcxeR67hpI1TIvckZ67/1Ks p42iY7zVpM4goVe3GvC8lWdLy2hbUS+NWggtJi7+nlpV0SNQzDBfr2nnKZB3o0s= =+MDM -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging # gpg: Signature made Fri May 22 10:00:53 2015 BST using RSA key ID 81AB73C8 # gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" # gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" * remotes/stefanha/tags/block-pull-request: (38 commits) block: get_block_status: use "else" when testing the opposite condition qemu-iotests: Test unaligned sub-block zero write block: Fix NULL deference for unaligned write if qiov is NULL Revert "block: Fix unaligned zero write" block: align bounce buffers to page block: minimal bounce buffer alignment block: return EPERM on writes or discards to read-only devices configure: Add workaround for ccache and clang configure: silence glib unknown attribute __alloc_size__ configure: factor out supported flag check configure: handle clang -nopie argument warning block/parallels: improve image writing performance further block/parallels: optimize linear image expansion block/parallels: add prealloc-mode and prealloc-size open paramemets block/parallels: delay writing to BAT till bdrv_co_flush_to_os block/parallels: create bat_entry_off helper block/parallels: improve image reading performance iotests, parallels: check for incorrectly closed image in tests block/parallels: implement incorrect close detection block/parallels: implement parallels_check method of block driver ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2015-05-22 13:25:40 +01:00
Peter Maydell	f5790c3bc8	Revert "target-alpha: Add vector implementation for CMPBGE" This reverts commit `32ad48abd7`. Unfortunately the SSE2 code here fails to compile on some versions of gcc: target-alpha/int_helper.c:77:24: error: invalid operands to binary >= (have '__vector(16) unsigned char' and '__vector(16) unsigned char') Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2015-05-22 12:30:13 +01:00
Peter Maydell	27e1259a69	Rewrite fp exceptions -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJVXhd4AAoJEK0ScMxN0CebJ0oIAJFrQw67jOZzEOQmQBCMFRNi DeDsVbQhK9e2iiQGdYL+r6XQQ4uG+MS5CI8FDi1DdJrRxzDaHD5pDWGx+flAVwDz jqanuVIap7YfkL/wFtdRyhu7y7tOD9mXb9Hsy968Ga+MHO1q548ctnbeWHHKtRND JkGZtCCjh7j0AMMTdyTq0dvKnSAAgnh4AK1Pni1tdERHmbzQQAuixzgI8jVCQIrE FOySuaG6KX1TkNsg2ow+4XrRnc1M8ZhMPFqvg+JVBCbiXBmSEq0cqp2pg9zknzD4 5ODpqoqwXgMeLxKXMIRGqIqrbJe+J8ypZFUxPjdlLOoFvUglqmk5qyfufvvp3yA= =jbie -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/rth/tags/pull-axp-20150521' into staging Rewrite fp exceptions # gpg: Signature made Thu May 21 18:35:52 2015 BST using RSA key ID 4DD0279B # gpg: Good signature from "Richard Henderson <rth7680@gmail.com>" # gpg: aka "Richard Henderson <rth@redhat.com>" # gpg: aka "Richard Henderson <rth@twiddle.net>" * remotes/rth/tags/pull-axp-20150521: target-alpha: Add vector implementation for CMPBGE target-alpha: Rewrite helper_zapnot target-alpha: Raise IOV from CVTQL target-alpha: Suppress underflow from CVTTQ if DNZ target-alpha: Raise EXC_M_INV properly for fp inputs target-alpha: Disallow literal operand to 1C.30 to 1C.37 target-alpha: Implement WH64EN target-alpha: Fix integer overflow checking insns target-alpha: Fix cvttq vs inf target-alpha: Fix cvttq vs large integers target-alpha: Raise IOV from CVTTQ target-alpha: Set EXC_M_SWC for exceptions from /S insns target-alpha: Set fpcr_exc_status even for disabled exceptions target-alpha: Tidy FPCR representation target-alpha: Set PC correctly for floating-point exceptions target-alpha: Forget installed round mode after MT_FPCR target-alpha: Rename floating-point subroutines target-alpha: Move VAX helpers to a new file Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2015-05-22 10:06:33 +01:00
Paolo Bonzini	a53f1a95f9	block: get_block_status: use "else" when testing the opposite condition A bit of Boolean algebra (and common sense) tells us that the second "if" here is looking for blocks that are not allocated. This is the opposite of the "if" that sets BDRV_BLOCK_ALLOCATED, and thus it can use an "else". Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 1431599702-10431-1-git-send-email-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:33 +01:00
Fam Zheng	ab53c44718	qemu-iotests: Test unaligned sub-block zero write Test zero write in byte range 512~1024 for 4k alignment. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-id: 1431522721-3266-4-git-send-email-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:33 +01:00
Fam Zheng	9eeb6dd1b2	block: Fix NULL deference for unaligned write if qiov is NULL For zero write, callers pass in NULL qiov (qemu-io "write -z" or scsi-disk "write same"). Commit `fc3959e466` fixed bdrv_co_write_zeroes which is the common case for this bug, but it still exists in bdrv_aio_write_zeroes. A simpler fix would be in bdrv_co_do_pwritev which is the NULL dereference point and covers both cases. So don't access it in bdrv_co_do_pwritev in this case, use three aligned writes. [Initialize ret to 0 in bdrv_co_do_zero_pwritev() to avoid uninitialized variable warning with gcc 4.9.2. --Stefan] Signed-off-by: Fam Zheng <famz@redhat.com> Message-id: 1431522721-3266-3-git-send-email-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:33 +01:00
Fam Zheng	d01c07f222	Revert "block: Fix unaligned zero write" This reverts commit `fc3959e466`. The core write code already handles the case, so remove this duplication. Because commit `61007b316` moved the touched code from block.c to block/io.c, the change is manually reverted. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-id: 1431522721-3266-2-git-send-email-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:33 +01:00
Denis V. Lunev	459b4e6612	block: align bounce buffers to page The following sequence int fd = open(argv[1], O_RDWR \| O_CREAT \| O_DIRECT, 0644); for (i = 0; i < 100000; i++) write(fd, buf, 4096); performs 5% better if buf is aligned to 4096 bytes. The difference is quite reliable. On the other hand we do not want at the moment to enforce bounce buffering if guest request is aligned to 512 bytes. The patch changes default bounce buffer optimal alignment to MAX(page size, 4k). 4k is chosen as maximal known sector size on real HDD. The justification of the performance improve is quite interesting. From the kernel point of view each request to the disk was split by two. This could be seen by blktrace like this: 9,0 11 1 0.000000000 11151 Q WS 312737792 + 1023 [qemu-img] 9,0 11 2 0.000007938 11151 Q WS 312738815 + 8 [qemu-img] 9,0 11 3 0.000030735 11151 Q WS 312738823 + 1016 [qemu-img] 9,0 11 4 0.000032482 11151 Q WS 312739839 + 8 [qemu-img] 9,0 11 5 0.000041379 11151 Q WS 312739847 + 1016 [qemu-img] 9,0 11 6 0.000042818 11151 Q WS 312740863 + 8 [qemu-img] 9,0 11 7 0.000051236 11151 Q WS 312740871 + 1017 [qemu-img] 9,0 5 1 0.169071519 11151 Q WS 312741888 + 1023 [qemu-img] After the patch the pattern becomes normal: 9,0 6 1 0.000000000 12422 Q WS 314834944 + 1024 [qemu-img] 9,0 6 2 0.000038527 12422 Q WS 314835968 + 1024 [qemu-img] 9,0 6 3 0.000072849 12422 Q WS 314836992 + 1024 [qemu-img] 9,0 6 4 0.000106276 12422 Q WS 314838016 + 1024 [qemu-img] and the amount of requests sent to disk (could be calculated counting number of lines in the output of blktrace) is reduced about 2 times. Both qemu-img and qemu-io are affected while qemu-kvm is not. The guest does his job well and real requests comes properly aligned (to page). Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-id: 1431441056-26198-3-git-send-email-den@openvz.org CC: Paolo Bonzini <pbonzini@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:33 +01:00
Denis V. Lunev	4196d2f030	block: minimal bounce buffer alignment The patch introduces new concept: minimal memory alignment for bounce buffers. Original so called "optimal" value is actually minimal required value for aligment. It should be used for validation that the IOVec is properly aligned and bounce buffer is not required. Though, from the performance point of view, it would be better if bounce buffer or IOVec allocated by QEMU will be aligned stricter. The patch does not change any alignment value yet. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-id: 1431441056-26198-2-git-send-email-den@openvz.org CC: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:33 +01:00
Paolo Bonzini	eaf5fe2dd4	block: return EPERM on writes or discards to read-only devices This is the behavior in the operating system, for example Linux's blkdev_write_iter has the following: if (bdev_read_only(I_BDEV(bd_inode))) return -EPERM; This does not apply to opening a device for read/write, when the device only supports read-only operation. In this case any of EACCES, EPERM or EROFS is acceptable depending on why writing is not possible. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1431013548-22492-1-git-send-email-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:33 +01:00
John Snow	fd0e60530f	configure: Add workaround for ccache and clang Test if ccache is interfering with semantic analysis of macros, disable its habit of trying to compile already pre-processed versions of code if so. ccache attempts to save time by compiling pre-processed versions of code, but this disturbs clang's static analysis enough to produce false positives. ccache allows us to disable this feature, opting instead to compile the original version instead of its preprocessed version. This makes ccache much slower for cache misses, but at least it becomes usable with QEMU/clang. This workaround only activates for users using ccache AND clang, and only if their configuration is observed to be producing warnings. You may need to clear your ccache for builds started without -Werror, as those may continue to produce warnings from the cache. Thanks to Peter Eisentraut for his writeup on the issue: http://peter.eisentraut.org/blog/2014/12/01/ccache-and-clang-part-3/ Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1427324259-1481-5-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:33 +01:00
John Snow	bbbf2e04e5	configure: silence glib unknown attribute __alloc_size__ The glib headers use GCC attributes. Unfortunately the __GNUC__ and __GNUC_MINOR__ version macros are also defined by clang, but clang doesn't support the same attributes as GCC. clang 3.5.0 does not support the __alloc_size__ attribute: `c047507a9a` The following warning is produced: gstrfuncs.h:257:44: warning: unknown attribute '__alloc_size__' ignored [-Wunknown-attributes] G_GNUC_MALLOC G_GNUC_ALLOC_SIZE(2); gmacros.h:67:45: note: expanded from macro 'G_GNUC_ALLOC_SIZE' #define G_GNUC_ALLOC_SIZE(x) __attribute__((__alloc_size__(x))) This patch checks whether glib headers cause warnings and disables -Wunknown-attributes if it is able to. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1427324259-1481-4-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:33 +01:00
John Snow	93b2586922	configure: factor out supported flag check Factor out the function that checks if a compiler flag is supported or not. Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1427324259-1481-3-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:32 +01:00
Stefan Hajnoczi	e4a7b344df	configure: handle clang -nopie argument warning gcc 4.9.2 treats -nopie as an error: cc: error: unrecognized command line option ‘-nopie’ clang 3.5.0 treats -nopie as a warning: clang: warning: argument unused during compilation: '-nopie' The causes ./configure to fail with clang: ERROR: configure test passed without -Werror but failed with -Werror. Make the -nopie test use -Werror so that compile_prog works for both gcc and clang. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1427324259-1481-2-git-send-email-jsnow@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:32 +01:00
Denis V. Lunev	ddd2ef2ce8	block/parallels: improve image writing performance further Try to perform IO for the biggest continuous block possible. All blocks abscent in the image are accounted in the same type and preallocation is made for all of them at once. The performance for sequential write is increased from 200 Mb/sec to 235 Mb/sec on my SSD HDD. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Roman Kagan <rkagan@parallels.com> Signed-off-by: Roman Kagan <rkagan@parallels.com> Message-id: 1430207220-24458-28-git-send-email-den@openvz.org CC: Kevin Wolf <kwolf@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:32 +01:00
Denis V. Lunev	19f5dc1591	block/parallels: optimize linear image expansion Plain image expansion spends a lot of time to update image file size. This seriously affects the performance. The following simple test qemu_img create -f parallels -o cluster_size=64k ./1.hds 64G qemu_io -n -c "write -P 0x11 0 1024M" ./1.hds could be improved if the format driver will pre-allocate some space in the image file with a reasonable chunk. This patch preallocates 128 Mb using bdrv_write_zeroes, which should normally use fallocate() call inside. Fallback to older truncate() could be used as a fallback using image open options thanks to the previous patch. The benefit is around 15%. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Roman Karan <rkagan@parallels.com> Signed-off-by: Roman Kagan <rkagan@parallels.com> Message-id: 1430207220-24458-27-git-send-email-den@openvz.org CC: Kevin Wolf <kwolf@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:32 +01:00
Denis V. Lunev	d61790112f	block/parallels: add prealloc-mode and prealloc-size open paramemets This is preparational commit for tweaks in Parallels image expansion. The idea is that enlarge via truncate by one data block is slow. It would be much better to use fallocate via bdrv_write_zeroes and expand by some significant amount at once. Original idea with sequential file writing to the end of the file without fallocate/truncate would be slower than this approach if the image is expanded with several operations: - each image expanding means file metadata update, i.e. filesystem journal write. Truncate/write to newly truncated space update file metadata twice thus truncate removal helps. With fallocate call inside bdrv_write_zeroes file metadata is updated only once and this should happen infrequently thus this approach is the best one for the image expansion - tail writes are ordered, i.e. the guest IO queue could not be sent immediately to the host introducing additional IO delays This patch just adds proper parameters into BDRVParallelsState and performs options parsing in parallels_open. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Roman Kagan <rkagan@parallels.com> Signed-off-by: Roman Kagan <rkagan@parallels.com> Message-id: 1430207220-24458-26-git-send-email-den@openvz.org CC: Roman Kagan <rkagan@parallels.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:32 +01:00
Denis V. Lunev	0d31c7c200	block/parallels: delay writing to BAT till bdrv_co_flush_to_os The idea is that we do not need to immediately sync BAT to the image as from the guest point of view there is a possibility that IO is lost even in the physical controller until flush command was finished. bdrv_co_flush_to_os is exactly the right place for this purpose. Technically the patch uses loaded BAT data as a cache and performs actual on-disk metadata updates in parallels_co_flush_to_os callback. This patch speed ups qemu-img create -f parallels -o cluster_size=64k ./1.hds 64G qemu-io -f parallels -c "write -P 0x11 0 1024k" 1.hds writing from 50-60 Mb/sec to 80-90 Mb/sec on rotational media and from 160 Mb/sec to 190 Mb/sec on SSD disk. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Roman Kagan <rkagan@parallels.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Roman Kagan <rkagan@parallels.com> Message-id: 1430207220-24458-25-git-send-email-den@openvz.org CC: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:32 +01:00
Denis V. Lunev	2d68e22e94	block/parallels: create bat_entry_off helper calculate offset of the BAT entry in the image file. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Roman Kagan <rkagan@parallels.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Roman Kagan <rkagan@parallels.com> Message-id: 1430207220-24458-24-git-send-email-den@openvz.org CC: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:32 +01:00
Denis V. Lunev	6953d92078	block/parallels: improve image reading performance Try to perform IO for the biggest continuous block possible. The performance for sequential read is increased from 220 Mb/sec to 360 Mb/sec for continous image on my SSD HDD. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Roman Kagan <rkagan@parallels.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Roman Kagan <rkagan@parallels.com> Message-id: 1430207220-24458-23-git-send-email-den@openvz.org CC: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:32 +01:00
Denis V. Lunev	a6be831e99	iotests, parallels: check for incorrectly closed image in tests Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Roman Kagan <rkagan@parallels.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Roman Kagan <rkagan@parallels.com> Message-id: 1430207220-24458-22-git-send-email-den@openvz.org CC: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:32 +01:00
Denis V. Lunev	6dd6b9f144	block/parallels: implement incorrect close detection The software driver must set inuse field in Parallels header to 0x746F6E59 when the image is opened in read-write mode. The presence of this magic in the header on open forces image consistency check. There is an unfortunate trick here. We can not check for inuse in parallels_check as this will happen too late. It is possible to do that for simple check, but during the fix this would always report an error as the image was opened in BDRV_O_RDWR mode. Thus we save the flag in BDRVParallelsState for this. On the other hand, nothing should be done to clear inuse in parallels_check. Generic close will do the job right. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Roman Kagan <rkagan@parallels.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Roman Kagan <rkagan@parallels.com> Message-id: 1430207220-24458-21-git-send-email-den@openvz.org CC: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:32 +01:00
Denis V. Lunev	49ad646731	block/parallels: implement parallels_check method of block driver The check is very simple at the moment. It calculates necessary stats and fix only the following errors: - space leak at the end of the image. This would happens due to preallocation - clusters outside the image are zeroed. Nothing else could be done here Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Roman Kagan <rkagan@parallels.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Roman Kagan <rkagan@parallels.com> Message-id: 1430207220-24458-20-git-send-email-den@openvz.org CC: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:32 +01:00
Denis V. Lunev	23d6bd3bd1	block/parallels: move parallels_open/probe to the very end of the file This will help to avoid forward declarations for upcoming parallels_check Some very obvious formatting fixes were made to the moved code to make checkpatch happy. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Roman Kagan <rkagan@parallels.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Roman Kagan <rkagan@parallels.com> Message-id: 1430207220-24458-19-git-send-email-den@openvz.org CC: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:32 +01:00
Denis V. Lunev	9eae9cca95	block/parallels: read parallels image header and BAT into single buffer This metadata cache would allow to properly batch BAT updates to disk in next patches. These updates will be properly aligned to avoid read-modify-write transactions on block level. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Roman Kagan <rkagan@parallels.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Roman Kagan <rkagan@parallels.com> Message-id: 1430207220-24458-18-git-send-email-den@openvz.org CC: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:32 +01:00
Denis V. Lunev	dd97cdc064	block/parallels: keep BAT bitmap data in little endian in memory This will allow to use this data as buffer to BAT update directly without any intermediate buffers. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Roman Kagan <rkagan@parallels.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Roman Kagan <rkagan@parallels.com> Message-id: 1430207220-24458-17-git-send-email-den@openvz.org CC: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:32 +01:00
Denis V. Lunev	555cc9d9fc	block/parallels: create bat2sect helper deduplicate copy/paste arithmetcs Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Roman Kagan <rkagan@parallels.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Roman Kagan <rkagan@parallels.com> Message-id: 1430207220-24458-16-git-send-email-den@openvz.org CC: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:32 +01:00
Denis V. Lunev	369f7de9d5	block/parallels: rename catalog_ names to bat_ BAT means 'block allocation table'. Thus this name is clean and shorter on writing. Some obvious formatting fixes in the old code were made to make checkpatch happy. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Roman Kagan <rkagan@parallels.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Roman Kagan <rkagan@parallels.com> Message-id: 1430207220-24458-15-git-send-email-den@openvz.org CC: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:32 +01:00
Denis V. Lunev	cc5690f20f	parallels: change copyright information in the image header Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Roman Kagan <rkagan@parallels.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Roman Kagan <rkagan@parallels.com> Message-id: 1430207220-24458-14-git-send-email-den@openvz.org CC: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-05-22 09:37:32 +01:00

1 2 3 4 5 ...

38818 Commits All Branches Search

38818 Commits

All Branches