Commit Graph

38823 Commits

Author SHA1 Message Date
John Snow
91d0374a7f libqos/ahci: Swap memread/write with bufread/write
Where it makes sense, use the new faster primitives.
For generally small reads/writes such as for the PRDT
and FIS packets, stick with the more wasteful but
easier to debug memread/memwrite.

For ahci-test (before migration tests):
With this patch:
real    0m3.675s
user    0m2.582s
sys     0m1.718s

Without any qtest protocol improvements:
real    0m14.171s
user    0m12.072s
sys     0m12.527s

Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1430864578-22072-6-git-send-email-jsnow@redhat.com
2015-05-22 15:58:22 -04:00
John Snow
4d00796364 qtest: add memset to qtest protocol
Previously, memset was just a frontend to write() and only
stupidly sent the pattern many times across the wire.

Let's not discuss who stupidly wrote it like that in the first place.
(Hint: It was me.)

Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1430864578-22072-4-git-send-email-jsnow@redhat.com
2015-05-22 15:58:22 -04:00
John Snow
7a6a740d8d qtest: Add base64 encoded read/write
For larger pieces of data that won't need to be debugged and
viewing the hex nibbles is unlikely to be useful, we can encode
data using base64 instead of encoding each byte as %02x, which
leads to some space savings and faster reads/writes.

For now, the default is left as hex nibbles in memwrite() and memread().
For the purposes of making qtest io easier to read and debug, some
callers may want to specify using the old encoding format for small
patches of data where the savings from base64 wouldn't be that profound.

memwrite/memread use a data encoding that takes 2x the size of the original
buffer, but base64 uses "only" (4/3)x, so for larger buffers we can save a
decent amount of time and space.

Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1430864578-22072-3-git-send-email-jsnow@redhat.com
2015-05-22 15:58:22 -04:00
John Snow
332cc7e9b3 qtest: allow arbitrarily long sends
qtest currently has a static buffer of size 1024 that if we
overflow, ignores the additional data silently which leads
to hangs or stream failures.

Use glib's string facilities to allow arbitrarily long data,
but split this off into a new function, qtest_sendf.

Static data can still be sent using qtest_send, which avoids
the malloc/copy overhead.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1430864578-22072-2-git-send-email-jsnow@redhat.com
2015-05-22 15:58:22 -04:00
John Snow
5d1cf0917b qtest/ahci: add migrate halted dma test
Test migrating a halted DMA transaction.
Resume, then test data integrity.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1430417242-11859-10-git-send-email-jsnow@redhat.com
2015-05-22 15:58:22 -04:00
John Snow
189d1b6126 qtest/ahci: add halted dma test
If we're going to test the migration of halted DMA jobs,
we should probably check to make sure we can resume them
locally as a first step.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1430417242-11859-9-git-send-email-jsnow@redhat.com
2015-05-22 15:58:22 -04:00
John Snow
a606ce50c2 qtest/ahci: add flush migrate test
Use blkdebug to inject an error on first flush, then attempt to flush
on the first guest. When the error halts the VM, migrate to the
second VM, and attempt to resume the command.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1430417242-11859-8-git-send-email-jsnow@redhat.com
2015-05-22 15:58:22 -04:00
John Snow
88e21f9485 qtest/ahci: add migrate dma test
Write to one guest, migrate, and then read from the other.
adjust ahci_io to clear any buffers it creates, so that we
can use ahci_io safely on both guests knowing we are using
empty buffers and not accidentally re-using data.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1430417242-11859-7-git-send-email-jsnow@redhat.com
2015-05-22 15:58:22 -04:00
John Snow
278128ab06 qtest/ahci: Add migration test
Notes:

 * The migration is performed on QOSState objects.

 * The migration is performed in such a way that it does not assume
   consistency between the allocators attached to each. That is to say,
   you can use each QOSState object completely independently and then at
   an arbitrary point decide to migrate, and the destination object will
   now be consistent with the memory within the source guest. The source
   object that was migrated from will have a completely blank allocator.

ahci-test.c:
 - verify_state is added
 - ahci_migrate is added as a frontend to migrate
 - test_migrate_sanity test case is added.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1430417242-11859-6-git-send-email-jsnow@redhat.com
2015-05-22 15:58:22 -04:00
John Snow
04329029a8 ich9/ahci: Enable Migration
Lift the flag preventing the migration of the ICH9/AHCI devices.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1430417242-11859-5-git-send-email-jsnow@redhat.com
2015-05-22 15:58:22 -04:00
John Snow
085248ae87 libqos: Add migration helpers
libqos.c:
    -set_context for addressing which commands go where
    -migrate performs the actual migration

malloc.c:
    - Structure of the allocator is adjusted slightly with
      a second-tier malloc to make swapping around the allocators
      easy when we "migrate" the lists from the source to the destination.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1430417242-11859-4-git-send-email-jsnow@redhat.com
2015-05-22 15:58:22 -04:00
John Snow
455e861cc6 libqos/ahci: Fix sector set method
|| probably does not mean the same thing as |.

Additionally, allow users to submit a prd_size of 0
to indicate that they'd like to continue using the default.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1430417242-11859-3-git-send-email-jsnow@redhat.com
2015-05-22 15:58:22 -04:00
John Snow
008b6e123f libqos/ahci: Add halted command helpers
Sometimes we want a command to halt the VM instead
of complete successfully, so it'd be nice to let the
libqos/ahci functions cope with such scenarios.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1430417242-11859-2-git-send-email-jsnow@redhat.com
2015-05-22 15:58:22 -04:00
John Snow
62754b1571 glib: remove stale compat functions
Since we're bumping the version to 2.22+,
remove the now-stale compat functions.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1431469140-22208-2-git-send-email-jsnow@redhat.com
2015-05-22 15:58:06 -04:00
John Snow
f40685c62b configure: require glib 2.22
This provides g_ptr_array_new_with_free_func, as well as a few
other functions that we've been hacking around in glib-compat.h.
Cleaning up the compatibility headers will come later.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1431469140-22208-2-git-send-email-jsnow@redhat.com
2015-05-22 14:13:58 -04:00
Peter Maydell
bb2fa17f18 TriCore v1.6.1 ISA and missing v1.6 instructions
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCgAGBQJVX0UVAAoJEArSxjlracoUHboP/1Ajq0KCGYuyTeOLcZbfSqgR
 frvtEl4Js2VqJ69p3XipZ8j16Co0U+gDm6Hu7/wytGXOSAplXwvlDe+556778MpV
 xVdsXoeh5X5xPMtoU3qEy6sK4bLxxmjAJtj4Ij1NojqRbRr1ZuvxxaUgB5ud1eWE
 Th/HQEhquAiZTfSilqu+xAMeFfRrpPlqIKFhHHOcctEZw03v0kTEfBMXw0AR5UW2
 /ytvoBtiWnME2WYIkP6odWh96RthqigKhrCPbfnTFjD5amYeeeI0R8DeRKiDEWmY
 eEyddC5UcODIBfioj+aUbCf2jzu7LClu4MkyiFe0yK/SU2/Yz6v7GdgUhKsCoGAw
 noYbKOgilQtMBjs6hKbfxVIMmtkfz9dJ2y62H2PUzjCnFPhluZURE96T3p/e8HUo
 z3G+bTPZ69+7r4GxFMq8HpQE8hzC5xTZ8jA3bT+/8gYTyN8c0si4EF2XRBFhUm3H
 cjqSg8HTmoCQAPRkF29F1BrDyH/oPji2S5xLvtI4tG/Kk13ZhOcVxKky3wcK8Onf
 srCHdEsIThlMUCnAmphfqSCjliPXYmnZoGymeLfLRISPZrB4PmctTSbYdjdQU1kN
 ac5xa0NsL1wmFPLl2yk7TNFFpw0v1sH0SfpdxAjjVnj8B4jNRK6hleGFjpfjzpEE
 gl1/hij9FtqJPrx1Zgkz
 =NjJ8
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bkoppelmann/tags/pull-tricore-20150522' into staging

TriCore v1.6.1 ISA and missing v1.6 instructions

# gpg: Signature made Fri May 22 16:02:45 2015 BST using RSA key ID 6B69CA14
# gpg: Good signature from "Bastian Koppelmann <kbastian@mail.uni-paderborn.de>"

* remotes/bkoppelmann/tags/pull-tricore-20150522:
  target-tricore: add RR_DIV and RR_DIV_U instructions of the v1.6 ISA
  target-tricore: add FRET instructions of the v1.6 ISA
  target-tricore: add FCALL instructions of the v1.6 ISA
  target-tricore: add SYS_RESTORE instruction of the v1.6 ISA
  target-tricore: add RR_CRC32 instruction of the v1.6.1 ISA
  target-tricore: add SWAPMSK instructions of the v1.6.1 ISA
  target-tricore: add CMPSWP instructions of the v1.6.1 ISA
  target-tricore: Add SRC_MOV_E instruction of the v1.6 ISA
  target-tricore: introduce ISA v1.6.1 feature
  target-tricore: Add ISA v1.3.1 cpu and fix tc1796 to using v1.3

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-05-22 16:22:42 +01:00
Bastian Koppelmann
9371557115 target-tricore: add RR_DIV and RR_DIV_U instructions of the v1.6 ISA
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2015-05-22 17:02:34 +02:00
Bastian Koppelmann
0e045f43c4 target-tricore: add FRET instructions of the v1.6 ISA
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2015-05-22 17:02:34 +02:00
Bastian Koppelmann
9e14a7b24f target-tricore: add FCALL instructions of the v1.6 ISA
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2015-05-22 17:02:34 +02:00
Bastian Koppelmann
bc3551c433 target-tricore: add SYS_RESTORE instruction of the v1.6 ISA
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2015-05-22 17:02:34 +02:00
Bastian Koppelmann
e5c96c82bc target-tricore: add RR_CRC32 instruction of the v1.6.1 ISA
This instruction was introduced by the new Aurix platform.

Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2015-05-22 17:02:33 +02:00
Bastian Koppelmann
ddd8cebe31 target-tricore: add SWAPMSK instructions of the v1.6.1 ISA
Those instruction were introduced in the new Aurix platform.

Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2015-05-22 17:02:33 +02:00
Bastian Koppelmann
62872ebc38 target-tricore: add CMPSWP instructions of the v1.6.1 ISA
Those instruction were introduced in the new Aurix platform.

Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2015-05-22 17:02:33 +02:00
Bastian Koppelmann
fcecf12684 target-tricore: Add SRC_MOV_E instruction of the v1.6 ISA
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2015-05-22 17:02:33 +02:00
Bastian Koppelmann
6d2afc8a5e target-tricore: introduce ISA v1.6.1 feature
The aurix platform contains of several different cpu models and uses
the 1.6.1 ISA. This patch changes the generic aurix model to the more
specific tc27x cpu model and sets specific features.

Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2015-05-22 17:02:33 +02:00
Bastian Koppelmann
fd5ecf31d4 target-tricore: Add ISA v1.3.1 cpu and fix tc1796 to using v1.3
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2015-05-22 17:02:33 +02:00
Peter Maydell
8b6db32a4e -----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
 
 iQEcBAABAgAGBQJVXvBFAAoJEJykq7OBq3PIBXwH/1b3aLGQnkFn7h2RylhERDts
 9YLApEyF5nHX1hi2LSgXC6AuuZ0S4vV3kwiqYrrsnz7e8Yn5tJ3SEfuS8dkgtHb4
 dyB/Fy/ZuOQ/DUcWXUD1gpxCAlJXf8Kk4Z7SB9OAH0TDc9KExAe3hmrfYE/eRvMd
 Yxh9W3FAfN8yFC+wOtfsua+066zd6vH0fCi2sn7PHElLZGST3jtWN/I2Rdd3j3rp
 cffTHsJDT8e3iWao7qdWtYDIYDvECbROJorevb7CaMcxeR67hpI1TIvckZ67/1Ks
 p42iY7zVpM4goVe3GvC8lWdLy2hbUS+NWggtJi7+nlpV0SNQzDBfr2nnKZB3o0s=
 =+MDM
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging

# gpg: Signature made Fri May 22 10:00:53 2015 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/block-pull-request: (38 commits)
  block: get_block_status: use "else" when testing the opposite condition
  qemu-iotests: Test unaligned sub-block zero write
  block: Fix NULL deference for unaligned write if qiov is NULL
  Revert "block: Fix unaligned zero write"
  block: align bounce buffers to page
  block: minimal bounce buffer alignment
  block: return EPERM on writes or discards to read-only devices
  configure: Add workaround for ccache and clang
  configure: silence glib unknown attribute __alloc_size__
  configure: factor out supported flag check
  configure: handle clang -nopie argument warning
  block/parallels: improve image writing performance further
  block/parallels: optimize linear image expansion
  block/parallels: add prealloc-mode and prealloc-size open paramemets
  block/parallels: delay writing to BAT till bdrv_co_flush_to_os
  block/parallels: create bat_entry_off helper
  block/parallels: improve image reading performance
  iotests, parallels: check for incorrectly closed image in tests
  block/parallels: implement incorrect close detection
  block/parallels: implement parallels_check method of block driver
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-05-22 13:25:40 +01:00
Peter Maydell
f5790c3bc8 Revert "target-alpha: Add vector implementation for CMPBGE"
This reverts commit 32ad48abd7.

Unfortunately the SSE2 code here fails to compile on some versions
of gcc:
 target-alpha/int_helper.c:77:24: error: invalid operands to binary >=
 (have '__vector(16) unsigned char' and '__vector(16) unsigned char')

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-05-22 12:30:13 +01:00
Peter Maydell
27e1259a69 Rewrite fp exceptions
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJVXhd4AAoJEK0ScMxN0CebJ0oIAJFrQw67jOZzEOQmQBCMFRNi
 DeDsVbQhK9e2iiQGdYL+r6XQQ4uG+MS5CI8FDi1DdJrRxzDaHD5pDWGx+flAVwDz
 jqanuVIap7YfkL/wFtdRyhu7y7tOD9mXb9Hsy968Ga+MHO1q548ctnbeWHHKtRND
 JkGZtCCjh7j0AMMTdyTq0dvKnSAAgnh4AK1Pni1tdERHmbzQQAuixzgI8jVCQIrE
 FOySuaG6KX1TkNsg2ow+4XrRnc1M8ZhMPFqvg+JVBCbiXBmSEq0cqp2pg9zknzD4
 5ODpqoqwXgMeLxKXMIRGqIqrbJe+J8ypZFUxPjdlLOoFvUglqmk5qyfufvvp3yA=
 =jbie
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/rth/tags/pull-axp-20150521' into staging

Rewrite fp exceptions

# gpg: Signature made Thu May 21 18:35:52 2015 BST using RSA key ID 4DD0279B
# gpg: Good signature from "Richard Henderson <rth7680@gmail.com>"
# gpg:                 aka "Richard Henderson <rth@redhat.com>"
# gpg:                 aka "Richard Henderson <rth@twiddle.net>"

* remotes/rth/tags/pull-axp-20150521:
  target-alpha: Add vector implementation for CMPBGE
  target-alpha: Rewrite helper_zapnot
  target-alpha: Raise IOV from CVTQL
  target-alpha: Suppress underflow from CVTTQ if DNZ
  target-alpha: Raise EXC_M_INV properly for fp inputs
  target-alpha: Disallow literal operand to 1C.30 to 1C.37
  target-alpha: Implement WH64EN
  target-alpha: Fix integer overflow checking insns
  target-alpha: Fix cvttq vs inf
  target-alpha: Fix cvttq vs large integers
  target-alpha: Raise IOV from CVTTQ
  target-alpha: Set EXC_M_SWC for exceptions from /S insns
  target-alpha: Set fpcr_exc_status even for disabled exceptions
  target-alpha: Tidy FPCR representation
  target-alpha: Set PC correctly for floating-point exceptions
  target-alpha: Forget installed round mode after MT_FPCR
  target-alpha: Rename floating-point subroutines
  target-alpha: Move VAX helpers to a new file

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-05-22 10:06:33 +01:00
Paolo Bonzini
a53f1a95f9 block: get_block_status: use "else" when testing the opposite condition
A bit of Boolean algebra (and common sense) tells us that the
second "if" here is looking for blocks that are not allocated.
This is the opposite of the "if" that sets BDRV_BLOCK_ALLOCATED,
and thus it can use an "else".

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1431599702-10431-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-05-22 09:37:33 +01:00
Fam Zheng
ab53c44718 qemu-iotests: Test unaligned sub-block zero write
Test zero write in byte range 512~1024 for 4k alignment.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1431522721-3266-4-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-05-22 09:37:33 +01:00
Fam Zheng
9eeb6dd1b2 block: Fix NULL deference for unaligned write if qiov is NULL
For zero write, callers pass in NULL qiov (qemu-io "write -z" or
scsi-disk "write same").

Commit fc3959e466 fixed bdrv_co_write_zeroes which is the common case
for this bug, but it still exists in bdrv_aio_write_zeroes. A simpler
fix would be in bdrv_co_do_pwritev which is the NULL dereference point
and covers both cases.

So don't access it in bdrv_co_do_pwritev in this case, use three aligned
writes.

[Initialize ret to 0 in bdrv_co_do_zero_pwritev() to avoid uninitialized
variable warning with gcc 4.9.2.
--Stefan]

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1431522721-3266-3-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-05-22 09:37:33 +01:00
Fam Zheng
d01c07f222 Revert "block: Fix unaligned zero write"
This reverts commit fc3959e466.

The core write code already handles the case, so remove this
duplication.

Because commit 61007b316 moved the touched code from block.c to
block/io.c, the change is manually reverted.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1431522721-3266-2-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-05-22 09:37:33 +01:00
Denis V. Lunev
459b4e6612 block: align bounce buffers to page
The following sequence
    int fd = open(argv[1], O_RDWR | O_CREAT | O_DIRECT, 0644);
    for (i = 0; i < 100000; i++)
            write(fd, buf, 4096);
performs 5% better if buf is aligned to 4096 bytes.

The difference is quite reliable.

On the other hand we do not want at the moment to enforce bounce
buffering if guest request is aligned to 512 bytes.

The patch changes default bounce buffer optimal alignment to
MAX(page size, 4k). 4k is chosen as maximal known sector size on real
HDD.

The justification of the performance improve is quite interesting.
From the kernel point of view each request to the disk was split
by two. This could be seen by blktrace like this:
  9,0   11  1     0.000000000 11151  Q  WS 312737792 + 1023 [qemu-img]
  9,0   11  2     0.000007938 11151  Q  WS 312738815 + 8 [qemu-img]
  9,0   11  3     0.000030735 11151  Q  WS 312738823 + 1016 [qemu-img]
  9,0   11  4     0.000032482 11151  Q  WS 312739839 + 8 [qemu-img]
  9,0   11  5     0.000041379 11151  Q  WS 312739847 + 1016 [qemu-img]
  9,0   11  6     0.000042818 11151  Q  WS 312740863 + 8 [qemu-img]
  9,0   11  7     0.000051236 11151  Q  WS 312740871 + 1017 [qemu-img]
  9,0    5  1     0.169071519 11151  Q  WS 312741888 + 1023 [qemu-img]
After the patch the pattern becomes normal:
  9,0    6  1     0.000000000 12422  Q  WS 314834944 + 1024 [qemu-img]
  9,0    6  2     0.000038527 12422  Q  WS 314835968 + 1024 [qemu-img]
  9,0    6  3     0.000072849 12422  Q  WS 314836992 + 1024 [qemu-img]
  9,0    6  4     0.000106276 12422  Q  WS 314838016 + 1024 [qemu-img]
and the amount of requests sent to disk (could be calculated counting
number of lines in the output of blktrace) is reduced about 2 times.

Both qemu-img and qemu-io are affected while qemu-kvm is not. The guest
does his job well and real requests comes properly aligned (to page).

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1431441056-26198-3-git-send-email-den@openvz.org
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-05-22 09:37:33 +01:00
Denis V. Lunev
4196d2f030 block: minimal bounce buffer alignment
The patch introduces new concept: minimal memory alignment for bounce
buffers. Original so called "optimal" value is actually minimal required
value for aligment. It should be used for validation that the IOVec
is properly aligned and bounce buffer is not required.

Though, from the performance point of view, it would be better if
bounce buffer or IOVec allocated by QEMU will be aligned stricter.

The patch does not change any alignment value yet.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1431441056-26198-2-git-send-email-den@openvz.org
CC: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-05-22 09:37:33 +01:00
Paolo Bonzini
eaf5fe2dd4 block: return EPERM on writes or discards to read-only devices
This is the behavior in the operating system, for example Linux's
blkdev_write_iter has the following:

        if (bdev_read_only(I_BDEV(bd_inode)))
                return -EPERM;

This does not apply to opening a device for read/write, when the
device only supports read-only operation.  In this case any of
EACCES, EPERM or EROFS is acceptable depending on why writing is
not possible.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1431013548-22492-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-05-22 09:37:33 +01:00
John Snow
fd0e60530f configure: Add workaround for ccache and clang
Test if ccache is interfering with semantic analysis of macros,
disable its habit of trying to compile already pre-processed
versions of code if so. ccache attempts to save time by compiling
pre-processed versions of code, but this disturbs clang's static
analysis enough to produce false positives.

ccache allows us to disable this feature, opting instead to
compile the original version instead of its preprocessed version.
This makes ccache much slower for cache misses, but at least it
becomes usable with QEMU/clang.

This workaround only activates for users using ccache AND clang,
and only if their configuration is observed to be producing warnings.
You may need to clear your ccache for builds started without -Werror,
as those may continue to produce warnings from the cache.

Thanks to Peter Eisentraut for his writeup on the issue:
http://peter.eisentraut.org/blog/2014/12/01/ccache-and-clang-part-3/

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1427324259-1481-5-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-05-22 09:37:33 +01:00
John Snow
bbbf2e04e5 configure: silence glib unknown attribute __alloc_size__
The glib headers use GCC attributes.  Unfortunately the __GNUC__ and
__GNUC_MINOR__ version macros are also defined by clang, but clang
doesn't support the same attributes as GCC.

clang 3.5.0 does not support the __alloc_size__ attribute:

  c047507a9a

The following warning is produced:

  gstrfuncs.h:257:44: warning: unknown attribute '__alloc_size__' ignored [-Wunknown-attributes]
        G_GNUC_MALLOC G_GNUC_ALLOC_SIZE(2);
          gmacros.h:67:45: note: expanded from macro 'G_GNUC_ALLOC_SIZE'
                #define G_GNUC_ALLOC_SIZE(x) __attribute__((__alloc_size__(x)))

This patch checks whether glib headers cause warnings and disables
-Wunknown-attributes if it is able to.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1427324259-1481-4-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-05-22 09:37:33 +01:00
John Snow
93b2586922 configure: factor out supported flag check
Factor out the function that checks if a compiler
flag is supported or not.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1427324259-1481-3-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-05-22 09:37:32 +01:00
Stefan Hajnoczi
e4a7b344df configure: handle clang -nopie argument warning
gcc 4.9.2 treats -nopie as an error:

  cc: error: unrecognized command line option ‘-nopie’

clang 3.5.0 treats -nopie as a warning:

  clang: warning: argument unused during compilation: '-nopie'

The causes ./configure to fail with clang:

  ERROR: configure test passed without -Werror but failed with -Werror.

Make the -nopie test use -Werror so that compile_prog works for both gcc
and clang.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1427324259-1481-2-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-05-22 09:37:32 +01:00
Denis V. Lunev
ddd2ef2ce8 block/parallels: improve image writing performance further
Try to perform IO for the biggest continuous block possible.
All blocks abscent in the image are accounted in the same type
and preallocation is made for all of them at once.

The performance for sequential write is increased from 200 Mb/sec to
235 Mb/sec on my SSD HDD.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Roman Kagan <rkagan@parallels.com>
Signed-off-by: Roman Kagan <rkagan@parallels.com>
Message-id: 1430207220-24458-28-git-send-email-den@openvz.org
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-05-22 09:37:32 +01:00
Denis V. Lunev
19f5dc1591 block/parallels: optimize linear image expansion
Plain image expansion spends a lot of time to update image file size.
This seriously affects the performance. The following simple test
  qemu_img create -f parallels -o cluster_size=64k ./1.hds 64G
  qemu_io -n -c "write -P 0x11 0 1024M" ./1.hds
could be improved if the format driver will pre-allocate some space
in the image file with a reasonable chunk.

This patch preallocates 128 Mb using bdrv_write_zeroes, which should
normally use fallocate() call inside. Fallback to older truncate()
could be used as a fallback using image open options thanks to the
previous patch.

The benefit is around 15%.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Roman Karan <rkagan@parallels.com>
Signed-off-by: Roman Kagan <rkagan@parallels.com>
Message-id: 1430207220-24458-27-git-send-email-den@openvz.org
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-05-22 09:37:32 +01:00
Denis V. Lunev
d61790112f block/parallels: add prealloc-mode and prealloc-size open paramemets
This is preparational commit for tweaks in Parallels image expansion.
The idea is that enlarge via truncate by one data block is slow. It
would be much better to use fallocate via bdrv_write_zeroes and
expand by some significant amount at once.

Original idea with sequential file writing to the end of the file without
fallocate/truncate would be slower than this approach if the image is
expanded with several operations:
- each image expanding means file metadata update, i.e. filesystem
  journal write. Truncate/write to newly truncated space update file
  metadata twice thus truncate removal helps. With fallocate call
  inside bdrv_write_zeroes file metadata is updated only once and
  this should happen infrequently thus this approach is the best one
  for the image expansion
- tail writes are ordered, i.e. the guest IO queue could not be sent
  immediately to the host introducing additional IO delays

This patch just adds proper parameters into BDRVParallelsState and
performs options parsing in parallels_open.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Roman Kagan <rkagan@parallels.com>
Signed-off-by: Roman Kagan <rkagan@parallels.com>
Message-id: 1430207220-24458-26-git-send-email-den@openvz.org
CC: Roman Kagan <rkagan@parallels.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-05-22 09:37:32 +01:00
Denis V. Lunev
0d31c7c200 block/parallels: delay writing to BAT till bdrv_co_flush_to_os
The idea is that we do not need to immediately sync BAT to the image as
from the guest point of view there is a possibility that IO is lost
even in the physical controller until flush command was finished.
bdrv_co_flush_to_os is exactly the right place for this purpose.

Technically the patch uses loaded BAT data as a cache and performs
actual on-disk metadata updates in parallels_co_flush_to_os callback.

This patch speed ups
  qemu-img create -f parallels -o cluster_size=64k ./1.hds 64G
  qemu-io -f parallels -c "write -P 0x11 0 1024k" 1.hds
writing from 50-60 Mb/sec to 80-90 Mb/sec on rotational media and
from 160 Mb/sec to 190 Mb/sec on SSD disk.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Roman Kagan <rkagan@parallels.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Roman Kagan <rkagan@parallels.com>
Message-id: 1430207220-24458-25-git-send-email-den@openvz.org
CC: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-05-22 09:37:32 +01:00
Denis V. Lunev
2d68e22e94 block/parallels: create bat_entry_off helper
calculate offset of the BAT entry in the image file.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Roman Kagan <rkagan@parallels.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Roman Kagan <rkagan@parallels.com>
Message-id: 1430207220-24458-24-git-send-email-den@openvz.org
CC: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-05-22 09:37:32 +01:00
Denis V. Lunev
6953d92078 block/parallels: improve image reading performance
Try to perform IO for the biggest continuous block possible.
The performance for sequential read is increased from 220 Mb/sec to
360 Mb/sec for continous image on my SSD HDD.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Roman Kagan <rkagan@parallels.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Roman Kagan <rkagan@parallels.com>
Message-id: 1430207220-24458-23-git-send-email-den@openvz.org
CC: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-05-22 09:37:32 +01:00
Denis V. Lunev
a6be831e99 iotests, parallels: check for incorrectly closed image in tests
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Roman Kagan <rkagan@parallels.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Roman Kagan <rkagan@parallels.com>
Message-id: 1430207220-24458-22-git-send-email-den@openvz.org
CC: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-05-22 09:37:32 +01:00
Denis V. Lunev
6dd6b9f144 block/parallels: implement incorrect close detection
The software driver must set inuse field in Parallels header to
0x746F6E59 when the image is opened in read-write mode. The presence of
this magic in the header on open forces image consistency check.

There is an unfortunate trick here. We can not check for inuse in
parallels_check as this will happen too late. It is possible to do
that for simple check, but during the fix this would always report
an error as the image was opened in BDRV_O_RDWR mode. Thus we save
the flag in BDRVParallelsState for this.

On the other hand, nothing should be done to clear inuse in
parallels_check. Generic close will do the job right.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Roman Kagan <rkagan@parallels.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Roman Kagan <rkagan@parallels.com>
Message-id: 1430207220-24458-21-git-send-email-den@openvz.org
CC: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-05-22 09:37:32 +01:00
Denis V. Lunev
49ad646731 block/parallels: implement parallels_check method of block driver
The check is very simple at the moment. It calculates necessary stats
and fix only the following errors:
- space leak at the end of the image. This would happens due to
  preallocation
- clusters outside the image are zeroed. Nothing else could be done here

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Roman Kagan <rkagan@parallels.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Roman Kagan <rkagan@parallels.com>
Message-id: 1430207220-24458-20-git-send-email-den@openvz.org
CC: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-05-22 09:37:32 +01:00
Denis V. Lunev
23d6bd3bd1 block/parallels: move parallels_open/probe to the very end of the file
This will help to avoid forward declarations for upcoming parallels_check

Some very obvious formatting fixes were made to the moved code to make
checkpatch happy.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Roman Kagan <rkagan@parallels.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Roman Kagan <rkagan@parallels.com>
Message-id: 1430207220-24458-19-git-send-email-den@openvz.org
CC: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-05-22 09:37:32 +01:00