-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJV6fTdAAoJEH8JsnLIjy/WniYP/iSJcVpTj/nEGDPDVrAGAldE
W3CiFSRbaFTiLvYXp7MDRv5+tV5+ttykhhRlWjOx5BgU616zHh0IiVxuvvGwwCdM
wl84TpqZ0tUvPFtvZOFE96JS3jcgbz2nu0PA4axIKeIL7aG7L4te35L7Bq7jGwjM
qrJ5Fg6B4f9Ji8vhphbmCdeZ5slc/mRE7vMAn+5flxvInl3oLexcSUQ9+LZ0iYS5
ktmGjGNQq1f0Z83hY6k6JKodz0THOPjVh5+seKaJSJjUVpxsMhj+xLDYtEnR2CwV
13ObNTozStu4IYThD9ydoG2qSFnQGbnn+TrDuyJrhor5wZlmSXolwDGu3qthbBro
tW9WnvK9fpVPEEyfDNs9NnKiYFERchbgckqTz7MxWJMCdXdPcYIAzFO2SlO1Mznl
Kv4nEJouOP2CCq/Q6FmUD585uOM8QculKn1Mx8XdPVFvkoYKsvxXWcqRLsf29/Az
PCt09br+WQE6vbEIGuQjcgBSAgkUDwcuH7jqsyeoiRuKZdAUVyIpsgl4GT28HHk9
NE8R1TQoBi+4iOvPTWltoSGKi4pj9svqKuHA2YjIaUQtQ18oZleBhVz9Ugm4BFIM
Se3vUVDKiCNidEITxAmU24ZEXyS+3yTKQQzrvO9zT/b4HJHkOq7I9WIRyqRagCpX
oBTcbZ9QuuYYF3Z9f2Wi
=4Pz4
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches
# gpg: Signature made Fri 04 Sep 2015 20:45:33 BST using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
* remotes/kevin/tags/for-upstream:
quorum: validate vote threshold against num_children even if read-pattern is fifo
qcow2: reorder fields in Qcow2CachedTable to reduce padding
docs: document how to configure the qcow2 L2/refcount caches
qcow2: add option to clean unused cache entries after some time
qcow2: mark the memory as no longer needed after qcow2_cache_empty()
iotests: Warn if python subprocess is killed
iotests: Do not suppress segfaults in bash tests
iotests: Respect -nodefaults in tests 41 and 55
iotests: More options for VM.add_drive()
qemu-img: Fix crash in amend invocation
block/raw-posix: Use raw_normalize_devicepath()
qemu-iotests: s390x: fix test 130
qemu-iotests: s390x: fix test 049, reject negative sizes in QemuOpts
qemu-iotests: s390x: fix test 041 and 055
qemu-iotests: disable default qemu devices for cross-platform compatibility
qemu-iotests: qemu machine type support
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This is using a ds1338 RTC chip on the I2C bus. This RTC chip is
not present on the real 3DS PDK board.
Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Acked-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 05601683a2a95c881cbc9f22651a044d969bd0ae.1441057361.git.jcd@tribudubois.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This patch adds support for SMBIOS 3.0 entry point. When caller invokes
smbios_set_defaults(), it can specify entry point as 2.1 or 3.0. Then
smbios_get_tables() will return the entry point table in right format.
Acked-by: Gabriel Somlo <somlo@cmu.edu>
Tested-by: Gabriel Somlo <somlo@cmu.edu>
Tested-by: Leif Lindholm <leif.lindholm@linaro.org>
Signed-off-by: Wei Huang <wei@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Message-id: 1440615870-9518-2-git-send-email-wei@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Currently, if a subprocess of a python test (i.e. qemu-io, qemu-img, or
qemu) receives a signal and is subsequently aborted, this is not logged.
This patch makes python tests always check the exit code of these
subprocesses, and emit a message if they have been killed.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Currently, if a qemu/qemu-io/qemu-img/qemu-nbd invocation receives a
segmentation fault, that message is invisible in most cases since the
output is generally filtered and bash suppresses the segmentation fault
notice for any but the last element of a pipe.
Most of the time, the test will then fail anyway because of missing
output, but not necessarily (as happened with test 82 recently).
Fix this by making the corresponding environment variables point to
wrapper functions which execute the respective command in a subshell.
Giving options to qemu/qemu-io/qemu-img and path names with spaces were
broken for the Python tests; this patch "accidentally" fixes that.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
While -nodefaults is set in $QEMU_OPTIONS, this is currently (wrongly)
ignored for Python iotests. In order to be prepared for when this is
fixed, we should explicitly add an IDE CD-ROM drive instead of relying
on it being created automatically.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This patch allows specifying the interface to be used for the drive, and
makes specifying a path optional (if the path is None, the "file" option
will be omitted, thus creating an empty drive).
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The default device id of hard disk on the s390 platform is "virtio0"
which differs to the "ide0-hd0" for the x86 platform. Setting id in
the drive definition, ie:"qemu -drive id=testdisk", will be the same
on all platforms.
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Bo Tu <tubo@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
when creating an image qemu-img enable us specifying the size of the
image using -o size=xx options. But when we specify an invalid size
such as a negtive size then different platform gives different result.
parse_option_size() function in util/qemu-option.c will be called to
parse the size, a cast was called in the function to cast the input
(saved as a double in the function) size to an unsigned int64 value,
when the input is a negtive value or exceeds the maximum of uint64, then
the result is undefined.
According to C99 6.3.1.4, the result of converting a floating point
number to an integer that cannot represent the (integer part of) number
is undefined. And sure enough the results are different on x86 and
s390.
C99 Language spec 6.3.1.4 Real floating and integers:
the result of this assignment/cast is undefined if the float is not
in the open interval (-1, U<type>_MAX+1).
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Signed-off-by: Bo Tu <tubo@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
There is no 'ide-cd' device defined on non-pc platform, so
test_medium_not_found() test should be skipped.
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Michael Mueller <mimu@linux.vnet.ibm.com>
Reviewed-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Signed-off-by: Xiao Guang Chen <chenxg@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This patch fixes an io test suite issue that was introduced with the
commit c88930a686 'qemu-char: Permit only
a single "stdio" character device'. The option supresses the creation of
default devices such as the floopy and cdrom. Output files for test case
067, 071, 081 and 087 need to be updated to accommodate this change.
Use virtio-blk instead of virtio-blk-pci as the device driver for test
case 067. For virtio-blk-pci is the same with virtio-blk as device
driver but other platform such as s390 may not recognize the virtio-blk-pci.
The default devices differ across machines. As the qemu output often
contains these devices (or events for them, like opening a CD tray on
reset), the reference output currently is rather machine-specific.
All existing qemu tests explicitly configure the devices they're working
with, so just pass -nodefaults to qemu by default to disable the default
devices. Update the reference outputs accordingly.
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Michael Mueller <mimu@linux.vnet.ibm.com>
Reviewed-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Signed-off-by: Xiao Guang Chen <chenxg@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This patch adds qemu machine type support to the io test suite.
Based on the qemu default machine type and alias of the default machine
type the reference output file can now vary from the default to a
machine specific output file if necessary. When using a machine specific
reference file if the default machine has an alias then use the alias as the output
file name otherwise use the default machine name as the output file name.
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Michael Mueller <mimu@linux.vnet.ibm.com>
Reviewed-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Signed-off-by: Xiao Guang Chen <chenxg@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
check_type() first checks and peels off the array type, then checks
the element type. For two out of four error messages, it takes pains
to report errors for "array of T" instead of just T. Odd. Let's
examine the errors.
* Unknown element type, e.g.
tests/qapi-schema/args-array-unknown.json:
Member 'array' of 'data' for command 'oops' uses unknown type
'array of NoSuchType'
To make sense of this, you need to know that 'array of NoSuchType'
refers to '[NoSuchType]'. Easy enough. However, simply reporting
Member 'array' of 'data' for command 'oops' uses unknown type
'NoSuchType'
is at least as easy to understand.
* Element type's meta-type is inadmissible, e.g.
tests/qapi-schema/returns-whitelist.json:
'returns' for command 'no-way-this-will-get-whitelisted' cannot
use built-in type 'array of int'
'array of int' is technically not a built-in type, but that's
pedantry. However, simply reporting
'returns' for command 'no-way-this-will-get-whitelisted' cannot
use built-in type 'int'
avoids the issue, and is at least as easy to understand.
* The remaining two errors are unreachable, because the array checking
ensures that value is a string.
Thus, reporting some errors for "array of T" instead of just T works,
but doesn't really improve things. Drop it.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
We always report "should be a dictionary" then. This is misleading:
when allow_dict, it can be a dictionary or a type name string, else it
can only be a type name.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
The first check ensures the second one can't trigger. Drop the first
one, because the second one is in a more logical place, and emits a
nicer error message.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reproducer: with
{ 'command': 'user_def_cmd4', 'returns': { 'a': 'int' } }
added to qapi-schema-test.json, qapi-commands.py dies when it tries to
generate the command handler function
Traceback (most recent call last):
File "/work/armbru/qemu/scripts/qapi-commands.py", line 359, in <module>
ret = generate_command_decl(cmd['command'], arglist, ret_type) + "\n"
File "/work/armbru/qemu/scripts/qapi-commands.py", line 29, in generate_command_decl
ret_type=c_type(ret_type), name=c_name(name),
File "/work/armbru/qemu/scripts/qapi.py", line 927, in c_type
assert isinstance(value, str) and value != ""
AssertionError
because the return type doesn't exist.
Simply outlaw this usage, and drop or dumb down test cases accordingly.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
A command's or event's 'data' must be a struct type, given either as a
dictionary, or as struct type name.
Commit dd883c6 tightened the checking there, but not enough: we still
accept 'union'. Fix to reject it.
We may want to support union types there, but we'll have to extend
qapi-commands.py and qapi-events.py for it.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
A command's 'data' must be a struct type, given either as a
dictionary, or as struct type name.
Existing test case data-int.json covers simple type 'int'. Add test
cases for type names referring to union and alternate types.
The latter is caught (good), but the former is not (bug).
Events have the same problem, but since they get checked by the same
code, we don't bother to duplicate the tests.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Since every schema entity has 'data', the data- prefix conveys no
information. These tests actually exercise commands. Only commands
have arguments, so change the prefix to to args-.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Test case added in commit 2fc0043, and messed up in commit 5223070.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Most functions that can return a pointer or set an Error ** value
are decent enough to guarantee a NULL return when reporting an error.
Not so with our generated qapi visitor functions. If the caller
is not careful to clean up partially-allocated objects on error,
then the caller suffers a memory leak.
Properly fixing it is probably complex enough to save for a later
day, so merely document it for now.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1438295587-19069-1-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
When event FOO's 'data' is a struct with a base, we consider only the
struct's direct members, and ignore its base. The generated
qapi_event_send_foo() doesn't take arguments for base members.
No such events currently exist in the QMP schema.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
We generate a declaration, but no definition.
The QMP schema has two: Qcow2OverlapChecks and BlockdevRef. Neither
visit_type_Qcow2OverlapChecksKind() nor visit_type_BlockdevRefKind()
is actually used.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
The visit_type_implicit_FOO() are generated on demand, right before
their first use. Used by visit_type_STRUCT_fields() when STRUCT has
base FOO, and by visit_type_UNION() when flat UNION has member a FOO.
If the schema defines FOO after its first use as struct base or flat
union member, visit_type_implicit_FOO() calls
visit_type_implicit_FOO() before its definition, which doesn't
compile.
Rearrange qapi-schema-test.json to demonstrate the bug.
Fix by generating the necessary forward declaration.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
A flat union's tag member gets renamed to 'kind' in the generated
code. Breaks when another member named 'kind' exists.
Example, adapted from qapi-schema-test.json:
{ 'struct': 'UserDefUnionBase',
'data': { 'kind': 'str', 'enum1': 'EnumOne' } }
We generate:
struct UserDefFlatUnion
{
EnumOne kind;
union {
void *data;
UserDefA *value1;
UserDefB *value2;
UserDefB *value3;
};
char *kind;
};
Kill the silly rename.
Reported-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Use c_name() instead of ad hoc code. Doesn't upcase the -p prefix,
which is an improvement in my book. Unbreaks prefix containing '.',
but other funny characters remain broken. To be fixed next.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
* vhost-scsi fix from Igor and Lu Lina
* a build system fix from Daniel
* two more multi-arch-related patches from Peter C.
* TCG patches from myself and Sergey Fedorov
* RCU improvement from Wen Congyang
* a few more simple cleanups
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQEcBAABCAAGBQJVzmCgAAoJEL/70l94x66DhFgH/1m3iGac2Ks3vAUAdS2HBcxC
EeziMwWFmkrfbtzUkz/jE0NG5uA2Bs8OFHsC8vmQFwkpDbGUlJ1zd5/N5UOHMG3d
zF0vd+nKNw9C1Fo0/LPyQSeP64/xXEMTmFLqmYf4ZOowz8lr/m6WYrMIzKUoXSEn
FeRtq78moDT8qwF372j8aoQUUpsctXDHBQHORZdcERvlc4mxojeJ3+mNViR2bv3r
92PwGvrJ26mQXEKmGo5O1VM4k7QVg7xJQfgE11x7ShE2E9fJDMgts0Q/xCjWCLwS
BXtEtbd9QeFEfG/mlRFevGtuvksq98m0hN7lAWb13zWmlJFuLyyMmlGfGAlU55Q=
=Y2DB
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* SCSI fixes from Stefan and Fam
* vhost-scsi fix from Igor and Lu Lina
* a build system fix from Daniel
* two more multi-arch-related patches from Peter C.
* TCG patches from myself and Sergey Fedorov
* RCU improvement from Wen Congyang
* a few more simple cleanups
# gpg: Signature made Fri 14 Aug 2015 22:41:52 BST using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream:
disas: Defeature print_target_address
hw: fix mask for ColdFire UART command register
scsi-generic: identify AIO callbacks more clearly
scsi-disk: identify AIO callbacks more clearly
scsi: create restart bottom half in the right AioContext
configure: only add CONFIG_RDMA to config-host.h once
qemu-nbd: remove unnecessary qemu_notify_event()
vhost-scsi: Clarify vhost_virtqueue_mask argument
exec: use macro ROUND_UP for alignment
rcu: Allow calling rcu_(un)register_thread() during synchronize_rcu()
exec: drop cpu_can_do_io, just read cpu->can_do_io
cpu_defs: Simplify CPUTLB padding logic
cpu-exec: Do not invalidate original TB in cpu_exec_nocache()
vhost/scsi: call vhost_dev_cleanup() at unrealize() time
virtio-scsi-test: Add test case for tail unaligned WRITE SAME
scsi-disk: Fix assertion failure on WRITE SAME
tests: virtio-scsi: clear unit attention after reset
scsi-disk: fix cmd.mode field typo
virtio-scsi: use virtqueue_map_sg() when loading requests
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
To share smbios among different architectures, this patch moves SMBIOS
code (smbios.c and smbios.h) from x86 specific folders into new
hw/smbios directories. As a result, CONFIG_SMBIOS=y is defined in
x86 default config files.
Acked-by: Gabriel Somlo <somlo@cmu.edu>
Tested-by: Gabriel Somlo <somlo@cmu.edu>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Tested-by: Leif Lindholm <leif.lindholm@linaro.org>
Signed-off-by: Wei Huang <wei@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The unit attention after reset (power on) prevents normal commands from
running. The unaligned WRITE SAME test never executed its command!
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <1438262173-11546-4-git-send-email-stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* megasas SIGSEGV fix
* memory refcount change to fix virtio hot-unplug
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQEcBAABCAAGBQJVty9DAAoJEL/70l94x66DJZ4H/j3ix0YRs/rxQEXuvVTg0NeS
abspC2foLAqUbIbeB6ApZBXPZAtIA6mPOm+aK04HuB2K2NXqi57pv6qiJ6LMbVNM
NBOfM3qfk/Drt5Sf/4esAbqFaqlkjeKbC7FetZgM4vTZkFK/mfrqUnWGpE7HdRHp
ap2R1U9aZrS4V3O7TMLrJumnwLEl0bAZ0JnMPQrtjvHt2NmCHQn+4owUiXB2BmwK
xo2pIQeJVYbGpRlUEqkehaHYSZsjrIM/RLRYcHWEA5ucZekQKUgwbgNy4K1/YAT0
/edH0DtkKoSC1eFhS1TKeWm8mCfHp49mAJXlq16zaWrQGItcfYtJ2QLVTdi4Hfc=
=IpxH
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* crypto fixes
* megasas SIGSEGV fix
* memory refcount change to fix virtio hot-unplug
# gpg: Signature made Tue Jul 28 08:29:07 2015 BST using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream:
memory: do not add a reference to the owner of aliased regions
megasas: Add write function to handle write access to PCI BAR 3
crypto: extend unit tests to cover decryption too
crypto: fix built-in AES decrypt function
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We want to have uniform build messages, so fix some messages
which did not follow the standard pattern.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This checks that VPC is able to successfully fail (without segfault)
on an image file with a max_table_entries that exceeds 0x40000000.
This table entry is within the valid range for VPC (although too large
for this sample image).
Cc: qemu-stable@nongnu.org
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The current unit test only verifies the encryption API,
resulting in us missing a recently introduced bug in the
decryption API from commit d3462e3. It was fortunately
later discovered & fixed by commit bd09594, thanks to the
QEMU I/O tests for qcow2 encryption, but we should really
detect this directly in the crypto unit tests. Also remove
an accidental debug message and simplify some asserts.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1437468902-23230-1-git-send-email-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch rewrites the ctx->dispatching optimization, which was the cause
of some mysterious hangs that could be reproduced on aarch64 KVM only.
The hangs were indirectly caused by aio_poll() and in particular by
flash memory updates's call to blk_write(), which invokes aio_poll().
Fun stuff: they had an extremely short race window, so much that
adding all kind of tracing to either the kernel or QEMU made it
go away (a single printf made it half as reproducible).
On the plus side, the failure mode (a hang until the next keypress)
made it very easy to examine the state of the process with a debugger.
And there was a very nice reproducer from Laszlo, which failed pretty
often (more than half of the time) on any version of QEMU with a non-debug
kernel; it also failed fast, while still in the firmware. So, it could
have been worse.
For some unknown reason they happened only with virtio-scsi, but
that's not important. It's more interesting that they disappeared with
io=native, making thread-pool.c a likely suspect for where the bug arose.
thread-pool.c is also one of the few places which use bottom halves
across threads, by the way.
I hope that no other similar bugs exist, but just in case :) I am
going to describe how the successful debugging went... Since the
likely culprit was the ctx->dispatching optimization, which mostly
affects bottom halves, the first observation was that there are two
qemu_bh_schedule() invocations in the thread pool: the one in the aio
worker and the one in thread_pool_completion_bh. The latter always
causes the optimization to trigger, the former may or may not. In
order to restrict the possibilities, I introduced new functions
qemu_bh_schedule_slow() and qemu_bh_schedule_fast():
/* qemu_bh_schedule_slow: */
ctx = bh->ctx;
bh->idle = 0;
if (atomic_xchg(&bh->scheduled, 1) == 0) {
event_notifier_set(&ctx->notifier);
}
/* qemu_bh_schedule_fast: */
ctx = bh->ctx;
bh->idle = 0;
assert(ctx->dispatching);
atomic_xchg(&bh->scheduled, 1);
Notice how the atomic_xchg is still in qemu_bh_schedule_slow(). This
was already debated a few months ago, so I assumed it to be correct.
In retrospect this was a very good idea, as you'll see later.
Changing thread_pool_completion_bh() to qemu_bh_schedule_fast() didn't
trigger the assertion (as expected). Changing the worker's invocation
to qemu_bh_schedule_slow() didn't hide the bug (another assumption
which luckily held). This already limited heavily the amount of
interaction between the threads, hinting that the problematic events
must have triggered around thread_pool_completion_bh().
As mentioned early, invoking a debugger to examine the state of a
hung process was pretty easy; the iothread was always waiting on a
poll(..., -1) system call. Infinite timeouts are much rarer on x86,
and this could be the reason why the bug was never observed there.
With the buggy sequence more or less resolved to an interaction between
thread_pool_completion_bh() and poll(..., -1), my "tracing" strategy was
to just add a few qemu_clock_get_ns(QEMU_CLOCK_REALTIME) calls, hoping
that the ordering of aio_ctx_prepare(), aio_ctx_dispatch, poll() and
qemu_bh_schedule_fast() would provide some hint. The output was:
(gdb) p last_prepare
$3 = 103885451
(gdb) p last_dispatch
$4 = 103876492
(gdb) p last_poll
$5 = 115909333
(gdb) p last_schedule
$6 = 115925212
Notice how the last call to qemu_poll_ns() came after aio_ctx_dispatch().
This makes little sense unless there is an aio_poll() call involved,
and indeed with a slightly different instrumentation you can see that
there is one:
(gdb) p last_prepare
$3 = 107569679
(gdb) p last_dispatch
$4 = 107561600
(gdb) p last_aio_poll
$5 = 110671400
(gdb) p last_schedule
$6 = 110698917
So the scenario becomes clearer:
iothread VCPU thread
--------------------------------------------------------------------------
aio_ctx_prepare
aio_ctx_check
qemu_poll_ns(timeout=-1)
aio_poll
aio_dispatch
thread_pool_completion_bh
qemu_bh_schedule()
At this point bh->scheduled = 1 and the iothread has not been woken up.
The solution must be close, but this alone should not be a problem,
because the bottom half is only rescheduled to account for rare situations
(see commit 3c80ca1, thread-pool: avoid deadlock in nested aio_poll()
calls, 2014-07-15).
Introducing a third thread---a thread pool worker thread, which
also does qemu_bh_schedule()---does bring out the problematic case.
The third thread must be awakened *after* the callback is complete and
thread_pool_completion_bh has redone the whole loop, explaining the
short race window. And then this is what happens:
thread pool worker
--------------------------------------------------------------------------
<I/O completes>
qemu_bh_schedule()
Tada, bh->scheduled is already 1, so qemu_bh_schedule() does nothing
and the iothread is never woken up. This is where the bh->scheduled
optimization comes into play---it is correct, but removing it would
have masked the bug.
So, what is the bug?
Well, the question asked by the ctx->dispatching optimization ("is any
active aio_poll dispatching?") was wrong. The right question to ask
instead is "is any active aio_poll *not* dispatching", i.e. in the prepare
or poll phases? In that case, the aio_poll is sleeping or might go to
sleep anytime soon, and the EventNotifier must be invoked to wake
it up.
In any other case (including if there is *no* active aio_poll at all!)
we can just wait for the next prepare phase to pick up the event (e.g. a
bottom half); the prepare phase will avoid the blocking and service the
bottom half.
Expressing the invariant with a logic formula, the broken one looked like:
!(exists(thread): in_dispatching(thread)) => !optimize
or equivalently:
!(exists(thread):
in_aio_poll(thread) && in_dispatching(thread)) => !optimize
In the correct one, the negation is in a slightly different place:
(exists(thread):
in_aio_poll(thread) && !in_dispatching(thread)) => !optimize
or equivalently:
(exists(thread): in_prepare_or_poll(thread)) => !optimize
Even if the difference boils down to moving an exclamation mark :)
the implementation is quite different. However, I think the new
one is simpler to understand.
In the old implementation, the "exists" was implemented with a boolean
value. This didn't really support well the case of multiple concurrent
event loops, but I thought that this was okay: aio_poll holds the
AioContext lock so there cannot be concurrent aio_poll invocations, and
I was just considering nested event loops. However, aio_poll _could_
indeed be concurrent with the GSource. This is why I came up with the
wrong invariant.
In the new implementation, "exists" is computed simply by counting how many
threads are in the prepare or poll phases. There are some interesting
points to consider, but the gist of the idea remains:
1) AioContext can be used through GSource as well; as mentioned in the
patch, bit 0 of the counter is reserved for the GSource.
2) the counter need not be updated for a non-blocking aio_poll, because
it won't sleep forever anyway. This is just a matter of checking
the "blocking" variable. This requires some changes to the win32
implementation, but is otherwise not too complicated.
3) as mentioned above, the new implementation will not call aio_notify
when there is *no* active aio_poll at all. The tests have to be
adjusted for this change. The calls to aio_notify in async.c are fine;
they only want to kick aio_poll out of a blocking wait, but need not
do anything if aio_poll is not running.
4) nested aio_poll: these just work with the new implementation; when
a nested event loop is invoked, the outer event loop is never in the
prepare or poll phases. The outer event loop thus has already decremented
the counter.
Reported-by: Richard W. M. Jones <rjones@redhat.com>
Reported-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Message-id: 1437487673-23740-5-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
In these tests, the purpose of the initial calls to aio_poll and
g_main_context_iteration is simply to put the AioContext in a
known state; the return value of the function does not really
matter. The next patch will change those return values; change
the assertions to a while loop which expresses the intention
better.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Message-id: 1437487673-23740-3-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Version: GnuPG v1
iQIcBAABAgAGBQJVrT14AAoJEH3vgQaq/DkOq7EP/0yHGzrIM7wKCPEgZWSuzvJF
g2ooc2d669ulLqhRicUuwIGpYgT7WpVrHMxDbw+f3Y02Upyov44l8bG82hmC/1r+
NAbpkl7PZHD7PM/duKtuclPIAmdpXXQoy7mHtb8PG71poaAhC1D8t0Swy1wKPn6r
uhrhySpN2B+yV9P5sNWdxVTd14oHpJhLsTo/YRe+ptgZnqeWyG6+Rz9xX0nMqaLA
8byl4fGUJ8SGxcyV6NKeUK16wXb7HH9d7EaRihnYoxT50DeJb+8NKWcrwfgzd9hu
M+suPJBbenQ6JcT8mDaOsM1/lsWLUJ+561QS3opx3j2kDrtK/sHKf2flZkGp1Ev9
QioEdL9m731/8wIITWIKntzCw4h2nO+ovnQFIzvcni+PaehZrvF5VIC7QLOSZhy6
zqu+E3PY0PkheqXv8/KWOWs+MctfyfotpCtcD7esQ/f9fD4MsFn1NFaDxG6gnWUt
wYytkxhqvCiOy5dGumcyOC7VDdILB4FObSe15H3LUqSfVqIMoFQ6q7Pr+RC84JPE
cosVoRWd/EE8dxqAP0NmARZwdExIRInfg1ZrooteQy9JQvmgaVReqY6diK9SZVtm
1Aue2qBr8im6sxxz4uzwi5Oi8vvB/Y88EV2mBkaDN0oRWQYzj39AX1vebW+vMjJi
GdtqqEIxSNb2mH4vcB0r
=GHN/
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/jnsnow/tags/ide-pull-request' into staging
# gpg: Signature made Mon Jul 20 19:27:04 2015 BST using RSA key ID AAFC390E
# gpg: Good signature from "John Snow (John Huston) <jsnow@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: FAEB 9711 A12C F475 812F 18F2 88A9 064D 1835 61EB
# Subkey fingerprint: F9B7 ABDB BCAC DF95 BE76 CBD0 7DEF 8106 AAFC 390E
* remotes/jnsnow/tags/ide-pull-request:
tests: Fix broken targets check-report-qtest-*
ahci: Force ICC bits in PxCMD to zero
qtest/ide: add another short PRDT test flavor
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
They need QTEST_QEMU_IMG. Without it, the tests raise an assertion:
$ make -C bin check-report-qtest-i386.xml
make: Entering directory 'bin'
GTESTER check-report-qtest-i386.xml
blkdebug: Suspended request 'A'
blkdebug: Resuming request 'A'
ahci-test: tests/libqos/libqos.c:162:
mkimg: Assertion `qemu_img_path' failed.
main-loop: WARNING: I/O thread spun for 1000 iterations
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1437231284-17455-1-git-send-email-sw@weilnetz.de
Signed-off-by: John Snow <jsnow@redhat.com>
The existing short PRDT test case does not transfer any data because the
first PRD is less than 1 sector.
This patch adds another short PRDT test case where the first sector can
be read but the PRDT is still smaller than the requested number of
sectors. This exercises a different code path in ide_dma_cb().
Cc: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1435770571-9906-1-git-send-email-stefanha@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
Commit e0cf11f31c ("timer: Use a single
definition of NSEC_PER_SEC for the whole codebase") renamed
NANOSECONDS_PER_SECOND to NSEC_PER_SEC.
On Mac OS X there is a <dispatch/time.h> system header which also
defines NSEC_PER_SEC. This causes compiler warnings.
Let's use the old name instead. It's longer but it doesn't clash.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1436364609-7929-1-git-send-email-stefanha@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQEcBAABCAAGBQJVnQWdAAoJEL/70l94x66D6OgIAKJlzQfmy5w7Q9WD4vCMhD76
JrpLSsn7Gx/Bws0Nu9nLQlqun5z4hiUxyG2kP/WqD9+tV3cpSMSyrG6ImVdqKnQ5
+Z8WJZuREkQv0aqDUjQVST+eIDZuh2LWJXAjhgsCXUHY77eWb/7WmKT79xJOa+5C
5xB1qxudqX5IsTvpiKKPbmUGYkAcvRX1dUSaFwRIMO0UyKn59B9WfM9a5slIbLW7
XfI8+wEJshTVLuQkkTfdidWQc5M5DwlmO7ESUNR/BRPCPFeyjcDqgQY5pBM5XVo9
C+S0R3zIt3Ew0fhCtLRyjlIT0bGfwjbU5HRiHcyldBKhNUZZjSUoOWJnYRHXUDY=
=H8wA
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
Bugfixes and Daniel Berrange's crypto library.
# gpg: Signature made Wed Jul 8 12:12:29 2015 BST using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream:
ossaudio: fix memory leak
ui: convert VNC to use generic cipher API
block: convert qcow/qcow2 to use generic cipher API
ui: convert VNC websockets to use crypto APIs
block: convert quorum blockdrv to use crypto APIs
crypto: add a nettle cipher implementation
crypto: add a gcrypt cipher implementation
crypto: introduce generic cipher API & built-in implementation
crypto: move built-in D3DES implementation into crypto/
crypto: move built-in AES implementation into crypto/
crypto: introduce new module for computing hash digests
vl: move rom_load_all after machine init done
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>