Use SIGUSR1 unconditionally as SIG_IPI. First, ucontext coroutines tend
to corrupt RT signal masks due to a 32-on-64-bit Linux kernel bug. And,
second, there appears to be no advantage in using RT signals for VCPU
kicking.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
* kwolf/for-anthony: (30 commits)
declare ECANCELED on all machines
tests/Makefile: Add missing $(EXESUF)
stream: do not copy unallocated sectors from the base
stream: fix ratelimiting corner case
stream: fix HMP block_job_set_speed
stream: pass new base image format to bdrv_change_backing_file
stream: add testcase for partial streaming
stream: fix sectors not allocated test
qemu-io: fix the alloc command
qemu-io: correctly print non-integer values as decimals
qemu-img: make "info" backing file output correct and easier to use
block: move field reset from bdrv_open_common to bdrv_close
block: protect path_has_protocol from filenames with colons
block: simplify path_is_absolute
block: wait for job callback in block_job_cancel_sync
block: add block_job_sleep_ns
block: fully delete bs->file when closing
block: do not reuse the backing file across bdrv_close/bdrv_open
block: another bdrv_append fix
block: fix snapshot on QED
...
The macro definition of cpu_init meant that if cpu_arm_init()
returned NULL this wouldn't result in cpu_init() itself returning
NULL. This had the effect that "-cpu foo" for some unknown CPU
name 'foo' would cause ARM targets to segfault rather than
generating a useful error message. Fix this by making cpu_init
a simple inline function.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Andreas Färber <afaerber@suse.de>
This patch fixes a bug affecting a variety of Neon instructions, such as
VQADD.
Signed-off-by: Matt Craighead <mjcraighead@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
If the kernel page size is larger than TARGET_PAGE_SIZE, which
happens for example on ppc64 with kernels compiled for 64K pages,
the dirty tracking doesn't work.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Avi Kivity <avi@redhat.com>
Unallocated sectors should really never be accessed by the guest,
so there's no need to copy them during the streaming process.
If they are read by the guest during streaming, guest-initiated
copy-on-read will copy them (we're in the base == NULL case, which
enables copy on read). If they are read after we disconnect the
image from the base, they will read as zeroes anyway.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This fixes inability to make progress in streaming if the quota is set
to less than the amount of data that an I/O operation has to write.
In this case, limit->dispatched + n will always be above the quota and,
due to the "goto retry" to recheck cancellation and allocation, streaming
will livelock.
This can be reproduced with "block_job_set_speed ide0-hd0 1b". Of course,
with this patch the requested limit will not be obeyed. That could be
done with another patch that caps is_allocated's n argument by the slice
quota.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The change of the argument name from value to speed was not propagated there.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When an image is modified to point to the new backing file, the backing
file format is set to NULL, which means auto-probe. This is wrong, in
fact it is a small security problem.
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The test on sectors not allocated can fail if the L1/L2 tables are
not on disk yet. Allow tests to shutdown the VM early.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Because sector_num is not updated, the loop would either go on
forever or return garbage.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
qemu-io's cvtstr function sometimes will incorrectly omit the
decimal part of the number, and sometimes will incorrectly include
it. This patch fixes both. The former is more serious, and can
be seen in the patches to 027.out and 033.out.
The changes to all other files were scripted with sed, so there were
no "surprises" beyond 027.out and 033.out.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
qemu-img info should use the same logic as qemu when printing the
backing file path, or debugging becomes quite tricky. We can also
simplify the output in case the backing file has an absolute path
or a protocol.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
bdrv_close should leave fields in the same state as bdrv_new. It is
not up to bdrv_open_common to fix the mess.
Also, backing_format was not being re-initialized.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
path_has_protocol will erroneously return "true" if the colon is part
of a filename. These names are common with stable device names produced
by udev. We cannot fully protect against this in case the filename
does not have a path component (e.g. if the current directory is
/dev/disk/by-path), but in the common case there will be a slash before
and path_has_protocol can easily detect that and return false.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
On Windows, all the logic is already in is_windows_drive and
is_windows_drive_prefix. On POSIX, there is no need to look
out for colons.
The win32 code changes the behaviour in some cases, we could have
something like "d:foo.img". The old code would treat it as relative
path, the new one as absolute. Now the path is absolute, because to
go from c:/program files/blah to d:foo.img you cannot say c:/program
files/blah/d:foo.img. You have to say d:foo.img. But you could also
say it's relative because (I think, at least it was like that in DOS
15 years ago) d:foo.img is relative to the current path of drive D.
Considering how path_is_absolute is used by path_combine, I think it's
better to treat it as absolute.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The limitation on not having I/O after cancellation cannot really be
kept. Even streaming has a very small race window where you could
cancel a job and have it report completion. If this window is hit,
bdrv_change_backing_file() will yield and possibly cause accesses to
dangling pointers etc.
So, let's just assume that we cannot know exactly what will happen
after the coroutine has set busy to false. We can set a very lax
condition:
- if we cancel the job, the coroutine won't set it to false again
(and hence will not call co_sleep_ns again).
- block_job_cancel_sync will wait for the coroutine to exit, which
pretty much ensures no race.
Instead, we track the coroutine that executes the job and put very
strict conditions on what to do while it is quiescent (busy = false).
First of all, the coroutine must never set busy = false while the job
has been cancelled. Second, the coroutine can be reentered arbitrarily
while it is quiescent, so you cannot really do anything but co_sleep_ns at
that time. This condition is obeyed by the block_job_sleep_ns function.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This function abstracts the pretty complex semantics of the "busy"
member of BlockJob.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We are reusing bs->file across close/open, which may not cause any
known bugs but is a recipe for trouble. Prefer bdrv_delete, and
enjoy the new invariant in the implementation of bdrv_delete.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This is another bug caused by not doing a full cleanup of the BDS
across close/open. This was found with mirroring by Shaolong Hu,
but it can probably be reproduced also with eject or change.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
bdrv_append must also copy open_flags to the top, because the snapshot
has BDRV_O_NO_BACKING set. This causes interesting results if you
later use drive-reopen (not upstream) to reopen the image, and lose
the backing file in the process.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
QED's opaque data includes a pointer back to the BlockDriverState.
This breaks when bdrv_append shuffles data between bs_new and bs_top.
To avoid this, add a "rebind" function that tells the driver about
the new relationship between the BlockDriverState and its opaque.
The patch also adds rebind to VVFAT for completeness, even though
it is not used with live snapshots.
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
A trailing space is left when qemu-img has no arguments, for example if
-nocache is not used. This becomes an empty argument after split()
and causes qemu-io to fail.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Also reuse elsewhere the new constant for sizeof(unsigned long) * 8.
The dirty bitmap is allocated in bits but declared as unsigned long.
Thus, its memory block is accessed beyond its end unless the image
is a multiple of 64 chunks (i.e. a multiple of 64 MB).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
bdrv_img_create will temporarily open the backing file to probe its size.
However, this could be done with a read-write open if the wrong flags are
passed to bdrv_img_create. Since there is really no documentation on
what flags can be passed, assume that bdrv_img_create receives the flags
with which the new image will be opened; sanitize them when opening
the backing file.
Reported-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
These are needed to print "info block" output correctly. QCOW2 does this
because it needs it to write the header, but QED does not, and common code
is the right place to do it.
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This check applies to all drivers, but QED lacks it.
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
[ Iterate until all block devices have processed all requests,
add comments. - Paolo ]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Adjust the tcg_out_qemu_{ld,st}() slow paths to pass AREG0 in r3,
based on patches by malc.
Also adjust the registers clobbered, based on patch by Alex.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Acked-by: Alexander Graf <agraf@suse.de>
[AF: Do not hardcode r3 for AREG0, requested by Alex]
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This accounts for the additional addr_reg2 register.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Also assure i64 alignment where necessary.
Alignment code optimization suggested by malc.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
For targets where TARGET_LONG_BITS != 32, i.e. 64-bit guests,
addr_reg is moved to r4. For hosts without TCG_TARGET_CALL_ALIGN_ARGS
either data_reg2 or data_reg or a masked version thereof would overwrite
r4. Place it in r5 instead, matching TCG_TARGET_CALL_ALIGN_ARGS hosts.
This fixes immediate crashes of 64-bit guests observed on Darwin/ppc but
not on Darwin/ppc64.
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Acked-by: malc <av1474@comtv.ru>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
* qmp/queue/qmp:
hmp: fix bad value conversion for M type
hmp: expr_unary(): check for overflow in strtoul()/strtoull()
vl: drop is_suspended variable
runstate: introduce suspended state
qapi-schema.json: fix RunState enums alphabetical order
wakeup on migration
The M type converts from megabytes to bytes. However, the value can be
negative before the conversion, which will lead to a flawed conversion.
For example, this:
(qemu) balloon -1000000000000011
(qemu)
Just "works", but the value passed by the balloon command will be
something else.
This patch fixes this problem by requering a positive value before
converting. There's really no reason to accept a negative value for
the M type.
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
It's not checked currently, so something like:
(qemu) balloon -100000000000001111114334234
(qemu)
Will just "work" (in this case the balloon command will get a random
value).
Fix it by checking if strtoul()/strtoull() overflowed.
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
QEMU enters in this state when the guest suspends to ram (S3).
This is important so that HMP users and QMP clients can know that
the guest is suspended. QMP also has an event for this, but events
are not reliable and are limited (ie. a client can connect to QEMU
after the event has been emitted).
Having a different state for S3 brings a new issue, though. Every
device that doesn't run when the VM is stopped but wants to run
when the VM is suspended has to check for RUN_STATE_SUSPENDED
explicitly. This is the case for the keyboard and mouse devices,
for example.
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Wakeup the guest when the live part of the migation is finished.
This avoids being in suspended state on migration, so we don't
have to save the is_suspended bit.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>