Just create a variable for it, the same way that multifd does. This
way it is safe to use for other thread, etc, etc.
Reviewed-by: Leonardo Bras <leobras@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20230515195709.63843-11-quintela@redhat.com>
We only care about the amount of bytes transferred. Flushing is done
by the system somewhere else.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20230530183941.7223-4-quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
In the function qmp_migrate(), yank_unregister_instance() gets called
twice which isn't required. Hence, refactoring it so that it gets called
during the local_error cleanup.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Tejus GK <tejus.gk@nutanix.com>
Message-ID: <20230621130940.178659-3-tejus.gk@nutanix.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
So no need to assert we are in x86_64.
Once there, refactor the function to remove useless variables.
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-ID: <20230608224943.3877-11-quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
The bootsector code is read only from the guest (otherwise we are
going to have problems with it being read from both source and
destination).
Create a single copy for all the tests.
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-ID: <20230608224943.3877-10-quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
So just make it a global variable.
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-ID: <20230608224943.3877-9-quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
So arch_dirty_ring option becomes one option like the others.
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-ID: <20230608224943.3877-8-quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Now that the return path thread is allowed to finish during a paused
migration, we can move the cleanup of the QEMUFiles to the main
migration thread.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230918172822.19052-9-farosas@suse.de>
Replace the return path retry logic with finishing and restarting the
thread. This fixes a race when resuming the migration that leads to a
segfault.
Currently when doing postcopy we consider that an IO error on the
return path file could be due to a network intermittency. We then keep
the thread alive but have it do cleanup of the 'from_dst_file' and
wait on the 'postcopy_pause_rp' semaphore. When the user issues a
migrate resume, a new return path is opened and the thread is allowed
to continue.
There's a race condition in the above mechanism. It is possible for
the new return path file to be setup *before* the cleanup code in the
return path thread has had a chance to run, leading to the *new* file
being closed and the pointer set to NULL. When the thread is released
after the resume, it tries to dereference 'from_dst_file' and crashes:
Thread 7 "return path" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffd1dbf700 (LWP 9611)]
0x00005555560e4893 in qemu_file_get_error_obj (f=0x0, errp=0x0) at ../migration/qemu-file.c:154
154 return f->last_error;
(gdb) bt
#0 0x00005555560e4893 in qemu_file_get_error_obj (f=0x0, errp=0x0) at ../migration/qemu-file.c:154
#1 0x00005555560e4983 in qemu_file_get_error (f=0x0) at ../migration/qemu-file.c:206
#2 0x0000555555b9a1df in source_return_path_thread (opaque=0x555556e06000) at ../migration/migration.c:1876
#3 0x000055555602e14f in qemu_thread_start (args=0x55555782e780) at ../util/qemu-thread-posix.c:541
#4 0x00007ffff38d76ea in start_thread (arg=0x7fffd1dbf700) at pthread_create.c:477
#5 0x00007ffff35efa6f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Here's the race (important bit is open_return_path happening before
migration_release_dst_files):
migration | qmp | return path
--------------------------+-----------------------------+---------------------------------
qmp_migrate_pause()
shutdown(ms->to_dst_file)
f->last_error = -EIO
migrate_detect_error()
postcopy_pause()
set_state(PAUSED)
wait(postcopy_pause_sem)
qmp_migrate(resume)
migrate_fd_connect()
resume = state == PAUSED
open_return_path <-- TOO SOON!
set_state(RECOVER)
post(postcopy_pause_sem)
(incoming closes to_src_file)
res = qemu_file_get_error(rp)
migration_release_dst_files()
ms->rp_state.from_dst_file = NULL
post(postcopy_pause_rp_sem)
postcopy_pause_return_path_thread()
wait(postcopy_pause_rp_sem)
rp = ms->rp_state.from_dst_file
goto retry
qemu_file_get_error(rp)
SIGSEGV
-------------------------------------------------------------------------------------------
We can keep the retry logic without having the thread alive and
waiting. The only piece of data used by it is the 'from_dst_file' and
it is only allowed to proceed after a migrate resume is issued and the
semaphore released at migrate_fd_connect().
Move the retry logic to outside the thread by waiting for the thread
to finish before pausing the migration.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230918172822.19052-8-farosas@suse.de>
We'll start calling the await_return_path_close_on_source() function
from other parts of the code, so move all of the related checks and
tracepoints into it.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230918172822.19052-7-farosas@suse.de>
This file is owned by the return path thread which is already doing
cleanup.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230918172822.19052-6-farosas@suse.de>
It's not safe to call qemu_file_shutdown() on the to_dst_file without
first checking for the file's presence under the lock. The cleanup of
this file happens at postcopy_pause() and migrate_fd_cleanup() which
are not necessarily running in the same thread as migrate_fd_cancel().
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230918172822.19052-5-farosas@suse.de>
We cannot call qemu_file_shutdown() on the return path file without
taking the file lock. The return path thread could be running it's
cleanup code and have just cleared the from_dst_file pointer.
Checking ms->to_dst_file for errors could also race with
migrate_fd_cleanup() which clears the to_dst_file pointer.
Protect both accesses by taking the file lock.
This was caught by inspection, it should be rare, but the next patches
will start calling this code from other places, so let's do the
correct thing.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230918172822.19052-4-farosas@suse.de>
We don't need to set the rp_state.error right after a shutdown because
qemu_file_shutdown() always sets the QEMUFile error, so the return
path thread would have seen it and set the rp error itself.
Setting the error outside of the thread is also racy because the
thread could clear it after we set it.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230918172822.19052-3-farosas@suse.de>
We hit intermit CI issue on failing at migration-test over the unit test
preempt/plain:
qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Memory content inconsistency at 5b43000 first_byte = bd last_byte = bc current = 4f hit_edge = 1
**
ERROR:../tests/qtest/migration-test.c:300:check_guests_ram: assertion failed: (bad == 0)
(test program exited with status code -6)
Fabiano debugged into it and found that the preempt thread can quit even
without receiving all the pages, which can cause guest not receiving all
the pages and corrupt the guest memory.
To make sure preempt thread finished receiving all the pages, we can rely
on the page_requested_count being zero because preempt channel will only
receive requested page faults. Note, not all the faulted pages are required
to be sent via the preempt channel/thread; imagine the case when a
requested page is just queued into the background main channel for
migration, the src qemu will just still send it via the background channel.
Here instead of spinning over reading the count, we add a condvar so the
main thread can wait on it if that unusual case happened, without burning
the cpu for no good reason, even if the duration is short; so even if we
spin in this rare case is probably fine. It's just better to not do so.
The condvar is only used when that special case is triggered. Some memory
ordering trick is needed to guarantee it from happening (against the
preempt thread status field), so the main thread will always get a kick
when that triggers correctly.
Closes: https://gitlab.com/qemu-project/qemu/-/issues/1886
Debugged-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230918172822.19052-2-farosas@suse.de>
The marking should be extended transitively to all functions that call
these ones, so that static analysis can be done much more efficiently.
However, this is a start and makes it possible to use vrc's path-based
searches to find potential bugs where coroutine_fns call blocking functions.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Just remove the declaration. There is nothing in the function after the
switch statement, so it is safe to do.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Originally meant to avoid a shadowed variable "s", which was fixed by
renaming the outer declaration to "qts". Avoid the chance of an overflow
in the computation of ABS(t - s).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
VNC_FEATURE_XVP was not shifted left before adding it to vs->features,
so it was never enabled; but it was also checked the wrong way with
a logical AND instead of vnc_has_feature. Fix both places.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The debug message was cut and pasted from the invalid audio format
case, but the audio message is at bytes 2-3.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We are doing things like
nb_sectors /= (s->qdev.blocksize / BDRV_SECTOR_SIZE);
in the code here (e.g. in scsi_disk_emulate_mode_sense()), so if
the blocksize is smaller than BDRV_SECTOR_SIZE (=512), this crashes
with a division by 0 exception. Thus disallow block sizes of 256
bytes to avoid this situation.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1813
CVE: 2023-42467
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20230925091854.49198-1-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
enable_cpu_pm is only used by softmmu-specific code, namely target/i386/host-cpu.c
and target/i386/kvm/*. It does not need a stub definition anymore.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
bios.bin is now used only by ISA PC, so PCI drivers are not necessary.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These are the last users of the 128K SeaBIOS blob in the i440FX family.
Removing them allows us to drop PCI support from the 128K blob,
thus making it easier to update SeaBIOS to newer versions.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* Simplify the NIC init code of the jazz machine a little bit
* Minor qtest and avocado fixes
-----BEGIN PGP SIGNATURE-----
iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmURS8gRHHRodXRoQHJl
ZGhhdC5jb20ACgkQLtnXdP5wLbVn4A/+NQKFZcN7gVn5JXkK7kf6i01LNmAoqjj9
QeQL+WCoNC68OApw7DxIEnpBYT0G42NTHHx4SYeOvzJUzCpeWcxYzQUz58ObZML7
+OKsiOsaHu3/qOuihBCn43et6moLdDCWbee5Zr6JQv/Fjn3q3nEQZnJDWdw8vm1v
csYQJZOD6HelLVMmbLfl1szzrykDTT53NhPncH/SjPz6we17sKqHqmT6LBUIsXcV
u2LaowppKmT7Ooexu6SmsCagLhtWuYo1iGGcRqoojtRWo7eZtWLrAy2DJpyFkPBW
AIYBfntRISZv4eBGCxcVfvODD/Q4OXHuYTfGzD3m+ELJ6hUk/+d4/aHJ2hm+KEm+
AD0IpDtimaEmyQTPlaWHhhEur/82JZ+zYlxUMPf3+hglB/rbr6fhA0SMAV6nwR0r
N8jnB8UCml9oDxJVvDZyrcPMGFs1xlr5FVSHHEoL338SvSfjG3NOEtcNao9n6A8d
rO2CfPzI7peQhKWAzJL+qpnmenyIniH23tFnf2mpOZ0g45ZWtJeT0CXL3aQO3XAZ
m56pkM0d/etAHHRoLQ5D/iKZpwiTRLjdzsJ0gMAQsIuRlG/j5h+zou0vUMgm6F8F
igRHLxytlywZBTCABm2XIlKmaJp8hQlVQMpKsv/BwzTvzzk0GGS5d1qzzFt5WWR7
4rSalTn5Xuw=
=FioB
-----END PGP SIGNATURE-----
Merge tag 'pull-request-2023-09-25' of https://gitlab.com/thuth/qemu into staging
* Make keyutils independent from keyring in meson.build
* Simplify the NIC init code of the jazz machine a little bit
* Minor qtest and avocado fixes
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmURS8gRHHRodXRoQHJl
# ZGhhdC5jb20ACgkQLtnXdP5wLbVn4A/+NQKFZcN7gVn5JXkK7kf6i01LNmAoqjj9
# QeQL+WCoNC68OApw7DxIEnpBYT0G42NTHHx4SYeOvzJUzCpeWcxYzQUz58ObZML7
# +OKsiOsaHu3/qOuihBCn43et6moLdDCWbee5Zr6JQv/Fjn3q3nEQZnJDWdw8vm1v
# csYQJZOD6HelLVMmbLfl1szzrykDTT53NhPncH/SjPz6we17sKqHqmT6LBUIsXcV
# u2LaowppKmT7Ooexu6SmsCagLhtWuYo1iGGcRqoojtRWo7eZtWLrAy2DJpyFkPBW
# AIYBfntRISZv4eBGCxcVfvODD/Q4OXHuYTfGzD3m+ELJ6hUk/+d4/aHJ2hm+KEm+
# AD0IpDtimaEmyQTPlaWHhhEur/82JZ+zYlxUMPf3+hglB/rbr6fhA0SMAV6nwR0r
# N8jnB8UCml9oDxJVvDZyrcPMGFs1xlr5FVSHHEoL338SvSfjG3NOEtcNao9n6A8d
# rO2CfPzI7peQhKWAzJL+qpnmenyIniH23tFnf2mpOZ0g45ZWtJeT0CXL3aQO3XAZ
# m56pkM0d/etAHHRoLQ5D/iKZpwiTRLjdzsJ0gMAQsIuRlG/j5h+zou0vUMgm6F8F
# igRHLxytlywZBTCABm2XIlKmaJp8hQlVQMpKsv/BwzTvzzk0GGS5d1qzzFt5WWR7
# 4rSalTn5Xuw=
# =FioB
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 25 Sep 2023 04:58:48 EDT
# gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg: issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg: aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg: aka "Thomas Huth <huth@tuxfamily.org>" [full]
# gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5
* tag 'pull-request-2023-09-25' of https://gitlab.com/thuth/qemu:
tests/avocado: fix waiting for vm shutdown in replay_linux
hw/mips/jazz: Simplify the NIC setup code
hw/mips/jazz: Move the NIC init code into a separate function
tests/qtest/netdev-socket: Do not test multicast on Darwin
tests/qtest/m48t59-test: Silence compiler warning with -Wshadow
tests/qtest/netdev-socket: Raise connection timeout to 120 seconds
meson.build: Make keyutils independent from keyring
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Upcoming additions to support NBD 64-bit effect lengths will add a new
command flag NBD_CMD_FLAG_PAYLOAD_LEN that needs to be considered in
our sanity checks of the client's messages (that is, more than just
CMD_WRITE have the potential to carry a client payload when extended
headers are in effect). But before we can start to support that, it
is easier to first refactor the existing set of various if statements
over open-coded combinations of request->type to instead be a single
switch statement over all command types that sets witnesses, then
straight-line processing based on the witnesses. No semantic change
is intended.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20230829175826.377251-24-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Widen the length field of NBDRequest to 64-bits, although we can
assert that all current uses are still under 32 bits: either because
of NBD_MAX_BUFFER_SIZE which is even smaller (and where size_t can
still be appropriate, even on 32-bit platforms), or because nothing
ever puts us into NBD_MODE_EXTENDED yet (and while future patches will
allow larger transactions, the lengths in play here are still capped
at 32-bit). There are no semantic changes, other than a typo fix in a
couple of error messages.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20230829175826.377251-23-eblake@redhat.com>
[eblake: fix assertion bug in nbd_co_send_simple_reply]
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
This patch fixes the race condition in waiting for shutdown
of the replay linux test.
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>
Suggested-by: John Snow <jsnow@redhat.com>
Message-ID: <20230811070608.3383343-4-pavel.dovgalyuk@ispras.ru>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The for-loop does not make much sense here - it is always left after
the first iteration, so we can also check for nb_nics == 1 instead
which is way easier to understand.
Also, the checks for nd->model are superfluous since the code in
mips_jazz_init_net() calls qemu_check_nic_model() that already
takes care of this (i.e. initializing nd->model if it has not been
set yet, and checking whether it is the "help" option or the
supported NIC model).
Message-ID: <20230913160922.355640-3-thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The mips_jazz_init() function is already quite big, so moving
away some code here can help to make it more understandable.
Additionally, by moving this code into a separate function, the
next patch (that will refactor the for-loop around the NIC init
code) will be much shorter and easier to understand.
Message-ID: <20230913160922.355640-2-thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Do not run this test on Darwin, otherwise we get:
qemu-system-arm: -netdev dgram,id=st0,remote.type=inet,remote.host=230.0.0.1,remote.port=1234:
can't add socket to multicast group 230.0.0.1: Can't assign requested address
Broken pipe
../../tests/qtest/libqtest.c:191: kill_qemu() tried to terminate QEMU
process but encountered exit status 1 (expected 0)
Abort trap: 6
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230918062549.2363-1-philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
When compiling this file with -Wshadow=local , we get:
../tests/qtest/m48t59-test.c: In function ‘bcd_check_time’:
../tests/qtest/m48t59-test.c:195:17: warning: declaration of ‘s’
shadows a previous local [-Wshadow=local]
195 | long t, s;
| ^
../tests/qtest/m48t59-test.c:158:17: note: shadowed declaration is here
158 | QTestState *s = m48t59_qtest_start();
| ^
Rename the QTestState variable to "qts" which is the common
naming for such a variable in other tests.
Reported-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20230922163742.149444-1-thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: "Daniel P. Berrangé" <berrange@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Commit 0db0fbb5cf ("Add conditional dependency for libkeyutils")
tried to provide a possibility for the user to disable keyutils
if not required by makeing it depend on the keyring feature. This
looked reasonable at a first glance (the unit test in tests/unit/
needs both), but the condition in meson.build fails if the feature
is meant to be detected automatically, and there is also another
spot in backends/meson.build where keyutils is used independently
from keyring. So let's remove the dependency on keyring again and
introduce a proper meson build option instead.
Cc: qemu-stable@nongnu.org
Fixes: 0db0fbb5cf ("Add conditional dependency for libkeyutils")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1842
Message-ID: <20230824094208.255279-1-thuth@redhat.com>
Reviewed-by: "Daniel P. Berrangé" <berrange@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Add the constants and structs necessary for later patches to start
implementing the NBD_OPT_EXTENDED_HEADERS extension in both the client
and server, matching recent upstream nbd.git (through commit
e6f3b94a934). This patch does not change any existing behavior, but
merely sets the stage for upcoming patches.
This patch does not change the status quo that neither the client nor
server use a packed-struct representation for the request header.
While most of the patch adds new types, there is also some churn for
renaming the existing NBDExtent to NBDExtent32 to contrast it with
NBDExtent64, which I thought was a nicer name than NBDExtentExt.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-ID: <20230829175826.377251-22-eblake@redhat.com>